MPIN model estimation via standard ML methods

Estimates the multilayer probability of informed trading (MPIN) using the standard Maximum Likelihood method.

Usage

mpin_ml(data, layers = NULL, xtraclusters = 4, initialsets = NULL,
detectlayers = "EG", ..., verbose = TRUE)

Arguments

data: A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).
layers: An integer referring to the assumed number of information layers in the data. If layers is given, the maximum likelihood estimation uses the number of layers provided. If layers is omitted, mpin_ml() detects the number of layers itself, by default using the algorithm developed in Ersan and Ghachem (2022a).
xtraclusters: An integer used to divide trading days into (1 + layers + xtraclusters) clusters, thereby resulting in comb(layers + xtraclusters, layers) initial parameter sets, in line with Ersan and Alici (2016) and Ersan (2016). The default value is 4, as chosen in Ersan (2016).
initialsets: A dataframe containing initial parameter sets for the estimation of the MPIN model. The default value is NULL. If initialsets is NULL, the initial parameter sets are determined by the function initials_mpin().
detectlayers: A character string referring to the layer detection algorithm used to determine the number of layers in the data. It takes one of three values: "E", "EG", and "ECM". "E" refers to the algorithm in Ersan (2016), "EG" refers to the algorithm in Ersan and Ghachem (2022a), and "ECM" refers to the algorithm in Ghachem and Ersan (2022a). The default value is "EG". Comparative results for the layer detection algorithms can be found in Ersan and Ghachem (2022a).
...: Additional arguments passed on to the function mpin_ml(). The recognized argument is is_parallel, a logical value that specifies whether the computation is performed using parallel processing. The default value is FALSE.
verbose: A logical value that determines whether detailed information about the steps of the MPIN model estimation is displayed. No output is produced when verbose is set to FALSE. The default value is TRUE.
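
As a rough illustration of how xtraclusters drives the number of initial parameter sets, the count comb(layers + xtraclusters, layers) can be computed with base R's choose(). The helper name n_initial_sets is hypothetical and not part of the package; it only evaluates the formula stated for xtraclusters.

```r
# Hypothetical helper (not part of the package): number of initial
# parameter sets implied by comb(layers + xtraclusters, layers).
n_initial_sets <- function(layers, xtraclusters = 4) {
  choose(layers + xtraclusters, layers)
}

n_initial_sets(layers = 1, xtraclusters = 2)  # 3
n_initial_sets(layers = 3, xtraclusters = 2)  # 10
```

These counts agree with the "3 initial set(s)" and "10 initial set(s)" lines reported in the Examples output.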

Value

Returns an object of class estimate.mpin.

Details

The argument 'data' should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to hold the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
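
A minimal sketch of bringing a dataframe into this shape is given below. The column names 'date', 'buys' and 'sells' and the values are assumptions made purely for illustration. Dropping incomplete days is optional, since NA values are ignored anyway.

```r
# Hypothetical raw data; column names and values are illustrative only.
raw <- data.frame(
  date  = as.Date("2023-01-02") + 0:4,
  buys  = c(350, 410, NA, 1250, 390),
  sells = c(330, 300, 345, 360, 1100)
)

# Keep only the trade counts, with buys first and sells second,
# because only the first two variables are considered.
trades <- raw[, c("buys", "sells")]

# Optional: drop days with missing counts explicitly.
trades <- trades[complete.cases(trades), ]

# Confirm that both columns are numeric before estimation.
stopifnot(all(vapply(trades, is.numeric, logical(1))))

# estimate <- mpin_ml(trades)  # not run: needs the package and real data
```

Five days are far too few for a meaningful estimate. The sketch only shows the expected column layout.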

References

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74-94. ISSN 10424431.

Ersan O, Ghachem M (2022a). “Identifying information types in probability of informed trading (PIN) models: An improved algorithm.” Available at SSRN 4117956.

Ghachem M, Ersan O (2022a). “Estimation of the probability of informed trading models via an expectation-conditional maximization algorithm.” Available at SSRN 4117952.

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# ------------------------------------------------------------------------ #
# Estimate MPIN model using the standard ML method                         #
# ------------------------------------------------------------------------ #

# Estimate the MPIN model using mpin_ml() assuming that there is a single
# information layer in the data. The model is then equivalent to the PIN
# model. The argument 'layers' takes the value '1'.
# We use two extra clusters to generate the initial parameter sets.

estimate <- mpin_ml(xdata, layers = 1, xtraclusters = 2, verbose = FALSE)

# Show the estimation output

show(estimate)
#> ----------------------------------
#> MPIN estimation completed successfully
#> ----------------------------------
#> Likelihood factorization: Ersan (2016)
#> Estimation Algorithm 	: Maximum Likelihood Estimation
#> Initial parameter sets	: Ersan (2016), Ersan and Alici (2016)
#> Info. layers in the data: provided by the user
#> ----------------------------------
#> 3 initial set(s) are used in the estimation 
#> Type object@initialsets to see the initial parameter sets used
#> 
#>  MPIN model   Sequential  
#> 
#> 
#> ==========  ===========
#> Variables   Estimates  
#> ==========  ===========
#> alpha       0.749997   
#> delta       0.133334   
#> mu          1193.52    
#> eps.b       357.27     
#> eps.s       328.63     
#> ----                   
#> Likelihood  (3226.469) 
#> mpin(j)     0.566172   
#> mpin        0.566172   
#> ==========  ===========
#> 
#> -------
#> Running time: 0.818 seconds

# Estimate the MPIN model using the function mpin_ml(), without specifying
# the number of layers. The number of layers is then detected using the
# algorithm of Ersan and Ghachem (2022a).
# -------------------------------------------------------------
estimate <- mpin_ml(xdata, xtraclusters = 2, verbose = FALSE)

# Show the estimation output

show(estimate)
#> ----------------------------------
#> MPIN estimation completed successfully
#> ----------------------------------
#> Likelihood factorization: Ersan (2016)
#> Estimation Algorithm 	: Maximum Likelihood Estimation
#> Initial parameter sets	: Ersan (2016), Ersan and Alici (2016)
#> Info. layers detected	: using Ersan and Ghachem (2022a)
#> ----------------------------------
#> 10 initial set(s) are used in the estimation 
#> Type object@initialsets to see the initial parameter sets used
#> 
#>  MPIN model   Sequential  
#> 
#> 
#> ==========  ============================
#> Variables   Estimates                   
#> ==========  ============================
#> alpha       0.216664, 0.050001, 0.483339
#> delta       0.230769, 0.666673, 0.034481
#> mu          602.86, 986.44, 1506.81     
#> eps.b       336.91                      
#> eps.s       335.89                      
#> ----                                    
#> Likelihood  (643.458)                   
#> mpin(j)     0.082615, 0.031196, 0.460647
#> mpin        0.574458                    
#> ==========  ============================
#> 
#> -------
#> Running time: 6.833 seconds

# Display the likelihood-maximizing parameters

show(estimate@parameters)
#> $alpha
#>   layer.1   layer.2   layer.3 
#> 0.2166640 0.0500008 0.4833392 
#> 
#> $delta
#>    layer.1    layer.2    layer.3 
#> 0.23076940 0.66667315 0.03448076 
#> 
#> $mu
#>   layer.1   layer.2   layer.3 
#>  602.8611  986.4359 1506.8130 
#> 
#> $eps.b
#> [1] 336.9118
#> 
#> $eps.s
#> [1] 335.8871
#> 

# Display the global multilayer probability of informed trading

show(estimate@mpin)
#> [1] 0.5744584

# Display the multilayer probabilities of informed trading per layer

show(estimate@mpinJ)
#>    layer.1    layer.2    layer.3 
#> 0.08261535 0.03119626 0.46064683 

# Display the first five initial parameter sets used in the maximum
# likelihood estimation

show(round(head(estimate@initialsets, 5), 4))
#>   alpha.1 alpha.2 alpha.3 delta.1 delta.2 delta.3     mu.1      mu.2     mu.3
#> 1  0.1167  0.1000  0.5333  0.2857  0.1667  0.0938 561.0181  644.3616 1462.722
#> 2  0.1167  0.1500  0.4833  0.2857  0.3333  0.0345 561.0181  762.2363 1510.798
#> 3  0.1167  0.5333  0.1000  0.2857  0.1250  0.0000 561.0181 1286.9692 1581.709
#> 4  0.2167  0.0500  0.4833  0.2308  0.6667  0.0345 599.4843  997.9859 1510.798
#> 5  0.2167  0.4333  0.1000  0.2308  0.1154  0.0000 599.4843 1435.2633 1581.709
#>      eps.b    eps.s
#> 1 336.1429 336.1852
#> 2 336.1429 336.1852
#> 3 336.1429 336.1852
#> 4 336.1429 336.1852
#> 5 336.1429 336.1852
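
The per-layer and global probabilities reported above can be reproduced by hand from the estimated parameters. The sketch below assumes the usual MPIN definition, in which each layer's informed order flow (alpha times mu) is divided by total expected order flow. This page does not state that formula; it is used here only because it reproduces the printed values. The parameter values are copied from the three-layer output.

```r
# Parameter estimates copied from the three-layer output shown above.
alpha <- c(0.2166640, 0.0500008, 0.4833392)
mu    <- c(602.8611, 986.4359, 1506.8130)
eps.b <- 336.9118
eps.s <- 335.8871

# Assumed definition: informed flow per layer over total expected flow.
informed <- alpha * mu
total    <- sum(informed) + eps.b + eps.s

mpin_j <- informed / total  # per-layer probabilities, compare estimate@mpinJ
mpin   <- sum(mpin_j)       # global probability, compare estimate@mpin

round(mpin_j, 4)
round(mpin, 4)
```

Up to rounding of the copied inputs, the results match the values stored in estimate@mpinJ and estimate@mpin.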