Based on the algorithm in Ersan (2016), generates initial parameter sets for the maximum likelihood estimation of the MPIN model.

Usage

initials_mpin(data, layers = NULL, detectlayers = "EG",
 xtraclusters = 4, verbose = TRUE)

Arguments

data

A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

layers

An integer referring to the assumed number of information layers in the data. If the value of layers is NULL, then the number of layers is automatically determined by one of the following functions: detectlayers_e(), detectlayers_eg(), and detectlayers_ecm(). The default value is NULL.

detectlayers

A character string referring to the layer detection algorithm used to determine the number of layers in the data. It takes one of three values: "E", "EG", and "ECM". "E" refers to the algorithm in Ersan (2016), "EG" refers to the algorithm in Ersan and Ghachem (2022a), and "ECM" refers to the algorithm in Ghachem and Ersan (2022a). The default value is "EG". Comparative results between the layer detection algorithms can be found in Ersan and Ghachem (2022a).

xtraclusters

An integer used to divide trading days into (1 + layers + xtraclusters) clusters, thereby resulting in comb(layers + xtraclusters, layers) initial parameter sets, where comb(n, k) is the number of ways to choose k items out of n, in line with Ersan and Alici (2016) and Ersan (2016). The default value is 4, as chosen in Ersan (2016).
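The counting rule for xtraclusters can be checked with base R alone. The sketch below assumes that comb(n, k) denotes the binomial coefficient, which base R provides as choose(). The helper name count_initial_sets is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of PINstimation): counts the clusters and
# the initial parameter sets implied by 'layers' and 'xtraclusters'.
count_initial_sets <- function(layers, xtraclusters = 4) {
  stopifnot(layers >= 1, xtraclusters >= 0)
  list(
    clusters = 1 + layers + xtraclusters,
    sets = choose(layers + xtraclusters, layers)
  )
}

counts <- count_initial_sets(layers = 3, xtraclusters = 3)
print(counts$clusters)  # 7
print(counts$sets)      # 20
```

With 3 layers and 3 extra clusters, the rule gives 20 sets, which agrees with the 20 rows shown in the Examples section of this page.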

verbose

A binary variable that determines whether information messages about the initial parameter sets, including the number of initial parameter sets generated, are displayed. No message is shown when verbose is set to FALSE. The default value is TRUE.
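The arguments above can be combined as in the following sketch. It assumes the PINstimation package is installed and uses only the arguments listed under Usage, together with the preloaded dataset dailytrades. The variable names sets_fixed and sets_detected are illustrative.

```r
# Assumes PINstimation is installed; the guard skips the calls otherwise.
if (requireNamespace("PINstimation", quietly = TRUE)) {
  library(PINstimation)
  xdata <- dailytrades

  # Fix the number of layers explicitly; no layer detection is needed.
  sets_fixed <- initials_mpin(xdata, layers = 2, verbose = FALSE)

  # Let the algorithm of Ersan (2016) detect the number of layers, and
  # use the default number of extra clusters.
  sets_detected <- initials_mpin(xdata, detectlayers = "E", verbose = FALSE)

  # Each row of the returned dataframe is one initial parameter set.
  print(dim(sets_fixed))
  print(dim(sets_detected))
}
```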

Value

Returns a dataframe of initial parameter sets, each consisting of 3J + 2 variables {\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)}. \(\alpha\), \(\delta\), and \(\mu\) are vectors of length J, where J is the number of layers in the MPIN model.
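To make the 3J + 2 layout concrete, the sketch below builds the column names of one initial parameter set for a given J. The naming follows the example output on this page (alpha.1, delta.1, mu.1, eps.b, eps.s). The helper initial_set_names is hypothetical and is not exported by the package.

```r
# Hypothetical helper (not part of PINstimation): column names of one
# initial parameter set for J layers, i.e. 3J + 2 entries in total.
initial_set_names <- function(J) {
  c(
    paste0("alpha.", seq_len(J)),
    paste0("delta.", seq_len(J)),
    paste0("mu.", seq_len(J)),
    "eps.b", "eps.s"
  )
}

cols <- initial_set_names(3)
print(cols)
print(length(cols) == 3 * 3 + 2)  # TRUE
```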

Details

The argument 'data' should be a numeric dataframe and contain at least two variables. Only the first two variables will be considered: the first variable is assumed to correspond to the total number of buyer-initiated trades, while the second variable is assumed to correspond to the total number of seller-initiated trades. Each row or observation corresponds to a trading day. NA values will be ignored.
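The input rules above can be mimicked with base R, which may help when checking a dataset before passing it to the function. This is an illustrative sketch, not the package's internal code. It assumes that "ignored" means rows containing NA are dropped, and the data frame raw is made up for the example.

```r
# Made-up data: two trade columns plus an extra column that is not used.
raw <- data.frame(
  buys = c(350, 410, NA, 1200),
  sells = c(340, 395, 360, 330),
  volume = c(10, 20, 30, 40)
)

# Keep only the first two variables (buys, then sells).
trades <- raw[, 1:2]

# Both variables are expected to be numeric.
stopifnot(all(vapply(trades, is.numeric, logical(1))))

# Assumption: rows with NA values are dropped.
trades <- trades[stats::complete.cases(trades), ]

print(nrow(trades))  # 3
```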

References

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74–94. ISSN 10424431.

Ersan O, Ghachem M (2022a). “Identifying information types in probability of informed trading (PIN) models: An improved algorithm.” Available at SSRN 4117956.

Ghachem M, Ersan O (2022a). “Estimation of the probability of informed trading models via an expectation-conditional maximization algorithm.” Available at SSRN 4117952.

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated transactions ('Buys') and
# seller-initiated transactions ('S') on that day. To know more, type
# ?dailytrades

 xdata <- dailytrades

# Obtain a dataframe of initial parameter sets for estimation of the MPIN
# model using the algorithm of Ersan (2016) with 3 extra clusters.
# By default, the number of layers in the data is detected using the
# algorithm of Ersan and Ghachem (2022a).

 init.sets <- initials_mpin(xdata, xtraclusters = 3)
#> The function initials_mpin(...) has generated  initial parameter sets.
#> To display the initial sets, store them in a variable or call (initials_mpin(...)).
#> To hide these messages, set the argument 'silent' to TRUE (silent = TRUE).

# Show the initial parameter sets

 show(round(init.sets, 2))
#>    alpha.1 alpha.2 alpha.3 delta.1 delta.2 delta.3    mu.1    mu.2    mu.3
#> 1     0.12    0.10    0.53    0.29    0.17    0.09  561.02  644.36 1462.72
#> 2     0.12    0.15    0.48    0.29    0.33    0.03  561.02  762.24 1510.80
#> 3     0.12    0.22    0.42    0.29    0.23    0.04  561.02  973.02 1520.96
#> 4     0.12    0.53    0.10    0.29    0.12    0.00  561.02 1286.97 1581.71
#> 5     0.22    0.05    0.48    0.23    0.67    0.03  599.48  997.99 1510.80
#> 6     0.22    0.12    0.42    0.23    0.29    0.04  599.48 1254.73 1520.96
#> 7     0.22    0.43    0.10    0.23    0.12    0.00  599.48 1435.26 1581.71
#> 8     0.27    0.07    0.42    0.31    0.00    0.04  674.20 1447.29 1520.96
#> 9     0.27    0.38    0.10    0.31    0.04    0.00  674.20 1492.30 1581.71
#> 10    0.33    0.32    0.10    0.25    0.05    0.00  828.82 1501.77 1581.71
#> 11    0.10    0.05    0.48    0.17    0.67    0.03  584.02 1028.16 1426.52
#> 12    0.10    0.12    0.42    0.17    0.29    0.04  584.02 1215.94 1437.68
#> 13    0.10    0.43    0.10    0.17    0.12    0.00  584.02 1365.64 1491.19
#> 14    0.15    0.07    0.42    0.33    0.00    0.04  732.06 1356.78 1437.68
#> 15    0.15    0.38    0.10    0.33    0.04    0.00  732.06 1409.65 1491.19
#> 16    0.22    0.32    0.10    0.23    0.05    0.00  924.28 1420.79 1491.19
#> 17    0.05    0.07    0.42    0.67    0.00    0.04 1052.74 1283.03 1369.84
#> 18    0.05    0.38    0.10    0.67    0.04    0.00 1052.74 1342.32 1417.45
#> 19    0.12    0.32    0.10    0.29    0.05    0.00 1184.33 1354.80 1417.45
#> 20    0.07    0.32    0.10    0.00    0.05    0.00 1290.24 1361.25 1424.66
#>     eps.b  eps.s
#> 1  336.14 336.19
#> 2  336.14 336.19
#> 3  336.14 336.19
#> 4  336.14 336.19
#> 5  336.14 336.19
#> 6  336.14 336.19
#> 7  336.14 336.19
#> 8  336.14 336.19
#> 9  336.14 336.19
#> 10 336.14 336.19
#> 11 446.88 356.41
#> 12 446.88 356.41
#> 13 446.88 356.41
#> 14 446.88 356.41
#> 15 446.88 356.41
#> 16 446.88 356.41
#> 17 531.68 367.46
#> 18 531.68 367.46
#> 19 531.68 367.46
#> 20 556.69 399.68

# Use these initial parameter sets to estimate the probability of informed
# trading. The number of information layers will be detected from the
# initial parameter sets.

 estimate <- mpin_ml(xdata, initialsets = init.sets, verbose = FALSE)

# Display the estimated MPIN value
 show(estimate@mpin)
#> [1] 0.5744584

# Display the estimated parameters as a numeric vector.
 show(unlist(estimate@parameters))
#> alpha.layer.1 alpha.layer.2 alpha.layer.3 delta.layer.1 delta.layer.2 
#>  2.166640e-01  5.000080e-02  4.833392e-01  2.307694e-01  6.666732e-01 
#> delta.layer.3    mu.layer.1    mu.layer.2    mu.layer.3         eps.b 
#>  3.448076e-02  6.028611e+02  9.864359e+02  1.506813e+03  3.369118e+02 
#>         eps.s 
#>  3.358871e+02 

# Store the posterior probabilities in a dataframe variable, and show its
# first 6 rows.

 modelposteriors <- get_posteriors(estimate)
 show(round(head(modelposteriors), 3))
#>   post.N post.G[1] post.G[2] post.G[3] Post.B[1] Post.B[2] Post.B[3]
#> 1      1         0         0         0         0         0         0
#> 2      0         1         0         0         0         0         0
#> 3      0         0         1         0         0         0         0
#> 4      0         1         0         0         0         0         0
#> 5      0         1         0         0         0         0         0
#> 6      0         1         0         0         0         0         0
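As a sanity check on the estimates above, the reported MPIN value can be recomputed by hand. The sketch below assumes the MPIN definition of Ersan (2016), namely the expected informed order flow summed over layers, divided by the total expected order flow. The parameter values are copied from the printed output above, and the variable names are illustrative.

```r
# Parameter estimates copied from the printed output above.
alpha <- c(2.166640e-01, 5.000080e-02, 4.833392e-01)
mu    <- c(6.028611e+02, 9.864359e+02, 1.506813e+03)
eps_b <- 3.369118e+02
eps_s <- 3.358871e+02

# Expected informed order flow, summed over the three layers.
informed <- sum(alpha * mu)

# Share of informed flow in the total expected order flow.
mpin <- informed / (informed + eps_b + eps_s)
print(round(mpin, 4))  # 0.5745
```

The result agrees with the value stored in estimate@mpin up to the rounding of the printed parameters.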