Factorizations of the different PIN likelihood functions

The PIN likelihood function is derived from the original PIN model as developed by Easley and O'Hara (1992) and Easley et al. (1996). Maximizing the likelihood function in its original form leads to computational problems, in particular floating-point errors. To remedy this issue, several log-transformations or factorizations of the different PIN likelihood functions have been suggested. The main factorizations in the literature are:

fact_pin_eho(): factorization of Easley et al. (2010)
fact_pin_lk(): factorization of Lin and Ke (2011)
fact_pin_e(): factorization of Ersan (2016)

For the multilayer PIN model, the factorization of the likelihood function is the one developed in Ersan (2016):

fact_mpin(): factorization of Ersan (2016)

The factorization of the likelihood function of the adjusted PIN model (Duarte and Young 2009) is derived and presented in Ersan and Ghachem (2022b):

fact_adjpin(): factorization in Ersan and Ghachem (2022b)
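
To illustrate the floating-point issue mentioned above, the following is a minimal sketch in base R, not the package's implementation. It assumes the usual three-state PIN mixture (no news, bad news with probability \(\delta\), good news) and compares a textbook evaluation of one day's log-likelihood with a log-space evaluation based on the log-sum-exp trick. The trade counts B and S and the helpers pois_naive(), pin_day_naive(), and pin_day_stable() are hypothetical names introduced here, and log-sum-exp is a generic stand-in rather than any of the factorizations cited above.

```r
# One hypothetical trading day and a parameter set in the order
# (alpha, delta, mu, eps.b, eps.s)
B <- 900
S <- 700
point <- c(alpha = 0.4, delta = 0.1, mu = 800, eps.b = 300, eps.s = 200)

# Textbook Poisson probability: the power and the factorial both
# overflow to Inf in double precision for counts of this size
pois_naive <- function(k, lambda) exp(-lambda) * lambda^k / factorial(k)

pin_day_naive <- function(B, S, p) {
  a <- p[["alpha"]]; d <- p[["delta"]]; mu <- p[["mu"]]
  eb <- p[["eps.b"]]; es <- p[["eps.s"]]
  log((1 - a)     * pois_naive(B, eb)      * pois_naive(S, es) +
      a * d       * pois_naive(B, eb)      * pois_naive(S, es + mu) +
      a * (1 - d) * pois_naive(B, eb + mu) * pois_naive(S, es))
}

# Log-space version: one log-term per state, combined with log-sum-exp
pin_day_stable <- function(B, S, p) {
  a <- p[["alpha"]]; d <- p[["delta"]]; mu <- p[["mu"]]
  eb <- p[["eps.b"]]; es <- p[["eps.s"]]
  lt <- c(
    log(1 - a) +
      dpois(B, eb, log = TRUE) + dpois(S, es, log = TRUE),
    log(a * d) +
      dpois(B, eb, log = TRUE) + dpois(S, es + mu, log = TRUE),
    log(a * (1 - d)) +
      dpois(B, eb + mu, log = TRUE) + dpois(S, es, log = TRUE)
  )
  m <- max(lt)
  m + log(sum(exp(lt - m)))
}

naive_val  <- pin_day_naive(B, S, point)   # NaN, from Inf / Inf
stable_val <- pin_day_stable(B, S, point)  # finite
```

For small counts and rates the two versions agree, so the difference is purely numerical and not a change of model.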

Usage

fact_pin_eho(data, parameters = NULL)

fact_pin_lk(data, parameters = NULL)

fact_pin_e(data, parameters = NULL)

fact_mpin(data, parameters = NULL)

fact_adjpin(data, parameters = NULL)

Arguments

data: A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).
parameters: An ordered numeric vector whose layout depends on the model. In the case of the PIN likelihood factorizations, it is (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)). In the case of the MPIN likelihood factorization, it is (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)), where \(\alpha\), \(\delta\), and \(\mu\) are numeric vectors of size J, and J is the number of information layers in the data. In the case of the AdjPIN likelihood factorization, it is (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)). The default value is NULL.
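
As a sketch of the three layouts described above, the vectors below are built in base R with the values used in the Examples section. The element names and the variable names pin_par, mpin_par, and adj_par are only labels for the reader and are assumptions of this sketch; the documented interface is an ordered, unnamed numeric vector, which unname() recovers.

```r
# PIN: (alpha, delta, mu, eps.b, eps.s), five elements
pin_par <- c(alpha = 0.4, delta = 0.1, mu = 800, eps.b = 300, eps.s = 200)

# MPIN with J = 2 layers: alpha, delta, and mu each contribute J values,
# followed by eps.b and eps.s, giving 3 * J + 2 elements
J <- 2
alpha <- c(0.4, 0.5)
delta <- c(0.1, 0.6)
mu    <- c(600, 1000)
mpin_par <- c(alpha, delta, mu, 300, 200)

# AdjPIN: (alpha, delta, theta, theta', eps.b, eps.s, mu.b, mu.s, d.b, d.s)
adj_par <- c(alpha = 0.4, delta = 0.1, theta = 0.3, thetap = 0.7,
             eps.b = 500, eps.s = 600, mu.b = 800, mu.s = 1000,
             d.b = 300, d.s = 200)

# Plain ordered vector, as described in the Arguments section
adj_plain <- unname(adj_par)
```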

Value

If the argument parameters is omitted, returns a function object that can be used with the optimization functions optim() and neldermead().

If the argument parameters is provided, returns the numeric value of the log-likelihood function evaluated at the dataset data and the parameters parameters, where parameters is a numeric vector in the following order: (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorizations of the PIN likelihood function, (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorization of the MPIN likelihood function, and (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)) for the factorization of the AdjPIN likelihood function.
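
The two return modes can be sketched with a hypothetical stand-in, toy_fact(), written in base R. It is not one of the package functions: it fits a single Poisson rate to the first column only, and it assumes that the returned function is a negative log-likelihood, since optim() minimizes by default. Whether the package functions use exactly this sign convention is an assumption of the sketch.

```r
# Hypothetical stand-in with the same calling pattern as the fact_*()
# functions: returns a function if parameters is NULL, a value otherwise
toy_fact <- function(data, parameters = NULL) {
  buys <- data[[1]]
  buys <- buys[!is.na(buys)]
  # Negative log-likelihood of a one-parameter Poisson model
  negloglik <- function(par) -sum(dpois(buys, lambda = par[1], log = TRUE))
  if (is.null(parameters)) {
    return(negloglik)
  }
  -negloglik(parameters)
}

toy <- data.frame(B = c(310, 295, 402, 288), S = c(210, 190, 250, 205))

# Mode 1: no parameters, so the result is passed to an optimizer
fit <- optim(300, toy_fact(toy), method = "Brent", lower = 1, upper = 2000)
fit$par   # close to mean(toy$B), the Poisson maximum-likelihood estimate

# Mode 2: parameters supplied, so the result is a log-likelihood value
toy_value <- toy_fact(toy, 300)
```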

Details

The argument 'data' should be a numeric dataframe and contain at least two variables. Only the first two variables are considered: the first is assumed to correspond to the total number of buyer-initiated trades, and the second to the total number of seller-initiated trades. Each row, or observation, corresponds to a trading day. NA values are ignored.
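
A minimal sketch of a dataset in this shape, in base R, is given below. The data frame raw and its values are hypothetical. Dropping every row that contains an NA is one reading of "ignored" and is an assumption of this sketch, not a description of the package's internal handling.

```r
# Hypothetical daily data with an extra third column
raw <- data.frame(
  B = c(310, 295, NA, 288),        # buyer-initiated trades
  S = c(210, 190, 250, 205),       # seller-initiated trades
  volume = c(1e5, 9e4, 1.2e5, 8e4) # not used by the factorizations
)

# Keep only the first two variables and check that they are numeric
trades <- raw[, 1:2]
stopifnot(all(vapply(trades, is.numeric, logical(1))))

# Assumption: a day with a missing count is dropped entirely
trades <- trades[complete.cases(trades), ]
```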

Our tests, in line with Lin and Ke (2011) and Ersan and Alici (2016), show very similar results for fact_pin_lk() and fact_pin_e(), both of which yield substantially better estimates than fact_pin_eho().

References

Duarte J, Young L (2009). “Why is PIN priced?” Journal of Financial Economics, 91(2), 119--138. ISSN 0304405X.

Easley D, Hvidkjaer S, O'Hara M (2010). “Factoring information into returns.” Journal of Financial and Quantitative Analysis, 45(2), 293--309. ISSN 00221090.

Easley D, Kiefer NM, O'Hara M, Paperman JB (1996). “Liquidity, information, and infrequently traded stocks.” Journal of Finance, 51(4), 1405--1436. ISSN 00221082.

Easley D, O'Hara M (1992). “Time and the Process of Security Price Adjustment.” The Journal of Finance, 47(2), 577--605. ISSN 15406261.

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74--94. ISSN 10424431.

Ersan O, Ghachem M (2022b). “A methodological approach to the computational problems in the estimation of adjusted PIN model.” Available at SSRN 4117954.

Lin H, Ke W (2011). “A computing bias in estimating the probability of informed trading.” Journal of Financial Markets, 14(4), 625--640. ISSN 1386-4181.

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# ------------------------------------------------------------------------ #
# Using fact_pin_eho(), fact_pin_lk(), fact_pin_e() to find the likelihood #
# value as factorized by Easley et al. (2010), Lin & Ke (2011), and        #
# Ersan (2016).                                                            #
# ------------------------------------------------------------------------ #

# Choose a parameter set at which to evaluate the likelihood function:
# givenpoint = (alpha, delta, mu, eps.b, eps.s)

givenpoint <- c(0.4, 0.1, 800, 300, 200)

# Use the output of fact_pin_e() with the optimization function optim() to
# find optimal estimates of the PIN model.

model <- suppressWarnings(optim(givenpoint, fact_pin_e(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "mu", "eps.b", "eps.s")
estimates <- setNames(model$par, varnames)
show(estimates)
#>        alpha        delta           mu        eps.b        eps.s 
#>   0.88135868   0.06522792 870.07467354 455.66252617 378.39347697 

# Find the value of the log-likelihood function at givenpoint

lklValue <- fact_pin_lk(xdata, givenpoint)

show(lklValue)
#> [1] -9104.868

# ------------------------------------------------------------------------ #
# Using fact_mpin() to find the value of the MPIN likelihood function as   #
# factorized by Ersan (2016).                                              #
# ------------------------------------------------------------------------ #

# Choose a parameter set at which to evaluate the likelihood function:
# givenpoint = (alpha(), delta(), mu(), eps.b, eps.s), where alpha(), delta(),
# and mu() are vectors of size 2.

givenpoint <- c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200)

# Use the output of fact_mpin() with the optimization function optim() to
# find optimal estimates of the MPIN model.

model <- suppressWarnings(optim(givenpoint, fact_mpin(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c(paste("alpha", 1:2, sep = ""), paste("delta", 1:2, sep = ""),
              paste("mu", 1:2, sep = ""), "eb", "es")
estimates <- setNames(model$par, varnames)
show(estimates)
#>       alpha1       alpha2       delta1       delta2          mu1          mu2 
#> 6.157480e-01 3.553656e-01 9.057862e-01 6.349368e-02 6.018977e+02 1.032878e+03 
#>           eb           es 
#> 4.038852e+02 2.557359e+02 

# Find the value of the MPIN log-likelihood function at givenpoint

lklValue <- fact_mpin(xdata, givenpoint)

show(lklValue)
#> [1] -5791.781

# ------------------------------------------------------------------------ #
# Using fact_adjpin() to find the value of the AdjPIN likelihood function  #
# as factorized by Ersan and Ghachem (2022b).                              #
# ------------------------------------------------------------------------ #

# Choose an initial parameter set at which to evaluate the likelihood
# function: givenpoint = (alpha, delta, theta, theta', eps.b, eps.s,
# muB, muS, db, ds)

givenpoint <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)

# Use the output of fact_adjpin() with the optimization function
# neldermead() to find optimal estimates of the AdjPIN model.

low <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
up <- c(1, 1, 1, 1, Inf, Inf, Inf, Inf, Inf, Inf)
model <- nloptr::neldermead(
  givenpoint, fact_adjpin(xdata), lower = low, upper = up
)

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "theta", "thetap", "eps.b", "eps.s",
              "muB", "muS", "db", "ds")
estimates <- setNames(model$par, varnames)
show(estimates)
#>        alpha        delta        theta       thetap        eps.b        eps.s 
#> 5.862029e-01 2.201064e-01 3.830600e-01 7.207765e-03 3.353657e+02 3.336114e+02 
#>          muB          muS           db           ds 
#> 1.499150e+03 8.755152e+02 6.426980e+02 8.399421e-01 

# Find the value of the log-likelihood function at givenpoint

adjlklValue <- fact_adjpin(xdata, givenpoint)
show(adjlklValue)
#> [1] -8711.678