
The PIN likelihood function is derived from the original PIN model as developed by Easley and O'Hara (1992) and Easley et al. (1996). Maximizing the likelihood function in its original form leads to computational problems, in particular floating-point errors. To remedy this issue, several log-transformations or factorizations of the different PIN likelihood functions have been suggested. The main factorizations in the literature are:

  • fact_pin_eho(): factorization of Easley et al. (2010)

  • fact_pin_lk(): factorization of Lin and Ke (2011)

  • fact_pin_e(): factorization of Ersan (2016)

The factorization of the likelihood function of the multilayer PIN (MPIN) model is developed in Ersan (2016):

  • fact_mpin(): factorization of Ersan (2016)

The factorization of the likelihood function of the adjusted PIN (AdjPIN) model (Duarte and Young 2009) is derived and presented in Ersan and Ghachem (2022b):

  • fact_adjpin(): factorization in Ersan and Ghachem (2022b)
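To make the floating-point problem concrete, the sketch below evaluates the likelihood of a single trading day under the standard PIN mixture of three Poisson regimes (no information, bad news, good news), first naively and then on the log scale with a log-sum-exp step. It is a generic base R illustration, not the factorization implemented by any of the functions listed here; the helper names `naive_loglik()` and `stable_loglik()` and the trade counts are assumptions made for the example.

```r
# Single-day PIN likelihood: mixture of three Poisson regimes.
# p = (alpha, delta, mu, eps.b, eps.s); b, s = buys and sells on that day.

naive_loglik <- function(p, b, s) {
  a <- p[1]; d <- p[2]; mu <- p[3]; eb <- p[4]; es <- p[5]
  lik <- (1 - a) * dpois(b, eb) * dpois(s, es) +       # no information
    a * d * dpois(b, eb) * dpois(s, es + mu) +         # bad news
    a * (1 - d) * dpois(b, eb + mu) * dpois(s, es)     # good news
  log(lik)
}

stable_loglik <- function(p, b, s) {
  a <- p[1]; d <- p[2]; mu <- p[3]; eb <- p[4]; es <- p[5]
  # Log of each mixture term, computed without leaving the log scale.
  terms <- c(
    log(1 - a) + dpois(b, eb, log = TRUE) + dpois(s, es, log = TRUE),
    log(a * d) + dpois(b, eb, log = TRUE) + dpois(s, es + mu, log = TRUE),
    log(a * (1 - d)) + dpois(b, eb + mu, log = TRUE) + dpois(s, es, log = TRUE)
  )
  # log-sum-exp: factor out the largest term before exponentiating.
  m <- max(terms)
  m + log(sum(exp(terms - m)))
}

p0 <- c(0.4, 0.1, 800, 300, 200)

# A day with trading volume far above the assumed rates.
naive_loglik(p0, b = 5000, s = 4000)    # -Inf: every product underflows to 0
stable_loglik(p0, b = 5000, s = 4000)   # finite value

# On an ordinary day both versions agree.
all.equal(naive_loglik(p0, 310, 190), stable_loglik(p0, 310, 190))
```

An optimizer that encounters `-Inf` at a trial point cannot compare it with neighbouring points, which is why a log-scale formulation matters for estimation and not only for evaluation.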

Usage

fact_pin_eho(data, parameters = NULL)

fact_pin_lk(data, parameters = NULL)

fact_pin_e(data, parameters = NULL)

fact_mpin(data, parameters = NULL)

fact_adjpin(data, parameters = NULL)

Arguments

data

A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

parameters

In the case of the PIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)). In the case of the MPIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)), where \(\alpha\), \(\delta\), and \(\mu\) are numeric vectors of size J, and J is the number of information layers in the data. In the case of the AdjPIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)). The default value is NULL.
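Because the MPIN vector concatenates three layer-specific blocks before the two noise rates, its length is 3J + 2. A minimal sketch of how such a vector could be assembled and split again for J = 2 layers follows; the helper names `pack_mpin()` and `unpack_mpin()` are hypothetical and not part of the package.

```r
# Hypothetical helper: build the ordered MPIN vector from its blocks.
pack_mpin <- function(alpha, delta, mu, eps.b, eps.s) {
  stopifnot(length(alpha) == length(delta), length(alpha) == length(mu))
  c(alpha, delta, mu, eps.b, eps.s)
}

# Hypothetical helper: recover the blocks from a vector of length 3J + 2.
unpack_mpin <- function(parameters) {
  J <- (length(parameters) - 2) / 3
  stopifnot(J >= 1, J == round(J))
  list(
    alpha = parameters[1:J],
    delta = parameters[(J + 1):(2 * J)],
    mu    = parameters[(2 * J + 1):(3 * J)],
    eps.b = parameters[3 * J + 1],
    eps.s = parameters[3 * J + 2]
  )
}

p <- pack_mpin(alpha = c(0.4, 0.5), delta = c(0.1, 0.6),
               mu = c(600, 1000), eps.b = 300, eps.s = 200)
length(p)            # 3 * 2 + 2 = 8
unpack_mpin(p)$mu    # 600 1000
```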

Value

If the argument parameters is omitted, returns a function object that can be used with the optimization functions optim() and neldermead(). If the argument parameters is provided, returns the numeric value of the log-likelihood function evaluated at the dataset data and the parameter vector parameters, where parameters follows the order (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorizations of the PIN likelihood function, (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorization of the MPIN likelihood function, and (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)) for the factorization of the AdjPIN likelihood function.
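The two return modes can be pictured with a small stand-in that follows the same calling convention. The function `fact_demo()` below is hypothetical and fits a one-parameter Poisson model, not a PIN model; the sketch also assumes that the function returned in optimizer mode yields the negative log-likelihood, since optim() minimizes by default. That sign convention is an assumption of the sketch, not a statement about the package internals.

```r
# Hypothetical stand-in with the same calling convention: a one-parameter
# Poisson model for the number of buys. Not a PIN likelihood.
fact_demo <- function(data, parameters = NULL) {
  buys <- data[[1]]
  loglik <- function(p) sum(dpois(buys, lambda = p[1], log = TRUE))
  if (is.null(parameters)) {
    # Optimizer mode: return a function of the parameters only.
    # Assumption: negated, because optim() minimizes by default.
    return(function(p) -loglik(p))
  }
  # Evaluation mode: return the log-likelihood at the given point.
  loglik(parameters)
}

toy <- data.frame(B = c(310, 295, 305, 290), S = c(200, 210, 190, 205))

# Evaluation mode: a single numeric value.
value <- fact_demo(toy, 300)

# Optimizer mode: pass the returned function to optim().
fit <- optim(250, fact_demo(toy), method = "Brent", lower = 1, upper = 1000)
fit$par   # close to mean(toy$B), the Poisson maximum-likelihood estimate
```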

Details

The argument 'data' should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to hold the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
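As a rough illustration of these requirements, the sketch below builds a dataframe in the expected shape and shows one way to check it before estimation. The column names and counts are made up for the example, and dropping incomplete rows is only one plausible reading of how NA values are ignored; the package's internal handling is not shown here.

```r
# Hypothetical daily counts; extra columns are allowed but not used.
trades <- data.frame(
  buys  = c(310, 295, NA, 290, 305),
  sells = c(200, NA, 190, 205, 198),
  note  = c("a", "b", "c", "d", "e")
)

# Keep the first two columns, as only these are considered.
bs <- trades[, 1:2]

# Both columns should be numeric.
stopifnot(all(vapply(bs, is.numeric, logical(1))))

# Assumption: a day with a missing buy or sell count is dropped entirely.
complete <- bs[complete.cases(bs), ]
nrow(complete)   # 3 trading days remain
```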

Our tests, in line with Lin and Ke (2011) and Ersan and Alici (2016), show very similar results for fact_pin_lk() and fact_pin_e(), both of which yield substantially better estimates than fact_pin_eho().

References

Duarte J, Young L (2009). “Why is PIN priced?” Journal of Financial Economics, 91(2), 119–138. ISSN 0304-405X.

Easley D, Hvidkjaer S, O'Hara M (2010). “Factoring information into returns.” Journal of Financial and Quantitative Analysis, 45(2), 293–309. ISSN 0022-1090.

Easley D, Kiefer NM, O'Hara M, Paperman JB (1996). “Liquidity, information, and infrequently traded stocks.” The Journal of Finance, 51(4), 1405–1436. ISSN 0022-1082.

Easley D, O'Hara M (1992). “Time and the Process of Security Price Adjustment.” The Journal of Finance, 47(2), 577–605. ISSN 1540-6261.

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74–94. ISSN 1042-4431.

Ersan O, Ghachem M (2022b). “A methodological approach to the computational problems in the estimation of adjusted PIN model.” Available at SSRN 4117954.

Lin H, Ke W (2011). “A computing bias in estimating the probability of informed trading.” Journal of Financial Markets, 14(4), 625–640. ISSN 1386-4181.

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the total
# number of buyer-initiated transactions ('B') and seller-initiated
# transactions ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# ------------------------------------------------------------------------ #
# Using fact_pin_eho(), fact_pin_lk(), fact_pin_e() to find the likelihood #
# value as factorized by Easley et al. (2010), Lin and Ke (2011), and      #
# Ersan (2016).                                                            #
# ------------------------------------------------------------------------ #

# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint  = (alpha, delta, mu, eps.b, eps.s)

givenpoint <- c(0.4, 0.1, 800, 300, 200)

# Use the output of fact_pin_e() with the optimization function optim() to
# find optimal estimates of the PIN model.

model <- suppressWarnings(optim(givenpoint, fact_pin_e(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "mu", "eps.b", "eps.s")
estimates <- setNames(model$par, varnames)
show(estimates)
#>        alpha        delta           mu        eps.b        eps.s 
#>   0.88135868   0.06522792 870.07467354 455.66252617 378.39347697 

# Find the value of the log-likelihood function at givenpoint

lklValue <- fact_pin_lk(xdata, givenpoint)

show(lklValue)
#> [1] -9104.868

# ------------------------------------------------------------------------ #
# Using fact_mpin() to find the value of the MPIN likelihood function as   #
# factorized by Ersan (2016).                                              #
# ------------------------------------------------------------------------ #

# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint  = (alpha(), delta(), mu(), eps.b, eps.s) where alpha(), delta()
# and mu() are vectors of size 2.

givenpoint <- c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200)

# Use the output of fact_mpin() with the optimization function optim() to
# find optimal estimates of the MPIN model.

model <- suppressWarnings(optim(givenpoint, fact_mpin(xdata)))

# Collect the model estimates from the variable model and display them.

varnames <- c(paste("alpha", 1:2, sep = ""), paste("delta", 1:2, sep = ""),
              paste("mu", 1:2, sep = ""), "eb", "es")
estimates <- setNames(model$par, varnames)
show(estimates)
#>       alpha1       alpha2       delta1       delta2          mu1          mu2 
#> 6.157480e-01 3.553656e-01 9.057862e-01 6.349368e-02 6.018977e+02 1.032878e+03 
#>           eb           es 
#> 4.038852e+02 2.557359e+02 

# Find the value of the MPIN likelihood function at givenpoint

lklValue <- fact_mpin(xdata, givenpoint)

show(lklValue)
#> [1] -5791.781

# ------------------------------------------------------------------------ #
# Using fact_adjpin() to find the value of the AdjPIN likelihood function  #
# as factorized by Ersan and Ghachem (2022b).                              #
# ------------------------------------------------------------------------ #

# Choose a given parameter set to evaluate the likelihood function
# at the initial parameter set givenpoint = (alpha, delta,
# theta, theta', eps.b, eps.s, muB, muS, db, ds)

givenpoint <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)

# Use the output of fact_adjpin() with the optimization function
# neldermead() to find optimal estimates of the AdjPIN model.

low <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
up <- c(1, 1, 1, 1, Inf, Inf, Inf, Inf, Inf, Inf)
model <- nloptr::neldermead(
  givenpoint, fact_adjpin(xdata), lower = low, upper = up)

# Collect the model estimates from the variable model and display them.

varnames <- c("alpha", "delta", "theta", "thetap", "eps.b", "eps.s",
              "muB", "muS", "db", "ds")
estimates <- setNames(model$par, varnames)
show(estimates)
#>        alpha        delta        theta       thetap        eps.b        eps.s 
#> 5.675257e-01 1.941894e-01 5.363950e-01 1.232052e-03 3.355085e+02 3.336092e+02 
#>          muB          muS           db           ds 
#> 1.506971e+03 8.750166e+02 6.445527e+02 5.761288e+00 

# Find the value of the log-likelihood function at givenpoint

adjlklValue <- fact_adjpin(xdata, givenpoint)
show(adjlklValue)
#> [1] -8711.678