Factorizations of the different PIN likelihood functions
Source: R/model_factorizations.R
factorizations.Rd
The PIN likelihood function is derived from the original PIN model as developed by Easley and O'Hara (1992) and Easley et al. (1996). Maximizing the likelihood function in its raw form leads to computational problems, in particular floating-point errors. To remedy this issue, several log-transformations or factorizations of the different PIN likelihood functions have been suggested.
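The floating-point problem can be reproduced with a single Poisson term, the building block of the PIN likelihood. The snippet below is a minimal base-R sketch; the rate and trade count are illustrative values, not taken from any dataset, and the log-scale formula is the generic textbook remedy rather than any of the specific factorizations listed in this topic.

```r
# Poisson probability of observing 500 trades when the rate is 300,
# computed directly from the textbook formula exp(-l) * l^k / k!.
rate <- 300
trades <- 500
naive <- suppressWarnings(exp(-rate) * rate^trades / factorial(trades))

# The same quantity on the log scale: -l + k * log(l) - log(k!).
# lgamma(k + 1) returns log(k!) without forming k! itself.
stable_log <- -rate + trades * log(rate) - lgamma(trades + 1)

print(naive)       # not finite: rate^trades and factorial(trades) both overflow
print(stable_log)  # a finite log-probability
```

Working on the log scale keeps every intermediate quantity within the range of double precision, which is the idea the factorizations build on.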
The main factorizations in the literature are:
- fact_pin_eho(): factorization of Easley et al. (2010)
- fact_pin_lk(): factorization of Lin and Ke (2011)
- fact_pin_e(): factorization of Ersan (2016)
For the multilayer PIN (MPIN) model, the likelihood function is factorized as developed in Ersan (2016):
- fact_mpin(): factorization of Ersan (2016)
For the adjusted PIN (AdjPIN) model (Duarte and Young 2009), the factorization of the likelihood function is derived and presented in Ersan and Ghachem (2022b):
- fact_adjpin(): factorization in Ersan and Ghachem (2022b)
Usage
fact_pin_eho(data, parameters = NULL)
fact_pin_lk(data, parameters = NULL)
fact_pin_e(data, parameters = NULL)
fact_mpin(data, parameters = NULL)
fact_adjpin(data, parameters = NULL)
Arguments
- data
A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).
- parameters
In the case of the PIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)). In the case of the MPIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)), where \(\alpha\), \(\delta\), and \(\mu\) are numeric vectors of size J, and J is the number of information layers in the data. In the case of the AdjPIN likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)). The default value is NULL.
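For the MPIN case, the ordering means that all \(\alpha\) values come first, then all \(\delta\) values, then all \(\mu\) values, followed by the two \(\epsilon\) rates. The base-R sketch below assembles such a vector for J = 2; make_mpin_parameters() is a hypothetical helper introduced only for illustration and is not part of the documented functions.

```r
# Hypothetical helper (not part of the documented API): flatten the
# layer-specific MPIN parameters into a single ordered vector.
make_mpin_parameters <- function(alpha, delta, mu, eps_b, eps_s) {
  # alpha, delta, and mu must each hold one value per information layer.
  stopifnot(length(alpha) == length(delta), length(alpha) == length(mu))
  c(alpha, delta, mu, eps_b, eps_s)
}

# Two information layers (J = 2) give a vector of length 3 * J + 2 = 8.
mpin_point <- make_mpin_parameters(
  alpha = c(0.4, 0.5), delta = c(0.1, 0.6), mu = c(600, 1000),
  eps_b = 300, eps_s = 200
)
print(mpin_point)
```

The resulting vector has the same layout as the MPIN givenpoint used in the Examples section.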
Value
If the argument parameters is omitted, returns a function object that can be used with the optimization functions optim() and neldermead().
If the argument parameters is provided, returns a numeric value of the log-likelihood function evaluated at the dataset data and the parameters parameters, where parameters is a numeric vector in the following order: (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorizations of the PIN likelihood function; (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the factorization of the MPIN likelihood function; and (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)) for the factorization of the AdjPIN likelihood function.
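The two return modes can be pictured with a small base-R sketch. Everything in it is an assumption made for illustration: sketch_pin_loglik() is a hypothetical function, not the package implementation; it uses the standard three-state PIN specification (no news, bad news with probability \(\alpha\delta\), good news with probability \(\alpha(1-\delta)\)) with a generic log-sum-exp evaluation rather than any of the published factorizations; and the returned function yields the negative log-likelihood because optim() minimizes by default.

```r
# Hypothetical sketch, not the package implementation.
sketch_pin_loglik <- function(data, parameters = NULL) {
  buys <- data[[1]]
  sells <- data[[2]]

  loglik <- function(p) {
    alpha <- p[1]; delta <- p[2]; mu <- p[3]; eb <- p[4]; es <- p[5]
    # Log of (state probability * Poisson(buys) * Poisson(sells)) per day,
    # one column per state: no news, bad news, good news.
    terms <- cbind(
      log(1 - alpha) +
        dpois(buys, eb, log = TRUE) + dpois(sells, es, log = TRUE),
      log(alpha * delta) +
        dpois(buys, eb, log = TRUE) + dpois(sells, es + mu, log = TRUE),
      log(alpha * (1 - delta)) +
        dpois(buys, eb + mu, log = TRUE) + dpois(sells, es, log = TRUE)
    )
    # Log-sum-exp: subtract each day's largest term before exponentiating.
    peak <- apply(terms, 1, max)
    sum(peak + log(rowSums(exp(terms - peak))))
  }

  if (is.null(parameters)) {
    # Assumption: hand back the negative log-likelihood for minimizers.
    return(function(p) -loglik(p))
  }
  loglik(parameters)
}

# Toy data (illustrative values only): four trading days of buys and sells.
toy <- data.frame(B = c(350, 1100, 320, 290), S = c(210, 190, 1050, 205))
point <- c(0.4, 0.1, 800, 300, 200)

value <- sketch_pin_loglik(toy, point)  # numeric log-likelihood at 'point'
objective <- sketch_pin_loglik(toy)     # function of the parameter vector
print(value)
print(objective(point))
```

Returning a closure when parameters is NULL lets the same function serve both as an evaluator and as an objective that an optimizer can call repeatedly on the same data.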
Details
The argument data should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to hold the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
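A dataset can be brought into this shape before it is passed to the factorization functions. The base-R sketch below is illustrative: the data frame raw and its column names are made up, and dropping every day with a missing count is only one reading of "ignored", assumed here for the example.

```r
# Toy input with an extra column and a missing value (illustrative only).
raw <- data.frame(
  buys = c(350, NA, 320, 290),
  sells = c(210, 190, 1050, 205),
  volume = c(9000, 8500, 9900, 8700)
)

# Keep the first two columns, whatever their names, and label them B and S.
trades <- raw[, 1:2]
names(trades) <- c("B", "S")

# Assumption: "ignored" is read here as dropping any day with a missing count.
trades <- trades[stats::complete.cases(trades), ]

# Both remaining columns must be numeric.
stopifnot(all(vapply(trades, is.numeric, logical(1))))
print(trades)
```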
Our tests, in line with Lin and Ke (2011) and Ersan and Alici (2016), demonstrate very similar results for fact_pin_lk() and fact_pin_e(), both yielding substantially better estimates than fact_pin_eho().
References
Duarte J, Young L (2009).
“Why is PIN priced?”
Journal of Financial Economics, 91(2), 119--138.
ISSN 0304405X.
Easley D, Hvidkjaer S, O'Hara M (2010).
“Factoring information into returns.”
Journal of Financial and Quantitative Analysis, 45(2), 293--309.
ISSN 00221090.
Easley D, Kiefer NM, O'Hara M, Paperman JB (1996).
“Liquidity, information, and infrequently traded stocks.”
Journal of Finance, 51(4), 1405--1436.
ISSN 00221082.
Easley D, O'Hara M (1992).
“Time and the Process of Security Price Adjustment.”
The Journal of Finance, 47(2), 577--605.
ISSN 15406261.
Ersan O (2016).
“Multilayer Probability of Informed Trading.”
Available at SSRN 2874420.
Ersan O, Alici A (2016).
“An unbiased computation methodology for estimating the probability of informed trading (PIN).”
Journal of International Financial Markets, Institutions and Money, 43, 74--94.
ISSN 10424431.
Ersan O, Ghachem M (2022b).
“A methodological approach to the computational problems in the estimation of adjusted PIN model.”
Available at SSRN 4117954.
Lin H, Ke W (2011).
“A computing bias in estimating the probability of informed trading.”
Journal of Financial Markets, 14(4), 625--640.
ISSN 1386-4181.
Examples
# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades
xdata <- dailytrades
# ------------------------------------------------------------------------ #
# Using fact_pin_eho(), fact_pin_lk(), fact_pin_e() to find the likelihood #
# value as factorized by Easley et al. (2010), Lin & Ke (2011), and Ersan (2016). #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint = (alpha, delta, mu, eps.b, eps.s)
givenpoint <- c(0.4, 0.1, 800, 300, 200)
# Use the output of fact_pin_e() with the optimization function optim() to
# find optimal estimates of the PIN model.
model <- suppressWarnings(optim(givenpoint, fact_pin_e(xdata)))
# Collect the model estimates from the variable model and display them.
varnames <- c("alpha", "delta", "mu", "eps.b", "eps.s")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha delta mu eps.b eps.s
#> 0.88135868 0.06522792 870.07467354 455.66252617 378.39347697
# Find the value of the log-likelihood function at givenpoint
lklValue <- fact_pin_lk(xdata, givenpoint)
show(lklValue)
#> [1] -9104.868
# ------------------------------------------------------------------------ #
# Using fact_mpin() to find the value of the MPIN likelihood function as #
# factorized by Ersan (2016). #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint = (alpha(), delta(), mu(), eps.b, eps.s) where alpha(), delta()
# and mu() are vectors of size 2.
givenpoint <- c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200)
# Use the output of fact_mpin() with the optimization function optim() to
# find optimal estimates of the MPIN model.
model <- suppressWarnings(optim(givenpoint, fact_mpin(xdata)))
# Collect the model estimates from the variable model and display them.
varnames <- c(paste("alpha", 1:2, sep = ""), paste("delta", 1:2, sep = ""),
              paste("mu", 1:2, sep = ""), "eb", "es")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha1 alpha2 delta1 delta2 mu1 mu2
#> 6.157480e-01 3.553656e-01 9.057862e-01 6.349368e-02 6.018977e+02 1.032878e+03
#> eb es
#> 4.038852e+02 2.557359e+02
# Find the value of the MPIN likelihood function at givenpoint
lklValue <- fact_mpin(xdata, givenpoint)
show(lklValue)
#> [1] -5791.781
# ------------------------------------------------------------------------ #
# Using fact_adjpin() to find the value of the AdjPIN likelihood function as #
# factorized by Ersan and Ghachem (2022b). #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at the
# initial parameter set givenpoint = (alpha, delta, theta, theta',
# eps.b, eps.s, muB, muS, db, ds)
givenpoint <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)
# Use the output of fact_adjpin() with the optimization function
# neldermead() to find optimal estimates of the AdjPIN model.
low <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
up <- c(1, 1, 1, 1, Inf, Inf, Inf, Inf, Inf, Inf)
model <- nloptr::neldermead(
  givenpoint, fact_adjpin(xdata), lower = low, upper = up)
# Collect the model estimates from the variable model and display them.
varnames <- c("alpha", "delta", "theta", "thetap", "eps.b", "eps.s",
"muB", "muS", "db", "ds")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha delta theta thetap eps.b eps.s
#> 5.862029e-01 2.201064e-01 3.830600e-01 7.207765e-03 3.353657e+02 3.336114e+02
#> muB muS db ds
#> 1.499150e+03 8.755152e+02 6.426980e+02 8.399421e-01
# Find the value of the log-likelihood function at givenpoint
adjlklValue <- fact_adjpin(xdata, givenpoint)
show(adjlklValue)
#> [1] -8711.678