# Factorizations of the different PIN likelihood functions

Source: `R/model_factorizations.R`

The `PIN` likelihood function is derived from the original `PIN` model as
developed by Easley and O'Hara (1992) and Easley et al. (1996). Maximizing the
likelihood function as is leads to computational problems, in particular
floating-point errors. To remedy this issue, several log-transformations or
factorizations of the different `PIN` likelihood functions have been suggested.
The main factorizations in the literature are:

- `fact_pin_eho()`: factorization of Easley et al. (2010)
- `fact_pin_lk()`: factorization of Lin and Ke (2011)
- `fact_pin_e()`: factorization of Ersan (2016)
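
The floating-point problem described above can be illustrated with a minimal base R sketch. The trade count, arrival rates, and mixture weights are hypothetical, and `log_sum_exp()` is a generic helper written for this illustration. It is not one of the factorizations listed here; it only shows the general idea of working in the log domain.

```r
# Hypothetical trading day: 900 buys, Poisson arrival rate of 800.
b <- 900
rate <- 800

# Naive Poisson probability: exp(-800) underflows to 0 while 800^900 and
# factorial(900) overflow to Inf, so the result is not a finite number.
naive <- suppressWarnings(exp(-rate) * rate^b / factorial(b))
print(naive)

# Working in the log domain keeps the same quantity finite.
log_prob <- dpois(b, rate, log = TRUE)
print(log_prob)

# Generic log-sum-exp helper (illustrative, not part of the package):
# computes log(sum(exp(x))) after factoring out the largest term.
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Log-likelihood of one day under a two-state mixture
# (hypothetical weights and arrival rates).
weights <- c(0.6, 0.4)
rates <- c(300, 1100)
day_loglik <- log_sum_exp(log(weights) + dpois(b, rates, log = TRUE))
print(day_loglik)
```

Factoring out the largest exponent before summing is what keeps every intermediate value representable; the published factorizations apply the same principle to the specific structure of each `PIN` likelihood.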

The likelihood function of the multilayer `PIN` model is factorized as
developed in Ersan (2016):

- `fact_mpin()`: factorization of Ersan (2016)

The factorization of the likelihood function of the adjusted `PIN` model
(Duarte and Young 2009) is derived and presented in Ersan and Ghachem (2022b):

- `fact_adjpin()`: factorization in Ersan and Ghachem (2022b)

## Usage

```
fact_pin_eho(data, parameters = NULL)
fact_pin_lk(data, parameters = NULL)
fact_pin_e(data, parameters = NULL)
fact_mpin(data, parameters = NULL)
fact_adjpin(data, parameters = NULL)
```

## Arguments

- data
A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

- parameters
In the case of the `PIN` likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)). In the case of the `MPIN` likelihood factorization, it is an ordered numeric vector (**\(\alpha\)**, **\(\delta\)**, **\(\mu\)**, \(\epsilon_b\), \(\epsilon_s\)), where **\(\alpha\)**, **\(\delta\)**, and **\(\mu\)** are numeric vectors of size `J`, where `J` is the number of information layers in the data. In the case of the `AdjPIN` likelihood factorization, it is an ordered numeric vector (\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)). The default value is `NULL`.
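
As a brief illustration of the three orderings described for `parameters`, the vectors can be assembled as follows. The numeric values are the starting points used in the Examples section of this page; the variable names and the length check are only illustrative.

```r
# PIN: (alpha, delta, mu, eps.b, eps.s)
pin_params <- c(0.4, 0.1, 800, 300, 200)

# MPIN with J = 2 information layers: the J alphas come first, then the
# J deltas, then the J mus, followed by eps.b and eps.s.
J <- 2
alpha <- c(0.4, 0.5)
delta <- c(0.1, 0.6)
mu <- c(600, 1000)
mpin_params <- c(alpha, delta, mu, 300, 200)

# The MPIN vector therefore has 3 * J + 2 elements.
stopifnot(length(mpin_params) == 3 * J + 2)

# AdjPIN: (alpha, delta, theta, theta', eps.b, eps.s, mu.b, mu.s, d.b, d.s)
adjpin_params <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)

print(mpin_params)
```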

## Value

If the argument `parameters` is omitted, returns a function object that can be
used with the optimization functions `optim()` and `neldermead()`.
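
The two return modes can be mimicked with a small, self-contained sketch. `toy_fact()` is a hypothetical stand-in that uses two independent Poisson rates rather than any `PIN` likelihood, and the data are made up. The sketch assumes that the returned function object is the negated log-likelihood, since `optim()` minimizes by default; that sign convention is an assumption of this sketch, not something this page states.

```r
# Hypothetical stand-in for a fact_*() function: two independent Poisson
# rates (one for buys, one for sells) instead of a PIN likelihood.
toy_fact <- function(data, parameters = NULL) {
  buys <- data[[1]]
  sells <- data[[2]]
  loglik <- function(p) {
    sum(dpois(buys, p[1], log = TRUE) + dpois(sells, p[2], log = TRUE))
  }
  if (is.null(parameters)) {
    # Assumption: the objective is negated because optim() minimizes.
    return(function(p) -loglik(p))
  }
  # Parameters supplied: return the log-likelihood value itself.
  loglik(parameters)
}

# Made-up daily buys and sells.
toy_data <- data.frame(B = c(310, 295, 305, 290), S = c(190, 210, 205, 195))

# Mode 1: no parameters, so the result is a function passed to optim().
fit <- suppressWarnings(optim(c(250, 250), toy_fact(toy_data)))
print(round(fit$par, 1))

# Mode 2: parameters given, so the result is a single numeric value.
value <- toy_fact(toy_data, c(300, 200))
print(value)
```

For this toy model the maximum-likelihood rates are the column means, so the optimizer should land close to 300 and 200.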

If the argument `parameters` is provided, returns a numeric value of the
log-likelihood function evaluated at the dataset `data` and the parameters
`parameters`, where `parameters` is a numeric vector following this order:
(\(\alpha\), \(\delta\), \(\mu\), \(\epsilon_b\), \(\epsilon_s\)) for the
factorizations of the `PIN` likelihood function, (**\(\alpha\)**,
**\(\delta\)**, **\(\mu\)**, \(\epsilon_b\), \(\epsilon_s\)) for the
factorization of the `MPIN` likelihood function, and (\(\alpha\), \(\delta\),
\(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\),
\(\Delta_b\), \(\Delta_s\)) for the factorization of the `AdjPIN` likelihood
function.

## Details

The argument `data` should be a numeric dataframe containing at least two
variables. Only the first two variables are considered: the first is assumed
to hold the total number of buyer-initiated trades, and the second the total
number of seller-initiated trades. Each row (observation) corresponds to a
trading day. `NA` values are ignored.
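
A minimal base R sketch of preparing such a dataframe is given below. The column names and values are hypothetical. This page says only that `NA` values are ignored, so dropping incomplete rows beforehand is just one way to make the input explicit, not a description of the package's internal handling.

```r
# Hypothetical raw data with an extra column and some missing values.
raw <- data.frame(
  B = c(310, NA, 305, 290),
  S = c(190, 210, 205, NA),
  volume = c(500, 520, 510, 480)
)

# Keep only the first two variables (buys, then sells).
trades <- raw[, 1:2]

# Drop trading days where either count is missing.
trades <- trades[complete.cases(trades), ]

print(trades)
```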

Our tests, in line with Lin and Ke (2011) and Ersan and Alici (2016), show
very similar results for `fact_pin_lk()` and `fact_pin_e()`, both yielding
substantially better estimates than `fact_pin_eho()`.

## References

Duarte J, Young L (2009).
“Why is PIN priced?”
*Journal of Financial Economics*, **91**(2), 119-138.
ISSN 0304-405X.

Easley D, Hvidkjaer S, O'Hara M (2010).
“Factoring information into returns.”
*Journal of Financial and Quantitative Analysis*, **45**(2), 293-309.
ISSN 0022-1090.

Easley D, Kiefer NM, O'Hara M, Paperman JB (1996).
“Liquidity, information, and infrequently traded stocks.”
*Journal of Finance*, **51**(4), 1405-1436.
ISSN 0022-1082.

Easley D, O'Hara M (1992).
“Time and the Process of Security Price Adjustment.”
*The Journal of Finance*, **47**(2), 577-605.
ISSN 1540-6261.

Ersan O (2016).
“Multilayer Probability of Informed Trading.”
*Available at SSRN 2874420*.

Ersan O, Alici A (2016).
“An unbiased computation methodology for estimating the probability of informed trading (PIN).”
*Journal of International Financial Markets, Institutions and Money*, **43**, 74-94.
ISSN 1042-4431.

Ersan O, Ghachem M (2022b).
“A methodological approach to the computational problems in the estimation of adjusted PIN model.”
*Available at SSRN 4117954*.

Lin H, Ke W (2011).
“A computing bias in estimating the probability of informed trading.”
*Journal of Financial Markets*, **14**(4), 625-640.
ISSN 1386-4181.

## Examples

```
# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades
library(PINstimation) # assumed package providing dailytrades and fact_*()
xdata <- dailytrades
# ------------------------------------------------------------------------ #
# Using fact_pin_eho(), fact_pin_lk(), fact_pin_e() to find the likelihood #
# value as factorized by Easley et al. (2010), Lin & Ke (2011), and        #
# Ersan (2016).                                                            #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint = (alpha, delta, mu, eps.b, eps.s)
givenpoint <- c(0.4, 0.1, 800, 300, 200)
# Use the output of fact_pin_e() with the optimization function optim() to
# find optimal estimates of the PIN model.
model <- suppressWarnings(optim(givenpoint, fact_pin_e(xdata)))
# Collect the model estimates from the variable model and display them.
varnames <- c("alpha", "delta", "mu", "eps.b", "eps.s")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha delta mu eps.b eps.s
#> 0.88135868 0.06522792 870.07467354 455.66252617 378.39347697
# Find the value of the log-likelihood function at givenpoint
lklValue <- fact_pin_lk(xdata, givenpoint)
show(lklValue)
#> [1] -9104.868
# ------------------------------------------------------------------------ #
# Using fact_mpin() to find the value of the MPIN likelihood function as   #
# factorized by Ersan (2016).                                              #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at a
# givenpoint = (alpha(), delta(), mu(), eps.b, eps.s) where alpha(), delta()
# and mu() are vectors of size 2.
givenpoint <- c(0.4, 0.5, 0.1, 0.6, 600, 1000, 300, 200)
# Use the output of fact_mpin() with the optimization function optim() to
# find optimal estimates of the MPIN model.
model <- suppressWarnings(optim(givenpoint, fact_mpin(xdata)))
# Collect the model estimates from the variable model and display them.
varnames <- c(paste("alpha", 1:2, sep = ""), paste("delta", 1:2, sep = ""),
              paste("mu", 1:2, sep = ""), "eb", "es")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha1 alpha2 delta1 delta2 mu1 mu2
#> 6.157480e-01 3.553656e-01 9.057862e-01 6.349368e-02 6.018977e+02 1.032878e+03
#> eb es
#> 4.038852e+02 2.557359e+02
# Find the value of the MPIN likelihood function at givenpoint
lklValue <- fact_mpin(xdata, givenpoint)
show(lklValue)
#> [1] -5791.781
# ------------------------------------------------------------------------ #
# Using fact_adjpin() to find the value of the DY likelihood function as   #
# factorized by Ersan and Ghachem (2022b).                                 #
# ------------------------------------------------------------------------ #
# Choose a given parameter set to evaluate the likelihood function at:
# givenpoint = (alpha, delta, theta, theta', eps.b, eps.s, muB, muS, db, ds)
givenpoint <- c(0.4, 0.1, 0.3, 0.7, 500, 600, 800, 1000, 300, 200)
# Use the output of fact_adjpin() with the optimization function
# neldermead() to find optimal estimates of the AdjPIN model.
low <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
up <- c(1, 1, 1, 1, Inf, Inf, Inf, Inf, Inf, Inf)
model <- nloptr::neldermead(
  givenpoint, fact_adjpin(xdata), lower = low, upper = up)
# Collect the model estimates from the variable model and display them.
varnames <- c("alpha", "delta", "theta", "thetap", "eps.b", "eps.s",
              "muB", "muS", "db", "ds")
estimates <- setNames(model$par, varnames)
show(estimates)
#> alpha delta theta thetap eps.b eps.s
#> 5.675257e-01 1.941894e-01 5.363950e-01 1.232052e-03 3.355085e+02 3.336092e+02
#> muB muS db ds
#> 1.506971e+03 8.750166e+02 6.445527e+02 5.761288e+00
# Find the value of the log-likelihood function at givenpoint
adjlklValue <- fact_adjpin(xdata, givenpoint)
show(adjlklValue)
#> [1] -8711.678
```