Estimates the Adjusted Probability of Informed Trading (`adjPIN`) as well as
the Probability of Symmetric Order-flow Shock (`PSOS`) from the `AdjPIN`
model of Duarte and Young (2009).

## Usage

```
adjpin(data, method = "ECM", initialsets = "GE", num_init = 20,
       restricted = list(), ..., verbose = TRUE)
```

## Arguments

- data
A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

- method
A character string referring to the method used to estimate the model of Duarte and Young (2009). It takes one of two values: `"ML"` refers to the standard maximum likelihood estimation, and `"ECM"` refers to the expectation-conditional maximization algorithm. The default value is `"ECM"`. Details of the ECM method and comparative results can be found in Ghachem and Ersan (2022a) and in Ghachem and Ersan (2022b).

- initialsets
It can either be a character string referring to prebuilt algorithms generating initial parameter sets or a dataframe containing custom initial parameter sets. If `initialsets` is a character string, it refers to the method of generation of the initial parameter sets, and takes one of three values: `"GE"`, `"CL"`, or `"RANDOM"`. `"GE"` refers to initial parameter sets generated by the algorithm of Ersan and Ghachem (2022b), and implemented in `initials_adjpin()`; `"CL"` refers to initial parameter sets generated by the algorithm of Cheng and Lai (2021), and implemented in `initials_adjpin_cl()`; while `"RANDOM"` generates random initial parameter sets as implemented in `initials_adjpin_rnd()`. The default value is `"GE"`. If `initialsets` is a dataframe, the function `adjpin()` will estimate the AdjPIN model using the provided initial parameter sets.

- num_init
An integer specifying the maximum number of initial parameter sets to be used in the estimation. If `initialsets="GE"`, the generation of initial parameter sets will stop when the number of initial parameter sets reaches `num_init`. It can stop earlier if the number of all possible generated initial parameter sets is lower than `num_init`. If `initialsets="RANDOM"`, exactly `num_init` initial parameter sets are returned. If `initialsets="CL"`, then `num_init` is ignored, and all `256` initial parameter sets are used. The default value is `20`. `[i]` The argument `num_init` is ignored when the argument `initialsets` is a dataframe.

- restricted
A binary list that allows estimating restricted AdjPIN models by specifying which model parameters are assumed to be equal. It contains one or more of the following four elements: `{theta, mu, eps, d}`. For instance, if `theta` is set to `TRUE`, then the probability of liquidity shock in no-information days and in information days is assumed to be the same (\(\theta = \theta'\)). If any of the remaining rate elements `{mu, eps, d}` is set to `TRUE` (say `mu=TRUE`), then the rate is assumed to be the same on the buy side and on the sell side (\(\mu_b = \mu_s\)). If more than one element is set to `TRUE`, then the restrictions are combined. For instance, if the argument `restricted` is set to `list(theta=TRUE, eps=TRUE, d=TRUE)`, then the restricted AdjPIN model is estimated, where \(\theta = \theta'\), \(\epsilon_b = \epsilon_s\), and \(\Delta_b = \Delta_s\). If the value of the argument `restricted` is the empty list (`list()`), then all parameters of the model are assumed to be independent, and the unrestricted model is estimated. The default value is the empty list `list()`.

- ...
Additional arguments passed on to the function `adjpin()`. The recognized arguments are `hyperparams` and `fact`. The argument `hyperparams` consists of a list containing the hyperparameters of the `ECM` algorithm. When not empty, it contains one or more of the following elements: `maxeval` and `tolerance`. It is used only when the `method` argument is set to `"ECM"`. The argument `fact` is a binary value that determines which likelihood functional form is used: the factorization of the likelihood function by Ersan and Ghachem (2022b) when it is set to `TRUE`, and the original likelihood function of Duarte and Young (2009) otherwise. The default value is `TRUE`. More about these arguments can be found in the Details section.

- verbose
A binary variable that determines whether detailed information about the steps of the estimation of the AdjPIN model is displayed. No output is produced when `verbose` is set to `FALSE`. The default value is `TRUE`.
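
To make the `restricted` argument more concrete, the following base R sketch maps a `restricted` list to the parameter equalities described above. The helper `describe_restrictions()` is hypothetical and is not part of the package; it only illustrates how the four recognized elements combine.

```r
# Hypothetical helper (NOT part of the package): translate a `restricted`
# list into the parameter equalities it implies.
describe_restrictions <- function(restricted = list()) {
  equalities <- c(
    theta = "theta = theta'",
    mu    = "mu.b = mu.s",
    eps   = "eps.b = eps.s",
    d     = "d.b = d.s"
  )
  # Names of the elements that are explicitly set to TRUE
  active <- names(restricted)[vapply(restricted, isTRUE, logical(1))]
  # Keep only recognized elements, in the order theta, mu, eps, d
  active <- names(equalities)[names(equalities) %in% active]
  if (length(active) == 0) return("unrestricted model")
  unname(equalities[active])
}

# Combined restrictions, as in list(theta=TRUE, eps=TRUE, d=TRUE)
describe_restrictions(list(theta = TRUE, eps = TRUE, d = TRUE))

# The empty list corresponds to the unrestricted model
describe_restrictions(list())
```

The same lists can be passed directly as the `restricted` argument of `adjpin()`, as done in the Examples section.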

## Details

The argument `data` should be a numeric dataframe, and contain
at least two variables. Only the first two variables will be considered:
the first variable is assumed to correspond to the total number of
buyer-initiated trades, while the second variable is assumed to
correspond to the total number of seller-initiated trades. Each row or
observation corresponds to a trading day. `NA` values will be ignored.
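
The layout expected for `data` can be sketched in base R as follows. The toy numbers and the column names `B`, `S`, and `volume` are made up for illustration, and dropping whole rows that contain `NA` is an assumption about how missing values might be handled beforehand, not a description of the package internals.

```r
# Toy trade counts (made-up numbers); the third column is extra and one day
# has a missing value.
trades <- data.frame(
  B      = c(1520, 1498, NA, 1610),
  S      = c(1433, 1507, 1490, 1388),
  volume = c(9.1, 8.7, 9.4, 9.9)
)

# Keep only the first two variables: buys first, sells second.
xdata <- trades[, 1:2]

# Both variables must be numeric.
stopifnot(all(vapply(xdata, is.numeric, logical(1))))

# Assumption: trading days with a missing value are dropped as whole rows.
xdata <- xdata[complete.cases(xdata), ]

# Number of trading days left for the estimation
nrow(xdata)
```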

If `initialsets` is neither a dataframe, nor a character string from the
set `{"GE", "CL", "RANDOM"}`, the estimation of the `AdjPIN` model is
aborted. The default initial parameters (`"GE"`) for the estimation
method are generated using a modified hierarchical agglomerative
clustering. For more information, see `initials_adjpin()`.
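
The check described above could look roughly like the sketch below. The function `check_initialsets()` is hypothetical and is not the package's code; in particular, treating the strings as case-sensitive is an assumption.

```r
# Hypothetical validator (NOT the package's code): accept a dataframe or one
# of the three recognized strings, and abort otherwise.
check_initialsets <- function(initialsets) {
  if (is.data.frame(initialsets)) return("custom")
  valid <- c("GE", "CL", "RANDOM")
  # Assumption: the comparison is case-sensitive
  if (is.character(initialsets) && length(initialsets) == 1 &&
      initialsets %in% valid) {
    return(initialsets)
  }
  stop("'initialsets' must be a dataframe or one of: ",
       paste(valid, collapse = ", "))
}

check_initialsets("GE")

# An unrecognized value aborts with an error, captured here for display
tryCatch(check_initialsets("XYZ"), error = function(e) conditionMessage(e))
```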

The argument `hyperparams` contains the hyperparameters of the `ECM`
algorithm. It is either empty or contains one or two of the following
elements:

- `maxeval` (`integer`): the maximum number of iterations of the `ECM` algorithm for each initial parameter set. When missing, `maxeval` takes the default value of `100`.

- `tolerance` (`numeric`): the `ECM` algorithm is stopped when the (relative) change of log-likelihood is smaller than `tolerance`. When missing, `tolerance` takes the default value of `0.001`.
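
The interplay between `maxeval` and `tolerance` can be illustrated with a small base R sketch. The helper `iterations_used()` and the log-likelihood values are hypothetical, and the sketch assumes that "relative change" means `|new - old| / |old|`; the exact rule used by the package may differ.

```r
# Hypothetical helper (NOT part of the package): walk along a sequence of
# log-likelihood values and report how many iterations are performed.
# loglik[1] is the value at the initial parameter set, loglik[i + 1] the
# value after iteration i.
# Assumption: "relative change" means |new - old| / |old|.
iterations_used <- function(loglik, maxeval = 100, tolerance = 0.001) {
  last <- min(length(loglik), maxeval + 1)
  for (i in seq_len(last)[-1]) {
    rel_change <- abs(loglik[i] - loglik[i - 1]) / abs(loglik[i - 1])
    # Stop as soon as the improvement falls below the tolerance
    if (rel_change < tolerance) return(i - 1)
  }
  # Otherwise stop when the path or the iteration budget is exhausted
  last - 1
}

# Made-up log-likelihood path for one initial parameter set
loglik <- c(-700, -680, -670, -668, -667.9, -667.89)

iterations_used(loglik)                    # default maxeval and tolerance
iterations_used(loglik, maxeval = 2)       # iteration budget binds first
iterations_used(loglik, tolerance = 1e-5)  # stricter tolerance, runs longer
```

In a call to `adjpin()`, the same settings would be supplied through `...`, for example `hyperparams = list(maxeval = 50, tolerance = 1e-4)` together with `method = "ECM"`.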

## References

Cheng T, Lai H (2021).
“Improvements in estimating the probability of informed trading models.”
*Quantitative Finance*, **21**(5), 771-796.

Duarte J, Young L (2009).
“Why is PIN priced?”
*Journal of Financial Economics*, **91**(2), 119-138.
ISSN 0304405X.

Ersan O, Ghachem M (2022b).
“A methodological approach to the computational problems in the estimation of adjusted PIN model.”
*Available at SSRN 4117954*.

Ghachem M, Ersan O (2022a).
“Estimation of the probability of informed trading models via an expectation-conditional maximization algorithm.”
*Available at SSRN 4117952*.

Ghachem M, Ersan O (2022b).
“PINstimation: An R package for estimating models of probability of informed trading.”
*Available at SSRN 4117946*.

## Examples

```
# We use 'generatedata_adjpin()' to generate an S4 object of type 'dataset'
# with 60 observations.
sim_data <- generatedata_adjpin(days = 60)
# The actual dataset of 60 observations is stored in the slot 'data' of the
# S4 object 'sim_data'. Each observation corresponds to a day and contains
# the total number of buyer-initiated transactions ('B') and seller-
# initiated transactions ('S') on that day.
xdata <- sim_data@data
# ------------------------------------------------------------------------ #
# Compare the unrestricted AdjPIN model with various restricted models #
# ------------------------------------------------------------------------ #
# Estimate the unrestricted AdjPIN model using the ECM algorithm (default),
# and show the estimation output
estimate.adjpin.0 <- adjpin(xdata, verbose = FALSE)
show(estimate.adjpin.0)
#> ----------------------------------
#> AdjPIN estimation completed successfully
#> ----------------------------------
#> Likelihood factorization: Ersan and Ghachem (2022b)
#> Estimation Algorithm : Expectation-Conditional Maximization
#> Initial parameter sets : Ersan and Ghachem (2022b)
#> Model Restrictions : Unrestricted model
#> ----------------------------------
#> 20 initial set(s) are used in the estimation
#> Type object@initialsets to see the initial parameter sets used
#>
#> AdjPIN model
#>
#> =========== ==============
#> Variables Estimates
#> =========== ==============
#> alpha 0.03334
#> delta 0.5
#> theta 0.931031
#> theta' 0.5
#> ----
#> eps.b 1661
#> eps.s 1952.08
#> mu.b 3459
#> mu.s 3442.74
#> d.b 1429.93
#> d.s 1425.22
#> ----
#> Likelihood (664.083)
#> adjPIN 0.018132
#> PSOS 0.412461
#> =========== ==============
#>
#> -------
#> Running time: 1.896 seconds
# Estimate the restricted AdjPIN model where mu.b=mu.s
estimate.adjpin.1 <- adjpin(xdata, restricted = list(mu = TRUE),
                            verbose = FALSE)
# Estimate the restricted AdjPIN model where eps.b=eps.s
estimate.adjpin.2 <- adjpin(xdata, restricted = list(eps = TRUE),
                            verbose = FALSE)
# Estimate the restricted AdjPIN model where d.b=d.s
estimate.adjpin.3 <- adjpin(xdata, restricted = list(d = TRUE),
                            verbose = FALSE)
# Compare the different values of adjusted PIN
estimates <- list(estimate.adjpin.0, estimate.adjpin.1,
                  estimate.adjpin.2, estimate.adjpin.3)
adjpins <- sapply(estimates, function(x) x@adjpin)
psos <- sapply(estimates, function(x) x@psos)
summary <- cbind(adjpins, psos)
rownames(summary) <- c("unrestricted", "same.mu", "same.eps", "same.d")
show(round(summary, 5))
#>              adjpins    psos
#> unrestricted 0.01813 0.41246
#> same.mu      0.01814 0.41238
#> same.eps     0.01771 0.40802
#> same.d       0.01812 0.41235
```
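
As a cross-check of the output above, the sketch below recomputes `adjPIN` and `PSOS` from the printed parameter estimates. It assumes the usual expressions of Duarte and Young (2009): each measure is the expected informed (respectively symmetric-shock) order flow divided by the total expected order flow, with `delta` taken as the probability that an information event is bad news. The variable names are illustrative and the code does not call the package.

```r
# Parameter estimates copied from the output of show(estimate.adjpin.0)
est <- list(alpha = 0.03334, delta = 0.5, theta = 0.931031, thetap = 0.5,
            eps.b = 1661, eps.s = 1952.08, mu.b = 3459, mu.s = 3442.74,
            d.b = 1429.93, d.s = 1425.22)

# Expected informed order flow (assumed: delta = probability of bad news)
informed <- est$alpha * (est$delta * est$mu.s + (1 - est$delta) * est$mu.b)

# Expected order flow from symmetric shocks: theta' applies on information
# days, theta on no-information days
shock <- (est$d.b + est$d.s) *
  (est$alpha * est$thetap + (1 - est$alpha) * est$theta)

# Expected uninformed order flow
noise <- est$eps.b + est$eps.s

total <- informed + shock + noise
adjpin_value <- informed / total
psos_value <- shock / total

# Agrees with the adjPIN and PSOS values reported above, up to rounding of
# the printed estimates
round(c(adjPIN = adjpin_value, PSOS = psos_value), 5)
```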