Estimates the Probability of Informed Trading (`PIN`) using Bayesian Gibbs sampling as in Griffin et al. (2021) and the initial sets from the algorithm in Ersan and Alici (2016).

## Usage

```
pin_bayes(data, xtraclusters = 4, sweeps = 1000, burnin = 500,
          prior.a = 1, prior.b = 2, verbose = TRUE)
```

## Arguments

- data
A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

- xtraclusters
An integer used to divide trading days into `#(2 + xtraclusters)` clusters, thereby resulting in `#comb(1 + xtraclusters, 1)` initial parameter sets in line with Ersan and Alici (2016). The default value is `4`.

- sweeps
An integer referring to the number of iterations for the Gibbs sampler. This has to be large enough to ensure convergence of the Markov chain. The default value is `1000`.

- burnin
An integer referring to the number of initial iterations whose parameter draws are discarded. This ensures that only draws made after the MCMC has converged to the region of the parameter space in which the parameter estimate is likely to fall are kept. This number must always be less than `sweeps`. The default value is `500`.

- prior.a
An integer controlling the mean number of informed trades, such that the prior of informed buys and sells is the Gamma density function with \(\mu \sim \mathrm{Ga}(\texttt{prior.a}, \eta)\). The default value is `1`. For more details, please refer to Griffin et al. (2021).

- prior.b
An integer controlling the mean number of uninformed trades, such that the prior of uninformed buys and sells is the Gamma density function with \(\epsilon_b \sim \mathrm{Ga}(\texttt{prior.b}, \eta)\) and \(\epsilon_s \sim \mathrm{Ga}(\texttt{prior.b}, \eta)\). The default value is `2`. For more details, please refer to Griffin et al. (2021).

- verbose
A binary variable that determines whether detailed information about the steps of the estimation of the PIN model is displayed. No output is produced when `verbose` is set to `FALSE`. The default value is `TRUE`.
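The relation between `sweeps` and `burnin` can be illustrated with a toy chain. The following is a minimal base-R sketch, not the package's sampler: the vector `draws` is simulated and merely stands in for the Gibbs draws of a single parameter, and the drift from 0.2 to 0.8 is a made-up convergence pattern.

```r
# Toy illustration only: a simulated chain stands in for Gibbs sampler draws.
set.seed(1)
sweeps <- 1000
burnin <- 500
stopifnot(burnin < sweeps)  # burnin must be smaller than sweeps

# Hypothetical draws of one parameter: early values drift from a poor
# starting point (0.2) towards the stationary level (0.8), plus noise.
draws <- 0.8 - 0.6 * exp(-(1:sweeps) / 50) + rnorm(sweeps, sd = 0.05)

# Discard the first 'burnin' draws and summarise the remaining ones.
kept <- draws[(burnin + 1):sweeps]
length(kept)  # sweeps - burnin = 500 retained draws
mean(kept)    # posterior-mean style summary of the retained draws
```

With the default values, 500 of the 1000 draws are retained; a larger `burnin` trades retained draws for a lower risk of including pre-convergence values.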

## Details

The argument `data` should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to hold the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. `NA` values are ignored.
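A dataframe can be brought into this shape before estimation along the following lines. This is a sketch under assumptions: the dataframe `raw` and its column names are hypothetical, and dropping whole days with a missing count via `na.omit()` is one plausible reading of "ignored", not a documented description of what `pin_bayes()` does internally.

```r
# Hypothetical raw data: daily buys and sells plus an unrelated extra column.
raw <- data.frame(
  buys  = c(120, 95, NA, 143),
  sells = c(101, 110, 87, NA),
  note  = c("a", "b", "c", "d")
)

# Keep only the first two columns: buys first, sells second.
trades <- raw[, 1:2]

# Both columns must be numeric.
stopifnot(all(vapply(trades, is.numeric, logical(1))))

# Assumption: drop trading days where either count is missing.
trades <- na.omit(trades)
nrow(trades)
```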

The function `pin_bayes()` runs the Gibbs sampler of Griffin et al. (2021), starting from the initial parameter sets generated by the algorithm detailed in Ersan and Alici (2016). The higher the number of additional clusters (`xtraclusters`), the better the estimation. Ersan and Alici (2016), however, have shown that the benefit of increasing this number beyond 5 is marginal and statistically insignificant.
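The effect of `xtraclusters` on the number of clusters and initial parameter sets follows directly from the counts given in the Arguments section. A small base-R sketch (the variable names `n_clusters` and `n_initial_sets` are illustrative only):

```r
# Number of clusters and initial parameter sets for xtraclusters = 0, ..., 5,
# using the counts stated in the Arguments section.
xtraclusters   <- 0:5
n_clusters     <- 2 + xtraclusters
n_initial_sets <- choose(1 + xtraclusters, 1)

data.frame(xtraclusters, n_clusters, n_initial_sets)
```

For the default `xtraclusters = 4`, this gives 6 clusters and 5 initial parameter sets, so each extra cluster adds one more chain to run.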

The function `initials_pin_ea()` provides the initial parameter sets obtained through the implementation of the Ersan and Alici (2016) algorithm. For further information on the initial parameter set determination, see `initials_pin_ea()`.

## References

Ersan O, Alici A (2016).
“An unbiased computation methodology for estimating the probability of informed trading (PIN).”
*Journal of International Financial Markets, Institutions and Money*, **43**, 74–94.
ISSN 10424431.

Griffin J, Oberoi J, Oduro SD (2021).
“Estimating the probability of informed trading: A Bayesian approach.”
*Journal of Banking & Finance*, **125**, 106045.

## Examples

```
# Use the function generatedata_mpin() to generate a dataset of
# 60 days according to the assumptions of the original PIN model.
sdata <- generatedata_mpin(layers = 1)
xdata <- sdata@data
# Estimate the PIN model using the Bayesian approach developed in
# Griffin et al. (2021), and initial parameter sets generated using the
# algorithm of Ersan and Alici (2016). The argument xtraclusters is
# omitted so will take its default value 4. We also leave the arguments
# 'sweeps' and 'burnin' at their default values.
estimate <- pin_bayes(xdata, verbose = FALSE)
# Display the empirical PIN value in the data, and the PIN value
# estimated using the Bayesian approach
setNames(c(sdata@emp.pin, estimate@pin), c("data", "estimate"))
#> data estimate
#> 0.03118213 0.03085038
# Display the empirical and the estimated parameters
show(unlist(sdata@empiricals))
#> alpha delta mu eps.b eps.s
#> 8.000000e-01 4.583333e-01 8.739662e+02 9.895824e+03 1.182724e+04
show(estimate@parameters)
#> alpha delta mu eps.b eps.s
#> 7.916234e-01 4.576451e-01 8.735210e+02 9.884163e+03 1.183896e+04
# Find the initial set that leads to the optimal estimate
optimal <- which.max(estimate@details$likelihood)
# Store the matrix of Monte Carlo simulation for the optimal
# estimate, and display its last five rows
mcmatrix <- estimate@details$markovmatrix[[optimal]]
show(tail(mcmatrix, 5))
#> alpha delta mu eps.b eps.s PIN
#> sweep.996 0.7779183 0.4891180 898.5476 9889.621 11862.55 0.03113409
#> sweep.997 0.7251994 0.3436632 897.3669 9858.720 11868.38 0.02908097
#> sweep.998 0.7249533 0.3278762 891.8707 9858.303 11836.27 0.02894054
#> sweep.999 0.8553372 0.4304456 901.6256 9891.904 11802.18 0.03432826
#> sweep.1000 0.8226413 0.5006092 908.7577 9878.343 11850.33 0.03326095
# Display the summary of Geweke test for the Monte Carlo matrix above.
show(estimate@details$summary[[optimal]])
#> mean std.dev geweke.z-score geweke.p-value
#> alpha 7.916234e-01 0.049640817 -1.159521 0.12312200
#> delta 4.576451e-01 0.067547990 1.855060 0.03179385
#> mu 8.735210e+02 19.906073480 2.502118 0.00617263
#> eps.b 9.884163e+03 14.771331512 -1.624263 0.05215985
#> eps.s 1.183896e+04 16.367303544 -2.319559 0.01018237
#> PIN 3.084648e-02 0.002000441 1.663777 0.04807846
```
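The printed output above can be cross-checked by hand. The sketch below is an illustration rather than part of the package: it recomputes the empirical PIN from the printed empirical parameters using the standard PIN formula, \(\alpha\mu / (\alpha\mu + \epsilon_b + \epsilon_s)\), and recomputes three of the Geweke p-values from their z-scores. Treating the p-values as one-sided normal tail probabilities is an inference from the printed numbers, not a documented property of `pin_bayes()`.

```r
# Empirical parameters as printed in the example output (rounded values).
alpha <- 0.8
mu    <- 873.9662
eps_b <- 9895.824
eps_s <- 11827.24

# Standard PIN formula: expected informed trades over expected total trades.
pin <- alpha * mu / (alpha * mu + eps_b + eps_s)
pin

# Geweke z-scores as printed in the summary table.
z <- c(alpha = -1.159521, delta = 1.855060, mu = 2.502118)

# Assumption: p-values are one-sided tail probabilities of a standard normal.
p <- pnorm(-abs(z))
p
```

Under these assumptions, `pin` agrees with the printed `data` value and `p` agrees with the printed `geweke.p-value` column up to rounding.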