Skip to contents

Based on the algorithm in Ersan and Alici (2016), generates initial parameter sets for the maximum likelihood estimation of the PIN model.

Usage

initials_pin_ea(data, xtraclusters = 4, verbose = TRUE)

Arguments

data

A dataframe with 2 variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).

xtraclusters

An integer used to divide trading days into #(2 + xtraclusters) clusters, thereby resulting in #comb(1 + xtraclusters, 1) initial parameter sets in line with Ersan and Alici (2016) . The default value is 4.

verbose

a binary variable that determines whether information messages about the initial parameter sets, including the number of the initial parameter sets generated. No message is shown when verbose is set to FALSE. The default value is TRUE.

Value

Returns a dataframe of initial sets each consisting of five variables {\(\alpha\), \(\delta\), \(\mu\), \(\epsilon\)b, \(\epsilon\)s}.

Details

The argument 'data' should be a numeric dataframe, and contain at least two variables. Only the first two variables will be considered: The first variable is assumed to correspond to the total number of buyer-initiated trades, while the second variable is assumed to correspond to the total number of seller-initiated trades. Each row or observation correspond to a trading day. NA values will be ignored.

The function initials_pin_ea() uses a hierarchical agglomerative clustering (HAC) to find initial parameter sets for the maximum likelihood estimation. The steps in Ersan and Alici (2016) algorithm differ from those used by Gan et al. (2015) , and are summarized below.

Via the use of HAC, daily absolute order imbalances (AOIs) are grouped in 2+J (default J=4) clusters. After sorting the clusters based on AOIs, they are combined into two larger groups of days (event and no-event) by merging neighboring clusters with each other. Consequently, those groups are formed in #comb(5, 1) = 5 different ways. For each of the 5 configurations with which, days are grouped into two (event group and no-event group), the procedure below is applied to obtain initial parameter sets.

Days in the event group (the one with larger mean AOI) are distributed into two groups, i.e. good-event days (days with positive OI) and bad-event days (days with negative OI). Initial parameters are obtained from the frequencies, and average trade rates of three types of days. See Ersan and Alici (2016) for further details.

The higher the number of the additional clusters (xtraclusters), the better is the estimation. Ersan and Alici (2016) , however, have shown the benefit of increasing this number beyond 4 is marginal, and statistically insignificant.

References

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74--94. ISSN 10424431.

Gan Q, Wei WC, Johnstone D (2015). “A faster estimation method for the probability of informed trading using hierarchical agglomerative clustering.” Quantitative Finance, 15(11), 1805--1821.

Examples

# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# Obtain a dataframe of initial parameters for the maximum likelihood
# estimation using the algorithm of Ersan and Alici (2016).

init.sets <- initials_pin_ea(xdata)
#> The function initials_pin_ea(...) has generated 5 initial parameter sets.
#> 
To display them, either store them in a variable or call (initials_pin_ea(...)). 
#> 
To hide these messages, set the argument 'verbose' to FALSE.
#> 

# Use the obtained dataframe to estimate the PIN model using the function
# pin() with custom initial parameter sets

estimate.1 <- pin(xdata, initialsets = init.sets, verbose = FALSE)

# pin_ea() directly estimates the PIN model using initial parameter sets
# generated using the algorithm of Ersan & Alici (2016).

estimate.2 <- pin_ea(xdata, verbose = FALSE)

# Check that the obtained results are identical

show(estimate.1@parameters)
#>        alpha        delta           mu        eps.b        eps.s 
#>    0.7499975    0.1333342 1193.5179655  357.2659099  328.6291793 
show(estimate.2@parameters)
#>        alpha        delta           mu        eps.b        eps.s 
#>    0.7499975    0.1333342 1193.5179655  357.2659099  328.6291793