
Estimates the Probability of Informed Trading (PIN) using the initial parameter set generated by the algorithm in Gan et al. (2015).


pin_gwj(data, factorization = "E", verbose = TRUE)



data: A dataframe with two variables: the first corresponds to buyer-initiated trades (buys), and the second corresponds to seller-initiated trades (sells).


factorization: A character string from {"EHO", "LK", "E", "NONE"} referring to a given factorization. The default value is "E".


verbose: A logical value that determines whether detailed information about the steps of the estimation of the PIN model is displayed. No output is produced when verbose is set to FALSE. The default value is TRUE.


Returns an object of class


The argument 'data' should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to hold the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
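
The input requirements above can be checked before estimation. The following is a minimal base-R sketch, not part of the package: the data frame `raw`, its column names, and its values are hypothetical, and dropping incomplete rows up front is an assumption made for illustration, since the text only states that NA values are ignored.

```r
# Hypothetical raw data: one row per trading day, plus an extra column
raw <- data.frame(
  buys   = c(350, 250, NA, 500, 552),
  sells  = c(382, 500, 463, 467, NA),
  volume = c(1200, 1100, 900, 1500, 1300)
)

# Only the first two columns matter: buys first, sells second
trades <- raw[, 1:2]

# Both columns must be numeric
stopifnot(all(vapply(trades, is.numeric, logical(1))))

# Assumption for this sketch: drop days with a missing value in either column
trades <- trades[complete.cases(trades), ]

# Number of trading days left for estimation
print(nrow(trades))
```

Putting buys and sells in the first two positions explicitly avoids relying on column names, which the function does not appear to use.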

The factorization argument takes one of four values:

  • "EHO" refers to the factorization in Easley et al. (2010)

  • "LK" refers to the factorization in Lin and Ke (2011)

  • "E" refers to the factorization in Ersan (2016)

  • "NONE" refers to the original likelihood function - with no factorization

The function pin_gwj() implements the algorithm detailed in Gan et al. (2015). You can use the function initials_pin_gwj() to get the initial parameter set.
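
To compare the four factorizations on the same data, a loop along the following lines may help. This is a sketch under assumptions: it supposes the functions and the 'dailytrades' dataset come from an installed package named PINstimation, that the estimated PIN is available in the `pin` slot of the returned object, and that initials_pin_gwj() accepts the data as its first argument. The code is skipped when the package is not available.

```r
# The four values accepted by the factorization argument
factorizations <- c("EHO", "LK", "E", "NONE")

# Assumption: the functions live in a package called PINstimation
if (requireNamespace("PINstimation", quietly = TRUE)) {
  xdata <- PINstimation::dailytrades

  # Estimate the model once per factorization and collect the PIN values
  pins <- sapply(factorizations, function(f) {
    est <- PINstimation::pin_gwj(xdata, factorization = f, verbose = FALSE)
    est@pin  # assumed slot name for the estimated PIN
  })
  print(pins)

  # Inspect the initial parameter set on its own
  print(PINstimation::initials_pin_gwj(xdata))
}
```

If the factorizations are numerically well behaved on a given dataset, the resulting PIN values would be expected to be close to one another; large differences may point to floating-point problems in the unfactorized likelihood.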


Easley D, Hvidkjaer S, O'Hara M (2010). “Factoring information into returns.” Journal of Financial and Quantitative Analysis, 45(2), 293-309. ISSN 00221090.

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Gan Q, Wei WC, Johnstone D (2015). “A faster estimation method for the probability of informed trading using hierarchical agglomerative clustering.” Quantitative Finance, 15(11), 1805-1821.

Lin H, Ke W (2011). “A computing bias in estimating the probability of informed trading.” Journal of Financial Markets, 14(4), 625-640. ISSN 1386-4181.


# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# Estimate the PIN model using the factorization of Ersan (2016), and initial
# parameter sets generated using the algorithm of Gan et al. (2015).
# The argument factorization is omitted, so it takes its default value "E".

estimate <- pin_gwj(xdata, verbose = FALSE)

# Display the estimated PIN value

show(estimate@pin)
#> [1] 0.4417375

# Display the estimated parameters

show(estimate@parameters)
#>        alpha        delta           mu        eps.b        eps.s 
#>    0.5833376    0.1714269 1197.2546207  554.0730552  328.5610583 

# Store the initial parameter sets used for MLE in a dataframe variable,
# and display its first five rows

initialsets <- estimate@initialsets
show(head(initialsets, 5))
#>       alpha     delta       mu    eps.b    eps.s
#> 1 0.5666667 0.1764706 1214.401 556.6875 336.1852
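
As a consistency check on the output above, the PIN can be recomputed by hand from the displayed parameter estimates, using the standard PIN formula alpha * mu / (alpha * mu + eps.b + eps.s). The sketch below uses base R only; the vector `params` simply copies the printed estimates and is not produced by the package.

```r
# Parameter estimates copied from the output shown above
params <- c(alpha = 0.5833376, delta = 0.1714269, mu = 1197.2546207,
            eps.b = 554.0730552, eps.s = 328.5610583)

# Expected informed order flow: alpha * mu
informed <- params[["alpha"]] * params[["mu"]]

# PIN = informed flow / (informed flow + uninformed buys + uninformed sells)
pin_by_hand <- informed / (informed + params[["eps.b"]] + params[["eps.s"]])

# Agrees with the displayed PIN value up to rounding
print(all.equal(pin_by_hand, 0.4417375, tolerance = 1e-6))
```

Note that delta does not enter the formula: it governs whether an information event is good or bad news, not how much of the order flow is informed.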