
Estimates the Probability of Informed Trading (PIN) using initial parameter sets generated by the grid search algorithm of Yan and Zhang (2012).


pin_yz(data, factorization, ea_correction = FALSE, grid_size = 5,
       verbose = TRUE)



data
A dataframe with two variables: the first corresponds to buyer-initiated trades (buys), and the second to seller-initiated trades (sells).


factorization
A character string from {"EHO", "LK", "E", "NONE"} specifying the factorization to use. The default value is "E".


ea_correction
A binary variable determining whether the modifications to the Yan and Zhang (2012) algorithm suggested by Ersan and Alici (2016) are implemented. The default value is FALSE.


grid_size
An integer between 1 and 20 representing the size of the grid. The default value is 5. See Details for more information.


verbose
A binary variable that determines whether detailed information about the steps of the PIN model estimation is displayed. No output is produced when verbose is set to FALSE. The default value is TRUE.


Returns an object of class


The argument 'data' should be a numeric dataframe containing at least two variables. Only the first two variables are considered: the first is assumed to be the total number of buyer-initiated trades, and the second the total number of seller-initiated trades. Each row (observation) corresponds to a trading day. NA values are ignored.
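As a rough illustration of the input layout described above, the following sketch builds a small hypothetical data frame (the column names and counts are invented for the example) and reduces it to the two columns that the description says are considered. It mimics the documented behavior in base R and is not the package's internal code; in particular, dropping whole rows that contain an NA is an assumption about how missing values are ignored.

```r
# Hypothetical daily trade counts; names and values are made up for illustration.
trades <- data.frame(
  buys   = c(350, 410, NA, 298, 505),
  sells  = c(320, 290, 275, 330, NA),
  volume = c(9100, 8700, 7600, 8000, 9900)  # extra column, not considered
)

# Keep only the first two columns: buys first, sells second.
two_cols <- trades[, 1:2]

# Drop trading days with a missing value (assumed handling of NA values).
clean <- na.omit(two_cols)

# Both remaining columns must be numeric.
stopifnot(all(vapply(clean, is.numeric, logical(1))))
print(clean)
```

A data frame shaped like `clean` matches the layout the 'data' argument expects: one row per trading day, buys in the first column, sells in the second.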

The factorization variable takes one of four values:

  • "EHO" refers to the factorization in Easley et al. (2010)

  • "LK" refers to the factorization in Lin and Ke (2011)

  • "E" refers to the factorization in Ersan (2016)

  • "NONE" refers to the original likelihood function - with no factorization

The argument grid_size determines the size of the grid for the variables alpha, delta, and eps.b. If grid_size is set to a value m, the algorithm creates a sequence starting at 1/(2m) and ending at 1 - 1/(2m), with a step of 1/m. The default value of 5 corresponds to the grid size in Yan and Zhang (2012). In that case, the sequence starts at 0.1 = 1/(2 x 5) and ends at 0.9 = 1 - 1/(2 x 5), with a step of 0.2 = 1/5.
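The grid sequence described above can be reproduced in a few lines of base R. This is a minimal sketch, not the package's implementation: the helper name grid_points() is hypothetical, and the expand.grid() step only illustrates how many raw combinations a grid of size m yields before any filtering or rescaling the estimation routine may apply.

```r
# Midpoints of m equal-width cells on (0, 1): 1/(2m), 3/(2m), ..., 1 - 1/(2m).
# grid_points() is a hypothetical helper, not a function exported by the package.
grid_points <- function(m) {
  stopifnot(m >= 1, m <= 20, m == round(m))
  (2 * seq_len(m) - 1) / (2 * m)
}

pts5 <- grid_points(5)  # default grid: 0.1 0.3 0.5 0.7 0.9
pts3 <- grid_points(3)  # 1/6, 3/6, 5/6

# Raw combinations of the three gridded quantities (column names are
# illustrative; the third column holds grid fractions, not trade rates).
combos <- expand.grid(alpha = pts3, delta = pts3, eps.b = pts3)

print(pts5)
print(nrow(combos))     # m^3 = 27 combinations for m = 3
```

Writing the points as (2k - 1)/(2m) for k = 1, ..., m is equivalent to stepping by 1/m from 1/(2m), and avoids relying on floating-point accumulation in seq().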

By default, pin_yz() implements the original Yan and Zhang (2012) algorithm, since ea_correction defaults to FALSE. When ea_correction is set to TRUE, sets with irrelevant mu values are excluded, and sets with boundary values are reintegrated into the initial parameter sets.
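One way to see the effect of ea_correction is to compare the initial parameter sets produced with and without it. The sketch below assumes that pin_yz() and the dailytrades dataset are provided by the PINstimation package, and it only runs if that package is installed; the variable names are illustrative.

```r
# Assumption: pin_yz() and dailytrades come from the PINstimation package.
n_sets <- NULL
if (requireNamespace("PINstimation", quietly = TRUE)) {
  xdata <- PINstimation::dailytrades

  # Original Yan and Zhang (2012) initial sets.
  original <- PINstimation::pin_yz(
    xdata, "E", ea_correction = FALSE, verbose = FALSE
  )

  # Initial sets with the Ersan and Alici (2016) modifications.
  corrected <- PINstimation::pin_yz(
    xdata, "E", ea_correction = TRUE, verbose = FALSE
  )

  # Number of initial parameter sets used in each run.
  n_sets <- c(
    original  = nrow(original@initialsets),
    corrected = nrow(corrected@initialsets)
  )
  print(n_sets)
}
```

Whether the two counts differ depends on the data, since the correction both removes and reintroduces sets.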


Easley D, Hvidkjaer S, O'Hara M (2010). “Factoring information into returns.” Journal of Financial and Quantitative Analysis, 45(2), 293–309. ISSN 0022-1090.

Ersan O (2016). “Multilayer Probability of Informed Trading.” Available at SSRN 2874420.

Ersan O, Alici A (2016). “An unbiased computation methodology for estimating the probability of informed trading (PIN).” Journal of International Financial Markets, Institutions and Money, 43, 74–94. ISSN 1042-4431.

Lin H, Ke W (2011). “A computing bias in estimating the probability of informed trading.” Journal of Financial Markets, 14(4), 625–640. ISSN 1386-4181.

Yan Y, Zhang S (2012). “An improved estimation method and empirical properties of the probability of informed trading.” Journal of Banking and Finance, 36(2), 454–467. ISSN 0378-4266.


# There is a preloaded quarterly dataset called 'dailytrades' with 60
# observations. Each observation corresponds to a day and contains the
# total number of buyer-initiated trades ('B') and seller-initiated
# trades ('S') on that day. To know more, type ?dailytrades

xdata <- dailytrades

# Estimate the PIN model using the factorization of Lin and Ke (2011) and
# initial parameter sets generated by the algorithm of Yan and Zhang (2012).
# In contrast to the original algorithm, which uses a grid of size 5, we set
# the grid size for the grid search to 3.

estimate <- pin_yz(xdata, "LK", grid_size = 3, verbose = FALSE)

# Display the estimated PIN value

show(estimate@pin)
#> [1] 0.5661739

# Display the estimated parameters

show(estimate@parameters)
#>        alpha        delta           mu        eps.b        eps.s 
#>    0.7500022    0.1333345 1193.5176294  357.2654069  328.6290139 

# Store the initial parameter sets used for MLE in a dataframe variable,
# and display its first five rows

initialsets <- estimate@initialsets
show(head(initialsets, 5))
#>       alpha     delta     mu    eps.b    eps.s
#> 1 0.1666667 0.1666667 6946.5 192.9583 230.3250
#> 2 0.5000000 0.1666667 2315.5 192.9583 230.3250
#> 3 0.8333333 0.1666667 1389.3 192.9583 230.3250
#> 4 0.1666667 0.1666667 4167.9 578.8750 307.5083
#> 5 0.5000000 0.1666667 1389.3 578.8750 307.5083