
Generates a dataset object or a data.series object (a list of dataset objects) storing simulation parameters as well as aggregate daily buys and sells simulated following the assumptions of the AdjPIN model of Duarte and Young (2009).


generatedata_adjpin(series=1, days = 60, parameters = NULL, ranges = list(),
restricted = list(), verbose = TRUE)



series: The number of datasets to generate.


days: The number of trading days for which aggregate buys and sells are generated. The default value is 60.


parameters: A vector of 10 model parameters in the following order: {\(\alpha\), \(\delta\), \(\theta\), \(\theta'\), \(\epsilon_b\), \(\epsilon_s\), \(\mu_b\), \(\mu_s\), \(\Delta_b\), \(\Delta_s\)}. The default value is NULL.


ranges: A list of ranges for the simulation parameters, with named elements alpha (\(\alpha\)), delta (\(\delta\)), theta (\(\theta\)), thetap (\(\theta'\)), eps.b (\(\epsilon_b\)), eps.s (\(\epsilon_s\)), mu.b (\(\mu_b\)), mu.s (\(\mu_s\)), d.b (\(\Delta_b\)), d.s (\(\Delta_s\)). The value of each element is a vector of two numbers: the first is the minimum value min_v and the second is the maximum value max_v. If the element corresponding to a given parameter is missing, the default range for that parameter is used; otherwise, the parameter is drawn uniformly from the interval (min_v, max_v). The default value is list().
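The rule for a single element of ranges can be sketched in base R as follows. This is only an illustration: the helper resolve_range() is hypothetical and not part of the package, and the default ranges passed to it are placeholders. It also assumes, as the examples for this function suggest, that a single number fixes the parameter at that value.

```r
# Hypothetical helper (not part of the package): pick one parameter value
# from a user-supplied range, falling back to a default range.
resolve_range <- function(ranges, name, default) {
  r <- ranges[[name]]
  if (is.null(r)) r <- default          # element missing: use the default range
  if (length(r) == 1) return(r)         # single number: parameter is fixed
  runif(1, min = min(r), max = max(r))  # pair (min_v, max_v): uniform draw
}

ranges <- list(alpha = c(0.1, 0.15), delta = 0.2)
alpha <- resolve_range(ranges, "alpha", default = c(0, 1))  # drawn in (0.1, 0.15)
delta <- resolve_range(ranges, "delta", default = c(0, 1))  # fixed at 0.2
theta <- resolve_range(ranges, "theta", default = c(0, 1))  # default range used
```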


restricted: A binary list that specifies which model parameters are assumed to be equal, so that data can be generated from restricted AdjPIN models. It contains one or more of the following four elements: {theta, mu, eps, d}. For instance, if theta is set to TRUE, then the probability of a liquidity shock is assumed to be the same on no-information days and on information days (\(\theta = \theta'\)). If any of the remaining rate elements {mu, eps, d} is set to TRUE (say mu = TRUE), then the corresponding rate is assumed to be the same on the buy side and on the sell side (\(\mu_b = \mu_s\)). If more than one element is set to TRUE, the restrictions are combined. For instance, if the argument restricted is set to list(theta = TRUE, eps = TRUE, d = TRUE), then data is generated from the restricted AdjPIN model where \(\theta = \theta'\), \(\epsilon_b = \epsilon_s\), and \(\Delta_b = \Delta_s\). If the value of the argument restricted is the empty list (list()), then all parameters of the model are assumed to be independent, and the unrestricted model is used. The default value is the empty list list().
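To make the effect of the restrictions concrete, here is a minimal base-R sketch that imposes them on a named parameter vector. The helper apply_restrictions() is hypothetical, and copying the first value of each pair onto the second (rather than the reverse, or redrawing a common value) is an assumption of this sketch, not a statement about the package internals.

```r
# Hypothetical helper (not part of the package): impose equality
# restrictions on a named vector of AdjPIN parameters.
apply_restrictions <- function(p, restricted = list()) {
  if (isTRUE(restricted[["theta"]])) p["thetap"] <- p["theta"]  # theta = theta'
  if (isTRUE(restricted[["eps"]]))   p["eps.s"]  <- p["eps.b"]  # eps.b = eps.s
  if (isTRUE(restricted[["mu"]]))    p["mu.s"]   <- p["mu.b"]   # mu.b = mu.s
  if (isTRUE(restricted[["d"]]))     p["d.s"]    <- p["d.b"]    # d.b = d.s
  p
}

p <- c(alpha = 0.4, delta = 0.1, theta = 0.5, thetap = 0.6,
       eps.b = 800, eps.s = 1000, mu.b = 2300, mu.s = 4000,
       d.b = 500, d.s = 500)

# Combine three restrictions; mu.b and mu.s remain independent.
r <- apply_restrictions(p, list(theta = TRUE, eps = TRUE, d = TRUE))
```

With an empty list, the vector is returned unchanged, which corresponds to the unrestricted model.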


verbose: A binary variable that determines whether detailed information about the progress of the data generation is displayed. No output is produced when verbose is set to FALSE. The default value is TRUE.


Returns an object of class dataset if series=1, and an object of class data.series if series>1.


If the argument parameters is missing, then the parameters are generated using the ranges specified in the argument ranges. If the argument ranges is set to list(), default ranges are used. Using the default ranges, the simulation parameters are obtained using the following procedure:

  • \(\alpha\), \(\delta\): (alpha, delta) uniformly distributed on (0, 1).

  • \(\theta\), \(\theta'\): (theta, thetap) uniformly distributed on (0, 1).

  • \(\epsilon_b\): (eps.b) an integer uniformly drawn from the interval (100, 10000) with step 50.

  • \(\epsilon_s\): (eps.s) an integer uniformly drawn from ((4/5)\(\epsilon_b\), (6/5)\(\epsilon_b\)) with step 50.

  • \(\Delta_b\): (d.b) an integer uniformly drawn from ((1/2)\(\epsilon_b\), 2\(\epsilon_b\)).

  • \(\Delta_s\): (d.s) an integer uniformly drawn from ((4/5)\(\Delta_b\), (6/5)\(\Delta_b\)).

  • \(\mu_b\): (mu.b) uniformly distributed on the interval ((1/2) max(\(\epsilon_b\), \(\epsilon_s\)), 5 max(\(\epsilon_b\), \(\epsilon_s\))).

  • \(\mu_s\): (mu.s) uniformly distributed on the interval ((4/5)\(\mu_b\), (6/5)\(\mu_b\)).
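The default procedure above can be sketched in base R as follows. This is not the package's code: the function names draw_step() and draw_default_parameters() are hypothetical, and the sketch assumes that interval endpoints may be included and that the step grid for eps.s starts at the lower bound of its interval.

```r
# Hypothetical sketch of the default draw described above.
# Draw one integer from [lo, hi] on a grid with the given step.
draw_step <- function(lo, hi, step = 1) {
  grid <- seq(ceiling(lo), floor(hi), by = step)
  grid[sample.int(length(grid), 1)]  # safe even if the grid has one element
}

draw_default_parameters <- function() {
  alpha  <- runif(1)
  delta  <- runif(1)
  theta  <- runif(1)
  thetap <- runif(1)
  eps.b  <- draw_step(100, 10000, step = 50)
  eps.s  <- draw_step(0.8 * eps.b, 1.2 * eps.b, step = 50)
  d.b    <- draw_step(0.5 * eps.b, 2 * eps.b)
  d.s    <- draw_step(0.8 * d.b, 1.2 * d.b)
  eps.max <- max(eps.b, eps.s)
  mu.b   <- runif(1, 0.5 * eps.max, 5 * eps.max)
  mu.s   <- runif(1, 0.8 * mu.b, 1.2 * mu.b)
  c(alpha = alpha, delta = delta, theta = theta, thetap = thetap,
    eps.b = eps.b, eps.s = eps.s, mu.b = mu.b, mu.s = mu.s,
    d.b = d.b, d.s = d.s)
}

set.seed(123)
p0 <- draw_default_parameters()
```

Note that the rates are drawn sequentially: eps.s depends on eps.b, the liquidity-shock rates depend on eps.b, and the informed rates depend on the larger of the two uninformed rates.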

Given the simulation parameters, daily buys and sells are generated under the assumption that they follow Poisson distributions with the following mean parameters:

  • (\(\epsilon_b\), \(\epsilon_s\)) in a day with no information and no liquidity shock;

  • (\(\epsilon_b+\Delta_b\), \(\epsilon_s+\Delta_s\)) in a day with no information and with liquidity shock;

  • (\(\epsilon_b+\mu_b\), \(\epsilon_s\)) in a day with good information and no liquidity shock;

  • (\(\epsilon_b+\mu_b+\Delta_b\), \(\epsilon_s+\Delta_s\)) in a day with good information and liquidity shock;

  • (\(\epsilon_b\), \(\epsilon_s+\mu_s\)) in a day with bad information and no liquidity shock;

  • (\(\epsilon_b+\Delta_b\), \(\epsilon_s+\mu_s+\Delta_s\)) in a day with bad information and liquidity shock.
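The six cases above can be written compactly by adding the informed rate and the liquidity-shock rate to the baseline only on the days where they apply. The base-R sketch below illustrates this; it is not the package's implementation. The function simulate_adjpin_days() and the column names b and s are hypothetical, and the sketch assumes that delta is the probability that an information event is bad news, and that theta and thetap are the liquidity-shock probabilities on no-information and information days respectively.

```r
# Hypothetical sketch: simulate daily buys and sells under the AdjPIN model.
simulate_adjpin_days <- function(p, days = 60) {
  info  <- rbinom(days, 1, p[["alpha"]])         # 1 = information day
  bad   <- info * rbinom(days, 1, p[["delta"]])  # assumed: delta = P(bad news)
  good  <- info - bad
  # Liquidity shock: probability thetap on information days, theta otherwise
  shock <- rbinom(days, 1, ifelse(info == 1, p[["thetap"]], p[["theta"]]))

  # Poisson means for each day, matching the six cases listed above
  mean_b <- p[["eps.b"]] + good * p[["mu.b"]] + shock * p[["d.b"]]
  mean_s <- p[["eps.s"]] + bad  * p[["mu.s"]] + shock * p[["d.s"]]

  data.frame(b = rpois(days, mean_b), s = rpois(days, mean_s))
}

set.seed(42)
p <- c(alpha = 0.4, delta = 0.1, theta = 0.5, thetap = 0.6,
       eps.b = 800, eps.s = 1000, mu.b = 2300, mu.s = 4000,
       d.b = 500, d.s = 500)
sim <- simulate_adjpin_days(p, days = 60)
```

Each row of sim is one trading day; days without information or liquidity shocks have buys and sells centered on eps.b and eps.s.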


Duarte J, Young L (2009). “Why is PIN priced?” Journal of Financial Economics, 91(2), 119-138. ISSN 0304-405X.


# ------------------------------------------------------------------------ #
# Generate data following the AdjPIN model using generatedata_adjpin()     #
# ------------------------------------------------------------------------ #

# With no arguments, the function generates one dataset object spanning
# 60 days, and where the parameters are chosen as described in the section
# 'Details'.

sdata <- generatedata_adjpin()

# Alternatively, simulation parameters can be provided. Recall the order of
# parameters (alpha, delta, theta, theta', eps.b, eps.s, mu.b, mu.s, d.b, d.s).

givenpoint <- c(0.4, 0.1, 0.5, 0.6, 800, 1000, 2300, 4000, 500, 500)
sdata <- generatedata_adjpin(parameters = givenpoint)

# Data can be generated following restricted AdjPIN models, for example, with
# restrictions 'eps.b = eps.s', and 'mu.b = mu.s'.

sdata <- generatedata_adjpin(restricted = list(eps = TRUE, mu = TRUE))

# Data can be generated using provided ranges of simulation parameters as fed
# to the function using the argument 'ranges', where thetap corresponds to
# theta'.

sdata <- generatedata_adjpin(ranges = list(
  alpha = c(0.1, 0.15), delta = c(0.2, 0.2),
  theta = c(0.2, 0.6), thetap = c(0.2, 0.4)
))

# A given simulation parameter can be fixed at a specific value by setting
# the range of that parameter to a single value, instead of a pair of
# values.

sdata <- generatedata_adjpin(ranges = list(
  alpha = 0.4, delta = c(0.2, 0.7),
  eps.b = c(100, 7000), mu.b = 8000
))

# Display the details of the generated simulation data

show(sdata)
#> ----------------------------------
#> Data series successfully generated
#> ----------------------------------
#> Simulation model 	: AdjPIN model
#> Model Restrictions 	: Unrestricted model
#> Number of trading days	: 60 days
#> ----------------------------------
#> Type object@data to get the simulated data
#>  Data simulation  
#> ===========  ==============  ============
#> Variables    Theoretical.    Empirical.  
#> ===========  ==============  ============
#> alpha        0.4             0.4         
#> delta        0.201059        0.25        
#> theta        0.18174         0.222222    
#> theta'       0.018578        0.041667    
#> ----                                     
#> eps.b        3172            3167.34     
#> eps.s        3646            3644.48     
#> mu.b         8000            7970.21     
#> mu.s         7496            7485.51     
#> d.b          10894           10936.09    
#> d.s          9209            9187.01     
#> ----                                     
#> Likelihood                   (757.554)   
#> adjPIN       0.256           0.242       
#> PSOS         0.19            0.233       
#> ===========  ==============  ============
#> -------
#> Running time: 0.013 seconds
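
The 'Theoretical' adjPIN and PSOS values in the output above can be cross-checked by hand. The sketch below plugs the theoretical parameters from the table into the closed-form expressions usually given for the model of Duarte and Young (2009). The formulas as written here, and the reading of delta as the probability of bad news, are assumptions of this sketch and are not taken from the package source.

```r
# Theoretical parameters copied from the output table above
th <- c(alpha = 0.4, delta = 0.201059, theta = 0.18174, thetap = 0.018578,
        eps.b = 3172, eps.s = 3646, mu.b = 8000, mu.s = 7496,
        d.b = 10894, d.s = 9209)

# Expected informed order flow (assumed: delta weights the sell-side rate)
informed <- th[["alpha"]] *
  (th[["delta"]] * th[["mu.s"]] + (1 - th[["delta"]]) * th[["mu.b"]])

# Expected order flow due to symmetric liquidity shocks
shock <- (th[["d.b"]] + th[["d.s"]]) *
  (th[["alpha"]] * th[["thetap"]] + (1 - th[["alpha"]]) * th[["theta"]])

# Total expected order flow, including uninformed trading
total <- informed + shock + th[["eps.b"]] + th[["eps.s"]]

check <- round(c(adjpin = informed / total, psos = shock / total), 3)
check
# adjpin 0.256, psos 0.190
```

Both values agree with the 'Theoretical' column of the table (0.256 and 0.19).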

# ------------------------------------------------------------------------ #
# Use generatedata_adjpin() to check the accuracy of adjpin()              #
# ------------------------------------------------------------------------ #

model <- adjpin(sdata@data, verbose = FALSE)

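# Compare the empirical values stored with the simulated data against the
# estimates. The slot name emp.pin is assumed here, since the accessor is
# missing from the original listing.
summary <- cbind(
  c(sdata@emp.pin['adjpin'], model@adjpin,
    abs(model@adjpin - sdata@emp.pin['adjpin'])),
  c(sdata@emp.pin['psos'], model@psos,
    abs(model@psos - sdata@emp.pin['psos']))
)
colnames(summary) <- c('adjpin', 'psos')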
rownames(summary) <- c('Data', 'Model', 'Difference')

show(knitr::kable(summary, 'simple'))
#>                  adjpin        psos
#> -----------  ----------  ----------
#> Data          0.2420692   0.2327284
#> Model         0.2418783   0.2321891
#> Difference    0.0001909   0.0005392