Dynamic Autoregressive Liquidity (DArLiQ)

Abstract We introduce a new class of semiparametric dynamic autoregressive models for the Amihud illiquidity measure, which capture the long-run trend in the illiquidity series with a nonparametric component and the short-run dynamics with an autoregressive component. We develop a generalized method of moments (GMM) estimator based on conditional moment restrictions and an efficient semiparametric maximum likelihood (ML) estimator based on an iid assumption. We derive large sample properties for our estimators. Finally, we demonstrate the model's fitting performance and its empirical relevance in an application: we investigate how the different components of the illiquidity process obtained from our model relate to the stock market risk premium using data on the S&P 500 stock market index.


Introduction
Liquidity is a fundamental property of a well-functioning market, and a lack of liquidity is generally at the heart of many financial crises and disasters. Common ways of measuring liquidity using high-frequency data include bid-ask spreads, effective spreads, realized spreads, depth, weighted depth, and transaction volume. There is a large literature that uses such measures to compare market quality across markets, across time, and before and after interventions of various sorts. For example, such measures have been a big part of the debate around high-frequency trading, that is, whether such trading activity has improved or degraded market liquidity, see for example Hendershott, Jones, and Menkveld (2011) and O'Hara and Ye (2011). There are, however, many complex issues in working with high-frequency trade and quote data. Instead, several methods are widely used to measure liquidity from lower-frequency data, that is, daily transaction price data, see Goyenko, Holden, and Trzcinka (2009) for a review. We focus on the Amihud illiquidity measure proposed in Amihud (2002), which has proven to be very popular in the empirical literature. It is easy to implement and, by all accounts, relatively robust. It has been shown to influence the cross-section of asset returns through the so-called illiquidity premium, see Amihud and Mendelson (2015).
We propose a dynamic semiparametric model for illiquidity measured by the daily component of the Amihud measure. Specifically, we propose a multiplicative error model (MEM) that contains a nonparametric long-run trend and a parametric short-run autoregressive process, as in Engle, Gallo, and Velucchi (2012). The trend part is important for many datasets where liquidity has improved in a secular fashion, such as the S&P 500 over the last hundred years and Bitcoin over the much more recent period of its operation. The nonparametric trend is comparable with the conventional monthly averaged measure widely used in the literature, except that our measure is available daily and the implicit length of averaging is controlled by a bandwidth parameter to be chosen by practitioners. Further, the dynamic component measures the short-run variation in liquidity, which may be of equal interest.
We approach estimation of the parametric part through GMM based on the first conditional moment restriction, which is consistent under minimal conditions, as well as through a semiparametric likelihood procedure that assumes iid shocks. In the latter approach, we consider two cases: one where the shock distribution is parametric, such as the Weibull or Burr distributions, and a further case in which the shock distribution is not specified and is treated nonparametrically. The Burr distribution and the nonparametric distribution allow for the heavy tails that we might see during liquidity crises. We develop the distribution theory for the three cases to enable valid inference.
We use the five largest U.S. technology company stocks and the Bitcoin asset to demonstrate the model's performance in terms of fitting relevant features of the illiquidity data, and we provide various model diagnostics and specification tests. We show that the efficient semiparametric maximum likelihood estimator, assuming a parametric Weibull or Burr distribution for the error term, captures most of the salient features of the illiquidity process. This can be further improved by using a nonparametric density estimator.
We also investigate how the different components of the illiquidity process obtained from our model relate to the stock market risk premium using data on the S&P 500 stock market index. We find that the detrended market risk premium is positively affected by the anticipated short-run illiquidity process and negatively associated with the unanticipated component of market illiquidity. This observation is in agreement with the results of Amihud (2002), which were based on an autoregressive (AR) model fit to monthly illiquidity.
The multiplicative error model has been applied to many different positive-valued financial time series, including volatility, duration between trades, and transaction volume, see for example Engle (2002). The MEM model and its applications and developments over the last 20 years are reviewed in Cipollini and Gallo (2022). The VLAB applies this model and provides regular updates on its website (https://vlab.stern.nyu.edu/liquidity) for a number of series according to their specific implementation. We give a more detailed comparison of our model with theirs in Appendix A of Hafner, Linton, and Wang (2023).
The remainder of the article is organized as follows. In Section 2, we discuss the time series properties of the Amihud illiquidity measure and introduce our DArLiQ model. In Section 3, we discuss estimation via GMM based on conditional moment restrictions and through a semiparametric maximum likelihood procedure that assumes iid shocks. Large sample properties of our procedures are provided in Section 4. We provide a Monte Carlo study in Section 5 to analyze the finite sample performance of the GMM and semiparametric ML estimation procedures. Section 6 presents a detailed empirical application. The appendices are relegated to a separate file, which is available online as Hafner, Linton, and Wang (2023).

Amihud Illiquidity and the Model
The Amihud (2002) illiquidity measure of a stock at time t, A_t, is defined as

A_t = (1/n_t) Σ_{j=1}^{n_t} |R_{t_j}| / V_{t_j},   (1)

where R_{t_j} is the stock return and V_{t_j} is the (dollar) trading volume at time t_j, j = 1, ..., n_t. Typically, the measure is computed over monthly or lower frequencies by averaging the daily illiquidity ratio ℓ_{t_j} = |R_{t_j}|/V_{t_j} over the n_t observations within the period of interest, for example n_t being the number of trading days within a month. Intuitively, the Amihud measure captures the fact that a stock is less liquid if a given trading volume generates a larger move in its price. The Amihud illiquidity measure is a good proxy for high-frequency measures of price impact (Goyenko, Holden, and Trzcinka 2009; Hasbrouck 2009) with the advantage of only requiring daily price and volume data. Barardehi et al. (2021) proposed to replace the close-to-close return by the overnight component of that return. Fong, Holden, and Tobek (2018) proposed a more general class of liquidity measures based on ratios of functions of volatility to functions of trading volume. Both modifications can easily be accommodated in our framework, but we focus on the original Amihud measure as this is currently the most popular approach. Empirical evidence points to the existence of factors driving low-frequency variations in illiquidity dynamics in addition to higher-frequency variations. To illustrate, we plot in Figure 1 the evolution of the daily log Amihud illiquidity measure for the S&P 500 over 1950-2021. We observe that the illiquidity series exhibits a strong downward trend over time, at least up to 2005. Trends in illiquidity series are not limited to the S&P 500, as we show in Appendix B using the Fab 5 and Bitcoin series. This evidence motivates our framework for the Amihud illiquidity measure, which weakens the requirement of stationarity. We develop a new class of dynamic autoregressive liquidity (DArLiQ) models, which capture both the slowly varying long-run trend and the short-run autoregressive component relevant for illiquidity modeling.
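To fix ideas, here is a minimal sketch (in Python, with simulated placeholder data and hypothetical column names) of the daily ratio and its conventional monthly average:

```python
import numpy as np
import pandas as pd

# Daily Amihud illiquidity ratio |R_t|/V_t and its monthly average A_t.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=250)
df = pd.DataFrame({
    "ret": rng.normal(0.0, 0.01, len(dates)),            # daily close-to-close return R_t
    "dollar_vol": rng.lognormal(20.0, 0.5, len(dates)),  # daily dollar trading volume V_t
}, index=dates)

df["illiq"] = df["ret"].abs() / df["dollar_vol"]          # daily ratio |R_t|/V_t
# Conventional monthly Amihud measure: average of the daily ratios within each month.
monthly_A = df["illiq"].groupby(df.index.to_period("M")).mean()
print(monthly_A.head())
```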
Suppose that ℓ_t is a nonnegative process that follows a multiplicative process as in Engle and Gallo (2006) but possesses a nonparametric multiplicative component to account for nonstationarity or trend as in Engle and Rangel (2008) and Hafner and Linton (2010). Let

ℓ_t = g(t/T) λ_t ζ_t,   (2)
λ_t = ω + β λ_{t−1} + γ ℓ*_{t−1},   (3)

where g(·) is a positive and smooth but unknown function of rescaled time, ℓ*_t = ℓ_t/g(t/T) is the rescaled liquidity, and ζ_t is a nonnegative random variable with conditional mean one and finite unconditional variance denoted σ²_ζ. We present evidence later that the assumption of finite unconditional variance for the shock process is reasonable. Extensions to higher order models, including more lags of λ_t and ℓ*_t, are straightforward and analogous to the ARCH literature. Note that ω > 0, β ≥ 0, γ ≥ 0 are sufficient conditions for λ_t > 0 with probability one. Furthermore, provided β + γ < 1, the process ℓ*_t is stationary in mean (and perforce strictly stationary) and follows an ARMA(1,1) process (with heteroscedastic errors). If β + γ ≥ 1, then the process ℓ*_t is not weakly stationary, although it is strictly stationary for a range of such parameter values. More problematic is that in this case E(ℓ_t) ceases to exist, and our estimation strategy, which is based directly on moment restrictions, will fail. This can be addressed using the approach based on quantile restrictions developed in Koo and Linton (2015). In this article, we focus on our mean-based approach, as the evidence shows that ℓ_t possesses several moments, and so our main strategy is reasonable for most datasets. We further show in Appendix J that the normalized trend functions obtained using the mean and median smoothing methods are comparable and that the choice of the smoothing method has a minor impact on the parameter estimates for the λ_t process.
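As a concrete illustration of (2)-(3), the following sketch simulates a DArLiQ path with an illustrative exponential downward trend and unit-mean Weibull shocks; the trend shape and parameter values are assumptions for illustration, not estimates from the article:

```python
import numpy as np
from scipy.special import gamma as gamma_fn

def simulate_darliq(T=2000, beta=0.7, gamma_=0.25, phi=1.3, seed=0):
    """Simulate ell_t = g(t/T) * lambda_t * zeta_t with
    lambda_t = omega + beta*lambda_{t-1} + gamma*ell*_{t-1}, omega = 1 - beta - gamma."""
    rng = np.random.default_rng(seed)
    omega = 1.0 - beta - gamma_
    u = np.arange(1, T + 1) / T
    g = np.exp(1.0 - 2.0 * u)                 # illustrative smooth downward trend
    scale = 1.0 / gamma_fn(1.0 + 1.0 / phi)   # Weibull scale giving E[zeta_t] = 1
    zeta = scale * rng.weibull(phi, size=T)
    lam = np.empty(T)
    ell_star = np.empty(T)
    lam_prev, ell_star_prev = 1.0, 1.0        # initialize at the unconditional mean
    for t in range(T):
        lam[t] = omega + beta * lam_prev + gamma_ * ell_star_prev
        ell_star[t] = lam[t] * zeta[t]
        lam_prev, ell_star_prev = lam[t], ell_star[t]
    ell = g * ell_star
    return ell, g, lam, zeta

ell, g, lam, zeta = simulate_darliq()
print(ell[:5])
```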
There is an identification issue because we can multiply and divide the two components g and λ by the same constant. We suppose that E(λ_t) = 1, which is achieved by setting ω = 1 − β − γ. The series ℓ*_t = λ_t ζ_t possesses the same stationarity properties as ℓ_t from the model without a trend. We may suppose that the error process ζ_t is iid with some cdf F. Francq and Zakoïan (2006) (Theorems 2 and 3) ensure that the process ℓ*_t is strictly stationary and geometrically ergodic under our restrictions on β, γ. The iid assumption can be helpful for estimation, but it may also be important for the calculation of "Liquidity at Risk," which would require further assumptions about the conditional quantiles of ζ_t. Note that the process ℓ_t actually depends on T and forms a triangular array, ℓ_{t,T}, but for notational economy we suppress this dependency. The process may be initialized from its stationary distribution or from some fixed values {ℓ_0, λ_0}.

Estimation
We suppose that a sample of nonnegative ℓ_t, t = 1, ..., T, is observed. We will throughout maintain that ℓ_t possesses uniformly bounded moments up to some order, which as acknowledged requires some restriction on the dynamic parameters. Under these conditions, estimation is guided by assumptions made about the error process ζ_t. The minimalist approach is to assume only that E(ζ_t | F_{t−1}) = 1 with probability one. In that case, one can estimate the function g(·) by conditional mean smoothing of ℓ_t and the identified parameters β, γ by the GMM approach. Provided that additional high-level weak dependence conditions are satisfied, one can ensure a CLT for the resulting estimators. As in Cipollini et al. (2013), one may wish to additionally specify a second conditional moment restriction whereby E(ζ²_t − (1 + σ²_ζ) | F_{t−1}) = 0 with probability one. This additional moment restriction permits more efficient estimation provided that it is true, but if it is not, using it will bias the parameter estimates.
We may further assume that ζ_t is iid, which implies the conditional moment restriction but also other restrictions. Consider the mixed continuous/discrete case where, for all x ≥ 0, F(x) = π + (1 − π) ∫_0^x f(u) du, where π is the probability of a zero observation and f is an absolutely continuous density function with support (0, ∞). In some cases, the discrete component (zero returns) is important, while in others it is not. For estimation, we may either assume that f is of unknown form or assume that f is parametrically specified, that is, f_ϕ for some unknown shape parameter ϕ such as the Weibull or Burr. For forecasting future values of ℓ_t, one does not need the shock distribution, but prediction intervals and LAR (Liquidity at Risk) require the estimation of some features of the error distribution.

Estimation based on Conditional Moment Restriction
We first convert the conditional moment restriction E(ℓ_t | F_{t−1}) = g(t/T)λ_t to the unconditional moment restriction E(ℓ_t) = g(t/T), t = 1, ..., T. We use this condition to obtain an initial consistent estimator of g by the kernel smoothing method; specifically, we let

ĝ(u) = (1/T) Σ_{t=1}^T K_h(t/T − u) ℓ_t,

where K_h(·) = K(·/h)/h, K is a kernel function symmetric about zero, supported on [−1, 1], and satisfying ∫K(u)du = 1, while h is a bandwidth. Because of the equally spaced observations in time, the denominator of the Nadaraya-Watson estimator is unnecessary here for interior points. In this context, the kernel estimator is Best Linear Minimax (under iid errors) at any fixed u ∈ (0, 1) according to Fan (1993), that is, it is equivalent to the local linear estimator.
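A minimal sketch of this equal-spacing kernel smoother, using an Epanechnikov kernel on [−1, 1]; the function names and the simulated series are ours:

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, symmetric about zero, supported on [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def trend_hat(ell, u, h, kernel=epanechnikov):
    """ghat(u) = T^{-1} sum_t K_h(t/T - u) ell_t, with K_h(.) = K(./h)/h.
    For interior u the Nadaraya-Watson denominator is not needed because
    the time points t/T are equally spaced."""
    T = len(ell)
    grid = np.arange(1, T + 1) / T
    w = kernel((grid - u) / h) / h
    return np.mean(w * ell)

# Example on a simulated positive series with a downward trend.
rng = np.random.default_rng(1)
ell = np.exp(-np.linspace(0, 2, 1000)) * rng.exponential(1.0, 1000)
print(trend_hat(ell, u=0.5, h=0.1))
```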
The estimator does suffer from boundary bias, and in particular ĝ(0), ĝ(1) will not be consistent without modifications. A standard way to correct for boundary bias is to use boundary kernels that adapt to the estimation point as it approaches the boundary, see Gasser, Muller, and Mammitzsch (1985). An alternative method is local linear kernel regression, which does not require an explicit boundary correction, see Fan and Gijbels (1996). The issue with both methods is that the estimate of g(u) is not guaranteed to be nonnegative everywhere, whereas the simple estimator is nonnegative with probability one. In practice, at least for our application, this does not seem to be an issue, and we also use the local linear method. Nevertheless, for our theoretical results we work with ĝ(u) as above for all u ∈ [h, 1 − h], while in the boundary region we either renormalize by T^{-1} Σ_{t=1}^T K_h(t/T − u) (the Nadaraya-Watson estimator) or apply a fixed boundary adjustment; in this case, we guarantee positivity of our estimate but suffer some performance loss at the boundary. Another advantage of the simple estimator is that one can interpret the widely computed measure A_t defined in (1) as a special case of ĝ(u) with a uniform kernel and a bandwidth equivalent to a month around the point u. We define the detrended liquidity ℓ̂*_t = ℓ_t/ĝ(t/T), t = 1, ..., T. We next estimate the dynamic parameters θ = (β, γ) by exploiting the conditional moment restriction E(ℓ*_t | F_{t−1}) = λ_t, where ℓ*_t = ℓ_t/g(t/T), t = 1, ..., T. In practice, we define, for any θ ∈ Θ, where Θ is a compact set defined below, the recursion λ_t(θ, ĝ) = 1 − β − γ + β λ_{t−1}(θ, ĝ) + γ ℓ̂*_{t−1}, where we take the initializations ℓ̂*_0 = λ_0 = 1 for simplicity. Then we define the sample moment function M_T(θ, ĝ) = T^{-1} Σ_{t∈I_T} z_{t−1}(ℓ̂*_t − λ_t(θ, ĝ)), where z_{t−1} ∈ F_{t−1} are instruments, and let θ̂_GMM = arg min_{θ∈Θ} M_T(θ, ĝ)′ W M_T(θ, ĝ), where W is a weighting matrix and I_T ⊂ {1, ..., T}. In the sequel we suppress the notation I_T, although we discuss this issue in the supplementary material. The estimator θ̂_GMM is consistent for θ under our conditions below (and we drop the subscript GMM below). We suggest an improved estimator of g(·). We work from the conditional moment restriction E(ℓ_t/λ_t | F_{t−1}) = g(t/T), which is now feasible given our consistent estimates of θ and hence λ_t. We take a simple implementation of local GMM, Gozalo and Linton (2000) and Lewbel (2007), based only on constant instruments, in which case we obtain the closed form

g̃(u) = (1/T) Σ_{t=1}^T K_h̃(t/T − u) ℓ_t/λ̂_t,

where λ̂_t = λ_t(θ̂, ĝ) are estimated in the previous procedure. The kernel K is as before, but the bandwidth sequence h̃ may be different, reflecting the different bias-variance tradeoff.
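The following sketch illustrates the moment-based estimation of θ = (β, γ) under the recursion above, using instruments z_{t−1} = (1, ℓ*_{t−1}, ℓ*_{t−2}) and an identity weighting matrix; the instrument choice and the placeholder data are illustrative assumptions, not the article's exact implementation:

```python
import numpy as np
from scipy.optimize import minimize

def lambda_path(theta, ell_star):
    """lambda_t(theta) = (1 - beta - gamma) + beta*lambda_{t-1} + gamma*ell*_{t-1},
    initialized at ell*_0 = lambda_0 = 1."""
    beta, gamma_ = theta
    lam = np.empty(len(ell_star))
    lam_prev, ell_prev = 1.0, 1.0
    for t in range(len(ell_star)):
        lam[t] = (1.0 - beta - gamma_) + beta * lam_prev + gamma_ * ell_prev
        lam_prev, ell_prev = lam[t], ell_star[t]
    return lam

def gmm_objective(theta, ell_star):
    """||M_T(theta)||^2 with instruments z_{t-1} = (1, ell*_{t-1}, ell*_{t-2})
    and an identity weighting matrix."""
    resid = ell_star - lambda_path(theta, ell_star)
    z = np.column_stack([np.ones(len(ell_star) - 2),
                         ell_star[1:-1],       # ell*_{t-1}
                         ell_star[:-2]])       # ell*_{t-2}
    m = z.T @ resid[2:] / (len(ell_star) - 2)  # sample moment vector
    return float(m @ m)

# ell_star would be the detrended series ell_t / ghat(t/T) from the first stage.
rng = np.random.default_rng(2)
ell_star = rng.exponential(1.0, 2000)          # placeholder detrended series
res = minimize(gmm_objective, x0=np.array([0.6, 0.3]), args=(ell_star,),
               bounds=[(1e-4, 0.999), (1e-4, 0.999)])
print(res.x)
```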

Estimation based on iid Assumption
We assume that the error ζ_t is iid with mean one, variance σ²_ζ, and cdf F as specified above. The semiparametric case where the density f is unknown has been treated in other time series models by Kreiss (1987), Linton (1993), Drost and Klaassen (1997), and Ling and McAleer (2003). For a given density f, define the Fisher scale score and information:

ψ_f(x) = 1 + x f′(x)/f(x),   I_f = E[ψ_f(ζ_t)²].   (7)

Parametric Density Case
Suppose that ζ_t follows a parametric distribution with density f depending on some unknown parameters ϕ, denoted f_ϕ. Among potential candidates, the Weibull and Gamma distributions have proven to be solid choices for duration modeling, for example in Engle and Russell (1998), as they are more flexible than the exponential distribution and allow for over- or underdispersion. In the case of fat-tailed data, a popular choice in the duration literature has been the Burr distribution, which nests the Weibull and allows for fat tails, see for example Bauwens and Giot (2001). Analyzing the key features of the fitted shock in the GMM case can provide some indication of which distribution to choose. We elaborate on this point in the empirical study.
If g(·) were known, the log likelihood function of {ℓ_1, ..., ℓ_T} would be, apart from a term involving g(·) that does not depend on parameters, equal to

L(θ, ϕ) = Σ_{t: ℓ_t>0} { log f_ϕ(ζ_t(θ)) − log λ_t(θ) + log(1 − π) } + Σ_{t: ℓ_t=0} log π,

where ζ_t(θ) = ℓ_t/(λ_t(θ) g(t/T)). From this we can see the separability of π; the parameter π may be estimated by simply counting the frequency of zeros of ℓ_t (we adopt this approach in Appendix H, where we investigate the occurrence of zero returns in the S&P 500 over 1950-2021; the data contain 125 zeros, 0.69% of the sample, the majority occurring before 2000). The remaining quantities are estimated using nonzero observations only. To avoid complicating the notation, we assume in the sequel that π = 0. In practice, given a consistent estimate of g(·), we may maximize an estimated version of the likelihood L(θ, ϕ), where g(·) is replaced by ĝ(·) or g̃(·). In fact, we avoid further nonlinear optimization by using our initial consistent estimates of θ together with auxiliary initial consistent estimates of ϕ, which may be obtained through closed-form moment estimators. For example, in the Gamma case, parameterized to have mean one, the parameter ϕ can be estimated as one over the sample variance of the residuals.
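A sketch of the (quasi-)likelihood evaluation when ζ_t is unit-mean Weibull and π = 0; the optimizer setup, bounds, and placeholder data are illustrative assumptions:

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize

def lambda_path(beta, gamma_, ell_star):
    """lambda_t = (1 - beta - gamma) + beta*lambda_{t-1} + gamma*ell*_{t-1}, lambda_0 = ell*_0 = 1."""
    lam = np.empty(len(ell_star))
    lam_prev, ell_prev = 1.0, 1.0
    for t in range(len(ell_star)):
        lam[t] = (1.0 - beta - gamma_) + beta * lam_prev + gamma_ * ell_prev
        lam_prev, ell_prev = lam[t], ell_star[t]
    return lam

def neg_loglik(params, ell, g_hat):
    """Negative log likelihood of nonzero ell_t under a unit-mean Weibull(phi) error:
    zeta_t(theta) = ell_t / (lambda_t(theta) g(t/T)); the change of variables
    contributes -log lambda_t - log g(t/T)."""
    beta, gamma_, phi = params
    if beta + gamma_ >= 1.0 or min(beta, gamma_, phi) <= 0.0:
        return 1e10                                   # enforce stationarity/positivity in the sketch
    lam = lambda_path(beta, gamma_, ell / g_hat)
    zeta = ell / (lam * g_hat)
    s = np.exp(-gammaln(1.0 + 1.0 / phi))             # Weibull scale giving E[zeta] = 1
    logf = np.log(phi / s) + (phi - 1.0) * np.log(zeta / s) - (zeta / s) ** phi
    return -np.sum(logf - np.log(lam) - np.log(g_hat))

# Usage: given ell and a first-stage trend estimate g_hat, minimize over (beta, gamma, phi).
rng = np.random.default_rng(3)
ell = rng.exponential(1.0, 1500); g_hat = np.ones_like(ell)   # placeholders
res = minimize(neg_loglik, x0=[0.6, 0.3, 1.2], args=(ell, g_hat),
               bounds=[(1e-4, 0.999), (1e-4, 0.999), (0.5, 5.0)])
print(res.x)
```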
We show in Appendix E.2 that the efficient score functions (in the semiparametric model) for η = (θ, ϕ) in the presence of unknown g(·) are ℓ*_{ηt} = (ℓ*′_{θt}, ℓ*′_{ϕt})′. To obtain fully efficient estimates of η, we use one-step updating from initial root-T consistent estimates, Bickel (1982), Bickel et al. (1993), Linton (1993), Drost and Klaassen (1997), and Ling and McAleer (2003). Denote the initial estimates by η̃ = (θ̃′, ϕ̃′)′; then the one-step estimator is

η̂ = η̃ + ( Σ_{t=1}^T ℓ̂*_{ηt} ℓ̂*′_{ηt} )^{-1} Σ_{t=1}^T ℓ̂*_{ηt},

where ℓ̂*_{θt} = ℓ*_{θt}(η̃, ĝ) and ℓ̂*_{ϕt} = ℓ*_{ϕt}(η̃, ĝ). The iid structure also permits one to improve the estimation of g by using the local likelihood method of Tibshirani and Hastie (1987). Suppose that f and θ were known; then the local likelihood estimator of g(u) based on the data ℓ_t is given by the maximizer of

L(g; u) = Σ_{t=1}^T K_h(t/T − u) { log f(ℓ_t/(g λ_t)) − log g − log λ_t }

with respect to the parameter g ∈ R₊. In general, this involves nonlinear optimization with respect to the scalar parameter. Instead, we will pursue a one-step updating approach. Following Fan and Chen (1999), we may update the estimator of g by

ĝ*(u) = ĝ(u) − L_g(ĝ(u); u)/L_gg(ĝ(u); u),

where L_g(g; u) = ∂L(g; u)/∂g and L_gg(g; u) = ∂²L(g; u)/∂g².
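A generic sketch of such a one-step update: one Newton step from root-T consistent initial estimates using estimated scores; the score function below is a toy placeholder standing in for the efficient scores of Appendix E.2:

```python
import numpy as np

def one_step_update(eta_tilde, score_fn, data):
    """One-step update eta_hat = eta_tilde + (sum_t s_t s_t')^{-1} sum_t s_t,
    where s_t is the (estimated) score evaluated at the initial estimate eta_tilde."""
    scores = np.array([score_fn(eta_tilde, obs) for obs in data])  # T x dim(eta)
    info_hat = scores.T @ scores           # outer-product estimate of the information
    return eta_tilde + np.linalg.solve(info_hat, scores.sum(axis=0))

# Placeholder score: in the DArLiQ application this would be the efficient score
# for eta = (beta, gamma, phi) evaluated at the initial estimates and ghat.
def toy_score(eta, obs):
    return np.array([obs - eta[0], obs**2 - eta[1]])   # illustrative moment-type scores

rng = np.random.default_rng(4)
data = rng.exponential(1.0, 1000)
print(one_step_update(np.array([1.0, 2.0]), toy_score, data))
```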

Nonparametric Density Case
In Appendix E.3, we derive the efficient score function for θ in the semiparametric model with unknown f and g, thereby extending Drost and Werker (2004). Suppose we have initial consistent estimators of θ and g(·). Then one can estimate the density function by the kernel estimator f̂(x) = T^{-1} Σ_{t=1}^T K_h(x − ζ̂_t), where the residuals are defined as ζ̂_t = ℓ_t/(ĝ(t/T) λ̂_t), t = 1, ..., T. This estimator does not impose the restriction E(ζ_t) = 1. Instead, we also consider an estimator based on the rescaled residuals ζ̂_t/(T^{-1} Σ_{s=1}^T ζ̂_s). Notationally, we assume the same kernel as in the estimation of the liquidity trend, but this need not be the case. In particular, since ζ_t ≥ 0, one may wish to use special kernel methods adapted to this problem, Chen (2000) and Scaillet (2004).
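A sketch of this residual density step, using a Gaussian kernel and the Silverman rule of thumb for simplicity, although boundary-adapted kernels may be preferable on [0, ∞); the inputs are placeholders:

```python
import numpy as np

def kde(x_grid, resid, h):
    """Gaussian kernel density estimate f_hat(x) = (1/(T h)) sum_t K((x - zeta_t)/h)."""
    z = (x_grid[:, None] - resid[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(resid) * h * np.sqrt(2 * np.pi))

# Hypothetical inputs: ell, g_hat and lam_hat from the earlier estimation steps.
rng = np.random.default_rng(5)
ell = rng.weibull(1.3, 2000); g_hat = np.ones_like(ell); lam_hat = np.ones_like(ell)
zeta_hat = ell / (g_hat * lam_hat)
zeta_hat = zeta_hat / zeta_hat.mean()                 # rescale to impose E[zeta] = 1 in-sample
h = 1.06 * zeta_hat.std() * len(zeta_hat) ** (-1 / 5)  # Silverman rule of thumb
grid = np.linspace(0.01, zeta_hat.max(), 200)
print(kde(grid, zeta_hat, h)[:5])
```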
We propose to construct efficient estimators of θ by two-step estimation based on initial consistent estimates of θ, f, and g. Specifically, we apply a one-step update as in the parametric case, with the efficient score for θ estimated using f̂ and with each summand multiplied by a trimming function 1_t; the trimming is needed theoretically to reduce the effect of small density estimates. In practice, we have found reasonable performance without trimming. One possible trimming scheme was considered in Linton and Xiao (2007). In the literature on "adaptive estimation," a number of other devices are used primarily to facilitate simple proofs. These include discretization of the initial estimator and sample splitting, see for example Kreiss (1987) and Linton (1993). We may also update the estimator of g in this case by the one-step improvement described above, where h† is another bandwidth sequence.

Large Sample Properties
We suppose that ℓ*_t is stationary and alpha mixing. This can be shown to hold under the parameter restrictions provided ζ_t is iid. It may also hold when ζ_t itself is only a stationary mixing process, although this can be difficult to establish; we give further discussion in the supplementary material. We define the long-run variance of a stationary mixing process x_t as lrvar(x_t) = Σ_{j=−∞}^{∞} cov(x_t, x_{t−j}).

Assumption A1. Suppose that g(·) ∈ G, where, for some c > 0, G = {g : [0, 1] → R₊ : g(x) ≥ c and |g″(x)| < ∞ for all x ∈ (0, 1)}. Uniformly for all θ ∈ Θ, the function M(θ, g) is continuous in g (with respect to the L₂ metric) at g = g₀. For all sequences of positive numbers δ_T → 0, sup_{θ∈Θ, ‖g−g₀‖_∞ ≤ δ_T} ‖M_T(θ, g) − M(θ, g₀)‖ = o_P(1).
Assumption A6. For all positive sequences δ_T, ω_T with δ_T → 0 and ω_T → 0, a stochastic equicontinuity condition on M_T holds uniformly over shrinking neighborhoods of (θ₀, g₀).

Assumption A4 is sufficient for consistency of θ̂, given that our estimator ĝ is uniformly consistent and lies in G with probability one. Assumptions A5 and A6 are needed for asymptotic normality of θ̂. These conditions have been verified in a number of different model settings under more primitive conditions, see Chen, Linton, and Van Keilegom (2003). We establish in our treatment of the properties of ĝ that it is uniformly consistent at a rate that can be better than T^{−1/4}, which is also required for this theory. The term Γ₂(θ, g₀) • (ĝ − g₀) determines the correction term in the limiting variance and is established in Appendix D. We also establish in the appendix that √T {M_T(θ₀, g₀) + Γ₂(θ₀, g₀) • (ĝ − g₀)} satisfies a CLT. This, as usual, requires undersmoothing of the nonparametric estimation part, for example using half of the selected bandwidth. The next two assumptions are needed for the estimators based on the iid error assumption: A7 is for the case where the error density is parametric, and A8 for the case where it is treated nonparametrically.
Assumption A7. We suppose that ℓ_{k,l}(x; ϕ) = ∂^{k+l} log f_ϕ(x)/(∂ϕ^k ∂x^l) exists and is continuous in both its arguments in a small neighborhood of ϕ₀ and for all x ∈ R₊, for k, l = 1, ..., 4, and that a uniform moment bound of the form sup_ϕ E|ℓ_{k,l}(ζ_t; ϕ)|^κ < ∞ holds in a neighborhood of ϕ₀ for some κ ≥ 4 and for l = 0, 1. Furthermore, the efficient information matrix I*_ηη(η) is well defined and positive definite at η = η₀ and continuous in η in a neighborhood of η₀.

Assumption A8. We suppose that f(·) is three times continuously differentiable in x ∈ R₊. Furthermore, the efficient information matrix is well defined and positive definite at θ = θ₀ and continuous in θ in a neighborhood of θ₀.

Conditional Moment Restrictions
We first consider the properties of the GMM estimator based on the first conditional moment restriction. This estimator makes the weakest assumptions about the process ζ_t and so it is more robust than the subsequent procedures we analyze.

Nonparametric Trend
We first consider the estimator ĝ(u), u ∈ (0, 1), that is based on smoothing the raw illiquidity. We may rewrite the model (2)-(3) as a nonparametric regression model with trend in mean and variance, ℓ_t = g(t/T) + g(t/T)v_t, where v_t is a mean zero stationary and alpha mixing series. We adapt the results of Francisco-Fernández and Vilar-Fernández (2001) for local polynomial estimators in the case without trending heteroscedasticity to obtain the following central limit theorem.
Theorem 1. Suppose that A1-A3 hold and h = cT^{−1/5} for some c > 0. Then for u ∈ (0, 1),

√(Th) ( ĝ(u) − g(u) − h² (μ₂(K)/2) g″(u) ) →_d N( 0, g(u)² σ²_{v,∞} ∫K(w)² dw ),

where σ²_{v,∞} is the long-run variance of v_t, that is, lrvar(v_t). The estimator is consistent and asymptotically normal with the optimal rate of T^{−2/5} under the smoothness assumption. Bandwidth selection and inference procedures require the estimation of lrvar(v_t), which in general requires further justification. Instead, it may be preferable to work with the refined estimator g̃(u) based on the estimator of θ. We have for this estimator the following CLT.
Theorem 2. Suppose that Assumptions A1-A3 hold, that θ̂ is √T-consistent, that h̃ = cT^{−1/5} for some c > 0, and that the first-stage bandwidth satisfies Th⁵ → 0 and Th/log T → ∞. Then for any u ∈ (0, 1),

√(Th̃) ( g̃(u) − g(u) − h̃² (μ₂(K)/2) g″(u) ) →_d N( 0, g(u)² σ²_ζ ∫K(w)² dw ).

The bias term is the same as in Theorem 1 by virtue of the undersmoothing of the first step, see Linton and Xiao (2007). The limiting variance is different, though, and in particular it is proportional to the variance of ζ_t, which is generally smaller and easier to estimate than the long-run variance of v_t. Indeed, var(v_t) = E(λ_t²)(1 + σ²_ζ) − 1 ≥ σ²_ζ by the Cauchy-Schwarz inequality, since E(λ_t) = 1. For this estimator, consistent standard errors (assuming undersmoothing) can be based on ζ̂_t, for example the sample variance of ζ̂_t after estimation. The omission of the bias effect in the confidence interval has been subject to criticism and debate, and many alternative inference approaches have been suggested, at least in the iid case, see for example Schennach (2015) and Calonico, Cattaneo, and Farrell (2018). One simple approach here is to use the pilot model method used in bandwidth selection, Silverman (1986). Specifically, suppose that g(u) = exp(a_0 + a_1 u) for some unknown parameters a_0, a_1. In that case, g″(u) = a_1² exp(a_0 + a_1 u), and given estimates of a_0, a_1, which can be obtained by OLS of logarithmic liquidity on a constant and a trend (with some adjustment), one can include the estimated bias in the inference procedure.
We may further use this pilot model to select the bandwidth. Since g″(u)/g(u) = a_1², a "rule-of-thumb" optimal bandwidth procedure would be h(u) = ( σ² ∫K(w)² dw / (μ₂(K)² a_1⁴) )^{1/5} T^{−1/5}, where σ² = σ²_{v,∞} if we smooth ℓ_t, for example using a Newey-West estimator, and σ² = σ²_ζ if we work with ℓ_t/λ̂_t. Here, h(u) happens to be constant across u ∈ (0, 1). One may also consider higher-order polynomials for the log trend to capture different shapes. In this case, the term a_1⁴ in h(u) should be replaced by the corresponding quantity ∫_0^1 (g″(u)/g(u))² du. As undersmoothing is required for the first-stage bandwidth h in Theorem 2, a possible practical device is to use a fraction of the rule-of-thumb bandwidth, such as a half, which we employ in our empirical applications. Hart (1991) showed that conventional cross-validation fails in settings that include our estimator ĝ(·), that is, the usual recipe for selecting h based on minimizing leave-one-out (or equivalently penalized) squared residuals will produce h ∼ 0. This arises because the serial correlation in the error term leads to a bias in the risk estimation. He proposed a modification to address this, which essentially involves estimating the long-run variance. We note that our refined estimator g̃(·) is not subject to this criticism, since the error term in that case is a martingale difference sequence. This suggests (although we have not proven it here) that standard cross-validation based on {ℓ_t/λ̂_t} would produce an asymptotically optimal bandwidth choice for g̃(·). In summary, the estimator g̃(u) has an advantage over ĝ(u) because of the simplicity of handling inference and bandwidth choice. On the other hand, the estimator ĝ(u) is a simple linear estimator and is robust to the specification of λ_t.
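A sketch of the pilot-model rule of thumb just described, with Epanechnikov kernel constants; the variance input would be a Newey-West estimate of σ²_{v,∞} or the sample variance of the residuals, and is a placeholder here:

```python
import numpy as np

def rule_of_thumb_bandwidth(ell, sigma2, kernel_l2=0.6, kernel_mu2=0.2):
    """Pilot-model bandwidth: regress log(ell_t) on (1, t/T), use g''(u)/g(u) = a1^2
    under the exponential pilot g(u) = exp(a0 + a1*u), and plug into
    h = (sigma2 * ||K||_2^2 / (mu_2(K)^2 * a1^4))^(1/5) * T^(-1/5).
    The defaults kernel_l2 = 3/5 and kernel_mu2 = 1/5 are the Epanechnikov values."""
    T = len(ell)
    u = np.arange(1, T + 1) / T
    X = np.column_stack([np.ones(T), u])
    a0, a1 = np.linalg.lstsq(X, np.log(ell), rcond=None)[0]
    return (sigma2 * kernel_l2 / (kernel_mu2**2 * a1**4)) ** 0.2 * T ** (-0.2)

rng = np.random.default_rng(6)
ell = np.exp(1.0 - 2.0 * np.linspace(0, 1, 3000)) * rng.exponential(1.0, 3000)
sigma2_hat = 1.0                        # placeholder for sigma^2_zeta or lrvar(v_t)
h0 = rule_of_thumb_bandwidth(ell, sigma2_hat)
print(h0, h0 / 2)                       # the halved bandwidth is used when undersmoothing is required
```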

Parametric Components
Let θ₀ = (β₀, γ₀) be an interior value of the set Θ, where Θ = {θ : ε̃ ≤ β, γ and β + γ ≤ 1 − ε̃} ⊂ R² for some ε̃ > 0. This guarantees, for example, that λ_t(θ) ≥ ε̃ for all θ ∈ Θ. It also ensures that λ_t possesses at least a first moment. In practice, ε̃ is chosen to be machine zero. Define w_t to be the process entering the asymptotic variance below.

Theorem 3. Suppose that Assumptions A1-A6 hold, that √T h² → 0 and Th/log T → ∞, and that w_t is a stationary mixing process satisfying A2. Then, as T → ∞, √T(θ̂ − θ₀) is asymptotically normal. In general, the asymptotic variance of θ̂ will depend on the long-run variance of the process w_t. Inference procedures may be based on the Newey-West method applied to the residuals.

Restrictions from iid shocks
We suppose here that ζ_t is iid with mean one and density f, and that we have initial consistent estimators of g(·) and θ available, say from the GMM procedure described above.
In the case where f is parametrically specified with parameters ϕ, the model is semiparametric with parameters η = (θ, ϕ) and unknown function g. We first consider the local likelihood estimator of the trend function based on the estimated ϕ.
Theorem 4. Suppose that Assumptions A1-A3 hold and that η̂ is √T-consistent. Suppose that h* = cT^{−1/5} for some c > 0 and that the first-stage bandwidth satisfies Th⁵ → 0 and Th/log T → ∞. Then for any u ∈ (0, 1), the local likelihood estimator satisfies, for some bias b(u),

√(Th*) ( ĝ*(u) − g(u) − h*² b(u) ) →_d N( 0, g(u)² ∫K(w)² dw / I_f ).

We note that, regarding the estimation of g(·), the asymptotic distribution is the same whether the error density is known or is estimated parametrically or nonparametrically, Linton and Xiao (2001). This estimator improves on g̃(u) and ĝ(u) when the iid assumption is correct by reducing the asymptotic variance, using standard Cramer-Rao arguments, see Tibshirani (1984). We define pointwise confidence bands in Hafner, Linton, and Wang (2023).
We next turn to the properties of the estimated parametric components defined in (10). We need some further regularity conditions, basically smoothness and moment conditions on the parametric density function.
In the case where f is nonparametrically specified, the model is semiparametric with parameters θ and unknown functions g and f. We turn to the properties of the estimated parametric components defined in (13).

Monte Carlo Study
We report in Tables 1 and 2 the estimation accuracy results, measured by bias and standard deviation, for 2000 replications with sample sizes T ∈ {500, 1000, 2000, 5000, 10000}. Focusing on the bias criterion, we observe that the bias decreases with the sample size. The ML estimates of the β parameter have lower bias than the GMM ones for low levels of persistence (β + γ = 0.95 in Table 1). However, the opposite pattern is observed for higher levels of persistence (β + γ = 0.99 in Table 2). For the γ parameter, the ML method gives more accurate estimates. We further note that the GMM approach tends to overestimate the β parameter while it underestimates the γ parameter, which leads to a more accurate estimation of the overall degree of persistence β + γ compared to the ML approach. For the standard deviation criterion, we observe that it decreases as the sample size increases, and the rate of decrease for the ML estimation method is faster than for the GMM one, roughly corresponding to a factor of √T. Moreover, the standard deviation of the ML estimates is overall lower, with the lowest levels achieved by the ML estimation based on the Burr distribution. This observation confirms that ML estimation with a correctly specified distribution provides the most efficient estimates.

Empirical Study
The ability to accurately model the illiquidity series is useful for quantifying conditions in financial markets and tracking their evolution over time. In our application, we consider the five largest U.S. tech stocks (Fab 5) and Bitcoin and analyze their illiquidity series using our DArLiQ model.² We summarize in Table 3 the descriptive statistics of the illiquidity series.³ The Bitcoin asset is an order of magnitude less liquid than the tech stocks during this period, and the illiquidity series of Bitcoin is more volatile, exhibits higher skewness, and has thicker tails. The five tech companies have comparable levels of liquidity, although Apple stock is slightly more liquid than the others. Moreover, the illiquidity of Amazon stock has higher skewness and thicker tails compared to the other tech companies.
We plot in Figures 3 and 4 (Appendix G.1) the illiquidity and log illiquidity series. To manage boundary issues, we use the local linear method to obtain an initial consistent estimator of the trend g(t/T), where we opt for a Gaussian kernel. In this step, we choose the bandwidth according to the rule of thumb derived in Section 4.1.1; it is reported in the second row of Table 5.⁴ The red curves represent the estimated trend functions and their logarithms. Figure 3 shows that the estimated g(t/T) serves as a good approximation to the time-varying mean of the illiquidity series.⁵ A strong downward trend is observed for most illiquidity series, indicating an overall improvement in liquidity conditions over time. We notice a temporary worsening in liquidity conditions during significant market events such as the burst of the dot-com bubble and the 2007-2009 Global Financial Crisis.

Estimation Results
We introduce the detrended illiquidity series ℓ*_t = ℓ_t/g(t/T), which is assumed to be mean stationary. For the parameters θ of the λ_t process, we first consider a GMM approach, which is based only on the first conditional moment restriction. We then focus on the estimation results using a semiparametric MLE approach based on an iid shock assumption.

Estimation based on Conditional Moment Restrictions
We use the GMM approach based on the conditional moment restrictions to obtain consistent initial estimates of the λ_t process parameters θ. We consider the minimalist case where the model is estimated using only the first conditional moment restriction, that is, E(ℓ*_t | F_{t−1}) = λ_t. We further improve the estimates of the g(t/T) function using the estimated λ̂_t = λ_t(θ̂_GMM) obtained in the previous step. This, in turn, allows us to further improve the estimates of the θ parameters. We report the initial and the updated estimates with associated standard errors in Table 4. It can be observed that the parameter estimates are statistically significant. In addition, the sum of the coefficients β + γ is close to one, indicating high persistence in the short-run dynamics of the illiquidity series. (If β + γ were to reach or exceed one, ℓ*_t would not be weakly stationary; the moment-based estimator of g(·) is not appropriate in that setting, but this can be addressed using the median smoothing method proposed by Koo and Linton (2015). We show in Appendix J that the normalized trend functions obtained with the mean and median smoothing methods are comparable and that the choice of the smoothing method has a minor impact on the parameter estimates for the λ_t process.)
We improve the estimates of the trend function based on the estimated λ_t process, that is, we estimate the g(·) function based on ℓ_t/λ̂_t. In this step, we choose the bandwidth using a leave-one-out cross-validation approach, see for example Chu and Marron (1991) for more details; a sketch of the procedure is given below. We can observe from Table 5 that the bandwidth used to update the trend function is much smaller than the bandwidth used in the initial step, as there is less variation in the ℓ_t/λ̂_t series. (The values reported in Table 5 are the optimal bandwidths selected according to the rule of thumb (h_0) and the cross-validation approach (h); when undersmoothing is required, we use half of the selected bandwidth, that is, h_0/2 and h/2.) We further plot the log transforms of the initial and updated estimates of the trend function, that is, log ĝ(t/T) and log g̃(t/T), together with the log illiquidity series in Figure 5, Appendix G.2. We observe that the updated trend function estimates differ from the initial estimates, but only to a minor extent. In addition, the updated estimates of the λ_t process parameters are slightly different from the initial estimates, with the overall degree of persistence (β + γ) being almost the same. This observation indicates that a two-step approach, consisting of first using a local linear estimator for the trend function and then estimating the λ_t process and its associated parameters θ, can be a viable option in empirical applications.
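A sketch of the leave-one-out cross-validation step for the updating bandwidth; a Nadaraya-Watson form is used for the leave-one-out predictions, and the kernel, grid, and data are illustrative:

```python
import numpy as np

def loo_cv_bandwidth(y, h_grid):
    """Leave-one-out CV for the kernel smoother of y_t on t/T with an Epanechnikov
    kernel; returns the bandwidth minimizing the squared prediction error."""
    T = len(y)
    u = np.arange(1, T + 1) / T
    best_h, best_cv = None, np.inf
    for h in h_grid:
        d = (u[:, None] - u[None, :]) / h
        w = np.where(np.abs(d) <= 1.0, 0.75 * (1.0 - d**2), 0.0)
        np.fill_diagonal(w, 0.0)                          # leave one out
        y_hat = w @ y / np.maximum(w.sum(axis=1), 1e-12)
        cv = np.mean((y - y_hat) ** 2)
        if cv < best_cv:
            best_h, best_cv = h, cv
    return best_h

# y_t = ell_t / lambda_hat_t from the previous step; placeholder series here.
rng = np.random.default_rng(7)
y = np.exp(1.0 - 2.0 * np.linspace(0, 1, 1000)) * rng.exponential(1.0, 1000)
print(loo_cv_bandwidth(y, h_grid=np.linspace(0.02, 0.3, 15)))
```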
In addition, we compute the sample standard deviation σ^U_ζ of the fitted shock series ζ̂_t obtained using the updated λ_t parameters. We report the values in the last part of Table 4 and observe that the sample standard deviations for all assets are below one, indicating under-dispersion. Lastly, we estimate the tail index of the shock series, which is between three and four for Apple and Bitcoin, while it is between five and eight for the other assets. Our results suggest that the shocks of Apple and Bitcoin have thicker tails (see Appendix I.1 for more details on the tail index analysis).
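A sketch of the log-rank tail-index regression described in the notes to Table 4; the residual series is a heavy-tailed placeholder:

```python
import numpy as np

def log_rank_tail_index(zeta_hat, top_frac=0.05):
    """Estimate the tail index from the regression log(Rank - 1/2) = a - b*log(Size)
    using the top_frac largest fitted shocks; returns b_hat."""
    x = np.sort(zeta_hat)[::-1]                 # descending order statistics ("Size")
    k = max(int(np.floor(top_frac * len(x))), 2)
    size = x[:k]
    rank = np.arange(1, k + 1)                  # rank 1 = largest observation
    X = np.column_stack([np.ones(k), np.log(size)])
    coef = np.linalg.lstsq(X, np.log(rank - 0.5), rcond=None)[0]
    return -coef[1]                             # slope enters with a minus sign

rng = np.random.default_rng(8)
zeta_hat = rng.pareto(3.5, 5000) + 1.0          # heavy-tailed placeholder residuals
print(log_rank_tail_index(zeta_hat))            # should be near 3.5 in this example
```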

Estimation: iid Error Term with Parametric Density
We estimate the model using the semiparametric MLE approach where we assume an iid error term. The conditional distribution of the error term ζ_t can be chosen within the class of distributions satisfying the desired requirements, that is, a density with nonnegative support, unit mean, and finite variance σ²_ζ. We present estimation results assuming that ζ_t follows a Weibull(Γ(1 + 1/ϕ)^{−1}, ϕ) distribution with shape parameter ϕ and scale chosen so that the mean equals one. Based on the local linear estimator of g(t/T), we first obtain a consistent estimator of the λ_t process parameters via the Quasi-Maximum Likelihood (QML) estimation approach. We then obtain the fully efficient estimates with a one-step update using the efficient scores based on the initial consistent estimators, as introduced in Section 3.2.1.
We report the estimated parameters with their standard errors in Table 6. The estimates of the dynamic parameters of λ_t, β and γ, are quite similar across assets, and all illiquidity series exhibit a high degree of persistence as the sum of the estimated coefficients β + γ is close to one. These findings are quite analogous to univariate GARCH models for volatility. The estimated shape parameters of the Weibull error terms have a higher dispersion across assets and range from 1.14 to 1.38, indicating that the volatility of ζ_t ranges from 0.73 to 0.88. This is consistent with the observation that the five tech stocks have comparable volatilities while Bitcoin has much higher volatility.
We provide diagnostics on the validity of our assumptions on the error term ζ_t. Concerning the iid assumption, we plot in Figures 6 and 7 (Appendix G.3), respectively, the ACF of ζ̂_t and ζ̂²_t. We observe that overall there is no evidence of autocorrelation in the residuals or squared residuals. Moreover, we use the Probability Integral Transform (PIT) to check how well the assumed Weibull distribution fits the data. The histograms of the PITs (shown in Figure 8, Appendix G.3) are quite close to a uniform distribution.
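A sketch of the PIT diagnostic: transform the fitted shocks through the estimated unit-mean Weibull cdf and check uniformity; the shape estimate and residuals below are placeholders:

```python
import numpy as np
from scipy.special import gamma as gamma_fn
from scipy.stats import kstest

def weibull_unit_mean_cdf(z, phi):
    """CDF of a Weibull with shape phi and scale 1/Gamma(1 + 1/phi), so the mean is one."""
    s = 1.0 / gamma_fn(1.0 + 1.0 / phi)
    return 1.0 - np.exp(-(z / s) ** phi)

rng = np.random.default_rng(9)
phi_hat = 1.3                                                   # placeholder shape estimate
zeta_hat = rng.weibull(phi_hat, 3000) / gamma_fn(1 + 1 / phi_hat)  # placeholder residuals
pit = weibull_unit_mean_cdf(zeta_hat, phi_hat)
# Under a correct specification the PITs are approximately U(0,1); a histogram of `pit`
# or a Kolmogorov-Smirnov test provides a quick check.
print(kstest(pit, "uniform"))
```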
The tail index analysis of the fitted shock series ζ̂_t in Section 6.1.1 suggests that the shocks have a thicker tail than the Weibull while exhibiting under-dispersion. We thus also consider fat-tailed distributions in our analysis, such as the Burr, which nests the Weibull and the Lomax, and the Inverse Burr distribution. The estimation results in Appendix I.2 show that the Lomax distribution lacks the ability to capture the under-dispersion feature under the unit mean restriction. The Burr distribution reduces to the Weibull except for Apple and Bitcoin, whose shock terms have thicker tails than the other stocks. The Inverse Burr outperforms the Weibull and Burr in terms of log likelihood, except for Microsoft. No distribution among the ones considered consistently provides a better fit, and we thus refrain from searching for more general distributions. Instead, we focus in the next section on whether a more flexible nonparametric density can provide a better fit to the data. We further improve the estimation of g(t/T) by maximizing the local likelihood based on the estimated λ_t and the error density. The log transforms of the initial and updated estimates of the trend function are plotted in Figure 9 (Appendix G.3) together with the log illiquidity series. As in the GMM case (Appendix G.2), the updated trend function estimates differ from the initial estimates, but only to a minor extent.

Estimation: iid Error Term with Nonparametric Density
We investigate whether replacing the parametric error density f with a nonparametric kernel estimator can further improve the model fit to the data. We plot, in Figure 10 (Appendix G.4), the estimated nonparametric density (using residuals ζ̂_t from the GMM case) against the Weibull density using the shape parameter estimate from Section 6.1.2. We observe that the estimated Weibull density curves do not fall between the pointwise two standard deviation bands of the estimated nonparametric densities, suggesting that the difference between the estimated parametric and nonparametric densities is statistically significant.
The estimated nonparametric density allows us to further improve the ML estimation for the λ_t process. We obtain the fully efficient estimates in the nonparametric case using the one-step update approach based on the efficient scores introduced in Section 3.2.2. The estimates and standard errors are reported in Table 7. Comparing the estimated λ_t parameters in the parametric (Table 6) and nonparametric (Table 7) cases, we observe that the differences are quite small. This indicates that the gains in efficiency with respect to GMM, using a one-step update based on the efficient scores, are quite robust with respect to the distribution of ζ_t. However, the goodness-of-fit of the parametric and nonparametric models, measured in terms of log-likelihood, is quite different, see the results in Table 8. We conclude that ML estimation assuming a Weibull density for ζ_t provides good performance, but using a nonparametric estimator can further improve the fit in terms of likelihood. In particular, if distributional aspects of the fitted model, such as tail properties, are important, for example for risk management purposes, then the nonparametric model is clearly preferred.

Risk Premium
Amihud (2002) considers an autoregressive model for annual and monthly market illiquidity and then relates it to the market risk premium. Specifically, he estimates the regression R_mt − R_ft = a + b·liq^e_t + c·liq^u_t + ε_t, where R_mt and R_ft are the (annual or monthly) market return and risk-free rate, respectively, and liq^e_t and liq^u_t are the expected and unexpected components of illiquidity determined from the first-order autoregression liq_t = c_0 + c_1·liq_{t−1} + η_t, where liq_t is the (annual or monthly) average illiquidity that we called A_t, or rather its logarithm. His estimation results strongly support his prior hypothesis that the stock excess return is an increasing function of expected illiquidity, while unexpected illiquidity has a negative effect on the contemporaneous unexpected stock return (i.e., b > 0 and c < 0).

We reproduce his analysis within our daily model framework, which has three components of illiquidity: the slowly varying trend, the short-run anticipated dynamic component, and the unanticipated shock. We first consider the specification for daily stock returns

R_mt − R_ft = a + b·g(t/T) + c·λ_t + d·ζ_t + ε_t,   (14)

where g(·), λ_t, and ζ_t are defined above. This allows the risk premium to depend on all three components of illiquidity in our model, see Escanciano, Pardo-Fernández, and Van Keilegom (2017) for related specifications. It is difficult to tie together our model for ℓ_t with these regressions (but the same comment would apply to the original work of Amihud 2002). One criticism might be that we have used returns to construct liquidity, and therefore this variable appears on both sides of the regression. However, we can think of returns as being composed of a direction and a magnitude component from the decomposition R = |R| sign(R). Our liquidity model is about the magnitude |R| (directional information is not used at all by our liquidity model), whereas the dependent variable of regression (14) reflects the direction. We note that there exists a strong downward trend in illiquidity while returns are roughly stationary. This suggests that the relationship between the long-run trend of liquidity and the stock excess return would be less significant.⁹ Therefore, in the application we focus on the alternative regression (15) for the detrended equity premium, in which the detrended excess return is regressed on a constant (coefficient α), the short-run component λ_t (coefficient γ), and the shock ζ_t (coefficient δ). To be consistent with our assumptions about λ_t and ζ_t, we should have α = −(γ + δ), but we do not impose this in the estimation, although the data suggest that α̂ ≈ −(γ̂ + δ̂). In any case, there is a generated regressor issue when we replace λ_t and ζ_t by their estimated counterparts. Therefore, we compute the standard errors for the coefficient estimates using the nonparametric bootstrap procedure outlined in Appendix F. We use the daily historical data of the S&P 500 index to carry out the analysis. The illiquidity and log illiquidity together with the return series are plotted in Figure 11 (Appendix G.5). We report in Table 9 the estimation results based on (15).¹⁰ We observe that the estimated γ coefficient for the short-run expected illiquidity component λ_t is positive and significant, indicating that the expected market excess return is an increasing function of the short-run expected illiquidity. This is consistent with the intuition that higher expected market illiquidity would make investors demand higher excess returns to compensate for this risk exposure. Moreover, the estimated δ for the shock term ζ_t is negative and significant, suggesting that unexpected market illiquidity has a negative effect on stock excess returns. This could be because stock prices would likely fall when illiquidity unexpectedly rises, thus decreasing expected returns.
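A sketch of the regression step with bootstrap standard errors; this uses a simple iid pairs bootstrap as a stand-in for the procedure of Appendix F, and all series are placeholders:

```python
import numpy as np

def ols(y, X):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def risk_premium_regression(excess_ret, lam_hat, zeta_hat, n_boot=500, seed=0):
    """Regress the (detrended) excess return on (1, lambda_hat_t, zeta_hat_t) and obtain
    bootstrap standard errors by resampling observation pairs with replacement."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(lam_hat), lam_hat, zeta_hat])
    coef = ols(excess_ret, X)                      # (alpha, gamma, delta)
    T = len(excess_ret)
    draws = np.empty((n_boot, 3))
    for b in range(n_boot):
        idx = rng.integers(0, T, T)                # iid resampling of (y_t, x_t) pairs
        draws[b] = ols(excess_ret[idx], X[idx])
    return coef, draws.std(axis=0)

rng = np.random.default_rng(10)
T = 3000
lam_hat = 1 + 0.2 * rng.standard_normal(T); zeta_hat = rng.exponential(1.0, T)
excess_ret = -0.5 + 0.3 * lam_hat - 0.2 * zeta_hat + 0.5 * rng.standard_normal(T)
coef, se = risk_premium_regression(excess_ret, lam_hat, zeta_hat)
print(coef, se)
```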

Table 4 .
Estimated parameters of the λ_t process based on the first moment restriction. The estimated parameters are θ = (β, γ) for the λ_t process. The U superscript indicates estimates obtained using the updated trend function. The numbers in parentheses are the standard errors of the corresponding parameter estimates. σ^U_ζ is the sample standard deviation of ζ̂_t, and tail^U_ζ is the tail index estimated using the regression log(Rank − 1/2) = a − b log(Size) based on the 5% largest fitted shocks ζ̂_t.

Table 5 .
Bandwidth choice for the initial estimate and the updated trend function. T is the total number of observations in the sample period. h_0 is the bandwidth used for the initial estimate of the trend function, obtained following the rule of thumb. h is the bandwidth used when updating the trend function, selected using the leave-one-out cross-validation approach.

Table 6 .
Fully efficient estimates of the λ_t process parameters in the parametric (Weibull) case. The estimated parameters are θ = (β, γ, ϕ) for the λ_t process. ϕ is the shape parameter of the Weibull distribution with mean 1 and standard deviation σ_ζ.

Table 7 .
Fully efficient estimates of the λ_t process parameters in the nonparametric case. The estimated parameters are θ = (β, γ) for the λ_t process. The numbers in parentheses are standard errors.

Table 8 .
Log likelihood comparison between the parametric (Weibull) and nonparametric cases. The difference is the log likelihood in the nonparametric case minus the log likelihood in the parametric case.

Table 9 .
Coefficient estimates of the risk premium regression. We estimate the regression based on (15). Returns are expressed in percentage points. Standard errors are based on the nonparametric bootstrap procedure with 500 replications (see Appendix F for details). Significance levels: *** p < 0.01; ** p < 0.05; * p < 0.1.