Volatility Estimation When the Zero-Process is Nonstationary

Abstract
Financial returns are frequently nonstationary due to the nonstationary distribution of zeros. In daily stock returns, for example, the nonstationarity can be due to an upwards trend in liquidity over time, which may lead to a downwards trend in the zero-probability. In intraday returns, the zero-probability may be periodic: It is lower in periods where the opening hours of the main financial centers overlap, and higher otherwise. A nonstationary zero-process invalidates standard estimators of volatility models, since they rely on the assumption that returns are strictly stationary. We propose a GARCH model that accommodates a nonstationary zero-process, derive a zero-adjusted QMLE for the parameters of the model, and prove its consistency and asymptotic normality under mild assumptions. The volatility specification in our model can contain higher order ARCH and GARCH terms, and past zero-indicators as covariates. Simulations verify the asymptotic properties in finite samples, and show that the standard estimator is biased. An empirical study of daily and intradaily returns illustrates our results. The results show how a nonstationary zero-process induces time-varying parameters in the conditional variance representation, and that the distribution of zero returns can have a strong impact on volatility predictions.


Introduction
Financial returns are frequently zero. This can be due to liquidity issues (e.g., low trading volume), price discreteness or rounding error, data issues (e.g., imputation due to missing values), market closures, and other market-specific characteristics and developments.
A number of approaches accommodate the occurrence of zeros. In continuous time approaches, for example, zeros occur when the assumed underlying price process is not observed. Lesmond, Ogden, and Trzcinka (1999) used this idea to construct a popular measure of liquidity based on observed zeros. More recently, the role of zeros has been repositioned in a continuous time framework by, amongst others, Bandi, Pirino, and Reno (2017), and Bandi et al. (2020). Building on these developments, a bias-correction for realized volatility (RV) has been derived, and Buccheri, Pirino, and Trapi (2020) proposed a way to improve portfolio management in the presence of zeros. Appendix D (supplementary material) outlines the connection between continuous time approaches to zeros and the model proposed here. In a second body of literature, zeros naturally occur due to the discreteness of price changes. Hausman, Lo, and MacKinlay (1992) proposed an ordered probit model for discrete price changes. Russell and Engle (2005) proposed an Autoregressive Conditional Multinomial (ACM) model in combination with their Autoregressive Conditional Duration (ACD) model from Engle and Russell (1998). Liesenfeld, Nolte, and Pohlmeier (2006) criticized this approach, and proposed instead a dynamic integer count model. This was extended to the multivariate case in Bien, Nolte, and Pohlmeier (2011). Rydberg and Shephard (2003) proposed a model where the price increment is decomposed multiplicatively into three components: activity, direction and integer magnitude. Catania, Di Maria, and Santucci de Magistris (2020) proposed a discrete mixture approach to discrete price changes. In a third body of literature, price changes are continuous except at zero. Hautsch, Malec, and Schienle (2013) proposed a zero-inflated model for volume. Kümm and Küsters (2015) proposed a zero-inflated model, where zeros occur either because there is no information available or because of rounding.
In Harvey and Ito (2020), zeros occur due to censoring of an underlying continuous variable. Finally, the generalized autoregressive conditional heteroscedasticity (GARCH) class of models provides a fourth body of literature, since it accommodates zero-returns as long as the innovation can be zero, see the discussion in Sucarrat and Grønneberg (2020). In particular, if the standardized innovation is stationary, the parameters of a GARCH specification can be consistently estimated by the standard quasi-maximum likelihood estimator (QMLE) even when the conditional zero-probability is time-varying, see, for example, Escanciano (2009).
While the aforementioned contributions accommodate zeros in one way or another, very few of them pay attention to the fact that the zero-process can be nonstationary. This is striking, since the zero-process is frequently nonstationary. In daily stock returns, for example, a downwards (upwards) trend in the zero-probability can be due to an upwards (downwards) trend in liquidity over time, or an upwards (downwards) trend in the price level of the stock. Sucarrat and Grønneberg (2020) found widespread evidence of a trend in the zero-probability of daily stock returns at the New York Stock Exchange (NYSE). (We revisit a selection of their stocks in Section 5.1.) In intraday returns, the zero-probability is often nonstationary periodic: It is higher in periods with low liquidity (e.g., when the opening hours of the main financial centers do not overlap), and lower in periods with high liquidity (e.g., in hours where the main financial centers are open at the same time). An example is Kolokolov, Livieri, and Pirino (2020), who find clear evidence of a periodic zero-probability in intraday stock returns.
In this article, we propose volatility models that accommodate nonstationary zeros, where the zero-probability can be trend-like or periodic in nature, or both. To this end, volatility is specified as a generic scale (i.e., the conditional variance is a special case). We derive a modified QMLE, which we label the zero-adjusted QMLE, and prove its consistency and asymptotic normality (CAN). We start with the standard GARCH(1,1) model, for which the regularity conditions are more explicit, and then extend the results to more general models which allow for higher order lags, asymmetries and also indicators of lagged zero returns. In the stationary case, our regularity conditions coincide with the sharpest assumptions given in the literature for CAN of the QMLE. Our asymptotic results mainly rely on the ergodic theorem for nonstationary processes introduced in Francq and Gautier (2004). Variations of it have also been used in Azrak and Mélard (2006), Phillips and Xu (2006), and Regnard and Zakoian (2010).
Section 2 is devoted to the simple GARCH(1,1) model. In Section 3, we extend our results to more general specifications. In particular, we consider a model where lags of zero-indicators are added as covariates. This specification is of special interest, since empirical evidence suggests jumps may follow zeros, see Kolokolov and Reno (2019). Supplemental Appendix A collects the proofs of our theorems, propositions and lemmas. Section 4 contains finite sample simulations of our estimator. They show that the Standard QMLE is biased in our experiments, and verify our asymptotic results. In particular, the empirical standard errors correspond well to the asymptotic ones in finite samples. Section 5 contains an empirical application of our results. It shows how a nonstationary zero-process induces time-varying parameters in the conditional variance representation. Accordingly, the distribution of zero returns can have a strong impact on volatility predictions.
Finally, Section 6 concludes and suggests lines for further research.

Structure and Estimation of the GARCH(1,1) Specification
Let $(I_t)$ be a bitstream sequence, that is, a sequence valued in $\{0, 1\}$. This bitstream sequence is said to be well fed in zeros (resp. ones) if, for all $t$, there exists $u \le t$ such that $I_u = 0$ (resp. $I_u = 1$). The value $I_t = 0$ indicates a zero return and $I_t = 1$ indicates a nonzero return at time $t$. Conditionally on $(I_t)$, we will consider time series $(\epsilon_t)$ such that $\epsilon_t = 0$ if $I_t = 0$ and $\epsilon_t$ follows a nondegenerate GARCH-type model when $I_t = 1$. First consider a simple zero-inflated GARCH(1,1) model of the form

$$\epsilon_t = \sigma_t \eta_t I_t, \qquad \sigma_t^2 = \omega_0 + \alpha_0 \epsilon_{t-1}^2 + \beta_0 \sigma_{t-1}^2, \tag{2.1}$$

with a sequence $(\eta_t)$ of nondegenerate real random variables, and nonnegative parameters $\omega_0$, $\alpha_0$ and $\beta_0$. Note that, for the moment, we do not make any precise assumption on the model. In particular, the sequence $(I_t)$ can be the realization of a nonstationary sequence. Therefore, the model (2.1) can be considered as being semiparametric. Moreover, if a solution of Equation (2.1) exists, in general it is nonstationary. The following proposition gives a condition for the existence of such a solution.
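As an illustration of how a path of such a process can be generated conditionally on a given bitstream, the following minimal Python sketch may help. It assumes that model (2.1) reads $\epsilon_t = \sigma_t \eta_t I_t$ with $\sigma_t^2 = \omega_0 + \alpha_0 \epsilon_{t-1}^2 + \beta_0 \sigma_{t-1}^2$, which is the form implied by the recursion used for estimation later in this section. The function name, the standard normal innovations, the starting value and the parameter values are hypothetical choices made for the example only.

```python
import math
import random

def simulate_zero_inflated_garch(indicators, omega, alpha, beta,
                                 sigma2_init=1.0, seed=0):
    """Simulate eps_t = sigma_t * eta_t * I_t conditionally on the given
    0/1 sequence `indicators`, with
    sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2."""
    rng = random.Random(seed)
    eps, sigma2 = [], []
    for t, i_t in enumerate(indicators):
        if t == 0:
            s2 = sigma2_init  # hypothetical starting value
        else:
            s2 = omega + alpha * eps[-1] ** 2 + beta * sigma2[-1]
        eta = rng.gauss(0.0, 1.0)  # assumption: standard normal innovations
        eps.append(math.sqrt(s2) * eta * i_t)  # zero whenever I_t = 0
        sigma2.append(s2)
    return eps, sigma2

# Illustrative bitstream and parameter values (not taken from the article).
indicators = [1, 1, 0, 1, 0, 0, 1, 1]
eps, sigma2 = simulate_zero_inflated_garch(indicators, omega=0.1,
                                           alpha=0.1, beta=0.8)
```

Because the bitstream is passed in as a fixed argument, the sketch mirrors the conditional (semiparametric) reading of the model: nothing is assumed about how the zeros were generated.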
Proposition 2.1. Given sequences (η t ) and (I t ), and parameters ω 0 > 0, α 0 ≥ 0 and β 0 ≥ 0, there exists a (unique) (nonanticipative ) finite solution to Equation (2.1) if (2.2) This condition is satisfied if for all t there exists s > 0 such that lim sup k→∞ In the previous proposition, a nonanticipative solution means that σ t is measurable with respect to the sigma-field F t−1 generated by {η u , I u ; u < t}, and a finite solution means that σ t < ∞ a.s. for all t, and (σ t ) is bounded in probability, in the sense that ∀ε > 0, ∃M > 0 and n > 0 such that P(σ t > M) < ε ∀|t| > n.
Recall that the necessary and sufficient strict stationarity condition of the standard GARCH(1,1) model is

A1: $\gamma := E \log(\alpha_0 \eta_t^2 + \beta_0) < 0$.

When $(\eta_t)$ is supposed to be a stationary and ergodic sequence, A1 implies (2.2) (simply because $\log(\alpha_0 \eta_t^2 I_t + \beta_0) \le \log(\alpha_0 \eta_t^2 + \beta_0)$). The condition is not necessary, however, because when $\beta_0 = 0$ and $(I_t)$ is well fed in zeros, it is easy to see that (2.2) is satisfied without any restriction on $\alpha_0$, in particular even when $\gamma = E \log(\alpha_0 \eta_t^2) > 0$. Note also that we cannot conclude when $\gamma = 0$ because we do not make assumptions on the distribution of the zeros in $(I_t)$.
For stationary GARCH models with iid innovations, it is known that the strict stationarity condition γ < 0 entails the existence of a marginal moment (see Berkes, Horváth, and Kokoszka 2003b, lem. 2.3). The following proposition is a direct extension of that result.
Under this assumption, $I_t = 0$ if and only if $\epsilon_t = 0$, and the sequence $(I_t)$ is then observable whenever $(\epsilon_t)$ is observed. Given observations $\epsilon_1, \dots, \epsilon_n$, it is then possible to estimate the parameter $\theta_0 = (\omega_0, \alpha_0, \beta_0) \in \Theta \subset (0, \infty)^2 \times [0, \infty)$ by

$$\hat{\theta}_n = \arg\min_{\theta \in \Theta} l_n(\theta), \qquad l_n(\theta) = \frac{1}{n} \sum_{t=r_0+1}^{n} \ell_t(\theta),$$

where $r_0 \ge 1$ is a fixed integer and $\sigma_t^2(\theta) = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2(\theta)$, with a fixed initial value $\sigma_1^2(\theta)$. To show the consistency of this modified version of the QMLE, we need additional assumptions. We would like to deal with situations where the occurrence of the zeros may be random and/or periodic of period $T \in \mathbb{N}^*$ ($T = 1$ meaning no periodicity). To this aim, we assume that $I_t$ is determined by a realization of a $T$-dimensional stationary process, at least for large $t$. Each date $t = (N - 1)T + \nu$ corresponds to a cycle $N = N_t \in \mathbb{Z}$ and a season $\nu = \nu_t \in \{1, \dots, T\}$. More precisely, we have $N = \lceil t/T \rceil$, where $\lceil \cdot \rceil$ denotes the ceiling function.
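To make the criterion concrete, the following Python sketch evaluates a zero-adjusted Gaussian quasi-likelihood for the GARCH(1,1) recursion above. It assumes that the per-period term is $\ell_t(\theta) = I_t\{\epsilon_t^2/\sigma_t^2(\theta) + \log \sigma_t^2(\theta)\}$, so that zero returns still enter the variance recursion but contribute no term to the sum. This form of $\ell_t$, the function name and the default initial value are assumptions made for illustration, not a transcription of the estimator's definition.

```python
import math

def qml_objective(theta, eps, r0=1, sigma2_init=1.0, zero_adjusted=True):
    """Average Gaussian quasi-likelihood criterion for a GARCH(1,1).

    theta = (omega, alpha, beta); eps holds eps_1, ..., eps_n (0-based list).
    With zero_adjusted=True, periods with eps_t == 0 are skipped in the sum
    (assumed form of the zero-adjusted criterion). With zero_adjusted=False,
    every period contributes, as in the usual Gaussian QML criterion.
    """
    omega, alpha, beta = theta
    n = len(eps)
    s2 = sigma2_init  # fixed initial value sigma_1^2(theta)
    total = 0.0
    for j in range(1, n):  # list index j corresponds to time t = j + 1
        s2 = omega + alpha * eps[j - 1] ** 2 + beta * s2
        if j < r0:
            continue  # the sum starts at t = r0 + 1
        if zero_adjusted and eps[j] == 0.0:
            continue  # I_t = 0: no contribution
        total += eps[j] ** 2 / s2 + math.log(s2)
    return total / n

# Tiny illustrative series with one zero return.
example_eps = [1.0, 0.0, 2.0]
example_theta = (0.1, 0.2, 0.5)
adjusted = qml_objective(example_theta, example_eps)
standard = qml_objective(example_theta, example_eps, zero_adjusted=False)
```

In practice the function would be handed to a numerical optimizer over $\Theta$. The only difference between the two variants is the $\log \sigma_t^2(\theta)$ term at zero returns, which is where the standard criterion picks up information that the zeros do not carry.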
A3. Let $(\eta_t)_{t \in \mathbb{Z}}$ and $(\mathbf{S}_N)_{N \in \mathbb{Z}}$ be two independent stationary and ergodic processes defined on some probability space $(\Omega, \mathcal{A}, P)$, valued in $\mathbb{R}$ and $\mathcal{S} := \{0, 1\}^T$, respectively. Let $\mathbf{S}_N = (S_{(N-1)T+1}, \dots, S_{(N-1)T+T})'$. Assume that there exists an almost surely finite random time $t_0$ such that, with probability one, $I_t = S_t$ for all $t \ge t_0$.
For daily returns, there is usually no seasonality in the zero-process, and Figure 1 shows that the frequency of zeros stabilizes after a certain point for the stocks studied in Section 5.1. For these series, it therefore seems reasonable to assume A3 with $T = 1$ and $t_0(\omega)$ corresponding to a certain date.
It is important to emphasize that in model (2.1), the sequence $(I_t)$ is given. Therefore, even when $(I_t)$ is the realization of a stationary process, that is, when $T = 1$ and $I_t = S_t(\omega)$ for all $t$ in A3, the sequence $(\epsilon_t)$ is not stationary conditionally on $(I_t)$. Indeed, it is clear that $\epsilon_t$ and $\epsilon_{t+1}$ cannot have the same distribution when $I_t \ne I_{t+1}$. We will work with random variables of the form $f(I_t, I_{t-1}, \dots; \eta_t, \eta_{t-1}, \dots)$ which, conditionally on $(I_t)$, are not stationary. The following lemma shows that a kind of law of large numbers can nevertheless be applied to such nonstationary sequences under A3. Similar results appear in Azrak and Mélard (2006), Francq and Gautier (2004), Phillips and Xu (2006), and Regnard and Zakoian (2010).
To facilitate interpretation, suppose that Eη t = 0, as is generally the case for GARCH processes. Under A6 and (2.2), we have σ 2 t = Var( t |F t−1 , I t = 1). Thus, σ t corresponds to the volatility of t when this return is nonzero. When I t = 0, the variable σ t does not have such an interpretation. Given the observations 1 , . . . , n , one can thus interpret σ n+1 as the volatility of the future return n+1 under the scenario that the latter is nonzero. Since I t is taken as exogenous variable, our model is not sufficient to predict Obviously, to be able to estimate the parameter of the volatility process (σ t ), it is also necessary to assume that (I t ) is well fed in 1s. We even have to assume that if I t = 1, then I t−1 is not always equal to zero, otherwise, in the ARCH(1) case, t denotes the sigma-field generated by {S u , η u ; u ≤ t}. We however assume that the conditional distribution of S t given F S,η t−1 is not degenerated in the following sense.
For the asymptotic normality of the QMLE, it is necessary to assume the following.

Extension to General Volatility Models and Model Checking Tests
We now extend Model (2.1) by considering the general zeroinflated volatility model Note that this general formulation includes all GARCH(p, q) models, as well as numerous volatility models with asymmetries, such as the Asymmetric Power ARCH model (APARCH) of Ding, Granger, and Engle (1993). Given observations 1 , . . . , n and arbitrary initial values t for t ≤ 0, for θ ∈ let Define θ n by Equation (2.3) and assume the following.
B1: There exists a finite solution to Model (3.1), which is of the form t = e(θ 0 ; I t , B2: For any real sequence (x i ), the function θ → σ (x 1 , x 2 , . . . ; θ) is continuous on and belongs to (ω, ∞] for all θ ∈ and for some ω > 0. B3: There exist a random variable K measurable with respect to { u , u ≤ 0} and a constant ρ ∈ (0, 1) such that where K and ρ are as in B3 and V(θ 0 ) is some neighborhood of θ 0 .
B6: There exists a neighborhood V(θ 0 ) of θ 0 such that, for t = 1, . . . , T, the following variables have finite expectation: For Model (2.1), we have seen that B1 is satisfied under A1, and the first part of B2 is satisfied under A4 and A6. The identifiability condition in B2 and B4 are entailed by A7 and A8. Assumption A5 entails B3 and B5, as well as the existence of σ t (θ ) and its derivatives for all θ ∈ . Relations (A.8) and (A.10) of the proof of Theorem 2.2 show that B6 also holds true under the assumptions of Theorem 2.2.
−ω 0 and the same constraints and notations as for (2.1). If τ i0 > 0, then zero returns tend to increase the volatility, as could be expected when zero returns reflect liquidity issues, but we do not impose this sign constraint a priori. It is clear that, for identifiability of the τ i0 coefficients, it is necessary to assume that (I t ) is well fed in zeros and ones. There exist less trivial reasons for nonidentifiability of the parameters. For example, if P ( t = 1| t−1 = 0) = 1 and P ( t = 0| t−1 = 1) = 1 then all the pairs (τ 01 , τ 02 ) such that τ 01 + τ 02 is fixed are equivalent. We thus reinforce A7 and A8 by assuming that Note that, by convention, Model (3.2) with r = 0 corresponds to (2.1). In this case, A8 * reduces to the conditions E(S j 0 |F S,η j 0 −1 ) ∈ (0, 1) and E(S j 0 −1 |F S,η j 0 −2 ) ∈ (0, 1) a.s., which is an alternative to A7 and A8.
It is common to assess the adequacy of a time series model by testing the whiteness of the residuals, plotting their empirical (partial) autocorrelations or using formal portmanteau tests, see the monograph by Li (2004). To test the goodness of fit of volatility models, Li and Mak (1994) proposed portmanteau tests based on the autocovariances of the squares of the residuals. The asymptotic distribution of these tests has been studied in particular by Berkes, Horváth, and Kokoszka (2003a) for the standard GARCH models, Carbon and Francq (2011) for APARCH models, and Francq, Wintenberger, and Zakoïan (2018) for Log-GARCH and EGARCH models.
First note that η t = 0 when I t = 0, so that η t should only be a good proxy of η t when I t = 1. Let n 1 = n t=r 0 +1 I t and t 1 , . . . , t n 1 the increasing subsequence of the times t ∈ {r 0 + 1, . . . , n} such that I t = 1. For fixed integers h < n 1 and m < n 1 , let We will determine the asymptotic distribution of the vector r m of autocovariances of the squares residuals under the null hypothesis H 0 : the process ( t ) satisfies (3.1).
Define the m × d 0 matrix whose hth row is In particular, it is shown in appendix that the following assumption is satisfied under the assumptions of Corollary 3.1. B7: If λη 2 t−i + μ ∂ log σ 2 t (θ 0 )/∂θ = 0 a.s. for i ≥ 1 and t ≥ 1 then λ = 0.
Let I m the identity matrix of size m and p 1 = T −1 T t=1 P(S t = 1) the asymptotic proportion of 1's in the bitstream sequence, which can be estimated by p 1 = n 1 /n. Theorem 3.2. Under H 0 , the assumptions of Theorem 3.1 and B7, we have It can be seen that an alternative consistent estimator of D is the empirical variance of ϒ t 1 , . . . , ϒ t n 1 , where The portmanteau test of Li and Mak (1994) consisted in rejecting H 0 at the asymptotic level α

Simulations
To study the finite sample properties of the zero-adjusted QMLE, we undertake a set of Monte Carlo simulations. In the simulations the GARCH specifications are nested in where Equation (4.2) is a particular case of model (3.2) for which the zero-adjusted QMLE is studied in Corollary 3.1. The parameter values correspond (approximately) to the median values of the estimates in Table 3. The zero-probability π 0t = Pr(I t = 0) is governed by one of the following DGPs: DGP 1: π 0t = 0 for all t.
This means $\{I_t\}$ is stationary in DGP 1, but not in DGPs 2 and 3. In DGP 2 the zero-probability $\pi_{0t}$ is downwards trending in a way that is characteristic among the daily returns of Section 5.1, see Figure 1. For $t = 1$ the probability is $\pi_{0t} = 0.5$, and then it declines until $t = n \cdot 0.7$, that is, at 70% of the sample, where $\pi_{0t} = 0.05$. Thereafter, $\pi_{0t}$ remains constant and equal to 0.05. This is in line with A3. In DGP 3 the zero-probability is periodic, as is common in intraday financial data, and varies between $\pi_{0t} = 0.1$ and $\pi_{0t} = 0.4$ as in our illustration in Section 5.2.
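The verbal description of DGPs 2 and 3 can be turned into zero-probability paths as in the Python sketch below. Only the endpoints are taken from the text (0.5 declining to 0.05 at 70% of the sample, and a range of 0.1 to 0.4 for the periodic case). The linear shape of the decline, the cosine shape of the cycle, the period of 288 and all function names are assumptions made for the example.

```python
import math
import random

def pi0_trend(n, start=0.5, end=0.05, frac=0.7):
    """Zero-probability declining from `start` at t = 1 to `end` at
    t = frac * n, constant afterwards. Linear decline is an assumption."""
    t_star = frac * n
    path = []
    for t in range(1, n + 1):
        if t >= t_star:
            path.append(end)
        else:
            weight = (t - 1) / (t_star - 1)
            path.append(start + (end - start) * weight)
    return path

def pi0_periodic(n, period=288, low=0.1, high=0.4):
    """Zero-probability oscillating between `low` and `high`.
    The cosine shape and the period are assumptions."""
    mid, amp = (low + high) / 2.0, (high - low) / 2.0
    return [mid + amp * math.cos(2.0 * math.pi * (t - 1) / period)
            for t in range(1, n + 1)]

def draw_indicators(pi0, seed=0):
    """Draw I_t = 0 with probability pi0[t] and I_t = 1 otherwise."""
    rng = random.Random(seed)
    return [0 if rng.random() < p else 1 for p in pi0]

trend_path = pi0_trend(1000)
periodic_path = pi0_periodic(1000)
indicators_trend = draw_indicators(trend_path, seed=1)
```

The resulting 0/1 sequences can then be used as the given bitstream when simulating returns.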
The results for the GARCH(1,1) model are contained in the upper part of Table 1. For comparison, we include the results of the Standard QMLE in addition to the zero-adjusted QMLE. Note that in DGP 1 the two QMLEs, and therefore also their results, are identical. When n = 10,000, the average finite sample error is 0.004 or less in absolute value for the zero-adjusted QMLE. For the Standard QMLE, by contrast, the finite sample error ranges from 0.02 to 0.13 (in absolute value) in DGP 2, and from 0.01 to 0.056 (in absolute value) in DGP 3. This can be substantial in empirical applications. The asymptotic standard errors of the zero-adjusted QMLE are contained in the columns labeled ase(.), see the supplemental appendix for their computation. The values correspond well to their empirical counterparts, contained in the columns labeled se(.), since they differ by at most 0.001 (in absolute value) across the DGPs. When n = 3000, the zero-adjusted QMLE also produces substantially less biased estimates than the Standard QMLE, and the empirical standard errors correspond reasonably well to their asymptotic counterparts. The only exception is β in DGP 3, where the Standard QMLE is slightly less biased.
The results for the GARCH(1,1) model with the lagged zero-indicator as covariate are contained in the lower part of Table 1. Note that simulations under DGP 1 are not possible due to exact collinearity. Qualitatively, the simulation results are similar to those of the plain GARCH(1,1). When n = 10,000, the average finite sample bias is low in absolute value for the zero-adjusted QMLE (0.006 or less), whereas it is high for the Standard QMLE (0.010 to about 0.504 in absolute value). The largest bias is for $\tau_0$ in DGP 2. The empirical standard errors of the zero-adjusted QMLE again correspond quite well to the asymptotic ones, since the difference is always 0.002 or less in absolute value. When n = 3000, the average finite sample bias is 0.016 or lower in absolute value for the zero-adjusted QMLE, and the associated discrepancy between the empirical standard errors and the asymptotic ones is never larger than 0.009 in absolute value. In other words, in these experiments the finite sample properties of the zero-adjusted QMLE are also quite good. Similarly, the biases of the Standard QMLE are again quite large, since they range from 0.011 to about 0.495 in absolute value. Here too, the largest bias is for $\tau_0$.
Table 1. Simulations of the Standard and zero-adjusted QMLEs (Section 4).

Empirical Illustrations
Standard estimators of volatility, for example, the Standard QMLE, provide estimates of the conditional variance. The volatility $\sigma_t^2$ in our model, by contrast, is not on the same scale. To facilitate comparison, the conditional variance representation of our model is therefore obtained as follows:

$$\mathrm{Var}(\epsilon_t \mid \mathcal{F}_{t-1}) = \pi_{1t} \sigma_t^2, \tag{5.1}$$

where $\pi_{1t} = \Pr(I_t = 1 \mid \mathcal{F}_{t-1})$, recall Remark 2.1. In other words, the conditional variance representation can be written as a GARCH with time-varying parameters. In particular, in the case of a GARCH(1,1) with the lagged 0-indicator as covariate, the conditional variance representation is

$$\sigma_{t,0adj}^2 = \omega_{0t} + \alpha_{0t} \epsilon_{t-1}^2 + \beta_{0t} \sigma_{t-1,0adj}^2 + \tau_{0t} 1_{\{\epsilon_{t-1} = 0\}}, \tag{5.2}$$

where

$$\sigma_{t,0adj}^2 = \pi_{1t} \sigma_t^2, \quad \omega_{0t} = \pi_{1t} \omega_0, \quad \alpha_{0t} = \pi_{1t} \alpha_0, \quad \beta_{0t} = \pi_{1t} \pi_{1,t-1}^{-1} \beta_0, \quad \tau_{0t} = \pi_{1t} \tau_0. \tag{5.3}$$

A higher value of the zero-probability $\pi_{0t} = 1 - \pi_{1t}$ thus implies a lower "volatility-level" $\omega_{0t}$, a lower "sensitivity" $\alpha_{0t}$ to nonzero price increments in the previous period, a lower impact $\beta_{0t}$ from the conditional variance (i.e., $\sigma_{t-1,0adj}^2$) in the previous period, and a lower impact $\tau_{0t}$ from a zero-return in the previous period. Note also that, when the change in $\pi_{1t}$ from $t - 1$ to $t$ is sufficiently small, then $\beta_{0t} \approx \beta_0$.
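The mapping from the zero-adjusted parameters to the time-varying coefficients of the conditional variance representation is simple enough to sketch in a few lines of Python. The sketch assumes that $\beta_{0t}$ equals $\pi_{1t}\beta_0/\pi_{1,t-1}$, which is what substituting $\sigma_{t-1}^2 = \sigma_{t-1,0adj}^2/\pi_{1,t-1}$ into the volatility recursion gives and which agrees with the remark that $\beta_{0t} \approx \beta_0$ when $\pi_{1t}$ changes little. The function name and the numerical inputs other than $\alpha = 0.375$ and $\pi_{0t} = 0.3$ (the example used in Section 5.1) are hypothetical.

```python
def conditional_variance_coefficients(pi1_t, pi1_prev, omega, alpha, beta, tau):
    """Time-varying coefficients of the conditional variance representation,
    given the nonzero-probabilities at t and t - 1 and the parameters of the
    zero-adjusted volatility equation."""
    return {
        "omega_t": pi1_t * omega,
        "alpha_t": pi1_t * alpha,
        "beta_t": pi1_t / pi1_prev * beta,  # assumed ratio form, see lead-in
        "tau_t": pi1_t * tau,
    }

# alpha = 0.375 and a zero-probability of 0.3 at both t and t - 1;
# omega, beta and tau are illustrative values only.
pi0 = 0.3
coefs = conditional_variance_coefficients(
    pi1_t=1.0 - pi0, pi1_prev=1.0 - pi0,
    omega=0.2, alpha=0.375, beta=0.6, tau=0.5,
)
```

With an unchanged zero-probability the persistence coefficient is left untouched, while the other coefficients are scaled down by the probability of a nonzero return.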

Daily Returns at the NYSE
We revisit a subset of the NYSE stocks studied in Sucarrat and Grønneberg (2020). The subset of stocks, 24 in total, together with descriptive statistics of their daily returns, is contained in Table 2. The daily returns are computed as $\epsilon_t = 100 \cdot (\ln S_t - \ln S_{t-1})$, where $S_t$ is the closing price of the stock in question at day $t$. The data source is Bloomberg. To be included in the subset, the NYSE stock must satisfy four criteria. First, at least n = 1000 daily price observations must be available over the period 3 January 2007 to 4 February 2019. Second, the proportion of zero returns must be greater than 10% over the available sample. Third, a moving average (n = 500) estimate of the zero-probability should clearly indicate that the zero-process is nonstationary. Graphs of the moving averages are contained in Figure 1. One of our anonymous reviewers suggested that a trend-like evolution in the zero-probability may be due to a corresponding trend-like evolution in the price level: The higher (lower) the nominal price, the lower (higher) the zero-probability due to discrete price changes. Plots of the prices (see the supplemental appendix) suggest such an effect may indeed be present in several of the stocks. Finally, the fourth criterion is that the graphs suggest assumption A3 holds.
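The return definition and the moving-average estimate of the zero-probability used in the selection criteria can be sketched as follows. The sketch assumes a trailing window (whether the window in Figure 1 is trailing or centred is not stated here), and the function names and the toy price series are hypothetical.

```python
import math

def log_returns(prices):
    """Returns in percent: 100 * (ln S_t - ln S_{t-1})."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

def rolling_zero_proportion(returns, window=500):
    """Proportion of exact zeros over a trailing window of fixed length.
    The trailing alignment is an assumption made for this example."""
    zeros, out = 0, []
    for t, r in enumerate(returns):
        zeros += 1 if r == 0.0 else 0
        if t >= window:
            zeros -= 1 if returns[t - window] == 0.0 else 0
        if t >= window - 1:
            out.append(zeros / window)
    return out

# Toy closing prices with repeated values, hence zero returns.
toy_prices = [10.0, 10.0, 10.5, 10.5, 10.5, 11.0]
toy_returns = log_returns(toy_prices)
toy_zero_share = rolling_zero_proportion(toy_returns, window=2)
```

A window of 500 applied to a series with at least 1000 observations yields the kind of slowly moving path against which a trend in the zero-probability can be judged.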
Notes to Table 2: Ljung and Box (1979) test statistic for first-order autocorrelation in $\epsilon_t^2$, with P-value denoting the associated p-value. 0s, the number of zero returns. $\hat{\pi}_0$, the proportion of zero returns.
Notes to Table 3: Zero-adjusted QMLEs of $\sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 + \tau 1_{\{\epsilon_{t-1} = 0\}}$. s.e., standard error of estimate. Upper bound of 95% CI for $\tau$ computed as $\hat{\tau} + \text{s.e.}(\hat{\tau}) \cdot 1.96$, where $\text{s.e.}(\hat{\tau})$ is the standard error of $\hat{\tau}$. Lower bound computed as $\max\{-\hat{\omega}, L\}$, where $L = \hat{\tau} - \text{s.e.}(\hat{\tau}) \cdot 1.96$. To avoid explosive volatility-paths, the upper bound $\tau \le 10$ is imposed during estimation. $\chi^2(2)$, the results from the portmanteau test of Section 3 of autocorrelation up to and including order 2 of $\hat{\eta}_t^2$ (p-value in parentheses).
GARCH estimates of the daily returns obtained with the zero-adjusted QMLE are contained in Table 3. As noted above, the estimates are not directly comparable to standard GARCH estimates, recall (5.1) and (5.2), and must therefore be adjusted before comparison. As an example, suppose the estimate of the ARCH coefficient $\hat{\alpha}$ is 0.375 (as for the CPS stock) and that the zero-probability at $t$ is $\pi_{0t} = 0.3$. Then, the estimate of the time-varying ARCH-coefficient $\alpha_t$ in the conditional variance representation is obtained as $\hat{\alpha}_t = \pi_{1t} \hat{\alpha} = (1 - \pi_{0t}) \hat{\alpha} \approx 0.263$. In periods where the zero-probability is 0, the estimates can be interpreted as those of the conditional variance representation. Figure 2 contains estimates of coefficients in the conditional variance representation for different values of the zero-probability $\pi_{0t}$. The vertical lines in the plots are 95% Confidence Intervals (CIs) of the estimates. When $\pi_{0t} = 0$, nine of the estimates of $\alpha_t$ lie in the 0.1-0.4 range. This is markedly higher than the typical estimate of a stationary and liquid index or stock, whose estimate is typically below 0.1. This suggests the volatility of this type of stock can be much more sensitive to price changes at $t - 1$ (when $\pi_{0t}$ is zero or close to zero). But more studies are needed before firm conclusions of a general nature can be made. As $\pi_{0t}$ increases to 0.6, almost all estimates go below 0.1. The estimates of $\beta_t$ are obtained under the assumption that $\pi_{0t} = \pi_{0,t-1}$. This is why they do not change with $\pi_{0t}$ in the plots. Four estimates are lower than 0.7. For liquid indices and stocks, they are typically above 0.8. All in all, therefore, the plots do not suggest the estimates of $\beta_t$ tend to be very different from those of liquid indices and stocks. The estimates of $\tau_t$ provide an indication of whether a zero in the previous period tends to increase ($\tau > 0$) or decrease ($\tau < 0$) volatility in the next period. In 9 out of 24 stocks, the 95% CIs do not contain the value 0 (see also Table 3), so the hypothesis of an effect is supported in these cases (at 5%). For one of these stocks the effect is estimated to be negative, whereas for the other 8 it is estimated to be positive.
Finally, the portmanteau test in the final column suggests there is room for improvement (at the 10% significance level) in two of the stocks. Let $\hat{\sigma}_{t,0adj}$ denote the estimated conditional standard deviation of our zero-adjusted QMLE, and let $\hat{\sigma}_t$ denote the estimated conditional standard deviation of the Standard QMLE. To investigate the properties of their discrepancy, we study the distance $x_t = \hat{\sigma}_{t,0adj} - \hat{\sigma}_t$. To obtain an estimate of $\sigma_{t,0adj}$, an estimate of $\pi_{1t}$ is needed. To this end, we devise a nonstationary smoothing filter based on the first-order autoregressive conditional logit (ACL) of Russell and Engle (2005). Specifically, the smoothing filter is specified as in (5.4), where $\phi > 0$ controls the smoothness: The closer to 0, the smoother. Instead of fixing $\phi$ to 0.01 (as we do), one could instead consider estimating it by, say, maximum likelihood. However, this leads to considerably more erratic paths of $\hat{\pi}_{1t}$ for our stocks. Plots of Equation (5.4) against the moving averages from Figure 1, and GARCH estimates of the Standard QMLE, are both contained in the supplemental appendix. Table 4 reports the properties of $x_t$. The first column contains the results of a test of whether $E(x_t) = 0$. The test is implemented via the regression $x_t = \mu + u_t$ with Newey and West (1987) standard errors, $H_0: \mu = 0$ and $H_A: \mu \ne 0$. The null is rejected at 5% in all but four cases. So the results provide comprehensive support in favor of the alternative hypothesis that the volatility paths differ significantly. In all of the significant cases, the average of $x_t$ is negative. So the Standard QMLE tends to provide volatility estimates that are too high, on average, for the stocks we consider. The next two columns contain the maximum and minimum values of $x_t$, respectively. These provide an indication of how the conditional volatilities differ on a day-to-day basis.
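The test of $E(x_t) = 0$ via a regression on a constant with Newey and West (1987) standard errors can be sketched in plain Python as below. The Bartlett-kernel long-run variance is the standard Newey-West form, while the rule-of-thumb lag length, the normal approximation for the p-value, the function name and the simulated input series are assumptions made for this example rather than the settings behind Table 4.

```python
import math
import random

def newey_west_mean_test(x, lags=None):
    """Two-sided test of H0: E(x_t) = 0 from the regression x_t = mu + u_t,
    using a Bartlett-kernel (Newey-West) long-run variance estimate.
    Returns (mean, standard error, t-statistic, p-value). Assumes x is not
    constant, so that the standard error is positive."""
    n = len(x)
    mean = sum(x) / n
    u = [v - mean for v in x]
    if lags is None:
        # Common rule of thumb for the lag length; an assumption here.
        lags = int(4 * (n / 100.0) ** (2.0 / 9.0))
    lrv = sum(v * v for v in u) / n  # lag-0 autocovariance
    for j in range(1, lags + 1):
        gamma_j = sum(u[t] * u[t - j] for t in range(j, n)) / n
        lrv += 2.0 * (1.0 - j / (lags + 1.0)) * gamma_j
    se = math.sqrt(lrv / n)
    t_stat = mean / se
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))  # normal approximation
    return mean, se, t_stat, p_value

# Simulated distances with a clearly negative mean (illustrative only).
rng = random.Random(1)
x_sim = [-1.0 + rng.gauss(0.0, 1.0) for _ in range(500)]
mean_x, se_x, t_x, p_x = newey_west_mean_test(x_sim)
```

With zero lags the standard error reduces to the usual one for a sample mean, so the kernel weights are the only ingredient that accounts for serial dependence in $x_t$.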
As is clear, they show that the discrepancy can be huge, since they range from −20.9 (the lowest minimum) to 4.3 (the highest maximum). This can have important implications for risk and hedging purposes.

Intraday 5-min USD/EUR Returns
Intraday financial returns are frequently characterized by a periodic nonstationary zero-process; see, for example, Kolokolov, Livieri, and Pirino (2020). An example is the intraday 5-min USD/EUR exchange rate return. Let S t denote the exchange rate at the end of a 5-min interval, and let r t denote the log-return in basis points from the end of one interval to the end of the next: trading days are included in the sample, and a typical trading day contains 24 × 12 = 288 returns. The first return of a trading day covers the interval from 00:00 CET to 00:05 CET, whereas the last covers 23:55 CET to 00:00 CET. The upper part of Table 5 contains descriptive statistics of the returns. As usual, the returns are characterized by excess kurtosis relative to the normal distribution, and first-order autocorrelation in 2 t . The proportion of zero-returns over the sample is 20.3%, and the right graph of Figure 3 depicts how the zero-proportion varies intradaily across the 24-hour trading day. In the beginning of the day, only the Asian markets are active, so the zero-probability is higher. As European markets open, activity increases and so the zero-probability falls. The zero-probability remains low until the close of the European markets, and then gradually increases again as only the American markets remain active. The zeroprobability reaches its peak at the close of the American markets.
The middle part of Table 5 contains the GARCH estimates. In both the standard and zero-adjusted cases, $\tau$ is estimated to be negative, and the 95% CIs for $\tau$ do not contain the value 0. In other words, the results suggest a zero-return in the previous period tends to reduce volatility in the next period at the 5-min frequency for this exchange rate during the sample period of the data at the trading platform in question. To obtain estimates of $\pi_{1t}$ and $\pi_{0t}$, we use a centred moving average of length 12 (that is, one hour of trading) made up of the intradaily zero-proportions of the 5-min intervals. The zero-proportions over the trading day, together with the estimate $\hat{\pi}_{0t}$, are both depicted in the right graph of Figure 3. Note that the length of the periodic cycle is 288. Figure 4 contains the estimates of the time-varying parameters implied by Equation (5.2) together with the estimates of the Standard QMLE. As is clear, the standard estimate of $\omega$ is biased downwards throughout the day, and it is also outside the 95% CI throughout the day. The standard estimate of $\alpha$ is biased upwards throughout the day, and most of the time outside the 95% CI. The intraday evolution of the zero-adjusted estimate $\hat{\alpha}_t$ is similar to that of $\hat{\omega}_t$: It is at its highest in the middle of the day when trading is at its highest, and at its lowest in the beginning and end of the day when trading is at its thinnest. The zero-adjusted estimate of $\beta_t$ oscillates about the Standard estimate of 0.857, and only in a couple of instances is the Standard estimate outside the 95% CI. The estimates of $\tau$ are both negative. The standard estimate is biased upwards, but it is always within the 95% CI of the zero-adjusted estimate. So they are not significantly different from each other at 5%.
One of our anonymous reviewers asked us to compare the estimates of the zero-adjusted GARCH, which is a model of the observed return, with those of a GARCH model of the efficient return process as defined in Bandi et al. (2020). There, zeros occur when the efficient return process is unobserved. To this end, we derive a modified version of the moment-based estimator of Kristensen and Linton (2006), see Appendix D (supplementary material) for the details. The estimates are also contained in the middle part of Table 5. Note that an estimate of $\tau$ is not available for this estimator. Compared with the zero-adjusted estimates depicted in Figure 4, the $\omega$ and $\alpha$ estimates are lower, whereas the estimate of $\beta$ is higher. The $\alpha$ and $\beta$ estimates of 0.019 and 0.958, respectively, are particularly different, since they are always substantially outside the 95% CIs of the zero-adjusted estimates.
Notes to Table 5: Ljung and Box (1979) test statistic for first-order autocorrelation in $\epsilon_t^2$ (p-value in square brackets). 0s, number of zeros. $\hat{\pi}_0$, proportion of zeros. s.e., standard error of estimate. 95% CIs computed as $\hat{\tau} \pm \text{s.e.}(\hat{\tau}) \cdot 1.96$, where $\text{s.e.}(\hat{\tau})$ is the standard error of $\hat{\tau}$. Lower bound computed as $\max\{-\hat{\omega}, L\}$, where $L = \hat{\tau} - \text{s.e.}(\hat{\tau}) \cdot 1.96$. $\chi^2(2)$, the result of the portmanteau test of Section 3 of autocorrelation up to and including order 2 of $\hat{\eta}_t^2$ (p-value in square brackets). Moment, the modified moment-based estimator of Kristensen and Linton (2006), see the supplemental appendix. Avg $x_t$, average of $x_t$. P-value, the p-value of a two-sided test with $E(x_t) = 0$ as null (implemented via the regression $x_t = \mu + u_t$ with Newey and West (1987) standard errors).
To investigate to what extent the Standard and zero-adjusted QMLEs produce different volatility estimates, we study the distance x_t = σ_t,0adj − σ_t, just as in Section 5.1. The lower part of Table 5 reports the properties of x_t. Again, the test of whether E(x_t) = 0 is implemented via the regression x_t = μ + u_t with Newey and West (1987) standard errors. The average of x_t is −0.037, and the null of a zero mean is rejected in a two-sided test at all the usual significance levels. Accordingly, the results suggest that the Standard QMLE produces conditional volatilities that are too high, on average. Unconditionally, the value of −0.037 is not large. Conditionally, the range between the maximum and minimum values of x_t suggests the discrepancy can be large on a day-to-day basis. Figure 5 contains the graph of x_t. Most of the time x_t lies between 0.3 and −1.0. Recalling that the 5-min returns are expressed in basis points, these differences do not appear to be large in economic terms.
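Since the regression x_t = μ + u_t contains only an intercept, the test amounts to a t-test on the sample mean with a long-run variance estimate. A minimal sketch is given below. It assumes the Bartlett kernel of Newey and West (1987), no small-sample correction, a normal approximation for the p-value, and a rule-of-thumb lag truncation of floor(4(n/100)^(2/9)). The lag choice used in the paper is not stated, so the default here is an assumption, and the function name is hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def newey_west_mean_test(x: Sequence[float],
                         lags: Optional[int] = None) -> Tuple[float, float, float, float]:
    """Two-sided test of E(x_t) = 0 via x_t = mu + u_t with HAC standard errors.

    Returns (mean, standard error, t-statistic, p-value).
    """
    n = len(x)
    mean = sum(x) / n
    u = [xi - mean for xi in x]                      # regression residuals
    if lags is None:
        # Assumed rule of thumb; not taken from the paper
        lags = int(math.floor(4.0 * (n / 100.0) ** (2.0 / 9.0)))
    lags = min(lags, n - 1)
    long_run_var = sum(ui * ui for ui in u) / n      # lag-0 autocovariance
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)              # Bartlett kernel weight
        gamma_j = sum(u[t] * u[t - j] for t in range(j, n)) / n
        long_run_var += 2.0 * weight * gamma_j
    se = math.sqrt(long_run_var / n)
    t_stat = mean / se
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))  # 2 * (1 - Phi(|t|))
    return mean, se, t_stat, p_value
```

Given two fitted volatility series, the input would be formed as `x = [a - b for a, b in zip(sigma_0adj, sigma_std)]`, where both list names are placeholders for the zero-adjusted and Standard fitted volatilities.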

Conclusions
Financial time series are frequently nonstationary due to a nonstationary zero-process. In these situations, standard estimators are not consistent. We propose a GARCH model that accommodates a nonstationary zero-process, and derive a zero-adjusted QMLE. The nonstationary zero-process can either be trend-like in nature, as is common in daily data, or periodic, as is common in intraday data, or both. The volatility specification in our model can contain higher order ARCH and GARCH terms, asymmetry terms ("leverage") and past zero-indicators as covariates. The latter is of special interest in the current context, since it enables us to study the effect of a zero return on volatility in the subsequent period. Consistency and asymptotic normality of the zero-adjusted QMLE are proved under mild assumptions. Moreover, under stationarity of the zero-process the estimator will still be CAN, so there is no harm in applying our estimator under stationarity. Finite sample simulations verify that the estimator has good finite sample properties, and confirm that the Standard QMLE is biased when the zero-process is nonstationary. Two empirical studies illustrate our results. One is on the daily returns of 24 stocks at NYSE, and one is on intraday 5-min USD/EUR exchange rate returns. In both studies we find that the time-varying zero-probability affects the dynamics in substantial ways, that the fitted volatilities can differ significantly, and that a zero-return in the previous day can have a substantial effect on volatility in the subsequent day. Interestingly, however, we do not always find that the effect is positive.
While a nonstationary zero-process is frequent in financial time series, only recently have researchers directed their attention toward this characteristic. Several lines of future research suggest themselves. First, the extension to more general volatility models outlined in Section 3 accommodates models with asymmetry ("leverage"). An interesting line of further research is to study how the evolution of the zero-probability changes the effect of asymmetry. Second, it is well known that financial time series, both daily and intradaily, can be nonstationary due to changes in the level of the unconditional volatility. How frequently are such changes due to a nonstationary zero-process? To the best of our knowledge, this has not been investigated before. Third, to obtain the conditional variance representation of our model, estimates of the time-varying probabilities of a nonstationary zero-process are required. This is challenging. More research is needed to ascertain what the most suitable approach is, and under which assumptions. Fourth, as noted by one of our anonymous reviewers, the zero-process may not be the only source of nonstationarity. In addition, the volatility intercept (ω), and the ARCH and GARCH parameters, may also be time-varying. To the best of our knowledge, nobody has developed methods for situations where both types of nonstationarities are present. Finally, knowledge about the relation between observed zeros and the underlying efficient return process is limited, so more research on this is needed.

Supplementary Materials
The supplementary file contains four appendices. Appendix A gathers all the proofs and some complementary theoretical results. Appendix B describes the computation of the asymptotic covariance of the 0-adjusted QMLE in the simulation experiments. Appendix C gives details about the NYSE stocks used in the empirical study. Appendix D discusses the possibility of estimating the volatility of the efficient returns considered in Bandi et al. (2017) and Bandi et al. (2020). The replication files are available via https://www.sucarrat.net/research/replication-files-garch-0-non stationary.zip.