Generalized Autoregressive Positive-valued Processes

Abstract We introduce generalized autoregressive positive-valued (GARP) processes, a class of autoregressive and moving-average processes that extends the class of existing autoregressive positive-valued (ARP) processes in one important dimension: each conditional moment dynamic is driven by a different and identifiable moving average of the variable of interest. The article provides ergodicity conditions for GARP processes and derives closed-form conditional and unconditional moments. The article also presents estimation and inference methods, illustrated by an application to European option pricing where the daily realized variance follows a GARP dynamic. Our results show that using GARP processes reduces pricing errors by substantially more than using ARP processes.


Introduction
The finance literature has widely relied on autoregressive positive-valued (ARP) processes to model variations in the distribution of positive time series. Autoregressive gamma (ARG) processes are leading examples of ARP processes. In modeling the term structure of interest rates, volatility factors are traditionally designed to follow ARG dynamics (Le et al. 2010; Monfort et al. 2017). In option pricing, when modeling the key component, the conditional variance, several papers use ARG dynamics (Feunou and Tedongap 2012; Majewski, Bormetti, and Corsi 2015). ARG processes have also been used to model intraday financial market activity, in particular intertrade duration (Gourieroux, Jasiak, and Fol 1999; Gourieroux and Jasiak 2006).
Other well-known examples of ARP processes include the autoregressive gamma-zero (ARG0) process (Monfort et al. 2017), used to model term structure of interest rates dynamics at the zero lower bound; the Heston-Nandi GARCH (Heston and Nandi 2000), used to model the conditional variance of stock returns with a Gaussian distribution; the inverse-Gaussian GARCH (Christoffersen, Heston, and Jacobs 2006), used to model the conditional variance of stock returns with an inverse-Gaussian distribution; and the integer-valued autoregressive (INAR) model (Al-Osh and Alzaid 1987), used to model positive integer-valued processes.
ARP processes are characterized by an affine cumulant generating function

ψ_t(u) ≡ ln E[exp(u x_{t+1}) | I_t] = ω(u) + α(u) x_t,

where I_t is the sigma algebra generated by (x_s, s ≤ t). The ARG process studied in Gourieroux and Jasiak (2006) corresponds to the discretization of the Cox-Ingersoll-Ross diffusion process (Cox, Ingersoll, and Ross 1985). Its corresponding functions ω(u) and α(u), for a scalar u < 1/ϕ, are given by

ω(u) = −ν log(1 − uϕ) and α(u) = φu / (1 − uϕ),

with ν ≥ 0, ϕ > 0 and φ ≥ 0. It admits the state-space representation

x_{t+1} | U_{t+1} ~ γ(ν + U_{t+1}, ϕ), U_{t+1} | x_t ~ P(φ x_t / ϕ),

where U_{t+1} is a latent process that follows a Poisson distribution, denoted by P(·), and γ(·) is the standard gamma distribution. Gourieroux and Jasiak (2006) provide a generalization to any order (p, q) in which the conditional cumulant generating function is driven by a moving average m_t of past values of x_t, with φ_0 = 1 imposed for identification, p ≥ 1 and q ≥ 1. One implication of this ARG process in definition (3) is that all the conditional cumulants (the derivatives of ψ_t(u) at u = 0) are driven by the same factor m_t. Indeed, we have

ψ_t^{(n)}(0) = ω^{(n)}(0) + α^{(n)}(0) m_t,

with f^{(n)}(u) denoting the nth-order derivative of a function f(·) at u. This suggests that both the conditional expectation and the conditional variance of x_t are driven by m_t, and are, therefore, perfectly positively correlated, and that all moments are highly positively correlated. There is considerable empirical evidence contradicting the very tight restriction between the first two moments imposed by affine models, in particular when considering interest-rate and variance modeling (Cieslak and Povala 2016). Using swap data, Collin-Dufresne, Jones, and Goldstein (2004) find that a popular and well-documented three-factor affine model implies volatility paths that are negatively correlated with the GARCH volatility estimates of weekly changes in the six-month rate. Andersen and Benzoni (2010) use intraday Treasury data to show that realized yield volatility is unrelated to principal components extracted from the cross-section, which proxy for model-implied volatility. Regarding the dynamic of the stock market, there is also evidence that the expectation of the realized variance has a distinct dynamic apart from the variance of the realized variance (Corsi et al. 2008).
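To make the ARG state space above concrete, the following sketch simulates an ARG(1) path via the Poisson-gamma mixture. It is an illustration only: the Poisson intensity φx_t/ϕ and the normalization α(u) = φu/(1 − uϕ) are assumptions consistent with the formulas above, and the parameter values are merely illustrative.

```python
import numpy as np

def simulate_arg(nu, phi, varphi, T, rng=None):
    """Simulate an ARG(1) path via the Poisson-gamma state space (a sketch).

    Assumed normalization: U_{t+1} | x_t ~ Poisson(varphi * x_t / phi),
    x_{t+1} | U_{t+1} ~ Gamma(nu + U_{t+1}, scale=phi).
    """
    rng = np.random.default_rng(rng)
    x = np.empty(T)
    x[0] = nu * phi / (1.0 - varphi)   # unconditional mean omega'(0) / (1 - alpha'(0))
    for t in range(1, T):
        u = rng.poisson(varphi * x[t - 1] / phi)
        x[t] = rng.gamma(nu + u, phi)
    return x

# Illustrative parameters, close in magnitude to realized-variance estimates
x = simulate_arg(nu=0.039, phi=0.017, varphi=0.252, T=5000, rng=0)
```

Because the conditional mean is affine in x_t, the simulated path mean-reverts around ω'(0)/(1 − α'(0)), matching the affine cumulant structure described above.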
Numerous contributions in the literature focus on building complex parametric and semi-parametric time series models where the first four moments have very distinct dynamics. Hansen (1994) and Jondeau and Rockinger (2003) are pioneers in that literature and have shown that the behavior and persistence of the first four moments are very different. For instance, skewness is strongly persistent while kurtosis is much less so. Chang et al. (2011) extract nonparametric measures of risk-neutral variance and skewness from option prices and show that the correlation between option-implied skewness and option-implied volatility for the S&P500 is -0.06, and the correlation between the average option-implied skewness and average option-implied volatility for the S&P100 components is 0.05.
This inability of affine dynamics to fit all the moments jointly implies that they cannot fit the conditional density and, hence, they generate large option pricing errors, as shown in our empirical investigations. To mitigate these shortcomings of ARP models, we introduce GARP processes to extend ARP processes in one important dimension: each conditional moment dynamic is driven by a different moving average of the variable of interest (x_t). Moreover, GARP processes are parsimonious and, importantly, they maintain one of the key advantages of ARP processes: a closed-form multi-step-ahead distribution. Hence, we are able to add much-needed flexibility to the ARP dynamics while keeping its main advantage, i.e., computing derivative prices in closed form.
The key principle of our generalization of ARP processes (and affine models in general) to a GARP dynamic is simple: contrary to affine dynamics, where all the cumulants are driven by the same factor m_t (see (5)), we want each cumulant ψ_t^{(n)}(0) to be driven by its own specific factor, say m_t^{(n)}, where m_t^{(n)} is a moving average of x_t. One way to achieve that with a minimal number of additional parameters is to set θ_n = βθ^n, which is equivalent to the following recursive formulation of the conditional cumulant generating function:

ψ_t(u) = ω(u) + α(u) x_t + β ψ_{t−1}(θu),    (8)

where functions α(·) and ω(·) are given in (2). The main theoretical challenge of this article is to build a process whose cumulant generating function has the recursive formulation in (8). We successfully tackle that challenge, and our findings extend not only the ARG process but also all the other ARP models.
Our approach is not the only way to enhance ARP processes and disentangle their moments dynamics. An alternative is to model x_t as the sum of independent univariate ARG dynamics, which we refer to as the multi-factor ARG, or MARG for short. The MARG pioneered by Le et al. (2010) is the leading framework in discrete-time affine term structure of interest rate models. Its appeal is simple and intuitive: if one is interested in disentangling the first K moments, it is enough to sum K independent ARGs. However, as we discuss in detail in Section 5, the MARG loses its analytical tractability when we analyze the dynamic conditional only on the variable of interest x_t. Another approach in option pricing is the Wishart autoregressive (WAR) process studied in detail in Gourieroux, Jasiak, and Sufana (2009), Yu, Li, and Ng (2017), and Gourieroux and Sufana (2010). Although this is the leading extension of the ARG dynamic, it suffers from the same shortcomings as the MARG.
Our generalization of ARP processes builds on the foundations provided by Feunou and Meddahi (2009), who introduce the generalized affine class of processes. Generalized affine processes describe a family of univariate time-series processes whose cumulant generating function ψ_t(u) is written recursively as follows:

ψ_t(u) = ω(u) + α(u) x_t + β(u) ψ_{t−1}(θ(u)),    (9)

where α(u), ω(u), β(u) and θ(u) are parametric functions of u that need to be specified.
Building concrete examples of stochastic processes belonging to the generalized affine family has proven challenging; this is one of the key contributions of this paper. Our GARP processes characterized in equation (8) are concrete examples of the generalized affine processes characterized in equation (9). Indeed, each specific GARP process corresponds to a choice of functions α(u) and ω(u), setting the function β(u) to a constant (i.e., β(u) = β for all u) and θ(u) to a linear function of u, that is, θ(u) = θu.
The generalization of the ARG dynamic to the so-called generalized autoregressive gamma (GARG) process was first introduced in Feunou and Meddahi (2009) as an example of processes belonging to the generalized affine class of models. Our contribution builds on Feunou and Meddahi (2009) but is distinct in three important dimensions: (1) we show how the principle of the generalization of ARG to GARG applies to other ARP processes; (2) we provide an in-depth study of our proposed class of GARP processes, which includes the GARG; and (3) we provide the first empirical applications of this family of processes and document the clear gains that can be obtained.
This article is organized as follows. Section 2 defines generalized autoregressive positive-valued (GARP) processes. Section 3 derives the short-term and long-term dynamics of conditional moments. Section 4 discusses ergodicity conditions and derives moments of the unconditional distribution. Section 5 compares the GARG and MARG models. Section 6 discusses issues related to identification and statistical inference. An application to option pricing is presented in Section 7, where we show that the GARG dynamic dominates the ARG. A final section concludes the paper. Proofs and extensions are gathered in the online appendices.

Specification
We consider a univariate ARP process x_t with t ≥ 1, characterized by equation (1). As mentioned in the introduction, ARP processes include the ARG, the ARG0, the Heston-Nandi GARCH, the inverse-Gaussian GARCH and the INAR. We generalize ARP processes to GARP processes, which are characterized by a recursive cumulant generating function as in equation (8). The ARP dynamic is nested within the GARP and is obtained by setting β = 0 or θ = 0. Feunou and Meddahi (2009) generalize the ARG process to the GARG process through a state-space representation (equations (10)-(12)) in which x_{t+1} is decomposed into components Z^{(j)}_{t+1} whose conditional distributions depend only on x_{t−j}.
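As a sketch of how this state-space representation can be simulated, the snippet below truncates the component sum at J lags. The component parameters ν_j = νβ^j, ϕ_j = ϕθ^j and Poisson intensity (φ/ϕ)β^j x_{t−j} are our own assumptions, chosen so that summing the component cumulant generating functions reproduces the recursion (8); they are not taken from the article's equation (13).

```python
import numpy as np

def simulate_garg(nu, phi, varphi, beta, theta, T, J=60, rng=None):
    """Truncated simulation of x_{t+1} = sum_{j>=0} Z^{(j)}_{t+1} (a sketch).

    Assumed component parameters: nu_j = nu*beta^j, phi_j = phi*theta^j,
    Poisson intensity (varphi/phi)*beta^j * x_{t-j} for the j-th component.
    """
    rng = np.random.default_rng(rng)
    mu = nu * phi / (1.0 - varphi - beta * theta)   # unconditional mean omega'(0)/(1 - rho)
    hist = [mu] * (J + 1)                           # x_t, x_{t-1}, ..., x_{t-J}
    out = np.empty(T)
    for t in range(T):
        x_next = 0.0
        for j in range(J + 1):
            u = rng.poisson((varphi / phi) * beta**j * hist[j])
            x_next += rng.gamma(nu * beta**j + u, phi * theta**j)
        hist = [x_next] + hist[:-1]
        out[t] = x_next
    return out

rv = simulate_garg(nu=0.039, phi=0.017, varphi=0.252, beta=0.6, theta=0.9, T=2000, rng=1)
```

Truncation at J lags is harmless in practice because the component weights decay geometrically in β.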

Generalized Autoregressive Gamma-Zero Processes
Here we discuss the generalization of the autoregressive gamma-zero (ARG0) process of Monfort et al. (2017), which is essential for modeling the term structure of interest rates at the zero lower bound as it encompasses a point mass at zero, which is not possible with the ARG dynamic. First, both the ARG0 and the ARG can be written within a single affine framework (also known as extended-ARG processes), where I_t is the sigma algebra generated by (x_s, s ≤ t). The ARG dynamic is obtained by setting ξ = 0, while the ARG0 dynamic is obtained by setting ν = 0. Using the decomposition of x_{t+1} given in equation (10), we generalize these extended-ARG processes in a state-space representation in which ν_j, ϕ_j, and φ_j are given in (13) and ξ_j ≡ ξ(βθ)^j.

Generalized Heston-Nandi GARCH Process
The Heston and Nandi (2000) model is arguably the most popular discrete-time option pricing model. It is an affine-GARCH model in which the dynamic of the conditional variance is driven by ε_{t+1} ~ iid N(0, 1). Heston and Nandi (2000) show that the log-conditional moment generating function of x_{t+1} is affine in x_t. We generalize the Heston and Nandi model by decomposing x_{t+1} as in (10) and by specifying a corresponding state space for Z^{(j)}_{t+1}.

Generalized Inverse-Gaussian GARCH Process
The inverse-Gaussian GARCH process of Christoffersen, Heston, and Jacobs (2006) is also an affine but non-Gaussian dynamic for the conditional variance x_t, built on inverse-Gaussian innovations instead. The inverse-Gaussian GARCH is affine: indeed, Christoffersen, Heston, and Jacobs (2006) establish that its log-conditional moment generating function is affine in x_t. We generalize the inverse-Gaussian GARCH by decomposing x_{t+1} as in equation (10) and by specifying a corresponding state space for Z^{(j)}_{t+1}.

Generalized INAR Processes
In risk analysis, the variable of interest x_t is often integer-valued and measures the number of claims in period t. The processes in this class are called integer-valued autoregressive (INAR) and have been explored in the time series and insurance literature (see Darolles, Gourieroux, and Jasiak 2006 for a detailed discussion). The INAR is an affine model with a linear conditional cumulant generating function similar to that of ARG dynamics, where I_t is the sigma algebra generated by (x_s, s ≤ t), 0 < ρ < 1 and λ > 1. Using the decomposition of x_{t+1} given in equation (10), we generalize this INAR process in a state-space representation where β and θ are positive integers.

Alternative Formulation of GARP Processes
We can now establish a key result of this paper: the cumulant generating function of each of the GARP dynamics defined in Sections 2.1, 2.2, 2.3, 2.4, and 2.5 follows a recursive dynamic as in equation (8).
Proposition 1. Let us assume that the positive-valued univariate process of interest x_t follows the dynamics described in equations (10), (11), and (12). Then, for a scalar u such that 1 − uϕ_j > 0 for all j ≥ 0, the conditional cumulant generating function ψ_t(u) of x_{t+1} exists and evolves according to the following recursive dynamic:

ψ_t(u) = ω(u) + α(u) x_t + β ψ_{t−1}(θu),    (21)

where functions α(·) and ω(·) are given in (2).
The condition for the existence of the cumulant generating function in (21) is 1 − uϕ_j > 0 for all j ≥ 0. That condition is equivalent to u < (1/ϕ) min_{j≥0} θ^{−j}. Hence, the set of admissible values for u depends on θ: when θ ≤ 1, the cumulant generating function exists for all u < 1/ϕ, while when θ > 1 it exists only for u ≤ 0. The fact that the cumulant generating function only exists for negative arguments when θ > 1 is an argument for constraining θ to be at most one. Indeed, in most applications, we would build positive processes that are obtained as linear combinations of GARG processes with positive loadings. We show in Section 4 that θ ≤ 1 is a necessary condition for weak ergodicity.
Proposition 2. The cumulant generating function of the generalized autoregressive gamma-zero process follows a recursion similar to (21). The cumulant generating function of the generalized Heston-Nandi GARCH process follows a recursion similar to (21). The cumulant generating function of the generalized inverse-Gaussian GARCH process follows a recursion similar to (21). The cumulant generating function of the generalized INAR process follows a recursion similar to (21).

Equation (21) is an alternative formulation of GARP models. In other words, GARP dynamics have two equivalent formulations. The first one is the state-space representation (given by equations (10), (11), and (12) for the GARG), which is useful for simulating the GARP dynamic. The second one is specified by means of the conditional cumulant generating function ψ_t(·) and is given by (21). All the results and implications of this paper use the second formulation of the model.
Proposition 1 restates a result already proven in Feunou and Meddahi (2009), while Proposition 2 establishes that all the GARP processes defined in this paper have a recursive cumulant generating function representation as in Proposition 1. This is one of the key theoretical contributions of this paper.
Proof of Proposition 1. Using (10) and the fact that all the Zs on the right-hand side are conditionally independent, the conditional cumulant generating function of x_{t+1} is the sum of the component cumulant generating functions. By assumption, the state-space representation given by (11) and (12) pins down each component term, and summing these terms yields the recursion (21), which establishes Proposition 1.
The proofs of the results in Proposition 2 are similar to that of Proposition 1; the derivations are almost identical except for the key step given in equation (26). In Section 1 of the Appendix, we derive that key step for all the GARP processes introduced in this paper.
We now study the statistical properties of GARP processes. Our results apply to all the different GARP dynamics introduced in this paper. To keep that degree of generality, we will not use the explicit expressions of the functions α(·) and ω(·) given in equation (2).

Moments Dynamic
In this section, we derive the dynamics of conditional moments, including the expectation, variance, skewness and kurtosis. As (21) specifies the dynamic of the log-conditional moment generating function (the cumulant generating function), it is convenient to derive the law of motion of the conditional cumulants (the derivatives of the cumulant generating function at 0). Differentiating (21) n times and evaluating at u = 0 gives

ψ_t^{(n)}(0) = ω^{(n)}(0) + α^{(n)}(0) x_t + βθ^n ψ_{t−1}^{(n)}(0).    (28)

All positive affine processes considered in this paper share one important property: the derivatives ω^{(n)}(0) and α^{(n)}(0) are positive for all n. We can also easily establish that x_t is an ARMA(1,1) with autoregressive parameter α'(0) + βθ and moving average parameter βθ. Indeed, using (28) for n = 1 and defining the forecast error u_{t+1} ≡ x_{t+1} − ψ'_t(0), we have

x_{t+1} = ω'(0) + (α'(0) + βθ) x_t + u_{t+1} − βθ u_t.    (30)
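The ARMA(1,1) representation in (30) is an algebraic identity, which can be verified numerically on any series. The sketch below uses hypothetical values for ω'(0), α'(0), β and θ, builds the conditional mean by the n = 1 recursion, and confirms the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
om1, al1, beta, theta = 0.05, 0.3, 0.6, 0.9   # hypothetical omega'(0), alpha'(0), beta, theta
x = rng.gamma(2.0, 1.0, size=500)             # any positive series works for this identity

m = np.empty_like(x)
m[0] = om1 + al1 * x[0]                       # psi'_0(0): arbitrary initialization
for t in range(1, len(x)):
    m[t] = om1 + al1 * x[t] + beta * theta * m[t - 1]   # recursion (28) for n = 1

u = x[1:] - m[:-1]                            # forecast errors u_{t+1} = x_{t+1} - psi'_t(0)
lhs = x[2:]
rhs = om1 + (al1 + beta * theta) * x[1:-1] + u[1:] - beta * theta * u[:-1]
```

The two sides match exactly, whatever the initialization of the conditional mean, because (30) is obtained by substituting the recursion into the definition of the forecast error.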

Importance of Parameters β and θ
Equation (27) implies that

ψ_t^{(n)}(0) = ω^{(n)}(0) / (1 − βθ^n) + α^{(n)}(0) m_t^{(n)}, with m_t^{(n)} ≡ Σ_{j=0}^{∞} (βθ^n)^j x_{t−j}.

Consequently, each conditional cumulant (i.e., ψ_t^{(n)}(0) for a given n) is driven by its own factor m_t^{(n)}, which is a moving average of the variable of interest x_t. Hence, with only two additional parameters (β and θ), we are able to generate a parsimonious generalization of ARG processes that disentangles the dynamics of all the conditional moments. Further, the ability to disentangle moments dynamics stems from the parameter θ. Indeed, when θ = 1, all the conditional cumulants are driven by the same factor, Σ_{j=0}^{∞} β^j x_{t−j}, and thus are perfectly positively correlated.
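A quick numerical illustration of this point: computing m_t^{(n)} = Σ_j (βθ^n)^j x_{t−j} for n = 1, 2 on an arbitrary positive series shows the two factors are perfectly correlated when θ = 1 and distinct when θ < 1 (the parameter values below are hypothetical):

```python
import numpy as np

def cumulant_factor(x, beta, theta, n):
    # m_t^{(n)} = sum_{j>=0} (beta*theta^n)^j x_{t-j}, computed recursively
    w = beta * theta**n
    m = np.empty_like(x)
    m[0] = x[0]
    for t in range(1, len(x)):
        m[t] = x[t] + w * m[t - 1]
    return m

rng = np.random.default_rng(0)
x = rng.gamma(2.0, 1.0, size=2000)        # any positive series

def factor_corr(beta, theta):
    m1 = cumulant_factor(x, beta, theta, 1)
    m2 = cumulant_factor(x, beta, theta, 2)
    return np.corrcoef(m1[100:], m2[100:])[0, 1]

c_low = factor_corr(0.6, 0.8)   # theta < 1: factors differ
c_one = factor_corr(0.6, 1.0)   # theta = 1: factors coincide, correlation is one
```

With θ = 1 the two geometric weights collapse to β, so m^{(1)} and m^{(2)} are the same series; with θ < 1 the weights βθ and βθ² differ and the correlation drops below one.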
To confirm and complete this central point, we compute the implied correlation between two conditional cumulants at different orders n and m. If ρ < 1, we establish (Section 2 of the Appendix) that the correlation between ψ_t^{(n)}(0) and ψ_t^{(m)}(0) is given in closed form in equation (33). From (33), it is readily apparent that θ = 1 implies that all the cumulants are perfectly correlated; thus, allowing θ ≠ 1 is essential to break the tight links between moments that are inherent in ARP processes.

Initial Cumulant ψ 0 (u) and the Dynamic of x 1
In practice, when computing the conditional cumulant function through the recursion (21), we need a starting cumulant function ψ_0(u). We set ψ_0(u) to the unconditional average E[ψ_t(u)]. This is similar to the practice in the GARCH literature, where the initial variance is typically set to the unconditional expectation of the conditional variance process. Under the conditions ρ < 1 and βθ^n < 1, we show in Section 2 of the Appendix that the unconditional expectation of each cumulant ψ_t^{(n)}(0) exists. We can thus derive the unconditional expectation of ψ_t(u) using the identity

E[ψ_t(u)] = Σ_{j=0}^{∞} β^j (ω(θ^j u) + α(θ^j u) μ),

where μ ≡ ω'(0)/(1 − ρ) is the unconditional expectation of x_t. Using the functions ω(u) and α(u) defined in (2), we deduce the closed-form expression of ψ_0(u) given in (36). It is also worth stressing that E[ψ_t(u)] is not the unconditional cumulant function of x_t. Later we discuss the conditions required for the unconditional distribution (and hence the unconditional cumulant function) of x_t to exist. Since ψ_0(u) given in (36) is the cumulant generating function of x_1, it implies a corresponding state-space representation for x_1.

Multi-Horizon Dynamic
Like affine models, an important characteristic of GARP processes is the existence of a closed-form forecast of any nonlinear transformation of a GARP process at any horizon. This characteristic enables financial applications such as closed-form bond and option pricing. This result is not new, as it has been established for the more general family of generalized affine processes in Feunou and Meddahi (2009). The multi-horizon cumulant generating function, defined as ψ_t(u; h) ≡ ln E[exp(u x_{t+h}) | I_t], is computed analytically in Section 3.1 of the Appendix. The derivatives of ψ_t(u; h) at u = 0 give closed-form expressions for the moments of the time-t conditional distribution of x_{t+h}; we derive these expressions in Section 3.2 of the Appendix.

Weak Ergodicity
This section discusses weak ergodicity conditions, that is, conditions under which the distribution at horizon h tends to a limiting distribution. For more on ergodicity, see Darolles, Gourieroux, and Jasiak (2006), where the focus is on affine processes. Weak ergodicity is equivalent to the convergence of the multi-horizon cumulant generating function, which is in turn equivalent to the convergence, as h increases and for all n, of the h-step-ahead nth conditional cumulant ψ_t^{(n)}(0; h) derived in (39).
Let us denote ρ ≡ α'(0) + βθ. In Section 4.1 of the Appendix, we show the following result:

Proposition 3. ψ_t^{(n)}(0; h) converges as h increases if and only if ρ < 1 and βθ^j < 1 for j = 1, ..., n.
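Proposition 3 translates directly into a small helper that checks weak ergodicity of the first n cumulants (a sketch; the parameter names follow the text):

```python
def weakly_ergodic(rho, beta, theta, n):
    # Proposition 3: psi^(n)_t(0; h) converges iff rho < 1 and beta*theta^j < 1 for j = 1..n
    return (rho < 1) and all(beta * theta**j < 1 for j in range(1, n + 1))
```

Note that with β < 1 and θ ≤ 1 the condition holds at every order n, consistent with the earlier argument for constraining θ to be at most one; with θ > 1, higher-order cumulants eventually fail the condition.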
The cumulant generating function of the unconditional distribution is then ψ_∞(u) ≡ lim_{h→∞} ψ_t(u; h). Unlike the unconditional expectation of the conditional cumulant generating function, E[ψ_t(u)], which has been characterized analytically and shown to belong to the gamma distribution family (see (36)), we have not been able to compute ψ_∞(u) in closed form, and we are thus unable to assess whether the unconditional distribution belongs to a known family of distributions. It is important to stress again that ψ_∞(u) differs from E[ψ_t(u)]. Finally, weak ergodicity implies the existence of an invariant distribution whose cumulant generating function is ψ_∞(u), such that if the process is initialized from its invariant distribution, it is stationary.

Autocorrelation Functions
Various methods exist for examining serial dependence in stationary GARP processes. In this section, we consider the first- and second-order autocorrelograms. Our goal is to show how the flexibility of GARP (through the parameters β and θ) impacts the serial dependence of the series of interest x_t.

Autocorrelation of the Level
Since x_t is an ARMA(1,1) (see the discussion at the end of Section 3.1), if ρ < 1 and βθ² < 1 (which is equivalent to the covariance stationarity of x_t), the autocorrelation function of the level is available in closed form.

Autocorrelation of the Squared Values of the Process
The second-order autocorrelogram captures the serial dependence in the squared values of the process. Let z_t ≡ (x_t, x_t²)′ and ε_t ≡ x_t − ψ'_{t−1}(0). The process z_t is a VARMA(1,1), since its conditional expectation is affine in z_t. Hence, under the conditions ρ < 1 and βθ^n < 1 for n ≤ 4, we have cov(z_{t+h}, z_t) = A^{h−1} cov(z_{t+1}, z_t), where A denotes the autoregressive matrix of the VARMA(1,1) representation of z_t.
We provide closed-form expressions for cov(z_{t+1}, z_t) and var[z_t] in Section 6.2 of the Appendix.

How β and θ Impact Unconditional Moments
We consider different values for β and θ and fix ν = 0.039, ϕ = 0.017, and φ = 0.252. These values are the results of the estimation on realized variance data (see Section 7.1). Figure 1 plots unconditional moments as functions of β and θ and reveals interesting insights. First, unsurprisingly, low values of β imply a model very close to the original ARG. When β is low, the model is broadly similar across different values of θ. Similarly, when θ is low, the model is broadly similar across different values of β. This is not surprising, since θ is only identified if β is sufficiently different from zero and vice versa. While the volatility increases with β, the skewness and kurtosis decrease with β. The same findings apply when looking at variation across θ: the skewness and kurtosis decrease with θ, but the volatility increases with θ.
In conclusion, allowing θ < 1 improves the ability of the new model to fit highly skewed and fat-tailed time series. Turning to the correlation between cumulants given by (33) and plotted in the last row of Figure 1, we observe that the correlations between cumulants decrease with β, with a perfect correlation for low values of β. For very high values of β, the correlation between cumulants decreases with θ, reaching values as low as 0.8, and there is a U-shaped pattern as a function of θ for the medium value of β. Overall, adding the parameters β and θ breaks the tight link between cumulants that is embedded in ARG dynamics and affine models in general. Finally, in the ARG dynamic the autocorrelogram of the level is very similar to that of the squared values of the process, as shown by the first plot in the second row of Figure 1. We can see how increasing θ, and thus departing progressively from the affine structure, enables the disentanglement of these autocorrelograms.

Comparison with Multi-Factor Affine Models
Our discussion so far has focused on a simple extension of a single-factor ARP process (and affine processes in general). Our proposal is similar to the generalization of the AR dynamic to ARMA, or of ARCH to GARCH. Our approach is not the only way to enhance the ARG and disentangle its moments dynamics. An alternative is to model x_t as the sum of independent univariate ARG dynamics, which we refer to as the multi-factor ARG, or MARG for short. The MARG pioneered by Le et al. (2010) is the leading framework in discrete-time affine term structure of interest rate models. Its appeal is simple and intuitive: if one is interested in disentangling the first K moments, it is enough to sum K independent ARGs. However, as we discuss in detail in this section, the MARG loses its analytical tractability when we analyze the dynamic conditional only on the variable of interest x_t. In option pricing, the Wishart autoregressive (WAR) process studied in detail in Gourieroux, Jasiak, and Sufana (2009), Yu, Li, and Ng (2017), and Gourieroux and Sufana (2010) is the leading extension of the ARG dynamic. The WAR has the same shortcomings as the MARG.
The MARG resembles our model specification given in (10), where the multiple latent factors are assumed to be independent both conditionally and unconditionally. Formally, the MARG is given by x_{t+1} = Σ_{k=1}^{K} a_k x_{t+1,k}, where a_k > 0, k = 1, ..., K, and the x_{t+1,k}, k = 1, ..., K, are independent ARG processes. A MARG may seem less constrained than our GARG model given in (10). First, each latent factor x_{t+1,k} has its own set of parameters, while the parameters of the factors Z^{(j)}_{t+1} in the GARG model given in (10) are all related, as shown in (13).
Second, while the x_{t+1,k}, k = 1, ..., K, in (43) are independent, their counterparts, the Z^{(j)}_{t+1} in (10), are conditionally independent but dependent unconditionally. In fact, the conditional distribution of Z^{(j)}_{t+1} depends only on x_{t−j} and not on Z^{(j)}_t. In other words, although the Z^{(j)}_{t+1} in the GARG dynamic are latent, their distributions depend only on the observed process of interest x_t. This property enables us to compute analytically the distribution of x_{t+1} conditional on its own past, without resorting to any filtering procedure. This is not the case for the MARG, which requires a filtering procedure.
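For intuition, a MARG path can be simulated by summing independent ARG components. The sketch below (illustrative parameters, and the same assumed Poisson-gamma normalization as before) makes plain that only the sum is observed, so recovering the components requires filtering:

```python
import numpy as np

def simulate_arg(nu, phi, varphi, T, rng):
    # One ARG component under the assumed Poisson-gamma normalization
    x = np.empty(T)
    x[0] = nu * phi / (1.0 - varphi)
    for t in range(1, T):
        u = rng.poisson(varphi * x[t - 1] / phi)
        x[t] = rng.gamma(nu + u, phi)
    return x

rng = np.random.default_rng(2)
# K = 2 independent components with distinct persistence (varphi); only the sum x is observed
x = simulate_arg(1.5, 0.02, 0.4, 3000, rng) + simulate_arg(0.5, 0.02, 0.7, 3000, rng)
```

Each component is affine in its own past, but the sum is not affine in its own past, which is the tractability loss discussed in this section.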
Hence, a fair comparison of the two models requires deriving the distribution of x_{t+1} conditional on its own past implied by the MARG dynamic.

Cumulant Generating Function of x_{t+1} Conditional on (x_s, s ≤ t)
To avoid cumbersome mathematical derivations, we focus on the case K = 2. We can also, without loss of generality, assume that a_k = 1, as the a_k are not separately identifiable from the parameters of the latent processes x_{t+1,k}; in fact, a_k x_{t+1,k} is also an ARG process. As in (1), ψ_t(u) denotes the one-step-ahead cumulant generating function of x_{t+1} conditional on its own past; formally, ψ_t(u) ≡ ln E[exp(u x_{t+1}) | I_t], where I_t is the sigma algebra generated by (x_s, s ≤ t). We show the following result in Section 8 of the Appendix:

Proposition 4. The dynamic of ψ_t(u) implied by the MARG dynamic is given by (44).

The two dynamics to compare are given by (44) for the MARG model and (8) for the GARG model. While both models express the conditional cumulant generating function recursively, the recursion for the GARG is simple, as it has a closed-form expression, while the one for the MARG is nonlinear and requires computing tedious integrals.

Mean and Variance of x t+1 Conditional on (x s , s ≤ t)
We push the comparison one step further by evaluating the first two moments implied by (44). This is done by taking the first two derivatives of (44) with respect to u. To ease the mathematical derivations, we focus on the case ϕ_1 = ϕ_2 = ϕ, which implies that the MARG and the GARG have the same number of parameters. With ϕ_1 = ϕ_2 = ϕ, the MARG dynamic collapses to the ARG dynamic if and only if φ_1 = φ_2.
The following proposition, proven in Sections 8.1 and 8.2 of the Appendix, gives the dynamic of the first two moments:

Proposition 5. The dynamic of the conditional mean ψ'_t(0) implied by the MARG dynamic is given by (45), and the dynamic of the conditional variance ψ''_t(0) is given by (46).

Equations (45) and (46) contain integrals that complicate our analysis. To solve these integrals and gather more intuition, we resort to the approximation given in (47). Combining (45) and (46) with (47) leads to the following corollary:

Corollary 2. The dynamics of ψ'_t(0) and ψ''_t(0) implied by the MARG dynamic are given by (48) and (49).

We verify numerically that these approximations are accurate; in fact, we show that the same dynamic is obtained when Kalman filter methods are used (see Monfort et al. 2017). First, the two moment dynamics are interrelated, since computing the time t + 1 expectation ψ'_{t+1}(0) requires the past variance ψ''_t(0) and vice versa. This contrasts with the GARG dynamic, where each cumulant dynamic is computed independently. This connection between moment dynamics is at the heart of the problem we address. Second, the dynamic here is nonlinear, which complicates temporal aggregation; this is in contrast with the linear dynamic of the GARG. It is important to stress that the model is affine when we condition on the unobserved components but becomes non-affine if we condition on the observed variable x_t. This paper is about the dynamic of x_t conditional on the past of x_t; we nonetheless investigate the MARG empirically in Section 7.

Statistical Inference
The univariate GARG has five parameters: φ, ϕ, ν, β, and θ; the goal of this section is to discuss their estimation and statistical inference. First, the five parameters are well identified. In Section 6.1 of the Appendix, we discuss an identification approach that consists of expressing φ, ϕ, ν, β, and θ as functions of quantities that can be directly estimated: the unconditional mean, variance, skewness and the first two autocorrelations. We have also run simulation exercises to establish heuristically the identification of the five parameters; the results are displayed in Tables 1 and 2 of the Appendix. In this section, we review standard estimation methods, such as the pseudo-maximum likelihood and the maximum likelihood, and show that they can be used for the estimation of GARG processes. This implies that standard statistical inference related to these methods remains valid in the context of the GARG dynamic.

Maximum Likelihood Estimators
We exploit the Fourier inversion formula to compute the conditional density as follows:

f(x_{t+1} | I_t) = (1/2π) ∫_{−∞}^{+∞} exp(ψ_t(iu) − iu x_{t+1}) du,

where i stands for the imaginary unit. This enables the estimation of the GARG's parameters through the maximum likelihood (ML) procedure and the use of standard inference to compute standard errors. Because a numerical inversion is involved, some practical challenges could arise.
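A minimal sketch of this inversion, checked on a case with a known answer: a gamma(ν, ϕ) distribution, whose cumulant generating function at an imaginary argument is −ν log(1 − iuϕ). The grid and truncation values are arbitrary numerical choices, not part of the article's procedure:

```python
import numpy as np

def density_from_cgf(x, psi_iu, upper=4000.0, n=800_001):
    # f(x) = (1/pi) * integral_0^inf Re[exp(psi(iu) - i*u*x)] du, truncated at `upper`
    u = np.linspace(0.0, upper, n)
    vals = np.real(np.exp(psi_iu(u) - 1j * u * x))
    du = u[1] - u[0]
    return (vals.sum() - 0.5 * (vals[0] + vals[-1])) * du / np.pi   # trapezoid rule

# Sanity check: gamma with shape nu_ = 2 and scale phi_ = 0.5
nu_, phi_ = 2.0, 0.5
f_hat = density_from_cgf(1.0, lambda u: -nu_ * np.log(1 - 1j * u * phi_))
f_true = np.exp(-2.0) / 0.25   # gamma pdf at x = 1: x^{nu-1} e^{-x/phi} / (Gamma(nu) phi^nu)
```

The integrand decays like u^{-nu} in this example, so a finite truncation suffices; for cumulant functions with slower decay, the practical challenges mentioned above (truncation and oscillation) become more serious.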

Pseudo-Maximum Likelihood Estimators
Since conditional moments are available in closed-form, the GARG can be estimated using pseudo-maximum likelihood.
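A sketch of the resulting objective: a Gaussian pseudo-log-likelihood built from the first two conditional cumulants, each computed through the recursion implied by (21). The derivative values ω'(0) = νϕ, α'(0) = φ, ω''(0) = νϕ², α''(0) = 2φϕ rest on our assumed normalization α(u) = φu/(1 − uϕ) and are not taken from the article:

```python
import numpy as np

def gaussian_pml(params, x):
    # Gaussian pseudo-log-likelihood from the first two conditional cumulants:
    #   psi'_t(0)  = om1 + al1*x_t + beta*theta   * psi'_{t-1}(0)
    #   psi''_t(0) = om2 + al2*x_t + beta*theta^2 * psi''_{t-1}(0)
    nu, phi, varphi, beta, theta = params
    om1, al1 = nu * phi, varphi
    om2, al2 = nu * phi**2, 2.0 * varphi * phi
    mu = om1 / (1.0 - al1 - beta * theta)            # unconditional mean
    s1 = mu                                          # psi'_0(0) initialized at E[x_t]
    s2 = (om2 + al2 * mu) / (1.0 - beta * theta**2)  # psi''_0(0) at its unconditional level
    ll = 0.0
    for t in range(1, len(x)):
        s1 = om1 + al1 * x[t - 1] + beta * theta * s1
        s2 = om2 + al2 * x[t - 1] + beta * theta**2 * s2
        ll += -0.5 * (np.log(2.0 * np.pi * s2) + (x[t] - s1) ** 2 / s2)
    return ll

rng = np.random.default_rng(5)
x_sim = rng.gamma(2.0, 0.002, size=300)              # stand-in positive series
ll = gaussian_pml((0.039, 0.017, 0.252, 0.6, 0.9), x_sim)
```

Maximizing this objective over the five parameters (e.g., with a numerical optimizer) yields the pseudo-maximum likelihood estimator; positivity of s2 is automatic here because every term in its recursion is positive.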
Because GARG processes have a closed-form conditional characteristic function, one alternative to the ML-based methods is the empirical characteristic function (ECF) estimation method. Finally, some applications in the stochastic volatility and term structure of interest rates literatures require latent factors. In Sections 6.2, 6.3, and 6.4 of the Appendix, we discuss the ECF estimator, the generalized method of moments, and estimation in the presence of latent factors.

Fitting the Historical Joint Dynamic of S&P 500 Returns and Realized Variances
In this section we denote the day t stock price and return by S_t and R_t, with R_t ≡ ln(S_t/S_{t−1}). We design an option pricing model where returns and realized variances (RV_t) are modeled jointly, in line with the literature (Christoffersen et al. 2014; Majewski, Bormetti, and Corsi 2015). Our measure of realized variance, RV_t on day t, is the sum of the squared 5-min log-returns observed within day t. To highlight the usefulness of the GARG process, we assume that the realized variance follows a GARG dynamic (instead of the ARG dynamic), as in (52) and (53), where r is the risk-free rate (calibrated to the sample average of the 3-month Treasury bill rate) and λ is interpreted as the price of risk, as it indicates the variation in the equity risk premium per unit of variation in the realized variance. Equation (52) is well motivated empirically by several studies in the literature (Andersen et al. 2001; Andersen, Bollerslev, and Dobrev 2007), where it is shown that the time t return conditional on the time t realized variance follows a Gaussian distribution. The remaining challenge is to model the conditional distribution of the realized variance. Many studies have relied on the ARG process; we will instead assume a GARG dynamic, as in (53).
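The realized variance measure itself is straightforward to compute. The sketch below assumes a vector of intraday (e.g., 5-min) prices for a single day; the data layout is hypothetical.

```python
import numpy as np

def realized_variance(intraday_prices):
    """Day-t realized variance: the sum of squared intraday log-returns."""
    log_returns = np.diff(np.log(np.asarray(intraday_prices, dtype=float)))
    return float(np.sum(log_returns ** 2))

# toy example: prices doubling each interval give log-returns of ln(2),
# so the realized variance is 3 * ln(2)^2
rv = realized_variance([1.0, 2.0, 4.0, 8.0])
```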
Note that both the ARG2 and the MARG models have the same number of parameters (5) as the GARG.Before giving details on option pricing, we evaluate the relative performance of GARG processes in fitting salient facts of the dynamic of the observed realized variance series.

Data and Empirical Results
The empirical investigation begins by obtaining daily historical realized variances for the S&P 500 index from oxfordman.ox.ac.uk. The data cover the period from January 01, 2000, to December 31, 2017. Table 1 contains the maximum likelihood estimates for the benchmark ARG processes and the GARG process on daily historical realized variances. The likelihood and BIC figures indicate that the GARG is the best-performing model. Unsurprisingly, the likelihood ratio tests favor all the alternative specifications against the basic ARG model (ARG0).
To better gauge the ability of the different models to fit the data, we report the observed sample mean, variance, skewness, and kurtosis and compare them with each model's implied moments. Contrary to the ARG models, the GARG is able to match these unconditional moments.
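The comparison relies on standard sample moments; a minimal helper is sketched below (the kurtosis is the raw, non-excess version, an assumption about the table's convention).

```python
import numpy as np
from scipy import stats

def sample_moments(x):
    """Mean, variance, skewness, and raw kurtosis of a sample."""
    x = np.asarray(x, dtype=float)
    return {
        "mean": x.mean(),
        "variance": x.var(ddof=1),
        "skewness": stats.skew(x),
        "kurtosis": stats.kurtosis(x, fisher=False),  # equals 3 for a Gaussian
    }

# sanity check on a large Gaussian sample
rng = np.random.default_rng(2)
m = sample_moments(rng.standard_normal(100000))
```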
To further shed light on these results, we plot sample autocorrelations and cross-correlations in Figure 1 of the Appendix. The top left panel displays the realized variance's autocorrelation function across models. The other panels display the cross-correlation between the level and the square, corr(RV_t, RV²_{t+h}), the cross-correlation between the square and the level, corr(RV²_t, RV_{t+h}), and the autocorrelation of the square, corr(RV²_t, RV²_{t+h}). None of the ARG models is able to capture the long memory inherent in the observed variance dynamic. In contrast, the GARG dynamic better matches the sample autocorrelograms.
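These cross-correlations are plain sample statistics; a sketch of corr(a_t, b_{t+h}) for a nonnegative lag h:

```python
import numpy as np

def cross_corr(a, b, h):
    """Sample correlation between a_t and b_{t+h}, for lag h >= 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a0, b0 = (a[:-h], b[h:]) if h > 0 else (a, b)
    a0 = a0 - a0.mean()
    b0 = b0 - b0.mean()
    return float(np.dot(a0, b0) / (len(a0) * a0.std() * b0.std()))

# e.g., corr(RV_t, RV^2_{t+h}) would be cross_corr(rv, rv**2, h);
# here an iid positive series stands in for the realized variance
rng = np.random.default_rng(3)
x = rng.exponential(size=20000)
```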
To diagnose the different models, we extract the conditional mean E_{t−1}[x_t] and the conditional variance var_{t−1}[x_t], then compute the standardized residuals as z_t ≡ (x_t − E_{t−1}[x_t]) / √var_{t−1}[x_t]. In principle, the better a model is at fitting the conditional mean (the first conditional cumulant), the smaller the autocorrelation in z_t. In the same vein, the better a model is at fitting the conditional variance (the second conditional cumulant), the smaller the autocorrelation in z²_t. Figure 2 displays the autocorrelograms of the level and square of the standardized residuals z_t along with 95% confidence bounds and yields interesting insights. Only the GARG model is able to extract the first two moment dynamics with accuracy, as its autocorrelograms lie mostly within the confidence bounds. The basic ARG model, ARG0, is unable to fit either moment, especially the conditional mean. The other two versions of the ARG process, ARG1 and ARG2, fit the second moment with the same accuracy as the GARG dynamic, but at the cost of the first-moment fit. This highlights the central point of our proposal: there is a tension between fitting the first two moments inherent in the ARG dynamic, which we are able to overcome with the GARG process. The MARG model provides a clear improvement over ARG dynamics regarding the first moment, but it is still outperformed by the GARG dynamic on both dimensions.
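The diagnostic can be sketched as follows: standardize with the model's conditional moments, then inspect the autocorrelations of z_t and z_t². For the check below, the "model" moments are the true moments of a simulated white-noise series, a hypothetical stand-in for the GARG-implied conditional moments.

```python
import numpy as np

def standardized_residuals(x, cond_mean, cond_var):
    """z_t = (x_t - E_{t-1}[x_t]) / sqrt(var_{t-1}[x_t])."""
    return (x - cond_mean) / np.sqrt(cond_var)

def acf(z, lag):
    """Sample autocorrelation at a given lag."""
    z = z - z.mean()
    return float(np.dot(z[:-lag], z[lag:]) / np.dot(z, z))

# under a correctly specified model, z_t and z_t^2 should be serially
# uncorrelated; the 95% band is roughly +/- 2/sqrt(T)
rng = np.random.default_rng(4)
T = 10000
x = 1.0 + 0.5 * rng.standard_normal(T)  # mean 1, variance 0.25
z = standardized_residuals(x, 1.0, 0.25)
rho1, rho1_sq = acf(z, 1), acf(z ** 2, 1)
```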
The physical properties of the realized variance dynamic investigated above are likely to have important implications for the models' ability to fit a large panel of options. This is the task we now turn to.

Risk-Neutral Estimation
Similar to ARG processes, GARG processes are built to enable closed-form option prices. In this exercise, we assume that the joint dynamic given by (52) and (53) is specified under the risk-neutral probability measure. This implies that r is the risk-free rate and λ = 0. We provide full details on option pricing under the GARG and MARG dynamics in Sections 7.2 and 8.2 of the Appendix. We estimate the different models by optimizing their fit on option data. This analysis aims at exploring the ability of each specification to match the risk-neutral distribution embedded in option contracts. We start by presenting the key features of the option data panel used in our empirical analysis and then study the performance of the various models using the implied volatility root-mean-squared error.
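The mechanics of characteristic-function-based pricing can be sketched with a Gil-Pelaez-style inversion. The Black-Scholes log-price characteristic function is used below as a stand-in, an assumption made so the result can be checked against the closed-form formula; under the GARG dynamic, the model's risk-neutral characteristic function of ln S_T would take its place.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def call_price_from_cf(phi, s0, k, r, T, u_max=200.0):
    """European call via Fourier inversion of the cf of ln S_T:
    C = S0*P1 - K*exp(-rT)*P2, with P1, P2 from Gil-Pelaez integrals."""
    lnk = np.log(k)
    def p(tilt):
        # tilt=1 gives the stock-measure probability P1, tilt=0 gives P2
        denom = phi(-1j) if tilt else 1.0
        integrand = lambda u: np.real(
            np.exp(-1j * u * lnk) * phi(u - 1j * tilt) / (1j * u * denom))
        return 0.5 + quad(integrand, 1e-8, u_max)[0] / np.pi
    return s0 * p(1) - k * np.exp(-r * T) * p(0)

# stand-in: Black-Scholes cf of ln S_T (Gaussian with drift r - sigma^2/2)
s0, k, r, T, sigma = 100.0, 100.0, 0.02, 0.5, 0.2
m = np.log(s0) + (r - 0.5 * sigma ** 2) * T
phi_bs = lambda u: np.exp(1j * u * m - 0.5 * sigma ** 2 * T * u ** 2)

price = call_price_from_cf(phi_bs, s0, k, r, T)

# closed-form Black-Scholes benchmark
d1 = (np.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
d2 = d1 - sigma * np.sqrt(T)
bs = s0 * stats.norm.cdf(d1) - k * np.exp(-r * T) * stats.norm.cdf(d2)
```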

Option Data
We use European-style options written on the S&P 500 index. The observations span the period January 10, 1996, to August 28, 2013. This dataset is available through OptionMetrics, which supplies data for the U.S. option markets. In line with the literature, we only include out-of-the-money (OTM) options with maturities ranging from 15 to 180 days. This selection procedure is intended to guarantee that the contracts considered are liquid. We also filter out options that violate basic no-arbitrage criteria. For each maturity quoted on Wednesdays, we select only the six most liquid strike prices, which amounts to a dataset of 21,283 option contracts. To ease calculation and interpretation, OTM put prices are converted into corresponding in-the-money call values by exploiting the put-call parity relationship. We provide a detailed description of the option data in Section 7.3 of the Appendix.

Table 2.
Option-based estimation. This table shows estimation results for six different models. We used Wednesday closing out-of-the-money (OTM) call and put contracts from OptionMetrics for the period beginning January 10, 1996, and ending August 28, 2013. We report the estimated parameters (Est) along with their corresponding standard errors (SE). The second-to-last row shows the implied volatility root-mean-squared errors (IVRMSEs). For comparison, the last row reports the IVRMSE ratio of each specification to the benchmark ARG0 model.

Fitting Options
We explore the performance of the different models by relying on the implied volatility root-mean-squared error (IVRMSE) (see Section 7.3 of the Appendix for details). We use Diebold-Mariano tests (Diebold and Mariano 2002) to assess whether differences in pricing errors across models are statistically significant. We test ARG1 against ARG0, ARG2 against ARG1, and finally GARG against ARG2. The results indicate p-values of 0, except for the test of ARG2 against ARG1, implying that ARG0 errors are statistically significantly higher than ARG1 errors, while GARG errors are statistically significantly lower than ARG2 errors. There is no statistical difference between ARG1 and ARG2 option pricing errors.
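For readers replicating the comparison, a minimal Diebold-Mariano sketch with squared-error loss and a Newey-West long-run variance is given below; the loss function and lag choice are assumptions, not necessarily those used in the paper.

```python
import numpy as np
from scipy import stats

def diebold_mariano(e1, e2, h=1):
    """Test equal predictive accuracy of two error series (squared loss).
    Returns the DM statistic and its two-sided asymptotic N(0,1) p-value."""
    d = np.asarray(e1) ** 2 - np.asarray(e2) ** 2  # loss differential
    n = len(d)
    dbar = d.mean()
    # Newey-West long-run variance with h-1 autocovariance terms
    lrv = np.mean((d - dbar) ** 2)
    for k in range(1, h):
        lrv += 2.0 * np.mean((d[k:] - dbar) * (d[:-k] - dbar))
    dm = dbar / np.sqrt(lrv / n)
    pval = 2.0 * (1.0 - stats.norm.cdf(abs(dm)))
    return dm, pval

# sanity check: model 1's errors are genuinely smaller than model 2's,
# so the statistic should be large and negative
rng = np.random.default_rng(5)
e1 = rng.standard_normal(2000)
e2 = 2.0 * rng.standard_normal(2000)
dm_stat, pval = diebold_mariano(e1, e2)
```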

Model Fit by Moneyness, Maturity, and VIX Levels
We now scrutinize the overall performance results reported in the bottom panel of Table 2. To this end, we report the IVRMSE by moneyness, maturity, and VIX level in Table 3. We see that all models offer satisfactory performance (low IVRMSEs) in matching at-the-money option contracts. By contrast, fitting deep OTM call and put options appears more challenging. Interestingly, the ability of the various specifications to match the observed option-implied volatility appears consistent across the term structure of the options, as the IVRMSEs are of comparable magnitude. Moreover, the performance of these models tends to deteriorate nearly monotonically as a function of the VIX level. This observation suggests that the ability of the models to generate realistic option prices weakens in highly volatile times.
Nevertheless, the GARG model dominates the other models along the moneyness, maturity, and VIX-level dimensions. However, we would have expected the fit to be significantly better for long maturities, since the GARG process is able to generate slowly decaying autocorrelations. The fit in this longer-maturity dimension would certainly improve with a higher-order GARG dynamic, which we introduce in Section 1.2 of the Appendix.

Conclusion
This study introduces the generalized autoregressive positive-valued (GARP) dynamic. The GARP is a parsimonious generalization of autoregressive positive-valued (ARP) dynamics that overcomes the tight links between conditional moments implicit in ARP processes. The GARP dynamic enables each moment to be driven by its own moving average of the variable of interest. Moreover, the new process maintains the practical advantage of ARP dynamics and affine models in general: it has a closed-form multi-step-ahead moment generating function.
Empirically, we show that the GARG dynamic dominates the ARG in fitting the historical dynamic of realized variance and, most importantly, in describing the behavior of a large panel of option prices. Our generalization has so far focused on models with finite moments of all orders, which restricts applications to certain types of financial data. Generalization to processes with infinite moments is an exciting topic for future research.

Supplementary Materials
This supplement provides (i) the proofs of the results reported in the main article, (ii) some additional theoretical results that are commented on but not reported in the main article, and (iii) a Monte Carlo study and additional empirical results.
with B_{n,k} being the partial (incomplete) exponential Bell polynomial, D_n^{∘h} the function D_n composed h times with itself, and M[n, k] the (n, k)th entry of a generic matrix M. We provide the definition and explicit expression of B_{n,k} in Section 4.3 of the Appendix.

Figure 2.
Figure 2. Autocorrelograms of residuals. These figures plot the autocorrelation of the level and the squared values of the standardized residual z_t. For each model, we extract the implied conditional mean E_{t−1}[x_t] and the implied conditional variance var_{t−1}[x_t], then compute the standardized residuals as z_t ≡ (x_t − E_{t−1}[x_t]) / √var_{t−1}[x_t]. The horizontal dashed lines represent the upper and lower confidence bounds. The sample begins January 01, 2000, and ends December 31, 2017.
where the matrix C_n and the matrix function D_n(·) are given in (40).

Table 1.
Estimation using historical realized variance. This table shows maximum likelihood estimation results for four different models. We used daily historical realized variances for the S&P 500 index from January 01, 2000, to December 31, 2017. We report the estimated parameters (Est) with their corresponding standard errors (SE).
Table 2 contains the results of the option-based estimation. Clearly, our option-fitting strategy yields accurate parameter estimates, as evidenced by fairly small standard errors and sizeable model likelihoods. Because we fit the models on options only, the estimates correspond to risk-neutral parameters. The proposed GARG model clearly outperforms the alternative ARG specifications, as it delivers the highest likelihood value, the lowest BIC, and the smallest global IVRMSE. Specifically, the GARG model offers about 10% and 20% improvement over the benchmark ARG0 model in terms of log-likelihood and IVRMSE, respectively. While the MARG model outperforms the ARG dynamics, it has a marginally higher IVRMSE than the GARG. At the bottom of the table, we report p-values of the Diebold-Mariano test (see Diebold and Mariano 2002).