Ambiguity in the Cross-Section of Expected Returns: An Empirical Assessment

This article estimates and tests the smooth ambiguity model of Klibanoff, Marinacci, and Mukerji based on stock market data. We introduce a novel methodology to estimate the conditional expectation, which characterizes the impact of a decision maker’s ambiguity attitude on asset prices. Our point estimates of the ambiguity parameter are between 25 and 60, whereas our risk aversion estimates are considerably lower. The substantial difference indicates that market participants are ambiguity averse. Furthermore, we evaluate whether ambiguity aversion helps explain the cross-section of expected returns. Compared with Epstein and Zin preferences, we find that incorporating ambiguity into the decision model improves the fit to the data while keeping relative risk aversion at more reasonable levels. Supplementary materials for this article are available online.


INTRODUCTION
Although there is a long tradition of modeling preferences with subjective expected utility (SEU), researchers nowadays consider more sophisticated preference representations. If a decision maker (DM) has vague information about the model that determines the distribution of outcomes, uncertainty appears not only as risk, that is, fluctuations with a known probability distribution, but also as ambiguity about the model itself. Ambiguity may cause a loss of utility to a DM. The resulting bias in portfolio allocations, mirroring the aspiration for robust decision making, might have a perceptible impact on asset prices. This article investigates whether ambiguity aversion is present in investors' decision patterns by looking at the cross-section of stock returns and macroeconomic variables. We estimate a set of preference parameters assuming that investors act in line with the smooth ambiguity (SA) model of preferences developed by Klibanoff, Marinacci, and Mukerji (2005, 2009). We compare the model's pricing performance with the ambiguity-neutral recursive preference model of Epstein and Zin (1989), EZ henceforth, and the pure ambiguity (PA) model that features risk neutrality but ambiguity aversion.
Intuitively, a DM with SA preferences considers a whole set of different economic models. For each, she calculates a certainty equivalent with respect to expected utility. Her decisions are finally based on the expected utility of the set of certainty equivalents with respect to a second utility function. This function displays the DM's ambiguity attitude and is characterized by the ambiguity parameter η. One goal of this article is to estimate this parameter and thus gauge the ambiguity attitude of investors. An alternative approach to introducing ambiguity aversion is the multiple priors model of Gilboa and Schmeidler (1989). They assumed that a DM does not consider all certainty equivalents belonging to different candidate models, but only the "worst case." Hansen and Sargent (2001) related this approach to the robustness theory of Anderson, Hansen, and Sargent (2000) and Hansen and Sargent (2008). Ju and Miao (2012) pointed out that the SA framework contains these and further preference specifications as special and limiting cases. Halevy (2007) investigated a variety of decision models using extensions of the Ellsberg (1961) experiment. Bossaerts et al. (2010) and Ahn et al. (2011) analyzed the impact of ambiguity in portfolio choice experiments. In contrast to experimental studies, our results are based on historical stock market data. Anderson, Ghysels, and Juergens (2009) and Brenner and Izhakian (2011) also evaluated the impact of ambiguity on asset returns empirically. Both include ambiguity factors, which are supposed to capture time variation in ambiguity about stock returns, in linear multi-factor models and find that ambiguity is a priced factor. Similarly, we find evidence for ambiguity in the cross-section of expected returns. As opposed to these articles, our approach stems from life-cycle consumption-based asset pricing. Epstein and Schneider (2010) reviewed the literature on ambiguity and asset markets. They concluded that ambiguity has important implications for the pricing of financial assets. General equilibrium asset pricing applications of the SA approach include Collard et al. (2011), Ju and Miao (2012), and Miao, Wei, and Zhou (2012). In these articles, the risk aversion parameter γ is set to a low value, while the ambiguity parameter η is calibrated to match important asset pricing moments. The assumed values vary significantly between the asset pricing applications in the literature. However, as for γ, there also has to be a reasonable range for η. The findings of Halevy (2007) are interpreted by Chen, Ju, and Miao (in press), who infer an ambiguity parameter between 50 and 90. Our point estimates of η are between 25 and 60, while γ is clearly lower and within the range considered plausible by Mehra and Prescott (1985). The substantial difference to the risk aversion parameter indicates that market participants are ambiguity averse.
As shown by Mehra and Prescott (1985), the consumption-based asset pricing model of Lucas (1978) and Breeden (1979) has severe problems in explaining the large equity premium and the cross-sectional variation in expected returns. We investigate whether accounting for ambiguity helps explain these phenomena and compare the pricing performance of the SA model with that of several benchmark models. We find that it is difficult to discriminate between these decision models solely based on pricing errors. The SA model achieves a slightly better fit to the data with low relative risk aversion.
To estimate preference parameters, we use the generalized method of moments (GMM) of Hansen (1982). Hansen and Singleton (1982) employed GMM to estimate the consumption-based capital asset pricing model, while Epstein and Zin (1991) estimated the EZ model. GMM relies on Euler equations to test the fit of candidate pricing kernels. Compared with EZ preferences, the pricing kernel and hence the Euler equation of the SA model contains an additional term, which characterizes the impact of an investor's ambiguity attitude on asset prices. An ambiguity averse agent puts more weight on economic models that yield a low expected continuation value. Estimating this expected value, conditional on the economic model, poses technical difficulties. Motivated by a long-run risks (LRR) model, we show how to overcome these.
Another difficulty in estimating consumption-based asset pricing models with recursive preferences is that it requires the return on the wealth portfolio, which is not observable. Several approximations have been proposed in the literature. Epstein and Zin (1991) used the return on a broad stock market index. However, Chen, Favilukis, and Ludvigson (2013) and Lustig, Van Nieuwerburgh, and Verdelhan (2012) studied the properties of the return on wealth and found that it is less volatile and only weakly correlated with the return on the stock market. Among others, Campbell (1996) and Jagannathan and Wang (1996) accounted for the large fraction of human wealth in total wealth. As in Zhang (2006), we use a proxy for the return on wealth based on the variable cay of Lettau and Ludvigson (2001), which includes human wealth and total asset holdings.
The remainder of this article is organized as follows. Section 2 reviews SA preferences and the pricing kernel. Section 3 discusses the estimation technique and its finite sample properties. The preference models are estimated based on post-war consumption and stock market data in Section 4. Section 5 concludes. The online supplementary material provides information about the GMM estimation and a simulation study.

SMOOTH AMBIGUITY PREFERENCES
In this section, we briefly review SA preferences. The model's static version was introduced by Klibanoff, Marinacci, and Mukerji (2005) and generalized to a dynamic model by Klibanoff, Marinacci, and Mukerji (2009) and Ju and Miao (2012). It is a generalization of recursive preferences as developed by Kreps and Porteus (1978) and Epstein and Zin (1989). Since our goal is to provide a strong intuition for the nature of SA preferences, we set technicalities aside and refer the interested reader to the articles named above.

The Decision Model
Consider a DM who evaluates future consumption plans $C = (C_t)_{t \in \mathbb{N}}$ with respect to the recursive value function
$$V_t(C) = \left[ (1 - e^{-\delta})\, C_t^{1-\rho} + e^{-\delta} \left( \mathcal{R}_t(V_{t+1}(C)) \right)^{1-\rho} \right]^{\frac{1}{1-\rho}},$$
where ρ denotes the reciprocal of the DM's elasticity of intertemporal substitution (EIS) and δ the DM's subjective time discount rate. The uncertainty aggregator $\mathcal{R}$ accounts for risk and ambiguity in the continuation value $V_{t+1}(C)$ of future consumption. The aggregation depends on the DM's attitudes toward risk and ambiguity. At each point in time t, the DM faces a model set $\Pi_t$. The key assumption of the smooth ambiguity model is that the DM entertains a subjective prior $\mu_t$, that is, a probability measure on $\Pi_t$ that mirrors her assessment of the likelihood of each candidate model in $\Pi_t$ to be the "true" one. A model is given by a probability measure $\pi_{t+1} \in \Pi_t$ on the state space that pins down the distribution of the future consumption stream. We use the subscript t + 1 for all models in the time t model set to point out that the true model is a random variable in t. Each model $\pi_{t+1}$ yields a conditional certainty equivalent $u^{-1}\left(E_{\pi_{t+1}}[u(V_{t+1}(C))]\right)$, where u is a utility function characterizing the DM's risk attitude. The uncertainty aggregator is defined as the unconditional certainty equivalent
$$\mathcal{R}_t(V_{t+1}(C)) = v^{-1}\left( E_{\mu_t}\left[ v\left( u^{-1}\left( E_{\pi_{t+1}}[u(V_{t+1}(C))] \right) \right) \right] \right),$$
where v is a further utility function. The ambiguity attitude of the DM depends on the curvature of the composition $\phi = v \circ u^{-1}$.
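If it is concave, the DM dislikes mean-preserving spreads in expected utilities conditional on the models in $\Pi_t$, which implies ambiguity aversion. If it is linear, the DM is ambiguity neutral and $\mathcal{R}_t$ reduces to the aggregator of EZ preferences. We assume that u and v are of the power utility type
$$u(x) = \frac{x^{1-\gamma}}{1-\gamma}, \qquad v(x) = \frac{x^{1-\eta}}{1-\eta}.$$
The parameter vector $\Theta = (\rho, \delta, \gamma, \eta)$ describes the DM's preferences, where γ describes the DM's attitude toward risk and η her attitude toward ambiguity. She is ambiguity averse if η > γ. Summing up, at time t the DM evaluates consumption plans C according to
$$V_t(C) = \left[ (1 - e^{-\delta})\, C_t^{1-\rho} + e^{-\delta} \left( E_{\mu_t}\left[ \left( E_{\pi_{t+1}}\left[ V_{t+1}(C)^{1-\gamma} \right] \right)^{\frac{1-\eta}{1-\gamma}} \right] \right)^{\frac{1-\rho}{1-\eta}} \right]^{\frac{1}{1-\rho}}. \tag{1}$$
Equation (1) nests the value functions of EZ preferences (η = γ) and time-separable constant relative risk aversion (CRRA) utility (ρ = η = γ).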
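The two-stage aggregation can be illustrated numerically. The following is a minimal sketch, assuming a finite state space, a finite model set, and power utility for both u and v; the function names and all numbers are hypothetical and chosen only for illustration. It shows that the aggregator collapses to the EZ certainty equivalent under the prior-weighted mixture when η = γ, and that it is lower when η > γ.

```python
import math

def power_ce(values, probs, coef):
    """Certainty equivalent of a discrete lottery under power utility x**(1-coef)/(1-coef)."""
    if abs(coef - 1.0) < 1e-12:  # log-utility limit
        return math.exp(sum(p * math.log(x) for x, p in zip(values, probs)))
    moment = sum(p * x ** (1.0 - coef) for x, p in zip(values, probs))
    return moment ** (1.0 / (1.0 - coef))

def sa_aggregator(values, model_probs, prior, gamma, eta):
    """Certainty equivalent across models (coefficient eta) of the
    model-wise certainty equivalents (coefficient gamma)."""
    conditional_ce = [power_ce(values, probs, gamma) for probs in model_probs]
    return power_ce(conditional_ce, prior, eta)

# Hypothetical inputs: three continuation values, two candidate models, a flat prior.
values = [0.8, 1.0, 1.2]
model_probs = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
prior = [0.5, 0.5]

# Unconditional (prior-weighted) distribution used by an ambiguity neutral DM.
mixture = [sum(w * p[k] for w, p in zip(prior, model_probs)) for k in range(len(values))]

r_averse = sa_aggregator(values, model_probs, prior, gamma=5.0, eta=20.0)
r_neutral = sa_aggregator(values, model_probs, prior, gamma=5.0, eta=5.0)
r_ez = power_ce(values, mixture, 5.0)
print(r_averse, r_neutral, r_ez)
```

With η = γ the two stages merge into a single expectation under the mixture, which is why `r_neutral` and `r_ez` coincide up to rounding.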
An ambiguity neutral DM does not ignore ambiguity, but is aware of the dispersed consequences implied by the different candidate models in the model set $\Pi_t$. However, ambiguity neutrality means that such a DM aggregates the probability distributions corresponding to the candidate models using $\mu_t$. The DM thus only considers unconditional probability distributions. To enable a comparison of the preference parameters of an ambiguity averse and an ambiguity neutral DM, we define the term effective risk aversion as the coefficient of relative risk aversion that yields the same certainty equivalent under the unconditional probability measure as that of an ambiguity averse DM who distinguishes between conditional and unconditional probability measures. This definition is motivated by Bonomo et al. (2011), who considered a similar notion for generalized disappointment aversion risk preferences, as defined by Routledge and Zin (2010).
How the preference parameters of the smooth ambiguity model translate into effective risk aversion depends on the relation between the variance of the conditional distributions (risk) and the variation in certainty equivalents corresponding to the conditional distributions (ambiguity). If, for example, the different candidate models yield very similar conditional probability distributions, effective risk aversion is close to the risk aversion coefficient γ, as the variance of the unconditional probability measure is largely driven by the variance within the conditional measures. If, however, the variances of all single conditional measures are very low, effective risk aversion is biased toward the ambiguity aversion coefficient η. Thus, the definition of effective risk aversion is not canonical but depends on the properties of $\Pi_t$ and may vary from one decision to the other.
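The two limiting cases described above can be checked with a small numerical sketch. It assumes a finite state space, power utility, and η > γ, and it finds effective risk aversion by bisection on the risk aversion coefficient applied to the prior-weighted mixture. The helper names and all inputs are hypothetical.

```python
import math

def power_ce(values, probs, coef):
    """Certainty equivalent under power utility with coefficient coef."""
    if abs(coef - 1.0) < 1e-12:  # log-utility limit
        return math.exp(sum(p * math.log(x) for x, p in zip(values, probs)))
    moment = sum(p * x ** (1.0 - coef) for x, p in zip(values, probs))
    return moment ** (1.0 / (1.0 - coef))

def effective_risk_aversion(values, model_probs, prior, gamma, eta, iterations=100):
    """Risk aversion coefficient that reproduces the SA certainty equivalent
    under the unconditional (mixture) distribution. Assumes gamma < eta, so
    that the solution lies in [gamma, eta]."""
    conditional_ce = [power_ce(values, probs, gamma) for probs in model_probs]
    target = power_ce(conditional_ce, prior, eta)
    mixture = [sum(w * p[k] for w, p in zip(prior, model_probs))
               for k in range(len(values))]
    lo, hi = gamma, eta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The certainty equivalent falls as the coefficient rises.
        if power_ce(values, mixture, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

values = [0.9, 1.1]
prior = [0.5, 0.5]
identical_models = [[0.5, 0.5], [0.5, 0.5]]   # no ambiguity, only risk
degenerate_models = [[1.0, 0.0], [0.0, 1.0]]  # no risk within a model, only ambiguity
mixed_models = [[0.7, 0.3], [0.3, 0.7]]

era_identical = effective_risk_aversion(values, identical_models, prior, 5.0, 20.0)
era_degenerate = effective_risk_aversion(values, degenerate_models, prior, 5.0, 20.0)
era_mixed = effective_risk_aversion(values, mixed_models, prior, 5.0, 20.0)
print(era_identical, era_degenerate, era_mixed)
```

In this toy setting, identical models return a value near γ, degenerate models return a value near η, and the intermediate case lies strictly between the two.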

The Pricing Kernel
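The pricing kernel ξ links preferences to asset returns via the relation
$$E_t\left[ \xi_{t,t+1} R^i_{t,t+1} \right] = 1, \tag{2}$$
where $E_t$ is an abbreviation of $E_{\mu_t} E_{\pi_{t+1}}$ and $R^i_{t,t+1}$ denotes the gross return on money invested at time t for one period in an arbitrary asset i. In complete markets, $(\xi_{t,t+1})_{t \in \mathbb{N}}$ is a unique series of random variables. It can be expressed in terms of continuation values with the help of the value function given in Equation (1). Following Duffie and Skiadas (1994) and Hansen et al. (2007), it satisfies
$$\xi_{t,t+1} = e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho} \left( \frac{V_{t+1}}{\mathcal{R}_t(V_{t+1})} \right)^{\rho-\gamma} \left( \frac{\left( E_{\pi_{t+1}}\left[ V_{t+1}^{1-\gamma} \right] \right)^{\frac{1}{1-\gamma}}}{\mathcal{R}_t(V_{t+1})} \right)^{-(\eta-\gamma)}, \tag{3}$$
as reported in Hayashi and Miao (2011), Proposition 8. The first three terms are the EZ pricing kernel, which collapses to the CRRA pricing kernel for γ = ρ. The last term displays the impact of the DM's ambiguity attitude on asset prices. Its numerator is the conditional certainty equivalent of the continuation value as described in Section 2.1. Hence, the DM considers the conditional certainty equivalent corresponding to a certain economic model $\pi_{t+1}$ relative to the unconditional certainty equivalent. Depending on her ambiguity attitude, she puts more (if ambiguity averse, that is, γ < η) or less (if ambiguity loving, that is, γ > η) weight on economic models that yield a low expected utility. The continuation value is unobservable and in applications it is usually more convenient to work with the pricing kernel in terms of the return on wealth. We define $\theta_1 := \frac{1-\gamma}{1-\rho}$ and $\theta_2 := \frac{1-\eta}{1-\gamma}$. The pricing kernel then is
$$\xi_{t,t+1} = \left( e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho} \right)^{\theta_1} \left( R^w_{t,t+1} \right)^{\theta_1 - 1} \left( E_{\pi_{t+1}}\left[ \left( e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho} R^w_{t,t+1} \right)^{\theta_1} \right] \right)^{\theta_2 - 1}, \tag{4}$$
where $R^w$ denotes the return on the wealth portfolio, that is, the claim on aggregate consumption. The parameter $\theta_2$ expresses the concavity of φ and therefore the ambiguity attitude of the DM. Hence, the additional last term in the pricing kernel (compared to the EZ pricing kernel) carries the impact of the DM's ambiguity attitude on asset prices. We decompose the pricing kernel into three parts, $\xi_{t,t+1} = \xi^{CRRA}_{t,t+1} \times \xi^{EZ}_{t,t+1} \times \xi^{SA}_{t,t+1}$, with
$$\xi^{CRRA}_{t,t+1} = e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho}, \quad \xi^{EZ}_{t,t+1} = \left( e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho} R^w_{t,t+1} \right)^{\theta_1 - 1}, \quad \xi^{SA}_{t,t+1} = \left( E_{\pi_{t+1}}\left[ \left( e^{-\delta} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho} R^w_{t,t+1} \right)^{\theta_1} \right] \right)^{\theta_2 - 1},$$
and consider a number of special cases. Assuming γ = η yields the ambiguity neutral Epstein-Zin pricing kernel, as $\xi^{SA}$ equals 1. Likewise, if ρ = γ, then $\xi^{EZ}$ equals 1 and $\xi_{t,t+1} = \xi^{CRRA}_{t,t+1} \times \xi^{SA}_{t,t+1}$. We refer to this case as pure ambiguity (PA) preferences. This term is justifiable as the impact of $\xi^{CRRA}$ is negligible in our applications. A pure ambiguity investor cares for covariation of model-implied expected returns with model-implied expectations of consumption growth and return on wealth. Consider the assumption ρ = γ = 0, which further simplifies the pricing kernel as $\xi^{CRRA}$ trivializes to the constant time discount factor $e^{-\delta}$. Equation (2) is then given by
$$E_t\left[ e^{-\delta} \left( E_{\pi_{t+1}}\left[ e^{-\delta} R^w_{t,t+1} \right] \right)^{-\eta} R^i_{t,t+1} \right] = 1.$$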
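The special cases can be verified with a short sketch. It assumes the three-part decomposition with ξ^CRRA = e^{-δ}(C_{t+1}/C_t)^{-ρ}, ξ^EZ = (ξ^CRRA · R^w)^{θ1 − 1}, and ξ^SA equal to the model-conditional expectation of (ξ^CRRA · R^w)^{θ1} raised to the power θ2 − 1. The conditional expectation is passed in as a number, since its estimation is a separate problem. The function name and all inputs are hypothetical, and the sketch is undefined for ρ = 1 or γ = 1.

```python
import math

def sa_kernel_parts(cons_growth, wealth_return, cond_exp_y, rho, delta, gamma, eta):
    """Return (xi_crra, xi_ez, xi_sa) for one period.

    cond_exp_y is the model-conditional expectation of
    (exp(-delta) * cons_growth**(-rho) * wealth_return)**theta1.
    """
    theta1 = (1.0 - gamma) / (1.0 - rho)
    theta2 = (1.0 - eta) / (1.0 - gamma)
    xi_crra = math.exp(-delta) * cons_growth ** (-rho)
    xi_ez = (xi_crra * wealth_return) ** (theta1 - 1.0)
    xi_sa = cond_exp_y ** (theta2 - 1.0)
    return xi_crra, xi_ez, xi_sa

# Hypothetical quarterly inputs.
growth, r_w, delta = 1.005, 1.015, 0.014

# Pure ambiguity with rho = gamma = 0: the CRRA part is the discount factor.
pa_parts = sa_kernel_parts(growth, r_w, math.exp(-delta) * 1.012, 0.0, delta, 0.0, 61.035)

# Ambiguity neutrality (gamma = eta): the SA part drops out.
ez_parts = sa_kernel_parts(growth, r_w, 0.98, 0.5, delta, 5.0, 5.0)

# rho = gamma: the EZ part drops out.
sep_parts = sa_kernel_parts(growth, r_w, 0.98, 2.0, delta, 2.0, 20.0)
print(pa_parts, ez_parts, sep_parts)
```

The full kernel is the product of the three returned parts, so each special case amounts to one factor being identically one.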

ESTIMATION TECHNIQUE
In this section, we introduce the econometric methodology to infer attitudes toward ambiguity from financial market data. We use GMM to estimate the preference parameters. From Equation (4), it is clear that two central components of the pricing kernel are the return on the wealth portfolio, which cannot be observed at the market, and the expected value, conditional on the economic model, that distinguishes SA from EZ preferences. We discuss a return proxy in Section 3.2 and provide an estimation technique for the conditional expectation in Section 3.3. Section 3.4 summarizes the results of a simulation study that investigates the finite sample properties of our estimation technique. Details on the simulation study can be found in the online supplementary material.

GMM Estimation
Euler equations link asset returns to consumption growth and the return on the wealth portfolio. The imposed population moment restrictions can be employed to test the fit of candidate pricing kernels. To weight the moment conditions, we use the identity matrix. Minimizing the sum of squared pricing errors makes the results comparable to asset pricing tests using ordinary least squares (OLS) cross-sectional regressions. In addition, the identity matrix is suitable for comparing SA preferences with the benchmarks of EZ and PA preferences, as it is invariant across all models tested. According to Altonji and Segal (1996), first-stage GMM estimates are also more robust in finite samples. Cochrane (2005, chap. 11) explained several additional advantages of using a prespecified weighting matrix.
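As an illustration of the first-stage objective, the sketch below computes sample Euler equation errors for a given series of pricing kernel realizations and a panel of gross returns, and then the identity-weighted GMM objective and the RMSE. It assumes unconditional moments without instruments; the function names and the data are hypothetical.

```python
import numpy as np

def euler_errors(kernel, returns):
    """Sample pricing errors: time average of xi_t * R_{i,t} minus one, per asset."""
    kernel = np.asarray(kernel, dtype=float)
    returns = np.asarray(returns, dtype=float)
    return (kernel[:, None] * returns).mean(axis=0) - 1.0

def first_stage_objective(kernel, returns):
    """GMM objective with the identity weighting matrix: sum of squared errors."""
    g = euler_errors(kernel, returns)
    return float(g @ g)

def rmse(kernel, returns):
    """Root mean squared Euler equation error across assets."""
    g = euler_errors(kernel, returns)
    return float(np.sqrt(np.mean(g ** 2)))

# Hypothetical data: three periods, two assets (rows are periods).
returns = np.array([[1.02, 1.10],
                    [1.01, 0.95],
                    [1.03, 1.08]])
# A kernel that prices the first asset exactly in sample.
kernel = 1.0 / returns[:, 0]

errors = euler_errors(kernel, returns)
objective = first_stage_objective(kernel, returns)
print(errors, objective, rmse(kernel, returns))
```

In an estimation, `kernel` would be recomputed for each candidate parameter vector and the objective minimized over the free parameters.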
The null hypothesis that all moment conditions are zero can be tested using Hansen's J-test. If we acknowledge that all models are misspecified, hypothesis tests of the null of correct model specification against the alternative of incorrect specification are of limited value. Following the idea that we are looking for the least misspecified model, we compare root mean squared errors (RMSE) and Hansen and Jagannathan (1997) distances (HJD) of different preference specifications and parameter vectors. We test whether these performance measures are zero using the methodology proposed by Jagannathan and Wang (1996) and Parker and Julliard (2005). We use a Wald test to evaluate the restriction ρ = γ, which corresponds to time separability in the EZ model and to pure ambiguity preferences in the SA model. We furthermore evaluate the restriction for ambiguity neutrality (γ = η). Details on the estimation procedure and on testing hypotheses are provided in the online supplementary material. Ferson and Foerster (1994), Hansen, Heaton, and Yaron (1996), Smith (1999), and Ahn and Gadarowski (2004) pointed out that commonly employed specification tests reject too often in finite samples. Thus, relying solely on these tests to evaluate the goodness of fit of candidate asset pricing models is problematic. Lewellen, Nagel, and Shanken (2010) showed that focusing too closely on high cross-sectional $R^2$ values and small pricing errors can be misleading and that it is important to evaluate whether the decision models produce plausible preference parameter estimates.
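For a single linear restriction such as γ = η, the Wald statistic reduces to the squared difference of the two estimates divided by the variance of that difference, and it is asymptotically chi-squared with one degree of freedom. The sketch below assumes that the variances and the covariance of the two estimates are available from the GMM covariance matrix; the function name and the numbers are hypothetical and are not the estimates of this article.

```python
import math

def wald_equality_test(est_a, est_b, var_a, var_b, cov_ab):
    """Wald test of H0: a = b for two estimated parameters.

    Returns the statistic and its p-value under a chi-squared(1) distribution,
    using P(chi2_1 > w) = erfc(sqrt(w / 2)).
    """
    diff = est_a - est_b
    var_diff = var_a + var_b - 2.0 * cov_ab
    stat = diff ** 2 / var_diff
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical estimates: eta = 20 (s.e. 9), gamma = 5 (s.e. 6), covariance 10.
stat, p_value = wald_equality_test(20.0, 5.0, 81.0, 36.0, 10.0)
print(stat, p_value)
```

Large standard errors inflate the denominator, which is one way to see why a sizable gap between the two point estimates can still fail to reject ambiguity neutrality.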
Allowing all parameters to be estimated freely focuses solely on model fit. Restricting certain preference parameters to economically reasonable values balances the objective between minimizing pricing errors and the plausibility of the parameter estimates. We follow Bansal, Gallant, and Tauchen (2007) and fix the EIS at economically reasonable values. There is considerable debate about the correct value of the EIS. Hall (1988), Campbell and Mankiw (1989), and Yogo (2004) found an EIS close to zero, while Vissing-Jørgensen and Attanasio (2003), Bansal and Yaron (2004), Guvenen (2006), and Chen, Favilukis, and Ludvigson (2013) argued for a higher value. Hansen, Heaton, and Li (2008) and Malloy, Moskowitz, and Vissing-Jørgensen (2009) set the EIS to one. This choice simplifies the analysis considerably. However, it implies that the wealth-consumption ratio is constant. Lettau and Ludvigson (2001) and Lustig, Van Nieuwerburgh, and Verdelhan (2012) showed that this contradicts empirical evidence. In the LRR model of Bansal and Yaron (2004), a drop in volatility and a rise in expected consumption growth increase the wealth-consumption ratio if the EIS is greater than one. Bansal, Khatchatrian, and Yaron (2005) supported the negative relation between volatility and asset prices and Lustig, Van Nieuwerburgh, and Verdelhan (2012) showed that the LRR model produces a wealth-consumption ratio that fits the data.
As suggested by Constantinides and Ghosh (2011), we report results for several values of the EIS. Prefixing the EIS is beneficial for several reasons. First, it is very difficult to estimate the EIS reliably. As our main object of interest is the DM's attitude toward ambiguity and not the magnitude of the EIS, setting it to economically reasonable values simplifies the estimation of the ambiguity parameter. Second, it facilitates the comparison of parameter estimates of the EZ and SA models, because it rules out that differences in the estimated EIS cause large changes in the estimated values of risk and ambiguity aversion. Furthermore, fixing the EIS at reasonable levels may provide valuable guidance on the magnitude of risk and ambiguity aversion for researchers in calibrating the SA model.

Return on Wealth
Testing candidate pricing kernels corresponding to recursive preference models presumes that either the continuation value of the future consumption plan in Equation (3) or the return on the wealth portfolio in Equation (4) is observable. The wealth portfolio is an asset that pays aggregate consumption as dividends. Although aggregate consumption is observable, neither the return on aggregate wealth nor the continuation value can be observed at the market. This causes severe problems for estimating consumption-based asset pricing models, as pointed out by Ludvigson (2012). Approximating the continuation value is discussed in Hansen, Heaton, and Li (2008), Ju and Miao (2012), and Chen, Favilukis, and Ludvigson (2013).
Approximating the return on wealth with a suitable function of observable variables is another alternative. Epstein and Zin (1991) approximated the return on aggregate wealth by the return on a broad stock market index. Among others, Stock and Wright (2000) and Yogo (2006) followed this approach. However, a stock market index is only a good approximation to the return on aggregate wealth if human capital and other nontradable assets are minor components of aggregate wealth. Critique of this approach goes back to Roll (1977). Lustig, Van Nieuwerburgh, and Verdelhan (2012) showed that human capital makes up the largest fraction of aggregate wealth. Campbell (1996) and Jagannathan and Wang (1996) included human capital. However, other components of wealth, such as total household asset holdings, should also be accounted for. We discuss an approach that incorporates all kinds of wealth by using the cay variable, defined by Lettau and Ludvigson (2001).
cay approximates innovations in the log consumption-wealth ratio. Lettau and Ludvigson (2001) assumed that asset holdings and human capital sum up to total wealth and that human wealth is approximately proportional to labor income. cay is defined as
$$cay_t = c_t - \omega a_t - (1 - \omega) y_t,$$
where c denotes log consumption, a log asset holdings, and y log aggregate labor income. The Appendix contains precise specifications of the variables used. The variable ω is the relative share of asset holdings in total wealth, which is assumed to be constant over time. Lettau and Ludvigson (2001) proposed that c, a, and y are cointegrated and estimated the coefficients ω and 1 − ω using OLS. Since the estimates do not sum exactly to 1, we proceed as Zhang (2006) and divide the estimate of ω by the sum of both estimates.
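Let $W_t$ denote aggregate wealth at time t. Using the budget constraint $W_{t+1} = (W_t - C_t) R^w_{t,t+1}$, the return on wealth is given by
$$R^w_{t,t+1} = \frac{W_{t+1}}{W_t - C_t}.$$
Labor income does not enter the budget constraint explicitly, but implicitly through the assumption that human capital is a part of aggregate wealth. We assume that $C_t / W_t = \kappa \cdot \exp(cay_t)$, that is, the consumption-wealth ratio fluctuates around its steady state value κ. An approximation to the return on wealth is
$$R^{cay}_{t,t+1} = \frac{C_{t+1}}{C_t} \cdot \frac{\exp(cay_t - cay_{t+1})}{1 - \kappa \exp(cay_t)}.$$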
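The construction of the proxy can be sketched in a few lines. The sketch assumes the budget constraint W_{t+1} = (W_t − C_t) R^w_{t,t+1} and the assumption C_t / W_t = κ · exp(cay_t), which together imply R^cay = (C_{t+1}/C_t) · exp(cay_t − cay_{t+1}) / (1 − κ · exp(cay_t)). Function names, coefficient values, and data points are hypothetical.

```python
import math

def normalized_omega(omega_hat, one_minus_omega_hat):
    """Rescale the estimated asset share so that the two coefficients sum to one."""
    return omega_hat / (omega_hat + one_minus_omega_hat)

def cay(log_c, log_a, log_y, omega):
    """cay_t = c_t - omega * a_t - (1 - omega) * y_t, all variables in logs."""
    return log_c - omega * log_a - (1.0 - omega) * log_y

def wealth_return_proxy(cons_growth, cay_t, cay_next, kappa):
    """Gross return on wealth implied by C_t / W_t = kappa * exp(cay_t)."""
    return cons_growth * math.exp(cay_t - cay_next) / (1.0 - kappa * math.exp(cay_t))

# Hypothetical cointegration estimates that do not sum to one.
omega = normalized_omega(0.30, 0.60)

# Hypothetical quarterly observations.
c_t, c_next = 100.0, 101.0
cay_t, cay_next, kappa = 0.010, -0.005, 0.01

proxy = wealth_return_proxy(c_next / c_t, cay_t, cay_next, kappa)

# Cross-check against wealth levels backed out from the consumption-wealth ratio.
w_t = c_t / (kappa * math.exp(cay_t))
w_next = c_next / (kappa * math.exp(cay_next))
direct = w_next / (w_t - c_t)
print(omega, proxy, direct)
```

The cross-check makes explicit that κ only scales the wealth levels, while the return is driven by consumption growth and the change in cay.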
We set κ in line with the values given in Lettau and Ludvigson (2001), which yields an average consumption-wealth ratio of about 1/25 in annual terms. The constant κ is of minor relevance for the estimation, since it is the timing of innovations to the consumption-wealth ratio rather than its level that matters. Setting the average ratio to 1/83, as reported by Lustig, Van Nieuwerburgh, and Verdelhan (2012), yields parameter estimates that are virtually the same. Table S1 in the online supplementary material contains summary statistics of the proxy $r^{cay} = \log R^{cay}$. Its mean is about 1.5% per quarter and thus slightly lower than the average stock market return. Moreover, $r^{cay}$ has statistical properties similar to those of the return on wealth in Chen, Favilukis, and Ludvigson (2013). They found that the return on aggregate wealth is less volatile than the return on the Center for Research in Security Prices (CRSP) stock market index and that the correlation between the two is rather low. In our sample, the standard deviation is less than one tenth of the standard deviation of the CRSP stock market index and the correlation between the two return series is 0.51.

Estimation of the Conditional Expectation
Using Euler equations to estimate the SA model requires an empirical estimate of the conditional expectation $E_{\pi_{t+1}}[Y_{t+1}]$, where we define
$$Y_{t+1} := e^{-\delta \theta_1} \left( \frac{C_{t+1}}{C_t} \right)^{-\rho \theta_1} \left( R^w_{t,t+1} \right)^{\theta_1}.$$
We assume that the conditional expectation of $Y_{t+1}$ is a function of a standardized vector $X_{t+1}$ of time t + 1 regressors. For the approach to be valid, the economic model needs to be explicable by these regressors. More precisely, there has to be a bijective relation between the set of economic models and the image of X, that is, all possible realizations of the regressor variables. Which variables are suited to identify the economic model? The risk-free rate and the log price-dividend ratio are observable, show a clear business cycle pattern, and have a long tradition as predictors of stock and bond returns. Cochrane (2005, chap. 20) provided a detailed review of the literature. Furthermore, in standard affine asset pricing models, as, for example, those of Bansal and Yaron (2004) and Drechsler and Yaron (2011), these variables are approximately affine in the state vector that pins down the distribution of consumption and dividend growth. Among others, Constantinides and Ghosh (2011) and Bansal, Kiku, and Yaron (2012) exploited this relation to estimate LRR models. In the models they considered, these two variables span the state space. This also holds if the representative investor is ambiguous about the distribution of consumption and dividend growth. They inverted the expressions for the log price-dividend ratio and the risk-free rate to express the state variables in terms of observables. Because of their economic relevance and motivated by the relation in affine asset pricing models, we use these two quantities and their first lags as predictor variables.
We run a locally linear regression
$$\log Y_{t+1} = m(X_{t+1}) + \sigma(X_{t+1})\, \varepsilon_{t+1},$$
where σ is the conditional volatility and ε denotes an iid zero mean, unit variance disturbance term, and interpret the fitted value $m(X_{t+1})$ as the conditional expectation. Using logs corresponds to minimizing the squared relative error, which is reasonable since the conditional expectation is multiplied by other terms in the Euler equations. Moreover, it guarantees that the predicted $Y_{t+1}$ as a part of the SA pricing kernel is positive. We allow for nonlinearities in m and approximate the conditional expectation locally by a linear function. Nagel and Singleton (2011) used a similar approach to estimate conditional moments. Fan (1992) and Ruppert and Wand (1994) showed that the local linear estimator has several advantages compared to other nonparametric estimators. For each i ∈ {0, . . . , T − 1}, an estimate of the conditional expectation at the data point $X_{i+1}$ is $\hat{m}(X_{i+1}) = \hat{a}_i$, where
$$(\hat{a}_i, \hat{b}_i) = \arg\min_{a,\, b} \sum_{t=0}^{T-1} w(X_{t+1}, X_{i+1}) \left( \log Y_{t+1} - a - b^{\top} (X_{t+1} - X_{i+1}) \right)^2.$$
The weighting function w assigns more weight to observations close to the current data point and less weight to observations farther away. If l denotes the number of explanatory variables, that is, the length of the vector $X_{t+1}$ for any t ∈ {0, . . . , T − 1}, the weighting function is defined as
$$w(X_{t+1}, X_{i+1}) = \prod_{j=1}^{l} K\left( \frac{X_{t+1,j} - X_{i+1,j}}{h_j} \right).$$
The weighting depends on the specification of the kernel function K, which assigns local weights to the linear estimator. The vector of bandwidths $h = (h_1, \ldots, h_l)$ controls the neighborhood of the current point. Like Nagel and Singleton (2011), we use the Epanechnikov kernel $K(u) = \frac{3}{4}(1 - u^2)\mathbf{1}_{\{|u| \le 1\}}$, which minimizes the mean squared deviation of the corresponding kernel density estimator. We also employ other kernel functions (not reported) and find that the choice of the kernel does not alter the results. We allow for an individual bandwidth for each regressor. A large bandwidth $h_i$ results in a smooth estimate that might neglect important features of the data contained in the ith regressor, while for a small bandwidth the estimate follows the data very closely. The optimal vector of bandwidths h is chosen by minimizing the cross-validation criterion
$$CV(h) = \sum_{i=0}^{T-1} \left( \log Y_{i+1} - \hat{m}_{-(i+1)}(X_{i+1}) \right)^2,$$
where $\hat{m}_{-(i+1)}$ denotes the estimate computed excluding the (i + 1)th data point. In our applications, the bandwidths range from 2.8 to 6.6. They are larger for the contemporaneous regressors than for the first lags and increase in ρ.
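The estimator and the bandwidth criterion can be sketched as follows. This is a minimal illustration, assuming a product Epanechnikov kernel with one bandwidth per regressor, a weighted least squares fit centered at the evaluation point, and a leave-one-out criterion in mean squared form; it is not the implementation used for the reported results. The function names and the simulated data are hypothetical, and `z` stands for the dependent variable (log Y in the application).

```python
import numpy as np

def epanechnikov(u):
    """K(u) = 0.75 * (1 - u^2) on |u| <= 1 and zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def local_linear_fit(x0, X, z, h, exclude=None):
    """Local linear estimate of E[z | X = x0] with product-kernel weights.

    X has shape (T, l), z has shape (T,), h has shape (l,). If exclude is an
    index, that observation gets zero weight (used for leave-one-out fits).
    """
    w = np.prod(epanechnikov((X - x0) / h), axis=1)
    if exclude is not None:
        w[exclude] = 0.0
    design = np.column_stack([np.ones(len(X)), X - x0])
    sw = np.sqrt(w)
    # Weighted least squares; the intercept is the fitted value at x0.
    coef, *_ = np.linalg.lstsq(design * sw[:, None], z * sw, rcond=None)
    return coef[0]

def loo_cv(X, z, h):
    """Leave-one-out cross-validation criterion (mean squared prediction error)."""
    preds = np.array([local_linear_fit(X[i], X, z, h, exclude=i)
                      for i in range(len(X))])
    return float(np.mean((z - preds) ** 2))

# Hypothetical standardized regressors and an exactly linear target.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 2))
z = 1.0 + 2.0 * X[:, 0] - X[:, 1]
h = np.array([3.0, 3.0])

fitted = local_linear_fit(X[0], X, z, h)
cv_value = loo_cv(X, z, h)
print(fitted, z[0], cv_value)
```

In practice the bandwidth vector would be chosen by minimizing `loo_cv` over a grid or with a numerical optimizer, and the positive prediction for Y would be recovered by exponentiating the fitted log value.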

A Simulation Study
We estimate the SA model based on simulated consumption and return data to investigate the performance of our estimation technique in finite samples. To simulate data, we rely on a long-run risks model similar to that of Bansal and Yaron (2004), in which the distribution of consumption and dividend growth depends on two state variables: trend consumption growth and economic uncertainty. Both are modeled as mean-reverting processes. We assume that the representative investor perceives innovations in the state variables as ambiguous. We simulate return data given that the investor is ambiguity averse with $\Theta_{SA}$ = (ρ = 0.667, δ = 0.0033, γ = 5, η = 20) and use the parameterization of Bansal, Kiku, and Yaron (2012) for all other parameters. As discussed in Section 3.1, we fix ρ at 0.667 and estimate the remaining preference parameters. Details on the model and its solution, as well as an extensive discussion of the results of the simulation study, can be found in the online supplementary material.
We find that the point estimates of the preference parameters are very close to the imposed values, although the standard errors are large for η and especially for γ. This indicates that it is rather difficult to estimate the risk aversion and ambiguity parameters jointly in small samples. The large standard error of the risk aversion coefficient implies that ambiguity neutrality (γ = η) is difficult to reject. Even though the model is true, the median p-value of the Wald test is above 10%. We also run simulations with larger samples (100, 200, 500, and 1000 years). The Wald tests reject ambiguity neutrality for sample sizes of 100 years and above.
In addition, we investigate the bias if the parameters are estimated assuming EZ or PA preferences but the data were generated given the SA model. In case of EZ preferences, the median risk aversion estimate is 14.29, which can be interpreted as the effective risk aversion of the ambiguity averse investor with $\Theta_{SA}$. It is clearly above the assumed risk aversion coefficient of 5. The p-values of the specification test based on the HJD and the J-test are slightly below 5%. The RMSE is close to the one obtained when the full model is estimated. These results show that even if ambiguity aversion is present, it is rather difficult to discriminate between SA and EZ models solely based on their pricing errors in small samples. The same is true for the PA model, which is not rejected by any of the specification tests although it is misspecified.
Concerning the finite sample properties of the specification tests, we are able to draw several conclusions. We observe that in our simulation study, the rejection rates of the J-test are far too large for samples smaller than 500 years. This confirms that the test rejects too often for sample sizes typically used in empirical tests of consumption-based asset pricing models. Similar to Ahn and Gadarowski (2004), we find that the specification test based on the Hansen and Jagannathan (1997) distance also performs poorly in small samples. In line with Parker and Julliard (2005), we find that the test based on the RMSE performs better in finite samples.

EMPIRICAL EVIDENCE
In this section, we estimate the preference parameters based on consumption and stock market data. Furthermore, we analyze the evolution of the estimated pricing kernels and investigate the in-sample and out-of-sample pricing performances of the alternative preference models. The sample period is from the first quarter of 1952 to the third quarter of 2011. $R^{cay}$ is used as a proxy for the return on wealth. The set of test assets includes the 3-month Treasury bill, the CRSP value weighted stock market index, 10 portfolios formed on size, 10 book-to-market value sorted portfolios, and 10 industry portfolios. We use alternative sets of test assets to explore the sensitivity of the results in Section 4.2. The data are described in the Appendix. Table S1 in the online supplementary material contains descriptive statistics of the variables used in the estimation.

Parameter Estimates
The estimated parameters and their standard errors are shown in Table 1. All results are reported for two values of the EIS, 1.5 and 2. These values are typically used in asset pricing studies with recursive preferences. As outlined in Section 3.1, restricting the value of ρ has several advantages. To verify that the assumed values are in line with empirical evidence, we estimate the models treating ρ as a free parameter. We obtain point estimates close to the values above (not reported). However, standard errors are very large, as the objective function is very flat in ρ. The problem of getting precise estimates of the EIS has also been reported by Bansal, Gallant, and Tauchen (2007) and Constantinides and Ghosh (2011), among others. Moreover, we study the case ρ = 0, which is interesting from a theoretical point of view as γ = ρ yields risk neutrality.
Table 1 note: SA refers to smooth ambiguity preferences, EZ to Epstein and Zin (1989) preferences, and PA to pure ambiguity preferences. HAC standard errors are in parentheses. The RMSE is the square root of the mean squared Euler equation error. HJD denotes the Hansen and Jagannathan (1997) distance. The table also reports the cross-sectional $R^2$, the Wald tests for the hypotheses γ = ρ and η = γ, and the J-test for overidentifying restrictions (p-values in parentheses). Details on the tests are provided in the online supplementary material.

For the SA model, the estimated parameter vectors are (0.000, 0.014, 7.016, 61.035), (0.500, 0.012, 2.295, 35.091), and (0.667, 0.011, 0.819, 24.817). The subjective time discount rate is approximately 5% per annum and estimated with great precision. The risk aversion coefficients are estimated within the range considered plausible by Mehra and Prescott (1985), however, with large standard errors. The point estimates of the ambiguity parameter η are considerably larger than the risk aversion estimates. As in our simulation study, the large standard errors of the risk aversion and ambiguity parameters show that these parameters are hard to estimate precisely. An economic interpretation of why it is difficult to identify the two parameters separately is that once the agent knows the economic model, the remaining uncertainty about the distribution of returns might be of minor relevance. If returns are well described by the economic model, ambiguity accounts for a major part of the overall uncertainty and the impact of risk aversion on asset prices is relatively small.

The ambiguity parameter increases in the value of the EIS. For the intuition of this result, consider the long-run risks model with ambiguity about the state variables, in which the market prices of the long-run risk factors are proportional to $\theta_1 \theta_2 - 1$ (see the structural equations provided in the online supplementary material). When estimating the preference parameters, returns are given exogenously. Consequently, the representative investor's preferences just influence the pricing kernel and thus the market prices of risk. To match the equity premium in the data, an increase in ρ is compensated by a lower value of η, as long as ρ < 1. SA preferences price the cross-section of expected returns rather well and an RMSE of zero cannot be rejected. The other two specification tests, the J-test and the test based on the HJD, reject the model. However, we have seen in Section 3.4 that the finite sample properties of these two tests are rather poor.
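As a quick plausibility check on the statement that the subjective time discount rate is approximately 5% per annum, the following minimal sketch compounds a quarterly rate over four quarters. It assumes that the second entry of each reported SA parameter vector is the quarterly discount rate (an interpretation inferred from the text, not stated explicitly), and the function name `annualize_quarterly_rate` is hypothetical.

```python
def annualize_quarterly_rate(rate_q: float) -> float:
    """Compound a quarterly rate over four quarters to get an annual rate."""
    return (1.0 + rate_q) ** 4 - 1.0


# Second entries of the three SA parameter vectors quoted in the text.
# ASSUMPTION: these are quarterly subjective time discount rates.
delta_q = [0.014, 0.012, 0.011]
delta_annual = [annualize_quarterly_rate(d) for d in delta_q]

for d_q, d_a in zip(delta_q, delta_annual):
    print(f"quarterly {d_q:.3f} -> annual {d_a:.4f}")
```

Under this assumption, the annualized values lie between roughly 4.5% and 5.7%, which is in line with the figure quoted in the text.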
Assuming EZ preferences, that is, estimating the parameters given η = γ, yields parameter vectors of (0.000, 0.013, 55.104), (0.500, 0.011, 31.075), and (0.667, 0.011, 21.843). The point estimates of relative risk aversion are clearly above the values considered plausible by Mehra and Prescott (1985). This indicates that these estimates may correspond to the effective risk aversion of investors and that we estimate a misspecified model, which counterfactually neglects ambiguity and the investor's ambiguity aversion. Malloy, Moskowitz, and Vissing-Jørgensen (2009) argued that the EIS has little impact on the risk aversion estimate if the estimation is solely based on the cross-sectional variation in returns. In contrast to their study, we force the models to also match the equity premium. We observe that higher values of the EIS lead to larger estimates of the risk aversion parameter. As above, this may be explained by the standard long-run risks model, in which the market prices of long-run risks are proportional to $\theta_1 - 1$.
Estimating preference parameters under the assumption of PA preferences, that is, imposing ρ = γ, yields point estimates of δ and η that are similar to those obtained under SA preferences. The pricing performance with respect to cross-sectional $R^2$ and RMSE is equal to that of the SA model. This was expected, as the hypothesis ρ = γ is far from being rejected in the SA model due to low point estimates in combination with large standard errors of γ. Note that ρ = γ does not imply time separability in the case of SA and PA preferences, as η is significantly different from ρ. Time separability is rejected in all three models.
We find the relation $\hat{\gamma}_{SA} < \hat{\gamma}_{EZ} < \hat{\eta}_{SA} < \hat{\eta}_{PA}$. In light of the results of our simulation study, $\hat{\gamma}_{EZ}$ might be interpreted as the effective risk aversion of ambiguity averse investors. The result that the ambiguity parameter is estimated above the risk aversion coefficient indicates the presence of ambiguity aversion in the cross-section of expected returns. The null hypothesis of ambiguity neutrality is not rejected by the Wald test. However, this has to be viewed in light of the finite sample evidence in the simulation study (see the online supplementary material), where the Wald test did not reject ambiguity neutrality even in the presence of ambiguity aversion.
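To illustrate how a Wald test of ambiguity neutrality (η = γ) works in principle, the following minimal sketch implements the textbook statistic for a single linear restriction. It is not the authors' implementation, which is documented in the online supplementary material. The point estimates are taken from the text (EIS = 2), whereas the variances and the covariance are hypothetical placeholders, as is the function name `wald_test_equal`.

```python
import math


def wald_test_equal(eta_hat, gamma_hat, var_eta, var_gamma, cov_eta_gamma):
    """Wald test of H0: eta = gamma (one linear restriction).

    Returns the statistic and its asymptotic chi-square(1) p-value.
    """
    diff = eta_hat - gamma_hat
    var_diff = var_eta + var_gamma - 2.0 * cov_eta_gamma
    if var_diff <= 0.0:
        raise ValueError("variance of the difference must be positive")
    stat = diff * diff / var_diff
    # With one degree of freedom: P(chi2_1 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Point estimates from the text (EIS = 2). The variances and the covariance
# are HYPOTHETICAL placeholders, not values from Table 1.
stat, p_value = wald_test_equal(
    eta_hat=35.091,
    gamma_hat=2.295,
    var_eta=20.0 ** 2,
    var_gamma=15.0 ** 2,
    cov_eta_gamma=0.0,
)
print(f"Wald statistic = {stat:.3f}, p-value = {p_value:.3f}")
```

The sketch makes the mechanism behind the non-rejection visible: even a large gap between the two point estimates produces a small statistic when the estimated variances are large.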

Sensitivity of the Parameter Estimates
To explore the sensitivity of the results with respect to the number of test assets, we also estimate the SA model using 10, 70, and 100 portfolios in addition to the 3-month Treasury bill and the return on the CRSP stock market index. The 10 portfolios contain 5 size-sorted and 5 value-sorted portfolios, while the 70 portfolios contain 10 size-sorted, 10 value-sorted, 30 industry-sorted portfolios, 10 portfolios sorted by long-term reversal, and 10 sorted by dividend yield. The 100 portfolios are double-sorted by size and book-to-market value. Estimation results are given in Panel 1 of Table 2. Details about the portfolios can be found in the Appendix.
For all estimations, we fix the EIS at 2. The point estimates of all parameters are not statistically different for the different sets of test portfolios. Point estimates of the ambiguity parameter range from 35 to 42. If 100 portfolios are used in the estimation, the Wald test rejects ambiguity neutrality. In this case, the estimated risk aversion coefficient is negative. Negative estimates of γ are also reported by other studies, for example, Hansen and Singleton (1996) and Parker and Julliard (2005). Neely, Roy, and Whiteman (2001) noted that point estimates of γ are quite sensitive to the choice of test assets.
We also report parameter estimates using the optimal weighting matrix in Panel 2 of Table 2. The point estimates of all three parameters are in line with the estimates reported in Panel 1. In particular, the estimates of the ambiguity parameter η are fairly robust and range from 29 to 41. The RMSEs are larger, indicating that the optimal weighting matrix puts less emphasis on accurately pricing the original test assets. Due to the greater precision of the parameter estimates, ambiguity neutrality is rejected. This result supports our main finding that investors are ambiguity averse and that this attitude affects asset returns.

Estimated Pricing Kernel
The upper panel of Figure 1 shows the pricing kernels of the three decision models, evaluated at the point estimates reported in Table 1: (0.500, 0.012, 2.295, 35.091) for the SA model, (0.500, 0.011, 31.075) for the EZ model, and (0.500, 0.012, 0.500, 35.302) for the PA model. The shaded areas represent NBER recessions. The estimated pricing kernels are always positive and thus satisfy the no arbitrage condition. Economic theory suggests that an investor evaluates payoffs more highly when economic conditions are bad, that is, during recessions. Figure 1 shows that the realized pricing kernels have a clear business cycle pattern. As consumption growth and the return on wealth are low during recessions, the realized pricing kernels are highest during these periods.
The realized pricing kernels show a similar behavior over time. In the estimation, we force the mean of the realized pricing kernel to match the inverse of the average real quarterly gross return on the risk-free asset, which is 1.0029 in our sample. Thus, the average pricing kernels are all close to one. Figure 1 shows that the peaks in the pricing kernel are more pronounced for the EZ model. In particular, during the recent financial crisis, the pricing kernel of the EZ model reached a value of 5.14, in contrast to only 1.82 for the PA model. An ambiguity averse investor pays relatively little attention to single extreme outcomes in consumption growth and the return on wealth. She rather cares about the expected utility conditional on the economic model at hand, or more precisely its certainty equivalent, which leads to less extreme values of the pricing kernel.

Figure 1. Realized pricing kernels of the three decision models in the upper panel and the pricing kernel decomposition of Equation (5) in the lower panel. SA refers to smooth ambiguity preferences, EZ to Epstein and Zin (1989) preferences, PA to pure ambiguity preferences. The EIS is set to 2. The shaded areas represent NBER recessions.
To provide deeper insights into how ambiguity distorts the pricing kernel, we consider the pricing kernel decomposition in Equation (5). If investors know the economic model, the conditional expectation is a constant and does not contain any additional information. If ambiguity is present and investors care about it, the question is whether the conditional expectation matters for the pricing of assets. To improve the fit to the cross-section of expected returns, $\xi_{SA}$ has to carry additional information compared with $\xi_{CRRA}$ and $\xi_{EZ}$. The sample correlation between $\xi_{CRRA}$ and $\xi_{SA}$ is about 0.25. If $\xi_{EZ}$ and $\xi_{SA}$ have a correlation close to one, the introduction of ambiguity is basically relabeling risk as ambiguity. For the realized pricing kernel of the SA model, the sample correlation between $\xi_{EZ}$ and $\xi_{SA}$ is 0.53. This shows that ambiguity matters for asset prices.

Table 3 reports cross-sectional relative pricing errors (in percent), which are the square roots of the mean squared differences between realized and predicted returns divided by the square roots of the mean squared returns. We use the sample covariance between the pricing kernel and the portfolio returns to calculate predicted returns (details are provided in the online supplementary material).
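The relative pricing error described above can be sketched in a few lines. The sketch assumes that predicted returns follow from the Euler equation E[mR] = 1, which implies E[R] = (1 − Cov(m, R)) / E[m], and that errors are measured on average net returns; the exact construction used by the authors is given in the online supplementary material and may differ. All function names and the toy data are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_cov(xs, ys):
    """Sample covariance (divides by the number of observations)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def predicted_gross_return(kernel, returns):
    """Expected gross return implied by E[m R] = 1."""
    return (1.0 - sample_cov(kernel, returns)) / mean(kernel)


def relative_pricing_error(realized, predicted):
    """Root mean squared pricing error scaled by the root mean squared return."""
    num = math.sqrt(mean([(r - p) ** 2 for r, p in zip(realized, predicted)]))
    den = math.sqrt(mean([r ** 2 for r in realized]))
    return num / den


# HYPOTHETICAL toy data: a realized pricing kernel and two portfolios'
# gross returns over four periods.
kernel = [1.08, 0.97, 0.92, 1.03]
portfolios = {
    "A": [0.98, 1.03, 1.06, 1.01],
    "B": [1.00, 1.02, 1.03, 1.01],
}

realized = [mean(r) - 1.0 for r in portfolios.values()]
predicted = [predicted_gross_return(kernel, r) - 1.0 for r in portfolios.values()]
rpe = relative_pricing_error(realized, predicted)
print(f"relative pricing error: {100.0 * rpe:.2f}%")
```

Dividing by the root mean squared return makes the error comparable across sets of portfolios with different average return levels.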

Pricing Performance
How can ambiguity help explain the equity premium and the cross-sectional variation in expected returns? The EZ model only accounts for the covariation of returns with consumption growth and the return on wealth. The pricing kernel of the SA model contains the additional term $\xi_{SA}$. Hence, it also accounts for the covariance between returns and the continuation value of the time t + 1 economic model. The PA model accounts exclusively for the latter. Consider a portfolio that has low returns whenever the economic model is unfavorable, that is, when it yields a low continuation value. Ambiguity averse investors command a premium for bearing this uncertainty (ambiguity premium). Compared to the EZ model, the expected return on such an asset is higher given SA preferences. Thus, the SA model may help explain the returns of portfolios that are highly exposed to $\xi_{SA}$. If consumption growth and the return on wealth already characterize the economic model rather well, that is, $\xi_{CRRA} \times \xi_{EZ}$ and $\xi_{SA}$ are highly correlated, the ambiguity premium can be replicated by amplifying the risk factors of the EZ model. This can be achieved by using a high value of relative risk aversion. In Section 4.3, we have seen that the correlation is 0.53. Thus, we expect the risk factors of the EZ model to replicate the ones of the SA model to some extent, but not entirely.
Table 3 note: Relative pricing errors (in percent) are computed by taking the square roots of the mean squared differences between realized and predicted returns divided by the square roots of the mean squared returns. The standard errors in parentheses are calculated by bootstrapping using 10,000 replications. SA refers to the smooth ambiguity model, EZ to Epstein and Zin (1989) preferences, and PA to pure ambiguity preferences. The construction of predicted returns is described in Section S1 of the online supplementary material. To evaluate the in-sample pricing performance of the models, we rely on 10 size sorted, 10 value sorted, and 10 industry portfolios. "All" contains 10 size, 10 book-to-market, and 10 industry sorted portfolios. These portfolios, together with the returns on a Treasury bill and the CRSP value weighted stock index, are used to estimate the parameters. To evaluate the out-of-sample pricing performance, we look at 10 portfolios formed on long-term reversal, 10 dividend yield sorted portfolios, 10 portfolios formed on earnings to price ratios, and 10 cash-flow to price sorted portfolios. The construction of the individual portfolios is described in the Appendix.

All three models perform similarly in matching the equity premium. In the data, it is 1.61% per quarter, while it amounts to 2.01% in the SA model, 2.05% in the EZ model, and 1.94% in the PA model. To quantify the contribution of ambiguity in the SA model, we decompose the model-implied equity premium into a risk premium and an ambiguity premium by using a log-linear approximation. Given ρ = 0.5, we find that the risk premium accounts for 31.19% of the equity premium, while the ambiguity premium makes up the remaining 68.81%. Table 3 shows that it is difficult to discriminate between the models based on their pricing performances. All models have difficulties in accurately pricing book-to-market and industry portfolios. The pricing errors of the 10 size sorted portfolios are considerably lower, with the ambiguity sensitive models slightly outperforming. Concerning the industry sorted portfolios, although the average pricing performance is similar across models, there are some noteworthy differences for the individual industries. The absolute pricing error of industry portfolio 1 (nondurables) falls from 0.76% in the EZ model to 0.43% in the PA model, while for industry portfolio 4 (energy) it increases from 0.20% to 0.54%. Consistent with the arguments above, the correlation between the return on portfolio 1 and $\xi_{SA}$ is relatively large in absolute terms, while it is low for portfolio 4. In line with this, the ambiguity premium contributes more to the total premium for portfolio 1 compared with portfolio 4. The SA model allows for a risk and an ambiguity premium and yields intermediate pricing errors of 0.52% for portfolio 1 and 0.46% for portfolio 4.

We also investigate the fit with respect to 10 long-term reversal portfolios, portfolios formed on dividend yield, and portfolios based on two corporate profitability measures, the earnings to price ratio and the cash-flow to price ratio. As these portfolios were not used in the estimation, pricing these assets constitutes a test of the out-of-sample performance of the preference models. Table 3 shows that the ambiguity sensitive models price the portfolios more accurately than the EZ model, in particular the long-term reversal and the dividend yield sorted portfolios. It is, however, not possible to distinguish the models based on their out-of-sample pricing performance. This corroborates our finding from Section 4.1 that the pricing performance of the EZ model is rather similar to that of the ambiguity sensitive models, that is, an ambiguity neutral investor with high (effective) risk aversion prices the cross-section of assets rather similarly to an ambiguity averse investor.
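To make the reported decomposition of the equity premium concrete, the following back-of-the-envelope sketch converts the percentage shares into premia per quarter. It assumes that the shares of 31.19% and 68.81% apply to the SA model-implied premium of 2.01% per quarter, which the text suggests but does not state explicitly; the variable names are hypothetical.

```python
# Figures quoted in the text.
premium_sa = 2.01         # model-implied equity premium, % per quarter
risk_share = 0.3119       # share attributed to the risk premium
ambiguity_share = 0.6881  # share attributed to the ambiguity premium

# ASSUMPTION: the shares refer to the 2.01% SA model-implied premium.
risk_premium = premium_sa * risk_share
ambiguity_premium = premium_sa * ambiguity_share

print(f"risk premium:      {risk_premium:.2f}% per quarter")
print(f"ambiguity premium: {ambiguity_premium:.2f}% per quarter")
```

Under this assumption, roughly 0.63 percentage points of the quarterly premium would compensate for risk and roughly 1.38 percentage points for ambiguity.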

CONCLUSION
Several recent studies show that ambiguity may have a significant impact on asset prices. However, there is little research investigating whether ambiguity aversion is actually present in the prices of traded assets and how consumption-based asset pricing models that account for ambiguity perform in explaining the cross-section of expected returns. To the best of our knowledge, this is the first study that estimates the SA model based on financial market data. Our point estimates of the ambiguity parameter are between 25 and 60, while relative risk aversion is clearly lower and within the range considered plausible by Mehra and Prescott (1985). This indicates that market participants are ambiguity averse.
We analyze whether the SA model is able to explain the cross-section of expected returns and whether it improves upon EZ and PA preferences. We find that ambiguity helps explain the cross-sectional variation in expected returns, while risk aversion plays a negligible role in the presence of ambiguity. However, solely based on pricing errors and commonly employed model specification tests, it is difficult to discriminate between the decision models. Our simulation study shows that even in an economy where ambiguity has a perceptible impact on asset prices, the pricing performances of ambiguity sensitive and ambiguity neutral decision models are similar. In the SA model, there is an additional priced factor that compensates for bearing model uncertainty. Thus, the total equity premium consists of a risk premium and an ambiguity premium. If ambiguity is neglected, matching the equity premium and the cross-section of expected returns requires a high level of relative risk aversion to make up for the missing ambiguity premium. The SA model can account for the patterns in expected stock returns with lower relative risk aversion and thus provides a more reasonable explanation of asset prices.

APPENDIX: DATA
Risk-free rate: We use the 3-month secondary market Treasury bill rate from the H.15 release of the Federal Reserve Board of Governors (http://www.federalreserve.gov/releases/h15/data.htm) as the risk-free rate.
Stock returns: All stock returns are taken from Kenneth French's homepage (http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html), including the CRSP value weighted stock return index, which we use as a proxy for the return on the stock market. As test assets, we employ the return on the 3-month Treasury bill, the CRSP value weighted stock return, and the returns on 30 additional equity portfolios. Among these, 10 value weighted portfolios are formed on size (market equity) at the end of each June using NYSE breakpoints, 10 value weighted portfolios are formed on BE/ME (book equity at the last fiscal year end of the prior calendar year divided by market equity at the end of December of the prior year) at the end of each June using NYSE breakpoints, and 10 are industry portfolios (the sectors are Consumer Nondurables, Consumer Durables, Manufacturing, Energy, Business Equipment, Telecommunication and Television, Retail, Healthcare, Utilities, and Other) also formed at the end of each June. In Sections 4.2 and 4.4, we also use the returns on 10 portfolios formed on long-term reversal, 10 dividend yield sorted portfolios, 10 portfolios formed on earnings to price ratios, and 10 cash-flow to price sorted portfolios. We moreover use 30 industry portfolios that are constructed similarly to the 10 industry portfolios considered above, as well as 100 style portfolios. The latter are the intersections of 10 portfolios formed on size and 10 portfolios formed on BE/ME (see the description above). For a detailed description of the return data, see the URL above.
Inflation: All returns are deflated using the seasonally adjusted Consumer Price Index (CPI). We obtain the CPI from the Bureau of Labor Statistics (http://www.bls.gov/cpi). Quarterly inflation is the growth rate of the CPI in the final month of the current quarter over the final month of the previous quarter.
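The inflation definition above can be sketched as follows. The sketch assumes that the monthly CPI series starts in the first month of a quarter and that a nominal gross return is deflated by dividing it by one plus quarterly inflation; the text states only that returns are deflated using the CPI. The function names and the CPI levels are hypothetical.

```python
def quarterly_inflation(cpi_monthly):
    """Growth rate of the CPI in the final month of each quarter over the
    final month of the previous quarter.

    ASSUMPTION: the series starts in the first month of a quarter.
    """
    if len(cpi_monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    quarter_end = cpi_monthly[2::3]  # levels in months 3, 6, 9, ...
    return [cur / prev - 1.0 for prev, cur in zip(quarter_end, quarter_end[1:])]


def deflate(nominal_gross_return, inflation):
    """Convert a nominal gross return into a real gross return."""
    return nominal_gross_return / (1.0 + inflation)


# HYPOTHETICAL monthly CPI levels covering three quarters.
cpi = [99.6, 99.8, 100.0, 100.3, 100.7, 101.0, 101.3, 101.7, 102.01]
inflation = quarterly_inflation(cpi)

# HYPOTHETICAL nominal gross returns for the second and third quarter.
nominal = [1.0302, 1.0100]
real = [deflate(r, pi) for r, pi in zip(nominal, inflation)]
print(inflation, real)
```

Note that the first quarter yields no inflation figure, because it has no preceding quarter-end CPI level in the sample.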
Consumption and return on wealth: We use the same definitions of consumption, labor income, asset holdings, and cay as in Lettau and Ludvigson (2001). The updated data are available on Martin Lettau's homepage (http://faculty.haas.berkeley.edu/lettau/data_cay.html). Lettau and Ludvigson (2001) defined aggregate consumption as expenditures on nondurables and services, excluding shoes and clothing. The quarterly data are seasonally adjusted at annual rates, in billions of chain-weighted dollars. Labor income is defined as wages and salaries plus transfer payments plus other labor income minus personal contributions for social insurance minus taxes. Asset holdings are household net worth in billions of current dollars. We refer to Lettau and Ludvigson (2001) for a more detailed description of the data.

SUPPLEMENTARY MATERIALS
The online supplementary material provides information on summary statistics, the estimation technique, as well as on the simulation study that are not contained in the article. Section 1 comprises a table with summary statistics of the data used in the article. Section 2 describes how we estimate the model parameters and conduct tests with GMM. In Section 3, we briefly review the long-run risks model and its solution. We then discuss the conditional expected value, a key feature of the smooth ambiguity model. Afterwards, we extensively discuss the finite sample properties of our estimation technique based on simulated data.