What Does the Volatility Risk Premium Say About Liquidity Provision and Demand for Hedging Tail Risk?

This paper provides a data-driven analysis of the volatility risk premium, using tools from high-frequency finance and Big Data analytics. We argue that the volatility risk premium, loosely defined as the difference between realized and implied volatility, can best be understood when viewed as a systematically priced bias. We first use ultra-high-frequency transaction data on SPDRs and a novel approach for estimating integrated volatility on the frequency domain to compute realized volatility. From that we subtract the daily VIX, our measure of implied volatility, to construct a time series of the volatility risk premium. To identify the factors behind the volatility risk premium as a priced bias we decompose it into magnitude and direction. We find compelling evidence that the magnitude of the deviation of the realized volatility from implied volatility represents supply and demand imbalances in the market for hedging tail risk. It is difficult to conclusively accept the hypothesis that the direction or sign of the volatility risk premium reflects expectations about future levels of volatility. However, evidence supports the hypothesis that the sign of the volatility risk premium is indicative of gains or losses on a delta-hedged portfolio.

Whether realized volatility is greater than or less than implied volatility is an empirical question, and one that has been studied over time (see Mixon, 2009). However, the literature contains conflicting evidence about implied volatility as an unbiased estimator of future realized volatility (e.g. Canina and Figlewski, 1993; Christensen and Prabhala, 1998). Theory suggests that implied volatility should be a biased estimate of future realized volatility since implied volatility includes the market price of risk; that is, implied volatility is the expected "actual" (or statistical) volatility plus a risk premium. In mathematical finance this is formalized in terms of a change of measure. The volatility risk premium is defined as the difference between the expected future volatility under the physical measure (ex-ante forecast of realized volatility) and the expected future volatility under the risk-neutral measure (implied volatility from option prices). Therefore, the existence of a non-zero volatility risk premium indicates that not only is implied volatility a biased estimator of future realized volatility but, furthermore, that the bias is systematically priced.
The volatility risk premium has been an active area of research in financial economics for some time now. Whereas existing studies typically begin with an asset pricing model or some framework of stochastic or time-varying volatility, we take a step back from the theoretical foundation of volatility in financial markets and perform a purely data-driven analysis of the volatility risk premium, leveraging the insights of Big Data analytics. That is, we start with a massive dataset of transaction-level prices: our sample includes over half a billion trades in SPDRs, the ETF that tracks the S&P 500 index, from 2006 to 2011. We use these data to estimate the realized volatility of the market using a robust methodology with minimal assumptions. Then, we compare this realized volatility to a measure of model-free implied volatility, daily over the same five-year period. Our proxy for the model-free implied volatility is the VIX volatility index, which is commonly used in other studies of the volatility risk premium. Since our objective is to quantify the volatility risk premium in a model-free, nonparametric manner, we compare the computed ex-post realized volatility to the contemporaneous level of VIX. As a result, we essentially return to the fundamental idea of the volatility risk premium as a bias: not a random bias, but a systematically priced bias. Through our data-driven analysis we seek to better understand the economic determinants of this bias.
Our study of the volatility risk premium contributes to a recent trend in financial research and risk analysis. In any quantitative field, there are two approaches to conducting research: model-based and data-driven. This dichotomy is perhaps more pronounced now than ever in the quantitative fields of financial economics (i.e. asset pricing, derivatives, and risk management). Traditionally, the researcher would construct a model based on theory and then use data to empirically verify or validate the model and justify the economic intuition it conveys. However, with the abundance of financial data being generated every day and the increasing popularity of "Big Data", along with data mining and machine learning techniques making their way into the financial engineer's toolbox, a new data-driven approach to research in these areas is gaining popularity. We view our study as an extension of this research philosophy to better understand how the market prices volatility.
In particular, our data-driven analysis highlights the role of intermediaries that provide liquidity to investors who seek to hedge their downside tail risk, that is, a supply and demand framework for pricing volatility in the market. We begin with a statistical analysis of the data, using large-scale data collected from several different sources. This introduces the additional challenge of having to collate data from multiple platforms, in different formats, often on differing time scales. Such challenges are commonplace in "Big Data" research and give rise to problems such as spurious correlation, time asynchronicity, and noise accumulation (see Fan, 2013; Fan et al., 2014). However, using tools from high-frequency finance and big data analytics, we are able to obtain a clean and more precise estimate of the true integrated volatility without any reliance on a model or parametric assumptions. Our consistently estimated realized volatility is then compared to a model-free measure of implied volatility, proxied by the VIX index, to get our time series of the volatility risk premium. We distinguish this ex-post formulation of the volatility risk premium from the formulations in other studies and note that our resulting time series is the daily realization of the volatility risk premium. We still interpret this as the market price of volatility risk but, consistent with our data-driven philosophy, frame our measure of the volatility risk premium as a systematically priced bias.
This bias should be negative on average since the realized volatility is expected to be less than the option implied volatility, the latter containing the risk premium.
The literature suggests that since traditional risk factors are unable to explain the negative volatility risk premium, there is some independent risk factor that is driving its dynamics (Carr and Wu, 2009). Despite the volatility risk premium being negative in expectation, or on average, we find that there is a large amount of variation in the volatility risk premium over time, including several instances with extreme positive spikes. Following our framework that the volatility risk premium is a systematically priced bias, our empirical design approaches the problem from the perspective of a deviation analysis, decomposing it into magnitude and direction. Although this may be basic relative to modern tools in statistical learning theory, such an analysis has never been performed to understand how the financial markets price risk. This allows us to use straightforward regression techniques to gain valuable insights into the magnitude (absolute value) and direction (sign) of the systematically priced bias associated with the volatility risk premium, and to our knowledge we are the first to do so. We control for micro-level noise in our specific methodology for computing realized volatility at high frequency. Examining the absolute deviation of the realized volatility from the implied volatility then mitigates the effects of the extreme swings in the volatility risk premium (what might be termed "outliers" in traditional finance research), producing more robust results. Additionally, the empirical specification using absolute deviation, or the magnitude of the volatility risk premium, highlights the potential "missing risk factor" by accounting for supply and demand imbalances in the market for hedging tail risk.
We find that the magnitude of the volatility risk premium, that is, the absolute value of the bias between the realized and implied volatility, reflects these supply and demand imbalances in the index option market where investors buy protection on downside tail risk. The demand effects are captured by a statistically significant relationship between open interest for put options on the S&P 500 and the magnitude of the volatility risk premium. The supply effects are captured by the TED spread and the credit spread. The former often appears as a proxy for liquidity (or illiquidity) in financial markets and is viewed as a general measure of financial instability. The latter, which is most significant during the Financial Crisis in our results, interestingly has the interpretation of dealers deleveraging and shrinking their balance sheets by selling off risky positions. This is consistent with the evidence provided by Adrian and Shin (2010).
As for the direction of the volatility risk premium, practitioners believe that the volatility risk premium's sign is indicative of future levels of realized volatility. When the volatility risk premium is negative, implied volatility is higher than realized volatility and market participants believe that volatility is likely to increase in the future. Conversely, when the volatility risk premium is positive, implied volatility is less than realized volatility and market participants believe that volatility is likely to decrease in the future. This is related to the contentious idea in the literature of implied volatility's ability to forecast future realized volatility and, more recently, the "Expectation Hypothesis" discussed in Aït-Sahalia et al. (2012). In fact, Aït-Sahalia et al. (2012) provide a way to test this hypothesis about the direction of the volatility risk premium.
An alternative explanation in the finance literature says that the sign of the volatility risk premium represents the gains or losses on market makers' delta-hedged positions. This was first proposed by Bakshi and Kapadia (2003) within the context of stochastic volatility and jump-diffusion models. Using our model-free, data-driven analysis, we are able to find much stronger evidence in favor of this explanation.
The remainder of this paper is structured as follows. The next section, Section 2, presents the methodology for computing realized volatility using ultra-high-frequency data. Section 3 discusses the economic basis for the volatility risk premium and why implied volatility can, and will, deviate from realized volatility. We also propose our testable hypotheses based on several stylized facts about the volatility risk premium and the market for hedging downside tail risk from the financial economics literature.
Section 4 details the data collection and methodology for constructing the volatility risk premium. Our empirical analysis and results are presented in Section 5. Section 6 concludes. We have two Appendices: Appendix A reviews different statistical methodologies for computing realized volatility with emphasis on the estimation of integrated volatility with high-frequency data; Appendix B presents simulations that demonstrate the benefit of using ultra-high-frequency data in estimating integrated volatility and the extent to which our method performs better than alternatives.

Realized Volatility and Frequency Domain Estimation of Integrated Volatility
In this section we discuss the frequency domain estimation of realized volatility from Olhede et al. (2009). While this does not represent a methodological innovation of our work, we argue that it is a powerful methodology for computing realized volatility under fairly general conditions with minimal parametric assumptions. It is therefore able to give cleaner estimates using noisy ultra-high-frequency data and to yield new insights into how volatility is priced in the market. More specifically, the procedure computes a consistent and unbiased estimator of integrated volatility at ultra-high frequencies under very general specifications of the microstructure noise process. The frequency domain methodology, and its statistical properties, are compared to some alternative methodologies for estimating volatility with high-frequency data in Appendix A.

The methodology
Without loss of generality but for the sake of formalization, we can start with a latent true (log) price $X_t$ that follows an Itô process

$$dX_t = \mu_t\,dt + \sigma_t\,dW_t, \tag{1}$$

where $W_t$ is a standard Brownian motion and $\mu_t$ and $\sigma_t$ are time-varying drift and volatility, respectively, that may or may not follow stochastic processes themselves. However, what we observe is the transaction price, or its logarithm, $Y_t$ at times $\{t_i\} \in [0, T]$,

$$Y_{t_i} = X_{t_i} + \varepsilon_{t_i}, \tag{2}$$

where $\varepsilon_{t_i}$ represents microstructure noise.
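The frequency domain methodology uses a discrete Fourier transform to go from the time domain, on which financial data is typically portrayed, to the frequency domain. For a generic time series $U_{t_j}$, $j = 1, \dots, N$, with increments $\Delta U_{t_j} = U_{t_j} - U_{t_{j-1}}$, the discrete Fourier transformation is

$$J_k^{(U)} = \frac{1}{\sqrt{N}} \sum_{j=1}^{N} \Delta U_{t_j}\, e^{-2\pi i jk/N}, \qquad k = 0, \dots, N-1. \tag{3}$$

Using this notation, the integrated volatility can be written in terms of the variance of $J^{(X)}$, denoted $\sigma_X^2 = E\big|J_k^{(X)}\big|^2$ and approximately constant across frequencies:

$$E\,\langle X, X \rangle_T = E\int_0^T \sigma_t^2\,dt \approx \sum_{k=0}^{N-1} E\big|J_k^{(X)}\big|^2 = N\sigma_X^2. \tag{4}$$

On the other hand, applying the transformation to the observable transaction price $Y_t$ at times $\{t_i\} \in [0, T]$ leads to a noise contribution at each frequency,

$$E\big|J_k^{(Y)}\big|^2 = \sigma_X^2 + a^2 \left|2\sin(\pi k/N)\right|^2, \tag{5}$$

where for now we assume $\varepsilon_t$ is a white noise process with variance $a^2$. It can be seen that the high-frequency coefficients are more heavily contaminated by the noise.
Ideally we would like to shrink $E\big|J_k^{(Y)}\big|^2$ toward $E\big|J_k^{(X)}\big|^2$ and therefore remove the noise locally at each frequency:

$$\widehat{\langle X, X \rangle}_T = \sum_{k=0}^{N-1} L_k \big|J_k^{(Y)}\big|^2, \qquad L_k = \frac{\sigma_X^2}{\sigma_X^2 + a^2 \left|2\sin(\pi k/N)\right|^2}. \tag{6}$$

In practice we still need to estimate the multiscale ratio $L_k$. One way is to use the Whittle log-likelihood to estimate the unknown quantities in Equation (6),

$$\ell(\sigma_X^2, a^2) = -\sum_{k} \left[ \log\!\left(\sigma_X^2 + a^2 \left|2\sin(\pi k/N)\right|^2\right) + \frac{\big|J_k^{(Y)}\big|^2}{\sigma_X^2 + a^2 \left|2\sin(\pi k/N)\right|^2} \right], \tag{7}$$

and it follows that the estimated ratio $\hat{L}_k$ is obtained by substituting the maximizers $\hat{\sigma}_X^2$ and $\hat{a}^2$ into Equation (6). It can be shown that the final de-biased estimator, $\sum_k \hat{L}_k \big|J_k^{(Y)}\big|^2$, is a consistent estimator of the integrated volatility.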
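The shrinkage idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes equally spaced observations, white noise, a Fourier transform of price increments normalized by the square root of N, and it swaps in a simple grid search over the noise-to-signal ratio r = a^2 / sigma_X^2 in place of a numerical optimizer for the Whittle likelihood. The function names and the simulated inputs are hypothetical.

```python
import cmath
import math
import random


def periodogram_of_increments(log_prices):
    """|J_k|^2 with J_k = N^{-1/2} * sum_j dY_j * exp(-2*pi*i*j*k/N), k = 0..N-1."""
    dy = [b - a for a, b in zip(log_prices, log_prices[1:])]
    n = len(dy)
    out = []
    for k in range(n):
        jk = sum(d * cmath.exp(-2j * math.pi * j * k / n)
                 for j, d in enumerate(dy, start=1))
        out.append(abs(jk) ** 2 / n)
    return out


def debiased_integrated_variance(log_prices):
    """Shrink each ordinate by L_k = 1 / (1 + r * w_k), w_k = |2 sin(pi k / N)|^2.

    Assumption: r is chosen on a grid by maximising the Whittle likelihood,
    with sigma_X^2 profiled out in closed form.
    """
    p = periodogram_of_increments(log_prices)
    n = len(p)
    w = [4.0 * math.sin(math.pi * k / n) ** 2 for k in range(n)]

    def profile_nll(r):
        # For fixed r, the Whittle MLE of sigma_X^2 is mean(P_k / (1 + r w_k)).
        s2x = sum(pk / (1.0 + r * wk) for pk, wk in zip(p, w)) / n
        return n * math.log(s2x) + sum(math.log(1.0 + r * wk) for wk in w)

    grid = [0.0] + [10.0 ** (e / 20.0) for e in range(-80, 81)]
    r_hat = min(grid, key=profile_nll)
    estimate = sum(pk / (1.0 + r_hat * wk) for pk, wk in zip(p, w))
    return estimate, r_hat


def simulate(n=400, iv=1e-4, noise_sd=5e-4, seed=7):
    """Hypothetical data: constant-volatility log price plus white noise."""
    rng = random.Random(seed)
    x, prices = 0.0, []
    for _ in range(n + 1):
        prices.append(x + rng.gauss(0.0, noise_sd))
        x += rng.gauss(0.0, math.sqrt(iv / n))
    return prices


prices = simulate()
naive_rv = sum((b - a) ** 2 for a, b in zip(prices, prices[1:]))
debiased_rv, r_hat = debiased_integrated_variance(prices)
print(naive_rv, debiased_rv, r_hat)
```

In this simulated setting the true integrated variance is 1e-4, and the naive sum of squared increments is inflated by roughly 2N times the noise variance, which is the contamination the shrinkage is meant to remove. The direct O(N^2) transform is used only to keep the sketch dependency-free; a Fast Fourier Transform would be the practical choice at second-by-second resolution.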

Autocorrelated Noise
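This frequency domain methodology also allows us to easily model autocorrelated noise and disentangle the noise effect at each frequency in the same way. If we assume that $\varepsilon_{t_j}$ is an autocorrelated stationary time series, it is convenient to model it as a moving average process of order $q$,

$$\varepsilon_{t_j} = \eta_{t_j} + \sum_{l=1}^{q} \theta_l\, \eta_{t_{j-l}},$$

where $\{\eta_{t_j}\}$ is a white noise process with variance $\sigma_\eta^2$. Writing $J_k^{(Y)}$ for the discrete Fourier transform of the observed price increments at frequency $k/N$ and $\sigma_X^2$ for the variance of the corresponding transform of the latent price, this MA($q$) specification leads to a new likelihood function

$$\ell(\sigma_X^2, \sigma_\eta^2, \theta) = -\sum_{k} \left[ \log S_k + \frac{\big|J_k^{(Y)}\big|^2}{S_k} \right], \qquad S_k = \sigma_X^2 + \sigma_\eta^2 \left|1 + \sum_{l=1}^{q} \theta_l e^{-2\pi i lk/N}\right|^2 \left|2\sin(\pi k/N)\right|^2.$$

Therefore the multiscale ratio is given by

$$L_k = \frac{\sigma_X^2}{S_k},$$

where the order of the moving average process, i.e. the appropriate number of lags in the autocorrelated noise, is determined through model selection using the corrected Akaike information criterion (AICC).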
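To make the role of the moving average terms concrete, the following sketch evaluates a multiscale ratio for MA(q) noise under the assumption that the noise enters the differenced series, so its spectrum is the MA(q) transfer function times |2 sin(pi k / N)|^2. The function name and all parameter values are hypothetical, and the coefficients are taken as given rather than estimated.

```python
import cmath
import math


def multiscale_ratio(n, sigma2_x, sigma2_eta, theta=()):
    """L_k = sigma_X^2 / (sigma_X^2 + noise spectrum at frequency k / n).

    theta holds the MA coefficients (theta_1, ..., theta_q); an empty tuple
    gives the white-noise case.
    """
    ratios = []
    for k in range(n):
        f = k / n
        # MA(q) transfer function 1 + sum_l theta_l * exp(-2*pi*i*f*l)
        ma = 1.0 + sum(t * cmath.exp(-2j * math.pi * f * (l + 1))
                       for l, t in enumerate(theta))
        noise = sigma2_eta * abs(ma) ** 2 * 4.0 * math.sin(math.pi * f) ** 2
        ratios.append(sigma2_x / (sigma2_x + noise))
    return ratios


white = multiscale_ratio(4, sigma2_x=1.0, sigma2_eta=1.0)
ma1 = multiscale_ratio(4, sigma2_x=1.0, sigma2_eta=1.0, theta=(1.0,))
print(white)
print(ma1)
```

With white noise the ratio is smallest at the highest frequency (index 2 when n = 4), whereas an MA(1) coefficient of 1 removes the noise contribution at that frequency entirely, so the lag structure changes which frequencies are shrunk most.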

Advantages
The frequency domain method for estimating integrated volatility has several desirable features, both in terms of its statistical properties and its practicality when applied to real financial data. From the financial modeling and data analysis point of view, working in the frequency domain provides an elegant way to address more general specifications of the microstructure noise process. It bears repeating that minimal parametric assumptions and a high degree of flexibility are among the features that make this methodology attractive for our purposes, considering the data-driven philosophy of our approach.
Many methods for estimating integrated volatility assume that the noise process, $\varepsilon_t$, is i.i.d. or uncorrelated. However, in practice this is an unreasonable assumption. Autocorrelated microstructure noise may be a more reasonable assumption, since large disturbances this second may be highly correlated with large disturbances last second, especially if there is a lot of noise in the market. This may give the impression that the market is more turbulent or volatile, when in fact the persistent volatility in our observed time series, $Y$, is coming from the microstructure noise. Thus, we need a clean way to extract the true volatility of the price process in the presence of microstructure noise at ultra-high frequencies. This is why we use the frequency domain estimation method in computing integrated volatility. In fact, we find that the data indicate that the latent noise process has, on average, lag-1 autocorrelation, and the time-varying order of autocorrelation ranges from 0 to 5.
Once we get our de-biased, clean measure of realized volatility in the market, we can compare it to a measure of implied volatility to get a time series of the volatility risk premium on which we perform our deviation analysis. The next section discusses these concepts (implied volatility and the volatility risk premium) and constructs our testable hypotheses.

Implied Volatility and the Volatility Risk Premium
The notion of implied volatility is well understood and widely used by options traders and financial engineers. In this section we briefly review the concept of implied volatility and discuss in greater detail the volatility risk premium, which is embedded in implied volatilities but not included in realized volatilities. Therefore, the difference between realized volatility and implied volatility gives us a measure of the volatility risk premium. We further view this deviation between realized and implied volatilities in an ex-post sense as a priced bias in the options markets.
While it is well known that the Black-Scholes-Merton option pricing model relies on unrealistic assumptions and therefore cannot reasonably price options, the model is still widely used by traders to infer the level of volatility associated with a particular option on a given asset (e.g. stocks, indexes, or currencies). The idea is that the implied volatility is the value of the volatility parameter at which the model price matches the observed market price of the option. Researchers including Rubinstein (1994), Dupire (1994), and Derman and Kani (1994) have, to much success, extended this idea of implied volatility to extract market information across entire classes of options on a given asset (i.e. different strikes and/or maturities; often referred to as the "volatility smile" or "volatility surface") and fit a deterministic function of asset price, strike price, and time to expiry.
More recently, the work of Britten-Jones and Neuberger (2000) and Jiang and Tian (2005) broke away from the reliance on models and derived and implemented a model-free implied volatility using only current option prices. It is along these lines that we would like to use implied volatility in our data-driven analysis.
The implied volatility is forward-looking and represents the market's expectation of volatility over the life of the option. Mathematically, it can be thought of as an expectation under the risk-neutral or pricing measure. The volatility as computed from the underlying asset price movement can be thought of as being generated under the physical or statistical measure. The difference between the two represents the market price of volatility risk, or what is referred to as the "volatility risk premium".
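In some instances it is easier to work with the squared volatility, which leads to the variance risk premium defined as follows:

$$VRP_t = E_t^{\mathbb{P}}\left[RV_{t,t+\tau}\right] - E_t^{\mathbb{Q}}\left[RV_{t,t+\tau}\right], \tag{15}$$

where $RV_{t,t+\tau}$ denotes the realized variance over the horizon $[t, t+\tau]$, and $\mathbb{P}$ and $\mathbb{Q}$ denote the physical and risk-neutral measures, respectively.
Taking the square root of each of the expectations in Equation (15) gives the volatility risk premium, which we will denote as a lowercase $vrp_t$. While we present all of our results in terms of the volatility risk premium, everything is robust with respect to the variance risk premium. The reason we chose to use the former is that the results are easier and more natural to interpret in terms of vol units.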
In many studies the first term in Equation (15) is computed as an ex-ante conditional expectation of the future realized volatility or variance given the current value through an econometric model. We intentionally choose not to do this, but rather use an ex-post measure of the realized volatility. The reason is that computing the ex-ante conditional expected realized volatility introduces statistical problems such as overfitting, model error, and possible misspecification bias. Instead we compare the ex-post realized volatility (averaged over a one-month period) to the model-free implied volatility (covering the same horizon) to get a realization of the volatility risk premium on each trading day over the sample period. In symbols, this is

$$vrp_t = RV_t - IV_t, \tag{16}$$

where $RV_t$ is the ex-post realized volatility averaged over the month beginning on day $t$ and $IV_t$ is the model-free implied volatility on day $t$. Therefore another distinction between the uppercase $VRP_t$ in Equation (15) and the lowercase $vrp_t$ in Equation (16) is that the latter is our bias representation of the volatility risk premium. Furthermore, once we substitute the level of the VIX volatility index for the risk-neutral expectation, it becomes the realization of the volatility risk premium, which is similar to the realized P&L on a volatility swap (see Demeterfi et al., 1999).
Recent studies have been able to establish some interesting empirical properties of the volatility/variance risk premium. Bollerslev et al. (2009) and Bollerslev et al. (2011) examine the predictability of the variance risk premium on stock market returns, noting that the predictive power is better than that of other financial and macroeconomic factors that are typically used in stock market return forecasting. Zhou (2011) studies the predictability of the variance risk premium across financial markets through equity returns, bond returns, and credit spreads, with the VRP predictability typically maximized at the one- to four-month horizon.
Using high-frequency index futures data, Wu (2011) considers the role that jumps play finding that both the jump arrival rate and the absolute value of the negative variance risk premium are proportional to the variance level. The interpretation, which has implications for our hypotheses below, is that when volatility is high the volatility risk premium is either very positive or very negative; that is the bias between realized and implied volatility increases when the level of uncertainty is heightened.
Thus, the volatility risk premium appears to be a priced risk factor in the capital markets and investors are willing to pay a premium to hedge their downside risk, especially when uncertainty is high. However, our knowledge is still very limited about the determinants of the volatility risk premium and we do not have sound empirical evidence documenting what exactly the volatility risk premium says about the mechanics of the market for pricing and hedging risk.
We construct our testable hypotheses based on several stylized facts from the recent literature in financial economics. First, the volatility risk premium represents option market makers' willingness to absorb inventories and provide liquidity (Gârleanu et al., 2009; Nagel, 2012). Also, investors are net buyers of index options (Gârleanu et al., 2009). To the extent that investors use index put options to hedge their downside tail risk, we should be able to use option market data to draw inferences about investors' demand for hedging downside tail risk and intermediaries' willingness to meet this demand (i.e. provide liquidity). The volatility risk premium can, therefore, naturally be interpreted as the compensation that option market makers receive for this intermediation and liquidity provision (supply) to meet hedging demand. Adrian and Shin (2010) find evidence of this interpretation in the expansion and contraction of financial intermediaries' balance sheets.
We begin our analysis by postulating that the direction (sign) contains different information than the magnitude (absolute value) of the vrp, further justifying our empirical strategy of a deviation analysis. By some accounts, traders view the direction of the volatility risk premium as indicative of nothing more than the market's expectation of future levels of volatility. Therefore, pulling out the sign leaves us with the absolute deviation of the realized volatility from implied volatility. Based on the aforementioned stylized facts about the volatility risk premium, we expect that the magnitude of the volatility risk premium reflects the supply and demand imbalances in the market for hedging downside tail risk. When demand for hedging downside tail risk increases, market makers will take the short side (sell volatility) but must be compensated appropriately. The price of volatility increases and implied volatility rises relative to realized levels. When demand for hedging downside tail risk decreases, there will be a selloff of volatility and market makers will take the other side, but only at a substantial discount. Implied volatility falls relative to realized levels. Therefore, the magnitude captures the extent to which market makers must be compensated to provide liquidity to the options markets, either as a premium or discount if intermediaries are selling volatility to meet hedging demand or buying it back in response to a reduction in hedging demand. Our hypotheses about the absolute deviation or the magnitude of the volatility risk premium follow from this logic.
Magnitude Hypotheses:
H1: The magnitude of the volatility risk premium reflects investors' demand for hedging tail risk.
H2: The magnitude of the volatility risk premium reflects the willingness of option market makers to absorb inventory and provide liquidity.
We should note that the hypotheses need not be mutually exclusive. Hypothesis H1 can be thought of as a demand-side effect and Hypothesis H2 can be thought of as a supply-side effect. We may, therefore, find that supply and demand forces work with or against each other to determine the magnitude of the vrp at a given time.
Direction Hypotheses:
H3: The sign of the volatility risk premium contains information about future levels of realized volatility relative to implied volatility.
H4: The sign of the volatility risk premium reflects the delta-hedged gains or losses for option market makers.
We perform our deviation analysis on the systematic bias between implied and realized volatility by decomposing the vrp in Equation (16) into magnitude and sign as follows:

$$vrp_t = \operatorname{sign}(vrp_t) \times |vrp_t|. \tag{17}$$

To see how the market prices this systematic bias we then conduct several statistical and econometric tests on each of the components of vrp in Equation (17) within the context of the hypotheses H1-H4.
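The decomposition into direction and magnitude can be sketched in a few lines. The function name and the sample values below are hypothetical and serve only to show that the two components multiply back to the original series.

```python
def decompose(vrp):
    """Split a realized vrp value into (sign, magnitude)."""
    sign = (vrp > 0) - (vrp < 0)  # +1, 0 or -1
    return sign, abs(vrp)


# Hypothetical daily vrp realizations in vol points (realized minus implied).
vrp_series = [-3.5, -1.25, 0.0, 12.0]
components = [decompose(v) for v in vrp_series]
rebuilt = [s * m for s, m in components]
print(components)
```

Keeping the two components as separate series makes it possible to regress the magnitude on supply and demand proxies while treating the sign as a binary outcome.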
In the next section we discuss our data in detail: the collection, cleaning, and processing; as well as the construction of our volatility risk premium time series and the other financial market and economic variables that are used in our econometric analysis of the volatility risk premium.

Data Collection
The data used for our empirical analysis came from several different sources, on multiple platforms, and were analyzed using a variety of software packages. This is a common feature of Big Data analytics and requires careful processing and collating to ensure that the data are in consistent formats, with large-scale computations often being done in parallel (see Fan et al., 2014). First, we began by cleaning and processing the ultra-high-frequency transaction data for the SPDR ETF. Then we used the cleaned price data to estimate the integrated volatility on the frequency domain via the Fast Fourier Transform (FFT) algorithm. The computed integrated volatility was then merged with a daily time series of the VIX index, and the difference between the two time series gave us the volatility risk premium. Finally, we had to collect, clean, and merge the economic, financial market, and risk factor variables from their respective databases. These data were used in our econometric analyses of the determinants and drivers of the volatility risk premium.
Transaction price data for the SPDR ETF (ticker SPY) was obtained from the TAQ database within WRDS. The sample period we studied goes from July 2006 to June 2011. Over these five years there were a total of 523,814,632 trades. For our integrated volatility estimation method to work best, we need as many observations as possible. For data cleaning and processing purposes, we filtered the data based on the "Correction Indicator" (CORR) and "Sale Condition". We kept only transactions where CORR=00; these represent regular trades that were not cancelled or corrected. This resulted in only 0.003% of the data being removed from the sample, leaving us with 523,796,850 trades. We also eliminated any "Special Condition Trades", which introduced suspicious and irregular patterns in the transaction price sequences (i.e. large jumps that were immediately reversed). This resulted in 1.8% of the data being removed from the sample, leaving us with 514,270,624 trades.
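The two filters can be sketched as follows. This is only an illustration: the record layout, the field names `corr` and `cond`, and the code "S1" are hypothetical placeholders rather than the actual TAQ schema or sale-condition codes, and the set of conditions to exclude is left to the caller.

```python
def clean_trades(trades, special_conditions):
    """Keep regular trades: CORR equal to "00" and no flagged sale condition."""
    kept = []
    for trade in trades:
        if trade["corr"] != "00":
            continue  # cancelled or corrected trade
        if trade["cond"] in special_conditions:
            continue  # special-condition trade
        kept.append(trade)
    return kept


# Hypothetical records; "S1" is a placeholder, not a real sale-condition code.
sample = [
    {"time": "09:30:00", "price": 127.10, "size": 200, "corr": "00", "cond": ""},
    {"time": "09:30:00", "price": 131.50, "size": 100, "corr": "00", "cond": "S1"},
    {"time": "09:30:01", "price": 127.12, "size": 300, "corr": "01", "cond": ""},
    {"time": "09:30:02", "price": 127.11, "size": 500, "corr": "00", "cond": ""},
]
cleaned = clean_trades(sample, special_conditions={"S1"})
removed_share = 1 - len(cleaned) / len(sample)
print(len(cleaned), removed_share)
```

Tracking the removed share at each step, as in the text, gives a quick check that a filter is not discarding more of the sample than intended.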
Since multiple trades can occur in any given second, we next introduced an aggregation step in the data processing. This would allow us to have a second-by-second time series of SPY prices. We tried two methods for aggregation, the median and the size-weighted average price, and did not find significant differences between them. Finally, we had to include an expansion step to account for seconds where no trades were executed. To address these instances we used piecewise constant interpolation; i.e. if there was no trade at second t then we filled it with the last executed price, carried forward from second t − 1 ("last tick").
This resulted in 29,461,859 second-by-second data points covering 1,259 trading days.
This was the data that was used to compute our daily time series of monthly realized volatility (on a rolling 21 trading day basis) via the frequency domain estimation methodology.
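The aggregation and expansion steps described above can be sketched as follows, assuming trades arrive as (second, price, size) tuples with seconds counted from the start of the session. The function names and sample values are hypothetical.

```python
from statistics import median


def aggregate_by_second(trades, method="median"):
    """Collapse all trades within a second to a single price."""
    by_second = {}
    for second, price, size in trades:
        by_second.setdefault(second, []).append((price, size))
    prices = {}
    for second, rows in by_second.items():
        if method == "median":
            prices[second] = median(p for p, _ in rows)
        else:  # size-weighted average price
            prices[second] = sum(p * s for p, s in rows) / sum(s for _, s in rows)
    return prices


def last_tick_fill(prices_by_second, start, end):
    """Piecewise constant interpolation: carry the last price forward."""
    filled, last = [], None
    for second in range(start, end + 1):
        last = prices_by_second.get(second, last)
        filled.append(last)  # stays None until the first trade is seen
    return filled


trades = [(0, 100.0, 100), (0, 102.0, 300), (2, 101.0, 50)]
by_median = last_tick_fill(aggregate_by_second(trades, "median"), 0, 3)
by_size = last_tick_fill(aggregate_by_second(trades, "size"), 0, 3)
print(by_median)
print(by_size)
```

Seconds 1 and 3 have no trades in this toy input, so they inherit the price from the preceding second, which yields an evenly spaced series suitable for a Fourier transform.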
The daily opening level of the VIX volatility index was obtained from the CBOE database. The VIX is a model-free implied volatility extracted from near-term out-of-the-money put and call options on the S&P 500 index. For details on the methodology used in constructing the VIX volatility index, see CBOE (2009). A similar methodology is employed in Jiang and Tian (2005), where the information content of model-free implied volatility is studied. The CBOE volatility index is studied in Carr and Wu (2006) and Jiang and Tian (2007). The explanatory variables in our regressions also come from multiple sources.
First, we have the traditional risk factors from the Fama-French Three Factor Model (Fama and French, 1993); the data for the Fama-French factors are available from Kenneth French's website. We also include the credit spread, also known as the default risk premium, which is the difference in yield on Baa-rated and Aaa-rated corporate debt. The yields on corporate debt, by Moody's rating, are available from the FRED database maintained by the Federal Reserve Bank of St. Louis. Use of the credit spread as a risk factor in asset pricing studies goes back to Chen et al. (1986) and has the interpretation as a measure of investor risk aversion. It has subsequently been used in volatility risk premium studies such as Zhou (2011). We will see that there is also a supply-side interpretation of the highly significant effects that the credit spread has on the volatility risk premium (and its magnitude), especially during the Crisis subperiod. The TED spread is included to capture liquidity effects in the financial markets and as a measure of distress in the financial system.
The TED spread is the difference between 3-month Eurodollar rates and 3-month Treasury rates, both of which are also available through the FRED database. The interpretation of the TED spread follows from the logic that as uncertainty in the financial system heightens, financial institutions charge more to each other for short-term borrowing (this is reflected in Eurodollar rates); at the same time they require ...

As a final explanatory variable we include the daily open interest for put options on SPY, which is obtained from the OptionMetrics database. We use this as a proxy for investors' demand for hedging tail risk, following the combined logic of several recent options studies. Cao and Han (2013) use open interest as a proxy for option demand pressure. Additionally, Gârleanu et al. (2009) find that investors are net buyers of index options. Since S&P 500 index put options give investors a way to hedge against market-wide crashes, the open interest provides a natural proxy for investors' demand to purchase protection and hedge downside tail risk.
While options on the SPDR ETF (SPY) are different from index options on the S&P 500 (SPX), they essentially provide the same protection for investors and have some features that may make them more attractive (see Kelly et al., 2012). In fact, our choice to use SPDR options might be even more consistent with our desired proxy for demand for hedging downside tail risk, since SPY options are American-style options with physical settlement (SPX index options are European-style options with cash settlement only) and therefore give investors more flexibility and robustness in protecting themselves against market crashes. Furthermore, when we discuss market makers' delta-hedging, SPDRs would be a more effective hedge on SPY put options than on SPX index options.
The explanatory variables are summarized in Table 1. Descriptive statistics for all variables are given in Table 2, which will be referred to throughout the discussion.

Construction of the Volatility Risk Premium
In this section we discuss our construction of the volatility risk premium from the market data. Consistent with our representation of the vrp as a "bias", we calculate it as the deviation of the realized volatility from the expected volatility implied by option prices. Therefore we first compute the realized volatility as the estimated integrated volatility using the Fourier method, described in Section 2, on the ultra-high-frequency transaction data for the SPDR S&P 500 ETF (ticker SPY). This gives us a daily time series of the realized volatility; however, there is a lot of variation in the day-to-day realization of this quantity which will contribute to additional statistical noise in our attempt to quantify the systematic bias that is the vrp. To better represent this systematic bias we smooth the time-series of realized volatility by taking a rolling average of the next 21 trading days so as to cover the same month as the contemporaneous VIX index, which is our measure of model-free implied volatility.
Define this rolling average as $\overline{RV}_t$. The VIX is then subtracted from the average realized volatility to measure the extent to which the implied volatility represents a biased expectation of the future realization:

$$vrp_t = \overline{RV}_t - VIX_t$$

We use the Open value (rather than the Close) of VIX so as to be consistent with our realized volatility estimate in terms of the 21-trading-day period at which we are looking on any given day. 11 A time series of our computed vrp over the sample period is plotted in Figure 1.
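The construction just described can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function names are hypothetical, the inputs are toy numbers, and the window is assumed to start on day t itself (the text only says the "next 21 trading days").

```python
from statistics import fmean


def forward_rolling_mean(values, window=21):
    """Mean of values[t : t + window] for every t that has a full window ahead.

    Assumption: the forward window includes day t itself.
    """
    return [fmean(values[t:t + window]) for t in range(len(values) - window + 1)]


def volatility_risk_premium(daily_rv, vix_open, window=21):
    """vrp_t = (forward rolling mean of realized volatility)_t - VIX open on day t."""
    smoothed = forward_rolling_mean(daily_rv, window)
    # zip stops at the shorter list, so days without a full forward window are dropped.
    return [rv_bar - vix for rv_bar, vix in zip(smoothed, vix_open)]


# Toy inputs in annualized volatility points (hypothetical, not the paper's data).
daily_rv = [12.0, 14.0, 16.0, 18.0, 20.0]
vix_open = [15.0, 15.0, 15.0, 15.0, 15.0]

vrp = volatility_risk_premium(daily_rv, vix_open, window=3)
print(vrp)  # [-1.0, 1.0, 3.0]
```

With a window of 3 on the toy series, the last two days have no full forward window and drop out, which mirrors the loss of the final 20 days of the sample when a 21-day forward average is used.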
Looking at the figure, two observations stand out immediately: first, the risk premium is negative throughout most of the sample period (we confirmed, statistically, that the vrp is not a mean zero process by conducting a series of one-sample t-tests); second, there is a large positive spike in the vrp in the middle of the Financial Crisis.

One possible explanation for this large positive spike is that the option markets underpriced the actual volatility level during that period of time. Unexpected shocks such as Lehman Brothers' failure and subsequent government interventions kept the markets on edge, and it was impossible to know, a priori, the magnitude of such a market tsunami and its impact on realized volatilities. The idea that the government would provide a backstop against any large financial catastrophe, known as the "Fed put", was arguably priced into the market, keeping implied volatilities low relative to realized volatilities. Therefore, there was a strong bias in one direction, with implied volatility underestimating the realized volatility. This eased somewhat after Lehman failed, but because officials were quick to step in thereafter, implied volatilities remained lower than perhaps they should have been given the circumstances. After government programs such as TARP and QE were in place and it became clear there would be no "quick fix", implied volatilities rose relative to realized levels (even though both were rising steadily during this entire period because of the high degree of overall uncertainty), thus reversing the bias.
There is evidence that during this time options dealers went from suppliers of liquidity in the market for hedging tail risk to demanders of liquidity (see Chen et al., 2015). This might help explain the joint effect of sign and magnitude of the volatility risk premium, which we briefly discuss at the end of the paper. First, however, we seek to identify and understand the effect that supply and demand imbalances in this market have on the magnitude of the bias, or the absolute deviation of the realized volatility from the implied volatility. We will see that when the demand for hedging tail risk increases, the deviation between realized and implied volatility widens as well. Similarly, when there is a supply shock (an extreme case would be when dealers switch from suppliers of liquidity to demanders of liquidity) we see that this deviation widens. We find that our proxies for supply and demand side variables are significant, both statistically and economically; furthermore, both the partial effects (changing supply or demand holding all else constant) and marginal effects (letting the other factors vary as well) produce consistent results. Based on this supply and demand framework, we interpret the magnitude of the volatility risk premium as the price for hedging downside tail risk. The direction of the volatility risk premium, or the sign of the bias, we find reflects different information which may also be related to the role of intermediaries (i.e. options dealers) in the market for hedging downside tail risk.

Empirical Analysis
Now we empirically examine the properties of the volatility risk premium as a priced bias between realized and expected volatility. Following our deviation analysis approach, we isolate the magnitude of the bias (absolute value) from the direction (sign) and are able to identify how they reflect different economic forces. First, we analyze the magnitude of the volatility risk premium (the absolute deviation) in relation to traditional risk factors as well as additional factors that capture supply and demand imbalances in the market for hedging downside tail risk. As is common practice in empirical finance studies, we first control for the known market risk factors, and then include proxies for investor demand for hedging tail risk and financial intermediaries' willingness to take on risk, provide liquidity, and meet that demand (i.e. the supply side).

Investors who wish to hedge their downside tail risk, or the risk of a market-wide crash, can buy put options on the S&P 500. As discussed in the previous section, they can do this using either index put options (SPX) or put options on the SPDR ETF which tracks the S&P 500 (SPY put options). We use open interest on the latter as our proxy for investors' demand for hedging tail risk. The supply-side effects are more subtle. On the one hand, there is the well-known TED spread, which is often used as a proxy for market liquidity and a measure of distress in the financial sector (see Brunnermeier, 2009). Within the framework of our analysis it could even be interpreted as a net cost of funding for financial institutions that borrow overnight and pledge T-bills as collateral. We do find that the TED spread has moderate explanatory power at varying points in time over the sample period. The explanatory variable that has more significance and is linked to supply-side forces is the credit spread. We find that the credit spread is highly significant and explains the variation in the magnitude of the volatility risk premium quite well.
While the credit spread has traditionally been used in empirical finance studies as a proxy for risk aversion (which is not unrelated to the volatility risk premium), we use an interpretation that has appeared in the recent financial economics literature that relates the widening or tightening of credit spreads with "de-leveraging" or increased risk-taking, respectively, of large dealer banks. This is consistent with the hypothesis that the magnitude of the deviation between the realized and expected volatility reflects supply-side imbalances. The evidence in favor of this hypothesis is compelling, especially during the Crisis period.
After a detailed examination of the factors that are priced into the magnitude of the bias that is the volatility risk premium, we turn our attention to the direction or sign. The visual evidence in Figure 1 clearly suggests that the vrp is usually negative, and statistical tests reject the null hypothesis that it is a mean zero process.
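To make the two ingredients of this section concrete, the sketch below splits a vrp series into magnitude and sign and computes a one-sample t-statistic for the null of a zero mean. It is only an illustration on toy numbers: the function names are hypothetical, a zero vrp is mapped to -1 by assumption (the text defines the sign only for non-zero values), and a plain t-statistic ignores the serial correlation that overlapping 21-day windows would induce in real data.

```python
import math
from statistics import fmean, stdev


def one_sample_t(sample, mu0=0.0):
    """t-statistic for H0: mean == mu0, using the sample standard deviation (n - 1)."""
    n = len(sample)
    return (fmean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def decompose(vrp):
    """Split each vrp value into magnitude |vrp| and direction sgn(vrp) in {-1, +1}."""
    magnitude = [abs(v) for v in vrp]
    # Assumption: an exactly-zero vrp is assigned -1.
    sign = [1 if v > 0 else -1 for v in vrp]
    return magnitude, sign


# Toy vrp values in volatility points (hypothetical, not the paper's data).
vrp = [-4.0, -6.0, -5.0, 3.0, -8.0]

t_stat = one_sample_t(vrp)
magnitude, sign = decompose(vrp)
print(round(t_stat, 3))   # -2.138
print(magnitude, sign)    # [4.0, 6.0, 5.0, 3.0, 8.0] [-1, -1, -1, 1, -1]
```

Multiplying the two components element by element recovers the original series, which is why the magnitude and the direction can be studied separately without losing information.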

Magnitude Regressions
Our econometric specification is to regress the magnitude component of the volatility risk premium on the explanatory variables to see the extent to which the absolute value of the deviation between realized and implied volatility is explained by supply and demand imbalances in the market for hedging tail risk. The general regression specification is given in Equation (19).

The results for the whole sample period are shown in Table 3. First, we see that the adjusted R-squared is 58%, which indicates that, over our entire sample period, the factors are able to explain more than half of the variation in the magnitude of the volatility risk premium. We control for traditional financial market factors with the Fama-French three factors; the only significant factor is the market risk premium.

The economic significance lends convincing support for our hypotheses. The coefficient estimate for CS is 5.17; so when the credit spread widens by 100bps, the deviation between the realized and the implied volatility increases by 517bps. The traditional risk aversion story supports this: when market-wide risk aversion is heightened, Baa yields will rise further above Aaa yields, and it is natural to expect that the options market will also price this in and increase the premium for hedging downside risk, which will result in implied volatilities rising relative to realized volatility.
If the vrp is negative (i.e. realized volatility is less than implied volatility) this will result in an increase in the absolute deviation between the two. As we will explain below, there is a more compelling explanation for the highly significant (both statistically and economically) effect of credit spreads on the magnitude of the deviation between realized and implied volatility, one that is robust to whether the vrp is positive or negative and that reflects precisely what we have been referring to as supply-side effects. We return to this shortly in our discussion of the "de-leveraging effect" (see Section 5.1.1).
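For readers who want to see the mechanics of a magnitude regression of this kind, the following is a minimal sketch of ordinary least squares with an adjusted R-squared, written with the Python standard library only. It is not the specification or the data of the study: the helper names are hypothetical, only two regressors (stand-ins for the credit spread and put open interest) are simulated, and the "true" coefficients in the simulation are arbitrary.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(y, regressors):
    """OLS of y on an intercept plus the given columns.

    Returns (coefficients with the intercept first, adjusted R-squared).
    """
    n, k = len(y), len(regressors)
    rows = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k + 1)] for p in range(k + 1)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k + 1)]
    beta = solve(xtx, xty)  # normal equations (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    adj_r2 = 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))
    return beta, adj_r2


# Simulated stand-ins (hypothetical): credit spread in percent, put open interest in millions.
rng = random.Random(0)
cs = [rng.uniform(0.5, 3.0) for _ in range(200)]
poi = [rng.uniform(2.0, 16.0) for _ in range(200)]
abs_vrp = [1.0 + 5.0 * c + 0.3 * p + rng.gauss(0.0, 1.0) for c, p in zip(cs, poi)]

beta, adj_r2 = ols(abs_vrp, [cs, poi])
print(beta, adj_r2)
```

A coefficient on the credit-spread column is read exactly as in the text: it is the change in the magnitude of the vrp, in volatility points, for a one percentage point (100bps) widening of the spread, holding the other regressors fixed.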
It is important to mention here that when credit spreads widen, other variables can change too. Therefore, it is also useful to perform marginal regressions on the individual factors. The logic behind marginal regressions is as follows. Suppose $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$; when $X_1$ increases from $x_1$ to $x_1 + \Delta_1$, a more plausible scenario is for $X_2$ to change from $E[X_2 \mid X_1 = x_1]$ to $E[X_2 \mid X_1 = x_1 + \Delta_1]$ rather than stay the same. If we treat the conditional expectation $E[X_2 \mid X_1]$ as linear in $X_1$ (i.e., $E[X_2 \mid X_1] = a + bX_1$), then the expected change in $Y$ is $\Delta_Y = (\beta_1 + \beta_2 b)\Delta_1$ rather than just $\beta_1 \Delta_1$, which is equivalent to the marginal regression of $Y$ on $X_1$. That is, if we regress $Y$ on $X_1$ only, then the coefficient on $X_1$ is exactly $\beta_1 + \beta_2 b$. We ran the marginal regressions (the coefficient on the POI marginal regression is $5.70 \times 10^{-7}$ and is significant at the < 1% level). For the entire sample period, the data support H1, both statistically and economically.
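The identity behind the marginal regression can be checked numerically. The sketch below simulates two correlated regressors, fits the two-variable regression and the one-variable (marginal) regression, and compares the marginal slope with the sum of the partial slope and the indirect effect through the second regressor. The function names and the simulated coefficients are hypothetical; the identity itself holds exactly in-sample for OLS estimates.

```python
import random


def centered_cross(u, v):
    """Sum of (u_i - mean(u)) * (v_i - mean(v))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def partial_slopes(y, x1, x2):
    """Slopes (beta1, beta2) from OLS of y on an intercept, x1 and x2."""
    s11, s22, s12 = centered_cross(x1, x1), centered_cross(x2, x2), centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def simple_slope(y, x):
    """Slope from OLS of y on an intercept and x only."""
    return centered_cross(x, y) / centered_cross(x, x)


# Simulated data with arbitrary (hypothetical) coefficients; X2 depends linearly on X1.
rng = random.Random(1)
x1 = [rng.gauss(0.0, 1.0) for _ in range(500)]
x2 = [0.5 + 0.8 * a + rng.gauss(0.0, 1.0) for a in x1]
y = [2.0 + 1.5 * a + 0.7 * c + rng.gauss(0.0, 1.0) for a, c in zip(x1, x2)]

beta1, beta2 = partial_slopes(y, x1, x2)   # partial effects
b = simple_slope(x2, x1)                   # slope of E[X2 | X1]
marginal = simple_slope(y, x1)             # marginal effect of X1

# The two printed numbers agree up to floating-point rounding.
print(marginal, beta1 + beta2 * b)
```

The difference between the marginal and the partial slope is the term beta2 * b, so the marginal effect exceeds the partial effect whenever the omitted regressor moves in the same direction as the included one and enters the regression with a positive coefficient.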
Next, we performed the magnitude regression in Equation (19) for three subperiods ("Pre-Crisis", "Crisis", and "Post-Crisis") to see if we can identify any patterns that might coincide with the dramatic changes in financial markets as a result of the Financial Crisis. The results are shown in Tables 4, 5, and 6, for "Pre-Crisis", "Crisis", and "Post-Crisis", respectively.
An initial comparison of the results in Tables 4, 5, and 6 with those in Table 3 reveals two preliminary observations. First, there is some variation in the explanatory power of the factors over the three subperiods. The R-squared in the "Pre-Crisis" subperiod is at the same level as the R-squared for the whole sample period (around 58%); the R-squared for the "Crisis" subperiod is higher than for the whole sample period (64% versus 58%); and the R-squared for the "Post-Crisis" subperiod is lower than for the whole sample period (28% versus 58%). It is rather interesting that the best fit is achieved during the Financial Crisis.

Second, over the entire sample period, the most significant explanatory variables are the credit spread and put option open interest, thus supporting hypotheses H1 and H2, and there seems to be a trend that these two factors become increasingly significant over time. In the "Pre-Crisis" subperiod, neither the credit spread nor put option open interest is significant, and the TED spread is the sole significant explanatory variable (at the < 1% level). During the "Crisis" subperiod, the credit spread is highly significant (at the < 1% level) and open interest is significant at the 10% level. In the "Post-Crisis" subperiod, both credit spread and put option open interest are highly significant (at the < 1% level), and the TED spread reappears as minimally significant (< 10% level). Thus, the data suggest that supply and demand factors play an increasing role in explaining the bias between realized volatility and implied volatility. Additionally, since the quality and quantity of data increased over this period, it might also suggest the additional economic insight that can be obtained from in-depth analysis of large-scale financial data.
When examining the results from the "Post-Crisis" subperiod one notices that the R-squared (28%) drops off compared to either of the other subperiods or the entire sample period. However, despite the lower R-squared, it is important to note that both the credit spread and the put option open interest are highly significant. Thus, the economic interpretation of the coefficients is consistent with the findings from the whole sample period and supports our hypotheses. The R-squared is interesting since it indicates perhaps a structural change after the financial crisis. Since the R-squared measures the proportion of the variation in the y-variable (here the magnitude of the vrp) that is explained by the x-variables (the risk factors and economic variables) it makes sense to simply look at the descriptive statistics to see if any patterns emerge to which we might attribute a structural change. In the "Post-Crisis" subperiod, both the vrp and its magnitude have less variation (standard deviations of 4.19 and 3.52, respectively) than the entire sample period (standard deviations of 6.91 and 5.63, respectively), so a possible explanation is that some factor(s) lost relatively more of their variability and became less correlated with the vrp. We know that early on in the sample period, the TED spread appeared to have good explanatory power for the magnitude component, but that the significance seems to disappear over time. We also see that the mean and standard deviation of the TED spread become very small in the "Post-Crisis" subperiod (mean of 38bps and standard deviation of 17bps).
The TED spread was our proxy for liquidity risk and financial market stability, which has a secondary or tertiary effect when analyzing the regression results. However, it is possible that after the Crisis, the low TED spread, with its minimal variation, ceased to be a good proxy for liquidity risk and financial stability. We also saw that over time the put option open interest, our proxy of demand for hedging tail risk, seemed to become increasingly important. While the mean put option open interest increased over time (4.13mm, 6.59mm, and 10.51mm, for the "Pre-Crisis", "Crisis", and "Post-Crisis" subperiods, respectively), which could just be indicative of the growing market for index options and options on ETFs, its standard deviation falls during the "Crisis" subperiod (from 1.57 to 0.92) and then more than doubles in the "Post-Crisis" period (2.05). It is possible that because of the government's implicit and explicit backstop, the so-called "Fed put", demand played less of a role during the Crisis, but as we emerged, new demand for hedging tail risk came to be a key driver in explaining the market price of volatility risk. Therefore, our structural change could be the increased role of put option open interest and investors' demand for hedging tail risk, rather than liquidity and overall financial stability, as a key factor.
The highest overall explanatory power of the credit spread during the "Crisis" subperiod suggests that supply-side effects drove results during the Financial Crisis. This can, in part, be explained by less variation in the put option open interest (see Table 2 and Figure 3). This may be because of the aforementioned "Fed put" and implicit (or explicit) government bailout, or it could be that dealers went from being liquidity providers to liquidity demanders themselves. This could help explain the change in the sign of the vrp. But there is still a compelling explanation in terms of the magnitude component, where the data tell an interesting story that we refer to as the de-leveraging effect.

The "De-leveraging Effect"
During the "Crisis" subperiod, the R-squared of the magnitude regression indicates that the factors are able to explain 64% of the variation in the magnitude component of the volatility risk premium. The most significant explanatory variable is the credit spread, which not only provides validation for the risk aversion interpretation of the volatility risk premium, but also supports our hypothesis that the magnitude of the bias between realized and implied volatility reflects the willingness of market makers to absorb inventory and take risk onto their balance sheet. This is what we refer to as the supply-side effect. When there are supply-side shocks, market makers would be less willing to take on risk and provide liquidity and therefore the wedge between realized and implied volatility grows. Either the dealers would charge a larger premium for providing the liquidity (thus driving the vrp more negative) or become demanders of liquidity (driving the vrp largely positive).
So what is it, exactly, that the credit spread variable is picking up? For many years empirical finance studies have used credit spread as a proxy for risk aversion.
We propose an alternative, but certainly not contradictory, interpretation: that credit spreads, or the difference between yields on speculative and investment grade debt, capture a related supply-side effect that is present for the entire sample period and seems to dominate during the "Crisis" subperiod. When a financial institution is concerned with the risk on its balance sheet, one way to reduce the overall risk exposure is to de-leverage. De-leveraging can be achieved through the right-hand-side of the balance sheet by altering the capital structure: buying back debt, issuing equity, or both. However, there is evidence that large financial institutions have a preference to de-leverage through the left-hand-side of the balance sheet, i.e. by reducing their holdings of risky assets (Adrian and Shin, 2010).

This de-leveraging also justifies the dominant impact of credit spreads in explaining the volatility risk premium during the "Crisis" subperiod in our results, since there is a high degree of overlap in the set of institutions that serve as primary dealers in the credit markets and those that are market makers in index options. 13 During the "Crisis" subperiod, recall that the most statistically significant explanatory variable was the credit spread (see Table 5). The economic significance of the coefficient estimate is that for a 100bps widening of the credit spread we would expect a 624bps increase in the price of volatility risk. As can be seen from Figure 2, credit spreads increased dramatically at the end of 2008. While this can certainly be viewed as an increase in risk aversion, it is also reflective of the massive de-leveraging that occurred after the failure of Lehman Brothers. As large financial institutions reduced their holdings of speculative grade debt they were also reluctant to take on additional risk in other markets.
It is reasonable to conclude that during this time dealers in index options increased the price at which they were willing to make a market for hedging downside tail risk, thus increasing the magnitude of the volatility risk premium. Because of the inherent leverage in option positions, it is not surprising to see a multiplier effect on the order of 5 to 7 times that of what occurs in the credit markets. This supply-side effect was clearly the driving force behind the increase in the magnitude of the volatility risk premium, not only because it is the only statistically significant explanatory variable during the "Crisis" subperiod, but also because there were no sharp increases in demand for put options on SPDRs. This is illustrated in Figure 3, where open interest fluctuated between 5,000,000 and 10,000,000 when the range for the entire sample period is 2,000,000 to nearly 16,000,000; this can also be seen in Table 2 by comparing the standard deviation for the put open interest variable across subperiods. It may seem curious that during the most uncertain point in the financial crisis investors were not aggressively trying to hedge their downside tail risk, but there is an intuitive explanation for this: the so-called "Fed put". After the failure of Lehman Brothers and the subsequent bailout of the financial industry, it became clear that the federal government would provide a backstop either implicitly or explicitly. Therefore, there was little need for investors to pay the high price to hedge their downside tail risk. After the financial crisis, however, demand for hedging downside tail risk returned (see Figure 3) and, indeed, put option open interest was highly significant in the "Post-Crisis" subperiod (see Table 6). This is in addition to the credit spread, which still represents the supply-side effect.
Of course, we can also examine what impact this supply-side effect would have on the magnitude of the volatility risk premium if we allow the demand-side effect and other factors to change with it. The marginal regressions indicate that a 100bps increase in the CS results in a 696bps increase in the absolute deviation between realized and implied volatility during the "Crisis" subperiod (the coefficient on the CS marginal regression is 6.96 and is significant at the < 1% level). Thus, when we allow the other factors to change, as they likely did considering the dynamic market events during this time, the de-leveraging effect is even more pronounced.

Wu (2011) shows that the absolute value of the variance risk premium is proportional to the level of volatility in the market. Therefore, the variance risk premium (and consequently the volatility risk premium) is either very negative or very positive when volatility levels are high. This provides added justification for examining the absolute value of the volatility risk premium to make inferences about the price of volatility risk. More specifically, it suggests that uncertainty in the market increases the bias between realized and implied volatility and the price that must be paid to hedge this risk. In this section we were able to provide empirical evidence explicitly linking this to demand for hedging tail risk and liquidity provision to the volatility market (hypotheses H1 and H2) including results that reflect the massive de-leveraging going on in the markets during the Financial Crisis. Next we examine the direction, or sign, of the volatility risk premium.

Sign Tests
While a negative volatility risk premium is justified theoretically and for the most part supported by the data, the large positive spike in vrp in the middle of the Financial Crisis, as well as several other positive spikes throughout the entire sample period, present a paradox. To help reconcile this paradox within our priced bias interpretation, we now examine the information in the direction, or sign, of the bias as proposed by Hypotheses H3 and H4. Recall that H3 reflects the view of some derivative traders that the direction of the volatility risk premium reflects the market's expectation of future changes in volatility. When the vrp is negative, the realized volatility in the equity market is less than the implied volatility extracted from option prices. Volatility is priced higher in the forward-looking options market, indicating that market participants expect realized volatility to increase in the future. When the vrp is positive, the realized volatility in the equity market is greater than the implied volatility extracted from option prices. Volatility is priced lower in the forward-looking options market, indicating that market participants expect realized volatility to decrease in the future.
We seek to test Hypothesis H3 using a modified version of the regression proposed in Aït-Sahalia et al. (2012) to test the Expectation Hypothesis. First, to establish a baseline we run the following specification of the Expectation Hypothesis:

$$\overline{RV}_t = \alpha + \beta_1 VIX_{t-21} + \varepsilon_t \qquad (20)$$

Here, the y-variable is the average ex-post realized volatility using the frequency domain estimation methodology for the current 21 trading-day period; the x-variable is the 21 trading-day lagged VIX. The idea is to see whether implied volatility does convey information about the market's expectations about future realized volatility levels. If implied volatilities are unbiased and efficient estimates of future realized volatility, then we could use the VIX index to predict the level of realized volatility one month in the future. This is the essence of the Expectation Hypothesis.
The results for this regression can be found in Table 7. The coefficient $\beta_1$ is significant at the < 1% level and the R-squared indicates that the lagged VIX is able to explain 33% of the variation in future realized volatility. This suggests that implied volatility does have some predictive power for realized volatility, or in other words, it represents to some extent the market's expectation about future realized volatility. We note that the t-statistic and p-value associated with $\beta_1$ just tell us that the coefficient is statistically different from zero. It is easy to show that $\beta_1$ is also statistically less than 1. 14 The implication is that given the current level of VIX, to predict the future realized volatility you would first scale the level of VIX by a factor of approximately 0.45 and then add a constant ($\alpha$ = 5.28%). Furthermore, the results in Table 7 indicate that for VIX > 9.60, implied volatility tends to overestimate future realized volatility; this says that, except when VIX is very low, there should be a negative volatility risk premium. However, we know from examining our time series that the vrp was positive when VIX was approaching historical highs. Therefore, perhaps a simple linear model such as in Equation (20) is not adequate.

To test hypothesis H3, we introduce a binary variable, sgn(vrp), which equals -1 if vrp < 0 and +1 if vrp > 0, and run the modified regression specified in Equation (21). This attempts to identify whether or not the direction, or the sign, of the volatility risk premium provides additional information about future levels of realized volatility.
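The break-even level quoted for the baseline regression follows from simple arithmetic, which the sketch below reproduces. It assumes that the "approximately 0.45" in the text is the slope on lagged VIX (this reading is consistent with the quoted threshold of 9.60) and uses the quoted intercept of 5.28; the function names are hypothetical and the rounded inputs mean the outputs are approximate.

```python
ALPHA = 5.28    # intercept quoted in the text, in volatility points
BETA_1 = 0.45   # approximate slope on lagged VIX (assumed reading of the text)


def baseline_forecast(vix, alpha=ALPHA, beta_1=BETA_1):
    """Forecast of next month's realized volatility from the baseline linear model."""
    return alpha + beta_1 * vix


def break_even_vix(alpha=ALPHA, beta_1=BETA_1):
    """VIX level at which the forecast equals VIX: alpha + beta_1 * v = v."""
    return alpha / (1.0 - beta_1)


print(round(break_even_vix(), 2))          # 9.6
print(round(baseline_forecast(21.0), 2))   # 14.73
```

Above the break-even level the fitted line lies below the 45-degree line, so the baseline model always predicts realized volatility below implied volatility there; the model has no way to produce a positive vrp at high levels of VIX.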
The results in Table 8 show that including the sign of the vrp improves the explanatory power as the adjusted R-squared is 55% with everything -α, β 1 , β 2 -significant at the < 1% level. Note that β 1 is still statistically less than 1, but the relationship between the lagged level of VIX and the future level of realized volatility is no longer linear.
The results can be used to predict next month's realized volatility in terms of the sign of the current volatility risk premium and conditional on the current level of VIX. Suppose the VIX is currently 21 (roughly the median value for our entire sample period). Then our prediction for next month's realized volatility level depends on whether the vrp is currently positive or negative. If the vrp is negative, then the results of the regression specified by Equation (21) predict next month's realized volatility to be approximately 13.43%; thus, when the vrp is negative, implied volatility overestimates expected future realized volatility. However, if the vrp is positive, the forecast changes to 29.97% and now implied volatility underestimates expected future realized volatility. This appears to confirm hypothesis H3; similar analysis of higher and lower VIX values (e.g., using 30 and 13, roughly the median values for the "Crisis" and "Pre-Crisis" subperiods, respectively) leads to the same conclusion.
These results should be met with some degree of skepticism. Aït-Sahalia et al.

Alternatively, Bakshi and Kapadia (2003) provide an explanation that is more consistent with our motivating theme about market-making and intermediation in the options market. They show, both theoretically and with empirical evidence on index options, that a negative volatility risk premium is representative of the underperformance of a delta-neutral portfolio, where the trader sells calls and purchases ∆ units of the underlying as a hedge (or sells puts and short sells ∆ units of the underlying as a hedge). Since we are examining the vrp in terms of market makers who provide liquidity to investors that wish to hedge downside tail risk with put options on the market (the S&P 500 index or SPDR ETF), the market maker must short sell the underlying (e.g. SPDRs) in order to maintain delta-neutrality. Consequently, the market maker will have a gain on the delta-hedge when the S&P 500 is down and a loss on the delta-hedge when the S&P 500 is up. Table 9 shows the annualized returns on the S&P 500 index when the volatility risk premium is positive (Panel A) and negative (Panel B). We can see that for every period in which the vrp is positive, the S&P 500 has negative returns. This is consistent with the view that, in the less frequent instances when the vrp is positive, traders making a market in SPY put options have a profit on their delta-neutral hedge. More often than not, when the vrp is negative, traders making a market in SPY put options are losing money on their delta-neutral hedge, as evidenced by the majority of periods that show positive returns on the S&P 500, as well as the overall average annualized return of 10.99% when the vrp is negative.
This suggests strong evidence in favor of H4.
To add econometric rigor and establish the statistical significance of the relationship posited by hypothesis H4, we ran the following regression:

$$\text{S\&P return}_t = \alpha + \beta_1\, \mathrm{sgn}(vrp_t) + \varepsilon_t \qquad (22)$$

where the dependent variable, S&P return$_t$, is the annualized daily return on the S&P 500 index on day $t$ and the independent variable, sgn(vrp$_t$), is the sign of the contemporaneous volatility risk premium. The results for the regression specified by Equation (22) can be found in Table 10. The coefficient estimate for $\beta_1$ is negative and significant at the < 5% level. We can therefore conclude from this regression that returns on the S&P 500 index statistically depend on whether the volatility risk premium is positive or negative. Furthermore, the negative coefficient estimate supports the delta-hedged gains argument of Bakshi and Kapadia (2003), but within the context of put options on the market: when the vrp is positive, returns on the S&P 500 (or SPDRs) can be expected to be negative (and the delta hedge of being short the underlying will make money), whereas when the vrp is negative, returns on the S&P 500 (or SPDRs) can be expected to be positive (and the delta hedge of being short the underlying will lose money).
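A regression of returns on a regressor that only takes the values -1 and +1 has a simple interpretation, which the sketch below illustrates on toy numbers: the fitted value is the average return in each regime, so the slope is half the difference between the average return when the vrp is positive and the average return when it is negative. The function name and the data are hypothetical, and an intercept is assumed in the regression.

```python
from statistics import fmean


def sign_regression(returns, signs):
    """OLS of returns on an intercept and a regressor taking values -1 or +1."""
    r_bar, s_bar = fmean(returns), fmean(signs)
    cov = sum((s - s_bar) * (r - r_bar) for s, r in zip(signs, returns))
    var = sum((s - s_bar) ** 2 for s in signs)
    beta_1 = cov / var
    alpha = r_bar - beta_1 * s_bar
    return alpha, beta_1


# Toy annualized returns in percent and the sign of the vrp on the same days (hypothetical).
returns = [12.0, 8.0, -20.0, 15.0, -10.0, 5.0]
signs = [-1, -1, 1, -1, 1, -1]

alpha, beta_1 = sign_regression(returns, signs)
mean_pos = fmean([r for r, s in zip(returns, signs) if s == 1])
mean_neg = fmean([r for r, s in zip(returns, signs) if s == -1])

print(alpha, beta_1)                       # approximately -2.5 and -12.5
print(alpha + beta_1, alpha - beta_1)      # approximately -15.0 and 10.0, the two regime means
```

A negative slope therefore says nothing more and nothing less than that average returns are lower on days with a positive vrp than on days with a negative vrp; the t-statistic on the slope tests whether that difference in means is distinguishable from zero.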
It turns out that this interpretation of the sign of the bias between realized and implied volatility is not unrelated to the theme of intermediation in the market for hedging downside tail risk. There is a possible way to unify both the magnitude and direction of the priced bias interpretations of the volatility risk premium, based on encouraging new evidence in a paper by Chen et al. (2015). Using detailed data on trading in deep out-of-the-money S&P 500 put options, Chen et al. (2015) are able to track variation in market makers' willingness and ability to provide liquidity to investors seeking to insure against market crashes (what we refer to as "hedging downside tail risk"). They find that during the Financial Crisis, market makers went from being liquidity providers to liquidity demanders. To the extent that this period coincides with the period where the vrp goes sharply positive in Figure 1, it is possible that this flip in sign along with the spike in magnitude represents the restricted supply of liquidity to the market for hedging tail risks in conjunction with dealers' desire to switch to the other side of the market and become liquidity demanders. Since our proxy for demand is open interest, which was rather constant during this period, our analysis might not be picking up that effect. This would just further solidify the intermediary's role in driving the deviation between realized and implied volatility.
However, it would require considerable additional examination to properly weave these two aspects together and, thus, we leave this for future research. Our deviation analysis of the magnitude and direction of the volatility risk premium separately as two components of a systematically-priced bias provides the first steps needed to better understand how the market prices volatility.
"Big Data" has the potential to transform research in many areas, including financial economics and risk analysis. We use a massive data set, collected from numerous sources, to perform a unique study of how the market prices volatility. Our research questions equate the volatility risk premium to a systematically priced bias between ex-post realized volatility and ex-ante expected volatility implied by options. Unlike most other studies of the volatility risk premium, rather than start with a theoretical model of volatility, we begin with intensive data-driven methods leveraging the insights of Big Data analytics.
First, we collected price and volume data on every transaction in SPDRs, the ETF that tracks the S&P 500 index, over a five-year period from 2006 through 2011, yielding over half a billion observations. We then use a novel technique to estimate the integrated volatility from the processed ultra-high-frequency data in the frequency domain. This methodology allows us to distinguish the true volatility of the price process from the microstructure noise, even when the noise is correlated over time. The result is a consistent, de-biased estimate of the integrated volatility, which serves as our measure of realized volatility. In constructing the time series of the volatility risk premium, we smooth the daily realized volatility by taking a 21-trading-day rolling average and subtract from it the daily value of the VIX volatility index for the same month.
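The construction of the series can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `rolling_mean` and `volatility_risk_premium` are hypothetical, the two inputs are assumed to be aligned day by day and quoted in the same units (for example, annualized volatility in percent), and the toy inputs are made up.

```python
import numpy as np

def rolling_mean(x, window=21):
    """Trailing mean over `window` observations; NaN until the window is full."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def volatility_risk_premium(daily_rv, vix, window=21):
    """Smoothed realized volatility minus implied volatility (same units assumed)."""
    return rolling_mean(daily_rv, window) - np.asarray(vix, dtype=float)

# Toy inputs (made up): realized volatility flat at 20, VIX flat at 18.
daily_rv = np.full(30, 20.0)
vix = np.full(30, 18.0)
vrp = volatility_risk_premium(daily_rv, vix)
print(vrp[-5:])
```

The first 20 entries are left undefined because a full 21-day window is not yet available; whether to drop or backfill them is a design choice the sketch does not settle.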
Insofar as the option-implied volatility represents the market's expectation of future volatility, this formulation of the volatility risk premium is very much like a statistical bias. We decompose this bias into magnitude (absolute value) and direction (sign) and analyze them separately. Based on stylized facts about the volatility risk premium, we construct four testable hypotheses about its economic meaning and determinants. The general theme is that the volatility risk premium cannot be explained by traditional risk factors, but rather is related to supply and demand forces in option markets and the role of market makers in providing liquidity to investors who seek to hedge their downside tail risk. This is all viewed through the lens of the volatility risk premium being a systematically priced bias.
The results indicate that the size of this bias represents the price that market makers require to meet the demand of investors who wish to hedge their downside tail risk and compensates for supply and demand imbalances in this market. In fact, we find compelling evidence that during the Financial Crisis, supply-side forces dominated as financial intermediaries shed risky positions and were reluctant to take more risk onto their balance sheets. Demand-side forces dried up as the implicit guarantees and "Fed put" made hedging tail risk less attractive for investors. This is substantiated by the highly significant credit spread, which reflects market makers de-leveraging during the Crisis, and the reduced significance of put option open interest, which is our proxy for investors' demand for hedging tail risk.
Practitioners view the sign of the volatility risk premium, on the other hand, as the market's expectation about future levels of volatility. This is similar to the Expectation Hypothesis discussed in Aït-Sahalia et al. (2012) and, while we are able to find some evidence in favor of this hypothesis in the data, statistical issues cast doubt on the validity of the inferences and there is no clear economic interpretation within the conceptual framework we established. An alternative hypothesis links the sign of the volatility risk premium to the gains and losses on traders' delta-hedged positions when making a market for index options. Bakshi and Kapadia (2003) were the first to propose this interpretation, and we are able to find fairly conclusive evidence in favor of it in the market for S&P 500 put options. That is, market makers who provide liquidity to investors seeking to hedge their downside tail risk (via put options on SPDRs) will delta-hedge these positions by short-selling shares of the underlying ETF. We find that returns on the S&P 500 are negative over all series of consecutive trading days where the volatility risk premium is positive, indicating a delta-hedged gain for the market maker. We find that the returns on the S&P 500 are positive over all but two series of consecutive trading days where the volatility risk premium is negative, indicating a delta-hedged loss for the market maker. Regression results corroborate this finding.
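The grouping of trading days into consecutive runs of same-signed vrp can be sketched as below. The function name `returns_by_sign_run`, the choice to compound simple daily returns within each run, and the toy numbers are assumptions for illustration; the source does not specify how the run-level returns were aggregated.

```python
import numpy as np

def returns_by_sign_run(vrp, daily_returns):
    """Split the sample into maximal runs of consecutive days with the same
    vrp sign and return (sign, first_index, last_index, compounded_return)
    for each run."""
    sign = np.sign(np.asarray(vrp, dtype=float))
    r = np.asarray(daily_returns, dtype=float)
    runs = []
    start = 0
    for t in range(1, len(sign) + 1):
        # Close the current run at the end of the sample or when the sign flips.
        if t == len(sign) or sign[t] != sign[start]:
            compounded = np.prod(1.0 + r[start:t]) - 1.0
            runs.append((int(sign[start]), start, t - 1, float(compounded)))
            start = t
    return runs

# Toy data (made up): two positive-vrp days, two negative, one positive.
example_runs = returns_by_sign_run(
    vrp=[1.0, 2.0, -1.0, -3.0, 4.0],
    daily_returns=[-0.01, -0.02, 0.01, 0.03, -0.005],
)
for run in example_runs:
    print(run)
```

Under the delta-hedged gains interpretation, runs with sign +1 would be expected to show negative compounded returns and runs with sign -1 positive ones.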
Overall, the ability of our data-driven analysis to identify economic insights into how the market prices volatility is very encouraging for researchers interested in using similar approaches for other quantitative studies in financial economics and risk analysis. While we do utilize the results from the existing literature along with economic intuition to highlight some stylized facts about the volatility risk premium and come up with testable hypotheses, our analysis does not rely on any specific theoretical model and has minimal parametric assumptions. The trend over our sample period seems to indicate that the increase in high-frequency data will allow for more precise and accurate measurement of the volatility risk premium as a systematically priced bias. This will then allow for even better identification of the determinants of the volatility risk premium and highlight the roles that intermediation in the market for volatility and for hedging downside risk, and supply and demand imbalances, play in driving the deviation between realized and implied volatilities.
A naïve estimator would be the Realized Volatility (RV) estimator, the sum of squared increments of the observed log price, $[Y, Y]_T = \sum_{i=0}^{n-1} (Y_{t_{i+1}} - Y_{t_i})^2$, which is consistent in a noise-free model. However, since the observable price process given by Equation (2) is contaminated by microstructure noise, this RV estimator is both biased and inconsistent.
Statistical theory indicates that we should be able to improve the accuracy and precision of our estimate by increasing the rate at which we sample the data; hence the value of ultra-high-frequency data. This would be the case if we could observe $X_t$ directly, but microstructure noise introduces an added dimension of complexity to the problem. In fact, assuming iid noise, the bias of the RV estimator is $2nE[\epsilon^2]$. This tells us that as we increase the frequency of the price data, the effect of the noise becomes more overwhelming.
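The growth of the noise-induced bias with the sampling frequency can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the paper: constant volatility, Gaussian i.i.d. noise, one observation per second, and a trading day expressed as 1/252 of a year. With i.i.d. noise the expected RV is approximately the integrated variance plus $2mE[\epsilon^2]$, where $m$ is the number of increments used.

```python
import numpy as np

def realized_variance(y, step=1):
    """Sum of squared increments of y, keeping every `step`-th observation."""
    return float(np.sum(np.diff(y[::step]) ** 2))

rng = np.random.default_rng(1)
n = 23_400        # increments in one day at 1-second spacing (assumed)
T = 1.0 / 252     # one trading day, in years (assumed)
sigma = 0.2       # constant annualized volatility (assumed)
noise_sd = 5e-4   # standard deviation of the i.i.d. noise (assumed)

# Latent log price X (constant-volatility Brownian motion) and observed Y = X + noise.
x = np.concatenate([[0.0], np.cumsum(sigma * np.sqrt(T / n) * rng.normal(size=n))])
y = x + noise_sd * rng.normal(size=n + 1)

true_iv = sigma**2 * T
results = {}
for step in (1, 5, 30, 300):
    m = n // step                                # increments actually used
    rv = realized_variance(y, step)
    predicted = true_iv + 2 * m * noise_sd**2    # integrated variance plus noise bias
    results[step] = (rv, predicted)
    print(f"step={step:>3}  RV={rv:.3e}  IV+bias={predicted:.3e}  true IV={true_iv:.3e}")
```

Sampling every observation (step 1) gives an RV dominated by noise, while coarser sampling moves the estimate toward the integrated variance at the cost of using far fewer observations.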
One way to address the problem of noise when sampling at too high a frequency is to sample sparsely and use the corresponding RV estimator. This practice, known as the subsampling approach, was first introduced by Zhou (1996). However, even when sampling sparsely at the optimally determined frequency, the fact that large portions of data are discarded violates basic statistical principles. Furthermore, Zhang et al. (2005) argue that sampling over longer horizons merely reduces the impact of microstructure noise, rather than quantifying and correcting its effect for volatility estimation.
One of the earliest solutions to incorporate the full data sample is Two Scales Realized Volatility (TSRV), proposed by Zhang et al. (2005). The TSRV estimator is based on subsampling, averaging, and bias correction. They sample sparsely over subgrids of $\bar{n}$ observations to get $K$ subsamples on a slower time scale. For each such subsample the RV estimator is $[Y, Y]_T^{(\mathrm{sparse},k)}$, $k = 1, \dots, K$, and averaging them yields the estimator $[Y, Y]_T^{(\mathrm{avg})}$. The final de-biased estimator is:

$$\widehat{\langle X, X \rangle}_T^{(\mathrm{tsrv})} = [Y, Y]_T^{(\mathrm{avg})} - \frac{\bar{n}}{n}\,[Y, Y]_T^{(\mathrm{all})}$$

after accounting for the bias, where $\bar{n} = n/K$ and $[Y, Y]_T^{(\mathrm{all})}$ is the RV estimator computed on all $n$ observations. Choosing the optimal sampling step $K = cn^{2/3}$ yields the convergence rate $n^{-1/6}$. The TSRV estimator is shown to outperform the standard RV estimator empirically in the study by Aït-Sahalia and Mancini (2008).
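The subsample, average, and bias-correct recipe can be sketched as follows. This is an illustrative implementation rather than the code used in the paper: the function name `tsrv` is hypothetical, $\bar{n}$ is taken as the average number of increments per subgrid (approximately $n/K$), the test data come from a constant-volatility model with i.i.d. Gaussian noise, and $K = 100$ is an arbitrary choice rather than an optimized $cn^{2/3}$.

```python
import numpy as np

def tsrv(y, K):
    """Two Scales Realized Volatility with K offset subgrids (sketch)."""
    y = np.asarray(y, dtype=float)
    n = len(y) - 1                                  # increments on the full grid
    rv_all = np.sum(np.diff(y) ** 2)                # RV using every observation
    # RV on each sparse subgrid y[k], y[k+K], y[k+2K], ..., then averaged.
    rv_avg = np.mean([np.sum(np.diff(y[k::K]) ** 2) for k in range(K)])
    n_bar = (n - K + 1) / K                         # average increments per subgrid
    return float(rv_avg - (n_bar / n) * rv_all)     # subtract the noise bias

# Synthetic check (all settings assumed): constant volatility plus i.i.d. noise.
rng = np.random.default_rng(2)
n, T, sigma, noise_sd = 23_400, 1.0 / 252, 0.2, 5e-4
x = np.concatenate([[0.0], np.cumsum(sigma * np.sqrt(T / n) * rng.normal(size=n))])
y = x + noise_sd * rng.normal(size=n + 1)

true_iv = sigma**2 * T
rv_all = float(np.sum(np.diff(y) ** 2))
estimate = tsrv(y, K=100)
print(f"true IV={true_iv:.3e}  plain RV={rv_all:.3e}  TSRV={estimate:.3e}")
```

The full-grid RV serves only to estimate the noise contribution, which is then scaled by $\bar{n}/n$ and removed from the averaged sparse estimator.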
A closely related estimator is Multiple Scale Realized Volatility (MSRV), proposed and derived in Zhang (2006). As a generalization of TSRV, the MSRV estimator combines $M$ different time scales with weights that, when chosen optimally, achieve the optimal convergence rate $n^{-1/4}$. Based on a different smoothing idea, Fan and Wang (2007) [...] The pre-averaging approach of Jacod et al. (2009) uses all or most of the data and removes noise by local average smoothing, i.e., averaging over a moving window.
The result is a rate-optimal (with convergence rate $n^{-1/4}$) consistent estimator of integrated volatility in the presence of microstructure noise.
Returning to the parametric approach, if we do not assume that volatility is constant and noise normally distributed with variance a 2 , but nevertheless use the log-likelihood function in Equation (23), the resulting estimator is the quasi maximum likelihood estimator (QMLE). Interestingly, it is still a consistent estimator at the most efficient rate n −1/4 . Statistical properties of the QMLE are derived in Xiu (2010).

B Simulations
We compare the performance of the Fourier method and naive subsampling at different sampling frequencies on simulated data using a Heston (1993) model:

$$dX_t = (\mu - \nu_t/2)\,dt + \sigma_t\,dB_t, \qquad d\nu_t = \kappa(\alpha - \nu_t)\,dt + \gamma\,\nu_t^{1/2}\,dW_t,$$

where $\nu_t = \sigma_t^2$. The parameters are set as follows: $\mu = 0.05$, $\kappa = 5$, $\alpha = 0.04$, $\gamma = 0.5$, and the correlation between the two Brownian motions $B_t$ and $W_t$ is $\rho = -0.5$. The initial values are $X_0 = 0$ and $\nu_0 = 0.04$. We take $T$ as one day, and simulate data with $\Delta t = 0.1$s, which yields a sample path of length $N = 234{,}000$ in one trading day. We first calculate the underlying true integrated volatility by a Riemann sum approximation of the integral, i.e.:

$$\frac{T}{N}\sum_{i=1}^{N} \sigma_i^2 \approx \int_0^T \sigma_t^2\,dt.$$

Then we add AR(2) noise, $\epsilon_i = 0.6\,\epsilon_{i-1} - 0.4\,\epsilon_{i-2} + \eta_i$, to get the observed data $Y_i = X_i + \epsilon_i$, where the $\eta_i$ are i.i.d. $N(0, \sigma_\eta^2)$ and we set $\sigma_\eta = 5 \times 10^{-4}$.
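A simulation along these lines can be sketched as below. Several choices are assumptions made for the example rather than details confirmed by the source: the log-price drift is taken as $\mu - \nu_t/2$, the discretization is an Euler scheme with the variance floored at zero, one day is expressed as $T = 1/252$ years, the AR(2) recursion starts from zero, and the function name `simulate_heston_with_noise` is hypothetical. The example uses one-second spacing to keep the run time short; the 0.1-second grid in the text corresponds to `n=234_000`.

```python
import numpy as np

def simulate_heston_with_noise(n, T, mu=0.05, kappa=5.0, alpha=0.04, gamma=0.5,
                               rho=-0.5, x0=0.0, v0=0.04, sigma_eta=5e-4, seed=0):
    """Euler-discretized Heston log price observed with additive AR(2) noise.
    Returns (observed y, latent x, Riemann-sum integrated variance)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    z1 = rng.normal(size=n)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)  # corr(z1, z2) = rho
    x = np.empty(n + 1)
    v = np.empty(n + 1)
    x[0], v[0] = x0, v0
    for i in range(n):
        vp = max(v[i], 0.0)                      # floor the variance at zero
        x[i + 1] = x[i] + (mu - vp / 2.0) * dt + np.sqrt(vp * dt) * z1[i]
        v[i + 1] = v[i] + kappa * (alpha - vp) * dt + gamma * np.sqrt(vp * dt) * z2[i]
    # Riemann sum (T/N) * sum_{i=1}^{N} sigma_i^2 for the integrated variance.
    true_iv = float(dt * np.sum(np.maximum(v[1:], 0.0)))
    # AR(2) microstructure noise: eps_i = 0.6 eps_{i-1} - 0.4 eps_{i-2} + eta_i.
    eta = sigma_eta * rng.normal(size=n + 1)
    eps = np.zeros(n + 1)
    for i in range(n + 1):
        eps[i] = eta[i]
        if i >= 1:
            eps[i] += 0.6 * eps[i - 1]
        if i >= 2:
            eps[i] -= 0.4 * eps[i - 2]
    return x + eps, x, true_iv

y, x, true_iv = simulate_heston_with_noise(n=23_400, T=1.0 / 252)
print(f"observations={len(y)}  integrated variance={true_iv:.3e}")
```

Returning the latent path alongside the observed one makes it possible to compare any estimator against the Riemann-sum benchmark on the same draw.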
We estimate the integrated volatility using two methods, the Fourier method and naive subsampling, which yield $\langle X, X \rangle_T^{\mathrm{Fourier}}$ and $\langle X, X \rangle_T^{\mathrm{subsampling}}$, respectively. We calculate the RMSE (root-mean-square error) of the estimates relative to the truth over 200 simulated sample paths. To further illustrate the effect of high-frequency data, we evaluate the two methods from $\Delta t = 1$s up to $\Delta t = 150$s. Figure 4 shows the RMSE of the Fourier method and the naive subsampling against decreasing sampling frequencies.
The takeaway from this figure is twofold. First, the Fourier method can effectively filter the correlated microstructure noise, and works better than the naive subsampling method. We did not implement other, more sophisticated methods for comparison, as the purpose of the simulation is not to illustrate the superiority of the Fourier method, but rather to justify the use of high-frequency data. Second, if we can filter the microstructure noise, a higher sampling frequency gives us a better estimate, as we are able to utilize more data and hence more information.

Fama-French market risk factor: market return minus risk-free rate
SMB: Fama-French size factor
HML: Fama-French value factor
Credit Spread (CS): Difference in yield on Baa-rated and Aaa-rated corporate debt
TED Spread (TED): Difference between 3-month Eurodollar rate and 3-month Treasury rate
Put Open Interest (POI): Daily open interest for put options on SPY

Table 3: Regression of |vrp| on explanatory variables: Whole Sample Period. Standard errors and t-statistics are computed using the Newey-West correction. Significance levels are < 1%, < 5%, and < 10% for ***, **, and *, respectively.

Table 4: Regression of |vrp| on explanatory variables: Pre-Crisis Subperiod. Standard errors and t-statistics are computed using the Newey-West correction. Significance levels are < 1%, < 5%, and < 10% for ***, **, and *, respectively.

Table 5: Regression of |vrp| on explanatory variables: Crisis Subperiod. Standard errors and t-statistics are computed using the Newey-West correction. Significance levels are < 1%, < 5%, and < 10% for ***, **, and *, respectively.

3.67e-03 *** Adjusted R² 0.28

Table 6: Regression of |vrp| on explanatory variables: Post-Crisis Subperiod. Standard errors and t-statistics are computed using the Newey-West correction. Significance levels are < 1%, < 5%, and < 10% for ***, **, and *, respectively.