Short-Term GDP Forecasting with a Mixed Frequency Dynamic Factor Model with Stochastic Volatility

In this paper we develop a mixed frequency dynamic factor model featuring stochastic shifts in the volatility of both the latent common factor and the idiosyncratic components. We take a Bayesian perspective and derive a Gibbs sampler to obtain the posterior density of the model parameters. This new tool is then used to investigate business cycle dynamics and to forecast GDP growth at short-term horizons in the euro area. We discuss three sets of empirical results. First, we use the model to evaluate the impact of macroeconomic releases on point and density forecast accuracy and on the width of forecast intervals. Second, we show how our setup allows us to make a probabilistic assessment of the contribution of releases to forecast revisions. Third, we design a pseudo out-of-sample forecasting exercise and examine point and density forecast accuracy. In line with findings in the Bayesian Vector Autoregression (BVAR) literature, we find that stochastic volatility contributes to an improvement in density forecast accuracy.


Introduction
The conduct of monetary and fiscal policy relies on the timely assessment of current and future economic conditions. The task of providing an accurate picture of the current cyclical position is seriously hampered by the delay with which crucial economic indicators are released. GDP data, for example, are usually published with a 45-day delay both in the US and in the euro area. Important quantitative monthly indicators, like industrial production indexes, suffer from a similar publication delay. Survey data, on the other hand, provide very timely information, as they are published roughly at the end of the reference month. Unfortunately, forecasts based on qualitative data only are known to be much less reliable than predictions based on quantitative information, see . The econometric literature has progressed significantly in the field of short-term forecasting in the past decade, and a number of tools have been developed that are capable of dealing with the asynchronous timing of data releases, integrating data at different frequencies and dissecting the information content of monthly releases for tracking quarterly variables. Small and large scale factor models, in particular, have become the workhorse for short-term forecasting. On the small scale side, building on the Stock and Watson (1989) coincident indicator, , henceforth MM03, have proposed a unified framework for modeling quarterly GDP together with monthly indicators. The approach has recently been extended by  to accommodate real time issues and different GDP releases. On the large data side, research by  and  has documented the predictive content of a large number of indicators for GDP growth and also introduced new tools to link monthly data releases to GDP forecast revisions. These models are nowadays used on a regular basis to inform decision makers both at central banks and in private institutions.
Although the literature has moved very rapidly, there are still some gaps between the demands posed by policy makers and the answers that the models discussed above can provide.
In particular, policy makers have become more and more interested in having not only point forecasts, but also a model based assessment of the uncertainty surrounding the outlook. This is testified by the number of central banks that have started publishing fan charts and confidence bands around their medium/long term forecasts (Bank of England, Bank of Canada, Norges Bank, the South African Reserve Bank, Sveriges Riksbank, the Bank of Italy and the US Fed). Despite the growing preference for a probabilistic assessment of economic projections, however, the focus of short-term forecasting models is still on point forecasts.
Another open issue in the field of short-term forecasting relates to parameter instability.
As economic systems evolve and are hit by large shocks, the link between different indicators is likely to change over time, requiring some flexibility in the model parameters. The issue of forecast failure in the presence of structural breaks, which has been explored extensively in the case of point forecasts, has recently been extended to density forecasting. In particular  show that changes in the underlying data generating process can severely hinder the accuracy of density forecasts produced with time invariant models. If structural breaks are sufficiently large and distant in the past, however, they do not pose serious problems. They can be easily detected with standard statistical tests, and parameter instability can then be either incorporated in the model or simply bypassed by splitting the sample or adopting a rolling estimation scheme. These strategies are not viable if breaks are small and continuous rather than large and discrete, a form of parameter time variation that has received a lot of attention in the macro empirical literature in the past decade. Indeed a number of papers, mainly concerned with structural monetary analysis, have documented the inadequacy of constant parameter models for describing macroeconomic data and have called for some form of slow, continuous time variation in model parameters, see ,  and Primiceri (2005). The message coming from this literature attracted little attention in the density forecast literature until the recent paper by , who shows that allowing for stochastic shifts in the volatility of the shocks in a BVAR significantly increases density forecast accuracy for a number of US variables.
In this paper we take stock of these issues and develop a mixed frequency small scale factor model that is suitable for producing density forecasts and that allows for time variation in some of the parameters. We start off with the basic setup of MM03 and twist it in two directions.
Our first step consists of casting the model in a Bayesian estimation framework. By treating the model parameters as random variables we can draw a distribution of forecasts from the predictive density, which makes the model directly suitable for producing density forecasts. Secondly, following , we extend the model to allow for random shifts in the volatility of the underlying shocks. To clarify the expected gains of this modeling choice let us briefly describe the setup. In our factor model each variable is treated as the sum of two components: a latent factor, which is common to all the variables and governs the amount of correlation across them, and an idiosyncratic component, which is uncorrelated across variables. Dynamics are introduced by letting the common and the idiosyncratic components follow autoregressive processes subject to random shocks. Our innovation consists of allowing the variances of these shocks to vary continuously over time rather than being constant as in MM03. We expect the model to boost the variance of the shocks in more turbulent times, hence producing wider confidence bands and providing a more accurate assessment of the uncertainty surrounding the median forecast. On the other hand, the richer structure of our model implies a much heavier parametrization, posing a trade-off between model flexibility and model parsimony, an issue that can be particularly important in relatively short samples and that can only be evaluated empirically.
After showing how to estimate the model we turn to an empirical application in which we use a small number of monthly indicators to predict quarterly GDP growth in the euro area.
We present three sets of results. First we extend the approach proposed by , henceforth GRS, and show how successive macroeconomic releases not only improve point forecast accuracy but also increase the precision of density forecasts and reduce the width of forecast intervals. Second, we illustrate how, in a given quarter, our new tool can be applied not only to interpret the news content of monthly releases, like in , but also to assess how much 'confidence' the model places on the revisions implied by the release of monthly indicators. Third, we design a (pseudo) real time out of sample forecasting exercise and evaluate both the point and density forecasts produced by the model. In line with  we find that the introduction of stochastic volatility leads to an improvement of both point and density forecast accuracy.
The paper is structured as follows. In section 2 we describe the model. In section 3 we discuss the main steps of the Gibbs sampler used for simulating the posterior distribution of the parameters. Section 4 presents the empirical application and section 5 concludes.

The model
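In this section we spell out the details of our model. Let $Y_{q,t}$ be a quarterly series, which can be seen as a monthly variable with its value associated to the third month of the quarter and missing observations in the first two months, and $Y_{m,t}$ a vector of $k$ monthly series $Y_{mj,t}$, for $j = 1, 2, \ldots, k$ (from this point onwards we use the convention that whenever we write $mj$ we mean the $j$th element in the vector of monthly variables). Now let $Y_{q,t}$ be the geometric mean of a latent random variable $Y^*_{q,t}$ such that:

$$\ln Y_{q,t} = \tfrac{1}{3}\left(\ln Y^*_{q,t} + \ln Y^*_{q,t-1} + \ln Y^*_{q,t-2}\right) \qquad (1)$$

Filtering both sides with the filter $(1 - L^3)$, after some simple manipulation, yields:

$$y_{q,t} = \tfrac{1}{3} y^*_{q,t} + \tfrac{2}{3} y^*_{q,t-1} + y^*_{q,t-2} + \tfrac{2}{3} y^*_{q,t-3} + \tfrac{1}{3} y^*_{q,t-4} \qquad (2)$$

where lower case letters indicate growth rates: over the previous three months for the observed series, $y_{q,t} = \Delta_3 \ln Y_{q,t}$, and over the previous month for the latent one, $y^*_{q,t} = \Delta \ln Y^*_{q,t}$.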
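The aggregation step can be checked numerically. The sketch below assumes that the latent series enters through its monthly log-difference, so that the three-month growth rate of the geometric mean equals a weighted sum of five monthly latent growth rates with weights 1/3, 2/3, 1, 2/3, 1/3. The function names are hypothetical and serve only this illustration.

```python
import numpy as np

def quarterly_growth_from_latent(log_y_star):
    """Three-month growth of the 3-month geometric mean of a latent monthly level."""
    # Log of the geometric mean of the last three monthly levels.
    log_yq = (log_y_star[2:] + log_y_star[1:-1] + log_y_star[:-2]) / 3.0
    # Apply the (1 - L^3) filter.
    return log_yq[3:] - log_yq[:-3]

def weighted_monthly_growth(log_y_star):
    """The same quantity written as a weighted sum of monthly latent growth rates."""
    dy = np.diff(log_y_star)                      # monthly growth of the latent series
    w = np.array([1 / 3, 2 / 3, 1.0, 2 / 3, 1 / 3])  # weights on lags 0..4
    # With mode="valid", np.convolve returns sum_k w[k] * dy[t-k] for every full window.
    return np.convolve(dy, w, mode="valid")

# Usage: both routes give the same series for an arbitrary latent log-level path.
rng = np.random.default_rng(0)
latent = np.cumsum(rng.normal(size=50))
same = np.allclose(quarterly_growth_from_latent(latent), weighted_monthly_growth(latent))
```

Because the weights sum to 3, a constant monthly growth rate in the latent series maps into three times that constant in the observed quarterly growth rate.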
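We assume a dynamic (single) factor model for the latent process $y^*_{q,t}$ and the monthly observed variables $y_{m,t}$.[4] The system of measurement equations is:

$$y^*_{q,t} = \alpha_1 + \beta_q f_t + u_{q,t}, \qquad y_{mj,t} = \alpha_{mj} + \beta_{mj} f_t + u_{mj,t}, \quad j = 1, \ldots, k \qquad (3)$$

The laws of motion of the factor and of the idiosyncratic disturbances of the quarterly and monthly variables are described by the following:

$$\Phi_f(L) f_t = e^{\lambda_{f,t}/2} v_t \qquad (4)$$

$$\Phi_q(L) u_{q,t} = \sigma_q e^{\lambda_{q,t}/2} \epsilon_{q,t} \qquad (5)$$

$$\Phi_{mj}(L) u_{mj,t} = \sigma_{mj} e^{\lambda_{mj,t}/2} \epsilon_{mj,t} \qquad (6)$$

where $v_t$, $\epsilon_{q,t}$ and $\epsilon_{mj,t}$ are uncorrelated N(0,1) and the $\Phi_i(L)$ are lag polynomials of order $p_i$:

$$\Phi_i(L) = 1 - \phi_{i,1} L - \ldots - \phi_{i,p_i} L^{p_i}$$

for $i = f, q, mj$. The log-volatilities $\lambda_{i,t}$ follow a driftless random walk:

$$\lambda_{i,t} = \lambda_{i,t-1} + \eta_{i,t}, \qquad \eta_{i,t} \sim N(0, \sigma^2_{\lambda,i})$$

for $i = f, q, mj$, and are assumed to be independent across equations. The hypothesis that the innovations to the factor and to the idiosyncratic components of the observables are cross-sectionally uncorrelated is an important identifying assumption that forces all the comovement in the panel to occur through the common factor.

[4] The use of more than one common factor does not pose any additional technical difficulty as it would simply result in an enlargement of the state vector in the State Space representation of the model.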
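To fix ideas, the following minimal sketch simulates one component of the kind described above: an autoregressive process whose shock has a log-volatility following a driftless random walk started at zero. The function name, the parameter values and the zero pre-sample values are assumptions made for the illustration, not the specification estimated in the paper.

```python
import numpy as np

def simulate_sv_ar(T, phi, sigma=1.0, sigma_lambda=0.05, seed=0):
    """Simulate x_t = sum_l phi[l-1] * x_{t-l} + sigma * exp(lam_t / 2) * e_t,
    with lam_t = lam_{t-1} + sigma_lambda * eta_t and lam_0 = 0."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    lam = np.zeros(T + 1)          # lam[0] is the initial condition, fixed at 0
    x = np.zeros(T + p)            # first p entries are zero pre-sample values
    for t in range(1, T + 1):
        lam[t] = lam[t - 1] + sigma_lambda * rng.standard_normal()
        shock = sigma * np.exp(lam[t] / 2.0) * rng.standard_normal()
        past = x[t - 1 : t + p - 1][::-1]   # x_{t-1}, ..., x_{t-p}
        x[t + p - 1] = np.dot(phi, past) + shock
    return x[p:], lam[1:]

# Usage: an AR(2) component with slowly drifting volatility.
x, lam = simulate_sv_ar(240, phi=[0.5, 0.2], sigma_lambda=0.1)
```

Setting `sigma_lambda` to zero keeps the log-volatility at its initial value and returns a constant-variance autoregression.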
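Since the variable $y^*_{q,t}$ is not observed, the measurement equations can be rewritten in terms of the observable variable $y_{q,t}$ using the identity (2):

$$y_{q,t} = \bar\alpha_1 + \beta_q\left(\tfrac{1}{3} f_t + \tfrac{2}{3} f_{t-1} + f_{t-2} + \tfrac{2}{3} f_{t-3} + \tfrac{1}{3} f_{t-4}\right) + \tfrac{1}{3} u_{q,t} + \tfrac{2}{3} u_{q,t-1} + u_{q,t-2} + \tfrac{2}{3} u_{q,t-3} + \tfrac{1}{3} u_{q,t-4}$$

where $\bar\alpha_1 = 3\alpha_1$. A more compact state space representation of the model is the following:

$$y_t = F \mu_t$$

$$\mu_t = H \mu_{t-1} + e_t, \qquad e_t \sim N(0, Q_t)$$

$$\Lambda_t = \Lambda_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \Xi)$$

where $y_t$ collects both quarterly and monthly variables, the state vector $\mu_t$ includes the unobserved factor $f_t$ and the idiosyncratic components ($u_{q,t}$ and $u_{m,t}$), the matrix $F$ collects the factor loadings, $H$ collects the autoregressive parameters of the laws of motion of the unobserved factor and of the idiosyncratic components, the time varying variance matrix $Q_t$ is a diagonal matrix whose diagonal elements are the variances $e^{\lambda_{f,t}}$, $\sigma^2_q e^{\lambda_{q,t}}$, $\sigma^2_{mj} e^{\lambda_{mj,t}}$, $\Lambda_t$ is the vector of drifting volatilities and $\Xi$ is a diagonal matrix collecting the variances that determine the amount of time variation of the log-volatilities ($\sigma^2_{\lambda,i}$, $i = f, q, mj$).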
This model nests the one proposed by MM03, which can be easily recovered from our more general setup by shutting off the drifting volatilities, that is by setting Λ 0 = 0 and Ξ = 0. In this case the matrix Q t is replaced by its constant counterpart Q.
To identify the model parameters some restrictions need to be placed. First, the scale of the factor loadings and of the factor cannot be separately identified, so we restrict the variance of the errors of the common factor to be 1 (see equation 4). Second, we need to fix the scale of the stochastic volatilities, which enter as a multiplicative element of the constant variances (see equations 4 to 6) and therefore cannot be separately identified. We follow Del  and identify the stochastic volatilities by setting to zero their initial condition.

Model Estimation
The model is estimated with Bayesian methods using a Metropolis-within-Gibbs sampling procedure. The Gibbs sampler reduces the daunting task of obtaining draws of the model parameters from the joint posterior distribution to a number of more manageable problems, which involve sampling from the distribution of a subset of parameters conditional on the remaining ones and on the data, see [...]

Conditioning on $f_t$, $\Phi_i(L)$ and $\lambda_{i,t}$, the $i$th measurement equation is a standard regression with autocorrelated and heteroscedastic disturbances. Pre-multiplying by $\Phi_i(L)$ and dividing by $e^{\lambda_{i,t}/2}$ one obtains a standard regression model with homoscedastic, uncorrelated residuals. Positing a Normal-gamma conjugate prior, the conditional posterior for $\beta_i(L)$ and $\sigma_i$ is also Normal-gamma.

Step 3: drawing H
The transition matrix $H$ can also be drawn row by row. Take the $i$th transition equation. Conditioning on $\mu_{i,t}$ and on the $i$th diagonal element of the $Q_t$ matrix ($q_{i,t}$), this is a regression with heteroscedastic residuals. Since $q_{i,t}$ is a variance, the residuals can be whitened by dividing by the square root of $q_{i,t}$. Positing a Normal prior for the regression coefficients, the conditional posterior is also Normal.
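The whitening and the conjugate Normal update described in this step can be sketched as follows. This is a minimal illustration, assuming one autoregressive row of the transition equation, a known path of shock variances and a Normal prior. The function name and arguments are hypothetical, and the sketch does not impose stationarity on the draw.

```python
import numpy as np

def draw_ar_coefficients(x, q, p, prior_mean, prior_var, rng):
    """One Gibbs draw of AR(p) coefficients given the state path x and shock variances q.

    Returns the draw and the posterior mean."""
    T = len(x)
    y = x[p:]
    # Column l-1 holds the l-th lag of x, aligned with y.
    X = np.column_stack([x[p - l : T - l] for l in range(1, p + 1)])
    # Whiten: divide both sides by the standard deviation of the shock.
    w = 1.0 / np.sqrt(q[p:])
    y_w, X_w = y * w, X * w[:, None]
    # Normal prior combined with a unit-variance Gaussian likelihood.
    prior_prec = np.linalg.inv(prior_var)
    post_var = np.linalg.inv(prior_prec + X_w.T @ X_w)
    post_mean = post_var @ (prior_prec @ prior_mean + X_w.T @ y_w)
    return rng.multivariate_normal(post_mean, post_var), post_mean
```

In a full sampler a draw implying explosive dynamics would typically be rejected and redrawn; that step is left out here.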

Steps 4 and 5: drawing the stochastic volatilities
There are a number of methods for drawing the stochastic volatilities λ i,t and the variance σ λ,i . We employ the  algorithm, which involves drawing from a log-normal density and a Metropolis acceptance step. Details on the algorithm can be found in , Appendix B.2.5.
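A single-move sampler of the kind described here (a log-normal proposal for the variance followed by a Metropolis acceptance step) can be sketched as below. Whether it matches the cited algorithm in every detail is an assumption. The proposal is the conditional prior implied by the random walk, so the acceptance ratio reduces to a likelihood ratio. The function names are hypothetical, and `lam[0]` is kept at its fixed initial condition.

```python
import numpy as np

def _log_lik(lam_t, eps_t):
    # Log density of eps_t ~ N(0, exp(lam_t)), up to an additive constant.
    return -0.5 * lam_t - 0.5 * eps_t ** 2 * np.exp(-lam_t)

def single_move_sweep(lam, eps, sigma_lam, rng):
    """One sweep over the log-volatilities lam[1..T] given residuals eps[0..T-1].

    eps[t-1] is the residual of period t, with variance exp(lam[t])."""
    lam = lam.copy()
    T = len(lam) - 1
    for t in range(1, T + 1):
        if t < T:
            # Random-walk prior conditional on both neighbours.
            mean = 0.5 * (lam[t - 1] + lam[t + 1])
            var = 0.5 * sigma_lam ** 2
        else:
            # Last period: only the previous value is available.
            mean, var = lam[t - 1], sigma_lam ** 2
        # A Normal draw for the log-variance is a log-normal draw for the variance.
        prop = mean + np.sqrt(var) * rng.standard_normal()
        log_ratio = _log_lik(prop, eps[t - 1]) - _log_lik(lam[t], eps[t - 1])
        if np.log(rng.uniform()) < log_ratio:
            lam[t] = prop
    return lam
```

Conditional on the resulting path, the innovation variance of the random walk can be updated from the squared first differences of `lam`, which is the second of the two steps named in the heading.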

Step 6: drawing µ t

Conditioning on all the other parameters and on the data, draws of the state vector can be obtained via a state vector simulation smoother as in  or with the disturbance smoother proposed by . We resort to the latter, which turns out to be slightly more efficient from a computational point of view.

Empirical application

[...] Table 1 were selected from a large pool of candidate series adapting the algorithm used by  to our Bayesian setting. More details can be found in Appendix B. The empirical specification of the model also follows closely the one proposed by . In particular we use a single common factor that summarizes the current state of the business cycle. In this setting the Industrial Production indexes and the interest rate spread load on the common factor contemporaneously. Survey data, on the other hand, are treated as if they were in phase with the year-on-year growth rate of the Industrial Production index, therefore loading on an 11-term moving average of the common factor.[6] We also let the bilateral exchange rate enter the model in year-on-year percentage growth, the rationale being that pricing to market is likely to buffer temporary short-term exchange rate movements with a variation in profit margins, so that only more persistent fluctuations impact on economic growth. The exact specification of the state space matrices can be found in Appendix C.

[6] We also experimented with a different specification in which we relaxed this restriction and let the survey indicators load freely on 12 distributed lags of the common factor. This modification slightly worsened the results. Our intuition for this is that, as the model is already heavily parametrized, restricting the model space leads to more efficient estimates. Also, notice that our setup allows for serial correlation in the idiosyncratic components, so that any phase shift induced by our restriction will be picked up by the AR(2) structure of the idiosyncratic terms. Also, a model featuring two factors like in  does not improve upon our benchmark specification.
Our empirical analysis proceeds as follows. After a brief discussion of how the priors are set (4.1), we estimate the model on the full sample to gauge the relative contributions of the various indicators to the common factor and also to evaluate whether the model actually captures any significant shifts in the variance of the common and idiosyncratic errors (4.2). We then turn to three empirical exercises. The first one regards the typical situation of a forecaster who is required to update her forecasts at each release of new data. In this context we start by replicating with our model the analysis of news performed by GRS and evaluate how point forecast accuracy is affected by data releases. We take advantage of the Bayesian nature of our model and extend the GRS results to examine how new data affect density forecast accuracy and the width of forecast intervals (4.3). We then turn to a different concept of news, introduced in the literature by . We show how our setup adds a new dimension to their tool, as one can use draws from the posterior to derive a measure of uncertainty around the news content of each data (or block of data) release (4.4). Our third set of results concerns a fully fledged out-of-sample forecast exercise in which we assess the point and density forecasting performance of our model (4.5).

Priors
To set the prior hyperparameters we follow Primiceri (2005) and retain a three-year training sample. Since the model features an unobserved component that is common to the indicators, we start by getting an initial estimate of the common factor, $f^{start}_t$, as the cross-sectional average of the monthly indicators over this training sample. Conditioning on this, we then get an estimate of the factor loadings with an OLS regression of the indicators on $f^{start}_t$. The prior distributions of the factor loadings are then centered around this $\hat\beta_{OLS}$. The prior is very loose, as we set the variance to $10^3 \, V(\hat\beta_{OLS})$. By regressing the residuals of these OLS regressions, $u_{OLS,t}$, on their first two lags we also obtain an OLS estimate of the autoregressive parameters of the idiosyncratic shocks to the observable indicators. The prior distributions of the $\phi$s are then centered on this estimate and their variance is set to $10^3 \, V(\hat\phi_{i,OLS})$, where $i = q, h, s$. Similarly, we use this training sample estimate of the factor, $f^{start}_t$, to set the prior mean and variance of $\phi_{f,1}$, $\phi_{f,2}$. Finally, we need to set the degrees of freedom and the scales of the prior inverse-Gamma distributions for the variances of the idiosyncratic shocks. In all the IG distributions we set the degrees of freedom to 1, the minimum required to make the prior distributions proper while keeping the weight of the prior as low as possible. The prior scale parameters are set at the sum of squared residuals of the OLS estimates obtained on the training sample data. The Gibbs sampler is initialized at the prior means.
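The training-sample procedure can be sketched in a few lines. The version below is a simplified illustration under stated assumptions: the indicators are already transformed and comparable in scale, an intercept is included in the loading regression, and the function name, dictionary keys and the `scale` argument (the factor of 10^3) are hypothetical.

```python
import numpy as np

def training_sample_priors(Y, scale=1e3, p=2):
    """Y: (T, k) array of monthly indicators over the training sample."""
    f_start = Y.mean(axis=1)                       # cross-sectional average as initial factor
    X = np.column_stack([np.ones(len(f_start)), f_start])
    XtX_inv = np.linalg.inv(X.T @ X)
    priors = []
    for j in range(Y.shape[1]):
        # OLS regression of indicator j on the initial factor estimate.
        coef, *_ = np.linalg.lstsq(X, Y[:, j], rcond=None)
        resid = Y[:, j] - X @ coef
        s2 = resid @ resid / (len(resid) - X.shape[1])
        V = s2 * XtX_inv
        # OLS regression of the residuals on their first p lags.
        Z = np.column_stack([resid[p - l : len(resid) - l] for l in range(1, p + 1)])
        phi, *_ = np.linalg.lstsq(Z, resid[p:], rcond=None)
        priors.append({
            "beta_mean": coef[1],                  # prior mean of the loading
            "beta_var": scale * V[1, 1],           # inflated OLS variance
            "phi_mean": phi,                       # prior mean of the AR coefficients
            "ssr": float(resid @ resid),           # scale of the inverse-Gamma prior
        })
    return f_start, priors
```

A useful sanity check is that, because the initial factor is the average of the indicators, the OLS loadings average to one across indicators.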

Full sample results: loadings and volatilities
In this section we report the estimation results of our model for the entire sample, which starts in January 1991 and ends in May 2011. Appendix D assesses the convergence of the Markov Chain to the ergodic distribution based on the inefficiency factors. A first evaluation of the relative importance of the indicators that are included in the model is given by the posterior estimates of the factor loadings ($\beta$), which are shown in Table 2. The highest posterior median weight (0.49) is given to the Industrial Production index, followed by GDP (0.38) and by the Industrial Production index in the Pulp and Paper sector. Survey data receive roughly the same weight (around 0.1), with a slight prevalence given to the PMI and the weakest contribution coming from the Michigan US Consumer Survey. The annual rate of change of the euro-dollar exchange rate and the US spread have a counter-cyclical effect on GDP. The sign of these two parameters is easily rationalized by considering that these indicators typically lead the business cycle, so that their correlation with current cyclical conditions (measured by the common factor) is negative.
An alternative way to look at the relative importance of the different indicators in the estimation of the business cycle is given by the forecast weights used by , first derived by . Unlike factor loadings, Kalman filter based forecast weights take into account the different timeliness of the indicators and are time varying. To see how these weights are computed, consider the equation that is used in the Kalman filter to update the estimate of the unobserved states upon the arrival of new information at time $t$:

$$\mu_{t|t} = \mu_{t|t-1} + K_t v_t \qquad (15)$$

where $\mu_{t|t-1}$ is the forecast of the states based on the information set available at time $t-1$, $K_t$ is the Kalman gain and $v_t$ is a forecast error defined below. Now, from the measurement and transition equations the following can be derived:

$$v_t = y_t - F \mu_{t|t-1} \qquad (16)$$

$$\mu_{t|t-1} = H \mu_{t-1|t-1} \qquad (17)$$

Plugging (16) and (17) in (15), an autoregressive representation of the filtered states can be obtained:

$$\mu_{t|t} = (I - K_t F) H \mu_{t-1|t-1} + K_t y_t \qquad (18)$$

Inverting equation (18) one obtains the moving average representation of the unobserved states as a function of the observed variables. Time variation in these weights stems from the time varying nature of the Kalman gain $K_t$. Given the ragged edge nature of data releases, when timely (survey) data are published they will receive relatively more weight than lagged (hard) indicators. Table 3 shows these weights estimated using the last available vintage, that is May 2011.[7] Given publication lags, GDP growth is known only for the first quarter of 2011 and available Industrial Production data refer to March 2011. The following results emerge:

1. In the last month of each quarter, when GDP is observed, the Kalman filter matches the GDP forecast with the actual figure, so that all the weight is given to GDP.

2. In the first and second month of the quarter, when GDP is unobserved but all the monthly indicators are known (the dataset is "balanced"), over half of the estimate of (unobserved) real activity growth depends on the two Industrial Production indexes. Soft indicators play a minor role, with a relatively stronger contribution coming from the Economic Sentiment Indicator.

3. In the months when neither GDP nor Industrial Production data are available, the strongest contribution to the GDP forecast comes from the Economic Sentiment Indicator, followed by the PMI.

[7] The weights are scaled as contributions to the forecasts of GDP.
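The inversion of the filtering recursion into weights on current and past observations can be illustrated with a small, generic Kalman filter. The sketch is not the paper's implementation: it assumes a time-invariant system, a zero initial state, and a measurement noise matrix `R` that the model in the text does not have (it is there only to keep the example well conditioned). Missing observations are coded as NaN and receive a zero column in the gain.

```python
import numpy as np

def kalman_gains(y, F, H, Q, R, P0):
    """Run the filter; return filtered states and per-period gains."""
    T, n = y.shape
    m = H.shape[0]
    mu, P = np.zeros(m), P0.copy()
    gains, filtered = [], []
    for t in range(T):
        mu_pred, P_pred = H @ mu, H @ P @ H.T + Q
        obs = ~np.isnan(y[t])
        K = np.zeros((m, n))                     # zero columns for missing series
        if obs.any():
            Fo = F[obs]
            S = Fo @ P_pred @ Fo.T + R[np.ix_(obs, obs)]
            K[:, obs] = P_pred @ Fo.T @ np.linalg.inv(S)
        y_t = np.where(obs, y[t], 0.0)
        mu = mu_pred + K @ (y_t - F @ mu_pred)   # update step
        P = (np.eye(m) - K @ F) @ P_pred
        gains.append(K)
        filtered.append(mu)
    return np.array(filtered), np.array(gains)

def observation_weights(gains, F, H, t):
    """Weights W[s] such that mu_{t|t} = sum_s W[s] @ y_{t-s}, given a zero initial state."""
    m = H.shape[0]
    A = np.eye(m)
    W = []
    for s in range(t + 1):
        W.append(A @ gains[t - s])
        A = A @ (np.eye(m) - gains[t - s] @ F) @ H
    return W
```

Summing `W[s] @ y[t - s]` over `s`, with missing values replaced by zero, reproduces the filtered state at time `t`, which is a convenient check that the weights have been computed correctly.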
The factor model delivers a smoothed monthly estimate of GDP growth, which we show in Figure [...]

We next look at the hard indicators that receive the largest weights in the estimation of the common factor (GDP and IP). Visual inspection of the variances of the idiosyncratic shocks to these two indicators reveals that volatility has been rather stable over most of the sample, with the exception of the latest recession, when it surged significantly until 2008 to fall thereafter.
Finally, the variance of the US spread shows a slight upward trend during the 1990s and a much more persistent increase during the 2007/2009 recession, consistent with the financial origins of the recent economic downturn.

News and forecasts 1
Given the mixed-frequency nature of our model, GDP forecasts are continuously updated as new monthly data become available. The impact of data releases on forecast revisions can be assessed using the methodology developed by GRS. To clarify the spirit of the exercise, the concept of vintage needs to be formally introduced.
The $\Omega_{v_j}$ vintage is defined as:

$$\Omega_{v_j} = \{\, y_{i,t} : t = 1, \ldots, T_{i,v_j};\ i = 1, \ldots, n \,\}$$

that is, the information set $\Omega_{v_j}$ is composed of $n$ indicators available from month 1 to month $T_{i,v_j}$, where the date for which the last observation is available varies across indicators. Within our model, a GDP forecast is obtained as an expectation of future GDP conditional on this information set.
Now consider a new vintage $\Omega_{v_{j+1}}$, which differs from the previous one for the release of a new observation of the $i$th indicator:

$$\Omega_{v_{j+1}} = \Omega_{v_j} \cup \{\, y_{i,\,T_{i,v_j}+1} \,\}$$

The updated information entails a change in the conditioning set and, consequently, a forecast revision. Notice that we work with final data vintages in a pseudo real time context, that is, we do not consider data revisions but only new end-of-sample releases. This means that, starting from a given point in time, we let the information set gradually expand, one indicator at a time.
Data releases can occur at different intervals within the month but, for simplicity, following GRS we set up a stylized calendar in which the order of release of the various indicators is kept fixed within the month. The stylized calendar is shown in [...] indication as to whether the model forecast gains confidence as new information accrues and the forecast horizon decreases. Notice that this notion of forecast confidence is not strictly related to forecast accuracy, since the model might produce narrower confidence bands around its central forecast while the forecast distribution moves further away from, rather than closer to, the target as the information set expands. Second, we move beyond point forecast accuracy and evaluate the evolution of density forecast accuracy. To this end we use the log-score, that is the logarithm of the predictive density generated by the model evaluated at the outturn of the series. Since the log-score measures the probability that the model assigns to the actual value prior to its realization, we expect to see higher log-scores as the information set expands.
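When the predictive density is available only through simulated draws, the log-score has to be approximated. One simple possibility, shown here as a sketch and not necessarily the approach used in the paper, is a Gaussian kernel density estimate with the normal reference bandwidth. The function name is hypothetical.

```python
import numpy as np

def log_score(draws, outturn):
    """Log of a kernel estimate of the predictive density evaluated at the outturn."""
    draws = np.asarray(draws, dtype=float)
    n = draws.size
    # Normal reference rule for the bandwidth of a Gaussian kernel.
    h = 1.06 * draws.std(ddof=1) * n ** (-0.2)
    z = (outturn - draws) / h
    dens = np.mean(np.exp(-0.5 * z ** 2)) / (h * np.sqrt(2.0 * np.pi))
    return np.log(dens)
```

An outturn close to the centre of the predictive draws yields a higher log-score than one in the tails, and a tighter predictive distribution centred on the outturn scores higher than a diffuse one.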
We view these three tools (Mean Squared Errors, confidence interval width and log-scores) as strongly complementary in the evaluation of the impact of news on GDP forecasts. Provided that point forecast accuracy increases with the arrival of more information (i.e. provided that the MSE falls), it is desirable to have less uncertainty around the forecast (i.e. it is desirable to have narrower confidence intervals), but this decrease in uncertainty must not come at the cost of lower density forecast accuracy (i.e. the log-score must rise). [...] We are, however, reassured by two considerations. First, the differences between the in-sample and the out-of-sample analysis presented by GRS are minimal, so that the marginal impact of data releases seems to depend on the timeliness of the releases and on the forecast horizon much more than on parameter uncertainty (inherently higher in an out-of-sample setting). Second, in section 4.5 we provide genuine out-of-sample evidence that successive data releases increase point forecast accuracy, although in a simplified setting in which, instead of considering each series separately, we consider only two partitions of the information set, i.e. hard and soft data.
The timing of the analysis is the following. We consider releases from January 2004 to May 2011 and, in line with GRS, we forecast each quarter from the first month of the quarter to the first month of the subsequent one, that is we compute three nowcasts and one backcast.
For each month we update the vintages sequentially according to our stylized calendar, sample 1000 draws from the posterior, run the Kalman filter and smoother and, for each posterior draw, produce nine GDP estimates, corresponding to the release of each of the nine indicators, and consequently nine forecast errors. We compare our model forecasts with those of a naive constant growth model.
In Figure 3 we [...]

We next assess the evolution of forecast confidence over the forecasting cycle. Since for each of the nine data releases we have an entire distribution of GDP forecasts, we can gauge forecast confidence by measuring the dispersion of these forecasts. As a measure of statistical dispersion we use the standardized interquartile range, that is the difference between the 75th and the 25th percentiles standardized by the median. We choose the interquartile range since it has some desirable statistical properties: in particular, it is a robust statistic (i.e. it is not affected by outliers) and in a symmetric distribution it equals twice the median absolute deviation.
The evolution of the interquartile range over the forecast cycle is depicted in Figure 4. Two results are worth stressing. First, the chart reveals a clear downward tendency in the dispersion of GDP estimates, indicating that the confidence that the model places on its GDP forecasts increases as conjunctural information accrues. Second, soft data play an important role in driving the reduction in forecast dispersion, especially at the very beginning of the forecast cycle when a strong fall in forecast uncertainty occurs as the first surveys become available.
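The dispersion measure is straightforward to compute from the forecast draws. In the sketch below the function name is hypothetical, and dividing by the absolute value of the median is an assumption made so that the measure stays positive when the median forecast is negative.

```python
import numpy as np

def standardized_iqr(draws):
    """Interquartile range of the forecast draws divided by the (absolute) median."""
    q25, q50, q75 = np.percentile(draws, [25, 50, 75])
    return (q75 - q25) / abs(q50)
```

The measure is undefined when the median forecast is exactly zero, a case that would need separate handling in practice.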
Finally, in Figure 5 we show the evolution of the log-score (crossed line) together with the log-score obtained with the constant growth model (dotted line). Consistent with the point forecast accuracy results, density forecast accuracy increases monotonically at the release of each new indicator, suggesting that as the forecast horizon shortens the model assigns (ex ante) a progressively higher probability to the actual GDP releases.

News and forecasts 2
In a recent paper  propose and derive an alternative way to map news directly into forecast revisions. They motivate this alternative measure of news by noticing that in factor models the forecast of the unobserved factors is a weighted average of present and past observable indicators, with weights endogenously assigned by the Kalman smoother.
When the information set is enriched by a new release, the Kalman smoother incorporates the new information by revising the weights assigned to all the available indicators, making it impossible to discern whether an improvement in forecast accuracy is due to the new release or to a revision of the weights assigned to other indicators. They therefore devise a way to dissect more precisely the contribution of each release to forecast revisions. Their method, whose technical details are described in Appendix E, is of particular interest in cases when, instead of considering the release of a single indicator, a whole block of data is released and the contribution of the news content of each single indicator needs to be assessed.
Our setup, by providing a quantification of the uncertainty surrounding the news content of a new data (or block of data) release, provides a more complete picture of the forecast revision implied by the intra-monthly information flow.
We illustrate this point using as a case study the GDP forecast of the second quarter of 2010.
We start nowcasting this GDP release in the first half of April, when the February Industrial Production numbers become available. We update our forecasts twice a month until the first half of August, right before the first GDP estimate is published. The first twice-monthly update [...] To complement the analysis with a measure of uncertainty on both (1) the overall revision implied by the release of an entire data block and (2) the contribution of each indicator to such revision, at each twice-monthly update of our information set we draw 1000 forecasts from the predictive density and map each of these forecasts onto the news.
In Figure 7 we report estimated kernel densities of the overall revision to the forecast due to the release of 'hard' (upper panel) and 'soft' (lower panel) data between April and July.
To show what the individual contributions look like we report in Figure 8 similar densities for two selected indicators, namely Industrial production and the Economic Sentiment Indicator, which appear to be responsible for most of the revisions over the forecast cycle.
From the comparison of these distributions with the information provided in Figure 6, the importance of having a tool to identify the credibility of forecast updates emerges quite clearly.
In the second half of April and May, for example, the model picks up first a strong upward, then a strong downward revision due to the release of survey data, which can largely be attributed [...]

Out of sample forecasting performance
The last empirical analysis we conduct is a pseudo out-of-sample forecast exercise. The design of the exercise is similar in spirit to the sequence of forecast updates discussed in the previous section. In particular, for each quarterly GDP release we provide eight forecasts, starting from six months before the end of the quarter of interest to one month afterwards (backcast). Taking as a target, for example, the third quarter of each year, we produce the first forecast in March and the last one in October. We update each of these projections twice a month, when, respectively, [...] its RMSFE falls below 1) earlier in the forecast cycle. The gap between the two RMSFEs shrinks considerably when the previous quarter GDP is released (mid-month update of the second nowcast) and the baseline model can use the first order GDP autocorrelation to adjust the nowcast.
Turning to density forecast evaluation, we look first at coverage rates, that is, the frequency with which the actual outcome falls within a given confidence interval. Given the popularity of confidence intervals and fan charts in Central Banks and among forecasters, this seems a natural starting point. If the model produces a density forecast that matches well the underlying unknown density function that has generated the data, one can expect the actual coverage rate to equal the nominal one. For example, one should find that in our out-of-sample exercise GDP growth fell 10% of the time within our 10% confidence interval, 20% of the time within our 20% confidence interval and so forth. To gauge uncertainty we run a t-test on the null hypothesis that the actual coverage equals the nominal one. (As emphasized by , this test is slightly imprecise as it abstracts from parameter uncertainty.) In Table 5 we report the coverage rate for the baseline model (without stochastic volatility). We look at backcast (projections one month after the end of the quarter), nowcast (projections during the quarter), and 1 step ahead forecast (projections for the next quarter).
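The coverage check described above can be sketched as follows. This is a minimal illustration rather than the code used in the paper: the function name `coverage_test`, the plain t-test on the hit indicators (with no correction for serial correlation), and the simulated outcomes are all assumptions made for the example.

```python
import numpy as np
from scipy import stats


def coverage_test(actual, lower, upper, nominal):
    """Empirical coverage of an interval forecast and a t-test of
    H0: coverage == nominal (hypothetical helper, not from the paper)."""
    actual = np.asarray(actual, dtype=float)
    # Hit indicator: 1 if the outcome falls inside the interval, 0 otherwise
    hits = ((actual >= lower) & (actual <= upper)).astype(float)
    rate = hits.mean()
    se = hits.std(ddof=1) / np.sqrt(hits.size)
    if se == 0.0:
        # All hits or all misses: the t-statistic is undefined
        return rate, np.nan, np.nan
    t_stat = (rate - nominal) / se
    p_value = 2.0 * stats.t.sf(abs(t_stat), df=hits.size - 1)
    return rate, t_stat, p_value


# Synthetic example (assumption): outcomes and forecast density are both N(0, 1)
rng = np.random.default_rng(0)
actual = rng.standard_normal(200)
half_width = stats.norm.ppf(0.85)  # central 70% interval of N(0, 1)
rate, t_stat, p_value = coverage_test(actual, -half_width, half_width, 0.70)
print(f"coverage={rate:.2f}, t={t_stat:.2f}, p={p_value:.2f}")
```

A small p-value would indicate that the interval is either too wide or too narrow relative to its nominal level.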
It is clear that the baseline model produces confidence intervals that are far too wide for the backcast and for the one quarter ahead forecast. In three and four cases out of ten, respectively, according to the p-values these differences are statistically significant at the 10% significance level. In the case of the nowcast the situation improves, with the tests rejecting only once. Finally, we look at the normalized probability integral transforms (PITs) of the forecast errors, another popular tool for evaluating density forecasts. According to the testing framework developed by , if the model forecast density matches the density that generated the data, the normalized PITs should be independent standard normal. We follow  and test these conditions (zero mean, unit variance and no serial correlation) separately and jointly, with the Berkowitz (2001) likelihood ratio test for the joint null of zero mean, unit variance, and no serial correlation of order 1.
The p-values of these tests are presented in Table 7 for the baseline model without stochastic volatility and in Table 8
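For concreteness, the joint test on the normalized PITs can be sketched as below. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, the PITs are simulated, and the likelihood conditions on the first observation, which is a simplification of the exact likelihood used in Berkowitz (2001).

```python
import numpy as np
from scipy import stats


def pit_from_draws(draws, actual):
    """PIT of an outcome: share of predictive draws at or below it."""
    return np.mean(np.asarray(draws) <= actual)


def berkowitz_lr(pits):
    """LR test of zero mean, unit variance and no AR(1) correlation in the
    normalized PITs (conditional-likelihood approximation, an assumption)."""
    # Normalize the PITs with the inverse standard normal CDF
    z = stats.norm.ppf(np.clip(pits, 1e-10, 1 - 1e-10))
    y, x = z[1:], z[:-1]
    # Unrestricted model: z_t = c + rho * z_{t-1} + e_t, e_t ~ N(0, sigma2)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / y.size  # maximum likelihood variance estimate
    ll_unrestricted = stats.norm.logpdf(resid, scale=np.sqrt(sigma2)).sum()
    # Restricted model: c = 0, rho = 0, sigma2 = 1
    ll_restricted = stats.norm.logpdf(y).sum()
    lr = -2.0 * (ll_restricted - ll_unrestricted)
    return lr, stats.chi2.sf(lr, df=3)


# Synthetic example (assumption): PITs of a correctly specified density
rng = np.random.default_rng(1)
pits = rng.uniform(size=200)
lr, p_value = berkowitz_lr(pits)
print(f"LR={lr:.2f}, p-value={p_value:.2f}")
```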

Conclusions
This paper introduces a mixed frequency factor model with stochastic volatility, and develops a Bayesian procedure for its estimation. The model deals with all the challenges faced by a forecaster who needs to produce updated quarterly GDP forecasts at each relevant data release, such as data sampled at different frequencies and ragged-edge data. Unlike existing linear models, our setup allows for continuous shifts in the volatility of the innovations to both the common factor and the idiosyncratic components, a feature that in the macro forecasting literature has been shown to improve both point and density forecast accuracy.
This measurement tool is applied to the problem of forecasting euro area GDP at short horizons. When estimated over the whole sample, the model picks up significant shifts in the volatility of the errors, with two peaks coinciding with the major recessionary episodes of the past twenty years.
We further illustrate how, in a given quarter, the factor model can be used to assess the uncertainty around the news content of monthly releases of hard, soft and financial indicators.
Consistent with findings in the literature, we find that forecast accuracy improves significantly in connection with the release of monthly data as the forecast horizon decreases. Also, forecast uncertainty (measured by the width of the forecast distribution) progressively decreases as more information on the quarter of interest becomes available.
Finally, we design a (pseudo) real time out of sample forecasting exercise and evaluate out of sample point and density forecast accuracy. In line with Clark (2011) we find that the introduction of stochastic volatility significantly contributes to an improvement in density forecast accuracy.

A Details of the Gibbs sampler
We describe in more detail the six blocks that compose our Gibbs sampler procedure.

A.1 Block 1: drawing the factor loadings $\beta_q$, $\beta_h$, $\beta_s$
In the first block of the Gibbs sampler we draw the slopes, conditional on all the other parameters of the model. To see how this is done let us start from the measurement equation of the hard indicator:
$$y_{h,t} = \beta_h f_t + u_{h,t},$$
where the law of motion of the idiosyncratic shock is $u_{h,t} = \phi_{h,1} u_{h,t-1} + \phi_{h,2} u_{h,t-2} + \epsilon_{h,t} e^{\lambda_{h,t}/2}$ and $\epsilon_{h,t} \sim N(0, \sigma_h^2)$. Since we are conditioning on all the parameters, on the factor $f_t$ and on the stochastic volatilities $\lambda_{h,t}$, we can treat this equation as a simple regression with autocorrelated and heteroscedastic residuals. To whiten the residuals, we quasi-difference the equation by filtering both sides with the filter $1 - \phi_{h,1} L - \phi_{h,2} L^2$ and dividing each observation by $e^{\lambda_{h,t}/2}$:
$$y^*_t = \beta_h x_t + \epsilon_{h,t},$$
where $y^*_t = (1 - \phi_{h,1} L - \phi_{h,2} L^2) y_{h,t} / e^{\lambda_{h,t}/2}$ and $x_t = (1 - \phi_{h,1} L - \phi_{h,2} L^2) f_t / e^{\lambda_{h,t}/2}$. Then, positing a Normal prior, $\beta_h$ can be drawn from the implied Normal posterior. The case of survey variables can be treated accordingly after noticing that $x_t = (1 - \phi_{s,1} L - \phi_{s,2} L^2) \sum_{j=0}^{11} f_{t-j} / e^{\lambda_{s,t}/2}$.
In the case of quarterly variables two adjustments are needed. First, since the variable is observed only every three months, only these observations can be used for estimating the factor loading. Second, in the measurement equation an MA(4) regression error appears:
$$y_{q,t} = \beta_q w(L) f_t + w(L) u_{q,t}, \qquad (25)$$
where $w(L) = \frac{1}{3} + \frac{2}{3} L + L^2 + \frac{2}{3} L^3 + \frac{1}{3} L^4$. Furthermore the error term $u_{q,t}$ is an AR(2) process, $u_{q,t} = \phi_{q,1} u_{q,t-1} + \phi_{q,2} u_{q,t-2} + \epsilon_{q,t} e^{\lambda_{q,t}/2}$. Our estimation strategy consists of working out the variance covariance matrix of the error terms of equation (25), $\Phi(\phi_{q,1}, \phi_{q,2}, \sigma^2_q)$, which we can treat at this step of the sampler as if it were known. Then it suffices to divide each observation by $e^{\lambda_{q,t}/2}$ and premultiply both sides of the equation by $\Phi^{-1/2}$ to obtain a standard regression with uncorrelated residuals. We are now in the familiar setting in which we can posit a normal prior and draw $\beta_q$ from a normal posterior.
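For the hard-indicator case, the whitening step and the conjugate Normal draw can be sketched as follows. This is a minimal sketch, not the authors' code: the function name `draw_loading`, the prior values and the simulated data are assumptions made for illustration.

```python
import numpy as np


def draw_loading(y, f, phi, lam, sigma2, prior_mean, prior_var, rng):
    """Draw a factor loading from its Normal posterior after
    quasi-differencing and volatility scaling (hypothetical helper)."""
    scale = np.exp(lam[2:] / 2.0)
    # Apply the filter 1 - phi1 L - phi2 L^2 and divide by exp(lam_t / 2)
    y_star = (y[2:] - phi[0] * y[1:-1] - phi[1] * y[:-2]) / scale
    x_star = (f[2:] - phi[0] * f[1:-1] - phi[1] * f[:-2]) / scale
    # Standard conjugate update for a regression with known error variance
    post_var = 1.0 / (1.0 / prior_var + x_star @ x_star / sigma2)
    post_mean = post_var * (prior_mean / prior_var + x_star @ y_star / sigma2)
    return post_mean + np.sqrt(post_var) * rng.standard_normal()


# Simulated data (assumption): true loading 0.8, constant volatility
rng = np.random.default_rng(2)
T, beta_true, phi, sigma2 = 2000, 0.8, (0.5, -0.2), 1.0
f = rng.standard_normal(T)
lam = np.zeros(T)
u = np.zeros(T)
for t in range(2, T):
    shock = np.sqrt(sigma2) * rng.standard_normal() * np.exp(lam[t] / 2.0)
    u[t] = phi[0] * u[t - 1] + phi[1] * u[t - 2] + shock
y = beta_true * f + u

beta_draw = draw_loading(y, f, phi, lam, sigma2,
                         prior_mean=0.0, prior_var=10.0, rng=rng)
print(f"posterior draw of the loading: {beta_draw:.3f}")
```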
A.2 Block 2: drawing $\phi_{f,1}$, $\phi_{f,2}$, $\phi_{q,1}$, $\phi_{q,2}$, $\phi_{h,1}$, $\phi_{h,2}$, $\phi_{s,1}$, $\phi_{s,2}$
To draw the parameters that govern the autocorrelation of the idiosyncratic shocks, first notice that since we are conditioning on the state vector $\mu_t$, we can treat the common factor $f_t$ and the residuals $u_{q,t}$, $u_{h,t}$, $u_{s,t}$ as known. The transition equations become standard regression problems which can be analyzed separately (again after pre-whitening to take into account the stochastic volatility components). We employ normal priors $[\phi_{j,1}\ \phi_{j,2}]' \sim N(\phi_j, \Sigma_{\phi,j})$, where $j = f, q, h, s$, and for each equation we draw from the respective normal conjugate posteriors.
We rule out explosive roots by drawing from the untruncated Normal posterior and discarding draws for which the roots of $\phi_j(L) = 1 - \phi_{j,1} L - \phi_{j,2} L^2 = 0$ do not all lie outside the unit circle.
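The accept/reject step can be illustrated with the sketch below. The function names, the cap on the number of attempts (`max_tries`), the prior values and the simulated residuals are assumptions introduced for the example, not details taken from the paper.

```python
import numpy as np


def is_stationary(phi):
    """True if all roots of 1 - phi1 z - phi2 z^2 = 0 lie outside the unit circle."""
    roots = np.roots([-phi[1], -phi[0], 1.0])  # highest power first
    return bool(np.all(np.abs(roots) > 1.0))


def draw_ar2(u, lam, sigma2, prior_mean, prior_cov, rng, max_tries=1000):
    """Draw AR(2) coefficients from the Normal posterior, discarding
    non-stationary draws (hypothetical helper)."""
    scale = np.exp(lam[2:] / 2.0)
    y = u[2:] / scale                                   # whitened regressand
    X = np.column_stack([u[1:-1], u[:-2]]) / scale[:, None]
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / sigma2)
    for _ in range(max_tries):
        phi = rng.multivariate_normal(post_mean, post_cov)
        if is_stationary(phi):
            return phi
    raise RuntimeError("no stationary draw found")


# Simulated AR(2) residuals with constant volatility (assumption)
rng = np.random.default_rng(3)
T = 500
u = np.zeros(T)
for t in range(2, T):
    u[t] = 0.5 * u[t - 1] - 0.2 * u[t - 2] + rng.standard_normal()

phi_draw = draw_ar2(u, np.zeros(T), 1.0, np.zeros(2), np.eye(2), rng)
print(phi_draw)
```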
A.3 Block 3: drawing the innovation variances $\sigma^2_f$, $\sigma^2_q$, $\sigma^2_h$, $\sigma^2_s$
The variances of the innovations to the idiosyncratic shocks can also be easily drawn once we condition on the state vector $\mu_t$, on the $\phi$'s and on the stochastic volatilities. We again proceed by treating the transition equations one at a time. Let us consider a generic element of the state vector, $\mu_{i,t}$. Its law of motion is:
$$\mu_{i,t} = \phi_{i,1} \mu_{i,t-1} + \phi_{i,2} \mu_{i,t-2} + \epsilon_{i,t} e^{\lambda_{i,t}/2}, \qquad \epsilon_{i,t} \sim N(0, \sigma^2_i).$$
For the innovation variance $\sigma^2_i$ we posit an inverse-Gamma prior $p(\sigma^2_i) = IG(n_i, s^2_i)$. Since the prior is conjugate it can be interpreted as adding $n_i$ artificial observations to the state variable $\mu_{i,t}$. The prior embodies the belief that the sum of squared residuals of these artificial observations equals $s^2_i$. Given our assumption that the idiosyncratic shocks are normal, the posterior is also an inverse-Gamma, $IG(T + n_i, s^2_i + \sum_{t} \hat{\epsilon}^2_{i,t})$, where $\hat{\epsilon}_{i,t} = (\mu_{i,t} - \phi_{i,1} \mu_{i,t-1} - \phi_{i,2} \mu_{i,t-2}) / e^{\lambda_{i,t}/2}$. The weight of the prior is therefore proportional to the prior degree of freedom parameter $n_i$.
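A draw from this inverse-Gamma posterior can be sketched as follows. The sketch assumes that $IG(n, s^2)$ is parametrized by degrees of freedom $n$ and sum of squares $s^2$, so that it corresponds to shape $n/2$ and scale $s^2/2$; this mapping, the function name and the simulated residuals are assumptions made for the example.

```python
import numpy as np


def draw_innovation_variance(resid, lam, n_prior, s2_prior, rng):
    """Draw sigma^2 from its inverse-Gamma posterior (hypothetical helper).

    Assumes IG(n, s2) means shape n / 2 and scale s2 / 2.
    """
    # Remove the stochastic volatility component from the residuals
    std_resid = resid / np.exp(lam / 2.0)
    shape = 0.5 * (n_prior + std_resid.size)
    scale = 0.5 * (s2_prior + std_resid @ std_resid)
    # If G ~ Gamma(shape, 1) then scale / G ~ inverse-Gamma(shape, scale)
    return scale / rng.gamma(shape)


# Simulated residuals with true variance 0.25 and constant volatility (assumption)
rng = np.random.default_rng(4)
T = 5000
resid = 0.5 * rng.standard_normal(T)
sigma2_draw = draw_innovation_variance(resid, np.zeros(T),
                                       n_prior=5, s2_prior=1.0, rng=rng)
print(f"posterior draw of sigma^2: {sigma2_draw:.3f}")
```

With a small $n_i$ relative to $T$, as here, the draw is dominated by the data, which is the sense in which the weight of the prior is governed by its degrees of freedom.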

A.4 Block 4: drawing the state vector $\mu_t$
Since the model can be cast in state space, draws of the state vector can be obtained via a state vector simulation smoother as in  or with the disturbance smoother proposed by . We resort to the latter, which turns out to be slightly more efficient from a computational point of view.

A.5 Block 5: drawing $\lambda_{i,t}$
To sample the stochastic volatilities $\lambda_{i,t}$, notice that conditional on all parameters and on the states $\mu_t$ the orthogonal innovations $\eta_{i,t}/\sigma_{h,i}$ are observable. The $\lambda_{i,t}$ can then be sampled adopting the date-by-date blocking scheme developed by .
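One sweep of a date-by-date sampler can be sketched as below. This is an illustration in the spirit of such schemes and not necessarily the exact algorithm used in the paper: it assumes that the log-volatility follows a random walk with innovation variance `sigma2_h`, it proposes each $\lambda_t$ from its conditional prior given the neighbouring dates, and it accepts with the likelihood ratio. The function names, the treatment of the first date and the simulated data are assumptions.

```python
import numpy as np


def log_lik(e, lam_t):
    """Log density (up to a constant) of e ~ N(0, exp(lam_t))."""
    return -0.5 * lam_t - 0.5 * e ** 2 * np.exp(-lam_t)


def sweep_log_vol(eps, lam, sigma2_h, rng):
    """One date-by-date Metropolis sweep over the log-volatilities,
    assuming lam_t = lam_{t-1} + N(0, sigma2_h) (hypothetical helper)."""
    lam = lam.copy()
    T = eps.size
    for t in range(T):
        if t == 0:
            # Diffuse initial condition (assumption): only lam_1 is informative
            mean, var = lam[1], sigma2_h
        elif t == T - 1:
            mean, var = lam[T - 2], sigma2_h
        else:
            mean, var = 0.5 * (lam[t - 1] + lam[t + 1]), 0.5 * sigma2_h
        proposal = mean + np.sqrt(var) * rng.standard_normal()
        # Proposal is the conditional prior, so only the likelihood ratio matters
        log_ratio = log_lik(eps[t], proposal) - log_lik(eps[t], lam[t])
        if np.log(rng.uniform()) < log_ratio:
            lam[t] = proposal
    return lam


# Simulated innovations with a random-walk log-volatility (assumption)
rng = np.random.default_rng(5)
T, sigma2_h = 300, 0.05
lam_true = np.cumsum(np.sqrt(sigma2_h) * rng.standard_normal(T))
eps = rng.standard_normal(T) * np.exp(lam_true / 2.0)

lam_draw = np.zeros(T)
for _ in range(50):
    lam_draw = sweep_log_vol(eps, lam_draw, sigma2_h, rng)
```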
A.6 Block 6: drawing $\sigma^2_{h,i}$
The final block of the sampler involves drawing the variances of the log-volatilities. Conditioning on the log-volatilities and postulating an inverse-Gamma prior distribution, the $\sigma^2_{h,i}$ can also be drawn from an inverse-Gamma posterior.

B The selection of the monthly indicators
Small scale models have their own "curse of dimensionality": since they rely on a small set of indicators, they are prone to the criticism of potentially leaving out relevant information compared to factor models that use hundreds of time series. In the literature, however, the initial enthusiasm for the use of very large sets of data started to wane when some authors pointed out that models using a smaller set of accurately targeted predictors might deliver more accurate forecasts.  and , for example, question the usefulness of 'too much information' for forecasting purposes. The former, in particular, shows that a number of variable selection techniques (already widely used in biomedical statistics where the number of covariates is typically very large) give encouraging results when applied to economic time series.
To make the choice of the indicators to be included in our model as objective as possible we proceed as follows. We start by considering a dataset of more than a hundred variables for the period 1987-2011, and select a subset of 39 indicators similar to those employed in  and in . We then set a priori four core variables that we decide to include in the model, which are Industrial Production for the euro area (IP), the composite Purchasing Managers' Index (PMI), the European Commission Economic Sentiment Indicator (ESI) and the German IFO Business Climate Index. To select the remaining variables, we calculate as a benchmark the percentage of GDP variance explained by the factor computed from the core variables only, as in , and design an algorithm for the selection of a set of additional indicators which maximize this statistic.
1. We evaluate datasets with all core variables and one other variable at a time in order to calculate the explained variance, and the probability that it is higher than in the dataset with core variables only. In this way we obtain a ranking of the other series.
2. We add one variable at a time, starting with the ones with a higher probability of increasing the explained variance with respect to the benchmark; we keep the variable only if this probability increases. We end up with the small set of 8 variables described in the main text.
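The two steps above can be illustrated with a simplified sketch. It departs from the procedure in the text in ways that should be kept in mind: the factor is proxied by the first principal component rather than estimated from the model, the criterion is a point estimate of the explained variance rather than a posterior probability, and all series are assumed to be sampled at the same frequency. Function names and data are hypothetical.

```python
import numpy as np


def explained_variance(gdp, X):
    """Share of GDP variance explained by the first principal component of X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    u, s, _ = np.linalg.svd(Z, full_matrices=False)
    factor = u[:, 0] * s[0]
    return np.corrcoef(factor, gdp)[0, 1] ** 2


def select_indicators(gdp, core, candidates):
    """Greedy forward selection around a fixed set of core variables."""
    benchmark = explained_variance(gdp, core)
    # Step 1: rank candidates by the statistic obtained when added alone
    scores = {name: explained_variance(gdp, np.column_stack([core, x]))
              for name, x in candidates.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Step 2: add candidates in that order, keeping only those that help
    selected, X, best = [], core, benchmark
    for name in ranked:
        trial = np.column_stack([X, candidates[name]])
        score = explained_variance(gdp, trial)
        if score > best:
            selected.append(name)
            X, best = trial, score
    return selected, best, benchmark


# Synthetic data (assumption): one informative and one pure-noise candidate
rng = np.random.default_rng(6)
T = 100
f = rng.standard_normal(T)
gdp = f + 0.5 * rng.standard_normal(T)
core = f[:, None] + rng.standard_normal((T, 4))
candidates = {"useful": f + 0.3 * rng.standard_normal(T),
              "noise": rng.standard_normal(T)}
selected, best, benchmark = select_indicators(gdp, core, candidates)
print(selected, round(benchmark, 3), round(best, 3))
```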

C The state space specification in the empirical application
The specification we adopt follows  where surveys are modeled as a 12-term moving average of the unobserved factor, while hard variables load the factor contemporaneously. This amounts to imposing that surveys are in phase with the year-on-year growth rate of Industrial Production (and of the other hard indicators). To give an idea of the state space representation of the model while keeping notation to a minimum, we present the case of a toy model with one quarterly variable, one hard indicator and one soft indicator, in which all the idiosyncratic shocks follow an AR(2) process. The more general case can be easily derived from this example. The loading matrix $F$ in the measurement equation (10) contains $\beta_q$, $\beta_h$ and $\beta_s$, the loadings of, respectively, the quarterly variable, the hard and the soft indicators. The state vector stacks the common factor and its lags together with the idiosyncratic shocks and their lags, and the transition matrix collects the corresponding autoregressive coefficients. Since the idiosyncratic shocks are collected in the state vector, the matrix $R_t$ is a $(k+2)$ dimension zero matrix, while the matrix $Q_t$ is a diagonal matrix which collects all the variances:
$$Q_t = \mathrm{diag}\left(1, 0, 0, 0, \ldots, \sigma^2_q e^{\lambda_{q,t}}, 0, 0, 0, 0, \sigma^2_h e^{\lambda_{h,t}}, 0, \sigma^2_s e^{\lambda_{s,t}}, 0\right) \qquad (32)$$

D Assessing the convergence of the Markov chain to the ergodic distribution
We assess the convergence of the Markov chain to the ergodic distribution by looking at the autocorrelation properties of the draws across sets of parameters. In the full sample estimate of the models we run 30000 replications and retain the last 5000 draws. As a measure of convergence of the Markov chain we consider the inefficiency factors (henceforth, IFs) of the draws, which are defined as the inverse of the relative numerical efficiency measure (RNE) of . The RNE is computed considering one parameter at a time and using the sequence of draws as time dimension. Specifically, the RNE is defined as:
$$RNE = \frac{1}{2\pi} \frac{1}{S(0)} \int_{-\pi}^{\pi} S(\omega)\, d\omega,$$
where $S(\omega)$ is the spectral density of the draws of a given parameter at the frequency $\omega$.
The denominator $S(0)$ is the spectral density at the zero frequency, a measure of the long run variance of the draws. The spectral densities are estimated by the smoothed periodogram using a 32-point Bartlett triangular window, which gives less weight to more distant autocorrelations.
In Figure 10 we present the IFs. As the figure shows the autocorrelation of the draws is very low, with values of the IFs overall below two, that is ten times lower than the threshold (twenty) which can be considered as satisfactory, as stressed by Primiceri (2005).
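As an illustration, an inefficiency factor can be computed in the time domain as a Bartlett-weighted sum of the autocorrelations of the draws, which is the usual counterpart of the ratio between the long run variance and the variance of the draws. The sketch below is not the code behind Figure 10: the function name, the reading of the 32-point window as a truncation lag of 32, and the simulated chains are assumptions.

```python
import numpy as np


def inefficiency_factor(draws, window=32):
    """Bartlett-weighted estimate of 1 + 2 * sum of autocorrelations."""
    x = np.asarray(draws, dtype=float)
    x = x - x.mean()
    n = x.size
    var = x @ x / n
    lags = np.arange(1, window + 1)
    acf = np.array([x[:n - k] @ x[k:] / (n * var) for k in lags])
    weights = 1.0 - lags / (window + 1.0)  # triangular (Bartlett) weights
    return 1.0 + 2.0 * np.sum(weights * acf)


# Synthetic chains (assumption): independent draws versus an AR(1) chain
rng = np.random.default_rng(7)
iid_draws = rng.standard_normal(5000)
ar_draws = np.zeros(5000)
for t in range(1, ar_draws.size):
    ar_draws[t] = 0.9 * ar_draws[t - 1] + rng.standard_normal()

if_iid = inefficiency_factor(iid_draws)
if_ar = inefficiency_factor(ar_draws)
print(f"IF (iid) = {if_iid:.2f}, IF (AR(1), rho = 0.9) = {if_ar:.2f}")
```

Independent draws give a value close to one, while a persistent chain gives a much larger value, which is why low IFs are read as evidence of good mixing.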

E News and forecast revisions
In their paper  derive a way to decompose a forecast revision as a linear function of news.
They denote as $\Omega_v$ a vintage of data corresponding to a statistical data release $v$ (for example, mid-month for industrial production and end of month for surveys), and define news as the surprise incorporated in newly released data with respect to what was expected given the information $\Omega_v$. A forecast revision is defined as the change in the forecast between two consecutive vintages and can be expressed as a weighted average of the news, where the weights depend on the state vector covariance matrix obtained as a by-product of the Kalman smoother.

Note to Table 7. P-values for the null hypotheses of zero mean, unit variance, no serial correlation and joint Normality/Independence of forecast errors at different horizons. Backcast refers to two weeks (1) and one month (2) after the end of the quarter of interest. Nowcast refers to the first two weeks (1), the first month (2) and so on of the quarter of interest. One step ahead refers to the next quarter in the same periods as in Nowcast.

Note to Table 8. P-values for the null hypotheses of zero mean, unit variance, no serial correlation and joint Normality/Independence of forecast errors at different horizons. Backcast refers to two weeks (1) and one month (2) after the end of the quarter of interest. Nowcast refers to the first two weeks (1), the first month (2) and so on of the quarter of interest. One step ahead refers to the next quarter in the same periods as in Nowcast.

Note to Figure 10. Each panel in the figure corresponds to one of the six blocks of the Gibbs sampler discussed in Appendix A. For example, the top left panel reports the IF for the nine slope parameters $\beta$ in the model, the top center for the 20 AR(2) parameters of the idiosyncratic errors, and so forth. Regarding the fourth block, for the sake of simplicity, instead of reporting the IF for the whole state vector (where many elements are just repeated with a lag) we report the linear combination of the state vector $H(1,:)\mu_t$, where $H(1,:)$ stands for the first row of the $H$ matrix.
This amounts to reporting the IF for the GDP draws. We only report the IF for the time periods in which GDP is unobserved, since for the remaining periods the variance of the draws is zero; this vector therefore has 148 elements, given that estimation is performed on 219 observations.