Summary
 Top of page
 Summary
 1 Introduction
 2 Data
 3 Models
 4 Results
 5 Conclusions
 Acknowledgements
 References
 Appendix
 Supporting Information
This paper compares alternative models of time-varying volatility on the basis of the accuracy of real-time point and density forecasts of key macroeconomic time series for the USA. We consider Bayesian autoregressive and vector autoregressive models that incorporate some form of time-varying volatility, specifically random walk stochastic volatility, stochastic volatility following a stationary AR process, stochastic volatility coupled with fat tails, GARCH and mixture of innovations models. The results show that the AR and VAR specifications with conventional stochastic volatility dominate other volatility specifications, in terms of point forecasting to some degree and density forecasting to a greater degree. Copyright © 2014 John Wiley & Sons, Ltd.
1 Introduction
A growing number of studies have provided evidence of time-varying volatility in the economies of many industrialized nations. Most available evidence, based on data through the early to mid 2000s, has highlighted the Great Moderation (e.g. Stock and Watson, 2003, 2007; Cogley and Sargent, 2005; Primiceri, 2005; Koop and Potter, 2007; Benati, 2008; Justiniano and Primiceri, 2008; Giordani and Villani, 2010). Some more recent studies have shown that, following the Great Moderation, volatility rose sharply during the severe recession of 2007–2009 (e.g. Clark, 2009, 2011; Curdia et al., 2013).
Modeling the apparently significant time variation in macroeconomic volatility is important to the accuracy of a range of types of inference. In general, of course, least squares estimates of vector autoregressive (VAR) coefficients remain consistent in the face of conditional heteroskedasticity, but ordinary least squares (OLS) variance estimates do not. Moreover, modeling the conditional heteroskedasticity can yield more efficient generalized least squares (GLS) estimates of VAR coefficients; Sims and Zha (2006) have emphasized the value of volatility modeling for improving efficiency. Accordingly, in both dimensions, taking account of time variation in volatility should improve the VAR-based estimation and inference common in macroeconomic analysis. In particular, in VAR-based analysis of impulse responses, variance decompositions, and historical decompositions (used, for instance, to assess the effects of alternative monetary policies), modeling time variation in conditional volatilities is likely to be important for accurate inferences. In addition, some recent dynamic stochastic general equilibrium (DSGE) model research (e.g. Fernandez-Villaverde and Rubio-Ramirez, 2007; Justiniano and Primiceri, 2008) has emphasized the importance of modeling time variation in volatility for explaining the sources of the Great Moderation and other changes in volatility.
Modeling changes in volatility should also help to improve the accuracy of density forecasts from VARs. Shifts in volatility have the potential to result in forecast densities that are either far too wide or too narrow. For instance, in light of the Great Moderation, density forecasts for gross domestic product (GDP) growth in 2006 based on time series models assuming constant variances over a sample such as 1960–2005 would probably be far too wide, with inflated confidence intervals and probabilities of tail events such as recession. As another example, in late 2008, density forecasts for 2009 based on time series models assuming constant variances for 1985–2008 would have been too narrow. Results in Giordani and Villani (2010), Jore et al. (2010) and Clark (2011) support this intuition on the gains to point and density forecasts of modeling shifts in conditional volatilities. D'Agostino et al. (2013) show that the combination of time-varying parameters and stochastic volatility improves the accuracy of point and density forecasts. These benefits to allowing time-varying volatility could prove useful to central banks that provide density information in the form of forecast fan charts and qualitative assessments of forecast uncertainty.
In most recent macroeconometric studies of time-varying volatility (e.g. Stock and Watson, 2003, 2007; Cogley and Sargent, 2005; Primiceri, 2005; Benati, 2008), the time variation in volatility has been captured with a single model: stochastic volatility, in which log volatility follows a random walk process. In Bayesian estimation algorithms, the stochastic volatility specification is computationally tractable. In addition, studies such as Clark (2011) and Carriero et al. (2012) have shown that it is effective for improving the accuracy of density forecasts from AR and VAR models. However, there are alternatives that could also be effective for capturing changes in macroeconomic volatility.^{1} Studies such as Koop and Potter (2007), Giordani and Villani (2010) and Groen et al. (2013) have used models in which volatility is subject to potentially many discrete breaks; others, such as Jore et al. (2010), have used models with a small number of discrete breaks. Yet another model of time-varying volatility would be a generalized autoregressive conditional heteroskedasticity (GARCH) specification. While the pioneering development of ARCH (Engle, 1982) and GARCH (Bollerslev, 1986) included applications to inflation, these models seem to have become rare in macroeconometric modeling, with the exception of a few studies, such as Canarella et al. (2010) and Chung et al. (2012).
While a number of studies in the finance literature have compared alternative models of time-varying volatility of asset returns (e.g. Hansen and Lunde, 2005; Geweke and Amisano, 2010; Nakajima, 2012), no such broad comparison yet exists for macroeconomic variables.^{2} Accordingly, this paper compares alternative models of time-varying macroeconomic volatility, included within autoregressive and vector autoregressive specifications for key macroeconomic indicators and estimated using Bayesian inference. We base our comparison on real-time out-of-sample forecast accuracy, for both point and density forecasts of US data on GDP growth, the unemployment rate, inflation in the GDP deflator and the 3-month Treasury bill rate.^{3}
The set of univariate AR models includes the following volatility specifications: constant volatility; stochastic volatility (with both constant coefficients in the conditional mean portion of the model and time-varying coefficients in the conditional mean); stochastic volatility following a stationary AR process; stochastic volatility coupled with fat tails; GARCH; and a mixture of innovations model. The set of VARs includes the same volatility specifications, except for the mixture of innovations model (for reasons of computational tractability).
Our results indicate that the AR and VAR specifications with stochastic volatility dominate models with alternative volatility specifications, in terms of point forecasting to some degree and density forecasting to a greater degree. Therefore, at least from a macroeconomic forecasting perspective, these alternative volatility specifications seem to have no advantage over the now widely used stochastic volatility specification.
The paper proceeds as follows. Section 2 describes the data. Section 3 presents the models and estimation methodology; details on priors are provided in the Appendix, and details of estimation algorithms and some additional results are provided as supporting information in a supplementary Appendix. Section 4 presents the results. Section 5 concludes.
2 Data
We use quarterly data to estimate models for growth of real GDP, inflation in the GDP price index or deflator (henceforth, GDP inflation), the unemployment rate and the 3-month Treasury bill rate. We compute GDP growth as 100 times the log difference of real GDP and inflation as 100 times the log difference of the GDP price index, to put them into units of percentage point changes. The unemployment rate and interest rate are also defined in units of percentage points (annualized in the case of the interest rate).
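As a minimal sketch of the transformation just described, the following Python snippet computes 100 times the log difference of a series of levels. The function name `log_growth_pct` and the sample levels are illustrative assumptions, not taken from the paper's code or data.

```python
import math


def log_growth_pct(levels):
    """Return 100 * log difference between consecutive levels.

    The result is a quarterly percentage-point change and has one
    fewer element than the input.
    """
    return [100.0 * math.log(curr / prev) for prev, curr in zip(levels, levels[1:])]


# Hypothetical real GDP levels, for illustration only
real_gdp = [100.0, 101.0, 102.01]
gdp_growth = log_growth_pct(real_gdp)
```

The same function would apply to the GDP price index to obtain the inflation series.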
We obtained quarterly real-time data on GDP and the GDP price index from the Federal Reserve Bank of Philadelphia's Real-Time Data Set for Macroeconomists (RTDSM). For simplicity, we use ‘GDP’ and ‘GDP price index’ to refer to the output and price series, even though the measures are based on gross national product (GNP) and a fixed-weight deflator for much of the sample. As described in Croushore and Stark (2001), the vintages of the RTDSM are dated to reflect the information available around the middle of each quarter. In vintage t, the available data run through period t − 1.
In the case of unemployment and interest rates, for which real-time revisions are small to essentially non-existent, we simply abstract from real-time aspects of the data and use currently available time series. We obtained monthly data on the unemployment rate and 3-month Treasury bill rate from the FAME database of the Federal Reserve Board of Governors and formed the quarterly unemployment and interest rate as simple within-quarter averages of the monthly data.
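The within-quarter averaging can be sketched as follows. This is an illustration only: the function name `quarterly_average` is hypothetical, the monthly values are made up, and the input is assumed to start in the first month of a quarter and to cover whole quarters.

```python
def quarterly_average(monthly):
    """Average consecutive blocks of three monthly observations."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) / 3.0 for i in range(0, len(monthly), 3)]


# Hypothetical monthly unemployment rates covering two quarters
monthly_unemployment = [5.0, 5.2, 5.4, 6.0, 6.0, 6.3]
quarterly_unemployment = quarterly_average(monthly_unemployment)
```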
As discussed in such sources as Romer and Romer (2000), Sims (2002) and Croushore (2006), evaluating the accuracy of real-time forecasts requires a difficult decision on what to take as the actual data in calculating forecast errors.^{4} We follow studies such as Romer and Romer (2000) and Faust and Wright (2009) and use the second available estimates of GDP/GNP and the GDP/GNP deflator as actuals in evaluating forecast accuracy. In the case of h-quarter-ahead forecasts made for period t + h with vintage t data ending in period t − 1, the second available estimate is taken from the vintage t + h + 2 data set. In light of our abstraction from real-time revisions in the unemployment and interest rates, for these series the real-time data correspond to the final vintage data.
We evaluate forecasts from 1975:Q1 to 2011:Q2, which requires real-time data vintages from 1975:Q1 to 2011:Q4. For each forecast origin t starting with 1975:Q1, we use the real-time data vintage t to estimate the forecast models and construct forecasts of quarterly values of all variables for periods t and beyond.^{5} We report results for forecast horizons of 1, 2, 4, and 8 quarters ahead. In light of the time t − 1 information actually incorporated in the models used for forecasting at t, the 1-quarter-ahead forecast is a current-quarter (t) forecast, while the 2-quarter-ahead forecast is a next-quarter (t + 1) forecast, etc. For most models, the starting point of the model estimation sample is 1955:Q1; in some of these specifications, we use data for the 1948–1954 period to set the priors on some parameters, as detailed in the Appendix. For the VAR-TVP-SV and AR-TVP-SV specifications, to permit the use of a longer training sample for setting the prior on the initial VAR or AR coefficients, the starting point of the model estimation sample is 1961:Q1 and we use data for the 1948–1960 period to set the priors on some parameters.
3 Models
This section provides the specifications of our models and an overview of the estimation methods.^{6} Because in most cases the AR models are simplifications of corresponding VAR models, we present the VAR models first and then provide a much briefer presentation of AR models. The priors are detailed in the Appendix, and the estimation algorithms (which are very similar to those in other studies) are detailed in the supplementary Appendix.
The volatility specifications we include reflect a range of considerations. While other studies have provided strong evidence of time-varying volatility in macroeconomic variables, the prevalence of VAR and AR models with constant volatility leads us to include these specifications in our analysis. These models are of course easy to use for forecasting, and for point forecasting their performance should not be affected much by conditional heteroskedasticity (we return to this point in our analysis of results).
Among possible specifications of time-varying volatility, we treat random-walk stochastic volatility (simply ‘stochastic volatility’ in what follows) as a baseline because, since the pioneering work of Cogley and Sargent (2005) and Primiceri (2005), it has become the dominant approach in recent macroeconometric modeling. However, there are reasons to think this particular specification within the class of stochastic volatility models could be too restrictive in some dimensions. First, log volatility could follow a stationary, low-order AR process rather than a random walk. In the recent literature on DSGE models with stochastic volatility, some studies model log volatility as a random walk (e.g. Justiniano and Primiceri, 2008), while others model it as a stationary AR(1) process (e.g. Fernandez-Villaverde and Rubio-Ramirez, 2007). In forecasting, the random walk specification might adversely affect density forecast performance by allowing volatility to blow up (becoming either unduly high or low) over the forecast horizon. Second, in light of the dramatic movements in macroeconomic variables over the recent Great Recession, the standard stochastic volatility specification could miss fat tails in the underlying shock distribution. Curdia et al. (2013) find that adding fat tails to stochastic volatility improves the fit of a DSGE model for the US economy. Motivated by these considerations, we examine the forecasting performance of VAR and AR models with (random walk) stochastic volatility, stationary AR(1) stochastic volatility, and both fat tails and stochastic volatility.^{7} In light of other evidence of time variation in the regression coefficients of VAR and AR models (e.g. Cogley and Sargent, 2005), we also consider VAR and AR models with both stochastic volatility and time-varying (regression) parameters.
Despite the prevalence of stochastic volatility in recent macroeconometric modeling, VAR and AR models with GARCH could reasonably be considered as alternatives. While GARCH has become prevalent in finance modeling (e.g. Geweke and Amisano, 2010) but remains uncommon in macro modeling, Engle (1982) and Bollerslev (1986) had inflation in mind with their development of ARCH and GARCH models. Moreover, there are some recent macroeconomic analyses that have used GARCH formulations (e.g. Canarella et al., 2010; Chung et al., 2012). In DSGE-based modeling, while Fernandez-Villaverde and Rubio-Ramirez (2010) describe reasons to prefer stochastic volatility to GARCH (primarily because GARCH makes it more difficult to separate volatility shocks from levels shocks), there are some DSGE applications that consider GARCH (Andreasen, 2012).
Finally, within the set of AR models, we consider a specification in which volatility is subject to potentially many discrete breaks, rather than to the continuous breaks implied by stochastic volatility or GARCH specifications. For reasons described in Koop and Potter (2007), for example, a specification with potentially large, discrete breaks may be conceptually preferable to models with small, frequent breaks. Our particular specification is much more readily applied to AR models than VAR models, so we only consider an AR specification: a model that takes the mixture of innovations form, developed in studies such as Giordani et al. (2007), Koop and Potter (2007) and Groen et al. (2013).
While the mixture model we consider is similar to a Markov switching model, for computational reasons we omit a direct comparison to switching models, leaving such a comparison for future research. Switching models can be difficult to use with vector autoregressions, which has limited their use (see, for example, discussions in Bognanni, 2013, and Hubrich and Tetlow, 2012). Given the current state of the art, a VAR with Markov switching would pose a significant computational challenge in a real-time forecasting evaluation spanning more than 140 quarters.
3.1 VAR Models
While we write out the models for a general lag order p, all of the VAR models include four lags, except that, to streamline computations, the VAR-TVP-SV model includes two lags, following studies such as Cogley and Sargent (2005) and D'Agostino et al. (2013).
3.1.1 Constant Volatility
Let y_{t} denote the k × 1 vector of model variables, B_{0} a k × 1 vector of intercepts and B_{i}, i = 1, … , p, a k × k matrix of coefficients on lag i. For our set of k = 4 variables, we consider a VAR(p) model with a constant variance–covariance matrix of shocks:
$$ y_{t} = B_{0} + \sum_{i=1}^{p} B_{i} y_{t-i} + v_{t}, \qquad v_{t} \sim N(0, \Sigma) \tag{1} $$
3.1.2 Stochastic Volatility
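The VAR-SV model includes the conventional macroeconometric formulation of a random walk process for log volatility:
$$ \begin{aligned} y_{t} &= B_{0} + \sum_{i=1}^{p} B_{i} y_{t-i} + v_{t}, \\ v_{t} &= A^{-1} \Lambda_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, I_{k}), \qquad \Lambda_{t} = \mathrm{diag}(\lambda_{1,t}, \ldots, \lambda_{k,t}), \\ \log(\lambda_{i,t}) &= \log(\lambda_{i,t-1}) + \nu_{i,t}, \quad i = 1, \ldots, k, \qquad \nu_{t} \equiv (\nu_{1,t}, \ldots, \nu_{k,t})' \sim N(0, \Phi) \end{aligned} \tag{2} $$
where A is a lower triangular matrix with ones on the diagonal and nonzero coefficients below the diagonal, and the diagonal matrix Λ_{t} contains the time-varying variances of underlying structural shocks.^{8} This model implies that the reduced-form variance–covariance matrix of innovations to the VAR is var(v_{t}) ≡ Σ_{t} = A^{ − 1}Λ_{t}A^{ − 1 ′ }. Note that, as in Primiceri's (2005) implementation, innovations to log volatility are allowed to be correlated across variables (not only in this baseline stochastic volatility specification but also in the other specifications detailed below).^{9} Thus Φ is not restricted to be diagonal.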
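To make the mapping from structural to reduced-form variances concrete, the following standard-library Python sketch computes Σ_{t} = A^{ − 1}Λ_{t}A^{ − 1 ′ } for a given unit lower triangular A and a vector of structural variances. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def unit_lower_inverse(A):
    """Invert a lower triangular matrix with ones on the diagonal.

    Uses forward substitution column by column; A is a list of rows.
    """
    k = len(A)
    inv = [[1.0 if r == c else 0.0 for c in range(k)] for r in range(k)]
    for c in range(k):
        for r in range(c + 1, k):
            inv[r][c] = -sum(A[r][j] * inv[j][c] for j in range(c, r))
    return inv


def reduced_form_cov(A, lam):
    """Return Sigma_t = A^{-1} diag(lam) A^{-1}' as a list of rows."""
    A_inv = unit_lower_inverse(A)
    k = len(lam)
    return [
        [sum(A_inv[i][m] * lam[m] * A_inv[j][m] for m in range(k)) for j in range(k)]
        for i in range(k)
    ]


# Hypothetical two-variable example
A_example = [[1.0, 0.0], [0.5, 1.0]]
lam_example = [4.0, 1.0]  # structural shock variances in period t
sigma_example = reduced_form_cov(A_example, lam_example)
```

In this example the off-diagonal element of A induces a nonzero covariance between the two reduced-form innovations even though the structural shocks are uncorrelated.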
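3.1.3 Stationary AR(1) Stochastic Volatility
The VAR-stationary SV specification treats log volatility as following an AR(1) process, which we presume to be stationary:^{10}
$$ \begin{aligned} y_{t} &= B_{0} + \sum_{i=1}^{p} B_{i} y_{t-i} + v_{t}, \\ v_{t} &= A^{-1} \Lambda_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, I_{k}), \qquad \Lambda_{t} = \mathrm{diag}(\lambda_{1,t}, \ldots, \lambda_{k,t}), \\ \log(\lambda_{i,t}) &= a_{i,0} + a_{i,1} \log(\lambda_{i,t-1}) + \nu_{i,t}, \quad i = 1, \ldots, k, \qquad \nu_{t} \equiv (\nu_{1,t}, \ldots, \nu_{k,t})' \sim N(0, \Phi) \end{aligned} \tag{3} $$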
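One way to see the difference between the random walk and stationary AR(1) specifications is to compare the variance of the h-step-ahead forecast error of log volatility. The sketch below assumes a single variable, an innovation variance `phi`, and an AR(1) slope called `a1` (a hypothetical name); setting `a1 = 1` gives the random walk case. The numerical values are illustrative.

```python
def logvol_forecast_variance(h, phi, a1=1.0):
    """Variance of the h-step-ahead forecast error of log volatility.

    Accumulates phi * a1^(2j) for j = 0, ..., h - 1. With a1 = 1 (random
    walk) this equals h * phi and grows without bound; with |a1| < 1 it
    converges to phi / (1 - a1^2).
    """
    return sum(phi * a1 ** (2 * j) for j in range(h))


phi_example = 0.05
rw_var_8 = logvol_forecast_variance(8, phi_example)           # random walk
ar_var_8 = logvol_forecast_variance(8, phi_example, a1=0.9)   # stationary AR(1)
```

Under these assumptions, uncertainty about future log volatility rises linearly with the horizon in the random walk case but levels off in the stationary case.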
3.1.4 Stochastic Volatility with Fat Tails
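The VAR-SVt model augments the (random walk) stochastic volatility specification to include fat tails, similarly to the DSGE specification considered in Curdia et al. (2013), which follows the stochastic volatility with fat tails formulation of Jacquier et al. (2004):
$$ \begin{aligned} y_{t} &= B_{0} + \sum_{i=1}^{p} B_{i} y_{t-i} + v_{t}, \\ v_{t} &= A^{-1} \Lambda_{t}^{0.5} Q_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, I_{k}), \\ \Lambda_{t} &= \mathrm{diag}(\lambda_{1,t}, \ldots, \lambda_{k,t}), \qquad Q_{t} = \mathrm{diag}(q_{1,t}, \ldots, q_{k,t}), \\ \log(\lambda_{i,t}) &= \log(\lambda_{i,t-1}) + \nu_{i,t}, \quad i = 1, \ldots, k, \qquad \nu_{t} \equiv (\nu_{1,t}, \ldots, \nu_{k,t})' \sim N(0, \Phi), \\ d_{i}/q_{i,t} &\sim \text{i.i.d. } \chi^{2}_{d_{i}} \end{aligned} \tag{4} $$
where d_{i} denotes the degrees of freedom of the Student-t distribution that is the marginal distribution of q_{i,t}^{0.5}ε_{i,t}. Fat tails arise due to the q_{i,t}, which are assumed to be independent over time and across variables. We consider two treatments of the degrees of freedom of the fat tails component. In the first (used with what we identify as the ‘VAR-SVt’ model), d_{i} is a parameter to be estimated (for each variable). In the second (the ‘VAR-SVt, 5 df’ specification), d_{i} is simply fixed at 5, to ensure fat tails.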
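The role of the q_{i,t} terms can be illustrated with a short simulation. The sketch below assumes that d/q follows a chi-square distribution with d degrees of freedom (so that q has an inverse Gamma distribution and the scaled normal shock is Student-t); the function names, the seed and the parameter values are hypothetical.

```python
import math
import random


def draw_q(d, rng):
    """Draw a scale-mixture variable q such that d / q is chi-square with d df."""
    chi_square = rng.gammavariate(d / 2.0, 2.0)  # Gamma(shape d/2, scale 2)
    return d / chi_square


def draw_fat_tailed_shock(lam, d, rng):
    """Draw one shock sqrt(lam * q) * eps, with eps standard normal."""
    q = draw_q(d, rng)
    return math.sqrt(lam * q) * rng.gauss(0.0, 1.0)


rng = random.Random(12345)
# Shocks with volatility lam = 1 and 5 degrees of freedom
shocks = [draw_fat_tailed_shock(1.0, 5.0, rng) for _ in range(1000)]
```

Small values of d produce occasional large values of q and hence occasional large shocks, which is how the specification generates fat tails.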
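3.1.5 Time-Varying Parameters and Stochastic Volatility
Letting X_{t} denote the collection of right-hand-side variables of each equation of the VAR and B_{t} denote the period t value of the vector of all VAR coefficients (of dimension k(kp + 1) × 1), the VAR-TVP-SV model takes the form given in Cogley and Sargent (2005):
$$ \begin{aligned} y_{t} &= X_{t}' B_{t} + v_{t}, \\ B_{t} &= B_{t-1} + n_{t}, \qquad \mathrm{var}(n_{t}) = Q, \\ v_{t} &= A^{-1} \Lambda_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, I_{k}), \qquad \Lambda_{t} = \mathrm{diag}(\lambda_{1,t}, \ldots, \lambda_{k,t}), \\ \log(\lambda_{i,t}) &= \log(\lambda_{i,t-1}) + \nu_{i,t}, \quad i = 1, \ldots, k, \qquad \nu_{t} \equiv (\nu_{1,t}, \ldots, \nu_{k,t})' \sim N(0, \Phi) \end{aligned} \tag{5} $$
where A is a lower triangular matrix with ones on the diagonal and nonzero coefficients below the diagonal. The VAR coefficients follow random-walk processes, with innovations that are allowed to be correlated across coefficients. The volatility portion of the VAR-TVP-SV model is the same as that of the VAR-SV specification.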
3.1.6 GARCH
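The VAR-GARCH model incorporates a standard GARCH(1,1) process (as in Chung et al., 2012, for example) for the orthogonalized error of each VAR equation:^{11}
$$ \begin{aligned} y_{t} &= B_{0} + \sum_{i=1}^{p} B_{i} y_{t-i} + v_{t}, \\ v_{t} &= A^{-1} H_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, I_{k}), \qquad H_{t} = \mathrm{diag}(h_{1,t}, \ldots, h_{k,t}), \\ h_{i,t} &= c_{0,i} + c_{1,i} \, (A v_{t-1})_{i}^{2} + c_{2,i} \, h_{i,t-1}, \quad i = 1, \ldots, k \end{aligned} \tag{6} $$
where A is a lower triangular matrix with ones on the diagonal and nonzero coefficients below the diagonal. In this GARCH formulation, each variable is treated independently, with the conditional variance h_{i,t} a function of one lag of itself and one lag of the squared error from the VAR equation. We impose conditions to ensure positivity and stationarity of each volatility process.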
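The GARCH(1,1) recursion for a single orthogonalized error series can be sketched as follows. The coefficient names `c0`, `c1` and `c2`, the choice to start the recursion at the unconditional variance, and the specific parameter checks are assumptions made for this illustration; the paper states only that conditions for positivity and stationarity are imposed.

```python
def garch_variances(errors, c0, c1, c2):
    """Return conditional variances h_0, ..., h_n of a GARCH(1,1) process.

    h_{t} = c0 + c1 * e_{t-1}^2 + c2 * h_{t-1}, started (by assumption)
    at the unconditional variance c0 / (1 - c1 - c2).
    """
    # Simple sufficient conditions for positivity and covariance stationarity
    if not (c0 > 0 and c1 >= 0 and c2 >= 0 and c1 + c2 < 1):
        raise ValueError("parameters violate positivity or stationarity")
    h = [c0 / (1.0 - c1 - c2)]
    for e in errors:
        h.append(c0 + c1 * e * e + c2 * h[-1])
    return h


# Hypothetical orthogonalized errors and parameters
h_path = garch_variances([1.0, 2.0], c0=0.1, c1=0.2, c2=0.7)
```

A large squared error in one period raises the conditional variance in the next, and the effect then decays at rate `c2`.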
3.2 AR Models
All of our AR models include two lags for GDP growth and four lags for unemployment, inflation and the T-bill rate.^{12} We consider AR models with constant volatility (AR), stochastic volatility (AR-SV), stationary stochastic volatility (AR-stationary SV), stochastic volatility with fat tails (AR-SVt, using estimated degrees of freedom and a fixed 5 degrees of freedom), stochastic volatility with time-varying parameters (AR-TVP-SV) and GARCH (AR-GARCH). For all of these models, the AR specification is the same as the corresponding VAR specification, simplified to a univariate setting (k = 1), which eliminates the A matrix and makes Φ, Λ_{t}, Q_{t} and H_{t} scalars instead of matrices. Accordingly, in the interest of brevity, for these models we omit the AR specification details and instead refer the reader to the VAR specification details given above. The only model for which we spell out details is the AR-mixture specification.
3.2.1 Mixture of Innovations
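The AR-mixture model is specified as follows, for each scalar time series y_{t}:
$$ \begin{aligned} y_{t} &= X_{t}' B_{t} + \lambda_{t}^{0.5} \epsilon_{t}, \qquad \epsilon_{t} \sim N(0, 1), \\ B_{j,t} &= B_{j,t-1} + \kappa_{j,t} \, n_{j,t}, \quad j = 1, \ldots, m, \\ \log(\lambda_{t}) &= \log(\lambda_{t-1}) + \kappa_{m+1,t} \, n_{m+1,t}, \\ \Pr[\kappa_{j,t} = 1] &= \pi_{j}, \qquad n_{j,t} \sim N(0, q_{j}), \quad j = 1, \ldots, m + 1 \end{aligned} \tag{7} $$
In this model, m is the number of AR coefficients (including the intercept) and the constants π_{j} are the probabilities of breaks in each period, for each parameter j (either a coefficient or the log volatility). If a break occurs to parameter j, the parameter shifts by an innovation n_{j}, which has variance q_{j}.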
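The break mechanism can be illustrated by simulating the path of one parameter (a coefficient or the log volatility). This sketch assumes normally distributed break innovations and uses hypothetical function and argument names; it is not the estimation algorithm, only the data-generating process for a single parameter.

```python
import math
import random


def simulate_mixture_path(initial, pi, q, periods, rng):
    """Simulate a parameter that breaks with probability pi each period.

    When a break occurs, the parameter shifts by a N(0, q) innovation;
    otherwise it stays at its previous value.
    """
    path = [initial]
    for _ in range(periods):
        break_occurs = rng.random() < pi
        step = rng.gauss(0.0, math.sqrt(q)) if break_occurs else 0.0
        path.append(path[-1] + step)
    return path


rng = random.Random(2024)
# Hypothetical values: a 5% break probability per quarter, break variance 0.25
log_vol_path = simulate_mixture_path(initial=0.0, pi=0.05, q=0.25, periods=140, rng=rng)
```

With a small π_{j} and a large q_{j} the path shows infrequent, large shifts; setting π_{j} = 1 recovers a random walk that moves every period.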
3.3 Estimation Algorithms and Sampling of Forecasts
We estimate all of the models described above using Bayesian Markov chain Monte Carlo (MCMC) methods. This section provides a brief overview of our methods. The supplementary Appendix and the studies cited below provide additional detail on algorithms and priors.
For the AR and VAR models with constant variances, we use the Normal-diffuse prior and posterior detailed in such sources as Kadiyala and Karlsson (1997) and estimate the models by Gibbs sampling.
To estimate the AR-SV, AR-stationary SV, AR-TVP-SV, VAR-SV, VAR-stationary SV and VAR-TVP-SV models, we use Gibbs samplers. Stochastic volatility is estimated with the algorithm of Kim et al. (1998), as detailed in Primiceri (2005).^{13} For the VAR-TVP-SV and AR-TVP-SV models, our algorithm is the same as Primiceri's (2005), but simplified to treat the A matrix as constant as in Cogley and Sargent (2005). For the VAR-SV model, the volatility portion of the model is handled as it is in the case of the VAR-TVP-SV specification. The VAR coefficients are drawn from a conditional posterior distribution that is multivariate normal, with a GLS-based mean and variance given in Clark (2011). For the VAR-stationary SV specification, we add to the VAR-SV algorithm a step to draw the coefficients of the AR(1) process for each variable's log volatility from a conditional posterior distribution that (like the prior) is multivariate normal. Because the innovations to log volatility can be correlated across variables, the conditional posterior of these coefficients corresponds to that for a seemingly unrelated regression system of equations.
To estimate the AR-SVt and VAR-SVt models, we extend the Gibbs sampling algorithm used for the VAR-SV specification to accommodate fat tails, following Jacquier et al. (2004). The key extension is the addition of a step to draw, for each variable, the time series of q_{i,t} from an inverse Gamma distribution. The other steps are the same as those of the VAR-SV algorithm, but for a few normalizations of data or innovations to reflect the q_{i,t} terms. For the case in which we estimate the degrees of freedom, we rely on an exponential prior for the degrees of freedom and a conditional posterior that requires a Metropolis step (for this, we use the implementation of Koop, 2003), treating each variable independently.
For the AR-GARCH and VAR-GARCH models, we use a Metropolis-within-Gibbs MCMC algorithm, combining Gibbs sampling steps for model coefficients with a random walk Metropolis–Hastings (MH) algorithm to draw the GARCH parameters. Our MH algorithm for the GARCH parameters is similar to those in Vrontos et al. (2000) and So et al. (2005). Specifically, we employ an adaptive MH-MCMC algorithm that combines a random-walk Metropolis (RWM) and an independent kernel (IK) MH algorithm. In the case of the VAR-GARCH model, the Choleski matrix A is handled in the same way as it is in the VARs with stochastic volatility.
Finally, our approach to estimating the AR-mixture model is taken from Groen et al. (2013). The steps in their Gibbs sampler include: using the algorithm of Gerlach et al. (2000) to sample the latent states κ_{j,t} that indicate the timing of breaks in the coefficients and variance; using the simulation smoother of Carter and Kohn (1994) to sample the regression parameters; and using the algorithm of Kim et al. (1998) to draw the time-varying volatility and the variance of innovations to volatility.
All of our reported results are based on samples of 5000 posterior draws, retained from larger samples of draws. However, we use different burn periods and thinning intervals for different models, depending on the mixing properties of the algorithms (drawing on our own results on mixing properties and others in the literature, such as Carriero et al., 2012). Details on the burn samples and thinning intervals are given in Table 1 of the supplementary Appendix.
The posterior distributions of forecasts reflect the uncertainty due to all parameters of each model and shocks occurring over the forecast horizon. For example, to simulate the predictive density of the VAR-TVP-SV specification, we follow the approach of Cogley et al. (2005). From a forecast origin of period t, for each retained draw of the time series of B_{t} up through t, Λ_{t} up through t, A, Q and Φ, we: (i) draw innovations to coefficients for periods t + 1 to t + H (H = the maximum forecast horizon considered) from a normal distribution with variance–covariance matrix Q and use the random-walk structure to compute B_{t + 1}, … , B_{t + H}; (ii) draw innovations to log volatilities for periods t + 1 to t + H from a multivariate normal distribution with variance–covariance matrix Φ and use the random-walk model of log λ_{t + h} to compute λ_{t + 1}, … , λ_{t + H}; (iii) draw innovations to y_{t + h}, h = 1, … , H, from a normal distribution with variance Σ_{t + h} = A^{ − 1}Λ_{t + h}A^{ − 1 ′ }, and use the vector autoregressive structure of the model along with the time series of coefficients B_{t + h} to obtain draws of y_{t + h}, h = 1, … , H. The draws of y_{t + h} are used to compute the forecast statistics of interest. To take another example, for the VAR-SV model, we use the same approach to simulating the predictive distribution, except that the steps for simulating time series of the VAR coefficients are eliminated.^{14}
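Steps (ii) and (iii) can be sketched for the simplest case, a univariate AR model with random walk stochastic volatility and constant coefficients. The code below simulates one future path for one set of parameter values; in practice the simulation would be repeated for each retained posterior draw. All names and numerical values are hypothetical, and the univariate setting is a simplification of the VAR case described above.

```python
import math
import random


def simulate_ar_sv_path(y_hist, b0, b_lags, log_lam, phi, horizon, rng):
    """Simulate one future path of an AR model with random walk log volatility.

    y_hist: observed values, most recent last.
    b_lags: AR coefficients, lag 1 first.
    log_lam: log variance of the shock at the forecast origin.
    phi: variance of innovations to log volatility.
    """
    history = list(y_hist)
    path = []
    for _ in range(horizon):
        # Step (ii): random walk innovation to log volatility
        log_lam += rng.gauss(0.0, math.sqrt(phi))
        # Step (iii): draw the shock, then apply the autoregressive structure
        shock = rng.gauss(0.0, math.exp(0.5 * log_lam))
        y_next = b0 + sum(b * history[-(i + 1)] for i, b in enumerate(b_lags)) + shock
        history.append(y_next)
        path.append(y_next)
    return path


rng = random.Random(0)
# One parameter set reused for every path here; a full exercise would loop over posterior draws
paths = [
    simulate_ar_sv_path([0.4, 0.6], 0.3, [0.5], math.log(0.25), 0.05, 8, rng)
    for _ in range(1000)
]
# Draws of the 4-quarter-ahead value, from which forecast statistics can be computed
draws_h4 = [p[3] for p in paths]
```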
4 Results
In light of the key role that time-varying volatility will play in our results, it is useful to begin with a review of macroeconomic volatility over time.^{15} For that purpose, we use our last vintage of data to estimate the VAR and VAR-SV models over a sample of 1955:Q1–2011:Q3. As a simple baseline, for the VAR model, we compute the residuals at the posterior mean of the coefficients and estimate rolling windows (41-quarter centered moving averages) of standard deviations of the reduced-form residuals. For the VAR-SV model, we obtain time series of the standard deviations of reduced-form shocks to each variable (from the diagonals of Σ_{t} = A^{ − 1}Λ_{t}A^{ − 1 ′ }) by computing the standard deviations at each draw and then computing the posterior medians and 70% credible sets.
The volatility estimates reported in Figure 1 display considerable variation over time, with some fairly significant comovement. The simple rolling window estimates of residual standard deviations from the VAR show the volatility of GDP growth and unemployment declining through the 1960s, rising sharply until about 1980, plunging with the Great Moderation, and then rising again at the end of the moving-average sample window, reflecting the Great Recession of 2007–2009. The rolling window estimates of volatility in inflation and the T-bill rate trend up from early in the sample through roughly 1980 and then follow a pattern similar to that for growth and unemployment. The estimates of standard deviations from the VAR-SV display broadly similar patterns, but with sharper and somewhat more frequent movements than are evident in the rolling window estimates from the VAR. In either case, the comovement of macroeconomic volatility appears to be high; the correlations of the volatility estimates from the VAR-SV range (across variables) from 0.77 to 0.96. Overall, this full-sample evidence points to important variation and comovement in volatility that models likely need to capture to succeed in density forecasting.
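A centered rolling-window standard deviation of residuals can be sketched as follows. The function name is hypothetical, and two details are assumptions of this illustration: the population (divide-by-n) standard deviation is used, and periods without a full window are left undefined.

```python
import statistics


def centered_rolling_std(residuals, window=41):
    """Standard deviation over a centered window of odd length.

    Returns a list aligned with the input; entries without a full
    window on both sides are None.
    """
    if window % 2 == 0:
        raise ValueError("window length must be odd for a centered window")
    half = window // 2
    out = [None] * len(residuals)
    for t in range(half, len(residuals) - half):
        out[t] = statistics.pstdev(residuals[t - half:t + half + 1])
    return out


# Tiny illustration with a 3-period window
example_std = centered_rolling_std([1.0, 2.0, 3.0, 4.0], window=3)
```

With the default 41-quarter window, the first and last 20 quarters of the sample have no estimate, which is why the rolling estimates stop short of the end of the data.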
4.1 Forecast Metrics
Turning to the real-time out-of-sample forecast comparison that is the focus of the paper, for the purpose of assessing the efficacy of alternative models of time-varying volatility, we separate our comparisons of AR models from our comparisons of VAR models. We use a recursive forecasting scheme, expanding the model estimation sample as forecasting moves forward in time. We provide results for a full sample of 1975:Q1–2011:Q2 and for a Great Moderation sample period of 1985:Q1–2007:Q4 (included in part for comparability to samples used by other studies). Because the results are broadly similar across these samples, we cover them jointly, rather than separately, in our discussion. The section proceeds by detailing our approaches to comparing forecasts and then presenting the results.
We first consider the accuracy of point forecasts (defined as posterior medians), using root mean square errors (RMSEs). We then consider density forecasts, using both the average log predictive score and the average continuous ranked probability score (CRPS). The predictive score, motivated and described in such sources as Geweke and Amisano (2010), is commonly viewed as the broadest measure of density accuracy. We compute the log predictive score using the quadratic approximation of Adolfson et al. (2007):
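$$ s_{t}(y^{o}_{t+h}) = -0.5 \left[ n \log(2\pi) + \log \left| V_{t+h|t} \right| + \left( y^{o}_{t+h} - \bar{y}_{t+h|t} \right)' V_{t+h|t}^{-1} \left( y^{o}_{t+h} - \bar{y}_{t+h|t} \right) \right] \tag{8} $$
where y^{o}_{t + h} denotes the observed outcome, \bar{y}_{t + h|t} denotes the posterior mean of the forecast distribution, V_{t + h|t} denotes the posterior variance of the forecast distribution, and n denotes the dimension of y^{o}_{t + h} (n = 1 when a single variable is scored).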
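For a single variable, the quadratic approximation amounts to evaluating a normal log density at the outcome, using the mean and variance of the simulated forecast draws. The sketch below makes that explicit; the function name is hypothetical, and the use of the population variance of the draws is an assumption of this illustration.

```python
import math
import statistics


def gaussian_log_score(draws, outcome):
    """Log predictive score under a normal approximation to the forecast draws."""
    mean = statistics.fmean(draws)
    var = statistics.pvariance(draws, mu=mean)
    return -0.5 * (math.log(2.0 * math.pi) + math.log(var) + (outcome - mean) ** 2 / var)


# Hypothetical forecast draws with mean 0 and variance 1
score_at_mean = gaussian_log_score([-1.0, 1.0], 0.0)
score_in_tail = gaussian_log_score([-1.0, 1.0], 2.0)
```

A higher (less negative) score indicates that the forecast density placed more probability near the realized outcome.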
To facilitate the reading of results from tables, we present the RMSEs, log scores and CRPS for benchmark models and relative-to-baseline measures of RMSEs, scores and CRPS for other models. In light of existing research findings that show random walk stochastic volatility typically improves the accuracy of forecasts from AR and VAR models (e.g. Clark, 2011; D'Agostino et al., 2013), we take models with random walk stochastic volatility (AR-SV for univariate specifications and VAR-SV for VAR specifications) as baselines and compare models with constant volatility or other time-varying volatility specifications to the baselines. More specifically, in our tables, for the baseline AR and VAR models with stochastic volatility, we report the RMSEs, average log scores and average CRPS. For the other AR (VAR) models, we report: ratios of each model's RMSE to the baseline AR-SV (VAR-SV) model, such that entries less than 1 indicate that the given model yields forecasts more accurate than those from the baseline; differences in score relative to the AR-SV (VAR-SV) baseline, such that a positive number indicates a model beats the baseline; and ratios of each model's average CRPS relative to the baseline AR-SV (VAR-SV) model, such that entries less than 1 indicate that the given model performs better.
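The relative-to-baseline measures can be sketched as below. The dictionary keys, function names and numerical values are hypothetical and serve only to show the direction of each comparison.

```python
import math


def rmse(errors):
    """Root mean square error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def relative_to_baseline(model, baseline):
    """Express a model's average metrics relative to the baseline's.

    RMSE and CRPS are reported as ratios (below 1 favors the model);
    the log score is reported as a difference (above 0 favors the model).
    """
    return {
        "rmse_ratio": model["rmse"] / baseline["rmse"],
        "score_diff": model["log_score"] - baseline["log_score"],
        "crps_ratio": model["crps"] / baseline["crps"],
    }


# Made-up metrics for a baseline and an alternative model
baseline_metrics = {"rmse": rmse([1.0, -1.0, 2.0, -2.0]), "log_score": -1.20, "crps": 0.50}
model_metrics = {"rmse": rmse([1.5, -1.5, 2.5, -2.5]), "log_score": -1.35, "crps": 0.55}
relative = relative_to_baseline(model_metrics, baseline_metrics)
```

In this made-up example the alternative model is worse than the baseline on all three measures: both ratios exceed 1 and the score difference is negative.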
To provide a rough gauge of whether the differences in forecast accuracy are significant, we apply Diebold and Mariano (1995) t-tests for equality of the average loss (with loss defined as squared error, log score, or CRPS).^{17} In the tables, differences in accuracy that are statistically different from zero are denoted by one, two or three asterisks, corresponding to significance levels of 10%, 5% and 1%, respectively. The underlying p-values are based on t-statistics computed with a serial correlation-robust variance, using the prewhitened quadratic spectral estimator of Andrews and Monahan (1992). Our use of the Diebold–Mariano test with forecasts from models that are, in many cases, nested is a deliberate choice. Monte Carlo evidence in Clark and McCracken (2011a,b) indicates that, with nested models, the Diebold–Mariano test compared against normal critical values can be viewed as a somewhat conservative (in the sense of tending to have size modestly below nominal size) test for equal accuracy in the finite sample. Since the AR-SV (VAR-SV) model nests the AR (VAR) model, for the comparison of the AR (VAR) model to the AR-SV (VAR-SV) specification, we report p-values based on one-sided tests, taking the AR (VAR) as the null and the AR-SV (VAR-SV) as the alternative. Because the AR-SV and AR-GARCH (VAR-SV and VAR-GARCH) models are not nested, for these model comparisons we report p-values based on two-sided tests. For the remaining comparisons, we treat each model as nesting the AR-SV (VAR-SV) baseline and report p-values based on one-sided tests, taking the AR-SV (VAR-SV) as the null and the other model in question as the alternative.
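The structure of the test can be sketched as a t-statistic on the mean loss differential. Note the substitution: for brevity this illustration uses a Bartlett-kernel (Newey–West type) long-run variance with a user-chosen number of lags, not the prewhitened quadratic spectral estimator used in the paper. Function and argument names are hypothetical.

```python
import math


def dm_statistic(loss_null, loss_alt, lags=0):
    """t-statistic for equal average loss, based on d_t = loss_null_t - loss_alt_t.

    Positive values indicate lower average loss for the alternative model.
    The long-run variance uses Bartlett weights (a simplification).
    """
    d = [a - b for a, b in zip(loss_null, loss_alt)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2.0 * sum(
        (1.0 - k / (lags + 1.0)) * autocov(k) for k in range(1, lags + 1)
    )
    return d_bar / math.sqrt(long_run_var / n)


def normal_p_value(stat, two_sided):
    """p-value against standard normal critical values."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    if two_sided:
        return 2.0 * (1.0 - cdf(abs(stat)))
    return 1.0 - cdf(stat)  # one-sided: alternative model is more accurate


# Made-up squared errors for a null model and an alternative model
stat = dm_statistic([2.0, 5.0], [1.0, 2.0])
p_one_sided = normal_p_value(stat, two_sided=False)
```

For multi-step forecasts, whose errors are serially correlated by construction, a positive number of lags would be needed in the variance estimate.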
4.2 Point Forecasts: RMSEs
The results in Table 1 indicate that, for AR models, a model with stochastic volatility yields point forecasts that, in general, are about as accurate as or more accurate than forecasts from a model with constant volatility. For inflation and the interest rate, forecasts from the ARSV model are more accurate than forecasts from the AR model, with differences that are usually statistically significant. In the case of GDP growth and unemployment, the ARSV forecasts are about as good as or a little less accurate than the AR model forecasts. As examples, at the 4-quarter-ahead horizon in the 1975–2011 sample, the ratio of the AR RMSE to the ARSV RMSE is 0.997 for GDP growth and 0.962 for unemployment, while the corresponding ratios are 1.041 for inflation and 1.061 for the interest rate.
Table 1. Real-time forecast RMSEs (RMSEs for ARSV and VARSV benchmarks, RMSE ratios in all others)
 GDP growth, 1975:Q1–2011:Q2  GDP growth, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.742  0.748  0.745  0.747  0.448  0.461  0.488  0.491 
AR  1.003  1.006  0.997  0.995  0.986  0.990  0.976  0.974 
ARGARCH  0.982**  1.013  1.003  1.005  0.990  0.992  0.983**  1.004 
ARmixture  1.021  1.073  1.062  1.071  1.020  1.029  1.065  1.133 
ARstationary SV  1.002  1.002  0.996  0.996  1.000  0.995  0.987**  0.988** 
ARSVt, 5 df  1.004  1.010  1.006  1.006  1.012  1.016  1.010  1.016 
ARSVt  1.002  1.004  1.002  1.004  1.006  1.007  1.007  1.005 
ARTVPSV  1.024  1.044  1.024  1.019  1.008  1.006  1.021  1.050 
Multivariate models 
VARSV  0.738  0.717  0.720  0.732  0.481  0.495  0.492  0.473 
VAR  1.069**  1.078*  1.048  0.989  1.066***  1.059**  0.997  0.973 
VARGARCH  1.138***  1.265***  1.195***  1.082***  1.132**  1.139**  1.144  1.016 
VARstationary SV  1.018  1.014  1.012  0.989  1.019  1.019  0.998  0.994 
VARSVt, 5 df  1.008  0.998  1.000  1.006  1.007  1.004  0.979***  0.993* 
VARSVt  1.002  0.998  1.003  1.002  1.000  1.001  0.992**  0.998 
VARTVPSV  0.976  0.975  0.975  1.000  0.992  0.951**  0.991  1.005 
 Unemployment, 1975:Q1–2011:Q2  Unemployment, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.314  0.553  1.018  1.704  0.164  0.292  0.537  0.932 
AR  0.961  0.964  0.962  0.910  0.985  0.992  0.998  0.978 
ARGARCH  0.905  0.927  0.978  0.929  1.020  1.015  1.014  0.939 
ARmixture  0.962  1.100  1.302  1.528  1.112  1.177  1.283  1.366 
ARstationary SV  1.022  1.027  1.020  1.017  1.016  1.023  1.023  1.015 
ARSVt, 5 df  0.993**  0.998  0.999  0.994*  1.001  1.003  1.002  0.998 
ARSVt  0.997*  1.000  0.999  0.997*  1.002  1.000  1.000  0.999 
ARTVPSV  0.892**  0.912*  0.953  0.938  0.983  0.982  0.981  0.993 
Multivariate models 
VARSV  0.286  0.509  0.922  1.387  0.153  0.270  0.495  0.752 
VAR  1.010  1.031  1.050  1.004  1.049**  1.072*  1.082  0.986 
VARGARCH  1.071**  1.142***  1.149**  1.120*  1.100*  1.147  1.140  1.043 
VARstationary SV  1.007  1.012  1.018  1.015  1.017  1.017  1.018  0.995 
VARSVt, 5 df  1.008  1.014  1.020  1.007  1.016  1.015  1.020  1.001 
VARSVt  1.005  1.006  1.009  1.005  1.007  1.005  1.009  1.000 
VARTVPSV  0.947  0.969  0.984  0.993  0.971  0.961  0.972  1.051 
 Inflation, 1975:Q1–2011:Q2  Inflation, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.308  0.339  0.372  0.478  0.246  0.259  0.259  0.317 
AR  1.035***  1.035***  1.041**  1.046*  1.016**  1.038***  1.065***  1.119*** 
ARGARCH  1.059**  1.061***  1.090**  1.102  1.027  1.022  1.072**  1.096** 
ARmixture  1.137  1.142  1.163  1.121  1.036  0.987  1.019  0.993 
ARstationary SV  1.008  1.005  0.997  1.007  1.007  1.014  1.029  1.049 
ARSVt, 5 df  1.013  1.001  0.999  1.016  1.010  1.011  1.012  1.022 
ARSVt  1.004  1.002  0.998  1.003  1.005  1.004  1.009  1.006 
ARTVPSV  1.006  1.008  0.965  0.973  1.001  0.973  0.991  0.930 
Multivariate models 
VARSV  0.309  0.352  0.413  0.592  0.244  0.262  0.267  0.342 
VAR  1.022*  1.040**  1.003  0.986  1.032***  1.043***  1.080***  1.108*** 
VARGARCH  1.081  1.063  1.025  1.065  1.151**  1.030  1.113  1.026 
VARstationary SV  1.003  1.004  1.006  1.020  1.006  1.004  1.020  1.040 
VARSVt, 5 df  1.004  1.006  1.027  1.019  1.002  1.001  1.008  1.017 
VARSVt  1.005  1.005  1.011  1.008  0.998  1.000  0.998  1.001 
VARTVPSV  0.990  1.037  0.858**  0.783***  1.000  0.979  0.939  0.838** 
 Interest rate, 1975:Q1–2011:Q2  Interest rate, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.810  1.251  1.772  2.622  0.399  0.734  1.252  1.882 
AR  1.053***  1.072***  1.061  1.073  1.158***  1.137*  1.089  1.085 
ARGARCH  0.992  1.070  0.997  1.019  0.951  0.983  0.944**  0.971 
ARmixture  1.047  1.204  1.241  1.525  0.968  1.053  1.161  1.323 
ARstationary SV  0.999  0.999  0.997  0.999  1.003  1.000  1.000  1.003 
ARSVt, 5 df  0.997  0.999  1.003  1.005  1.003  1.006  1.008  1.020 
ARSVt  0.998  1.000  0.999  1.002  1.002  1.002  1.002  1.009 
ARTVPSV  0.951  1.025  0.958  0.996  0.885**  0.932*  0.962  1.038 
Multivariate models 
VARSV  0.787  1.206  1.735  2.591  0.399  0.728  1.223  1.885 
VAR  1.039**  1.056**  1.031*  1.021  1.072**  1.049**  0.995  0.970 
VARGARCH  1.065  1.165  1.068  1.156  1.001  1.057  0.964  0.959 
VARstationary SV  1.005  1.000  0.998  1.006  1.002  1.005  1.003  1.013 
VARSVt, 5 df  1.008  1.008  1.009  1.009  1.011  1.015  1.006  1.007 
VARSVt  1.004  1.006  1.005  1.004  1.003  1.006  1.002  1.000 
VARTVPSV  1.005  1.062  1.009  1.012  0.853**  0.901**  0.969  0.981* 
Among AR models with time-varying volatility, none of the alternative volatility specifications considered yields any consistent, sizable advantage over our stochastic volatility baseline. In some cases, some of these alternative models (GARCH and the ARmixture in particular) are significantly less accurate than the ARSV baseline. An example is the ARGARCH performance in inflation forecasting in the 1975–2011 sample. That said, there are some variable-horizon combinations for which GARCH, the mixture formulation, stationary SV or fat tails improve on the ARSV baseline. However, these advantages are not consistent (across variables and horizons), and they are typically small. As an example, allowing fat tails very slightly improves the accuracy of unemployment rate forecasts in the 1975–2011 sample. Finally, note that extending the ARSV model to include time-varying parameters (ARTVPSV) improves forecast accuracy in some cases (e.g. unemployment) and reduces it in others (e.g. GDP growth).
Turning to the results for VAR models, the specification with stochastic volatility (VARSV) fairly consistently and significantly improves on the accuracy of the model with constant volatility (VAR). For example, at horizons of 1 and 2 quarters, the RMSEs of the VAR forecasts of GDP growth are roughly 6–8% higher than the RMSEs of the VARSV forecasts. The advantage of the VARSV model over the VAR tends to be larger and more significant for inflation and the interest rate than for GDP growth and unemployment. Among VAR models with time-varying volatility, none of the alternative volatility specifications considered yields any consistent, sizable advantage over the stochastic volatility baseline. In fact, on balance, the VAR with GARCH is almost always dominated, often significantly, by the VARSV baseline, with RMSEs that often exceed baseline by 10% or more. But, as in the results for AR models, while there are some variable-horizon combinations in which specifications including stationary volatility or fat tails improve on the VARSV baseline, these advantages are not consistent (across variables and horizons), and they are typically small. Finally, extending the VARSV model to include time-varying parameters (VARTVPSV) improves forecast accuracy in many, although not all, cases, consistent with the findings in D'Agostino et al. (2013).
Overall, for point forecasting, including stochastic volatility in autoregressive models seems to help forecast accuracy more often than it harms accuracy, while among models with time-varying volatility, none of the alternatives we consider offers any consistent advantage over the (random walk) stochastic volatility baseline.
In these point forecasts, the gains in forecast accuracy provided by modeling time-varying volatility are not easily explained. In a very large sample, in the presence of time-varying volatility, the VAR (AR) and VARSV (ARSV) models should yield the same coefficient estimates. However, in our finite sample of data, the coefficient estimates differ. For each model, the posterior medians of the forecast distributions are very similar to (unreported) point forecasts obtained using just the posterior means of coefficients. Accordingly, the differences in VAR and VARSV forecasts are due to differences in (posterior mean) coefficient estimates, not some other effect of stochastic volatility on the predictive density. However, it is difficult to pinpoint large differences in coefficients across the two models that clearly drive the differences in forecasts. Instead, there seems to be a fairly large number of small to modest differences in coefficients that together lead to some differences in forecasts. Some of the larger differences in coefficients across the VAR and VARSV specifications seem to be in estimates of the interest rate equation, particularly in the coefficients on inflation and the interest rate. One more easily identified pattern is that the difference in accuracy of inflation forecasts is largely due to lower bias of forecasts from the VARSV model, which reflects a lower implied mean of inflation for the VARSV model than for the VAR model. However, the difference in mean inflation is also not easily linked to particular coefficient differences.
Consequently, to explain the gains to point forecast accuracy that come with modeling time-varying volatility, we are left to speculate that, in the finite sample, in the presence of sharp movements in volatility and persistent movements in variables such as unemployment, inflation and the interest rate, including stochastic volatility can help to reduce some adverse effects of volatility changes on constant-volatility parameter estimates.
4.3 Density Forecasts: Log Predictive Scores
The results in Table 2 for log predictive scores indicate that, for AR models, including stochastic volatility significantly improves the accuracy of density forecasts relative to models with constant volatility, although more so at shorter horizons than longer horizons. At shorter horizons, the gains in average predictive scores are typically much bigger than the differences in RMSEs associated with stochastic volatility models. As an example, in the 1985–2007 sample, for 1-quarter-ahead forecasts of GDP growth, the AR model improves on the RMSE of the ARSV baseline by 1.4%, while the ARSV model has a predictive score that is 26% higher than that of the AR specification.
Table 2. Average log predictive scores (scores for ARSV and VARSV benchmarks, score differences in all others)
 GDP growth, 1975:Q1–2011:Q2  GDP growth, 1985:Q1–2007:Q4
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q
Univariate models
ARSV  −1.017  −1.088  −1.146  −1.218  −0.695  −0.739  −0.790  −0.790
AR  −0.134***  −0.090  −0.039  0.027  −0.260***  −0.249***  −0.219***  −0.227***
ARGARCH  −0.037  −0.068**  −0.065  −0.052*  −0.078***  −0.100***  −0.085  −0.120**
ARmixture  −0.389  −0.420  −0.388  −0.333  −0.106  −0.110  −0.165  −0.173
ARstationary SV  −0.044  −0.024  −0.009  0.025  −0.088  −0.079  −0.067  −0.093
ARSVt, 5 df  −0.022  −0.020  −0.004  0.013  −0.047  −0.044  −0.043  −0.054
ARSVt  −0.010  −0.005  −0.006  0.008  −0.015  −0.013  −0.016  −0.019
ARTVPSV  −0.010  −0.009  0.000  0.010  −0.015  −0.018  −0.031  −0.068
Multivariate models
VARSV  −0.999  −1.068  −1.102  −1.184  −0.714  −0.769  −0.761  −0.777
VAR  −0.179***  −0.119**  −0.084  0.002  −0.243***  −0.204***  −0.214***  −0.227***
VARGARCH  −0.179***  −0.231***  −0.275***  −0.216  −0.193***  −0.236***  −0.407***  −0.477***
VARstationary SV  −0.065  −0.034  −0.032  0.017  −0.129  −0.116  −0.153  −0.179
VARSVt, 5 df  −0.036  −0.022  −0.002  −0.003  −0.046  −0.038  −0.028  −0.039
VARSVt  −0.017  −0.012  0.002  −0.003  −0.015  −0.016  −0.005  −0.003
VARTVPSV  0.027  0.053**  0.021  0.005  0.012  0.049  0.006  −0.014
 Unemployment, 1975:Q1–2011:Q2  Unemployment, 1985:Q1–2007:Q4
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q
Univariate models
ARSV  −0.094  −0.790  −1.921  −3.100  0.324  −0.231  −0.979  −1.774
AR  −0.109*  0.008  0.483  1.069  −0.212***  −0.174  0.046  0.400
ARGARCH  0.056  0.112  0.500  1.123  −0.049  −0.123  −0.028  0.298
ARmixture  −0.095  0.092  0.449  0.779  −0.090  −0.112  −0.163  −0.324
ARstationary SV  0.028  0.066  0.243  0.453  −0.014  0.012  0.119  0.299*
ARSVt, 5 df  −0.035  −0.024  0.088  0.109  −0.032  −0.026  0.038  0.118*
ARSVt  −0.020  −0.029  0.030  0.051  −0.022  −0.013  0.024  0.054
ARTVPSV  0.046  0.081  0.333  0.772  0.006  −0.024  0.063  0.300
Multivariate models
VARSV  0.025  −0.596  −1.447  −2.447  0.413  −0.125  −0.729  −1.225
VAR  −0.178***  −0.168***  −0.032  0.370  −0.229***  −0.224***  −0.137  0.076
VARGARCH  −0.515***  −0.430*  −0.101  0.246  −0.791***  −0.747***  −0.510***  −0.183
VARstationary SV  −0.032  −0.014  0.065  0.326  −0.050  −0.054  −0.020  0.081
VARSVt, 5 df  −0.034  −0.039  −0.024  0.079  −0.046  −0.039  −0.031  0.016
VARSVt  −0.017  −0.023  −0.041  −0.011  −0.015  −0.010  −0.018  −0.000
VARTVPSV  0.025  0.038  0.071  0.269  0.020  0.004  −0.010  −0.013
 Inflation, 1975:Q1–2011:Q2  Inflation, 1985:Q1–2007:Q4
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q
Univariate models
ARSV  −0.192  −0.272  −0.355  −0.573  −0.001  −0.078  −0.112  −0.322
AR  −0.078***  −0.090***  −0.108***  −0.117***  −0.078**  −0.105***  −0.174***  −0.166***
ARGARCH  −0.119  −0.100  −0.137**  −0.116***  −0.185  −0.142  −0.196  −0.137**
ARmixture  −0.336  −0.368  −0.230  −0.341  −0.356  −0.312  −0.239  −0.507
ARstationary SV  −0.015  −0.015  −0.043  −0.057  −0.034  −0.044  −0.092  −0.077
ARSVt, 5 df  −0.010  −0.017  −0.036  −0.039  −0.016  −0.025  −0.056  −0.055
ARSVt  0.000  0.000  −0.005  −0.007  −0.002  −0.004  −0.015  −0.009
ARTVPSV  0.017  0.019  0.029  0.052  0.014  0.038*  0.034*  0.078*
Multivariate models
VARSV  −0.203  −0.302  −0.424  −0.718  −0.011  −0.104  −0.146  −0.404
VAR  −0.041*  −0.072**  −0.085**  −0.098**  −0.069**  −0.092***  −0.169***  −0.147***
VARGARCH  −0.386***  −0.367***  −0.374***  −0.376***  −0.526***  −0.506***  −0.582***  −0.510***
VARstationary SV  −0.007  −0.003  −0.036  −0.048  −0.029  −0.030  −0.082  −0.065
VARSVt, 5 df  −0.014  −0.018  −0.028  −0.015  −0.005  −0.005  −0.039  −0.038
VARSVt  −0.006  0.000  −0.004  0.003  0.001  0.005  −0.009  −0.005
VARTVPSV  0.029  0.020  0.076*  0.150***  0.003  0.022  0.072  0.169**
 Interest rate, 1975:Q1–2011:Q2  Interest rate, 1985:Q1–2007:Q4
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q
Univariate models
ARSV  −0.902  −1.570  −2.324  −2.966  −0.598  −1.279  −1.974  −2.264
AR  −0.420***  −0.246  0.139  0.092  −0.285***  −0.045  0.224  0.115
ARGARCH  0.056  0.121  0.400**  0.471  0.137*  0.171  0.339**  0.206
ARmixture  −0.136  −0.021  0.299  0.294  0.024  0.029  0.098  −0.193
ARstationary SV  0.032  0.079*  0.234**  0.236  0.048  0.121*  0.219**  0.159**
ARSVt, 5 df  −0.011  0.039  0.119*  0.123*  −0.009  0.043  0.089  0.008
ARSVt  0.005  0.020  0.060**  0.077*  0.009  0.026  0.042*  −0.001
ARTVPSV  0.093**  0.111*  0.332*  0.449  0.136***  0.161**  0.241**  0.126
Multivariate models
VARSV  −0.834  −1.501  −2.204  −2.921  −0.471  −1.145  −1.853  −2.263
VAR  −0.434***  −0.253  0.090  0.200  −0.350***  −0.108  0.195  0.226
VARGARCH  −0.326***  −0.160  0.192  0.368  −0.378***  −0.176  0.115  0.109
VARstationary SV  −0.004  0.039  0.139**  0.193*  −0.000  0.039  0.124*  0.107
VARSVt, 5 df  −0.040  0.003  0.036  0.015  −0.042  −0.015  0.011  −0.039
VARSVt  −0.023  −0.006  −0.000  −0.025  −0.017  −0.012  −0.014  −0.048
VARTVPSV  0.068**  0.079*  0.190*  0.324*  0.100***  0.118*  0.170  0.164
As in the point forecasts from the set of AR models, none of the alternative volatility specifications yields any consistent, sizable advantage over our stochastic volatility baseline. Some models offer an advantage in isolated cases, but not systematically. However, the results for the unemployment rate yield some more notable, although not significant, differences. The outcomes for the unemployment rate during the 2007–2009 recession fell further into the extremes of the tails of the forecast distribution than did the outcomes of other variables. This is especially true with stochastic volatility, which implied at the time that forecast uncertainty was low by historical norms. The log score for ARSV forecasts of the unemployment rate declines very sharply as the forecast horizon rises, more so in the sample that goes through 2011 than in the sample that ends in 2007. As a consequence, at the longer forecast horizons, models such as the AR and the ARGARCH yield a much higher score than the ARSV baseline in the 1975–2011 sample. However, these gains are not statistically significant, despite their size.
The patterns are broadly similar in results for the set of VAR models. With the multivariate specifications, including stochastic volatility with the VARSV model often improves on the log scores of the constant-volatility VAR model, more so at shorter horizons than longer horizons. At the 1-quarter horizon, the score advantage of the VARSV model over the VAR is roughly 20% for growth and unemployment forecasts, less than 10% for inflation and 40% for the interest rate. The VARSV model dominates the VAR with GARCH, with the exception of unemployment and interest rates at longer horizons. Making volatility stationary as in the VARstationary SV model improves scores in some cases (e.g. unemployment and interest rates at longer horizons) and lowers them in others (e.g. growth and inflation at most horizons). Adding fat tails to stochastic volatility typically lowers the log scores by a small amount. Finally, consistent with the results of D'Agostino et al. (2013), adding TVP to the VARSV model typically (not always) improves density forecast accuracy, by amounts that are sometimes small and other times sizable enough to be statistically significant.
Overall, these results show that including stochastic volatility in autoregressive models typically yields sizable gains in density accuracy as measured by log scores, whereas among models with time-varying volatility none of the alternatives we consider offers any consistent advantage over the stochastic volatility baseline.
4.4 Density Forecasts: CRPS
In the CRPS results for AR models shown in Table 3, including stochastic volatility consistently improves, often significantly, the accuracy of density forecasts relative to models with constant volatility, typically more so at shorter horizons than longer horizons (recall that a lower CRPS indicates better performance). For example, in interest rate forecasts over the 1985–2007 sample, the ratio of the CRPS of the AR model relative to the ARSV model is 1.305 at the 1-quarter horizon, 1.165 at the 2-quarter horizon and 1.052 at the 4-quarter horizon. Among models with time-varying volatility, none of the alternatives offers any consistent advantage over the ARSV baseline. The ARmixture model is almost always worse than the baseline. The ARGARCH specification is usually less accurate than the ARSV baseline, except in interest rate forecasting. Allowing fat tails or making volatility stationary typically reduces density forecast accuracy by a small amount, although in a few cases these model enhancements yield small improvements in forecast accuracy (e.g. the ARstationary SV model offers small, statistically significant reductions in CRPS in longer horizon forecasts of interest rates). One other finding worth noting is that, in the case of the unemployment rate forecasts, the CRPS-based performance of the baseline model with stochastic volatility does not deteriorate as rapidly with the forecast horizon as did the log score-based performance of the same model. As a result, as the forecast horizon increases, the model with constant volatility does not improve as much in relative terms under the CRPS measure as it did under the log score measure. This pattern reflects the fact that the CRPS is less sensitive to outlier outcomes.
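The lower sensitivity of the CRPS to outlier outcomes can be illustrated with a small Python sketch. It assumes, purely for illustration, a Gaussian predictive density, for which both the log score and the CRPS have well-known closed forms; the scores in the tables are not computed this way, and the function names and numbers below are hypothetical. The sketch compares a tight forecast density with a wider one when the outcome lands far in the tail.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def log_score_gaussian(y, mu, sigma):
    """Log predictive density of outcome y under N(mu, sigma^2); higher is better."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * z * z

def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of outcome y under N(mu, sigma^2); lower is better."""
    z = (y - mu) / sigma
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0)
                    + 2.0 * normal_pdf(z)
                    - 1.0 / math.sqrt(math.pi))

# An outlier outcome relative to a forecast centered at zero (toy values).
outcome, center = 4.0, 0.0
tight_sd, wide_sd = 0.5, 1.5  # low versus high predicted uncertainty

# Log score: the tight density is penalized quadratically in the standardized miss.
score_gap = (log_score_gaussian(outcome, center, wide_sd)
             - log_score_gaussian(outcome, center, tight_sd))

# CRPS: the penalty grows roughly linearly in the miss, so the two are much closer.
crps_ratio = (crps_gaussian(outcome, center, tight_sd)
              / crps_gaussian(outcome, center, wide_sd))

print(score_gap, crps_ratio)
```

In this toy case the tight density loses by a very large margin in log score, while its CRPS is only moderately higher than that of the wide density. This mirrors the way a model that understates uncertainty ahead of an extreme outcome is punished far more heavily by the log score than by the CRPS.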
Table 3. Average CRPS (CRPS for ARSV and VARSV benchmarks, CRPS ratios in all others)
 GDP growth, 1975:Q1–2011:Q2  GDP growth, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.381  0.388  0.394  0.397  0.261  0.267  0.282  0.283 
AR  1.059***  1.057***  1.042**  1.036*  1.136***  1.140***  1.114***  1.120*** 
ARGARCH  1.007  1.027**  1.022  1.021  1.039**  1.053***  1.045  1.063** 
ARmixture  1.070  1.076  1.082  1.086  1.025  1.023  1.074  1.132 
ARstationary SV  1.017  1.015  1.009  1.003  1.040  1.042  1.030  1.041 
ARSVt, 5 df  1.004  1.008  1.006  1.004  1.007  1.013  1.017  1.019 
ARSVt  1.002  1.000  1.004  0.999  1.000  0.999  1.010  1.007 
ARTVPSV  1.021  1.027  1.032  1.029  1.010  1.011  1.033  1.058 
Multivariate models 
VARSV  0.388  0.386  0.389  0.395  0.275  0.281  0.279  0.276 
VAR  1.111***  1.108***  1.080**  1.033*  1.150***  1.145***  1.121***  1.121*** 
VARGARCH  1.160***  1.269***  1.257***  1.205***  1.162***  1.213***  1.315***  1.349*** 
VARstationary SV  1.028  1.025  1.021  1.005  1.047  1.060  1.056  1.065 
VARSVt, 5 df  1.010  1.002  0.996  1.003  1.009  1.014  0.987**  0.994 
VARSVt  1.004  1.002  1.002  1.000  1.000  1.006  0.997  0.994* 
VARTVPSV  0.964  0.955*  0.977  1.009  0.991  0.958**  0.995  1.011 
 Unemployment, 1975:Q1–2011:Q2  Unemployment, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.154  0.274  0.526  0.955  0.093  0.158  0.293  0.530 
AR  1.019  1.009  0.977  0.891  1.131**  1.140  1.096  1.005 
ARGARCH  0.959  0.980  0.995  0.900  1.063  1.111  1.123  0.999 
ARmixture  1.045  1.104  1.204  1.276  1.151  1.212  1.324  1.457 
ARstationary SV  1.008  1.013  1.009  0.992  1.010  1.014  1.008  0.978* 
ARSVt, 5 df  0.998  1.001  1.000  0.989  1.008  1.014  1.011  0.995 
ARSVt  0.999  1.003  1.002  0.992  1.002  1.005  1.005  0.999 
ARTVPSV  0.947  0.964  0.972  0.940  1.018  1.047  1.047  1.046 
Multivariate models 
VARSV  0.145  0.255  0.464  0.719  0.088  0.152  0.282  0.448 
VAR  1.060***  1.077**  1.061  0.973  1.137***  1.155***  1.115  0.966 
VARGARCH  1.340***  1.340***  1.246***  1.146**  1.742***  1.632***  1.384**  1.102 
VARstationary SV  1.013  1.015  1.012  0.997  1.027  1.024  1.010  0.974 
VARSVt, 5 df  1.016  1.024  1.020  1.005  1.022  1.021  1.025  1.005 
VARSVt  1.009  1.009  1.010  1.004  1.015  1.009  1.010  1.003 
VARTVPSV  0.951  0.955  0.973  1.016  0.984  0.969  0.970  1.017 
 Inflation, 1975:Q1–2011:Q2  Inflation, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.169  0.183  0.203  0.260  0.138  0.146  0.149  0.186 
AR  1.039***  1.037***  1.061***  1.076***  1.038**  1.066***  1.115***  1.127*** 
ARGARCH  1.103  1.088  1.118*  1.112*  1.168  1.145  1.188  1.132* 
ARmixture  1.122  1.103  1.092  1.047  1.071  1.023  1.032  0.964 
ARstationary SV  1.002  1.005  1.015  1.026  1.012  1.023  1.056  1.055 
ARSVt, 5 df  1.006  1.005  1.007  1.022  1.010  1.013  1.019  1.022 
ARSVt  0.999  1.002  1.000  1.008  1.004  1.006  1.006  1.003 
ARTVPSV  0.997  0.989  0.966  0.958  0.996  0.970  0.981  0.924 
Multivariate models 
VARSV  0.171  0.188  0.217  0.309  0.139  0.151  0.153  0.200 
VAR  1.027**  1.039**  1.034  1.027  1.039**  1.057***  1.125***  1.120*** 
VARGARCH  1.251***  1.249***  1.228***  1.179**  1.401***  1.394**  1.518***  1.381*** 
VARstationary SV  0.998  0.994  1.011  1.028  1.006  1.006  1.042  1.032 
VARSVt, 5 df  1.002  1.000  1.017  1.014  0.999  0.998  1.011  1.015 
VARSVt  0.999  1.002  1.004  1.004  0.997  0.999  0.999  0.999 
VARTVPSV  0.976  1.010  0.893**  0.800***  0.989  0.975  0.944  0.839** 
 Interest rate, 1975:Q1–2011:Q2  Interest rate, 1985:Q1–2007:Q4 
h = 1Q  h = 2Q  h = 4Q  h = 8Q  h = 1Q  h = 2Q  h = 4Q  h = 8Q 
Univariate models 
ARSV  0.365  0.627  1.000  1.594  0.220  0.416  0.737  1.134 
AR  1.162***  1.138***  1.047  1.035  1.305***  1.165***  1.052  1.033 
ARGARCH  0.971  1.019  0.930***  0.957*  0.952  0.967  0.903**  0.926 
ARmixture  1.021  1.102  1.103  1.273  0.977  1.057  1.141  1.319 
ARstationary SV  0.997  1.002  0.983***  0.977***  1.000  0.993  0.984  0.978** 
ARSVt, 5 df  1.003  1.003  0.998  0.997  1.008  1.007  1.004  1.009 
ARSVt  1.000  1.004  0.995  0.998  1.001  1.007  1.003  1.007 
ARTVPSV  0.933  0.985  0.927*  0.974  0.896**  0.926*  0.938  1.001 
Multivariate models 
VARSV  0.361  0.615  0.975  1.575  0.216  0.406  0.710  1.135 
VAR  1.135***  1.096**  1.010  0.971  1.249***  1.095**  0.980  0.924 
VARGARCH  1.136*  1.161*  1.025  1.041  1.240***  1.122*  0.992  0.960 
VARstationary SV  1.004  1.001  0.993  0.985  0.999  0.998  0.994  0.996 
VARSVt, 5 df  1.013  1.013  1.010  1.002  1.012  1.013  1.012  1.009 
VARSVt  1.010  1.004  1.005  1.005  1.006  1.005  1.007  1.004 
VARTVPSV  0.940*  0.994  0.981  0.989  0.872***  0.905**  0.951  0.956* 
With VAR models, the patterns are broadly similar. In most cases, compared to a VAR with constant volatility, a VAR including stochastic volatility improves density accuracy as measured by the CRPS, again more so at shorter horizons than longer horizons. Moreover, at horizons of 1 and 2 quarters, the gains in accuracy associated with stochastic volatility are statistically significant. Among models with time-varying volatility, no other specification offers consistent improvement over the VARSV baseline. Notably, the VARGARCH model performs significantly worse for almost every variable and horizon combination. Making stochastic volatility stationary or adding fat tails does not have much effect on CRPS-based density accuracy; these extensions slightly reduce accuracy in some cases (e.g. the performance of the VARstationary SV specification with GDP growth forecasts) and improve it in others (e.g. the performance of the VARSVt model in longer horizon GDP growth forecasts in the 1985–2007 sample). Once again, though, adding TVP to the VAR with stochastic volatility typically improves forecast accuracy, especially for inflation and the interest rate.
5 Conclusions
This paper compares, from a forecasting perspective, alternative models of time-varying macroeconomic volatility, included within AR and VAR specifications for key macroeconomic indicators. The set of models includes constant volatility; random walk stochastic volatility; stochastic volatility following a stationary AR process; stochastic volatility coupled with fat tails; GARCH; and a mixture of innovations model. The forecast comparisons cover GDP growth, the unemployment rate, inflation in the GDP deflator and a short-term interest rate from 1975 to 2011. Our results indicate that the AR and VAR specifications with stochastic volatility dominate models with alternative volatility specifications, in terms of point forecasting to some degree and density forecasting to a greater degree, in particular when using proper scoring rules such as the CRPS. We conclude that, from a macroeconomic forecasting perspective, these alternative volatility specifications seem to have no advantage over the now widely used random walk stochastic volatility specification.
While this paper has focused on economic forecasting, we suggest the results have implications for macroeconomic modeling. There has been considerable effort over the last several years to enable DSGE models to account for time-varying volatility, to be able to explain the sources of the Great Moderation and other changes in volatility (examples include Justiniano and Primiceri, 2008; Fernandez-Villaverde and Rubio-Ramirez, 2007; Fernandez-Villaverde et al., 2010; and Curdia et al., 2013). These studies have examined questions including which shocks drive time-varying volatility, the roles of different shocks in business cycle transmission and the roles of coefficients compared to shock sizes. While some work with DSGE models has considered GARCH (e.g. Andreasen, 2012) and Markov switching (e.g. Bianchi, 2013), most work on DSGE models with time-varying volatility has focused on random walk stochastic volatility. Our finding that, in forecasting, stochastic volatility is generally at least as good as most other readily tractable (if sometimes more complicated) volatility specifications supports the focus of structural modeling on stochastic volatility.