## 1. INTRODUCTION

This paper analyses the performance of predictive volatility models built to exploit high-frequency data. We do this by developing a class of models we call high-frequency-based volatility (HEAVY) models, which are designed to harness high-frequency data to make multistep-ahead predictions of the volatility of returns. These models allow for both mean reversion and momentum, are somewhat robust to certain types of structural break, and adjust rapidly to changes in the level of volatility. We run the models across periods in which the level of volatility varied substantially, to assess their ability to perform in stressful environments.

Our approach to inference will be based on the ‘Oxford-Man Institute's realised library’ of historical volatility statistics, constructed from high-frequency data. These statistics are based on a variety of theoretically sound non-parametric estimators of the daily variation of prices. In particular, the library includes two estimators of interest to us. The first is the realised variance, which was systematically studied by Andersen *et al.* (2001a) and Barndorff-Nielsen and Shephard (2002). The second, which has some robustness to market microstructure effects, is the realised kernel, which was introduced by Barndorff-Nielsen *et al.* (2008). Alternatives to the realised kernel include the multiscale estimators of Zhang *et al.* (2005) and Zhang (2006) and the pre-averaging estimator of Jacod *et al.* (2009).1
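For concreteness, the realised variance is simply the sum of squared intraday returns, while the realised kernel adds Parzen-weighted return autocovariances to mitigate microstructure noise. The following sketch illustrates both; the function names are ours, and the bandwidth `H` is left as a user choice rather than the data-driven selection used in practice:

```python
import numpy as np

def realised_variance(intraday_returns):
    """Sum of squared intraday returns: a consistent estimator of the
    daily variation of prices in the absence of microstructure noise."""
    r = np.asarray(intraday_returns)
    return np.sum(r ** 2)

def parzen(x):
    """Parzen weight function, used to downweight higher-order autocovariances."""
    x = abs(x)
    if x <= 0.5:
        return 1 - 6 * x**2 + 6 * x**3
    if x <= 1.0:
        return 2 * (1 - x)**3
    return 0.0

def realised_kernel(intraday_returns, H):
    """Parzen-weighted sum of return autocovariances up to lag H.
    With H = 0 this collapses to the realised variance."""
    r = np.asarray(intraday_returns)
    rk = np.sum(r * r)  # lag-0 autocovariance
    for h in range(1, H + 1):
        gamma_h = np.sum(r[h:] * r[:-h])
        rk += 2 * parzen(h / (H + 1)) * gamma_h
    return rk
```

The kernel's weighting of autocovariances is what delivers its robustness: spurious serial correlation in returns induced by microstructure noise is absorbed rather than counted as variation.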

The focus of this paper is on predictive models, rather than on non-parametric measurement of past volatility. Torben Andersen, Tim Bollerslev and Frank Diebold, with various co-authors, have carried out important work on predicting volatility using realised variances. Typically they fit reduced-form time series models to the sequence of realised variances—e.g. autoregressions or long-memory models fitted to the realised volatilities or their logarithms. Examples of this work include Andersen *et al.* (2001a,b, 2003, 2007).
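A minimal sketch of this reduced-form approach, on simulated data, is an autoregression fitted to log realised variance by ordinary least squares. Everything below (parameter values, the simulated series, the naive exponential back-transform, which ignores any Jensen correction) is illustrative, not a reproduction of that literature's specifications:

```python
import numpy as np

def fit_ar1(x):
    """OLS fit of x_t = c + phi * x_{t-1} + e_t; returns (c, phi)."""
    y, lag = x[1:], x[:-1]
    X = np.column_stack([np.ones_like(lag), lag])
    c, phi = np.linalg.lstsq(X, y, rcond=None)[0]
    return c, phi

# Simulate a persistent log-realised-variance series (illustrative values)
rng = np.random.default_rng(42)
true_c, true_phi = -0.5, 0.95
log_rv = np.empty(2000)
log_rv[0] = true_c / (1 - true_phi)  # start at the unconditional mean
for t in range(1, 2000):
    log_rv[t] = true_c + true_phi * log_rv[t - 1] + rng.normal(0, 0.3)

c, phi = fit_ar1(log_rv)
one_step = c + phi * log_rv[-1]   # one-step-ahead forecast of log RV
rv_forecast = np.exp(one_step)    # naive back-transform to the variance scale
```

The appeal of this approach is its simplicity: the realised measure is treated as an observed time series, so standard forecasting machinery applies directly.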

The approach we follow in this paper is somewhat different. We build models out of the intellectual insights of the ARCH literature pioneered by Engle (1982) and Bollerslev (1986), but bolster them with high-frequency information. We call the resulting models HEAVY models. They also draw on ideas from Engle (2002), Engle and Gallo (2006) and Cipollini *et al.* (2007) on pooling information across multiple volatility indicators, and from Brownlees and Gallo (2009) on risk management using realised measures. Our analysis can be thought of as taking a small subset of the Engle *et al.* models and analysing them in depth for a specific purpose, examining their performance over many assets. Our model structure is very simple, which allows us to cleanly understand its general features, strengths and potential weaknesses. We provide no new contribution to estimation theory, relying instead on existing quasi-likelihood results. We show that when we marginalise out the effect of the realised measures, HEAVY models of squared returns have some similarities with the component GARCH model of Engle and Lee (1999). HEAVY models are, however, much easier to estimate, as they bring two sources of information to bear in identifying the longer-term component of volatility. We further find that the additional information in the realised measure generates out-of-sample gains, which are particularly strong when the parameters of the model are estimated to match the prediction horizon, using so-called ‘direct projection’.
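Although the formal definitions are deferred to Section 2, the flavour of bringing two sources of information together can be sketched as a pair of linear recursions: one uses the lagged realised measure to drive the conditional variance of returns, and a second forecasts the realised measure itself, with multistep predictions obtained by chaining the two. The parameter values below are purely illustrative assumptions, not estimates from the paper:

```python
def heavy_forecast(rm_last, h_last, mu_last, horizon,
                   omega=0.05, alpha=0.3, beta=0.6,
                   omega_r=0.04, alpha_r=0.4, beta_r=0.55):
    """Iterate a two-equation HEAVY-style recursion for multistep forecasts.

    h_t  = omega   + alpha   * RM_{t-1} + beta   * h_{t-1}   (variance of returns)
    mu_t = omega_r + alpha_r * RM_{t-1} + beta_r * mu_{t-1}  (mean of realised measure)

    Beyond one step ahead the unknown future realised measure is
    replaced by its own forecast mu, which is valid here because the
    recursions are linear. All parameter values are illustrative.
    """
    h, mu, rm = h_last, mu_last, rm_last
    forecasts = []
    for _ in range(horizon):
        h = omega + alpha * rm + beta * h
        mu = omega_r + alpha_r * rm + beta_r * mu
        rm = mu  # substitute the forecast for the unobserved future measure
        forecasts.append(h)
    return forecasts
```

Note how the second recursion does the heavy lifting at long horizons: as the forecast of the realised measure mean-reverts to its unconditional level, the variance forecast converges to its own long-run value, which is the sense in which the realised measure helps identify the longer-term component of volatility.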

The structure of this paper is as follows. In Section 2 we define HEAVY models, which use realised measures as the basis for multi-period-ahead forecasting of volatility, and provide a detailed analysis of these models. In Section 3 we detail the main properties of the ‘Oxford-Man Institute's realised library’, which we use throughout the paper. In Section 4 we fit the HEAVY models to the data and compare their predictions to those familiar from GARCH processes. Section 5 discusses possible extensions. Section 6 draws some conclusions.