Keywords:

  • Change-points;
  • CUSUM;
  • long memory;
  • mean change;
  • unit-root;
  • variance change

Abstract


This paper gives an account of some of the recent work on structural breaks in time series models. In particular, we show how procedures based on the popular cumulative sum (CUSUM) statistics can be modified to work also for data exhibiting serial dependence. Structural breaks in the unconditional and conditional mean as well as in the variance and covariance/correlation structure are covered. CUSUM procedures are nonparametric by design. If the data allow for parametric modeling, we demonstrate how likelihood approaches may be utilized to recover structural breaks. The estimation of multiple structural breaks is discussed. Furthermore, we cover how one can disentangle structural breaks (in the mean and/or the variance) on the one hand and long memory or unit roots on the other. Several new lines of research are briefly mentioned.


1. Introduction


The analysis of structural breaks, or change-points, has its origins in quality control (Page, 1954, 1955) but has since become an integral part of a wide variety of fields with a significant statistical component, among them economics (Perron, 2006), finance (Andreou and Ghysels, 2009), climatology (Reeves et al., 2007) and engineering (Stoumbos et al., 2000). Much of the methodology was first developed for independent observations, so naturally procedures to detect instabilities in mean and variance have played a dominant role. The relevant lines of research containing results on simple location shift and scale shift models, and on more complex regression models for the independent case, may be found in Brodsky and Darkhovsky (1993), Carlstein et al. (1994) and Csörgő and Horváth (1997).

In many applications, however, it is of interest to incorporate serial dependence of the observations into the statistical analysis. It is this case that the paper is concerned with. Two approaches to deal with the time series effect have emerged. The first aims at quantifying the effect of dependence on the test statistics developed for the independent setting and then at extending their reach to include also the second-order properties as given, for example, in the autocorrelation function. In this case, the fitting of a particular parametric time series model may be avoided. This is advantageous whenever ambiguity arises at the model fitting stage and model misspecification becomes an issue. The first approach then leads to establishing functional central limit theorems for the dependent case and, most crucially, to deriving appropriate estimators for the long-run variances. The second approach utilizes particular time series models and seeks to explicitly describe the dependence structure concurrently with potential structural breaks in the observations. Most popular are the classes of linear ARMA and nonlinear GARCH-type models. Since parametric assumptions are being made, likelihood methods are available and can be used to design relevant test statistics.

Traditionally, structural break problems have been phrased as hypothesis tests. The null hypothesis is set up to describe structural stability, while the alternative contains one or multiple structural breaks. The test statistics may be viewed as two-sample tests adjusted for the unknown break location, thus leading to max-type procedures. Often, asymptotic relationships are derived to obtain critical values for the tests. After the null hypothesis is rejected, the location(s) of the break(s) need(s) to be estimated. This is the setting covered, for example, in Bai and Perron (1998) and Csörgő and Horváth (1997). Lately, there have been attempts to view structural break estimation as a model selection problem, see Davis et al. (2006), Lu et al. (2010) and Robbins et al. (2011). Besides these contributions to retrospective methods, sequential detection procedures have been developed. This is in line with the original interest in quality control (Page, 1954), where one monitors the output of a production line and wishes to signal deviations from the null hypothesis (the in-control scenario) quickly. These methods have been extended to other areas of application, notably economics and finance, in a number of recent publications, see Aue et al. (2009e) and Chu et al. (1996).

During the past decade or so, structural breaks research has focused more on how to disentangle structural breaks from other forms of departures from the null model such as long memory and unit roots as they have a similar effect on the second-order properties of a time series. For example, all induce a slow decay in the autocorrelation function. Empirical research findings by Bhattacharya et al. (1983) and others were subsequently made rigorous by a number of authors, among them Giraitis et al. (2003), Berkes et al. (2006), Aue et al. (2009d), Harvey et al. (2010).

The contents of this paper are selective. We did not have the space to cover, for example, frequency domain and wavelet-based methods. The interested reader may refer to Picard (1985), Adak (1998), Lavielle and Ludeña (2000) and Ombao et al. (2001) for more information. Bayesian change-point methods can be found in Barry and Hartigan (1993) and Chib (1998). Additional references for contributions in econometrics and finance may be found in Andreou and Ghysels (2009) and Perron (2006).

The paper is organized as follows. In Section 2, we discuss how the popular CUSUM procedures may be adjusted to serial dependence in the observations. Section 3 covers likelihood procedures for parametric time series models such as the popular ARMA processes. Section 4 deals with the estimation of multiple break points. Section 5 contains information on how to distinguish structural breaks from other departures from the null model, namely long memory and unit roots. Section 6 summarizes miscellanea, such as sequential procedures for time series and the detection of breaks in functional data. We have tried to keep the presentation of results illustrative. The interested reader may find a more in-depth analysis of results in the original research papers, many of which are given in the extensive yet still selective list of references.

2. CUSUM procedures under dependence


Let ℤ denote the set of integers. One often used model is the signal-plus-noise model

  • Yt = μt + ɛt,  t ∈ ℤ, (1)

where (μt : t ∈ ℤ) is the signal and (ɛt : t ∈ ℤ) the noise component, which has E[ɛt] = 0 and E[ɛt²] < ∞. In many instances it is of interest to test for the structural stability of the signal. Structural stability may mean one of the following:

  •  The signal is constant, that is, all μt ≡ μ are identical. This is equivalent to saying that the unconditional mean of the Yt’s does not change over time.
  •  If one has additional information on the form of the signal, expressed through an r-dimensional covariate Xt, one can use μt = Xtᵀβt, with ᵀ signifying transposition, and work in a linear regression framework. Structural stability then refers to the constancy of the regression coefficients, that is, all βt ≡ β are identical, which is equivalent to saying that the form of the conditional mean does not change over time.
  •  In some cases, one is also interested in testing for the structural stability of the (conditional) variance. Here σt² = Var(Yt) may potentially be a function of time. More generally, it is of interest in the time series context to look into the stability of the second-order structure as expressed through the autocovariance function γ(h) = E[YtYt+h], h ∈ ℤ, assuming zero-mean stationarity of (Yt : t ∈ ℤ).

It is the first two bullet points that this section is mainly concerned with, but some remarks will be added at the end on the stability of variances and covariances/correlations. Testing for the constancy of the unconditional mean is one of the most often studied problems in change-point analysis. Csörgő and Horváth (1997) provide a survey of various methods for this case if the data can be assumed independent. Most prominent among these methods are versions of what is known as the cumulative sum (CUSUM) procedure. In this section, we show how CUSUM procedures may be adjusted to time series data.

2.1. Structural breaks in the unconditional mean

To be specific, assume that n observations Y1,…, Yn have been taken from the real-valued stochastic process (Yt : t ∈ ℤ) in (1), and that we are interested in testing the null hypothesis of constant (unconditional) means

  • H0 : μ1 = ⋯ = μn

against the alternative that H0 is not true and that the mean has changed at least once during the observation period. The (rescaled) CUSUM process Zn = (Zn(x) : x ∈ [0, 1]) of the observations is given by

  • Zn(x) = n^{−1/2} ( ∑_{t=1}^{⌊nx⌋} Yt − (⌊nx⌋/n) ∑_{t=1}^{n} Yt ),  x ∈ [0, 1], (2)

where ⌊·⌋ denotes integer part. Note that, under H0, Zn(x) = n^{−1/2} ( ∑_{t=1}^{⌊nx⌋} (Yt − μ) − (⌊nx⌋/n) ∑_{t=1}^{n} (Yt − μ) ), so that the value of the CUSUM is then independent of the unknown (but common) mean μ. It is now important to quantify the large-sample behavior of the partial sums of (ɛt : t ∈ ℤ). Since the observations are possibly correlated, the classical Functional Central Limit Theorem (FCLT) as described, for example, in Billingsley's (1968) monograph cannot be applied directly. Define the standardized partial sum process Sn = (Sn(x) : x ∈ [0, 1]) by

  • Sn(x) = n^{−1/2} ∑_{t=1}^{⌊nx⌋} ɛt,  x ∈ [0, 1].

Much recent research effort has then been devoted to providing conditions under which the weak convergence

  • Sn ⇒ ωW  (n → ∞) (3)

in the Skorohod space D[0, 1] holds. Here W = (W(t):t ∈ [0,1]) denotes a standard Brownian motion and ω > 0 is a scaling parameter to be discussed further below. Recent contributions discussing the weak convergence of dependent random variables are Bradley (2009b), Dedecker et al. (2007), Wu (2007) and Aue et al. (2007). If (3) holds, one immediately gets also that

  • Zn ⇒ ωB  (n → ∞), (4)

where B = (B(t) : t ∈ [0,1]), with B(t) = W(t) − tW(1), is a standard Brownian bridge. It can be seen from (4) that the CUSUM process limits for independent observations and time series data have the same form but differ in the scaling parameter ω. Since ω² = ∑_{h∈ℤ} γ(h), the quantity ω² is typically referred to as the long-run variance. In applications, ω² is unknown and has to be estimated from the data with an estimator ω̂². If this estimator is weakly consistent, (4) implies that

  • ω̂^{−1} Zn ⇒ B  (n → ∞). (5)

Finding accurate estimates of ω² under dependence can be a difficult task that is commonly approached through the use of kernel estimators of the Bartlett or flat-top type. For the most up-to-date, and essentially optimal, long-run variance estimation results we refer to the recent paper by Liu and Wu (2010).
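To make the estimation step concrete, the following is a minimal sketch (in Python, with NumPy) of a Bartlett-type lag-window estimator of ω². The function name and the simple deterministic default bandwidth are our own illustrative choices, not a prescription from the literature.

    import numpy as np

    def bartlett_lrv(y, bandwidth=None):
        # Bartlett lag-window estimate of the long-run variance omega^2
        y = np.asarray(y, dtype=float)
        n = len(y)
        if bandwidth is None:
            # simple illustrative rule of order n^(1/3); data-driven plug-in
            # rules such as Andrews (1991) are preferable in practice
            bandwidth = max(1, int(n ** (1 / 3)))
        yc = y - y.mean()
        # sample autocovariances gamma_hat(0), ..., gamma_hat(bandwidth)
        gamma = np.array([yc[: n - h] @ yc[h:] / n for h in range(bandwidth + 1)])
        weights = 1.0 - np.arange(1, bandwidth + 1) / (bandwidth + 1)  # Bartlett weights
        return gamma[0] + 2.0 * (weights @ gamma[1:])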

With the preceding, we can now construct a test procedure for H0. Evaluated at the argument x = k/n, the CUSUM process Zn(x) basically compares the sample mean of the observations up to lag k with the global sample mean of all observations. Since the timing of the break is unknown, one checks all possible choices k ∈ {1,…, n}. This leads to the max-type test statistic

  • Mn = (1/ω̂) max_{1≤k≤n} |Zn(k/n)|,

whose values should be ‘small’ if H0 holds and ‘large’ if H0 is violated. To quantify this statement, one utilizes (5) and the Continuous Mapping Theorem to obtain

  • Mn →d M = sup_{x∈[0,1]} |B(x)|,

where →d signifies convergence in distribution. The distribution of M has been tabulated, for example in Shorack and Wellner (1986), and can therefore be used to construct tests that hold a pre-specified asymptotic level α. In Example 1, we discuss the application of Mn in the case of autoregressive processes of order one (AR(1) processes).
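A sketch of the resulting test is equally short. The helper below assumes that a long-run variance estimate, such as the Bartlett sketch above, is supplied; plugging in the plain sample variance is only appropriate for uncorrelated data.

    import numpy as np

    def cusum_test(y, omega2):
        # returns the max-type statistic M_n and the maximizing lag
        y = np.asarray(y, dtype=float)
        n = len(y)
        k = np.arange(1, n + 1)
        z = (np.cumsum(y) - k / n * y.sum()) / np.sqrt(n)  # CUSUM process Z_n(k/n)
        stats = np.abs(z) / np.sqrt(omega2)
        k_max = int(np.argmax(stats)) + 1  # lag at which the CUSUM peaks
        return stats.max(), k_max

The null is rejected at asymptotic level α when Mn exceeds the corresponding quantile of sup |B(x)|; at the 10% level this critical value is about 1.22 (cf. the value 1.225 quoted in Example 1 below).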

Example 1. (CUSUM for the Mean of AR(1) Processes) Here, we highlight the CUSUM procedure if the innovations form an AR(1) process, so that

  • ɛt = φɛt−1 + ξt,  t ∈ ℤ,  |φ| < 1,

where (ξt : t ∈ ℤ) is a sequence of independent standard normal random variables. The left panel of Figure 1 shows how the CUSUM process Zn behaves under the null hypothesis of no mean break. It can be clearly seen how variability increases with increasing autoregressive parameter φ. The long-run variance in this example is computed as ω² = (1 − φ)^{−2} and needs to be estimated to appropriately rescale Zn. If one is reasonably confident in the AR(1) assumption, one can do this parametrically. Otherwise, the lag-window estimator

  • ω̂² = γ̂(0) + 2 ∑_{h=1}^{bn} wh γ̂(h)

can be used, where γ̂(h) are the sample autocovariances, wh = 1 − h/(bn + 1) are the Bartlett weights and bn is a bandwidth selector. Andrews (1991) recommended a data-driven choice of bn proportional to n^{1/3}, with the proportionality constant determined by a plug-in estimate of the dependence strength. Using the Bartlett estimator with this bandwidth, the scaled CUSUM processes ω̂^{−1}Zn are plotted in the right panel of Figure 1. For the simulated data, a visual inspection seems to confirm that the approach works well. There are, however, other results in the literature that point to certain disadvantages of using the Bartlett estimator for strong positive and negative correlations, among them Robbins et al. (2011), who recommended innovations-based CUSUM procedures (see also below). Note that the 10% critical value of the CUSUM statistic is 1.225 and that unadjusted correlation will clearly inflate the empirical levels.

Figure 1. The fluctuations of the CUSUM process Zn increase with the autoregressive parameters φ = −0.8, −0.4, 0, 0.4, 0.8 (left). The fluctuations of the scaled CUSUM process using the Bartlett estimator are more homogeneous (right). For all cases, the same independent standard normal innovations have been used.

Examples 2 and 3 deal with non-obvious applications of the CUSUM method to checking the stability of the correlation in AR(1) processes and the stability of the expected volatility in GARCH(1,1) processes. To keep the exposition readable, we restrict the discussion to simple models. Similar results, however, are also expected to hold for higher-order counterparts of the AR(1) and GARCH(1,1) setting. It should be stressed that CUSUM-based procedures work reasonably well if the parameter of interest is a moment, for example, covariances as in Aue et al. (2009b).

Example 2. (CUSUM for the Correlation of AR(1) Processes) Let us now assume that the data are generated by the AR(1) process Yt = φtYt−1 + ɛt, where (ɛt : t ∈ ℤ) are i.i.d. with E[ɛ1] = 0, E[ɛ1²] = σ² and E[ɛ1⁴] < ∞. The interest here is in testing

  • H0 : φ1 = ⋯ = φn = φ

against the alternative

  • HA : φ1 = ⋯ = φk* = φ ≠ φ* = φk*+1 = ⋯ = φn

for an unknown 1 ≤ k* < n. Assume that, under H0, |φ| < 1, so that (Yt : t ∈ ℤ) constitutes a stationary sequence. Also under H0, we have that E[Yt²] = σ²/(1 − φ²), and a change in φ therefore implies also a change in the second moment. Thus, functionals of the CUSUM process Zn in (2) with Yt² in place of Yt may be used. Since the fourth moments are finite, under H0, (Yt² : t ∈ ℤ) satisfies (3) and (5).

Assume that, under HA, the parameter φ changes to φ* at k* = ⌊κn⌋ for some κ ∈ (0, 1), where |φ|, |φ*| < 1. Then, it can be shown that

  • sup_{x∈[0,1]} n^{−1/2} |Zn(x)| →P C(κ, φ, φ*) > 0, (6)

where →P signifies convergence in probability and the constant C(κ, φ, φ*) depends on the break fraction as well as on the pre-break and post-break parameters. Since the estimator ω̂² for the long-run variance ω² associated with the squared process (Yt² : t ∈ ℤ) remains bounded also under HA, the asymptotic consistency of Mn is established, that is, Mn → ∞ in probability as n → ∞.

Figure 2 shows in the left panel the time series plot of n = 500 observations of which the first k* = 250 were generated according to an AR(1) process with parameter φ = 0.6 and the remaining n − k* = 250 observations from an AR(1) process with parameter φ* = −0.2. Independent standard normal variates were used as innovations. The first half of the simulated data exhibits a smooth sample path, indicating positive correlation, while the second half is rougher, indicating negative correlation. The right panel shows the scaled CUSUM process n^{−1/2}|Zn(x)| of (6). Indicated as a horizontal line is also the limit in (6), which can be computed as 0.130 using the simulation parameters. For the simulated data, the maximum of the scaled CUSUM process is reached at the 210th observation. The second largest peak in the sample path is reached at lag 254 and returns a value of 0.131. Since the transition between pre-break and post-break sample is smooth, greater difficulties in locating the break time have to be expected.

Figure 2. Plot of an AR(1) process whose parameter changes from φ = 0.6 to φ* = −0.2 at lag k* = 250, indicated by a dotted vertical line (left). The scaled CUSUM process n^{−1/2}|Zn(x)| (right). Here the horizontal dashed line indicates the limit on the right-hand side of (6).
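A short simulation sketch along the lines of Example 2 reads as follows; the seed and all variable names are ours, and the n^{−1/2}-scaled CUSUM of the squares mirrors the discussion around (6).

    import numpy as np

    rng = np.random.default_rng(0)

    # AR(1) with phi = 0.6 up to k* = 250 and phi* = -0.2 thereafter,
    # driven by independent standard normal innovations
    n, k_star, phi, phi_star = 500, 250, 0.6, -0.2
    y = np.zeros(n)
    for t in range(1, n):
        par = phi if t < k_star else phi_star
        y[t] = par * y[t - 1] + rng.standard_normal()

    # a change in phi shifts E[Y_t^2], so apply the CUSUM to the squares
    s = y ** 2
    k = np.arange(1, n + 1)
    z = (np.cumsum(s) - k / n * s.sum()) / np.sqrt(n)
    scaled = np.abs(z) / np.sqrt(n)  # the process appearing in (6)
    print("peak of the scaled CUSUM at lag", int(scaled.argmax()) + 1)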

Example 3. (CUSUM for the Volatility of GARCH(1,1) Processes) Let (Yt : t ∈ ℤ) be given by

  • Yt = σtɛt,
  • σt² = νt + αtYt−1² + βtσt−1²,

with νt > 0, αt, βt ≥ 0 and (ɛt : t ∈ ℤ) i.i.d. with E[ɛt] = 0 and E[ɛt²] = 1. Assume that Y1,…, Yn have been observed. This is the famous GARCH(1,1) process that parametrizes the conditional variance in terms of an ARMA(1,1) relation, see Engle (1982) and Bollerslev (1986). The relevant parameter vectors are θt = (νt, αt, βt)ᵀ and the null hypothesis of interest is

  • H0 : θ1 = ⋯ = θn,

implying that the conditional variances (σt² : t ∈ ℤ) are a stationary process. Note that, under H0, (Yt : t ∈ ℤ) is strictly stationary if E[log(α1ɛ0² + β1)] < 0. For additional information on variants of GARCH processes and their applications we refer to Aue et al. (2006a) and Francq and Zakoïan (2010).

Assume that, instead of testing H0 directly, we are interested in testing the stability of the expected volatilities E[σt²]. Notice that the σt² are unobservable quantities but that, by construction, E[Yt²] = E[σt²], so that testing procedures may be set up using the CUSUM process Zn in (2) with Yt² in place of Yt. According to Berkes et al. (2008), the partial sums of (Yt² : t ∈ ℤ) obey a FCLT, so (3) and (5) are satisfied.
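The following sketch simulates a GARCH(1,1) path and forms the CUSUM of the squared observations; the parameter values are illustrative choices satisfying the usual stationarity requirements, not values taken from the text.

    import numpy as np

    rng = np.random.default_rng(1)

    # illustrative GARCH(1,1) parameters with alpha + beta < 1
    n, nu, alpha, beta = 1000, 0.1, 0.1, 0.8
    y = np.empty(n)
    sigma2 = nu / (1.0 - alpha - beta)  # start at the stationary variance
    for t in range(n):
        y[t] = np.sqrt(sigma2) * rng.standard_normal()
        sigma2 = nu + alpha * y[t] ** 2 + beta * sigma2

    # E[Y_t^2] = E[sigma_t^2], so volatility stability can be checked via
    # the CUSUM process of the squared observations
    s = y ** 2
    k = np.arange(1, n + 1)
    z = (np.cumsum(s) - k / n * s.sum()) / np.sqrt(n)
    print("volatility CUSUM peaks at lag", int(np.abs(z).argmax()) + 1)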

The testing procedures based on Mn work best if the structural breaks occur close to the sample center, so that pre-break and post-break sample are of similar size. Power naturally decays if the breaks appear early or late. To improve detection ability for this latter case, weighted versions of the CUSUM procedure have been introduced. This leads to test statistics based on functionals of the type sup |Zn(x)|/q(x), where the supremum is taken over an interval [εn, 1 − εn] and q is a weight function satisfying certain conditions (see Csörgő and Horváth (1997) for details). Most prominent is the weight function q(x) = (x(1 − x))^{1/2} as it is the standard deviation of the Brownian bridge B(x). These functionals are related to the maximally selected likelihood ratio test assuming normality, see Section 3 below. Notice that, if εn = ε > 0 is fixed, then the Continuous Mapping Theorem implies that

  • sup_{x∈[ε,1−ε]} |Zn(x)| / (ω̂ q(x)) →d sup_{x∈[ε,1−ε]} |B(x)|/q(x).

Critical values for an application of the truncated CUSUM test can now be obtained via simulation of the limit process, whose distribution depends on the amount of truncation ε. If, however, εn = 0 or εn → 0 as n → ∞, then the limiting supremum is almost surely infinite even under H0 and the test statistic has to be renormalized in order to obtain a non-degenerate limit distribution, namely the Gumbel or double exponential distribution. The resulting non-standard limit theorems are called Darling–Erdős-type results after Darling and Erdős (1956), see also Example 4. For these, standard FCLTs are not applicable and stronger assumptions on the underlying processes are needed, such as strong approximations (see Einmahl (1989) for the independent case) and rates of convergence for ω̂².

Similar to the discussion in Example 2, one can show that the CUSUM-based test statistic Mn has asymptotic power one under the one-break alternative of exactly one mean change. The power may, however, not be monotone in the sense that a larger break size leads to a more powerful test. This is mainly due to the behaviour of the long-run variance estimator ω̂² under the alternative. More specifically, the issue occurs because the bandwidth, chosen to work well under the null hypothesis, can be severely flawed under the alternative. Remedies have been offered by Altissimo and Corradi (2003), Juhl and Xiao (2009), and Shao and Zhang (2010).

Once a structural break has been detected, the time of its occurrence has to be estimated. The CUSUM can also be used to locate the break points. We continue to assume for now that, under HA, there is exactly one break at time k* = ⌊nκ⌋ with some κ ∈ (0, 1). Notice that k* itself cannot be estimated consistently, but that the break point fraction κ can be consistently estimated by

  • κ̂n = n^{−1} argmax_{1≤k≤n} |Zn(k/n)|.

If the number of breaks in the underlying observations is equal to m and the observations can thus be divided into m + 1 homogeneous subsequences, then κ̂n still converges in probability to one of the m distinct break locations. Therefore, CUSUM-based testing procedures are consistent also against the m-break alternative. Results on the estimation of multiple break points can be found, for example, in Bai (1999), Banerjee and Urga (2005), Bernard et al. (2007) and Döring (2011). We return to the estimation of multiple breaks in Section 4 below.

The weak convergence in (5) has a multivariate extension that will be discussed in the remainder of this section. Assume that we change the signal-plus-noise model (1) to its d-dimensional counterpart

  • Yt = μt + ɛt,  t ∈ ℤ,

where (Yt : t ∈ ℤ) is a process taking values in ℝᵈ and (ɛt : t ∈ ℤ) are stationary time series innovations with E[ɛt] = 0 and E[‖ɛt‖²] < ∞. This setting is similar to the one discussed in Horváth et al. (1999). Based on observations Y1,…, Yn, we wish to test the null hypothesis

  • H0 : μ1 = ⋯ = μn

against the at-least-one-change alternative. Following the steps leading to (2) above, one defines analogously the d-dimensional CUSUM process Zn = (Zn(x) : x ∈ [0, 1]) by

  • Zn(x) = n^{−1/2} ( ∑_{t=1}^{⌊nx⌋} Yt − (⌊nx⌋/n) ∑_{t=1}^{n} Yt ),  x ∈ [0, 1]. (7)

Then, as in the univariate case, we have that under H0 the value of Zn(x) is independent of the unknown but common mean parameter μ. Let Sn = (Sn(x) : x ∈ [0, 1]) be the d-dimensional partial sum process

  • Sn(x) = n^{−1/2} ∑_{t=1}^{⌊nx⌋} ɛt,  x ∈ [0, 1].

Aue et al. (2009b) and Bradley (2007) provide conditions under which the weak convergence

  • Sn ⇒ WΩ  (n → ∞) (8)

in the d-dimensional Skorohod space Dᵈ[0, 1] holds. The limit WΩ = (WΩ(x) : x ∈ [0, 1]) in the previous display is a d-dimensional Brownian motion with covariance matrix Ω, that is, WΩ is Gaussian with E[WΩ(x)] = 0 and E[WΩ(x)WΩ(y)ᵀ] = min(x, y) Ω. Notice that Ω is termed the long-run covariance matrix (or, a multiple of the spectral density matrix at the origin). It is assumed that one can construct a weakly consistent estimator Ω̂ for Ω, that is,

  • Ω̂ →P Ω  (n → ∞). (9)

Sufficient conditions for (9) to hold may be found, for example, in Liu and Wu (2010). If (8) and (9) are satisfied and if Ω is non-singular, one obtains that

  • Ω̂^{−1/2} Zn ⇒ (B1,…, Bd)ᵀ  (n → ∞), (10)

where B1,…, Bd denote independent standard Brownian bridges. Functionals of the quadratic form process Znᵀ(x) Ω̂^{−1} Zn(x) may now be used to construct test statistics. Using (10), the Continuous Mapping Theorem yields that

  • max_{1≤k≤n} Znᵀ(k/n) Ω̂^{−1} Zn(k/n) →d sup_{x∈[0,1]} ∑_{ℓ=1}^{d} Bℓ²(x).

The distribution of the limit has been tabulated (see Shorack and Wellner (1986)) and thus asymptotic test procedures can now be constructed. Weighted versions of the multivariate CUSUM process, such as

  • sup_{x∈[ln,un]} Znᵀ(x) Ω̂^{−1} Zn(x) / (x(1 − x)),

with lower bound ln = 1/(n + 1) and upper bound un = n/(n + 1), which are more sensitive to structural breaks at the beginning and the end of the sample, have also been considered in the literature. More information on this may be found in Csörgő and Horváth (1997).
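In code, the quadratic-form statistic is a direct transcription of (7) and (10); the sketch below assumes that a non-singular long-run covariance estimate Ω̂ is supplied (its construction is a separate task, cf. Liu and Wu (2010)), and the function name is ours.

    import numpy as np

    def mv_cusum_stat(y, omega_hat):
        # y: (n, d) array of observations; omega_hat: (d, d) long-run
        # covariance estimate, assumed non-singular
        y = np.asarray(y, dtype=float)
        n = y.shape[0]
        k = np.arange(1, n + 1)[:, None]
        z = (np.cumsum(y, axis=0) - k / n * y.sum(axis=0)) / np.sqrt(n)  # (7)
        q = np.einsum("ti,ij,tj->t", z, np.linalg.inv(omega_hat), z)
        return q.max()  # maximum of the quadratic form process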

2.2. Structural breaks in the conditional mean

The CUSUM methodology introduced in Section 2.1 can be modified to cover linear regression models. To this end, we return to model (1) and assume that the signal component has the form μt = Xtᵀβt with some r-dimensional covariate process (Xt : t ∈ ℤ). This gives the model

  • Yt = Xtᵀβt + ɛt,  t ∈ ℤ.

Assume that observations are available for t = 1,…, n. The null hypothesis then becomes

  • H0 : β1 = ⋯ = βn = β,

which has to be tested against the alternative that H0 is not true. To construct the test, one first computes an estimator β̂n for β, typically using least squares techniques, and then defines the estimated residuals ɛ̂t = Yt − Xtᵀβ̂n. If H0 is true, the estimated residuals will be ‘close’ to the innovations. If H0 is violated, the estimated residuals should systematically deviate from the innovations. This can be exploited if the CUSUM process Zn in (2) is built with the ɛ̂t in place of the observations Yt. This scenario has been considered (including extensions) by Bai and Perron (1998). These authors also provide a set of non-restrictive conditions on the covariates (Xt : t ∈ ℤ) and the innovations (ɛt : t ∈ ℤ) that allow for an asymptotic theory akin to the one outlined in Section 2.1. Both are notably allowed to exhibit time series character. Extensions to the multivariate regression setting may be found in Qu and Perron (2007). The use of weighted residuals is discussed in Hušková et al. (2007) for the case that (Yt : t ∈ ℤ) is an AR process.

Other choices for regressors that may be used include polynomials as in Aue et al. (2008, 2009), Kulperger (1985), Tang and MacNeill (1993), as well as more general smooth functions satisfying certain regularity conditions, see Aue et al. (2012). Importantly, lagged response variables have been used as regressors, thereby providing a structural breaks framework for autoregressive time series that is based on innovations rather than the observations themselves. This approach has been advocated by Bai (1993) and Yu (2007) in a more general ARMA setting and more recently by Robbins et al. (2011), who show that empirical levels of innovations-based test statistics tend to be less inflated than those of observation-based test statistics, and closer to the nominal levels.
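A sketch of the residual-based construction: estimate β by full-sample least squares and feed the residuals into the CUSUM of Section 2.1. The function is our own illustration of the general scheme, not the exact procedure of Bai and Perron (1998).

    import numpy as np

    def residual_cusum(y, X):
        # least squares fit of Y_t = X_t' beta + eps_t on the full sample
        y = np.asarray(y, dtype=float)
        X = np.asarray(X, dtype=float)
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta_hat  # estimated residuals
        n = len(resid)
        k = np.arange(1, n + 1)
        # CUSUM process of the residuals; scaling by a long-run variance
        # estimate then proceeds exactly as in Section 2.1
        return (np.cumsum(resid) - k / n * resid.sum()) / np.sqrt(n)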

2.3. Structural breaks in the variance and second-order characteristics

Testing for structural breaks in the variance has been of interest as well. Important contributions for the independent setting are due to Inclán and Tiao (1994), who developed an iterative cumulative sums of squares algorithm and applied it to IBM stock data. Related work was done by Gombay et al. (1996), who studied the detection of possible breaks in the variance with and without concurrent breaks in the mean in a sequence of independent observations.

In the time series case, it often makes sense to assume a linear process structure. Owing to the Wold decomposition, any purely non-deterministic zero-mean stationary process (Yt : t ∈ ℤ) can be represented in the form

  • Yt = ∑_{j=0}^{∞} ψj ξt−j,  t ∈ ℤ, (11)

where (ξt : t ∈ ℤ) is a white noise sequence and (ψj : j ≥ 0) are coefficients determining the dependence structure. To enable more parsimonious modeling, these coefficients are typically approximated by ARMA process fitting. Lee and Park (2001) showed how Inclán and Tiao's (1994) test can be applied to linear processes by computing the appropriate scaling. This is similar to the adjustment of CUSUM procedures for the mean.

More generally, the linear process in (11) can be utilized to detect structural breaks in the covariance and correlation structure. This was done in Berkes et al. (2009b), who gave results related to the stability of the covariances γ(h) based on weighted approximations. In a nonlinear time series setting, parametric procedures were utilized by Kokoszka and Leipus (2000) to detect breaks in the parameters of ARCH processes, and by Berkes et al. (2009b) to sequentially monitor for breaks in the parameters of GARCH processes.

The multivariate variance-covariance structure of a large class of time series was investigated by Aue et al. (2009b). Wied et al. (2012) use a similar approach to test for the constancy of cross-correlations in a bivariate setting. Andreou and Ghysels (2002) were concerned with the dynamic evolution of financial market volatilities.

3. Likelihood ratio statistics


The CUSUM procedures discussed in Section 2 are non-parametric and do not make use of a particular time series model fit. The dependence of the observations enters only in the form of the long-run variance estimators ω̂² and Ω̂. In many instances, parametric time series assumptions are made that describe explicitly the dependence structure found in the data. As a consequence, forecasting procedures may be easily implemented and can be based on existing algorithms as long as the observations satisfy the underlying no-structural-break null hypothesis. Here, we discuss structural break procedures based on likelihood methods.

Let us assume that we are interested in the d-dimensional process (Yt : t ∈ ℤ) and that the distribution of Yt depends on an unknown parameter vector θt. Assume further that Y1,…, Yn have been observed. In this case, the null hypothesis of structural stability becomes

  • H0 : θ1 = ⋯ = θn = θ0,

which is to be tested against the alternative that there is exactly one unknown break point k*, that is,

  • HA : θ1 = ⋯ = θk* = θ0 ≠ θA = θk*+1 = ⋯ = θn.

To do so, suppose for the moment that the break was located at lag k. Then, one can split the data into the two subsamples Y1,…, Yk and Yk+1,…, Yn and construct the likelihood function Lk(θ0, θA) taking into account that these subsamples have different generating parameters θ0 and θA. The likelihood Lk(θ0, θA) can now be compared to the likelihood Ln(θ) coming from the null model, assuming that the parameter θ0 generated all observations. The comparison is done via the likelihood ratio

  • Λk = Ln(θ̂n) / Lk(θ̂k, θ̂k*),

where θ̂n is the maximum likelihood estimator under the null model, θ̂k is the maximum likelihood estimator for θ0 from the first subsample and θ̂k* is the maximum likelihood estimator for θA from the second subsample. Since the location of the break is unknown, one rejects the null hypothesis H0 when the maximally selected log-likelihood ratio statistic

  • Zn = max_{1≤k<n} ( −2 log Λk )

is large. The distributions of Zn and of other functionals of −2 log Λk, k = 1,…, n − 1, were derived in a series of papers by Gombay and Horváth (1990, 1994, 1996) and Horváth (1993) in the case of independent observations.

In time series models, it can be difficult to compute the joint distributions of the maximum likelihood estimators θ̂k and θ̂k*. In this case, the quasi-likelihood method can be used. Usually the normality of the observations is assumed to get an explicit expression for Λk, but the properties of Λk and Zn are studied without using normality. The following example shows how the likelihood-based methodology can be applied to autoregressive time series.

Example 4. (AR Processes) We focus on the case of univariate observations and follow Davis et al. (1995), who considered the segmented autoregressive model

  • Yt = φ1Yt−1 + ⋯ + φpYt−p + ɛt,  t ≤ k*,
  • Yt = φ1*Yt−1 + ⋯ + φp*Yt−p + ɛt,  t > k*,

with parameter vectors φ = (φ1,…, φp)ᵀ and φ* = (φ1*,…, φp*)ᵀ and i.i.d. innovations satisfying E[ɛ1] = 0 and E[ɛ1²] = σ². Under the null hypothesis H0 of structural stability, we thus have k* ≥ n and all observations are generated from the parameter vector φ. Under the one structural break alternative HA, we have that 1 ≤ k* < n and the observations are generated from the parameter vector φ prior to k* and by φ* ≠ φ thereafter.

Assuming that the errors (ɛt : t ∈ ℤ) follow a normal distribution, the likelihood ratio Λk can be constructed and the maximally selected log-likelihood ratio statistic Zn computed. If H0 holds, then one obtains an extreme value limit distribution, in accordance with the discussion in Section 2. More specifically, it is shown in Davis et al. (1995) that, under additional regularity assumptions not stated here,

  • lim_{n→∞} P( an √Zn ≤ t + bn ) = exp(−2e^{−t})  for all t,

where

  • an = (2 log log n)^{1/2},
  • bn = 2 log log n + (p/2) log log log n − log Γ(p/2),

and Γ(·) denotes the Gamma function. It should be noted again that this limit result does not require the normality of the innovations. For the application of the test procedure, σ² has to be replaced with a consistent estimator σ̂² converging at a fast enough rate.
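To illustrate the quasi-likelihood computation: under Gaussianity, −2 log Λk for the segmented AR(p) model reduces to a comparison of residual variances from full-sample and subsample least squares fits. The sketch below implements this reduction; the trimmed search range and all function names are our own illustrative choices.

    import numpy as np

    def ar_fit_sse(y, p):
        # conditional least squares fit of an AR(p); returns the residual
        # sum of squares and the number of fitted observations
        y = np.asarray(y, dtype=float)
        n = len(y)
        X = np.column_stack([y[p - j - 1 : n - j - 1] for j in range(p)])
        resid = y[p:] - X @ np.linalg.lstsq(X, y[p:], rcond=None)[0]
        return resid @ resid, n - p

    def max_llr(y, p, trim=0.1):
        # maximally selected -2 log Lambda_k over a trimmed range of k
        n = len(y)
        sse, m = ar_fit_sse(y, p)
        best, best_k = -np.inf, None
        for k in range(int(trim * n), int((1 - trim) * n)):
            sse1, m1 = ar_fit_sse(y[:k], p)
            sse2, m2 = ar_fit_sse(y[k:], p)
            stat = m * np.log(sse / m) - m1 * np.log(sse1 / m1) - m2 * np.log(sse2 / m2)
            if stat > best:
                best, best_k = stat, k
        return best, best_k

Note that the trimming sidesteps the very early and late lags; if the full range of k is searched, the Darling–Erdős normalization of Example 4 applies instead.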

The extreme value asymptotics in Example 4 appears because Zn is, by definition, built from a maximum over all possible break locations. Since this includes also the very early and late lags, standard limit results based on FCLTs no longer apply and additional normalization in the form of the centering sequence bn and the scaling sequence an is required to obtain a non-degenerate limit distribution. Many authors (among them Andrews (1993), Bai (1999), Bai and Perron (1998), Ghysels et al. (1997) and Hansen (2000)) have, instead of applying extreme value theory, resorted to a truncation of Zn and thus enabled standard limit theory.

It is known (see Hall (1979)) that the convergence to the extreme value limit can be slow and that asymptotic tests often tend to be too conservative in finite samples. To circumvent this issue, one can use resampling and bootstrap methods that lead to a better approximation of the test levels. Related literature, including both retrospective and sequential methods, includes Aue et al. (2012), Hušková (2004), Hušková and Picek (2005), Kirch (2008), and Hušková and Kirch (2012).

4. Estimating the number of breaks


Here, we briefly discuss several approaches to estimating and locating multiple break points in the observations. This problem has, since its origins (for example, in Yao (1988)), been treated as a model selection problem. In this framework, one can view the segmented, piecewise stationary time series model as the one that best matches the observations, even without explicitly assuming that the underlying model is true for the data.

For most of this section, we focus on models of the form

  • Yt = μℓ + σℓɛt,  kℓ−1* < t ≤ kℓ*,  ℓ = 1,…, m0 + 1, (12)

where 0 = k0* < k1* < ⋯ < km0* < km0+1* = n, E[ɛt] = 0 and E[ɛt²] = 1. Equation (12) describes a model which has piecewise constant means and variances. All mean and standard deviation parameters, μℓ and σℓ, are unknown, as are the number of breaks m0 and the break locations k1*,…, km0*. Assuming that the observations are normal and changes occur only in the mean, Yao (1988) suggested using Schwarz's (1978) criterion to estimate m0. Yao's result was modified by Serbinowska (1996) to cover the case of binomial observations. Her method was applied to determine the number of authors of the Lindisfarne Gospels. Kühn (2001) extended the Schwarz criterion to the time series case using strong invariance principles.

To estimate the number of breaks in observations given by (12), one computes, for m ≤ M with a prescribed upper bound M and a given candidate segmentation k1 < ⋯ < km, the residual sum of squares

  • RSS(k1,…, km) = ∑_{ℓ=1}^{m+1} ∑_{t=kℓ−1+1}^{kℓ} (Yt − Ȳℓ)²,

where Ȳℓ denotes the sample mean of the ℓth segment and, by convention, k0 = 0 and km+1 = n. Define

  • SC(m) = (n/2) log(RSSm/n) + m dn,  where RSSm = min RSS(k1,…, km),

with the minimum being taken over the candidate segmentations k1 < ⋯ < km. Yao (1988) suggested estimating m0 with

  • m̂n = argmin_{0≤m≤M} SC(m).

Notice that the first term in SC is related to the log-likelihood and is thus a measure of the goodness of fit, while the second term is a penalty applied to avoid overfitting. Roughly speaking, dn must be larger than the rate of convergence of the partial sums on the segments (without breaks) to standard Brownian motions. Kühn (2001) has proved that for time series satisfying strong invariance principles one obtains weak consistency,

  • m̂n →P m0  (n → ∞).
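For moderate n, the minimization behind SC(m) can be carried out exactly by dynamic programming over segment boundaries. The following is a minimal sketch for the mean-change case; the penalty argument plays the role of dn, and the function and variable names are our own.

    import numpy as np

    def estimate_num_breaks(y, max_breaks, penalty):
        # exact DP minimization of the within-segment sum of squares,
        # followed by the penalized criterion SC(m); O(n^2 * max_breaks)
        y = np.asarray(y, dtype=float)
        n = len(y)
        c1 = np.concatenate([[0.0], np.cumsum(y)])
        c2 = np.concatenate([[0.0], np.cumsum(y ** 2)])

        def seg_cost(i, j):  # RSS of y[i:j] around its own mean
            s, s2, m = c1[j] - c1[i], c2[j] - c2[i], j - i
            return s2 - s * s / m

        # dp[m, j]: minimal RSS of y[:j] using m breaks (m + 1 segments)
        dp = np.full((max_breaks + 1, n + 1), np.inf)
        dp[0, 1:] = [seg_cost(0, j) for j in range(1, n + 1)]
        for m in range(1, max_breaks + 1):
            for j in range(m + 1, n + 1):
                dp[m, j] = min(dp[m - 1, k] + seg_cost(k, j) for k in range(m, j))
        # SC(m) = (n/2) log(RSS_m / n) + m * penalty, cf. the text
        sc = [n / 2 * np.log(dp[m, n] / n) + m * penalty for m in range(max_breaks + 1)]
        return int(np.argmin(sc))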

Vostrikova's binary segmentation procedure works well for estimating the number of breaks only if m0 is rather small. This procedure can be based, for example, on the maximum-type CUSUM procedures discussed in Section 2, since they reach their largest value in the neighbourhood of a (true) break point. Using this largest value to split the sample into two, and repeating the same steps on the subsamples, one obtains a multiple testing procedure; a sketch is given below. While binary segmentation can be used under fairly general assumptions on the underlying process, it is consistent only if the significance levels chosen in each step converge to zero with increasing sample size. The interested reader may consult Bai (1997, 1999), Bai and Perron (1998), and Qu and Perron (2005) for more details.
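As referenced above, a minimal sketch of the recursive splitting follows. The threshold and the minimum segment length are tuning choices, and scaling by the segment's sample standard deviation is only appropriate for weakly correlated data; under dependence a long-run variance estimate should be used instead.

    import numpy as np

    def binary_segmentation(y, threshold, min_len=30):
        # recursive CUSUM splitting; returns estimated break locations
        y = np.asarray(y, dtype=float)
        breaks = []

        def split(lo, hi):
            seg = y[lo:hi]
            n = len(seg)
            if n < 2 * min_len:
                return
            k = np.arange(1, n + 1)
            z = np.abs(np.cumsum(seg) - k / n * seg.sum()) / np.sqrt(n)
            k_hat = int(z.argmax()) + 1
            # sample std is a placeholder for a long-run variance estimate
            if z.max() / seg.std(ddof=1) > threshold and min_len <= k_hat <= n - min_len:
                breaks.append(lo + k_hat)
                split(lo, lo + k_hat)
                split(lo + k_hat, hi)

        split(0, len(y))
        return sorted(breaks)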

In a piecewise stationary autoregressive time series setting, Davis et al. (2006) proposed an estimator for m0 based on the minimum description length principle (which amounts to selecting a different penalty term dn). The consistency of the procedure is assessed empirically, and a consistency result for the relative break locations κℓ, where kℓ* = ⌊nκℓ⌋, is proved in the case that m0 is known. Aue and Lee (2011) used a similar approach for image segmentation. Kurozumi and Tuvaandorj (2011) returned to modifications of the classical model selection criteria to estimate m0. They use variants of Akaike's information criterion, the Bayes information criterion and Mallows' Cp criterion and discuss conditions for consistency of these procedures.

5. Discriminating break points, long memory and unit roots


Tests for structural stability are not robust against other deviations from the null such as long memory and unit roots. All three of these phenomena, for example, exhibit an autocorrelation function with many significant lags that decay only slowly. Differences can be hard to detect based on finite samples. In this section, we summarize the research in this still active area, starting with the breaks vs. long memory case in Section 5.1 and continuing with the breaks vs. unit roots case in Section 5.2.

5.1. Structural breaks and long memory

A stationary process (Yt : t ∈ ℤ) is said to exhibit long memory if its autocovariance function γ is not absolutely summable, that is, ∑_{h∈ℤ} |γ(h)| = ∞. An example of long memory processes is the class of fractionally integrated ARMA time series introduced in Granger and Joyeux (1980) and Hosking (1981), in which the difference operator (1 − B)^d is applied for fractional values of d. Stationarity is given if d < 1/2. It can be seen that the spectral density of a long memory process is unbounded at the origin. Popular applications of long memory processes may be found in hydrology, environmental sciences and notably macroeconomics.

Research in the past two decades has, however, revealed that the features described above, typically regarded as indicative of long memory, can also occur in short memory processes, for which ∑_{h∈ℤ} |γ(h)| < ∞, affected by structural breaks in the mean or trend. For example, Bhattacharya et al. (1983) showed that the so-called Hurst effect (to achieve convergence, partial sums have to be scaled with n^H, H = d + 1/2 being the Hurst parameter, and d > 0) can also be explained in a short memory scenario if structural breaks in the mean are allowed. Giraitis et al. (2001) followed up on these findings by showing that a number of popular test statistics used to detect long memory diverge to infinity also in the presence of structural breaks in a short memory sequence. Attention was consequently given to the development of statistical procedures that could discriminate between these situations.

Berkes et al. (2006) phrased this problem in the form of a hypothesis test with the null hypothesis corresponding to structural breaks in the mean of a short memory time series. More precisely, the models under consideration are the following.

  •  Under the null hypothesis H0, the observations Y1,…, Yn come from the structural break model of Section 2, namely
    • Yt = μ + ɛt for t ≤ k*, and Yt = μ* + ɛt for t > k*,
    where μ ≠ μ*. The innovations (ɛt : t ∈ ℤ) are assumed to satisfy the FCLT (3) and to be fourth-order stationary.
  •  Under the alternative HA, the fourth-order stationary observations have constant mean μ, and the normalized partial sum process Sn,H = (Sn,H(x) : x ∈ [0, 1]) given by
    • Sn,H(x) = n^{−H} ∑_{t=1}^{⌊nx⌋} (Yt − μ),  x ∈ [0, 1],

satisfies the weak convergence

  • Sn,H ⇒ cH BH  (n → ∞) (13)

for some H ∈ (1/2, 1) and cH > 0. Here BH = (BH(x) : x ∈ [0, 1]) denotes a fractional Brownian motion with Hurst parameter H, that is, a Gaussian process with zero mean and covariances E[BH(x)BH(y)] = (x^{2H} + y^{2H} − |x − y|^{2H})/2.

The test statistic for these hypotheses is constructed as follows. Let k̂ be an estimator for the time of change as discussed in Sections 2 and 4. Use this estimate to split the observations into the two subsamples Y1,…, Yk̂ and Yk̂+1,…, Yn. Recall the definition of the CUSUM process Zn in (2) and define the max-type test statistics

  • Mn(1) = (1/ω̂1) max_{1≤k≤k̂} |Zn(1)(k/k̂)|  and  Mn(2) = (1/ω̂2) max_{1≤k≤n−k̂} |Zn(2)(k/(n − k̂))|,

where Zn(1) and Zn(2) denote the CUSUM processes computed from the first and second subsample, respectively, and ω̂1² and ω̂2² are consistent estimators of the long-run variance of (ɛt : t ∈ ℤ) under H0, based on the two subsamples. The test statistic is then defined as

  • Tn = max{ Mn(1), Mn(2) }. (14)

Let B(1) and B(2) denote two independent standard Brownian bridges. Under additional regularity assumptions on the timing and the magnitude of the break, one can show with the help of the weak convergence in (5) that, under H0,

  • Tn →d max{ sup_{x∈[0,1]} |B(1)(x)|, sup_{x∈[0,1]} |B(2)(x)| },

while, under HA, Tn → ∞ in probability, thereby showing consistency.
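The statistic Tn of (14) is simple to assemble from the ingredients of Section 2.1. The sketch below takes the estimated break location and a long-run variance routine (such as the Bartlett sketch there) as inputs; the function names are ours.

    import numpy as np

    def split_cusum_stat(y, k_hat, lrv):
        # T_n of (14): maximum of the two subsample CUSUM statistics
        def m_stat(seg):
            seg = np.asarray(seg, dtype=float)
            n = len(seg)
            k = np.arange(1, n + 1)
            z = (np.cumsum(seg) - k / n * seg.sum()) / np.sqrt(n)
            return np.abs(z).max() / np.sqrt(lrv(seg))

        # under H_0 both subsamples are break-free short memory series and the
        # statistics stay bounded; under long memory both tend to diverge
        return max(m_stat(y[:k_hat]), m_stat(y[k_hat:]))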

Example 5. The sample ACF plotted in the left panel of Figure 3 comes from n = 500 observations of the process Yt = μt + ɛt, where μt = 0 if t ≤ k* = 250 and μt = 1 if t > k* = 250, and (ɛt) are from an AR(1) time series with parameter φ = 0.8. This is an example of a time series observed under H0. In the right panel of Figure 3, we see the sample ACF of n = 500 observations of the long-memory process

  • Yt = ∑_{j=0}^{∞} ψj ξt−j,

where (ψj : j ≥ 0) are the coefficients of a fractionally integrated process and (ξt) are independent standard normals. For the simulations, the infinite series has been truncated at an upper limit N = 10,000. Both sample ACF plots display the same features and it seems possible to mistake long memory for short memory subject to mean breaks, and vice versa. An application of Tn separates the two phenomena. First, consider the short-memory series. The break point is estimated, and Mn(1) and Mn(2) are computed on the two subsamples. The corresponding plot in the left panel of Figure 4 shows that the null of short memory with a mean break cannot be rejected, so long memory is ruled out. Second, consider the long-memory series. Again the break point is estimated, and Mn(1) and Mn(2) are computed; the result is shown in the right panel of Figure 4. Short memory with a mean break is rejected and the series is identified to have strong dependence. Note that the 10% asymptotic critical value is 1.36.

Figure 3. ACF plot of a short memory series subject to a mean break (left) and ACF plot of a long-memory process (right). Details on the specifications may be found in Example 5.

Figure 4. Sample path of the subsample CUSUM statistics for a short memory series subject to a mean break (left) and for a long-memory process (right). The same scale is used in both plots. Vertical lines indicate the estimated break location.

Other contributions in the literature reversed the roles of H0 and HA. Among them are Ohanissian et al. (2008), who used an aggregation approach, Qu (2011), who utilized a frequency domain test statistic, and Shao (2011), who worked with a CUSUM-type statistic. Baek and Pipiras (2012) designed tests based on an estimation of the self-similarity parameter. Additionally, Hidalgo and Robinson (1996), Lazarová (2005) and Yoon (2005) provided tests for structural breaks in the mean if the errors exhibit long memory. The effects of persistence and breaks in volatility were examined in Giraitis et al. (2003). Finally, Yamaguchi (2011) estimated the time of change when the long-memory parameter is subject to change.

5.2. Structural breaks and unit roots

Fractionally integrated processes as discussed in the previous section were introduced to bridge the gap between stationary time series and non-stationary time series with unit roots such as the random walk. The latter are important in econometrics. In this section, we discuss how to separate breaks in the mean from random walks and also from innovations that switch from a stationary to a random walk behaviour. In particular, we are interested in the following.

  •  Under the null hypothesis H0, the observations Y1,…, Yn come from the structural break model of Section 2, namely
    • Yt = μ + ɛt for t ≤ k*, and Yt = μ* + ɛt for t > k*,
    where μ ≠ μ*. The innovations (ɛt : t ∈ ℤ) are assumed to satisfy the FCLT (3).
  •  Under the first alternative HA(1), the observations have a constant mean μ and the innovation process satisfies the weak convergence
    • ( n^{−1/2} ɛ⌊nx⌋ : x ∈ [0, 1] ) ⇒ ωW  (n → ∞), (15)
    with some scaling parameter ω and a Brownian motion W = (W(t) : t ∈ [0, 1]).
  •  The second alternative HA(2) has constant mean μ, but at lag k* the stationary observations start to display unit-root behaviour, that is, the innovations (ɛt : t ≤ k*) satisfy (3) and (ɛt : t > k*) satisfy (15).

Several test statistics will be considered. The first procedure is the CUSUM procedure adjusted for a mean break, Tn in (14), and therefore the prototype of statistics used to detect breaks in level. The second procedure is based on the modified adjusted range statistic

  • Rn = (ω̂√n)^{−1} [ max_{1≤k≤n} ∑_{t=1}^{k} (Yt − Ȳn) − min_{1≤k≤n} ∑_{t=1}^{k} (Yt − Ȳn) ],

as proposed by Lo (1991), where Ȳn denotes the overall sample mean and ω̂² is again a long-run variance estimator. Using an estimator k̂ for the time of change, one can proceed as in the previous section and modify Rn for a potential mean break: range statistics Rn(1) and Rn(2) are computed from the two subsamples Y1,…, Yk̂ and Yk̂+1,…, Yn, each centered at its own sample mean. This gives the test statistic Tn^R = max{ Rn(1), Rn(2) }.

A modification (by a mean correction) of the KPSS test (Kwiatkowski et al., 1992) is

  • Kn = (n²ω̂²)^{−1} ∑_{k=1}^{n} ( ∑_{t=1}^{k} (Yt − Ȳn) )².

This rescaled variance statistic was considered by Giraitis et al. (2001, 2003). Mimicking the above steps, one constructs again two statistics Kn(1) and Kn(2) on the subsamples that are split at the estimated break point, from which the test statistic Tn^K = max{ Kn(1), Kn(2) } is computed. If B(1) and B(2) denote again two independent standard Brownian bridges, and if further regularity conditions are met, one gets the following limit distributions under the null hypothesis. It holds, under H0 and as n → ∞, that

  • Tn →d max{ sup_{x∈[0,1]} |B(1)(x)|, sup_{x∈[0,1]} |B(2)(x)| },
  • Tn^R →d max{ sup_x B(1)(x) − inf_x B(1)(x), sup_x B(2)(x) − inf_x B(2)(x) },
  • Tn^K →d max{ ∫₀¹ (B(1)(x))² dx, ∫₀¹ (B(2)(x))² dx }.

The limit distributions can be easily obtained as they are maxima of two independent random variables with known distribution functions, see Shorack and Wellner (1986). It can be shown that all tests are weakly consistent against both HA(1) and HA(2). The results presented here follow Aue et al. (2009d), where all omitted details can be found.
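For completeness, here are single-sample versions of the two competing statistics; the break-adjusted tests apply them to the two subsamples split at k̂ and take the maximum, exactly as for Tn. Function names are ours, and omega2 is again a long-run variance estimate.

    import numpy as np

    def rs_stat(y, omega2):
        # Lo's modified rescaled adjusted range statistic
        y = np.asarray(y, dtype=float)
        n = len(y)
        s = np.cumsum(y - y.mean())
        return (s.max() - s.min()) / np.sqrt(n * omega2)

    def kpss_stat(y, omega2):
        # mean-corrected KPSS / rescaled variance statistic
        y = np.asarray(y, dtype=float)
        n = len(y)
        s = np.cumsum(y - y.mean())
        return (s @ s) / (n ** 2 * omega2)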

Example 6. Suppose that we generate n = 500 observations of the random walk (Yt), where Yt = Yt−1 + ξt, Y0 = 0, and (ξt) is a sequence of independent standard normals. The left panel of Figure 5 shows a typical ACF plot, which is similar to those in Figure 3. An application of Tn first estimates the break point location; then Mn(1) and Mn(2) are computed. The corresponding plot in the right panel of Figure 5 shows that the null of short memory is rejected at the 10% significance level, for which the critical value equals 1.36. One should note the greater smoothness in this plot (compared to the plots in Figure 4). Similar behaviours can be shown for Tn^R and Tn^K, but are omitted here.

Figure 5. ACF plot of a random walk (left) and sample path of the subsample CUSUM statistics for a random walk (right). The vertical line indicates the estimated break location.

Related contributions are Kim et al. (2002), Cavaliere and Taylor (2008), and Harvey et al. (2009, 2010). These authors study the possibility of breaks in the mean and/or the variance of the innovations jointly with the unit-root problem. Their methodology proceeds by first attending to the mean/variance breaks and then testing whether the errors are stationary or exhibit unit roots. An interesting application can be found in King and Ramlogan-Dobson (2011).

6. Miscellanea


We briefly discuss several other lines of research in the structural break field that have had some impact. First, we mention in Section 6.1 some of the contributions to sequential methodologies as they apply to time series. Second, we discuss in Section 6.2 some recent publications in the field of functional data analysis, where contributions addressing temporal dependence are still sparse.

6.1. Sequential procedures

While the body of literature concerned with retrospective break point tests and estimation procedures is rich, this has as of late not been true to the same degree for the corresponding sequential procedures. Starting with the seminal paper by Chu et al. (1996), this has slowly changed. These authors developed fluctuation tests based on the general paradigm that an initial break-free time period of length n is used to estimate a model, with the goal of monitoring for parameter changes on-line. The asymptotic analysis is carried out for n → ∞. To test the null hypothesis of structural stability sequentially, one defines a stopping time τn that rejects the null as soon as a suitably constructed detector function Γn crosses an appropriate threshold gn (measuring the growth of the detector under the null), that is,

  • τn = min{ k ≥ 1 : Γn(k) > gn(k) },

with the convention that τn = ∞ if the detector never crosses the boundary.
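A stylized version of this scheme is sketched below, in the same notation as the earlier sketches. The boundary gn(k) = c σ̂ √n (1 + k/n) is one simple choice of this type; the constant c and the plain standard deviation estimate are illustrative simplifications rather than a published calibration.

    import numpy as np

    def monitor(train, stream, c=1.0):
        # sequential CUSUM-type monitoring after a break-free training sample
        train = np.asarray(train, dtype=float)
        n = len(train)
        mu_hat = train.mean()
        sigma_hat = train.std(ddof=1)  # replace by a long-run variance
                                       # estimate under serial dependence
        detector = 0.0
        for k, y in enumerate(stream, start=1):
            detector += y - mu_hat  # detector Gamma_n(k)
            if abs(detector) > c * sigma_hat * np.sqrt(n) * (1 + k / n):
                return k  # stopping time tau_n: first boundary crossing
        return None  # no alarm; tau_n = infinity in the notation above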

Sequential tests based on CUSUM-type detector functions were considered in Aue et al. (2006, 2009) and Horváth et al. (2004) for linear and time series regressions, in Gombay and Serban (2009) and Hušková et al. (2007) for AR processes, and in Berkes et al. (2004) for GARCH processes. Hušková and Kirch (2012) and Kirch (2008) developed bootstrap techniques for sequential CUSUM procedures. The detection of breaks in counting processes is covered in Gut and Steinebach (2002, 2009). Aue et al. (2012) designed sequential monitoring procedures to test for the stability of the betas in a functional version of the Capital Asset Pricing Model. Other contributions in the literature utilize moving sum (MOSUM) procedures. These have the advantage of faster detection times when compared to CUSUM procedures. A unifying view based on generalized fluctuation tests, incorporating CUSUM and MOSUM procedures as special cases, was offered in Chu et al. (1999a, b), Leisch et al. (2000), Kuan and Hornik (2005), and Zeileis et al. (2005). Starting with Aue and Horváth (2004), there have been a number of contributions deriving the limit distribution of the stopping time τn under the alternative of a structural break, see for example Aue et al. (2008, 2009). Steland (2007) reversed the roles of null and alternative and monitors under the unit-root null hypothesis. Pawlak et al. (2010) designed nonparametric methods based on the vertical box control chart.

6.2. Structural breaks for functional data

A field that has seen increased research output is functional data analysis. Research in this area assumes that data can be described by smooth curves rather than by discrete observations. This approach has become popular both in situations of dense data (roughly, many observations per curve) and sparse data (roughly, few observations per curve). The relevant lines of research are presented in Ramsay and Silverman (2005) and Horváth and Kokoszka (2012). Early contributions on structural breaks in functional data include Berkes et al. (2009), who developed a functional CUSUM-type test exploiting functional principal components analysis and applied it to temperature records, viewing annual profiles as single functional observations. Aue et al. (2009e) analyzed the limit distribution of a break point estimator in the same setting. The time series character of functional data was for the first time systematically treated in Hörmann and Kokoszka (2010), who also highlighted how structural break procedures are affected by serial functional correlation. Horváth et al. (2010) investigated how one might test for structural stability of the autoregressive operator in a Hilbert space-valued autoregressive process. Many open problems remain.

References
