Measuring the degree of non‐stationarity of a time series

In time series analysis, there is an extensive literature on hypothesis tests that attempt to distinguish a stationary time series from a non‐stationary one. However, the binary distinction provided by a hypothesis test can be somewhat blunt when trying to determine the degree of non‐stationarity of a time series. This article creates an index that estimates a degree of non‐stationarity. This index might be used, for example, to classify or discriminate between series. Our index is based on measuring the roughness of a statistic estimated from the time series, which is calculated from the roughness penalty associated with a spline smoothing/penalized least‐squares method. We further use a resampling technique to obtain a likely range of index values obtained from a single realization of a time series. We apply our method to ascertain and compare the non‐stationarity index of the well‐known earthquake and explosion data. © 2016 The Authors. Stat Published by John Wiley & Sons Ltd


Introduction
Recently, there has been intense interest in tests of second-order stationarity. See, for example, Priestley & Subba Rao (1969), Von Sachs & Neumann (2000), Paparoditis (2009Paparoditis ( , 2010, Cardinali & Nason (2010), Dette et al. (2011), Dwivedi & Subba Rao (2011), Nason (2013), Taylor et al. (2014), Basta (2015), Barlett et al. (2015), Jentsch & Subba Rao (2015), Jin et al. (2015), Bandyopadhyay & Subba Rao (2016) and Cardinali & Nason (2016). Most, if not all, of these articles consider the problem of testing the null hypothesis that a time series, ¹X t º, is second-order stationary against the alternative hypothesis that it is not. In practice, such a null/alternative assessment is somewhat blunt. This article takes a different approach: one where we assign an index, S.X t / D S.¹X t º/, to a time series, ¹X t º, that measures the degree of non-stationarity of ¹X t º. The key advantage of a non-stationarity index is that, with two time series, ¹X t º and ¹Y t º, one can attempt to ascertain whether series ¹X t º is more non-stationary than series ¹Y t º.
Section 2 develops an experimental empirical measure of non-stationarity based on the mathematical roughness of the time evolution of fitted parameters of a dynamic linear model. For stationary series, the parameters are constant, which results in zero roughness and a zero value of S. For non-stationary series, greater parameter wiggliness (over time) implies greater roughness and larger positive values of S. Section 3 shows some examples of the measure S for different processes.
We can compute S.X t / based on one realization of X t , t D 1, : : : , T. However, we are more interested in S with respect to the stochastic process ¹X t º, and hence, Section 4 proposes a bootstrap technique that obtains a likely distributional assessment of the quantity S.¹X t º/ computed on the process. Such a distribution can be particularly useful when Stat (wileyonlinelibrary.com) DOI: 10.1002/sta4.125 The ISI's Journal for the Rapid Dissemination of Statistics Research comparing two, or more, series to see whether one is really more non-stationary than the other or whether their distributions are not distinguishable.
Although many of the articles listed earlier focus on hypothesis tests, many of them use metrics that could usefully be employed to measure non-stationarity as we do here, for example, the L 2 metric from Dette et al. (2011). Although we develop a sensible index of stationarity here, it is by no means the only one that could be constructed. Indeed, it would be possible to define measures of non-stationarity that measure different aspects that might, in some cases, give contradictory answers.
2 Non-stationarity as roughness To assess second-order stationarity, we require some way of modelling at least the second-order structure of our time series. One possibility, engaged here, is the use of an underpinning Bayesian dynamic linear model that can flexibly adapt to the underlying process structure, which we review next. The underlying idea is not restricted to dynamic linear models but could also be applied to other moments; local wavelet, S j .z/; or Fourier spectra, f.z, !/, as functions of time z; see Dahlhaus (1997) or Nason et al. (2000) for example.
Let y 1 , y 2 , ..., y n be a realization observed from a Gaussian time-varying autoregressive (TVAR) process ¹Y t º of order 1, TVAR.1/, with an unobservable autoregressive parameter Â t . Dynamic linear models (e.g. West & Harrison, 1997, Chapter 4) provide a general extendable representation for such series. A TVAR.1/ Gaussian dynamic linear model is characterized by two coupled stochastic processes, which are the observation and system equations, respectively, with respective innovation variances of V and W, which we assume to be constant. The quantities G t , V and W are assumed known (or at least well modelled) and deterministic and can, in practice, be selected using the usual dynamic linear modelling procedures. Let D t denote the usual information set at time t. The system equation in model (1) can represent any of the following: a prior belief, a latent process or an unobservable signal at time t, Â t , that a domain expert might be interested to estimate. Page 106 of West & Harrison (1997) explains how prediction in this set-up proceeds in a recursive manner and that the posterior distribution Â t j D t can be shown to follow an N.m t , C t / distribution.
For our purposes of measuring stationarity, the key modelling issue is whether Â t is constant over time. Indeed, ¹Y t º is second-order stationary if and only if ¹Â t º is constant, and this occurs if and only if W D 0 and Â t D G t D c for some constant c. For Gaussian dynamic linear models, the necessary information is completely specified by m t and C t (as the current prior becomes the next posterior), and naturally, these two are constant for stationary ¹Y t º. To measure the degree of non-stationarity, we need some metric for assessing increasing "non-constancy," and one possibility is to measure the "wiggliness" or roughness of the trajectory of m t . In this article, we study the roughness of the trajectory of O Â t , the estimated Â t .
To assess the roughness of Â t , we fit a non-parametric function g.t/ to the estimated mean response curve O Â t by minimizing the following penalized least-squares criterion: where > 0 is selected using the generalized cross-validation methodology proposed by Golub et al. (1979). In minimizing Q, we not only obtain a good estimator g.t/ of m t , but, crucially, we compute the roughness measure R t n t 1 ¹g 00 .t/º 2 dt, which is an estimate of the roughness of Â t . The integral of the squared second derivative is not the only roughness measure: an extensive literature on global and local roughness measures exists; see for example Good & Gaskins (1971, Section 5), Green & Silverman (1993, Chapter 1) or Ramsay & Silverman (2010, Section 5.2.2).
We have two primary motivations for choosing R t n t 1 ¹g 00 .t/º 2 dt as the roughness measure. First, the measure can be applied over any time interval and, more importantly, over compactly supported non-overlapping intervals. Second, the theory underlying Q is mature and well understood and is linked to the theory of natural cubic splines. That is, the solution g.t/, which minimizes criterion (2), among the class of all continuous functions with continuous firstorder derivatives, is a natural cubic spline possessing the following representation (e.g. Silverman, 1985;Green & Silverman, 1993;Wood, 2006): jˇj .t i /; i D 1, 2, : : : , n; t 1 < t 2 < : : where theˇj.t/ are cubic polynomial spline basis functions and the coefficients j are estimated by minimizing Eq.
(2). The t 1 < t 2 < : : : < t n are the knots where g.t/ is continuous and has continuous second-order derivatives at the knot points.
However, our primary interest is not in the fitted g but in the second term of (2), which measures its roughness. Because Â t is random, it is rational not to interpolate its estimate but find a solution g t that fits best within the class of all smooth functions. Hence, the second term is a good surrogate estimator for the roughness for m t . We can give a more parsimonious matrix expression for the roughness measure. Define D ¹ 1 , 2 , : : : where .Y t / are the that arise from fitting the smoothing spline to ¹Y t º. Further, define to be the non-stationarity index of the time series ¹Y t º.

Some expository examples
This section contains two sets of simple expository examples showing how S.Y t / changes as ¹Y t º moves from a stationary model to increasingly non-stationary TVAR.1/ models. Smoothing splines were fitted using the routine smooth.spline, and the non-stationarity index S.Y t / was computed using the function bsplinepen of the package fda (Ramsey & Wickham, 2014), all within the R statistical package. Figure 1 shows the value of S for four models. In each case, G t is constant over time. The plots show the result of computing S.Y t / on different realizations, where one parameter is changed but the others are held constant. So, Figure 1(a) has W D 0.05, V D 1 and G t 2[ 0.2, 1], but all are fixed for each realization. Figure 1(b) has G t D 0.8 and V D 1 but changes W 2[ 0.05, 1]. From both of these plots, one can see that increasing G t or increasing W results in increased roughness of the mean path of Â t , increasing non-stationarity and increasing S.Y t /. The bottom row shows constant S.Y t / for two different stationary processes.
The second set of plots shows the spline fitting to a realization from five different models. Model a specifies Â t D 0.8, and, in all cases, W D 0.03.   Figure 2 are for one realization. We repeated the process for 5000 independent realizations: the summary statistics for each model are shown in Table I. So, for example, both plot and table show that model c exhibits the highest degree of non-stationarity, whereas model a is stationary but, owing to sampling fluctuations, is not precisely zero.
Model c highlights that the non-stationarity index S is an additive measure of the total changes in the gradient of the parameter under study. It cannot distinguish between deterministic and stochastic non-stationarity.

Bootstrap distribution of the non-stationarity index
Computation of S.Y t / on one realization provides limited assessment of the non-stationarity of the underlying process. For a fuller assessment, we construct a conditional bootstrap distribution for the non-stationarity index. For two time series, ¹X t º and ¹Y t º, a bootstrap estimated distribution of the corresponding non-stationarity indices S.X t / and S.Y t / can be plotted to give an idea of whether one process is more non-stationary than the other. In recent years, several   resampling methods for locally stationary processes have been developed. See, for example, Sergides & Paparoditis (2008) and Kreiss & Paparoditis (2012, 2015. Our proposed resampling procedure generates bootstrap innovations corresponding to a fairly general class of TVAR processes. Because our generating model is a Gaussian dynamic linear model, we adapt the state space bootstrap algorithm of Stoffer & Wall (1991), which works by joint estimation and resampling from the system and observation residuals, as described next.

Assessing distribution of S using a conditional bootstrap algorithm
For the dynamic linear model (1), given Â t j D t , Y t is independent of its past and future. This implies that to resample from the forecast distribution Y t j D t 1 , at time t, we only need to resample from the conditional distribution Y t j Â t , D t 1 . The resulting algorithm is a modified residual bootstrap that uses the forecast distribution Y t j D t 1 and the parameter distribution Â t j D t 1 with information until time t 1, to generate bootstrap sample Y t at time t.
(1) Our bootstrap is based on model (1). Then the state at time t 1 can be derived from (1) where a t and f t are the prior mean and forecast mean for Â t and Y t , respectively; R t is the prior variance of Â t ; Q t is the (prior) forecast variance of Y t ; and A t is the regression coefficient of Â t on Y t (these are standard dynamic linear model quantities used repeatedly by West & Harrison, 1997).
(2) Generate bootstrap residuals at time t using (3) Resample N times from e t with replacement obtaining e .i/ t for i D 1, : : : , N. Construct N bootstrap samples by , i D 1, : : : , N.
To assess bootstrap parameter estimation errors, we define two mean-squared loss functions about Â t and m t .
where ¹m .i/ t º T tD1 are the bootstrap estimated mean trajectories for bootstrap sample i D 1, : : : , N. We carried out a bootstrap evaluation of S for each of the examples in Table I for N D 5000 and T D 1000, and the summary statistics are shown in Table II    panel shows the histogram associated with the stationary AR.1/ process (in blue) with a kernel density estimate superimposed in orange. Here, the index has a positively skewed distribution with a mode close to zero. For the remaining panels, each histogram shows the bootstrap distribution of S for the other four models with the stationary AR.1/ density estimate from the top left panel superimposed for comparison. For all the other, non-stationary models, the bootstrap distributions are shifted significantly to the right compared to the stationary case. The supporting information presents further figures that compare descriptive sample statistics of parameters of original and typical bootstrap samples for the aforementioned models in chronological order, for TVAR.1/ in Figure 6, sinusoidal TVAR.1/ in Figure 7, exponential TVAR.1/ in Figure 8 and Gaussian TVAR.1/ in Figure 9.
5 Non-stationarity of the earthquake and explosion data This section applies the non-stationarity index to compare the earthquake, Y t , and explosion, X t , time series from Shumway & Stoffer (2006), which are available in the R statistical package astsa of Stoffer (2016) and shown in Figure 4. Here, T D 2048. As before, we use Gaussian dynamic linear model (1) as a model for both time series. Through careful modelling, we selected W D 0.01, G t D 0.8 and V D 1. Such models permit flexible, yet simple, incorporation of time-varying mean and variance components for observed and latent dynamic signals. We wish to see how S changes over the extent of each series. Hence, we partitioned the series into 20 chronologically ordered non-overlapping time windows, B k , k D 1, : : : , 20, 19 of length 100 and the final one of length 148, and computed S separately on each window. Following the advice of Shumway & Stoffer (2006, footnote on p. 229), we avoided overlapping windows. Figure 4 shows the two time series with the value of S computed on each window B k .
The non-stationarity index for the earthquake data shows two bursts of high values of the index around t D 300 and t D 600. For the former, one can see a slight downward shift in the process itself; for the latter peak, it is difficult to see precisely what might be causing the larger value of S. There are smaller peaks around t D 1200, 1500 and 1700, where changes in the stochastic structure appear to occur. For the explosion data, the non-stationarity index is large around the point of the series where the variance rapidly increases (t D 1200) but seemingly misses the obvious structural change very early on in the series. The non-stationarity index for the earthquake series is much larger than for the explosion one. Both series are presented on approximately commensurate scales, and so the larger non-stationarity index for the earthquake data indicates a higher degree of non-stationarity. This might be due to it possessing sustained periods of regime change compared to the explosion data, which is akin to singular striking of an object.

Correcting for scale effects
The non-stationarity index S is affected by the scale or measured units of an underlying time series. For example, if we multiply a series by a scale factor, c > 0, then also increases by c and hence S increases by c 2 . For situations where series are measured in the same units, for example, the lack of scale invariance might be seen as an advantage. In other situations, one might require a scale-invariant measure of non-stationarity. There is no optimum method for achieving scale invariance, but we have experimented with the following quantities: where sd t2B k .¹Y r º/ is the standard deviation of the series in the block B k , which overlaps time point y: this normalizes S by the local standard deviation. In S C .Y t /, we normalize by the posterior variance C t , while in S .B k /, we divide the original index S by the squared norm of fitted spline coefficients.
Figure 5 plots S .Y t / and S C .Y t / for the earthquake and explosion data, and the supporting information Figure 10 shows the plot for S .B k /. The normalization has an effect, particularly in the lower left panel of Figure 5 where the sizes of the non-stationarity indices for the two peaks are closer than previously. However, the normalization may seem to be less influential than expected, possibly because the scales of the earthquake and explosion series are reasonably commensurate to begin with.

Discussion
This article proposes a descriptive tool, S.Y t / from (5), that measures the degree of non-stationarity of a time series. This contrasts with hypothesis tests of stationarity that adopt a coarser binary approach to the question of time series stationarity. The top panels in Figure 1 clearly show that when the modelled Â t is changing fastest (increasing G t in different examples), the assessed non-stationarity increases as expected. To prevent relying on the results of S on a single realization, we adopt a bootstrap approach to obtain an estimate of the distribution of S with respect to the underlying stochastic process of the time series.
The non-stationarity index can be flexibly computed on a number of time-varying statistics such as spectra, autocorrelation functions at different lags or, as here, statistics arising from various stochastic models such as dynamic linear models. We can compute the index on the whole series or on a set of windows as shown earlier. The index could also be computed on streaming or online data, taking advantage of fast algorithms for spline computation such as in Section 5 of Weinert (2009).
Although our earlier proposal is one reasonable possible index of non-stationarity, it is certainly not the only one or the optimal (if such a thing exists). For example, we have already remarked that, for a TVAR.1/ model with Â t , following a linear trend as a function of t would result in a zero value of S. Hence, one might consider examination of indices (wileyonlinelibrary.com) DOI: 10.1002/sta4.125 The ISI's Journal for the Rapid Dissemination of Statistics Research that depend on the first derivative of the time-varying statistic, for example, R ¹g 0 .t/º 2 dt and also using some method for estimating g for the particular time series in question. We have mostly focused on Gaussian examples that are well modelled by dynamic linear models, but other models could and should be used in other situations where one is faced with heavy-tailed errors or trend, for example.