Prediction of climate nonstationary oscillation processes with empirical mode decomposition



[1] Long-term nonstationary oscillations (NSOs) are commonly observed in climatological data series such as global surface temperature anomalies (GSTA) and low-frequency climate oscillation indices. In this work, we present a stochastic model that captures the NSOs within a given variable. The model employs a data-adaptive decomposition method named empirical mode decomposition (EMD), with which the irregular oscillatory processes in a given variable can be extracted into a finite number of intrinsic mode functions. A unique data-adaptive algorithm is proposed in the present paper in order to study the future evolution of the NSO components extracted with EMD. To evaluate the model performance, the model is tested with a synthetic data set from the Rössler attractor and with GSTA data. The results for the attractor show that the proposed approach provides a good characterization of the NSOs. For the GSTA data, the last 30 observations are truncated and compared to the generated data, and the model is then used to predict the evolution of the GSTA data over the next 50 years. The results of the case study confirm the power of the EMD approach and the proposed NSO resampling (NSOR) method, as well as their potential for the study of climate variables.

1. Introduction

[2] Nonstationary oscillations (NSOs), also referred to as nonstationary sinusoids [Kuznetsova and Tsirulnik, 2004] or quasi periodicity [Meyers and Pagani, 2006], have been observed in climatological data such as global surface temperature anomalies (GSTA), the North Atlantic Oscillation (NAO) and the Pacific Decadal Oscillation (PDO) index values. For example, global surface temperature shows long-term (or low-frequency) nonstationary processes such as decadal [Ghil and Vautard, 1991] and multidecadal oscillations [Schlesinger and Ramankutty, 1994a]. However, these oscillations are often so irregular that they cannot be represented with a simple sine or cosine wave. In other words, the phase and modulus are changing with time, which implies nonstationarity.

[3] A stochastic model that reproduces an NSO is useful to predict the variations of climatic processes and to study their impacts on other variables such as hydrologic regimes. It is, however, a difficult task to model an NSO process. For example, a simple linear autoregressive moving average (ARMA) model [Salas et al., 1980] is applicable to the stochastic modeling of climatic variables; however, it assumes that the modeled time series is stationary and therefore cannot accommodate an NSO process. Another alternative is to employ data-adaptive simulation techniques such as the index sequential method [Ouarda et al., 1997], block bootstrapping [Efron and Tibshirani, 1993; Vogel and Shallcross, 1996], and k-nearest neighbor resampling (KNNR) [Lall and Sharma, 1996]. However, these methods are also unable to capture a long-term NSO process.

[4] One applicable model for NSO processes is the shifting mean level (SML) model developed by Salas and Boes [1980] and Sveinsson et al. [2003] in which the long-term oscillation pattern is modeled with the shifting mean process. However, the correlation structure of the model decreases exponentially, implying that the oscillation is not properly conveyed. The best way of preserving a long-term NSO process in a stochastic model still remains in question.

[5] Another alternative might be (1) to extract the long-term NSOs from observed data into different frequency components and (2) to build a time series model for the individual NSO components. This requires an algorithm that properly separates the long-term NSOs embedded in observed data into a manageable number of components. In general, however, frequency decomposition algorithms require prior information about the observed signals, and their performance degrades when an overall trend exists [Elsner and Tsonis, 1994; Schlesinger and Ramankutty, 1994a; Elsner and Tsonis, 1996]. Other algorithms such as wavelet analysis [Torrence and Compo, 1998] and the multitaper method [Thomson, 2001] produce too many components to manipulate.

[6] Meanwhile, Huang et al. [1998] proposed a decomposition technique, named empirical mode decomposition (EMD), to disclose the hidden intrinsic NSO structure of a time series. With this decomposition, oscillation structures embedded in a time series at different frequency levels are expressed as intrinsic mode functions (IMFs). This data-adaptive decomposition method has been shown to extract NSOs well into a finite number of IMFs, even when they are combined with a long-term trend. Furthermore, Wu and Huang [2004] carried out a Monte Carlo experiment to investigate how EMD performs on white noise and derived from it a way to test the significance of IMFs. EMD analysis has been applied in climate research, for instance, by Xie et al. [2002], Li and Davis [2006], Pegram et al. [2008], McMahon et al. [2008], and Lee and Ouarda [2010].

[7] In the present paper we propose an approach to model NSO processes. EMD is employed to capture the long-term NSO in the data. The decomposed components from EMD, the IMFs, are tested to determine whether each extracted component is induced by white noise or by a physical forcing [Wu and Huang, 2004]. These IMFs are categorized into three types (overall trend, oscillatory, and residuals) and modeled according to their characteristics.

[8] For the overall trend, the change rate of the trend is fitted with a polynomial regression model if the trend component is significant according to the test [Wu and Huang, 2004]. For the oscillatory components, a particular data-adaptive algorithm is proposed in this paper in order to extend the future evolution of the NSO process. To the authors' knowledge, no stochastic model exists for the extension of an NSO process whose time series varies smoothly with time while its frequencies and phases are not stationary. Finally, for the residuals, the sum of the insignificant components is treated either as random white noise or as autocorrelated red noise and is modeled accordingly with a short-memory time series model. Parametric approaches (e.g., normal random noise or lag-1 autoregressive) or nonparametric approaches (e.g., bootstrapping or KNNR) can be adopted for this purpose. Note that even if a climatic system cannot be precisely predicted at short range, its overall long-term change may be predictable. In this paper, we focus on the long-term oscillatory process instead of the short-term process.

[9] To validate the model performance, the model was tested with the synthetic data from a nonlinear chaotic system, the Rössler attractor [Rössler, 1976, 1995]. As a case study, the proposed model was applied to GSTA data. The GSTA observations of the last 30 years were truncated and compared to the data generated from the model. Finally, the next 50 years of data were generated to predict the evolution of GSTA data into the future.

[10] The paper is organized as follows. In section 2, we describe two fundamental models employed in the proposed approach, the KNNR and EMD. The procedures of the proposed model for the selected oscillation components, the parameter estimation approach, and the modeling of the other components are presented in section 3. The proposed model is applied to a synthetic nonlinear oscillation time series in section 4. The application of the procedure to the GSTA data is presented in section 5. Finally the summary and conclusions are presented in section 6.

2. Background

[11] The proposed procedure is based on two existing models, KNNR and EMD. In the next sections, we present the fundamentals of these two approaches.

2.1. KNNR

[12] Lall and Sharma [1996] developed KNNR to simulate annual and seasonal time series. The approach is based on a k-nearest neighbor (KNN) density estimator that uses the distance to the kth nearest data point and the volume containing the k data points. The conditional probability density function is approximated using the KNNs of the current state. The overall procedure is summarized as follows:

[13] 1. Find the KNNs to the current state by computing the distance from the current value to all the historical records.

[14] 2. Select one of the k neighbors randomly, with the weighting probability given by

$$p_j = \frac{1/j}{\sum_{i=1}^{k} 1/i}, \qquad j = 1, \ldots, k \qquad (1)$$

This weighting probability is derived from approximating the local k-nearest neighbor density as a Poisson process [Lall and Sharma, 1996].

[15] 3. Set the successor of the neighbor selected in step 2 as the simulated value for the next time step. The steps are repeated until the required simulation length is generated (a minimal sketch follows).
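
As an illustration, the following Python sketch implements the three steps for a univariate series. The toy series, the seed, and the helper name knnr_step are our own assumptions for the example; only the distance computation, the kernel of equation (1), and the successor rule follow Lall and Sharma [1996].

```python
import numpy as np

rng = np.random.default_rng(42)

def knnr_step(history, current, k):
    """One KNNR step [Lall and Sharma, 1996] for a univariate series:
    find the k nearest historical states to the current state, resample
    one with the kernel of equation (1), and return its successor."""
    d = np.abs(history[:-1] - current)   # step 1: distances (only states with a successor)
    neighbors = np.argsort(d)[:k]        # indices of the k nearest states
    w = 1.0 / np.arange(1, k + 1)        # step 2: weighting probability, equation (1)
    w /= w.sum()
    j = rng.choice(neighbors, p=w)
    return history[j + 1]                # step 3: the successor is the simulated value

# Usage: extend a toy oscillatory series by 10 values.
x = np.sin(np.arange(100) / 5.0) + 0.1 * rng.standard_normal(100)
k = int(np.sqrt(len(x)))                 # heuristic choice discussed in section 3.2
sim = [x[-1]]
for _ in range(10):
    sim.append(knnr_step(x, sim[-1], k))
```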

2.2. EMD

[16] EMD is an algorithm that extracts NSO components with different frequencies from a time series. Any complicated data set can be decomposed into a finite number of oscillatory modes whose frequencies are well separated from each other. The components resolved by EMD are defined as IMFs. An IMF is a function that satisfies the following two conditions: (1) the number of extrema must be equal to the number of zero crossings or differ from it by at most one and (2) the mean value of the two envelopes determined by the local maxima and minima must be zero.

[17] The basic EMD procedure (called sifting process) and its necessary feature, orthogonality, are described in the following subsections. Then, the significance test of IMFs is discussed. For further details, the readers are referred to Huang et al. [1998] and Huang and Wu [2008].

2.2.1. Sifting Process

[18] The following sifting process serves (1) to eliminate riding waves and (2) to smooth uneven amplitudes. The process to obtain the finite number of IMFs from a time series x(t) (t = 1, …, N, where N is the record length) is described with an example in Figure 1. The thick solid line of Figure 1 represents the observation data.

Figure 1.

Example of the sifting process (thick solid line, observations; two thin solid lines, upper and lower envelopes; thick dotted line, mean of the envelopes, i.e., m1 in equation (2)). Notice that some overshoots and undershoots are present (e.g., an undershoot at time = 12 and an overshoot at time = 25).

[19] To extract the first component, we find the local maxima and minima of the time series. The local maxima and the local minima are each connected with a cubic spline [Press et al., 2002], called the upper and lower envelopes; these envelopes are shown as two thin solid lines in Figure 1. The mean of the two envelopes, m1 (thick dotted line in Figure 1), is subtracted from the original time series, x, and the difference is denoted h1, i.e.,

$$h_1 = x - m_1 \qquad (2)$$

Here, h1 could be the first IMF. However, the estimated local maxima and minima are imperfect, with undershoots and overshoots (see, for instance, t = 12 for an undershoot and t = 25 for an overshoot in Figure 1). Thus, h1 might not satisfy the IMF conditions mentioned above. Therefore, the sifting process must be repeated a number of times by treating h1 as the data, x; then

$$h_{11} = h_1 - m_{11} \qquad (3)$$

in which m11 is the mean of the envelopes with h1 as the data (x) in the first iteration. Notice that the second subscript index (e.g., h11 and m11) indicates the repetition number of the sifting process. If we repeat the process k times, then h1k = h1(k−1) − m1k. The first IMF component is then designated as c1 = h1k. Note that the first IMF contains the fastest changing signal.

[20] In practice, no matter how many times the data are sifted, some asymmetric wave forms can still exist, implying that perfect IMF conditions cannot be met. Therefore, one should establish a criterion for stopping the sifting process once the IMF retains enough physical sense in both its amplitude and frequency modulations. In this regard, several stopping criteria for k have been adopted in the literature [Huang et al., 1998; Huang and Wu, 2008]. One example from Huang et al. [1998] is SD = Σₜ∣h1(k−1)(t) − h1k(t)∣² / Σₜ [h1(k−1)(t)]², where this value must be smaller than a value predetermined by the user (typically 0.2–0.3 [Huang et al., 1998]). The value of 0.2 is employed in the current study. Pegram et al. [2008] and Peel et al. [2009] advocated employing rational splines instead of cubic splines to improve the end effects and to decrease the number of siftings. Rational splines are not used in the current study, mainly because they require additional, time-consuming adjustment procedures. The advantage of the rational spline method is that it allows for an interplay between spline tension and IMF characteristics, whereas the original cubic method provides a single output [Pegram et al., 2008]. Peel et al. [2009] carried out additional work to assess the performance of rational spline-based EMD for a global annual precipitation data set. Future research efforts on the use of EMD in the prediction of nonstationary hydroclimatological oscillation processes should integrate the use of rational splines.

[21] The subsequent components (ci where i = 2, …, n) are estimated by treating the residual (ri) as the signal (x), iteratively. The ith residual is defined as

$$r_1 = x - c_1, \qquad r_i = r_{i-1} - c_i, \quad i = 2, \ldots, n \qquad (4)$$

In other words, substituting x by ri in equation (2), the same procedure as for c1 is repeated. For the last component, rn is treated as cn+1, where n is the total number of IMFs. Then x = Σᵢ ci (i = 1, …, n + 1). Note that the last component, cn+1, represents the overall trend (not an IMF), in that no further decomposition is available and no oscillation exists within the given time span. Wu and Huang [2004] found that EMD acts as a dyadic filter bank; therefore, the number of IMFs is, on average, approximately the base 2 logarithm of the record length.
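
A minimal Python sketch of the sifting loop is given below. It is a bare-bones illustration that ignores end-effect handling and the EEMD ensemble discussed next, and it assumes the signal has enough interior extrema for the cubic-spline envelopes; the helper names are ours.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema

def envelope_mean(x):
    """Mean of the upper and lower cubic-spline envelopes (m1 in equation (2))."""
    t = np.arange(len(x))
    imax = argrelextrema(x, np.greater)[0]   # local maxima
    imin = argrelextrema(x, np.less)[0]      # local minima
    upper = CubicSpline(imax, x[imax])(t)    # upper envelope
    lower = CubicSpline(imin, x[imin])(t)    # lower envelope
    return 0.5 * (upper + lower)

def extract_imf(x, sd_stop=0.2, max_sift=50):
    """Sift until the SD criterion of Huang et al. [1998] drops below sd_stop."""
    h = x - envelope_mean(x)                 # equation (2)
    for _ in range(max_sift):
        h_new = h - envelope_mean(h)         # repeated sifting, equation (3)
        sd = np.sum((h - h_new) ** 2) / np.sum(h ** 2)
        h = h_new
        if sd < sd_stop:
            break
    return h

# Usage: c1 carries the fastest oscillation; the next IMF is obtained by
# sifting the residual r1 = x - c1, and so on (equation (4)).
t = np.arange(300)
x = np.sin(t / 3.0) + np.sin(t / 30.0)
c1 = extract_imf(x)
r1 = x - c1
```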

[22] To improve the uniqueness of the IMFs, the ensemble EMD (EEMD) algorithm was developed by Wu and Huang [2009], in which the EMD algorithm is assisted by an added white noise term. Since the uniqueness of the components is an important requirement in a stochastic model, EEMD is adopted in this study. The algorithm is simple, but several procedural details, such as round-off error and the edge effect, still need to be addressed; these details are beyond the scope of the present study. Orthogonality, one of the most important features of EMD analysis for independent modeling, is discussed in the following.

2.2.2. Orthogonality

[23] A measurement of local orthogonality was proposed by Huang et al. [1998]. However, the overall orthogonality cannot be estimated with this measurement. To obtain a global orthogonality check, cross-correlation is employed in the present study. Here, orthogonality implies no correlation because the oscillatory IMFs have zero mean. Huang et al. [1998] mentioned that orthogonality is met only locally and is not guaranteed globally, since each component is extracted from the difference between the signal (or the residual of the preceding component) and the local mean (not the overall mean) of its envelopes, as in equations (2) and (4).

[24] Orthogonality is an important characteristic that is required in the present work. The reason is that if each component is separately modeled while certain components are significantly correlated with each other, the total variance in reconstructing the data might be underestimated. By assuming a zero mean (by excluding the last trend component, cn+1), the variance of the original signal is

$$E[x^2] = \sum_{i=1}^{n} E[c_i^2] + 2\sum_{i=1}^{n}\sum_{j>i}^{n} E[c_i c_j] \qquad (5)$$

where E[z] indicates the expectation of the variable z. Orthogonality implies that if the covariance terms of equation (5) are equal to zero, then the variance of the original signal, E[x²] = σx², is simply the summation of the variances of the components, Σᵢ var(ci). In other words, the separate modeling of each component implies that the total variance is carried by the variances of the individual components. However, if the components are significantly correlated, the variance of the reconstructed data will be reduced. This effect is more pronounced when the highly correlated components possess a high variance. In the case where significantly correlated components are neighbors (e.g., j = i ± 1), a good alternative is to model the correlated components as one by summing them (i.e., ci ← ci + cj). In practice, we model the summed component to avoid the leakage of total variance in the generated data.
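
A minimal sketch of the global orthogonality check used here, assuming the trend component cn+1 has already been excluded:

```python
import numpy as np

def orthogonality_check(imfs):
    """Pairwise cross-correlations among the oscillatory IMFs and the share of
    total variance carried by the covariance terms of equation (5)."""
    c = np.asarray(imfs)                  # shape (n, N); trend component excluded
    x = c.sum(axis=0)                     # reconstructed (detrended) signal
    rho = np.corrcoef(c)                  # global orthogonality check
    leak = 1.0 - c.var(axis=1).sum() / x.var()
    return rho, leak

# If neighboring components (j = i +/- 1) are significantly correlated, sum
# them (ci <- ci + cj) and model the combined component, as described above.
```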

2.2.3. Significance Test of IMF Components

[25] To determine whether an IMF contains a true signal or only a random noise component, Wu and Huang [2004, 2005] developed a significance test. With the energy of the jth IMF defined as Ej = (1/N) Σₜ [cj(t)]², the total energy of the time series x(t) is

$$E = \frac{1}{N}\sum_{t=1}^{N} [x(t)]^2 = \sum_{j=1}^{n+1} E_j \qquad (6)$$

The equality between the second and third members of equation (6) follows from the assumption that the IMFs are orthogonal. The numerical experiments of Wu and Huang [2004] reveal that

$$\ln E_j + \ln T_j = 0 \qquad (7)$$

where Tj is the mean oscillation period calculated by the inverse of the frequency of the Hilbert-Huang Transform for the jth IMF. Based on the relationship between the energy and the mean period at each component, Wu and Huang [2004] established a statistical significance test for IMF components derived from white noise. If the IMF energy of the observed data with a certain mean period is higher than the upper bound of a certain confidence interval, the corresponding IMF is considered statistically significant at the given level. Wu and Huang [2004] showed that the first IMF contains no perceivable physical process so that it can be safely assumed to be pure noise. Therefore, c1 is not considered in the significance test.
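
The two quantities entering the test can be sketched as follows; note that the mean period is approximated here by zero-crossing counting rather than the Hilbert-Huang instantaneous frequency used by Wu and Huang [2004], and the white-noise confidence bounds themselves are not reproduced.

```python
import numpy as np

def imf_energy_period(c):
    """Energy E_j of equation (6) and mean period T_j of an IMF; the period is
    approximated by zero-crossing counting instead of the Hilbert-Huang
    instantaneous frequency used by Wu and Huang [2004]."""
    c = np.asarray(c, float)
    energy = np.mean(c ** 2)                  # E_j = (1/N) sum_t c_j(t)^2
    n_cross = np.count_nonzero(np.signbit(c[:-1]) != np.signbit(c[1:]))
    period = 2.0 * len(c) / max(n_cross, 1)   # two zero crossings per cycle
    return energy, period

# Plotting ln(E_j) against ln(T_j) for all IMFs and comparing each point with
# the white-noise bound around equation (7) reproduces the test of Figures 6 and 10.
```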

3. Methodology

[26] The principle of the applied model here is (1) to decompose the time series x(t) into a finite number of IMFs, (2) to find the significant components among them, (3) to fit a stochastic time series model (parametrically or nonparametrically) to the selected significant components and the residuals accordingly, (4) to extend the future evolution of each component from the fitted models, and (5) finally, to sum up those separately modeled components. Furthermore, a unique algorithm, called NSO resampling (NSOR), is introduced to model the significant IMFs. The overall process is schematically presented in Figure 2a and the NSOR model procedure is presented in Figure 2b.

Figure 2.

(a) Overall process of the proposed model and (b) procedure of the NSOR modeling.

[27] From the significance test [Wu and Huang, 2004], the components are categorized into three types according to their characteristics as (1) oscillatory components, c(j), where c(j) is the significant component determined from the significance test, j = 1, …, J, and J is the number of the selected components excluding the overall trend; (2) overall trend, cn+1, if significant; and (3) a white (random) or red (autocorrelated) noise component, ɛ, which is the summation of the residuals (insignificant components), i.e.,

$$x(t) = \sum_{j=1}^{J} c_{(j)}(t) + c_{n+1}(t) + \varepsilon(t) \qquad (8)$$

[28] The time series modeling procedure mainly employs the change rate of an oscillatory time series, defined as Δc(t)/Δt, instead of modeling the component c(t) directly. Generally, data are observed discretely at a constant interval (e.g., a second, an hour, or a year); i.e., Δt = 1. Consequently,

$$\Delta c(t) = c(t) - c(t-1) \qquad (9)$$

[29] The main reason for using the change rate in the proposed approach is that directly using the component data might produce an abrupt change, which is not desirable in a smooth oscillation process. Since any given value includes the previous one, as c(t) = c(t − 1) + Δc(t), abrupt changes are avoided by employing the change rate. The change rate Δc(t) is employed to model the oscillatory and overall trend components in the current study, as illustrated in the short sketch below.
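
The change-rate representation and its inverse are one line each; the toy values below are illustrative only.

```python
import numpy as np

c = np.array([0.0, 0.4, 0.7, 0.6, 0.2, -0.1])             # an illustrative component
dc = np.diff(c)                                           # change rates, equation (9)
rebuilt = np.concatenate(([c[0]], c[0] + np.cumsum(dc)))  # c(t) = c(t-1) + dc(t)
assert np.allclose(rebuilt, c)                            # lossless reconstruction
```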

3.1. NSOR

[30] The crucial characteristics of an oscillatory IMF are that (1) the phases and frequencies vary within a certain range for each mode; (2) since the components are oscillatory, past cycles are repeated in the future in a similar manner but not with exactly the same phase and frequency; and (3) the components change smoothly, similar to a sine wave. No ordinary time series model can reproduce these characteristics of an IMF. Therefore, a unique algorithm to simulate this NSO is proposed here.

[31] Since the IMF components containing the NSO process are repetitive, the observed data contain sufficient information to extend the observations and estimate the future evolution. In other words, time series modeling of the NSO process can be carried out by resampling the historical oscillation. An original block bootstrapping technique was developed for multivariate time series simulation by Lee [2008]. While a common block bootstrapping technique resamples each block independently of the previous block with a fixed block length, Lee [2008] preserves the relation between blocks with KNNR and treats the block length as a random variable. The procedure can be briefly summarized as follows: (1) generate the block length, LB, from a discrete random variable (e.g., Poisson or geometric); (2) use KNNR to select the starting value of the following block; and (3) take the block of length LB that follows the point selected by KNNR. The NSOR method proposed herein is based on this bootstrapping algorithm of Lee [2008].

[32] Before describing the details of the modeling procedure, the rationale and hypotheses behind the techniques adopted for NSOR modeling are discussed as follows:

[33] 1. Block bootstrapping is employed because the sequences are oscillatory within a certain range of variation of phases and frequencies, so the observations carry sufficient information to generate sequences preserving the NSO characteristics. However, if the observations are resampled directly, the historical sequence is merely repeated in different combinations, and the generated sequences do not vary smoothly. Therefore, the change rates (equation (9)) are bootstrapped instead. The generated data, combined with the random block length discussed next, form sequences that differ from the historical ones yet vary smoothly.

[34] 2. A random block length is employed in the block bootstrapping because the IMF component carries a nonstationary process, in which the times at which the phases and frequencies of the oscillations change are random.

[35] 3. KNNR is employed to choose the starting value of the next block so that the sequence continues smoothly. A particular distance measure is adopted in KNNR to ensure the smooth extension of the future evolution.

[36] Suppose that we have the sequence of a certain IMF component c(t), where t = 1, …, N, and we want to extend the sequence from its end point (cH(N)), as shown in Figure 3. The "H" superscript indicates that the sequence represents observed data, and "G" indicates generated data. Hence set cH(N) = cG(0). With this setup, the NSOR procedure is as follows (also illustrated schematically with a flowchart in Figure 2b).

Figure 3.

Artificial NSO time series for the description of NSOR modeling. Notice that there are four candidate points (k = 4), namely p2, p4, p6, and p8, for the current condition cH(N). Among the four candidate points, p2 is selected, and the LB successors of ΔcH(p2) are taken.

[37] 1. A block length, LB, is randomly generated from a discrete distribution (e.g., Poisson or geometric) for the extended values that follow cH(N). A Poisson distribution is employed in the present study because its shape is generally close to a normal distribution centered on the mean, rather than positively skewed (e.g., geometric). More information on the selection of this discrete distribution in block bootstrapping can be found in Lee [2008]. The employed Poisson distribution with its parameter (τ) is

$$f(L_B = l) = \frac{\tau^{l} e^{-\tau}}{l!}, \qquad l = 0, 1, 2, \ldots \qquad (10)$$

Note that the parameter (τ) is the mean of LB. The parameter selection for the Poisson distribution (τ) is discussed later.

[38] 2. Distances are estimated in KNNR to find the historical observations closest to the current state, as follows:

$$D_j = \sqrt{\alpha_1 \left[c^G(t-1) - c^H(j)\right]^2 + \alpha_2 \left[\Delta c^G(t-1) - \Delta c^H(j)\right]^2} \qquad (11)$$

where α1 = 1/σc² (σc² being the variance of the component data, c), α2 = 1/σΔc², and j = 2, …, (N − LB). The last LB elements are excluded to ensure that the records following the selected points are at least LB in length. The first element (j = 1) is omitted because its change rate cannot be estimated. The objective of measuring a distance is to find historical points that are similar to the current condition, here cG(0) = cH(N). The points closest to the current condition in Figure 3 are [p1, …, p9]. However, the current change rate presents a rising condition. The second term on the right side of equation (11) takes this discrepancy into account by yielding large distances for points with an opposite slope. To ensure that neither of the two distance terms dominates the other, the weighting factors (α1 and α2) are applied. An alternative to equation (11) is to use only the first term (i.e., Dj = ∣cG(t − 1) − cH(j)∣) and omit the points whose slope is opposite to the current condition. This approach might not be appropriate when the selected points are near the local extrema of an oscillation signal, where the sign of the slope is not very meaningful. Therefore, equation (11) is employed throughout this paper.

[39] 3. The k smallest distances among j = 2, …, N − LB are obtained, and their time indices are stored; in Figure 3 the candidate points are [p2, p4, p6, p8] with k = 4.

[40] 4. One of the k points is randomly selected with the weighting probability given in equation (1).

[41] 5. If p2 is selected, the change rates of the LB successors are taken (i.e., {ΔcH(p2 + 1), …, ΔcH(p2 + LB)}).

[42] 6. The generated data with length LB are obtained by

$$c^G(t) = c^G(t-1) + \Delta c^H(p_2 + t), \qquad t = 1, \ldots, L_B \qquad (12)$$

where cG(0) = cH(N).

[43] 7. Steps 1–6 are repeated until the required data are generated. A condensed sketch of the whole procedure follows.
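
The Python sketch below condenses steps 1–7; the helper name nsor_extend, the toy series, and the seed are ours, and the distance uses the weighted squared form reconstructed in equation (11). The parameters τ and k are left to the rules of section 3.2.

```python
import numpy as np

rng = np.random.default_rng(7)

def nsor_extend(c_hist, n_gen, tau, k):
    """NSOR sketch (steps 1-7): extend an oscillatory IMF by block-bootstrapping
    its change rates, with Poisson block lengths and KNNR-selected block starts."""
    c = np.asarray(c_hist, float)
    dc = np.diff(c)                                # change rates, equation (9)
    a1, a2 = 1.0 / c.var(), 1.0 / dc.var()         # weighting factors of equation (11)
    out, d_last = [c[-1]], dc[-1]                  # current condition c_G(0) = c_H(N)
    while len(out) <= n_gen:
        Lb = max(1, rng.poisson(tau))              # step 1: block length, equation (10)
        j = np.arange(1, len(dc) - Lb)             # feasible block starts (step 2)
        D = np.sqrt(a1 * (out[-1] - c[j]) ** 2 +
                    a2 * (d_last - dc[j - 1]) ** 2)    # step 2: equation (11)
        nn = j[np.argsort(D)[:k]]                  # step 3: k nearest neighbors
        w = 1.0 / np.arange(1, nn.size + 1)
        p = rng.choice(nn, p=w / w.sum())          # step 4: kernel of equation (1)
        for s in range(Lb):                        # steps 5-6: equation (12)
            d_last = dc[p + s]
            out.append(out[-1] + d_last)
    return np.array(out[1 : n_gen + 1])            # step 7 is the enclosing loop

# Usage on a toy nonstationary oscillation:
t = np.arange(600)
imf = np.sin(2 * np.pi * t / (60 + 10 * np.sin(t / 90.0)))
ext = nsor_extend(imf, n_gen=100, tau=8.0, k=int(np.sqrt(len(imf))))
```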

3.2. Parameter Selection of NSOR

[44] In the NSOR process, two parameters, the number of nearest neighbors (k) and the block length parameter (τ), must be selected. A heuristic approach or a generalized cross-validation (GCV) procedure can be employed for the number of nearest neighbors (k) [Lall and Sharma, 1996]. The heuristic approach k = √N, which has theoretical justification [Fukunaga, 1990; Lall and Sharma, 1996], is adopted in the present study.

[45] The Poisson parameter of the block length random variable (τ in equation (10), with τ = E[LB]) is equivalent to the fixed block length in general block bootstrapping. Hall et al. [1995] suggested selecting the block length by (1) setting it to N^(1/α) and then (2) finding α by minimizing the mean square error between the block bootstrap estimates of a target statistic and the estimate from the entire observed data. This, however, is only applicable when block bootstrapping is employed to estimate the standard error of a certain statistic, whereas the objective of the bootstrapping in the current study is to simulate sequences preserving the NSO characteristics.

[46] Wilks [1997] proposed an alternative applicable to this case. He derived a rule from the trial-and-error evaluation of synthetic autoregressive series. By assuming that the given data follow a first-order or second-order autoregressive (AR(1) or AR(2)) model, the block length is selected as a function of the variance inflation factor (VIF), V, and the record length, N. Wilks [1997] concluded that the AR(2) case is much more robust and more widely applicable than AR(1). Therefore, in the present study we applied this block length selection with the AR(2) assumption for the parameter τ in equation (10). This treatment of the block length preserves the memory of the current state; further serial dependence is ensured by selecting the subsequent block with KNNR instead of using independent blocks [Lee, 2008].

[47] The VIF depends on the autocorrelation as shown by the equation:

$$V = 1 + 2\sum_{j=1}^{N-1}\left(1 - \frac{j}{N}\right)\rho_j \qquad (13)$$

where ρj are the model estimates of the autocorrelations at lag j and N is the record length. The AR(2) model and its estimates of ρj are presented explicitly in Appendix A. The bias-adjusted VIF is V′ = V exp(3V/N). The parameter τ in equation (10) (i.e., the mean value of LB) is chosen by solving the following equation:

$$\tau = (N - \tau + 1)^{(2/3)(1 - N'/N)}, \qquad N' = N/V' \qquad (14)$$
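
A sketch of this parameter selection is given below; it assumes the implicit Wilks [1997] rule in the form reconstructed in equation (14) and solves it by fixed-point iteration. The helper name poisson_block_mean is ours.

```python
import numpy as np

def poisson_block_mean(x, n_iter=200):
    """Sketch of section 3.2: fit the AR(2) ACF (equations (A2)-(A3)), compute
    the VIF of equation (13) and its bias adjustment, and solve equation (14)
    for tau (the mean Poisson block length) by fixed-point iteration."""
    x = np.asarray(x, float) - np.mean(x)
    N = len(x)
    acf = np.correlate(x, x, "full")[N - 1:] / (N * x.var())
    rho1, rho2 = acf[1], acf[2]
    phi1 = rho1 * (1 - rho2) / (1 - rho1 ** 2)          # equation (A2)
    phi2 = (rho2 - rho1 ** 2) / (1 - rho1 ** 2)
    rho = np.empty(N)
    rho[:3] = 1.0, rho1, rho2
    for j in range(3, N):                               # equation (A3)
        rho[j] = phi1 * rho[j - 1] + phi2 * rho[j - 2]
    lags = np.arange(1, N)
    V = 1.0 + 2.0 * np.sum((1.0 - lags / N) * rho[1:])  # VIF, equation (13)
    Vp = V * np.exp(3.0 * V / N)                        # bias-adjusted VIF
    Np = N / Vp                                         # effective sample size
    tau = 2.0
    for _ in range(n_iter):                             # equation (14)
        tau = (N - tau + 1.0) ** ((2.0 / 3.0) * (1.0 - Np / N))
    return max(tau, 1.0)
```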

3.3. Modeling Overall Trend and Residuals

[48] A regression method (such as linear, polynomial, or exponential) might be employed for modeling the trend component. Here, polynomial regression is applied to the change rate of the trend component when the IMF test indicates that the trend component is significant. Notice that a p-order polynomial regression (PR(p)) on the change rate (Δc(t)) is equivalent to a (p + 1)-order polynomial on the original component (c(t)). Utilizing the change rate in trend modeling also makes it easy to condition the extension of the future evolution on the current state, since c(t) = c(t − 1) + Δc(t).
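
A minimal sketch of this trend extension, with the helper name extend_trend assumed for illustration:

```python
import numpy as np

def extend_trend(trend, n_ahead, p=1):
    """Fit PR(p) to the change rate of the trend component and extend it;
    PR(1) on the change rate corresponds to PR(2) on the component itself."""
    dc = np.diff(trend)                     # change rate of the trend component
    t = np.arange(1, len(dc) + 1)
    coef = np.polyfit(t, dc, p)             # polynomial regression on dc(t)
    t_new = np.arange(len(dc) + 1, len(dc) + n_ahead + 1)
    dc_new = np.polyval(coef, t_new)
    return trend[-1] + np.cumsum(dc_new)    # c(t) = c(t-1) + dc(t)
```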

[49] The residuals of the data, after excluding the trend and the significant oscillatory components, are treated as either random or autocorrelated noise according to their time dependence structure. A proper stochastic time series model, such as KNNR [Lall and Sharma, 1996] or ARMA [Salas et al., 1980], can be fitted to the residuals.

[50] The data sets generated separately with the three approaches (NSOR for the oscillatory components, PR(p) for the trend component, and KNNR or ARMA for the residuals) are summed to produce the generated data in the original domain.

4. Model Validation With Rössler Attractor

4.1. Rössler Attractor

[51] To test the performance of the suggested model, we select an example of a nonlinear dynamic system, the Rössler attractor [Rössler, 1976], one of the most famous chaotic attractors. The attractor is the solution of the following system of three nonlinear ordinary differential equations:

$$\frac{dx}{dt} = -y - z, \qquad \frac{dy}{dt} = x + \alpha y, \qquad \frac{dz}{dt} = \beta + z(x - \delta) \qquad (15)$$

where (x, y, z) ∈ ℜ³ are dynamical variables defining the phase space with time t, and (α, β, δ) ∈ ℜ³ are parameters. This attractor was intended to behave similarly to the Lorenz attractor [Lorenz, 1963]; however, unlike the Lorenz attractor, it lends itself more easily to a qualitative understanding of the chaotic flow [Rössler, 1976]. Huang et al. [1998] and Kijewski-Correa and Kareem [2007] illustrated that the nonlinear Rössler system is well represented with EMD and the Hilbert transform. We selected this attractor because it oscillates within a fixed range while the oscillations are chaotic.

4.2. Results

[52] For the application, the system was realized with the same parameter set, [α, β, δ] = [1/5, 1/5, 7/2], as Huang et al. [1998] and Kijewski-Correa and Kareem [2007] and with the initial state [x0, y0, z0] = [−3, 3, 1]. A three-dimensional view of the realized series is shown in Figure 4a, presenting a single spiral, unlike the two spirals of the Lorenz attractor.

Figure 4.

Experiment of the Rössler attractor with α = 1/5, β = 1/5, δ = 7/2. (a) Three-dimensional representation of the attractor without the random component. (b) Time series of the Rössler attractor with the random component, including the extension (t = 451–500) for which the corresponding records were truncated (thin solid line, observed records; thick solid line, the sum of the selected components (t = 1–450) and the mean of the 200 realized extensions (t = 451–500); thin gray dotted lines, the 200 realized extensions of only the selected components (4th + 5th)). (c) Partial view of Figure 4b (t = 400–500), except that the dotted lines represent the 200 realized extensions of the selected components plus the simulated residuals.

[53] Among the three variables of the Rössler attractor, we applied the suggested approach only to the x variable. The time series of the x variable contains only smooth and deterministic components. Therefore, we added a random component to perturb the system, as x + ɛ, where ɛ is normally distributed with zero mean and the same variance as x, i.e., ɛ ∼ Φ(0, σx²). Here Φ(μ, σ²) represents the normal distribution with mean μ and variance σ². The time series of the realized 500 observations is shown in Figure 4b with a thin solid line. The time period 400–500 is magnified in Figure 4c.
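
A sketch of this realization is given below; the integrator, its tolerances, and the unit sampling interval are our assumptions, not specifications from the original experiments.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, delta = 0.2, 0.2, 3.5            # [1/5, 1/5, 7/2], as in section 4.2

def rossler(t, s):
    """Right-hand side of equation (15)."""
    x, y, z = s
    return [-y - z, x + alpha * y, beta + z * (x - delta)]

t_eval = np.arange(0.0, 500.0, 1.0)           # assumed unit sampling interval
sol = solve_ivp(rossler, (0.0, 500.0), [-3.0, 3.0, 1.0],
                t_eval=t_eval, rtol=1e-8, atol=1e-10)

rng = np.random.default_rng(0)
x = sol.y[0]
x_noisy = x + rng.normal(0.0, x.std(), x.size)   # x + eps, eps ~ N(0, var(x))
```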

[54] To test the prediction capability of the proposed model, we truncated the last 50 values of the record. Then, 200 stochastic series of the truncated part were generated with the proposed model and compared to the realized observations of the system.

[55] The extracted IMFs of the realized Rössler system are illustrated in Figure 5. The results show that, among all IMFs, the first three IMFs of the composite signal (c1, c2, and c3) capture the high-frequency random process. The long-term oscillation from the nonlinear deterministic chaotic process is sifted into the 4th and 5th IMFs. This result is supported by the IMF significance test shown in Figure 6. The 4th and 5th components are highly significant compared to the other components. The 6th and 7th components show very low energy, which implies that they most likely represent a random noise process. Even though the significance test indicates that the long-term trend component (c8) might not be induced by a random process, the variability of this 8th IMF is negligible compared to the original time series (see Figure 5). Therefore, we model only the combination of the 4th and 5th components with NSOR and treat the others as the residuals (i.e., c1 + c2 + c3 + c6 + c7 + c8).

Figure 5.

(top) Time series of the original sequence, which is simulated from the Rössler attractor plus random noise, and (remaining panels) the components extracted with EMD (c1–c8). Note that (1) the critical long-term oscillation signals are sifted into c4 and c5 and (2) each y axis has a different scale in order to present the variability of each component appropriately.

Figure 6.

Significance test of the Rössler attractor with 95% (solid line) and 99% (dotted line) confidence limits. Each point (asterisk) below the lines indicates that the hypothesis that the corresponding IMF of the observed series is indistinguishable from that of a random noise series cannot be rejected at the given confidence level (95% and 99%, respectively). Notice that c4, c5, and c8 are significant, with c4 having the highest mean normalized energy; the first component, c1, is generally considered a random component in EMD analysis and is neglected in the test [Wu and Huang, 2004].

[56] In Figures 4b and 4c, the dotted lines during the period 451–500 show the extended series of only the selected components (Figure 4b) and of all the components including the residuals (Figure 4c). The thick solid line over this period in Figures 4b and 4c presents the mean of the 200 extended series. Even though the results indicate a slight overestimation in the first decade and a slight underestimation in the following decade compared to the original sequence (thin solid line), they globally illustrate that the proposed model predicts well the future evolution of the Rössler attractor. The 200 generated series including the residuals, shown in Figure 4c, cover the variability of the original series.

[57] For further testing, another Rössler system with the most commonly employed parameter set, [α, β, δ] = [0.1, 0.1, 14] (shown in Figure 7a) [Rössler, 1995], was realized with a smaller variability of the random component, ɛ ∼ Φ(0, σx²/4), than the previous Rössler system (σx²). The extension of the truncated last 300 values (Figure 7b) indicates that the future evolution of the nonlinear chaotic system is well reproduced, covering the variability of the system as shown in Figure 7c. One could apply a random component with higher variability (e.g., ɛ ∼ Φ(0, 2σx²)). However, this might overwhelm the signal of the Rössler system and render the NSO process vague; in this case, the significance test of Wu and Huang [2004] (section 2.2.3) might indicate that no significant signal is observed. Further discussion of this point is presented in section 6.

Figure 7.

The same as Figure 4 except with a different parameter set (α = 0.1, β = 0.1, δ = 14) and a different time period: t = 1–1700 for the observations and t = 1701–2000 for the extension.

5. Application to Global Surface Temperature Anomalies

5.1. Data Description

[58] Annual-scale GSTA data (1856–2003), expressed as deviations from the 1961–1990 yearly mean, are employed to test the ability of the suggested model; they are shown in Figure 8 with a thin solid line. The data were downloaded from the University of East Anglia's Climatic Research Unit, where the data set is referred to as TaveGLv2 [Jones et al., 2001]. These data have already been analyzed with a number of decomposition procedures, such as singular spectrum analysis [Elsner and Tsonis, 1994; Schlesinger and Ramankutty, 1994a; Elsner and Tsonis, 1996] and EMD [Wu et al., 2007; Zhen-Shan and Xian, 2007]. Note that, unlike previous studies, the main objective here is to suggest a time series model that captures the NSOs and provides the future evolution of GSTA.

Figure 8.

Time series of the GSTA data and its selected IMF components, as well as its linear fit (thin solid line, observations; thin dash-dotted line, overall trend; thin dotted line, linear fit; thick solid line, combination of the selected IMF components).

[59] For an overview of the extension and prediction ability, the observations of the last 30 years and of the last 38 years are truncated in turn and compared to the data generated from the proposed model. Finally, the next 50 years of data are generated to predict the evolution of the GSTA data into the future.

5.2. Results

[60] The overall GSTA time series is shown in Figure 8 (thin solid line for observations). An overall increase of global temperature is evident. Furthermore, a long-term NSO is observed in Figure 8. The IMF components extracted by EMD are shown in Figure 9. Because the ensemble algorithm for EMD analysis [Wu and Huang, 2009] is adopted in the current study, the number of components from EMD, n + 1, is equal to seven instead of the six obtained in previous studies of the same data [Wu et al., 2007; Zhen-Shan and Xian, 2007].

Figure 9.

IMFs of the GSTA data. Note that the y coordinates of each panel are differently scaled; otherwise, the shape of some signals might not be distinguishable.

[61] The variance of each IMF of the GSTA data is presented in Table 1. It reveals that the last, overall trend component contains most of the variability. The sum of the component variances is about 95% of the variance of the original signal, which implies that 5% of the variance is carried through the cross-correlations between the IMF components, as in equation (5). The cross-correlation matrix of the oscillatory components is presented in Table 2. It shows that the fifth component is significantly correlated with the fourth and sixth. Modeling these components separately might lead to an underestimation of the total variance; if significant, these signals should be modeled as one combined component.

Table 1. Variance of all IMF Components, the Sum, and Original Signal (x) for the GSTA Data
Table 2. Cross-Correlation of the Oscillatory IMF Components for the GSTA Data

[62] The significance test of the IMFs, employing the random noise simulation and the energy spectra in the frequency domain (see equation (7)), is shown in Figure 10. It reveals that the fourth, fifth, and seventh components are significant rather than random noise processes. The seventh component represents the overall trend, the fifth component represents the 60–75 year oscillation, and the fourth component shows some irregular changes with a long-term variation, as presented in Figure 9. The 60–75 year oscillation in this data set was suggested by Schlesinger and Ramankutty [1994a], who showed through singular spectrum analysis (SSA), after removing the overall trend with a GCM prediction, that the long-term oscillation might exist. Elsner and Tsonis [1994, 1996], however, suggested that these signals might come from an autocorrelated red noise process with a lag-1 autocorrelation function (ACF) of 0.9. However, autocorrelation is easily overestimated if a long-term trend exists, which is the case for this GSTA series [Schlesinger and Ramankutty, 1994b].

Figure 10.

The same as Figure 6 except that the IMFs of the GSTA data are employed here. Notice that significant components are c4, c5, and c7, and the trend component c7 is the most significant one.

[63] To provide a simple illustration, a random simulation was performed as x(t) = 0.01t + ɛ(t), where ɛ(t) is random noise following the standard normal distribution. The sequence is perfectly random noise except for the strong increasing trend, but it still has a very high lagged correlation (data not shown).
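
A minimal sketch of this illustration (the series length and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(1000)
x = 0.01 * t + rng.standard_normal(t.size)    # x(t) = 0.01 t + eps(t)
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]         # lag-1 ACF inflated by the trend alone
```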

[64] Likewise, the high lag-1 ACF of the GSTA data (around 0.9), shown in Figure 11a, is induced by the strong increasing trend. The detrended data (i.e., x − c7), however, have a much lower lag-1 ACF of about 0.5, as shown in Figure 11b. Furthermore, the ACFs of the detrended data (Figure 11b) show a strong oscillation with a period of around 65 years. The same autocorrelation structure is found in the ACFs of the residuals of the linear regression (Figure 11c). This is further evidence of the existence of the 60–75 year oscillation. After subtracting the significant components identified by the test shown in Figure 10, the ACFs of the residuals (i.e., x − c4 − c5 − c7) present no significant serial correlation except the first one (0.2), as shown in Figure 11d with confidence bounds of ±0.16. Therefore, we chose c4 and c5 as the oscillatory components, c7 as the trend component, and the others as the autocorrelated residuals.

Figure 11.

Autocorrelation function (ACF) of (a) the observed signal, (b) the detrended data (x − c7), (c) the residuals of the linear regression fit, and (d) the residuals of the selected components (x − c4 − c5 − c7). The thick lines represent the ACFs as a function of lag. The two horizontal lines above and below the zero line in each panel represent the confidence bounds, ±2/√N, an approximately 95% confidence interval.

[65] First, the significant trend component (c7) was modeled with a PR(1) model on the change rate, whose order was selected by the Akaike information criterion [Akaike, 1974; Bozdogan, 1987]; the PR(1) model on the change rate is equivalent to a PR(2) model on the original signal. Then, the summation of the selected 60–75 year components (c4 + c5) was modeled with the NSOR algorithm. Finally, the combination of the residual components (c1 + c2 + c3 + c6) was modeled with KNNR, since a slight autocorrelation still exists (0.2, above the confidence bound of ±0.16). The AR(1) model was also tested for modeling the residuals, resulting in no significant difference from KNNR (data not shown). In summary, the modeling of the GSTA data was carried out in three parts: (1) the change rate of the overall trend component (c7) was modeled using order-1 polynomial regression (linear regression); (2) the combination of the fourth and fifth components (c4 + c5) was modeled with the NSOR algorithm; and (3) the residuals (c1 + c2 + c3 + c6) were modeled with KNNR.

[66] In Figure 8, the time series of the summation of the selected components (thick solid line) is overlaid with the observed data (thin solid line). The long-term trend and oscillatory pattern of the observations are well represented by the selected components. The linear regression fit is also included (dotted line in Figure 8) in order to highlight the ability of the selected components to capture the variability of the observed data.

[67] To verify the model performance, the last 30 years of the historical data were truncated, and the remaining observed data (1856–1973) were fitted with the same model as before. A total of 200 sets of 30 year extensions were generated from this model. The simulation results are presented in Figure 12 with only the selected significant components (Figure 12a) and with all the components (Figure 12b). Figure 12 reveals that the generated data capture well the overall trend and the long-term oscillation within the uncertainty arising from the random noise and the NSOR process. Even when the last 30 years of data are excluded from the estimation, the overall trend and the oscillation are well predicted by the suggested model. The data point in the year 1973, however, is already in the increasing regime, which makes the prediction relatively easy. To test the model under a more demanding condition, the observed data were truncated up to the year 1965. In this case, the truncated 38 year period starts in a locally decreasing regime, providing a good test of whether the proposed model properly conveys the oscillation and trend when predicting future values. The overall trend and oscillation process are fairly well extended within the uncertainty range carried by the random component (data not shown because of the similarity with Figure 12). In conclusion, these results show that the proposed model preserves well the historical patterns of trend and NSO; therefore, it can reliably extend the future sequence of the GSTA data.

Figure 12.

Generated sequences for GSTA data for which the last 30 years are truncated (i.e., the observed period of 1856–1973). (1) Thin solid line represents the observations; (2) thick solid line shows the selected IMF components for 1856–1973 and the mean of the generated 200 realizations for 1974–2003; and (3) dotted gray lines represent the 200 realizations of (a) only the selected components and (b) all components.

[68] A simple AR model was tested in order to check whether such a traditional model is also able to predict the future evolution of the GSTA data. Since the serial correlation of the observed data has long-term persistence, as shown in Figure 11a, high-order AR models were applied. The partial autocorrelation of these data indicates that a possible AR model order could be 4 or 5. In Figure 13, the 200 generated sequences from the AR(5) and AR(10) models are illustrated (observed period 1856–1973; the last 30 years are truncated). The sequences generated from AR(5) show a decreasing trend and those from AR(10) show no trend, while the observed data present an increasing trend. We also tested much higher order AR models (e.g., AR(20) and AR(30)); the results were not much different from AR(5) and AR(10) (data not shown). Finally, the next 50 years of GSTA data were generated to show how the global temperature might vary. The extension of the trend for the next 50 years is presented in Figure 14a, which includes only the generated fourth, fifth, and seventh components. Figure 14b provides a plot of the generated data including all the generated components.

Figure 13.

Generated sequences from the simple AR model for GSTA data for which the last 30 years are truncated. (1) Solid line represents the observations; (2) gray lines represent the 200 realizations from (top) AR(5) and (bottom) AR(10); and (3) thick dotted line represents the mean of the 200 realizations.

Figure 14.

Extension of 50 years with (a) the selected components only and (b) including all the components for the GSTA data (thin solid line, observations; dash-dot line, overall trend; thick solid line, selected IMF components; gray lines, 200 realizations).

[69] Due to the global warming signal and the noise term, the long-term oscillation might not be easy to observe clearly. Nevertheless, the prediction results (Figure 14) indicate that, during the next few decades, the increase in the global temperature might be moderated by the decreasing tendency of the long-term oscillation, after which a stronger temperature increase is expected. Finally, the results seem to indicate that, even given the moderation of the temperature change derived from the natural driving force (the 60–75 year oscillation) over the next few decades, the anthropogenic effects on global warming should not be underestimated, since a higher increase is expected a few decades later.

6. Conclusions and Remarks

[70] The present study dealt with the modeling of NSO processes in climatic regimes. Realistic long-term NSOs embedded in observed data can be extracted by expressing the oscillation structures as IMFs through the EMD technique. The extracted components are modeled in three ways. If significant, the overall trend is modeled with a polynomial regression, and the oscillatory components are modeled with the proposed NSOR algorithm. The residuals are treated as autocorrelated red noise or white noise, modeled with a general time series model (KNNR or AR(1)) or a normal random process, respectively. In conclusion, the proposed NSOR model is capable of providing a useful prediction of a future sequence by employing the long-term oscillation pattern of the observed data. The validation study on a nonlinear chaotic system, the Rössler attractor, showed that the proposed model is suitable for extending the future evolution of such a nonlinear system. Furthermore, through the successful modeling of the NSO process, we were able to achieve a reasonable prediction of the future global temperature change.

[71] A few remarks are in order concerning the proposed model. First, this is a stochastic model, implying that no physical factors are included. For example, the prediction of the GSTA basically assumes that the physical forces are similar to the historical ones. The extension might deviate if the overall trend is changed by different anthropogenic driving forces. The trend might also differ even if these forcings are unchanged, because of the preceding anthropogenic forcings, until a new equilibrium of the climate system is reached, which would take millennia. The fact that the current NSOR model is able to extend the last 30 years of GSTA data well might suggest that the physics of the NSOs has not changed much yet. Second, the proposed model is useful for data sets embedding NSO or quasiperiodic processes. However, the structure of the observed data, such as the ACF, should be checked thoroughly before applying this model. If no oscillation structure can be found, a Box-Jenkins type of model (e.g., SML or ARMA) may represent a better alternative. Finally, the reliable prediction range depends highly on the persistence features governed by the climatic system and on the length of the observed data series. Based on the performance of the synthetic application and the case study, it is suggested that a horizon of more than half the cycle of the oscillation process is not appropriate, especially with the small number of cycles commonly observed in geophysical data such as the GSTA data set. Furthermore, the target IMF for NSOR modeling should have at least three zero crossings (i.e., one full cycle). Even with these limitations, the results of the present work illustrate the potential of the proposed model to provide future long-term predictions.

[72] The proposed model can eventually be applied to extended climate indices in order to predict the future evolution of hydroclimatic variables such as precipitation or streamflow. This work has been performed and is presented in a separate paper [Lee and Ouarda, 2010].

Appendix A: Second-Order Autoregressive Model

[73] For the time series of x(t) with zero mean, the second-order autoregressive (AR(2)) model is expressed as

$$x(t) = \phi_1 x(t-1) + \phi_2 x(t-2) + z(t) \qquad (A1)$$

where z(t) ∼ Φ(0, σz²), and ϕ1 and ϕ2 are the parameters related to the lag-1 and lag-2 ACFs (ρ1 and ρ2) as

$$\phi_1 = \frac{\rho_1(1 - \rho_2)}{1 - \rho_1^2}, \qquad \phi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} \qquad (A2)$$

The ACFs of this model satisfy the second-order difference equation:

$$\rho_j = \phi_1 \rho_{j-1} + \phi_2 \rho_{j-2}, \qquad j \geq 3 \qquad (A3)$$

The parameters of the AR(2) model (ϕ1 and ϕ2) in equation (A1) are obtained from equation (A2) with the historical estimates of the ACFs (i.e., ρ1 and ρ2). The other N − 3 ACFs, ρj, for equation (13), are estimated sequentially from equation (A3).
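
A minimal simulation sketch of equations (A1) and (A2), with illustrative autocorrelation values:

```python
import numpy as np

# Simulate the AR(2) model of equation (A1), with parameters obtained from
# illustrative lag-1 and lag-2 autocorrelations via equation (A2).
rng = np.random.default_rng(1)
rho1, rho2 = 0.5, 0.3                          # illustrative ACF values only
phi1 = rho1 * (1 - rho2) / (1 - rho1 ** 2)     # equation (A2)
phi2 = (rho2 - rho1 ** 2) / (1 - rho1 ** 2)
x = np.zeros(1000)
for t in range(2, x.size):                     # equation (A1), z(t) ~ N(0, 1)
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.standard_normal()
```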


[74] The authors acknowledge the funding from the National Sciences and Engineering Research Council of Canada (NSERC) and the Canada Research Chair Program. The comments on the EMD methodology and the program support by Zhaohua Wu are also acknowledged. The authors wish also to thank the Editor, Renyi Zhang, as well as three anonymous reviewers whose comments helped considerably to improve the quality of the manuscript.