[1] An influential 1996 paper presented a statistical analysis showing that the prolonged ENSO warm event of the early 1990's was inconsistent with the historical pattern of ENSO variability and therefore concluded that there had been a shift in ENSO behavior possibly connected to global warming. A fundamental problem with this earlier analysis is that the data used to test for a shift in ENSO behavior were not independent of the data used to identify the hypothetical shift. A new analysis is presented that avoids this problem by using more recent data. The results raise a question about the earlier finding.

[2] In an influential paper, Trenberth and Hoar [1996] presented a statistical analysis purportedly showing that the prolonged ENSO warm event in the early 1990's reflected a shift in ENSO behavior possibly connected to global warming. The paper elicited a spirited exchange over some important technical issues [Harrison and Larkin, 1997; Rajagopalan et al., 1997; Trenberth and Hoar, 1997; Wunsch, 1999a, 1999b; Trenberth and Hurrell, 1999a, 1999b; Rajagopalan et al., 1999]. This exchange revolved around the best way to measure anomalous ENSO behavior and to assess its significance and is well worth reading. However, a more fundamental problem appears to have been overlooked. As discussed in more detail below, this problem concerned testing for a shift in ENSO behavior using data that were not independent of those used to identify the potential shift. Since the publication of Trenberth and Hoar [1996], more than a decade of additional data have accumulated and it is possible to revisit this issue in a way that avoids this problem. The purpose of this paper is to present such an analysis. The results raise a question about the earlier finding.

[3] The remainder of this paper is organized as follows. In Section 2, the approach of Trenberth and Hoar [1996] is reviewed. The results of a new analysis are presented in Section 3. Section 4 contains some concluding remarks.

[4] According to Trenberth and Hoar [1997], henceforth TH97, the analysis of Trenberth and Hoar [1996], henceforth TH96, began with the observation that “aspects of the recent warming of the tropical Pacific from 1990 to mid 1995 were unprecedented in the observational period of the previous 113 years” (TH97, p. 3057). To assess the significance of this unprecedented behavior, TH96 turned to the time series of sea level pressure anomalies at Darwin (DSLP) over the period 1882–1995. They noted that DSLP was positive in all 22 seasons from the winter of 1989–1990 to the spring of 1995 (the test period) and that “this was an unprecedented run … of one sign since 1882, when the record began” (TH97, p. 3058). In addition, the average value of DSLP during this 22-season period was 0.94 mb.

[5] Motivated by these observations, TH96 decided to test “the null hypothesis of no change [in the behavior of ENSO] relative to the first hundred years of record for 1882 to 1981” (TH97, p. 3057). To do so, they fit a time series model to the first 100 years of the DSLP record (the baseline period) and, by simulating from the fitted model, showed that both a run of 22 or more seasons with the same sign and a run of 22 seasons (of either sign) with an anomaly greater than or equal to 0.94 mb occurred with very small probability. On the basis of these results, TH96 concluded that the 1990–1995 ENSO warm event reflected a shift in ENSO behavior.
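The simulation step described above can be sketched as follows. This is a minimal illustration only, not the TH96 code: the AR(1) stand-in model, its coefficient `phi = 0.6`, the innovation standard deviation `sigma = 0.7`, the series length of 400 seasons, and all function names are assumptions made here for the example. The sketch estimates how often a series simulated under the no-shift model contains a run of 22 or more seasons of one sign.

```python
import random

def simulate_ar1(length, phi, sigma, rng):
    """Draw a stationary, zero-mean AR(1) series (a stand-in for the fitted model)."""
    y = [rng.gauss(0.0, sigma / (1.0 - phi ** 2) ** 0.5)]
    for _ in range(length - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y

def longest_same_sign_run(values):
    """Length of the longest run of consecutive values sharing one sign."""
    best = current = 0
    previous_sign = 0
    for v in values:
        sign = (v > 0) - (v < 0)
        if sign != 0 and sign == previous_sign:
            current += 1
        else:
            current = 1 if sign != 0 else 0
        previous_sign = sign
        best = max(best, current)
    return best

def run_probability(run_length, series_length, n_sims, phi, sigma, seed=0):
    """Fraction of simulated series containing a same-sign run of at least run_length."""
    rng = random.Random(seed)
    hits = sum(
        longest_same_sign_run(simulate_ar1(series_length, phi, sigma, rng)) >= run_length
        for _ in range(n_sims)
    )
    return hits / n_sims

# Hypothetical settings: 400 seasons (100 years), AR(1) with phi = 0.6.
prob = run_probability(run_length=22, series_length=400, n_sims=500, phi=0.6, sigma=0.7)
print(f"estimated probability of a same-sign run of 22+ seasons: {prob:.3f}")
```

The estimate depends strongly on the assumed model, so the number printed here says nothing about DSLP itself; only the mechanics of the simulation carry over.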

[6] This analysis suffers from a fundamental problem. Even in the absence of a shift in ENSO, unprecedented behavior will occur from time to time. Upon observing such behavior, a test based on related data can confirm that it is unusual. However, unless the test explicitly accounts for the fact that the test period was chosen precisely because it was unusual, this does not constitute a valid test for a shift in ENSO behavior. Percival and Rothrock [2006] discussed a related issue in assessing the significance of trends in climate data. A valid procedure in a situation where the observational record is used to identify a potential shift in ENSO behavior can be based on an assessment of the predictive ability under the hypothesis of no shift using fresh data. Such a procedure is outlined in the next section.
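The selection effect described above can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not part of the original analysis: it uses a hypothetical AR(1) series with coefficient 0.6 and unit innovation variance, a record of 456 seasons, and function names invented here. In each simulated no-shift record it picks the 22-season window with the largest average and then tests that window as though it had been chosen in advance.

```python
import random
from statistics import NormalDist

def ar1_series(length, phi, rng):
    """Simulate a stationary AR(1) series with unit innovation variance."""
    y = [rng.gauss(0.0, (1.0 / (1.0 - phi ** 2)) ** 0.5)]
    for _ in range(length - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def sd_of_window_mean(m, phi):
    """Exact standard deviation of an m-term average of the AR(1) series."""
    gamma0 = 1.0 / (1.0 - phi ** 2)
    total = sum(gamma0 * phi ** abs(i - j) for i in range(m) for j in range(m))
    return total ** 0.5 / m

def largest_window_mean(y, m):
    """Largest average over all runs of m consecutive values."""
    window_sum = sum(y[:m])
    best = window_sum
    for t in range(m, len(y)):
        window_sum += y[t] - y[t - m]
        best = max(best, window_sum)
    return best / m

def naive_rejection_rate(n_series=200, length=456, m=22, phi=0.6, alpha=0.05, seed=0):
    """Share of no-shift series whose most extreme window passes a naive one-sided test."""
    rng = random.Random(seed)
    sd = sd_of_window_mean(m, phi)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    hits = 0
    for _ in range(n_series):
        y = ar1_series(length, phi, rng)
        if largest_window_mean(y, m) / sd > z_crit:
            hits += 1
    return hits / n_series

rate = naive_rejection_rate()
print(f"naive test rejects in {rate:.0%} of no-shift series (nominal 5%)")
```

Because the window is selected for being extreme, the naive test rejects far more often than its nominal level even though no shift is present in any simulated series.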

3. A New Analysis

[7] The accumulation of additional DSLP observations since 1995 provides the basis for a valid test for a shift in ENSO behavior. This paper focuses on assessing the significance of the average of the DSLP seasonal anomalies (relative to the baseline period 1882–1981) currently available that post-date the analysis of TH96. There are 42 such anomalies covering the period summer 1995 through autumn 2005. The entire record of DSLP seasonal anomalies is shown in Figure 1. A simple idea would be to apply the approach of TH96 to the more recent data. However, as a result of serial dependence in DSLP, the more recent data are not strictly independent of the test period of TH96. To avoid this problem, the assessment here is made conditionally on the 22 anomalies in the test period. In words, this analysis addresses the question: Conditional on the 22 seasonal DSLP anomalies of the test period of TH96, how exceptional was the average of the subsequent 42 anomalies in light of the variability of the baseline period 1882–1981? Following TH96, this question is addressed by fitting a time series model to the anomalies in the baseline period and using the properties of the fitted model to assess the significance of the average of the post-test period anomalies conditional on the previous 22 anomalies. The legitimacy of this approach stems from the fact that the data used to test for a shift in ENSO behavior were not selected on the basis of their exceptionality.

[8] Although TH96 assessed the significance of the average DSLP anomaly in the test period by simulation, this is not necessary. Consider the (m + n)-vector random variable Y = (Y′_{1}, Y′_{2})′ where Y_{1} consists of the m = 22 time-ordered anomalies in the test period of TH96 and Y_{2} consists of the n = 42 time-ordered anomalies from the subsequent period. Assume that Y has a multivariate normal distribution and let μ and Σ be its mean vector and variance matrix, respectively. Note that, as the elements of Y form a time series, the elements of Σ correspond to autocovariances. Partition μ as (μ′_{1}, μ′_{2})′ and Σ as:

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$

where μ_{1} and Σ_{11} are the mean vector and variance matrix of Y_{1}, μ_{2} and Σ_{22} are the mean vector and variance matrix of Y_{2}, and Σ_{12} = Σ′_{21} is the covariance matrix between Y_{1} and Y_{2}. It is a standard result that the conditional distribution of Y_{2} given Y_{1} = y_{1} is also multivariate normal with mean vector:

$$\mu_{2|1} = \mu_{2} + \Sigma_{21}\Sigma_{11}^{-1}(y_{1} - \mu_{1})$$

and variance matrix:

$$\Sigma_{2|1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$$

The average anomaly of interest is given by:

$$\bar{Y}_{2} = e'Y_{2}/n$$

where e is an n-vector of 1's. It follows that the conditional distribution of Ȳ_{2} given Y_{1} = y_{1} is (univariate) normal with mean e′μ_{2∣1}/n and variance e′Σ_{2∣1}e/n^{2}.
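As a rough sketch of how this conditional mean and variance might be computed, the code below takes the autocovariances of a zero-mean stationary series and the observed vector y_{1}. Everything specific in it is an assumption for illustration: the AR(1) autocovariances with coefficient 0.6, the made-up values in `y1`, and the function names. A small Gaussian elimination routine is included only to keep the example free of external libraries; in practice a linear algebra package would normally be used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def conditional_mean_sd_of_average(acov, y1, n):
    """Mean and SD of the average of Y2 given Y1 = y1 for a zero-mean series.

    acov[k] is the lag-k autocovariance and must cover lags 0 .. len(y1) + n - 1.
    """
    m = len(y1)
    sigma11 = [[acov[abs(i - j)] for j in range(m)] for i in range(m)]
    # w = Sigma_12 e: covariance of each element of Y1 with the sum of Y2.
    w = [sum(acov[m + k - i] for k in range(n)) for i in range(m)]
    # e' Sigma_22 e: variance of the sum of Y2.
    s22 = sum(acov[abs(i - j)] for i in range(n) for j in range(n))
    # e' mu_{2|1} = w' Sigma_11^{-1} y1, since mu_1 = mu_2 = 0.
    mean = sum(wi * xi for wi, xi in zip(w, solve(sigma11, y1))) / n
    # e' Sigma_{2|1} e = e' Sigma_22 e - w' Sigma_11^{-1} w.
    var = (s22 - sum(wi * vi for wi, vi in zip(w, solve(sigma11, w)))) / n ** 2
    return mean, var ** 0.5

# Hypothetical inputs: AR(1) autocovariances and a constant test-period vector.
phi = 0.6
m, n = 22, 42
acov = [phi ** k / (1.0 - phi ** 2) for k in range(m + n)]
y1 = [0.9] * m
cond_mean, cond_sd = conditional_mean_sd_of_average(acov, y1, n)
print(f"conditional mean {cond_mean:.4f}, conditional sd {cond_sd:.4f}")
```

For an AR(1) series the conditional mean of each later value depends only on the last observed value, which gives a convenient closed-form check on the matrix computation.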

[9] To assess the significance of Ȳ_{2}, the parameters of this model were estimated using data from the baseline period. Specifically, a time series model was fit to the baseline data, providing estimates of the autocovariances in Σ under the null hypothesis of no shift in ENSO behavior. In addition, under this hypothesis, μ_{1} = μ_{2} and further, because the data are anomalies, this common mean is 0.

[10] On the basis of the information provided in TH96, it was not possible to reproduce the details of their baseline modeling. Here, monthly values of DSLP were extracted from the database maintained by the University of East Anglia and seasonal values were formed. These seasonal values were converted into anomalies by subtracting seasonal averages formed from the baseline period. Following TH96, an ARMA(3,1) model was fit to the anomalies in the baseline period. Like TH96, the basic result of this analysis is not highly sensitive to the form of the ARMA model. Let Y_{t} be the anomaly in season t. The fitted model was:

where the estimated standard deviation of the innovation process ɛ_{t} was 0.664 mb. This is broadly similar to the model fitted in TH96. As a check on the adequacy of the fitted model, Figure 2 shows the log spectrum of this fitted model along with the log periodogram of the baseline data. The agreement is good. The theoretical autocovariances of the fitted model used to form Σ are plotted in Figure 3.
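One way the theoretical autocovariances of an ARMA model might be obtained, and Σ formed from them, is sketched below. The autoregressive and moving-average coefficients used here are hypothetical placeholders and are not the fitted values; only the innovation standard deviation of 0.664 mb comes from the text. The sign convention Y_{t} = φ_{1}Y_{t−1} + … + ɛ_{t} + θ_{1}ɛ_{t−1} + …, the truncation length, and the function name are likewise assumptions of this sketch.

```python
def arma_autocovariances(ar, ma, sigma, max_lag, n_terms=1000):
    """Autocovariances at lags 0..max_lag of a stationary ARMA model.

    Uses the truncated MA(infinity) form Y_t = sum_j psi_j eps_{t-j}, with
    gamma(k) = sigma^2 * sum_j psi_j * psi_{j+k}.
    """
    psi = [1.0]
    for j in range(1, n_terms):
        value = ma[j - 1] if j <= len(ma) else 0.0
        for i, phi in enumerate(ar, start=1):
            if j - i >= 0:
                value += phi * psi[j - i]
        psi.append(value)
    return [
        sigma ** 2 * sum(psi[j] * psi[j + k] for j in range(n_terms - k))
        for k in range(max_lag + 1)
    ]

# Placeholder ARMA(3,1) coefficients (NOT the fitted model); sigma is from the text.
ar_coefs = [0.5, 0.1, -0.1]
ma_coefs = [0.3]
size = 22 + 42  # m + n
gamma = arma_autocovariances(ar_coefs, ma_coefs, sigma=0.664, max_lag=size - 1)

# Sigma is the Toeplitz matrix whose (i, j) entry is the lag-|i - j| autocovariance.
Sigma = [[gamma[abs(i - j)] for j in range(size)] for i in range(size)]
print(f"variance of a single anomaly under the placeholder model: {gamma[0]:.3f}")
```

The truncation is adequate only when the ψ weights decay quickly, which holds for the placeholder coefficients but would need checking for any other model.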

[11] As a preliminary exercise, this fitted model was used to reproduce the result of TH96 by assessing the unconditional significance of the average anomaly Ȳ_{1} in the test period. The purpose of this exercise is to verify that results from the model fitted here are comparable to those of TH96. Under the hypothesis of no shift in ENSO behavior, this average has unconditional mean 0 and standard deviation (e′Σ_{11}e)^{1/2}/m. Based on the autocovariances in Figure 3, this standard deviation is 0.288 mb. The observed value of Ȳ_{1} is 0.92 mb (close to the value of 0.94 mb reported in TH96). Under the fitted model, the probability of observing a value this large or larger is around 0.0007. This confirms the result of TH96 that the behavior of ENSO during the test period was unusual. Of course, as the test period was selected on the basis of the unusualness of ENSO behavior, this analysis does not constitute a valid test for a shift in ENSO behavior.

[12] Turning to the main part of the analysis, the observed value of Ȳ_{2} was 0.29 mb. Under the fitted model, the conditional mean and standard deviation of Ȳ_{2} given the observed value of Y_{1} are −0.016 mb and 0.209 mb, respectively. The corresponding significance level is 0.072. By conventional standards, this is at best weakly significant and casts doubt on the previous claim of a shift in ENSO behavior.
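The two significance levels quoted in this section follow from one-sided normal tail probabilities, and the arithmetic can be checked with a few lines of code. The sketch below uses only the means, standard deviations, and observed averages stated in the text; the function and variable names are illustrative choices made here.

```python
from statistics import NormalDist

def upper_tail_probability(observed, mean, sd):
    """P(X >= observed) for X normal with the given mean and standard deviation."""
    return 1.0 - NormalDist(mean, sd).cdf(observed)

# Test period: unconditional mean 0 mb, sd 0.288 mb, observed average 0.92 mb.
p_test_period = upper_tail_probability(0.92, 0.0, 0.288)

# Post-test period: conditional mean -0.016 mb, sd 0.209 mb, observed average 0.29 mb.
p_post_period = upper_tail_probability(0.29, -0.016, 0.209)

print(f"test period: z = {0.92 / 0.288:.2f}, p = {p_test_period:.4f}")
print(f"post-test period: z = {(0.29 + 0.016) / 0.209:.2f}, p = {p_post_period:.3f}")
```

Up to rounding, these reproduce the values of about 0.0007 and 0.072 reported above.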

4. Discussion

[13] The test of TH96 for an ENSO shift was invalid because the hypothesis of a recent ENSO shift was tested using data that were not independent of those used to formulate the hypothesis. This is not a mere technicality, but a fundamental problem that can give very misleading results. A valid approach in this situation can be based on an assessment of predictive ability under the null hypothesis of no shift using fresh data. The purpose of this paper has been to present such a test. The details of the analysis were designed to follow those of TH96 to ensure a valid basis for a comparison of results. The results presented in the previous section confirm the finding of TH96 that ENSO behavior in the early 1990's was unusual, but show that the claim that this reflects a shift in ENSO behavior is at best only weakly supported by the data.

[14] The least satisfactory part of this analysis is the use of a linear time series model to represent the behavior of DSLP. Although the fitted ARMA model captures the second-order characteristics of the data well, linear models are limited in the range of dynamics that they can represent. As the dynamics underlying ENSO are likely to be nonlinear, it would be preferable to base statistical inference on a nonlinear time series model. Nonlinear time series models and their analysis are discussed by Fan and Yao [2005]. Ideally, the specification of such a model would be physically based.

Acknowledgments

[15] The helpful comments of two anonymous reviewers are acknowledged with gratitude. This work was supported by NSF grant DEB-0515639.