The magnitude of sea surface temperature variability in the NINO3.4 region of the equatorial Pacific on decadal and longer timescales is assessed in observational data, state-of-the-art (Coupled Model Intercomparison Project 5) climate model simulations, and a new ensemble of paleoclimate reconstructions. On decadal to multidecadal timescales, variability in these records is consistent with the null hypothesis that it arises from “multivariate red noise” (a multivariate Ornstein-Uhlenbeck process) generated from a linear inverse model of tropical ocean-atmosphere dynamics. On centennial and longer timescales, both a last millennium simulation performed using the Community Climate System Model 4 (CCSM4) and the paleoclimate reconstructions have variability that is significantly stronger than the null hypothesis. However, the time series of the model and the reconstruction do not agree with each other. In the model, variability primarily reflects a thermodynamic response to reconstructed solar and volcanic activity, whereas in the reconstruction, variability arises from either internal climate processes, forced responses that differ from those in CCSM4, or nonclimatic proxy processes that are not yet understood. These findings imply that the response of the tropical Pacific to future forcings may be even more uncertain than portrayed by state-of-the-art models because there are potentially important sources of century-scale variability that these models do not simulate.
1 Introduction
 Climate variations originating in the tropical Pacific disrupt global water and energy cycles [e.g., Trenberth et al., 1998], with consequences for humans and ecological systems throughout the world [e.g., Stenseth et al., 2002]. On interannual timescales, these variations reflect the influence of processes associated with El Niño/Southern Oscillation (ENSO). As an interannual phenomenon, ENSO's statistics are well observed, and its causes are fairly well understood [e.g., Trenberth and Caron, 2000; Neelin et al., 1998]. On longer timescales, however, variability in the tropical Pacific is less well-characterized, due in part to the limited duration and paucity of most observational records [e.g., Deser et al., 2004]. Investigating such low-frequency variability requires gleaning information from diverse sources including instrumentally based products, paleoclimate archives, and global climate model simulations.
 Characterizing decadal-to-centennial (“dec-cen”) climate fluctuations in the tropical Pacific is critical to understanding how the region may evolve with human-induced climate change. If dec-cen variability is substantial and forced by external influences on Earth's climate, then future changes may be correspondingly prominent and even predictable to the extent that future forcing trajectories can be known. On the other hand, if dec-cen variability is internally generated, then future changes will be strongly governed by the interplay between external forcing and internal variability.
 It is clear from paleoclimate archives that dec-cen variability in the tropical Pacific may be more prominent than instrumental records alone reveal [e.g., Cole et al., 1993; Urban et al., 2000; Cobb et al., 2003; Conroy et al., 2008; Ault et al., 2009; Tierney et al., 2010; Li et al., 2011]. It is less clear, however, how external influences and internal processes generate variability at these timescales. Climate modeling studies suggest both components are important: the duration of the seasons, solar intensity, and volcanic activity have all fluctuated in the past, and climate simulations appear sensitive to those forcings [e.g., Emile-Geay et al., 2007, 2008; Ammann et al., 2007]. On the other hand, dec-cen variability may be internally generated as a residual of energetic interannual variability [Vimont, 2005; Ault et al., 2009; Wittenberg, 2009; Newman et al., 2011b], or from ocean-atmosphere interactions that occur too slowly to be considered part of the canonical timescales explained by ENSO theory. For example, using the Zebiak-Cane (ZC) model [Zebiak and Cane, 1987], Clement and Cane  show that multidecadal and longer timescales of variability emerge in a very long (150,000 years) control run as a consequence of nonlinearities in the system. Tropical Pacific dec-cen variability also emerges without any external forcing in atmospheric general circulation models (GCMs) coupled to a simple thermodynamic “slab” ocean [Dommenget and Latif, 2008; Clement et al., 2011], and it arises in unforced fully coupled GCMs from deterministic processes [e.g., Meehl and Hu, 2006; Wittenberg, 2009].
 In this study, we evaluate the magnitude of tropical Pacific dec-cen variability in an extensive suite of data sets including instrumentally based products, climate model simulations from the Coupled Model Intercomparison Project 5 (CMIP5) archive, and a newly published ensemble of paleoclimate reconstructions [Emile-Geay et al., 2013a, 2013b] (hereafter referred to as EG13a,b). In each of these data sets, the magnitude of dec-cen variability is evaluated against the null hypothesis that it is not different from “multivariate red noise” (commonly known as a multivariate Ornstein-Uhlenbeck process). This expectation is simply an extension of the more familiar univariate red noise null hypothesis applied to a vector of time series. We generate multivariate red noise using a linear inverse model (LIM) of tropical ocean-atmosphere dynamics [Newman et al., 2011a, 2011b]. The LIM is empirically derived from late 20th century observational data and as such serves as a benchmark for the magnitude of dec-cen variability that may arise from random permutations of the space-time covariance structures that underlie modern tropical Pacific climate. To our knowledge, this is the first attempt at extending the linear, stochastically forced paradigm of ENSO variability [e.g., Penland and Sardeshmukh, 1995] to characterize the nature of fluctuations in the equatorial Pacific during the last millennium.
2 Data and Methods
 We use the NINO3.4 index, defined as the average sea surface temperature (SST) from 170°W–120°W and 5°S–5°N, to characterize ENSO variability in instrumental products, paleoclimate reconstructions, and model simulations. All NINO3.4 indices reflect 3 month boreal winter (December-January-February; DJF) means to focus on the part of the year when the interannual signal is strongest [e.g., Rasmusson and Carpenter, 1982].
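The index definition above can be sketched in a few lines. This is a minimal illustration, not the processing used for the data sets in this study: the function name `nino34_djf`, the array layout, the cosine-latitude area weighting, and the convention of labeling each winter by its January year are all assumptions, and missing values are not handled.

```python
import numpy as np

def nino34_djf(sst, lat, lon, years, months):
    """DJF-mean NINO3.4 index from monthly gridded SST (illustrative sketch).

    sst           : array (time, lat, lon), monthly SST in degrees C
    lat           : array (lat,), degrees north
    lon           : array (lon,), degrees east in [0, 360)
    years, months : integer arrays (time,), calendar year and month
    Returns (winter_years, index); each winter is labeled by its January year.
    """
    # NINO3.4 box: 5S-5N, 170W-120W (i.e., 190E-240E)
    lat_in = (lat >= -5.0) & (lat <= 5.0)
    lon_in = (lon >= 190.0) & (lon <= 240.0)
    box = sst[:, lat_in, :][:, :, lon_in]

    # Area weights proportional to cos(latitude) (an assumed choice)
    w = np.cos(np.deg2rad(lat[lat_in]))[None, :, None]
    w = np.broadcast_to(w, box.shape)
    monthly = (box * w).sum(axis=(1, 2)) / w.sum(axis=(1, 2))

    # December is assigned to the winter of the following calendar year
    winter = np.where(months == 12, years + 1, years)
    is_djf = np.isin(months, (12, 1, 2))

    out_years, out_vals = [], []
    for y in np.unique(winter[is_djf]):
        sel = is_djf & (winter == y)
        if sel.sum() == 3:  # keep only complete December-February winters
            out_years.append(y)
            out_vals.append(monthly[sel].mean())
    return np.array(out_years), np.array(out_vals)
```

Restricting the output to complete winters avoids biasing the first and last values of a record that begins or ends mid-season.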
 Instrumental SST data originate from the Kaplan et al.  data set (1856–2011), the ERSSTv3 data set [Smith et al., 2008] (1854–2011), and the HadSST2 data set (1870–2010) [Rayner et al., 2003], which was interpolated as described in EG13a to produce “HadSST2i.”
 Reconstructed NINO3.4 SST time series were estimated from a network of paleoclimate records described in EG13a,b. Briefly, these reconstructions were generated from multiple observational products using a hybrid “regularized expectation-maximization” (RegEM) “truncated total least squares” (TTLS) methodology described by Mann et al. . Although an alternative reconstruction method was also employed in EG13a (composite plus scale), its fidelity on dec-cen timescales was shown to be worse than RegEM's, and we therefore focus only on the results obtained from RegEM (see the supporting information). The paleoclimate data used for the reconstruction include tree-ring records from Asia, Indonesia, and both American continents; coral records from the Red Sea, Indian Ocean, tropical Pacific, and Caribbean Sea; ice core records from Asia and South America; sediment cores from the East Coast of Africa and the Cariaco Basin (Venezuela); and a speleothem record from the Arabian peninsula. As in EG13b, we compare this reconstruction with two recent studies [Wilson et al., 2010; Mann et al., 2009], which both employed RegEM and used some of the same underlying proxy data to reconstruct SST in the equatorial Pacific.
 Simulated SSTs are from the following sources:
 A long (1300 year) control integration of the Community Climate System Model Version 4 (CCSM4) [Gent, 2011; Deser et al., 2011]. Solar and greenhouse gas (GHG) forcings for this simulation were held constant at 1850 levels. The atmosphere was simulated on a 0.9° latitude and 1.25° longitude grid, and the ocean was simulated on a nominal 1° by 1° grid. Despite remarkable improvements in ENSO representation in CCSM4 over CCSM3 [Gent, 2011], its amplitude remains overestimated with respect to observations [Deser et al., 2011].
 A CCSM4 “last millennium” integration. For this simulation, the model was run with time-evolving external forcing components (i.e., solar and volcanic activity, land use change, and GHG increases) from 850 C.E. through 2005 C.E. These components follow the Paleoclimate Model Intercomparison Project III (PMIP3) protocols for last millennium experiments described in Schmidt et al. . Their implementation in CCSM4 is documented by Landrum et al. .
 Six 20th century CCSM4 integrations, run at the same resolution as the last millennium simulation and the preindustrial control [Meehl et al., 2012].
 An ensemble of 24 “pre-industrial” control simulations from state-of-the-art models that are currently available as part of the CMIP5 project (see Table S1). Importantly, these experiments were run without external changes to the boundary conditions. This set of integrations allows us to examine the magnitude of dec-cen variability across a wide range of models with different parameterizations, resolutions, and physics.
 Runs from six models that contributed “last millennium” simulations to the CMIP5 archive. Like the CCSM4 simulation, these runs were forced with externally varying boundary conditions, although not necessarily the same ones used in the CCSM4 simulation (see Schmidt et al.  for a review of all last millennium forcing options). As different time domains were selected by the individual modeling groups to simulate the last millennium, we only considered the 1000 to 1850 C.E. period of overlap from the individual runs.
 A 150,000 year unforced simulation of the ZC model [Zebiak and Cane, 1987; Karspeck et al., 2004], which we divided into 150 nonoverlapping 1000 year segments to create a 150-member ensemble. The ZC model is one of intermediate complexity that simulates ENSO through simplified ocean/atmosphere physics and nonlinear thermodynamic coupling between the ocean and atmosphere. Its geographic domain is restricted to 29°S–29°N and 124°E–80°W, which isolates the tropical Pacific. Including output from this model allows us to assess the potential role of nonlinear air-sea coupling in generating dec-cen variability. In this model, as in CCSM4, ENSO is too energetic [Zebiak and Cane, 1991].
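The segmentation of the long ZC run into an ensemble can be illustrated as follows. This is a sketch only; the function name `split_into_segments` is hypothetical, and the series is assumed to hold one value per year.

```python
import numpy as np

def split_into_segments(series, segment_length):
    """Cut a 1-D series into nonoverlapping, equal-length segments.

    Any trailing values that do not fill a complete segment are dropped.
    Returns an array of shape (n_segments, segment_length).
    """
    series = np.asarray(series)
    n_segments = series.size // segment_length
    trimmed = series[: n_segments * segment_length]
    return trimmed.reshape(n_segments, segment_length)
```

Applied to a 150,000-value annual series with `segment_length=1000`, this yields 150 rows that can each be treated as one ensemble member.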
 Time series of the primary NINO3.4 data sets considered here are shown in Figure 1. The figure also shows globally averaged downward solar flux annual anomalies at the top of the atmosphere from the CCSM4 last millennium simulation. These anomalies reflect the combined influences of solar and volcanic activity as well as other radiative forcing terms.
 Power spectra of the NINO3.4 time series were estimated using the multitaper method [Thomson, 1982]. We used a time-bandwidth parameter of four, which allows for the application of seven tapers to estimate robust spectra at the cost of some bandwidth resolution. Our findings are consequently most relevant to the underlying shape of the spectrum and not necessarily to any narrow-band features. Confidence limits for NINO3.4 spectral densities were estimated from the LIM simulations (see section 3) by sorting each realization's NINO3.4 spectrum at each frequency.
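A multitaper estimate of this kind can be sketched with SciPy's discrete prolate spheroidal sequences. The sketch below assumes a simple unweighted average of the eigenspectra (adaptive weighting is a common alternative) and assumes that the confidence limits are read off as percentiles of the ensemble at each frequency; the function names and the 2.5/97.5 percentile levels are illustrative choices, not values taken from the text.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(x, nw=4.0, dt=1.0):
    """Multitaper power spectral density with 2*NW - 1 tapers (sketch)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean before tapering
    n = x.size
    k = int(2 * nw) - 1                   # 7 tapers for NW = 4
    tapers = dpss(n, nw, Kmax=k)          # shape (k, n), unit-energy tapers
    eigenspectra = np.abs(np.fft.rfft(tapers * x[None, :], axis=1)) ** 2
    psd = dt * eigenspectra.mean(axis=0)  # unweighted average over tapers
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, psd

def ensemble_limits(ensemble_psd, lower=2.5, upper=97.5):
    """Percentile bounds at each frequency across ensemble members.

    ensemble_psd : array (n_members, n_freqs)
    Returns an array of shape (2, n_freqs): lower and upper bounds.
    """
    return np.percentile(ensemble_psd, [lower, upper], axis=0)
```

Averaging seven eigenspectra reduces the variance of the estimate at the cost of smoothing over a band of half-width NW/(N·dt), which is why narrow-band features are not resolved.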
3 Null Hypothesis
 We evaluate the magnitude of dec-cen variability in NINO3.4 indices against the null hypothesis that it is not different from what would be produced by multivariate red noise generated by the LIM of Newman et al. [2011a, 2011b]. This LIM captures key statistics of tropical Pacific variability from 1960 to 2000, including the shape of the NINO3.4 power spectrum on interannual to decadal timescales [Penland and Sardeshmukh, 1995; Newman et al., 2011a, 2011b].
 The LIM assumes that tropical Pacific dynamics can be approximated by a system evolving according to the following Langevin equation of a multivariate Ornstein-Uhlenbeck process:
$$\frac{d\mathbf{x}}{dt} = \mathbf{L}\mathbf{x} + \mathbf{F}_s \qquad (1)$$
where x is an ocean state vector (e.g., evolving maps of variables important to the system), L is a linear operator (a matrix), and Fs is a white noise representation of the seasonal and spatial statistics of atmospheric weather [e.g., Newman et al., 2011a]. From equation (1), it is clear that our null hypothesis has the same mathematical form as univariate red noise, except that x is a vector of time series (hence, the term multivariate red noise). We suggest that equation (1) is a better null hypothesis for the spectrum of climate variability than its univariate analog because power spectra calculated from the individual elements of x may have peaks that arise from evolving, asymptotically damped anomalies [e.g., Kleeman, 2010], yet do not reflect deterministic nonlinear or nonstationary processes.
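For reference, the contrast with the univariate case can be written out explicitly. The symbols λ (damping rate), q (noise variance), ω (angular frequency), and Q (noise covariance matrix) are introduced here for illustration only and are not notation from the original analysis.

```latex
% Univariate red noise (scalar Ornstein-Uhlenbeck process)
\frac{dx}{dt} = -\lambda x + \xi(t), \qquad
\langle \xi(t)\,\xi(t') \rangle = q\,\delta(t - t'), \qquad \lambda > 0

% Its spectrum decreases monotonically with frequency (no interior peak)
S(\omega) = \frac{q}{\lambda^{2} + \omega^{2}}

% Multivariate case: linear operator L, white noise with covariance Q
\mathbf{S}(\omega) = (i\omega\mathbf{I} - \mathbf{L})^{-1}\,\mathbf{Q}\,
                     \left[(i\omega\mathbf{I} - \mathbf{L})^{-1}\right]^{\dagger}
```

Setting L = −λ and Q = q in the matrix expression recovers the scalar spectrum. When L has complex eigenvalues, the diagonal elements of S(ω) can contain broad peaks even though the system remains linear and stochastically forced.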
 The variables chosen for the ocean state vector (x in equation (1)) are seasonal anomalies of SST, sea surface salinity (SSS), 20°C isotherm depth (a proxy for the thermocline depth), and wind stress in the tropics (25°S to 25°N). Importantly, the LIM is developed from late 20th century data using only the zero lag and one season (3 month) lag covariance matrices of x, so that any variability arising on longer timescales is solely a consequence of seasonal covariance and autocovariance structures that are present in the observations (for details on constructing a LIM, see Penland and Sardeshmukh ). The LIM was run 100 times at monthly resolution for 1000 years to generate a large ensemble of climates with statistics nearly identical to those of the late 20th century.
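The construction of a LIM from two covariance matrices can be sketched as below. This is a generic illustration of the technique, not the Newman et al. implementation: it omits the dimension reduction (e.g., projection onto leading EOFs) and the seasonally varying noise used in practice, it steps the model at the fitting lag using the equivalent discrete-time form rather than integrating the continuous equation, and all function names are hypothetical.

```python
import numpy as np
from scipy.linalg import expm, logm

def fit_lim(x, lag=1):
    """Estimate LIM operators from anomalies x of shape (time, n_state)."""
    x = x - x.mean(axis=0)
    n = x.shape[0] - lag
    c0 = x[:n].T @ x[:n] / n           # zero-lag covariance C(0)
    clag = x[lag:].T @ x[:n] / n       # lag covariance <x(t+lag) x(t)^T>
    g = clag @ np.linalg.inv(c0)       # propagator G = C(lag) C(0)^-1
    l_op = np.real(logm(g)) / lag      # linear operator L = ln(G) / lag
    noise_cov = c0 - g @ c0 @ g.T      # covariance of the lag-step residual
    return l_op, g, noise_cov

def propagator(l_op, lead):
    """Propagator exp(L * lead) for an arbitrary lead time."""
    return expm(l_op * lead)

def simulate_lim(g, noise_cov, n_steps, rng):
    """Generate multivariate red noise: x[t] = G x[t-1] + noise."""
    n_state = g.shape[0]
    chol = np.linalg.cholesky(noise_cov)
    x = np.zeros((n_steps, n_state))
    for t in range(1, n_steps):
        x[t] = g @ x[t - 1] + chol @ rng.standard_normal(n_state)
    return x
```

Because only C(0) and one lagged covariance enter the fit, any low-frequency variance in a long simulation is a consequence of those short-lag statistics, which is what makes the LIM ensemble a useful null distribution.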
4 Results
 Several qualitative features distinguish dec-cen variability in the individual records from one another (Figure 1). In the reconstruction, the 25 year running mean fluctuates on decadal to multicentury timescales. From the 12th through 17th centuries, its values are close to modern values, but during the 18th century, they are about half a degree lower on average. Some variations in the 25 year running mean of one representative member of the LIM ensemble are also evident, whereas the CCSM4 control appears to exhibit almost no variability on these timescales. The forced CCSM4 simulation exhibits several major drops in the 25 year mean during the second half of the 13th century, the late 15th century, and the early 19th century. These drops correspond well with large volcanic eruptions used to force the model (black triangles). The reconstruction, in contrast, does not appear to exhibit any consistent anomalies at the times of these eruptions.
 We compare NINO3.4 power spectra from observations, reconstructions, and models with the LIM in Figure 2. On interannual through multidecadal timescales, the LIM simulates variability that encompasses the spectra of the three instrumental data sets and of the reconstructions. As expected, it does not capture the excessively energetic interannual peak in variance simulated by CCSM4 during the 20th century (because it is a LIM based on 20th century observations, not CCSM4). At centennial and lower frequencies, spectral densities in both the reconstructions and the last millennium simulation are well above what the LIM generates (Figure 2b), and they are in remarkably good agreement with each other despite the lack of temporal agreement between the two series (Figure 1 and correlations in Table S1).
 The LIM ensemble also appears to capture much of the interannual variability simulated by the unforced ZC and CMIP5 ensembles (Figures 2c–2d), although the ZC 5 year spectral peak exceeds the upper limit (Figure 2c). Similarly, several of the control simulations exhibit interannual variability beyond the upper bounds of the LIM ensemble (Figure 2d). Despite this, on timescales longer than about 20 years, the CMIP5 ensemble is almost entirely within the range of, or slightly below, the LIM simulations, whereas some of the ZC simulations (about 27%) exhibit dec-cen variance beyond that of the LIM.
 Centennial (200 to 1000 year) timescales of variability from all data sets considered here are compared in Figure 2e. Centennial variability in the EG13a,b reconstructions ranges from well above the upper bound of the LIM for the ERSSTv3 and HadSST2i reconstructions to only nominally above it for the Kaplan product. Although the NINO3 reconstruction of Mann et al.  agrees well with the EG13a,b range, the Wilson et al.  reconstructions do not. Instead, they exhibit centennial variability that is below the lower bound of the LIM, possibly reflective of how low-frequency fluctuations were removed from the underlying data (complete spectra of these reconstructions are shown in Figure S1). As in Figure 2, about 27% of the ZC runs have variability above the upper limits of the LIM, with only one CMIP5 control simulation that is nominally above it (GFDL-CM3; Figure S2). The CMIP5 simulations of the last millennium are generally more energetic on centennial timescales than their respective control integrations, although many fall within the LIM confidence limits despite being forced by time-varying boundary conditions (Figure 2e and Figure S3).
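A comparison of this kind reduces each spectrum to its mean power in the 200 to 1000 year band and then asks how many members of a test ensemble exceed the upper limit of the null (LIM) ensemble. The sketch below assumes frequencies in cycles per year and a 97.5th percentile upper limit; both the function names and that percentile are illustrative assumptions rather than values stated in the text.

```python
import numpy as np

def band_mean_power(freqs, psd, min_period=200.0, max_period=1000.0):
    """Mean spectral density for periods between min_period and max_period.

    freqs : array (n_freqs,), cycles per year
    psd   : array (..., n_freqs); leading axes index ensemble members
    """
    in_band = (freqs >= 1.0 / max_period) & (freqs <= 1.0 / min_period)
    return psd[..., in_band].mean(axis=-1)

def fraction_above(test_power, null_power, upper=97.5):
    """Fraction of test members whose band power exceeds the null upper limit."""
    threshold = np.percentile(null_power, upper)
    return float(np.mean(np.asarray(test_power) > threshold))
```

With 1000 year segments, only a handful of Fourier frequencies fall inside this band, so the band mean is a coarse statistic.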
 In the supporting information, we present additional analyses to clarify the sensitivity of our main results to certain methodological details. For example, an alternative technique for estimating the power spectrum (the Blackman-Tukey method) yields qualitatively similar results to those shown in Figure 2 (Figure S4), as does using a LIM developed from the ERSSTv3 data set (Figure S5). Likewise, the power spectrum of the EG13a,b reconstruction during the preindustrial era (1150 to 1850 C.E.) is nearly identical to the full reconstruction (Figure S6). However, we found that the CCSM4 NINO3.4 spectra are somewhat less energetic when computed over time periods that exclude the 1850–2005 interval (Figure S5).
 Finally, there is necessarily reconstruction uncertainty, which is explored extensively in EG13a,b. Evidence presented in those studies suggests that dec-cen variability in the NINO3.4 reconstruction is likely underestimated, due in part to the “regression dilution problem,” which trades variance for bias and is inherent to all regularized regression-based methods [Frost and Thompson, 2000; Tingley et al., 2012]. Additional analyses in our supporting information illustrate that dec-cen variability is not systematically connected with any one proxy type (Figures S7 and S8) or region (Figure S9). In addition, the time evolution of the dec-cen fluctuations in the reconstruction is not the product of any one site (Figure S10). These results support the arguments made in EG13a,b that the magnitude of dec-cen variability in the reconstructions cannot be explained by known methodological, proxy, or other nonclimate factors. Thus, the fortuitous agreement in the magnitude of variability on centennial and longer timescales in the reconstruction and the CCSM4 last millennium simulation remains to be explained.
5 Summary and Discussion
 We have assessed dec-cen variability in model, instrumental, and paleoclimate representations of NINO3.4 variability using an empirical multivariate red noise model (a LIM) as a benchmark. Decadal to multidecadal variability during the past millennium may not require exogenous mechanisms but may instead result from different permutations of variability and autocorrelation that are consistent with the statistics of seasonal SST anomaly evolution during the latter half of the 20th century. This result was hinted at in Newman , Newman et al. [2011a], and Ault et al.  but has been tested more rigorously here.
 On centennial and longer timescales, reconstructed and simulated NINO3.4 variability is above what the LIM can produce. However, the origin of this variability differs in the models and reconstructions. In CCSM4, centennial variability arises primarily as a thermodynamic response to external influences, especially explosive volcanism: a simple one-dimensional stochastic climate model shows that the power spectrum of radiatively induced temperature anomalies on dec-cen timescales agrees well with the CCSM4 last millennium simulation for the period of 850 to 1850 C.E. (Figure S11). As most of the forced variability in the radiative budget of the last millennium simulation is due to volcanic aerosols (e.g., Figure 1 and also Landrum et al. ), this suggests that the majority of the low-frequency variability in the model's NINO3.4 region is likewise driven by reconstructed volcanic activity.
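One common form of a one-dimensional model of this type is a linear energy balance equation, shown here only to illustrate how a radiative forcing spectrum maps onto a temperature spectrum. The specific form, and the symbols C (effective heat capacity), λ (feedback parameter), F (radiative forcing anomaly), and ω (angular frequency), are assumptions and may differ from the model used for Figure S11.

```latex
% Assumed form: linear energy balance with heat capacity C and feedback \lambda
C\,\frac{dT}{dt} = -\lambda\,T + F(t)

% Fourier transforming gives \hat{T}(\omega) = \hat{F}(\omega) / (\lambda + i\omega C),
% so the spectrum of the forced temperature response is
S_T(\omega) = \frac{S_F(\omega)}{\lambda^{2} + \omega^{2} C^{2}}
```

At frequencies well below λ/C, the denominator approaches λ², so under these assumptions the low-frequency temperature spectrum is proportional to the forcing spectrum.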
 In contrast to the primarily volcanic origin of centennial variability in CCSM4, epochs following large eruptions (e.g., 1258, 1453, and 1816 C.E.) are not obviously discernible in the reconstruction (Figure 1). Chronological uncertainty in paleoclimate records may prevent the reconstruction from recording short-lived volcanic events (EG13b), but this effect would only explain the lack of a strong volcanic signal in the reconstruction and not the presence of substantial dec-cen variability overall. Moreover, the reconstruction is significantly anticorrelated to the 200 year solar signal, whereas CCSM4 and almost all the other CMIP5 last millennium simulations are positively correlated to it (see Table S3 and EG13b). In total, CMIP5 model and reconstruction low-pass time series mostly range from being uncorrelated to anticorrelated to each other (see Table S4). As discussed in EG13b, this finding may imply that models do not simulate dynamical responses to increased insolation that are consistent with the paleoclimate record [e.g., Clement et al., 1996].
 Although explosive volcanism and solar oscillations appear to impart centennial variability in the reconstructions and the models, we cannot rule out the possibility that these variations largely arise from internal climate fluctuations. Both the LIM and especially the ZC ensembles generate centennial-scale amplitudes that come close to the lower bound of the EG13a,b reconstructions. Hence, if the Kaplan reconstruction is most reflective of the true variance of centennial-scale fluctuations, then these timescales are only nominally above the distribution of the LIM and within the ZC spread (Figure 2e). Nonetheless, the ERSSTv3, HadSST2i, and Mann et al.  reconstructions are all above the LIM and ZC upper bounds, while unforced CMIP5 simulations are not. Consequently, if the variability is indeed largely unforced, its magnitude is well above what any state-of-the-art models produce naturally in the NINO3.4 region.
 The differences in modeled and reconstructed NINO3.4 variability during the last millennium have important implications for anticipating the role of climate change in the equatorial Pacific during the coming century. Namely, if the estimates of dec-cen variance obtained from the multiproxy reconstructions reflect climate responses to external forcings, then those forced responses are different in nature than in the CCSM4 simulation. On the other hand, if they arise from nonclimate sources of variability, then they reflect processes in the individual archives that filter climate information in ways we do not yet understand. This possibility has important implications for our study and beyond and argues that continuing efforts to develop “forward” models of individual paleoclimate archives [e.g., Anchukaitis et al., 2006; Evans et al., 2006; Truebe et al., 2010; Thompson et al., 2011] will deepen our understanding of dec-cen variability in this region and elsewhere.
 Finally, if the proxy-inferred magnitude of dec-cen variability is accurate and arises from internal sources—that is, independent of any forcing—our current generation of global climate and Earth system models, as well as the nonlinear ZC model of exclusively tropical processes, appear unable to generate variance commensurate with that seen in the reconstructions. If this is the case, then natural variability may play a fundamental role in modulating forced changes in the region in the future, and we may not yet have an adequate numerical modeling paradigm to fully represent the range of future climate states that could emerge in the tropical Pacific as a consequence of these two combined effects. Future work on the spatial structure of dec-cen variability could help disentangle the aforementioned uncertainties as forced, unforced, and nonclimate sources of variance at these timescales may differ in their geographic patterns.
 We thank Julia E. Cole and Bette Otto-Bliesner for insights and comments. Work was partly supported by an NCAR/ASP Fellowship (to T. Ault) and NOAA CVP (M. Newman). NCAR is sponsored by the National Science Foundation.
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.