We assess the magnitude of decadal to multidecadal (D2M) variability in Coupled Model Intercomparison Project 5 (CMIP5) simulations that will be used to understand, and plan for, climate change as part of the Intergovernmental Panel on Climate Change's 5th Assessment Report. Model performance on D2M timescales is evaluated using metrics designed to characterize the relative and absolute magnitude of variability at these frequencies. In observational data, we find that between 10% and 35% of the total variance occurs on D2M timescales. Regions characterized by the high end of this range include Africa, Australia, western North America, and the Amazon region of South America. In these areas D2M fluctuations are especially prominent and linked to prolonged drought. D2M fluctuations account for considerably less of the total variance (between 5% and 15%) in the CMIP5 archive of historical (1850–2005) simulations. The discrepancy between observation- and model-based estimates of D2M prominence reflects two features of the CMIP5 archive. First, interannual components of variability are generally too energetic. Second, decadal components are too weak in several key regions. Our findings imply that projections of the future lack sufficient decadal variability, presenting a limited view of prolonged drought and pluvial risk.
1. Introduction
 Droughts and floods have devastating effects on human lives and livelihoods. They can undermine agriculture, natural ecosystems, human health, energy generation, transportation, and the daily lives of residents in afflicted regions. As the world warms, these hydroclimate extremes are expected to occur more frequently than they did during the 20th century [Trenberth, 1999; Intergovernmental Panel on Climate Change, 2007; Dai, 2011]. Anticipating how hydroclimate risks will evolve during the coming decades is complicated by the superposition of forced (anthropogenic) changes on natural variability, the role of which is not yet well constrained, but may be substantial.
 Accurate depiction of decadal to multidecadal (D2M) variability is critical for anticipating the risk of drought in many regions. For instance, Meko et al. and Pelletier and Turcotte show that Monte Carlo realizations of drought indices with prominent D2M variability depict return rates of low-flow decades that are more frequent than in realizations with weak D2M fluctuations. However, the last generation of models, those comprising the Coupled Model Intercomparison Project 3 (CMIP3) archive, was unable to capture key statistics characterizing D2M precipitation fluctuations [Ault, 2011]. Specifically, CMIP3 simulations overestimated the magnitude of high-frequency fluctuations, and consequently underestimated the risk of future decadal-scale droughts. This mismatch was likely caused by the behavior of the tropical Pacific in CMIP3 models, which was dominated by interannual oscillations that were too energetic [Guilyardi et al., 2009].
 Since the last Intergovernmental Panel on Climate Change (IPCC) report was released, a new generation of coupled general circulation models (GCMs) has been developed and made publicly available as part of the Coupled Model Intercomparison Project 5 (CMIP5) effort. Advances in our understanding and representation of historical forcing components [Taylor et al., 2012], as well as model improvements in cloud representation [Jiang et al., 2012] and tropical Pacific behavior [Kug et al., 2012], make this generation of simulations an invaluable resource for assessing likely outcomes of climate change over the coming decades and centuries. It is critical, however, to evaluate the ability of these models to simulate realistic 20th century variability regionally and across a variety of timescales. This study therefore quantifies the relative and absolute regional importance of D2M hydroclimate fluctuations in observations and new state-of-the-art GCMs.
2. Data and Methods
 We use observational data from the Global Precipitation Climatology Centre's (GPCC) gridded (2.5° × 2.5°) reanalysis product (version 4) [Rudolf et al., 2005], which spans January 1901 through December 2007. Models from the CMIP5 archive were included in this analysis if data were available from both the forced 20th century (historical) and unforced pre-industrial (piControl) experiments. The protocols for these experiments are described by the Program for Climate Model Diagnosis and Intercomparison online at: http://cmip-pcmdi.llnl.gov/cmip5/. Briefly, historical experiments typically covered 1850 to 2005 (see Table S1 in Text S1 of the auxiliary material) and were forced by changes (both anthropogenic and natural) in atmospheric chemistry and aerosols during that time. The control integrations were run with fixed external boundary conditions and depict variability that is generated internally by the models.
 We calculated annual totals (Jan–Dec) of precipitation from observational data and model output, then linearly interpolated all products to a 2° by 2° grid. We then used a 3rd-order Butterworth bandpass filter to extract D2M (10–100 year) time series of precipitation from each data set. Interannual precipitation time series were estimated using a higher frequency (2–10 year) 3rd-order Butterworth filter. The Butterworth filter is well-suited to this application because its frequency response is nearly flat within the passband, and relatively steep at the high- and low-frequency cutoffs; these characteristics ensure that our representations of interannual and D2M variations are accurate and independent of each other. We calculated standard deviations and variances of the raw, interannual, and D2M precipitation data sets. Fractional variance was calculated as the ratio of interannual, or D2M, variance to raw (i.e., overall) variance. Interannual and D2M standard deviations, variances, and fractional variances were calculated from each model, then averaged together to produce an ensemble mean for each metric. When multiple runs were available from a single model, they were averaged together prior to calculating the multi-model mean.
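 The filtering and fractional-variance steps described above can be sketched for a single grid-point time series as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function names are hypothetical, the filter is applied forward and backward (zero phase, which doubles the effective order), and the 2–10 year band is implemented as a high-pass filter because its 2-year edge coincides with the Nyquist frequency of annual data.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

NYQUIST = 0.5  # cycles per year for annually sampled data


def butterworth_band(series, short_period, long_period, order=3):
    """Keep variations with periods between short_period and long_period (years)."""
    series = np.asarray(series, dtype=float)
    low = 1.0 / long_period    # low-frequency edge, cycles/yr
    high = 1.0 / short_period  # high-frequency edge, cycles/yr
    if high >= NYQUIST:
        # Assumption: an upper edge at Nyquist (2-yr period) cannot be passed
        # to a band-pass design, so a high-pass at the low edge is used.
        sos = butter(order, low, btype="highpass", fs=1.0, output="sos")
    else:
        sos = butter(order, [low, high], btype="bandpass", fs=1.0, output="sos")
    # Assumption: forward-backward filtering to avoid phase shifts.
    return sosfiltfilt(sos, series - series.mean())


def fractional_variance(series):
    """Ratio of interannual (2-10 yr) and D2M (10-100 yr) variance to raw variance."""
    raw_var = np.var(series)
    interannual = butterworth_band(series, 2.0, 10.0)
    d2m = butterworth_band(series, 10.0, 100.0)
    return {
        "interannual": np.var(interannual) / raw_var,
        "d2m": np.var(d2m) / raw_var,
    }


def multi_model_mean(metric_by_model):
    """Average runs within each model first, then average across models.

    metric_by_model maps a model name to a list of per-run metric values
    (scalars or equally shaped arrays such as maps).
    """
    per_model = [np.mean(runs, axis=0) for runs in metric_by_model.values()]
    return np.mean(per_model, axis=0)
```

 Averaging runs within a model before forming the multi-model mean keeps a model with many ensemble members from dominating the result.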
3. Results
 Variability in instrumentally based precipitation data is greatest in the tropics and subtropics (Figure 1a). In most of these regions, the overall standard deviation of precipitation is between 100 and 200 mm/yr, but ranges as high as 400 mm/yr in Indonesia and the Amazon region of South America. On interannual timescales, precipitation standard deviations are of a comparable magnitude (Figure 1b), whereas on D2M timescales they are lower (Figure 1c).
 The relative importance of interannual and D2M components of observed precipitation can be seen in Figures 1e and 1f. Interannual precipitation fluctuations generally account for between 60% and 80% of the overall variance (Figure 1e), and therefore D2M variations account for less (10% to 35%) (Figure 1f).
 As with observations, CMIP5 precipitation variability is greatest in the tropical and subtropical regions. Over land (where observations are available), standard deviations of CMIP5 precipitation are in good agreement with those calculated from the GPCC data set (Figure 2a). Tropical standard deviations over oceans are even greater, with maximum values in the western Pacific between 500 and 550 mm/yr. The geographic pattern of interannual standard deviations is quite similar (Figure 2b) to the overall pattern in Figure 2a, while the decadal pattern is similar in appearance but weaker in magnitude (Figure 2c).
 From Figures 2e and 2f, it is clear that more precipitation variance occurs on interannual timescales in the CMIP5 archive than in observations. Over continental and marine regions alike, these timescales contribute at least 70% of the total variance, and in the equatorial Pacific they account for as much as 85%. In contrast, only about 10% to 20% of the overall variance occurs on D2M timescales. Results from the CMIP5 control integrations (auxiliary material) and for the individual models are comparable to the 20th-century results.
 Similar findings to those shown in Figures 1 and 2 were obtained using different observational data sets, time domains, and definitions of D2M variability (auxiliary material). In particular, we found that both the PREC/L [Chen et al., 2002] and CRU-TS2.1 [Mitchell and Jones, 2005] precipitation products exhibited very similar spatial patterns to those in Figure 1, implying that the maps in this figure do not reflect methodological choices used to develop the GPCC data set. We also found that using only the 1901–1950 and 1950–2000 portions of the GPCC product yielded standard deviations and variances that were similar to those calculated from the entire 1901–2007 time domain. Our definition of D2M variability encompasses any variations on 10–100 year timescales, but maps of D2M variability on 10–50 year timescales are very similar to those in Figure 1. Finally, estimating variances by integrating spectral densities (e.g., calculating the area under the power spectrum on 10–100 year timescales) did not change our findings appreciably. Taken together, these tests of our methodology confirm that D2M variability in observations is more prominent than in CMIP5 simulations regardless of the observational data set used, pre-processing choices, or the definition we used for D2M variability.
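 The spectral check mentioned above, estimating band variance as the area under the power spectrum, can be illustrated with a short sketch. The choice of a raw periodogram and the function name `band_variance` are assumptions made for illustration; the text does not specify which spectral estimator was used.

```python
import numpy as np
from scipy.signal import periodogram


def band_variance(series, short_period=10.0, long_period=100.0):
    """Variance contributed by periods between short_period and long_period (years)."""
    # One-sided power spectral density of an annual series (fs = 1 sample/yr),
    # with the mean removed so the zero-frequency bin carries no power.
    freqs, psd = periodogram(series, fs=1.0, detrend="constant", scaling="density")
    in_band = (freqs >= 1.0 / long_period) & (freqs <= 1.0 / short_period)
    df = freqs[1] - freqs[0]  # uniform frequency spacing
    # Area under the spectrum inside the band (rectangle rule).
    return np.sum(psd[in_band]) * df
```

 Summed over all frequencies, this area equals the variance of the series, so dividing `band_variance(x)` by `np.var(x)` gives a fractional variance that can be compared with the filter-based estimate.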
4. Discussion and Implications
 Our results suggest that CMIP5 simulations of the historical era (1850 to 2005) underestimate the importance of D2M variability in several regions where such behavior is prominent and linked to drought. These areas include northern Africa [e.g., Giannini et al., 2008], Australia [Cai et al., 2009; Leblanc et al., 2012], western North America [Seager, 2007; Overpeck and Udall, 2010], and the Amazon [Marengo et al., 2011]. Although it is possible that the observational network gives too much emphasis to D2M variability, evidence from several recent paleoclimate studies suggests that this is not the case. For example, Shanahan et al. and White show that substantial low-frequency fluctuations occurred during the last millennium in western Africa and the Amazonas region, respectively. Reconstructions of hydroclimatic variables across the western US also demonstrate that decadal to centennial scale variability in pre-instrumental epochs is well outside of the 20th century range [e.g., Woodhouse and Overpeck, 1998; Cook et al., 2004], meaning that the 20th century estimates of D2M prominence may in fact be conservative.
 Do the disagreements between CMIP5 simulations and observations reflect simulated interannual fluctuations that are too strong, D2M variations that are too weak, or both? Figure 3 shows the ratio of interannual to D2M variances from observed (Figure 3a) and simulated (Figure 3b) data. Regions where the observed ratio falls between 2:1 and 1:1 are characterized by substantial low-frequency hydroclimate variability and include most of North Africa, Australia, western North America, and northern South America. The interannual to D2M variance ratios in CMIP5 models, by contrast, are generally much higher (5:1 to 10:1) and more uniform over land. The ratios of D2M observational variances to D2M model variances are shown in Figure 3c. This map highlights three regions where the absolute (not just the relative) magnitude of precipitation variability on D2M timescales is greater in observations than in models: the Mediterranean region of North Africa, the Himalayan plateau, and the Amazon region.
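 The variance ratio mapped in Figures 3a and 3b can be related to the fractional variances of Figures 1 and 2, because the raw variance cancels. The notation below is introduced here for illustration only, and the numerical values are examples picked from within the ranges quoted in the text rather than results for any particular grid cell.

```latex
\[
R \;=\; \frac{\sigma^2_{\mathrm{int}}}{\sigma^2_{\mathrm{D2M}}}
  \;=\; \frac{\sigma^2_{\mathrm{int}}/\sigma^2_{\mathrm{raw}}}
             {\sigma^2_{\mathrm{D2M}}/\sigma^2_{\mathrm{raw}}}
  \;=\; \frac{f_{\mathrm{int}}}{f_{\mathrm{D2M}}}
\]
% Illustrative values only:
%   observations: f_int = 0.60, f_D2M = 0.30  =>  R = 2  (2:1)
%   models:       f_int = 0.80, f_D2M = 0.10  =>  R = 8  (8:1)
```

 A high ratio can therefore arise from a large interannual fraction, a small D2M fraction, or both, which is why the absolute D2M comparison in Figure 3c is needed to separate the two cases.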
Figure 3 suggests that observational and model differences in precipitation variability reflect two different statistical features. For Africa, Australia, much of South America, and western North America, the disagreement between Figures 3a and 3b can be explained by interannual variances that are stronger in the models than in the observations. This explanation is supported by Figures 1e and 2e, which show that interannual fluctuations comprise only about 60% of the overall variance in the observations, but closer to 80% or 85% in the models. In these regions, simulated interannual variability “swamps” the lower frequency variations in the model, but not in observations. On the other hand, northern Africa, the Himalayan plateau and the Amazon region experience decadal fluctuations that are much stronger in nature than what is simulated by the CMIP5 models. In these regions, the models lack the strength of the observed low-frequency variance. Models underestimate the importance of decadal variability in both cases, but for different reasons.
 As with the CMIP3 archive, much of the disagreement between observations and models likely arises from tropical Pacific sources of variability that are excessively energetic in GCM simulations [e.g., Guilyardi et al., 2009]. This idea is supported by Figure 2, which clearly shows that the region of the tropical Pacific that is most strongly influenced by El Niño and La Niña is also the region that is most dominated by strong interannual precipitation variability in the CMIP5 archive.
 Whether D2M precipitation variability in observations is generated by external forcings or internal sources remains unclear. If it is a forced response, then it is driven either by a forcing agent that is missing in the current CMIP5 experimental protocols or by an agent that provokes a response that is different at D2M timescales in models than in nature. The latter possibility is suggested by Booth et al., who indicate that aerosol forcing of sea surface temperatures in the Atlantic may be responsible for producing decadal drought in northern Africa. On the other hand, if it arises from internal sources of variability, then models appear unable to generate internal D2M amplitudes that are comparable to observations, although some are more successful than others (see Figures S7 and S8 in Text S1). In any case, the mismatch between 20th century observations and simulations suggests that model projections of the future may not fully represent all sources of D2M variations.
 Our findings have important implications for evaluating decadal hindcasts, understanding decadal predictions, and most especially for assessing drought risk on D2M timescales. If observed estimates of decadal variance are accurate, then the current generation of models depicts D2M precipitation fluctuations that are too weak, implying that model hindcasts and predictions may be unable to capture the full magnitude of realizable D2M fluctuations in hydroclimate. Consequently, the risk of prolonged droughts and pluvials in the future may be greater than portrayed by these models. Weak decadal variability in the models is particularly striking in semiarid regions, where water resources are already limited, and in the Amazon region, where hydroclimate is strongly connected to the carbon balance and to the vitality of the world's largest rainforest. Understanding the processes that generate D2M precipitation fluctuations in these regions may facilitate future model improvements, while combining observational estimates of D2M variability with model projections through novel statistical methods should lead to a more comprehensive view of future risk.
 We thank Jonathan Overpeck for insights and comments. Work was partly supported by an NCAR/ASP Fellowship (to T. Ault), NOAA CCDD funding (Cole), and NSF-CGD grant 1127331 (Cole and Ault). St. George was supported through a fellowship from the University of Minnesota's Institute on the Environment. NCAR is sponsored by the National Science Foundation.
 The Editor thanks the two anonymous reviewers for their assistance in evaluating this paper.