Comment on “The added value to global model projections of climate change by dynamical downscaling: A case study over the continental U.S. using the GISS-ModelE2 and WRF models” by Racherla et al.


1 Main Comment

Racherla et al. [2012, hereinafter RSF12] analyzed two decades, 1968–1978 and 1995–2005, comparing time slices of global and regional climate model simulations with analyses of observations and, in particular, verifying the skill of the models at capturing the changes in seasonal mean surface air temperature and precipitation between the two decades over 11 regions of the U.S. The global simulation is from the Goddard Institute for Space Studies (GISS)-ModelE2 coupled atmosphere-ocean global climate model (AOGCM), integrated on a 2° by 2.5° grid and incorporating anthropogenic and natural forcings with a detailed representation of gas phase, sulfate, black carbon, nitrate, and secondary organic aerosol chemistry. Dynamical downscaling was achieved with a version of the Weather Research and Forecasting (WRF) atmospheric regional climate model (RCM), integrated over a North American domain on a 45 km mesh with 216 by 126 cells in the west-east and south-north directions, respectively. WRF was driven by atmospheric fields, sea surface temperature (SST), and sea ice concentrations (SIC) from the GISS-ModelE2 simulation. The WRF model included neither anthropogenic aerosol forcings nor land use changes. RSF12 found very modest skill at reproducing the observed trend of temperature and precipitation over the past 37 years for their single AOGCM simulation, and very little improvement, if any, from dynamical downscaling with the higher-resolution RCM.

Does this constitute a failure for the global climate model or, as purported by Kerr [2013], a failure for dynamical downscaling? We do not think so; rather, the outcome reflects the use of an inadequate experimental protocol. The results of the experiment as designed were strongly influenced by the presence of internal variability and sampling errors, which masked the rather small climate changes that may have occurred as a consequence of changes in forcing during the period considered.

Statistical theory informs us that any average calculated from a limited sample is affected by sampling error. If we denote by $\sigma^2$ the variance of the seasonal mean associated with interannual variability (IAV), then an N year mean seasonal climatology has an associated uncertainty (average square error) equal to $s^2 = \sigma^2/N$, assuming independence of seasonal means across years. The difference between two N year means would have an associated uncertainty equal to $S^2 = 2\sigma^2/N$, assuming constant IAV across years. In RSF12, the change between two 10 year means is calculated over a 37 year period (1967–2005), with an expected square error equal to $S^2 = 2\sigma^2/10$. We note that the corresponding 37 year mean has an expected square error equal to $s^2 = \sigma^2/37$; clearly, the change statistic is much more prone to sampling error than the average statistic, as $S^2 = 7.4\,s^2$.
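The arithmetic above can be checked with a short numerical sketch. This is only an illustration under the same assumptions as the text (independent seasonal means with constant variance); the function names, the Gaussian noise model, the unit variance, and the number of Monte Carlo trials are hypothetical choices made for the example, not part of RSF12 or of this comment.

```python
import random
import statistics


def mean_variance(sigma2: float, n_years: int) -> float:
    """Expected square error of an n-year mean of independent seasonal means."""
    return sigma2 / n_years


def difference_variance(sigma2: float, n_years: int) -> float:
    """Expected square error of the difference between two independent n-year means."""
    return 2.0 * sigma2 / n_years


def monte_carlo_difference_variance(sigma: float, n_years: int,
                                    n_trials: int, seed: int = 0) -> float:
    """Estimate the same quantity by sampling, with no forced change at all.

    Assumption: seasonal means are independent Gaussian draws with zero mean.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        first = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n_years))
        second = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n_years))
        diffs.append(second - first)
    # Mean square of the differences about the true (zero) change.
    return statistics.pvariance(diffs, mu=0.0)


# Ratio of the two expected square errors for sigma^2 = 1 (about 7.4).
ratio = difference_variance(1.0, 10) / mean_variance(1.0, 37)
print(f"S^2 / s^2 = {ratio:.1f}")

# Sampled estimate of S^2 for two 10-year means; theory gives 2/10 = 0.2.
estimate = monte_carlo_difference_variance(1.0, 10, 20000)
print(f"Monte Carlo S^2 = {estimate:.3f}")
```

Even with no change in forcing, the sampled difference between two 10 year means scatters with a variance close to $2\sigma^2/10$, which is the noise level against which any simulated or observed change has to be judged.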

The larger sampling error, combined with the fact that the changes are small during the historical period considered, makes the use of simulated trends to assess climate models difficult to apply despite its formal appeal. We note that the situation may be quite different at some point in the future, say in 2050, when climate changes could be calculated between two 30 year periods such as 2020–2050 and 1970–2000. Assuming that anthropogenic emissions of greenhouse gases continue over the next decades, and based on current understanding of the greenhouse effect, we expect that the stronger climate trend and the longer averaging period would combine to increase the signal-to-noise ratio, assuming of course that models can adequately reproduce the observed features of the real climate, in terms of their time mean and variability, and changes thereof.

de Elía et al. [2013] estimated the time over which one can expect that the climate change would exceed natural variability, and they introduced the concept of “Expected number of Years before Emergence” (EYE). Their results indicate, for example, EYE values ranging from 20 to more than 50 years for wintertime temperature for various regions of North America. Using a related concept of “Time of Emergence” (TOE) for projections over Europe, Maraun [2013] found that across wide areas, the local trends for heavy summer precipitation emerge only late in the 21st century or later. Given that these values are based on model-simulated internal variability, which as noted by Lovejoy [2013] tends to be underestimated by models, especially at low frequencies, the emergence of a signal in the real world may take even longer. Using a large ensemble of CCSM3 projections for this century, Deser et al. [2012, Figure 2] have shown that many members were required to detect a significant response over a 23 year period from now over the U.S.: generally, more than three members were needed for temperature and more than 12 for precipitation.

Should climate models be expected to capture the changes in surface air temperature and precipitation between two historical decades? The answer is that the processes responsible for the trend need to be represented in the model and, in the case of a nested model, in the driving boundary conditions as well. Climate models could be expected to capture past and future climate changes in two ways: (1) from the memory of the specified initial conditions (IC) of the atmosphere, land, and oceans and (2) from the prescribed evolution of boundary conditions (BC) forcing. Current understanding of Earth system predictability indicates IC memory time scales of a few weeks for the atmosphere, and a few months to a few years for land surface conditions; the ocean, on the other hand, exhibits variability across a wide range of time scales because of its multiple heat reservoirs and modes of variability. BC forcings include contributions that are nearly constant in time (e.g., orography, land-sea mask, and astronomical parameters) and others that vary in time. The latter may be further subdivided into natural (e.g., volcanoes and sun cycles) and anthropogenic (e.g., greenhouse gases and aerosols (GHGA) resulting from fossil fuel burning and land use changes) forcings. In the context of atmosphere-only regional models, BC forcings would also include the prescribed distribution and evolution of SST and SIC, as well as atmospheric lateral BC. The fluctuations of the climate system can be decomposed into free and forced components, referred to as natural variability and climate changes, respectively.

Numerical weather prediction (NWP) operates on the basis of the time evolution from the specified IC, while climate change projections have traditionally been approached from the perspective of changes in BC forcings affecting the forced component of the climate system. Recent changes in paradigm are taking place, however, emphasizing the role of IC for near-term climate predictions, e.g., Giorgi [2005] stated that “....because of the long time scales involved in ocean, cryosphere and biosphere processes a first kind predictability component also arises. The slower components of the climate system (e.g. the ocean and biosphere) affect the statistics of climate variables (e.g. precipitation) and since they may feel the influence of their initial state at multi decadal time scales, it is possible that climate changes also depend on the initial state of the climate system.” While decadal prediction represents a colossal challenge [e.g., Solomon et al., 2011], there are, however, clear indications from the climate forecast systems participating in the fifth Coupled Model Intercomparison Project (CMIP5) of skill in predicting regional-scale temperature anomalies over the past 50 years; most of the skill results from changes in atmospheric composition, but also partly from the initialization of the predictions [Doblas-Reyes et al., 2013].

Given that an AOGCM is used by RSF12, the only contact the AOGCM has with real-world chronology is through specified anthropogenic and natural BC forcings from GHGA, a rather weak constraint compared to that of prescribed SST and SIC anomalies, as would be the case for an atmosphere-only GCM in Atmospheric Model Intercomparison Project-type experiments [e.g., Gates, 1992]. Hence, it is important to recall that natural variability in the AOGCM simulation is not going to synchronize with reality except by chance.

The relatively small amplitude of recent past climate changes makes the detection of observed changes and their causal attribution to changes in forcing agents most challenging, particularly at regional scale, because natural variability, both in observed and modeled changes, blurs the climate trends resulting from changes in climate forcings [e.g., Hegerl and Zwiers, 2011]. The Intergovernmental Panel on Climate Change Fourth Assessment Report (IPCC AR4) [2007, Figure SPM.4] presented the observed surface temperature anomaly for the period 1906 to 2005 and “plume” diagrams of climate model simulations using natural-only and both natural and anthropogenic historical forcings. The width of the plumes represents the 5 to 95% range of simulation results and accounts for the combined effects of different forcings being used, the different response of participating models to specified forcings, and natural variability in model simulations. Over all continents, the observation line falls within the model simulation plumes, which is interpreted favorably in terms of the skill of the ensemble of global model simulations to reproduce observed changes. As stated in IPCC AR4 [2007, chapter 9], “When human factors are included, the models also simulate a geographic pattern of temperature change around the globe similar to that which has occurred in recent decades.” Note that while ensemble averaging of simulated results can be used to filter out models' natural variability, there is no equivalent way of overcoming natural variability in observations, save by time averaging, which is only applicable under constant climate forcing conditions. Hence, acknowledging the presence of natural variability is clearly required for an adequate verification of model simulations against observations.

In recent years, the assessment of climate model performance has gradually shifted from comparing time mean variables to comparing their time variations. IPCC AR4 [2007, chapter 8] stated that “… developments in AOGCM formulation have improved the representation of large-scale variability over a wide range of time scales. The models capture the dominant extratropical patterns of variability… The atmosphere-ocean coupled climate system shows various modes of variability that range widely from intra-seasonal to inter-decadal time scales.” Lovejoy [2013], however, noted that “Analysis of several simulations of the past millennium shows that their low-frequency variability using “reconstructed forcings” is somewhat too small compared to the observed variability.”

Although not widespread, the use of recent past climate changes to assess climate model performance, as promoted by RSF12, has already been adopted in a few studies. For example, Pierce et al. [2009] analyzed CMIP3 ensemble simulations from 21 AOGCMs for the 1960–1999 period during which the observed trend in the western U.S. is +0.10°C/decade. They found that “because of the importance of natural variability in a limited domain, it is not uncommon for models with a strongly positive ensemble-averaged trend to have individual realizations with a negative trend. A single model realization does not provide a reliable estimate of the warming signal.” They emphasized the importance of ensembles of simulations: “…enough realizations must be chosen to account for the (strong) effects of the models’ natural internal climate variability. In our test case, 14 realizations were found to be sufficient…” Räisänen [2007] analyzed the skill of 21 AOGCMs participating in IPCC AR4 at reproducing the observed trend from 1955 to 2005. A 50 year period was used to reduce the effect of internal variability in observations and a 21 model ensemble to minimize the effect of internal variability in model results. A spatial correlation of 0.48 for temperature and 0.23 for precipitation was obtained between the multimodel mean trend and the observations. He noted that climate changes in individual model simulations were more strongly affected by internal variability and were less similar to the observed changes than the multimodel means. Using an ensemble of simulations of the Hadley Centre's most recent AOGCM with improved treatment of volcanoes and mineral and anthropogenic aerosol processes including their direct and indirect effects, Booth et al. [2012] showed that the model exhibited a rather remarkable success in reproducing the observed historical trends in the North Atlantic SSTs. This work instills some optimism for decadal prediction when suitable forcings are incorporated in models.

To what extent the imperfect reproduction of the observed trends as obtained by RSF12 is due to model structural errors is moot. An experiment could be designed to separate structural errors from natural variability effects, using a “perfect prognosis” approach similar to the “identical twins” experiments reported by Lorenz [1982] for NWP. The experiment would consist of making a reference simulation from an AOGCM that would henceforth be considered as the truth and used for verification of an ensemble of simulations performed with the same model but initialized from slightly different IC, discarding a sufficiently long period to decrease the influence of IC on ensuing simulations. Because the member simulations would be compared to a simulation of the same model, they would not be influenced by the model's structural errors, and any failure of the members at reproducing the trend of the reference run could be unambiguously attributed to internal variability and sampling effects.
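The logic of such a perfect-prognosis experiment can be illustrated with a toy sketch. It is not an AOGCM: each "simulation" here is simply a linear forced trend plus independent Gaussian interannual noise, and every member shares the same trend, so the model is structurally perfect by construction. The function names, the 38 year length, the noise level of 0.5°C, and the use of random seeds to stand in for perturbed IC are all hypothetical choices made for illustration; the trend of 0.01°C/year corresponds to the +0.10°C/decade quoted above from Pierce et al. [2009].

```python
import random
import statistics


def simulate_member(n_years: int, trend_per_year: float,
                    sigma: float, seed: int) -> list[float]:
    """Toy simulation: common forced trend plus member-specific internal variability."""
    rng = random.Random(seed)  # the seed plays the role of perturbed IC
    return [trend_per_year * year + rng.gauss(0.0, sigma) for year in range(n_years)]


def decadal_change(series: list[float], window: int = 10) -> float:
    """Mean of the last `window` years minus mean of the first `window` years."""
    return statistics.fmean(series[-window:]) - statistics.fmean(series[:window])


def twin_experiment(n_members: int, n_years: int = 38,
                    trend_per_year: float = 0.01, sigma: float = 0.5,
                    window: int = 10):
    """Compare each member's change to that of a reference run of the same model."""
    reference = simulate_member(n_years, trend_per_year, sigma, seed=0)
    ref_change = decadal_change(reference, window)
    member_changes = [
        decadal_change(simulate_member(n_years, trend_per_year, sigma, seed=m), window)
        for m in range(1, n_members + 1)
    ]
    # No structural error exists here, so these errors are pure sampling noise.
    errors = [change - ref_change for change in member_changes]
    return ref_change, member_changes, errors


ref_change, member_changes, errors = twin_experiment(n_members=200)
wrong_sign = sum(1 for change in member_changes if change < 0.0)
print(f"reference change: {ref_change:+.2f}")
print(f"ensemble mean change: {statistics.fmean(member_changes):+.2f}")
print(f"spread (std) of member changes: {statistics.stdev(member_changes):.2f}")
print(f"members with a negative change: {wrong_sign} of {len(member_changes)}")
```

In this toy setting the forced change between the two decades is 0.01 × 28 = 0.28°C, while the expected spread of single-member changes is $\sqrt{2\sigma^2/10} \approx 0.22$°C, so individual members can miss the reference change by a wide margin even though the model is perfect.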

2 Dynamical Downscaling

The issue of the added value of RCMs afforded by additional processes permitted in the fine-scale RCM but unresolved by coarse-mesh global models is still one of the many topics discussed in the RCM community, as are questions such as whether RCMs should or should not change the resolved scales of driving AOGCMs; see, for example, Lund [2009] for the state of affairs a few years ago and IPCC AR5 [2013, section 9.6.4] for a recent update. It is of course impossible now to know whether there exists a relation between RCMs' skill for the present-day climate simulation and their skill for future climate change projections.

Recent past climate changes have already been used to some extent to assess RCM performance. The study by Lorenz and Jacob [2010] comparing simulations of 15 RCMs driven by reanalyses revealed that they could reproduce with some success the linear trend of annual mean temperature in the period 1960–2000. Bukovsky [2012] studied the temperature trend simulated by six RCMs driven by reanalyses over the past 24 years as part of the North American Regional Climate Change Program (NARCCAP) [Mearns et al., 2009]. Her study showed that the RCMs succeeded in reproducing some observed temperature trends in winter and spring seasons. In their analysis of an ensemble of regional climate change projections performed within NARCCAP, de Elía et al. [2013] found that AOGCM-driven RCM simulations often show important departures of interannual anomalies from observed ones, while the climate change signal over longer periods agrees with expected trends due to GHGA. The effect of interannual variability in trend estimation was shown to contribute to the existence of transitory cooling trends over a few decades, embedded within the expected long-term warming trends. Their study emphasized that the choice of spatial and temporal scales affects the capacity to discriminate climate changes from interannual variability. It must be remembered, however, that studies using observed lateral boundary conditions to drive RCMs only provide an upper bound on model skill and hence only serve to define some ideal potential predictability limit.

As mentioned by RSF12, dynamical downscaling with high-resolution nested regional climate models (RCMs) has been shown to contribute realistic details by representing fine-scale surface forcings and resolving some mesoscale processes. Previous studies have shown RCMs to improve not so much the mean climate as the frequency distribution and representation of extremes for important climatic variables such as precipitation [e.g., IPCC AR4, 2007]. Given that the additional forcings resolved by RCMs are not (much) evolving in time, should dynamical downscaling be expected to improve upon the skill of a global model at simulating historical climate changes (as opposed to mean climate) in areal and seasonal mean temperature and precipitation? One may expect improvements only if there are improved representations of processes and/or additional regional forcings. When time-varying regional forcings are not included in the regional model, this lack prevents the RCM simulations from realizing their full potential regional-scale performance, even if they were driven by suitable large-scale fields.

3 Concluding Remarks

We find that the methodological limitations of the numerical experiment conducted by RSF12 prevent truly addressing the stated aim of the study: whether an RCM can add value to global model projections of climate changes. To adequately study the skill of any climate model (whether global or regional) at capturing climate trends would require performing an ensemble of simulations to separate signal from noise and accounting for all known regional time-varying forcings that may have been active during the studied period.

The statement by RSF12 that “virtually almost all AOGCM-RCM climate change studies continue with the experimental setup of downscaling one historical decade and one future decade, and then differencing the two to obtain high-resolution information,” incorrectly reflects the current practice in regional climate modeling. The current strategy for modeling anticipated climate changes resulting from human activities consists of making long, multidecadal simulations, with many realizations of a model in an ensemble mode and, when possible, with many models. See Laprise [2008] for a review of coordinated RCM experiments dating back a few years and Kjellström and Giorgi [2010] introducing the special issue of Climate Research on the European ENSEMBLES project. Recently, the World Climate Research Program sponsored the Coordinated Regional Climate Downscaling Experiment project (CORDEX) [Jones et al., 2011], recommending RCM simulations spanning the period 1950–2100. The purpose of long and ensemble simulations is to maximize the signal-to-noise ratio, since the “noise” in individual simulations tends to cancel out in an ensemble.
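How averaging length and ensemble size raise the signal-to-noise ratio can be sketched numerically. The sketch assumes that members are statistically independent, so that the error variance $2\sigma^2/N$ of a change between two N year means is divided by the number of members M; this extension to M members, the function names, and the numerical values (a 0.28°C forced change, i.e., 0.01°C/year over 28 years, and $\sigma$ = 0.5°C) are illustrative assumptions, not results from RSF12 or from the studies cited here. Note that only the noise of the simulated ensemble mean is reduced this way; the single observed realization keeps its own internal variability.

```python
import math


def signal_to_noise(signal: float, sigma: float,
                    n_years: int, n_members: int) -> float:
    """Signal-to-noise ratio of an ensemble-mean change between two n-year means.

    Assumption: independent years and independent members, so the noise
    variance is 2 * sigma**2 / (n_years * n_members).
    """
    noise = math.sqrt(2.0 * sigma ** 2 / (n_years * n_members))
    return signal / noise


def members_needed(signal: float, sigma: float,
                   n_years: int, target_snr: float) -> int:
    """Smallest ensemble size whose mean change reaches the target ratio."""
    required = 2.0 * (target_snr * sigma / signal) ** 2 / n_years
    return max(1, math.ceil(required))


# Hypothetical values: 0.28 degC forced change, 0.5 degC interannual std.
for n_years in (10, 30):
    for n_members in (1, 3, 10):
        snr = signal_to_noise(0.28, 0.5, n_years, n_members)
        print(f"N={n_years:2d} years, M={n_members:2d} members: SNR={snr:.2f}")

print("members needed for SNR >= 2 with 10 year means:",
      members_needed(0.28, 0.5, 10, 2.0))
```

Under these assumptions the ratio grows as the square root of both the averaging length and the ensemble size, which is why a single pair of decadal time slices sits at the unfavorable end of the range.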

In their reply to this comment, the authors show that the trend is statistically significant for temperature for some seasons and for some regions. It should be emphasized that the significance of the observed trend differing from zero is only a minimum, necessary but not sufficient, condition for the experiment to make sense if the purpose of a study is to quantify the added value afforded by dynamical downscaling with an RCM. We strongly feel that the issue of the statistical tools needed to properly analyze nested model simulations deserves attention, in order to design tests that can establish clearly what can be learned from a given experiment and what remains out of reach; this issue clearly falls outside the scope of the RSF12 paper and of this comment. In trying to establish the added value of RCMs, perhaps RSF12 were overly ambitious in their expectations, but the RCM community overall may have sinned by lack of it.

In their reply, the authors state: “the GCM should not be too skillful or there will be little opportunity for added value” (from dynamic downscaling). This statement is at variance with the stated goal of dynamic downscaling: to add features that cannot be explicitly resolved by a coarse-mesh GCM, such as fine-scale forcing (e.g., detailed coastline, peaked mountains, and narrow valleys) and mesoscale circulations permitted by the RCM fine mesh. It is well known that biases in the driving GCM can seriously impede the potential of an RCM to produce skillful simulations. The importance of using good large-scale lateral BC is a recurring point in all dynamic downscaling work, known as the “Garbage in–Garbage out” rule [e.g., Rummukainen, 2010].

As a side issue, we note that RSF12 selected WRF physical parameterizations to optimize the model performance when driven by GISS-ModelE2 data. This is a very questionable approach since the retained formulation will tend to cancel out the driving model biases. In all RCM work known to us, the best configuration is determined based on RCM simulations driven by the best data, that is, from reanalyses.


I would like to express my gratitude to several regional climate scientists who have exchanged with me over the last months on issues related to this comment, in particular to my colleague Ramón de Elía, senior scientist at the Consortium Ouranos.