Accounting for interannual variability: A comparison of options for water resources climate change impact assessments



[1] Empirical scaling approaches for constructing rainfall scenarios from general circulation model (GCM) simulations are commonly used in water resources climate change impact assessments. However, these approaches have a number of limitations, not the least of which is that they cannot account for changes in variability or persistence at annual and longer time scales. Bias correction of GCM rainfall projections offers an attractive alternative to scaling methods as it has similar advantages to scaling in that it is computationally simple, can consider multiple GCM outputs, and can be easily applied to different regions or climatic regimes. In addition, it also allows for interannual variability to evolve according to the GCM simulations, which provides additional scenarios for risk assessments. This paper compares two scaling and four bias correction approaches for estimating changes in future rainfall over Australia and for a case study for water supply from the Warragamba catchment, located near Sydney, Australia. A validation of the various rainfall estimation procedures is conducted on the basis of the latter half of the observational rainfall record. It was found that the method leading to the lowest prediction errors varies depending on the rainfall statistic of interest. The flexibility of bias correction approaches in matching rainfall parameters at different frequencies is demonstrated. The results also indicate that for Australia, the scaling approaches lead to smaller estimates of uncertainty associated with changes to interannual variability for the period 2070–2099 compared to the bias correction approaches. These changes are also highlighted using the case study for the Warragamba Dam catchment.

1. Introduction

[2] With the results of a number of general circulation models (GCMs) available from the Coupled Model Intercomparison Project (CMIP3) database, climate change impact assessments should ideally utilize all available GCMs to ensure that uncertainty is fully incorporated. However, a number of problems prevent GCM simulations of precipitation from being used directly for assessing the impacts of climate change on water resources systems. One of the most important issues is the difference in spatial scale between catchment and typical GCM grid resolutions [Fowler et al., 2007]. Furthermore, rainfall is a parameterized process in GCMs, and the parameterizations are considered least reliable at the catchment scale [Wilby et al., 1998]. They also rely on accurate modeling of the energy and moisture cycles and on the simulation of clouds, which has been identified as one of the remaining significant uncertainties in GCM simulations [Solomon et al., 2007].

[3] To overcome these issues with GCM precipitation simulations, a series of corrections are generally applied to ensure more realistic representation of observed precipitation. These corrections are aimed at first removing the biases in the precipitation fields and, second, providing precipitation estimates at finer spatial scales (and possibly also finer temporal scales), which are more appropriate for catchment-scale impact assessments.

[4] Both statistical and dynamic downscaling lead to significant enhancements compared to GCM precipitation simulations, in particular, by providing precipitation estimates at scales that better match those required for catchment-scale impact assessments. However, even with the finer spatial resolution, there can still be biases in the precipitation simulations [Christensen et al., 2008]. Methods are therefore required to adjust the results to better match observations over the 20th century, and these are generally termed bias correction methods.

[5] In some studies, bias correction has been used as a standalone method to correct GCM simulations, with, in some cases, a spatial disaggregation step also applied following the bias removal [Elshamy et al., 2009; Ines and Hansen, 2006; Sharma et al., 2007; Wood et al., 2004]. When using bias correction of GCM simulations directly, the method is simple to apply; therefore, many GCMs can be considered, and precipitation changes can be assessed over large regions. A natural alternative to this application of bias correction is the delta change approach, which uses the changes derived from a GCM to scale historical rainfall [Arnell and Reynard, 1996; Chiew and McMahon, 2002; Diaz-Nieto and Wilby, 2005; Mpelasoka and Chiew, 2009]. The attraction of both these methods is their ease of application, with little need for modification for different regions or climatic regimes. An added attraction is the possibility of using multiple GCMs to allow an estimate of the uncertainty associated with the structural choices made in the design of each GCM.

[6] One of the main disadvantages of the delta change or scaling approaches is that variability is assumed to remain the same in the future. This unchanged variability is a problem at multiple time scales in a hydrological setting. At the daily level, scaling approaches lead to rainfall occurrence being the same for the current and future climate [Fowler et al., 2007], whereas in the future, changes are expected for some parts of the world such that rainfall is expected to be more intense but occur less frequently [Mehrotra and Sharma, 2010; Trenberth et al., 2003]. Of greater importance for large water resources systems are changes in variability at low frequencies, particularly at interannual time scales. In drier parts of the world, such as Australia, water resources systems generally include storages with multiyear capacity, and the future adequacy of these systems cannot be assessed if climate sequences do not allow for changes in interannual variability.

[7] Bias correction of GCM precipitation is an attractive alternative to scaling approaches as it is based on actual GCM simulations instead of the historical record, and it can be designed to address known biases associated with the GCM for the region being modeled. While bias correction has traditionally considered corrections of daily or monthly distributions [Fowler and Kilsby, 2007; Ines and Hansen, 2006; Mehrotra and Sharma, 2010; Wood et al., 2004], it is also possible to address biases at other time scales [Johnson, 2010; Johnson and Sharma, 2009].

[8] The aim of this paper is to determine what the implications of each of these methods are in terms of estimating future water availability. What are the advantages and disadvantages of each approach, and how do these affect the future estimates of rainfall and streamflow? This study compares two scaling approaches and four bias correction techniques to estimate changes in future rainfall in Australia using the full suite of CMIP3 GCMs. The different methods are first validated against historical rainfall and then applied to future projections for the latter part of the 21st century for the SRESA2 scenario [Intergovernmental Panel on Climate Change (IPCC), 2000]. A case study assessing the implications on streamflow generation for a major water supply catchment for Sydney, Australia, has also been undertaken to illustrate the differences in the approaches. While previous studies have compared scaling approaches [Mpelasoka and Chiew, 2009] or bias correction methods [Johnson and Sharma, 2009; Wood et al., 2004], this study evaluates both, and by validating the methods against historical data, it makes improved judgments on the suitability of the different approaches to different climatic settings. It is important to note that spatial disaggregation of the bias-corrected GCM simulations is not being considered, although for some locations and applications this is a vital step after GCM biases are removed.

[9] Section 2 describes the scaling and bias correction approaches that have been evaluated. Section 3 provides details of the data sets, the case study of the Warragamba catchment in New South Wales, Australia, and the evaluation statistics used to compare the different approaches. Results are presented in section 4, with discussion and conclusions based on the results provided in section 5.

2. Scaling and Bias Correction Methods

[10] The aim of this paper is to compare simple, commonly used approaches for estimating climate change impacts on rainfall and, in particular, to examine how interannual variability is accounted for in future projections. Two scaling approaches were compared by Mpelasoka and Chiew [2009], namely, constant scaling and daily scaling. They also assessed the performance of daily translation, a bias correction technique, which at the monthly scale has also been termed quantile mapping [Wood et al., 2004]. Two bias correction approaches were compared by Johnson and Sharma [2009], including monthly bias correction and a nested bias correction approach, which accounted for biases in interannual variability. In addition, Johnson [2010] compared a third bias correction approach to the two used by Johnson and Sharma [2009]. This third method allowed for nesting at multiple time scales but did not correct for interannual variability biases. This paper compares the performance of all six methods, using monthly data for Australia. A description of each technique is provided below, with readers referred to the above referenced papers for details on all approaches.

2.1. Scaling

[11] Scaling approaches assume that the relative changes in GCM simulations over time are more reliable than the absolute value of the simulations themselves [Fowler et al., 2007]. Therefore, the best estimate of future changes is obtained by calculating a scaling factor derived from the ratio of the GCM future and current climate estimates, followed by applying the factor to an observed time series of rainfall or temperature. The inherent assumption in this approach is that the biases in GCM outputs for the future and current climates will be removed when the changes are calculated. Generally, an additive bias is assumed for temperature changes and a multiplicative bias for precipitation, as shown in equations (1) and (2):

equation image
equation image

where the change in temperature or precipitation is related to the GCM estimates of the future climate (either Tmf or Pmf), the GCM estimates of the current (20th century) climate (either Tmc or Pmc), the bias, and the observations (Tobs or Pobs).

[12] The scaling of the historical rainfall time series can be undertaken in different ways. The simplest option is to only consider the mean changes at either monthly, seasonal, or annual frequencies. This is often termed constant scaling or a delta change/perturbation approach. In this study, following the methodology of Mpelasoka and Chiew [2009], changes in the seasonal mean are considered. In more detail, the observations are multiplied by the ratio of the GCM mean monthly rainfall for a particular season in the future and the GCM mean monthly rainfall for the same season in the 20th century, as shown by

$$(P^{f}_{i,j})_{CS} = P^{obs}_{i,j} \times \frac{\bar{P}^{mf}_{j}}{\bar{P}^{mc}_{j}} \tag{3}$$

where $P^{obs}_{i,j}$ is the observed precipitation for month i in season j, $\bar{P}^{mc}_{j}$ and $\bar{P}^{mf}_{j}$ are the means of all monthly precipitation from the GCM for season j for the 20th century climate and future climate, respectively, and $(P^{f}_{i,j})_{CS}$ is the future rainfall resulting from the constant scaling.
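As a minimal sketch of constant scaling for a single season, the Python below applies one seasonal factor to every observed monthly value. The function name and the sample rainfall values are illustrative assumptions and are not taken from the study.

```python
from statistics import mean

def constant_scaling(obs, gcm_current, gcm_future):
    """Scale observed monthly rainfall for one season by the ratio of the
    GCM future mean to the GCM 20th century mean for that season."""
    factor = mean(gcm_future) / mean(gcm_current)
    return [p * factor for p in obs]

# Hypothetical monthly rainfall (mm) for one season
obs = [50.0, 80.0, 20.0]
gcm_current = [100.0, 120.0, 80.0]  # seasonal mean of 100 mm
gcm_future = [90.0, 110.0, 70.0]    # seasonal mean of 90 mm
scaled = constant_scaling(obs, gcm_current, gcm_future)  # factor is 0.9
```

Because a single factor is applied, the sequence of wet and dry months in the scaled series is identical to that of the observations, which is the limitation discussed in the introduction.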

[13] A more sophisticated approach is based on the assumption that climate change is likely to have different effects across different parts of the rainfall distribution. For example, it is generally expected that in some parts of the world, extreme rainfall events will become more intense even if annual rainfall decreases [Huntington, 2006]. Constant scaling is unable to account for this, and thus, quantile scaling provides an extension by calculating the ratio of future to current GCM outputs at different quantiles of the rainfall distribution.

[14] The quantile scaling is carried out as in (4), where now the values for a particular quantile q of the distribution of monthly precipitation in that season are scaled:

$$(P^{f}_{i,j,q})_{QS} = P^{obs}_{i,j,q} \times \frac{P^{mf}_{j,q}}{P^{mc}_{j,q}} \tag{4}$$

[15] The quantiles are defined by interpolating directly from the empirical cumulative distribution function for each time series.
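The quantile scaling idea can be sketched as follows for one season. This is an illustration under stated assumptions: the quantile of each observed value is taken as its rank position within the observed sample, quantiles of the GCM series are linearly interpolated from the sorted data, and all function names and numbers are hypothetical.

```python
def quantile(sorted_vals, q):
    """Linearly interpolate the q-th quantile (0 <= q <= 1) from sorted data."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def quantile_scaling(obs, gcm_current, gcm_future):
    """Scale each observed value by the future/current GCM ratio at its quantile."""
    cur, fut = sorted(gcm_current), sorted(gcm_future)
    n = len(obs)
    order = sorted(range(n), key=lambda i: obs[i])  # indices from driest to wettest
    out = [0.0] * n
    for rank, idx in enumerate(order):
        q = rank / (n - 1) if n > 1 else 0.5
        current_q = quantile(cur, q)
        # Assumption: leave the value unchanged if the GCM quantile is zero
        ratio = quantile(fut, q) / current_q if current_q > 0 else 1.0
        out[idx] = obs[idx] * ratio
    return out

# Hypothetical data: the GCM dries the low quantiles and wets the high ones
obs = [10.0, 30.0, 20.0]
gcm_current = [10.0, 20.0, 30.0]
gcm_future = [5.0, 20.0, 45.0]
scaled = quantile_scaling(obs, gcm_current, gcm_future)
```

In this example the driest observed month is halved, the median month is unchanged, and the wettest month is increased by 50%, so the spread of the distribution changes even though the timing of the observed sequence does not.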

2.2. Bias Correction

[16] Bias correction methods use the outputs of the GCMs directly, making the assumption that if the biases can be removed, then the GCM outputs will indicate changes in variability and spatial patterns of rainfall, which scaling approaches do not [Fowler et al., 2007]. Four bias correction methods are considered here: two are based on removing biases in the monthly rainfall amounts, and two methods correct for biases on multiple time scales.

[17] Bias correction can be undertaken using parametric or nonparametric approaches. For the parametric monthly bias correction, the mean and standard deviation of the GCM rainfall are corrected to match the observations. This assumes that the distribution of the GCM rainfall is sufficiently similar to that of the observations such that the GCM rainfall only needs to be shifted and scaled to match the observations. The correction method is shown in (5) for the current climate GCM outputs and (6) for the future climate GCM outputs.

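$$P'^{mc}_{i} = \frac{P^{mc}_{i} - m^{mc}_{i}}{s^{mc}_{i}}\, s^{obs}_{i} + m^{obs}_{i} \tag{5}$$
$$P'^{mf}_{i} = \frac{P^{mf}_{i} - m^{mc}_{i}}{s^{mc}_{i}}\, s^{obs}_{i} + m^{obs}_{i} \tag{6}$$

where the prime denotes the bias-corrected rainfall, s is the sample standard deviation estimated from the observed data ($s^{obs}_{i}$) or the GCM for the current climate ($s^{mc}_{i}$), and m is the sample mean, again estimated from the observations ($m^{obs}_{i}$) or GCM ($m^{mc}_{i}$) for month i.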
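A minimal sketch of the parametric monthly correction for a single calendar month is given below. The function name and data are hypothetical. The same current-climate statistics are used whether the series being corrected is the current or the future GCM output, which is what allows a change signal to carry through.

```python
from statistics import mean, stdev

def monthly_bias_correction(gcm_series, gcm_current, obs):
    """Shift and rescale GCM rainfall for one calendar month so that the
    current-climate GCM mean and standard deviation match the observations.
    Note: a linear correction can return negative rainfall; how such values
    are treated is not covered by this sketch."""
    m_mc, s_mc = mean(gcm_current), stdev(gcm_current)
    m_obs, s_obs = mean(obs), stdev(obs)
    return [(p - m_mc) / s_mc * s_obs + m_obs for p in gcm_series]

# Hypothetical January totals (mm) over three years
gcm_current = [10.0, 20.0, 30.0]
obs = [40.0, 60.0, 80.0]
gcm_future = [20.0, 30.0, 40.0]

corrected_current = monthly_bias_correction(gcm_current, gcm_current, obs)
corrected_future = monthly_bias_correction(gcm_future, gcm_current, obs)
```

Here the corrected current-climate series reproduces the observed mean and standard deviation, while the corrected future series retains the GCM's shift in the mean.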

[18] Quantile mapping is a nonparametric method that matches the full distribution of monthly values between the GCM and observations, as shown in (7). The scaling factor is then projected onto the future GCM distribution according to equation (8).

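$$P'^{mc}_{i,j,q} = P^{mc}_{i,j,q} \times \frac{P^{obs}_{j,q}}{P^{mc}_{j,q}} \tag{7}$$
$$P'^{mf}_{i,j,q} = P^{mf}_{i,j,q} \times \frac{P^{obs}_{j,q}}{P^{mc}_{j,q}} \tag{8}$$

where for month i, in season j, the qth quantile is corrected using the ratio of the observations to GCM outputs for the 20th century.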
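Quantile mapping can be sketched in the same style. The assumptions here are that the quantile of each GCM value is its rank position within the series being corrected, that quantiles are linearly interpolated from the sorted samples, and that the names and numbers are purely illustrative.

```python
def quantile(sorted_vals, q):
    """Linearly interpolate the q-th quantile (0 <= q <= 1) from sorted data."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def quantile_mapping(gcm_series, gcm_current, obs):
    """Multiply each GCM value by the observed/GCM ratio for the 20th century
    evaluated at the quantile of that value."""
    cur, ob = sorted(gcm_current), sorted(obs)
    n = len(gcm_series)
    order = sorted(range(n), key=lambda i: gcm_series[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        q = rank / (n - 1) if n > 1 else 0.5
        current_q = quantile(cur, q)
        # Assumption: leave the value unchanged if the GCM quantile is zero
        factor = quantile(ob, q) / current_q if current_q > 0 else 1.0
        out[idx] = gcm_series[idx] * factor
    return out

# Hypothetical monthly totals (mm) for one season
gcm_current = [40.0, 10.0, 20.0]
obs = [20.0, 30.0, 40.0]
gcm_future = [20.0, 40.0, 80.0]

mapped_current = quantile_mapping(gcm_current, gcm_current, obs)
mapped_future = quantile_mapping(gcm_future, gcm_current, obs)
```

Mapping the current-climate series onto itself returns the observed distribution in the GCM's own temporal order, which illustrates why this nonparametric correction matches medians and other quantiles closely.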

[19] The two remaining bias correction methods address biases in the GCM outputs on multiple time scales. As discussed, this allows the interannual variability of the future projections to evolve according to the GCM, allowing that there may be some biases in the modeling of interannual variability compared to the observations. The nested bias correction methodology was developed by Johnson and Sharma [2009] and is aimed at representing both high- and low-frequency variability and persistence in the GCM outputs, thereby making them useful for water resources applications. It is termed nested bias correction because of the nesting of corrections at a range of time scales within the one method. The logic of the nested bias correction was borrowed from models for stochastic rainfall generation, which are used to generate daily rainfall such that the rainfall exhibits appropriate variability when aggregated over longer time scales [Koutsoyiannis, 2001; Srikanthan and Pegram, 2009; Wang and Nathan, 2007]. The method presented by Johnson and Sharma [2009] represents a generic framework for addressing biases in the distribution and persistence at multiple time scales, with a specific implementation using a linear autoregressive transformation chosen to illustrate the process. The steps for the linear autoregressive transformation for the future climate case are presented in equations (9)–(15). Current climate GCM outputs can be modified in the same way. The notation is the same as for the other correction methods, with additional notation to represent data at year k and with the sample statistics expanded to include the lag 1 autocorrelation coefficient r. It is important to note that the bias corrections have been carried out independently for each grid cell, without modeling the spatial correlations, which in some locations can be significant.

[20] The first step in the model algorithm is to standardize the time series of monthly GCM rainfall totals $P^{mf}_{i,k}$ by the model sample means and standard deviations for each month i to create $p^{mf}_{i,k}$, noting that lower case “p” is used to indicate standardized values in the following text and the tilde (e.g., $\tilde{P}$) represents the corrected precipitation at that time scale.

$$p^{mf}_{i,k} = \frac{P^{mf}_{i,k} - m^{mc}_{i}}{s^{mc}_{i}} \tag{9}$$

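[21] The monthly lag 1 autocorrelations ($r^{mc}_{i}$) in the GCM outputs are then removed from the standardized time series, and the observed monthly lag 1 autocorrelations ($r^{obs}_{i}$) are applied to modify the value of $p^{mf}_{i,k}$ as follows.

$$\tilde{p}^{mf}_{i,k} = r^{obs}_{i}\,\tilde{p}^{mf}_{i-1,k} + \sqrt{1-\left(r^{obs}_{i}\right)^{2}}\;\frac{p^{mf}_{i,k} - r^{mc}_{i}\,p^{mf}_{i-1,k}}{\sqrt{1-\left(r^{mc}_{i}\right)^{2}}} \tag{10}$$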

[22] The observed means and standard deviations are then used to rescale the corrected time series in (10) to finalize the nested time series $\tilde{P}^{mf}_{i,k}$ at the monthly level.

$$\tilde{P}^{mf}_{i,k} = \tilde{p}^{mf}_{i,k}\, s^{obs}_{i} + m^{obs}_{i} \tag{11}$$

[23] The nested monthly values $\tilde{P}^{mf}_{i,k}$ are now aggregated to the annual scale ($A^{mf}_{k}$). The monthly process is repeated for the annual time step, with the difference that there is no need to allow for seasonality.

[24] Beginning with the annual time series ($A^{mf}_{k}$), the annual rainfall totals are modified by standardizing with the sample mean and standard deviation of the annual rainfall ($M^{mc}$ and $S^{mc}$), such that for year k,

$$a^{mf}_{k} = \frac{A^{mf}_{k} - M^{mc}}{S^{mc}} \tag{12}$$

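[25] Modeled lag 1 autocorrelations ($R^{mc}$) are removed, and the observed lag 1 autocorrelations ($R^{obs}$) are added. As the observed lag 1 autocorrelations are generally quite small, this step does not generally lead to large changes compared to the standardized annual series ($a^{mf}_{k}$).

$$\tilde{a}^{mf}_{k} = R^{obs}\,\tilde{a}^{mf}_{k-1} + \sqrt{1-\left(R^{obs}\right)^{2}}\;\frac{a^{mf}_{k} - R^{mc}\,a^{mf}_{k-1}}{\sqrt{1-\left(R^{mc}\right)^{2}}} \tag{13}$$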

[26] The last step is to create the final annual time series by rescaling with the observed annual means and standard deviations ($M^{obs}$ and $S^{obs}$).

$$\tilde{A}^{mf}_{k} = \tilde{a}^{mf}_{k}\, S^{obs} + M^{obs} \tag{14}$$

[27] There are now four time series that are used to correct the monthly GCM time series: the monthly time series itself ($P^{mf}_{i,k}$), the corrected time series at the monthly timescale ($\tilde{P}^{mf}_{i,k}$), the aggregated yearly time series ($A^{mf}_{k}$), and the nested annual time series ($\tilde{A}^{mf}_{k}$). Following Srikanthan and Pegram [2009], the corrections at the monthly and annual level can be applied at the same time to create a one-step correction, as shown by

$$\bar{P}^{mf}_{i,k} = P^{mf}_{i,k} \times \frac{\tilde{P}^{mf}_{i,k}}{P^{mf}_{i,k}} \times \frac{\tilde{A}^{mf}_{k}}{A^{mf}_{k}} \tag{15}$$

[28] The final bias correction method, simple nested bias correction, is similar to the nested bias correction shown in equations (9)–(15) but does not include corrections for lag 1 autocorrelations; that is, the methodology does not use equations (10) or (13).
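The nesting logic can be sketched in Python as follows. This is an illustration under several assumptions rather than the authors' implementation: series are flat lists of monthly totals covering whole years, the value preceding the first time step is treated as zero in standardized units, the modeled annual statistics are taken from the monthly-corrected current-climate series, and all names and the synthetic data are hypothetical. Setting `fix_lag1=False` skips the autocorrelation steps, which corresponds to the simple nested variant.

```python
import random
from math import sqrt
from statistics import mean, stdev

def pearson(a, b):
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

def cycle_stats(x, period):
    """Mean, SD and lag 1 correlation for each position in a cycle
    (period=12 for calendar months, period=1 for annual totals)."""
    out = []
    for i in range(period):
        idx = [t for t in range(i, len(x), period) if t > 0]
        r = pearson([x[t - 1] for t in idx], [x[t] for t in idx])
        out.append((mean(x[i::period]), stdev(x[i::period]), r))
    return out

def correct(x, model, obs, period, fix_lag1):
    """Standardize with model statistics, optionally replace the modeled
    lag 1 autocorrelation with the observed one, rescale with observed statistics."""
    sm, so = cycle_stats(model, period), cycle_stats(obs, period)
    out, prev_z, prev_c = [], 0.0, 0.0
    for t, v in enumerate(x):
        (m_m, s_m, r_m), (m_o, s_o, r_o) = sm[t % period], so[t % period]
        z = (v - m_m) / s_m
        c = z
        if fix_lag1:
            c = r_o * prev_c + sqrt(1 - r_o ** 2) * (z - r_m * prev_z) / sqrt(1 - r_m ** 2)
        out.append(c * s_o + m_o)
        prev_z, prev_c = z, c
    return out

def annual(x):
    return [sum(x[k:k + 12]) for k in range(0, len(x), 12)]

def nested_bias_correction(gcm, gcm_current, obs, fix_lag1=True):
    monthly = correct(gcm, gcm_current, obs, 12, fix_lag1)
    ref_monthly = correct(gcm_current, gcm_current, obs, 12, fix_lag1)
    agg = annual(monthly)
    nested = correct(agg, annual(ref_monthly), annual(obs), 1, fix_lag1)
    # One-step weighting: corrected monthly value times nested/aggregated annual ratio
    return [monthly[t] * nested[t // 12] / agg[t // 12] for t in range(len(gcm))]

# Synthetic 30 year series (mm per month), for illustration only
rng = random.Random(1)
obs = [rng.uniform(20, 120) for _ in range(360)]
gcm_current = [rng.uniform(5, 60) for _ in range(360)]
gcm_future = [rng.uniform(5, 50) for _ in range(360)]

snbc_current = nested_bias_correction(gcm_current, gcm_current, obs, fix_lag1=False)
nbc_future = nested_bias_correction(gcm_future, gcm_current, obs)
```

With the autocorrelation steps switched off, the corrected current-climate series reproduces the observed annual mean and standard deviation by construction, while the future series is free to depart from them according to the GCM.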

2.3. Method Summary

[29] In summary, the performance of six scaling and bias correction methods will be compared across Australia. The methods and their abbreviations are as follows: constant scaling (CS), quantile scaling (QS), quantile mapping (QM), monthly bias correction (MBC), simple nested bias correction (SNBC), and nested bias correction (NBC). Section 3 provides details on the data and test cases for the different approaches.

3. Data and Case Study

3.1. Observed and GCM Data

[30] Monthly gridded rainfall data are available from the Australian Bureau of Meteorology (BOM) for the period 1900–2007. This product is based on historical rainfall observations at stations across Australia and gridded to a 0.25° resolution. A common 1.875° rectangular grid has been chosen to allow comparison of all the GCM outputs, with the BOM rainfall data also regridded to this resolution. There are 201 grid cells covering the Australian land surface.

[31] GCM precipitation outputs from the CMIP3 multimodel database were obtained for all GCMs that have modeled the SRESA2 emissions scenario [IPCC, 2000]. Some of the GCMs have multiple ensemble runs available, which allow model internal variability to be assessed as the GCM is initialized at different times. A total of 18 different GCMs and 38 model runs were used for the analysis as listed in Table 1. For the 20th century, the common period for all GCMs is 1901–1999. This period was split into two 50 year periods, 1901–1950 and 1950–1999, with the earlier period used to calibrate the parameters of the bias correction techniques and the latter period used to test the performance of each of the models. It is important to note that future changes are larger than those recorded over the 20th century across Australia, and therefore, the performance of each of the techniques for the future is not guaranteed through this approach. An assessment of the magnitude of the changes over the 20th century compared to the changes projected for the future is provided in section 4. However, many previous bias correction studies [Elshamy et al., 2009; Ines and Hansen, 2006; Sharma et al., 2007] have not considered the performance of bias correction methods using an independent period at all.

Table 1. List of GCMs Used for the Study

| GCM | Modeling Group | Available Ensemble Runs | Atmosphere Resolution |
| --- | --- | --- | --- |
| BCCR_BCM2_0 | Bjerknes Centre for Climate Research, Norway | 1 | T63 (∼1.9°) |
| CCCMA_CGCM3_1 | Canadian Centre for Climate Modeling and Analysis, Canada | 5 | T47 (∼2.8°) |
| CNRM_CM3 | Météo-France, Centre National de Recherches Météorologiques, France | 1 | T63 (∼1.9°) |
| CSIRO_MK3_5 | Commonwealth Scientific and Industrial Research Organisation (CSIRO) Atmospheric Research, Australia | 1 | T63 (∼1.9°) |
| GFDL_CM2_0 | U.S. Department of Commerce, National Oceanic and Atmospheric Administration (NOAA), Geophysical Fluid Dynamics Laboratory (GFDL), United States | 1 | 2° × 2.5° |
| GFDL_CM2_1 | U.S. Department of Commerce, National Oceanic and Atmospheric Administration (NOAA), Geophysical Fluid Dynamics Laboratory (GFDL), United States | 1 | 2° × 2.5° |
| GISS_MODEL_E_R | NASA Goddard Institute for Space Studies, United States | 1 | 4° × 5° |
| INGV_ECHAM4 | National Institute of Geophysics and Volcanology, Italy | 1 | T106 (∼1.125°) |
| INMCM3_0 | Institute for Numerical Mathematics, Russia | 1 | 4° × 5° |
| IPSL_CM4 | Institut Pierre Simon Laplace, France | 1 | 2.5° × 3.75° |
| MIROC3_2_MEDRES | Center for Climate System Research (University of Tokyo), National Institute for Environmental Studies and Frontier Research Center for Global Change (Japan Agency For Marine-Earth Science And Technology), Japan | 3 | T42 (∼2.8°) |
| MIUB_ECHO_G | Meteorological Institute of the University of Bonn, Meteorological Research Institute of the Korea Meteorological Administration, and Model and Data Group, Germany/Korea | 3 | T30 (∼3.9°) |
| MPI_ECHAM5 | Max Planck Institute for Meteorology, Germany | 3 | T63 (∼1.9°) |
| MRI_CGCM2_3_2A | Meteorological Research Institute, Japan | 5 | T85 (∼1.4°) |
| NCAR_CCSM3_0 | National Center for Atmospheric Research, United States | 4 | T85 (∼1.4°) |
| NCAR_PCM1 | National Center for Atmospheric Research, United States | 4 | T42 (∼2.8°) |
| UKMO_HADCM3 | Hadley Centre for Climate Prediction and Research, Met Office, United Kingdom | 1 | 2.5° × 3.75° |
| UKMO_HADGEM1 | Hadley Centre for Climate Prediction and Research, Met Office, United Kingdom | 1 | 1.3° × 1.9° |

[32] Future changes in rainfall have been assessed for 2070–2099, a 30 year window chosen to match World Meteorological Organization (WMO) recommendations on estimating mean climate. This period is long enough for parameters to be estimated with reasonably small uncertainty but not so long that significant trends occur within it. The period 1970–1999 was used to calibrate model parameters for the bias correction approaches and for the observed time series required for the scaling methods.

[33] In summary, the two analyses that have been carried out are (1) validation of the scaling and bias correction approaches using 20th century data, with a calibration period from 1901 to 1950 and a validation period from 1950 to 1999, and (2) assessment of likely future changes in rainfall calculated for the period 2070–2099, with model parameters calibrated on the basis of the final 30 years of the 20th century (1970–1999).

3.2. Performance Attributes

[34] To assess the performance of each of the methods over Australia, a range of monthly and annual rainfall attributes have been used as described below. Results are presented for the validation period of 1950–1999. For the future projections, results are presented in terms of the percentage change from the current climate.

[35] Monthly statistics assessed include the mean, standard deviation, and median calculated separately for each month. At the annual timescale the mean, standard deviation, and median are again used to gauge the performance of the methods. In addition, the 5th percentile and standard deviation of 2 year and 5 year rainfall totals are also calculated. The 2 year and 5 year rainfall totals allow the performance of the methods in capturing interannual variability to be assessed. The 5th percentiles reflect years where meteorological drought conditions may occur, while the standard deviations are used to indicate if the variability over all 2 year and 5 year periods reflects the observations. The 5th percentiles of the 2 year and 5 year rainfall totals are divided by the annual mean rainfall for each location to create a statistic that can be compared across the country, regardless of the climate regime.
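The multiyear attributes described above can be sketched as follows. The use of overlapping windows and of linear interpolation for the percentile are assumptions made for this illustration, and the function names and rainfall values are hypothetical.

```python
from statistics import mean, stdev

def multiyear_totals(annual, n):
    """Overlapping n year rainfall totals from a series of annual totals."""
    return [sum(annual[k:k + n]) for k in range(len(annual) - n + 1)]

def percentile(values, pct):
    """Percentile by linear interpolation between order statistics."""
    s = sorted(values)
    pos = pct / 100 * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def interannual_attributes(annual, n):
    """5th percentile of n year totals scaled by the annual mean, and their SD."""
    totals = multiyear_totals(annual, n)
    return {
        "p5_over_annual_mean": percentile(totals, 5) / mean(annual),
        "sd": stdev(totals),
    }

# Hypothetical annual rainfall totals (mm)
annual_rain = [100.0, 200.0, 300.0, 400.0]
attrs = interannual_attributes(annual_rain, 2)
```

Dividing the low percentile by the annual mean gives a dimensionless measure, so a wet tropical cell and an arid inland cell can be compared on the same scale.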

3.3. Case Study

[36] With such a large number of GCMs and grid cells to report, it is difficult to distill all the available information on likely rainfall changes. A case study has therefore been used, first, to illustrate the differences in rainfall projections between the scaling and bias correction approaches and, second, to show how changes in rainfall translate to streamflow and storage changes.

[37] The Warragamba Dam catchment, located near Sydney in New South Wales, Australia, was chosen as the case study to illustrate the implications of the approaches for a water resources system. Warragamba Dam is the largest storage in the Sydney Catchment Authority's (SCA) water supply system, and it supplies nearly 70% of Sydney's water needs. The catchment area for the dam is approximately 9050 km², and the catchment boundary is shown in Figure 1, along with meteorological station locations and the common grid used for the Australia-wide GCM comparisons, discussed in section 3.1.

Figure 1. Warragamba catchment boundary and elevations from a 9 s digital elevation model. Meteorological stations are shown with black dots. GCM grid cells are shown as crosses. Rivers are shown as blue lines.

[38] Observed catchment average rainfall and pan evaporation are used to estimate inflows to Warragamba Dam using the Australian Water Balance Model (AWBM). Figure 2 shows an AWBM schematic, which in this study is implemented as an eight-parameter model, with three soil stores each defined by two parameters, first, their proportion of the total catchment area and, second, the storage capacity. The final two parameters define the base flow recession constant and base flow index. The AWBM model is calibrated on the basis of the historical monthly rainfall and evaporation using sequential Monte Carlo sampling to define the maximum likelihood estimate of each model parameter [Fan et al., 2008], with model errors being assumed to follow a normal distribution.

Figure 2. Australian Water Balance Model schematic.

[39] For the future modeling of the catchment, the nearest grid cell from each GCM to the centroid of the Warragamba catchment is chosen to extract the current and future modeled time series. The resolution of the atmosphere grid for each GCM is listed in Table 1, and for a rectangular grid resolution of, for example, 2°, the Warragamba catchment area covers approximately 25% of a grid cell. It is important to note that for many impact assessments it is not appropriate to just use the GCM nearest grid point and that some type of spatial disaggregation should be applied [e.g., Maurer and Hidalgo, 2008; Wood et al., 2004]. However, to provide a simple comparison of the results, spatial disaggregation has not been applied for this study. Inputs for the AWBM model for the SRESA2 emissions scenario are then calculated using each of the six correction methods discussed in section 2, along with the raw GCM outputs. The resulting AWBM flow sequences are then applied to a simple storage model of Warragamba Dam to estimate future water availability. The storage model is defined using SCA's stage-storage relationship for the dam, with spills set to occur when the volume exceeds the maximum capacity of 1.9 × 103 m3 and demand estimated as the mean monthly inflow. It was assumed that the dam is half full at the start of the simulations. Sensitivity testing shows that this assumption only affects storage values in the first year of the simulations. It should be noted that the actual operation of the Warragamba Dam is much more complicated and includes interbasin transfers, and the simplified implementation used here is solely for illustrating the implications of using the various scaling and bias correction alternatives on reservoir operation in an easy to understand setting.
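A storage model of the kind described can be sketched as a monthly mass balance. This is a simplified illustration and not the model used in the study: it ignores the stage-storage relationship and evaporation from the reservoir, and the function name, units, and inflow values are hypothetical.

```python
from statistics import mean

def simulate_storage(inflows, capacity):
    """Monthly mass balance with constant demand equal to the mean inflow.
    The reservoir starts half full, spills above capacity, and cannot go below empty."""
    demand = mean(inflows)
    storage = capacity / 2
    trace, spills = [], []
    for q in inflows:
        storage = storage + q - demand
        spill = max(0.0, storage - capacity)
        storage = min(max(storage, 0.0), capacity)
        trace.append(storage)
        spills.append(spill)
    return trace, spills

# Hypothetical monthly inflows and capacity in the same volume units
inflows = [50.0, 10.0, 0.0, 20.0]
trace, spills = simulate_storage(inflows, capacity=20.0)
```

Because demand equals the mean inflow, the storage trace responds mainly to the sequencing of wet and dry periods, which is why changes in interannual variability matter for this kind of system.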

[40] Future evaporation has been calculated using Thornthwaite evaporation for simplicity, although earlier work has shown that changes to variables other than temperature are important in determining likely impacts of climate change on evaporation [Johnson and Sharma, 2010]. However, given that the purpose of this study is to compare the different bias correction alternatives, this simple method is considered acceptable. In addition, there is a lack of long observational records for all variables required to undertake more complicated estimates of evaporation across the catchment for either bias correction or scaling methods.

4. Results and Discussion

4.1. Extent of Biases in GCMs

[41] Prior to examining the performance of the different correction methods discussed in section 2, it is important to first understand the extent of biases in the raw GCM simulations for the 20th century. Biases in the monthly distributions of rainfall are shown in Figure 3 with the root-mean-square error of mean monthly rainfalls, highlighting the magnitude of the biases in each GCM. Correlations between the observed and modeled mean monthly rainfalls are shown in Figure 4, which demonstrates how well the GCMs capture the seasonal cycle of rainfall. GCM performance varies markedly, although generally most of the models do well in capturing the seasonal cycle of rainfall in northern tropical Australia, with poorer performance in the southern parts of the country.

Figure 3. Maps of root-mean-square error of raw GCM simulations of mean monthly rainfall for the period 1950–1999 compared to observed data. Areas shown in red have the largest errors in raw GCM simulations, while lighter colors represent better representation of the observed mean monthly rainfall.

Figure 4. Maps of correlations of raw GCM simulations of mean monthly rainfall for the period 1950–1999 with observed mean monthly rainfall. Areas shown in red have good correlations, which indicates that the GCM models the seasonal cycle of rainfall adequately, while areas shown in white or blue are where the seasonal cycle is poorly modeled.

[42] Table 2 summarizes the mean errors for each GCM for the statistics that are used in sections 4.2–4.4 to assess the performance of the correction methods. Comparing model performance across all statistics, some of the GCMs have consistently low errors compared to the other GCMs (e.g., MPI_ECHAM5 and GFDL_CM2_0), and conversely, some of the GCMs lead to large errors in all statistics (e.g., BCCR_BCM2_0 and MIROC3_2_MEDRES). For the remainder of GCMs, performance is mixed depending on which statistic is used. Summaries of these raw GCM errors, averaged across all GCMs, are also given in Table 3 to provide a baseline with which to compare the performance of each of the correction methods.

Table 2. Mean Australian Raw GCM Prediction Errors (%)

| GCM | Monthly Mean | Monthly SD | Monthly Median | Annual Mean | Annual SD | Annual Median | 2 Year Minimum | 5 Year Minimum | 2 Year SD | 5 Year SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |

Table 3. Multimodel Ensemble Mean Percentage Prediction Error (a)

| Statistic | Raw GCM | MBC | SNBC | NBC | QM | CS | QS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Monthly mean | 46.9 (0) | 3.7 (0.01) | 1.1 (0.48) | 2.9 (0.18) | 2.3 (0.91) | −2.7 (0) | −0.5 (0.01) |
| Monthly SD | 9.8 (0) | 5.1 (0) | 3.4 (0) | 8.9 (0) | 6.2 (0) | 2.5 (0.04) | 7 (0) |
| Monthly median | 86.4 (0) | 6.2 (0) | 2 (0.46) | 1.1 (0.63) | 0.2 (0.25) | −5 (0) | −3.3 (0) |
| Annual mean | 30.9 (0) | −2.8 (0) | −5.2 (0) | −4.6 (0) | −5 (0) | −6.2 (0) | −5 (0) |
| Annual SD | −5.8 (0) | −7.2 (0) | −5.7 (0) | −2.5 (0.01) | −7.8 (0) | −10.4 (0) | −7.1 (0) |
| Annual median | 35.2 (0) | −2 (0) | −4.6 (0) | −4.1 (0) | −4.8 (0) | −5.6 (0) | −4.7 (0) |
| 2 Year Minimum Rainfall | 9.7 (0) | 2.7 (0) | −1 (0.1) | −3.4 (0) | 2.8 (0) | 2.9 (0.01) | 2.8 (0.02) |
| 5 Year Minimum Rainfall | 9.5 (0) | 5.4 (0) | 3.4 (0.01) | 1.6 (0.44) | 5.4 (0) | 4.4 (0) | 4.3 (0) |
| 2 Year SD | −6 (0) | −7.1 (0) | −5.8 (0) | −2.1 (0.07) | −8.1 (0) | −11.4 (0) | −8.2 (0) |
| 5 Year SD | −9.8 (0) | −10.3 (0) | −9.1 (0) | −5 (0.01) | −11.4 (0) | −18.6 (0) | −15.6 (0) |

(a) The p values are shown in parentheses. MBC, monthly bias correction; SNBC, simple nested bias correction; NBC, nested bias correction; QM, quantile mapping; CS, constant scaling; QS, quantile scaling.

4.2. Validation of Correction Methods: 1950–1999

[43] As discussed in section 3, there are 38 model integrations available at each grid location. Results can therefore be summarized in several ways. The multimodel ensemble mean error of the 38 model integrations can be compared for each method across all grid cells for the range of attributes listed in section 3. Figure 5 presents box plots showing the range of errors, along with the best method, as determined by the smallest mean percentage error, highlighted with gray shading. Each box plot represents the range of errors of the multimodel ensemble mean at all 201 grid cells. Table 3 summarizes the ensemble mean errors across Australia for all statistics, along with a p value denoting if the median of the distribution of errors is significantly different from zero using a Wilcoxon rank sum test.

Figure 5.

Box plots showing the range of multimodel ensemble mean prediction errors across Australia for a range of rainfall statistics. The correction method with the lowest mean prediction error across Australia is highlighted in gray.
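
The error measure summarized in Figure 5 can be illustrated with a short Python sketch for a single grid cell. The rainfall values are hypothetical, the function names are introduced here for illustration only, and two choices are assumptions rather than details confirmed by the text: the ensemble mean error is taken as the average of the per-integration percentage errors, and the population standard deviation is used.

```python
from statistics import mean, pstdev

def two_year_minimum(annual):
    """Smallest rainfall total over any two consecutive years."""
    return min(a + b for a, b in zip(annual, annual[1:]))

def percent_error(predicted, observed):
    """Signed prediction error as a percentage of the observed value."""
    return 100.0 * (predicted - observed) / observed

STATISTICS = {
    "annual mean": mean,
    "annual SD": pstdev,          # assumption: population SD
    "2 year minimum": two_year_minimum,
}

# Hypothetical observed annual rainfall (mm) for the validation period.
observed = [800, 650, 900, 700, 1000, 750]
# Hypothetical corrected series from three model integrations.
integrations = [
    [820, 700, 880, 760, 950, 790],
    [760, 640, 930, 690, 1040, 700],
    [850, 720, 870, 740, 980, 800],
]

ensemble_mean_error = {}
for name, stat in STATISTICS.items():
    errors = [percent_error(stat(series), stat(observed)) for series in integrations]
    # Assumption: ensemble mean error = average of the per-integration errors.
    ensemble_mean_error[name] = mean(errors)
```

Repeating this calculation at every grid cell would give the sample of ensemble mean errors that each box plot summarizes.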

[44] The first result to note is that the best method depends on the statistic used to judge the validation performance. For the monthly statistics the scaling approaches lead to the best estimate of the mean and standard deviation, while quantile mapping best matches the monthly medians. This is to be expected, as this approach is nonparametric, so it can shift the shape of the GCM monthly rainfall distributions if required, which should lead to a better match for the median, particularly if the distributions are skewed. The NBC method leads to the smallest prediction errors for measures related to the variability for annual or longer time scales. The widths of the box plots for all methods show that there is significant spatial variation in the results. Table 4 summarizes the proportion of grid cells where a particular method has the lowest prediction error out of the six methods considered. For the annual means, the monthly bias correction provides the lowest prediction errors for over half of Australia. For other statistics the best method does not have such large spatial coverage, generally ranging between 25% and 30% of the country, with the remainder of grid cells reasonably evenly split between the other five methods. In the presentation of the bias correction methodologies we noted that the calculations have been carried out independently for each grid cell. However, it is well known that rainfall fields can have considerable spatial structure, particularly at annual time scales. Correctly modeling this spatial dependence is crucial for applications such as streamflow estimation where rainfall over a catchment is aggregated spatially by a rainfall-runoff model.

Table 4. Number of Grid Cells for Which Each Method Has the Lowest Prediction Error for a Range of Rainfall Statistics (a)
Statistic | MBC | SNBC | NBC | QM | CS | QS
Monthly mean | 58 | 11 | 23 | 31 | 65* | 13
Monthly SD | 11 | 33 | 45 | 37 | 55* | 20
Monthly median | 51* | 19 | 35 | 34 | 34 | 28
Annual mean | 114* | 11 | 20 | 17 | 24 | 15
Annual SD | 50 | 19 | 72* | 23 | 17 | 20
Annual median | 83* | 25 | 11 | 20 | 28 | 34
2 year minimum rainfall | 46* | 34 | 36 | 40 | 29 | 16
5 year minimum rainfall | 26 | 32 | 47* | 23 | 43 | 30
2 year SD | 43 | 22 | 60* | 29 | 29 | 18
5 year SD | 25 | 23 | 58* | 29 | 29 | 37

(a) The highest number for each statistic is marked with an asterisk. Total number of grid cells for Australia is 201.
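
The tally reported in Table 4 amounts to selecting, at each grid cell, the method with the smallest absolute prediction error and counting how often each method is selected. A minimal sketch follows; the error values for the four grid cells are hypothetical, and the use of the absolute value of the percentage error as the selection criterion is an assumption.

```python
from collections import Counter

METHODS = ["MBC", "SNBC", "NBC", "QM", "CS", "QS"]

def best_method(errors):
    """Return the method whose percentage error is smallest in absolute value."""
    return min(METHODS, key=lambda m: abs(errors[m]))

# Hypothetical ensemble mean errors (%) for one statistic at four grid cells.
cells = [
    {"MBC": -2.1, "SNBC": -4.8, "NBC": -3.9, "QM": -4.4, "CS": -6.0, "QS": -5.1},
    {"MBC": 3.5, "SNBC": 1.2, "NBC": -0.4, "QM": 2.2, "CS": -1.9, "QS": 0.9},
    {"MBC": -1.0, "SNBC": 2.6, "NBC": 1.8, "QM": -3.3, "CS": 4.1, "QS": -2.7},
    {"MBC": 5.2, "SNBC": 4.4, "NBC": 3.1, "QM": 2.9, "CS": -0.6, "QS": 1.5},
]

# Number of grid cells at which each method performs best.
counts = Counter(best_method(cell) for cell in cells)
```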

[45] Figure 6 shows box plots of the average prediction error across Australia for each simulation, where the box plots now summarize the results from the 38 different GCMs, whereas for Figure 5, the multimodel ensemble mean error was summarized for the 201 grid cells in Australia. Generally, the results are quite similar to Figure 5 in terms of the best performing methods. The difference in the methods is particularly evident for the scaling approaches where the prediction errors show little variation between the different GCMs, particularly for the 2 year and 5 year statistics, along with a reasonably large bias in the error. This demonstrates how the error in these longer-term statistics is more dominated by the use of the scaled historical sequence than by the choice of GCM. Mpelasoka and Chiew [2009] found that the differences between different scaling approaches were smaller than the differences in GCM results; however, the results shown in Figure 6 indicate that the uncertainty estimates are strongly dependent on the correction methods used. The other factor that needs to be considered is that the biases in the GCMs themselves may not be constant over time [Buser et al., 2009], and this will also affect the performance of bias correction approaches. This issue is discussed further in section 5.

Figure 6.

Box plots showing the range of Australian mean prediction error across all 38 GCMs considered for a range of rainfall statistics.

4.3. Future Rainfall Projections for Australia

[46] Over the 20th century, the method with the smallest prediction errors varies according to which statistic is used to assess performance. For the future rainfall projections, the same statistics are used to consider likely changes in rainfall properties. Prior to presenting the projected changes from the different scaling and bias correction approaches, an assessment is first made of the magnitude of the changes in the raw GCM simulations compared to the changes in the 20th century over which the methods were validated.

[47] Figure 7 presents box plots summarizing the change in mean annual rainfall for the future compared to the 20th century as estimated from the raw GCM simulations. Each box plot represents the range of changes from the 38 GCM simulations at a single location, and thus, there are a total of 201 box plots summarizing the changes seen at each grid cell across Australia. Overlaid on each box plot is a point showing the change in the observed mean annual rainfall between the first and second halves of the 20th century, which was used to calibrate the bias correction methods. Although the results are highly dependent on which GCM is used to estimate the future change, as expected, the future changes are generally larger than the changes seen in the observational record over the 20th century. As noted, this means that the validation of the performance of the bias correction techniques over the 20th century, presented in section 3, may not be fully representative of the future period changes.

Figure 7.

Comparison of changes in observed mean annual rainfall over the 20th century (shown as red dots) used in the validation of the bias correction methods and the range of future changes in mean annual rainfall estimated from raw GCM simulations. Each box plot and dot represents one grid cell in Australia.

[48] Future changes in the rainfall statistics for the GCM multimodel ensemble mean averaged across Australia are presented in Table 5. Maps showing the spatial pattern of changes in annual average rainfall and the 2 year minimum rainfall totals are presented in Figures 8 and 9. For each method, the change is calculated compared to the 20th century estimate from that method, which for the scaling approaches is the observed record itself. The predicted changes in the raw GCM outputs are also presented in Table 5.

Figure 8.

Multimodel ensemble mean percentage change in annual average rainfall across Australia.

Figure 9.

Multimodel ensemble mean percentage change in 2 year minimum rainfall across Australia.

Table 5. Multimodel Mean Percentage Change in Rainfall Statistics From 20th Century to 21st Century
Statistic | Raw GCM | MBC | SNBC | NBC | QM | CS | QS
Annual mean | 1.03 | 6.34 | 8.08 | 11.08 | 2.14 | 1.51 | 2.08
Annual SD | 8.17 | 17.47 | 17.57 | 24.55 | 14.02 | 3.57 | 13.77
Annual median | 0.31 | 5.14 | 6.71 | 9.
2 year minimum rainfall | −2.69 | −3.08 | −3.18 | −3.87 | −3.85 | −1.04 | −4.65
5 year minimum rainfall | −2.01 | −2.1 | −2.05 | −2.57 | −2.29 | −0.53 | −2.68
2 year SD | 10.28 | 18.95 | 19.03 | 24.05 | 15.22 | 3.58 | 13.5
5 year SD | 18.26 | 26.9 | 27 | 32.28 | 23.07 | 3.78 | 13.7

[49] In Figure 8, the change in average annual rainfall shows a consistent pattern across all the methods, with decreases in the southwest and southeastern parts of Australia and increases over the northern areas. Differences between the bias correction and scaling methods are evident in the center of Australia, where the bias correction techniques lead to larger increases in mean annual rainfall than the scaling methods. The other difference is evident in the transition between areas that will be drier in the future to areas that are projected to be wetter in the future. For the scaling approaches, the GCM multimodel ensemble mean for approximately 50% of Australia is within ±5% of the 20th century mean annual rainfall. For the nested bias correction method, only 15% of the country has a change of less than 5% in absolute value. Analysis of the results indicates that the differences in mean annual rainfall are mainly due to correcting biases in the monthly standard deviations and, to a lesser degree, the bias in the annual standard deviations.

[50] For the 2 year minimum rainfall totals, the differences in the way that the methods consider interannual variability corrections are clear. Figure 9e shows that if constant scaling is used to assess likely changes in minimum rainfall totals, then it would be concluded that there is no change for almost all of Australia. This is because the constant scaling approach weights the means and 2 year minimums by approximately the same amount for the future, and hence, when the 2 year minimums are standardized by the mean annual rainfall, there is no change from the current conditions. If the constant scaling were carried out using annual scaling factors, there would be no change at all; as seasonal scaling factors were used, there are slight changes because of the timing of the 2 year minimums with respect to the seasons. The quantile scaling leads to a larger range of changes in interannual variability for the future, although for central Australia it does not show the increases in 2 year minimum rainfall totals that are indicated by the bias correction approaches, in particular the nested bias correction.
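
The behavior of constant scaling described above can be checked with a small numerical sketch. The historical record and the scaling factors below are hypothetical, and the year is split into only two seasons for brevity. With a single annual factor the 2 year minimum, standardized by the mean annual rainfall, is unchanged; with seasonal factors a slight change appears.

```python
from statistics import mean

def two_year_minimum(annual):
    """Smallest rainfall total over any two consecutive years."""
    return min(a + b for a, b in zip(annual, annual[1:]))

def standardized_minimum(annual):
    """2 year minimum divided by the mean annual rainfall."""
    return two_year_minimum(annual) / mean(annual)

def percent_change(future, current):
    return 100.0 * (future / current - 1.0)

# Hypothetical historical record: (wet season, dry season) totals in mm.
historical = [(500, 300), (450, 200), (600, 300),
              (400, 300), (700, 300), (500, 250)]
hist_annual = [wet + dry for wet, dry in historical]

# Case 1: one annual scaling factor (hypothetical value).
annual_factor = 0.9
future_annual = [annual_factor * total for total in hist_annual]
annual_change = percent_change(standardized_minimum(future_annual),
                               standardized_minimum(hist_annual))

# Case 2: separate seasonal scaling factors (hypothetical values).
wet_factor, dry_factor = 0.8, 1.1
future_seasonal = [wet_factor * wet + dry_factor * dry for wet, dry in historical]
seasonal_change = percent_change(standardized_minimum(future_seasonal),
                                 standardized_minimum(hist_annual))
```

Here `annual_change` is zero to within rounding, because the factor cancels between the 2 year minimum and the mean, whereas `seasonal_change` is small but nonzero because the driest 2 year period has a different seasonal split from the record as a whole.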

[51] The GCM multimodel ensemble mean hides the uncertainty in the estimates for the range of GCMs. Figure 10 therefore presents box plots representing two points, the locations of which are shown in Figure 8a. Similar to Figure 6, which showed the results for the validation period in the 20th century, it is evident how the uncertainty of the interannual statistics is smaller when the scaling approaches are used, particularly the constant scaling. The range of uncertainty in the future estimates of the annual mean rainfall change is generally quite similar no matter which method is used. However, once interannual statistics are considered, the scaling methods lead to smaller estimates of uncertainty compared to any of the bias correction methods. For point 1, the bias correction estimates from almost all GCMs project a decrease in the 2 year minimum rainfall. For point 2, the bias correction methods show that some of the GCMs project an increase in the 2 year minimum rainfall total while many lead to a decrease. Although the average change over all the models is similar between all methods, the uncertainty estimates are much higher from the bias correction approaches. Since water resources managers need to fully quantify risks to their systems, it is important that the impact on uncertainty ranges of correction method choices is considered.

Figure 10.

Range of projected changes for points shown in Figure 8a for all 38 GCMs for (top) mean annual rainfall and (bottom) 2 year minimum rainfall.

4.4. Warragamba Case Study

[52] Sections 4.2 and 4.3 presented results summarized over all GCMs and over Australia. Results from the case study of the Warragamba catchment are now presented to highlight the implications of the different methods on estimates of water security. Table 6 presents the calibrated model parameters for the AWBM model for the Warragamba catchment based on observed rainfall and streamflows from 1957 to 2003. The Nash-Sutcliffe efficiency for the calibration is 0.76. A plot showing the calibrated and observed time series is presented in Figure 11.

Figure 11.

Observed (black line) and modeled (red line) streamflows for the Warragamba catchment.
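
For reference, the Nash-Sutcliffe efficiency quoted above is one minus the ratio of the sum of squared model errors to the sum of squared deviations of the observations from their mean. A minimal Python sketch of this standard definition is given below; the streamflow values are hypothetical and are not the Warragamba data.

```python
def nash_sutcliffe(observed, modeled):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_term

# Hypothetical monthly streamflows (arbitrary units).
observed_flow = [10.0, 20.0, 30.0, 40.0]
modeled_flow = [12.0, 18.0, 33.0, 37.0]
efficiency = nash_sutcliffe(observed_flow, modeled_flow)
```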

Table 6. Calibrated Australian Water Balance Model Parameters

[53] The AWBM model was then used to generate streamflows based on the observed and GCM derived precipitation for the 20th century. Table 7 provides an assessment of how well the GCMs represent the streamflow derived from the observed rainfall. The assessment is based on the match between the empirical cumulative probability distributions of the observed streamflow and GCM streamflow, with the difference in the distributions summarized using the Kolmogorov-Smirnov two-sample test. Table 7 presents the p values from raw and bias-corrected GCM AWBM simulations. It is clear that in almost all of the cases, the raw GCM streamflow estimates are significantly different from the observed distribution. After bias correction, most of the GCM simulations match the distribution of the observed streamflows.

Table 7. Kolmogorov-Smirnov Test p Values for the Difference in Distribution of Observed and Modeled Streamflows
Number significantly different at 5% | 17 | 1 | 2 | 0 | 3
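
The comparison of empirical distributions can be sketched in a few lines of Python. The statistic is the largest vertical distance between the two empirical cumulative distribution functions. The p value below uses the asymptotic Kolmogorov series with a commonly used small-sample adjustment, which is an approximation and not necessarily the calculation used to produce Table 7. The streamflow samples and function names are hypothetical.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Approximate two-sided p value from the asymptotic Kolmogorov series."""
    effective_n = math.sqrt(n * m / (n + m))
    lam = (effective_n + 0.12 + 0.11 / effective_n) * d
    if lam < 0.3:
        return 1.0  # the series converges slowly here; p is effectively 1
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, terms + 1))
    return min(max(2.0 * series, 0.0), 1.0)

# Hypothetical annual streamflows (arbitrary units).
observed_flow = [float(x) for x in range(1, 21)]
raw_gcm_flow = [float(x) for x in range(101, 121)]   # strongly biased sample

d_raw = ks_statistic(observed_flow, raw_gcm_flow)
p_raw = ks_pvalue(d_raw, len(observed_flow), len(raw_gcm_flow))
significantly_different = p_raw < 0.05
```

A count such as the one in Table 7 would follow from applying this test to each GCM in turn and counting the cases with a p value below 0.05.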

[54] Figure 12 shows the estimates of the change in the annual and interannual rainfall statistics for the future resulting from each method. Once again, it is evident that scaling approaches lead to smaller uncertainty estimates compared to the much wider estimates of uncertainty using the bias correction methods. Also evident from Figure 12a is that the Warragamba catchment is located in the area shown in Figure 8f where the ensemble mean change in annual average rainfall is less than 5% if estimated from the scaling approaches but is larger if estimated using the bias correction approaches. As mentioned, this is due to changes in the variability introduced through the bias correction and represents another reason why correction approaches need to be selected with caution.

Figure 12.

Projected changes in annual and interannual rainfall statistics for all 38 GCMs for the Warragamba catchment.

[55] How do the projected rainfall changes translate into future streamflow and storage in Warragamba Dam? The results indicate that the choice of GCM has more impact on the change between current and future flows than does the choice of bias correction method. The future change, averaged over all GCMs, for each method is as follows: MBC, −14%; SNBC, −10%; NBC, −5%; QM, −7%; CS, −17%; QS, −6%. These changes are relatively small compared to the changes projected by individual GCMs, which range from a 70% decrease to a 110% increase in flows between the current and future periods.

[56] Figure 13 highlights the differences in the observed, current climate, and future streamflows for one GCM (BCCR_BCM2_0). Similar figures are available on request for all other GCMs. Table 8 provides a summary of the Kolmogorov-Smirnov test results on the difference in distribution between the current and future GCM streamflows. In most cases the future distribution is significantly different from the current climate distribution. However, for some GCMs this result is affected by applying scaling or bias correction. For example, GFDL_CM2.0 was found to have a significant difference between current and future streamflows if the raw GCM simulations were used. Following bias correction or scaling, the differences are not significant.

Figure 13.

Cumulative probability distributions of observed, modeled, and future streamflows in the Warragamba catchment from the BCCR_BCM2_0 GCM. Observed streamflow is shown as a thick black line, the 20th century GCM simulation of streamflow is shown as a red line, and the future GCM simulation of streamflow is shown as a blue dashed line. The Kolmogorov-Smirnov p values are shown in the top left corner of each plot, with the top value referring to the difference between the observed and 20th century GCM streamflows and the lower value referring to the difference between the GCM simulations of the 20th century and the future.

Table 8. Kolmogorov-Smirnov Test Results on Distribution of Current and Future GCM-Derived Streamflows
Number significantly different at 5% | 16 | 17 | 15 | 15 | 13 | 14 | 13

[57] Figure 14 shows the percentage of months where the storage is able to supply the full demand and also the percentage of months where no supply is able to be made as the storage is empty. Figures 14a and 14b show the current climate conditions, where it is evident that all bias correction approaches lead to a much better representation of the observed supply. It appears that all correction methods lead to a slightly high bias in the percentage of time that full supply is able to be met, although the large intermodel differences are removed following bias correction. On the basis of the observed precipitation and temperature, it is found that no water could be supplied in 3% of months. This is based on the hypothetical arrangement for the dam used in this analysis, and it is important to note that Warragamba Dam has much more stringent failure criteria than this. For a few GCMs, there are clear dry biases in the raw GCM outputs, with simulations showing that the dam would not be able to meet any demands in up to 20% of months.

Figure 14.

Summary of supply characteristics for Warragamba Dam for the current and future climates.
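
The two supply measures shown in Figure 14 can be illustrated with a simple monthly storage balance. The sketch below is a deliberately simplified, hypothetical arrangement: it ignores evaporation and any operating rules, and the inflows, demand, capacity, and function name are invented for illustration and do not represent the configuration used for Warragamba Dam in this study.

```python
def supply_summary(inflows, demand, capacity, initial_storage=0.0):
    """Return (% of months with full supply, % of months with no supply)."""
    storage = initial_storage
    full_months = 0
    empty_months = 0
    for inflow in inflows:
        available = storage + inflow
        supplied = min(demand, available)
        # Water above capacity is assumed to spill.
        storage = min(capacity, available - supplied)
        if supplied >= demand:
            full_months += 1
        elif supplied == 0:
            empty_months += 1
    n = len(inflows)
    return 100.0 * full_months / n, 100.0 * empty_months / n

# Hypothetical monthly inflows and system properties (same volume units).
inflows = [50.0, 0.0, 0.0, 0.0, 100.0, 100.0]
full_pct, none_pct = supply_summary(inflows, demand=40.0, capacity=100.0)
```

Months in which only part of the demand is met count toward neither measure, which is why the two percentages need not sum to 100.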

[58] For the future, all the correction methods moderate the extremes seen in the raw supply curves, although there is substantially more intermodel variation than seen for the current climate. The constant scaling leads to the lowest estimates of percentage of time with full supply for many of the GCMs. There are no clear differences in the results between the different methods.

5. Discussion and Conclusions

[59] It is clear from section 4 that future rainfall projections are strongly influenced by the method used to construct the rainfall scenarios, particularly when considering annual and interannual statistics. The main consideration when using scaling approaches is that variability is constrained by the historical record. For the future projections, there is significantly more uncertainty associated with the bias correction estimates than the scaling estimates, which may affect the outcomes of risk assessments that use the scaled or bias-corrected series as inputs.

[60] The obvious question that this raises is whether the GCM interannual variability is correct for the future. For the historical climate, it is a reasonably simple task to identify models that are able to simulate variability correctly. F. Johnson et al. (An assessment of GCM skill in simulating persistence across multiple time scales, submitted to Journal of Climate, 2010) propose a skill score using discrete wavelets to evaluate the modeling of variability at different time frequencies. This was used to identify GCMs that correctly represent interannual variability globally, as well as examining the relationship between a model's prediction of storage for a synthetic reservoir and its representation of interannual variability. It is expected that skill scores that evaluate features such as interannual variability provide a stronger test than the ability of a model to reproduce the mean climate state [Perkins et al., 2007]. Numerous studies have evaluated the modeling of drivers of interannual variability, such as the El Niño–Southern Oscillation in GCMs [AchutaRao and Sperber, 2006; Joseph and Nigam, 2006; Yu et al., 2009]. However, in general, it is unclear how model performance in reproducing the current climate translates to future projections [Knutti, 2008]. It is therefore difficult to discount any of the GCM simulations of interannual variability for the future. A risk assessment approach where all possibilities are considered would seem to be the best option, and in doing so, bias correction provides a wide range of possibilities for evaluation.

[61] The major assumption regarding the bias correction approaches is that the biases are constant over time and therefore that by estimating the biases for the historical period, they can be removed from the future period simulations to yield a nonbiased climate change estimate. Despite the multitude of studies which have applied bias correction approaches, it appears that only one study has examined the implications of the assumptions on the resulting future estimates. Buser et al. [2009] examine how future temperature projections change with different assumptions of the form of the bias in means and standard deviations. They note that future work could extend this to other variables, such as precipitation, although their Bayesian methodology requires normally distributed variables and would therefore have to be modified to account for nonnormal precipitation distributions. Future work could examine this as well as evaluating the stationarity of biases over the observational record, which may provide some guidance on whether further modifications are required for bias correction methods to provide better future estimates. A further consideration is how the parameters on which the bias correction methods are calibrated are extrapolated into the future, if the distribution of future values is different from the observations or the current climate GCM simulations. This would in particular affect the tails of the distributions, which are of particular interest with respect to drought and flood conditions.
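
The stationarity assumption can be made concrete with a minimal sketch of a mean and standard deviation correction for a single calendar month. This is a generic illustration under stated assumptions, not the exact formulation of any of the four bias correction methods compared in this paper; the rainfall values, function names, and the truncation of negative values at zero are all hypothetical choices. The correction parameters are fitted on the historical period and then applied unchanged to the future simulation.

```python
from statistics import mean, pstdev

def fit_correction(obs_hist, gcm_hist):
    """Estimate bias parameters from the historical period only."""
    return {
        "obs_mean": mean(obs_hist), "obs_sd": pstdev(obs_hist),
        "gcm_mean": mean(gcm_hist), "gcm_sd": pstdev(gcm_hist),
    }

def apply_correction(values, p):
    """Standardize with GCM statistics, rescale with observed statistics.

    Assumes the historical bias in mean and SD also holds for `values`.
    """
    ratio = p["obs_sd"] / p["gcm_sd"]
    return [max(0.0, p["obs_mean"] + (v - p["gcm_mean"]) * ratio) for v in values]

# Hypothetical rainfall for one calendar month (mm).
obs_hist = [80.0, 100.0, 120.0, 100.0]
gcm_hist = [40.0, 60.0, 50.0, 50.0]       # too dry and too little variability
gcm_future = [45.0, 65.0, 55.0, 55.0]     # raw GCM change in mean: +5 mm

params = fit_correction(obs_hist, gcm_hist)
corrected_future = apply_correction(gcm_future, params)
corrected_change = mean(corrected_future) - mean(obs_hist)
```

In this example the raw change of 5 mm becomes a corrected change of about 10 mm because the standard deviation ratio rescales departures from the historical GCM mean. If the bias in the standard deviation were not constant over time, the corrected change would be misestimated accordingly.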

[62] This paper has demonstrated the advantages and disadvantages of bias correction and scaling approaches as simple and cost-effective methods of investigating climate change impacts on water resources systems. If risk assessments indicate that interannual variability is an important control on a particular water resources system, then bias correction methods can provide additional scenarios of changes in variability that are not available when simply scaling the historical record. Until GCMs can reliably simulate variability in precipitation at time scales important for water resources systems, it is recommended that a range of approaches including bias correction be used when time or resource constraints do not permit the use of more sophisticated downscaling methodologies.


[63] Funding for this research came from the Australian Research Council and the Sydney Catchment Authority. Their support for this work is gratefully acknowledged. The authors thank Erwin Jeremiah for calibrating the AWBM model. The comments of the editor and three anonymous reviewers have greatly improved the presentation of this work. We acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modeling (WGCM), for their roles in making available the WCRP CMIP3 multimodel data set. Support for this data set is provided by the Office of Science, U.S. Department of Energy.