On the contribution of statistical bias correction to the uncertainty in the projected hydrological cycle



[1] Global hydrological modeling is affected by three sources of uncertainty: (i) the choice of the global climate model (GCM) used to provide meteorological forcing data; (ii) the choice of future greenhouse gas concentration scenario; and (iii) the choice of the decade used to derive the bias correction parameters. We present a comparative analysis of these uncertainties and compare them to the inter-annual variability. The analysis focuses on discharge, integrated runoff and total precipitation over ten large catchments, representative of different climatic areas of the globe. Results are similar for all catchments, all hydrological variables and throughout the year with few exceptions. We find that the choice of different decadal periods over which to derive the bias correction parameters is a source of comparatively minor uncertainty, while other sources play larger and similarly significant roles. This is true for both the means and the extremes of the studied hydrological variables.

1. Introduction

[2] Managing future fresh water resources under a changing climate with vastly uncertain future atmospheric greenhouse gas emission scenarios is a daunting challenge facing human society today. Hydrological models are one of the most powerful tools at our disposal to address that challenge. Hydrological simulations of scenario projections require adequate projected fields of meteorological forcing variables such as precipitation and temperature. These cannot be taken directly from global climate model simulations of future climate, which are significantly affected by errors: a hydrological simulation forced with such fields would give unrealistic results of little use [Hansen et al., 2006; Sharma et al., 2007]. Hydrological models are developed to give realistic output when forced with observed fields. Hence, some form of post-processing of GCM forcing fields is necessary. Removal of the bias, defined here as the time-independent component of the error, is a widely used form of post-processing.

[3] Hagemann et al. [2011] weighed the benefits of bias correction against the apparent addition of uncertainty in the resulting simulated future hydrological fields due to the bias correction parameter choice. Bias correction of projected scenario forcing fields is known to affect not only the projected hydrological fields, but also the climate change signal [Haerter et al., 2011]. Policy makers and fresh water resource strategists require the best possible information regarding hydrological modeling of projected climate. This entails not only the best possible estimate of future hydrological fields, but also the best possible estimate of the associated uncertainties.

[4] In this work we compare the uncertainty, as estimated by spread, in hydrological variables from simulations of projected climates from three different sources: the choice of GCM to force the hydrological simulations, the choice of emission scenario from the IPCC Special Report on Emissions Scenarios (SRES) used to determine the atmospheric greenhouse gas concentrations, and the choice of decade used to derive the bias correction parameters. The choice of decade used for the bias correction stands in lieu of a full, and far more cumbersome, analysis of the uncertainty associated with bias correction. This simplification stems from the finding of Piani et al. [2010b] that the choice of decade was by far the largest contributor to the uncertainty in this particular bias correction methodology, when compared to other factors such as fit error, choice of transfer function and observational uncertainty. Piani et al. [2010a, 2010b] chose the decadal time scale, given the total length of observation available (40 years), as a compromise between the need for a large number of non-overlapping time periods and the need for these periods to be as long as possible. We accept that the choice of any one particular bias correction methodology may be another source of uncertainty. However, a comparative analysis of all bias correction methodologies is beyond the scope of this paper. Section 2 presents the experiment: we describe the models used, the length and boundary conditions of the simulations performed, and key references for the bias correction method applied. Section 3 contains the results of the experiments. In section 4 we discuss the results, their limitations and implications, and conclude this work. The auxiliary material provides details of our methodology for calculating the spread attributable to a single source and using it to estimate the associated uncertainty.

2. Experimental Outline

[5] Global hydrological simulations were carried out with the hydrological model of the Max Planck Institute for Meteorology (MPI-HM), consisting of the Simplified Land surface (SL) scheme [Hagemann and Gates, 2003], which computes vertical water fluxes, and the Hydrological Discharge (HD) model [Hagemann and Dümenil, 1998], which globally simulates the lateral freshwater fluxes at the land surface. The last 20 years of the 21st century are simulated. In all the analyses that follow, the first 5 years of the simulations are discarded (spin-up period) and only the years from 2086 to 2100 are included. Hydrological simulations were forced by 24 different sets of bias-corrected GCM output variables (e.g., daily time series of total precipitation and mean temperature). Each climate simulation is characterized by a choice of one of three different GCMs, one of two distinct SRES scenarios [Nakićenović et al., 2000], and one of four separate decadal periods (1960–1969, 1970–1979, 1980–1989, and 1990–1999) over which to derive the bias correction parameters (3 × 2 × 4 = 24). The GCMs used are ECHAM5/MPIOM (denoted as ECHAM5 henceforth) [Roeckner et al., 2003; Jungclaus et al., 2006], CNRM-CM3 (CNRM henceforth) [Royer et al., 2002; Salas-Mélia, 2002] and IPSL-CM4 (IPSL henceforth) [Hourdin et al., 2006; Madec et al., 1998; Fichefet and Maqueda, 1997; Goosse and Fichefet, 1999]. The two SRES scenarios are A2 and B1. The bias correction methodology is that developed within the EU WATCH project (www.eu-watch.org), henceforth referred to as ‘WATCH bias correction’ or WBC, and described by Piani et al. [2010a, 2010b]. Within this statistical bias correction we have produced transfer functions of daily precipitation data. A transfer function maps the cumulative distribution function (CDF) of the modeled precipitation data onto that of the observed.
For example, in the present case of 10 years of daily data and a monthly correction, a mapping was performed between roughly 300 modeled and observed values, which can easily be obtained by sorting the data and associating values of matching indices. We have then produced fits to the monthly transfer functions. The monthly fit coefficients were then interpolated to yield day-by-day corrections, which we finally applied to the model-produced future daily precipitation data. Details on the method are given by Piani et al. [2010b]. Temperature corrections fluctuate less strongly when fewer data are used and are generally less problematic, owing to the better-behaved CDFs of temperature. Therefore, in the present study we focus only on the correction of mean daily precipitation and do not discuss corrections of temperature. The observational dataset used for the WBC was also developed within the WATCH project [Weedon et al., 2011]. From each 15-year global hydrological simulation a time series of daily discharge and integrated runoff was calculated for each of the ten large-scale globally distributed catchment areas shown in Figure 1. In addition, the total daily precipitation over the same catchment areas was extracted from the forcing simulations.
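The sorting-and-matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the WBC implementation: the function names are hypothetical, the data are synthetic, a straight-line fit stands in for the functional forms described by Piani et al. [2010b], and the interpolation of monthly coefficients to daily values is omitted.

```python
import numpy as np


def fit_transfer_function(modeled, observed):
    """Fit obs = intercept + slope * mod to the sorted (quantile-matched) values."""
    mod_sorted = np.sort(np.asarray(modeled, dtype=float))
    obs_sorted = np.sort(np.asarray(observed, dtype=float))
    # After sorting, equal indices correspond to equal empirical quantiles,
    # so the pairs (mod_sorted[i], obs_sorted[i]) sample the transfer function.
    # ASSUMPTION: a straight line stands in for the fitted functional form.
    slope, intercept = np.polyfit(mod_sorted, obs_sorted, deg=1)
    return intercept, slope


def apply_transfer_function(values, intercept, slope):
    """Correct modeled precipitation with the fitted transfer function."""
    corrected = intercept + slope * np.asarray(values, dtype=float)
    # ASSUMPTION: clip at zero so corrected precipitation is never negative.
    return np.maximum(corrected, 0.0)


# Synthetic stand-in for ~300 daily values (one calendar month over 10 years).
rng = np.random.default_rng(0)
modeled = rng.gamma(shape=2.0, scale=3.0, size=300)
# "Observations" are a scaled and shifted copy of the modeled values in
# shuffled order: the day-to-day pairing is destroyed, the CDF relation is not.
observed = rng.permutation(1.0 + 2.0 * modeled)

intercept, slope = fit_transfer_function(modeled, observed)
corrected = apply_transfer_function(modeled, intercept, slope)
```

Because only the sorted values enter the fit, the correction adjusts the distribution of daily values but not their ordering in time.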

Figure 1.

Locations and names of the 10 catchment areas studied.

[6] The resulting dataset consists of 24 hydrological time series of daily values, each 15 years long, for each of 3 variables and each of the 10 catchments. Before processing, the raw time series are compacted by taking non-overlapping 5-day (pentad) means. The time series thus produced are used to derive spreads, which are in turn corrected to serve as proxies for uncertainty. The annual cycle is divided into 73 pentads, and uncertainties are estimated via the spread of the hydrological variables, for each pentad, due to each of the separate sources mentioned. To derive an estimate of the uncertainty we start by taking the differences, for a given hydrological variable, between two simulations that differ in one parameter, be it GCM, SRES or bias correction decade, but are equal in all other parameter settings. We then calculate the average of this difference across all possible values of the remaining parameters. The final estimate of uncertainty is obtained by correcting the spread by a factor that accounts for the fact that, for each parameter, we only have a limited number of values. The details of the calculation of spread and the correction factor are given in part A of the auxiliary material.
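The pentad averaging and the spread attributable to a single source can be sketched as follows. This is an illustration only, under several assumptions that are not taken from the auxiliary material: the array layout (GCM, SRES, decade, year, pentad), the use of the absolute value of pairwise differences, a 365-day calendar, and the function names are all hypothetical, and the correction factor for the limited number of parameter values is not reproduced.

```python
import itertools

import numpy as np


def pentad_means(daily):
    """Average a (years, 365) array into (years, 73) non-overlapping 5-day means."""
    daily = np.asarray(daily, dtype=float)
    # ASSUMPTION: 365-day years, so each year splits into exactly 73 pentads.
    return daily.reshape(daily.shape[0], 73, 5).mean(axis=2)


def spread_along(data, axis):
    """Spread due to one source, as a function of pentad.

    Takes the absolute difference between every pair of settings along `axis`
    (all other settings held equal) and averages over all pairs and over all
    remaining axes except the last one, which indexes the pentad.
    """
    moved = np.moveaxis(np.asarray(data, dtype=float), axis, 0)
    pairs = itertools.combinations(range(moved.shape[0]), 2)
    diffs = np.stack([np.abs(moved[i] - moved[j]) for i, j in pairs])
    return diffs.reshape(-1, diffs.shape[-1]).mean(axis=0)


# Pentad means of a toy two-year daily series.
toy_daily = np.tile(np.arange(365.0), (2, 1))
toy_pentads = pentad_means(toy_daily)

# Synthetic ensemble: 3 GCMs x 2 SRES x 4 decades x 15 years x 73 pentads,
# in which only the GCM changes the simulated value (offsets 0, 1 and 2).
data = np.zeros((3, 2, 4, 15, 73))
data += np.array([0.0, 1.0, 2.0]).reshape(3, 1, 1, 1, 1)

gcm_spread = spread_along(data, axis=0)     # 4/3 in every pentad
decade_spread = spread_along(data, axis=2)  # 0 in every pentad
```

In this layout the inter-annual spread would be obtained in the same way with `axis=3`.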

3. Results

[7] Figure 2 shows the annual cycle of uncertainty in discharge, for the 10 large-scale catchments and different sources. Of all the different sources of uncertainty, the choice of decade for the bias correction estimation appears to give the smallest contribution (black line), while the other two sources give comparatively larger contributions. This is true in most catchments and during most of the year. However, there are some exceptions. For example, between January and March in the Amur and Congo catchments, all uncertainty contributions are similarly small. For some catchments the uncertainties due to choice of GCM and SRES are very similar to each other and also to the inter-annual variability. This is the case for the Amur, the Murray, the Ganges, the Mississippi, the Danube and the Parana catchments. For other catchments, notably the Amazon and the Yangtze, the contributions due to SRES and GCM are very different at certain times of the year. There does not seem to be a relation between the relative and absolute sizes of the contributions: for the Yangtze the differences are greatest when the uncertainties are at their highest, while for the Amazon the opposite is true.

Figure 2.

Annual cycle of uncertainties in discharge due to: the choice of decade for the estimation of the bias correction (CDBC, black); the choice of greenhouse gas loading scenario (SRES, blue); the choice of forcing global climate model (GCM, red); inter-annual variability (Year, yellow).

[8] Results for the integrated runoff (not shown) are similar to those for river discharge. The choice of decade for bias correction gives the smallest contribution to total uncertainty. However, in the case of integrated runoff, the results are far more straightforward. The relative strengths of the different uncertainty contributions are almost - although not exactly - constant. After that of bias correction, the next smallest contribution is given by the SRES, followed by the GCM. Both of the latter contributions are smaller than the inter-annual variability.

[9] In Figure 3 we summarize the information on uncertainty contribution presented in Figure 2 for discharge and we give the same information for total catchment runoff and precipitation. Details of how the column lengths are obtained are given in part B of the auxiliary material. Here we simply state that they are the annual average of the ratios of the single uncertainty contributions to the inter-annual variability (hence, the yellow columns are all unity). The result gives an indication of the annual average contribution of bias correction relative to the total uncertainty. These results must be taken only as indicative, since there is no unique and objective way to calculate a ‘mean annual relative strength’ for each contribution. For example, normalizing by a different value for each pentad of the year and then taking the average will not yield the same result as averaging first and then normalizing. The boxes at the right end of Figure 3 give the extremes of variation of the column lengths across different basins. That is, the bottom (top) of each color bar in the right hand boxes corresponds to the lowest (highest) value across all basins for the relevant color.
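The remark that normalizing and averaging do not commute can be checked with a two-pentad toy example. The numbers below are invented for illustration and are not taken from the simulations.

```python
import numpy as np

# Hypothetical spreads for two pentads (illustrative values only).
source_spread = np.array([1.0, 4.0])  # spread due to one source
interannual = np.array([1.0, 2.0])    # inter-annual variability

# Normalize each pentad first, then average: (1/1 + 4/2) / 2
mean_of_ratios = np.mean(source_spread / interannual)

# Average over pentads first, then normalize: 2.5 / 1.5
ratio_of_means = source_spread.mean() / interannual.mean()

print(mean_of_ratios, ratio_of_means)
```

The first quantity is 1.5 and the second is 5/3, so the two definitions of a mean annual relative strength already differ in this simple case.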

Figure 3.

Annual mean uncertainty relative contributions from all sources (CDBC, black; SRES, blue; GCM, red; Year, yellow) and for all three examined variables (rows) and for all catchments (columns). The boxes at the right end give the extremes of variation of the column lengths across different basins.

[10] Furthermore, we note that the relative contributions from the choice of decade for bias correction, although always the smallest, are larger for discharge than for total basin runoff and precipitation. There is considerably more homogeneity across different basins for runoff and precipitation than for discharge. We attribute this behavior to the fact that the bias correction used is applied to the CDF of daily precipitation values. Consequently, temporal correlations - which are important for proper hydrological simulations - are not adjusted to those of the observations. While runoff is rather strongly correlated with precipitation in space and time, discharge is subject to a nonlinear temporal filter that affects its temporal correlation characteristics. This filtering is imposed by the lateral transport of water through surface and subsurface storages and the flow network within the catchment. The associated delay of flow may range from several days to months for the large catchments, depending on the location of the main runoff generating event (snow melt, rainfall) relative to the river mouth.

[11] In the case of discharge for the Amazon and Yangtze basins, uncertainty from GCM is significantly larger than all other contributions. For the Amazon, the soil moisture - atmosphere feedbacks during the drying season (from July to October) are quite different in the GCMs. For the Yangtze, the spatial and temporal patterns of the monsoon differ between the GCM simulations, which leads to similar runoff patterns but rather different discharge values.

[12] Similar inferences can be made by looking at the extreme value behavior. Figure 4 shows the 99th percentile of daily discharge as a ratio to the mean value for each calendar month and across the entire 15-year time series. Results are shown only for the Nile and the Danube catchments. For these two catchments, as for the others (not shown), there are large seasonal variations in the 99th percentile-to-mean ratio values. However, the lines tend to cluster according to color, that is, simulations with different bias correction calibration periods give similar results. There are, of course, some variations within the results. In some cases, for example for the Nile basin, B1 scenario and CNRM model (bottom left, green lines) from January to August, the values are very similar. The opposite is true, for example, for the Danube basin, A2 scenario (top right, green lines), where the spread among simulations with different bias correction parameters is comparatively large. However, when compared to differences due to changes in SRES or GCM, the former differences are negligible. For most months and in all four panels, the groups of same-colored lines do not overlap one another. This implies that the uncertainty due to the choice of GCM is greater than that due to the choice of bias correction decade. Also, by comparing the two columns, it is clear that the same can be said about the changes in SRES. A similar analysis for integrated runoff and precipitation yields similar results (not shown).

Figure 4.

Ratio of 99th percentile and mean of daily discharge for each calendar month. Shown are results for (top) Danube and (bottom) Nile catchments and for (left) B1 and (right) A2 SRES. Green, black and red lines represent hydrological simulations that used CNRM, ECHAM5 and IPSL output respectively as forcing data. Lines in different styles (solid, dashed, dotted and short dashed) represent different choices of decade to constrain bias correction parameters.
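The statistic plotted in Figure 4 can be sketched as below. This is a minimal illustration with assumptions that are not confirmed by the text: all days of a given calendar month are pooled across the 15 years before the percentile and the mean are taken, a 365-day calendar is used, and the function name and synthetic data are hypothetical.

```python
import numpy as np

# ASSUMPTION: 365-day calendar without leap days.
DAYS_IN_MONTH = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])


def p99_to_mean_by_month(daily):
    """Return 12 ratios of the 99th percentile to the mean of daily values.

    `daily` has shape (years, 365); for each calendar month, the values of
    that month from all years are pooled before computing the ratio.
    """
    daily = np.asarray(daily, dtype=float)
    month_of_day = np.repeat(np.arange(12), DAYS_IN_MONTH)  # length 365
    ratios = np.empty(12)
    for month in range(12):
        values = daily[:, month_of_day == month].ravel()
        ratios[month] = np.percentile(values, 99) / values.mean()
    return ratios


# A constant series has no extremes, so every ratio equals one.
constant_ratios = p99_to_mean_by_month(np.full((15, 365), 3.0))

# A positively skewed synthetic series gives ratios above one.
rng = np.random.default_rng(1)
skewed_ratios = p99_to_mean_by_month(rng.gamma(shape=2.0, scale=1.0, size=(15, 365)))
```

Because the ratio is dimensionless, it allows the extremes of catchments with very different mean discharge to be compared on one scale.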

4. Discussion and Conclusion

[13] While statistical bias correction does not add to the skill of any climate model simulation, such schemes are currently indispensable when climate model data - e.g. daily values of precipitation and temperature - are to be fed into hydrological and other impact models. However, limited historical time series and the associated inter-decadal variability give rise to some degree of uncertainty in the bias correction methodology itself. In this article, we therefore addressed the question: ‘To what extent does the choice of decade for bias correction estimation further obscure future climate projections?’ To this end, we have broken down the control period of observations and model data into equal time intervals and computed the corrections on each segment separately - thereby artificially introducing a level of noise in the correction procedure. We have then examined the uncertainties in the projections resulting from the choice of any of the four different corrections. Finally, we have contrasted these uncertainties with those due to the choice of (i) different GCMs and (ii) different SRES emission scenarios. These uncertainties are also compared to the inter-annual variability. The comparison was performed with respect to precipitation, runoff and discharge. Our results show that the contribution to the final uncertainty in these three variables from the choice of correction decade is generally small compared to all other sources of uncertainty.

[14] We point out that the use of a longer derivation period - in this case 40 years instead of 10 years - would lead to less noise in the fit procedures and thereby further reduce the contribution of the bias correction uncertainty to the total uncertainty in the projection period. Furthermore, the choice of the global hydrological model (GHM) will introduce further uncertainties into the GCM-GHM modelling chain. The assessment of these uncertainties is beyond the scope of the present study and will be addressed in a future study.


[15] This work was supported by funding from the European Union within the WATCH project (contract 036946). The authors would like to thank Tobias Stacke (MPI-M) for implementing several modifications into MPI-HM. The GCM data were obtained from the CERA database at the German Climate Computing Center (DKRZ) in Hamburg.

[16] The Editor wishes to thank an anonymous reviewer for their assistance evaluating this paper.