A new quantile-based mapping method is developed for the bias correction of monthly global circulation model outputs. Compared to the widely used quantile-based matching method that assumes stationarity and only uses the cumulative distribution functions (CDFs) of the model and observations for the baseline period, the proposed method incorporates and adjusts the model CDF for the projection period on the basis of the difference between the model and observation CDFs for the training (baseline) period. Thus, the method explicitly accounts for distribution changes for a given model between the projection and baseline periods. We demonstrate the use of the new method over northern Eurasia. We fit a four-parameter beta distribution to monthly temperature fields and discuss the sensitivity of the results to the choice of distribution range parameters. For monthly precipitation data, a mixed gamma distribution is used that accounts for the intermittent nature of rainfall. To test the fidelity of the proposed method, we choose 1970–1999 as the baseline training period and then randomly select 30 years from 1901–1999 as the projection test period. The bootstrapping is repeated 30 times to mimic different climate conditions that may occur, and the results suggest that both methods are comparable when applied to the 20th century for both temperature and precipitation for the examined quartiles. We also discuss the dependence of the bias correction results on the choice of time period for training. This indicates that the remaining biases in the bias-corrected time series are directly tied to the model's performance during the training period, and therefore care should be taken when using a particular training time period. When applied to the Intergovernmental Panel on Climate Change fourth assessment report (AR4) A2 climate scenario projection, the data time series after bias correction from both methods exhibit similar spatial patterns. 
However, over regions where the climate model shows large changes in projected variability, there are discernible differences between the methods. The proposed method is more sensitive to a reduction in variability, exemplified by wintertime temperature. Further synthetic experiments using the lower 33% and upper 33% of the full data set as the validation data suggest that the proposed equidistant quantile-matching method is more efficient in reducing biases than the traditional CDF mapping method for changing climates, especially for the tails of the distribution. This has important consequences for the occurrence and intensity of future projected extreme events such as heat waves, floods, and droughts. As the new method is simple to implement and does not require substantial computational time, it can be used to produce auxiliary ensemble scenarios for various climate impact-oriented applications.
1. Introduction
 Global circulation models (GCMs) are an important tool for assessing climate change and for supporting decisions on how to cope with the consequences of a changing climate (World Modeling Summit for Climate Prediction, 2008; http://wcrp.ipsl.jussieu.fr/Workshops/ModellingSummit/Documents/FinalSummitStat_6_6.pdf). These numerical coupled models, based on the governing physical laws expressed in the form of primitive equations, represent large-scale (generally hundreds of kilometers) flow patterns and dynamics for Earth system components, including the atmosphere, ocean, land surface, and sea ice. It is on such spatial scales that GCMs exhibit encouraging skill in simulating climate responses [Randall et al., 2007]. However, their ability to capture local-scale (usually tens of kilometers) or even regional-scale patterns that are directly relevant to end users for decision making and mitigation strategy planning is less promising, especially for precipitation [e.g., Wood et al., 2004]. Thus, there is a mismatch between the capacity of current GCMs and the detail desired at local scales. Superimposed on this is the inevitable model bias due to inadequate knowledge of key physical processes (e.g., cloud physics) and simplification of the natural heterogeneity of the climate system that exists at finer spatial scales. Although systematic improvements have been reported in the latest generation of climate models [Reichler and Kim, 2008], a general cold bias is still evident for temperature, and substantial precipitation biases, especially in the tropics, are not uncommon in many models [Randall et al., 2007].
 Downscaling approaches, either physical process-based dynamic downscaling or statistical downscaling, are required to remove systematic biases in models and transform simulated climate patterns on a coarse grid to a finer spatial resolution of local interest [Maurer and Hidalgo, 2008]. The dynamic approach uses limited area models or high-resolution GCMs to simulate physical processes at fine scales with boundary conditions given by the coarse-resolution GCMs. The statistical approach transforms coarse-scale climate projections to a finer scale through trained transfer functions that connect the climate at the two spatial resolutions. To capture the anthropogenic climate change signal, the choice of predictor variables is a critical step [Hewitson and Crane, 2006]. Two important considerations are that (1) the selected predictors should reflect the primary circulation dynamics of the atmosphere reasonably well and (2) there should be a physical connection to the predictand. There are also statistical downscaling methods primarily for the purpose of bias correction which involve some form of transfer function derived from cumulative distribution functions (CDFs) of observations and model simulations [e.g., Ines and Hansen, 2006; Piani et al., 2010; Wood et al., 2004]. The advantages and disadvantages of both approaches have been thoroughly documented [e.g., Fowler et al., 2007; Wilby et al., 2009]. The key advantage of the statistical approach is the lower computational requirement compared to the dynamical model–based alternative, and thus, statistical downscaling approaches are widely used in climate impact–related research work.
 An underlying assumption for statistical approaches is that the statistical model which connects the large-scale circulation and the local climate (or the transfer function relating observations to model simulations) remains unchanged in an altered climate, which may not hold. However, if the time series used to train the statistical model is long enough, many different situations, including the altered climate, may be included [Zorita and von Storch, 1999]. In applying transfer functions to future climate projections, many existing statistical bias correction methods simply assume the higher-order moments do not change much in the future climate scenario, yet the latest report from the IPCC suggests otherwise [Meehl et al., 2007]. The nonstationarity issue has a profound influence on climate impact studies. Neglecting changes in the distribution, particularly higher-order moments, may result in underestimation of future climate extremes such as heat waves related to temperature or droughts and floods associated with precipitation extremes. In this study, we focus on statistical bias correction and propose a simple methodology called equal distance-based quantile-to-quantile matching to correct monthly precipitation and temperature fields from the World Climate Research Program's Coupled Model Intercomparison Project phase 3 (CMIP3) GCM simulations that formed the basis for the fourth assessment report (AR4) of the IPCC. The proposed method has the advantage that it explicitly incorporates changes in the distribution in the future climate. We will show that changes in the tails of the distribution can be better reflected with our method when compared to the traditional quantile-based mapping method (see section 4 for details).
 The paper is structured as follows. In section 2, we describe the study domain, followed by the data and methodology in sections 3 and 4, respectively. Section 5 presents results of the methodology validation. Section 6 details the application of the methodology to future projections of precipitation and temperature. Finally, discussion and conclusions are given in section 7.
2. Study Domain
 We select the Northern Eurasian Earth Science Partnership Initiative (NEESPI) domain as the study area, which is defined as extending from 15°E in the west to the Pacific Coast in the east and from 40°N in the south to the Arctic Ocean coastal zone in the north. In this study, we extend the NEESPI domain southward to 34°N. The NEESPI area accounts for about 19% of the global land surface, covering diverse climate zones from Mediterranean climate in the southwest coast to subarctic climate in Siberia in the far northeast. This region is also characterized by a carbon-rich, cold region component of the Earth system with potentially large implications for the global climate system [Groisman et al., 2009]. The NEESPI region experiences an amplified response to climate forcing due to positive feedbacks [Moritz et al., 2002] related to, for example, albedo reduction, glacial and sea ice retreat, and vegetation expansion processes. Climatic changes have already been reported in this region in recent decades, including increased winter air temperature [Rawlins and Willmott, 2003; Robeson, 2004], reduced snow cover [Serreze et al., 2000], sea ice retreat [Serreze et al., 2003; Serreze and Francis, 2006], and increased runoff and river discharge [Peterson et al., 2002]. These rapid changes have raised concerns of abrupt climate change. For regions north of 60°N for the 21st century, regional model simulations project continuous warming (particularly large in autumn and early winter, being 2°C–4°C higher than spring and summer) and an overall wetter climate. The associated impacts include widespread reductions in snow cover and a shortened snow cover season, accompanied by more frequent climate extremes [Christensen et al., 2007] that may have devastating consequences. A recent study by Solomon et al. points out that greenhouse gas–induced changes may be irreversible in the time frame of a millennium.
Thus, there is an urgent need to investigate potential changes, especially those beyond the climatological change in this region, to form the scientific basis for possible disaster mitigation and design strategies for adaptation.
3. Data Sets
 The observations used in this study are gridded monthly temperature and precipitation data prepared by the Climatic Research Unit (CRU) of the University of East Anglia at 0.5° resolution for 1901–2000 [New et al., 2002]. The CRU data set is one of the best available consistent long-term gridded observational records. These data are averaged to 1.0° resolution. It should be noted that the CRU data set does not make adjustments for gauge undercatch of snowfall. For high-latitude basins, this means wintertime snowfalls may be underestimated by the CRU data set, and some modeling has shown that this may be important [e.g., Tian et al., 2007]. However, other studies have shown that gauge undercatch adjustments overpredict regional observations of precipitation and snow water equivalent [Troy et al., 2008]. For illustration purposes, we use the model outputs from the Parallel Climate Model (PCM1) [Meehl et al., 2001, 2004; Washington et al., 2000] developed at the National Center for Atmospheric Research. We choose to use one ensemble member for simplicity and to illustrate the method. However, the method can be readily applied to an ensemble of realizations to form an envelope to reflect the uncertainty in future projections. We use the model data sets for the 20th century (20C3M) and the Special Report on Emissions Scenarios A2 future scenarios for 2001–2099 [Nakićenović and Swart, 2000], which were run as part of CMIP3. The A2 scenario is generally regarded as a worst-case scenario that sees a fourfold to fivefold increase in CO2 emissions over 2000–2099, during which CO2 concentrations increase from about 350 to 850 ppm. The model data are regridded to the same resolution as the observations (1.0°) using bilinear interpolation. Thus, the preprocessed model outputs are a combination of inherent model errors and uncertainties introduced by the interpolation scheme.
If a much finer resolution is desired, a more sophisticated spatial downscaling scheme utilizing auxiliary information, for example, elevation and pressure level, may be used. The CRU data are used to bias correct the 20C3M model data, and these corrections are then applied to the A2 future projections to provide bias-corrected future climate data.
Figure 1 compares the basic statistics of temperature for the observations and 20C3M data for 1970–1999 and the A2 scenario data for 2070–2099. The model has a cool bias, most noticeable in higher latitudes, which is typical of most climate models [Randall et al., 2007]. High latitudes generally show stronger variability regardless of the season. For the baseline period of 1970–1999, PCM1 captures the spatial distribution of regions with strong variability but exaggerates wintertime variability. For July, there is less agreement for high-variability regions between the PCM1 and CRU data. In addition to an increase in mean for the future projection, the model predicts a decrease in wintertime variability but an increase in summertime. The higher-order moment skew also changes in the future. It is clear that change in the distribution of the future climate is not limited to shifts in the mean; there are also changes in higher-order moments. These changes are more important for resources managers and stakeholders, as the consequences are usually more devastating. This supports our intention to develop the new quantile-mapping technique that deals with nonstationarity in climate time series and particularly in higher moments. On an annual scale, a general cold bias exists in the multimodel averages, usually less than 2°C outside of the polar regions. Individual CMIP3 models can exhibit larger errors. For PCM1, the errors are as large as 5°C in the Sahara desert and coastal areas of midlatitudes to high latitudes [Randall et al., 2007]. In terms of seasonal patterns, the cold bias in summer (July) is more pronounced than that in winter (January). For 1970–1999, the spatially averaged mean bias is about −1.2°C in January and up to −3.9°C in July.
 For precipitation (Figure 2), the NEESPI region has a very diverse climate, characterized by low precipitation over the majority of inland areas in wintertime as a result of the strong influence from cold and dry polar air masses from high latitudes. The exception is the Mediterranean region, which is characterized by a seasonal peak in winter. In summertime, the Asian monsoon brings moist air from the tropical Pacific Ocean, resulting in high precipitation in East Asia. Extremely dry summers for the Mediterranean coasts are caused by the sinking air of the subtropical high. The large-scale spatial structure of modeled precipitation agrees with observations for the period 1970–1999, suggesting that the model captures the large-scale dynamics reasonably well. The precipitation brought by extratropical cyclones along midlatitude storm tracks through western Europe and into Eurasia [Money, 2000] is overestimated by the model. This has also been noted by Randall et al., who found a positive bias in the PCM1 compared to the Climate Prediction Center Merged Analysis of Precipitation [Xie and Arkin, 1997] climatology over Eurasia. This may be an indication that the model-simulated westerlies are too strong or the model topography fields are oversimplified (too flat), allowing depressions to penetrate farther eastward into central Eurasia. Also, the precipitation variability is underestimated during the monsoon season for coastal areas of East Asia. The same can be seen for the Mediterranean region during wintertime. Regions with high precipitation are usually characterized by relatively high variability. CMIP3 models in general adequately capture most climatic features, for example, lower precipitation at higher latitudes. The models also represent well a local minimum in precipitation near the equatorial Pacific and local maxima at midlatitudes due to frequent storms.
Despite the apparent skill exhibited by the CMIP3 multimodel mean, models individually still display substantial biases that may exceed the magnitude of the mean observed precipitation climatology in the tropics in particular [Randall et al., 2007]. Under the A2 scenario, July precipitation is projected to increase more than 50% for the Tibetan Plateau, which is picked up by most CMIP3 models, and the higher the elevation, the more the increase. A reduction in precipitation is projected for central Asia and southern Europe, due to more easterly and anticyclonic flow. However, there is less consensus among GCMs for the summertime precipitation decrease for these regions [Christensen et al., 2007].
4. Methodology
 The goal of this study is to develop a simple and effective statistical bias correction method. The basic principle, regardless of the complexity of the statistical model, is to establish a statistical relationship or transfer function between model outputs and observations based on available historical data sets and then apply the established transfer function to future model projections to infer the possible trajectory of future observations. The quantile-based mapping method (CDF matching (CDFm hereafter)) [Panofsky and Brier, 1968] maps the distribution of monthly GCM variables (precipitation and temperature) onto that of gridded observed data. The method is a relatively simple approach that has been successfully used in hydrologic and many other climate impact studies [e.g., Cayan et al., 2008; Hayhoe et al., 2004; Maurer and Hidalgo, 2008]. For a climate variable x, the method can be written as
$$x_{m\text{-}p,\mathrm{adj}} = F_{o\text{-}c}^{-1}\left(F_{m\text{-}c}\left(x_{m\text{-}p}\right)\right) \qquad (1)$$
where F is the CDF of either the observations (o) or model (m) for a historic training period or current climate (c) or future projection period (p). To bias correct model values for a future period, we first find the corresponding percentile values for these future projection points in the CDF of the model for the training period and then locate the observed values for the same CDF values of the observations. These are the model values after bias correction. Figure 3a illustrates how the method works. The CDFs for the observations and model are constructed from the January temperature field for a point near 60°N, 150°E. According to equation (1), a value of 25 in a future projection time series corresponds to an $F_{m\text{-}c}$ value (the model CDF for the training period) of 0.2 under the current climate, which can be transferred to the observed value according to the quantile function of the observations (the solid circle). The major advantage of the method is that it adjusts all moments (i.e., the entire distribution matches that of the observations for the training period) while maintaining the rank correlation between models and observations. However, an underlying assumption of the method is that the climate distribution does not change much over time, that is, it is stationary in the variance and skew of the distribution, and only the mean changes. This, however, may not hold [Meehl et al., 2007; Milly et al., 2008] with the possibility that the model projects changes in the higher moments as well, as shown in Figures 1 and 2. One way to allow for this is to incorporate information from the CDF of the model projection instead of assuming that the historic model distribution applies to the future period. For a given percentile, we assume that the difference between the model and observed value during the training period also applies to the future period, which means the adjustment function remains the same.
However, the difference between the CDFs for the future and historic periods is also taken into account. Accordingly, the method can be written mathematically as
$$x_{m\text{-}p,\mathrm{adj}} = x_{m\text{-}p} + F_{o\text{-}c}^{-1}\left(F_{m\text{-}p}\left(x_{m\text{-}p}\right)\right) - F_{m\text{-}c}^{-1}\left(F_{m\text{-}p}\left(x_{m\text{-}p}\right)\right) \qquad (2)$$
This method is termed the equidistant CDF matching method (EDCDFm). Figure 3b illustrates how the method works. A value of 25 corresponds to 0.05 in the CDF of the future model projection. For the 5th percentile, the difference between the model and observations under current climate is subtracted from the model projection to get the bias-corrected value (the solid circle in Figure 3b, which, in this example, is different from that estimated by the CDFm method). The same procedure is repeated for every point for the future projection time series, which can then be used to construct the CDF for the bias-corrected future projection time series (the dotted-dashed line). If the distribution for the future climate is the same as the current one (training period), the results from the CDFm and EDCDFm methods will be identical. Moreover, if the changes in variability are small, results from both methods will be close. In this example, however, the distribution for the model has changed from the training period to the future period. As a comparison, the distribution function using the method of EDCDFm is different from that using CDFm, especially near the tails. Figure 3 is solely for the purpose of illustration; the method is validated comprehensively in sections 5 and 7.
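The two mappings described above can be contrasted in a few lines of code. The following is a minimal sketch rather than the implementation used in this study: it assumes normal distributions (Python's `statistics.NormalDist`) as stand-ins for the fitted CDFs, and the function names `cdfm` and `edcdfm` as well as all numerical values are hypothetical.

```python
from statistics import NormalDist


def cdfm(x, obs_c, mod_c):
    """Traditional CDF matching: look up x in the training-period model CDF,
    then read off the observed value at the same cumulative probability."""
    return obs_c.inv_cdf(mod_c.cdf(x))


def edcdfm(x, obs_c, mod_c, mod_p):
    """Equidistant CDF matching: look up x in the projection-period model CDF,
    then add the training-period (observed minus model) difference at that
    same cumulative probability."""
    u = mod_p.cdf(x)
    return x + obs_c.inv_cdf(u) - mod_c.inv_cdf(u)


# Hypothetical January temperatures (deg C). In the training period the model
# is too cold and too variable; the projection is warmer and less variable.
obs_c = NormalDist(mu=-20.0, sigma=4.0)   # observations, training period
mod_c = NormalDist(mu=-23.0, sigma=6.0)   # model, training period
mod_p = NormalDist(mu=-18.0, sigma=4.5)   # model, projection period

for prob in (0.05, 0.50, 0.95):
    x = mod_p.inv_cdf(prob)  # raw projected value at this percentile
    print(prob, round(cdfm(x, obs_c, mod_c), 2),
          round(edcdfm(x, obs_c, mod_c, mod_p), 2))
```

If `mod_p` is replaced by `mod_c`, the two functions return the same value, in line with the statement that the methods coincide when the distribution does not change. With the reduced projected variability assumed here, the range between the 5th and 95th percentiles after correction is narrower for `edcdfm` than for `cdfm`.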
 If empirical CDFs are used for the quantile mapping, frequent interpolation or extrapolation is required, which is unsatisfactory. We therefore fit parametric distributions to both the temperature and precipitation fields. A four-parameter beta distribution is fitted to temperature fields:
$$f(x; a, b, p, q) = \frac{1}{B(p, q)\,(b - a)^{p + q - 1}}\,(x - a)^{p - 1}\,(b - x)^{q - 1}, \qquad a \le x \le b;\; p, q > 0 \qquad (3)$$
where B is the beta function. We first approximate the range parameters (a and b) as the extreme values from the data, extended by a certain percentage of the standard deviation (σ), similar to the approach of Watterson. Once the range parameters are determined, the shape parameters p and q can be estimated by maximum likelihood. To investigate the sensitivity of the fitted distribution to the choice of range parameters, we extended the extreme values by 5% to 200% of σ. The differences between the constructed CDFs are small once the selected percentage value is above 30%. We therefore set the range parameters for every grid point as the extreme values extended by half of the σ value. Our analysis suggests that this value is adequate to enclose possible extreme values in future projections.
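The choice of range parameters can be illustrated with a short sketch. It makes two assumptions that are not part of the method as described: the shape parameters are estimated here by the method of moments instead of maximum likelihood, purely to keep the example dependency-free, and the function names and sample values are hypothetical.

```python
from statistics import mean, stdev, variance


def range_parameters(data, frac=0.5):
    """Range parameters (a, b): sample extremes extended by frac * sigma.
    frac=0.5 corresponds to 'half of the sigma value' in the text."""
    sigma = stdev(data)
    return min(data) - frac * sigma, max(data) + frac * sigma


def shape_parameters_moments(data, a, b):
    """Method-of-moments estimates of the shape parameters p and q.
    NOTE: a stand-in for the maximum likelihood fit named in the text."""
    y = [(v - a) / (b - a) for v in data]   # rescale to the unit interval
    m, v = mean(y), variance(y)
    common = m * (1.0 - m) / v - 1.0        # must be positive for a valid fit
    return m * common, (1.0 - m) * common


# Hypothetical January temperatures (deg C) at a single grid point.
temps = [-30.0, -25.0, -22.0, -20.0, -18.0, -15.0, -10.0]
a, b = range_parameters(temps)
p, q = shape_parameters_moments(temps, a, b)
print(a, b, p, q)
```

Calling `range_parameters` with different values of `frac` (for example 0.05 to 2.0) gives a simple way to repeat the sensitivity check described above on any sample.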
 Precipitation is intermittent in nature, and even at monthly time scales a mixture of months with no rain and months with rain can occur, especially for dry regions. We construct a mixed gamma distribution to account for this feature. The CDF for the mixed distribution can be written as
$$G(x) = (1 - P)\,H(x) + P\,F(x) \qquad (4)$$
where P is the percentage of months with rain and H(x) is a step function having a value of zero when there is no rain and a value of 1 when there is rain. For the portion of a given time series with rain, a two-parameter gamma distribution is used:
$$f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha - 1}\,e^{-x/\beta}, \qquad x \ge 0;\; \alpha, \beta > 0 \qquad (5)$$
with shape parameter $\alpha$ and scale parameter $\beta$.
At the monthly scale, the majority of the NEESPI region has a value of 1 for P. Regions with values of <1 are limited to small dry areas: centered on central Asian countries east of the Caspian Sea (including northern Iran, Afghanistan, Turkmenistan, and Uzbekistan) in summer and western China and the Tibetan Plateau in winter. Note that the PCM1 model roughly captures the geographical distribution for the summer dry regions but totally misses the dry region in wintertime, and the implications of this in the bias correction are discussed in section 5.1.
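The mixed distribution can be evaluated with a short sketch such as the one below. It is illustrative only: the gamma CDF is computed with a standard power series for the regularized incomplete gamma function (adequate for moderate values of x/β), the symbols `alpha` and `beta` for the shape and scale parameters and all function names are assumptions, and the step function follows the definition given in the text (zero for no rain, one for rain).

```python
import math


def wet_fraction(series):
    """P: fraction of months with nonzero precipitation."""
    return sum(1 for v in series if v > 0) / len(series)


def gamma_cdf(x, alpha, beta, tol=1e-12, max_terms=10000):
    """CDF of a two-parameter gamma distribution (shape alpha, scale beta),
    using the power series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    n = 1
    while term > tol * total and n < max_terms:
        term *= z / (alpha + n)
        total += term
        n += 1
    return total * math.exp(-z + alpha * math.log(z) - math.lgamma(alpha))


def mixed_gamma_cdf(x, p_wet, alpha, beta):
    """G(x) = (1 - P) H(x) + P F(x), with H(x) = 0 for no rain and 1 for rain."""
    h = 1.0 if x > 0 else 0.0
    return (1.0 - p_wet) * h + p_wet * gamma_cdf(x, alpha, beta)


# Hypothetical monthly totals (mm) for a dry grid point.
rain = [0.0, 0.0, 4.0, 12.0, 7.5, 0.0, 20.0, 3.0, 9.0, 15.0]
p_wet = wet_fraction(rain)
print(p_wet, mixed_gamma_cdf(10.0, p_wet, alpha=1.5, beta=6.0))
```

For grid points where P equals 1, `mixed_gamma_cdf` reduces to the ordinary gamma CDF, which is the case for most of the domain at the monthly scale.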
5. Methodology Validation
5.1. Validation Using a Bootstrapping Approach
 The validation of a method or model is done traditionally by testing its performance for a specific historical period, say, 30 years, for which observations are available, a so-called “pseudoprojection.” However, the performance statistics calculated in this way may not be representative, as the results may be different if we choose a different validation period. This may arise since the model performance is time dependent, particularly for areas with pronounced seasonality and large variability, for example, wintertime temperature for continental interior and summertime precipitation for high latitudes. Ideally, we should use as many different periods as possible for the validation, and some statistics can then be derived to assess the performance of the methodology. Limited by the available observational records (100 years), we use a bootstrapping approach. We also sampled continuous overlapping 30 year periods (e.g., 1901–1930, 1911–1940), but these gave essentially similar results to those of the bootstrapping approach. Specifically, we choose 1970–1999 as the training period, in accordance with many climate impact studies. Then we randomly select 30 years out of 100 (1901–2000) as the test data. This procedure is repeated 30 times. The model performance is then evaluated in terms of biases in reproducing the observations for selected quantiles. The results collectively tell how well the method performs using 1970–1999 as the baseline climatology.
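The resampling step can be sketched as follows. This is an illustration under stated assumptions, not the code used for the experiments: years are drawn without replacement within each sample (the text does not specify this), the seed is fixed only for reproducibility, and the function names are hypothetical.

```python
import random
from statistics import quantiles


def bootstrap_years(first=1901, last=2000, n_years=30, n_rep=30, seed=0):
    """Draw n_rep test samples, each containing n_years distinct years.
    ASSUMPTION: sampling without replacement within each sample."""
    rng = random.Random(seed)
    pool = range(first, last + 1)
    return [sorted(rng.sample(pool, n_years)) for _ in range(n_rep)]


def quartile_bias(model_values, obs_values):
    """Bias (model minus observation) at the 25th, 50th and 75th percentiles."""
    q_model = quantiles(model_values, n=4)
    q_obs = quantiles(obs_values, n=4)
    return [m - o for m, o in zip(q_model, q_obs)]


samples = bootstrap_years()
print(len(samples), samples[0][:5])
```

For each sample, the bias-corrected and observed values for the selected years would be passed to `quartile_bias`, and the results averaged over the 30 repetitions.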
Figure 4 shows the spatial patterns for median values averaged from the 30 bootstrap experiments using the CDFm and EDCDFm methods. For temperature, the spatial patterns of the bias-corrected fields resemble the observations for both the cold (January) and warm (July) seasons (Figure 4). Similar conclusions can be drawn for other quartiles as well. There are distinct differences between the January and July model biases. For January, PCM1 is too warm for a large area of continental interior and northern Asia north of 60°N but is too cold for the northwest and southeast portions (Figures 4a and 4b). Overall, PCM1 exhibits a moderate cold bias when averaged spatially but with a large spread in winter (Figure 5). The results after bias correction show a dramatic reduction in the mean and range of the biases. The results for the EDCDFm and CDFm methods are very similar (Figures 4c, 4d, and 5). For July, the model shows a strong and systematic cold bias for nearly the entire NEESPI region (Figures 4e and 4f) with a mean bias of over 4° for all quartiles (Figure 6), though there is a much smaller variability compared to the results for January. Both the EDCDFm and CDFm methods are very effective in terms of reducing the original model biases (Figures 4g, 4h, and 6). The remaining biases, after bias correction, also tend to cluster more tightly around zero compared to the raw PCM1 data, indicating a much smaller mean and greatly reduced variability (Figure 6). It is worthwhile to note that the EDCDFm and CDFm methods exhibit comparable skill with no one method being superior to the other, at least for the 20th century data and for the examined quartile values.
 For precipitation, an issue arises in some dry regions where the model has essentially no skill at all: either (1) the observations have no or very little rainfall in a few months, but the model indicates that it rains every month and total rainfall can be of the order of 1000 mm, or (2) the observations indicate significant rainfall (hundreds of millimeters), but the model is very dry. This issue arises for about 0.4% of grid points in January and 0.2% in July. For these grid points, we use the ratio of the future to the current climate in the model instead of the difference as the transfer function to adjust the observed CDFs. The method works well for those problematic grid points. As the model shows such poor skill for these grid points, it may even be more appropriate to directly use the observed climatology (perhaps with the addition of some white noise) instead of attempting to bias correct the model data.
Figure 7 shows the median values for the precipitation field (January in Figures 7a–7d and July in Figures 7e–7h). Again, both the EDCDFm and CDFm methods show improvements compared to the original model outputs with more realistic spatial patterns. In January, the negative bias in the PCM1 for the Mediterranean regions and positive bias for the East Asian Peninsula (Figures 7a and 7b) are substantially reduced by both methods (Figures 7c and 7d). The underestimation of summertime Asian monsoon rainfall for East Asia is also dramatically improved after bias correction (Figures 7e–7h). In terms of the remaining biases, the spatial patterns for the different quartiles are very similar for the two methods (Figures 8 and 9). Over regions with relatively high precipitation and high variability (Figure 2) the biases tend to also be relatively large. This is understandable, as such regions are usually characterized by strong dynamics, which can be difficult to simulate. Both the EDCDFm and CDFm are able to remove biases considerably, but less skillfully in July compared to January. Relatively large biases are still apparent for regions with high precipitation and large variability (up to 30 mm/month), especially for July, though of a much smaller magnitude than the model biases.
 The reason for the similar performance from both statistical bias correction methods may be explained by the way the sampling is done in the bootstrapping experiment and the very limited length of the climate records (30 out of 100 points for each sample). Given long enough climate records and substantially more samples, we may pick up extreme distributions, which can be used to investigate the sensitivity of methods to data having quite different distributions. As these conditions, especially the former, are hard to satisfy, we have designed another synthetic experiment specifically for the purpose of examining climate extremes.
5.2. A Synthetic Experiment for Bias Correction of Climate Extremes
 Our main goal is to develop a method that can better handle a changing climate, including changes in variability and extreme monthly values. Therefore, a key question is whether the EDCDFm is more efficient in dealing with changed variability than the CDFm. To answer this question, we conducted a synthetic experiment using the 20th century temperature data. First, we used the entire 20th century data (1901–1999) for training and then chose the lower 33% and upper 33% of the full 20th century data for testing, thus representing the two extreme situations. We then calculated the mean absolute biases, root-mean-square error (RMSE), and reduction of RMSE (one minus the ratio between the RMSE from the bias correction method and RMSE from PCM1) for the 5th (95th) percentiles of the lower 33% and upper 33% of the full data record. The results are presented in Table 1. First, both methods show significant reduction in all error statistics analyzed, in line with the findings reported in section 5.1. The reduction of RMSE ranges from 70% in January to about 95% in July. Both methods show smaller mean biases and RMSE in both wintertime and summertime, with greater skill in summertime as a result of the smaller variability. From Table 1, we may say that the EDCDFm method is superior to the CDFm method. The former shows, on average, a 4–5% (10%) further reduction in RMSE in summer (winter). Thus, the EDCDFm developed in this study seems more skillful in dealing with changed variability.
Table 1. Summary of Statistics for the Synthetic Bias Correction Experiments Using the Lower 33% and Upper 33% of the Full 20th Century Data Sets for Temperature (a)
(a) See text for details about the experiment design. MAB, mean absolute bias; RMSE, root-mean-square error; R-RMSE, reduction of RMSE, defined as 1 − RMSE_BC/RMSE_PCM1, where RMSE_BC is the RMSE of the model outputs after bias correction. The statistics represent spatial averages.
I, PCM1; II, EDCDFm; III, CDFm.
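The error statistics reported in Table 1 follow directly from their definitions. A minimal sketch, with hypothetical function names and made-up numbers used only to show the arithmetic:

```python
import math


def mean_absolute_bias(pred, obs):
    """MAB: mean of |prediction - observation|."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def rmse_reduction(rmse_bc, rmse_raw):
    """R-RMSE = 1 - RMSE_BC / RMSE_PCM1."""
    return 1.0 - rmse_bc / rmse_raw


# Made-up values at three grid points (deg C), for illustration only.
obs = [-20.0, -15.0, -10.0]
raw = [-24.0, -19.0, -14.0]        # raw model output
corrected = [-21.0, -15.0, -9.0]   # after bias correction

print(mean_absolute_bias(corrected, obs))
print(rmse_reduction(rmse(corrected, obs), rmse(raw, obs)))
```

In the study these statistics are spatial averages over all grid points for the selected percentiles.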
5.3. Choice of Training Period
 Another question that has not been addressed in the literature is how the results would differ when using another time period for training. To shed light on this, we carried out another bootstrapping test for the January temperature field using 1901–1930 as the training period. The spatial patterns are similar to those shown in Figure 4a. Both statistical bias correction methods produce data in better agreement with observations. However, it is also interesting to note that the spatial patterns for the presented mean biases differ from those depicted in Figure 5 (top) in terms of the sign of the mean biases: the slight positive biases for the southeast when using 1970–1999 switch to negative biases when using 1901–1930, and the northwest of Russia now exhibits positive biases (not shown). The analysis suggests that the bias-corrected time series are directly tied to the model's performance during the training period, supporting the argument that model biases are time dependent, and agrees with the finding by Reifen and Toumi. They demonstrated that GCMs that perform well in one period may not necessarily perform equally well for another period, possibly a result of nonstationarity of climate feedback strength or a model's representation of external forcings (e.g., CO2 concentration, aerosols). The results indicate that care should be taken when using a specified time period for training. However, if the temporal variation of the bias has a much smaller magnitude than the bias itself, we may simply choose a period we feel most confident with if the main objective is bias removal. The choice of the past few decades (1970–1999) as the training period for bias correction of future projections in many climate downscaling studies may be justified for the following reasons: (1) recent observations are more reliable, and (2) recent decades have experienced warming and thus are more likely to resemble future projections.
6. Future Projection
 In this section, we use the EDCDFm method to downscale a future model run of the A2 scenario, a high-emission scenario that is associated with high population growth, regionally oriented economic development, and slow technological change [Nakićenović and Swart, 2000]. The objectives are twofold: first, to see how the bias-corrected future projection compares to the original PCM1 model output and, second, for such a high-emission scenario with associated changes in temperature and precipitation, to see how different the results are when using the EDCDFm or the CDFm. We present the results for the period 2070–2099.
 For temperature, the two methods give very similar spatial patterns for the mean and variability (Figure 10). Compared to the original model projections, both methods show higher temperatures in July, suggesting a much stronger warming will occur (because of the cold 20th century bias in the model, which the methods correct for). In January, a colder northern Asia and warmer northwest and southeast are expected at the end of the 21st century under the A2 scenario. These features are consistent with the validation results discussed in section 5. The bias-corrected projections show a dramatic reduction in variability for large areas in January. On the other hand, enhanced variability is evident for Barents Sea coastal areas from both downscaling methods in July, a result of a combination of underestimated variability for the baseline period and increased variability in the future projection. However, there are some subtle differences, primarily related to the variability in January: the EDCDFm method shows smaller variability for the eastern coast and the central and eastern parts of northern Asia compared to that from the CDFm method (Figure 10). The reduced variability for these regions is a reflection of the method's adjustment to the exaggerated variability for the current climate and a reduced variability in future projection by PCM1. For these regions, EDCDFm tends to have a distribution with higher values for lower percentiles and lower values for higher percentiles. The further away from the 50th percentile, the larger the differences between these two methods, exceeding 1°C near the tails in January (Figure 11, top). The differences in July are relatively smaller in both magnitude and geographical coverage (Figure 11, bottom). The results also suggest that EDCDFm is more sensitive to reduced variability than the CDFm counterpart. 
Comparatively, July differences between these two methods are less discernible, though results from EDCDFm are about 0.5°C–1°C higher than those from CDFm for the coastal areas of the Kara Sea and Laptev Sea. The values between the 25th and 75th percentiles are comparable. The differences are more apparent below the 5th and above the 95th percentiles for January.
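 The narrowing at the tails can be illustrated with a minimal sketch. It assumes the equidistant formulation of EDCDFm, in which the baseline observation-minus-model difference at a given quantile is added to the projected model value. It also substitutes normal distributions for the four-parameter beta distribution used in this study, purely for brevity. The function names `cdfm` and `edcdfm` and all parameter values are hypothetical and are not taken from PCM1 or the observations.

```python
from statistics import NormalDist

# Hypothetical January temperature distributions (degC); not values from the study.
obs_base = NormalDist(mu=-20.0, sigma=4.0)  # observations, baseline period
mod_base = NormalDist(mu=-22.0, sigma=5.0)  # model, baseline: cold bias, exaggerated variability
mod_proj = NormalDist(mu=-21.5, sigma=4.0)  # model, projection: reduced variability


def cdfm(x):
    """Traditional CDF matching: map x through the baseline model CDF
    onto the baseline observed distribution."""
    return obs_base.inv_cdf(mod_base.cdf(x))


def edcdfm(x):
    """Equidistant CDF matching: add the baseline observation-minus-model
    difference, evaluated at the quantile of x in the projection CDF."""
    p = mod_proj.cdf(x)
    return x + obs_base.inv_cdf(p) - mod_base.inv_cdf(p)


percentiles = (0.05, 0.25, 0.50, 0.75, 0.95)
results = {}
for p in percentiles:
    x = mod_proj.inv_cdf(p)  # raw projected value at this percentile
    results[p] = (cdfm(x), edcdfm(x))
    print(f"p={p:.2f}  raw={x:7.2f}  CDFm={results[p][0]:7.2f}  EDCDFm={results[p][1]:7.2f}")

# Width of the 5th-95th percentile range after each correction
spread_cdfm = results[0.95][0] - results[0.05][0]
spread_edcdfm = results[0.95][1] - results[0.05][1]
print(f"5-95% range: CDFm={spread_cdfm:.2f}  EDCDFm={spread_edcdfm:.2f}")
```

 With these assumed parameters, the baseline model variability exceeds the observed variability and the projected variability is reduced. The EDCDFm values are then higher at the 5th percentile and lower at the 95th percentile than the CDFm values, while the medians remain close, which mirrors the qualitative behavior described for January.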
 The spatial structure of precipitation after bias correction is quite similar for the two methods (Figure 12). In wintertime, the bias-corrected precipitation is higher for the Mediterranean region and the northwest of Russia. A reduced precipitation amount is observed for Japan. In summertime, both downscaling methods suggest higher precipitation for East Asia, particularly for east coastal China, the Korean Peninsula, and Japan. These changes seem reasonable considering the biases in the model-simulated precipitation for the training period. The differences between these two methods are more subtle and isolated. With respect to differences for given percentiles, differences are small for lower percentile values, which comes as no surprise because the lower percentiles correspond to low precipitation amounts; thus, the absolute differences are small. On the other hand, higher percentiles correspond to higher precipitation amounts; thus, the absolute differences can be relatively large (up to 10 mm/month), especially for regions where the model shows marked changes in variability.
7. Discussion and Concluding Remarks
 We have developed a new quantile-based mapping method, called EDCDFm, for the bias correction of GCM near-surface meteorological fields and compared it to the traditional CDFm. Unlike most previous studies, which usually select only one time period for validation, we use a bootstrapping method that samples the full historic record to test the fidelity of the downscaling. This gives an indication of the robustness of the method and highlights regions of high confidence or large uncertainty. Both methods are able to reduce biases in the PCM1 model temperature and precipitation fields significantly. The validation results from the bootstrapping experiment suggest that the EDCDFm is comparable to the traditional quantile-based mapping method if we use 1970–1999 as the training period.
 We further designed a synthetic experiment to investigate the efficiency of both methods in reducing climate model bias in a synthetically generated changing climate. The analysis for selected statistical metrics (mean absolute bias, RMSE, reduction of RMSE) suggests that the EDCDFm method is superior to the CDFm method. Thus, the EDCDFm developed in this study seems more skillful in dealing with changed variability. This may have profound implications for climate impact studies that are concerned with changing climate variability.
 To investigate the sensitivity of bias correction results to the choice of different baseline periods, we performed the same bootstrapping experiments using the period of 1901–1930 as the training period. We found that the choice of baseline period does have an impact on the bias correction results: the bias-corrected results are tightly connected to the model biases for the training period. The results indicate that care should be taken when using a specified time period for training.
 In bias correcting future scenario projections, the differences between the two methods are subtle for the selected quartiles. However, for regions where the model exhibits changes in variability, the differences are more apparent at the tails of the distributions. The EDCDFm method is more sensitive to reduced variability in the future projection and tends to produce a probability density function that has a much narrower range than the CDFm method. For these regions, although the mean values are very close for both methods, the differences can be as large as 1°C when approaching the tails (e.g., January temperature projection for the 5th or 95th percentile). This may have profound implications for the hydrology in high-latitude regions where, for example, temperature-dependent snowmelt in the spring is the predominant source of water.
 Although statistical downscaling approaches do not provide a physical explanation for biases, they have a computational advantage over dynamic downscaling and have skill comparable to limited area climate models, at least for present climate conditions [Hay and Clark, 2003; Wilby and Wigley, 2000; Wood et al., 2004]. This makes statistical downscaling very attractive to climate impact and assessment applications because large ensembles can be generated relatively easily. On the other hand, the EDCDFm assumes a time-invariant bias (or transfer function) like other statistical downscaling techniques. This assumption may not necessarily hold. For example, if the region of interest experiences a different rainfall regime in the future as a result of changes in large-scale circulation, then the future bias would be controlled by different processes, and the behavior of the bias may change. This is a limitation of the downscaling method, but one that cannot be tested, as we cannot validate future projections. Compared to other statistical downscaling approaches, the EDCDFm has two attractive features. First, the method explicitly considers changes in the distribution of the future climate, including the tails of the distribution, which are most pertinent for climate impact and assessment studies. Second, the proposed method is very straightforward and simple to use. It is particularly useful for a large spatial domain such as the Eurasian continent or the entire globe. In contrast, many statistical downscaling approaches are based on regression-type models or weather classification schemes developed for small regions or specific sites and therefore are inappropriate for large-scale applications. Furthermore, nearly every statistical downscaling approach is developed for a specific purpose or location, and so there is no panacea, although general guidelines on the choice of statistical method have been developed [Wilby et al., 2004].
 The EDCDFm method can be readily applied to other climate model outputs in producing ensemble climate projections for a variety of scenarios. We have successfully applied the method to several other coupled climate model projections. Combined with a temporal disaggregation approach such as a climate analog approach [e.g., Benestad, 2001; Zorita and von Storch, 1999] or a more traditional random sampling method [e.g., Hirabayashi et al., 2005], gridded time series at daily or even finer temporal scales can be derived for, but not limited to, hydrologic impact studies. This is the subject of an accompanying study in which our approach is to sample the daily and subdaily variation from the historic observations (thus maintaining realistic temporal and spatial structure in the data) while also examining improved methods that take into account projected changes in the daily and subdaily statistics (e.g., storm frequencies and intensities and diurnal cycles of temperature and precipitation).
 Changes in the hydrological cycle, especially more frequent extreme events, can have the most devastating impacts on human society and result in losses exceeding billions of dollars. Meanwhile, there is increasing evidence of the intensification of the global hydrological cycle for the past century [Huntington, 2006; Milly et al., 2002]. The proposed method offers decision makers another possible scenario of the future climate. It is especially useful for regions experiencing changes at a relatively fast pace, such as high latitudes. These scenario forcings will be invaluable in assisting, for example, water resources planners to assess future flood risk, update reservoir operation rules, and initiate talks for sharing cross-border water resources, especially for water-scarce regions.
 The study is supported by NSF grant ARC-0629471 and NASA grant NNG06GE62G. Their support is gratefully acknowledged. We acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modeling (WGCM), for their roles in making available the WCRP CMIP3 multimodel data set. Support of this data set is provided by the Office of Science, U.S. Department of Energy.