Is Bias Correction in Dynamical Downscaling Defensible?

Localized projections of 21st‐century hydroclimate variables obtained from downscaling Global Climate Model (GCM) output are central to informing regional impact assessments and infrastructure planning. Regional GCM biases can be significant and, for dynamical downscaling, can be addressed either before (a priori) or after (a posteriori) downscaling. However, a priori bias correction (APBC) has generally unexplored effects on climate change signals. Here we analyze dynamically downscaled solutions of CMIP6 GCMs over the Western U.S., with and without APBC, and quantify APBC's impact on climate change signals relative to other irreducible uncertainty sources. For temperature and precipitation, the uncertainty introduced by APBC is negligible compared to that arising from GCM choice or internal variability. Furthermore, APBC greatly reduces regional models' unrealistically high snow‐water‐equivalent (SWE) biases that result directly from GCM errors. We leverage this finding to encourage the dynamical downscaling community to adopt APBC as a standard operating procedure.


Introduction
The weather patterns of the 21st century will change and produce surface air temperature and precipitation at local (sub-10 km) spatial scales that are fundamentally dissimilar from patterns observed during the historical record (Milly et al., 2008). Global Climate Models (GCMs) are our main tools to project the Earth system's response to anthropogenic forcings in the face of natural climate variability at continental-to-global and seasonal-to-centennial scales. At the same time, at any given location, GCMs predict a persistently large range of hydroclimate outcomes in the 21st century for a given future socioeconomic pathway (Eyring et al., 2016; Soden et al., 2018; Stouffer et al., 2017). The range of model projections is growing (Bourdeau-Goulet & Hassanzadeh, 2021), and some of this spread is due to more complete assessments of internal variability from ensembles of projections (Deser et al., 2020; Lehner et al., 2020).
The maturation of GCM ensembles along with computing advances has seen the emergence of dynamically downscaled ensembles (Mearns et al., 2017; Rastogi et al., 2022; Xu et al., 2021) that are used to translate information in the GCM ensembles to the local scale for regional impact assessments and infrastructure planning. Yet there remain biases in GCMs that, when transmitted to a regional climate model (RCM), lead to large biases in high-resolution outputs (McSweeney et al., 2015; Plavcová & Kyselý, 2012; Wu et al., 2005; Xu et al., 2021; Zhang et al., 2022). For impacts assessments and adaptation planning, biases in GCM simulations continue to be a persistent problem. When GCMs are downscaled, whether dynamically or statistically, such biases are a barrier to trusting the information contained in downscaled climate projections. GCM biases relative to historical observations and reanalyses at best reduce the interpretability and downstream usability, and at worst reduce confidence in the suitability of GCMs for such projections, reflecting the "garbage in, garbage out" issue in statistical and dynamical downscaling (Benestad et al., 2017). The transmission of these GCM biases into RCMs must be addressed by practitioners.
There is a large body of literature that explores the value and limitations of bias correction as part of producing downscaled projections and their subsequent use in impacts studies. For statistical downscaling, previous studies have focused on developing solutions under the principle that the signal of climate change from a GCM, irrespective of diagnosed GCM biases, should be preserved (Pierce et al., 2023). Bias correction is also commonly applied to dynamically downscaled data sets for use in impact assessment following downscaling. However, there are many documented issues that arise from bias correction that collectively introduce uncertainty into these solutions or undermine their credibility and are therefore controversial, for example, strange artifacts in the end product or nonphysical climate change signals (Ehret et al., 2012; Keller et al., 2022; Maurer & Pierce, 2014; Mehrotra & Sharma, 2015). In contrast, bias correction of GCM outputs before dynamical downscaling (a priori bias correction, or APBC) has been demonstrated to be a suitable pathway for achieving downscaled simulations with very small biases. Bruyère et al. (2014) showed the benefits of bias correcting the mean-state GCM outputs before downscaling in their examination of North Atlantic tropical cyclones for a single GCM. However, the broader applicability of this and other bias correction techniques, including their effects on uncertainty in and credibility of future climate projections, has only been explored sparingly (e.g., Xu et al., 2021).
As a result, basic questions surrounding the use of APBC in dynamical downscaling persist. First and foremost, should GCM outputs be bias corrected prior to RCM ingestion, and if so, how? As noted above, RCM outputs are generally bias corrected following downscaling (a posteriori) for community use. There is a tension here that is unique to dynamical downscaling: without bias correction, the dynamically downscaled solution maintains consistency with GCM dynamics, but also perpetuates its boundary condition biases. In the downscaled context, these biases may then be amplified in nonlinear, unphysical, and uncorrectable ways, making such solutions more difficult to use directly in applications. With APBC, however, the dynamically downscaled solution does not perpetuate GCM biases but may no longer be consistent with the dynamics of its parent GCM. Additionally, uncertainties may be introduced in the future change signal, which works against the climate community's efforts to reduce the spread in climate projections. While APBC solutions are more straightforward to use for applications, the impacts of APBC on physical consistency and added uncertainty are good reasons for dynamical downscalers to avoid the practice altogether. This tension is not resolvable without systematic analyses of the similarities and differences between dynamical downscaling solutions that are and are not bias corrected.
Regardless of the controversy surrounding APBC for downscaling future projections, it is uncontroversial that APBC substantially reduces downscaled biases (Bruyère et al., 2014). However, there have been no studies to date that explore the potential uncertainties introduced by APBC into future projections from a traditional variance decomposition perspective (Hawkins & Sutton, 2009), primarily due to a lack of paired downscaling experiments that allow one to systematically assess such uncertainties. For the first time, we undertake such an effort using a targeted nine-member suite of dynamically downscaled GCMs from Phase 6 of the Coupled Model Intercomparison Project (CMIP6) across the Western U.S., both with and without APBC. With these experiments, we determine when, where, and for which variables bias correction introduces variance into downscaled projections. If the APBC solutions can be interpreted as having the same likelihood as the non-APBC solutions, this additional variance can be interpreted as the uncertainty introduced by bias correction. Our analysis connects to, and also augments, the traditional variance decomposition framework (as introduced by Hawkins & Sutton, 2009) to assess the relative contribution of APBC to uncertainty in future projections.

Materials and Methods
The analysis presented here uses simulations from the Weather Research and Forecasting (WRF) model version 4.1.3 (Powers et al., 2017; Skamarock et al., 2019) using a configuration focused on the hydroclimate of the Western U.S. (Rahimi et al., 2022). Here, the WRF model is forced with initial and boundary conditions from Earth System Models that completed experiments from the Scenario Model Intercomparison Project (ScenarioMIP; O'Neill et al., 2016) as part of CMIP6 (Eyring et al., 2016). These WRF runs produce a dynamically downscaled estimate of the atmospheric and surface state across the Western U.S., archived at hourly temporal resolution and 9 km spatial resolution. Dynamical downscaling was performed for nine Earth System Models (see Table 1 in Supporting Information S1 or the legend in Figure 1) selected based on their skill in simulating a variety of large-scale measures of Northern Hemisphere mid-latitude circulation and its variability (Krantz et al., 2021; Simpson et al., 2020). For each GCM, a pair of WRF downscaling experiments is conducted: (a) using the unadjusted GCM output as boundary conditions and (b) applying bias correction to the mean-state GCM fields following Bruyère et al. (2014) before downscaling. The method decomposes a GCM-simulated variable, say x_GCM, into its historical (1980-2014) climatological mean x_GCM,0 plus the deviation x′_GCM: x_GCM = x_GCM,0 + x′_GCM. The same decomposition is then conducted for a reference data set, in this case the ECMWF's fifth-generation reanalysis, ERA5 (Hersbach et al., 2020): x_ERA5 = x_ERA5,0 + x′_ERA5. The mean-state bias Δ = x_GCM,0 − x_ERA5,0 is then subtracted from the original GCM signal to arrive at the bias-corrected quantity x_GCM,bc used to drive the WRF model: x_GCM,bc = x_GCM − Δ = x_ERA5,0 + x′_GCM. Bias correction is applied to surface pressure and temperature, mean sea-level pressure, and sea-surface temperatures, as well as to three-dimensional temperature, geopotential height, specific humidity, and horizontal winds. Note that this bias correction technique preserves the long-term trends and variability of the original GCM data. We refer the interested reader to Rahimi, Huang, Goldenson, et al. (2024) for further details on the method and downscaled data sets. To assess historical biases, we compare the WRF-downscaled solutions with PRISM daily mean temperature and daily accumulated precipitation (Daly et al., 1994) and the University of Arizona snow water equivalent data product (Broxton et al., 2016; Zeng et al., 2018). More information on our assessment of historical biases, projected changes and uncertainties, and sensitivities to elevation is provided in Supporting Information S1.
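Operationally, the correction amounts to subtracting a single climatological offset per field. A minimal NumPy sketch of this mean-state correction follows, assuming arrays with time as the leading axis; the function and variable names are hypothetical and not taken from the actual WRF preprocessing code.

```python
import numpy as np

def apbc(gcm_hist, gcm_full, ref_hist):
    """Mean-state a priori bias correction (Bruyere et al., 2014-style sketch).

    gcm_hist : GCM field over the historical reference period, shape (time, ...)
    gcm_full : full GCM record to correct (historical + future), shape (time, ...)
    ref_hist : reference (e.g., reanalysis) field over the same historical period

    Only a constant climatological offset is subtracted, so the GCM's
    long-term trends and variability are preserved by construction.
    """
    delta = gcm_hist.mean(axis=0) - ref_hist.mean(axis=0)  # mean-state bias
    return gcm_full - delta

# Toy check: a GCM series warm-biased by ~2 units versus a reference series
rng = np.random.default_rng(0)
gcm = rng.normal(loc=2.0, size=350)   # biased "GCM"
ref = rng.normal(loc=0.0, size=350)   # "reanalysis" reference
corrected = apbc(gcm, gcm, ref)
```

After correction, the historical-period mean matches the reference mean exactly, while year-to-year increments of the GCM series are untouched.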
To understand the time-varying proportion of total variability in the WRF-downscaled output due to model uncertainty, the effect of APBC, and internal variability, we use an approach related to the variance decomposition described in Hawkins and Sutton (2009); see Supporting Information S1. The entire procedure is applied to data from either (a) a single geospatial location (e.g., individual WRF model grid boxes) or (b) Western U.S.-wide area-weighted averages.
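As a rough illustration of how such a partition can work, the sketch below splits the spread in smoothed projected anomalies for one time slice into a model-choice term, an APBC term, and a supplied internal-variability term. It is a simplified stand-in for the decomposition detailed in Supporting Information S1, and all names and the data layout are hypothetical.

```python
import numpy as np

def variance_fractions(proj, resid_var):
    """Crude Hawkins & Sutton (2009)-style variance partition for one time slice.

    proj      : array of shape (n_models, 2); columns hold the smoothed
                projected anomaly without and with APBC for each model
    resid_var : variance of residuals about the smooth forced response,
                used here as the internal-variability term

    Returns (model, apbc, internal) fractions of total variance (sum to 1).
    """
    model_var = np.var(proj.mean(axis=1))     # spread across model means
    apbc_var = np.mean(np.var(proj, axis=1))  # mean spread across the APBC choice
    total = model_var + apbc_var + resid_var
    return model_var / total, apbc_var / total, resid_var / total

# Models disagree but APBC changes nothing -> all variance from model choice
fracs_model_only = variance_fractions(np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]), 0.0)
# Models agree but APBC shifts each model equally -> APBC and internal split
fracs_apbc = variance_fractions(np.array([[0.0, 2.0], [0.0, 2.0]]), 1.0)
```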

Results
We first compare domain-averaged historical biases in WRF-downscaled solutions versus the future change signal for seasonal mean daily precipitation, seasonal mean daily average temperature, and peak annual daily high-elevation snow water equivalent, both with (solid circles) and without (empty circles) APBC (see Figure 1). Historical biases are defined as the differences between observed (Daly et al., 1994; Zeng et al., 2018) and WRF-downscaled 1981-2010 climatology; projected changes are defined as the differences between WRF-downscaled end-of-century (2071-2100) and historical climatology. Projected changes include a 95% bootstrap confidence interval, and we examine low-elevation regions (less than 1,500 m above sea level) separately from high-elevation regions. As expected, for all variables and in all seasons, APBC significantly reduces historical biases relative to solutions without APBC.
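The confidence interval on a projected change can be obtained by resampling years. The sketch below is one common way to do this and is only a hedged stand-in for the paper's bootstrap scheme; it resamples years independently with replacement and ignores autocorrelation.

```python
import numpy as np

def change_with_ci(hist, fut, n_boot=2000, alpha=0.05, seed=0):
    """Projected change (future minus historical climatology) with a
    (1 - alpha) percentile bootstrap confidence interval."""
    rng = np.random.default_rng(seed)
    hist, fut = np.asarray(hist, float), np.asarray(fut, float)
    deltas = np.array([
        rng.choice(fut, size=fut.size).mean() - rng.choice(hist, size=hist.size).mean()
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(deltas, [alpha / 2, 1 - alpha / 2])
    return fut.mean() - hist.mean(), (lo, hi)

# Degenerate check: constant series give a zero-width interval at the true change
change, (lo, hi) = change_with_ci(np.zeros(30), np.full(30, 2.0))
```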
Regarding the future change signal, the effect of APBC on projected changes can be quite different for the three variables. For Western U.S.-average seasonal mean temperature, APBC has essentially no impact on the projected change, as evidenced by the fact that the lines connecting empty and solid circles are flat. For Western U.S. mean seasonal precipitation, APBC has uneven effects: across different model-season combinations, the projected changes can increase, decrease, or stay the same when APBC is applied. Despite these differences, the effect of APBC on projected changes to precipitation is not statistically significant (i.e., the confidence intervals with and without APBC for a given model-season combination always overlap), suggesting that the effect of APBC is small compared to the internal variability of the model. We also note that modifications to the precipitation change signal are generally much smaller than 0.2 mm day⁻¹. Exceptions to this finding occur across low elevations during the summer and autumn, where precipitation is predominantly the result of convection. The removal of mean-state thermodynamic instability biases during APBC, biases endemic to the broader CMIP6 GCM suite (e.g., Chavas & Li, 2022; Lepore et al., 2021), may be responsible for the modification of such precipitation signals (Rahimi, Huang, Goldenson, et al., 2024).
The story is quite different for annual peak SWE, where APBC has a clear and systematic effect: for all models, the projected changes decrease in absolute value when APBC is applied. Since the projected changes are all negative, this means that APBC causes these decreases in SWE to shrink in magnitude. This is because mean-state regional GCM biases (cold deep-layer, low-level instability, and strong upstream mid-tropospheric cyclonic relative vorticity biases common to the whole ensemble) lead to an overly wet, cold, and snowy solution when APBC is not applied (Rahimi, Huang, Goldenson, et al., 2024). The APBC procedure warms and dries the RCM solution, resulting in less snow in the historical climate and therefore less snow to lose under transient warming. It is noteworthy that (as with precipitation) the uncertainty bars for projected changes with and without APBC overlap, that is, the effects of APBC on projected changes are small relative to internal variability. Nonetheless, the effect of APBC on projected changes to SWE is the same regardless of the parent GCM, indicative of a robust effect arising from common GCM biases.
The disconnect between the temperature, precipitation, and SWE responses to bias correction is stark. It can be explained not only by the overall mean-state cold and wet GCM biases just described, but also because the SWE field has differential sensitivity as a function of elevation in the downscaled context (see the contrast in the slope of the lines between low and high elevations in Figure 1) and because SWE has strong, nonlinear temperature and precipitation sensitivities (e.g., Luce et al., 2014). This disconnect is initially surprising: after all, the SWE changes are driven directly by those very same changes in temperature and precipitation. Clearly the future SWE changes are also very sensitive to the cold and wet biases in the GCMs, and to their elevationally dependent manifestation as strong SWE biases in the downscaled context. To the extent that future SWE changes are traceable to the same causes as the strong SWE biases that non-APBC models exhibit in their historical runs, rather than to the models' more realistic projected changes in temperature and precipitation, we interpret non-APBC SWE changes to be unphysical. The root causes of SWE biases in historical runs cannot be ignored, as they are perpetuated into future projections. Because of this, the non-APBC SWE solutions are not equally likely outcomes and should be down-weighted. In this sense, it may not be appropriate to label the variability in SWE changes arising from APBC as introducing "uncertainty," a topic we return to below.
Next, we apply an augmentation of the traditional variance decomposition framework (as introduced by Hawkins & Sutton, 2009) to assess the relative contribution of APBC to variability across downscaled future projections (see Supporting Information S1). When all simulated futures can be deemed equally likely, this variability can be interpreted as uncertainty. (Scenario uncertainty is not considered because all dynamical downscaling solutions were developed using a single shared socioeconomic pathway, SSP3-7.0.) Figure 2 shows the fraction of total variability (%) over time in Western U.S.-wide averages of selected variables due to model choice, the effect of APBC, and internal variability. Note that a minimum SWE mask (climatology of less than 10 mm of SWE) is applied before computing the fractional variance explained by these three sources of variability for all snow-related variables. The four variables shown are chosen to illustrate the range of effects of APBC on future simulated Western U.S.-wide hydroclimate. On the one hand, APBC introduces essentially no variability for downscaled projections of springtime mean temperatures, while the effect is nonzero yet quite small for springtime mean precipitation (particularly relative to variability from both internal variability and model choice). On the other hand, APBC has a large effect on each of the snow-related variables, explaining roughly 35% of the end-of-century variability in springtime snow cover fraction (SCF) and in excess of 50% of the end-of-century variability in peak SWE. Particularly for peak SWE, the variability due to APBC is large relative to that arising from model choice and internal variability (end-of-century fractional variances of approximately 30% and 17%, respectively). Corresponding plots for all other variable-season combinations are shown in Figure S1 in Supporting Information S1, for both seasonal mean and seasonal daily maximum summaries. For other variables the story is largely the same as what is shown in Figure 2: APBC has a negligible effect on Western U.S.-wide mean and extreme temperature and precipitation, while its effect on SCF is large in all seasons except winter; the cold season sees high SCFs that persist through 2100 at high elevations in both sets of downscaled projections.
Because dynamical downscaling is a tool used to create climate data that can be useful to end-users at decision-relevant scales, we next consider the variability introduced by APBC locally by considering time-slice averages at each RCM grid cell (see Figure 3). For brevity, we present the spatial distributions of introduced variability for the two edge-case variables: one that shows the least sensitivity (springtime mean temperature) and another showcasing the highest sensitivity (peak annual SWE) to bias correction locally. These variables are illustrative of the general effect of APBC. For mean temperatures in the spring, there is minimal spatial variation in each source of variability regardless of time period, such that model variability dominates the end-of-century variability in downscaled projections. For peak annual SWE, the effect of APBC is small at the beginning of the record (1st decade, particularly with respect to internal variability) and becomes quite heterogeneous by the end of century: for some regions the effect of APBC is even larger than was suggested in Figure 2 (in excess of 80% in, e.g., northern California), while in other regions (e.g., eastern Montana) its effect is negligible. Generally speaking, for regions where end-of-century variability arising from APBC is small, the variability budget is dominated by the models (see, e.g., the eastern Sierra).
Finally, we show the ratio of end-of-century fractional variability from APBC versus internal variability for the two variables shown in Figure 3, as well as for springtime maximum daily temperature and mean precipitation, plotting this "uncertainty ratio" for individual grid boxes (Figure 4a) as well as aggregated over elevation groups for the three primary mountain ranges in the Western U.S. (Figure 4b). Variability due to APBC dominates where this ratio is greater than 100% (dark blue colors), while APBC contributions to variability are relatively small where this ratio is less than 100% (dark red colors). In cases where model uncertainty and internal variability dominate total uncertainty at end-of-century (e.g., for precipitation), the deleterious effects of APBC are relatively small, but the benefits of APBC (greater realism) remain.
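The diagnostic itself is simple: a gridpointwise ratio of the two fractional-variance fields. A minimal sketch, with hypothetical names, might look like this:

```python
import numpy as np

def uncertainty_ratio(frac_apbc, frac_internal):
    """Ratio (%) of fractional variability due to APBC to that due to internal
    variability; values above 100% flag grid boxes where APBC dominates."""
    frac_apbc = np.asarray(frac_apbc, dtype=float)
    frac_internal = np.asarray(frac_internal, dtype=float)
    ratio = np.full_like(frac_apbc, np.nan)  # undefined where internal var. is zero
    np.divide(frac_apbc, frac_internal, out=ratio, where=frac_internal > 0)
    return 100.0 * ratio

# Three toy grid boxes: APBC-dominated, internal-variability-dominated, undefined
ratios = uncertainty_ratio([0.5, 0.1, 0.2], [0.25, 0.2, 0.0])
```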
For mean temperature, there are some small areas where the APBC uncertainty is larger than that of internal variability, but generally the APBC uncertainty is negligible. (Recall too that internal variability uncertainty pales in comparison to scenario uncertainty by the end of the 21st century; Hawkins & Sutton, 2011.) Uncertainty ratios for maximum daily temperatures show a similar pattern to those of mean temperature, with less than 1% of grid boxes having an uncertainty ratio greater than 100%. For both mean and extreme temperature, areas in which APBC uncertainty dominates internal variability correspond to predominately snow-covered regimes at high elevations. As discussed above, the non-APBC experiments produced unrealistically high SWE in the historical period, and this model behavior continues into the future simulations. Specifically, the land surface tends to be more snow covered and subsequently less sensitive to diurnal heating in the non-APBC experiments, leading to an unrealistic simulation of the surface energetics that drive maximum daily temperature, particularly during spring when the melt season begins. Incoming solar energy that could be converted to sensible heat and warm the overlying air is reflected away from the surface by a snowpack that is unrealistically large in areal extent. Since this behavior in the non-APBC experiments can be traced to unphysical SWE biases, these solutions can likely be down-weighted. Additionally, we note limited regions where end-of-century APBC uncertainty for precipitation is large relative to internal variability (approximately 3% of the domain), perhaps due to the removal of the aforementioned biases from the GCM outputs. However, for the non-snow-related variables, it is noteworthy that the APBC uncertainty remains small relative to internal variability across the vast majority of the domain, and model uncertainty is still the dominant source of variance (Figure 2). Corresponding plots of the end-of-century APBC uncertainty relative to internal variability for all other variables and seasons (including both mean and daily maximum temperature and precipitation) are shown in Figure S2 in Supporting Information S1, where the story is largely the same as what is shown in Figure 4. Consistent with Figure 2, the variability introduced by APBC into peak SWE is large in many locations, especially at mid to high elevations where SWE is strongly high-biased in the downscaled GCMs due to those GCMs' cold and wet biases. As noted above, we interpret the SWE changes in non-APBC simulations to be unphysical, to the extent they are traceable to SWE biases rather than to changes in temperature and precipitation. Nevertheless, it is useful to be able to gauge the impact of these SWE biases in the context of other sources of variability in future SWE behavior.

Discussion
Here, we resolve an outstanding question regarding the need for, and appropriate approaches to, bridging the gap between the demand for minimally biased regional climate change information and the propagation of GCM biases in downscaling. We show that APBC is a reasonable strategy for responding to the challenges that regional climate modeling efforts face with biased boundary conditions. It is a necessary strategy because efforts to diagnose and eliminate the sources of biases in GCMs are ongoing. To take just one example, GCMs can exhibit issues with their Eastern Pacific Ocean circulation (Krantz et al., 2021) for many potential reasons, including the erroneous orientation of land and ocean in the ESM (Li et al., 2020). But infrastructure and planning decisions for climate resiliency development cannot wait on the open-ended timelines of scientific efforts to improve GCMs.
We systematically analyzed dynamical downscaling experiments in the Western U.S. within the framework of narrowing uncertainty developed in Hawkins and Sutton (2009). We showed that APBC has negligible effects on future projections of temperature and precipitation compared to other sources of uncertainty in these projections, such as internal variability or model choice. In contrast, APBC has a very large effect on projected changes in SWE. However, this impact is traceable in part to the strong sensitivity of SWE change to SWE biases in the historical climate, which in turn arise from cold and wet biases in the GCMs. Thus we argue that APBC may in fact produce more physical future SWE solutions, and that the non-APBC SWE solutions should be down-weighted. Using this framework, the applications community can determine whether or not dynamical downscaling with APBC should be undertaken with a given GCM. Considerations include the size of the bias correction required by that GCM and the users' uncertainty requirements. In general, it seems likely that APBC will turn out to be defensible in other regions and for many user applications. However, it is important to reiterate that our results are strictly valid only for the quantities presented here, namely mean and extreme seasonal temperature and precipitation and snow water equivalent.
There are potential implications of this work for downscaling using the pseudo global warming (PGW) approach (Liu et al., 2016; Rasmussen et al., 2014). In the PGW method, future mean GCM deltas are superimposed on reanalysis boundary conditions and downscaled using an RCM. The climate change signal is quantified by differencing the PGW simulation from a companion reanalysis-only-driven experiment, whose representation of the historical climate is presumed to be minimally biased. In fact, PGW can be thought of as a technique that builds in a bias correction step, as with the APBC implemented here. However, unlike APBC, the PGW method assumes that historical internal climate variability is identical in a future climate, and it does not allow for the study of questions surrounding times of emergence (e.g., when in the 21st century future SWE across the West would fall below its historical tenth-percentile value). Studies needing to consider future changes in climate variability could consider APBC as an alternative to PGW, in order to preserve GCM variability signals while reducing historical bias.
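For concreteness, the PGW perturbation described above can be sketched as follows; this is an illustrative toy (assumed climatology arrays, hypothetical names), not the actual implementation of Liu et al. (2016).

```python
import numpy as np

def pgw_boundary(reanalysis, gcm_future_clim, gcm_hist_clim):
    """Pseudo-global-warming boundary condition: reanalysis plus the GCM's
    mean climate-change delta. The reanalysis' historical variability is
    carried into the future unchanged; only the mean state is shifted."""
    delta = gcm_future_clim - gcm_hist_clim  # mean GCM change signal
    return reanalysis + delta

era = np.arange(10.0)              # stand-in reanalysis series
pgw = pgw_boundary(era, 5.0, 2.0)  # GCM warms by 3 in the mean
```

The contrast with APBC is visible in the toy: the perturbed series has exactly the reanalysis' increments plus a constant offset, so no future change in variability can be represented.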
Finally, the results here reinforce the importance of dynamically downscaling only the least-biased GCMs. As a GCM's biases grow, bias correction is increasingly likely to distort the solutions and increase uncertainties, becoming less useful as a tool. End-users may subsequently be left with a climate projection that appears to be of high quality in terms of historical bias but within which lie new uncertainties resulting from the distortion of the climate change signal. As such, it may only be possible to implement APBC following the identification of top-performing GCMs through a process-based and use-inspired selection process. These concepts and our results are also timely given the speed at which modeling centers are producing downscalable GCM outputs. With more GCMs and ever-growing compute power, the risk of contributing to the "garbage in, garbage out" problem is becoming ever more prominent and must continue to be addressed.

Figure 1. Comparison of Western U.S.-averaged historical biases (x-axis; observed minus downscaled, 1981-2010 average) in WRF-downscaled solutions versus the future change signal (y-axis; 2071-2100 average minus 1981-2010 average) for seasonal mean daily average temperature (panel a; in °C), seasonal mean daily precipitation (panel b; in mm day⁻¹), and peak annual daily snow water equivalent (panel c; in mm), both with (solid circles) and without (empty circles) a priori bias correction. Projected changes include a 95% bootstrap confidence interval.

Figure 2. Fraction of total variability (%) over time in Western U.S.-wide averages of selected variables due to model choice (blue), the effect of a priori bias correction (red), and internal variability (orange). Note that a minimum SWE mask is applied before taking Western U.S. averages for spring snow cover fraction and annual peak SWE; see Figure 3.

Figure 3. Maps of the spatial distribution of fractional variance in spring mean temperature (panel a) and annual peak SWE (panel b) due to models, the effect of a priori bias correction, and internal variability for decade-averaged time slices. Gray areas in panel (b) correspond to Weather Research and Forecasting grid boxes for which the multi-model mean climatology of downscaled SWE over 1980-2100 is less than 10 mm.

Figure 4. Panel (a) shows the ratio of end-of-century fractional uncertainty (%) due to a priori bias correction (APBC) divided by the corresponding quantity for internal variability, for four select variables; we specifically show the 2091-2100 average, corresponding to the 9th-decade panels in Figure 3. Panel (b) shows the ratio of APBC uncertainty to internal variability for different elevation groups of three mountain ranges: the Cascades ("C"), the Sierra Nevada ("SN"), and the Rockies ("R"). Specific limits for the elevation categories are range-specific; see Table S2 in Supporting Information S1.