In general, biases of climate models depend upon the climate state (i.e., are nonstationary). Recent studies have shown that the adoption of a stationary temperature bias can lead to an overestimation of projected summer warming in southern Europe. It has also been proposed to use a bias correction that increases linearly with temperature. While such an assumption is well justified for near-term projections, it remains unclear whether, and at what temperature, this relation levels off. Here we show, using regional climate model simulations of the ENSEMBLES project and from a single-model perturbed physics ensemble, that the linear bias assumption breaks down at high model temperatures, followed by a transition to a constant bias relation. This transition is apparent in strongly biased model simulations and is supported by a pseudo-reality approach. We show that soil moisture scarcity explains a large degree of summer temperature biases across both ensembles and that the limits of soil moisture depletion are responsible for the transition. A linear temperature bias correction therefore potentially over-corrects summer warming, and implicitly assumes unphysical relations between soil moisture and temperature, in particular when considering high-emission scenarios. We conclude that a physically consistent and time-dependent temperature bias correction considering the state of the soil would increase the robustness of bias correction and reduce the uncertainty of 21st century summer warming.
 A common assumption made in numerical modeling of climate change is that model biases do not significantly depend upon the climate state. This assumption is made implicitly when computing climate change signals as the difference between scenario and control climates. The fact that climate sensitivities and climate change signals differ between models contradicts this assumption, yet the lack of identifiability generally inhibits the separation between changes in model bias and changes in the climate state [Buser et al., 2009].
 Evidence of how changes in model biases of summer temperatures could affect climate change signals comes from three lines of argumentation. First, validation of monthly mean temperatures for the European summer shows that model biases increase with temperature [Christensen et al., 2008]. Second, it has also been demonstrated (for the same region and season) that models substantially overestimate interannual summer temperature variability (IASV), and that this overestimation is inconsistent with the assumption of an invariant bias [Buser et al., 2009]. Third, the adoption of pseudo realities for the present and future climate shows that temperature biases are nonstationary in summer due to a biased sensitivity of cloud cover and soil moisture [Maraun, 2012].
 These studies motivated the consideration of bias changes in the construction of climate change scenarios. Buser et al. employed a Bayesian framework, while Boberg and Christensen (hereafter BC12) and Christensen and Boberg fitted a linearly increasing bias correction to validation results during the control period. The resulting bias corrections reduce projections of Mediterranean summer warming of ENSEMBLES and CMIP5 simulations by typically 10%–30%.
 The fundamental difficulty with bias correction methodologies is the extrapolation from the current climate (where the bias behavior is known from validation studies) to some scenario climate (where we lack any observational guidance) [Ehret et al., 2012; Teutschbein and Seibert, 2012]. Can we extrapolate a linearly increasing bias into the future or should we expect that the bias increase levels off? Evidence that a bias increase might be constrained comes from recent results demonstrating a robust relationship between the ability to reproduce present-day IASV and the projected changes in the ENSEMBLES simulations [Fischer et al., 2012]. Models that overestimate present-day IASV do not indicate a strong increase in IASV due to an increasing temperature bias, but rather show small or even negative changes in IASV.
 The present paper studies physical constraints for nonstationary summer temperature biases. Southern European biases are analyzed first within the ENSEMBLES simulations (as in BC12) and second in a regional climate model (RCM) perturbed-physics ensemble. In the first part of the analysis, we aim at determining constraints for the linear bias assumption that agree with findings on IASV changes. These constraints are determined by testing the statistical robustness of a linear regression and by employing a pseudo-reality framework. In the second part of the analysis, the constraints are linked to physical limits of soil moisture depletion, which demonstrate that a linear bias assumption would ultimately imply unphysical conditions.
2 Model Ensembles and Observations
2.1 Multi-Model Ensemble (MME)
 Two RCM ensembles are considered in this study, a multi-model ensemble (MME) generated within the ENSEMBLES project [van der Linden and Mitchell, 2009] and a newly introduced single-model perturbed physics ensemble (PPE). From the ENSEMBLES project we use the global climate model (GCM) driven RCM simulations which transiently project the IPCC emission scenario A1B [Nakicenovic, 2000] up to the year 2099. The selection consists of 15 model chains using five different GCMs and eight different RCMs at a model resolution of 0.22° [van der Linden and Mitchell, 2009]. Such an MME samples the structural uncertainty of climate change signals arising from structurally different GCMs and RCMs.
2.2 Perturbed Physics Ensemble (PPE)
 As a companion to the MME, a PPE is generated by varying physical parameters of a particular model within bounds elicited by expert judgment [Bellprat et al., 2012a]. We perturb physical parameters of the RCM COSMO-CLM (hereafter CCLM) which is a versatile nonhydrostatic limited area model [Steppeler et al., 2003; Förstner and Doms, 2004], widely used for generation of climate change scenarios [e.g., Kotlarski et al., 2012] as well as for short-term cloud resolving simulations [Langhans et al., 2012]. Based on a previous parameter calibration study [Bellprat et al., 2012b], two samples of parameter configurations have been determined relying on five model parameters. The two samples consist of five calibrated parameter configurations which lead to optimal model results and five decalibrated parameter configurations which lead to deliberately bad model results, when using reanalysis data as boundary forcing. The parameters and their calibrated and uncalibrated distributions are provided in Bellprat et al. [2012b] (Section 3.1).
 The design of the PPE allows us to study the effects of model calibration and to sample the parameter uncertainty of RCM-simulated climate change signals. For each of the 10 parameter settings, a control (1970–2004) and a scenario (2065–2099) time-slice are computed, using a realization of the GCM HadGEM2 (RCP8.5 scenario, Riahi et al.) obtained from the CMIP5 model archive as boundary forcing. Note that the scenario used for the PPE differs substantially from the scenario projected in the MME, since the RCP8.5 emission scenario reaches much higher CO2-equivalent concentrations by the end of the century than A1B (1250 ppm compared to 700 ppm). The model configuration is identical to Bellprat et al. [2012b], with a model resolution of 0.44° and the Euro-CORDEX simulation domain (www.euro-cordex.net). All simulations are initialized using a transient simulation performed with one of the calibrated parameter settings simulated for the period 1950 to 2099. In order to ensure model equilibrium for each parameter setup, the first 5 years of each time-slice are discarded as an additional spin-up period, leaving 30 years to compute climatological averages.
2.3 Observational Data
 For the validation of 2 m temperature, we use the observational gridded data set E-OBS (v.7), which is available at the model grids used for both of the ensembles [Haylock et al., 2008]. The observational uncertainty associated with this data set at monthly and interannual timescales is significantly smaller than regional modeling uncertainties [Bellprat et al., 2012a] and also smaller than the magnitude of typical summer temperature biases.
3.1 Back From a Linear to a Constant Bias Relation
 In a first step, we explore the robustness of a linearly increasing bias assumption based on a quantile-quantile comparison between simulated and observed monthly averages for the Mediterranean region (MD, see Figure S1), as proposed in BC12. The analysis is provided in Figure 1a for the MME. Consistent with BC12, a majority of the models overestimate temperatures in the upper tail of the distribution, i.e., the biases increase with temperature. For illustration, the simulations with the largest (METO-HC_HadRM3Q16, hereafter HC) and the smallest (SMHI-RCA_ECHAM5, hereafter SMHI) model biases are highlighted.
 Following Christensen and Boberg, we quantify the increase in model bias by fitting a linear regression. Figure 1a provides two estimates, one using the 50% warmest months (dashed lines), and the other using the 10% warmest months (solid lines). For the red curves (HC model) this confirms that the bias increases with temperature (dashed line), but also that a transition to a constant bias regime occurs (solid line). The analysis thus supports the hypothesis that the linear bias assumption breaks down at some elevated temperature. Such behavior is also apparent in further simulations of the MME (C4I-RCA3_HadCM3Q16, ETHZ-CLM_HadCM3Q0, METO-HC_HadRM3Q0, METO-HC_HadRM3Q3, and DMI-HIRHAM5_ARPEGE), while it is not evident for the model with the smallest bias (SMHI, see blue lines).
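 The two regression estimates described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for Figure 1a: the function name bias_slope, the choice of regressing the bias on the ranked model temperature, and the synthetic input series are all assumptions made for this example.

```python
import numpy as np

def bias_slope(model_monthly, obs_monthly, warmest_fraction):
    """Regress the quantile-wise bias on model temperature over the warmest months.

    Both series must have the same length (hypothetical helper, not from BC12).
    """
    # Rank both samples so that equal ranks are compared (quantile-quantile).
    model_q = np.sort(np.asarray(model_monthly, dtype=float))
    obs_q = np.sort(np.asarray(obs_monthly, dtype=float))
    bias = model_q - obs_q
    # Keep only the warmest fraction of the ranked months (at least two points).
    n_keep = max(2, int(round(warmest_fraction * model_q.size)))
    slope, intercept = np.polyfit(model_q[-n_keep:], bias[-n_keep:], 1)
    return slope, intercept

# Synthetic illustration (not data from either ensemble): the bias grows
# linearly for observed temperatures between 15 and 25 degC and stays
# constant at 3 K above that.
obs = np.linspace(10.0, 30.0, 360)
model = obs + np.clip(0.3 * (obs - 15.0), 0.0, 3.0)

slope_50, _ = bias_slope(model, obs, 0.5)  # 50% warmest months
slope_10, _ = bias_slope(model, obs, 0.1)  # 10% warmest months
```

 In this synthetic case the slope fitted to the 50% warmest months is clearly positive, whereas the slope fitted to the 10% warmest months vanishes, which is the signature of a transition to a constant bias.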
 The same analysis is provided for the PPE in Figure 1b. Similarly to the MME, the simulations feature increasing model biases at high temperatures, which are consistently smaller in the calibrated simulations (best highlighted) than in the uncalibrated simulations (worst highlighted). The linear regression for the 50% (dashed line) and 10% (solid line) warmest months again indicates a weakening of the linear bias relation at high temperatures, in particular for the uncalibrated simulation, which reaches higher model temperatures.
 To further explore the apparent transition from a linear to a constant bias relation, we additionally consider a pseudo-reality framework [e.g., Maraun, 2012]. In Figure 1c, we hence assume the SMHI model of the MME to represent the observations, both during the control (black line) and the scenario (red line) period. In order to restrict the analysis to the processes controlled by the two RCMs, we eliminate the difference in the large-scale forcing from the driving GCMs. This is done here by subtracting the difference in the forced mean annual warming over land between the two GCMs (equal to 2.1 K) from the HC monthly scenario distribution.
 The fact that the model temperature biases in the scenario period in Figure 1c do not rise beyond those observed in the control period supports the aforementioned hypothesis of a transition from a linear bias back to a constant bias assumption.
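 The pseudo-reality comparison can be illustrated with a short sketch. The function name pseudo_reality_bias and the synthetic series are assumptions introduced here, and only the 2.1 K offset is taken from the text. The sketch treats one model as pseudo-observations and removes a constant GCM warming difference from the scenario distribution of the other model.

```python
import numpy as np

def pseudo_reality_bias(model_monthly, reference_monthly, gcm_warming_difference=0.0):
    """Quantile-wise bias of a model against a reference model used as pseudo-observations."""
    model_q = np.sort(np.asarray(model_monthly, dtype=float))
    ref_q = np.sort(np.asarray(reference_monthly, dtype=float))
    # Remove the part of the difference attributed to the driving GCMs.
    return (model_q - gcm_warming_difference) - ref_q

# Synthetic illustration (not ENSEMBLES data): the model is 2 K too warm in
# both periods, and its driving GCM warms 2.1 K more than the reference GCM.
ref_control = np.linspace(15.0, 25.0, 90)
model_control = ref_control + 2.0
ref_scenario = ref_control + 3.0
model_scenario = ref_scenario + 2.1 + 2.0

bias_control = pseudo_reality_bias(model_control, ref_control)
bias_scenario = pseudo_reality_bias(model_scenario, ref_scenario, gcm_warming_difference=2.1)
```

 Once the GCM offset is removed, the scenario bias in this example equals the control bias at every quantile, i.e., the bias is constant rather than growing with temperature.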
 The transition to a constant bias relation is schematically illustrated in Figure 1d: in a simulation that overestimates high temperatures (black line), the bias remains constant in a future climate (red solid line) as opposed to increasing linearly (red dashed line). Following Buser et al., the overestimation in the control climate is alternatively depicted as an overestimation of the interannual summer variability (IASV) compared to the observations. Assuming that in a future climate the overestimation of high model temperature quantiles no longer increases, the IASV is expected to decrease as a consequence. Such a decrease in IASV is in fact projected by the HC simulation, as opposed to the SMHI simulation, which shows an increase in IASV (see ΔσIASV in Figure 1c). The transition to a constant bias relation hence resolves the apparent contradiction between previous findings on the IASV and a linear bias correction.
 The correction of climate projections assuming a linear temperature bias carries the risk of overcorrection. To illustrate this, we correct the temperature increase of the two model ensembles according to the linear and constant bias assumptions (see Figure 2). While in the MME (using the A1B scenario) the linear bias correction leads to a decrease of the ensemble mean warming of about 25%, consistent with BC12, the ensemble mean warming of the HadGEM2 PPE is reduced drastically by about 60% under the high-emission scenario RCP8.5. Such a correction would result in a smaller warming over land than over the Mediterranean Sea (see black crosses in Figure 2), which contradicts results gained from past modeling studies and observations [Sutton et al., 2007]. Apart from this, the linear bias correction in the PPE leads to a strong increase of the ensemble spread, as shown by the grey box plots, thereby failing a central objective of bias correction methods, namely to reduce model uncertainties.
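 The difference between the two correction assumptions can be made explicit with a simplified sketch. It assumes, for illustration only, that the bias is written as b(T) = a + s·T with T the model temperature, so that the corrected temperature is T − b(T). The function name and the numbers are hypothetical and do not reproduce the procedure behind Figure 2.

```python
def corrected_warming(raw_warming, slope=0.0):
    """Warming signal after removing a bias b(T) = a + slope * T (T: model temperature).

    The constant offset a cancels when differencing scenario and control, so a
    constant bias (slope = 0) leaves the raw warming unchanged.
    """
    return (1.0 - slope) * raw_warming

# Illustrative numbers only: a raw warming of 4 K.
constant_case = corrected_warming(4.0)            # constant bias assumption
linear_case = corrected_warming(4.0, slope=0.25)  # linear bias with slope 0.25 K/K
```

 Under these assumptions a slope of 0.25 K/K removes a quarter of the raw warming, and the reduction grows with the size of the warming itself, which is why a strong scenario amplifies the effect of a linear correction.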
3.2 Limits of Model Drying
 To further investigate the physical plausibility of the linear bias relation, we study the soil moisture availability and its relation to the evaporative fraction. The latter is defined as the ratio between the latent heat flux and the sum of the latent and sensible heat fluxes. As reviewed by Seneviratne et al., the evaporative fraction is parametrized to decrease linearly with soil moisture in a soil-moisture limited regime. Such a relation holds for the simulations of the PPE, as shown in Figure 3a, where for each simulation the soil moisture in the control (square) as well as in the scenario (triangle) period is shown. The linear regression accurately intercepts the plant wilting point at zero evaporative fraction, while negative values of evaporative fraction are unphysical in the current context. The figure shows that simulations with calibrated parameter settings (blue) are moister than simulations with decalibrated settings (red), and that the former dry out more strongly than the latter. Simulated changes in soil moisture (and thus evaporative fraction) depend upon the soil moisture in the control climate (Figure 3b).
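 The definition of the evaporative fraction and the regression against soil moisture can be sketched as follows. The function names and the synthetic values are assumptions for illustration and are not taken from the PPE simulations.

```python
import numpy as np

def evaporative_fraction(latent_heat_flux, sensible_heat_flux):
    """Ratio of the latent heat flux to the sum of latent and sensible heat fluxes."""
    latent = np.asarray(latent_heat_flux, dtype=float)
    sensible = np.asarray(sensible_heat_flux, dtype=float)
    return latent / (latent + sensible)

def soil_moisture_at_zero_ef(soil_moisture, ef):
    """Soil moisture at which a linear fit of EF against soil moisture reaches zero."""
    slope, intercept = np.polyfit(soil_moisture, ef, 1)
    return -intercept / slope

# Synthetic soil-moisture limited regime (illustrative values only): EF
# decreases linearly and vanishes at a volumetric soil moisture of 0.10.
soil_moisture = np.array([0.15, 0.20, 0.25, 0.30])
ef = 2.0 * (soil_moisture - 0.10)
zero_crossing = soil_moisture_at_zero_ef(soil_moisture, ef)
```

 In this idealized case the zero crossing of the fitted line plays the role of the plant wilting point. Any state extrapolated beyond it corresponds to a negative evaporative fraction.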
 As the depletion of soil moisture is limited, and assuming that this depletion is responsible for a large degree of summer model biases as suggested by Christensen and Boberg, the model biases should, from a physical point of view, no longer increase once further drying is prevented by soil moisture depletion. In this case, the evolution of model biases would instead be determined by processes affecting the radiative balance [Fischer and Schär, 2009]. In Figure 4a, we show that available soil moisture expressed as the evaporative fraction explains, as suggested, a large degree of summer model biases across both ensembles. The relation is very robust for the PPE (red and blue symbols), but it is also significant at the 5% level for the ENSEMBLES simulations (black symbols). The 95% confidence bounds show the uncertainty of this relationship owing to other processes such as radiative or dynamical effects. Assuming that this linear relationship holds in a future warmer climate, a linear bias correction would be associated with negative evaporative fractions for the decalibrated simulations of the PPE and for the HadRM3Q16 simulation of the MME. In particular, the simulations of the PPE are affected by this physical inconsistency, since the strong warming under the RCP8.5 scenario implies strong corrections when using a linear bias correction scheme. Even though a linear bias correction projects only one simulation of the MME to an unphysical state, linear bias corrections on a grid point level, as proposed in BC12, potentially project a large number of grid points beyond the limits of soil drying.
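 The consistency check described above can be sketched by inverting a linear relation between bias and evaporative fraction. The function name and all numbers below are hypothetical and serve only to illustrate how a linearly extrapolated bias can imply a negative evaporative fraction. They are not the values shown in Figure 4a.

```python
import numpy as np

def implied_evaporative_fraction(ef_control, bias_control, projected_bias):
    """Evaporative fraction implied by a projected bias under a linear bias-EF relation."""
    # Fit bias = intercept + slope * EF across simulations of the control climate.
    slope, intercept = np.polyfit(ef_control, bias_control, 1)
    # Invert the fitted relation for the bias assumed in the future climate.
    return (np.asarray(projected_bias, dtype=float) - intercept) / slope

# Synthetic ensemble (illustrative only): drier simulations are warmer-biased.
ef_control = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
bias_control = 5.0 - 10.0 * ef_control  # bias in K

# Biases that a linear-in-temperature extrapolation might assign in the future.
projected_bias = np.array([4.5, 6.0])
implied_ef = implied_evaporative_fraction(ef_control, bias_control, projected_bias)
unphysical = implied_ef < 0.0
```

 Here the second projected bias exceeds the bias reached at zero evaporative fraction, so the implied evaporative fraction is negative and the corresponding correction would be flagged as unphysical.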
4 Discussion and Conclusions
 We have investigated the linear temperature bias correction proposed by BC12 for two RCM model ensembles driven by global climate models. The analysis shows that the linear bias assumption breaks down at some elevated temperature and might no longer hold in warmer future conditions. The model results assessed suggest a transition from a linear bias relation (where the bias grows with temperature) to a constant bias assumption (where the bias is independent of temperature). Evidence comes from strongly biased model simulations for current climate, and is additionally supported by the pseudo-reality approach. Using a perturbed physics ensemble, the transition to a constant bias can be explained by the fact that soil-moisture depletion cannot exceed the maximum soil-moisture content. Given the constraint of soil-moisture depletion, we show that a linear bias correction can actually imply unphysically low or even negative soil-moisture levels.
 Detailed results were presented for the Mediterranean region only, consistent with BC12. The same analysis is provided for the Alpine region (AL), the Iberian Peninsula (IP), and Eastern Europe (EA) in the supplementary material. Similar results are obtained, supporting the robustness of our conclusions. Further, the discussed findings are restricted to monthly and regional averages, yet the importance of a transition of the bias relations at a grid-point level and at a daily resolution, as proposed in, e.g., Piani et al., is probably even larger, as soil moisture depletion is more frequent at higher resolutions.
 The development of a summer temperature bias correction methodology accounting for a transition of bias behavior is beyond the scope of this study. However, we argue that such a correction should include the state of soil moisture as a main factor for high temperature biases. Other studies [Ho et al., 2012; Ehret et al., 2012; Teutschbein and Seibert, 2012] already pointed out that current bias correction methods do not incorporate the physical causes of model biases, and thus inconsistencies can emerge as highlighted in the current study.
 Another major question that has not specifically been addressed is the origin of the temperature biases in the GCMs and RCMs considered. As evident from BC12 and as confirmed by the current analysis, summer temperature biases of the ENSEMBLES RCMs are not equally distributed but are systematically positive. This implies that even ensemble mean results will be systematically biased, which is somewhat surprising as the considered RCMs employ a wide range of parametrizations. One hypothesis that could explain this systematic bias is associated with the representation of the soil-moisture precipitation feedback. There is some evidence from both numerical and observational studies that parametrized convection appears to unduly favor a positive feedback [Hohenegger et al., 2009; Taylor et al., 2012]. A positive soil-moisture precipitation feedback has a tendency to amplify the response to external forcings (toward moist or dry conditions), while a negative feedback would moderate the response and make the system more resilient.
 We are indebted to the COSMO consortium and the CLM community for providing access to and support of the CCLM model. We particularly would like to acknowledge technical support by the staff of MeteoSwiss and the Center for Climate Systems Modeling (C2SM). This work was supported by a grant from the Swiss National Supercomputing Centre (CSCS) under project ID s78. Further, we would like to acknowledge the E-OBS data set from the ENSEMBLES project and the data providers in the European Climate Assessment and Dataset (ECA&D) project (http://eca.knmi.nl). Finally, we thank Filippo Giorgi and one anonymous reviewer for very constructive feedback on this manuscript.
 The Editor thanks one anonymous reviewer for his/her assistance in evaluating this paper.