Corresponding author: E. M. Fischer, Institute for Atmospheric and Climate Science, ETH Zurich, Universitätsstrasse 16, CH-8092 Zürich, Switzerland. (firstname.lastname@example.org)
 Summer temperature variability has been projected to increase in Central Europe in response to anthropogenic greenhouse gas forcing. Based on an unprecedented set of global and regional climate models from the ENSEMBLES project, we assess the robustness of these projections on interannual to daily time scales. In comparison to previous analyses using PRUDENCE simulations, we find a more diverse climate change signal for interannual summer temperature variability and a clear dependence upon present-day model performance. Models that realistically represent present-day variability tend to consistently project increasing interannual variability at the end of the 21st century. We demonstrate that the partitioning of latent and sensible heat fluxes controlled by soil moisture is crucial to understanding the projected changes across the multi-model experiment. The projected increase in daily summer temperature variability is more robust and consistently simulated by all models. Likewise, all models consistently project reduced daily temperature variability in winter. Thus, it is a robust signal across the entire ensemble that in summer and south-central Europe hot extremes warm more strongly than the mean, and in winter and northern Europe cold extremes warm more strongly than mean temperatures.
 Adaptation to increasing climate variability, in addition to changes in the mean climate, represents a serious challenge to society, economy and ecosystems. Regarding extreme events, it has been argued that changes in variability are more important than changes in mean [e.g., Katz and Brown, 1992; Schär et al., 2004]. The relevance of variability changes has also been motivated by projections for European heatwaves. While most of the projected changes in the exceedance of temperature thresholds (e.g., heatwave frequency or warm-spell duration indices) are accounted for by a shift in the mean climatological distribution [e.g., Ballester et al., 2010], changes in the intensity of extremes (e.g., changes in uppermost percentile departures or return periods of extremes) are highly sensitive to changes in higher-order statistical moments such as daily temperature variability [Fischer and Schär, 2010].
 In winter, daily temperature variability is projected to decrease, particularly over northern Europe, which implies that cold extremes are warming more than mean temperatures [Fischer et al., 2011b; Kjellström et al., 2007].
 Many of the above findings, in particular on changes at interannual scales, are primarily based on regional climate model (RCM) projections of the European model intercomparison project PRUDENCE [Christensen and Christensen, 2007] or a small subsample of the follow-up project ENSEMBLES [van der Linden and Mitchell, 2009]. Since we identified one simulation in the latter experiment that conflicts with the above projections, we decided to revisit the findings. We here assess the robustness of the projections based on the full set of ENSEMBLES RCMs, which cover a substantially larger uncertainty range than earlier regional experiments.
2. Data and Methods
 Here we present results based on the RCM experiment ENSEMBLES [van der Linden and Mitchell, 2009]. We analyse historical runs and transient projections of the following 14 GCM-RCM chains that are driven by six different GCMs (given in brackets) forced with the SRES A1B emission scenario: C4I (HadCM3Q16), CNRM (ARPEGE), DMI (ECHAM5, ARPEGE), ETHZ (HadCM3Q0), HC (HadCM3Q0, HadCM3Q3, HadCM3Q16), ICTP (ECHAM5), KNMI (ECHAM5), MPI (ECHAM5), SMHI (ECHAM5, HadCM3Q3, BCM). The simulations were performed at a resolution of roughly 25 km over a domain encompassing all of Europe and parts of the North Atlantic and North Africa.
 The results from the ENSEMBLES experiment are compared against projections of the earlier multi-model experiment PRUDENCE. We applied the same analysis to 8 RCMs of the PRUDENCE ensemble (DMI, ETH, GKSS, HC, METNO, KNMI, MPI and SMHI), all driven by the same atmospheric GCM HadAM3 with SSTs from HadCM3 (A2 scenario). Note that most of the modelling centres contributed to both multi-model experiments but often with two different model versions. In contrast to the transient simulations in ENSEMBLES, the PRUDENCE experiment only covers two time slices at the end of the 20th and 21st century. The simulations were evaluated against the ENSEMBLES gridded data set (E-OBS version 5.0) [van den Besselaar et al., 2011] that is available at the native grid of most of the RCMs.
3. Interannual Summer Temperature Variability
 All contributing RCMs in the multi-model experiment PRUDENCE [Christensen and Christensen, 2007] projected a substantial increase in interannual summer temperature variability (hereafter IASV, expressed as standard deviation σ across 30 years of JJA mean temperatures) in central Europe (Figure 1a). Consequently, Central Europe has been highlighted as a hot spot where models show robust increases in IASV [Fischer and Schär, 2009; Schär et al., 2004; Seneviratne et al., 2006; Vidale et al., 2007]. However, in ENSEMBLES the projected changes in IASV at the end of the 21st century are substantially smaller and not significant in the ensemble mean across all 14 GCM-RCM chains (Figure 1b). We find that the changes in IASV at the end of the 21st century vary in sign and magnitude across the GCM-RCM chains. Results from 4 characteristic models that show discrepant IASV signals are shown in Figure 2c (all models are shown in Figure S1 in the auxiliary material). Particularly in Central Europe, here defined as the zonal belt between the Mediterranean and the North and Baltic Sea, RCMs project contrasting changes, ranging from dramatic increases of more than 60% to substantial decreases of 20–40% (Figure S1).
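The IASV measure defined above can be illustrated with a minimal Python sketch that computes the standard deviation across 30 yearly JJA means and the relative change between two periods. The function names, the choice of the sample standard deviation (ddof=1) and the synthetic input values are assumptions made for illustration only; any detrending of the transient simulations is not shown.

```python
import numpy as np

def iasv(jja_means):
    """Interannual summer variability: sample standard deviation
    across yearly JJA mean temperatures (ddof=1 is an assumption)."""
    return float(np.std(np.asarray(jja_means, dtype=float), ddof=1))

def relative_change(present, future):
    """Relative change of a positive quantity, in percent."""
    return 100.0 * (future - present) / present

# Synthetic 30-year series of JJA means in degC (illustrative values only,
# not taken from any model or observational data set).
rng = np.random.default_rng(0)
jja_present = 17.0 + rng.normal(0.0, 1.0, size=30)
jja_future = 20.0 + rng.normal(0.0, 1.4, size=30)

change = relative_change(iasv(jja_present), iasv(jja_future))
print(f"IASV change: {change:+.0f}%")
```

Applied per grid point, the same two functions would yield maps of relative IASV change comparable in kind to those discussed here.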
 There are several methodological differences in the model setup between PRUDENCE and ENSEMBLES that could account for these differences. The most apparent hypothesis is that the lower robustness in the IASV response in ENSEMBLES relates to the fact that a more diverse set of GCMs with RCMs is considered. In PRUDENCE almost all RCMs were driven by the same GCM, while in ENSEMBLES nine RCMs were driven by six different GCMs. In terms of seasonal mean temperature responses, the driving GCM is a dominant source of uncertainty [e.g., Fischer et al., 2011a]. If the uncertainty in the sign of the IASV signal came mainly from the different driving GCMs, one would expect the RCMs driven by the same GCM to show a similar signal. However, this is not always the case. For instance, two RCMs driven by the same GCM, C4I (HadCM3Q16) and HC (HadCM3Q16), show significant but opposite changes in IASV over France (Figure S1). Likewise, the IASV response does not clearly cluster for the same RCMs (e.g., the three SMHI or the HC simulations show distinctly different signals for the three driving GCMs).
Figure 2 suggests that IASV projections (Figure 2c) relate to the representation of present-day IASV state (Figure 2a), as the two RCMs on the left and on the right show distinctly different behaviour. Many RCMs suffer from a substantial overestimation of present-day IASV over Central Europe with respect to observations (Figure 2a, columns 3 and 4), a bias that has earlier been identified in the PRUDENCE experiments [Kjellström et al., 2007]. In total, 8 out of 14 GCM-RCM chains clearly overestimate present-day IASV, particularly in Central Europe (Figure S2), and in some cases by more than a factor of 2 (Figure 2a). The bias does not seem to be predominantly sensitive to the driving GCM. However, the RCM plays an important role, e.g., independent of the driving GCM the SMHI or the HC model tend to simulate similar present-day IASV in Central Europe.
 We find that the two RCMs simulating a greater present-day IASV over Central Europe show a low mean evaporative fraction (EF) (Figures 2b and S3). This relationship is remarkably robust for the whole multi-model ensemble, with negative correlations between EF and IASV across the 14 GCM-RCM chains ranging between −0.74 in Eastern Europe and −0.93 in France (Figure 3). In other words, RCMs that simulate hardly any limitation of latent heat by soil moisture availability (high EF) tend to show small year-to-year differences in EF and thus low IASV (Figure S4). In contrast, RCMs with intermediate EF exhibit significant year-to-year variations in EF (representing dry and moist states). Consequently, a lack of latent heat flux amplifies temperature anomalies in dry years (soil-moisture limitation of evapotranspiration), and dampens them in wet years, giving rise to greater IASV. Such a dependence of IASV on latent heat fluxes and soil moisture availability has earlier been postulated as a key mechanism for changes in IASV [Fischer and Schär, 2009; Seneviratne et al., 2006; Vidale et al., 2007]. Our results support the proposed mechanism with evidence from a substantially larger set of models, and suggest that the mean state of EF accounts for much of the model differences in present-day IASV.
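The relation between EF and IASV across models can be sketched as follows. The sketch assumes the common definition of the evaporative fraction as latent heat flux divided by the sum of latent and sensible heat fluxes, which may differ in detail from the formulation used for the figures. The per-model values are hypothetical and serve only to show how an across-model correlation would be computed.

```python
import numpy as np

def evaporative_fraction(latent, sensible):
    """EF = LE / (LE + H), with fluxes in W m-2 (assumed definition)."""
    latent = np.asarray(latent, dtype=float)
    sensible = np.asarray(sensible, dtype=float)
    return latent / (latent + sensible)

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical regional means, one entry per model chain: summer mean EF
# and present-day IASV in degC. These are not values from ENSEMBLES.
mean_ef = np.array([0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75])
iasv_values = np.array([1.9, 1.7, 1.6, 1.3, 1.2, 1.0, 0.9])

r = pearson_r(mean_ef, iasv_values)
print(f"across-model correlation between EF and IASV: {r:.2f}")
```

A strongly negative r in such a calculation corresponds to the behaviour described above, in which models with a wetter mean state (high EF) show lower IASV.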
 As highlighted above, the two RCMs in Figure 2c that project a stronger increase in IASV show a distinctly lower present-day IASV than the two other RCMs (Figure 2a). This negative correlation between present-day IASV and its change is robust across the whole ensemble (Central Europe: r = −0.72; Eastern Europe: r = −0.77; France: r = −0.74) (Figure 4). In other words, RCMs that overestimate present-day IASV project only weak further increases, or in some cases even a reduction in IASV, whereas RCMs with realistic present-day IASV project a strong increase. This behaviour, which is also seen in other regions (Figure S5), is consistent with the physical mechanism proposed above, which explains the model differences in the present-day representation of IASV. Under climate change all RCMs experience a general reduction in EF as a result of a general drying of the land surface. However, if present-day EF is low, the drying does not translate into a further increase of IASV and in some cases even leads to an IASV reduction. We suggest that this is due to the fact that once a model is dry in most of the summers and the mean soil moisture state approaches the plant wilting point, the variability does not further increase or may even decrease if soils dry out in all summers. Our findings highlight that projections of the European summer climate are highly dependent on the representation of soil moisture-atmosphere interactions, which in turn depend upon feedbacks with the boundary layer, clouds and radiation.
 The strong relation between simulations of present-day conditions and projected changes in IASV suggests that the observations can be used to constrain the model ensemble. This approach is based on the assumption that a model with a better representation of the present-day conditions provides a more credible estimate of future changes. Taking into account the proposed physical mechanism, such an approach seems to be justified in this context. Figure 4 shows the observed IASV for 1970–1999 (red line) and the 5–95% confidence interval representing the sampling uncertainty due to internal climate variability (blue range). For the regions France and Eastern Europe, as well as for the whole of Central Europe (here defined as the latitudes between the Mediterranean and the North and Baltic Sea), 6–7 GCM-RCM chains fall in the confidence interval of the observed IASV.
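The constraint described above can be sketched in Python as a two-step procedure: estimate a 5–95% interval for the observed IASV, then retain the model chains whose present-day IASV falls inside it. How the interval is estimated is not specified in the text; resampling years with replacement (a bootstrap) is one simple option and is purely an assumption of this sketch, as are the function names, the model labels and all numerical values.

```python
import numpy as np

def bootstrap_std_interval(series, n_boot=2000, seed=0):
    """5-95% interval of the sample standard deviation, obtained by
    resampling years with replacement (assumed method)."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    idx = rng.integers(0, series.size, size=(n_boot, series.size))
    stds = np.std(series[idx], axis=1, ddof=1)
    lo, hi = np.percentile(stds, [5, 95])
    return float(lo), float(hi)

def constrain(model_iasv, lo, hi):
    """Names of the models whose present-day IASV lies within [lo, hi]."""
    return [name for name, value in model_iasv.items() if lo <= value <= hi]

# Synthetic "observed" JJA means for 30 years (illustrative only).
obs = 17.0 + np.random.default_rng(1).normal(0.0, 1.0, size=30)
lo, hi = bootstrap_std_interval(obs)

# Hypothetical present-day IASV per model chain, in degC.
models = {"chain_a": 0.9, "chain_b": 1.1, "chain_c": 2.0}
print(f"interval: [{lo:.2f}, {hi:.2f}] degC, retained: {constrain(models, lo, hi)}")
```

A bootstrap over years ignores serial correlation between summers; a parametric interval or block resampling would be alternatives with the same selection step.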
 All RCMs with a realistic representation of present-day IASV project an increase in IASV in Eastern Europe or averaged across the whole of Central Europe. Likewise, in France the RCMs with good observational agreement for IASV tend to show no change or an increase in IASV, with one exception. In summary, while the mean of all ENSEMBLES models shows no clear IASV signal, a reduced ensemble consisting of RCMs with a more realistic representation of present-day IASV projects a significant increase in IASV over Central Europe (Figure 1c). The mean response in the constrained ensemble (Figure 1c) obscures the fact that all six GCM-RCM chains actually project a pronounced IASV increase over some substantial region, but not necessarily over the exact same region (Figure S1). In the constrained ensemble mean, the area of significant changes is thus substantially smaller than in the individual members and situated somewhat southeast of the one projected in PRUDENCE (Figure 1a). Again, the difference in the patterns presumably relates to the different present-day mean state. Consequently, all RCMs in the constrained ensemble experience a transition from a wet to intermediate state and thus an IASV enhancement but not necessarily over the same region. The constrained ensemble further yields a substantially smaller mean summer warming over central Europe than the raw ENSEMBLES mean (Figure S6), suggesting that some of the systematic mean summer temperature biases [cf. Boberg and Christensen, 2012; Buser et al., 2009; Christensen et al., 2008] are related to the same mechanisms as highlighted for IASV.
4. Variability at Daily Time Scales
 The daily summer temperature variability (hereafter DSV, expressed as standard deviation σ across all daily summer temperatures in a 30-yr time window) is overestimated with respect to observations in many RCMs (Figures 2d and S7) [see also Kjellström et al., 2010]. The DSV bias is larger in absolute terms (somewhat smaller in relative terms) and similar in its spatial pattern to the bias in IASV. The same RCMs suffer from the largest overestimation of variability at daily and interannual scales. We find a very similar and equally pronounced model dependence of DSV on the mean summer evaporative fraction (Figures 3d–3f). Models with an intermediate mean state (i.e., lower EF) tend to show substantially higher variability than others with a wet mean state (Figures 2b and 2d).
 The projected changes for DSV are more robust than for IASV (Figures 1d and 1e). All ENSEMBLES RCMs project a substantial increase in DSV over Central Europe, with greatest changes varying in latitude (Figure S8). The DSV climate change signal is robust across much of this region and is greatest along the northern coast of the Mediterranean, consistent with the pattern described based on PRUDENCE and a subsample of the ENSEMBLES models [Fischer and Schär, 2010]. DSV can be interpreted as a combination of a low frequency (e.g., interannual), medium-term frequency (variability induced by seasonal cycle) and high frequency (intra-seasonal day-to-day) component [Fischer and Schär, 2009]. For changes in DSV the contribution of the high-frequency day-to-day variability has been found to be clearly dominant [Fischer and Schär, 2009]. We suggest that the robust DSV changes here mainly arise from enhanced variability at day-to-day time scales as well as a more pronounced seasonal cycle. The DSV response is independent of the measure for variability and very similar if expressed with non-parametric estimates such as the interquantile range between 5th and 95th percentile (Figure S9) or the interquartile range.
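The two variability measures mentioned above, the standard deviation across all daily summer temperatures and the interquantile range between the 5th and 95th percentile, can be compared with a short Python sketch. The function names, the 92-day JJA season length and the synthetic Gaussian input are assumptions for illustration; real daily series also contain a seasonal cycle and day-to-day persistence that are not represented here.

```python
import numpy as np

def dsv(daily_temps):
    """Daily summer variability: sample standard deviation across all
    daily summer temperatures in the time window."""
    return float(np.std(np.asarray(daily_temps, dtype=float), ddof=1))

def interquantile_range(daily_temps, lower=5.0, upper=95.0):
    """Non-parametric spread: difference between two percentiles."""
    values = np.asarray(daily_temps, dtype=float)
    lo, hi = np.percentile(values, [lower, upper])
    return float(hi - lo)

# Synthetic daily temperatures for 30 summers of 92 days (degC).
rng = np.random.default_rng(1)
daily_present = 18.0 + rng.normal(0.0, 3.0, size=30 * 92)
daily_future = 21.0 + rng.normal(0.0, 3.6, size=30 * 92)

for name, measure in [("std", dsv), ("p95-p5", interquantile_range)]:
    ratio = measure(daily_future) / measure(daily_present)
    print(f"{name}: future/present = {ratio:.2f}")
```

For this synthetic input both measures give a similar ratio, which is the kind of agreement between parametric and non-parametric estimates referred to above.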
 The relation between present-day DSV and its climate change signal is less pronounced than for IASV. We find a similar relation as for IASV but further south over the Mediterranean region and the Iberian Peninsula, where RCMs with high (low) present-day DSV show no (strong) DSV increase towards the end of the 21st century. This supports earlier findings, which suggested that in a dry model state, temperatures in southern Europe reach an upper bound which is rarely exceeded and is presumably related to a complete drying of the soil (i.e., EF approaches zero) [Fischer and Schär, 2009].
 The increase in temperature variability has strong implications for changes in hot extremes. The hottest days tend to warm substantially more than the mean summer temperatures. This is consistent with results of the CMIP3 GCM multi-model experiment, in which 15 of 16 models show enhanced variability (not shown), and with the amplified warming in the extremes described in Kharin et al. and Orlowsky and Seneviratne.
 In winter on the other hand, RCMs consistently project reduced daily temperature variability over northern Europe (Figure S10), which is consistent with earlier studies [Fischer et al., 2011b; Kjellström et al., 2007]. North of about 50°N all the models project reduced daily winter variability. This implies that winter cold extremes warm more than the winter mean temperatures. It has been argued that this reduced variability relates to the area of strongest snow melt (reduction in snow covered fraction) [Fischer et al., 2011b; Gregory and Mitchell, 1995].
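The link between variability changes and the warming of extremes can be made explicit with a simple idealization. If daily temperatures were Gaussian, a quantile would equal the mean plus a fixed multiple z of the standard deviation, so its change is the mean warming plus z times the change in standard deviation. The sketch below evaluates this relation; the Gaussian assumption, the function name and all numbers are illustrative and are not taken from the simulations.

```python
from statistics import NormalDist

def quantile_warming(mean_warming, sigma_present, sigma_future, q):
    """Warming of the q-quantile of a Gaussian distribution whose mean
    shifts by mean_warming and whose standard deviation changes."""
    z = NormalDist().inv_cdf(q)  # standard normal quantile
    return mean_warming + z * (sigma_future - sigma_present)

# Summer-like case: variability increases, so the hot tail (q = 0.95)
# warms more than the mean.
hot = quantile_warming(3.0, 3.0, 3.6, 0.95)

# Winter-like case: variability decreases, so the cold tail (q = 0.05)
# warms more than the mean.
cold = quantile_warming(3.0, 3.0, 2.4, 0.05)

print(f"hot tail: {hot:.2f} degC, cold tail: {cold:.2f} degC, mean: 3.00 degC")
```

In both cases the tail warms by about one degree more than the mean for a 0.6 degC change in standard deviation, since z is close to ±1.645 at these quantiles.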
5. Discussion and Conclusions
 We have revisited projections of European summer temperature variability based on the recent multi-model experiment ENSEMBLES. Unlike the previous PRUDENCE simulations (based on a single GCM with a simplified representation of SST variations), the ENSEMBLES experiment considers fully coupled transient simulations performed with six atmosphere-ocean GCMs. The projected changes in interannual summer temperature variability (IASV) are found to be more diverse than in earlier studies based on the PRUDENCE experiment. While in PRUDENCE every participating model projected substantially enhanced IASV, the ENSEMBLES models show generally smaller increases, and some models even a significant decrease.
 The climate change signal in IASV varies across GCM-RCM chains without clear dependency on the driving GCM. Although mean temperatures are largely determined by the driving GCM [Fischer et al., 2011a], we find that the IASV signal is determined by both GCM and RCM, likely through their influences upon the large-scale circulation and the regional interaction between land surface and atmosphere, respectively. If anything, the signal tends to cluster by RCM rather than by GCM.
 We find that the IASV climate change signal clearly depends on the quality of the present-day simulations. It becomes stronger and more prominent, when the models considered are constrained to credibly represent current interannual variability. This supports earlier findings suggesting an increase in IASV over a zonal belt between North and Baltic Sea and Mediterranean. In general, IASV is found to strongly depend on the state of the land surface, namely the evaporative fraction that expresses the partitioning of turbulent fluxes into sensible and latent heat fluxes. If evaporative fraction is high (when latent heat fluxes are hardly limited by soil moisture availability) IASV tends to be low. For intermediate land surface conditions (when latent heat fluxes are occasionally limited) IASV is enhanced. Finally, for very dry land surface conditions (when the evaporative fraction is low in virtually any summer) the IASV does not further increase but tends to decrease. This interpretation is consistent with studies of anomalous European summers [Fischer et al., 2007; Schär et al., 2004] and has been advanced in relation to climate change based on a sensitivity experiment with a single RCM [Seneviratne et al., 2006]. This result supports the idea that large increases in IASV are restricted to a transition zone between the dry climates to the south (where evaporation is largely soil-moisture limited) and the moist climate to the north (where soil moisture is abundant and evaporation is largely radiation limited) [Koster et al., 2009]. This transition occurs in most of the models but not at the same latitude. As a result the changes may often be diluted in the ensemble mean signal.
 The change in daily summer temperature variability (DSV) over south-central Europe is more robust and found in all GCM-RCM chains of the ENSEMBLES project and all PRUDENCE RCMs. We even find the same signal in all but one of the CMIP3 models (not shown). Using variability composition, Fischer and Schär revealed that DSV increases predominantly due to higher variability at day-to-day time scales, which is enhanced due to the same mechanisms as the diurnal temperature range. Only under very dry conditions over southernmost Europe do RCMs suggest a reduction in variability, due to a distinct upper temperature bound reached when soils become completely dry.
 The RCMs used here are run at a resolution of about 25 km and thus sub-grid scale processes such as convection and land-surface processes need to be parameterized. This is a significant limitation and results should be interpreted with caution, as fundamental feedback processes such as the soil moisture-precipitation feedback have been shown to be sensitive to the representation of (parameterized or explicitly resolved) convection [Hohenegger et al., 2009]. The representation of convection is also relevant as variability in surface shortwave radiation contributes to enhanced IASV [Lenderink et al., 2007; Fischer and Schär, 2009]. Concerns about parameterized processes are particularly relevant since models tend to overestimate temperatures for hot European summer conditions, and since climate change projections depend on the treatment of biases [Boberg and Christensen, 2012; Buser et al., 2009; Christensen et al., 2008]. We further expect that particularly at interannual scales temperature variability is sensitive to changes in atmospheric circulation variability. Due to a lack of available mid-tropospheric fields in the ENSEMBLES archive, it was not possible to carefully assess the contribution arising from circulation here.
 On a more fundamental level, our study supports the idea that models should rigorously be assessed regarding their ability to represent various aspects of the climate system. If a robust relationship between current conditions and projections is identified, it should be used to constrain the model ensemble. This has two implications. First, model evaluation should address multiple variables related to underlying key processes, and should not only address the mean climate but also other aspects such as variability or trends. Second, methodologies used to constrain model ensembles should be as objective as possible. In the current study a subjective methodology has been explored that appeared successful for the current purpose. However, more comprehensive approaches are needed to exploit the potential of observations to constrain model projections.
 Together with results for winter variability changes, our study yields a robust signal across ENSEMBLES and PRUDENCE that in summer and south-central Europe hot extremes warm more strongly than the mean, and in winter and northern Europe cold extremes warm more strongly than mean temperatures.
 The Editor thanks Randall Dole and an anonymous reviewer for assisting in the evaluation of this paper.