Land-climate coupling has been shown to be important for European summer climate variability and extreme events. However, the sensitivity of these feedbacks to land surface model (LSM) choice has been little investigated up to now. In this study, we assess the impact of the LSM on the simulated climate variability in a regional climate model (RCM). The experiments were conducted with the COSMO-CLM2 RCM. COSMO-CLM2 can be run with two alternative LSMs, the 2nd-generation LSM TERRA_ML or the more sophisticated 3rd-generation LSM Community Land Model (CLM3.5). The analyzed simulations include control and sensitivity experiments with prescribed soil moisture (dry or wet). Using CLM3.5 instead of TERRA_ML improves the simulated temperature variability by alleviating an overestimation of temperature inter-annual variability in the RCM. Also, the representation of the probability density functions of daily maximum summer temperature is improved when using the more advanced LSM. The reduced climate variability is linked to a larger ground heat flux and smaller variability in soil moisture and short-wave radiation. The latter effect results from the coupling of the LSM to the atmospheric module. In addition, using CLM3.5 reduces the sensitivity of COSMO-CLM2 to extreme soil moisture conditions. An analysis assessing the relationship between the standard precipitation index and the subsequent number of hot days in summer reveals a better representation of this relationship using CLM3.5. Hence, we find that biases in climate variability and extremes can be reduced and the representation of land-climate coupling can be improved with the use of the more sophisticated LSM.
 The term “land-climate coupling” refers to the degree to which the land surface controls the climate in a given region, for instance through evapotranspiration. One of the most important aspects in this context is the soil moisture-temperature feedback. When the availability of soil moisture limits the energy used for the latent heat flux, more energy is used for the sensible heat flux, consequently increasing near surface temperature [Seneviratne et al., 2010]. Land-climate coupling is of varying strength in different regions of the world. The variability of evapotranspiration is large enough to influence climate only in the transitional regions between dry and wet climates [Koster et al., 2004; Seneviratne et al., 2010]. However, the definition of these transitional regions is not static. Regions characterized by an overall dry or wet climatology can occasionally present a transitional soil moisture regime, too. For example, the summer 2003 in Central Europe was so dry that the occurring heat wave was enhanced by the lack of soil moisture [e.g., Ferranti and Viterbo, 2006; Fischer et al., 2007; García-Herrera et al., 2010; Seneviratne et al., 2012b]. In addition, not only soil moisture availability but also vegetation properties determine the partitioning of net radiation into latent and sensible heat flux [Bonan, 2008; Teuling et al., 2010; Williams et al., 2012] and can, therefore, impact air temperature, boundary layer stability, or precipitation [Seneviratne et al., 2010]. Thus, to study land-climate coupling with climate models, we need land surface models (LSMs) which represent soil hydrology and vegetation processes realistically.
 Regional climate models (RCMs) are useful tools to study land-climate coupling, since many of the involved processes are regional in nature [Giorgi, 2006]. These studies are strongly dependent on how climate models represent the land surface and its coupling to the atmosphere [Irannejad et al., 2003]. Land surface models represent hydrological, biogeophysical and biogeochemical processes which determine the exchange of radiation, heat, water and carbon between the land surface and the atmosphere. Most current RCMs use relatively simple 2nd-generation LSMs [Davin et al., 2011; Subin et al., 2011] which are not optimal to investigate biosphere-climate feedbacks, despite improvements compared to even simpler earlier bucket-model schemes [Sellers et al., 1997; Pitman, 2003]. Recently, efforts were made to couple sophisticated 3rd-generation LSMs to RCMs [e.g., Steiner et al., 2009; Subin et al., 2011; Davin et al., 2011; Davin and Seneviratne, 2012].
 In this study, we use a coupled land-atmosphere RCM including two alternative LSMs (a 2nd-generation and a 3rd-generation scheme). This set-up allows us to assess the role of land surface representation in simulating the European climate. In previous efforts to couple sophisticated 3rd-generation LSMs to RCMs, analyses were focused on the evaluation of annual or seasonal mean climate [Steiner et al., 2009; Subin et al., 2011; Davin et al., 2011; Davin and Seneviratne, 2012]. The role of land surface parameterization for land-atmosphere coupling has also been investigated with a global climate model coupled to three different LSMs [Wei et al., 2010a, 2010b] but without investigating the effect on climate extremes. In contrast, we focus here on inter-annual climate variability and climatic extremes. To our knowledge, the role of land surface parameterization choice for climate variability and extremes in RCMs has not been studied in detail in Europe.
 The aim of the study is to investigate whether the RCM coupled to the more sophisticated LSM adequately represents land-climate coupling, climate variability and extremes. If so, this model version can be used for investigations of biosphere-climate feedbacks in the future. To this end, we perform control runs and prescribed extreme soil moisture experiments with both LSMs. We investigate whether the relevant processes are correctly represented in the simulations by comparing our results with observations. In particular, we assess possible improvements (or lack thereof) when this new model is compared to that with the simpler 2nd-generation scheme.
 The structure of this article is as follows: Section 2 describes the models and data used in this study as well as the methodologies used in the analysis. Section 3 presents the results for mean summer climate and land-climate coupling. In section 4 we focus on climate variability and temperature extremes. In addition, we investigate the connection between heat waves and droughts. A discussion of the main results as well as the conclusions of this study are provided in section 5.
2. Methods and Data
2.1. Model Description
 In this study, we perform simulations with COSMO-CLM2 [Davin et al., 2011; Davin and Seneviratne, 2012], an RCM based on the combination of COSMO-CLM [Rockel et al., 2008] and CLM3.5 [Oleson et al., 2008]. COSMO-CLM is a non-hydrostatic RCM jointly used by the COnsortium for Small-scale Modeling (COSMO) and the Climate Limited-area Modeling Community (CLM-Community). We use version 4.8.11 of COSMO-CLM with a second-order leapfrog scheme for the time integration. Vertical turbulent mixing is parameterized according to a level 2.5 closure using Turbulent Kinetic Energy (TKE) as a prognostic variable [Mellor and Yamada, 1974, 1982]. For moist convection, we use the mass flux scheme of Tiedtke .
 The native LSM in COSMO-CLM, TERRA_ML [Grasselt et al., 2008], is retained within the COSMO-CLM2 framework so that either CLM3.5 or TERRA_ML can be used with the same atmospheric model. Table 1 summarizes the differences between CLM3.5 and TERRA_ML, and Davin et al. provide a more detailed description. CLM3.5 is a state-of-the-art LSM, which is overall more sophisticated than TERRA_ML. CLM3.5 uses a tile approach and explicitly represents surface heterogeneity, whereas TERRA_ML does not represent sub-grid scale heterogeneity. The radiation fluxes are calculated separately for canopy and soil/snow surfaces in CLM3.5, whereas they are derived from the simulated grid-scale surface albedo and temperature in TERRA_ML. Both models solve the Richards equation for hydrological processes, but CLM3.5 has a prognostic groundwater model coupled to the lowest soil level. In addition, CLM3.5 explicitly calculates stomatal conductance and photosynthesis. Hence, the two LSMs differ considerably in their degree of complexity. On the one hand, these differences allow us to investigate the role of land surface representation in the context of land-climate coupling. On the other hand, these differences are manifold and complex, making an exhaustive cause-effect analysis difficult.
Table 1. Main Differences Between the Land Surface Models CLM3.5 and TERRA_ML
Process | CLM3.5 | TERRA_ML
Sub-grid scale heterogeneity | Explicit, multiple land units per grid cell (e.g., glacier, lake, vegetated) | No explicit sub-grid scale heterogeneity
Radiation fluxes | Short-wave and long-wave radiation fluxes calculated for canopy and soil/snow surface, two-layer scheme | Derived from simulated grid-scale surface albedo and temperature
Soil hydrology | Richards equation, 10 soil layers, prognostic groundwater model | Richards equation, 8 soil layers
Stomatal conductance and photosynthesis | Explicit link, C3 and C4 plants | Empirical relation, photosynthesis not represented
2.2. Experimental Design
 We perform 6 model runs with the COSMO-CLM2 RCM using the two alternative LSMs. The two control runs (CTLTERRA and CTLCLM) have interactive soil moisture. Additionally, we perform experiments with prescribed very high (“WET”) or very low (“DRY”) soil moisture. Unlike the control runs, in these experiments soil moisture evolution is decoupled from the atmospheric state (the same method was used, e.g., in Koster et al.  and Seneviratne et al. ). In these uncoupled simulations, soil moisture is prescribed in all soil levels at each time step for each grid point separately according to soil type. We either prescribed soil moisture to very wet conditions (field capacity) for “WETTERRA” and “WETCLM” or to very dry conditions (0.05 vol-%) for “DRYTERRA” and “DRYCLM.” Table 2 provides an overview of all model experiments.
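Table 2. Overview of Model Runs and Their Acronyms
Acronym | LSM | Soil moisture
CTLCLM | CLM3.5 | control run with interactive soil moisture
DRYCLM | CLM3.5 | soil moisture prescribed to 0.05 vol-%
WETCLM | CLM3.5 | soil moisture prescribed to field capacity
CTLTERRA | TERRA_ML | control run with interactive soil moisture
DRYTERRA | TERRA_ML | soil moisture prescribed to 0.05 vol-%
WETTERRA | TERRA_ML | soil moisture prescribed to field capacity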
 All experiments are conducted over a European domain with 0.44° (≈50 km) horizontal resolution, 32 vertical layers, and a model time step of 240 seconds. We derived the lateral boundary conditions from the ERA-Interim re-analysis data (www.ecmwf.int/research/era/do/get/era-interim, Dee et al.). ERA-Interim is the latest ECMWF global atmospheric re-analysis which covers the recent data-rich period and is continuing in real time. The simulations cover the period 1989–2008. The first year is used as spin-up and we thus analyze only data from 1990–2008 in the following. Figure 1 shows the model domain with its topography and the definition of the sub-regions used for the analysis.
2.3. Evaluation Data Sets
 To evaluate the control runs, we use the E-OBS gridded version 5.0 of the European Climate Assessment and Dataset (ECA&D) [Haylock et al., 2008]. E-OBS is a daily gridded observational data set for precipitation and temperature in Europe based on ECA&D information. The full data set covers the period 1950–2009; however, we only use data from 1990–2008, given the length of the simulations.
 We analyze the simulations over seasons, such as JJA (June, July, August) for summer. The mean of variables such as temperature and precipitation is calculated over the respective months over the period 1990–2008. As a measure for inter-annual climate variability we use the inter-annual standard deviation (σ) of the respective variable. In addition, we use the evaporative fraction (calculated as LE/(LE + SH) for daily data) which indicates how much of the available energy is used for evapotranspiration. The evaporative fraction is a good measure of the evaporative regime, with low values indicating soil moisture limitation and high values found in energy-limited regimes [e.g., Seneviratne et al., 2010]. Another measure used in the analyses is the correlation between temperature and latent heat fluxes (Corr(T, LE), see also Table 3) as a measure for soil moisture-temperature coupling. Again we use daily data. A negative correlation of temperature and latent heat flux indicates moisture limitation whereas positive correlations are found when the latent heat flux is energy limited [Seneviratne et al., 2006; Jaeger et al., 2009]. We also calculated these measures for daily summer FLUXNET data. We did not correct for energy balance closure, but the measures we analyze are unlikely to be strongly affected by the lack of closure, as the Bowen ratio (SH/LE) can be assumed to be approximately correct in the measurements [Foken et al., 2012].
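The three measures defined above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the example values, and the use of the sample (n - 1) standard deviation for σ are assumptions made for illustration only.

```python
import math
import statistics

def evaporative_fraction(le, sh):
    """EF = LE / (LE + SH) for one day; None when LE + SH is zero."""
    total = le + sh
    return le / total if total != 0 else None

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def interannual_sigma(seasonal_means):
    """Standard deviation of one seasonal mean per year (sample form, assumed)."""
    return statistics.stdev(seasonal_means)

# Hypothetical daily values: fluxes in W/m^2, temperature in deg C.
le = [120.0, 100.0, 80.0, 60.0]
sh = [40.0, 60.0, 80.0, 100.0]
t = [22.0, 24.0, 26.0, 28.0]

ef = [evaporative_fraction(a, b) for a, b in zip(le, sh)]
corr_t_le = pearson(t, le)  # negative here: LE drops as T rises
```

In this made-up example the latent heat flux falls as temperature rises, so Corr(T, LE) is negative, which is the signature of a soil moisture-limited regime described in the text.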
Table 3. Overview of the Climate Indices Used in This Study
Index | Definition
σ | Standard deviation of respective variable, measure for inter-annual variability
LE/(LE + SH) | Evaporative fraction, fraction of available energy used for evapotranspiration
Corr(T, LE) | Correlation between temperature and latent heat flux, measure for land-climate coupling, negative correlation indicates moisture limitation of the latent heat flux
perc90 | 90th-percentile of daily Tmax
nhd | Number of hot days with Tmax > long-term (1990–2008) 90th-percentile of CTL
hwdimean | 90th-percentile-based mean heat wave length. Mean of all spells with at least two consecutive days with Tmax > long-term (1990–2008) 90th-percentile of CTL
SPI | Standard precipitation index, standardized drought index taking into account accumulated precipitation over preceding 3 months [McKee et al., 1993]
2.4.2. Significance and Skill Score
 We test if the differences between using CLM3.5 or TERRA_ML are statistically significant. The numbers in the lower-right corner of the difference maps indicate the area-weighted fraction of land points at which the null hypothesis of ‘being from the same distribution’ is rejected at the 5% level according to the two-sided Kolmogorov-Smirnov test (as used in Jaeger and Seneviratne).
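As a rough illustration of this significance measure, the sketch below computes a two-sample Kolmogorov-Smirnov statistic per grid point and the area-weighted fraction of rejections. Several choices are assumptions and do not come from the study: the asymptotic p-value approximation, the cos(latitude) area weights, and all function names. A library routine such as `scipy.stats.ks_2samp` could replace the hand-written test.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in a + b:
        fa = sum(1 for x in a if x <= v) / len(a)
        fb = sum(1 for x in b if x <= v) / len(b)
        d = max(d, abs(fa - fb))
    return d

def ks_pvalue(d, n1, n2):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    ne = n1 * n2 / (n1 + n2)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:  # series converges slowly here; p is effectively 1
        return 1.0
    q = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                  for j in range(1, 101))
    return min(max(q, 0.0), 1.0)

def rejected_area_fraction(points, alpha=0.05):
    """points: (latitude_deg, sample_a, sample_b) per land grid point.

    Weights each point by cos(latitude), an assumed proxy for grid-cell area.
    """
    total = rejected = 0.0
    for lat, a, b in points:
        w = math.cos(math.radians(lat))
        total += w
        if ks_pvalue(ks_statistic(a, b), len(a), len(b)) < alpha:
            rejected += w
    return rejected / total
```

With 19 values per sample (one per analyzed year, 1990–2008), a point whose two samples do not overlap at all is rejected, whereas identical samples are not.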
 Moreover, we use a skill score defined by Perkins et al. which tests how well a model captures the observed probability density function (PDF). In section 4.2 we use this metric for the PDFs of the daily maximum temperatures. It measures the overlapping area between two PDFs and equals one for perfect agreement between model and observations. It is calculated by summing, over all bins, the smaller of the modeled and observed frequencies (equation (1)):
$$S_{score} = \sum_{i=1}^{n} \min(Z_m, Z_o) \qquad (1)$$
where n is the number of bins used to calculate the PDFs, and Zm and Zo correspond to the frequency of values in a given bin from the model and the observations, respectively.
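The overlap measure described above can be sketched in a few lines of Python. The bin width of 1.0 (in the units of the data) and the function name are illustrative assumptions. The binning actually used in the study is not stated in this section.

```python
def pdf_skill_score(model, obs, bin_width=1.0):
    """Overlap of two binned empirical PDFs: sum over bins of min(Zm, Zo)."""
    lo = min(min(model), min(obs))
    hi = max(max(model), max(obs))
    n_bins = int((hi - lo) // bin_width) + 1  # common bins for both samples

    def frequencies(values):
        counts = [0] * n_bins
        for v in values:
            counts[int((v - lo) // bin_width)] += 1
        return [c / len(values) for c in counts]  # relative frequency per bin

    zm, zo = frequencies(model), frequencies(obs)
    return sum(min(m, o) for m, o in zip(zm, zo))
```

Identical samples give a score of one, samples with no shared bins give zero, and intermediate values can be read as the fraction of the observed distribution captured by the model.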
2.4.3. Climate Extremes
 We use several hot temperature and drought indices to investigate climate extremes. Table 3 provides an overview of the employed indices. As a measure for extreme temperatures we use the 90th-percentile of daily maximum temperature (perc90). The number of hot days (nhd) counts the number of days where the daily maximum temperature (Tmax) is above perc90. The heat wave duration index (hwdimean) is the mean length of all heat spells where Tmax is above perc90 for at least two consecutive days [Lorenz et al., 2010].
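The temperature indices defined above can be sketched as follows. The percentile interpolation rule and the function names are assumptions, and the threshold passed to the two counting functions is meant to be the long-term perc90 of the control run, as stated in Table 3.

```python
def percentile(values, q):
    """Percentile (q in 0..100) by linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def number_of_hot_days(tmax, threshold):
    """nhd: days with Tmax strictly above the threshold."""
    return sum(1 for t in tmax if t > threshold)

def mean_heat_wave_duration(tmax, threshold, min_length=2):
    """hwdimean: mean length of spells with at least min_length hot days in a row."""
    spells, run = [], 0
    for t in tmax:
        if t > threshold:
            run += 1
        else:
            if run >= min_length:
                spells.append(run)
            run = 0
    if run >= min_length:  # a spell may last until the end of the series
        spells.append(run)
    return sum(spells) / len(spells) if spells else 0.0
```

For a hypothetical series such as `[20, 31, 32, 20, 33, 20, 31, 32, 33, 34]` with a threshold of 30, the single hot day in the middle counts toward nhd but not toward the mean heat wave duration, because it does not form a spell of at least two days.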
 As a measure for duration and intensity of droughts we use the standard precipitation index (SPI). SPI is a standardized index which takes into account the accumulated precipitation of the preceding months [McKee et al., 1993]. The SPI can be calculated for different time periods. We use here the 3-month SPI. Consequently, precipitation deficits are computed from the three months preceding the current month. Then, a gamma distribution is fitted to the cumulative precipitation separately for each ending month for the whole time series (to take into account seasonal differences in distributions). These cumulative distributions are then transformed into a standard normal distribution (with mean zero and variance of one) which gives the value of the SPI for three months [McKee et al., 1993; Lloyd-Hughes and Saunders, 2002]. However, since this approach is not practical for computing SPI for a large number of data points, we use an approximate conversion (following Lloyd-Hughes and Saunders). A main advantage of the SPI is that it only depends on precipitation, for which relatively exhaustive measurement networks exist (unlike for soil moisture, for instance). A disadvantage is that it does not necessarily capture the full range of droughts. Nevertheless, studies have shown that SPI can be well related to soil moisture droughts [e.g., Hirschi et al., 2011; Mueller and Seneviratne, 2012].
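The two-step logic of the SPI (accumulate precipitation, then map its cumulative probability onto a standard normal variable) can be illustrated with the simplified sketch below. It deliberately swaps the fitted gamma distribution for an empirical cumulative probability (rank / (n + 1)), so it is not the procedure used in the study. The function names are hypothetical.

```python
from statistics import NormalDist

def rolling_sum(monthly_precip, window=3):
    """Precipitation accumulated over `window` months ending at each month."""
    return [sum(monthly_precip[i - window + 1:i + 1])
            for i in range(window - 1, len(monthly_precip))]

def spi_from_accumulations(accum):
    """Standardize accumulations for ONE ending calendar month across years.

    Swapped-in step: an empirical cumulative probability, rank / (n + 1),
    stands in for the fitted gamma distribution described in the text.
    Ties are ranked in input order.
    """
    n = len(accum)
    order = sorted(range(n), key=lambda i: accum[i])
    spi = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Map the cumulative probability onto the standard normal quantile.
        spi[i] = NormalDist().inv_cdf(rank / (n + 1))
    return spi
```

Applied separately to each ending month, as in the text, the driest accumulation in the record receives the most negative SPI and the median accumulation an SPI close to zero.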
2.4.4. Quantile Regressions
 Quantile regression was introduced as an extension of ordinary least squares regression. It is used to assess the response of a variable in all parts of its data distribution and not only in the mean. Instead of using conditional mean functions as in ordinary least squares regression, conditional quantile functions are used. This method has often been used in econometrics [Koenker and Bassett, 1978; Koenker, 2005]. Recently, it has also been used in geophysics [Barbosa, 2008] and climatology [Hirschi et al., 2011; Quesada et al., 2012; Mueller and Seneviratne, 2012] (for details see the above-mentioned references). We look at the response of nhd to SPI. The calculations were done with R using the “quantreg” package. We calculated the significance of the slopes using 1000 bootstrapping samples and the xy-pair method, but only for nhd in July and the preceding SPI. The figures show the regression slopes for the number of hot days in June, July and August and the 3-month SPI in the (respective) preceding month. Results for the 6-month SPI are similar but the region with statistical significance is largely reduced (not shown).
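To make the idea concrete, the sketch below fits a single-predictor quantile regression line by minimizing the quantile (check) loss, and resamples (x, y) pairs for an xy-pair bootstrap. It is a plain Python illustration, not the R "quantreg" routine used in the study. The brute-force search over lines through two data points is a swapped-in technique that is exact for one predictor but only practical for small samples, and all function names are hypothetical.

```python
import random

def pinball_loss(x, y, a, b, tau):
    """Quantile (check) loss of the line y = a + b*x at quantile tau."""
    total = 0.0
    for xi, yi in zip(x, y):
        r = yi - (a + b * xi)
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total

def quantile_regression(x, y, tau):
    """Return (intercept, slope) minimizing the check loss.

    With one predictor, an optimal line passes through two data points,
    so every such candidate line is evaluated.
    """
    best = None
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue  # vertical line, not a valid candidate
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = pinball_loss(x, y, a, b, tau)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]

def bootstrap_slopes(x, y, tau, n_boot=1000, seed=0):
    """xy-pair bootstrap: resample (x, y) pairs with replacement and refit."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample, draw again
        slopes.append(quantile_regression(xs, [y[i] for i in idx], tau)[1])
    return slopes
```

Here x would hold the SPI of the preceding month and y the number of hot days, one pair per year. The spread of the bootstrap slopes could then be used to judge whether a negative slope at the 90% quantile is distinguishable from zero; how exactly significance was derived in the study is not specified in this section.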
3. Mean Climate and Land-Climate Coupling
 First, we investigate the impact of the different LSMs on mean summer climate and land-climate coupling. Figure 2 displays mean summer temperature, precipitation, evaporative fraction and the correlation between temperature and latent heat flux (Corr(T, LE)) for CTLCLM (Figures 2a–2d), CTLTERRA (Figures 2e–2h), and the difference between CTLCLM and CTLTERRA (Figures 2i–2l).
3.1. Mean Summer Climate
 Differences in temperature between the two model versions are largest in Northern Europe. Figure 2i shows that CTLCLM is warmer than CTLTERRA in Northern Europe, whereas it is slightly colder or similar to CTLTERRA in the southern part of Europe. For precipitation the largest differences are also found in the North, where CTLCLM is drier than CTLTERRA (Figure 2j). However, this difference in precipitation is only significant for 14% of the grid points.
 More pronounced differences between CTLCLM and CTLTERRA exist for the evaporative fraction (Figure 2k). Compared to CTLTERRA, the evaporative fraction is larger for CTLCLM in the North and smaller in the South. Consequently, the north-south gradient in the evaporative fraction is reduced in CTLCLM. Nonetheless, both model versions display an energy-limited evapotranspiration regime in Central and Northern Europe and a soil moisture-limited evapotranspiration regime in the South (Figures 2c and 2g). This is consistent with results of several observational analyses [Teuling et al., 2009; Seneviratne et al., 2010; Mueller and Seneviratne, 2012].
 We use FLUXNET data to evaluate the realism of the simulated evaporative fraction in the two model versions. The colored dots in Figures 2c and 2g (and Figures 2d and 2h) correspond to measured data from several FLUXNET sites. The colorbar is the same as for the model results. The size of the dots indicates the amount of available data for the calculation of the evaporative fraction at the respective site. In addition, we show scatterplots comparing model results and observations in Figure 3, which are more quantitative. Overall, the evaporative fraction from the FLUXNET sites is better represented by CTLCLM (Figures 2c and 2g and Figures 3a and 3c). CTLTERRA overestimates the evaporative fraction in the North where too much energy is used for evapotranspiration. Hence, the partitioning of sensible versus latent heat flux is improved when using the more sophisticated LSM. This finding is consistent with the results of Davin et al. and Davin and Seneviratne, who showed that this better partitioning results in a decreased bias in cloud cover when using CLM3.5 instead of TERRA_ML in the COSMO-CLM2 framework. The reduced cloud cover has a positive effect on biases in simulated net short-wave radiation, temperature and other surface variables.
3.2. Land-Climate Coupling
 The correlation between temperature (T) and latent heat flux (LE) (Figures 2d and 2h) can be used as a measure for soil moisture-temperature coupling [Seneviratne et al., 2006]. Negative correlations indicate soil moisture limitation and a strong coupling of land and atmosphere. CTLCLM shows a clearer limitation of the area of strong coupling to Southern Europe (Figure 2d). Both model versions capture the general pattern with negative values of Corr(T, LE) in Southern Europe and positive values in Central and Northern Europe. As for the evaporative fraction, we compare Corr(T, LE) to FLUXNET data. Only one FLUXNET station, on the British Isles, shows a negative correlation in the North, which is not reproduced by either model version. Figures 3b and 3d reveal that the root mean squared error for CTLCLM compared to FLUXNET is smaller than the one from CTLTERRA. On the other hand, the correlation between CTLTERRA and FLUXNET is higher. Hence, with the available observations it is not possible to determine if one of the two model versions performs better than the other with respect to Corr(T, LE).
3.3. Influence of Soil Moisture State on Mean Summer Climate
 To gain additional insights into how the two model versions react to surface processes, we also perform sensitivity experiments with extreme soil moisture conditions. In these experiments the land surface is decoupled from the atmosphere. The sensitivity of the summer mean temperature to the soil moisture state is shown in Figure 4. In general, the extreme soil moisture experiments show the expected results. The DRY runs result in higher temperatures (Figures 4b and 4f) and less precipitation (not shown) whereas the WET runs result in colder temperatures (Figures 4d and 4h) and more precipitation (not shown). The results using the more sophisticated LSM are consistent with those using the simpler LSM, as well as with results of Jaeger and Seneviratne (which were also based on COSMO-CLM simulations with the TERRA_ML LSM). However, the differences DRYCLM-CTLCLM and WETCLM-CTLCLM are smaller than the corresponding differences using TERRA_ML (Figures 4j and 4l). This shows that the model sensitivity to soil moisture changes is smaller when using the more complex LSM.
4. Climate Variability and Extremes
4.1. Inter-annual Summer Climate Variability
 To evaluate climate variability in summer, we use the inter-annual standard deviation. Figure 5 shows the inter-annual variability (σ) in 2-meter temperature for CTLCLM, CTLTERRA, their difference, and the corresponding biases compared to E-OBS observations. Inter-annual temperature variability is substantially overestimated in CTLTERRA (Figure 5d). In CTLCLM this problem is largely alleviated, with a more realistic representation of inter-annual temperature variability (Figure 5b). This difference between CTLCLM and CTLTERRA can also be seen in the inter-annual variability of the sensible and latent heat fluxes (Figure 6). Also for the heat fluxes, the variability in CTLTERRA is much larger than in CTLCLM (Figures 6e and 6f).
 The overestimation of summer climate variability is a common feature of most RCMs [e.g., Vidale et al., 2007; Jacob et al., 2007; Lenderink et al., 2007] and has been attributed to combined effects of downward long-wave radiation, net short-wave radiation and evaporation [Lenderink et al., 2007]. It seems to have no unique cause for all RCMs, but previous COSMO-CLM versions have been shown to have a large sensitivity to soil drying, leading to decreased evaporation and enhanced summer temperature variability [Lenderink et al., 2007]. Thus, we investigated possible causes for the better performance in CTLCLM, namely differences in the representation of soil moisture, ground heat flux, downward long-wave and incoming short-wave radiation.
 Soil moisture is expressed as volumetric water content over several soil levels. The volumetric water content (over the first 0.829 meter of CTLCLM compared to the first 0.7 meter of CTLTERRA) is mainly increased in CTLCLM compared to CTLTERRA (Figures 7a, 7c, and 7e), indicating less summer drying in CTLCLM. However, since the soil levels of CLM3.5 and TERRA_ML are very different, an exact comparison between CLM3.5 and TERRA_ML is not possible. If we consider only the levels down to 0.1656 m (CLM3.5) resp. 0.16 m (TERRA_ML), the levels which are closest to each other, CTLCLM has a lower soil moisture content than CTLTERRA in summer (not shown). Therefore, we cannot definitely confirm that less summer drying is the cause for the improved temperature variability in CTLCLM.
 Nevertheless, we obtain a clear decrease of soil moisture variability in CTLCLM compared to CTLTERRA (Figures 7b, 7d, and 7f; results for other soil levels are similar, not shown). The region where soil moisture variability is smaller in CTLCLM is more extended than the region where climate variability is smaller (Figures 5e, 6e, and 6f and Figure 7f). Nonetheless, the reduced soil moisture variability is likely one of the causes for the decreased climate variability in CTLCLM.
 Part of the available energy at the surface is used to heat the ground (ground heat flux) during the day. The ground heat flux is up to 10 W/m2 larger for CTLCLM than CTLTERRA during summer (Figure 8e). In addition, the variability in the ground heat flux is also enhanced in CTLCLM for large areas in Southern and Central Europe (Figure 8f). The fact that more heat can be stored in the ground in CTLCLM can also explain the reduced climate variability due to an increased buffering effect of the soil column.
Lenderink et al. also proposed incoming long-wave and net short-wave radiation as possible causes for the overestimation of inter-annual temperature variability. Figure 9 shows the standard deviation of incoming long-wave and incoming short-wave radiation. We prefer to look at incoming short-wave radiation instead of net short-wave radiation because incoming short-wave radiation is what actually forces the land surface; however, the results are comparable. There is no consistent decrease in the variability of incoming long-wave radiation in CTLCLM compared to CTLTERRA, and some regions even show an increase (Figure 9e). On the other hand, incoming short-wave radiation displays clearly smaller variability in CTLCLM compared to CTLTERRA in most parts of Europe (Figure 9f). This suggests that a large fraction of the reduced (and more realistic) summer temperature variability in CTLCLM is due to a reduced incoming short-wave radiation variability in the coupled model, and is thus due to feedbacks between the more sophisticated LSM and the atmospheric module.
4.2. Temperature Extremes
 To investigate temperature extremes in the control runs we analyze several indices listed in Table 3. The patterns of perc90 agree with those of summer mean temperature (Figures 2a and 2e and Figures 10a and 10e). Compared to E-OBS, CTLCLM has a smaller bias in perc90 than CTLTERRA, which underestimates perc90 in the North and overestimates it in the South (Figures 10b and 10f). The better representation of perc90 in CTLCLM is consistent with the more realistic representation of climate variability as well as the improved simulation of mean summer temperature in this model version [Davin and Seneviratne, 2012].
 Heat wave duration indices show a less distinctive pattern. There is no well-defined pattern in the differences between the two model versions (Figures 10c, 10d, 10g, and 10h). CTLCLM rather underestimates the mean heat wave duration compared to E-OBS (Figure 10d). CTLTERRA rather overestimates the mean heat wave duration in Northern and Western Europe, whereas it underestimates hwdimean in Northern Italy, Southeast Spain and some other regions (Figure 10h).
Figure 11 displays the PDFs of Tmax for 4 different regions (Iberian Peninsula (IP), France (FR), Mid Europe (ME), and Eastern Europe (EA), as defined, e.g., in Christensen and Christensen and shown in Figure 1) for the different experiments compared to E-OBS. For all four regions, the Tmax distribution in CTLCLM is more similar to E-OBS than in CTLTERRA (the exact values of several statistics are shown in auxiliary material Table S2). Sscore is a skill score measuring the common area of two PDFs (section 2.4). It is computed for the control runs compared to observations and confirms that CTLCLM displays more realistic PDFs for Tmax. In total, CTLCLM captures more than 80% of the observed PDF in all regions except the Mediterranean (MD, values for all PRUDENCE subdomains are shown in auxiliary material Figure S1a). In contrast, in all regions, CTLTERRA always captures less than 80% of the observed distributions (Figure S1b). Thus, not only perc90 itself has a smaller bias in CTLCLM, but also the whole PDF is more realistic with the more sophisticated LSM.
 In line with previous studies [Zhang et al., 2009; Jaeger and Seneviratne, 2011; Hirschi et al., 2011; Mueller and Seneviratne, 2012] the extreme soil moisture experiments show that the effect of soil moisture on Tmax is mostly asymmetric (Figure 11). Except for IP, the change in PDF for DRY is larger than for WET (Figures 11b–11d and 11f–11h). The influence of dry soil moisture conditions on hot extremes is, therefore, larger than the influence of wet anomalies. For IP both effects are similar in magnitude (Figures 11a and 11e). This is also true for MD (not shown). Hence, in regions which are rather soil moisture limited (IP, MD, Figure 2c), the influence of wet and dry anomalies on hot extremes has a similar magnitude. In regions where energy limitation is predominant (FR, ME, EA), only dry anomalies influence hot extremes in a noticeable way. The influence of soil moisture on minimum daily temperatures is very small (not shown).
4.3. Quantile Regressions for Number of Hot Days and Standard Precipitation Index
 To study soil moisture-temperature feedbacks during climatic extremes, we study the relationship between the number of hot days (nhd) and the standard precipitation index (SPI, indicates wet or dry conditions). We use quantile regressions to investigate this relationship (see section 2.4 for details on the methodology). The advantage of this analysis is that it uses only widely measured data, i.e., temperature and precipitation. Hence, the influence of wet and dry conditions on hot extremes in the models can be compared to observations [see also Hirschi et al., 2011].
Figure 12 shows the regression slopes for nhd and SPI for the two different control runs and the observations for the 90% quantile for the European domain. The lower quantiles (not shown) show almost no relation between nhd and SPI, whereas higher quantiles display mostly negative slopes. Negative slopes mean a widening of the nhd distributions with drier conditions. Hence, in regions with negative slopes, hot days occur more often with drier conditions. Regions with no or positive slopes do not show this behavior. From the studies of Hirschi et al. and Mueller and Seneviratne, we expect negative slopes in Southern Europe (transitional soil moisture regime) and no or positive slopes in Central and Northern Europe (wet regime). This is shown by both models as well as E-OBS. Note that we use the 3-month SPI and not the 6-month SPI as in Hirschi et al., because the results are more pronounced for the 3-month SPI (not shown).
 The highest quantile (90%) is most important for climate extremes. CTLCLM displays negative slopes in Southern Europe and positive slopes in some areas of Central, Northern and Eastern Europe (Figure 12a) for the 90% quantile. The pattern is similar for CTLTERRA (Figure 12c), yet the absolute values of the slopes are higher in CTLTERRA (Figure 12e). Negative slopes in E-OBS are often underestimated in CTLCLM (Figure 12b). The largest disagreement between CTLCLM and E-OBS occurs in a region over France, Switzerland, Austria and Northern Italy where slopes are strongly negative in E-OBS (Figure 12f) and only slightly negative or even positive in CTLCLM (Figure 12a). In this region the negative slopes are also too weak in CTLTERRA. The too-weak negative slopes could be related to the underestimation of hwdimean in Northern Italy (Figures 10d and 10h). In addition, the region in the East where slopes are negative is too large in CTLTERRA (Figure 12d). The regions where the two CTLs do not agree (Figure 12e) partially overlap with regions where they show the largest difference in Corr(T, LE) (Figure 2l). Furthermore, the difference patterns in Figure 12e are also similar to the differences in σT2m, σSH, and σLE (Figures 5e, 6e, and 6f). This indicates a connection between the disagreement in land-climate coupling and climate variability in CTLCLM and CTLTERRA.
Figure S2 in the auxiliary material shows the significance levels for the 90% quantile slopes for nhd in July and SPI in June. In most cases, regions where the 90% slopes are significant correspond to regions with negative slopes and where soil moisture-temperature coupling is high (Figures 2d and 2h). CTLTERRA shows the largest significant regions (auxiliary material Figure S2).
The regression slopes for several quantiles averaged over regions such as IP, FR, ME, or EA show the widening of the hot day distribution with decreasing SPI (not shown). Altogether, models and observations agree well on the main behavior. The direct comparison of the slopes at all quantiles between models and observations (Figure 13) shows that CTLCLM and CTLTERRA agree quite well with E-OBS in FR (Figure 13b). CTLTERRA represents the whole shape of the curve well, and CTLCLM agrees very well with observations for all quantiles except the 90% quantile, for which it underestimates the slope. Both model versions underestimate the negative slope over IP (Figure 13a) and overestimate it in ME (Figure 13c), where the overestimation is larger in CTLTERRA. In EA (Figure 13d), CTLCLM agrees well with E-OBS (except for the 90% quantile, which is again underestimated), whereas CTLTERRA overestimates the negative slope (the same is true for the Alps, not shown). In summary, CTLCLM agrees better overall with observations than CTLTERRA, but nonetheless underestimates the slope of the highest quantiles. Generally, CTLTERRA overestimates the effect of dry conditions on temperature extremes. In regions where this relationship should be most pronounced (IP, MD), both models underestimate the effect of dry conditions on hot extremes.
5. Discussion and Conclusions
This study evaluates the performance of the COSMO-CLM2 RCM using two alternative LSMs with respect to land-climate coupling, climate variability, and extremes. A state-of-the-art 3rd-generation LSM (CLM3.5) is compared to a simpler 2nd-generation model (TERRA_ML).
When TERRA_ML is used, a very pronounced overestimation of inter-annual summer temperature variability is found. This feature is a common problem in most current RCMs [e.g., Lenderink et al., 2007]. When using the more sophisticated LSM, this issue is substantially alleviated as inter-annual variability is decreased. We also found that the distribution of daily maximum temperature is better captured in CTLCLM. The pattern of the 90th percentile over Europe is improved, as are the whole probability density functions of daily maximum temperature over various regions (improvement from 56–79% to 70–84% agreement with observations). Even though the PDFs of Tmax are better captured in CTLCLM, the persistence of heat waves tends to be underestimated in this model version. The relationship between the number of hot days (nhd) and the standard precipitation index (SPI) is well captured overall in both model versions. Nonetheless, it is underestimated in Southern Europe and overestimated in Mid Europe. In the transitional zone from strong to weak land-climate coupling, CTLTERRA also overestimates this relationship. This indicates a better (spatial) representation of land-climate coupling in CTLCLM.
The decreased temperature variability when using CLM3.5 can be explained by a larger ground heat flux and a smaller variability in soil moisture and incoming short-wave radiation. It is difficult to confirm whether the larger ground heat flux in CLM3.5 is realistic, since few observations are available to evaluate this flux. Generally, the ground heat flux is relatively small, about 10% of net radiation during daytime and over vegetated areas. During nighttime and over sparsely vegetated and non-vegetated areas it becomes more important [Ronda and Bosveld, 2009]. The models simulate a mean summer net radiation in Europe of 70–150 W/m2; thus, a first estimate of the ground heat flux is about 7–15 W/m2. Tsuang estimates the ground heat flux to be 6–12 W/m2 for summer between 30°N and 60°N. Results from TERRA_ML are smaller (3–9 W/m2, see Figure 8c) and the ground heat flux in CLM3.5 is somewhat higher (9–18 W/m2, see Figure 8a). Nevertheless, the values simulated by CLM3.5 seem to be in a realistic range, and the larger ground heat flux in CLM3.5 thus appears reasonable.
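The first-order estimate above is simple arithmetic, and the following short sketch reproduces it and compares it with the quoted ranges. It assumes, as in the text, that the 10% share (which strictly applies to daytime and vegetated areas) can be applied to the mean summer net radiation. The variable and function names are illustrative only.

```python
def overlap(a, b):
    """Return the overlap of two closed ranges (lo, hi), or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# Assumption taken from the text: ground heat flux is about 10% of net radiation.
G_FRACTION = 0.10

# Simulated mean summer net radiation over Europe, in W/m2 (from the text).
net_radiation = (70.0, 150.0)

# First-order estimate of the ground heat flux range, in W/m2.
first_estimate = tuple(G_FRACTION * r for r in net_radiation)

# Ranges quoted in the text, in W/m2.
ranges = {
    "Tsuang": (6.0, 12.0),
    "TERRA_ML": (3.0, 9.0),
    "CLM3.5": (9.0, 18.0),
}

overlaps = {name: overlap(first_estimate, rng) for name, rng in ranges.items()}

print("first estimate:", first_estimate)
for name, common in overlaps.items():
    print(name, "overlap with first estimate:", common)
```

Both model ranges overlap with the first-order estimate, TERRA_ML at its lower end and CLM3.5 at its upper end, which is consistent with the statement that the CLM3.5 values are higher but still within a plausible range.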
The decrease in incoming short-wave radiation variability is another reason for the smaller temperature variability in CTLCLM. Davin et al. and Davin and Seneviratne have shown that biases in net short-wave radiation, net long-wave radiation, and net radiation are smaller when using CLM3.5 compared to TERRA_ML, suggesting a better representation of radiative fluxes in general. The reduced bias in net short-wave radiation is caused by a better representation of cloud cover, which is itself the result of a better partitioning of the surface fluxes [Davin et al., 2011]. Figure 3 of the present study confirms the better partitioning of latent and sensible heat fluxes.
We also found that soil moisture variability is lower in CLM3.5, resulting in a decrease in the variability of the surface fluxes as well as temperature. We note that Oleson et al. state that soil moisture variability in CLM3.5 tends to be underestimated. Therefore, some of the decrease in temperature variability may occur for the wrong reason. One of the main differences between CLM3.5 and TERRA_ML is that in CLM3.5 the lowest soil level is coupled to a simple prognostic groundwater model [Niu et al., 2007]. Compensating effects from the groundwater model could be a reason for the low soil water variability in CTLCLM.
Lorenz et al. have shown that the persistence of heat waves is influenced by soil moisture variability. Thus, the low soil moisture variability in CLM3.5 could be the reason for the underestimation of heat wave persistence in CTLCLM. In some regions CTLCLM also underestimates the relationship between SPI and nhd for the highest quantile, so land-climate coupling may be at the lower end of the realistic range in this model version. This, in turn, could be linked to the underestimation of heat wave persistence in CTLCLM.
In conclusion, COSMO-CLM2 coupled to the Community Land Model provides a good tool for regional-scale investigations of land-climate coupling, despite a possible underestimation of soil moisture variability. Overall, the model coupled to CLM3.5 is found to have a more realistic coupling between the land and the atmosphere than the model coupled to TERRA_ML, which also results in a better representation of climate variability in Europe. Given the detailed representation of vegetation processes in the CLM3.5 land surface model and its good overall performance in coupled mode, soil moisture experiments in combination with vegetation experiments can be used in the future to investigate soil moisture-climate versus vegetation-climate feedbacks in COSMO-CLM2.
 This study was supported by the Swiss National Foundation, through the NCCR climate project ECOWAT and the NFP61 DROUGHT-CH project. Computing time was provided by the Swiss National Supercomputing Centre (CSCS). We thank colleagues at NCAR for access to and help with the Community Land Model, especially Sam Levis and Dave Lawrence. We are indebted to the COSMO and COSMO-CLM Communities as well as MeteoSwiss and ECMWF for providing access to and support for COSMO-CLM and the ERA-Interim reanalysis. We are particularly thankful to Daniel Lüthi for technical support, Brigitte Müller for helpful discussions, and many other colleagues who took their time to discuss results. In addition, we thank the anonymous reviewers for their constructive criticism that helped to improve the manuscript. We acknowledge the E-OBS data set from the EU-FP6 project ENSEMBLES (http://ensembles-eu.metoffice.com) and the data providers in the ECA&D project (http://eca.knmi.nl). This work used eddy covariance data acquired by the FLUXNET community and in particular by the CarboEuropeIP and TCOS-Siberia networks. We acknowledge the financial support to the eddy covariance data harmonization provided by CarboEuropeIP, FAO-GTOS-TCO, iLEAPS, Max Planck Institute for Biogeochemistry, and University of Tuscia.