Boreal winter 2009–2010 made headlines for cold anomalies in many countries of the northern mid-latitudes. Northern Europe was severely hit by this harsh winter in line with a record persistence of the negative phase of the North Atlantic Oscillation (NAO). In the present study, we first provide a wider perspective on how unusual this winter was by using the recent 20th Century Reanalysis. A weather regime analysis shows that the frequency of the negative NAO was unprecedented since winter 1939–1940, which is then used as a dynamical analog of winter 2009–2010 to demonstrate that the latter might have been much colder without the background global warming observed during the twentieth century. We then use an original nudging technique in ensembles of global atmospheric simulations driven by observed sea surface temperature (SST) and radiative forcings to highlight the relevance of the stratosphere for understanding if not predicting such anomalous winter seasons. Our results demonstrate that an improved representation of the lower stratosphere is necessary to reproduce not only the seasonal mean negative NAO signal, but also its intraseasonal distribution and the corresponding increased probability of cold waves over northern Europe.
 In the Northern extratropics, the wintertime climate shows large intraseasonal and interannual fluctuations compared to other regions and seasons. As experienced in Europe during the winter 2009/10, cold spells have particularly strong socio-economic impacts (e.g., energy demand, transport, industry, emergency protection systems) and are likely to spur regional outbreaks of scepticism regarding climate change [Gershunov and Douville, 2009]. Both statistical analysis [e.g., Cattiaux et al., 2010] and improvements in the predictability of such climate events [e.g., Cohen et al., 2010] are therefore crucial challenges for scientists and decision makers.
 In the northern latitudes, the weekly to monthly fluctuations of the stratospheric polar vortex (SPV) dominate the wintertime variability and are therefore a potential source of long-range climate predictability in the extratropics [Thompson et al., 2002; Charlton et al., 2007]. In particular, Baldwin and Dunkerton  and Gerber et al.  suggested a possible downward propagation of stratospheric zonal wind anomalies towards the troposphere. Using an empirical technique to weaken the SPV in their atmospheric GCM, Scaife and Knight [2008] highlighted the possible contribution of the stratosphere to the negative phase of the North Atlantic Oscillation (NAO) of winter 2005/06 and the associated European cold spells. More recently, Douville [2009b] confirmed and generalized the relevance of the polar stratosphere over the whole Northern extratropics by comparing the skill of two hindcasts of winters 1971–2000, one with a relaxed and one with an interactive lower stratosphere.
 In this study we focus on the European winter of 2009/10. Section 2 extends the weather-regime analysis of Cattiaux et al. [2010] to the whole 20th century. Section 3 discusses the potential predictability of this anomalous winter season based on ensembles of global atmospheric simulations. Finally, results are summarized and consequences for both operational seasonal forecasting and 21st century climate scenarios are discussed in section 4.
2. How Unusual?
Cattiaux et al. [2010] showed, from an analysis of North-Atlantic weather regimes over 1957–2010, that the European cold spells of winter 2009/10 can be explained by an extreme persistence of negative NAO conditions. Here we extend this analysis to the whole 20th century by using the geopotential height at 500 hPa (Z500) of the 20th Century Reanalysis V2 over 1891–2008 (hereafter 20CR) [Compo et al., 2011]. This new dataset is compared to former re-analyses provided by (i) NCEP/NCAR over 1948–2010 (NCEP) [Kistler et al., 2001], (ii) NCEP/DOE over 1979–2010 (NCEP2) [Kanamitsu et al., 2002], (iii) ERA-40 over 1957–2002 (ERA40) [Uppala et al., 2005], and (iv) ERA-Interim over 1989–2010 (ERAI) [Uppala et al., 2008]. 20CR, NCEP and ERA40 daily anomalies are computed relative to their respective 1971–2000 daily climatologies, smoothed with a 15-day running average. NCEP2 (ERAI) anomalies are computed relative to the NCEP (ERA40) climatology.
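The anomaly computation described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the centered window, and the wrap-around at the ends of the calendar are assumptions made here, and each input year is assumed to hold one value per calendar day at a single grid point or station.

```python
def daily_climatology(values_by_year, window=15):
    """Smoothed daily climatology from a list of per-year daily series.

    values_by_year: list of lists, one list per year of the reference
    period (e.g. 1971-2000), each with one value per calendar day.
    """
    n_years = len(values_by_year)
    n_days = len(values_by_year[0])
    # Raw climatology: mean across years for each calendar day.
    raw = [sum(year[d] for year in values_by_year) / n_years
           for d in range(n_days)]
    # Centered running mean; wrapping around the calendar is an assumption.
    half = window // 2
    return [sum(raw[(d + k) % n_days] for k in range(-half, half + 1)) / window
            for d in range(n_days)]


def anomalies(year_values, climatology):
    """Daily anomalies of one year relative to the smoothed climatology."""
    return [v - c for v, c in zip(year_values, climatology)]
```

For NCEP2 and ERAI, the same `anomalies` call would simply receive the climatology computed from NCEP and ERA40, respectively.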
 The intraseasonal to interannual variability of European temperatures is often described as an alternation between the preferential states of the North-Atlantic dynamics, known as “weather regimes” [e.g., Vautard, 1990], generally obtained by clustering daily Z500 anomalies [Michelangeli et al., 1995]. Here we concatenate Z500 anomalies of 20CR, NCEP and ERA40 over the wintertime months (December-January-February-March, DJFM) of the period 1971–2000, and apply the “kmeans” algorithm to these 90 winters. We obtain the four often-used wintertime North-Atlantic weather regimes [e.g., Cassou, 2008; Woollings et al., 2010]: Atlantic Ridge (AR), NAO−, Blocking (BL) and NAO+ (Figure 1, left). For all re-analyses, each day of DJFM is then classified into one regime by minimizing the Euclidean distance between its Z500 anomaly and the common regimes' centroids. Finally, a three-day persistence criterion is applied to the classification in order to retain only quasi-stationary patterns, which declassifies about 10% of DJFM days (the exact fraction depends slightly on the reanalysis).
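The classification and persistence steps can be illustrated with a short sketch. It assumes that the regime centroids are already available (for instance from a k-means clustering) as flattened Z500 anomaly maps, and it interprets the three-day criterion as discarding any run of identical regime labels shorter than three consecutive days; both the function names and that interpretation are assumptions made here.

```python
import math


def nearest_regime(anomaly, centroids):
    """Index of the centroid closest (Euclidean distance) to a daily map.

    anomaly and each centroid are flattened maps (lists of equal length).
    """
    dists = [math.dist(anomaly, c) for c in centroids]
    return dists.index(min(dists))


def apply_persistence(labels, min_run=3):
    """Declassify (set to None) days in runs shorter than min_run days."""
    out = list(labels)
    start = 0
    for i in range(1, len(labels) + 1):
        # A run ends at the end of the series or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            if i - start < min_run:
                out[start:i] = [None] * (i - start)
            start = i
    return out
```

In practice `apply_persistence` would be called on one DJFM season at a time, so that a run cannot span two different winters.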
 We find excellent agreement between all re-analyses in the DJFM frequencies of regimes' occurrences (Figure 1, right). In particular, the 20CR reanalysis compares well with the others over periods of overlap (all r > 0.95). Since assimilated data in this area do not change much throughout the 20CR reanalysis [Compo et al., 2011], 20CR regimes' frequencies are likely to be reliable for 1891–1948. In the following we consider DJFM regimes' frequencies over the whole period 1891/92–2009/10 by concatenating 20CR (1891/92–2007/08) and ERAI (2008/09–2009/10). Other combinations would not have changed our results significantly. Consistent with Cattiaux et al. [2010], winter 2009/10 stands out with a remarkable peak in NAO− occurrences (82 days over DJFM according to ERAI), which leaves few days for the other regimes: 6 days in AR, 11 in BL and 11 in NAO+. The closest winter to 2009/10 in terms of DJFM regimes' frequencies is 1939/40, characterized by 82 days in NAO−, 15 in AR, 10 in BL and 7 in NAO+ according to 20CR. Such a 70-year-old analog provides an opportunity to put the cold spells of winter 2009/10 in perspective relative to long-term trends in European temperatures.
 In-situ measurements of daily minimum temperature (Tmin) over Europe are provided by the ECA&D project [Klein-Tank et al., 2002]. We select 168 stations by requiring (i) the availability of at least 90% of data over the period 1930–2010, and (ii) only one station per 0.5° × 0.5° grid cell, as done by Cattiaux et al. [2010]. We focus on 1930–2010 in order to both retain a sufficient number of stations and include winters 1939/40 and 2009/10. As for Z500, anomalies are computed at each station relative to the smoothed 1971–2000 daily climatology. Over the overlapping period (1930/31–2009/10), DJFM correlations between the frequencies of NAO+ or NAO− and Tmin anomalies over Northern Europe (defined as the average over the 56 stations within 15°W–40°E, 50–75°N; see black rectangle in Figure 3) are highly significant (r = 0.65 and −0.6, respectively; p-values ≪ 1% assuming independence from one winter to another, Figure 1), which confirms the NAO influence on the interannual variability of wintertime temperatures over Northern Europe [Trigo et al., 2002]. Correlations between frequencies of AR or BL and Tmin are not significant. Despite equivalent regimes' frequencies, winters 1939/40 and 2009/10 differ by more than 2 degrees in Tmin anomaly: respectively −5.3(−4.6/−6.3)°C and −1.9(−1.7/−2.2)°C. Values in parentheses indicate a 90% confidence interval obtained by a bootstrap procedure on stations (see Tmin range in Figure 1).
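A bootstrap on stations of the kind mentioned above could look like the sketch below. The text does not specify the resampling details, so the percentile method, the number of resamples and the fixed seed are assumptions made here for illustration only.

```python
import random


def bootstrap_ci(station_values, n_boot=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for the mean over stations.

    station_values: one seasonal Tmin anomaly per station.
    Stations are resampled with replacement n_boot times.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(station_values)
    means = sorted(
        sum(rng.choices(station_values, k=n)) / n for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]
```

Calling `bootstrap_ci` on the 56 Northern European station anomalies of a given winter would return the lower and upper bounds of an interval comparable in spirit to the ranges quoted in the text.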
 At the intraseasonal time scale, winters 1939/40 and 2009/10 exhibit similar features: a first NAO− episode starting in mid-December, a very persistent second one (∼40 days) in mid-winter, and a shorter third one in March (Figure 2). During these episodes, daily anomalies of Tmin averaged over Northern Europe were much colder in 1939/40 (e.g., below −15°C in January 1940) than in 2009/10 (always above −10°C, Figure 2). In particular, the number of very cold days (VCD), defined as days with a Tmin anomaly below the 10th percentile of the DJFM 1971–2000 distribution (here −4.7°C), is much higher in 1939/40 than in 2009/10 (65 vs. 24). In terms of seasonal Tmin, this makes 1939/40 the 2nd coldest of winters 1930/31–2009/10, while 2009/10 is only the 15th.
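The VCD definition translates into a simple count, sketched below. The linear-interpolation percentile is one common convention and is an assumption here, as are the function names; the text only states that the threshold is the 10th percentile of the DJFM 1971–2000 distribution of daily anomalies.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted values."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def count_very_cold_days(season_anoms, reference_anoms, q=10):
    """Number of days in a season with anomaly below the q-th percentile.

    reference_anoms: daily anomalies pooled over the reference winters.
    season_anoms: daily anomalies of the winter being examined.
    """
    threshold = percentile(sorted(reference_anoms), q)
    return sum(1 for a in season_anoms if a < threshold)
```

Dividing the count by the number of DJFM days gives the VCD frequency in percent used later for the model evaluation.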
 Can such a departure between temperatures of winters 1939/40 and 2009/10 be explained without invoking global warming? This question may be investigated by fitting the 1930/31–2009/10 Tmin time series with a linear regression model taking regimes' frequencies (fi) and time (t) as predictors.
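The regression equation itself does not appear in this text. A form consistent with the symbols used around it (regime frequencies fi, written f_i below, time t, trend coefficient β and residual ε) would be the following; the intercept α and the regime coefficients γ_i are notation assumed here, not taken from the source.

```latex
T_{\min}(t) = \alpha + \sum_{i} \gamma_i \, f_i(t) + \beta \, t + \varepsilon(t)
```

In this form, β is the linear trend that remains once the year-to-year changes in regime frequencies are accounted for, and the test discussed next is a test of β = 0.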
Within such a simple model, the estimated linear trend is close to 0.5°C/century, whereas the estimated standard deviation of the residual term ε is close to 0.5°C. The long-term trend is therefore not expected to be identified as significant very often when only two individual winters are compared. However, in the specific case of the 1939/40 and 2009/10 winters, the large difference of 3.3°C allows us to reject the null hypothesis β = 0 (vs β ≠ 0) at the 9% significance level. This suggests that winter 2009/10 would have been colder in the absence of a long-term warming trend in Europe.
3. How Reproducible?
 The potential predictability of the anomalous winter 2009/10 is assessed by performing two sets of control (“Control Free” CTF/“Control Nudged” CTN) and perturbed (“Cold Winter Free” CWF/“Cold Winter Nudged” CWN) experiments. CTF/CTN consist of five-member ensembles of 1971–2000 global atmospheric simulations driven by monthly-mean observed SST, differing only in their atmospheric initial conditions. They provide a model climatology as well as a benchmark for evaluating potential seasonal predictability [Douville, 2009b]. CWF/CWN are 30-member ensembles of atmospheric simulations of October 2009 – March 2010, initialized from each year of the first member of the corresponding control experiment and driven by monthly-mean observed SST (from the ERAI dataset). CTN (CWN) differs from CTF (CWF) only by the additional implementation of a stratospheric relaxation towards ECMWF analyses north of 25°N. This original experimental design allows us to separate the influence of global SST from that of a more realistic Northern stratosphere on the potential predictability of the European cold winter 2009/10. As in Section 2, CWF/CWN anomalies of winter 2009/10 are computed relative to their respective CTF/CTN 1971–2000 climatology.
 All simulations are performed with a medium-resolution configuration (linear T63 truncation, reduced 128 by 64 Gaussian grid, 31 vertical levels) of the ARPEGE-Climat spectral model with a hybrid σ-pressure vertical coordinate [Gueremy et al., 2005]. While such a vertical resolution is too coarse to explicitly resolve the stratosphere (only 5 vertical levels above 100 hPa), it is sufficient to simulate a realistic SPV using a simple nudging strategy. Here nudging is applied at each time step (every 30 min) to horizontal winds and temperature in the northern extratropical stratosphere (north of 25°N and above 100 hPa, see Douville [2009b] for nudging details). Reference fields are 6-hourly ECMWF re-analyses (ERA40 for CTF/CTN, ERAI for CWF/CWN) linearly interpolated at the model time step.
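The nudging procedure can be pictured as a Newtonian relaxation towards a reference field that is interpolated linearly in time between two 6-hourly analyses. The sketch below is a scalar illustration under stated assumptions: the implicit update, the function names and in particular the relaxation time scale `tau_s` are hypothetical, since the actual nudging settings are described in Douville [2009b] and not in this text.

```python
def interpolate_reference(ref_prev, ref_next, step, steps_per_interval=12):
    """Linear interpolation between two 6-hourly reference values.

    With a 30 min model time step there are 12 steps per 6 h interval;
    step runs from 0 (at ref_prev) to steps_per_interval (at ref_next).
    """
    w = step / steps_per_interval
    return (1 - w) * ref_prev + w * ref_next


def nudge(value, reference, dt_s=1800.0, tau_s=6 * 3600.0):
    """One implicit relaxation step of dX/dt = -(X - X_ref) / tau.

    tau_s is a hypothetical relaxation time scale, not the model's setting.
    """
    ratio = dt_s / tau_s
    return (value + ratio * reference) / (1 + ratio)
```

In the experiments described above, such a correction would only be applied to horizontal winds and temperature at grid points north of 25°N and above 100 hPa, with the rest of the model atmosphere left free.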
 The ERAI Z500 anomaly of DJFM 2009/10 exhibits a strong negative phase of the Arctic Oscillation (AO), which is poorly captured by the CWF ensemble simulation (Figure 3a): while the negative AO signature is reproduced over the North Pacific as a well-known response [Wallace and Gutzler, 1981] to the intense El Niño event of winter 2009/10, no statistically significant response appears over the North Atlantic. In contrast, CWN shows reasonable agreement with ERAI over both the North Pacific and the North Atlantic, as indicated by an Anomaly Pattern Correlation (APC) of 0.83. Relaxing the Northern-extratropical stratosphere towards ERAI therefore leads to an improved NAO− pattern. Moreover, the analysis of daily Z500 outputs indicates that the observed anomaly of NAO− frequency in winter 2009/10 (60 days more than the 1971–2000 average) is much better captured in CWN (69 ± 5 days more than the CTN average, p-value ≪ 1% assuming independence from one winter to another) than in CWF (12 ± 8 days more than the CTF average, p-value > 10%).
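An anomaly pattern correlation between a simulated and a reference map can be computed as sketched below. The text does not state whether the APC is centered or area-weighted, so the centered form with cosine-of-latitude weights used here is an assumption, as is the function name.

```python
import math


def anomaly_pattern_correlation(a, b, lats_deg):
    """Area-weighted centered correlation between two anomaly maps.

    a, b: flattened anomaly maps on the same grid.
    lats_deg: latitude (degrees) of each grid point, used for cos weights.
    """
    w = [math.cos(math.radians(lat)) for lat in lats_deg]
    wsum = sum(w)
    mean_a = sum(wi * ai for wi, ai in zip(w, a)) / wsum
    mean_b = sum(wi * bi for wi, bi in zip(w, b)) / wsum
    cov = sum(wi * (ai - mean_a) * (bi - mean_b)
              for wi, ai, bi in zip(w, a, b))
    var_a = sum(wi * (ai - mean_a) ** 2 for wi, ai in zip(w, a))
    var_b = sum(wi * (bi - mean_b) ** 2 for wi, bi in zip(w, b))
    return cov / math.sqrt(var_a * var_b)
```

Because the correlation is insensitive to amplitude, a high APC can coexist with a strongly underestimated signal, which is why pattern agreement and anomaly magnitude are discussed separately in the text.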
 In order to compare Tmin anomalies simulated by CWF/CWN with a reference gridded dataset, we use Tmin anomalies from the ERAI reanalysis, which compare well with the ECA&D observations used in Section 2 (Figure 3b). Not surprisingly, the CWF experiment poorly reproduces the Tmin meridional gradient, characterized by cold (warm) anomalies north (south) of 40°N, which is a well-known signature of the negative NAO [e.g., Trigo et al., 2002; Scaife and Knight, 2008]. Despite an APC with ERAI of 0.69, the amplitude of the signal is strongly underestimated and most of the anomalies are not statistically significant at the 10% level. Relaxing the stratosphere towards ERAI again leads to a significantly improved simulation of the observed Tmin pattern (APC = 0.78), although the signal remains weaker than in ERAI and is slightly shifted northward, with unrealistic values over Iberia, France and Greenland.
 Beyond seasonal averages, the predictability of intraseasonal features such as cold spells constitutes a crucial issue. Figure 3 shows the frequency (in %) of VCD over the DJFM 2009/10 period, as defined in Section 2. While ERAI exhibits more than 20% of VCD over Europe, the CWF experiment does not show any significant signal. In contrast, the relaxation experiment CWN shows a consistent and significant response over Northern Europe, with more than 20% of VCD above 50°N.
 What happened in the stratosphere that helped the relaxation experiment show such a marked improvement? During winter 2009/10, two sudden stratospheric warmings occurred, possibly favored by both an El Niño event in the tropical Pacific [Ineson and Scaife, 2009] and a higher-than-normal Eurasian snow cover in October [Cohen et al., 2010]. Assimilating these stratospheric warmings in the ARPEGE-Climat model via a relaxation towards ERAI improves the capacity of the model to reproduce the downward propagation of the zonal wind anomalies, the NAO signal and the corresponding temperature anomalies over northern Europe.
 Our study first analyzes how unusual European winter 2009–2010 was, both in terms of large-scale circulation and temperatures. Thanks to the recent 20CR reanalysis, we show that one needs to go back to the winter of 1939/40 to find a dynamically analogous DJFM season with more than 80 days in the NAO− regime. However, winter 1939/40 was much colder than 2009/10 (by more than 2 degrees), a difference that a linear regression framework explains better when a global warming effect is included than with circulation anomalies alone. In line with Douville [2009b], we then use ensembles of atmospheric simulations to show that a perfect prediction of wintertime global SST does not guarantee a skilful hindcast of the winter 2009–2010 North-Atlantic large-scale circulation and European surface temperatures. Through an original nudging technique, we also demonstrate that an improved simulation of the lower stratosphere is a key challenge for predicting not only the seasonal mean NAO, but also its intraseasonal distribution and the occurrence of cold waves over northern Europe. While our experimental design is highly idealized and does not tell much about the effective climate predictability that can be derived from a better representation of the stratosphere, both observational and modelling studies suggest that the stratospheric polar vortex can be influenced by low-frequency signals either in the stratosphere (e.g., volcanic aerosols, Quasi-Biennial Oscillation) or at the lower boundary (e.g., ENSO in the tropical Pacific, Eurasian snow cover, Arctic sea-ice). Improving the troposphere-stratosphere coupling is therefore a real challenge for the dynamical forecasting community. It should however be emphasized that an increased vertical resolution at the tropopause and/or in the stratosphere is not sufficient for this purpose. Parallel ensembles of a high-top version of ARPEGE-Climat with 91 vertical levels (not shown) did not show a particular improvement of winter reproducibility compared to the low-top version used in the present study. Albeit not hopeless, the way towards improved monthly-to-seasonal predictions of wintertime climate over Europe is therefore still long and full of pitfalls.
 This work is partly supported by the AXA research fund. The authors truly thank all providers of reanalysis and observational datasets used in this study, especially the U.S. Department of Energy (DOE), the Office of Biological and Environmental Research (BER), and the National Oceanic and Atmospheric Administration Climate Program Office for their support for the Twentieth Century Reanalysis Project (http://www.esrl.noaa.gov/psd/data/20thC_Rean/). The authors would also like to acknowledge Ricardo Trigo and two anonymous referees for insightful comments that helped clarify the manuscript.
 The Editor thanks Ricardo Trigo and an anonymous reviewer for their assistance in evaluating this paper.