Statistics of sudden stratospheric warmings using a large model ensemble

Using a large ensemble of initialised retrospective forecasts (hindcasts) from a seasonal prediction system, we explore various statistics relating to sudden stratospheric warmings (SSWs). Observations show that SSWs occur at a similar frequency during both El Niño and La Niña northern hemisphere winters. This is contrary to expectation, as the stronger stratospheric polar vortex associated with La Niña years might be expected to result in fewer of these extreme breakdowns. Here we show that this similar frequency may have occurred by chance due to the limited sample of years in the observational record. We also show that in these hindcasts, winters with two SSWs, a rare event in the observational record, on average have an increased surface impact. Multiple SSW events occur at a lower rate than expected if events were independent but somewhat surprisingly, our analysis also indicates a risk, albeit small, of winters with three or more SSWs, as yet an unseen event.


1 | INTRODUCTION
Periodically, the winter stratospheric polar vortex undergoes a complete breakdown. These events, known as major sudden stratospheric warmings (SSWs), can happen when impulses of wave activity propagate upward through the stratosphere (Charney & Drazin, 1961) or through internal dynamics (Holton & Mass, 1976). Major winter SSWs can have a profound impact on weather and climate in the North Atlantic/European area (Finkel et al., 2023; Polvani et al., 2017; see Kidston et al., 2015, for a review). While events can be relatively short lived, surface impacts can last for up to 2 months following the onset of the events and are characterised by the negative phase of the North Atlantic Oscillation (NAO) or Northern Annular Mode (Baldwin & Dunkerton, 2001). These impacts occur in roughly 2/3 of events (Bett et al., 2023; Charlton-Perez et al., 2018; Sigmond et al., 2013). As such, they are an important source of predictability on seasonal timescales, and while individual events have a deterministic timescale of weeks (Marshall & Scaife, 2010; Taguchi, 2016), the probability of SSW events is predictable at lead times of several months using ensembles (Scaife et al., 2016). A recent review of SSWs (and references therein) can be found in Baldwin et al. (2021).
We use a large ensemble of initialised climate simulations, produced as seasonal retrospective forecasts (hereafter called hindcasts), to explore various attributes of northern hemisphere winter SSWs. In these hindcasts, the system is close to the observed state but the ocean and atmosphere are still coupled, allowing evolution of the ocean. In addition to the sea surface temperature (SST) variability that arises through ocean-atmosphere coupling, the initialised system also potentially takes advantage of predictability that may arise from other sources (e.g., Nie et al., 2019).
El Niño-Southern Oscillation (ENSO) is the largest mode of interannual climate variability in the tropics, and episodes of warm El Niño and cold La Niña events, which occur every few years in the equatorial Pacific, have widespread tropospheric impacts (e.g., Davey et al., 2014). ENSO also has a well-established impact on the stratosphere, with El Niño being associated with a weaker stratospheric polar vortex in northern hemisphere winter (e.g., Brönnimann et al., 2004), and La Niña with a stronger vortex, although this latter association is somewhat less robust (Iza et al., 2016). The mechanism for this stratospheric influence is via tropospheric modulation of the pressure in the Aleutian low region, which alters vertical planetary wave propagation and impacts the wave driving on the polar vortex (Bell et al., 2009; Cagnazzo & Manzini, 2009; Garfinkel & Hartmann, 2008; Ineson & Scaife, 2009; Manzini et al., 2006).
Early modelling studies indicated that SSWs occur more frequently during El Niño than La Niña years (Taguchi & Hartmann, 2006), a result which would seem consistent with the weaker polar vortex during El Niño. However, somewhat contrary to expectation, the observational record indicates that an enhanced frequency also occurs in La Niña years, despite the stronger than average polar vortex, with both El Niño and La Niña having a higher SSW frequency than ENSO-neutral years (Butler & Polvani, 2011). In a recent modelling study, Weinberger et al. (2019) show that this observed finding does occur in a small sample of their ensemble members, but that overall, the relationship is essentially linear. A review of the ENSO-stratosphere teleconnection can be found in Domeisen et al. (2019). During the period 2020-2023 there was an unusually prolonged La Niña event, with three consecutive winters of La Niña conditions. In two of these winters, 2020/21 and 2022/23, SSWs occurred (Andrews, 2023; Lockwood et al., 2022). It therefore seems timely to revisit the ENSO/SSW relationship.
In the observational record, winters with more than one SSW are unusual (e.g., Butler & Polvani, 2011), and case studies of the 2009/10 winter, when SSWs occurred in early February and late March 2010, show the winter NAO was one of the lowest on record (Fereday et al., 2012; Ouzeau et al., 2011). Our large ensemble allows us to explore the impacts of multiple events, and also the chances of multiple events occurring, in more detail.
The outline for this paper is as follows: Section 2 describes the data and methods used. The results of reviewing the ENSO/SSW relationship are shown in Section 3.1, Section 3.2 looks more generally at the impact of model winters with more than 1 SSW, and Section 3.3 examines the risk of winters with more than 1 SSW compared with that expected by chance. Discussion and conclusions are given in Section 4.

2 | DATA AND METHODS
The initialised hindcasts are from the GloSea5-GC2 (GloSea5 hereafter) seasonal prediction system (MacLachlan et al., 2015). The underpinning Met Office climate model used in this version of the GloSea5 system is HadGEM3-GC2 (Williams et al., 2015), with a vertical resolution of 85 levels in the atmosphere (with a top at 85 km) and 75 levels in the ocean (with a 1 m top level). The ocean horizontal resolution is 0.25° on a tri-polar grid and the atmosphere has a horizontal resolution of 0.83° (longitude) × 0.56° (latitude). Two hindcast sets, produced in association with real-time forecasts during the winters of 2019 and 2020, are used; these use the same model but differ in the seed used in the stochastic physics scheme (MacLachlan et al., 2015). There are 24 winters from 1993/94 to 2016/17 and we use hindcast sets initialised on 25th October, 1st November and 9th November, each with 7 ensemble members. The total number of samples is 24 (years) × 3 (hindcast sets) × 7 (ensemble members) × 2 (sets), giving 1008 winter realizations in total.
Data from ERA5 (Hersbach et al., 2020) are used to calculate statistics for observed SSWs. Daily means of 10 hPa zonal mean wind are calculated from 00, 06, 12 and 18Z values. We use extended winters from 1957/58, the year when improvements were made to the observing system for the International Geophysical Year, to 2022/23, giving 66 years in total.
As in Butler and Polvani (2011), we use the Charlton and Polvani (2007) definition for SSWs, noting that on the basis of more recent data, the 10 hPa, 60° N criterion has been confirmed as a suitable threshold (Butler & Gerber, 2018):
1. The central date of the warming is the first day on which the daily zonal mean zonal wind at 10 hPa and 60° N transfers to easterly.
2. Once a warming is identified, an interval of 20 consecutive days with westerly winds must exist before another event can be defined.
3. Cases where the zonal winds are easterly and do not return to westerly for at least 10 consecutive days before April 30th are considered final warmings and are excluded.
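The three criteria above can be sketched in code. The following is a minimal illustration, not the implementation used for this study: the function names are hypothetical, the input is assumed to be a daily series of 10 hPa, 60° N zonal mean zonal wind running from 1 November to 30 April, a wind of exactly zero is treated as westerly, and criterion 2 is read as requiring a run of 20 consecutive westerly days at some point after the previous event.

```python
def has_westerly_run(winds, length):
    """Return True if `winds` contains `length` consecutive westerly (>= 0) days."""
    run = 0
    for wind in winds:
        run = run + 1 if wind >= 0 else 0
        if run >= length:
            return True
    return False


def find_ssw_central_dates(u, n_winter, min_gap=20, min_recovery=10):
    """Return day indices of SSW central dates in a daily wind series.

    u            -- daily 10 hPa, 60N zonal mean wind, 1 Nov to 30 Apr (assumed)
    n_winter     -- number of days in the counting period (Nov-Mar)
    min_gap      -- consecutive westerly days needed between events (criterion 2)
    min_recovery -- consecutive westerly days needed after an event, before the
                    end of the series, for it not to be a final warming (criterion 3)
    """
    events = []
    armed = True       # a new event may be defined (assumes an established vortex)
    westerly_run = 0   # consecutive westerly days up to the current day
    for day, wind in enumerate(u):
        if wind < 0:
            # Criterion 1: first easterly day while a new event is allowed.
            if armed and day < n_winter and has_westerly_run(u[day + 1:], min_recovery):
                events.append(day)
            armed = False
            westerly_run = 0
        else:
            westerly_run += 1
            if westerly_run >= min_gap:
                armed = True
    return events


# Synthetic 181-day example (values are invented for illustration only).
example = [20] * 40 + [-5] * 5 + [15] * 30 + [-3] * 4 + [10] * 72 + [-2] * 30
central_dates = find_ssw_central_dates(example, n_winter=151)
```

In this synthetic series the reversal that begins after the end of March is ignored because it falls outside the November to March counting period; a reversal inside that period with no subsequent 10-day westerly recovery would be dropped as a final warming.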
Except for when we are specifically looking at multiple SSW events in the model, we count an SSW winter as one which has 1 or more SSW events, that is, the frequency in any year cannot be greater than 1. This is different from some other studies (including Butler & Polvani, 2011) where double events are counted, and the frequency can be greater than 1. Like Butler and Polvani (2011), this study analyses the extended winter period, November to March.
In the model, 647 winters out of 1008 have at least one SSW, that is, a frequency of 0.64 with a 95% confidence interval of 0.61-0.67. This compares with 35 winters out of 66 in ERA5, a frequency of 0.53 with a 95% confidence interval of 0.41-0.65, which indicates that the observed frequency lies within the more robust estimate from the model hindcasts and gives confidence that our model is acceptable in this respect. This is consistent with Lockwood et al. (2022) and Bett et al. (2023), who analysed GloSea5 for the shorter December to February period.
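The text does not state how the confidence intervals were computed. As a rough check, and purely as an assumption, a normal-approximation (Wald) interval for a binomial proportion reproduces the quoted figures to two decimal places; `wald_interval` is a hypothetical helper name.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation 95% interval for a proportion (an assumed method)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the text: winters with at least one SSW.
model = wald_interval(647, 1008)  # GloSea5 hindcasts
obs = wald_interval(35, 66)       # ERA5
```

Other interval constructions (for example Wilson or bootstrap intervals) would give slightly different bounds, particularly for the smaller observational sample.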
For classification of ENSO years, we use definitions from NOAA/CPC. The Oceanic Niño Index (ONI) is a 3-month running mean of ERSST.v5 SST anomalies in the Niño 3.4 region (5° N-5° S, 120°-170° W), based on centred 30-year base periods which are updated every 5 years. For this study, El Niño and La Niña winters require that the November-December-January ONI meets a threshold of ±0.5 °C and that this season is also part of an historical event. Historical El Niño or La Niña events are defined as having occurred when this threshold has been met for 5 or more consecutive overlapping ONI seasons (NOAA/CPC, 2023). The 66 ERA5 years consist of 23 El Niño years, 23 La Niña years, and 20 ENSO-neutral years (Table S1). The GloSea5 hindcast period includes 8 El Niño events (336 realizations), 6 ENSO-neutral events (252) and 10 La Niña events (420), and as ENSO prediction skill is very high at this lead time (Dunstone et al., 2016), we use the observed ENSO state for grouping the SSWs.
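The classification rule described above can be sketched as follows. This is an illustrative reading of the rule rather than NOAA/CPC code: `classify_seasons` is a hypothetical name, the input is assumed to be a list of ONI values for consecutive overlapping 3-month seasons, and the sample values are invented. A winter would then take the label assigned to its November-December-January season.

```python
def classify_seasons(oni, threshold=0.5, min_run=5):
    """Label each overlapping ONI season as 'el_nino', 'la_nina' or 'neutral'.

    A season gets an ENSO label only if it meets the threshold AND belongs to
    a run of at least `min_run` consecutive seasons that all meet it.
    """
    labels = ["neutral"] * len(oni)
    tests = (
        ("el_nino", lambda value: value >= threshold),
        ("la_nina", lambda value: value <= -threshold),
    )
    for name, meets in tests:
        run_start = None
        # A trailing 0.0 sentinel closes any run that reaches the end of the list.
        for i, value in enumerate(list(oni) + [0.0]):
            if meets(value):
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                if i - run_start >= min_run:
                    labels[run_start:i] = [name] * (i - run_start)
                run_start = None
    return labels


# Invented ONI values for 15 consecutive overlapping seasons.
sample_oni = [0.2, 0.6, 0.7, 0.9, 1.0, 0.8, 0.3,
              -0.6, -0.7, -0.2, -0.5, -0.6, -0.8, -0.9, -0.5]
sample_labels = classify_seasons(sample_oni)
```

In the invented series, the two-season cold excursion is left as neutral because it does not persist for five overlapping seasons.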

3.1 | El Niño, La Niña and SSW frequency
Examining ERA5 reanalysis data for 66 extended winters (November-March) for the period 1957/58 to 2022/23, we find the SSW frequency to be 0.57 for El Niño years, 0.57 for La Niña years and 0.45 for ENSO-neutral years (Figure 1a). This is in broad agreement with Butler and Polvani (2011), who find that frequencies of SSWs are roughly equal in El Niño and La Niña years, and greater than that in ENSO-neutral years (see also Tables 1 and 2 of Domeisen et al., 2019). In contrast, analysis of the GloSea5 hindcasts suggests the more intuitive result, with an approximately linear relationship with ENSO phase: the highest frequency of SSWs occurs in El Niño years, 0.72, and the lowest in La Niña years, 0.59 (Figure 1b). We also note that the likelihood of there being one (or more) SSW increases (decreases) with the magnitude of the El Niño (La Niña) event. Choosing the three largest amplitude El Niño events from the hindcast sample increases the likelihood to 0.79, whilst for the 3 largest La Niña events the likelihood decreases to 0.55, consistent with the relationship found for the DJF season for this model (Lockwood et al., 2022, their Figure 6c). The model results indicate linearity in this respect, and with regard to the mechanisms linking ENSO SST variability with the stratospheric polar vortex, in agreement with Weinberger et al. (2019) (see their Table 3).
Our model has many more realisations of ENSO years than the observed record. So next, we ask the question: can the observed frequencies be sampled by chance from this model? To explore this possibility, we randomly re-sample the model hindcasts 20,000 times based on the observed number of events in the different ENSO phases. For example, for El Niño, we put all the hindcast El Niño years (336) into a 'bag' and then pick random samples each consisting of 23 years (the number of El Niño years in the observations). We find that for each resulting model distribution with respect to ENSO phase, the observations lie within the central 95% of the model distribution (Figure 2), indicating that the observed frequencies are actually consistent with the model. We conclude that, according to our model, it cannot be ruled out that the observed increase in SSW frequency during both phases of ENSO arose by chance due to the relatively short historical record.
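The 'bag' resampling described above can be sketched as below. This is a minimal illustration under stated assumptions: the count of 242 El Niño hindcast winters with an SSW is back-calculated from the rounded frequency of 0.72 and is not a number given in the text; sampling is done without replacement within each draw of 23 winters, which the text does not specify; and the function names are hypothetical.

```python
import random


def resample_frequencies(flags, sample_size, n_resamples=20000, seed=0):
    """Draw `n_resamples` random samples of winters and return the sorted
    SSW-winter frequency of each sample."""
    rng = random.Random(seed)
    freqs = [
        sum(rng.sample(flags, sample_size)) / sample_size
        for _ in range(n_resamples)
    ]
    freqs.sort()
    return freqs


def central_range(sorted_freqs, coverage=0.95):
    """Return the bounds of the central `coverage` fraction of the samples."""
    n = len(sorted_freqs)
    tail = (1 - coverage) / 2
    return sorted_freqs[int(tail * n)], sorted_freqs[int((1 - tail) * n) - 1]


# 336 El Niño hindcast winters; 1 = at least one SSW, 0 = none.
# ASSUMPTION: 242 SSW winters, inferred from the rounded frequency 0.72.
el_nino_flags = [1] * 242 + [0] * 94
freqs = resample_frequencies(el_nino_flags, sample_size=23)
low, high = central_range(freqs)

# Observed El Niño frequency of about 0.57 corresponds to 13 of 23 winters.
observed = 13 / 23
consistent = low <= observed <= high
```

The same procedure would be repeated with the La Niña and ENSO-neutral hindcast winters, using sample sizes of 23 and 20 respectively.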
The overall distribution of SSWs by month for GloSea5 is generally in good agreement with observations, with the highest frequencies for both El Niño and La Niña occurring in January and February (Butler & Polvani, 2011; Figure 3). Interestingly, the model indicates a higher frequency of SSWs for La Niña than El Niño in November and December and a lower frequency for the remainder of the period. This suggests consistency with the well documented shift in ENSO response from early to late winter, with La Niña having a more blocked negative NAO-like pattern in early winter, and being more mobile and positive NAO-like in late winter; El Niño shows the converse (Fereday et al., 2012; Moron & Gouirand, 2003).

3.2 | Impact of winters with more than one SSW
We assess the mean impact of an SSW on the winter mean (DJFM) surface pressure and near-surface temperature, compared with years without an SSW, and find the expected negative NAO response and associated quadrupole surface temperature pattern, with warming over eastern North America and southern Europe and cooling over southern North America and northern Europe (Figure 4a,b). This response is in good qualitative agreement with observations (Polvani et al., 2017; Thompson & Wallace, 1998). Note that the analysis shown in Figure 4 does not distinguish between El Niño, La Niña and ENSO-neutral events, and yet the signature of El Niño can be seen in the tropical Pacific near-surface temperature because, as we have already seen (Figure 1b), the years without SSWs are biased towards La Niña. Hence the mid-latitude response will contain an element of the El Niño tropospheric response in addition to that associated directly with the SSW (Jiménez-Esteve & Domeisen, 2018).
In our analysis of ERA5, double SSWs occur on just 7 occasions in the 66-year record, or roughly 1 double event in every 9 years. Double events are also found in GloSea5 hindcasts, with a not dissimilar frequency of about 1 every 8 years. Using our large model sample of 132 double SSWs, we can examine the mean surface impact. Compared to winters with 1 SSW, the NAO is more negative and the associated near-surface temperature impacts are enhanced. On average over DJFM there is an NAO decrease of −1.65 hPa relative to winters with 1 SSW event (Table 1).

3.3 | Unseen events: Multiple SSWs per winter
On just 4 occasions in our hindcast set, representing a roughly 1/250 year event, winters with 3 SSWs are identified (Figure 5a,b). Spatial plots of the surface response (Figure 4e,f) suggest that the surface response is further amplified with respect to the double SSW winters (Figure 4c,d), and the NAO is more negative than the double winters by −4.6 hPa (Table 1), although we note this is likely to be uncertain due to the small sample. Nevertheless, this suggests the potential for an extreme impact on surface climate. The size of our ensemble is so large that the SSW algorithm even detects 1 event with 4 SSWs (Figure 5c), although we note that in this case the timeseries shows that each event is short-lived, with the 10 hPa zonal mean wind becoming easterly for only a brief period, and after the 4th SSW, the wind only becomes weakly westerly, without a full recovery of the polar vortex.
While the frequency of having 1 SSW shows a linear relationship with ENSO phase, for 2 SSWs the difference between El Niño and La Niña cannot readily be distinguished, with similar frequencies of 0.15 and 0.12, and 95% confidence intervals which overlap. However, we also note that for the 3 and 4 SSW events, none of these events coincides with La Niña conditions, with 3 occurring in El Niño winters (2002, 2009, 2014) and 2 in ENSO-neutral winters (1993, 2001), consistent with the raised probability of single events during El Niño. In addition, these latter events do not appear to be obviously related to the initial strength of the polar vortex at the start of winter, with timeseries starting from values of both stronger and weaker 10 hPa winds than the average model climatology. While the influence of other external forcing has not been ruled out, it is also possible that internal variability may play a role (e.g., Palmeiro et al., 2023).

If SSWs were independent, random events, the number of SSWs per season would approximate a Poisson distribution. In Figure 6, the blue bars show the probability mass function of a Poisson distribution with the expected number of events per season, λ, equal to 1.03, chosen to match the probability of zero SSWs per season measured in GloSea5. The vertical lines on each bar represent the 95% sampling uncertainty with a sample size equal to that in GloSea5 (N = 1008 seasons), estimated by randomly generating 1000 samples of 1008 seasons. The grey bars plot the probabilities measured from GloSea5, and we can see that the probabilities of multiple SSWs per season in GloSea5 are significantly lower than expected from a Poisson distribution.
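As a worked check of the Poisson comparison, λ can be recovered from the zero-event probability, since for a Poisson distribution P(0) = exp(−λ). The sketch below uses only counts quoted in the text (647 of 1008 winters with at least one SSW, and 132 double-SSW winters); the helper name `poisson_pmf` is hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Probability of a winter with no SSW in GloSea5, from the counts in the text.
p_zero = 1 - 647 / 1008

# Matching P(0) = exp(-lam) gives lam = -ln(P(0)).
lam = -math.log(p_zero)

# Poisson probabilities of 0, 1, 2, 3 and 4 SSWs per season.
pmf = [poisson_pmf(k, lam) for k in range(5)]

# GloSea5 frequency of winters with exactly two SSWs, for comparison with pmf[2].
glosea5_double = 132 / 1008
```

With these inputs λ rounds to 1.03, and the Poisson probability of exactly two events exceeds the GloSea5 double-SSW frequency, in line with the statement that multiple events are rarer in the model than independence would imply.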
To test whether this discrepancy arises from the definition of SSWs used in this paper, which requires 20 consecutive days of westerly winds between SSWs, and 10 consecutive days of westerly winds before 30th April for an event not to be classed as a final warming, we test another statistical model where we randomly generate whether an SSW occurs on each day of the extended winter using a daily SSW probability of p = 6.84 × 10⁻³, again with p chosen to match the probability of zero SSWs measured in GloSea5. We then count the SSWs using the definition described above. We generated 1000 samples of 1008 seasons, and the mean and 95% sampling uncertainty are shown by the orange bars and vertical lines in Figure 6. Although the probabilities are closer to GloSea5 than the Poisson distribution, the probability of multiple SSWs per season is still significantly lower in GloSea5. This likely at least partly reflects the timescale for the stratosphere to come back to a dynamical/radiative equilibrium, which in practice often involves an overshoot to colder stratospheric temperatures (anomalously strong vortex) while the warming due to dynamical effects is being re-established (Bloxam & Huang, 2021; Hardiman et al., 2020). It suggests that multiple SSW winters are rarer than might otherwise be expected.
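A simplified version of this daily statistical model can be sketched as follows. Several details are assumptions, since the text does not give them: the season is taken as 151 days (November to March in a non-leap year), the 20-day westerly requirement is approximated by requiring more than 20 days between counted events, the final-warming criterion is not modelled at all, and the function names are hypothetical. Only the daily probability p = 6.84 × 10⁻³ and the sample size of 1008 seasons come from the text.

```python
import random
from collections import Counter


def count_ssws_in_season(rng, p=6.84e-3, n_days=151, min_gap=20):
    """Count SSWs in one synthetic season of independent daily triggers.

    A trigger is counted only if more than `min_gap` days have passed since
    the previous counted event (a proxy for the 20 westerly days rule).
    """
    count = 0
    last_event = None
    for day in range(n_days):
        if rng.random() < p and (last_event is None or day - last_event > min_gap):
            count += 1
            last_event = day
    return count


def simulate_distribution(n_seasons=1008, seed=0, **kwargs):
    """Return {number of SSWs: fraction of seasons} for one synthetic sample."""
    rng = random.Random(seed)
    counts = Counter(count_ssws_in_season(rng, **kwargs) for _ in range(n_seasons))
    return {k: v / n_seasons for k, v in sorted(counts.items())}


# One synthetic sample of 1008 seasons; the text uses 1000 such samples to
# estimate the mean and the 95% sampling uncertainty.
distribution = simulate_distribution()
```

Repeating `simulate_distribution` with different seeds and taking percentiles of each probability would give a sampling range comparable in spirit to the vertical lines in Figure 6, subject to the simplifications listed above.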

4 | DISCUSSION
We show that GloSea5, which overall has a realistic SSW frequency, indicates that the likelihood of SSWs in El Niño years is greater than that during La Niña. Our key finding is that the observed result, that El Niño and La Niña are associated with similar SSW frequencies, which are larger than those for ENSO-neutral years, could well have arisen due to the sampling variability in the limited observational record. This finding is in agreement with a recent study using the GEOSCCM chemistry-climate model in which observed SST variations were imposed globally (Weinberger et al., 2019). The result is important as an active stratosphere can make a significant difference to the surface climate response (Butler et al., 2014) and seasonal winter forecasts (Lockwood et al., 2022), and we need to be able to quantify this risk for our seasonal predictions.
Nevertheless, although we have many hindcast members, these cover only 24 years of initial conditions from the observational record, and may not fully sample the diversity in ENSO (Ren et al., 2019; Timmermann et al., 2018), which may itself impact on the influence on the stratosphere (Calvo et al., 2017). Other processes may also not be fully represented. For example, solar variability may impact SSW rate (Labitzke, 1987), but is not explicitly included in this version of the model, although its effects should be represented in initial conditions both in the atmosphere and the ocean (Andrews et al., 2015). Although not explored here, there may also be an influence from the quasi-biennial oscillation (Anstey et al., 2022) and sensitivity to detection criteria (Song & Son, 2018).
Similarly, the possibility of error due to model bias cannot be ruled out. Free running models show uncertainty in the SSW/ENSO phase relationship (Garfinkel et al., 2012; Song & Son, 2018) and differences in this relationship may be due to the positioning of ENSO teleconnection patterns in the North Pacific in relation to the region most associated with SSWs (Garfinkel et al., 2012). In addition, recent analysis suggests that teleconnections from the tropics to the mid latitudes may be too weak in seasonal prediction models (Garfinkel et al., 2022; Williams et al., 2023). We also note that our model does not reproduce the observed periods of low and high decadal variability in SSW frequency of the 1990s and 2000s, respectively (Domeisen, 2019; Reichler et al., 2012), although the reasons for this apparent decadal variability are still not known. Despite this, it has been shown that the correction of model mean bias, while altering the frequency of SSWs, does not appear to alter the SSW/ENSO phase relationship (Tyrrell et al., 2022).
Finally, while there are only a few observed events, our large model sample enables us to briefly explore winters with 2 or more SSWs. We find that increases in the number of warmings lead to an increasingly negative NAO and stronger near-surface temperature impact. In an application of the UNSEEN methodology (Thompson et al., 2017), we find that on very rare occasions the model produces 3, or even on one occasion 4, SSWs. While the risk of such events would likely always be low in seasonal outlooks, our daily initialisation strategy for forecasts (Arribas et al., 2011) enables warnings at shorter lead times once SSW events are within the deterministic range.

FIGURE 1 SSW frequency with ENSO phase. Likelihood of one or more SSWs occurring during El Niño (red), La Niña (blue) and ENSO-neutral (grey) winters (November to March) for (a) observations (ERA5 1957/58-2022/23) and (b) GloSea5 hindcasts. The black dashed line shows the SSW frequency for all years. Vertical bars show the 95% confidence interval.

FIGURE 2 Resamples of GloSea5 hindcasts with ENSO phase. Histogram showing the likelihood of one or more SSWs for random samples of GloSea5 winters (November to March) for (a) El Niño, (b) La Niña and (c) ENSO-neutral conditions. Vertical lines show the SSW frequency for observations (red, solid) and the 95% range of the resampled GloSea5 hindcasts (cyan, dashed).

FIGURE 3 Seasonality of SSWs with ENSO phase. Histogram of frequency of SSWs during El Niño (red) and La Niña (blue) for months November to March for (a) observations (ERA5 1957/58 to 2022/23) and (b) GloSea5 (1993/94-2016/17).

FIGURE 4 Impact of SSWs on winter (December to March) climate. Difference in sea-level pressure (hPa) (left) and near-surface temperature (K) (right) for winters with 1 SSW (a,b), winters with 2 SSWs (c,d) and winters with 3 SSWs (e,f) with respect to winters where the polar vortex is not active.

TABLE 1 Difference in mean NAO (DJFM) for various winter SSW counts relative to the climatological NAO for all GloSea5 hindcast winters, 19.92 hPa.

FIGURE 5 Examples of GloSea5 winters with multiple SSWs. Timeseries of 60° N 10 hPa zonal mean wind (green) for 3 GloSea5 hindcasts in which three (a,b) and four (c) SSWs occur. The GloSea5 10 hPa zonal mean wind climatology is shown in black.

FIGURE 6 SSW probability mass functions for random, independent SSWs and in GloSea5. The blue bars show the SSW probability mass function for a Poisson distribution with λ = 1.03. The orange bars show the SSW probability mass function for a statistical model where we randomly generate whether an SSW occurs on each day of the extended winter using a daily SSW probability of p = 6.84 × 10⁻³, and applying the SSW definition when counting the SSWs (see text for details). The grey bars show the probabilities measured from GloSea5. For the statistical models, the vertical lines show the 95% sampling uncertainty for a sample size matching GloSea5 (1008 seasons).