How well do forecast models represent observed long‐lived Rossby wave packets during southern hemisphere summer?

Rossby wave packets (RWPs) are atmospheric perturbations linked to the occurrence of extreme weather events such as heatwaves, extratropical cyclone development and other equally destructive phenomena. Under certain circumstances, these packets can last from several days to 2–3 weeks in the atmosphere. Therefore, forecast models should be able to correctly predict their formation and development in order to improve the prediction of extreme weather events 10 to 30 days in advance. In this study, we assess whether the NCEP and IAP‐CAS sub‐seasonal forecast models can predict the evolution of observed RWPs that last more than 8 days (long‐lived RWPs or LLRWPs) during southern hemisphere summer. Results show that the NCEP (IAP‐CAS) model forecasts LLRWPs that appear to the east (west) of the observed LLRWPs. Both models forecast LLRWPs that rapidly lose energy after the 6th–7th lead day of simulation, which could limit LLRWP prediction to the synoptic time scale. Additionally, both models forecast LLRWPs better when the packets manifest in the eastern Pacific. The Southern Annular Mode (SAM) and El Niño‐Southern Oscillation (ENSO) do not seem to exert a large influence on the representation of LLRWPs. Nevertheless, during the best LLRWP forecasts, the observed circulation anomalies signal the manifestation of negative SAM events. In contrast, both forecast models struggle to forecast LLRWPs when a blocking situation develops to the south of Australia. Lastly, an inactive Madden‐Julian Oscillation (MJO) seems to favor accurate LLRWP forecasts, whereas during phases 3 and 5 in the NCEP model and 3 and 8 in IAP‐CAS, the models struggle to forecast LLRWPs.

Due to their link to extreme weather events and their impact on atmospheric predictability, it is important that numerical weather prediction models represent the development of RWPs well in order to obtain skillful forecasts (Quinting & Vitart, 2019). An accurate representation of LLRWP development can enhance the detection of extreme weather events up to 10-30 days in advance. Quinting and Vitart (2019) analyzed the representation of RWPs in various sub-seasonal to seasonal (S2S) models during northern hemisphere winter, concluding that S2S models are able to give a good estimation of the characteristics of the RWPs. However, that study focused on the northern hemisphere and did not distinguish between short- and long-lived RWPs. Recent studies have characterized RWPs in the southern hemisphere and show that in summer LLRWPs represent about 10% of the total RWPs and that their frequency of occurrence is influenced by the Southern Annular Mode (SAM) and El Niño-Southern Oscillation (ENSO) (Pérez et al., 2021; Sagarra & Barreiro, 2020). As a continuation of these studies, here we aim to analyze whether S2S models are able to forecast the development and trajectories of LLRWPs. To do so, we compare the trajectories of LLRWPs tracked in a reanalysis against the trajectories of the LLRWPs forecasted by two S2S models. Additionally, we study whether the LLRWP forecast is affected by the area where the LLRWPs first manifested, the dominant SAM or ENSO phase, or the Madden-Julian Oscillation (MJO).
The paper is organized as follows. Section 2 describes the datasets, the RWP tracking methodology and the analysis performed to assess LLRWP representation in both models. Section 3 shows the results followed by a discussion, and Section 4 presents a summary of the study.
The study period is the southern hemisphere summer (December to March, or DJFM), as in Sagarra and Barreiro (2020) and Pérez et al. (2021). We restricted our analysis to the DJFM seasons between 1999 and 2010 due to time constraints of the NCEP dataset, which leaves 11 DJFM seasons available for the analysis.
RWPs propagate in the atmosphere as meanders of the jet stream, producing a series of troughs and ridges restricted to a certain latitudinal band and showing a mainly eastward direction during austral summer (Chang, 1999b). Therefore, RWPs can be characterized by computing the envelope that surrounds the packet. In order to compute the envelope (V 300env , in m/s) we used meridional winds at 300 hPa (V 300 ) following the methodology detailed in Pérez et al. (2021). Given that RWP propagation is mainly zonal during southern hemisphere summer (Chang, 1999a, 1999b), we average the data in the latitudinal band between 40° and 65°S.
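As an illustration of the envelope and band-averaging steps, the following is a minimal sketch and not the procedure of Pérez et al. (2021), which is not reproduced in this text. It assumes that the envelope is the amplitude of the analytic signal (a Hilbert-transform envelope, a common choice for wave packets) computed along longitude, and that the latitudinal mean is unweighted. The function names `envelope` and `band_mean` are hypothetical.

```python
import cmath
import math


def envelope(v):
    """Amplitude of the analytic signal of a periodic series (naive DFT).

    Assumption: a Hilbert-transform envelope along longitude; the source
    defers to Perez et al. (2021) for the actual method.
    """
    n = len(v)
    # Forward discrete Fourier transform.
    spec = [sum(v[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]
    # Keep the mean (and Nyquist) once, double positive wavenumbers,
    # and drop negative wavenumbers.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    analytic = [sum(weights[k] * spec[k] * cmath.exp(2j * math.pi * k * j / n)
                    for k in range(n)) / n
                for j in range(n)]
    return [abs(z) for z in analytic]


def band_mean(field, lats, south=-65.0, north=-40.0):
    """Unweighted mean of a [lat][lon] field over south <= lat <= north."""
    rows = [row for lat, row in zip(lats, field) if south <= lat <= north]
    if not rows:
        raise ValueError("no latitudes inside the requested band")
    return [sum(col) / len(rows) for col in zip(*rows)]


# Example: a pure wavenumber-3 wave of amplitude 10 m/s on 36 longitudes
# has a flat envelope of 10 m/s.
wave = [10.0 * math.cos(2 * math.pi * 3 * j / 36) for j in range(36)]
flat_envelope = envelope(wave)
```

Whether the latitudinal average is taken before or after the envelope is computed is not stated here, so the two functions are kept separate.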

| Description of RWPs tracking algorithm and detection of LLRWPs
Prior to applying the RWP tracking algorithm, we need to filter out low-amplitude V 300env data to avoid tracking noise. Nevertheless, the selection of the threshold is not obvious because there are no physical properties that separate one packet from another (Souders et al., 2014b). We apply a threshold of 19 m/s (Pérez et al., 2021) for ERA5, and a threshold of 18 (17) m/s for NCEP (IAP-CAS) data. The chosen thresholds are based on the distribution of 7-day running mean values of V 300env , shown in Figure S1 (as done in Grazzini & Vitart, 2015; and Sagarra & Barreiro, 2020). The method considers that V 300env values smaller than the median of the distribution are noise, and they are thus set to zero.
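The thresholding step can be sketched as follows. This is a simplified illustration: it assumes the distribution is built from a single flat series of daily V 300env values, whereas the actual distribution in Figure S1 presumably pools many grid points and days. The function names are hypothetical.

```python
from statistics import median


def running_mean(series, window=7):
    """Running mean with one value per complete window."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def threshold_from_running_mean(daily_values, window=7):
    """Median of the distribution of running-mean values."""
    return median(running_mean(daily_values, window))


def apply_threshold(envelope_row, threshold):
    """Set envelope values smaller than the threshold to zero."""
    return [v if v >= threshold else 0.0 for v in envelope_row]


# Example with the 19 m/s threshold quoted for the reanalysis.
filtered = apply_threshold([18.0, 19.0, 25.5], 19.0)
```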
The tracking algorithm is based on the maximum envelope technique (Grazzini & Vitart, 2015; Pérez et al., 2021; Sagarra & Barreiro, 2020), which locates areas with the maximum daily values of V 300env above a minimum threshold, identifying the RWP center of activity, and then follows their propagation to the east assuming that the packets travel with speeds between 15° and 45°/day. After the tracking stage, the algorithm links truncated trajectories using proximity criteria and registers the RWP characteristics. Afterwards, we retained RWPs that lasted more than 8 days (LLRWPs) and registered the dates when the LLRWPs are detected (T d ) and their areas of formation (X d ). A detailed description of the algorithm is available in Pérez et al. (2021). Finally, it is important to highlight that this algorithm requires a zonally symmetric wind flow (Grazzini & Vitart, 2015), which is only observed during austral summer (Chang, 2000).
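A heavily simplified sketch of the maximum envelope idea is given below. It follows a single packet through a thresholded Hovmöller array by picking, each day, the largest envelope value located 15° to 45° east of the previous position. It omits the linking of truncated trajectories and the detection of multiple packets, and it counts the lifetime as the number of tracked days, which is an assumption. The full algorithm is the one described in Pérez et al. (2021); the names here are hypothetical.

```python
def track_packet(hov, lons, start_day, start_lon, vmin=15.0, vmax=45.0):
    """Follow one packet eastward through a thresholded Hovmoller array.

    hov[day][i] is the envelope at longitude lons[i] (degrees east, 0-360).
    Returns a list of (day, longitude) pairs.
    """
    track = [(start_day, start_lon)]
    lon = start_lon
    for day in range(start_day + 1, len(hov)):
        best_val, best_lon = 0.0, None
        for i, cand in enumerate(lons):
            shift = (cand - lon) % 360.0  # eastward distance with wrap-around
            if vmin <= shift <= vmax and hov[day][i] > best_val:
                best_val, best_lon = hov[day][i], cand
        if best_lon is None:
            break  # nothing above the threshold inside the search window
        track.append((day, best_lon))
        lon = best_lon
    return track


def is_long_lived(track, min_days=8):
    """True if the packet was tracked for more than min_days days."""
    return len(track) > min_days
```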
To search for LLRWPs in the S2S models, we downloaded the reforecast datasets with forecasts starting on days T d , and transformed V 300 into V 300env following the methodology mentioned in Section 2.1. The tracking algorithm is then applied to each simulation to search for forecasted LLRWPs. As we search for RWPs that start close to X d , before the tracking stage, V 300env forecast data outside the range [ ] for lead days 1-3 of simulation are deleted from the data matrix. R is a typical Rossby radius (1000 km), T n the lead day of the forecast and V min (V max ) the minimum (maximum) propagation speed of the packets, here considered as 10° (50°)/day to allow for small biases in the reforecasted data.
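The masking step can be illustrated with the sketch below. The longitude bounds of the window are passed in explicitly for each lead day rather than derived from X d , R, T n , V min and V max , so no particular form of the range is assumed. Treating "deleted" as "set to zero" is also an assumption, and the function names are hypothetical.

```python
def in_window(lon, west, east):
    """True if lon lies on the eastward arc from west to east (degrees)."""
    return (lon - west) % 360.0 <= (east - west) % 360.0


def mask_outside_window(hov, lons, windows):
    """Zero the envelope outside the search window on selected lead days.

    hov[k] is the envelope on lead day k + 1; windows maps a 1-based lead
    day to (west, east) bounds. Lead days without an entry are unchanged.
    """
    masked = []
    for lead_day, row in enumerate(hov, start=1):
        if lead_day in windows:
            west, east = windows[lead_day]
            row = [v if in_window(lon, west, east) else 0.0
                   for lon, v in zip(lons, row)]
        masked.append(list(row))
    return masked


# Example with made-up bounds; the second window crosses the 0/360 meridian.
example = mask_outside_window(
    [[20.0] * 4 for _ in range(4)],
    [0.0, 90.0, 180.0, 270.0],
    {1: (80.0, 200.0), 2: (260.0, 10.0)},
)
```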
After the tracking stage, we search in every simulation for LLRWPs that start their propagation between lead days 1-3 of the forecast. If the tracking algorithm detects an LLRWP that matches this condition, we save that trajectory as the forecasted evolution of an LLRWP (hereafter FRWP); otherwise, we assume that the simulation failed to predict an FRWP and proceed to the next simulation. An example for the NCEP model is displayed in Figure 1, which shows an LLRWP detected in ERA5 and two NCEP simulations where FRWPs are close to the observed LLRWP.
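The selection rule above can be sketched as a small function. The text does not say what happens when several packets start within lead days 1-3, so choosing the longest one is an assumption of this sketch, as is the track format of (lead day, longitude) pairs.

```python
def select_frwp(tracks, max_start_day=3):
    """Pick the forecast packet (FRWP) for one simulation, or None.

    Each track is a list of (lead_day, longitude) pairs, 1-based lead days.
    """
    candidates = [t for t in tracks if t and 1 <= t[0][0] <= max_start_day]
    if not candidates:
        return None  # the simulation failed to predict an FRWP
    return max(candidates, key=len)  # assumption: keep the longest candidate
```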

| Representation of LLRWPs in the forecast models, and the influence of SAM, ENSO and MJO
We start by analyzing whether the forecast models can predict the development of the LLRWPs. To do so, we first measured the proportion of ensemble members able to predict an FRWP, and the proportion able to predict an FRWP that lasted more than 8 days, during different ENSO/SAM phases. The classification of the SAM and ENSO phases follows the same criteria as in Pérez et al. (2021).
Second, in order to study how similar the FRWPs are to the observed LLRWPs, we measured the zonal displacement between the observed LLRWP trajectory and the FRWPs during the first 9 days after the detection of the observed LLRWPs. Additionally, in order to infer how energetic the FRWPs are compared to the observed LLRWPs, we studied the differences between the V 300env at the center of the observed LLRWPs and the forecasted values of V 300env found at the center of the FRWPs.
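Both comparisons can be sketched in a few lines. The sign conventions follow the captions of Figures 2 and 3: displacement is observed minus forecast longitude, so positive values place the FRWP to the west of the observed packet, and the energy difference is observed minus forecast V 300env . The track format (a mapping from lead day to longitude and envelope value) and the function names are assumptions.

```python
def zonal_displacement(obs_lon, fc_lon):
    """Observed minus forecast longitude, wrapped to [-180, 180) degrees.

    Positive values mean the forecast packet lies west of the observed one.
    """
    return (obs_lon - fc_lon + 180.0) % 360.0 - 180.0


def compare_tracks(obs, fc, max_day=9):
    """Per-lead-day (displacement, envelope difference) for shared days.

    obs and fc map a lead day to (longitude, envelope at packet center).
    """
    out = {}
    for day in range(1, max_day + 1):
        if day in obs and day in fc:
            (obs_lon, obs_env), (fc_lon, fc_env) = obs[day], fc[day]
            out[day] = (zonal_displacement(obs_lon, fc_lon), obs_env - fc_env)
    return out
```

Wrapping the difference avoids spurious jumps of nearly 360° when one packet has crossed the Greenwich meridian and the other has not.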
Third, we classified simulations as best/good/bad/worst forecasts according to whether they were able to predict the development of an LLRWP in (100-75)/50/25/0% of the ensemble members, respectively. It is worth pointing out that results of model performance may change when using a larger ensemble.
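With four ensemble members the classification reduces to a lookup on the number of members that produced a packet. The sketch below uses lower bounds on the fraction so that it also returns a class for other ensemble sizes; that generalization is an assumption and is not part of the classification described above.

```python
def classify_forecast(n_detected, n_members=4):
    """Class of a forecast from the fraction of members with a packet."""
    fraction = n_detected / n_members
    if fraction >= 0.75:
        return "best"
    if fraction >= 0.50:
        return "good"
    if fraction >= 0.25:
        return "bad"
    return "worst"


# Classes for 0 to 4 detecting members out of 4.
classes = [classify_forecast(k) for k in range(5)]
```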
Next, we measured the differences in geopotential height anomaly at 300 hPa (Z a300 ) using reanalysis and reforecast data during the best/worst LLRWP forecasts, in order to assess the differences in the mean atmospheric circulation. To do so, we constructed the mean Z a300 from days T d to T d+10 , T d being the starting dates of simulations with the best/worst LLRWP forecasts. Afterwards, we assessed the statistical significance of the results using a Student t-test at the 10% level, comparing Z a300 data that belong to dates with best/worst forecasts of LLRWPs against the rest of the dataset (Z a300 data that do not belong to best/worst forecasts).
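The composite and the significance test can be sketched as follows. The sketch assumes the pooled-variance (equal variance) form of the two-sample Student t statistic applied independently at each grid point, which the text does not specify, and it returns only the statistic and its degrees of freedom. The function names are hypothetical.

```python
import math
from statistics import mean, variance


def composite_mean(daily_fields):
    """Point-by-point mean of equally sized, flattened daily fields."""
    n = len(daily_fields)
    return [sum(col) / n for col in zip(*daily_fields)]


def student_t(sample_a, sample_b):
    """Pooled two-sample Student t statistic and degrees of freedom.

    sample_a: values at one grid point on the selected dates;
    sample_b: values at the same grid point on the remaining dates.
    """
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))
    return t, df
```

The absolute value of `t` would then be compared with the critical value of the t distribution for `df` degrees of freedom at the 10% level, taken from a table or a statistics library.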
Lastly, we studied MJO activity during the periods with the best/worst LLRWP forecasts to assess whether the propagation of the MJO during the LLRWP lifetime affects the LLRWP forecast. To do so, we first calculated the climatological frequency of having the MJO in every stage (C), and its standard deviation (STD), during austral summer between 1979 and 2020. Next, we calculated the probability of finding every MJO phase during the first 10 days since day T d for the best/worst LLRWP forecasts. If the relative frequency of occurrence of a certain MJO phase during good/bad forecasts is outside the range C ± STD, that MJO stage is more frequent/absent than usual.
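The comparison against the range C ± STD can be sketched as below. The daily MJO stage is assumed to be encoded as an integer, with 0 standing for an inactive MJO and 1-8 for the active phases. This encoding, the function names and the climatological numbers in the example are illustrative assumptions, not values from the study.

```python
from collections import Counter


def relative_frequency(stages):
    """Relative frequency of each MJO stage in a list of daily stages."""
    counts = Counter(stages)
    return {stage: n / len(stages) for stage, n in counts.items()}


def flag_stages(freq, clim, std):
    """Compare observed frequencies with the climatological range C +/- STD."""
    flags = {}
    for stage, c in clim.items():
        f = freq.get(stage, 0.0)
        if f > c + std[stage]:
            flags[stage] = "more frequent"
        elif f < c - std[stage]:
            flags[stage] = "less frequent"
        else:
            flags[stage] = "near climatology"
    return flags


# Example with made-up numbers: ten days of stages and a toy climatology.
freq = relative_frequency([3, 3, 3, 5, 0, 0, 0, 0, 0, 0])
flags = flag_stages(
    freq,
    clim={0: 0.4, 3: 0.1, 5: 0.1, 8: 0.1},
    std={0: 0.05, 3: 0.05, 5: 0.05, 8: 0.05},
)
```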

| LLRWPs tracking, ENSO and SAM influence
We found 39 LLRWPs in the austral summers between 1999 and 2010 (around 3.5 LLRWPs per season): 20 LLRWPs were found during neutral SAM years, 14 in negative SAM and 5 in positive SAM years. In the case of ENSO events, 15 LLRWPs appear in La Niña events, 8 in neutral ENSO and 16 in El Niño years. These results are consistent with Pérez et al. (2021) and with the fact that during positive SAM the strengthening of the westerlies diminishes the meandering of the flow.
As each model has four simulations available for each of the 39 observed LLRWPs, there are 156 simulations available per model. The NCEP model was able to forecast the formation of FRWPs in 86% of the simulations, and 52% of them surpassed the 8-day threshold. These FRWPs showed a mean lifespan of 9.1 ± 4.7 days. The IAP-CAS model forecasted the development of FRWPs in 84% of the simulations, although barely 40% of them lasted more than 8 days. The FRWPs tracked in IAP-CAS last around 8.2 ± 4.4 days. In contrast, observed LLRWPs displayed a mean lifespan of 13.0 ± 2.7 days. Therefore, forecast models can predict LLRWP development but underestimate their lifespan. The distribution of observed LLRWP and FRWP lifetimes is shown in Figure S2.
Table 1 shows the proportion of all FRWPs, and of FRWPs that lasted more than 8 days, that were correctly forecasted by the models during different SAM/ENSO events. In the NCEP model, the proportion of FRWPs found during years with positive SAM is lower than in other SAM phases, and neutral ENSO shows the largest proportion of detected FRWPs. Overall, the IAP-CAS model gives results similar to those of NCEP. Nonetheless, for FRWPs that surpassed the 8-day threshold, the highest (lowest) proportion is found during positive SAM events in the NCEP (IAP-CAS) model. This large difference between models might be due to the low number of cases during positive SAM events, which makes the results very sensitive to small differences. Meanwhile, for ENSO events the NCEP model shows the highest (lowest) proportion of FRWPs that surpassed the 8-day threshold in La Niña (neutral) years. In contrast, for the IAP-CAS model, the highest (lowest) proportion of FRWPs with a lifespan above 8 days is detected in neutral (La Niña) years.
We generally find lower frequencies of FRWPs that lasted more than 8 days in IAP-CAS. This is consistent with the fact that FRWPs detected in the NCEP model have longer lifespans than those found in IAP-CAS.

| Model representation of LLRWPs and influence of the MJO
Figure 2 shows the zonal displacement between the location of the observed LLRWPs and the FRWPs during the first 9 days of the packets' lifespan. FRWPs detected in the NCEP (IAP-CAS) model tend to appear more to the east (west) of the observed packet. This pattern remains approximately constant after day one until the 8th-9th lead day, when the median of both distributions is near zero. This change could be attributed to the loss of FRWPs as the simulation advances.
In Figure 3, we display the difference in V 300env at the center of the packet, computed as the observed LLRWPs minus the tracked FRWPs, for each lead day of simulation. Positive (negative) values signal that the forecast model underestimates (overestimates) the energy contained within the packet. Initially, the NCEP model does not differ greatly from the reanalysis. Nonetheless, starting on the 6th lead day of the simulation, the energy contained in FRWPs decays rapidly, indicating that FRWPs are less energetic than the observed LLRWPs. In contrast, wave packets tracked in IAP-CAS always tend to underestimate the energy contained in the observed wave packets. Thus, even though both models detect a similar number of FRWPs, the packets in IAP-CAS are much less energetic than those in the reanalysis. Giannakaki and Martius (2016) showed that forecast models in the northern hemisphere usually underestimate the area and strength of the waveguide. Moreover, Gray et al. (2014) concluded that in the northern hemisphere, the potential vorticity fields where RWPs propagate fall rapidly with lead time in numerical weather prediction models. Therefore, an underestimation of the potential vorticity anomaly fields as the forecast advances causes V 300env in the forecast to diminish faster than in the reanalysis. It is plausible that a similar process is at work in the southern hemisphere; therefore, LLRWP forecasts might be limited to the synoptic scale.
FIGURE 3 Analogous to Figure 2, but for V 300env differences at the center of the wave packet between the observed LLRWPs and their forecasted trajectories. Positive (negative) values signal that the FRWPs have lower (higher) V 300env ; thus, RWPs forecasted by the model are less (more) energetic than the observed LLRWPs.
When we focus on the classification of the simulations, nearly 18% of the NCEP simulations belong to the worst forecasts, 23% to bad forecasts, and 59% to good or best forecasts. Conversely, 36% of the IAP-CAS simulations belong to the worst forecasts and 26% to bad forecasts, whereas only 38% belong to the good/best forecasts. These results further suggest that the NCEP model is better at forecasting LLRWPs than IAP-CAS. Figure 4 shows the areas where FRWPs were first detected, both for all FRWPs and for those with a lifespan above 8 days. Both models show that most FRWPs were first detected in the eastern Pacific (241°-300°E) and western South Atlantic (301°-359°E) basins. But when we retain only simulations that are part of the good and/or best forecasts, most FRWPs were first detected in the central-eastern Pacific basin (180°-300°E) in the NCEP model, and in the eastern Pacific (241°-300°E) for the IAP-CAS model. One possible explanation for these results is that the eastern Pacific basin has a maximum of baroclinicity (Solman et al., 2003), which favors RWP development. Thus, RWPs that appear in the eastern Pacific basin will propagate toward the Atlantic-Indian sector, where the jet stream, which acts as a waveguide along which RWPs propagate, reaches its maximum intensity. Consequently, FRWPs gain stability and propagate for longer periods.
We next examine the mean atmospheric flow in the reanalysis and forecast models during the best/worst forecasts (Figure 5). It is worth mentioning that Pérez et al. (2021) concluded that the northward displacement of the jet stream (that is, during negative SAM events) causes the development of a cyclonic circulation to the southwest of New Zealand. This enables the extension of the waveguide where RWPs propagate into the Pacific, thus favoring LLRWPs. In agreement with that study, Figure 5 shows in all panels an anomalous cyclonic circulation to the southwest of New Zealand. Moreover, this cyclonic circulation is strongest and is accompanied by generally low geopotential height anomalies between 40°S and 60°S during the best forecasts. In addition, Z a300 in high latitudes significantly increases during the best forecasts which, together with the negative Z a300 in midlatitudes, signals the manifestation of negative SAM events. Consequently, results suggest that LLRWP forecasting might be more feasible during negative SAM years. In contrast, during the worst forecasts, the circulation anomalies do not show a clear common global pattern. There seems to be a stationary wave extending from Australia southwards in both models. However, in NCEP forecasts, there are several positive Z a300 anomalies in subtropical latitudes that are not present in IAP-CAS. These findings suggest that some atmospheric processes lead to the development of a stationary wave near New Zealand which impedes RWP propagation into the Pacific. Furthermore, the spatial structure suggests that the wave patterns of Figure 5 may be at least partly forced from the tropical region. To look further into this, we explore the possibility that the MJO may play a role.
FIGURE 4 Detection areas of total FRWPs/proportion of FRWPs that lasted more than 8 days in the simulations.
PÉREZ-FERNÁNDEZ and BARREIRO
Figure 6 shows the probability of occurrence of a certain MJO phase during the best/worst forecasts against their climatological frequency. During the best forecasts, both models show an anomalously inactive MJO, and phases 4-8 are especially absent, particularly in the IAP-CAS model. Also, the probability of finding phases 1-3 is near climatology. By contrast, during the worst forecasts in the NCEP model, the MJO is more active than usual in phases 3 and 5, while phases 1-2 are mostly absent and the remaining phases occur near climatology. In the IAP-CAS model, the worst forecasts are also characterized by an active MJO, particularly in phases 3 and 8, which appear with much higher frequency than in the climatology. Thus, in both models the best (worst) forecasts are characterized by an inactive (active) MJO.
FIGURE 5 Z a300 fields from T d to T d+10 , T d being the dates when we obtained the best/worst forecast skill in the NCEP and IAP-CAS models. Left (right) panels show the Z a300 field obtained using the reanalysis (forecast) data. Orange (blue) areas signal positive (negative) anomalies.
The Z a300 patterns shown in the composites of the worst forecasts of the NCEP and IAP-CAS models (Figure 5) do not match the circulation anomalies associated with their most frequent MJO phases (see Figure 1 of Alvarez et al., 2016). One possible explanation is that, because the MJO is more active than usual in several phases, the anomalies observed are a mixture of signals without a defined structure. Therefore, the maps obtained are not similar between models, and usually show weaker Z a300 values that are less significant than the anomalies associated with the best forecasts.
Results show that an active MJO degrades the LLRWP forecast, which might be attributed to the interaction between the tropically excited and mid-latitude waves. Nonetheless, we have to take into consideration that even though the MJO forecast is reliable up to 25 days in advance (Fu et al., 2013), current biases in the representation of the MJO and its teleconnections (Lim et al., 2018) may degrade LLRWP predictions.

| CONCLUSIONS
RWPs are considered precursors of extreme weather events. Under certain circumstances they can last from several days to weeks in the atmosphere; thus, studying the representation of these long-lived packets might improve extreme weather event detection on the sub-seasonal time scale. Here we considered two S2S models (NCEP and IAP-CAS) and showed that they are capable of forecasting long-lived RWPs. Nevertheless, packets forecasted by the NCEP (IAP-CAS) model are systematically shifted to the east (west) of the original packet, although they propagate with similar speeds. Both the NCEP and IAP-CAS models struggle to forecast RWPs that last more than a week because predicted packets rapidly lose energy after the first week of simulation, which might limit the forecast of long-lived RWPs to the synoptic scale. Both models accurately represent long-lived RWPs that appear in the eastern Pacific sector, and also during negative SAM events. However, the worst forecasts of both models manifest a stationary wave train that blocks the propagation of wave packets to the south of Australia. Moreover, the MJO influences the LLRWP forecast in the models, such that during simulations with good LLRWP forecasts the MJO is anomalously inactive in both models. In contrast, during the worst LLRWP forecasts, some MJO phases are more active than usual (phases 3 and 5 in the NCEP model and 3 and 8 in IAP-CAS) and results show a mixture of signals with lower significance and amplitude that do not seem to be linked to a specific MJO phase. These differences between models may be due to distinct simulated MJO dynamics and teleconnections, or could be related to the misrepresentation of the MJO in the forecast, complicating the identification of MJO phases that might favor the LLRWP forecast. Nevertheless, our results suggest that LLRWP prediction in two sub-seasonal models is influenced by the activity of the MJO, and that inactive (active) periods of the MJO lead to improved (degraded) LLRWP forecasts. Future studies should focus on determining whether the misrepresentation of the MJO in forecast models greatly affects RWP predictions in the mid-latitudes of the southern hemisphere.

FIGURE 1 Hovmöller diagram of V 300env (in m/s) during the propagation of an LLRWP detected in the reanalysis on 06/01/2003 (upper left), NCEP V 300env forecast from the first 2 perturbed simulations (upper middle and upper right), plus graphical representation of the tracked trajectories (lower panel). Black lines in the upper panels identify the trajectory of the original LLRWP (FRWPs) detected in the reanalysis (forecast), and colored lines in the lower panel identify the trajectories of the observed LLRWP and FRWPs.

FIGURE 2 Frequency histogram of the FRWP displacement from the original LLRWPs found in the reanalysis for each lead day. Positive (negative) bias signals that the FRWPs appear more westwards (eastwards) compared to the observed LLRWPs. Black lines mark zero bias, whereas red (blue) lines show the median location of the FRWPs tracked in the ensemble mean for the NCEP (IAP-CAS) forecast.

FIGURE 6 Relative frequency of the MJO phases detected during the propagation of LLRWPs for the best (left panels) and worst (right panels) forecasts found in the NCEP and IAP-CAS models. Orange dots represent the mean climatological probability of finding the MJO in a specific phase, whereas black lines show the range between the mean climatological probability ± its standard deviation.
TABLE 1 Proportion of total FRWPs and FRWPs that lasted more than 8 days found in forecasts during different SAM and ENSO stages in the NCEP and IAP-CAS models.