Predictability of the East Africa long rains through Congo zonal winds

East Africa is highly vulnerable to extreme weather events, such as droughts and floods. Skillful seasonal forecasts exist for the October–November–December short rains, enabling informed decisions, whereas seasonal forecasts for the March–April–May (MAM) long rains have historically had low skill, limiting preparation capacity. Therefore, improved long rains prediction is a high priority and would contribute to climate change resilience in the region. Recent work has highlighted how lower-troposphere Congo zonal winds in MAM strongly impact regional moisture fluxes and the long rains total precipitation. We therefore approach long rains predictability through the predictability of the Congo winds. We analyze a set of hindcasts from a dynamical prediction system that is able to reproduce the long rains–Congo winds relationship in its individual ensemble members. Encouragingly, in observations, the strength of MAM Congo zonal winds and East Africa rainfall show substantial correlation with the MAM Atlantic (including North Atlantic Oscillation, NAO) and Indo-Pacific variability, suggestive of ocean influence and potential predictability. However, these features are replaced by different teleconnections in the hindcast ensemble mean fields. This is also true for NAO linkage to Congo winds, despite correct representation in individual members, and good skill in hindcasting the NAO itself. The net effect is strongly negative skill for the Congo winds. We explore statistical correction methods, including using the Congo zonal wind as an anchor index in a signal-to-noise calibration for the long rains. This is considered a demonstration of concept, for subsequent implementation using models with better Congo zonal wind skill. Indeed, the clear signals found in the Atlantic (including Mediterranean) and Indo-Pacific, studied here both in observations and a dynamical prediction system, motivate evaluation of these features across other prediction systems, and offer the prospect of improved physically-informed long rains dynamical predictions.


KEYWORDS
Congo, dynamical prediction, East Africa, long rains, North Atlantic Oscillation, seasonal forecast

1 | INTRODUCTION
It is well established that East Africa (EA) seasonal forecast skill for March-May (MAM, long rains) is currently much lower than that for the October-December short rains (e.g., Batté & Déqué, 2011; Dutra et al., 2013; Mwangi et al., 2014; Walker et al., 2019). A goal of this paper is to diagnose a dynamical prediction system during this challenging forecast period; dynamical models have shown limited skill for MAM EA rainfall at any substantive lead-time (MacLeod, 2019a; Walker et al., 2019). This contrasts with the good dynamical model skill for October-December EA rainfall (Bahaga et al., 2016; Batté & Déqué, 2011), which has been attributed to strong links with the El Niño-Southern Oscillation and the Indian Ocean Dipole (e.g., Behera et al., 2005; Black et al., 2003; Hirons & Turner, 2018; Indeje et al., 2000; Nicholson & Kim, 1997) that are largely absent for the long rains (e.g., Ogallo, 1988; Pohl & Camberlin, 2006). The region is highly vulnerable to droughts and floods, so there is clear humanitarian importance attached to making progress on this forecast problem (e.g., FEWS NET, 2013, 2022; Magadzire et al., 2017).
Recent work has firmly established a strong physically based role for lower-troposphere Congo zonal wind anomalies impacting EA MAM rainfall totals through impact on moisture fluxes (Finney et al., 2020; Nakamura, 1968; Okoola, 1999a, 1999b; Walker et al., 2020). In this paper, we therefore approach the potential predictability of the EA long rains by exploring the predictability of the Congo MAM 700-hPa zonal wind. In addition, a North Atlantic Oscillation (NAO) connection to Congo zonal wind anomalies was identified for the February-April season in Todd and Washington (2004). Given the established predictability of the NAO (e.g., Scaife et al., 2014), this is another new avenue to explore. After considering the NAO, this paper then also considers other potential sources of predictability for the MAM Congo winds. Given recent work connecting the long rains to Indo-Pacific sea-surface temperatures (SSTs, e.g., Funk et al., 2018), this is also a candidate to consider that may provide predictability of the Congo winds-long rains system.
Section 2 describes data and methods. Section 3 examines the Congo winds-long rains relationship within each individual ensemble member. Section 4 explores hindcasts of the NAO-Congo wind linkage. Section 5 explicitly examines the predictability of the Congo winds, comparing links to tropical SSTs in the hindcasts and observations. Section 6 considers the potential for improvements to EA rainfall skill through statistical hindcast correction that draws in part on the hindcast Congo winds. Section 7 provides concluding discussion.

2 | DATA AND METHODS
Hindcasts are from the UK Met Office GloSea5 dynamical model (MacLachlan et al., 2015). An ensemble of 56 members is constructed for each year, initialized across January 9th to March 1st (details in Table S1). Hindcasts from these start dates have generally shown very low skill for EA long rains in dynamical models (e.g., MacLeod, 2019a, 2019b).
The period of the hindcasts (1993-2016) mostly samples interannual MAM EA rainfall variability during the relatively dry epoch 1998-2011 (Wainwright et al., 2019), so the analysis period is not marked by strong autocorrelation in the long rains. For the MAM EA index used here, the lag-1 autocorrelation is r = 0.16 (not significant), and lag-1 autocorrelation is likewise not significant for all key MAM indices used in the paper. Therefore, a full effective sample size of 24 is assumed, which corresponds to threshold values of the Pearson correlation coefficient (r) of ±0.40, ±0.34 and ±0.27 for significance at p = 0.05, 0.10 and 0.20, respectively. For model validation and empirical investigation, the following datasets are used: precipitation from the Global Precipitation Climatology Project (Adler et al., 2003), atmospheric fields from ERA-Interim, with key results cross-checked against ERA5 (Dee et al., 2011; Hersbach et al., 2020), and SST from NOAA Optimum Interpolation High Resolution (Reynolds et al., 2007). The NAO index is calculated as the surface pressure difference between Azores and Iceland (domains follow Smith et al., 2020).
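The quoted correlation thresholds can be checked against the Student's t distribution with n - 2 degrees of freedom. The following is a minimal sketch, assuming a two-tailed test and 24 independent years; the function names are illustrative, and the t quantile is found numerically so that only the Python standard library is needed.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x], plus 0.5 from symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def critical_r(n, p):
    """Smallest |r| that is significant at two-tailed level p for n independent samples."""
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(100):  # bisection for the t quantile at probability 1 - p/2
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - p / 2:
            lo = mid
        else:
            hi = mid
    t_crit = (lo + hi) / 2
    # Convert the critical t value to a critical correlation
    return t_crit / math.sqrt(df + t_crit ** 2)

thresholds = {p: round(critical_r(24, p), 2) for p in (0.05, 0.10, 0.20)}
print(thresholds)
```

With n = 24 the rounded results should reproduce the 0.40, 0.34 and 0.27 values given in the text; a smaller effective sample size (stronger autocorrelation) would raise all three thresholds.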

3 | INTERNAL MODEL WIND-RAINFALL CONNECTION
Area-average MAM indices (Figure 1a) of EA rainfall and zonal 700-hPa winds (U700) over the Congo (index hereafter named U700C) are calculated for both observations and for each individual GloSea5 ensemble member. Consistent with Walker et al. (2020), observed EA rainfall and U700C are highly correlated (r = 0.72, Figure 1b; similar using ERA5 for U700C, r = 0.71; see Table S2 for various such cross-checks). The equivalent correlation for the GloSea5 ensemble members (a total of 1344 cases) is very similar (r = 0.67, Figure 1b), indicating that the ensemble members accurately represent the mechanism linking EA rainfall and U700C.
This result (Figure 1b) suggests that if GloSea5 could skillfully predict U700C, this would imply predictability of EA MAM rainfall. Note, however, that Figure 1b is internal to the individual ensemble member hindcasts. Whether the correct phase of the U700C-EA rainfall system is present in the hindcast ensemble mean (EM) is a further question that is explored below.
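The pooled member correlation described in this section treats every member of every year as one case (56 × 24 = 1344). A minimal sketch of that pooling follows, using synthetic anomalies in place of GloSea5 output; the coefficients 0.7 and the variable names are assumptions chosen only for illustration.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
N_YEARS, N_MEMBERS = 24, 56

# Synthetic stand-ins (NOT GloSea5 output): one MAM anomaly per year and member.
u700c = [[random.gauss(0, 1) for _ in range(N_MEMBERS)] for _ in range(N_YEARS)]
# Rainfall is tied to the wind within each member, plus independent noise.
rain = [[0.7 * u + random.gauss(0, 0.7) for u in row] for row in u700c]

# Pool all members of all years into a single sample of 1344 cases.
u_all = [v for row in u700c for v in row]
rain_all = [v for row in rain for v in row]

pooled_r = pearson(u_all, rain_all)
print(len(u_all), round(pooled_r, 2))
```

Because the relationship is computed within individual members, a high pooled correlation says nothing about whether the ensemble mean has the correct phase in a given year, which is the distinction drawn in the text.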

4 | POTENTIAL ROLE FOR THE NAO
A link between NAO and Congo zonal winds was previously identified for the February-April season (Todd & Washington, 2004). Therefore, the NAO is here considered as a possible source of predictability for U700C. In the hindcasts, MAM NAO shows significant skill (EM skill of r = 0.55, consistent with Lledo et al., 2020), and displays (Figure 1c) the same anomalously weak signal-to-noise ratio seen in winter (Eade et al., 2014; Scaife & Smith, 2018). This suggests that NAO predictability in MAM has potential to deliver (either directly, or through common forcing) some predictability of the U700C index.
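The ensemble-size diagnostic behind Figure 1c (random member subsets averaged and correlated with observations, and separately with a held-out member) can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the data are synthetic, the model signal amplitude of 0.3 is an arbitrary choice that makes the model signal weak, and only 200 resamples are used where the caption describes 1000.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
N_YEARS, N_MEMBERS = 24, 56
N_REPEATS = 200  # reduced from the 1000 repeats described for Figure 1c

# Synthetic stand-ins (NOT GloSea5 or reanalysis data).
signal = [random.gauss(0, 1) for _ in range(N_YEARS)]
obs = [s + random.gauss(0, 0.5) for s in signal]
# The model carries the same signal, but with deliberately weak amplitude.
members = [[0.3 * s + random.gauss(0, 1) for _ in range(N_MEMBERS)] for s in signal]

def mean_skill(size, against_member=False):
    """Average correlation of a random `size`-member mean with the chosen target."""
    total = 0.0
    for _ in range(N_REPEATS):
        picks = random.sample(range(N_MEMBERS), size + 1)
        held_out, subset = picks[0], picks[1:]
        em = [sum(row[m] for m in subset) / size for row in members]
        # Target is either the observations or one member outside the subset.
        target = [row[held_out] for row in members] if against_member else obs
        total += pearson(em, target)
    return total / N_REPEATS

sizes = [1, 5, 20, 55]  # 55 leaves one member free to act as pseudo-observations
obs_curve = {k: mean_skill(k) for k in sizes}
member_curve = {k: mean_skill(k, against_member=True) for k in sizes}
for k in sizes:
    print(k, round(obs_curve[k], 2), round(member_curve[k], 2))
```

In this toy setup the correlation with observations keeps growing with ensemble size and ends above the model-versus-member curve, which is the signature of an anomalously weak signal-to-noise ratio.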
The observed NAO-wind relationship in Todd and Washington (2004) is reproduced here for MAM (Figure 2a). In the observations, NAO correlates with the U700C index at r = −0.36 and with EA rainfall at r = −0.49 (full correlation map, Figure S1a).
Is this NAO link reproduced in the individual ensemble members? To answer this, as in Figure 1b, the 56 ensemble members are concatenated from each of the 24 years, to give 1344 cases. The model NAO correlates at r = −0.25 with the model U700C and at r = −0.31 with the model EA rainfall (full maps in Figure S2). These values are therefore a little lower than in the observations, but have the same sign and, with the large sample size of 1344, are estimated as highly significant (discussed in Table S3). The model results suggest the observed sample correlation of NAO versus EA rainfall (r = −0.49) is at the high end of possible 24-year outcomes (see Table S4), but still well within the range implied by the model. In both observations and the model, the NAO correlates slightly more strongly with EA rainfall than with U700C. The hypothesis that the NAO linkage to EA rainfall occurs partly through U700C (which impacts EA rainfall, i.e., mediation; Kolstad & MacLeod, 2022) and, in addition, partly through a different route is supported by multiple regression analysis of the three indices, in both observations and the model (Table S5). Overall, these results with the individual members appear to suggest the model should have skill in predicting EA rainfall through the NAO-wind-rainfall connection.
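The mediation argument can be illustrated with the standard closed form for a two-predictor regression on standardized variables, which needs only the three pairwise correlations. The sketch below uses the rounded observed correlations quoted in the text; it is illustrative arithmetic and is not a reproduction of Table S5, whose values may differ.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients of y regressed on x1 and x2, from pairwise correlations."""
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

# Rounded observed MAM correlations quoted in the text (1993-2016).
r_rain_nao = -0.49   # EA rainfall versus NAO
r_rain_u = 0.72      # EA rainfall versus U700C
r_nao_u = -0.36      # NAO versus U700C

beta_nao, beta_u = standardized_betas(r_rain_nao, r_rain_u, r_nao_u)
print(round(beta_nao, 2), round(beta_u, 2))  # -0.27 0.62
```

The NAO coefficient shrinks from −0.49 to about −0.27 once U700C is included but does not vanish, which is the pattern expected if part of the NAO link passes through U700C and part follows another route.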
However, the EM predictions from the model behave differently. The correlation of the EM NAO with the EM U700 field (Figure 2b) loses (or perhaps displaces east of 20°E) the two bands that were oriented southwest to northeast, located to the south of the NAO in the observed pattern (Figure 2a). The negative U700 correlation over Congo seen in the observations as part of the banding structure (Figure 2a) is replaced with a weak isolated zone of positive correlations over Congo in the model (Figure 2b). The difference in structure is already seen in the southern pole of the NAO, which in the model, in terms of U700, is oriented too strongly in the east-west direction and penetrates substantially into northwestern Africa, leading to negative U700 correlations where the observations already show the next band of opposite-sign correlations (Figure 2a). In summary, the EM hindcast of NAO correlates at r = 0.32 with the EM hindcast of U700C (Figure 2b), whereas in the observed system the opposite sign of association operates (r = −0.36, Figure 2a).
A possible source of this discrepancy emerges by considering the observed SST relationship with the observed NAO (Figure 2c) and comparing that to the hindcast EM SST relationship with the hindcast EM NAO (Figure 2d). The observations reveal the well-known tripole SST anomaly across the North Atlantic, with particularly strong negative correlations in the eastern tropical North Atlantic, generally considered to be primarily a response to NAO (e.g., Hurrell & Deser, 2010; Visbeck et al., 2003; with boreal spring also active, Penland & Hartten, 2014). However, in the hindcasts, the relationship of the EM NAO to SSTs in this region is weaker (Figure 2d), and does not extend close to West Africa (5-15°N). Furthermore, the gradient from this location to the eastern Gulf of Guinea is reversed. These differences may be expected to impact nearby climate anomalies; model experiments in Todd and Washington (2004) suggested tropical North Atlantic SSTs could serve as a link between the NAO and its impact over equatorial Africa.
Thus, in the GloSea5 forecast system, there does not appear to be a working mechanism in the predictable portion of variance (EM) linking NAO through to MAM EA rainfall (evidenced further in Figure S1b). This may be related to model error (including the EM NAO predicted signal being too weak), or it may be that the NAO-U700C linkage is in fact confined to the internal portion of NAO variance and is fundamentally not predictable, that is, this linkage will always be confined to the individual members of an ensemble forecast. In future extensions of this analysis, useful insights are also likely from previous work on winter NAO and tropical Atlantic linkages (e.g., analysis of CMIP experiments in Jing et al., 2020), but expression in boreal spring, and in initialized seasonal hindcast experiments, brings challenges unique to the problem here.

5 | OTHER SOURCES OF PREDICTABILITY FOR CONGO ZONAL WINDS
This section evaluates and diagnoses the EM hindcasts of MAM U700C. First, it is confirmed that U700C (EM) still has a significant positive correlation with EA rainfall (EM), as in the individual members (Figure 1b), although it is weaker (r = 0.47, see Figure S3). Next, comparing observed and EM U700C hindcast values (Figure 3a, black line), substantial and highly significant negative skill emerges (r = −0.49, p = 0.015). This finding suggests that there is potential predictability for U700C, but that there is indeed an error in the model hindcasts that is leading to a systematic tendency for reversal of sign in the hindcast anomaly. The observed U700C index in fact correlates negatively not just with the co-located EM hindcast U700, but also with the hindcast U700 right across the equatorial Indian Ocean, where the correlation intensifies (Figure 3b). This is a much stronger signal than any NAO-related one in Figure 3b, where no correlations of substance extend northwestward into the NAO domain. Therefore, the NAO does not appear to be the primary source of the negative U700C skill, even though the reversed EM NAO link in GloSea5 (compare Figure 2a,b) could potentially make a small contribution.
To explore where potential sources of predictability of U700C may reside in the ocean, the observed U700C index has been correlated with observed SST (Figure 3c). Several areas emerge as potential contributors to U700C variability, including the tropical Atlantic, eastern Mediterranean and northwestern Indian Ocean. The strongest signals, however, are in the western Pacific (negative correlation "<" shape), which has been associated with EA MAM precipitation potential predictability (Funk et al., 2018, 2019; Funk & Hoell, 2015; Lyon & DeWitt, 2012). The model EM hindcast U700C correlates with hindcast SST in a very different way (Figure 3d). It fails to capture the western tropical Pacific relationship, and instead emphasizes positive correlations with the southeastern tropical Indian Ocean. Across the Indo-Pacific, the sign of correlations is generally reversed in Figure 3d compared to Figure 3c, although maxima are differently located. Differences between EM and observed teleconnections could simply be a result of internal systematic but unpredictable variability. However, it is hypothesized that the western Pacific signal should be predictable at this short lead-time, which would imply a weakness of this model in this region, with the missing western Pacific relationship in the hindcasts (Figure 3d) being replaced by an opposite-sign relationship in the southeastern Indian Ocean; it is proposed that this substantially contributes to the model U700C hindcast wind error.
The above interpretations are supported by also considering the correlation of precipitation with U700C (contours on Figure 3c,d). The tropical precipitation correlations support the presence of strong regional climate variations associated with the SST correlations emphasized above, at a scale that is consistent with impact on U700C. For example, hindcast U700C correlates strongly and positively with precipitation in the southeastern Indian Ocean (where SST correlations are positive), whereas observations emphasize negative correlations with the western Pacific region (where SST correlations are negative). Alignment of SST and precipitation correlations (see Kumar et al., 2013) is also present in the tropical Atlantic (Figure 3c,d), with reversed sign for observations compared to model, suggesting errors in the tropical Atlantic likely play a role in the reversed sign of skill for U700C. The sign of precipitation correlation in the tropical North Atlantic generally extends into West Africa, especially in the model hindcasts (Figure 3d), highlighting that a next step is the need to understand links to continental convection for a full understanding of long rains predictability.
Further insight emerges on the Indo-Pacific errors when the U700C versus U700 field relationship is considered in observations (Figure 4a) and the model (Figure 4b). In the model, the correlations suggest that anomalous winds connect from the southeastern tropical Indian Ocean into EA and the Congo region. This is absent in the observations, which reflect the way the western Pacific SST influence reaches across the Indian Ocean (Funk et al., 2018). Detailed assessments have shown such linkages are important but often erroneous in models during other seasons (e.g., October-December, Hirons & Turner, 2018).
The above interpretation is further reinforced when considering the wider GloSea5 validation of U700 (Figure 4c) and SST (Figure 4d). Areas of low SST prediction skill, as well as low U700 skill, are found in the northwestern tropical Pacific near where the observed U700C index strongly correlates with observed SST.
In addition, relatively low SST hindcast skill is found in the eastern tropical North Atlantic and Gulf of Guinea, areas that also have near-zero U700 skill. The region is an area prone to model errors. Rainfall along the equatorial Atlantic here (Figure S1b) and in other GloSea5 simulations exhibits erroneous connections with the NAO (also shown for winter in Scaife et al., 2017) that are likely related to long-standing model errors in the equatorial Atlantic (Dippe et al., 2019). Furthermore, given the reversed relation between EM NAO and U700C (Figure 2a,b), and the broadly reversed Atlantic SST correlation with U700C (Figure 3c,d), the Atlantic (including the Mediterranean) should also be considered in a deeper assessment of the U700C hindcast errors. These results point the way to future analysis of forecast models, to explore the way in which the Indo-Pacific and Atlantic are handled at this time of year, seeking models that deliver robust predictability of U700C and EA rainfall.

6 | EXPLORING MODEL CORRECTION
The tight relationship between U700C and EA rainfall (Figure 1b) motivates considering whether U700C may be used as the anchor index for signal-to-noise calibration (SNC) improvement of EA rainfall hindcasts. This is analogous to the successful use of NAO as anchor index for SNC hindcast improvement of precipitation in the NAO-impacted region (Smith et al., 2020). The problem is that, unlike the NAO, the skill of U700C here is negative. In principle, SNC can utilize negative skill information (see below). However, we do not advocate implementing this model correction operationally until there is physically informed understanding of the negative skill. Furthermore, the negative skill, while maintained in sign when using ERA5, is weaker in amplitude (see Table S2), introducing further uncertainty. With these caveats, we here make a demonstration of concept by implementing SNC using the result in Figure 3a.
FIGURE 5 Time-series of March-May East Africa rainfall for observations (Obs, black line) compared to GloSea5 ensemble mean (Raw, blue line) and the GloSea5 ensemble mean after correction using signal-to-noise calibration (Calibrated, orange line). Vertical lines show interquartile ranges of the ensemble member values. The validation correlation (comparing to the observed line) is given in the bottom right corner for the raw and calibrated series, respectively. The calibration uses the hindcast 700-hPa Congo zonal wind index, whose ensemble mean hindcast correlates with observed precipitation at r = −0.19 (the partial correlation, removing the effect of the ensemble mean hindcast rainfall, is larger at r = −0.36, enabling the significant coefficient $a_2$ in Equation 1).
The ratio of predictable components equation (Eade et al., 2014) uses the correlation coefficient between the EM and observations (Equation S1) and is the first of three steps (Equations S1-S3) delivering a U700C value (termed U′) in each year with magnitude consistent with the signal-to-noise paradox and with corrected sign of anomaly. In each year, the subset of U700C ensemble members that are closest to U′ is selected (see Figure S4). These members can now be expected to contain some of the U700C-EA rainfall linkage found in Figure 1b, as well as some of the direct skill that was present in the model EA precipitation from other sources (Figure 5, r = 0.26). Application of SNC (Figure 5) increases skill to r = 0.40 (p = 0.05).
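The member-selection idea can be sketched as follows. Equations S1-S3 are in the supplementary material and are not reproduced here, so the specific forms below are assumptions: the ratio of predictable components is taken as the EM skill divided by the square root of the EM-to-member variance ratio, and the adjusted wind target is taken as the standardized EM anomaly scaled by the skill and the observed standard deviation (which reverses the sign when skill is negative). The data are synthetic, `N_SELECT` is a hypothetical subset size, and the calculation is in-sample with no cross-validation.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def std(v):
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / len(v))

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (len(x) * std(x) * std(y))

random.seed(2)
N_YEARS, N_MEMBERS, N_SELECT = 24, 56, 10  # N_SELECT is a hypothetical choice

# Synthetic stand-ins (NOT the paper's data).
signal = [random.gauss(0, 1) for _ in range(N_YEARS)]
u_obs = [s + random.gauss(0, 0.5) for s in signal]
rain_obs = [0.8 * u + random.gauss(0, 0.5) for u in u_obs]
# Model wind carries the signal with reversed sign and weak amplitude.
u_mod = [[-0.3 * s + random.gauss(0, 1) for _ in range(N_MEMBERS)] for s in signal]
# Within each member, rainfall follows that member's own wind.
rain_mod = [[0.7 * u + random.gauss(0, 0.7) for u in row] for row in u_mod]

u_em = [mean(row) for row in u_mod]
r_skill = pearson(u_em, u_obs)  # negative by construction

# Assumed RPC form: skill relative to the model's own signal fraction.
member_var = mean([std([row[m] for row in u_mod]) ** 2 for m in range(N_MEMBERS)])
rpc = r_skill / math.sqrt(std(u_em) ** 2 / member_var)

# Assumed adjusted target: standardized EM anomaly scaled by r * sigma_obs.
em_clim, em_std, obs_std = mean(u_em), std(u_em), std(u_obs)
target = [r_skill * obs_std * (u - em_clim) / em_std for u in u_em]

rain_raw = [mean(row) for row in rain_mod]
rain_cal = []
for yr in range(N_YEARS):
    # Rank members by distance of their wind anomaly from the adjusted target.
    ranked = sorted(range(N_MEMBERS),
                    key=lambda m: abs((u_mod[yr][m] - em_clim) - target[yr]))
    rain_cal.append(mean([rain_mod[yr][m] for m in ranked[:N_SELECT]]))

skill_raw = pearson(rain_raw, rain_obs)
skill_cal = pearson(rain_cal, rain_obs)
print(round(r_skill, 2), round(rpc, 2), round(skill_raw, 2), round(skill_cal, 2))
```

In this toy case the raw rainfall skill is itself negative, unlike the r = 0.26 reported for GloSea5, so the size of the improvement is not comparable with Figure 5. The sketch only shows how selecting members near a sign-corrected wind target transfers the within-member wind-rainfall link into the calibrated rainfall series.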
It is insightful to compare the SNC result with a model output statistics regression, consistent with previous dynamical seasonal forecast correction approaches (e.g., Ndiaye et al., 2009). Such approaches can also allow for optimal spatial shift in the signal (e.g., through canonical correlation analysis, Colman et al., 2020). However, for the purpose of a simple comparison with SNC, we implement a regression prediction model (hereafter, REG) using the hindcast EA precipitation and U700C as predictors:
$$EA_{obs} = a_0 + a_1\,EA_{mod} + a_2\,U700C_{mod} \qquad (1)$$
where $EA_{obs}$ and $EA_{mod}$ are the observed and EM hindcast values of EA MAM precipitation, and $U700C_{mod}$ is the EM MAM hindcast of U700C. Such a model has a fit skill of r = 0.44. The $a_1$ coefficient is positive (significant at p = 0.056), reflecting the fact that the hindcast EA rainfall already has a small amount of positive skill independent of $U700C_{mod}$. The $a_2$ coefficient is negative (significant at p = 0.089), with the negative sign consistent with reversing the wind-rainfall signal compared to observations (as also done in the SNC). The predictions from REG and SNC are highly correlated (r = 0.86, plotted in Figure S5). One advantage of SNC is that it identifies actual realizations in the model system, which may be further consulted for properties such as daily weather sequences.
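A two-predictor regression of this kind can be fitted by ordinary least squares. The sketch below assumes an intercept term and uses synthetic series in place of the hindcast and observed indices; the function name `fit_two_predictors` and all generating coefficients are illustrative, and the skill computed is an in-sample fit skill.

```python
import math
import random

def mean(v):
    return sum(v) / len(v)

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = a0 + a1*x1 + a2*x2 via the centered normal equations."""
    my, m1, m2 = mean(y), mean(x1), mean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    a1 = (s1y * s22 - s2y * s12) / det
    a2 = (s2y * s11 - s1y * s12) / det
    a0 = my - a1 * m1 - a2 * m2
    return a0, a1, a2

random.seed(3)
N_YEARS = 24

# Synthetic stand-ins for the EM hindcast indices and observations (NOT the paper's data).
u700c_mod = [random.gauss(0, 1) for _ in range(N_YEARS)]
ea_mod = [0.5 * u + random.gauss(0, 1) for u in u700c_mod]
ea_obs = [0.5 * e - 0.6 * u + random.gauss(0, 1) for e, u in zip(ea_mod, u700c_mod)]

a0, a1, a2 = fit_two_predictors(ea_obs, ea_mod, u700c_mod)
reg = [a0 + a1 * e + a2 * u for e, u in zip(ea_mod, u700c_mod)]
fit_skill = pearson(reg, ea_obs)
print(round(a1, 2), round(a2, 2), round(fit_skill, 2))
```

With only 24 years, an in-sample fit skill overstates what would be achieved on independent years, so a leave-one-out refit would be a natural extension of this sketch.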

7 | CONCLUDING DISCUSSION
The Indo-Pacific and the eastern side of the tropical Atlantic represent complex systems to capture in MAM seasonal forecasts over Africa, and likely have been at the root of difficulties in seasonal predictions at this time of year over EA (e.g., see Walker et al., 2019). However, it is encouraging that the MAM Congo zonal wind index (U700C) is closely tied to MAM EA rainfall in observations (r = 0.72), and this is also true in the individual hindcast ensemble members studied here (r = 0.67). This suggests U700C is an indicator of a coherent system delivering rainfall anomalies in observations and the model. Therefore, predictability of U700C has been the focus of this paper.
The NAO correlates strikingly with U700C and EA rainfall in both observations and the individual hindcast ensemble members; for example, for NAO versus EA rainfall over 1993-2016, r = −0.49 in observations and r = −0.31 in the hindcast members, with the large sample size of 1344. However, these correlations are absent in the hindcast EM. Further work is required to understand why the signal is confined to the internal (non-predictable) variance of this model, despite strong EM skill for the MAM NAO. Possible sources of model error pointed to include the weak amplitude of NAO signals (e.g., Figure 1c) and errors in tropical Atlantic variability. In addition, in terms of EA rainfall linkage to northern midlatitudes, the primary role of the NAO in MAM is likely gradually replaced with linkages to sharper upper-level troughs and ridges over the Mediterranean region (Camberlin & Philippon, 2002; Wainwright et al., 2022; Ward et al., 2021) during months that are closer to winter (e.g., February-March), motivating assessment at monthly resolution.
The U700C skill is found to be strongly negative, with evidence of a clear predictable component of variability in the U700C hindcasts. We note possible contributions to this negative skill from variability across the tropical and broader Atlantic, and the Indo-Pacific. However, linkages seen in observations emanating from the western Pacific (Funk et al., 2018) may require better hindcasts for successful model representation of the Indo-Pacific influence on U700C and EA rainfall.
We have explored the potential of model recalibration, in part motivated by the presence of a so-called signal-to-noise paradox for U700C, where the model is better at predicting the real world than its own ensemble members (Figure 3a). The results deliver some improvement, but skill remains relatively modest compared to many tropical areas (Scaife et al., 2019), and the approach is considered at this stage a demonstration of concept, for subsequent implementation with models that contain better U700C skill. In addition, skill improvements as a function of lead-time and perhaps initial ocean-atmosphere state (such as NAO phase) should be investigated.
In summary, we propose that accurate model representation of the U700C links to the Indo-Pacific and Atlantic regions offers the prospect of much improved forecast skill for MAM EA rainfall. These linkages are a priority for further assessment in current models, including their interplay with intraseasonal variability via the Madden-Julian Oscillation (e.g., Maybee et al., 2023; Vellinga & Milton, 2018). The work here also encourages further investigation of physically-informed recalibration of current forecasts.
FIGURE 1 (a) Region used for the East Africa rainfall index (shaded blue, 12.5°N to 10°S, 30°E to 52.5°E, land points) and for the Congo zonal wind index (brown box, 5°N to 5°S, 10°E to 30°E). (b) Scatter plot of March-May seasonal mean 700-hPa zonal wind anomaly over the Congo versus the rainfall anomaly over East Africa, for observations (black) and for the individual ensemble members of GloSea5 (purple). The years are 1993-2016 (GloSea5 has 56 ensemble members, so a total of 1344 realizations). (c) Correlation (March-May) of GloSea5 hindcast NAO index with the observed NAO index, as a function of ensemble size (black). Ensembles of each size are generated through selecting random members of the ensemble, and repeating 1000 times. The red line is the correlation when replacing observations with one member of the ensemble. Years used are 1993-2016.

FIGURE 2 (a) Correlation (March-May) between the observed NAO and the observed 700-hPa zonal wind, 1993-2016. For guidance, local significance at the 5% level is ±0.40. NAO is here defined as the surface pressure difference for the box (70°N to 63°N, 25°W to 16°W) minus (40°N to 36°N, 28°W to 20°W). (b) Same as (a) but for the ensemble mean hindcast NAO versus the ensemble mean hindcast 700-hPa zonal wind. (c) Same as (a) but with the observed SST. (d) Same as (b) but with the ensemble mean hindcast SST. For reference, the gray box is the Congo zonal wind index domain (Figure 1a).

FIGURE 3 (a) Same as Figure 1c, but for the 700-hPa zonal wind index over the Congo (U700C index). Figure 1c displays the signature of the established NAO predictability (with signal-to-noise paradox), and similarities here are notable, with observed-model correlation growing with ensemble size (black line), and model-model correlation (purple line) peaking at absolute values lower than the model-observed (black line). (b) Correlation (March-May) between the observed U700C index and the ensemble mean hindcast 700-hPa zonal wind. Years used are 1993-2016. For guidance, local significance at the 5% level is ±0.40. (c) Correlation (March-May) between observed U700C and observed SST (shading) and observed rainfall (contours). Contours at ±0.2 and ±0.4, positive dark red, negative dark blue. Years used are 1993-2016. (d) Same as (c) but for the ensemble mean hindcast of U700C and ensemble mean hindcast of SST (shading) and rainfall (contours). For reference, in b (c, d) the gray (orange) box is the U700C index domain (Figure 1a).

FIGURE 4 (a) Correlation (March-May) between observed U700C index and observed 700-hPa zonal wind. Years used are 1993-2016. For guidance, local significance at the 5% level is ±0.40. (b) Same as (a) but for ensemble mean hindcast U700C index versus ensemble mean hindcast of 700-hPa zonal wind. (c) Validation of March-May hindcast of 700-hPa zonal wind. Statistic shown is the correlation (calculated at each grid-box) between the ensemble mean hindcast and the observed over 1993-2016. (d) Same as (c) but for sea-surface temperature.