The observed synoptic scale precipitation relationship between Western Equatorial Africa and Eastern Equatorial Africa

An improvement to subseasonal (i.e., days to weeks) rainfall prediction across Equatorial Africa is an important area of current research. This is because most countries in this region are highly dependent on rain‐fed agriculture, and so millions of livelihoods are at risk in the event of an unexpected poor harvest. This study examines 16 years of daily precipitation anomalies to investigate the relationship in precipitation between Western Equatorial Africa (WEA) and Eastern Equatorial Africa (EEA). Using lead/lag correlation and spatio‐temporal correlation patterns over various sub‐regions, a synoptic‐scale relationship in precipitation is presented between WEA and EEA, in which precipitation over EEA lags precipitation over WEA by 1–2 days. In addition, central WEA and sub‐regions in South Sudan display a synoptic‐timescale precipitation contrast, suggesting a weak precipitation dipole. Consistent with the known heterogeneity characteristic of Equatorial Africa's precipitation, our findings suggest that the 1–2 days precipitation relationship is dependent upon the sub‐region under investigation. Furthermore, our results indicate a coherent synoptic‐timescale eastward/northeastward propagating signal with a speed of approximately 12 m/s. Composite and correlation analyses of precipitation anomalies and a novel equatorial wave dataset show an apparent connection between eastward/northeastward propagating wet anomalies and Kelvin wave lower‐tropospheric convergence. This suggests that Convectively Coupled Kelvin Waves (CCKWs) play a role in modulating the 1–2 days convection and precipitation propagation between WEA and EEA. These results imply that monitoring the propagation characteristics of CCKWs may be important in synoptic‐timescale forecasting over Equatorial Africa.


1 | INTRODUCTION
Short-to-medium-range convective rainfall variability has a direct influence on rain-fed agricultural productivity because it determines the amount of available soil moisture (e.g., Black et al., 2016) as well as the frequency of replenishing surface and underground water for production (Taylor et al., 2019). However, extreme precipitation events are likely to cause destructive flooding and landslides; therefore, to minimize loss of life and livelihoods, it is important to develop robust forecasting systems for various timescales (Ongoma et al., 2018). In Equatorial Africa (defined here as 15°S-15°N, 0°-51°E), the reliability of short-term forecasts depends on experts' ability to extrapolate the prevailing conditions, as well as on their interpretation of forecast maps from global forecast centres (Graham et al., 2015). While forecast guidance from convection-permitting models has become an additional tool to support forecasting over EEA (e.g., Woodhams et al., 2018), a similar tool for WEA is less common. Even with the evolution of convection-permitting models, the interaction of the key drivers of convection variability across multiple spatial and temporal scales is still represented unrealistically.
A few areas in central Equatorial Africa experience a bimodal rainfall cycle, while those outside the 5°S-5°N latitudinal band experience a unimodal rainfall cycle (e.g., Dunning et al., 2016). Precipitation in Equatorial Africa is characterized by remarkable spatial variation at intraseasonal (e.g., Nguyen and Duvel, 2008), seasonal (e.g., Sandjon et al., 2014; Fotso-Kamga et al., 2019) and interannual (e.g., Janowiak, 1988, Figure 2e) timescales. Because of the heterogeneity of precipitation over this region, previous studies delineated their areas of study into smaller sub-regions, using either monthly totals (e.g., Indeje et al., 2000; Dezfuli, 2011) or annual totals (e.g., Badr et al., 2016). Here, for the first time, we use high spatial resolution daily precipitation records to objectively identify small sub-regions that are characterized by similar daily precipitation variability over all of Equatorial Africa.
One major step in improving short-to-medium-range forecasting is enhancing knowledge on the role of synoptic-scale disturbances in modulating precipitation. Synoptic-scale disturbances influence convection and precipitation by significantly perturbing the basic state of atmospheric fields such as pressure, temperature and wind. The resulting latent heat released by the deep convection associated with the perturbed fields can cause the generation of new equatorial waves (e.g., Lindzen, 2003). As a wave propagates, it interacts with convection through, for example, enhancing low-level moisture flux (e.g., Mekonnen et al., 2008; Sinclaire et al., 2015) and influencing convection and precipitation downstream, and this may cause a variation in its speed of propagation.
The role of synoptic-scale tropical disturbances in influencing precipitation and convection in West Africa north of the equator is well documented (Ventrice et al., 2012; Yang et al., 2018), but our knowledge of the role of synoptic-scale disturbances in influencing precipitation variability within a few degrees of the equator remains limited. Convectively Coupled Kelvin Wave (CCKW) activity over Equatorial Africa has been detected in both observations (e.g., Pohl and Camberlin, 2006b; Laing et al., 2011; Wheeler and Nguyen, 2015; Mekonnen and Thorncroft, 2016) and regional climate model simulations (e.g., Tulich et al., 2011; Jackson et al., 2019). These convection-organizing waves are most active during March-April-May (MAM) (Roundy and Frank, 2004). They originate from various regions in the tropics such as the eastern Pacific (e.g., Mekonnen et al., 2008; Liebmann et al., 2009) and propagate eastward over Equatorial Africa with a phase speed of about 12-22 m/s (e.g., Laing et al., 2011), although their speed of propagation depends in part on their location in the global tropics (Mounier et al., 2007; Laing et al., 2011). Pohl and Camberlin (2006b) detected a Kelvin wave signal that causes upper tropospheric cooling during boreal spring, thereby enhancing the convective potential over EEA. The findings in Laing et al. (2011) indicate that CCKWs influence convective activity over Equatorial Africa through modulation of low-level wind shear that is related to southwesterly monsoonal winds and mid-tropospheric easterly jets. Wheeler and Nguyen (2015) highlighted that the passage of a CCKW over Equatorial Africa is preceded by low-level easterly wind anomalies, while the trailing end of the wave is dominated by anomalous low-level westerlies that are in phase with positive geopotential height anomalies. Sinclaire et al. (2015) noted that during March-June, eastward-propagating CCKWs favour initiation of synoptic-scale convective systems and that annually, an average of six to seven CCKWs propagate through the Congo Basin in this period. While all these studies used OLR to identify CCKWs, we take a different approach and identify the CCKWs using a novel dynamics-based equatorial wave dataset.
Mekonnen and Thorncroft (2016) found a dipole pattern of synoptic-scale convective activity between the Congo Basin and Eastern Africa. They also observed that CCKWs modulate a coherent eastward/northeastward-propagating convective signal that oscillates between enhanced and suppressed states with a periodicity of 3-4 days. They found that low-level westerly and southwesterly anomalous winds are linked to enhanced convection over East Africa, while northeasterly wind anomalies are linked to suppressed convection. In agreement with Mekonnen and Thorncroft (2016),  suggested that the days with westerlies exhibited a stronger wet signal over EEA and that the wet signal is suppressed on days with strong easterlies. The study by Mekonnen and Thorncroft (2016) merits extension by: (i) examining actual observed daily precipitation instead of OLR; (ii) repeating (i) for a number of different datasets; (iii) undertaking this for the whole year (rather than certain seasons only); and (iv) using a higher spatial resolution.
While the precipitation connection between WEA and EEA and the associated mechanisms are important, they have not been thoroughly investigated even though previous publications have found that WEA is a key source of moisture and convective instability that helps explain precipitation variability over EEA (e.g., Nicholson, 1996; Mafuru and Guirong, 2018). In turn, moisture evaporated over EEA's great lakes is an important source for precipitation over WEA (e.g., Van der Ent et al., 2010). This study aims at advancing knowledge of the synoptic-scale interaction between WEA and EEA in the context of daily precipitation anomalies, which are more linked to convective processes and more relevant to human impacts than OLR. More precisely, our objectives are: (i) to assess the precipitation linkage between WEA and EEA based on small sub-regions characterized by similar daily precipitation characteristics; and (ii) to identify possible mechanisms driving variability associated with the precipitation connection between WEA and EEA.
The rest of this manuscript is organized as follows. Section 2 describes the datasets analysed, followed by Section 3 describing the methodology. In Section 4 the results are presented and finally, the discussion and conclusions are given in Section 5.
2 | DATA
2.1 | Tropical rainfall measuring mission (TRMM)
The tropical rainfall measuring mission (TRMM) daily precipitation dataset has a latitudinal coverage spanning 50°S-50°N and is produced at a resolution of 0.25° × 0.25°, as described in Huffman et al. (2007). The 3B42 version 7 of the daily TRMM estimates is computed by accumulating eight 3-hourly TRMM 3B42 records obtained by merging precipitation estimates from multiple satellites (Huffman et al., 2007), and for regions where the Global Precipitation Climatology Centre (GPCC) global precipitation analysis dataset (e.g., Schneider et al., 2008) is available, GPCC is used for bias correction of the final TRMM product. Gebremicael et al. (2019) found a percentage bias and correlation coefficient between the TRMM precipitation product and rain gauge data of within ±25% and greater than 0.5, respectively, for various time scales. Dinku et al. (2007) found that TRMM 3B42 performed well compared to other satellite precipitation products over EEA and so, we use this version for the period 1998-2013.

2.2 | Global precipitation climatology project (GPCP-1DD)
The Global Precipitation Climatology Project (GPCP-1DD) is a 1° × 1° global daily precipitation product produced by merging various satellite estimates and rain gauge observations (e.g., Huffman et al., 2001). The microwave estimates are generated from the Special Sensor Microwave Imager (SSM/I) on board the Defence Meteorological Satellite Program, and the Infrared (IR) data uses the Geostationary Operational Environmental Satellite (GOES) Precipitation Index (GPI), which relates cloud top temperature to precipitation rate (Huffman et al., 2001). Precipitation estimates are then computed using the Threshold-Matched Precipitation Index (TMPI) that is applied on the SSM/I data to isolate raining pixels in the IR data (Huffman et al., 2001). Rain gauge observations are indirectly used when the GPCP-1DD accumulations are scaled to match the GPCP monthly product. Here we use the 1DD daily estimates for the period 1997-2012.

2.3 | ERA-Interim
ERA-Interim (ERA-I) is a global atmospheric reanalysis dataset produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), available at a spatial resolution of 0.7° × 0.7° and a 6-hour temporal resolution at 37 pressure levels (see Dee et al., 2011). The current study uses the daily averaged fields for the period of 1998-2013.

2.4 | Equatorial wave dataset
The equatorial wave dataset is produced using the zonal and meridional wind and geopotential height fields in ERA-I. The equatorial wave dataset has a spatial resolution of 1° × 1° and a 6-hour temporal resolution at 10 pressure levels for the entire 24°S-24°N latitudinal belt, and includes equatorial waves with zonal wavenumbers k = 2-40 and a period of 2-30 days. Equatorial waves are identified by projecting the dynamical fields onto various equatorial wave modes using their meridional structures, described by parabolic cylinder functions in y and sinusoidal variations in x (see details in Yang et al., 2003). One aspect that makes this dataset unique is that the data are projected onto each pressure level independently, allowing the data to reveal the vertical structure, rather than having to assume dispersion relations; furthermore, equatorial waves are identified based on their dynamical structure (Yang et al., 2003). Because the equatorial wave dataset does not use OLR to identify the waves, it has no information about precipitation; thus the relationship between this dataset and precipitation is independent of the technique used to generate it. This dataset is available for 1997-2018; however, this study uses the daily averaged data for the period 1998-2013.

3.1 | Empirical orthogonal teleconnection (EOT)
The influence of oscillatory modes of tropical convection and precipitation variability is location specific (e.g., Pohl and Camberlin, 2006a; Sinclaire et al., 2015). Here, we subdivide our domain (defined above) into smaller sub-regions with relatively similar daily precipitation variability, using the Empirical Orthogonal Teleconnection (EOT) technique (e.g., Smith, 2004). Prior to performing the EOT on the daily precipitation dataset, the annual cycle was removed by subtracting a 30-day running mean of the daily 16-year rainfall climatology from each year used in the study. Since our interest is to understand high-frequency precipitation variability, the influence of low-frequency modes of variability such as ENSO was also removed by subtracting the 100-day moving average from the time series at every grid point in the domain. The 100-day moving average was used because it is consistent with the 3-month running mean of sea surface temperature anomalies used to define the Oceanic Niño Index (ONI) at the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center (CPC) (e.g., Huang et al., 2016). Finally, any long-term trend over the 16-year period was removed using linear regression. The remaining anomalies were then subjected to the EOT algorithm. We note that re-running the algorithm without removing the trend, or removing it using Locally Estimated Scatterplot Smoothing (LOESS) instead of linear regression, did not influence the resulting sub-regions.
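The three filtering steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the processing code used in the study: the function names `running_mean` and `daily_anomalies` are hypothetical, a 365-day calendar (leap days already dropped) is assumed, and the ends of the record are padded with the nearest value, a choice the text does not specify.

```python
import numpy as np

def running_mean(x, window, wrap=False):
    """Centred moving average along axis 0 (wrap=True treats x as periodic)."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    pad = [(half, window - 1 - half)] + [(0, 0)] * (x.ndim - 1)
    padded = np.pad(x, pad, mode="wrap" if wrap else "edge")
    zero = np.zeros((1,) + x.shape[1:])
    csum = np.cumsum(np.concatenate([zero, padded]), axis=0)
    return (csum[window:] - csum[:-window]) / window

def daily_anomalies(precip, day_of_year, clim_window=30, lowfreq_window=100):
    """precip: (time, ...) daily values; day_of_year: ints 0..364 per time step."""
    precip = np.asarray(precip, dtype=float)
    doy = np.asarray(day_of_year)
    # Step 1: multi-year mean per calendar day, smoothed with a 30-day running mean.
    clim = np.stack([precip[doy == d].mean(axis=0) for d in range(365)])
    clim = running_mean(clim, clim_window, wrap=True)
    anom = precip - clim[doy]
    # Step 2: subtract the 100-day moving average (low-frequency variability).
    anom = anom - running_mean(anom, lowfreq_window)
    # Step 3: remove the least-squares linear trend (and mean) at every point.
    t = np.arange(anom.shape[0], dtype=float)
    t -= t.mean()
    flat = anom.reshape(anom.shape[0], -1)
    slope = (t @ flat) / (t @ t)
    flat = flat - flat.mean(axis=0) - np.outer(t, slope)
    return flat.reshape(anom.shape)
```

The same function accepts a single time series or a (time, lat, lon) array, because every step operates along the time axis only.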
The purpose of the EOT technique is to objectively identify sub-regions exhibiting similar daily precipitation characteristics. This technique produces modes which are orthogonal in either space or time. The EOT technique was first described by Van den Dool et al. (2000) (see brief description in the Supporting Information); Smith (2004) later proposed a modified version. Here, we use the modified EOT approach as described in Smith (2004).
The modified EOT approach is implemented in four steps. First, a "base point" is identified whose time series is best correlated with the domain-area-average time series. Second, the base point time series is correlated with the time series of every grid point in the domain. Third, the sub-region is defined as the longitude-latitude box that encloses the meridional and zonal line segments through the base point, where each segment comprises all contiguous grid points whose time series have a correlation coefficient with the base point time series exceeding 0.2 (this step is needed for our study but not for EOT analysis in general). Finally, the variance explained by the base point is subtracted from every grid point in the domain, thus creating a new dataset (the residual). In each iteration, the four steps are repeated to identify a sub-region; thus the first iteration identifies sub-region 1, the second iteration identifies sub-region 2, and so on, until the desired number of sub-regions is identified. Note that the order in which particular sub-regions are identified does not indicate their relative percentage of the total variance explained, and for this study, the sole purpose of the EOT analysis is finding sub-regions of spatial co-variance rather than quantifying variances explained. Since the order in which the sub-regions are identified is not physically important, the identified sub-regions were renamed: for example, W1 represents the westernmost sub-region in WEA and E1 is the westernmost within EEA. A more detailed description of the modified EOT approach can be found in Smith (2004) and Stephan et al. (2017). The use of the EOT algorithm enabled an objective identification of sub-regions made up of points with relatively similar daily precipitation variability. The location and dimensions of the various sub-regions identified are the only outputs of the EOT analysis used in subsequent analysis. For example, the sub-region's area-average time series is computed by spatially averaging the above-calculated anomalous time series over the dimensions of the sub-region.
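The four steps above can be sketched as follows. This is a simplified illustration written from the description in this section, not the implementation of Smith (2004) or of the study: the function names are hypothetical, grid cells are equally weighted (no latitude weighting), and "variance explained" is removed by ordinary least-squares regression on the base point series.

```python
import numpy as np

def correlate_with(series, field):
    """Pearson correlation of a 1-D series with each column of field (time, points)."""
    s = series - series.mean()
    f = field - field.mean(axis=0)
    denom = np.sqrt((s @ s) * (f * f).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(denom > 0, (s @ f) / denom, 0.0)

def contiguous_extent(mask_1d, start):
    """(lo, hi) indices of the run of True values in mask_1d that contains start."""
    lo = hi = start
    while lo > 0 and mask_1d[lo - 1]:
        lo -= 1
    while hi < mask_1d.size - 1 and mask_1d[hi + 1]:
        hi += 1
    return lo, hi

def eot_subregions(anom, n_regions, r_min=0.2):
    """anom: (time, lat, lon) anomalies -> list of (base_point, lat_range, lon_range)."""
    nt, ny, nx = anom.shape
    resid = anom.reshape(nt, -1).astype(float).copy()
    regions = []
    for _ in range(n_regions):
        # Step 1: base point best correlated with the domain-average series.
        base = int(np.argmax(correlate_with(resid.mean(axis=1), resid)))
        base_ts = resid[:, base] - resid[:, base].mean()
        # Step 2: correlate the base point with every grid point.
        r_map = correlate_with(base_ts, resid).reshape(ny, nx)
        # Step 3: box spanned by contiguous r > r_min along the column and row
        # passing through the base point.
        j, i = divmod(base, nx)
        lat_range = contiguous_extent(r_map[:, i] > r_min, j)
        lon_range = contiguous_extent(r_map[j, :] > r_min, i)
        regions.append(((j, i), lat_range, lon_range))
        # Step 4: subtract the part of each series explained by the base point.
        coef = (base_ts @ resid) / (base_ts @ base_ts)
        resid = resid - np.outer(base_ts, coef)
    return regions
```

The returned index ranges are inclusive, so the area-average series for a sub-region would be the mean of `anom[:, lat_lo:lat_hi + 1, lon_lo:lon_hi + 1]` over the two spatial axes.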

3.2 | Correlation and composite analysis
Correlation analysis is used for all pairs of sub-regions over various days of lead/lag in order to identify propagating signals. As presented, for any given pair of sub-regions, a positive value implies that EEA is lagging WEA. In all subsequent analysis, we use the word "lag" (with a positive value) to refer to days after the reference time (day 0) and likewise lag with a negative value for days before the reference time. The bootstrapping technique is used to test for the statistical significance of the correlation coefficient between the area-averaged time series of a pair of sub-regions, referred to here as "A" for WEA and "B" for EEA. Bootstrapping is done by holding A in its original order, and then, to preserve its original synoptic-timescale autocorrelation, B is divided into blocks of 100 days. A total of 1,000 samples of B are constructed by randomly drawing the blocks with replacement and stitching the sampled blocks together so that each sample is the same size as B. Time-series A is then correlated with every sample and the 95th percentile of the absolute value of these correlation coefficients is determined. If the absolute value of the correlation coefficient of the original time series is greater than this 95th percentile, it is considered to be statistically significant at the 95% confidence level.
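A minimal sketch of the block-bootstrap test described above is given below. It is illustrative rather than the study's code: the function name, the fixed random seed and the handling of a trailing partial block (only complete 100-day blocks are used as sources, and each resampled series is trimmed to the original length) are assumptions, and the series is assumed to be at least one block long.

```python
import numpy as np

def block_bootstrap_threshold(a, b, block=100, n_samples=1000, pct=95, seed=0):
    """Percentile of |r| between fixed series a and block-resampled copies of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    n = b.size
    starts = np.arange(0, n - block + 1, block)   # starts of complete blocks
    n_draw = -(-n // block)                       # ceil(n / block) blocks per sample
    r_abs = np.empty(n_samples)
    for k in range(n_samples):
        picks = rng.choice(starts, size=n_draw, replace=True)
        # Stitch the sampled blocks together and trim to the length of b.
        sample = np.concatenate([b[s:s + block] for s in picks])[:n]
        r_abs[k] = abs(np.corrcoef(a, sample)[0, 1])
    return np.percentile(r_abs, pct)

def is_significant(a, b, **kwargs):
    """True if |r(a, b)| exceeds the bootstrap percentile threshold."""
    r = np.corrcoef(np.asarray(a, float), np.asarray(b, float))[0, 1]
    return bool(abs(r) > block_bootstrap_threshold(a, b, **kwargs))
```

Keeping A fixed and shuffling B in 100-day blocks breaks the day-to-day alignment between the two series while leaving the short-timescale autocorrelation inside each block intact.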
Composites are performed on "events", where an event is defined for a given pair of sub-regions if two conditions are satisfied. First, precipitation occurs in excess of a threshold in a sub-region in WEA. Second, 2 days later, the precipitation in the corresponding sub-region in EEA exceeds that sub-region's threshold, given that the previous day's precipitation was below the threshold. Our choice of 2 days in defining the event is informed by results in Section 4.2 concerning typical evolution. The thresholds were determined by calculating the 66.7th percentile for each sub-region's area-averaged time series. The 66.7th percentile was chosen because it is an operational precipitation forecast threshold that categorizes the heavy rainfall events that may be associated with organized convective systems. All days with zero area-averaged raw precipitation amounts were removed before computing the thresholds.
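The event definition can be illustrated with the short sketch below. The function names are hypothetical, the inputs are assumed to be area-averaged raw daily precipitation series of equal length, and "the previous day's precipitation" is interpreted here as the EEA value on the day before the EEA exceedance; the text leaves this open.

```python
import numpy as np

def wet_threshold(precip, pct=66.7):
    """Percentile of the area-averaged series after dropping zero-rain days."""
    precip = np.asarray(precip, dtype=float)
    return np.percentile(precip[precip > 0], pct)

def find_events(wea, eea, lag=2, pct=66.7):
    """Indices t where WEA is wet on day t and EEA becomes wet on day t + lag."""
    wea = np.asarray(wea, dtype=float)
    eea = np.asarray(eea, dtype=float)
    wea_wet = wea > wet_threshold(wea, pct)
    eea_wet = eea > wet_threshold(eea, pct)
    t = np.arange(wea.size - lag)
    # Condition 1: WEA exceeds its threshold on day t.
    # Condition 2: EEA exceeds its threshold on day t + lag but not the day before.
    ok = wea_wet[t] & eea_wet[t + lag] & ~eea_wet[t + lag - 1]
    return t[ok]

# Toy series: day 1 qualifies; day 5 does not, because EEA was already wet on day 6.
wea = [0, 9, 1, 1, 0, 9, 1, 1, 1, 1]
eea = [1, 1, 1, 9, 1, 1, 9, 9, 1, 0]
events = find_events(wea, eea)
```

Composites would then be formed by averaging the fields of interest over the returned event days and the days around them.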

4.1 | Identification of sub-regions
Daily TRMM anomalies for a 16-year period were subjected to EOT analysis and a total of 32 sub-regions were identified. The robustness of the identified sub-regions' locations was tested by subjecting 15 years of GPCP daily anomalies to the same EOT analysis. Figure 1 shows an example of the sub-regions identified using TRMM daily anomalies. Those identified using the GPCP dataset were generally in similar locations to those shown in Figure 1 (not shown). The 29°E meridian was chosen to divide Equatorial Africa (defined above) into WEA and EEA for consistency with previous studies (e.g., see Sandjon et al., 2014, Figure 1) and with an operational weather forecasting model over Lake Victoria in EEA (e.g., see Woodhams et al., 2018, Figure 1). The sub-regions over ocean and those whose base points were located on 29°E were disregarded. A total of 17 sub-regions in WEA and 8 in EEA were considered for further analysis (see the examples in Figure 1; those not shown in Figure 1 appear in Figure S1). Re-running the algorithm without removing the trend, or removing it using Locally Estimated Scatterplot Smoothing (LOESS) instead of linear regression, did not influence the resulting sub-regions.

4.2 | Lead/lag correlation
Correlation coefficients were calculated between the area-average time series of TRMM rainfall anomalies (described in Section 3.1) of each sub-region in EEA and every sub-region in WEA, for leads/lags of −5 to +5 days over the entire 16 years. Since there were 17 sub-regions identified in WEA and 8 sub-regions in EEA, there are a total of 136 different pairs. Sub-regions E1 and E3 have the strongest correlation when correlated with the various sub-regions in WEA (ignoring the strongly overlapping W17-E1 pair). We note that E3 is located over Lake Victoria, which strongly influences the climate over EEA (e.g., Song et al., 2004). Previous studies have found a strong influence of the eastward-propagating signal of the Madden-Julian Oscillation (MJO) on precipitation over Lake Victoria (e.g., Hogan et al., 2015). The lead/lag correlation coefficient analysis for some sub-regions in WEA versus E1 and E3 in Figure 2 shows peak correlation coefficients at lag +1 to +2 days followed by a local minimum at lag +3 to +4 days. In Figure 2b, E3 shows a peak correlation with W5, W9, W12 and W14 around lag +1 to +2. In comparison with Figure 2b, there are strong peak correlation coefficients on both sides of lag 0 (for W3 and W12) in Figure 2a, which could indicate an oscillation.
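The lead/lag calculation for a single WEA-EEA pair can be sketched as below. The function name is hypothetical; the sign convention follows the one stated in Section 3.2, in which a positive lag means that EEA lags WEA.

```python
import numpy as np

def lag_correlation(wea, eea, max_lag=5):
    """Return {lag: r}; a positive lag pairs WEA on day t with EEA on day t + lag."""
    wea = np.asarray(wea, dtype=float)
    eea = np.asarray(eea, dtype=float)
    n = wea.size
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = wea[:n - lag], eea[lag:]
        else:
            x, y = wea[-lag:], eea[:n + lag]
        out[lag] = float(np.corrcoef(x, y)[0, 1])
    return out

# Synthetic check: an EEA series that copies WEA two days later peaks at lag +2.
rng = np.random.default_rng(0)
wea_demo = rng.normal(size=500)
eea_demo = np.roll(wea_demo, 2) + 0.1 * rng.normal(size=500)
r_demo = lag_correlation(wea_demo, eea_demo)
peak_lag = max(r_demo, key=r_demo.get)
```

Applying this function to all 17 × 8 pairs gives the 136 lag-correlation curves from which panels such as those in Figure 2 are drawn.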
Tables 1 and 2 show example results of minimum and maximum correlations of area-average time series for E3 and E1 with every sub-region in WEA, respectively. Values are all significant at the 95% level, though they are typically below magnitude 0.2 as expected for daily precipitation variability. In Table 1, the strongest positive correlation coefficients are seen on lag +1 (i.e., E3 vs. W14), and the positive correlation coefficients on lag +2 (i.e., E3 vs. W5, E3 vs. W9) are generally weaker than those on lag +1. The strongest negative correlation coefficients for some regions seen on lag 0 in Tables 1 and 2 suggest a contrasting relationship between the different pairs of sub-regions. Note that from Table 1, it can be seen that the magnitude of the correlation coefficients is generally below 0.2.
Further analysis on peak correlation coefficients seen in Figure 2 is done by considering the correlation coefficients for each WEA sub-region versus every EEA sub-region and identifying the strongest correlation among all lags (0-5) for each pair. Then, for each WEA sub-region which is paired with each of the eight EEA sub-regions, the maximum of these correlations is retained for each lag and is plotted in the column for that WEA sub-region. This is shown in Figure 3. For example, the ninth column of Figure 3 shows that, of all pairs of sub-regions, E2 has the strongest correlation at lag +1 with W9. Similarly, for all such pairs with the strongest correlation at lag +2, E3 is the maximum. No pairs in the ninth column have the strongest correlation at any nonzero lag other than lag +1 or lag +2, so the colours for those lags do not appear in this column. We perform this analysis in order to show where the "peaks" of the lag-correlation distribution (like those shown in Figure 2) are largest for each WEA sub-region.
FIGURE 1 Example of sub-regions identified by subjecting 16 years of daily TRMM anomalies to an EOT. The letter "W" attached to the number in the red sub-regions indicates that the sub-region is located in Western Equatorial Africa and "E" likewise for EEA. Since the order in which these sub-regions were identified does not matter, they were renamed for clarity
FIGURE 2 Lead/lag correlation coefficients of 16-year daily TRMM precipitation anomalies for (a) E1 versus W3, W5, W9 and W12 (b) E3 versus W5, W9, W12 and W14. The dashed line shows the largest minimum threshold for statistically significant positive correlation coefficients at 95% confidence level. For any given pair of sub-regions a positive value implies that EEA is lagging, and we use the word "lag" (with a positive value) to refer to days after the reference time (day 0) and likewise negative values for days before the reference time
Figure 3 shows that 15 WEA sub-regions (all except W3 and W17) have at least one EEA pairing with its strongest correlation at lag +2, which is why all columns except 3 and 17 have a yellow cell. Lag +2 is therefore the most common lag measured in this way, followed by lag +1 with nine cells and lag +3 with eight cells. Of those 15 WEA sub-regions with lag +2 values, eight (or 53%) have the maximum of these strongest correlations when paired with E3, which is why most yellow cells occur in the third row. This highlights a possible unique interaction between sub-regions in WEA and E3. Furthermore, of the pairings between all WEA subregions and E3 (row 3 in Figure 3), the strongest correlation coefficient is seen between E3 and W9 at lag +2. Also, the strongest correlation coefficients between the westernmost sub-regions (W1-3; see Figure S1 for the location of W1-2) and E3 occur at lag +3, which is why the first three cells in the third row are red. Subsequent work looks at E3 in more detail because it indicates the highest number of sub-regions in WEA with which it exhibits the strongest correlation coefficient at any lag; note that we focus on lag +2 in this paper, for reasons discussed below. We acknowledge that the correlation between W11 and E2 (0.18) is stronger than that between W9 and E3 (0.13). Our choice to further analyse W9 and E3 is premised on the higher number of WEA subregions that exhibit the strongest correlation coefficient with E3 compared to E2. This suggests that overall, E3 is more likely to demonstrate WEA-EEA interaction. Also, we investigate the W11-E2 pairing, and E2 in general, to check sensitivity to sub-region choices. We also note that the W5-E3 pairing has only a slightly lower correlation (0.12) than the W9-E3 pairing, and that W5 is closer to the equatorial latitude range covered by E3, so we also investigate the W5-E3 pairing for sensitivity to choice of sub-region. 
Over the continuum of lag 0 to +4, Figure 3 suggests that the strongest correlations are seen at lag +1 and +2.
Figure 3 also shows that the correlations peak at lag +1 when the associated pair of sub-regions is closer to one another (e.g., W9 and E2, W14 and E3, W8 and E1) and similarly at lag +2 or +3 when they are further apart (e.g., W9 and E3, W5 and E3, W3 and E3). However, by lag +3, the correlation coefficients are not as strong; thus, emphasis is placed on lag +2 as it provides an opportunity to understand the behaviour of the signal across a wider west-east stretch of the domain under investigation, while still having relatively strong correlation coefficients.
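As a rough worked example of why nearby pairs peak at lag +1 and more distant pairs at lag +2 or +3, the sketch below converts a propagation speed into the longitudinal distance covered per day. It assumes purely zonal propagation along a fixed latitude and uses the approximately 12 m/s speed quoted in this study; the function name and the spherical-Earth radius are illustrative choices, and the observed signal is eastward/northeastward rather than strictly zonal.

```python
import math

EARTH_RADIUS_M = 6.371e6   # mean Earth radius, spherical approximation
SECONDS_PER_DAY = 86400

def zonal_displacement_deg(speed_m_s, days, lat_deg=0.0):
    """Degrees of longitude covered in `days` at a constant zonal speed."""
    metres = speed_m_s * days * SECONDS_PER_DAY
    metres_per_degree = (math.radians(1.0) * EARTH_RADIUS_M
                         * math.cos(math.radians(lat_deg)))
    return metres / metres_per_degree

# At 12 m/s on the equator this is roughly 9.3 degrees of longitude per day.
one_day = zonal_displacement_deg(12.0, 1)
two_days = zonal_displacement_deg(12.0, 2)
```

Under these assumptions, sub-regions separated by roughly 9° of longitude would be expected to correlate best at lag +1, and those about twice as far apart at lag +2.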
The precipitation anomalies were also divided into four seasons consisting of MAM, June-August (JJA), September-November (SON) and December-February (DJF), and lead/lag correlation coefficients over the various sub-regions were recalculated (not shown). The results showed that the strength of the correlation coefficients almost doubled in MAM and SON. This is likely due to more rainfall events occurring during those seasons. However, the peak of the correlation coefficients was seen at similar lead/lag days as those shown in Tables 1 and 2.
The spatio-temporal correlation pattern between W9's area-average time series and every grid point is shown in Figure 4. On day +1, the eastward/northeastward-propagating signal has progressed from W9 and the strongest positive correlation coefficients occur in a region approximately centred on 5°S, 25°E (over W14); on day +2, the strongest positive correlation coefficients advance further east to a region approximately centred on 3°N, 33°E (over E3). By day +3 (and also day +4, not shown), the signal becomes much weaker.
FIGURE 3 The maximum correlation coefficients between each WEA and EEA sub-region at various lag days. For each column (WEA sub-region), the dark blue, light blue, yellow, red and dark red cells indicate that the maximum correlation coefficient at lag 0, +1, +2, +3 and +4, respectively, out of all EEA sub-region pairings with the WEA sub-region, occurred with the corresponding row (EEA sub-region). Correlation coefficients are computed using the entire 16 years of TRMM daily precipitation anomalies and only correlation coefficients that are statistically significant at 95% confidence level are indicated in the figure. The correlation coefficient for column 17, row 1 was omitted because, owing to the extensive overlap between E1 and W17, it is expected to be influenced by the overlapping grid points
The pattern seen in Figure 4 suggests a coherent synoptic-scale eastward-propagating signal. Figure 4 also shows little eastward propagation from day +2 to day +3, unlike previous days. This result is consistent, however, with the findings in Liebmann et al. (2009, Figure 1), and several possible reasons for this behaviour are discussed below. Note that we find a similar pattern to Figure 4 when we redo the same analysis but using either W5 or W11 in place of W9 (pattern for W5 is shown as Figure S2).
The area-average time series over all the sub-regions in EEA (i.e., E1-8) were correlated with every grid point in the domain at lag 0, −1, −2 and −3. Figure 5 shows the spatio-temporal distribution of the correlation coefficients for E1 (Figure 5a-d) and E3 (Figure 5e-h) for lag 0, −1, −2 and −3 (for E2, see Figure S3). There is a clear difference in the structure of the correlation patterns over lag −1, −2 and −3. For E3, it is seen in Figure 5 that there is no switch in polarity of the local sub-region correlation signal over all lag days. However, E1 indicates a change in the polarity of the local signal over the four lag days. The correlation coefficient pattern seen in Figure 5a-d may be interpreted to mean that when wet anomalies dominate E1 (in South Sudan), dry anomalies occur in central WEA (e.g., over W3, W5, W8) and vice versa, suggesting a weak precipitation dipole. Here, the word "dipole" is used to refer to behaviour leading to a two-pole (positive-negative) spatial structure of the signal on day 0 which is also similarly present on an earlier or later lag, and where a similar structure with reversed polarity is present between those 2 days. For E1 lag −3 (Figure 5d), the structure of the signal resembles that on day 0 (Figure 5a), though the signal is weak in Figure 5d. However, the correlation pattern seen in Figure 5h does not entirely resemble that in Figure 5e. The coefficient pattern seen in Figure 5a-d therefore suggests a precipitation dipole. This result is consistent with Mekonnen and Thorncroft (2016), who found a dipole in boreal summer OLR between the Congo Basin and Eastern Africa.
To test whether the unique behaviour of E1 was due to its off-equatorial location, a similar spatio-temporal correlation analysis over E2 was performed, since E1 and E2 are located approximately along the same longitudinal band and at a similar distance from the equator. The structure of the correlation pattern over E2 (Figure S3) for lag 0, −1, −2 and −3 was similar to that of E3 (Figure 5e-h) and not that of E1 (Figure 5a-d). This suggests that the behaviour of E1 is not due simply to its distance from the equator.
FIGURE 4 Correlation coefficients between area average time series based on 16 years of TRMM daily precipitation anomalies over W9 and every grid point in the domain for (a) day 0, (b) day +1, (c) day +2, (d) day +3. Only correlation coefficients that are statistically significant at the 95% confidence level are shown.
Figure 5 shows a clear difference in the spatio-temporal correlation patterns of E3 and E1. Patterns for the other E sub-regions (e.g., E6, E8) were similar to that for E3 and are thus not shown. We speculate that for all sub-regions in EEA, except E1, a "local" influence within EEA persists for a longer timescale than the influence from WEA.
To further assess the relationship between precipitation in a sub-region and the driving circulation at every grid point, area-averaged precipitation anomalies in W3, W5 and W9 were correlated with 850 hPa zonal wind anomalies for the entire period, regardless of the season. Results indicate a coherent eastward/northeastward propagation signal (not shown) as seen in Figures 4 and 5, discussed further below. Figures 4 and 5a-d highlight the approximately 12 m/s eastward/northeastward-propagating synoptic-timescale signal. Based on previous studies (Laing et al., 2011), the propagation speed of the signal seen in Figures 4 and 5a-d closely matches that of a CCKW. Figure 6a shows a time-longitude plot of TRMM daily precipitation anomalies and the 850 hPa Kelvin wave divergence field (Section 2.4) for April 2004, as an example. The period shown here was selected because it includes several events (see Section 3.2 for definition, and Section 5 below for more discussion) that were indicated by several different pairs of near-equatorial sub-regions (e.g., W5-E3, W9-E3 and W4-E3, see Figure S1 for W4 location), suggesting that on these dates there was an eastward propagation of wet anomalies. For our primary pairing of W9-E3, there are three such events beginning March 25, April 17 and May 01. Note that in Figure 6, both precipitation and Kelvin wave divergence are averaged between 7°S and 3°N because W5 and E3 are also located within this latitudinal belt. Figure 6a shows that eastward-propagating wet anomalies coincide with Kelvin wave low-level convergence, while dry anomalies coincide with Kelvin wave low-level divergence. The alignment of precipitation anomalies and Kelvin wave low-level divergence/convergence seen in Figure 6a suggests that CCKWs play a role in modulating the eastward propagating precipitation signal.
FIGURE 6 (a) Time-longitude section of daily precipitation anomalies for 24 March to 5 May 2004 (shaded); contour lines show Kelvin wave divergence at 850 hPa (dashed lines negative), and both divergence and precipitation are averaged between 7°S and 3°N. Contour line interval is 9.3 × 10⁻⁷ s⁻¹. (b) Contour lines show the lagged correlation coefficient between 16 years of daily precipitation anomalies over W9 and grid-point 850 hPa Kelvin wave divergence (dashed lines negative), and shading shows the auto-lagged correlation of precipitation over W9 and grid-point precipitation, averaged between 7°S and 3°N.
The relationship between precipitation in various sub-regions in WEA and the Kelvin wave divergence over lag −3 to lag +4 was also analysed. As an example, Figure 6b shows the lagged correlation coefficients between area-averaged precipitation anomalies in W9 and grid-point Kelvin wave divergence at 850 hPa over the 16-year period over lag −3 to lag +4. In agreement with Figure 6a, Figure 6b reveals a Kelvin wave signal that progresses into WEA from the Atlantic Ocean and on into EEA with a phase speed of 10 m/s. The pattern seen in Figure 6b is in agreement with Mekonnen et al. (2008), who suggested that CCKWs propagate into Africa from the Atlantic Ocean.
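To relate the quoted phase speeds (about 10-12 m/s) to the slope of a signal in a time-longitude section, the speed can be converted to degrees of longitude per day. The sketch below is illustrative only: the function names are hypothetical, and the Earth radius and the default equatorial latitude are assumed values rather than settings taken from the study.

```python
import math

EARTH_RADIUS_M = 6.371e6    # mean Earth radius in metres (assumed value)
SECONDS_PER_DAY = 86400.0

def metres_per_degree_longitude(lat_deg):
    """Length of one degree of longitude at the given latitude."""
    return math.radians(1.0) * EARTH_RADIUS_M * math.cos(math.radians(lat_deg))

def speed_to_deg_per_day(speed_ms, lat_deg=0.0):
    """Convert a zonal phase speed in m/s to degrees of longitude per day."""
    return speed_ms * SECONDS_PER_DAY / metres_per_degree_longitude(lat_deg)

def deg_per_day_to_speed(deg_per_day, lat_deg=0.0):
    """Convert a Hovmoller slope in degrees per day back to m/s."""
    return deg_per_day * metres_per_degree_longitude(lat_deg) / SECONDS_PER_DAY

for speed in (10.0, 12.0):
    slope = speed_to_deg_per_day(speed)
    print(f"{speed:.0f} m/s is about {slope:.1f} degrees of longitude per day")
```

Under these assumptions the signal crosses roughly 8 to 9 degrees of longitude per day at the equator, so a lag of 1-2 days corresponds to sub-regions separated by roughly 8 to 19 degrees of longitude.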

| Isolation of event time indices
A meaningful way of understanding the eastward/northeastward-propagating signal seen in Figures 4 and 5 is to count the "events" (see above definition) when precipitation exhibits eastward propagation. Figure 7 shows the number of events for W9-E3. Similarly, the number of events for other pairs of sub-regions was counted. For W9-E3, the approach identified 87 events during DJF, an average of about five events per season; for W5-E3, a total number of 103 events during MAM were identified, an average of about six events per season.
In Figure 7, JJA shows the fewest events of any season. This could be because during these months, EEA is generally dry. DJF shows stronger interannual variability in the number of events. Results from calculating the standard deviation either including the outlier in DJF 2002 or replacing it with the mean of events across all DJF seasons indicate that the higher apparent variability in DJF in Figure 7 is due to the outlier in DJF 2002 and is otherwise not much higher than in MAM or SON (not shown). Figure 7 highlights that events are characterized by a large season-to-season variability. Note that similar figures, but for other pairs of sub-regions, also indicate a similar variability (not shown). Because seasonal precipitation over Equatorial Africa is strongly modulated by the ITCZ (Nicholson, 2018), we speculate that the synoptic-scale systems that modulate these events are likely associated with the ITCZ. For example, during DJF, the ITCZ is in the Southern Hemisphere in a location similar to that of W9, and this coincides with the period when W9-E3 has a higher number of events compared to JJA, when the ITCZ is in the Northern Hemisphere. Similarly, during JJA, when the ITCZ is in the Northern Hemisphere (in proximity with E1), W9-E1 shows a higher number of events (not shown).
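Counting events by season, as in Figure 7, can be sketched as below. This is a hedged illustration rather than the authors' procedure: it assumes the day-0 dates of the events are already available, the function names are hypothetical, and assigning December to the DJF season of the following year is an assumed convention. The two DJF dates are invented for the example; the three MAM dates are the W9-E3 events mentioned for Figure 6a.

```python
from collections import Counter
from datetime import date

SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF",
                   3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA",
                   9: "SON", 10: "SON", 11: "SON"}

def season_key(day):
    """Return (season year, season); December counts towards the
    following year's DJF (assumed convention)."""
    year = day.year + 1 if day.month == 12 else day.year
    return year, SEASON_OF_MONTH[day.month]

def count_events(day0_dates):
    """Number of events in each (season year, season) pair."""
    return Counter(season_key(d) for d in day0_dates)

day0_dates = [date(2003, 12, 30), date(2004, 1, 5),   # hypothetical dates
              date(2004, 3, 25), date(2004, 4, 17), date(2004, 5, 1)]
counts = count_events(day0_dates)

# Mean events per season from the totals quoted in the text (16 years).
mean_djf_w9_e3 = 87 / 16     # about five per DJF season
mean_mam_w5_e3 = 103 / 16    # about six per MAM season
```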

| Composite analysis
Finally, the composite method (averaging over events) is used to investigate the average structure, characteristics and wind regime associated with the precipitation relationship between WEA and EEA. Because previous studies have indicated that the circulation patterns associated with precipitation variability in the different seasons vary markedly (Camberlin, 2006a, 2006b), composites based on Equatorial Africa's known wet seasons are calculated. The events were first categorized by season and a composite was calculated only when the total number of events in a given season (MAM and SON, over the 16-year period) exceeded 60. In calculating a composite, day 0 corresponds to the day when wet anomalies exceeded the threshold in the sub-region in WEA, and negatively/positively lagged composites are computed to explore the propagation characteristics of the wet anomalies. Figure 8 (for W9-E3) shows composites of precipitation anomalies and 850 hPa Kelvin wave divergence for MAM (Figure 8a-d) and SON (Figure 8e-h) over lag 0 to +3. Wet anomalies propagate eastward together with Kelvin wave convergence (dashed contours), while dry anomalies coincide with Kelvin wave divergence (solid contours). In addition, Figure 8 shows that Kelvin waves slow down in EEA. The composites of events over various seasons for different sub-regions indicated a structure similar to that seen in Figure 8 (not shown). This figure is consistent with Figures 4 and 7 and further highlights the role of CCKWs in modulating the 2-day precipitation connection between WEA and EEA.
FIGURE 7 Season-to-season variability in the number of precipitation events for W9-E3.
FIGURE 8 Lagged composite of W9-E3 events for (a, b, c, d) MAM and (e, f, g, h) SON for lag 0, +1, +2 and +3, respectively. Precipitation anomalies that are statistically significant at the 95% confidence level are shown. The contour lines show the 850 hPa Kelvin wave divergence (dashed lines negative). Contour line interval is 1.1 × 10⁻⁷ s⁻¹.
Figure 9 shows a composite of SON daily precipitation anomalies and 850 hPa horizontal wind anomalies (a similar composite for MAM is shown in Figure S4). First, in all panels there is low-level convergence into the region of strong wet anomalies. Second, a wave-like signature is evident: dry anomalies (at 32°E) in Figure 9b are just east of wet anomalies (25°-30°E), and west of this dry anomalies are again seen (15°-20°E), while further west (9°E) weak wet anomalies can be seen. This shows a "dry-wet-dry-wet" pattern, which further highlights the role of a convectively coupled wave in modulating the 1-2 days precipitation linkage between WEA and EEA. This figure also shows anomalous low-level westerly flow west of the positive rainfall anomaly. These anomalous westerlies move eastward on lag +1 and +2, but on lag +3 both the rainfall and wind signals are weak.
Composites of the mean sea level pressure (MSLP) in all seasons indicate that on day −2 and day −1, positive MSLP anomalies dominate the Atlantic Ocean while negative anomalies dominate the Indian Ocean (not shown). The positive MSLP anomalies shift eastward on day 0, +1, +2, +3 and +4. The orientation of the west-east pressure gradient is an indication of a travelling large-scale tropical disturbance. Examining geopotential height, Figure 10 shows a composite of MAM daily precipitation anomalies and 850 hPa geopotential height anomalies (SON is shown in Figure S5). Positive geopotential height anomalies advance eastward along with the wet anomalies from lag 0 to lag +3. Generally, the positive geopotential anomalies are in phase with the low-level westerly wind anomalies in Figure S4. Figure 10 provides further evidence of the role of Kelvin waves in modulating the precipitation connection between WEA and EEA.
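The compositing used in this section (averaging a field over all events at a fixed lag relative to day 0) can be sketched as follows. The function name, the data layout (a list of daily fields, each a list of grid-point values ordered from west to east) and the toy values are assumptions made for illustration; significance testing is omitted.

```python
def lagged_composite(fields, event_days, lag):
    """Average fields[t + lag] over all event days t whose lagged index
    falls inside the record."""
    valid = [t + lag for t in event_days if 0 <= t + lag < len(fields)]
    if not valid:
        raise ValueError("no event has data at this lag")
    n_points = len(fields[0])
    return [sum(fields[t][j] for t in valid) / len(valid)
            for j in range(n_points)]

# Toy anomalies (hypothetical): three grid points from west to east,
# with a wet anomaly moving one grid point eastward per day.
fields = [
    [0, 0, 0],
    [4, 0, 0],   # day 1: event day 0 in the west
    [0, 3, 0],
    [0, 0, 2],
    [0, 0, 0],
    [6, 0, 0],   # day 5: event day 0 in the west
    [0, 5, 0],
    [0, 0, 2],
]
event_days = [1, 5]

composites = {lag: lagged_composite(fields, event_days, lag)
              for lag in (0, 1, 2)}
```

In the toy data the composite maximum shifts one grid point eastward per lag day, which is the kind of propagation the lagged composites are designed to reveal.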
FIGURE 9 Composite of SON daily TRMM precipitation anomalies (shaded) and SON 850 hPa wind anomalies (vectors) on W9-E3 events at (a) day 0, (b) day +1, (c) day +2 and (d) day +3. Only precipitation anomalies that are statistically significant at the 95% confidence level are shown. The wind vectors are plotted regardless of statistical significance.
We also use the equatorial wave dataset and a percentile-based threshold to identify days (events) when a CCKW propagates through Equatorial Africa. The threshold is computed from the time series obtained from averaging the Kelvin wave low-level convergence along the central latitude but within the two longitudes of a sub-region (Kelvin wave events were defined only for WEA sub-regions). The Kelvin wave events are identified by selecting all the days on which the Kelvin wave-induced low-level divergence is below its kth percentile (i.e., convergence is above its (100 − k)th percentile). For example, the 90th percentile of Kelvin wave low-level convergence is defined by taking the 10th percentile of low-level divergence (including negative and positive values). Using the 90th percentile as a threshold over W9, for instance, we identified Kelvin wave events and then compared these to the events that were previously identified using the methodology described in Section 3.2. The Kelvin wave events were matched to the "Day 0" of the precipitation events, when the precipitation was above the given threshold in the WEA sub-region. Events such as March 25, April 17 and May 01 (shown in Figure 6a) showed up as both Kelvin wave and precipitation events. We then calculated the percentage of precipitation events that were also Kelvin wave events and found a 25% and 60% overlap when the 90th and 60th percentile of Kelvin wave convergence is used as a threshold, respectively.
This means, for instance, that 60% of the precipitation events identified over all seasons are associated with CCKW low-level convergence greater than the 60th percentile threshold (though this is a fairly weak Kelvin wave threshold, since values below the 50th percentile are divergent). Note that these percentages are higher than those expected purely by chance, which are simply the percentage of all days that are above the two Kelvin wave convergence thresholds (10% and 40% for the 90th and 60th percentile thresholds, respectively). On the other hand, a majority of Kelvin wave events (even defined at the high threshold) are not associated with precipitation events as previously defined, so Kelvin waves are not the only explanation for those events and most Kelvin waves do not cause such events; future work will look into possible reasons for this, including event definitions and other possible drivers. Results similar to Figure 7, but based on W9's Kelvin wave events, showed more Kelvin wave events in MAM than in SON in 12 out of 16 years (not shown). Figure 11 shows a composite of MAM 850 hPa Kelvin wave convergence and the corresponding daily precipitation anomalies for the W9 strong Kelvin wave events (above 90th percentile convergence). It can be seen in Figure 11 (similar to Figure 8a-d) that the Kelvin wave low-level convergence propagates eastward together with a wet signal, reaching EEA 2 days after its presence in WEA.
As noted earlier, the wet signal persists over the Lake Victoria sub-region (E3) on day +3 (and on day +4, not shown). Again, neither the Kelvin wave low-level convergence nor the wet signal is seen to progress coherently from central EEA into the Indian Ocean, which suggests a weakening of the CCKW signal as it propagates from WEA into EEA. A composite for SON, but based on CCKW events, was similar to Figure 8e-h (not shown).
FIGURE 10 Composite of MAM daily TRMM precipitation anomalies (shaded) and MAM 850 hPa geopotential height anomalies (contour lines) for W9-E3 events on (a) day 0, (b) day +1, (c) day +2 and (d) day +3. Only precipitation anomalies that are statistically significant at the 95% confidence level are shown, and warm-coloured contours indicate positive geopotential height anomalies while cool-coloured contours correspond to negative anomalies. Contour line interval is 5.0 × 10⁻¹ geopotential metres (gpm).
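The percentile-based Kelvin wave event selection and the overlap with precipitation events can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the percentile uses simple linear interpolation between ranks, and the divergence values and precipitation event days are invented toy data.

```python
def percentile(values, k):
    """k-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * k / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def kelvin_event_days(divergence, k):
    """Days on which divergence is below its k-th percentile, i.e.
    convergence is above its (100 - k)-th percentile."""
    threshold = percentile(divergence, k)
    return {t for t, d in enumerate(divergence) if d < threshold}

def overlap_percent(precip_events, kelvin_events):
    """Percentage of precipitation events that are also Kelvin wave events."""
    precip_events = set(precip_events)
    return 100.0 * len(precip_events & kelvin_events) / len(precip_events)

# Toy sub-region divergence series (negative values are convergence).
divergence = [3, -5, 1, -1, 2, -4, 0, 4, -2, -3]
precip_events = [1, 5, 6]    # hypothetical day-0 indices

strong = kelvin_event_days(divergence, 10)   # 90th percentile of convergence
weak = kelvin_event_days(divergence, 40)     # 60th percentile of convergence

overlap_weak = overlap_percent(precip_events, weak)
chance_weak = 100.0 * len(weak) / len(divergence)
```

Comparing `overlap_weak` with `chance_weak` mirrors the comparison in the text between the observed overlap and the percentage of all days that exceed the threshold.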
It has been suggested that the MJO influences precipitation over sub-regions in EEA (e.g., Camberlin, 2006a, 2006b). We investigated whether the events identified in Section 4.3 occur in a preferred MJO phase using the Real-time Multivariate MJO Index RMM1 and RMM2 (Wheeler and Hendon, 2004). Our results indicate that the 2-day precipitation relationship may not be dependent on MJO activity, since no particular MJO phase is favourable for the occurrence of the events (not shown).
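One way to check for a preferred MJO phase is to assign each event day a phase from its (RMM1, RMM2) pair and tabulate the counts. The sketch below assumes the usual eight-sector layout of the RMM phase diagram (phase 1 starting at the negative RMM1 axis and phases increasing anticlockwise) and treats an amplitude of at least 1 as an active MJO; both are common conventions assumed here rather than details given in the text, and the function names and RMM values are hypothetical.

```python
import math
from collections import Counter

def mjo_phase(rmm1, rmm2):
    """Phase 1-8 from the angle of (RMM1, RMM2) in phase space."""
    angle = math.atan2(rmm2, rmm1)    # radians, between -pi and pi
    return int((angle + math.pi) // (math.pi / 4)) % 8 + 1

def mjo_amplitude(rmm1, rmm2):
    """Distance of (RMM1, RMM2) from the origin of the phase space."""
    return math.hypot(rmm1, rmm2)

# Hypothetical (RMM1, RMM2) values on five event days.
event_rmm = [(-1.5, -0.5), (0.5, -1.5), (1.5, 0.5), (-0.5, 1.5), (0.2, 0.1)]

# Keep only days with an active MJO (amplitude >= 1, assumed threshold).
active = [(r1, r2) for r1, r2 in event_rmm if mjo_amplitude(r1, r2) >= 1.0]
phase_counts = Counter(mjo_phase(r1, r2) for r1, r2 in active)
```

A roughly flat `phase_counts` distribution across the eight phases would correspond to the absence of a preferred MJO phase reported above.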

| DISCUSSION AND CONCLUSIONS
A synoptic-scale relationship in precipitation between WEA and EEA has been detected and investigated. Lead/lag and spatio-temporal correlation analysis revealed that precipitation in EEA lags precipitation over WEA by 1-2 days. Some sub-regions in central WEA and South Sudan exhibit a synoptic-scale precipitation contrast, suggesting a weak precipitation dipole. Composite analysis shows that CCKWs play a role in facilitating the precipitation connection between WEA and EEA.
The lead/lag and spatio-temporal correlation analysis was performed on sub-regions identified using the EOT algorithm. Daily precipitation anomalies were used in this EOT analysis to objectively identify sub-regions of similar daily precipitation characteristics, rather than relying on a subjective identification. There are some differences between the locations of the various sub-regions identified in this study and those in previous publications, such as Indeje et al. (2000) and Badr et al. (2016). These discrepancies may be due to the method and the temporal and spatial resolution of the dataset used (e.g., Nicholson, 2017). As pointed out by Nicholson (2017), the sub-regions that are frequently identified in the literature include Lake Victoria, the highlands and the coastal plains and our approach similarly identified sub-regions over Lake Victoria (E3), coastal areas (E6) and highlands (W17). These results highlight the previously documented spatial heterogeneity of daily precipitation, a result that is consistent with heterogeneity found in pentad (5 days) precipitation variability (e.g., Camberlin, 2006a, 2006b).
FIGURE 11 Same as Figure 8a-d but with the composite calculated using CCKW events defined for W9. Contour line interval is 6.0 × 10⁻⁷ s⁻¹.
Results from the lead/lag correlation analysis of various pairs of sub-region area-averaged time series of precipitation anomalies indicate statistically significant peak correlation coefficients on day +1 or day +2, suggesting that precipitation over EEA lags that in WEA by 1-2 days. It is also seen that the lag at which the correlation coefficient peaks is contingent upon the distance between the sub-regions. For sub-regions that are far apart (e.g., W5 and E3) the correlation coefficient peaks at lag +2, while for the sub-regions that are close to one another (e.g., W14 and E3) the correlation coefficient peaks on lag +1 (e.g., Figure 2b). This suggests a coherently propagating synoptic-scale system. We note, however, that the correlation coefficients are quite weak, and this may be explained by the "spotty" (i.e., scattered) nature of precipitation over much of Equatorial Africa. Sumner (1983) found similar weak correlation coefficients in their analysis of daily precipitation in Tanzania.
The propagation characteristics of the synoptic-scale signal highlighted above were investigated using spatio-temporal correlation analysis. The results revealed a coherent eastward/northeastward-propagating signal. The propagation structure of this signal (with an estimated speed of about 12 m/s) suggests a role of a large-scale disturbance in modulating the precipitation relationship between WEA and EEA. The speed of propagation closely matches that of CCKWs detected in Mekonnen et al. (2008) and Laing et al. (2011). Another aspect of our results is the weakening or complete decay of the coherent eastward/northeastward signal shown in Figure 4d and in composites (e.g., Figures 8 and 11). The coherent eastward/northeastward signal is thus fairly short-lived, which may seem somewhat surprising. However, the propagation characteristics of this signal may be compared to the findings in Liebmann et al. (2009, Figure 1, see discontinuity at about 37°E). Also, our results are in agreement with Mounier et al. (2007), who found that the CCKW footprint is characterized by a weak signature over East Africa. Our analysis also reveals a contrasting signal shown in the patterns between the sub-regions in South Sudan (E1) and central WEA, which can be interpreted as a weak precipitation dipole. This result lends support to Mekonnen and Thorncroft (2016), who proposed a dipole relationship in convective activity between East Africa and the Congo Basin.
The structure of the spatio-temporal correlation patterns for all sub-regions in EEA over lag 0, −1, −2 and −3 (with the exception of E1) was similar to that shown for E3 in Figure 5e-h. The persistence of positive correlation coefficients over EEA's sub-regions for lag 0 to −3 may suggest that rainfall over several sub-regions in EEA is particularly influenced by features that do not propagate from WEA but rather persist within EEA. We find this consistent with earlier studies that have suggested the local nature of precipitation over EEA (e.g., Sumner, 1983; Nicholson, 2011). Furthermore, the distinctive spatio-temporal correlation pattern for E1 over lag 0, −1, −2 and −3, while indicative of a weak dipole as mentioned above, does not mean that precipitation over this sub-region is solely driven by eastward/northeastward-propagating features either. Our results indicate that a comprehensive understanding of the drivers of daily to synoptic-scale variability in precipitation requires that small sub-regions be considered separately.
The identification of events in each season allowed a further investigation into factors modulating the 1-2 days precipitation connection. There is an average of about six events per year for W5-E3 in MAM and about five events per year for W9-E3 in DJF. These numbers closely match the 6-7 CCKWs that propagate through the Congo basin during March-June according to previous studies (e.g., Sinclaire et al., 2015). Wheeler and Nguyen (2015, Figure 7) found an average of 5 CCKWs per boreal spring season, a result that is also consistent with our results.
The structure of the anomalous wind and the associated precipitation anomalies seen in Figure 9 resembles the observed Kelvin wave signature seen in Wheeler and Nguyen (2015, Figure 7). We used a novel equatorial wave dataset to find evidence of alignment of the eastward propagating precipitation anomalies and Kelvin wave convergence. The collocation of wet anomalies and Kelvin wave convergence is evident in a time-longitude plot (Figure 6a) and composites (Figures 8, 9 and 11). As shown in Wheeler and Nguyen (2015), it is also seen here that anomalous westerly flow and wet anomalies in Figure 9 are in phase with the positive geopotential height anomalies in Figure 10. The association between the anomalous westerly flow and the wet signal shown here (e.g., Figure 9c) is consistent with results from . We also show that as the eastward propagating wet signal becomes weaker, so does the dynamical signature of the CCKWs. This confirms the role of CCKWs in facilitating the 1-2 days precipitation connection between WEA and EEA in most seasons, more particularly during MAM and SON.
Overall, the composites based on Kelvin wave convergence in Figure 11 confirm that CCKWs play a role in modulating the precipitation linkage between WEA and EEA, although not all precipitation events are linked to Kelvin wave events, and most Kelvin wave events do not cause precipitation events; future work should investigate these definitions and mechanisms further. Finally, over the 16-year period, Kelvin wave event numbers in MAM were greater than those in SON of the same year in 75% of the years. This is consistent with the relatively stronger wet signal and the associated Kelvin wave-induced convergence in Figures 8d and 11d, compared to the corresponding SON composite (Figure 8h). This result also appears to be consistent with the predominant occurrence of Kelvin waves found during boreal spring (e.g., Roundy and Frank, 2004).
Our results suggest that synoptic-scale weather forecasters over EEA need to monitor the propagation of CCKWs into the region. However, they need to be mindful that the speed of CCKWs may vary depending on how strongly the wave is coupled to convection, and not all CCKWs will necessarily lead to eastward propagating synoptic-scale precipitation events. Also, our results can be useful in developing statistical methods for synoptic-timescale precipitation forecasting over EEA. Finally, our results highlight the potential value that will be gained from a realistic representation of CCKWs and their interaction with localized convective systems in high-resolution operational forecasting systems such as the new Met Office operational convection-permitting model output for tropical Africa.
Yang, G.-Y., Methven, J., Woolnough, S., Hodges, K. and Hoskins, B. (2018) Linking African easterly wave activity with equatorial waves and the influence of Rossby waves from the Southern Hemisphere. Journal of the Atmospheric Sciences, 75, 1783-1809.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.