El Niño Diversity Across Boreal Spring Predictability Barrier

El Niño exerts widespread hydroclimate impacts during boreal summer. However, predictions of El Niño across boreal spring currently have the most severe forecast errors, partly because the diversity of El Niño onset and decay is poorly understood. Here we show, through nonlinear k‐means cluster analysis of the evolution of 40 El Niño events since 1870, that El Niño exhibits complex and diverse flavors in its onset and decay across the boreal spring predictability barrier. We detected three types of El Niño onset and three types of decay. Each type exhibits distinct coupled dynamics, precursors, and hydroclimate impacts. The results guide the prediction of different types of El Niño transition across the spring predictability barrier and of global land precipitation during early and late boreal summer. The new classification offers a metric to evaluate the performance of climate models and to project future changes in El Niño properties and predictability.


Introduction
Skillful prediction of El Niño/Southern Oscillation (ENSO) provides a basis for seasonal climate forecasts and services that are critical to managing climate hazard risks and resources. Despite enormous efforts, real-time ENSO forecasts have not improved significantly over the past two decades (Barnston et al., 2017). The most severe forecast errors are associated with the prediction of ENSO across boreal spring (Ham et al., 2019; Tippett et al., 2012) or during northern summer, namely, the transition phases of El Niño. However, the dominant impact of El Niño on land precipitation also occurs in boreal summer. Therefore, understanding the diversity of El Niño onset and decay is crucial for improving El Niño and monsoon prediction and for projecting future changes in El Niño predictability.
The current delineation of El Niño diversity, however, is primarily based on the spatial structure of the sea surface temperature anomaly (SSTA) at its mature phase during northern winter. El Niño events have been classified as either eastern Pacific (EP) or central Pacific (CP) El Niño (Ashok et al., 2007; Kao & Yu, 2009; Kug et al., 2009; Takahashi et al., 2011; Timmermann et al., 2018; Yeh et al., 2009; Yu et al., 2010). The EP El Niño tends to have a maximum warming in the equatorial eastern Pacific (5°S-5°N, 150°W-90°W), while CP El Niño tends to have a maximum warming in the equatorial central Pacific (5°S-5°N, 160°E-150°W). However, the identified CP El Niño events vary considerably among different authors. Further, the superposition of the two leading EOF modes of tropical Pacific SSTAs yields a mix of EP and CP events (Giese & Ray, 2011; Johnson, 2013; Zhang et al., 2019). The inconsistency of the current classification of CP El Niño events arises from the use of different definitions, subjective criteria, different variables, and insufficient samples (Timmermann et al., 2018). An intrinsic weakness of the classification is its narrow focus on the SSTA structure at the mature phase.
It has been shown that El Niño diversity may be better delineated by considering its evolution (Wang et al., 2019). Here we pay particular attention to the El Niño transition (onset and decay) processes and use cluster analysis to examine long records of reasonable quality (1871-2017) to increase statistical significance. Diagnostic analysis of dynamical processes is conducted to unravel the underlying El Niño dynamics. For prediction purposes, we detect robust precursors and examine the boreal summer hydroclimate impacts associated with each identified type of El Niño onset and decay.

Diversity in El Niño Onset: Distinctive Mechanisms, Precursors, and Impacts
To focus on the development of El Niño events, we conducted a cluster analysis of the equatorial SSTAs averaged between 5°S and 5°N from October (−1) to October (0) for the 40 El Niño events that occurred during 1871-2017 (Figure S1). Here Year 0, Year −1, and Year 1 denote the El Niño year, the year before, and the year after, respectively. This analysis extends Wang et al. (2019) by including seven more El Niño events that occurred in the late 19th century. The results in Figure 1a serve to verify the previous classification and to facilitate comparison with the 40 El Niño decay processes. The reason for choosing four onset clusters is explained in the Method. The corresponding silhouette value diagram (Figure S2a) indicates that the four clusters are well separated except for 1888.
Figure 1. (a) Spatial-temporal structure of the equatorial SST anomalies (shading, in units of °C) and 1,000-hPa zonal wind anomalies (contours, in units of m/s) in four clusters of El Niño onset. Shown are composite longitude-time diagrams of the equatorial SSTA averaged between 5°S and 5°N for Strong Basin-wide El Niño (7 events), MEP El Niño (14 events), MCP El Niño (9 events), and Successive El Niño (10 events). The stippling denotes the regions where the SSTA signal (group mean) is larger than the noise (the standard deviation of each member from the group mean). The green lines outline the propagation tracks of maximum SSTAs. The time ordinate is from October of the year prior to El Niño (−1) to the March after the El Niño year (1). The merged HadISST and ERSST5 data from 1871 to 2017 and merged EC data from 1901 to 2017 were used after removing small linear trends. The composite patterns derived from the undetrended and linearly detrended data are nearly identical. (b) The same as in Figure 1a except for four clusters of El Niño decay: Slow decay to summer CP warming (16 events), Early decay to EP La Niña (14 events), Late decay to CP La Niña (5 events), and Continuing (5 events). The time ordinate is from October (0) of the El Niño year to December (1) of the year after El Niño. The composite patterns are derived from the linearly detrended data.
Figure 1a shows the composite patterns of the SSTA and surface zonal wind anomalies for each cluster. Cluster I, consisting of the seven strongest El Niño events, initiates in the western Pacific, develops rapidly across the Pacific basin, and reaches a large amplitude with maximum SSTA (> 2.5°C) around 120°W; it is named strong basin-wide (SBW) El Niño. Cluster II includes 14 moderate events, in which warming starts from the far eastern Pacific and propagates westward, reaching a moderate amplitude with maximum SSTAs (1.0°C to 2.5°C) around 130°W, so they are called moderate EP (MEP) events. Cluster III contains nine moderate events, in which warming originates from the western Pacific and propagates/extends eastward, reaching a moderate maximum SSTA around 160°W, and they are called moderate CP (MCP) El Niño. Note that the initiation of SBW events features strong westerly anomalies that expand eastward continuously from the western to central Pacific during January (0) to April (0), which clearly distinguishes them from the MCP events. Cluster IV depicts a continuation of an El Niño event and is referred to as Successive El Niño. The results shown here confirm the robustness of the previous classification.
How far in advance can we foresee the occurrence of different onset clusters? We find that the four onset clusters can be differentiated before April (0) by using three predictors (Figure 2a). First, western Pacific (WP, 5°S to 5°N, 120°E to 170°E) warming preludes the onsets of MCP and SBW, while this is not the case for the MEP and Successive events (Figure 1a). Thus, we first use WP SSTA > 0.05°C (the red line in Figure 2a) to distinguish the MCP/SBW from the MEP/Successive events. Second, we find that from October (−1) to April (0) the MEP events are preceded by surface easterly anomalies over the western-central Pacific (WCP, 5°S to 5°N, 140°E to 120°W), while the Successive events feature surface westerly anomalies (Figure 1a). Thus, the WCP zonal wind anomaly (the left-side ordinate in Figure 2a) can separate the MEP and Successive events well. Third, to distinguish SBW and MCP events, we noted that from January (0) to April (0), the SBW events feature a rapid intensification and an eastward expansion of westerly anomalies in the WCP (Figure 1a); in contrast, the MCP events show weak westerly anomalies without eastward expansion (Figure 1a). To quantify this difference, we used "April mean zonal wind anomaly plus three tendencies (April minus March, April minus February, and April minus January)" (the right-side ordinate in Figure 2a). The statistical significance of the separation of the four clusters by the three precursors is tested by using a two-way contingency table (Method), and the results are shown in Table S1a. The number of degrees of freedom is nine, and the chi-square value equals 79.5 (p < 0.001), indicating that the separation is significant at the 99.9% confidence level.
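The three-step screening described above can be sketched as a small decision rule. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the averaging applied to each input is assumed to have been done beforehand, and the cutoff that separates SBW from MCP events (`sbw_threshold`) is not stated in the text, so it is left as a caller-supplied assumption.

```python
def april_wind_index(jan, feb, mar, apr):
    """April-mean WCP zonal wind anomaly plus three tendencies
    (April minus March, April minus February, April minus January)."""
    return apr + (apr - mar) + (apr - feb) + (apr - jan)


def classify_onset(wp_ssta, wcp_wind, wind_index, sbw_threshold):
    """Assign an onset type from the three precursors.

    wp_ssta:       western Pacific SSTA before April (0), in deg C
    wcp_wind:      October (-1) to April (0) WCP zonal wind anomaly, in m/s
    wind_index:    output of april_wind_index for the WCP, in m/s
    sbw_threshold: hypothetical SBW/MCP cutoff (not given in the text)
    """
    if wp_ssta > 0.05:
        # WP warming preludes both MCP and SBW onsets; the wind index
        # separates rapidly intensifying (SBW) from weak (MCP) westerlies.
        return "SBW" if wind_index > sbw_threshold else "MCP"
    # Without WP warming: easterly anomalies (negative) precede MEP events,
    # westerly anomalies (positive) accompany Successive events.
    return "MEP" if wcp_wind < 0 else "Successive"


# Hypothetical monthly wind anomalies (m/s), for illustration only
index = april_wind_index(jan=0.0, feb=0.5, mar=1.0, apr=2.0)
print(index, classify_onset(0.3, 0.5, index, sbw_threshold=3.0))
```

Written out, the wind index equals four times the April anomaly minus the January, February, and March anomalies, so it rewards both a strong April westerly anomaly and a rapid intensification since January.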
Three types of El Niño development during June-July-August involve different coupled dynamics (Wang et al., 2019), as revealed by heat budget analysis of an ocean surface layer temperature tendency (detailed derivation and explanation are provided in the Method). Different from Wang et al. (2019), here we have computed the heat budget during their respective onset phases (Table S2). The results confirm that (a) the onset of MCP events primarily involves the zonal advective feedback by anomalous currents, (b) the onset of MEP events is mainly attributed to the thermocline feedback in the eastern Pacific, and (c) the early onset of SBW events involves three feedback processes: the zonal advective, upwelling, and thermocline feedbacks. The reasons have been elucidated in Wang et al. (2019) and will not be repeated here.
The three types of El Niño onset exert distinctive impacts on boreal summer land precipitation because their corresponding SST anomalies are remarkably different, not only in location but also in warming intensity (Figure 3). During May-June (0), the early onset of the SBW events induces significant dryness over the Maritime Continent, Indochina peninsula, and equatorial South America, whereas the moderate MEP and MCP events have insignificant impacts on precipitation except over Bangladesh and the middle Yangtze River Valley. During July-August (0), the MCP warming significantly reduces rainfall over northern India, northern China, Mexico, and West Africa; in contrast, the MEP warming causes deficient rainfall over central-western India, the Yangtze River Valley, Ecuador, Venezuela, and the southeast United States. The dry anomalies induced by the MCP and MEP warming tend to be complementary in geographic location because their maximum warming occurs at different longitudes (Figure 3b). With their strong warming, the SBW events severely reduce land precipitation over the regions affected by both the MCP and MEP warming, including the Maritime Continent, tropical South and North America, southern and northern India, inland East Asia, and the northern African monsoon region.

Diversity in El Niño Decay: Distinctive Mechanisms, Precursors, and Impacts
A cluster analysis of the equatorial SSTAs from October (0) to October (1) was made, focusing on El Niño decay. Since the results of k-means cluster analysis depend on the number, k, of clusters chosen, we tested solutions for k from two to six. Similar to the onset analysis (Method), we find that k = 4 yields the most meaningful results. The corresponding silhouette values depict four well-separated clusters (Figure S2b). The spatial-temporal structures of the individual events are generally similar within the same cluster but differ from the other cluster patterns (Figure S3). Figure 1b presents composite spatial-temporal structures for each cluster of the post-El Niño evolution. Clusters I to III represent three different forms of El Niño decay. Cluster IV represents "Continuing" El Niño events. Cluster I describes a "Slow decay" process. The Slow decay cluster features a moderate decay rate, especially in the central Pacific, so that during the decaying summer a weak warming persists over the CP. The majority of the Slow decay events (12 out of 16 events) eventually reach a neutral condition 12 months after the peak phase (Figure S3a). Both Clusters II and III show a large decay rate and a transition to a La Niña, but the cooling occurs in boreal spring and summer, respectively. Therefore, Cluster II is called "Early decay," and Cluster III is called "Late decay." The Early decay is characterized by a cooling that initiates in the far eastern Pacific in boreal spring and then propagates westward to the central Pacific, evolving into an "EP-type" La Niña in summer. On the other hand, the Late decay from SBW events starts the cooling in the CP during summer, which evolves into a "CP-type" La Niña in autumn. The equatorial zonal wind anomalies coupled with the SST gradients also exhibit distinctive features among the three decay types (Figure 1b).
The surface layer heat budget analysis indicates that the Slow decay events are controlled dominantly by the zonal advective feedback by anomalous currents (Table 1). The decay rate is the smallest among the three, which is consistent with the slowly decaying westerly anomalies and the warming in the central Pacific (Figure 1b). The Early decay events are attributed to the zonal advective feedback and augmented by the thermocline and upwelling feedback processes. These processes cause rapid cooling in both the central and eastern Pacific. The eastern Pacific cooling is associated with the rapid waning of westerly anomalies in the equatorial Pacific from 150°E to 130°W during late spring (Figure 1b). The Late decay from SBW to a CP La Niña is dominated by strong thermocline feedback and augmented by zonal advective and upwelling feedbacks. The reason is that the strong warming center was located in the eastern Pacific, and its cooling is mainly caused by the rapid shoaling of the thermocline in the eastern Pacific from spring to early summer (Figure S4).

Geophysical Research Letters
What triggers different decay processes? There is no easy answer to this question as decay is a coupled process, and both atmosphere and ocean processes can trigger it. The only obvious signal is that SBW events tend to presage the Late decay events. However, a moderate CP or EP event can decay either rapidly or slowly, or even amplify. Can we differentiate the decay from continuing events and distinguish three types of decay events?
By examining Figures 1b and S3, we find that two precursors before April (1) can generally distinguish the four decay clusters (Figure 2b). The first precursor is the SSTA tendency from November-December (ND(0)) to March-April (MA(1)) over the central-eastern Pacific (CEP, 5°S to 5°N, 180° to 100°W), denoted CEPSSTT. The Early and Late decay events, which have a large decay rate, tend to exhibit a significant negative tendency from ND(0) to MA(1) (CEPSSTT < −0.6°C); however, the Slow decay tends to have a moderate negative tendency (CEPSSTT > −0.6°C), although some of these events mix with the Early decay cases. In contrast, the Continuing events show a positive tendency (CEPSSTT > 0°C), which can differentiate them from the Slow decay cases (Figure 2b). This precursor reflects the persistence of SST tendencies in the CEP after the El Niño peak in ND(0).
The second precursor is the SSTA averaged over the CEP from January (1) to April (1), which completely separates the Early and Late decay events because the Late decay events follow the SBW events with enduring warming. The separation of the four clusters by the two precursors results in the four categories shown in Figure 2b. The statistical significance of the separation of the four clusters by the two decay precursors is tested by using the contingency table shown in Table S1b. The chi-square value is 79.5 with 9 degrees of freedom, indicating that the separation is significant at the 99.9% confidence level.
Notably, the three types of El Niño decay show conspicuously distinctive SSTAs and thus exert remarkably different impacts on boreal summer land precipitation (Figure 4). The Slow decay events show a CP warming in May-June (MJ(1)), thereby causing significant dry anomalies in northeastern India and Amazônia; however, the impacts disappear in July-August (JA(1)) as the positive SSTA decays. In contrast, the Early decay events do not have a significant impact until JA(1), when the El Niño transforms into a La Niña. In this case, the land rainfall tends to increase over Central America, Indonesia, western India, and West Africa. In the Late decay of SBW, a significant warming lingers in the eastern Pacific, which prolongs the dry anomalies over the southern Amazon and eastern Brazil during MJ(1), but its impact on the tropical western hemisphere becomes insignificant in JA(1) as the central Pacific warming fades away. Notably, the increased rainfall over the Yangtze River Valley is prominent in MJ(1) and persists into JA(1). The persistent impacts on the East Asian summer monsoon are produced by the strong anticyclonic anomaly over the western North Pacific, which maintains its strength through a positive thermodynamic feedback between the WP anticyclone and the underlying ocean mixed layer in the Indo-Pacific warm pool (Wang et al., 2000). The northward transport of moisture along the northwestern flank of the subtropical high often causes severe flooding over the Yangtze River Valley, but only in the summer after a strong El Niño (Wang et al., 2017).
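The two decay precursors can likewise be sketched as a decision rule. This is a hedged illustration rather than the authors' procedure: the function names are hypothetical, the tendency is assumed to be the MA(1) mean minus the ND(0) mean, and the SSTA cutoff separating Late from Early decay (`late_threshold`) is not given in the text and must be supplied by the caller. The text also notes that some Slow decay events mix with Early decay cases, so a hard threshold such as this one cannot reproduce the clusters exactly.

```python
def cep_sst_tendency(ssta_nd0, ssta_ma1):
    """CEPSSTT, assumed here to be the MA(1) mean minus the ND(0) mean (deg C)."""
    return ssta_ma1 - ssta_nd0


def classify_decay(cepsstt, cep_ssta_jfma, late_threshold):
    """Assign a decay type from the two precursors.

    cepsstt:        CEP SSTA tendency from ND(0) to MA(1), in deg C
    cep_ssta_jfma:  CEP SSTA averaged from January (1) to April (1), in deg C
    late_threshold: hypothetical Early/Late cutoff (not given in the text)
    """
    if cepsstt > 0.0:
        return "Continuing"      # warming persists after the ND(0) peak
    if cepsstt > -0.6:
        return "Slow decay"      # moderate negative tendency
    # Large decay rate: Late decay follows SBW events with enduring warming
    return "Late decay" if cep_ssta_jfma > late_threshold else "Early decay"


# Hypothetical values (deg C), for illustration only
tendency = cep_sst_tendency(ssta_nd0=2.4, ssta_ma1=1.2)
print(tendency, classify_decay(tendency, cep_ssta_jfma=1.6, late_threshold=1.0))
```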

Two-Year El Niño Events
Our classification not only distinguishes strong from moderate events but also tends to separate first-year from second-year El Niño events, although imperfectly. There are a total of nine 2-year El Niño events from 1871 to 2017 (Figure S1). The onset cluster analysis identifies seven moderate second-year events as Successive, but the other two (1877 and 2015) were classified as SBW events because of their extraordinarily large amplitude. The decay cluster analysis identified five strong second-year El Niño events as Continuing events, but the other four were mixed into the Slow decay events because of their weak intensity (Figure S3). Note that the three onset groups and three decay groups are generally unrelated except for the linkage between the SBW onset events and the Late decay events.

Discussion
Prediction of boreal summer monsoon rainfall poses one of the greatest challenges in climate sciences. One of the difficulties is related to the diversified SST anomalies associated with the onset and decay of El Niño events. The present work innovated classification of El Niño diversity by focusing on both the temporal evolution and spatial structure of the onset and decay processes (Figure 1) and revealed the differences among different types of El Niño onset and decay in their coupled dynamic processes, precursors, and hydroclimate impacts during northern summer. The result opens an avenue for improving predictions of the differing types of El Niño onset/decay across the spring predictability barrier and their impacts on the northern summer climate, especially the challenging monsoon rainfall prediction.

10.1029/2020GL087354
The objective classification proposed in this study is expected to stimulate in-depth investigations of diversified behavior of ENSO, to enrich the metrics to evaluate climate models' performance, and to form a comprehensive understanding of future changes in El Niño diversity.
The El Niño teleconnection patterns (Figures 3 and 4) highlight the necessity and advantages of differentiating the impacts of diversified El Niño onset and decay during early and late boreal summer, respectively. These teleconnection patterns provide a testbed for understanding the underlying climate dynamics and for assessing how well climate models reproduce El Niño's climate impacts, which is critical for improved climate predictions. Note that the results shown in Figures 3 and 4 are derived by using the data from 1901 to 2017, and caution should be exercised when applying the results to a shorter period, such as 30-40 years, because the teleconnection pattern may experience multidecadal variations (e.g., Shi & Wang, 2018; Wang et al., 2020; Webster et al., 1998).
The identified precursors (Figure 2) show that the peak SST anomalies associated with the four clusters of El Niño (SBW, MEP, MCP, and Successive) may be anticipated 7 months in advance in April (0). So are the four types of El Niño decay. In particular, the precursors can guide predicting strong and moderate El Niño across the boreal spring predictability barrier. However, the identified precursors here are confined to the equatorial Pacific SST and surface zonal wind anomalies only. Additional and longer-lead precursors that can distinguish different types of El Niño onset and decay can be expected through further consideration of thermocline anomalies (Capotondi & Sardeshmukh, 2015) and the climate anomalies outside the equatorial Pacific, including subtropical Pacific processes (Vimont et al., 2003; Wang et al., 1999; Yu & Kim, 2011), and the tropical Atlantic and Indian Ocean variability (Ham et al., 2013; Timmermann et al., 2018).
The constant-depth surface mixed layer model used for heat budget analysis is a simplified version of the variable-depth mixed layer model with entrainment (Method). A constant-depth mixed layer does not neglect turbulent entrainment. Rather, it assumes that this entrainment just balances the vertical motion caused by the divergence over the surface mixed layer. If there were upwelling (downwelling) and no turbulent entrainment, the layer would thin (thicken) rather than remain constant. In the equatorial eastern-central Pacific, the mixed layer depth is controlled by ocean dynamics and is not sensitive to wind stirring, and changes in the turbulent coefficient associated with convection have little impact on the mixed layer temperature (Wang et al., 1995). Therefore, it is reasonable to use a constant-depth surface layer model to analyze the processes contributing to the SST variability in the eastern-central Pacific, where heat transports by upwelling and currents play a dominant role in ENSO dynamics.
The composite results shown in this work represent features of the majority of, or typical, events and do not reflect exceptional events. The event-to-event differences are measured by the silhouette values (Figure S2). Since the silhouette value ranges from −1 to +1, a negative or near-zero value can identify exceptional events that have low similarity to the composite events and deserve individual case studies.
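How silhouette values flag exceptional events can be illustrated with a minimal pure-Python sketch. It assumes the standard silhouette definition with plain Euclidean distance, which may differ from the distance used in the study. The function names, the toy points, and the `cutoff` for "near-zero" are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Return one silhouette value per point; assumes at least two clusters."""
    n = len(points)
    values = []
    for i in range(n):
        # Mean distance from point i to each cluster, excluding i itself
        mean_dist = {}
        for c in set(labels):
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                total = sum(math.dist(points[i], points[j]) for j in members)
                mean_dist[c] = total / len(members)
        own = labels[i]
        if own not in mean_dist:          # singleton cluster: value set to 0
            values.append(0.0)
            continue
        a = mean_dist[own]                                    # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)  # separation
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


def flag_exceptional(values, cutoff=0.1):
    """Indices of members whose silhouette value is negative or near zero."""
    return [i for i, s in enumerate(values) if s <= cutoff]


# Toy example: the last point sits midway between the two clusters
points = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 0.5)]
labels = [0, 0, 1, 1, 0]
sil = silhouette_values(points, labels)
print(flag_exceptional(sil))
```

In this toy case the midway point is equally distant from both clusters, so its silhouette value is zero and it is the only member flagged.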

Cluster Analysis
k-means cluster analysis (Wilks, 2011) uses squared Euclidean distance to measure the "similarity" among each cluster member and the corresponding cluster centroid. The silhouette values, ranging from −1 to +1, evaluate the performance of cluster analysis. A high silhouette value indicates that the member is well matched to its own cluster and poorly matched to neighboring clusters (Kaufman & Rousseeuw, 2009). The results of cluster analysis are dependent on the number, k, of clusters chosen. The solution of k = 2 mainly separates MEP events from the others. In k = 3, the SBW events emerge as a new cluster. The four-cluster analysis further separates the MCP and Successive events. When k = 5 or higher, the sample size in some clusters becomes too small to test the statistical significance. We chose k = 4 based on physical consideration.
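For readers unfamiliar with the procedure, k-means with squared Euclidean distance can be sketched in a few lines of pure Python. This is an illustrative sketch only: the deterministic farthest-point initialization is an assumption made for reproducibility and is not taken from the study, and the toy two-dimensional points merely stand in for flattened longitude-time SSTA fields.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means; returns (labels, centroids, within-cluster sum of squares)."""
    # Assumption: farthest-point initialization, starting from the first point
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each member joins its nearest centroid
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:          # converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                   # keep the old centroid if empty
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wcss


# Hypothetical toy "events"; each tuple stands in for a flattened SSTA field
events = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids, wcss = kmeans(events, k=2)
print(labels, wcss)
for k in range(2, 5):                     # compare solutions for several k
    print(k, kmeans(events, k)[2])
```

The within-cluster sum of squares always shrinks as k grows, so it cannot select k by itself; as the text notes, the choice also rests on cluster sample sizes and physical interpretability.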

Contingency Table and Chi-Square Test
A contingency table, also called a two-way frequency table, is a tabular mechanism to present categorical data in terms of frequency counts. Pearson's chi-square statistic is used to test independence between the row and column variables in the contingency table (Campbell, 2007; Cochran, 1952). It is calculated as

$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed value shown in the contingency table and $E_i$ is the expected value, which is derived by ("Row Total" times "Column Total") divided by the total number of events. Pearson's chi-square test statistic follows an asymptotic chi-square distribution with $(R - 1)(C - 1)$ degrees of freedom when the row and column variables are independent, where $R$ and $C$ are the numbers of rows and columns in the contingency table.
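The test can be illustrated with a short sketch that computes Pearson's statistic and the degrees of freedom from a table of counts. The function name and the table below are hypothetical and are not the values of Table S1. The critical value 27.877 is the tabulated 0.001-level chi-square threshold for 9 degrees of freedom.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    Assumes every row total and column total is nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical 4 x 4 table: rows are clusters, columns are precursor categories
table = [
    [10, 0, 0, 0],
    [0, 10, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 0, 10],
]
chi2, dof = pearson_chi_square(table)
CRITICAL_0_001_DF9 = 27.877   # tabulated chi-square critical value
print(chi2, dof, chi2 > CRITICAL_0_001_DF9)
```

A four-by-four table gives (4 − 1)(4 − 1) = 9 degrees of freedom, matching the value quoted for the onset and decay tests.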

Budget Analysis of the Mixed Layer Temperature
A budget analysis of the ocean surface layer temperature tendency is used to quantify the contributions of different processes to the El Niño development, which is calculated as

$$\frac{\partial T'}{\partial t} = -\left(\mathbf{V}' \cdot \nabla \overline{T} + \overline{\mathbf{V}} \cdot \nabla T' + \mathbf{V}' \cdot \nabla T'\right) + \frac{Q'_{net}}{\rho C_p H} + R \qquad (1)$$

where the overbars and primes indicate climatological mean and anomalous quantities, respectively; $T$ denotes the mixed layer temperature; $\mathbf{V} = (u, v, w)$ represents the zonal and meridional currents and the upwelling velocity, respectively; $\nabla = (\partial/\partial x, \partial/\partial y, \partial/\partial z)$ represents the three-dimensional gradient operator; $Q_{net}$ denotes the net downward heat flux at the ocean surface; $\rho$ (= $10^3$ kg m$^{-3}$) is water density; $C_p$ (= 4,000 J kg$^{-1}$ K$^{-1}$) is the specific heat of water; and $R$ denotes the residual term. The mixed layer depth $H$ is taken as a constant 50 m (An & Jin, 2004), and the analysis result is not sensitive to the choice of mixed layer thickness, such as $H$ = 30 m or 70 m.
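To give a sense of the magnitudes involved, the sketch below converts two typical budget terms into temperature tendencies using the constants quoted above. It is an illustration under stated assumptions rather than the authors' calculation: the function names, the 30-day month, and the sample values of heat flux, current, and mean SST gradient are all hypothetical.

```python
RHO = 1.0e3                        # water density, kg m^-3 (from the text)
CP = 4000.0                        # specific heat of water, J kg^-1 K^-1
H = 50.0                           # constant mixed layer depth, m
SECONDS_PER_MONTH = 30 * 86400     # assumption: 30-day month


def heat_flux_tendency(q_net_anom, depth=H):
    """Tendency from a net heat flux anomaly (W m^-2), in K per month."""
    return q_net_anom / (RHO * CP * depth) * SECONDS_PER_MONTH


def zonal_advection_tendency(u_anom, dtbar_dx):
    """Anomalous zonal advection of mean temperature, -u' dTbar/dx.

    u_anom in m/s (positive eastward), dtbar_dx in K per m; result in K per month.
    """
    return -u_anom * dtbar_dx * SECONDS_PER_MONTH


# Hypothetical values: a 10 W m^-2 flux anomaly, and a 0.1 m/s eastward
# current anomaly acting on a mean SST that drops 5 K over 10,000 km eastward
print(heat_flux_tendency(10.0))
print(zonal_advection_tendency(0.1, -5.0 / 1.0e7))
```

With these illustrative inputs both terms come to roughly 0.13 K per month, which shows how a modest eastward current anomaly acting on the mean zonal SST gradient can match a sizable surface heat flux anomaly.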
Equation 1 is valid for a "constant-depth surface layer" model, which was used in the Cane-Zebiak (CZ) model (Cane, 1979; Zebiak & Cane, 1987). It is a simplified version of a variable-depth mixed layer (ML) model. The accurate forms of the equations for the ML temperature $T_1$ and ML depth $h_1$ are given by Equations 2.5a and 2.5c in Wang et al. (1995), referred to here as Equations 2.1 and 2.2, where $H(W_e)$ is a Heaviside function of the entrainment velocity $W_e$. Assume that the ML has a constant depth, that is, $h_1 = H_1$ and $\partial h_1/\partial t = 0$, so that the ML depth equation becomes $W_e = H_1 \nabla \cdot \mathbf{V}_1^{*} = W_s$, which indicates that the entrainment rate, $W_e$, in this case is identical to the upwelling velocity at the bottom of the ML ($W_s$), and it is determined by the vertical integrated divergence, $H_1 \nabla \cdot \mathbf{V}_1$