An Approach to Link Climate Model Tropical Cyclogenesis Bias to Large‐Scale Wind Circulation Modes

Attributing sources of tropical cyclogenesis (TCG) bias to large‐scale circulation in global circulation models is challenging. Here, we propose the use of empirical orthogonal functions as an approach to understand model TCG bias. Two leading modes of large‐scale wind circulation in the West Pacific can explain the TCG frequency and location in both climate reanalysis and the MetUM model. In the reanalysis, the two modes distinguish the summer monsoon trough position and the strength of the North Pacific subtropical high. However, in the model, the wind circulations are biased toward the positive phase of the simulated modes, thus overestimating TCG in the entire Main Development Region. This bias is further related to a north‐eastward shifted monsoon trough, a weakened subtropical high, and overly strong tropics‐subtropics connections. This approach could be deployed more widely to other basins and models to diagnose the causes of TCG bias.

The approach helps to address two common questions about climate models: (a) how well do climate models simulate the relationship between western North Pacific (WNP) TCG frequency and large-scale wind circulation patterns, and (b) what roles do large-scale circulations have in TC bias? The Met Office atmosphere-only climate models are chosen here to demonstrate the approach. For WNP TCG, these models have a consistent positive bias throughout the typhoon season in a hierarchy of simulations with various configurations of horizontal resolution and air-sea coupling, regardless of the verification data source (Camp et al., 2015; Roberts et al., 2015, 2020a, 2020b; Feng et al., 2019, 2020; Vidale et al., 2021). The TCG bias in the Met Office models was found to favor a northward shift and slow translation speed of tracks. The scope of this paper is to propose a methodology that enables a clearer cascade of processes from large-scale wind circulations to the TCG bias, and that can be adapted to other climate models and basins.
The atmosphere-only climate simulations used here are produced by GA7.1 with daily sea surface temperature (SST) forced by HadISST2 at 0.25° resolution interpolated to the model's atmospheric grid. Three ensemble members are generated by perturbing atmospheric physics tendencies randomly and continuously during the simulation with the Stochastic Kinetic Energy Backscatter v2 scheme (Bowler et al., 2009). Hereafter, the GA7.1 simulations are referred to as "MetUM" simulations.
We used atmospheric data from the ECMWF fifth generation climate reanalysis (ERA5; Hersbach et al., 2020) to validate TCG and wind circulations. The native horizontal resolution of ERA5 is 31 km.

TC Track
TCs are tracked and identified in ERA5, and in each member of the MetUM simulations, from six-hourly data on a common spectral grid (T63). Positive vorticity centers that exceed 0.5 × 10⁻⁵ s⁻¹ in the Northern Hemisphere are tracked in the spectrally filtered 850 hPa vorticity. Then, the vorticity maxima at levels from 850 to 200 hPa are added to the tracks using a warm core search. The TC tracks are identified by applying the vorticity and warm core criteria tailored for the model horizontal resolution. Details of the tracking and identification method are described in Hodges et al. (2017). TCG is defined as the first point of the identified track. The MetUM simulations and ERA5 have similar horizontal resolution and use the same TC tracking method, so it is fairer to validate MetUM against ERA5 TCs than against observed TCs. The conclusions are unchanged when validating against the observed Best Track data, except for a slightly larger bias in the east (Feng et al., 2020), related to an earlier identification of TCG in ERA5 and some mismatched storms between the two datasets (Bourdin et al., 2022; Stansfield et al., 2020).
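The definition of TCG as the first point of an identified track can be illustrated with a minimal sketch. The track format below (time-ordered tuples of year, month, latitude and longitude) and the function names are hypothetical, chosen only for illustration; they are not the output format of the tracking scheme of Hodges et al. (2017). The July-October filter anticipates the peak season used in the analysis.

```python
from collections import Counter

# Hypothetical track format: each track is a time-ordered list of
# (year, month, lat, lon) points that already passed the vorticity
# and warm-core identification criteria.
tracks = [
    [(1999, 8, 12.5, 145.0), (1999, 8, 14.0, 142.5), (1999, 8, 16.0, 140.0)],
    [(1999, 9, 18.0, 130.0), (1999, 9, 20.5, 128.0)],
    [(2000, 7, 9.0, 155.0), (2000, 7, 11.0, 152.0)],
    [(2000, 12, 8.0, 135.0), (2000, 12, 9.5, 133.0)],  # outside July-October
]


def genesis_points(tracks):
    """Return the first point of each non-empty track (the TCG point)."""
    return [track[0] for track in tracks if track]


def seasonal_genesis_counts(tracks, months=(7, 8, 9, 10)):
    """Count TCG events per year, keeping only genesis in the given months."""
    counts = Counter()
    for year, month, _lat, _lon in genesis_points(tracks):
        if month in months:
            counts[year] += 1
    return dict(counts)


counts = seasonal_genesis_counts(tracks)
print(counts)
```

Applying the same function to reanalysis and model tracks gives seasonal counts that can be compared directly, because both sets of tracks come from the same tracking method.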

Analysis Methods
The multivariate empirical orthogonal function (EOF) method (Liang et al., 2018; Wheeler & Hendon, 2004), which analyzes multiple variables jointly, is used to derive the temporally and spatially independent modes of horizontal wind circulation. The multivariate EOFs analyze the zonal and meridional wind velocities at 850 hPa in the WNP (0-40°N, 90-180°E), both in ERA5 and in the MetUM ensemble mean. We analyze the typhoon peak season (July-October), when 65%-70% of TCs occur, over 1979-2014. Here, we only use the first two EOF components, which explain >65% of the variance in wind circulation (Figure S1 in Supporting Information S1). The 850 hPa winds are used because they have a close relationship with TCG, although the EOF analysis can be applied to winds at any other level.
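A multivariate EOF of the two wind components can be sketched as a singular value decomposition of the concatenated anomaly fields. The sketch below is illustrative only: the function name, the square-root cosine-latitude weighting, the absence of any per-variable normalization, and the synthetic random input are assumptions, not details given in the text.

```python
import numpy as np


def multivariate_eof(u, v, lat, n_modes=2):
    """Combined EOF of two wind components.

    u, v : arrays of shape (time, nlat, nlon)
    lat  : latitudes in degrees, shape (nlat,)
    Returns (eofs_u, eofs_v, pcs, explained_variance_fraction).
    """
    nt, nlat, nlon = u.shape
    # Remove the time mean at each grid point.
    ua = u - u.mean(axis=0)
    va = v - v.mean(axis=0)
    # Assumed area weighting: sqrt(cos(lat)), so variance scales with cos(lat).
    w = np.sqrt(np.cos(np.deg2rad(lat)))[None, :, None]
    # Concatenate both variables along the spatial dimension.
    x = np.concatenate(
        [(ua * w).reshape(nt, -1), (va * w).reshape(nt, -1)], axis=1
    )
    # SVD: rows of vt are spatial patterns, scaled left vectors are the PCs.
    u_svd, s, vt = np.linalg.svd(x, full_matrices=False)
    var_frac = s**2 / np.sum(s**2)
    pcs = u_svd[:, :n_modes] * s[:n_modes]
    patterns = vt[:n_modes]
    npts = nlat * nlon
    eofs_u = patterns[:, :npts].reshape(n_modes, nlat, nlon)
    eofs_v = patterns[:, npts:].reshape(n_modes, nlat, nlon)
    return eofs_u, eofs_v, pcs, var_frac[:n_modes]


# Synthetic stand-in data on a coarse 0-40N, 90-180E grid.
rng = np.random.default_rng(0)
lat = np.linspace(0.0, 40.0, 9)
lon = np.linspace(90.0, 180.0, 19)
nt = 36  # hypothetical number of time samples
u = rng.standard_normal((nt, lat.size, lon.size))
v = rng.standard_normal((nt, lat.size, lon.size))

eofs_u, eofs_v, pcs, var_frac = multivariate_eof(u, v, lat)
print(var_frac)
```

Because both variables are winds in the same units, the sketch does not normalize each variable separately before concatenation. The returned patterns still carry the latitude weights and would need to be divided by them before plotting as wind vectors.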

Biases in Large-Scale Circulations and TCG
In the WNP, during the typhoon peak season, the horizontal large-scale circulation is dominated by the western North Pacific summer monsoon (WNPSM) system (Wang et al., 2001; Wang & Fan, 1999). In ERA5, in the lower troposphere, the basic state of the WNPSM consists of prevailing south-westerly monsoon winds from the northern Indian Ocean, and easterly trade winds steering along the western and equatorward sides of the North Pacific subtropical high (NPSH; 15-30°N, 140-180°E) (Figure 1a). The steering flow on the equatorward flank of the NPSH is in conjunction with the Pacific Walker circulation. The south-westerly monsoon winds and easterly trade winds form a northwest-southeast oriented monsoon trough (MT), which is depicted by strong relative vorticity (RV) and convergence in the lower troposphere, and strong outflow at the upper level (Figures S2a and S2b in Supporting Information S1). Additionally, the MT is bounded by low vertical wind shear (VWS) and increased relative humidity (RH) (Figures S2c and S2d in Supporting Information S1). These conditions create a favorable environment for TCG, as indicated by the northwest-to-southeast tilted distribution of TCG in ERA5 (Figure 1a). The Main Development Region (MDR) is east of the Philippines and the South China Sea.
The regional large-scale circulation in MetUM is shown in Figure 1b. Although MetUM captures the typical features of the WNPSM system, there are significant biases (Figure 1c). The most pronounced bias is a weaker NPSH and a northward displaced MT. The 500 hPa geopotential height (GPH) between 10 and 30°N is 10-20 m lower than in ERA5. This is accompanied by an anomalously cyclonic circulation in the lower troposphere, with westerly anomalies toward the east of the Philippines and easterly anomalies toward southern Japan and northern China. In MetUM, other environmental conditions, for example, RV, VWS and RH, are also biased (Figures S2 and S3). The causality between biases in the large-scale circulation and local conditions is not straightforward. We focus on large-scale circulation as it provides a more complete picture of the environment.
The MetUM simulates the expected zonal distribution of TCG, with most forming east of the Philippines (Figure 1b). TCG is overestimated on the poleward flank of the MDR (10-30°N), and slightly underestimated on the equatorward flank (0-10°N) (Figure 1d). The bias is also associated with a significant northward shift of TCG. MetUM has the largest number of TCG events at 15°N, which is about 5° further north than in ERA5 (Figure S4 in Supporting Information S1). This indicates that the frequency and position of TCG are both biased. The basin-wide frequency of TCG is overestimated by ∼6 TCs/season.

ERA5
Figure 2a shows the first leading EOF (EOF1) of 850 hPa wind variations in ERA5 over 1979-2014. EOF1 explains 53% of the variance. In the positive phase, it is characterized by lower-tropospheric westerlies in the western and central tropical Pacific. EOF1 is associated with anomalous convergence in the central Pacific and divergence in the western Pacific. EOF1 is also related to a zonal dipole structure in the upper-tropospheric winds, middle-level RH and VWS (Figure S5 in Supporting Information S1), resembling an El Niño phase. In the positive phase, the environmental conditions are associated with an eastward displacement of the MT, suppressing TCG occurrence in the western sector (100-140°E) and favoring it in the eastern sector (140-180°E) (Figure 3a). We confirm that EOF1 represents the zonal migration of the MT, with r = 0.94 between PC1 and the MT index (Table S1 in Supporting Information S1). Thus, EOF1 has a longitudinal seesaw effect on regional TCG frequency within the basin, that is, EOF1 determines the spatial distribution of WNP TCG. There is no significant correlation between the basin-wide TCG frequency and PC1 (Figure 4a).
The second mode (EOF2) accounts for 12% of the variance. EOF2 is characterized by a lower-tropospheric cyclonic circulation in the center of the basin (10-30°N and 120-160°E) (Figure 2b). EOF2 is related to meridional dipole-like variations in environmental conditions. In the positive phase, there is anomalous convergence on the equatorward flank of the circulation and divergence on the poleward flank. EOF2-associated meridional variations are also seen in upper-level convergence, RH and VWS (Figure S6 in Supporting Information S1). EOF2 has a strong positive impact on TCG in the northern part of the MDR, where most TCG events occur, and a weak negative impact in the southeast sector (Figure 3b). PC2 has a significant correlation with the basin-wide TCG frequency (r = 0.49; Figure 4b). EOF2 resembles the dynamical features associated with a strengthening of the East Asian summer monsoon (Vega et al., 2018; Wang et al., 2001; Wang & Fan, 1999). Thus, PC2 is significantly correlated with the NPSH index, with r = −0.61. However, the basin-wide TCG frequency has a stronger correlation with PC2 than with the NPSH, suggesting that PC2 is a better descriptor for the basin-wide frequency.
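The correlations quoted between the principal components and the basin-wide TCG frequency can be reproduced in form with a short sketch. The synthetic series below are stand-ins, not the ERA5 data. The sample size of 36 (one value per season over 1979-2014) and the neglect of serial correlation are assumptions; the text does not state which significance test was used.

```python
import math

import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xa = x - x.mean()
    ya = y - y.mean()
    return float(np.sum(xa * ya) / math.sqrt(np.sum(xa**2) * np.sum(ya**2)))


def t_statistic(r, n):
    """Student t statistic for a correlation r from n independent samples."""
    return r * math.sqrt((n - 2) / (1.0 - r**2))


# Synthetic stand-ins for a PC time series and seasonal TCG counts.
rng = np.random.default_rng(1)
n_seasons = 36  # assumed: one sample per season, 1979-2014
pc = rng.standard_normal(n_seasons)
tcg_counts = 20.0 + 2.0 * pc + 3.0 * rng.standard_normal(n_seasons)

r = pearson_r(pc, tcg_counts)
print(f"synthetic r = {r:.2f}, t = {t_statistic(r, n_seasons):.2f}")

# t statistic for the ERA5 value quoted in the text (r = 0.49), assuming n = 36.
t_era5 = t_statistic(0.49, 36)
print(f"t for r = 0.49, n = 36: {t_era5:.2f}")
```

Under these assumptions, r = 0.49 corresponds to a t statistic of about 3.3 with 34 degrees of freedom, which is consistent with the correlation being described as significant.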

MetUM
In MetUM, the first two EOFs explain 58% and 14% of the variance in 850 hPa winds (Figures 2c and 2d), respectively. Compared to ERA5, in the positive phase, EOF1 in MetUM has much stronger westerly winds in a wider tropical belt (0-20°N) and has an enhanced cyclonic circulation in the subtropics (20-40°N). This corresponds to an overestimated low-level convergence across the tropical belt with an equatorward tilt with longitude. We further find that in MetUM EOF1 is related to an anomalously closed vertical circulation cell within the tropics (Figure S7 in Supporting Information S1). This implies that the strengthened westerly winds in the tropics and the enhanced cyclonic circulation in the subtropics are dynamically organized. In MetUM, the strengthened and meridionally expanded westerlies in the tropics shift the MT position further north (Figure 3c). Consequently, in the positive phase of EOF1, TCG frequency is overestimated in the southern sectors of the basin. In MetUM, related to the position bias of the MT, the positive effect of EOF1 on regional TCG becomes dominant, causing a significant correlation between the basin-wide TCG and PC1 (r = 0.31; Figure 4c). In contrast, there is no such basin-wide relationship in ERA5 (Figure 4a).
Although MetUM reasonably simulates EOF2 (Figures 2b and 2d), there are discrepancies. First, on the southern flank of the circulation, zonal wind anomalies are larger than expected, leading to stronger local convergence. Second, the centroid of the subtropical circulation shifts north-eastward by 5° (geographic units). Third, EOF2 in MetUM has a closer link to zonal winds of the central equatorial Pacific (140-180°E, 10°S-10°N). This easterly anomaly could transport more moist air to the southern flank of the circulation. This unrealistic tropics-subtropics connection is elucidated by an anomalous vertical circulation cell with ascent in the north and descent in the south in the positive phase of EOF2 (Figure S7 in Supporting Information S1). In contrast, in ERA5 EOF2 is much less linked to the circulation to the south (20°N-20°S; Figures S6a and S6b in Supporting Information S1). All of these indicate that EOF2 and the associated NPSH variations are much stronger in MetUM than in ERA5. Because TCG occurs mostly on the southern flank of EOF2, the positive correlations between PC2 and regional TCG are reasonably simulated (Figures 3b and 3d). The correlation between the basin-wide TCG frequency and PC2 is slightly higher (r = 0.61) than that in ERA5 (r = 0.49; Figures 4b and 4d).

Discussion
Here, we utilize the circulation mode-TCG relationships gained from the above results to paint a picture of the long-term bias in MetUM.
In ERA5, EOF1 only affects regional TCG (Figure 4). However, the MetUM has a much stronger EOF1 circulation regime with a north-eastward shifted MT (Figure 3). This biased EOF1 can affect TCG in a large domain and exert a stronger control over the basin-wide TCG frequency (Figure 4). Because the climatological mean wind circulation is biased toward the positive phase of MetUM EOF1 throughout the whole period (Figure 1), the increase of MetUM TCG frequency in the whole basin is mostly related to an EOF1-type wind bias. In ERA5, EOF2 dominates the basin-wide TCG frequency, confirming the results of . In MetUM, because this mode shifts north-eastward and is stronger, its effect on the basin-wide frequency becomes stronger (Table S1 in Supporting Information S1). Thus, the EOF2-type wind bias plays a secondary role in the basin-wide TCG bias. In short, the two EOF-type wind biases contribute to the TCG bias by altering both the TCG location and the basin-wide frequency.
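One possible way to quantify how strongly a mean-state bias resembles the positive phase of an EOF is to project the bias field onto the EOF pattern. The sketch below is an illustration under stated assumptions and is not presented as the procedure used in this paper: the function name, the cosine-latitude weights, the single scalar field, and the synthetic pattern and bias are all hypothetical.

```python
import numpy as np


def projection_coefficient(bias, eof, weights):
    """Weighted least-squares amplitude of `eof` contained in `bias`."""
    num = np.sum(weights * bias * eof)
    den = np.sum(weights * eof * eof)
    return float(num / den)


lat = np.linspace(0.0, 40.0, 9)
lon = np.linspace(90.0, 180.0, 19)
# Assumed area weights proportional to cos(latitude).
weights = np.cos(np.deg2rad(lat))[:, None] * np.ones((lat.size, lon.size))

# Hypothetical EOF pattern: westerly anomalies confined to the tropics.
eof1 = np.exp(-(((lat[:, None] - 5.0) / 10.0) ** 2)) * np.ones((1, lon.size))

# Hypothetical mean bias: 1.5 times the pattern plus unrelated noise.
rng = np.random.default_rng(2)
bias = 1.5 * eof1 + 0.1 * rng.standard_normal(eof1.shape)

coef = projection_coefficient(bias, eof1, weights)
print(f"projection coefficient = {coef:.2f}")
```

A positive coefficient indicates a bias of the same sign as the positive phase of the pattern, and its magnitude gives the amplitude of the pattern contained in the bias. For the two-component wind field, the same projection would be applied to the concatenated zonal and meridional components.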
Regional TCG bias is caused by inaccurate simulations of both the position and frequency of TCG. We relate it to different combinations of biased wind patterns. The most noticeable bias in TCG (Figure 1d) is an overestimation in the west quadrants. The bias in the southwest sector resembles the positively biased EOF1-TCG relationship in MetUM (Figures 3a and 3c), driven by the northward shift and strengthening of the MT. MetUM has a significant westerly wind bias in this sector (Figure 1c). Thus, the TCG bias in the southwest is mainly caused by the EOF1-type westerly wind bias. The TCG bias in the northwest sector resembles the biased EOF2-TCG relationship. In this sector, TCG in MetUM has biased positive correlations with EOF2 and has the expected correlations with EOF1. Thus, the TCG bias in the northwest is attributed to the EOF2-type wind bias depicted by a weakened NPSH.
In the northeast quadrant, a northward shifted and strengthened EOF2 leads to an increase of TCG (Figures 3c and 3d). In contrast, a shifted and strengthened wind pattern resembling EOF1 suppresses TCG in the northeast sector (Figures 3a and 3b), and this counteracts the positive contribution of the EOF2-type bias. In this quadrant, as the TCG bias is positive, the EOF2-type wind circulation bias plays a dominant role. The TCG bias in the southeast quadrant shows the same pattern as in the EOF2-TCG correlation. This suggests that this TCG bias is related to an enhanced EOF2-type circulation pattern.
Here, we emphasize other implications of the EOF analysis approach, which help to understand the model performance. First, in MetUM, the two leading modes of regional winds have an unrealistically merged relationship with the MT and NPSH indices, compared to those in ERA5 (Table S1 in Supporting Information S1). In ERA5, the MT and NPSH indices are clearly separated and preferentially related to PC1 and PC2, respectively. In MetUM, however, PC1 is strongly correlated with both the MT (r = 0.84) and NPSH (r = −0.64) indices, and PC2 is weakly correlated with these two indices (r = −0.43). This mixed relationship is related to the enhanced tropics-subtropics connections, both vertical and horizontal, in MetUM (Figures S5-S7 in Supporting Information S1), which damp the expected distinction between these two indices.
Furthermore, in ERA5, both modes are significantly correlated with the WNPSM index (r = 0.64 and 0.60, respectively; Table S1 in Supporting Information S1), making them good proxies for the WNPSM. However, in MetUM, the WNPSM index is significantly correlated only with PC1 (r = 0.86). This discrepancy implies that the WNPSM index cannot capture the features of the simulated WNPSM system. Instead, the regional boxes used to define the index should perhaps be relocated to reflect the model bias.
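The sensitivity of a box-based circulation index to the placement of its boxes can be illustrated as follows. The sketch assumes an index defined as the difference of area-averaged 850 hPa zonal wind between a southern and a northern box. The function names, box coordinates and synthetic wind field are placeholders for illustration, not the published index definitions, which are given in Wang and Fan (1999) and Wang et al. (2001).

```python
import numpy as np


def box_mean(field, lat, lon, lat_range, lon_range):
    """Cosine-latitude-weighted mean of field (time, nlat, nlon) over a box."""
    lat_mask = (lat >= lat_range[0]) & (lat <= lat_range[1])
    lon_mask = (lon >= lon_range[0]) & (lon <= lon_range[1])
    sub = field[:, lat_mask][:, :, lon_mask]
    w = np.cos(np.deg2rad(lat[lat_mask]))[None, :, None]
    w = np.broadcast_to(w, sub.shape)
    return (sub * w).sum(axis=(1, 2)) / w.sum(axis=(1, 2))


def box_difference_index(u850, lat, lon, south_box, north_box):
    """Southern-box mean minus northern-box mean of the zonal wind."""
    south = box_mean(u850, lat, lon, *south_box)
    north = box_mean(u850, lat, lon, *north_box)
    return south - north


lat = np.arange(0.0, 41.0, 5.0)
lon = np.arange(90.0, 181.0, 10.0)

# Synthetic field: westerlies (4 m/s) up to 15N, easterlies (-2 m/s) from 20N.
u850 = np.zeros((3, lat.size, lon.size))
u850[:, lat <= 15.0, :] = 4.0
u850[:, lat >= 20.0, :] = -2.0

# Placeholder boxes given as ((lat_min, lat_max), (lon_min, lon_max)).
boxes = (((5.0, 15.0), (100.0, 130.0)), ((20.0, 30.0), (110.0, 140.0)))
shifted = (((10.0, 20.0), (100.0, 130.0)), ((25.0, 35.0), (110.0, 140.0)))

index = box_difference_index(u850, lat, lon, *boxes)
index_shifted = box_difference_index(u850, lat, lon, *shifted)
print(index, index_shifted)
```

With the same wind field, moving both boxes 5° north lowers the index because the southern box then straddles the boundary between westerlies and easterlies. A fixed-box index applied to a model whose circulation is displaced is affected in the same way, with the circulation shifting relative to the boxes rather than the reverse.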
Second, the EOF analysis approach can better reveal the erroneous relationship between the basin-wide TCG frequency and wind circulations in models than just using circulation indices (Table S1 in Supporting Information S1). Because in MetUM the circulation indices cannot capture the variability of the WNPSM system, the relationships between basin-wide TCG frequency and these indices in models become misleading in understanding the role of regional circulations in TCG. For example, the correlations between TCG frequency and these indices in MetUM are identical to those in ERA5. This does not necessarily mean that the large-scale relationship is accurately simulated in the model. Instead, in the EOF method, comparing the model to ERA5, the PC values have distinct relationships with the TCG frequency, clearly indicating misrepresented relationships between basin-wide TCG and wind circulations.
Finally, the EOF approach can further point out the possible factors causing the errors in large-scale circulation. The EOF analysis partitions the WNPSM system into a tropics-oriented regime (EOF1) and a subtropics-oriented regime (EOF2). In MetUM, both modes have an overly strong tropics-subtropics connection, horizontally and vertically (Figures S5-S7 in Supporting Information S1). This connection can be further related to a biased local Hadley circulation. As the methodology could also be applied monthly, it would be interesting to examine how the TCG-circulation relationship varies with seasonality (Feng et al., 2021). This type of analysis readily and objectively identifies key factors which can inform further model development.
The biases in the circulation and the subtropical high in MetUM could be caused by, for example, air-sea interactions, unresolved convection, and coupling between convection and large-scale circulation (Feng et al., 2019; Martin et al., 2021; Rodriguez et al., 2017). A new subgrid parametrization scheme for convection or convection-resolving simulations (Hanley et al., 2019; Rooney et al., 2022) may reduce such biases. Therefore, the EOF diagnostic tool can be used to trace the progress made by any model improvements. Nevertheless, some biases are always expected in any model, and the purpose of this paper is to propose a diagnostic tool that continues to be helpful. The relationship between the biased circulations and biased local environments is not straightforward. However, the EOF analysis is beneficial because it explains TCG well with a more complete picture of large-scale modes. In contrast, local environments, including the genesis potential index, have difficulty in explaining regional TCG in both observations and models (Cavicchia et al., 2023; Menkes et al., 2012).

Conclusions
By decomposing the wind circulations in the WNP using multivariate EOFs, we gain a better understanding of model TCG bias. Two main circulation patterns explain the regional and basin-wide frequency of WNP TCG in ERA5 and the MetUM simulations. The first circulation pattern, which is typically depicted by westerly wind anomalies in the western tropical Pacific, favors TCG in the southeast sector of the MDR and suppresses it in the northwest. The second circulation pattern, which is associated with the strength of the North Pacific subtropical high, encourages TCG on the poleward side of the MDR. We confirmed that in ERA5 the second pattern dominates the basin-wide TCG frequency.
The long-term bias of wind circulation resembles the positive phase of these two EOF patterns in MetUM, characterized by an anomalous cyclonic circulation and a weakened subtropical high. We found that the mean state of local and remote circulation is biased toward favorable conditions for TCG frequency in both the northern and southern flanks of the MDR, related to two distinct modes of bias. The TCG bias in both basin-wide and regional scales is a combination of these two biased circulation patterns, associated with overly strong tropics-subtropics connections. The credibility of models in TCG mean state closely relies on how well the models simulate these distinct wind circulation patterns both in the position and strength. This EOF-based analytical approach can trace the sources of model genesis bias more clearly back to regional large-scale wind circulation. It could be applied to other GCMs and basins.