Interagency discrepancies in tropical cyclone intensity estimates over the western North Pacific in recent years

This study investigates interagency discrepancies among best-track estimates of tropical cyclone (TC) intensity in the western North Pacific (WNP) provided by the Joint Typhoon Warning Center (JTWC), the China Meteorological Administration (CMA), and the Regional Specialized Meteorological Center (RSMC) Tokyo during 2013 to 2019. The results reveal evident differences in maximum surface wind speed (MSW) estimates, with significant linear systematic differences. However, the Dvorak current intensity (CI) numbers derived from the MSWs reported by the three agencies are internally consistent. Further analysis suggests that the remaining CI discrepancies are related to differences in the estimation of intensity trends, initial intensities, and TC positions among these datasets. In addition, the CI estimates provided by the JTWC for TCs over the open ocean are generally higher than those reported by the CMA and RSMC. However, the CMA and RSMC tend to give higher intensities for TCs over the mainland and coastal areas of China and Japan, respectively, than for TCs over the open ocean that have the same intensity in the JTWC dataset. This pattern potentially reflects the extensive use of surface observations by these two agencies for landfalling and offshore TCs. These results may help the research community obtain more accurate details about TCs in the WNP from the best-track datasets of different agencies.


| INTRODUCTION
The best-track datasets (BTDs) for all tropical cyclones (TCs) in the western North Pacific (WNP) are provided by various meteorological agencies, including the Joint Typhoon Warning Center (JTWC), the China Meteorological Administration (CMA), and the Regional Specialized Meteorological Center (RSMC) Tokyo. All these datasets are included in the International Best Track Archive for Climate Stewardship (IBTrACS) dataset. In these datasets, TC intensity is defined in terms of the maximum surface wind speed (MSW) over an averaging period and the minimum sea level pressure (MSLP) near the center of the TC. Previous studies have reported significant discrepancies in TC intensity estimates among the different BTDs (Kang & Elsner, 2012; Knapp et al., 2013; Kueh, 2012; Ren et al., 2011; Song et al., 2010). These discrepancies affect analyses of interdecadal variability and climatic trends in TC activity (Klotzbach & Landsea, 2015; Song et al., 2010; Wu et al., 2006), as well as the evaluation of TC intensity forecasts from official agencies and numerical models (Chen et al., 2021).
Although it has been suggested that data provided by these agencies are simply irreconcilable (Lander, 2008), others have proposed that disparate MSW values are the result of two main factors (Kossin et al., 2013; Song et al., 2010). First, MSW values are averaged over different time intervals: the JTWC, CMA, and RSMC use 1-, 2-, and 10-min periods, respectively. Second, the various agencies employ different operational procedures and incorporate additional observational data into their MSW estimates.
Over the open ocean, estimates of TC intensity rely largely on the analysis of satellite-derived measurements (Dvorak, 1975, 1984; Lu et al., 2019; Zhuo & Tan, 2021). The most widely used approach, the Dvorak technique (Dvorak, 1975, 1984), first selects a corresponding cloud pattern from geostationary satellite imagery to assign a T-number, after which the current intensity (CI) is determined by modifying the T-number according to specific correction criteria. TC intensity is then estimated by converting the CI number. However, various CI-to-MSW conversions have been developed to account for different wind-averaging periods. The JTWC uses the original CI-MSW conversion values of Dvorak (1984), whereas the RSMC employs a modified conversion set (Koba et al., 1991) in which CI is translated into MSLP and 10-min average MSW. Prior to 2013, the CMA estimated TC intensity via a cloud pattern recognition approach that was similar, but not identical, to the Dvorak technique (Ying et al., 2014). After 2013, the agency adopted the Dvorak technique (Dvorak, 1984) for operational use (Xu et al., 2015). The CMA developed its current CI-MSW conversion values from statistical relationships between CI number and MSW reported in the historical CMA dataset (Bai et al., 2022). The conversion tables in use by the JTWC, RSMC, and CMA are listed in Table 1.
In addition to these modifications to the Dvorak technique, different agencies also incorporate various observational data to evaluate TC intensity. The RSMC modifies the Dvorak-derived MSW values according to ocean surface wind data from microwave satellites and surface observations (Kunitsugu, 2012). The CMA routinely incorporates in situ observations and radar data, particularly for landfalling and offshore TCs (Bai et al., 2022).
Previous studies comparing TC intensity datasets have focused primarily on discrepancies among long-term trends. Yet interagency comparisons based on CI values can only be made meaningfully for the years after 2013, when all three agencies adopted the Dvorak technique as the primary tool for assessing TC intensity. Meanwhile, fewer studies have examined the spatial distributions of interagency intensity differences. In seeking to improve our understanding of interagency discrepancies, this study poses two questions: Have TC datasets become more consistent since all three agencies adopted the Dvorak technique? What are the causes of discrepancies among BTDs? Section 2 provides information on the three datasets, and Section 3 evaluates the respective differences in intensity. A summary and discussion are provided in Section 4.

| DATA
This study utilized publicly available TC intensity datasets provided by the CMA (https://tcdata.typhoon.org.cn/zjljsjj_zlhq.html), the RSMC (http://www.jma.go.jp/jma/jma-eng/jma-center/rsmc-hp-pub-eg/trackarchives.html), and the JTWC (https://metocph.nmci.navy.mil/jtwc/
Note: The conversion relationships used by the JTWC and RSMC were provided by Dvorak (1984) and Koba et al. (1991), respectively. MSW values reported originally in knots have been converted to m/s using a multiplying factor of 0.51444. The CI-MSW conversion table used by the CMA was developed from statistical relationships between CI numbers and MSWs reported in the historical CMA dataset (Bai et al., 2022).
period 2013-2019. Nonetheless, the annual frequency of the concurred-TCs is 25.1, indicating that a number of independent TC events are not recorded in all three BTDs. Further analysis of the independent TCs is provided in the supplemental material. To make the following comparison of intensity estimates more realistic, this study uses only those events with MSW exceeding 17.2 m/s that are recorded in all three BTDs (n = 3287).

| Interagency intensity differences varied with TC intensity
The interagency MSW differences remain significant, even after the universal adoption of the Dvorak technique in 2013 (Figure 1). Whereas the CMA and RSMC report MSW based on 2- and 10-min averages, respectively, the JTWC employs a 1-min average and thus generally reports higher MSW values than the other two agencies. The mean MSW_JTWC is 36.5 m/s, which is 4.5 and 3.0 m/s larger than the mean MSW_RSMC and MSW_CMA, respectively (the subscript indicates the dataset). The correlation coefficients between the MSW differences and the MSW values are positive and statistically significant at the 1% level, which indicates that the interagency disparity among MSW estimates grows with increasing MSW. The maximum discrepancy among the three datasets occurred for Typhoon Haiyan (2013), for which MSW_JTWC was 75 m/s, 29 m/s and 20 m/s stronger than MSW_RSMC and MSW_CMA, respectively. Numerous studies have suggested that differences in MSW among agencies likely result from the use of different time-averaging periods (Kang & Elsner, 2012; Kueh, 2012). However, some studies have revealed that different averaging periods do not always imply differences in reported MSW between agencies. Regardless of the averaging period used, the Dvorak-derived CI should provide a common standard for TC intensity, since all agencies perform a Dvorak analysis. Therefore, the MSW values were converted back to CI numbers according to the specific CI-MSW relationship employed by each agency.
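The back-conversion from MSW to CI can be sketched as follows. This is only an illustration under stated assumptions: the table holds the commonly cited Dvorak (1984) CI to 1-min MSW values in knots and should be checked against Table 1; the nearest-entry matching rule is an assumption, since the matching procedure is not described here; and the function name `msw_to_ci` is hypothetical.

```python
# Commonly cited Dvorak (1984) CI -> 1-min MSW values in knots.
# Assumption: verify against Table 1 before use. CI 1.0 is omitted because
# it shares its MSW value with CI 1.5 and cannot be recovered from MSW alone.
DVORAK_1984_KT = {
    1.5: 25, 2.0: 30, 2.5: 35, 3.0: 45, 3.5: 55, 4.0: 65, 4.5: 77,
    5.0: 90, 5.5: 102, 6.0: 115, 6.5: 127, 7.0: 140, 7.5: 155, 8.0: 170,
}
KT_TO_MS = 0.51444  # knots -> m/s factor given in the note to Table 1


def msw_to_ci(msw_ms, table_kt=DVORAK_1984_KT):
    """Return the CI whose tabulated MSW (in m/s) is closest to msw_ms.

    Nearest-entry matching is an assumption made for this sketch.
    """
    return min(table_kt, key=lambda ci: abs(table_kt[ci] * KT_TO_MS - msw_ms))


print(msw_to_ci(33.4))  # 65 kt is about 33.4 m/s
```

Passing a different dictionary as `table_kt` would apply another agency's conversion, so the same routine could in principle be reused for each dataset.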
It should be noted that the derived CI numbers only approximate the actual Dvorak CI values, since the reported MSW values can be adjusted using other observations. If the sole cause of disparate MSW values were the use of different conversion tables for mapping from CI to MSW, then the difference in CI number should be close to zero. The results show that, in the stronger intensity range, the discrepancies are significantly reduced when expressed as CI (Figure 1). Although the mean interagency CI differences are less than 1.0, CI_JTWC is systematically higher than the values reported by the CMA and RSMC. Nakazawa and Hoshino (2009)
Figure 1 also illustrates how the interagency difference in CI shifts with changing TC intensity. The negative correlation coefficients between the CI differences and the CI values indicate that the interagency disparity among CI estimates decreases with increasing CI. When CI exceeds 4.5, the relative differences in CI numbers are minimal. The mean CI differences are less than 0.5, with 89% and 73% of CI_CMA and CI_RSMC falling within ±0.5 of CI_JTWC, respectively, indicating that the Dvorak technique is effective for estimating the intensity of strong TCs (CI ≥ 4.5). This is likely because the Dvorak technique is more objective when a TC exhibits a well-defined eye. When the CI number is <4.5, however, CI_JTWC tends to be more consistent with CI_CMA, which is typically larger than CI_RSMC. This pattern likely reflects interagency differences in CI-MSW mapping; both the JTWC and CMA employ the original table proposed by Dvorak (1984) when CI < 4.0, while the RSMC adopted the CI-MSW system of Koba et al. (1991), effectively increasing the MSW for CI < 4.5 (Knaff et al., 2010). Consequently, CI numbers for a given MSW are generally lower in the RSMC dataset than in the JTWC and CMA BTDs for low-intensity TCs. Moreover, the divergence among CI numbers also reflects the wide variety of Dvorak satellite patterns used when a TC is weak.
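The two summary statistics used in this comparison, the share of samples agreeing within ±0.5 CI and the correlation between the CI difference and CI itself, can be computed as in the sketch below. The input values are hypothetical and are not data from this study, the function names are illustrative, and significance testing is not included.

```python
from statistics import mean


def share_within(ci_a, ci_b, tol=0.5):
    """Fraction of paired samples with |ci_a - ci_b| <= tol."""
    diffs = [a - b for a, b in zip(ci_a, ci_b)]
    return sum(abs(d) <= tol for d in diffs) / len(diffs)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical paired CI values (not data from the study).
ci_jtwc = [2.5, 3.0, 4.0, 5.0, 6.0, 7.0]
ci_rsmc = [1.5, 2.0, 3.5, 4.5, 6.0, 7.0]
diff = [a - b for a, b in zip(ci_jtwc, ci_rsmc)]

print(share_within(ci_jtwc, ci_rsmc))  # share of samples within +/-0.5
print(pearson_r(diff, ci_jtwc))        # negative when the offset shrinks with CI
```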
Given this analysis, the distinct CI-MSW relationships employed by each agency are the predominant factor causing interagency discrepancies in MSW estimates. In addition, differences in the CI numbers derived from the Dvorak technique also contribute to interagency MSW discrepancies. Mean values of CI_JTWC are systematically higher than both CI_RSMC and CI_CMA. When comparing the RSMC and CMA, the mean CI_CMA is greater than CI_RSMC for weak TCs, but the opposite is true for stronger TCs.

| Interagency intensity differences varied with TC intensity change
Composites of the differences in CI values were constructed by binning CI at 0.5 intervals for nine intensity change-based stratifications to explore the effect of intensity change on CI differences between each pair of datasets. As the number of samples exhibiting opposite intensity trends in the last 6 h between any two datasets is small, those samples are combined with the nearest groups depicted in Figure 2. For example, only 12 TCs are reported as intensifying in the JTWC dataset but weakening in the CMA dataset; those samples have been grouped with those shown as intensifying in the JTWC dataset but steady in the CMA dataset.
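The stratification described above can be sketched as follows. This is a minimal illustration, assuming that each sample carries the CI and the 6-h CI change from two datasets and that the 0.5-wide bin is keyed on the first dataset's CI; the function names are hypothetical, and the merging of sparsely populated groups is not implemented.

```python
from collections import defaultdict


def trend(delta_ci):
    """Label a 6-h CI change as intensifying, steady, or weakening."""
    if delta_ci > 0:
        return "intensifying"
    if delta_ci < 0:
        return "weakening"
    return "steady"


def composite(samples):
    """Mean CI difference (A minus B) per (trend_A, trend_B, CI bin).

    Each sample is (ci_a, dci_a, ci_b, dci_b). Keying the 0.5-wide bin on
    ci_a is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for ci_a, dci_a, ci_b, dci_b in samples:
        key = (trend(dci_a), trend(dci_b), round(ci_a * 2) / 2)
        groups[key].append(ci_a - ci_b)
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Hypothetical samples (not data from the study).
samples = [
    (4.0, 0.5, 3.5, 0.0),
    (4.0, 0.5, 3.0, 0.0),
    (5.0, -0.5, 5.0, -0.5),
]
print(composite(samples))
```

Three trend labels per dataset give the nine trend combinations for each pair of datasets.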
The largest CI differences appeared for the samples intensifying in the JTWC dataset but steady or weakening in the CMA dataset (Figure 2a). Of the 1220 samples recorded in the JTWC dataset as intensifying, 28.2% are depicted as steady or decaying in the CMA dataset. This contrast in estimated intensity trends serves to enlarge the CI offset between the two agencies. In addition, for those samples recorded as intensifying in both datasets, the mean CI_JTWC is still larger than CI_CMA. The average initial intensities and the CI change over the last 6 h (ΔCI) were calculated for the concurred-TCs in the three datasets. The results show that the mean initial CI_JTWC is 2.62, which is 0.06 larger than the mean initial CI_CMA, while the intensification rates are essentially the same (ΔCI_JTWC = 0.025; ΔCI_CMA = 0.028). Therefore, for those samples shown in both datasets as intensifying, the relatively high CI_JTWC likely reflects the stronger initial intensity recorded in that dataset.
The mean CI_JTWC is also systematically greater than CI_RSMC, especially for weak TCs (Figure 2c). However, for the samples listed as intensifying or steady in the RSMC dataset but decaying in the JTWC dataset, CI_RSMC > CI_JTWC when the CI number exceeds 4.5. The mean ΔCI and initial intensity are also compared between these two datasets. The results show that although TCs are shown as intensifying more rapidly in the RSMC dataset (ΔCI_RSMC = 0.037), their lower initial intensity (CI = 2.02) renders CI_RSMC weaker than CI_JTWC for samples reported as intensifying by both datasets. The distribution of the CI difference between the CMA and RSMC is similar to that between the JTWC and RSMC. The CMA-derived CI values are significantly higher than the RSMC values when CI < 4.0. CI_CMA is smaller than CI_RSMC for samples shown by the RSMC data as intensifying when CI > 4.0.
Furthermore, analysis of the sample counts in the different ΔCI groups (Table 2) shows that the RSMC dataset contains the fewest intensifying (ΔCI > 0) and decaying (ΔCI < 0) samples but the most rapidly intensifying (ΔCI > 1.0) and rapidly decaying (ΔCI < -1.0) samples. These contrasting ratios indicate that the constraints on allowable intensity change may differ among agencies. For developing TCs, Dvorak (1984) limited the change in the final T-number to 0.5 (1.0) over 6 h when the T-number is <4.0 (≥4.0). The CI rules of Dvorak (1984) are the same as those for the final T-number for developing TCs. As shown in Table 2, only 7 (5) samples in the JTWC dataset violated the intensification rules; for these, the CI number changed by >1.0 (0.5) during the 6-h period when CI ≥4.0 (<4.0). The statistics for the CMA intensity change ratios are similar to those of the JTWC dataset. However, 21 (67) samples violated the intensification rules in the RSMC dataset, which is 3.0 (13.4) times the number in the JTWC dataset. These results are consistent with the findings of Nakazawa and Hoshino (2009), who concluded that the JTWC/RSMC CI offset is due partly to the incidence of slowly weakening TCs reported by the JTWC, and also to the differences in estimated intensity trends between the agencies.
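A check for samples that exceed the intensification limits quoted above might look like the following sketch. The thresholds come from the text (0.5 when CI < 4.0, 1.0 when CI ≥ 4.0, over 6 h); applying the threshold to the earlier of the two CI values is an assumption, and the function names and the example sequence are hypothetical.

```python
def violates_intensification_rule(ci_prev, ci_now):
    """True if the 6-h CI increase exceeds the limit described in the text.

    Limit: 0.5 when CI < 4.0, 1.0 when CI >= 4.0. Evaluating the threshold
    on the earlier CI value is an assumption of this sketch.
    """
    limit = 1.0 if ci_prev >= 4.0 else 0.5
    return (ci_now - ci_prev) > limit


def count_violations(ci_series):
    """Count violations along one TC's 6-hourly CI sequence."""
    return sum(
        violates_intensification_rule(a, b)
        for a, b in zip(ci_series, ci_series[1:])
    )


# Hypothetical 6-hourly CI sequence (not data from the study).
print(count_violations([2.0, 2.5, 3.5, 4.0, 5.5, 6.0]))
```

In this sequence the steps 2.5 to 3.5 and 4.0 to 5.5 exceed their respective limits, so two violations are counted.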
In summary, the observed disparities in CI among the three datasets are related to differences in estimates of intensity trends, initial intensities, and intensity change ratios. Meanwhile, the discrepancy in the numbers of rapidly intensifying and rapidly decaying samples suggests that the permissible margin of intensity variability proposed by Dvorak (1984) is somewhat relaxed by the RSMC, resulting in more rapid intensification and weakening of TCs.

| Interagency intensity differences varied with differences in TC center position
Figure 2 also depicts the influence of variability in estimated center location on reported TC intensity differences among the three agencies. Although the correlation coefficients for this relationship are small, they are nonetheless statistically significant at the 0.01 level. Positive relationships exist between the RSMC and the other two BTDs, suggesting that discrepancies in CI are greater when there is a larger difference in estimated TC position (Figure 2d, f). Meanwhile, the samples with larger CI differences are likely to be weaker TCs (Figure 1d, f). These results show that the difference in position between agencies decreases with increasing intensity, which is consistent with previous studies. This is likely because locating the TC center is more difficult for a weak TC, which is less organized.
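One way to quantify the difference in estimated center position between two agencies is the great-circle distance between their fixes, sketched below with the haversine formula. How the position difference is actually computed is not stated here, so this choice is an assumption, as are the function name and the mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def center_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two TC center fixes."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical fixes from two agencies for the same time (not study data).
print(center_distance_km(20.0, 130.0, 20.2, 130.3))
```

The resulting distances can then be paired with the corresponding CI differences to compute a correlation coefficient.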

| Spatial distribution of interagency intensity differences
The spatial differences in CI values within each 5° × 5° grid box between these three datasets are depicted in Figure 3. The CI difference is relatively small, but it is nevertheless nonzero. Mean CI values for the JTWC dataset are still greater than the CMA- and RSMC-derived values throughout much of the WNP. An important feature of these distributions is that CI differences over mainland China and Japan (and their coastal areas) contrast starkly with those over the open ocean. CI_JTWC is not always larger than the values from the other two agencies in these regions. The mean values of CI_CMA and CI_RSMC appear to be largest in some areas off their respective coasts. Given the different spatial distributions of the interagency intensity differences, statistics for regional subsets are examined. Region A (17.5°-40°N, 102.5°-127.5°E) denotes the China mainland and coastal zone, Region B denotes the Japan mainland and coastal zone, and Region C is the WNP area outside Regions A and B. The mean CI differences for each CI group in these three regions are shown in Figure 3 (right panel). Samples are combined with the nearest groups if the number of samples is smaller than 10.
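The gridding and the Region A test can be sketched as follows. The alignment of the 5° boxes to multiples of 5°, the inclusion of the region boundaries, and the function names are assumptions of this sketch; only Region A is coded because its bounds are the only ones given in the text, and the sample values are hypothetical.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, size=5.0):
    """Return the south-west corner of the size x size degree box."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def in_region_a(lat, lon):
    """Region A as given in the text: 17.5-40 N, 102.5-127.5 E (inclusive)."""
    return 17.5 <= lat <= 40.0 and 102.5 <= lon <= 127.5


def gridded_mean_diff(samples, size=5.0):
    """Mean CI difference per grid box; samples are (lat, lon, ci_a, ci_b)."""
    boxes = defaultdict(list)
    for lat, lon, ci_a, ci_b in samples:
        boxes[grid_cell(lat, lon, size)].append(ci_a - ci_b)
    return {cell: sum(v) / len(v) for cell, v in boxes.items()}


# Hypothetical samples (not data from the study).
samples = [
    (22.3, 118.9, 4.0, 4.5),
    (23.0, 116.0, 3.0, 3.0),
    (15.0, 140.0, 5.0, 4.5),
]
print(gridded_mean_diff(samples))
print([in_region_a(lat, lon) for lat, lon, _, _ in samples])
```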
The mean values of CI_JTWC - CI_CMA are -0.03, 0.01, and 0.12 for TCs in Region A, Region B, and Region C, respectively, which means that CI_JTWC is systematically higher than CI_CMA in the open ocean, while the difference is much smaller in the Japan mainland and coastal zone. However, CI_CMA is higher than CI_JTWC for TCs in the China mainland and coastal zone. Figure 3b also shows that the CI estimates by these two agencies in Region A are consistent for TCs stronger than 5.5, but CI_CMA is larger for TCs with CI of 2.5-3.5 and 4.5-5.0.
The mean values of CI_JTWC - CI_RSMC are 0.34, 0.11, and 0.31 for TCs in Region A, Region B, and Region C, respectively. There are marked overlaps of the CI_JTWC - CI_RSMC values for TCs in Regions A and C (Figure 3d), with larger discrepancies for weak TCs and smaller discrepancies for strong TCs. The results indicate that the techniques and procedures used for operational intensity estimates by these two agencies are consistent for TCs in Regions A and C. However, the values of CI_JTWC - CI_RSMC in Region B, which are much smaller, are clearly distinguishable from those in the other regions.
The distributions of the CI discrepancies between the CMA and RSMC are shown in Figure 3f. CI_CMA tends to be larger than CI_RSMC when CI ≤ 4.5 and smaller when CI > 5.0 for TCs in Regions B and C. However, the CI threshold increases to 6.5 for TCs in Region A.
In short, the discrepancies among the three agencies are evident in the three regional subsets. The JTWC tends to overestimate TC intensities, in terms of both MSW and CI, relative to the other two agencies in the open ocean. However, the CMA and RSMC tend to give higher intensities for TCs in their respective mainland and coastal areas than for TCs over the open ocean that have the same intensity in the JTWC dataset. Such distributions indicate that the disparity cannot be attributed to the diverse CI-MSW conversions and operational Dvorak procedures applied by each agency. Some studies have suggested that there is a clear need for in situ observations to strengthen estimates of TC intensity, since the efficacy of the Dvorak technique is limited when TCs make landfall (Barcikowska et al., 2012). In this case, the broad use of supplementary data sources by the CMA and RSMC is potentially causing divergent CI estimates. Bai et al. (2022) quantified the disparity among intensity estimates for Typhoon Lekima (2019) provided by the three agencies and concluded that the CMA's estimates were strongly influenced by surface observations when Lekima made landfall in China. The minimum observed sea level pressure was 936 hPa, and the maximum 2-min sustained wind speed of 50.5 m/s was recorded on Sansuan Island (station elevation 79 m). Typhoon Lekima underwent eyewall replacement as it approached the coastline, yet that pattern was not described by the original Dvorak (1984) technique. The CMA incorporated the in situ observations, and the official intensity of Lekima was set to 52 m/s, which was 6 and 8 m/s greater than the JTWC and RSMC values, respectively.
The RSMC has employed the Dvorak (1984) technique operationally since 1987, with forecasters routinely incorporating observational data (satellite-derived ocean surface wind data and surface observations) to make estimated TC intensities consistent with weather chart analyses (Kunitsugu, 2012). As there are considerably more surface observations made over Japan than there over the

| SUMMARY AND DISCUSSION
This study evaluates interagency discrepancies in estimates of TC intensity over the WNP between 2013 and 2019. Overall, RSMC-derived MSW is more consistent with CMA estimates, with a mean absolute error (MAE) of 2.7 m/s, which is half that of MSW_JTWC - MSW_RSMC.
CI values are derived from the reported MSWs using the respective CI-MSW relationship employed by each agency. The mean differences in CI values are smaller than 1.0, which means that the distinct CI-MSW relationships employed by each agency are the dominant factor causing interagency discrepancies in MSW estimates. However, systematic CI disparities remain among the three datasets. In general, JTWC-derived CI numbers are systematically higher than those estimated by the CMA and RSMC. When comparing the RSMC and CMA, the mean CI is greater in the CMA dataset than in the RSMC dataset for weak TCs, but the opposite is true for stronger TCs. This result indicates that, in addition to the differing CI-MSW relationships, the different CI numbers estimated via the Dvorak technique are also a source of interagency discrepancy. Further analysis shows that discrepancies in the derived CI are linked to differences in estimates of intensity change, initial intensity, and TC position, and to the use of supplementary data sources by these agencies.
Finally, the spatial distribution of interagency differences reveals that the characteristics of the MSW and CI disparities near the coasts of China and Japan are starkly different from those in the open ocean. The JTWC tends to overestimate TC intensities, in terms of both MSW and CI, relative to the other two agencies in the open ocean. However, the CMA and RSMC tend to give higher intensities for TCs in the mainland and coastal areas of China and Japan, respectively, than for TCs over the open ocean that have the same intensity in the JTWC dataset. Both the CMA and RSMC emphasize the importance of supplementing estimates with surface observational data, which potentially results in interagency differences in TC intensity estimates near the shores of China and Japan.
This investigation shows that the estimated CI numbers reported by the three agencies are internally consistent, which suggests that the Dvorak technique is generally effective and reproducible for constraining TC intensity. However, it also shows that results may conflict depending on whether TC intensity is estimated solely by the Dvorak technique or with additional in situ observational data. Therefore, greater transparency and more detailed records of the operational procedures and supplementary data sources of each agency are needed to improve interagency consistency.