Are Forecasts of the Tropical Cyclone Radius of Maximum Wind Skillful?

The radius of maximum wind (RMW) defines the location of the maximum winds in a tropical cyclone and is critical to understanding intensity change as well as hazard impacts. A comparison between the Hurricane Analysis and Forecast System (HAFS) models and two statistical models based on the National Hurricane Center official forecast is conducted relative to a new baseline climatology to better understand whether models have skill in forecasting the RMW of North Atlantic tropical cyclones. On average, the HAFS models are less skillful than the climatology and persistence baseline and two statistically derived RMW estimates. The performance of the HAFS models is dependent on intensity, with better skill for stronger tropical cyclones than for weaker tropical cyclones. To further improve guidance on tropical cyclone hazards, more work needs to be done to improve forecasts of tropical cyclone structure.


Introduction
Improving track and intensity forecasts of tropical cyclones is an important goal of the Hurricane Forecast Improvement Project (HFIP; Gall et al., 2013). Continuous progress has already been made in reducing track and intensity forecast errors, in part due to advancements in numerical weather prediction models (Cangialosi et al., 2020; DeMaria et al., 2014). Another important goal of HFIP is to improve forecasts of tropical cyclone size to allow for improved messaging of tropical cyclone hazards such as storm surge (Marks et al., 2019). It is unclear how well current numerical weather prediction models forecast tropical cyclone size in terms of the radius of maximum wind (RMW), given that previous work by Cangialosi and Landsea (2016) found that dynamical models struggled with forecasting wind radii.
The National Hurricane Center (NHC) commonly uses the 34-, 50-, and 64-kt wind radii to communicate the extent of the tropical cyclone wind field. Another important structural measure of the size of a tropical cyclone is the RMW. The wind radii estimates are provided in quadrants and can account for asymmetries, while the RMW is a symmetric quantity that can often be ill defined for weak systems. The RMW is more closely related to the most significant impacts of a tropical cyclone, including storm surge (e.g., Irish et al., 2008); however, it is not currently forecast by NHC because of limited forecast guidance and observations. Observational constraints are a limiting factor in verifying RMW forecasts because the RMW is not always directly (or completely) observed. Estimates of the RMW are more reliable for strong tropical cyclones that are well defined and during times when there are aircraft reconnaissance or radar observations. Observations of the RMW have been quality-controlled and included in the NHC Best-Track Data set (HURDAT2; Landsea & Franklin, 2013) starting in 2021, which motivates further examination of RMW forecasts. Understanding the performance of models that predict the RMW is a critical step needed for the RMW to be incorporated into hazards forecasting.
Numerical guidance for NHC forecasters on the RMW is primarily available from dynamical models, because statistical-dynamical models such as the decay Statistical Hurricane Intensity Prediction Scheme (DSHP; DeMaria & Kaplan, 1994, 1999; DeMaria et al., 2005) and the Logistic Growth Equation Model (LGEM; DeMaria, 2009) do not currently forecast the RMW or wind radii. The new Hurricane Analysis and Forecast System (HAFS) models, HAFS-A and HAFS-B, have shown promise in forecasting tropical cyclone structure, with relative improvement in RMW bias over the Hurricane Weather Research and Forecasting (HWRF) model (Hazelton et al., 2023; Wang et al., 2023). However, it remains unclear whether the HAFS RMW forecasts would be considered skillful. The NHC uses a climatology and persistence model (CLIPER; Neumann, 1972) as a baseline to measure the skill of track, intensity, and wind-radii forecasts. Using a CLIPER model as a baseline helps account for differences in the difficulty of forecasts, which facilitates the comparison of skill for different storms and different years (Cangialosi & Franklin, 2014). See DeMaria et al. (2022) for a comprehensive discussion of the track and intensity CLIPER models used at NHC. Currently there is no CLIPER available to use as a baseline to evaluate the skill of RMW forecasts.
Penny et al. (2023) have shown that improvements to storm surge forecasting are possible through better representation of the tropical cyclone wind field. Due to the lack of available RMW guidance, a multi-linear regression technique using the NHC official forecast (OFCL) was developed to better predict the RMW within the Probabilistic Storm Surge (P-Surge) model (Penny et al., 2023). Chavas and Knaff (2022) derived a method to calculate the RMW from the radius of 34 kt winds, maximum intensity, and latitude using angular momentum principles. We can create an RMW forecast using the method described in Chavas and Knaff (2022) since all the required variables are forecast by OFCL through 72 hr. It is unclear whether simple statistical frameworks that derive the RMW following Penny et al. (2023) or Chavas and Knaff (2022) will be competitive with dynamical models that can explicitly simulate and resolve the maximum winds as well as eyewall replacement cycles, asymmetries, and interactions with complex terrain.
With the goal of improving the performance of the operational storm surge model at the NHC, this work will evaluate the skill of RMW forecasts in the North Atlantic from dynamical and statistical models relative to a new climatology and persistence model. This work aims to motivate future studies to focus on improving forecasts of the tropical cyclone structure in addition to track and intensity. Skillful forecasts of the wind structure from HAFS could be leveraged to help drive hydrodynamical models to improve storm surge forecasting. Here we will discuss the performance of two statistical techniques to forecast the RMW and compare their performance to the HAFS models using a simple climatological baseline.
Section 2 discusses the creation of a climatological RMW to use as a baseline and the verification procedure. Section 3 details the verification of the statistical and HAFS models and showcases an example for Hurricane Laura (2020). A summary of the results and general conclusions on the future of RMW forecasting are provided in Section 4.

Climatology and Persistence
It is common to perform a forecast verification against a common baseline to determine how skillful the forecasts are. A climatology and persistence model (OCD5) is used as a baseline for evaluating the skill of NHC track and intensity forecasts (Cangialosi & Franklin, 2014; DeMaria et al., 2022). OCD5 combines intensity forecasts from the Decay-Statistical Hurricane Intensity Forecast model (D-SHIFOR; Jarvinen & Neumann, 1979; Knaff et al., 2003) and track forecasts from the CLIPER5 model (CLIPER; Neumann, 1972; Aberson, 1998). This baseline is flexible and helps to explain interannual variability in errors when there are more or fewer "normal" tropical cyclone tracks. At present, OCD5 and other CLIPER models do not include climatological RMW forecasts, which motivates the development of one.
We create an RMW estimate to use as a baseline that scales with latitude and intensity, similar to the exponential functions used in Vickery et al. (2000) and derived by Willoughby and Rahn (2004) and Willoughby et al. (2006). An updated data set from Penny et al. (2023) of operationally estimated RMWs within 2 hr of aircraft reconnaissance from 2001 to 2019 in the North Atlantic is used to derive the RMW estimate. The power-law relationship shown in Equation 1 relates latitude, intensity, and RMW; it minimizes the absolute errors over the sample and explains 37.7% of the variance in the 2001-2019 data set at the initialization time. This model is consistent with the form shown in Penny et al. (2023) and explains more variance in the RMW observations than the 24.3% of variance in flight-level RMW observations explained by the equation in Willoughby and Rahn (2004). The climatological RMW forecast will use Equation 1, where RMW is in units of n mi, V_MAX is the maximum intensity in units of kt, LAT is the latitude in degrees, and the coefficients are α = 760.06199, β = 0.9516, and γ = 0.0242.
By using the latitude and intensity forecast from OCD5, we can subsequently calculate the RMW at any forecast time using Equation 1. Using this framework, the climatological RMW will range from 6 n mi for 180 kt major hurricanes near 5°N to 78 n mi for a 30 kt tropical depression at 40°N. To improve the consistency of the RMW forecast, we apply a three-point moving average to the 12-hourly forecasts.
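The climatological estimate and its smoothing can be sketched as follows. This is a minimal illustration and not the operational code. We assume here that Equation 1 takes the form RMW = α V_MAX^(−β) exp(γ LAT); with the coefficients quoted above, that form gives about 6 n mi for 180 kt at 5°N and about 78 n mi for 30 kt at 40°N, matching the stated range. The treatment of the first and last forecast times in the moving average and the sample forecast values are likewise assumptions made only for illustration.

```python
import math

# Coefficients quoted in the text for Equation 1.
ALPHA, BETA, GAMMA = 760.06199, 0.9516, 0.0242


def climatological_rmw(vmax_kt, lat_deg):
    """Climatological RMW (n mi) from intensity (kt) and latitude (degrees).

    ASSUMPTION: Equation 1 is alpha * V_MAX**(-beta) * exp(gamma * LAT).
    """
    return ALPHA * vmax_kt ** (-BETA) * math.exp(GAMMA * lat_deg)


def three_point_mean(values):
    """Centered three-point moving average.

    ASSUMPTION: the end points average only the values that exist, since
    the text does not say how the ends of the forecast are handled.
    """
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - 1):i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical 12-hourly intensity (kt) and latitude (deg N) forecast.
vmax = [45, 55, 65, 75, 80]
lat = [15.0, 16.5, 18.0, 20.0, 22.5]
raw = [climatological_rmw(v, la) for v, la in zip(vmax, lat)]
print([round(r, 1) for r in three_point_mean(raw)])
```

Because the estimate decreases with intensity and increases with latitude, a strengthening storm moving poleward can show little net change in climatological RMW.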
Persistence of the initial estimate of the RMW also needs to be applied to the climatological value for the CLIPER baseline. Persistence is applied using the offset technique used in the wind radii CLIPER model (Knaff et al., 2007, 2018). An offset is calculated by taking the RMW at forecast initialization minus the derived RMW from Equation 1. The offset is applied in full at the initial time and then linearly reduced to zero by 60 hr so that the forecast beyond 60 hr is solely the climatological value. Because the offset could reduce the RMW forecast, a minimum RMW of 5 n mi is enforced at any forecast time. The RMW climatological and persistence baseline model will be referred to as RCLP. See Figures S1 and S2 in Supporting Information S1 for a comparison between RCLP and observations and Figure S3 in Supporting Information S1 for a comparison between RCLP and persistence.
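The offset technique described above can be written in a few lines. The sketch below is illustrative only: the function name, argument names, and sample values are hypothetical, and the climatological values are passed in as a plain list rather than computed.

```python
def apply_persistence_offset(clim_rmw, lead_hr, initial_rmw,
                             end_hr=60.0, min_rmw=5.0):
    """Blend the initial RMW into a climatological forecast (all in n mi).

    clim_rmw[i] is the climatological RMW at lead time lead_hr[i], and
    lead_hr[0] is expected to be 0 hr (the initialization time).
    """
    offset = initial_rmw - clim_rmw[0]
    forecast = []
    for rmw, hr in zip(clim_rmw, lead_hr):
        # Weight is 1 at 0 hr and falls linearly to 0 at end_hr and beyond.
        weight = max(0.0, 1.0 - hr / end_hr)
        # Enforce the minimum RMW at every forecast time.
        forecast.append(max(min_rmw, rmw + weight * offset))
    return forecast


# Hypothetical case: climatology of 30 n mi, observed initial RMW of 20 n mi.
leads = [0, 12, 24, 36, 48, 60, 72]
print(apply_persistence_offset([30.0] * len(leads), leads, 20.0))
```

In this example the forecast starts at the observed 20 n mi and relaxes to the 30 n mi climatological value by 60 hr. A different `end_hr` changes only how quickly the initial information is discarded.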

Forecast Verification
The verification of the RMW forecasts will follow rules similar to those used in the official NHC forecast verification procedure (Cangialosi & Franklin, 2014). A forecast is only verified if a tropical or subtropical cyclone exists at the initial and verifying time. Tropical cyclones are not verified after dissipating or undergoing extratropical transition. Since tropical and subtropical depressions are less likely to have a well-defined center, which makes the RMW difficult to determine, only times where tropical systems are at least of tropical storm strength (≥34 kt) are included.
The verification metric used will be mean absolute error (MAE), with forecast skill expressed as MAE skill relative to the baseline. Tests of the verification metric revealed that the results are not sensitive to the use of root-mean-square error or median absolute error. The verification data set will be HURDAT2 and the operational best-track estimates of RMW for 2020. The verification requires a homogeneous sample, meaning that forecasts from all models must be present for a case to be included in the sample. In addition, only forecasts that are initialized and verified over water are included to limit the impacts of land.
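A homogeneous verification of this kind can be sketched as below. The code assumes the usual percent-improvement definition of skill, 100 × (MAE_baseline − MAE_model) / MAE_baseline, which is not spelled out in the text. The case labels, model names other than RCLP, and all numbers are hypothetical.

```python
def homogeneous_cases(forecasts, observed):
    """Return the case keys present for every model and in the observations.

    forecasts maps model name -> {case_key: forecast RMW in n mi}. A case
    key is any hashable label, e.g. (storm id, init time, lead hour).
    """
    common = set(observed)
    for cases in forecasts.values():
        common &= set(cases)
    return common


def mae_and_bias(forecast, observed, cases):
    """Mean absolute error and mean error (bias) over the given cases."""
    errors = [forecast[c] - observed[c] for c in cases]
    mae = sum(abs(e) for e in errors) / len(errors)
    bias = sum(errors) / len(errors)
    return mae, bias


def mae_skill(mae_model, mae_baseline):
    """Percent improvement over the baseline; negative means worse."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


# Hypothetical verifying RMWs and forecasts (n mi).
obs = {"a": 20.0, "b": 30.0, "c": 40.0}
fcst = {
    "RCLP": {"a": 25.0, "b": 25.0, "c": 30.0},
    "MODEL": {"a": 22.0, "b": 34.0},  # case "c" missing, so it is dropped
}
cases = homogeneous_cases(fcst, obs)
base_mae, _ = mae_and_bias(fcst["RCLP"], obs, cases)
model_mae, model_bias = mae_and_bias(fcst["MODEL"], obs, cases)
print(sorted(cases), model_mae, model_bias, mae_skill(model_mae, base_mae))
```

Dropping case "c" for every model, even though the baseline has a forecast for it, is what keeps the comparison fair across models.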
We will focus this analysis on HAFS-A and HAFS-B, which are the newest high-resolution dynamical forecast models available to forecasters (Hazelton et al., 2023). The analysis will evaluate retrospective forecasts from 2020 to 2022 for which consistent versions of the HAFS-A and HAFS-B models were used. The output will also include a consistent version of the Geophysical Fluid Dynamics Laboratory (GFDL) vortex tracker, which outputs the point-based RMW from the model (Marchok, 2021). The vortex tracker is a program that objectively identifies the tropical cyclone center and wind structure using multiple variables such as the relative vorticity, mean sea level pressure, geopotential heights, and wind speeds at various atmospheric levels (Marchok, 2002).
The verification will only be conducted in the North Atlantic basin because of the relative lack of observations in the eastern North Pacific. The training data set used to derive Equation 1 is also based only on North Atlantic observations. RMW observations are more prevalent in the North Atlantic and the RMW estimates are more reliable. We have also conducted additional tests with our sample constrained to only times within 2 hr of aircraft observations (these results are shown in Figures S4 and S5 in Supporting Information S1); however, this substantially limits the sample size. In addition, the sample becomes biased toward more intense tropical cyclones when constrained to times with aircraft observations. This sample bias subsequently leads to lower RMW errors than would be expected with a diverse set of tropical cyclones. Therefore, all data will be used in the main evaluation of this study, but in doing so we recognize that there is larger observational uncertainty in the RMW, particularly for weak systems.
Although the OFCL forecast from NHC does not include the RMW, the Chavas and Knaff (2022) method for deriving the RMW will be applied to the forecast track, intensity, and the non-zero averaged radius of 34 kt winds from OFCL and will be referred to as CHVS. Since the radius of 34-kt winds is only forecast to 72 hr, the value at 72 hr is fixed through 120 hr to allow for a 5-day forecast unless the tropical cyclone weakens below tropical storm strength. Since the CHVS model does not have a correction for the initial RMW, the offset technique is used to blend the forecast and initial RMW. The optimal ending time for the offset for the CHVS model was determined to be 36 hr. In addition, we will compare the forecasts from the RMW model used operationally in P-Surge (RMWP). The RMWP model uses a multi-linear regression technique that is also based on the OFCL forecasts of track, intensity, and available wind radii. The coefficients and techniques used for the creation of RMWP can be found in Penny et al. (2023).
The statistically derived RMW forecasts (RMWP and CHVS) are "early" guidance, while the HAFS forecasts are "late" guidance. Since the HAFS models are not finished running by the time the forecast advisories are issued, the previous HAFS runs are used for guidance, but need to be converted to the analysis time through an interpolation step. The forecasts of HAFS-A and HAFS-B are interpolated using the standard interpolation techniques, with the same methods used for intensity being applied to the RMW. During this process, the forecasts are interpolated to 3-hr resolution, extrapolated, smoothed 10 times using a 1-2-1 filter, and time shifted. An offset based on the initial errors is then applied and linearly reduced to zero by 72 hr for the interpolated HAFS-A forecasts (HFAI) and by 84 hr for the interpolated HAFS-B forecasts (HFBI). These offset end times were found to objectively optimize the skill of the RMW forecasts for each model. The conversion to early guidance is also beneficial in reducing high-frequency RMW fluctuations, since temporal smoothing has been found to reduce forecast errors (Zhang et al., 2021).
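The smoothing step of the interpolation can be illustrated with a short sketch. It covers only the repeated 1-2-1 filter, not the time interpolation, extrapolation, time shift, or offset. Holding the first and last points fixed on each pass is an assumption made here, and the sample series is hypothetical.

```python
def smooth_121(values, passes=10):
    """Apply a 1-2-1 (weights 1/4, 1/2, 1/4) filter `passes` times.

    ASSUMPTION: the first and last points are left unchanged on each pass.
    """
    out = list(values)
    if len(out) < 3:
        return out  # nothing to smooth
    for _ in range(passes):
        prev = out
        interior = [
            0.25 * prev[i - 1] + 0.5 * prev[i] + 0.25 * prev[i + 1]
            for i in range(1, len(prev) - 1)
        ]
        out = [prev[0]] + interior + [prev[-1]]
    return out


# Hypothetical 3-hourly RMW forecast (n mi) with a one-time jump at 12 hr.
raw_rmw = [30.0, 30.0, 30.0, 30.0, 70.0, 30.0, 30.0, 30.0, 30.0]
print([round(v, 1) for v in smooth_121(raw_rmw)])
```

A single-time jump such as the 70 n mi value is strongly damped after 10 passes, while a steady linear trend passes through the filter unchanged, which is why this kind of filter suppresses high-frequency fluctuations without removing gradual contraction or expansion.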

2020-2022 Verification
First we will examine the performance of the models over the full retrospective data set spanning 2020-2022. An examination of the MAE and biases for HFAI, HFBI, RMWP, CHVS, and RCLP is shown in Figure 1. Each of the MAE curves for the models has a similar profile where the errors rapidly grow over the first 36 hr before leveling off. Both HFAI and HFBI have higher MAE at all lead times relative to RMWP and CHVS. HFAI is at best 13% less skillful than RCLP, at 48 hr, with the largest skill deficit at 120 hr. Both HAFS models have positive RMW biases for forecasts through 48 hr, which suggests that some of the RMW errors are coming from the initialization of the models with a vortex that is too large. This initial size bias is partially removed by the interpolator (see Figures S6 and S7 in Supporting Information S1) and will likely be improved in the future by the assimilation of inner-core observations. The errors for both HAFS models are larger than the errors from climatology and persistence, resulting in negative skill at all forecast hours. This means that on average the forecasts of RMW from HAFS are not skillful (Figure 1b).
The derived RMW forecasts from CHVS and RMWP have lower MAE than the HAFS models at all forecast lengths within the 2020-2022 retrospective data set (Figure 1a). The CHVS method of deriving the RMW has slightly higher errors compared to the RMWP forecasts at nearly all forecast hours. The bias of the CHVS model increases over the first 24 hr before leveling off around 5 n mi. Overall, the CHVS model has negative skill compared to RCLP over the first 36 hr but is comparable to or slightly better than RCLP from 48 to 96 hr.
The forecasts of the RMW from the operational P-Surge RMW model (RMWP) perform the best and have the lowest overall MAE over the 2020-2022 North Atlantic retrospective data set. The RMWP model also has the largest bias of any model, and that bias is consistently negative, meaning that the forecast RMW is too small on average. The skill of RMWP relative to climatology and persistence is overall similar over the first 24 hr; however, RMWP does show 10% skill in forecasting the RMW for 48-72 hr forecasts. Interestingly, none of the statistical or HAFS models are skillful for 5-day forecasts.
Since weak tropical cyclones are known to have the largest uncertainty in the observed RMW (Combot et al., 2020), we can mitigate some of the observational uncertainty by verifying tropical cyclones that are at least a Category 1 hurricane on the Saffir-Simpson Hurricane Wind Scale (Schott, 2012). Figure 2 shows a verification of only hurricanes in a similar manner as Figure 1. The sample size has been greatly reduced by analyzing stronger tropical cyclones, although there are still enough forecasts to draw conclusions from. First, the RMW error profiles show a smaller short-term increase, and the error slowly increases with forecast length without saturating at 120 hr. The errors of the climatology and persistence model are also reduced for hurricanes compared to weaker tropical cyclones owing to the smaller range of potential solutions. For stronger tropical cyclones, the biases for all the forecast models become increasingly negative, meaning that the forecast tropical cyclones are too small. The negative bias from the statistical models is partially due to OFCL having a positive intensity bias, although the HAFS biases cannot be explained by biases in the intensity forecasts (not shown).
The skill of each model is better for stronger tropical cyclones compared to weaker tropical cyclones, with the errors from the RMW CLIPER also reduced. The derived RMW forecasts from CHVS and RMWP have 10%-25% skill over RCLP after 36 hr and comparable to slightly less skill for 12-24 hr forecasts. HFAI shows better performance at predicting the RMW than HFBI, with 5%-10% skill after 48 hr. HFBI shows overall similar performance to climatology after the first 24 hr. RMW forecasts through 24 hr for HFAI and HFBI are still not considered skillful relative to climatology and persistence, although that difference in MAE is <1 n mi. Figure 2 shows that the HAFS models have only slight skill in forecasting the RMW but are overall not distinguishable from climatology and persistence. The statistical models based on the OFCL forecast are currently the best available RMW models and are skillful for stronger tropical cyclones. The better performance of the statistical models is statistically significant at the 99% confidence level for 48-96 hr forecasts using a two-tailed Student's t test.

Hurricane Laura (2020)
We offer a case study of Hurricane Laura (2020) to better understand and illustrate the performance of the statistical versus dynamical RMW forecasts. Hurricane Laura caused catastrophic storm surge in Louisiana and in total caused an estimated 23 billion dollars in U.S. damages (NCEI, 2023). The track and intensity guidance for Hurricane Laura was overall quite good, leading to official intensity forecast errors that were below average at almost all lead times (Pasch et al., 2021).
The track, intensity, and RMW forecasts for HFBI and OFCL/RMWP are shown for Hurricane Laura in Figure 3. The track and intensity forecasts from OFCL were consistent from cycle to cycle and close to the observed. The track and intensity forecasts from HFBI were also consistent and close to the observed, although HFBI had a left-of-track bias for Hurricane Laura's eventual landfall in Louisiana (Figure 3a). Both HFBI and OFCL forecast the rapid intensification of Hurricane Laura but underestimated the intensification rate, which is an expected error profile given rapid intensification and the close proximity to landfall (Trabing & Bell, 2020).
The RMW forecasts from HFBI and RMWP in Figure 3c show larger differences than the intensity forecasts. The RMW forecast from RMWP tended to be too small early in the lifecycle of Laura and did not capture the increase in RMW from 21 to 22 August. However, the RMWP model was overall consistent from cycle to cycle and able to capture the RMW contraction on 23 August and the small RMW through landfall on 27 August. The RMWP model was also able to capture the increase in the RMW following the weakening after landfall, although the magnitude was underestimated because RMWP was not trained on any forecasts over land (Penny et al., 2023).
The RMW forecasts from HFBI for Hurricane Laura had much more variability compared to RMWP. HFBI was able to forecast the increase in RMW on 21 and 22 August, although the forecasts had considerable run-to-run variability over the following 3 days. While Laura was a weaker tropical storm and moving near the Caribbean islands, the RMW from HFBI ranged anywhere between 40 and 110 n mi. The HFBI forecasts of the RMW improved after 25 August as Laura became a strong tropical storm and moved into the Gulf of Mexico.
The large spread in RMW forecasts could be related to island interactions and highly asymmetric convective structures around the system, although it is unclear whether the errors are coming from the model or from how the GFDL vortex tracker identifies the RMW near complex terrain. HFBI was also able to capture the increase in RMW associated with landfall, although timing errors due to the left-of-track bias cause an overestimate of the RMW post landfall.

Summary and Conclusions
The radius of maximum wind (RMW) is an important structural parameter of tropical cyclones that is key to forecasting tropical cyclone hazards but has not been routinely verified due to observational uncertainties. The recent inclusion of RMW estimates into the Best-Track Data set as well as new advancements in numerical weather prediction models have motivated an examination of RMW forecasts. A simple RMW climatology and persistence model based on intensity and latitude has been defined to determine if RMW forecasts are skillful.
The RMW forecasts from HAFS and two statistically derived RMW forecasts based on NHC's official forecast were tested on a common set of forecasts in the North Atlantic from 2020 to 2022. The operational RMW forecasts used by P-Surge (RMWP) and the RMW derived from angular momentum principles (CHVS) outperform both the HAFS-A and HAFS-B models. For stronger tropical cyclones, the HAFS RMW forecasts are comparable to RCLP. The statistically derived RMW forecasts are skillful relative to climatology and persistence and provide the best forecasts of RMW, although with a negative bias. The competing biases between the dynamical and statistical models suggest that a consensus technique should be investigated to further improve RMW forecasts. It should be noted that although the RMW forecasts for the HAFS models were not skillful, this does not imply that forecasts of the 34-, 50-, and 64-kt wind radii are not skillful, which would require a separate verification.
Forecasting the RMW depends on both the track and intensity forecasts, which could explain some of the differences between the statistical and dynamical forecasts of the RMW. However, both the track and intensity forecasts from HAFS and OFCL are better than OCD5 and cannot explain the full skill differential. The RMWP and CHVS models are trained directly on best-track RMW estimates, which may explain some of the better performance relative to HAFS. The lower errors of RMWP compared to CHVS are likely due to more structural information in the model, which uses the non-zero average 50 and 64 kt wind radii when available.
The initial RMW bias in HAFS could be due to the vortex data assimilation, which will be improved in future versions of HAFS. A source of errors for the HAFS RMW forecasts could be related to how the GFDL vortex tracker handles storms near islands and weaker storms that are not well defined. Although we have limited the sample to cases that verified over the ocean, a broadening of the wind field is possible due to the complex terrain in the Caribbean, and could partially explain the negative bias in the statistical models as evident in the forecasts from Hurricane Laura. The terrain impacts could also explain some of the larger errors in the HAFS forecasts for Hurricane Laura, and work is ongoing to explain and rectify the inconsistent forecasts shown in Figure 3c.
A major reason why RMW forecasts are not typically examined is the observational uncertainty in the estimates. The current best-track RMW estimates, which incorporate multiple observing platforms, are the best available data set to conduct this verification, although there are still likely biases in the estimates. In a limited comparison between high-fidelity observations from synthetic aperture radar (SAR) and the best-track observations, it was shown that a high bias may exist within the operational working best-track RMWs utilized (Combot et al., 2020). As we continue to get more SAR overpasses and a more robust data set of quality-controlled RMW estimates, SAR observations could be used for verification.
Additional guidance on tropical cyclone structure, both observations and forecasts, remains important for future progress in hazard forecasting. A 2-D wind field would be the most beneficial for hazards forecasting, but many applications related to hazard forecasting still rely on an accurate estimate and forecast of the RMW. The statistical methods for deriving the RMW are currently the most skillful despite the inability to forecast complex interactions with land, eyewall replacement cycles, and storm asymmetries. New techniques, such as postprocessing methods that would better characterize storm structure from model output and more complex artificial intelligence methods, should be leveraged to help improve forecasting of the RMW.

Figure 1. Homogeneous verification of 2020-2022 North Atlantic forecasts of the radius of maximum wind (RMW) for HFAI, HFBI, RMWP, CHVS, and RCLP. The mean absolute error (MAE) is shown in solid lines and the bias is shown with dashed lines in (a). The vertical bars on the MAE lines are the standard error. The skill of each of the models relative to RCLP is shown in (b) with the zero skill line denoted by the black dashed line. Note that skill is not calculated at the initialization time because all models have the same initial RMW from the interpolation routine.

Figure 2. Same as Figure 1 but for only tropical cyclones with an intensity ≥65 kt at forecast initialization.

Figure 3. All track (a), intensity (b), and radius of maximum wind (RMW) (c) forecasts from HFBI (green) and OFCL/RMWP (magenta) for Hurricane Laura (2020). The black line is the best track with 6-hourly positions denoted with black circles. The Saffir-Simpson wind speed intensity categories are denoted with alternating gray shading in (b).