An Assessment of Southern Hemisphere Extratropical Cyclones in ERA5 Using WindSat

ERA5 reanalysis output is compared to WindSat polarimetric microwave radiometer measurements for Southern Hemisphere midlatitude to high‐latitude cyclones between 2003 and 2019. WindSat provides independent measures of low‐level wind speed, total column water vapor (TCWV), cloud liquid water (CLW), and precipitation, which are not assimilated into ERA5. We implement a tracking scheme to identify cyclone centers, before using cyclone composites to match concurrent data in ERA5 and WindSat. We find ERA5 and WindSat show comparable spatial structures for all variables, although their distributions show poorer agreement for CLW and precipitation. Compared to WindSat, ERA5 underestimates TCWV by up to 5% and CLW by up to 40%. ERA5 underestimates precipitation in the warm sector by up to 15%, but overestimates in the cold sector by up to 60%. Similar biases in ERA5 are seen compared to Advanced Microwave Scanning Radiometer for EOS (AMSR‐E) data, even though AMSR‐E radiances are assimilated into ERA5. Comparing ERA5 and WindSat across the cyclone lifecycle, strong spatial correlation is seen as the cyclone deepens and reaches peak intensity, before slightly declining as the cyclone decays. In the cold sector ERA5 shows an underestimation of CLW, yet overestimates precipitation at all lifecycle stages. However, in the warm sector precipitation is underestimated. This potentially suggests biases within the ERA5 parameterizations of cloud and precipitation causing a disconnect between the two. Despite this, ERA5 shows strong correlation with WindSat and determines cyclone structure well across the cyclone lifecycle, showing its value for use in cyclone compositing analysis.

, Hawcroft et al. (2012), and Utsumi et al. (2017) have shown that up to 90% of precipitation in the midlatitude storm tracks is associated with cyclones and their associated fronts. Catto et al. (2012) also show that more precipitation is associated with fronts in the Southern Hemisphere than in the Northern Hemisphere. Pfahl and Wernli (2012) identified that a high percentage of precipitation extremes (up to 80%) are directly related to cyclones. Utsumi et al. (2017) also show that large amounts of extreme precipitation in midlatitude regions are associated with cyclones. When analyzing cyclones over the US West Coast, Zhang et al. (2019) found that 45% of cyclones have an associated atmospheric river, which can enhance the precipitation and latent heat release and contribute to the deepening of the cyclone. More recently, McErlich et al. (2023) have shown that relationships between the frequency and intensity distribution of precipitation are also strongly controlled by the presence of large-scale precipitation processes, such as cyclones.
Satellite and ground-based observations are invaluable tools for the analysis of cyclone structure, and these different sources of data have led to the development of several competing conceptual models (e.g., Browning, 1997; Carlson, 1980; Semple, 2003). Semple (2003) demonstrates how these conceptual models can be used at each phase in the cyclone lifecycle to provide a description of the physical processes occurring within the system and the range of evolution pathways. However, the lack of generality of case studies means they cannot easily be used to evaluate conceptual or numerical models (Jakob, 2003). Another assessment method uses a cyclone-centered compositing methodology to create average information from a large number of cyclones. Aggregating atmospheric features over a large data set allows a statistical measure of a model's ability to represent the large-scale dynamical processes and air flows, as well as their influence on moisture around these systems.
Many studies have used reanalysis data sets to study the structure and evolution of cyclones. Reanalyses assimilate observational data into a dynamical model framework, which can cause issues in the representation of atmospheric variables such as precipitation (Herold et al., 2016). However, they have good spatial and temporal coverage, which is especially useful over the Southern Ocean where observational data sets are sparse (GCOS, 2021). The verification of model data using reanalysis products has some clear limitations, as precipitation in reanalysis data sets is strongly dependent upon the parameterization in the underlying model (Catto et al., 2010). As a result, significant deficiencies are apparent when compared with observations, even in the most recent analyses. For example, Naud et al. (2020) investigated cyclonic precipitation in reanalyses and models compared to IMERG satellite retrievals. They found that ERA-Interim and MERRA-2 overestimate precipitation in the dry sector of the cyclone and underestimate precipitation in the warm sector. They also note, however, that the IMERG observational data set might exaggerate precipitation rates in vigorously ascending regions. Recent work by Lavers et al. (2022) evaluated ERA5 precipitation biases globally and found that the smallest errors relative to station data occur over the midlatitudes in winter. These small random errors are attributed to winter precipitation at these latitudes being commonly produced by extratropical cyclones, and to these processes being well resolved in ERA5.
This study provides a quantitative assessment of the quality of ERA5 in characterizing extratropical cyclones in the Southern Hemisphere over midlatitudes to high latitudes (30°S-90°S), relative to the WindSat satellite data set. Most of these cyclones are located over the Southern Ocean (see Figure 1). We compare output from the ERA5 reanalysis with WindSat data over the ocean to identify the similarities and differences in the cyclone characteristics between these two products. WindSat is not assimilated into the ERA5 reanalysis and therefore provides an independent analysis of the quality of ERA5 over a wide range of geophysical variables. The focus on a single satellite instrument means that sampling differences associated with using multiple satellite instruments are also removed. Where possible we supplement this analysis with the Advanced Microwave Scanning Radiometer for EOS (AMSR-E) data set, but this does not provide an independent comparison with ERA5 because AMSR-E data have been assimilated into ERA5 (as highlighted in Hersbach et al. (2020)). It does, however, allow us to examine the quality of the WindSat data set.
We focus our attention over the Southern Ocean due to the many well-established issues with the representation of cloud and precipitation in models over this region. Known issues of model representation over the Southern Ocean include too little cloud cover (e.g., Bodas-Salcedo et al., 2012; Kuma et al., 2020; McErlich et al., 2021; Schuddeboom et al., 2018), excessive sunlight absorbed by the ocean surface (e.g., Hyder et al., 2018; Trenberth & Fasullo, 2010), a lack of clouds in the cold sectors of cyclones (e.g., Bodas-Salcedo et al., 2014), a lack of reflective supercooled water clouds (e.g., Bodas-Salcedo et al., 2016; Kuma et al., 2020), and an overestimation of the frequency and underestimation of the intensity of precipitation associated with fronts (Catto et al., 2015; Priestley et al., 2020). Beadling et al. (2020) also showed that warm-biased sea surface temperatures over the Southern Ocean still exist in CMIP6 models, which also affects the position of cyclone tracks (Priestley et al., 2020). Many of these model biases are not independent, such as shortwave radiative biases over the Southern Ocean forming from an underestimation of cloud within the models.

ERA5
We use output from the ERA5 reanalysis (Hersbach et al., 2020), obtained from the Copernicus Climate Change Service (C3S, 2017). ERA5 is available on a 0.25° latitude/longitude grid and is utilized to examine cyclones over the Southern Hemisphere for the years 2003-2019 inclusive. This period is chosen to match the available period for WindSat observations. Work detailed in McDonald and Cairns (2020) shows that ERA5 is consistent with a number of other reanalyses over the satellite era with little variation over that period, hence this period should be representative of reanalyses in general. While it has already been established that there are disparities between reanalysis and observational satellite-based and gauge-based data sets when it comes to estimates of precipitation, recent work has shown that ERA5 represents the structure of global precipitation distributions well (McErlich et al., 2023). ERA5 output is available at hourly temporal resolution, but three-hourly data were used in this study. ERA5 output for the 10 m u-component of wind, 10 m v-component of wind, total column water vapor (TCWV), total column cloud liquid water (CLW), and mean total precipitation rate (MTPR) is used.
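As an illustration of how the scalar 10 m wind speed relates to the two ERA5 wind components, and of one simple way to reduce hourly output to three-hourly steps, a minimal Python sketch is given below. The function names are hypothetical, and taking every third hourly field is an assumption; the text does not state how the three-hourly data were obtained.

```python
import math


def wind_speed_10m(u10: float, v10: float) -> float:
    """Scalar 10 m wind speed (m/s) from the zonal (u10) and meridional (v10) components."""
    return math.hypot(u10, v10)


def three_hourly(hourly_fields: list) -> list:
    """Keep every third hourly field (00, 03, 06, ... hr).

    Assumption: simple subsampling rather than averaging.
    """
    return hourly_fields[::3]
```

For example, components of 3 m/s and 4 m/s give a wind speed of 5 m/s.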
ERA-Interim has been used in a number of studies that focused on extratropical cyclones (e.g., Hodges et al., 2011; Naud et al., 2014, 2020). However, only a small number of cyclone-related studies (Priestley et al., 2020, 2022) have used ERA5 thus far. Even fewer of these studies make use of the cyclone compositing methodology (Priestley & Catto, 2022). Significant work has already identified the utility of previous reanalyses, such as that in Hoskins and Hodges (2005), which used the 40-year ECMWF reanalysis (ERA40) data to perform a detailed analysis of the Southern Hemisphere storm tracks. Given that ERA5 is a next-generation reanalysis with an even higher spatial resolution than the reanalyses used in these previous studies, it is likely to be suitable for cyclone compositing.

WindSat
WindSat (Meissner & Wentz, 2009) is a multifrequency polarimetric microwave radiometer developed by the US Naval Research Laboratory for the National Polar-orbiting Operational Environmental Satellite System Integrated Program Office. WindSat was designed to demonstrate the capability of polarimetric microwave radiometry to measure the ocean surface wind vector from space and was launched on the Coriolis satellite on 6 January 2003 (Gaiser et al., 2004). This radiometer operates at five discrete frequencies (6.8, 10.7, 18.7, 23.8, and 37.0 GHz); all are fully polarimetric except the 6.8 and 23.8 GHz channels, which have only dual polarization. WindSat operates in a near-polar orbit, defining a set of varying swaths on the Earth's surface, which means that the WindSat sampling pattern is not continuous. This leads to a set of gridded swaths with spatial resolutions between 39 by 41 km and 8 by 13 km depending on the frequency of the radiometer channel. Despite a scheduled 3-year lifetime, WindSat continued to provide brightness temperature measurements of the ocean surface up until October 2020. The sampling of WindSat is densest toward high latitudes and midlatitudes, which means that this instrument is well suited to examining cyclones over the Southern Ocean, and its long atmospheric record allows for a valuable comparison with ERA5.
Calibrated WindSat products are available from Remote Sensing Systems, and we use the v7.0.1 WindSat products. Specifically, the fields of all-weather 10-m wind speed, columnar atmospheric water vapor, columnar CLW content, and rain rate are used in this study. The WindSat product uses a consistent processing scheme and a robust radiative transfer model, which allows the intercalibration of the WindSat data with other microwave radiometers collecting brightness temperatures over the ocean. Details about the retrievals used in these products, which are only performed over the oceans, are available in Gaiser et al. (2004) and Meissner and Wentz (2009). WindSat retrievals use measurements at C-band and X-band frequencies coupled with a statistical algorithm to retrieve wind speeds in all weather conditions, a capability unique to WindSat. In their work, they noted that since the model function and the retrieval algorithms are empirical, the satellite wind measurement accuracy has been quantified over a wide range of atmospheric conditions. Freilich and Vanhoff (2006) identify that changes in atmospheric water vapor and liquid water, small-scale ocean surface roughness and foam (influenced primarily by winds), sea-surface temperature variations, and the presence of rain all cause variations in the WindSat measurements. As different geophysical processes have different influences at differing frequencies and polarizations, WindSat observations can be used to estimate other geophysical quantities. The secondary set of data products calculated are: rain rate, column integrated CLW, TCWV, and sea surface temperature.
The work of Freilich and Vanhoff (2006) validates a WindSat vector wind retrieval by comparing the WindSat estimates with observations from both meteorological buoys and the QuikSCAT scatterometer. The retrieval scheme utilized in Freilich and Vanhoff (2006) displays unbiased wind speeds relative to buoy measurements for speeds between 5 and 15 m s⁻¹ outside periods of rainfall. However, the comparison suggests that the WindSat retrieval overestimates wind speed at high buoy speeds. Comparison of the WindSat and QuikSCAT wind distributions displays nearly identical means and standard deviations, providing confidence in the capability of WindSat. Meissner et al. (2011) compared WindSat wind measurements with buoys and demonstrated a small bias of 0.04 m s⁻¹ during no-rain conditions. More recently, work detailed in Zheng et al. (2019) compares WindSat radiometer measurements with scatterometer-derived wind speeds and directions, finding near-zero bias and small random errors between the two data sets.

AMSR-E
In addition to the ERA5 and WindSat data sets, this study also uses data from the AMSR-E onboard the polar-orbiting Aqua satellite. AMSR-E measures the microwave emission at six frequencies ranging from 6.9 to 89 GHz, with both vertical and horizontal polarization at all frequencies (Kawanishi et al., 2003). In particular, we use version seven of the columnar atmospheric water vapor, columnar CLW content, and rain rate AMSR-E products available from Remote Sensing Systems. AMSR-E data are available between 2003 and 2011, or just over half of the observational period of WindSat. AMSR-E measurements of brightness temperature are assimilated into ERA5 (Hersbach et al., 2020), so it is not an independent data set. However, AMSR-E still provides useful insight into the quality of WindSat data.

Cyclone Tracking and Compositing Methodology
The cyclone tracking algorithm used in this study was detailed by Crawford and Serreze (2016) and has subsequently been used in a number of further studies (e.g., Crawford & Serreze, 2017; Crawford et al., 2020; Hell et al., 2020; Koyama et al., 2017). The algorithm uses sea level pressure information rather than 850 hPa vorticity. However, results are expected to be similar (Hoskins & Hodges, 2005; Neu et al., 2013; Simmonds & Rudeva, 2014), though it has been demonstrated that using the relative vorticity field potentially allows the identification of smaller scale cyclones earlier in their development (Hoskins & Hodges, 2005; Ulbrich et al., 2009). A detailed explanation of the cyclone tracking algorithm used can be found in Crawford and Serreze (2016), but the main steps are briefly detailed here.
ERA5 mean sea level pressure (MSLP) information is first reprojected from the ERA5 latitude/longitude grid to a 50 km Equal-Area Scalable Earth Grid (EASE-Grid) in the Southern Hemisphere (Brodzik et al., 2012, 2014), centered over the South Pole. Cyclone centers were then identified between 2003 and 2019 with a temporal resolution of 3 hr. Existing research from Crawford et al. (2021) suggests that applying the cyclone tracking to MSLP data with a resolution shorter than 3 hr can lead to unrealistic splitting of the cyclone tracks, hence our decision to use ERA5 data at this resolution. The cyclone tracking algorithm determines local minima in the MSLP field and analyses the corresponding pressure gradient. A radii-based threshold is used to identify whether a minimum is a closed low-pressure system and thus characterizes a cyclone. A maximum propagation speed of 150 km/hr is used to identify related low-pressure centers and combine them into continuous cyclone tracks. A maximum elevation of 500 m was used to make a mask such that cyclone centers identified above this height were ignored. Further criteria rejecting systems that have a lifespan shorter than 24 hr or a track length less than 100 km are also applied. We also restrict cyclone tracks to those that spend some part of their lifetime at latitudes south of 30°S.
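The track-level rejection criteria in the final step can be sketched as follows. This is a minimal illustration and not the Crawford and Serreze (2016) implementation: the track representation (a time-ordered list of `(time_hr, lat, lon)` tuples), the haversine distance on a sphere of radius 6,371 km, and all function names are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def keep_track(track, min_life_hr=24.0, min_length_km=100.0, lat_limit=-30.0) -> bool:
    """Apply the lifespan, track-length, and latitude criteria to one track.

    `track` is a time-ordered list of (time_hr, lat_deg, lon_deg) tuples
    (hypothetical layout chosen for this sketch).
    """
    lifespan_hr = track[-1][0] - track[0][0]
    length_km = sum(
        great_circle_km(a[1], a[2], b[1], b[2]) for a, b in zip(track, track[1:])
    )
    reaches_south = any(lat < lat_limit for _, lat, _ in track)
    return lifespan_hr >= min_life_hr and length_km >= min_length_km and reaches_south
```

The thresholds default to the values quoted in the text (24 hr, 100 km, 30°S); the 150 km/hr linking speed and the 500 m elevation mask belong to the detection and linking stages and are not reproduced here.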
In order to assess the suitability of the cyclone tracking scheme over the Southern Hemisphere, and check ERA5's tracking capabilities, Figure 1a shows the track density over the defined domain. The track density is defined as the number of monthly cyclone tracks passing through a 500 km by 500 km area centered on each grid point. The highest density of cyclones is located around the Antarctic coastline. This pattern matches well with previous Southern Hemisphere cyclone track climatologies (Bengtsson et al., 2006; Hodges et al., 2011; Hoskins & Hodges, 2005).
Output from the cyclone tracking algorithm was used to transform a range of ERA5 data into a cyclone-centered coordinate system in the form of cyclone composites. The compositing process followed a similar methodology to that described in Catto et al. (2010). First, the locations of the cyclone center were identified using the tracking algorithm, to be used as the origin of the cyclone-centered coordinate system. Data were extracted in a radius centered on each cyclone across the period of analysis. Due to the changing longitudinal extent of the cyclones as a function of latitude, the composite field was derived in polar coordinates, then interpolated onto a higher resolution polar coordinate grid to allow for smooth sampling across composites. Finally, individual composites are rotated so that the direction of propagation of the cyclone is eastward. Given the zonal westerly winds over the Southern Ocean, many cyclones require little rotation. This step approximately aligns the position of the warm/cold fronts and the area of warm, moist air associated with them. While not all fronts will be at the same position relative to the direction of the cyclone, this rotation acts to focus the structure of the composite (Govekar et al., 2011).
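The rotation step can be illustrated in cyclone-centered polar coordinates, where rotating a sample reduces to subtracting the propagation heading from its azimuth. The sketch below is a hedged illustration only: the angle convention (radians, counterclockwise from east), the use of the displacement between consecutive track points to estimate the heading, and the function names are assumptions, since the text does not specify these details.

```python
import math


def propagation_heading(dx_km: float, dy_km: float) -> float:
    """Heading of cyclone motion in radians, counterclockwise from east.

    dx_km and dy_km are the eastward and northward displacements of the
    cyclone center between two time steps (assumed inputs).
    """
    return math.atan2(dy_km, dx_km)


def rotate_to_eastward(r_km: float, theta: float, heading: float):
    """Rotate one polar sample (r, theta) so the propagation direction maps to theta = 0 (east)."""
    return r_km, (theta - heading) % (2 * math.pi)
```

Under this convention, a cyclone already moving due east has a heading of zero and its composite is left unchanged, consistent with the remark that many Southern Ocean cyclones require little rotation.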
Cyclone composites are derived over a circle of radius 2,000 km. This radius is commonly used within previous work (e.g., Booth et al., 2018; Field & Wood, 2007; Field et al., 2008; Naud et al., 2012), although some studies use smaller radii (e.g., Catto et al., 2010; Flaounas et al., 2015; Naud et al., 2020; Sinclair et al., 2020). Some studies use a slightly larger but comparable 20° region surrounding the cyclone (e.g., Bengtsson et al., 2009; Priestley & Catto, 2022). To assess the suitability of the compositing radius, Figure 1b displays the cumulative frequency of maximum cyclone radius observed across all cyclone tracks. The mean value for the distribution is 1,000 km, while the 99.9th percentile value for the distribution is approximately 2,700 km. Therefore, setting the compositing radius at 2,000 km means that greater than 95% of cyclones will be fully represented in the compositing scheme across all stages of their lifecycle.
During the compositing process, WindSat data are only included based on two conditions. First, observations are only included if they occur within 1 hr of the time defined by the ERA5 reanalysis. Second, only data that are also within a 2,000 km radius of the cyclone center are utilized. Because of the nonuniform distribution of the WindSat swath, not all cyclone centers have corresponding WindSat data in the entire 2,000 km composite radius. ERA5 reanalysis output is composited using the same method and only included in the composite when corresponding WindSat data are available. Thus, we effectively use the presence of WindSat data to create a mask to reduce sampling biases. The same procedure is also completed to match the ERA5 and AMSR-E cyclone composites. As WindSat only produces retrievals over the ocean, all ERA5 output over land will effectively be masked out.
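The two inclusion conditions and the resulting mask can be sketched as a simple pairing routine. The data layout below (parallel lists, with `None` marking grid points that have no WindSat retrieval) and the function name are hypothetical and serve only to illustrate the logic.

```python
def collocate(era5, windsat, dt_hr, dist_km, max_dt_hr=1.0, max_radius_km=2000.0):
    """Return (ERA5, WindSat) pairs that satisfy both inclusion conditions.

    era5, windsat : values at the same composite grid points; windsat is None
                    where there is no retrieval (outside the swath or over land).
    dt_hr         : WindSat observation time minus ERA5 time, in hours.
    dist_km       : distance of each grid point from the cyclone center.
    """
    pairs = []
    for e, w, dt, d in zip(era5, windsat, dt_hr, dist_km):
        if w is None:
            continue  # no WindSat data, so the ERA5 value is masked out as well
        if abs(dt) > max_dt_hr or d > max_radius_km:
            continue  # fails the 1 hr or 2,000 km condition
        pairs.append((e, w))
    return pairs
```

Because ERA5 values are dropped wherever WindSat is missing, both composites are built from the same sample of points, which is the sense in which the WindSat coverage acts as a mask.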

Analysis of Cyclone Lifecycle
In this study, cyclones are initially composited over all stages of development. The resulting composites cannot be expected to display characteristics of the well-known development stages. To gain a greater understanding of the differences between ERA5 and WindSat fields, we partition the cyclones by their development phase relative to the time of maximum depth of the cyclone. Here depth is defined as the difference between the edge pressure and central pressure of the cyclone. The edge pressure of the cyclone is determined using the last closed isobar around the cyclone center. In order to partition the cyclones into periods of deepening, peak intensity, and decay, a criterion based on the deepening rate (DpDt, scaled by latitude) was also assessed. Cyclone tracks were only kept if the deepening rate changed from positive to negative around the point of peak intensity.
For each cyclone track that passed this criterion, three periods were defined. The period of peak intensity was defined as 6 hr either side of the time of maximum depth. The period of deepening was defined as measurements between 6 and 18 hr before the time of maximum depth. The period of decay was defined as measurements between 6 and 18 hr after the time of maximum depth. Tracks without measurements 18 hr before and after the point of peak intensity were rejected, so that only cyclones with a lifespan of at least 36 hr are considered. Different periods were investigated, but 12 hr was chosen to ensure a large proportion of tracks were not removed, while still filtering out cyclones without clear deepening and decay periods. Even after the application of these criteria our composite analysis still includes over 35,000 cyclone tracks.
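A minimal sketch of this partitioning is given below. It computes depth as edge pressure minus central pressure, locates the time of maximum depth, rejects tracks that do not extend 18 hr either side of it, and labels each time step. Two points are assumptions rather than statements from the text: time steps exactly 6 hr from the peak are assigned to the peak-intensity period, and the deepening-rate sign-change criterion is taken to have been applied beforehand. All names are hypothetical.

```python
def lifecycle_stage(hours_from_peak: float):
    """Label one time step by its offset (hr) from the time of maximum depth."""
    if abs(hours_from_peak) <= 6:  # assumption: the +/-6 hr boundary belongs to the peak period
        return "peak"
    if -18 <= hours_from_peak < -6:
        return "deepening"
    if 6 < hours_from_peak <= 18:
        return "decay"
    return None  # outside the three analysis periods


def label_track(times_hr, edge_hpa, central_hpa):
    """Return a stage label per time step, or None if the track is rejected."""
    depth = [e - c for e, c in zip(edge_hpa, central_hpa)]
    i_peak = max(range(len(depth)), key=depth.__getitem__)
    t_peak = times_hr[i_peak]
    # Reject tracks lacking data 18 hr before and after peak intensity.
    if times_hr[0] > t_peak - 18 or times_hr[-1] < t_peak + 18:
        return None
    return [lifecycle_stage(t - t_peak) for t in times_hr]
```

With three-hourly data, a track spanning exactly 36 hr and peaking at its midpoint yields four deepening steps, five peak steps, and four decay steps under these boundary choices.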

Comparison of Mean ERA5 and WindSat Fields
Figure 2 displays cyclone-centered composites of 10 m horizontal winds (UV10), TCWV, CLW, and MTPR for both the ERA5 and WindSat data. Figure 2 also displays the difference between the two data sets, defined as ERA5 minus WindSat. Cyclones have been tracked over the Southern Hemisphere, so the top of the composite corresponds to the equatorward sector. Similarly, the bottom corresponds to the poleward sector. Because of the rotation applied to the cyclone composites, the top of the composites may not align with north, so cardinal directions are not used to describe cyclone features.
Figure 2a shows that ERA5 UV10 winds display an axially asymmetric wind structure with the strongest winds above the cyclone center in the upper left quadrant. The lowest winds are also close to the cyclone center in the lower right quadrant. Field and Wood (2007) indicate that the clearly defined "eye" at the center of the cyclone in their analysis highlights the quality of the reanalysis-derived cyclone locations and that the compositing methodology is working in their study. The clear "eye" in our analysis therefore highlights the quality of the ERA5-derived cyclone positions and corresponding composites. Figure 2b shows WindSat 10 m winds, which also display an axially asymmetric wind structure with similar features to those in the ERA5 reanalysis. Inspection of the differences in Figure 2c shows that ERA5 displays smaller 10 m wind speeds compared to WindSat across nearly the entire composite, with the largest differences of up to 40% occurring around the cyclone center. The wind vectors in Figures 2a and 2b show only slight changes in direction across the composite between the ERA5 and WindSat data sets. When investigating the wind speed distributions of the zonal and meridional components separately (Figure S1 in Supporting Information S1), WindSat displays a bimodal structure which is less pronounced in the ERA5 output. Assessing each quadrant individually shows the differences between ERA5 and WindSat are largest in the left quadrant of the cyclone.
The cyclone composite of ERA5 TCWV (Figure 2d) displays the expected contrast in TCWV between the dry poleward (bottom) and moist equatorward (top) portions of the cyclone. In particular, the pattern displays a tongue of dry air wrapped around the left flank of the cyclone which extends above the low-pressure center into the upper left quadrant. Correspondingly, a warm moist tongue is observed to the right of the cyclone extending from the upper right quadrant toward the bottom of the cyclone. This distribution of TCWV is consistent with previous analyses (Field & Wood, 2007; Naud et al., 2012, 2014), which display the contrast in humidity between the dry poleward and moist equatorward portions of the cyclone. For example, equivalent potential temperature composites shown in Catto et al. (2010) display a very similar pattern. Figure 2e shows WindSat TCWV composites are structurally similar to the patterns observed in ERA5, although Figure 2f shows that ERA5 has slightly lower TCWV across the entire composite (up to 5% relative difference). The largest differences occur in the poleward half of the composite, suggesting that the high water-carrying capacity of the warm sector of the cyclone is well captured by ERA5. In particular, ERA5 shows lower values of TCWV in the dry tongue located in the lower left quadrant of the cyclone composite. The two data sets also show differences directly right of the cyclone center, where the moist TCWV tongue in WindSat extends further poleward than in ERA5.
Figure 2g shows ERA5 CLW has a clear comma cloud structure, as identified in conceptual models (see Semple, 2003), with the tail of the comma in the upper right quadrant of the composite. Govekar et al. (2014) directly linked the three-dimensional distribution of clouds with the dynamics of a composite cyclone and quantified the relationships between them. In particular, they identified the distinct comma structure's similarity to the vertical motion field derived from reanalysis. Figure S2 in Supporting Information S1 shows that ERA5 vertical velocity matches the shape of the comma cloud, agreeing with the previous work detailed in Govekar et al. (2014). Maximum CLW values are observed on the tip of the spiral structure in CLW in Figure 2g. These features are likely related to the warm conveyor belt (WCB), a stream of warm moist air that originates at low levels in the warm sector and travels parallel to the cold front (Harrold, 1973). When it reaches the surface warm front, the WCB rises rapidly along moist isentropes. As this warm air ascends, it forms the frontal cloud and the cloud head. WindSat CLW (Figure 2h) displays the same comma-like structure as observed in the ERA5 output. Differences between the ERA5 and WindSat composites show lower CLW values in ERA5 across the entire composite (see Figure 2i). While a difference of up to 30% exists within the high CLW comma structure, the greatest relative difference (up to 40%) lies within the drier lower left quadrant where CLW values are lower.
ERA5 cyclone composites of MTPR in Figure 2j show that the spatial pattern of the rain rate is similar to the CLW pattern displayed in Figure 2g, as might be expected. The rain rate therefore also displays a comma structure to the right of the cyclone center with the tail of the comma extending into the upper right quadrant, a feature also seen by WindSat (Figure 2k). A comparison between ERA5 and WindSat in Figure 2l shows the largest differences of up to 60% occur left of the cyclone center, where ERA5 has greater rain rates. This pattern may occur because the rain rate is greater in the poleward side of the composite, but also because the peak precipitation rate occurs further toward the left in ERA5 than in WindSat. This difference in the location of the comma cloud also produces a region in the upper right quadrant of the cyclone where WindSat has slightly greater rain rates than ERA5, with values up to 15% larger. The pattern shows an asymmetric difference in wet and dry regions similar to those identified in Naud et al. (2020) between ERA-Interim and IMERG. Field and Wood (2007) have previously identified a broad correlation of the rain rate with the moist water vapor tongue, which they suggest represents the position of the warm conveyor belt, confirming that most of the rainfall is associated with this feature. We observe a similar relationship in the ERA5 output and WindSat observations. Notably, the difference seen between ERA5 and WindSat in the upper right quadrant of Figure 2l matches well with the position of the moist water vapor tongue seen in Figures 2d and 2e.
Thus far, we have not made any assumptions about whether the structures represented in ERA5 or WindSat are more representative of reality. In order to provide a further reference point, we examine a second satellite data set, AMSR-E. Figure 3 compares ERA5 output and AMSR-E data relative to the cyclone center for TCWV, CLW, and MTPR. Due to differences in the AMSR-E and WindSat/ERA5 wind speed products, the two were not compared. We use the WindSat WSPD_AW product, derived using all channels and three separate algorithms to obtain winds in all weather conditions, which are not determined in AMSR-E.
Figures 3a and 3b show that TCWV displays similar structure for ERA5 and AMSR-E. Figure 3c shows that ERA5 has consistently lower TCWV across the entire composite, with differences of up to 7%. This is a near identical pattern to the differences seen in Figure 2f, where the biggest differences lie in the poleward half of the composite. These differences are seen despite AMSR-E data being assimilated into ERA5 (Hersbach et al., 2020). Figures 3d-3f for CLW are also consistent with the patterns observed between ERA5 and WindSat (Figure 2i). For the MTPR, Figure 3i shows increased precipitation compared to ERA5 as seen in Figure 2l for WindSat. However, the upper right quadrant where ERA5 shows greater precipitation compared to AMSR-E is far weaker than that seen in Figure 2l for WindSat. Overall, these results suggest the two satellite products are consistent with each other, which might be expected given that they are derived using similar retrieval schemes and work on similar principles. We therefore suggest that ERA5 displays small to medium-sized biases compared to observations, where it tends to underestimate the amount of moisture, yet overestimate precipitation in the drier sections of the cyclones. These biases are suspected to be driven by parameterizations of cloud within ERA5. Mülmenstädt et al. (2021) highlight that warm rain processes are still a problem in many NWP and climate models, associated with changes in cloud lifetime. Given that many models have recently updated their parameterizations to include more supercooled liquid water in this region, these errors may partially explain the pattern observed (Schuddeboom & McDonald, 2021). The cold sector of the cyclone having far too much precipitation in ERA5 might also be explained by the "dreary state" of precipitation in models as summarized by Stephens et al. (2010), whereby, in order to produce the same total precipitation as observations, models were producing light precipitation at approximately twice the frequency.

Variability of Fields
In addition to inspecting the mean values in the ERA5 output and the WindSat data for similarities and differences, and examining the zonal and meridional distributions of wind (Figure S1 in Supporting Information S1), we compare the joint distributions of the two data sets. Figure 4 displays a kernel density plot of ERA5 output against WindSat estimates for UV10, TCWV, CLW, and MTPR. Due to the large amount of data across the 35,000 cyclone tracks sampled, every fifth point is used within these calculations. Regions in the kernel density estimate are colored based on occurrence, and a minimum density of 10 points per pixel is used to remove infrequently occurring combinations of ERA5 and WindSat data. Note that the density of points is displayed on a logarithmic scale.
Figure 4a indicates good agreement between ERA5 and WindSat for UV10, with a correlation coefficient of r = 0.91 and a gradient of m = 0.94. However, at high UV10 WindSat displays much greater wind speeds than those observed for ERA5. This potentially suggests issues with ERA5 at high wind speeds or an error in the WindSat retrieval at high wind speeds. Figure 4b displays ERA5 TCWV output against WindSat estimates. Very strong agreement is seen for the entire range of TCWV values, with a correlation coefficient of r = 0.99 and gradient of m = 0.97. Examination of the higher TCWV values suggests an upper limit for WindSat TCWV in its retrievals. Overall, the underlying distributions of UV10 and TCWV show small differences between ERA5 and WindSat which are comparable to the results displayed in the average patterns in Figures 2a-2f.
The values for ERA5 and WindSat CLW displayed in Figure 4c show far less agreement than for UV10 and TCWV. A much broader distribution is observed, with a correlation coefficient of r = 0.62 and gradient of m = 0.49. This agrees with results observed in Figures 2g-2i, where WindSat displays greater average CLW than ERA5. ERA5 output for MTPR plotted in Figure 4d displays a very broad distribution of values between ERA5 and WindSat, with a correlation coefficient of r = 0.49 and gradient of m = 0.35. High MTPR values as seen by WindSat tend to show greater MTPR than the corresponding ERA5 output. This agrees with the pattern observed in the wetter sector of the cyclone in Figures 2j-2l. Interestingly, a large number of cases where the WindSat measurement does not detect precipitation have corresponding ERA5 outputs that include precipitation. This suggests that ERA5 is overestimating precipitation, and agrees with the pattern observed in the drier sector of the cyclone in Figures 2j-2l. This results in the linear least squares regression displaying a gradient of m = 0.35, indicating that overall ERA5 output has lower MTPR than the corresponding WindSat measurement.
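For readers wishing to reproduce the statistics quoted for Figure 4, the correlation coefficient (r), gradient (m), and intercept (c) of a linear least squares fit of ERA5 against WindSat can be computed along the following lines. This is a sketch under stated assumptions: the function name `fit_line` and the flat input arrays are hypothetical, and WindSat is taken as the independent variable, as implied by the description of the gradient.

```python
import numpy as np

def fit_line(windsat, era5):
    """Return (r, m, c) for a least squares fit era5 = m * windsat + c."""
    x = np.asarray(windsat, dtype=float).ravel()
    y = np.asarray(era5, dtype=float).ravel()

    # Use only pairs where both data sets have valid values.
    ok = np.isfinite(x) & np.isfinite(y)
    x, y = x[ok], y[ok]

    # Degree-1 polynomial fit gives the gradient and intercept.
    m, c = np.polyfit(x, y, 1)

    # Pearson correlation coefficient of the paired samples.
    r = np.corrcoef(x, y)[0, 1]
    return float(r), float(m), float(c)
```

A gradient below 1 can arise both from ERA5 values that are lower than WindSat at the high end and from nonzero ERA5 values where WindSat reports zero, since either effect flattens the fitted line.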
Additionally, Figure S3 in Supporting Information S1 displays a normalized root-mean-square error (RMSE) analysis within the composite framework. The normalized RMSE in Figure S3 in Supporting Information S1 is low for UV10 and TCWV, with values up to 20% and 7%, respectively. For CLW the normalized RMSE is generally much larger (above 20%), and reaches a maximum near 60% inside the region of highest CLW identified in Figure 2. While the normalized RMSE for MTPR is generally below 20% across the composite, it reaches 80% in the regions associated with the largest rainfall identified in Figure 2. This indicates that the uncertainty between ERA5 and WindSat is significantly larger for precipitation, and further emphasizes the presence of parameterization biases in ERA5.
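A normalized RMSE of this kind could be computed per composite grid point as sketched below. The normalization is not specified in the main text, so this example assumes the RMSE is divided by the mean WindSat value at each grid point; the function name `normalized_rmse` and the array layout (samples, y, x) are likewise hypothetical.

```python
import numpy as np

def normalized_rmse(era5_samples, windsat_samples):
    """Percent RMSE at each grid point, normalized by the WindSat mean.

    Inputs have shape (n_samples, ny, nx). The choice of the WindSat mean
    as the normalizing quantity is an assumption of this sketch.
    """
    era5_samples = np.asarray(era5_samples, dtype=float)
    windsat_samples = np.asarray(windsat_samples, dtype=float)

    # Root-mean-square difference over all sampled cyclones.
    rmse = np.sqrt(np.nanmean((era5_samples - windsat_samples) ** 2, axis=0))

    # Reference value used for normalization.
    ref = np.nanmean(windsat_samples, axis=0)

    # Leave grid points with a zero or missing reference undefined.
    out = np.full(rmse.shape, np.nan)
    valid = np.isfinite(ref) & (ref > 0)
    out[valid] = 100.0 * rmse[valid] / ref[valid]
    return out
```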

Representation of Cyclones Across Their Lifecycle
Our composite analysis reveals distinct patterns in the distribution of water vapor, cloud, and precipitation near cyclones, which are reproduced in ERA5 and WindSat in Figure 2. However, distinct differences exist in these patterns as a function of lifecycle stage, strength, and deepening rate, as moisture convergence strongly depends on the cyclone's velocity field (e.g., Field & Wood, 2007; Klein & Jakob, 1999; Naud et al., 2012). We analyze cyclone composites for ERA5 and WindSat across regions of deepening, peak intensity, and decay related to the depth of the cyclone. This provides a comparison of how structure changes in each data set as the cyclone evolves, and how patterns differ between the two. This analysis is undertaken on a subset of the cyclone composites shown in Figure 2 which display clear periods of deepening, peak intensity, and decay around the point of maximum cyclone depth.
Figure 5 displays the TCWV field from ERA5 (Figures 5a-5c) and WindSat (Figures 5d-5f) for the three different phases of the cyclone, while Figures 5g-5i display the percentage difference between the two. The amount of moisture in the warm sector decreases throughout the cyclone lifecycle in both ERA5 and WindSat. In particular, both show a weakening of the warm moist water vapor tongue, while the dry tongue strengthens and propagates further into the upper half of the composite. This behavior likely suggests frontal occlusion as the cyclone begins to weaken. Figures 5g-5i show that ERA5 always has lower TCWV than WindSat, with larger relative differences in the poleward area of the composite where ERA5 shows drier air. During the deepening phase, differences of up to 5% show comparable structure to that seen in Figure 2f, with a bias in the position of the warm moist water vapor tongue. In order to compare how differences between ERA5 and WindSat change across the cyclone lifecycle, Figures 5g-5i also display the absolute mean bias averaged across the composite. As the cyclone reaches peak intensity and begins to decay, the absolute mean bias in ERA5 increases negligibly from 3.1% to 3.2%.
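The two difference measures used in Figures 5-7 can be illustrated with a short sketch. It assumes that the percentage difference is taken relative to WindSat and that the absolute mean bias is the composite average of the magnitude of that percentage difference; neither definition is spelled out in the text, and the function names are hypothetical.

```python
import numpy as np

def percent_difference(era5_comp, windsat_comp):
    """Percentage difference (ERA5 - WindSat), relative to WindSat (assumed)."""
    era5_comp = np.asarray(era5_comp, dtype=float)
    windsat_comp = np.asarray(windsat_comp, dtype=float)

    # Grid points where the WindSat composite is zero are left undefined.
    safe = np.where(windsat_comp != 0, windsat_comp, np.nan)
    return 100.0 * (era5_comp - windsat_comp) / safe

def absolute_mean_bias(pct_diff):
    """Composite-average magnitude of the percentage difference (assumed)."""
    return float(np.nanmean(np.abs(pct_diff)))
```

Because the sign is discarded before averaging, regions of overestimation and underestimation do not cancel, so a reduction in one region can be offset by an increase elsewhere in the composite.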
Figure 6 shows cyclone composites for CLW derived similarly to Figure 5. CLW decreases over the cyclone lifecycle in both data sets, with a section of dry air strengthening and wrapping around the cyclone center. Examination of patterns in ERA5 (Figures 6a-6c) and WindSat (Figures 6d-6f) shows general agreement with the patterns observed in Figure 5, where areas of high CLW match well with the moist water vapor tongue. Differences between ERA5 and WindSat in Figures 6g-6i show that ERA5 almost always has lower CLW than WindSat across all stages of the lifecycle, with differences of up to 60% associated with the driest region of the composite. The exception to this is the moist upper right quadrant of the cyclone, where ERA5 shows CLW values up to 15% larger than WindSat. These relative differences are greater than the maximum underestimation (overestimation) in ERA5 CLW seen in Figure 2i of 40% (0%). Another notable feature is that as the comma cloud structure begins to rotate and dissipate, the pattern in the difference also rotates as the drier region moves into the equatorward portion of the composite. When looking at how the differences between ERA5 and WindSat change throughout the lifecycle, the absolute mean bias decreases slightly from 17.5% to 15.8%.
Despite differences seen across Figures 5-7, ERA5 and WindSat show similar spatial structure in each variable. In order to provide a more quantitative comparison, Figure 8 shows the Pearson correlation coefficient (r) between the ERA5 and WindSat spatial patterns, determined using a linear least squares regression. Overall, ERA5 and WindSat display the best agreement within the deepening region, with correlation coefficients above 0.9. Agreement reduces in CLW and MTPR as the cyclone evolves, with lower agreement in the peak intensity region and the lowest agreement within the decay region. Comparing the TCWV composites shows a correlation coefficient of almost 1 across all regions, which is unsurprising given that the largest differences between the two are 5% and that there is assimilation of AMSR-E and other radiances which are sensitive to TCWV. CLW correlation is slightly poorer, with the weakest correlation of 0.93 during the decay period. Although still strong, MTPR correlation is the lowest of the three variables examined, with a correlation coefficient of 0.8 during the decay period. Correlation decreases moving from TCWV to CLW and MTPR, potentially indicating additive biases in the parameterization of rainfall generating processes within ERA5.
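A spatial correlation of the kind shown in Figure 8 can be sketched as below, treating each composite grid point as one paired sample. The function names and the dictionary layout keyed by lifecycle stage are hypothetical and chosen only for illustration.

```python
import numpy as np

def spatial_correlation(era5_comp, windsat_comp):
    """Pearson r between two composites, computed over all grid points."""
    x = np.asarray(windsat_comp, dtype=float).ravel()
    y = np.asarray(era5_comp, dtype=float).ravel()

    # Ignore grid points missing in either composite.
    ok = np.isfinite(x) & np.isfinite(y)
    return float(np.corrcoef(x[ok], y[ok])[0, 1])

def correlation_by_stage(era5_stages, windsat_stages):
    """Apply spatial_correlation to each lifecycle stage (hypothetical keys)."""
    return {
        stage: spatial_correlation(era5_stages[stage], windsat_stages[stage])
        for stage in era5_stages
    }
```

Note that the Pearson coefficient is unchanged by a uniform offset or scaling of one field, so a composite can carry a systematic percentage bias and still correlate almost perfectly with its counterpart; the correlation measures agreement in spatial structure rather than in magnitude.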

Discussion and Conclusion
ERA5 reanalysis output of UV10, TCWV, CLW, and MTPR over the Southern Ocean is used to form cyclone composites to derive an integrated viewpoint of cyclone features. These composites are then compared with those derived from WindSat and AMSR-E radiometer measurements. Because WindSat is not assimilated into ERA5, it provides an independent measure of how well ERA5 represents cyclonic structure and cyclone evolution. AMSR-E radiances are assimilated into ERA5, but still provide a useful comparison.
A comparison between the mean horizontal wind speed cyclone composites calculated from ERA5 output and from WindSat data displays very similar structures (Figures 2a and 2b), but ERA5 shows slightly lower wind speeds in general compared to WindSat. More detailed inspection of the zonal and meridional components of the wind shows that the distributions between the ERA5 and WindSat data can be quite different, with ERA5 failing to fully reproduce the bimodal wind speed distribution displayed in WindSat (Figure S1 in Supporting Information S1). This may provide evidence that small mesoscale features are not adequately simulated in the ERA5 reanalysis. Recent work by Priestley and Catto (2022) applied the cyclone compositing methodology to CMIP6 and HighResMIP models compared to baseline composites produced using ERA5. They found that CMIP6 models underestimated lower tropospheric winds compared to ERA5, although HighResMIP compared better. Given that ERA5 displays lower winds than WindSat, these models may have slightly larger issues with the representation of wind speed than identified in that work. However, it is important to note that studies validating WindSat's wind retrievals do not evaluate wind speeds during rainy periods or for high wind speeds (Freilich & Vanhoff, 2006; Meissner et al., 2011; Zheng et al., 2019). Since cyclones are linked with both intense precipitation and high wind speeds, a limitation of this study is that WindSat could potentially exhibit unquantified biases during these specific periods.
Examination of the TCWV and CLW fields demonstrates that ERA5 manages to replicate the structure of the corresponding WindSat cyclone composites well, although Figure 4 suggests a weaker correlation in the CLW distribution between ERA5 and WindSat. However, we also show that both TCWV and CLW are lower in ERA5 over almost the entire region of the composite, although the TCWV differences (up to 5%) are far smaller than those in the CLW (up to 40%). Analysis of Figures 5 and 6 shows that the TCWV spatial structures in WindSat and ERA5 show good correspondence with those for CLW. This suggests that biases in the parameterization of cloud are likely the driver of the large differences in CLW relative to the differences in TCWV, despite the assimilation of radiances from AMSR-E which likely constrain both TCWV and CLW. These cloud biases between ERA5 and WindSat would lead to variations between the two in the amount of water vapor condensing into liquid droplets. Further comparison between ERA5 and AMSR-E data in Figure 3 shows similar underestimates to those identified with the WindSat data. A good match between the two satellite data sets highlights the utility of the WindSat data set.
When comparing cyclone composites of the precipitation rate (Figures 2j-2l), the biggest differences of up to 60% occur slightly to the left of the cyclone center, where ERA5 is shown to have a greater maximum precipitation rate than WindSat. In part, these differences occur because the peak precipitation in ERA5 is shifted further left compared to WindSat. However, these regions where ERA5 overestimates MTPR compared to WindSat correspond to regions where it underestimates both CLW and TCWV. These differences are also seen when comparing ERA5 with the AMSR-E data set. Our results agree with Naud et al. (2020), who found that ERA-Interim and MERRA-2 overestimate precipitation in the dry sector of the cyclone and underestimate precipitation in the warm sector. These biases appear to remain within the ERA5 reanalysis product and point to possible continuing parameterization issues within ERA5, given the agreement between WindSat and the AMSR-E product. Assessing the underlying MTPR distribution between ERA5 and WindSat (Figure 4d) indicates that high ERA5 rain rates are often overestimated compared to the corresponding WindSat measurement. However, ERA5 also produces precipitation in cases where WindSat does not observe precipitation. This overestimation of ERA5 precipitation further reinforces the idea of parameterization issues within ERA5. It also agrees with the "dreary state" of precipitation in models as summarized by Stephens et al. (2010), and the increased occurrence of wet days in ERA5 compared to observational GHCN rain gauge data as determined by McErlich et al. (2023).
When breaking TCWV, CLW, and MTPR into stages of the cyclone lifecycle (Figures 5-7), these biases remain, and strengthen in the case of CLW and MTPR across the cyclone lifecycle. For MTPR, however, a decrease in the underestimation of precipitation in the warm sector is offset by an increase in the overestimation of precipitation elsewhere within the cyclone composite. The dry poleward region of the cyclone shows the area of largest relative difference across all variables. The average bias increases slightly over the cyclone lifecycle for TCWV and MTPR and decreases slightly for CLW. Our results show that the strongest rain rates occur in the deepening region before the cyclone reaches its maximum strength. This provides observational support for the idea that the release of latent heating associated with precipitation is an important contributor to the intensification of cyclones (Binder et al., 2016; Ludwig et al., 2014; Wernli et al., 2002). Given previous work in Field and Wood (2007), Booth et al. (2018), and Naud et al. (2020), we might interpret that changes in precipitation during the deepening and peak intensity stages are driven by changes in the precipitable water vapor, while changes during the decay stage are driven by changes in dynamics. The fact that the differences between ERA5 and WindSat are relatively constant potentially means that these controls of precipitation are reasonably well captured in the ERA5 reanalysis.
In summary, this study shows that ERA5 represents the near surface wind speeds and TCWV of extratropical cyclones well. Representation of CLW and precipitation rate is poorer; ERA5 underestimates CLW, yet overestimates precipitation in the cold sectors of the cyclone. Warm sector precipitation is also underestimated in ERA5 compared to WindSat. Despite biases seen in ERA5 compared to WindSat, both data sets show similar spatial structure across the cyclone lifecycle for TCWV, CLW, and MTPR. Quantifying this using a Pearson correlation shows strong agreement between the two data sets, although agreement lessens during the decay period of the cyclone for CLW and MTPR. This suggests that ERA5 is adequately determining cyclone structure across a range of cyclonic life stages and is valuable for use in cyclone compositing analysis.

Figure 1. (a) Cyclonic track density, defined as the monthly occurrence of tracks in a 500 km by 500 km box centered on each grid cell of the tracking domain. (b) Cumulative frequency of occurrence of the maximum cyclone radii reached by each cyclone track. The solid (dashed) red line shows the mean (99.9th percentile) value of the distribution.

Figure 2. Cyclone-centered composites of 10 m horizontal winds (UV10) derived from (a) ERA5 and (b) WindSat, composited from all cyclones observed during 2003-2019 inclusive. (c) The percentage difference between the two data sets (ERA5-WindSat). (d-f) The same as (a-c) but for total column water vapor (TCWV). (g-i) The same as (a-c) but for cloud liquid water (CLW). (j-l) The same as (a-c) but for mean total precipitation rate (MTPR). (a, b) also display wind vectors for ERA5 and WindSat, respectively. Cyclones have been rotated so that the direction of storm propagation is toward the right.
Examination of the zonal and meridional wind distributions (Figure S1 in Supporting Information S1) demonstrates that looking at other statistical properties can be useful. Corresponding data for ERA5 and WindSat are determined across each individual sampled cyclone region used to calculate the averages displayed in Figure 2. ERA5 and WindSat data are identified separately for each individual grid point within the sampled cyclone composite. Analysis of the paired ERA5 and WindSat data allows an assessment of the variability in ERA5 and WindSat and their connection, and captures how representative the averages presented in Figure 2 are of the underlying distributions of each data set.

Figure 3. Cyclone-centered composites of total column water vapor (TCWV) from (a) ERA5 and (b) Advanced Microwave Scanning Radiometer for EOS (AMSR-E), composited from all cyclones observed during 2003-2011 inclusive. (c) The percentage difference between the two data sets (ERA5-AMSR-E). (d-f) The same as panels (a-c) but for cloud liquid water (CLW). (g-i) The same as panels (a-c) but for mean total precipitation rate (MTPR).

Figure 4. Kernel density plot of ERA5 output at each individual grid point and time step across the cyclone composite data set, compared against WindSat measurements for (a) 10 m horizontal winds (UV10), (b) total column water vapor (TCWV), (c) cloud liquid water (CLW), and (d) mean total precipitation rate (MTPR). The density of points is displayed using a logarithmic scale. The correlation coefficient (r), gradient (m), and intercept (c) of a linear least squares regression and the resultant fit line are displayed on each subplot.

Figure 5. Cyclone composites of total column water vapor partitioned into the deepening, peak intensity, and decay regions for (a-c) ERA5, (d-f) WindSat, and (g-i) the difference between the two.

Figure 6. Cyclone composites of cloud liquid water partitioned into the deepening, peak intensity, and decay regions for (a-c) ERA5, (d-f) WindSat, and (g-i) the difference between the two.

Figure 7 shows cyclone composites at different periods of the cyclone lifecycle for the MTPR. Examination of the ERA5 and WindSat data in Figures 7a-7c and 7d-7f, respectively, shows that the comma-cloud structure in MTPR weakens over the cyclone lifecycle in both data sets. A dry column pushes deeper into the cyclone from the poleward sector and the comma cloud rotates in a clockwise direction. Examination of the differences between ERA5 and WindSat in Figures 7g-7i shows that ERA5 predominantly overestimates MTPR in the cold sector of the cyclone, while underestimating within the warm sector. The greatest differences of up to ±70% are observed during the deepening phase of the cyclone, but then begin to blur and reduce as the cyclone reaches peak intensity and enters the decay period. Again, this behavior suggests frontal occlusion as the cyclone begins to weaken. Overestimation in MTPR is comparable to that in Figure 2l of 60%, but breaking the analysis into periods of the cyclone lifecycle shows a greater underestimation of ERA5 MTPR compared to WindSat. This is most pronounced within the warm sector of the cyclone, where the maximum underestimation of 30% in Figure 2l increases to 70% in Figure 7. However, the absolute mean bias only increases slightly from 28.4% to 29.6% throughout the cyclone lifecycle, where a decrease in the warm sector underestimation is offset by an increase in overestimation elsewhere within the cyclone composite.

Figure 7. Cyclone composites of mean total precipitation rate partitioned into the deepening, peak intensity, and decay regions for (a-c) ERA5, (d-f) WindSat, and (g-i) the difference between the two.

Figure 8. The correlation coefficient between ERA5 and WindSat for the total column water vapor, cloud liquid water, and mean total precipitation rate variables across the cyclone lifecycle using a linear least squares regression. Correlation is determined spatially across each grid point in the cyclone composites.