The impact of weak environmental steering flow on tropical cyclone track predictability

Typhoons Haiyan (2013) and Hagupit (2014) are examples of two tropical cyclones (TCs) for which, despite similarities in the track and intensity, the predictabilities differed greatly. Both TCs made landfall over the Philippines having followed a similar track across the Pacific and both reached intensities in excess of 60 m⋅s⁻¹. Operational global ensemble forecasts showed large uncertainty in the track of Hagupit, whereas the ensemble spread for Haiyan was considerably less. Using the Met Office's Unified Model, 5-day global ensemble forecasts were produced for both storms. Consistent with the operational forecasts produced at the time of the storms, the spread of tracks is greater in the forecasts produced for Hagupit than Haiyan. The latter was located on the southern periphery of the subtropical high and embedded in a strong easterly flow. In contrast, the position of Hagupit between two anticyclones earlier in the forecast is key to the subsequent motion of the storm in determining whether Hagupit would make landfall over the Philippines or turn to the north. Upper-level winds contributed the most to the depth-averaged steering flow. Statistically significant differences in the strength of the upper-level anticyclone to the east of the storm, the strength and position of the upper-level ridge downstream of the storm and the location of a detached potential vorticity (PV) streamer appear between two groups of ensemble members – those which turn to the north and those which make landfall. Positional differences of the TC in different ensemble members earlier in the forecasts, particularly in the east to west direction, are correlated to larger northeast to southwest position differences later in the forecast. Ensemble sensitivity analysis suggests that this initial east–west positional variance is linked to the upper-level geopotential height directly south of the storm. Accurately representing both the steering flow and the position of Hagupit is vital for an accurate forecast.


INTRODUCTION
Tropical cyclone (TC) track forecasts have improved greatly over the past few decades due to the advancement of numerical weather prediction (NWP) models. Mean global model track errors for a 72 hr forecast are now comparable to those of a 12 hr forecast 25 years ago (Yamaguchi et al., 2017). For example, in the Western North Pacific, the U.K. Met Office's Unified Model (MetUM) average 72 hr track forecast error has decreased from over 600 km to under 200 km since 1992 (Heming, 2016), whilst the 48 hr average TC track forecast error is approximately 100 km (Short and Petch, 2018). Despite these improvements, there remain cases where a TC forecast has large track errors. Identifying and understanding situations where the motion of a TC is difficult to predict is essential for preparing effective warnings and thus mitigating the potential impact of the storm. Further, identifying weaknesses in the models will help focus the future development of models which should lead to the improvement of forecasts.
Various previous studies have investigated in detail the reasons why some TC tracks are difficult to predict. Most of these have focussed on storms in the North Atlantic basin. For example, large track errors in forecasts for hurricanes Sandy (Magnusson et al., 2014; Munsell and Zhang, 2014; Torn et al., 2015) and Joaquin (Nystrom et al., 2018; Alaka et al., 2019; Miller and Zhang, 2019) were attributed to various sources, such as errors in the representation of the steering flow (particularly in situations where the flow is weak), the vortex depth and thus the storm's interaction with steering level winds, and the modification of the synoptic-scale environment and thus steering flow by the storm itself. Earlier theoretical work showed that storms embedded in a large-scale deformation flow could be associated with large track errors (e.g., Emanuel, 2005, figure 18.2). This was the case in TCs Debby (2012), Joaquin (2015) and Lionrock (2016). For each of these storms, Torn et al. (2018) showed that a synoptic setting characterised by large-scale deformation was responsible for causing uncertainty in the track forecasts. The steering flow within 500 km of the TCs determined to which side of the axis of contraction each of the TCs would move, thus leading to large differences in the future position of the storm.
Here we investigate two major TCs in the Western North Pacific basin that made landfall over the Philippines: typhoons Haiyan (2013) and Hagupit (2014). Despite both TCs being of similar intensity and following a similar path, Haiyan had a highly predictable track, whereas Hagupit did not. Ensemble forecasts for Hagupit from different NWP models all had members which, three days prior to landfall, predicted the storm would turn to the north and miss the Philippines. Peng et al. (2017) demonstrated that many of the Western North Pacific TCs which are associated with large track errors (defined as >600 km for a 72 hr forecast) are storms which either recurve or for which the model predicts the TC will recurve when in reality it does not. Hagupit is an example of the latter.
Our aim is to understand the dynamical causes of the differences in the predictability of the two storms, and in particular the role of two-way interactions between the storms and their environment. To do this we analyse a set of global ensemble forecasts produced with the MetUM. A deeper understanding of the processes affecting the predictability of TC tracks in the Philippines region is particularly important because the Philippines is the most vulnerable country in the world to TCs (Eckstein et al., 2019), with approximately five to six landfalling TCs every year (Cinco et al., 2016).
The remainder of the paper is structured as follows. Section 2 details the model and methods used in the study. An overview of the two TCs and their forecasts is given in Section 3. Further analysis of the dynamical causes of the uncertainty of Hagupit is conducted in Section 4. A summary and conclusions are given in Section 5.

Model
The Met Office global ensemble prediction system (MOGREPS-G; Bowler et al., 2008) has been used to produce a sequence of ensemble hindcasts for Haiyan and Hagupit. MOGREPS-G is based on the MetUM (Cullen, 1993), which solves the full, deep-atmosphere, non-hydrostatic equations of motion using a semi-implicit, semi-Lagrangian numerical scheme (Wood et al., 2014). Prognostic variables are discretised via Arakawa-C grid staggering (Arakawa and Lamb, 1977) in the horizontal and Charney-Phillips grid staggering (Charney and Phillips, 1953) in the vertical, with a hybrid-height, terrain-following vertical coordinate. The horizontal grid spacings are 0.45° and 0.3° in the zonal and meridional directions, respectively (approximately 50 × 33 km in the Tropics). In the vertical there are 70 levels, the spacing of which increases quadratically with height, relaxing towards a horizontal lid 80 km above mean sea level. The model time-step is 12 min. The MetUM includes a comprehensive set of parametrisation schemes for key physical processes, and the way in which these are configured defines a model science configuration. Here we use the Global Atmosphere 6.1 (GA6.1; Walters et al., 2017) science configuration which, at the time of writing, was used operationally at the Met Office to produce global deterministic and ensemble forecasts. Among the parametrisations, convection is parametrised using the mass-flux scheme of Gregory and Rowntree (1990) with many extensions. Walters et al. (2017) give full details.
Initial conditions for each ensemble member are formed by adding perturbations to the Met Office global analysis, where perturbations are generated using an ensemble transform Kalman filter (ETKF; Bishop et al., 2001). In the MOGREPS-G system, 44 perturbations are computed every forecast cycle by mixing and scaling evolved perturbations from the previous forecast cycle. As in the operational MOGREPS-G system at the time, a subset (11) of these are used to initialise perturbed member forecasts, giving a 12-member ensemble in total (including one unperturbed control member). The effects of structural and subgrid-scale model uncertainties in the ensemble system are accounted for through two stochastic physics schemes: the random parameters scheme (Bowler et al., 2008) and the stochastic kinetic energy backscatter scheme (Bowler et al., 2009).
Ensemble forecasts were initialised every 12 hr between 0000 UTC 04 November 2013 and 1200 UTC 08 November 2013 for typhoon Haiyan, and between 1200 UTC 02 December 2014 and 1200 UTC 07 December 2014 for typhoon Hagupit. Each forecast was run out to 120 hr. This study focuses on a subset of these forecasts, as described below. For each storm we have checked that forecasts initialised within 12 hr of the chosen initialisation time show qualitatively similar results.
In addition, we have produced a single 45-member (44 perturbed members plus one control) 5-day ensemble forecast for Hagupit (initialisation time 1200 UTC 03 December 2014). This larger ensemble is used in Section 4.4 to facilitate a more robust statistical analysis of some of the conclusions from previous sections.

Observational data
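Two datasets provide the observational reference: the ERA5 reanalysis (Hersbach et al., 2020) and the International Best Track Archive for Climate Stewardship (IBTrACS; Knapp et al., 2010). The latter combines storm information from multiple centres to provide estimates for TC positions and intensities.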

Tropical cyclone tracking
A pressure centroid method is used to locate the centre of the storm (Nguyen et al., 2014). This method was found to be the most accurate by Nguyen et al. (2014) when compared to other common tracking methods. A pressure centroid is calculated upon a circle of radius $R$ centred on an initial guess of the storm location (in this case the location of the minimum sea level pressure) to give a "new guess" of the TC centre,

$$\hat{x} = \frac{\sum_i P'_i \, x_i}{\sum_i P'_i}, \qquad \hat{y} = \frac{\sum_i P'_i \, y_i}{\sum_i P'_i}, \tag{1}$$

where $\hat{x}$ and $\hat{y}$ are the longitude and latitude of the location of the "new guess", $P'_i = P_{500} - P_i$ with $P_{500}$ the sea level pressure averaged azimuthally at a radius of 500 km, $P_i$ the sea level pressure at the grid point $i$, and $x_i$ and $y_i$ are the longitude and latitude at each of the grid points within the radius $R$. Following Nguyen et al. (2014) we define the radius $R$ to be $2R_{80}$, where $R_{80}$ is the radius at which the azimuthal winds are 80% of the maximum. The value of $R_{80}$ was chosen as it is generally more stable than the radius of maximum winds (Nguyen et al., 2014). The pressure centroid method is an iterative process and is repeated until the location of the "new guess" is within 0.01° of the previous guess.
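The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not the tracking code used for the forecasts. The function name `pressure_centroid` and its arguments are hypothetical, distances are measured in degrees on a flat longitude-latitude plane, and both the search radius and the environmental pressure (standing in for the azimuthal mean at 500 km) are held fixed rather than recomputed from the wind field.

```python
import math

def pressure_centroid(points, guess, radius_deg, p_env, tol=0.01, max_iter=50):
    """Iterate a pressure-centroid estimate of a TC centre.

    points     : list of (lon, lat, slp) tuples, one per grid point
    guess      : (lon, lat) first guess, e.g. the minimum sea level pressure
    radius_deg : search radius R in degrees (assumption: planar geometry)
    p_env      : environmental pressure, a fixed stand-in for P_500 (assumption)
    tol        : stop when successive guesses differ by less than tol degrees
    """
    lon0, lat0 = guess
    for _ in range(max_iter):
        wsum = xsum = ysum = 0.0
        for lon, lat, slp in points:
            if math.hypot(lon - lon0, lat - lat0) <= radius_deg:
                w = p_env - slp          # pressure deficit P'_i
                wsum += w
                xsum += w * lon
                ysum += w * lat
        if wsum == 0.0:
            raise ValueError("no pressure deficit inside the search radius")
        lon1, lat1 = xsum / wsum, ysum / wsum
        if math.hypot(lon1 - lon0, lat1 - lat0) < tol:
            return lon1, lat1
        lon0, lat0 = lon1, lat1
    return lon0, lat0                    # best estimate if not converged
```

Because each pass recentres the search circle on the latest estimate, a first guess that is a few grid lengths off the true centre is usually corrected within a handful of iterations for a reasonably symmetric pressure field.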

Calculation of steering winds
TC motion is primarily controlled by the large-scale environment (e.g., Holland, 1983; Velden and Leslie, 1991; Chan, 2005) along with the beta effect arising from the meridional gradient of the Coriolis parameter (Holland, 1983; Fiorino and Elsberry, 1989; Smith et al., 1990). The steering flow of a TC is the environmental winds which are responsible for determining the motion of the storm. Early studies suggested averaged winds within 7° of the TC centre at a height between 700 and 500 hPa were well correlated with the motion of the TC (George and Gray, 1976; Chan and Gray, 1982). However, subsequent studies suggested that a mass-averaged deep-layer mean better represented the motion of the storm, with the exact depth dependent on factors such as the intensity of the storm or vertical wind shear (Velden and Leslie, 1991; Wang and Holland, 1996).
Recently the technique of removing the irrotational and non-divergent winds associated with a TC's vortex (Galarneau and Davis, 2013) has become popular. In Galarneau and Davis (2013) the irrotational and non-divergent winds are removed from the total wind within a certain radius from the storm centre to give the environmental winds. This is done throughout a vertically averaged depth which, along with the radius, has been optimised to ensure the environmental winds accurately describe the storm's motion. Typically the optimum layer extends vertically from 850 hPa to an upper boundary between 300 and 200 hPa, while the optimum radius is usually between 300 and 400 km (Galarneau and Davis, 2013; Fowler and Galarneau, 2017; Nystrom et al., 2018; Torn et al., 2018).
We adopt this vortex removal procedure. Using the vertical component of relative vorticity ($\zeta$) and the horizontal divergence ($\delta$), the streamfunction ($\psi$) and velocity potential ($\chi$) are calculated by solving the associated Poisson equations:

$$\nabla^2 \psi = \begin{cases} \zeta, & r \le r_0, \\ 0, & r > r_0, \end{cases} \qquad \nabla^2 \chi = \begin{cases} \delta, & r \le r_0, \\ 0, & r > r_0, \end{cases} \tag{2}$$

where $r$ is the distance from the centre of the storm and $r_0$ is some radius from the centre of the storm that is to be determined. Equations (2) are solved with Dirichlet boundary conditions of zero at the edge of a large domain covering the Western North Pacific. The non-divergent ($\mathbf{u}_{nd}$) and irrotational ($\mathbf{u}_{ir}$) winds associated with the TC vortex can then be computed via

$$\mathbf{u}_{nd} = \hat{\mathbf{k}} \times \nabla \psi, \qquad \mathbf{u}_{ir} = \nabla \chi. \tag{3}$$

Finally, the non-divergent and irrotational winds are removed from the wind field, $\mathbf{u}$, to leave the environmental winds, $\mathbf{u}_{env}$:

$$\mathbf{u}_{env} = \mathbf{u} - \mathbf{u}_{nd} - \mathbf{u}_{ir}. \tag{4}$$

The TC is steered by the flow acting over the depth of the storm. Hence the environmental winds are averaged from a bottom pressure layer, $p_b$, to the top of the layer, $p_t$, and also averaged spatially within a radius $R$, to give the environmental steering vector, $\mathbf{V}_{env}$. That is, after converting $\mathbf{u}_{env}$ to cylindrical coordinates $(r, \theta)$ about the storm centre,

$$\mathbf{V}_{env} = \frac{1}{\pi R^2 \,(p_b - p_t)} \int_{p_t}^{p_b} \int_0^{2\pi} \int_0^{R} \mathbf{u}_{env}(r, \theta, p) \, r \, dr \, d\theta \, dp. \tag{5}$$

A number of variables must be chosen to calculate the environmental wind vector. The value of $R$ is chosen to be $2R_{80}$, and is independent of the value of $r_0$. In line with past studies, the bottom pressure layer, $p_b$, was chosen to be 850 hPa, the approximate height of the top of the boundary layer. To define the removal radius, $r_0$, and top pressure layer, $p_t$, combinations of values between 250 and 650 km for $r_0$ and between 700 and 200 hPa for $p_t$ were tested. Since the environmental winds should closely match the storm motion, $r_0$ and $p_t$ were chosen so that the magnitude of the residual vector,

$$\mathbf{V}_{res} = \mathbf{V}_{env} - \mathbf{V}_{fc}, \tag{6}$$

is minimised, where $\mathbf{V}_{fc}$ is the storm's forecasted motion vector.
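The parameter selection just described amounts to a small grid search. The sketch below assumes a hypothetical callable `steering(r0_km, pt_hpa)` that returns the environmental steering vector for one choice of removal radius and top pressure level; it stands in for the full vortex removal and averaging, which is not reproduced here. It treats a single time and ensemble member, whereas the residuals in this study are evaluated across output times and members.

```python
import math
from itertools import product

def best_removal_parameters(steering, v_fc, r0_values, pt_values):
    """Return (r0, pt, residual) minimising |V_env - V_fc| over a parameter grid.

    steering  : callable (r0_km, pt_hpa) -> (u, v) in m/s; hypothetical stand-in
                for vortex removal followed by depth and area averaging
    v_fc      : (u, v) forecast storm motion vector in m/s
    r0_values : candidate removal radii in km
    pt_values : candidate top pressure levels in hPa
    """
    best = None
    for r0, pt in product(r0_values, pt_values):
        u, v = steering(r0, pt)
        residual = math.hypot(u - v_fc[0], v - v_fc[1])
        if best is None or residual < best[2]:
            best = (r0, pt, residual)
    return best
```

For example, `best_removal_parameters(steering, (-3.0, 1.0), range(250, 651, 50), range(200, 701, 100))` would scan the ranges quoted in the text at an illustrative 50 km and 100 hPa spacing; the spacing is an assumption, since the text gives only the ranges.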
In Equations (2), the relative vorticity will include contributions from both the TC itself (whose effects we wish to remove in constructing the steering flow) and the larger-scale sheared background flow (whose effects we wish to keep, since they are part of the steering flow by definition). In contrast to previous studies using a vortex removal procedure, in Equation (2) we thus set

$$\zeta(\phi, \lambda, p) = \zeta_{tot}(\phi, \lambda, p) - \zeta_{avg}(\phi, p), \tag{7}$$

where $\zeta_{tot}$ is the (total) relative vorticity, $\zeta_{avg}$ is an estimate of the relative vorticity of the larger-scale sheared background flow, $\phi$ is latitude, and $\lambda$ is longitude. We calculate $\zeta_{avg}(\phi, p)$ by (zonally) averaging $\zeta_{tot}$ across a longitudinal range covering the Western North Pacific, whilst excluding a 10° × 10° area surrounding the TC (whose large values of $\zeta_{tot}$ should not enter into $\zeta_{avg}$). Although $\zeta_{avg}$ is typically an order of magnitude smaller than peak values of $\zeta_{tot}$ in the TC, we believe that the refinement in Equation (7) better isolates the vorticity of the TC. It thus leads to more accurate calculations of the TC-induced winds in Equation (3), and then the environmental winds in Equation (4), particularly where the TC is embedded in strong horizontal shear.
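The background-vorticity estimate can be illustrated on a single pressure level with plain Python lists. This is a sketch under simplifying assumptions: the field is a regular longitude-latitude array, the exclusion box is defined by a half-width of 5° in each direction, and the domain is wide enough that every latitude row keeps some points. The function names are hypothetical.

```python
def background_vorticity(zeta, lons, lats, tc_lon, tc_lat, half_width=5.0):
    """Zonal-mean vorticity for each latitude row, skipping a box around the TC.

    zeta : nested list with zeta[j][i] valid at lats[j], lons[i] (one level)
    Returns a list zeta_avg[j], one value per latitude.
    """
    zeta_avg = []
    for j, lat in enumerate(lats):
        in_lat_band = abs(lat - tc_lat) <= half_width
        # Keep every point on this row except those inside the TC box.
        vals = [zeta[j][i] for i, lon in enumerate(lons)
                if not (in_lat_band and abs(lon - tc_lon) <= half_width)]
        zeta_avg.append(sum(vals) / len(vals))
    return zeta_avg

def tc_vorticity(zeta, zeta_avg):
    """Subtract the background estimate from the total vorticity, row by row."""
    return [[z - zeta_avg[j] for z in row] for j, row in enumerate(zeta)]
```

Excluding the box matters most on the rows that cross the storm, where a few very large vorticity values would otherwise dominate the zonal mean.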

Ensemble sensitivity analysis
Ensemble-based sensitivity analysis uses linear regression to highlight the sensitivity of a scalar forecast metric, $J$, to a state variable, $x$, at a particular location and time earlier in the forecast (Ancell and Hakim, 2007; Torn and Hakim, 2008). For an ensemble of size $N$, the sensitivity of the forecast metric to a state variable at a particular grid point is defined as

$$\frac{\partial J}{\partial x_i} = \frac{\operatorname{cov}(\mathbf{J}, \mathbf{x}_i)}{\operatorname{var}(\mathbf{x}_i)}, \tag{8}$$

where $\mathbf{x}_i$ is a 1 × $N$ vector of the state variable at grid point $i$, $\mathbf{J}$ is a 1 × $N$ vector of the scalar forecast metrics of each ensemble member, cov is the covariance and var is the variance. A full derivation of this equation is found in Ancell and Hakim (2007). Both $\mathbf{x}_i$ and $\mathbf{J}$ are normalised by the ensemble standard deviation to eliminate the impact of different magnitudes and units. Thus the sensitivity ($\partial J / \partial x_i$) demonstrates the impact on the forecast metric of increasing the state variable by one standard deviation. Regions of high sensitivity are indicative of locations where forecast uncertainty will have an outsized impact on the subsequent forecast metric.
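As a concrete illustration of the regression above, the following sketch computes the normalised sensitivity at one grid point from two lists of ensemble values. The function name is hypothetical and sample (n − 1) statistics are assumed. Once both series are divided by their ensemble standard deviations, the ratio of covariance to variance reduces to the correlation coefficient between the metric and the state variable.

```python
import statistics

def ensemble_sensitivity(x, J):
    """Sensitivity of metric J to state variable x at a single grid point.

    x, J : sequences of length N (one value per ensemble member).
    Both are normalised by their ensemble standard deviation first, so the
    result is the change in J (in standard deviations) per one standard
    deviation increase in x.
    """
    if len(x) != len(J) or len(x) < 2:
        raise ValueError("x and J need the same length, at least 2")
    sx, sJ = statistics.stdev(x), statistics.stdev(J)
    xn = [v / sx for v in x]
    Jn = [v / sJ for v in J]
    mx, mJ = statistics.mean(xn), statistics.mean(Jn)
    n = len(xn)
    cov = sum((a - mx) * (b - mJ) for a, b in zip(xn, Jn)) / (n - 1)
    return cov / statistics.variance(xn)
```

Applying the function at every grid point of a field (for example upper-level geopotential height at an early lead time) against a later track metric yields a sensitivity map.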

Typhoon Haiyan (2013)
Haiyan developed from a westward moving tropical disturbance in a mixed Rossby-gravity wave train on 02 November 2013 (Shu and Zhang, 2015). A favourable environment, including exceptionally high sea surface temperatures (SSTs) and weak vertical wind shear, led to rapid intensification and the storm becoming a category-5-equivalent (on the Saffir-Simpson scale) supertyphoon at 0000 UTC 05 November. Rapid intensification continued and the storm reached a peak intensity of 87 m⋅s⁻¹ (1-min sustained winds) and 895 hPa (minimum sea level pressure) at 0000 UTC 07 November. Haiyan remained at this intensity as it approached the Philippines and made landfall approximately 20 hr later with an intensity of 85 m⋅s⁻¹, becoming the most powerful TC on record to make landfall (Landsea and Cangialosi, 2018). The extensive impact of the storm included over 6,200 people losing their lives, a further 4 million being displaced from their homes, and over US$775 million of damage (Food and Agriculture Organization, 2014; Lum and Margesson, 2014). The high intensity of Haiyan at landfall is in part due to its fast propagation speed across the Pacific Ocean (Lin et al., 2014). Slow-moving TCs mix the ocean surface water and reduce SSTs, thus suppressing the intensification of the storm, a process known as "SST feedback" (Schade and Emanuel, 1999). However, Haiyan was a particularly fast-moving storm, with a translation speed of 8-11 m⋅s⁻¹ prior to landfall. This rapid direct motion was due to Haiyan's position on the southern periphery of the subtropical high (Figures 1a and 1b). The subtropical high, shown by the 500 hPa geopotential height contours and labelled 'H' in Figure 1, was elongated across the Western North Pacific inducing an easterly geostrophic flow, consistent with the motion of Haiyan. This synoptic set-up is common for TCs which make landfall over the Philippines (Peng et al., 2017).
After landfall Haiyan was located towards the western edge of the high and thus embedded in a flow with a stronger northward component. Indeed, after Haiyan crossed the Philippines and moved into the South China Sea, it took a more northward path, interacting with Vietnam and making landfall over China. Haiyan dissipated over China on 11 November at 1200 UTC.

Typhoon Hagupit (2014)
As a tropical storm on 01 December 2014, Hagupit was initially located in a similar position to Haiyan, and moved in a west-northwest direction. The storm underwent rapid intensification on 03 December, reaching a peak intensity of 83 m⋅s⁻¹ (1-min sustained surface winds) at 0600 UTC 04 December. However, on 05 December Hagupit slowed and took a more westerly direction, making landfall approximately 36 hr later. At the point at which the storm slowed, it also weakened due to large vertical wind shear, quelling concerns that the storm would make landfall with similar intensity to Haiyan 13 months earlier. Despite not causing as much devastation as Haiyan, Hagupit was still a high-impact storm for the Philippines, with 18 deaths reported, over 4 million people affected and approximately US$100 million of damage to infrastructure (OCHA, 2014). As with Haiyan, Hagupit was initially positioned on the southern periphery of the subtropical high, albeit further to the west (Figure 1c; labelled 'H'). As the storm moved across the Pacific, the elongated high evolved into two high pressure systems either side of the storm and an upper-level trough directly to the north (Figure 1d; the highs and the trough are labelled 'H' and 'T', respectively). From 1200 UTC 04 December to 1200 UTC 07 December, Hagupit's propagation speed was approximately 3 m⋅s⁻¹. During this period the trough also remained almost stationary. The outflow of the storm led to ridge-building and the subsequent detachment of a PV streamer downstream of the trough (Figure 1d; labelled 'PVS'). The PV streamer interacted with the upper-level anticyclone and became positioned to the south of the anticyclone.

Track forecasts
The straight motion of Haiyan was well predicted by the Met Office global ensemble. Figure 2 shows forecasts for Haiyan from three initialisation times 12 hr apart. The forecast for the earliest initialisation time, 0000 UTC 04 November, exhibits the greatest spread in the track. Two ensemble members predict the storm to make landfall to the north of the best track. However there is still considerable certainty in the forecast: all of the ensemble members predict the storm to be moving westwards directly towards the Philippines and the direct positional error (DPE) of the ensemble average remains below 200 km until T+120. Subsequent forecasts (Figure 2b,c) have fewer ensemble members which significantly deviate from the ensemble average, whilst the ensemble average DPE is less than 150 and 130 km for the forecasts initialised at 1200 UTC 04 November and 0000 UTC 05 November, respectively. In each of the forecasts for Haiyan, the best track lies within the ensemble spread and the translation speed is accurately predicted, suggesting a small track error. Track forecasts for Hagupit (Figure 3), in contrast to those of Haiyan, exhibit a large amount of variability (note that the track plots in Figures 2 and 3 are on different horizontal scales to allow better visualisation given the differences in storm translation speeds). Although some members of the forecast initialised at 0000 UTC 03 December ( Figure 3a) predict Hagupit to veer toward the south, most ensemble members from all three forecasts predict that Hagupit will either make landfall over the central Philippines or turn to the north prior to making landfall. In each of the forecasts the storm slows considerably for approximately 48 hr from 0000 UTC 05 December. Following this period, at approximately 0000 UTC 07 December, the storm's speed increases and the tracks begin to diverge. 
Although the forecast shown in Figure 3c is initialised less than 3 days before landfall, the ensemble is still unable to predict with any certainty whether or not the storm will make landfall.
The ensemble spread, $S$, of the tracks at any given time is calculated by taking the unbiased estimator for the variance over the ensemble (Fortin et al., 2014), that is,

$$S = \sqrt{\frac{n+1}{n}} \, \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left| \mathbf{x}_i - \bar{\mathbf{x}} \right|^2}, \tag{9}$$

where $n$ is the number of ensemble members, $\mathbf{x}_i$ is the position of the storm in ensemble member $i$ and $\bar{\mathbf{x}}$ the ensemble average position. The prefactor $\sqrt{(n+1)/n}$ is a correction that should be used with small ensembles. Figure 4 compares the ensemble spread and DPE of the ensemble mean for one forecast of Haiyan and one forecast of Hagupit. The forecasts shown are those initialised at 1200 UTC 04 November 2013 and 1200 UTC 03 December 2014 for Haiyan and Hagupit, respectively. These are chosen as they are both initialised approximately three days before landfall; however, forecasts initialised 12 hr before and after these times exhibit similar results (not shown). The ensemble mean of Hagupit shows a much greater positional error than that of Haiyan. For Hagupit, the positional error increases to approximately 500 km at T+81 before decreasing slightly towards 400 km at T+120. Meanwhile the ensemble spread increases throughout the forecast up to approximately 200 km, because the ensemble members are moving in different directions. In comparison, the ensemble spread of Haiyan initially increases rapidly before plateauing, with both effects driven by spread in the along-track component. More precisely, some ensemble members move at quite different speeds during the first 24 hr, but after that these speed differences are much smaller (not shown). The positional error of Haiyan remained small throughout the forecast, showing a much more predictable storm. Despite this, the differences in the ensemble spread between the two storms are less distinct. It should be noted when comparing the ensemble spread that the storm speeds are different. Hagupit's track spread is greater than Haiyan's despite Haiyan travelling much further during the forecast.
Further, the fact that the ensemble spread for Hagupit is always increasing indicates that the individual ensemble members are moving away from each other. The forecast initialised at 0000 UTC 04 December has an even greater ensemble spread as the ensemble members move away from each other for longer (not shown).
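The spread calculation described above (the unbiased ensemble standard deviation of position, multiplied by the small-ensemble correction factor) can be written compactly. The sketch below assumes storm positions have already been converted to planar coordinates in km, which sidesteps great-circle distances; the function name is hypothetical.

```python
import math

def track_spread(positions):
    """Ensemble spread of storm positions at one forecast time.

    positions : list of (x, y) storm positions in km, one per ensemble member
                (assumption: planar coordinates rather than longitude-latitude)
    Returns the unbiased standard deviation of position about the ensemble
    mean, multiplied by the small-ensemble factor sqrt((n + 1) / n).
    """
    n = len(positions)
    if n < 2:
        raise ValueError("need at least two ensemble members")
    x_mean = sum(p[0] for p in positions) / n
    y_mean = sum(p[1] for p in positions) / n
    sum_sq = sum((x - x_mean) ** 2 + (y - y_mean) ** 2 for x, y in positions)
    return math.sqrt((n + 1) / n) * math.sqrt(sum_sq / (n - 1))
```

For a 12-member ensemble the correction factor is about 1.04, so it changes the spread by only a few per cent, but it keeps the spread directly comparable with the error of the ensemble mean.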
Due to limited observational coverage of the storm, this study does not attempt to diagnose forecast track errors. Rather, the uncertainty of the (ensemble) forecasts is analysed, irrespective of their accuracy (i.e., relative to what actually happened). Therefore the large DPE for Hagupit will not be explored further and comparisons will be made between ensemble members rather than to observations. For both cases the intensity of the storms is significantly underestimated compared to observations. This is to be expected owing to the relatively coarse resolution of the global ensemble, and is a general problem of TC forecasts in global models (DeMaria et al., 2014). As this work focuses on the uncertainty in track forecasts, rather than intensity, we do not discuss the intensities further, but we acknowledge that storm intensity can affect a TC's track. For example, the distribution of diabatic heating can generate a component of TC motion (e.g., Wu and Wang, 2000).

Analysis of steering flow
The large track spread of Hagupit suggests that the steering flow of the storm may be important in determining its predictability. The steering flow of a TC is the environmental flow which best matches the storm's movement. To investigate the steering flow of the storms, it is necessary to partition the winds associated with the storm from those of the environment which are responsible for the steering of the storm. This is done using the TC removal method described in Section 2.4. Due to the associated computational costs of determining the optimal vortex radius $r_0$ and upper-level $p_t$ at every output time for every ensemble member, the TC removal technique was carried out on a single 12-member ensemble forecast for each storm. The forecasts used were initialised at 1200 UTC 04 November 2013 for Haiyan and 1200 UTC 03 December 2014 for Hagupit. Figure 5 shows, for both Haiyan and Hagupit, a contour plot of the residual vector magnitude (Equation 6) for different $r_0$ and $p_t$. This is created using data output every 3 hr from each ensemble member. Results are very similar to previous studies which use this technique (Galarneau and Davis, 2013; Fowler and Galarneau, 2017; Torn et al., 2018), with, on average, the optimal $r_0$ being approximately 300-400 km, and the optimal $p_t$ being 300 hPa for Hagupit and 200 hPa for Haiyan. We note that the optimum $p_t$ for Haiyan may be above 200 hPa; however, taking $p_t$ equal to 200 hPa still produced a small average residual vector when compared to the speed of the storm. Averaged across all times and ensemble members initialised at 1200 UTC 03 December for Hagupit, the magnitude of the residual vector is slightly over 1.2 m⋅s⁻¹ when using a constant radius $r_0$. However, if $r_0$ is allowed to vary with time, then the average residual is significantly smaller, at approximately 0.6 m⋅s⁻¹. Over a 5-day forecast, this equates to a positional error that is smaller by approximately 250 km (0.6 m⋅s⁻¹ sustained over 120 hr).
Therefore, we continue our analysis in this section using a radius $r_0$ that has been optimised at every 3-hourly output time. Figure 6 shows ensemble averaged streamlines for the 850-250 hPa pressure-weighted depth-averaged storm-removed winds, for both Haiyan and Hagupit at two different times. The large-scale circulations depicted in Figure 6 are consistent with the features outlined in Figure 1. In the case of Haiyan the streamlines show the storm is embedded in the easterly flow which is associated with the anticyclone to the north (Figure 6a). This flow strengthens as the forecast continues and Haiyan approaches the Philippines (Figure 6b). In comparison, Figure 6c shows Hagupit is located to the southwest of the subtropical anticyclone. As such, the motion of the storm is in a northwest direction. By 1800 UTC 05 December (Figure 6d), Hagupit has moved into a position of weak steering flow between two anticyclones. The western anticyclone creates a steering flow towards the Philippines,

Sensitivity of storm track to initial position
In cases where a TC shows a large amount of uncertainty in the track forecasts, it is of interest to understand how much of this uncertainty is caused by slight differences in the position of the TC at the start of the forecast in the individual ensemble members. In this section, the storm-removed, depth-averaged environmental winds (i.e., those shown in Figure 6) are used to calculate a number of trajectories for each ensemble member initialised from a region around the forecasted position of the TC. The difference in the initial positions of the trajectories is representative of a typical track forecast error. The future spread of the trajectories is then determined by the initial positional differences only, as they are computed using the same ensemble member and thus the same environmental winds. This spread can then be compared to the track spread across the ensemble to give an indication of how important small positional differences earlier in the forecast are in determining the subsequent spread of the tracks.
Trajectories are initialised at T+24 in a 1.6° × 1.6° box around the forecast centre of the storm. The box dimensions were chosen such that the maximum displacement between the initial location of a trajectory and the forecasted position of the storm, at T+24, was slightly larger than the average error of NWP models' T+24 TC track forecasts (approximately 75 km; Short and Petch, 2018). This means that trajectories starting in the box can be considered to be within the bounds of a normal track error. Trajectories are calculated to T+120 with a fourth-order Runge-Kutta scheme using the 3-hourly storm-removed environmental winds. Figure 7 shows the trajectories computed for ensemble member 6 of Hagupit. Ensemble member 6 was chosen as its track was similar to the ensemble average and it predicted Hagupit would make landfall close to the observed location. The trajectories exhibit a similar behaviour to the original ensemble (Figure 3b). The trajectories show that, had the storm been located in a slightly different position at T+24, it may have recurved and missed the Philippines (i.e., influenced by the eastern anticyclone in Figure 6d), or alternatively it may have propagated too far south (i.e., influenced by the western anticyclone in Figure 6d). This highlights that, in a single ensemble member, the track of the storm is sensitive to its position earlier in the forecast. It could be expected that, if the environment in each of the global ensemble members were the same as that in ensemble member 6, then small differences in the position of the storm in different ensemble members could lead to a large track spread. In comparison, ensemble member 1 for Haiyan shows each of the trajectories remain close to, and move in the same direction as, the forecasted track (Figure 8). Thus, even if there were a small positional error at T+24 for Haiyan, this would not grow into a large error and the storm location would still be predicted with a high degree of certainty.
As with Figures 2 and 3, the horizontal domains shown in Figures 7 and 8 are different for visualisation purposes.
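The trajectory calculation can be sketched with a generic fourth-order Runge-Kutta integrator. The sketch assumes a hypothetical callable `wind(x, y, t)` that returns the storm-removed, depth-averaged environmental wind in m/s at a planar position in metres and a time in seconds; in practice this would interpolate the 3-hourly gridded winds in space and time and would need to handle spherical geometry, neither of which is shown.

```python
def rk4_step(wind, x, y, t, dt):
    """Advance one position by a single fourth-order Runge-Kutta step."""
    k1u, k1v = wind(x, y, t)
    k2u, k2v = wind(x + 0.5 * dt * k1u, y + 0.5 * dt * k1v, t + 0.5 * dt)
    k3u, k3v = wind(x + 0.5 * dt * k2u, y + 0.5 * dt * k2v, t + 0.5 * dt)
    k4u, k4v = wind(x + dt * k3u, y + dt * k3v, t + dt)
    x_new = x + dt / 6.0 * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
    y_new = y + dt / 6.0 * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
    return x_new, y_new

def integrate_trajectory(wind, start, t0, t1, dt):
    """Integrate a trajectory from time t0 to t1 and return all positions.

    wind  : callable (x, y, t) -> (u, v) in m/s (hypothetical interpolator)
    start : (x, y) initial position in metres
    dt    : time step in seconds
    """
    x, y = start
    t = t0
    path = [(x, y)]
    for _ in range(int(round((t1 - t0) / dt))):
        x, y = rk4_step(wind, x, y, t, dt)
        t += dt
        path.append((x, y))
    return path
```

Starting one call of `integrate_trajectory` from each point of the initial box at T+24 and running to T+120 gives the bundle of trajectories whose spread is then compared with the ensemble track spread.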
Trajectories were computed using the environmental winds for each of the 12 ensemble members for both storms. As with the original ensemble of tracks, the spread of the trajectories is calculated using Equation (9). The ensemble average of these trajectory spreads is shown in Figure 4. The spread of the trajectories for Hagupit is very similar to the spread of the 12-member ensemble forecast, further illustrating that the uncertainty in the forecasts of Hagupit is caused by the environment in which the storm is embedded.
The differences between the trajectory spread for Haiyan and Hagupit are less distinct than those of the full ensemble spread (Figure 4). Although by the end of the forecast the trajectory spread for Hagupit is almost twice that of Haiyan, earlier on the spreads are very similar. For a number of ensemble members the trajectory spread for Haiyan is small as trajectories remain close to the forecasted track and move westwards towards the Philippines (e.g., Figure 8). However, in some ensemble members the location of Haiyan to the south of the anticyclone caused trajectories to the south of the 1.6° box to move more slowly than those to the north, introducing large along-track spread (not shown). The cross-track trajectory spread remains small throughout the forecast until Haiyan has passed over the Philippines.
The trajectory technique presented here could be used to estimate the uncertainty in a deterministic track forecast for a TC. The trajectories provide a computationally cheap method of determining different TC paths that can occur due to positional variation in the ensemble. In situations like that of Hagupit, the large spread will highlight the intrinsic uncertainty caused by the environment. In other situations the trajectories may highlight a finite number of possible TC paths depending on the position of the TC relative to different environmental features. Of course, it would only be an approximation since the modification of the storm environment and thus steering flow by the storm itself is not accounted for. The results also have the potential to be somewhat misleading if uncertainty is caused by environmental changes between individual ensemble members, rather than positional changes. However, the method could be useful if there is

Impact of the storms' environment on the steering flow
For the remainder of this paper we focus on the forecasts of Hagupit to further understand the reasons for the low predictability of the storm track. In this section we investigate how Hagupit interacts with its environment and how subtle differences in the environments of the different ensemble members are related to differences in the track forecasts. To do this, two forecasts are time-lagged to create a 24-member ensemble. Forecasts initialised at 1200 UTC 03 December and 0000 UTC 04 December 2014 are used. These forecasts show similar characteristics with some ensemble members turning north and others making landfall over the Philippines.
They are initialised approximately 81 and 69 hr before the storm made landfall. Ensemble members are split into two groups depending on whether the forecasted storm turns to the north (NORTH members) or makes landfall (WEST members). These two groups, from the time-lagged ensemble, are shown in Figure 9. Each group consists of eight members: four from the earlier forecast initialised at 1200 UTC 03 December, and four from the later forecast initialised at 0000 UTC 04 December. The remaining ensemble members (shown in Figure 9) were omitted for various reasons. Two ensemble members in the later forecast went considerably further south than other members and one in the earlier forecast turned considerably further north. These extreme members were omitted to ensure they did not skew the group averages. To ensure each group was made up of four ensemble members from each of the two initial times, ensemble members which predicted Hagupit to make landfall over the northern Philippines (i.e., ensemble members which were borderline between the two groups) were also omitted. Beyond ensuring forecasts initialised at different times were represented by the same number of ensemble members, these groups were chosen subjectively to represent the differing behaviours of ensemble members. Figure 10 shows the depth-averaged and ensemble-averaged steering flow for both the WEST group and the NORTH group at 0000 UTC 07 December.
FIGURE 9 Track forecasts for Hagupit split into two groups depending on if the storm is forecast to make landfall close to the correct location ("WEST"), if it is forecast to turn to the north ("NORTH"), or if the ensemble member is omitted from both groups ("Others"). Two 12-member forecasts, initialised at 1200 UTC 03 December and 0000 UTC 04 December 2014, are time-lagged.
The steering flow has been calculated using a fixed removal radius (r0) of 350 km to allow for a comparison between the ensemble members. To calculate the average steering flow, the steering flow for each ensemble member in the WEST and NORTH groups is centred on the average forecasted location of the storm in that group; following this, the steering winds are averaged across all ensemble members. The solid circle in Figure 10 is the removal radius whilst the dashed circle shows the position of the removal radius of the other group. Comparing the two groups shows there is a distinct difference in the steering flow. The NORTH group has a stronger northward component to the steering flow compared to the WEST group, which is being steered towards the Philippines. In both cases there is a deformation field with very weak steering flow to the northwest of the storm. Compared to the NORTH group, the WEST group has an average position further to the southwest and thus is under less influence from the trough to the north.
While the depth-averaged steering flow accurately matches the motion of the ensemble members, it does not highlight which levels of the atmosphere are most important in steering Hagupit. Figure 11 splits the depth-averaged steering flow into different levels. The largest contribution to the depth-averaged steering flow comes from the upper levels. At 850 hPa the steering flow in both the WEST and NORTH groups is very similar and weak. At 500 hPa there are differences in the direction of the environmental winds, but wind speeds are still relatively weak. At 300 hPa there is a much larger contribution to the average steering as well as a stronger southerly flow in the NORTH group. Thus, it can be concluded that the differences in steering winds between the two groups are mainly due to the upper-level winds.
The differences in the upper-level steering between the two groups suggest there are differences in the upper-level environments. Figure 12 shows the differences between the average 300 hPa geopotential heights of the NORTH and WEST groups. Statistical significance, shown by the hatching, is determined using a bootstrap resampling method. Two groups of ensemble members of equal size to those of the WEST and NORTH groups are chosen at random without replacement. The difference between these two groups is calculated. This process is repeated 300 times, from which a 95% confidence interval is calculated for the difference between the two groups. Regions where the difference between the NORTH and WEST groups is outside of this confidence interval are statistically significant.
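The resampling procedure described above can be sketched for a single grid point as follows. This is an illustration under stated assumptions, not the authors' implementation: the pool from which random groups are drawn is taken here to be the sixteen grouped members unless another pool is supplied, the interval bounds are taken from the sorted resampled differences, and the height values and the names `resampled_interval` and `is_significant` are hypothetical.

```python
import math
import random
from statistics import fmean


def resampled_interval(pool, n_a, n_b, n_resamples=300, level=0.95, seed=0):
    """Interval for the difference in means of two randomly drawn groups."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        drawn = rng.sample(pool, n_a + n_b)  # draw members without replacement
        diffs.append(fmean(drawn[:n_a]) - fmean(drawn[n_a:]))
    diffs.sort()
    tail = (1.0 - level) / 2.0
    lower = diffs[math.floor(tail * (n_resamples - 1))]
    upper = diffs[math.ceil((1.0 - tail) * (n_resamples - 1))]
    return lower, upper


def is_significant(north, west, pool=None, **kwargs):
    """True if the NORTH minus WEST mean difference lies outside the resampled interval."""
    pool = list(north) + list(west) if pool is None else list(pool)
    lower, upper = resampled_interval(pool, len(north), len(west), **kwargs)
    observed = fmean(north) - fmean(west)
    return observed < lower or observed > upper


# Hypothetical 300 hPa geopotential heights (dam) at one grid point, eight members per group.
north_heights = [963.1, 963.4, 962.9, 963.6, 963.2, 963.0, 963.5, 963.3]
west_heights = [962.0, 962.3, 961.8, 962.4, 962.1, 961.9, 962.2, 962.5]
hatched = is_significant(north_heights, west_heights)
```

Applying `is_significant` at every grid point would give a mask analogous to the hatching in Figure 12.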
Throughout the forecast the differences between the two groups in the environments close to the TC's location are small and subtle. To the south of the storm the geopotential height for the NORTH group is slightly greater than the WEST group throughout the forecast (Figures 12a-12d). At 1200 UTC 05 December, the NORTH members are also associated with a slightly stronger upper-level high pressure system to the east of the storm (Figure 12b). Although the differences are only small, the hatching in Figure 12 indicates they are statistically significant. By 1200 UTC 06 December more significant differences begin to develop between the two groups; however, these differences are further downstream rather than in the vicinity of the storm. The NORTH members are associated with stronger downstream ridge building (Figure 12c, approximately 35°N and 150°E) which ultimately leads to differences in the position of the detached PV streamer shown in Figure 1d, and shown here by the differences in geopotential height (Figure 12d, 15°N and 165°E). The PV streamer in the NORTH members has propagated further to the west than in the WEST members, also impacting the southern periphery of the high. Finally, a dipole can be seen close to the location of the storm in Figure 12d, at approximately 12°N, 128°E. This indicates that by 1200 UTC 07 December there are statistically significant differences in the location of the storm in each of the groups, with the WEST members already further to the west at this point.
The outflow of the TC is investigated using the trajectories shown in Figure 12 to understand how the storm contributes to the regulation of its environment. The trajectories are calculated using the winds at 300 hPa from the location of the storm at 1200 UTC 04 December, the time when the motion of the storm slows. In each ensemble member, trajectories are initialised 0.25° apart from a 2.5° × 2.5° box centred on the group average location of the storm. Therefore, from the eight members of each group, 968 trajectories are calculated. These trajectories are then split into three groups depending on whether they go to the west, interact with the downstream ridge, or interact with the upper-level high. Each group consists of approximately 33% of the trajectories. A small number of trajectories are omitted from the averaging if they do not fulfil any of the criteria, which occurs when the trajectories become wrapped in the storm's circulation. Once split into the groups, the trajectories are averaged to form one trajectory per group. The criteria for splitting the trajectories and averaging over each group are somewhat arbitrary; however, the three resulting average trajectories for each group demonstrate the general paths of the TC outflow. Whilst the trajectories shown are initialised at 1200 UTC 04 December and are calculated at 300 hPa, a number of other initial times and heights were used to explore the outflow channel of Hagupit.
It was found that this outflow channel remained stationary during the three days prior to landfall when the motion of Hagupit had stalled, and was always present at heights above 400 hPa (not shown).
FIGURE 10 Average 850-200 hPa steering flow of the WEST and NORTH groups (shown in Figure 9) for Hagupit at 1200 UTC 07 December 2014. The solid circle shows the removal radius of 350 km and is centred on the position of the group, whilst the dashed circle is the relative position of the other group. Arrows show the wind direction and the shading is the wind speed (m⋅s−1).
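The trajectory count quoted above is consistent with seeding both edges of the box, which gives 11 × 11 start points per ensemble member. A short check of that arithmetic, with hypothetical names and assuming a regular grid of offsets about the group average storm location:

```python
BOX_WIDTH_DEG = 2.5     # side length of the seeding box
SPACING_DEG = 0.25      # spacing between neighbouring start points
MEMBERS_PER_GROUP = 8   # ensemble members in each of the NORTH and WEST groups


def seed_offsets(width=BOX_WIDTH_DEG, spacing=SPACING_DEG):
    """Offsets (degrees) of start points from the box centre, both edges included."""
    n_per_side = round(width / spacing) + 1
    half = width / 2.0
    return [(-half + i * spacing, -half + j * spacing)
            for j in range(n_per_side)
            for i in range(n_per_side)]


offsets = seed_offsets()
seeds_per_member = len(offsets)                            # 121
trajectories_per_group = seeds_per_member * MEMBERS_PER_GROUP  # 968
```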
The outflow from both the NORTH and WEST groups interacts with the upper-level trough, downstream ridge and the region directly north of the Philippines (Figure 12). The outflow path towards the downstream ridge is between the upper-level trough and the high pressure to the east of the storm. Associated with the outflow channel is strong upper-level irrotational flow from the storm (not shown). Although no significant differences were found between the NORTH and WEST groups, this irrotational outflow plays a crucial role in regulating the environment. In particular the irrotational winds aid the ridge amplification and formation of the jet streak through the advection of low-PV air towards the large PV gradient associated with the upper-level trough (Riemer et al., 2008; Keller et al., 2019). At the same time, positive PV advection on the eastern side of the upper-level trough opposes the eastward propagation of the wave (Pantillon et al., 2013; Riemer and Jones, 2014). Whilst it is not the aim of this study to discuss how uncertainty in the TC location can lead to atmospheric uncertainties further afield, it is interesting to note the link between the outflow of the storm and the downstream differences that develop between the NORTH and WEST groups. Figure 12d shows that in both groups the position of the trajectories by 1200 UTC 07 December is close to the location of the PV streamer (located at approximately 15°N and 165°E). This suggests that the anticyclonic outflow of Hagupit helps promote the Rossby wavebreaking event by increasing the anticyclonic wind component in the downstream trough. This is similar to processes often observed in TCs which undergo extratropical transition (e.g., Riemer and Jones, 2010; Keller et al., 2019).
Differences in the average outflow of the storm between the NORTH and WEST groups (shown by the trajectories in Figure 12) are very small and subtle. The complexity of the interactions between the TC and its environment and the feedback of these interactions onto the steering flow make it difficult to distinguish which, if any, of the subtle differences between the two groups are significant to the future positional differences between the two groups. However, they do show the complex interplay between the storm and the environment in which it is embedded.
FIGURE 11 As Figure 10, but at a pressure level of (a, b) 300 hPa, (c, d) 500 hPa and (e, f) 850 hPa.
Comparisons between the NORTH and WEST groups show that differences in the depth-averaged steering flows of the ensemble members are dominated by differences in the upper-level winds. In this section, differences in the upper-level environments were identified downstream from the storm; however, the environments close to the storm did not show many significant differences between the NORTH and WEST groups. In the next section we look more closely at the impact of these differences using a larger 45-member ensemble. In particular we use ensemble sensitivity analysis to determine the impact these features have on the steering of the storm.
FIGURE 12 Geopotential height differences (dam) at 300 hPa between NORTH and WEST groups (NORTH minus WEST) for Hagupit. Contours are the average geopotential heights at 300 hPa for NORTH and WEST groups. Positive (negative) differences which are statistically significant at the 95% confidence level are shown by horizontal and vertical (diagonal) hatching. The thick lines are average trajectories of the TC outflow for the NORTH and WEST groups calculated from trajectories which are initialised over a 2.5° × 2.5° box centred on the location of the storm at 1200 UTC 04 December. The trajectories are calculated using the wind fields at 300 hPa. During averaging they are split into three groups depending on if they head west, north towards the downstream ridge, or become wrapped in the upper-level high. The stars indicate the position of the trajectory at the time shown in the plot: (a) 1200 UTC 04 December, (b) 1200 UTC 05 December, (c) 1200 UTC 06 December and (d) 1200 UTC 07 December 2014.

Ensemble-based sensitivity analysis
The 45-member ensemble is produced for the forecast initialisation time of 1200 UTC 03 December. The track forecasts for the 45-member ensemble are shown in Figure 13. The ellipses are the contours of the 95% bivariate normal distribution at each 24 hr lead time in the forecast (and T+3, the first output time). The forecast of the larger ensemble shows similar characteristics to that of the smaller ensemble from the same initial time (Figure 3b). In particular, the storm is predicted to stall before landfall in each of the members. The ellipses do not include the best track position (shown by the stars in Figure 13) from as early as T+24, showing the large error associated with the forecasts. After the storm has stalled, from approximately T+48 to T+96, forecasts predict the storm to either move towards the Philippines before making landfall or turn to the north. As with the earlier analysis of the 12-member ensemble, the aim of this section is not to diagnose forecast errors, but to understand the uncertainty in the ensemble forecast. Whilst Hagupit's track at the times highlighted was outside of the ensemble ellipses, the observed path of the storm is generally within the spread of ensemble tracks; it is only the translation speed that is too slow in the model. Nonetheless, the model captures the general large-scale uncertainty that happened in reality and thus our experiment is still useful for understanding uncertainty in tracks under similar synoptic conditions. The orientation of the ellipses in Figure 13 demonstrates the direction in which there is greatest spread in the track forecasts at that time. The orientation changes between T+72 and T+120 from a west to east direction to a southwest to northeast direction. The major axis is defined in the same way as in Hamill et al. (2011) and is the direction in which the track forecasts vary the most at any particular time.
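One way to obtain such an ellipse and its major axis is from the eigen-decomposition of the sample covariance of the ensemble positions. The sketch below assumes that approach, with positions expressed as eastward and northward distances in a local Cartesian frame; the function names and the sample positions are hypothetical, and the details may differ from the definition in Hamill et al. (2011).

```python
import math


def track_ellipse(xs, ys, prob=0.95):
    """Ellipse enclosing `prob` of a bivariate normal fitted to positions (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of the symmetric 2 x 2 covariance matrix.
    half_trace = 0.5 * (sxx + syy)
    root = math.hypot(0.5 * (sxx - syy), sxy)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    # Orientation of the major axis, in radians anticlockwise from east.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Chi-square quantile with two degrees of freedom: -2 ln(1 - prob).
    scale = math.sqrt(-2.0 * math.log(1.0 - prob))
    return {
        "centre": (mx, my),
        "angle": angle,
        "semi_major": scale * math.sqrt(lam_major),
        "semi_minor": scale * math.sqrt(lam_minor),
    }


def along_major_axis(x, y, ellipse):
    """Signed distance of (x, y) from the ellipse centre, measured along the major axis."""
    mx, my = ellipse["centre"]
    return (x - mx) * math.cos(ellipse["angle"]) + (y - my) * math.sin(ellipse["angle"])


# Hypothetical member positions (km) spread along a southwest to northeast line.
east = [0.0, 1.0, 2.0, 3.0, 4.0]
north = [0.0, 1.0, 2.0, 3.0, 4.0]
ellipse = track_ellipse(east, north)
```

The signed distance returned by `along_major_axis` is the kind of scalar position metric that can then be correlated between lead times.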
Figure 14 shows the correlation of ensemble members along the major axis line at each time to the position of the ensemble members along the major axis at T+120. Statistically significant (>0.294) correlations are seen after 24 hr of the forecast. Although at this time the ellipses are nearly circular and thus there is not a clear major axis, further analysis showed that there were statistically significant correlations between the cross-track position of ensemble members and the position at T+120 from as early as T+15. By T+48 the correlation has exceeded 0.5, highlighting the importance of the storm's location earlier in the forecast to whether it turns to the north or heads straight towards the Philippines. For example, ensemble members which predict the TC to be further to the east at T+72 are correlated to the ensemble members which predict the TC to be towards the northeast by T+120. This is consistent with the results of the previous section where statistically significant differences in the storm's position were seen in the NORTH and WEST groups at 1200 UTC 06 December and 1200 UTC 07 December (Figure 12c,d).
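The source does not state how the significance threshold was derived, but the quoted value is consistent with a two-sided Student's t-test at the 95% level applied to the correlation coefficient of n = 45 ensemble members. Under that assumption, the critical correlation follows from the critical t value with n − 2 = 43 degrees of freedom:

```latex
r_{\mathrm{crit}}
  = \frac{t_{0.975,\,n-2}}{\sqrt{t_{0.975,\,n-2}^{2} + n - 2}}
  \approx \frac{2.017}{\sqrt{2.017^{2} + 43}}
  \approx 0.294 , \qquad n = 45 .
```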
Using ensemble sensitivity analysis, described in Section 4.4, the sensitivity of the distance along the major axis at T+120 is compared to the storm-removed streamfunction averaged between 850 and 300 hPa (i.e., the same metric as shown in Figure 6). The quantities in Equation (8) are normalised by the ensemble standard deviation; therefore, ensemble sensitivity analysis shows the likely impact on the storm position should there be a perturbation of one standard deviation to the ensemble average streamfunction. In each case, the storm-removed winds are calculated using a constant radius of 350 km. A constant radius was chosen to avoid the additional computational cost of finding the optimum radius in each of the ensemble members. Figure 15a shows that increasing the streamfunction by one standard deviation at T+6 (1800 UTC 03 December) in the region to the northeast and to the south of the anticyclone to the east of the storm would cause the TC to be located up to 80 km further along the major axis of the ellipse in the northwest direction at T+120. This suggests that the TC is sensitive to the steering flow and the strength and shape of the high to the east of the storm early in the forecast. In Figure 15b there is no longer a region of positive sensitivity to the south of the anticyclone; however, the sensitivity to the northeast remains. Although the location of the positive sensitivity to the northeast of the anticyclone is similar to the differences between the NORTH and WEST groups caused by the TC outflow shown in Figure 12, the fact that ensemble sensitivity analysis highlights this region during the first 24 hr of the forecast suggests that the strength of this anticyclone earlier on is important to the subsequent predicted track of Hagupit.
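As an illustration of the normalisation described above, the sketch below assumes that Equation (8) takes the commonly used linear-regression form, in which the sensitivity is the covariance between the forecast metric and the state variable divided by the variance of the state variable. Multiplying by the ensemble standard deviation of the state variable then gives the expected change in the metric for a one standard deviation perturbation. The function name and the sample values are hypothetical.

```python
import math
from statistics import fmean


def ensemble_sensitivity(metric, state):
    """Expected change in `metric` per one-standard-deviation increase in `state`.

    metric: forecast metric for each member (e.g. distance along the major axis, km)
    state:  state variable for each member at one grid point (e.g. streamfunction)
    Assumes the regression form cov(metric, state) / var(state), scaled by std(state).
    """
    n = len(metric)
    mean_j, mean_x = fmean(metric), fmean(state)
    cov = sum((j - mean_j) * (x - mean_x) for j, x in zip(metric, state)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in state) / (n - 1))
    return cov / std_x


# Hypothetical five-member example in which the metric responds linearly to the state.
state_values = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_values = [2.0 * x + 3.0 for x in state_values]
sensitivity = ensemble_sensitivity(metric_values, state_values)
```

Evaluating the function at every grid point of the streamfunction field would give a map in the units of the metric, comparable in form to the shading in Figure 15.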
In Figure 15c there is a broad region of sensitivity linked to both the anticyclone to the east and the downstream ridge. At this point, 48 hr into the forecast, it can be expected that positional differences and differences in the TC outflow, similar to those seen in Figure 12, are impacting the environment. Therefore, the assumption that the state variable and forecast metric are independent in Equation (8) no longer holds. This is also true for Figure 15d at 1200 UTC 06 December. In this instance there is a broad region of strong negative sensitivity close to the storm and strong sensitivity to the location of the trough to the north of the TC. Whilst the sensitivities at this point in the forecast (T+72) are likely caused by positional differences, the plot highlights the importance of the position and depth of the trough to the north. Increasing the streamfunction by one standard deviation in this area would cause the trough to not extend as far to the south. This in turn would mean a stronger easterly steering flow and weaker northerly steering flow for Hagupit, causing the forecast to predict the TC to make landfall instead of turning to the north. The analysis in this section has used the storm-removed winds to calculate the streamfunction to use as the state variable. Therefore, the impact of positional differences has been masked from the analysis to demonstrate how uncertainties in the steering flow can also lead to uncertainties in the track forecasts.

SUMMARY AND CONCLUSION
Although TC track forecasts have improved significantly over the past few decades, there remain cases where the forecasted position of a storm has been associated with a large amount of uncertainty. Although this is not necessarily a bad thing (a TC may be inherently unpredictable due to the chaotic nature of the atmosphere), understanding the causes of uncertainty on a case-by-case basis helps forecasters to understand why a forecast may be uncertain in a similar scenario. Typhoon Hagupit (2014) is an example of a TC in which the positional error and ensemble uncertainty of the forecasts were large. MetUM global ensemble forecasts initialised up to 60 hr before landfall failed to predict with certainty where, or indeed if, Hagupit would make landfall. Some ensemble members predicted the storm to make landfall over the central Philippines and others predicted Hagupit to turn to the north, missing the Philippines altogether. The potential impact of Hagupit was particularly high as it occurred only 13 months after the high-impact typhoon Haiyan made landfall in the Philippines. Although both storms exhibited many similarities in track and intensity, the predictability of Haiyan's track was greater than that of Hagupit's. The difficulty in predicting Hagupit's track was linked to the weak environmental flow in which the storm was embedded. Whilst Haiyan was steered by a strong subtropical ridge, Hagupit was embedded in a much weaker steering flow and became positioned between two anticyclones. To the east an anticyclone pulled the storm to the north, and to the west an anticyclone steered the storm towards the Philippines. This synoptic set-up resembles the large-scale deformation identified in similar cases with low TC predictability (e.g., Emanuel, 2005; Torn et al., 2018).
An ensemble of trajectories, calculated using the storm-removed environmental steering flow from initial positions representative of a typical TC forecast error, showed similar spread and characteristics to the ensemble track forecasts. This demonstrated that the exact position of the forecast earlier on is critical to its subsequent track.
The most significant contributions to the depth-averaged steering flow for Hagupit came from the upper levels. North-turning ensemble members were associated with a slightly stronger upper-level anticyclone to the east earlier in the forecasts. As the storm slowed on approach to the Philippines, there were significant differences in the downstream ridge building and the position of a detached PV streamer. At approximately 0000 UTC 07 December there was a statistically significant east-west positional difference of the TC between NORTH and WEST groups of ensemble members, with the NORTH members being positioned further to the east.
A global 45-member MetUM ensemble showed there was indeed a statistically significant correlation between ensemble members predicting the TC to be positioned further to the east at T+72 (1200 UTC 06 December) and ensemble members predicting the storm to be positioned further to the northeast at T+120 (1200 UTC 08 December). Additionally, ensemble sensitivity analysis showed that this positional difference earlier in the forecast could be linked to the strength and shape of the anticyclone to the east. Increased strength of the storm-removed streamfunction to the southeast of the storm or to the northeast of the anticyclone at T+24 or T+48 is associated with TC positions further to the northeast at T+120. Ensemble sensitivity analysis also highlighted sensitivities to the depth and position of the upper-level trough later in the forecast.
Large TC track errors are often associated with steering flow in which small perturbations to the TC location, or to the steering flow itself, can cause the TC to move into a markedly different position later in the forecast. This is usually due to a bifurcation point in the environmental flow (e.g., Grams et al., 2013; Torn et al., 2018). Hagupit's steering flow showed similarities to TCs located close to a bifurcation point. The east to west differences which formed while the TC entered the region between two anticyclones were key to whether Hagupit was predicted to make landfall or turn north. However, Hagupit differed from other case-studies as the steering flow broke down and reached near-zero. The likelihood of Hagupit turning north (i.e., steered by the anticyclone to the east) or making landfall (i.e., steered by the anticyclone to the west) depended on positional differences which developed after Hagupit became located between the two anticyclones. These positional differences developed because of environmental differences in the ensemble members leading to slight differences in the steering flows.
Forecasting these differences is made more difficult because Hagupit's own outflow interacts with and influences the development of the environment.
FIGURE 15 Ensemble sensitivity of Hagupit's position along the major axis at T+120 to the storm-removed streamfunction averaged between 850 and 300 hPa (shading) at (a) 1800 UTC 03 December (T+6), (b) 1200 UTC 04 December (T+24), (c) 1200 UTC 05 December (T+48), and (d) 1200 UTC 06 December (T+72). The sensitivities show the expected change in TC position along the major axis at T+120 (shown by the straight line with positive distance dashed and negative dotted) if the streamfunction were to be increased by one standard deviation at that point. The contours are of the ensemble mean storm-removed streamfunction. The TC symbol is the ensemble average position of the TC at that particular time.
The analysis of Hagupit adds to existing studies which predominantly investigate TCs in the North Atlantic basin. The novel use of perturbed trajectories calculated using the environmental winds provides a computationally efficient method of assessing the potential uncertainty of a deterministic TC forecast. The interactions between the TC and the environment show how outflow from a TC can modify the environment and cause a change to the TC's steering flow (e.g., Keller et al., 2019). Above all, the many different processes highlighted in this study which may have impacted the motion of Hagupit highlight the complexities of TC motion in weak steering flow and thus the need for continued case-studies to enhance understanding.
A limitation of global TC forecasts is their inability to accurately resolve some smaller-scale processes and thus predict the TC's intensity and structure. In addition, the use of a convection scheme in global models may mean that the distribution of diabatic heating in the eyewall is not well represented. Further, the convection is directly coupled to the outflow of the storms which can also feed back onto the storm motion. Finally, accurately forecasting the vertical structure of the storm is also important to ensure the storm interacts with steering winds at the correct heights.
In a companion paper, we will analyse a set of convection-permitting (4.4 km grid length) ensemble forecasts for Haiyan and Hagupit, with the aim of understanding how increased horizontal resolution and an explicit representation of convection affect model track predictions.