Estimating Maximum Extent of Auroral Equatorward Boundary Using Historical and Simulated Surface Magnetic Field Data

The equatorward extent of the auroral oval, the region which separates the open‐field polar cap regions with the closed field subauroral regions, is an important factor to take into account when assessing the risk posed by space weather to ground infrastructure. During storms, the auroral oval is known to move equatorward, accompanied by ionospheric current systems and significant magnetic field variations. Here we outline a simple algorithm which can be used to estimate the maximum extent of the auroral equatorward boundary (MEAEB) using magnetic field data from ground‐based observatories. We apply this algorithm to three decades of INTERMAGNET data, and show how the auroral oval in the Northern hemisphere moves South with larger (more negative Dst) storms. We simulate a number of storms with different magnitudes using the Space Weather Modeling Framework (SWMF), and apply the same auroral boundary detection algorithm. For SWMF simulated storms with Dst > −600nT, the estimates of the MEAEB are broadly in line with the same estimates for historical events. For the extreme scaled storms (with Dst < −1,000 nT), there is considerable scatter in the estimated location of the auroral equatorward boundary. Our largest storm simulation was calculated using Carrington‐like estimates for the solar wind conditions. This resulted in a minimum Dst = −1,142 nT, and a minimum estimated auroral boundary of 35.5° MLAT in places.

• The maximum extent of the auroral equatorward boundary was estimated for individual days for three decades of ground magnetic field data • Geomagnetic storms were simulated using the Space Weather Modeling Framework, and found to give boundaries similar to historical data • Extreme geomagnetic storms (Dst < −1,000 nT) were simulated, resulting in auroral equatorward boundaries below 40° magnetic latitude Supporting Information: • Supporting Information S1 • Data Set S1 • Data Set S2 • Data Set S3 • Data Set S4 • Data Set S5 • Data Set S6 • Data Set S7 • Data Set S8 • Data Set S9 • Data Set S10 • Data Set S11 • Data Set S12 • Data Set S13 • Data Set S14 • Data Set S15 • Data Set S16 • Data Set S17 A number of related but different phenomena have been used to estimate the the poleward and equatorward boundaries of the auroral oval during geomagnetic storms. These include the presence of electron precipitation (Carbary et al., 2003;Ngwira, Pulkkinen, Mays, et al., 2013;Ngwira, Pulkkinen, Wilder, et al., 2013), optical auroral sightings (Ding et al., 2017;Milan et al., 2009;Silverman, 2005), and changes in ground magnetic variations (as described above) (Woodroffe et al., 2016). Of these different metrics, optical auroral sightings have been recorded for the longest time (Stephenson et al., 2004), but it can be difficult to precisely quantify the 'footprint' location of the aurora (i.e., the locations directly beneath the aurora), and care must be taken when interpreting historical sources. In addition, auroral emissions are not always coincident with the auroral oval, and can even occur during periods of low activity (Silverman, 2003). Electron precipitation data only exists for the satellite era, and as an incomplete record, so few of the largest geomagnetic storms in the record have such data.
Enhanced geomagnetic variations due to increased electric currents in the ionosphere are perhaps the most consequential indicator of the auroral boundary in terms of threats to modern infrastructure. In addition, geomagnetic data have been continuously measured across the globe since the 1830s (Stern, 2002), and multiple large geomagnetic storms have occurred during this time. Despite this, early geomagnetic records are often off-scale or incomplete (Shea & Smart, 2006), or difficult to access for performing a global study. In terms of widespread and readily accessible geomagnetic field data, digital archives such as INTERMAGNET and SuperMag host data from geomagnetic observatories from around the 1980s to present (at cadences of 1 min or quicker).
In this study, we outline a simple algorithm that uses only geomagnetic field data such as these from multiple locations to estimate the MEAEB (for brevity). This algorithm separates the more equatorward and less active subauroral region from the more active poleward region in the Northern hemisphere (in terms of geomagnetic activity) for different days. We apply this algorithm to 25 years of global geomagnetic field data, and plot the location of the MEAEB against minimum Dst for each day, allowing us to build a relation between the MEAEB and storm-intensity. Finally, to investigate the MEAEB for storms larger than −589 nT, we simulate storms with a range of intensities using the Space Weather Modeling Framework (SWMF) (Toth et al., 2005(Toth et al., , 2012. The most intense of these simulations use solar wind inputs that are scaled to Carrington-like conditions, and result in extreme geomagnetic conditions (Dst < −1, 000 nT). We apply our auroral boundary algorithm to these simulations, and compare to estimates for the historical Carrington event.

Identifying MEAEB from Geomagnetic Data
The basic function of our algorithm used to calculate the MEAEB latitude is to automatedly separate the relatively geomagnetically quiet and more equatorward subauroral region from the more active poleward region, for a fixed time-period. In effect, this estimates the maximum equatorward extent of the auroral region, as opposed to an instantaneous position. The top panel of Figure 1 shows the measured horizontal magnetic field components (B X and B Y ) for the Eskdalemuir INTERMAGNET observatory for the 29-31 October 2003 "Halloween" storm period. From these, horizontal electric field values were calculated for each site using the frequency dependent (ω) plane-wave equation (Pirjola, 2001 where μ 0 is the vacuum permeability and subscripts X and Y refer to the North-South and East-West components, respectively. From these two calculated electric field components, the horizontal electric field (E H ) was calculated From this, the maximum E H was noted. In this study, the Quebec 1D resistive model outlined in Boteler and Pirjola (1998) was used for all electric field calculations. In reality, each of the INTERMAGNET sites will have different subsurface resistivity profiles. By using the same profile for each, we are in effect using the maximum calculated electric field value as a proxy for variations in the geomagnetic field measured at each site. This approach (using maximum 1-D calculated E H as a proxy for magnetic variations) has been used before by Woodroffe et al. (2016), Ngwira, Pulkkinen, Mays, et al. (2013);Ngwira, Pulkkinen, Wilder, et al. (2013), and Pulkkinen et al. (2015).
In addition to the maximum horizontal electric field, the magnetic latitude of the site was calculated using the AAGCMv2 method (Shepherd, 2014 Figure 2. For this particular example, there is a clear boundary at approximately ±50° which separates the quieter subauroral region (with calculated electric fields < 0.5 Vkm −1 ), and the more geomagnetically active poleward regions (>0.5 Vkm −1 ). This boundary is what we describe as the MEAEB for this particular day, and what our algorithm attempts to identify.
In order to automatically calculate the maximum latitudinal extent of this boundary, the following steps were taken. First, the logs of the maximum calculated electric field values were taken for the northern hemisphere (as there are far more points here than in the Southern hemisphere). A natural cubic spline fit was applied to the data (Woltring, 1986). A fixed smoothing parameter (p = 400) was used in order to prevent overfitting the data. The gradient of this smoothed fit was then taken at every point. The latitude at which this gradient was at its greatest was taken as the MEAEB location, that is, where the amplitude of the electric field was seen to increase the most. In order to estimate errors in this fit, 500 spline fits were calculated from n randomly selected subsamples of the data (where n = 0.75 times the available sites). The standard deviation of these 500 calculated MEAEB locations was then used as an estimated error for the fit.  disturbed day (in blue) had elevated maximum E H values at all latitudes when compared to the quiet day (in red). The fitted smoothed spline fits are shown as bold lines, and the vertical dashed lines show the points along these smoothed lines with the largest gradient. These mark the calculated MEAEBs for the 2 days. It can be seen that the disturbed day had a calculated MEAEB that occurred at 52.5°N. This value can be compared to the ±50° boundary estimated by eye in Figure 2. The quiet day's boundary was calculated at 63.7°.

Applying Algorithm to 25 Years of Geomagnetic Data
The MEAEB location was calculated for every day from 1991 to 2016 using 1-min INTERMAGNET data taken from all available observatories. The magnetic observatories that contribute to these networks are distributed across the globe, with the majority in the northern hemisphere. Since 1991 (the start of the availability of INTERMAGNET data), the number of recording INTERMAGNET observatories has steadily increased from around 40 to over 100. In addition to the INTERMAGNET data, 1-min data were taken from the SuperMag database (http://supermag.jhuapl.edu) for the 13-14 March 1989 geomagnetic storm.
The location and shape of the auroral oval can change significantly throughout the course of a single geomagnetic storm. The algorithm uses the maximum E H over a fixed time-period, therefore calculating the maximum equatorward extent of the auroral boundary, as opposed to an instantaneous position. By using the maximum E H for a relatively long period (i.e., 24 h), the auroral and subauroral regions are better differentiated, as not all observatories North of the auroral boundary will experience elevated geomagnetic activity at exactly the same time. In addition, a calendar day is a convenient time window given the long timescale of 25 years covered in the study, and the fact that INTERMAGNET data files are given for individual observatories for each calendar day.
For a given day, the horizontal magnetic time-series for each available magnetic observatory were examined. Datapoints which were more than 12σ from the mean of the time-series were considered spurious and removed. Gaps in the data were linearly interpolated over. Where some time-series exhibited large artificial steps of several hundred nT (i.e., where the data baseline suddenly increased or decreased), these datasets BLAKE ET AL.  were discarded. Using the remaining data from the observatories, the algorithm outlined in Section 2 was applied, and the MEAEB was calculated.
For each day, the calculated MEAEB was plot against the corresponding minimum daily Dst (taken from the World Data Center for Geomagnetism, Kyoto-wdc.kugi.kyoto-u.ac.jp). This can be seen in Figure 4. Errorbars are the previously mentioned 1σ estimates from 500 bootstrapped spline fits. As can be seen from this plot, the calculated MEAEB can be seen further South for larger geomagnetic storms. For minimum Dst values >−200 nT, days appear to have calculated MEAEB's within a decreasing band of approximately 8°. Days with minimum Dst <−200 nT are less plentiful, and show a larger scatter in MEAEB position. The day with the lowest calculated MEAEB was for the March 14, 1989, with a calculated boundary location of 45 ± 3.8° magnetic latitude.
For days with low geomagnetic activity, the subauroral and poleward regions are poorly differentiated in ground geomagnetic data (or in maximum calculated E H , our chosen proxy). As such, the boundary calculation routine outlined above can misattribute a lower auroral boundary than is expected, and produce a large 1σ errorbar. Figure 4 has days with 1σ > 5° omitted for this reason.
As previously mentioned, the number of INTERMAGNET observatories varies by date. The limited number of operational observatories toward the start of the data set meant that binning the observatories by longitude and calculating the MEAEB for different longitudes was not possible using the above method. For example, in 1991 there were only 40 geomagnetic observatories with available data, 34 of which were in the Northern hemisphere. Furthermore, most of these were clustered around Europe. All available INTER-MAGNET data were therefore used for every day, and a single latitude value was returned for the MEAEB.
In order to investigate how the number of available observatories affects the calculated MEAEB location, two storm events with a large number of recording observatories were chosen. These were the October 30, 2003 and March 17, 2015 events. For both storm events, an increasing number of observatories (from 40 to the maximum number available) were chosen at random, and the MEAEB was calculated. This was repeated 1,000 times to get 90% confidence intervals for the boundary calculations, for every number of available magnetic observatories. BLAKE ET AL.   . In addition, the 12-14 March 1989 days were included using SuperMag data. Errorbars are 1σ estimates from 500 bootstrapped sample fits. MEAEB, maximum extent of the auroral equatorward boundary. Figure 5 shows the mean of the calculated MEAEBs for both events, as well as the 90% confidence intervals. For both, as the number of observatories used in the boundary calculation increases, the 90% intervals narrow. In the case of the October 30, 2003 event, the mean MEAEB only changes by a fraction of a degree as the number of sites is increased from 40 to 90. For the March 17, 2015 event, the mean MEAEB changes by approximately 1. Seven degree as the number of observatories is increased from 40 to 113. The difference between the change in MEAEB for the two storms may be due to intensity or global structure of the individual storms. For both events, the calculated MEAEB using all available sites was within the 90% confidence interval for the calculated MEAEB using only 40 sites. From these two events, we conclude that days with more available magnetic observatories will have smaller uncertainties associated with the MEAEB calculation. In addition, there is an uncertainty on the order of a few degrees in the location of the calculated MEAEB using our algorithm.

SWMF Simulations and Setup
The simulations performed in this study use the SWMF, a software framework for physics based simulations of the Sun-Earth system (Toth et al., 2005(Toth et al., , 2012. The SWMF combines a number of different physics domains that span a wide range of spatial and temporal scales. These domains cover different parts of the Sun-Earth environment, from the solar corona to the ionosphere. The model used in this study consists of the Block-Adaptive Tree Solar wind Roe-type Upwind Scheme global magnetosphere model (BATS-R-US) coupled to the Rice Convection Model for the inner magnetosphere (RCM) and the Ridley Ionosphere Model (RIM), which together simulate the magnetosphere-ionosphere system's interaction for a number of different solar wind driver scenarios.
BATS-R-US is a magnetohydrodynamic (MHD) model which simulates the plasma conditions in the magnetosphere on a block-adaptive grid (De Zeeuw et al., 2000;Powell et al., 1999). RCM models the inner BLAKE ET AL.  magnetosphere (De Zeeuw et al., 2004;Toffoletto et al., 2003), capturing ring current dynamics by receiving magnetic field and plasma moments from BATS-R-US, then returning plasma density and pressure back to BATS-R-US. RIM is a height-integrated ionospheric electrodynamics model . It receives field-aligned current density from BATS-R-US, and delivers electric potential to RCM and BATS-R-US. The communication among these models is facilitated by the SWMF, allowing for a more comprehensive representation of the Earth's magnetosphere-ionosphere system.
Surface magnetic field perturbations are calculated as part of the SWMF on a user-defined grid for a specified timestep. The surface magnetic field at any point is approximately the sum of the Biot-Savart integrals calculated magnetic contributions from each of the current systems in the magnetospheric and ionospheric domain, as well as the field-aligned currents which connect them. The simulations in this study output a 1 × 1° grid every 60 s. In addition, the simulations output magnetospheric conditions, a 2d shell of ionospheric currents and a SYM-H estimate from the simulation, which is in effect a 1-min Dst value (Wanliss & Showalter, 2006).
The combination of BATS-R-US, RIM and RCM is well established for extreme geomagnetic storm simulations (Ngwira, Pulkkinen, Mays, et al., 2013;Ngwira, Pulkkinen, Wilder, et al., 20132014Welling et al., 2020), and has been shown to perform well when replicating surface dB/dt Toth et al., 2014) and Dst. It is currently being used for operational forecasting at the Space Weather Prediction Center (Haiducek et al., 2017).
Each of the simulations in this study was run on a grid made up of approximately 5.89 million computational cells, with the smallest cells being 1/16 Earth radii in size. A high coupling rate of 5 s was chosen for the different modules, and F 10.7 value of 275 solar flux units was used. This value is consistent with solar maximum conditions (Ngwira et al., 2014). Typically in SWMF simulations, the inner magnetosphere boundary (R body ) and location at which the magnetospheric currents are mapped (R curr ) are set to 2.5 and 3.0 R E respectively. Despite requiring greater computational time, we found that when attempting to simulate larger geomagnetic storms, smaller values for these numbers were necessary in order to correctly map geomagnetic variations at lower latitudes. This is explored further in Appendix A. We therefore reduced these values, depending on the severity of the solar wind drivers used as inputs. For our largest storm simulations, we set R body = 1.25 R E and R curr = 1.5 R E . The latitudinal resolution for RIM was 1°, and the latitude boundary for RIM was 10°. For all the simulations performed in this study, the radial magnetic field was not forced to coincide with the internal magnetic field (B0 value in the simulations). Due to this smaller boundary in the simulation, we also increased the particle density at the magnetospheric boundary to 1,000 particles cm −3 .

Solar Wind Scenarios
Inputs for the simulations are solar wind components in the form of magnetic field, velocity, temperature and density. For our simulations, 1-min data were taken from the ACE and WIND spacecraft (accessed via NASA's OMNIWeb portal -omniweb.gsfc.nasa.gov). Seven periods with different solar wind conditions were chosen to be simulated. These periods were chosen according to solar wind data availability, and because these periods had a range of actual Dst values, from very quiet (Dst = 0 nT) to extremely disturbed (Dst = −422 nT). The solar wind conditions, actual measured minimum Dst values and simulated minimum Dst values are shown in Table 1 Notes. The B Z , V X , and n columns refer to the solar wind inputs used for the simulations. The "Real Dst" column is the minimum Dst for the real events. The "SWMF Dst" column is the minimum calculated Dst from the output of the simulations. The scaled events used scaled solar wind inputs from two historical periods (A and B).

Table 1 Selected Maxima and Minima for the Simulated Events
Solar wind data are of limited availability (due to saturation of satellite instruments during large events), so in order to simulate extreme events more intense than March 1989 event (Dst < −589 nT), the solar wind conditions during two recent storms were scaled and used as inputs for the SWMF simulations. The two storms chosen were the 20-21 November 2003 storm (with a minimum Dst = −422 nT), and the 8-9 November 2004 storm (minimum Dst = −374 nT). This scaling approach was chosen in order to maintain some small-scale structure within the solar wind, as opposed to creating completely synthetic time-series.
The velocity, magnetic field, density and temperatures for the unscaled November 2003 event are shown in Figure 6. An hour into the time-series (dashed vertical red line) marks the arrival for the CME for this storm. Each of time-series after this point were scaled by some factor, to get different solar wind scenarios of increasing intensity. The final scaled iteration (Scaled-B6 in Table 1) is what we estimate to be a "Carrington-like" storm.
BLAKE ET AL.
10.1029/2020JA028284 9 of 21 For our Carrington event, a maximum estimated velocity of 1,945 kms −1 was chosen. This approximate value was arrived at by comparing the timing of the flare on September 1, 1859, with the onset of the geomagnetic storm on the 2 September 1859 (Cliver & Dietrich, 2013;Li et al., 2006). Manchester et al. (2005) simulated an extremely fast CME which traveled 1 AU in 18 h (approximately the same time as the Carrington CME). In order to achieve this, their simulated CME had an eruptive velocity of 4,000 kms −1 . This reduced to ∼2,000 kms −1 at 1 AU. For our Carrington-like solar wind conditions, the velocity after 0801 UT was therefore multiplied by 2.59. Of all of the components of the solar wind, the velocity is the only value that we can bound with some confidence for the Carrington event. Other values must be inferred, or arbitrarily scaled. A maximum intensity for the total magnetic field of the solar wind inputs was set at 91 nT. This value is calculated from the empirical relationship between velocity and B of magnetic clouds at 1 AU recorded by Gonzalez et al. (1998): Although we note here that this relationship is derived from a limited CME data set with peak B intensities of < 40 nT. The B y and B z components of our solar wind were therefore scaled by a factor of 1.6 after 0801 UT. The density was multiplied by a factor of 4 so that it peaked with 115 cm −3 . This arbitrary multiplier is large, but we note that it results in a time-series with a lower peak density than has been measured before in CMEs (Tsurutani et al., 2003). Finally, the temperature was multiplied by a factor of 8, to give a maximum of 6 MK. This is in line with the measured temperature of the July 2012 fast CME (Ngwira, Pulkkinen, Mays, et al., 2013;Ngwira, Pulkkinen, Wilder, et al., 2013). With these solar wind inputs, the Carrington-like simulation returned a minimum Dst of −1,142 nT. This value is in the upper range of Dst estimates for the Carrington event derived using historical magnetic field data (see Cliver & Dietrich, 2013 and references within).
The 20-21 November 2003 solar wind conditions were incrementally scaled six times, to get six different storm events with decreasing Dst (increasing intensity). The 8-9 November 2004 conditions were scaled only twice, as it was found that even when scaled to Carrington-like conditions (Scaled-A2 in Table 1), this resulted in a minimum Dst of only −757 nT.

Calculating MEAEB from SWMF Ground Magnetics
As mentioned above, the SWMF simulations can calculate the geomagnetic field for a user specified grid. In our case, the simulations calculated geomagnetic field data on a 1 × 1° grid in geomagnetic coordinates. We can therefore apply our auroral boundary algorithm to the SWMF simulated geomagnetic data in different ways. Here we outline two approaches. First, we interpolate the geomagnetic field data to INTERMAGNET locations in order to directly compare with our historical MEAEB estimates. Second, we use all of the available simulated geomagnetic data to get a 2D estimate of the MEAEB.

Method 1: Interpolating to INTERMAGNET Sites
In order to directly compare the simulation outputs with the MEAEB locations calculated from INTER-MAGNET data, the simulated geomagnetic field outputs were interpolated to the magnetic coordinates for all of the 95 INTERMAGNET stations that were recording in 2017. From these, E H was calculated using the Quebec resistivity model as before. Then our boundary algorithm was applied. The normalized electric fields in the Northern hemisphere and resulting calculated boundaries for 12 of the simulations are shown in Figure 7. This shows the location of the INTERMAGNET sites as white dots, the calculated boundary as a horizontal red line, and bootstrapped 1σ estimates as a yellow horizontal region.
The daily minimum Dst versus calculated MEAEB latitudes are shown in the top panel of Figure 8   The shaded black region shows a fit of the same form applied to the calculated boundaries ±2σ.

Method 2: Using all Simulated Geomagnetic Field Data
The second approach used all the simulated geomagnetic field to calculate the MEAEB location. For each simulation, the maximum electric field was calculated at all points of the output grid. For every line of longitude, a ±10° averaging window was applied to the maximum calculated E H for each latitude bin. The boundary algorithm was then applied to the resulting averaged geoelectric field to get an auroral boundary location estimate for that particular line of longitude. The window was moved longitudinally by a 1° increment and the process was repeated in order to get a 360° estimate of the auroral boundary location. Figure 9 shows the location of these calculated auroral boundaries for 12 of the simulations. are the two lowest intensity storm simulations (Dst −1 and −7 nT). In these examples, the algorithm does not perform well, for the same reason that it does not perform well for quiet historical days; subauroral and poleward regions are poorly differentiated in terms of E H amplitudes. In addition, for the most intense simulations (Dst < −1,000 nT), the auroral boundary estimate is discontinuous in places.
For every simulation, there are therefore 360 calculated MEAEBs latitudes using Method 2. The median of these is plotted against minimum Dst for each of the simulations in the bottom panel of Figure 8 as red diamonds. In addition, the 25%-75% confidence intervals are plotted as red errorbars, and the total range of BLAKE ET AL.
The shaded black region shows a fit of the same form applied to the 25% and 75% confidence intervals. Equation 6 returns slightly lower calculated MEAEB values than Equation 7.
While the median MEAEB latitude values for all of the simulations are above 40°, the three largest storm simulations (with Dst < −1,000 nT) saw calculated boundaries at certain longitudes dip below 40° N. The lowest calculated boundary was 35.5° for the simulation with a minimum Dst of −1,054 nT. This low-latitude boundary value can be compared to the historical Carrington event, albeit indirectly. While there are not enough existing surface magnetic field data from the Carrington event to directly calculate the MEAEB as above, the location of the auroral latitude for this event can be inferred. One existing surface magnetic field data set for the Carrington event is from Rome. This data set saw an extremely large horizontal magnetic field deviation, which, when coupled with very low-latitude auroral sightings, indicate that the auroral oval was at least as far South as Rome (38.6° magnetic N) in 1859 (Blake et al., 2020;Hayakawa et al., 2019). This indicates that the MEAEB estimates for our largest simulations are consistent with actual superstorm values.

Comparing Algorithm Outputs to Other Auroral Phenomena
As can be seen in Figures 7 and 9, the algorithm outlined in this study can separate the geomagnetically active poleward regions from the more geomagnetically quiet equatorward regions. Throughout the study, we have labeled these calculated points as the MEAEBs. In this section, we compare the algorithm output values to auroral equatorward boundaries estimated using precipitating electron data taken by satellite, as well as the location of the polar cap boundary for two of the SWMF simulations.

Comparison with Empirical Auroral Model
On successive orbits from its launch in 2003, the Defense Meteorological Satellite Program (DMSP) f16 satellite measured the mean energy and energy flux of precipitating electrons in the auroral oval with extreme ultraviolet to far ultraviolet images taken using the Special Sensor Ultraviolet Spectrographic Imager (SSU-SI) instrument. By identifying areas with energy flux thresholds above 0.2 ergs s −1 cm −2 , an initial nightside auroral boundary is identified. This boundary is then combined with a precalculated auroral boundary using the Global UltraViolet Imager (aboard the TIMED satellite, see Zhang & Paxton, 2008) data to get an equatorward boundary estimate for an orbit. These data, along with other along with other products such as identification of discrete auroral arcs, can be found at https://ssusi.jhuapl.edu/, along with a detailed description of the algorithms used. The SSUSI-derived auroral boundary model data are available from 2005 to 2016. In this time-period, the minimum Dst was −247 nT. For 855 randomly selected days in this time-period (including the 100 most disturbed days by Dst), the most equatorward location of the boundaries derived by the SSUSI-derived auroral boundary model were recorded, and compared to our calculated MEAEBs for the same days. This is shown in Figure 10.
In comparison to the SSUSI based model, our algorithm predicts more poleward MEAEBs as the minimum Dst of the day decreases (with slope = 1.35). In addition, there are many geomagnetically quiet days (Dst > −15) for which the SSUSI model gives a very low minimum latitude value (<52°). A least absolute difference linear fit (which effectively weighs these outliers less) is shown in Figure 10. The outputs from our algorithm, which estimates the location of the auroral boundary using surface geomagnetic data (a proxy for electric currents in the ionosphere) gives similar estimates to the electron-precipitation based model. That the two empirical models are not perfectly correlated is unsurprising, as they in effect measure different phenomena associated with the auroral boundary, in order to estimate its daily most equatorward position. Different caveats also exist for each method. In the case of the SSUSI-derived boundaries, that model was made with a limited number of available large-scale (high Kp) geomagnetic events. In addition, the look angle of the instrument can affect measurements (see https://ssusi.jhuapl.edu/data_algorithms for more).
In addition to the SSUSI model, there are also other auroral boundary models that rely on electron precipitation and satellite data, and have been calculated for different time-periods. These include Zhang and Paxton (2008)

Comparison to Simulated Polar Cap Boundaries
Next, we compare our calculated MEAEBs (using both Method 1 and 2) with the polar cap boundary for two of the SWMF simulations (November 20, 2003 and Scaled-B6 in Table 1). The polar cap boundary, which will be a few degrees North of the auroral equatorward boundary (depending on the width of the auroral oval), separates the closed and open geomagnetic field lines. With significant solar wind forcing and reconnection, the polar cap expands, but also shifts toward the dayside (Ngwira et al., 2014). This brings the ionospheric current systems to lower latitudes, and with them an increase in surface magnetic field variations. Figure 11 shows the unscaled 20-21 November 2003 and Scaled-B4 simulations at snapshots when the respective simulations saw the largest expansion of the polar cap boundary. The top row shows total current density in the near-Earth magnetosphere, and the bottom row shows the normalized electric field in the Northern hemisphere, with the 2D extent of the polar cap boundary.
The scaled simulation shows a more compressed magnetopause when compared to the unscaled simulation. This corresponds to a lower dayside polar cap (at 34.5°N MLAT) when compared to the unscaled simulation (41.5° N MLAT). Figure 12 shows how the calculated auroral boundaries compare to the most equatorward extent of the polar cap boundaries for both of the simulations. In both of these instances, the auroral boundaries calculated using Method 1 (interpolated INTERMAGNET sites) were less than 3° further South than the most equatorward position of the polar cap boundary. The boundaries calculated using Method 2 (i.e., all SWMF simulated geomagnetic field data) intersects with the minimum latitude polar cap BLAKE ET AL. boundary in places. In reality, the location of the polar cap boundary and auroral oval should be close, but are not necessarily coincident, with the polarcap boundary expected to be North of the auroral oval (and its emissions) by a few degrees (Carbary, 2005). Figure 12 shows that the boundary calculated using only SWMF simulated geomagnetic data is closely related to the extent of the polar cap boundary.

Discussion and Conclusion
In this study, we have outlined a simple algorithm to estimate the maximum extent of the auroral equatorward boundary from simulated and historical surface geomagnetic field data.  the maximum extent of the auroral boundary is located. The lack of extreme geomagnetic storm days in the database means it is hard to estimate the range of MEAEBs for Dst < −400 nT, although the boundaries can be seen to continue equatorward for what data exist. The most disturbed day for which we have widespread geomagnetic data is the March 14, 1989, with a minimum Dst = −589 nT. This had a calculated auroral boundary of 45° ± 3.8° MLAT.
A number of geomagnetic storms of different intensities (ranging from minimum Dst values of −1 nT to −1,142 nT) were then simulated using a high resolution setup of the SWMF. From the geomagnetic field outputs of these simulations, the MEAEBs were calculated using: (1) interpolated geomagnetic field values at INTERMAGNET locations and (2) using all simulated geomagnetic data to get a 2D estimate of the extent of the auroral oval. For both of these methods, the calculated MEAEBs for the simulations broadly match with the calculated MEAEBs for historical geomagnetic data (i.e., for Dst > −600 nT). This indicates that for low to medium-intensity geomagnetic storms, the SWMF setup used here can replicate the geomagnetic signal of the auroral oval.
For Dst values between 0 to around −600 nT, the extent of the simulated auroral boundaries appears to move equatorward mostly linearly (from >60° to ∼44°). A massive increase in the intensity of the simulated storms (from Dst −600 to <−1,000 nT) resulted in an only slightly more equatorward auroral boundary (down to ∼40°). The most extreme simulated storms (Dst < −1,000 nT) had calculated MEAEBs as far South as 35.5° N in places (as calculated using Method 2), and a large scatter. There are not enough worldwide magnetic field data available to directly apply our auroral boundary algorithm to any historical storm day of similar intensity (in terms of Dst). That said, the low latitude auroral boundaries in our large storms (<40°N) are consistent with the estimated auroral oval location of the Carrington event (at least 38.6°N) (Blake et al., 2020).
Our MEAEB estimates were compared to an empirical auroral boundary model using satellite electron precipitation data for 855 days between 2005 and 2016. Our boundary estimates are broadly in line with the SSUSI-derived model, although it should be noted that our algorithm in effect uses the magnetic signature of electrical currents in the ionosphere, as opposed to electron precipitation. Future work should more comprehensively compare our estimates to the various empirical satellite based auroral boundary models for a larger time-period. In addition to this, the MAEABs for two simulations were found to be closely related to the maximum equatorward extent of the polar cap boundary, as expected.
BLAKE ET AL. The relationship between the size of a geomagnetic disturbance and the location of the auroral oval is particularly important when estimating the effects of extreme events. A common approach to estimating peak geomagnetic and geoelectric field values for an extreme geomagnetic superstorm for a location is to apply different fits to distributions of all available historical measurements (J. L. Love, 2020; J. L. Love et al., 2016;Pulkkinen et al., 2008Pulkkinen et al., , 2012Riley & Love, 2017;Thomson et al., 2011). As digital magnetic field data is typically available for only a few decades (depending on the location), a low or mid-latitude location may have been only subauroral for all available data. Depending on the location, such a site may become engulfed by magnetic variations from the auroral oval as it expands during an extreme storm. Extrapolating from measured geomagnetic field data for an extreme geomagnetic storm estimate may therefore underestimate peak geomagnetic field values in this scenario.
The large scatter in calculated auroral oval latitude for the more extreme simulations may be indicative of a suboptimal simulation setup, and different parameters may be needed to adequately simulate extreme geomagnetic storms. For example, the radial component of the total magnetic field can be forced to coincide with the B0 field in future simulations. It may be useful to re-run the larger simulations using greater resolution in the different SWMF models used. In addition, the combination of BATS-R-US, RCM and RIM is just one possible configuration that can be used to simulate geomagnetic storms, and the position of the auroral boundary will be explored in the future with different SWMF models. An example of this is the comprehensive inner magnetosphere-ionosphere model (CIMI) (Fok et al., 2014). In addition, all of the storms here represent a limited number of solar wind templates. In particular, all of the storms that resulted in a Dst < −1,000 nT were scaled versions of the 20-21 November 2003 storm event. CMEs with different orientations and substructures will have varying levels of geo-effectiveness. Future studies will use more varied large-scale solar wind inputs. In particular, efforts are being undertaken to simulate a storm which will more accurately replicate aspects of the Carrington event (i.e., the quick recovery in the geomagnetic field at low-latitudes).

Appendix A: Location of Inner Magnetospheric Current Mapping in SWMF Simulations
The latitude at which a magnetic field line at an L-shell L touches the surface of the Earth can be described by For smaller L values, the magnetic field line will have a footprint at a lower latitude. As outlined in Section 3, the location of the inner boundary of the magnetospheric domain (R body ) and the location at which the magnetospheric currents are mapped (R curr ) are two parameters that can be altered when running the BATS-R-US simulation. The values chosen can have a marked effect on the distribution of B H at the Earth's surface.
Through the course of running the simulations in this study, it was found that for more intense solar wind drivers, these values needed to be lowered, in order to avoid sharp discontinuities in the surface geomagnetic field. As the R body parameter is increased, the footprint of the FACs which connect the magnetosphere to ionosphere is mapped to higher latitudes. This is highlighted in Figure 13, which shows the maximum SWMF-calculated B H at every point on the Earth's surface for three test simulations. Each of these simulations used the SWPC v2 high resolution BATS-R-US grid (approximately 1.9 million cells, minimum cell size = 1/8 R E ), and were driven using the 'Scaled-B4' solar wind conditions (see Table 1). Different values for R body and R curr were used for each of these runs, and the corresponding Λ latitudes are plotted as horizontal dashed white lines (SWPC's operational run uses R body = 2.5 R E and R curr = 3.0 R E ).
For the runs with R curr = 3.5 R E and R curr = 3.0 R E , there is a sharp discontinuity in ΔB H at the Λ-latitudes in both hemispheres. For the run with R curr = 1.8 R E , there are clearly auroral and subauroral regions, but this is not demarcated by the Λ-latitudes.
For each of the 15 simulations shown in Table 1, a suitably small R curr value was chosen such that no sharp discontinuity in ΔB H was seen. We recommend that R curr is set to a value less than 3 R curr when an intense geomagnetic disturbance is to be simulated.

Data Availability Statement
These data, along with example Python scripts used to calculate the MEAEBs can be found at https://doi. org/10.5281/zenodo.4035207. The results presented in this study rely on data collected at magnetic observatories. We thank the national institutes that support them and INTERMAGNET for promoting high standards of magnetic observatory practice (www.intermagnet.org). Data were also obtained from the SuperMAG database (http://supermag.jhuapl.edu/info/?page=faq). Dst values were obtained from the World Data Center for Geomagnetism, Kyoto (http://wdc.kugi.kyoto-u.ac.jp/). Solar wind data were accessed using the NASA OMNIweb dataportal (omniweb.gsfc.nasa.gov). SWMF simulations were performed on the NASA Center for Climate Simulation's Discover cluster. SSUSI-derived model auroral boundary data were taken from https://ssusi.jhuapl.edu/. Figure 13. Maximum B H on Earth for three test simulations using the SWPC grid. All three simulations were run using the Scaled-B4 solar wind inputs (see Table 1