Constraining the location of the outer boundary of Earth’s outer radiation belt

Characterizing the location of the outer boundary of the outer radiation belt is a key aspect of improving radiation belt models and helps to constrain our understanding of the mechanisms by which the source and seed electron populations are transported into the radiation belts. In this paper, we hypothesize that there are statistical differences in the electron distribution function across the radiation belt outer boundary, and thus analyze electron flux data from the THEMIS (Time History of Events and Macroscale Interactions during Substorms) satellites to identify this location. We validate our hypothesis by using modeled electron L* values to approximately characterize the differences between electron distribution functions inside and outside of the radiation belts. Initially, we perform a simple statistical analysis by studying the radial evolution of the electron distribution functions. This approach does not yield a clear discontinuity, thus highlighting the need for more complex statistical treatment of the data. Subsequently, we employ machine learning (with no dependence on radial position or L*) to test a range of candidate outer boundary locations. By analyzing the performance of the models at each candidate location, we identify a statistical boundary at ≈ 8 R E , with results suggesting some variability. This statistical boundary is typically further out than those used in current radiation belt models.


Introduction
Earth's radiation belts typically manifest as two toroidal regions of magnetically confined, energetic plasma. The outer radiation belt (ORB) comprises a highly dynamic electron population, where fluxes can change by orders of magnitudes on minute timescales (Blake et al., 1992). The relativistic electrons commonly observed in the ORB pose a threat to spacecraft via surface charging and electrostatic discharges between internal components (Baker, 2001;Eastwood et al., 2017;Frederickson et al., 1991). As the well-used geostationary and medium earth orbits overlap with the ORB, there is significant interest in being able to accurately model and forecast its electron properties.
There exist a number of radiation belt models, including: Salammbô (Beutier & Boscher, 1995;Boscher et al., 2000;Bourdarie et al., 2005); VERB (Versatile Electron Radiation Belt) (Subbotin & Shprits, 2009); Abstract Characterizing the location of the outer boundary of the outer radiation belt is a key aspect of improving radiation belt models and helps to constrain our understanding of the mechanisms by which the source and seed electron populations are transported into the radiation belts. In this paper, we hypothesize that there are statistical differences in the electron distribution function across the radiation belt outer boundary, and thus analyze electron flux data from the THEMIS (Time History of Events and Macroscale Interactions during Substorms) satellites to identify this location. We validate our hypothesis by using modeled electron L* values to approximately characterize the differences between electron distribution functions inside and outside of the radiation belts. Initially, we perform a simple statistical analysis by studying the radial evolution of the electron distribution functions. This approach does not yield a clear discontinuity, thus highlighting the need for more complex statistical treatment of the data. Subsequently, we employ machine learning (with no dependence on radial position or L*) to test a range of candidate outer boundary locations. By analyzing the performance of the models at each candidate location, we identify a statistical boundary at ≈8 R E , with results suggesting some variability. This statistical boundary is typically further out than those used in current radiation belt models.
• Simple statistical analysis of Time History of Events and Macroscale Interactions during Substorms data does not yield a boundary location • Machine learning is used to identify a boundary location using electron distribution functions • The boundary location is found to be significantly beyond geosynchronous orbit Correspondence to: T. Bloch, t.bloch@pgr.reading.ac.uk Citation: Bloch, T., Watt, C. E. J., Owens, M. J., Thompson, R. L., & Agiwal, O. (2021). Constraining the location of the outer boundary of Earth's outer radiation belt. Earth and Space Science, 8, e2020EA001610. https://doi. org/10.1029/2020EA001610 STEERB (Storm-Time Evolution of Electron Radiation Belt) (Su et al., 2010a(Su et al., , 2010b(Su et al., , 2011; DREAM (Dynamic Radiation Environment Assimilation Model) (Reeves et al., 2012), and BAS-RBM (British Antarctic Survey's Radiation Belt Model) (Glauert et al., 2014). One of the critically important aspects of defining the boundary conditions for these models is the outer boundary of the ORB (OBORB), since this boundary acts as a time dependent source for the simulations.
There are two aspects of specifying this boundary condition. First, the location must be specified either in physical or adiabatic invariant coordinates, and secondly the source distribution must be specified for the chosen boundary location. Typically, a boundary location is chosen around geosynchronous orbit or an equivalent position in adiabatic invariant coordinates, and the source distribution is taken from either a model output (e.g., Vette, 1991) or observational data. The model boundary locations used do not necessarily correspond to the physical outer boundary, but instead are chosen to maximize the amount of data available to construct the source distribution (more recently this has been data from geosynchronous orbit or the apogee of the Van Allen Probes mission). Importantly, there may be physical processes outside of the arbitrary, data-maximizing boundary location which cannot be included through these modeling approaches. Until radiation belt models capture the entire physics of the radiation belts, they will have difficulty in predicting future behavior, since they will be limited to using reanalysis of past behavior rather than being able to fully model the dynamics into the future.
Determining the extent of the ORB relative to the location of the tail plasma sheet may help to identify mechanisms which may provide the crucial trapped seed population (Jaynes et al., 2015). Since Earth's plasma sheet is known to be an important source of electrons that ultimately form the radiation belt, though the precise mechanism of transport is not well understood (e.g., Forsyth et al., 2016Forsyth et al., , 2014Sergeev et al., 2015).
Given the importance of the OBORB, and the lack of empirical investigation into its location, we here attempt to identify a statistical boundary location. This investigation is built upon the following hypotheses about the ORB and its electron content: 1. The distribution function of the trapped radiation belt electron population differs from the distribution function of the untrapped electrons. 2. There exists statistically-or explicitly -a radial limit at which the distribution functions of trapped and untrapped electrons will -diverge.
Here, trapped electrons refer to radiation belt electrons which exhibit closed drifting and bouncing trajectories, as opposed to the untrapped electrons, whose drift paths lead to them being lost to different magnetospheric regions. Distributions functions in this work are as a function of energy. A further point of note is that different distribution functions for the untrapped electrons have been observed between dawn and dusk, due to electrons injected in the midnight sector being lost to the magnetopause without reaching the dusk sector (Li et al., 2010;Sorathia et al., 2017). Thus, comparing the differences in the distribution functions between dawn and dusk should allow us to identify the radial extent of the bound electrons more easily.
In Section 2 the data and data processing will be discussed. In Section 3.1 the current definition of what constitutes the radiation belt (i.e., where a trajectory has a defined L*) is used to set a benchmark for the type of differences between the ORB and untrapped distribution functions. In Section 3.2 the statistical radial evolution of the distribution function is presented. In Section 3.3, machine learning is employed as a hypothesis testing tool and a statistical boundary location is found for both the dawn and dusk MLT sectors. Finally, we will summarize and make concluding remarks in Sections 4 and 5.

Data
Given that this investigation requires data over a large range of radial distances, we use data from the (THEMIS) spacecraft (publicly available through NASA's CDAWeb archive). The distribution functions are derived from electron flux data from the electrostatic analyzer (ESA) to give us the energy range 10 eV-30 keV and the solid state telescope (SST) to give us the energy range 30-719 keV (Angelopoulos, 2008;McFadden et al., 2008). Data is taken from THEMIS probes A, D and E between 2007/09/27 and 2019/09/29, whilst data from probes B and C is taken up till 2010, at which point they were moved to a lunar orbit BLOCH ET AL.
10.1029/2020EA001610 2 of 14 (Russell & Angelopoulos, 2014). Note that for the L* analysis in Section 3.1, data is only used up until 2017 due to the availability of OMNI data in the SpacePy L* calculator (Morley et al., 2010). Qualitatively, this limitation is very unlikely to affect the results.
This investigation will focus on identifying the equatorial boundary location, and will use data from the dawn and dusk MLT sectors. We use the spacecraft's position in GSM co-ordinates to specify dawn and dusk data (6 and 18 ± 3 MLT hours), and we use geomagnetically aligned (GEOMAG) co-ordinates to specify data from the magnetic equatorial region (Z = 0 ± 0.5 R E ). This latter step is done to ensure that the region we are sampling corresponds to the magnetic equator in the appropriate coordinate system.
To construct the distribution functions for the electrons we convert the direction-averaged differential electron (kinetic) energy flux (DEF, eV/cm 2 ⋅ s ⋅ sr ⋅ eV) into phase-space density (PSD, s 3 /m 6 ) as follows: where E is the measured energy of electrons (in Joules) and m e is the rest mass of an electron. Figure 1 presents the equatorial plane (left) and radial (right) distribution of the THEMIS data used. From this, we note that the data is not evenly distributed, but instead has a radial bias with a maximum ≈ 11.5 R E . This distribution is expected given the orbital parameters of the various spacecraft. Two spacecraft (probes D and E) have their apogee at ≈ 11.5 R E , meaning that they are traveling most slowly at this region and so the density of measurements is higher. Probes B and C have apogee at ≈ 30 and 19 R E , and so their measurements of the inner magnetosphere are more spatially sparse. Probe A has an orbit with apogee at ≈10 R E .
In the following analysis, it will be important to ensure that results are not biased by the radial sampling. To address this, we construct ensembles of randomly sub-sampled data. In each of dawn and dusk, we take n equally spaced radial bins between 5 and 13.5 R E (the amount of available data drops after this radial limit). We find the bin with the fewest samples, m (where m ≈ 3,000 if n = 20). We then construct a new data set by randomly sub-sampling m points from every bin 100 times (with replacement). This new data set is now uniformly populated in radial distance.
Such ensemble sampling addresses positional biases of the spacecraft measurements. Furthermore, we maintain the underlying statistical properties of the PSD distributions in each of the radial bins (Efron & Tibshirani, 1986). There also exist biases in the MLT distribution of the data. However, these biases are BLOCH ET AL. much smaller than the radial biases (as can be seen in Figure 1), and the distribution functions are expected to show less of a trend with MLT than radius, so we do not mitigate for them.

Analysis
In this section we explore various methods which might be used to identify the location of the OBORB. Each method involves comparing the electron distribution function within various radial limits. We look at this through the lens of the hypotheses in Section 1. Initially, we use a non-empirical method based upon the evaluation of L* (Roederer, 1967) to investigate our hypotheses within the typical adiabatic invariant coordinate framework. Following this, we use radial binning to observe the radial evolution of the electron distribution function and look for discontinuous behavior signifying the OBORB. Lastly, we employ machine learning methods as a tool for searching for the radial position of the OBORB through a hypothesis testing approach (though not the same hypotheses as in Section 1).

L* Analysis
Our study focuses on finding the radial extent of the ORB in real space (cf. adiabatic invariant space) by analyzing positional differences in the electron distribution function. This naturally leads to using L* to classify whether data is inside or outside of the radiation belts. L* is a modeled property of magnetically trapped particles, which is used to define the extent of the radiation belts (Roederer, 1967;Roederer & Lejosne, 2018;Roederer & Zhang, 2014). In a dipole field, the modeled L* corresponds to the radial distance of the point where the drift path of an electron intersects the magnetic equator. Employing L* as a definition of the radiation belts themselves allows us test our first hypothesis -that the electron distribution functions within and without the ORB differ. We stress that this is only an approximation for the radiation belts, since it relies on empirical field models (L* is not a measured quantity), which have significant disparities (see e.g., Albert et al., 2018;Thompson et al., 2020), and as such we do not use L* to try to quantify the location of the OBORB.
To incorporate the information L* provides (whether or not the electrons are on a closed drift-path), we employ seven magnetic field models to determine L* for a given datapoint -calculated using the SpacePy's wrapper of the IRBEM library (Morley et al., 2010;Roederer & Zhang, 2014;Albert et al., 2018;Thompson et al., 2020) for 90° pitch-angle electrons (we comment on pitch angle in the discussion section). These models are: T89 (N. Tsyganenko, 1989); OPQuiet (Olson & Pfitzer, 1974); T96 (N. A. Tsyganenko, 1995); OSTA (Ostapenko & Maltsev, 1997); T01Quiet (N. A. Tsyganenko, 2002); T01Storm (N. A. Tsyganenko et al., 2003), and T05 (N. A. Tsyganenko, 2005). These models range from being analytic (OPQUIET) to quite heavily solar wind/geomagnetic index parameterized (T05). Given the seven models used, we specify that so long as at least four models returns a finite L* value, the datapoint corresponds to a trapped drift trajectory for at least some of the electrons measured, and is therefore within the radiation belts. This choice was informed by Thompson et al. (2020), who choose three models but suggest that using more models can reduce model-specific biases. Figure 2 presents the results of the L* analysis. We have employed the sub-sampling method described in Section 2, with n = 20, to ensure that there is no sampling bias in the results. In panel of Figure 2(a), the L* occurrence distribution and median L* values (based on the 4-model agreement criteria) are plotted over the range of radial distances. Below 8 R E , > 90% of the data is located within the radiation belts (in that it has a valid L* value in four of the seven magnetic field models). The occurrence fraction of L* values show a monotonically decreasing relationship with increasing radial distance (except>12 R E ), in agreement with theory. We speculate that the increasing occurrence above 12 R E and the decreasing median L* values above 11.5 R E are spurious and represent some of the issues in trying to solely use modeling to define the OBORB (further issues with using current magnetic field models are highlighted in Albert et al., 2018).
Figures 2(d) and 2(e) present comparisons between dawn/dusk and inside/outside of the ORB (on the basis of L* being defined or not). Comparing vertically (i.e., panels (b) with (d), and (c) with (e)) shows the difference between dawn (top) and dusk (bottom). There is a clear enhancement of the ≈ 10 keV seed population electrons (Jaynes et al., 2015) at dawn which is not present at dusk. There is also a depletion of the ≈ 1 keV source population electrons (Jaynes et al., 2015) which only appears outside of the radiation belts.
BLOCH ET AL.
10.1029/2020EA001610 4 of 14 The medians of the THEMIS SST data (>30 keV), follow a power-law-type distribution as other works have found (e.g., Whittaker et al., 2013;Zhao et al., 2019). Comparing between inside and outside of the radiation belts, the main differences (aside from the aforementioned depletion of source population electrons) are the typically more variable PSDs at energies ≲100 keV outside the belt compared to inside. In contrast, the PSDs above this energy are much less variable outside the belt compared to inside. The distribution functions also have a shallower gradient and more variability inside of the radiation belt, highlighting a considerably more enhanced electron population.

Simple Radial Analysis
To investigate the OBORB, we calculate the median and interdecile (i.e., 10 to 90th percentile) range of data in nine radial bins between 5 − 13.5 R E . These results are presented in Figure 3. These distributions are calculated using the random sub-sampling technique described in section 2, with n = 9, to ensure comparable statistics between each of the bins.
We find significant radial evolution in both the dawn and dusk distribution functions. Both display flattening over the mid-range energies, suggesting either wave-particle interactions (Meredith et al., 2020), or the plasma sheet source (Kurita et al., 2011). The notable difference between dawn and dusk is the pronounced bulge in the dawn distribution at ≈ 10 keV, mirrored in the interdecile ranges of the dawn data. We observe that the dawn and dusk distributions diverge with increasing radial distance up to r ≈ 9.7 R E , after which they converge to similar distributions. At low radial distances, the dawn and dusk data may be more consistent because most of the data is inside the radiation belts, and equivalently at the higher radial distance most of the data is likely to be outside of the radiation belts. We observe that the dawn data exhibits the elbow at BLOCH ET AL.
10.1029/2020EA001610 5 of 14 lower radial limits, and suggest that this may be the contribution of untrapped electrons. This is supported by the dusk distribution converging to the enhancement as the radial limit is increased beyond the expected limit of the OBORB and trapped electrons.
The distribution function at 5.0-5.9 R E is very different in form from that at 12.6-13.5 R E , but the change in form occurs gradually, with no obvious discontinuity as a function of radial distance. This may imply that either there is not a hard boundary, or that the boundary location is highly variable. By not finding such a marker, we infer that this simplistic approach isn't best suited to locating the OBORB.
BLOCH ET AL.

Machine Learning Analysis
With the previous method unable to find a clear radial distinction between electron populations, we now employ machine learning. We approach this much like hypothesis testing, a variety of radial limits are proposed as potential OBORBs (hypotheses) and empirically tested to determine which is most appropriate (the validity of an OBORB radial location, and how we might determine it, are discussed below). We constrain the data to the SST energy channels before applying machine learning, ensuring the results are not biased by lower energy particles, strongly affected by the E × B drift (Roederer & Zhang, 2014).
Our empirical analysis for a single set of proposed dawn and dusk radial limits is as follows: 1. Make a hypothesis by selecting a candidate radial limit for the OBORB (e.g., 7 R E in the dusk or dawn sector). 2. Label each datapoint with a 0 if the measurement was made inside of the candidate radial limit, else label it with a 1. These class labels form the targets that a machine learning model (explained later in the text) will try to predict on the basis of the electron distributions. 3. Combine the dawn and dusk labeled data into a single data set. 4. Provide a machine learning model each of the electron distribution functions as features (i.e., what the model will use to form a prediction). Each input is a one dimensional array of the values of PSD at each energy. 5. Train the machine learning model for the given set of input features (electron distribution functions) and targets (whether the data is inside or outside the chosen radial limit). The training set corresponds to 80% of the data, allowing for model performance to be quantified on an un-seen test set (the remaining 20% of the data). 6. Quantify the model performance of estimating whether a datapoint is inside or outside the chosen radial distance using un-seen electron distribution functions from the testing set. Metrics quantify the differences between the model-predicted class labels (0/1, inside/outside) with the class labels prescribed by the boundary location choices.
Note that the neither the radial boundary locations, nor the radial locations of the measurements are provided to the machine learning model. Instead, the model tries to improve classification accuracy by inferring differences in the input features (PSD at each energy) between each set of class labels. By considering how well the model performs, we are assessing how much information is present in the electron distribution functions about the chosen radial distance. As electron distribution functions are expected to show the greatest difference either side of the OBORB, this in turn provides a measure for how good an approximation the chosen radial distance is for the OBORB. By, "greatest difference" we are referring back to our initial hypothesis that the electron distribution functions of trapped electrons are different to those of untrapped electrons.
Each model used in the following analysis is a gradient-boosted (Friedman, 2001) ensemble of decision trees (Belson, 1959) implemented using the LightGBM framework for Python (Ke et al., 2017). For each set of hypothetical boundary locations, a new model is trained, but the model architecture remains the same. Each ensemble is comprised of 256 decision trees (chosen to exceed suggestions from Oshiro et al., 2012, since LightGBM is cheap to run), which each contain 32 leaf nodes. Each model is gradient boosted using the dart algorithm (Rashmi & Gilad-Bachrach, 2015), where gradient boosting is a method of constructing the ensemble such that each subsequent decision tree in the ensemble is trained to correct for mis-classified predictions from the previous decision trees.
To test a large range of hypotheses we implement the above method in a training loop, stepping through each combination of dawn and dusk radial locations between 6 to 12 R E (in increments of 0.2 R E ). By investigating the model performances over this range of plausible OBORB locations, we can assess the existence or otherwise of an OBORB, and whether the location can be constrained to a certain radial distance range. The existence of an OBORB can be judged by the magnitude of the quantified model performances; if models perform well, then it suggests that an OBORB or OBORB region exists. Once validated, the location of the OBORB can be constrained by comparing the relative skill of the different models and seeing if a particular set of boundary locations leads to models which perform better. Where we find radial limits with the best model performance, we know that these locations correspond to a split which maximizes the differences in BLOCH ET AL.

10.1029/2020EA001610
7 of 14 the distribution function data between the two classes (i.e., inside/outside, 0/1). In our context, this would represent the statistical OBORB.
Before detailing the results, we present the distribution of data obtained by our various radial limits. Figure 4 presents the proportion of data labeled as "inside" at each dawn and dusk limit. There is a noticeable increase in the fraction of data within the radial limit at ≈11.5 R E . This is due to the radial bias in the data distribution presented in 1. Generally the central regions of the plot have balanced data distributions. This distribution will be important in evaluating the performance metrics to ensure that they are not biased by having uneven class distributions.
To quantify our model performances, we employ a variety of binary classification metrics: Accuracy, Gilbert Skill Score (GSS), G-mean, F-measure and Critical Success Index (CSI) (Gilbert, 1884;Kubat et al., 1998;Lewis & Gale, 1994). These metrics (aside from accuracy) have been chosen because they are designed to take into account class imbalances. Since different metrics focus on quantifying different aspects of predictive performance (see how the different metrics are constructed in Appendix A), we present the results of multiple metrics to get a more complete view of the model performances. We also consider the inverted F-measure and CSI to account for the fact that they only consider one correct classification label (namely, the true positive predictions, ignoring the true negative predictions), and finally an aggregated metric comprised of the geometric mean of results from all metrics used. These metrics can all be derived from a confusion matrix of the results of our binary classification. See Appendix A for further details of the metrics and how they relate to confusion matrices. Figure 5 presents the results of our machine learning analysis. Each panel presents a 2D histogram of the performance of a metric at each combination of dawn and dusk boundary conditions. Over-plotted are well as six contours evenly spaced between the 70 and 100th percentiles of the data. By all the metrics used, there are models which perform relatively high for at least a subset of the hypothesized boundary locations. The GSS has the lowest numeric model performance, but still has a constrained region of performance exceeding 0.7 (a score of 0 would represent no-skill and −1/3 is the lowest possible value). Aside from the GSS, each metric is constrained to the range 0-1. If our approach were flawed, and machine learning was not a suitable tool, we would expect to find that the models did not perform especially well at any location. Seeing as there are high-performing models (by each metric), we infer this as validation of our machine leaning approach. The contours of model performance presented allow us to constrain the locations of best performance, which we attribute to the OBORB location. However, before we analyze these contours we will discuss the issue of class imbalance.
BLOCH ET AL.  Of the traditional metrics used, it appears that the GSS and G-mean metrics perform most robustly against the class imbalance, as can be seen by the lack of bias towards the upper right, or lower left areas (where the class imbalance is most pronounced). The average of the metrics also provides a class-balanced representation of the results. One thing to note from these results is the similarity between the accuracy, F-measure and CSI. This likely originates from the algebraic similarity between the definitions of these metrics (see Appendix A). By using the inverted versions of these metrics we address the class imbalance when we take our average of the results, and observe how sensitive the results are to the class imbalance (the metric behavior completely changes by focusing on a different true class prediction). Accuracy is inadequate as a metric when used on imbalanced data, since it is easily biased. This bias can be demonstrated in the following hypothetical case. If one has 100 data points, split into two classes (0 and 1), with 99 points falling in the 0 class. Then a model trained on this data may obtain a predictive accuracy of 99% by predicting everything to be in the 0 class. If it is important to be able to correctly predict the other classification, then this model will have no skill, despite the high accuracy.
Whilst we present all of the metric results in Figure 5, for convenience we will focus the remaining discussion on the results of the average of the metrics, as this encapsulates the trends between all of the metrics. We observe a bounded region of best-performance between ≈ 6.9-9.1 R E in the dawn sector and ≈7.0-9.3 R E in the dusk sector. The contours show sharp decrease in the quantiles of performance outside of this area.

Discussion
For the sake of a clear methodology, we have generally made few comments on the results we've found. Here, we will start by discussing the machine learning aspect of this work, since it yields the most interesting results, and subsequently compare with the features found in our simple radial analysis.
In the machine learning analysis, we employed a fairly simple hypothesis testing approach to investigate various radial boundary locations for the OBORB. Our results suggest that a boundary exists, though its location may be highly variable. Variability in the boundary location may originate from myriad sources: pitch-angle dependence; energy dependence, and solar wind/geomagnetic activity. The pitch-angle of electrons is less likely to affect our results due to the focus on the dawn and dusk regions rather than day or night. For these latter MLT sectors, there is a strong pitch-angle dependence of the drift shell being observed at a given radial location (see figures in Roederer & Lejosne, 2018), which act in opposing directions between day and night. As such, this effect is much reduced in our data, though certainly some of the variability in the results is due to this. For electrons of different energies, there are magnetospheric processes that act preferentially. Thus, for each energy level there may be a different radial location corresponding to the last closed drift shell. By limiting our machine learning experiment to only the higher energies, we reduce the energy dependent effects. Though, as discussed below, we may still observe some of these effects even in our more limited energy range. Solar wind and geomagnetic activity are likely to have a significant effect on the OBORB, since such activity leads to large-scale reconfiguration of the magnetospheric topology and geometry. As this study represents (to our knowledge) the first empirical constraint on the OBORB location using in situ data, we do not account for activity, leaving such considerations for future work much in the same way as BLOCH ET AL.
10.1029/2020EA001610 9 of 14 Figure 5. 2D histograms presenting the machine learning model performance, through various metrics. The average of the metrics presented in panel h represents the geometric mean of the metrics presented in panels (a-g). Over-plotted are six contours between the 70 and 100th percentiles of the data, used to draw attention to the regions of best performance. early research into the magnetopause location (Fairfield, 1971). By doing so, our results likely represent a quiet-time or modal OBORB location. One final consideration is cross-species contamination of the electron flux by protons, which may increase the lower energy channel's PSDs and add additional variability to our results (Turner et al., 2012(Turner et al., , 2013. From our results we infer that the lack of parameterizing by solar wind/geomagnetic activity is the dominant factor for the variability in our boundary location, as activity will move the boundary physically, rather than softening it as the pitch-angle and energy dependence do. We infer this by the high skill scores (relative to the maximum value) which are distributed over a large range of potential radial limits. If instead, there was a softer boundary (i.e., a slow transition between the two characteristic distribution functions), we might still expect to see the smooth variation in the metric scores, but we would typically expect the quantitative values to be lower (e.g., all less than 0.5), as the models would find it more difficult to characterize the subtle differences in the slowly changing distribution functions.
Looking specifically at the average (geometric mean) of the metrics in Figure 5h, the distribution is shifted slightly in favor of a larger radial limit at dusk than dawn, but is otherwise quite a symmetric shape. The ovoid shape of the contours suggest a tendency for the boundary to favor similar values at dawn and dusk, though the implicit variability highlights that this may be only a weak tendency (taking the contours as the extrema of the variability, the dawn radial limit can be ±2 R E compared to dusk and vice versa the variability can be ±2.5 R E ).
The dawn-dusk asymmetries observed might be explained by similar dawn-dusk asymmetries in the magnetosphere (Haaland et al., 2017;Staples et al., 2020;Walsh et al., 2014). As we have excluded the lower energy particles from this portion of analysis, we do not expect this asymmetry to be primarily due to E × B drift, since the curvature and gradient drifts are energy dependent and hence will dominate over the electric field drift (though some recent works have shown that the electric field may still contribute: Sillanpää et al., 2017;Califf et al., 2017). Instead, we speculate that this effect is more likely to be due to asymmetries in the (partial) ring current, whose effect is to increase the magnetic field strength at larger radial distances. This causes the electrons to follow the field and drift further out because of the gradient drift experienced. The sense of the dawn-dusk asymmetry suggests it is not simply the result of the algorithm identifying the magnetopause rather than the OBORB, the magnetopause can be compressed to below 8 R E , but this happens much more frequently at dawn than dusk (Staples et al., 2020). Whilst there may be some contamination of the data due to sampling the magnetopause or solar wind, we infer that this is negligible, since electron populations (and hence their distribution functions) are very different. It is expected that the difference between electron distribution functions inside of the magnetosphere and those in the magnetosheath or solar wind is much bigger than the differences between distribution functions inside and outside the radiation belt. A more easily identifiable dichotomy of distribution functions would be picked out more significantly by the algorithm and so we assert that the boundary identified by the algorithm is not the magnetopause, but the OBORB.
Our identification of the OBORB at ≈8 R E is typically larger than the values currently used in radiation belt modeling (e.g., Subbotin & Shprits, 2009;Shin & Lee, 2013;Glauert et al., 2014Glauert et al., , 2018Ozeke et al., 2014Ozeke et al., , 2018, suggesting that these modeling efforts are potentially missing radiation belt phenomena from the outer regions. Other empirical evidence, such as that in Sivadas et al. (2019), also support an OB-ORB location beyond the currently used limits (9-12 R E in their case). The OBORB being located further out opens up the possibility for smaller scale magnetotail behavior (e.g., less severe substorms) to inject particles into the radiation belts, since they would not have to penetrate to such low L-shells. Such injections could lead to additional variability in the radiation belts (Jaynes et al., 2015;Turner et al., 2017) and to enhanced chorus wave activity in the outer regions (Meredith, 2002).
In Figure 3, we observed a flattening of the PSD at the mid-range energies and speculate that this is due to wave-particle interactions (WPIs). Given the energies of these electrons (10-30 keV) and their location (equatorial region, large radial distance) it is likely that whistler-mode chorus waves are the cause (Omura et al., 2008;Li et al., 2011Li et al., , 2010Meredith et al., 2020). The flattening occurs asymmetrically between dawn and dusk, with dawn being affected at lower radial distances. Meredith et al. (2020) present results showing that both lower-and upper-band chorus have a large dawn-dusk asymmetry. These results also show BLOCH ET AL. that specifically the lower-band chorus intensity is high at the large radial distances where we continue to observe the flattening of the distribution. Our presented results extend to larger radial distances than Meredith et al. (2020) or Li et al. (2010), into regions close to the magnetopause. Due to the sparseness of data and research into WPIs in this region, we cannot speculate on whether or lower-band chorus remains the dominant wave affecting the electrons but these results suggest that more investigation may be required.

Conclusions
This study provides the first in situ, empirically constrained location for the outer boundary of the ORB using THEMIS ESA and SST measurements. Characterizing this boundary location accurately is an important aspect of radiation belt modeling, as it forms a time-varying source of electrons.
By applying simple statistical techniques, we observe significant radial evolution of the distribution functions, highlighting the intrinsic differences between the trapped (radiation belt) and untrapped electron populations. However, this approach did not yield a clear boundary location, instead showing a smooth transition between the two states. Such a transition signifies either a soft boundary, or a boundary with significant variability.
We employ machine learning (specifically, ensemble decision tree classification) in a hypothesis-testing framework, to assess whether there exists an identifiable change in electron distribution functions and hence outer boundary to the ORB, and where it may be located. The data set was converted into 900 binary classification datasets, where data was labeled as either inside or outside of specified dawn and dusk radial limits (our hypothesized boundary locations). 900 machine learning models were then trained to learn this classification. Where the models perform better, we infer that our choices of boundary locations coincide more closely with identifiable changes in the electron distribution functions and hence the true statistical boundary location. By aggregating a series of metrics (many designed specifically for imbalanced datasets) we find a region of best performance between ≈6.9-9.1 R E in the dawn sector and ≈7.0-9.3 R E in the dusk sector.
This work presents a novel methodology for identifying the OBORB location, and opens up future research directions in parameterizing the boundary location by solar wind and/or geomagnetic conditions. Our current results better constrain the statistical location of the OBORB and can be incorporated into the construction of radiation belt models, ensuring that they contain the physical processes of the radiation belts, and allowing future analyses to more appropriately capture the dynamics of injection events and how they influence the behavior of the outer ORB.

Appendix: A Metrics
All of the metrics used in this study can be derived from a confusion matrix. A confusion matrix is made up of True Positives (TP), True Negatives (TN), False Positives (FP) and False Negatives (FN). How these correspond to model predictions can be seen in Table A1.
These relate to the following three commonly used, intermediary metrics and to H R , which is used as a correction factor in the Gilbert Skill Score to account for the random chance of correctly categorizing a sample.
BLOCH ET AL.
The F-measure is the harmonic mean of the precision and recall and the G-mean is the geometric mean of the recall and specificity. On top of the proposed metrics, we also consider their values when the class labels are inverted, allowing us to investigate the robustness to the class imbalance (i.e., TP ↦ TN and FN ↦ FP and vice versa). Of metrics defined in Equations 6-10, we note that only the F-measure and CSI will be affected by this change, and so these are the only additional metric scores calculated.

Data Availability Statement
THEMIS data is publicly and freely available in CDF format at: https://cdaweb.gsfc.nasa.gov/pub/data/ themis/. We specifically use the "psef_en_eflux" from the ESA and SST data, and the "pos" and "pos_gsm" from the STATE data. All of these data were obtained for probes A-E.