A New Four‐Component L*‐Dependent Model for Radial Diffusion Based on Solar Wind and Magnetospheric Drivers of ULF Waves

Waves which couple to energetic electrons are particularly important in space weather, as they drive rapid changes in the topology and intensity of Earth's outer radiation belt during geomagnetic storms. This includes Ultra Low Frequency (ULF) waves that interact with electrons via radial diffusion which can lead to electron dropouts via outward transport and rapid electron acceleration via inward transport. In radiation belt simulations, the strength of this interaction is specified by ULF wave radial diffusion coefficients. In this paper we detail the development of new models of electric and magnetic radial diffusion coefficients derived from in‐situ observations of the azimuthal electric field and compressional magnetic field. The new models use L∗ ${L}^{\ast }$ as it accounts for adiabatic changes due to the dynamic magnetic field coupled with an optimized set of four components of solar wind and geomagnetic activity, Bz ${B}_{z}$ , V $V$ , Pdyn ${P}_{\mathrm{d}\mathrm{y}\mathrm{n}}$ , and Sym−H $\mathrm{S}\mathrm{y}\mathrm{m}-H$ , as independent variables (inputs). These independent variables are known drivers of ULF waves and offer the ability to calculate diffusion coefficients at a higher cadence then existing models based on Kp. We investigate the performance of the new models by characterizing the model residuals as a function of each independent variable and by comparing to existing radial diffusion models during a quiet geomagnetic period and through a geomagnetic storm. We find that the models developed here perform well under varying levels of activity and have a larger slope or steeper gradient as a function of L∗ ${L}^{\ast }$ as compared to existing models (higher diffusion at higher L∗ ${L}^{\ast }$ values).

geomagnetic activity leading to rapid changes in the flux of energetic electrons and enhancements of the terrestrial ring current (e.g., Kanekal & Miyoshi, 2021). These periods of enhanced geoeffective solar wind are referred to as geomagnetic storms. During these storms the enhanced fluxes of the radiation belt can be dangerous for human and robotic space activity as well as causing damage to sensitive spacecraft systems (e.g., Baker, 2001;Cassak et al., 2017).
Though the outer radiation belt was discovered over half a century ago (Van Allen & Frank, 1959), our understanding of, and ability to model and forecast its dynamics remains somewhat limited. This is not a result of a lack of understanding of the processes which drive radiation belt dynamics but instead is due to the complexity and variability of the overall system, from the sources and sinks of electrons and wave-particle interactions coupling to these electrons, to the cross-coupling with other magnetospheric regimes such as the magnetopause (e.g., Borovsky & Valdivia, 2018;Halford et al., 2022;. During storms, the dynamics of the radiation belt are controlled by a variety of physical processes leading to loss, transport, and acceleration of electrons. For instance, magnetopause shadowing (Staples et al., 2022;West et al., 1972) coupled with outward radial diffusion driven by Ultra Low Frequency (ULF) waves (e.g., Turner et al., 2012) can drive rapid losses of electrons. At the same time wave-particle interactions driven by Very Low Frequency Chorus and Hiss waves (e.g., Breneman et al., 2015;Halford et al., 2022;O'Brien et al., 2004), electromagnetic ion cyclotron waves (EMIC) (e.g., Bingley et al., 2019;Bruno et al., 2022), ULF waves (e.g., Brito et al., 2015;Jonathan Rae et al., 2018), and kinetic Alfven (e.g., Chaston et al., 2018) waves can enhance electron precipitation across a wide range of energies leading to a loss of radiation belt electrons. Further, enhanced convection and substorm injections can replenish lower energy electrons providing pathway for the excitation of whistler mode chorus which can rapidly accelerate electrons to relativistic (several MeV) energies (Horne, 2003;Horne et al., 2005;Horne & Thorne, 1998;Thorne et al., 2013) creating peaks in electron phase space density (Reeves et al., 2013). ULF Ozeke, Mann, Turner, et al., 2014) waves can further rapidly accelerate and transport these lower energy seed electrons to MeV energies via ULF wave radial diffusion.
The evolution of the radiation belt is controlled by the sum of the effects of the physical processes described above, and whose strengths varies throughout any given storm leading to a variety of responses. For example, while loss processes typically dominate during the early part of geomagnetic storms (Murphy, Inglis, et al., 2018;, the relative strength of these processes varies through the course of a storm and for storm to storm. This variation can lead to striking differences in the topology of the outer radiation belt such as the formation of multiple outer radiation belts (Baker et al., 2013). Following this initial period of loss, the outer radiation belt typically experiences a short period of rapid acceleration (Murphy, Inglis, et al., 2018;. However, the dominant process controlling this acceleration can also vary from storm to storm  and though this period of acceleration is typical of all storms, the amount of acceleration varies (Murphy, Inglis, et al., 2020;Murphy, Mann, et al., 2020). This delicate balance between loss, acceleration and transport processes is often referred to as a competition. However, it's more a symphony than a competition; an orchestra of physical processes of varying magnitudes working in concert to shape the topology of the radiation belt through the course of a geomagnetic storm.
Fully understanding the dynamics of the outer radiation belt requires at least three key steps. First we need to be able to quantify the strength of physical processes controlling these dynamics. Second, we need to assess how the varying strength of the processes act together to impact the overall topology of the outer radiation belt. Finally, we need to understand the pre-existing state of the belts (e.g., initial conditions), and the availability or lack thereof of various source populations (e.g., boundary conditions) through the course of a storm.
ULF wave radial diffusion is a key physical process controlling the dynamics of the outer radiation belt (Schulz & Lanzerotti, 1974). Depending on the gradient of electron phase space density in the outer radiation belt ULF wave radial diffusion can drive either rapid loss or rapid acceleration of radiation belt electrons ( Figure 16 of Li & Hudson, 2019; Figure 3 of Turner et al., 2012) across a wide range of energies. In this paper we detail the development of a new model for electric and magnetic ULF wave radial diffusion coefficients, and , driven by both solar wind input and geomagnetic activity and addressing the first of the three steps described above. This is a more detailed approach which allows the model to account for variability in radial diffusion as well as allowing the model to have a higher cadence (within the limits of radial diffusion) than existing models based on a single low resolution independent variable Kp. The new model is derived from in-situ observations of the azimuthal electric field and compressional magnetic field which are used to create a database of satellite-derived and

Data and Methodology
In this work we utilize in-situ observations of ULF wave fluctuations to develop an analytic model of ULF wave electric and magnetic radial diffusion coefficients ( and ) which can be used in radiation belt simulations. To achieve this, observations of the inner-magnetosphere magnetic and electric field from the Time History of Events and Macroscale Interactions During Substorms (THEMIS) probes (Angelopoulos, 2008;Sibeck & Angelopoulos, 2008), the two Van Allen Probes (Mauk et al., 2013), and magnetic field data from the Operational Environment 15 (NOAA GOES-15) satellite (Singer et al., 1996) from a similar time period are used (2012)(2013)(2014)(2015)(2016)(2017)(2018)(2019)(2020). From these data, we develop a database of ULF wave power spectral density (PSD), calculate satellite-derived and from the PSD database, and subsequently derive an analytic model of both s. Table S1 in Supporting Information S1 lists the satellites, time period, instruments, and observables used to calculate the electric and magnetic ULF wave PSD. These PSD are then used to estimate the magnetic and electric s. In addition to the in-situ magnetospheric observations, we also use the OMNI data set (King & Papitashvili, 2005), and the SpacePy  and International Radiation Belt Environment Modeling (IRBEM) libraries (Boscher et al., 2022) to parameterize the derived PSD and s as a function of prevailing solar wind, geomagnetic conditions, and third adiabatic invariant * . In this study we use the TS05 magnetic field model (Tsyganenko & Sitnov, 2005) to derive * using the (IRBEM) library (Boscher et al., 2022) assuming a 90° local pitch angle. The OMNI data set and * are used as independent variables (or inputs) to train our radial diffusion models (Section 3). Note, we only use GOES West as it is closer to the magnetic equator then GOES East and therefore provides a better estimate of equatorial ULF wave PSD. This requirement, along with ensuring that the missions overlap in time, means that only GOES-15 is used in this study. Further, we use THEMIS A, D, and E as during the period of interest the other two probes are orbiting the moon. Finally, we use * rather then or radial distance as it more accurately characterizes the topological state of the outer radiation belt as it accounts for adiabatic changes due to dynamic changes in Earth's magnetic field. In the following subsections we describe the methodology for calculating ULF wave PSD as well as the methodology to estimate radial diffusion coefficients from the derived PSDs.

ULF Power Spectral Density
Estimating ULF wave radial diffusion coefficients requires the compressional magnetic field ‖ and azimuthal electric field PSD (e.g., Fei et al., 2006;Ozeke, Mann, Turner, et al., 2014). To calculate ‖ and we follow the same general framework utilized within the community (e.g., Ali et al., 2015;Ozeke, Mann, Turner, et al., 2014). For each satellite the magnetic and electric fields are rotated into a field aligned coordinate system using a twenty-minute average of the background field to obtain the compressional magnetic field ‖ and azimuthal electric field . From ‖ and the PSD can be calculated using several methods such as the multi-taper (Ali et al., 2015;Bentley et al., 2018), wavelet (Dimitrakoudis et al., 2015;Sandhu et al., 2021), or the Fourier transform Rae et al., 2012). In general, each method produces similar estimates of the PSD. Here we use the Fourier transform method outlined in Rae et al. (2012) which has been used in several studies for 4 of 20 quantifying ULF wave PSD (Murphy, Inglis, et al., 2018; as well as subsequently estimating radial diffusion coefficients (e.g., Ma et al., 2018;Mann et al., 2016;Ozeke, Mann, Turner, et al., 2014;Ozeke et al., 2017). Using this method, the ‖ and PSD is calculated for each satellite from a twenty-minute time window stepped by 5 min. Each PSD is then tagged with the prevailing solar wind and geomagnetic conditions from the OMNI database and the satellites position in * corresponding to the middle of the 20-min window (spectra with no corresponding OMNI or * data are dropped from the database).
There are several caveats to note in the calculation of the field aligned coordinates and of the electric and magnetic field PSD. First, the datasets in Table S1 in Supporting Information S1 all have different cadences, and so, prior to calculating the PSD, ‖ and are downsampled to 12 s so that all time-series have the same cadence. Second, at low L-shells THEMIS and Van Allen Probes rapidly move through leading to large variations in the background magnetic field. This introduces artificial power into the magnetic field ULF wave PSDs. To circumvent this, for the Van Allen Probes below = 4 we remove a third-degree polynomial from ‖ ; this removes the background magnetic field while aiming to retain the ULF wave fluctuations. Further, we limit Van Allen Probe data to > 3, whilst for THEMIS data we limit observations to > 5. Different limits are chosen due to the different rates at which these different satellites cross along their orbit. Third, both THEMIS and the Van Allen Probes measure with high-fidelity long-wire antennas a two-component electric field in the spin plane; in this study we utilize the Level-2 THEMIS EFI data and Level-3 Van Allen Probes EFW data, both which use the ⋅ = 0 approximation to estimate the third component of the electric field. THEMIS data is further calibrated using SPEDAS  while Van Allen Probes Level-3 data has undergone extensive processing by the EFW team . Finally, regardless of calibration and data processing it is still possible for poor quality electric and magnetic field data to exist leading to anomalous PSDs. Thus, both the ‖ and PSD spectra are further scrubbed by identifying and removing outliers using the following analysis: 1. Calculate the summed PSD for each of ‖ and spectra between 0.83 and 15 mHz. 2. Bin the summed PSD datasets by Kp and L-shell. 3. Calculate the interquartile range (IQR) of each bin (upper quartile-lower quartile, Q U -Q L ) 4. Identify outlying spectra as any spectra whose summed PSD is less than Q L -1.5*IQR or greater than Q U +1.5*IQR and are removed from the PSD data set.
This method of scrubbing data and removing outliers is often used in statistical analysis and data sciences. In our study, this method does an excellent job removing anomalously high power associated with erroneous or unphysical data. For example, prior to removing outliers the mean spectrum is larger than the upper quartile spectra. Once the outliers are removed, the mean spectrum closely follows the upper quartile spectrum; though this still represents a skewed distribution, the similarity between the mean and upper quartile is in agreement with previous statistical studies of ULF wave PSD (e.g., Bentley et al., 2018;. Once processed, scrubbed, and tagged with the OMNI data and * position there are over 1.7 million magnetic and 0.46 million electric field PSD which span nearly an entire solar cycle and cover the entire inner magnetosphere through a variety of solar wind and geomagnetic conditions. Figure 1 shows the median ‖ (top) and (bottom) PSD at each frequency (y-axis) as function of Kp (x-axis) and * (panels 1-8) for comparison with previous studies. Evident in Figure 1 is that power is concentrated at lower frequencies and increases at all frequencies with Kp, consistent with previous studies (Ali et al., 2015;Brautigam & Albert, 2000;Rae et al., 2012). Panels 9 of Figure 1 (far right) shows how the PSD varies with * and Kp at a fixed frequency of 2.5 mHz. Consistent with previous studies this final panel demonstrates that ULF wave PSD increases with both * and geomagnetic activity as parameterized by Kp. The complete database of PSD spectra is used to estimate and as described in the next subsection.

Radial Diffusion Estimation
Radial diffusion coefficients can be calculated in several ways (e.g., Fei et al., 2006;Schulz & Lanzerotti, 1974). In this work, since we are utilizing  Ozeke, Mann, Turner, et al. (2014) to derive time-dependent-event-specific diffusion coefficients. In their study Mann et al. (2016) used these event specific diffusion coefficients to model the formation of the outer and storage radiation belts observed by the Van Allen Probes in September 2012 (Baker et al., 2013). These authors demonstrated that event-specific ULF wave radial diffusion coefficients coupled with dynamic boundary conditions reproduced the double belt structure of the outer radiation belt, highlighting the importance of ULF wave radial diffusion in controlling the dynamic topology of the outer radiation belt. Several follow-on studies have used the same methodology as Mann et al. (2016) to derive event-specific ULF wave radial diffusion coefficients to study the importance of radial diffusion radiation belt dynamics. In general, these studies have demonstrated that radiation belt simulations using event specific radial diffusion coefficients perform better, in that they more accurately reproduce radiation belt electron dynamics, than those that do not .
In this work, we use a similar approach as Mann et al. (2016) where is the equatorial magnetic field strength at the surface of the Earth and is the Earth's radius.
In Equation these are an averaged PSD that are required for calculating the s. In this work the electric and magnetic field PSD are averaged between 0.83 and 15.00 mHz; the averaged PSD and are then substituted into Equations 1 and 2 to derive and . This averaging provides an energy and azimuthal wave number independent estimate of and which simplifies the subsequent derivation of the radial diffusion models by removing two variables discussed in the next section. Figure 2 shows the statistical variation of (left) and (right) as a function of * and Kp from our database. Compared to previous studies the statistical variation of s derived here are similar in magnitude and increase with * and increasing geomagnetic activity (e.g., , Ozeke, Mann, Turner, et al. (2014)). At higher Kp values (e.g., 5 and 6), the magnitude of is comparable to . This is consistent with recent work that has demonstrated that during the main phase of storms the magnitude of ∼ (Olifer et al., 2019;Sandhu et al., 2021). It is important to note that in this approach and ‖ are treated independently whereas in fact they fact they are correlated via Faraday's law. In addition, the electric field includes both the electrostatic as well as the electromagnetic component. Regardless, recent work has demonstrated that the resulting diffusion coefficients are within a factor of approximately two of the electromagnetic diffusion coefficients (Lejosne, 2019). Finally, in utilizing the , Ozeke, Mann, Turner, et al. (2014) it is assumed that the electric field power spectra is flat and the magnetic field power spectra is proportional to frequency −2 . This is not necessarily the case for all spectra, which is why in this work we use the average PSD and in the same as in Mann et al. (2016). The benefit of utilizing the Ozeke, , Ozeke, Mann, Turner, et al. (2014) and Mann et al. (2016) framework is the simplicity of the s which are independent of both energy and azimuthal wave number. In the next section we describe the development of a radial diffusion model utilizing the database of s described here.

Radial Diffusion Model
ULF wave radial diffusion is a key component of radiation belt dynamics, driving periods of enhanced electron loss via outward radial transport as well as electron acceleration via inward transport (e.g., Ozeke et al., 2017). In radiation belt models, the strength of ULF wave radial diffusion is specified by diffusion coefficients (e.g., Brizard & Chan, 2001;Fei et al., 2006). Several researchers have developed models which specify the strength of radial diffusion coefficients as a function of geomagnetic activity from both in-situ and ground-based observations of ULF wave PSD (Ali et al., 2015;Brautigam & Albert, 2000;Ozeke, Mann, Turner, et al., 2014). In this section we build on these studies using a database of satellited-derived ULF wave and radial diffusion coefficients in an attempt to develop a more robust characterization of the rates of ULF wave radial diffusion based on known drivers of ULF waves. Here we describe the development of the radial diffusion model including a parametric study to identify a robust set of independent variables as model inputs, a discussion of different model types/algorithms, and training, testing, and performance of a final model. In Section 4 we investigate the final model, including its performance under different geomagnetic conditions (e.g., quiet and storm times) and comparing to existing diffusion models.

Parametric Study
Previous studies developing radial diffusion models parameterized their diffusion coefficients based on the planetary index Kp. Kp is a three-hour index ranging from 0 to 9; the indices correspond to exponentially increasing perturbations of Earth's magnetic field as characterized by several ground-based magnetometers from around the world. While Kp is a good measure of geomagnetic activity, its utility in models is limited for three reasons.

of 20
First, Kp is a measure of the response of the coupled solar wind-magnetosphere-ionosphere system as opposed to a driver of the dynamics of this system. Second, the three-hour cadence is long compared to storm-time radiation belt time-scales (Olifer et al., 2018;Ozeke et al., 2017). Finally, the exponentially increasing index limits the dynamic range of models, requiring models to extrapolate to high values of Kp during very active times. For example, the Brautigam and Albert (2000) and , Ozeke, Mann, Turner, et al. (2014) models are limited to using data derived for conditions with Kp ≤ 6 while Liu et al. (2016) and Ali et al. (2016) diffusion models are limited to a maximum Kp value of 5. Above these values the respective models must extrapolate to higher Kp values to estimate the magnitude of radial diffusion coefficients under active conditions. Despite these limitations, there is a key advantage in that the s parameterized by Kp in that they can be easily forecasted when forecasts of Kp are readily available Horne et al., 2021).
In this work we use the OMNI data set to parameterize and develop models of radial diffusion at higher cadence using known drivers of ULF waves. The OMNI data has a higher cadence (1 hr, 5 min, and 1 min) than the Kp index. This is ideal for radiation belt models as it provides increased temporal dynamics and resolution (within the limits of radial diffusion). In addition, the OMNI data set contains multiple parameters including the solar wind vector magnetic field, dynamic pressure, the vector solar wind velocity, as well as multiple geomagnetic indices. This allows for the development of models with multiple inputs which can increase model performance; however, one must identify an ideal set of inputs which maximizes model performance, while maintaining model stability (e.g., modeled s do not exhibit high variance with certain combinations of variables) and simplicity (e.g., avoiding the addition of variables with little impact).
In our development of new parameterizations of and we perform a parametric study of * (a proxy of the third adiabatic invariant), Sym − , the GSM y and z components of the interplanetary magnetic field (IMF; and ), solar wind speed and x-component GSM, and the solar wind dynamic pressure ( ) and density ( ) to identify the best combination of independent variables (or inputs) with which to develop models of and . From this set of independent variables, * is the only required variable in our model as it defines the radial diffusion coefficients on a set grid as is required for its implementation in radiation belt simulations. Unlike other models we use * as it is more physically meaningful then radial distance or as it accounts for adiabatic changes due to the dynamic magnetic field. The other independent variables selected for our parametric study correspond to known drivers of ULF wave power in the magnetosphere (Bentley et al., 2018;Kepko et al., 2002;Mathie & Mann, 2001;Murphy et al., 2015;Rae et al., 2019).
To identify an optimized set of independent variables the parametric study loops through every combination of input variables and fits a multi-linear regression model to the log 10 of . For each model we determine the model-data correlation and mean square error where is the number of observations and and ̂ are the observed and predicted data points). Using the correlation and MSE we identify the best combination of independent variables to subsequently train final models of and . In this study we look for combinations of variables which have high correlation and low MSE, in such cases the difference between the observed and predicted data, and ̂ , is low suggesting the model is a good representation of the data. Figure 3 shows the results of this parametric study; panels a and b show the correlation and MSE for every combination of variables (identified simply as integers on the x-axis). Evident in panels a and b is that the correlation increases and MSE decreases for different combinations of inputs. The large jump in the correlation and drop in MSE in Figures 3a and 3b (gray line) occurs as * is added to the linear regression in the parametric study. This suggests that * is a key variable in characterizing ; this isn't surprising as both ULF wave power and are well ordered by * as shown in Figures 1 and 2.  Figure 3c is that combinations of * , Sym − , IMF and dyn coupled with various combinations of IMF , , , and all lead to high correlations and lower MSE. The correlation peaks and MSE minimizes when all input variables are used in the parametric study; however, this introduces significant multicollinearity between the input variables (e.g., and , and ) which can decrease model performance as well as adding additional complexity. From the parametric study, * , Sym − , , , and dyn , are identified as an optimized set of input parameters to model (vertical line Figure 3c); the correlation is high, MSE low, and the correlation between input variables is limited compared to the other combination of inputs while also maintaining a level of simplicity by including only five independent variables. A similar dependence has also been discussed by Dimitrakoudis et al. (2022), albeit for ULF wave power which is directly related to the 8 of 20 strength of radial diffusion. Finally note, that the correlations and MSE for this set of independent variables is on the same order as those when Kp and * are considered as independent variables. However, in this paper we focus on devel oping a model based on known drivers of ULF wave power as opposed to variables and indices which respond to ULF wave power and enhanced geomagnetic activity like Kp. Thus, for the purpose of this study the combination of * , Sym − , IMF , , and dyn are used as independent variables to train models of and .

D LL Model
There are several methods and algorithms which could be used to generate or train models for the parameterization of the electric and magnetic field s. In this work we investigated two algorithms for training models of and , a multi-linear regression model (Murphy, Inglis, et al., 2020;Murphy, Mann, et al., 2020;Murphy et al., 2015), and a neural network model . In general, both the multi-linear regression and neural network models performed similarly using the independent variables identified in Section 3.1; both algorithms produced similar correlations and MSEs when fitted to and . Given the similar model performance, we use a multi-linear regression model for the final parameterization we present here due to its simplicity as compared to the implementation, use, and sharing of neural networks.
In developing the multi-linear regression models, the and datasets are first regularized as a function of * . The regularization ensures that the distribution of s as a function of * is uniform, that is, the number of data points for any given * is the same. This is important as the spacecraft spend more time at apogee, high * , which can bias any model by fitting to the bulk of the data which are preferentially located in one part of the domain. The datasets are then randomly separated into train/test sets using a 70/30% split. The models are trained on the train set while the test set is used to verify performance. Before training, the independent variables are normalized between 0 and 1 by the maximum and minimum values in the train set; this normalization ensures the dependence on each of the independent variables is assessed on a similar scale and generally increases the stability and performance of the resulting models. Finally, we restrict the model space to * ∈ [3,7] ; this encompasses the outer radiation belt and in general improves model performance by removing highly variable data at higher * . Though this limits the dynamic range of the data used to train the linear regression models, the linear regression models can still be used calculate s outside this range. However, care must be taken when extending the model outside these limits as it is not possible to test the models performance in these regions. This final data set is then fitted to the log 10 of and following the functional form shown in Equation 2; the s can then be calculated as shown in Equation 4.
= 10 + 1 * + 2 Sym + 3 + 4 + 5 dyn (4) Figure 4 shows the results of the training of the linear models for and . The left column of Figure 4 shows the residuals, defined as the difference between the model and satellite-derived log 10 (referred to as the residuals), for both the train and test datasets (red and blue) and both and (top and bottom). For both and the residuals are normally distributed and peak around zero for both test and train data sets (left column). This is ideal as it demonstrates that in general the modeled s are similar to the satellite-derived s. Further, the similarity in the distributions of train and test residuals indicates that the model does not suffer from overfitting (e.g., fitting only to the training data).
The middle and right columns of Figure 4 show the observed versus the modeled for both the train and test datasets (middle and right); the correlation and MSE are shown in the top left. For both and the 10 of 20 correlation between the modeled and observed s in both the train and test data sets is high, ∼0.86 accounting for ∼74% of the variance within the data sets. For both models the MSE is also low. Overall, the normal distribution of the residuals, high correlation and low MSE, coupled with the similarities between the train and test data sets demonstrates the robustness and accuracy of the trained models. Table S2 in Supporting Information S1, shows the fitting and normalization coefficients for both models as described by Equations 3 and 4.

Model Performance
In this section we investigate the residuals as a function of each independent variable to provide the reader and specifically modelers with a clear overview of the model's performance under varying conditions. Figure 5 shows the probability distribution functions (PDFs) of the and model residuals (top and bottom; modeled minus satellite-derived ) as a function of the independent variables (rows) for the combined train and test datasets. In general, both the and models perform well as a function of the independent variables. The residuals are peaked around zero as a function of the independent variable. However, this is not true for * and for more extreme values of Sym − , , and dyn . With regard to * , both the and models show bias at * > 6 and * < 4.5 where the residuals are skewed to either positive or negative values. The biases in * are more pronounced than those observed as a function of Sym − , , and dyn and are likely the result of biases that exist in the calculation of * . Note, a similar distribution of residuals is observed when considering only storm-time or quiet-time periods with this choice of independent variables; at extreme values of Sym − , , and dyn and * > 6 the storm-time biases are slightly larger than quiet times. Despite this, the similarity of the core of the distribution between storm-and quiet-time residuals is ideal as it demonstrates that both the and radial diffusion models perform well during both storm-and quiet-times. This was not the case for storm-and non-storm-time Kp-parameterization of examined by Dimitrakoudis et al. (2022) which highlights another pitfall of parametrizing by Kp as such parameterizations cannot account for both storm and quiet times. Due to the similarity between storm-time, quiet-time, and all data, the storm-and quiet-time distributions are shown in Figures S1 and S2 in Supporting Information S1.
While the overall performance of both the and diffusion model is good (Figure 4) the biases as a function of * may introduce errors which can propagate through any radiation belt simulation. For this reason, the residuals have been characterized as a function of * so that the bias can be readily removed from the models. To remove the bias, we use the residual- * PDF. The data is binned in * in increments of 0.1 (corresponding to the grid size in many radiation belt simulations) and so are the residuals. For each * bin we fit a Gaussian of the form shown in Equation 5, where 0 is the height, 1 is the center, and 2 is the width (standard deviation) of the fitted Gaussian. This analysis is shown in Figure S5 in Supporting Information S1 for both the and models and several select * bins. From these fits we can use 1 to shift the modeled s by subtracting 1 from the modeled corresponding to a particular * bin. In this way when the residuals are recalculated they peak around zero.
(5) Figure 6 shows the results of the biased-removed s calculated by shifting each based on the Gaussian fits and corresponding * bin. The left column shows the PDF of residuals versus * and the right column shows the PDF of the residuals-biased-removed versus * for the combined trained/test datasets. Evident in Figure 6 is that once the bias is removed from the modeled s the residuals peak around zero and the bias as a function of * is no longer evident. In this analysis it is important to note that bias as function of * is discrete and not continuous; however, the bias has been characterized on an * grid typically used in radiation belt simulations. Finally, though not the purpose of characterizing residuals, the Gaussian fits provide a very simple means to generate ensembles 12 of 20 of s which can be used in radiation belt simulations. For example, the width and center of the Gaussian can be used with a random number generator to sample from the fitted distribution of residuals and subtract the randomly sampled residual from the modeled s. In this way several time series can be generated to run ensemble simulations of the radiation belt response. Table S6 in Supporting Information S1 contains the Gaussian fit parameters as a function of * for both and . Note that the shift in the residuals, defined by the center of the Gaussian is largest for and that the model performs better than the . Finally, while these corrections help with the model residuals, they are generally small, not exceeding a factor of ∼2.

Event Case Studies and Model Comparisons
In the previous section we discussed the development of new and models and assessed their overall performance as a function of each independent variable. In this section we investigate the dynamics and performance of the new and models during two case studies, an extended period of quiet geomagnetic activity and the March 2013 Geospace Environment Modeling Quantitative Assessment of Radiation Belt Modeling focus group storm challenge event (e.g., Engebretson et al., 2018;Ma et al., 2018). For both cases we compare the s derived here with the Ozeke, , Ozeke, Mann, Turner, et al. (2014), Brautigam and Albert (2000), and Lejosne (2019) s. Note the , Ozeke, Mann, Turner, et al. (2014), Brautigam and Albert (2000), and Lejosne (2019) s are all parameterized by and not * ; however, since such models are routinely used in simulations with no conversion to * we are able to compare them to the model developed here assuming their is directly used as * in simulations. Figure 7 provides an overview of the solar wind and geomagnetic conditions as well as the satellite-derived s during a quiet solar wind and geomagnetic period in November 2013 (see caption for details). At the start of the quiet period the solar wind speed was elevated and Sym − slightly negative for ∼12 hr. Toward the end of the interval the IMF turned southward in association with a period of enhanced dynamic pressure and another period of negative Sym − lasting ∼24 hr. Throughout the quiet interval Kp remained low, below 3+. The observed s were largest at the start and end of the quiet period, in association with elevated solar wind speeds and enhanced dynamic pressure and southward IMF. Through the middle of the quiet period the observed s were generally small. This can be seen in the bottom two rows of the left panel. Throughout the quiet period the modeled s (middle and right panels) showed the same overall pattern as the observed s and the modeled residuals remained consistent with the overall model performance (bottom panels of middle and right column). There is also little difference between the modeled s and the modeled-biased-removed s through the duration of the quiet period. Finally, in both models is larger then by about an order of magnitude; this is generally consistent with the difference between and for higher energy electrons in , Ozeke, Mann, Turner, et al. (2014).
During the quiet interval the modeled s vary as one would expect. They were highest during elevated solar wind speed and negative Sym − , and again with southward IMF, increased , and negative Sym − . However, what is more interesting, is the comparison between the s derived here and existing models of s (Figure 8). The top row of Figure 8 shows the modeled , , and the total = + , with the biases-removed. The subsequent rows show the difference between the logs of the model bias-removed s and the , Ozeke, Mann, Turner, et al. (2014), Brautigam and Albert (2000), and Lejosne (2019) (see Figure caption for additional details). Compared to ), Ozeke, Mann, Turner, et al. (2014, the modeled derived here is larger while the modeled derived here is larger at higher * and smaller at lower * . The combined diffusion shows the same pattern as . For Brautigam and Albert (2000) we only consider their electromagnetic term as the electrostatic term is typically ignored in radiation belt simulations (Ozeke et al., 2013). In this case the electromagnetic diffusion is larger than at higher * and smaller at lower * ; the modeled shows the same pattern when compared to the Brautigam and Albert (2000) electromagnetic term A similar pattern is also observed when comparing with the Lejosne (2019) s. In this case we only compare to those of Lejosne (2019) as there is no analogous or . Note the Brautigam and Albert (2000) electromagnetic term and , Ozeke, Mann, Turner, et al. (2014) are very similar in magnitude. Sym − reaches nearly −150 nT. During the main phase of the storm, March 17, peaks and even exceeds during the same interval. This is consistent with other studies which have shown to be large during the main phase of storms (Olifer et al., 2019;Sandhu et al., 2021). Following the main phase and through the recovery phase both and decay reaching pre-storm levels before rapidly enhancing again on March 21 during a second interval of fast solar speed and southward IMF. These patterns are seen in the satellite-derived, modeled, and modeled-biased-removed s (left, right and middle panels) and are consistent with what would be expected given the solar wind and geomagnetic activity. Comparing the modeled and modeled-bias-removed s there is generally little difference. The model bias-removed is slightly smaller at high and low * . During the main phase ≈ in both models. Finally, the residuals for both models are normally distributed and peak around zero.
During the geomagnetic storm ( Figure 10) the s derived here show a similar relation to the , Ozeke, Mann, Turner, et al. (2014), Brautigam and Albert (2000), and Lejosne (2019) s as that observed during quiet times (Figure 8).
is larger than that of ), Ozeke, Mann, Turner, et al. (2014, especially during the main phase of the storm (Figure 10 left). At high * , is larger than that of , Ozeke, Mann, Turner, et al. (2014) and Brautigam and Albert (2000) and smaller at low * (Figure 10  14 of 20 et al. (2014), Ozeke, Mann, Turner, et al. (2014) and Brautigam and Albert (2000) (Figure 10 right). Compared to Lejosne (2019) the s derived here are larger during the main phase of the storm and at higher L and are generally smaller at lower * (Figure 10 bottom right). Prior to the storm main phase, March 16, and total derived here are smaller than the other three at * < 6 and larger at * > 6 . During this period Kp is elevated compared to quiet values reaching a Kp of 4, while the solar wind remains quiet accounting for the differences between the models.

Summary and Conclusions
In this work we utilize in-situ observations to develop a database of ULF wave PSD and satellite-derived ULF wave radial diffusion coefficients, s. The database of satellite-derived s is used to construct a new * dependent and optimized bias-removed model of and driven by , sw , dyn and Sym − . Overall, the new models of and perform well when compared to the satellite-derived s from which the models were developed.
The new bias-removed s derived are compared to existing models from ), Ozeke, Mann, Turner, et al. (2014, Brautigam and Albert (2000), and Lejosne (2019) during a period of quiet geomagnetic activity and the March 2013 geomagnetic storm. If one ignores the main phase of the storm, the comparison of quiet time s is similar to that of the geomagnetic storm; generally has a steeper gradient as compared to existing models such that is larger at high * and smaller at low * . The steeper gradient in the s derived here is likely due to the use of a * and the TS05 model. The previous studies all used dipole which does not vary with varying solar wind and geomagnetic conditions. Whereas with * , as activity increases the Earth's magnetic field is distorted resulting in changes in * . For example, during periods of enhanced solar wind and geomagnetic activity regions of high ULF wave power can be associated with lower  Brautigam and Albert (2000), and Lejosne (2019). * than would be in a dipole model; this leads to a steeper gradient in ULF wave power as a function of * than as a function of . Overall, the steeper gradient here is likely to be more representative of the actual radial diffusion as it accounts for variations in the third adiabatic invariant * .
During the main phase of the storm is larger then Lejosne (2019) and smaller then the Ozeke, , Ozeke, Mann, Turner, et al. (2014), Brautigam and Albert (2000) s. And, prior to the storm onset, the s derived here are smaller at all * values then the existing models. These differences are likely a result of the parameterization used to quantify the s. Lejosne (2019) uses the magnetopause boundary defined by and to derive the magnetic field perturbations which drive radial diffusion. Both and are elevated for only a short period of time compared to − and which is likely why the s derived here are larger then those of Lejosne (2019) during the main phase. With regard to the period prior to storm onset, Kp is elevated as compared to Sym − , , sw , and dyn which is likely why the Ozeke, , Ozeke, Mann, Turner, et al. (2014), and Brautigam and Albert (2000) s are larger then those derived here. These periods of elevated Kp are associate with enhanced AE driven by magnetospheric substorms. Though substorm activity leads to enhanced ULF waves in the form of short lived and irregular pulsations (Jacobs et al., 1964), the actual enhancement in Kp is the result of the larger substorm bays (Cramoysan et al., 1995) as opposed to ULF waves. Thus, its unclear whether this period would in fact lead to enhanced radial diffusion as depicted by the , Ozeke, Mann, Turner, et al. (2014), and Brautigam and Albert (2000) s as the resulting substorm-driven ULF wave activity is short lived. This difference is likely to propagate through any radial diffusion simulation leading to significant differences in the strength of radial transport and global topology of the radiation belt when using the different models.

of 20
It is important to note three features of the new radial diffusion models. First, with the new models s can be calculated on a cadence as high as 1-min utilizing the high-resolution OMNI data set. However, care must be taken so that the cadence of s remains physical. Typical drift periods of radiation belt electrons range from 100-1000 s of seconds; it is recommended that researchers use the 1-hr OMNI data set to calculate s so that electrons experience multiple orbits within the cadence of . Second, , and total can reach rates greater then 10/day. Though this is in agreement with other models, as illustrated in Figures 8 and 10 (see also Olifer et al., 2019;Sandhu et al., 2021), at these values the concept of radial diffusion will start to break down as electrons would be moving inward on time-scales on the order of or shorter than a drift period. In this regime coherent ULF wave-particle interactions may instead play an important role in the global dynamics of the outer radiation belt (e.g., Murphy, Inglis, et al., 2018;Murphy, Inglis, et al., 2020;Murphy, Mann, et al., 2020). The effect that enhanced ULF wave power, which leads to these large s, and the effect they have on outer radiation belt is an important avenue of future research and which can be investigated with global models (Degeling et al., 2008(Degeling et al., , 2013Komar et al., 2017). Finally, in this work we have adopted the Mann et al. (2016) framework which is based on , Ozeke, Mann, Turner, et al. (2014). In this framework the power spectra are averaged over a fixed frequency range which allows the derivation of s which are independent of energy and azimuthal wave number. This was chosen as it simplifies the diffusion models which is important for operational models. Future work will investigate the performance of the new s in radiation belt simulations and whether more complex models are required which take into account either the energy of electrons, the azimuthal wave number of ULF waves, or both.
In summary, we have used over 40 years of satellite data to calculate and over a range of * from 3 to 7 in non-dipole field using TS05. The diffusion rates are parameterized by an optimized set of solar wind and geomagnetic variables and provide diffusion coefficients at much higher time resolution and over a larger dynamic range then previous studies. The main results are: • The gradient in our is generally steeper as compared to existing models, giving higher diffusion rates at larger * and lower rates at low * .
• is generally higher than at all * except during the main phase of storms. • During the March 2013 magnetic storm generally agrees with previous work except for the initial phase of the storm where diffusion rates are lower at all L* < 5. We suggest that this is due to substorm activity which is captured in the Kp models and not in our new models.
The results are available for use in global radiation belt models to develop better reconstructions of the radiation belt environment (see Supporting Information S1 for model coefficients). Future work will test both the model and model bias-removed performance in radiation belt simulations under varying geomagnetic conditions to determine if the new models improve the performance of radiation belt simulations as compared to existing models. The performance of our models will help to determine if more complex models are required, for example, including electron energy in the derivation of the s. In addition, future studies will investigate the performance of our new models as compared to existing models when using forecasted inputs. Finally, future studies will investigate including additional model inputs and independent variables and the time history of independent variables, along with more complex algorithms for regression as a path to improved models and radiation belt simulations.

Data Availability Statement
The THEMIS data is available via SPEDAS, PySPEDAS or the THEMIS data server, http://themis.ssl.berkeley. edu/data/themis. The GOES data is available from the NOAA National Centers for Environmental Information website, https://www.ngdc.noaa.gov/stp/satellite/goes/. Van Allen Probes data is available at the Coordinated Data Analysis Web (CDAWeb) https://cdaweb.gsfc.nasa.gov/. The OMNI data is available at https://omniweb. gsfc.nasa.gov/. The higher-level database used to develop the models is available via Zenodo, https:// zenodo.org/record/7569732, https://doi.org/10.5281/zenodo.7569732 (Murphy, 2023). The database contains two datasets, a magnetic field data set and an electric field data set. The magnetic field data set contains the derived power spectral density for the compressional magnetic field ‖ from THEMIS, Van Allen Probes, and accompanying position (MLT, L, L* TS05), solar wind, and geomagnetic data. The electric field contains the derived power spectral density for the azimuthal electric field from THEMIS, Van Allen Probes, and GOES and accompanying position (MLT, L, L* TS05), solar wind, and geomagnetic data. See the Zenodo record for more description of the derived database and datasets. for the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) license [where permitted by UKRI, "Open Government Licence" or Creative Commons Attribution No-derivatives (CC BY-ND) license may be stated instead] to any Author Accepted Manuscript version arising. TD is supported by the Air Force Office of Scientific Research under award number FA9550-19-1-7039. SG and RBH were supported by NERC Grant NE/ V00249X/1 (Sat-Risk), NERC National Capability Grants NE/R016038/1 and by NERC National Public Good activity Grant NE/R016445/1. AJH and AB's contribution was funded by the Space Precipitation Impacts group, a NASA Goddard Internal Science Funding Model grant. IRM is supported by a Discovery Grant from Canadian NSERC, and by a UK Royal Society Wolfson Visiting Fellowship. LO is supported by the Canadian Space Agency.