Improved Neutral Density Predictions Through Machine Learning Enabled Exospheric Temperature Model

The community has leveraged satellite accelerometer data sets in previous years to estimate neutral mass density and exospheric temperatures. We utilize derived temperature data and optimize a nonlinear machine‐learned (ML) regression model to improve upon the performance of the linear EXospheric TEMPeratures on a PoLyhedrAl gRid (EXTEMPLAR) model. The newly developed EXTEMPLAR‐ML model allows for exospheric temperature predictions at any location with one model and provides performance improvements over its predecessor. We achieve a reduction in mean absolute error of 2 K on an independent test set while providing similar error standard deviation values. Comparing the performance of both EXTEMPLAR models and the Naval Research Laboratory Mass Spectrometer and Incoherent Scatter radar Extended model (NRLMSISE‐00) across different solar and geomagnetic activity levels shows that EXTEMPLAR‐ML has the lowest mean absolute error across 80% of conditions tested. A study of spatial errors on the test set demonstrated that EXTEMPLAR‐ML has the lowest mean absolute error for over 60% of the polyhedral grid cells. Like EXTEMPLAR, our model's outputs can be utilized by NRLMSISE‐00 (exclusively) to more closely match satellite accelerometer‐derived densities. We conducted 10 case studies where we compare the accelerometer‐derived temperature and density estimates from four satellites to NRLMSISE‐00, EXTEMPLAR, and EXTEMPLAR‐ML during major storm periods. These comparisons show that EXTEMPLAR‐ML generally has the best performance of the three models during storms. We use principal component analysis on EXTEMPLAR‐ML outputs to verify the physical response of the model to its drivers.

Nitric oxide (NO) is a cooling mechanism responsible for long-term cooling trends present during solar minimum (Kockarts, 1980) and short-term temperature decreases following large geomagnetic storms (Knipp et al., 2017; Mlynczak et al., 2003). Lei, Burns, et al. (2012) found that for the 2003 Halloween storm, temperature and density post-storm were appreciably lower than pre-storm levels, although the specific cause of this is highly contested (Lei et al., 2021; Mikhailov & Perrone, 2020). Many empirical models do not capture this phenomenon well and predict higher density in the recovery phase of major storms relative to observations (Licata, Mehta, Tobiska, Bowman, & Pilinski, 2021; Oliveira & Zesta, 2019).
The Naval Research Laboratory Mass Spectrometer and Incoherent Scatter radar Extended model (NRLMSISE-00, referred to in this paper as MSIS) is a commonly used empirical thermospheric density model (Picone et al., 2002). As with many models (e.g., DTM, Bruinsma, 2015, and JB2008, Bowman et al., 2008), MSIS relies heavily on temperature profiles to determine species densities and therefore mass density throughout the thermosphere. A key parameter in predicting the temperature profile is the exospheric temperature (T ∞ ), which is the asymptotic value that the temperature profile approaches at the top of the thermosphere, or thermopause (Bates, 1959; Jacchia, 1965). MSIS uses the Bates-Walker temperature profile (Walker, 1965).
The availability of accelerometer-derived density estimates from satellites, such as CHAllenging Minisatellite Payload (CHAMP), Gravity Recovery and Climate Experiment (GRACE), and Swarm, has been advantageous for model development and assessment (Bettadpur, 2012; Friis-Christensen et al., 2006; Lühr et al., 2002). Over the lifetime of satellites with onboard accelerometers, we accumulate measurements over an abundance of locations and space weather conditions. Researchers have used these measurements to derive density estimates by removing accelerations from other sources (Bruinsma & Biancale, 2003; Calabia & Jin, 2016; Doornbos, 2012; Liu et al., 2005; Sutton, 2008). Weimer et al. (2016) used the density estimates from  to approximate exospheric temperatures by varying the temperature parameter in MSIS using the bisection method until the model density closely matched that of the satellite. Weng et al. (2017) followed this methodology and used Sutton's CHAMP density estimates to create an exospheric temperature model. Weimer et al. (2020) used the derived exospheric temperatures to fit 1,620 linear models that make predictions on a polyhedral grid as a function of different space weather conditions over time. The model is called EXospheric TEMPeratures on a PoLyhedrAl gRid (EXTEMPLAR). In this work, we develop an improved exospheric temperature model by using a single nonlinear artificial neural network (ANN) to make predictions at any location. This global model is called EXTEMPLAR Machine Learned (EXTEMPLAR-ML).
Principal component analysis (PCA), also referred to as Empirical Orthogonal Function (EOF) analysis or Proper Orthogonal Decomposition (POD), is used in this work to investigate the most dominant modes of variability in EXTEMPLAR-ML to ensure that the temperature formulation is as expected. PCA has been used to analyze thermospheric density data sets previously and is often used in the development of reduced-order models (Gondelach & Linares, 2020; Mehta & Linares, 2017; Mehta et al., 2018). PCA has also been used to study satellite accelerometer data sets (Calabia & Jin, 2016; Lei, Matsuo, et al., 2012; Matsuo & Forbes, 2010). Sutton et al. (2012) used PCA to produce basis functions that represented the variability of temperature parameters used in an empirical Jacchia family model to improve its nominal density formulation (Jacchia, 1970). Ruan et al. (2018) used CHAMP density estimates and a physics-based density model to develop an exospheric temperature model based on PCA. Machine learning (ML) models tend to be ambiguous in nature, so we apply PCA to EXTEMPLAR-ML outputs only to check whether the model's response to its inputs is physical, even though no physical constraints are present in training.
In this work, we show that the extension of ML to develop an exospheric temperature model has the following key benefits over previous models:
• The ML model improves prediction accuracy over its linear predecessor in many conditions.
• While EXTEMPLAR is composed of 1,620 individual regression models for specific locations, EXTEMPLAR-ML is a single model capable of predicting exospheric temperature at any local time-latitude combination. That is particularly important when predicting temperature and density along an orbit at a high frequency.
• The most significant benefit is the nonlinear formulation of EXTEMPLAR-ML. A linear model cannot capture nonlinear processes that are essential to represent during geomagnetic storms. We show in Sections 3.2 and 3.3 that EXTEMPLAR-ML more closely matches satellite estimates for temperature and density during these periods.
The paper is organized as follows: we start by detailing the data and model development. Then, we discuss the methodology for temperature and density prediction using the model. We then compare errors between MSIS, EXTEMPLAR, and EXTEMPLAR-ML across different space weather conditions and at different locations. We investigate the dominant modes and PCA coefficients across an eight-year period. As a case study, we compare the temperature and density predictions of MSIS, EXTEMPLAR, and EXTEMPLAR-ML to CHAMP, GRACE-A, Swarm A, and Swarm B during 10 major geomagnetic storms.

Data
Typically, MSIS predicts the exospheric temperature as a function of position and space weather drivers. Using this computed value, the model then calculates species densities as a function of altitude and therefore neutral mass density. The user can override the internally computed T ∞ , and MSIS will determine density based on the provided value. In the past, this has been leveraged in numerical schemes to match MSIS to satellite measurements in order to estimate T ∞ (Forbes et al., 2009, 2011; Weimer et al., 2016). Weimer et al. (2020) used the  CHAMP and GRACE density estimates from 2002 to 2010 to perform a similar derivation of exospheric temperatures. They also used Swarm A and Swarm B density estimates from Astafyeva et al. (2017). These two satellites, like CHAMP and GRACE, had orbits with high inclination and were at ∼470 and 520 km in altitude, respectively (Friis-Christensen et al., 2006). Weimer et al. (2020) varied T ∞ in MSIS using a binary search method until the temperature was determined to within 2 K. They also generated a geodesic polyhedral grid made of 1,620 cells. These cells do not have equal areas, but they were designed to all be within 10.6% of the mean area. The exospheric temperatures are binned to cells based on their geographic coordinates. The distribution of samples on the grid is shown in Figure 1. The number of samples in each bin ranges from 25,998 to 505,180. The distribution heavily favors high latitudes as the satellites have high inclinations.
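The binary search used to derive T ∞ can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `toy_density` is a hypothetical stand-in for an MSIS call with the exospheric temperature overridden (it only mimics the property that density at a fixed altitude increases with T ∞ ), and the search bounds are assumed values. Only the 2 K stopping tolerance comes from the text.

```python
import math


def toy_density(t_inf, altitude_km=400.0):
    """Hypothetical stand-in for MSIS with T_inf overridden.

    Not a physical model: it only reproduces the monotonic increase of
    density with exospheric temperature at a fixed altitude.
    """
    scale_height_km = 0.053 * t_inf  # assumed toy scaling
    return 1e-9 * math.exp(-(altitude_km - 120.0) / scale_height_km)


def derive_t_inf(rho_obs, density_fn, t_low=300.0, t_high=3000.0, tol_k=2.0):
    """Bisect on T_inf until the bracket is narrower than tol_k kelvin."""
    if not density_fn(t_low) <= rho_obs <= density_fn(t_high):
        raise ValueError("observed density is outside the search bracket")
    while t_high - t_low > tol_k:
        t_mid = 0.5 * (t_low + t_high)
        if density_fn(t_mid) < rho_obs:
            t_low = t_mid   # model too thin: temperature must be higher
        else:
            t_high = t_mid  # model too dense: temperature must be lower
    return 0.5 * (t_low + t_high)


rho_observed = toy_density(1000.0)  # synthetic "measurement"
t_estimate = derive_t_inf(rho_observed, toy_density)
```

The bracket check at the start also hints at why some derived temperatures can be unphysical: if the measured density is far from what the model can produce, the search is pushed toward the edge of the allowed range.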
The F 10 , S 10 , M 10 , and Y 10 indices are representative of solar activity and are part of Space Environment Technologies' (SET) SOLAR2000 algorithm (Tobiska et al., 2000), which has been recently benchmarked by . F 10 is a legacy solar proxy measured since 1947 (Covington, 1948). It is measured by the Dominion Radio Astrophysical Observatory in Canada. The S 10 index is representative of the integrated 26-34 nm solar EUV emission measured by the Solar and Heliospheric Observatory research satellite's Solar Extreme-ultraviolet Monitor. M 10 is a proxy for the Mg II core-to-wing ratio and is measured by the Solar Backscatter Ultraviolet spectrometer on NOAA operational satellites. Y 10 is a hybrid index of 0.1-0.8 nm solar X-rays during solar maximum and Lyman-α emissions during solar minimum. These indices are described in detail by Tobiska et al. (2008). Poynting flux values represent the electrodynamic energy that flows into the atmosphere and are calculated using an electrodynamics model (referred to as the W05 model) described by Weimer (2005a, 2005b), which requires IMF and solar wind velocity data from the Advanced Composition Explorer spacecraft.

Model Development
We had access to over 81 million exospheric temperature estimates from Weimer et al. (2020), the associated polyhedral grid locations, and different space weather indices/proxies as potential drivers. The best linear model from previous work used S 10 , √ 10 , Poynting flux totals (S N and S S ), a temperature perturbation term (ΔT), day of year (doy), and universal time (UT). The cooling effect of nitric oxide emissions was simulated in the calculation of ΔT. S N and S S represent the Poynting flux in the northern and southern hemispheres, respectively. Even though the JB2008 thermospheric density model  uses F 10 , S 10 , M 10 , and Y 10 , Weimer et al. (2020) found that S 10 and M 10 had the highest correlation to temperature (r = 0.97). We found that using all four JB2008 solar indices improved model accuracy. Therefore, we used F 10 , S 10 , M 10 , Y 10 , S N , S S , ΔT, local solar time (LST), geodetic latitude, doy, and UT. For doy and UT, the model uses sine and cosine functions of the fractional doy and UT, generating four temporal inputs (t 1 -t 4 ); see Equation 1. To prevent any local time discontinuities, the same kind of transformation was applied to the local solar time input, giving LST 1 and LST 2 ; see Equation 2.
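The sine and cosine transformations of the periodic inputs can be sketched as follows. Equations 1 and 2 are not reproduced in this section, so the exact forms here are assumptions: the 365.25-day period, the ordering of the sine and cosine terms, and the helper names `cyclic_pair` and `temporal_inputs` are all illustrative.

```python
import math


def cyclic_pair(value, period):
    """Map a periodic quantity to (sin, cos) so both ends of the cycle meet."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def temporal_inputs(doy, ut_hours, lst_hours, days_per_year=365.25):
    """Return the six periodic inputs: t1..t4, LST1, LST2 (assumed ordering)."""
    t1, t2 = cyclic_pair(doy, days_per_year)
    t3, t4 = cyclic_pair(ut_hours, 24.0)
    lst1, lst2 = cyclic_pair(lst_hours, 24.0)
    return [t1, t2, t3, t4, lst1, lst2]


# Local times just before and after midnight map to neighboring points,
# which is the discontinuity the transformation is meant to remove.
before = temporal_inputs(doy=80.0, ut_hours=12.0, lst_hours=23.99)
after = temporal_inputs(doy=80.0, ut_hours=12.0, lst_hours=0.01)
lst_gap = math.hypot(before[4] - after[4], before[5] - after[5])
```

Together with F 10 , S 10 , M 10 , Y 10 , S N , S S , ΔT, and geodetic latitude, these six values give 14 inputs, which agrees with the note to Table 2.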
With the data organized into inputs (described above) and labels (the associated log 10 T ∞ ), we use a tool called Keras Tuner (O'Malley et al., 2019). It lets us specify a range of hyperparameters over which the tuner searches, through a Bayesian optimization scheme, for the architecture that minimizes the validation loss. The tuner settings are shown in Table 1. The tuner is provided 2 million training samples and 500,000 validation samples, randomly selected from the complete training and validation sets, respectively.
The tuner settings and the necessary number of training and validation samples were determined through preliminary testing. With 150 architectures tested using 2 million training and 500,000 validation samples, the tuner takes approximately 14.5 hr to run when parallelized across four NVIDIA Quadro 5000 GPUs. Once complete, the tuner returns the 10 best models, which we evaluate on independent data to confirm model performance. The best architecture to come out of the tuner had 148,057 trainable parameters and is displayed in Table 2. This model was trained further using the full training set to 220 epochs, where the validation loss was at its minimum value. While EXTEMPLAR-ML is not restricted to prediction at the polyhedral grid locations, it

EXTEMPLAR-ML is compared to EXTEMPLAR and MSIS across multiple space weather conditions and as a function of location. First, we time-match the valid T ∞ predictions from each model between 2002 and 2018. The mean absolute error and error standard deviation are computed for each model across all samples. This is not the same methodology used by Weimer et al. (2020) to compute the model statistics, which is why they differ. Next, we look at different F 10 and a p ranges and compute the mean absolute error for each model within these space weather conditions across the test set. Test errors for each model are segmented into each EXTEMPLAR grid location to compare the mean absolute error and error standard deviation visually. This will show how the models' performance compares in different regions (e.g., high latitudes, dayside, and nightside).
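The error statistics described above can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the function names, the bin edges, and the sample values are made up, and the population form of the standard deviation is an assumed choice.

```python
import statistics


def error_stats(predicted, observed):
    """Mean absolute error and standard deviation of the signed error."""
    errors = [p - o for p, o in zip(predicted, observed)]
    mae = statistics.fmean(abs(e) for e in errors)
    return mae, statistics.pstdev(errors)  # population std (assumed choice)


def mae_by_bin(predicted, observed, driver, edges):
    """MAE within each [edges[i], edges[i+1]) range of a driver such as F10."""
    result = {}
    for low, high in zip(edges[:-1], edges[1:]):
        idx = [i for i, d in enumerate(driver) if low <= d < high]
        if idx:
            result[(low, high)] = statistics.fmean(
                abs(predicted[i] - observed[i]) for i in idx
            )
    return result


# Made-up values for illustration only (kelvin and solar flux units).
observed = [800.0, 900.0, 1000.0, 1100.0]
predicted = [810.0, 890.0, 1020.0, 1080.0]
f10 = [70.0, 90.0, 160.0, 200.0]

mae, err_std = error_stats(predicted, observed)
binned = mae_by_bin(predicted, observed, f10, edges=[0.0, 100.0, 250.0])
```

Reporting the mean absolute error and the error standard deviation separately keeps accuracy and precision apart, which matters later when one model is better on the first measure and another on the second.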

Model Interpretation With PCA
ML models can be difficult to interpret, which makes them ambiguous in nature. For EXTEMPLAR-ML, we want to validate that the model is driven by the same dominant processes one would expect in MSIS and linear EXTEMPLAR. To investigate this, we leverage PCA as an indirect interpretation technique. PCA is an eigendecomposition technique that determines uncorrelated linear combinations of the data that maximize variance (Hotelling, 1933; Pearson, 1901). As mentioned in the Introduction, PCA is widely used in thermospheric density and exospheric temperature studies as both a modeling and analytical tool. We use PCA to gain insight into EXTEMPLAR-ML, which requires predictions covering a vast array of conditions. To accomplish this, we evaluated the model at all 1,620 EXTEMPLAR grid locations from 2002 to 2009 at a three-hour cadence. These predictions provide the global evolution of exospheric temperatures spanning over half a solar cycle. We perform PCA on the spatiotemporal temperature maps to obtain the U, Σ, and V matrices. PCA decomposes the data and separates spatial and temporal variations such that:

$$\mathbf{x}(\mathbf{s}, t) \approx \sum_{i=1}^{r} \alpha_i(t)\, U_i(\mathbf{s}) \qquad (3)$$

where $\mathbf{x} \in \mathbb{R}^{n}$ is the model output state (full 2D temperature maps), r is the choice of order truncation, α i are temporal coefficients, and U i are orthogonal modes or basis functions. The modes are the first r columns of the matrix of left singular vectors derived by performing PCA on an ensemble of model output solutions such that:

$$\mathbf{X} = [\mathbf{x}_1 \;\; \mathbf{x}_2 \;\; \cdots \;\; \mathbf{x}_m] = U \Sigma V^{T} \qquad (4)$$

In Equation 4, m represents the ensemble size (eight years). The temperature data is denoted by $\mathbf{X}$. U is the left unitary matrix, and it is made of orthogonal vectors that represent the modes of variation. Σ is a diagonal matrix of singular values, which are the square roots of the eigenvalues of $\mathbf{X}\mathbf{X}^{T}$ that correspond to the vectors in U. We can extract temporal coefficients by performing matrix multiplication between Σ and V^T. Each mode and its coefficient are determined only up to a common sign; therefore, the signs of the modes and coefficients are important in the analysis phase.
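The decomposition can be sketched with a singular value decomposition in NumPy. This is a toy example under assumptions, not the authors' pipeline: the snapshot matrix is synthetic, the sizes are far smaller than 1,620 cells over eight years, and the variable names are illustrative. The data are left uncentered, as the paper states for its own analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_times, r = 50, 200, 3  # toy sizes, not the paper's

# Synthetic snapshot matrix X (cells x times): a large mean-like pattern,
# a small seasonal-like pattern, and noise. Illustrative data only.
pattern_a = 1000.0 + 100.0 * rng.random(n_cells)
pattern_b = np.linspace(-1.0, 1.0, n_cells)
time = np.arange(n_times)
X = (np.outer(pattern_a, 1.0 + 0.1 * np.sin(2 * np.pi * time / 100.0))
     + 20.0 * np.outer(pattern_b, np.cos(2 * np.pi * time / 50.0))
     + rng.normal(scale=0.5, size=(n_cells, n_times)))

# Thin SVD without centering, so mode 1 absorbs the mean map.
U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

modes = U[:, :r]                     # spatial modes U_i
alpha = sigma[:r, None] * Vt[:r, :]  # temporal coefficients, Sigma @ V^T
X_r = modes @ alpha                  # rank-r reconstruction

# Share of the total squared singular values carried by each mode.
energy_fraction = sigma**2 / np.sum(sigma**2)

# Flipping the sign of a mode and its coefficient leaves X_r unchanged.
modes_flipped, alpha_flipped = modes.copy(), alpha.copy()
modes_flipped[:, 1] *= -1.0
alpha_flipped[1, :] *= -1.0
```

The last three lines show why the sign of a mode carries no information by itself: only the product of a mode and its coefficient is fixed, so a pair can be flipped together to ease comparison with a driver.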

Geomagnetic Storm Case Study
To assess the performance of EXTEMPLAR-ML during extreme events, we compare MSIS, EXTEMPLAR, and EXTEMPLAR-ML to CHAMP- and GRACE-A-derived temperature and density estimates during 10 storm periods. The storms are listed in Table 3 along with their statistics/attributes. Five of the storms are used by Bruinsma et al. (2021) to assess the capabilities of thermospheric density models. Three of the storms in Table 3 with available CHAMP and GRACE data are from 2005 and were classified by Knipp  storms due to overproduction of nitric oxide that resulted in the inaccurate density predictions. The October 2003 storm was chosen as it was one of the most significant storms of the 21st century. The July 2004 storm was used for comparison between MSIS and EXTEMPLAR by Weimer et al. (2020). The two storms in 2015 were chosen as they reside in the validation set and use Swarm measurements. We will compare the exospheric temperature predictions of each model to the satellite estimates in Section 3.2, and those values will be used in MSIS to predict density shown in Section 3.3.

Table 2
Model Architecture for the Best Model From the EXTEMPLAR-ML Tuner
Note. There are 14 inputs for Layer 1.

Results
Once we obtained the model described in Section 2.2, we evaluated it on all training, validation, and test data. This is shown in Figure 3. The T ∞ scatterplots are fairly centered on the 1:1 line, which indicates a zero-error prediction. There is a skew toward underprediction at very high temperatures, particularly on the test set. However, some of these exospheric temperatures are not physical, because in some instances an abnormally high temperature needs to be input to MSIS to obtain a match with the measured density. All models seem to have poor performance at temperatures above 1,500 K on the test set. This subset of the data only makes up 0.03% of the test set, meaning the impact on overall performance is minor, and the scatter can be misleading. There is a similar artifact for the training set. Visually, it seems as if there is an abundance of high-temperature samples where the models perform poorly; however, there are only 778 samples above 2,500 K, making up only 0.001% of the training set. EXTEMPLAR v6 seems to have a lower bias during these conditions but has much higher variance. EXTEMPLAR-ML has the highest coefficient of determination on all three sets, though the EXTEMPLAR v6 values are similar.
The error distributions show that MSIS has a strong tendency to underpredict temperature on all three sets. Both EXTEMPLAR models have similar characteristics for the training and validation sets, but EXTEMPLAR v6 has a more negative bias on the test set. A major takeaway from Figure 3 is that all three models struggle on the test set, although it is truly independent data only for EXTEMPLAR-ML. It is hypothesized that during this period, there is either an issue with the data or the model drivers may not correspond to as much of the variation in the system as they do elsewhere. Table 4 shows the mean absolute error and standard deviation of the error for MSIS, EXTEMPLAR v6, and EXTEMPLAR-ML in kelvin. The mean absolute error is also shown as a percentage. Table 5 shows the mean absolute percent error for these three models across different space weather conditions on the test data. These are shown as relative errors to remove the impact of the general temperature level for the conditions. EXTEMPLAR-ML has the lowest mean absolute error on all three sets, while having a lower error standard deviation on the training and validation sets. The difference between the two EXTEMPLAR models in terms of the test error standard deviation is only 0.41 K. However, EXTEMPLAR-ML has reduced the mean absolute error by 2 K on the test set. In Table 5, we see that both the linear and ML versions of EXTEMPLAR are more accurate than MSIS, regardless of solar and geomagnetic activity level. Furthermore, EXTEMPLAR-ML has the lowest mean absolute error for 16 of the 20 conditions investigated. The difference in mean absolute error between the two EXTEMPLAR models never exceeds 0.6%, though. We show the mean absolute error and error standard deviation for all models at each polyhedral grid cell in Figure 4. Note that only the test set is used for this comparison, and the number of samples in each cell ranges from 4,023 at the equator to 104,278 at the poles.
In Figure 4, the highest MSIS errors on the dayside occur at nearly all latitudes, while on the nightside the highest errors favor high latitudes. There is a similar pattern for the MSIS error standard deviation, though the highest values are more localized. The mean absolute error was lowest for EXTEMPLAR v6 in 638/1,620 cells, and it was lowest for EXTEMPLAR-ML in the remaining 982 cells. MSIS had the lowest error standard deviation in 24/1,620 cells, EXTEMPLAR v6 had the lowest error standard deviation in 1,190 cells, and EXTEMPLAR-ML had the lowest value in only the remaining 406. Therefore, in the test set, EXTEMPLAR-ML is generally more accurate as a function of location, but EXTEMPLAR v6 is more precise.
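The per-cell tally reported above can be sketched as follows. This is a hypothetical illustration: the function name, the model labels, and the sample values are made up, and ties are broken by dictionary order, which is an assumption rather than anything stated in the text.

```python
from collections import defaultdict


def best_model_per_cell(cell_ids, observed, predictions):
    """Count, per model, the grid cells where it has the lowest MAE.

    predictions maps a model name to a list aligned with observed.
    """
    abs_err = {name: defaultdict(list) for name in predictions}
    for i, cell in enumerate(cell_ids):
        for name, pred in predictions.items():
            abs_err[name][cell].append(abs(pred[i] - observed[i]))

    wins = {name: 0 for name in predictions}
    for cell in set(cell_ids):
        mae = {name: sum(e[cell]) / len(e[cell]) for name, e in abs_err.items()}
        wins[min(mae, key=mae.get)] += 1
    return wins


# Made-up samples in three cells; model names are labels only.
cell_ids = [0, 0, 1, 1, 2, 2]
observed = [900.0, 910.0, 1000.0, 1010.0, 1100.0, 1090.0]
predictions = {
    "linear": [905.0, 915.0, 1020.0, 1030.0, 1110.0, 1100.0],
    "ml": [910.0, 920.0, 1005.0, 1015.0, 1105.0, 1095.0],
}
wins = best_model_per_cell(cell_ids, observed, predictions)
```

Applied to the counts in the text, 982 of 1,620 cells corresponds to about 61% for the mean absolute error, and 1,190 of 1,620 to about 73% for the error standard deviation.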

Model Interpretation
With the extensive EXTEMPLAR-ML prediction set described in Section 2.3, we can investigate the most dominant modes of variability and their associated PCA coefficients. This is shown in Figure 5. It is important to note that the data were not centered prior to performing PCA, so the first mode is representative of the mean temperature distribution over this period.
The first mode is representative of solar EUV heating, as indicated by the diurnal temperature map and the strong correlation with S 10 . This mode accounts for over 97% of the system's variance. Mode 2 represents a latitudinal summer-winter variation. There is a linear progression of the mode with latitude, and α 2 oscillates about zero with a period of 365 days. It has an inverse relationship to t 2 described in Equation 1, and its amplitude is a function of the solar activity. The second mode and coefficient are multiplied by −1 to align with t 2 . For Mode 3, the peaks are at high latitudes and the magnitude decreases toward the equator. α 3 most strongly correlates with ΔT and the Poynting flux totals (∼0.60 with S N and S S ). We suspect that this mode corresponds to the effects of high-latitude heating from either Joule heating or electron precipitation. These dominant modes are clear and would likely match those of MSIS or EXTEMPLAR, confirming the model behavior. Again, this specific analysis was only done on EXTEMPLAR-ML outputs for insights on the model behavior. PCA was not used for model development.

Modeled Versus Observed Temperature Along Orbits
In an effort to compare model performance between other temperature models and the observations, we evaluate the MSIS (unmodified), EXTEMPLAR v6, and EXTEMPLAR-ML exospheric temperature values along satellite orbits for the 10 storms listed in Table 3. The mean absolute percent error relative to the satellite temperature estimates is shown in Table 6, and the predictions for the August 2005 storm are shown in Figure 6. Similar figures were generated for all 10 storms and are provided in Supporting Information S1 in addition to a table of mean absolute error in Kelvin.
For all storms, both EXTEMPLAR models perform better than MSIS. In addition, EXTEMPLAR-ML has the lowest error of the three models along 13 of the 20 satellite orbits. EXTEMPLAR-ML reduces T ∞ prediction error by up to 3% relative to EXTEMPLAR v6 and by up to 9% relative to MSIS. Looking at the temperature predictions across these storms (Figure 6 and Supporting Information S1), it is clear that EXTEMPLAR-ML most closely matches the general trends of the satellite estimates for a majority of each storm period. An important attribute of the model (as for EXTEMPLAR v6) is that it does not follow the outliers. In the data processing scheme, there are erroneous temperature predictions. While both EXTEMPLAR models are regressed on the satellite-derived temperatures, it is evident that they pick up on physical trends and are not overfit to outliers. Other takeaways from the figures are that MSIS seems to follow the correct variation in T ∞ but has a tendency to overpredict, and there are periods where EXTEMPLAR v6 deviates from the other temperature sources (e.g., other models and the satellite estimates). These trends are more evident in other storms found in Supporting Information S1, but this storm is shown as it is from the test set.

Table 5
Mean Absolute Percent Error for MSIS, EXTEMPLAR v6, and EXTEMPLAR-ML as a Function of Space Weather Conditions on the Test Set
Note. Bold text represents the lowest errors for that condition across the three models.

Modeled Versus Observed Density Along Orbits
We input the exospheric temperatures from Table 6 and Figure 6 into MSIS in order to obtain the associated mass density values along the satellite orbits. This is performed for the 10 storms, and the mean absolute percent errors are shown in Table 7. The predictions for the August 2005 storm are shown in Figure 7. The figures for the nine remaining storms and a table of mean absolute error in kg/m 3 are presented in Supporting Information S1.
The table shows that EXTEMPLAR-ML improves on both MSIS and EXTEMPLAR v6 along 15 of the 20 orbits. For the March 2015 storm, EXTEMPLAR v6 had the lowest temperature errors along both Swarm orbits, but in terms of density, EXTEMPLAR-ML had lower error. It is important to note that the difference between the two models is fairly negligible for some of the storms. However, the benefit of EXTEMPLAR-ML is its global nature. We do not need to find the closest polyhedral grid cell and execute a specific model; EXTEMPLAR-ML takes the location as an input. In Figure 7, all three models have similar predictions prior to the storm. During the storm, MSIS deviates from the satellite estimates more significantly than the EXTEMPLAR models. Through the entire period, EXTEMPLAR v6 tends to result in higher density values than the other models.

Summary
In this work, we developed and optimized a global machine-learned nonlinear regression model to predict exospheric temperatures, given a set of space weather and temporal drivers. This model, called EXTEMPLAR-ML, has fairly similar performance across the training, validation, and test sets. It is an extension of the linear EXTEMPLAR model developed by Weimer et al. (2020). EXTEMPLAR v6 is a set of 1,620 linear regression models that predict exospheric temperature for their respective locations. While condensing that to a single model in EXTEMPLAR-ML, we decreased the mean absolute error on all three sets relative to EXTEMPLAR v6. While EXTEMPLAR v6 had a lower error standard deviation on the test set, EXTEMPLAR-ML's value was within 0.5 K. It is difficult to comment on the generality of EXTEMPLAR-ML, because Figure 3 and Table 5 show similar trends between the three models in terms of their relative performance on the three sets. We show that EXTEMPLAR-ML had lower error in 16/20 solar and geomagnetic activity conditions investigated for the test set. When investigating error as a function of position within the test set, EXTEMPLAR-ML had lower mean absolute errors for 61% of the polyhedral grid cells, while EXTEMPLAR v6 had a lower error standard deviation for 73% of cells. Both of the EXTEMPLAR models are developed exclusively for use with NRLMSISE-00.
The use of PCA on model outputs provided insight into the temperature formulation within the "black-box" ML model. The first mode represented the effects of solar EUV heating and accounted for 97.6% of the system's variance. Seasonal-latitudinal variations accounted for the next 0.76% of the variance and were still a function of solar activity. The last mode we looked at only accounted for 0.20% of the variance but described the effects of high-latitude heating caused by geomagnetic storms. These findings are consistent with what would be expected from performing the analysis with MSIS or EXTEMPLAR v6. This analysis was only conducted on EXTEMPLAR-ML outputs after training; PCA was not used for model development. The purpose was to show that even without physical constraints in the training process, the model's response to the inputs was in fact physical.

Note. Bold text represents the lowest error across the three models for each satellite.

We performed a case study where EXTEMPLAR-ML, EXTEMPLAR v6, and MSIS predicted T ∞ along CHAMP, GRACE-A, Swarm A, and Swarm B orbits during 10 major geomagnetic storms. These storms were diverse in their properties and spanned nearly 12 years. EXTEMPLAR-ML provided the most accurate predictions along 13/20 satellite orbits. When these temperatures were used for density prediction, the relative accuracy of EXTEMPLAR-ML became more pronounced. EXTEMPLAR-ML was the most accurate model for 15/20 orbits and had errors significantly lower than MSIS for nearly all storms. Figures were generated for all storms, but only one storm was displayed (both temperature and density). The remaining figures are provided in Supporting Information S1. These figures show the extent to which EXTEMPLAR-ML follows the satellite temperature and density estimates while being robust to outliers that result from the processing of the accelerometer measurements.
In the future, we plan to incorporate model uncertainty into EXTEMPLAR-ML. We also plan to develop a newer model using temperatures derived with NRLMSIS 2.0 (Emmert et al., 2021). With the successor model, we plan to use the exact observation locations to reduce errors associated with binning measurements to the polyhedral grid.