Topside Electron Density Modeling Using Neural Network and Empirical Model Predictions

We model the electron density in the topside of the ionosphere with an improved machine learning (ML) model and compare it to existing empirical models, specifically the International Reference Ionosphere (IRI) and the Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM). In prior work, an artificial neural network (NN) was developed and trained on two solar cycles' worth of Defense Meteorological Satellite Program data (113 satellite-years), along with global drivers and indices, to predict topside electron density. In this paper, we highlight improvements made to this NN, and present a detailed comparison of the new model to E-CHAIM and IRI as a function of location, geomagnetic condition, time of year, and solar local time. We discuss precision and accuracy metrics to better understand model strengths and weaknesses. The updated neural network shows improved mid-latitude performance with absolute errors lower than the IRI by 2.5 × 10⁹ to 2.5 × 10¹⁰ e⁻/m³, modestly improved performance in disturbed geomagnetic conditions with absolute errors reduced by about 2.5 × 10⁹ e⁻/m³ at high Kp compared to the IRI, and high Kp percentage errors reduced by >50% when compared to E-CHAIM.


Introduction
The ionosphere is a region of charged particles at the upper boundary of the Earth's atmosphere, ranging from 70 to 1,000 km in altitude. Propagation of electromagnetic waves through a cold plasma like the ionosphere is primarily a function of the electron density within the plasma, with some influence from the geomagnetic field and particle collisions. Our ability to model electron density directly impacts our ability to model the effects of anomalous solar radiation, or space weather, on radio communications that travel through the ionosphere or bounce along the Earth-ionosphere waveguide.
Modeling the ionosphere is a critical component of a robust space weather monitoring system. Electron density data are particularly difficult to obtain in the topside ionosphere, since the F-region electron density peak makes this region much less accessible to ground instruments such as ionosondes. Looking then to satellite data sources, radio occultation measurements are often tuned to the lower altitudes of the ionosphere and are not the best data source for upper topside modeling. A number of LEO satellites that orbit in the topside ionosphere carry either electron density probes, like the Defense Meteorological Satellite Program (DMSP) (Redmon et al., 2017), or topside sounders, like the ISIS-1 and 2 missions (Eccles et al., 1973). However, due to the 7 km/s ground speed of a satellite in LEO, it is difficult to achieve a high spatial resolution at a given time, or a high temporal resolution at a given location. This makes it difficult to build an empirical model that can make predictions at any time and location under variable geomagnetic conditions.
There have recently been some efforts to use machine learning (ML) to model the ionospheric electron density, as well as other quantities in the Sun-to-Earth space weather system. Empirical models of electron density in the ionosphere assume a baseline mathematical function that generally describes an electron density altitude profile at a given latitude/longitude pair, while ML allows for the creation of models with fewer restrictions on their structure. By leveraging the long time spans of the THEMIS probe data sets, Bortnik et al. (2016) focused on predicting global plasma density in the magnetosphere using a neural network. Building on that work, Dutta and Cohen (2022) predicted electron density in the topside ionosphere using a similar NN structure. Watanabe et al. (2021) applied a multilayer NN with nine hidden layers to predict electron density in the topside ionosphere and plasmasphere in a combined model using data from the Hinotori, Akebono, and Arase satellites. In this paper, we build on the results of Dutta and Cohen (2022) and present a number of significant improvements. We then compare extensively to two existing empirical models, discussing under what circumstances each works better.

Plain Language Summary
The sun interacts with the outer layers of the Earth's atmosphere via the solar wind. Coronal mass ejections and solar flares travel from the sun along the solar wind, creating space weather on Earth. A severe storm can cause widespread power outages and disrupt communication services. Unfortunately, we cannot predict space weather as well as we can predict normal weather. In this work, we explain a machine learning model that can predict how the sun will impact one outer layer of the Earth's atmosphere, called the ionosphere. The model combines data that are easier to measure directly to create these predictions, which we then compare to existing physical/mathematical models of the ionosphere.

Existing Empirical Models
The International Reference Ionosphere (IRI) is considered the standard ionospheric model (Bilitza et al., 2017). It was created by COSPAR and URSI and continues to be updated with a multitude of data sources. The model takes location, time, and date, and can output electron density, electron temperature, ion temperature, and ion composition throughout the altitudes of the ionosphere. As the IRI has improved, one area of ongoing development is the topside ionosphere, with four options developed to better model this region (IRI-2001, IRI-2001cor, NeQuick, and COR2). These options were developed using topside sounder data from the Alouette and ISIS satellites. In this paper, we use the IRI 2016 model via pyglow, a Python wrapper. We did not specify an F2 peak density or peak height, so the default URSI maps were used for NmF2 and the Shubin 2015 model for hmF2. The foF2 storm model was turned on. The "NeQuick" option was used for the topside; it models the topside electron density as a function of altitude using an Epstein layer function, and it is the default topside model in the 2016 IRI (Bilitza et al., 2022).
The Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM) aims to improve upon the IRI by focusing modeling efforts solely on the Arctic region of the Earth's ionosphere. The developers of E-CHAIM created a topside model based on improvements to the NeQuick model used by the IRI. E-CHAIM combines a spherical cap harmonic expansion, a Fourier expansion, and a function of F10.7 to capture horizontal spatial variability, seasonal variability, and solar cycle variability, respectively (Themens et al., 2018). This combination of functions is fit four separate times to model the bottomside profile, hmF2, NmF2, and the topside profile. We ran the MATLAB release version 3.3.1 of E-CHAIM to make electron density predictions. The model was run in satellite mode, since all of the data used to build, validate, and test the NN model was sourced from the DMSP and DEMETER programs. All other options (storm, precip, dregion) remained off for the comparisons made here, as it was found that model performance was comparable with or without those flags on.
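To make the "Epstein layer function" mentioned above concrete, the following is a minimal sketch of a semi-Epstein topside profile, N(h) = 4 NmF2 exp(z) / (1 + exp(z))², with z = (h − hmF2) / H. It is not the NeQuick implementation: NeQuick lets the scale height vary with altitude, whereas this sketch assumes a constant scale height H for simplicity. The function name and all numerical values are hypothetical.

```python
import math


def epstein_topside(h_km, nmf2, hmf2_km, scale_height_km):
    """Semi-Epstein layer evaluated at altitude h_km.

    N(h) = 4 * NmF2 * exp(z) / (1 + exp(z))**2, with z = (h - hmF2) / H.
    A constant scale height H is an assumption made for this sketch.
    """
    z = (h_km - hmf2_km) / scale_height_km
    ez = math.exp(z)
    return 4.0 * nmf2 * ez / (1.0 + ez) ** 2


# Hypothetical peak parameters, chosen only for illustration.
NMF2 = 1.0e12      # peak density, e-/m^3
HMF2_KM = 300.0    # peak height, km
H_KM = 60.0        # constant scale height, km

profile = {h: epstein_topside(h, NMF2, HMF2_KM, H_KM) for h in (300.0, 500.0, 830.0)}
for altitude, density in profile.items():
    print(f"{altitude:6.1f} km -> {density:.3e} e-/m^3")
```

At h = hmF2 the expression returns NmF2 exactly, and the density then decays monotonically with altitude, which is the qualitative topside behavior the empirical models impose by construction.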

Existing Neural Network Models
Neural networks have previously been used to model spatiotemporally varying quantities in the magnetosphere, ionosphere, and/or thermosphere. We present some examples of existing models, and briefly compare those models to the model we present in this work.
A multilayer NN with nine hidden layers has been applied to predict electron density in the topside ionosphere and plasmasphere in a combined model using data from the Hinotori, Akebono, and Arase satellites (Watanabe et al., 2021). The supervised model takes universal time from 1 January, geographic latitude and longitude, altitude, and the past 5 days of F10.7 and Ap indices to predict electron density. While the model predicts electron density at any time and geographical location within altitudes of 500-30,000 km, model validation remains a difficult challenge. Out of all of their available data, 99.803% of the data was used for training, leaving only 0.017% for model validation. Designing a model with a larger set of data from which to validate model performance is crucial.
An existing technique for topside ionospheric modeling uses four different networks to learn NmF2, HmF2, H0, and dHs/dh (Smirnov et al., 2023). Those networks are trained on inputs similar to ours, including a mix of global index values and geographical features. The four parameters are then fit to a linear alpha-Chapman function to create ionospheric electron density profiles. The network we present instead uses a time history of global index values and geographical features to directly predict ionospheric electron density, and does not impose any additional structure on how the model learns to predict electron density.
Prior work has been done predicting the global plasma density in the magnetosphere using a NN (Bortnik et al., 2016). The model inputs are the prior five hours of the Sym-H index (60 values, one every 5 min), the L shell, and the sine and cosine of magnetic local time. Data for training, validation, and testing was obtained from the Time History of Events and Macroscale Interactions during Substorms (THEMIS) probes. They obtained correlation coefficients of 0.94 on training and validation data, and a correlation coefficient of 0.93 on testing data, performance we emulate in the topside ionosphere using this method.

Description of Data
The DMSP was a set of satellites launched into Low Earth Orbit between the mid-1960s and mid-2010s, primarily used as a source of environmental information for the US Department of Defense (Redmon et al., 2017). Data from the DMSP has been widely validated and used for modeling purposes (Cai et al., 2019; Eather, 1979; Redmon et al., 2017). Twelve satellites in the program were equipped with a Special Sensor Ionospheric Plasma Drift/Scintillation Monitor (SSIES), which consists of a Langmuir Probe and an ion sensor, the former of which is used to measure electron density. Each satellite has obtained measurements for anywhere from 3 to 21 years, with an average of ∼11 years of coverage per satellite, or 139 satellite-years total between 1988 and 2023. The satellites' orbital altitude is centered around 830 km, with some data collected as high as 880 km and as low as 760 km. The data can be accessed via CEDAR's Madrigal Database as Level 1 NASA data.
DEMETER is a French microsatellite mission which started collecting data in 2004 and ended its mission in 2010 (Lebreton et al., 2006). Its primary objective was studying the potential precursors to earthquakes that might be detected in the ionosphere, although its scientific scope expanded greatly beyond this goal once it went into orbit. This satellite was in Low Earth Orbit, with a nominal altitude of 715 km from orbit entry in 2004 until December 2005, when it was lowered to 660 km. A Langmuir Probe Instrument (ISL) made of two Langmuir Probes was used to collect electron density data. Data collected by Langmuir probes is susceptible to contamination from water vapor (Oyama et al., 2012). The DEMETER satellite shows signs of such contamination, meaning the absolute electron density measurements may not be accurate; however, the data still provides realistic information about the structure and variability of the topside ionosphere (Kakinami et al., 2013). In this context, we discuss the absolute error of models on the DEMETER data set binned by features such as magnetic latitude and Kp, in addition to the well validated DMSP data set, to observe model behavior over facets of ionospheric variability.

Scoring Metrics
Here we define scoring metrics to compare models, for accuracy and precision separately. To measure accuracy, we use the coefficient of determination $R^2$, computed by sklearn.metrics.r2_score. This Python function uses the following formula to compute $R^2$:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$

where $N$ is the number of points in the data set, $y_i$ is the ith true log10(Ne) value from the data set, $\hat{y}_i$ is the model predicted log10(Ne) value for the conditions associated with the ith data point, and $\bar{y}$ is the mean of the true log10(Ne) values in the data set. Since the relationship between the data set (true) values and the model (predicted) values may not be linear, $R^2$ may be negative. A negative $R^2$ indicates that simply predicting $\bar{y}$, the mean value of the data set, would outperform the model. In Dutta and Cohen (2022), we reported the correlation coefficient R. This is the square root of $R^2$ as defined here.
To measure precision, we use the explained variance EV, computed by sklearn.metrics.explained_variance_score, which uses the following formula to compute EV:

$$EV = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)}$$

where $\mathrm{Var}$ denotes the variance taken over the data set. If the variance in the error between the data set and modeled electron density is 0, then the model has achieved a perfect EV score of 1. As the error variance increases, the EV score drops. The variance of the error between the data set and the modeled electron density can be greater than the variance of the data set, so negative EV scores are possible.
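The difference between the two metrics is easiest to see with a model that is perfectly precise but biased. The sketch below re-implements both formulas in plain Python (rather than calling scikit-learn) so that each term is visible; the function names and the log10(Ne) values are hypothetical and chosen only for illustration.

```python
from statistics import fmean, pvariance


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = fmean(y_true)  # mean of the true values
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance_manual(y_true, y_pred):
    """Explained variance: 1 - Var(y - y_hat) / Var(y)."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - pvariance(errors) / pvariance(y_true)


# Hypothetical log10(Ne) values; the "model" adds a constant 0.5 offset.
y_true = [9.0, 9.5, 10.0, 10.5, 11.0]
y_biased = [v + 0.5 for v in y_true]

r2 = r2_manual(y_true, y_biased)
ev = explained_variance_manual(y_true, y_biased)
print(f"R2 = {r2:.2f}, EV = {ev:.2f}")
```

Because the error is the same for every point, its variance is zero and EV stays at 1, while the constant offset is penalized by R². A large gap between the two scores is therefore a sign of systematic bias rather than scatter.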

Neural Network Model
Prior work on developing a NN is published in an earlier paper (Dutta & Cohen, 2022). This NN took 39 time series solar index features (Kp, IMF, and F10.7) and 8 location features as input, and was trained via supervised learning to predict log10(Ne). In more detail, the global index inputs to the model are the past 7 days of F10.7, the past day of Kp, and the past day of average IMF measurements. The remaining inputs encode the location: altitude, geographic latitude and longitude, magnetic latitude, solar azimuth, and magnetic local time.
In order to improve and better characterize the NN from Dutta and Cohen (2022), we first modified the network architecture. In particular, we adopted an aspect of the NN presented by McGranaghan et al. (2021), in which the number of nodes from layer to layer was gradually increased and decreased. Such a structure is useful when some features correlate with each other (McGranaghan et al., 2021). The new architecture is reflected in Figure 1, where the first hidden layer consists of more nodes than the number of inputs. In addition to modifying the architecture, we also modified the split of training, validation, and testing data, and trained a new model reflecting these new splits. To better characterize the NN performance, testing data that was not used in training or validation was needed over a full solar cycle. As such, rather than use the later years for validation and testing, we divided up the data by satellite.
Dramatic improvement is seen in performance on the DEMETER data set, with a positive R² and EV for the new NN, while the old NN had R² = −0.19.
The new NN has a lower EV score than the IRI on the DEMETER data set, with a difference of 0.26, but the accuracy score is better for the new NN. This implies that the IRI and DEMETER data set have a systematic offset, which could be corrected for if the outputs of multiple models were to be aggregated. In Table 2, we provide the R² and EV scores restricted to high magnetic latitude data only, and we reserve discussion of those statistics for the Results section.
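As a rough illustration of the architecture summarized in the Figure 1 caption (47 inputs, three fully connected hidden layers of 57, 37, and 10 nodes, sigmoid activations, and a linear output), the following sketch runs a forward pass in plain Python. It is not the authors' implementation: the random weight initialization, the reading of "sigmoid between all of the layers" as sigmoid on every hidden layer, and all function names are assumptions, and no training is shown.

```python
import math
import random

# Layer widths taken from the Figure 1 caption: 47 inputs -> 57 -> 37 -> 10 -> 1 output.
LAYER_SIZES = [47, 57, 37, 10, 1]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(sizes, seed=0):
    """Small random weights and zero biases (an arbitrary choice for this sketch)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Sigmoid on every hidden layer, linear final layer; returns one scalar."""
    activations = list(x)
    last = len(params) - 1
    for k, (weights, biases) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        activations = z if k == last else [sigmoid(v) for v in z]
    return activations[0]


def count_params(sizes):
    """Number of weights plus biases in the fully connected stack."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


params = init_params(LAYER_SIZES)
prediction = forward(params, [0.0] * 47)  # stands in for one 47-feature input vector
n_params = count_params(LAYER_SIZES)
print(n_params)
```

With these layer widths the network has 5,273 trainable parameters, small relative to the number of in situ measurements available for training.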

Results: NN, IRI, and E-CHAIM Comparisons
We now describe the performance of the NN model alongside the IRI and E-CHAIM models in topside electron density prediction, across characteristics such as time of day, time of year, geography, and geomagnetic conditions.
Here we present MAEs across these characteristics over the DMSP data sets and the DEMETER data set.
Figure 2 plots MAE of the IRI, NN, and E-CHAIM as a function of solar local time (SLT). The top row contains data from all magnetic latitudes and includes the IRI and both the previous and new NN models. The bottom row is restricted to magnetic latitudes above 50°N, where E-CHAIM provides an electron density value. We note that E-CHAIM provides an output for magnetic latitudes between 45°N and 50°N to allow for integration with other ionospheric models, but is only recommended for use above 50°N. The DEMETER satellite has limited solar local time coverage, so the DEMETER column only plots absolute errors for 08:00-12:00 and 20:00-24:00, with high magnetic latitude data further restricted to 11:00-12:00 and 20:00-22:00. On the full DEMETER data set in the top row, the new NN obtains a MAE 2.5 × 10¹⁰ e⁻/m³ lower than the MAE from the IRI at 10:00, but performs comparably to the IRI at other times. On the latitude restricted DEMETER data set, the IRI performs slightly better than the NN and the E-CHAIM models across the few solar local times that are available in this data set. Focusing on the top row and the three DMSP data set columns, when comparing the IRI and the NN we see that the NN has lower error across all SLTs, with the exception of 15:00 in the DMSP validation data set.
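The binned MAEs discussed above can be computed along the following lines. This is a minimal sketch, not the code used for Figure 2: it assumes that predictions and observations are stored as log10(Ne) and converted back to linear density before differencing (the MAEs are reported in e⁻/m³), and that SLT is given in decimal hours in [0, 24). The function name and sample values are hypothetical.

```python
from collections import defaultdict


def mae_by_slt_hour(slt_hours, log10_ne_true, log10_ne_pred):
    """Mean absolute error in linear density (e-/m^3), grouped by integer SLT hour."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for slt, t, p in zip(slt_hours, log10_ne_true, log10_ne_pred):
        hour = int(slt) % 24                      # one bin per hour, 0 through 23
        sums[hour] += abs(10.0 ** p - 10.0 ** t)  # convert log10(Ne) back to Ne
        counts[hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


# Hypothetical samples: two points in the 10:00 bin, one in the 21:00 bin.
slt = [10.2, 10.7, 21.5]
log_true = [10.0, 10.0, 9.0]
log_pred = [10.0, 11.0, 9.0]

mae = mae_by_slt_hour(slt, log_true, log_pred)
print(mae)
```

Note that an error of one unit in log10(Ne) at a density of 10¹⁰ e⁻/m³ already contributes 9 × 10¹⁰ e⁻/m³ in linear units, so bins containing dense plasma dominate an absolute-error plot.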
Figure 3 plots MAE of the IRI, NN, and E-CHAIM across months of the year in the same format as Figure 2. There are peaks in MAE corresponding to northern hemisphere spring and southern hemisphere spring in the …
Figure 4 depicts MAEs binned by the average Kp value over the day prior to the prediction. The NN performs somewhat better than the IRI across the DMSP and DEMETER data sets, while E-CHAIM performs similarly to or worse than the IRI. We note that higher values of electron density are expected when the Kp index is elevated (Papagiannis et al., 1975), so the absolute errors will increase even if the model performs just as well at higher Kp as at lower Kp. In order to examine this effect, we created Figure 5, which plots mean absolute percentage error as a function of Kp across the same data sets, and leave analysis to the discussion section.
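The reason for switching from absolute to percentage error can be illustrated with a short sketch. It bins samples by the mean of the eight 3-hr Kp values from the prior day and computes the mean absolute percentage error in each bin. The unit-wide Kp bins, the function name, and the density values are assumptions made for illustration, not details taken from Figures 4 and 5.

```python
from collections import defaultdict
from statistics import fmean


def mape_by_mean_kp(kp_histories, ne_true, ne_pred):
    """Mean absolute percentage error, grouped by the floor of the prior-day mean Kp."""
    per_bin = defaultdict(list)
    for kp_hist, t, p in zip(kp_histories, ne_true, ne_pred):
        kp_bin = int(fmean(kp_hist))               # unit-wide bins are an assumption
        per_bin[kp_bin].append(abs(p - t) / t * 100.0)
    return {kp_bin: fmean(v) for kp_bin, v in sorted(per_bin.items())}


# Hypothetical case: a 20% overestimate under quiet and under disturbed conditions.
kp_histories = [[1.0] * 8, [7.0] * 8]   # eight 3-hr Kp values per sample
ne_true = [1.0e10, 5.0e10]              # denser plasma in the disturbed sample
ne_pred = [1.2e10, 6.0e10]

abs_errors = [abs(p - t) for t, p in zip(ne_true, ne_pred)]
mape = mape_by_mean_kp(kp_histories, ne_true, ne_pred)
print(abs_errors, mape)
```

In this example the absolute error is five times larger in the disturbed sample even though the relative error is identical, which is the effect the percentage-error figure is meant to separate out.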
Figure 6 depicts the new NN, IRI, and E-CHAIM predictions next to time series data from the DMSP test data set (satellite F16). Low Kp data is from a time period when the average Kp was less than 2, while high Kp data is from a time period when the average Kp was greater than 7 (2003 Halloween storms). At low Kp, the IRI provides smooth output along the DMSP satellite trace that matches the overall electron density fluctuations recorded by the satellite, and E-CHAIM provides a smooth output that somewhat follows the fluctuations. The NN output is not as smooth and is more reflective of the DMSP satellite data, capturing the fluctuations reported by DMSP. At high Kp, the IRI again provides smooth output that follows the electron density fluctuation, but with a bias toward underestimating the electron density. E-CHAIM again provides smooth output at high Kp, but this time follows the fluctuations better than the NN or the IRI. The NN output is again not as smooth at high Kp, and detects some peaks and valleys while missing others, but it does not underestimate the electron density at high Kp like the IRI.

Discussion
In this work, we compare a NN model of the topside ionosphere to the IRI and E-CHAIM. These comparisons make use of DMSP and DEMETER data, the latter of which was not used to train the NN or fit IRI/E-CHAIM.
Before we compare the performance of the NN model to existing models, we provide commentary on the characteristics of the DMSP data used for training and testing the model. At the nominal altitude of operation, the DMSP satellites are sun-synchronous, meaning the satellites gather the majority of their data in a restricted set of solar local times, primarily focused around 4-8 SLT and 15-20 SLT. However, the satellites experience drift over their lifetime, allowing for coverage of all solar local times across the data set. We have verified this claim by plotting a histogram of the data set binned by SLT. Additionally, despite this bias toward specific solar local times, we have also verified that the model is trained and tested on data from all combinations of SLT and F10.7, SLT and Kp, and SLT and magnetic latitude. We further posit that the premise of the NN learning based on a limited data set is that it is able to separate out the effects of SLT versus another factor, such as Kp, in making its electron density prediction. This way, when provided a novel (to the model) combination of SLT and Kp, it can still make a useful prediction.
We start our analysis by examining the summary statistics in Table 1, which compares the neural networks to the IRI using the full DMSP and DEMETER data sets at all latitudes. We find that the new NN outperforms the IRI in both correlation coefficient and explained variance scores across all DMSP data sets and the DEMETER data set, and modestly improves in performance over the old NN on the DMSP data sets, with much greater improvement seen on the DEMETER data set. In earlier works, E-CHAIM outperforms the IRI (NeQuick) in the topside ionosphere, producing topside electron density profiles that match Alouette and ISIS profiles with root mean square errors that are 36% lower than the errors obtained by the IRI (Themens et al., 2018). However, both the IRI and E-CHAIM tend to produce topside electron density profiles with lower error close to the hmF2 peak, and as altitude above the peak increases, the errors in the electron density also tend to increase. The NN model developed here focuses on altitudes significantly higher than the hmF2 peak, and E-CHAIM's advantage over the IRI is slimmer at these higher altitudes.
In Table 2, the coefficient of determination for E-CHAIM is 0.01-0.02 greater than that of the IRI on the DMSP train and validation data sets, equivalent for the DMSP test data set, and 0.1 lower on the DEMETER data set. The new NN outperforms both E-CHAIM and the IRI across DMSP data sets, but underperforms both on the DEMETER data set.
Here, we go into more detail about performance differences between the IRI and the NN across all magnetic latitudes, and then the IRI, E-CHAIM, and the NN on data from magnetic latitudes greater than 50°N, breaking down our analysis by SLT, month of year, Kp index, and magnetic latitude itself.
The NN and E-CHAIM perform similarly, both better than the IRI on the DMSP data sets, but the IRI outperforms both on the DEMETER data set across SLT. We note that the DEMETER data set is extremely limited in SLT coverage, and we conclude that across SLT, the NN performs better than or equivalent to the IRI, with E-CHAIM performing marginally better than the NN from 21:00 to 4:00, and marginally worse from 4:00 to 21:00.

By Month
Looking at the full DMSP and DEMETER data sets (all latitudes) in the top row of Figure 3, the highest absolute errors across the IRI and the new NN occur in April and November, corresponding to northern hemisphere spring and southern hemisphere spring, respectively. Restricted to high (northern) latitudes, the peak error occurs in either April or May, again corresponding to springtime in the northern hemisphere. We also note that across the full data sets, the NN outperforms the IRI across all months, by anywhere from 2.5 × 10⁹ e⁻/m³ up to 2 × 10¹⁰ e⁻/m³.
Considering the E-CHAIM absolute errors by month in the second row of Figure 3, we observe a modest peak in absolute error corresponding to northern hemisphere springtime across the DMSP data sets, versus a significant peak in error at the same time across the DEMETER data set. On the DMSP data sets, E-CHAIM outperforms the IRI from January through September, with lower MAEs by as much as 5 × 10⁹ e⁻/m³, and slightly underperforms from October through December. On the DEMETER data set, the IRI strongly outperforms E-CHAIM from March through October, with lower MAEs by as much as 10¹⁰ e⁻/m³, and slightly underperforms from November through February. We see the NN and E-CHAIM perform comparably across the DMSP data sets, while the NN exhibits similar or slightly better performance on the DEMETER data set. Only when limited to high latitude data from a limited window of solar local times, as in the high latitude DEMETER data plot, does the IRI outperform E-CHAIM and the NN. We conclude that by month of year, the NN is a better predictor of topside electron density.
One study comparing E-CHAIM predictions of ionospheric electron density with Resolute Bay Incoherent Scatter Radar (RISR) data found that E-CHAIM underestimates electron density in the topside by 20%, and that topside underestimation is prominent during nighttime in the fall (Larson et al., 2023). The RISR data used covers altitudes from 100 to 500 km, while the DMSP data focuses on altitudes around 850 km and DEMETER data at 715 km. At these higher altitudes we find that E-CHAIM, along with the other models, overestimates electron density in the fall and at night. For comparison, we examine Figures 8 and 9, which provide the mean ratio error by month and by SLT at high latitudes. By month, the NNs and E-CHAIM attain a ratio of 1 or slightly overestimate the electron density across all data sets from spring through summer, with the exception of May in the DMSP train data set, while the IRI tends to underestimate the electron density over those seasons. In the fall and winter, all of the models tend to overestimate the electron density by a factor of anywhere from 2 to 6. By SLT, the NNs and the IRI maintain a ratio of 1-3, tending to overestimate at sunrise and sunset, while E-CHAIM overestimates …
… in altitude coverage with DMSP and DEMETER. NeQuick struggles to correctly reproduce high altitude electron densities in the Equatorial Ionization Anomaly (Bilitza & Xiong, 2021; Nava et al., 2008). The NeQuick model also underestimates electron densities at high latitudes and elevated solar activity on DMSP data from satellites F13 and F15 (Migoya-Orue et al., 2013). Our findings are in agreement with these previous works. Across the DMSP data sets, we observe higher absolute error in the IRI model within ±15° magnetic latitude, lower error between 15° and 60° in both hemispheres, and a smaller error peak at the poles. On DEMETER, there is a prominent peak in error in the midlatitude range and no small peak at the poles. We note that the highest E-CHAIM MAEs, from 8 × 10⁹ to 1.1 × 10⁹ e⁻/m³, are seen at 45°N or 75°N across all the data sets. E-CHAIM is recommended for use at magnetic latitudes greater than 50°N, so elevated error below that latitude is to be expected. The NN performs similarly to or better than the IRI and E-CHAIM across the DMSP data sets, and performs significantly better than the IRI at midlatitudes when tested on the DEMETER data set, with lower absolute errors by 2 × 10¹⁰ e⁻/m³. We find that the NN matches or outperforms E-CHAIM by magnetic latitude across all of our data sets, and outperforms the IRI except near the poles in the DEMETER data set.
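The magnetic latitude binning described in the Figure 7 caption (5° wide bins, each plotted at its center) can be sketched as follows. The function names and error values are hypothetical, and placing a latitude of exactly 90° in the topmost bin is an assumption made so that every sample falls in one of the 36 bins.

```python
import math
from collections import defaultdict

BIN_WIDTH_DEG = 5.0


def bin_center(mlat_deg):
    """Center of the 5-degree bin containing mlat_deg (e.g., 0 to 5 N maps to 2.5)."""
    idx = math.floor(mlat_deg / BIN_WIDTH_DEG)
    idx = min(idx, int(90.0 / BIN_WIDTH_DEG) - 1)  # assumption: 90 goes in the top bin
    return (idx + 0.5) * BIN_WIDTH_DEG


def mae_by_mlat(mlat_deg, abs_errors):
    """Mean absolute error per magnetic latitude bin, keyed by bin center."""
    groups = defaultdict(list)
    for lat, err in zip(mlat_deg, abs_errors):
        groups[bin_center(lat)].append(err)
    return {center: sum(v) / len(v) for center, v in sorted(groups.items())}


# Hypothetical absolute errors in e-/m^3 at four magnetic latitudes.
binned = mae_by_mlat([1.0, 4.0, -3.0, 37.0], [2.0e9, 4.0e9, 1.0e9, 8.0e9])
print(binned)
```

Negative latitudes fall into bins centered at −2.5, −7.5, and so on, matching the convention that negative values indicate the southern hemisphere.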

Conclusion and Future Work
We have presented an updated NN to model the electron density in the topside ionosphere, with a focus on altitudes between 700 and 850 km. The updated NN shows improved mid-latitude performance with absolute errors lower than the IRI by 2.5 × 10⁹ to 2.5 × 10¹⁰ e⁻/m³, improved performance in disturbed geomagnetic conditions with absolute errors reduced by about 2.5 × 10⁹ e⁻/m³ at high Kp compared to the IRI, and high Kp percentage errors reduced by >50% when compared to E-CHAIM. We have also reported performance of the models across solar local times and months, noting the high variability of the ionosphere at midlatitudes at sunset, as well as during the springtime in the Northern and Southern hemispheres. In future work, we look toward combining the output of the empirical models with that of the NN, as well as introducing additional data sets to validate the output of this larger hybrid model.
These 47 inputs went into a fully connected NN with two hidden layers consisting of 20 and then 10 nodes, finally providing one output, log10(Ne). For NN regularization, a validation data set was withheld from the training process of updating NN weights and biases, and this held out data was used to stop training once performance on it no longer improved. The training and validation data was sourced from the DMSP, and the testing data was from either the DMSP or the DEMETER satellite. More specifically, DMSP satellites F08 through F19 provide electron density data over the years 1988-2019, where 1988-2011 was used for training, 2012-2016 was used for validation, and 2017-2019 was used for model testing. DEMETER data from July 2004 to December 2005 was also used for model testing. The NN presented in Dutta and Cohen (2022) outperforms the IRI in both R score and EV when tested against the later years of DMSP, but not when the DEMETER data set was taken as truth. However, when considering model performance by geographic location, it struggled at high latitudes, and when considering performance under varied geomagnetic conditions, it struggled under heavily disturbed geomagnetic conditions. Here, we present improvements to the model architecture and training data set to improve standalone NN model performance and better characterize the neural network's performance.
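The validation-based stopping rule described above can be sketched as a small function that scans a history of validation losses. The text only states that training stopped once held out performance no longer improved; the patience parameter, the function name, and the loss values below are assumptions added for illustration.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch that is `patience` epochs past the best
    validation loss seen so far. The patience value is an assumption.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improvement stalls after epoch 2.
history = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=3)
print(stop_epoch, best_epoch)
```

In practice the weights from the best epoch would be restored after stopping. Note that the later improvement at index 6 is never reached, which is the usual trade-off of this form of regularization.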

Figure 1. Model architecture. The neural network takes a total of 47 inputs. These inputs are eight location features, which are magnetic local time (sine and cosine), geographic latitude, geographic longitude (sine and cosine), altitude, magnetic latitude, and solar zenith angle, and 39 index features, which are the past 7 days of the F10.7 index, the past day of 3-hr Kp index (8 features), and the past 24 hr of average IMF measurements. These inputs pass through three fully connected hidden layers, with 57, 37, and 10 nodes. Sigmoid activation functions are used between all of the layers, and a final linear layer is applied before producing the electron density prediction.
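The input layout in the caption above can be made concrete with a short sketch that assembles one 47-element feature vector. The split of the 39 index features into 7 daily F10.7 values, 8 three-hourly Kp values, and 24 hourly IMF values is an assumption (it is consistent with 7 + 8 + 24 = 39), as are the feature ordering, the absence of normalization, and every name and sample value in the code.

```python
import math


def build_inputs(mlt_hours, glat_deg, glon_deg, alt_km, mlat_deg, sza_deg,
                 f107_daily, kp_3hr, imf_hourly):
    """Assemble 8 location features and 39 index features into one list of 47 values."""
    if (len(f107_daily), len(kp_3hr), len(imf_hourly)) != (7, 8, 24):
        raise ValueError("expected 7 F10.7, 8 Kp, and 24 IMF values")
    mlt_angle = 2.0 * math.pi * mlt_hours / 24.0   # map 24 hr onto a full circle
    lon_angle = math.radians(glon_deg)
    location = [
        math.sin(mlt_angle), math.cos(mlt_angle),  # cyclic magnetic local time
        glat_deg,
        math.sin(lon_angle), math.cos(lon_angle),  # cyclic geographic longitude
        alt_km, mlat_deg, sza_deg,
    ]
    return location + list(f107_daily) + list(kp_3hr) + list(imf_hourly)


# Hypothetical sample at DMSP-like altitude.
features = build_inputs(
    mlt_hours=6.0, glat_deg=45.0, glon_deg=-80.0, alt_km=830.0,
    mlat_deg=55.0, sza_deg=70.0,
    f107_daily=[120.0] * 7, kp_3hr=[2.0] * 8, imf_hourly=[5.0] * 24,
)
print(len(features))
```

Encoding magnetic local time and longitude as sine/cosine pairs keeps values on either side of the wrap-around (for example 23:59 and 00:01) close together in feature space, which a raw hour or degree value would not.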

Figure 2. Mean absolute errors (MAEs, [×10^10 e−/m^3]) of International Reference Ionosphere, neural network, and Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM) across DEMETER and Defense Meteorological Satellite Program data sets, binned by solar local time (SLT) (0-23). The bottom row represents data from magnetic latitudes greater than 45°N, since E-CHAIM provides predictions at and above that latitude, while the top row contains data from all latitudes. The y-axis is the MAE in ×10^10 e−/m^3, while the x-axis is the SLT from 0 to 23.

Figure 3. Same as Figure 2, but the errors are binned by month of year (1-12).

Figure 5. Same as Figure 4, but the error metric is mean absolute percentage error.

Figure 6. Comparison of data from the Defense Meteorological Satellite Program test data set (F16) with new neural network, International Reference Ionosphere, and Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM) performance. The left column contains data from a period of low geomagnetic activity, when the average Kp was under 2, while the right column contains data from a period of high geomagnetic activity, when the average Kp was above 7 (during the 2003 Halloween storms). The bottom row is a time series subset of the top row, chosen to focus on times when the satellite was above 50° magnetic latitude, so E-CHAIM performance is visible.

Figure 7. Mean absolute errors binned by magnetic latitude, with negative values indicating the southern hemisphere and positive values the northern hemisphere (−90 to 90). Bin size is 5° wide and the point for a bin is plotted in the center of the bin (e.g., data with magnetic latitude between 0 and 5°N is plotted at 2.5°N). Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM) predictions are only available for magnetic latitudes above 45°N; however, there are discrepancies between the magnetic latitude provided in the Defense Meteorological Satellite Program data set and the magnetic latitude E-CHAIM computes given the geographic latitude/longitude, altitude, and UTC, which is why E-CHAIM has predictions for a bin centered at 37.5°N.
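The binning used for Figure 7 can be illustrated with a short sketch. The function name `binned_mae` and the handling of the 90° edge are assumptions made for illustration; only the 5° bin width and the bin-center plotting convention come from the caption.

```python
import math
from collections import defaultdict


def binned_mae(mag_lats, observed, predicted, bin_width=5.0):
    """Mean absolute error per magnetic-latitude bin, keyed by bin center.

    Latitudes are in degrees, negative in the southern hemisphere.
    """
    last_index = math.ceil(90.0 / bin_width) - 1  # keeps exactly 90° in the top bin
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, obs, pred in zip(mag_lats, observed, predicted):
        # floor() assigns negative latitudes to the correct southern bin
        index = min(math.floor(lat / bin_width), last_index)
        center = (index + 0.5) * bin_width
        sums[center] += abs(pred - obs)
        counts[center] += 1
    return {center: sums[center] / counts[center] for center in sorted(sums)}
```

With this convention, a sample at 3°N falls in the bin plotted at 2.5°N and a sample at 2°S falls in the bin plotted at −2.5°, matching the example in the caption.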
The training data included DMSP data from 1988 to 2011, excluding satellite F16. The validation data set is obtained from DMSP data from 2012 to 2019, also excluding satellite F16. The F16 DMSP satellite, which covers all solar local times and the years 2003-2019, was therefore used for testing. The distributions of the DMSP training, validation, and testing data sets are similar since the satellites orbit at similar altitudes and use the same instrumentation to record electron density. While there is now temporal overlap between the training/validation data sets and the testing data set, the satellites were not in the same locations at the same time, and so the data sets are independent. This also means that the old NN was trained on a large portion of the current DMSP test data set, since the original model was trained on all DMSP data available from 1988 to 2011. In Table 1, we present updated R² and EV scores for the IRI, the new NN, and the old NN, using the new training, testing, and validation data splits. Despite the old NN having encountered a large portion of the DMSP test data set in training, the new NN still performs comparably, with R² and EV scores that are equivalent or better by 0.01-0.02 across all DMSP data sets. Based on the metrics in Table 1, we determined that the changes made to the NN improved on the old NN, and we selected this version of the model for further analysis.
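The split rules and the two scores reported in Table 1 can be sketched as follows. The function names are hypothetical, and the score formulas assume the standard definitions of the coefficient of determination and the explained variance score; the paper does not state which implementation was used.

```python
def assign_split(satellite, year):
    """Assign a DMSP sample to a data set using the rules described above."""
    if satellite == "F16":
        return "test"            # F16 is held out entirely (2003-2019)
    if 1988 <= year <= 2011:
        return "train"
    if 2012 <= year <= 2019:
        return "validation"
    return None                  # outside the years covered by the data


def mean(values):
    return sum(values) / len(values)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def explained_variance(y_true, y_pred):
    """Explained variance: 1 - Var(residuals) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    r_bar = mean(residuals)
    var_res = mean([(r - r_bar) ** 2 for r in residuals])
    y_bar = mean(y_true)
    var_true = mean([(t - y_bar) ** 2 for t in y_true])
    return 1.0 - var_res / var_true
```

Under these definitions the two scores differ only when the residuals have a nonzero mean: a model with a constant offset keeps an explained variance of 1 while its R² drops, which is one reason to report both.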
Note. DEMETER data is from July 2004 to December 2005. DMSP training data is from 1988 to 2011 and excludes data from satellite F16; DMSP validation data is from 2012 to 2019 and excludes data from satellite F16. DMSP testing data is from 2003 to 2019 and is sourced solely from satellite F16. The bold values indicate the model with the best performing coefficient of determination/explained variance score obtained on the data set.
Model performance metrics for the International Reference Ionosphere (IRI), old neural network, new neural network, and Empirical-Canadian High Arctic Ionospheric Model (E-CHAIM) with the new Defense Meteorological Satellite Program (DMSP) training, validation, and testing data set splits. DEMETER data is from July 2004 to December 2005. DMSP training data is from 1988 to 2011 and excludes data from satellite F16; DMSP validation data is from 2012 to 2019 and excludes data from satellite F16. DMSP testing data is from 2003 to 2019 and is sourced solely from satellite F16. The bold values indicate the model with the best performing coefficient of determination/explained variance score obtained on the high latitude restricted data set.

Table 2
Data Sets Restricted to Magnetic Latitude >50°N