A Nonlinear System Science Approach to Find the Robust Solar Wind Drivers of the Multivariate Magnetosphere

We propose a method, based on Neural Networks, that detects the nonlinear robust interplanetary solar wind variables, with varying delays, driving the coupled behavior of three geomagnetic indices (Dst, AL, and AU). As opposed to minimizing a prediction error, the method is based on degrading the prediction by distorting the inputs of the trained Neural Networks in order to highlight the most sensible drivers. We show that the z component of the magnetic field, the duskward oriented electric field, and the speed of the particles of the interplanetary medium, at particular time delays, seem to be the most efficient drivers of the three coupled geomagnetic indices. Using only the sensible or robust drivers in the model, we demonstrate that iterated predictions during geomagnetic storm are significantly improved from models that only use one of the outstanding drivers with multiple time delays. The derived robust nonlinear Neural Network model is also a significant improvement over linear approximations, specially when used as iterated predictors.

complexity of the SWMI system (Consolini et al., 2018;Donner et al., 2019;Valdivia et al., 2005Valdivia et al., , 2013, demands the development of models and techniques to account for these interactions in a robust manner. This is becoming particularly true as we are increasingly relying on artificial intelligence models to try to account for such complex behavior (Bortnik et al., 2018;Jawad et al., 2019). However, when developing robust models, it is not enough to just train a neural net with a large number of variables since, given the complex and nonlinear nature of the system, the model will probably not work as efficient on a set of events that is different from the training set. Some variables that do not actually contribute to the prediction can produce an inaccurate model that does not generalize well and that finally deteriorates the prediction. These complications of overfitting becomes even more relevant as the magnetospheric system is a high dimensional system that seems to evolve in a low dimensional attractor (Valdivia et al., 1996) so that these models should represent, to some approximation, the robust dynamics of the system. Hence, a robust multivariate nonlinear system science description, that for example includes the coupling of these three geomagnetic indices (GI) and with solar wind drivers (Borovsky & Denton, 2018;Valdivia et al., 1999) can further our understanding of these interactions and their time scales (Adhikari et al., 2019), and could pave the way to robust Space Weather applications. This is what we are going to start analyzing in this manuscript. So that, for simplicity, we will study the three above described GI (Dst t , AL t , and AU t ) and search for the robust solar wind variables (SWV), and their possible time delays, that drive the magnetosphere response as characterized by these three indices. In our study we will consider data with an hour time scale and let faster variation to future work. We have chosen these variables because they are widely used by the scientific community and therefore the results can be compared with other studies. Additionally, at an hour time resolutions we don't need to worry too much about solar wind propagation issues. In a future manuscript we plan to study spatial data from individual stations and with a higher time resolution, which could improve the spatial biases due to non-uniform spatial sampling.
As a model reconstruction, we will use Neural Networks (NN) because they have demonstrated to be a powerful machine learning tool capable of performing complex tasks (Abiodun et al., 2018). NN have already been used to forecast the Dst index (Gruet et al., 2018;Lazzus et al., 2017) showing that it has a high self correlation with the immediate past measurements, therefore, the NN will include the Dst index at time t as part of the NN variables. The same will be done for the other magnetospheric indices that we are considering here.
A huge advantage of NN is their capability to build functions that can be highly nonlinear and that property is the one we want to exploit in the present study. The counterpart is that once they are trained, they are a black box which parameters remain meaningless to humans. We are therefore strongly limited to extract valuable physical information from what they could have "learned." Furthermore, it is very common that their predictability could be quite high in the training set through overfitting, but not as good in out-ofsample forecasting of other events. There are a number of interesting strategies that have been used to try to avoid these issues, for example the k-fold cross-validation (Zhelavskaya et al., 2017). Here we are taking an alternative, and complementary approach.
Hence, in the present manuscript we propose a method that quantifies the robustness of a solar wind variable as the degradation of the predictability of the trained model when such a variable is perturbed trying to forecast a different testing set of events. The most sensitive or robust variables, are those which produce the largest error when perturbed over the testing set.
In the present study we address the model construction in two ways. First we consider a short term prediction that uses the selected solar wind variables, and their delays, at a particular moment in order to predict the geomagnetic indices for the next hour, which we call one step forecasting. The second way is to use the one step prediction model to forecast the GI for a longer time interval, using the selected SWV, and their time delays, and reinserting the predicted geomagnetic indices in the model, which we will call iterated forecasting.
In order to simulate real-time forecasting, we use the iterated forecasting, since we note that some of these indices are not easily accessible in real time or are available only part of the time.

10.1029/2020SW002634
2 of 17 If successful, our approach would provide a strong indication that the selected robust inputs drive the signal and brings information about the physics of the system. This information could be used as a complementary approach to test, validate, and optimize forecasting models of the magnetospheric response to solar wind input, and in general of any driven complex system under study. For example, predictions of the Dst index are provided by many services like the Space Weather Center Prediction, or www.spaceweatherlive.com. Their short term accuracy could be tested, improved, and optimized by following our strategy to find robust modes of interaction. Additionally, since our strategy of constructing these forecasting models is quite different from the standard ones, it is always useful to provide an alternative forecast, specially during periods where real-time Dst may not be available. The same can be said about AL and AU, which are usually not easily available in real time.
It is important to mention that there are other methods, some of them based on information theory, that have been used to try to ascertain the dependencies and the directed/undirected couplings among the magnetospheric and solar wind variables. For example, (Runge et al., 2018) found, using an information theoretical method for input-output systems, that the solar wind Bz was a relevant driver of AL and SYM (a high resolution version of Dst). Similarly, Wing et al. (2016) applied an information transfer technique to find the solar wind variables that affect the outer radiation belt. As suggested by Balasis et al. (2013) (and references therein), entropy-based measures seem to be able to identify and quantify the linear and nonlinear inter-dependencies between different geophysical variables, variability at different scales, and other characteristics. Our method complements these strategies, as is able to discover and quantify the dependencies and the directed/undirected couplings among variables, but using the same methodology, neural nets in our case, that can then be applied to construct forecast based on the results. Of course, our strategy to find the robust variables can be used for any other linear and nonlinear models. Given the nonlinearity and complexity of these systems, sounds reasonable to use the same methodology that is used for forecasting, being linear or nonlinear, to ascertain the robust variables and couplings in a sort of self-consistent approach.

Data Description
The values of the Dst index are provided hourly from year 1957 to the present at the World Data Center for Geomagnetism in Kyoto (http://wdc.kugi.kyoto-u.ac.jp/dstdir/index.html [Masahito et al., 2015]). No value is missing since 1957, thus no signal reconstruction is needed. We will then use the hourly resolution in all the study. Data of interplanetary medium and the AU and AL indices where retrieved from the OMNI database provided by the National Aeronautic and Spatial Agency (ftp://spdf.gsfc.nasa.gov/pub/data/omni/ high_res_omni/ [King & Papitashvili, 2005]). The oldest available data in OMNI is from 1963 but with an average data rate lower than 20% which is too sporadic for our purposes. From year 1995 the data rate improved significantly thanks to the Wind and ACE satellites commissioning.
OMNI data are provided with a resolution of one minute but are not continuous. Since Dst is provided hourly, we choose to perform this study using a resolution of one hour, therefore OMNI data need to be transformed to 1 h sampling. Values are taken when the UTC minute is equal to zero and will be calculated as the average over all available data of the 60 following minutes. Even if only one value is available during this hour it will be taken as the value of the corresponding hour. In case no data is provided during this interval, they will be generated with by a linear interpolation from the available values. Being aware of Lockwood et al. (2018), we proceeded this way in order to balance these two conflicting ideas, namely to have a reasonable amount of data to train, test, and validate our models; or use extremely drastic filters that lead to insufficient data. The fact that we were able to construct a successful robust model, as described below, in spite of these restrictions, suggests that our choices were acceptable.
The SWV that we use from the OMNI database are the three interplanetary magnetic field (IMF) components given in the Geocentric Solar Equatorial (GSE) coordinate system (B x , B y , B z ), the particle flow speed (V), the proton number density (N), the plasma temperature (T), and the proton dynamical pressure (P). It is worth noticing that p is not a direct measurement but it is calculated from the proton speed and density by P = 2 × 10 −6 NV 2 (nPa). We can also construct composed variables that will be considered in the analysis. One is VB s , where B s is the negative component of Bz, so that it is zero if B z > 0 and B z if B z < 0. Sometimes, people prefer to write these expressions in the Geocentric Solar Magnetospheric (GSM) coordinate system, BLUNIER ET AL.

10.1029/2020SW002634
3 of 17 but for this analysis we stay within the GSE coordinate system as a preliminary analysis with solar wind variables in the GSM system produces the same 6 relevant solar wind variables.
Similarly, we define    2 4 2 0 sin ( / 2) A VB l , where B as the magnitude of the IMF, l 0 is seven times the Earth radii, and the clock angle θ is tan −1 (|B y /B z |) for B z > 0 and π − tan −1 (|B y /B z |) for B z < 0. When using variables in the GSM coordinate system, such formula would describe the Akasofu's index.
All the used interplanetary variable are summarized in Table 1 with their units and typical values in quiet and disturbed times. VBs, ɛ A , and P will be treated as independent variables. One might expect the NN to be able to find the relevant dependencies between the variables measure from the satellites, but this cannot be guaranteed.
Since we are interested in studying perturbations of the magnetosphere, we will concentrate on geomagnetic storms intervals where the Dst index reach a peak below −100 nT. We consider the storm is over the first time that Dst reach Dst end > −10 nT after the peak. The beginning of the storm is taken T b hours before the peak, such that T b is 20% of the time that separates the peak to the end of the storm. Between 1995 and 2018 we identify 97 geomagnetic storms that reach a peak below −100 nT.
In order to improve the robustness of the data we require that all the variables used during the time of the storm fulfill with the following criteria: (a) a maximum of five continuous hours with no data, and (b) at least 90% of data available during the storm interval. After applying those filters, we have a set of 64 storms between 1995 and 2018 that will be used for our study. It is important to mention that the set has a reasonable amount of data that contains quiet conditions so that they are represented in the model, but no an excessive amount that would "swamp" the optimization of the neural net coefficients.
In Figure 1 we show a cartoon of how we separated the training, testing, and validation sets. The training sets has more events (24) than the training (20) and validation sets (20). The same occurs about the number of data points, for example, the testing, training, and validation sets have 3,495, 3,576, and 2,518 entries, respectively. Our sets were selected in terms of number of storm events regardless their duration, and more events are used to train them than to test them. The information about each storm in the sets is provided BLUNIER ET AL. in the Appendix A1, A2, A3. We note that we have a small number of very strong storms (e.g., <−200 nT), which may cause a problem with testing, training, and validation. One possibility, that we plan to analyze in the future, is to renormalize the errors by considering the distribution of Dst values, which would give more relevance, in some sense, of the rarer Dst values.
Hourly sampled data are available in csv format at https://ccc.ciencias.uchile.cl/climaespacial/storms_non_ linear_data. Figures of each variable during each storm used in this study are also thoroughly provided.

Analysis
The 64 selected storms are split in training, testing, and validation sets as described in Figure 1. In order to train and test our NN we use the Keras library under python. All data are scaled with the MinMaxScaler from the python sklearn library.
For the training phase we use the default Keras binary cross entropy loss function defined as: where N out corresponds to the number of outputs of the NN, y j are the normalized data values, ˆj y are the normalized output predicted by the NN. This function is non-negative and converge to zero when ˆj y gets closer to y j . The minimization if performed with the Keras adam optimizer.
All our NN have three hidden layers of 100 neurons activated with a sigmoid function. This allows us to describe a reasonably complex evolution function, given by which gives us the value of the GI at t + 1 from information at previous times. From now on, the bar on top of a variable means it is generated by the NN model. Here G t corresponds to the magnetospheric vector where I n,t is the nth solar wind driver at time t. The nine solar wind drivers are described in Table 1 and we will study their capability to robustly drive G t . In our study we will take, at most, m i = m = 10 for all solar wind drivers, so that we will be able to check the influence of each solar wind driver on the magnetospheric variables up to 10 h in the past. Our models, that have the structure shown in Figure 2 as given by Equation 1, are one step ahead of time forecasters, in the sense that they use previous solar wind and magnetospheric data (up to time t) to produce a prediction of the magnetospheric variables one step forward in time (at t + 1). These NNs models are trained by iteratively changing their coefficients (that defines the model) that so that it lowers the Keras binary cross entropy loss function averaged over the whole training set. The loss function is evaluated over the predicted magnetospheric indexes. In order to reduce the over fitting, the process is stopped when we reach a minimum value of the Keras binary cross entropy loss function averaged over the testing set. Hence, it reduces the over fitting because the coefficients are changed in relation to the training set, and the process is stopped when it reaches a minimum when evaluated in the testing set.
One of the difficulties of training NNs with real data is the existence of multiple solutions that can be found by the algorithm. The minimizing process can stabilize around a bad local minimum, missing a more optimal solution. Therefore, we train 40 NNs with the same structure and variables, but with different seeds for their coefficients that may converge to different minima using the above procedure. We then keep the one with the lowest Keras binary cross entropy loss function averaged over the testing set.
In order to evaluate our models, we compare it to the toy model   that we call persistence model. For the testing and validation phases we compute the unitless mean absolute error (MAE) of a particular geomagnetic index X t , normalized by the error calculated in the persistence model, namely, BLUNIER ET AL.
10.1029/2020SW002634 5 of 17 The advantage of this measure is that it provides a direct comparison with the persistence model, but it also gives equal relevance to all the GI when we add the MAE for the three of them. Hence, it gives equal relevance to each storm independently of their magnitude and fluctuations. In addition, the optimization strategy we use in Keras requires selecting a random subsample of the training set in every optimization step to avoid falling into a set of coefficients that represent a particular time dependence pattern or storm.
In order to evaluate the contribution of a particular solar wind driver, with a particular time delay τ, we will add to it a Gaussian noise centered in zero and with an increasing standard deviation σ, such that the particular solar wind driver is now given by All the other variables are not perturbed. Here, σ will vary from 0% to 100% of the difference between the maximum and the minimum values obtained in the particular solar driver signal during the tested storm interval. If the NN is well trained, the noise introduced to this variable at the particular time delay is expected to degrade the prediction, so that, the higher the value of σ, the higher the value of ϵ MAE should be. This will be called the noised input method. This analysis is done to demonstrate that ϵ MAE grows with sigma, and that for the robust variables this growth is largest. To determine the robust variables, we will simply generate a surrogate of the solar wind variables by shuffling it. This should be equivalent to a σ = 1 noised input. This analysis is then repeated by perturbing another solar wind driver and/or time delay. We expect that the most robust solar wind drivers, at a particular time delay, are the ones that are most sensitive to the perturbation giving the largest error.
Here we report the normalized MAE as ϵ(σ)/ϵ(0) − 1 for each of the geomagnetic indices. Therefore, we start by training a NN that contains the last measurement at time t of the GI and only one solar wind driver I i,t using the above training procedure. We then perturb the same solar wind driver over the testing set, and observe its sensitivity with σ. We repeat the procedure for the other solar wind drivers, as shown in Figures 3a-3c for the storm of March 2001 that reached a peak value of −149 nT for the Dst index. VB s has already proven to be a good driver for Dst (KAMIDE, 1992), a empirical expression linking the Akasofu's parameter with the AE index has been proposed by Akasofu (1981). In this step of the study, BLUNIER ET AL.
10.1029/2020SW002634 6 of 17 Figure 2. Input/Output variables used to train and test the NN in the case of a single solar wind driver (n = 1). The general case with multiple solar wind drivers follows the same pattern. Note that the magnetospheric variables in the evolution model are considered, for simplicity, only at the previous time t. This will be generalized in a future work. combination of solar wind variables is not considered in the entry of the NN, thus VB s , ɛ A , and P are treated as independent variables.
For each trained NN that corresponds to a particular solar wind driver, the testing phase is repeated 50 times for each σ and we keep the average of the obtained error. Since the error is centered, all the curves begin at 0 for σ = 0. The solar wind driver is associated with one error per geomagnetic index. With this procedure, we expect to gain information on the correlation of the disturbed solar wind driver and its capability to predict G t .
We note that B z and VB s are consistently the most sensitive variables for the prediction of these 3 GI, which we will denote them the robust solar wind variables. For example, when the perturbations is of the size of the signal for Dst, meaning with σ = 1 the error is multiplied by nine in the case of B z and above six for VB s , while the other drivers show a much lower perturbation when the noise increases. For the AL index, the effect is lower but still present, and B z and VB s remain by far the most sensitive solar wind drivers. Although in the case of AU the error does not reach more than 100% from the clean input for any of the solar wind drivers, still B z and VB s show up as relevant. Of course, for AU other variables could also be relevant such as density. Hence, we obtain an error for each geomagnetic index given a value of σ that informs us about the global robustness of the solar wind driver in the NN, and therefore, about the global robustness of this solar wind driver to predict G t . Let us note that the strategy we are using to determine the robust solar wind variables may at first sight seem counter intuitive, since we are not trying to minimize an error, but to maximize it. But after subsequent consideration, we hope it becomes more clear.  In order to provide a more general conclusion we repeat the experiment 50 times with σ = 1 for each storm. We show the average error over all the storms for each geomagnetic index and solar wind driver in Figure 4. For a given magnetospheric index and solar wind driver we have a bin that represents how perturbed is the error when the input is noised, therefore, telling us about its contribution to the forecast. We also give an error range in order to observe how much it varies among the 50 surrogate tests. If this bar is small in comparison to the average height, the contribution of the entry is considered significant. If the bar is compatible with 0, we can conclude that the apparent contribution of the driver could come from a statistical fluctuation and is globally not contributing to the prediction. A global overview shows that the Dst index is the most sensitive to the solar wind drivers since the error is clearly higher than AU or AL. AU is not convincing in its sensitivity to the solar wind drivers, only B z has a result not compatible with zero. From the point of view of the solar wind variables, we see again that B z and VB s are picked up by strategy as the most relevant solar wind variables to forecast the geomagnetic indices. The B x and ɛ A variables do not bring significant information to the prediction while the error produced by B y , V, N, T, and P is quite reduced compared with the first two variables B z and VB s . From now on, instead of adding an error to the signal we randomize the order of the solar wind driver time series for a particular storm (a surrogate). Using this randomization input method, we can look deeper inside the solar wind drivers. We are not only interested in identifying the solar wind variable that could contribute to predict the geomagnetic indices, but also at which time t − τ they should be taken to make a robust prediction. In order to find the most relevant solar wind components at a particular time delay I i,t−τ that affect the GI, we test the selected NN for each SWV at a particular time delay and repeat the surrogate randomizing procedure outlined above only to the entry corresponding specifically to the time delay of the SWV we are testing, while the other entries are left unchanged from the original signal.
Again, we expect that the most relevant components are the ones that should degrade the prediction the most. This procedure will produce one global associated error ϵ s for each solar wind driver, at each time delay, of each geomagnetic index and each storm. In order to evaluate each entry, the error is normalized by the error obtained when the entry is not perturbed and we subtract 1 to this ratio. Hence, it will give us clues about which solar wind drivers and delays should be used to construct a robust NN prediction model.
In Figures 5a-5c, we show the variation of the normalized and centered MAE produced by each component of each solar wind driver up to 10 h before the last available measurement for each geomagnetic index. We note that only a few solar wind variables, at particular time delays, are robust variables. Figure 5d displays the sum of the MAE for the three geomagnetic indices, providing a global descriptor of the robust solar wind drivers for this coupled system. The sum done is over the normalized MAE, therefore, in some sense BLUNIER ET AL.  we give equal footing to the three variables. Of course in the future we could consider to use other norms and some re-scaled measures, which could improve the forecasts. We consider that a component contributes to the prediction (i.e., is robust) when it reaches 10% of the maximum value obtained in the matrix. For Dst, B z is contributing up to 3 h before the last measurement while VB s only 2 h. Some delayed components of the speed signal also seem to bring a significant contribution to the prediction. For AU, it looks that many components are turned on given the very low maximum, however, we can highlight the most important components being the immediate measurements of B z , V, and VB s . Finally, AL has five components turned on, where again the last value of B z and VB s seem to be the best contributors. The global sum of the errors, shown in Figure 5d, highlight 6 SW values which will be considered for our final robust model, that has nine entries. They are the three previous values of B z , the last value of V, and the two last values of VB s , in addition to the last values of Dst, AL, and AU.
Once we have determined which are the most relevant solar wind drivers, at particular time delays, that drive the coupled geomagnetic indices, we train a robust NN replacing I t by a vector containing those 6 components. Hence, we now retrain our NN with the global entry containing the same geomagnetic indices but with the six outstanding solar wind inputs of Figure 5d. In parallel we build a multilinear regression BLUNIER ET AL. model using the same entries for comparison. This is done with the LinearRegression under the sklearn python library. This model will then take 27 parameters, meaning 6 SWVs plus 3 GI for each of the three output variables.
As a matter of testing, we also want to use iterated predictions for the NN and linear model, meaning that they have to use the geomagnetic values they predicted t G in the previous time step to produce the next predicted value, namely, B B B V VB VB I corresponding to the subset of 6 robust variables that are used as drivers of the neural net. Such approach may be useful for real time forecasts when only solar wind variables are available. One would naturally expect that the iterated predictions are less accurate than the one step predictions, that uses previously measured G t values to drive the neural net. However, it becomes relevant to compare these results with an equivalent model that does not consider the robustness of the variables. Therefore, as a way to compare, we construct NN and linear models using the 11 last measurements of the best driver B z (left column of Figure 5d), in other words, Equations 1 and 3 run using       1 10 ( , , , ) . Note that the first time delay of each 3 GI is always present in the entry, G t for one step forecasts and t G for iterated prediction. In Figure 6a we plot the results over the validation set for the globally robust NN, while in Figure 6b we show the results of the NN and linear models that use only the B z driver at all-time delays and the 3 GI (11 + 3 inputs to the neural net). Each bin represents the normalized-by-persistence MAE (see Equation 2) for each geomagnetic index with a deviation bar for the 20 storms of the validation set.
In the case of one step forecasting of the robust model, although existent, the improvement does not look significantly different for both linear and NN models, but when we do iterated predictions the results becomes much more interesting. The prediction of Dst by the globally robust NN model are 38% (14.3% for AL, and 17.0% for AU) better than the linear model.
When comparing both robust and B z -based nonlinear NN models, we find that for Dst and AU the iterated globally robust forecast show a global improvement of 12.7% and 8.7%, respectively, as compared with the B z -based model. In the case of AL we obtain no improvement. Note that we have used six solar wind entries for the robust model, compared with the 11 solar wind entries of the B z -based model. If we reduce the B zbased model to only just the first six time delays, the difference is clearer with respect to the robust model. Figure 7 shows the one step and iterated globally robust model forecasting during the storm of May 2013 that reached a Dst minimum of −113 nT. In the left panel we show the predictions for the globally robust NN and linear models and in the right panel we have results corresponding to the B z -based models. The given numbers in the legend correspond to the bare MAE without normalization, and the errors of the different models (and one step vs. iterated) normalized to the persistence. The one step predictions show improvements from the B z based model, for example Dst one step forecasting is improved by 23.8% when comparing the NN models. When looking at the iterated prediction, the globally robust model improves the Dst iterated prediction by 10.5% in this particular case. On the other hand, we highlight the significant improvement of 40.1% in error between the linear and NN robust models in this storm.

Conclusions
The proposed approach is founded in three points that began to be analyzed in this paper, but of course there remain a number of issues and possibilities that will be studied in future work.
First, we designed a successful strategy, as demonstrated for example in improved forecasts, to find the robust variables that reduce overfitting and determine the evolution of a system. This method is based on the same modeling representation used to produce the forecasts in order to keep self consistency in the model representation of the system. This is useful for the magnetosphere, as we realize that it is a high dimensional driven dissipative complex system, and any lower dimensional model representation should avoid non-relevant variables. The approach can be used to improve linear and nonlinear models, as shown in the BLUNIER ET AL.  b) the model using the 11 last hours of B z . We compare the NN versus a linear model using the same entries for one step and iterated predictions. The results are normalized with the error of the persistence. Hence, the iterated forecasts using the nonlinear global robust model is an improvement from the equivalent model that uses only B z variables up to 10 time delays, despite the fact that the later have a larger number of SW inputs (11 vs. 6). Remember that all models (linear and NN) include as additional input variables the 3 GI variables at the previous time (e.g., Dst t , AL t , and AU t ), as described by Equation 1.
The robust model was able to reduce the forecasting errors compared with linear and nonlinear models that consider some non-robust variables.
And, third, the approach can be used as a method of system science discovery to help to determine the relevant drivers and couplings of variables representing different systems and subsystems. For example, it becomes of interest to quantitatively determine, in a robust fashion, the contribution of a particular variable to the evolution of the system or subsystem considered. In this case we note that only a few solar wind variables, with particular time dependencies, affect the evolution of the GI in a robust manner. This suggests a future course of study to try to understand the physical foundations of such solar wind-magnetospheric couplings.
We now expand on some additional details that may contribute to the discussion: We have considered a coupled model of the GI as suggested by previous work (Akasofu, 2020;Runge et al., 2018), and the proposal of Borovsky and Valdivia (2018) that the magnetosphere is a complex system, with a number of coupled subsystems. Hence, it makes sense to look for models that involve more than one region (different indexes). For simplicity, the interaction is taken through their values at the previous time, leaving the consideration of robust couplings by additional time delays of the GI for a future manuscript. Therefore, our strategy could be used to extract information about the storm-substorm interaction, with or without the consideration of the solar wind variables, a work that is already in progress.
Our strategy has picked the most robust solar wind parameters, as additional terms do not provide a significant improvement in the error over this robust, and reduced, model. In addition, this robust model performs better than a model with more variables but that is constructed only with Bz at different delay times. It is interesting to note that in the robust model we see V(t), Bz(t), and VB s (t), but no other combinations (e.g., ɛ A does not appear). It is true, that this could happen because they share some information, although they are not exactly the same. In the future, we could think of strategies to orthogonalize the input variables, but in the NN nonlinear sense, so that we "minimize" the inclusion of partially repeated information in the input variables.
BLUNIER ET AL.  In essence, we have studied the correlation between geomagnetic indices (Dst, AU, and AL) and interplanetary solar wind variables at the L1 point of the Sun-Earth system through 64 geomagnetic storms, for which we have simultaneous data, that occurred since 1995. The method we used is based on training Neural Networks and look at their capability to predict the evolution of the magnetosphere in storm periods. We stressed the entries of the trained Neural Networks in order to evaluate their robustness of the solar wind variables, at particular time delays, in the prediction of the geomagnetic indices. The magnetic z-component of the interplanetary magnetic field and the duskward oriented component of the electric field VB z appeared to be the most robust drivers for the prediction since the addition of a noise to them shown a significant degradation in their capacity to drive the geomagnetic indices. By a similar method we determined which of the nine solar wind variables, and particular time delays, we must consider in this analysis to give the best predictions of the geomagnetic indices. The entries with a time delay above three hours from the last measurement is found to be significantly less relevant to forecast the GI, as compared with variables at smaller time delays. This result may become important when considering the memory of the magnetosphere in processing the solar wind energy, matter, and so on.
The pressure, the ɛ A parameter, the temperature, the x and y components of the magnetic field, and the particle density of the interplanetary medium do not seem to bring significant contribution to the prediction at this level of approximation.
Finally, we built a linear and nonlinear models based on the robust solar wind variables in their capability to forecast. We show that over a sample of 20 storms, the forecasting of Dst is improved by 12.7% from a Neural Network based only on the interplanetary z-component of the magnetic field that consider 11 time delays. Neural network is 38% better to predict Dst than linear models which emphasizes the highly nonlinear behavior of the magnetosphere. Hence, the robust model, with only its 6 solar wind drivers, provides an improvement in the forecast of this simplified SWMI system representation. Of course, we could consider including robust magnetospheric indices at various time delays, as variables for the NN, a work that we plan to conduct in a future publication.
Future forecasting models of GI should strongly consider the highlighted variables and time delays as input of their models. This is in particular true for Dst forecasting which is widely used to describe the state of the magnetosphere. Of course, additional information is provided about AL and AU, which are in general hard to obtain in real time. A similar method can be used to highlight the driving variables of other geomagnetic indices, like K p or A p , used in web services in order to improve the reliability of the forecasts. Those results can be used to construct step-by-step a multivariate, robust system science, description of the magnetosphere evolution.
Moreover, variables that show significant contribution at several time delays can be interpreted as orders in differential equation that can drive the GI. Therefore, it could improve existing minimal system-science mathematical descriptions of the magnetosphere behavior at the 1 h time-scale. The dependency of input/ output variables with different time delays, suggest that it should be possible to re-interpret these equations in terms of differential equation of different order. Therefore, it may provide a hint on how to improve existing minimal system-science mathematical descriptions of the magnetosphere behavior. For instance a number of researchers have proposed improvements to Burton et al. (1975), physically based model for the Dst evolution using solar wind measurements (Fenrich & Luhmann, 1998;Valdivia et al., 1996). The present robust model suggests other variables to include in these analysis, that could provide additional information about the ring decay dependency on the solar wind and magnetospheric variables, memory, relevant drivers, and even suggest higher order differential equations forms for it. Their accuracy still need to be improved and including derivative orders in terms of the relevant parameters we found could be a clue for a better mathematical and physical description.
In terms of computational time, the code takes 3 days to train all the neural nets required for the paper in a single multicore computer, however it can be parallelized extensively which would reduce this time considerably.
Therefore, these type of techniques can be used to find the relevant drivers in other non-linear systems, where including too many variables in the NNs may be dangerous if we intend to produce robust forecasts. We plan to introduce other magnetospheric indices in a larger version of the SWMI system representation.