Composite Model for Predicting SYM‐H Index

Predicting the SYM‐H index is important in space weather because the index quantifies the degree of perturbation of the geomagnetic field during storms. This study presents a composite model that predicts the SYM‐H index from solar wind parameters by combining the empirical magnetospheric dynamical equation with neural networks. The formula for the predicted SYM‐H originates from the well‐known empirical relationship between interplanetary conditions and the Dst index. In particular, the coefficients in the empirical relationship are determined by neural networks, which excel at approximating the function linking the coefficients to the solar wind parameters. The 1‐ and 2‐hr forecasts of SYM‐H during storm time are reliable, and in some cases the precision is even better than that of the latest models based solely on deep neural networks. Based on the composite model, the dependence of the loss time and injection rate of ring current energy on the solar wind parameters and SYM‐H is investigated.


Earth and Space Science
JI ET AL.

10.1029/2022EA002560
the relationship between storms and substorms. The electric field leading to the injection of particles from the magnetotail to the ring current is also assumed to result in a westward auroral electrojet. Hence, the substorm AL index was used to construct an injection function in the dynamical equation of the Dst index. To handle the nonlinear processes in the magnetosphere, Zhu et al. (2006) used the NARMAX (Nonlinear Autoregressive Moving Average Model with eXogenous input) approach to develop a model for reliable Dst forecasting and for comprehending the physics of the underlying dynamics.
Along with the development of computer technology and the emergence of big data, machine learning has entered a golden age with numerous breakthroughs, which benefits space weather prediction (Camporeale, 2019). Machine learning technologies such as neural networks originated in the 1940s (McCulloch & Pitts, 1943), flourished in the 1990s, and were applied to forecasting geomagnetic indices (Camporeale, 2019). Wu and Lundstedt (1997) employed partially recurrent neural networks (RNNs) to investigate the solar wind-magnetosphere interaction and identified the optimal coupling functions. Lundstedt et al. (2002) also used an RNN to construct a 1-hr prediction model for the Dst time series, focusing on simplicity and ease of implementation. However, these models face the problem that their inputs, especially the plasma parameters of the solar wind impinging on the magnetosphere, are not always available during intense storms (Camporeale, 2019). To avoid this problem as far as possible, only interplanetary magnetic field (IMF) measurements are used as network inputs to obtain predictions (Pallocchia et al., 2006). Although many efforts have been made to predict Dst, the forecast of the SYM-H index has not received adequate attention. Cai et al. (2010) were the first to use a NARX neural network to predict SYM-H from solar wind and IMF parameters. Compared to an Elman network, the NARX model introduces a critical feedback from the output neuron to the input layer, which adds ring current status information as a factor in predicting SYM-H. Using NARX to predict seven storms with WIND satellite data, Cai et al. (2010) found that the averaged correlation coefficient approached 0.91 and the root mean square error (RMSE) was about 14.2 nT. Recently, there have been numerous breakthroughs in deep learning technologies that simulate the behavior of the human brain, mainly in processing images and words (LeCun, 2015). Deep learning neural networks, which mimic the human brain by combining data inputs, weights, and biases, have also been introduced to forecast geomagnetic indices. Siciliano et al. (2021) employed long short-term memory (LSTM) and a convolutional neural network (CNN) to forecast SYM-H and compared the performance of the two networks. They found that both networks can provide good predictions of SYM-H 1 hr in advance, and that once SYM-H was also set as an input parameter, the model based on LSTM performed slightly better than the model based on CNN. Collado-Villaverde et al. (2021) constructed deep neural networks (DNNs) consisting of LSTM layers, CNN layers, and a multilayer perceptron to forecast the SYM-H and ASY-H indices. The DNN model is more precise than models using LSTM or CNN alone. However, models based on neural networks also have disadvantages. Neural networks extract rules from data, but observations of supermagnetic storms are too rare to yield robust rules. In addition, neural networks are unexplainable black-box models; thus, the physics behind the interaction between the solar wind and the magnetosphere cannot be well understood.
In conclusion, it is valuable to forecast SYM-H accurately, but both the empirical equation method and the neural network method have advantages and disadvantages. The empirical equation satisfies the conservation of energy transported in the magnetosphere, but it cannot handle nonlinear complex processes. On the other hand, a neural network can approximate any complex functional relationship; however, this approximation is unexplainable and hard to extrapolate. This work aims to combine the advantages of the two methods to construct a theoretical SYM-H prediction model and to discuss the effect of the solar wind on the model parameters.

Model
The composite model builds on the modeling framework put forward by Burton et al. (1975), which is introduced first here and called the BMR model. The BMR model considers the following facts to construct a model of the Dst index:
1. The magnetopause current and the ring current both contribute to Dst.
2. The magnetopause current can be related to the solar wind dynamic pressure because of the balance between the momentum carried by the solar wind and the Lorentz force at the magnetopause.
3. The ring current is enhanced by energy injection from the magnetotail, which depends on the IMF, and is weakened by losses due to charge exchange reactions of ring current ions with low-energy neutral hydrogen and oxygen atoms escaping from the top of the atmosphere.
Combining these constraints, Burton et al. (1975) constructed the following equation for Dst:
$$\frac{d\,\mathrm{Dst}_0}{dt} = F(E) - a\,\mathrm{Dst}_0, \qquad \mathrm{Dst}_0 = \mathrm{Dst} - b\,P^{1/2} + c \tag{1}$$
where b is the measure of the response to dynamic changes of the solar wind, P is the solar wind dynamical pressure, F is the injection energy function, a is the loss rate of the ring current, and c is the measure of the quiet ring current. Then, Burton et al. (1975) discussed the determination of the parameters one by one and obtained the final prescription:
$$F(E) = 0,\quad E_y < 0.5\ \mathrm{mV/m}; \qquad F(E) = d\,(E_y - 0.5),\quad E_y > 0.5\ \mathrm{mV/m} \tag{2}$$
where E_y is the y component of the solar wind electric field and d is the injection coefficient. The above coefficients are either constants or linear functions, reflecting an assumption that the magnetosphere reacts passively to the solar wind. However, this is not physically accurate: many studies (Consolini, 2018) have shown that the interaction between the solar wind and the magnetosphere is active and nonlinear. Although the BMR model cannot deal with the complex processes in the magnetosphere, it does capture the fundamental physics of the solar wind-magnetosphere interaction, that is, energy conservation. This advantage inspires us to develop a more practical model based on the BMR model. We focus on predicting SYM-H, which is defined similarly to Dst but has a higher time resolution (1 min) than Dst (1 hr).
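The BMR framework can be illustrated with a minimal numerical sketch. It assumes the standard form of the Burton et al. (1975) equation, with the loss rate a = 3.6 × 10 −5 s −1 and the injection coefficient d = −1.5 × 10 −3 nT per mV/m per second quoted later in this paper. The function names and the forward Euler time stepping are illustrative assumptions, not part of the original model description.

```python
def burton_injection(e_y, d=-1.5e-3, threshold=0.5):
    """Injection F(E) in nT/s for a dawn-dusk electric field e_y in mV/m.

    d and the 0.5 mV/m threshold follow the values quoted in the text.
    """
    return d * (e_y - threshold) if e_y > threshold else 0.0


def step_dst0(dst0, e_y, dt, a=3.6e-5):
    """Advance the pressure-corrected index Dst0 (nT) by dt seconds.

    Forward Euler is an illustrative choice of integrator (assumption).
    """
    return dst0 + dt * (burton_injection(e_y) - a * dst0)


# One hour of steady 5 mV/m driving, starting from a quiet ring current.
dst0 = 0.0
for _ in range(60):
    dst0 = step_dst0(dst0, e_y=5.0, dt=60.0)
```

With constant coefficients, the response to a given electric field is fixed in advance, which is the limitation the composite model is designed to relax.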
The core idea is to apply neural networks to make up for the disadvantage of the BMR model. Specifically, the strong function-fitting ability of neural networks is used to characterize the complex interactions between the solar wind and the magnetosphere. Hence, we define this approach as a composite model in this study. A sketch of the composite model is shown in Figure 1. The prerequisite is knowledge of the historical SYM-H time series and of the solar wind conditions N min in advance, and the objective is to predict SYM-H at time i + N, where i denotes the current time and N is in minutes. The future solar wind data used here can be obtained by time shifting upstream solar wind data (Weimer & King, 2008). As SYM-H is a proxy of the ring current energy, such a prediction can be realized by considering the magnetospheric energy processes as the BMR model did. The prediction algorithm is divided into three steps:
1. Subtracting the component $b P_i^{1/2}$ contributed by the magnetopause current from SYM-H at time i. We then obtain $\mathrm{SYM\text{-}H}^*_i = \mathrm{SYM\text{-}H}_i - b P_i^{1/2}$, which is contributed only by the ring current, where the superscript * is used to distinguish it from SYM-H, b has the same meaning as in the BMR model, and $P_i^{1/2}$ is the square root of the solar wind dynamical pressure. The subscript i denotes time i.
2. Considering the loss effect of SYM-H* by subtracting $a_1 \mathrm{SYM\text{-}H}^*_i$.
3. Considering the energy injection and the drifting loss over the prediction window through the terms $c \sum_{j=1}^{N} (V_x B_z)_{i+j}$ and $a_2 \sum_{j=1}^{N} (V_x B_z)_{i+j}$. Note that we only consider southward B z; if B z is northward, V x B z is set to zero.
Finally, SYM-H is predicted 1 or 2 hr into the future by Equation 5. In comparison with the empirical model, the advantage of the composite model is that the parameters b, a 1, a 2, and c are not constants or simple functions of the solar wind or the index but complex functions of the solar wind and the historical index, modeled by the neural networks shown in Figure 1. The neural networks are built from fully connected layers and sigmoid activation functions, and the details are as follows:
1. The neural network for determining b is composed of an input layer, one hidden layer, and an output layer. The input layer contains two feature units, which take the solar wind electric field in the y direction (V x B z) and the square root of the solar wind dynamical pressure P 1/2 as input data. The hidden layer contains four feature units. The output layer has one unit, and b equals exp(−2 × output).
2. The neural network for determining the loss parameters a 1 and a 2 is composed of four layers, that is, an input layer, two hidden layers, and an output layer. The input layer has 2 × N feature units that are the concatenation of the SYM-H and P 1/2 signals from i − N to i. There are 32 feature units and 32 activation units for the two hidden layers. The output layer has two activation units, and a 1 and a 2 equal exp(−10 × output). Note that we add the output of the first hidden layer to the input of the output layer, so that the second hidden layer only characterizes the residual between the first hidden layer and the output layer. This operation is inspired by ResNet, which is commonly used in deep learning (He et al., 2016), and can improve the performance of the network when more hidden layers are added.
3. For the neural network determining the injection parameter c, the structure is similar to that of the network determining a 1 and a 2, but the input features are V x B z and P 1/2, and there is one output unit.
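The b network and the three-step update described above can be sketched in plain Python. Everything here is a hedged illustration: the random weights stand in for trained ones, the sigmoid on the output unit is an assumption, and `predict_symh` uses an assumed form of Equation 5 in which the magnetopause contribution is restored with the pressure at the target time. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer followed by a sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
            for row, bias in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Placeholder weights; a trained model would supply these (assumption)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def coefficient_b(vx_bz, sqrt_p, hidden_layer, output_layer):
    """2 -> 4 -> 1 network; b = exp(-2 * output) as stated in the text."""
    hidden = dense([vx_bz, sqrt_p], *hidden_layer)
    (output,) = dense(hidden, *output_layer)
    return math.exp(-2.0 * output)


def southward_vx_bz(vx, bz):
    """V_x B_z is kept only for southward IMF (bz < 0), otherwise zero."""
    return vx * bz if bz < 0 else 0.0


def predict_symh(symh_i, sqrt_p_i, sqrt_p_future, vx_bz_window, b, a1, a2, c):
    """One step from time i to i + N (assumed form of Equation 5)."""
    symh_star = symh_i - b * sqrt_p_i        # step 1: remove magnetopause part
    symh_star -= a1 * symh_star              # step 2: ring current loss
    drive = sum(vx_bz_window)                # step 3: V_x B_z summed over N minutes
    symh_star += (c + a2) * drive            # injection term plus drifting loss term
    return symh_star + b * sqrt_p_future     # restore magnetopause part (assumption)


rng = random.Random(0)
hidden_layer = random_layer(2, 4, rng)
output_layer = random_layer(4, 1, rng)
b = coefficient_b(2.0, 1.4, hidden_layer, output_layer)
```

Because the output unit lies between 0 and 1 under the sigmoid assumption, the exponential mapping keeps b positive and bounded between exp(−2) and 1.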
Thus, adhering to the fundamental framework of energy conservation and taking advantage of neural networks, we construct a more practical model to predict SYM-H. Please note that the composite model introduced here is not optimal. Zhang et al. (2022) found that the offset time of the input parameters for the EMD-LSTM model affects the performance of the prediction of >2 MeV electron flux. It can be expected that the offset time of SYM-H for the neural network determining a 1 and a 2 is also a key parameter affecting the performance of the model. The optimization of the model parameters, including the offset time and the number of neural network layers, will be studied in the future.

Data Sets
OMNI data, including synchronized magnetic field, particle density, and velocity measurements for 2000-2019, are downloaded from the OMNIWeb service of NASA/GSFC's Space Physics Data Facility (http://omniweb.gsfc.nasa.gov). The indices and solar wind parameters are retrieved at 1-min intervals. The IMF and plasma parameters are recorded at or around the Earth-Sun L1 location by instruments aboard the ACE, Geotail, IMP-8, or Wind spacecraft and then shifted to the Earth's bow shock nose for studying the solar wind-magnetosphere coupling. Notably, the availability of solar wind plasma data is limited. To solve this problem, linear interpolation between good data points is used to recover bad data. These data cover all phases of the solar cycle, from which the training, validation, and test data sets are selected with reference to prior studies (Collado-Villaverde et al., 2021; Siciliano et al., 2021), as shown in Table 1.
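The gap-filling step can be sketched as follows. This is a minimal illustration that assumes bad samples have already been marked as NaN on a regular 1-min grid; the function name and the choice to leave leading and trailing gaps unfilled are assumptions, not details given in the paper.

```python
import math


def fill_gaps_linear(values):
    """Linearly interpolate NaN gaps between the nearest good neighbours.

    Leading or trailing gaps have only one good neighbour and stay NaN.
    """
    filled = list(values)
    good = [k for k, v in enumerate(filled) if not math.isnan(v)]
    for left, right in zip(good, good[1:]):
        span = right - left
        for k in range(left + 1, right):
            weight = (k - left) / span
            filled[k] = (1.0 - weight) * filled[left] + weight * filled[right]
    return filled


nan = float("nan")
pressure = fill_gaps_linear([1.0, nan, nan, 4.0, nan])
```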

Training Method
The prediction algorithm introduced in the above section is implemented in PyTorch, an optimized tensor library for deep learning on GPUs and CPUs that is widely used in the deep learning community. The network is then trained on the training data set in Table 1. During training, Adam (Kingma & Ba, 2014), an algorithm for first-order gradient-based optimization of stochastic objective functions, is used to minimize the loss function, defined as the RMSE between the observed and predicted SYM-H. Owing to the simple structure of our network, the training can be finished in several minutes. The learning rate for each iteration step is set to 0.0001.
To avoid overfitting (a common problem in deep learning), we stop training when the loss on the validation data set ceases to decrease.
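The stopping rule can be sketched independently of any deep learning library. The paper only states that training stops when the validation loss ceases to decrease, so the `patience` and `min_delta` parameters below are illustrative assumptions, as is the function name.

```python
def early_stopping_epoch(val_losses, patience=1, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value by more than `min_delta` for `patience` consecutive epochs.
    If that never happens, the last epoch is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses) - 1


stop = early_stopping_epoch([9.0, 8.0, 7.5, 7.6, 7.7])
```

A larger patience tolerates short plateaus in the validation loss at the cost of a few extra epochs.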

Metrics
In this study, we employ the RMSE and the coefficient of determination R 2 to evaluate the performance of the composite model. Both metrics were recommended by Liemohn et al. (2018) and have been applied in many studies (Camporeale, 2019; Yang et al., 2018) to evaluate geomagnetic index prediction models. Hence, employing the RMSE and R 2 makes it easy to compare the performance of the composite model with other neural networks. The RMSE is defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2}$$
where $y_j$ are the real values of the index and $\hat{y}_j$ are the predicted values. The subscript j denotes the jth sample, and n is the total number of samples. The RMSE is non-negative; it characterizes the distance between the real signal and the prediction at the same time, so smaller values indicate better predictions.
R 2 is a dimensionless metric defined as follows:
$$R^2 = 1 - \frac{\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2}{\sum_{j=1}^{n}\left(y_j - \bar{y}\right)^2}$$
where $\bar{y}$ is the mean value of all samples. For a useful model, R 2 lies between 0 and 1: if the prediction equals the mean value, R 2 is 0, and if the prediction equals the real value, R 2 is 1. R 2 becomes negative only when the prediction is worse than the sample mean. Note that the RMSE is an absolute error with dimension, whereas R 2 is a relative, dimensionless measure. Only when both metrics are used together can the model's performance be evaluated completely and fairly.
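Both metrics follow directly from their definitions. The sketch below is a plain Python illustration with hypothetical function names and made-up sample values; it is not taken from the authors' code.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Illustrative SYM-H samples in nT (made-up values).
observed = [-10.0, -50.0, -120.0, -60.0]
mean_only = [sum(observed) / len(observed)] * len(observed)
```

Predicting the sample mean everywhere gives R 2 = 0, a perfect prediction gives R 2 = 1, and a prediction worse than the mean gives a negative value.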

Prediction
The metrics used to evaluate the performance of the composite model for all test storms are summarized in Tables 2 and 3, which also include the metrics generated by pure neural network models (Collado-Villaverde et al., 2021; Siciliano et al., 2021). Table 3 shows the RMSE and R 2 of different models for the 2-hr prediction. Notably, Siciliano et al. (2021) did not issue a 2-hr forecast of SYM-H, so those entries are replaced by the baseline here. The baseline uses the SYM-H value from 2 hr earlier as the current prediction and serves only as a reference.
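The baseline described above is a persistence forecast. A minimal sketch, assuming a gap-free 1-min SYM-H series so that a 2-hr lead corresponds to 120 samples; the function name is hypothetical.

```python
def persistence_forecast(symh, lead_minutes=120):
    """Pair each observation with the value `lead_minutes` samples earlier.

    `symh` is assumed to be a gap-free 1-min series, so one sample is one
    minute. Returns (observed, predicted) lists of equal length.
    """
    if lead_minutes <= 0 or lead_minutes >= len(symh):
        raise ValueError("lead time must be positive and shorter than the series")
    observed = list(symh[lead_minutes:])
    predicted = list(symh[:-lead_minutes])
    return observed, predicted


obs, pred = persistence_forecast([0, -1, -2, -3, -4], lead_minutes=2)
```

The two returned lists can be scored with the same RMSE and R 2 metrics as any other model.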
The RMSEs for the 2-hr prediction are larger than those for the 1-hr prediction, ranging from about 1 to 5 nT. (King & Papitashvili, 2006).
Figure 3 displays the results for the 2-hr prediction. Clearly, the 2-hr prediction is less accurate than the 1-hr prediction. This is expected because, as a nonlinear system, the magnetosphere evolves with an emergence of chaos that is difficult to forecast. The obvious discrepancy between the observation and the prediction occurs in the main phase of the storm, especially for large storm No. the energy injection should be considered. The fact that the composite model performs better during quiet times than during storm times indicates that the energy process in the magnetosphere obeys a simple mode for the stable solar wind, but a more complex mode appears when the solar wind is disturbed and the IMF is southward.
To further investigate the discrepancy between the observed and predicted SYM-H, scatter plots of the observations and predictions for storms No. 26 (black circle), No. 27 (magenta square), No. 29 (green diamond), and No. 38 (blue angle) are shown in Figure 4. The red line denotes x = y. For SYM-H larger than zero, the scattered points, most of which belong to storm No. 29, show that the composite model does not perform well for this storm. For SYM-H between −100 and 0 nT, all points cluster tightly around the red line, implying that the composite model gives good predictions in this interval. However, when SYM-H is less than −100 nT, the predicted SYM-H tends to be larger than the observed SYM-H, and this trend becomes more obvious as SYM-H continues to decrease to −300 nT. This is an interesting phenomenon, because many prior studies show the same deviation, with the predicted SYM-H larger than the observed SYM-H for intense storms (Cai et al., 2010; Siciliano et al., 2021). A physical reason lies behind this deviation, which is common to most SYM-H models and remains to be explained. When SYM-H is less than −300 nT, most points belong to storm No. 29 and do not show a systematic error. 5, the b will change little, which reflects that the reaction between the magnetopause current and the solar wind is not simple.
The decay process of SYM-H is divided into two parts in the composite model. The first part is related to a 1, which denotes the decay time of the symmetric part of the ring current. The second part is related to a 2, which characterizes the decay time of the newly injected ring current. Panels e in Figures 5 and 6 show the variation of a 1 /N during storms No. 26 and No. 29. In Equation 5, a 1 is not normalized. To ease comparison with the BMR model, a 1 is divided by N to convert its unit to min −1. After normalization, a 1 /N for N = 60 and N = 120 show good agreement with each other, implying that the decay of the symmetric part of the ring current is linear at least within 2 hr. In quiet time, a 1 /N is about 0.75 × 10 −3 min −1, that is, 1.25 × 10 −5 s −1, much less than the 3.6 × 10 −5 s −1 given by Burton et al. (1975). During storms, a 1 /N is positively correlated with the intensity of the storm and reaches about 1.5 × 10 −3 min −1 (2.5 × 10 −5 s −1) for storm No. 26 and 3 × 10 −3 min −1 (5 × 10 −5 s −1) for storm No. 29, comparable with the decay parameter of Burton et al. (1975). Note that the inverse of a 1 /N is the loss time of ring current particles. The decrease of the loss time during active storms arises because the main energy carriers of the ring current change from protons to oxygen ions, which have a larger charge exchange cross section (Yue et al., 2019).
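The unit conversion and the corresponding loss times can be checked with a few lines of arithmetic. The rates are the values quoted above; the helper names are hypothetical.

```python
SECONDS_PER_MINUTE = 60.0
SECONDS_PER_HOUR = 3600.0


def per_minute_to_per_second(rate_per_min):
    """Convert a decay rate from min^-1 to s^-1."""
    return rate_per_min / SECONDS_PER_MINUTE


def loss_time_hours(rate_per_second):
    """Loss time is the inverse of the decay rate, expressed in hours."""
    return 1.0 / rate_per_second / SECONDS_PER_HOUR


quiet = per_minute_to_per_second(0.75e-3)     # a1/N in quiet time
storm_26 = per_minute_to_per_second(1.5e-3)   # a1/N peak, storm No. 26
storm_29 = per_minute_to_per_second(3.0e-3)   # a1/N peak, storm No. 29
burton = 3.6e-5                               # Burton et al. (1975), s^-1
```

These rates correspond to loss times of roughly 22 hr in quiet time, 11 hr for storm No. 26, and 5.6 hr for storm No. 29, compared with about 7.7 hr for the constant Burton et al. (1975) value.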
Panels f in Figures 5 and 6 show −a 2 /c, which represents the second part of the decay of SYM-H. We do not show a 2 directly because, in Equation 5, $c \sum_{j=1}^{N} (V_x B_z)_{i+j}$ denotes the input ring current energy, so the decay term can be written as $c\,(a_2/c) \sum_{j=1}^{N} (V_x B_z)_{i+j}$, where a 2 /c quantifies the decay as a fraction of the injection. Note that a minus sign is added in front of a 2 /c to keep the value positive. The second part of the decay of SYM-H is attributed to the fact that some ring current particles may flow out of the magnetosphere when their drift trajectories intersect the magnetopause (Ji & Shen, 2014). The drifting loss of ring current particles is considered one dominant loss mechanism during the rapid recovery phase of the storm (Keika et al., 2006; Liemohn et al., 1999) instead of O + charge exchange loss (Hamilton et al., 1988). Figures 5f and 6f show that the proportion of the drifting loss with respect to the injection is approximately 0.2 during quiet time. When a storm begins, more energetic particles are injected from the magnetotail and −a 2 /c first decreases and then increases. After the storm enters the recovery phase, the proportion of the drifting loss in the injection reaches approximately 0.5 for storm No. 26 and 0.8 for storm No. 29, and gradually recovers to 0.2 in a few days. Note that −a 2 /c for N = 120 is slightly larger than for N = 60, and that −a 2 /c also shows a positive correlation with the intensity of the storm. The coefficient c, which determines the energy of ring current injection based on the neural network output, is displayed in Figures 5g and 6g. For comparison with the rate of ring current injection in the Burton et al. (1975) model, the panels show −c/60 × 10 3 rather than c, and the maximum of −c/60 can reach 3 × 10 −3 for storm No. 29. In the Burton et al. (1975) model, −d is 1.5 × 10 −3 nT ⋅ m/mV/s, which is larger than the quiet time value but smaller than the storm time value given by the neural network.

Discussion
SYM-H is a quantity for evaluating the intensity of the geomagnetic disturbance caused by the solar wind. Its magnitude is determined by the intensity of the ring current and the magnetopause current. Moreover, the temporal variation of SYM-H mirrors the energy variation of the ring current. Therefore, the solar wind-magnetosphere energy injection and the loss of ring current energy determine the variation of SYM-H. In light of this, Burton et al. (1975) put forward a dynamical equation relating solar wind parameters to Dst, a low-resolution counterpart of SYM-H with a 1-hr resolution. Within the framework of Burton's work, we construct Equation 5, a discrete time-advance formula for SYM-H based on the solar wind and neural networks. The key advance is the use of neural networks to generate the model coefficients in Equation 5. The new model captures the dependence of the coefficients on the solar wind conditions and the state of the magnetosphere. The SYM-H predicted by Equation 5 shows good agreement with the observed SYM-H. In addition, combined with abundant in situ observations, many processes related to the interaction between the solar wind and the magnetosphere can now be investigated.

Insufficient Energy Injection
Figure 4 shows that, for large storms, the composite model underestimates the SYM-H intensity, and this underestimation intensifies as SYM-H becomes smaller. Other SYM-H predictive models produce a similar underestimation (Cai et al., 2010). We argue that insufficient energy injection in Equation 5 during large storms results in the underestimation. Two acceleration processes occurring in the magnetosphere during storms can be used to elucidate the insufficient energy injection. First, as geomagnetic activity intensifies, the main carriers of the ring current also include heavy oxygen ions, which can form a stronger ring current (Fu et al., 2001; Yue et al., 2019). Second, apart from adiabatic acceleration associated with large-scale convection or substorm-induced dipolarizations, localized acceleration by wave-particle interactions plays an increasingly important role in energizing ring current particles (Keika et al., 2013; Zhou et al., 2012; Zong et al., 2012). The composite model only considers the large-scale convection acceleration, which is denoted by $c \sum_{j=1}^{N} (V_x B_z)_{i+j}$ in Equation 5. The contribution of the localized acceleration is not included in Equation 5 and becomes more important for large storms. Hence, as the storm becomes larger, the energy injected into the ring current is underestimated, which results in a lower intensity of SYM-H. For pure neural network models, the explanation is different. The data available to train a neural network for predicting SYM-H are scarce for large storms because of the limited observations. As a result, the relation between the solar wind and SYM-H learned by the neural network is mostly applicable to small storms. For large storms, it is inevitable that SYM-H will be underestimated by the neural network.

Decay Process
The decay of ring current energy in Equation 5 is divided into two parts originating from two physical processes. After SYM-H reaches its peak intensity, the storm enters the recovery phase, which is composed of two parts: the earlier rapid recovery phase (~hr) and the later slow recovery phase (~10 hr). Competing explanations of the rapid and slow recovery phases coexist (Hamilton et al., 1988; Ji & Shen, 2014; Keika et al., 2006; Liemohn et al., 1999). One possible explanation is that the rapid recovery phase is dominated by the charge exchange reaction of oxygen ions (O +) and the slow recovery phase is attributed to the charge exchange reaction of protons (H +). This is based on observations that the O + abundance in the ring current increases with the intensity of the storm and that its charge exchange cross section is larger than that of H + (Chen et al., 2021; Fu et al., 2001; Yue et al., 2019). In other words, the fast loss of O + after the main phase of a large storm results in the rapid recovery phase. The variation of the parameter a 1 of the composite model during storms also supports this explanation. Figures 5e and 6e show that the decay rate of SYM-H (a 1 /N) intensifies in accordance with the disturbance of the geomagnetic field. During quiet time, a 1 /N is about 0.75 × 10 −3 min −1, but during a large storm (No. 29) a 1 /N can reach 3 × 10 −3 min −1, which is evidence that the composite model can characterize the change of the main energy carriers of the ring current from H + to O + during large storms. In addition, the composite model considers another decay process, the drifting loss out through the magnetopause, which is denoted by $a_2 \sum_{j=1}^{N} (V_x B_z)_{i+j}$ in Equation 5. The enhancement of the large-scale convection electric field not only results in the injection of plasma from the magnetotail but also drives plasma out of the magnetosphere through the dayside magnetopause. In steady state, the drift paths of charged particles from the magnetotail are open, which means that the charged particles injected from the magnetotail flow out through the magnetopause, except for those lost during drifting. Figures 5f and 6f show the variation of −a 2 /c, which is the proportion of the loss with respect to the injection. In quiet time, this proportion is about 0.4, coinciding with the ratio of the loss time caused by the charge exchange reaction to the drift time from the magnetotail to the magnetopause. After the storm begins, we find that this proportion decreases at first and then increases to about 0.6 for a small storm (No. 26) and to more than 0.8 for a large storm (No. 29). This is reasonable: at the beginning of a storm, the plasma injection arrives first while the drift loss through the magnetopause is still unchanged, so the proportion decreases. After the newly injected particles drift to the magnetopause, the loss proportion increases above the quiet time value. This is due to the faster nightside-to-dayside drift caused by the larger convection electric field and the weakened magnetic field during large storms, so that more particles are transported to the magnetopause before they are lost. The behavior of the decay coefficients a 1 and a 2 given by the composite model shows that both the charge exchange loss and the drifting loss affect the decay of the ring current during the recovery phase. The rapid recovery phase is a composite result of both decay processes.
One question is why the blue and red lines in Figures 5d, 5f, 5g, 6d, 6f, and 6g differ slightly, whereas in Figures 5e and 6e they agree well with each other. Why does this difference occur for different prediction time steps, for example, N = 60 or N = 120? The difference indicates the effect of the magnetopause on the coefficients. The magnetopause is formed by the pressure balance between the solar wind and the magnetosphere, which is not static but dynamic (Lopez & Gonzalez, 2017). If the characteristic time of the magnetopause motion is about 1-2 hr, comparable to the prediction time step in Equation 5, the coefficients b, a 2, and c, which relate to the properties of the magnetopause, may vary from N = 60 min to N = 120 min. However, the coefficient a 1, which measures the charge exchange loss of the symmetric ring current, is independent of the magnetopause because the symmetric ring current is located far from the magnetopause. The characteristic time of the charge exchange reaction is much longer than the prediction time step, so the loss term a 1 SYM-H shows a linear feature, which means that the coefficients a 1 for N = 60 and N = 120 agree well with each other after being normalized by N.

Effect of Solar Wind Number Density
The composite model shows that the solar wind number density is a nonnegligible factor for the energy injection. In Equation 5, reconnection is regarded as the major mechanism by which energy is transported from the solar wind to the inner magnetosphere. Consistent with other studies (Burton et al., 1975; O'Brien & McPherron, 2000), we employ the solar wind speed times the southward component of the IMF (V x B z) as the coupling function, which has been shown to measure the energy transfer effectively (Gonzalez, 1990). However, it should be noted that the coefficient c in front of V x B z is not defined as a constant but as a variable generated by the neural network depending on the previous V x B z and P 1/2. Figures 5g and 6g show that c varies in step with P 1/2, implying that the solar wind dynamical pressure can affect the energy transfer between the solar wind and the magnetosphere. The main effect of the solar wind speed is already contained in V x B z, so the P 1/2 dependence is attributed to the solar wind density rather than the solar wind speed. The same view has been put forward and supported by MHD simulation (Lopez et al., 2004). They pointed out that the variation of the number density affects the shock compression ratio: a larger compression ratio associated with a higher number density results in stronger magnetosheath fields applied to the magnetopause, increasing the rate of transfer of magnetic flux and yielding a larger polar cap potential that drives stronger convection in the inner magnetosphere. In addition, observational evidence for the effect of the number density on the energy transfer was given by Kataoka et al. (2005), who used in situ observations from the GOES and Cluster spacecraft located in the magnetosheath during the main phase of the superstorm on 20 November 2003. Based on the composite model, the dependence of the coupling function on the solar wind parameters can be investigated in more detail in the future.

Conclusion
In summary, a composite model combining the energy balance analysis of the magnetosphere with neural network techniques is put forward to forecast SYM-H in this study. The neural networks yield the coefficients of the composite model from the historical series of the solar wind and SYM-H. The performance of the 1- and 2-hr forecasts of SYM-H by the composite model is comparable with that of pure NN models, indicating that the composite model has potential application in space weather prediction. Moreover, the dependence of the coefficients generated by the neural networks on the historical solar wind and SYM-H is discussed in connection with features of the magnetosphere, such as the increase of oxygen ions during storms, the enhancement of the convection velocity during storms, and the effect of the number density on the coupling function between the solar wind and the magnetosphere. The variation of the coefficients is consistent with the physical processes occurring in the magnetosphere. In the future, more quantitative investigations of the energy transfer in the magnetosphere during storms can be carried out based on the composite model.
composite model. The maximum RMSE is 17.5 nT for the Siciliano model, 15.2 nT for the Collado-Villaverde model, and 13.9 nT for the composite model. The average RMSE is 9.0 nT for the Siciliano model, 8.1 nT for the Collado-Villaverde model, and 7.2 nT for the composite model. It is evident that the performance of the composite model is slightly better than that of the Siciliano and Collado-Villaverde models. Only storm No. 37, from 11 March 2015 to 21 March 2015, is an exception, for which the Collado-Villaverde model yields a smaller RMSE than the composite model. The R 2 values for the different models shown in Table 2 also confirm that the composite model performs better than the other two neural network models for most test storms. In particular, for storm No. 30, the R 2 obtained by the composite model is 0.867, clearly larger than that of Siciliano et al. (2021) (0.75) and Collado-Villaverde et al. (2021) (0.798).

Figure 2. Comparison between the observed SYM-H (black line) and the 1-hr predicted SYM-H (red line). The predicted SYM-H is calculated using Equation 5, with b, a 1, a 2, and c determined by a neural network.

Figure 3. Comparison between the observational SYM-H (black line) and the 2-hr predicted SYM-H (red line). The predicted SYM-H is calculated using Equation 5, with $b$, $a_1$, $a_2$, and $c$ determined by a neural network.

Figure 4. Scatter plot of the observed and predicted SYM-H indices by the composite model: (a) for 1-hr prediction and (b) for 2-hr prediction.
model, we show $-c/60 \times 10^3$ instead of $c$, where the minus sign keeps the value positive and dividing by 60 converts the unit from per minute to per second. Here, $c/60$ has the same meaning as the coefficient $d$ in Equation 1. The magnitude of $-c/60$ is around $1 \times 10^{-3}$ nT ⋅ m/mV/s during quiet time, and it differs only slightly between N = 60 and N = 120. When the solar wind dynamic pressure P increases, the magnitude of $-c/60$ increases as well. For storm No. 26, the maximum of $-c/60$ reaches approximately $2 \times 10^{-3}$, and for storm No. 29, the maximum

Figure 5. Inputs and outputs of the neural network shown in Figure 1 during storm No. 26. (a) Solar wind electric field in the y direction, $V_x B_z$; (b) solar wind dynamic pressure; (c) SYM-H index; (d-g) coefficients $b$, $a_1/N \times 10^3$, $-a_2/c$, and $c/60 \times 10^3$ in Equation 5. Red lines are for N = 60 and blue lines are for N = 120. The dashed line is given by the BMR model.

Figure 6. Inputs and outputs of the neural network shown in Figure 1 during storm No. 29. (a) Solar wind electric field in the y direction, $V_x B_z$; (b) solar wind dynamic pressure; (c) SYM-H index; (d-g) coefficients $b$, $a_1/N \times 10^3$, $-a_2/c$, and $c/60 \times 10^3$ in Equation 5. Red lines are for N = 60 and blue lines are for N = 120. The dashed line is given by the BMR model.
Subtracting the component ($bP_i^{1/2}$) contributed by the magnetopause current from SYM-H$_i$. Then, we obtain $\text{SYM-H}^*_i = \text{SYM-H}_i - bP_i^{1/2}$, which is contributed only by the ring current, where the superscript * is used to distinguish it from SYM-H$_i$, $b$ is the same as in the BMR model, and $P_i^{1/2}$ is the square root of the solar wind dynamic pressure. The subscript $i$ denotes time $i$.
2. Considering the loss effect of SYM-H* by subtracting $a_1$ SYM-

Figure 1. Composite model architecture for SYM-H index prediction.
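The pressure-correction step described above can be illustrated with a short sketch. It assumes the form $\text{SYM-H}^* = \text{SYM-H} - bP^{1/2}$, with the pressure in nPa and $b$ in nT/nPa$^{1/2}$. The function name and the numerical value of `b` used in the example are hypothetical placeholders, not values taken from the study.

```python
import math


def pressure_corrected_symh(symh_nt, pressure_npa, b):
    """Remove the magnetopause-current contribution from SYM-H.

    Assumed form: SYM-H* = SYM-H - b * sqrt(P), so that SYM-H* reflects
    the ring current only. `b` is in nT / nPa**0.5.
    """
    if pressure_npa < 0:
        raise ValueError("dynamic pressure must be non-negative")
    return symh_nt - b * math.sqrt(pressure_npa)


# Illustrative values only: b = 15.8 is a placeholder, not a fitted coefficient.
example_b = 15.8
symh_series = [-20.0, -50.0, -90.0]   # nT
pressure_series = [1.0, 4.0, 9.0]     # nPa
symh_star_series = [
    pressure_corrected_symh(s, p, example_b)
    for s, p in zip(symh_series, pressure_series)
]
```

Because the correction scales with the square root of the pressure, a pressure enhancement during a storm lowers SYM-H* relative to SYM-H by a growing amount, which is why the coefficient $b$ matters most around sudden commencements.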

Table 1
Training Data Sets and Validating Data Sets Used in This Study

Table 2 shows the RMSE and $R^2$ for the 1-hr prediction of SYM-H. The minimum RMSE among all test storms is 4.2 nT for the Siciliano model, 4.1 nT for the Collado-Villaverde model, and 3.9 nT for the

Table 3 also displays the corresponding $R^2$ of the baseline, the Collado-Villaverde model, and the composite model. The values of $R^2$ for the 2-hr prediction are clearly lower than those for the 1-hr prediction. This is reasonable because, as the prediction time step increases, the interaction between the magnetosphere and the solar wind becomes more unpredictable and the chaos continues to accumulate. Both metrics indicate that the composite model forecasts are slightly more precise than those of the Collado-Villaverde et al. (2021) model. The difference in mean RMSE between the 1- and 2-hr predictions is 2.8 nT for the Collado-Villaverde et al. (2021) model and 2.3 nT for the composite model, and the corresponding $R^2$ differences are −0.049 and −0.038. Hence, the composite model shows a more robust prediction ability as the prediction time becomes longer.

Figure 2 displays the comparison between the 1-hr predicted and the observed SYM-H for four test storms of different intensities, whose minimum SYM-H values are around −100, −200, −300, and −400 nT. The panels in Figure 2 are ordered by storm date, and the corresponding metrics can be found in Table 2. The red-dashed lines represent the SYM-H predictions and are in good agreement with the black-solid lines representing the observed values. The composite model gives a more accurate prediction during quiet time and the recovery phase. Still, during the sudden commencement and the main phase of magnetic storms, there are some deviations that cannot be neglected. Storms of small intensity, such as storm No. 26, show a smaller difference than large storms, such as storms No. 27 and No. 29. Another discrepancy is that the moment of the sudden increase in the observed SYM-H does not coincide with that in the predicted SYM-H, especially for storm No. 29. This discordance may be caused by the uncertain shift of the solar wind parameters from the L1 measurement point to the Earth's Bow Shock Nose.
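To give a sense of how sensitive such a time shift is, the sketch below uses a simple flat-delay (ballistic) approximation: the delay is the upstream distance divided by the solar wind speed. This is only an illustration and not the shifting procedure behind the data used in the study; the monitor distance, the bow shock nose distance, and the function name are all assumptions.

```python
EARTH_RADIUS_KM = 6371.0

# Assumed illustrative geometry: monitor near L1 and bow shock nose on the Sun-Earth line.
L1_DISTANCE_KM = 1.5e6
BOW_SHOCK_NOSE_KM = 14.0 * EARTH_RADIUS_KM


def ballistic_delay_seconds(monitor_km, target_km, speed_km_s):
    """Flat-delay estimate: travel time of the solar wind from monitor to target."""
    if speed_km_s <= 0:
        raise ValueError("solar wind speed must be positive")
    if monitor_km < target_km:
        raise ValueError("monitor must be upstream of the target")
    return (monitor_km - target_km) / speed_km_s


# Delay for a slow and a fast solar wind stream (speeds are hypothetical).
slow_delay = ballistic_delay_seconds(L1_DISTANCE_KM, BOW_SHOCK_NOSE_KM, 400.0)
fast_delay = ballistic_delay_seconds(L1_DISTANCE_KM, BOW_SHOCK_NOSE_KM, 600.0)
delay_spread_minutes = (slow_delay - fast_delay) / 60.0
```

Under these assumptions the delay changes by roughly 20 min between 400 and 600 km/s, so an error in the assumed propagation can plausibly displace the predicted onset relative to the observed one.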

Table 2
RMSE and $R^2$ Values for SYM-H Predictions Over the Test Storms, Comparing the Work of Siciliano et al. and Collado-Villaverde et al. for 1-hr Predictions

27, No. 29, and No. 38. The predicted SYM-H is larger than the observational SYM-H, implying that the predicted energy injection into the ring current is less than the real amount. More mechanisms responsible for

Table 3
RMSE and $R^2$ Values for SYM-H Predictions Over the Test Storms, Compared With Baseline and the Work of Collado-Villaverde et al. for 2-hr Predictions