Ionospheric single-station TEC short-term forecast using RBF neural network

Abstract

In this article, a radial basis function (RBF) neural network improved by a Gaussian mixture model is developed to forecast ionospheric total electron content (TEC) 30 min ahead, taking advantage of its nonlinear modeling capacity. To understand how the network model responds at stations situated at different latitudes, TEC estimated overhead of the GPS ground stations BJFS (39.61°N, 115.89°E), WUHN (30.53°N, 114.36°E), and KUNM (25.03°N, 102.80°E) for 6 months in 2011 is used as the training, validation, and test data sets of the RBF network model. The performance of the trained model is evaluated against a set of criteria. Our results show that the predicted TEC is in good agreement with observations, with a mean relative error of about 9% and a root-mean-square error of less than 5 total electron content units (TECU; 1 TECU = 10¹⁶ el m⁻²). Our comparison further indicates that the RBF network offers a powerful and reliable tool for the design of ionospheric TEC forecasts.

1 Introduction

Ionospheric total electron content (TEC) is the number of free electrons in a column of 1 m² cross section along the path of the electromagnetic wave between a satellite and the receiver. It is an important descriptive parameter in ionospheric studies and is crucial for correcting navigation measurements for single-frequency users, especially under geomagnetically disturbed conditions.

Most investigations focus on improved prediction, mapping, and forecasting of the ionospheric TEC. Ionospheric models have been established from statistical descriptions based on a large number of observations or from a combination of physics and observations [Klobuchar, 1987; Bilitza, 2001; Bust et al., 2004; Schunk et al., 2004; Scherliess et al., 2009]. One of the most widely used models is the International Reference Ionosphere (IRI), an international project sponsored by the Committee on Space Research and the International Union of Radio Science. IRI can describe monthly averages of electron density in the altitude range from 50 km to 1500 km if a series of standard parameters is provided, including location, local time, day of year, and solar activity indices. Mapping techniques for GPS-derived ionospheric TEC have developed rapidly since the Global Positioning System was put into service. In the literature, spherical harmonics, inverse distance squared interpolation, and kriging are frequently used to construct global and regional TEC maps [e.g., Mannucci et al., 1998; Orús et al., 2005; Sayin et al., 2008; Liu et al., 2011; Huang and Yuan, 2012]. An artificial neural network (ANN) is a computational model inspired by a simplified picture of how the human brain processes information. Because neural networks are well-known general function approximators, they are frequently used to model complex relationships between inputs and outputs and to find patterns in data. Because ionospheric parameters vary nonlinearly, ANNs have been applied to forecasting the critical frequency of the F2 layer (foF2) and TEC [Francis et al., 2001; Oyeyemi et al., 2005; Tulunay et al., 2006]. These studies demonstrated that ANN-based approaches are promising for modeling ionospheric parameters.

The variation of TEC is affected by many factors such as time, location, solar activity, and geomagnetic activity. The southern regions of China encompass the northern crest of the equatorial anomaly, where the variation of the ionosphere is complex and severe; therefore, ANN approaches need to be studied comprehensively when developing ionospheric TEC mapping and forecasting for this region. Early works were based largely on the back propagation (BP) neural network architecture and obtained some significant results. Radial basis function (RBF) networks, one of the most important classes of artificial neural networks, share a common application area with BP networks. Each RBF hidden unit has a locally tuned response characterized by its center and its width, whereas the response of a BP network may be regarded as global in comparison.

In this work, a method is developed to forecast ionospheric TEC with an efficient RBF neural network. In section 2, the RBF network topology is introduced, and in section 3, the method for processing ionospheric TEC from dual-frequency GPS measurements is described. In section 4, an ionospheric TEC prediction model based on the RBF neural network is developed. In section 5, the performance of the RBF network model is evaluated and compared with that of the well-known BP network. Finally, the results are discussed, with concluding remarks on the general performance of the neural networks and on future aspects of the study.

2 RBF Networks Topology

Radial basis function (RBF) neural networks typically include three layers: an input layer, a hidden layer, and a linear output layer. Each neuron in the input layer corresponds to one predictor variable and is connected to all the neurons in the hidden layer via unity synaptic weights, that is, every connection between the input layer and the hidden layer has the same strength. Each neuron in the hidden layer consists of a radial basis function centered on a point with the same dimensions as the predictor variables. The output layer forms the network outputs as a weighted sum of the outputs from the hidden layer. Assuming that x = [x1, x2, …, xn] is the input feature vector defined by the problem itself, this process can be described by equation (1) as follows:

$$y_j = \sum_{i=1}^{h} w_{ij}\,\phi_i(\mathbf{x}) \tag{1}$$

where yj is the jth output, h is the number of neurons in the hidden layer, and wij are the output weights, each corresponding to the connection between a hidden unit and an output unit. ϕi is the activation function of the ith hidden neuron of the RBF network. The most widely used activation function in various scientific fields is the Gaussian, which gives each hidden neuron a local response in the input-output mapping:

$$\phi_i(\mathbf{x}) = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_i \rVert^2}{2 r_i^2}\right) \tag{2}$$

where ci is the mean of the ith Gaussian function and ri is its width (standard deviation). The mean vector ci represents the center location, while ri controls the shape of the Gaussian activation function.
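The forward pass described in this section can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the Gaussian is assumed to take the common form exp(−d²/(2r²)) with r as the standard deviation, which may differ from the normalization used in equation (2).

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian activation of one hidden neuron (assumed form exp(-d^2 / (2 r^2)))."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-d2 / (2.0 * width ** 2))

def rbf_forward(x, centers, widths, weights):
    """Network outputs y_j = sum_i w_ij * phi_i(x).

    centers: h center vectors; widths: h positive widths;
    weights: h rows (one per hidden neuron), each with one weight per output.
    """
    phi = [gaussian_rbf(x, c, r) for c, r in zip(centers, widths)]
    n_outputs = len(weights[0])
    return [sum(phi[i] * weights[i][j] for i in range(len(phi)))
            for j in range(n_outputs)]

# Toy example: two hidden neurons, one output
y = rbf_forward([0.0, 0.0],
                [[0.0, 0.0], [1.0, 1.0]],   # centers
                [1.0, 1.0],                 # widths
                [[1.0], [2.0]])             # output weights
```

In the toy example, the input coincides with the first center, so that neuron responds with 1, while the second neuron contributes a smaller, distance-damped value.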

3 Accurate TEC Processing From GPS Observations

Dual-frequency carrier-phase and code-delay GPS observations are combined to obtain slant ionospheric TEC inline image along the satellite-receiver line of sight:

display math(3)

where inline image represents the real slant TEC from the satellite S to the receiver R and inline image is differential code biases due to the transmitting and the receiving hardware. These biases must be estimated and eliminated from the data in order to calibrate the experimental slant TEC obtained from GPS observations. Usually, the slant TEC measurements are converted into the equivalent vertical TEC values independent of elevation angle, and the following mapping function is used to convert slant TEC to vertical TEC, or vice versa [Schaer et al., 1995]:

display math(4)

where inline image is the real vertical TEC and inline image denotes elevation angle at the ionospheric pierce point (IPP) known as the intersect point of the satellite-receiver line of sight passing through ionospheric thin shell fixed at an altitude of 420 km. The vertical TEC is unknown and can be modeled with the following polynomial function [An et al., 2010; Liu et al., 2011]:

display math(5)

where Aab are the coefficients of the polynomial, n and m are the degrees of the polynomial, s0 is the hour angle of the Sun at the middle epoch of the observation session observed at the central point, s is the hour angle of the Sun at the observation epoch at the IPP, i.e., s − s0 = (λ − λ0) + (t − t0), t is the observation epoch, t0 is the middle time of the observation session, and φ and λ are the latitude and longitude of the IPP. According to equations (4) and (5), equation (3) can be expressed as

display math(6)
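For orientation, the relations described in this section are commonly written in the single-layer literature in the form sketched below. The symbols are assumptions chosen for illustration and may differ from the notation of equations (3)–(6): a tilde marks the measured slant TEC, B is the combined satellite and receiver bias expressed in TEC units, E′ is the elevation angle at the IPP, and φ0 is the latitude of the central point.

```latex
\begin{align*}
\widetilde{\mathrm{STEC}}_R^S &= \mathrm{STEC}_R^S + B_R^S
  && \text{measured = real slant TEC + hardware biases} \\
\mathrm{STEC}_R^S &= \frac{\mathrm{VTEC}}{\sin E'}
  && \text{thin-shell mapping function} \\
\mathrm{VTEC}(\varphi, s) &= \sum_{a=0}^{n} \sum_{b=0}^{m}
  A_{ab}\,(\varphi - \varphi_0)^{a}\,(s - s_0)^{b}
  && \text{polynomial model} \\
\widetilde{\mathrm{STEC}}_R^S &= \frac{1}{\sin E'} \sum_{a=0}^{n} \sum_{b=0}^{m}
  A_{ab}\,(\varphi - \varphi_0)^{a}\,(s - s_0)^{b} + B_R^S
  && \text{combined observation equation}
\end{align*}
```

Written this way, the unknowns (the coefficients Aab and the biases) enter linearly, which is why a least squares solution is applicable.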

In this study, the degrees are n = 4 and m = 3. Vertical TEC and instrumental biases are estimated using the least squares method. The vertical TEC modeled with the polynomial of equation (5) is taken as the actual TEC distribution and is used as the training, validation, and test data sets of the neural network model. It should be pointed out that the TEC values generated by the method described above cannot exactly reproduce the real distribution of TEC over the southern region of China. The ionosphere over China is characterized by large vertical delay values and large spatial and temporal gradients, especially in southern China, which is located in the equatorial ionospheric anomaly. Errors are introduced in the conversion of slant TEC to vertical TEC using equation (4) [Rama Rao et al., 2006]. At equatorial and low latitudes, the maximum root-mean-square error is around 6–12 TECU (1 TECU = 10¹⁶ el m⁻²) [Huang and Yuan, 2013]. It is difficult to characterize the TEC variability accurately with the existing techniques. However, the TEC values obtained from dual-frequency GPS observables provide the most readily available data source for examining the general performance of the RBF neural network method in a comparative manner.
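The slant-to-vertical conversion can be illustrated with a short sketch of the standard thin-shell geometry. The function names and the mean Earth radius of 6371 km are assumptions made here; only the 420 km shell altitude comes from the text. The elevation at the pierce point follows from cos E′ = R/(R + H) · cos E, where E is the elevation at the receiver.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed value)
SHELL_HEIGHT_KM = 420.0    # thin-shell altitude used in the text

def ipp_elevation(elev_receiver_deg, shell_height_km=SHELL_HEIGHT_KM):
    """Elevation angle (degrees) of the ray at the ionospheric pierce point."""
    ratio = EARTH_RADIUS_KM / (EARTH_RADIUS_KM + shell_height_km)
    cos_e_ipp = ratio * math.cos(math.radians(elev_receiver_deg))
    return math.degrees(math.acos(cos_e_ipp))

def slant_to_vertical(stec, elev_receiver_deg):
    """Convert slant TEC to vertical TEC with the thin-shell mapping function."""
    return stec * math.sin(math.radians(ipp_elevation(elev_receiver_deg)))

def vertical_to_slant(vtec, elev_receiver_deg):
    """Inverse conversion, from vertical TEC to slant TEC."""
    return vtec / math.sin(math.radians(ipp_elevation(elev_receiver_deg)))

# A 30 TECU slant measurement at 25 degrees elevation maps to a smaller vertical value
vtec_example = slant_to_vertical(30.0, 25.0)
```

At low elevation the ray crosses the shell obliquely, so the same vertical TEC produces a larger slant value; this is also where the thin-shell assumption contributes most of the conversion error mentioned above.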

4 Construction of RBF Network-Based Model

The basic network structure is one of the factors influencing the performance of RBF neural networks. Thus, the primary task in developing an RBF network-based ionospheric TEC forecast system is to construct a suitable network structure.

4.1 Input Variable Selection and Preprocessing

For an artificial neural network, the choice of input feature variables is a fundamental and crucial consideration in identifying the optimal functional form. For the short-term TEC forecast in the present study, input variables are selected from the available data according to the relationships within the data, so that they are suitable predictors of the model output. The TEC value at time instant k, denoted D(k), is chosen as the first input variable. Two further inputs describe the temporal variation of TEC, (D(k) − D(k − 1))/Δt and (D(k − 1) − D(k − 2))/Δt, and are included because of the significant temporal variability of TEC at low latitudes. The relative difference (D(k) − D(k − 1))/D(k) is also an important feature characterizing TEC variability and is therefore chosen as an input variable. In addition, day of year and local time are included as input variables. The output is D(k + 1), the 30 min TEC forecast. To keep the inputs numerically continuous across the end of the year and the end of the day, day number and hour are each split into two cyclical components [Habarulema et al., 2009]. The diagram of input-output variables is shown in Figure 1.

Figure 1.

Input-output variables diagram of RBF neural network.
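As an illustration of the input vector described above, the following sketch assembles one training sample from a 30 min TEC series. The function names are hypothetical, Δt is assumed to be expressed in hours, and the cyclical components are assumed to be sine and cosine terms with periods of 365.25 days and 24 h, a common choice that may differ in detail from the one actually used.

```python
import math

def cyclic(value, period):
    """Split a periodic quantity into sine and cosine components."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def build_sample(tec, k, day_of_year, local_time_h, dt_h=0.5):
    """Return (inputs, target) for forecasting D(k+1).

    tec: list of TEC values D(0), D(1), ... in TECU, one every dt_h hours.
    Requires 2 <= k <= len(tec) - 2.
    """
    d0, d1, d2 = tec[k], tec[k - 1], tec[k - 2]
    doy_s, doy_c = cyclic(day_of_year, 365.25)   # assumed year length
    lt_s, lt_c = cyclic(local_time_h, 24.0)
    inputs = [
        d0,                  # D(k)
        (d0 - d1) / dt_h,    # (D(k) - D(k-1)) / dt
        (d1 - d2) / dt_h,    # (D(k-1) - D(k-2)) / dt
        (d0 - d1) / d0,      # relative difference
        doy_s, doy_c,        # cyclical day-of-year components
        lt_s, lt_c,          # cyclical local-time components
    ]
    return inputs, tec[k + 1]

# Toy series (hypothetical values in TECU)
inputs, target = build_sample([10.0, 12.0, 15.0, 19.0], k=2,
                              day_of_year=200, local_time_h=6.0)
```

The sine and cosine pair places 23:30 and 00:00 close together in input space, which a raw hour value would not.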

Because several input variables are not independent of each other, maximal entropy is used for dimensionality reduction, which speeds up computation. In addition, K means clustering, a computationally efficient classification method, is used to group the input vectors into homogeneous subgroups. K means clustering proceeds as follows: first, specify the number of clusters and choose the initial cluster means; then assign each case to the cluster whose mean is closest to it and update the cluster means. This process is iterated and stops once the cluster means no longer change appreciably between successive steps. It is especially important to normalize the input data before clustering.
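The clustering steps listed above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: min-max scaling is assumed for the normalization, the first k rows are assumed as initial centers, and the function names are hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale every column to [0, 1] (assumed normalization scheme)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - l) / s for v, l, s in zip(row, lo, span)] for row in rows]

def kmeans(rows, k, max_iter=100, tol=1e-9):
    """Plain K-means; the first k rows serve as initial centers (an assumption)."""
    centers = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(max_iter):
        # Assignment step: nearest center by Euclidean distance
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                  for r in rows]
        # Update step: each center becomes the mean of its members
        new_centers = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])   # keep an empty cluster's center
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:   # stop once the means no longer move
            break
    return centers, labels

# Two well-separated toy groups (hypothetical data)
rows = normalize([[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]])
centers, labels = kmeans(rows, 2)
```

Without the normalization step, a variable with a large numeric range (for example TEC in TECU) would dominate the distance over the bounded cyclical inputs.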

4.2 Learning Algorithm of Network

RBF networks are typically trained by a two-stage process. First, the values of the centers and the widths of the RBF functions in the hidden layer are chosen. Second, the weights are trained to maximize the fitting of the network to the training data.

The behavior of RBF networks depends greatly on how the centers and the widths of the basis functions are selected. Several schemes for selecting the centers and widths have been reported [Mashor, 1998; Pedrycz, 1998; Matej and Lewitt, 1996; Grabusts, 2001; Zhang et al., 2010]. The simple approach known as “fixed center” can easily cause an uneven distribution of the data points throughout the input space. Thus, a better approach is to use a principled clustering technique to find a set of RBF centers that more accurately reflects the distribution of the data points. A Gaussian mixture model, which is a weighted sum of Gaussian component densities, is commonly used as a parametric model of the probability distribution of continuous measurements because it can represent a large class of sample distributions [Reynold, 2008]. In this study, a Gaussian mixture model is introduced to cluster the TEC data set and determine the optimal number of clusters. The mean and variance of each Gaussian component are estimated iteratively using the expectation maximization algorithm. Once the RBF centers have been determined, the basis function width can be set to the maximum intercenter squared distance. The basis functions are then kept fixed while the second-layer weights are found in a second phase of training. Under a sum of squared error criterion, the weights from the hidden layer to the output layer can then be determined using the pseudoinverse:

$$W = H^{+} Y \tag{7}$$

where W is the weight matrix, Y is the output matrix, and H is the matrix of hidden-layer outputs. The pseudoinverse of H is defined as H⁺ = (HᵀH)⁻¹Hᵀ. Although this procedure may not give solutions with an error as low as that obtained with general purpose nonlinear optimizers, it is much faster.
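The second training stage can be sketched with NumPy, whose `numpy.linalg.pinv` computes the Moore-Penrose pseudoinverse. The helper names, the toy data, and the fixed centers and widths below are hypothetical; in the procedure described above the centers would come from the Gaussian mixture clustering, and the Gaussian form exp(−d²/(2r²)) is an assumption.

```python
import numpy as np

def hidden_matrix(X, centers, widths):
    """Matrix H of hidden-layer outputs: one row per sample, one column per neuron."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * widths ** 2))   # assumed Gaussian normalization

def fit_output_weights(H, Y):
    """Least-squares output weights W = H^+ Y."""
    return np.linalg.pinv(H) @ Y

# Toy check (hypothetical data): weights used to generate the targets are recovered
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
centers = np.array([[-0.5, -0.5], [0.5, 0.5], [0.0, 0.0]])
widths = np.full(3, 0.7)
H = hidden_matrix(X, centers, widths)
W_true = np.array([[1.0], [-2.0], [0.5]])
W = fit_output_weights(H, H @ W_true)
```

Because the centers and widths are fixed at this stage, the problem is linear in W, so a single pseudoinverse replaces iterative gradient-based training.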

5 Data and Results

5.1 Data Preparation

Using the algorithm described in section 3, the vertical TEC overhead of each observation station is obtained at 30 min intervals from GPS observational data of dual-frequency International GNSS Service (IGS) stations at different latitudes around the longitude 110°E. The locations of the GPS stations are listed in Table 1.

Table 1. The Location of GPS Stations
Station   Latitude (°N)   Longitude (°E)
BJFS      39.61           115.89
WUHN      30.53           114.36
KUNM      25.03           102.80

To better evaluate the performance of the RBF neural network, the GPS observations from 1 July to 31 December 2011 are analyzed. The ionospheric TEC values are divided into training, validation, and test data sets. Table 2 shows how the TEC values are assigned within the modeling process. Several geomagnetic storms took place during this period. The 10.7 cm solar radio flux (F10.7) and the Dst index are shown in Figure 2 (top and bottom). The mean F10.7 for this period was 135 solar flux units (sfu; 1 sfu = 10⁻²² W m⁻² Hz⁻¹), but it varied significantly, with a maximum of 190 sfu on 24 September (day 267). As indicated in Figure 2, geomagnetic storms took place on 6 August (day 218); 9, 17, and 26 September (days 252, 260, and 269); and 25 October (day 298). The corresponding minimum Dst values were −107 nT, −69 nT, −70 nT, −101 nT, and −130 nT.

Table 2. Assignment of the Input Data
Phase        Days               Year
Training     1 Jul to 31 Oct    2011
Validating   1 Nov to 10 Nov    2011
Test         11 Nov to 31 Dec   2011

Figure 2.

Variation of (top) F10.7 index and (bottom) Dst index from 1 July (day 182) to 31 December (day 365) 2011.

To find the proper RBF structure, the number of neurons in the hidden layer is varied from 6 to 50 with a step size of 4. The results for different numbers of neurons are evaluated by comparing the sum of squared errors, which is defined by the equation below [Yilmaz et al., 2009]:

$$\Delta = \sum_{k=1}^{n} \left[ D_0(k) - D_f(k) \right]^2 \tag{8}$$

where Δ is the error, D0(k) is the real measurement, Df(k) is the predicted value, and n is the number of data points. The recorded error values versus the number of neurons are presented in Figure 3. The figure shows that the error decreases as the number of neurons increases and almost reaches a constant value when the number of neurons exceeds 25. Because more neurons increase the computational complexity, the number of neurons is set to 35, which does not significantly reduce accuracy. Next, the error is evaluated to find the optimal number of cluster centers. The errors do not differ significantly when the number of cluster centers varies from three to five. Therefore, the network with 35 neurons in the hidden layer and four cluster centers is considered the proper structure for the following implementation.

Figure 3.

Errors distribution as a function of number of neurons in hidden layer.
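The structure search can be sketched as a scan over hidden-layer sizes scored by the sum of squared errors. The selection rule in `pick_size`, which takes the smallest size whose error lies within a relative tolerance of the best one, is a hypothetical criterion added for illustration; the text selects the size by inspecting Figure 3.

```python
def sum_squared_error(observed, forecast):
    """Sum of squared differences between observed and forecast TEC."""
    return sum((o - f) ** 2 for o, f in zip(observed, forecast))

# Hidden-layer sizes scanned in the text: 6, 10, ..., 50
candidate_sizes = list(range(6, 51, 4))

def pick_size(errors_by_size, tolerance=0.05):
    """Smallest size whose error is within `tolerance` (relative) of the best one.

    errors_by_size: dict mapping hidden-layer size to its error.
    The tolerance rule is a hypothetical selection criterion.
    """
    best = min(errors_by_size.values())
    for size in sorted(errors_by_size):
        if errors_by_size[size] <= best * (1.0 + tolerance):
            return size

# Toy error curve (hypothetical values) that flattens as the network grows
chosen = pick_size({6: 10.0, 10: 5.0, 14: 4.1, 18: 4.0})
```

A rule of this kind encodes the trade-off stated above: once the error curve flattens, additional neurons add computation without a meaningful gain in accuracy.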

After the suitable number of neurons is determined, a simulation is carried out to evaluate the capacity of the network model. The network outputs are compared with the observed TEC. Training results differ between runs because of the random initial weights. Thus, the network model is trained 10 times, and the averaged results are used in this paper. Figure 4 illustrates the diurnal variations of both the 30 min forecast TEC (black solid line) and the observed TEC (blue solid line) at the different GPS stations for the training data over the period of 25–29 September (days 268–272) 2011. TEC observations are absent during the early hours of day 272 for station KUNM. Visual inspection shows that the diurnal variations of the forecast and observed GPS TEC are in good agreement. Figure 5 presents the variations of both the 30 min forecast TEC (black solid line) and the observed TEC (blue solid line) for the test data over the period of 13–17 November (days 317–321) 2011. Again, the diurnal variations of the forecast and observed TEC are in good agreement.

Figure 4.

Diurnal variation of both 30 min forecast TEC (black) and observed TEC (blue) with the training data from day 268 to 272 2011.

Figure 5.

Diurnal variation of both 30 min forecast TEC (black) and observed TEC (blue) with the test data for day 317 to 321 2011.

Figure 6 illustrates the results of regression analysis at the different stations for the period between 1 November and 31 December 2011. The slope of the best fit line (m, red solid line) is close to 1, and the y intercept (b) is close to 0, with a maximum value of not more than 0.5 TECU, which indicates that the outputs from the RBF network agree well with the observed data. Figure 6 also shows that the correlation coefficients exceed 0.99 at all stations. It should be pointed out that using the last observation as the 30 min forecast would likely produce very similar regression results, with correlation coefficients also close to 1.

Figure 6.

Results of regression analysis with best fit line (red solid line) using the data from 1 November to 31 December 2011 at BJFS, WUHN, and KUNM respectively from left to right.
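A regression summary of this kind (slope m, intercept b, and correlation coefficient) can be computed as in the sketch below. The function name and the toy values are hypothetical; the fit is an ordinary least squares line of forecast against observed TEC.

```python
import math

def linear_fit(observed, forecast):
    """Least-squares line forecast = m * observed + b, plus Pearson correlation r."""
    n = len(observed)
    mx = sum(observed) / n
    my = sum(forecast) / n
    sxx = sum((x - mx) ** 2 for x in observed)
    syy = sum((y - my) ** 2 for y in forecast)
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, forecast))
    m = sxy / sxx
    b = my - m * mx
    r = sxy / math.sqrt(sxx * syy)
    return m, b, r

# Toy example: a forecast with a constant 0.5 TECU offset
m, b, r = linear_fit([10.0, 20.0, 30.0, 40.0], [10.5, 20.5, 30.5, 40.5])
```

As the toy case shows, a forecast can have a slope of 1 and perfect correlation while still being biased, which is why the intercept and the error statistics are reported alongside the correlation coefficient.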

Next, the performance of the RBF network model is further investigated using different error criteria. In this work, the error e, the relative error re [Leandro and Santos, 2007], and the root-mean-square error RMS are given as follows:

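$$e(k) = D_f(k) - D_0(k) \tag{9}$$
$$re(k) = \frac{\left| D_f(k) - D_0(k) \right|}{D_0(k)} \times 100\% \tag{10}$$
$$\mathrm{RMS} = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \left[ D_f(k) - D_0(k) \right]^2} \tag{11}$$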
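The three criteria can be sketched in Python as follows. The sign convention (forecast minus observed) and the use of the observed value as the denominator of the relative error are assumptions; the function names are hypothetical.

```python
import math

def errors(observed, forecast):
    """Pointwise error, forecast minus observed (assumed sign convention)."""
    return [f - o for o, f in zip(observed, forecast)]

def mean_relative_error(observed, forecast):
    """Mean of |forecast - observed| / observed, in percent."""
    total = sum(abs(f - o) / o for o, f in zip(observed, forecast))
    return 100.0 * total / len(observed)

def rms_error(observed, forecast):
    """Root-mean-square error in the units of the input (TECU)."""
    total = sum((f - o) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(total / len(observed))

# Toy example (hypothetical values in TECU)
obs, fc = [10.0, 20.0], [11.0, 18.0]
e = errors(obs, fc)
mre = mean_relative_error(obs, fc)
rms = rms_error(obs, fc)
```

Because the relative error divides by the observed TEC, it is inflated when the background TEC is small, for example at night, whereas the RMS error is dominated by the largest absolute deviations.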

First, the errors between forecasting values and observed measurements are analyzed statistically during the whole period of interest, 1 November to 31 December 2011. Figure 7 presents the error distribution at stations BJFS, WUHN, and KUNM. It can be seen that the errors within ±1.5 TECU are about 94% for station BJFS and 92% for station WUHN. However, the errors within ±1.5 TECU are about 75%, and quite a few errors exceed 10 TECU at station KUNM.

Figure 7.

Distribution of errors with respect to different stations.

As mentioned previously, the BP network is widely used to forecast TEC variation, so it is useful to compare the performance of the two neural network models. Following the architecture of BP neural network-based TEC forecasting reported in the literature [Tulunay et al., 2006; Weng et al., 2012], an optimized BP network model with a single hidden layer is constructed in this study. The BP neural network model is designed and trained with the same input variables as the RBF network-based model. Its architecture consists of a hidden layer with 35 neurons, a hyperbolic tangent sigmoid transfer function for the hidden layer, and a linear transfer function for the output layer, and it is trained with the Levenberg-Marquardt back propagation algorithm. The performance of the trained RBF and BP models with respect to TEC prediction is evaluated by calculating the mean relative error and the root-mean-square error for each 24 h period. Figure 8 presents a comparison of the error distributions of the RBF and BP models for the same data set. Because 30 min ahead is a short time interval, the errors obtained from equations (10) and (11) by simply taking the current observation as the forecast for the next observation (referred to as "no forecast") are also shown in Figure 8. TEC measurements are not available on days 344 and 345 at station KUNM; however, the missing data do not affect the results and conclusions of this study. Using the proposed RBF network model, the maximum mean relative error at station BJFS is not more than 9%, although individual relative errors might be larger than 9%, especially at night when the background TEC is small. The RMS errors do not exceed 3 TECU at stations BJFS and WUHN, but they are larger at station KUNM. As shown in Figure 8, the mean relative errors and RMS errors from the RBF network are smaller than those from the BP network in most cases, but the RMS differences are not significant. Overall, the proposed RBF approach has an acceptable performance during most of the study period. In addition, the no forecast errors are larger than those of the RBF network model, especially for the low-latitude stations WUHN and KUNM. They are also larger than those of the BP network model on most days.

Figure 8.

(a) Comparison of RBF (top) relative error and (bottom) root-mean-square error with BP network model and no forecast from day 290 to 365 2011 at station BJFS. (b) Same as Figure 8a but at station WUHN. (c) Same as Figure 8a but at station KUNM.
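The "no forecast" reference, in which the current observation is taken as the forecast for the next one, can be sketched as below. The function names and the toy series are hypothetical, and the series is assumed to be gap-free and evenly sampled at 30 min.

```python
import math

def persistence_forecast(tec):
    """'No forecast' baseline: the forecast for step k+1 is the observation at step k.

    Returns aligned (observed, forecast) lists, each one element shorter than tec.
    """
    return tec[1:], tec[:-1]

def rms_error(observed, forecast):
    """Root-mean-square error in the units of the input (TECU)."""
    total = sum((f - o) ** 2 for o, f in zip(observed, forecast))
    return math.sqrt(total / len(observed))

# Toy series (hypothetical values in TECU)
observed, forecast = persistence_forecast([10.0, 12.0, 15.0, 14.0])
baseline_rms = rms_error(observed, forecast)
```

Any forecast model should be scored against this baseline on the same days, since at a 30 min lead time persistence is already highly correlated with the observations.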

6 Conclusions

This study explores an ionospheric TEC forecasting algorithm based on the RBF neural network. The performance of the network model is evaluated by comparison with a BP network model in terms of the given error criteria. The results show that the proportion of forecast errors within ±1.5 TECU is above 90% at stations BJFS and WUHN and decreases to 75% at the low-latitude station KUNM. The mean relative error is not more than 9%. Moreover, the RMS error is small at BJFS, but it increases at low latitudes; for instance, its maximum at station KUNM remains below 5 TECU. The main reason for the larger errors at KUNM is that it is a low-latitude station, where the ionosphere experiences larger electron density gradients than at the other stations.

It is difficult to find an objective criterion for evaluating the capacity of the BP and RBF networks in detail, owing to their different characteristics. From the RBF and BP neural network forecast errors, it is fair to claim that both network models give acceptable performance during the study period. This performance indicates that the RBF network is a reliable alternative tool for single-station ionospheric TEC forecasting.

In this study, about 70% of the data from the original 6 month data set were used for training, and about 30% were used for testing the network model. This division helps to prevent overtraining and supports generalization of the results. The trained model was also used to predict data for January 2012, and the errors were then estimated. A maximum mean relative error of 5% and a maximum RMS error of 2 TECU indicate acceptable prediction. However, future work will involve the collection of more data for both geomagnetically quiet and disturbed conditions to test the performance of the proposed RBF network model and to further develop a regional or global TEC forecast model.

Acknowledgments

We greatly thank the three reviewers, whose detailed suggestions improved the quality of the paper. This work is supported by the National Natural Science Foundation of China (grant 41104096). We are very grateful to the IGS and the Chinese crust deformation monitoring network for providing GPS data and ephemerides, and we also acknowledge MathWorks Company for providing the open source codes, which served as valuable references.
