Total electron content (TEC) forecasting by Cascade Modeling: A possible alternative to the IRI-2001

Authors


Abstract

[1] Total electron content (TEC) is one of the key ionospheric parameters in navigation and telecommunication applications. A small group at METU in Ankara has been developing data-driven models to forecast ionospheric parameters since 1990. In particular, results on forecasting TEC values one hour in advance with their Neural Network Model, METU-NN, have been reported previously. Since then, work has continued to increase the performance of the METU-NN. In this paper, the most recent advanced version of the Neural Network model, which contains Cascade Modeling based on Hammerstein systems (METU-NN-C), is introduced. To demonstrate the performance of the METU-NN-C model, the Chilbolton and the Hailsham TEC values are considered during severe Space Weather conditions in some periods of 2001 and 2002. The authors consider the METU-NN-C model as an alternative to the IRI-2001. To facilitate a comparison, IRI-2001 Hailsham TEC values are compared with those of the METU-NN-C, taking the observed values as a basis.

1. Introduction

[2] The total electron content (TEC) is the number of free electrons in a column of one square meter cross-section along a path through the ionosphere (http://www.chilbolton.rl.ac.uk/weather/tec.htm). TEC is expressed in TEC units (TECu); 1 TECu is equal to 10^16 electrons/m^2. TEC data are very important in telecommunications, radar and navigation applications, including system design, planning and operation. In practice, however, TEC data may not be readily available, particularly during disturbed conditions.

[3] The ionospheric processes are highly nonlinear, and therefore the variations of the ionospheric parameters, including the TEC, are nonlinear. As a consequence, their mathematical modeling is very difficult. For this reason, data-driven modeling such as Neural Network (NN) modeling has been employed in parallel with the physical models since 1990 [e.g., E. Tulunay, 1991; Williscroft and Poole, 1996; Altinay et al., 1997; Cander et al., 1998; Wintoft and Cander, 1999; Francis et al., 2000; Y. Tulunay et al., 2001; Vernon and Cander, 2002; Y. Tulunay et al., 2004a; Y. Tulunay et al., 2004b; E. Tulunay et al., 2004a; E. Tulunay et al., 2004b; Radicella and Tulunay, 2004; Stamper et al., 2004; McKinnell and Poole, 2004; Oyeyemi et al., 2005; E. Tulunay et al., 2006].

[4] E. Tulunay et al. [2004a, 2006] have forecast the TEC values and constructed a TEC map over Europe with their METU Neural Networks Model, METU-NN. Recently, a new technique of Cascade Modeling based on Hammerstein systems (METU-NN-C) has been developed toward a further improvement of the performance of the Neural Network modeling. This has also enabled the authors to check the performance of the Hammerstein system modeling which is an interesting approach [Narendra and Gallman, 1966].

[5] The International Reference Ionosphere (IRI) is an international project sponsored by the COSPAR and URSI [Bilitza, 2001] (http://modelweb.gsfc.nasa.gov/ionos/iri/iri_members.html).

[6] The IRI-2001 Model has been adopted by the Center for Atmospheric Research of the University of Massachusetts Lowell (UMLCAR) in order to be used in MS-Windows platform (http://umlcar.uml.edu).

[7] The objective of this paper is twofold: 1) to forecast the Hailsham Global Positioning System (GPS) TEC values during the Space Weather conditions of April 2002 and to fill TEC data gaps; 2) to compare the performance of the METU-NN-C with that of the UMLCAR version of the IRI-2001 Model.

2. Construction of the METU-NN-C

[8] The METU-NN-C Model is built on the principles of the Hammerstein system. The novelty of the model mainly arises from the fact that the METU-NN-C generates the statelike variables of a nonlinear system of interest. Another novelty is the employment of the Bezier curve nonlinearity, which finds a wide usage in computer graphics applications [Bézier, 1972; Rogers and Adams, 1990; Senalp et al., 2006b; Senalp, 2007].

[9] Figure 1 illustrates the concept of a Cascade Model in accordance with the system as introduced by Hammerstein [Narendra and Gallman, 1966]. A Hammerstein system consists of a nonlinear static block cascaded with a linear dynamic block, a structure that has been successful in system identification [Narendra and Gallman, 1966]. The authors adapted this concept to fulfill their objectives. In contrast to black-box models, including standalone Neural Network models, the statelike internal variables introduced here are transparent to system developers and operators, which is an important feature. In addition, it is possible to separate the system into nonlinear static and linear dynamic blocks [Fruzzetti et al., 1997; Ikonen and Najim, 1999; Westwick and Kearney, 2000; Bai and Fu, 2002]. Cascade modeling thus provides the statelike internal variables in addition to the forecast values, whereas standalone black-box forecast models supply no state information: operators can access only their inputs and forecast outputs.

Figure 1.

Architecture of the Hammerstein system-based modeling.
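The Hammerstein structure described above can be illustrated with a minimal Python sketch. It is not the METU-NN-C implementation: the tanh nonlinearity, the finite impulse response (FIR) form of the linear block and the names `static_nonlinearity`, `hammerstein_output` and `fir_coeffs` are assumptions chosen only to show a static nonlinear block feeding a linear dynamic block, with the internal variable kept visible.

```python
import math


def static_nonlinearity(u):
    """Memoryless nonlinear block; tanh is only a placeholder choice."""
    return math.tanh(u)


def hammerstein_output(inputs, fir_coeffs):
    """Static nonlinear block followed by a linear dynamic (FIR) block.

    x(k) = static_nonlinearity(u(k))
    y(k) = sum_j fir_coeffs[j] * x(k - j)
    Samples before the start of the record are treated as zero.
    Returns both the internal variable x and the output y.
    """
    x = [static_nonlinearity(u) for u in inputs]
    y = []
    for k in range(len(x)):
        acc = 0.0
        for j, h in enumerate(fir_coeffs):
            if k - j >= 0:
                acc += h * x[k - j]
        y.append(acc)
    return x, y


# Hypothetical input record and linear-block coefficients
internal, output = hammerstein_output([0.0, 1.0, 2.0], [0.5, 0.5])
```

Returning `internal` alongside `output` mirrors the point made in the text: the intermediate variable between the two blocks is available for inspection, unlike in a black-box model.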

[10] Figure 2 illustrates the METU-NN-C, which consists of two modules, i.e., METU-NN and METU-C modules. The METU-NN-C Model uses the METU-NN in order to model the nonlinear part of the Hammerstein system.

Figure 2.

Architecture of the Cascade Model (METU-NN-C).

[11] Cubic splines have been used to represent the Hammerstein system nonlinearities [e.g., Dempsey and Westwick, 2004]. In this work, Bezier curves are chosen to represent the static nonlinearity in the TEC METU-NN-C forecast model because of the drawback reported for cubic splines [Rogers and Adams, 1990]. The Bezier curves are constructed in terms of defining polygons, which provide local control in representing the nonlinearity. In contrast to cubic splines, Bezier curves do not need to pass through the existing data points, they can be of higher order, and they can respond to any small change around a defining polygon point [Bézier, 1972; Rogers and Adams, 1990].

[12] After the METU-NN module of Figure 2 estimates the statelike internal variables, the static nonlinearity and the dynamic linearity of the METU-C module can be determined by using the Cascade Modeling technique.

[13] After the architecture of the METU-NN-C has been designed, the following procedure is undertaken until the model becomes ready for operation:

[14] 1. With some representative input data the model is trained. The METU-NN module is employed during the first part of the training in order to supply the estimates of the internal variables to the METU-C module. Then, the METU-C module is employed during the rest of the training.

[15] 2. The output of step 1 is validated repeatedly during training. Training continues while the “validation error during training” keeps decreasing, and it is stopped once that error has reached a minimum and tends toward increasing values. These intermittent validation trials are important: without them, continued training would drive the training error toward zero; in other words, the model would memorize the training data and lose its generalization capability.

[16] 3. After this “training” phase, the model is ready for the “validation” phase during operation, which is conducted with some representative independent data [Y. Tulunay et al., 2004a; Senalp et al., 2006a]. METU-NN is not used in validation during operation; the METU-C module blocks are used during validation.

[17] 4. Following the “validation” phase, the model is ready for the next operational purposes [Y. Tulunay et al., 2004a].
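The stopping rule of step 2 can be sketched as a generic early-stopping loop. This is only an illustration under stated assumptions: `train_step` and `validation_error` are hypothetical callables standing in for one training pass and one evaluation on the validation data, and the `patience` threshold is an assumption, since the text only says that training stops when the validation error tends toward increasing values.

```python
def train_with_early_stopping(train_step, validation_error,
                              max_epochs=100, patience=3):
    """Train until the validation error stops improving.

    train_step:        callable performing one training pass (hypothetical)
    validation_error:  callable returning the current validation error
    patience:          consecutive non-improving checks tolerated (assumed)
    Returns the epoch index and value of the lowest validation error seen.
    """
    best_error = float("inf")
    best_epoch = -1
    checks_without_improvement = 0
    for epoch in range(max_epochs):
        train_step()
        err = validation_error()
        if err < best_error:
            best_error = err
            best_epoch = epoch
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= patience:
                break  # validation error is rising: stop before memorization
    return best_epoch, best_error


# Toy usage with a made-up validation error sequence
errors = iter([5.0, 3.0, 2.0, 2.5, 2.7, 3.0, 1.0])
best_epoch, best_error = train_with_early_stopping(
    lambda: None, lambda: next(errors), max_epochs=7, patience=3)
```

In a real training run the model parameters at `best_epoch` would also be saved and restored; that bookkeeping is omitted here to keep the sketch short.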

[18] The model can be employed in practical applications such as the forecast of the TEC values one hour in advance.

[19] The internal architecture of the METU-NN statelike variable estimator module consists of six neurons in one hidden layer. The activation functions of the METU-NN in the hidden layer are hyperbolic tangent sigmoid functions and the activation function in the output layer is a linear function, so that the hidden layer outputs represent the static part of the statelike internal variables. The parameters, or weights, of the METU-NN module are optimized by using the Levenberg-Marquardt Backpropagation Algorithm during the training phase [Hagan and Menhaj, 1994; Haykin, 1999].
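A forward pass through a network of this shape can be sketched as follows. The six tanh hidden neurons and the linear output follow the description above; the five inputs correspond to the input variables of Table 1. The weights here are random placeholders, not trained values, and the Levenberg-Marquardt training itself is not shown. The function and variable names are assumptions for illustration.

```python
import math
import random

N_INPUTS = 5   # u1..u5 of Table 1
N_HIDDEN = 6   # six hidden neurons, as stated in the text


def forward(u, w_hidden, b_hidden, w_out, b_out):
    """One hidden tanh layer followed by a single linear output neuron.

    Returns the hidden-layer outputs (used in the text as estimates of
    the statelike internal variables) and the scalar network output.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, u)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    output = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return hidden, output


# Untrained placeholder weights, seeded for repeatability
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-1, 1)

# Hypothetical normalized input vector
hidden, tec_forecast = forward([0.4, -1.0, 0.0, 0.2, 0.9],
                               w_hidden, b_hidden, w_out, b_out)
```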

[20] Once the METU-NN module has been developed, it is used in developing the METU-C static nonlinearity subblock. The input data preparation phase is simplified in the case of Cascade Modeling, since the first block of the METU-C is a static one. The METU-C static nonlinearity subblock uses the hidden layer outputs of the METU-NN module as estimates of the internal variables when optimizing its own parameters. Then, by using the outputs of the METU-C static nonlinearity subblock and their past values as inputs to the METU-C dynamic linearity subblock, the parameters of the METU-C dynamic linearity subblock are optimized. Briefly, the parameters of the cascaded static nonlinear block and dynamic linear block in the METU-C module are optimized in this “training phase”. The Levenberg-Marquardt optimization method is used in training the METU-C. As explained before, memorization is prevented by using “validation data within training” and by terminating the training process when the gradient of the validation error approaches zero.

[21] The equations in Table 1 represent the input parameters used during the design, training and validation phases of the construction of the METU-NN-C: TEC values observed at 10-minute intervals (equation (1)) and the trigonometric components of time (equations (2)–(5)). The output is the forecast TEC value one hour in advance. Since the inputs are at 10-minute intervals, the output is also at 10-minute intervals. Minute values and day values alone do not give full information for representing time: it is important to take into account the adjacency of the last minute of the present day and the first minute of the next day. For this purpose, trigonometric components of the minute of the day are used as inputs. Similarly, trigonometric components of the day of the year are used in order to represent the temporal information.

Table 1. Input Parameters Used Throughout at the METU-NN-C Model
Input Variable | Explanation | Equation No.
u1(k) = f(k) | The present value of the TEC; k: present hour and minute of the time of interest (e.g., k = 3 + (5/60) is for the time 3h 05m on 1 April 2002) | (1)
u2(k) = Cm | Minute of the day (Cm = −cos(2πm/1440)) | (2)
u3(k) = Sm | Minute of the day (Sm = sin(2πm/1440)) | (3)
u4(k) = Cd | Day of the year (Cd = −cos(2πd/366)) | (4)
u5(k) = Sd | Day of the year (Sd = sin(2πd/366)) | (5)
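The time inputs of equations (2)–(5) can be computed as in the short sketch below. The formulas are those of Table 1; the function names and the small distance helper are assumptions added only to show why the trigonometric form keeps the last minute of one day adjacent to the first minute of the next.

```python
import math


def time_inputs(minute_of_day, day_of_year):
    """Return (Cm, Sm, Cd, Sd) as defined in equations (2)-(5) of Table 1."""
    angle_m = 2.0 * math.pi * minute_of_day / 1440.0
    angle_d = 2.0 * math.pi * day_of_year / 366.0
    return (-math.cos(angle_m), math.sin(angle_m),
            -math.cos(angle_d), math.sin(angle_d))


def minute_distance(m1, m2):
    """Euclidean distance between two minutes of day in (Cm, Sm) space."""
    c1, s1, _, _ = time_inputs(m1, 1)
    c2, s2, _, _ = time_inputs(m2, 1)
    return math.hypot(c1 - c2, s1 - s2)


# 23:59 and 00:00 differ by 1439 as raw minute counts,
# but are neighbours in the encoded space.
wrap_gap = minute_distance(1439, 0)
noon_gap = minute_distance(720, 0)
```

With this encoding, `wrap_gap` is as small as the distance between any other pair of consecutive minutes, while midnight and noon sit at opposite points of the unit circle.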

[22] The next step is the construction of the Bezier curves to represent the static nonlinearities. To achieve this task, the inputs, up(k), are to be normalized first. Following this, the Bezier curve representations of the internal variables of the METU-C, i.e., xq(k), are given below by the equations (6) and (7).

[Equation (6)]
[Equation (7)]

where R is the number of inputs; (m + 1) is the number of defining polygon points; up(k) are the normalized input variables; and Bpi are the coefficients to be determined.
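Since equations (6) and (7) are not reproduced in this text, the following Python sketch uses the standard Bernstein-polynomial form of a Bezier curve, which is consistent with the symbols defined above (R inputs, m + 1 defining polygon points, normalized inputs up(k), coefficients Bpi). It should be read as an assumed form, not as the authors' exact equations; in particular, normalization of the inputs to the interval [0, 1] and the summation over all inputs are assumptions.

```python
from math import comb


def bernstein(m, i, u):
    """Bernstein basis polynomial of degree m, index i, for u in [0, 1]."""
    return comb(m, i) * u**i * (1.0 - u)**(m - i)


def bezier_internal_variable(u_norm, B):
    """Assumed Bezier representation of one internal variable.

    u_norm: list of R normalized inputs, each in [0, 1]
    B:      B[p][i] is the coefficient for input p and polygon point i,
            so each row has m + 1 entries
    Returns sum over p and i of B[p][i] * bernstein(m, i, u_norm[p]).
    """
    total = 0.0
    for p, u in enumerate(u_norm):
        m = len(B[p]) - 1
        total += sum(B[p][i] * bernstein(m, i, u) for i in range(m + 1))
    return total


# Hypothetical coefficients: R = 2 inputs, m + 1 = 3 polygon points each
B = [[1.0, 2.0, 4.0],
     [0.5, 0.5, 0.5]]
x_at_ends = bezier_internal_variable([0.0, 1.0], B)
x_inside = bezier_internal_variable([1.0, 0.3], B)
```

The example reflects two properties mentioned in the text: the curve is anchored only at the first and last polygon points, and interior points shape it without being interpolated.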

[27] Equation (8) shows the output y(k), which is represented by using a dynamic linearity, that is, a linear relationship of the internal variables, xq(k), and their past values, xq(k − j).

[Equation (8)]
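Equation (8) itself is not reproduced in this text. A generic linear form consistent with the description in paragraph [27] would be the following, where the number of internal variables Q, the memory depth n and the coefficients a_{qj} are hypothetical symbols introduced here for illustration only and may differ from the authors' notation:

```latex
y(k) = \sum_{q=1}^{Q} \sum_{j=0}^{n} a_{qj}\, x_q(k-j)
```

In this assumed form the output is linear in the coefficients a_{qj}, which is what makes the dynamic block separable from the static nonlinearity during identification.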

3. The Data

[28] The TEC values are obtained from the GPS measurements of the Chilbolton (51.8° N; 1.26° W) and Hailsham (50.9° N; 0.3° E) stations at 10-minute intervals. The time of the measurement used in the input space (the TEC input value) corresponds to the input hour and minute. The Chilbolton data are used during the “training” and the “validation during training” phases. On the other hand, the Hailsham data are used for the “validation during operation” phase only.

[29] When the METU-C is run fully, it is expected that a computed forecast value of TEC one hour in advance will be obtained.

[30] Table 2 illustrates the data organization.

Table 2. Data Organization for the METU-NN-C and IRI-2001
Ionospheric Station (Geographic Coord.) | METU-NN-C Phase | METU-NN-C Data Coverage | IRI-2001 Phase | IRI-2001 Data Coverage
Chilbolton (51.8°N; 1.26°W) | TRAINING | 1 April–31 May, 2000 | NA | NA
Chilbolton (51.8°N; 1.26°W) | Validation during TRAINING | 1 April–31 May, 2001 | NA | NA
Hailsham (50.9°N; 0.3°E) | Validation during OPERATION | 1 April–31 May, 2002 | Validation during OPERATION | 18–19 April, 2002

[31] In this work, an approach for forecasting TEC data for data gaps is presented as well. If there is a data-gap then the METU-NN-C model searches the previous available TEC value before the gap of interest and forecasts the TEC values one by one to fill the data gap. Then, it compares the resultant TEC value with the observed TEC value one day before. If the deviation is not within a predetermined safety margin, such as 20 TECu, then it takes the arithmetic average and gives the resultant value as the TEC forecast. There are data gaps on 3–4 April 2002, 15–17 April 2002 and 15 May 2002 within the Hailsham TEC data. Since the observed TEC data are not present at the gaps, gap-filling results cannot be compared with observations. Thus the forecast TEC values for the gaps are not taken into account in the performance analysis. However, they are presented without observations at gaps in the appropriate figures in the Results section as well.
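The gap-filling procedure described above can be sketched as follows. This is one reading of the text under stated assumptions: `forecast_next` is a hypothetical callable standing in for the METU-NN-C forecast, gaps are marked with `None`, the "one day before" value is taken 144 samples earlier (10-minute cadence), and a simple step-by-step forecast from the last available value replaces the model's actual one-hour-ahead scheme.

```python
SAMPLES_PER_DAY = 144  # 10-minute cadence


def fill_gaps(tec, forecast_next, margin=20.0):
    """Fill None entries of a TEC series one sample at a time.

    tec:           list of TEC values in TECu, with None marking gaps
    forecast_next: callable mapping the previous value to a forecast
                   (hypothetical stand-in for the forecast model)
    margin:        safety margin in TECu (20 TECu is the text's example)
    """
    filled = list(tec)
    for k in range(len(filled)):
        if filled[k] is not None:
            continue
        if k == 0 or filled[k - 1] is None:
            continue  # no earlier value to start from; leave the gap
        estimate = forecast_next(filled[k - 1])
        # Compare with the observation one day earlier, if it exists
        day_before = tec[k - SAMPLES_PER_DAY] if k >= SAMPLES_PER_DAY else None
        if day_before is not None and abs(estimate - day_before) > margin:
            estimate = 0.5 * (estimate + day_before)  # arithmetic average
        filled[k] = estimate
    return filled


# Toy usage: persistence forecast, two-sample gap after a jump to 50 TECu
series = [10.0] * SAMPLES_PER_DAY + [50.0, None, None]
result = fill_gaps(series, lambda previous: previous)
```

In the toy series the first gap sample deviates from the previous day's value by more than the margin and is averaged with it, while the second falls within the margin and is kept as forecast.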

4. Results

4.1. Forecast TEC Values 1 Hour in Advance by Using METU-NN-C

[32] The METU-NN-C forecast model produced the one hour in advance forecast TEC values at 10-minute intervals for April and May 2002. Figures 3, 4, 5, and 6 show the hourly samples of the observed and 1 hour in advance forecast TEC values for the Hailsham Station. In addition, the TEC forecast values for the data gaps are presented without observed TEC data at gaps in the figures. However, the forecast TEC values for the gaps are not taken into account in the performance analysis. The performance measure is given in terms of percent normalized errors. In this case, it is 5.51%, which can be accepted as a reliable forecast. It corresponds to an average absolute forecast error of 1.11 TECu. Having an absolute forecast error less than 2 TECu is important for practical applications [E. Tulunay et al., 2006; Senalp, 2007]. One of the previous successful models of the authors, METU-NN, gave 6.95% normalized error in forecasting TEC values for the same time period and for the same station [E. Tulunay et al., 2004a; Senalp, 2007].

Figure 3.

Hourly samples of the observed GPS TEC values (solid) and hourly samples of one-hour-ahead Forecast TEC values by METU-NN-C (dots) for 1–14 April 2002 at Hailsham.

Figure 4.

Hourly samples of the observed GPS TEC values (solid), and hourly samples of one-hour-ahead Forecast TEC values by METU-NN-C (dots) for 15–30 April 2002 at Hailsham.

Figure 5.

Hourly samples of the observed GPS TEC values (solid), and hourly samples of one-hour-ahead Forecast TEC values by METU-NN-C (dots) for 1–14 May 2002 at Hailsham.

Figure 6.

Hourly samples of the observed GPS TEC values (solid), and hourly samples of one-hour-ahead Forecast TEC values by METU-NN-C (dots) for 15–31 May 2002 at Hailsham.

4.2. The IRI-2001 Model TEC Forecast Outputs and the METU-NN-C Model TEC Forecasts to Facilitate a Comparison

[33] The IRI-2001 program has some constraints: it returns only a single value for a single spatial and temporal coordinate, at 15-minute intervals, whereas the TEC forecasts by METU-NN-C are at 10-minute intervals. It is therefore not practical to obtain IRI-2001 TEC values for the Hailsham station over the whole period of interest in order to facilitate a comparison between the METU-NN-C and IRI-2001 TEC. Instead, a new forecast run is carried out for only two days of Hailsham GPS-TEC data, 18 and 19 April 2002, at 30-minute intervals.

[34] Figure 7 shows the diurnal variation of the observed Hailsham GPS-TEC values on 18 and 19 April 2002. Superimposed on this curve are 1) the TEC output of the IRI-2001 Model; 2) the forecast TEC of the METU-NN-C Model.

Figure 7.

Observed GPS TEC values for disturbed solar-terrestrial conditions (solid), IRI-2001 TEC outputs (dash dotted) and one-hour-ahead Forecast TEC values by METU-NN-C (large dots) for 18–19 April 2002 at Hailsham.

[35] The METU-NN-C forecast TEC values visibly follow the nonlinear variation of the observed TEC values closely, whereas the IRI-2001 results are mostly greater in magnitude than both the observed and the forecast TEC values. In this context, the discrepancy between the observed TEC values and the TEC outputs of the IRI-2001 is noted.

[36] Table 3 gives the normalized error values and the cross correlation coefficients between the observed GPS TEC and forecast TEC of the METU-NN-C; and between the observed GPS TEC and the TEC output of the IRI-2001.

Table 3. Performance Measure in Terms of the Error and Cross-Correlation Coefficient Values on the Forecast Hailsham TEC 1 Hour in Advance by Using the METU-NN-C and IRI-2001 for the Time Period of 18–19 April 2002
Measure | METU-NN-C | IRI-2001
Normalized Error (%) | 20.04 | 204.1
Cross Correlation Coefficient (×10^−2) | 98.7 | 83.8
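The performance measures used in this section can be computed along the following lines. The text does not give the formula for the percent normalized error, so the definition below (sum of absolute errors divided by the sum of absolute observed values) is an assumption; the cross-correlation coefficient is computed as the ordinary Pearson coefficient. The sample values are made up and are not data from the paper.

```python
import math


def mean_absolute_error(observed, forecast):
    """Average absolute forecast error, in the units of the data (TECu)."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)


def percent_normalized_error(observed, forecast):
    """Assumed definition: 100 * sum|o - f| / sum|o|."""
    abs_err = sum(abs(o - f) for o, f in zip(observed, forecast))
    return 100.0 * abs_err / sum(abs(o) for o in observed)


def cross_correlation(observed, forecast):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(forecast) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, forecast))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov / math.sqrt(var_o * var_f)


# Made-up example values in TECu
obs = [10.0, 20.0, 30.0]
fc = [11.0, 19.0, 33.0]
mae = mean_absolute_error(obs, fc)
nerr = percent_normalized_error(obs, fc)
corr = cross_correlation(obs, fc)
```

Note that a high correlation coefficient alone does not imply a small error: a forecast that is a scaled copy of the observations correlates perfectly while its normalized error can be large, which is why both measures are reported together.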

[37] Figures 8 and 9 show the scatter diagrams of the observed TEC and METU-NN-C TEC forecast values; and the observed TEC and the TEC output of IRI-2001, respectively. When the scatter diagrams are compared it is seen that the deviations of the scatter points are smaller for the results of METU-NN-C with Bezier curve nonlinearity. The best fit line in the scatter diagram for the results of the METU-NN-C has a slope near 45° and passes through the origin. Thus the system reached the correct operating point within the system identification by METU-NN-C and the forecasting errors are small. Also, the METU-NN-C learned the shape of the inherent nonlinearities. Thus the deviations from a straight line are small in the scatter diagram, and the cross-correlation coefficients are very close to unity at the significance level of α = 0.05.

Figure 8.

Scatter diagram with best-fit line for observed TEC values and one-hour-ahead forecast TEC values by METU-NN-C for 18–19 April 2002 at Hailsham.

Figure 9.

Scatter diagram with best-fit line for observed TEC values and IRI-2001 TEC outputs for 18–19 April 2002 at Hailsham.
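The best-fit lines in the scatter diagrams can be obtained by ordinary least squares, as in the sketch below. The fitting method is an assumption, since the text does not state how the lines were computed, and the sample points are made up. A slope near 1 (a 45° line) with an intercept near zero indicates forecasts that track the observations without systematic bias.

```python
import math


def best_fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observed and forecast values
observed = [1.0, 2.0, 3.0]
forecast = [1.1, 1.9, 3.0]
slope, intercept = best_fit_line(observed, forecast)
angle_deg = math.degrees(math.atan(slope))  # 45 degrees means slope 1
```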

5. Conclusions

[38] Reliable operations of radio communication as well as navigation systems and spacecraft control systems largely depend on reliable information concerning the ionospheric parameters such as the TEC values.

[39] In this work, 1) TEC forecasts are realized by the Cascade Model based on Hammerstein system modeling, METU-NN-C. The static nonlinearity of the METU-NN-C is represented by Bezier curves. The TEC values at data gaps are also forecast by using the model developed. 2) TEC forecasts 1 hour in advance by using METU-NN-C Model are presented in order to facilitate a comparison with the TEC output of the IRI-2001 Model during the high sun-spot number year of 2002.

[40] The performance of the forecasts is quantified in terms of percent normalized errors and cross-correlation coefficients.

[41] The results have shown that the data-driven approach, as demonstrated by the METU-NN-C Model, is more versatile and has qualitative and quantitative performance advantages, provided that representative data are available.

[42] In the future, the performance of international reference models such as the IRI-2001 could be improved by introducing METU-C modules into the model of interest and by making some adaptations.

[43] In conclusion, it has been demonstrated that the identification of the complex nonlinear processes, such as the TEC variation, can be achieved with high accuracy by cascading a static nonlinear block of Bezier curve representations and a linear dynamic block. Using intelligent techniques and representative data, Cascade Models are successfully employed in identification of ionospheric processes.

Acknowledgments

[44] This work is partially supported by the EU action of COST 296 (Mitigation of Ionospheric Effects on Radio Systems)—The Scientific and Technological Research Council of Turkey Research Project: TUBITAK 105Y003. The GPS-TEC data are kindly provided by Lj. R. Cander.
