[1] Data assimilation processes aim to combine measurement data with background models in an optimal way. In anticipation of the availability of global radio occultation (RO) measurements, a computationally practical data assimilation technique for combining RO data with background ionospheric models has been implemented, and simulations have been conducted to assess the utility of the technique. In simulations where tomographic images provide the truth data and the Parameterized Ionospheric Model (PIM) provides the background, a fourfold decrease in the electron density error at 300 km altitude was achieved. A global assimilation simulation has also been conducted using the International Reference Ionosphere as the truth data. For a constellation of eight RO satellites, a factor of four decrease in the vertical total electron content RMS error has been demonstrated. The same simulation also results in a factor of three decrease in the NmF2 RMS error and a halving of the hmF2 RMS error.

[2] Comprehensive, global, and timely specifications of Earth's atmosphere (particularly refractivity profiles of the troposphere and ionosphere) are required to ensure the effective operation, planning, and management of many radio frequency systems. Although many ground-based techniques have been developed to measure atmospheric refractivity, radio occultation (RO) methods are being increasingly investigated [Hoeg et al., 1999]. Terrestrial RO conventionally involves monitoring transmissions from Global Positioning System (GPS) satellites using receivers on low Earth orbiting (LEO) satellites and provides the potential of measuring refractivity profiles in regions where ground-based sensors cannot easily be located, such as deep sea waters.

[3] Unfortunately, the measurements are generally both underdetermined and sparsely distributed, and consequently it is necessary to constrain inversions of RO data. This may be done by making assumptions about the atmosphere (e.g., that it is spherically symmetric), by using a limited number of functions to represent the atmosphere (e.g., empirical functions or spherical harmonics), or by assimilating the data into a background model of the atmosphere. This paper describes the application of data assimilation techniques to ionospheric RO measurements. The emphasis is on a practical approach that enables RO measurements from an eight-satellite LEO constellation to be assimilated efficiently on a single-processor personal computer.

2. Standard Techniques

[4] The standard technique for inverting RO measurements to provide vertical profiles of refractivity relies on the Abel transform [Hajj and Romans, 1998; Schreiner et al., 1999]. Using this technique, bending angles (derived from the excess Doppler shift observed on the GPS signals) can be inverted to provide vertical refractivity profiles with very high vertical resolution (on the order of the width of the first Fresnel zone) but with poor horizontal resolution. The geometry of the measurement results in the measured bending angles containing information from a large region of the atmosphere because the near-horizontal rays from the GPS to the LEO may remain in the atmosphere for many hundreds of kilometers. Therefore, in the presence of atmospheric structures, the assumption of spherical symmetry required for the Abel transform fails, and the calculated vertical profile will be in error. Furthermore, in order to apply the standard Abel transform, the measurement must lie close to the orbital plane of the LEO, and ideally both the transmitter and receiver should be in free space. Consequently, the number of measurements available from an operational system is restricted.

3. Data Assimilation

[5] The limitation of the spherical symmetry assumption inherent in the Abel transform can be overcome by employing techniques which incorporate RO data into a background model of the environment. The output (also known as the analysis) of a data assimilation process aims to combine measurement data with a background model in an optimal way [Rodgers, 2000]. It is necessary to include a background model because the information that can be extracted from RO measurements is low compared to the required resolution of the electron density field under investigation (i.e., the problem is mathematically underdetermined). Since both the observations and the background model contain errors, it is not possible to find the true state of the environment; instead, the best statistical estimate of the state must be found. Such techniques are also well suited to sparse data sets. Best linear unbiased estimator (BLUE) and related variational (one-, three-, and four-dimensional) data assimilation techniques have been used in meteorology for a number of years and have recently been applied to neutral atmosphere RO measurements [Healy and Eyre, 2000; Kuo et al., 2000]. Work has also been conducted on applying such techniques to ionospheric inversion [Wang et al., 1999; Pi et al., 2001].

[6] The following sections describe a BLUE algorithm that has been used to directly modify an electron density grid produced by an ionospheric model. Not all monthly median models are suitable for the background since they may smear out ionospheric structures. The Parameterized Ionospheric Model (PIM) [Daniell et al., 1995] was chosen because it is derived from parameterized physical models and therefore provides a representation of ionospheric structures. PIM is also well suited to RO applications, as it contains the Gallagher plasmasphere model.

4. BLUE

4.1. Observations

[7] The observation vector (y) consists of a series of p slant total electron content (TEC) measurements from the RO measurement system. Associated with each slant TEC is the time, date, and positions of both the LEO and GPS satellites. The errors associated with each slant TEC measurement are assumed to be independent, and the variances are all assumed to be equal. Therefore the observation error covariance matrix R is diagonal (with dimensions p × p) and has the same value in each diagonal element (1 × 10^{15} e^{−}/m^{2}).

4.2. Background

[8] PIM provides the background model x_{b} for the BLUE algorithm. It is run to produce values of electron density on a 4° latitude and 4° longitude grid over a geographic region that encompasses both the LEO and GPS satellites. The grid is produced at 60 altitudes ranging from 90 to 24,000 km. The vertical grid spacing ranges from 2 km in the E region to 2000 km in the plasmasphere. From this grid, those vertical columns that are intersected by, or are adjacent to columns intercepted by, rays from the GPS to the LEO are rasterized to form the background model (of dimension n).

[9] The background model error covariance matrix B is block diagonal with dimensions n × n. The background error variance at each height (i.e., the values which comprise the diagonal elements of the background error covariance matrix) is assumed to be proportional to the square of the electron density in the background model at that height. This acts to maximize the absolute errors in the F layer, where the ionosphere is most variable. The error covariances are estimated from the error variances, the vertical separation, and an estimate of the vertical scale height in the ionosphere. Only matrix elements within ±4 altitude levels of the diagonal elements of B have their covariances calculated. All other elements are assumed to be zero. For points that are separated in latitude or longitude, the covariance is further scaled by the horizontal ionospheric scale length. A scale length of four degrees has been assumed.
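The construction of B for a single vertical column can be illustrated with a minimal sketch. The source does not give the functional form of the vertical correlation, the constant of proportionality, or the scale height value, so the exponential decay, `frac_error`, and `scale_height_km` below are assumptions chosen only for illustration; the horizontal scaling between neighboring columns is omitted.

```python
import numpy as np

def background_covariance(ne, alt_km, frac_error=0.3, scale_height_km=50.0, band=4):
    """Banded background error covariance for one vertical column.

    ne              : background electron density at each altitude level
    alt_km          : altitude of each level (km)
    frac_error      : assumed ratio of error std dev to density (hypothetical value)
    scale_height_km : assumed vertical correlation scale (hypothetical value)
    band            : only elements within +/- band levels of the diagonal are filled
    """
    ne = np.asarray(ne, dtype=float)
    alt_km = np.asarray(alt_km, dtype=float)
    # Std dev proportional to density, so the variance scales with density squared.
    sigma = frac_error * ne
    n = ne.size
    B = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - band), min(n, i + band + 1)
        for j in range(lo, hi):
            # Assumed exponential fall-off of correlation with vertical separation.
            corr = np.exp(-abs(alt_km[i] - alt_km[j]) / scale_height_km)
            B[i, j] = sigma[i] * sigma[j] * corr
    return B
```

Elements more than `band` levels from the diagonal stay at zero, which mirrors the ±4 level truncation described above.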

4.3. Observation Operator

[10] The observation operator H describes the way in which the rays from the GPS satellite pass through the background model. At worst, the departure of the rays from straight line propagation can reach a few kilometers, which is much smaller than the typical vertical scale height in the F region ionosphere [Schreiner et al., 1999]. Therefore bending can be neglected. The observation operator can be constructed as a p × n matrix, where each row is associated with one point in a modeled observation. Such a point (e.g., the ith) is a measure of the slant TEC (y_{i}) along the ray and can be modeled by the sum of the products of the electron density in the jth voxel (x_{j}) and the ray length within the jth voxel (H_{ij}):

y_{i} = \sum_{j=1}^{n} H_{ij} x_{j}
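For reference, the BLUE analysis discussed in the following paragraph can be written in its standard textbook form [Rodgers, 2000]. The symbols x_{a} (the analysis) and K (the weight, or gain, matrix) are used as in the text; the explicit expression for K is the generic BLUE result and is an assumption about the exact form used in this work.

```latex
\begin{align}
x_{a} &= x_{b} + K \left( y - H x_{b} \right) \\
K &= B H^{T} \left( H B H^{T} + R \right)^{-1}
\end{align}
```

Here x_{b} has dimension n, y has dimension p, H is p × n, B is n × n, and R is p × p, so K is n × p.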

[12] It can be seen that the analysis is produced by modifying the background model (x_{b}) with the differences between the observation vector (y) and observation operator acting on the background model (Hx_{b}). Furthermore, these differences are weighted by K, which is a function of the observation operator and the error covariance matrices.
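The update described above can be sketched in a few lines of NumPy. This is a minimal illustration of the generic BLUE analysis step, not the implementation used for the results in this paper; the function name `blue_update` is hypothetical, and the gain is assumed to take the standard form K = B Hᵀ (H B Hᵀ + R)⁻¹.

```python
import numpy as np

def blue_update(x_b, B, H, y, R):
    """Return the BLUE analysis x_a = x_b + K (y - H x_b).

    x_b : background state, shape (n,)
    B   : background error covariance, shape (n, n)
    H   : observation operator (ray length per voxel), shape (p, n)
    y   : observed slant TEC, shape (p,)
    R   : observation error covariance, shape (p, p)
    """
    innovation = y - H @ x_b          # observation minus modeled observation
    S = H @ B @ H.T + R               # innovation covariance, shape (p, p)
    # Solve S z = innovation instead of forming the inverse explicitly.
    return x_b + B @ H.T @ np.linalg.solve(S, innovation)
```

Solving the p × p system rather than inverting it keeps the cost tied to the number of observations, which is small compared with the number of voxels.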

5. Simulated Measurements

[13] Tomographic images [Bernhardt et al., 1998; Pryse et al., 1998] have been used to simulate slant TEC measurements. The images used in the simulations represent north-south slices of the electron density field at a longitude of 350°E and have been described by Rogers et al. [2001]. Because each image covers only a limited latitude range, it has not been possible to simulate the whole of the RO geometry. Instead, rays have been traced from a transmitter moving vertically at 46°N from 90 to 414 km to a receiver at a fixed altitude moving from a latitude of 64°N to 54°N. This allows a simulation of typical ray paths on one side of an occultation tangent point. The measurement was simulated by integrating along rays from the transmitter to the receiver to find the slant TEC. Gaussian noise with a standard deviation of 5 × 10^{14} e^{−}/m^{2} was also added to the slant TEC measurements. This equates to receiver noise with an RMS value of ∼3 mm on the L1 frequency and ∼5 mm on the L2 frequency [Hoeg et al., 1999].
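The simulation of a noisy slant TEC measurement can be sketched as follows. The midpoint-rule integration, the Cartesian coordinates in meters, and the names `slant_tec`, `add_noise`, and `density_fn` are assumptions made for illustration; only the noise standard deviation is taken from the text.

```python
import numpy as np

NOISE_STD = 5e14  # e-/m^2, standard deviation quoted in the text

def slant_tec(density_fn, tx, rx, n_steps=1000):
    """Integrate electron density along a straight ray from tx to rx.

    density_fn : callable returning electron density (e-/m^3) at a point (hypothetical)
    tx, rx     : transmitter and receiver positions in meters (Cartesian)
    """
    tx = np.asarray(tx, dtype=float)
    rx = np.asarray(rx, dtype=float)
    path_length = np.linalg.norm(rx - tx)
    # Midpoints of n_steps equal segments along the ray.
    t = (np.arange(n_steps) + 0.5) / n_steps
    points = tx + t[:, None] * (rx - tx)
    ne = np.array([density_fn(p) for p in points])
    return ne.sum() * path_length / n_steps

def add_noise(tec, rng, std=NOISE_STD):
    """Add zero-mean Gaussian noise to one or more slant TEC values."""
    return tec + rng.normal(0.0, std, size=np.shape(tec))
```

For example, `add_noise(slant_tec(f, tx, rx), np.random.default_rng(0))` would give one simulated measurement for a density function `f`.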

[14] Figure 1 shows a tomographic image measured on 28 March 1998 at 0742 UT. At this time, Kp was 1.7, the sunspot number was 56, and the value of the solar flux was 103.6 × 10^{−22} W m^{−2} Hz^{−1}. Slant TEC measurements have been simulated by integrating along 55 nonintersecting straight line rays from the transmitter to the receiver. Every fifth ray path is shown as a solid black line. The output of PIM for the same time and geophysical conditions is shown in Figure 2. The PIM electron density grid has then been modified by the BLUE using the slant TEC data (Figure 3). The slant TEC for the simulated rays is shown in Figure 4 for the tomographic image (solid line), the unmodified electron density grid output of PIM (dotted line), and the electron density grid output of the BLUE (dashed line). The BLUE produces a slant TEC that closely matches the input data.

[15] In this example the vertical electron density profiles at a latitude of 50°N produced by the BLUE are very similar to the truth data above approximately 200 km altitude (Figure 5). Below this height the BLUE reverts to the value of the background model (PIM). This is because the largest errors in the background error covariance matrix are in the F region. Therefore the F region tends to be modified in preference to the E region.

[16] Further measurements have been simulated using an additional 70 tomographic images that were collected at a range of times and levels of geomagnetic activity. In each case the errors in the slant TEC along each ray from the transmitter to the receiver have been calculated for both the PIM electron density grid (Figure 6) and for the output of the BLUE (Figure 7). The greatest improvement in accuracy occurs around ray 35, where the standard deviation of the slant TEC errors is reduced by a factor of approximately 8, and the mean error is reduced to close to 0 e^{−}/m^{2}. Similarly, the electron density errors have been calculated for a vertical column situated at a latitude of 50°N (Figures 8 and 9). As expected, the greatest reduction in the errors exhibited by the BLUE output occurs in the F region at approximately 300 km.

6. Global Assimilation

[17] To further test the performance of the BLUE algorithm, global data assimilation simulations have been conducted. The International Reference Ionosphere (IRI-95) was used to provide the truth ionosphere with measurements generated for a radio occultation satellite constellation. The constellation consisted of eight LEOs in four orbital planes (i.e., two satellites in each plane). The ascending nodes of the orbital planes were distributed over a range of right ascensions to ensure an even spacing of the satellites. The orbits were circular, had an altitude of 750 km, and had an inclination of 70° (similar to GPS/MET). The satellite position data were calculated every second for 1 June 2000. Occultation periods were then found by searching for rays from a GPS satellite (from the currently operational GPS network) to a LEO that had a minimum altitude between 50 and 650 km.
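The occultation search criterion can be illustrated with a short geometric sketch. It assumes a spherical Earth of radius 6371 km, straight line rays, and Earth-centered Cartesian positions in kilometers; the function names are hypothetical, and the orbit propagation that supplies the positions is not shown.

```python
import numpy as np

R_EARTH_KM = 6371.0  # spherical Earth radius (assumption)

def min_ray_altitude_km(gps_pos, leo_pos):
    """Minimum altitude reached by the straight ray between GPS and LEO."""
    g = np.asarray(gps_pos, dtype=float)
    l = np.asarray(leo_pos, dtype=float)
    d = g - l
    # Parameter of the point on the segment closest to Earth's center,
    # clamped so the point stays between the two satellites.
    t = np.clip(-np.dot(l, d) / np.dot(d, d), 0.0, 1.0)
    closest = l + t * d
    return np.linalg.norm(closest) - R_EARTH_KM

def is_occultation(gps_pos, leo_pos, lo_km=50.0, hi_km=650.0):
    """True when the ray's minimum altitude lies between 50 and 650 km."""
    return lo_km <= min_ray_altitude_km(gps_pos, leo_pos) <= hi_km
```

A GPS satellite directly above the LEO gives a minimum altitude equal to the LEO altitude (750 km here), so it is rejected by the 650 km upper limit.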

[18] PIM was run to provide the initial background model. Then occultations were assimilated in 20 min time periods, during which time the ionosphere was assumed to remain unchanged. For each period, IRI was run to provide the truth ionosphere, and slant TEC measurements were simulated by integrating along straight line rays between occulting satellites. Owing to the height limitations of IRI, the integration to calculate the slant TECs was limited to an altitude of 1000 km. Furthermore, in the subsequent data assimilation process, the position of the GPS satellite was replaced by the position of the intercept between the GPS-LEO ray and a locus of a constant altitude of 1000 km. Therefore no plasmaspheric ionization was included in the simulation. As before, Gaussian noise with a standard deviation of 5 × 10^{14} e^{−}/m^{2} was added to the slant TEC measurements.

[19] Each occultation was assimilated in turn into the background model. Consequently, the analysis (x_{a}) generated by assimilating the first measurement became the background model for the next. It should be noted that the error covariance matrix of the analysis was not calculated, and therefore the background model error covariance matrix for each occultation was calculated from the background model as described previously.
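The sequential scheme described above can be sketched as a simple loop. To keep the example short, B is taken to be diagonal with variances proportional to the square of the current background (the off-diagonal terms used in this work are omitted), and `frac_error`, `obs_var`, and the function name are hypothetical. The analysis error covariance is not propagated, as in the text; B is rebuilt from the current background before each occultation.

```python
import numpy as np

def assimilate_sequentially(x_b, occultations, obs_var, frac_error=0.3):
    """Assimilate occultations one at a time.

    x_b          : initial background state, shape (n,)
    occultations : iterable of (H, y) pairs, H shape (p, n), y shape (p,)
    obs_var      : observation error variance (same for every measurement)
    frac_error   : assumed ratio of background error std dev to density
    """
    x = np.asarray(x_b, dtype=float).copy()
    for H, y in occultations:
        # Background covariance recomputed from the current state
        # (diagonal-only simplification for this sketch).
        B = np.diag((frac_error * x) ** 2)
        S = H @ B @ H.T + obs_var * np.eye(len(y))
        # The analysis becomes the background for the next occultation.
        x = x + B @ H.T @ np.linalg.solve(S, y - H @ x)
    return x
```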

[20] After all the occultations from one 20 min period had been assimilated, it was necessary to evolve the analysis in time to provide the background model for the next period of occultation assimilation. In principle, a physical model of the ionosphere should be used to predict the future state of the electron density grid from the current data assimilation analysis. However, such an approach would require both a sophisticated model and (probably) large computing resources. Therefore a highly simplified approach to the time evolution has been adopted; it was assumed that, in geomagnetic coordinates, the ionosphere remains invariant in space while Earth rotates beneath it. Thus it was assumed that the locations of the main ionospheric structures in the F region (e.g., the equatorial anomaly, the midlatitude trough, and the auroral oval) are controlled by the magnetic field.

[21] In practice, the time evolution was performed by randomly sampling the electron density at each altitude and converting the geographic coordinates of each sample to geomagnetic (a simple dipole magnetic field was used). The change in longitude corresponding to the required amount of time evolution was then subtracted from each coordinate (i.e., 15° of longitude per hour). Each sample was then converted back to geographic coordinates, and a regular latitude-longitude grid was generated by interpolation. This process was repeated for each altitude in the electron density grid. Although this last step increased the required computation time, it ensured that any errors that were introduced were randomly distributed and did not persist through the altitude range of the simulation.
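The coordinate part of this time evolution can be sketched as follows. The dipole pole location (79.5°N, 71.6°W) is an assumed approximate value, not one given in the source, and the function names are hypothetical. The random sampling and the final interpolation back onto a regular latitude-longitude grid are not shown.

```python
import numpy as np

POLE_LAT, POLE_LON = 79.5, -71.6  # assumed geomagnetic dipole pole (degrees)

def _rotation_to_magnetic():
    """Rotation matrix taking geographic unit vectors to dipole coordinates."""
    lam = np.radians(POLE_LON)
    colat = np.radians(90.0 - POLE_LAT)
    rz = np.array([[np.cos(lam), np.sin(lam), 0.0],
                   [-np.sin(lam), np.cos(lam), 0.0],
                   [0.0, 0.0, 1.0]])
    ry = np.array([[np.cos(colat), 0.0, -np.sin(colat)],
                   [0.0, 1.0, 0.0],
                   [np.sin(colat), 0.0, np.cos(colat)]])
    return ry @ rz

def _to_xyz(lat, lon):
    lat, lon = np.radians(lat), np.radians(lon)
    return np.stack([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)], axis=-1)

def _to_latlon(xyz):
    lat = np.degrees(np.arcsin(np.clip(xyz[..., 2], -1.0, 1.0)))
    lon = np.degrees(np.arctan2(xyz[..., 1], xyz[..., 0]))
    return lat, lon

def evolve_samples(lat, lon, hours):
    """Shift sample locations by 15 degrees of magnetic longitude per hour."""
    rot = _rotation_to_magnetic()
    mlat, mlon = _to_latlon(_to_xyz(lat, lon) @ rot.T)   # geographic -> magnetic
    mlon = mlon - 15.0 * hours                           # subtract the longitude change
    return _to_latlon(_to_xyz(mlat, mlon) @ rot)         # magnetic -> geographic
```

For a 20 min assimilation period, `hours = 1/3` gives a shift of 5° in magnetic longitude.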

[22] The global assimilation simulation has been run for a period of 24 hours and for two scenarios. In the first, only occultations that lie close to (within 30° of) the orbital plane of the LEO have been assimilated. This corresponds to the typical limitation of current RO missions, which restrict the data to lie along the fore or aft boresight of the satellite motion to facilitate the use of the Abel transform. The second scenario assimilated all available RO measurements, which is acceptable using a data assimilation approach. For the second scenario the vertical TEC difference between IRI and PIM at the start of the assimilation period, and between IRI and the output of the BLUE at the end of the 24 hours, are shown in Figures 10 and 11, respectively.

[23] In each case the RMS errors in the vertical TEC, NmF2, and hmF2 have been calculated for each 20 min period (Figures 12, 13, and 14). For comparison, PIM has also been run for each 20 min period, and the RMS errors between it and IRI have been calculated. It can be seen that in the first (±30°) scenario the vertical TEC error was halved, while in the second the error was reduced by a factor of four. The large errors between PIM and IRI at approximately 15 hours are mainly due to a large artifact that appears in the output of PIM at this time. The reduction of errors with time for the boresight scenario reverses at about 12 hours and then begins to recover again after 17 hours. This may be due to the simple time evolution algorithm that has been used. The reversal is not evident in the second scenario (all measurements), but the reduction of errors essentially stops at ∼12 hours. Again, this is probably due to limitations in the time evolution strategy that has been employed.

[24] Similarly, the RMS error in NmF2 was reduced by a factor of three in the second scenario (all RO measurements assimilated) and by a factor of ∼1.5 for the boresight scenario (Figure 13). The RMS error in hmF2 fell by a factor of approximately two for both scenarios (Figure 14).
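The error metrics quoted above can be computed as in the following sketch. It assumes an unweighted RMS over all grid points, since the source does not state whether any area weighting was applied; the function names are hypothetical.

```python
import numpy as np

def rms_error(estimate, truth):
    """Unweighted RMS difference between an estimate and the truth grid."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def improvement_factor(background, analysis, truth):
    """Ratio of background RMS error to analysis RMS error."""
    return rms_error(background, truth) / rms_error(analysis, truth)
```

A value of 4 from `improvement_factor` corresponds to the "factor of four decrease" wording used for the vertical TEC.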

7. Discussion and Requirements for Future Work

[25] Although the electron density errors in an ionospheric model have been successfully reduced, a number of known limitations remain in the data assimilation process as described in this paper. For example, the effectiveness of data assimilation is highly dependent on the accuracy of the relevant error covariance matrices (both observation errors and background errors). Further work is required on the specification of the error covariance matrices used for data assimilation. In particular, different values for the latitudinal and longitudinal horizontal scale lengths should be developed that can be varied for different locations. Furthermore, notwithstanding any variation in scale length, the current implementation is limited to specifying nonzero values in the background covariance matrix for only those vertical electron density columns that are intersected by, or are adjacent to columns intercepted by, rays from the GPS to the LEO. This restriction should be lifted in order to allow a greater spread of information away from the measurement site.

[26] The error covariance matrices used in BLUE have been specifically tailored for use with radio occultation data. Expansion of the process to use other data sources (such as ground-based TEC, ionosondes, etc.) will require the development of new error matrices. Ideally, the covariance matrix of the output of the data assimilation process (i.e., the errors in the analysis x_{a}) should also be calculated. This could then be used in the next assimilation for which the analysis forms the background model, thereby reducing the magnitude of the errors in areas where data had been included.

[27] Additional limitations are that a very simple approach to the problem of ionospheric time evolution has been adopted and that the use of IRI has limited the simulations to an altitude of 1000 km. Further work will be required to assess the benefits of more sophisticated time evolution techniques and to test the assimilation process with plasmaspheric data to ensure that the background error covariance matrices that have been developed remain valid at these altitudes.

[28] The global assimilation simulations have been run for a period of 24 hours. Ideally, longer simulations should be run to examine the effects of moving further away from the original background model information. However, the existing simulations suggest that, for the constellation used, the best error performance occurs between 12 and 24 hours from the start time. Operationally, this may require two assimilation processes to be run in parallel with a 12 hour staggered start time. The output from the first process would be used from 12 to 24 hours during which time the second process begins assimilating data. The output from the second process would then be used from 24 to 36 hours, while the first is reset with a new run of PIM to form the background model.
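The staggered operation suggested above amounts to a simple alternating schedule, sketched below. The function name and the "first"/"second" labels are hypothetical; the 12 hour spin-up is taken from the text.

```python
def active_process(hours_since_start, spin_up_hours=12.0):
    """Return which assimilation process supplies the output at a given time.

    The first process starts at 0 h and the second at 12 h; each is used
    once it has run for spin_up_hours and is reset after a further 12 h.
    Returns None while neither process has finished spinning up.
    """
    if hours_since_start < spin_up_hours:
        return None
    # Window 0 covers 12-24 h, window 1 covers 24-36 h, and so on.
    window = int(hours_since_start // spin_up_hours) - 1
    return "first" if window % 2 == 0 else "second"
```

Under this schedule the first process is reset with a new PIM run at 24 hours and becomes the output source again at 36 hours.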

[29] Although data from individual RO satellites are available (i.e., GPS/MET, CHAMP, etc.), an RO constellation (such as COSMIC) has not yet been launched. Consequently, it has not yet been possible to demonstrate global RO data assimilation using real RO data.

8. Conclusions and Future Work

[30] Data assimilation techniques have been investigated and applied to ionospheric RO measurements. These techniques show the potential to overcome the limitations of the Abel transform; i.e., the assumption of spherical symmetry, the requirement for the measurements to be in the LEO's orbital plane, and the requirement for the transmitter and receiver to be in free space. Tomographic images of the ionosphere have been used to simulate slant TEC measurements, which have then been used to update the PIM ionospheric model. Results for 70 such assimilations show a fourfold decrease in the electron density error at 300 km altitude. A global assimilation simulation has been run using a constellation of eight LEO RO satellites and IRI to provide the truth ionosphere. A factor of four decrease in the vertical TEC RMS error has been demonstrated.

Acknowledgments

[31] This research program has been funded by the U.K. Ministry of Defence (MOD) as part of its Corporate Research Programme.