Abstract
[1] A three-dimensional (3D) Global Assimilative Ionospheric Model (GAIM) is currently being developed by a joint University of Southern California and Jet Propulsion Laboratory (JPL) team. To estimate the electron density on a global grid, GAIM uses a first-principles ionospheric physics model and the Kalman filter as one of its possible estimation techniques. Because of the large dimension of the state (i.e., electron density on a global 3D grid), implementation of a full Kalman filter is not computationally feasible. Of the possible suboptimal implementations of the Kalman filter, we have chosen a band-limited Kalman filter in which a full time propagation of the state error covariance is performed, but the covariance is always kept sparse and banded. The effectiveness of ground GPS data for specifying the ionosphere is assessed by assimilating slant total electron content (TEC) data from 98 sites into the GAIM Kalman filter and validating the electron density field against independent measurements. A series of GAIM analyses is presented and validated by comparisons to JPL's global ionospheric maps (GIM) of vertical TEC (VTEC) and measurements from TOPEX. A statistical evaluation of GAIM and GIM against TOPEX VTEC indicates that GAIM accuracy is comparable or superior to that of GIM.
1. Introduction
[2] The increasing reliance of our civilization on space technologies has made it clear that creating a “space weather” monitoring capability that provides timely and accurate space environment observations, specifications, monitoring, and forecasting is essential for the safe operation of various defense and commercial systems. Space weather can affect power grids at middle and high latitudes, disrupt communication systems, and degrade the performance of navigation and reconnaissance systems. The degree of success in creating such a “space weather” system depends mostly on (1) the ability to obtain global and continuous measurements related to the space environment and (2) the ability to incorporate these various measurements into a physical model in a self-consistent manner.
[3] The state of monitoring and forecasting space weather today can be compared to that of conventional weather monitoring and forecasting almost half a century ago, when observations were fragmentary in space and time and means of interpreting them were rudimentary. The global and continuous observations obtained in the lower atmosphere (e.g., from weather satellites and radiosondes), the ability to obtain these observations in a timely manner, and the advances made in global weather modeling and in data assimilation algorithms are the main factors that have brought numerical weather prediction (NWP) models to their current level of success.
[4] On the space environment front, we are witnessing a new era. Significant efforts are being planned to collect further information on solar activities and disturbances in the magnetosphere, and data on the upper atmosphere and ionosphere/plasmasphere are becoming truly global and continuous. A case in point is the Global Positioning System (GPS), in which a global network of over 100 ground receivers and regional networks of hundreds to over 1000 receivers created the unprecedented possibility of producing global maps of vertical total electron content (TEC) and ionospheric irregularities in near-real time, updated subhourly [Pi et al., 1997; Mannucci et al., 1998]. Moreover, within the next few years the number of flight receivers tracking GPS in a limb-viewing geometry for ionospheric occultations [Hajj et al., 1994; Leitinger et al., 1997; Hajj and Romans, 1998; Schreiner et al., 1999] will increase to nearly a dozen, providing an extremely dense global set of horizontal cuts through the ionosphere and allowing for accurate 4D global mapping of electron density [Hajj et al., 2000]. This data set, along with other data such as UV airglow radiances [e.g., Dymond et al., 2001] from current and future missions, provides a truly unprecedented global coverage of the upper atmosphere and ionosphere.
[5] A long-term objective of our research is to develop, validate, and use in operational and research modes a Global Assimilative Ionospheric Model (GAIM) capable of assimilating a variety of data types including: (1) slant TEC (the integral of electron density along the transmitter-receiver line of sight) measurements from GPS ground receivers, (2) change in TEC measurements taken from a low-Earth orbiter (LEO) tracking GPS satellites at positive and negative elevations (i.e., during GPS-LEO occultations), (3) in situ measurements of electron density, and (4) UV airglow radiances, which are related to the state in a nonlinear manner. Similar to neutral atmospheric weather models (which assimilate, solve for, and predict 4D fields (3 spatial and 1 temporal) of the atmospheric state parameters such as temperature, specific humidity, and wind), GAIM assimilates, solves for, and predicts the electron density in the ionosphere and some of the underlying forcing functions (“drivers”) such as production rates, dynamo electric fields, thermospheric neutral densities, temperatures, and winds. In doing this, GAIM applies two different techniques: (1) the Kalman filter or some approximation thereof, and (2) a 4D variational (4DVAR) technique. The former technique is used to solve for the electron density in space and time without attempting to solve for or adjust the “drivers.” The 4DVAR technique solves for the “drivers” from which the electron density is obtained by solving the ionospheric model equations. Although the two approaches are currently disjoint, they can be combined in an operational scenario where the Kalman filter is used to estimate the initial electron densities, while the 4DVAR is used to estimate the drivers and to produce a prediction for the next data assimilation cycle. The 4DVAR technique is described elsewhere [Rosen et al., 2001; Pi et al., 2003]. Background description of GAIM can also be found in the work of Hajj et al. [2000] and Wang et al. [2004]. Other background information on the application of stochastic inverse theory and the Kalman filter to ionospheric mapping can be found in the work of Fremouw et al. [1992] and Howe et al. [1998, and references therein].
[6] In this paper, our focus is on the use of the Kalman filter for estimating the ionospheric electron density state and its implementation. Even though the current GAIM is capable of assimilating a number of data sources as listed above, we limit the scope of this study to assimilating ground TEC measurements from a network of 98 globally distributed stations. In doing so, we are following the general tradition and “wisdom” of the NWP community, which introduces new measurements into numerical weather models only after very careful examination and much evaluation. The reason is that each data set has its own nuances and characteristics, and it could influence the data assimilation output in both positive and negative ways. Therefore optimal assimilation of any data type requires careful tuning of its error covariance, proper evaluation of the data representation errors, examination of the effect of the data on the analysis and its covariance, and examination of the consistency of the assumptions used in the Kalman filter and its solution.
[7] The paper is organized as follows. In section 2 we review the formulation of the Kalman filter. In section 3 we discuss some practical considerations related to the full Kalman filter, such as memory requirements and number of operations, and introduce the band-limited Kalman filter. In section 4 we describe the University of Southern California (USC)/Jet Propulsion Laboratory (JPL) GAIM physics model and its solution grid. In section 5 we present examples of ionospheric specifications from GAIM analyses for 22–24 May 2002 and validation results against global ionospheric maps (GIM) and TOPEX. A conclusion is given in section 6.
2. Kalman Filter
[8] We introduce the following definitions (commonly used in NWP) [e.g., Ghil and Malanotte-Rizzoli, 1991; Daley, 1991]:
 x_{k}^{t}
true state, a discrete representation of the true ionospheric state (density) at time k;
 x_{k}^{a} = 〈x_{k}^{t}/m_{k}^{o}, x_{k}^{f}〉
analysis, an estimate of x_{k}^{t} given measurements at time k, and a forecast x_{k}^{f};
 x_{k}^{f} = 〈x_{k}^{t}/m_{k−1}^{o}〉
forecast, an estimate of x_{k}^{t} given measurements up to time k − 1.
[9] The observations m_{k}^{o} are assumed to be related linearly to the true state x_{k}^{t} through an observation operator H_{k} via the equations
 m_{k}^{o} = H_{k}x_{k}^{t} + ε_{k}^{o} (1)
 ε_{k}^{o} = ε_{k}^{m} + ε_{k}^{r} (2)
where ε_{k}^{o} is the observational error, which is composed of the measurement error, ε_{k}^{m}, and a representation error, ε_{k}^{r}. The latter is due to the discretization in time and space of the solution for the ionospheric state (for a description of TEC representation error, see Hajj et al. [2000]). For TEC measurements the relation between the observations and the state is already linear. A linearization procedure might be required to relate the true state at time k + 1 to the true state at time k, which can then be written in the form
 x_{k+1}^{t} = Ψ_{k}x_{k}^{t} + ε_{k}^{q} (3)
where Ψ_{k} is a forward model, which can be represented in a matrix form and ε_{k}^{q} is a process noise which reflects our uncertainty in the forward model. A linearization procedure is not required in our case since our dynamical model is already linear, as we shall explain later.
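The linearity of the TEC observation operator follows from slant TEC being a line integral of electron density. A minimal sketch, assuming a hypothetical discretization in which each row of H holds the path length of one ray inside each voxel; the array values and the names `H`, `density`, and `slant_tec` are illustrative and are not taken from GAIM:

```python
import numpy as np

# Hypothetical example: 2 rays crossing a 4-voxel state.
# H[r, i] = length (m) of ray r inside voxel i; zero where the ray misses the voxel.
H = np.array([
    [100e3, 200e3, 0.0, 0.0],
    [0.0, 150e3, 150e3, 50e3],
])

# Electron density per voxel in e/m^3 (illustrative values only).
density = np.array([1e11, 5e11, 3e11, 1e10])

# Discretized line integral: slant TEC in e/m^2 is the matrix-vector product H x.
slant_tec = H @ density

# Convert to TEC units (1 TECU = 1e16 e/m^2).
slant_tec_tecu = slant_tec / 1e16
```

Because each measurement is a weighted sum of voxel densities, no linearization of the observation operator is needed for this data type.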
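[10] If M_{k}, R_{k}, and Q_{k} are used to denote the measurement, representation, and process noise covariances, respectively, then the Kalman filter can be summarized by the following set of equations:
 x_{k}^{a} = x_{k}^{f} + K_{k}(m_{k}^{o} − H_{k}x_{k}^{f}) (4)
 K_{k} = P_{k}^{f}H_{k}^{T}(H_{k}P_{k}^{f}H_{k}^{T} + R_{k} + M_{k})^{−1} (5)
 P_{k}^{a} = P_{k}^{f} − K_{k}H_{k}P_{k}^{f} (6)
 x_{k+1}^{f} = Ψ_{k}x_{k}^{a} (7)
 P_{k+1}^{f} = Ψ_{k}P_{k}^{a}Ψ_{k}^{T} + Q_{k} (8)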
K is known as the Kalman gain, and P^{a} and P^{f} are the analysis and forecast covariances, respectively. The vector (m_{k}^{o} − H_{k}x_{k}^{f}) is known as the innovation vector, and it represents the observation vector minus the predicted observations based on the forecast. The Kalman filter was first introduced by Kalman [1960] and Kalman and Bucy [1961] for linear systems of ordinary differential equations. An overview of the use of the Kalman filter for meteorology can be found in the work of Ghil and Malanotte-Rizzoli [1991] and Daley [1991].
[11] In the data assimilation process, during a given time step (indexed by k in equations (1)–(8)) the state is assumed to be constant (time steps are taken to be 12 min in our analysis below). According to the Kalman formalism, at time t_{0} (the center of the first time interval), given a forecast (initial) state, x_{0}^{f}, a forecast state error covariance, P_{0}^{f}, and a set of observations, m_{0}^{o} (collected in the interval [t_{0} − Δt, t_{0} + Δt]; Δt = 6 min in our case) with covariances R_{0} and M_{0}, an improved estimate of the state (x_{0}^{a}, the analysis) at time 0 can be obtained by adding the innovation vector operated upon by the Kalman gain to the forecast state (equation (4)). Moreover, because of the inclusion of the data during this time step, the forecast state covariance is reduced by the second term on the right-hand side (RHS) of equation (6) to give the analysis state covariance at time 0. Using a dynamical model of the ionosphere, we can then propagate the state from the first time step (0 min) to the next one (12 min) by use of the forward model, Ψ_{k} (equation (7)). Similarly, we can propagate the analysis state covariance to the next time step by use of equation (8). The propagated state and covariance serve as the forecast for the next time step (12 min), and the process repeats recursively.
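The analysis and forecast cycle described above can be sketched with dense matrices as follows. This is a minimal illustration that assumes the standard textbook forms of the gain, analysis, and propagation equations; the function names and the two-element toy state are hypothetical and do not reflect how GAIM stores or updates its covariance.

```python
import numpy as np

def kalman_analysis(x_f, P_f, m_o, H, R, M):
    """Analysis step: blend the forecast with observations taken during the step."""
    S = H @ P_f @ H.T + R + M              # innovation covariance
    # Gain K = P_f H^T S^-1, computed with a solve (S and P_f are symmetric).
    K = np.linalg.solve(S, H @ P_f).T
    x_a = x_f + K @ (m_o - H @ x_f)        # forecast plus gain times innovation
    P_a = P_f - K @ H @ P_f                # covariance reduced by the data
    return x_a, P_a

def kalman_forecast(x_a, P_a, Psi, Q):
    """Forecast step: propagate state and covariance with the forward model."""
    return Psi @ x_a, Psi @ P_a @ Psi.T + Q

# Toy example: two-element state, one measurement of the sum of the elements.
x_f = np.array([1.0, 2.0])
P_f = np.eye(2)
H = np.array([[1.0, 1.0]])
R = np.array([[0.5]])                      # representation error covariance
M = np.array([[0.5]])                      # measurement error covariance
m_o = np.array([4.0])

x_a, P_a = kalman_analysis(x_f, P_f, m_o, H, R, M)
x_next, P_next = kalman_forecast(x_a, P_a, 0.9 * np.eye(2), 0.1 * np.eye(2))
```

In the toy case the trace of the covariance drops from 2 to 4/3 after the analysis, which is the reduction attributed in the text to the second term on the RHS of equation (6).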
[12] The process noise, Q, in equation (8) reflects our uncertainty in the forward model and forms one of the most crucial inputs to the Kalman filter. If Q is too small, then P_{0}^{a} and P_{0}^{f} will become too small as more data points are assimilated. This would eventually result in giving unjustifiably large weight to the model, causing the data to have no impact. Conversely, if Q is too large, the information in P_{0}^{a}, which carries in it our knowledge of the state based on the physics and previous data, would be lost. Choosing the correct Q requires much care in understanding the limitations of the model under different physical conditions (e.g., during magnetically quiet or disturbed periods); furthermore, for Q to be optimal, it must have the right information regarding the correlation between the state elements. As a first step, for the analysis shown in the subsequent sections, we choose an ad hoc Q_{k} with diagonal elements σ_{i}^{2} = (10^{10} + 0.2 × N_{i})^{2}, where N_{i} is the electron density in voxel i in units of e/m^{3}. The additive term ensures that voxels with small electron densities have error bars that are not too small. The multiplicative term serves the same purpose but for voxels with large electron densities. At this point we offer no justification for this choice of Q_{k} other than to say that after testing several additive and multiplicative values, the ones listed here seemed to give better postfit residuals and better agreement of the analysis with independent data.
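The ad hoc diagonal process noise above is simple to evaluate. A minimal sketch, in which the function name `process_noise_diag` and its keyword arguments are hypothetical labels for the additive and multiplicative terms quoted in the text:

```python
import numpy as np

def process_noise_diag(density, additive=1e10, fraction=0.2):
    """Diagonal of Q: sigma_i^2 = (additive + fraction * N_i)^2, N_i in e/m^3."""
    sigma = additive + fraction * np.asarray(density, dtype=float)
    return sigma ** 2

# Illustrative densities spanning an empty, a moderate, and a dense voxel.
q_diag = process_noise_diag([0.0, 1e11, 1e12])
sigma = np.sqrt(q_diag)   # 1e10, 3e10, and 2.1e11 e/m^3
```

The additive term sets the floor (1e10 e/m^3 for an empty voxel), while the 20% term dominates once the density exceeds 5e10 e/m^3.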
[13] The process of assimilating data continuously and propagating the model at each time step in the manner described above is formally known as continuous data assimilation or 4D data assimilation (not to be confused with 4DVAR) [Daley, 1991]. In continuous data assimilation the philosophy is that even if the initial condition and/or the model are imperfect, the accumulation of data will gradually force the model integration to the true ionospheric state. In continuous data assimilation the analysis at time t_{k} depends on all observations taken at t < t_{k}. However, it is also possible to include measurements taken at t > t_{k} when estimating the state at time t_{k} by use of 4DVAR and/or Kalman smoothing [see, e.g., Ghil et al., 1997].
[14] For the Kalman filter to be an unbiased, maximum likelihood, minimum variance estimator, the measurement and state errors need to follow Gaussian statistics and be unbiased. In that case it is possible to show that the Kalman filter estimator (x_{k}^{a}) also minimizes the cost functional [Bierman, 1977]
where the sum is over all the measurements during step k. This equality can be used to check the consistency of our assumptions on the magnitude of the state and measurement error covariances.
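For orientation, the quadratic cost functional minimized by the Kalman analysis is conventionally written as below. This is the textbook form expressed in the notation of this section and is offered as an assumption; it is not necessarily identical to the authors' equation (9). With uncorrelated measurement errors the second term reduces to a sum over the individual measurements.

```latex
\begin{align}
J(x) &= (x - x_k^f)^{T} (P_k^f)^{-1} (x - x_k^f)
      + (m_k^o - H_k x)^{T} (R_k + M_k)^{-1} (m_k^o - H_k x) \\
J(x_k^a) &= d_k^{T} \left( H_k P_k^f H_k^{T} + R_k + M_k \right)^{-1} d_k,
      \qquad d_k = m_k^o - H_k x_k^f \\
E\left[ J(x_k^a) \right] &= p_k
\end{align}
```

Here d_k is the innovation vector and p_k is the number of measurements in step k. The last line holds only if the assumed covariances match the actual error statistics, so a minimum cost that is persistently far from p_k suggests that P_k^f, R_k, or M_k is mis-scaled.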
3. Approximations to the Kalman Filter
[15] Since one of the main purposes of ionospheric data assimilation is to produce an ionospheric specification or prediction that is useful for space weather applications, timeliness, where the analysis can keep pace with the data, is a key factor for a practical implementation of the Kalman filter. Because of the large dimension of the state (i.e., the number of volume elements or voxels used to represent the ionospheric state, which is of order N = 10^{5} to 10^{6}), the full Kalman filter may not be computationally feasible in a timely manner. This is true because of memory storage limitations and the number of computations required. Saving the state covariance in memory requires saving N^{2} double precision numbers, or 80–8000 GB. Moreover, updating P_{k}^{f} → P_{k}^{a} (equation (6)) when assimilating M TEC measurements requires of order M × N^{2} operations, where each operation is defined as the extraction of three double precision numbers, B, C, and D, from high-speed storage, the evaluation of A = B + C × D, and the transfer of A to high-speed storage. (The same covariance update requires M × N operations when assimilating in situ measurements.) Furthermore, updating P_{k}^{a} → P_{k+1}^{f} (equation (8)) requires of order N^{3} operations. However, the latter transformation (equation (8)) can be reduced to order c_{1}N^{2} operations, where c_{1} is roughly constant, by taking advantage of the fact that diffusion takes place along magnetic flux tubes and by using a common grid to solve the dynamical equations of the ionosphere and to solve for the ionospheric state in the Kalman filter, as described later. In our implementation of GAIM and for a time step of 12 min, c_{1} is of order 1000.
[16] To appreciate the level of computations needed to perform the full Kalman estimation, consider the following example. A subset of 98 GPS stations from the continuously operational global network operated by the International GPS Service (IGS) collects nearly 700 5-min-averaged line-of-sight TEC measurements every 12 min. Assimilating these measurements and updating the state covariance every 12 min requires of order (700 + 1000) × N^{2} operations. An Intel chip with 2 GHz speed at best performs only 2 × 10^{9} operations (as defined above) per second. For N = 10^{5}–10^{6} and for a whole day run, this translates into 2 × 10^{15}–2 × 10^{17} operations or 12–1200 days. High-resolution operational numerical weather prediction models solve for ∼10^{7}–10^{8} variables (temperature, water vapor, and zonal and meridional wind components on a 1° × 1° grid at 30–60 pressure levels). This makes it clear why the full Kalman filter is prohibitive, even on the fastest parallel computers available to date. This is also why the meteorological community has devised numerous approaches/approximations to the Kalman filter including optimal interpolation [e.g., Lorenc, 1981], partitioned Kalman, various reduced Kalman filters, and band-limited Kalman. Upon evaluation of these various options, we opted for the band-limited Kalman filter for reasons given below.
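The storage and run-time figures quoted above follow from a few lines of arithmetic. A small sketch that reproduces them, assuming 8-byte doubles and 120 twelve-minute steps per day; the function name `full_kalman_cost` and its parameters are hypothetical:

```python
def full_kalman_cost(n_voxels, n_meas=700, c1=1000, step_min=12, ops_per_sec=2e9):
    """Order-of-magnitude cost of a one-day full Kalman run."""
    steps_per_day = 24 * 60 // step_min                    # 120 steps of 12 min
    ops = (n_meas + c1) * n_voxels ** 2 * steps_per_day    # (700 + 1000) N^2 per step
    days = ops / ops_per_sec / 86400.0                     # wall-clock days at 2e9 ops/s
    cov_gigabytes = 8 * n_voxels ** 2 / 1e9                # N^2 doubles, 8 bytes each
    return ops, days, cov_gigabytes

ops_small, days_small, gb_small = full_kalman_cost(10 ** 5)
ops_large, days_large, gb_large = full_kalman_cost(10 ** 6)
```

For N = 10^5 this gives about 2 × 10^15 operations, roughly 12 days, and 80 GB of covariance storage; each figure grows a hundredfold for N = 10^6.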
[17] The different options in implementing a Kalman filter present various types of difficulties. For instance, in a partitioned Kalman filter, one splits the region being estimated (the ionosphere) into numerous smaller manageable regions where the Kalman filter is applied separately, and then information is exchanged across boundaries. However, implementation of this approach presents serious problems for assimilating data integrated across large regions such as line-of-sight TEC taken at low elevations and limb TEC or UV data. In a partitioned Kalman filter, data going across different partitions have to be discarded. An examination of the coverage of data (from ground and space) will make it clear that one will most likely end up throwing away the data that could be most valuable (e.g., GPS occultations or UV limb sounders) for sensing the vertical structure of the ionosphere.
[18] In a reduced Kalman filter one solves the forward model on a high-resolution grid but estimates the error covariance and a correction to the forecast on a coarse-resolution or “reduced” grid. The suitability of this approach depends on the amount and type of data available. If the data are sparse and localized (i.e., not integrated over the region being modeled), the reduced Kalman filter presents an attractive option since the assimilation of data will affect the regions being observed while having the full resolution of the forward model. However, if data are fairly dense and integrated over the region (e.g., TEC and UV), the reduced Kalman filter may present a limitation since, by construction, the observations will only yield structures at the coarse-resolution level.
[19] Our choice of a band-limited implementation of the Kalman filter is driven by our desire for maximum flexibility. In the band-limited approximation, all the Kalman steps (equations (4)–(8)) are performed as usual. However, the state error covariance is truncated such that for a given voxel i, only a subset of the entire set of voxels will have nonzero correlation (i.e., P_{ij} ≠ 0 only for some preselected j ∈ [1, N]). In the simplest example of a band-limited Kalman filter, a given voxel i will have nonzero covariance only with voxels within a specified “correlation volume” as depicted in Figure 1. In a more complicated example the nonzero correlation terms associated with a voxel i could be all the voxels that are along the same or neighboring magnetic field lines as voxel i, therefore accounting for the strong coupling normally observed along magnetic field tubes. Our implementation is very general in that the user can arbitrarily specify the “volume of correlation” associated with any voxel i. In the limit when the volume of correlation for each element covers the entire region being modeled, the band-limited filter becomes the same as the full Kalman filter.
[20] It is worth noting that the term “band-limited” is strictly valid only for a 1D system, where the elements can be indexed such that the state covariance matrix is zero everywhere outside a finite band around the diagonal. However, for higher-dimensional systems, the covariance matrix will always have nonzero elements away from the diagonal terms (Figure 2). Therefore, when referring to a band-limited Kalman filter, the term is used only in some abstract sense where in fact the actual state covariance is full, albeit very sparse.
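The point about higher-dimensional grids can be illustrated with a small correlation mask. The sketch below assumes a regular, non-periodic 3D grid flattened in row-major order and a box-shaped correlation volume given as index half-widths; the function `correlation_mask` is hypothetical and is not the GAIM implementation, which allows an arbitrary correlation volume per voxel.

```python
import numpy as np
from itertools import product

def correlation_mask(shape, half_width):
    """Boolean N x N mask: True where two voxels lie within the correlation box."""
    # Grid indices of every voxel, listed in row-major (C) order.
    idx = np.array(list(product(*[range(s) for s in shape])))
    # Per-axis index separation between every pair of voxels.
    diff = np.abs(idx[:, None, :] - idx[None, :, :])
    return np.all(diff <= np.asarray(half_width), axis=-1)

mask = correlation_mask((4, 5, 6), (1, 1, 1))   # 120 voxels, nearest neighbors only
n = mask.shape[0]
fill_fraction = mask.sum() / n ** 2             # fraction of covariance entries kept

# Largest distance of a retained entry from the diagonal of the flattened matrix.
rows, cols = np.nonzero(mask)
bandwidth = np.abs(rows - cols).max()

# Truncating a covariance P to the mask would be: P_banded = np.where(mask, P, 0.0)
```

Even with nearest-neighbor correlation only, retained entries sit up to 37 positions from the diagonal, yet just 2080 of the 14,400 entries are nonzero. The retained pattern is therefore sparse rather than a narrow band, which is the sense in which “band-limited” is used here.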
[21] The band-limited Kalman filter reduces the number of operations required to update the state error covariance matrix from N^{2} to A × N, where A is the number of voxels within the correlation volume, V_{corr}. Since, for a fixed V_{corr}, A grows linearly with N, the number of operations required to update the covariance is given by (V_{corr}/V) × N^{2}, where V is the volume of the entire region being modeled. For instance, if the correlation volume is chosen to extend ±10 degrees in longitude, ±10 degrees in latitude, and 100 km in altitude, and the total volume modeled covers the globe with a height span between 100 and 1600 km, then V_{corr}/V would equal (20/360) × (20/180) × (100/1500) ≈ 4 × 10^{−4}, making the band-limited filter more than 3 orders of magnitude faster than the full Kalman filter and reducing it to a manageable size. A realistic representation of the state covariance is paramount for obtaining accurate estimates of the state, especially when the data are sparse relative to the size of the state. The band-limited Kalman filter maintains a sensible covariance, while at the same time it reduces the number of computational steps substantially, thereby making it usable for global, medium-resolution, ionospheric runs.
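The volume ratio in this example is easy to check numerically. The sketch below evaluates the product exactly as written, and also the variant with a total extent of 10 degrees in each horizontal direction; the variable names are illustrative.

```python
# Correlation box of 20 deg x 20 deg x 100 km inside a 360 deg x 180 deg x 1500 km shell.
ratio_pm10 = (20 / 360) * (20 / 180) * (100 / 1500)
speedup_pm10 = 1 / ratio_pm10

# Variant: correlation box of 10 deg x 10 deg x 100 km (total extent, not half-width).
ratio_10 = (10 / 360) * (10 / 180) * (100 / 1500)
speedup_10 = 1 / ratio_10
```

The first product is about 4.1 × 10^-4 (a factor of roughly 2400), while the second is about 1.0 × 10^-4 (a factor of roughly 9700), so the savings are between three and four orders of magnitude depending on how the 10-degree extent is interpreted.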
4. Forward Model
[22] A detailed description of the GAIM physical model is given in the work of Pi et al. [2003]. In summary, we solve the conservation of mass and momentum equations for a plasma, which account for production, loss, and transport of the major ionization species in the F region (O^{+}). These equations can be written as
where n is the ion number density; V is velocity; P and L are production and loss rates, respectively; k_{B} is Boltzmann's constant; T is temperature; M is molecular mass; g is gravitational acceleration; c is the speed of light; E and B are electric and magnetic fields, respectively; ν is the collision frequency for momentum transfer between the atomic oxygen ion and the neutral particles; and U is neutral wind. An equation similar to equation (11) can be obtained for the electrons, and after ignoring terms that are multiplied by the electron's mass, we obtain
In addition, we also have
[23] The ion and electron densities are obtained by solving the above equations and making use of the empirical or parameterized models of the thermosphere (mass spectrometer and incoherent scatter radar (MSIS)) [Hedin, 1991], thermospheric winds (Hedin wind model (HWM)) [Hedin et al., 1996], solar EUV (SERF2) [Tobiska, 1991], and electric fields [e.g., Heppner and Maynard, 1987; Fejer et al., 1991]. Given all the driving forces, it should be clear that equations (10)–(13) are linear in the ion and electron densities. This linearity is broken once more ions are introduced or the conservation of energy equation is added. Using a single-ion model without solving the energy balance equation is not uncommon among ionospheric models. The simplifications and speed offered by the linearity of such a model may well justify its use for data assimilation purposes. One should keep in mind that, in the presence of data, the estimation process will still yield an accurate solution for the electron densities, even in regions where the background given by the model may be in error.
[24] Traditionally, these dynamical equations are rewritten in a moving Lagrangian coordinate frame [e.g., Bailey et al., 1993]. The motion of this coordinate frame is dictated by the plasma drift perpendicular to the geomagnetic field lines. This approach introduces significant computational efficiency by transforming a time-dependent partial differential equation in a 3D space into a family of time-dependent ordinary differential equations in a 1D space following the moving flux tubes. However, this approach also introduces significant complications for data assimilation since the measurements are taken in 3D space across different field lines (e.g., TEC from ground-to-satellite or satellite-to-satellite links), making the mapping between data and the model parameter space (equation (1)) very difficult to construct. This, in principle, can be overcome by using two frames: a Lagrangian frame used to solve the dynamical equations (10)–(13) and an Eulerian frame where a set of voxels, fixed in space and time, are used to solve for the Kalman filter equations (4)–(8). Ion and electron densities in the two frames can be related to each other by means of interpolation. We refer to this approach as the dual-frame approach.
[25] A more elegant and efficient approach is to solve both the Kalman filter equations and the dynamical equations in the same Eulerian frame. In this case the volume elements used to discretize the dynamical equations and to perform the Kalman filter are the same, and they are defined by the intersection of constant magnetic field lines, constant magnetic potential lines, and constant magnetic longitudes. We refer to this approach as the single-frame approach.
[26] The USC/JPL GAIM uses the single-frame approach, where Earth's magnetic field is modeled by an eccentric tilted dipole (Figure 3). There are two main advantages to the single-frame over the dual-frame approach. (1) The dual-frame approach requires interpolation of the densities back and forth between the two frames at each time update in the Kalman filter. (2) The time update matrix [Ψ_{k}]_{ij} (used in equations (7) and (8)) is by definition equal to the partial derivatives ∂n_{i}(k + 1)/∂n_{j}(k), where n_{i}(k + 1) and n_{j}(k) are the densities in voxel i at time k + 1 and voxel j at time k, respectively. In the single-frame approach it is possible to analytically construct this matrix of partial derivatives and directly compute it. In the dual-frame approach, additional complications arise in trying to relate the partials of the densities in the Lagrangian frame to those in the Eulerian frame. In the face of these complications, one might have to construct [Ψ_{k}]_{ij} by perturbing n_{j}(k), solving the dynamical forward equations to propagate the state to time k + 1, and then computing the change in n_{i}(k + 1). This has to be done for one voxel at a time, therefore requiring as many forward runs as the dimension of the state. Since the single-frame approach uses the same voxels to solve the dynamical and Kalman equations, it is possible to explicitly form the matrix Ψ_{k} without having to run the forward model. This represents a substantial time saving and makes the implementation of the Kalman filter more feasible.
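The perturbation procedure that the single-frame approach avoids can be sketched as follows. The code assumes a generic forward operator supplied as a Python callable; the function `transition_by_perturbation` and the 3-voxel toy model `A` are hypothetical stand-ins, not the GAIM forward model.

```python
import numpy as np

def transition_by_perturbation(forward, n_ref, delta):
    """Build Psi[i, j] = d n_i(k+1) / d n_j(k) column by column by finite differences."""
    n_ref = np.asarray(n_ref, dtype=float)
    base = forward(n_ref)                     # unperturbed forward run
    psi = np.empty((base.size, n_ref.size))
    for j in range(n_ref.size):               # one extra forward run per voxel
        perturbed = n_ref.copy()
        perturbed[j] += delta
        psi[:, j] = (forward(perturbed) - base) / delta
    return psi

# Toy linear forward model on 3 voxels: decay plus exchange between neighbors.
A = np.array([
    [0.90, 0.05, 0.00],
    [0.05, 0.90, 0.05],
    [0.00, 0.05, 0.90],
])
n_ref = np.array([1e11, 2e11, 3e11])          # reference densities in e/m^3
psi = transition_by_perturbation(lambda n: A @ n, n_ref, delta=1e9)
```

For a linear model the recovered matrix equals `A` up to round-off, but the cost is N + 1 forward runs, which is impractical for N of order 10^5. Forming Ψ_{k} analytically on the shared grid removes that cost.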
5. Results and Validation
[27] The USC/JPL GAIM model is now able to assimilate four major data types: absolute slant TEC measurements from ground GPS receivers, change in TEC data from GPS occultations, in situ electron density measurements, and UV airglow radiances. Here we present results from GAIM runs when assimilating only ground GPS-based TEC measurements using the band-limited Kalman filter for the period 22–24 May 2002. This period was chosen to assess the performance of GAIM on quiet days (22 and 24) and a disturbed day (23). For each day, nearly 200,000 GPS TEC measurements, sampled at 1 measurement every 5 min, were available from 96–98 GPS receiver sites using an elevation cutoff of 10 degrees (Figure 4). These TEC measurements are based on dual-frequency phase measurements leveled to the pseudorange with the GPS instrumental biases determined using the JPL GIM technique [Mannucci et al., 1998].
[28] A cross section at a constant geomagnetic longitude of the grid used in the GAIM run is shown in Figure 3. Table 1 summarizes the boundary of the region used in the GAIM run, the vertical and horizontal resolutions of the grid, and the correlation lengths used in each of the radial, longitudinal, and latitudinal directions. The resolution specified is only approximate since the p − q grid (Figure 3) does not map into a uniform grid in geomagnetic latitude, longitude, and height. The correlation volume is intentionally kept small to reduce the number of nonzero off-diagonal terms in the state error covariance matrix and thereby speed up the assimilation run.
Table 1. Specifications of the Grid and Correlation Used in the Data Assimilation Run
Parameter  Value 

Modeled region longitude range  0–360° 
Modeled region latitude range  −85–85°N 
Modeled region altitude range  100–1500 km altitude 
Latitude resolution  5° 
Longitude resolution  15° 
Altitude resolution  80 km 
Total number of volume elements (voxels)  13,107 
Correlation length in latitude  5° 
Correlation length in longitude  15° 
Correlation length in height  80 km 
[29] Table 2 summarizes the geomagnetic conditions for each day. All 3 days assumed the same E × B drift climatology (that of June solar maximum conditions), MSIS for the neutral densities and temperatures, and HWM for neutral wind. In presenting our results below, we distinguish between two different GAIM runs: (1) GAIM climatology, which refers to the GAIM 3D densities obtained by running the GAIM model without assimilating any data; and (2) GAIM analysis, which refers to the GAIM 3D densities obtained by assimilating the ground TEC data described above. In both cases, GAIM yielded a 3D specification of electron density every 12 min for the entire 3 days considered. These can be integrated vertically to create global 2D maps of VTEC. The GAIM analyses of electron density were validated in two ways: (1) comparison of the GAIM VTEC maps to the GIMs computed from the same ground GPS TEC data, and (2) comparison of GAIM VTEC values to independent TOPEX measurements.
Table 2. Specifications of the Physics Input for the Three Days in May 2002
Date  F_{10.7}  ap 

22 May 2002  185.6  8 
23 May 2002  184.8  78 
24 May 2002  193.9  2 
5.1. Validation Against GIM
[30] GIM is a mapping technique which assumes a thin shell ionospheric model at 450 km. The details of the technique are described in the work of Mannucci et al. [1998]. In a nutshell, GIM solves for VTEC using a basis set of bicubic splines with local support on a spherical shell. The 2D spherical grid is fixed in solar-magnetic coordinates (magnetic local time). By mapping line-of-sight TEC measurements to VTEC at the ionospheric shell piercing point, GIM solves for VTEC on the grid using a square root information filter (SRIF) [Bierman, 1977]. SRIF is equivalent to the Kalman filter but uses the square root of the inverse covariance in order to improve the condition number (reflected in the ratio of its largest to smallest eigenvalues) of matrices and therefore help numerical stability. GIM does not make use of any dynamical model, and therefore it is entirely data driven. In regions where there are no data (e.g., gaps in Figure 4), GIM relies on persistence in time to obtain a solution for VTEC. (More specifically, the a priori VTEC value at a given vertex is set equal to its value from the previous time step with a covariance that grows according to a first-order Gauss-Markov process.) Since the stations are rotating underneath the solar-magnetic reference frame in which the grid is defined, nearly all vertices will have some links going through them during the span of a few hours.
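The slant-to-vertical mapping at the shell piercing point can be illustrated with the common single-layer obliquity factor. This is a sketch under stated assumptions: the text does not specify the mapping function GIM actually uses, so the geometric form below, the mean Earth radius of 6371 km, and the function names are all illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean Earth radius

def thin_shell_mapping(elevation_deg, shell_height_km=450.0):
    """Ratio of slant to vertical TEC for a thin shell at the given height."""
    ratio = (EARTH_RADIUS_KM * math.cos(math.radians(elevation_deg))
             / (EARTH_RADIUS_KM + shell_height_km))
    return 1.0 / math.sqrt(1.0 - ratio ** 2)

def slant_to_vertical(slant_tec, elevation_deg, shell_height_km=450.0):
    """Map a line-of-sight TEC value to VTEC at the shell piercing point."""
    return slant_tec / thin_shell_mapping(elevation_deg, shell_height_km)

factor_zenith = thin_shell_mapping(90.0)    # ray straight up: no scaling
factor_cutoff = thin_shell_mapping(10.0)    # ray at the 10-degree elevation cutoff
```

At the 10-degree cutoff the factor is roughly 2.5, so any error in the assumed shell height or any horizontal gradient along the ray is amplified most for low-elevation links.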
[31] Since GIM is a straightforward interpolation of the GPS TEC data using a 2D shell model, it serves as a proxy for the information content of the GPS data set. GIM matches the TEC data (mapped to vertical) quite well near the GPS sites. However, GIM interpolation is less accurate at distances greater than 1000 km from the nearest site. Moreover, because of the thin shell model used by GIM, horizontal structures in the ionosphere can potentially create artifacts in the VTEC maps, therefore reducing their accuracy near strong gradient regions. Both of these limitations need to be remembered as we compare GAIM to GIM.
[32] To perform the comparison, a GIM global map is updated every 15 min for the 22–24 May 2002 period and interpolated to the 12-min GAIM output times for the same period. Owing to space limitations, we show the comparison between GAIM and GIM at only one time frame near the middle of the period considered (23 May 1100 UTC), which exhibits features that are representative of all the other time frames.
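As an illustration of bringing the 15-min maps onto the 12-min time grid, the sketch below interpolates linearly in time between the two bracketing maps. Linear interpolation is an assumption made here for simplicity, since the interpolation scheme is not specified; the function name and the one-cell example maps are hypothetical.

```python
def interpolate_maps(t_min, step_min, maps):
    """Linearly interpolate gridded VTEC maps to time t_min (minutes).

    maps[k] is a flat list of grid values valid at time k * step_min.
    """
    last_t = step_min * (len(maps) - 1)
    if len(maps) < 2 or not 0.0 <= t_min <= last_t:
        raise ValueError("t_min outside the span of the stored maps")
    idx = min(int(t_min // step_min), len(maps) - 2)  # earlier bracketing map
    w = (t_min - idx * step_min) / step_min           # weight of the later map
    return [(1.0 - w) * a + w * b for a, b in zip(maps[idx], maps[idx + 1])]


# Hypothetical one-cell "maps" (TECU) valid at 0, 15, and 30 min
gim_maps = [[10.0], [20.0], [40.0]]
for t in (0.0, 12.0, 24.0):  # 12-min output times
    print(t, interpolate_maps(t, 15.0, gim_maps))
```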
[33] Figure 5 shows snapshots of VTEC from GAIM and GIM, along with maps of absolute and relative differences. Both GAIM climatology (Figure 5a, obtained by running the forward model without assimilating any data) and GAIM analysis (Figure 5b, obtained by assimilating GPS TEC data) are shown. Figure 5a illustrates that GAIM climatology differs by more than 50% from reality (or at least the GIM proxy) in certain regions. By contrast, Figure 5b shows that GAIM analysis gives VTEC values that are very close to those of GIM, indicating that the TEC data are being used effectively by GAIM. Further examination of the GAIM analysis shows that GAIM reveals the equatorial anomaly more clearly and with higher resolution than GIM (compare left two panels of Figure 5b). This is an indication that the limitation induced by GIM's thin shell model is reduced or eliminated by the 3D GAIM grid. In addition, since GAIM climatology and GAIM analysis appear to be quite different, even in regions where data are sparse (e.g., in equatorial regions over the Atlantic Ocean and Africa), we conclude that the dynamics introduced by the physical modeling of GAIM play a significant role in the data assimilation.
[34] To appreciate the significance of the role of the physical model used in GAIM, consider the following. Currently, the GAIM Kalman filter only solves for the O^{+} ion and electron densities and does not adjust any of the drivers. Therefore it is expected that if data stop flowing into GAIM, the GAIM analysis will revert to the GAIM climatology on timescales ranging from minutes to several hours. The wide range of timescales is due to the fact that the production, loss, convection, and diffusion of ions respond to the various driving forces on different timescales. For example, the fast recombination rate of molecular ions causes the F1 region to disappear quickly at night, while the slower recombination rate of atomic ions causes the F2 region to last long after dusk when the radiation from the Sun stops. Even when the driving forces are only approximately correct, the dynamical model plays an important role in assigning the correct timescale at different regions and local times in the ionosphere. Effectively, when the timescale is long, the initial condition of the ion densities will have a stronger effect on the evolution of the ionosphere, therefore extending the influence of data over longer periods and larger regions. When the timescale is short, the initial condition of the ion densities will have little effect on the evolution of the ionosphere, therefore limiting the influence of data to shorter periods and smaller regions. Thus the model plays a crucial role in assigning the proper time correlation length at different local times, heights, and latitudes. This information is completely lost if one uses a constant timescale everywhere, as would be the case if no dynamical model is used to map the state or the state error covariance (equations (7) and (8)).
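The effect of the local timescale on how long data remain influential can be illustrated with a deliberately simple toy model. It assumes that, once data stop, the analysis relaxes exponentially toward climatology with a single time constant; this is not the GAIM physics, and the function name and all numerical values are hypothetical.

```python
import math


def relax_to_climatology(analysis, climatology, elapsed_min, tau_min):
    """Toy model: exponential relaxation of an analysis value toward climatology.

    tau_min is a hypothetical local timescale; in the real model the relaxation
    is governed by production, loss, convection, and diffusion.
    """
    decay = math.exp(-elapsed_min / tau_min)
    return climatology + (analysis - climatology) * decay


# One hour after the last data: short versus long timescale (values in TECU)
short_tau = relax_to_climatology(30.0, 10.0, 60.0, 10.0)
long_tau = relax_to_climatology(30.0, 10.0, 60.0, 300.0)
print(short_tau, long_tau)
```

In this toy picture, the short-timescale case has essentially returned to climatology after an hour, whereas the long-timescale case still retains most of the correction introduced by the data.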
5.2. Time Evolution During a Magnetic Storm
[35] The May period chosen for our analysis was centered around a magnetic storm. Figure 6 shows the hourly Dst for 3 days starting at 0000 UT, 22 May 2002, and indicates the onset of a magnetic storm at 1200 UT, 23 May. The 3-hour ap and Kp indices for 23 May are given in Table 3. Figure 7 shows hourly GAIM analysis of VTEC for 23 May at UT = 1200, 1300, …, 1700, a period which corresponds to the main phase of the storm. For comparison, we also show the corresponding VTEC maps at the same local time for 22 May as a proxy for the expected ionospheric features during a quiet period. Comparing the 2 days, a clear enhancement of VTEC at the southern geomagnetic equatorial region is seen at 1400 UT during the storm. Furthermore, at 1500 UT, an enhancement of the equatorial anomaly extending from 0800 to 2000 LT can be seen during the storm day relative to the quiet day. This is presumably caused by a storm-induced enhancement of the eastward electric field. Figure 8 shows cross sections of GAIM analysis of electron densities at the same universal times as in Figure 7 but only for 23 May. The cross sections are taken at 7.5° geographic longitude, and therefore the local time is given by UT + 30 min. The features seen in the VTEC maps, such as the enhancement of VTEC at 1400 UT, are clearly seen in the densities as well.
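The relation between universal time and local time used for the cross sections follows from the Earth rotating 15° of longitude per hour. A minimal sketch, assuming the quoted longitude is measured eastward (which is what a 30-min offset ahead of UT implies) and using a hypothetical function name:

```python
def local_time_hours(ut_hours, longitude_deg_east):
    """Mean local time (hours) from UT and east longitude: 15 degrees per hour."""
    return (ut_hours + longitude_deg_east / 15.0) % 24.0


# Cross-section longitude of 7.5 degrees (assumed east) at 1400 UT
print(local_time_hours(14.0, 7.5))  # 14.5, i.e., 1430 LT
```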
Table 3. Three-Hourly ap Index on 23 May 2002
Time, UT  ap  Kp 
0130  12  3− 
0430  12  3− 
0730  7  2 
1030  111  7− 
1330  179  8− 
1630  236  8+ 
1930  48  5 
2230  18  3+ 
5.3. Validation Against TOPEX
[36] VTEC below the TOPEX track derived from the dual-frequency altimeter has been used extensively as an independent data source for validation [Ho et al., 1997; Codrescu et al., 2001]. Given the precision, latitudinal coverage, and long time series of the TOPEX data, they offer a unique and powerful means of validation. Compared with VTEC derived from GPS ground data, TOPEX VTEC is an especially challenging test because the measurements are taken exclusively over the ocean, where few GPS stations exist. Also, validation against TOPEX mainly tells us how well GAIM can estimate VTEC but says nothing about how well it can estimate densities. Validation of densities will require a different data source and will be the subject of a future investigation.
[37] We start by comparing VTEC from TOPEX, GAIM climatology, GAIM analysis, and GIM. Figure 9 shows VTEC for 8 out of the 27 TOPEX tracks on 23 May 2002. Also shown in Figure 9 are the tracks of the TOPEX footprint and neighboring GPS stations used in the assimilation. We note the following features in the comparisons:
[38] 1. GAIM climatology matches TOPEX very well in some tracks (e.g., track 10) while it differs significantly in others (e.g., 13, 14, 15, 16, and 19).
[39] 2. The GAIM analysis is significantly different from the GAIM climatology and compares much better with TOPEX (visible in all tracks), indicating that GPS TEC data are being assimilated effectively.
[40] 3. The agreement between the GAIM analysis and TOPEX is quite good in many cases (e.g., 8, 14, 15, 16, and 19).
[41] 4. Whenever an equatorial anomaly appears in TOPEX, it also appears in the GAIM analysis (e.g., 9, 13, and 19), in some cases with great fidelity (e.g., 19).
[42] 5. GAIM appears to be able to capture steep VTEC gradients associated with the equatorial anomaly better than GIM (e.g., 9, 15, 16, and 19). This is presumably due to the thin shell limitation of GIM.
5.4. Statistical Comparison to TOPEX
[43] To further assess the performance of GAIM, we examine histograms of VTEC difference between GAIM climatology and TOPEX (left panels of Figure 10), GAIM analysis and TOPEX (middle panels), and GIM and TOPEX (right panels) for all TOPEX tracks during 22–24 May 2002. Statistical summaries of the histograms are given in Table 4. We emphasize that the VTEC differences between GAIM analysis and TOPEX have a standard deviation of σ_{GAIM/A} = 5.2 TEC units (TECU) (1 TECU = 10^{16} e/m^{2}) over the 3 days, which is almost 3 times smaller than the standard deviation for GAIM climatology (σ_{GAIM/C} = 13.8 TECU), roughly half that of IRI (σ_{IRI} = 9.6 TECU), and slightly smaller than that of GIM (σ_{GIM} = 5.6 TECU). Both GAIM analysis and GIM VTEC are biased low by 1–2 TECU relative to TOPEX, whereas they would be expected to be biased high, given that TOPEX measures TEC only up to its orbital altitude of 1330 km while the GPS links extend to 20,000 km altitude. One TECU translates into an altimetric range delay of ∼2 mm, which is well within the error budget of TOPEX; therefore the bias could very well be due to TOPEX.
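The conversion from TECU to altimetric range delay can be checked with the standard first-order ionospheric delay, Δr = 40.3 TEC / f² (TEC in el/m², f in Hz, Δr in m). The sketch below assumes the 13.6 GHz Ku-band frequency of the TOPEX altimeter; the function name is hypothetical.

```python
def range_delay_mm(tec_tecu, freq_hz):
    """First-order ionospheric range delay in mm for a given TEC and frequency."""
    tec_el_m2 = tec_tecu * 1.0e16           # 1 TECU = 1e16 el/m^2
    delay_m = 40.3 * tec_el_m2 / freq_hz ** 2
    return delay_m * 1.0e3                  # m -> mm


KU_BAND_HZ = 13.6e9  # assumed Ku-band altimeter frequency
print(range_delay_mm(1.0, KU_BAND_HZ))  # roughly 2 mm per TECU
```

Under this assumption, a bias of 1–2 TECU corresponds to a range error of a few millimeters at Ku band.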
Table 4. Statistics on Vertical TEC Differences Between GAIM Climatology (GAIM/C), GAIM Analysis (GAIM/A), GIM, or IRI and Those of TOPEX for the Three Days of 22–24 May 2002, Sorted by Increasing Standard Deviation^{a}
Day in May 2002  Quantity  N  Average  SD  Minimum  Maximum  RMS 
24  GAIM/A–TOPEX  3901  −1.3  4.4  −28.6  36  4.6 
22  GAIM/A–TOPEX  3763  −2  4.9  −35  41.7  5.3 
22  GIM–TOPEX  3763  −0.6  5.2  −30.7  28  5.2 
23  GIM–TOPEX  4102  −1.2  5.7  −35  29.3  5.8 
24  GIM–TOPEX  3901  −1  5.9  −36.7  36  6 
23  GAIM/A–TOPEX  4102  −1.7  6.1  −30.4  35.1  6.3 
22  IRI–TOPEX  3763  0.8  8.3  −38.2  32.5  8.3 
24  IRI–TOPEX  3901  3.3  8.8  −39.5  38.1  9.4 
22  GAIM/C–TOPEX  3763  2.3  10.8  −32  49.2  11 
23  IRI–TOPEX  4102  2.2  11.1  −48.2  56.4  11.3 
24  GAIM/C–TOPEX  3901  6.4  14.2  −35  63.9  15.6 
23  GAIM/C–TOPEX  4102  1.6  15.3  −45.5  55.8  15.4 
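The quantities listed in Table 4 can be computed from paired model and TOPEX VTEC samples as in the following sketch. It assumes the population form of the standard deviation (for N of several thousand the distinction from the sample form is negligible); the function name and the four-point example are hypothetical and are not data from this study.

```python
import math
import statistics


def difference_stats(model_vtec, topex_vtec):
    """Summary statistics of model-minus-TOPEX VTEC differences (TECU)."""
    diffs = [m - t for m, t in zip(model_vtec, topex_vtec)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = statistics.pstdev(diffs)  # population standard deviation (assumed)
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    return {"N": n, "Average": mean, "SD": sd,
            "Minimum": min(diffs), "Maximum": max(diffs), "RMS": rms}


# Hypothetical samples along a track (TECU)
stats = difference_stats([10.0, 12.0, 9.0, 15.0], [11.0, 13.0, 12.0, 14.0])
print(stats)

# RMS combines bias and scatter: RMS^2 = Average^2 + SD^2
print(math.sqrt(stats["Average"] ** 2 + stats["SD"] ** 2))
```

The same identity links the columns of Table 4: for example, an average of −1.3 TECU and an SD of 4.4 TECU give an RMS of about 4.6 TECU, as listed for GAIM/A on 24 May.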