The Utah State University Gauss-Markov Kalman Filter (GMKF) was developed as part of the Global Assimilation of Ionospheric Measurements (GAIM) program. The GMKF uses a physics-based model of the ionosphere and a Gauss-Markov Kalman filter as a basis for assimilating a diverse set of real-time (or near real-time) observations. The physics-based model is the Ionospheric Forecast Model (IFM), which accounts for five ion species and covers the E region, F region, and the topside from 90 to 1400 km altitude. Within the GMKF, the IFM-derived ionospheric densities constitute a background density field on which perturbations are superimposed based on the available data and their errors. In the current configuration, the GMKF assimilates slant total electron content (TEC) from a variable number of Global Positioning System (GPS) ground sites, bottomside electron density (Ne) profiles from a variable number of ionosondes, in situ Ne from four Defense Meteorological Satellite Program (DMSP) satellites, and nighttime line-of-sight ultraviolet (UV) radiances measured by satellites. To test the GMKF for real-time operations and to validate its ionospheric density specifications, we evaluated the model performance for a variety of geophysical conditions. During these model runs, various combinations of data types and data quantities were assimilated. To simulate real-time operations, the model ran continuously and automatically and produced three-dimensional global electron density distributions in 15 min increments. In this paper we describe the Gauss-Markov Kalman filter model and present results of our validation study, with an emphasis on comparisons with independent observations.
 The ionosphere is a highly variable environment that exhibits significant weather variations with altitude, latitude, longitude, universal time, solar cycle, season, and geomagnetic activity. This variability arises from the couplings, time delays, and feedback mechanisms that are inherent in the ionosphere-thermosphere system, as well as from the effects of the solar, interplanetary, magnetospheric, and mesospheric processes.
 Ionospheric weather disturbances can adversely affect numerous human activities and systems, including survey and navigation systems that use the Global Positioning System (GPS), over-the-horizon radars, HF communications, the tracking and the lifetime of satellites, as well as the Federal Aviation Administration's Wide-Area Augmentation System (WAAS).
 In an effort to specify ionospheric weather, numerous statistical (empirical), analytical, parameterized, and global physics-based theoretical and/or numerical models of the ionosphere have been developed. In addition, coupled models that combine different spatial domains (e.g., ionosphere-thermosphere, magnetosphere-ionosphere, ionosphere-plasmasphere) have also been developed and an overview of the recent model developments is given by Schunk et al. . Although physics-based theoretical and/or numerical models of the ionosphere reproduce many of the observed climatological features, these models generally fail in reproducing ionospheric weather. This lack of reliable specifications is largely due to a lack of reliable estimations of the ionospheric drivers. These drivers include the thermospheric composition and neutral winds, the equatorial and high-latitude electric fields as well as the high-latitude particle precipitation. Currently, the most promising models for ionospheric weather specification and prediction are data assimilation models that combine physics-based models of the ionosphere with observations. Data assimilation models have been successfully used for the past decades as a dominant tool for specifications and forecasts in meteorology and oceanography and are now running routinely at several operation centers. In the ionosphere the implementation of assimilation techniques had been slow due to the lack of sufficient ionospheric observations. However, this situation is rapidly changing, and recently, data assimilation models have also been developed for ionospheric as well as thermospheric specifications [Howe et al., 1998; Pi et al., 2003; Schunk et al., 2004; Scherliess et al., 2004; Minter et al., 2004; Codrescu et al., 2004; Bust et al., 2004; Mandrake et al., 2005].
 A powerful technique to assimilate data into a dynamical model is the Kalman filter [e.g., Gelb, 1974] and this technique was initially introduced by Howe et al.  for ionospheric data assimilation. In their study, simulated slant total electron content (TEC) observations from 51 GPS ground receivers and simulated occultation data from one LEO satellite were assimilated into a statistical dynamical model (Gauss-Markov process). Recently, with funding from a Multidisciplinary University Research Initiative (MURI), two physics-based Kalman filter, data assimilation models of the ionosphere have been developed at Utah State University (USU). Both of these models provide for a Global Assimilation of Ionospheric Measurements (GAIM) and assimilate a diverse set of real-time (or historic) measurements. The most sophisticated and computationally demanding (more than 30 CPUs) of the two USU-GAIM models is the Full Physics Kalman filter [Scherliess et al., 2004] and this model is still under development. The Full Physics Kalman filter uses a physics-based ionosphere-plasmasphere-polar wind model and employs a reduced state approximation. The second model is the Gauss-Markov Kalman filter and it uses the physics-based Ionosphere Forecast Model (IFM) and a Kalman filter in its data assimilation scheme. With this GAIM model, which can be run on a single CPU, the model variability that is used for the construction of the error covariance matrix is calculated a priori and stored in a database. Also, with the Gauss-Markov model, the ionospheric densities obtained from the IFM constitute the background ionospheric density field on which perturbations are superimposed based on the available measurements and their errors. The density perturbations and the associated errors evolve over time via a statistical Gauss-Markov process. 
Recently, the Air Force Weather Agency (AFWA) has selected this model for its operational use and the same model has been implemented at the Community Coordinated Modeling Center (CCMC). The capabilities of the operational USU-Gauss Markov Kalman filter model have been described by Schunk et al. .
 In its initial development, Schunk et al.  used the Gauss-Markov Kalman filter model to assimilate synthetic data to reconstruct the global Ne distribution, but in addition to the slant TEC measurements, in situ Ne observations from two Defense Meteorological Satellite Program (DMSP) satellites and bottomside Ne profiles from 16 digisondes were included. More recently, Schunk et al.  used the model with real observations from GPS ground receivers, digisondes, and DMSP satellites. We have used the USU Gauss-Markov Kalman filter model to study anomalous enhanced electron densities at low and middle latitudes during the March–April 2004 Climate And Weather of the Sun-Earth System (CAWSES) period. However, the model results have only been validated for several isolated case studies [e.g., Scherliess et al., 2005] and this paper will, for the first time, present an extensive validation of the model results under different geophysical conditions. The effects of the various data types on the model performance are presented by Thompson et al. .
 In the following sections, we first describe the USU Gauss-Markov Kalman filter model as well as its background ionospheric model followed by a description of the data types that can currently be assimilated by the model. Then, we describe the geophysical conditions and the data coverage of our three 24-day long validation periods and present comparisons of the model results with independent observations.
2. USU-GAIM Gauss-Markov Kalman Filter Model
 The Gauss-Markov Kalman filter is based on a physics-based model of the ionosphere and a Kalman filter data assimilation algorithm. The physics-based model is the Ionosphere Forecast Model (IFM), which covers the E region, F region, and the topside ionosphere up to 1400 km altitude [Schunk et al., 1997]. The USU-GMKF is a global model that can support regional, higher definition assimilation windows within the model specification. The higher-definition assimilation window in the regional mode can be used to provide higher resolution for regions of large data coverage, allowing the model grid resolution to be adjusted to that density. In both the global and regional modes, the latitudinal and longitudinal resolutions are adjustable. However, the resolution adopted depends on the data coverage and the computational environment, and consequently, the model is typically executed with a 15° longitudinal resolution and a 4.6° latitudinal resolution in the global mode. In the regional mode, the spatial resolution can be 3.75° in longitude and 1° in latitude if there are sufficient data to warrant such a resolution. With regard to altitude, the GMKF extends from 92 to 1400 km, which covers the E region, F region, and topside ionosphere. The spatial resolution of the output is 4 km in the E region and 20 km in the F region and above.
2.1. Gauss-Markov Kalman Filter Model
 In the USU Gauss-Markov Kalman filter the ionospheric densities obtained from the IFM constitute the background ionospheric density field on which perturbations are superimposed based on the available data and their errors. To reduce the computational requirements, these perturbations and the associated errors evolve over time with a statistical model (Gauss-Markov process) and not, as in the case of the USU Full-Physics-Based Model, rigorously with the physical model. The background ionospheric densities, however, evolve with the full physical model. As a result, the USU Gauss-Markov Kalman filter can be executed on a single CPU workstation.
 In this scheme the total electron density at each grid point can be expressed as:
$$N_e = N_{\mathrm{IFM}} + N_{\mathrm{pert}} \qquad (1)$$
where NIFM is the electron density obtained from the IFM and Npert is a perturbation density determined by the Kalman filter. The perturbation densities Npert are expressed in a geographic frame and evolve over one assimilation time step (15 min) via:
$$N_{\mathrm{pert}}(t + \Delta t) = L\,N_{\mathrm{pert}}(t) \qquad (2)$$
The transition matrix L is a product of a translation matrix L1 and a diagonal matrix L2. The matrix L1 convects the perturbation density field at each time step in a magnetic Sun-synchronous frame and the diagonal matrix L2 relaxes the perturbations to a zero value in the absence of data. In more detail, the diagonal matrix is composed of diagonal elements equal to exp(−Δt/τ), where Δt is the assimilation time step and τ is a relaxation time. The value of τ in the current version of the model is set globally to τ = 5 hours, but in future versions of the model this value can be spatially and temporally adjusted to better represent the changing geophysical conditions.
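The structure of the transition matrix can be illustrated with a minimal sketch on a single ring of longitude cells. The function name `transition_matrix`, the linear-interpolation form of L1, and the use of a plain geographic ring (rather than the magnetic Sun-synchronous frame of the actual model) are assumptions made for illustration only; the relaxation factor exp(−Δt/τ) with Δt = 15 min and τ = 5 hours follows the text.

```python
import numpy as np

def transition_matrix(n_lon, dt_min=15.0, tau_hr=5.0):
    """Return (L, decay) with L = L2 @ L1 for one ring of n_lon longitude cells."""
    dlon_deg = 360.0 / n_lon
    # A Sun-fixed pattern drifts westward by 360 deg per 24 h in a geographic frame.
    shift = (360.0 / (24.0 * 60.0)) * dt_min / dlon_deg  # fraction of one cell
    if not 0.0 <= shift <= 1.0:
        raise ValueError("time step moves the pattern by more than one cell")
    # L1 (assumed form): linear-interpolation shift; cell i takes part of its
    # new value from its eastern neighbour i + 1.
    L1 = np.zeros((n_lon, n_lon))
    for i in range(n_lon):
        L1[i, i] = 1.0 - shift
        L1[i, (i + 1) % n_lon] = shift
    # L2: diagonal relaxation with elements exp(-dt / tau).
    decay = np.exp(-(dt_min / 60.0) / tau_hr)
    L2 = decay * np.eye(n_lon)
    return L2 @ L1, decay

L, decay = transition_matrix(n_lon=24)   # 15 deg cells, as in the global mode
n_pert = np.zeros(24)
n_pert[10] = 1.0                         # unit perturbation in one cell
n_next = L @ n_pert                      # one 15-min step without data
print(round(decay, 4))                   # 0.9512
```

With these values a perturbation loses about 5% of its amplitude per 15-min step when no data are available, and in this toy geometry a quarter of it moves into the neighbouring western cell.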
 As an alternative to using the translation matrix L1, the perturbation densities could be expressed directly in a geomagnetic Sun-synchronous coordinate system. However, the advantages of a geographic reference frame in combination with L1 become apparent in the regional mode of the model. The regional assimilation window is fixed in geographic coordinates, so grid points of a Sun-synchronous frame would constantly enter and leave the assimilation region, whereas grid points of a geographic frame remain stationary within it.
 The model error covariance matrix P evolves in the Gauss-Markov Kalman filter with the same transition matrix L as the density perturbations:
$$P(t + \Delta t) = L\,P(t)\,L^{T} + Q \qquad (3)$$
LT denotes the transpose of the matrix L. The matrix Q denotes the uncertainty of the transition model and is chosen so that in the absence of data the model error covariances are given by the uncertainties in the specification of the IFM background ionospheric densities. These uncertainties are strongly dependent on the geophysical conditions and are largely due to uncertainties in the external forcing parameters of the physical model; namely at low and middle latitudes the neutral winds and densities, and the equatorial electric fields. In order to model Q, we have performed 1107 individual 2-day runs of the IFM with varying external forcing parameters. In these runs, we have specifically varied the equatorial electric fields, the thermospheric neutral winds, and the neutral temperatures and densities over reasonable ranges. To properly track the climatological variations and the changes in the location of the solar terminator, 27 two-day model runs were performed for every 9 days of the solar year. The first day of the 2-day runs was used to spin up the model, and the model results from the second day were used for the construction of Q. Finally, a Hadamard (element by element) product of Q with a Gaussian correlation function was performed to filter out spatial model correlations over larger distances [e.g., Keppenne and Rienecker, 2002]. The correlation function used in the current version of the model has a constant correlation length of 10° in latitude and decorrelates grid points in the zonal direction. However, note that at each time step self-consistent zonal correlations are established through the transition matrix L1. Similar to the choice of the relaxation time τ, the assumed correlation lengths can in future versions of the model be adjusted to better represent the changing regional and geophysical conditions.
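The covariance propagation and the Hadamard-product filtering described above can be sketched as follows. This is a toy illustration, not the model's implementation: the random ensemble stands in for the IFM run output, the Gaussian form exp(−½(Δlat/ℓ)²) with ℓ = 10° is one common convention that the text does not confirm, and the helper names are hypothetical. The sketch also checks that 27 runs on every ninth day of the year reproduce the quoted total of 1107 runs.

```python
import math
import numpy as np

# 27 runs for every 9th day of the year: 41 start dates x 27 = 1107 runs.
n_dates = math.ceil(365 / 9)
n_runs = 27 * n_dates

def gaussian_localization(lat_deg, lon_deg, corr_len_deg=10.0):
    """Gaussian correlation in latitude; zero between different longitudes."""
    dlat = lat_deg[:, None] - lat_deg[None, :]
    same_lon = np.isclose(lon_deg[:, None], lon_deg[None, :])
    return np.exp(-0.5 * (dlat / corr_len_deg) ** 2) * same_lon

def propagate_covariance(P, L, Q):
    """One forecast step of the error covariance: L P L^T + Q."""
    return L @ P @ L.T + Q

# Toy grid: 3 latitudes x 2 longitudes, flattened into one state vector.
lat = np.array([0.0, 10.0, 20.0, 0.0, 10.0, 20.0])
lon = np.array([0.0, 0.0, 0.0, 15.0, 15.0, 15.0])

rng = np.random.default_rng(0)
ensemble = rng.normal(size=(27, lat.size))   # placeholder for IFM run output
Q_raw = np.cov(ensemble, rowvar=False)       # sample covariance of the runs
C = gaussian_localization(lat, lon)
Q = Q_raw * C                                # Hadamard (element-by-element) product

P_next = propagate_covariance(np.zeros_like(Q), np.eye(lat.size), Q)
```

In this sketch the localization removes all covariance between the two longitudes and damps covariance between latitudes 10° apart by a factor of exp(−0.5), roughly 0.61.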
 GAIM can potentially assimilate a wide range of data types from numerous ground-based locations and space-based platforms. The data sources include in situ electron densities from satellites, bottomside electron density profiles from a network of ionosondes, line-of-sight Total Electron Content (TEC) measurements between as many as several thousand ground stations and the GPS satellites, TECs between low-altitude satellites with radio beacons and several ground-based receiver chains, TECs via occultations between various low-altitude satellites and between low- and high-altitude satellites, and line-of-sight UV emission data. In the current configuration, the model assimilates phase-leveled GPS slant TEC observations from several hundred ground sites located between ±60° geographic latitude, bottomside Ne profiles from about 20 ionosondes, nighttime 1356 Å UV-radiances from the Low Resolution Airglow and Aurora Spectrograph (LORAAS) aboard the Advanced Research and Global Observation Satellite (ARGOS) and in situ electron densities from up to four DMSP satellites.
 An important initial step in the assimilation of these data is the quality control of the observations. In this step obviously wrong observations are rejected and appropriate data errors are assigned. These errors consist of two parts: an instrumental error associated with the data taking, i.e., the observational error, and an error associated with the representativeness of the observation. The latter, for example, accounts for errors arising from subgrid structures observed in the data but not modeled by the assimilation model [e.g., Daley, 1991]. All data errors are assumed to be Gaussian distributed and uncorrelated with each other. The effects of the latter assumption on the Kalman filter specifications are discussed by Thompson et al. . To linearly relate the observations to the Kalman state vector, i.e., the 3-D distribution of density perturbations, we initially subtract from each measurement the corresponding IFM model value. In addition, for GPS slant TEC, a geometry-dependent correction for plasmaspheric TEC contributions and, when available, a correction for the differential biases of the GPS satellites and receivers are applied. If the receiver bias is not available for a given GPS ground receiver, the GMKF automatically augments the state vector with a variable for the missing bias and solves for it internally.
 The resulting residuals d are related to the 3-D density perturbations Npert via the measurement matrix H:
$$d = H\,N_{\mathrm{pert}} \qquad (4)$$
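To make the role of H concrete, the following sketch applies the textbook Kalman analysis step [e.g., Gelb, 1974] to a toy problem with three density cells and one unknown receiver bias appended to the state vector. The path lengths, error values, and the function name `kalman_update` are hypothetical; the update equations are the standard ones and are not confirmed by the text as the exact form used in the GMKF.

```python
import numpy as np

def kalman_update(x_f, P_f, d, H, R):
    """Standard Kalman analysis step for residuals d = H x + noise."""
    S = H @ P_f @ H.T + R                    # innovation covariance
    K = np.linalg.solve(S, H @ P_f).T        # gain P_f H^T S^-1 (S and P_f symmetric)
    x_a = x_f + K @ (d - H @ x_f)
    P_a = (np.eye(x_f.size) - K @ H) @ P_f
    return x_a, P_a

# Hypothetical ray path lengths through three cells for four slant TEC rays.
path = np.array([[1.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0],
                 [3.0, 0.0, 0.0]])
# All four rays come from one receiver with an unknown bias, so the state is
# augmented with one bias variable and H gains a column of ones.
H = np.hstack([path, np.ones((4, 1))])

truth = np.array([0.5, -0.2, 0.1, 2.0])      # three perturbations + bias
d = H @ truth                                # noise-free residuals for the demo
x_f = np.zeros(4)                            # no perturbation before the update
P_f = np.eye(4)
R = 1e-10 * np.eye(4)                        # nearly exact observations

x_a, P_a = kalman_update(x_f, P_f, d, H, R)
```

Because the toy observations are almost noise free and H is invertible, the analysis `x_a` recovers both the density perturbations and the receiver bias; with realistic errors the result would be a weighted compromise between the forecast and the data.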
 In order to linearly relate the UV radiances to the Kalman state vector, a slightly different approach is taken. Neglecting optically thick contributions to the 1356 Å radiances, which can be of importance in the topside ionosphere and below the F region peak, and setting the O+ density equal to the electron density, the 1356 Å radiances can be expressed as the integrated square of the electron density along the ray path [Dymond et al., 1997], i.e.,
$$I_{1356} \propto \int N_e^{2}\,ds \qquad (5)$$
and when discretized into N volume elements
Here dsi is the length of the ray through the volume element number i and Npertprev is the density perturbation from the last time step, or when solving equation (6) iteratively, the value from the last iteration. The first term on the right side corresponds to the IFM model value for each UV observation and the second term on the left side relates the UV radiances linearly to the new electron density perturbations, Npert. In the Kalman filter equation (6) is solved iteratively for each time step by starting initially with density perturbations from the last time step for Npertprev and then replacing them iteratively with the new values.
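The iterative linearization can be sketched as follows, assuming the radiance is α Σ (NIFM,i + Npert,i)² dsi and that the linearized term takes the form α Σ (2 NIFM,i + Npertprev,i) Npert,i dsi. The constant `alpha`, the function name, and the direct linear solve (used here in place of the Kalman update) are assumptions for illustration.

```python
import numpy as np

def solve_uv_perturbation(radiance, ds, n_ifm, alpha=1.0, n_iter=10):
    """Iterate the linearized radiance relation until n_pert settles."""
    background = alpha * ds @ n_ifm**2       # IFM-only radiance for each ray
    resid = radiance - background            # part attributed to perturbations
    n_prev = np.zeros_like(n_ifm)            # start from zero perturbation
    for _ in range(n_iter):
        # Linearized operator: row j, column i = alpha * ds_ji * (2 N_ifm_i + N_prev_i)
        H = alpha * ds * (2.0 * n_ifm + n_prev)
        n_prev = np.linalg.solve(H, resid)   # stand-in for the filter update
    return n_prev

# Hypothetical geometry: three rays crossing three volume elements.
ds = np.array([[10.0, 5.0, 0.0],
               [0.0, 10.0, 5.0],
               [5.0, 0.0, 10.0]])
n_ifm = np.array([5.0, 4.0, 3.0])
n_true = np.array([0.5, -0.3, 0.2])
radiance = ds @ (n_ifm + n_true) ** 2        # synthetic observations, alpha = 1

n_est = solve_uv_perturbation(radiance, ds, n_ifm)
```

In this toy case the iteration converges quickly because the perturbations are small compared with the background, so each pass reduces the error by roughly the ratio Npert/(2 NIFM).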
2.2. Ionospheric Background Model
 Within the Gauss-Markov Kalman filter, the plasma densities derived from the Ionosphere Forecast Model (IFM) constitute a background density field on which perturbations are superimposed (see equation (1)). The IFM is a model of the global ionosphere that is based on a numerical solution of the ion and electron continuity, momentum, and energy equations [Schunk et al., 1997]. The model calculates the three-dimensional, time-dependent density distributions for four major ions (NO+, O2+, N2+, O+) at E region altitudes, two major (O+, NO+) ions at F region altitudes, and the ion and electron temperatures at both E and F region altitudes. The IFM also contains a simple prescription for calculating H+ densities in the F region and topside ionosphere. The model covers the altitude range from 90 to 1400 km, and outputs the density and temperature distributions in either geographic or geomagnetic coordinates with a 3° latitude and a 7.5° longitude resolution. The IFM takes account of all the important chemical and physical processes, including field-aligned diffusion, cross-field electrodynamic drifts, thermospheric winds, protonospheric exchange fluxes, energy-dependent chemical reactions, neutral composition changes, several ion production sources (auroral electron precipitation, solar EUV radiation, resonantly scattered solar radiation, starlight), electron thermal conduction, and a host of local heating and cooling processes. The IFM also takes account of the offset between the geomagnetic and geographic poles. Recently, the IFM has been calibrated to climatologically match the TEC observations obtained from the TOPEX satellite over a 10 year time period [Zhu et al., 2006].
 The inputs to the IFM are global distributions of neutral densities, temperatures, and winds, and the plasma convection and precipitation patterns. These inputs are included as an integral part of the IFM via well-known empirical models. The IFM is self-contained and easy to use, being driven by a few simple geophysical indices. The model drivers include F10.7 cm, year, day, start time, duration of the model run, and the temporal variation of the Kp from 3 hours prior to the start time to the end of the simulation.
 A complete validation of a global 3-D ionospheric data assimilation model like the USU-GMKF involves many aspects associated with the vertical and horizontal plasma distribution under different geophysical conditions. These conditions include geomagnetically quiet and disturbed periods, different seasonal and solar cycle conditions, as well as weather variations that occur on different timescales. The validation of the plasma distribution must, for example, include key ionospheric parameters like the peak plasma densities in the E and F regions and their peak heights, as well as bottomside profile shapes, topside scale heights, TEC, and the location and strength of zonal and meridional plasma gradients. Clearly, a complete validation of the USU-GMKF model is far beyond the scope of a single study. However, in this study we have started the validation efforts by validating the ionospheric electron density specifications obtained from our Gauss-Markov Kalman filter spanning three 24-day long periods in December 2001 (2001/335–2001/358), January 2004 (2004/001–2004/024), and March–April 2004 (2004/080–2004/103). The main emphases of our validation during these periods are (1) the validation of the F region peak plasma densities (NmF2) over a data-rich region in the midlatitudes and (2) the validation of total electron content over data sparse regions, e.g., the oceans. These validation periods were primarily chosen to cover different geophysical conditions and different data availability. During these periods the USU-GMKF ran continuously and autonomously and assimilated all available data that were provided to the model via a local database. The model tasks included quality control of the data, execution of the background model runs, and assimilation of available data. 
Furthermore, diagnostic graphics were automatically created that included 2-D maps of TEC, NmF2, and hmF2 as well as zonal and meridional cuts through the 3-D plasma distributions at different geographical locations. These graphics, which are not shown here, were used to visually inspect the model outputs for obvious errors, e.g., negative plasma densities or negative TEC values, unrealistically low or high layer heights, etc. Note that the entire model output for all three validation periods, 2304 15-min global 3-D plasma specifications for each 24-day period (96 per day), passed our initial quality control.
Figure 1 shows the solar and geomagnetic conditions during the three validation periods. The first period extended from 1 December 2001 to 24 December 2001 and was characterized by solar maximum conditions, as indicated by the large solar flux indices, F10.7 cm, in the top panel of Figure 1. The F10.7 cm flux varied during this period from about 200 to 265, with the larger values occurring during the later part of the period. The geomagnetic activity level was rather quiet during the entire period, with Kp reaching a maximum of 4.7.
 The second validation period covered the first 24 days of 2004. This period was characterized by medium to low solar flux conditions, ranging from F10.7 cm values of 130 to 85 during the latter phase of the period. The Kp index during the entire period was slightly elevated with an average value of about 3 and individual Kp values of up to 6 during days 2004/006 and 2004/023. During day 2004/022, Kp reached a value of 6.7.
 The third validation period covered the time from 20 March 2004 up to 12 April 2004 and was partly chosen because the first ten days coincided with the first CAWSES period. This period was also characterized by low to medium solar flux conditions with values ranging from 90 to 130, but contrary to the second validation period it had a rather low geomagnetic base level and two individual short storms on days 2004/094 and 2004/096 with Kp values above 6.
 The data sources assimilated during each of the three periods are listed in Table 1. During all three periods the primary data source was slant TEC from more than 160 ground GPS receivers. During the two 2004 periods, a significant amount of data was also obtained from 14 low- and middle-latitude ionosonde stations as well as in situ electron density observations from the SSIES instrument on board several DMSP satellites. In contrast, during the 2001 period, data from only two ionosonde stations (Wallops Island (37° N, 284° E) and Point Arguello (38° N, 239° E)) were assimilated and no in situ DMSP data were used. However, nighttime 1356 Å UV radiances from the LORAAS instrument onboard the ARGOS satellite were assimilated during this first period. As mentioned above, this inhomogeneous data distribution was chosen intentionally to test the filter under a variety of different conditions. The ionosonde data used in this study were auto-scaled using the ARTIST program and the errors of these measurements are discussed by Thompson et al. .
Table 1. Data Sources Used in the Utah State University Global Assimilation of Ionospheric Measurements (GAIM)

| Data source | Period 1 (December 2001) | Period 2 (January 2004) | Period 3 (March–April 2004) |
| --- | --- | --- | --- |
| Number of GPS ground Rx | >160 | >160 | >160 |
| Number of ionosondes | 2 | 14 | 14 |
| DMSP in situ Ne | none | F13, F14, F15, F16 | F13, F14, F15 |
| Nighttime 1356 Å UV | LORAAS (ARGOS) | none | none |
Figure 2 shows a snapshot for day 2004/098 at 2000 UT of the global data distribution of the network of DISS stations as well as the network of GPS ground receivers that were used in the assimilation. Shown are the locations of 14 DISS stations available during this time (black triangles) as well as color coded vertical TEC values obtained by geometrically mapping the observed GPS slant TEC values to the vertical direction and plotting them at their 300 km pierce points. The more than 800 pierce points shown in Figure 2 are associated with the 162 GPS ground stations, where each station simultaneously observes slant TEC with several GPS satellites. The vertical TEC values shown in Figure 2 are only shown for illustration purposes and were calculated from the observed slant TEC values using a geometrical mapping function. It is important to note that the Kalman filter does not assimilate the vertical TEC values shown in Figure 2 but instead uses the original slant TEC data. Figure 2 shows a good data distribution over North America, Europe, and the East Asian sector but large data voids in particular over the Pacific and the African regions. This inhomogeneous data distribution for both the ground GPS and the DISS stations is primarily due to the fact that these stations are currently only located on land and that not all nations have the same distribution density.
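The text does not specify which geometrical mapping function was used for the illustration in Figure 2. A minimal sketch, assuming the widely used thin-shell (single-layer) mapping with the shell at the 300 km pierce-point altitude and a mean Earth radius of 6371 km, is:

```python
import math

R_EARTH_KM = 6371.0   # mean Earth radius (assumed value)

def slant_to_vertical(stec, elevation_deg, shell_km=300.0):
    """Map slant TEC to vertical TEC with a thin-shell mapping function."""
    ratio = R_EARTH_KM / (R_EARTH_KM + shell_km)
    # Sine of the zenith angle of the ray at the pierce point on the shell.
    sin_zenith = ratio * math.cos(math.radians(elevation_deg))
    return stec * math.sqrt(1.0 - sin_zenith ** 2)

print(round(slant_to_vertical(10.0, 30.0), 2))   # 5.62
```

A ray at 90° elevation is unchanged by the mapping, while a slant TEC of 10 TEC units observed at 30° elevation maps to about 5.6 TEC units at the pierce point.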
Figure 3 shows an example of the USU-GMKF output for 2000 UT on day 2004/098. The top left panel shows the global TEC distribution obtained by vertically integrating through the 3-D plasma distribution obtained from the Kalman filter. In the bottom left panel, a global map of the peak electron density in the F region (NmF2) is shown, and the right panels show horizontal slices through the Kalman filter plasma densities at five selected altitudes from 250 km up to 600 km. Note that the plasma densities shown in Figure 3 are the result of our Kalman filter model, which combines all the data available at this time with the information obtained from the background model and all prior data assimilated by the filter.
3.1. Comparison With NmF2 From the Bear Lake Observatory
 As mentioned above, a complete validation of the Kalman filter model needs to address many aspects of the ionospheric plasma distribution. As an initial attempt to validate our model, we compared the Gauss-Markov Kalman filter specifications of the peak electron densities in the F region (NmF2) over the Bear Lake Observatory (BLO) located near Logan, Utah (41.9°N, 248.6°E) with observations of NmF2 obtained from a dynasonde located at BLO. The BLO dynasonde data were obtained from an automated analysis of the BLO dynasonde measurements performed by the Dynasonde 21 Software Suite developed by Wright and Zabotin. The comparisons were performed for the two 24-day periods during January 2004 and March–April 2004. For the December 2001 validation period, BLO dynasonde data were not available. Note that the observations from the BLO dynasonde have not been assimilated in our model and, consequently, were used as an independent validation data set. The location of the BLO dynasonde is shown in Figure 4 together with the locations of the six North American ionosondes that provided bottomside plasma density profiles to the assimilation model. Also shown are the locations of the GPS ground receivers over the North American region that provided slant TEC data to our assimilation model. It can readily be seen in Figure 4 that the GPS data coverage was fairly sparse over this region, with a typical spacing of several hundred kilometers between receivers. For this study we have intentionally not used a more dense network of GPS receivers over the North American region to mimic a realistic GPS data distribution that will be available during operational use. Clearly, a comparably dense global GPS ground receiver distribution, as for example available through the Continuously Operating Reference Stations (CORS) network over North America (several hundred receivers), is not feasible in the near future for most parts of the globe. 
For this study the closest GPS receiver is about 600 km away from the BLO dynasonde location and the nearest ionosonde that provided data to the assimilation model was located in Point Arguello, California, about 1000 km away from BLO.
Figure 5 shows the comparison of the NmF2 values obtained from our assimilation model (red lines) with the corresponding dynasonde values (black lines) for the two periods in 2004. For this analysis each individual BLO dynasonde observation (typically every 5 min) was compared with the corresponding GAIM model value (15-min increments) that was closest in time to the BLO data. Since the spatial resolution in our global assimilation model is fairly coarse (4.6° latitude, 15° longitude, 20 km altitude in the F region), we then interpolated the plasma densities from the four neighboring grid points to the BLO location. Finally, a parabola was fitted to the resulting density profiles around the location of the F region peak in order to interpolate between the 20 km altitude steps and to capture the curvature of the density profile around the peak densities. The value of NmF2 was then taken as the maximum value of the fitted parabola. In addition to the GAIM and BLO NmF2 values, Figure 5 also shows the corresponding NmF2 value obtained from a climate run of our background model (blue lines). Here, the background model was driven by external climatological drivers, e.g., winds, densities, electric fields, and no data were assimilated. Although the climatological NmF2 values exhibit some day-to-day variations associated with changes in the solar flux and the Kp indices, they fail to show the variability observed in the BLO observations. For example, the daytime peak NmF2 values on the two successive days 2004/096 and 2004/097 vary by almost a factor of four, from about 2 × 10⁶ cm⁻³ to 5 × 10⁵ cm⁻³, whereas the climatological daytime peak values are nearly constant at about 9 × 10⁵ cm⁻³. This is, of course, not unexpected given the climatological nature of the external drivers used for the climate model run. The GAIM model, on the other hand, follows this large day-to-day weather variability very well. 
Another dramatic example that shows the large improvements gained from the GAIM model can be observed during the afternoon/evening and nighttime hours of the January 2004 period. During these times the climate model generally strongly overestimates the observed BLO NmF2 values. GAIM, on the other hand, is in excellent agreement with the BLO observations during these times.
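The NmF2 extraction described above (horizontal interpolation from the four neighboring grid points, followed by a parabola fit around the peak) can be sketched as follows. Bilinear weighting and a three-point parabola are assumptions; the text states only that four neighbors were interpolated and that a parabola was fitted around the peak. All names and values are hypothetical.

```python
import numpy as np

def bilinear_profile(corners, f_lat, f_lon):
    """corners[i, j, :] is the profile at lat index i, lon index j (2 x 2 x n_alt)."""
    south = (1.0 - f_lon) * corners[0, 0] + f_lon * corners[0, 1]
    north = (1.0 - f_lon) * corners[1, 0] + f_lon * corners[1, 1]
    return (1.0 - f_lat) * south + f_lat * north

def nmf2_from_parabola(alt_km, ne):
    """Fit a parabola through the largest sample and its two neighbours."""
    k = int(np.argmax(ne))
    k = min(max(k, 1), len(ne) - 2)          # keep a neighbour on each side
    x = alt_km[k - 1:k + 2] - alt_km[k]      # centre altitudes for a stable fit
    a, b, c = np.polyfit(x, ne[k - 1:k + 2], 2)
    if a >= 0.0:                             # no maximum: fall back to the sample
        return float(ne[k]), float(alt_km[k])
    x_peak = -b / (2.0 * a)
    return float(c - b * b / (4.0 * a)), float(alt_km[k] + x_peak)

# Synthetic test profile: true peak of 1e6 cm^-3 at 307 km, sampled every 20 km.
alt = np.arange(200.0, 401.0, 20.0)
base = 1.0e6 - 50.0 * (alt - 307.0) ** 2
scales = np.array([[0.8, 1.0], [1.0, 1.2]])  # four neighbouring grid points
corners = scales[:, :, None] * base[None, None, :]

profile = bilinear_profile(corners, f_lat=0.5, f_lon=0.5)
nmf2, hmf2 = nmf2_from_parabola(alt, profile)
```

For this synthetic profile the fit recovers the peak density and height that fall between the 20 km altitude steps, which is the purpose of the parabola in the comparison with the dynasonde.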
 The good agreement between the GAIM model results and the BLO data can be seen even more clearly in Figures 6 and 7, which show, in the right panels, the percentage differences between the GAIM NmF2 values and the corresponding BLO values. The left panels of Figures 6 and 7 show the percentage differences between NmF2 obtained from the Climate model runs and the BLO values. Here, the climatological differences often exceed 100%, whereas the GAIM differences are typically in the 20% range. Also note that the systematic differences observed in the climatological values during the January 2004 evening/nighttime hours are absent in the GAIM results.
However, Figures 5–7 also show some differences between the GAIM results and the BLO observations. This is particularly true for the daytime peak NmF2 values observed during the January 2004 period. A good example of this discrepancy is the 10–11 January 2004 period, when the GAIM daytime peak values are about 30–40% smaller than the BLO daytime NmF2 observations. On the other hand, on 22 January 2004 the GAIM daytime NmF2 values are about 40% larger than the observed values over BLO. During both periods, the GAIM values differ only slightly from their background values, indicating that the observed differences between GAIM and the observations at BLO are most likely associated with smaller scale weather features not seen in the other ionosondes or GPS receivers. It is interesting to note that the larger differences between the GAIM and BLO NmF2 values are more frequent during the January 2004 period than during the March/April 2004 period. A possible explanation for these differences could be the generation of more frequent smaller scale weather features associated with the elevated geomagnetic activity level that occurred during the entire January 2004 time period.
Table 2 shows the mean absolute percentage difference between the NmF2 values observed at BLO and the NmF2 values determined by the GAIM model and the Climate model, respectively. The percentage differences are listed for all local times and separately for afternoon (1200–1700 LT), evening (1700–2400 LT), predawn (0000–0500 LT), and morning (0500–1200 LT) conditions. It can readily be seen from Table 2 as well as from Figures 5–7 that GAIM provides the largest improvements during the evening hours. In this sector the percentage difference drops from 51% down to 18% when data are assimilated into GAIM. Some of the differences observed between the BLO NmF2 values and the GAIM and Climate model values are due to uncertainties in the determination of NmF2 from the observed ionograms and also due to the higher sampling frequency of the BLO dynasonde. The dynasonde provides data at a typical 5-min interval, whereas the GAIM model provides plasma densities at 15-min increments. The BLO NmF2 variability due to short term fluctuations (faster than 15 min) accounts for about 4% of the observed variability during the night and 2% during the day.
Table 2. Mean Absolute Percentage Difference Between NmF2 Observed by the Dynasonde at the Bear Lake Observatory (BLO) and the GAIM and Climate Models, Respectively
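The statistic reported in Table 2 can be sketched as follows: each observation is paired with the model value closest in time, and the absolute percentage differences are averaged. This is only an illustrative sketch. Normalizing by the observed value is an assumption (the text does not state the denominator), and the function names `nearest_index` and `mean_abs_pct_diff` are hypothetical.

```python
from bisect import bisect_left


def nearest_index(sorted_times, t):
    """Index of the entry in sorted_times that is closest to t."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i if (after - t) < (t - before) else i - 1


def mean_abs_pct_diff(obs_times, obs_vals, model_times, model_vals):
    """Mean absolute percentage difference, model versus observation.

    model_times must be sorted (e.g., 15-min output steps); each
    observation (e.g., every 5 min) is matched to the nearest step.
    Differences are normalized by the observed value (an assumption).
    """
    total = 0.0
    for t, obs in zip(obs_times, obs_vals):
        model = model_vals[nearest_index(model_times, t)]
        total += abs(model - obs) / obs * 100.0
    return total / len(obs_vals)
```

Restricting `obs_times` to a local-time window before calling `mean_abs_pct_diff` would give per-sector values of the kind listed for afternoon, evening, predawn, and morning conditions.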
3.2. Comparison With TEC From the Topex Satellite
The locations of the GPS ground receiver stations as well as the locations of the ionosonde stations used in our assimilation are all confined to land. Furthermore, with the exception of only a few islands, all of these stations are located on the continental land masses and the ionosonde stations are clustered mainly over North America (see Figure 2). Over the oceans, and in particular over the Pacific region, the data coverage from ground stations is rather sparse. Although the DMSP and LORAAS satellite measurements used in our assimilation runs are available over the oceans, these data are either limited in their altitudinal coverage (e.g., DMSP in situ measurements at 840 km altitude) and/or are confined in their local time distribution (e.g., LORAAS at 0330 LT). To validate the GAIM model over the oceans, we have compared the TEC specifications obtained from the Gauss-Markov Kalman filter with independent TEC observations obtained from the TOPEX satellite. This comparison also serves as a good test for the model's ability to propagate information from data-rich regions, e.g., over land, to data-sparse regions, e.g., over the oceans.
The primary objective of the TOPEX satellite [Christensen et al., 1994] is to measure the ocean surface height and, as a byproduct, it obtains vertical total electron content (TEC) from the ground up to its orbital altitude of 1336 km. The TOPEX satellite orbits the Earth with an inclination of 66° in a near Sun-synchronous orbit, advancing 2° per day. The TOPEX TEC measurements are taken nearly every second and for our comparison these data have been rebinned into 18-s averaged values. Each 18-s average TEC value corresponds to a distance of about 135 km or 1° at the orbit altitude of the satellite. The scatter of the original 1-s TEC values about their 18-s averaged means is fairly constant with a spread of about 4–5 TECU. It is well known that the TOPEX TEC data are biased and previous studies [e.g., Orus et al., 2002] have found that this bias can be of the order of several TEC units (2–5 TECU). For our analysis we have accounted for this bias by subtracting 4 TECU from each individual 18-s TOPEX TEC value. It will be shown below that the use of this bias value brings the GAIM and TOPEX TEC data statistically into agreement during all three validation periods.
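The preprocessing described above (rebinning the roughly 1-s samples into 18-s averages, then subtracting the 4 TECU bias) might look like the following minimal sketch. The fixed bin edges starting at time zero, the handling of partially filled bins, and the function name `rebin_and_debias` are assumptions; only the 18-s bin length and the 4 TECU bias come from the text.

```python
def rebin_and_debias(times_s, tec_tecu, bin_s=18.0, bias_tecu=4.0):
    """Average TEC samples in fixed time bins and remove a constant bias.

    Returns a list of (bin center time, bias-corrected mean TEC).
    Bins with no samples (e.g., passes over land) are simply absent.
    """
    bins = {}
    for t, value in zip(times_s, tec_tecu):
        bins.setdefault(int(t // bin_s), []).append(value)
    result = []
    for index in sorted(bins):
        samples = bins[index]
        center = (index + 0.5) * bin_s
        result.append((center, sum(samples) / len(samples) - bias_tecu))
    return result
```

Because empty bins are dropped rather than filled, data gaps in the input carry through to the output unchanged.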
Figure 8 shows an example of two 3-hour long comparisons between vertical TEC obtained from the TOPEX satellite and the corresponding TEC values extracted from the GAIM model. The comparisons are shown for two consecutive days, 2004/094 (left) and 2004/095 (right), from 1900 UT to 2200 UT. The first of these 2 days was geomagnetically disturbed, with Kp values reaching up to 6.3 (see Figure 1) and with a recovery occurring on the second day. However, the emphasis of Figure 8 is not to illustrate the geomagnetic activity effects on the observed TEC values but to demonstrate the ability of the GAIM assimilation model to track the large variability observed in the TEC values during these 2 days. To illustrate the effects of the assimilation of data into GAIM, Figure 8 also shows the corresponding vertical TEC from a climatological run of our background model, i.e., the case when no data are assimilated into GAIM. The ground tracks of the TOPEX satellite, color-coded with magenta indicating large TEC values and blue indicating small TEC values, are also shown for these two periods in the top panels of Figure 8. On both days the TOPEX satellite tracks start at 1900 UT in the South Pacific and end at about 2200 UT in the North Atlantic (2004/094) and in the Caribbean (2004/095), respectively. Most of the data gaps seen in Figure 8 correspond to times when the TOPEX satellite traversed over land and no TEC observations were obtained. For the comparisons shown in Figure 8 the GAIM and climate TEC values have been obtained by interpolating the three-dimensional electron density distributions from the rather coarse internal model grids, e.g., 15° longitude, 4.6° latitude, to the locations of the TOPEX satellite and then vertically integrating up to the altitude of the TOPEX satellite.
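The vertical integration step can be sketched with a simple trapezoidal rule. The integration scheme, the input units (altitude in km, density in cm⁻³), and the function name `vertical_tec` are assumptions for illustration; the text only states that the interpolated densities were integrated up to the TOPEX altitude. The conversion factor follows from 1 TECU = 10¹⁶ electrons m⁻², so 1 cm⁻³ km = 10⁹ m⁻² = 10⁻⁷ TECU.

```python
def vertical_tec(alts_km, ne_cm3, top_km=1336.0):
    """Integrate an electron density profile up to top_km, in TECU.

    alts_km must be increasing. The last layer is cut at top_km using
    linear interpolation of the density within that layer.
    """
    total = 0.0                         # running integral in cm^-3 km
    for i in range(len(alts_km) - 1):
        z0, z1 = alts_km[i], alts_km[i + 1]
        if z0 >= top_km:
            break
        n0, n1 = ne_cm3[i], ne_cm3[i + 1]
        if z1 > top_km:                 # clip the layer at the satellite
            n1 = n0 + (n1 - n0) * (top_km - z0) / (z1 - z0)
            z1 = top_km
        total += 0.5 * (n0 + n1) * (z1 - z0)
    return total * 1.0e-7               # cm^-3 km -> TECU
```

As a sanity check, a uniform density of 10⁶ cm⁻³ over a 1000 km column gives 100 TECU.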
Figure 8 shows the generally very good agreement between the TOPEX TEC observations and the GAIM TEC specifications. In particular, the GAIM model tracks the observed large weather variability in the equatorial anomaly region over the Pacific Ocean from day 94 to day 95. During these 2 days, the TEC observations varied from well developed equatorial anomalies with peak values of about 90 TECU on day 2004/094 to basically undeveloped anomalies with a broad peak value of about 70 TECU over the Pacific region. Although the GAIM TEC anomaly peak values show differences of up to 10 TECU when compared to the TOPEX data, the equatorial and low latitude TEC gradients are well captured by the model. The excellent agreement of the northern equatorial anomaly with the TOPEX data at 094/2120 UT is most likely due to the proximity of available ground GPS data from the nearby Galapagos Islands. Furthermore, the GAIM TEC specifications at the mid latitudes, e.g., from about 2000 UT to about 2100 UT, are in excellent agreement with the TOPEX observations and many of the midlatitude TEC gradients that are observed by TOPEX are well captured by the GAIM model.
The excellent agreement between the GAIM model and the TOPEX observations becomes even more impressive when the observed TEC values are compared with their corresponding climatological values, i.e., when no data are assimilated (blue lines in Figure 8). Large differences of up to 50 TECU can be seen in the equatorial region on day 2004/094. This comparison clearly shows the significant improvements of the GAIM TEC specifications compared to the climatological values. Note that while Figure 8 shows a typical example of the agreement between the TOPEX and the GAIM TEC values, the large discrepancy between the climatological and the observed TEC values on day 2004/094 can most likely be attributed to the geomagnetic activity occurring on this day and the use of empirical drivers for the climate model. The GAIM model, on the other hand, follows the large weather variability very well.
Figure 9 shows a statistical analysis of the differences between the observed TOPEX TEC values and the TEC values obtained from our assimilation (filled histograms) and climate (open histograms) model runs. The TEC differences are shown for all three validation periods, with a total number of more than 180,000 TEC values for the three periods. Figure 9 shows that the GAIM distributions are well centered about the ideal zero value (perfect agreement between the observed and modeled values) and exhibit a nearly Gaussian distribution. The half width of each GAIM distribution is less than 5 TECU for all three periods and the mean error is less than 1.5 TECU for all GAIM distributions. The climate distribution, on the other hand, is somewhat skewed during the December 2001 and March/April periods, with an overestimation of TEC of about 5 TECU during the first period and an underestimation of about 5 TECU during the latter period. This becomes even more evident in Figure 10, where the TEC data during the December 2001 period are sorted according to their geographic latitude position, i.e., the southern (60°S–20°S), the equatorial (20°S–20°N), and the northern (20°N–60°N) regions. The climate model overestimates the observations by about 10 TECU in the South and underestimates the observations by a similar amount in the equatorial region. The GAIM model, on the other hand, is well centered in all three regions and exhibits nearly Gaussian distributions.
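A statistical comparison of this kind can be sketched as follows: the model-minus-observation differences are histogrammed, summarized by their mean and spread, and optionally grouped by latitude band. Several choices here are assumptions rather than the paper's stated procedure: the sign convention (positive means the model overestimates), the 1 TECU bin width, the use of the standard deviation as a stand-in for the quoted half width, the treatment of the band boundaries at exactly ±20°, and the names `latitude_band` and `difference_stats`.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def latitude_band(lat_deg):
    """Assign a geographic latitude to one of three regions, else None."""
    if -60.0 <= lat_deg < -20.0:
        return "southern"
    if -20.0 <= lat_deg <= 20.0:
        return "equatorial"
    if 20.0 < lat_deg <= 60.0:
        return "northern"
    return None


def difference_stats(observed, modeled, bin_width=1.0):
    """Return (mean difference, standard deviation, histogram).

    Differences are modeled minus observed. The histogram maps the
    lower edge of each bin to the number of differences in that bin.
    """
    diffs = [m - o for o, m in zip(observed, modeled)]
    counts = Counter(math.floor(d / bin_width) * bin_width for d in diffs)
    return mean(diffs), pstdev(diffs), dict(sorted(counts.items()))
```

Calling `difference_stats` separately on the samples that fall in each `latitude_band` gives one distribution per region, in the spirit of the regional breakdown described for the December 2001 period.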
4. Summary and Conclusions
 Data assimilation models of the ionosphere have been developed at Utah State University as a central part of a DoD MURI funded program called GAIM (Global Assimilation of Ionospheric Measurements). The most mature of these models is based on a physics-based model of the ionosphere and a Gauss-Markov Kalman filter (GMKF). The physics-based model is the Ionosphere Forecast Model (IFM), which covers the E region, F region, and the topside ionosphere up to 1400 km altitude. Within the GMKF the ionospheric densities obtained from the IFM constitute the background ionospheric density field on which perturbations are superimposed based on the available data and their errors. In its current configuration, the GMKF assimilates slant TEC from hundreds of GPS ground stations, bottomside Ne profiles from several ionosondes, in situ Ne from DMSP satellites, and nighttime line-of-sight UV radiances measured by satellites.
 The GMKF ionospheric density specifications have been validated for a variety of different geophysical conditions during three 24-day long periods in December 2001, January 2004, and March–April 2004. The main emphases of this study were (1) the validation of the F region peak plasma densities (NmF2) over a data-rich region in the midlatitudes and (2) the validation of total electron content over data sparse regions, e.g., the oceans. The three validation periods were primarily chosen to cover different geophysical conditions and different data availability. To mimic the conditions of an operational environment, the USU-GMKF ran continuously and autonomously during these three periods and assimilated all available data that were provided to the model. The model tasks included quality control of the data, execution of the background model runs, and assimilation of the available data. To validate the model, the GMKF ionospheric density specifications have been compared with more than 180,000 independent measures of vertical TEC obtained from the TOPEX satellite over the oceans and more than 13,500 observations of NmF2 from the BLO. The assimilation of the available data is shown to significantly improve the comparison to both of these independent data sets. In particular, the weather variability of the BLO and the TOPEX data is well captured by the assimilation model.
The research was supported by NSF grant ATM-0408592, NASA grant NNG04GNG3G, and NRL grant N00014-04-1-0143 to Utah State University. The TOPEX data were obtained from the Physical Oceanography Distributed Active Archive Center (PO.DAAC) at the NASA Jet Propulsion Laboratory, Pasadena, CA, http://podaac.jpl.nasa.gov. Use of data from the International GPS Service (IGS) and its participating agencies is acknowledged, http://igscb.jpl.nasa.gov. The CODE Analysis Center of the IGS at the Astronomisches Institut Universität Bern, Switzerland, is acknowledged for providing a priori biases for GPS satellites and selected ground stations (http://wwww.aiub.unibe.ch/igs.html). DISS ionosonde data were provided by T. Bullett, Hanscom AFB. The BLO Dynasonde is operated by T. Berkey, Utah State University. Automated analysis of the Bear Lake Dynasonde measurements was performed by the Dynasonde 21 Software Suite developed by J.W. Wright and N.A. Zabotin, University of Colorado (CIRES), available at the NGDC (Boulder, Colorado) Web site http://ngdc.noaa.gov/stp/IONO/Dynasonde.
 Zuyin Pu thanks Dwight T. Decker and another reviewer for their assistance in evaluating this paper.