A performance evaluation of the operational Jet Propulsion Laboratory/University of Southern California Global Assimilation Ionospheric Model (JPL/USC GAIM)

Abstract

[1] The Jet Propulsion Laboratory/University of Southern California Global Assimilation Ionospheric Model (JPL/USC GAIM) uses two data assimilation techniques to optimally combine ionospheric measurements with the physics model: a sparse, traditional Kalman filter to estimate the three-dimensional density state, and a four-dimensional variational approach (4DVAR) to estimate ionospheric drivers such as the equatorial E × B drift or neutral winds. In this paper we study a specific implementation of the JPL/USC GAIM Kalman filter (single ion, low-resolution, and input data from 200 ground GPS sites) and validate its global accuracy over 137 days by comparisons to independent GPS slant total electron content (TEC) observations (“missing site” tests) and independent JASON vertical TEC observations. The assimilation accuracy is robust with a slant TEC spatial prediction RMS error of 4 TECU (Total Electron Content Unit, 1 × 10¹⁶ e⁻/m²) on average and a vertical TEC JASON RMS error of 7 TECU. Removing what appears to be a positive ≈4.4 TECU bias from the JASON observations, we obtain an improved performance of 5.3 TECU over the oceans. Comparisons with a single, thin shell global ionospheric map model and the International Reference Ionosphere and Bent ionospheric models are also provided.

1. Introduction

[2] Ionospheric imaging has risen in importance due to an increasing need by civilian single-frequency GPS users for high-accuracy navigation [McCoy, 2003], a growing awareness of storm-time phenomena in the ionosphere enabled by global studies using multiple satellite data sources [Ho et al., 1996; Coster et al., 2001], and routine ionospheric specification needs for military over-the-horizon communication and detection [Reinisch et al., 1997], among other applications. The atmospheric weather prediction community has already developed mature data assimilation methods using optimal estimation schemes which produce the remarkable successes of modern weather prediction up to five days in advance with impressive accuracy [Kalnay et al., 1998]. The ionosphere, however, presents several unique difficulties that challenge conventional data assimilation techniques: the various solar and thermospheric drivers of the ionosphere are difficult to measure and dominate the ionospheric behavior; the ionosphere is a complex system consisting of seven major ion species coupled to the underlying thermospheric neutral medium; and data availability across the globe is limited (good data coverage in some midlatitude regions over land but sparse coverage in the complex equatorial region and over the oceans). Together, these difficulties make the daily operation of an effective ionospheric monitoring system quite challenging.

[3] Numerous models have been developed to address the need for ionospheric nowcasting (current or slightly latent specification) and forecasting (predicting beyond the time of current data availability). Some ionospheric models, such as the International Reference Ionosphere (IRI) [Bilitza, 2001], Bent [Bent et al., 1972], SAMI2 [Huba et al., 2000], and the Ionospheric Forecast Model (IFM) [Schunk et al., 1998] use relatively few scalar inputs such as solar activity to estimate the drivers of the ionosphere and then predict the current state using either empirically derived relations or physical models propagating through time. These models often incorporate complicated models of neutral densities (MSIS; Hedin [1991]), neutral winds (HWM; Hedin et al. [1996]), electric fields, and auroral convection patterns and precipitation. While these models have improved, they still yield answers which can be significantly in error, especially when storms or fine structures are present which are not captured in the models. The arrival of global, continuous data sets from diverse sources such as over 1500 GPS receivers across the globe, satellite–satellite crosslink occultations, digisonde profiles, in situ satellite density measurements, and ultraviolet (UV) airglow measurements from low-Earth orbiters fueled the need to create other models which are entirely data-driven such as the persistence-driven thin shell model GIM [e.g., Mannucci et al., 1998] and various tomographic codes such as MIDAS [Mitchell, 2001], EDAM [Angling and Cannon, 2004], and others [e.g., Rius et al., 1997]. Others, such as IDA3D [Bust et al., 2004], attempt to incorporate data and ionospheric physics by use of models to define the a priori state of the ionosphere. In spite of high success in regions with excellent data coverage, these techniques are limited in forecasting ability. 
Newly emerged assimilative models involve first-principles physical models and yet also take in data, like the tomographic models, in an attempt to merge the benefits of both approaches. JPL/USC GAIM [Hajj et al., 2000; Pi et al., 2003; Hajj et al., 2004; Wang et al., 2004; Pi et al., 2004], USU GAIM [Schunk, 2002; Scherliess et al., 2004], and the Fusion Numerics assimilation model [e.g., Khattatov et al., 2004] have all been developed to address this need with significant variations in the forward modeling and data assimilation approaches.

[4] Creating an ionospheric data assimilation model is a complex endeavor that involves many approximations and trade-offs: what physics to include, what physics to purposely exclude, what driver models to use [Pi et al., 2003], what coordinate system and grid to use for the physics solver, what grid to use for the assimilation process, and several other choices regarding the optimization process. As the full, formal Kalman Filter is impossible to implement on a grid with sufficient resolution to properly specify the ionosphere, multiple approximations can be employed to help reduce the number of required operations without sacrificing important physics [Hajj et al., 2004; Wang et al., 2004]. Selection of grid shape and resolution comes with advantages and disadvantages when considering data coverage, inherent data resolution, resolving ionospheric structures, simplification of the underlying plasma physics equations, and avoiding over-fitting oscillations. There is a need for statistically significant validation covering a wide range of geophysical conditions to understand the advantages and disadvantages of the various approaches. To answer this need, we have developed a process to automatically generate validation results on a continuing basis so that we can begin to address these important questions. What are the relevant tradeoffs for algorithm complexity versus runtime? Do we understand the limitations of not including the most complex features? Many of the more sophisticated techniques require “tuning” of large numbers of parameters, a process that can be as complex and time-consuming as creating the original model. In addition, as we will argue, the most appropriate value for many of these parameters is not a value that corresponds directly to a physical process in the ionosphere, but rather a value chosen to balance physical reality and model requirements. 
Stated another way, the modeling and data assimilation procedure only approximately reproduces the physical process, and therefore modeling considerations may play as significant a role in parameter determination as physical interpretation.

[5] In this paper, we assess an operational version of the JPL/USC GAIM model that uses minimal settings (minimal resolution, off-diagonal covariance achieved by Gaussian smoothing, etc.). These baseline results are not meant to demonstrate the ultimate capability of our system; instead, they define a minimum performance level that more sophisticated approaches should be able to match and exceed, and a basis upon which rigorous comparisons may be made. As we discuss the results, it will be shown that even in the simplest mode of operation, the JPL/USC GAIM is surprisingly accurate when validated using independent model output (IRI, Bent), spatial prediction or withheld-site tests, and independent data sources (TOPEX/JASON). In future studies, various JPL/USC GAIM model improvements will be measured against the baseline presented in this paper for increased accuracy and run time penalty. This code version is used operationally at JPL to compete with GIM solutions, with the intention of eventually replacing GIM's functions while yielding additional information such as profiles and ionospheric drivers.

[6] The paper is structured as follows. In section 2, we describe the JPL/USC GAIM model and the parameter settings used. Section 3 discusses the input data sources used for our model. Section 4 describes the performance of our model via withheld-site tests and postfit analysis. In section 5, we explore the GAIM model's predictive ability over the oceans using independent JASON data. Last, section 6 summarizes our findings and briefly describes our future investigations.

2. Model and Parameters

[7] We will only briefly describe the JPL/USC GAIM, as a thorough treatment already exists in the literature [Hajj et al., 2000; Pi et al., 2003; Hajj et al., 2004; Wang et al., 2004]. JPL/USC GAIM runs on a single CPU and typically requires between 1 and 4 hours to complete an entire day of assimilation analysis. The architecture we use is built around processing a single day at a time; other architectures for continuous, real-time operation also exist with latencies of as little as 5 min. The basic function of our daily GAIM run can be visualized in Figure 1. An initial state estimate of the ionospheric density is formulated by running the physics model without any input data for one day prior to the specified day to permit any initial transients to attenuate, although for an operational system one could use the last state of the previous day as the initial state for the next. Besides the initial density state estimate, one needs an initial covariance estimate (the uncertainty in each density value along with the correlation between grid elements or voxels). Selecting this initial covariance can be very complicated, as it represents an amalgam of physical scale lengths in the ionosphere in all three dimensions and grid resolution smoothing considerations. Covariance “bumping” to preserve the flexibility of the Kalman Filter's solution (adding process noise Q to the error covariance estimate to take model uncertainties into consideration and to enhance filter response time to changing ionospheric conditions) is achieved via a simple (A + Bn)² formulation in which A and B are constants (10¹⁰ m⁻³ and 0.2, respectively) and n is the density within the voxel in question. In this analysis, we select a simple, diagonal Gaussian initial spatial covariance with standard deviations to match the resolution of our grid and zero off-diagonal components. 
This initial state (density + covariance) is then advanced via the Transition Matrix (the physics model or forward model) to obtain a forecast (predicted) density state. The physics model requires solar F10.7 flux and planetary magnetic Ap indices to provide empirical flux inputs. These indices are obtained in real time from a web service (see http://sec.noaa.gov/ftpdir/indices/) and saved to provide a time series for past days. External models produce dynamical drivers such as E × B drifts, thermospheric composition and winds. The observation operator is then employed to map the estimated density states into predicted slant TEC observations, and these are differenced with all incoming data at that time to form the Innovation Vector (residuals). The traditional Kalman filter works to reduce these residuals in a least squares sense over the entire grid at once, weighted by the uncertainty in each voxel and the uncertainty in the incoming data sources. Finally, the output state resulting from the Kalman filter update is saved for the user and fed back into the physics model for the next iteration. After each time step, we discard off-diagonal covariance information to reduce computation time and memory requirements. We are thus utilizing a suboptimal filter in this setup. The JPL/USC GAIM model is fully capable of preserving off-diagonal covariance; however, it is interesting in this initial benchmark to measure the accuracy obtained without this costly requirement. In this case, the day-length runs were performed 3 days after the date of interest to ensure that the maximum amount of GPS data was available despite station latency. However, other implementations at JPL also run with latencies of 1 hour and of 5 minutes.
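
The forecast, bump, and update cycle described above can be illustrated with a minimal sketch. It is not the operational GAIM code: the function names, the dense matrix `H` (path length of each ray through each voxel), the observation variances `r_diag`, and the NumPy implementation are all assumptions made for illustration. Only the (A + Bn)² process noise with A = 10¹⁰ m⁻³ and B = 0.2, and the choice to keep only the diagonal of the covariance, are taken from the text.

```python
import numpy as np

def process_noise_var(n, A=1e10, B=0.2):
    """Process-noise variance (A + B*n)^2 for a voxel density n in m^-3."""
    return (A + B * n) ** 2

def diagonal_kalman_step(x_forecast, p_diag, H, y, r_diag):
    """One Kalman update that stores only the diagonal of the covariance.

    x_forecast : forecast densities per voxel (m^-3), shape (n_vox,)
    p_diag     : forecast error variances per voxel, shape (n_vox,)
    H          : hypothetical observation operator (m), shape (n_obs, n_vox)
    y          : observed slant TEC (e/m^2), shape (n_obs,)
    r_diag     : observation error variances, shape (n_obs,)
    """
    # Covariance "bumping": add process noise Q to the forecast variances.
    p = p_diag + process_noise_var(x_forecast)
    innovation = y - H @ x_forecast          # data minus predicted slant TEC
    PHt = p[:, None] * H.T                   # P H^T for a diagonal P
    S = H @ PHt + np.diag(r_diag)            # innovation covariance
    K = np.linalg.solve(S, PHt.T).T          # gain K = P H^T S^-1 (S is symmetric)
    x_post = x_forecast + K @ innovation
    # Keep only the diagonal of (I - K H) P; off-diagonal terms are discarded.
    p_post = p - np.sum(K * PHt, axis=1)
    return x_post, p_post
```

In a daily run, `x_post` would be handed back to the physics model and `p_post` would serve as the starting variance for the next time step. Discarding the off-diagonal terms is what makes this a suboptimal filter.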

Figure 1.

The Kalman GAIM consists of the output of a physics model and incoming measurements fused together in the Kalman filter, the results of which are fed back into the physics model for the next step. The physics model advances the current state (incoming from below) in time to the forecast state. This forecast state is then merged with data to yield the current best estimate of density and covariance for every voxel in the grid.

[8] In this validation effort, we used 18,624 voxels (volume elements) to comprise the whole of the Earth's global ionosphere up to 1500 km altitude. A mean voxel size for this arrangement yields 5° latitudinal extent, 15° longitudinal extent, and 80 km altitude extent (Figure 2). The effective resolution and smoothness of the density state are enhanced by the introduction of a Gaussian smoothing process into the observation operator. The altitude extent of each voxel is a function of height, to follow the exponentially decreasing plasma density. Our coordinates are specified by a classic p-q-l formulation along tilted magnetic dipole field lines (constant p is along the magnetic field, l in magnetic longitude direction, and q perpendicular to both) described in the work of Pi et al. [2003] in detail. Note that the use of low resolution combined with a magnetic field aligned grid produces unusual “spiky” high-altitude boundaries. In the present runs, our grid is set to span altitudes from 120 km to 1500 km, which should capture between 90 and 99% of the plasma depending on the ionosphere–plasmasphere boundary layer, which lies between 500 km and 1500 km depending on day/night and storm conditions [Schunk and Nagy, 2000].
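
As an illustration of field-aligned coordinates, the sketch below uses one common textbook form of orthogonal dipole coordinates, p = r/(R_E sin²θ) and q = (R_E/r)² cos θ, with θ the magnetic colatitude. This form is an assumption: the exact definition and normalization used in GAIM are given by Pi et al. [2003] and may differ, and the rotation from geographic to tilted-dipole coordinates is omitted. The function name `dipole_pq` and the value of `R_E` are also illustrative choices.

```python
import numpy as np

R_E = 6371.0  # mean Earth radius in km (assumed value)

def dipole_pq(r_km, colat_rad):
    """Dipole coordinates (p, q) from radial distance and magnetic colatitude.

    p is constant along a dipole field line and labels it by its equatorial
    crossing distance in Earth radii; q varies along the field line and
    changes sign at the magnetic equator.
    """
    p = r_km / (R_E * np.sin(colat_rad) ** 2)
    q = (R_E / r_km) ** 2 * np.cos(colat_rad)
    return p, q
```

With this form, every point on the field line r = L R_E sin²θ returns p = L, which is the sense in which a line of constant p runs along the magnetic field.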

Figure 2.

Global grid distribution: 18,624 voxels comprise the ionospheric modeling volume. The use of magnetic coordinates (pql) yields exotic voxel shapes and arrangements as they tile along the field lines.

[9] The JPL/USC GAIM implements the “Band Limited” Kalman approximation [Hajj et al., 2004] which simply means a nearest neighbor covariance approximation has been enacted to reduce the number of nonzero covariance terms. The resulting sparse matrix transforms the intractable manipulation of an 18,624 × 18,624 ≈ 350 × 10⁶ element covariance matrix into one with ≈ 500,000 elements (given 3 neighbors in each direction). For the minimal setting operation investigated here, however, we have set all nondiagonal elements of the covariance to zero, resulting in a covariance of only 18,624 elements.
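
The element counts quoted above can be checked with a few lines of arithmetic. Reading “3 neighbors in each direction” as a 3 × 3 × 3 block of correlated voxels per row is an assumption made here; it ignores grid boundaries, but it does reproduce the quoted figure of roughly 500,000.

```python
n_voxels = 18_624

# Full covariance: one entry for every pair of voxels.
full_elements = n_voxels ** 2

# Band-limited covariance, ASSUMING a 3 x 3 x 3 neighborhood per voxel.
stencil = 3 ** 3
banded_elements = n_voxels * stencil

# Minimal setting used in this paper: diagonal only.
diagonal_elements = n_voxels

print(full_elements, banded_elements, diagonal_elements)
```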

[10] As the grid resolution is low, a technique of Gaussian smoothing has been implemented within the observation operator so as to: increase the effective resolution, enable the model to smoothly represent structures that are somewhat smaller than the discrete voxel sizes, include an effective off-diagonal covariance, and produce a smooth retrieved density field. In this study, the Gaussian function was defined as having a sigma of 5° latitude, 15° longitude, and 80 km altitude to match the average grid voxel size. Incoming data from GPS receivers is modeled as a straight line ray path, piercing a series of individual voxels. The Gaussian smoothing introduces fixed-length, density-independent correlations between neighboring voxels that result in adjacent voxels also being affected by an observation. Effectively, the smoothing introduces off-diagonal correlations similar to a time-dependent off-diagonal covariance. Higher resolution runs would permit off-diagonal covariance to perform a similar function, at the expense of an increase in runtime due to both the higher number of grid parameters required and the off-diagonal covariance in the Kalman update.
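
A minimal sketch of how Gaussian smoothing could enter one row of the observation operator is given below. The sampling of the ray into points, the per-sample normalization, and the name `smoothed_row` are assumptions for illustration and are not taken from the GAIM implementation. Only the sigma values (5° latitude, 15° longitude, 80 km altitude) come from the text.

```python
import numpy as np

# Smoothing widths from the text: degrees latitude, degrees longitude, km altitude.
SIGMA = np.array([5.0, 15.0, 80.0])

def smoothed_row(ray_pts, ds, vox_centers, sigma=SIGMA):
    """Build one observation-operator row for a straight-line ray.

    ray_pts     : sample points along the ray, shape (n_pts, 3) as (lat, lon, alt)
    ds          : path length represented by each sample (m), shape (n_pts,)
    vox_centers : voxel centers, shape (n_vox, 3) as (lat, lon, alt)
    """
    d = ray_pts[:, None, :] - vox_centers[None, :, :]
    # Wrap longitude differences into [-180, 180) degrees.
    d[..., 1] = (d[..., 1] + 180.0) % 360.0 - 180.0
    w = np.exp(-0.5 * np.sum((d / sigma) ** 2, axis=-1))
    # Normalize so each sample spreads exactly its own path length over voxels.
    w /= w.sum(axis=1, keepdims=True)
    return ds @ w
```

Because each sample is spread over several neighboring voxels, a single observation updates adjacent voxels as well. This is the effective off-diagonal correlation described above, obtained without storing off-diagonal covariance terms.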

[11] Finally, a single species of ion (O+) is used in the model runs presented in this paper, as multiple ions will be the subject of future analysis and validation. Employing a single ion species permits great simplification of the Kalman filter implementation via convenient calculations in the forward model as well as reducing potential difficulties regarding sensitivity among multiple ion species to a single input data type such as slant TEC from GPS.

3. Data and Ground Stations Used

[12] We examined automated runs covering 137 days of data ranging from 06/01/2004 to 11/08/2004. This period was selected arbitrarily and was examined only after the daily GAIM runs were completed; no intentional selection was utilized so as to resemble a realistic operational environment. Some days are missing due to otherwise irrelevant disk storage limitations between 09/26 and 10/09. Overall, the available data amounts to 45 million slant TEC (STEC) GPS observations from 313 GPS receiver stations, with an average of 330,000 STEC observations from just under 200 sites per day. Each station typically produces between 1500 and 3000 observations per day (5 minute data). The particular sites used in each run can vary day to day due to station malfunction or latency exceeding three days, in which case alternate sites are pulled in as necessary. In Figure 3, stations in green were present 90–137 days out of the total of 137, blue were present 30–90 days, and red 0–30 days. It is important to notice the obvious scarcity of data over the oceans; also the center of Africa is virtually unrepresented, and the receivers in South America were of substandard reliability. In general, a lack of quality data and sparse coverage prevails for most of the equatorial region, where the ionosphere is highly structured.

Figure 3.

Distribution and performance of the 313 GPS receiver stations used (200 per day for 137 days). Stations are color-coded to show the total number of days they contributed to the data set. A well-surrounded, rarely used station likely indicates a substitute station for a rare dropout of a trusted site, whereas an isolated site is likely experiencing problems as it would always be selected if possible. Purple circles represent stations withheld for prediction analysis.

[13] All data in this paper were taken from the daily single-shell Global Ionospheric Mapping (GIM) run product [Mannucci et al., 1998; Iijima et al., 1999], with data filtering and bias removal already performed by GIM's “front end” processing. Such filtration is absolutely crucial, as improperly leveled GPS data or corrupted pseudo-range measurements introduce noise that obscures features of the ionosphere. Further, a postfit outlier check was implemented by two successive GAIM runs, removing any data found to have more than an 80 TECU (Total Electron Content Unit, 1 × 10¹⁶ e⁻/m²) postfit residual. The chief culprits of such anomalously bad GPS TEC data were severe station multipath and occasional erroneous behavior by Ashtech receivers in which an arbitrary (usually large) constant is added to the pseudo-range values for an entire satellite arc. From the entire 45 million observations, 6447 observations were rejected due to postfit filter failure (0.014% rejection rate). Significantly more were removed prior to this by the GIM front end for anomalously short arcs due to satellite lock loss. All of the data accepted by GIM and utilized by its solution also enter GAIM, with the exception of our spatial prediction study, for which we chose five reliable stations around the world (see Table 1) and excluded their data from GAIM only as a means of validation.
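
The two-pass outlier check can be sketched as a simple mask applied between runs. The function name and array layout are hypothetical. The 80 TECU threshold and the observation counts are taken from the text.

```python
import numpy as np

def keep_mask(observed_tecu, postfit_tecu, limit_tecu=80.0):
    """Return True for observations retained for the second GAIM run.

    An observation is rejected when its first-pass postfit residual
    exceeds limit_tecu in absolute value.
    """
    residual = np.asarray(observed_tecu) - np.asarray(postfit_tecu)
    return np.abs(residual) <= limit_tecu

# Rejection rate quoted in the text: 6447 of about 45 million observations.
rejection_rate_percent = 100.0 * 6447 / 45e6
print(round(rejection_rate_percent, 3))
```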

Table 1. GPS Receiver Stations Used for Spatial Prediction
Station  Geog. Lat (°N)  Geog. Lon (°E)  Mag. Lat (°)  Mag. Lon (°)  Days Used  Mean Daily Data  Total Data
BOGT                4.6           285.9          15.6          −3.3        137             2337     320,207
MKEA               19.8           204.5          20.4         −88.6        134             2190     293,426
TIDB              −35.4           149.0         −43.4        −133.9        137             2285     313,098
WES2               42.6           288.5          53.5          −0.7        128             1786     228,628
WTZR               49.1            12.9          49.0          96.5        136             2207     300,137

4. GAIM Slant TEC Postfit and Spatial Prediction Residuals

[14] The most direct measure of a model's success at fitting data is the postfit residual, the measure of remaining discrepancy between input data and the model's prediction of that same data after assimilation. These residuals can be plotted into histograms to examine their distribution and their RMS taken to characterize the overall magnitude of the global error. Postfit residuals in no way suggest the accuracy of the model, especially in a predictive sense; instead, they measure whether a sufficient number of degrees of freedom were available to properly fit the data. We also compared various nonassimilative models and single-shell GIM as a useful point of reference, since these models are optimized for entirely different uses.

[15] Figure 4 shows the overall RMS of residuals per day for GIM postfit (GIM answer – input data after run), GAIM postfit (GAIM answer – input data after run), IRI2000, and our five GAIM prediction sites (GAIM answer – withheld data not assimilated). The GIM postfit values show excellent ability to fit the ionosphere with daily RMS residuals varying between 1.5 and 3.7 TECU with a mean of 2.1 TECU. GAIM postfit manages to outperform GIM postfit slightly, varying between 1.4 and 2.5 TECU with a mean of 1.6 TECU. This is in accordance with the observation that even at this coarse resolution GAIM has more independent parameters to adjust along a slant ray path (≈63) than does GIM (≈16). However, this is still significant as proof that the data assimilation process is being completed successfully, and indeed in regions of high receiver density such as North America and Europe multiple ray paths share sufficiently small numbers of density voxels to create the potential for underfitting. Also plotted for reference is the IRI2000 model's climatological estimate (no data input), which shows an oscillation of accuracy anticorrelated with the daily F10.7, varying between 5.6 and 15.8 TECU with an average of 7.9 TECU.

Figure 4.

Overall performance and solar parameters versus UT. Each point in the top graph represents the RMS of all Slant TEC measurement residuals for all stations on a single day. The bottom shows the F10.7 and Ap indices for comparison. Note the 24.7-day periodicity due to the solar rotation period.

[16] The “GAIM Predict” entry represents the observed residuals from only the five selected prediction sites. Note that this is inherently a much smaller data set (by a factor of ≈40), and thus we expect larger day-to-day fluctuation. Still, we observe RMS variation between 1.8 and 7.1 TECU with an average of 2.9 TECU. This level of accuracy is somewhat optimistic, as many of the prediction sites chosen for their reliable delivery are also near other sources of data, thereby assisting overall prediction accuracy. Later we will utilize JASON measurements to assess prediction accuracy over the oceans where this weakness is not present.

[17] The lower panel of Figure 4 shows the solar flux index F10.7 and the planetary magnetic index Ap used as input to GAIM. A major Ap disturbance occurred on 25 and 27 August as well as on 8 November, the last day of study. The F10.7 dramatically shows the 24.7 day mean rotation rate of the Sun. Thus the data sample contains a mixture of active solar conditions (high F10.7), low solar activity (low F10.7), active magnetic storms (high Ap), and generally quiet times (low Ap).

[18] We now turn our attention to the latitudinal dependence of postfit. Each point in Figure 5 represents the RMS of postfit or prediction residuals for a specific site for the entire period of study. The density of data on the right-hand side of the graph reflects the dominance in GPS ground coverage in North America and Europe. In the relatively smooth and quiet midlatitude ionosphere we see GAIM and GIM postfits are roughly equivalent, as both models possess sufficient degrees of freedom to well fit the data. However, as we near the magnetic equator, we observe a significant deviation between GIM and GAIM postfits of ≈2 TECU. This is presumably due to the well known difficulty for thin shell models to accommodate the complex features of the equatorial anomaly, especially the northern and southern anomaly peaks. As for the spatial prediction sites (shown as large green diamonds), the prediction capability is quite robust for the data-dense Northern Hemisphere as well as the moderately well covered Australia in the southern midlatitude, roughly matching the postfit error. However, the prediction station near the equator experiences difficulty roughly equivalent to GIM's postfit error of ≈5 TECU. This demonstrates the level of error that can result from depriving the assimilation run of an isolated site in an equatorial region and thus depending entirely on physics-based diffusion and convection to fill in the data hole. More advanced techniques such as driver estimation via four-dimensional variational approach (4DVAR; Pi et al. [2003, 2004]) or Extended Kalman could be employed to improve this climatological data filling.

Figure 5.

Overall performance versus magnetic latitude. Each point represents the RMS of all Slant TEC measurement residuals for a single site versus the magnetic latitude of that site. Note that GAIM separates itself from GIM mostly near the equatorial region.

[19] Histograms of the slant TEC residuals for all 137 days, from both postfit and prediction analysis, are examined in Figure 6. The histograms of GIM and GAIM postfit residuals are both well-defined Gaussian distributions with overall RMS values of 2.5 and 1.7 TECU, respectively, indicating that GAIM performed significantly better at data fitting than GIM. GAIM Prediction is also remarkably good with an RMS of 4.0 TECU. IRI, without the ability to assimilate data, achieves an RMS of 9.3 TECU and is biased low by 1.9 TECU. Note that to plot “GAIM Predict” on the same histogram, a scaling factor corresponding to the ratio of postfit observations to predict observations (40.2) was used to overlap the graph axis. It is interesting to note that the plasmaspheric component has been assimilated into the simulation despite the use of O+ exclusively. This will not manifest in comparisons of slant TEC, as the plasmaspheric bias is present in both the observation and the assimilation result. One would expect profiles to be modestly affected, however, with slightly increased NmF2 such that the integrated TEC matches the ionosphere plus plasmasphere.

Figure 6.

Distribution of slant TEC residuals. In order to plot the prediction site GAIM results meaningfully beside the postfit results, a scaling factor (40.2) equal to the ratio of the total number of postfit and prediction measurements was used.

[20] To investigate the dependence of accuracy on elevation and local time, we binned the postfit and prediction slant TEC residuals versus the elevation angle of the STEC ray path and the local time of the GPS receiver, generating Figure 7. The top two panels display the RMS Residual for postfit (left) and prediction (right). As expected, a region of higher RMS residuals is clearly apparent in the postfit plot (left) between 1300 and 1600 LT at all elevations, corresponding to the daytime equatorial anomaly, while the lowest RMS residuals occur between 0400 and 0900 LT, a period that represents no and low solar irradiance (nighttime and morning, respectively). The corresponding prediction plot (right) shows the same region of best performance (0400–0900 LT), but difficulty in prediction at low elevations is much more broadly distributed in the entire noon-to-midnight sector, with larger residuals occurring around 2000 LT. A possible explanation for the different behavior between prediction and postfit residuals is as follows. The JPL/USC GAIM grid can accommodate the spatial gradients in the real ionosphere, but extrapolation in space is hampered by the unmeasured gradients in the postnoon to midnight periods. It is reasonable that spatial prediction is worst around 2000 LT, as the day-night boundary and the associated increase in spatial gradients hinder spatial prediction.
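
The binning behind a plot such as Figure 7 can be sketched with two-dimensional histograms. The function name and the bin edges are assumptions, since the bin sizes used for the figure are not stated. The sketch returns both the mean and the RMS residual per bin.

```python
import numpy as np

def binned_stats(local_time_h, elevation_deg, residual_tecu, lt_edges, el_edges):
    """Mean and RMS residual in each (local time, elevation) bin.

    Bins with no observations are returned as NaN.
    """
    bins = [lt_edges, el_edges]
    count, _, _ = np.histogram2d(local_time_h, elevation_deg, bins=bins)
    total, _, _ = np.histogram2d(local_time_h, elevation_deg, bins=bins,
                                 weights=residual_tecu)
    sq, _, _ = np.histogram2d(local_time_h, elevation_deg, bins=bins,
                              weights=residual_tecu ** 2)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count          # signed average: reveals over/underestimation
        rms = np.sqrt(sq / count)     # overall magnitude of the error
    return mean, rms, count
```

Keeping the mean and the RMS separate matters for interpretation: a bin can have a near-zero mean residual yet a large RMS if positive and negative errors cancel.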

Figure 7.

Upper plots are RMS STEC residual, lower plots mean STEC residual. Left plots are postfit residuals, right plots are spatial prediction residuals. Each bin is arranged by local time and elevation of the ionospheric pierce point (IPP) of the slant ray path. Note that for increased contrast, some unused regions of the RMS color map have been blackened artificially. All color map scales are unique to their respective plot.

[21] The mean residuals, shown in the lower two plots, also exhibit an interesting LT behavior. On average for the postfit (left) case, we are underestimating high elevation tracks by as much as 0.7 TECU while overestimating low-elevation tracks by a similar amount, especially between 0500 and 0900 LT. Although noisier due to lower statistics, the prediction plot (right) shows similar behavior. Superimposed upon this general trend is an over-estimation of the ionosphere at low elevations near dawn, followed by underestimation near dusk. This is likely due to limited longitude resolution in the grid, resulting in a blurring across dawn/dusk terminators, a fact that likely also contributed to the increase in RMS Prediction Residual (upper right) near 2000 LT. Higher resolution should reduce this difficulty, although even in this lowest resolution setting a mean estimation error of less than 0.6 TECU results.

5. GAIM Prediction Over Oceans Versus JASON Satellite

[22] The JASON-1 Satellite [Menard and Fu, 2003] follows the TOPEX [Fu et al., 1994] satellite's success at determining the vertical TEC between the spacecraft and the surface of any substantial body of water beneath it to within a single, constant bias. This calibration step to remove the ionospheric delay is crucial to the satellites' primary mission to study ocean surface height; however, it also provides a convenient source of validation data for GAIM. This is a particularly formidable validation goal, as JASON provides TEC data only over open water, precisely where GPS receiver data is rare. Therefore the direct overlap between JASON vertical TEC tracks and GPS ray paths will only occur near shorelines.

[23] Figure 8 shows the accumulated error distribution between GAIM and JASON VTEC for the entire 137 days. Immediately we observe an overall bias: JASON VTEC is clearly higher than the GPS-driven GAIM and GIM models. In light of the prior evidence, it is reasonable to estimate a mean bias and use the standard deviation of the “model minus JASON” vertical TEC differences as a measure of GAIM vertical TEC accuracy. The GAIM and GIM standard deviations are 4.6 and 4.3 TECU, while Bent and IRI95 (note this is not the same IRI model as previously compared with) are 5.9 and 7.0 TECU, respectively. The fact that GAIM's VTEC performance is comparable to GIM's is noteworthy, since GIM has been explicitly tuned to yield a smooth VTEC map using the sparse global GPS coverage, while GAIM models a 3D profile and estimates vastly more parameters. The mean GAIM residual is 4.4 TECU, while that for GIM is 3.0 TECU. This gives some suggestion of the JASON bias, and we chose to use 4.4 TECU as the JASON bias for future computations. This value disagrees with the results of Ping et al. [2004] (≈1.6 TECU) but supports the work of Hernandez-Pajares [2004] (≈5 TECU).
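
The separation of a constant bias from the scatter rests on the identity RMS² = mean² + σ², where σ is the population standard deviation of the differences. The sketch below computes the three quantities for a set of model-minus-JASON differences. The function name is hypothetical, and the sign convention (model minus JASON) follows the wording above.

```python
import numpy as np

def bias_std_rms(model_vtec, jason_vtec):
    """Mean, standard deviation, and RMS of model-minus-JASON vertical TEC."""
    diff = np.asarray(model_vtec) - np.asarray(jason_vtec)
    bias = diff.mean()
    std = diff.std()                      # population standard deviation (ddof=0)
    rms = np.sqrt(np.mean(diff ** 2))
    return bias, std, rms
```

Since subtracting a constant from the differences leaves the standard deviation unchanged, the standard deviation is the RMS that would remain after removing the estimated mean bias.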

Figure 8.

GAIM–JASON residual histogram. A direct comparison (including JASON bias) of GAIM and JASON vertical TEC. Note the clear bias in JASON VTEC.

[24] We expect the best agreement between GAIM and JASON in regions closer to GPS receiver data. Plotting the distance between a point on the JASON track and the nearest GPS ray path ionospheric pierce point (IPP) within the last hour, we obtain Figure 9. Collecting the GAIM–JASON observations into 50 km bins, we plot the number of observations in each bin, the RMS of those observations, and their mean. In the top plot, we observe a “most common” distance of 300 km and a mean distance of 1010 km. The exponentially decreasing tail indicates that, for the distribution of GPS sites used, it is difficult to obtain observations far from all GPS data, with a negligible number of observations occurring beyond ≈3500 km (≈30°) from any GPS data. In the lower plot of Figure 9 we observe a clear dependence of GAIM accuracy on the proximity of input GPS data.
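The distance binning described above might be sketched as follows. This is an illustrative reconstruction, not the processing used for Figure 9: the function names are hypothetical, distances are great-circle distances on a sphere of mean Earth radius (the study may have measured them at an ionospheric shell height), and the caller is assumed to have already restricted the pierce points to those from the preceding hour.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, not a shell height

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

def nearest_ipp_km(track_lat, track_lon, ipp_lat, ipp_lon):
    """Distance from each JASON track point to its nearest pierce point."""
    track_lat, track_lon = np.asarray(track_lat, float), np.asarray(track_lon, float)
    ipp_lat, ipp_lon = np.asarray(ipp_lat, float), np.asarray(ipp_lon, float)
    # Pairwise distances: rows are track points, columns are pierce points.
    d = great_circle_km(track_lat[:, None], track_lon[:, None],
                        ipp_lat[None, :], ipp_lon[None, :])
    return d.min(axis=1)

def bin_by_distance(distance_km, residual, width_km=50.0):
    """Count, mean, and RMS of residuals in fixed-width distance bins."""
    distance_km = np.asarray(distance_km, float)
    residual = np.asarray(residual, float)
    idx = np.floor(distance_km / width_km).astype(int)
    nbins = idx.max() + 1
    count = np.bincount(idx, minlength=nbins)
    total = np.bincount(idx, weights=residual, minlength=nbins)
    total_sq = np.bincount(idx, weights=residual ** 2, minlength=nbins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = total / count             # NaN where a bin is empty
        rms = np.sqrt(total_sq / count)
    return count, mean, rms
```

Bins with no observations return NaN for the mean and RMS, which keeps the sparse tail of the distribution from being misread as perfect agreement.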

Figure 9.

JASON distance to nearest GPS data. The top graph shows the number of JASON observations versus distance to the nearest GPS data pierce point with a most common distance of 300 km and a mean distance of 1010 km. The bottom graph shows the mean and RMS of GAIM–JASON VTEC residuals with and without a bias correction of 4.4 TECU. Notice that the 4.4 TECU bias correction also brings the mean disagreement between JASON and GAIM to near zero at small distances.

[25] Continuing the comparison of GAIM predictions with JASON, we now examine the latitudinal dependence in Figure 10. In the top panel, we see that significantly more JASON observations lie in the Southern Hemisphere (larger oceans) where GPS data is sparse. This data sparseness may explain the increased RMS error in the south seen in the lower graph. The bias between JASON and GAIM is seen in the mean versus latitude at the bottom of the figure, which roughly agrees with the estimate of 4.4 TECU near the equator but increases as JASON approaches the poles. One possible interpretation of this latitudinal structure is the presence of a plasmasphere sitting above JASON. As we move toward the poles, between 4 and 6 TECU less plasma is observed by GPS than near the equator, helping to explain the increasing differential. Such an interpretation is supported by the modeling work of Gallagher et al. [1988]. If this interpretation is correct, an even higher JASON bias of ≈6–7 TECU is suggested.

Figure 10.

Biased GAIM–JASON error versus latitude. Note that most JASON measurements are in the Southern Hemisphere due to larger ocean surface, with higher RMS error due to greater distance to GPS data. The plasmasphere is visible as a latitude-dependent mean discrepancy between GAIM (GPS data) and JASON (low orbit data).

[26] The precise global locations of these discrepancies between JASON and GAIM are of interest. Removing our estimate of JASON's bias, we compute the RMS of the GAIM–JASON differences in 4° by 4° bins across the globe, resulting in Figure 11. Note the strong agreement (2 to 3 TECU RMS) in the Atlantic Ocean in the smooth midlatitude region between densely covered Europe and North America. The Pacific Ocean near Asia shows similar agreement. However, RMS error can exceed 10 TECU in the center of the Pacific Ocean in the Southern Hemisphere. Both large regions of high discrepancy also lie on the magnetic equator where the ionosphere is highly structured and peaks in the equatorial anomaly region. This confirms our expectation of where GAIM is experiencing difficulty: the major challenge of predicting into regions of sparse data coverage comes from regions of large gradient and high instability.

Figure 11.

GAIM–JASON RMS VTEC residual (unbiased). A global geographic map of regions of GAIM versus JASON disagreement collated into 4° × 4° bins. Gray dots on land indicate GPS receiver locations. The black sinusoid is the estimated magnetic equator.

6. Summary and Conclusions

[27] Running the JPL GAIM model with low resolution, diagonal covariance, single ion, and Gaussian smoothing off-diagonal covariance produces good agreement when validated using both missing-site spatial prediction and independent JASON VTEC measurements over the ocean. Postfit residuals for slant TEC observations for over 300 GPS receivers yielded a total RMS of 1.7 TECU, while missing-site spatial prediction tests yielded an RMS of 4.0 TECU. The more challenging comparison with JASON's vertical TEC over the world's oceans yielded at first an unfavorable 6.9 TECU RMS residual; however, our analysis suggests a bias of at least ≈4.4 TECU in the JASON observations. Removing this bias reduces our overall vertical TEC RMS to 5.3 TECU, with the majority of error predictably occurring over the largest spans of ocean.
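The reduction from 6.9 to 5.3 TECU is what one would expect from removing a constant offset equal to the mean residual. As a brief check, with notation introduced here for illustration (r_i for the individual GAIM–JASON residuals, N for their number, and b for the removed bias, assumed equal to the mean residual):

```latex
\frac{1}{N}\sum_{i=1}^{N} (r_i - b)^2
  = \frac{1}{N}\sum_{i=1}^{N} r_i^2 - b^2
  \quad \text{when } b = \frac{1}{N}\sum_{i=1}^{N} r_i ,
\qquad
\sqrt{6.9^2 - 4.4^2} = \sqrt{47.61 - 19.36} = \sqrt{28.25} \approx 5.3\ \text{TECU}.
```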

[28] In general, this low resolution run was seen to outperform GIM significantly in the equatorial region when comparing slant TEC observations, which is not surprising given the higher number of parameters and the departure from a spherical-shell mapping function and its associated error. In this work we have established a baseline performance level against which future investigations of more sophisticated settings and procedures can be compared. Nondiagonal covariance, higher resolution, an improved model of the Earth's magnetic field, adding space-based occultation data, adding vertical profiles from ionosondes or incoherent scatter radar, improved physics modeling by adding multiple ions, and improved climatological driver estimation via Extended Kalman or 4DVAR approaches are model features worthy of further analysis and will be the subject of future studies.

Acknowledgments

[29] This research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, and was sponsored by the Air Force Research Laboratory and the National Aeronautics and Space Administration.

[30] Arthur Richmond thanks the reviewers for their evaluation of this paper.
