In this paper we present our recent research development in the area of ionospheric specification by means of data assimilation of ground-based observations. NOAA's Total Electron Content specification methodology (namely, a Gauss-Markov Kalman filter with an empirical model of the ionosphere as a background model) over the continental United States has lately been expanded to multiregional domains and to the entire globe. Analyses of the global TEC maps reveal clear signatures of thermosphere-ionosphere coupling, both dynamical and compositional in nature, even though the underlying specification methodology does not take thermospheric effects into account. This suggests that ground-based observations of electron density contain some information about the state of the thermosphere. By using a thermosphere-ionosphere general circulation model in a prototype Ensemble Kalman filter (EnKF), we examine the role of thermosphere-ionosphere coupling in a global ionospheric specification. Observing system simulation experiments, designed for a global network of ionosondes, suggest that ionospheric data assimilation benefits considerably from self-consistent treatment of thermosphere-ionosphere coupling in the forecast model as well as in the assimilation scheme, both of which can be achieved inherently by using the EnKF.
 The NOAA-Space Weather Prediction Center (SWPC) is the designated official source for the United States' space weather prediction, forecast, and warning services. One of SWPC's operational products is the United States Total Electron Content (US-TEC). On a near real-time basis the US-TEC reconstructs the three-dimensional distribution of electron density with Gauss-Markov Kalman filtering of ground-based GPS observations, using the International Reference Ionosphere (IRI) as a background model, and computes TEC over the continental United States (CONUS) [Spencer et al., 2004; Fuller-Rowell et al., 2006]. Its target users are the positioning and navigation community, among other applications. For instance, SWPC is developing an ionospheric correction product for the U.S. National Geodetic Survey's Online Positioning User Service based on the US-TEC output [Araujo-Pradere, 2009].
 In the US-TEC, a simple empirical model inferred from the IRI-2001 [Bilitza, 2001] plays the role of a linear forecast model. While correlations among neighboring grid points are specified by a Gaussian function in the horizontal direction, the constraints in the vertical direction are prescribed in terms of empirical orthogonal functions (EOFs) (see Spencer et al. for details). These EOFs are computed from the IRI every 6 h, thus reflecting the time-dependent electron density profiles to some extent. Recently, the domain of the Gauss-Markov Kalman filtering has been expanded from CONUS to multiregional domains and then to a global domain (see Figure 1). The new product is currently under verification and validation.
 In addition to its utility as an operational product, the Gauss-Markov Kalman filter in the US-TEC can be used for retrospective studies with the benefit of a much greater pool of non-real-time observations, which in turn enables assimilation analysis at higher temporal and spatial resolution. Analyses of regional (including CONUS) or global TEC maps have yielded important insights into the ionosphere and into the dynamics of the coupled thermosphere-ionosphere system. For example, a recent study of the Southeast Asia regional map by Lin et al. shows that the TEC response to the total solar eclipse of 2009 is best explained by the thermospheric compositional effect. That is to say, the balance between the primary production process of plasma (i.e., photoionization of atomic oxygen) and the loss processes (i.e., conversion of atomic ions to molecular ions and subsequent recombination with electrons) is influenced by the altered composition of molecular species and atomic oxygen during the eclipse. The CONUS map has also been used to investigate the geomagnetic storm-time response of the TEC, which exhibits negative and positive phases caused by thermospheric upwelling and equatorward transport of neutral composition [e.g., Araujo-Pradere et al., 2006].
 One modeling study by Fuller-Rowell et al.  suggests that the TEC enhancement observed in the CONUS map during a major sudden stratospheric warming event in January 2009 can well be traced back to the wave forcing from the lower atmosphere, which is thought to alter lower thermospheric winds and electrodynamics and subsequently drive the observed TEC changes.
 To investigate the role of thermosphere-ionosphere coupling in ionospheric data assimilation, we have designed observing system simulation experiments (OSSEs) for a global network of ionosondes using a thermosphere-ionosphere general circulation model in an ensemble Kalman filter (EnKF). The EnKF allows the use of a fully nonlinear general circulation model as its forecast model, and its forecast error covariance is flow-dependent; in other words, the way unobserved variables are constrained by observations in the EnKF reflects the time-dependent dynamics of the thermosphere and ionosphere. These attributes cannot be achieved, at least not easily, in the framework of the Gauss-Markov Kalman filter. A prototype EnKF assimilation system has been constructed using the National Center for Atmospheric Research (NCAR) Data Assimilation Research Testbed [Anderson et al., 2009] and the NCAR Thermosphere-Ionosphere Electrodynamics General Circulation Model [Richmond et al., 1992]. The scope of this paper is to report the above-mentioned OSSE results to demonstrate the positive impact on ionospheric specification of self-consistent treatment of the thermosphere and ionosphere in data assimilation schemes; we therefore defer in-depth exploration of this particular EnKF system to further studies with real observations rather than synthetically generated data.
2. Signatures of Coupled Thermosphere-Ionosphere Dynamics
 The complex dynamics of the coupled thermosphere-ionosphere system have been studied for some time, including storm-time dynamics that generally accentuate features of such coupling [e.g., Fuller-Rowell et al., 2007]. Figure 2 shows an example of the TEC analysis that portrays the ionospheric manifestation of coupled dynamics during the storm of 7–8 November 1998. It was obtained using the Gauss-Markov Kalman filtering in the US-TEC at a resolution of one degree in latitude and longitude, with observations from about 500 ground-based GPS stations. To accentuate the storm effects, the ratio of the vertical TEC (vTEC) between the storm-time values and the climatological baseline values (which reflect a geomagnetically quiet period) is taken. The vTEC ratio map shown in Figure 2 is for 05:15 UT on 8 November 1998, when the 3-hourly ap index is about 180. Note that the CONUS region at 05:15 UT is under local evening conditions, extending from the early evening to the postmidnight local time sector. Here, the positive (or negative) phase refers to a vTEC ratio that is higher (or lower) than the baseline value, indicated by contour levels that are greater than 1 (or smaller than 1). First of all, a clear contrast between the positive and negative phases is evident. The poleward positive phase can be explained by auroral processes, characteristic of high latitudes under geomagnetically disturbed conditions, and the equatorward positive phase is probably due to the combination of electrodynamics and downwelling of the neutral component, plus some contribution of day-side plasma transported to the night side by the Earth's rotation. On the other hand, the negative phase visible between the two positive bands is likely to be caused by thermospheric upwelling and equatorward transport of neutral composition [Araujo-Pradere et al., 2006]. 
Subauroral electric fields may also be contributing to this complex picture, for example, via the altered F region chemistry by fast plasma drifts [Schunk et al., 1976] or the upward plasma transport associated with subauroral ion drifts as observed by the Dynamics Explorer 2 [Anderson et al., 1991].
 Since no thermospheric effects are considered within the Gauss-Markov Kalman filter currently used in the US-TEC, such TEC manifestations of the coupled dynamics of the thermosphere and ionosphere are dictated solely by the ground-based GPS observations. This motivates us to explore ways to take advantage of the dynamics captured in general circulation models in data assimilation schemes, in order to maximize the geophysical information that can be extracted from observations.
3. Impacts of Self-Consistent Treatment of the Thermosphere-Ionosphere
 The approach taken in this paper complements the approach taken by most other ionospheric data assimilation projects, in which thermospheric states such as neutral winds are considered to be external drivers. Here the thermospheric feedback is taken into account in both the analysis and forecast steps of filtering, so that the neutral winds can be estimated from observations of electron density without special treatment.
3.1. Ensemble Kalman Filter
 An Ensemble Kalman filter (EnKF) assimilation system has been constructed using the NCAR Data Assimilation Research Testbed [Anderson et al., 2009] and the NCAR Thermosphere-Ionosphere Electrodynamics General Circulation Model [Richmond et al., 1992]. Both are community software offered by NCAR. For the results shown in this paper, all the default software settings are adopted unless otherwise mentioned.
 The EnKF is a Monte-Carlo approximation of a sequential Bayesian filtering process [Evensen, 1994]. The algorithm consists of recursive application of an analysis (update) step in which the prior ensemble estimate of the state is updated by observations to produce a posterior (analysis), and a forecast step in which the posterior ensemble is propagated forward in time with a dynamical forecast model to the next observation time. There is no need to compute explicitly the enormous prior covariance matrices that are associated with large general circulation models because of the use of sample covariance with appropriate localization. Besides, the EnKF formulation does not require the linearization of a forecast model or a forward (observation) operator. Important recent developments in the EnKF are related to sampling error issues, often encountered in Monte-Carlo based methods with a small sample number compared to the degrees of freedom of the dynamical model [Evensen, 2009]. The NCAR Data Assimilation Research Testbed provides algorithms that cope with these issues, and in experiments shown later in this paper the covariance is localized but not inflated.
 We include selected subsets of the TIEGCM physical variables in the EnKF state vector. Suppose that one type of TIEGCM variable evaluated on the model grid is denoted by a vector f whose size is the same as the number of TIEGCM grid points (∼75,000). For instance, fNe represents the electron density on the model grid. Likewise, the temperature, the zonal and meridional winds, and the atomic and molecular oxygen mixing ratios on the model grid are respectively denoted by fT, fU, fV, fO1, and fO2. The EnKF state vector is denoted by x. (Specifics of the state vector are explained later and are also given in Table 1.) In the forecast step all the physical variables in TIEGCM, not limited to the variables included in x, are dynamically evolved according to the physics, chemistry, and electrodynamics of the thermosphere and ionosphere described by Roble et al. , Richmond et al. , and references therein. In the update step, on the other hand, "coupling" between the thermosphere and the ionosphere is governed by the cross-covariance among the physical variables included in the state vector x. If thermospheric parameters are included in the state vector, the cross-covariance between the electron density fNe and thermospheric parameters such as fT, fU, fV, fO1, or fO2 is estimated from the ensemble. Thanks to the flow-dependent covariance readily available within the EnKF framework, unobserved thermospheric states as well as data-void regions of the ionosphere can be inferred from observations of electron density, in a fashion consistent with the underlying time-dependent dynamics captured in a general circulation model of the thermosphere and ionosphere.
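The update step described above can be illustrated with a minimal sketch, assuming a perturbed-observation form of the EnKF and a single scalar observation; this is one textbook variant and not necessarily the one used in the experiments reported here. The function name `enkf_update`, the toy two-element state, and all numerical values are hypothetical. Element 0 stands in for an observed electron density value and element 1 for an unobserved but correlated thermospheric variable; no localization or inflation is applied.

```python
import random
import statistics


def enkf_update(ensemble, obs_index, y_obs, obs_var, rng):
    """Perturbed-observation EnKF update for one scalar observation.

    ensemble  : list of members; each member is a list of state values
    obs_index : index of the state element that is observed directly
                (a stand-in for the interpolation operator H)
    """
    n_members = len(ensemble)
    predicted = [member[obs_index] for member in ensemble]  # H x per member
    pred_mean = statistics.fmean(predicted)
    pred_var = statistics.variance(predicted)  # sample variance, divisor n - 1

    # Gain for every state element: cov(x_i, Hx) / (var(Hx) + obs_var).
    # For unobserved elements this is the ensemble cross-covariance at work.
    gains = []
    for i in range(len(ensemble[0])):
        column = [member[i] for member in ensemble]
        col_mean = statistics.fmean(column)
        cov = sum((a - col_mean) * (b - pred_mean)
                  for a, b in zip(column, predicted)) / (n_members - 1)
        gains.append(cov / (pred_var + obs_var))

    posterior = []
    for member, hx in zip(ensemble, predicted):
        # Each member sees the observation plus its own noise draw.
        innovation = y_obs + rng.gauss(0.0, obs_var ** 0.5) - hx
        posterior.append([x + k * innovation for x, k in zip(member, gains)])
    return posterior


# Toy prior: element 0 is observed, element 1 is unobserved but correlated.
rng = random.Random(0)
prior = []
for _ in range(200):
    a = rng.gauss(0.0, 1.0)
    prior.append([a, 2.0 * a + rng.gauss(0.0, 0.1)])

posterior = enkf_update(prior, obs_index=0, y_obs=1.0, obs_var=0.01, rng=rng)
post_mean = [statistics.fmean(col) for col in zip(*posterior)]
print(post_mean)
```

In this toy setting the unobserved element is drawn toward a value consistent with the observation purely through the sample cross-covariance, which is the mechanism by which thermospheric variables would be adjusted by electron density observations.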
Table 1. List of the Variables Included in the State Vector x
Experiment 1: x = [fNe]′
Experiment 2: x = [fNe, fT, fU, fV]′
Experiment 3: x = [fNe, fT, fU, fV, fO1, fO2]′
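As an illustration of how the state vectors in Table 1 could be assembled from gridded fields, the following sketch concatenates the selected fields in the order listed. The dictionary keys, the helper names `build_state_vector` and `split_state_vector`, and the four-point toy grid are hypothetical and are not part of the actual assimilation software.

```python
# Fields per experiment, in the order listed in Table 1.
EXPERIMENT_FIELDS = {
    1: ("Ne",),
    2: ("Ne", "T", "U", "V"),
    3: ("Ne", "T", "U", "V", "O1", "O2"),
}


def build_state_vector(fields, experiment):
    """Concatenate the gridded fields selected for one experiment."""
    names = EXPERIMENT_FIELDS[experiment]
    n_grid = len(fields[names[0]])
    state = []
    for name in names:
        if len(fields[name]) != n_grid:
            raise ValueError(f"field {name!r} does not match the grid size")
        state.extend(fields[name])
    return state


def split_state_vector(state, experiment):
    """Inverse of build_state_vector: recover the named fields."""
    names = EXPERIMENT_FIELDS[experiment]
    n_grid = len(state) // len(names)
    return {name: state[i * n_grid:(i + 1) * n_grid]
            for i, name in enumerate(names)}


# Toy grid with four points (the real grid has about 75,000).
toy_fields = {name: [float(i)] * 4
              for i, name in enumerate(("Ne", "T", "U", "V", "O1", "O2"))}
for experiment in (1, 2, 3):
    x = build_state_vector(toy_fields, experiment)
    print(experiment, len(x))
```

The state vector length grows with the number of included fields, while the number of grid points per field stays fixed.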
3.2. Observing System Simulation Experiment Design
 We use the locations of 75 ionosonde stations worldwide, listed in Table 2 of Araujo-Pradere et al. , to design observing system simulation experiments. The experiments are conducted for one day under geomagnetically quiet equinox conditions. The default settings are retained for most of the model input parameters. For example, the daily F10.7 index (d1), a proxy for EUV forcing, is set to 150, and the parameters that essentially control the high-latitude energy and momentum input, the cross-polar-cap potential drop (d2) and the hemispherical power index (d3), are set to 45 and 16, respectively. Synthetic electron density observations are generated by sampling the control simulation result at these ionosonde locations, at every 50 km from 150 to 400 km, and then adding centered Gaussian errors. Note that these are perfect model experiments; in other words, the control simulation result serves as "the truth," denoted here by ft. The forward operator H merely involves interpolation of fNe to observation locations, and so the observation equation is given by
$$y = H f^{t}_{Ne} + \epsilon,$$
where y is the observation and ε ∼ N(0, σ²). Most thermosphere and ionosphere general circulation models, especially those with a lower boundary at the mesopause height, exhibit very little sensitivity to initial conditions [Liu et al., 2009]. Instead, the upper atmosphere is strongly controlled by external forcing that is often prescribed in the models as boundary conditions or input parameters. Here we create the ensemble via centered Gaussian perturbation of some of the model parameters, namely the F10.7 index (d1), the cross-polar-cap potential drop (d2), and the hemispherical power index (d3). Suppose F represents the forecast model that integrates the state vector x from time k-1 to k:
$$x_{k}^{(m)} = F\left(x_{k-1}^{(m)}, d^{(m)}\right),$$
where m is an index for the ensemble, and d(m) = [d1(m), d2(m), d3(m)]′ is sampled from the Gaussian distribution d(m) ∼ N([μ1, μ2, μ3]′, Σd). The mean is set equal to the one used in the control simulation (i.e., μ1 = 150, μ2 = 45, and μ3 = 16) under the assumption that the model is not systematically biased. d2 and d3 are assumed to be correlated with each other, but the correlation of d1 with d2 and d3 is not considered (i.e., the off-diagonal elements in the first row and column of Σd are zero). The spin-up time is 2 weeks. The assimilation cycle (i.e., how often observations are assimilated) is 30 min, and 64 ensemble members are used.
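The experiment setup described above can be sketched as follows. The means, the ensemble size, and the observation altitudes are taken from the text; the standard deviations, the d2-d3 correlation coefficient, the observation error, and the function names are hypothetical placeholders, since these values are not stated here. The sketch draws d1 independently and builds the d2-d3 correlation by hand from two standard normal draws.

```python
import math
import random

# Means follow the control simulation: F10.7 (d1), cross-polar-cap
# potential drop (d2), hemispherical power index (d3).
MEANS = (150.0, 45.0, 16.0)
N_MEMBERS = 64
ALTITUDES_KM = list(range(150, 401, 50))  # every 50 km from 150 to 400 km


def sample_forcing(rng, n_members, sigmas, rho_23):
    """Draw d = (d1, d2, d3): d1 independent, d2 and d3 correlated."""
    s1, s2, s3 = sigmas
    members = []
    for _ in range(n_members):
        z1, z2, z3 = (rng.gauss(0.0, 1.0) for _ in range(3))
        d1 = MEANS[0] + s1 * z1
        d2 = MEANS[1] + s2 * z2
        # Mixing z2 into d3 yields correlation rho_23 between d2 and d3.
        d3 = MEANS[2] + s3 * (rho_23 * z2 + math.sqrt(1.0 - rho_23 ** 2) * z3)
        members.append((d1, d2, d3))
    return members


def synthetic_observations(rng, truth_at_obs, sigma):
    """Add centered Gaussian errors to truth values sampled at obs points."""
    return [value + rng.gauss(0.0, sigma) for value in truth_at_obs]


rng = random.Random(1)
# Hypothetical spreads and correlation, chosen only for illustration.
ensemble_forcing = sample_forcing(rng, N_MEMBERS,
                                  sigmas=(10.0, 5.0, 2.0), rho_23=0.8)
toy_truth = [1.0] * len(ALTITUDES_KM)  # placeholder truth profile
observations = synthetic_observations(rng, toy_truth, sigma=0.1)
print(len(ensemble_forcing), len(observations))
```

Each forcing triple would then drive one ensemble member of the forecast model, so that the ensemble spread reflects uncertainty in the external forcing rather than in the initial conditions.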
 The following three experiments are conducted to examine the relative impact on the ionospheric specification of the different sets of physical variables, such as fT, fU, fV, fO1, and fNe, that are included in the EnKF state vector x. Table 1 summarizes what constitutes the state vector for these three experiments. In experiment 1 only the electron density is adjusted by observations in the update step, although the effects of thermosphere-ionosphere coupling are fully taken into account in the forecast step. In experiment 2 the main dynamical variables of the thermosphere (temperature and winds) as well as electron density are adjusted by observations in the update step, through the cross-covariance between observations of electron density and the main dynamical variables on model grids that is estimated from the ensemble. In experiment 3, along with electron density, the main dynamical variables of the thermosphere (temperature and winds) and the major composition variables are adjusted by observations in the update step. Note that the molecular nitrogen mass mixing ratio is set equal to 1 − fO1 − fO2 in the model.
3.3. Root-Mean-Square Difference
Figure 3 shows the root-mean-square error (RMSE) of electron density analyses over the entire model domain. The RMSE is computed as a standard measure of the differences between the analysis faNe and the truth ftNe, i.e., $\mathrm{RMSE} = \sqrt{\frac{1}{J}\sum_{j=1}^{J}\left(f^{a}_{Ne,j} - f^{t}_{Ne,j}\right)^{2}}$, where J denotes the total number of grid points within a certain model domain. Note that J is exactly the same for all three experiments, even though the size of the state vector x differs among experiments. The RMSE from experiment 1 is shown in black, while those from experiments 2 and 3 are shown in red and blue, respectively.
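As a small illustration of this measure, the sketch below computes the root-mean-square difference between an estimate and the truth over J grid points, taking the ensemble mean as the estimate. The function names and the toy numbers are hypothetical.

```python
import math


def ensemble_mean(ensemble):
    """Grid-point-wise mean over ensemble members (each member is a list)."""
    n_members = len(ensemble)
    return [sum(column) / n_members for column in zip(*ensemble)]


def rmse(estimate, truth):
    """Root-mean-square difference over the J grid points."""
    if len(estimate) != len(truth):
        raise ValueError("estimate and truth must have the same length")
    j = len(truth)
    return math.sqrt(sum((a - t) ** 2 for a, t in zip(estimate, truth)) / j)


# Toy example: two members on a three-point grid.
analysis_members = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
truth = [2.0, 3.0, 5.0]
print(rmse(ensemble_mean(analysis_members), truth))
```

Because only the electron density grid enters the sum, the number of terms J is the same regardless of how many additional fields the state vector carries.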
 In addition to the RMSE of the assimilation analysis (or the expected value of the posterior distribution), which is shown in solid lines, the RMSE of the forecast (or the expected value of the prior distribution) is shown in dashed lines for comparison. Because a Gaussian probability distribution is assumed in the EnKF framework, the expected value is given by the ensemble mean. The overall diurnal cycle of RMSE, seen in all three experiments, originates from the Universal Time distribution of ionosonde stations (indicated by red dots in Figure 4), as data-dense and data-sparse parts of the ionosonde observational network rotate under the dayside equatorial ionization anomaly. With the exception of experiment 1, in which only electron density is adjusted by the observations, the RMSE generally decreases over the course of the day, at a somewhat faster rate for experiment 3 than for experiment 2. In comparison to the impact of adjusting thermospheric dynamical variables along with electron density, the additional contribution from adjusting composition, which can be assessed by comparing experiment 3 to experiment 2, may seem small; however, its impact grows over time. Considering that thermosphere-ionosphere coupling is fully taken into account in the forecast step in all three experiments, this indicates that self-consistent treatment of the thermosphere and ionosphere in the assimilation scheme significantly improves a global ionospheric specification.
Figure 4 in turn displays global maps of the RMSE at every 6 or 12 h, computed at a given longitude and latitude grid location from the electron density analyses resulting from all three experiments. The analysis at the very beginning of the filtering experiment (0 UT) is identical among all experiments (shown in Figure 4b), and the RMSE map of the prior mean is shown in Figure 4a as a reference point before any observations are assimilated. As filtering advances through assimilation cycles, the analysis results of the experiments start to diverge. Figure 4c shows the RMSE maps of the assimilation analysis for experiment 1 at 6, 12, and 24 UT (13th, 25th, and 49th assimilation cycles). The maps shown in Figure 4d correspond to the analysis from experiment 2, and Figure 4e is for experiment 3. Comparing the RMSE maps at the beginning (Figure 4b) and at the end of experiment 3 (Figure 4e), the reduction of error is evident when thermospheric dynamical variables and main composition variables, along with electron density, are modified by the observations. This is in striking contrast to experiment 1, whose RMSE at 24 UT is shown in Figure 4c.
4. Discussion and Conclusions
 This paper presents some of the recent research and development efforts at NOAA-SWPC to improve global ionospheric specification by means of data assimilation. The methodology currently in use for production was developed primarily for assimilation of ground-based GPS observations. Its application is therefore ideal in multiregional domains, where we can take advantage of the already (or soon-to-be) established dense regional networks indicated by boxes in Figure 1, but it is not necessarily suitable in the global domain. As one of the efforts to overcome the limitations of the current specification methodology, we are assessing the benefits of using a coupled ionosphere-thermosphere model in more modern data assimilation schemes.
 In this paper we explore the ability of the EnKF to take advantage of the coupled dynamics of the thermosphere and ionosphere captured in general circulation models to improve the global ionospheric specification. The observing system simulation experiments, designed with the help of the NCAR Thermosphere-Ionosphere Electrodynamics General Circulation Model and the NCAR Data Assimilation Research Testbed for a global network of ionosonde stations, suggest that self-consistent treatment of thermosphere-ionosphere coupling in assimilation schemes improves the quality of a global ionospheric specification. One of the important advantages of the EnKF is that its covariance is consistent with the time-dependent dynamics of the thermosphere and ionosphere captured in the forecast model. In spite of the sampling error issues associated with any Monte-Carlo method, the cross-covariance between observations and unobserved physical variables estimated from the ensemble is effective enough to constrain the thermospheric state from observations of electron density, which in turn improves the overall specification of the ionosphere.
 Whether or not the same statement holds true for experiments with real observations hinges on the quality of the model and the availability of high-quality observations. If the relationship between the thermosphere and the ionosphere is misrepresented in the model, adjusting unobserved thermospheric variables in the model by using ionospheric observations may degrade the overall quality of the assimilation analysis. Our analyses of the TEC maps, obtained from the Gauss-Markov Kalman filter used in the US-TEC, support the view that ground-based GPS observations contain enough information about the state of the thermosphere to take advantage of self-consistent treatment of thermosphere-ionosphere coupling in assimilation schemes. As general circulation models of the thermosphere and ionosphere mature, this approach may greatly further our ability to specify the state of the ionosphere in the future.
 We thank the reviewers for their very helpful comments. We thank Timothy J. Fuller-Rowell for valuable discussion on thermosphere-ionosphere dynamics. Tomoko Matsuo greatly appreciates Jeffrey L. Anderson, Timothy Hoar, and Nancy Collins at the NCAR Institute for Mathematics Applied to Geosciences for their kind help with facilitating the use of DART, and the TIEGCM community model developers at the NCAR High Altitude Observatory for making the model available. Tomoko Matsuo is supported by the Air Force Office of Scientific Research Multidisciplinary University Research Initiative award FA9550-07-1-0565.