The global core plasma model (GCPM) is a realistic electron density model of the inner magnetosphere. Using long-term total electron content (TEC) data obtained from several GPS tracking stations along the equator and from a low Earth orbit satellite, we examine the accuracy of the GCPM equatorial density. A statistical analysis of the GPS TEC reveals a remarkable feature in the bias errors of the GCPM-derived TEC. Most of the errors are found to be distributed in the topside ionosphere, owing to the simple representation of the density there.
A global model that provides reasonable estimates of cold plasma density is essential for simulating the propagation of plasma waves in the Earth's plasmasphere, for example, in ray tracing of whistler mode waves [e.g., Bortnik et al., 2009; Santolík and Chum, 2009]. Diffusive equilibrium models have been widely used to represent the field-aligned density distribution because of their simplicity and flexibility. Recently, however, more realistic density models, such as the global core plasma model (GCPM) [Gallagher et al., 2000], the IMAGE/RPI model [Huang et al., 2004], the global plasmasphere ionosphere density (GPID) model [Webb and Essex, 2000], and the standard plasmasphere ionosphere model (SIM) [Gulyaeva et al., 2002], have been developed theoretically, semi-empirically, or fully empirically. These recent models enable more realistic simulations of wave propagation. Among them, the GCPM has the advantage that its densities are always continuous in both value and derivative, which is a necessary condition for simulating wave propagation. In addition, the model software is publicly available.
The GCPM is actually a framework that integrates region-specific plasma density models developed over the years. It provides typical electron densities throughout the ionosphere, plasmasphere, magnetospheric trough, and polar cap under various solar and geomagnetic conditions. The model density in each region is represented by analytical functions that correspond to the plasma distribution features of the region. The model makes it very convenient to obtain a global electron density profile in the inner magnetosphere, but its statistical accuracy has never been examined. As an analysis of the IRI/GCPM error at middle and high latitudes, Stolle et al. showed the bias errors of the IRI/GCPM density estimates above Europe and the north polar region between April and November 2001 by comparing measurements from the CHAMP satellite with the model density. They proposed a data assimilation technique that improves the model estimates by using GPS TEC data.
In this paper, we discuss the accuracy of the GCPM density, especially at the magnetic equator, through a statistical analysis of the total electron contents (TECs) derived from GPS data; we then construct an error model for the GCPM. The equatorial density is the most important factor in determining the entire GCPM profile in the plasmasphere because the field-aligned density is obtained by interpolating between the equatorial density and the ionospheric density with an exponential function. Long-term global TEC data make it possible to evaluate the accuracy of the global density model.
It should be noted that the GCPM provides a typical profile under a given condition and differs from models that are used to forecast ionospheric and plasmaspheric states in the near future, for example, a global TEC prediction model [García-Rigo et al., 2011]. That is, the GCPM is more similar to a climate model than to a weather model. Accordingly, the construction of the error model is based on an estimate of the mean bias error of the GCPM-derived TEC relative to the measured TEC under each condition. In order to estimate the bias error, it is necessary to identify the factors that contribute to the estimation errors in the GCPM. Constructing a global density model from the GPS-TEC data is much more difficult than constructing such an error compensation model because the number of contributing parameters is much smaller in the error compensation model.
2. Model and Data
2.1. Equatorial Density in the GCPM
 In the GCPM, the density on the magnetic equator is derived from the international reference ionosphere (IRI) model [Bilitza et al., 1993; Bilitza, 2001] in low altitudes and an empirical plasmaspheric density model in high altitudes [Gallagher et al., 1988; Carpenter and Anderson, 1992]. The IRI model is adopted below an altitude of the maximum negative density gradient above the F2 peak. Above this altitude, the density is extrapolated by
where d0 and d1 are parameters that are fitted to the slope and density at the transition altitude and d represents an altitude.
 The empirical plasmaspheric density nps in the magnetic equator is represented by the following L-dependent function:
where t is the day of the year and the solar-activity term uses the 13-month-average sunspot number. The logarithm of nps is mainly represented by a monotonically decreasing linear function of the L-value. The coefficients of the linear function and the additive minor terms for seasonal variation and solar activity were estimated from observation data of Dynamics Explorer 1 (DE1) and International Sun-Earth Explorer 1 (ISEE1). Although we do not give a detailed explanation here, the GCPM also includes the plasmapause, whose location and density slope are determined by local time and the Kp index. The density at altitudes lower than the plasmaspheric model in equation (2) is also extrapolated by the same type of power law function as in equation (1). Around the intersection altitude of the two extrapolation functions, the model densities are smoothly connected in value and derivative with a weighting function.
To calculate the GCPM density on the magnetic equator, the following inputs are required: date, time, local time, radial distance, latitude, and Kp index for the ionosphere; date, sunspot number, and L-value for the plasmasphere; and local time and Kp index for the plasmapause. In this study, we use the GCPM source code (version 2.4), which includes the IRI 2007 model as the ionospheric density model.
2.2. Total Electron Content From the GPS Data
The global positioning system (GPS) is a well-known global navigation satellite system (GNSS) that uses nearly three dozen satellites transmitting precise radio wave signals from orbits at an altitude of 20,200 km (a geocentric distance of 4.17 Re). GPS signal tracking stations have been distributed worldwide since the start of full-scale operation in 1996. Observation and navigation data received at those stations are available through the International GNSS Service (IGS) [Dow et al., 2009]. The total electron content (TEC) along a path from a GPS satellite to a ground station can be derived from the difference between the pseudo-ranges measured with two carrier frequency signals because, among the errors in the measured pseudo-ranges, only the delay due to ambient plasma depends on frequency. The TEC is represented by
where m and e are the mass of an electron and the elementary electric charge, respectively, and P1, P2 and f1, f2 are the pseudo-ranges and frequencies of the two signals, respectively.
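The dual-frequency relation described above can be illustrated with a minimal sketch. It assumes the standard first-order ionospheric group delay in SI units, in which the delay at frequency f equals K·TEC/f^2 with K = e^2/(8π^2 ε0 m) ≈ 40.3 m^3/s^2; the exact form and unit system of the paper's equation are not recoverable from the text, so the constant, the function name, and the use of the nominal GPS L1/L2 frequencies are assumptions made for illustration. The result is a slant TEC; no mapping to vertical TEC is attempted.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19      # elementary charge e [C]
E_MASS = 9.1093837015e-31       # electron mass m [kg]
EPS0 = 8.8541878128e-12         # vacuum permittivity [F/m]

# First-order ionospheric delay constant, about 40.3 m^3 s^-2.
K = E_CHARGE ** 2 / (8.0 * math.pi ** 2 * EPS0 * E_MASS)

F1_HZ = 1575.42e6               # nominal GPS L1 carrier frequency
F2_HZ = 1227.60e6               # nominal GPS L2 carrier frequency
TECU = 1.0e16                   # 1 TECU = 10^16 electrons/m^2


def slant_tec_tecu(p1_m, p2_m, f1_hz=F1_HZ, f2_hz=F2_HZ):
    """Slant TEC in TECU from two pseudo-ranges in meters.

    Assumes P2 - P1 = K * TEC * (1/f2^2 - 1/f1^2), i.e. that every
    frequency-independent error cancels in the difference and that
    differential code biases have already been removed.
    """
    factor = (f1_hz ** 2 * f2_hz ** 2) / (K * (f1_hz ** 2 - f2_hz ** 2))
    return factor * (p2_m - p1_m) / TECU
```

Under these assumptions, a differential delay of roughly 0.105 m between the two pseudo-ranges corresponds to 1 TECU, which gives a feeling for how precisely the code biases must be removed.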
In the present study, vertical TEC data obtained at six tracking stations along the magnetic equator were used. The locations of the stations are listed in Table 1. Only signals received from GPS satellites located at magnetic latitudes within 10 degrees of the equator were used. The differential code biases of the transmitters and receivers, which appear as bias errors in the estimated TEC, were removed using ancillary data provided by the Center for Orbit Determination in Europe.
Table 1. List of the GPS Tracking Stations
3. Error Model of the GCPM-Derived TEC
Statistical analyses of the vertical TEC data obtained from the GPS tracking stations cannot resolve the altitude dependence of the GCPM errors, but they can reveal the dependence on observation conditions such as local time, solar activity, and geomagnetic activity. To improve the GCPM, we therefore construct an error model of the GCPM-derived TEC for such parameters by using the vertical TEC data observed under various conditions.
3.1. Parameter Dependencies of the Error Model
Figure 1a depicts a scatterplot of the observed and GCPM-derived TECs covering all local times and dates from 2000 to 2009 at the GPS tracking stations. The total number of TEC data exceeds 3.5 million. In the figure, data points on the diagonal solid line indicate that the GCPM estimates the TEC accurately. Points above and below the line indicate that the GCPM overestimates and underestimates the TEC, respectively. Figure 1b presents a histogram of the error of the GCPM-derived TEC relative to the observation. The mean and standard deviation of the error are 6.3 TECU (1 TECU = 10^16 electrons/m^2) and 13 TECU, respectively. These two representations indicate that the error depends on the TEC magnitude and does not follow a simple zero-mean Gaussian distribution. When the observed TEC is larger than around 80 TECU, the GCPM always underestimates the TEC. In contrast, when the observed TEC is less than around 50 TECU, the GCPM frequently overestimates the TEC. In order to construct a GCPM-TEC error model, we have to identify the causal parameters of these errors.
Figures 2a–2f depict the means and standard deviations of conditional histograms of the TEC error for the parameters day of the year (DOY), local time (LT), sunspot number (SN), Dst index (DST), geomagnetic longitude (MLON), and Kp index (KP), respectively. Except for the DST, these parameters are used in the original GCPM. The error model should be constructed so that the means of these conditional histograms are reduced to zero. Because the standard deviations represent not only statistical random errors but also variations of the bias errors due to the other parameters, the mean bias is unlikely to be modeled adequately for each parameter independently. In Figures 2a–2c, for example, the histograms are widely distributed for 60 < DOY < 120 and 250 < DOY < 310, for 12 hr < LT < 18 hr, and for SN > 60, respectively. According to the error distributions in the three-dimensional parameter space (DOY, LT, SN), large errors are generated when these conditions are satisfied simultaneously. That is, the error model cannot be represented by a combination of separate functions of the DOY, LT, and SN; rather, it is a joint function of the three parameters.
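The conditional statistics discussed above can be sketched as follows. This is a minimal illustration, not the processing used for Figure 2: the function name, the uniform bin width, and the use of the population standard deviation are assumptions.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def conditional_mean_std(param_values, errors, bin_width):
    """Mean and standard deviation of the TEC error in each parameter bin.

    param_values: one conditioning parameter per sample (e.g. LT in hours).
    errors: TEC error per sample (model minus observation, in TECU).
    Returns {left bin edge: (mean, std)} for every non-empty bin.
    """
    bins = defaultdict(list)
    for value, err in zip(param_values, errors):
        bins[math.floor(value / bin_width)].append(err)
    return {k * bin_width: (mean(v), pstdev(v)) for k, v in sorted(bins.items())}
```

A bin whose mean is far from zero indicates a bias tied to that parameter, while a large standard deviation may hide biases that depend on the other parameters, which is why the text argues for a joint model.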
As for the geomagnetic activity parameters, although the GCPM takes Kp dependencies into account in the ionosphere and at the plasmapause, a Kp-dependent bias remains in the TEC errors, as shown in Figure 2f. The error also depends on the DST, as shown in Figure 2d. In order to choose one parameter to represent the geomagnetic activity, a chi-square test of independence, as explained in Appendix A, was applied to the KP and the DST with respect to the TEC errors. Although the KP is generally used as a parameter in ionospheric and plasmaspheric density models, the test reveals that the TEC errors are slightly more dependent on the DST than on the KP. This result is not surprising because we deal with the GCPM errors, not the density itself. From a technical viewpoint, there is another advantage to adopting the DST instead of the KP. Because the Kp index is expressed on a scale of 28 discrete values whereas the Dst index takes continuous values, a model based on the DST is free from quantization errors and can estimate the TEC error more accurately. On the basis of these considerations, we adopt the DST in the TEC error model.
 In order to reduce the dimension of ε, we represent the GCPM error as follows:
where ε2 is assumed to be represented by a Gaussian function as
in which the three numerical constants are determined by a least squares fit to the averages of the histograms in Figure 2d. Because the data coverage in MLON was poor, its dependence is not taken into account in the error model developed here.
The validity of representing the TEC error by equation (4) is also verified by the chi-square test of independence between the residual error ε1 and the parameters DOY, LT, SN, and DST. The χ2 value for the DST, which signifies the degree of dependence of ε1 on the DST, is smaller than the χ2 values for the other parameters by about one order of magnitude. This result means that the DST component is adequately removed by equation (5). In the following section, we present the model construction of the residual error ε1.
3.2. Multidimensional Model Construction
The error distribution over the LT and DOY is depicted in Figure 3 for SN < 8 (Figure 3a), 8 < SN < 20 (Figure 3b), 20 < SN < 60 (Figure 3c), and 60 < SN (Figure 3d). The SN is divided into these four groups so that each group contains almost the same number of data. Large negative errors are found in the afternoon from February to April and from September to November during solar active periods. These negative errors are considered to be caused mainly by underestimates of the seasonal variations of the ionospheric density. Large positive errors are found throughout the daytime from August to December during solar quiet periods.
For estimating ε1 in equation (4) from the error distribution in the three-dimensional parameter space, parametric model fitting is one of the easiest methods. However, because the errors display a multimodal distribution, a simple Gaussian distribution model is not appropriate for representing them, and fitting them properly with multiple Gaussian distributions takes too much computational time. For such a problem, a nonparametric representation of the error is convenient. A nonparametric representation is employed, for example, in the correction of earthquake location estimates [Ogata et al., 1998]. While that study uses a continuous representation with B-spline functions, here we adopt a multivariate discrete spline representation. The advantage of the discrete spline is that model users need not carry out calculations to obtain a correction value; instead, they can read it directly from a data array.
 Actually, the error includes unexpected parameter-dependent components that are not built into the error model, such as the magnetic longitude. The adopted nonparametric modeling is effective at removing the random components of such parameters.
For simplicity, the parameters DOY, LT, and SN are represented as x, y, and z, respectively. The purpose of the discrete spline method is to obtain a smoothed error map G(x, y, z), which is equivalent to ε1, from the error data Fi by interpolating the values in the space (x, y, z). The error data Fi are not individual TEC error data but the most frequent values in the parameter-digitized cells. Each parameter is digitized as presented in Table 2. The number of error data Fi is around 12,000, and 80% of the cells have data.
Table 2. Digitization of the Parameters of the Day of the Year (DOY), Local Time (LT), and Sunspot Number (SN) for the Discrete Spline Fitting Method
Note: the SN groups are (a) below 8, (b) 8–20, (c) 20–60, and (d) above 60.
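How the error data Fi might be formed from individual samples can be sketched as follows. The text states only that Fi is the most frequent value in each digitized cell; the 1 TECU histogram bin for the error, the use of the bin center as the representative value, and the caller-supplied `cell_of` function are assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def most_frequent_error_per_cell(samples, cell_of, error_bin_tecu=1.0):
    """Return {cell: most frequent TEC error} for all cells that hold data.

    samples: iterable of (lt_hours, doy, sunspot, error_tecu) tuples.
    cell_of: hypothetical function mapping (lt, doy, sn) to a hashable
             cell identifier, e.g. a tuple of three integer indices.
    The error is histogrammed with bins of width error_bin_tecu, and the
    center of the most populated bin is used as the cell value.
    """
    histograms = defaultdict(Counter)
    for lt, doy, sn, err in samples:
        histograms[cell_of(lt, doy, sn)][math.floor(err / error_bin_tecu)] += 1
    return {
        cell: (counts.most_common(1)[0][0] + 0.5) * error_bin_tecu
        for cell, counts in histograms.items()
    }
```

Using the mode rather than the mean makes each cell value insensitive to a small number of outlying samples, which may be one reason for the choice described in the text.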
 In the discrete spline method, G(x, y, z) is calculated by solving a minimum value problem of the following index:
where the first term on the right-hand side denotes the fitness of the solution distribution G to the TEC error data Fi, and the second term denotes the smoothness of G. The relative weight of the two terms is determined by a smoothing parameter γ.
 We define vectors f and g whose elements are the error data Fi and all the digitized values of G, respectively. By using them, equation (6) can be rewritten as
where Q is a matrix in which only one element that corresponds to the data equals 1 in each row and the other elements are 0. C is a matrix that represents a relation with neighboring cell values, that is, in each row, seven elements are set to satisfy the following relation with neighboring cell values:
As boundary conditions of the parameter space, x and y are periodic.
 The solution distribution g is obtained under the condition that the derivative of E equals 0 as follows:
where I is the unit matrix, and γ′ is a parameter determined from γ and the number of the error data. Then, there are three controllable parameters to determine a solution distribution, (γ′/Δx), (γ′/Δy), and (γ′/Δz). These parameters were determined by considering the balance between the influence of data acquisition bias and the generality of the solution distribution. Figure 4 depicts a solution distribution for the LT and DOY on the four conditions of the SN.
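The structure of this minimization can be illustrated with a one-dimensional, periodic reduction of the problem. The sketch below is not the paper's three-dimensional solver: it assumes that C is the discrete second-difference operator (three nonzero elements per row in one dimension instead of seven in three dimensions), uses a single smoothing weight `gamma`, and solves the normal equations (Q^T Q + gamma C^T C) g = Q^T f by plain Gaussian elimination. All names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def smooth_periodic_1d(n_cells, observations, gamma):
    """Smoothed values g on a periodic 1-D grid.

    observations: {cell index: error value F_i}; cells without data are
    filled in by the smoothness term. At least one observation is needed.
    Minimizes |f - Q g|^2 + gamma * |C g|^2.
    """
    n = n_cells
    # Rows of C: g[i-1] - 2 g[i] + g[i+1], with periodic wrap-around.
    c_mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        c_mat[i][(i - 1) % n] += 1.0
        c_mat[i][i] -= 2.0
        c_mat[i][(i + 1) % n] += 1.0
    # A = gamma * C^T C, then add Q^T Q (ones on observed diagonal entries).
    a = [[gamma * sum(c_mat[k][i] * c_mat[k][j] for k in range(n))
          for j in range(n)] for i in range(n)]
    b = [0.0] * n
    for idx, value in observations.items():
        a[idx][idx] += 1.0
        b[idx] += value
    return solve_linear(a, b)
```

A larger `gamma` favors smoothness over fidelity to the cell data, which mirrors the trade-off between data acquisition bias and generality mentioned in the text; a production implementation would use a sparse solver rather than dense elimination.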
While the dependence of the TEC error on the DST is represented by a Gaussian function, its dependence on the LT, DOY, and SN is represented by a numerical model that consists of a three-dimensional array of 14,016 values (48 × 73 × 4). Each value covers a small cell of the parameter space in which the TEC error is assumed to be invariant. An improved TEC can be obtained by applying the developed error models to the TEC derived from the original GCPM. The simple representation based on a data array ε1 and an analytical function ε2 is an advantage of the developed model for GCPM users.
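A user-side correction based on this representation might look like the following sketch. Several points are assumptions rather than statements from the paper: that the total error is the sum ε1 + ε2 and is subtracted from the GCPM-derived TEC, that the 48 × 73 × 4 array corresponds to 0.5 hr LT cells, 5-day DOY cells, and the four SN groups, and that the Gaussian is parameterized by an amplitude, a center, and a width. The fitted values of those three constants are not given here, so they are passed in by the caller.

```python
import math

N_LT, N_DOY, N_SN = 48, 73, 4      # array shape quoted in the text
SN_EDGES = (8.0, 20.0, 60.0)       # boundaries of the four SN groups


def cell_index(lt_hours, doy, sunspot):
    """Map (LT, DOY, SN) to array indices; the cell widths are assumed."""
    i_lt = int((lt_hours % 24.0) * N_LT / 24.0)        # 0.5 hr cells
    i_doy = min(int((doy - 1) / 5.0), N_DOY - 1)       # 5-day cells
    i_sn = sum(sunspot >= edge for edge in SN_EDGES)   # group 0..3
    return i_lt, i_doy, i_sn


def eps2_gaussian(dst, amplitude, center, width):
    """Assumed three-constant Gaussian form of the DST-dependent error."""
    return amplitude * math.exp(-((dst - center) ** 2) / (2.0 * width ** 2))


def improved_tec(gcpm_tec, eps1_table, lt_hours, doy, sunspot, dst, gauss):
    """GCPM-derived TEC minus the modeled bias (eps1 lookup plus eps2)."""
    i, j, k = cell_index(lt_hours, doy, sunspot)
    return gcpm_tec - (eps1_table[i][j][k] + eps2_gaussian(dst, *gauss))
```

Because ε1 is a plain array lookup, the correction costs almost nothing at run time, which is the practical advantage the text attributes to the discrete representation.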
3.3. Improvement of GCPM-Derived TECs
The validity of the developed TEC error model is verified by a scatterplot of the observed TEC and the improved GCPM-TEC, as depicted in Figure 5a. The correlation between them is better than in the plot in Figure 1a. Figure 5b depicts a histogram of the error of the improved GCPM. The mean and standard deviation of the GCPM-TEC errors become 0.5 TECU and 8.9 TECU, respectively. The estimation accuracy is improved in more than 80% of cases.
Figures 6a–6f show the means and standard deviations of conditional histograms of the error of the improved GCPM-TEC. The means for the DOY, LT, and SN approach zero and the standard deviations become smaller. The conditional histograms follow Gaussian distributions, and only minimal biases remain for the parameters.
 In Figure 6e of the MLON, the TEC errors become smaller regardless of the location of the data tracking station, although the MLON was not treated as a parameter of the error model. This result indicates that the TEC errors found in Figure 2e are a result of other parameters besides station location. Considering that the data obtained from each tracking station are distributed throughout the LT and DOY almost uniformly, the causal parameter should be the DST or SN. Regardless, the errors are not strongly related to the location of the station, that is, the original GCPM estimates this regional component well.
4. Discussion on Altitude Distribution of the TEC Errors
4.1. TEC Errors From the GRACE Satellite
In the previous section, the distribution of GCPM-TEC errors was revealed for several parameters. It is also important to examine how such TEC errors are distributed in altitude. For this purpose, we used GPS data obtained by the GRACE satellite, which was launched into a low Earth orbit (LEO) at an altitude of 500 km in 2002. In order to discuss the TEC error distribution on the equatorial plane, only the GRACE TEC data received while both the GRACE and GPS satellites were located within latitudes of 10 degrees and at the same local time are used. In this analysis, we used the GRACE data obtained from August to December 2002. For the ground TEC measurements, the data received while the GPS satellite was located within latitudes of 10 degrees are used.
Figure 7a depicts a scatterplot of the GRACE-TEC and the GCPM-TEC, which is derived by integrating the GCPM densities along the trajectory of the GPS signal received by GRACE. The plot reveals a tendency similar to that of the ground GPS-TEC observations, that is, the GCPM underestimates larger observed TECs. The developed TEC error model provides a TEC error under each condition in which the GRACE-TEC was obtained. We then examined three error-allocation models: (i) all the errors are allocated to altitudes below 500 km, (ii) an error proportional to the IRI density is allocated at each altitude, and (iii) all the errors are allocated to altitudes above 500 km.
In model (i), the GCPM-TECs above 500 km are unchanged from those shown in Figure 7a. Bias errors naturally remain for large observed TECs. The root mean square (RMS) of the errors is 15.1 TECU, so this model is not acceptable.
In model (ii), the GCPM-TEC between the GRACE and GPS satellites is calculated from a density profile obtained by multiplying the densities returned by the IRI model by a factor. Densities in the topside ionosphere and in the transition region to the plasmasphere also change according to the modified IRI density. The multiplication factor is determined in each case so that the TEC error given by the error model is reproduced. In this model, the allocation to altitudes above 500 km differs from case to case. Figure 7b depicts a plot of the GRACE-TEC and the GCPM-TEC in model (ii). Bias errors remain for large observed TECs, although they are smaller than in model (i). The RMS of the errors is 9.0 TECU. Allocating the error to the whole IRI density profile is therefore not effective in removing the bias errors.
In model (iii), all the errors are allocated to altitudes above 500 km, that is, the IRI density below 500 km is assumed to be reliable. Figure 7c depicts a plot of the GRACE-TEC and the GCPM-TEC in model (iii). The RMS of the errors is 7.7 TECU, and the bias errors for large observed TECs are removed.
Table 3 shows the RMS of the errors as a function of the percentage of the TEC error allocated to altitudes above 500 km. The RMS decreases almost monotonically as the percentage increases, and the best allocation is found at around 80–85%. This result signifies that most of the TEC errors should be allocated to altitudes above 500 km.
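The allocation scan summarized in Table 3 can be sketched as below. This is one possible reading of the procedure, not a reproduction of it: it assumes that a fixed fraction of the error predicted by the TEC error model is subtracted from the GCPM-TEC above 500 km and that the result is compared with the GRACE-TEC. The function and argument names are hypothetical.

```python
import math


def rms_for_allocation(grace_tec, gcpm_tec_above, predicted_error, fraction):
    """RMS misfit when `fraction` of the predicted error lies above 500 km.

    grace_tec: observed TEC between GRACE and the GPS satellite (TECU).
    gcpm_tec_above: GCPM-derived TEC along the same path (TECU).
    predicted_error: total TEC error from the error model (model minus
                     observation) for the same conditions (TECU).
    """
    residuals = [
        (g - fraction * e) - o
        for o, g, e in zip(grace_tec, gcpm_tec_above, predicted_error)
    ]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def best_allocation(grace_tec, gcpm_tec_above, predicted_error, fractions):
    """Fraction in `fractions` that gives the smallest RMS misfit."""
    return min(
        fractions,
        key=lambda p: rms_for_allocation(
            grace_tec, gcpm_tec_above, predicted_error, p
        ),
    )
```

With a fraction of 0 the sketch corresponds to allocation model (i) and with a fraction of 1 to model (iii); intermediate fractions correspond to the rows of Table 3.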
Table 3. RMS of the Errors of the GCPM-TEC for Percentage of Allocation of the TEC Errors to Altitudes Above 500 km
TECU error (RMS)
4.2. TEC Error Distribution Above 500 km
The GCPM at the equator above 500 km mainly consists of the topside ionosphere model and the plasmaspheric model. In order to check which model dominates the GCPM TEC error, we examined the MLT and Kp dependences of the error above 500 km by using the GRACE data. Table 4 shows the average GCPM TEC errors above 500 km for each MLT and Kp condition. The dashes in the table mean that no data are available. The day-night difference in the topside ionosphere, which can be derived from the errors at MLT = 15 and 3, is around 20 TECU and almost independent of the Kp index. The effect of the plasmaspheric size, which is part of the Kp-dependent component at each MLT, is at most 3–4 TECU and is much smaller than the day-night difference in the topside ionosphere. The effect of the plasmaspheric bulge can be estimated from the Kp dependence of the difference in the TEC error between MLT = 15 and 18. The average TEC error increases by 2.3 TECU under geomagnetically active conditions (Kp > 3) and by 7.9 TECU under quiet conditions (Kp = 0, 1). The enhancement at small Kp suggests that the GCPM may underestimate the effect of the plasmaspheric bulge. However, the effect of the plasmaspheric bulge is also smaller than the day-night difference in the topside ionosphere.
Table 4. Average GCPM TEC Errors Above 500 km in Each MLT and Kp Condition
4.3. Modeling of the Altitude Distribution
Finally, we describe a plausible altitude distribution of the TEC error based on the GRACE data. Most of the TEC errors are considered to be distributed in the topside ionosphere, where the influence of the IRI model extends in altitude. This result from the GRACE data is consistent with the local-time and seasonal dependences of the TEC errors that are derived from the GPS-TEC data obtained at the ground stations.
The plasmaspheric model in the GCPM is based on in situ density measurements from the DE1/RIMS. The densities measured by the DE1/RIMS are around 10^4/cm^3 at L = 2 and 10^3/cm^3 at L = 5. Considering that the TEC variations due to the variances of the measured densities amount to a few TECU in the range from L = 2 to 5, it is readily understood that most of the TEC errors originate from the topside ionosphere. It should be noted, however, that not all the TEC errors are explained by the model errors in the topside ionosphere. The plasmaspheric bulge is also a source of the TEC error, although its effect amounts to less than several TECU. The effect of the plasmaspheric size is smaller than that of the plasmaspheric bulge. In order to refine the plasmaspheric model, it is first necessary to remove the large bias errors in the topside ionosphere.
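The order of magnitude quoted above can be checked with a short calculation. The sketch assumes a log-linear density profile in L between the two quoted values (consistent with the statement that the logarithm of the plasmaspheric density is mainly linear in L) and integrates along a radial path in the equatorial plane; the function names and the Earth radius value are illustrative choices.

```python
R_E_M = 6.371e6     # mean Earth radius [m]
TECU = 1.0e16       # 1 TECU = 10^16 electrons/m^2


def equatorial_density_m3(l_value):
    """Assumed log-linear profile: 1e4 cm^-3 at L = 2, 1e3 cm^-3 at L = 5."""
    log10_n_cm3 = 4.0 - (l_value - 2.0) / 3.0
    return 10.0 ** log10_n_cm3 * 1.0e6      # convert cm^-3 to m^-3


def radial_tec_tecu(l_start=2.0, l_end=5.0, steps=3000):
    """TEC in TECU along an equatorial radial path (trapezoidal rule)."""
    dl = (l_end - l_start) / steps
    total = 0.0
    for k in range(steps):
        a = l_start + k * dl
        total += 0.5 * (equatorial_density_m3(a) + equatorial_density_m3(a + dl)) * dl
    return total * R_E_M / TECU
```

Under these assumptions the whole column between L = 2 and L = 5 holds about 7.5 TECU, so variations of the plasmaspheric density around the model can plausibly change the TEC by only a few TECU, much less than the errors of tens of TECU attributed to the topside ionosphere.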
Since the IRI can be used up to around 600 km, where the IRI densities become constant at unrealistically high values, the simple power law function shown in equation (1) is adopted to represent the topside ionospheric profile in the GCPM. The concentration of the GCPM-TEC errors in the topside ionosphere is considered to be caused by the inflexibility of the power law function, whose parameters d0 and d1 are uniquely determined by the IRI density and its gradient at the altitude of the maximum negative density gradient above the F2 peak.
In this study, we derived a TEC error model from the differences between observed and model-derived TECs, but the effect of the topside ionosphere might be represented more simply by modifying the power law function in the GCPM. The density given by the modified power law function should be higher than the current GCPM density when the IRI density is high and lower when the IRI density is low. Because most of the TEC errors originate from altitudes above 500 km, the abundant GPS TEC data might be used instead of the GRACE TEC data to modify the density model in the topside ionosphere. Construction of such a model is left for future work.
Because modifying the equatorial density causes latitudinal discontinuities, the off-equatorial IRI density profile also needs to be modified in the same way as on the equatorial plane. Such empirical modeling of ionospheric density parameters is still an open subject of research [e.g., Hoque and Jakowski, 2011]. Whistler dispersions would be helpful for examining the field-aligned distribution of the off-equatorial densities. For example, the Akebono satellite has intermittently observed the dynamic spectra of lightning whistlers since 1989. By analyzing such a massive dataset statistically, the accuracy of the GCPM global density under various conditions could be examined.
 We examined the accuracy of the GCPM equatorial density by using long-term TEC data obtained from several GPS tracking stations along the equator. According to the statistical analysis of the GPS TEC, we found a remarkable feature in bias errors in the GCPM-derived TEC, that is, the GCPM TEC is too high for values of observed TEC below 50 TECU and too low for values of TEC above 80 TECU. These TEC errors have similar seasonal and local-time dependencies to ionospheric densities. The bias model of the TEC errors was constructed by a discrete spline method in which dependencies of local-time, season, solar activity, and geomagnetic activity were taken into account.
From an analysis of TEC data obtained by the GRACE satellite in a low Earth orbit, most of the GCPM TEC errors are found to be distributed in the topside ionosphere, where the influence of the IRI model extends in altitude. The result is consistent with the local-time and seasonal dependences of the TEC errors. The plasmaspheric bulge is also a source of the TEC error, but its effect amounts to less than several TECU.
Since the IRI profile in the topside ionosphere is not very reliable because of the scarcity of observation data, the GCPM represents the altitude profile by a simple power law function. The large errors are considered to originate from this simple representation of the density profile in the topside ionosphere.
Appendix A: Chi-Square Test of Independence
The chi-square test of independence used in the present study is explained here by using ε and the DST. When ε and the DST are independent, the data number Oij in the two-dimensional histogram for ε and the DST is expected to correspond to the product of the data numbers Xi and Yj in the one-dimensional histograms for ε and the DST, respectively. Here i and j indicate discrete cell numbers in the histograms. The degree of dependence between ε and the DST is then represented by the following equation:
where N is the total data number as follows:
Because all the free parameters in the model under development, except for Kp, are continuous quantities, we applied equation (A1) to conditional histograms of the parameters, which are estimates of their probability distributions.
Generally, it is necessary to establish a significance level for the test of independence when judging whether or not there are meaningful differences between two parameters. In the present modeling, however, we do not judge whether the TEC errors are independent of the parameters; instead, we evaluate the relative degrees of dependence from the magnitudes of the χ2 values.
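The dependence measure described in this appendix can be sketched with the standard Pearson chi-square statistic for a two-dimensional histogram. It is assumed here that equation (A1) has this standard form, with the expected count in each cell given by Xi Yj / N; the function name is hypothetical.

```python
def chi_square_dependence(observed):
    """Pearson chi-square statistic of a 2-D histogram O[i][j].

    Row totals play the role of X_i, column totals the role of Y_j,
    and N is the total data number. Under independence the expected
    count in cell (i, j) is X_i * Y_j / N, so the statistic is zero;
    larger values indicate a stronger dependence.
    """
    row_totals = [sum(row) for row in observed]          # X_i
    col_totals = [sum(col) for col in zip(*observed)]    # Y_j
    n_total = sum(row_totals)                            # N
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n_total
            if expected > 0.0:                           # skip empty rows/columns
                chi2 += (o_ij - expected) ** 2 / expected
    return chi2
```

Since the statistic grows with the number of data and of cells, comparing values between parameters, as done in the text, is most meaningful when the histograms are built from the same data set with comparable binning.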
The authors are grateful to the International GNSS Service and the GRACE science team for providing the GPS raw data.