Statistical uncertainty of 1967–2005 thermospheric density trends derived from orbital drag

Abstract

[1] Estimation of trend uncertainties is an essential element of long-term climate change study. Standard error analysis assumes independent (i.e., uncorrelated) random deviations of the data around the trend, a requirement that is rarely fulfilled by upper atmospheric time series; consequently, uncertainty estimates are typically unrealistically small. To obtain internally consistent estimates of linear trends and trend uncertainties in thermospheric density data, we account for correlated noise by incorporating autoregressive (AR) models of varying order into our error analysis. We apply our method to daily, monthly, and yearly averages of thermospheric mass density data derived from orbital drag, after removing solar and seasonal effects. The resulting trend uncertainty estimates are mutually congruent among the three temporal cadences; in contrast, assuming independent random error produces uncertainty estimates that differ considerably among the daily, monthly, and yearly cases. At 400 km, we estimate the 1967–2005 density trend to be −1.94% per decade, with a 95% confidence interval of [−3.30, −0.59] % per decade. The AR model residuals are consistent with the assumption of independent, normally distributed random errors with uniform variance. Our methodology permits realistic analytical estimates of trend uncertainties for AR processes of arbitrary order, superseding the use of Monte Carlo simulations, and the approach is applicable to trend analysis of other upper atmospheric parameters.

1. Introduction

[2] Earth's thermosphere has been predicted [Roble and Dickinson, 1989; Akmaev et al., 2006] to cool and contract in response to increasing concentrations of atmospheric CO2, which is the dominant ultimate cooling agent of the thermosphere (downward molecular conduction connects the upper thermosphere with the lower thermosphere; CO2 is the primary loss mechanism below about 140 km altitude) [Roble, 1995; Mlynczak et al., 2010]. Consequently, thermospheric density at fixed geometric altitudes should be gradually decreasing, and there is evidence, derived from the orbits of objects in low-Earth orbit, that this is occurring at a rate of 2 to 5% per decade (dependent on the phase of the solar cycle) near 400 km altitude [Keating et al., 2000; Emmert et al., 2004, 2008; Marcos et al., 2005]. Long-term changes in O3, H2O, and CH4 also affect thermospheric densities [Akmaev et al., 2006; Roble and Dickinson, 1989], and it appears that the entire upper atmosphere (including the mesosphere, thermosphere, and ionosphere) is responding to this radiative forcing [Laštovička et al., 2008].

[3] Various methods have been used to estimate long-term upper atmospheric trends and their statistical uncertainties. The objective of most studies is to detect and quantify long-term changes that are not driven by solar forcing, which is the dominant source of thermospheric variations. We here define “trends” as long-term changes that may or may not be monotonic, and use the term “linear trends” to refer to the linear secular component of such changes. A single linear trend is usually assumed, but some studies [e.g., Merzlyakov et al., 2009; Liu et al., 2010] also consider piecewise linear trends. The trend analysis is usually performed on monthly [e.g., Bremer, 1992] or yearly [e.g., Laštovička, 2001] average anomalies, but daily or higher-resolution data are sometimes used [e.g., She et al., 2009], particularly when the trend term is computed simultaneously with other climatological parameters such as seasonal terms. The selection of temporal resolution typically has little effect on the value of the trend estimate, but if not done properly it can greatly affect the uncertainty estimate, as we demonstrate in the following sections.

[4] Most studies have applied standard ordinary least squares (OLS) techniques to estimate the trend and its uncertainty, but these standard estimates assume that each data point is an independent measurement. If there is autocorrelation in the residual time series, then there are fewer degrees of freedom, and the estimated statistical uncertainty of the trend value will be biased low. Tiao et al. [1990] and Weatherhead et al. [1998] discuss the analytical effect of a first-order autoregressive (AR(1)) process on trend uncertainty, and some studies [e.g., Marcos et al., 2005; Liu et al., 2010] have applied Monte Carlo techniques to simulate (usually low-order) autocorrelation effects. However, the residual autocorrelation of upper atmospheric data can be quite complex [e.g., Emmert and Picone, 2010], and low-order AR models may not adequately account for the correlated data. We also note that resampling techniques such as bootstrapping and jackknifing [Efron, 1982] assume independence of the data points, and therefore do not account for autocorrelation.

[5] In this paper, we build on the AR(1) approaches of Tiao et al. [1990] and Weatherhead et al. [1998] by generalizing the trend uncertainty estimation to account for AR processes of arbitrary order. We employ direct parametric statistical inference techniques, which supersede Monte Carlo simulation. We apply the method to global average thermospheric density data, including daily, monthly, and yearly averages, to obtain realistic and internally consistent estimates of thermospheric density trends and their statistical uncertainty. We also examine the distribution of the residuals around the AR models and test whether they are consistent with the assumption of normally and identically distributed random errors, which facilitates construction of confidence intervals.

2. Data and Methodology

[6] We use the 1967–2007 data set of daily globally averaged mass density described by Emmert [2009], which we have extended through January 2010 without incorporating new objects. These data are derived from the orbits of ∼5000 objects with perigee heights between 200 and 600 km. The data have a temporal resolution of 3–6 days, typical short-term precision of 2%, and systematic bias less than 5–10%. The bias is due to imperfect knowledge of the drag coefficient of a spherical reference object (to which the ballistic coefficients of all the objects are calibrated), and does not affect trend estimation. The 2% random error in the global mean density is derived from the variance among results from different objects at a given time and altitude; the interobject variance arises from variations in the true ballistic coefficient of each object (e.g., due to tumbling) and from random errors in the orbital elements. This component of data error does not contribute significantly to the uncertainty of long-term trend estimates.

[7] The objective of our study is to detect and quantify long-term changes in the thermosphere that are not driven by solar forcing. In particular, steady increases in CO2 are expected to produce a gradual cooling and contraction of the thermosphere [Roble and Dickinson, 1989] because CO2 is the thermosphere's dominant cooling agent [Roble, 1995; Mlynczak et al., 2010]. Other greenhouse gases and indirect mechanisms, for example involving the chemistry and dynamics of the underlying mesosphere and lower thermosphere, may also induce long-term changes in thermospheric density [Akmaev et al., 2006; Laštovička et al., 2008; Beig et al., 2003]. However, enhanced CO2 cooling is perhaps the best understood and most straightforward driver of thermospheric climate change. The increase in atmospheric CO2 has been monotonic and approximately linear over the past 40 years [Keeling and Whorf, 2002], so a similar response of the thermosphere is a reasonable first assumption. The simulations of Qian et al. [2006] suggest that the density response is approximately linear over the ∼15% CO2 increase that occurred between 1970 and 2000. Eventually, departures from linearity of CO2 trends and the thermosphere's response will likely accrue to the point where a linear trend model is no longer a reasonable approximation, but because of the relatively large unexplained variability of currently available data, such departures are not yet detectable. We therefore compute estimates of linear trends in our data.

[8] Prior to computing trends, we obtain a time series of density anomalies by filtering the data (equation (1)) with the Global Average Mass Density Model (GAMDM) described by Emmert and Picone [2010], which empirically represents the solar flux, geomagnetic activity, and seasonal dependence of the data. GAMDM is a linear, 25-parameter model fit to the 1986–2007 data using OLS, with parameter covariance determined via a Monte Carlo approach.

[9] We analyze the daily density anomalies and also monthly and yearly averages of the anomalies in an effort to obtain an internally consistent estimate of the trend uncertainty. Although point estimates of linear trends are generally insensitive to the temporal window over which the residuals are averaged, the choice of temporal window can strongly affect standard estimates of the trend uncertainties if autocorrelation is present in the residuals, or if running averages (which produce autocorrelation) are used. It can be demonstrated (see section A1) that the ratio of standard trend uncertainty estimates derived from data at two different cadences (i.e., averaging over different, nonoverlapping temporal windows) has an expectation value of one and is approximately distributed according to the F distribution, provided the random errors are independent and identically and normally distributed (IIND). Conversely, if standard trend uncertainty estimates from different cadences are significantly different, then it is likely that the assumptions (particularly the assumption of independence) about the random errors are invalid. As described below, we examine the residuals around the trend estimates for evidence that the errors are not IIND, and we account for autocorrelation in the residuals, with the objective of obtaining trend uncertainty estimates that are independent of the temporal cadence used.
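The cadence argument above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the series length, block length, trend value, AR(1) coefficient, and all function names are hypothetical choices made for illustration. It computes the standard (independent-error) OLS slope uncertainty from a raw series and from nonoverlapping block averages of the same series, first for white noise and then for AR(1) noise.

```python
import math
import random


def ols_slope_se(t, x):
    """Return the OLS slope and its standard error, assuming independent errors."""
    n = len(t)
    t_mean = sum(t) / n
    x_mean = sum(x) / n
    stt = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x)) / stt
    intercept = x_mean - slope * t_mean
    rss = sum((xi - intercept - slope * ti) ** 2 for ti, xi in zip(t, x))
    return slope, math.sqrt(rss / (n - 2) / stt)


def block_average(values, length):
    """Average consecutive, nonoverlapping blocks of `length` points."""
    return [sum(values[i:i + length]) / length
            for i in range(0, len(values) - length + 1, length)]


def ar1_noise(n, phi, rng):
    """Simulate AR(1) noise with unit-variance innovations (phi = 0 is white noise)."""
    eps, out = 0.0, []
    for _ in range(n):
        eps = phi * eps + rng.gauss(0.0, 1.0)
        out.append(eps)
    return out


def se_ratio(phi, n=3600, length=30, seed=1):
    """Ratio of block-averaged to raw OLS slope standard errors."""
    rng = random.Random(seed)
    t = [float(i) for i in range(n)]
    # Hypothetical small linear trend plus noise.
    x = [-1e-4 * ti + e for ti, e in zip(t, ar1_noise(n, phi, rng))]
    _, se_raw = ols_slope_se(t, x)
    _, se_avg = ols_slope_se(block_average(t, length), block_average(x, length))
    return se_avg / se_raw


ratio_white = se_ratio(phi=0.0)  # independent errors: ratio expected near 1
ratio_ar1 = se_ratio(phi=0.9)    # autocorrelated errors: ratio well above 1
print(ratio_white, ratio_ar1)
```

With independent errors the two standard errors should agree to within sampling scatter, whereas with strongly autocorrelated errors the averaged series yields a much larger standard error, which is the kind of cadence dependence the text treats as a symptom of invalid independence assumptions.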

[10] We define the daily anomalies relative to GAMDM as

\[ x_i^{D} = \ln\left( \rho_i / \rho_i^{G} \right) \qquad (1) \]

where i = 1, 2, …, ND; ND is the length of the time series; ρi is the daily mass density observation at time ti; ρiG is the corresponding GAMDM value; and the superscript D on the anomaly xi indicates that it is a daily value. The uncertainty in ρiG, as computed from the parameter covariance, is typically less than 0.1%; this source of error has a negligible effect on the trend uncertainties and is not included in our analysis (however, in section 4 we examine the effect of using time intervals other than 1986–2007 in the estimation of the GAMDM parameters).

[11] We also construct time series xjM and xkY at monthly and yearly cadences, respectively, by computing monthly and annual averages of xiD. For brevity, we drop the superscripts except when it is necessary to distinguish among the daily, monthly, and yearly cadences.

[12] Figure 1 shows the three time series. A gradual overall decrease is evident in the yearly averages up until 2005, after which the data drop off sharply during the descent into the 2008 solar minimum. Recent investigations have cited evidence supporting a terrestrial [Emmert et al., 2010] or a solar [Solomon et al., 2010] origin of this anomalous behavior, but the phenomenon remains poorly understood. For the purposes of this paper, it is clear that we cannot reasonably represent the entire time series by a linear trend, and we therefore restrict our analysis to the 1967–2005 time interval.

Figure 1.

Time series of global average density anomalies at 400 km, after filtering with the GAMDM empirical model. Daily (green), monthly (brown), and yearly (blue circles) averages are shown. The straight line shows the 1967–2005 (inclusive) trends computed using ordinary least squares; the trend values and their initial uncertainty estimates are given in the lower-left corner.

[13] We model the linear trends in the anomalies as

\[ x_i = \beta_0 + \beta_1 t_i + \varepsilon_i \qquad (2) \]

where β1 is the linear trend parameter, β0 is a constant parameter, and the errors ɛi are produced by a stationary stochastic process. We assume that ɛi are normally distributed with covariance

\[ \mathrm{cov}(\boldsymbol{\varepsilon}) = \sigma_\varepsilon^2 \mathbf{V} \qquad (3) \]

where ɛ is the vector of the ɛi values and V is the Pearson correlation matrix of ɛ. To allow for correlation among successive values (V ≠ I, where I is the identity matrix), we model ɛ as an autoregressive (AR) process of order p:

\[ \varepsilon_i = \sum_{k=1}^{p} \alpha_k \varepsilon_{i-k} + \eta_i \qquad (4) \]

where αk are the AR parameters and ηi are IIND:

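\[ \eta_i \sim N\left(0, \sigma_\eta^2\right), \qquad \mathrm{cov}(\boldsymbol{\eta}) = \sigma_\eta^2 \mathbf{I} \qquad (5) \]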

With this statistical model, which requires that the data series is uniformly spaced in time, the maximum likelihood estimators and uncertainties of the trend parameters are [Graybill, 1976]:

equation image

where the prime symbol denotes the matrix transpose. To estimate the AR parameters and thence V, we first compute trend estimates β̂(0) assuming V = I (equivalent to AR order p = 0, and to OLS), along with corresponding residuals y:

\[ y_i^{(0)} = x_i - \hat{\beta}_0^{(0)} - \hat{\beta}_1^{(0)} t_i \qquad (7) \]

[14] The numerical superscripts indicate the order of the assumed AR process. We then compute the sample autocorrelation of the residuals,

\[ r_k = \frac{\sum_{i=1}^{N-k} y_i^{(0)} y_{i+k}^{(0)}}{\sum_{i=1}^{N} \left( y_i^{(0)} \right)^2} \qquad (8) \]

noting that Σi yi(0) = 0 by virtue of the OLS solution of the trend model, which includes a constant term β0. From the rk values, we compute AR parameter estimates α̂k and σ̂η² by solving the Yule-Walker equations [Yule, 1927; Walker, 1931; Chatfield, 1996]:

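\[ r_j = \sum_{k=1}^{p} \hat{\alpha}_k r_{|j-k|}, \quad j = 1, \ldots, p; \qquad \hat{\sigma}_\eta^2 = \hat{\sigma}_\varepsilon^2 \left( 1 - \sum_{k=1}^{p} \hat{\alpha}_k r_k \right) \qquad (9) \]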
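The two steps just described (sample autocorrelation of zero-mean residuals, then solution of the Yule-Walker equations) can be sketched as follows. This is an illustrative implementation under stated assumptions: it uses the standard Levinson-Durbin recursion, which the source does not say the authors used, and the function names are hypothetical. The returned `noise_ratio` is the ratio of the innovation variance to the variance of the AR process.

```python
def sample_autocorrelation(y, max_lag):
    """r_k = sum_i y_i * y_(i+k) / sum_i y_i**2, for zero-mean residuals y."""
    denom = sum(v * v for v in y)
    return [sum(y[i] * y[i + k] for i in range(len(y) - k)) / denom
            for k in range(max_lag + 1)]


def yule_walker(r, p):
    """Solve the Yule-Walker equations of order p by Levinson-Durbin recursion.

    r is a list of autocorrelations with r[0] == 1.
    Returns (alpha, noise_ratio): alpha[k - 1] is the AR coefficient for lag k,
    and noise_ratio is var(eta) / var(epsilon).
    """
    alpha = []
    noise_ratio = 1.0
    for m in range(1, p + 1):
        # Reflection coefficient for order m.
        k = (r[m] - sum(a * r[m - j] for j, a in enumerate(alpha, start=1))) / noise_ratio
        # Update the lower-order coefficients, then append the new one.
        alpha = [a - k * alpha[m - 1 - j] for j, a in enumerate(alpha, start=1)] + [k]
        noise_ratio *= 1.0 - k * k
    return alpha, noise_ratio


# Autocorrelations of an exact AR(1) process with coefficient 0.5:
# fitting an AR(2) model should recover alpha = [0.5, 0.0].
alpha_hat, ratio_hat = yule_walker([1.0, 0.5, 0.25], 2)
print(alpha_hat, ratio_hat)
```

The recursion has the practical advantage of producing the solutions for all orders up to p in one pass, which is convenient when the uncertainty is to be examined as a function of AR order.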

[15] We then construct our estimate of V from the rk and α̂k values as follows (see equations (A14) and (A15) for further details):

equation image

In practice, it is the inverse of this large (N × N) matrix that is needed, which in general may not be straightforwardly computable. However, with the model defined by equation (10), V̂−1 is a band-diagonal symmetric matrix with (p + 1)(p + 2)/2 unique values. For example, an AR(3) process produces an inverse correlation matrix with 10 unique values:

equation image

where the diagonal ellipses denote repetition of values. Section A2 outlines a computational technique for computing these values directly from the first p sample autocorrelation values (the α̂ estimates are not necessary for this purpose, but will be used later in our analysis).
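For the simplest case, AR(1), the band structure can be checked directly, because the inverse of the correlation matrix with elements φ^|i−j| has a well-known tridiagonal closed form with (1 + 1)(1 + 2)/2 = 3 unique nonzero values. The sketch below is only an illustration of that special case (it is not the section A2 technique); the function names and the values n = 6 and φ = 0.6 are hypothetical.

```python
def ar1_correlation(n, phi):
    """Correlation matrix of a stationary AR(1) process: V[i][j] = phi**|i - j|."""
    return [[phi ** abs(i - j) for j in range(n)] for i in range(n)]


def ar1_inverse_correlation(n, phi):
    """Closed-form tridiagonal inverse of the AR(1) correlation matrix (n >= 2)."""
    scale = 1.0 / (1.0 - phi * phi)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # End points get weight 1, interior points 1 + phi**2.
        inv[i][i] = scale * (1.0 if i in (0, n - 1) else 1.0 + phi * phi)
        if i + 1 < n:
            inv[i][i + 1] = inv[i + 1][i] = -phi * scale
    return inv


def matmul(a, b):
    """Plain-Python product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


n, phi = 6, 0.6
v_inv = ar1_inverse_correlation(n, phi)
product = matmul(ar1_correlation(n, phi), v_inv)
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(n) for j in range(n))
unique_values = {round(v, 12) for row in v_inv for v in row if v != 0.0}
print(max_error, sorted(unique_values))
```

Because only the band needs to be stored, quadratic forms involving the inverse can be evaluated in a number of operations proportional to N rather than N squared.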

[16] With our estimate of V−1 for a given AR order, we compute revised estimates of the trend parameters and uncertainties:

equation image

[17] We found that the point estimates β̂(p) and (σ̂ɛ(p))² depend insignificantly on p. This is because the structure of (V̂(p))−1 causes the interior points (p < i ≤ N − p) of the time series to receive equal weight in the calculation. Data near the ends (i ≤ p or i > N − p) of the time series tend to receive greater weight, but the effect is negligible for our data. We therefore use only the β̂(0) and (σ̂ɛ(0))² values in subsequent calculations. In contrast, the covariance of the trend model parameters, cov β̂(p), can depend strongly on p. Since we are primarily interested in obtaining realistic uncertainties of the linear trend parameter β1, we define the following p-dependent estimate of its variance:

equation image

Finally, in order to test our assumptions about the purely random component, ηi, of the AR models, we compute and analyze the AR(p) model residuals:

\[ z_i^{(p)} = y_i^{(0)} - \sum_{k=1}^{p} \hat{\alpha}_k y_{i-k}^{(0)} \qquad (14) \]

where i ≥ p + 1.
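To make the overall procedure concrete, the following sketch works through the AR(1) case on simulated data. It is a simplified illustration under several assumptions that are not taken from the source: the parameter covariance is taken to have the standard generalized least squares form σɛ²(X′V⁻¹X)⁻¹, the OLS residual variance (with N − 2 degrees of freedom) is used for σɛ², the closed-form tridiagonal inverse of the AR(1) correlation matrix replaces the general band-diagonal inverse, and all function names and simulation settings are hypothetical.

```python
import math
import random


def ols_fit(t, x):
    """OLS line fit; returns slope, residuals, residual variance, and sum of squared centered times."""
    n = len(t)
    t_mean, x_mean = sum(t) / n, sum(x) / n
    stt = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x)) / stt
    intercept = x_mean - slope * t_mean
    resid = [xi - intercept - slope * ti for ti, xi in zip(t, x)]
    var_eps = sum(v * v for v in resid) / (n - 2)
    return slope, resid, var_eps, stt


def ar1_quadratic_form(a, b, phi):
    """a' V^-1 b for the AR(1) correlation matrix V, via its tridiagonal inverse."""
    n = len(a)
    diag = sum((1.0 if i in (0, n - 1) else 1.0 + phi * phi) * a[i] * b[i]
               for i in range(n))
    off = sum(a[i] * b[i + 1] + a[i + 1] * b[i] for i in range(n - 1))
    return (diag - phi * off) / (1.0 - phi * phi)


def slope_uncertainties(t, x):
    """Return (slope, OLS standard error, AR(1)-adjusted standard error, phi_hat)."""
    n = len(t)
    slope, resid, var_eps, stt = ols_fit(t, x)
    # Lag-1 sample autocorrelation of the OLS residuals (AR(1) Yule-Walker estimate).
    phi_hat = sum(resid[i] * resid[i + 1] for i in range(n - 1)) / sum(v * v for v in resid)
    t_mean = sum(t) / n
    tc = [ti - t_mean for ti in t]  # centering improves numerical conditioning
    ones = [1.0] * n
    m00 = ar1_quadratic_form(ones, ones, phi_hat)
    m01 = ar1_quadratic_form(ones, tc, phi_hat)
    m11 = ar1_quadratic_form(tc, tc, phi_hat)
    # Slope element of the inverse of the 2 x 2 matrix X' V^-1 X.
    var_slope = var_eps * m00 / (m00 * m11 - m01 * m01)
    return slope, math.sqrt(var_eps / stt), math.sqrt(var_slope), phi_hat


rng = random.Random(3)
eps, noise = 0.0, []
for _ in range(2000):
    eps = 0.6 * eps + rng.gauss(0.0, 1.0)
    noise.append(eps)
times = [float(i) for i in range(2000)]
series = [-1e-4 * ti + e for ti, e in zip(times, noise)]
slope, se_ols, se_ar1, phi_hat = slope_uncertainties(times, series)
print(slope, se_ols, se_ar1, phi_hat)
```

For a long series, the inflation factor se_ar1/se_ols in this sketch approaches sqrt((1 + φ)/(1 − φ)), which is 2 for φ = 0.6; higher AR orders would require the general band-diagonal inverse rather than the tridiagonal shortcut used here.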

3. Results

[18] Figure 1 shows the linear trend estimates β̂1(0)D, β̂1(0)M, and β̂1(0)Y computed assuming V = I, from the daily, monthly, and yearly 1967–2005 anomalies around GAMDM at 400 km altitude. The trend parameter values are close to −1.9% per decade in all three cases (here and elsewhere in the paper we have expressed the trends and their uncertainty as a percentage decadal change, using 100 · (exp β1 − 1)). As expected, the point estimates do not depend significantly on the cadence of the anomalies. However, the estimated uncertainties depend strongly on the cadence, with larger uncertainties for the longer time scales: σ̂β1(0)M/σ̂β1(0)D = 3.1 and σ̂β1(0)Y/σ̂β1(0)M = 1.4. This dependence indicates that the assumption of V = I is not valid. Figure 2 shows the sample autocorrelation of the OLS residuals y(0)D, y(0)M, and y(0)Y. There appears to be significant or coherent structure out to 1–2 years in all three autocorrelation curves, indicating that autocorrelation indeed needs to be incorporated into the trend uncertainty estimates.
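The percentage conversion quoted above, 100 · (exp β1 − 1), and its inverse can be written as a pair of small helper functions. This is a minimal sketch; the function names are hypothetical, and β1 is assumed here to be expressed as a change per decade.

```python
import math


def log_trend_to_percent(beta1):
    """Convert a log-scale trend (assumed per decade) to percent change per decade."""
    return 100.0 * (math.exp(beta1) - 1.0)


def percent_to_log_trend(percent):
    """Inverse conversion: percent change per decade to a log-scale trend."""
    return math.log(1.0 + percent / 100.0)


beta1 = percent_to_log_trend(-1.94)
print(beta1, log_trend_to_percent(beta1))
```

For trends of a few percent per decade the log-scale and percentage values are nearly equal, so the conversion mainly matters for large changes.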

Figure 2.

Sample autocorrelation of OLS residuals around the trends shown in Figure 1. (top) The autocorrelation of the daily (green) and monthly (brown) residuals out to 180 days. (bottom) The autocorrelation of the yearly (blue) residuals out to 9 years. The dotted horizontal lines indicate the 95% interval within which the sample autocorrelation of a normally distributed purely random process would fall.

[19] Figure 3 shows the linear trend uncertainties, estimated with the procedure outlined in section 2, as a function of the autoregressive order used in the statistical model. For all three cadences, the uncertainties increase substantially from the AR(0) model to the AR(1) model. The estimates from the daily series then gradually increase to a peak of about 0.6% per decade at AR(27). The estimates from the monthly time series increase to about 0.7% per decade at around AR(24). The estimate from the yearly time series is about 0.6% per decade for AR(1) and reaches a maximum value of 0.7% per decade for AR(2). We conclude from Figure 3 that a realistic and self-consistent estimate of the trend uncertainty is 0.6 to 0.7% per decade.

Figure 3.

Dependence of trend uncertainty estimates on the autoregressive order used in the statistical model. The plots show uncertainties computed from (top) daily, (middle) monthly, and (bottom) yearly anomalies.

[20] Figure 4a shows the sample autocorrelation of the daily residuals, z(1)D and z(27)D, around the AR(1) and AR(27) models, respectively. The z(1)D residuals show a significant number of autocorrelation values outside of the 95% interval expected from a purely random, normally distributed process; p = 1 for daily values is therefore not consistent with IIND random errors η defined by equations (4) and (5). The autocorrelation of the z(27)D residuals, on the other hand, does appear to be consistent with a random process. Figure 4b shows the autocorrelation of the monthly AR residuals z(1)M and z(24)M. The z(1)M autocorrelation is within the prescribed limits, although structure is apparent in the autocorrelation. The z(24)M residuals, however, appear to be consistent with a random process, as do the autocorrelations (Figure 4c) of the yearly AR residuals z(1)Y and z(2)Y. It is thus unclear from this diagnostic how long a temporal lag (1 or 2 years) should be incorporated into an AR model. The corresponding trend uncertainty ranges from 0.6 to 0.7% per decade; we adopt the larger estimate and focus on the monthly AR(24) and the yearly AR(2) models. For the daily residuals, we focus on the AR(27) model, which gives a trend uncertainty estimate of about 0.6% per decade. From the monthly and yearly analyses, we infer that daily AR(365) and AR(730) models will give uncertainty estimates similar to the monthly AR(12) and AR(24) models, and to the yearly AR(1) and AR(2) models. The exact choice of AR order is not critical for the remaining residual analyses described in this section.

Figure 4.

(a) Sample autocorrelation of the daily residuals z(1)D and z(27)D around AR(1) and AR(27) models, respectively. The red dotted horizontal lines indicate the 95% interval within which the sample autocorrelation of a normally distributed purely random process would fall. (b) Sample autocorrelation of monthly AR residuals z(1)M and z(24)M. (c) Sample autocorrelation of yearly AR residuals z(1)Y and z(2)Y.
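The whiteness diagnostic used in Figure 4 can be sketched as follows: compute the AR(p) residuals, then count how many sample autocorrelation values fall outside the interval expected for a purely random process. This is an illustrative sketch on simulated data, assuming the common large-sample bound ±1.96/√N for the 95% interval (the source does not state which formula was used); the function names, the AR coefficient, and the number of lags are hypothetical.

```python
import math
import random


def ar_residuals(y, alpha):
    """z_i = y_i - sum_k alpha_k * y_(i-k), for all i with a full set of lagged values."""
    p = len(alpha)
    return [y[i] - sum(a * y[i - k] for k, a in enumerate(alpha, start=1))
            for i in range(p, len(y))]


def fraction_outside_white_noise_bounds(z, max_lag):
    """Fraction of lags 1..max_lag whose sample autocorrelation exceeds 1.96/sqrt(N)."""
    n = len(z)
    denom = sum(v * v for v in z)
    bound = 1.96 / math.sqrt(n)
    outside = 0
    for k in range(1, max_lag + 1):
        r_k = sum(z[i] * z[i + k] for i in range(n - k)) / denom
        outside += abs(r_k) > bound
    return outside / max_lag


# Simulated AR(1) series with a known coefficient.
rng = random.Random(7)
phi, prev, y = 0.8, 0.0, []
for _ in range(5000):
    prev = phi * prev + rng.gauss(0.0, 1.0)
    y.append(prev)

frac_raw = fraction_outside_white_noise_bounds(y, 40)
frac_ar1 = fraction_outside_white_noise_bounds(ar_residuals(y, [phi]), 40)
print(frac_raw, frac_ar1)
```

For an adequate AR model, roughly 5% of the lags are expected to fall outside the bounds by chance, so a markedly larger fraction indicates that autocorrelation remains in the residuals.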

[21] To test the assumption of uniform variance of the random errors ηi (equivalent to all the diagonal values of V being equal to one), Figure 5 shows the root variance of the AR residuals z(0)D, z(27)D, z(0)M, and z(24)M in annual bins as a function of time. Also shown are the corresponding estimates of the variance of ηi:

equation image

and their 95% confidence intervals, assuming a normal distribution for ηi. Note that the variance estimate is also a product of the solution to the Yule-Walker equations (9), which give values that are very similar to those computed using equation (15).

Figure 5.

Examination of the uniformity of residual variance around selected AR models. (a) Yearly root variance, as a function of time, of daily AR(0) residuals z(0)D. (b) Same, using daily AR(27) residuals z(27)D. (c) Same, using monthly AR(0) residuals z(0)M. (d) Same, using monthly AR(24) residuals z(24)M. The solid horizontal lines indicate the overall root variance of the residuals, and the dashed lines enclose the 95% confidence interval, assuming a uniform normal distribution.

[22] The variance of the daily z(0)D AR residuals exhibits interannual variability that often exceeds the expected 95% confidence interval. Emmert and Picone [2010, section 4.1] analyzed the same residuals and found no obvious dependence of this variance on the F10.7 solar EUV irradiance proxy, but the temporal dependence shown in Figure 5a suggests some anticorrelation with the solar cycle. This anticorrelation is very clear in the variances of the z(27)D residuals (Figure 5b); the root variance is smallest at solar maximum (about 6%) and largest at solar minimum (about 14%). The amplitude of this cyclic behavior far exceeds the 95% confidence interval around the overall root variance. Therefore, the daily OLS residuals (y(0)D) cannot be viewed as arising from a stationary AR process of order less than or equal to 27.

[23] In contrast, the annual variances of the monthly z(0)M and z(24)M AR residuals are well bounded by the 95% confidence intervals, and there is no obvious correlation with the solar cycle. The assumption of uniform variance of ηi is therefore consistent with the behavior of the data, in this case.

[24] Finally, we test the assumption of normally distributed random errors ηi. Figure 6 shows the distribution of normalized daily, monthly, and yearly AR residuals. The daily z(0)D residuals appear to be approximately normally distributed, a property that Emmert and Picone [2010] also reported for the residuals x around GAMDM. However, the quantile-normal plot indicates that the tails of the distribution are slightly heavier than those of a normal distribution, and the residuals consequently fail the Anderson-Darling [Anderson and Darling, 1952; Shorack and Wellner, 1986] test of normality. The heavy tails are even more pronounced in the z(27)D distribution, and are likely a result of the inconstant variance of the residuals illustrated in Figure 5. Despite the failure of the normality hypothesis for the random error of the daily data, the residual distribution does closely follow a normal distribution within the ±2σ interval, and it is therefore possible that confidence intervals can still be reasonably estimated under the assumption of a normal distribution. The monthly and yearly AR residuals shown in Figure 6 are consistent with the assumption of normally distributed random errors, based on the quantile-normal plots and the Anderson-Darling tests.

Figure 6.

Distribution of normalized AR residuals z(p)/σ̂η(p). (top) Daily AR(0) and AR(27) residuals. (middle) Monthly AR(0) and AR(24) residuals. (bottom) Yearly AR(0) and AR(2) residuals. (left) A histogram of the normalized residuals (blue) and the unit normal distribution (pink). (right) Quantile-normal plots of the ordered residuals versus their expected location in a normal distribution; normal deviates should fall along the pink line. Also shown is the Anderson-Darling statistic for the case of unknown mean and standard deviation: A*² = A²(1 + 4/n − 25/n²), where n is the sample size. A value of A*² exceeding 0.751 entails rejection of the normality hypothesis at the 5% level [Shorack and Wellner, 1986].
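The Anderson-Darling statistic and the modified form quoted in the caption, A*² = A²(1 + 4/n − 25/n²), can be computed with the Python standard library. The sketch below is illustrative: the function name is hypothetical, the mean and standard deviation are estimated from the sample, and the two deterministic test samples (ideal normal quantiles and an evenly spaced uniform grid) are chosen only to show one case that passes and one that fails the 0.751 threshold.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Return (A2, A2_star) for a normality test with estimated mean and standard deviation."""
    n = len(sample)
    mu, sd = mean(sample), stdev(sample)
    std_normal = NormalDist()
    # Normal CDF values of the standardized, ordered sample.
    u = sorted(std_normal.cdf((v - mu) / sd) for v in sample)
    total = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    return a2, a2 * (1.0 + 4.0 / n - 25.0 / n ** 2)


n = 500
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
uniform_like = [(i + 0.5) / n for i in range(n)]

_, a2_star_normal = anderson_darling_normal(normal_like)
_, a2_star_uniform = anderson_darling_normal(uniform_like)
print(a2_star_normal, a2_star_uniform)
```

Samples containing extreme outliers (beyond roughly 8 standard deviations) would drive the CDF to exactly 0 or 1 in double precision and make the logarithm fail, so a production implementation would need to guard against that case.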

[25] We conclude, then, that the daily, monthly, and yearly anomalies produce self-consistent estimates of 1967–2005 density trends and of the uncertainty in the trends, when the autocorrelation of the residuals is accounted for. At 400 km, we estimate a trend of −1.94% per decade, with a 95% confidence interval of [−3.30, −0.59] % per decade. The results for 250, 400, and 550 km altitude are summarized in Table 1. The α̂ and r values for the daily AR(27), monthly AR(24), and yearly AR(2) models are given in Data Set S1 in the auxiliary material.

Table 1. Linear Trend and 95% Confidence Intervals of 1967–2005 Density Anomalies, Using an AR(24) Model for the Monthly Data and an AR(2) Model for the Yearly Data
Data Cadence        1967–2005 Density Trend Estimates (%/Decade), β̂1(0) ± 2σ̂β1(p)
                    250 km          400 km          550 km
Monthly (p = 24)    −1.63 ± 1.24    −1.91 ± 1.31    −2.12 ± 2.20
Yearly (p = 2)      −1.65 ± 1.23    −1.94 ± 1.35    −2.17 ± 2.32

4. Discussion

[26] The results presented in the previous section pertain to statistical uncertainty that is estimable from the residual variance of the data, which we attribute primarily to stochastic geophysical variability that is not accounted for by our reference climatology (random error in the data also contributes to the residual variance). The analysis does not include error due to any temporally dependent bias that may exist in the density data or in the solar and geomagnetic indices. We have addressed such bias in past articles. The results of Emmert et al. [2010] suggest that F10.7 has been a stable indicator of solar EUV irradiance until at least 2005. Systematic errors may be present in the density data, due to collective variations in the true ballistic coefficients of the objects (e.g., via thermospheric composition variations over the solar cycle). However, this potential error component should largely be absorbed by our reference climatology, and therefore is not expected to significantly influence the trend estimate unless there is a collective long-term drift of the true ballistic coefficients. Emmert et al. [2004] considered the latter possibility and concluded that it is unlikely that this potential source of error could account for the magnitude of the negative trends, which is fairly consistent among trends derived from a wide variety of individual objects. In computing the density data, Emmert [2009] excluded objects whose ballistic coefficients appeared to be unstable. Therefore, we expect that this source of error is small compared to the statistical uncertainty, although it remains a poorly defined quantity.

[27] As a further check on the robustness of the trend estimates, we performed two sets of alternate calculations. First, we used several time intervals other than 1986–2007 for the reference model (ρG in equation (1); we retained the GAMDM formulation). The middle column of Table 2 shows the resulting 1967–2005 trends at 400 km computed from yearly average anomalies; the first value is the same result given in Table 1. The trend estimates differ at most by 0.3% per decade, which is well within the indicated AR(2) confidence intervals. In particular, withholding the 2006–2007 data (which were excluded from the trend analysis presented in section 3) from the reference climatology has little effect on the trend estimate. The right column of Table 2 shows trend estimates for 1967–2007, the trend interval used by Emmert et al. [2008]. The 1967–2007 trends are somewhat larger in magnitude than the 1967–2005 trends, owing to the large negative anomalies in 2006 and 2007 shown in Figure 1.

Table 2. Linear Trends, With AR(2) Uncertainties, of Yearly 400 km Density Anomalies, Using Different Periods for the Reference Climatology
Reference Period    Density Trend Estimates (%/Decade), β̂1(0)Y ± 2σ̂β1(2)Y
                    1967–2005       1967–2007
1986–2007           −1.94 ± 1.35    −2.68 ± 1.55
1986–2005           −2.02 ± 1.18    −2.90 ± 1.57
1967–1985           −2.25 ± 1.25    −3.34 ± 1.73
1967–2005           −2.07 ± 1.20    −3.01 ± 1.62

[28] Second, instead of computing the trend from the density anomalies, we incorporated a trend parameter into the GAMDM formulation and estimated the full set of parameters using the 1967–2005 data. The resulting trend and its 2σ AR(27) uncertainty were −2.15 ± 1.18% per decade. In comparison, Table 2 indicates a trend of −2.07 ± 1.20% per decade when daily anomalies around a 1967–2005 trendless model were analyzed as a second step (equation (2)). The two trend values are very close to each other (and well within the confidence interval), indicating that the trend term is sufficiently orthogonal to the other terms in the climatology (i.e., the solar, geomagnetic, and annual harmonic terms are not absorbing the effect of the trend). This is confirmed by the parameter covariance matrix, cov β̂: the maximum correlation between the trend parameter and the other parameters is 0.08.

[29] Improvement of the precision of the density trend estimates (i.e., reduction of the trend statistical uncertainty) requires either a longer time series or a deterministic explanation of more of the variance in the anomalies, especially the variance on time scales greater than one year. Reduction in variance is possible if some of the interannual features in the data could be attributed to lower atmospheric variations, which would require specification of mesospheric and lower thermospheric (MLT) temperature and composition during the analysis period. We deem it unlikely that a substantial portion of the residual interannual variance can be explained by direct solar forcing.

[30] Because of the large anomalous decline in density between 2005 and 2010 [Emmert et al., 2010], we restricted our analysis to the 1967–2005 period (inclusive). The cause of the 2005–2010 decline is under investigation. It appears to be linear with a trend of about 90% per decade, but because of the short time period and the large, presumably unsustainable magnitude of the trend, we elected not to conduct a rigorous analysis of this portion of the time series. The appropriate treatment of this feature in future studies will depend on how the thermosphere responds to the next solar maximum and minimum, and on the physical mechanism responsible, if one can be accurately ascribed. Lean et al. [2011] observed a slight positive trend in global total electron content (TEC) during this period, after accounting for solar forcing. Like thermospheric density, TEC responds strongly and positively to increases in solar EUV irradiance, so it appears unlikely that the large density decline can be satisfactorily explained by weaker-than-expected EUV forcing, as suggested by Solomon et al. [2010]. It may prove challenging to construct a scenario for the 2008 solar minimum that adequately explains all available observations; such a scenario could involve the lower and middle atmosphere's response to the unusual solar quiescence.

5. Conclusion

[31] We estimated trends and self-consistent trend uncertainties for 1967–2005 thermospheric density data derived from orbital drag. The trend uncertainty estimates used autoregressive (AR) noise models to account for autocorrelation in the residuals, and were about 50% larger than uncertainty estimates computed from yearly averages under the assumption of independent random error. We analyzed daily, monthly, and yearly averages of the data, and we examined the dependence of the trend uncertainty estimates on the order of the AR models used in the calculations. The three temporal cadences produced mutually consistent trend uncertainty estimates; the largest uncertainties were obtained using AR models that incorporated lags out to 2 years. In contrast, standard uncertainty estimates (which assume uncorrelated random error) differed greatly among the daily, monthly, and yearly cases.

[32] At an altitude of 400 km, we estimated a 1967–2005 density trend of −1.94% per decade, with a 1σ uncertainty of 0.67% per decade and a 95% confidence interval of [−3.30, −0.59] % per decade. The residuals around the trend+AR noise model were consistent with the assumption of independent, normally distributed random errors with uniform variance.

[33] Our methodology permits direct estimation of trend uncertainties for autoregressive processes of arbitrary order, superseding the use of Monte Carlo simulations. The approach is applicable to trend analysis of other upper atmospheric and ionospheric parameters.

Appendix A

A1. Trend Uncertainties at Different Cadences Under the Assumption of IIND Errors

[34] We demonstrate here that when the random errors of a time series are independent (uncorrelated) and normally and identically distributed, the uncertainty of a linear trend computed from the raw data is approximately the same as when the data are first averaged in nonoverlapping temporal windows. Let x represent a time series of N observations at uniformly spaced times t, and let y be the averages of consecutive, nonoverlapping segments of x, with L observations in each segment:

equation image

where uj is the average time within each window, which is assigned to the corresponding average yj.

[35] For simplicity, we assume that x contains an integer number M of segments, so that M = N/L is the number of elements in y. We also choose the origin of t such that 〈t〉 = 〈u〉 = 0; this choice does not affect the trend estimates. The statistical models of x and y are

equation image

where ε and η are the IIND random errors, related to each other by

equation image

so that

equation image

[36] The least squares trend estimate computed from the original time series x, and the trend uncertainty, are thus

equation image

For the reduced time series, the corresponding trend estimate and uncertainty are

equation image

If M is sufficiently large, then

equation image

For example, the ratio of the left side of equation (A7) to the right side is 1.04 for M = L = 5, and 1.01 for M = L = 10. Combining equations (A5)–(A7), we have

equation image

which indicates, as expected, that the true variance of the trend estimates is insensitive to windowed averaging.
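
The two ratios quoted for equation (A7) can be checked numerically. The sketch below assumes that (A7) states Σt² ≈ L Σu², where t are the centered observation times and u the centered window-mean times; this reading is an assumption, although the quoted values 1.04 and 1.01 are consistent with it. The function name `sum_sq_ratio` and the unit time spacing are illustrative choices, not part of the original analysis.

```python
import numpy as np

def sum_sq_ratio(M, L):
    """Ratio sum(t^2) / (L * sum(u^2)) for N = M*L unit-spaced times.

    t is centered so that <t> = 0; u holds the mean time of each of the
    M nonoverlapping windows of length L, so <u> = 0 as well.
    Assumed reading of (A7): sum(t^2) is approximately L * sum(u^2).
    """
    t = np.arange(M * L, dtype=float)
    t -= t.mean()
    u = t.reshape(M, L).mean(axis=1)
    return np.sum(t**2) / (L * np.sum(u**2))

# The first two cases are the ones quoted in the text; the third roughly
# matches the daily-versus-yearly comparison (39 years of 365 days).
for M, L in [(5, 5), (10, 10), (39, 365)]:
    print(M, L, sum_sq_ratio(M, L))
```

For unit spacing the ratio has the closed form (M²L² − 1)/[L²(M² − 1)], which tends to 1 as M grows, so the approximation is controlled by the number of windows rather than by the window length.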

[37] The standard estimates of σ_ε² and σ_η² are

equation image

and follow the chi-square distribution:

equation image

[38] The standard estimates of the trend uncertainties are thus

equation image

and their ratio is

equation image

[39] This statistic therefore approximately follows the F distribution:

equation image
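
The argument of paragraphs [37] to [39] can be sketched in one line. The notation here is assumed and may differ from the symbols in the numbered equations: hats denote estimates, $\hat\sigma_\varepsilon^2$ and $\hat\sigma_\eta^2$ are the residual variance estimates for x and y, and $\hat\sigma_{b,x}^2$ and $\hat\sigma_{b,y}^2$ are the corresponding trend variance estimates.

```latex
\frac{\hat\sigma_{b,x}^{2}}{\hat\sigma_{b,y}^{2}}
  = \frac{\hat\sigma_\varepsilon^{2} \big/ \sum_{i=1}^{N} t_i^{2}}
         {\hat\sigma_\eta^{2} \big/ \sum_{j=1}^{M} u_j^{2}}
  \approx \frac{\hat\sigma_\varepsilon^{2}}{L\,\hat\sigma_\eta^{2}}
  = \frac{\hat\sigma_\varepsilon^{2} / \sigma_\varepsilon^{2}}
         {\hat\sigma_\eta^{2} / \sigma_\eta^{2}}
  \;\sim\; F_{N-2,\;M-2} \quad \text{(approximately)}
```

The approximation step uses $\sum t_i^2 \approx L \sum u_j^2$, and the following equality uses $\sigma_\eta^2 = \sigma_\varepsilon^2 / L$, which holds because each η is the mean of L independent ε. Each normalized variance estimate is a chi-square variable divided by its degrees of freedom (N − 2 and M − 2). The F distribution is only approximate because the two estimates are computed from the same data and are not strictly independent.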

[40] This property is demonstrated in Figure A1 with a Monte Carlo simulation that represents a comparison between trend uncertainties computed from daily 1967–2005 data and uncertainties computed from annual averages of those data. In this case, N = 14,245 and M = 39. From Figure 1, the actual density anomalies produce a ratio of (0.10/0.45)² ≈ 0.05, which is well outside the core of the distribution, indicating that the assumption of independent random error is not valid.

Figure A1.

Distribution of the ratio of the trend variance estimate computed from raw data x to the trend variance estimate computed from windowed averages of x, under the assumption of independent and identically distributed Gaussian random errors. For each of 1000 synthetic samples, 14,245 “daily” Gaussian deviates with variance 0.14 were generated and added to a trend of −5.3 × 10⁻⁶ per day; this simulation approximately corresponds to the 1967–2005 daily density anomalies shown in Figure 1. Trend variance estimates were computed from each sample and from annual averages of each sample, using equation (A11); the blue histogram shows the distribution of these ratios. The red curve shows the corresponding F distribution with 14,243 degrees of freedom in the numerator and 37 degrees of freedom (39 − 2 years) in the denominator.
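
A simulation of this kind can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used for Figure A1. It uses 365-day years (so N = 14,235 rather than 14,245), 300 samples instead of 1000, and an arbitrary random seed. The helper `trend_variance` is a hypothetical name, and it implements the standard independent-error slope variance, which is what equation (A11) is taken to be.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)

M, L = 39, 365                   # years and days per year; leap days ignored (assumption)
N = M * L                        # 14,235 here versus 14,245 in the text
n_samples = 300                  # fewer than the 1000 samples of Figure A1, for speed
trend = -5.3e-6                  # per day, from the Figure A1 caption
sigma = np.sqrt(0.14)            # daily standard deviation from the caption's variance

t = np.arange(N, dtype=float)
t -= t.mean()                    # origin chosen so that <t> = 0
u = t.reshape(M, L).mean(axis=1) # mean time of each annual window, <u> = 0

def trend_variance(times, data):
    """Independent-error variance estimate of the fitted slope, one per row of data."""
    n = times.size
    slope = data @ times / np.sum(times**2)
    resid = data - data.mean(axis=1, keepdims=True) - np.outer(slope, times)
    s2 = np.sum(resid**2, axis=1) / (n - 2)   # residual variance with n - 2 dof
    return s2 / np.sum(times**2)

x = trend * t + sigma * rng.standard_normal((n_samples, N))  # synthetic daily data
y = x.reshape(n_samples, M, L).mean(axis=2)                  # annual averages

ratio = trend_variance(t, x) / trend_variance(u, y)
print("median ratio:", np.median(ratio))
print("smallest ratio:", ratio.min())
print("ratio from the actual anomalies:", (0.10 / 0.45) ** 2)
```

Under independent errors the simulated ratios cluster near 1, so a histogram of `ratio` can be compared with the F distribution described in the caption. The observed value of about 0.05 lies far below the smallest simulated ratio.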

A2. Computation of the Inverse Data Correlation Matrix for an AR(p) Process

[41] For a uniformly spaced time series, the inverse data correlation matrix corresponding to a specified AR model order can be estimated from the sample autocorrelation function as follows. The computational procedure is a generalization of the approach outlined by Emmert [2009, equations 12 and 13] (which is effectively an AR(1) model), except that it only applies to the case of a uniformly spaced time series.

[42] We begin by defining the “fitted” autocorrelation function, which is identical to the sample autocorrelation for lags less than or equal to the AR model order, but for larger lags is defined by the estimated AR parameters according to the Yule-Walker equations [Chatfield, 1996, p. 38]:

equation image

The estimated data correlation matrix is then

equation image

The inverse is a band-diagonal symmetric matrix with (p + 1)(p + 2)/2 unique values (see the numerical example below for a demonstration of this property), so only part of the matrix is needed in the calculation. The unique values of the inverse matrix can be efficiently computed by successively solving systems of p, p − 1, p − 2, etc. equations. We illustrate the procedure with an AR(3) model, for which the inverse matrix has 10 unique values:

equation image

where the second matrix is the top-left corner of the inverse matrix containing all of the unique, nonzero values (see equation (11)). The first four values, contained in the first column, are computed using

equation image

The next three values are computed via

equation image

(the prime symbol denotes the transpose) so that

equation image

Similarly, the remaining values are successively computed using

equation image

and

equation image

[43] The full matrix can then be filled out as illustrated in equation (11) or stored in condensed form. Note that the sample autocorrelation function up to lag p is all that is necessary for this procedure. As a numerical example, the equations below show the results we obtained for the AR(3) model of the 400 km daily density residuals:

equation image
equation image
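
The band-diagonal structure of the inverse can be illustrated numerically. The sketch below builds the fitted autocorrelation function from the Yule-Walker equations, forms the correlation matrix, and inverts it. It uses a direct `np.linalg.inv` call rather than the successive-solution procedure described above, so it demonstrates the property, not the efficient algorithm. The sample autocorrelations are hypothetical values chosen for illustration, not the 400 km density results, and the function names are likewise illustrative.

```python
import numpy as np

def fitted_autocorrelation(r_sample, n):
    """Extend sample autocorrelations r_0..r_p to n lags via the Yule-Walker recursion."""
    r_sample = np.asarray(r_sample, dtype=float)
    p = r_sample.size - 1
    lags = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    # Yule-Walker system: Toeplitz(r_0..r_{p-1}) @ phi = (r_1..r_p)
    phi = np.linalg.solve(r_sample[lags], r_sample[1:])
    r_fit = np.empty(n)
    r_fit[: p + 1] = r_sample                 # lags <= p: sample values unchanged
    for k in range(p + 1, n):                 # lags > p: r_k = sum_i phi_i * r_{k-i}
        r_fit[k] = phi @ r_fit[k - 1 : k - p - 1 : -1]
    return phi, r_fit

def correlation_matrix(r_fit):
    """Symmetric Toeplitz matrix with element (i, j) equal to r_fit[|i - j|]."""
    n = r_fit.size
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return r_fit[lags]

r_sample = [1.0, 0.8, 0.7, 0.62]   # hypothetical lag 0..3 autocorrelations
p, n = 3, 12                       # AR order and (small) series length
phi, r_fit = fitted_autocorrelation(r_sample, n)
R_inv = np.linalg.inv(correlation_matrix(r_fit))

lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
print("AR coefficients:", phi)
print("largest |entry| beyond lag p:", np.abs(R_inv[lags > p]).max())
print("top-left corner of the inverse:")
print(np.round(R_inv[: p + 1, : p + 1], 4))
```

Entries of the inverse more than p places from the diagonal vanish to rounding error, and rows away from the matrix edges repeat the same values. The upper triangle of the printed top-left (p + 1) × (p + 1) corner therefore holds the (p + 1)(p + 2)/2 unique values mentioned above.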

Acknowledgments

[44] This work was supported by the Office of Naval Research. Orbit data were obtained from www.space-track.org. The authors thank J. Lean and R. Meier for helpful discussions.

[45] Robert Lysak thanks Martin G. Mlynczak and another reviewer for their assistance in evaluating this paper.