2.1. Introduction: two-equation case
We assume that the data are stationary, though autocorrelated, upon detrending; in other words, ‘trend stationary.’ Suppose that there are two series of interest, y1τ and y2τ, where τ = 1, …, T. Trends are fitted using
A Student's t-test of slope equivalence is
where ∧ denotes an OLS estimate, the variance term for each series (i = 1, 2) is an autocorrelation-robust variance estimator for b̂i, and cov(b̂1, b̂2) is the estimated covariance between the trend terms.
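For reference, the slope-equivalence statistic can be written out from the definitions above. This is a sketch rather than a quotation: the symbol s_i^2 for the autocorrelation-robust variance estimator of b̂i is an assumed notation, and the form shown is the standard one for the difference of two correlated estimates.

```latex
t = \frac{\hat{b}_1 - \hat{b}_2}
         {\sqrt{s_1^2 + s_2^2 - 2\,\widehat{\mathrm{cov}}(\hat{b}_1, \hat{b}_2)}}
```

When the two trend estimates are positively correlated, the covariance term shrinks the denominator, so omitting it makes the test less likely to detect a real difference in slopes.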
Karl et al. (2006) drew attention to an apparent discrepancy between observed and model-generated temperature trends in the tropical atmosphere. Douglass et al. (2007) tested surface-matched differences (Supporting Information) using
where b̂1 denotes the trend through model ensemble means, b̂2 denotes the trend through observations, and s̃1 is the estimated standard error of b̂1. The test (4) incorrectly treats the observations as deterministic and assumes the model observations are independent across time. Santer et al. (2008) instead used
where ∼ denotes a least-squares estimate and ri denotes the first-order autoregressive (AR1) coefficient in series i. The ratio of AR1 terms is commonly referred to as an ‘effective degrees of freedom’ adjustment (Santer et al. 2000). Instead of providing T independent observations, a series is said to provide only (1 − ri)T/(1 + ri) independent observations. The resulting variance corresponds to an estimate obtained using an AR1 model, but is not equivalent to that derived from higher order autocorrelation models. In addition, it does not yield a correct 2cov(b̂1, b̂2) term (Supporting Information), which was missing in both Equations (4) and (5). While detrended climate model projections may be uncorrelated with observations, the assumption of no covariance among trend coefficients implies that models have no low-frequency correspondence with observations in response to observed forcings, which seems overly pessimistic.
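The effective degrees of freedom adjustment described above can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that the adjustment enters by replacing T with the effective sample size in the residual variance of the trend regression; it is not a reproduction of any published code.

```python
import numpy as np

def ols_trend(y):
    """Fit y = a + b*tau + u by OLS; return slope, naive slope s.e., residuals."""
    T = len(y)
    tau = np.arange(1, T + 1, dtype=float)
    X = np.column_stack([np.ones(T), tau])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (T - 2)  # residual variance, T - 2 degrees of freedom
    se_b = np.sqrt(s2 / np.sum((tau - tau.mean()) ** 2))
    return coef[1], se_b, resid

def ar1_coefficient(resid):
    """Lag-1 autocorrelation of the regression residuals."""
    return float(resid[1:] @ resid[:-1] / (resid @ resid))

def effective_sample_size(T, r):
    """Effective number of independent observations, T(1 - r)/(1 + r)."""
    return T * (1.0 - r) / (1.0 + r)

def adjusted_slope_se(y):
    """Slope s.e. with T replaced by the effective sample size (assumed form)."""
    b, se_naive, resid = ols_trend(y)
    T = len(y)
    r = ar1_coefficient(resid)
    n_eff = effective_sample_size(T, r)
    # Rescale the residual variance from (T - 2) to (n_eff - 2) degrees of freedom.
    se_adj = se_naive * np.sqrt((T - 2) / (n_eff - 2))
    return b, se_naive, se_adj, r, n_eff
```

For positive ri the adjusted standard error exceeds the naive one. The adjustment uses only the lag-1 coefficient, which is why it cannot capture higher order autocorrelation or the covariance between two trend estimates.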
2.2. Panel regressions
Equation (3) can be obtained using a panel regression. Suppose that the dependent variable is the stacked vector (y1, y2)′, and we estimate the following equation:
(1 1)′ denotes two stacked T-length vectors of ones. (0 1)′ denotes a vector of T zeros stacked on T ones. This is called an indicator or a ‘dummy variable,’ since it indicates (with value 1) that the dependent variable is y2. (τ τ)′ denotes a 2T-length vector consisting of two T-length time trends and (0 τ)′ is (τ τ)′ times (0 1)′. A test of d̂2 = 0 in Equation (6) can be shown to be equivalent to testing b̂1 = b̂2 (Kmenta 1986; Supporting Information). Hence, the t-statistic on d̂2 in Equation (6) yields the test score (3).
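The equivalence between the coefficient on (0 τ)′ and the difference in separately fitted slopes can be checked numerically. The following is a minimal sketch on simulated data; the series, the seed, and the coefficient names other than d2 are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120
tau = np.arange(1, T + 1, dtype=float)

# Two hypothetical trending series with different slopes.
y1 = 0.2 + 0.010 * tau + rng.normal(scale=0.3, size=T)
y2 = -0.1 + 0.025 * tau + rng.normal(scale=0.3, size=T)

def slope(y):
    """OLS slope from a single-series trend regression."""
    X = np.column_stack([np.ones(T), tau])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b1, b2 = slope(y1), slope(y2)

# Stacked (panel) regression on the four regressors described in the text.
ones, zeros = np.ones(T), np.zeros(T)
y = np.concatenate([y1, y2])
X = np.column_stack([
    np.concatenate([ones, ones]),    # (1 1)'
    np.concatenate([zeros, ones]),   # (0 1)'
    np.concatenate([tau, tau]),      # (tau tau)'
    np.concatenate([zeros, tau]),    # (0 tau)'
])
c1, c2, d1, d2 = np.linalg.lstsq(X, y, rcond=None)[0]

print(np.isclose(d1, b1), np.isclose(d2, b2 - b1))
```

Because the stacked regression is fully interacted with the dummy, it reproduces the two separate fits exactly, so the coefficient on (0 τ)′ equals the slope difference. Only the standard errors depend on how the residual covariance is modeled.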
To generalize the framework further, suppose that we are comparing m model-generated series and o observational series, making the total number of series N = m + o. Each source i yields Ti ≤ T nonmissing observations yiτ over the interval τ = 1, …, T. Define an indicator variable obsiτ that equals 0 if the record is model generated and 1 if it is from an observational series. Denote the ith vector as y′i = [yi1, …, yiT]. Stack these vectors into a single NT × 1 vector y as follows:
Stack the trend vector τ′ = [1, …, T] N times to get the NT × 1 panel trend vector
The indicator, or the dummy, variables are likewise stacked to form
where obsi is (obsi1, …,obsiT)′. The regression equation is then written as
where e is an NT × 1 residual vector with typical element eiτ. Note that all the ‘data’ are on the left-hand side, and the right-hand side consists of dummy variables and trend terms.
When obsiτ = 0, dyiτ/dτ = b̂1, and when obsiτ = 1, dyiτ/dτ = b̂1 + b̂2. Thus, a t-statistic on b̂1 tests whether the model trend is zero, and a test of the linear restriction b̂1 + b̂2 = 0 indicates the significance of the observed slope. The t-statistic on b̂2 tests whether the trend in observations differs significantly from the trend in models.
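The stacking and the interpretation of b̂1 and b̂2 can be sketched as follows. This assumes balanced panels (Ti = T for every series) and a regressor set of intercept, trend, indicator-times-trend, and indicator; the column ordering and the function names are hypothetical choices for illustration.

```python
import numpy as np

def panel_trend_design(T, obs_flags):
    """Build the NT x 4 design matrix for N stacked series of length T."""
    tau = np.arange(1, T + 1, dtype=float)
    N = len(obs_flags)
    trend = np.tile(tau, N)                                   # tau stacked N times
    obs = np.repeat(np.asarray(obs_flags, dtype=float), T)    # stacked indicator
    return np.column_stack([np.ones(N * T), trend, obs * trend, obs])

def fit_panel(series, obs_flags):
    """OLS fit of the stacked regression; series is an N x T array."""
    series = np.asarray(series, dtype=float)
    N, T = series.shape
    y = series.reshape(N * T)          # y_1 followed by y_2, ..., y_N
    X = panel_trend_design(T, obs_flags)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"intercept": b[0], "b1": b[1], "b2": b[2], "obs_shift": b[3]}

# Hypothetical noise-free example: two model series and one observational series.
T = 50
tau = np.arange(1, T + 1, dtype=float)
series = np.vstack([0.1 + 0.03 * tau, 0.4 + 0.03 * tau, -0.2 + 0.01 * tau])
fit = fit_panel(series, obs_flags=[0, 0, 1])
model_trend = fit["b1"]                  # slope when obs = 0
observed_trend = fit["b1"] + fit["b2"]   # slope when obs = 1
```

In this example the fitted model trend is 0.03 and the observed trend is 0.01, so b̂2 is −0.02. The sketch returns point estimates only; inference on b̂2 still requires a valid covariance estimator.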
Equation (10) can be extended further. Suppose that observations come from two different systems, such as satellites and weather balloons. Define two indicator variables: d1, which equals 1 if an observation is from either system 1 or system 2, and d2, which equals 1 only if the observation is from system 2. The regression equation becomes
The estimated model trend is b̂1. The trend in observations from system 1 is b̂1 + b̂2 and from system 2 is b̂1 + b̂2 + b̂4. The t-statistic on b̂4 tests whether the trend in the second observation system differs from that in the first, and so forth.
Hypothesis testing requires a valid estimator of V(b), the covariance matrix of b. The general form is (Davidson and MacKinnon 2002) V(b) = (X′X)⁻¹X′ΩX(X′X)⁻¹,
where X denotes the right-hand side variables in Equation (11) and Ω = E(ee′). Obtaining a valid estimate of Ω requires modeling the cross- and within-panel covariances. For a panel i with T observations, define a matrix Ai of AR weights using the panel-specific AR1 coefficient ρi:
2.3. Higher order autocorrelations and multivariate trend models
Vogelsang and Franses (2005, herein VF05) derived two estimators for Ω that impose no parametric restrictions on the lag and correlation structure, as is done in Equation (14). Suppose that the N panels are used one at a time in Equation (1), yielding OLS trend estimates b̂ = b̂1, …, b̂N. Take the N residual series u1τ, …, uNτ and form the T × N matrix U = [u1τ, …, uNτ]. VF05 derive two transformations of U that converge in probability to a scalar multiple of Ω. Of their two estimators, we focus on the form, which has higher power and is slightly easier to compute. It is obtained as follows. Denote V = U′ and take the columns vj, for j = 1, …, T, each of length N. Define a vector . Then, VF05 show that
converges in probability to an unbiased estimate of Ω, regardless of the form of autocorrelation and other departures from the independence assumption. For testing purposes, linear restrictions on the slopes can be written in the matrix form Rb̂ = 0 (Supporting Information). The VF05 test statistic is
where η = Σ(t − t̄)² and q is the number of restrictions, which in our examples is always equal to 1. Critical values for Equation (16) generated by Monte Carlo simulation are reported in VF05.
The VF05 approach improves on the panel method by providing robust trend variances and covariances regardless of the autocorrelation order and the structure of heteroskedasticity. However, it requires balanced panels, which can be a limitation in some cases.
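The computation can be sketched in Python under stated assumptions. The sketch assumes that the covariance estimator is built from partial sums of the residual vectors, Ω̂ = 2T⁻² Σ S_m S_m′ with S_m = v_1 + … + v_m, and that the statistic is the quadratic form (Rb̂)′[η⁻¹RΩ̂R′]⁻¹(Rb̂)/q. Both forms are a reading of the description above and should be checked against VF05 before use; the function name is hypothetical.

```python
import numpy as np

def vf_statistic(Y, R):
    """Sketch of a VF05-type trend test.

    Y : T x N array, one column per series (balanced panel required).
    R : q x N matrix of linear restrictions on the slopes, testing R b = 0.
    Returns the statistic, the OLS slopes, and the covariance estimate.
    """
    Y = np.asarray(Y, dtype=float)
    R = np.atleast_2d(np.asarray(R, dtype=float))
    T, N = Y.shape
    q = R.shape[0]
    tau = np.arange(1, T + 1, dtype=float)
    X = np.column_stack([np.ones(T), tau])

    # Series-by-series OLS trend fits (all columns fitted at once).
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    b = coef[1]
    U = Y - X @ coef                      # T x N residual matrix

    # Assumed partial-sum form of the covariance estimator.
    S = np.cumsum(U, axis=0)              # row m holds S_m = v_1 + ... + v_m
    omega = 2.0 * (S.T @ S) / T**2

    eta = np.sum((tau - tau.mean()) ** 2)
    Rb = R @ b
    middle = R @ omega @ R.T / eta
    stat = float(Rb @ np.linalg.solve(middle, Rb)) / q
    return stat, b, omega
```

To test equality of two slopes, one would pass R = [[1, -1]]. The statistic does not follow a standard F distribution, so it must be compared with the simulated critical values in VF05 rather than with conventional tables.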
As with all test statistics, the size of the VF05 statistic improves as the sample size increases. Rejection probabilities also increase as ρ → 1. Monte Carlo simulations in VF05 show that for T = 100, when q = 1 and ρ > 0.8, just under 10% of scores exceed the 95th percentile, indicating a tendency to over-reject a true null, although this is an improvement compared to earlier alternatives. Each panel in our full sample has well over 100 observations, but a high ρ value. Hence, VF05 scores that are close to the critical values may overstate significance.