[9] Here we introduce a statistical modeling approach that uses nonparametric regression to estimate smooth trends from time series data. The nonparametric regression uses a set of optimal thin plate splines to represent the trends and can be used to make formal inference (e.g., calculate confidence and prediction intervals). As discussed in the Introduction, the approach adopted here consists of three distinct steps: estimation of individual model trends (IMT), baseline adjustment of the trends, and the weighted combination of the adjusted individual model trends to produce a multimodel trend (MMT) estimate. In this section the development and application of this approach will be illustrated using the time series of column ozone data presented in Figure 1. These data correspond to the CCMVal-1 raw time series analyzed by *Eyring et al.* [2007, Figure 7]. Several models participating in CCMVal-1 provided two overlapping time series of column ozone to cover the maximum range of the REF-A2 period (1980–2100): one from the REF-A2 experiment and one from the climate-of-the-20th-century experiment REF-A1. This additional complication is accounted for in the TSAM approach by considering these partially overlapping time series as ensemble members.

#### 2.1. Nonparametric Estimation of the Individual Model Trends

[10] The time series *y*_{jk}(*t*) of an ozone-related index, such as one of those displayed in Figure 1, is additively modeled as the sum of a smooth unknown model-dependent trend, *h*_{j}(*t*), and irregular normally distributed noise:

*y*_{jk}(*t*) = *h*_{j}(*t*) + *ε*_{jk}(*t*), (1)

where the noise field

*ε*_{jk}(*t*) ~ N(0, *σ*^{2}) (2)

is assumed to be an independent normally distributed random variable with zero mean and variance *σ*^{2}, and the indices *j* and *k*, respectively, represent model and ensemble-member number. (Here the ensemble index *k* extends over both REF-A1 and REF-A2 simulations for some models.) This is a nonparametric regression of the time series on time: the regression is nonparametric because the function of time does not have a fixed functional form with explicit parameters. The noise term (2), representing natural variability about the trend, is taken to be independent between different times, models, and runs, and its variance is assumed to be constant over all models and runs. By fitting the trend to all the data rather than to each model separately, one obtains better estimates of the noise variance (referred to as “borrowing strength”).

[11] The unknown smooth functions *h*_{j}(*t*) are estimated by fitting the data to a finite set of smooth basis functions having optimal interpolating properties. This was done here using the generalized additive model **gam**() function in the **mgcv** library of the R language [*R Development Core Team*, 2008; *Wood*, 2006]. The default option was used, which represents the trends with thin plate regression splines and finds the coefficients multiplying the basis functions by maximizing a penalized likelihood. The smoothness of the fit is controlled by a smoothing parameter, which is chosen using a generalized cross-validation (GCV) criterion, an approximation to leave-one-out prediction error (see *Wood* [2006] for more details). Unlike the iterated 1:2:1 Lanczos filter smoothing typically applied to CCMVal-1 data [e.g., *Eyring et al.*, 2007], the thin plate splines are guaranteed to give smooth trend estimates and do not alter their properties at the ends of the series.
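The spline fitting itself was done with **gam**() in R; the following self-contained numpy sketch illustrates the same penalized-regression-with-GCV idea. The basis (cubic truncated powers with 10 interior knots) and the grid of candidate penalties are arbitrary choices for illustration, not the thin plate construction used by **mgcv**.

```python
import numpy as np

def smooth_trend(t, y, lams=np.logspace(-4.0, 4.0, 41)):
    """Penalized-regression trend estimate with GCV-chosen smoothing.

    Conceptual stand-in for mgcv's gam(): a cubic truncated-power basis
    (10 interior knots, an arbitrary choice) replaces the thin plate
    regression splines, and the roughness penalty lam is selected by
    minimizing the generalized cross-validation (GCV) score."""
    ts = (t - t.mean()) / t.std()                      # standardize for conditioning
    knots = np.linspace(ts.min(), ts.max(), 12)[1:-1]  # 10 interior knots
    X = np.column_stack([np.ones_like(ts), ts, ts**2, ts**3] +
                        [np.clip(ts - k, 0.0, None)**3 for k in knots])
    n = X.shape[0]
    # penalize only the knot ("wiggly") coefficients, not the cubic polynomial
    D = np.diag([0.0] * 4 + [1.0] * len(knots))
    best = None
    for lam in lams:
        H = X @ np.linalg.solve(X.T @ X + lam * D, X.T)  # hat matrix
        yhat = H @ y
        edf = np.trace(H)                                # effective degrees of freedom
        gcv = n * np.sum((y - yhat) ** 2) / (n - edf) ** 2
        if best is None or gcv < best[0]:
            best = (gcv, yhat, edf)
    return best[1], best[2]                              # smooth trend and its edf
```

Large penalties shrink the fit toward a cubic polynomial; small penalties follow the data closely, and the GCV score arbitrates between the two.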

[12] The first step in the TSAM approach is to apply the nonparametric regression (1) to the raw time series data. This is illustrated in Figures 2a and 2b by the IMT estimates *h*_{j}(*t*) of the CCMVal-1 March 60°N–90°N and October 60°S–90°S total column ozone displayed in Figure 1. (Note that, while the smooth trend estimates *h*_{j}(*t*) extend over the full period (1950–2100) in Figures 2a and 2b, we have elected to display *h*_{j}(*t*) only over the period where data exist for each model.)

#### 2.2. Baseline Adjustment of the Trend Estimates

[13] The initial IMT estimates *h*_{j}(*t*) in Figures 2a and 2b reveal significant differences in the background values of column ozone, particularly in the Arctic (Figure 2a). To facilitate a comparison of the trends across models, anomaly time series are constructed relative to a pre–ozone-hole baseline value of the index. While this is analogous to the procedure employed by *Eyring et al.* [2007], the smoothness of *h*_{j}(*t*) allows a more robust definition of the baseline at a particular time *t*_{0} (i.e., *h*_{j}(*t*_{0})), rather than as an average over some period about *t*_{0}. This results in the anomaly time series:

*y*_{jk}(*t*) − *h*_{j}(*t*_{0}). (3)

By construction, the anomaly time series (3) is centered on a baseline value of zero at the time *t*_{0}. Here we choose to change this baseline from zero to the multimodel mean of *h*_{j}(*t*_{0}), resulting in the “*t*_{0} baseline-adjusted time series”:

*y*′_{jk}(*t*) = *y*_{jk}(*t*) − *h*_{j}(*t*_{0}) + *h*(*t*_{0}), (4)

where

*h*(*t*_{0}) = (1/*J*) Σ_{j=1}^{J} *h*_{j}(*t*_{0}) (5)

and *J* is the total number of models. Since the multimodel average of the IMT estimates *h*(*t*_{0}) is a close approximation to the final multimodel trend (MMT) estimate derived in the third step of the TSAM approach, the baseline adjustment may be viewed simply as forcing the anomaly time series to pass roughly through the final MMT estimate at the reference date *t*_{0}.
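The baseline adjustment (4) is a simple array operation. A minimal numpy sketch, with hypothetical trend arrays (one row per model), might look like:

```python
import numpy as np

def baseline_adjust(trends, t, t0=1980.0):
    """t0 baseline adjustment of individual model trends, eq. (4).

    trends : (J, T) array, one smooth IMT estimate h_j(t) per row.
    Each trend is shifted so that all models pass through the
    multimodel mean value at the reference year t0."""
    i0 = int(np.argmin(np.abs(t - t0)))   # index of reference year t0
    h_t0 = trends[:, i0]                  # h_j(t0), one value per model
    return trends - h_t0[:, None] + h_t0.mean()
```

At *t* = *t*_{0} every adjusted trend equals the multimodel mean, while the shape of each individual trend is untouched.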

[14] The time series (4) contains all the information of (3) plus the multimodel average *h*(*t*_{0}), which can be compared with observations. We use the baseline *t*_{0} = 1980 since a number of the CCMVal-1 models do not have data prior to this date. Following (4), the 1980 baseline-adjusted time series *y*′_{jk} for the CCMVal-1 March 60°N–90°N and October 60°S–90°S total column ozone are displayed in Figures 2c and 2d, respectively. The corresponding 1980 baseline-adjusted nonparametric IMT estimates *h*′_{j}(*t*) are presented in Figures 2e and 2f. Following (1) and (4), the 1980 baseline-adjusted nonparametric smooth trend in our model is:

*y*′_{jk}(*t*) = *h*′_{j}(*t*) + *ε*_{jk}(*t*), (6)

with

*h*′_{j}(*t*) = *h*_{j}(*t*) − *h*_{j}(*t*_{0}) + *h*(*t*_{0}). (7)

Before moving on to the third step in the TSAM, we may ask whether the statistical model in (7) is well specified; in other words, whether all of its assumptions are satisfied in modeling the data, for example, that the noise term *ε*_{jk}(*t*) is independent from year to year, is normally distributed, and is drawn from the same underlying distribution with zero mean and similar variance. For example, we could have chosen the simpler nonparametric model:

*y*′_{jk}(*t*) = *g*′(*t*) + *ε̃*_{jk}(*t*), (8)

where one trend estimate is made for all time series data instead of individual trend estimates for each model (7). This implicitly defines a different random noise component *ε̃*_{jk}(*t*). The nonparametric trend estimate *g*′(*t*) is displayed as the thick grey line in Figures 2c and 2d. If (8) were a reasonable model for the data then, in addition to being an IMT, *g*′(*t*) could also serve as the MMT, thereby eliminating the need for the third step of the TSAM. Visual comparison of the smooth estimate *g*′(*t*) with the 1980 baseline-adjusted time series *y*′_{jk} in Figures 2c and 2d would suggest a reasonable fit. However, because we have built the approach on a probabilistic model, the specification of the *g*′(*t*) and *h*′_{j}(*t*) fits may be tested.

[15] The year-to-year independence of the model noise term may be tested by calculating its autocorrelation function. In Figure 3 the autocorrelation function of the noise term *ε*_{jk}(*t*) is displayed for each model for the nonparametric fit (7) to the CCMVal-1 October 60°S–90°S column ozone. The dashed blue lines in Figure 3 represent 95% confidence limits; lags that extend beyond these limits indicate sample correlations significantly different from zero. Inspection of all the models reveals that the assumption of year-to-year independence is a good one for the model (7). The fits are designed to give “smooth” estimates of the long-term trends; in other words, they capture long-term variations in the trend but do not wiggle up and down annually. Interannual wiggles are suppressed by a roughness penalty when fitting the splines. Since the residuals of (7) are not serially correlated, there is no need to represent them using more complex time series models such as ARMA models or fractional Brownian noise. This would not be the case had one used simple parametric trends that are unable to follow longer-term decadal variations. Unlike model (7), the simpler nonparametric model (8), which has the same trend for all climate models, gives serially correlated residuals (Figure 4) and so is not well specified (i.e., its assumptions are not satisfied when the model is fit to the data).
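The residual diagnostic of this paragraph can be sketched as follows; the ±1.96/√N limits are the standard large-sample approximation for the 95% band, and the function and variable names are ours:

```python
import numpy as np

def autocorr(res, max_lag=20):
    """Sample autocorrelation of a residual series, plus the approximate
    95% limits (+/- 1.96/sqrt(N)) used to flag lags whose correlation is
    significantly different from zero (the dashed lines of Figure 3)."""
    res = res - res.mean()
    n = res.size
    denom = np.sum(res ** 2)
    acf = np.array([np.sum(res[:n - k] * res[k:]) / denom
                    for k in range(max_lag + 1)])
    return acf, 1.96 / np.sqrt(n)
```

For well-specified residuals roughly 95% of the nonzero lags should fall inside the band; persistent exceedances, as for model (8), signal serial correlation.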

[17] We conclude, therefore, that (7) represents one of the simplest nonparametric additive models whose assumptions are satisfied by the ozone indices considered in the two examples.

#### 2.3. Multimodel Trend Estimates

[18] The final step of the TSAM approach involves combining the IMT estimates *h*′_{j}(*t*) to arrive at an MMT estimate:

*h*′(*t*) = Σ_{j=1}^{J} *w*_{j}(*t*) *h*′_{j}(*t*), (9)

where the weights *w*_{j}(*t*) have the properties

Σ_{j=1}^{J} *w*_{j}(*t*) = 1 and *w*_{j}(*t*) ≥ 0. (10)

If the weights are assumed to be nonrandom, and the errors in the individual trends are assumed to be independent, then the squared standard error of the weighted sum is given by:

*s*^{2}(*t*) = Σ_{j=1}^{J} *w*_{j}^{2}(*t*) *s*_{j}^{2}(*t*), (11)

where *s*_{j}(*t*) is the standard error of the trend estimate *h*′_{j}(*t*), which can be calculated using standard expressions from linear regression [*Wood*, 2006]. The standard error (11) can then be used to estimate the confidence and prediction intervals, respectively, as:

*h*′(*t*) ± 2*s*(*t*) (12)

and

*h*′(*t*) ± 2[*s*^{2}(*t*) + *σ*^{2}]^{1/2}. (13)

[19] The 95% confidence interval gives the uncertainty in the trend estimate: there is a 95% chance that this interval will cover the expected trend predicted by the statistical model. The interval is pointwise (rather than simultaneous), in that it represents the uncertainty in the trend at each year rather than an interval for all probable trend curves over the whole period. The 95% prediction interval indicates how much uncertainty there is in a predicted index value for an individual year: there is a 95% chance that a particular index value in a specific year will lie in this interval. It combines the uncertainty in the trend estimate with the uncertainty due to natural interannual variability about the trend, whose standard deviation is *σ*.
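A direct numpy transcription of the weighted combination (9), its standard error (11), and the two bands might read as follows; the round multiplier 2 is our approximation to a 95% interval, not necessarily the exact value used in the figures:

```python
import numpy as np

def mmt_intervals(h, s, w, sigma):
    """Weighted multimodel trend, its standard error, and approximate
    95% confidence/prediction bands (using +/- 2 standard errors).

    h, s, w : (J, T) arrays of trends, standard errors, and weights;
    sigma   : standard deviation of the interannual noise term."""
    mmt = np.sum(w * h, axis=0)                 # weighted combination, eq. (9)
    se = np.sqrt(np.sum(w**2 * s**2, axis=0))   # standard error, eq. (11)
    ci = (mmt - 2.0 * se, mmt + 2.0 * se)       # pointwise confidence band
    pe = np.sqrt(se**2 + sigma**2)              # adds interannual variability
    pi = (mmt - 2.0 * pe, mmt + 2.0 * pe)       # prediction band
    return mmt, ci, pi
```

The prediction band is always at least as wide as the confidence band, since it adds the noise variance *σ*^{2} to the squared standard error of the trend.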

[20] The specific choice of weights in (9) remains open. In general, we decide to base the construction of the weights on a statistical probability model with testable assumptions. Here we have chosen a “random effects” model to determine the weights. This model assumes that the trends for the individual models *h*′_{j}(*t*) are random samples about the “true trend” *h̃*(*t*):

*h*′_{j}(*t*) = *h̃*(*t*) + *e*_{j}(*t*), (14)

where

*e*_{j}(*t*) ~ N(0, *s*_{j}^{2}(*t*) + *λ*^{2}). (15)

The quantity *λ*^{2} is included to account for additional variance between model trends that cannot be explained by the sampling uncertainty *s*_{j}^{2} alone. Using this random effects model, (11) then generalizes to:

*s*^{2}(*t*) = Σ_{j=1}^{J} *w*_{j}^{2}(*t*) [*s*_{j}^{2}(*t*) + *λ*^{2}], (16)

which is used here to calculate the intervals. Assuming this model is valid, a least squares estimate of the true trend may be obtained from (9) by employing the weights:

*w*_{j}(*t*) = *v*_{j}^{−1}(*t*) / Σ_{j′=1}^{J} *v*_{j′}^{−1}(*t*), (17)

where

*v*_{j}(*t*) = *s*_{j}^{2}(*t*) + *λ*^{2}. (18)

Specification of the weights *w*_{j}(*t*) from (17) requires an estimate of the parameter *λ*^{2}. For this we have used the following iterative approach: an initial estimate of the true trend is obtained by calculating *h*′_{λ=0}(*t*); an iterative Newton–Raphson algorithm is then employed to determine the *λ* that gives scaled residuals with unit variance, as expected from (14):

(1/*J*) Σ_{j=1}^{J} [*h*′_{j}(*t*) − *h*′_{λ}(*t*)]^{2} / [*s*_{j}^{2}(*t*) + *λ*^{2}] = 1. (19)

Employing this model for the weights produces the MMT estimate *h*′(*t*) for the 1980 baseline CCMVal-1 October 60°S*–*90°S column ozone displayed in Figure 6c. The associated individual model trend estimates *h*′_{j}(*t*) and weights *w*_{j}(*t*) are displayed in Figures 6a and 6b, respectively. In Figure 6, the weights are scaled by the number of models so that a scaled weight of 1 implies a proportional contribution of that model to the MMT estimate.
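As a sketch of the *λ*^{2} estimation at a single time *t*, the following uses simple bisection in place of the Newton–Raphson iteration described in the text, together with the inverse-variance weights of (17); the inputs are hypothetical:

```python
import numpy as np

def estimate_lambda2(h, s, tol=1e-6):
    """Estimate the between-model variance lambda^2 at one time t.

    Chooses lambda^2 so the scaled residuals of the random effects
    model have unit variance, using bisection as an illustrative
    substitute for Newton-Raphson. h, s : (J,) arrays of individual
    trend values and their standard errors at that time."""
    def scaled_var(lam2):
        v = s**2 + lam2
        w = (1.0 / v) / np.sum(1.0 / v)   # inverse-variance weights, eq. (17)
        htilde = np.sum(w * h)            # weighted trend estimate
        return np.mean((h - htilde)**2 / v)
    if scaled_var(0.0) <= 1.0:            # models already mutually consistent
        return 0.0
    lo, hi = 0.0, 10.0 * np.var(h) + np.max(s)**2
    while scaled_var(hi) > 1.0:           # enlarge bracket if needed
        hi *= 2.0
    while hi - lo > tol:                  # scaled_var decreases with lambda^2
        mid = 0.5 * (lo + hi)
        if scaled_var(mid) > 1.0:         # still over-dispersed: raise lambda^2
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When the spread between model trends is already explained by their individual standard errors, the estimate collapses to *λ*^{2} = 0 and (17) reduces to ordinary inverse-variance weighting.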

[21] While this formulation of weights provides a smooth final trend estimate *h*′(*t*), for this example it highlights a potential problem: the individual model weights *w*_{j}(*t*) are very insensitive to the absence of data in the original time series. For example, the time series for the MAECHAM4CHEM model (green) extends only over the period 1980–2019 (see Figure 2). Its scaled weight, however, has a value of roughly 1 over the entire period 1960–2100 suggesting significant contributions of its trend estimate *h*′_{j}(*t*) at times when there are no model data. The original idea behind this model for the weights was that the natural increase in standard errors *s*_{j}^{2} (*t*) in the region where *h*′_{j}(*t*) is extrapolated beyond the model data would cause the weights to decrease naturally toward zero. While Figure 6b indicates that there is some tendency for the weights to display this behavior, many models retain weight values close to unity out to 2100 where they have provided no data.

[22] To correct this unphysical behavior, we introduce the concept of prior weights *w*_{j}^{p}(*t*) into the formulation such that the final weights now have the form:

*w*′_{j}(*t*) = *w*_{j}^{p}(*t*) *w*_{j}(*t*) / Σ_{j′=1}^{J} *w*_{j′}^{p}(*t*) *w*_{j′}(*t*) (20)

(with *w*′_{j}(*t*) implicitly replacing *w*_{j}(*t*) in expressions (11) and (17)). An example set of prior weights is the “on/off” set: *w*_{j}^{p}(*t*) = 1 at times *t* when raw time series data exist for model *j*, and *w*_{j}^{p}(*t*) = 0 otherwise. This prescription is illustrated in Figures 6d–6f. It corrects the unphysical behavior identified when *w*_{j}(*t*) of (17) is used alone. However, the on/off prescription is still problematic in that it causes discontinuities in the MMT estimate (Figure 6f). The set of prior weights used for the present chapter instead employs a smoother quadratic taper, from a value of 1 where time series data exist to a value of 0 where they are absent:

where

and where [*t*_{j,min}, *t*_{j,max}] defines the period within which data exist for model *j.* This scheme is illustrated in Figures 6g–6i.
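A possible implementation of the prior-weight idea is sketched below. The quadratic taper width (here 10 years) is a hypothetical choice for illustration, not the value defined by the chapter's taper equations, and the combination step renormalizes the products *w*_{j}^{p}(*t*)*w*_{j}(*t*) as in (20):

```python
import numpy as np

def prior_weight(t, t_min, t_max, taper=10.0):
    """Prior weight w_j^p(t): 1 inside the model's data period
    [t_min, t_max], tapering quadratically to 0 outside it. The taper
    width (10 years) is a hypothetical choice for illustration."""
    d = np.maximum(np.maximum(t_min - t, t - t_max), 0.0)  # years outside data
    return np.clip(1.0 - (d / taper) ** 2, 0.0, None)

def combine_weights(w, wp):
    """Final weights in the spirit of eq. (20): multiply the random
    effects weights by the priors and renormalize so they sum to 1
    at each time. w, wp : (J, T) arrays."""
    num = wp * w
    return num / np.sum(num, axis=0, keepdims=True)
```

A model's contribution thus fades out smoothly beyond its data period instead of switching off abruptly, which is what removes the discontinuities seen with the on/off prescription.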

[23] Finally, we note that the formulation of prior weights (20) allows a natural entry point for the specification of prior, time-independent model weights based on performance metrics. Such metric-based weights would take on values in the range [0,1] and simply multiply *w*_{j}^{p}(*t*) in the numerator and denominator of expression (20).