Temperature results from multi-decadal simulations of coupled chemistry climate models for the recent past are analyzed using multi-linear regression including a trend, solar cycle, lower stratospheric tropical wind, and volcanic aerosol terms. The climatology of the models for recent years is in good agreement with observations for the troposphere but the model results diverge from each other and from observations in the stratosphere. Overall, the models agree better with observations than in previous assessments, primarily because of corrections in the observed temperatures. The annually averaged global and polar temperature trends simulated by the models are generally in agreement with revised satellite observations and radiosonde data over much of their altitude range. In the global average, the model trends underpredict the radiosonde data slightly at the top of the observed range. Over the Antarctic some models underpredict the temperature trend in the lower stratosphere, while others overpredict the trends.
As carbon dioxide concentrations rise, the troposphere is expected to warm and the stratosphere is expected to cool. The stratosphere is therefore an important test bed for the performance of climate models and for providing an early indication of climate change [e.g., Ramaswamy et al., 2001]. Pawson et al. assessed the performance of 13 climate models with well-resolved stratospheres, but these did not include chemistry, and simulations were on average less than 10 years each. They concluded that the models generally had an overall cold bias compared with measurements. Temperature trends have been determined from climate models, with and without chemistry [Shine et al., 2003], and were found to be generally consistent with observations in the global average. However, trends were found to be larger than observed by satellite data near 5 hPa and smaller than observed in the lower mesosphere.
Model simulations of the SPARC CCMVal (Stratospheric Processes and their Role in Climate, Chemistry Climate Model Validation) project [Eyring et al., 2005] are investigated. Each model has been run for several decades into the future as well as several decades for the past, using a consistent set of climate forcings across all the models. Here, we focus on model performance in simulating stratospheric temperature for the past. The current work can therefore be considered a continuation of the work of Pawson et al. and Shine et al., using updated observations and longer simulations, including some ensemble simulations, as well as consistent forcings.
2. Description of the 3-D Models and Simulations Included
The main model calculations included are the REF1 simulations described by Eyring et al. The MRI model used in that work was improved early in this study, and the revised results are included here. The results are from transient simulations for the period 1950 to 2005 or a subset thereof. Table 1 summarizes the models and the periods of integration. All models specified changes in chlorofluorocarbons and halons, from which the active chlorine and bromine amounts were simulated. All models specified the concentrations of the well-mixed greenhouse gases from observations, and specified observed sea surface temperature and sea ice as a model lower boundary condition.
Table 1. Brief Description of Models and Simulations (a)
[Table body not recovered. Surviving column headings: Chemical Effects on Aerosols; Radiative Effects of Aerosols.]
(a) The source of the surface area densities (SADs) and the references used to compute the optical effects of the aerosols are given by the following footnotes.
The zonally averaged temperature data for each model were fitted to the same regression equation as used by Austin et al., which included a trend term, tropical wind terms to account for the quasi-biennial oscillation, and solar cycle and volcanic aerosol terms. Results are presented for the global average and for the polar regions, where temperatures have an important impact on ozone depletion via polar stratospheric cloud formation.
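As an illustration of the kind of fit described above, the following is a minimal sketch of a multi-linear regression with trend, solar cycle, tropical wind, and volcanic aerosol terms. It is not the regression equation of Austin et al., whose exact form is not reproduced here. The function name `fit_regression`, the use of two tropical wind series, and all of the synthetic proxy series are assumptions made only for this example.

```python
import numpy as np

def fit_regression(temp, years, solar, wind_a, wind_b, aerosol):
    """Ordinary least-squares fit of
    T(t) = c0 + c1*t + c2*solar + c3*wind_a + c4*wind_b + c5*aerosol.
    Returns the coefficients and the residual series."""
    t = years - years.mean()  # centre time so c0 is the period mean
    X = np.column_stack([np.ones_like(t), t, solar, wind_a, wind_b, aerosol])
    coeffs, *_ = np.linalg.lstsq(X, temp, rcond=None)
    return coeffs, temp - X @ coeffs

# Synthetic, noise-free annual series (hypothetical proxies, not real data).
years = np.arange(1960, 2000, dtype=float)
solar = np.sin(2 * np.pi * (years - 1960) / 11.0)   # idealized 11-year cycle
wind_a = np.sin(2 * np.pi * years / 2.3)            # idealized QBO-like wind
wind_b = np.cos(2 * np.pi * years / 2.3)            # quadrature companion term
aerosol = (np.where(years >= 1982, np.exp(-(years - 1982)), 0.0)
           + np.where(years >= 1991, np.exp(-(years - 1991)), 0.0))

true_trend = -0.05  # K per year, chosen arbitrarily for the demonstration
temp = (220.0 + true_trend * (years - years.mean())
        + 0.3 * solar + 0.2 * wind_a + 1.0 * aerosol)

coeffs, residual = fit_regression(temp, years, solar, wind_a, wind_b, aerosol)
trend_per_decade = 10.0 * coeffs[1]  # convert K/year to K/decade
```

Because the synthetic series contains no noise, the fitted trend coefficient recovers the prescribed value; with real model output the residual would also carry internal variability, and the trend uncertainty would need to be estimated from it.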
3.1. Annually Averaged Temperatures 1990–1999
To put the model results in the context of previous studies of middle atmosphere performance, in particular Pawson et al., we show in Figure 1 the globally averaged annual mean temperature as a function of pressure. Data assimilation fields from the United Kingdom Meteorological Office (UKMO) and the European Centre for Medium Range Weather Forecasts Interim analysis are also included. The assimilation fields differ by less than about 2 K throughout the pressure range indicated in Figure 1. The differences between the model results and the mean UKMO data assimilation fields are also shown in Figure 1 (bottom). The model results are close together in the troposphere and lower stratosphere, and increasingly diverge above the middle stratosphere, as shown by the model minus observation differences (Figure 1, bottom). In comparison with the results of Pawson et al., Figure 1 shows better agreement in the troposphere and a smaller range in model temperatures in the middle atmosphere. Unlike in Pawson et al., a cold bias is no longer present, except for one of about 2 K in the troposphere. In the stratosphere the models are on average too warm by a similar amount.
3.2. Globally Averaged Temperature
Figure 2 shows the evolution of the global average annual temperature, weighted in the vertical in the same proportion as the Microwave Sounding Unit (MSU) channel 4 radiance, which peaks near 80 hPa. The increases in observed temperatures following the eruptions of El Chichón (1982) and Mt. Pinatubo (1991) are clearly apparent. The volcanic responses of the models vary substantially, partly because several models did not include aerosol heating (Table 1). The remaining models are similar to the results of Cordero and de Forster, which also indicated that the simulated response to the eruptions varied substantially between models. The observed long-term trends are generally larger than those simulated by the models, but the models that included aerosol effects (Figure 2, left) show the general behavior of a warming during the eruptions and a rapid cooling thereafter, followed by an approximate stabilization, as also discussed by Ramaswamy et al.
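The vertical weighting described above can be sketched as a weighted average of a temperature profile. The actual MSU channel 4 weighting function is not given in the text, so the sketch below substitutes a hypothetical Gaussian in log-pressure centred at 80 hPa; the function names, the width parameter, and the pressure levels are all assumptions for illustration only.

```python
import numpy as np

def msu4_like_weights(pressure_hpa, peak_hpa=80.0, width=0.6):
    """Hypothetical stand-in for the MSU channel 4 weighting function:
    a Gaussian in ln(pressure) centred at peak_hpa, normalized to sum to 1.
    The real instrument weighting function should be used in practice."""
    z = np.log(np.asarray(pressure_hpa, dtype=float) / peak_hpa)
    w = np.exp(-0.5 * (z / width) ** 2)
    return w / w.sum()

def weighted_temperature(temp_profile, weights):
    """Vertically weighted temperature for one profile (weights sum to 1)."""
    return float(np.sum(weights * np.asarray(temp_profile, dtype=float)))

# Illustrative pressure levels (hPa) and an isothermal test profile (K).
pressure = np.array([300.0, 200.0, 150.0, 100.0, 80.0, 50.0, 30.0, 20.0, 10.0])
weights = msu4_like_weights(pressure)
t_iso = weighted_temperature(np.full(pressure.size, 210.0), weights)
```

Applying the same weights to both model output and observations means the comparison is made on a like-for-like layer average rather than at a single pressure level.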
Temperature trends for the periods 1960–1979 and 1980–1999 are very different (Figures 3a and 3b). Stratospheric cooling was larger during the later period due to ozone depletion, although there is a larger range in the model results because more models are included and their ozone trends vary more widely [Eyring et al., 2006]. The smallest temperature trend in the stratosphere is simulated by CMAM; this was found to be related to a problem with the middle atmosphere radiation scheme, which underestimated the impact of the CO2 increase. The scheme has now been corrected.
 The results are similar to Shine et al. [2003, Figure 4], in which model ozone trends were specified from observations, although in our results the lower stratospheric cooling peaks at a slightly higher amount in some models. In Shine et al. the discrepancy between Stratospheric Sounding Unit (SSU) data and models was large near 5 hPa. However, the SSU data have recently been corrected for the increase in atmospheric CO2 concentrations [Shine et al., 2008; Randel et al., 2009] and this has led to improved agreement between observations and model results.
3.3. Polar Temperature Trends
The simulated temperature trends polewards of 67°N (Figure 3c) have many of the features of the global trend (Figure 3b) but with much larger model variability. The near-surface warming trends are typically larger in magnitude than the global average. Lower stratospheric trends for the period 1960 to 1979 are small (not shown), and in the 1980 to 1999 period, the switch between tropospheric warming and stratospheric cooling near 300 hPa is a consistent feature of all but one of the models. Nonetheless, the stratospheric cooling rates themselves vary substantially from one model to the next. In the lower stratosphere that variability is particularly large, with model internal variability most likely making a major contribution. The models in general agree with trends derived from radiosondes [Haimberger et al., 2008; Randel et al., 2009], although both models and observations cover a wide possible range. Above about 50 hPa, the models typically show a trend which decreases with height, or remains approximately constant to the middle stratosphere, although the radiosonde data indicate an increasing negative trend to the top of their range.
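Global and polar cap averages of zonally averaged fields, such as those polewards of 67°N, are commonly formed with cosine-of-latitude weights. The sketch below shows one such calculation; the text does not state how the averages in this study were computed, so the weighting choice, the function name `area_weighted_mean`, and the 5° latitude grid are assumptions.

```python
import numpy as np

def area_weighted_mean(zonal_mean, lat_deg, lat_min=-90.0, lat_max=90.0):
    """Cosine-latitude weighted mean of a zonal-mean field over
    lat_min <= latitude <= lat_max (degrees)."""
    lat = np.asarray(lat_deg, dtype=float)
    field = np.asarray(zonal_mean, dtype=float)
    mask = (lat >= lat_min) & (lat <= lat_max)
    w = np.cos(np.deg2rad(lat[mask]))  # proportional to area per latitude band
    return float(np.sum(w * field[mask]) / np.sum(w))

# Illustrative 5-degree grid of latitude band centres (an assumption).
lat = np.arange(-87.5, 90.0, 5.0)

# A uniform field must average to itself, globally and over the Arctic cap.
uniform = np.full(lat.size, -0.5)
global_mean = area_weighted_mean(uniform, lat)
arctic_mean = area_weighted_mean(uniform, lat, lat_min=67.0)

# Using latitude itself as the field shows the effect of the weighting:
# lower-latitude bands of the cap count for more than bands near the pole.
cap_centroid = area_weighted_mean(lat, lat, lat_min=67.0)
```

The same function serves for the Antarctic cap by passing `lat_max=-67.0`.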
 The temperature trend results polewards of 67°S (Figure 3d) indicate that significant tropospheric warming is absent from most models in contrast to the Arctic. This is likely due to the radiative-dynamical effect of the ozone hole [Thompson and Solomon, 2002]. For most models, though, the Antarctic temperature trend is similar to the global average, but with enhanced cooling near 100 hPa. The models agree with radiosonde observations in the troposphere, with near zero trends, but in the lower stratosphere there is a large divergence of results, probably due to the differences in the simulated ozone holes which varied in area and depth in the different models. At the altitude of peak cooling, several models appear to underpredict its magnitude, but there are large uncertainties in the rate in both models and observations.
4. Discussion and Conclusion
We have examined temperature variations and trends in a number of coupled chemistry climate model simulations which were commissioned as part of the 2006 Ozone Assessment [World Meteorological Organization, 2007, chap. 5 and 6]. In the lower stratosphere the simulations were generally consistent with previous work, with increases in temperature during volcanic eruptions that vary substantially among models [e.g., Cordero and de Forster, 2006], and with step-like features after the eruptions [Ramaswamy et al., 2006]. There is therefore a need in future simulations to treat aerosols in a more realistic manner, to try to simulate the volcanic impact in better quantitative agreement with observations.
The model globally averaged temperatures agreed better with observations than in previous assessments [e.g., Pawson et al., 2000]. This is likely to be related to a number of factors, including a slightly colder climatology of the observations, a more carefully controlled set of simulations with standardized forcings, as well as genuine improvements in model performance.
Trends in the globally averaged temperature were compared with corrected Stratospheric Sounding Unit data [Shine et al., 2008; Randel et al., 2009]. Discrepancies with observations, previously noted for those models which included ozone trends [Shine et al., 2003], have been reduced. This has occurred primarily because of corrections in the observed temperatures, leading to reduced trends in the lower mesosphere and enhanced trends in the stratosphere.
 Differences in model formulation and in the simulation of ozone trends likely contribute to the spread in calculated lower stratospheric temperature trends. However, the Arctic is especially prone to variability in the dynamics and this natural variability is reflected in model performance, as well as observed trends.
 We would like to thank Dan Schwarzkopf and Stuart Freidenreich (GFDL) for comments on the manuscript prior to submission. An anonymous reviewer and Markus Rex are thanked for helping to improve the paper. The European Centre for Medium Range Weather Forecasts supplied data, and Ingo Wohltman is thanked for providing additional processing. GEOSCCM data were kindly supplied via the CCMVal project.