We examine the annular mode within each hemisphere (defined here as the leading empirical orthogonal function and principal component of hemispheric sea level pressure) as simulated by the Intergovernmental Panel on Climate Change Fourth Assessment Report ensembles of coupled ocean-atmosphere models. The simulated annular patterns exhibit a high spatial correlation with the observed patterns during the late 20th century, though the mode represents too large a percentage of total temporal variability within each hemisphere. In response to increasing concentrations of greenhouse gases and tropospheric sulfate aerosols, the multimodel average exhibits a positive annular trend in both hemispheres, with decreasing sea level pressure (SLP) over the pole and a compensating increase in midlatitudes. In the Northern Hemisphere, the trend agrees in sign but is of smaller amplitude than that observed during recent decades. In the Southern Hemisphere, decreasing stratospheric ozone causes an additional reduction in Antarctic surface pressure during the latter half of the 20th century. While annular trends in the multimodel average are positive, individual model trends vary widely. Not all models predict a decrease in high-latitude SLP, although no model exhibits an increase. As a test of the models' annular sensitivity, the response to volcanic aerosols in the stratosphere is calculated during the winter following five major tropical eruptions. The observed response exhibits coupling between stratospheric anomalies and annular variations at the surface, similar to the coupling between these levels simulated elsewhere by models in response to increasing GHG concentration. The multimodel average is of the correct sign but significantly smaller in magnitude than the observed annular anomaly. This suggests that the models underestimate the coupling of stratospheric changes to annular variations at the surface and may not simulate the full response to increasing GHGs.
 Despite the decrease of Arctic sea level pressure during recent decades, it is difficult to distinguish any anthropogenic contribution in the limited instrumental record [Wunsch, 1999]. Recent studies have attempted to identify the effect of increasing GHG concentrations by averaging annular variability simulated by a large number of general circulation models or GCMs [Osborn, 2004; Rauthe et al., 2004; Kuzmina et al., 2005]. Averaging over multiple simulations reduces the prominence of unforced variability, which by definition is uncorrelated from one simulation to the next. This allows forced variations to be identified with a greater degree of confidence. Models consistently exhibit a decrease in Antarctic surface pressure in response to decreasing concentrations of stratospheric ozone [Kindem and Christiansen, 2001; Sexton, 2001; Gillett and Thompson, 2003; Shindell and Schmidt, 2004] and increasing concentrations of GHGs [Fyfe et al., 1999; Kushner et al., 2001; Cai et al., 2003; Rauthe et al., 2004]. In contrast, the Arctic response to greenhouse gases varies according to the model. Shindell et al.  simulated a NH annular trend of observed magnitude in a model with a higher top and better stratospheric resolution than was typical among climate models. Smaller trends, but of the observed sign, are exhibited by models with more limited stratospheric domains [Fyfe et al., 1999; Gillett et al., 2002; Rauthe et al., 2004]. NH annular trends are further reduced by forcing from anthropogenic sulfate aerosols [Rauthe et al., 2004]. Observed annular trends (or trends in the highly correlated North Atlantic Oscillation or NAO) exceed most multidecadal trends simulated by a collection of unforced models [Osborn, 2004; Kuzmina et al., 2005; Gillett, 2005]. This suggests that at least part of the recent observed changes are forced, although the limited duration of the observational record makes it difficult to assess the magnitude of the models' unforced variability.
 Recently, climate modeling groups worldwide produced an unprecedented number of simulations of climate change from the late 19th century through the end of the 21st century as part of the Fourth Assessment Report (AR4) by the Intergovernmental Panel on Climate Change (IPCC). Compared with previous versions, many of these models contain improved physical parameterizations, increased resolution and extend higher into the stratosphere. In this article, we examine those models with multiple realizations of 20th century climate to identify the forced component of observed annular variations. We focus upon the annular pattern because of its prominent contribution to SLP trends observed during recent decades and its potential importance to the future response to forcing. Here we define the annular pattern as the leading empirical orthogonal function (EOF) of hemispheric SLP. An index of annular variability is given by the corresponding principal component (PC), where a positive index indicates anomalously low SLP in the polar region compensated by an increase in midlatitudes. This is a conventional definition of the annular “mode” [Thompson and Wallace, 2000], although the correspondence of EOFs to the dynamical modes of climate generally cannot be made precise [North, 1984].
 By averaging simulations from different models, common variability emerges that is unlikely to arise from unforced fluctuations. We limit our analysis to models with multiple realizations of 20th century climate to increase the prominence of forced changes in the individual models. Models are also required to have simulations extending to the end of the 21st century, when the forcing is larger and the response potentially easier to detect. These criteria make available 57 simulations of observed 20th century climate from 14 models, with 40 simulations spanning the 21st century. All models are forced by changing concentrations of greenhouse gases and sulfate aerosols. While forcing by volcanic aerosols and stratospheric ozone changes is present only in certain models, this contrast can be used to our advantage to isolate the corresponding response. We contrast the SH annular response between models that include and omit the anthropogenic decrease of stratospheric ozone. We also compare the response in models containing volcanic aerosol forcing to the observed reduction in Arctic sea level pressure (SLP) during the winter following a tropical eruption, as a test of the models' NH annular sensitivity.
 We list the models used to simulate observed and projected climate variability in section 2, while summarizing the imposed forcing. In section 3 we evaluate the spatial structure of the annular pattern simulated by each model in comparison to observations. Observed time variations of the magnitude of this pattern are compared in section 4 to ensemble averages to identify variations forced by increasing GHGs and tropospheric sulfate aerosols, along with changes in volcanic forcing and stratospheric ozone. Our conclusions are presented in section 5.
 We analyze all available IPCC AR4 coupled general circulation models (CGCMs) that have multiple realizations for the 20th century experiments, with at least one simulation continuing to the end of the 21st century, where forcing is prescribed according to the IPCC A1B scenario, as described below. This limits our survey to 14 of the 23 models that were available in October 2005, but restriction to models with multimember ensembles yields a large total ensemble size and reduces the effect of different physical representations between the models. These 14 multirealization models correspond to 57 out of 75 available realizations of 20th century climate within the IPCC AR4 archive of model output. Each model is listed in Table 1 along with a reference for more complete documentation. Model resolution and ensemble number are listed in Table 2, which also lists the first year of each simulation. Note that we use output from the updated version of the NASA Goddard Institute for Space Studies (GISS) ModelE-H (“Run B” as described by Sun and Bleck), which includes a more accurate calculation of ocean temperature and corrected trends of stratospheric ozone.
Horizontal resolution is given as latitude by longitude and is approximate for spectral models, where “T” refers to triangular truncation.
NOAA GFDL CM2.0: 2° × 2.5°
NOAA GFDL CM2.1: 2° × 2.5°
NASA GISS ModelE-H: 4° × 5°
NASA GISS ModelE-R: 4° × 5°
NASA GISS Russell AOGCM: 3° × 4°
Remaining nine models (names not preserved), in table order: T85 (1.4° × 1.4°); T42 (2.8° × 2.8°); T47 (3.75° × 3.75°); T42 (2.8° × 2.8°); T42 (2.8° × 2.8°); T42 (2.8° × 3.0°); T63 (1.9° × 1.9°); 1.25° × 1.875°; 2.5° × 3.75°
 Although the 14 coupled GCMs are distinct, certain models are closely related. The two CGCMs developed by the NOAA Geophysical Fluid Dynamics Laboratory (GFDL), CM2.0 and CM2.1, are distinguished by the Lin and Rood dynamical core in the latter [Lin and Rood, 1996; Lin, 2004] but are otherwise identical. While the NASA GISS coupled models, ModelE-H and ModelE-R, are distinguished by the University of Miami Hybrid Coordinate Ocean Model (HYCOM) and Russell ocean GCM, respectively, [Sun and Hansen, 2002; Russell et al., 1995], both share the same ModelE AGCM [Schmidt et al., 2006]. Other pairs of models from the NCAR and UKMO modeling centers have more extensive differences but share certain physical parameterizations, reflecting their common development.
 All models contain forcing by greenhouse gases and tropospheric sulfate aerosols. While GHG concentration is based upon measurements, sulfate forcing is calculated by offline models constrained by estimated emission of chemical precursors and is not uniform between the ensembles. In addition, many models contain stratospheric forcing by ozone and volcanic aerosols (Table 3). Where model documentation is incomplete, the presence of volcanic forcing is inferred from increases in the model stratospheric temperature or top of the atmosphere reflected solar radiation [Santer et al., 2005; Stenchikov et al., 2005]. Volcanic forcing for most models is calculated by specifying the aerosol optical depth and effective radius as functions of latitude and height [Stenchikov et al., 2005]. The exception is the Meteorological Research Institute (MRI) CGCM2, where volcanic forcing is represented as a temporary reduction of the solar constant. In this case, radiative heating of the stratospheric aerosol layer is absent. This heating refracts planetary waves away from the stratospheric polar vortex, increasing its stability, which induces a positive annular response at the surface corresponding to strengthened midlatitude westerlies [Shindell et al., 2001]. Consequently, the annular response to volcanic forcing is expected to be different in the MRI CGCM2, compared to models forced by a height and latitude dependent optical depth.
Table 3. Model Forcing
For the GISS ModelE-R, stratospheric ozone forcing is 5/9 of the observed value.
 Stratospheric ozone is prescribed with a seasonal cycle. Certain models additionally prescribe the concentration decrease observed beginning in the latter decades of the 20th century using radiosondes, along with satellite retrievals and surface measurements of column amount. During the 21st century, these models assume that stratospheric ozone is either held constant at its present day value or begins a slow recovery toward preindustrial values due to the reduction of anthropogenic halogens (Table 3).
 For a few models, some forcings were applied erroneously. Volcanic forcing in the MIROC Medium Resolution simulations is largest at 50 hPa (Table 3), instead of peaking closer to the tropopause as in other data sets [Hansen et al., 2002; Ammann et al., 2003]. This may reduce equatorward refraction of planetary waves by the volcanic aerosols, decreasing the surface annular response to the eruption. In the GISS ModelE-R ensemble, stratospheric ozone loss between 1979 and 2001 is five-ninths of the observed value (J. E. Hansen et al., Dangerous human-made interference with climate: A GISS modelE study, submitted to Journal of Geophysical Research, 2006). Prescribed loss in the UKMO HadCM3 model is as large as twice the observed value during certain seasons [Gillett and Thompson, 2003].
 All models begin integration during the latter half of the 19th century, while simulating the entire 20th century. Because forced annular variations are often obscured by natural fluctuations during this period, we continue our analysis into the 21st century when the forcing is larger. Greenhouse gas concentrations are taken from the IPCC Special Report on Emission Scenarios A1B scenario [Nakićenović et al., 2000], whereby the current observed increase of atmospheric CO2 of roughly 2 ppm per year doubles by 2050 and declines afterward. Despite this declining rate, CO2 concentration itself rises throughout the 21st century, exceeding twice the preindustrial concentration around 2050 [Hansen and Sato, 2004]. Sulfur dioxide emissions are prescribed by the scenario to fall sharply after 2020 to less than half the current value by the end of the century. Sulfate aerosol forcing, although based upon SO2 emission, is not specified by the scenario and varies among the models. The A1B scenario was chosen because it corresponds to the largest number of simulations of 21st century climate but is not meant as a prediction of future atmospheric composition. Among the 14 models, forty A1B simulations are available for the 21st century.
3. Simulation of Annular Spatial Structure
 Changes between early 20th (1900–1949) and late 21st (2070–2099) century SLP are summarized in Figure 1 by the hemispheric-mean squared difference for this period. Darker shading corresponds to calendar months exhibiting the largest trend over the two centuries. The models identify a consistent season exhibiting the largest change. For the Northern Hemisphere (NH), the trend is largest during boreal winter, roughly between October and March. Trends are exhibited throughout a greater portion of the year in the Southern Hemisphere (SH), peaking during the early austral summer (November through February), with a secondary maximum in winter (May through July).
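The summary statistic described above can be sketched as follows. This is an illustration only, not the authors' code: it assumes the two climatologies are stored as NumPy arrays of shape (12, gridpoints), uses cos(latitude) area weights (the text does not state the weighting), and the function name `monthly_mean_square_change` is hypothetical.

```python
import numpy as np

def monthly_mean_square_change(early, late, lat_deg):
    """Hemispheric-mean squared SLP difference for each calendar month.

    early, late: (12, n_points) SLP climatologies for the two periods.
    lat_deg: (n_points,) latitudes; cos(latitude) area weights are assumed.
    """
    w = np.cos(np.deg2rad(lat_deg))
    w = w / w.sum()                       # weights sum to one
    return np.sum(w * (late - early) ** 2, axis=1)
```

The calendar months with the largest values of the returned array would correspond to the darkest shading in a figure of this kind.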
 We analyze annular variability during months showing the largest centennial change of SLP, to see whether the annular pattern continues to dominate the trend of total SLP during the 21st century as observed during recent decades [Thompson et al., 2000]. The annular pattern is defined here as the leading empirical orthogonal function (EOF) of SLP. For each realization of a model ensemble, EOFs are constructed from monthly averaged anomalies between October and March in the NH and November through February in the SH. These months are additionally chosen for their strong coupling of annular variability between the surface and the lower stratosphere [Thompson and Wallace, 1998; Thompson and Solomon, 2002]. This coupling is believed to be important to forcing of annular variability by greenhouse gases, volcanic aerosols, and stratospheric ozone [Shindell et al., 1999, 2001; Gillett and Thompson, 2003]. To reduce the influence of forcing upon the annular pattern, which is not applied uniformly among the models, we subtract a linear trend from the observations and from each ensemble member prior to computing the EOFs. This trend is computed separately for each month. EOFs are constructed for the second half of the 20th century, from 1950 through 1999, when widespread observations for comparison to the models are comparatively routine. Model EOFs are compared to those derived from reanalyses by the National Centers for Environmental Prediction (NCEP) [Kalnay et al., 1996]. The reanalyses are convenient but suspect because gaps in the observations are filled by the reanalysis model. However, because of its hemispheric scale, NH annular behavior is well-sampled and similar in both NCEP and the Trenberth and Paulino compilation of SLP measurements; annular principal components calculated from the two data sets are correlated at greater than 0.95 during NH winter within our analysis period. In the SH, where observations are comparatively sparse, decadal variations of the annular pattern constructed from NCEP SLP are highly correlated with reconstructions based upon station measurements [Jones and Widmann, 2004].
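The EOF construction described above can be sketched with a singular value decomposition. The sketch rests on several assumptions that are not taken from the text: SLP is arranged as a NumPy array of shape (years, months, gridpoints), the area weighting is the square root of cos(latitude), the PC is scaled to unit standard deviation, and the function names `detrend_by_month` and `leading_eof` are hypothetical.

```python
import numpy as np

def detrend_by_month(slp):
    """Remove the mean and a least-squares linear trend along the year axis.

    slp has shape (n_years, n_months, n_points); the trend is fitted
    separately for every calendar month and grid point.
    """
    n_years = slp.shape[0]
    t = np.arange(n_years) - (n_years - 1) / 2.0       # centered time axis
    anomalies = slp - slp.mean(axis=0)                  # remove each month's mean
    slope = np.tensordot(t, anomalies, axes=(0, 0)) / np.sum(t ** 2)
    return anomalies - t[:, None, None] * slope

def leading_eof(slp, lat_deg):
    """Return (eof, pc, variance_fraction) for the leading EOF.

    lat_deg has shape (n_points,) and must exclude the pole itself,
    where the assumed weight vanishes. The PC has unit standard
    deviation, so the EOF carries the amplitude in the units of slp.
    """
    anomalies = detrend_by_month(slp)
    n_years, n_months, n_points = anomalies.shape
    x = anomalies.reshape(n_years * n_months, n_points)
    w = np.sqrt(np.cos(np.deg2rad(lat_deg)))            # assumed area weighting
    u, s, vt = np.linalg.svd(x * w, full_matrices=False)
    variance_fraction = s[0] ** 2 / np.sum(s ** 2)
    n_samples = x.shape[0]
    pc = u[:, 0] * np.sqrt(n_samples)                   # unit-variance PC
    eof = vt[0] * s[0] / np.sqrt(n_samples) / w         # back to physical units
    return eof, pc, variance_fraction
```

The sign of an EOF returned by an SVD is arbitrary, so in practice it would be flipped where necessary so that a positive index corresponds to low polar SLP. The returned variance fraction is the kind of quantity compared between models and observations when asking how much variability the annular pattern represents.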
 The variance of SLP, along with the percentage represented by the annular pattern, is shown in Figure 2. For both the model ensemble members and observations, the variance and annular contribution are calculated using monthly averaged anomalies of SLP (relative to the annual cycle), averaged over each hemisphere and over the years and season used to construct the EOFs. For the SH, for example, the temporal average consists of the months November through February between January 1950 and December 1999. For each model, the variance and annular contribution are averaged over all ensemble members. For the NH (Figure 2a), the models generally simulate the observed variance but organize too much of this variability into the annular mode. SH annular variations similarly represent too large a fraction of the hemispheric variability in SLP (Figure 2b), although in contrast to the NH, more models overestimate the observed variance. Note that much of this variance is probably unforced because we have subtracted a multidecadal linear trend prior to analysis.
 For each realization, we scale the EOF so that the corresponding principal component (PC) has unit amplitude for the analysis period (1950–1999). Then, the amplitude of annular variability is indicated by the EOF, for which we form the ensemble mean. To measure the quality of each model's simulated annular pattern, we correlate the spatial variation of the ensemble mean and observed EOF. We also compute the hemispheric standard deviation of these two quantities and plot their ratio on a “Taylor” diagram in Figure 3 [Taylor, 2001]. In this diagram, radial distance from the origin indicates the EOF standard deviation with respect to spatial variations, normalized by the observed value, while the angle from the horizontal axis represents the inverse cosine of the correlation coefficient. A model in perfect agreement with the observations would be plotted on the horizontal axis with a normalized standard deviation of unity. The root-mean-square (rms) error for each model EOF (normalized by the hemispheric standard deviation of the observed pattern) increases with distance from this point.
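The three quantities placed on the diagram can be sketched as below. The sketch assumes that the rms error is the centered pattern error of Taylor [2001] and that the spatial statistics are area weighted; the function names `taylor_statistics` and `taylor_xy` and the choice of weights are illustrative assumptions.

```python
import numpy as np

def taylor_statistics(model_eof, obs_eof, weights):
    """Pattern statistics comparing a model EOF with the observed EOF.

    weights are non-negative area weights (for example cos(latitude)).
    Returns (correlation, normalized_std, normalized_centered_rms).
    """
    w = weights / weights.sum()
    m = model_eof - np.sum(w * model_eof)       # remove area-weighted means
    o = obs_eof - np.sum(w * obs_eof)
    sigma_m = np.sqrt(np.sum(w * m ** 2))
    sigma_o = np.sqrt(np.sum(w * o ** 2))
    correlation = np.sum(w * m * o) / (sigma_m * sigma_o)
    rms = np.sqrt(np.sum(w * (m - o) ** 2))
    return correlation, sigma_m / sigma_o, rms / sigma_o

def taylor_xy(correlation, normalized_std):
    """Plot coordinates: radius = normalized std, angle = arccos(correlation),
    measured from the axis on which a perfect model lies."""
    angle = np.arccos(correlation)
    return normalized_std * np.cos(angle), normalized_std * np.sin(angle)
```

With these definitions the normalized rms error E, normalized standard deviation s, and correlation R satisfy E² = 1 + s² − 2sR, so E equals the distance on the diagram between a model's point and the point (1, 0) representing perfect agreement.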
 The performance of models with similar rms errors may be statistically indistinguishable if the rms errors of the individual members of each model ensemble do not cluster tightly around the rms error of each model's ensemble mean. To assess whether the performance of two models is significantly different, we calculate the standard deviation of the rms errors for the members of each ensemble. Some model ensembles have as few as two members, so the standard deviation itself is highly uncertain. To get a more robust estimate, we compute the anomalous rms error of each ensemble member relative to the error of its ensemble mean. We then compute a standard deviation by combining the anomalies from all 14 models for a total of 57 anomalies. While this combination assumes that each model has the same standard deviation, it allows the latter quantity to be estimated with a larger number of samples and is therefore more robust. The standard deviation of the normalized rms error is 0.05, which can be combined with the number of ensemble members to draw confidence intervals for each model's rms error. Similarly, standard deviations of the correlation coefficient and normalized spatial standard deviation are 0.02 and 0.05.
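The pooling step can be sketched as follows. The function names, the dictionary layout, and the choice to divide by the total number of anomalies are assumptions made for illustration; dividing the pooled value by the square root of the ensemble size is one plausible way to combine it with the number of members, not necessarily the authors' procedure.

```python
import math

def pooled_rms_error_spread(member_errors, ensemble_mean_errors):
    """Pooled standard deviation of member rms errors about each
    model's ensemble-mean rms error.

    member_errors: dict mapping model name -> list of rms errors,
    one per ensemble member.
    ensemble_mean_errors: dict mapping model name -> rms error of that
    model's ensemble-mean EOF.
    """
    anomalies = [
        err - ensemble_mean_errors[name]
        for name, errors in member_errors.items()
        for err in errors
    ]
    return math.sqrt(sum(a * a for a in anomalies) / len(anomalies))

def standard_error(pooled_std, n_members):
    """Assumed uncertainty of a model's rms error given its ensemble size."""
    return pooled_std / math.sqrt(n_members)
```

Under this reading, a pooled value of 0.05 and a four-member ensemble would give an uncertainty of 0.025 in normalized rms error.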
 In the SH (Figure 3a), the correlation is near 0.95 for all the models, indicating that geographic variations in the SH annular pattern are well-simulated. This is also shown by the ensemble mean EOF for each model (Figure 4). The model disagreement is mainly in terms of EOF amplitude, whose standard deviation ranges from 0.9 to 1.6 times the observed value. In the NH (Figure 3b), there is a larger range of correlation. This may reflect the larger stationary wave forcing by topography and thermal contrasts in this hemisphere, which creates zonal variations in the circulation, and is a correspondingly greater challenge to simulate. During the positive phase of the annular mode, reduced pressure over the Arctic is contrasted in both the observations and models by midlatitude increases over the Atlantic and Pacific Oceans [Thompson and Wallace, 1998]. However, the annular pattern in eight of the 14 models has higher SLP over the Pacific, in contrast to the observed Atlantic maximum (Figure 5). Note that the EOF is different between GISS ModelE-H and ModelE-R despite their identical AGCMs. This suggests that the ocean not only contributes to time variations in the annular mode [Delworth et al., 1993] but can influence the spatial pattern of variability through longitudinal variations of thermal forcing, for example.
 There is little relation between model performance in one hemisphere versus the other. Neither the linear nor the Spearman rank correlation of the models' rms error between hemispheres is significantly different from zero. Despite this, a few models such as the UKMO HadGEM1 have small rms errors in both hemispheres. In addition, horizontal resolution seems to have little influence upon model error. The GISS ModelE-R performs well in both hemispheres, despite its comparatively low AGCM resolution, though North Atlantic storms travel farther east, in better agreement with the observations, when resolution is doubled [Schmidt et al., 2006].
4. Forced Variability
 We project the change in SLP between the early 20th and late 21st centuries (Figure 1) onto the four leading EOFs of each model in Figure 6, which shows the percentage contribution of each EOF to the hemisphere-mean square difference. For most models, the change in SH SLP consists almost entirely of changes to the amplitude of the annular pattern. Other EOFs make a greater contribution to the change in NH SLP, although the annular fraction exceeds 40% in both GFDL models, along with NCAR CCSM3, MIROC, and GISS ModelE-H.
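The projection can be sketched as follows, assuming that the EOFs are mutually orthogonal under the area weights so that the contributions of individual EOFs are additive. The function name `eof_contributions` and the array layout are hypothetical.

```python
import numpy as np

def eof_contributions(delta_slp, eofs, weights):
    """Percentage of the area-mean squared SLP change captured by each EOF.

    delta_slp: (n_points,) change in SLP between the two periods.
    eofs: (n_modes, n_points) EOFs, assumed orthogonal under the weights.
    weights: (n_points,) non-negative area weights.
    """
    w = weights / weights.sum()
    total = np.sum(w * delta_slp ** 2)            # hemisphere-mean square change
    norms = np.sum(w * eofs ** 2, axis=1)         # weighted squared norm of each EOF
    coeffs = (eofs * w) @ delta_slp / norms       # projection coefficients
    return 100.0 * coeffs ** 2 * norms / total
```

An annular fraction above 40% would then mean that the first element of the returned array exceeds 40.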
 The NH annular index, defined as the principal component (PC) of the annular pattern, is plotted in Figure 7 as a wintertime average from October through March over the duration of each model integration. The PC is computed as an area-weighted projection of the annular EOF for each model with ensemble mean SLP during these months. The EOF is constructed from detrended monthly anomalies of ensemble mean SLP between October and March during the period 1950 to 1999. The EOF is normalized to have unit amplitude averaged poleward of 60°N so that the PC indicates the negative of the annular SLP anomaly in hPa averaged over this same region. Finally, the principal component is defined as an anomaly relative to the period between 1900 and 1970, a period chosen because it is prior to any systematic model trends. For comparison, the annular PC is shown for the HadSLP1 compilation of SLP observations. This is an update of the Global Mean Sea Level Pressure 2 data set [Basnett and Parker, 1997] and is composed of both land and sea observations supplemented by optimal interpolation in regions lacking data. We choose this SLP data set because it extends across five major volcanic eruptions, whose response we analyze below. During the second half of the 20th century, the annular PC calculated from October to March using HadSLP1 has roughly three-quarters of the amplitude of the NCEP annular PC, and a correlation exceeding 0.8.
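The normalization and projection described above can be sketched as follows. The array shapes, the use of cos(latitude)-type area weights, and the function names `normalize_eof_polar` and `annular_index` are assumptions for illustration.

```python
import numpy as np

def normalize_eof_polar(eof, lat_deg, weights, lat_min=60.0):
    """Scale the EOF so its area mean poleward of lat_min equals -1 hPa.

    With this sign, a positive index corresponds to anomalously low
    polar SLP, and the index is in hPa.
    """
    cap = lat_deg >= lat_min
    polar_mean = np.sum(weights[cap] * eof[cap]) / np.sum(weights[cap])
    return -eof / polar_mean

def annular_index(slp, eof, weights, years, base=(1900, 1970)):
    """Area-weighted projection of winter-mean SLP onto the EOF.

    slp: (n_years, n_points) winter-mean SLP. The index is returned as
    an anomaly relative to its mean over the base period (inclusive).
    """
    pc = (slp * weights) @ eof / np.sum(weights * eof ** 2)
    in_base = (years >= base[0]) & (years <= base[1])
    return pc - pc[in_base].mean()
```

Because the EOF averages to −1 hPa over the polar cap, an index of 2 corresponds to a polar-cap SLP anomaly of −2 hPa associated with the annular pattern.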
 A weighted average of the model PCs is shown as a thin red line in Figure 8, where the weights are the number of realizations in each model's ensemble. The thick red line shows a 10-year low-pass filtered version, while gray shading marks the central 95% of the normal distribution inferred from variations among the individual models during any given year. (Time variations of the shaded area are low-pass filtered.) The black line shows the filtered principal component derived from the HadSLP1 observations with zero average between 1900 and 1970. During most of the 20th century, the multiensemble average shows little variability because the substantial decadal variations exhibited by each model in Figure 7 are uncorrelated. The multimodel annular index begins to increase near the end of the 20th century, but only a few decades into the 21st century does the multiensemble average rise above the interensemble variability of the previous century, indicating reduced Arctic SLP with a compensating increase in midlatitudes. This multiensemble average trend is consistent in sign with that observed during recent decades, although of greatly reduced amplitude. The largest trend is exhibited by GISS ModelE-H and MIROC (Figures 7c and 7j). These are the models where the annular pattern contributes the largest percentage to the change in total SLP since the early 20th century (Figure 6a).
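The averaging and smoothing can be sketched as below. The boxcar running mean is an assumed stand-in for the 10-year low-pass filter, whose exact form is not given in the text, and the factor 1.96 reflects the central 95% of a normal distribution; all function names are hypothetical.

```python
import numpy as np

def multimodel_mean(model_pcs, n_realizations):
    """Average model PCs, weighting each model by its ensemble size.

    model_pcs: (n_models, n_years); n_realizations: (n_models,).
    """
    return np.average(model_pcs, axis=0, weights=n_realizations)

def running_mean(series, window=10):
    """Boxcar low-pass filter; returns len(series) - window + 1 values."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

def central_95_half_width(model_pcs):
    """Half-width of the central 95% of a normal distribution fitted to
    the intermodel spread in each year."""
    return 1.96 * model_pcs.std(axis=0, ddof=1)
```

Weighting by ensemble size gives every realization equal weight in the multimodel average, so models with many members influence the result more than models with two.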
 Using a previous version of the GISS AGCM, Shindell et al. showed that increasing GHG concentrations forced an upward annular trend in a model extending high enough to represent the stratospheric circulation but not in a model with limited stratospheric resolution. The ModelE AGCM has the highest top among the models in this study (Table 2). However, a comparable trend is produced by the MIROC model, whose top is substantially lower.
 Substantial multidecadal variations are superimposed on an upward annular trend in some models, including both GFDL models, HadGEM1, HadCM3, and the Canadian models (Figures 7a, 7b, 7h, 7i, and 7l). Note that HadCM3 exhibited a negative annular trend in a previous intercomparison [Rauthe et al., 2004]. Although the models disagree widely as to the extent of the decrease of Arctic SLP in response to increasing concentrations of GHGs and tropospheric sulfate aerosols, no model exhibits a trend toward higher Arctic SLP.
 We use volcanic forcing as an additional assessment of each model's annular sensitivity. Volcanic aerosol forcing is included in many of the models, as listed in Table 3. While the stratospheric aerosols released by volcanos reflect sunlight and cool the surface, absorption of infrared radiation within the aerosol layer heats the stratosphere and increases equatorward refraction of planetary waves propagating up from the troposphere. This deflection increases the stability of the stratospheric polar night jet, which projects positively onto the annular pattern of SLP. During the winter following major volcanic eruptions, Northern Eurasia warms despite the reduction in incident sunlight at the surface beneath the aerosol layer [Robock and Mao, 1992]. This warming is caused by circulation changes associated with a decrease in Arctic pressure and a positive NH annular anomaly [Groisman, 1992; Robock and Mao, 1995; Perlwitz and Graf, 1995]. This effect is not consistently large enough to exceed natural fluctuations after each eruption but emerges only after averaging over a large number of eruptions. Winter warming in response to volcanic eruptions has also been demonstrated by models that resolve stratospheric dynamics [Shindell et al., 2001; Stenchikov et al., 2002; Shindell et al., 2004].
 We calculate the NH annular response for the winter following the eruption of five low-latitude volcanos beginning with Krakatoa in 1883. These five are listed in Table 4, chosen because their inferred optical depths are the largest within the simulation period [Sato et al., 1993; Ammann et al., 2003]. The blue triangles in Figure 8 show the annular index during the winter following each eruption for models with volcanic forcing (see also Figure 7). HadGEM1 is included in this group of models, although volcanic forcing is present in only one of its two ensemble members. The winter response is averaged from October to the following March. The model annular PCs are generally positive following the Krakatoa and El Chichón eruptions, acting to reduce Arctic SLP. However, the average response (marked by red triangles) is effectively zero for the other eruptions, even for Pinatubo where the radiative forcing is comparably large. In contrast, the observed annular PC (marked by black triangles) is positive following all five eruptions and large compared to the model average.
The month that zonal-average column optical thickness first exceeds 0.1, according to Sato et al.
 The response averaged over the five eruptions is shown for the observations and each model in Figure 9, along with uncertainty according to a two-sided Student's t test at the 95% confidence level. The winter average in this figure is constructed from December through February when the response is slightly larger, although our conclusions are unchanged if the average comprises October through March. The observed response following Pinatubo may be exaggerated by a temporary increase of the annular PC in the decade prior to the eruption, and in general, a spurious volcanic response can result from unforced decadal variations. To reduce this effect, we define the postvolcanic annular anomaly by subtracting a “nonvolcanic” decadal average consisting of the 5 years before and after the eruption. In forming this 10-year average, we exclude both the first winter following the eruption and the second, when a smaller annular response is detected [Shindell et al., 2004]. (Our conclusions are unchanged if we subtract a 20-year average consisting of a decade before and after the eruption.)
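The anomaly definition can be sketched as follows. The exact indexing of the reference winters is an assumption: here the reference mean uses the five winters before the eruption and the five winters that follow the two excluded post-eruption winters, which gives ten values. The default critical value 2.776 is the two-sided 95% Student's t value for four degrees of freedom, appropriate for five eruptions. Function names are hypothetical.

```python
import math
import statistics

def postvolcanic_anomaly(index, winter, n_years=5, n_skip=2):
    """Annular anomaly in the first post-eruption winter.

    index: dict mapping winter year -> seasonal-mean annular index.
    winter: year label of the first winter after the eruption.
    The reference mean uses n_years winters before the eruption and
    n_years winters after it, skipping the first n_skip post-eruption
    winters.
    """
    before = [index[winter - k] for k in range(1, n_years + 1)]
    after = [index[winter + n_skip + k] for k in range(n_years)]
    return index[winter] - statistics.fmean(before + after)

def mean_response(anomalies, t_crit=2.776):
    """Mean over eruptions with a two-sided 95% half-width.

    t_crit defaults to the value for 4 degrees of freedom (five
    eruptions); pass another value for other sample sizes.
    """
    mean = statistics.fmean(anomalies)
    half_width = t_crit * statistics.stdev(anomalies) / math.sqrt(len(anomalies))
    return mean, half_width
```

A mean response would be called statistically distinct from zero in this sketch when its magnitude exceeds the returned half-width.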
Figure 9 shows that the average observed response is positive, corresponding to an increased meridional pressure gradient and stronger westerlies, with greater onshore advection to warm northern Eurasia. The observed response is statistically distinct from zero. The mean response among the models including volcanic forcing is also positive and statistically distinct from zero but is smaller than and statistically distinct from the observed value. In fact, the annular response in models without volcanic forcing is only slightly smaller. The annular response of the volcanic models as a group is insufficiently sensitive to this forcing. One interpretation is that the models underestimate the coupling inferred from observations between volcanic forcing, stratospheric wave refraction, and the circulation at the surface. This coupling contributes to the annular response to increasing GHG concentrations [Shindell et al., 1999], suggesting that the multimodel annular trend (Figure 8) is also underestimated.
 The individual response among models that include volcanic forcing is generally positive, with the exception of the NCAR PCM1, consistent with the average Arctic SLP in the winter following an eruption, as diagnosed by Stenchikov et al. (For the GISS ModelE-H, the response in the present study is nearly zero, in contrast to the strong, positive response exhibited by the first version of this simulation archived by the IPCC and analyzed by Stenchikov et al.) However, for some models, the response varies so greatly among the five eruptions that the average is consistent with both the observed anomaly and zero. This precludes us from identifying which of the individual models are insufficiently sensitive to volcanic forcing in the stratosphere because a realistic sensitivity may be obscured by unforced model variability. Note that the error bars in Figure 9 are calculated based upon variations of the model ensemble average from eruption to eruption. The standard deviation of the individual ensemble members, which is appropriate for comparison to the observed variability, is larger by roughly a factor of the square root of the number of ensemble members, compared to the standard deviation used to calculate the error bars in Figure 9. For example, for the GISS ModelE-R and NCAR CCSM3, with eight and nine ensemble members, respectively, the standard deviation of the individual ensemble members is roughly three times the observed value. Given the small number of eruptions used to calculate the observed variability, this difference is not statistically significant. However, it demonstrates that the observed annular response following an eruption is substantially more consistent than simulated by the models. This may be related to the excessive SLP variability and annular fraction exhibited by the models (Figure 2).
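The square-root factor quoted above follows from the standard result for the mean of independent samples. As a sketch, assume that the responses x_i of the N ensemble members are independent with a common standard deviation (the symbols below are introduced here for illustration):

```latex
\sigma_{\text{mean}}^{2}
  = \operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)
  = \frac{1}{N^{2}}\sum_{i=1}^{N}\operatorname{Var}(x_i)
  = \frac{\sigma_{\text{member}}^{2}}{N}
\quad\Longrightarrow\quad
\sigma_{\text{member}} = \sqrt{N}\,\sigma_{\text{mean}}
```

For N = 8 and N = 9 the factor is about 2.8 and 3, respectively.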
 Variations in forcing among the models contribute to differences in the computed annular trends, despite uniform prescription of greenhouse gas concentrations based upon measurements or the SRES scenario. During the 21st century, when the annular response is largest, radiative forcing by tropospheric sulfate aerosols varies among models, with some containing additional forcing by other tropospheric aerosols, along with solar variability and land use. Many of the forcing differences are small or limited to only a few models and their effect upon annular trends is difficult to identify in the presence of unforced variability. An exception is the effect of changing stratospheric ozone concentrations upon the SH annular mode, which is examined below.
 In the SH, decreasing Antarctic SLP and positive annular trends can be forced either by increasing GHG concentrations or by a reduction in stratospheric ozone [Kindem and Christiansen, 2001; Sexton, 2001; Gillett and Thompson, 2003; Rauthe et al., 2004; Shindell and Schmidt, 2004; Arblaster and Meehl, 2006]. To characterize the SH annular response, we distinguish between models that include and models that omit stratospheric ozone forcing (Figure 10). The former group (denoted by the thin red line) exhibits an annular increase during the late 20th century that closely matches the magnitude of the observed trend calculated from HadSLP1 (black). (The HadSLP1 trend is roughly half of that computed from the NCEP reanalyses between 1950 and 1999 and closer to the annular trend computed directly from station observations by Marshall . Prior to the second half of the 20th century, annular variations within HadSLP1 are based upon comparatively few observations; we use HadSLP1 between 1900 and 1970 only to estimate an “unperturbed” value of the annular index, prior to significant anthropogenic forcing.) Throughout the 21st century, both groups of models exhibit similar positive trends. However, their annular responses are statistically distinct during much of the 21st century, when Antarctic SLP is roughly 2 hPa lower in the model group that includes stratospheric ozone changes. The two sets of models diverge mainly in the late 20th century when ozone trends are largest.
 The individual models exhibit a wide range of annular responses to forcing in the 21st century, as shown in Figure 11. The GISS ModelE and GFDL annular indices become increasingly positive throughout the century (Figures 11a–11d), while the NCAR PCM1 and UKMO HadCM3 responses decline sharply after peaking at the end of the 20th century (Figures 11f and 11i). In part, these reflect different model assumptions about stratospheric ozone in the 21st century (Table 3), which is not prescribed by the IPCC A1B scenario. Models with decreasing stratospheric ozone during the late 20th century assume subsequent recovery [e.g., World Meteorological Organization, 2002], with the exception of the GISS models where the 21st century value is constant. This recovery is expected to offset the positive SH annular trend forced by greenhouse gases [Shindell and Schmidt, 2004]. However, the 21st century reversal of the SH annular trend calculated by the NCAR PCM1 is absent in the NCAR CCSM3 (Figures 11e and 11f), despite identical assumptions of stratospheric ozone and GHG concentration, indicating different model sensitivities to at least one of these forcings.
 Contrasting annular sensitivity to increasing concentration of GHGs is illustrated by the models omitting changes in stratospheric ozone. The prominent annular trends exhibited by the Canadian Climate Center CGCM3.1 and MRI CGCM2 are absent in the Russell GISS and IAP GOALS models (Figure 6b). Variations in model sensitivity are also illustrated by the annular response during early winter (May–July), when trends in total SLP are large (Figure 1), but ozone forcing is small due to the limited sunlight. By coincidence, the models with stratospheric ozone changes are more sensitive to increasing GHG concentrations than models omitting these changes (Figure 12). This gradual 21st century divergence of Antarctic SLP between the two groups of models in response to increasing GHGs is in contrast to the comparatively rapid separation at the end of the 20th century during early summer (Figure 10) that is the result of a decrease in stratospheric ozone. The larger GHG sensitivity among the models with changing stratospheric ozone also accounts for the absence of convergence in the late 21st century between the early summer annular indices of the two model groups (Figure 10), despite the recovery to preindustrial ozone concentration assumed by all but the GISS ModelE.
 We examined annular variations in sea level pressure simulated by 14 coupled model ensembles of 20th and 21st century climate, submitted to the IPCC AR4 model archive. The annular pattern of SLP simulated by the models is highly correlated with spatial variations of the observed pattern during the late 20th century, but the simulated annular variability represents too large a fraction of the total temporal variability within each hemisphere.
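The quantities summarized above (the annular pattern, its spatial correlation with a reference pattern, and the fraction of total variability it represents) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this study: the leading empirical orthogonal function is obtained from a singular value decomposition of the anomaly matrix, area weighting of grid points is omitted for brevity, and the data are synthetic. The name `leading_eof` and all numerical values are hypothetical.

```python
import numpy as np


def leading_eof(slp):
    """Return (pattern, pc, variance_fraction) for the leading EOF.

    slp: array of shape (n_time, n_points) holding sea level pressure.
    The sign of the pattern (and of the PC) is arbitrary.
    Area weighting is omitted here (a simplifying assumption).
    """
    anomalies = slp - slp.mean(axis=0)  # remove the time mean at each point
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pattern = vt[0]                      # leading spatial pattern (EOF)
    pc = u[:, 0] * s[0]                  # leading principal component
    variance_fraction = s[0] ** 2 / np.sum(s ** 2)
    return pattern, pc, variance_fraction


# Synthetic example: one dominant pattern plus weak noise (made-up values).
rng = np.random.default_rng(0)
n_time, n_points = 200, 50
true_pattern = np.cos(np.linspace(0.0, np.pi, n_points))
true_pc = rng.standard_normal(n_time)
noise = 0.5 * rng.standard_normal((n_time, n_points))
slp = 1000.0 + 5.0 * np.outer(true_pc, true_pattern) + noise

pattern, pc, fraction = leading_eof(slp)
# Absolute value removes the arbitrary sign of the EOF.
spatial_corr = abs(np.corrcoef(pattern, true_pattern)[0, 1])
print(f"variance fraction: {fraction:.2f}, spatial correlation: {spatial_corr:.2f}")
```

In this sketch a simulated pattern can correlate highly with a reference pattern while still accounting for a different fraction of the total variance, since the two diagnostics are computed independently.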
 The change in SLP between the early 20th and late 21st centuries exhibits a predominantly annular pattern in both hemispheres of most models. The multimodel average exhibits a positive annular trend in both hemispheres, with decreasing pressure over the poles and a compensating increase in midlatitudes. This trend in the multimodel average is associated with a poleward shift of the storm track in both hemispheres and a strengthening of the upper level westerlies shown by other studies [Yin, 2005; Carril et al., 2005]. The NH annular trend is consistent with the regional change indicated by the multimodel North Atlantic Oscillation [McHugh and Rogers, 2005]. The annular changes in SLP become distinct from intermodel variations during the late 20th century in the SH and early in the 21st century in the NH. In the NH, the trend agrees in sign with the upward trend observed during recent decades but is of smaller amplitude, suggesting that only part of the observed trend is forced [Gillett, 2005].
 The annular trends in both hemispheres result from forcing by increasing concentrations of greenhouse gases and tropospheric sulfate aerosols, while stratospheric ozone changes make an additional contribution to the SH trend. However, there is large variability among models, especially in the NH, and annular trends are absent in some models. A similar wide variation of annular response to greenhouse gas and sulfate forcing was found in previous model comparisons [Osborn, 2004; Rauthe et al., 2004]. While this contrast in annular response could result from differences in assumed forcing (associated with variations in solar irradiance, sulfates, or other aerosols, for example), forcing differences are largest during the 20th century prior to significant annular departures. The exception is forcing by stratospheric ozone changes, but there is large variability of the SH annular response during the 21st century even among models that do not include these changes. Despite large variations in the simulated annular response, no models simulate a trend toward increasing SLP over either pole.
 Contrasting annular trends may result from mechanisms of response that are not uniform among the models. Gillett et al.  concluded that the models as a group are missing mechanisms that contribute to changes in SLP, based upon a comparison of forced changes inferred from the observations and the 20th century simulations considered here. Here we question whether the models include the full range of mechanisms that contribute to annular variability. We find that the mean model annular response to volcanic forcing is of the correct sign but is smaller than and statistically distinct from the observed value. The observed annular response to volcanic forcing is believed to result from the influence of stratospheric anomalies upon the circulation at the surface [Shindell et al., 2001; Stenchikov et al., 2002], similar to the stratospheric role in response to increasing GHG concentration [Shindell et al., 1999]. IPCC models are generally not designed for accurate simulation of the stratospheric circulation, and their weak annular response to volcanic forcing suggests that the models as a group underestimate this coupling. Stenchikov et al.  reach a similar conclusion, based upon the composite spatial response of SLP and other variables. Alternatively, the models may fail to represent a purely tropospheric response to volcanic forcing. Stenchikov et al.  argue that planetary wave generation within the troposphere is reduced by volcanic aerosols. If the models do not reproduce this sensitivity, then their underestimate of the observed annular response to an eruption may not indicate that the feedback between the stratosphere and annular variability is incorrect.
 Simulated annular variability is generally excessive compared to observations (Figure 3). As a result of large unforced variability, no simulated annular response to volcanic forcing is statistically significant, which is surprising given that previous experiments with similar or identical models found a significant response. Many of these studies calculated not the annular response per se but rather the composite spatial anomaly of surface air temperature or SLP and noted a qualitative resemblance to the annular pattern. It is possible that other patterns of variability contribute substantially to the observed and modeled responses to volcanic forcing and that the annular projection we compute is too narrow a measure of stratospheric coupling. In any case, the annular projection of the multimodel average in response to volcanic forcing remains smaller than and statistically distinct from the observed value.
 The unforced variability that obscures the model response to stratospheric forcing by volcanic aerosols prevents us from identifying the models that simulate realistic coupling to the climate at the surface, as well as those aspects of stratospheric physics that must be simulated. In response to GHGs, the MIROC Medium Resolution model and GISS ModelE-H exhibit the largest amplification of the NH annular pattern over the 21st century. While the ModelE domain extends beyond the stratopause to 0.1 hPa, the MIROC model includes only a partial stratosphere. Previous work relating GHG and volcanic forcing to annular trends suggests a role for the stratosphere [Shindell et al., 1999]. However, the trends simulated here indicate that tropospheric mechanisms can drive annular changes [Fyfe et al., 1999], as a result of tropospheric wave dynamics or surface temperature contrasts, for example [Yu and Hartmann, 1993; Cai et al., 2003].
 We thank Donald Anderson of the NASA Science Mission Directorate for supporting this project with a Climate Model Evaluation Project (CMEP) grant under the U.S. CLIVAR Program (http://www.usclivar.org/index.html). We are grateful for the comments of David Karoly, Tim Osborn, Susan Solomon, Georgiy Stenchikov, Bertrand Timbal, and three anonymous reviewers. We are also grateful to John Austin for guiding this article expeditiously through the review process. We thank William Connolley, Monika Esch, Greg Flato, James Hack, William Ingram, Gareth Jones, Jeff Kiehl, Jerry Meehl, Toru Nozawa, V. Ramaswamy, Phil Rasch, Erich Roeckner, and Philip Stier for model documentation. We acknowledge the international modeling groups for providing their data for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the model data, the JSC/CLIVAR Working Group on Coupled Modelling (WGCM) and their Coupled Model Intercomparison Project (CMIP) and Climate Simulation Panel for organizing the model data analysis activity, and the IPCC WG1 TSU for technical support. The IPCC Data Archive at Lawrence Livermore National Laboratory is supported by the Office of Science, U.S. Department of Energy.