We compare new observationally-based data sets of Antarctic near-surface air temperature and snowfall accumulation with 20th century simulations from global climate models (GCMs) that support the Intergovernmental Panel on Climate Change Fourth Assessment Report. Annual Antarctic snowfall accumulation trends in the GCMs agree with observations during 1960–1999, and the sensitivity of snowfall accumulation to near-surface air temperature fluctuations is approximately the same as observed, about 5% K−1. Thus if Antarctic temperatures rise as projected, snowfall increases may partially offset ice sheet mass loss by mitigating an additional 1 mm y−1 of global sea level rise by 2100. However, 20th century (1880–1999) annual Antarctic near-surface air temperature trends in the GCMs are about 2.5-to-5 times larger-than-observed, possibly due to the radiative impact of unrealistic increases in water vapor. Resolving the relative contributions of dynamic and radiative forcing on Antarctic temperature variability in GCMs will lead to more robust 21st century projections.
 Establishing multi-decadal, continental-scale records of near-surface air temperature (NSAT) and snowfall accumulation (henceforth snowfall) in Antarctica is important for understanding regional climate variability and for elucidating the role of the Antarctic ice sheets in global sea level change. Generating such records is challenging due to Antarctica's sparse observational network. With innovative techniques, several recent studies have enhanced the spatio-temporal records of NSAT and snowfall during the latter half of the 20th century [Monaghan et al., 2006, 2008; Chapman and Walsh, 2007], and longer [Schneider et al., 2006].
 A few studies have evaluated Antarctic climate simulations in global climate models (GCMs) that support the Intergovernmental Panel on Climate Change (IPCC) Fourth Assessment Report (AR4), showing that there have been key qualitative improvements in simulated spatial patterns of NSAT trends compared to IPCC Third Assessment GCMs, but challenges remain [e.g., Lynch et al., 2006; Chapman and Walsh, 2007]. Our study extends the previous work by employing the new observational records for a quantitative evaluation of simulated 20th century Antarctic NSAT and snowfall variability in five representative IPCC AR4 GCMs at the continental scale, thereby providing a metric of the quality of their 21st century Antarctic climate projections.
2. Data and Methods
 We use the observationally-based NSAT records of Schneider et al. [2006] (S06) and Monaghan et al. [2008] (M08), and the snowfall record of Monaghan et al. [2006] (M06). The M06 and M08 data sets (which are 1° × 1° latitude/longitude gridded products) are compiled into continental-scale annual (and seasonal for NSAT) averages by area-weighting all Antarctic land and ice shelf grid points defined by the land/sea mask of Vaughan et al. The S06 NSAT record is an annual, continental-scale time series spanning 1800–1999 based on ice-core water stable isotope proxy data calibrated with modern station NSAT records. The M08 NSAT record is a monthly product spanning 1960–2005 that is generated by extrapolating station NSAT records in space and time using background spatio-temporal information from atmospheric model NSAT fields. M08 evaluate their record and find that the variability and trends are consistent with other continental-scale NSAT records and with independent station observations from across the continent in all seasons. The M06 snowfall record spanning 1955–2004 is generated by applying a similar extrapolation technique to a suite of ice core snowfall records. Sensitivity studies indicate that the technique is robust and that the resulting record agrees with earlier studies finding increases in Antarctic snowfall through the mid-1990s. A downturn in snowfall since the mid-1990s renders the overall snowfall trend from the 1955–2004 record statistically insignificant. In summary, evaluation of the observationally-based records employed in this study indicates that they can resolve variability and trends, and thus are appropriate for assessing GCM simulations of Antarctic NSAT and snowfall trends during the 20th century.
 Simulations from five IPCC AR4 GCMs have been obtained from the World Climate Research Programme's (WCRP's) Coupled Model Intercomparison Project phase 3 (CMIP3) multi-model data set (Table 1). To ensure robust results, the selected models must have at least 4 ensemble members for both the 20c3m (19th/20th century) and sresa1b (21st century) scenarios, although in this study only 20c3m runs are analyzed. Below we argue that the selected models provide a representative sample of Antarctic climate simulations for all 23 IPCC AR4 GCMs. The results presented from the individual models are based on the ensemble means from each model, and the results for the grand ensemble (GRA) are based on the 20-run ensemble mean (5 models × 4 runs each). Snowfall at each grid point is calculated from precipitation-minus-evaporation (P-E). Continental-scale time series are area-weighted averages of each variable for all Antarctic land and ice shelf grid points defined by the land/sea mask for each model.
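The continental-scale averaging and the P-E definition of snowfall described above can be sketched as follows. This is a minimal illustration, not the processing code used for the study: it assumes a regular latitude/longitude grid on which cell area scales with the cosine of latitude, and the function names `area_weighted_mean` and `snowfall_from_p_minus_e` are hypothetical.

```python
import math


def area_weighted_mean(field, lats_deg, mask):
    """Average `field` over the grid cells selected by `mask`.

    field, mask: nested lists indexed [lat][lon]; lats_deg: latitude of each row.
    Assumption: regular lat/lon grid, so cell area is proportional to cos(latitude).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat, mask_row in zip(field, lats_deg, mask):
        weight = math.cos(math.radians(lat))
        for value, keep in zip(row, mask_row):
            if keep:  # land or ice shelf point according to the land/sea mask
                weighted_sum += weight * value
                weight_total += weight
    if weight_total == 0.0:
        raise ValueError("mask selects no grid cells")
    return weighted_sum / weight_total


def snowfall_from_p_minus_e(precip, evap):
    """Approximate snowfall accumulation at each grid point as P minus E."""
    return [[p - e for p, e in zip(p_row, e_row)]
            for p_row, e_row in zip(precip, evap)]
```

Applying `area_weighted_mean` to each time step of the masked P-E field would yield one continental-scale value per step, from which annual anomalies can be formed.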
Table 1. Description of the IPCC Models Used in This Study^a
^a The "20c3m Ozone?" column lists models that include time-variable stratospheric ozone forcing in the 20th century simulations. More detailed information is given by IPCC [2007] and at http://www-pcmdi.llnl.gov.
^b Resolution of atmospheric component.
^c Output resolution of CGCM3.1 data is ∼3.75° × 3.75°.
 Temporal trends and sensitivity statistics are calculated using least squares linear regression. Confidence intervals for trends and sensitivities are estimated as t05 × SEtot, where t05 is the t value for p = 0.05 and $SE_{tot} = \sqrt{SE_{b1}^{2} + SE_{m}^{2}}$. SEb1 is the standard error of the regression slope (or the trend), and SEm accounts for additional uncertainty due to imperfect methodology/algorithms used to generate the observational data sets, estimated following M08. Autocorrelation is accounted for by reducing the degrees of freedom when running means are used.
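A minimal sketch of this trend and confidence-interval calculation is given below. It assumes that the regression and methodological standard errors combine in quadrature, and it takes the critical t value as an input (`t_crit`) to be looked up for the chosen significance level and degrees of freedom. The function name and the optional `dof` argument, which lets the caller supply reduced degrees of freedom for smoothed series, are hypothetical.

```python
import math


def trend_with_confidence(times, values, t_crit, se_method=0.0, dof=None):
    """Least squares trend and the half-width of its confidence interval.

    t_crit: critical t value for the chosen p level and degrees of freedom.
    se_method: extra standard error from the data-set methodology (0 if none).
    dof: effective degrees of freedom; defaults to n - 2 (no autocorrelation).
    """
    n = len(times)
    if dof is None:
        dof = n - 2
    if n < 3 or dof < 1:
        raise ValueError("need at least 3 points and 1 degree of freedom")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    # Residual sum of squares about the fitted line
    sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se_slope = math.sqrt(sse / dof / sxx)
    # Assumption: the two error sources are independent and add in quadrature
    se_total = math.sqrt(se_slope ** 2 + se_method ** 2)
    return slope, t_crit * se_total
```

A trend would then be reported as `slope ± half_width`, and it is distinguishable from zero at the chosen level when the absolute slope exceeds the half-width.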
3. Results and Discussion
 The time series of observed and simulated annual Antarctic NSAT and snowfall anomalies are compared in Figure 1. The use of GCM ensembles emphasizes the signal of forced variability but dampens the internal high-frequency variability compared to the observations. However, the use of ensembles does not impact the trends: all 20 member simulations have statistically significant (p < 0.05) NSAT increases from 1880–1999 that are larger than observed and consistent with the results for the ensembles shown in Figure 1. Chapman and Walsh [2007] analyze NSAT time series averaged for 60°–90°S from a larger sample of 11 IPCC AR4 models, and all have steady 20th and 21st century increases of similar magnitude to the 5 GCMs in Figure 1. The 1901–1999 trends for the 4 GCMs that are common to both studies rank 1st (GIS), 3rd (NCA), 5th (MRI) and 8th (MPI) from smallest to largest among the 11 models, indicating that the 5 models evaluated here provide a representative sample of the IPCC AR4 GCMs. Figure 1 (right) shows a steady increase in snowfall in the GCMs throughout the period. The upward snowfall trends in the GCMs are comparable to the observations throughout the latter ∼4 decades of the 20th century. Because the GCM records analyzed here end in 1999, it is unclear whether the GCMs would capture the downturn in observed snowfall that occurred over the past decade. Uotila et al. evaluate estimates of Antarctic P-E trends for 15 IPCC AR4 GCMs (including all 5 appearing here) during the late 20th century and find large differences among models and between the models and global reanalyses. The results of Uotila et al. indicate that our 5 GCMs provide a representative sample of the broad range of snowfall trends from all 15 models.
 The annual NSAT and snowfall trends for the time series shown in Figure 1 are quantified in Table 2. The GCM grand ensemble NSAT trends are about 3.75 times larger than the statistically insignificant observed trend (+0.75 ± 0.07 K century−1 versus +0.20 ± 0.32 K century−1) for the 1880–1999 period, and individual ensembles range from 2.5 to 5 times larger than observed. Over the recent 1960–1999 period, the GCM NSAT trends are about double what they are for 1880–1999, while the observed 1960–1999 trends are smaller than for 1880–1999. The possible causes of the larger-than-observed Antarctic NSAT increases in the GCMs are examined later.
Table 2. Antarctic NSAT and Snowfall Trends and Confidence Intervals (p < 0.05) for Observations and GCMs for Various Periods and Seasons^a
^a The observations are in the Schneider and Monaghan columns (NSAT, S06; snowfall, M06; and NSAT, M08). The final row (S/T Sensitivity) gives the sensitivity of annual snowfall to annual NSAT fluctuations, expressed as a percentage of the long-term mean annual snowfall for a given GCM or observation. The S/T sensitivity was calculated by linearly regressing the detrended 10-y running means of snowfall onto NSAT for each data set in order to emphasize decadal-and-longer timescales of greatest interest for climate change. Observed S/T sensitivity is calculated with the 1960–2004 snowfall and temperature records from M06 and M08. The results from the grand ensemble of the 5 GCMs are given in the GCM 'GRA' column. The Min GCM and Max GCM columns give the range of individual ensemble means from each GCM.
^b Values significantly different from zero (p < 0.05) are shown in italics.
^c The S/T sensitivity is calculated for the longest time period available for each data set (1960–2004 for Monaghan and 1880–1999 for the GCMs).
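The S/T sensitivity calculation described in the Table 2 notes can be sketched as follows. This is an illustrative reading of the procedure rather than the code used for the table: it assumes that the running means are formed first and then linearly detrended, and that the regression slope is normalized by the mean of the unsmoothed snowfall series. All function names are hypothetical.

```python
def running_mean(series, window=10):
    """Unweighted running mean; the output is shorter than the input by window - 1."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def linear_fit(x, y):
    """Least squares slope and intercept of y regressed onto x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def detrend(series):
    """Remove the least squares linear trend (and mean) from a series."""
    steps = list(range(len(series)))
    slope, intercept = linear_fit(steps, series)
    return [v - (intercept + slope * s) for s, v in zip(steps, series)]


def st_sensitivity(snowfall, nsat, window=10):
    """Snowfall sensitivity to NSAT in percent of mean snowfall per kelvin."""
    mean_snowfall = sum(snowfall) / len(snowfall)
    snow_smooth = detrend(running_mean(snowfall, window))
    nsat_smooth = detrend(running_mean(nsat, window))
    slope, _ = linear_fit(nsat_smooth, snow_smooth)  # snowfall units per K
    return 100.0 * slope / mean_snowfall
```

Because running means introduce strong autocorrelation, the significance of such a slope would need to be assessed with reduced degrees of freedom.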
 The magnitudes of the 1960–1999 seasonal NSAT trends in the GRA GCM ensembles are comparable to each other and to the annual trends. In all seasons they are highly statistically significant, emphasizing steady increases and general consistency among the ensemble members, similar to the annual case shown in Figure 1. They are larger than the observed trends in all seasons, although in summer (DJF) they are nearly identical to the observations (+1.09 K century−1 observed versus +1.11 K century−1 for GRA). However, the large statistical uncertainty of the observed trends indicates that they are not robust; if the observed DJF trends are calculated for 1960–2002 instead of 1960–1999, their magnitude decreases by about eightfold (+0.13 K century−1). The GIS model ensemble is most similar to the observations, having the smallest trends (all insignificant) in all four seasons. The observed and GCM GRA annual NSAT trends are statistically different for 1880–1999 (p < 0.05), but they are not statistically different (annually and seasonally) for 1960–1999.
 Both observed and GRA GCM snowfall trends for 1955–1999 are positive and statistically significant. If the observed trends are calculated through 2004 (not shown in Table 2), they drop from +32 ± 31 mm century−1 to +19 ± 32 mm century−1, similar to the GRA GCM trends (+17 ± 4 mm century−1); however, they become statistically insignificant. The GCM snowfall trends for 1955–1999 are about double their value of +9 mm century−1 (not shown) for the entire 1880–1999 period. This is similar to the approximate doubling of NSAT trends during 1960–1999 compared to 1880–1999 and suggests a positive sensitivity of snowfall to NSAT fluctuations. The snowfall/temperature (S/T) sensitivity is quantified in the final row of Table 2. The overall GCM S/T sensitivity is about the same as observed (5% K−1), suggesting that if Antarctic NSATs increase by about 3 K by the end of this century as projected by IPCC AR4 GCMs [Chapman and Walsh, 2007], snowfall may increase by about 15%, equivalent to the additional uptake of ∼1 mm y−1 of global sea level rise in 2100 compared to today (based on the current annual snowfall accumulation rate of ∼170 mm y−1 over the grounded ice sheet [van de Berg et al., 2006]). It is promising that the GCMs have approximately the same S/T sensitivity as observed. However, the Antarctic NSAT trends in the GCMs during the 20th century are too large, raising questions about the quality of GCM projections of NSAT and snowfall changes in the 21st century. Considering the important implications for sea level, we examine possible causes for the larger-than-observed 20th century Antarctic NSAT increases.
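The sea level arithmetic in this paragraph can be reproduced approximately with the sketch below. The warming, sensitivity and accumulation rate come from the text; the grounded ice sheet area (about 12.3 million km²) and global ocean area (about 361 million km²) are round values assumed here for illustration and are not taken from the study, as is the treatment of the accumulation rate as water equivalent. The function name is hypothetical.

```python
def snowfall_sea_level_offset(warming_k, sensitivity_pct_per_k,
                              accumulation_mm_per_yr,
                              ice_area_km2, ocean_area_km2):
    """Extra snowfall uptake expressed as mm/yr of global mean sea level."""
    fractional_increase = warming_k * sensitivity_pct_per_k / 100.0
    # Additional accumulation over the ice sheet (mm/yr, assumed water equivalent)
    extra_accumulation = fractional_increase * accumulation_mm_per_yr
    # Spread the same water volume over the much larger ocean surface
    return extra_accumulation * ice_area_km2 / ocean_area_km2


offset = snowfall_sea_level_offset(
    warming_k=3.0,                 # projected warming by 2100, from the text
    sensitivity_pct_per_k=5.0,     # S/T sensitivity, from the text
    accumulation_mm_per_yr=170.0,  # grounded ice sheet accumulation, from the text
    ice_area_km2=12.3e6,           # assumed grounded ice sheet area
    ocean_area_km2=3.61e8,         # assumed global ocean area
)
print(f"sea level offset: {offset:.2f} mm/yr")
```

With these assumed areas the result is somewhat below 1 mm y−1, which is consistent with the approximate figure quoted above.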
 Observed Antarctic NSAT variability and trends during the final decades of the 20th century have been attributed mainly to the leading mode of extratropical variability in the Southern Hemisphere atmosphere, the Southern Hemisphere Annular Mode (SAM) [e.g., Thompson and Wallace, 2000]. The SAM is manifested in the intensity of the circumpolar westerlies between 40°S and 65°S, which have been strengthening since the mid-1960s, leading to slight overall NSAT decreases in Antarctica since about 1970 [e.g., Marshall, 2003, 2007]. Modeling studies suggest the increasing SAM trends, which have been largest in summer and autumn, may be linked to anthropogenic changes due to greenhouse gas increases and decreasing stratospheric ozone over Antarctica [e.g., Arblaster and Meehl, 2006]. Miller et al. find that the IPCC AR4 GCMs reproduce the SAM realistically, although it is too strong in most models. The simulated SAM is similar among the 14 models analyzed by Miller et al. (including all 5 appearing in this paper), suggesting that our 5 GCMs provide a representative sample of SAM variability among the IPCC AR4 models. Cai and Cowan show that IPCC AR4 GCMs that include time-variable stratospheric ozone forcing in the 20c3m runs simulate increasing SAM trends in the late 20th century similar to those observed and about double the magnitude of the GCMs with no stratospheric ozone forcing. The findings of Cai and Cowan imply that the GCMs that do not have time-variable stratospheric ozone forcing (and thus underestimate positive SAM trends on average) may produce larger Antarctic NSAT trends than those that do, which is consistent with the results in Table 2. The GIS GCM, which includes ozone forcing (see Table 1), has the smallest 1960–1999 annual and seasonal NSAT trends (closest to observed), while the CCC GCM (no ozone forcing) has the largest NSAT trends. The DJF SAM trends in GIS are nearly the same as the observed trends and double the magnitude of the CCC trends (not shown).
 In Figure 2a we investigate the observed and simulated correlation between annual Antarctic NSAT and the SAM. The results are nearly identical for each season (not shown), consistent with the findings of Marshall  that the NSAT response to the SAM has little seasonality. The SAM in the GCMs is calculated from the difference of the normalized monthly zonal mean sea level pressure anomalies at 40° and 65° S, following Gong and Wang . The observed SAM is calculated in a similar manner by Marshall . Ensemble means are averages of SAM indices from each member. The observations have statistically-significant negative correlations with the SAM for both the trended and detrended cases, confirming that the SAM is the primary driver of recent observed Antarctic NSAT changes. The GCMs have approximately the same correlation with the SAM as observed for the detrended case, but they have statistically insignificant correlations for the trended case, suggesting that something other than the SAM is the primary cause of the Antarctic NSAT changes in the GCMs, despite the link between the SAM and NSAT discussed for CCC and GIS in the previous paragraph.
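The pressure-difference SAM index used for the GCMs, and its correlation with NSAT, can be sketched as below. This is a simplified illustration: it assumes the inputs are already zonal-mean sea level pressure anomalies (seasonal cycle removed) at the two latitudes, it normalizes with the population standard deviation over the full record, and the function names are hypothetical.

```python
import math


def normalize(series):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0.0:
        raise ValueError("series has zero variance")
    return [(v - mean) / std for v in series]


def sam_index(slp_anom_40s, slp_anom_65s):
    """Normalized pressure anomaly at 40S minus normalized anomaly at 65S."""
    return [a - b for a, b in zip(normalize(slp_anom_40s),
                                  normalize(slp_anom_65s))]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

The trended and detrended cases discussed above would correspond to calling `pearson_r` on the raw annual series and on the same series after removing a linear trend from each.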
 The SAM influences Antarctic NSAT primarily through a series of dynamic changes to the atmospheric circulation [van den Broeke and van Lipzig, 2004]. Longwave radiative forcing via changes in greenhouse gases (GHGs) and aerosols is another important contributor to NSAT variability [e.g., IPCC, 2007]. In Figures 2b–2e we examine the influence of longwave radiative forcing on Antarctic NSAT fluctuations. Figure 2b indicates a strong relationship (r2 = 0.85) between NSAT and all-sky downward longwave radiation incident at the surface (LWDN), implying that steady positive trends in LWDN between 1880 and 1999 (not shown) are a key contributor to the Antarctic NSAT increases in the GCMs. In addition to long-lived GHGs such as CO2 and CH4, which have increased during the 20th century, the hydrologic cycle also exerts strong influences on LWDN. In Figures 2c–2e we examine the relationships between Antarctic NSAT and column-integrated precipitable water vapor, cloud ice, and cloud water in the GCMs. The correlation for NSAT and water vapor is much stronger than it is for NSAT and cloud ice/water (deficient cloud simulations are often cited as a key cause of GCM problems in polar regions). The normalized regression slope of the NSAT/vapor relationship (Figure 2c) is similar to that for NSAT/LWDN (Figure 2b), suggesting that positive 20th century trends in water vapor in the GCMs (not shown) may be an important contributor to the positive trends of LWDN and therefore NSAT. The robust relationship between NSAT and water vapor shown in Figure 2c holds for all 5 of the GCMs evaluated in this study. Due to a lack of reliable long-term observations of atmospheric water vapor over Antarctica, it is unclear whether the GCM water vapor increases are too large. However, a recent study of tropospheric temperature trends over Antarctica from satellite microwave sounding data [Johanson and Fu, 2007] may indicate whether water vapor has increased, via the assumption that relative humidity remains nearly constant if tropospheric temperatures increase (and thus specific humidity would increase) [e.g., Held and Soden, 2006]. Winter and spring tropospheric temperature increases have been offset by summer and autumn decreases, with little annual change from 1979–2005 [Johanson and Fu, 2007]. With little annual tropospheric warming from which to infer an observed moistening, it is likely that water vapor increases over Antarctica in the GCMs are too large.
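The constant-relative-humidity argument can be made concrete with the Clausius-Clapeyron relation, under which the fractional change in saturation vapor pressure per kelvin is approximately L / (Rv T²). The sketch below is illustrative only: the latent heat of vaporization and the gas constant for water vapor are standard textbook values assumed here, the function name is hypothetical, and over ice the latent heat of sublimation would give a somewhat larger rate.

```python
def fractional_vapor_increase_per_kelvin(temperature_k,
                                         latent_heat_j_per_kg=2.5e6,
                                         vapor_gas_constant=461.5):
    """Approximate d(ln e_s)/dT from the Clausius-Clapeyron relation.

    At fixed relative humidity and pressure, specific humidity scales with
    the saturation vapor pressure e_s, so this is also the approximate
    fractional increase in water vapor per kelvin of warming.
    """
    return latent_heat_j_per_kg / (vapor_gas_constant * temperature_k ** 2)


for temperature in (288.0, 250.0):  # a temperate and a cold, polar-like value
    rate = fractional_vapor_increase_per_kelvin(temperature)
    print(f"T = {temperature:.0f} K: about {100.0 * rate:.1f}% per K")
```

The fractional rate is larger at colder temperatures, so even modest tropospheric warming over Antarctica would imply a noticeable relative increase in water vapor, while little warming implies little moistening.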
 The annual snowfall trends in the GCMs agree with the observations during 1960–1999, but annual NSAT trends for 1880–1999 are too large by a factor of 2.5 to 5. Our results suggest that the larger-than-observed GCM NSAT trends may be related to unrealistic increases in atmospheric water vapor over Antarctica, which enhance longwave radiative forcing at the surface. When applied to the longwave radiation trend, the regression relationship presented in Figure 2b suggests that the positive contribution of longwave radiation to 1880–1999 Antarctic NSAT trends in the GCMs is about 4 times larger than the (overall) negative contribution of the SAM (and at least 2 times larger during 1960–1999, when SAM trends are largest). The monotonic increase of Antarctic NSAT in the GCMs may thus be related to the steady rise in GHGs since the 19th century, perhaps leading to an amplified GHG-temperature-water-vapor feedback that is contributing to the larger-than-observed NSAT trends. IPCC AR4 GCMs project that the SAM will continue strengthening throughout the 21st century [e.g., Fyfe and Saenko, 2006]; it should therefore be a priority to clarify the relative roles of the SAM and radiative forcing on Antarctic temperatures and how they may change. Until these issues are resolved, IPCC projections for 21st century Antarctic temperature should be regarded with caution.
 This research is funded by the National Science Foundation (NSF-OPP-0337943). We acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison and the WCRP's Working Group on Coupled Modelling for their roles in making available the WCRP CMIP3 multi-model data set. Support of this data set is provided by the Office of Science, U.S. Department of Energy. Gareth Marshall of the British Antarctic Survey generously provided the SAM index. Contribution 1369 of Byrd Polar Research Center.