5.1.1. Annual Cycle
 SWE is a fundamental measure of the quality of climate models, since the correct simulation of SWE requires the accurate simulation of both surface temperature and precipitation. In the following, the seasonal SWE cycle, as simulated in the IPCC AR4 models, is compared with ground-based data compiled by USAF/ETAC (see section 3.1).
 Most state-of-the-art climate models significantly overestimate the snow mass in the Northern Hemisphere (Figures 1a–1d), particularly in spring. On a hemispheric scale, CSIRO-Mk3.0, ECHAM5/MPI-OM and INM-CM3.0 are closest to the observations. In Eurasia, ECHAM5/MPI-OM and CSIRO-Mk3.0 most closely match the observations, while in North America, UKMO-HadCM3 and ECHAM5/MPI-OM are the most accurate models (Figure 1c). When the analysis is restricted to the zone between 40°N and 60°N, ECHAM5/MPI-OM shows the best agreement with observations of all participating climate models. Mean Eurasian SWE predicted by GISS-AOM is clearly higher than in most other climate models, which might be partly related to the snow/ice transformation algorithm (section 4) and problems in how snow is melted (G. L. Russell, personal communication, 2005). Mean Eurasian SWE simulated by GISS-AOM for the period from February to April is approximately twice that derived from ground measurements. Furthermore, the AR4 models produce a wide range of predicted peak snow accumulation. Figure 1 shows that the models tend to significantly overestimate the peak snow accumulation, mainly in Eurasia. The observed Eurasian SWE peaks in February (Figure 1b) and reaches its minimum in August, whereas maximum SWE in North America is reached in March (Figure 1c). Most models predict a delayed peak snow accumulation. Only CSIRO-Mk3.0 is in line with the observed SWE maximum in February, while the other models do not peak until March. Excessive snow amount and a delayed snow melt in spring were also found by Foster et al. It is therefore evident that various state-of-the-art climate models still suffer from a delayed retreat of the snowline and excessive snow amount in spring. It should be mentioned, however, that there are several deficiencies in the USAF/ETAC data as well. For example, it is challenging to construct isonivals in areas with relatively few data points, such as the boreal forests. In addition, sparse measurements in mountainous areas might bias the SD toward values that are too low.
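The timing comparison above (observed peak in February, most models in March) can be illustrated with a minimal sketch. The function names and the two 12-month cycles are hypothetical and invented for illustration; they are not values from Figure 1.

```python
import numpy as np

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def peak_month(monthly_swe):
    """Index (0 = January) of the month with maximum SWE."""
    return int(np.argmax(np.asarray(monthly_swe, dtype=float)))

def peak_delay(model_cycle, obs_cycle):
    """Months by which the model peak lags the observed peak.

    The difference is wrapped around the calendar year so that the
    result lies between -5 and 6.
    """
    diff = peak_month(model_cycle) - peak_month(obs_cycle)
    return (diff + 5) % 12 - 5

# Invented monthly mean SWE (cm), January to December.
obs_cycle = [8.0, 9.0, 8.0, 5.0, 2.0, 0.5, 0.1, 0.0, 0.2, 1.0, 3.0, 6.0]
model_cycle = [9.0, 11.0, 12.0, 9.0, 4.0, 1.0, 0.2, 0.0, 0.3, 1.5, 4.0, 7.0]

obs_peak = MONTHS[peak_month(obs_cycle)]
model_peak = MONTHS[peak_month(model_cycle)]
delay = peak_delay(model_cycle, obs_cycle)
```

With these invented cycles the observed peak falls in February and the model peak one month later.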
Figure 1. Monthly mean SWE (1960–1999) for 14 AR4 models and the USAF/ETAC climatology. (a) Northern Hemisphere (land) without Greenland; (b) Eurasia, north of 20°N; (c) North America, north of 20°N; and (d) 40–60°N.
 In order to investigate the main reasons for the overestimated SWE, simulated surface temperature and snowfall are compared with the most accurate currently available data sets for global land surface temperature (CRU) and precipitation (GPCC) (sections 3.3 and 3.4).
 Figure 2 shows a scatterplot of the simulated SWE deviations and the models' temperature bias for both March and April. It reveals no distinct relationship between a warm or cold bias and SWE in March. In April, the AR4 models generally show a slight cold bias over both Eurasia and the 40–60°N land region. This might be related to a positive snow/albedo feedback that could partly contribute to the delayed snow melt in spring. However, the coupled climate models generally produce a slight warm bias over North America in April (not shown) along with a positive SWE bias. Temperature biases are thus not the primary driver of the overly thick snow deck simulated in (late) winter and spring.
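As an illustration of how a regional bias (model minus observation) of the kind plotted in Figure 2 could be computed, the following is a minimal sketch. It assumes that model and observed fields share a regular latitude/longitude grid and that regional means are weighted by the cosine of latitude; the text does not state the averaging procedure, so both points are assumptions, and the function names and numbers are hypothetical.

```python
import numpy as np

def regional_mean(field, lat_deg, mask):
    """Area-weighted mean of a (lat, lon) field over cells where mask is True.

    Assumption: cell area is proportional to cos(latitude) on a regular grid.
    """
    field = np.asarray(field, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # One weight per latitude row, broadcast along longitude.
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(field)
    weights = np.where(mask, weights, 0.0)
    return float((field * weights).sum() / weights.sum())

def regional_bias(model, obs, lat_deg, mask):
    """Regional mean bias, defined as model minus observation."""
    return regional_mean(model, lat_deg, mask) - regional_mean(obs, lat_deg, mask)

# Invented 2 x 2 example: April surface temperature (deg C).
lat = np.array([45.0, 55.0])
obs_t = np.array([[6.0, 5.0], [1.0, 0.0]])
model_t = obs_t - 1.5                      # uniform cold bias of 1.5 K
land = np.array([[True, True], [True, False]])

t_bias = regional_bias(model_t, obs_t, lat, land)
```

The same two functions could be applied to SWE fields to obtain the other axis of such a scatterplot.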
Figure 2. Scatterplot of SWE biases and surface temperature biases (model minus observation). Biases are displayed for (a and b) Eurasia (north of 20°N) and (c and d) the land region between 40 and 60°N for both March and April. Color scale is as in Figure 1: black diamonds, CCSM3; black triangles, CGCM3.1 (T63); and black squares, CSIRO-Mk3.0. Observations are taken from CRU (surface temperature) and USAF/ETAC (section 3). All 10 AR4 models providing both SCA and SWE are shown.
 Accumulated snowfall might also contribute to the poor SWE simulation in many coupled atmosphere-ocean climate models. In order to test this hypothesis, accumulated snowfall was computed from the GPCC precipitation, assuming solid precipitation for CRU temperatures below 0°C. The comparison between the SWE biases and biases in accumulated snowfall reveals that most AR4 models produce too much snowfall during the winter and spring season (Figure 3). High positive SWE biases (such as simulated in GISS-AOM) are generally related to high positive snowfall anomalies. Assuming some snow melt and sublimation, the positive snowfall anomaly could easily explain most of the positive SWE bias. The results are barely sensitive to the threshold temperature for the transition between rain and snow (e.g., 1°C instead of 0°C).
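The snowfall estimate described above can be sketched as follows. The sketch assumes monthly precipitation totals and monthly mean temperatures on a common grid, which the text does not state explicitly; the function name and the sample values are hypothetical.

```python
import numpy as np

def accumulated_snowfall(precip_mm, temp_c, threshold_c=0.0):
    """Sum precipitation over all months colder than the threshold.

    precip_mm, temp_c: arrays of shape (n_months, ...) holding monthly
    precipitation totals (mm) and monthly mean temperatures (deg C) for
    the accumulation season, e.g. October through March.
    """
    precip_mm = np.asarray(precip_mm, dtype=float)
    temp_c = np.asarray(temp_c, dtype=float)
    # Precipitation is counted as solid only where the temperature
    # is below the rain/snow threshold.
    snow = np.where(temp_c < threshold_c, precip_mm, 0.0)
    return snow.sum(axis=0)

# Invented values for a single grid cell, October to March.
precip = np.array([40.0, 35.0, 30.0, 25.0, 20.0, 30.0])
temp = np.array([3.0, 0.5, -4.0, -8.0, -6.0, -1.0])

snow_0c = accumulated_snowfall(precip, temp)                   # 0 deg C threshold
snow_1c = accumulated_snowfall(precip, temp, threshold_c=1.0)  # 1 deg C threshold
```

The two calls mirror the threshold test mentioned in the text. The sample numbers are invented and say nothing about the actual sensitivity of the results.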
Figure 3. As in Figure 2 but for SWE and accumulated snowfall. Accumulated snowfall for March and April corresponds to the total snowfall from October to March and October to April, respectively. Observed snowfall has been computed by assuming solid precipitation for CRU temperatures below 0°C. Color scale is as in Figure 2: black diamonds, CCSM3; black triangles, CGCM3.1 (T63); and black squares, CSIRO-Mk3.0.
 It is thus likely that the positive SWE biases are primarily caused by an overly high snowfall rate. This finding also suggests that, for a correct SWE simulation, an accurate simulation of the large-scale circulation and the precipitation rate is probably more fundamental than the improvement of the implemented snow model.
 All AR4 models reproduce SWE in late autumn/early winter reasonably well. Thus the onset of the snow season is better captured by current climate models than the snow melt. The difficulty in predicting the snow melt might be caused not only by biases in temperature and precipitation, but also by a poor parameterization of the snow melt processes. For example, ECHAM5, the latest ECHAM version, is superior to the previous version 4 in simulating the timing of the spring snow melt, as it profits from an improved representation of the snow melt processes [Roesch and Roeckner, 2006].
 Excessive SWE is found in all AR4 models in the Himalayas (not shown). However, snow measurements in high mountainous areas may be afflicted with significant errors and tend to underestimate the area-averaged snow height, since measurements are sparse and biased toward lower-lying regions (valleys). In some models, a positive snowfall bias at the southern slopes of the Himalayas probably contributes further to the excessive Himalayan snow amount [Hagemann et al., 2006].
 Ten out of the 15 investigated climate models predict SD (Table 1). This allows a direct comparison with the USAF/ETAC SD climatology without applying equation (1). The comparison clearly shows that, at least on larger (temporal and spatial) scales, the percentage SD biases are only slightly lower than the respective SWE biases (Table 2). This provides confidence in the SWE validation of the AR4 models and in the snow density as reported by Verseghy.
Table 2. SWE and SD Biases (Percentage Differences): Nine-Model Mean Minus USAF/ETACa
| ||Percentage Difference|
|North America (>20°N)|
 In summary, most of the AR4 models suffer from an excessive snow amount in spring and a delayed peak snow accumulation and snow melt. The spread among the models is largest in spring while the onset of the snow season is reasonably well captured by all participating models.
5.1.2. Frequency Distribution
 The frequency distribution of observed and predicted SWE is shown in Figure 4 for spring (Figures 4a and 4c) and autumn (Figures 4b and 4d). Grid cells with spring (MAM) SWE exceeding 10 cm are too frequent in most AR4 models (Figures 4a and 4c). According to the USAF snow depth climatology, 12.0%, 15.3%, and 5.9% of all grid elements are covered with SWE above 10 cm in spring for Eurasia, North America, and the zone 40–60°N, respectively. The corresponding numbers for the mean of the 14 AR4 models are distinctly higher and amount to 23.0%, 21.1%, and 14.9% for the respective regions. On the other hand, most models underestimate the number of grid boxes in Eurasia and North America that are covered with an SWE between 2 cm and 10 cm.
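The class frequencies and the share of grid cells above 10 cm can be illustrated with a short sketch. It assumes that every grid cell counts equally (no area weighting) and that SWE is binned in 2 cm classes as in Figure 4; the function names, the upper limit of 40 cm, and the sample values are hypothetical.

```python
import numpy as np

def swe_class_frequencies(swe_cm, bin_width=2.0, max_swe=40.0):
    """Percentage of grid cells in each SWE class of width bin_width (cm).

    Cells with SWE above max_swe fall outside the histogram range and
    are not assigned to any class.
    """
    swe_cm = np.asarray(swe_cm, dtype=float).ravel()
    edges = np.arange(0.0, max_swe + bin_width, bin_width)
    counts, _ = np.histogram(swe_cm, bins=edges)
    return edges, 100.0 * counts / swe_cm.size

def percent_above(swe_cm, threshold_cm=10.0):
    """Percentage of grid cells with SWE above the threshold."""
    swe_cm = np.asarray(swe_cm, dtype=float).ravel()
    return 100.0 * np.count_nonzero(swe_cm > threshold_cm) / swe_cm.size

# Invented spring SWE values (cm) for ten grid cells.
swe = np.array([0.0, 0.5, 1.0, 3.0, 5.0, 7.0, 9.0, 12.0, 15.0, 22.0])

edges, freq = swe_class_frequencies(swe)
share_thick = percent_above(swe)   # cells with SWE > 10 cm
freq_plotted = freq[1:]            # drop the 0-2 cm class, as in Figure 4
```

Dropping the first class mirrors the treatment of snow-free pixels and thin snow decks in the figure.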
Figure 4. Frequency distribution of SWE for 14 AR4 models (average from 1960 to 1999) and the USAF/ETAC climatology. Frequencies are given for 2 cm bins (classes). The class for snow-free pixels and thin snow decks (0–2 cm) has been omitted. (a and b) Eurasia north of 20°N and (c and d) 60–70°N excluding Greenland, in MAM (Figures 4a and 4c) and October/November (Figures 4b and 4d).
 For both Eurasia and North America, the maximum SWE frequencies for the analyzed AR4 models are generally between 5 cm and 15 cm in spring, with a wide spread among the models, whereas the class frequencies for USAF/ETAC generally decrease with increasing SWE (Figure 4a). CSIRO-Mk3.0 shows a superior SWE frequency distribution compared to all other AR4 models. This is clear both from direct visual inspection of Figure 4 and from a comparison of the skewness S of the various frequency distribution curves. For Eurasia, for example, S is positive only for USAF/ETAC (S = 0.12) and CSIRO-Mk3.0 (S = 0.27), while the corresponding skewness is negative for all other coupled climate models. For the 60–70°N land zone, simulated SWE peak frequencies in spring are generally between 10 cm and 15 cm, with the highest value, approximately 19 cm, predicted by GISS-AOM (Figure 4c). Note that frequencies for the first bin, ranging from 0 cm to 2 cm, have been omitted since this bin gives, as a first approximation, the percentage of snow-free pixels in the specified domain. This class contains more than half of all grid boxes for all presented domains excluding the zone 60–70°N.
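The text does not define how the skewness S is computed. One plausible reading, sketched here as an assumption, is the standardized third moment of SWE with the class frequencies used as weights; the function name and the sample frequencies are hypothetical.

```python
import numpy as np

def binned_skewness(bin_centers, frequencies):
    """Frequency-weighted skewness (standardized third central moment).

    bin_centers: class midpoints (cm); frequencies: class frequencies
    in any units, normalized internally to sum to one.
    """
    x = np.asarray(bin_centers, dtype=float)
    w = np.asarray(frequencies, dtype=float)
    w = w / w.sum()
    mean = np.sum(w * x)
    var = np.sum(w * (x - mean) ** 2)
    return float(np.sum(w * (x - mean) ** 3) / var ** 1.5)

# Invented class frequencies for 2 cm bins starting at 2 cm.
centers = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
decreasing = np.array([5.0, 4.0, 3.0, 2.0, 1.0])   # most cells in thin-snow classes
increasing = decreasing[::-1]                      # most cells in thick-snow classes

s_decreasing = binned_skewness(centers, decreasing)
s_increasing = binned_skewness(centers, increasing)
```

Under this definition, frequencies that decrease with increasing SWE give a positive S, whereas a distribution concentrated in the thicker classes gives a negative S.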
 In late autumn (Figures 4b and 4d), the frequency distribution of SWE is captured reasonably well in all AR4 models. The number of grid boxes with moderate snow cover (SWE between 2 cm and 10 cm) is generally overestimated in North America by the AR4 models (not shown). This feature agrees with the slight overestimation of the mean snow mass over North America in late spring produced by most AR4 models (Figure 1c).