Global mean cloud feedbacks in ten atmosphere-only climate models are estimated in perturbed sea surface temperature (SST) experiments and the results compared to doubled CO2 experiments using mixed-layer ocean versions of these same models. The cloud feedbacks in any given model are generally not consistent: the sign of the net cloud radiative feedback may vary according to the experimental design. However, both sets of experiments indicate that the variation of the total climate feedback across the models depends primarily on the variation of the net cloud feedback. Changes in different cloud types show much greater consistency between the two experiments for any individual model and amongst the set of models analyzed here. This suggests that the SST perturbation experiments may provide useful information on the processes associated with cloud changes which is not evident when analysis is restricted to feedbacks defined in terms of the change in cloud radiative forcing.
The experimental design of Cess et al., which we follow, was based on atmosphere-only, fixed-season (July) integrations in which the sea surface temperature (SST) was varied by ±2K about its climatological mean value. Such experiments were not intended to be representative of realistic patterns of climate change. Rather, the idea was to provide a consistent framework within which to assess processes across an ensemble of GCMs [Cess and Potter, 1988]. A further advantage of this design is that it is computationally inexpensive. To make a direct comparison with these earlier studies we have repeated these experiments with a set (currently ten) of contemporary models. In addition to the standard analysis of top-of-atmosphere radiative fluxes we also use model diagnostics corresponding to the ISCCP cloud classification [Klein and Jakob, 1999; Webb et al., 2001]. These provide detailed information on the responses of different cloud types. The results are compared both to earlier studies using the same experimental design and to a set of contemporary slab ocean model experiments in which the CO2 concentration is doubled. Slab models give a pattern of surface warming that is more similar to that of fully coupled models [Williams et al., 2001; Meehl et al., 2004] as they allow the SSTs to respond to surface fluxes and feed back on the atmosphere. Full details of the experimental designs are on the CFMIP website (www.cfmip.net). Where possible, results are presented for models where both ±2K and 2 × CO2 experiments are available.
There have been few attempts to make a direct comparison between ±2K and 2 × CO2 simulations using the same models. Senior and Mitchell compared simulations of the Hadley Centre climate model using three different layer cloud parameterizations. They showed that both the relative strength and the sign of the feedback may differ between the two experiments. Colman compared offline feedback calculations from 2 × CO2 experiments with the Cess et al. results and suggested that, taking into account the differences in the two approaches, the ranges of cloud feedbacks were comparable. However, the selections of models used in the comparison were not the same.
 Given the importance and influence of the Cess et al. [1990, 1996] results on the study of cloud feedbacks it is appropriate to repeat the analysis in contemporary GCMs and to compare the results to doubled CO2 simulations using the same models, particularly as the validity of the quantitative conclusions drawn from the ±2K studies has recently been questioned [Soden et al., 2004; Stephens, 2005].
2. SST Perturbation Experiments
 We begin by considering the ±2K experiments for the present set of models. Figure 1 shows the cloud feedback parameter, defined as ΔCRF/G, where ΔCRF is the global mean change in the cloud radiative forcing and G is the direct radiative forcing associated with the climate change. In the ±2K experiments G is taken to be the radiative imbalance resulting from the SST perturbation. These are “inverse” climate change experiments: the climate change is prescribed and the models produce their forcings according to
G = ΔF − ΔQ,
where ΔF and ΔQ are the global mean changes to the outgoing longwave and absorbed shortwave radiation respectively [Cess et al., 1990]. Earlier versions of all of the current models participated in the Cess et al. study and many changes in model formulation have occurred since. It thus makes little sense to compare the results on a model-by-model basis and so we refer to the present selection as A, B, etc., in ascending order of ΔCRF/G.
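As a minimal sketch of the bookkeeping behind the cloud feedback parameter, the following Python function computes G and ΔCRF/G from global mean flux changes. It assumes the forcing relation G = ΔF − ΔQ of Cess et al. [1990] and the conventional definition of cloud radiative forcing (clear-sky minus all-sky outgoing longwave, plus all-sky minus clear-sky absorbed shortwave), which is not spelled out in the text. The function name, argument names and input values are hypothetical and are not taken from any of the models.

```python
def cloud_feedback_parameter(dF, dQ, dF_clr, dQ_clr):
    """Return G and the cloud feedback parameter with its SW and LW parts.

    dF, dQ         : all-sky changes in outgoing LW and absorbed SW (W m-2)
    dF_clr, dQ_clr : the same changes computed for clear skies (W m-2)
    """
    G = dF - dQ                   # assumed forcing relation, G = dF - dQ
    d_lw_crf = dF_clr - dF        # change in longwave cloud forcing (assumed definition)
    d_sw_crf = dQ - dQ_clr        # change in shortwave cloud forcing (assumed definition)
    d_crf = d_lw_crf + d_sw_crf   # change in net cloud forcing
    return {
        "G": G,
        "lw": d_lw_crf / G,
        "sw": d_sw_crf / G,
        "net": d_crf / G,
    }


# Hypothetical global mean changes between the +2K and -2K runs (W m-2).
result = cloud_feedback_parameter(dF=7.6, dQ=0.4, dF_clr=8.0, dQ_clr=0.6)
print(result)
```

With these illustrative inputs the net parameter is positive although the SW component is negative, showing how the LW and SW terms can partly cancel.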
 Examination of the cloud feedback parameter and its shortwave (SW) and longwave (LW) components reveals that the range in the net cloud feedback is clearly dominated by that in the SW component in the current selection of models, whereas in the Cess et al.  study there was a broad range of variation in both the SW and LW components.
Figure 1 also shows plots of ΔCRF/G and its SW and LW components derived from 14 models submitted to the IPCC 4th Assessment Report (AR4). In this case G is the radiative forcing due to doubling the atmospheric CO2 concentration (see Webb et al. for details). These results suggest a greater range in ΔCRF/G than in the SST perturbation experiments, particularly at the lower end. (The comparison is limited somewhat by the absence of ±2K experiments for the two AR4 models with the largest positive cloud feedback.) Direct comparison of models for which both experiments are present suggests that this is not necessarily a result of the ±2K simulations being from an unrepresentative sample of models and may be due to differences in the two experimental designs: the order of the models is different and three of the models which indicate a positive feedback in the ±2K simulations have a negative feedback in the 2 × CO2 simulations. These findings are thus consistent with Senior and Mitchell. Examination of the SW and LW components of the cloud feedback parameter in the 2 × CO2 experiments again shows that the range of the inter-model differences in the net effect is dominated by that in the SW. However, in contrast to the ±2K experiments, there is a much larger impact in the LW in some models. Consequently, the sign of the net feedback is not determined by the SW effect in all cases.
It should be noted that a reduction of the cloud radiative forcing (ΔCRF/G < 0) does not imply that clouds are acting to damp the climate sensitivity. Indeed, Soden and Held suggest that in all 14 GCMs they analyzed clouds act to amplify the climate sensitivity, even though approximately half the models indicate reductions in the net cloud radiative forcing.
3. Comparison of ±2K and 2 × CO2 Experiments
 We now directly compare the cloud feedbacks in the ±2K and 2 × CO2 experiments for those models where both are available. Here we define the cloud feedback terms as ΔCRF/ΔTS, where ΔTS is the global mean surface air temperature change. Figure 2 shows the SW, LW and net cloud feedback terms for the two sets of experiments (9 common models). This confirms that cloud feedbacks will not necessarily be of the same sign in any given model in the two experiments: the SW, LW and net feedback terms are of opposite sign in four, five and three models respectively. For two GCMs (D and F) this is true of all three terms. Clearly, placing the models in order of net feedback also leads to a different outcome in the two cases. It thus appears very difficult (if not impossible) to draw useful inferences on cloud feedbacks in contemporary models from this comparison – the two sets of experiments seem to lead to quite different conclusions.
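The kind of comparison described above can be sketched in a few lines of Python: given the feedback term ΔCRF/ΔTS for each model in the two experiments, list the models whose sign differs and compare the rank order. The model labels and values below are invented for illustration and do not correspond to the models in Figure 2.

```python
def sign_flips(pm2k, co2):
    """Return the models whose feedback has opposite sign in the two experiments."""
    common = sorted(set(pm2k) & set(co2))   # only models present in both sets
    return [m for m in common if pm2k[m] * co2[m] < 0]


def rank_order(feedbacks):
    """Return model labels in ascending order of their feedback value."""
    return sorted(feedbacks, key=feedbacks.get)


# Hypothetical net cloud feedbacks (W m-2 K-1) for four made-up models.
pm2k = {"m1": -0.20, "m2": 0.05, "m3": 0.30, "m4": 0.45}
co2 = {"m1": -0.35, "m2": -0.10, "m3": 0.40, "m4": 0.15}

flipped = sign_flips(pm2k, co2)
same_order = rank_order(pm2k) == rank_order(co2)
print(flipped, same_order)
```

The same two functions can be applied separately to the SW, LW and net terms to reproduce the type of count quoted in the text.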
Figure 2 also shows plots of G/ΔTS (the “total feedback parameter” or the inverse of the “climate sensitivity parameter”) against the net cloud feedback for the full sets of ±2K and 2 × CO2 experiments shown in Figure 1. The best-fit regression lines are also shown and the statistics from the regressions are given in Table 1. The key conclusion from the original Cess et al. study was that most of the variation in the climate sensitivity across that selection of GCMs was attributable to differences in cloud feedbacks. Figure 2 and Table 1 show that this is still the case for ±2K experiments with contemporary models and that a similar result holds for the 2 × CO2 simulations. The two lines indicate a similar relationship between G/ΔTS and ΔCRF/ΔTS (the slopes are ∼−1 in both cases), with the displacement arising from differences in the clear-sky feedbacks, the ensemble mean values of which (1.86 and 1.14 Wm−2 K−1 respectively) correspond to the intercepts of the lines. In the ±2K experiments the snow and sea-ice feedbacks are excluded by design: the mean clear-sky SW feedback of 0.20 Wm−2 K−1 is consistent with Cess et al. and is due to increased absorption by water vapour [Zhang et al., 1994; Colman, 2003]. In the 2 × CO2 experiments the snow and sea-ice feedbacks increase this value to 0.74 Wm−2 K−1. The mean clear-sky longwave feedback is slightly higher in the ±2K experiments (2.06 Wm−2 K−1 compared to 1.87 Wm−2 K−1) but nonetheless suggests that the combined lapse rate and water vapour feedbacks are comparable [see also Colman, 2003]. Moreover, the variation of the combined lapse rate and water vapour feedbacks across the models is likely to be small as the two effects tend to offset each other [Colman, 2003; Soden and Held, 2006].
Although these clear-sky terms are clearly important in determining the total feedback in any particular model, the net clear-sky feedback is relatively invariant with G/ΔTS across the models, so that the variation of the cloud feedback is then the dominant factor [see also Webb et al., 2006]. It should be noted, however, that the clear-sky shortwave term is well correlated with the total feedback in the slab models, although the model-to-model variations are considerably smaller than those in SW cloud feedback. Also shown in Table 1 are the regression statistics derived from nine slab models included in the IPCC 3rd Assessment Report. The results are similar to those from the AR4 models: although the mean values of the net cloud feedback are of opposite sign, the variation is so large that this difference is not statistically significant. Indeed, removing one model from the AR4 ensemble changes the sign of the mean. Clearly, caution must be exercised when discussing statistics relating to such small ensembles.
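The fragility of a small-ensemble mean can be illustrated with a simple leave-one-out check. The sketch below uses invented feedback values, chosen only so that the mean is small compared with the spread; it is not a reconstruction of the AR4 or TAR data.

```python
from statistics import mean, stdev


def leave_one_out_means(values):
    """Return the ensemble mean obtained after dropping each member in turn."""
    return [mean(values[:i] + values[i + 1:]) for i in range(len(values))]


# Hypothetical net cloud feedbacks (W m-2 K-1) for a seven-member ensemble.
values = [-0.60, -0.30, -0.10, 0.05, 0.20, 0.25, 0.30]

full_mean = mean(values)
spread = stdev(values)
loo = leave_one_out_means(values)

# Indices of members whose removal changes the sign of the ensemble mean.
flips = [i for i, m in enumerate(loo) if m * full_mean < 0]
print(full_mean, spread, flips)
```

When the standard deviation is much larger than the magnitude of the mean, as here, dropping a single member is often enough to reverse the sign.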
Table 1. Regression of G/ΔTS Versus ΔCRF/ΔTS for the ±2K and 2 × CO2 Experiments (a)

| Experiment | Slope | Intercept, Wm−2 K−1 | ΔCRF/ΔTS (mean ± s.d.), Wm−2 K−1 |
| --- | --- | --- | --- |
| ±2K | −0.94 ± 0.15 | 1.85 ± 0.04 | 0.14 ± 0.26 |
| AR4 − 2 × CO2 | −1.03 ± 0.17 | 1.15 ± 0.04 | −0.04 ± 0.31 |
| TAR − 2 × CO2 | −0.99 ± 0.21 | 1.24 ± 0.08 | 0.07 ± 0.39 |

(a) The ensemble mean values and standard deviations of ΔCRF/ΔTS are given in the final column.
 These results imply that, while the two types of experiment may lead to different conclusions regarding the cloud radiative feedbacks in any given model, the more general conclusion that the variation of the total feedback across an ensemble of models is primarily dependent on the variation in the net cloud radiative feedback appears to hold in both cases. This is because the net clear-sky feedbacks vary much less between models.
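One way to see why the regression slopes are close to −1 is the following sketch. If cloud radiative forcing is defined in the conventional way (an assumption here, as is G = ΔF − ΔQ), then ΔCRF equals the clear-sky value of G minus the all-sky value, so that G/ΔTS equals the clear-sky feedback minus ΔCRF/ΔTS. When the clear-sky feedback varies little across models, an ordinary least-squares fit then returns a slope near −1 and an intercept near the mean clear-sky feedback. The ensemble values below are synthetic and purely illustrative.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Synthetic five-member ensemble (W m-2 K-1): the cloud feedback varies widely,
# the clear-sky feedback only slightly.
cloud = [-0.30, -0.10, 0.05, 0.20, 0.45]
clear_sky = [1.84, 1.88, 1.85, 1.87, 1.86]

# Total feedback parameter implied by the assumed identity.
total = [c - f for c, f in zip(clear_sky, cloud)]

slope, intercept = linear_fit(cloud, total)
print(slope, intercept)
```

Increasing the scatter in `clear_sky`, or correlating it with `cloud`, moves the slope away from −1, which is why the small inter-model variation of the clear-sky terms matters for this result.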
We next examine whether the differences in the cloud radiative feedbacks between the two experiments arise from fundamental differences in the cloud changes or from some other cause. Figure 3 shows the responses (i.e., the global mean changes divided by ΔTS) of the nine ISCCP cloud types in the seven models for which these diagnostics are available in both experiments. With just one exception (high/thin cloud in I) the sign of the response in each of these cloud types is the same in both experiments for all of the models. (The changes in mid-level cloud in A are negligible in both cases.) This is even the case for D, which indicated different signs for each of the SW, LW and net cloud feedback terms, and H, in which substantial SW cloud feedbacks of opposite sign lead to similar behaviour in the net feedback. Of further interest is that, in general, the sign of the changes in each cloud type is the same in all of these particular models. For example, apart from A, each of the models indicates a reduction in both low/thin and low/medium thickness cloud accompanied by an increase in low/thick cloud: this is suggestive of an optical depth feedback in which optically thick cloud becomes thicker but the amount of optically thin cloud diminishes. Similarly, the models all indicate reductions in middle-level thin and medium thickness cloud and increases in high, optically thick cloud, the latter corresponding to changes in deep convective cloud in the tropics and frontal cloud at mid-latitudes.
 The magnitude of the cloud changes is clearly not identical in the two experiments, although in many cases they are very similar. As the radiative effects of the individual ISCCP cloud types vary considerably [e.g., Chen et al., 2000] these differences will lead to differences in the radiative changes due to each cloud type, the cumulative effect of which could explain much of the inconsistency in the cloud radiative feedbacks.
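The cloud-type comparison can be sketched as follows: normalise the global mean change in each of the nine ISCCP categories by ΔTS, then check whether the sign of the response agrees between the two experiments. The cloud amount changes and ΔTS values below are invented for a single hypothetical model; the key layout (height, optical thickness) is likewise an illustrative choice.

```python
HEIGHTS = ("low", "middle", "high")
THICKNESSES = ("thin", "medium", "thick")


def responses(d_amount, d_ts):
    """Divide each global mean cloud amount change by the surface warming."""
    return {key: value / d_ts for key, value in d_amount.items()}


def sign_disagreements(resp_a, resp_b):
    """Return the cloud types whose response has opposite sign in the two sets."""
    return [key for key in resp_a if resp_a[key] * resp_b[key] < 0]


# Hypothetical changes in cloud amount (%) for one made-up model, listed in
# the order low/middle/high and, within each height, thin/medium/thick.
keys = [(h, t) for h in HEIGHTS for t in THICKNESSES]
d_pm2k = dict(zip(keys, [-0.40, -0.80, 0.40, -0.20, -0.40, 0.10, 0.20, -0.10, 0.60]))
d_co2 = dict(zip(keys, [-0.45, -0.30, 0.15, -0.09, -0.24, 0.06, -0.03, -0.12, 0.30]))

resp_pm2k = responses(d_pm2k, d_ts=4.0)   # assumed warming in the +/-2K pair
resp_co2 = responses(d_co2, d_ts=3.0)     # assumed warming in the 2 x CO2 run

differ = sign_disagreements(resp_pm2k, resp_co2)
print(differ)
```

Agreement in sign says nothing about magnitude: two experiments can agree on every cloud type and still give different radiative feedbacks if the sizes of the changes differ.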
Consistent with Senior and Mitchell, we find that in any given model the sign of the global mean net cloud feedback, and of its shortwave and longwave components, may differ between ±2K and 2 × CO2 experiments. The relative strength of the cloud feedbacks across GCMs is also likely to be different. However, the most important conclusion of the original Cess et al. study, namely that the variation of the total climate feedback across an ensemble of GCMs depends primarily on the variation in the cloud feedback, holds in both cases.
Changes in the ISCCP cloud types show a remarkable degree of similarity in the sign of the response between the two experiments: almost without exception the sign of the change of any particular cloud type is the same, indicating that the qualitative behaviour (at least at the global scale) of the 2 × CO2 cloud changes is captured by the ±2K experiments. This suggests that the differences in the cloud feedbacks arise from the different magnitudes of the individual cloud changes rather than from any fundamental difference in the response of the individual cloud types. The effects of non-cloud feedbacks (e.g., changes in sea-ice) on the measure of cloud feedback being used here may also contribute [Zhang et al., 1994]. Soden et al. estimate that the use of the cloud forcing diagnostic could underestimate the net cloud feedback by up to 0.3 Wm−2 K−1: if the reasons for this (the “cloud masking” effect) operated differently in the two experiments this might also explain part of the discrepancy. In spite of these limitations, Soden and Held show that cloud radiative forcing provides a good measure of inter-model differences in cloud feedbacks when compared to more rigorous methods.
As stated earlier, the SST perturbation experiments were not designed to be representative of realistic climate change scenarios and results from them should always be interpreted with this in mind. Nevertheless, these results show that they can provide useful information on global mean cloud changes in a single model and on cloud feedbacks across an ensemble of GCMs. The inclusion of the ISCCP diagnostics allows the cloud changes to be more easily related to changes in physical processes.
 In certain circumstances, for example, during the model development process, the relatively simple and computationally inexpensive ±2K experiments may provide a good qualitative guide to the impact of developments on both cloud responses and processes under climate change. They may also prove useful when performing ‘high cost’ studies such as simulations at very high spatial resolution.
 This work was supported by the UK Department of Environment, Food and Rural Affairs under contract PECD 7/12/37. We acknowledge the international modeling groups for providing their data for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the model data, the JSC/CLIVAR Working Group on Coupled Modelling (WGCM) and their Coupled Model Intercomparison Project (CMIP) and Climate Simulation Panel for organizing the model data analysis activity, and the IPCC WG1 TSU for technical support. The IPCC Data Archive at Lawrence Livermore National Laboratory is supported by the Office of Science, U.S. Department of Energy.