Climate model studies of the consequences of solar geoengineering are central to evaluating whether such approaches may help to reduce the harmful impacts of global warming. In this study we compare the sunshade solar geoengineering response of a perturbed parameter ensemble (PPE) of the Hadley Centre Coupled Model version 3 (HadCM3) with a multimodel ensemble (MME) by analyzing the G1 experiment from the Geoengineering Model Intercomparison Project (GeoMIP). The PPE only perturbed a small number of parameters and shares a common structure with the unperturbed HadCM3 model, and so the additional weight the PPE adds to the robustness of the common climate response features in the MME is minor. However, analysis of the PPE indicates some of the factors that drive the spread within the MME. We isolate the role of global mean temperature biases for both ensembles and find that these biases have little effect on the ensemble spread in the hydrological response but do reduce the spread in surface air temperature response, particularly at high latitudes. We investigate the role of the preindustrial climatology and find that biases here are likely a key source of ensemble spread at the zonal and grid cell level. The role of vegetation, and its response to elevated CO2 concentrations through the CO2 physiological effect and changes in plant productivity, is also investigated and proves to have a substantial effect on the terrestrial hydrological response to solar geoengineering and to be a major source of variation within the GeoMIP ensemble.
Solar geoengineering has been proposed as a potential means of reducing the risks of global warming by reducing the absorption of sunlight by the planet and lowering temperatures [Crutzen, 2006; Keith, 2000]. If such solar geoengineering is indeed feasible, a key concern would be the effects it would have on the climate, i.e., whether it would actually reduce the harmful impacts of global warming. Research suggests that only field tests of solar geoengineering that are a substantial fraction of full-scale deployment could provide useful direct information on the climate outcomes [MacMynowski et al., 2011; Robock et al., 2010]. As the climate is naturally highly variable, even after a decade of observation of large-scale solar geoengineering deployment (e.g., more than a few tenths of a Wm−2 forcing globally), one would only be able to loosely bound regional climate changes and precise information on regional changes in climate could take a century or more to gather [MacMynowski et al., 2011]. Thus, model projections are critical to assessing the potential consequences of proposed solar geoengineering schemes, and even if solar geoengineering were to be deployed, they would be central to assessing which observed weather events and decadal trends are attributable to solar geoengineering.
To make such projections of future climate, coupled ocean-atmosphere general circulation models (GCMs) are needed. GCMs have been used to assess the effectiveness and the potential climate consequences of a range of different proposed solar geoengineering techniques, e.g., sunshade geoengineering [Govindasamy et al., 2003; Lunt et al., 2008], sulfate aerosol injection [Niemeier et al., 2011; Rasch et al., 2008], cloud albedo modification [Jones and Haywood, 2012; Latham et al., 2012], desert albedo geoengineering [Irvine et al., 2011], and crop albedo modification [Singarayer et al., 2009]. In these previous studies several different approaches to represent the proposed solar geoengineering techniques were adopted, different boundary conditions were chosen, and the data were not routinely made available, hence directly comparing the results of different studies was difficult. The Geoengineering Model Intercomparison Project (GeoMIP) has proposed a set of standard experiments and output to make such comparisons easier [Kravitz et al., 2011], building on the work done for the Climate Model Intercomparison Project 5 (CMIP5) [Taylor et al., 2012]. Many modeling groups have now run these standard GeoMIP experiments, enabling a comparison of their results and allowing the potential climate consequences of solar geoengineering and its uncertainties to be evaluated.
GeoMIP was envisioned as a way to create a multimodel ensemble (MME) of many different GCMs, but the strategy is equally applicable to perturbed parameter ensembles (PPE) of single models. Although the two approaches differ in important ways, both allow identification of common responses and also of disagreement in model response. They can both be used to project likely outcomes and give some measure of the uncertainty of such projections. Furthermore, they highlight the most important and sensitive processes that govern the response, thus guiding future studies and model development. MMEs are often “ensembles of opportunity,” with no systematic selection process, and not all models within the ensemble perform equally well. However, it has been observed that the mean or median projections of an MME perform better than any single model when compared to observations for global metrics over a range of variables [Reichler and Kim, 2008]. PPEs, on the other hand, are typically generated by systematically perturbing a number of key physical parameters within a single GCM, and an arbitrary number of ensemble members can be generated, resources permitting [Murphy et al., 2004; Sanderson et al., 2008b; Stainforth et al., 2005]. The systematic structure and larger ensemble size of PPEs have their advantages, but since all ensemble members retain the same model structure, common problems and responses, perhaps restricted to the specific model, can be present across the entire ensemble [Collins et al., 2010; Yokohata et al., 2012]. As has been shown, combining these two approaches to assess the uncertainties of and differences between model projections will lead to new insights [Yokohata et al., 2012].
This study compares the results from the 12 member GeoMIP MME with a 20 member PPE of the Hadley Centre Coupled Model version 3 (HadCM3) GCM, for the sunshade geoengineering experiment G1 of GeoMIP [Irvine et al., 2013; Kravitz et al., 2011]. The G1 experiment consists of an instantaneous quadrupling of preindustrial CO2 concentrations which is balanced by a uniform reduction of incoming solar radiation at the top of atmosphere, i.e., sunshade geoengineering (see Kravitz et al.  for a full description and illustration of the experimental setup). This study aims to analyze the differences within and between the ensembles and to isolate the role of various factors that lead to these differences. The paper consists of the following sections: a methods section, a results section, a discussion, and a conclusion.
2.1 GeoMIP Multimodel Ensemble
In this study we use 12 models from the GeoMIP ensemble that completed the G1 sunshade experiment (Kravitz et al. [2013c, Table 1] lists the models that completed these simulations). The GeoMIP ensemble samples structural uncertainty in GCM projections of the response to solar geoengineering, i.e., the models differ in which processes are included and how they are parameterized. Within a given model structure, there are uncertainties over the parameter values to choose and this parametric uncertainty is what is explored in a PPE. The first key difference between the models of the GeoMIP ensemble is their horizontal and vertical resolutions of the atmospheric component, which range from 1° × 1° (latitude × longitude) to 2.5° × 3.75° (the resolution of HadCM3) horizontally and from 19 to 80 vertical levels. The vertical resolution affects, among other things, the fidelity of the representation of orography, boundary layer processes, and upper atmospheric dynamics [Hardiman et al., 2012]. The atmospheric components of the models contain many poorly understood and uncertain processes, such as clouds, aerosols, and aerosol-cloud interactions, which are represented differently in the different models. This affects their resultant forcing and feedback responses to greenhouse gases [Andrews et al., 2012]. The ocean models also differ, affecting processes that control atmosphere-ocean phenomena such as ocean heat uptake, the El Niño–Southern Oscillation, and the Meridional Overturning Circulation [Cheng et al., 2013; Philip et al., 2010].
The representations of the land surface in the models differ, and these differences are summarized in Table S1 in the supporting information. All models include a representation of the CO2 physiological effect (except EC-Earth), where the greater availability of ambient CO2 causes the stomata on the plants to constrict, reducing stomatal conductance and suppressing transpiration. All models also include the CO2 fertilization effect where the net primary productivity (NPP) of plants increases in response to the greater availability of CO2, counteracting to some degree the reduction in transpiration from the CO2 physiological effect [Betts et al., 1997]. Most models include an interactive vegetation distribution with fixed vegetation distributions in the others (including HadCM3), although the Max Planck Institute-Earth System Model, which does not have dynamic vegetation, simulates changes in leaf area index (LAI), the ratio of leaf surface area to ground surface area. Nitrogen and other nutrient limitations reduce the extent to which increases in NPP in response to elevated CO2 concentrations can be achieved; this photosynthesis downregulation is only represented in some models, and only the models which use the Community Land Model version 4 (CLM4) include a representation of the nitrogen cycle.
2.2 HadCM3 Perturbed Parameter Ensemble
The HadCM3 model was one of the CMIP3 generation of models and has a relatively low computational cost compared to the CMIP5 generation models [Covey et al., 2003; Gordon et al., 2000; Solomon et al., 2007]. This relatively low computational cost has led to HadCM3 being used for a host of PPE applications, e.g., to investigate uncertainty in climate sensitivity [Murphy et al., 2004; Rowlands et al., 2012; Sanderson et al., 2008b], to attribute weather extremes to climate change [Stott et al., 2004], and to inform UK climate adaptation policy [Murphy et al., 2009]. The PPE of HadCM3, which we use in this study, is described in full by Irvine et al.  along with the methodology used to generate it and an assessment of its performance. Here we briefly describe the ensemble design and the parameters perturbed. For details of the ensemble performance, see Irvine et al. .
The ensemble used here employs the fully coupled version of the model. We perturbed ocean parameters as well as atmospheric parameters, and we did not employ flux correction. Irvine et al.  generated an initial ensemble of 200 members and retained those which were projected to remain within 2°C of the observed preindustrial global mean temperature on the basis of a 20 year preindustrial simulation. Twenty-one ensemble members were confirmed to remain within the target range, and six ensemble members were rejected at this stage, as their temperature change was greater than initially projected. After completing 150 years of a quadrupled CO2 experiment, one further ensemble member was rejected, as it showed a nonlinear climate response of unchecked warming. The remaining 20 ensemble members form the PPE used in this study. The long spin-up requirement and resource constraints limited the size of the ensemble, which is far smaller than other PPEs of HadCM3 that have included thousands of members [Stainforth et al., 2005].
The eight parameters perturbed in the PPE are shown in Table 1, which also appears in Irvine et al. . Only eight parameters were perturbed as a compromise between the goal of perturbing as many uncertain processes as possible and the goal of adequately sampling the multidimensional parameter space. If more parameters were perturbed, then a broader range of parametric uncertainty would be probed, but it would be harder to understand the response with the limited number of ensemble members it was possible to run. The parameter perturbations were sampled using a maximin latin hypercube approach, which aims to achieve a high degree of multivariate separation between sampled points in high-dimensional spaces. The six atmospheric parameters are those which were found to most strongly control climate sensitivity in HadCM3 [Rougier et al., 2009; Sanderson et al., 2008a; Stainforth et al., 2005], and they affect convection, cloud formation, and precipitation formation. A single ocean parameter was chosen, the background ocean diffusivity parameter, which has been found to have the greatest effect on the transient climate response and on many other ocean properties in HadCM3 [Brierley et al., 2010; Collins et al., 2007]. The sea ice minimum albedo was also chosen, as it plays an important role in determining the control sea ice conditions and the response of sea ice to changed conditions. The land surface model in all members of the PPE remains as in the standard version of HadCM3 Met Office Surface Exchange Scheme (MOSES) 1 [Cox et al., 1999], i.e., without dynamic vegetation or nutrient limitations for NPP.
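The sampling idea can be illustrated with a short Python sketch. This is not the code used to generate the PPE: the random-candidate search, the function names, the sample size of 20, and the placeholder parameter ranges are all assumptions made for illustration. The sketch draws many random Latin hypercube designs and keeps the one whose closest pair of points is farthest apart, which is one simple way to approximate a maximin design.

```python
import itertools
import math
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return one random Latin hypercube design on the unit cube.

    Each dimension is split into n_samples equal strata, and every
    stratum is used exactly once per dimension.
    """
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Place each point uniformly at random inside its stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose so that each row is one sample point.
    return [list(point) for point in zip(*columns)]


def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two points of a design."""
    return min(math.dist(a, b) for a, b in itertools.combinations(design, 2))


def maximin_latin_hypercube(n_samples, n_dims, n_candidates=200, seed=0):
    """Keep the candidate design with the largest minimum pairwise distance."""
    rng = random.Random(seed)
    candidates = (
        latin_hypercube(n_samples, n_dims, rng) for _ in range(n_candidates)
    )
    return max(candidates, key=min_pairwise_distance)


def scale_to_ranges(design, ranges):
    """Map unit-cube samples onto (minimum, maximum) parameter ranges."""
    return [
        [low + u * (high - low) for u, (low, high) in zip(point, ranges)]
        for point in design
    ]


# Hypothetical example: 20 samples of 8 parameters on placeholder ranges.
unit_design = maximin_latin_hypercube(n_samples=20, n_dims=8)
placeholder_ranges = [(0.0, 1.0)] * 8
design = scale_to_ranges(unit_design, placeholder_ranges)
```

In practice the placeholder ranges would be replaced by the minimum and maximum of each perturbed parameter, and a dedicated optimizer could replace the random search.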
Table 1. A List of the Eight Parameters Perturbed in the PPE Experiment Conducted With HadCM3^a

| Parameter (units) | Standard | Minimum | Maximum | Description |
| --- | --- | --- | --- | --- |
| | | | | Lateral entrainment rate coefficient |
| VF1 (m s⁻¹) | | | | Ice fall speed |
| | 1.0 × 10⁻⁴ | 5.0 × 10⁻⁵ | 4.0 × 10⁻⁴ | Cloud droplet to rain conversion rate |
| CW land/sea (kg m⁻³) | 2.0 × 10⁻⁴ / 5.0 × 10⁻⁵ | 1.0 × 10⁻⁴ / 2.0 × 10⁻⁵ | 2.0 × 10⁻³ / 5.0 × 10⁻⁴ | Cloud droplet to rain conversion threshold over land and sea |
| | | | | Empirically adjusted cloud fraction at saturation |
| | | | | Threshold of relative humidity for cloud formation |
| | | | | Sea ice albedo at 0°C |
| VDIFF min/max (m² s⁻¹) | | | | Background vertical diffusivity, which varies as a function of depth |

^a Shown are the parameter name with units in parentheses, value of the parameter in the standard configuration, the minimum and maximum for the parameter range, and a short description.
2.3 Experimental Design
Three simulations are run for each model and PPE member: a preindustrial control run (piControl), a simulation with an instantaneous quadrupling of preindustrial CO2 levels to ~1144 ppmv (abrupt4xCO2), and a simulation with an instantaneous quadrupling of CO2 levels and a reduction in insolation (G1). The insolation reduction is chosen to achieve a top of atmosphere (TOA) radiative imbalance that is close to that of the preindustrial. This was achieved by testing whether the mean TOA radiative imbalance was within ±0.1 Wm−2 of the piControl value over an initial 10 years of simulation, and if not, modifying the solar constant appropriately and repeating. This balanced insolation reduction was then used for a 50 year simulation of G1, which will be compared with the first 50 years of parallel simulations of abrupt4xCO2 and piControl. Five hundred years of preindustrial control were completed before these other experiments were started; as such, there is a small degree of drift in the climate of the piControl simulations for the members of the PPE, as the deep ocean is not fully in steady state [Irvine et al., 2013]. All averages reported in this study are over years 11–50 unless otherwise stated. Although some of the GeoMIP ensemble members had multiple initial condition runs for each experiment, we use only one ensemble member from each model in this study to simplify the analysis. For more details on these experiments, Kravitz et al. [2013c] give an overview of the climate response in these simulations, Tilmes et al. analyze the precipitation response, and Kravitz et al. [2013b] analyze the hydrological response from an energetic perspective.
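The test-and-repeat balancing procedure can be sketched as a simple root-finding loop. The text does not say how the solar constant was modified between attempts, so the secant update below is an assumption, as are the function names, the starting guesses, and the toy linear response that stands in for a 10 year GCM run.

```python
def toy_toa_imbalance(solar_reduction):
    """Hypothetical stand-in for a 10 year GCM simulation.

    Returns the mean TOA imbalance relative to piControl (W m-2) for a
    fractional reduction of the solar constant. Both constants are
    invented for illustration and come from no model in this study.
    """
    co2_forcing = 7.0      # assumed forcing from quadrupled CO2, W m-2
    sensitivity = 175.0    # assumed W m-2 per unit fractional reduction
    return co2_forcing - sensitivity * solar_reduction


def balance_insolation(run_model, guess_a=0.03, guess_b=0.05,
                       tolerance=0.1, max_updates=10):
    """Adjust the solar reduction until |TOA imbalance| <= tolerance.

    Uses secant updates, which assumes the two most recent runs gave
    different imbalances.
    """
    x_prev, f_prev = guess_a, run_model(guess_a)
    x_curr, f_curr = guess_b, run_model(guess_b)
    for _ in range(max_updates):
        if abs(f_curr) <= tolerance:
            return x_curr, f_curr
        # Secant step: interpolate linearly between the last two runs.
        x_next = x_curr - f_curr * (x_curr - x_prev) / (f_curr - f_prev)
        x_prev, f_prev = x_curr, f_curr
        x_curr, f_curr = x_next, run_model(x_next)
    if abs(f_curr) <= tolerance:
        return x_curr, f_curr
    raise RuntimeError("TOA imbalance did not reach the tolerance")


reduction, imbalance = balance_insolation(toy_toa_imbalance)
```

With the toy response the loop settles on a 4% reduction after one update; with a real model each call to `run_model` would be a separate 10 year simulation.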
3.1 The Role of Global Mean Temperature Biases in Ensemble Response
The GeoMIP ensemble average for the G1 experiment is in near-perfect radiative equilibrium in the global mean, with all individual ensemble members remaining within a few tenths of a Wm−2 of radiative balance. The same is true of members of the PPE, but the PPE mean is out of balance by +0.1 Wm−2. See Table 2 for a summary of the ensemble response. Overall, the procedures applied by the modeling groups have been fairly successful in balancing the radiative forcing of abrupt4xCO2 with an insolation reduction. As might be expected, the global mean temperature response of the ensembles reflects these TOA radiative imbalances, with the GeoMIP ensemble mean showing near-zero change and the PPE mean warming by 0.15°C. However, individual members of the GeoMIP ensemble show some deviations, most notably Beijing Normal University Earth System Model (BNU-ESM) which shows a 0.65°C warming, whereas the PPE members are all within 0.1°C of the ensemble mean response.
Table 2. The Global Mean Response of the PPE and the GeoMIP Ensemble for a Number of Summary Variables^a

| Variable | PPE | GeoMIP |
| --- | --- | --- |
| G1-piControl TOA radiative imbalance (W m−2) | 0.10 ± 0.05 (0.00, 0.18) | 0.01 ± 0.17 (−0.25, 0.24) |
| G1-piControl SAT (°C) | 0.15 ± 0.05 (0.05, 0.24) | 0.01 ± 0.25 (−0.30, 0.65) |
| G1-piControl precipitation (%) | −4.50 ± 0.18 (−4.83, −4.26) | −4.48 ± 1.34 (−6.51, −1.87) |

^a The results are presented as ensemble mean ± ensemble standard deviation, with the minimum and maximum of the ensemble response in parentheses. All variables are global means calculated from years 11 to 50 of the simulations.
These global mean surface air temperature (GMST) deviations seen in both ensembles are one source of systematic model spread that is artificial, i.e., this source of model disagreement is potentially avoidable. The climate system response to a radiative forcing perturbation can be conceptually divided into a rapid adjustment and a feedback response or a fast and slow response [Andrews et al., 2010; Bala et al., 2010]. The rapid adjustment is not associated with GMST and has been found to depend strongly on the nature of the forcing mechanism [Andrews et al., 2010]. The feedback response, on the other hand, depends on the degree of GMST change and is largely independent of the forcing mechanism [Andrews et al., 2010]. To investigate the role that GMST deviations in the G1 experiment have on the ensemble spread, we estimate the climate response to G1 if each model had perfectly maintained the preindustrial GMST, i.e., we remove the temperature-driven feedback response and investigate the rapid adjustment response only. Previous solar geoengineering experiments found a linear climate response to changes in insolation for a given CO2 concentration [Irvine et al., 2010; Ricke et al., 2010]. We thus assume that the climate response of all variables to an insolation reduction varies linearly with the magnitude of this insolation reduction, i.e., starting from the abrupt4xCO2 climate and progressing along the vector G1-abrupt4xCO2 as the insolation reduction is increased. Kravitz et al. [2013b] investigated the rapid adjustment and feedback response of the G1 experiment in another way by separately analyzing the first year of simulation to represent the rapid adjustment and the last 40 years to represent a proxy for the feedback response.
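We calculate an adjustment factor C that depends on the GMST values for the different experiments as follows:

$$C = \frac{T_{\mathrm{abrupt4xCO2}} - T_{\mathrm{piControl}}}{T_{\mathrm{abrupt4xCO2}} - T_{\mathrm{G1}}}$$

where T denotes the GMST of the experiment named in the subscript.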
Starting from the GMST of the abrupt4xCO2 experiment and subtracting the anomaly between the G1 and abrupt4xCO2 experiments, multiplied by our adjustment factor C, will return the preindustrial global mean temperature. The mean adjustment factor for the GeoMIP ensemble is 1.0 (i.e., no change) with a standard deviation of 0.06. BNU-ESM at 1.14 and Goddard Institute for Space Studies (GISS)-E2-R at 0.90 are the two extreme cases in the GeoMIP ensemble, which were too warm and too cold, respectively. The PPE has a mean adjustment factor of 1.03 with a standard deviation of 0.01. We calculate the adjustment factor separately for every ensemble member and apply this linear scaling to generate an adjusted climate response for G1 (from now on G1-adjusted) and apply this to every variable. This G1-adjusted experiment will be used for the remainder of this study unless otherwise stated.
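The linear scaling can be sketched in a few lines of Python. The expression for C is inferred from the verbal description above (the abrupt4xCO2 GMST minus C times the abrupt4xCO2-minus-G1 anomaly returns the piControl GMST); the function names and the GMST values in the example are hypothetical.

```python
def adjustment_factor(gmst_pi, gmst_4x, gmst_g1):
    """Scaling C such that gmst_4x - C * (gmst_4x - gmst_g1) == gmst_pi."""
    return (gmst_4x - gmst_pi) / (gmst_4x - gmst_g1)


def adjust_field(field_4x, field_g1, c):
    """Apply the same linear scaling to any variable, cell by cell.

    field_4x and field_g1 are equal-length sequences holding one
    variable from the abrupt4xCO2 and G1 experiments, respectively.
    """
    return [x4 - c * (x4 - xg) for x4, xg in zip(field_4x, field_g1)]


# Hypothetical GMST values (deg C) for a model whose G1 run is 0.5 deg C
# warmer than its piControl run.
c = adjustment_factor(gmst_pi=13.5, gmst_4x=18.0, gmst_g1=14.0)
adjusted_gmst = adjust_field([18.0], [14.0], c)[0]
```

Here C is 1.125, and applying it to the GMST itself recovers the piControl value of 13.5°C, as the definition requires. A model that is too cold in G1 would instead give C below 1.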
The pattern of surface air temperature (SAT) changes arising from a certain radiative forcing perturbation depends on the spatial pattern and other properties of the radiative forcing, but it also depends on the magnitude of the resulting GMST response, as positive feedbacks specific to certain locations can amplify the local response [Armour et al., 2013]. Table 3 summarizes some changes to the distribution of SAT changes for the two ensembles for the G1 and G1-adjusted experiments covering Arctic warming, which affects many aspects of the climate of the Northern Hemisphere [Solomon et al., 2007], the interhemispheric temperature difference which controls the location of the Intertropical Convergence Zone (ITCZ) [Haywood et al., 2013; Zeng, 2003], and the land-sea temperature difference, which affects the distribution of precipitation between the oceans and the continents [Bala et al., 2008]. Almost all models show greater warming at high northern latitudes than elsewhere, due to the latitudinal differences between the solar and CO2 forcing [Govindasamy et al., 2003; Kravitz et al., 2013a; Lunt et al., 2008]. In G1-adjusted both ensembles show almost the same Arctic mean warming and spread; the linear GMST scaling has approximately halved the GeoMIP ensemble standard deviation. This indicates that GMST plays a particularly strong role in the Arctic response to geoengineering, likely due to the action of positive feedbacks such as the melting of ice and snow [Moore et al., 2014]. All models show either no change or a relative warming of the Northern Hemisphere for G1-piControl, likely due to a combination of the warming of the northern landmasses and the action of stronger positive feedbacks at high northern latitudes than at high southern latitudes. The same is true for the land-sea temperature difference. For both of these measures, the GMST adjustment does not greatly change the response.
For both of these measures, for G1 and G1-adjusted, almost all PPE members show a greater change than any of the GeoMIP ensemble members, i.e., a greater warming of the land relative to the ocean and of the Northern Hemisphere relative to the Southern Hemisphere.
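Regional measures such as the 60–90°N mean and the interhemispheric difference can be computed from zonal mean anomalies with an area-weighted average. The sketch below is a minimal illustration, assuming cosine-of-latitude weights on a regular grid; the function name, the coarse latitude grid, and the anomaly values are hypothetical and are not taken from either ensemble.

```python
import math


def area_weighted_mean(values, lats, lat_min=-90.0, lat_max=90.0):
    """Mean of zonal mean values between lat_min and lat_max.

    Each latitude band is weighted by cos(latitude), which is
    proportional to its surface area on a regular grid.
    """
    total = 0.0
    weight_sum = 0.0
    for value, lat in zip(values, lats):
        if lat_min <= lat <= lat_max:
            weight = math.cos(math.radians(lat))
            total += weight * value
            weight_sum += weight
    return total / weight_sum


# Hypothetical zonal mean SAT anomalies (deg C) on a coarse grid.
lats = [-75.0, -45.0, -15.0, 15.0, 45.0, 75.0]
sat_anomaly = [0.2, 0.0, -0.1, -0.1, 0.3, 1.0]

arctic_mean = area_weighted_mean(sat_anomaly, lats, 60.0, 90.0)
nh_minus_sh = (area_weighted_mean(sat_anomaly, lats, 0.0, 90.0)
               - area_weighted_mean(sat_anomaly, lats, -90.0, 0.0))
```

The land-sea difference would additionally need a land fraction mask, which is omitted here.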
Table 3. Results Summarizing the Distribution of SAT Changes for the PPE and the GeoMIP Ensembles^a

| Variable | PPE | GeoMIP |
| --- | --- | --- |
| G1-piControl land-ocean SAT (°C) | 0.67 ± 0.08 (0.53, 0.79) | 0.35 ± 0.18 (−0.02, 0.65) |
| G1-adjusted-piControl land-ocean SAT (°C) | 0.58 ± 0.08 (0.45, 0.68) | 0.35 ± 0.18 (0.00, 0.55) |
| G1-piControl 60–90°N SAT (°C) | 1.32 ± 0.28 (0.80, 1.76) | 0.95 ± 0.63 (0.06, 2.65) |
| G1-adjusted-piControl 60–90°N SAT (°C) | 1.05 ± 0.24 (0.64, 1.47) | 0.92 ± 0.28 (0.55, 1.41) |
| G1-piControl NH-SH SAT (°C) | 0.49 ± 0.10 (0.27, 0.62) | 0.17 ± 0.10 (−0.01, 0.36) |
| G1-adjusted-piControl NH-SH SAT (°C) | 0.43 ± 0.10 (0.24, 0.58) | 0.17 ± 0.09 (0.06, 0.34) |

^a The results are presented as ensemble mean ± ensemble standard deviation, with the minimum and maximum of the ensemble response in parentheses. All variables are means calculated from years 11 to 50 of the simulations.
The strength of the hydrological cycle also depends on the GMST, with simulations and observations showing that the global mean precipitation increases and patterns of precipitation minus evaporation (P − E) intensify as the planet warms [Andrews et al., 2010; Chou et al., 2009]. Thus, the GMST deviations from the piControl will alter the patterns of hydrological change and could lead to greater model disagreement than would be the case had the models perfectly restored the piControl TOA radiative balance. Table 4 shows the global mean, land mean, and ocean mean precipitation anomalies from the piControl for both ensembles for the G1 and G1-adjusted experiments. As most models show some degree of deviation from the piControl GMST for G1, the G1-adjusted response decreases the standard deviation of the global mean precipitation response of the GeoMIP ensemble by 16% but does not greatly affect the already low standard deviation of the PPE. A similar response is seen for the ocean mean and land mean where there is little change in the mean precipitation but a reduction of the ensemble standard deviation. GMST deviations thus play a substantial role in the intraensemble spread for the GeoMIP ensemble hydrological response but do not explain all of the differences within the GeoMIP ensemble. The hydrological changes seen in the G1 experiment are described in detail by Tilmes et al. .
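As a quick check, the 16% reduction quoted above follows from the GeoMIP ensemble standard deviations of global mean precipitation change reported in Table 4 (1.34% for G1 and 1.13% for G1-adjusted). The symbols below are introduced here for illustration only:

```latex
\frac{\sigma_{\mathrm{G1}} - \sigma_{\mathrm{adj}}}{\sigma_{\mathrm{G1}}}
  = \frac{1.34 - 1.13}{1.34}
  \approx 0.16
```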
Table 4. Precipitation Responses for the PPE and GeoMIP Ensembles for the G1 and G1-Adjusted Simulations^a

| Variable | PPE | GeoMIP |
| --- | --- | --- |
| G1-piControl precipitation (%) | −4.50 ± 0.18 (−4.83, −4.26) | −4.48 ± 1.34 (−6.51, −1.87) |
| G1-adjusted-piControl precipitation (%) | −4.79 ± 0.17 (−5.00, −4.43) | −4.46 ± 1.13 (−6.47, −2.59) |
| G1-piControl ocean precipitation (%) | −5.00 ± 0.31 (−5.60, −4.39) | −4.25 ± 1.06 (−5.66, −2.30) |
| G1-adjusted-piControl ocean precipitation (%) | −5.30 ± 0.35 (−5.30, −4.71) | −4.34 ± 0.90 (−5.42, −2.39) |
| G1-piControl land precipitation (%) | −2.55 ± 1.34 (−5.50, −0.40) | −5.05 ± 3.63 (−12.22, −0.53) |
| G1-adjusted-piControl land precipitation (%) | −2.77 ± 1.36 (−5.87, −0.88) | −5.01 ± 3.47 (−12.53, −1.87) |

^a The results are presented as ensemble mean ± ensemble standard deviation, with the minimum and maximum of the ensemble response in parentheses. All variables are means calculated from years 11 to 50 of the simulations.
Figure 1 shows the zonal mean standard deviation and ensemble range for G1 and G1-adjusted for SAT, evaporation, and precipitation for both ensembles. The GeoMIP ensemble has a larger range and standard deviation than the PPE for all the variables displayed in these figures. For SAT, G1-adjusted has a substantially reduced ensemble range and standard deviation compared to G1 for the GeoMIP ensemble. The differences between the range and standard deviation of G1 and G1-adjusted are smaller for the PPE, which had much smaller differences between the GMSTs of its members. The adjustment has a much smaller effect on evaporation and precipitation, reducing the spread in the GeoMIP ensemble somewhat, but it has little effect on the PPE. Figure 2 shows a similar slight reduction in the zonal mean standard deviation and range of terrestrial evaporation and precipitation for the GeoMIP ensemble after adjustment. However, while the adjustment decreases the range and standard deviation of the GeoMIP land mean SAT response at most latitudes, it increases them around the equator. Thus, the GMST deviations between G1 and piControl were actually suppressing the differences between the GeoMIP SAT responses over tropical land regions, indicating that factors other than global mean SAT error are driving the terrestrial response.
3.2 Changes in Precipitation and Evaporation Over Land and Ocean
We compare the zonal mean land and ocean hydrological response to G1-adjusted minus piControl of the two ensembles in Figures 3 and 4. Over the oceans, the balance of CO2 and solar forcing leads to a reduction in evaporation peaking at a value of around 5% across the tropical oceans, where the insolation reduction is greatest, but with a large range of responses at high latitudes where sea ice changes may be playing an important role. The oceanic precipitation response generally shows a reduction across the tropics but one which is highly heterogeneous, with a lesser reduction in the subtropics and a consistent reduction at midlatitudes. There are very large differences in the pattern of tropical oceanic precipitation change within the GeoMIP ensemble, whereas all members of the PPE show a similar, strong decrease south of the equator. This distinctive reduction in precipitation south of the equator may be tied to the relatively strong warming of the Northern Hemisphere in all members of the PPE, as hemispheric temperature differences are simulated to lead to a shift of the ITCZ toward the warmer hemisphere [Haywood et al., 2013; Zeng, 2003].
The zonal mean land hydrology response shows larger interensemble and intraensemble differences than the ocean response (see Figures 3 and 4). All members of the PPE show a consistent ~15% reduction in continental evaporation around the equator, whereas the GeoMIP ensemble shows equatorial responses ranging from a slight increase in evaporation for BNU-ESM to a ~25% reduction for GISS-E2-R, with the other models spanning this entire range. The zonal mean land precipitation response of different ensemble members differs much more than the ocean response in both ensembles. However, both ensembles show a general reduction across the tropics, a weaker reduction in the subtropics, and a clear reduction at northern midlatitudes. The magnitude of the tropical and midlatitude reduction in land precipitation correlates with the land evaporation reduction. This is clearest at the northern midlatitudes, where the PPE shows a broadly consistent ~8% reduction in precipitation, and it can be seen that the GeoMIP ensemble members with the greatest evaporation reductions also show the greatest precipitation reductions [Kravitz et al., 2013b].
3.3 Role of piControl Climatology in the Ensemble Response
All climate models show some degree of bias when compared to observational data, and these biases are expected to be persistent across different simulations [Solomon et al., 2007]. Models have biases in the particular locations of certain key observed features of the climate, such as the location of the ITCZ in a given season and the locations of greatest surface ocean temperature variability in the region affected by El Niño–Southern Oscillation [Power et al., 2013]. When model projections are compared, as in the zonal mean plots shown in Figures 3 and 4, these biases in the locations of key features may be responsible for some of the disagreement between the model projections, e.g., if the models showed a consistent northward shift of a certain feature, there could be an apparently large degree of disagreement in a zonal mean anomaly plot. To investigate the role of the preindustrial climatology on the response to G1, we sorted the models' grid cells from lowest to highest according to their piControl values for the variable of interest and then calculated the average anomalies for each percentile band of this piControl distribution. In the next section the role of vegetation is examined, and to test the role of vegetation in mediating the climate response to G1, the data are sorted as above but by the preindustrial net primary productivity (NPP). However, land regions with low levels of NPP, below 1% of the grid cell with the highest value, are excluded and are grouped together as a “low NPP” region. As not all GeoMIP models and none of the PPE members produced results for NPP, the ensemble mean of the preindustrial NPP from the GeoMIP ensemble is used as the basis for these calculations. The distribution of NPP should be similar enough across the ensemble to ensure that the regions of high and low productivity occur in similar regions.
Figures S1 and S2 show maps of the preindustrial climatology, and Figure S3 shows averages for each of the percentile bands of the preindustrial climatology for SAT, precipitation, evaporation, and NPP.
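The sorting-and-binning step described above can be sketched as follows. This is a minimal illustration rather than the analysis code: it treats every grid cell equally (whether area weighting was applied is not stated here), and the function name, the number of bands, and the sample values are hypothetical.

```python
def percentile_band_means(control, anomaly, n_bands=10):
    """Average anomaly within percentile bands of the control climatology.

    control and anomaly are equal-length sequences of grid cell values.
    Cells are sorted from lowest to highest control value and split
    into n_bands groups of (nearly) equal size.
    """
    if len(control) != len(anomaly):
        raise ValueError("control and anomaly must have the same length")
    if len(control) < n_bands:
        raise ValueError("need at least one grid cell per band")
    # Indices of the grid cells, ordered by their control value.
    order = sorted(range(len(control)), key=lambda i: control[i])
    n_cells = len(order)
    means = []
    for band in range(n_bands):
        start = band * n_cells // n_bands
        stop = (band + 1) * n_cells // n_bands
        cells = order[start:stop]
        means.append(sum(anomaly[i] for i in cells) / len(cells))
    return means


# Hypothetical example: eight grid cells split into four bands.
pi_control = [5.0, 1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 7.0]
g1_anomaly = [0.5, 0.1, 0.3, 0.2, 0.4, 0.6, 0.8, 0.7]
band_means = percentile_band_means(pi_control, g1_anomaly, n_bands=4)
```

Because the bands are defined by each model's own piControl distribution, a feature that sits at slightly different locations in different models still falls into the same band, which is the motivation given in the text.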
Figure 5 shows anomalies between G1-adjusted and the preindustrial control globally and over land on the basis of the piControl distribution. The changes in SAT show a cooling of the regions which were warmest in the preindustrial and a warming of the coolest regions, which is due to the overcooling of the tropics and undercooling of the poles, which has been noted for sunshade geoengineering previously [Govindasamy et al., 2003; Kravitz et al., 2013a; Lunt et al., 2008]. There is some disagreement over the magnitude of this trend across the GeoMIP ensemble, although the absolute magnitude of these changes is small relative to those for abrupt4xCO2 (Figure S4), and there is good agreement across the PPE.
G1-adjusted shows a global mean reduction in the intensity of the hydrological cycle, with precipitation and evaporation reduced by around 4.5% on average in both ensembles, but the GeoMIP ensemble has an ensemble standard deviation more than 5 times greater than that of the PPE. This same pattern is reproduced globally in regions with greater than average preindustrial evaporation, i.e., in the upper 50% of the distribution, where a reduction of roughly 5% is shown for both ensembles with a much greater spread seen in the GeoMIP ensemble (Figure 5). For the lower 50%, there is either a relatively smaller reduction in evaporation or an increase in evaporation. However, even though the relative changes are large for the lowest evaporation regions, the absolute changes are small due to the substantially lower evaporation in these regions in the preindustrial (Figure S3). This increase in evaporation in low-evaporation regions is likely due to the warming of high latitudes. Similar results are found for changes in precipitation, although the reduction in precipitation is more consistent between regions of high and low preindustrial precipitation. The changes in precipitation over land are similar to the global results but with a greater intraensemble spread, which is to be expected given the differences in the representation of land and boundary layer processes across the GeoMIP ensemble, the smaller number of grid cells, and the greater heterogeneity of the preindustrial climatology over land. However, the evaporation changes over land differ from the global results, showing a reduction in evaporation that is greater for high values of preindustrial evaporation, rather than flat for most values as in the global case. This suggests that an additional process that does not act over the ocean, such as the CO2 physiological effect, is acting to suppress evaporation over land [Fyfe et al., 2013; Tilmes et al., 2013].
3.4 Role of Vegetation Distribution in the Ensemble Response
In the G1 experiment the global mean radiative balance is maintained despite the elevated CO2 concentrations, but the effect of CO2 on plants remains. The direct effect of the response of plants to elevated CO2 has been shown to have a considerable influence on regional climate and hydrology in model simulations through the suppression of transpiration due to reduced stomatal conductance [Betts et al., 2007; Boucher et al., 2009; Fyfe et al., 2013] and in previous studies was found to make up a considerable fraction of the total climate response when comparing G1-like experiments to piControl experiments [Fyfe et al., 2013; Tilmes et al., 2013].
In the first instance, elevated CO2 concentrations lead to a reduction in stomatal conductance as the plant acts to reduce water loss, and hence transpiration, while still gathering sufficient CO2 to photosynthesize; this results in an increase of the plant's water-use efficiency, the net primary productivity (NPP) per unit of water lost [Farquhar et al., 1989; Franks et al., 2013]. The greater availability of CO2 for photosynthesis and the improved water-use efficiency of photosynthesis lead to an increase in net primary productivity in appropriate conditions [Farquhar, 1997]. This increased NPP is realized as a combination of increased leaf-level carbon accumulation and an increased number of leaves, i.e., a greater leaf area index (LAI), measured as the ratio of leaf area to ground surface area [Donohue et al., 2013; Franks et al., 2013]. This increase in NPP and LAI leads to an increase in transpiration that can partly or wholly offset the initial reductions from the direct CO2 physiological effect, depending on conditions [Betts et al., 1997; Donohue et al., 2013]. The net effect of the direct CO2 physiological effect and changes in net primary productivity on stomatal conductance can be represented qualitatively as
$$g_{\mathrm{rel}} \approx \frac{\mathrm{NPP}_{\mathrm{rel}}}{C_{\mathrm{rel}}}$$
where g is the stomatal conductance, C is the carbon dioxide concentration, and the subscript “rel” indicates that these values are relative to some reference case [Franks et al., 2013]. This means that if CO2 levels quadrupled and NPP doubled, one would expect an approximate halving of stomatal conductance, which would produce a similar effect on transpiration absent any other changes [Franks et al., 2013]. Arid regions are most likely to see a large increase in NPP as the increased water-use efficiency reduces the key limitation in those regions [Donohue et al., 2013], while densely vegetated regions will get a diminishing return from increased investment in LAI, and other limitations may be more important. This implies that outside of arid regions, the vegetation response across the GeoMIP ensemble is likely to differ considerably depending on how great the NPP response is, which will depend on the model's formulation. A critical difference is whether or not nitrogen limitations are considered; Jones et al. noted that the GeoMIP models that include a nitrogen cycle showed a far smaller increase in NPP in response to elevated CO2 concentrations than those without a nitrogen limitation (see also Kravitz et al. [2013a, Figure 1] for more on the wide range of NPP responses across the GeoMIP ensemble).
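The qualitative relationship can be checked numerically with a short Python sketch. It assumes the simple ratio form implied by the example in the text, in which relative stomatal conductance is approximately relative NPP divided by relative CO2 concentration. The function name `relative_stomatal_conductance` is hypothetical and is not taken from any land model.

```python
def relative_stomatal_conductance(npp_rel, co2_rel):
    """Approximate stomatal conductance relative to a reference case.

    Assumes g_rel ~ NPP_rel / C_rel, a qualitative relationship only.
    Both arguments are ratios relative to the same reference case.
    """
    if co2_rel <= 0:
        raise ValueError("co2_rel must be positive")
    return npp_rel / co2_rel

# Quadrupled CO2 with doubled NPP: conductance is roughly halved.
print(relative_stomatal_conductance(2.0, 4.0))  # 0.5

# Quadrupled CO2 with no NPP response: conductance falls to a quarter.
print(relative_stomatal_conductance(1.0, 4.0))  # 0.25
```

The second call illustrates why the size of the NPP response matters: the weaker the increase in NPP, the less the initial reduction in conductance is offset.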
The effects of the vegetation response to CO2 on climate cannot be isolated without additional simulations, but the dependence of the climate response on vegetation activity can be investigated. Regions with a higher NPP will have a greater rate of transpiration and may be expected to show the greatest response to elevated CO2 concentrations. Figure 6 shows the absolute response of SAT, evaporation and precipitation, and the percentage change of evaporation and precipitation, for G1-adjusted as a function of the terrestrial NPP. CO2 affects transpiration but this variable was not available for all models, so changes in evaporation will be analyzed as these include transpiration changes. The evaporation anomaly for G1-adjusted minus piControl shows a reduction for most models across all vegetated regions, with the greatest reductions occurring in high NPP regions, both in absolute and percentage change terms. The GeoMIP ensemble spread is greatest for the highest percentiles of the NPP distribution despite the much lower initial evaporation for the lower percentiles, which should act to exaggerate any percentage differences between the models as was seen for the global and land mean changes in Figure 5. This wide intraensemble spread of the GeoMIP ensemble for the high NPP regions can be contrasted with the far narrower intraensemble spread of the PPE for the highest NPP regions. 
It seems likely that these differences in response are connected to differences in the CO2 physiological response for the following reasons. First, all members of the PPE have the same CO2 physiological response. Second, the three GCMs which employ the Community Land Model (CLM) all show a very similar response; these are the Community Earth System Model with the Community Atmosphere Model version 5.1 (CESM-CAM5.1-FV), the Community Climate System Model version 4 (CCSM4), and the Norwegian Earth System Model (NorESM1-M), all shown with dashed lines. Third, EC-Earth, which has no CO2 physiological effect, shows a flat response across the NPP distribution. Significantly, the Hadley Centre Global Environmental Model version 2 (HadGEM2) also shows a similar response to the PPE, and they share the same land model, MOSES, albeit in different versions. However, the GeoMIP ensemble has a wider range of boundary layer process representations, and members differ in a number of other important ways. Additionally, HadCM3 and HadGEM2, as well as CESM-CAM5 and CCSM4, are closely related models that were developed in the same research institutes, which may explain some of their similarity [Knutti et al., 2013].
For the GeoMIP ensemble, the SAT response to G1-adjusted is a warming of a few tenths of a degree Celsius for both the low NPP regions and across the rest of the distribution but with a wider GeoMIP ensemble spread for the highest NPP regions. That there is a mix of high- and low-latitude regions within the NPP percentile bands likely explains the flatness of this temperature response (see Figure S1 for a map of the NPP percentile bands). The precipitation and evaporation anomalies are similar, with a greater reduction and wider GeoMIP ensemble spread for high NPP regions indicating that the CO2 physiological effect may be playing some role in controlling the terrestrial precipitation response, but this is less clearly the case than for evaporation. This more muted response is likely because precipitation is less determined by local factors than evaporation, with a greater role in determining the precipitation response played by changes to low-level convergence, remote changes in oceanic source evaporation, and changes in atmospheric stability.
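The ensemble comparisons in this section rest on two simple quantities: the percentage change relative to the control and the intraensemble spread within each NPP band. A minimal Python sketch of both follows. The function names, the array layout (one row per ensemble member, one column per band), and the use of the sample standard deviation (`ddof=1`) as the measure of spread are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def percent_change(anomaly, control):
    """Anomaly expressed as a percentage of the control value."""
    anomaly = np.asarray(anomaly, dtype=float)
    control = np.asarray(control, dtype=float)
    return 100.0 * anomaly / control

def ensemble_band_stats(band_values, ddof=1):
    """Ensemble mean and spread for each band.

    `band_values` has shape (n_members, n_bands). The spread is the
    standard deviation across members (an assumed choice of metric).
    """
    band_values = np.asarray(band_values, dtype=float)
    if band_values.ndim != 2:
        raise ValueError("band_values must be 2-D: (n_members, n_bands)")
    mean = band_values.mean(axis=0)
    spread = band_values.std(axis=0, ddof=ddof)
    return mean, spread
```

Because percentage changes divide by the control value, bands with very low control evaporation or precipitation can show large relative changes even when the absolute changes are small, which is the effect noted for Figure 5.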
The PPE of HadCM3 used in this study reproduces many of the features of the climate response to solar geoengineering seen in other studies of the G1 experiment [Kravitz et al., 2013a; Schmidt et al., 2012; Tilmes et al., 2013]. However, the PPE produces a narrower range of responses to G1 than the GeoMIP ensemble does, with most members showing a similar pattern of change but differing in the magnitude of change. The PPE used in this study only perturbed a limited number of parameters, primarily those that are important for determining the climate sensitivity of HadCM3, resulting in a range of climate sensitivities of between 3.3 and 5.3°C [Irvine et al., 2013]. This choice was made on the assumption that those processes most important for determining the magnitude of the climate change response would also be important for determining the response to solar geoengineering. However, as Kravitz et al. [2013b] and Schmidt et al. note, the agreement on the climate response to G1 is much greater in the GeoMIP ensemble than for the climate response to abrupt4xCO2. This is due to the much smaller change in SAT and the mostly unchanged GMST, which mean that the large differences between the models' temperature feedback responses are not realized in G1 but play a large role in abrupt4xCO2. PPEs are limited by their common structure and share many biases with the standard model on which they are based [Collins et al., 2010], and this ensemble clearly shows underdispersion in the range of responses to the G1 experiment [Irvine et al., 2013].
By comparing the GeoMIP and PPE ensembles, we were able to infer some of the uncertain processes that are responsible for differences within and between the ensembles. First, we note that some of the GeoMIP and PPE members did not perfectly balance the CO2 and solar forcings in G1, so there was some degree of temperature change compared to the piControl. We found that by correcting for this GMST deviation, the spread of regional SAT responses, particularly at high latitudes, was reduced. However, this adjustment did not substantially alter the ensemble spread of the hydrological responses. Second, we assessed the role of the piControl climatology in determining the response to G1. At the global level of analysis, we found that there was a fractional reduction in evaporation and precipitation from piControl that was broadly consistent between regions of high and low precipitation and evaporation. An exception to this is found at high latitudes and hence in regions of low evaporation and precipitation, where the imbalance between solar and CO2 forcing led to a rise in temperature and a consequent rise in evaporation and precipitation. This consistency in the reduction in the intensity of precipitation and evaporation suggests that some of the differences in the modeled precipitation response seen in other studies of the GeoMIP ensemble may be arising due to differences in the piControl distribution rather than through differences in the response of the models to the forcing. Third, we assessed the role of terrestrial vegetation in the response to G1. The greatest fractional reductions in evaporation occurred in highly vegetated regions where transpiration is responsible for a large fraction of the total evaporation. There was a large degree of intraensemble spread in this response as the degree to which increases in NPP offset the initial CO2 physiological effect differs greatly between the land models used in the GeoMIP ensemble.
This finding was reinforced by the similarity of this response for the three models which shared the CLM4 land surface model and included a nitrogen cycle (NorESM1-M, CCSM4, and CESM-CAM5.1-FV) and, separately, for the PPE of HadCM3 and HadGEM2-ES, which use different versions of the MOSES land model. It is worth noting that all climate models have structural and other similarities, and this is particularly true for models that are produced by the same research center, as is the case for HadCM3 and HadGEM2-ES and for CCSM4 and CESM-CAM5.1-FV [Knutti et al., 2013]; this means that other structural similarities, such as the treatment of the boundary layer, may give rise to their similar behavior. As terrestrial precipitation is more strongly affected by remote processes than terrestrial evaporation [Gimeno et al., 2010], we did not find as clear an indication of the role of vegetation processes for this variable.
This study has highlighted a number of uncertainties by comparing a small, targeted PPE with the GeoMIP ensemble, but the PPE did not perturb all relevant physical processes. This limited its usefulness both as a means of investigating the uncertainty in the climate response to solar geoengineering and as a means of identifying the underlying reasons for these uncertainties. However, we believe that this approach of comparing a PPE to an MME is a useful one, and future studies could build on this one to improve the understanding of the uncertainty in the response to solar geoengineering. The findings of this study suggest that while parameters with the largest effect on processes determining climate sensitivity are important, other uncertain processes are also important for determining the response to the G1 solar geoengineering experiment, because temperatures are stabilized and there is only a limited role for temperature feedbacks. Future studies generating targeted PPEs for solar geoengineering should also include parameters that affect climate dynamics, the control distribution of precipitation, and the CO2 physiological effect [Booth et al., 2012]. The magnitude of the NPP response, which counteracts the initial CO2 physiological effect to some extent, is a particularly significant uncertainty, as it is critical to determining the terrestrial carbon cycle response and is connected to the broader surface hydrological response. Understanding the response of vegetation and surface hydrology to the combined effects of elevated CO2 and climate change is critical to understanding the implications of solar geoengineering.
Multimodel ensembles and perturbed parameter ensembles are complementary approaches for assessing the range of potential climate responses to solar geoengineering, and combining these approaches can provide new insights into the climate response to sunshade geoengineering. Here we found similar responses for G1-piControl in the PPE to those already found in the GeoMIP ensemble but noted that the PPE showed a much narrower range of responses, particularly in terms of its land mean hydrology, land versus ocean and Northern versus Southern Hemisphere temperature changes, and its pattern of tropical precipitation change. We also found that adjusting the G1 response, so that all ensemble members reproduced the piControl global mean SAT, narrowed the range of SAT responses in both ensembles but had little effect on the spread of the terrestrial responses. The central role of the piControl climatology in determining the hydrological response to solar geoengineering was shown, indicating that biases in the piControl climatology could be a major source of model disagreement at the zonal and grid cell level. The terrestrial hydrological response to solar geoengineering was affected by the CO2 physiological effect and the increase of NPP, the strength of which differs markedly across the GeoMIP ensemble. Uncertainty in the magnitude of this reduction in transpiration appears to be a major source of model disagreement over the terrestrial hydrological response to solar geoengineering.
We thank all participants of the Geoengineering Model Intercomparison Project and their model development teams, CLIVAR/WCRP Working Group on Coupled Modeling for endorsing GeoMIP, and the scientists managing the Earth System Grid data nodes who have assisted with making GeoMIP output available. We acknowledge the World Climate Research Programme's Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups for producing and making available their model output. For CMIP, the U.S. Department of Energy's Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. Ben Kravitz is supported by the Fund for Innovative Climate and Energy Research (FICER). The Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by Battelle Memorial Institute under contract DE-AC05-76RL01830. Simulations performed by Ben Kravitz were supported by the NASA High-End Computing (HEC) Program through the NASA Center for Climate Simulation (NCCS) at Goddard Space Flight Center. A.J. was supported by the Joint UK DECC/Defra Met Office Hadley Centre Climate Programme (GA01101). This study was partly funded by the European Commission's 7th Framework Programme through the EuTRACE project (grant 306395). Alan Robock is supported by NSF grants AGS-1157525 and CBET-1240507. The IPSL-CM5A climate simulations were performed with the HPC resources of [CCRT/TGCC/CINES/IDRIS] under the allocation 2012-t2012012201 made by GENCI (Grand Equipement National de Calcul Intensif), CEA (Commissariat à l'Energie Atomique et aux Energies Alternatives), and CNRS (Centre National de la Recherche Scientifique). Helene Muri was funded by the EU 7th Framework Programme grant 306395, EuTRACE.