Rising atmospheric carbon dioxide concentration ([CO2]) is just one aspect of global climate change. However, it is important because it consistently stimulates the growth and harvestable grain production of C3 crops (Kimball et al., 2002; Long et al., 2004; Nowak et al., 2004; Ainsworth & Long, 2005), as well as benefiting C4 crops under drought stress (Ottman et al., 2001; Leakey et al., 2004, 2006). Meanwhile, high temperatures, drought stress and rising ozone concentrations all have negative impacts on crop production (Gitay et al., 2001; Parry et al., 2004). Furthermore, rising [CO2] is unique in being almost uniform across the globe, which rules out the use of spatial gradients as proxies for temporal trends. As a result, Parry et al. (2004) singled out its effect as the largest uncertainty in projecting future global food supply. Free-air CO2 enrichment (FACE) experiments currently provide the most realistic measure of the future impact of elevated [CO2] on crop yields. Free-air CO2 enrichment experiments differ from enclosure studies in two salient respects: (1) they are conducted in the open air in farm fields without limiting growing space, or altering microclimate, precipitation or pest/pathogen access; and (2) the scale of the experiments is large enough to be comparable to agronomic trials (typically plots > 300 m² compared with < 4 m² in the case of enclosure studies). In Long et al. (2006) and Ainsworth (2008), we reported that stimulation of seed yield in response to elevated [CO2] is lower in FACE experiments than in enclosure studies of the world's four most important food crops. We suggested that this finding had two implications: modeling studies using CO2 fertilization factors derived from enclosure experiments may have overestimated future food supply; and additional field experiments are needed to understand in greater detail the mechanism of response and to drive research and development efforts to improve crop yields under future climatic conditions.
This was argued with the proviso that FACE experiments have been limited in number and geographic range, as well as in the extent to which they currently investigate the interactive effects of the numerous elements of global change. These findings were vigorously challenged and subsequently dismissed by Tubiello et al. (2007a,b), who argued that the findings of Long et al. (2006) were incorrect for three reasons: a statistically significant difference between FACE and nonFACE data was not adequately tested or proven; results from most crop model simulations are consistent with values from FACE experiments; and lower crop responses to elevated CO2 of the magnitudes in question would not significantly alter projections of world food supply (Tubiello et al., 2007a). The last of these conclusions contrasts with the findings of Parry et al. (2004), who with similar modeling approaches concluded that whether global food supply remained stable or declined with global change would depend critically on the response of the world's key grain crops to rising [CO2]. The consequences for human welfare will probably be severe if we underestimate the threat of global change to future food supply. This is all the more important given that long lead times will be necessary to produce germplasm better adapted to future growing conditions; typically it may take decades to identify adapted germplasm and then bring it to market in sufficient quantity for these major crops. Consequently, there is an immediate need to evaluate the currently available data correctly and to use them to identify the best approaches to predict future food availability and to elucidate the mechanisms of crop responses to elevated [CO2] in order to generate improved germplasm.
Statistically significant differences between FACE and nonFACE experiments
The statistical validity of comparing the mean response of C3 crops to elevated [CO2] from FACE experiments (Supplementary material Table S1) against the modeled best-fit response of chamber experiments (Long et al., 2006: fig. 2) was criticized because curve-fitting methods and data-pooling choices can bias fair comparisons (Tubiello et al., 2007a). In fact, Tubiello et al. (2007a) suggest that a better approach is to ‘whenever possible, use the observed data – rather than “predicted” from curves lacking full biophysical explanatory power.’ We agree, and given the number of C3 crop studies, it is possible to take a direct approach and limit the comparison of FACE experiments and chamber studies to those with similar ambient [CO2] and similar elevated [CO2] (Supplementary material Tables S1 and S2). The FACE data had an average ambient [CO2] of 367 ppm and an elevated [CO2] of 583 ppm, and were normally distributed with a mean yield response ratio of 1.14 (Fig. 1). The chamber data had an average ambient [CO2] of 373 ppm and an elevated [CO2] of 565 ppm, and a mean yield response ratio of 1.31 (Fig. 1). In contrast to the FACE data, the chamber data were not normally distributed (Shapiro–Wilk P < 0.001) and had a much broader range of responses than the FACE data (Fig. 1). Because the chamber data were not normally distributed and the sample sizes for FACE and chamber studies were unequal, we used the less-sensitive, nonparametric Wilcoxon–Mann–Whitney two-sample test to analyze differences between FACE and chamber studies (Steel et al., 1997). This test revealed a significant difference between the response ratio of yields at elevated [CO2] in FACE studies vs chamber studies (P = 0.016). Consistent with our previous analysis (Long et al., 2006), the magnitude of elevated [CO2] stimulation in FACE experiments (14%) was essentially half of the stimulation in chamber studies (31%).
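The two-sample comparison described above can be sketched in a few lines of code. The following is a minimal illustration only, not the analysis actually performed: it assumes a two-sided Wilcoxon–Mann–Whitney test computed with the normal approximation (midranks for ties, no continuity correction), the function name `mann_whitney_u` is hypothetical, and the response ratios in the usage lines are made-up placeholder values rather than the data in Supplementary material Tables S1 and S2.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Wilcoxon-Mann-Whitney test (normal approximation).

    Returns (U statistic for sample x, approximate two-sided P value).
    Ties receive midranks and the variance is tie-corrected; no continuity
    correction is applied, so P is only approximate for small samples.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda pair: pair[0])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Placeholder response ratios for illustration only (NOT the published data).
face_ratios = [1.05, 1.10, 1.12, 1.15, 1.18, 1.20]
chamber_ratios = [0.95, 1.16, 1.22, 1.30, 1.38, 1.45, 1.60, 1.85]
u_stat, p_value = mann_whitney_u(face_ratios, chamber_ratios)
print(f"U = {u_stat:.1f}, approximate two-sided P = {p_value:.3f}")
```

A rank-based test of this kind compares the ordering of the pooled observations rather than their means, which is why it remains valid when one sample is skewed and the sample sizes differ. For small samples an exact P value, as provided by standard statistical packages, would be preferable to the normal approximation used here.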
The major models do not show good agreement with FACE
The best test of model parameterization and model design is validation of model output against observed experimental data. Tubiello et al. (2007a) report that some key models used in climate change impact assessments (AEZ, Fischer et al., 2002; CERES, Tsuji et al., 1994; EPIC, Stockle et al., 1992) have not been evaluated against FACE data, but where this has been carried out, the models reproduce FACE results well. Tubiello & Ewert (2002) summarized the validation of five widely used crop models with wheat grain yield data from the Maricopa FACE experiment, concluding that the models ‘all showed good agreement’, and this work was referred to again by Tubiello et al. (2007a) to justify this point. However, examination of this comparison fails to justify this claim. The models, in fact, project a yield stimulation about 1.6 to 2.3 times that actually observed in the FACE experiments, in agreement with Long et al. (2006). Using digitizing software (grafula 3 v 2.10; Wesik, SoftHaus, St Petersburg, Russian Federation), we extracted the data from Fig. 3 of Tubiello & Ewert (2002) and replotted the results (Fig. 2). While the FACE results suggest a mean response ratio of 1.08 under well-watered conditions and a mean response ratio of 1.18 under water stress, the average model outputs estimate a response ratio of 1.18 under well-watered conditions and a response ratio of 1.28 under water stress (Fig. 2). Thus, the Demeter, LINTUL, AFRC, mC-wheat, and Sirius models collectively overestimate the [CO2] fertilization effect (the response ratio minus 1) by 125% under well-watered conditions and by 56% under water stress conditions (Fig. 2). This corresponds with the greater magnitude of yield stimulation by elevated [CO2] in chamber studies compared with FACE studies, which is described in the previous section and reported in Long et al. (2006).
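The overestimates quoted above follow directly from the response ratios, provided the fertilization effect is taken as the response ratio minus 1. The short sketch below reproduces that arithmetic from the four ratios given in the text; the function name `stimulation_overestimate` is an illustrative choice.

```python
def stimulation_overestimate(observed_ratio, modeled_ratio):
    """Percentage by which the modeled CO2 stimulation exceeds the observed one.

    Stimulation is the yield response ratio minus 1, so a ratio of 1.08
    corresponds to an 8% stimulation.
    """
    observed = observed_ratio - 1.0
    modeled = modeled_ratio - 1.0
    return 100.0 * (modeled - observed) / observed


# Response ratios quoted in the text (FACE observation vs mean model output).
well_watered = stimulation_overestimate(1.08, 1.18)
water_stress = stimulation_overestimate(1.18, 1.28)
print(f"well-watered: {well_watered:.0f}%")  # well-watered: 125%
print(f"water stress: {water_stress:.0f}%")  # water stress: 56%
```

The absolute gap between model and observation is the same in both treatments (0.10 in response ratio), but the relative overestimate is larger under well-watered conditions because the observed stimulation there is smaller.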
In summary, comparison of model parameterization and model validation exercises with data from FACE and nonFACE studies does not support the assertions of Tubiello et al. (2007a,b), but instead supports the concern that there are some important quantitative differences in how crops respond to elevated [CO2] in FACE and chamber experiments. Improving projections will require a better integration of experiments and models. While it is standard for an experimental study to provide sufficient information to allow an exact repetition of the work, this standard has not always been upheld for models projecting future food supply, including those used by the Intergovernmental Panel on Climate Change (IPCC). It is in the mutual interest of experimental and modeling studies that the basis of differences should be understood. Yet, in Tubiello et al. (2007a), the CO2 fertilization factors used in a key model were presented for the first time and referenced with the statement ‘personal communication’.
Sensitivity of future food supply to elevated [CO2]
Tubiello et al. (2007a) argue that differences in yield response between chamber and FACE experiments are inconsequential because ‘for any specific socio-economic pathway and climate change, including or not including the effects of elevated CO2 on crops changes the global cereal production results by less than 2%’. By contrast, a second modeling study included in the latest IPCC report (Parry et al., 2004), and based on the same economic model, reports that inclusion of CO2 effects reduced the number of undernourished people in 2050 by 12–32%, depending on the climate change scenario (Easterling et al., 2007). By 2080, inclusion of the CO2 effects reduced the number of undernourished people by 18–63% (Easterling et al., 2007). Therefore, it seems unlikely that CO2 effects will only be ‘moderately important’ in the future, as reported by Tubiello et al. (2007a), and we would argue that there is a pressing need to identify crops and genotypes that can maximize the benefits of rising [CO2]. The IPCC report also states that ‘a number of limitations, however, make these model projections highly uncertain’, including that ‘projections are based on a limited number of crop models, and only one economic model, the latter lacking sufficient evaluation against observations, and thus in need of further improvements’. This further suggests that the evidence presented by Tubiello et al. (2007a) is not adequate to reject the statistically valid difference in yield stimulation between FACE and chamber experiments described above.
There is broad agreement that the effects of elevated [CO2] measured in experimental settings lacking the potentially limiting influence of pests, weeds, nutrients, competition for resources, soil water and air quality, may overestimate field responses on the farm (Long et al., 2006; Easterling et al., 2007; Tubiello et al., 2007b). For example, Zavala et al. (2008) reported that elevated [CO2] increased the susceptibility of soybean to two beetle pests by down-regulating gene expression related to defense signaling, which in turn reduced the production of feeding deterrents. This was discovered in a FACE experiment where plants were accessible to pests and pathogens. This type of complex interaction between elevated [CO2] and pest damage could not have been predicted from prior chamber experiments. Such findings are inconsistent with the main conclusions of Tubiello et al. (2007a), that there are no meaningful inconsistencies among data from FACE experiments, nonFACE experiments and modeling studies, and the implication that FACE experiments are not needed to address key knowledge gaps about crop responses to elevated [CO2].
What is the way forward?
Free-air CO2 enrichment experiments provide the most realistic conditions for estimating crop yield responses to elevated [CO2]. This is achieved by simulating future atmospheric conditions in the production environment of farm fields, without perturbing the soil–plant–atmosphere continuum, and in plots that are typically an order of magnitude larger than in chamber studies. Extrapolating seed yield responses of crops grown in controlled environments often leads to extremely unrealistic estimates of yield on a meaningful field scale (Supplementary material Table S2). Therefore, controlled environments clearly are not the best experimental facilities for estimating CO2 response ratios of yield. Chamber experiments are particularly valuable as a setting for identifying mechanisms of crop response at the molecular, biochemical and physiological scales. All of the authors of this paper have carried out, and continue to perform, chamber experiments. Long et al. (2006) highlighted, and we repeat here, the concern that a greater number of FACE experiments are needed, in addition to chamber studies, in order to generate the best possible understanding of crop responses to elevated [CO2] and to improve the performance of crops under future conditions. A major assumption of current integrated ecological–economic models of world food supply is that technical progress in crop yields will continue at the past and current pace (Fischer et al., 2005). The importance of investigating the mechanism of plant–environment interactions under realistic field conditions is also recognized by the biotechnology industry when attempting to identify molecular targets for germplasm improvement. For example, Pioneer Hi-Bred has reported, ‘Since we need to target our research effort to the production environment, more emphasis has been placed on field-based profiling experiments, using managed stress environments to generate realistic changes in gene expression’ (Campos et al., 2004). 
Developing germplasm that responds better to elevated [CO2] is a distant goal and to our knowledge is not a current research priority in industry (Ainsworth et al., 2008). Public research and development is needed to extend capacity beyond the current FACE experiments, which are limited in their geographical distribution, in the very narrow range of CO2 concentrations they use and in their inclusion of important interactions with other climate change factors.