Data–model integration is not magic


(Author for correspondence: tel +1 865 5747848; fax +1 865 5769939; email

Modeling ecosystem responses to global change: techniques and recent advances. A Terrestrial Ecosystem Response to Atmosphere and Climatic Change (TERACC) workshop, Fort Myers, FL, USA, January 2005

Progress has been made over the past decade in understanding terrestrial ecosystem responses to global climate change using both empirical and modeling techniques. However, better integration of experiments and models is needed to predict how ecosystems will respond to multiple drivers of global climate change. Recently, a group of empirical ecologists and ecosystem modelers convened to further the integration of modeling with empirical data. Working groups convened independently under the subheadings of biogeochemistry, plant production, vegetation dynamics and water relations to address three questions central to promoting better interactions between empirical ecologists and ecosystem modelers, as follows.

  1. How do different models and types of models represent key biological processes?
  2. What are the major uncertainties and limitations of existing models, and what are the major uncertainties and limitations of existing experimental approaches?
  3. How well ‘matched’ are experiments and models?

Incorporating heterogeneity

A primary challenge to developing better data–model integration in global climate change research is to develop models that link ecosystem structure to function. The lack of heterogeneity within the existing dynamic global vegetation models (DGVMs) means that models tend to predict homogeneous ecosystems that respond uniformly to environmental perturbations, even though ample empirical data show that changes in vegetation often alter ecosystem processes (Loreau et al., 2002). Models therefore need to be evaluated against both observed vegetation dynamics and flux tower measurements. Additionally, measured changes in ecosystem structure often do not map onto the typical coarse-scale plant functional types (PFTs) used in DGVMs even though the change in structure could be linked to a change in ecosystem function. For example, the Nevada desert Free-Air CO2 Enrichment (FACE) experiment showed an increase in production and seed rain of an invasive grass species under elevated CO2 (Smith et al., 2000), but there were no DGVMs that could explore the long-term effects these changes may have on biodiversity and ecosystem function. Examples of model–data comparisons that could help assess existing DGVM models might include the differential response of C3 and C4 species observed in FACE grassland experiments (Reich et al., 2001; Nowak et al., 2004), and the response of plant functional groups to drought using data from the European grassland, Jasper Ridge and other water manipulation experiments (e.g. Morgan et al., 2004). Finally, experiments need to be designed with model–data comparisons in mind by designing perturbation experiments around PFTs rather than species (P. Moorcroft, Harvard University, pers. comm., meeting notes).

Mechanistic agreement

All models, especially models that predict ecosystem-level responses, must not just get the right answer, but they must get the right answer for the right reason. This is particularly true in the context of climate change, where ecosystems are exposed to perturbations not previously experienced. Models are much better at predicting ecosystem carbon uptake than they are at predicting carbon loss because we have a more solid mechanistic foundation for leaf photosynthesis and canopy radiation interception than we do for respiration. Models of leaf photosynthesis work quite well and are a good example of where the integration of data and models has been successful. There are, however, no robust mechanistic models for respiration. One major limitation lies in the way models handle temperature sensitivity of soil respiration, with some using one simple Q10 relationship for every carbon pool at every base temperature (e.g. TRIFFID, Cox et al., 2000), though respiration is known not to respond this simply to temperature (Knorr et al., 2005). Further, researchers need to be able to partition sources of soil respiration more accurately. Rhizodeposits can be quickly respired by heterotrophs, rendering them difficult to distinguish from root respiration. Knowing the soil respiration rate for a particular ecosystem does not allow for adequate extrapolation under new climate conditions that could sway the balance between autotrophic and heterotrophic respiration. Such variation could have an enormous influence on carbon balance and needs to be addressed by empirical ecologists and ecosystem modelers (Pendall et al., 2004).
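The single-Q10 formulation criticized above can be sketched in a few lines. This is a minimal illustration of the standard Q10 relationship, R(T) = R_ref × Q10^((T − T_ref)/10), and not the code of TRIFFID or any other model; the pool names, reference rates, Q10 value and reference temperature are all hypothetical.

```python
def q10_respiration(temp_c, r_ref, q10=2.0, t_ref=10.0):
    """Respiration rate at temp_c (deg C), given the rate r_ref at t_ref.

    Standard Q10 form: the rate is multiplied by q10 for every 10 deg C
    of warming. The defaults (q10=2.0, t_ref=10.0) are assumptions.
    """
    return r_ref * q10 ** ((temp_c - t_ref) / 10.0)


# Hypothetical reference rates (arbitrary units) for two soil carbon pools.
pool_rates_at_ref = {"labile": 1.2, "stable": 0.3}


def total_respiration(temp_c, q10=2.0):
    """Sum respiration over all pools using one shared Q10 value."""
    return sum(q10_respiration(temp_c, r, q10) for r in pool_rates_at_ref.values())


for temp_c in (10.0, 20.0, 30.0):
    labile = q10_respiration(temp_c, pool_rates_at_ref["labile"])
    stable = q10_respiration(temp_c, pool_rates_at_ref["stable"])
    # With one shared Q10, the labile:stable ratio never changes with temperature.
    print(f"{temp_c:4.1f} C  total={total_respiration(temp_c):.2f}  "
          f"labile/stable={labile / stable:.2f}")
```

Because every pool shares one Q10, warming scales all pools by the same factor, so the relative contribution of each pool is fixed by construction. That is the simplification the text questions: such a model cannot represent pools that differ in temperature sensitivity.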

Many aspects of biogeochemical cycling such as carbon storage and nutrient uptake are dependent on plant carbon allocation to leaves, stems, and roots. Carbon allocation, however, remains a major weakness in ecosystem models, reflecting its poor theoretical foundation. Ecosystem models that use fixed allocation coefficients do not permit carbon allocation to change in response to environmental drivers such as elevated CO2, and a more dynamic approach (e.g. Moorcroft et al., 2001; Norby et al., 2001) may be preferred. For example, Norby et al. (2004) demonstrated that increased CO2 levels in a sweetgum plantation greatly stimulated fine-root production, with potentially important follow-on effects on biogeochemical cycling, yet a model with fixed allocation would not capture this type of result. Clearly, allocation is a critical area for continued experimental and theoretical research.
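The contrast between fixed and dynamic allocation can be made concrete with a toy calculation. The sketch below is purely illustrative: the allocation fractions, the production values and the rule that shifts allocation from stems to roots are hypothetical, and do not reproduce the schemes of Moorcroft et al. (2001) or Norby et al. (2001).

```python
# Hypothetical fixed allocation coefficients (fractions of net production).
FIXED_FRACTIONS = {"leaf": 0.3, "stem": 0.4, "root": 0.3}


def allocate(npp, fractions):
    """Partition net production among organs in proportion to the fractions."""
    total = sum(fractions.values())
    return {organ: npp * frac / total for organ, frac in fractions.items()}


def shifted_fractions(base, root_shift):
    """Hypothetical dynamic rule: move root_shift of allocation from stem to root."""
    if not 0.0 <= root_shift <= base["stem"]:
        raise ValueError("root_shift must lie between 0 and the stem fraction")
    shifted = dict(base)
    shifted["root"] += root_shift
    shifted["stem"] -= root_shift
    return shifted


# Hypothetical production under ambient and elevated CO2 (g C m-2 yr-1).
npp_ambient, npp_elevated = 1000.0, 1200.0

fixed = allocate(npp_elevated, FIXED_FRACTIONS)
dynamic = allocate(npp_elevated, shifted_fractions(FIXED_FRACTIONS, 0.1))

# Under fixed coefficients the root share of production cannot change.
print("fixed root share:  ", fixed["root"] / npp_elevated)
print("dynamic root share:", dynamic["root"] / npp_elevated)
```

With fixed coefficients, root production rises only in proportion to total production, so a disproportionate fine-root response of the kind reported for the sweetgum plantation is excluded by the model structure rather than by the data.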

Models are often limited by process information, and process information is often collected over short periods of time, making for poor model predictions. Soil carbon pools, for example, are included in models as either discrete or continuous groups according to stability. Experimentalists have no standardized method for quantifying carbon pools among ecosystems, which renders empirical data difficult to handle in models. Incorporating carbon pool data into global models will require standardized methods of carbon accounting before we can accurately predict how climate change will alter future carbon budgets.

Matching timescales

To synthesize data into a common framework, experimentalists need to collect data at a scale that can be used in models, and modelers need to make better use of the data that experimentalists collect. For instance, many models focus on long-term scales (decades to centuries), whereas experimentalists tend to focus on physiological and ecological scales (e.g. minutes, months and, rarely, years) (Medlyn & McMurtrie, 2005). Thus, a major challenge to comparisons between existing data and models that evaluate ecosystem response to climate change is a mismatch of measurement and modeling timescales. Modelers and experimental ecologists need to learn how to scale up from measurements of processes such as photosynthesis in order to make useful predictions about ecosystem functions such as nutrient availability. The concept of progressive nitrogen limitation (PNL), which states that stimulation of plant growth and carbon sequestration in elevated CO2 will create a negative feedback on growth response by reducing the availability of nitrogen (Luo et al., 2004), is a good example of where experimental data do not match model predictions. Modeled predictions of the amount of nitrogen required by ecosystems are extremely sensitive to slight changes in soil C : N ratios. Such changes may arise under elevated CO2, but are extremely difficult to detect in short-term studies. Empirical ecologists need to come up with ways to test the PNL hypothesis rigorously so that it may, or may not, be incorporated correctly into global change models.
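The sensitivity to soil C : N ratios follows from simple stoichiometry: the nitrogen bound in a carbon pool is the carbon stock divided by its C : N ratio. The arithmetic below is a rough sketch with hypothetical numbers (the soil carbon stock, both ratios and the annual plant uptake figure are assumptions chosen for illustration, not values from the meeting).

```python
def nitrogen_stock(carbon_stock, cn_ratio):
    """Nitrogen held in a pool with the given carbon stock and C:N mass ratio."""
    return carbon_stock / cn_ratio


soil_c = 10000.0          # hypothetical soil carbon stock, g C m-2
cn_before = 12.0          # hypothetical soil C:N ratio
cn_after = 12.5           # hypothetical ratio after a slight (~4%) widening

n_before = nitrogen_stock(soil_c, cn_before)
n_after = nitrogen_stock(soil_c, cn_after)
delta_n = n_before - n_after   # nitrogen no longer needed to hold the same carbon

annual_uptake = 8.0       # hypothetical plant N uptake, g N m-2 yr-1
print(f"N at C:N {cn_before}: {n_before:.1f} g N m-2")
print(f"N at C:N {cn_after}: {n_after:.1f} g N m-2")
print(f"difference: {delta_n:.1f} g N m-2 "
      f"(~{delta_n / annual_uptake:.1f} years of the assumed uptake)")
```

Under these assumptions, a shift in C : N that would be hard to detect against measurement error in a short-term study corresponds to several years of plant nitrogen uptake, which is why modeled nitrogen requirements hinge on this ratio.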

Disturbance events

Changes in disturbance regimes have a critical influence on the outcome of model predictions but are difficult to capture in experimental situations. Modelers and experimentalists need to understand the temporal and spatial scales of extreme events better and to understand their ecological ramifications. One good example is the effort to incorporate fire into global change models (Bond et al., 2005). Relatively small changes in fire regimes can lead to dramatic changes in ecosystem structure, function and development (Bond et al., 2005). DGVMs therefore need to incorporate fire submodels that can track the subgrid-scale diversity in ecosystem structure and composition that fires create. Additionally, large-scale manipulation experiments that distinguish the relative importance of fire for ecosystem structure and function are needed for model parameterization (P. Moorcroft, Harvard University, pers. comm., meeting notes).

Precipitation regimes are another example where the timing and scale of inputs can significantly alter the way ecosystems respond to climate change (Smith et al., 2000). Both empirical ecologists and modelers need to address the timing of precipitation more directly, because the variability in that timing is likely to shift with climate change and may alter patterns of carbon and nutrient cycling. For example, a drought in the spring may have a different effect on ecosystem dynamics than a drought in the summer or fall (Hanson et al., 2001). Models need to be able to consider inter- and intra-annual variance and predictability in water availability, whereas experimentalists need to measure soil moisture at the appropriate spatial and temporal resolution to support complex modeling.


The call for data–model integration is not new in global change biology (Goudriaan et al., 1999) and the successful use of empirical data to test ecosystem models has become more common (Hanson et al., 2004). However, there is still much progress to be made in this area. Ecosystem modelers and empirical ecologists need to collaborate when establishing research agendas in order to enable the incorporation of data in a meaningful way into ecosystem models. This approach will also help experimentalists provide better data to modelers and help modelers understand what types of data are available to be used in model construction. In the words of one meeting participant, ‘data–model integration is not magic’, but by comparing models where all the important output variables are observed, by having experimentalists collect data at timescales relevant to modelers and by actively facilitating the collaboration of modelers and experimentalists from the onset of projects, the integration of data and models may become more common.


This meeting was organized by Lindsey Rustad and the TERACC steering committee and supported by the National Science Foundation. Notes written by working group leaders and comments by Rich Norby and Nate Sanders greatly improved this note. This report was prepared at Oak Ridge National Laboratory (ORNL) with support from the U.S. Department of Energy, Office of Science, Biological and Environmental Research Program. ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05–00OR22725.