Keywords:

  • modeling;
  • microbial ecology;
  • systems biology;
  • microbial diversity;
  • community function

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Microbial community sampling efforts
  5. Scales of microbial community models
  6. Predicting trends in microbial community modeling
  7. Acknowledgements
  8. References

Microbial communities exhibit exquisitely complex structure. Many aspects of this complexity, from the number of species to the total number of interactions, are currently very difficult to examine directly. However, extraordinary efforts are being made to make these systems accessible to scientific investigation. While recent advances in high-throughput sequencing technologies have improved accessibility to the taxonomic and functional diversity of complex communities, monitoring the dynamics of these systems over time and space – using appropriate experimental design – is still expensive. Fortunately, modeling can be used as a lens to focus low-resolution observations of community dynamics to enable mathematical abstractions of functional and taxonomic dynamics across space and time. Here, we review the approaches for modeling bacterial diversity at both the very large and the very small scales at which microbial systems interact with their environments. We show that modeling can help to connect biogeochemical processes to specific microbial metabolic pathways.


Introduction

To understand microbial systems, it is necessary to consider the scales at which they interact with their environment. These scales range spatially from microns to kilometers and temporally from hours to eons. Accounting for 350–550 billion tons of extant biomass (Whitman et al., 1998), microorganisms are the principal form of life on Earth, and they have dominated Earth's evolutionary history. Prokaryotes, the oldest lineage on the tree of life, first appeared about 3.8 billion years ago (Mojzsis et al., 1996) and have been detected in virtually every environment that has been investigated, from boiling lakes (Barns et al., 1994; Hugenholtz et al., 1998), to the atmosphere (Fierer et al., 2008; Bowers et al., 2009), to deep in the planet's crust (Takai et al., 2001; Fisk et al., 2003; Edwards et al., 2006; Teske & Sorensen, 2008). Microbial metabolism contributes to biogeochemical cycles (O'dor et al., 2009; Hoegh-Guldberg, 2010) and has both direct and indirect impacts on Earth's climate (Bardgett et al., 2008; Graham et al., 2012). Indeed, marine microbial activity has even been implicated as a correlate in earlier mass species extinction events (Baune & Bottcher, 2010). The concept that living processes drive changes in the physical environment at the global scale is not new. The ‘Gaia Hypothesis’, which postulates that living processes help maintain atmospheric homeostasis, was published nearly 40 years ago (Lovelock et al., 1974), and there is mounting evidence that this is indeed the case (Charlson et al., 1987; Cicerone & Oremland, 1988; Gorham, 1991). Only recently, however, has next-generation high-throughput data made possible direct investigations of the specific molecular mechanisms and microbial consortia responsible for the planet's dynamic equilibrium.

While their effects may be global, microbial systems interact with their environments at microscopic scales. A single gram of soil might contain around 10⁹ microbial units (Torsvik & Ovreas, 2002), and an average milliliter of seawater will contain approximately a million bacterial cells. The wide taxonomic diversity of these populations (Pedros-Alio, 2006) is fostered, at least in part, by myriad microenvironments accessible to the bacteria. In soil and marine systems, the majority of microbial diversity is represented in the minority of biomass (Pedros-Alio, 2006; Sogin et al., 2006; Ashby et al., 2007; Elshahed et al., 2008). Generally, in highly diverse microbial communities, a few abundant taxa predominate, with a long tail of low-abundance taxa (Sogin et al., 2006). These low-abundance taxa in particular are crucial to our understanding of microbial ecosystems, as they represent the vast functional diversity that can rapidly blossom to high abundance under the appropriate environmental conditions (e.g. Caporaso et al., 2011a,b,c; Gilbert et al., 2011).

Microbial systems can be described using environmental DNA sequence information and contextual metadata, which reveal dynamic taxonomic and functional diversity across gradients of natural or experimental variation (Tyson et al., 2004; Venter et al., 2004; DeLong et al., 2006; Gilbert et al., 2010; Delmont et al., 2011). Taxonomic diversity is a measure of the community species composition, which is maintained or altered via interactions and adaptations between each species and its environment. Functional diversity is a measure of the frequency and the type of predicted enzyme functions encoded in a community's metagenome, and represents the potential to express a phenotype that interacts with a particular environmental state. Increasing depth from continuing advances in sequencing technologies has enabled whole genomes to be reassembled from metagenomic data, which permits appropriate descriptions of the taxonomic and functional potential of individual species embedded within each community (Woyke et al., 2010; Hess et al., 2011; Iverson et al., 2012). While the goal of this mini-review is not to highlight the impact of these studies on defining the relationships between microbial communities and their environments [which is covered in other reviews, e.g. (Torsvik & Ovreas, 2002; Fierer & Jackson, 2006; Falkowski et al., 2008; Wooley et al., 2010; Gilbert & Dupont, 2011)], it is important to state that each community, whether embedded in a desiccated soil particle or in a biofilm attached to a hermit crab in a coral sea, presents a potentially unique set of interactions with the ecosystem. Here, we summarize current approaches used to generate predictive models that incorporate taxonomic and functional diversity at the metabolic, microbial interaction, community composition, and ecosystem scales of microbial ecology.

Microbial community sampling efforts

Metagenomics is the capture and analysis of genomic information from a volume of environmental sample (Fig. 1; Handelsman et al., 1998; Gilbert & Dupont, 2011). Recent advances in direct sequencing of DNA from an environmental sample have generated prodigious amounts of sequence information, resulting in a data bonanza (Field et al., 2011). As important as the collection of metagenomic data, however, is the concurrent collection of associated metadata (i.e. the chemical and physical characteristics of the environment undergoing metagenomic analysis). To generate hypotheses regarding the interactions within a community that result in observed patterns in diversity and richness, the relevant physical, chemical and biological factors must be measured. Probes can quantify various parameters, such as temperature, pH, ammonia, silicate, and oxygen concentration, at approximately the scale experienced by individual microorganisms (Debeer et al., 1992; Zhang et al., 1995; Rani et al., 2007; Stewart & Franklin, 2008). Metabolomic techniques such as near- and mid-infrared diffuse reflectance spectroscopy (Forouzangohar et al., 2009), nuclear magnetic resonance, or gas chromatography-mass spectrometry (Viant et al., 2003; Viant, 2008; Wooley et al., 2010) can provide measurements for very small volumes of environmental samples, but they capture only a fraction of the thousands of metabolites potentially present (Viant, 2008). At the opposite end of the physical scale, remote sensing, recognized as the only tool for gathering data over extensive spatial and temporal scales (Graetz, 1990), measures electromagnetic radiation reflected or emitted from Earth's surface, without direct physical contact with the objects or phenomena under investigation. Remotely sensed imagery can provide a synoptic view of landscapes, enabling data acquisition over large expanses and/or physically inaccessible areas. Recent technological advances permit acquisition of imagery with spatial resolution as fine as 60 cm² and temporal resolution as high as once a day when using a satellite platform.

Figure 1. Metagenomic analysis. DNA is extracted directly from a volume of environmental sample. The outline featured here is specific to the frequently used MG-RAST metagenomic analysis pipeline (Meyer et al., 2008), but can be generalized, with variations, to a broad range of metagenomic analysis approaches. (a) The width of this bar represents 100% of the DNA sequence data collected from an environmental sample. (b) DNA sequences are subjected to quality control, such as removing sequences that contain ambiguous base calls or are technical duplicates. (c) For sequences that pass quality control, the most likely protein-coding frame is identified. (d) For the predicted protein sequences from sequences that have a likely coding frame, the best homology to proteins in a large database of protein sequences is identified. Given the potentially large number of predicted protein sequences from the metagenomic dataset and the size of the database of known proteins, this step can require considerable computational time. (e) Not every predicted protein with homology to a known protein will match a protein of known or predicted function. At the end of metagenomic analysis, only a fraction of the initial sequence reads may have generated hits to proteins of known function or taxonomic identity. (f) Collected annotations and their relative distributions across metagenomic datasets are the principal input data for downstream modeling of microbial community structure and function.
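
The quality-control step in panel (b) can be illustrated with a minimal sketch. This is not the MG-RAST implementation: the function name `quality_filter`, the rule that any character other than A, C, G or T counts as an ambiguous base call, and the treatment of exact repeats as technical duplicates are all assumptions made for illustration.

```python
def quality_filter(reads):
    """Drop reads with ambiguous base calls and exact duplicate reads.

    Hypothetical helper: real pipelines apply more elaborate criteria.
    """
    seen = set()
    kept = []
    for read in reads:
        seq = read.upper()
        if set(seq) - set("ACGT"):   # assumption: non-ACGT means ambiguous
            continue
        if seq in seen:              # assumption: exact repeat means duplicate
            continue
        seen.add(seq)
        kept.append(seq)
    return kept


# Made-up reads, for illustration only.
reads = ["ACGTAC", "ACGNAC", "acgtac", "TTGACA"]
passed = quality_filter(reads)
fraction_passed = len(passed) / len(reads)
print(passed, fraction_passed)
```

The fraction of reads surviving each step is what the narrowing bars in the figure represent.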

Ongoing environmental monitoring projects that focus on using high-throughput sequencing techniques and continuous collection of contextual metadata to explore microbial life [e.g. the Global Ocean Survey (http://www.jcvi.org/cms/research/projects/gos), Tara Oceans (http://oceans.taraexpeditions.org/), the Hawaiian Ocean Time Series (http://hahana.soest.hawaii.edu/hot), the Bermudan Ocean Time Series (http://bats.bios.edu), Western Channel Observatory (http://www.westernchannelobservatory.org.uk/), and the National Ecological Observatory Network (NEON; http://www.neoninc.org)] are generating huge quantities of data on the dynamics of microbial communities in ecosystems across local, continental, and global scales. Recent studies of coastal marine systems (Gilbert et al., 2010, 2011; Caporaso et al., 2011a,b,c), the human microbiome (Caporaso et al., 2011a,b,c), animal rumen (Hess et al., 2011), and Arctic tundra (Graham et al., 2011; Mackelprang et al., 2011) provide examples of the data density (both sequencing-based and contextual metadata) required to characterize microbial community structure in complex ecosystems.

Scales of microbial community models

Modeling approaches to microbial ecosystems can be grouped into four broad categories (Fig. 2). While the specific boundaries in time or space that separate one scale of microbial modeling from another are somewhat arbitrary, the categories can be distinguished by how they represent microbial processes and their relationships with their environments.

Figure 2. Microbial systems at log scale. In this figure, the time and physical scales of different categories of microbial interactions are arranged on log10 scales. Placements of reference points of interest on the figure are approximate. Not shown on this figure is the time since the origin of microbial life on Earth, at ~17.1 log10 (seconds).
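
The ~17.1 value in the caption can be checked against the 3.8 billion year age cited in the Introduction. The short calculation below is only a sanity check; the use of a 365.25-day year is an assumption, and the variable names are illustrative.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: 365.25-day year
years_since_origin = 3.8e9              # age cited in the Introduction

log10_seconds = math.log10(years_since_origin * SECONDS_PER_YEAR)
log10_hour = math.log10(3600)           # one hour, for comparison

print(round(log10_seconds, 1))  # 17.1
print(round(log10_hour, 1))     # 3.6
```

On this axis, the temporal range from hours to the full history of microbial life spans roughly 13 to 14 orders of magnitude.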

Metabolic

Metabolic models investigate how a single microbial cell interacts with its environment. The ultimate single-cell model is one that encapsulates the full set of potential biochemical reactions within the cell that result in its phenotype and its interactions with environmental factors and available nutrients. Recently, developments in the prediction of flux-balance models for individual genomes (Henry et al., 2010, 2011) have enabled these models to be generated in a high-throughput manner for tens of thousands of microbial genomes. This approach is becoming increasingly relevant as draft-quality genomes of the most abundant organisms in a microbial community can be assembled from metagenomic data (Woyke et al., 2010; Hess et al., 2011; Mackelprang et al., 2011; Iverson et al., 2012; Luo et al., 2012). In particular, Mackelprang et al. (2011) found that the most abundant organism present in Alaskan permafrost soil was a novel methanogen and that modeling its metabolism from the assembled draft genome provided direct insight into how the thawing permafrost will contribute methane, a powerful greenhouse gas, to the atmosphere.
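
For readers unfamiliar with flux-balance models, the generic optimization problem they solve can be written as follows. This is the standard textbook form, given here as an aid to the reader rather than as the specific formulation used in the studies cited above; the symbols are introduced for this illustration only.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

Here S is the stoichiometric matrix (metabolites by reactions) derived from the annotated genome, v is the vector of reaction fluxes, c weights the objective (commonly a biomass-producing reaction), and the bounds encode reaction reversibility and nutrient availability. The constraint S v = 0 expresses the steady-state assumption that internal metabolites neither accumulate nor deplete.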

Microbial interaction

Microbial interaction models predict how the metabolisms of two or more microbial taxa interact with one another and their environment. Flux-balance models, which have proven successful, are now being taken a step further to enable the development of simple interaction models between multiple individual flux-balance models for different genomes (Freilich et al., 2011). Individual-based models represent space as a discrete lattice, and each lattice element can contain microbial cells and measures of environmental parameter levels. Each microbial cell in the model is an individual and can have various capacities to interact with environmental parameters (O'Donnell et al., 2007). Applying individual-based methods to entire microbial communities requires highly detailed, very accurate information about microbial metabolism and the nature of the microenvironment (Ferrer et al., 2008; Freilich et al., 2011). Fortunately, there are computational techniques for describing multiphase transport in complex, porous media like soil, such as the Lattice-Boltzmann method (e.g. Zhang et al., 2005), which is a class of computational fluid dynamics techniques. Using these methods, it may be possible to model the dynamic movement of soil and then overlay this with biological information regarding the dynamics of the microbiome in that system; however, this has not yet been validated. Because this form of modeling can be computationally intensive, some methodological innovations, such as the use of superindividuals, have been advocated (Scheffer et al., 1995). The first study using individual-based modeling to predict the behavior of a microbial community simulated the accumulation of nitrate by nitrifying bacteria in different soil types (Ginovart et al., 2005). Recently, Gras et al. (2010) modeled the metabolism and dynamics of organic carbon and nitrogen in three different types of Mediterranean soil. The model incorporated specific parameters for growth and decay of microbial biomass, temporal evolution of mineralized intermediate carbon and nitrogen, mineral nitrogen in ammonium and nitrate, carbon dioxide, and O2. A good empirical fit of the model was observed using data from laboratory incubation experiments.
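
The lattice representation used by individual-based models can be made concrete with a toy sketch. It is not a reconstruction of any of the cited models: the single substrate, the uptake and division rules, the wrap-around boundaries, and all parameter values are assumptions chosen only to show the data structures involved.

```python
def step(substrate, cells, uptake=1.0, divide_at=2.0):
    """Advance the lattice by one time step (toy rules, all assumed).

    substrate: square list of lists holding the nutrient level per site.
    cells: dict mapping (row, col) -> biomass, at most one cell per site.
    """
    size = len(substrate)
    updated = dict(cells)
    neighbours = ((0, 1), (1, 0), (0, -1), (-1, 0))
    for (row, col), biomass in sorted(cells.items()):
        # Each individual takes up substrate from its own lattice site.
        taken = min(uptake, substrate[row][col])
        substrate[row][col] -= taken
        biomass += taken
        if biomass >= divide_at:
            # Divide into the first empty neighbouring site, if any
            # (assumption: periodic, wrap-around boundaries).
            for d_row, d_col in neighbours:
                site = ((row + d_row) % size, (col + d_col) % size)
                if site not in updated:
                    biomass /= 2.0
                    updated[site] = biomass
                    break
        updated[(row, col)] = biomass
    return updated


substrate = [[3.0] * 5 for _ in range(5)]
cells = {(2, 2): 1.0}
total_before = sum(map(sum, substrate)) + sum(cells.values())
for _ in range(10):
    cells = step(substrate, cells)
total_after = sum(map(sum, substrate)) + sum(cells.values())
print(len(cells), total_before, total_after)
```

Because every individual and every lattice site is tracked explicitly, cost grows with both population size and lattice resolution, which is why aggregations such as superindividuals are attractive.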

An alternative approach, which models microscale dynamics over relatively short time scales rather than across very small physical spaces, is the Lotka–Volterra-type predator–prey model, or so-called ‘kill-the-winner’ model (Rodriguez-Brito et al., 2010). In the case of microbial life, the predators are viruses. In ‘kill-the-winner’, as abundances of particular taxa increase, so does their vulnerability to predation by viruses, leading to populations that are structurally stable over coarse-grained intervals but marked by rapid fluctuations in structure at the fine-grained level.
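
The oscillating behavior described above can be seen in the classical two-variable Lotka–Volterra system, sketched below with one host and one virus. This is the textbook form, not the specific model of Rodriguez-Brito et al. (2010); the parameter names, their values, and the fixed-step Runge–Kutta integrator are assumptions made for illustration.

```python
def derivatives(host, virus, r=1.0, a=0.1, b=0.5, m=0.5):
    """Classical Lotka-Volterra rates; parameter values are illustrative."""
    d_host = r * host - a * host * virus          # growth minus viral lysis
    d_virus = b * a * host * virus - m * virus    # virus production minus decay
    return d_host, d_virus


def simulate(host, virus, dt=0.01, steps=5000):
    """Integrate with a fixed-step fourth-order Runge-Kutta scheme."""
    trajectory = [(host, virus)]
    for _ in range(steps):
        k1 = derivatives(host, virus)
        k2 = derivatives(host + 0.5 * dt * k1[0], virus + 0.5 * dt * k1[1])
        k3 = derivatives(host + 0.5 * dt * k2[0], virus + 0.5 * dt * k2[1])
        k4 = derivatives(host + dt * k3[0], virus + dt * k3[1])
        host += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        virus += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        trajectory.append((host, virus))
    return trajectory


trajectory = simulate(host=20.0, virus=5.0)
hosts = [h for h, _ in trajectory]
print(f"host abundance ranged from {min(hosts):.2f} to {max(hosts):.2f}")
```

With these parameters the equilibrium lies at a host abundance of m/(b·a) = 10 and a virus abundance of r/a = 10. A host that rises above this level is driven back down by the growing virus population, so abundances cycle around the equilibrium rather than settling on it.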

Two examples of ecologically relevant microbial interactions for modeling are complex microbial structures such as biofilms (Chen et al., 2004; Diaz, 2012) and microbial mats (Heidelberg et al., 2009; Liu et al., 2011). In both these types of microbial communities, certain properties of microbial interaction would not be predictable from the metabolic capacity of any of their constituent members.

Community composition

Community models are concerned with how local environmental conditions shape the compositions of microbial populations. There are currently a number of niche-based techniques that link environmental parameters with microbial community structure (Bowers et al., 2011; Fierer & Lennon, 2011; Fierer et al., 2011; Jutla et al., 2011; Steele et al., 2011; Barberan et al., 2012). An extension of this idea is the development of predictive bioclimatic models (i.e. envelope models, ecological niche models, or species distribution models) that enable the estimation of the geographic and temporal ranges of organisms as a function of environment (Heikkinen, 2006; Jeschke & Strayer, 2008). Logistic regression uses generalized linear models (Bolker et al., 2009) to fit the presence or absence of a species against climatic variables as a linear function. Generalized additive models (GAM) model species as an additive combination of functions of independent variables (Hastie & Tibshirani, 1990). Climate envelope models like BIOCLIM (Busby, 1991), DOMAIN (Carpenter et al., 1993), and HABITAT (Walker & Cocks, 1991) fit the minimal envelope that defines an organism's possible habitat in multi-dimensional space, but use presence-only data rather than presence/absence. Maximum entropy models [MaxEnt (Phillips et al., 2006)] minimize the relative information entropy (dispersion) between two probability densities defined in covariate space (Elith et al., 2011). The classification and regression tree technique models communities as a binary decision tree in which the decision rules at each node use one or more independent environmental parameter variables (Che et al., 2011). Neural network approaches, such as the genetic algorithm for rule-set prediction (Stockwell & Noble, 1992; Stockwell & Peters, 1999), have powerful predictive capabilities, but only model organism distributions as present or absent as a function of environmental parameters. These niche-based bioclimatic models interpolate species distributions without mechanistic information, based on observed species occurrences in the environment. This indirectly takes into account, a posteriori, competition between species, barriers to distribution, and other historical factors that cannot be physiologically predicted. Niche models yield the realized (actual) niche, rather than the fundamental (theoretical) niche predicted by process-based models (Guisan & Zimmermann, 2000; Morin & Thuiller, 2009). These models can underestimate complex biotic interactions and do not necessarily allow for varying distributions of the same organisms in different environmental conditions. Thus, myriad tools exist to model the dynamics of microbial community structure. However, few if any have attempted to predict the relative abundance of the many thousands of potential species observed in complex systems (Caporaso et al., 2011a,b,c).
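
The presence-only envelope idea described above can be sketched in a few lines. This is a deliberately simplified bounding-box envelope, not a reimplementation of BIOCLIM, DOMAIN, or HABITAT; the function names, the two environmental variables, and the site values are hypothetical.

```python
def fit_envelope(presence_sites):
    """Return the (min, max) range of each variable over presence sites."""
    variables = presence_sites[0].keys()
    return {
        name: (min(site[name] for site in presence_sites),
               max(site[name] for site in presence_sites))
        for name in variables
    }


def in_envelope(envelope, site):
    """A site is predicted suitable if every variable falls within range."""
    return all(low <= site[name] <= high
               for name, (low, high) in envelope.items())


# Hypothetical presence-only observations of one taxon.
presence_sites = [
    {"temperature_c": 12.0, "ph": 7.9},
    {"temperature_c": 15.5, "ph": 8.1},
    {"temperature_c": 18.0, "ph": 8.0},
]
envelope = fit_envelope(presence_sites)

candidate_inside = {"temperature_c": 14.0, "ph": 8.0}
candidate_outside = {"temperature_c": 25.0, "ph": 8.0}
print(in_envelope(envelope, candidate_inside))   # True
print(in_envelope(envelope, candidate_outside))  # False
```

The sketch also shows the limitation noted above: the envelope is fitted only to where the organism was observed, so it describes the realized niche and contains no mechanism.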

One particular example of relevant modeling at this scale is for animal-associated microbial communities. Variation in the human gut microbiome has been linked to human health (Burcelin et al., 2011; Marchesi, 2011; Wu et al., 2011). In addition, microbial communities that live within other organisms, such as the termite gut or the cow rumen, have potential applications in deriving biofuels from lignocellulosic plant materials (Hongoh, 2010; Hess et al., 2011).

Ecosystem

Ecosystem models of microbial communities span large environments, up to the entire biosphere. The one ocean model (O'dor et al., 2009) represents the global marine ecosystem at the largest possible scale: as a single circular ocean with a 10 000-km radius and a uniform 4 km depth. This model system is used to explore the potential for biodiversity dispersal. In the case of bacteria, a single ‘species’ could traverse the whole ocean in only 10 000 years. However, there are complications to such a simple theoretical model, such as barriers to dispersal. While continents may be the most obvious, currents are just as potent. The MIT General Circulation Model (Marshall et al., 1997) is a mathematical description of the motions that control oceanic and atmospheric currents. Combining these physical models with microbial diversity models, in which a number of microbial phenotypes are initialized and their interaction with the modeled environment determines their relative fitness, should enable accurate prediction of dispersal, limits of dispersal, and species fitness (Bruggeman & Kooijman, 2007; Follows et al., 2007; Merico et al., 2009). For example, using diversity-based models with the high-resolution general circulation model (Marshall et al., 1997) enables the generation of several dozen parameterized phytoplankton models (Follows et al., 2007; Dutkiewicz et al., 2009). In these ocean-wide model systems, the fitness of modeled organisms responding through a combination of light, nutrient, and temperature adaptations corresponds well to the fitness of laboratory cultures under similar conditions.
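
The dimensions quoted for the one ocean model imply a few simple quantities, worked out below. Reading ‘traverse the whole ocean’ as crossing the full diameter is an assumption of this sketch, as is the 365.25-day year; the variable names are illustrative.

```python
import math

radius_km = 10_000            # from the one ocean model described above
depth_km = 4
crossing_time_years = 10_000

surface_area_km2 = math.pi * radius_km ** 2
volume_km3 = surface_area_km2 * depth_km

# Assumption: traversal means crossing the diameter (2 x radius).
speed_km_per_year = 2 * radius_km / crossing_time_years
speed_m_per_day = speed_km_per_year * 1000 / 365.25

print(f"surface area: {surface_area_km2:.3g} km^2")  # 3.14e+08 km^2
print(f"volume: {volume_km3:.3g} km^3")              # 1.26e+09 km^3
print(f"dispersal: {speed_km_per_year} km/year")     # 2.0 km/year
print(f"dispersal: {speed_m_per_day:.1f} m/day")     # 5.5 m/day
```

Under these assumptions the implied net dispersal rate is only a few meters per day, which is why barriers such as currents can matter as much as continents.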

Predicting trends in microbial community modeling

This is a golden age for microbial ecology. We are generating datasets that could lay the foundation of the next phase in microbial ecosystem modeling. As greater spatial and temporal resolution is achieved, the finer details of community structure will be elucidated, enabling biological, chemical, and physical relationships to be described with mathematical formalisms. The next generation of microscale, bottom-up models will focus on imposing more accurate metabolic models to define flux rates of enzymatic reactions for biological units that interact in massively parallel computational arrays (e.g. http://systems.cs.uchicago.edu/projects/bhive.html). These systems, built of cellular and biochemical components, rely on a mechanistic understanding, which must be a focus for future microbial research. Without an improved knowledge of the biochemical nature of metabolism, metabolic interactions cannot be accurately described. A challenge for such systems will be to integrate physical and chemical disturbance into the model environment. As has been shown with macroscale models of the global ocean, the physical currents, once modeled, enable significantly improved accuracy of prediction for community structure and biomass of individual taxonomic units.

It may be that microbial ecosystems, similar to life at macroscales, are fundamentally fractal in nature (Gisiger, 2001; Brown et al., 2002), displaying statistical self-similarity across multiple scales. If everything were in fact everywhere, then every sampled microbial population would contain a representation of the whole. Patterns of changing abundance in a milliliter of seawater might then mimic the patterns observed in entire oceans. Fractal and multifractal systems have been applied to ecological patterns in the past (Borda-de-Agua et al., 2002; Brown et al., 2002), and these tools may be valuable in modeling microbial systems as well. As understanding of microbial ecosystems continues to grow, the connections between the micro and the macroscales will become more apparent.

The ability to observe the taxonomic and functional diversity of microbial systems is still very new, and microbial ecosystems are ancient. For a largely immortal organism that takes only 10 000 years to move across the globe and can be safely embedded in solid rock to await the geochemical conditions suitable to resume growth, a few years of observations might be insufficient to grasp the true dynamics of these ecosystems. Perhaps for some microbial taxa, the passing of the seasons is less important than the cycles of El Niño/La Niña, or even the coming and going of ice ages. Microbial ecosystem models are the only lens through which the full scope of microbial ecology can be observed, and provide opportunities for researchers to make predictions of microbial taxonomic and functional structure that extend far beyond the current range of possible observations.
