Crystal ball – 2013

In this feature, leading researchers in the field of environmental microbiology speculate on the technical and conceptual developments that will drive innovative research and open new vistas over the next few years.

A high-resolution 3D ‘peek’ into microbial community life

Manfred Auer, Life Sciences Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, MS Donner, Berkeley, CA 94720, USA.

Environmental microbiology, like many other biological disciplines, heavily utilizes the powerful high-throughput tools that modern systems biology has to offer. Huge inventory lists are being generated by all kinds of OMICS, including gen-, metagen-, transcript-, prote- and metabol-OMICS. Despite the emergence of such ever-growing lists, we still lack a fundamental understanding of microbial community life, in part – I would argue – because we do not understand the spatio-temporal relationships of the key components, be they proteins, macromolecular machines or cells, which function in their respective 3D community context. It would seem that spatial vicinity matters at all levels of complexity, and thus that determination of the 3D ultrastructural organization of sufficiently large volumes, as well as precision protein localization via tag- or affinity-based labelling, are key ingredients to a detailed understanding of community life.

Such volumes need to be visualized in 3D, and features of interest must be extracted through segmentation, classification and annotation. In order to go beyond the pretty picture we must determine the volumetric and geometrical parameters of each of the constituents, such as volume, shape, diameter, distance, curvature and direction, and establish their spatial relationships, thus revealing correlations and possibly even causalities.
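As a minimal sketch of what such a quantification step might look like, the following Python example measures volume, equivalent spherical diameter and centroid for each object in an already segmented (labelled) volume, and the distance between two objects. It is an illustration only: the function names, the label convention (0 for background) and the default voxel size are assumptions, not part of any particular imaging pipeline.

```python
import numpy as np


def region_metrics(labels, voxel_size_nm=(15.0, 10.0, 10.0)):
    """Measure each labelled object in a segmented 3D volume.

    labels: integer array indexed (Z, Y, X); 0 is background (assumed convention).
    voxel_size_nm: assumed voxel edge lengths (Z, Y, X) in nanometres.
    """
    voxel_size = np.asarray(voxel_size_nm, dtype=float)
    voxel_volume = float(np.prod(voxel_size))
    metrics = {}
    for label in np.unique(labels):
        if label == 0:
            continue  # skip background
        coords = np.argwhere(labels == label)  # voxel indices, shape (n, 3)
        volume = coords.shape[0] * voxel_volume
        metrics[int(label)] = {
            "volume_nm3": volume,
            # diameter of a sphere with the same volume
            "equiv_diameter_nm": (6.0 * volume / np.pi) ** (1.0 / 3.0),
            # mean voxel position, converted from indices to nanometres
            "centroid_nm": coords.mean(axis=0) * voxel_size,
        }
    return metrics


def centroid_distance_nm(metrics, a, b):
    """Euclidean distance between the centroids of objects a and b."""
    return float(np.linalg.norm(metrics[a]["centroid_nm"] - metrics[b]["centroid_nm"]))


if __name__ == "__main__":
    # Toy volume: two 2x2x2 "cells" in a 4x4x8 block (purely illustrative).
    toy = np.zeros((4, 4, 8), dtype=int)
    toy[0:2, 0:2, 0:2] = 1
    toy[0:2, 0:2, 5:7] = 2
    m = region_metrics(toy)
    print(m[1]["volume_nm3"], centroid_distance_nm(m, 1, 2))
```

Shape, curvature and direction need more than voxel counting (for example, surface meshes or principal-axis fits), so they are left out of this sketch.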

Microbial community architecture has long been the domain of scanning electron microscopy (SEM). However, this traditional imaging approach – while yielding stunning and enticing pictures – does not provide a true 3D impression as it can only reveal the very surface of an object, such as a biofilm, and does not allow a look inside, and often does not allow real quantification of any of the observations. Furthermore, unexpected features such as the frequently encountered intercellular connections between bacteria have been dismissed as sample preparation artefacts (Dohnalkova et al., 2011). Transmission electron microscopy, on the other hand, while allowing for exquisite sample preservation (McDonald and Auer, 2006; Palsdottir et al., 2009), can only cover a tiny sliver of the 3D volume, and the attempt to reconstruct a 3D volume from serial sections has its own challenges, including finding the exact same area to image for different sections and for every grid, as well as compensating for the unique mechanical deformations of each section, making it difficult to unambiguously reconstruct a serial section 3D volume.

Recently, two techniques, focused ion beam SEM and serial block face SEM, have entered the arena of intermediate-resolution 3D electron microscopy imaging and show great promise to overcome traditional limitations. These two novel imaging approaches are related yet distinct from one another, each with its own set of advantages and limitations. They allow imaging of currently tens but soon hundreds of microns of biofilms (in X, Y and Z) at a resolution of ∼10 nm in XY and ∼15 or ∼30–50 nm in Z respectively (Fig. 1). What makes these approaches so powerful is that sample preparation does not suffer from traditional SEM sample preparation artefacts, and that the overall biofilm organization at the cellular and community level can be visualized while simultaneously allowing the detection of the deployed macromolecular strategies including vesicles, pili or other internally/externally located macromolecular machines.

Figure 1.

Focused ion beam scanning electron microscopy (FIB/SEM) of a mixed microbial community reveals distinct ultrastructural features inside and between cells. Upon 3D segmentation and 3D rendering, the 3D organization can be examined in exquisite detail, from the presence and 3D organization of macromolecular machines to cellular and community 3D organization.

To be sure, plenty of obstacles remain to be tackled, such as sufficient access to the very expensive 3D imaging equipment, the sheer visualization of such large and highly complex volumes, as well as the need to develop user-guided and/or (semi)-automated approaches for extracting features of interest, easy 3D volume annotation and quantitative 3D geometrical analysis, and ultimately the translation of data voxels into semantic information. As biologists and microscopists team up with computer scientists, some of these enormous challenges are beginning to be addressed, and hence routine large-scale high-resolution imaging of large biofilm volumes is within reach.
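To give a rough sense of why the sheer size of such volumes is an obstacle, the following back-of-the-envelope Python sketch counts voxels for a given imaged extent and voxel size. The function names are hypothetical, and the default of one byte per voxel (8-bit greyscale) is an assumption rather than a property of any specific instrument.

```python
import math


def voxel_count(extent_um, voxel_nm):
    """Voxels needed to cover a box of extent_um (X, Y, Z in microns)
    sampled at voxel_nm (X, Y, Z in nanometres)."""
    count = 1
    for extent, step in zip(extent_um, voxel_nm):
        count *= math.ceil(extent * 1000 / step)  # microns -> nanometres
    return count


def raw_size_gib(n_voxels, bytes_per_voxel=1):
    """Uncompressed size in GiB, assuming bytes_per_voxel per voxel."""
    return n_voxels * bytes_per_voxel / 2**30


if __name__ == "__main__":
    # A 100-micron cube at ~10 nm in XY and ~15 nm in Z (illustrative values).
    n = voxel_count((100, 100, 100), (10, 10, 15))
    print(f"{n:.3e} voxels, about {raw_size_gib(n):.0f} GiB uncompressed")
```

Under these assumptions a single 100-micron cube already runs to hundreds of billions of voxels, which is why visualization, segmentation and annotation tools have to scale along with the microscopes.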

Clearly, we are entering an exciting new era of 3D imaging in microbial physiology and pathogenesis that will allow us to map the parts list onto the 3D organization in cells and biofilms, and thus we will be able to take a detailed ‘peek’ into microbial community life.


I would like to thank Phil Hugenholtz, Falk Warnecke, Bernhard Knierim, Brandon Van Leer (FEI), Tom Goddard (UCSF), Monica Lin and Mitalee Desai for their help in sample preparation, 3D FIB/SEM imaging and 3D visualization of the depicted mixed microbial community.

This work was supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

Microbial Earth: the motion picture

Edward F. DeLong, Department of Civil and Environmental Engineering and Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Imagine you win the lottery and your prize is to travel with Sir David Frederick Attenborough (OM, CH, CVO, CBE, FRS, FZS, FSA), to train in the art of crafting popular Nature documentaries. In your travels with the master, you are awed by the raw violence of great whites devouring sea lions, by the smooth stealth of a hunting lioness, by the speed and grace of the gazelle that evades her, and by the unimaginable diversity of plant and animal life in the rainforests and coral reefs. You are equally awed by Attenborough's uncanny skill and craft in capturing the essence of nature and nurture, and the beauty, savagery, vastness and variety, which connects his audience emotionally to natural history in a deep, intuitive and visceral way.

Now it is your turn, a microbial ecologist having just trained with the great Sir David. The BBC gives you mega bucks to produce a 12-part series, ‘Microbial Earth’. So, how are YOU going to connect in the same emotional, visceral and intuitive ways as Attenborough? Will you show the savagery of exoenzyme hydrolysis attacking a dying diatom bloom, the grace and beauty of runs and tumbles in a chemotactic sensory path, the vicious jab of a Type III pilus, or complex food chain dynamics that recycle carbon and energy between microbes and sediments? Do you think you will get Joe Public's rapt attention in these efforts? Hmmm – it is really NOT as easy a task as Attenborough has!

Admittedly, there are some relatively straightforward bridges to be built. After all, videos of ciliates feeding on their bacterial prey, food vacuoles bulging as they gorge, portray a microbial predator–prey dynamic easily translatable into what people see, know and can intuit. But what about those savage exoenzymes, buzzing electron transport chains, vicious Type III secretion systems, intimate symbioses and vast biogeochemical cycles and gradients? These are not so visceral, intuitive or emotionally accessible, nor arguably so easily portrayed to capture the general public's excitement and imagination. Part of the challenge is that humans simply do not have the intuition, instincts or aesthetic appreciation of microscopic and invisible form, function and interactions (Stahl, 2011; Woese, 1994). The majesty, diversity, impact and complexity of the vast microbial world are not so easily visualized, captured and communicated to the general public – even with the best artists and animators on the planet at your disposal. While the task is certainly not hopeless, and there is great progress to be made, do you really think that today, you could easily top Attenborough's appeal for the public's excitement and attention, in your microbial documentary? (More power to you if your answer is yes – please do it!)

But I digress. My gaze into the crystal ball today is not really about one-upping Attenborough. Instead, I will prognosticate briefly on how recent trends have influenced our appreciation of microbial natural history today, and where we may be heading in the future: namely, towards a much deeper and more realistic four-dimensional motion picture of microbial natural history in the wild!

To understand the present, look to the past. To understand the future, look to the present

It goes without saying that over the past 20 years our perspective on microbial natural history has advanced significantly thanks to the emerging cultivation-independent paradigm (Pace, 2012), as well as advances in more traditional microbiological approaches. From microbial genomics and microarrays, to more recently developed ‘next generation’ sequencing techniques, there have been great gains in the scope, scale and economy of microbial ‘omics’ data acquisition, and the molecular readouts of microbial community structure, function and dynamics that they bring. These advances in turn have brought new insights into the nature of microbial genome evolution, the mechanisms of microbial population dynamics, global maps of microbial taxon distribution and abundance, and the distributions of microbial genes, gene expression and proteins in the environment. The Whole Earth Catalogue of microbes, genomes and genes is fleshing out impressively, at levels unimaginable only just a few years ago. Some may still lament the ‘big data’ problem, complain that we are drowning in data, and quip that information is not knowledge. Of course, there are still great challenges, but the future is bright. While we may be swimming in a sea of big data, as we swim we are learning new ‘strokes’, including new and improved sampling techniques, high-density data archiving capabilities, statistical methods and computational modelling approaches. These newfound capabilities are now facilitating unprecedented views into the natural history of microbial communities and ecosystems, at a scope and scale never before imaginable.

So, at this juncture, what can we predict about the trajectory of future new views of the natural microbial world? One thing seems fairly certain: we soon will move beyond static surveys, snapshot modes and simpler models of the past. This in part will be driven by integrated pictures of in situ microbial community interactions and dynamics, obtained by ‘filming’ the minute-by-minute microbial activities at high biological resolution, at more and more realistic and relevant spatial and temporal scales. This likely will involve the integration of many new and developing technologies including scaled down microfluidic and nanoscale technologies; automated sampling and sensing coupled with high biological resolution ‘omic’-based approaches; high-speed microscopic visualization and chemical approaches (for example, miniaturized flow cytometers and mass spectrometers); and quantitative mapping of the multiple (omic) readouts of indigenous microbial ‘biosensors’ (aka, microbial community members), onto other biological and environmental variations.

Already some of these new motion pictures are beginning to be released, although the technologies still need much improvement, and short film clips are only just now becoming available. The autobiographic human gut microbial community drama entitled ‘Our humans and us’ is now being filmed (Caporaso et al., 2011). Instalments of ‘Seasons of our lives: my years in marine picoplankton’ are also being filmed (Gilbert et al., 2009). The ‘Bloom and Bust’ series, documenting phytoplankton successional events, is also being made in several instalments (Rinta-Kanto et al., 2012; Teeling et al., 2012). And again in the sea, a film entitled ‘A day in the lives of marine picoplankton’, shot with automated, Lagrangian sampling and high-resolution community transcriptome profiling (Ottesen et al., 2011), is also being filmed (coming soon to a theatre near you, Ottesen et al., 2012).

These four-dimensional movies of the natural microbial world will increasingly employ remote and continuous sampling and sensing at both micro and macro scales. Sometimes they will be achievable in real time, and sometimes not. And it goes without saying they will require advanced computational, statistical and modelling approaches, to fully develop the plot line and story of the microbial motion picture in the wild. The daily drama and natural historical details of the minutes, days, weeks, months and years in the ‘lives’ of microbial communities that remain obscure at present, will soon come into much sharper focus. With these new perspectives future microbial natural historians are likely to have much richer stories to tell. Microbial natural histories will soon rival the stories told by Sir David Attenborough, as high-resolution, four-dimensional microbial motion pictures more clearly reveal the drama, majesty and intimate interactions that occur each day on the microbial Serengeti.

The next big thing in cyanobacteria

Robert Haselkorn, Department of Molecular Genetics & Cell Biology, The University of Chicago, 920 East 58 Street, Chicago, IL 60637, USA.

It is roughly 20 years since the discovery of the transcription factor HetR in Anabaena. In that period it became clear that the HetR protein alone could not be responsible directly for the expression of the 1500 or so genes needed to turn a vegetative cell fixing carbon into an anaerobic factory fixing nitrogen. With the solution of the X-ray structure of the HetR dimer and studies of its binding to a single palindrome in the Anabaena genome and its regulation by the peptide RGSGR, we are on the cusp of understanding the cascade it directs. The urgently needed information now is the catalogue of auxiliary proteins that associate with HetR to direct it to additional DNA sites, the mechanism by which HetR turns on transcription, and the details of the cascade of genes whose expression is unleashed by HetR.

A different set of questions has been posed in connection with the study of toxins produced by cyanobacteria. What functions do these compounds carry out? Why are they made in the first place? During the past year or two the number and character of toxins produced by cyanobacteria has expanded significantly. Originally we were concerned with the microcystins, cyclic heptapeptides that bind irreversibly to protein phosphatases. Microcystins are made by very large synthetic complexes containing multiple domains, each of which binds an activated amino acid, modifies it and joins it to another, using thioester chemistry. This system is termed non-ribosomal peptide synthesis (NRPS). Not all the NRPS products are cyclic; some are linear and at least one has a lipid side-chain that promotes attachment to cholesterol-containing membranes. And now, as a result of genome gazing, another large family of peptides has been uncovered, this time made by ordinary ribosomal peptide synthesis (Wang et al., 2011). One strain of Anabaena has enough genes to encode hundreds of protein precursors, which are processed into tetrapeptides, cyclized and exported. Some of these are protease inhibitors. Finally, there is a family of alkaloids called anatoxins, made by a series of three polyketide synthases (Cadel-Six et al., 2009). Anatoxin binds to the mammalian nicotinic acetylcholine receptor, causing paralysis.

In the cases of the microcystins and anatoxins, the known targets are eukaryotic, metazoan, even mammalian. The question then arises: what was the original function of these toxins if their contemporary targets arose a billion years later? Could there have been targets among the prokaryotes that occupied related niches when the cyanobacteria were the most advanced organisms on earth? JP Changeux and PJ Corringer asked this question several years ago and found that the cyanobacterium Gloeobacter contains an acetylcholine receptor. This protein is pentameric and its X-ray structure is almost identical to that of the pentameric AChR from Torpedo. Expressed in Xenopus oocytes, it functions as a proton-gated ion channel. It remains to be shown that it binds anatoxin (Corringer et al., 2012).

These observations lead to the following prediction: the near and mid-term future will see significant attention to the evolutionary significance of the cyanobacterial toxins: are they signalling molecules, do they play a role in niche competition? Can they be tamed and made useful in medicine or, in the case of anatoxin, basic research on the functions of acetylcholine receptors?

Elephants in the room: protists and the importance of morphology and behaviour

Patrick J. Keeling, Canadian Institute for Advanced Research, Botany Department, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC, Canada V6T 1Z4.

A couple of years ago I found out I was not a microbiologist after all. I always thought I was, and even told strangers that is what I did, if they ever asked. But at a meeting of the American Society for Microbiology, I learned that my definition of a ‘microbe’ was not particularly representative. This is because I work on protists. Protists are microbial eukaryotes (more or less – we cannot quite decide on a definition), they are found in most of the environments you would expect to find other kinds of microbes (which is to say, everywhere), they are abundant, extraordinarily diverse, and (among my friends, anyway) generally considered to be ecologically important. They do come up sometimes in conversation, or even arguments, such as ‘who is the most important primary producer?’, or ‘are viruses or grazers more important for nutrient cycling?’. But protists are too often excluded from microbial ecosystem models or assessments of their composition; even studies that assess a complete ‘microbiome’ more often than not ignore the microbial eukaryotes.

Before I am written off as a whinging specialist who is feeling marginalized, let me state that there are good reasons for this gap in our knowledge, and interesting ones too, since they go back to fundamental differences in biology. Indeed, the problems associated with a thorough understanding of microbial eukaryotic ecology are so stark that my prediction for the next year is not that we will solve these problems, or even make progress. My prediction (or perhaps wishful thinking) is that the ‘eukaryotic question’ will increasingly emerge as an elephant in the room, which is an elegant idiom to describe our failure to grasp the role of so many large microbes that are right under our noses.

Bigger yes, but also different

I would like to discuss two reasons why protists have not entered the mainstream of conventional high-throughput environmental microbiology. The first of these is trivial and well understood: their genomes are bigger and organized differently. We know that new sequencing technologies have had a major impact on our understanding of the diversity and ecological roles of bacteria, archaea and viruses, for example, by allowing whole-community metagenomic surveys. To include protists in these surveys is easy – simply do not filter them out! However, we also know that nuclear genome sizes would require epic sequencing and analysis budgets that are simply not practical. Moreover, we cannot accrue the same benefits for protists, even if we could sequence enough, because their genomes are fragmented, repeat-rich, and lack functionally related gene clustering, all of which limit the inferences we can make about individual genomes and metabolic networks from metagenomics by limiting our ability to link genes to other genes in a genome.

But there is another less discussed, but infinitely more interesting problem. Bacterial and archaeal diversity is substantially manifested at the level of metabolism. Accordingly, the sequence of a bacterial or archaeal genome can go a long way to describing what that organism ‘does’ in the community, because we have developed reasonable ways to translate the information in a genome into predictions about that organism's metabolic actions in the environment. This is not the case for eukaryotes: although microbial eukaryotes harbour a sizable metabolic diversity, they are distinguished from other microbial life in that they manifest a great deal more diversity at the levels of morphology and behaviour. Indeed, morphology and behaviour have a much greater effect on what most protists ‘do’ in the environment than do their metabolic capacities (photosynthesis being an obvious exception). Unfortunately, the manifestation of these properties is much more complex than a straightforward gene–protein correspondence, and we are accordingly much worse at translating the information in a genome into predictions about what an organism looks like or how it behaves.

To illustrate this problem, imagine four dinoflagellate protists living in the same marine environment: one is a free-living benthic autotroph, one is an intracellular parasite of gastropods, one is an obligate photosynthetic symbiont of cnidarians, and one is a heterotrophic grazer feeding on bacteria and eukaryotic algae. Now imagine we have sequenced whole genomes and whole transcriptomes for all four of these organisms. How easy would it be to reconstruct these interactions? The answer is, it would be virtually impossible, even with these miraculous quantities of molecular data. We could recognize that two were photosynthetic, but this might even mislead us to assume they shared a similar niche, when in reality the two forming intracellular relationships with invertebrates might share more in common. This failure is because the most important characteristics that distinguish these organisms and their activities are derived from poorly understood coordinated actions of thousands of gene products, and worse still, subtleties of regulation and epigenetics relating to thousands of genes.

Organisms DO matter – how do we study them?

They say that if you have a hammer, everything looks like a nail, and right now our biggest hammer is sequencing. Getting more sequence data from eukaryotes at the environmental level is a technical problem that can, and soon will be, solved. The most revolutionary solution will be the arrival of routine single-cell genomics and transcriptomics. Despite all we have learned through metagenomic approaches, cells do matter in the final analysis because biological activities are compartmentalized and how the metabolism of a community is partitioned makes a difference; a community is not just the sum of its enzymes, and seeing how functions are distributed across a community will change how we interpret them. Single-cell genomics will therefore be a boon to all environmental microbiology. And for eukaryotes, single-cell transcriptomics in particular will give us a first inroad to their otherwise intractable genomes when it can be automated across natural communities.

How we interpret environmental sequence data from eukaryotes is another problem altogether. If the predictive power of even genome-wide sequence data is critically limited by our inability to infer characteristics of morphology and behaviour from it, then how do we integrate protists into a detailed picture of a microbial community that is primarily based on such data? Certainly being able to predict what an organism is like based on its close relatives will continue to be important, but requires a lot of ‘model’ systems scattered around the tree of eukaryotes to be truly effective. The real answer likely lies in a re-emergence, and indeed a reinvention, of arts like cultivation, ultrastructural characterization, identification and observation of live cells within their natural community, and field microscopy – some of which are badly under-appreciated at present. Our challenge is therefore not to put away our hammer, but to place more emphasis on the need for other tools too (in fact, I once watched a graduate student hammering a screw, so perhaps there is even greater depth to this need). It is not always obvious how these tools will be as adapted to a high-throughput approach as genomic methods were, but advances in imaging and cell sorting open a host of possibilities. So, to some extent, the way forward involves integrating existing methods rather than inventing new ones (e.g. linking high-throughput imaging with single-cell sorting would allow morphology to be linked with genomic data).

In summary then, it is my hope that in the coming years microbial eukaryotes emerge a bit from the shadows of their smaller cousins. Luring them out into the open will require more than protists simply ‘catching up’ with existing methods: we must improve the integration of protists with our understanding of other members of microbial communities by coordination and deliberate efforts to reconstruct entire microbiomes, including all members and their interactions. The genomic revolution has allowed astonishing advances, but perhaps this only means that it needs to be grounded in biology more than ever.

Adopting modularity of metabolism as a guiding paradigm may lead to better accounting and understanding of the unseen majority of life: exercised with focus on the nitrogen cycle

Martin G. Klotz, Evolutionary and Genomic Microbiology Laboratory, Department of Biology, The University of North Carolina, Charlotte, NC 28223, USA.

Obligate aerobic, chemolithotrophic and predominantly autotrophic ammonia-oxidizing bacteria (‘AOB’) cluster within two distant monophyletic groups: the betaproteobacterial family Nitrosomonadaceae and the purple sulfur bacterial genus Nitrosococcus of the Gammaproteobacteria. Yet, these two distant groups seemingly live identical catabolic lifestyles, posing challenging evolutionary questions that have awaited answers for several decades. Long generation times of the AOB and their infamous recalcitrance to transformation, as well as cloning and recombinant expression of their genes, have prevented extensive molecular genetic experimentation to verify their catabolic pathways. Thus, the opportunity in 1999 to sequence and annotate the genome of a bacterium once thought to be the ultimate representative for aerobic nitrogen biology created a lot of buzz and expectations; however, it took almost 4 years from the isolation of ‘pure enough’ genomic DNA to reporting the results (Chain et al., 2003). Aside from the exhilarating experience of finding all the genes necessary to make a living cell and the previously implicated inventory for it being an AOB, little could be gleaned from the genome to answer pressing questions on the evolution of nitrification as a process or the obligate nature of the ammonia-oxidizing lifestyle. This initial genome analysis was soon followed by additional sequencing projects, including other AOB and obligate aerobic chemolithotrophic nitrite-oxidizing bacteria (‘NOB’), that were facilitated by the then fully established DOE Joint Genome Institute (JGI) and initially coordinated by a group of Principal Investigators (PIs) supported by funding from the US National Science Foundation for a Research Coordination Network. 
The outcome of this endeavour was tremendous: Principal Investigators with different interests and expertise as well as at different levels of advancement in their careers came together and witnessed the power of genuine collaboration, which included the immersion of postdocs, graduate and even undergraduate students. Along the way, the JGI, working closely with project PIs, developed the mastery of assembling extensive, complex and repetitive contigs into complete polished genomes. Last but not least, we all learned quickly that unfinished (and even finished) genomes generated as many unanswered questions as, if not more than, they provided answers about the biology and origin of the organisms under investigation. In case of the AOB and NOB, metabolic (in silico) reconstruction of their genomes revealed that none of the genes implicated in ‘nitrification’ were actually unique to nitrifying bacteria and that the very few identified unique (exclusively present) genes were encoding proteins that – to this day – are not implicated in a ‘process-specific’ lifestyle (Arp et al., 2007; Klotz and Stein, 2008; 2011; Lücker et al., 2010; Campbell et al., 2011; Stein and Klotz, 2011; Simon and Klotz, 2013; Sorokin et al., 2012). Furthermore, additional ground-breaking work discovered that obligate aerobic ammonia-oxidizing Thaumarchaea (Könneke et al., 2005) as well as methane-oxidizing Verrucomicrobia (Hou et al., 2008; Islam et al., 2008) and Proteobacteria (Stein & Yung, 2003 and references therein) also nitrify and that anaerobic AOB oxidize nitrite (Strous et al., 2006). Yet another ‘process group’, the anaerobic methanotrophs (Ettwig et al., 2008), appears to encode many of the genes implicated in the obligate lifestyles of ammonia and nitrite oxidizers (Ettwig et al., 2010; Luesken et al., 2012).
A critical observer of this recent ‘omics-driven’ progress in our understanding of microbial nitrogen transformations may then wonder why so many non-unique gene markers remain widely used as preferred targets to assess residence, abundance, diversity and distribution of Bacteria and Archaea that drive various aspects of the nitrogen biogeochemical cycle. The answer to this conundrum is multifold and needs to be looked at within a historical framework: early work on the biology of the nitrogen cycle was process-oriented and ‘cohorts’ of microbes that contributed to one or another of these processes were understood as dedicated facilitators of these processes: Nitrifiers, Denitrifiers, Ammonifiers and Nitrogen fixers. In addition, predominant environmental conditions associated with these processes were used as qualifiers (i.e. oxic vs. hypoxic and anoxic) and extended to the metabolic lifestyle of the participating microbes (i.e. aerobic vs. anaerobic). As a natural progression of process analysis, start and end-points became the foci of research, which resulted in an artificial categorization of which cohort ‘owned’ which substrate and end-product and which step was (rate-) limiting to the entire process. Some of these processes were entirely facilitated by individual microbial isolates (i.e. denitrification), whereas others required the sequential participation of more than one microbe (i.e. nitrification). With increasing technical advances, genetic methods of the mid-1990s allowed for identification of some molecular inventories involved in these processes [i.e. the genes encoding ammonia monooxygenase (AMO), the first enzyme in the nitrification process] that had up until then been elusive. In contrast, denitrification genes (i.e. nirS, encoding nitrite reductase) encoded by many chemoorganoheterotrophs such as Escherichia coli and genes encoding dinitrogen fixation inventory in the alphaproteobacterial order Rhizobiales were already well characterized in the 1980s. Equipped with this new genetic information, phylogenetic analysis found that evolutionary relationships were congruent between the gene encoding one of the subunits of AMO, amoA, and the small subunit ribosomal genes of AOB (Rotthauwe et al., 1997). At the same time, physiological studies led to the belief that the AmoA protein contained the active site of AMO (Hyman and Arp, 1992). Because ammonium is the starting point of the nitrification process, there was consensus that the evolutionary history of nitrification as well as the abundance and distribution of nitrifying microbes could be understood solely by tracking the amoA gene and studying the AmoA subunit of AMO. Although it soon became clear that AMO is a representative of a much larger family of membrane-bound monooxygenases that includes particulate methane monooxygenase (Klotz and Norton, 1998), AMO (amoA and AmoA, in particular) has been faithfully regarded as the beacon of nitrification. A similar reasoning was applied in the study of other processes notwithstanding the fact that, for instance, ammonification (also known as ‘dissimilatory reduction of nitrate to ammonium’, DNRA) and canonical denitrification (dissimilatory reduction of nitrate to dinitrogen) share inventory facilitating the reduction of nitrate to nitrite. There was thus hope that the growing availability of genomes would provide the opportunity to construct the ‘core genome elements’ of the microorganisms that were typical facilitators of specific biogeochemical processes, i.e. the ‘cohort.’ Unfortunately, the reality of genome information has not brought us closer to defining our preconceived functional cohorts and inventories, but rather has presented a much broader, less specific, portrayal of genome evolution. However, increased availability of sequenced genomes during the last two decades facilitated two major improvements for environmental microbiology: (i) an increased number of sequence variants of ‘functional signature genes’ routinely used to detect specific microbial cohorts, and (ii) several novel or improved functional signature genes including those that detect newly discovered players in geochemical cycles, such as the obligate anaerobic ammonia-oxidizing (Strous et al., 2006) and methane-oxidizing bacteria (Ettwig et al., 2010), obligate aerobic chemolithotrophic ammonia-oxidizing Thaumarchaeota (Könneke et al., 2005; Walker et al., 2010; Spang et al., 2012), and new phylotypic representatives of NOB that utilize the same but sequence-divergent inventory (i.e. the Nitrospira-type vs. the Nitrobacter-type nitrite oxidation module; Lücker et al., 2010; Sorokin et al., 2012). Both of these improvements allowed for improved design of primers and probes in PCR and FISH hybridizations.

In the early days of genome sequencing, which included our work on Nitrosomonas europaea and other ammonia- and nitrite-oxidizing bacteria, our expectation was that the number of ‘unknown’, ‘hypothetical’ and ‘conserved hypothetical’ protein-encoding open reading frames per genome would ‘shrink’ as the number of sequenced genomes increased. Today, nearly 20 years after TIGR presented the inaugural complete bacterial genome of Haemophilus influenzae Rd (Fleischmann et al., 1995) and with a sequencing capacity that can produce several fully sequenced microbial genomes in a single day, we are still waiting for the curve of ‘hypotheticals’ versus the number of sequenced genomes to turn from exponential to asymptotic. How does this challenge our quest to understand the abundance and diversity of microbial populations and the changing structure of their communities? I believe this means that our practised categorization of metabolism, based in principle on the chemoorganoheterotrophic pathways of E. coli and treating any deviation as an exception, is fundamentally flawed. Rather, we need to understand the presently known acquisition of catabolic potential as additional examples of many versions of yet unknown metabolic diversity.

Contemporary wizardry of analysing signature macromolecules (DNA, RNA, proteins) seems to have much in common with computing and computer-based modelling: output is ultimately dependent on the information and theoretical framework of (implicated) input. The latter is usually a mix of experimentally proven and unproven hypotheses connected by a pinch of wishful thinking. We are beginning to acknowledge and understand that one of the major problems in environmental microbiology is actually of semantic nature in that ‘process’, ‘organism’ and ‘implicated molecular inventory’ were usually unambiguously correlated, such as in the case of the Zumftian ‘canonical denitrifiers’ (anaerobic organoheterotrophic bacteria that reduce nitrate to dinitrogen; Zumft, 1997), the ‘nitrifiers’ (ammonia- and nitrite-oxidizing bacteria) and the ‘ammonifiers’ (anaerobic reducers of nitrate to ammonium).

This problematic situation was likely created by an immature marriage of key questions asked by microbial ecologists (Who is there? What is everyone doing?) as well as by physiologists (What are the sources of Energy, Reductant and Carbon?) followed by times during which ‘bride and groom’ did not effectively communicate. An additional chasm has formed by a focus on the ‘uncultured majority’ using functional gene markers in molecular microbial ecology studies versus the detailed physiological and biochemical examination of ‘model organisms’ that are able to grow under defined laboratory conditions and survive experimental manipulation. To this day, there is ongoing debate over the relevance of cultured microorganisms to big environmental processes: for instance, can the study of a single model organism such as N. europaea define the process of ammonia oxidation? However, the dawn of evolutionary and genomic microbiology affords us the realization that metabolism is modular, a conclusion built on sound molecular evolutionary theory and confirmed with every newly sequenced and annotated genome. Evolutionary and genomic microbiology also informs us that these metabolic modules arose by birth and fortuitous combination (horizontal gene transfer) and have persisted and adapted as forced by functional pressures (‘use it or lose it’) thereby providing the basis for functional niche adaptation. We have known collectively for quite some time that metabolism (in particular, catabolism) of environmental microbes revolves around highly reactive and toxic intermediates. For instance, nitrite, nitric oxide radicals, hydroxylamine, hydrazine (rocket fuel) and nitrous oxide (laughing gas) in the N-cycle are requisite metabolic intermediates. 
We are also informed by evolutionary and genomic analyses that the genomes of these microbes encode multiple, functionally redundant, overlapping yet distinct inventories that regulate the half lives of reactive metabolic intermediates and facilitate their transformations. For example, at present, we know more than five evolutionarily unrelated classes of nitric oxide reductases, some of which exist in several evolutionarily related variations.

The study of function and origin (evolution) of biogeochemical processes with an emphasis on the starting point (as determined by the sources of Energy, Reductant and Carbon) neglected that selection for high-throughput toxin-producing machines (such as the alcohol and aldehyde-producing initial steps in chemolithotrophic catabolism) could not occur without the pertinent detoxification as well as energy- and reductant-extracting inventories already in place (Klotz and Stein, 2008; 2011; Tavormina et al., 2011). Likewise, using the genes encoding these usually substrate-promiscuous toxin-producing machines as indicators for the prediction of redox partitioning and flow in microbial communities to explain metabolic capacity and ecosystem function is likely missing the target, unless the system is functionally stratified. Although phylogenetic studies of process start point inventory are important (i.e. the superfamily of copper-dependent membrane monooxygenases, Cu-MMOs; Tavormina et al., 2011), sound phylogenetic, protein structural and functional analyses of the end-point detox, energy- and reductant-extracting inventory (Bergmann et al., 2005; Klotz et al., 2008; Kartal et al., 2011a,b; Kern et al., 2011; Simon and Klotz, 2013 and references therein) paired with comparative genome analysis (Arp et al., 2007; Bartossek et al., 2012; Hu et al., 2012; Speth et al., 2012) continue to be just as crucial for understanding the function and origin of metabolic modules. 
My crystal ball reveals that this new paradigm and the increasing collaboration between molecular ecologists and molecular (omics-informed) physiologists will lead to continued successful environmental microbiological applications including the development of primers that target genes encoding detox, energy- and reductant-extracting inventory (Schmid et al., 2008; Attard et al., 2010; Li et al., 2010; Harhangi et al., 2012) and the inclusion of more phenotypically variable isolates in physiological and genomic studies. A case in point is the still elusive inventory that extracts energy and reductant in ammonia-oxidizing Thaumarchaeota, which is needed to crisply distinguish between those Thaumarchaeota that support their growth by the oxidation of ammonia to nitrite (the ‘AOA’) and those that express functional AMO for other purposes (the amo-encoding Archaea; ‘AEA’) (Hatzenpichler, 2012 and references therein). In the end, we may yet achieve a more reliable correlation between the physical world and the unseen majority of life that sustains and changes it.


My thanks go to all colleagues in the Nitrification Network and the Organization for Methanotroph Genome Analysis (OMeGA) for past, present and future discussions and specifically to those who gave me the opportunity to co-author collaborative work. In particular, I would like to thank Lisa Y. Stein (UA-Edmonton) for continuing critical and motivating discussions, collaboration and friendship, and for a critical reading of this crystal ball contribution.

The bioavailability of essential trace metals and its modification by microbes

François M. M. Morel, Department of Geosciences, Guyot Hall, Princeton University, Princeton, NJ 08544, USA.

As cofactors of metalloenzymes, metals play key roles in the metabolism and growth of microorganisms. This is widely appreciated in the case of Fe, which is used in myriad redox enzymes, but it is also true of other metals such as Zn, Cu and Mo, among others, which catalyse key processes such as protein degradation, methane oxidation and N2 fixation.

The bioavailability of trace metals thus influences the flow of energy and nutrients in ecosystems with important consequences for biogeochemical processes and community structure. For example, Fe limits primary production in large oceanic regions (Martin et al., 1994), while Mo limits N2 fixation in some tropical forests (Barron et al., 2009).

Metal-binding compounds, some from exogenous sources, some produced by the organisms themselves, control the bioavailability of trace metals. The best-known example is that of siderophores produced by microbes to bind and take up Fe (Sandy and Butler, 2009). The toxicity of other metals such as Cu or Cd is generally decreased by complexation with organic compounds. In oligotrophic environments where electrochemical techniques can be used (such as the open ocean), they have shown that the bulk of essential metals is bound to strong complexing agents. But the nature, origin and function of these chelators remain one of the most vexed questions in environmental microbiology. In most instances, we do not even know the nature of metal complexing agents in highly controlled conditions such as in culture media, where their presence and function are usually masked by the addition of artificial chelators such as EDTA.

The major obstacle to unravelling the question of metal bioavailability is simply analytical: how to identify and quantify, in very complex media, compounds of unknown structure that complex trace metals at very low concentration. This limitation is being overcome by the enormous progress in high-sensitivity, high-resolution mass spectrometry, which is able to identify very large numbers of compounds in complex mixtures with increasingly better accuracy and lower limits of detection. This progress in high-resolution LC-MS/MS technology is essentially responsible for the emergence of fields like proteomics and metabolomics within the last decade and a half. But it should also allow identification of metal-binding compounds in culture media and natural samples. As with many emerging technologies, the problem of analytical detection is replaced by one of data analysis, as the compounds of interest must be identified among the hundreds of thousands of individual species revealed by the instruments over the course of a single LC-MS run. As already exemplified in a few studies (Velasquez et al., 2011), the distinctive isotopic distributions of individual metals can be used to distinguish novel metal complexes among a forest of unrelated compounds. Analysis of fragmentation patterns of individual compounds, complemented by additional analytical information, will reveal conserved metal binding structural features. It should also gradually provide spectral libraries of matched MS/MS spectra for compounds bearing these motifs, greatly improving the bioinformatics necessary for identifying novel metal chelating agents. The age of ‘chelomics’ is nearly upon us.
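The isotope-based screening idea can be illustrated with a minimal sketch. It assumes a simple list of (m/z, intensity) peaks and looks only for the 54Fe/56Fe pair, using the natural isotope masses and abundances of iron; the function name, tolerances and peak values are hypothetical choices for illustration and are not taken from the studies cited above.

```python
# Natural iron isotope data (54Fe and 56Fe only; other isotopes are ignored here).
FE_MASS_DIFF = 55.934936 - 53.939609   # mass difference 56Fe - 54Fe, in Da
FE_RATIO = 5.845 / 91.754              # natural abundance ratio 54Fe / 56Fe

def find_fe_candidates(peaks, charge=1, mz_tol=0.003, ratio_tol=0.3):
    """Return (light_mz, heavy_mz) pairs whose spacing and intensity ratio
    are consistent with a single Fe atom.

    peaks     : list of (mz, intensity) tuples (hypothetical input format)
    charge    : assumed charge state of the ion
    mz_tol    : absolute m/z tolerance for the isotope spacing
    ratio_tol : allowed relative deviation from the natural 54Fe/56Fe ratio
    """
    peaks = sorted(peaks)
    hits = []
    for i, (mz_light, int_light) in enumerate(peaks):
        target = mz_light + FE_MASS_DIFF / charge
        for mz_heavy, int_heavy in peaks[i + 1:]:
            if mz_heavy > target + mz_tol:
                break  # peaks are sorted, so no later peak can match
            if abs(mz_heavy - target) <= mz_tol and int_heavy > 0:
                ratio = int_light / int_heavy
                if abs(ratio - FE_RATIO) <= ratio_tol * FE_RATIO:
                    hits.append((mz_light, mz_heavy))
    return hits

# Hypothetical peak list: one Fe-like pair, one pair with the right spacing
# but the wrong intensity ratio, and one unrelated peak.
peaks = [
    (500.1000, 300.0),
    (502.0953, 310.0),
    (612.2777, 64.0),
    (613.2811, 20.0),
    (614.2730, 1000.0),
]
fe_pairs = find_fe_candidates(peaks)
print(fe_pairs)
```

In a real LC-MS run the same test would have to be applied scan by scan and combined with chromatographic co-elution and further isotopes, so this sketch shows only the core filtering step.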

Our crystal ball may principally reflect our optimism, but as we begin to identify and characterize metal complexing agents in cultures and in nature, we foresee a sea change in our understanding of the bioavailability of trace metals, an important facet of the interactions between microbes and their environment.


I thank Xinning Zhang-Paulot, Oliver Baars, Jeffra Schaefer, David Perlman and Anne Kraepiel for advice and discussions.

Electrical interactions of bacteria

Ken Nealson, Department of Earth Sciences, University of Southern California, Los Angeles, CA 90089, USA.

The crystal ball has always been a poor weapon for me – I am a far better marksman with the retrospectroscope! That being said, it is always fun to have a look at what might be, and it is an honour to be asked to say a few words. Based on what I have seen and heard in the last year, I suspect that the electrical (redox) charge of surfaces, and electrical interactions between cells (of the same and different species) are going to be an area of great interest and impact in the coming years.

In the past few years, it has become apparent that extracellular electron transport to insoluble electron acceptors (EAs), as well as to soluble EAs that become insoluble or toxic upon reduction, is commonly done by microbes: being a well-characterized process in bacteria, and less well so in Archaea. Much less well-appreciated are the recent findings from many laboratories that bacteria can take up electrons from insoluble electron donors, using these electrons as a source of energy. Along with these observations are the more subtle issues involved with attachment, growth and biofilm formation: issues that are almost certainly closely related to, and controlled by, various methods of sensing and responding to surface charge.

My crystal ball says that there will be many new discoveries of electrical interactions of bacteria with insoluble substrates, be they other bacteria, insoluble minerals, charged electrodes, or even eukaryotic cells, all of which have a charge that changes as a function of pH. Thus, we have a lot to learn: (i) how do bacteria sense and respond to charged surfaces; (ii) how is this response regulated, and what are the consequences of the response; and (iii) what are the ecological implications of these interactions? Unless I miss my bet, we will find that such behaviour is far more common than we anticipated, and that there is an entire area of microbial ecology dealing with the response to surface charge, and the ensuing extracellular electron transfer: an area that will range from syntrophy, symbiosis and pathogenesis, on one hand, to geobiology, corrosion and material science on the other.

Combating global proliferation of harmful cyanobacterial blooms by integrating conceptual and technological advances in a water management toolbox

Hans W. Paerl, Institute of Marine Sciences, University of North Carolina at Chapel Hill, 3431 Arendell Street, Morehead City, NC 28557, USA.

Nutrient enrichment (eutrophication) of freshwater ecosystems has promoted global proliferation of cyanobacterial harmful (toxic) algal blooms (CyanoHABs). This problem is exacerbated by global warming (Paerl and Huisman, 2008), and threatens the use and sustainability of some of the world's largest lakes and drinking water reservoirs. Particularly affected are rapidly developing regions, typified by China's third largest lake, Taihu, a previously pristine lake supplying the drinking water needs of over 12 million people, and a key regional fishing, tourism and cultural resource (Fig. 2).

Figure 2.

A toxic cyanobacterial (Microcystis spp.) bloom in Lake Taihu, China (photo: Hans Paerl).

Taihu, and other large lake ecosystems, have become the ‘poster children’ for CyanoHAB expansion in densely populated regions. Experimental work has demonstrated that excessive inputs of both nitrogen (N) and phosphorus (P) are responsible for the proliferation and persistence of toxic CyanoHABs in Taihu (Xu et al., 2010). These results challenge the previous paradigm that only P reductions are needed to control CyanoHABs, which was based on the assumption that numerous diazotrophic genera can fix atmospheric nitrogen (N2), thus supplying ecosystem N demand (Schindler et al., 2008). However, numerous studies have shown that this assumption does not hold true for freshwater and marine ecosystems (i.e. N inputs supplied by N2 fixation fall far short of ecosystem N demands) (Nixon, 1995; Paerl and Scott, 2010). Hence, eutrophication in these systems can be further accelerated by additional N inputs, especially if they contain sufficient amounts of P stored in sediments (Conley et al., 2009). Indeed, eutrophic systems worldwide exhibit the capacity to absorb even more N and increase their trophic state and CyanoHAB dominance. It is crucial to understand how input reductions in total, as well as specific N and P substrates, shape phytoplankton communities, and to do so while accounting for climatic variations that are known to favour CyanoHABs.

While managing these nutrients often requires engineering solutions, implementation can only be successful if it is ecologically constrained so that the resulting microbial taxa are desirable (e.g. nontoxic species). There is a need to define N and P reduction thresholds favouring bloom abatement in order to clarify the selective effects of anthropogenic N and P forms, including determining how selective nutrient reductions impact toxin-producing versus non-toxic cyanobacterial genera.

The challenge is to combine environmental multidisciplinary approaches to combat CyanoHABs over geological, climatic and hydrological gradients. To do this, we must combine rapid, sensitive and (from a biodiversity perspective) meaningful identification and characterization techniques with spatio-temporal delineation of the effects nutrient enrichment exerts on CyanoHAB expansion.

Aquatic microbial ecologists have developed in situ bioassays and whole lake assessments of phytoplankton responses to nutrient enrichment and reductions. These approaches have utilized general and taxon-specific biochemical and molecular techniques, including phytoplankton group-specific diagnostic photopigment indicators and genetic markers capable of detecting and quantifying taxa-specific responses to nutrient manipulations. These assays can corroborate or expand information gained from conventional microscopic observations, and traditional biomass indicators such as chlorophyll a (total algal biomass), c-phycocyanin (total cyanobacteria), particulate C and dry weight. In addition, great inroads are being made to better understand the most troublesome aspect of CyanoHAB proliferation due to nutrient over-enrichment: their toxicity. Toxin producers can now be distinguished and quantified using a suite of molecular approaches, both amplification-based (myriad PCR assays) and in situ (e.g. fluorescence-based hybridization assays or shotgun metagenomics). Coupling these taxa-specific assays to nutrient enrichment experiments has helped identify relationships between basin-specific nutrient loads and the selective stimulation and proliferation of toxin-producing CyanoHABs such as Microcystis spp. (Otten et al., 2012).

From an environmental management perspective, there is a need to ‘scale up’ local experimental results to the ecosystem level, including large lakes and coastal environments, to gauge regional responses to nutrient enrichment and climatic variability. Aircraft or satellite-based remote sensing has proven to be a powerful means of relating small-scale experimental results to whole ecosystem responses. It has also helped to clarify causal relationships between environmental, anthropogenic and climate parameters and CyanoHABs, and to predict bloom potential under future change scenarios. Traditional approaches to collecting data to assess the dynamics of CyanoHABs involve direct observation by light microscopy aboard ships or at moorings, or laboratory analyses such as taxonomic analysis or pigment extraction. Advances in autonomous sensing (fluorometric, spectrometric) can now provide real-time measurements of water quality. Remote sensing provides observations with large coverage and at high frequency. Multispectral satellite images have been used for assessing harmful algal blooms including CyanoHABs (Schofield et al., 1999). These images can discriminate CyanoHABs as distinct, potentially toxic algal groups from other phytoplankton by detecting subtle absorbance characteristics of diagnostic photopigments (e.g. phycocyanin, zeaxanthin). Landsat, MODIS, MERIS and QuickBird data (Wheeler et al., 2012) have been used to assess cyanobacteria in US lakes. These platforms can complement ground-level measurements of diagnostic photopigments, making them highly useful in extrapolating ground-truthed data. Complementary optical water quality parameters (e.g. turbidity, chlorophyll a, coloured organic matter and temperature) have been measured using remote sensing of the absorption, reflectance and emission of light. Satellite imagery, however, has spatial and temporal resolution limitations. 
Techniques such as image fusion, in which two or more images are combined into a single image, can be used in combination with wavelet information, regression trees and spatial/temporal adaptive reflectance fusion model (STAR-FM) to extract maximum amounts of information to help characterize CyanoHABs (Singh, 2011). Lastly, recent advances in hyperspectral imagery show promise in detecting toxic algal species and associated water quality parameters. Hyperspectral imagery distinguishes algal bloom types by specific reflectance differences arising from taxa-specific photopigments that absorb at characteristic and highly specific wavelengths. At present, hyperspectral imagery is very expensive and site-specific because it is dependent on flyovers of highly specialized aircraft (e.g. Hyperion aboard the EO-1 high altitude aircraft) (Lunetta et al., 2009). The streamlining of this technology will facilitate the application of hyperspectral imagery to water quality measurements in the near future.

Understanding the linkage between human- and climatically driven CyanoHABs and developing effective means to control these events will require combining environmental microbiology techniques with remote and in-system sensing technologies that can capture and quantify environmental forcing features and the microbial responses over a range of watershed, basin, regional and global scales. The good news is that individually, these technologies and approaches are largely ready for wide-scale deployment. The challenge now is to couple them in a manner that will enable us to capture and quantify the cause and effect relationships and thresholds in a non-linear, event-driven, hydrologically variable, warming world. Current and evolving empirical, statistical and inferential modelling techniques will help address these challenges and achieve the ultimate goal of safe, sustainable aquatic ecosystems.


I thank J.T. Scott, M. McCarthy, W. Lewis and W. Wurtsbaugh for helpful discussions and the US National Science Foundation for support of much of the work discussed.

Unweaving the evolutionary fabric of symbiotic digestion in termites

Claire L. Thompson and Andreas Brune, Department of Biogeochemistry, Max Planck Institute for Terrestrial Microbiology, Karl-von-Frisch-Strasse 10, 35043 Marburg, Germany.

More than a century ago, the contents spilling out of a punctured termite gut reminded the naturalist Joseph Leidy of ‘the turning out of a multitude of persons from the door of a crowded meeting-house’ (Leidy, 1881). We now know that this dense community of microorganisms breaks down lignocellulose and converts it to fermentation products that drive the metabolism of their host. However, the intestinal microbial community of a termite reflects more than just its day-to-day activities. Indeed, there are indications that elements of the gut microbiota are tightly woven into the evolutionary fabric of both vertebrate and invertebrate hosts (Ley et al., 2008; Colman et al., 2012). As descendants of omnivorous cockroaches that lived more than 130 million years ago, termites have gone on to become dietary specialists, able to degrade lignocellulose more rapidly and efficiently than any other organism known. Despite fundamental differences in host diet, the gut microbiota of cockroaches and termites have many bacterial lineages in common, and bacterial symbionts of termite gut flagellates appear to be derived from free-living relatives that were already present in the ancestor of termites (Noda et al., 2009; Schauer et al., 2012).

However, the evolutionary origin of most lineages, the basis for the complexity of the intestinal community, and the fundamental changes associated with the loss of the cellulolytic flagellates in the evolutionarily higher termites are still unclear. High-throughput sequencing technologies now allow a detailed census of the meeting-house attendees and the teasing out of phylogenetic patterns across a broad range of host species. A far more challenging task is to understand what the meeting is about. Gazing into the crystal ball, we predict that future studies will reveal the functions of individual populations within the termite gut community. Metagenomic analysis has already provided insights into the nature of the bacteria involved in cellulose digestion in wood-feeding higher termites (Warnecke et al., 2007), and a survey of hydrogenase genes has indicated that the microorganisms responsible for hydrogen turnover differ between termites and cockroaches (Ballor and Leadbetter, 2012). However, the specific roles of the gut microbiota in the majority of termite species, particularly those specialized on lignocellulosic diets at advanced stages of humification, including soil organic matter, remain to be clarified. New tools from the burgeoning fields of functional genomics and metatranscriptomics will permit identification of the microorganisms responsible for the degradation processes as well as the metabolic pathways involved.

Of equal importance will be to understand the interactions of the microbiota with their host. Such interactions in vertebrates are under intensive study. Studies with germ-free mammals have shown that the gut microbiota is crucial for the complete post-natal maturation of the host and has a profound influence on immune development (Mazmanian et al., 2005). Apart from Drosophila melanogaster, comparatively little is known about the immune systems of insects, including termites, and the interaction of the gut microbiota with the immune system. We envisage the arrival of genome sequences of several termite species and new approaches using germ-free cockroaches that will shed light on the complex host–microbe interactions occurring within the guts of these insects.

Correlation analysis in microbial ecology: can we infer causation after all?

William Van Treuren2 and Rob Knight1,2,3,4, 1Department of Computer Science, University of Colorado at Boulder, Boulder, CO 80309, USA. 2BioFrontiers Institute, University of Colorado at Boulder, Boulder, CO 80309, USA. 3Department of Chemistry & Biochemistry, University of Colorado at Boulder, Boulder, CO 80309, USA. 4Howard Hughes Medical Institute, Boulder, CO 80309, USA.

It is by now a canard that ‘correlation does not imply causation’. However, researchers and clinicians increasingly need to mine feature-rich datasets to generate hypotheses about mechanisms of disease and targets of intervention in those diseases. Counter-intuitively, one strategy that has emerged to serve this need is correlation analysis, with the goal of extracting a subset of meaningful features [operational taxonomic units (OTUs), metabolites, etc.] that can be investigated with higher confidence that this subset is important to the overall structure of the data, and that can provide explanations at a deeper level.

The motivation for this redux in correlation techniques is that improvements in data acquisition through genome sequencing, nuclear magnetic resonance and mass spectrometry have expanded the scale at which microbial communities can be surveyed much faster than the corresponding computational techniques for analysing and interpreting the data. Discovering meaningful correlations in datasets with tens of thousands of features and hundreds of millions of observations is, to say the least, challenging. Many high-profile microbial ecology papers include networks and heatmaps to suggest correlations in their data. These correlation analyses include co-occurrence analysis (which OTUs or metabolites are found in the same samples?) and covariance analysis (which OTUs are found together with which metabolites?) and occasionally correlations of OTUs and/or metabolites with time (Xia et al., 2011). The same challenges apply in using these correlation techniques for mixed multi-level omic datasets (e.g. which transcripts correlate with which metabolites?). A key challenge is the compositional nature of the data: normalizing to a sum can introduce correlations among many components that should be uncorrelated. Although many researchers have recently developed independent methods for assessing correlations in compositional data, including SparCC (Friedman and Alm, 2012), CoNet (Faust et al., 2012), Family Wise Error Rate strategies (e.g. Romano et al., 2008) and basic distance metric strategies, no consensus on technique has been reached nor have these methods been benchmarked against one another.
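The compositional effect mentioned above can be made concrete with a small simulation. This is a minimal sketch under stated assumptions: three hypothetical taxa whose absolute abundances are drawn independently, with one taxon varying far more than the others. It does not implement SparCC, CoNet or any other published method; it only shows how normalizing each sample to a sum can create a correlation that is absent in the absolute counts.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate(n_samples=2000, seed=0):
    """Return (correlation of absolute abundances, correlation after closure)
    for two independent taxa A and B in the presence of a variable taxon C."""
    rng = random.Random(seed)
    # Hypothetical absolute abundances, drawn independently of one another.
    a = [rng.uniform(90, 110) for _ in range(n_samples)]
    b = [rng.uniform(90, 110) for _ in range(n_samples)]
    c = [rng.uniform(10, 1000) for _ in range(n_samples)]  # "blooming" taxon
    # Closure: convert each sample to relative abundances that sum to 1.
    totals = [x + y + z for x, y, z in zip(a, b, c)]
    a_rel = [x / t for x, t in zip(a, totals)]
    b_rel = [y / t for y, t in zip(b, totals)]
    return pearson(a, b), pearson(a_rel, b_rel)

raw_r, closed_r = simulate()
print(f"absolute abundances: r = {raw_r:.3f}")
print(f"relative abundances: r = {closed_r:.3f}")
```

In this setup A and B are essentially uncorrelated in absolute terms, yet their relative abundances rise and fall together whenever taxon C shrinks or blooms, which is the kind of artefact that compositional methods try to correct.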

To make progress, consensus must be reached about which correlation analysis methods are most appropriate for which data types and experimental designs. Very little comparative work has been done identifying which methods are most effective in which parts of the correlation space, leading to a profusion of different methods as well as replication of some known bad strategies. For example, applying basic distance metrics (Spearman rank correlations, Euclidean distance, etc.) and choosing the most extreme linkages often fails because the probability of the most extreme links being true does not differ from the probability that less extreme links are true. Similarly, because there is no metric for deducing how many correlations one should expect, high numbers of false positives obscure meaningful correlations and can lead to inaccurate interpretations of the data (Lovell et al., 2010).
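The scale of the false-positive problem can be sketched with simple arithmetic. The example below counts the pairwise tests implied by a given number of features and the number of false positives expected at a fixed significance level if no true correlations existed. It then applies the Benjamini-Hochberg false discovery rate procedure, which is a standard correction chosen here purely for illustration and is not one of the methods named in the text; the p-values are hypothetical.

```python
def n_pairs(n_features):
    """Number of distinct feature pairs, i.e. pairwise correlation tests."""
    return n_features * (n_features - 1) // 2

def expected_false_positives(n_features, alpha=0.05):
    """Expected number of 'significant' pairs if every null hypothesis is true."""
    return alpha * n_pairs(n_features)

def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans (in input order) marking which hypotheses
    are rejected by the Benjamini-Hochberg procedure at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank k with p_(k) <= k * q / m.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected

# 1000 OTUs imply 499500 pairwise tests; at alpha = 0.05 roughly 25 000
# of them would look significant by chance alone.
print(n_pairs(1000), expected_false_positives(1000))

# Hypothetical p-values from ten pairwise tests.
p_vals = [0.039, 0.001, 0.205, 0.008, 0.041, 0.042, 0.06, 0.074, 0.212, 0.216]
print(benjamini_hochberg(p_vals))
```

Five of the ten hypothetical p-values fall below 0.05, but only the two smallest survive the correction, which illustrates why uncorrected thresholds tend to bury meaningful links under spurious ones.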

Development of a suite of techniques verified to be both precise and accurate will greatly assist both hypothesis generation and data explanation, especially through the development of causal models. In dysbioses, whether at the scale of our own gut or of entire ecosystems, knowing which organisms correlate, co-vary and depend on one another could have radical implications for correcting the ecological imbalance. For example, identifying members of the community that are centrally or critically located within the metabolic network of a dysbiotic community via correlation analysis of multiple levels of omic data could provide new targets for intervention, and, coupled with sensitivity analysis and Bayesian network inference techniques, improved predictions about causality. Recent advances in treatment of refractory Clostridium difficile infections using faecal community isolates (Lawley et al., 2012) demonstrate how more robust network analyses could be deployed. A proven way to analyse co-occurrences among metabolites and taxa could significantly reduce the time from hypothesis to treatment. Using microbial communities for environmental remodelling (remediation, extraction, etc.) relies on keeping those communities operating efficiently and robust to invasion – both of which could be greatly assisted by knowing the co-occurrence and covariance patterns.