From a macroecological perspective, the seasonal shifts in abundance of flora and fauna are clear to anyone with an interest in the natural world. The seasons are associated with fluctuating population sizes and associated phenological shifts in biological communities according to interactions between ecological niche requirements and the environment. The perception of biological diversity derived from a single snapshot in time is therefore likely to reflect seasonally abundant taxa, with less abundant taxa representing those that are poorly adapted to the current ecological regime or true rare taxa. While phenological changes in macrofauna and flora are immediately apparent, temporal shifts in microbial communities are poorly understood. Molecular approaches have thus far disclosed that four out of six eukaryotic kingdoms largely comprise protists (Simpson & Roger 2004) and environmental DNA surveys have revealed previously unknown phylogenetic depths of protist diversity (Moon-van der Staay et al. 2001) but we remain largely ignorant of the spatial and temporal distribution of protist taxa. In this issue, Nolte et al. (2010) examine interesting seasonal patterns of limnological protist diversity (Fig. 1) using 454 Roche second-generation pyrosequencing and highlight seasonally and biogeographically relevant shifts in community composition, outlining the importance of phenology when appraising biodiversity patterns in protists. Remarkably, Nolte et al. (2010) also appear to have comprehensively sampled the PCR amplifiable protist community from a freshwater ecosystem (Fig. 2). The major findings are highlighted below accompanied by a commentary on second-generation sequencing derived environmental DNA datasets.


Figure 1.  Dark field illumination of freshwater lake plankton, © Jens Boenigk.



Figure 2.  Monthly and total rarefaction analyses of the protist community of Lake Fuschlsee, Austria, between March 2007 and October 2007.


Until 2005, studying microbial diversity derived from environmental DNA sequencing was a relatively low-throughput affair, analysing clone libraries using 96- or 384-well chain-termination sequencing. With the exception of atypically large studies, most environmental DNA surveys relied on sample sizes below 1000 sequences. However, the advent of massively parallel pyrosequencing (Margulies et al. 2005) by 454 Life Sciences (now a Roche company) has radically changed our ability to identify the richness prevailing in communities dominated by microbial organisms (Sogin et al. 2006; Creer et al. 2010). With 454 Roche’s GS series Titanium sequencer currently generating approximately 1 000 000 reads with an advertised average length of 400 bases, a breadth of previously untestable microbial biodiversity hypotheses can now be addressed by matching contemporary sequencing power to the diversity and richness of microbial communities.

Although numerous studies have been published that use second-generation sequencing to investigate prokaryote and viral communities (Angly et al. 2006; Sogin et al. 2006), environmental studies featuring eukaryotes are only just emerging (Creer et al. 2010; Medinger et al. 2010). Nolte et al. (2010), building on earlier work from the Austrian Lake Fuschlsee ecosystem (Medinger et al. 2010), not only provide a second-generation sequencing template that can inspire other labs wanting to study protist diversity (Fig. 1) derived from the nuclear small subunit (nSSU), but also add a temporal component to their analyses. They additionally feature a much-needed Perl-based algorithm [Cleaning and Analyzing Next Generation Sequences – CANGS (Pandey et al. 2010)] that will undoubtedly be of great use to the molecular ecological community performing biodiversity studies. By repeating 3-weekly sampling between March and October 2007, they have been able to show that, month by month, the standing diversity of the lake is relatively unchanged (673–1239 nSSU genotypes), whereas the total sampled taxon richness approaches 4000 nSSU genotypes. What is remarkable is that rarefaction analyses performed on the combined dataset approach operational taxonomic unit (OTU) saturation (Fig. 2). This suggests that, for those taxa that were extracted and PCR-amplified successfully, they have comprehensively sampled the protist community of a temperate lake ecosystem, which is an outstanding achievement.
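For readers unfamiliar with rarefaction, the idea behind a saturation curve such as that in Fig. 2 can be sketched in a few lines of Python. The example below is a minimal illustration that assumes the classical analytical (hypergeometric) rarefaction formula; it is not necessarily the procedure used by Nolte et al. (2010) or implemented in CANGS, and the function names and toy read counts are hypothetical.

```python
from math import comb


def rarefied_richness(otu_counts, n):
    """Expected number of OTUs observed in a random subsample of n reads.

    Assumes the classical analytical rarefaction formula:
    E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)),
    where N is the total number of reads and N_i the reads in OTU i.
    """
    total = sum(otu_counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must lie between 0 and the total read count")
    denom = comb(total, n)
    # comb(k, n) is 0 when n > k, so an OTU is certain to be seen once
    # the subsample exceeds the number of reads outside that OTU.
    return sum(1 - comb(total - c, n) / denom for c in otu_counts)


def rarefaction_curve(otu_counts, step=1):
    """(subsample size, expected richness) pairs for plotting a curve."""
    total = sum(otu_counts)
    return [(n, rarefied_richness(otu_counts, n)) for n in range(0, total + 1, step)]


# Hypothetical toy community: reads per OTU (not data from the study).
toy_counts = [5, 3, 1, 1]
for n, richness in rarefaction_curve(toy_counts, step=2):
    print(n, round(richness, 3))
```

A curve that flattens as the subsample size approaches the full read count indicates that additional sequencing of the same amplicon pool would be expected to yield few new OTUs, which is the sense in which the combined dataset approaches saturation.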

Interestingly, they have also mapped the phylogenetic and biogeographic affinities of the cosmopolitan Spumella morphospecies complex to show that taxa associated with warmer regions of the globe prevail in the summer protist community, whereas taxa displaying cooler ecological adaptation are more prevalent in colder months. Notably, two nSSU genotypes that succeed one another seasonally are separated by a single nSSU base substitution, suggesting that closely related taxa are niche separated by time, as has been demonstrated in some closely related invertebrates. Their data also suggest that not only does taxon abundance vary on a seasonally predictable basis, but some nSSU genotypes are completely absent in some months, only to appear in others. Moreover, they report a protist ‘rare biosphere’ (Sogin et al. 2006) where a low percentage (1.7%) of reads contributes 25% of the OTU richness. Together, the data suggest that phenology and skewed rank abundances substantially influence the temporal and overall standing biodiversity patterns of protist communities. These are considerable contributions to our understanding of the magnitude and organization of microbial communities. Two prevailing models underpin predictions regarding the richness and biogeography of microbial communities: the ubiquity model (Finlay & Fenchel 2004) and the moderate endemicity (ME) model (Foissner 2008). The ubiquity model suggests that abundance and continuous, large-scale dispersal events of bacteria and many microbial eukaryotes sustain the global distribution of species and maintain high local/global species ratios and/or high alpha and low beta diversity. On the other hand, the eponymous ME model suggests that many protists, at least, are cosmopolites, whereas others display ecologically and biogeographically logical restricted distributions. Concordantly, the ME model also predicts high abundances and rates of migration only in euryoecious species and moderate to low local-to-global species ratios (Foissner 2008). The debate continues regarding the accuracy of these predictions, but Nolte et al. (2010) confirm the importance of phenology when performing comparative analyses of datasets derived from different temporal or ecological regimes.
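The ‘rare biosphere’ statistic quoted above (a small share of reads accounting for a large share of OTU richness) can be computed from any table of per-OTU read counts. The following is a minimal sketch under stated assumptions: the function name and the toy counts are hypothetical, and the calculation simply takes the rarest quarter of OTUs and asks what fraction of all reads they hold. It does not reproduce the analysis of Nolte et al. (2010).

```python
def rare_tail_read_share(otu_counts, richness_fraction=0.25):
    """Fraction of all reads held by the rarest OTUs.

    The rarest OTUs are those that together make up `richness_fraction`
    of the observed OTU richness (rounded down to a whole number of OTUs).
    """
    counts = sorted(c for c in otu_counts if c > 0)  # rarest first
    if not counts:
        raise ValueError("at least one OTU with a positive read count is required")
    n_rare = int(len(counts) * richness_fraction)
    return sum(counts[:n_rare]) / sum(counts)


# Hypothetical, strongly skewed toy community (reads per OTU).
toy_counts = [1, 1, 2, 96]
share = rare_tail_read_share(toy_counts, richness_fraction=0.25)
print(f"rarest 25% of OTUs hold {share:.1%} of reads")
```

In a strongly skewed community such as this toy example, the returned share is small, mirroring the pattern in which a minor fraction of reads underpins a disproportionate part of the observed richness.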

The introduction of second-generation sequencing has heralded a new era in our ability to comprehensively assess PCR-derived richness of any DNA community. However, greater vision is associated with a wider view of the PCR and sequencing environment, including errors and recombinant DNA molecules (or chimeras) where molecules from two different origins artificially combine during PCR (Meyerhans et al. 1990). PCR recombination will suggest the existence of taxa that do not exist and consequently give a false impression of organismal richness (Markmann & Tautz 2005) and distribution. Nolte et al. (2010) have been careful not to incorporate insertion/deletion events into their estimates of taxon richness, owing to uncertainties regarding homopolymer calls derived from 454 Roche sequencing platforms. Moreover, they performed PCR reactions using low numbers of cycles and long extension times, and used polymerases specifically chosen to reduce the amplification of chimeras (Qiu et al. 2001). Nolte et al. (2010) have therefore chosen to empirically reduce chimera formation, rather than performing a bioinformatic chimera removal step. Either way, reducing the incorporation of pyrosequencing noise and chimeric amplicons into final datasets is one of the largest molecular biological and bioinformatic challenges associated with second-generation sequencing environmental DNA datasets. The volume of data is substantial and environmentally derived DNA is a hypothetical breeding ground for the formation of chimeric molecules. Last year, Reeder & Knight (2009) contributed to the debate regarding the accuracy of metagenetic pyrosequencing datasets and the concept of the microbial ‘rare biosphere’. Quince et al. (2009) have also recently proposed the use of the message passing interface (MPI)-based PYRONOISE software in conjunction with chimera screening to better clean pyrosequencing datasets, but limitations of computational power and substantial sequence diversity may still provide infrastructural and bioinformatic challenges for many biologists.

In this issue, Nolte et al. (2010) provide major advances in our understanding of protist diversity by utilizing contemporary sequencing approaches for the assessment of eukaryotic environmental DNA. Combinations of extensive sampling and sequencing have provided paradigm-shifting insights into the magnitude and temporal/taxonomic organization of protist diversity. Lake Fuschlsee is, however, a single lake ecosystem. Perhaps one of the most exciting avenues of future work will be to compare contemporary environmental sequencing datasets from geographically disparate regions and begin objectively investigating the true extent and distribution of microbial eukaryotes. For the microbial researcher, it will be an exciting decade of challenges and discoveries.



Simon Creer is a molecular ecologist with broad interests in the application of molecular genetics for testing ecological and evolutionary hypotheses. In partnership with a diverse range of collaborators, current research foci include investigations of the magnitude, composition and distribution of microbial eukaryotes at a range of spatial scales, advancing the field of mitogenomics in spiders, and unravelling gene-environment interactions in non-model organisms. All these areas employ challenging 454 Roche pyrosequencing datasets and so the analysis and interpretation of second-generation sequencing datasets for biodiversity assessment, phylogenomics and transcriptomics are current research priorities.