Microbial genomics and related transcriptomics methods rely on culturing techniques to obtain enough DNA suitable for high-throughput sequencing without resorting to DNA amplification techniques. A few micrograms of DNA are needed for most common next-generation sequencing methods. For transcriptome analysis, sufficient cDNA is needed to measure low-abundance mRNA copies in the cell. However, the large majority of microbes on Earth resist cultivation, hampering research into their relevant gene pool, ecological niche or industrial relevance. For example, many environmental or gut-related species cannot be grown outside their natural habitat. Even if we isolate the metagenome or the metatranscriptome from these environments, this reveals only a fragmented sequence landscape that is difficult to assign to individual species. Although enrichment techniques or metatranscriptome analysis of previously unculturable species have been shown to assist in directed culturing, e.g. of a Rikenella-like bacterium (Bomar et al., 2011), the unravelling of a complex metagenome into its individual genomes and their organization is impossible using current technologies.
A major challenge is the analysis of bacteria and other organisms living inside a complex matrix, like biofilms. Metagenome or transcriptome analysis of microorganisms has been described for biofilms consisting of a single species by scraping of the biofilm to obtain enough material (Holmes et al., 2006), but for multi-species biofilms this method results in a metagenome or metatranscriptome dataset. The solution to these challenges may be the isolation and genomic analysis of unculturable single cells isolated from such environments. Here we describe in brief the state-of-the-art in single-cell microbial genomics.
Whereas classical next-generation sequencing to determine an organism's genome sequence relies on pooling DNA from 10⁶–10⁸ cells, single-cell genomics relies on whole-genome amplification from a single cell. Most studies rely on multiple displacement amplification (MDA), a biochemical amplification technique using random primers and ϕ29 DNA polymerase (Dean et al., 2001; Raghunathan et al., 2005; Zhang et al., 2006; Marcy et al., 2007a). Other amplification techniques, such as random-primed PCR, result in greater over- and under-representation of different regions of the template DNA and generate very short fragments (Dean et al., 2001; Hosono et al., 2003). MDA, however, results in fragments of 12–100 kb, rendering them suitable for sequencing. Although the complete microbial genome from a single cell can be amplified to the amounts required for current sequencing methods without a priori sequence knowledge, early studies suggested that up to 40% of the genomic sequence was missed (Podar et al., 2007; Marcy et al., 2007b; Woyke et al., 2009) (Table 1).
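To give a sense of the scale of amplification involved, the following minimal sketch estimates the DNA mass of a single genome copy and the fold-amplification needed to reach microgram amounts. It is an illustration under stated assumptions, not part of any cited protocol: the 2 Mb genome size is a hypothetical example, the average mass of 650 g/mol per base pair is a commonly used approximation, and the function names are invented here.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one DNA base pair


def genome_mass_fg(genome_size_bp: float, copies: int = 1) -> float:
    """Return the mass (femtograms) of `copies` genome copies of a given size."""
    grams = genome_size_bp * BP_MASS_G_PER_MOL * copies / AVOGADRO
    return grams * 1e15  # 1 g = 1e15 fg


def fold_amplification(target_ug: float, genome_size_bp: float, copies: int = 1) -> float:
    """Return the fold-amplification needed to reach `target_ug` micrograms."""
    start_fg = genome_mass_fg(genome_size_bp, copies)
    return target_ug * 1e9 / start_fg  # 1 microgram = 1e9 fg


if __name__ == "__main__":
    # Hypothetical 2 Mb genome present as a single copy in one cell.
    print(f"starting material: {genome_mass_fg(2e6):.2f} fg")          # about 2.2 fg
    print(f"fold needed for 1 ug: {fold_amplification(1.0, 2e6):.1e}")  # about 4.6e8
```

Under these assumptions a single cell contributes only a few femtograms of DNA, so reaching the microgram range requires amplification on the order of 10⁸-fold or more, which helps explain why trace contaminant DNA and amplification bias weigh so heavily on the result.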
Table 1. Examples of single-cell genome sequencing.
Columns: Assembled bases (Mb) | Estimated % genome recovery | Single-cell separation
Footnotes: Method validation using strain with known genome sequence. Pooled sequence data from five individual cells; see Table 2.
An overview of an MDA set-up using a microfluidic device is shown in Fig. 2, although FACS-based methods are also often reported in the literature (Rodrigue et al., 2009; Siegl and Hentschel, 2010). All DNA in the initial sample will be amplified, which renders the method very prone to DNA contamination. Another disadvantage of the initial method is uneven amplification of the genome, which results in high-coverage sequencing of the over-amplified genomic regions, while the remaining sequences may not be sufficiently covered (Zhang et al., 2006). Marcy et al. (2007a) demonstrated that reducing MDA reaction volumes lowers non-specific synthesis from contaminant DNA templates and unfavourable interactions between primers. Rodrigue et al. (2009) demonstrated a biochemical method to normalize the products obtained in MDA reactions. They also discussed the problem of chimera formation linking non-contiguous chromosomal regions in MDA (Dean et al., 2001; Zhang et al., 2006), which may hamper sequence assembly and render mate-pair data less efficient in contig positioning. Several other single-cell techniques are described in recent reviews by Wang and Bodovitz (2010), Kalisky and Quake (2011), and Pan et al. (2011). As data analysis from single-cell amplified genomes is equally challenging, the software framework SmashCell has been developed to automate the main steps in sequence assembly, gene prediction, annotation and visualization (Harrington et al., 2010).
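The effect of uneven amplification can be illustrated with a small sketch that summarizes per-position read depth. This is a toy calculation, not the procedure used in the cited studies or in SmashCell; the function name, the returned keys and the two depth profiles are hypothetical.

```python
from statistics import mean


def coverage_summary(depths, min_depth=1):
    """Summarize per-position read depths for one amplified genome.

    breadth       : fraction of positions with depth >= min_depth
    mean_depth    : average depth over all positions
    max_over_mean : crude indicator of amplification bias (1.0 = perfectly even)
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    avg = mean(depths)
    return {
        "breadth": covered / len(depths),
        "mean_depth": avg,
        "max_over_mean": max(depths) / avg if avg > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Two hypothetical 10-position genomes with the same total sequencing effort.
    even = [10] * 10
    biased = [50, 50, 0, 0, 0, 0, 0, 0, 0, 0]
    print(coverage_summary(even))
    print(coverage_summary(biased))
```

Both toy profiles have the same mean depth, but in the biased one all reads fall on a fifth of the positions. Additional sequencing of such a sample mostly adds depth to regions that are already covered.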
Single-cell genome sequences of uncultured microorganisms
Examples of sequencing of single amplified genomes (SAGs) are listed in Table 1. Woyke et al. (2010) describe using a micromanipulation technique to sequence a genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN, a symbiont isolated from the bacteriome of the green sharpshooter Draeculacephala minerva. This polyploid bacterium has an estimated 200–900 genome copies per cell. Of the 57 Mb of sequence generated, approximately 90% was of contaminant origin, as estimated by mapping to a previously sequenced genome of Sulcia and phylogenetic analysis with blastx and MEGAN (Mitra et al., 2009). The remaining reads were assembled into a draft genome, misassemblies due to chimeras were corrected manually, and subsequent primer walking, sequencing of PCR products and Illumina sequencing resulted in a finished genome (Fig. 3).
Siegl et al. (2011) used FACS to isolate cells from the candidate phylum Poribacteria and subsequently MDA to obtain a SAG. These bacteria are almost exclusively found in marine sponges as symbionts and resist cultivation efforts. The SAG of 1.88 Mb was contained in 1597 contigs, which covered an estimated two-thirds of the total genomic DNA based on the distribution of tRNA genes and their specificities found in the contigs. Nevertheless, a comprehensive overview of poribacterial metabolism could be deduced (Fig. 4). The extensive Sup-type polyketide synthases found in the SAG of Poribacteria confirmed the previously proposed assignment of Sup-PKS to this species. With the finding of a second putative PKS system showing high similarity to the lipopolysaccharide type I PKS WcbR from Nitrosomonas and Burkholderia, as well as RkpA from Sinorhizobium fredii, they suggested that Poribacteria contain at least two different types of PKS systems and that their products may be involved in sponge–microbe interactions. This study showed that single-cell genomics is highly capable of dissecting the genomic information from unculturable bacteria, shedding light on genomic organization and metabolic functions, and possibly offering new insight into the debate on the origin of sponge bioactive compounds.
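The recovery estimate above implies a rough total genome size, which can be sketched as follows. The assembled size (1.88 Mb) and the two-thirds recovery are taken from the text; the function names and the marker counts in the example are hypothetical, and the simple marker ratio is a generic stand-in for the tRNA-based estimate, not the authors' actual procedure.

```python
def completeness_from_markers(found: int, expected: int) -> float:
    """Fraction of expected marker genes (e.g. tRNA specificities) that were found."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if not 0 <= found <= expected:
        raise ValueError("found must lie between 0 and expected")
    return found / expected


def estimated_genome_size_mb(assembled_mb: float, completeness: float) -> float:
    """Extrapolate total genome size from assembled bases and estimated recovery."""
    if not 0.0 < completeness <= 1.0:
        raise ValueError("completeness must be in (0, 1]")
    return assembled_mb / completeness


if __name__ == "__main__":
    # Values from the text: 1.88 Mb assembled at an estimated two-thirds recovery.
    print(f"{estimated_genome_size_mb(1.88, 2 / 3):.2f} Mb")
    # Hypothetical marker counts, for illustration only.
    print(completeness_from_markers(found=14, expected=20))
```

Such an extrapolation assumes that the markers are spread evenly over the genome, so it should be read as an order-of-magnitude indication only.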
Ammonia-oxidizing archaea (AOA) are among the most abundant microbes on Earth, and may significantly impact global nitrogen and carbon cycles. Five single cells were isolated from a low-salinity sediment AOA-enrichment culture using a microfluidic device and laser tweezers, and DNA was amplified and sequenced separately from each cell (Blainey et al., 2011) (Tables 1 and 2). Individually, three single-cell datasets gave assemblies of more than 1 Mb at sequencing depths of 10× to 30×, with an estimated 60% genomic coverage each; such low coverage is considered typical because of MDA amplification bias. Surprisingly, each of the single-cell assemblies represented a different 60% of the target genome, and combining the five datasets led to a single-cell assembly representing > 95% of the Nitrosoarchaeum limnia genome. Based on nucleotide identity comparisons, this AOA is proposed to represent a new genus of Crenarchaeota. In contrast to other described AOA, this low-salinity archaeon appears to be motile, based on the presence of numerous motility and chemotaxis-associated genes in the genome (Blainey et al., 2011).
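Why pooling several partially covered cells can approach a complete genome can be made concrete with a simple probabilistic sketch. It assumes, purely for illustration, that each cell covers an independent random fraction of the genome; real MDA bias need not be independent between cells, and this is not the analysis performed by Blainey et al. (2011). The function names are hypothetical.

```python
def expected_pooled_coverage(per_cell_fraction: float, n_cells: int) -> float:
    """Expected genome fraction covered by at least one of n independent cells."""
    if not 0.0 <= per_cell_fraction <= 1.0:
        raise ValueError("per_cell_fraction must be in [0, 1]")
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    # A position is missed only if every cell misses it.
    return 1.0 - (1.0 - per_cell_fraction) ** n_cells


def pooled_breadth(covered_sets, genome_length: int) -> float:
    """Observed fraction of positions covered by the union of several cells."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    union = set().union(*covered_sets)
    return len(union) / genome_length


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, round(expected_pooled_coverage(0.6, n), 3))  # 0.6, 0.936, 0.99
    # Toy example: two cells covering different positions of a 10-position genome.
    print(pooled_breadth([{0, 1, 2}, {2, 3}], 10))  # 0.4
```

Under the independence assumption, five cells at 60% each would be expected to cover roughly 99% of the genome, which is in line with the > 95% reported for the pooled data.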
Table 2. Assembly statistics for sequencing of three single cells of Nitrosoarchaeum limnia SFB1, and consensus genome (reads from metagenome and five single cells).
Single-cell transcriptomics, metabolomics and proteomics
Recent reports on single-cell transcriptomics discuss mainly the analysis of polyadenylated mRNA of eukaryotes. A comprehensive overview of the technologies involved is given by Tang et al. (2011). In short, the single-cell methods exploit reverse transcription using oligo(dT) primers to convert mRNAs with poly(A) tails into cDNAs, followed by uniform amplification and sequencing (RNA-seq). However, currently no single-cell analysis reports are known that exploit protocols for mRNA extraction from bacterial cells, for instance using the MessageAmp II-Bacteria Kit (Ambion) as described by Frias-Lopez et al. (2008). Single-cell metabolome and proteome/peptidome analyses are still in their infancy, as these compounds cannot be amplified and their analysis requires technological breakthroughs in pushing the limits of detection (Rubakhin et al., 2011).
Since the introduction of single-cell genomics (Raghunathan et al., 2005), there have been surprisingly few reports of successful reconstruction of whole genomes from single unculturable bacterial cells (Table 1). This undoubtedly reflects the extreme difficulties in the various steps of single-cell isolation, miniaturization, DNA amplification, avoidance of contamination and data analysis. Nevertheless, the pioneering examples show that it is definitely feasible to sequence genomes of single unculturable cells isolated from complex consortia, and we expect this approach to become more widespread as miniaturization technologies improve.
Recently, it has also been recognized that isogenic microbial populations (pure cultures) contain substantial cell-to-cell differences in physiological parameters such as growth rate, resistance to stress and regulatory circuit output (Ingham et al., 2008; Lidstrom and Konopka, 2010). In this light, adapting microfluidic single-cell genome sequencing approaches to RNA-seq transcriptome analysis of single cells should become increasingly important (Siezen et al., 2010).
This project was carried out within the research programmes of the Kluyver Centre for Genomics of Industrial Fermentation and the Netherlands Bioinformatics Centre, which are part of the Netherlands Genomics Initiative/Netherlands Organization for Scientific Research.