A comparison of DNA- and RNA-based clone libraries from the same marine bacterioplankton community


  • Markus M. Moeseneder,

    Corresponding author
    1. Department of Biological Oceanography, Royal Netherlands Institute for Sea Research (NIOZ), P.O. Box 59, NL-1790 AB Den Burg, The Netherlands
  • Jesus M. Arrieta,

    1. Department of Biological Oceanography, Royal Netherlands Institute for Sea Research (NIOZ), P.O. Box 59, NL-1790 AB Den Burg, The Netherlands
  • Gerhard J. Herndl

    1. Department of Biological Oceanography, Royal Netherlands Institute for Sea Research (NIOZ), P.O. Box 59, NL-1790 AB Den Burg, The Netherlands

*Corresponding author. Present address: Department of Microbiology, Oregon State University, 220 Nash Hall, Corvallis, OR 97331, USA. Tel.: +541 737 1889; fax: +541 737 0496, E-mail address: moeseneder@gmx.net


Clones from the same marine bacterioplankton community were sequenced: 100 clones based on DNA (16S rRNA genes) and 100 clones based on RNA (16S rRNA). The community was dominated by α-Proteobacteria in terms of repetitive DNA clones (52%), whereas γ-Proteobacteria dominated in terms of repetitive RNA clones (44%). The combined analysis characterized phylotypes that would have remained uncharacterized had only the DNA or only the RNA library been analyzed. Of the DNA clones, 25.5% were found only in the DNA library, with no close relatives detected in the RNA library. Likewise, 21.5% of the RNA clones had no close relatives in the DNA library. Based on these comparisons between the DNA and RNA libraries, our data indicate that characterizing the bacterial community on the RNA level can reveal distinct phylotypes from the marine environment that remain undetected on the DNA level.


Molecular techniques as tools in aquatic microbial ecology have brought new insights into the community structure and dynamics of marine Bacteria [1–3]. Most of these studies used 16S rRNA gene approaches and revealed a complex bacterial community structure with novel gene lineages in the sea [4,5]. However, the interpretation of 16S rRNA gene approaches has become more complicated in recent years, because microorganisms harbor multiple, heterogeneous rRNA operons [6,7]. Bacteria have 1–15 rRNA operons, reflecting different ecological strategies of growth and activity [7]. Klappenbach et al. [7] showed that soil Bacteria rapidly forming colonies upon exposure to complex medium had, on average, a higher copy number of rRNA operons (5.5 copies) than slowly growing colonies (1.4 copies), which has been interpreted as a higher fitness of Bacteria with repetitive rRNA operons in environments with periodic resource fluctuations. Bacteria from environments with a more constant supply of resources seem to have a low number of rRNA operons within the genome. For example, a bacterium isolated from the oligotrophic marine environment exhibited slow growth rates in the lab and only one rRNA operon [8]. Still, high in situ metabolic rates of these Bacteria with a single rRNA operon could also indicate that keeping the cell volume, the genome and the number of rRNA operons small might allow higher metabolic rates and faster reproduction in an oligotrophic environment [8–10].

Bacteria also have to sustain prolonged periods of starvation in the marine environment when available resources do not support bacterial growth [11–13]. During starvation, ribosomes and ultimately rRNA decrease to minimal levels within the cell [8,14], and several studies point to a linear relationship between rRNA content and the growth rate of bacterial cells [15–18]. On a per-cell basis, ribosomes are much more abundant than rRNA operons. Depending on the growth rate of the bacterium, between 6800 and 72,000 ribosomes per cell have been found for Escherichia coli [19] and between 200 and 2000 for an oligotrophic ultramicrobacterium [8]. Exceptions to these observations also exist: a marine Vibrio sp. strain contained a higher number of ribosomes during starvation periods, which has been attributed to the ability of this strain to immediately regain high activity as soon as starvation ends [20]. Still, because of the generally higher number of ribosomes in metabolically active Bacteria than in dormant cells, it is assumed that the analysis of rRNA rather than genomic DNA provides a tool to determine the metabolically active cells [21,22]. Thus, under oligotrophic marine conditions, the RNA content per cell might be low, which could also explain why fluorescence in situ hybridization (FISH) counts with Bacteria-specific probes are lower than the overall abundance [23]. As an alternative, the analysis of ‘bulk’ RNA with techniques like quantitative hybridization seems to be a much more promising tool to analyze microorganisms with low per-cell RNA content, although, compared to sequencing, the use of a single hybridization probe might drastically decrease the phylogenetic resolution [24]. Most molecular studies on marine bacterioplankton are based on the analysis of either DNA or RNA, and comprehensive results from the simultaneous analysis of marine bacterioplankton on the DNA and RNA level with the same experimental approach are still missing.

From previous T-RFLP and DGGE fingerprinting results [25,26], we expected differences between DNA and RNA clone libraries. Because rRNA operons are far less numerous per cell than ribosomes, the detection of Bacteria based on 16S rRNA genes might be mainly determined by the cellular abundance of the organisms in the environment. In contrast, Bacteria present in low cellular abundance, and therefore probably not detectable on the DNA level, but with a higher ribosome content might still be detectable with RNA clone analysis. Thus, the new insights that DNA and RNA sequencing provide into the total bacterial community and into its potentially active fraction should be a first step towards a better understanding of different bacterial phylotypes in the oligotrophic marine environment.

2 Materials and methods

Details about sample collection, extraction/purification of nucleic acids, cDNA synthesis for reverse transcription-PCR (RT-PCR), PCR and RT-PCR conditions can be found elsewhere[25].

2.1 Sample collection, extraction and purification of nucleic acids

In brief, the complex bacterial community used in this study originated from 200 m depth in the North Aegean Sea (39°13.45′N, 25°00.00′E). Five liters of seawater were pre-filtered, and prokaryotes passing the Whatman GF/C filter were concentrated onto 0.22 μm polycarbonate filters. The microorganisms on these filters were suspended in 2 ml lysis buffer and incubated with lysozyme, followed by an incubation with sodium dodecyl sulfate and Proteinase K. The lysate was extracted with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) and subsequently with an equal volume of chloroform:isoamyl alcohol (24:1). The nucleic acids were precipitated, re-suspended in 200 μl diethyl pyrocarbonate-treated water and stored at −80 °C. For DNA purification, RNA was removed with DNase-free RNase and the DNA was purified using the Qiaex II Kit. For RNA purification, DNA was removed with 20 U RNase-free DNase. RNA was phenol extracted and precipitated as described above for the nucleic acids. The efficiency of the DNA removal from RNA was checked as previously described [25].

2.2 cDNA synthesis for reverse-transcription polymerase chain reaction (RT-PCR), PCR and RT-PCR conditions

The reverse transcription of 16S rRNA into cDNA was performed with ‘first-strand-reaction-mix-beads’ using a pd(N)6 primer. The primers used for PCR and RT-PCR were the Bacteria-specific primer 27F and the universal primer 1492R. After (RT-) PCR for 30 cycles, the (RT-) PCR products were purified with the Qiaquick PCR Purification Kit and quantified by comparing the band intensity of the PCR product with a Smart Ladder (Eurogentec, Seraing, Belgium) as a concentration standard in a 1% agarose gel.

2.3 Cloning and sequencing

Fifty ng of PCR product (insert:vector molar ratio of 3:1) was used for cloning reactions using the pGEM cloning kit (pGEM-T Easy Vector Systems, Promega, Leiden, The Netherlands) following the recommendations of the manufacturer. Insert-containing colonies were re-suspended in 200 μl ultrapure water (Sigma, Zwijndrecht, The Netherlands). Cell suspensions of individual bacterial clones were pelleted at 3200 × g for 20 min. Pure plasmids from the cell pellets were obtained with the Qiaprep miniprep kit (Qiagen). Sequencing reactions were performed using the BigDye Terminator Cycle Sequencing Kit (PE Applied Biosystems). Each cycle sequencing reaction contained 500 ng cleaned plasmid, 4 μl 5 × BigDye Terminator buffer (400 mM Tris–HCl (pH 9.0), 10 mM MgCl2), 2 μl Ready Reaction Mix from the BigDye Kit and 0.8 μl of primer (5 μM; 27F, 518F (5′-CCAGCAGCCGCGGTAAT-3′) or 1492R), adjusted with ultrapure water (Sigma) to a final volume of 20 μl. All clones were sequenced with the 27F primer for 2 h, giving a ∼300 bp sequence. For the clones whose full-length inserts (∼1500 bp) were analyzed, additional sequencing reactions were performed with the primers 27F, 518F and 1492R. Alignment was performed as described below. Cycling conditions started with an initial denaturation at 96 °C for 1 s, followed by 25 cycles of denaturation at 96 °C for 10 s, annealing at 55 °C (50 °C for primer 518F) for 5 s and extension at 60 °C for 4 min. Samples were precipitated with 80 μl of 75% isopropanol (vol./vol.) and re-suspended in 12 μl TSR (Template Suppression Reagent, PE Applied Biosystems). Sequencing was performed in an ABI Prism 310 capillary sequencer (PE Applied Biosystems) using POP 6 and the protocol supplied by the manufacturer.
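The insert:vector molar ratio quoted above can be turned into masses with a short calculation. The following Python sketch is only an illustration: the function name is hypothetical, the insert length of ∼1500 bp is taken from the text, and the vector length of 3015 bp is an assumption based on the nominal size of pGEM-T Easy, not a value stated in this study.

```python
def vector_mass_ng(insert_ng: float, insert_bp: int, vector_bp: int,
                   insert_to_vector_ratio: float) -> float:
    """Mass of vector (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA the number of molecules is proportional to
    mass / length, so the average mass per base pair cancels out.
    """
    if min(insert_ng, insert_bp, vector_bp, insert_to_vector_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    insert_rel_moles = insert_ng / insert_bp          # relative molar amount of insert
    vector_rel_moles = insert_rel_moles / insert_to_vector_ratio
    return vector_rel_moles * vector_bp


# 50 ng of a ~1500 bp insert at a 3:1 ratio; 3015 bp vector length is assumed.
print(f"{vector_mass_ng(50, 1500, 3015, 3):.1f} ng vector")
```

Under these assumptions, roughly 33.5 ng of vector corresponds to 50 ng of insert at a 3:1 molar ratio.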

2.4 Phylogenetic analysis

Partial or full-length sequences for selected clones were combined by pre-alignment with SeqApp (http://iubio.bio.indiana.edu/soft/molbio/seqapp/). Sequences were imported into ARB [27]. Alignment was performed using the automatic aligner in ARB, and pre-aligned sequences were checked manually for small sequencing errors, instrument reading errors, correct alignment, secondary structure and correct group consensus alignments. Chimeric structures from PCR amplification should be detectable as changes in the secondary structure of the sequence. Two of our 16S rRNA gene sequences were of chimeric origin and were removed from the analysis. Additionally, all sequences were checked using the online analysis CHECK_CHIMERA [28]. However, we found the manual control of the secondary structure superior to CHECK_CHIMERA, as the latter occasionally gives contradictory results. The phylogenetic tree was constructed with PAUP [29] using neighbor-joining and parsimony methods. Neighbor-joining (calculated with a distance matrix using a Kimura 2-parameter model and assuming a transition/transversion ratio of 2) and parsimony trees were inferred by the heuristic search option. Maximum likelihood trees were constructed with fastDNAml [30]. To evaluate the neighbor-joining and parsimony trees, 100 bootstrap re-samplings were performed to support the topology of these trees. Instead of bootstrap analysis of maximum likelihood trees, posterior probability distributions were calculated using Bayesian inference and Markov chain Monte Carlo (MCMC) techniques for phylogenetic tree reconstruction and comparison [31]. The sequences have been submitted to the GenBank database [32] under the accession numbers AF406316–AF406553.
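For readers unfamiliar with the Kimura 2-parameter model mentioned above, the following Python sketch computes the standard closed-form pairwise distance, d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the observed proportions of transitions and transversions. This is an illustration only: it estimates P and Q from the sequence pair rather than fixing the transition/transversion ratio, it is not the PAUP implementation used in the study, and the function name is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two pre-aligned sequences.

    Columns containing gaps or ambiguous bases are skipped; U is read as T.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    a = seq_a.upper().replace("U", "T")
    b = seq_b.upper().replace("U", "T")
    sites = transitions = transversions = 0
    for x, y in zip(a, b):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# Toy example: one transition (A<->G) in ten aligned sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

The corrected distance is slightly larger than the raw proportion of differing sites because the model accounts for multiple substitutions at the same position.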

2.5 Reliability of the cDNA approach

Marine bacterial strains (MS20, MS21, MS23) were isolated from marine snow aggregates collected in the Northern Adriatic Sea (3 km off the coast of Rovinj, Croatia). Marine snow was collected [33] and strains isolated as described previously [34]. Two ml of liquid culture grown overnight at 20 °C was centrifuged at 3200 × g for 20 min and washed twice with STE buffer (100 mM NaCl, 50 mM Tris–HCl [pH 7.4], 1 mM EDTA). Cell pellets were stored at −80 °C until nucleic acid extraction (as described above). The DNA sequences obtained from these strains were submitted to GenBank and are available under the accession numbers AF237975–AF237977. The reliability of the cDNA synthesis for subsequent reactions was first tested by mixing pure RNA from the three strains, obtained by the same protocol as described here, in the following proportions: 25 ng μl−1 RNA from strain MS20, 50 ng μl−1 from strain MS21 and 25 ng μl−1 from strain MS23. RNA was quantified with the RiboGreen RNA quantification kit according to the recommendations of the manufacturer (Molecular Probes, Eugene, USA). After cDNA synthesis from this mixture, the RT-PCR product was cloned and the whole inserts (>1300 bp) of 37 randomly picked clones were sequenced (as described above).

2.6 Coverage values and phylotype definition

Coverage values were calculated to determine how efficiently our clone libraries described the complexity of a theoretical community of infinite size, i.e., the original community. The coverage [35] of the clone library is given as $C = 1 - (n_1/N)$, where $n_1$ is the number of clones that occurred only once in the library and $N$ is the total number of clones examined. For phylotype definition, we assumed clones with a sequence similarity of >97% over the first ∼300 bp sequenced (using the 27F primer) to belong to the same phylotype, as suggested previously [36,37].
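The >97% phylotype rule can be illustrated with a short Python sketch. It assumes that the sequences are already aligned to equal length; the function names, the greedy grouping against the first member of each group, and the toy sequences are illustrative assumptions and not the procedure used in this study.

```python
from typing import Dict, List


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def group_phylotypes(seqs: Dict[str, str], threshold: float = 97.0) -> List[List[str]]:
    """Greedy grouping: a clone joins the first group whose first member
    it matches at more than `threshold` percent identity."""
    groups: List[List[str]] = []
    for name, seq in seqs.items():
        for group in groups:
            if percent_identity(seq, seqs[group[0]]) > threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy data: 100 aligned positions; clone_b has 2 mismatches, clone_c has 6.
ref = "ACGT" * 25
near = "TT" + ref[2:]
far = "TTTTTTTT" + ref[8:]
print(group_phylotypes({"clone_a": ref, "clone_b": near, "clone_c": far}))
```

In this toy example clone_b (98% identity) falls into the same phylotype as clone_a, whereas clone_c (94% identity) forms its own phylotype. A greedy scheme of this kind depends on input order, so real analyses typically rely on dedicated clustering tools.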

3 Results and discussion

3.1 Reliability of rRNA reverse-transcription and phylogenetic reconstruction

After cDNA synthesis, all RT-PCR products were of the expected size (∼1500 bp), and the sequences of the 37 RT-PCR products were >99% similar to those of the corresponding strains when DNA was used as template (data not shown). All 198 clones in our library (2 clones from the DNA library were of chimeric origin and therefore excluded) were characterized based on the first ∼300 bp to determine their phylogenetic affiliation. Additionally, we sequenced >1300 bp from 40 representative clones and clones with low BLAST scores. Separate trees were constructed including partial and full-length sequences from all clones. Partial sequences always formed clusters (>99% similarity) with full-length sequences from the same clones (data not shown). Therefore, the full-length sequence trees shown in Fig. 1 are representative of all clones found in the DNA/RNA library.

Figure 1.

Maximum likelihood tree inferred from 91 to 1460 bp (E. coli numbering), bootstrap values or posterior probabilities (percentages) for the neighbor joining, parsimony and maximum likelihood analysis are indicated above and below the corresponding nodes, respectively. ‘–' indicates no representative bootstrap support. The prefix ‘env.' denotes an environmental gene clone and organisms in culture are in italics. The GenBank accession numbers for clones presented in this study are provided in the experimental procedures section. Asterisks indicate clones with low BLAST scores. The scale bar indicates 0.10 changes per site. (a) α-Proteobacteria, (b) γ-Proteobacteria, (c) other groups.


3.2 Clones from the DNA (16S rRNA gene) and RNA (16S rRNA) libraries and their distribution in different bacterial phyla

Most DNA and RNA clones (72.5%) fell into the major phyla α- and γ-Proteobacteria. The remaining 55 clones (27.5%) were related to the δ-Proteobacteria, green non-sulfur Bacteria, Cyanobacteria, plastids, Bacteroidetes, Actinobacteria-Firmicutes and Chlorobium-Fibrobacter, and were grouped as shown in Table 1. Of the DNA clones alone, 52% were affiliated with the α-Proteobacteria and 22.44% with the γ-Proteobacteria. The remaining DNA clones (25.51%) were distributed over distinct bacterial phyla, with only 1–7 clones per group (Table 2). In the RNA library, 28% of the clones were affiliated with the α-Proteobacteria, but 44% with the γ-Proteobacteria. The remaining RNA clones (28%) were distributed over the remaining phyla with 1–5 clones per group, except for the clones affiliated with the Actinobacteria-Firmicutes, which contributed 11% of the RNA clones (Table 2). Of the DNA clones in the α-Proteobacteria, 60% were >97% similar to the SAR11 clade, and 14% clustered with SAR11 (SAR11 clustered, Fig. 1(a)). Of the RNA clones in the α-Proteobacteria, 36% fell into the SAR11 clade, and 14% clustered with SAR11. However, only 2 DNA clones and 1 RNA clone fell into the SAR116 clade, and the remaining clones grouped as shown in Table 1. Since 31% of our DNA clones clustered within the SAR11 clade, our results agree reasonably well with the ∼26% abundance of SAR11 clones in DNA libraries from seawater and therefore potentially relate to oligotrophic marine Bacteria [38,39]. Only 10% of the clones in the RNA library were related to SAR11; since our sample originated from a depth of 200 m, this is broadly consistent with the decrease in FISH counts with depth reported for this clade [38]. On the other hand, 39% of the RNA clones in the γ-Proteobacteria fell into the SAR86 clade (Fig. 1(b)). For the DNA clones, 23% were related to the SAR86 clade, and the remaining RNA clones grouped as shown in Table 1. These results indicate a well-mixed water column, since SAR86 has previously been detected only in the surface water column [3], where divergent proteorhodopsins (light-driven proton pumps) have recently been detected in this group [40]. Furthermore, clones affiliated with Actinobacteria-Firmicutes (ACT_6) and Chlorobium-Fibrobacter (CHL_2) might also contribute to a certain extent to the bacterial activity in the community, based on their occurrence only in the RNA library. A previous study reported clones affiliated with Chlorobium-Fibrobacter at mesopelagic depths, but on the DNA level [41], whereas we detected these clones (e.g., CHL_2, Table 1) primarily on the RNA level.

Table 1.  Phylogenetic affiliation of the clones in the DNA (16S rRNA gene) and RNA (16S rRNA) libraries
Phylogenetic group | Representative clone | n DNA | n RNA | Closest GenBank relative | BLAST scores of group
ALP_1 | AEGEAN_112 | 1 | 0 | env.Arctic97A_7 (AF353236) | 88
ALP_2 | AEGEAN_101 | 1 | 0 | env.SAR203 (U75255) | 87
ALP_3 | AEGEAN_122 | 1 | 0 | env.OCS126 (AF001638) | 97
ALP_4 | AEGEAN_130 | 1 | 0 | env.SAR220 (U75257) | 96
ALP_5 | AEGEAN_161 | 1 | 0 | env.MB-C2-128 (AY093481) | 95
ALP_6 | AEGEAN_169 | 1 | 0 | env.K2S24 (AY344373) | 82
ALP_7 | AEGEAN_171 | 1 | 0 | env.OCS28 (AF001636) | 99
ALP_8 | AEGEAN_180 | 1 | 0 | env.CHAB-I-5 (AJ240910) | 100
ALP_9 | AEGEAN_208 | 0 | 1 | Pelagibacter ubique (AF510192) | 98
ALP_10 | AEGEAN_227 | 0 | 1 | env.OM75 (U70683) | 97
ALP_11 | AEGEAN_233 | 0 | 1 | env.Arctic96B_22 (AF353229) | 91
ALP_12 | AEGEAN_124 | 1 | 1 | env.ZA3911c (AF382131) | 98–99
ALP_13 | AEGEAN_128 | 2 | 0 | env.MB13F01 (AY033325) | 94
ALP_14 | AEGEAN_207 | 0 | 2 | Rhodobium orientis MB312 (D30792) | 89
ALP_15 | AEGEAN_106 | 3 | 0 | env.Arctic95D_8 (AF353223) | 98–99
ALP_16 | AEGEAN_162 | 3 | 0 | env.SAR193 (U75649) | 97–99
ALP_17 | AEGEAN_238 | 0 | 3 | Roseobacter sp. ISM (AF098495) | 97
ALP_18 | AEGEAN_108 | 2 | 2 | env.HOC29 (AB054163) | 97–98
ALP_19 | AEGEAN_104 | 4 | 0 | env.SAR193 (U75649) | 97–98
ALP_20 | AEGEAN_155 | 4 | 1 | env.ZA3603c (U78994) | 98
ALP_21 | AEGEAN_142 | 4 | 2 | env.MB-C2-128 (AY093481) | 98–99
ALP_22 | AEGEAN_205 | 0 | 6 | env.Olavius loise endosymb. 2 (AF104473) | 89
ALP_23 | AEGEAN_107 | 5 | 3 | env.MB13F01 (AY033325) | 92–93
ALP_24 | AEGEAN_100 | 15 | 5 | env.ZD0409 (AJ400350) | 99–100
GAM_1 | AEGEAN_145 | 1 | 0 | env.ZA3610c (AF382123) | 97
GAM_2 | AEGEAN_146 | 1 | 0 | env.ZA3610c (AF382123) | 96
GAM_3 | AEGEAN_153 | 1 | 0 | env.KTc1119 (AF235120) | 98
GAM_4 | AEGEAN_245 | 0 | 1 | env.ARKICE74 (AF468306) | 95
GAM_5 | AEGEAN_194 | 1 | 1 | env.ISO4 (AF328762) | 92
GAM_6 | AEGEAN_105 | 2 | 0 | env.ZD0405 (AJ400348) | 99
GAM_7 | AEGEAN_133 | 2 | 0 | env.Arctic96B_1 (AF353242) | 93
GAM_8 | AEGEAN_216 | 0 | 3 | env.37–8 (AY167969) | 98
GAM_9 | AEGEAN_183 | 2 | 0 | env.ZA3411c (AF382119) | 96–97
GAM_10 | AEGEAN_195 | 3 | 0 | env.ZA3411c (AF382119) | 98
GAM_11 | AEGEAN_114 | 3 | 0 | env.CHAB_1_7 (AJ240911) | 99
GAM_12 | AEGEAN_165 | 1 | 4 | env.ZA3605c (AB382818) | 97
GAM_13 | AEGEAN_229 | 0 | 5 | env.KTc0924 (AF235121) | 87
GAM_14 | AEGEAN_202 | 0 | 4 | env.ZA3913c (AF3821332) | 96–97
GAM_15 | AEGEAN_234 | 1 | 5 | env.CHAB_1_7 (AJ240911) | 93
GAM_16 | AEGEAN_209 | 3 | 8 | env.PLY_P1_108 (AY354844) | 97–98
GAM_17 | AEGEAN_204 | 1 | 13 | env.PLY_P1_108 (AY354844) | 95–97
DEL_1 | AEGEAN_134 | 1 | 2 | env.SAR324 (U65908) | 98
DEL_2 | AEGEAN_228 | 0 | 4 | env.SAR324 (U65908) | 95
Green non-sulfur bacteria
GNS_1 | AEGEAN_116 | 1 | 0 | env.sponge sy PAWS52f (AF186417) | 90
CYA_1 | AEGEAN_120 | 1 | 0 | env.ZA3833c (AF382140) | 100
CYA_2 | AEGEAN_166 | 1 | 0 | env.ZA3833c (AF382140) | 99
CYA_3 | AEGEAN_168 | 1 | 0 | env.MB11E09 (AY033308) | 100
CYA_4 | AEGEAN_172 | 1 | 0 | Prochlorococcus sp. MIT9313 (AF053399) | 99
PLA_1 | AEGEAN_118 | 1 | 0 | Mantoniella squamata (X90641) | 95
PLA_2 | AEGEAN_117 | 1 | 1 | env.OCS182 (AF001660) | 96–97
PLA_3 | AEGEAN_158 | 1 | 1 | env.OM5 (U70715) | 99
PLA_4 | AEGEAN_115 | 3 | 3 | Mantoniella squamata (X90641) | 96
B_1 | AEGEAN_103 | 1 | 0 | env.agg58 (L10946) | 96
B_2 | AEGEAN_179 | 1 | 0 | env.GMD16C04 (AY162110) | 88
B_3 | AEGEAN_129 | 2 | 0 | env.Arctic97A_13 (AF354618) | 92
B_4 | AEGEAN_150 | 2 | 0 | env.NAC60_3 (AF245645) | 95
ACT_1 | AEGEAN_163 | 1 | 0 | env.MB11A03 (AY033296) | 92
ACT_2 | AEGEAN_173 | 1 | 0 | env.ZA3111c (AF382115) | 99
ACT_3 | AEGEAN_113 | 1 | 1 | env.ZA3635c (AF382139) | 87
ACT_4 | AEGEAN_193 | 2 | 0 | env.ZA3111c (AF382115) | 93
ACT_5 | AEGEAN_182 | 1 | 3 | env.ZA3409c (AF382122) | 94
ACT_6 | AEGEAN_241 | 0 | 7 | env.ZA3612c (AF382135) | 86
CHL_1 | AEGEAN_185 | 1 | 1 | env.OCS307 (U41450) | 91
CHL_2 | AEGEAN_225 | 0 | 5 | env.SAR406 (U34043) | 94

  1. Clones with a similarity of >97% were defined as the same phylotype and therefore grouped together. n = number of clones in the DNA and RNA library, respectively. The prefix ‘env.’ denotes an environmental gene clone and organisms in culture are in italics. Representative clones used for the phylogenetic trees in Fig. 1 are in bold.

Table 2.  Distribution of the clones (n = number of clones) from the DNA (16S rRNA gene) and RNA (16S rRNA) libraries among the main phylogenetic groups found in this study
Phylogenetic group | Clones only in DNA or RNA library (n) | Clones in DNA and RNA libraries (n) | Unique clones (n) | Coverage values for clones in this group (%)
Other groups | 32 | 21 | 20 | 62
All clones | 94 | 104 | 43 | 78.5
Removed clones | | | |

  1. Clones from the DNA and RNA libraries were compared to determine whether they were characteristic of only one library or were found in both libraries (i.e., clones with similarities >97% were found in the DNA and RNA library). Coverage percentages were calculated as described in Section 2, nd = not determined.

The combined analysis of DNA and RNA clones from the same bacterial community characterizes phylotypes that would remain uncharacterized if only the DNA or only the RNA clones were analyzed. Table 2 indicates that ∼25% of the DNA clones are characteristic only of the DNA library, with no close relatives (>97% similarity) found in the RNA library. Comparable values were observed for the RNA library, as ∼21% of the RNA clones had no close relatives in the DNA library (Table 2). For example, clones related to ALP_19, GAM_10, GNS_1, CYA_1–4 and B_1–4 were represented only in the DNA library, while clones related to ALP_22, GAM_14, DEL_2, ACT_6 and CHL_2 were represented only in the RNA library (Table 1).

Three aspects of the distribution pattern of clones in the DNA and RNA libraries can be considered in an ecological context. First, repetitive DNA clones (e.g., ALP_19, ALP_23, ALP_24, Table 1) might be representative of Bacteria high in cellular abundance and/or with multiple operons within their cells. However, the much lower number or absence of similar clones in the RNA library could indicate that these DNA clones are from Bacteria with fewer ribosomes and are therefore probably representative of cells with reduced metabolic activity. Secondly, a high number of repetitive clones in the RNA library (e.g., GAM_17, Table 1) could represent active members of the complex community with more ribosomes present in their cells. Thirdly, clones from Bacteria with low cellular abundance and/or low operon numbers, which were not detected in the DNA library, might indicate members of the complex community that are not detectable on the DNA level (below the detection threshold of our approach) but are detectable on the RNA level (e.g., ALP_22, GAM_13, ACT_6, CHL_2). This observed mismatch between the DNA and RNA libraries suggests that these clones originate from Bacteria low in cellular abundance but with potentially high metabolic activity, as indicated by their clonal presence in the RNA library.
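The reasoning behind these three aspects can be made concrete with a deliberately simple toy model, sketched here in Python. It assumes that the expected share of a phylotype in the DNA library is proportional to cells × rRNA operons per cell, and in the RNA library to cells × ribosomes per cell, with unbiased extraction, amplification and cloning. The community composition and cell numbers are hypothetical; only the ribosome range (200 to 2000 per cell) echoes the values cited in the introduction for an oligotrophic ultramicrobacterium.

```python
from typing import Dict


def library_fractions(weights: Dict[str, float]) -> Dict[str, float]:
    """Normalize template amounts into expected clone-library fractions."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total template amount must be positive")
    return {name: w / total for name, w in weights.items()}


# Hypothetical two-member community (all numbers are illustrative assumptions).
community = {
    "abundant_low_activity": {"cells": 900, "operons": 1, "ribosomes": 200},
    "rare_high_activity": {"cells": 100, "operons": 1, "ribosomes": 2000},
}

# DNA template scales with cells x operons; RNA template with cells x ribosomes.
dna = library_fractions({n: m["cells"] * m["operons"] for n, m in community.items()})
rna = library_fractions({n: m["cells"] * m["ribosomes"] for n, m in community.items()})

for name in community:
    print(f"{name}: DNA {dna[name]:.2f}, RNA {rna[name]:.2f}")
```

In this hypothetical case the rare member accounts for 10% of the expected DNA clones but for about 53% of the expected RNA clones, which is the kind of mismatch between libraries discussed above.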

In our study, stringent controls in sample preparation, (RT-) PCR and sequencing were performed, and we did not detect (RT-) PCR biases leading to sequencing artifacts that could explain the observed mismatch between our DNA and RNA libraries. In fact, we found >97% sequence similarity between clones from both libraries (∼50% of all clones analyzed), indicating that the often hypothesized increase in sequencing errors and preferential chimera formation for RT-PCR products did not determine the outcome of our RNA library. Only two sequences were chimeras, and both originated from the DNA library. Instead, we found well-aligned sequences for the DNA as well as the RNA library. We are aware that our study does not address the possibility that bacterial cells differ in their numbers of rRNA molecules as a function of size, physiology and even time of day. Furthermore, large cells are likely to have more rRNA even if they grow more slowly than small, more active cells. Despite these uncertainties, our study indicates that distinct microorganisms with low BLAST scores (e.g., RNA clones ALP_22, GAM_13 and CHL_2) might contribute to activity patterns of marine microorganisms and would have remained undetected if only 16S rRNA genes had been analyzed.

Recent studies show that microheterogeneity accounts for a large portion of the diversity (by means of phylotype richness based on 16S rRNA genes) in complex bacterial communities [42,43]. Most of the diversity resulted from ∼50% of the sequences displaying <1% nucleotide difference from each other, and it has been hypothesized that ‘microdiversity’ is a feature of co-existing strains [42]. Since sequences with a similarity >97% were considered the same phylotype and therefore grouped together in our study, rRNA (gene) sequences from different operons within the same cell would probably fall within the same phylotype [6]. Higher similarity values were used for phylotype characterization in microdiversity studies [42], thereby increasing the microdiversity tremendously. We applied the ‘common rule of thumb’, which classifies organisms that differ by more than 3% in their 16S rRNA sequences as different phylotypes [44]. This threshold has been used for the majority of 16S rRNA gene clone libraries from various environments and therefore seems, for the sake of comparison, a valid phylotype detection threshold [45,46]. However, different ‘ecotypes’ could share sequence similarities >97% and would therefore group together within the same phylotype [47]. Although our study does not sample a bacterial community at as high a resolution as these recent studies [42,43], our smaller clone libraries indicate important differences between DNA and RNA libraries in (I) how repetitive clones from these two libraries are distributed among different phyla and (II) which phylotypes might be potentially important in terms of metabolic activity and which are not. Thus, our study contributes to the open question of the ecological significance of this observed microdiversity when RNA libraries are included in phylogenetic analysis. This could reveal whether DNA or RNA microdiversity represents populations (‘ecotypes’) that share similar ecological niches and adaptations.

Still, because of the multiplicity and heterogeneity of 16S rRNA genes within bacterial strains [48,49], 16S rRNA gene analysis is a proxy for sequence ‘diversity’ rather than for the ‘diversity’ of prokaryotic cells themselves [50]. Thus, multiple operons within the bacterial cell could lead to a 2–15-fold over-representation of certain clones, if we assume unbiased PCR amplification. A recent analysis of bacterial genomes with multiple rRNA operons indicated that the vast majority of interoperonic sequence differences within 76 bacterial genomes showed <1% divergence, although the genomes under analysis tend to have more operons since they were derived from microorganisms in culture [6]. Taking these results into consideration, there might be a 2.5-fold overestimation of bacterial diversity (by means of type richness) when using cloning and sequencing approaches of 16S rRNA genes. A recent study also indicates that a highly abundant marine strain (Candidatus Pelagibacter ubique gen. nov., sp. nov.) seems to have a single rRNA operon [42], as has been previously found for another oligotrophic marine bacterium [8]. Whether this is a general feature of marine microorganisms in the oligotrophic ocean remains to be determined.

3.3 Unique clones in the DNA and RNA libraries and coverage of these libraries

Unique clones (clones occurring only once in the clone library) were determined to evaluate the size of our clone libraries. Since 72.5% of the clones in the DNA and RNA libraries were related to α- and γ-Proteobacteria, high overall coverage values within these groups indicate that the data presented here are representative of the complex community, based on our combined DNA/RNA approach (Table 2). For all clones in the DNA library, the coverage was 68%, whereas for all clones in the RNA library the coverage reached 89% (Table 2). When the DNA and RNA libraries were combined, since they were derived from the nucleic acids of the same complex community, the coverage was 78.5%. The lower coverage of the DNA library (68%) can be explained by the higher number of unique DNA clones (n = 16) affiliated with the ‘other groups’ cluster (Table 2). Unique clones were found more often in the DNA library (32 clones) than in the RNA library (11 clones), thereby contributing considerably to the overall complexity (by lowering the coverage values), while the RNA library seems less affected by unique clones. Unique clones probably represent an insignificant part of the community, since they could originate from Bacteria with low operon numbers and slow metabolism [7] or from Bacteria low in cellular abundance.
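The coverage values quoted above follow directly from the formula in Section 2.6. The Python sketch below is a minimal illustration with a hypothetical function name; it uses N = 100 clones per library (200 combined), an assumption that reproduces the reported percentages. With the 98 non-chimeric DNA clones as N, the DNA value would be slightly lower.

```python
def coverage(n_unique: int, n_total: int) -> float:
    """Coverage C = 1 - (n1 / N), with n1 clones seen once among N clones."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_unique <= n_total:
        raise ValueError("n_unique must lie between 0 and n_total")
    return 1.0 - n_unique / n_total


# Unique-clone counts from the text; library sizes of 100 each are assumed.
libraries = {"DNA": (32, 100), "RNA": (11, 100), "DNA + RNA": (43, 200)}
for name, (n1, n) in libraries.items():
    print(f"{name}: {100 * coverage(n1, n):.1f}%")
```

The calculation also makes the effect of unique clones explicit: each additional unique clone in a library of 100 lowers the coverage by one percentage point.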

PCR (and cloning) biases [51] might explain the high number of unique clones in our DNA library, because of an inefficient amplification of template DNA, an uncertainty that every PCR-based approach is confronted with. We do not know (and we are not aware of any study that addresses this question) how many phylotypes are actually excluded from molecular analysis because specific primers are used in 16S rRNA (gene) techniques. Novel sensitive approaches [52,53] with specific FISH probes for representative clones from the DNA and RNA libraries could address many of the questions raised above. Also, the use of additional PCR primers with other specificities might resolve some of the PCR-related concerns [54].

Although we analyzed only a single free-living bacterial community from the oligotrophic Aegean Sea, insights into the bacterial community structure based on DNA and RNA were obtained. For the majority of our clones, the closest relatives in GenBank were bacterioplankton clones from major ocean provinces such as the Sargasso Sea (22 clones), Atlantic Ocean (47 clones), North Sea (53 clones), Arctic Ocean (10 clones) and Pacific Ocean (20 clones). The remaining clones were related to marine symbionts (11 clones), deep sea microorganisms (7 clones), lake bacterioplankton (1 clone) and marine mesocosms (10 clones), and for 17 clones the environment of origin of the closest relative could not be clearly determined. Although BLAST scores for our sequences were sometimes low, the dominance of related sequences from various marine provinces as closest relatives indicates that the DNA and RNA clone libraries are representative of oceanic bacterioplankton. Interestingly, RNA clones also showed low BLAST scores to sequences from GenBank, indicating a potential characterization of distinct phylotypes from the marine environment (Table 1). These results also indicate the potential of this combined DNA/RNA approach for the characterization of the bacterial community and the identification of members of the community on the RNA level. Therefore, conservative estimates can be made, as abundant Bacteria, Bacteria with multiple operons per cell and Bacteria with higher ribosome numbers per cell are likely to be repetitively more abundant in clone libraries.


We thank the captain and crew of the RV Aegaeo for their help during sample collection and Christian Winter for sample collection and nucleic acids extraction. This manuscript was supported by a grant from the European Union to G.J.H. (MAST-MTP II, MATER, No. MAS3-CT96-0051).