Identification of the green alga, Chlorella vulgaris (SDC1) using cyanobacteria derived 16S rDNA primers: targeting the chloroplast


*Corresponding author. Tel.: +44 (131) 451 8165; Fax: +44 (131) 451 3129, E-mail:


We tested a set of oligonucleotide primers originally developed for the specific amplification of 16S rRNA gene segments from cyanobacteria, to determine their versatility as an identification tool for phototrophic eucaryotes. Using web-based bioinformatics tools, we determined that these primers targeted not only cyanobacterial sequences, as previously described, but also 87% of sequences derived from phototrophic eucaryotes. To test this finding, a type culture and an environmental strain from the freshwater, unicellular green algal genus Chlorella Beijerinck were selected for further study. We then sequenced a 578-bp fragment of the 16S rRNA gene, which proved to be located within the chloroplast genome, performed sequence analysis and positively identified our solvent-degrading environmental strain (SDC1) as Chlorella vulgaris.


For decades, strains of Chlorella (Chlorophyceae, Chlorococcales) have served as model organisms in plant physiology and biochemical research. Chlorella species have been shown to form intracellular symbioses with aquatic invertebrates and protozoa [1], produce hydrogen, tolerate extremes in acid and salt levels, and produce pharmaceutically important natural products [2]. Previous research has determined that there is a deep evolutionary gap between some of the Chlorella species, with some species possessing niche-specific properties, such as the hydrogen-activating enzyme hydrogenase [3], the ability to liquefy gelatin [4] or reduce nitrate, and the ability to degrade and tolerate high concentrations (to at least 1% v/v) of volatile organic compounds (VOCs) such as isopropanol and acetone [5].

In order to understand the ecology of microalgae, it is desirable to match type and environmental strains. This facilitates the extrapolation of physiological and biochemical information gained from type strains to the environmental strains studied. However, various factors, such as misidentification of strains in culture collections [2], inadequate culture conditions which lead to the malformation or loss of identifying features [6], and growth intricacies [3], make it difficult in many cases to apply taxonomic guidelines based on culture strains to environmental isolates. The phylogenetic relatedness of organisms can be deduced in principle from the comparison of their genetic sequences. In particular, the rRNA genes, which are found not only in all procaryotic cells and in the nucleus of eucaryotic cells, but also in organelles, are well suited for such purposes [7,8].

Information regarding the chloroplast genomes from the algal world is now beginning to be disseminated through the literature, with the complete sequence of the plastid genomes of members representing several different classes recently being published, i.e. Euglenophytes, Rhodophytes, Chromophytes, Glaucocystophytes and the Chlorophytes [9–13]. As a result, genetic content and the sequence of many genes within chloroplast DNA have been shown to be relatively conserved amongst land plants and microalgae [14]. Sequencing of the 16S rRNA gene within these organisms is facilitated by the presence of several conserved regions, allowing primers to be designed [15].

On the basis of published 16S rRNA sequences, Nübel et al. developed cyanobacterium-targeted primers for use in cyanobacterial microbial diversity studies [16]. We have adopted these primers for use within our laboratory for the positive identification of environmental cyanobacterial isolates. In addition, we have applied these primers in the positive identification of environmental isolates of microalgae, namely via sequencing of the chloroplast genome region [17]. The present study was undertaken to: (1) determine the adaptability and robustness of this particular set of primers for use on other chloroplast-containing organisms, and (2) positively identify the environmental strain of the putative unicellular green alga Chlorella vulgaris (SDC1) using these primers by targeting the 16S rRNA region present within the chloroplast genome.

2 Materials and methods

2.1 Primer versatility

On the basis of published 16S rRNA sequences from Archaea, Eubacteria (including cyanobacteria) and Eucaryotes (i.e. chloroplast genomes) located within GenBank, we tested the robustness of the Nübel et al. primer pairs (1) 106F+781R(a), (2) 106F+781R(b), (3) 106F+781R, (4) 359F+781R(a), (5) 359F+781R(b) and (6) 359F+781R (Table 1) using the bioinformatics package PRIMERSEARCH (Val Curwen, Human Genome Mapping Project, Cambridge, UK). These primer combinations were tested independently to determine the most robust primer pair under the PCR conditions first described by Nübel et al. [16].
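The kind of in silico test described above can be illustrated with a minimal sketch. It is not the PRIMERSEARCH implementation: it simply scans a template for a forward primer and for the reverse complement of a reverse primer, allowing IUPAC degeneracies (such as Y for C/T) and a chosen number of mismatches. The primer and template sequences are hypothetical toy strings, not the Nübel et al. primers, and the function names are assumptions made for illustration.

```python
# IUPAC nucleotide codes mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(primer, window):
    """Count positions where the template base is not allowed by the primer code."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, window))


def find_sites(primer, template, max_mismatch=0):
    """Start positions (0-based) where the primer matches within max_mismatch."""
    n = len(primer)
    return [i for i in range(len(template) - n + 1)
            if mismatches(primer, template[i:i + n]) <= max_mismatch]


def amplicons(forward, reverse, template, max_mismatch=0):
    """(start, end, length) of each predicted product on the given strand.

    The reverse primer is written 5' to 3' on the opposite strand, so its
    reverse complement is the string searched for in the template.
    """
    rev_rc = reverse_complement(reverse)
    products = []
    for f in find_sites(forward, template, max_mismatch):
        for r in find_sites(rev_rc, template, max_mismatch):
            end = r + len(rev_rc)
            if end > f + len(forward):
                products.append((f, end, end - f))
    return products


# Hypothetical toy data (not real primer or 16S rRNA sequences).
forward = "GGAAY"
reverse = "CCTTG"
template = "TTGGAAGACGTACGTCAAGGAA"
print(amplicons(forward, reverse, template))                  # strict search
print(amplicons(forward, reverse, template, max_mismatch=1))  # one mismatch allowed
```

In this toy template a G sits where the forward primer's Y expects C or T, so the strict search reports no product, while allowing one mismatch recovers a single product. Running such a search over every database sequence, once per primer pair, gives the raw hit lists from which match percentages can be derived.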

Table 1. Primer sequences and target sites of primers designed by Nübel et al. (reproduced from [16])
Primer (a) | Sequence (5′ to 3′) | Target site (b)

(a) R (reverse) and F (forward) designations refer to primer orientations in relation to the rRNA gene.
(b) Escherichia coli numbering of 16S rRNA nucleotides [18].
(c) Forward primers 106F and 359F were used in alternative reactions.
(d) Y, a C/T nucleotide degeneracy.
(e) Reverse primer 781R is an equimolar mixture of 781R(a) and 781R(b).


2.2 Environmental and culture strains studied

Samples of environmental Chlorella material (SDC1) (putatively identified, via microscopic examination, as C. vulgaris) were obtained from a mixed consortium of solvent-tolerant bacteria, by Bustard et al. [5] from the Heriot-Watt University, Edinburgh, UK. For comparison, a type culture strain of C. vulgaris Beijerinck was obtained from the University of Texas, Austin, USA (UTEX Strain 259).

2.3 Culture conditions

An axenic culture of C. vulgaris (UTEX 259) was prepared using proteose medium (1.0 g l−1 proteose peptone; 0.25 g l−1 NaNO3; 0.025 g l−1 CaCl2.2H2O; 0.075 g l−1 MgSO4.7H2O; 0.075 g l−1 K2HPO4; 0.175 g l−1 KH2PO4; 0.025 g l−1 NaCl, pH 6.0 [19]), while the environmental strain (SDC1) was maintained in minimal salts medium (MSM) (3.0 g l−1 NaHCO3; 1.0 g l−1 NH4HCO3; 0.2 g l−1 K2HPO4; 102.5 mg l−1 MgSO4.7H2O; 36.75 mg l−1 CaCl2.2H2O; 10 mg l−1 FeSO4; 1 ml l−1 trace elements solution, pH 6.0), adapted from Angelidaki et al. [20] at 21±3°C, 12 h light:dark cycle, 15 μmol photons m−2 s−1 of illumination, aerated constantly and transferred routinely into fresh medium.

2.4 DNA extraction, agarose gel electrophoresis and DNA recovery

Total DNA was extracted from cultures harvested and concentrated in the late-exponential phase of growth. DNA extraction and recovery were carried out according to the methodology of Tamagnini et al. [21]. Agarose gel electrophoresis was performed following standard protocols [22] using 1.0× TAE buffer (44.5 mM Tris, 44.5 mM glacial acetic acid and 1.0 mM EDTA). The DNA was visualised with the fluorescent dye ethidium bromide by direct examination of the gel under UV light. λDNA cut with HindIII or, alternatively, a 100-bp ladder was used as marker throughout.

2.5 Amplification of the 16S rRNA chloroplast gene

The oligonucleotide primers 359F, 781R and 106F (Table 1) were used to amplify the 16S rRNA gene fragment from the C. vulgaris strains by PCR, using the PE 2400 PCR System (Perkin Elmer-Applied Biosystems, Foster City, CA, USA) with 0.5 U of Taq DNA polymerase, 1× reaction buffer (50 mM KCl, 1.5 mM MgCl2 and 100 mM Tris–HCl, pH 9.0), 167 μM deoxynucleoside triphosphates, 1 μM of each primer and 10 ng of genomic DNA, for 40 cycles following previously published protocols [21]. The amplified DNA product was visualised by agarose gel electrophoresis (as described above).

2.6 Direct sequencing of DNA product and sequence analysis

Sequencing of the purified PCR product was performed using the PE Big Dye Terminator Kit following the manufacturer's protocol, followed by sodium acetate/ethanol DNA precipitation [23]. Samples were then sequenced using the PE Applied Biosystems 377 DNA sequencer. Computer-assisted sequence analysis and comparisons were performed using a modified version of PHYLIP [24], Chromas, CLUSTALW [25] and sequences derived from the Basic Local Alignment Search Tool (BLAST) [26] (National Centre of Biotechnology Information, Washington, DC, USA).


3.1 Durability testing of Nübel et al. primers on phototrophic eucaryotes

On the basis of published 16S rRNA sequence data, we tested the versatility of these primer pairs. After extensive database searches and cross-referencing, we determined that the primer pair 359F+781R was the most appropriate for our purposes, as it was ultimately the more specific binder to template DNA: the primer pair 106F+781R also bound some 16S rRNA sequences of organisms not affiliated with either cyanobacteria or phototrophic eucaryotes, albeit in relatively small numbers (<0.25%) (Table 2). All primer pairs had matches with sequences not associated with cyanobacteria or phototrophic eucaryotes; however, at least two mismatches to template DNA were needed in each instance (data not shown). In addition, we observed, rather unexpectedly, that these primers occasionally did not cover all families of organisms within a particular phylum; for example, diatoms of the Fragilariophyceae family and unidentified environmental diatom samples did not match, even when a 5% margin for error was used (Table 2). However, in this case a final evaluation cannot yet be made due to the limited number of related sequences available. Of the 9860 sequences obtained from GenBank and queried in this analysis, only those pertaining to cyanobacterial strains and the chloroplasts of both algae and land plants produced hits (Table 2). Of those hits, approx. 99.8% of all cyanobacterial sequences present within the database (859 entries) matched, while approx. 90.3% of all algal (excluding unidentified environmental diatoms) and 74.0% of all land plant chloroplast genome sequences present within GenBank matched (Table 2). These results lend strong support to the findings of Nübel et al. [16], who observed that these same primers matched all 174 cyanobacterial 16S rRNA sequences present within public databases as of June 1997. Nübel et al., however, did not extend their search to include all phototrophic eucaryotes with published sequences, and thus did not discover the great versatility of these primers.
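The percentages reported here are per-group match rates. A minimal sketch of that tally follows, assuming each database sequence has already been labelled with a taxonomic group and a match/no-match flag. The group names and counts are hypothetical, and applying the 0.25% error margin per group is an assumption about how the cut-off was used.

```python
from collections import defaultdict


def coverage_by_group(records, error_margin=0.25):
    """Percentage of sequences matched in each group.

    records: iterable of (group_name, matched) pairs, one per database sequence.
    Groups whose match rate is at or below error_margin (in percent) are
    dropped, mirroring the idea of ignoring hits below an error margin.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, matched in records:
        totals[group] += 1
        hits[group] += bool(matched)
    percents = {g: 100.0 * hits[g] / totals[g] for g in totals}
    return {g: round(p, 1) for g, p in percents.items() if p > error_margin}


# Hypothetical toy input: 1000 sequences per group.
records = (
    [("group_A", True)] * 998 + [("group_A", False)] * 2
    + [("group_B", True)] * 1 + [("group_B", False)] * 999
)
print(coverage_by_group(records))
```

With these toy counts only group_A is kept; group_B, with a single hit in 1000 sequences, falls below the margin and is omitted.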

Table 2. Results of the web-based bioinformatics study into the versatility of primer pairs originally developed by Nübel et al., including all Archaea, Eubacteria and Eucaryotic sequences related to the 16S rRNA gene fragment, using the program PRIMERSEARCH (Val Curwen, Human Genome Mapping Project, Cambridge, UK)

Taxon | 106F+781R(a) (%) | 106F+781R(b) (%) | 106F+781R (%) | 359FC+781R(a) (%) | 359FT+781R(b) (%) | 359F+781R
CFB/green sulfur Eubacteria and relatives | | | | | |
Chlamydiales/Verrucomicrobia group | | | | | |
Unclassified Cyanobacteria | 97.1 | 97.2 | + | 99.8 | | +
Fibrobacter/Acidobacteria group | | | | | |
Firmicutes (Gram-positive eubacteria) | 2.3 | 2.3 | + | | |
Green non-sulfur Eubacteria group | | | | | |
Nitrospira group | | | | | |
Thermus/Deinococcus group | | | | | |
Fungi/Metazoa group | | | | | |
Rhodophyta (red algae) | 68.0 | 68.0 | + | 88.0 | 88.0 | +
Stramenopiles (heterokonts) | | | | | |
  Bacillariophyta (diatoms) (a) | | | | | |
    Bacillariophyceae | 80.0 | 80.0 | | 89.1 | 89.1 | +
    Coscinodiscophyceae (centric) | 75.0 | 75.0 | + | 85.0 | 85.0 | +
    Environmental samples | 3.7 | 3.7 | + | | |
Viridiplantae (green plants) (a) | | | | | |
  Chlorophyta (green algae) | 83.0 | 83.0 | + | 98.0 | 98.0 | +
  Streptophyta (land plants) | 45.0 | 45.1 | + | 74.0 | 74.0 | +

All primer pair combinations were tested in order to determine the most versatile pairs for general use. Several instances were observed where primers matched sequences from organisms not related to either cyanobacteria or eucaryotic phototrophs; however, as the number was below our error margin of 0.25%, they were not included. 359FC and 359FT denote the two sequences for primer 359F (Table 1) possible due to degeneracy; 359FC+781R(b) and 359FT+781R(a) were not included in this table as values were identical to 359FC+781R(a) and 359FT+781R(b), respectively.
(a) Signifies eucaryotic orders which have sub-orders that are not targeted by these primers. Only sub-orders which are targeted are shown in this instance.

3.2 Identification of C. vulgaris (SDC1) using 16S rRNA gene sequencing

PCR of total DNA isolated from both C. vulgaris (UTEX 259) and the solvent-degrading/tolerant environmental strain (SDC1) Chlorella sp. using the primer pairs 106F+781R and 359F+781R (Table 1) yielded amplified products of the expected sizes. On comparison of both SDC1 and UTEX 259 with other sequences present within GenBank and the Ribosomal Database Project (RDP) [27], it was found that SDC1 (GenBank accession No. AF350260) and UTEX 259 (GenBank accession No. AF350259) both aligned closely with additional C. vulgaris sequences, including the recognised type culture strain CCAP 211-11b [2].

On closer analysis, using the program Sequence Matrix available from RDP, it was noted that C. vulgaris (SDC1) was 96.4% identical to the culture strain UTEX 259, while C. vulgaris (UTEX 259) was 99.9% similar to CCAP 211-11b (Table 3). In addition, when multiple sequence alignment was performed using CLUSTALW 1.75, a 21-bp difference was noted between the environmental strain and the sequences of UTEX 259 (Fig. 1B), C-27 (Fig. 1B) and CCAP 211-11b (data not shown). A 21-bp difference in 578 bp represents a 3.6% difference in rRNA gene sequence. This 3.6% difference falls within the range previously reported for interoperon differences within algae of up to 5% [28] and interstrain variability of up to 16% [29].
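The arithmetic behind these figures is straightforward and can be checked with a short sketch. The function below computes percent identity between two already-aligned sequences; it is an illustrative helper (its name and gap handling are assumptions), not the Sequence Matrix or CLUSTALW algorithm.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two aligned sequences of equal length.

    Columns in which both sequences carry a gap ('-') are ignored; a gap
    opposite a base counts as a difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Figures from the text: 21 differing positions over a 578-bp fragment.
difference = 100.0 * 21 / 578
identity = 100.0 - difference
print(f"{difference:.1f}% difference, {identity:.1f}% identity")
# prints "3.6% difference, 96.4% identity"
```

The 96.4% identity obtained this way agrees with the value reported for SDC1 against UTEX 259.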

Table 3. Similarity matrix for small subunit rRNA gene fragments housed at the Ribosomal Database Project [27] comparing the environmental strain (SDC1), C. vulgaris (UTEX 259) and the 19 most closely related organisms (current records, as of March 2001, from BLAST used as sequence comparison source)

Figure 1. A: Gene map of the C. vulgaris C-27 chloroplast genome (adapted from NCBI BLAST (accession No. NC_001865) [13]), showing the exact location of the gene fragments sequenced within this study. B: Comparison of 16S rRNA gene fragments of two type cultures (UTEX 259 (GenBank accession No. AF350259) and C-27) and the solvent-degrading environmental strain (SDC1) of C. vulgaris (GenBank accession No. AF350260) using CLUSTALW (nucleotide mismatches highlighted in red).

We have also observed, although somewhat unexpectedly, that the environmental strain (SDC1) and the type strains (UTEX 259 and CCAP 211-11b) of C. vulgaris are genetically very similar to Chlorella sorokiniana over this 16S rRNA gene fragment region (Table 3), being 95.0%, 94.4% and 94.3% identical, respectively. These results support the widespread view that, even though there is a distinct difference between the two taxa in the G+C content of their DNA (62% and 56%, respectively [3]), both morphologically and genetically there is no doubt that C. sorokiniana is closely related to C. vulgaris. They also lend strong support to the hypothesis that C. vulgaris might have evolved from C. sorokiniana by the loss of the ‘primitive’ characteristics of hydrogenase activity and thermophily [2]. Moreover, of the 19 additional sequences included within the similarity matrix, only eight were not obtained from chloroplast genomes. Of those, six are environmental clones, described as either cyanobacteria or chloroplast genomes, although little detail currently exists on their exact origin (Table 3).
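The G+C content quoted above is a simple base-composition statistic. As a minimal sketch of how it is computed for any DNA sequence (the function name is an assumption, and ignoring ambiguous characters such as N is a choice made here for illustration):

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


# Hypothetical toy sequence, not taken from either Chlorella species.
print(round(gc_content("ATGCGCNNAT"), 1))
```

The published values of 62% and 56% refer to the DNA of the two taxa as reported in [3]; a short fragment such as the one sequenced here would not necessarily show the same composition.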

3.3 Location of 16S rRNA sequence fragment within the chloroplast genome

In order to determine whether the 16S rRNA gene sequence obtained for both the type culture (UTEX 259) and the environmental strain (SDC1) did indeed reside within the chloroplast genome of C. vulgaris, as postulated, rather than in the nucleus or mitochondria, multiple sequence alignment was used to align these two sequences with the full chloroplast genome of C. vulgaris C-27 (GenBank accession No. AB001684) [13] (Fig. 1A). Using the genomic map for this particular chloroplast (GenBank accession No. NC_001865), we were able to place our sequence fragment at the 22.439–23.025-kb position only (Fig. 1A). These data also support the previous finding of Kapoor et al. [30] that there is only one copy of the 16S rRNA gene within the chloroplast of this particular unicellular alga. Finally, we cross-referenced our results with all sequences of C. vulgaris within GenBank and found that the only matches occurred for sequences which originated within the chloroplast and not within the nuclear or mitochondrial genomes of this particular alga.
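Placing a fragment on a genome and checking that it occurs only once can be sketched as follows. This is a simplification: it uses exact string matching on both strands, whereas the study used multiple sequence alignment, which tolerates the mismatches seen between strains. The sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_fragment(fragment, genome):
    """All exact occurrences of fragment in genome, on either strand.

    Returns sorted (start, end, strand) tuples with 1-based inclusive
    coordinates on the given genome strand. A fragment equal to its own
    reverse complement is reported once per strand.
    """
    hits = []
    for strand, query in (("+", fragment), ("-", reverse_complement(fragment))):
        pos = genome.find(query)
        while pos != -1:
            hits.append((pos + 1, pos + len(query), strand))
            pos = genome.find(query, pos + 1)
    return sorted(hits)


# Hypothetical toy genome and fragment.
genome = "TTTTACGGTCAAAA"
hits = locate_fragment("ACGGTC", genome)
print(hits)
print("single copy" if len(hits) == 1 else f"{len(hits)} copies")
```

A single hit corresponds to the situation described above, where the fragment maps to one position only; for real data an alignment-based search such as BLAST would be needed to allow for sequence differences.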


The origin and evolution of photosynthesis has long remained inscrutable due to a lack of sequence information of photosynthesis genes across the entire genetic domain. However, it is now widely accepted that photosynthetic eucaryotes derived from several endosymbiotic events involving the uptake of a cyanobacterium by one or more non-photosynthetic, phagotrophic hosts [31–33]. Previous phylogenetic studies using small subunit rRNA sequence data suggest that cyanobacteria, oxychlorobacteria (prochlorophytes) and plastids are the products of an explosive evolutionary radiation, the deep-branching order of which has only recently been resolved [32,34]. Furthermore, a number of gene clusters have been shown to be conserved between cyanobacteria and plastids.

Recently, Nübel et al. [16] developed what they thought were cyanobacteria-specific primers, for use on both axenic and non-axenic but unialgal cultures, and applied them to the study of cyanobacterial biodiversity. Within that study Nübel et al. also tested these primers on two diatoms (plastids), resulting in the successful amplification of the target region [16]. However, it seems that the full potential of these primers was never realised. Our laboratory has been working routinely with these primers for several years [17,35,36]. Using web-based bioinformatics tools to screen these primers against all published 16S rRNA sequences for Archaea, Eubacteria and Eucaryotes present within GenBank as of March 2001, we found that the primer pair 359F+781R targeted not only cyanobacterial, but also eucaryotic 16S rRNA gene fragments located within the chloroplasts of algae and land plants (Table 2).

This information was then put into practice to positively identify an environmental strain (SDC1) of the Chlorella genus of freshwater unicellular green algae. When compared with both an internal standard (type strain UTEX 259) and GenBank submissions, we gained a 97% match with the type strain for C. vulgaris and have identified this solvent-degrading and solvent-tolerant strain as such [5]. The genetic sequence we gained using these primers, although only ∼600 bp in length, can be used successfully not only to positively identify the organism in question, but also to infer phylogenetic associations with similar accuracy to phylogenetic reconstructions involving the use of complete 16S rRNA gene sequences [37]. Thus, we have shown here, using various bioinformatics tools and two strains of a phototrophic eucaryote, that the primers originally designed by Nübel et al. for use in cyanobacterial studies can be adapted for far greater use and benefit. They are suited not only to biodiversity studies of cyanobacterial communities, but also to the identification of phototrophic eucaryotes, either for the positive identification of new species or for phylogenetic relatedness studies.

Potential breakthroughs in areas such as pollution abatement are often hampered by the use of uncharacterised biological systems. Positive identification of species would allow for a more comprehensive evaluation and comparison across studies. In general, rRNA genes are considered to be the most appropriate targets for identification and evolutionary relatedness studies, being more conserved in structure and function than other genes [15]. Comparison of the rpoC1 gene [38] and genetic studies targeting the tRNALeu (UAA) intron [39], for example, have been described as alternative methods of phylogenetic analysis. However, the sequence data available from these methods are rather limited, whereas the determination of 16S rRNA gene sequences is a routine procedure in procaryotic taxonomy today. This results in large and steadily growing databases which improve the robustness of phylogeny reconstructions, identification results and primer specificity evaluations. Other molecular biological approaches which have been described for species identification are applicable exclusively to axenic cultures. These include multiplex randomly amplified polymorphic DNA analysis [40] and sequencing of the 16S–23S rRNA gene internal transcribed spacer [41]. Other approaches apply to certain organisms with novel compounds or activities (either active or redundant), such as phycocyanin [42], nitrogenase [43], hydrogenase [21] or cyanobacterial toxins [44].

In this ‘post-genomics age’, it is vital to link biochemical data obtained from type strains to environmental strains studied using information present within public databases, to aid in the transition towards full structure-to-function studies. In harnessing this information, the pathway to improved biotechnological processes will be considerably shortened if common features between potentially useful microorganisms can be identified and categorised a priori.


The authors acknowledge financial assistance from Heriot-Watt University's Multi-disciplinary PhD Scholarship program. This article is supported by a grant from the British Council/ICCTI Protocol 2000/2001 collaborative visit scheme, PRAXIS/P/BIA/13238/98, the UK's Biotechnology and Biological Sciences Research Council (grant No. 97/E11124) and the Engineering and Physical Science Research Council (grant No. GR/M90917).