Genetic analysis of natural populations of the marine diazotrophic cyanobacterium Trichodesmium


*Corresponding author. Tel.: +46 (480) 44 73 10; Fax: +46 (480) 44 73 10


The genetic diversity of Trichodesmium, a marine nitrogen-fixing non-heterocystous cyanobacterium of great ecological importance, was examined using the partial gene sequences of the small subunit ribosomal RNA (16S rDNA) gene and the regulatory gene hetR. Different species and morphotypes (fusiform and spherical colonies) of Trichodesmium were collected in the northern Caribbean Sea, the central Atlantic Ocean and southern Pacific Ocean. The trichome morphologies were observed with light microscopy before DNA extraction and PCR amplification. Phylogenetic analysis of 16S rDNA revealed that all cyanobacterial sequences from the colonies of Trichodesmium spp. were very closely related to the laboratory culture of Trichodesmium sp. NIBB 1067. The overall results from the hetR analysis were congruent with those from the 16S rDNA analysis, but the variation in nucleotide sequence between different species was higher within the hetR gene. The sequence data showed that three main clades were represented. One clade comprised sequences from Trichodesmium hildebrandtii Gomont and Trichodesmium thiebautii Gomont (including both its fusiform and spherical colony forms). Sequences from Trichodesmium contortum Wille and Trichodesmium tenue Wille constituted a second clade. The third clade contained Trichodesmium erythraeum Ehrenberg together with the two laboratory strains of this species, Trichodesmium sp. NIBB 1067 and Trichodesmium sp. IMS 101.


The genus Trichodesmium was first described by Ehrenberg [1] and has since been recorded in all tropical and subtropical oceans. Trichodesmium spp. are among the most widespread and abundant cyanobacteria on earth, and are believed to be an important source of fixed nitrogen in the open oceans [2–5]. Many aspects of the biology of Trichodesmium have been investigated [5], but the underlying genetic basis for morphological variations has not been studied in detail.

One of the key features of the genus Trichodesmium is its ability to form colonies, which can be divided into two main categories, fusiform (tuft) and spherical (puff), consisting of several hundreds of trichomes. Although trichome morphology may vary from colony to colony, the trichomes within a colony are very uniform in size and morphology, suggesting that the individual colonies are clonal [6]. Two cultures of Trichodesmium have been reported, Trichodesmium sp. NIBB 1067 isolated from the Pacific Ocean [7] and Trichodesmium sp. IMS 101 from the northern Atlantic Ocean [8]. Based on genetic and morphological observations, these cultures are isolates of T. erythraeum [6,8,9]. In natural populations in the northern Atlantic Ocean, five species of tufts and one complex group of puffs have been described [6]. Two of these, T. erythraeum and T. thiebautii, have been compared genetically by using a fragment of nifH [9,10]. Small genetic differences were detected between them. For the other species, T. contortum, T. hildebrandtii and T. tenue, molecular data are lacking.

The aim of the present study was to investigate the genetic variation among all forms of Trichodesmium spp. found in the Atlantic Ocean [6]. This is of importance as most experiments performed in situ concern the colony stage of Trichodesmium spp., and in particular the tuft type of colonies, and there are sometimes subtle morphological differences between different species or morphotypes. The laboratory strain Trichodesmium sp. IMS 101 was also included, as this is the only strain that has been successfully grown in several laboratories. The DNA of the collected species was analysed for the nearly complete sequence of the small subunit of ribosomal RNA (16S rDNA). This gene was chosen because of its unmatched reference data for cyanobacteria [11–15]. In addition, partial sequences of hetR, a regulatory gene only reported from filamentous cyanobacteria [16,17], were also used. The hetR gene encodes a serine-type protease [18], involved in heterocyst [16] and akinete [19] differentiation. It is also present and expressed in species that, like Trichodesmium spp., do not develop these specialised cells [17].

2 Materials and methods

2.1 Collection and DNA isolation

Colonies of Trichodesmium spp. were collected during a cruise in the Bahamas and northern Caribbean Sea in January 1995 and on a cruise crossing the central Atlantic Ocean in April 1996, on the R/V Seward Johnson (Fort Pierce, FL, USA). Samples from the southern Pacific Ocean were collected during a cruise in March–April 1998, on the R/V Roger Revelle (San Diego, CA, USA). The colonies were collected from net-tow samples, using a plastic inoculation loop, dispensed in droplets of filter-sterilised seawater and transferred to a 0.5 ml counting chamber. Species assignment of the trichomes in the colonies was made by microscopic observations of their morphology, after which the colonies were transferred back into filtered seawater. The colonies used for the 16S rDNA sequence analyses, all collected in the northern Caribbean Sea, and their sampling locations are given in Table 1. The 16S rDNA sequence from Trichodesmium sp. NIBB 1067 [13] represents T. erythraeum, which served as a control for the unlikely event that sequences obtained from the pooled sample of 25 T. erythraeum colonies originated from other species. The morphology of the trichomes in the colonies used for analyses of 16S rDNA and hetR sequences (Table 1) corresponded to those described previously [6] with two exceptions. First, the colony stage of T. contortum was not included in the previous study [6]. These colonies were large, 5 mm long and 3 mm broad, and pigmented with a pale straw-like colour. The trichomes were ca. 35 μm broad with cells 5 μm long and the gas vesicles were mostly oriented in the centre of the cells. Second, one of the T. tenue colonies contained trichomes with larger, 10 μm wide cells and dark red pigmentation, in contrast to the typical T. tenue [6]. The hetR sequence from T. erythraeum collected in the northern Caribbean Sea was already known [17].

Table 1. Trichodesmium colonies collected during research cruises in the Caribbean Sea, central Atlantic Ocean and southern Pacific Ocean and used for amplification of 16S rDNA and hetR

| Taxa | Gene | Sampling location | Colonies | GenBank accession number |
| --- | --- | --- | --- | --- |
| T. contortum | 16S rDNA | Caribbean Sea (26°59′N, 78°57′W) | 1 | AF013028 |
| T. erythraeum | 16S rDNA | Caribbean Sea (18°02′N, 63°17′W) | 25 | AF013030 |
| T. hildebrandtii | 16S rDNA | Caribbean Sea (24°01′N, 74°60′W) | 1 | AF091322 |
| T. tenue | 16S rDNA | Caribbean Sea (26°59′N, 78°57′W) | 1 | AF013029 |
| T. thiebautii | 16S rDNA | Caribbean Sea (18°02′N, 63°17′W) | 1 | AF013027 |
| T. thiebautii puff type | 16S rDNA | Caribbean Sea (19°40′N, 69°20′W) | 1 | AF091321 |
| Trichodesmium sp. IMS(a) 101 | hetR | Off the coast of North Carolina, Atlantic Ocean | n.a.(b) | AF091323 |
| T. contortum | hetR | Caribbean Sea (21°07′N, 72°00′W) | 5 | AF013031 |
| T. hildebrandtii | hetR | Caribbean Sea (23°19′N, 74°92′W) | 5 | AF013032 |
| T. tenue | hetR | Atlantic Ocean (2°03′N, 32°54′W), (4°41′N, 31°16′W) | 2+1 | AF013033 |
| T. thiebautii | hetR | Pacific Ocean (13°56′S, 173°12′W) | 1 | AF091325 |
| T. thiebautii puff type | hetR | Pacific Ocean (14°10′S, 178°16′W) | 1 | AF091324 |

All colonies were of the tuft type unless otherwise stated. The gene that has been analysed is indicated in the second column and the number of colonies in each sample in the fourth. For further details see text.
(a) Culture collection held at Institute of Marine Science, at the University of North Carolina, Morehead City, NC, USA.
(b) n.a., not applicable, DNA was obtained from laboratory culture.

Samples were then either kept frozen in liquid N2 or DNA was extracted directly using a modified protocol for extraction and purification with cetyl trimethyl ammonium bromide (CTAB) [20]. Typically, one frozen or freshly collected colony was transferred to a microcentrifuge tube containing 100 μl TE (10 mM Tris-HCl, 1 mM EDTA, pH 8) and 20 μl of lysozyme (5 mg ml−1 in TE) was added, followed by incubation for 1 h at 37°C. After lysozyme treatment, 13 μl of SDS (10% w/v) was added, followed by incubation for 10 min at 70°C. A 20 μl aliquot of 5 M NaCl was added and mixed by inverting the tube, and 20 μl of 10% (w/v) CTAB/0.7 M NaCl was added and incubated for 10 min at 56°C. The solution was then extracted with 200 μl of chloroform and centrifuged for 10 min at 15 000×g. The collected aqueous phase was mixed with an equal volume of isopropanol and incubated for at least 2 h at −20°C. The precipitates were spun down and washed with 70% (v/v) ethanol. The yield of DNA with this procedure was ca. 40 ng per Trichodesmium colony as estimated by gel electrophoresis and ethidium bromide staining. The DNA from the laboratory culture Trichodesmium sp. IMS 101 was extracted according to the same protocol.

2.2 PCR and cloning

The primers OX1 and OX2 were used to amplify 16S rDNA sequences [21], under the following PCR conditions in a thermal cycler (MJ Research, Watertown, MA, USA). Samples of 10 μl (approximately 5 ng DNA) were mixed with 200 mM dNTP, 5% acetamide, 200 mM of both OX1 and OX2, 3 mM MgCl2, Taq DNA polymerase reaction buffer and 2.5 U of Taq DNA polymerase (Promega, Madison, WI, USA) in a total reaction volume of 100 μl and an overlay of mineral oil. The Taq polymerase was added after the initial denaturation step at 80°C (hot start), followed by 30–35 cycles of 1 min at 50°C, 2 min at 72°C with an extension of 5 s for each cycle and 1 min at 94°C. All clones were derived from single PCR reactions and were obtained according to the instructions provided with a TA cloning kit (Invitrogen, San Diego, CA, USA). The 448 bp fragment of hetR, corresponding to position 159–606 in Nostoc sp. PCC 7120 [16], was amplified with Pfu (Stratagene, La Jolla, CA, USA) and primers hetr1 (Nostoc sp. PCC 7120 position 139–158; 5′-AARTGYGCNATHTAYATGAC-3′) and hetr2 (Nostoc sp. PCC 7120 position 607–625; 5′-CRATRAANGGYATNCCCCA-3′) [17]. These PCR products were cloned using a blunt-end cloning kit (Stratagene).
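The hetr1 and hetr2 primers are degenerate, so each written sequence stands for a pool of oligonucleotides. As an illustration only, the following sketch counts how many distinct sequences each primer encodes under the standard IUPAC nucleotide codes and checks that the quoted coordinates give a 448 bp fragment. The function names are hypothetical and are not taken from the original study.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can represent.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

# Primer sequences exactly as quoted in the text (5' to 3').
HETR1 = "AARTGYGCNATHTAYATGAC"
HETR2 = "CRATRAANGGYATNCCCCA"


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct oligonucleotides a degenerate primer encodes."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


def span_length(first: int, last: int) -> int:
    """Return the length of an inclusive coordinate range."""
    return last - first + 1


if __name__ == "__main__":
    for name, seq in (("hetr1", HETR1), ("hetr2", HETR2)):
        print(f"{name}: {len(seq)} nt, {primer_degeneracy(seq)} variants")
    # Fragment between the primers, positions 159-606 in Nostoc sp. PCC 7120.
    print("fragment length:", span_length(159, 606), "bp")
```

Under these codes hetr1 represents 96 sequences and hetr2 represents 128, and the primer lengths (20 and 19 nt) agree with the quoted positions 139–158 and 607–625.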

2.3 Sequencing and sequence analyses

Plasmids used in sequencing were prepared by alkaline lysis using the Prep-A-Gene DNA purification system (Bio-Rad, Richmond, CA, USA). The inserts were sequenced bi-directionally using an ABI model 373 A automated sequencer (Applied Biosystems, Foster City, CA, USA) and dye-terminator chemistry, using primers from conserved regions of 16S rDNA [22] together with primers M13F and M13R on the cloning vector. Three clones from each 16S rDNA plasmid library were analysed. The number of hetR clones sequenced from each plasmid library varied between one (for the hetR clones in T. thiebautii puff and tuft type) and five (for the hetR clones from T. erythraeum). Two sequences from the same plasmid library did not show more variation than could be accounted for by Taq or Pfu polymerase errors [23]. The sequences were aligned using CLUSTAL W 1.7 [24] and, when necessary, edited manually using SEAVIEW [25]. Trees with bootstrap analysis of the data were inferred using PHYLOWIN [25] running on a Linux operating system. For the quartet puzzling maximum likelihood analysis the program PUZZLE was used, with its default settings except that parameter estimation was set to exact [26]. The GenBank accession numbers for the sequences determined in this study are listed in Table 1. Sequences not determined in this study were retrieved from GenBank under the following accession numbers: Leptolyngbya sp. PCC 73110 (X84809; AF013036), Nostoc sp. PCC 7120 (X59559; M37779), Trichodesmium sp. NIBB 1067 (X70767) and T. erythraeum (AF013034).


3.1 Analysis of 16S rDNA

For an overview of the variation between the six different species and morphotypes, the nearly complete 16S rDNA sequence (1438 nucleotides) was determined. In most cases the sequences obtained were identified as cyanobacterial, but other sequences were also encountered occasionally. This was the case in clones from the colonies of T. thiebautii (tuft type) and T. hildebrandtii. The contaminating sequences were most similar to Microscilla (=Flexibacter) marina Pringsheim (GenBank accession number M58793), a heterotrophic non-septated filamentous bacterium belonging to the Cytophaga group. One such sequence, from a T. thiebautii colony, was 91% identical to the M. marina 16S rRNA gene (position 1015–1332).

The phylogenetic tree inferred with the Trichodesmium spp. 16S rDNA sequences using maximum likelihood quartet puzzling (Fig. 1A) showed that the species were divided into three groups. One group contained T. hildebrandtii and the two colony types of T. thiebautii, a second contained T. contortum and T. tenue, and a third contained T. erythraeum and Trichodesmium sp. NIBB 1067. Phylogenetic programs based on distance and parsimony tree-building methods resulted in identical tree topologies (data not shown).

Figure 1.

Phylogenetic trees inferred from natural populations of Trichodesmium spp. 16S rDNA and hetR sequence data. The origin of the sample is indicated as AO for the central Atlantic Ocean, CS for the northern Caribbean Sea and PO for the southern Pacific Ocean. The scale bars indicate the branch length corresponding to the number of substitutions per sequence position. A: Tree constructed using 16S rDNA sequence data and quartet puzzling maximum likelihood. The tree was constructed on the basis of 1394 positions, ignoring positions with gaps and sequence ambiguities and using Leptolyngbya sp. PCC 73110 as an outgroup. The observed value of the transition/transversion ratio was 1.13, and was used in the maximum likelihood analysis. The maximum likelihood index of the tree was ln(L)=−3619.47. The numbers at the branches are relative likelihood scores. B: Phylogenetic tree inferred from hetR gene sequence data using quartet puzzling maximum likelihood analysis. A total of 448 aligned positions were used in the analyses, with the transition/transversion ratio set to the observed 1.74. The maximum likelihood index of the tree was ln(L)=−1775.16. The numbers at the branches are the percentage of 1000 quartet puzzling steps in which each subset of sequences was grouped together. C: Phylogenetic tree inferred from translated hetR gene sequence data using maximum parsimony analysis. The total 149 amino acid sites had 25 informative sites that were used in the maximum parsimony analysis. The resulting tree required 80 steps. The numbers at the branches are bootstrap percentages from 500 replicates.

The sequences from T. contortum, T. erythraeum, T. tenue and T. thiebautii (tuft type) showed a sequence identity ranging from 97.6 to 99.2% (Table 2). The sequence from T. hildebrandtii was 99.6% identical to both the tuft and puff forms of T. thiebautii. The sequences from T. erythraeum and the cultured isolate of this species, Trichodesmium sp. NIBB 1067, were 99.7% similar.

Table 2. Comparison of similarities of 16S rDNA gene sequence data

| Taxa | T 1067 | T e | T t | T c | T th | T thp | T h |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T 1067 | 100 | | | | | | |
| T e | 99.7 | 100 | | | | | |
| T t | 98.3 | 98.1 | 100 | | | | |
| T c | 98.4 | 98.1 | 99.2 | 100 | | | |
| T th | 97.9 | 97.7 | 97.9 | 98.0 | 100 | | |
| T thp | 97.9 | 97.7 | 97.9 | 98.0 | 99.6 | 100 | |
| T h | 97.8 | 97.6 | 97.8 | 97.9 | 99.6 | 99.4 | 100 |

The sequence identities were calculated as percentages for the 16S rDNA sequences from Trichodesmium colonies. T 1067=Trichodesmium sp. NIBB 1067, T c=T. contortum, T e=T. erythraeum, T h=T. hildebrandtii, T t=T. tenue, T th=T. thiebautii (tuft type), T thp=T. thiebautii (puff type).
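For readers who want to reproduce this kind of pairwise comparison, a percentage identity can be computed from two aligned sequences along the following lines. This is a minimal sketch that assumes positions with gaps or ambiguity codes are excluded, as was done for the tree in Fig. 1A; the exact procedure behind the table values is not stated in the text, and the function name and toy sequences are hypothetical.

```python
BASES = {"A", "C", "G", "T"}


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical bases over unambiguous, ungapped aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, invented for this example.
    print(round(percent_identity("ACGTACGTAC", "ACGTACGTAT"), 1))
    print(round(percent_identity("ACG-ACGTAC", "ACGTACGTAT"), 1))
```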

3.2 Analysis of hetR

Since the nucleotide differences in the 16S rDNA sequence data were very small, the partial sequence of a protein-encoding gene, hetR, with a potentially higher variability between closely related strains was determined. The hetR sequences from Trichodesmium spp. strains showed clear differences and produced well resolved trees (Fig. 1B, C). The primers, hetr1 and hetr2, proved to be successful on all samples tested. No contaminating sequences were detected in the samples. Sequencing of cloned PCR products generated by using primers for the hetR gene gave no indication of paralogous or non-identical orthologous copies within individual colonies. The clones within each Trichodesmium species or morphotype were either identical or, rarely, one nucleotide of the 448 differed from the other clones, possibly due to Pfu polymerase or sequencing errors. The two T. tenue morphotypes from the central Atlantic Ocean had identical hetR sequences and are therefore presented as one sequence. The hetR sequence of T. thiebautii (both tuft and puff forms) from the southern Pacific Ocean, and that of T. hildebrandtii from the northern Caribbean Sea, were very similar (Table 3). The sequence of T. erythraeum from the north Caribbean Sea [17] was almost identical to the cultured isolate of this species, Trichodesmium sp. IMS 101 (isolated off the coast of North Carolina), differing by only two nucleotides, both in the third codon position.
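The nucleotide counts in this paragraph can be converted to percentage identities with simple arithmetic. The sketch below assumes that identities are expressed over the 448 aligned hetR positions; under that assumption, two differing nucleotides correspond to about 99.6% identity, and a single differing nucleotide to about 99.8%. The function name is hypothetical.

```python
HETR_POSITIONS = 448  # aligned hetR positions stated in the text


def identity_from_differences(differences: int, positions: int) -> float:
    """Convert a count of differing positions into a percentage identity."""
    if positions <= 0 or not 0 <= differences <= positions:
        raise ValueError("differences must lie between 0 and positions")
    return 100.0 * (positions - differences) / positions


if __name__ == "__main__":
    # Two differences: T. erythraeum versus Trichodesmium sp. IMS 101.
    print(round(identity_from_differences(2, HETR_POSITIONS), 1))
    # One difference: the rare within-sample clone variation.
    print(round(identity_from_differences(1, HETR_POSITIONS), 1))
```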

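Table 3. Comparison of similarities of hetR gene sequences

| Taxa | T 101 | T e | T t | T c | T th | T thp | T h |
| --- | --- | --- | --- | --- | --- | --- | --- |
| T 101 | 100 | | | | | | |
| T e | 99.6 | 100 | | | | | |
| T t | 94.0 | 94.0 | 100 | | | | |
| T c | 92.9 | 92.4 | 96.7 | 100 | | | |
| T th | 91.5 | 91.5 | 92.4 | 91.1 | 100 | | |
| T thp | 89.7 | 89.7 | 90.8 | 89.5 | 98.2 | 100 | |
| T h | 90.8 | 90.8 | 92.0 | 90.6 | 98.2 | 97.8 | 100 |

The sequence identities were calculated as percentages for the hetR sequences from Trichodesmium colonies. T 101=Trichodesmium sp. IMS 101, other abbreviations as in Table 2.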

The tree based on hetR nucleotide sequences was produced using maximum likelihood analysis (Fig. 1B) and maximum parsimony was used for the translated amino acid sequence (Fig. 1C). Distance analysis in combination with neighbor joining resulted in identical tree topologies (data not shown). The topology of both these hetR trees was almost identical with that obtained using 16S rDNA sequences (Fig. 1A). Again, the sequences of T. contortum and T. tenue were clustered together, but with a slightly lower likelihood confidence value. The T. erythraeum sequence grouped with the sequence from Trichodesmium sp. IMS 101. In the other clade T. thiebautii tuft and puff type sequences grouped together, with the T. hildebrandtii sequence ancestral to them, which is a slightly different configuration compared to the 16S rDNA tree.


The aim of the study was to investigate the genetic diversity among the different forms of Trichodesmium spp. identified previously [6]. A detailed genetic comparison between colonies of different morphology had not been performed earlier. The use of 16S rDNA in genotypic studies is widespread, but the resolution is sometimes too low to distinguish closely related species. It has been shown that bacteria with identical 16S rDNA sequences can still show considerable variations in the rest of the genome, implying that they do not belong to the same species [27,28]. A value of 97.5% identity of 16S rDNA sequences has been estimated to be the upper limit at which two organisms with relatively high certainty are not related at the species level [29]. This value is lower than all sequence similarities found between the Trichodesmium spp. sequences. Hence, there is not enough variation within the 16S rDNA gene to separate the Trichodesmium morphotypes into separate species with a high degree of certainty. The differences and similarities between morphotypes were confirmed by including the hetR gene in the analysis.
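The comparison with the 97.5% threshold can be checked directly against the pairwise values in Table 2. The following sketch stores the lower triangle of that table, finds the least similar pair, and lists any pair at or below the threshold. The values are copied from Table 2; the variable and function names are hypothetical.

```python
TAXA = ["T 1067", "T e", "T t", "T c", "T th", "T thp", "T h"]

# Lower triangle of Table 2 (16S rDNA identities, %), diagonal omitted.
LOWER_TRIANGLE = [
    [],
    [99.7],
    [98.3, 98.1],
    [98.4, 98.1, 99.2],
    [97.9, 97.7, 97.9, 98.0],
    [97.9, 97.7, 97.9, 98.0, 99.6],
    [97.8, 97.6, 97.8, 97.9, 99.6, 99.4],
]

SPECIES_THRESHOLD = 97.5  # identity value cited from reference [29]


def all_pairs():
    """Yield (identity, row taxon, column taxon) for every pair in the table."""
    for i, row in enumerate(LOWER_TRIANGLE):
        for j, value in enumerate(row):
            yield value, TAXA[i], TAXA[j]


def lowest_pair():
    """Return the pair with the lowest identity."""
    return min(all_pairs())


def pairs_at_or_below(threshold: float):
    """Return all pairs whose identity does not exceed the threshold."""
    return [pair for pair in all_pairs() if pair[0] <= threshold]


if __name__ == "__main__":
    print("lowest pair:", lowest_pair())
    print("pairs at or below threshold:", pairs_at_or_below(SPECIES_THRESHOLD))
```

The lowest value in the table is 97.6% (T. hildebrandtii versus T. erythraeum), so no pair falls at or below 97.5%.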

It is unlikely that the hetR sequences obtained here are from other very narrow filamentous cyanobacteria reported to be associated with Trichodesmium spp. colonies [30], as (1) the hetR sequences within each sample were very closely related, and (2) all of the cyanobacterial 16S rDNA sequences recovered were closely related to the sequence of Trichodesmium sp. NIBB 1067. Thus, no other cyanobacteria seemed to be present in these colonies. However, the 16S rDNA sequences of the Flexibacter type were present in samples from T. thiebautii (tuft type) and T. hildebrandtii. Filamentous heterotrophic bacteria have been reported to be abundant in colonies of Trichodesmium spp. [31] but no affiliation was suggested for these organisms.

The spherical colony form (puff) was originally described as Heliothrichum radians [32], but was later divided into two species, T. thiebautii [33] and T. tenue [34]. Puff colonies of T. tenue were never observed during the three cruises and were therefore not examined. Sequences from the T. thiebautii puff type showed very high similarity to the sequence from its tuft type and to that of T. hildebrandtii, and all three were in the same clade in both hetR and 16S rDNA trees. Based on these observations it seems more appropriate to refer to T. hildebrandtii, T. thiebautii and T. thiebautii puff as closely related strains rather than different species.

The 16S rDNA sequence analysis suggested that T. erythraeum and Trichodesmium sp. NIBB 1067 are closely related. Indeed, both the nifH gene sequence and morphological characters suggest that Trichodesmium sp. NIBB 1067 and T. erythraeum are closely related strains [7,9]. The small differences in 16S rDNA sequences may be due to the presence of non-identical copies of the 16S rRNA gene in T. erythraeum. Such sequence heterogeneity between copies of the 16S rRNA gene within the same genome is known from other bacteria [35–39]. It is also possible that genotypically different strains occur in the northern Caribbean Sea and in the southern Pacific Ocean. In the case of hetR sequences, T. erythraeum is most closely related to the laboratory strain Trichodesmium sp. IMS 101, which is in agreement with morphological observations [40].

All the T. contortum colonies and the T. tenue colony used for 16S rDNA amplification were collected in the northern Caribbean Sea, while the two T. tenue samples used for hetR amplification were from the central Atlantic Ocean. Nevertheless, a close relationship between T. contortum and T. tenue is supported in both 16S rDNA- and hetR-based phylogenetic analyses. This indicates that these morphologically very different species are closely related. A similar situation was observed within another group of Oscillatorian strains [41]. T. contortum is the largest Trichodesmium species with cell diameters ranging between 35 and 50 μm, while T. tenue is the smallest with cell diameters ranging between 3 and 10 μm. The autoecology of these two species is not known, although a relatively large population of T. tenue was encountered at 75 m in central areas of the Atlantic Ocean and was normally not encountered in the upper 25 m. This indicated a different behaviour by this species compared to other Trichodesmium species which are typically found nearer the surface. The larger (10 μm) and intensely red pigmented morphotype of T. tenue collected at 75 m has not been described before. Its hetR sequence was identical to the one from the typical T. tenue, which again emphasises the large morphological variation and lack of genetic variability within the genus Trichodesmium.

The lack of variation between sequences within each sample, the identical sequences of the two T. tenue morphotypes and the very high similarity between the T. erythraeum sequences and those from the cultured isolates Trichodesmium sp. NIBB 1067 and Trichodesmium sp. IMS 101 suggest that genetic differences between identical morphotypes are small to negligible. In contrast, sequences from T. thiebautii (tuft and puff type) and T. hildebrandtii showed a relatively higher genetic variation and incongruent topologies between 16S rDNA- and hetR-based trees. The different topologies could be due to differences in evolutionary rates between the two genes. Alternatively, the genetic diversity within this subgroup of Trichodesmium is far more complex than expected, leading to difficulties in retrieving identical genotypes by microscopic identifications. A more detailed investigation of this subgroup is therefore needed.

In a study comparing a 324 bp fragment of the nifH gene between Trichodesmium NIBB 1067, T. erythraeum and T. thiebautii, similarities among the strains were 98% or higher [9]. These similarities are close to those obtained here for the 16S rDNA gene sequences. The hetR gene, on the other hand, showed a mean sequence identity of ca. 91% between T. erythraeum and members of the clade containing T. thiebautii. Taken together, the phylogeny of hetR gene sequences from Trichodesmium correlated with that of 16S rDNA, but the number of variable positions between two closely related strains was higher and there appeared to be no risk of accidental amplification of sequences from other types of bacteria. The hetR gene should therefore be a good marker to study genetic diversity and molecular phylogeny of filamentous cyanobacteria in natural populations.
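The figure of ca. 91% can be recovered from the pairwise values in Table 3, and the same calculation on Table 2 shows how much less the 16S rDNA gene separates the same taxa. The sketch below averages the identities between T. erythraeum and the three members of the T. thiebautii clade for each gene. The values are copied from Tables 2 and 3; treating the quoted mean as a simple arithmetic mean of these three pairs is an assumption.

```python
from statistics import mean

# Identity (%) of T. erythraeum (T e) to members of the T. thiebautii clade.
HETR_VS_T_E = {"T th": 91.5, "T thp": 89.7, "T h": 90.8}   # from Table 3
RDNA_VS_T_E = {"T th": 97.7, "T thp": 97.7, "T h": 97.6}   # from Table 2


def mean_identity(values: dict[str, float]) -> float:
    """Arithmetic mean of a set of pairwise identities."""
    return mean(values.values())


if __name__ == "__main__":
    print("hetR mean identity:", round(mean_identity(HETR_VS_T_E), 1))
    print("16S rDNA mean identity:", round(mean_identity(RDNA_VS_T_E), 1))
```

Under this assumption the hetR mean is about 90.7%, against about 97.7% for 16S rDNA, which illustrates the higher resolution of hetR among these taxa.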


We are thankful to Doug Capone and Jon Zehr for providing ship time and space on the R/V Seward Johnson (Fort Pierce, FL, USA). We thank the US NSF for ship support and an NSF research grant to E.J.C. Thanks to Marcelino Suzuki, Ena Urbach, Brian Lanoil, Doug Gordon, Mike Rappé and Terah Wright, for providing expert assistance in lab and computer techniques. Ulla Rasmussen is acknowledged for critical examination of the manuscript prior to submission. The Lars-Hiertas Minne and Bergvalls funds, the Swedish Foundation for International Co-operation in Research and Higher Education (STINT) and the Swedish Natural Science Research Council (NFR), are acknowledged for their financial support.