Characterization of Trichodesmium-associated viral communities in the eastern Gulf of Mexico


Correspondence: Ian Hewson, Department of Microbiology, Cornell University, Wing Hall 403, Ithaca, NY 14853, USA. Tel.: +1 607 255 0151; fax: +1 607 255 3904; e-mail:


Trichodesmium surface aggregations shape the co-occurring microbial community by providing organic carbon and nitrogen and surfaces on which microorganisms can aggregate. Rapid collapse of Trichodesmium aggregations leads to drastic changes in the chemical and physical properties of surrounding waters, eliciting a response from the microbial community and their associated viruses. Three viral metagenomes were constructed from experimentally lysed Trichodesmium collected from two locations in the eastern Gulf of Mexico. Trichodesmium were either treated with mitomycin C to induce potential lysogens or incubated in the absence of mitomycin C. Comparative analyses of viral contiguous sequences indicated that viral composition was responsive to treatment type. Cyanophages were more represented within incubations treated with mitomycin C, while gammaproteobacterial phages were more represented within the untreated incubation. The detection of latent bacteriophage integrases in both the chemically treated and untreated incubations suggests that Trichodesmium death may lead to prophage induction within associated microorganisms. While no single cyanophage-like genotype associated with Trichodesmium lysis could be identified that might point to an infectious Trichodesmium phage, reads resembling Trichodesmium were recovered. These data reveal a diverse consortium of lytic and temperate phages associated with Trichodesmium whose patterns of representation within treated and untreated libraries offer insights into the activities of host and viral communities during Trichodesmium aggregation collapse.


The marine diazotrophic cyanobacterium Trichodesmium spp. is an important component of tropical and subtropical oligotrophic marine ecosystems as a key source of nutrients and as a physical substrate for co-occurring microorganisms (Capone et al., 1997). Trichodesmium colonies support a wide diversity of associated microorganisms that likely consume the fixed C and N that the cyanobacterium excretes as organic carbon and nitrogen. In addition, Trichodesmium provides a ‘pseudobenthic’ habitat that other organisms may colonize and grow on (Sheridan et al., 2002; Hewson et al., 2009). In calm weather conditions, Trichodesmium colonies form dense surface aggregations that provide hot spots of biogeochemical cycling in oligotrophic surface waters. Both the cyanobacterium itself and bacteria that colonize the surface of Trichodesmium filaments are capable of enhanced phosphorus uptake relative to surrounding seawater bacterioplankton, which is a key adaptation for growth in phosphorus-limited cyanobacterial aggregations (Dyhrman et al., 2006; Van Mooy et al., 2012).

Trichodesmium aggregations undergo sudden population crashes, causing them to disappear in short periods of time in both culture and field settings (Ohki, 1999; Berman-Frank et al., 2004; Hewson et al., 2004; Rodier & Le Borgne, 2010). Although some copepods are capable of consuming Trichodesmium (O'Neil & Roman, 1992), predation is not believed to be the primary cause of large mortality events, as some species of Trichodesmium are known to excrete feeding deterrents (Hawser et al., 1992; Capone et al., 1997). One hypothesis for Trichodesmium mortality is autocatalyzed programmed cell death. This has been observed in the marine phytoplankton Emiliania huxleyi and was hypothesized to occur in Trichodesmium based on gene expression and translation of a caspase-like gene, associated with eukaryotic programmed cell death, during Trichodesmium demise (Berman-Frank et al., 2004, 2007). Concurrent research has shown, however, that viral lysis is the cause of bloom collapse in many bloom-forming marine algae (Nagasaki, 2008), and further research on Emiliania huxleyi bloom dynamics has revealed that a virus contributes to coccolithophore demise, with host lysis due to viral infection occurring through a pathway similar to a programmed cell death pathway that occurs in multicellular eukaryotes (Vardi et al., 2009). Viral infection of Trichodesmium is another hypothesized cause of rapid collapse of Trichodesmium aggregations. Previous studies have observed virus-like particle production from Trichodesmium exposed to the mutagen mitomycin C (Ohki, 1999; Hewson et al., 2004), commonly used in studies of lysogeny in seawater (Weinbauer & Suttle, 1999; McDaniel et al., 2001), in both laboratory and field incubations of Trichodesmium. Thus, the emergence of virus-like particles upon mitomycin C treatment suggests that Trichodesmium harbor prophage that may enter the lytic cycle under as-yet undefined conditions (Ohki, 1999; Hewson et al., 2004).

Trichodesmium aggregation collapse inevitably leads to changes in the physical and chemical conditions of seawater that likely impact associated microbial communities. Previous studies have found a wide diversity of organisms associated with Trichodesmium colonies and Trichodesmium aggregations. Bacterial abundance on Trichodesmium filaments is threefold higher per unit volume than abundance in the surrounding water (O'Neil & Roman, 1992; Sheridan et al., 2002), and the taxonomic structure of communities associated with Trichodesmium differs from that of typical open ocean bacterioplankton (Hewson et al., 2009). As with algal blooms, collapse of surface aggregations of Trichodesmium releases large pulses of dissolved and particulate organic matter from dying Trichodesmium cells, leading to a shift in the sources of C and N available for co-occurring microorganisms (Rodier & Le Borgne, 2010). In addition, collapse and disappearance of Trichodesmium colonies reduces the availability of physical surfaces onto which associated microorganisms may aggregate. The large chemical and physical perturbations caused by Trichodesmium death may lead to a shift in the activity of associated microorganisms and their viruses.

The aims of the current study were twofold. The first aim was to use high-throughput sequencing of the virus fraction of Trichodesmium aggregations upon collapse to examine the taxonomic structure of the associated viral consortia. The viral consortia present after Trichodesmium aggregations were treated with mitomycin C to induce temperate viruses were compared with those present after containment-induced Trichodesmium lysis. The second aim was to investigate potential viruses of Trichodesmium within these incubations. Results of sequence analyses suggest that viruses capable of infecting many different phylogenetic groups were present in the treated and untreated incubations. Phages of bacterial groups previously found in association with Trichodesmium were detected, and differences in relative representation of viral consortia between mitomycin C-treated and untreated incubations suggest that viruses that arose in response to Trichodesmium lysis without chemical treatment were different from the viruses induced by mitomycin C addition. The overall similarity between mitomycin C-treated and mitomycin C-untreated incubations, however, suggests that Trichodesmium demise may cause lysogenic induction even when no mutagen is added. No single viral genotype resembling known cyanophage made up a significant proportion of these libraries, and thus no definitive virus of Trichodesmium could be identified, leaving room for further inquiry into the cause of rapid Trichodesmium collapse.


Sample collection and experimental setup

Experiments were conducted in the eastern Gulf of Mexico onboard the R/V Pelican cruise EK0001, 6–10 October 2009. Trichodesmium colonies were collected at two stations: the mouth of Tampa Bay (27°32′50″N, 82°46′55″W) and station 9B (26°25′44″N, 82°30′58″W), both on the West Florida Shelf. These stations were chosen because dense Trichodesmium aggregations were visible at the water surface. Trichodesmium were obtained using a 1-m-diameter, 125-μm mesh plankton net that was towed behind the vessel for 5 min to collect colonies. Trichodesmium collected by net tow were further concentrated over a 10-μm Isopore (PC, Millipore) filter and resuspended in 250 mL of 10-μm-filtered seawater in acid-washed and seawater-rinsed polycarbonate bottles. At the Tampa Bay station, collected water was split into two incubations and mitomycin C was added (20 ng mL−1, final concentration) to one of the incubations, while the other incubation was left untreated. A third incubation with mitomycin C added was conducted at station 9B.

Trichodesmium were incubated for 12 h in flow-through outdoor incubators, which maintained ambient surface water temperature, under shade cloth that attenuated irradiance by 25%. After 12 h, Trichodesmium colonies in all incubations were green and the water pink, indicating the presence of cyanobacterial phycoerythrin in the water and thus Trichodesmium death. The entire incubation was filtered through 0.2-μm Durapore (PVDF; Millipore) filters, and the filtrate was transferred to sterile 50-mL tubes, which were frozen in liquid nitrogen until further analysis. Filtration through a 0.2-μm filter is necessary to remove bacteria and other large debris, but may have biased the composition of our metaviromic data sets because some viruses are around this size threshold, such as some Phycodnaviridae (Simpson et al., 2003) as well as long-tailed Siphoviridae (Sullivan et al., 2009).

Viral nucleic acid extraction and preparation

Purification of viral particles followed a modified filter-chloroform-nuclease treatment protocol (Ng et al., 2011). The 0.2-μm-filtered Trichodesmium lysates were precipitated in 10% polyethylene glycol (PEG 8000, weight/volume) at 4 °C for 24 h. The PEG-treated lysates were then centrifuged at 30 000 g for 30 min. The supernatant was decanted and pellets were resuspended in 1 mL of 0.02-μm-filtered phosphate-buffered saline (PBS). A subsample (100 μL) of each sample was filtered again through 0.2-μm Acrodisc filters (Pall Gelman, New York) to ensure that samples were free of potential contaminating particles such as bacteria. Chloroform (0.2 volumes) was added to each sample, and the samples were gently agitated and allowed to settle at room temperature for 10 min. Samples were briefly centrifuged and the aqueous layer (containing purified viruses) was transferred to a new tube. The aqueous fraction was incubated with DNase (2.5 U each) for 3 h at 37 °C, after which EDTA was added to a final concentration of 2 mM. Viral DNA in the purified aqueous layer was extracted using the ZR viral DNA extraction kit (Zymo Research, Irvine, CA) according to the manufacturer's protocols.

Nucleic acid amplification and sequencing

Extracted viral DNA was amplified via multiple displacement (Φ29) amplification in triplicate reactions (Genomiphi©; GE Biosciences, Piscataway, NJ). Triplicate reactions were combined and amplified DNA was purified using a modified DNeasy extraction protocol (Thurber et al., 2009). The purified and amplified DNA was subjected to high-throughput sequencing using 454 Titanium chemistry at EnGenCore (University of South Carolina, Columbia, SC), where each sample was sequenced on 1/8 of a picotiter plate. Sequence data are available at the Community Cyberinfrastructure for Advanced Microbial Ecology Research and Analysis (CAMERA) Web site under project accession number CAM_P_0000951.

Viral metagenome phylogenetic annotation and comparative analyses

Annotation of viral metagenome reads and contiguous sequences followed an approach similar to VIROME (Bhasvar et al., 2009). Viral metagenomic libraries were cleaned using the QC filter available on the CAMERA Web site (Sun et al., 2011) and then de-replicated to account for 454 sequencing amplification bias (Gomez-Alvarez et al., 2009) using the 454 de-replication workflow also available at CAMERA. Sequence reads within each cleaned library were subjected to blastx (Altschul et al., 1990) against the viral proteins, prokaryotic proteins, and fungal proteins databases at CAMERA with an e-value cut-off of e < 10−3. For each read, the top hit across all three databases, based on lowest e-value (and highest bit score among annotations with tied e-values), was compiled into Microsoft Excel. The phylogenetic affiliation of viral hits was annotated based on the NCBI Taxonomy browser. The host genus of viral hits was determined based upon the name of the virus (e.g. Prochlorococcus phage P-SSM4) or by a literature search to determine the organism of isolation, if the host organism was not described within the name of the phage (e.g. phage λ).
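
The top-hit rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Hit` record, the database labels, and the `host_from_name` helper are hypothetical names introduced here, not part of the CAMERA workflow or the spreadsheet the authors used.

```python
from dataclasses import dataclass

E_CUTOFF = 1e-3  # e-value threshold stated in the text


@dataclass(frozen=True)
class Hit:
    """One blastx hit (hypothetical record layout, for illustration only)."""
    read_id: str
    database: str    # e.g. "viral", "prokaryotic", "fungal" (labels assumed)
    subject: str     # name of the matched protein's source organism or virus
    evalue: float
    bitscore: float


def top_hits(hits):
    """Keep one hit per read: lowest e-value, ties broken by highest bit score."""
    best = {}
    for h in hits:
        if h.evalue >= E_CUTOFF:
            continue  # fails the e < 10^-3 cut-off
        cur = best.get(h.read_id)
        if cur is None or (h.evalue, -h.bitscore) < (cur.evalue, -cur.bitscore):
            best[h.read_id] = h
    return best


def host_from_name(virus_name):
    """Return the host genus when the name reads '<Genus> phage <strain>'.

    Returns None when the name does not encode the host (e.g. 'phage λ'),
    in which case the text describes a manual literature search instead.
    """
    parts = virus_name.split()
    if len(parts) >= 3 and parts[1].lower() in ("phage", "virus"):
        return parts[0]
    return None
```

Names such as "Cyanophage KBS-S-2A" return None from this helper and would still need manual curation.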

Sequence reads were assembled into contiguous sequences (contigs) in CLC Genomics Workbench 4.0 using stringent assembly parameters (20% minimum overlap, 95% identity). Open reading frames (ORFs) were identified on contigs using FragGeneScan (Rho et al., 2010). ORFs were compared using blastp against the Reference Proteins database (CAMERA) using an e-value cutoff of e < 10−3. Contigs were annotated based on their best ORF match to the CAMERA Reference Proteins database. Representation of individual viral genotypes was determined by summing the base pairs of all contigs with best ORF matches to a specific sequenced phage and dividing that sum by the total number of base pairs in the assembled contig library for each data set.
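
As a minimal sketch of the representation calculation just described, assuming each contig is reduced to a (length, best-match) pair (a layout chosen here for illustration, with `None` marking contigs that have no viral best match):

```python
from collections import defaultdict


def genotype_representation(contigs):
    """Fraction of total contig bp assigned to each viral genotype.

    contigs: iterable of (length_bp, genotype_or_None) pairs. The denominator
    is the total bp of ALL contigs in the library, as stated in the text.
    """
    contigs = list(contigs)
    total_bp = sum(length for length, _ in contigs)
    if total_bp == 0:
        raise ValueError("contig library is empty")
    bp_per_genotype = defaultdict(int)
    for length, genotype in contigs:
        if genotype is not None:
            bp_per_genotype[genotype] += length
    return {g: bp / total_bp for g, bp in bp_per_genotype.items()}


# Hypothetical toy library (lengths are invented, not taken from the study)
example = [
    (1000, "Roseobacter phage SIO1"),
    (500, "Roseobacter phage SIO1"),
    (2500, None),
    (1000, "Pseudoalteromonas phage PM2"),
]
representation = genotype_representation(example)
```

Because contig length rather than read count is summed, many highly amplified reads that collapse onto one contig contribute only once.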

To compare assembled contiguous sequence annotations between libraries, the relative representation of viral groups was calculated based upon the sum of the lengths of contigs assigned to phylogenetic bins divided by the total length of all contigs within that particular analysis (e.g. viral contigs; Equation 1).

Equation 1. Difference between mitC-treated libraries and untreated libraries (display math not preserved).
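
One reading of the quantities described in this section can be written out as follows. This is an assumption rather than the authors' exact expression: the notation is introduced here, and it is not recoverable from the text whether the published difference was additionally normalized (for example, divided by the untreated value). A plain signed difference is at least consistent with the Fig. 1 caption, where values below zero indicate decreased representation in a mitomycin C-treated library.

```latex
% Assumed notation (not from the original):
%   L_c        length in bp of contig c
%   C_\ell     set of contigs considered in library \ell (e.g. viral contigs)
%   C_{g,\ell} subset of C_\ell assigned to phylogenetic bin g
\[
R_{g,\ell} = \frac{\sum_{c \in C_{g,\ell}} L_c}{\sum_{c \in C_{\ell}} L_c},
\qquad
D_{g} = R_{g,\mathrm{mitC}} - R_{g,\mathrm{untreated}}
\]
```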

Read libraries were compared with each other using reciprocal blastn comparisons in which each library served as both the database and the query. In addition to the three libraries generated for this study, the comparison included the Induced Tampa Bay Phage library constructed using similar techniques (CAM_PROJ_TampaBayPhage; McDaniel et al., 2008) and viral reads from the Marine Viromes data set available at CAMERA (CAM_PROJ_MarineVirome), divided by location [Sargasso Sea (SAR), Arctic (Arctic), British Columbia coastal waters (BBC) and the Gulf of Mexico (GOM); Angly et al., 2006]. A total of 10 000 reads were randomly sampled from each library and subjected to blastn against all other read libraries (e < 10−10). The number of reads in each query hitting at least one read in the database library was recorded, and the average number of reads in the query with hits to the database was calculated for each pairwise comparison. A similarity matrix was then constructed based upon the averaged results of three separate comparisons and subjected to multidimensional scaling (MDS) in XLSTAT (Addinsoft SARL, New York, NY) to visualize the relatedness between libraries.
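
The construction of the similarity matrix can be sketched as below. How the two directions of each reciprocal comparison were combined is not spelled out in the text, so averaging the forward and reverse hit fractions, and setting self-similarity to 1.0, are assumptions made here; the input values are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def similarity_matrix(libraries, hit_fraction):
    """Build a symmetric similarity matrix from reciprocal blastn results.

    hit_fraction[(query, db)] is a list with one value per replicate: the
    fraction of the sampled query reads with at least one hit in db.
    Replicates are averaged, then the two directions are averaged
    (assumed scheme, see lead-in).
    """
    sim = {a: {b: (1.0 if a == b else 0.0) for b in libraries} for a in libraries}
    for a, b in combinations(libraries, 2):
        forward = mean(hit_fraction[(a, b)])
        reverse = mean(hit_fraction[(b, a)])
        sim[a][b] = sim[b][a] = (forward + reverse) / 2
    return sim


# Hypothetical replicate results for two libraries (values invented)
libs = ["TB + mitC", "TB"]
fractions = {
    ("TB + mitC", "TB"): [0.6, 0.7, 0.8],
    ("TB", "TB + mitC"): [0.5, 0.5, 0.5],
}
matrix = similarity_matrix(libs, fractions)
```

The resulting matrix could then be passed to any MDS implementation; the study used XLSTAT for that step.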

Results and discussion

Library statistics

Read libraries had between 30 648 and 69 208 reads after de-replication and clean-up, representing 62 Mbp of sequence information in total. G + C content in all three libraries was below 50%. After clean-up, between 20% and 30% of reads in each library matched known viral, prokaryotic and eukaryotic fungal proteins based on blastx comparison with protein databases (e-value < 10−3). Of annotated reads, one-third to nearly one-half most closely matched viral proteins, with a larger proportion matching bacterial proteins (Supporting Information, Fig. S1). Contiguous sequence (contig) assembly resulted in 2263 to 2687 contigs per library (Table 1). Average contig length was 755–856 bp, and annotation revealed a similar proportion of viral, bacterial, archaeal and unknown contigs among libraries (Fig. 1a).

Table 1. Library statistics for reads and assembled contigs in the Tampa Bay mitomycin C-treated (TB + mitC) and untreated Tampa Bay (TB) and the 9B mitomycin C-treated (9B + mitC) libraries. Contigs considered viral had best ORF match (based on blastp, lowest e-value and highest bit score) to the Reference Proteins database at CAMERA

| | TB + mitC | TB | 9B + mitC |
| --- | --- | --- | --- |
| # Raw reads | 65 832 | 79 407 | 34 358 |
| Average raw read length | 326 ± 0.80 | 355 ± 0.45 | 374 ± 0.84 |
| # Reads | 57 493 | 69 208 | 30 648 |
| Total base pairs (bp) | 24 375 865 | 25 581 060 | 12 478 835 |
| Average % GC | 45 ± 0.02 | 45 ± 0.02 | 43 ± 0.04 |
| Average read length | 424 ± 0.51 | 370 ± 0.44 | 407 ± 0.70 |
| # Contigs | 2516 | 2687 | 2263 |
| Average contig length | 856 ± 18 | 826 ± 22 | 755 ± 13 |
| Total contig bp | 2 152 671 | 2 220 060 | 1 709 318 |
| # Viral contigs | 341 | 339 | 221 |
| Total viral contig bp | 316 481 | 350 470 | 184 625 |
| Avg viral contig length | 942 ± 81 | 1110 ± 108 | 875 ± 56 |
| Viral contig range | 239–16 870 | 191–15 790 | 249–10 988 |

Figure 1.

(a) Kingdom-level annotation of contiguous sequences based on best ORF match per contig to CAMERA Reference Proteins, e < 10−3; (b) relative representation of viral families amongst contigs with best ORF match to viral proteins, calculated as contig base pairs falling into each family divided by the total number of viral contig base pairs; (c) relative difference between mitomycin C-treated and untreated Tampa Bay viral contigs (see Eqn. 1). Values that fall below zero indicate decreased representation in a mitomycin C-treated library compared to the untreated Tampa Bay library.

Comparison between libraries and with other viral data sets

The mitomycin C-treated and mitomycin C-untreated Trichodesmium viral metagenomes from the mouth of Tampa Bay were more similar to each other than any other pair of libraries in this analysis, with an average of 74% read similarity based on blastn with an e-value cutoff of 10−10. The mitomycin C-treated library from station 9B was more similar to the two Tampa Bay viral metagenomes than to the other sequenced viral metagenomes included in our analysis (Fig. 2, Table S1). This suggests that the viral community associated with Trichodesmium lysis is distinct from virioplankton communities in other environments.

Figure 2.

MDS plot of read libraries based on a similarity matrix of averaged triplicate reciprocal blastn analyses of 10 000 random reads from each library. TB + mitC = Tampa Bay mitomycin C treated, TB = Tampa Bay untreated, 9B + mitC = 9B mitomycin C treated, SAR = Marine Viromes Sargasso Sea subset, Arctic = Marine Viromes Arctic subset, BBC = Marine Viromes British Columbia coastal waters subset, GOM = Marine Viromes Gulf of Mexico subset, TBP = mitomycin C-treated Tampa Bay metavirome from McDaniel et al., 2008.

Assessing Φ29 biases

Analysis of sequence reads revealed a variable distribution of read hits, with a number of similar reads resembling specific genes present at very high levels. Because libraries were prepared using Φ29 amplification, there are expected biases in the representation of genes and organisms within individual libraries (Shoaib et al., 2008). Subsequent contig-based analyses reduced some potential biases associated with overrepresentation of genes relative to template quantities. For example, read-based representation of genes within the Pseudoalteromonas phage PM2 genome in the 9B mitomycin C-treated read library was uneven, as some genes had greater than 100× coverage but adjacent genes had only 1× coverage. Stringent assembly of sequence reads into contigs removed some over-amplification biases by combining many highly amplified reads onto single contigs. The length of contigs is used in this study as a proxy for representation of genotypes within libraries. Annotated contigs then represent the well-represented viruses within each of the samples, and because these libraries were all constructed in exactly the same fashion, we use differences between libraries in the contig lengths assigned to different viral genotypes and virus categories to assess differences in viral community structure. The following comparisons are relative comparisons between libraries and are not quantitative. In addition, read-based annotations were used for characterization of individual targeted genes, and assembled contiguous sequence ORF-based annotations were used for comparative analyses between libraries.

Annotation of contiguous sequence ORFs

Approximately 10% of contigs from each library had best ORF matches to previously described viral proteins (Fig. 1a), which represented a total of 215 viral genotypes based on the strongest ORF match per contig. Different phages exhibited different patterns of representation among the three libraries (Fig. 3). Contigs with best ORF matches to a subset of viruses were more highly represented within the untreated Tampa Bay viral metagenome compared with the mitomycin C-treated viral metagenomes. These included contigs annotated as Pseudoalteromonas phage pYD6-A, Enterobacter phage EcP1, Pseudomonas phage LIT1, Vibrio phage VBP47 and Erwinia phage vB_EamP-S6. These phages all infect gammaproteobacteria and have either unknown or lytic infection strategies. Another group of viruses exhibited increased representation in both the mitomycin C-treated libraries compared with the untreated library. Some of the most highly represented members of this group include Methylophilales phage HIM624-A, Roseobacter phage SIO1, Cyanophage KBS-S-2A, Synechococcus phage S-SM2, and Synechococcus phage S-CAM8. The only member of this group whose infection strategy is known is Roseobacter phage SIO1, which was originally characterized as a lytic phage of Roseobacter. Synechococcus phages S-SM2 and S-CAM8 are both members of the viral family Myoviridae. To date, there is little evidence that cyanobacterial myoviruses are capable of lysogeny (Clokie et al., 2010); thus, their higher representation in the chemically induced libraries is interesting. A third group of viruses were most highly represented in one of the mitomycin C-treated viral metagenomes but had lower representation within the other two libraries. Highly represented viral genotypes within the Tampa Bay mitomycin C-treated data set included Salicola phage CGphi29, Vibrio phage pYD21-A, Pseudoalteromonas phage H105/1 and Vibrio phage jenny 12G5. Several of these phages are known to be temperate, suggesting that these groups were induced upon treatment with mitomycin C. Pseudoalteromonas phage PM2 was most highly represented in the 9B mitomycin C-treated library and was absent from both libraries from Tampa Bay (Fig. 3, Table 2).

Table 2. Twenty most highly represented viral genotypes within the contig libraries

| Virus | Taxon ID | Virus family | Infection strategy | Library with highest representation |
| --- | --- | --- | --- | --- |
| Pseudoalteromonas phage pYD6-A | 754052 | Unclassified dsDNA virus | Unknown | TB untreated |
| Salicola phage CGphi29 | 754067 | Unclassified dsDNA virus | Unknown | TB + mitC |
| Methylophilales phage HIM624-A | 889949 | Unclassified dsDNA virus | Unknown | TB + mitC |
| Enterobacter phage EcP1 | 942016 | Podovirus | Unknown | TB untreated |
| Pseudomonas phage LIT1 | 655098 | Podovirus | Lytic | TB untreated |
| Roseobacter phage SIO1 | 136084 | Podovirus | Lytic | TB + mitC |
| Vibrio phage pYD21-A | 754049 | Unclassified dsDNA virus | Unknown | TB + mitC |
| Vibrio phage douglas 12A4 | 573171 | Unclassified | Unknown | TB untreated |
| Vibrio phage VBP47 | 754073 | Unclassified dsDNA virus | Unknown | TB untreated |
| Erwinia phage vB_EamP-S6 | 1051675 | Podovirus | Lytic | TB untreated |
| Synechococcus phage S-SSM7 | 445686 | Myovirus | Lytic | TB + mitC |
| Pseudoalteromonas phage H105/1 | 877240 | Siphovirus | Temperate | TB + mitC |
| Pseudomonas phage YuA | 462590 | Siphovirus | Temperate | TB + mitC |
| Cyanophage KBS-S-2A | 889953 | Unclassified dsDNA virus | Unknown | 9B + mitC |
| Pseudoalteromonas phage PM2 | 10661 | Corticovirus | Lytic | 9B + mitC |
| Pseudomonas phage LUZ7 | 655097 | Podovirus | Lytic | TB untreated |
| Vibrio phage jenny 12G5 | 573176 | Unclassified | Unknown | TB + mitC |
| Synechococcus phage S-SM2 | 444860 | Myovirus | Unknown | 9B + mitC |
| Synechococcus phage S-CAM8 | 754038 | Myovirus | Unknown | 9B + mitC |
| Phage phiJL001 | 279383 | Siphovirus | Temperate | TB untreated |

Figure 3.

Relative representation of the 20 most abundant viral contig annotations, annotated based on best ORF match per contig. Relative representation was determined as the total contig base pairs falling into each annotation divided by the total contig bp in each library.

The Caudovirales, or tailed phages, were the most highly represented viral order among contigs with best ORF matches to viral proteins, with the families Podoviridae, Siphoviridae, and Myoviridae together comprising the majority of identifiable viral sequences in each library. Sequences matching the Phycodnaviridae had a consistent presence within all three contig libraries, with the largest representation within the mitomycin C-treated 9B viral metagenome. Phycodnaviruses are large viruses whose sizes range from 0.15 to greater than 0.2 μm (Simpson et al., 2003), which is close to or above the pore size of the 0.2-μm prefilter used. Thus, the phycodnaviruses present within these data sets may not represent all phycodnaviruses present at the time of sampling. An 11 000-bp contig with several ORFs best resembling proteins of Pseudoalteromonas phage PM2 was found only in the mitomycin C-treated library collected at station 9B. There were differences between the relative representation of viral families within mitomycin C-treated and mitomycin C-untreated Tampa Bay viral metagenomes (Fig. 1b and c). Myoviruses were slightly overrepresented in the untreated library compared with both mitomycin C-treated libraries. There was little difference in the representation of contigs matching siphoviruses between the Tampa Bay mitomycin C-treated and mitomycin C-untreated libraries, but there was a decrease in their representation in the mitomycin C-treated 9B viral metagenome. Finally, there was a greater representation of contigs matching phycodnaviruses (viruses of Chlorella, Ostreococcus, and Micromonas) within the mitomycin C-treated viral metagenomes compared with the mitomycin C-untreated viral metagenome. The cause of enhanced representation of phycodnaviruses is unclear, because mitomycin C is not expected to induce viruses within eukaryotic organisms.

Viruses of alphaproteobacteria, bacteroidetes, betaproteobacteria, cyanobacteria, firmicutes, and gammaproteobacteria were the six most dominant viral groups based on bacterial host. Previously described viruses of gammaproteobacteria were more represented within the untreated Tampa Bay viral metagenome compared with the mitomycin C-treated viral metagenomes. Conversely, cyanophage were better represented in the mitomycin C-treated viral metagenomes compared with the mitomycin C-untreated viral metagenome (Fig. 4). Together, these results indicate that lysogenic cyanobacteria were likely sensitive to mitomycin C addition. The large representation of gammaproteobacterial viruses in these libraries suggests that gammaproteobacteria were the most active and potentially the most abundant hosts. These observations are congruent with a previously reported Trichodesmium community metatranscriptome, which found abundant gammaproteobacterial transcripts (Hewson et al., 2009). Phages of hosts that comprised a large proportion of metatranscriptomic reads in that study, particularly Pseudoalteromonas and Alteromonas, were well represented within all three metaviromic libraries in this study. These organisms, some of which are referred to as ‘copiotrophs’ (Ivars-Martinez et al., 2008), may have responded to the increased DOM availability after Trichodesmium lysis, which subsequently led to a response from the related phage population either through induction of prophage or through enhanced lytic infection due to increases in host density. The lower representation of gammaproteobacterial phage genotypes within the two mitomycin C-treated contig libraries compared with the mitomycin C-untreated library may indicate that this group was not the dominant host group at the onset of the incubation experiments and that this group could not respond to Trichodesmium lysis in the presence of mitomycin C. Interestingly, there is almost no difference in the relative representation of alphaproteobacterial, betaproteobacterial, and firmicute phage genotypes between the mitomycin C-treated and mitomycin C-untreated libraries, suggesting that the representation of these host groups remains similar in both incubations and that the viral population responds similarly to mitomycin C addition and Trichodesmium lysis (Fig. 4).

Figure 4.

Relative representation of viral contigs based on host organism (a) and relative difference between mitomycin C treated and untreated Tampa Bay viral contigs (b).

Mitomycin C treatment and location are the two factors that are most likely responsible for the differences in viral composition in this study. All three viral metagenomes were generated from incubations in which Trichodesmium lysed, leading to enhanced input of dissolved organic matter into the incubations, but in the two incubations in which mitomycin C was added, the associated bacterial community would not be able to respond to these physical changes before mitomycin C caused prophage induction. The phage genotypes that were better represented in the untreated viral metagenome compared with the mitomycin C-treated libraries represent viruses that may have entered a lytic infection cycle in response to changes in host activity or lytic viruses that increased in abundance due to increases in host abundance. Lytic viral production typically occurs when host and free virus abundance is high (Wilcox & Fuhrman, 1994). Lytic viruses have disproportionate effects on dominant bacterial taxa in plankton due to the density-dependent and relatively host-specific dynamic between viruses and their hosts. Thus, the most metabolically dominant or fastest growing member of the microbial community is controlled most heavily by lytic viruses, allowing less competitive community members to co-exist, an interaction termed ‘kill the winner’ (Thingstad, 2000). The viral genotypes more highly represented within the untreated incubation may have been interacting with hosts in a ‘kill-the-winner’ scenario, whereby hosts of these viruses were the fastest growing portion of the bacterial community. Conversely, the viruses that were better represented in the two mitomycin C-treated viral metagenomes likely represent two sources: temperate phages arising from prophage within mitomycin C-sensitive hosts that were in high abundance at the beginning of the incubation, and viruses already present within the community at the time of mitomycin C addition. The latter source may explain why several cyanobacterial myoviruses, not expected to be lysogenic, were more represented within the mitomycin C-treated incubations; associated picocyanobacteria and their phages were abundant members of the community before Trichodesmium lysis and thus were represented within the mitomycin C-treated incubations, which contained a viral community that was not able to respond to Trichodesmium lysis. Finally, viral genotypes that exhibited higher representation in only one of the mitomycin C-treated incubations likely reflect variation in host composition between sampling locations. This may be due to differences in the physical and nutrient characteristics at the two locations or differences in Trichodesmium colony morphology between locations, which has been previously observed to lead to differences in the structure of associated microbial assemblages (O'Neil & Roman, 1992; Sheridan et al., 2002; Hmelo, 2010).

Integrase genes present within read libraries

Bacteriophage integrase genes have previously been used to characterize lysogeny within metaviromic samples (McDaniel et al., 2008). In this study, integrases made up 0.28–0.38% of sequence reads in each viral metagenome. Within the 9B mitomycin C-treated library, integrases most closely matching Vibrio phage were most highly represented, followed by those of Pseudoalteromonas phage. Phage integrases of two other marine gammaproteobacterial hosts (Idiomarina and Congregibacter) were detected, with equal representation of the two in the untreated library and a higher representation of the Congregibacter phage integrase in the mitomycin C-treated viral metagenome. The representation of integrases in the Tampa Bay mitomycin C-treated library was 0.1 percentage points higher than the representation of integrases within the untreated Tampa Bay library (Table 3). The large proportion of integrase-like reads resembling phage of gammaproteobacterial hosts (Vibrio-like, Congregibacter-like) suggests that prophage induction of these or similar gammaproteobacterial hosts was stimulated. Similarity of unique integrase-like sequences between untreated and mitomycin C-treated Tampa Bay viral metagenomes suggests that temperate phages were induced not only by mitomycin C addition but also by containment or Trichodesmium collapse.

Table 3. Summary of integrase reads found within the three libraries

Integrase summary     TB + mitC    TB untreated    9B + mitC
# Integrase           220          197             104
% Integrase reads     0.38         0.28            0.34
Unique hits           43           45              55
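
The comparison between the two Tampa Bay libraries can be expressed either as an absolute difference (percentage points) or as a relative increase. The short Python sketch below illustrates that arithmetic using only the percentages reported in Table 3. The dictionary keys, function names, and rounding choices are illustrative assumptions and are not part of the original analysis.

```python
# Percent of reads identified as integrase-like, transcribed from Table 3.
# Key names are illustrative labels, not identifiers used in the study.
integrase_percent = {
    "TB + mitC": 0.38,
    "TB untreated": 0.28,
    "9B + mitC": 0.34,
}


def percentage_point_difference(a: float, b: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(a - b, 2)


def relative_increase(a: float, b: float) -> float:
    """Increase of a over b, expressed as a percent of b."""
    return round(100.0 * (a - b) / b, 1)


treated = integrase_percent["TB + mitC"]
untreated = integrase_percent["TB untreated"]

pp_diff = percentage_point_difference(treated, untreated)
rel_diff = relative_increase(treated, untreated)

print(f"Absolute difference: {pp_diff} percentage points")
print(f"Relative increase over untreated: {rel_diff}%")
```

Under these assumptions, the 0.1 percentage point difference corresponds to roughly a one-third relative increase in integrase representation in the treated library. Because the inputs are rounded table values, the result should be read as approximate.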

Analysis of reads resembling Trichodesmium

Sequence reads with varying degrees of similarity to 19 different proteins of Trichodesmium erythraeum IMS101 were found in the metaviromic libraries. Trichodesmium proteins detected included a phage integrase (in the untreated Tampa Bay viral metagenome only), two different Rho-termination factor-like proteins (in both the mitomycin C-treated and untreated Tampa Bay viral metagenomes), an RNA polymerase sigma factor (in the 9B mitomycin C-treated viral metagenome), and several hypothetical proteins. Interestingly, the most reads matching Trichodesmium proteins were found in the untreated Tampa Bay viral metagenome and the fewest in the 9B mitomycin C-treated viral metagenome. Sequence reads matching three proteins were found in both the mitomycin C-treated and untreated Tampa Bay viral metagenomes: a DUF900 hydrolase-like protein (RefSeq ID YP_721790), a hypothetical protein (YP_723033), and a Rho-termination factor-like protein (YP_720117).

Despite the large biomass of Trichodesmium in the incubations and the apparently large viral burst size (Hewson et al., 2004), we could not identify a single viral genotype that dominated the libraries, as would be expected for a virus involved in Trichodesmium demise. It is possible that a phage of Trichodesmium was present at relatively low abundance within the phage community because other phages, particularly those infecting taxa that were likely consuming organic matter from the dying cyanobacteria, comprised a large proportion of total viral abundance in these incubations. Another possibility is that multiple, diverse phages infect Trichodesmium, which would also lead to inconclusive results given the size of the libraries sequenced. In either case, sequences of a potential Trichodesmium phage may have fallen below the sequence coverage of our viral metagenomes. The consistent detection of the same Trichodesmium genes in the two Tampa Bay viral metagenomes suggests that this approach was not biased toward sampling the genome of Trichodesmium itself, but rather repeatedly detected phage particles bearing Trichodesmium-like genes. In addition to Trichodesmium-like genes, a large proportion of the sequence information within all three viral metagenomes was unidentifiable and may encode genomic material of Trichodesmium cyanophages. CRISPR spacer sequences are identified within the T. erythraeum genome by MicrobesOnline, an analysis pipeline that identifies CRISPR elements missing from RefSeq annotations (Dehal et al., 2009). The presence of CRISPR spacer sequences within the genome of T. erythraeum IMS101 may indicate that this strain has encountered phages in the past (Anderson et al., 2011). For these reasons, and based on previously published observations (Ohki, 1999; Hewson et al., 2004), it is likely that such a virus exists.


These data suggest that a diverse consortium of viruses is associated with the collapse of Trichodesmium aggregations, but they do not fully support the possibility that a Trichodesmium virus is responsible for aggregation collapse. The viruses present within collapsed Trichodesmium aggregations were capable of infecting a number of different organisms. Patterns of viral genotype representation reveal variations in host activity and virus infection strategies following Trichodesmium lysis. Viral genotypes that were more represented in the untreated library may represent lytic viruses of active opportunistic bacteria, while viruses that were better represented in the mitomycin C-treated viral metagenomes may have been temperate viruses of abundant but slow-growing hosts. While a virus of Trichodesmium could not be identified within these metaviromic data sets, Trichodesmium-like genes were present among the sequence reads of all three libraries. Thus, this study provides inconclusive evidence for the existence of a virus of Trichodesmium. Higher sequencing coverage may be required to identify a virus of a dominant organism within a mixed microbial community. Future studies aimed at determining the cause of Trichodesmium aggregation collapse will require more directed approaches, such as better isolation of the cyanobacterium from associated bacteria and deeper sequencing efforts.


The authors would like to thank Lauren McDaniel, University of South Florida; Cindy Heil, Florida Fish and Wildlife Conservation Commission and Bigelow Laboratory; Deborah Bronk, Virginia Institute of Marine Science; Gary Hitchcock, University of Miami – RSMAS; Margaret Mulholland, Old Dominion University; Matt Garrett, Florida Fish and Wildlife Conservation Commission; Judy O'Neil, University of Maryland Center for Environmental Science; the crew of R/V Pelican, LUMCON; and the NOAA ECOHAB Program ‘ECOHAB: Karenia Nutrient Dynamics in the Eastern Gulf of Mexico’. Funding was provided in part by the Cornell Biogeochemistry and Environmental Biocomplexity Initiative, National Science Foundation grants OCE-1049670 awarded to I.H., M. Breitbart and K. Daly and DEB-1028898 awarded to I.H. and N. Hairston, and Federal Formula Funds Grant NYS-189489 awarded to I.H. and J. Casey.