Fig. S1. Cyanophage genome size plotted as a function of the number of predicted ORFs where original host genera are designated by colour.

Fig. S2. The genome location of four hierarchical ‘core’ gene sets plotted for 26 T4 phage genomes. Lines connect function-based orthologues across genomes, and are coloured as per legend.

Fig. S3. Multiple sequence alignment of the T4 phage gp51 baseplate hub catalyst protein from 26 T4 phage genomes. The cyanophage and marine vibriophage copies of gp51 are significantly reduced, missing the first ∼200 amino acids relative to the non-cyano non-marine T4 phage copies (the first 140 amino acids of the alignment are not shown). In spite of this size difference, there is marked similarity in the C-terminal region of the protein shown in the alignment.

Fig. S4. Maximum likelihood tree of the pyrophosphatase MazG protein. The tree was constructed from 271 aligned amino acids, using PhyML and the JTT model of substitution with gamma-distributed rates empirically estimated from the data. The accession numbers for the sequences used in this analysis are available upon request. Numbers at the nodes represent bootstrap values for 1000 replicates.

Fig. S5. Weblogo (http://weblogo.berkeley.edu/) diagrams of the various bioinformatically predicted promoters in the cyanophage genomes.

Fig. S6. Multiple sequence alignment of the cyanophage-encoded Zwf proteins identified in varying degrees of preservation across eight cyanophages. While the sequence conservation is minimal for the three highly degraded copies, their position in the genomes is conserved and remnants of sequence similarity remain along the protein.

Fig. S7. Alignment of the endonucleases in T4-GCs 228 and 282.

A. Putative homing endonucleases (T4-GC282) where only the P-SSM2 copy has conserved catalytic residues as compared with the experimentally characterized homing endonuclease present in S-PM2 (S-PM2p177, Zeng et al., 2009). The remaining copies appear to have lost these residues and are likely non-functional, yet are all located at a conserved region suggesting a single evolutionary event of insertion at the 3′-end of gp17 (see upper panel for genome sequence details).

B. Possible endonucleases (T4-GC228) which lack the conserved residues in (A) but nonetheless are highly conserved and proximal to the carbon metabolism genes, suggesting that they may be responsible for genetic shuffling in this region.

Fig. S8. Close-up genome representation of the mobile carbon metabolic gene cluster from cyanophage genomes. Genomic features are as described in Fig. 1, and genome location orientation is as described for Fig. 5.

Fig. S9. Close-up genome representation of the mobile hypothetical gene cluster from cyanophage genomes. Genomic features are as described in Fig. 1, and genome location orientation is as described for Fig. 5.

Table S1. Detailed features of the T4-like ocean cyanophage isolates.

Table S2. T4-like phage core genes determined from 16 cyanophages and 10 non-cyanophages. Numbers listed for each phage represent the size of the genes (bp), with multiple copies separated by a ‘|’. Some T4-GCs were pooled to create a single functional category based upon annotation and genome synteny.

Table S3. Non-cyano T4-like ‘core’ beyond the T4-core. Numbers listed for each phage are as in Table S2.

Table S4. Cyano T4-like core genes. Numbers listed for each phage are as in Table S2.

Table S5. Proteins that are unique to either the P-HM1 or the P-HM2 phage genome in a pairwise comparison of these two co-isolated phages.

Table S6. Synechococcus phage-enriched proteins. Numbers listed for each phage are as in Table S2.

Table S7. Prochlorococcus phage-enriched proteins. Numbers listed for each phage are as in Table S2.

Table S8. Summary of cyano T4 proteomics experiments. Comparative proteomics refers to the experimentally determined protein content of purified virus particles, used to identify the structural proteins in three sequenced T4-like virus genomes. ‘Y’ means the protein was detected; ‘–’ means the protein is annotated in the genome but no peptides were detected; ‘NP’ means the protein is not present in the genome; ‘counts’ are the number of peptide fragments detected per protein; and ‘copy # in T4’ refers to the biochemically and ultrastructurally determined copy number of proteins in the coliphage T4 particle. Ten of these proteins, in italics, have similar distributions among nine cyanophages and may be functionally linked.

File S1. The spreadsheet used to generate the overview of the cyanophage genome annotations that are presented in Fig. 1.

File S2. Multifasta of all ORFs examined in this study including gene identifiers and genome location, T4-GC assignment and functional annotation.

Filename                        Format  Size    Description
EMI_2280_sm_Figure_S1-S9.ppt    ppt     6665K   Supporting info item
EMI_2280_sm_Table_S1-S8.pdf     pdf     311K    Supporting info item
EMI_2280_sm_File_S1.xlsx        xlsx    275K    Supporting info item
EMI_2280_sm_File_S2.txt         txt     1984K   Supporting info item

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.