Filename: ece3593-sup-0001-TableS1-S3-FigS1-S6.pdf
Format: application/PDF
Size: 1045K
Description:

Data S1. Methods.

Table S1. Genomes used in this study for building gene clusters.

Table S2. Hypervariable regions of Prochlorococcus, cyanophage, and SAR11 reference genomes.

Table S3. Complete list of gene clusters over- or under-represented in BATS, HOT, MED, or RS.

Figure S1. CTD traces for sampling done at RS, MED, BATS, and HOT. Representative casts are shown from KRSE2008, PROSOPE, BATS216, and HOT186 cruises, respectively. Casts were the same as those used to collect samples for DNA sequencing except for MED, where the cast was made on Sept. 15, 1999 at a station near the sampling site. Temperature is shown with solid lines, and relative fluorescence (chlorophyll) is shown with dashed lines. Depths where samples were taken for pyrosequencing are marked with dotted lines.

Figure S2. Schematic overview of the methods. (A) Assigning metagenomic reads to gene clusters. Reads from each sample were compared to GenBank-nr using BLASTX and binned as Prochlorococcus, cyanophage, or SAR11. Reads in each taxonomic bin were then compared to the available genomes for that taxonomic group using BLASTN and assigned to gene clusters. (B) Calculating relative normalized abundances and entropies for each gene cluster. In this example, counts for the three BATS and three HOT samples were combined. Normalized abundance was calculated by normalizing over the gene clusters for each sample. Relative normalized abundance (r.n.a.) was calculated by normalizing over the samples for each gene cluster. Shannon entropy was calculated from the r.n.a. values. PRO1000, PRO1001, and PRO1002 are core gene clusters, while PRO2983 is a flexible gene cluster (alkaline phosphatase).
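
The calculation in panel (B) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the read counts, and the choice of log base 2 for the entropy are assumptions made for illustration only. The caption specifies just the two normalization steps and that Shannon entropy is computed from the r.n.a. values.

```python
import math

def normalized_abundance(counts):
    """Normalize over gene clusters within each sample.

    counts[cluster][sample] is a read count (hypothetical input layout).
    """
    samples = {s for per_sample in counts.values() for s in per_sample}
    totals = {s: sum(ps.get(s, 0) for ps in counts.values()) for s in samples}
    return {
        cluster: {s: (ps.get(s, 0) / totals[s] if totals[s] else 0.0)
                  for s in samples}
        for cluster, ps in counts.items()
    }

def relative_normalized_abundance(na):
    """Normalize over samples within each gene cluster."""
    rna = {}
    for cluster, per_sample in na.items():
        total = sum(per_sample.values())
        rna[cluster] = {s: (v / total if total else 0.0)
                        for s, v in per_sample.items()}
    return rna

def shannon_entropy(proportions, base=2):
    """Shannon entropy of a set of proportions (log base is an assumption)."""
    return -sum(p * math.log(p, base) for p in proportions if p > 0)

# Hypothetical read counts; cluster names are borrowed from the caption.
counts = {
    "PRO1000": {"BATS": 40, "HOT": 80},
    "PRO1001": {"BATS": 40, "HOT": 80},
    "PRO2983": {"BATS": 20, "HOT": 0},
}
rna = relative_normalized_abundance(normalized_abundance(counts))
entropy = {c: shannon_entropy(v.values()) for c, v in rna.items()}
```

In this toy input, the cluster found in only one sample has an entropy of zero, while the clusters present in both samples have entropies close to the two-sample maximum.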

Figure S3. Relative abundance of 16S rRNA genes obtained from metagenomic libraries of RS, MED, BATS, and HOT. (A) Phylum-level classification for all recruited reads. (B) Genus-level classification of the phylum Proteobacteria. (C) Genus-level classification of the phylum Cyanobacteria.

Figure S4. Taxonomic distribution of metagenomic reads from the four datasets included in this study. Top BLAST hits to sequenced genomes are shown, with subgroup/ecotype subdivisions of the counts shown where available. Note that only SAR11 subgroups 1a and 3 are represented by genomes, so only those two subgroups are shown.

Figure S5. Relative normalized abundance and entropy of single-copy gene clusters (found exactly once in each genome) and non-single-copy gene clusters (found more or less than once in at least one genome) from Prochlorococcus, cyanophage, and SAR11 in a genomic context. Gene clusters with entropy in the bottom 15% (Prochlorococcus, SAR11) or 25% (cyanophage) and r.n.a. for one sea in the top or bottom 10% are marked with solid black lines. The dotted line indicates r.n.a. equal to 0.25 (i.e., equal normalized abundance across the four seas). Gray boxes indicate HVRs (Methods).
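
The marking rule in Figure S5 (low entropy combined with an extreme r.n.a. in one sea) can be sketched as below. The caption does not state how the percentile cutoffs were computed, so the nearest-rank method and the helper names `percentile_cutoff` and `flag_clusters` are assumptions rather than the authors' procedure. The default fractions match the Prochlorococcus and SAR11 thresholds; for cyanophage, `entropy_frac` would be 0.25.

```python
import math

def percentile_cutoff(values, fraction):
    """Nearest-rank cutoff: the value at rank ceil(fraction * n), 1-based."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]

def flag_clusters(entropy, rna, sea, entropy_frac=0.15, rna_frac=0.10):
    """Return clusters with low entropy and an extreme r.n.a. for one sea.

    entropy[cluster] is a float; rna[cluster][sea] is a float.
    """
    ent_cut = percentile_cutoff(entropy.values(), entropy_frac)
    sea_values = [rna[c][sea] for c in entropy]
    low_cut = percentile_cutoff(sea_values, rna_frac)
    high_cut = percentile_cutoff(sea_values, 1 - rna_frac)
    return sorted(
        c for c in entropy
        if entropy[c] <= ent_cut
        and (rna[c][sea] <= low_cut or rna[c][sea] > high_cut)
    )
```

Calling `flag_clusters(entropy, rna, "RS")` once per sea would give the set of clusters to mark for that sea under these assumptions.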

Figure S6. Histograms of entropy values for single-copy gene clusters (found exactly once in each genome) and non-single-copy gene clusters (found more or less than once in at least one genome) from Prochlorococcus, cyanophage, and SAR11. Only those gene clusters with greater than 20 hits across the four samples are shown. Note the differences in y-axis scale bars between the single-copy and non-single-copy histograms.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.