Fig. S1. Top 10 bacterial genus-level matches, identified by a BLASTX search (E < 10⁻⁵) against the NCBI NR database in CAMERA (Sun et al., 2011) and mapped to the NCBI taxonomy in the MEGAN software package (Huson et al., 2007). Genera are ordered by frequency of occurrence in the EAC sample.


Fig. S2. STAMP analysis showing relative importance of broad metabolic categories (SEED, Level 1) in TSW and EAC samples. Corrected P-values (q-values) were calculated using Storey's FDR approach (Parks and Beiko, 2010). Groups over-represented in the EAC community (orange) correspond to negative differences between proportions. Groups over-represented in the TSW community (blue) correspond to positive differences between proportions.


Fig. S3. Hierarchical clustering of metagenomic profiles at (A) class, (B) genus and (C) functional level. Dendrograms represent group average clustering of the Bray–Curtis similarity between profiles. Abundance profiles were generated using the SEED database (Overbeek et al., 2005; E < 10⁻⁵) in MG-RAST (Meyer et al., 2008), and normalized abundance data were exported into PRIMER-E (Clarke and Gorley, 2006) for analysis using the CLUSTER algorithm (Clarke, 1993). Normalization was based on a log transformation and data centring as per the MG-RAST standard protocol. Metagenomes representative of tropical and temperate ocean surface habitats (red and blue symbols respectively) were chosen for the analysis based on their geographic location (tropical: < 23° latitude), and each contained more than 1000 hits. Datasets were as follows: GOS Temperate (‘Global Ocean Survey’, Rusch et al. 2007; MG-RAST ID 4441143.3, 4441144.3, 4441570.3, 4441573.3, 4441574.3, 4441575.3, 4441578.3, 4441579.3, 4441583.3, 4441585.3, 4441659.3), GOS Tropical (‘Global Ocean Survey’, Rusch et al. 2007; 4441145.3, 4441146.3, 4441594.3, 4441603.3, 4441605.3), Equatorial Pacific (‘Marine Bacterioplankton Metagenomes’; 4443766.3, 4443695.3, 4443697.3, 4443698.3, 4443699.3, 4443700.3, 4443701.3), this study (‘EAC/TSW’; 4446407.3, 4446457.3), Monterey Bay (‘Monterey Bay Microbial Study’; 4443713.3, 4443712.3, 4443714.3, 4443715.3, 4443716.3, 4443717.3), Botany Bay (‘Botany Bay Metagenomes’; 4443688.3, 4443689.3), HOT/ALOHA (‘Microbial Community Genomics at the HOT/ALOHA’, De Long et al. 2006; 4441051.3, 4441057.4). All data are publicly available on MG-RAST (Meyer et al., 2008; accessed 21/3/12).
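The clustering summarized in Fig. S3 can be illustrated with a minimal Python sketch. It is not the PRIMER-E CLUSTER implementation or the MG-RAST normalization pipeline: it assumes a simple log2(x + 1) transform without the centring step, expresses Bray–Curtis similarity on a 0 to 100 scale, and merges groups by average similarity. The function names and the toy profiles "A", "B" and "C" are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def log_transform(profile):
    """Apply log2(x + 1) to raw abundances (assumed transform; no centring)."""
    return [math.log2(x + 1.0) for x in profile]


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two non-negative profiles."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        return 100.0  # two empty profiles are treated as identical
    return 100.0 * (1.0 - numerator / denominator)


def group_average_clustering(profiles):
    """Agglomerative group-average clustering on Bray-Curtis similarity.

    profiles maps a sample name to a list of abundances.
    Returns the merges in order as (cluster_1, cluster_2, similarity).
    """
    names = list(profiles)
    # Pairwise similarities between individual samples.
    sim = {
        frozenset((p, q)): bray_curtis_similarity(profiles[p], profiles[q])
        for p, q in combinations(names, 2)
    }

    def average_similarity(c1, c2):
        # Group average: mean similarity over all between-cluster pairs.
        total = sum(sim[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the highest average similarity.
        c1, c2 = max(
            combinations(clusters, 2),
            key=lambda pair: average_similarity(*pair),
        )
        merges.append((c1, c2, average_similarity(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


if __name__ == "__main__":
    # Hypothetical toy abundance profiles, not data from the study.
    raw = {"A": [10, 0, 5], "B": [10, 0, 5], "C": [0, 20, 0]}
    transformed = {name: log_transform(p) for name, p in raw.items()}
    for left, right, similarity in group_average_clustering(transformed):
        print(left, right, round(similarity, 2))
```

Each merge corresponds to one node of a dendrogram, and the similarity at which two groups join gives the height of that node. Because Bray–Curtis is defined for non-negative values, the sketch omits centring; how centred data are handled in the published analysis is not specified here.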

