Fig. S1. Consensus tree of the 16S–23S ITS region incorporating strain sequences and environmental clone sequences (in italics) that form novel clusters. Bootstrap values for the NJ and optimized ML trees were obtained from 100 replicates and are shown on the consensus tree. The scale bar represents the number of substitutions per nucleotide position. Branches with low bootstrap support (< 30%) in both analyses were collapsed to multifurcations. Synechococcus sp. RCC307 was used as an out-group. The maximum likelihood tree was inferred using PhyML with the HKY + G model of substitution (the best-fit model was selected statistically with the aid of jModeltest; Posada, 2008), with the tree optimized by likelihood and substitution rates estimated from the data. Trees were calculated on the basis of 1239 sites selected with a 50% frequency filter, i.e. columns containing gaps were used to calculate the tree when more than 50% of the sequences had a base at that position. Of these 1239 sites, 195 were identical, leaving 1044 informative sites. Trees built from an alignment that excluded all gapped columns (319 columns, 124 informative sites) produced a similar topology but with substantially lower resolution.
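
The 50% frequency filter described in the legend to Fig. S1 can be illustrated with a minimal Python sketch. This is not the software used in the study. The function names, the gap characters and the toy alignment are hypothetical, and an "identical" site is assumed here to mean a column in which all non-gap bases are the same.

```python
def frequency_filter(alignment, min_fraction=0.5, gap_chars="-."):
    """Return indices of columns in which more than min_fraction
    of the sequences carry a base (i.e. not a gap character)."""
    n_seqs = len(alignment)
    kept = []
    for col in range(len(alignment[0])):
        n_bases = sum(1 for seq in alignment if seq[col] not in gap_chars)
        if n_bases > min_fraction * n_seqs:
            kept.append(col)
    return kept


def is_identical(alignment, col, gap_chars="-."):
    """True if all non-gap bases in the column are the same (assumed definition)."""
    bases = {seq[col].upper() for seq in alignment if seq[col] not in gap_chars}
    return len(bases) <= 1


# Hypothetical toy alignment: four sequences, six columns.
toy = [
    "ACG-TA",
    "ACGTTA",
    "AC--CA",
    "A-G-TA",
]

kept = frequency_filter(toy)                               # columns passing the filter
variable = [c for c in kept if not is_identical(toy, c)]   # non-identical columns
print(len(kept), "sites kept;", len(variable), "not identical")
```

In the legend the same bookkeeping gives 1239 filtered sites, of which 195 are identical and 1044 remain.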

Fig. S2. Individual neighbour-joining trees of the seven MLSA loci. Bootstrap values for NJ and ML were obtained from 100 replicates and are shown on the NJ tree; ● indicates a value > 90%. Strain names are coloured according to their 16S or ITS clade designations to highlight the consistent clustering of strains on the basis of individual MLSA loci. The scale bar represents the number of substitutions per nucleotide position. Synechococcus sp. WH5701 was used as an out-group for each tree.

Fig. S3. Ordination plots displaying the relationships between sites (black labels), environmental parameters (blue arrows and labels) and individual petB OTUs (red symbols and labels) determined at three distinct identity cut-offs. An OTU cut-off of 94% identity provides the best separation of OTUs relative to sites and environmental parameters.

Table S1. Summary data for Synechococcus strains examined in this study. The table includes the year, place and depth of isolation and the clade and subgroup designation based on 16S rRNA, 16S–23S ITS or MLSA sequences.

