Fig. S1. Relationship between the total frequency of a group and the number of ≥ 97% clusters in that group for the original data (all samples combined) (A) and for resampled data giving an equal number of tags (500) per group (B). ‘Group’ here means a phylum or a proteobacterial class. To remove effects due to different sample sizes (e.g. there were many more sequences for the Gammaproteobacteria than for the Epsilonproteobacteria), 500 tags for each group were randomly resampled and the number of ≥ 97% clusters remaining was then compared with the original frequency (panel B). The analysis shows that phylotype richness explained the relative abundance of a group even when each group was sampled with equal intensity. The Model II regression slope of log(frequency) versus log(cluster number) was 1.15 ± 0.18 (r² = 0.626; n = 12; P < 0.0001) for the original data and 2.63 ± 0.81 (r² = 0.518; n = 12; P < 0.01) for the resampled data.
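
The resampling step described for panel B can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `clusters_after_resampling`, the input layout (a mapping from cluster identifier to tag count within one group), and the choice to sample without replacement are all assumptions, since the caption does not specify them.

```python
import random


def clusters_after_resampling(cluster_counts, n_tags=500, seed=0):
    """Count the distinct clusters left after drawing n_tags tags at random.

    cluster_counts: dict mapping a cluster id to its number of tags
    within a single group (hypothetical input layout).
    """
    # Expand the counts into one entry per tag, labelled by its cluster.
    tags = [cid for cid, n in cluster_counts.items() for _ in range(n)]
    if len(tags) < n_tags:
        raise ValueError("group has fewer tags than the resampling depth")

    # Assumption: tags are drawn without replacement.
    rng = random.Random(seed)
    sample = rng.sample(tags, n_tags)

    # Number of clusters still represented in the subsample.
    return len(set(sample))
```

Applying this to every group at the same depth (500 tags in the caption) gives cluster counts that no longer depend on how many sequences each group contributed.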

Fig. S2. The average frequency of ≥ 90% clusters as a function of the number of ≥ 97% clusters in each ≥ 90% cluster.

Fig. S3. A measure of evenness (Gini coefficient) for each ≥ 97% cluster versus the average frequency of that cluster. A perfectly even cluster would have a Gini coefficient of 1 whereas a very uneven cluster would have a coefficient of 0.

Table S1. Number of ribotypes and ≥ 97% clusters belonging to the SAR11 clade.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Filename                     Format   Size   Description
EMI_2154_sm_tS1_fS1-4.pdf    PDF      61K    Supporting info item
