Present address: College of Medicine, Swansea University, Swansea SA2 8PP, UK.
Phylogenetic distribution of traits associated with plant colonization in Escherichia coli
Article first published online: 30 AUG 2012
© 2012 Society for Applied Microbiology and Blackwell Publishing Ltd
Special Issue: Environmental Ecology of Pathogens and Resistances
Volume 15, Issue 2, pages 487–501, February 2013
How to Cite
Méric, G., Kemsley, E. K., Falush, D., Saggers, E. J. and Lucchini, S. (2013), Phylogenetic distribution of traits associated with plant colonization in Escherichia coli. Environmental Microbiology, 15: 487–501. doi: 10.1111/j.1462-2920.2012.02852.x
- Issue published online: 28 JAN 2013
- Accepted manuscript online: 30 JUL 2012 07:08AM EST
- Manuscript Accepted: 15 JUL 2012
- Manuscript Revised: 13 MAY 2012
- Manuscript Received: 12 JAN 2012
Fig. S1. Isolation of 106 plant-associated strains.
A. Geographical location of sampling of GMB isolates. ‘Location 6’ encompasses locations outside England; the yellow dot marks the location of our laboratory.
B. Time of sampling. The asterisk denotes strains that were isolated on the same day in the same field.
C and D. (C) Distribution of GMB isolates according to geographical location of sampling and (D) according to the plant of sampling. In (C) and (D), ‘mixed’ refers to isolation from post-harvest mixed salad samples containing salad from different locations or sources. In (B), ‘unknown’ refers to a strain (GMB103) whose isolation history is unknown.
Fig. S2. Phylogenetic tree of Escherichia sp. strains. A maximum-likelihood phylogenetic tree based on concatenated sequences of internal fragments of 8 housekeeping genes from GMB, ECOR and other Escherichia sp. strains was calculated. Sequences from the Escherichia sp. strains were part of a recent genome sequencing project (Luo et al., 2011) and retrieved from the NCBI website. GMB56 and 57 are clones of GMB46.
Fig. S3. Distribution of C-source utilization correlation coefficients (r²) within and between phylogroups. Box plots show the distribution of r² correlation coefficients for each pairwise comparison between phylogroups as displayed in the correlation matrix from Fig. S4. The coloured box plots represent intra-phylogroup comparisons; the blank box plots represent inter-phylogroup comparisons.
Fig. S4. Inter- and intra-phylogroup correlation of C-source utilization profiles. Box plots show the distribution of r² correlation coefficients for inter- and intra-phylogroup correlations. The asterisks indicate a statistically significant difference between distribution means as found by an unpaired t-test with Welch's correction (***P < 0.0001).
Fig. S5. Phenotypes used to define the ‘plant association index’ (PAi) differ between E. coli phylogroups.
A. Nutritional ability, as indicated by the average combined growth on the 18 C-sources most significantly associated with variation between ECOR and GMB, as shown in Table S1.
B. Biofilm formation at 28°C for 72 h.
C and D. Growth yields reached after 24 h on (C) sucrose and (D) pHPA. Asterisks indicate statistically significant differences between distributions as found by Dunn's comparison test after Kruskal–Wallis tests (*P < 0.05; **P < 0.001; ***P < 0.0001). The P-value of the corresponding Kruskal–Wallis test is indicated below the title of each graph.
Fig. S6. Empirical definition of a statistical threshold for positive carbon source utilization. (A) Histogram (red columns) of OD600 values across 26 non-utilized carbon sources; the black line is the kernel density estimate of the probability density function. (B) The corresponding cumulative distribution function, with the values at the 5% and 1% tails indicated. See Experimental procedures for details.
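The idea behind the Fig. S6 threshold can be illustrated with a minimal sketch: fit a Gaussian kernel density estimate to background OD600 readings, then read off the value at which the upper tail of its cumulative distribution falls to 5% or 1%. Everything specific here is an assumption made for illustration and is not taken from the article: the `background_od` values are invented, the bandwidth uses a normal-reference rule of thumb, and the function names are hypothetical. The authors' actual procedure is the one given in their Experimental procedures.

```python
import math
import statistics

# Hypothetical OD600 readings on non-utilized carbon sources (invented values,
# not data from the article).
background_od = [
    0.08, 0.10, 0.11, 0.12, 0.12, 0.13, 0.14, 0.15, 0.15, 0.16,
    0.17, 0.18, 0.19, 0.20, 0.22, 0.24, 0.25, 0.27, 0.30, 0.33,
    0.36, 0.40, 0.45,
]


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth, 1.06 * sd * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def kde_cdf(x, samples, bandwidth):
    """CDF of a Gaussian KDE at x: the mean of one normal CDF per sample."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - s) / scale)) for s in samples) / len(samples)


def upper_tail_threshold(samples, tail, tol=1e-9):
    """Value above which the KDE leaves a fraction `tail` of its probability mass."""
    h = rule_of_thumb_bandwidth(samples)
    target = 1.0 - tail
    # The KDE CDF is monotone, so bisection on a wide bracket is sufficient.
    lo, hi = min(samples) - 10.0 * h, max(samples) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for tail in (0.05, 0.01):
        print(f"{tail:.0%} tail threshold: OD600 = {upper_tail_threshold(background_od, tail):.3f}")
```

A growth reading above the chosen threshold would then be scored as positive utilization; a stricter tail (1% instead of 5%) gives a higher cut-off.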
Table S1. Individual GMB strain information. For each GMB strain, location and date of isolation are provided. Locations correspond to the ones indicated on the map presented in Fig. S1. The table is ordered by phylogroups. *‘Mixed’ indicates isolation from post-harvest mixed salad samples containing salad from different locations or sources.
Table S2. Morphotype and Plant Association Index (PAi) for plant and host-associated E. coli isolates. The PAi was calculated for the 173 strains (nECOR = 72; nGMB = 101) used in this study. The list is ordered by PAi, to which a heatmap colour scale has been applied for clarity (green, high PAi; red, low PAi; yellow, midpoint). Morphotypes on Congo red-containing agar are also indicated. See text and Experimental procedures for more details.
Table S3. Cross-validation success rates in PLS-DA using ECOR and GMB as groups for model dimensions up to 10. The maximum cross-validation success rate is highlighted and corresponds to 3 PLS-DA dimensions.
Table S4. C-sources differentially used by plant and host-associated E. coli isolates. Only C-sources used by more than 5% of strains (63/95 carbon sources) are shown; the table is ordered by the P-values of a Mann–Whitney–Wilcoxon test used to find C-sources differentially used between ECOR and GMB. Red text indicates the C-sources found statistically significant after a Bonferroni correction; blue highlighting indicates C-sources for which there is more than a 20% difference between ECOR and GMB utilization (% of strains reaching OD600 > 0.63 after 24 h at 37°C using the given compound as a sole C-source).
Table S5. Cross-validation success rates in PLS-DA using E. coli phylogroups as groups for model dimensions up to 10. The maximum cross-validation success rate (74.4%) is highlighted and corresponds to 7 PLS-DA dimensions. In Fig. 3, only two dimensions are represented for clarity, which correspond to a 62.5% success rate. Phylogroup E was excluded from the analysis because of its small sample size (n = 5).
Appendix S1. Supplementary methods.