
Filename, Format, Size, Description
emi412145-sup-0001-si.tiff (5452K)

Fig. S1. sigi-hmm predictions in 54 E. coli strains. The left panel shows sigi-hmm predictions (cyan regions) in the genomes of 54 E. coli strains. Each genome is coloured according to genomic %AT, from light blue (most AT rich) to dark blue (most GC rich). The horizontal axis designates the chromosomal position in bp, while the vertical axis to the right indicates the total size of predicted GIs (bp). The right panel shows more detailed annotations of E. coli O26:H11 strain 11368, the E. coli strain with the largest genome, and E. coli K-12 substrain MG1655, a non-pathogenic model organism with one of the smallest E. coli genomes. Gene names ending with an underscore indicate multiple variants of the gene. ECO26_P_ designates multiple variants of E. coli O26:H11 prophages. More details regarding these annotations can be found in Supporting Information Tables S5 and S6 for E. coli K-12 and E. coli O26:H11, respectively. Differences in the graphical representation of horizontally transferred (HT) regions between the left and right panels are due to differences in scale.

emi412145-sup-0002-si.docx (34K)

Table S1. Results from phylum level genome size versus AT content robust regression analysis. The table includes regression estimates (column 2) for the bacteria in each phylogenetic group and Proteobacteria subphyla (column 1) as well as standard error (column 3), t-statistic (column 4), P-values (column 5) and number n of strains included in the analysis (column 6).

Table S2. Results from species-level genome size versus AT content robust regression analysis. The table shows the results from regression analyses between genome size and genomic %AT in the strains of the designated species in column 1. Column 2 shows the regression estimates, column 3 standard error, column 4 t-statistic, column 5 P-value and column 6 the number n of strains included in each regression model.

Table S3. Robust regression analysis of relative entropy versus genomic %AT, species level. The table shows the results from regression analyses between genome-based relative entropy and genomic %AT in the strains of the designated species in column 1. Column 2 shows the regression estimates, column 3 standard error, column 4 t-statistic, column 5 P-value and column 6 the number n of strains included in each regression model.

Table S4. Robust regression analysis of relative entropy versus genome size, species level. The table shows the results from regression analyses between genome-based relative entropy and genome size in the strains of the designated species in column 1. Column 2 shows the regression estimate, column 3 standard error, column 4 t-statistic, column 5 P-value and column 6 the number n of strains included in each regression model.

emi412145-sup-0003-si.xls (28K)

Table S5. Annotations of sigi-hmm-predicted regions in E. coli K-12 substrain MG1655 in Excel format. The NCBI name and chromosome position of each sigi-hmm prediction are found in the first column, with a more detailed explanation of the predicted DNA in columns 2 and 3. All annotations are taken from the IslandViewer website.

emi412145-sup-0004-si.xls (30K)

Table S6. Annotations of sigi-hmm-predicted regions in E. coli O26:H11 strain 11368 in Excel format. The NCBI name and chromosome position of each sigi-hmm prediction are found in the first column, with a more detailed explanation of the predicted DNA in columns 2 and 3. All annotations are taken from the IslandViewer website.

emi412145-sup-0005-appendixs1.doc (24K)

Appendix S1. More information on the robust regression method used in this study.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.