Fig. S1. Read length distribution obtained from the Matapan metagenome after trimming.

Fig. S2. Contig length distribution obtained from the Matapan metagenome assembled with the GS De Novo Assembler of the Genome Sequencer FLX data analysis suite (version 2.3) using default parameters.

Fig. S3. A. Relative proportions of domains identified in different metagenomic data sets based on the taxonomic binning of protein-encoding genes against the SEED database. The total number of protein-encoding gene matches in each metagenome is indicated in parentheses. The taxonomic assignment was carried out using a cut-off expectation (E) value of 1e-05 and an alignment length of 30 bp within the MG-RAST pipeline. B. Relative proportions of archaeal orders identified in different metagenomic data sets based on the taxonomic binning of protein-encoding genes against the SEED database. Only orders with ≥ 0.01% of matches in at least one of the eight metagenomes compared are shown. The total number of protein-encoding gene matches in each metagenome is indicated in parentheses. The taxonomic assignment was carried out using a cut-off expectation (E) value of 1e-05 and an alignment length of 30 bp within the MG-RAST pipeline. C. Relative proportions of eubacterial classes identified in different metagenomic data sets based on the taxonomic binning of protein-encoding genes against the SEED database. Only classes with ≥ 0.01% of matches in at least one of the eight metagenomes compared are shown. The total number of protein-encoding gene matches in each metagenome is indicated in parentheses. The taxonomic assignment was carried out using a cut-off expectation (E) value of 1e-05 and an alignment length of 30 bp within the MG-RAST pipeline. D. Relative proportions of eukaryotic phyla identified in different metagenomic data sets based on the taxonomic binning of protein-encoding genes against the SEED database. Only phyla with ≥ 0.01% of matches in at least one of the eight metagenomes compared are shown. The total number of protein-encoding gene matches in each metagenome is indicated in parentheses. The taxonomic assignment was carried out using a cut-off expectation (E) value of 1e-05 and an alignment length of 30 bp within the MG-RAST pipeline.

Fig. S4. Cluster analysis of several metagenomes based on matches to different KEGG categories, expressed as a percentage of the number of proteins identified in each metagenome.

Table S1. A. Taxonomic classification performed by MG-RAST using the RDP SSU database. B. Taxonomic classification performed by MG-RAST using the SILVA SSU database. C. Taxonomic classification performed by MG-RAST using the SEED protein database. Results are expressed as the number of proteins identified.

Table S2. A. Percentage of SEED, COG and KEGG functional categories in all metagenomes analysed. B. Comparison of transposase protein abundance in all metagenomes analysed.

Table S3. A. Classification into SEED functional categories performed by MG-RAST. Results are expressed as the number of proteins identified. B. Classification into COG functional categories performed by MG-RAST. Results are expressed as the number of proteins identified. C. Classification into KEGG functional categories performed by MG-RAST. Results are expressed as the number of proteins identified.

Table S4. A. CRISPR summary table by metagenome project. B. Summary table of CRISPRs in all metagenomes of this study. C. Spacer summary table by metagenome project.

Filename                  Format  Size   Description
emi2827_sm_FigS1-4.doc    doc     1181K  Supporting info item
emi2827_sm_TabS1-4.doc    doc     703K   Supporting info item

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.