Fig. S1. Monte Carlo simulations for cross-contigs between rumen viromes. The percent shared viral genotypes and percent permuted rank abundance are plotted. The colours represent the likelihood score for a given position. The black dot on each plot marks the best combination of percent shared and percent permuted. (A) Cull-7664 versus Lact-6993; (B) Cull-7664 versus Dry-7887; (C) Lact-6993 versus Dry-7887.

Fig. S2. Principal component analysis carried out in MG-RAST using normalized and centred data from the organismal or functional classifications of our rumen viromes and 10 publicly available ocean viromes. The red and green dots represent the rumen and ocean viromes respectively. A. M5NR database, E-value ≤ 0.001. B. SEED subsystems, E-value ≤ 1e-5.

Fig. S3. The distribution of significant matches of virome sequence reads from each cow to the GenBank non-redundant (NR) database based on BLASTX sequence similarities (E-value ≤ 0.001). The data were generated by the Joint Genome Institute using the MEGAN metagenome analysis software.

Fig. S4. The distribution of significant matches of virome sequence reads from each cow to a non-redundant viral database (NR_Viral_DB) based on TBLASTX sequence similarities (E-value < 0.001).

Fig. S5. Venn diagram showing the shared and unique hits to the NR_Viral_DB for three rumen viromes (TBLASTX; E-value ≤ 0.001).

Fig. S6. Distribution of sequences with similarity to the rumen viral metagenome (virome) based on TBLASTX sequence similarities of microbial reference genomes (A) and genome bins (B) to sequence reads from the rumen virome (E-value ≤ 0.001). Identities of the virome sequences were determined by TBLASTX comparison (E-value ≤ 0.001) to the NR_Viral_DB. Sequences classified as ‘other’ had significant similarity to a virome sequence that did not match any sequences in the NR_Viral_DB. The distributions of viruses, prophages and other sequences shown here are expressed as proportions of the total number of sequences from each genome that were similar to the rumen virome sequences.

Table S1. Percentage of sequences (mean ± SD) with similarity to SEED subsystems (E-value ≤ 1e-5).

Table S2. BLASTX comparison of the rumen virome to the Carbohydrate Active Enzyme (CAZy) database.

Table S3. Putative mobile elements detected in rumen microbial genomes and genome bins.

Table S4. Comparison of putative mobile elements from rumen microbial genomes and genome bins to the rumen virome (TBLASTX; E-value ≤ 0.001).

Table S5. CRISPR-associated (Cas) proteins detected in rumen microbial genomes and genome bins by RAST and GenBank.

Table S6. CRISPR-associated (Cas) proteins detected in the rumen viral and microbial metagenomes by MG-RAST or BLASTX comparisons to the NR database.

Table S7. Comparison of CRISPR spacer sequences from microbial reference genomes to three nucleotide databases (BLASTN; E-value < 0.001).

Table S8. Comparison of CRISPR spacer sequences, identified in the predicted open reading frames (ORFs) of the rumen metagenome generated by Hess and colleagues (2011), with three nucleotide databases (BLASTN; E-value ≤ 0.001, sequence identity > 90%).

EMI_2593_sm_fS1.tiff    1034K    Supporting info item
EMI_2593_sm_fS2.tiff    7985K    Supporting info item
EMI_2593_sm_fS3.tiff    765K     Supporting info item
EMI_2593_sm_fS4.tiff    572K     Supporting info item
EMI_2593_sm_fS5.tiff    394K     Supporting info item
EMI_2593_sm_fS6.tiff    1133K    Supporting info item
EMI_2593_sm_tS1-8.doc   2026K    Supporting info item

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.