COVER: a priori estimation of coverage for metagenomic sequencing


E-mail; Tel. (+34) 91 585 4573; Fax (+34) 91 585 4506.


In any metagenomic project, the coverage obtained for each species depends on its abundance. This makes it difficult to determine a priori how much DNA sequencing is needed to obtain high coverage of the dominant genomes in an environment. To aid the design of metagenomic sequencing projects, we have developed COVER, a web-based tool that estimates the coverage achieved for each species in an environmental sample. COVER uses a set of 16S rRNA sequences to estimate the number of operational taxonomic units (OTUs) in the sample, assigns them taxonomically, estimates their genome sizes and, most critically, corrects for the number of unobserved OTUs. COVER then calculates the amount of sequencing needed to achieve a given coverage goal. Our tests and simulations indicate that the results obtained through COVER are in very good agreement with the experimental results.
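The dependence of coverage on abundance can be illustrated with a minimal sketch. It is not COVER's actual implementation, which is not specified here. It assumes that 16S read counts are proportional to cell abundance (16S copy-number variation is ignored), that unobserved OTUs are summarised by a single user-supplied abundance fraction and genome size, and that coverage is simply sequenced bases divided by genome size. All function and parameter names are hypothetical.

```python
def community_profile(otu_counts, genome_sizes_bp,
                      unseen_fraction=0.0, unseen_genome_size_bp=3.0e6):
    """Return (relative abundances, abundance-weighted mean genome size).

    Assumption: observed OTUs share (1 - unseen_fraction) of all cells;
    the remainder belongs to unobserved OTUs of an assumed genome size.
    """
    total = sum(otu_counts.values())
    abundance = {otu: (1.0 - unseen_fraction) * n / total
                 for otu, n in otu_counts.items()}
    mean_genome_bp = sum(abundance[otu] * genome_sizes_bp[otu]
                         for otu in abundance)
    mean_genome_bp += unseen_fraction * unseen_genome_size_bp
    return abundance, mean_genome_bp


def expected_coverage(sequenced_bp, otu_counts, genome_sizes_bp, **unseen):
    """Expected coverage per OTU for a given sequencing effort in bp.

    The share of DNA from OTU i is a_i * G_i / sum_j(a_j * G_j), so its
    coverage is sequenced_bp * a_i / sum_j(a_j * G_j).
    """
    abundance, mean_genome_bp = community_profile(
        otu_counts, genome_sizes_bp, **unseen)
    return {otu: sequenced_bp * a / mean_genome_bp
            for otu, a in abundance.items()}


def required_sequencing(target_otu, target_coverage,
                        otu_counts, genome_sizes_bp, **unseen):
    """Sequencing effort (bp) needed to reach target_coverage on one OTU."""
    abundance, mean_genome_bp = community_profile(
        otu_counts, genome_sizes_bp, **unseen)
    return target_coverage * mean_genome_bp / abundance[target_otu]


# Hypothetical toy community: 16S read counts and estimated genome sizes.
counts = {"otu_A": 50, "otu_B": 30, "otu_C": 20}
sizes = {"otu_A": 4.0e6, "otu_B": 2.0e6, "otu_C": 5.0e6}

effort = required_sequencing("otu_A", 10.0, counts, sizes)
effort_with_unseen = required_sequencing(
    "otu_A", 10.0, counts, sizes, unseen_fraction=0.2)
```

In this sketch, reserving part of the community for unobserved OTUs lowers the abundance of every observed OTU, so the sequencing effort needed to reach the same coverage rises. This is why the correction for unobserved OTUs matters for the estimate.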