Visualization of Pseudomonas genomic structure by abundant 8–14mer oligonucleotides


*E-mail, Tel. (+49) 511 532 6721; Fax (+49) 511 532 6723.


Under- and overrepresented mono- to hexanucleotides are signatures of bacterial genomes, but the compositional biases of octa- to tetradecanucleotides have not yet been explored. Thirteen completely sequenced genomes of the Pseudomonas genus were searched for highly overrepresented 8–14mers. Between 59 and 989 overrepresented 8–14mers exceeded the applied threshold value. The genomic data sets of all 13 strains showed a consistent pattern, with individual oligomers clustering in either non-coding or coding regions. Non-coding oligonucleotides were typically part of longer repeats. Coding oligonucleotides were evenly distributed in the core genome, preferred one reading frame and matched the local tetranucleotide usage patterns. Genomic islands were recognized by their depletion of overrepresented oligonucleotides. Several predominantly coding 8–14mers occurred on average once every 10 000 bp or more often. Such frequently occurring 8–14mers could become useful markers for species identification. With next-generation ultra-high-throughput DNA sequencing, the composition of bacterial metagenomes may in future be quantified by scanning the primary sequence reads for these 8–14mer markers.
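
The search for overrepresented oligonucleotides described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the abstract does not specify the statistic or the threshold, so the sketch assumes a simple zero-order background model (expected count derived from mononucleotide frequencies) and a hypothetical cut-off parameter `min_ratio` on the observed/expected ratio. The function names are likewise illustrative.

```python
from collections import Counter
from math import prod


def count_kmers(seq, k):
    """Count every overlapping k-mer on the given strand of seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented(seq, k, min_ratio):
    """Return {kmer: (observed, expected)} for k-mers whose
    observed/expected ratio is at least min_ratio.

    Assumption: the expected count comes from a zero-order model,
    i.e. the product of the mononucleotide frequencies of the k-mer
    multiplied by the number of k-mer windows in the sequence.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    base_freq = {b: seq.count(b) / len(seq) for b in "ACGT"}
    hits = {}
    for kmer, observed in count_kmers(seq, k).items():
        if set(kmer) - set("ACGT"):
            continue  # skip windows containing ambiguity codes such as N
        expected = n_windows * prod(base_freq[b] for b in kmer)
        if expected > 0 and observed / expected >= min_ratio:
            hits[kmer] = (observed, expected)
    return hits


if __name__ == "__main__":
    toy = "ACGTACGT"  # toy input, not genomic data
    for kmer, (obs, exp) in overrepresented(toy, 4, 100).items():
        print(kmer, obs, round(exp, 4))
```

For a real genome one would run this for each k from 8 to 14 and on both strands. A higher-order background model (for example, one built from shorter oligonucleotide frequencies) would be a natural refinement, since mononucleotide frequencies alone ignore the tetranucleotide usage patterns mentioned above.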