Present address: PO Box 3674, Garibaldi Highlands, BC, Canada V0N 1T0.
Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases
Version of Record online: 21 APR 2009
© 2009 Blackwell Publishing Ltd and Crown in the right of Canada
Molecular Ecology Resources
Special Issue: Special Issue on Barcoding Life
Volume 9, Issue Supplement s1, pages 58–64, May 2009
How to Cite
ZAHARIEV, M., DAHL, V., CHEN, W. and LÉVESQUE, C. A. (2009), Efficient algorithms for the discovery of DNA oligonucleotide barcodes from sequence databases. Molecular Ecology Resources, 9: 58–64. doi: 10.1111/j.1755-0998.2009.02651.x
- Issue online: 21 APR 2009
- Received 31 October 2008; revision received 15 January 2009; accepted 30 January 2009
- array designer;
- reverse dot blot hybridization;
- single nucleotide polymorphism (SNP)
Efficient design of barcode oligonucleotides can lead to significant cost reductions in the manufacturing of DNA arrays. Previous methods are based on a preliminary alignment, which reduces their efficiency for intron-rich regions; on a brute-force approach, which is not feasible for large-scale problems; or on data structures with very poor worst-case performance. One of the algorithms we propose uses ‘oligonucleotide sorting’ to discover oligonucleotide barcodes of given sizes with good asymptotic performance. For each individual sequence, it finds specific barcode oligonucleotides that differ by at least one base from all other sequences in the database. With another algorithm, specific oligonucleotides can also be found for groups or clades in the database: these show 100% homology across all sequences within the group or clade while differing from the rest of the data. By re-organizing the sequences or groups in the database, oligonucleotides for different hierarchical levels can be found. The oligonucleotides or polymorphism locations identified as species- or clade-specific by the new algorithm are then refined and screened for hybridization thermodynamic properties with third-party software.
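The abstract does not give implementation details, so the following is only a minimal sketch of how a sorting-based search of this kind might look, not the authors' algorithm. It assumes exact matching of fixed-length oligonucleotides (k-mers), a toy in-memory database, and hypothetical names (`kmers`, `sorted_kmer_runs`, `specific_oligos`, `group_oligos`, and the example sequence labels). All (k-mer, sequence) pairs are sorted so that identical k-mers become adjacent; a k-mer owned by a single sequence is a candidate sequence-specific barcode, and a k-mer owned by exactly the members of a group is a candidate group-specific barcode.

```python
from itertools import groupby


def kmers(seq, k):
    """Yield every length-k substring (oligonucleotide) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sorted_kmer_runs(seqs, k):
    """Sort all (k-mer, sequence id) pairs, then yield (k-mer, set of ids).

    After sorting, identical k-mers are adjacent, so one linear pass with
    groupby collects the set of sequences that contain each k-mer.
    """
    pairs = sorted(
        (kmer, name) for name, seq in seqs.items() for kmer in kmers(seq, k)
    )
    for kmer, run in groupby(pairs, key=lambda pair: pair[0]):
        yield kmer, {name for _, name in run}


def specific_oligos(seqs, k):
    """Map each sequence id to the k-mers that occur in that sequence only."""
    result = {name: [] for name in seqs}
    for kmer, owners in sorted_kmer_runs(seqs, k):
        if len(owners) == 1:
            result[next(iter(owners))].append(kmer)
    return result


def group_oligos(seqs, group, k):
    """Return k-mers present in every member of group and in no other sequence."""
    group = set(group)
    return [kmer for kmer, owners in sorted_kmer_runs(seqs, k) if owners == group]


# Hypothetical toy database; real input would come from a sequence database.
toy_db = {"sp_A": "ACGTAC", "sp_B": "ACGTTC", "sp_C": "GGGTTC"}
per_sequence = specific_oligos(toy_db, 4)
clade_bc = group_oligos(toy_db, ["sp_B", "sp_C"], 4)
```

In this sketch the sort dominates the running time, and no alignment is needed. Hierarchical levels could be explored by calling `group_oligos` with different groupings of the same database. Thermodynamic screening of the candidates is left to external tools, as the abstract describes.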