Computational motif discovery
Part 4. Bioinformatics
4.2. Gene Finding and Gene Structure
Basic Techniques and Approaches
Published Online: 15 JAN 2005
Copyright © 2005 John Wiley & Sons, Ltd
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics
How to Cite
Tompa, M. 2005. Computational motif discovery. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. 4:4.2:28.
The goal of computational motif discovery is to predict short subsequences of biological sequences that are good candidates to serve some biological function. This article focuses on the computational prediction of protein binding sites in nucleotide sequences. Three types of motif model are described: consensus, IUPAC, and weight matrix. Two types of application are described: statistical overrepresentation (in which the input sequences come from a single genome and are believed to contain instances of a single motif) and phylogenetic footprinting (in which the input sequences are homologous, typically one from each of multiple related genomes). Programs of each of these types are briefly described, with references to fuller descriptions and web sites where the programs are available.
- binding site;
- weight matrix;
- position-specific scoring matrix;
- statistical overrepresentation;
- phylogenetic footprinting
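
The three motif models named in the abstract (consensus, IUPAC, and weight matrix) can be illustrated with a minimal Python sketch. Nothing here is taken from the article or from any of the programs it surveys: the function names, the toy binding sites, the pseudocount of 1, and the uniform background frequency of 0.25 are all assumptions chosen for illustration. The weight matrix is built as a simple log-odds matrix, which is one common formulation rather than the only one.

```python
import math

# Standard IUPAC nucleotide codes mapped to the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_mismatches(consensus, site):
    """Consensus model: count positions where the site differs from the consensus."""
    return sum(1 for c, s in zip(consensus, site) if c != s)


def iupac_matches(pattern, site):
    """IUPAC model: True if every base of the site is allowed by the pattern."""
    return len(pattern) == len(site) and all(
        s in IUPAC[p] for p, s in zip(pattern, site)
    )


def weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Weight matrix model: per-position log2 odds from aligned, equal-length sites.

    The pseudocount and uniform background are illustrative assumptions.
    """
    total = len(sites) + 4 * pseudocount
    matrix = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        matrix.append({
            b: math.log2((column.count(b) + pseudocount) / total / background)
            for b in "ACGT"
        })
    return matrix


def score(matrix, site):
    """Sum the per-position weights for one candidate site."""
    return sum(col[b] for col, b in zip(matrix, site))


def best_hit(matrix, sequence):
    """Slide the matrix along the sequence; return (best score, start index)."""
    w = len(matrix)
    return max(
        (score(matrix, sequence[i:i + w]), i)
        for i in range(len(sequence) - w + 1)
    )


# Hypothetical aligned sites, invented for this example only.
toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
toy_matrix = weight_matrix(toy_sites)
```

With these toy sites, the consensus TATAAT accepts a site only up to a chosen number of mismatches, the IUPAC pattern TAYRAT accepts exactly the variation seen at positions 3 and 4, and the weight matrix assigns every candidate a graded score, so `best_hit(toy_matrix, sequence)` can rank windows of a longer sequence instead of giving a yes/no answer.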