Standard Article

Protein Family Databases

  1. Nicola J Mulder

Published Online: 15 OCT 2012

DOI: 10.1002/9780470015902.a0003058.pub3



How to Cite

Mulder, N. J. 2012. Protein Family Databases. eLS.

Author Information

  1. University of Cape Town, Department of Clinical Laboratory Sciences, IIDMM, Cape Town, South Africa



As advances in sequencing technologies continue to flood public databases with new protein sequences, protein family databases become increasingly important for automatic protein functional classification. These databases are developed independently, and each has its own methods and areas of interest, as well as its own strengths and weaknesses. To simplify user access to multiple databases, many of them have been amalgamated into integrated protein family resources, which vary in their level of manual curation. Protein family databases and integrated resources have a number of applications in modern biology and bioinformatics, including protein functional annotation, orthologue prediction, protein–protein interaction prediction, gene set enrichment analysis and the provision of datasets for evaluating mathematical models of biological systems or networks.

Key Concepts:

  • Protein signatures are mathematical descriptions of the sequence characteristics of members of the same protein family or domain.

  • Profiles and hidden Markov models are tools for characterising protein families or domains.

  • Regular expressions or patterns are used for describing short highly conserved motifs.

  • Protein family data has a number of applications, notably for the functional classification of new protein sequences.
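
As an illustration of how a pattern can describe a short conserved motif, the following minimal sketch translates a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The function names `prosite_to_regex` and `find_motif` and the example sequence are hypothetical and introduced only for this sketch. The pattern `N-{P}-[ST]-{P}` is the widely quoted N-glycosylation site motif. Only a subset of the pattern syntax is handled (residues, `x`, `[..]`, `{..}` and repeat counts), so this should not be read as how any particular database implements its search.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex (subset of the syntax)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as "x(2,4)" into its core and optional repeat counts.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # "[ST]" and single residues are already valid regex fragments.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)


def find_motif(sequence: str, pattern: str):
    """Return (1-based position, matched text) for every hit, including overlaps."""
    # A lookahead keeps the scan from skipping matches that overlap.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MKNASAANPTQNGSL"  # made-up sequence, for illustration only
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_motif(toy_sequence, "N-{P}-[ST]-{P}"))
```

In the toy sequence, the asparagine at position 8 is followed by a proline and is therefore rejected by the `{P}` element. This shows how a pattern encodes both required and excluded residues. Profiles and hidden Markov models replace such yes-or-no matching with position-specific scores, which is why they suit longer and more divergent families or domains.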


Keywords:

  • protein family;
  • domain;
  • annotation;
  • functional classification;
  • profiles;
  • hidden Markov models