UNIT 2.5 Identifying Protein Domains with the Pfam Database

  1. Penny Coggill,
  2. Robert D. Finn,
  3. Alex Bateman

Published Online: 1 SEP 2008

DOI: 10.1002/0471250953.bi0205s23

Current Protocols in Bioinformatics

How to Cite

Coggill, P., Finn, R. D. and Bateman, A. 2008. Identifying Protein Domains with the Pfam Database. Current Protocols in Bioinformatics. 23:2.5:2.5.1–2.5.17.

Author Information

  1. Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom

Publication History

  1. Published Online: 1 SEP 2008
  2. Published Print: SEP 2008


Pfam is a database of protein domain families, with each family represented by multiple sequence alignments and profile hidden Markov models (HMMs). In addition, each family has associated annotation, literature references, and links to other databases. The entries in Pfam are available via the World Wide Web and in flatfile format. This unit contains detailed information on how to access and utilize the information present in the Pfam database, namely the families, multiple alignments, and annotation. Details on running Pfam, both remotely and locally, are presented. Curr. Protoc. Bioinform. 23:2.5.1-2.5.17. © 2008 by John Wiley & Sons, Inc.


  • protein domain;
  • HMM;
  • protein family;
  • superfamily;
  • sequence alignment;
  • sequence analysis