Standard Article

PANTHER: Protein families and subfamilies modeled on the divergence of function

Part 3. Proteomics

3.6. Proteome Families

Short Specialist Review

  1. Paul D. Thomas

Published Online: 15 OCT 2006

DOI: 10.1002/047001153X.g306319

Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics

How to Cite

Thomas, P. D. 2006. PANTHER: Protein families and subfamilies modeled on the divergence of function. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. 3:3.6.

Author Information

  1. Director, Evolutionary Systems Biology Group, SRI International, Menlo Park, CA, US

Publication History

  1. Published Online: 15 OCT 2006

Abstract

Protein Analysis Through Evolutionary Relationships (PANTHER) is a resource that defines a molecular taxonomy of proteins and relates this taxonomy to computational representations of function, spanning molecular functions, biological processes, and biochemical pathways. In this way, PANTHER provides an infrastructure for exploring the relationships between protein sequence evolution and the evolution of function at both the molecular and pathway levels. Proteins are grouped into families, and the relationships between sequences in each family are represented as a phylogenetic tree. The trees are constructed using sequence conservation-based distances that may reflect the functional constraints, over evolutionary time, on proteins of the same or similar functions. The current version of PANTHER contains calculated trees for over 5000 protein families, spanning over 80% of the protein-coding genes in mammalian genomes and their known relatives from all kingdoms of life. Expert biologists have divided these families into over 30 000 subfamilies (clades), where each subfamily comprises proteins that share a distinct function. Subfamilies therefore represent the functional divergence (or “cooption”) events that occurred during the family's evolutionary history. Each subfamily is associated, when possible, with ontology terms that describe the functions of its constituent proteins and, increasingly, with components in biochemical pathways. In addition, the sequences in each family and subfamily are represented as a statistical amino acid “signature”, allowing classification of new sequences and exploration of the relationships between protein sequence and function.
PANTHER is freely available at http://www.pantherdb.org, where users can submit protein sequences for classification, search and browse the classifications and trees, and employ computational tools to analyze their own data, such as gene expression and genetic mutation data, in the context of protein families and functions.
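
The idea of a statistical amino acid “signature” used to classify new sequences can be illustrated with a minimal sketch. The code below is not PANTHER's implementation: the keywords indicate Hidden Markov Models, whereas this example substitutes a simpler, gap-free position-specific scoring profile. The subfamily names (`SF1`, `SF2`), the toy aligned sequences, the uniform background, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Build per-column log-odds scores against a uniform background.

    Assumes all sequences are pre-aligned, gap-free, and of equal length
    (a simplification; a profile HMM also models insertions and deletions).
    """
    length = len(aligned_seqs[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_seqs)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, seq):
    """Sum the column-wise log-odds scores for a sequence."""
    return sum(column[aa] for column, aa in zip(profile, seq))


def classify(profiles, seq):
    """Return the name of the best-scoring subfamily profile."""
    return max(profiles, key=lambda name: score(profiles[name], seq))


# Hypothetical toy subfamilies, not real PANTHER data.
subfamilies = {
    "SF1": ["ACDKL", "ACDKI", "ACEKL"],
    "SF2": ["GHWPY", "GHWPF", "GHYPY"],
}
profiles = {name: build_profile(seqs) for name, seqs in subfamilies.items()}

print(classify(profiles, "ACDKL"))
```

A new sequence is assigned to whichever subfamily profile gives it the highest score, which mirrors, in a much reduced form, the notion of classifying a sequence by its best-matching family or subfamily signature.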

Keywords:

  • Bioinformatics;
  • databases;
  • protein families;
  • protein domains;
  • Hidden Markov Models