The CATH domain structure database
Part 3. Proteomics
3.6. Proteome Families
Short Specialist Review
Published Online: 15 APR 2005
Copyright © 2005 John Wiley & Sons, Ltd
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics
How to Cite
Pearl, F., Bennett, C. and Orengo, C. 2005. The CATH domain structure database. Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics. 3:3.6:89.
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath/) currently contains 60 435 domain structures classified into 917 fold groups, 1606 superfamilies, and 5202 sequence families. Recent developments include improved methods for rapidly recognizing domain boundaries in multidomain proteins. These methods exploit the principle that domains recur during evolution. Algorithms have been developed that identify recurring domains using a fast method, CATHEDRAL, that compares secondary structure arrangements between proteins. In a recent CATH release, 75% of protein chains from the Protein Data Bank (PDB) that had no significant sequence similarity to entries in CATH contained domains that could be recognized using this approach. Because domain boundary assignment is a significant bottleneck in the classification of new structures, CATHEDRAL will also help increase the frequency of CATH updates. CATH has also recently been used to provide structural annotations for completed genomes. The Web-based Gene3D resource assigns complete and partial genome sequences from 120 completed genomes to CATH domain structure superfamilies.
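The idea of recognizing domain boundaries through domain recurrence can be illustrated with a deliberately simplified sketch. It is not the CATHEDRAL algorithm, whose details are not given here. It assumes that each protein is reduced to a string of secondary structure elements (H for helix, E for strand) and that a known domain recurs as an exact contiguous substring. The library entries, labels, and the function name `find_domains` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical library: domain label -> string of secondary structure
# elements (H = helix, E = strand). Invented for illustration only.
DOMAIN_LIBRARY: Dict[str, str] = {
    "toy_alpha_beta": "EHEHE",
    "toy_all_alpha": "HHHH",
}


def find_domains(chain_sse: str, library: Dict[str, str]) -> List[Tuple[str, int, int]]:
    """Scan a chain left to right and report non-overlapping library hits.

    Each hit is (label, start, end) with end exclusive. When several
    library patterns match at the same position, the longest one wins.
    """
    hits: List[Tuple[str, int, int]] = []
    pos = 0
    while pos < len(chain_sse):
        best = None  # (label, pattern) of the longest match at this position
        for label, pattern in library.items():
            if chain_sse.startswith(pattern, pos):
                if best is None or len(pattern) > len(best[1]):
                    best = (label, pattern)
        if best is None:
            pos += 1  # no known domain starts here; move on one element
        else:
            label, pattern = best
            hits.append((label, pos, pos + len(pattern)))
            pos += len(pattern)  # the end of the hit is a putative domain boundary
    return hits


if __name__ == "__main__":
    # A two-domain toy chain: the boundary falls between elements 4 and 5.
    print(find_domains("EHEHEHHHH", DOMAIN_LIBRARY))
```

Real structure comparison must tolerate insertions, deletions, and differences in the geometry of secondary structure elements, so exact string matching serves here only to show how a library of previously classified domains can suggest boundaries in a new multidomain chain.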
- protein structure classification and comparison;
- domain boundary recognition