Mark E. Peterson and Feng Chen contributed equally to this work.
Article
Evolutionary constraints on structural similarity in orthologs and paralogs†
Article first published online: 16 APR 2009
DOI: 10.1002/pro.143
Copyright © 2009 The Protein Society
Additional Information
How to Cite
Peterson, M. E., Chen, F., Saven, J. G., Roos, D. S., Babbitt, P. C. and Sali, A. (2009), Evolutionary constraints on structural similarity in orthologs and paralogs. Protein Science, 18: 1306–1315. doi: 10.1002/pro.143
Publication History
- Issue published online: 26 MAY 2009
- Article first published online: 16 APR 2009
- Accepted manuscript online: 16 APR 2009 12:00AM EST
- Manuscript Accepted: 30 MAR 2009
- Manuscript Revised: 29 MAR 2009
- Manuscript Received: 9 DEC 2008
Funded by
- The Sandler Family Supporting Foundation
- NIH. Grant Numbers: P01 GM71790, R01 GM60595, R01 GM61267, R01 GM54762, U54 GM074945

