Assessing the relative importance of the biophysical properties of amino acid substitutions associated with human genetic disease

Authors

  • Bent N. Terp,

    1. Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff, UK
    Current affiliation:
    1. Center for Genomics and Bioinformatics, Karolinska Institutet, von Eulers Väg 8, 171 77 Stockholm, Sweden
  • David N. Cooper,

    Corresponding author
    1. Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff, UK
    • Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff CF14 4XN, UK
  • Inge T. Christensen,

    1. Medical Chemistry Research, Novo Nordisk A/S, Maaloev, Denmark
  • Flemming S. Jørgensen,

    1. Royal Danish School of Pharmacy, Department of Medicinal Chemistry, Copenhagen, Denmark
  • Peter Bross,

    1. Research Unit for Molecular Medicine, Institute of Experimental Clinical Research, Aarhus, Denmark
  • Niels Gregersen,

    1. Research Unit for Molecular Medicine, Institute of Experimental Clinical Research, Aarhus, Denmark
  • Michael Krawczak

    1. Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff, UK
    Current affiliation:
    1. Institut für Medizinische Informatik und Statistik, Christian-Albrechts-Universität, Brunswiker Straße 10, 24105 Kiel, Germany

Abstract

The inclusion of a mutation in a pathology-based database such as the Human Gene Mutation Database (HGMD) is a two-stage process: first, the mutation must occur at the DNA level, then it must cause a clinically detectable disease state. The likelihood of the latter step, termed the relative clinical observation likelihood (RCOL), can be regarded as a function of the structural/functional consequences of a mutation at the protein level. Following this paradigm, we modeled in silico all amino acid replacements that could potentially have arisen from an inherited single base pair substitution in five human genes encoding arylsulphatase A (ARSA), antithrombin III (SERPINC1), protein C (PROC), phenylalanine hydroxylase (PAH), and transthyretin (TTR). These proteins were chosen on the basis of 1) the availability of a crystallographic structure, and 2) a sufficiently large number of amino acid replacements being logged in HGMD. A total of 9,795 possible mutant structures were modeled and 20 different biophysical parameters assessed. Together with the HGMD-derived spectra of clinically detected mutations, these data allowed maximum likelihood estimation of RCOL profiles for the 20 parameters studied. Nine parameters (including energy difference between wild-type and mutant structures, accessibility of the mutated residue, and distance from the binding/active site) exhibited statistically significant variability in their RCOL profiles, indicating that mutation-associated changes affected protein function. As yet, however, a biological meaning could only be attributed to the RCOL profiles of solvent accessibility and, for three proteins, local energy change, disturbed geometry, and distance from the active center. The limited ability of the biophysical properties of mutations to explain clinical consequences is probably due to our current lack of understanding as to which amino acid residues are critical for protein folding. However, since the proteins examined here were unrelated, and our findings consistent, it may nevertheless prove possible to extrapolate to other proteins whose dysfunction underlies inherited disease. © 2002 Wiley-Liss, Inc.
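
The first step described in the abstract, listing every amino acid replacement that a single base pair substitution could produce, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `missense_replacements` and `enumerate_replacements` are hypothetical, the standard genetic code is assumed, and only the coding sequence is considered (nonsense and synonymous changes are skipped).

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... (assumption:
# the standard nuclear code applies to the genes of interest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def missense_replacements(codon):
    """Return the amino acids reachable from `codon` by one base substitution.

    Synonymous changes and changes to a stop codon ('*') are excluded.
    """
    wild_type = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a substitution
            mutant_codon = codon[:pos] + base + codon[pos + 1:]
            mutant = CODON_TABLE[mutant_codon]
            if mutant != wild_type and mutant != "*":
                reachable.add(mutant)
    return reachable


def enumerate_replacements(cds):
    """Yield (residue_number, wild_type, mutant) for every possible replacement.

    `cds` is a coding DNA sequence in upper case; trailing bases that do not
    form a complete codon are ignored, as are stop codons.
    """
    usable_length = len(cds) - len(cds) % 3
    for start in range(0, usable_length, 3):
        codon = cds[start:start + 3]
        wild_type = CODON_TABLE[codon]
        if wild_type == "*":
            continue
        for mutant in sorted(missense_replacements(codon)):
            yield (start // 3 + 1, wild_type, mutant)


if __name__ == "__main__":
    # Toy two-codon sequence (Met-Trp), not taken from any of the five genes.
    for residue, wild_type, mutant in enumerate_replacements("ATGTGG"):
        print(f"{wild_type}{residue}{mutant}")
```

Each tuple produced this way corresponds to one candidate mutant structure of the kind the abstract says was modeled; comparing such a list of possible replacements with the replacements actually logged in a database is what makes an observation likelihood estimable at all.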
