Sequence-based prediction of pathological mutations

Authors

  • C. Ferrer-Costa,

    1. Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomédica, Parc Científic de Barcelona, Barcelona, Spain
  • M. Orozco,

    Corresponding author
    1. Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomédica, Parc Científic de Barcelona, Barcelona, Spain
    2. Departament de Bioquímica i Biologia Molecular, Facultat de Química, Universitat de Barcelona, Barcelona, Spain
    • Parc Científic de Barcelona, C/Josep Samitier, 1-5, 08028 Barcelona, Spain
  • X. de la Cruz

    Corresponding author
    1. Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomédica, Parc Científic de Barcelona, Barcelona, Spain
    2. Institució Catalana per la Recerca i Estudis Avançats (ICREA), Barcelona, Spain
    • Parc Científic de Barcelona, C/Josep Samitier, 1-5, 08028 Barcelona, Spain

Abstract

The development of methods to assess the impact of amino acid mutations on human health has become an important goal in biomedical research, owing to the growing number of nonsynonymous SNPs identified. Within this context, computational methods are a valuable tool because they can easily process large numbers of mutations and give useful, almost cost-free information on their pathological character. In this paper we present a computational approach to the prediction of disease-associated amino acid mutations that uses only sequence-based information (amino acid properties, evolutionary information, secondary structure and accessibility predictions, and database annotations), with neural networks as the model-building tool. Mutations are predicted to be either pathological or neutral. Our results show that the method has a good overall success rate of 83%, which can reach 95% when the method is trained for specific proteins. The methodology is fast and flexible enough to provide good estimates of the pathological character of large sets of nonsynonymous SNPs, and it can also be easily adapted to give more precise predictions for proteins of special biomedical interest. Proteins 2004. © 2004 Wiley-Liss, Inc.
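
To make the general idea concrete, the following is a minimal illustrative sketch, not the authors' implementation, of how a single amino acid substitution might be encoded from sequence-derived quantities and scored by a small feed-forward network. The feature set (a hydropathy change on the Kyte-Doolittle scale plus a `conservation` score), the function names, the network size, and all weights are assumptions chosen for illustration; the weights are untrained placeholders, so the printed score carries no biological meaning.

```python
import math

# Kyte-Doolittle hydropathy index, a standard published amino acid scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mutation_features(wild, mutant, conservation):
    """Encode one substitution as a small numeric vector.

    `conservation` is a hypothetical score in [0, 1] describing how
    conserved the mutated position is in a sequence alignment.
    """
    delta = KD[mutant] - KD[wild]
    return [delta, abs(delta), conservation]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass of a network with one sigmoid hidden layer."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(ws, features)) + b)
        for ws, b in zip(hidden_w, hidden_b)
    ]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


# Untrained placeholder weights (assumption): a real model would learn
# these from labelled pathological and neutral mutations.
HIDDEN_W = [[0.1, 0.3, 1.0], [-0.2, 0.1, 0.5]]
HIDDEN_B = [0.0, 0.0]
OUT_W = [1.0, 1.0]
OUT_B = -1.0

features = mutation_features("L", "P", conservation=0.9)
score = forward(features, HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)
label = "pathological" if score >= 0.5 else "neutral"
print(label, round(score, 3))
```

The two-class output mirrors the pathological/neutral decision described in the abstract; the other inputs the abstract names (secondary structure and accessibility predictions, database annotations) would simply extend the feature vector.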
