Testing similarity measures with continuous and discrete protein models

Authors

  • Stefan Wallin,

    Corresponding author
    1. Complex Systems Division, Department of Theoretical Physics, Lund University, Sölvegatan 14A, SE-223 62 Lund, Sweden
  • Jochen Farwer,

    1. Free University of Berlin, Department of Biology, Chemistry and Pharmacy, Institute of Chemistry, Takustr. 6, D-14195 Berlin, Germany
  • Ugo Bastolla

    1. Centro de Astrobiología (CSIC-INTA), Ctra. de Ajalvir km. 4, 28850 Torrejón de Ardoz, Madrid, Spain

Abstract

There are many ways to define the distance between two protein structures and thereby assess their similarity. Here, we investigate and compare the properties of five different distance measures, including the standard root-mean-square deviation (cRMSD). The performance of these measures is studied from different perspectives with two different protein models, one continuous and the other discrete. Using the continuous model, we examine the correlation between energy and native distance, and the ability of the different measures to discriminate between the two possible topologies of a three-helix bundle. Using the discrete model, we perform fits to real protein structures by minimizing different distance measures. The properties of the fitted structures are found to depend strongly on the distance measure used and the scale considered. We find that the cRMSD measure describes long-range features very effectively but is less effective with short-range features, and that it correlates only weakly with energy. A stronger correlation with energy and a better description of short-range properties are obtained when we use measures based on intramolecular distances. Proteins 2003;50:144–157. © 2002 Wiley-Liss, Inc.
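
The abstract does not spell out the five measures, so the following is only an illustrative sketch of what a measure based on intramolecular distances can look like. It implements one common definition, the distance RMSD (dRMSD), which compares all pairwise distances within each structure; this is an assumption for illustration and not necessarily the definition used in the paper. The function names `drmsd` and `rotate_z_and_shift` and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def drmsd(a, b):
    """Distance RMSD between two conformations given as lists of (x, y, z).

    Compares every intramolecular pair distance in `a` with the same pair
    in `b`, so no superposition of the two structures is needed.
    """
    if len(a) != len(b):
        raise ValueError("structures must have the same number of points")
    if len(a) < 2:
        raise ValueError("need at least two points to form a pair")
    pairs = list(combinations(range(len(a)), 2))
    total = 0.0
    for i, j in pairs:
        diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
        total += diff * diff
    return math.sqrt(total / len(pairs))


def rotate_z_and_shift(coords, angle, shift):
    """Rigid-body motion: rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Toy four-point chain (hypothetical coordinates, spacing of 3.8 units).
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]

# The same chain after a rigid rotation and translation.
moved = rotate_z_and_shift(chain, 0.7, (5.0, -2.0, 1.0))

# A uniformly stretched copy, which changes every internal distance.
stretched = [(1.1 * x, 1.1 * y, 1.1 * z) for x, y, z in chain]

same = drmsd(chain, moved)           # essentially zero
different = drmsd(chain, stretched)  # clearly nonzero
```

Because only internal distances enter, a rigid rotation or translation leaves this value unchanged up to rounding, whereas a coordinate-based measure such as cRMSD requires an optimal superposition of the two structures first.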
