Comparative analysis of sequence covariation methods to mine evolutionary hubs: Examples from selected GPCR families

Authors

  • Julien Pelé,

    1. UMR CNRS 6214–INSERM 1083, Laboratory of Integrated Neurovascular and Mitochondrial Biology, University of Angers, Angers, France
  • Matthieu Moreau,

    1. UMR CNRS 6214–INSERM 1083, Laboratory of Integrated Neurovascular and Mitochondrial Biology, University of Angers, Angers, France
  • Hervé Abdi,

    1. The University of Texas at Dallas, School of Behavioral and Brain Sciences, Richardson, TX, USA
  • Patrice Rodien,

    1. UMR CNRS 6214–INSERM 1083, Laboratory of Integrated Neurovascular and Mitochondrial Biology, University of Angers, Angers, France
    2. Department of Endocrinology, Reference Centre for the pathologies of hormonal receptivity, Centre Hospitalier Universitaire of Angers, Angers, France
  • Hélène Castel,

    1. INSERM U982, Laboratory of Neuronal and Neuroendocrine Communication and Differentiation, DC2N, University of Rouen, Mont-Saint-Aignan, France
  • Marie Chabbert

    Corresponding author
    1. UMR CNRS 6214–INSERM 1083, Laboratory of Integrated Neurovascular and Mitochondrial Biology, University of Angers, Angers, France
    • Correspondence to: Marie Chabbert, UMR CNRS 6214-INSERM U1083, Faculty of Medicine, 3 rue Haute de reculée, 49045 Angers, France. E-mail: marie.chabbert@univ-angers.fr


ABSTRACT

Covariation between positions in a multiple sequence alignment may reflect structural, functional, and/or phylogenetic constraints and can be analyzed by a wide variety of methods. We explored several of these methods for their ability to identify covarying positions related to the divergence of a protein family at different hierarchical levels. Specifically, we compared seven methods on a model system composed of three nested sets of G-protein-coupled receptors (GPCRs), in each of which a divergence event occurred. The covariation methods analyzed were based on the χ² test, mutual information, substitution matrices, and perturbation methods. We first analyzed the dependence of the covariation scores on residue conservation (measured by sequence entropy), and then analyzed the network structure formed by the top-scoring pairs. Two of the seven methods, OMES (Observed Minus Expected Squared) and ELSC (Explicit Likelihood of Subset Covariation), favored pairs with intermediate entropy and yielded a network structure in which a central residue is involved in several high-scoring pairs. This network structure was observed for all three sequence sets. In each case, the central residue corresponded to a residue known to be crucial for the evolution of the GPCR family and for subfamily specificity. These central residues can be viewed as evolutionary hubs, consistent with an epistasis-based mechanism of functional divergence within a protein family. Proteins 2014; 82:2141–2156. © 2014 Wiley Periodicals, Inc.
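To make the two quantities named in the abstract concrete, the sketch below computes per-column sequence entropy and an OMES-style score on a toy alignment. It is an illustration only, not the authors' code: the function names and the toy sequences are hypothetical, gaps are not handled, and the formula used (squared difference between observed and expected pair counts, summed over residue pairs and divided by the number of sequences) is one common formulation of OMES. Normalization conventions vary between implementations, and the exact variant used in the paper is not stated in this abstract.

```python
from collections import Counter
from itertools import product
from math import log


def column(msa, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in msa]


def sequence_entropy(col):
    """Shannon entropy (natural log) of the residue frequencies in a column."""
    n = len(col)
    counts = Counter(col)
    return -sum((c / n) * log(c / n) for c in counts.values())


def omes(col_i, col_j):
    """OMES-style covariation score for two columns (assumed formulation).

    For every residue pair (x, y), compare the observed number of sequences
    carrying x at position i and y at position j with the number expected
    if the two positions were independent, then sum the squared differences.
    """
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    total = 0.0
    for x, y in product(count_i, count_j):
        expected = count_i[x] * count_j[y] / n
        observed = count_ij[(x, y)]  # Counter returns 0 for unseen pairs
        total += (observed - expected) ** 2
    return total / n


# Hypothetical toy alignment: positions 0 and 1 covary perfectly,
# position 2 varies independently of position 0, position 3 is conserved.
toy_msa = ["ADKL", "ADRL", "GEKL", "GERL"]

for i in range(4):
    print(i, round(sequence_entropy(column(toy_msa, i)), 3))

print(omes(column(toy_msa, 0), column(toy_msa, 1)))  # covarying pair
print(omes(column(toy_msa, 0), column(toy_msa, 2)))  # independent pair
```

In this toy case the fully conserved column has zero entropy and cannot contribute to any covariation score, which illustrates why the dependence of a score on conservation matters when comparing methods.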
