• Ca2+-binding proteins;
• graph theory;
• carbon clusters;
• side-chain center of mass;
• NMR


Identifying Ca2+-binding sites in proteins is the first step toward understanding the molecular basis of diseases related to Ca2+-binding proteins. Currently, these sites are identified in structures determined by either X-ray crystallography or NMR. However, Ca2+-binding sites are not always visible in X-ray structures, because of flexibility in the binding region or low occupancy of the site. Similarly, neither Ca2+ nor its ligand oxygens are directly observed in NMR structures. To improve the prediction of Ca2+-binding sites in both X-ray and NMR structures, we report a new graph theory algorithm (MUGC). Using the carbon atoms covalently bonded to the chelating oxygen atoms, and without explicit reference to side-chain oxygen ligand coordinates, MUGC achieves 94% sensitivity with 76% selectivity on a dataset of X-ray structures of 43 Ca2+-binding proteins. MUGC also predicts Ca2+-binding sites in NMR structures using a different set of parameters, which were determined by analyzing both Ca2+-constrained and unconstrained Ca2+-loaded structures derived from NMR data. MUGC identified 20 of 21 Ca2+-binding sites in NMR structures inferred without the use of Ca2+ constraints. MUGC predictions are also highly selective for Ca2+: binding sites for Mg2+, Zn2+, and Pb2+ were not identified as Ca2+-binding sites. These results indicate that the geometric arrangement of the second-shell carbon cluster is sufficient not only for accurate identification of Ca2+-binding sites in NMR and X-ray structures but also for selective differentiation between Ca2+ and other relevant divalent cations. Proteins 2012. © 2012 Wiley Periodicals, Inc.
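
To make the idea of a graph-based search over carbon clusters concrete, the following is a minimal sketch and not the MUGC implementation. It assumes, purely for illustration, that carbon atoms are graph nodes, that two carbons are linked when they lie within a distance cutoff, and that a candidate site is a maximal clique of at least four carbons whose centroid approximates the ion position. The names `CUTOFF`, `MIN_CLUSTER`, `build_graph`, `maximal_cliques`, and `candidate_sites`, as well as the numeric values, are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist

# Hypothetical parameters for illustration only; not values from the paper.
CUTOFF = 6.0      # maximum carbon-carbon distance (angstroms) for an edge
MIN_CLUSTER = 4   # minimum number of carbons in a candidate cluster


def build_graph(coords, cutoff=CUTOFF):
    """Return adjacency sets: carbons i and j are linked if within cutoff."""
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def candidate_sites(coords, cutoff=CUTOFF, min_size=MIN_CLUSTER):
    """Return (carbon indices, centroid) for each sufficiently large clique."""
    sites = []
    for clique in maximal_cliques(build_graph(coords, cutoff)):
        if len(clique) >= min_size:
            pts = [coords[i] for i in clique]
            center = tuple(sum(axis) / len(pts) for axis in zip(*pts))
            sites.append((clique, center))
    return sites


# Toy input: four mutually close carbons and one distant carbon.
carbons = [(2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0), (30, 30, 30)]
sites = candidate_sites(carbons)
```

In this toy input the four nearby carbons form the only cluster large enough to be reported, and the distant carbon is ignored. A real predictor would need geometric filters and parameters tuned separately for X-ray and NMR structures, as the abstract describes.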