A simple clustering algorithm can be accurate enough for use in calculations of pKs in macromolecules



The structure and function of macromolecules depend critically on the ionization states of their acidic and basic groups. Most current structure-based theoretical methods that predict the pK values of ionizable groups in macromolecules include, as one of the key steps, a computation of the partition sum (Boltzmann average) over all possible protonation microstates. Because the number of these microstates grows exponentially with the number of ionizable groups in the molecule, direct computation of the sum is not feasible for many typical proteins, which may have tens or even hundreds of ionizable groups. We have tested a simple and robust approximate algorithm for computing these partition sums for macromolecules. The method subdivides the interacting sites into independent clusters, based upon the strength of site–site electrostatic interaction. The resulting partition function factorizes into computationally manageable components. Two variants of the approach are presented and validated on a representative test set of 602 proteins by comparing the pK1/2 values computed by the proposed method with those obtained by the standard Monte Carlo approach, which is used as the reference. With 95% confidence, the error introduced by the more accurate of the two methods, relative to the reference, is less than 0.25 pK units. The algorithms are one to two orders of magnitude faster than the Monte Carlo method run with typical settings. A graphical representation is introduced that visualizes the clusters of strong site–site interactions in the context of the three-dimensional (3D) structure of the macromolecule, facilitating identification of functionally important clusters of ionizable groups; the approach is exemplified on two proteins, bacteriorhodopsin and myoglobin. Proteins 2006. © 2006 Wiley-Liss, Inc.
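
The clustering and factorization idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names, the representation of interactions as a symmetric matrix `w` in units of kT between protonated sites, the intrinsic pK list `pk_intr`, and the cutoff `threshold`. Sites are grouped into connected components of the graph whose edges are pairs with `|w[i][j]| > threshold`, and interactions between clusters are simply neglected, which may differ from both variants tested in the paper.

```python
import itertools
import math


def find_clusters(w, threshold):
    """Connected components of the graph with an edge where |w[i][j]| > threshold."""
    n = len(w)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:  # depth-first search from `start`
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and abs(w[i][j]) > threshold:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(component))
    return clusters


def partition_sum(sites, pk_intr, w, ph):
    """Exact sum over all 2**len(sites) protonation microstates of `sites`.

    Assumed toy energy model (in kT): each protonated site i contributes
    ln(10) * (ph - pk_intr[i]); each protonated pair (i, j) contributes w[i][j].
    """
    ln10 = math.log(10.0)
    z = 0.0
    for state in itertools.product((0, 1), repeat=len(sites)):
        energy = 0.0
        for a, i in enumerate(sites):
            if state[a]:
                energy += ln10 * (ph - pk_intr[i])
                for b in range(a):
                    if state[b]:
                        energy += w[i][sites[b]]
        z += math.exp(-energy)
    return z


def clustered_partition_sum(pk_intr, w, ph, threshold):
    """Approximate Z as a product of per-cluster sums, ignoring inter-cluster terms."""
    z = 1.0
    for cluster in find_clusters(w, threshold):
        z *= partition_sum(cluster, pk_intr, w, ph)
    return z
```

Under these assumptions, the exact sum costs on the order of 2^N terms for N sites, whereas the clustered version costs the sum of 2^n over clusters of size n. When the neglected inter-cluster interactions are exactly zero, the two results coincide; otherwise the cutoff controls the trade-off between speed and accuracy.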