Application of the ESMACS Binding Free Energy Protocol to a Multi‐Binding Site Lactate Dehydogenase A Ligand Dataset

Abstract Over the past two decades, the use of fragment‐based lead generation has become a common, mature approach to identify tractable starting points in chemical space for the drug discovery process. This approach naturally involves the study of the binding properties of highly heterogeneous ligands. Such datasets challenge computational techniques to provide comparable binding free energy estimates from different binding modes. The performance of a range of statistically robust ensemble‐based binding free energy calculation protocols, called ESMACS (enhanced sampling of molecular dynamics with approximation of continuum solvent), is evaluated. Ligands designed to target two binding pockets in the lactate dehydogenase, a target protein, which vary in size, charge, and binding mode, are studied. When compared to experimental results, excellent statistical rankings are obtained across this highly diverse set of ligands. In addition, three approaches to account for entropic contributions are investigated: 1) normal mode analysis, 2) weighted solvent accessible surface area (WSAS), and 3) variational entropy. Normal mode analysis and WSAS correlate strongly with each other—although the latter is computationally far cheaper—but do not improve rankings. Variational entropy corrects exaggerated discrimination of ligands bound in different pockets but creates three outliers which reduce the quality of the overall ranking.


Introduction
Over the last two decades the use of fragment-based lead generation (FBLG) has become a common, mature, approach to identify tractable starting points in chemical space for the drug discovery process. [1] This methodology involves scanning a library of low molecular weight compounds, known as fragments, to see if they bind to the target of interest. Once binding fragments have been identified they can be built on to create higher affinity molecules which, if they modulate protein function as required, become candidate drugs. One possible strategy is to link multiple fragments binding to different regions of the protein. FBLG represents an attractive application for in silico binding affinity calculations, but the need to obtain comparable free energy estimates from different binding modes represents a considerable challenge for many computationally efficient techniques.
Here, we take this challenge to evaluate the performance of our range of ensemble simulation based binding free energy calculation protocols, called ESMACS (enhanced sampling of molecular dynamics with approximation of continuum solvent).
These protocols have been shown to produce results which correlate well with experiment (correlation coefficients >0.7) and provide reproducible uncertainties [2][3][4][5][6][7] in studies of drugs binding to a single site. ESMACS is based on the common computational binding affinity prediction approach known as molecular mechanics Poisson-Boltzmann surface area (MMPBSA). [8] This is an approximate post-processing end-state method, which uses continuum solvent models to reduce the computational cost of obtaining results. The speed and ease of setup (compared to rigorous free energy calculations) makes MMPBSA an attractive candidate for use throughout the drug discovery pipeline. However, results are generally dependent on the system and binding mode, and are perceived to be less accurate than those obtained from more expensive and theoretically exact alchemical approaches (such as free energy perturbation, FEP, and thermodynamic integration, TI) [9,10] that have been used successfully in our labs for relative prediction of close analogues in drug discovery. [11][12][13][14] www.advancedsciencenews.com www.advtheorysimul.com Furthermore, the term MMPBSA as used in the literature permits a wide range of variants which incorporate different sampling strategies (for example, all ligand conformers can be drawn from simulation of the complex or from independent runs) and different solvation and entropy terms. Our previous work has demonstrated that MMPBSA analysis of single simulation trajectories is highly unreliable, with calculations initiated from the same structures varying by up to 12 kcal mol −1 for small molecules bound to proteins. [2,5,15] Results vary even more significantly for flexible ligands binding to major histocompatibility complex (MHC). [7] However, the average taken over an ensemble of replica simulations is found to provide reproducible results in good agreement with experimental ranking (r >= 0.7) for all these systems. These issues are a result of the underlying lack of reproducibility of classical molecular dynamics, for which predictions of macroscopic properties, such as the Gibbs free energy, require ensemble averaging over microscopic states. Newtonian dynamics are intrinsically sensitive to initial conditions, which manifests as different MD simulations producing trajectories that diverge rapidly over time no matter how close their initial conditions are (a consequence of the mixing ergodic properties of any system that will reach an equilibrium state). [16] Another limitation of single trajectory MMPBSA is that it does not account thoroughly for entropic components of the binding affinity. Here, we investigate both the use of independent simulations to account for ligand and receptor flexibility, and multiple approaches to incorporating entropic contributions to the binding free energy.
One target in which FBLG has been employed successfully is the lactate dehydogenase (LDH) tetramer, which is upregulated in clinical tumors (with high expression linked to poor prognosis). [17][18][19] Ward et al. [20] reported the use of X-ray crystallography alongside surface plasmon resonance (SPR) and nuclear magnetic resonance (NMR) based screening to develop fragment hits for the LDHA subunit into lead compounds. In this work we report a study testing the performance of ESMACS protocols for datasets including multiple binding sites and ligands varying in size, charge, and binding mode.

Datasets and Computational Protocols
In this section, we describe the ESMACS protocols we employ in detail and the dataset of 22 LDHA ligands used to test their performance.

Experimental Dataset and Starting Structures
The LDHA structure used in this study is based on chain A of PDB ID: 4AJP (truncated to start at residue 16) with consensus water molecules from other X-ray structures incorporated. A dataset of 22 ligands was studied, structures of which are shown in Figure 1a. The ligands bind to two distinct locations in the protein, known as the substrate and adenine sites, respectively (see Figure 1b). The dataset contains four ligands which bind to the substrate pocket, nine to the adenine pocket, and nine that bridge the two sites. Ligand poses were taken from crystal structures described in Ward et al. [20] Where no crystal conformation was produced ligands were superimposed on the most similar compound for which an experimental pose was available. This dataset contained both charged and neutral ligands and ligands of growing size. All were neutral except: ligands LDHA14, 16-18, and 28 that were assigned −1 charge, and LDHA19-21, 25-27, 29-34 with a charge of −2. For this data set, binding strengths were obtained from K d measurements derived from BIAcore and NMR experiments [20] and ranged between −11.0 and −3.1 kcal mol −1 .

ESMACS Binding Free Energy Calculations
The principle behind ESMACS is that many short simulations provide better sampling of conformational space than single simulations. It is based on the MMPBSA method and facilitates rapid and reproducible calculations of binding affinities. Our ESMACS protocols are flexible and allow for the analysis to be tailored to the target system. In this work, we employ a standard ESMACS protocol, which consists of running 25 replicas for a total of 6 ns each, details of which are discussed in Section 2.3.
When two reactants combine at constant temperature and pressure, the binding affinity is characterized by the change in Gibbs free energy, ΔG. This is described by the following relationship: where ⟨G complex ⟩, ⟨G receptor ⟩ and ⟨G ligand ⟩ are the ensemble average values of the Gibbs free energy for the complex, receptor (protein), and ligand, respectively. Sampling of the complex and its two components can either be performed independently or derived from simulation of the complex. The latter approach is more commonly used due to its improved convergence behavior, a consequence of cancelation between the noisy terms describing the internal energy of the ligand, receptor and complex. [21] However, recent work has indicated that adaptation energies associated with confining the receptor and ligand in a complex can differ significantly even for closely related compounds. [12] When both the receptor and ligand contributions are computed from the complex trajectory, we designate this a "1traj protocol." When all three contributions derive from independent trajectories we refer to this as a "3traj protocol," and when only one of the receptor or ligand contributions do so a "2traj protocol." In the latter case, a suffix (either -fl or -fr, for flexible ligand and receptor, respectively) is added to the protocol name to signify which component is derived from the independent simulation. Additional variants involve the use of the average receptor contribution across the complex simulations for all comparable ligands, which is indicated with an -ar (averaged receptor) suffix in the protocol name.
A summary of all of the protocols and the origin of component data in each is given in Table 1. The statistical performance of the pairs of protocols 1traj-ar and 2traj-fr, and 2traj-ar and 3traj, are the same since the receptor contribution for all cases is a constant between protein-ligand pairs. Consequently, we do not analyze the 3traj or 2traj-fr protocols explicitly. The binding free energy change calculated by MMPBSA (ΔG MMPBSA ) can be broken down into a number of components: where ΔG MM ele and ΔG MM vdW are the electrostatic and van der Waals contributions to the molecular mechanics free energy difference, respectively, ΔG MM int is the internal energy contribution, and ΔG sol pol and ΔG sol nonpol are the polar and non-polar solvation terms, respectively. Table 1. Summary of the origin of component contributions in six ESMACS protocols; whether they come from the ensemble of simulations run for the complex (C) or separate ensembles performed for the receptor (R) and ligands (L). "Constant" refers to the use of a constant, usually the average value across the studied systems.
Contribution to the binding free energy The electrostatic free energy of solvation, ΔG sol pol , is the part of the calculation described by the Poisson-Boltzmann (PB) calculation. The pbsa program included with AmberTools was used to perform the PB calculation (using default parameters: grid spacing of 0.5 Å, internal and external dielectric constants of 1 and 80, respectively). The non-polar solvation free energy contribution is estimated from the solvent accessible surface area using the traditional one component method (specified using inp=1 in the input file). In this approach the surface tension, , is set to 0.00542 kcal mol −1 Å −2 and the off-set, , to 0.92 kcal mol −1 . The fill ratio parameter was set to 4.0 which does not impact the results but ensures the stability of the calculations. All simulation and analysis in this paper is conducted at a temperature of 300 K.

Entropic Contribution to Binding Free Energies
MMPBSA calculations neglect the contribution to binding free energy from changes in the solute entropy. A computationally expensive method for accounting for this "configurational entropy" is normal mode (NMODE) analysis. [22] This can straightforwardly be incorporated in the binding affinity estimate using the following equation: The fact that converged normal mode calculations can require similar computational effort to the underlying simulations has motivated the creation of the weighted solvent accessible surface area (WSAS) model. [23] This model was parameterized to reproduce normal mode results using calculations of the solvent accessible surface area (SAS) and buried SAS (BSAS) of each atom of the system. The two types of surface area are weighted according to atom type, and the sum of the contributions of each atom is used to estimate S WSAS . This estimate is calculated from where w i is the atom-type specific weighting of the atom i, and k is a parameter which scales the impact of buried surface area. BSAS is calculated from the SAS using Where r i the atomic radius and r prob the probe radius used to determine the SAS. In this work, we compute SAS using the Lee and Richards algorithm [24] as implemented in the FreeSASA library (freesasa.github.io). Another, computationally efficient, alternative approach to accounting for the solute entropy was proposed by Duan et al. [25] In their formulation, the "variational entropy" can be derived from the fluctuations of the receptor-ligand interaction energy, E inter . This energy can be calculated using components of the MMPBSA calculation The fluctuation in interaction energy is then given by where angle braces indicate an ensemble average. This is then used to compute the entropic contribution to binding via where k B is the Boltzmann constant and = 1∕k B T.
In this work, we compare the effect of the inclusion of different entropy components on the correlation of computed binding free energy values with experiment. In particular, we study those derived from normal mode analysis, WSAS, and the variational entropy.

Statistics and Uncertainties
All statistics presented are based on their standard definitions with the exception of the mean unsigned error (MUE). It is well known that MMPBSA results have a significant offset from experimental values (typically on the order of 15 to 25 kcal mol −1 ) due to a range of factors, in particular, the neglect of entropic contributions. [26,27] Consequently, we present values corrected for the systematic (mean signed) error.
We compute uncertainties for all metrics through bootstrapping analysis. This method involves resampling with replacement the N input data points (for example, the replica averages of ΔG MMPBSA ) to provide a new bootstrap sample also containing N data points. This process is repeated many times (in our case 5000 times) and the statistic of interest calculated for each bootstrap population. The standard deviation of these values provides an estimate of the uncertainty associated with an average derived from a given sample; this is what is quoted as the bootstrap error measure of our statistics. For correlation coefficients, samples are drawn from the overall averages for each ligand paired with the relevant experimental value. www.advancedsciencenews.com www.advtheorysimul.com

Simulation Setup
Simulation system setup, including the creation of a water box and addition of neutralizing ions, was performed using Amber-Tools 17. [28,29] Protein parameters were taken from the standard Amber force field for bioorganic systems (ff14SB). [30] Ligand parameterizations were produced using the general Amber force field (GAFF). [23] The primary results of this paper were created using ligand partial charges generated using the restrained electrostatic potential (RESP) procedure, also part of the Amber package, from Gaussian 98 [31] geometrically optimized inhibitor representations. The Gaussian calculations were performed at the Hartree-Fock level of theory using the 6-31G* basis set. Additional results are presented in which AM1-BCC partial charges were generated using Amber alone.
Ensembles of 25 replica MD simulations were conducted using the package NAMD 2.11 [32] for each system (complex, receptor, or ligand) studied. All simulations were conducted using the protocol incorporated into the workflow tool BAC. [33] Each system was minimized with all heavy protein atoms restrained at their initial positions (with a restraining force constant of 4 kcal mol −1 Å −2 ). Initial velocities were then generated independently for each replica from a Maxwell-Boltzmann distribution at 50 K. Each system was virtually heated to 300 K over 60 ps and subsequently maintained at this temperature using the NAMD standard, Langevin dynamics approach (employing a coupling coefficient of 1 ps −1 ). While heating, the restraints applied during minimisation were retained. Once the system reached the correct temperature, the pressure was maintained at 1 bar using a Berendsen barostat (with a pressure coupling constant of 0.1 ps). Subsequent to heating, a series of equilibration runs, totaling 2 ns, were conducted, during which the restraints on heavy atoms were gradually reduced. The restraint reduction occurred in ten 100 ps steps, after each of which the force constant was halved. Finally, 4 ns production simulations were executed with snapshots output for analysis every 100 ps. For all MD simulation steps the long-range Coulomb interaction was handled using the particle-mesh Ewald summation method (PME), [34] with a nonbonded cutoff distance of 12 Å. The SHAKE algorithm [35] was employed on all atoms covalently bonded to hydrogen atoms, enabling a 2 fs time step for all simulations.

Results
First, we investigate the use of ensemble-based multiple trajectory ESMACS protocols in MMPBSA calculations for this system, before turning our attention to the inclusion of the approaches discussed previously to estimate entropic contributions to the binding energy.

Standard ESMACS Analysis
A comparison of the performance of the various ESMACS protocols in reproducing the LDHA experimental dataset rankings is shown in Table 2. All protocols show a strong overall correlation between the computed energies and experimental values, which is reflected in high Pearson and Spearman coefficients. However, some caution should be exercised when considering these metrics as the target dataset consists of three well separated clusters (corresponding to the different binding modes). No significant improvement in either coefficient is made through the incorporation of independent trajectories for ligand or receptor into the calculation, suggesting that the ligands may all bind in a "lock and key" fashion, inducing little if any strain in the protein or change in ligand conformation. As a consequence of this, we will focus for the rest of this work on analyzing the 1traj results.
The comparison of binding free energies obtained from the 1traj ESMACS protocols with the LDHA experimental dataset is shown in Figure 2. The good quality of the overall ranking is primarily due to correct separation of the tight binding ligands, which occupy both binding sites, and those occupying the adenine site. Consistently good rankings are obtained for each subset of ligands defined by the different binding modes (r 2 of 0.74 ± 0.17, 0.86 ± 0.24, 0.74 ± 0.16 for the adenine, substrate, and bridging ligands, respectively). This indicates that the method is not merely able to detect the gross changes that differentiate weak (and smaller) binders from tighter (and larger bridging) ligands but accounts for more subtle structural features. The ligands which bind in the substrate pocket (LDHA19, LDHA20, LDHA21, and LDHA25) do not lie close to the overall trend line and are assigned much more favorable relative binding affinities than the adenine site binders, despite having similar experimental affinities. This observation applies to the results of all ESMACS protocols under consideration. These observations are reflected in the high MUE for all rankings shown in Table 2. Graphical comparisons of the results obtained for all protocols with experiment (alongside correlation analysis) are provided in Supporting Information.
Decomposition of the MMPBSA contribution shows that the origin of the improved binding of the ligands that bridge both binding sites compared to the adenine site binders is primarily electrostatic (see Table 3). This large negative (attractive) contribution overcomes an increased polar solvation penalty. This is a pattern that is also apparent for the substrate pocket binding ligands. The majority of the increased affinity of the bridging ligands relative to the substrate pocket binders comes from the van der Waals contribution.

Impact of Ligand Parameterization
For simulations using Amber forcefields, the choice of procedures for ligand preparation is usually whether to use AM1-BCC or Gaussian/RESP based protocols to determine atom charges in combination with the GAFF general purpose forcefield parameters. In order to ascertain that our results were not strongly influenced by this decision we re-ran the 1traj ESMACS protocol for the full set of ligands using the AM1-BCC charge model. As shown in Figure 3 the ranking is almost unchanged, with all but two ligands (LDHA12 and LDHA15) having ΔG MMPBSA values within error between the two charge models. However, the binding affinities for both ligands have comparatively small uncertainties associated with them. A systematic difference between the partial charges is observed between the two parameterizations in these systems where the oxygen and nitrogen within the binding pockets have the same charges when prepared using Gaussian/RESP (approximately −0.64) but differ slightly (approximately −0.58 and −0.47, respectively) using AM1-BCC. The correlation coefficients of the results are unchanged, r 2 is 0.81 (0.07) and r s 0.82 (0.10), with the grouping of the ligands by binding mode also preserved.

Entropic Contributions
Accounting correctly, and computationally efficiently, for the entropic component of binding free energies remains a challenge for MMPBSA-based computations. Here, we investigate the impact of normal mode analysis derived estimates of the conformational entropy and the variational entropy technique on the ranking of different ESMACS protocols. The latter method is a computationally cheap analysis based on the variation of components of the MMPBSA calculation, while the former involves an expensive extra post-processing step (obtaining converged values can use as many CPU hours as the original simulation). Consequently, we also investigate the use of the WSAS method for approximating normal mode analysis based on solvent accessible surface area calculations to reduce this added expense.

Normal Mode Analysis and WSAS
The overall correlation, shown in (Figure 4a), is not improved by the incorporation of normal mode estimates of the configurational entropy (r 2 of 0.80 and r s of 0.80). Furthermore, the correlations for the ligand binding mode subsets is also similar (see Supporting information). The deviation from the overall trend by the substrate pocket ligands is exaggerated slightly by the incorporation of this contribution with the calculations, particularly for LDHA20 and LDHA21. These ligands are all small and highly charged, building upon a malonate substructure. Interestingly, the ranking of the bridging ligands improves slightly with the incorporation of configurational entropy into the ESMACS calculation, with the weakest binders moving towards the overall trend.
The WSAS method has an exceptionally strong correlation with the normal mode results, with r 2 of 0.93 and r s of 0.94 (see inset in Figure 4b). Correlation was also good within sets, with r 2 values of 0.76, 0.83, and 0.98 for adenine, bridging, and substrate subsets, respectively. This is reflected in the similarity of the overall rankings incorporating either method with the MMPBSA results shown in Figure 4. Using the WSAS parameters chosen here, the separation of the bridging ligands to the weakest binding adenine pocket ligands is increased by around 20 kcal mol −1 . Despite this, the overall pattern of the three binding modes relative to one another is not altered.  These results suggest that, in terms of ranking ligands, the WSAS approach offers a viable "drop in" alternative to normal mode calculations. However, neither approach offers an improvement in this dataset relative to MMPBSA alone. A plausible explanation for the lack of influence of these methods in the LDHA system is the lack of large domain level motions or flexibility.

Variational Entropy
The impact of the addition of variational entropy into the binding free energy is shown in Figure 4c. The main difference made is that the adenine and substrate ligands are brought into line with the exception of ligands LDHA16-18. This results in a lowered MUE of 13.86, while r s and PI (0.80 and 0.89, respectively) are maintained. However, the outliers reduce r 2 to 0.71 (0.06).
An intuitive explanation for this can be seen in the variability of the ΔG MM ele as described in Table 3, which indicates that the variation in electrostatics is much higher for ligands LDHA16-18 than for the other adenine pocket binders. The overall electrostatic contribution is also higher owing to the charged carboxylic acid moiety on these ligands. On a structural level, the carboxylic acid protrudes from the main adenine binding site forming transient interactions with ARG99 (see Figure 5). Given that the variational entropy is based on the change in interaction energy versus the average, these intuitively lead to changes in the entropic contribution for these molecules. It is likely that in reality these interactions are moderated by solvent effects (including interactions with individual water molecules) not included in the variational entropy calculation. Consequently, it appears that we obtain entropic penalties more in line with those obtained for the more highly charged substrate pocket ligands due to the solvent exposed carboxylic acid. The fact that the binding strength is underestimated indicates that our approach is unlikely to generate false hits but if used uncritically could exclude viable fragments. The larger compounds that span the entire binding site, and the majority of fragments in the substrate or adenine sites, remain well correlated after the inclusion of the variational entropy correction. This is in contrast to our previous work on BRD4 [36] where the same correction was more detrimental for larger compounds, a result we attributed to more movement in their binding site and therefore greater variation in interaction energy during the simulation.

Conclusions
We obtain excellent statistical rankings across this highly diverse set of ligands which contains considerable variation in size, charge, and binding mode using MMPBSA-based ESMACS protocols. Such a dataset is not suitable for relative alchemical methods thus ruling out the use of popular FEP or TI methods. Alternatives such as absolute perturbation methods may be plausible, but would also be a challenge given the large and flexible nature of some of the ligands. Therefore, suitable computational methods are needed which motivated us to further test our ESMACS approach, including alternative entropic corrections, and assessing effects of charge models. The fact that the ranking is not improved by the use of multiple trajectories shows that minimal strain was introduced to the protein or ligand upon binding, indicating a "lock and key" binding mechanism for all complexes studied. The research we present here provides a platform for future work which could determine the value of the entropic components investigated here in systems where binding involves greater structural rearrangement (which are best treated using a multiple trajectory approach).
Ligands that bind in different binding sites are separated by MMPBSA binding affinity estimates even when the experimental values are close to one another. This is caused by exaggerated electrostatic interactions in the adenine pocket and bridging ligands compared to those binding the substrate pocket. We investigated whether the separation between groups of ligands could be mitigated via the incorporation of entropic components into the binding free energy estimates. Neither normal mode nor WSAS derived measures of configurational entropy impact the rankings or groupings of ligands, the WSAS ranking correlating strongly with the values obtained from normal mode analysis. However, WSAS can be computed within 30 min on a single core for an entire ensemble whereas normal mode analysis can take up to 12 h to complete and requires one core per snapshot for this duration. Inclusion of variational entropy into the calculations reduces the separation of estimates for drugs binding to the two different binding pockets, but created three outliers which reduced the quality of the overall ranking. The origin of the overestimate of the entropy for these ligands appears to be related to the charge on the solvent exposed carboxylic acid moiety. The difference between the performance of this approach and normal mode analysis or WSAS suggests that relevant states of the LDHA system are distinguished by subtle changes in local interactions rather than global domain motions. In particular, caution must be employed when interpreting results for small ligands in which charged moieties are solvent exposed.

Supporting Information
Supporting Information is available from the Wiley Online Library or from the author.