Modeling of protein binding site flexibility in molecular docking is still a challenging problem due to the large conformational space that needs sampling. Here, we propose a flexible receptor docking scheme: a dihedral restrained replica exchange molecular dynamics (REMD), where we incorporate the normal modes obtained by the Elastic Network Model (ENM) as dihedral restraints to speed up the search towards correct binding site conformations. To our knowledge, this is the first approach that uses ENM modes to bias REMD simulations towards binding induced fluctuations in docking studies. In our docking scheme, we first obtain the deformed structures of the unbound protein as initial conformations by moving along the binding fluctuation mode, and perform REMD using the ENM modes as dihedral restraints. Then, we generate an ensemble of multiple receptor conformations (MRCs) by clustering the lowest replica trajectory. Using ROSETTALIGAND, we dock ligands to the clustered conformations to predict the binding pose and affinity. We apply this method to postsynaptic density-95/Dlg/ZO-1 (PDZ) domains, whose dynamics govern their binding specificity. Our approach produces the lowest energy bound complexes with an average ligand root mean square deviation of 0.36 Å. We further test our method on (i) homologs and (ii) mutant structures of PDZ where mutations alter the binding selectivity. In both cases, our approach succeeds in predicting the correct pose and the affinity of binding peptides. Overall, with this approach, we generate an ensemble of MRCs that leads to accurate prediction of the binding poses and specificities of a protein complex.
Molecular docking is an important tool in studying protein-ligand or protein–protein interactions and designing new drugs. Computational docking methods have long been useful in the discovery of ligands that can bind to proteins as lead candidates for drugs.1–6 The majority of current binding/docking methods attempt to predict the bound ligand by keeping the protein (receptor) fixed and moving the target ligand around the binding site while performing an energy minimization. The major problems of this approach are: (i) proteins are dynamic, flexible, and deformable, so keeping them fixed often misses the correct binding modes, and (ii) relying on pure energy minimization is insufficient to predict correct binding affinities. Docking algorithms predict an incorrect binding pose for about 50–70% of all ligands when the receptor is kept in a single conformation.7 Overall, induced flexibility is the key ingredient in understanding the physical principles of molecular recognition between ligand and receptor. It has been shown that even small changes in the receptor upon binding can be important in computing the binding affinities. On the other hand, modeling of receptor flexibility is still a challenging problem because of the large conformational space that must be sampled.
In order to overcome this challenge, multiple receptor conformations (MRCs) are generated in several ways: (i) using multiple conformations from molecular dynamics (MD) snapshots,8–13 (ii) applying principal component analysis to MD trajectories,14 (iii) using the snapshots based on geometry-based simulation techniques,15 (iv) detecting rigid and hinge regions with the Gaussian Network Model and the Elastic Network Model (ENM), respectively,16, 17 (v) perturbing receptor conformations along different normal modes directions,18–21 (vi) using normal modes as additional flexible variables during docking simulations,22–25 and (vii) using multiple structures of the protein receptor obtained from experimental studies, for example X-ray crystallography or NMR analysis.26–28 The majority of these types of approaches are very computationally intensive. Furthermore, the success of these types of ensemble docking approaches depends on two features of the set of receptor conformations: (i) they should include a wide range of binding site conformations realized in nature and (ii) they should exclude artifact conformations that can lead to the prediction of incorrect poses or false positives in virtual screening (i.e., prediction of a ligand with a high binding score when it does not bind in nature). Thus, the crucial ingredient in generating MRCs is to mimic nature and sample binding induced conformations using smart sampling strategies. Recently, Abagyan and collaborators29 were able to incorporate the multiple receptor conformational ensemble in a single docking simulation and reduce the sampling time with their approach.
In the present study, we introduce a novel sampling strategy that combines the ENM with replica exchange molecular dynamics (REMD)30 to generate “binding induced” MRCs. Clustering molecular dynamics snapshots8–13 and generating conformations by perturbing the conformation along normal modes18, 19, 24, 31–33 have been separately applied to incorporate receptor flexibility in many docking studies. Here, by merging these two approaches, our goal is to bias the MD search towards more binding-related conformations through normal mode predicted fluctuation profiles, which in turn increases the efficiency and accuracy of generating the multiple receptor ensemble.
Although molecular dynamics is the most reliable method to sample conformations within the flexibility of a protein structure, it is quite challenging to sample the rare events of large amplitude fluctuations (i.e., binding induced conformational changes), especially for large protein molecules. On the other hand, the ENM,34–41 which is based on a purely mechanical model and views a protein structure as an elastic network, has been applied to many proteins to obtain the slowest (i.e., functionally related) fluctuations. The nodes of the elastic network are α-carbons, where identical springs connect the “interacting” α-carbons in their native fold. The model has the advantages that (i) it determines the functionally related motions (even large ones) without using any energy functions and (ii) it is computationally much more efficient than molecular dynamics. Most importantly, it has been shown that a few lowest frequency modes (or their linear combinations) of unbound conformations obtained by ENMs can capture the conformational change upon binding.42–44 Recently, Dobbins et al.19 applied ENM to a set of proteins from a docking benchmark and found that the modes with certain characteristic frequencies can provide guidelines to predict the conformational change on protein–protein docking.
With the insights from these studies, we incorporate the modes that are related to the binding induced conformational changes into the REMD sampling search. Thus, in our approach, we use a restrained REMD search where we bias the search toward modal directions by using dihedral restraints. Our approach has the following advantages over using straightforward molecular dynamics runs: (i) REMD can help to overcome potential barriers with high temperature replicas, (ii) with biasing towards normal modes, binding induced conformational changes (i.e., similar to bound conformations) can be sampled much more efficiently than with straightforward sampling strategies, and (iii) it is computationally faster, as it is well suited for parallel computing. Moreover, our approach has another benefit over generating multiple flexible receptor conformations using only normal modes (i.e., generating conformations by perturbation along normal modes): coupling REMD with normal mode analysis (NMA) enables us to mimic a wide range of binding site conformations realized in nature. In addition, normal modes by themselves cannot sample native-like bound conformations; they give only a crude estimate. Thus, coupling REMD with normal modes can help us to predict binding affinities much more accurately.
We test our approach by analyzing the binding selectivity of postsynaptic density-95/Dlg/ZO-1 (PDZ) domain proteins (PDZs). PDZs, which are distributed diversely in the genome, play critical roles in (i) targeting proteins to specific membrane compartments, (ii) assembling proteins into supramolecular complexes, and (iii) regulating the function of their ligands in cellular signaling pathways. This ties them directly to some of the most puzzling diseases, such as Alzheimer's, Parkinson's, cancer, and diabetes. PDZs perform their job by binding the C-terminal peptide of specific protein partners. PDZ domains have been categorized into two main classes according to the specificity of the interaction, which depends on the four C-terminal amino acids of their binding ligands.45 Class I type PDZs bind to a C-terminal motif with the sequence [Ser/Thr-X-ϕ-COOH] and Class II type PDZs prefer the sequence [ϕ-X-ϕ-COOH], where X is any amino acid and ϕ is any hydrophobic amino acid. All PDZs share about 25% sequence identity, similar secondary and tertiary structures with an average backbone root mean square deviation (RMSD) of around 1.4 Å, and highly conserved C-terminal peptide interactions.46 Although the PDZ binding site is well defined and PDZ motifs are classified based on their sequence type, there is still little information available on the binding affinity and stoichiometry of PDZ binding motifs and blocking peptides. This is partly because their dynamics, rather than their structure or sequence, determine the binding affinity, as indicated by the very recent study of Petit et al.47 In their study, they investigated how the binding affinity changes when they remove the distal third helix (α3) of the third PDZ domain of PSD-95, which is not observed in any other PDZs and is not necessary to maintain the structure.
Strikingly, removal of α3 reduces the binding affinity by 21-fold, even though it lies outside of the binding site and does not make direct contact with the binding C-terminal peptide. This result, along with many others, supports the notion that different PDZs evolve to have different dynamic properties tailored to mediate different functions in the cell, despite the fact that they all have the same conserved structure and sequence.48
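The Class I and Class II motif definitions given above can be expressed as a simple pattern match on the last three residues of a peptide. The following is a minimal sketch, not part of the original study: the set of residues treated as hydrophobic (ϕ) and the function name `classify_pdz_ligand` are assumptions made for illustration, and the example peptides are illustrative inputs only.

```python
import re

# Residues counted as hydrophobic (phi). This particular set is an assumption;
# the text only says "any hydrophobic amino acid".
HYDROPHOBIC = "AVLIMFWYC"

# Class I:  [Ser/Thr]-X-phi-COOH ; Class II: phi-X-phi-COOH (X = any residue).
CLASS_I = re.compile(rf"[ST].[{HYDROPHOBIC}]$")
CLASS_II = re.compile(rf"[{HYDROPHOBIC}].[{HYDROPHOBIC}]$")


def classify_pdz_ligand(peptide: str) -> str:
    """Classify a peptide (one-letter codes, C-terminus last) by its final three residues."""
    seq = peptide.strip().upper()
    if CLASS_I.search(seq):
        return "Class I"
    if CLASS_II.search(seq):
        return "Class II"
    return "unclassified"


if __name__ == "__main__":
    for pep in ("KQTSV", "EYYV", "GGG"):
        print(pep, classify_pdz_ligand(pep))
```

Sequence motifs alone do not determine affinity, which is the point the surrounding text makes; the sketch only encodes the nomenclature.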
By analyzing the conformational dynamics of the unbound PDZs, it may be possible to determine their binding specificities. In our previous study, we showed that the most collective fluctuation profile obtained by a modified version of ENM can capture, on average, 60% of the binding induced conformational changes in PDZ domains.49 We also investigated the role of binding induced conformational dynamics of PDZs in their peptide selectivity and classification. By clustering the binding induced fluctuation profiles of a diverse set of PDZ domains, we showed that ENM predicted normal modes not only identify the structural regions and types of correlated fluctuations critical for binding of Class I and Class II peptides but also predict the binding selectivity of PDZs.49
The challenge posed by PDZ domains (i.e., the reasons behind their selectivity and promiscuity) and their link to many different diseases have led to a number of important experimental50, 51 and computational49, 52–58 studies. Niv and Weinstein also developed a flexible docking scheme (called PDZDocScheme),56 which is based on simulated annealing molecular dynamics with the soft core potential or flexible binding site side chains, followed by rotamer optimization. When they apply their protocol to the original bound structures (self-docking), their scheme reproduces the structures of PDZ complexes with peptides 4–8 amino acids long within 2 Å, except for Syntenin (the RMSD of the best score is 3.7 Å when they redock the Class II peptide to the native bound structure). However, when unbound structures or homology models are used for docking, their flexible docking scheme can only predict the docked peptides within an average RMSD of 3 Å. Here, we apply our flexible approach based on ENM-guided REMD to improve the prediction of PDZ-binding affinities, especially for the cases where only unbound and homology structures are available.
In our flexible docking procedure, we first generate an initial set of conformations by perturbing the unbound structure and homology models using the weighted average profile of significant lowest frequency ENM modes responsible for Class I and Class II types binding in PDZs.49 Then, we further sample these two sets of structures by running a dihedral restrained REMD where the restraints are set with respect to the binding induced fluctuation profile of ENM. After restrained REMD, the snapshots of the lowest replica are clustered and each individual structure from this ensemble of conformations is docked against different peptides using ROSETTALIGAND.59, 60 The overall average ligand RMSD values of redocking native peptides to native bound structures and unbound structures are 0.31 Å and 0.78 Å, respectively. However, when the native peptides are docked to unbound structures with our flexible approach, we obtain lowest energy models with an average ligand RMSD of 0.36 Å from the experimental structure, which is as good as redocking peptides to bound structures. Although docking methods provide low RMSD values for the docked poses of the ligands, they are generally unreliable for prediction of binding affinities. There are different approaches to overcome this deficiency using empirical scoring functions (DrugScore61 and XScore62) and more physically realistic methods such as molecular mechanics-generalized Born (GB) surface area (SA) technique.63 While ROSETTALIGAND is successful in obtaining good RMSD values for best poses, we prefer using a knowledge-based scoring function (DrugScore)61 to estimate the binding affinities of ligands to a particular receptor with reasonable accuracy. Thus, at the end of Rosetta run, we re-evaluate the binding energy of the ligand bound complex structure with the lowest Rosetta binding score. 
Our results indicate that the binding preference of different peptides in PDZs can be determined using the docking approach based on ENM-guided REMD, which supports its applicability for including protein flexibility in docking studies. In addition, we further apply our method to (i) homolog structures of PDZs whose binding selectivities are verified experimentally and (ii) mutant structures of PDZ where mutations alter the binding selectivity. We find that the inclusion of backbone flexibility through MRCs, obtained by the normal mode incorporated REMD runs, helps us to discriminate the binding preference for homology models and mutated structures, which are much more challenging cases for docking.
Materials and Methods
The four PDZs (PSD-95, GRIP, Syntenin, and Erbin) with available unbound (apo) and bound (holo) structures analyzed in this study are listed in Table I. The crystal structures of unbound and bound proteins are retrieved from the PDB (http://www.rcsb.org).64 The PDB codes, the names of the corresponding proteins, the class of PDZ domain according to their binding specificity and the sequences of the binding peptides are displayed in the columns of the table.
Table I. The PDB Codes, the Peptide Sequences, and the Binding Preferences of Proteins
With our docking scheme, called the ENM-guided REMD method (ENM-REMD), our goal is to generate an ensemble of conformations of a given receptor that mimics nature and includes the conformations sampled through the binding process. Thus, we generate a set of conformations from molecular dynamics trajectories where we bias the search toward binding induced dynamics. We achieve this in three steps. First, using ENM, we obtain the binding induced fluctuation profile of a protein by analyzing the slowest modes of the unbound structure. We then obtain deformed structures by perturbing the unbound structure along the binding fluctuation modes of ENM (see details in subsection “Elastic Network Model and generating new conformations using ENM”). The perturbed/deformed structures are used as initial structures in our restrained REMD simulation, where we define dihedral restraints based on the ENM predicted binding induced fluctuation profile (i.e., the regions predicted as rigid/less flexible by the ENM mode are assigned stronger restraints, whereas the regions with high flexibility are assigned weaker restraints). Second, after defining the restraints and creating the perturbed initial structures based on the ENM predicted binding induced fluctuation profile, we perform a dihedral restrained REMD simulation (see the details of the simulation parameters in the subsection “Dihedral restrained replica exchange molecular dynamics”). Restraining the simulation with respect to binding induced fluctuation profiles helps us to sample towards correct bound-like conformations. Once the simulation is done, an ensemble of MRCs is generated by clustering the conformations sampled at the lowest replica. As a final step, each structure in the ensemble is used for docking and the docked pose with the best energy score is chosen as the best result.
Figure 1(A) displays each step in our flexible docking protocol, and the docking performance of the conformations obtained after each step is presented in Figure 1(B) for the case of PSD-95. PSD-95 only binds to Class I type peptides. As shown in Figure 1(B), as we proceed through the steps, the Class I type binding affinity becomes more significant while the RMSD value of the Class I peptide decreases.
We carry out four different tests. First, each ligand is redocked into a known PDZ domain in its bound conformation (self-docking). Second, different ligands are docked into the original bound conformation (cross-docking). Third, the docking procedure applied to bound structures is repeated for the unbound structures. Fourth, ligands are docked against the multiple conformations obtained with our flexible docking protocol (ENM-REMD). For further testing, we apply the ENM-REMD method to two protein structures (Lrcc7 and Cipp-PDZ9) generated from homology modeling and to mutants of another PDZ domain, PICK1, which has dual (Class I and Class II type) binding specificity. All docking analysis in this study is performed using ROSETTALIGAND, where the ligand flexibility is established by changing torsional angles, and the backbone of the ligand and the whole protein are held fixed throughout the docking simulation.
Elastic Network Model and generating new conformations using ENM
The normal modes that correspond to Class I- and Class II-type peptide binding fluctuations are obtained using the Anisotropic Network Model (also referred to as the ENM).34 ENM is equivalent to an NMA of an elastic network at the Cα level, with a Hessian based on a harmonic potential.41 ENM uses the coordinates of a structure from the PDB database64 and connects residues that lie within a cutoff distance with springs to build a Hessian connectivity matrix. A cutoff value of 16 Å is used in this study.
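The construction of the Hessian from Cα coordinates and a distance cutoff can be sketched as follows. This is a minimal illustration of the standard anisotropic network model with a uniform spring constant, not the modified ENM used in the study; the function name `anm_hessian` and the spring constant `gamma` are assumptions, while the 16 Å default comes from the text.

```python
import numpy as np


def anm_hessian(coords: np.ndarray, cutoff: float = 16.0, gamma: float = 1.0) -> np.ndarray:
    """Build the 3N x 3N ANM Hessian for N C-alpha positions (shape (N, 3), in angstroms).

    Every residue pair closer than `cutoff` is joined by a spring of constant `gamma`
    (uniform springs are an assumption of this sketch).
    """
    n = coords.shape[0]
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue
            # Off-diagonal super-element for the pair (i, j).
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal super-elements are minus the sum of the off-diagonal ones.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian
```

For a connected, non-degenerate network the resulting matrix is symmetric with six zero eigenvalues (rigid-body translations and rotations), leaving 3N–6 internal modes.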
Diagonalization of the inverse of the Hessian decomposes it into 3N–6 eigenvalues (λn) and their corresponding eigenvectors un, and we take the lowest frequency (most global) eigenmodes. To find the binding induced fluctuations, we focus on the contribution of each mode, weighted by the inverse of the corresponding eigenvalue. The distribution of mode frequencies (eigenvalues) is evaluated to determine how many modes contribute to the binding dynamics. A subset of modes whose eigenvalues are dispersed from those of the other modes is identified.49 In this approach, first, the histogram of the eigenvalues is generated and then the bin size is computed based on the highest dispersion in the eigenvalue spectrum (i.e., the first gap in the eigenvalue spectrum). The eigenvalues that correspond to the first bin are used along with their eigenmodes to compute the weighted sum of the square fluctuations obtained from
The weighted average of each mode corresponds to a fluctuation between two oppositely directed motions. We generate two sets of deformations for each mode k as
where t is a scaling parameter of the deformation43 and we used t = 25 for the proteins tested in this study. Each obtained structure that contains distorted bond lengths and angles is further subjected to an energy minimization of 50 steepest descent iterations followed by 1000 conjugate gradient iterations using the AMBER96 forcefield65 along with a GBSA solvation model.66
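The mode selection, inverse-eigenvalue weighting, and two-directional deformation described above can be sketched in a few lines. This is a hedged illustration under several assumptions: the gap search is a simplified stand-in for the histogram-based procedure of ref. 49, the fluctuation profile is normalized by the sum of the weights, and the deformation displaces the structure by ±t along a unit-normalized mode without any eigenvalue-dependent scaling. All function names and the `max_modes` parameter are hypothetical.

```python
import numpy as np


def nonzero_modes(hessian: np.ndarray, tol: float = 1e-8):
    """Eigen-decompose the Hessian and drop the six rigid-body (zero) modes."""
    vals, vecs = np.linalg.eigh(hessian)  # eigenvalues in ascending order
    keep = vals > tol
    return vals[keep], vecs[:, keep]


def slow_modes_before_gap(eigvals: np.ndarray, max_modes: int = 10) -> int:
    """Number of slowest modes lying below the largest gap among the first few eigenvalues."""
    window = np.asarray(eigvals)[: max_modes + 1]
    return int(np.argmax(np.diff(window))) + 1


def weighted_square_fluctuations(eigvals: np.ndarray, eigvecs: np.ndarray, n_modes: int) -> np.ndarray:
    """Per-residue square fluctuations summed over modes with weights 1/lambda_k."""
    n_res = eigvecs.shape[0] // 3
    weights = 1.0 / eigvals[:n_modes]
    # Sum the x, y, z components of each residue for every selected mode.
    per_mode = (eigvecs[:, :n_modes] ** 2).reshape(n_res, 3, n_modes).sum(axis=1)
    return per_mode @ weights / weights.sum()


def deform(coords: np.ndarray, mode: np.ndarray, t: float = 25.0):
    """Return the two structures displaced by +t and -t along one mode vector."""
    disp = t * mode.reshape(-1, 3)
    return coords + disp, coords - disp
```

The deformed coordinates produced this way generally contain distorted bond lengths and angles, which is why the text follows the perturbation with energy minimization.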
It is also possible to use a set of experimental structures in different liganded complexes (if they are available) to compute the binding induced dynamics.67–70 The recent study of Bakan and Bahar has shown that ENM predicted modes are very well correlated with the principal components of the conformational change for a large set of different liganded experimental structures of a given protein.71 We have repeated the same analysis for our test protein, the PDZ domain. Thus, we collect the different complex structures of PSD-95 available in the Protein Data Bank. After structurally aligning these experimental structures, we compute the mean position of each Cα atom (〈ΔRi〉) and construct the covariance matrix based on the Cα deviations of the protein structures. Then, we obtain the principal components of the structural change by decomposing this covariance matrix. The principal modes are also highly correlated with the ENM modes from our calculations (see Supporting Information Fig. S1). This indicates that ENM predicted normal modes of the unliganded (unbound) structure intrinsically capture the dynamics that lead to the liganded (bound) conformation; therefore, they can be used in docking to guide the unbound structure towards binding induced dynamics.
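The covariance analysis of an aligned ensemble and its comparison with ENM modes can be illustrated as below. This is a sketch under assumptions: the structures are taken to be already superimposed, and the absolute normalized dot product ("overlap") is used as the correlation measure, which may differ from the measure behind Supporting Information Fig. S1. Function names are hypothetical.

```python
import numpy as np


def principal_components(ensemble: np.ndarray):
    """PCA of an aligned ensemble of shape (n_structures, n_atoms, 3).

    Returns eigenvalues (descending) and eigenvectors (columns) of the
    3N x 3N covariance matrix of C-alpha deviations from the mean structure.
    """
    n_struct = ensemble.shape[0]
    flat = ensemble.reshape(n_struct, -1)
    dev = flat - flat.mean(axis=0)
    cov = dev.T @ dev / (n_struct - 1)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]


def overlap(mode_a: np.ndarray, mode_b: np.ndarray) -> float:
    """Absolute cosine between two 3N-dimensional mode vectors (1 = same direction)."""
    return float(abs(mode_a @ mode_b) / (np.linalg.norm(mode_a) * np.linalg.norm(mode_b)))
```

An overlap close to 1 between the first principal component and a slow ENM mode would indicate that the mode points along the experimentally observed structural change.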
Homology modeling and mutated proteins
The protein sequences are obtained from the SMART (Simple Modular Architecture Research Tool) database72 and submitted to “The Structure Prediction Meta Server”73, 74 with the option of the automated homology modeling program ESyPred3D. ESyPred3D is based on a strategy using neural networks to evaluate sequence alignments75 and uses the MODELLER program76, 77 to build the final structural model. Homology models are constructed with MODELLER with a minimal sequence similarity of 50% to the target. To obtain the mutated structures of PICK1 (PDZ protein interacting with C kinase 1), a computational point mutation is introduced into the X-ray structure via Swiss PDB Viewer.78 Before applying ENM, homolog and mutated structures are subjected to an energy minimization of 50 steepest descent iterations followed by 1000 conjugate gradient iterations using the AMBER96 force field65 along with a GB solvation model.66
Dihedral restrained replica exchange molecular dynamics
We apply a dihedral restrained REMD30 to generate a large set of conformations starting from deformed structures obtained with ENM. First, dihedral restraints are applied to move the conformations along their binding induced fluctuations. The strength of each dihedral restraint is adjusted with respect to the fluctuation profile of normal modes obtained by ENM, where the regions shown as the most flexible in the ENM analysis have weaker restraints compared to the rigid parts. This type of restraint biases our sampling towards the directionality of the binding induced fluctuations. Second, rather than using the unbound conformation as the initial conformation, we generate a set of conformations by perturbing the unbound conformation along the normal mode vectors as explained in the “Elastic Network Model and generating new conformations using ENM” section of “Materials and Methods”. The advantage of using different initial conformations in REMD is that it speeds up our search towards correct binding site conformations.
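One way to turn a per-residue fluctuation profile into restraint strengths is a simple inverse linear mapping, so that the most flexible residues receive the weakest restraints. The source does not state the functional form or the force constants actually used, so the linear mapping, the bounds `k_min` and `k_max`, and the function name below are all assumptions of this sketch.

```python
import numpy as np


def restraint_force_constants(fluct, k_min: float = 1.0, k_max: float = 10.0) -> np.ndarray:
    """Map per-residue fluctuations to dihedral restraint force constants.

    The most rigid residue gets k_max and the most flexible gets k_min, with
    linear interpolation in between (hypothetical mapping and units).
    """
    fluct = np.asarray(fluct, dtype=float)
    span = fluct.max() - fluct.min()
    if span == 0.0:
        # A flat profile carries no information; restrain everything equally.
        return np.full_like(fluct, k_max)
    scaled = (fluct - fluct.min()) / span
    return k_max - (k_max - k_min) * scaled


if __name__ == "__main__":
    print(restraint_force_constants([0.1, 0.5, 0.9]))
```

The resulting values would then be written into the restraint input of the MD package in use; that step is package specific and is not shown here.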
The AMBER96 force field with a GB implicit solvent model66 and an SA penalty term of 5 cal/mol Å² are used in REMD. Backbone torsional restraints that hold specific phi/psi angles in the same conformation are applied to all replicas. We simulate each protein with 26 replicas for 5 ns/replica at temperatures ranging from 270 to 450 K. Each structure is subjected to an energy minimization of 50 steepest descent iterations followed by 1000 conjugate gradient iterations prior to dynamics. The number of replicas is chosen to give an acceptance ratio of 48% for conformational swaps between replicas. At the end of the restrained REMD runs, the trajectory of the lowest replica is clustered to ~1 Å RMSD by using a modified k-means algorithm. Then, docking of the available ligands is carried out on each member of the clustered ensembles.
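The replica setup can be illustrated with a temperature ladder and the standard Metropolis criterion for swapping neighboring replicas. This is a sketch under assumptions: the source gives the number of replicas and the temperature range but not the spacing, so geometric spacing is assumed here, and the function names are hypothetical. Energies are taken to be in kcal/mol.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def temperature_ladder(t_min: float = 270.0, t_max: float = 450.0, n_replicas: int = 26):
    """Geometrically spaced temperatures between t_min and t_max (spacing is an assumption)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


def exchange_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """Metropolis acceptance probability for swapping replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)])
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


if __name__ == "__main__":
    ladder = temperature_ladder()
    print([round(t, 1) for t in ladder[:4]], "...", round(ladder[-1], 1))
```

Averaging `exchange_probability` over sampled energies of neighboring replicas gives the acceptance ratio that the number of replicas is tuned against.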
Docking with ROSETTALIGAND
We minimize each clustered conformation with a short equilibration MD to relieve some of the strain of the system. Then, the individual structures from the ensemble of PDZ domain conformations obtained with REMD are docked to several C-terminal peptides using the ROSETTALIGAND59, 60 protocol in the ROSETTA package. ROSETTALIGAND is a method specifically developed for docking ligands into protein binding sites. The method uses a Monte Carlo minimization protocol to optimize the rigid body position and orientation of the ligand and the protein side chain conformations. The energy function includes van der Waals interactions, an implicit solvation model, an electrostatics model, an explicit orientation hydrogen bonding potential, and an empirically derived torsional potential. The ROSETTALIGAND protocol we apply is substantially the same as described in the study of Meiler and Baker.60 The coordinates of the peptides are taken from the crystallographic complexes and they are treated as a single residue. The peptide flexibility is introduced by changing torsional angles and the backbone is held fixed throughout the docking simulation. In this study, we perturb the ligand position and orientation randomly with translations of mean 0.1 Å and rotations of mean 3°, respectively. We compute 10,000 trajectories to generate a comprehensive ensemble of conformations of the receptor-ligand complex for each peptide. The formation of a distinct binding funnel in binding energy/RMSD plots is an indication of successful docking, and the final docked conformations are selected based on the lowest free energy pose in the protein-binding site.
Assessing the scoring accuracy with DrugScore
After selecting the pose corresponding to the lowest free energy of binding at the end of Rosetta docking, we reassess the binding energy score of this complex using DrugScore.61 Thus, we submit the receptor part of the docked pose in PDB format and the ligand coordinates in mol2 format to DrugScore online (http://www.agklebe.de). DrugScore, a knowledge-based scoring function for protein-ligand interactions, employs statistically derived pair potentials using the distance-dependent occurrence frequencies by which a particular ligand atom type is found in contact with a protein atom type. Wang and Wang79 studied the accuracy of different knowledge-based scoring functions for docking of 100 complexes. They showed that only four (X-Score, DrugScore, PLP (piece-wise linear potential), and G-Score) among 11 postdocking ligand-protein scoring functions give moderate correlations with the experimentally determined protein-ligand binding affinities. Gohlke et al.61 and Velec et al.80 have also reported that DrugScore shows better ranking than the scoring functions within the docking programs. Furthermore, a recent study by Dokholyan and colleagues81 showed that the success rate of their docking results was improved to 85% by consensus scoring with DrugScore for a docking decoy set consisting of 100 complexes. Likewise, our binding affinities improved when the lowest energy pose was rescored with DrugScore. More negative values indicate a higher predicted binding affinity. With DrugScore, the binding selectivity preferences for Class I and Class II peptides become more significant.
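For orientation, a distance-dependent pair potential derived from occurrence frequencies generally takes an inverse-Boltzmann form. The expressions below are a generic sketch of that form, not a statement of DrugScore's exact implementation: the symbols $g_{ij}(r)$ (pair distribution of ligand atom type $i$ and protein atom type $j$), $g(r)$ (reference distribution averaged over all atom types), and the simple pair sum are assumptions of this illustration, and any additional terms used by DrugScore are omitted.

```latex
% Pair potential for ligand atom type i and protein atom type j at distance r
\Delta W_{ij}(r) = -\ln \frac{g_{ij}(r)}{g(r)}

% Score of a docked pose as a sum over ligand atoms k and protein atoms l,
% with i(k) and j(l) their atom types and r_{kl} their separation
\Delta W_{\text{pose}} = \sum_{k \in \text{ligand}} \; \sum_{l \in \text{protein}} \Delta W_{i(k)\,j(l)}\!\left(r_{kl}\right)
```

Under this convention, contacts that occur more often than the reference contribute negative terms, consistent with more negative scores indicating a higher predicted affinity.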
Results and Discussion
The proposed methodology is applied to the PDZ structures in Table I. First, we check the performance of ROSETTALIGAND by self-docking and cross-docking tests for experimentally known bound structures. Docking becomes more challenging when the experimental structures of bound complexes are not available, especially when homology models need to be used. For those cases, the success of docking depends on whether the right fluctuation profiles toward bound-like conformations can be incorporated accurately in docking computations. To test this, we generate MRCs of the unbound structure with the ENM-guided dihedral restrained REMD (ENM-REMD) method described in “Materials and Methods” and shown schematically in Figure 1(A). In all tables, we present heavy-atom RMSD values between the ligand position of the lowest energy docking pose (i.e., the first rank based on energy score) and that of the crystal structure. This provides a measure of the accuracy of any given docking attempt because, as in most docking approaches, the ligand is kept flexible while the receptor is kept rigid. Strikingly, in most of the docking tests, we observe that the lowest energy score obtained with our ENM-REMD method also has the lowest peptide RMSD value, indicating the success of our approach in sampling the correct peptide poses.
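The evaluation metric described above, the heavy-atom RMSD of the top-ranked pose relative to the crystal ligand, can be sketched as follows. The sketch assumes that pose and reference coordinates share the same receptor frame and atom ordering, so no superposition is performed on the ligand; the function names are hypothetical.

```python
import numpy as np


def ligand_rmsd(pose, reference) -> float:
    """Heavy-atom RMSD between two (n_atoms, 3) coordinate sets with matching atom order."""
    diff = np.asarray(pose, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def best_scoring_pose(scores, poses, reference):
    """Return (index, RMSD) of the pose with the lowest (most favorable) energy score."""
    best = int(np.argmin(scores))
    return best, ligand_rmsd(poses[best], reference)
```

Plotting score against `ligand_rmsd` for all poses gives the binding energy/RMSD funnel plots referred to in the text.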
Self-docking and cross-docking of bound complex structures
As the ultimate purpose of protein-ligand docking studies is to reproduce the bound complex structures from the unbound conformations, it is important to check the performance of the docking protocol on the experimentally known bound structures as a first test. Thus, we perform docking using bound conformations while the peptide and the backbone of the bound structure are kept rigid. The peptide RMSD values of the poses with the best score for each PDZ are summarized in Table II. The average RMSD of the self-docking tests is 0.62 Å. Note that overall RMSDs of up to 2 Å are generally accepted as near native in docking.82 We also check the docking scores of each bound structure with different peptides. We find that the average RMSD value of the cross-docking models (0.98 Å) is slightly higher than that of the self-docking tests (0.62 Å). Overall, we obtain quite low RMSD values with ROSETTALIGAND when docking to “bound” structures. Furthermore, the low energy scores of the lowest RMSD structures obtained with self-docking confirm the binding preference of each PDZ. It has already been shown that ROSETTALIGAND, which includes side chain conformations and ligand flexibility, provides accurate predictions compared to other existing docking methods in self-docking experiments,60 and the results of our self-docking and cross-docking of bound structures agree with these findings.
Table II. Docking Rigid Ligand Conformations to their Native (Bound) Structures (Self-Docking and Cross-Docking)
Evaluation of results of docking jobs based on ENM-guided dihedral restrained REMD snapshots
For all PDZs, we dock natural binding peptides along with different peptides using unbound structures in order to test whether we can predict the binding selectivities of PDZs with our approach. First, we generate deformed conformations by perturbing the unbound conformation along binding-related fluctuation profiles using a few slowest modes of ENM. Different sets of slowest modes (i.e., the slowest modes that are highly coupled with binding related fluctuations) are used for each PDZ by carefully analyzing the eigenvalue spectrum as explained in the Methods section. Thus, the weighted averages of five (1–5), three (1–3), five (1–5), and two (1–2) ENM modes are used for PSD-95, GRIP, Syntenin, and Erbin, respectively, to generate the deformed conformations (see “Methods” for details). The overall average backbone RMSD between the deformed and unbound conformations of PDZs based on the α-carbons is 0.84 Å, while the average RMSD between the deformed and bound conformations is only 1.06 Å. These structures are used as initial structures in the dihedral restrained REMD simulations. Finally, the lowest replica trajectories of the dihedral restrained REMD are clustered to generate an MRC set for the peptide docking. The lowest binding energy scores and corresponding peptide RMSDs are summarized in Table III for two different docking cases: (i) using only the unbound experimental structure for docking and (ii) using the ensemble of structures obtained by ENM-guided REMD for docking. The selected best-scoring poses (i.e., the poses that correspond to the lowest binding energy) are highlighted in gray. Successful docking runs are frequently accompanied by the formation of a distinct binding funnel in the binding energy/RMSD plots.60 Figure 2(A) shows the binding energy score versus RMSD of the docked complex for PSD-95. PSD-95 only binds to Class I peptides.
Simply using only the unbound experimental structure of PSD-95 for docking Class I and Class II peptides does not reveal the selectivity preference [brown and red dots in Fig. 2(A)]. However, when MRCs obtained by ENM-guided REMD are used for docking, we clearly see that our flexible scheme does a better job of indicating the binding selectivity of PSD-95 (blue and cyan dots in the plots). The best-scoring poses for the unbound conformation and the ENM-guided REMD snapshot obtained from our docking scheme are superimposed on the original Class I peptide position. As illustrated in Figure 2(A), green represents the original peptide (i.e., from the complex crystal structure), while blue and brown represent the best-scoring pose for the ENM-guided REMD snapshot and the unbound conformation, respectively. We obtain excellent agreement with the original peptide position when the ENM-guided REMD snapshot is used.
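The deformation step described above, in which the unbound structure is displaced along a weighted average of a few slow ENM modes, can be illustrated with a minimal sketch. The details here are assumptions rather than the protocol from the “Methods” section: the function names are hypothetical, each mode is weighted by the inverse of its eigenvalue, and the displacement is scaled to a chosen RMSD from the starting coordinates.

```python
import numpy as np


def weighted_mode_direction(modes, eigenvalues):
    """Combine slow modes into one unit-norm displacement direction.

    modes: array of shape (n_modes, n_atoms, 3)
    eigenvalues: array of shape (n_modes,)
    Inverse-eigenvalue weighting is an assumption of this sketch.
    """
    modes = np.asarray(modes, dtype=float)
    weights = 1.0 / np.asarray(eigenvalues, dtype=float)
    weights /= weights.sum()
    # Weighted sum over the mode axis gives an (n_atoms, 3) direction.
    direction = np.tensordot(weights, modes, axes=1)
    return direction / np.linalg.norm(direction)


def rmsd(a, b):
    """RMSD between two pre-aligned coordinate sets of shape (n_atoms, 3)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def deform(coords, direction, target_rmsd):
    """Displace coords along a unit-norm direction to reach target_rmsd."""
    coords = np.asarray(coords, dtype=float)
    # For a unit-norm direction, a step s yields RMSD = s / sqrt(n_atoms).
    step = target_rmsd * np.sqrt(coords.shape[0])
    return coords + step * direction
```

Calling `deform(coords, direction, 0.84)` would, for example, produce a structure at 0.84 Å RMSD from the input α-carbon coordinates; the sign of the displacement (towards or away from the bound state) would still have to be chosen separately.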
Table III. Docking of Peptides to Unbound Structures and ENM-REMD Snapshots
For the unbound GRIP docked to its original ligand (Class II), we obtain an RMSD of 0.29 Å for the best-scoring pose, compared with 1.8 Å in the PDZDocScheme of Niv and Weinstein.56 Surprisingly, the unbound structure of GRIP provides a good initial structure for docking. The prediction accuracy of docking calculations is generally expected to decrease with the quality of the receptor, from the bound (holo) protein to the unbound (apo) protein to modeled structures.83 In the case of GRIP, the unbound conformation may already be adequate to accommodate a ligand. Figure 2(B) displays the comparison of the binding energy score versus RMSD of the docked complex. With our docking scheme, we also find that its binding specificity is towards the Class II type peptide. The difference in the DrugScore binding energy score between Class I and Class II peptides is 34.13 kcal/mol when the unbound structures are used for docking. However, the difference for docking to ENM-guided REMD snapshots is 98.61 kcal/mol, which is much more significant (see Table III).
In Figures 3(A,B), we present the binding energy score versus RMSD of two PDZs (Syntenin and Erbin) that exhibit Class I and II dual specificities. First, the flexible docking scheme can indicate the dual specificities of these two PDZs (i.e., the DrugScore values of the lowest energy complexes obtained by ROSETTALIGAND are very close, as seen in Table III), whereas using only the unbound structure in ROSETTALIGAND fails to discriminate the dual specificity of Erbin. Moreover, using ENM-guided REMD snapshots as multiple receptor ensembles also provides discrimination of the binding affinities for Class I and Class II types in these PDZs. Figure 3(A) shows that Syntenin binds the Class II peptide with higher affinity than the Class I peptide, which is consistent with the experimental observation that the PDZ2 domain of Syntenin binds slightly better to the Class II than to the Class I peptide84 (the binding energies of the poses are –168.33 kcal/mol and –122.59 kcal/mol for docking of Class II and Class I peptides, respectively). When we compare the binding energy values of each peptide for the Erbin PDZ, Erbin binds the Class I peptide with higher affinity than the Class II peptide [Fig. 3(B)]. This agrees with the intrinsic binding affinities between the Erbin PDZ domain and peptides measured using ELISA: the phage-selected peptide (TGWETWV), which binds at a submicromolar level, has a higher affinity than that measured for the ErbB2 peptide (EYLGLDVPV).85, 86 Overall, incorporating flexibility with ENM-guided REMD runs predicts the binding selectivities of PDZ domains very accurately.
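The selectivity comparisons above reduce to two operations: picking the lowest-energy pose for each peptide class, then comparing the best scores across classes. The following is a minimal sketch of that bookkeeping; the function names and data layout are hypothetical, and only the two Syntenin energies are taken from the text.

```python
def best_pose(poses):
    """Return the (energy, rmsd) tuple with the lowest binding energy score."""
    return min(poses, key=lambda pose: pose[0])


def selectivity(best_energy_by_class):
    """Return (preferred_class, gap) from the lowest energy per peptide class.

    The gap is the score difference between the runner-up and the preferred
    class, so a larger positive gap means a clearer predicted preference.
    """
    ranked = sorted(best_energy_by_class, key=best_energy_by_class.get)
    preferred, runner_up = ranked[0], ranked[1]
    gap = best_energy_by_class[runner_up] - best_energy_by_class[preferred]
    return preferred, gap


# Lowest DrugScore energies quoted in the text for Syntenin (kcal/mol).
syntenin = {"Class I": -122.59, "Class II": -168.33}
preferred, gap = selectivity(syntenin)
```

For the Syntenin values this selects Class II with a gap of about 45.74 kcal/mol. A small gap, as for the dual-specificity cases, would suggest that neither class is clearly preferred; where to draw that line is not specified in the text and is left open here.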
The advantage of the docking procedure
We believe the success of our method relies on sampling bound-like conformations by performing an REMD simulation guided by the binding-related normal modes. To test the efficiency of our docking scheme, we compare it with docking the peptides into (i) the unbound structure, (ii) conformations generated by perturbing the unbound conformation using the binding-related normal modes of ENM, and (iii) an ensemble of conformations obtained from the snapshots of an REMD simulation started from the unbound experimental structure. Table IV presents the comparison of the docking results of PSD-95 with these different approaches. PSD-95 prefers Class I type binding; therefore, docking scores of the Class I peptide should be lower than those of the Class II peptide. As expected, docking the Class I peptide to the unbound structure of PSD-95 provides the worst RMSD value (1.49 Å). Moreover, docking the Class II peptide to the unbound structure gives a lower energy score than docking the Class I peptide. Docking to the conformations generated by perturbation along the ENM modes is computationally less expensive and provides good RMSD values (0.44 Å and 0.65 Å for two different deformed structures). However, the corresponding binding energy values obtained with DrugScore do not clearly indicate the binding affinity of PSD-95 (i.e., DrugScore binding energies are –54.64 kcal/mol for docking of the Class I peptide and –85.57 kcal/mol for docking of the Class II peptide). Thus, incorporating only ENM modes into docking does not differentiate the binding preference of PSD-95. Likewise, the clustered conformations of a straightforward REMD simulation do not discriminate the binding preference, and this approach also fails to capture the high-resolution, low-RMSD pose of the PSD-95 complex. We get 1.01 Å and –60.57 kcal/mol by docking to the snapshots of the REMD simulation started from the unbound experimental structure. 
However, our docking approach not only distinguishes the binding preference but also predicts the binding pose accurately. In our docking protocol, we first generate new conformations by perturbing the unbound structure along the weighted average of the first five modes (1–5) of ENM. Second, we use these conformations as initial structures for the restrained REMD simulations, where the dihedral restraints are imposed with respect to the average normal mode of ENM. Finally, the peptides are docked into the clustered structures of the REMD simulation. The pose obtained by our new protocol not only provides the lowest binding energy (–118.90 kcal/mol) but also gives the best RMSD value and a significant binding energy difference between Class I and Class II peptides. These results indicate that an REMD conformational search along the modes of ENM, implemented through dihedral restraints, improves the docking prediction accuracy by sampling native-like bound conformations. This observation is consistent with the other PDZs we tested, as explained above.
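The clustering step that turns the lowest-replica trajectory into a small set of receptor conformations can be sketched as follows. The clustering algorithm is not named in this section, so this sketch assumes a simple greedy "leader" scheme with an RMSD cutoff; the function names and the cutoff are hypothetical, and the snapshots are assumed to be already superimposed on a common reference.

```python
import numpy as np


def rmsd(a, b):
    """RMSD between two pre-aligned coordinate arrays of shape (n_atoms, 3)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def leader_cluster(snapshots, cutoff):
    """Greedy leader clustering; returns indices of representative snapshots.

    A snapshot becomes a new representative only if it lies farther than
    `cutoff` (in RMSD) from every representative chosen so far.
    """
    representatives = []
    for i, snap in enumerate(snapshots):
        if all(rmsd(snap, snapshots[r]) > cutoff for r in representatives):
            representatives.append(i)
    return representatives
```

Each representative returned this way would then serve as one receptor conformation in the MRC set passed to the docking program. The result of a leader scheme depends on the order of the snapshots, which is one reason a production protocol might prefer a different clustering method.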
Table IV. Docking Ligands into Unbound and Structures Obtained with ENM, MD Only, and ENM-Guided REMD for PSD-95
Flexible docking for homolog and mutant structures
In a more stringent test, we applied our flexible approach to homologs of PDZs whose binding selectivities have been verified experimentally, and also to PDZs whose binding specificities are altered upon mutation. Both Class I and Class II peptides were docked to homolog structures obtained using MODELLER. We then also obtained MRCs for the homolog structures with our ENM-guided REMD approach, using the MODELLER prediction as the starting conformation. The lowest binding energy scores of the docked poses for both homolog structures and their corresponding RMSD values are presented in Table V. The docked pose of Lrcc7, which has Class I binding specificity, has very low RMSD values (0.29 Å when the homology model is used for docking, and 0.23 Å when the MRCs of the ENM-REMD approach are used for docking). Likewise, docking the Class II peptide to Cipp-PDZ9, which exhibits Class II type binding affinity, yields a docked pose with an RMSD of 1.12 Å when the homology model alone is used for docking. On the other hand, docking the Class II peptide to Cipp-PDZ9 using our flexible approach gives a lower RMSD of 0.23 Å. As observed in the earlier PDZ docking cases, the inclusion of backbone flexibility through ENM-REMD snapshots does a better job of discriminating that Lrcc7 has affinity towards the Class I peptide and, likewise, that Cipp-PDZ9 prefers the Class II peptide (see Table V for DrugScore energy scores).
Table V. Prediction for Homolog and Mutated PICK1 Structures
We further test our flexible docking protocol by examining the binding characteristics of wild-type PICK1 and PICK1 carrying various mutations. Wild-type PICK1 can bind both Class I (PKCα) and Class II (GluR2) peptides.87–89 Yet, several mutations have been reported previously, such as the mutation of lysine 27 to glutamic acid (K27E), alone or together with the mutation of aspartic acid 28 to alanine (D28A), which completely disrupt the interaction with both GluR2 and PKCα.89 Moreover, the experimental study of Dev et al.87 has shown that mutating lysine 27 to glutamic acid, a point mutation on the βA–βB loop, changes the binding selectivity of PICK1 to exhibit only Class I behavior. Another mutation study, by Madsen et al.,88 which replaces a residue in the αB helix (lysine 83 to histidine), showed that the preference of PICK1 reverts to that of a Class I motif. Docking Class I and Class II peptides to the unbound conformations of wild-type and mutant PICK1 does not show the change in selectivity upon mutation [brown and red dots in Fig. 4(A–C)]. However, when the ENM-guided REMD snapshots are used, the wild type has a higher affinity for the Class II peptide [cyan in Fig. 4(A)], whereas both mutants prefer the Class I peptide [blue in Fig. 4(B,C)]. From the binding energy score versus RMSD of the docked complex for PICK1, we can clearly see that wild-type PICK1 [Fig. 4(A)] prefers the Class II type of peptide sequence, while the K27E [Fig. 4(B)] and K83H [Fig. 4(C)] mutations of PICK1 alter its binding specificity to the Class I type of peptide sequence, as observed experimentally.
We present a new flexible docking scheme, called the ENM-guided REMD-DOCK method, which biases the REMD search towards binding-induced conformational changes by incorporating the fluctuation profiles obtained from the slowest eigenvectors of ENM. We applied our method to a set of PDZs, which show different affinities for different peptide types. Our analysis has shown that generating MRCs using ENM-guided REMD can be a useful tool to predict binding poses and selectivities accurately. In our method, backbone flexibility is introduced by perturbing the receptor structure along its relevant normal modes, and the conformational space is explored with the help of a dihedral restrained REMD simulation, which enables us to sample the conformational space towards bound-like conformations. The individual snapshots of the receptor are docked against various ligands using a docking program (ROSETTALIGAND) to generate a collection of docked complexes of different stabilities. The final docked complex with the lowest binding energy score is also rescored with an independent scoring method (DrugScore) to increase the accuracy of the docking scores. Our results show that (i) receptor flexibility plays a key role in determining the binding selectivities of the PDZs, (ii) incorporating flexibility in PDZ-peptide docking predicts the docked poses accurately, with an RMSD of <1 Å, (iii) the experimental binding affinities are captured qualitatively (including the higher affinity of Erbin for the Class I peptide and that of Syntenin for the Class II peptide) when the MRCs are generated by REMD sampling of conformational space along the binding-induced fluctuation profiles of the normal modes, and (iv) the final screening process using DrugScore enables us to discriminate the binding specificity of ligands in PDZs accurately.
We gratefully acknowledge Dr. Michael Feig, Dr. Ozlem Keskin, Tyler Glembo, and Ashini Bolia for their valuable comments. We also thank the Fulton High Performance Computing Initiative at Arizona State University and NCSA teragrid for computer time.