Using Ligand-Mapping Simulations to Design a Ligand Selectively Targeting a Cryptic Surface Pocket of Polo-Like Kinase 1

however, when ligands are not bound to it, the pocket is closed. This cryptic pocket therefore presents a classic problem in SBDD targeting a flexible protein surface. Here, we show that although opening of the pocket is highly unfavorable in the absence of a ligand, it is possible to identify all of its known ligand-binding modes, as well as a novel mode, by using a modified ligand-mapping technique. The previously unknown binding mode was used as a basis for the design of a new ligand with similar affinity to others binding this pocket. The predictions were validated by solving the crystal structure of the bound complex.
The walls of the hydrophobic PBD secondary binding site comprise Tyr417, Tyr421, Leu478, Tyr481 and Tyr485, with Val415 and Phe482 lying at the bottom. Side-chain movements of Tyr417 and Tyr481 allow for the accommodation of a phenyl or other hydrophobic moiety. We first attempted to generate a conformational ensemble of the hydrophobic binding site by performing a 50 ns molecular dynamics (MD) simulation of the unliganded protein in explicit water, using the Amber ff99SB-ILDN force field.

calculate long-range electrostatic interactions under periodic boundary conditions, using a mesh spacing of 1.0 Å. With positional restraints on the solute atoms, 500 cycles of steepest descent and 500 cycles of conjugate gradient energy minimization were carried out, followed by two 50 ps MD equilibration runs: in the first, the system was heated gradually from 0.1 to 300 K at constant volume; in the second, the system was held at a constant pressure of 1 atm and a temperature of 300 K. Subsequent unrestrained equilibration (2 ns) and production (50 ns) runs were carried out at constant temperature (300 K), using a Langevin thermostat [12] with a collision frequency of 2 ps⁻¹, and at constant pressure (1 atm), using a Berendsen barostat [13] with a pressure relaxation time of 2 ps. The 10 independent 5 ns MD simulations of unliganded PBD with the closed pocket were initiated with different atomic velocities and seeds for the pseudorandom number generator, and equilibrated as described above.

Umbrella sampling
Umbrella sampling [14] was performed to obtain the free energy profile for the χ1 side-chain dihedral of Y481. The fully equilibrated structure of unliganded PBD with the closed pocket was used as the initial structure for all the umbrella sampling simulations. Thirteen 3 ns simulations were performed by varying the favoured angle from -180° to 180° in increments of 30°. The force constant of the torsion angle restraint was 15 kcal mol⁻¹ rad⁻². Positional restraints were placed on all atoms except those belonging to Y481, Y417 and F482 in order to reduce the need to equilibrate over many slow orthogonal degrees of freedom. The combined results from the last 2 ns of the simulations were analyzed using the weighted histogram analysis method (WHAM). [15] We assessed the convergence of the sampling by comparing the PMFs calculated over the 1-2 ns and 1-3 ns time ranges, which gave very similar results (Figure S2).
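The window layout and restraint described above can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation setup actually used: the function names are hypothetical, and the bias is assumed to take the form k(Δχ)² with no factor of 1/2 (the AMBER convention), which the text does not state explicitly.

```python
import math

def window_centers(start=-180.0, stop=180.0, step=30.0):
    """Restraint centres in degrees, inclusive of both end points."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]

def wrap_deg(delta):
    """Wrap an angular difference into the interval [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0

def bias_energy(chi1_deg, center_deg, k=15.0):
    """Harmonic torsion bias in kcal/mol.

    k is in kcal mol^-1 rad^-2. ASSUMPTION: E = k * dtheta^2 with no
    factor of 1/2; other codes define the force constant differently.
    """
    d = math.radians(wrap_deg(chi1_deg - center_deg))
    return k * d * d

centers = window_centers()
print(len(centers), centers[0], centers[-1])
```

Because the dihedral is periodic, the windows centred at -180° and 180° restrain the same angle; the wrap in `bias_energy` keeps the penalty continuous across that boundary.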

Ligand-mapping MD simulations
The initial unliganded PBD structure for ligand-mapping simulations was derived from the Protein Data Bank (PDB code 3FVH) as described above. Packmol [16] was used to generate 10 different placements of 40 benzene molecules within 40 Å of the protein centre. Each of the 10 systems was neutralized with a chloride ion and solvated with TIP3P water molecules in a periodic truncated octahedron box to yield a concentration of ~0.2 M benzene. Minimization, equilibration and production (5 ns) MD simulations were carried out as described above, for a cumulative sampling time of 50 ns.
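The relationship between the number of benzene molecules and the quoted concentration can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the box volume is not given in the text, so it is back-calculated here from the stated 40 molecules and ~0.2 M rather than taken from the simulations.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
LITRES_PER_A3 = 1.0e-27    # one cubic angstrom expressed in litres

def molarity(n_molecules, volume_A3):
    """Molar concentration of n_molecules in a volume given in cubic angstroms."""
    return n_molecules / (AVOGADRO * volume_A3 * LITRES_PER_A3)

def volume_for_molarity(n_molecules, target_molar):
    """Volume in cubic angstroms needed to reach target_molar with n_molecules."""
    return n_molecules / (AVOGADRO * target_molar * LITRES_PER_A3)

# ASSUMPTION: back-calculated volume, not a value reported in the source.
box_volume = volume_for_molarity(40, 0.2)
print(round(box_volume), "cubic angstroms")
print(round(molarity(40, box_volume), 3), "M")
```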

Ligand design
The ligand-mapping MD trajectory structure with the shortest distance between Phe482 and the alternatively-bound benzene was used for ligand design. By superimposing this structure on the crystal structure of a PBD peptide complex (PDB code 3P37), a chimeric peptide was designed by replacing the N-terminal FDP residues of the peptide in the crystal structure with a 3-phenylpropanoyl moiety such that the phenyl ring approximately occupies the position of the bound benzene. The PBD trajectory structure complexed with the chimeric peptide was used as the initial structure for MD simulation using the same protocol as described above, except that the production run was performed for 20 ns.
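Selecting the trajectory frame with the shortest Phe482-benzene distance can be sketched as follows. This is a hedged illustration: the text does not say how the distance was measured, so the centroid-to-centroid distance used here is an assumption, and the function names and input layout are hypothetical.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def closest_frame(frames):
    """Return (frame_index, distance) for the frame with the shortest distance.

    frames: list of (phe_atoms, benzene_atoms) pairs, each a list of
    (x, y, z) tuples in angstroms.
    ASSUMPTION: the distance is taken between the two group centroids.
    """
    if not frames:
        raise ValueError("no frames supplied")
    best_index, best_dist = None, float("inf")
    for i, (phe_atoms, benzene_atoms) in enumerate(frames):
        d = math.dist(centroid(phe_atoms), centroid(benzene_atoms))
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index, best_dist
```

In practice the coordinates would be read from the trajectory with an MD analysis library; plain tuples keep the sketch free of dependencies.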
Atomic charges for benzene and the 3-phenylpropanoyl fragment were derived using the R.E.D.

Trajectory analysis
Non-hydrogen atoms of the 7 pocket residues (Val415, Tyr417, Tyr421, Leu478, Tyr481, Phe482 and Tyr485) were clustered using the MMTSB toolset. [20] The ART-2 algorithm [21][22] was used for RMSD-based clustering. Suitable cutoff radii (1.6 Å for the unliganded simulations and 1.4 Å for the ligand-mapping simulations) were empirically chosen to produce clusters containing as many centroid conformations corresponding to crystal structure conformations as possible. Clusters related by a 180° flip of Phe and Tyr aromatic rings about χ2 were identified and combined. PyMOL [23] and Visual Molecular Dynamics (VMD) [24] were used for visualizing and generating figures.
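The equivalence of ring conformations related by a 180° flip about χ2 can be expressed by folding χ2 onto a 180° period. The sketch below illustrates only that symmetry argument; it is not the procedure used to merge the clusters, and the function names and tolerance are hypothetical.

```python
def fold_chi2(chi2_deg):
    """Fold a chi2 angle in degrees into [-90, 90).

    Phe and Tyr rings are symmetric about the chi2 axis, so angles that
    differ by 180 degrees describe the same ring placement.
    """
    return (chi2_deg + 90.0) % 180.0 - 90.0

def same_ring_pose(chi2_a, chi2_b, tol_deg=1e-6):
    """True if two chi2 values agree up to a 180-degree ring flip.

    tol_deg is a hypothetical tolerance; folding the difference rather
    than each angle avoids artefacts near the +/-90 degree boundary.
    """
    return abs(fold_chi2(chi2_a - chi2_b)) <= tol_deg

print(fold_chi2(100.0), fold_chi2(-80.0))
```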
Benzene occupancy grids were generated by using the ptraj module of AMBER 11 to bin carbon atoms of benzene molecules into 1 Å × 1 Å × 1 Å grid cells. A grid was generated for each of the three benzene binding modes using 100 snapshots exhibiting the relevant binding mode from a ligand-mapping MD trajectory. An isocontour value of 7 was used for visualization of benzene occupancy in the grids.
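The binning step behind an occupancy grid of this kind can be sketched in plain Python. This is not the ptraj implementation: the function names are hypothetical, the grid origin is an arbitrary assumption, and whether raw counts per cell correspond directly to the isocontour value of 7 quoted above depends on how ptraj normalizes its grids, which the text does not specify.

```python
import math
from collections import Counter

def occupancy_grid(frames, spacing=1.0, origin=(0.0, 0.0, 0.0)):
    """Count atoms per cubic grid cell over a set of snapshots.

    frames: iterable of snapshots, each a list of (x, y, z) benzene
    carbon coordinates in angstroms. Returns a Counter keyed by integer
    cell indices (i, j, k). ASSUMPTION: origin and cell alignment are
    arbitrary choices made for this sketch.
    """
    grid = Counter()
    for coords in frames:
        for x, y, z in coords:
            cell = (
                math.floor((x - origin[0]) / spacing),
                math.floor((y - origin[1]) / spacing),
                math.floor((z - origin[2]) / spacing),
            )
            grid[cell] += 1
    return grid

def cells_at_or_above(grid, isovalue):
    """Cells whose raw count reaches the chosen contour level."""
    return {cell for cell, count in grid.items() if count >= isovalue}
```

A sparse `Counter` stores only visited cells, which is convenient when benzene explores a small part of the box; a dense array would suit export to a volumetric file format for visualization.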

Experimental validation
Protein preparation
A truncated construct of the PBD of human Plk1 (residues 371-594), used for crystallization, and a full-length His6-PBD construct (residues 345-603) were expressed in E. coli and purified in two chromatographic steps according to a previously reported procedure. [2]

Chimeric peptide preparation

Experimental diffraction data were collected in-house using an X8 Proteum diffraction system equipped with a MICROSTAR rotating-anode X-ray generator, Helios MX optics and a Platinum135 CCD detector. Data were processed with the PROTEUM software package, and the structure was solved by molecular replacement using PHASER [25] from the CCP4 program suite, with the 3P2Z structure (chain A) as the search model. The model was manually rebuilt with Coot [26] and refined using Refmac. [27] A summary of data collection and refinement statistics is shown in Table S1.