For more than a century, enzymatic activity has been understood in terms of binding processes in which a compound interacts with an enzyme as a key would do with its lock. The “lock and key” model, proposed by Fischer in about 1890 to describe the interactions of a protein and a substrate, relies on the complementarity of physical properties such as shape, molecular size, solvation, and intermolecular interactions, which are central for ligand recognition and modulation of the pharmacological response of the target protein.
Since the first entry in the Protein Data Bank (RCSB PDB; www.pdb.org) was created in 1972 (PDB code 1sbt), more than 90,000 X-ray and NMR models have been made available in the PDB, which constitutes the primary source for structurally informed design (SID). High-resolution X-ray and NMR models provide valuable information to elucidate the structural basis for molecular recognition and to design novel target-modulating compounds. They also serve as structural templates for homology modeling of proteins whose structures have not yet been solved.
The number of structural models deposited in the PDB has increased rapidly over the last 10 years (Fig. 1). In the last two years alone, approximately 15,000 structures (ca. 17% of the entire PDB) were released. This represents an average of ∼600 structures per month, highlighting the need for tools and techniques that can make use of this information. Several scientific and technical advances have progressed in parallel with this increase in protein structural data. Along with increases in computing power, memory, and storage, software tools have been developed that aim to rapidly characterize, categorize, and sort the molecular interactions observed in the PDB (e.g., IsoStar, SuperStar, and MED-SuMo), and statistical potentials have been developed based on these data (e.g., DOPE, PMF, and DFIRE).
Such a wealth of structural information can be highly valuable in the hands of an expert SID scientist, who will be able to take advantage of structure-function relationships based upon visualization and interpretation of structural data. The ability to observe and infer information beyond atom coordinates distinguishes such expertise. In addition, the data in these structures contain an element of uncertainty, because the atomic coordinates are merely a model fit to the experimental data and are subject to a variety of errors, including model overfitting, missing or incorrect atoms, and incorporation of highly strained small-molecule ligands. There are many interesting examples to highlight in the PDB; within the scope of this review we will showcase a few examples of unusual structural interactions, including halogen- and sulfur-mediated interactions.
Effective SID requires the careful analysis of the structural model and the identification of “anchors” for maximizing ligand–receptor interactions. The PDB is a rich source of intermolecular interactions; most of them are widely known and seen quite often, while others may be more subtle or infrequent. Commonly observed ligand–receptor interactions include hydrogen bonds, salt bridges, and hydrophobic contacts. Focusing on hydrogen bonds, the most frequently seen acceptors are neutral or charged oxygen atoms and neutral nitrogen atoms. Common donors include neutral oxygen atoms and neutral or charged nitrogen atoms. Halogens are also able to participate in “nonclassical” hydrogen bonds as acceptors, or in a role analogous to a hydrogen bond donor in halogen bonds, but these types of interactions are seen less frequently in the PDB. Many force fields and modeling tools that utilize fitted information derived from structural data can easily reproduce common interactions that have many experimental examples, but have a limited ability to accurately reproduce infrequent interactions. Atypical interactions that are seldom encountered in crystallographic models may challenge us to rethink and reinterpret SID. This is where first-principles methods, such as quantum mechanics, can be helpful in deciphering and interpreting the underlying sources of such interactions.
SID of Halogen-Mediated Interactions
The value of halogen bonding and halogen interactions in rational drug design has been recently emphasized in several articles and reviews.[11, 13-17] Much of this discussion has centered on the apparent incongruity of halogens, generally regarded as electronegative atoms, being able to functionally act as “nonclassical” hydrogen-bond acceptors, or in a manner analogous to hydrogen bond donors in halogen bonds. Inspection of electrostatic potentials (Fig. 2) shows a region of positive potential at the distal end of the halogen, in the cases of Cl, Br, and I, that has been termed the “σ-hole.” Interactions involving these groups are highly directional, owing to the negative electrostatic potential that rings the central portion of the halogen; in fact, this region of negative potential can itself act as a hydrogen bond acceptor. In the case of fluorine, the halogen atom retains its overall negative potential and remains capable of behaving as a hydrogen bond acceptor in a more traditional sense. Fluorine atoms attached to strongly electron-withdrawing groups can exhibit a “σ-hole,”[20, 21] but these types of moieties are generally not seen in approved drugs. Fluorine atoms can also be used to modulate the electrostatic potential around another halogen atom by incorporating them into neighboring ring positions, effectively increasing the size of the “σ-hole” and the strength of interactions with the halogen.
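The directionality described above can be illustrated with a deliberately crude point-charge sketch. It is not a substitute for the quantum mechanical electrostatic potentials shown in Figure 2: the three charges, their positions, and the helper names `potential` and `probe_at` are all hypothetical and were chosen only so that the toy model shows a positive potential along the extension of the C–X bond and a negative potential perpendicular to it.

```python
import math

# Hypothetical toy charges (units of e) and positions (in Angstrom) for a C-X
# fragment lying on the x axis. These values are illustrative assumptions; they
# are not taken from the article or from any published force field.
SITES = [
    ((-1.9, 0.0, 0.0), +0.10),  # carbon bonded to the halogen
    ((0.0, 0.0, 0.0), -0.20),   # halogen nucleus X
    ((1.0, 0.0, 0.0), +0.10),   # off-centre positive site beyond X on the C-X axis
]

COULOMB_KCAL = 332.06  # converts e^2/Angstrom to kcal/mol


def potential(probe, sites=SITES):
    """Coulomb potential at `probe`, in kcal/mol per unit positive charge."""
    total = 0.0
    for position, charge in sites:
        total += charge / math.dist(probe, position)
    return COULOMB_KCAL * total


def probe_at(angle_deg, radius=3.0):
    """Point `radius` Angstrom from X; the angle is measured from the C-X axis extension."""
    theta = math.radians(angle_deg)
    return (radius * math.cos(theta), radius * math.sin(theta), 0.0)


if __name__ == "__main__":
    # Scan from "head-on" (0 degrees) to "side-on" (90 degrees) approach.
    for angle in (0, 30, 60, 90):
        print(f"{angle:3d} deg  V = {potential(probe_at(angle)):+.2f} kcal/mol/e")
```

In this toy model the sign of the potential changes between the head-on and side-on directions, which is the qualitative feature that makes a halogen bond donor along the bond axis and a hydrogen bond acceptor around its equator.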
As noted previously, the few examples of halogen-mediated interactions in the PDB present a challenge in predicting the “correct binding pose” when these interactions dominate ligand binding. Additionally, in some cases weak interactions may change the balance favoring one binding mode over another; it is up to an expert SID scientist to (a) be aware of such possible binding modes and to (b) assess their likelihood. Both scenarios require a heavy reliance on structural models and a wide familiarity with the PDB as a whole.
An example of subtle structural differences resulting in flipped binding modes is illustrated by the CDK2-inhibitor cocrystal structures solved by Schering-Plough scientists (Fig. 3). Two compounds containing classical moieties that are known to interact with kinase hinges were crystallized in different binding modes: (a) a pyrazolopyrimidine group in a traditional hinge binding mode, despite having a chlorophenyl moiety present (PDB code 2r3f), and (b) an imidazopyrazine in a flipped mode to accommodate a fluorobenzene interaction with the hinge NH (PDB code 2r3g). Both of these X-ray structures are of high resolution (1.50 Å for 2r3f, 1.55 Å for 2r3g) with clear and unambiguous electron density in the ligand binding site. In the case of the flipped binding mode, the fluorine acts as a “nonclassical” hydrogen bond acceptor, at a distance of 3.1 Å from the hinge nitrogen and with a C–F···H–N angle of 148°. The binding geometry is consistent with an energetically favorable interaction based on calculated contact energies and geometry optimization of model systems (the optimized F···N distance for fluorobenzene + N-methylacetamide is 3.2 Å at B3LYP/6-31+G*). This also suggests that the overall electronegative character of the fluorine atom allows it to form a productive interaction with the hinge NH that may not be possible for chlorine with its modest “σ-hole” character. The similar IC50 values and the known propensity of both imidazopyrazines and pyrazolopyrimidines to bind to the CDK2 hinge seem to indicate that both binding modes may be isoenergetic, or at least accommodated within the binding site.
An additional example of a fluorobenzene interacting with a kinase hinge was reported for TGF-beta receptor I kinase, PDB code 1rw8 (Fig. 4). Again, the fluorine acts as an acceptor at a distance of 2.9 Å from the hinge nitrogen and a C–F···H–N angle of ∼136°. One of the hydrogen atoms on the phenyl ring adjacent to the fluoro group is located about 2.7 Å from a hinge carbonyl oxygen (C···O distance of 3.6 Å) and may be involved in a “nonclassical” hydrogen bond.
Another recent example (Fig. 5) shows a series of fluorobenzene moieties interacting with an NS5b beta-loop, which is structurally reminiscent of a kinase hinge (PDB code 3skh). In this particular example, the fluorobenzene acts as a “nonclassical” hydrogen bond acceptor interacting with a backbone NH (C–F···H–N distance ∼3.0 Å, C–F···H–N angle ∼162°). This group was modified during the lead optimization process into a pyridone, maintaining similar interactions with the beta-loop through polar isosteric replacement.
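Contact geometries like the ones quoted for these three structures can be measured directly from atomic coordinates. The following is a minimal sketch using only the Python standard library. The coordinates are hypothetical placeholders, not values taken from any PDB entry, and the choice to measure the C–F···N angle at the fluorine (rather than an angle involving the hydrogen, which is usually absent from X-ray models) is an assumption, since the exact angle definition is not spelled out in the text.

```python
import math


def distance(a, b):
    """Euclidean distance between two points given as (x, y, z) in Angstrom."""
    return math.dist(a, b)


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.dist(a, b) * math.dist(c, b))
    # Guard against rounding slightly outside [-1, 1] before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical coordinates (Angstrom), for illustration only.
ring_carbon = (-1.35, 0.00, 0.00)  # aromatic carbon bearing the fluorine
fluorine = (0.00, 0.00, 0.00)
hinge_nitrogen = (2.63, 1.64, 0.00)  # backbone amide nitrogen

if __name__ == "__main__":
    print(f"F...N distance: {distance(fluorine, hinge_nitrogen):.2f} A")
    print(f"C-F...N angle:  {angle_deg(ring_carbon, fluorine, hinge_nitrogen):.1f} deg")
```

With real data, the three tuples would be replaced by the corresponding ATOM/HETATM coordinates from the structure file; whether a given distance and angle qualify as a favorable contact remains a judgment that the geometry alone does not settle.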
As mentioned earlier, the nature of these halogen-mediated contacts can be understood in terms of the electrostatic potential of the molecular fragments that take part in the interaction. However, it is also important to understand the directionality of these interactions. While it is possible to describe the “σ-hole” and halogen bonding phenomenon quite well using electrostatic potentials, we have found the incorporation of antibonding orbitals into the description of these interactions quite instructive.
The electrostatic potential maps of halobenzenes (Fig. 2) show that the electron-withdrawing nature of the halogen atom decreases the effective negative potential on the benzene ring by shifting it to the C–X bond. The “tip” of the C–F bond has the most strongly negative electrostatic potential. In the case of C–Cl and C–Br, there is a mixture of negative and positive potential along these bonds, with the region of negative electrostatic potential situated in an annular arrangement. For C–I bonds, the balance is shifted toward positive electrostatic potential. These illustrations provide an explanation of the rich and varied types of interactions halobenzenes can participate in: fluorobenzene acting as an acceptor, chloro- and bromobenzenes with mixed propensities for acting as acceptors and in halogen bonds, and iodobenzenes primarily participating in halogen bonds in the presence of carbonyl groups. Although iodobenzenes are often avoided in drugs due to concerns over metabolic lability, the thyroid hormone thyroxine (T4) is used to treat hypothyroidism (PDB entries 1eta and 2rox are examples of T4 bound to transthyretin) and some recent MEK1 inhibitors also incorporate this moiety (PDB codes 3orn, 3sls, and 3mbl). While they prove very useful in rationalizing the nature of halobenzene interactions, electrostatic potential maps offer limited information about the directionality of such contacts. However, the σ* antibonding orbitals of the involved fragments can be quite informative in this regard.
The σ* antibonding orbitals of benzene and the four halobenzenes are compared in Figure 6. For the sake of this discussion we will focus only on the graphical representation, which is what we have found to be most useful for explaining intermolecular interactions to medicinal chemists in drug discovery teams.
As expected for a symmetrical molecule, benzene has a uniform orbital distribution along its C–H bonds. This distribution is maintained for the most part in the case of fluorobenzene, except for the C–F bond, which presents a modest antibonding character. This differs from the other halobenzenes, which show predominant σ* antibonding orbitals along their C–X bonds. This suggests that fluorobenzene may rely on two interaction points: one being an acceptor on the C–F bond (described above in terms of its electrostatic potential) and one being a donor on an adjacent C–H bond, as both contain a large antibonding component. Based on the molecular orbital picture, the other three halobenzenes appear to interact primarily through the C–X bond. In that sense, a fluorobenzene can be compared directly to a pyridone group, which primarily forms interactions via its carbonyl and NH moieties. The pyridone electrostatic potential maps and σ* antibonding orbitals are shown in Figure 7 and are remarkably similar to the fluorobenzene maps, although fluorobenzene is symmetric around the C–F bond in contrast to the asymmetry of the pyridone ring.
As illustrated in the X-ray structures in Figure 5, the fluorobenzene ring appears to act as a bioisosteric replacement of a pyridone moiety. Examination of the SAR for this class of compounds suggests that the fluorobenzene ring is not as potent a binder as the pyridone, but the crystallographic evidence shows that similar protein–ligand interactions are maintained. This picture is consistent with the electrostatic potential maps and σ* antibonding orbitals (Fig. 7), which show slightly more pronounced regions of negative and positive electrostatic potential, as well as minor differences in size and shape of the σ* antibonding orbitals, adjacent to the pyridone as compared to the fluorobenzene. This implies that the same type of interaction is likely occurring in the examples shown in Figures 3 and 4 as well and that, although no pyridone analogs were exemplified as experimental verification, one could expect substitution of the pyridone for fluorobenzene to improve compound potency.
SID of Other Nontraditional Hinge Binding Motifs
Another interesting example of an uncommon interaction is contained in a recently published Chk1 kinase X-ray structure (PDB code 3u9n). In this case, the cocrystallized structure of the Chk1 kinase-inhibitor complex showed a thiazole group acting as a hinge binder with the sulfur atom facing the hinge backbone NH and CO groups (Fig. 8a). Although it is possible that nonproductive interactions can be tolerated when there are other compensatory interactions between the protein binding site and ligand, the SAR for this series suggests that the thiazole is engaged in an energetically favorable interaction. The Chk1 IC50 for the crystallized ligand is 75 nM, while direct replacement of the thiazole ring with isoxazole (Cpd 15 from ref. 34), which should be able to interact with the NH of Cys87, results in a significant loss in activity (Chk1 IC50 = 3400 nM). Similarly, other thiazole regioisomers show greatly diminished activity (Chk1 IC50 > 21 μM), illustrating that the arrangement of atoms in the thiazole is critical for the activity of the compound. Once again, we resort to electrostatic potential maps and σ* antibonding orbitals to interpret this nonobvious interaction of the inhibitor fragment and its surrounding residues.
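To put the size of this SAR effect in perspective, the IC50 values quoted above can be converted into a fold change and a rough free-energy difference. The sketch below assumes that the IC50 ratio approximates the ratio of binding constants (which holds only if both compounds were measured under the same assay conditions and mechanism) and uses a temperature of 298.15 K; both are assumptions rather than statements from the cited work, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(ic50_ref_nm, ic50_analog_nm):
    """How many times weaker the analog is than the reference compound."""
    return ic50_analog_nm / ic50_ref_nm


def approx_ddg_kcal(ic50_ref_nm, ic50_analog_nm, temperature_k=298.15):
    """Rough ddG = RT ln(IC50_analog / IC50_ref), assuming IC50 ratios track Kd ratios."""
    return R_KCAL * temperature_k * math.log(fold_change(ic50_ref_nm, ic50_analog_nm))


if __name__ == "__main__":
    # IC50 values quoted in the text: thiazole 75 nM, isoxazole analog 3400 nM.
    print(f"fold change: {fold_change(75, 3400):.0f}x")
    print(f"approx. ddG: {approx_ddg_kcal(75, 3400):.1f} kcal/mol")
```

Under these assumptions the thiazole-to-isoxazole swap corresponds to a roughly 45-fold loss, or a little over 2 kcal/mol, which is on the order of what a single favorable polar contact might contribute.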
The location of the thiazole adjacent to and blocking the hinge is unique in our experience. The sulfur atom is located roughly midway between the NH of Cys87 and the CO of Glu85 (distances of 3.5 and 3.3 Å, respectively). The environment around the thiazole seems to indicate that the main polar interaction takes place due to the overlap of positive electrostatic potential around the CH and the CO group of Glu85, as shown in Figure 8b. The negative electrostatic potential in the plane of the thiazole ring is driven by the most electronegative atom in the fragment, with no substantial contribution from the sulfur atom. However, it has been demonstrated that there are concentrations of negative electrostatic potential above and below the plane of the ring adjacent to sulfur atoms in thiazoles, which can participate in interactions with electrophiles, consistent with the position of the NH group of Cys87. Even though the electrostatic potential map is more complex for the whole inhibitor (Fig. 8c), the contribution of the S atom remains the same. From the latter maps, it can be observed that the highest negative potential is located around the oxygen of the benzofuran, while most of the hydrogen atoms facing the hinge are positively polarized, especially the CH adjacent to the thiazole sulfur. The small variation of the electrostatic maps around the thiazole S is a result of its high polarizability.
The orientation of the inhibitor in front of the hinge can be understood by analyzing the σ* antibonding orbitals of the thiazole fragment (Fig. 9). Because of the orthogonal orientation of the S σ* antibonding orbitals, the preferred orientation of thiazole will be in between the CO and NH of the hinge instead of directly in front of either of them, as in the case of the halobenzenes. For comparison, the σ* antibonding orbitals of thiophene were also calculated (Fig. 9b). The symmetrical nature of the fragment is transferred to the orbitals, which are identical in size and orientation, while in the case of thiazole the effect of the nitrogen atom in the ring causes the σ* antibonding orbital directly opposite it to be smaller. Also note that neither thiophene nor thiazole presents considerable contributions from its CH bonds, with the sulfur atom dominating the picture. Experimental evidence shows that 2-carbonyl substituted thiophenes adopt a preferred intramolecular syn-conformation,[36, 37] that nucleoside thiazoles display a preference for intramolecular interactions between thiazole sulfur and furanose oxygen, and that short sulfur to carbonyl oxygen distances are observed for oxathiazane and thiazolidine compounds. The nature and directionality of these interactions can be interpreted by the combined use of electrostatic potential maps and σ* antibonding orbitals. Similar interactions, intermolecular in this case, appear to take place between the carbonyl of Glu85 and the thiazole moiety of the ligand.
A Path Forward For SID
Biased by the existing knowledge base of kinase X-ray crystal structures, most SID scientists would not select the binding mode with fluorine as a hinge binder even if it were seen as one of the docking poses returned by a trusted model. Similarly, the interaction of a thiazole with the hinge is not obvious in the absence of crystallographic data. How do we increase our chances of success and accuracy? Moreover, does it make sense to obtain only a few crystallographic models on a congeneric series of compounds, assuming the binding mode will be the same?
As the modeling community has demonstrated an increased ability in retrospectively explaining frequent protein–ligand interactions, we believe that utilization of quantum mechanical calculations will improve the likelihood of prospectively predicting nonobvious interactions. Even though there are recent examples of scoring functions that incorporate halogen bonding terms, halogen and sulfur-containing heterocycles remain difficult to model and most often require quantum mechanical descriptions to explain their behavior. We present in this review a different way of thinking about the nature and directionality of these interactions that is generally applicable to other nontraditional and infrequently observed interactions. Routine inspection of structures in the PDB and the scientific literature helps in finding and utilizing unusual interactions, but still it takes a trained modeler's eye (who can interpret the electrostatic potential maps or the σ* antibonding orbitals) to further improve the quality of SID.
José Duca is Head of Computer Aided Drug Discovery (CADD) in Cambridge, Novartis Institutes for BioMedical Research (NIBR). Duca joined Novartis in 2010. Previously he had been with the Schering-Plough Research Institute and Merck Research Laboratories in Kenilworth, NJ, USA for 10 years and with Tony Hopfinger's group in the College of Pharmacy at the University of Illinois at Chicago as a Postdoctoral Fellow. He received his Ph.D. in Chemistry from the National University of Córdoba, Argentina.
Jason Cross is a Senior Scientist and heads the Molecular Modeling group at Cubist Pharmaceuticals. He received a B.Sc in Biochemistry from the University of Windsor in 1997 and a Ph.D. in Physical Chemistry from Wayne State University in 2002 after studying under Berny Schlegel. Following a postdoctoral position at Pfizer, he went on to provide computational chemistry support for drug discovery projects at Affinium Pharmaceuticals and Wyeth before joining Cubist Pharmaceuticals in 2009.