PI by NMR: Probing CH–π Interactions in Protein–Ligand Complexes by NMR Spectroscopy

Abstract While CH–π interactions with target proteins are crucial determinants for the affinity of arguably every drug molecule, no method exists to directly measure the strength of individual CH–π interactions in drug–protein complexes. Herein, we present a fast and reliable methodology called PI (π interactions) by NMR, which can differentiate the strength of protein–ligand CH–π interactions in solution. By combining selective amino‐acid side‐chain labeling with 1H‐13C NMR, we are able to identify specific protein protons of side‐chains engaged in CH–π interactions with aromatic ring systems of a ligand, based solely on 1H chemical‐shift values of the interacting protein aromatic ring protons. The information encoded in the chemical shifts induced by such interactions serves as a proxy for the strength of each individual CH–π interaction. PI by NMR changes the paradigm by which chemists can optimize the potency of drug candidates: direct determination of individual π interactions rather than averaged measures of all interactions.


Introduction
Fragment-based drug discovery (FBDD) [1] is based on the continuous optimization of weakly binding hits derived from initial screening methods towards highly selective and potent compounds that target proteins implicated in various types of disease. This process relies on the tuning of weak reversible interactions, the most important of them being van der Waals, electrostatic, and hydrogen-bonding interactions. [2] A special case of weak molecular forces that fine-tune molecular recognition events involves the interaction of the π-electron cloud of aromatic ring systems with aromatic or aliphatic hydrocarbons. In the literature, this interaction is often termed CH–π interaction or CH–π hydrogen bond, [3] or simply classified as a type of hydrophobic interaction. [4] Here the aliphatic or aromatic CH group acts as a hydrogen donor (a soft acid) and the π system as the hydrogen acceptor (a weak base). According to theoretical studies in the gas phase, this interaction is weak and mainly dominated by dispersive forces with a small electrostatic component. [5] For nonactivated CH groups, the interaction energy is about −1.5 kcal mol⁻¹; for activated, more acidic CH groups, as in acetylene or chloroform, the interaction becomes more favorable owing to an increase in electrostatic character. [5] Reviews on interactions involving aromatic π systems in general [3a,b, 6] and CH–π interactions in particular [3c] have been published. However, a clear definition of interactions involving π systems remains a subject of debate. [6,7]
CH–π interactions are major modulators of affinity and selectivity in protein–ligand complexes. [4,8] This is evidenced by the high relative occurrence of aromatic amino acids in drug pockets [9] and the large number of aliphatic donor groups implicated in protein–ligand aromatic interactions. [4] Most drugs contain one or more aromatic ring systems, with an average of 1.8 in recently marketed drugs. [10] Present-day drug-discovery programs rely on the modification of weak initial binders by optimizing van der Waals interactions through shape complementarity, electrostatic interactions through charge matching, and by inferring H-bond donor or acceptor groups from structural information. By contrast, no method currently exists to assess the beneficial or detrimental effects of CH–π interactions on the overall affinity of a protein–ligand complex. Their impact can only be inferred indirectly through global measures of affinity (KD) or derived from geometrical observations compared against historical statistical distributions (for example, from X-ray crystallography). A method to directly gauge the strength of individual CH–π interactions, rather than averaged measures of all interactions, would greatly benefit drug-potency optimization.
NMR spectroscopy is uniquely suited to extract site-specific information about molecular interactions at atomic resolution and is routinely used to guide drug-development processes. [11] Although a first example has been given with the detection of weak J-couplings between nuclei involved in methyl–π interactions within proteins, [12] a general detection strategy for CH–π interactions in protein–ligand complexes is still missing. It is important to note that suitably resolved NMR spectra can only be obtained using selective labeling; otherwise, strong one-bond 13C-13C scalar couplings (1JCC) lead to substantially lower signal-to-noise. The precise and accurate measurement of chemical-shift information is considerably facilitated by the availability of suitably labeled aromatic amino-acid side-chains for NMR detection. [13] Specific tryptophan labeling can be achieved by providing either the α-ketoacid derivative of tryptophan, namely indolepyruvate, or simply anthranilic acid. [14] Different isotopologues of these compounds can be prepared by efficient multistep organic synthesis, allowing for unique isotope-labeling patterns in aromatic amino acids and very sensitive NMR detection schemes. [14]
Herein, we show that the combination of amino-acid labeling strategies and sensitive 1H-13C protein NMR spectroscopy provides unprecedented insight into the details of CH–π interactions and their relevance for protein–ligand complexes. Monitoring the induced change in chemical shifts of protein side-chain protons engaged in a CH–π interaction with aromatic ligand moieties allows the identification of favorable CH–π interactions relevant for binding. The relationship between the observed chemical-shift change and beneficial stacking interactions can be used as a direct read-out for assessing the quality of individual, stabilizing CH–π interactions, thereby guiding further lead optimization.

Direct Detection of CH–π Interactions by NMR Spectroscopy
CH–π interactions are crucial determinants in protein–ligand interactions. The donors comprise aliphatic sp3-hybridized groups, most prominently the CH3 groups of leucine, valine, isoleucine, and alanine, and sp2-hybridized CH groups of aromatic amino acids, [4] among which tryptophan has the highest preference factor, followed by histidine and phenylalanine. [9] While the method is in principle applicable to all amino acids able to act as a hydrogen donor (aromatic and aliphatic), this work focuses on the example of tryptophan.
Here we employed precursor-assisted labeling of tryptophan η (eta), ε (epsilon), and ζ (zeta) carbons by supplementing standard minimal media with selectively labeled anthranilic acid, a metabolic precursor of tryptophan. [14] By combining two precursors, one labeled at positions η and ε, the other at position ζ, we reduce the number of NMR observables to a total of three signals per tryptophan. Binding domain 1 of the bromodomain-containing protein 4 (Brd4-BD1), which recognizes acetylated lysine residues, [25] was chosen as a model system owing to the availability of a large set of in-house protein–ligand co-crystal structures. Brd4-BD1 contains three tryptophans, yielding a total of nine signals (Figure 1).
In the approach presented here, the experimental read-out to probe CH–π interactions is the proton chemical-shift perturbation (CSP) and line broadening of the isolated 1H-13C spin-pair resonances induced by ligand binding. Significant 1H chemical-shift changes are induced by the bound ligand, provided that a CH–π interaction exists. Line broadening can occur due to reversible ligand binding (in the case of μM KD values) and in cases of multiple binding modes. Contributions from reversible binding can be excluded for a fully ligand-saturated protein. [15] In the following, we show that, in the case of Brd4-BD1 ligand complexes, the CH groups of Trp81 are sensitive reporters for favorable CH–π contacts and can be easily monitored with 2D 1H-13C HSQC NMR experiments. In order to provide a comprehensive assessment, we selected a set of 13 ligands (Supporting Information, Table S1) for which both crystal structures and affinity data were available. The selection of ligands covers representative binding modes observed in typical drug-development programs. Figure 2 shows the overlay of the 1H-13C HSQC spectra of selectively ζ (blue) and η/ε (red/green) labeled Brd4-BD1 bound to various ligands representing different binding modes. Three cases can be discriminated: i) the complete loss of signals due to a global conformational rearrangement of the interacting tryptophan (Figure 2a), ii) line broadening due to intermediate exchange effects (Figure 2b), and iii) substantial CSPs due to well-defined CH–π interactions (Figure 2c).
Reversible binding of ligand 1 to Brd4-BD1 (Figure 2a) leads to the loss of all three Trp81 1H-13C signals, a pronounced case of conformational-exchange-induced line broadening [16] due to a substantial structural rearrangement of the tryptophan side-chain. Although seemingly disappointing, this observation might help with clustering ligands according to their mode of binding. Figure 2b shows exemplary data for ligands with moderate binding affinities and site-selective line-broadening effects (only the η 1H-13C signal is affected). Ligand 2 displays a significantly broadened NMR signal with a moderate upfield shift of the interacting η proton, ΔωH-η = 0.43 ppm (Figure 2b, right). The observed line broadening is partly due to its low binding affinity (complete ligand saturation cannot be achieved), but other effects might also contribute (for example, residual mobility in the bound state and alternative binding modes). [15] It is important to note that, despite the lower binding affinity, the observed CSPs can be quantitatively related to binding geometry.
Ligand 3 (Figure 2c) is representative of ligands with optimal stacking geometries that lead to substantial upfield shifts of the involved tryptophan CH protons upon ligand binding. The presence of aromatic ring systems above the η- and ζ-CH groups of Trp81 leads to substantial upfield shifts of ΔωH-η = 1.69 ppm and ΔωH-ζ = 2.3 ppm, respectively.
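The read-out described in this section can be sketched in a few lines of Python. The sketch assumes a sign convention in which an upfield shift upon binding is reported as a positive CSP (free minus bound), which matches the positive upfield values quoted in the text. The free-state and bound-state shifts in the example are hypothetical placeholders, chosen only so that their difference equals the reported 1.69 ppm, and the function name is illustrative.

```python
from typing import Optional


def upfield_csp(free_ppm: float, bound_ppm: Optional[float]) -> Optional[float]:
    """Return the 1H chemical-shift perturbation in ppm.

    Assumed sign convention: positive values mean the proton resonance
    moved upfield (to lower ppm) upon ligand binding. A bound-state value
    of None stands for a signal that is no longer observable, for example
    because of exchange broadening, in which case no CSP is reported.
    """
    if bound_ppm is None:
        return None
    return free_ppm - bound_ppm


# Hypothetical shifts for one tryptophan CH proton (not measured values).
csp = upfield_csp(free_ppm=7.20, bound_ppm=5.51)
print(f"CSP = {csp:.2f} ppm")

# A lost signal yields no CSP rather than a misleading number.
print(upfield_csp(free_ppm=7.20, bound_ppm=None))
```

Returning `None` for a vanished signal keeps the first case (complete signal loss) distinct from a genuinely small CSP when ligands are compared.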

Geometric Parameters of the CH–π Interface
Encouraged by the large CSPs for protein CH groups in contact with aromatic π systems, we set out to analyze the underlying geometric dependencies. Protein chemical shifts have already been used successfully in some instances to orient ligands in the binding pocket. [17] The observed chemical-shift changes are due to ring-current effects of ligand aromatic rings and electric-field effects caused by charged groups. [18] Protons in spatial proximity to an aromatic ring system experience differential effects ranging from strong shielding to weak deshielding, depending on the orientation of the aromatic ring relative to the protons being affected. [19] In order to correlate this orientation dependence with the proton CSP values of the interaction, we analyzed our set of 13 ligands, for all of which both X-ray and NMR data were available. We apply a model first introduced by Pople, [20] in which the center of the aromatic ring is treated as a point dipole inducing a magnetic field given by the standard dipole equation (Equation 1), where Δσ is the change in the isotropic nuclear shielding constant in ppm, n is the number of circulating electrons (n = 6 for all ligands discussed in this work), e is the elementary charge in Franklin, a is the radius of the aromatic ring (1.39 × 10⁻⁸ cm), m is the electron mass in grams, c is the speed of light in cm s⁻¹, θ is the angle between the ring normal through the aromatic center (X) and the proton-to-ring-center vector in rad, and r is the distance from the proton to the ring center (H-X) in cm (Figure 3). [15c]
The geometric parameters r and θ were extracted from X-ray crystal structures based on the model depicted in Figure 3 (see also Table S1 in the Supporting Information). Interestingly, the proton-to-plane distance (H-Y) showed only small variations, between 2.4 and 2.9 Å, except for ligand 12, which has an H-Y distance of 3.27 Å and a CSP value of 0.23 ppm, the lowest among all ligands studied (Figure 4a). More variability was found for the proton-to-ring-center distances (H-X), which were between 2.5 and 4.0 Å (Figure 4b). The theoretical CSP values derived from Equation 1 are in very good agreement with experimental CSP values (ranging from 0.23 ppm to 2.74 ppm; see Table S1 in the Supporting Information and Figure 4c). The slight deviation from the theoretically expected slope of 1 is due to limitations of the Pople model. However, if properly parameterized, the precision of the point-dipole model is comparable to that of more sophisticated models, as has been reported before. [15c, 21] We also want to emphasize that, although the experimental data set contains ligands with moderate binding affinities, the correlation with CSP values predicted from X-ray crystal structures is equally good. Thus, even in cases where the binding affinity is only moderate and far from optimal, relevant structural information about the ligand binding mode can be extracted. The largest CSP values are obtained when the H-X distance is minimal, equivalent to stacking of the CH donor directly above the ring center. Thus, CSP values are very efficient sensors to probe proton-to-ring-center orientations (Figure 3). It is very instructive to compare the relative Trp81 orientations in the different protein–ligand complexes and their relationship to proton-to-ring-center distances (Figure 4a). Small H-X distances are found for T-stacked binding modes. For larger H-X distances, the tryptophan is shifted laterally and tilted relative to the aromatic ring system, resembling a parallel-displaced stacking mode.
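The geometric analysis in this section can be illustrated with a minimal Python sketch that derives r (H-X), θ, and the proton-to-plane distance (H-Y) from Cartesian coordinates and then evaluates a point-dipole shielding term. It assumes one common, isotropically averaged form of the Pople point-dipole expression, Δσ = 10⁶ · n e² a² (3cos²θ − 1) / (12 m c² r³), with positive values meaning shielding (upfield). The prefactor 1/12 and the sign convention are assumptions of this sketch and may differ from the parameterization used for Equation 1. The function names and the idealized benzene coordinates are illustrative only.

```python
import math

# Physical constants in CGS (Gaussian) units, as named in the text.
E_CHARGE = 4.80320e-10     # elementary charge in Franklin (esu)
M_ELECTRON = 9.10938e-28   # electron mass in g
C_LIGHT = 2.99792e10       # speed of light in cm/s
RING_RADIUS = 1.39e-8      # ring radius a in cm (value quoted in the text)


def ring_geometry(ring_atoms, proton):
    """Return (r, theta, h_y) for a proton relative to an aromatic ring.

    ring_atoms: (x, y, z) coordinates in Å, listed in ring order.
    proton:     (x, y, z) coordinate in Å.
    r is the H-X distance, theta the angle (rad) between the ring normal
    and the proton-to-center vector, h_y the proton-to-plane distance.
    """
    count = len(ring_atoms)
    center = [sum(atom[i] for atom in ring_atoms) / count for i in range(3)]
    # Ring normal from the summed cross products of consecutive
    # center-to-atom vectors (tolerates slight non-planarity).
    normal = [0.0, 0.0, 0.0]
    for k in range(count):
        u = [ring_atoms[k][i] - center[i] for i in range(3)]
        v = [ring_atoms[(k + 1) % count][i] - center[i] for i in range(3)]
        normal[0] += u[1] * v[2] - u[2] * v[1]
        normal[1] += u[2] * v[0] - u[0] * v[2]
        normal[2] += u[0] * v[1] - u[1] * v[0]
    length = math.sqrt(sum(x * x for x in normal))
    normal = [x / length for x in normal]
    to_proton = [proton[i] - center[i] for i in range(3)]
    r = math.sqrt(sum(x * x for x in to_proton))
    h_y = abs(sum(to_proton[i] * normal[i] for i in range(3)))
    theta = math.acos(min(1.0, h_y / r))
    return r, theta, h_y


def point_dipole_shielding(r_angstrom, theta_rad, n_electrons=6):
    """Ring-current shielding change in ppm (positive = upfield).

    ASSUMED form: isotropically averaged point dipole with prefactor 1/12.
    """
    r_cm = r_angstrom * 1e-8
    prefactor = (n_electrons * E_CHARGE ** 2 * RING_RADIUS ** 2
                 / (12.0 * M_ELECTRON * C_LIGHT ** 2))
    return 1e6 * prefactor * (3.0 * math.cos(theta_rad) ** 2 - 1.0) / r_cm ** 3


# Idealized benzene ring in the xy-plane with a proton 2.5 Å above its center.
benzene = [(1.39 * math.cos(math.radians(60 * k)),
            1.39 * math.sin(math.radians(60 * k)), 0.0) for k in range(6)]
r, theta, h_y = ring_geometry(benzene, (0.0, 0.0, 2.5))
print(f"H-X = {r:.2f} Å, theta = {math.degrees(theta):.1f}°, H-Y = {h_y:.2f} Å")
print(f"point-dipole shielding = {point_dipole_shielding(r, theta):.2f} ppm")
```

In this assumed form the angular term changes sign at θ ≈ 54.7°, so at a fixed H-Y distance of 2.5 Å the model predicts shielding only for lateral offsets below roughly 3.5 Å. This is consistent with the largest upfield CSPs arising when the donor proton sits directly above the ring center.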

Do Large Upfield CSPs Correlate with Favorable CH–π Interactions?
Given the good agreement between favorable stacking geometries and the observed change in the proton chemical shift of the interacting protein CH group, we took a closer look at the energetic details of the interaction. Interactions between an aromatic acceptor group and aromatic sp2 donor groups (benzene–benzene), as well as aliphatic sp3 donor groups (for example, benzene–methane), have been analyzed with both theoretical [5,22] and experimental methods. [23] According to calculations for benzene–benzene [22] and benzene–methane interactions, [5] the most favorable configuration places the donor proton directly on top of the aromatic ring center at a distance of approximately 2.5 Å, consistent with our experimental data for donor protons stacking on top of ligand ring systems.
We therefore employed DFT calculations of the interaction energies for the T-shaped stacking conformation by scanning the benzene surface in an orthogonal fashion at the optimal proton-to-ring-plane distance (H-Y) of 2.5 Å (Figure 5a). Figure 5b shows the resulting energy heatmap for the interaction of the CH donor group of one benzene molecule with the π system of the second benzene molecule as the acceptor. The DFT calculations revealed an interaction energy surface minimum (−2.86 kcal mol⁻¹) when the donor group is stacking directly on top of the ring center (H-X = H-Y = 2.5 Å; θ = 0°). Comparing the interaction energy surface (Figure 5b) with the calculated isotropic shielding constants, Δσ (Figure 5c), shows that pronounced shielding (greater than 0.2 ppm) of a protein donor proton is indicative of a favorable interaction energy contribution (approximately 1.0 kcal mol⁻¹). We thus conclude that CSPs, through their observed correlation with interaction energies, can serve as a proxy for beneficial CH–π interactions stabilizing protein–ligand complexes.
To demonstrate the relevance of CSPs for the identification of favorable CH–π interactions and their contribution to binding affinity, we analyzed a matched ligand pair (ligands 3 and 4), which are structurally identical except for the interaction interface with Trp81-η (methoxy propyl vs. phenyl). ITC measurements were carried out to extract binding affinities and thermodynamic parameters (Figure 6).
Figure 4. a) … Table S1 in the Supporting Information. The green dashed line corresponds to the most frequently found H-Y distance of 2.5 Å. b) Comparison between calculated chemical shifts and their dependence on the proton-to-ring-center distance (H-X). In the calculation (solid line) the H-Y distance was set to 2.5 Å. c) Correlation of CSP with the calculated nuclear shielding constant Δσ. Calculated values for Δσ from X-ray crystal structures are well reproduced by experimental data (R2 = 0.92).
The affinity of ligand 3 is significantly higher (KD: 38 nM; ΔH: −9.2 kcal mol⁻¹) than that of ligand 4 (KD: 124 nM; ΔH: −7.5 kcal mol⁻¹) owing to an additional CH–π interaction. The experimental difference in binding enthalpy amounts to −1.7 kcal mol⁻¹, which compares well with theoretical calculations for a perfect benzene–benzene T-shaped orientation (−2.45 kcal mol⁻¹). [22] Interestingly, the chemical-shift changes observed for all (ζ, η, and ε) 1H-13C Trp signals are consistently larger for ligand 3, thus providing independent evidence for an improved binding interaction. Although other contributions, for example, differential solvent effects, [24] might play a role, our measurements highlight that the robust improvement in overall binding affinity for ligand 3 can be readily observed and (at least in part) attributed to improved CH–π interactions.
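The thermodynamic comparison of the matched pair can be retraced with a short calculation. The sketch below uses the KD and ΔH values quoted above. The temperature of 298.15 K is an assumption, and the conversion ΔG = RT ln KD (1 M standard state) is the textbook relation rather than a procedure taken from the ITC analysis itself. Variable and function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15      # ASSUMED temperature


def binding_free_energy(kd_molar: float, temperature: float = T_KELVIN) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


# KD (mol/L) and binding enthalpy (kcal/mol) as quoted in the text.
ligand_3 = {"kd": 38e-9, "dH": -9.2}
ligand_4 = {"kd": 124e-9, "dH": -7.5}

ddH = ligand_3["dH"] - ligand_4["dH"]
ddG = binding_free_energy(ligand_3["kd"]) - binding_free_energy(ligand_4["kd"])

print(f"ddH (ligand 3 - ligand 4) = {ddH:.1f} kcal/mol")
print(f"ddG (ligand 3 - ligand 4) = {ddG:.2f} kcal/mol")
```

Under the assumed temperature, the free-energy difference implied by the two KD values is about −0.7 kcal mol⁻¹. That is smaller in magnitude than the enthalpy difference of −1.7 kcal mol⁻¹, in line with the caveat that other contributions may play a role.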

Discussion
Efforts to exploit non-covalent interactions in protein–ligand complexes to improve the potency and selectivity of drugs would greatly benefit from the availability of information-rich experimental techniques to probe and quantify these interactions, ideally with atomic resolution. A prominent case of non-covalent forces that fine-tune molecular recognition events involves the interaction between aromatic ring systems and aromatic or aliphatic hydrocarbons, often termed CH–π interaction or CH–π hydrogen bond. Despite the undisputed relevance of this interaction, experimental demonstrations have been scarce due to technical limitations. Although X-ray crystallography can provide detailed structural information, an unambiguous demonstration of the existence of stabilizing CH–π interactions in solution is not straightforward. Here we show that our approach, PI by NMR, probes individual CH–π interactions and enables medicinal chemists to move away from coarse-grained methods quantifying global measures of affinity to a site-specific, interaction-based design strategy.
The exquisite sensitivity of the chemical shift to subtle changes of the chemical environment makes NMR the method of choice to probe small variations of chemical environments upon, for example, ligand binding to a protein target. Although this feature is widely known and generally accepted, applications to macromolecular systems typically found in drug-discovery programs have been limited by experimental constraints. Among other reasons, the overwhelming NMR spectral complexity of proteins, together with limited sensitivity and unwanted scalar coupling effects, has hampered applications. We previously described that appropriate selective precursors for aromatic-residue isotope labeling can be synthesized and used in bacterial cell cultures to selectively label defined positions in the aromatic amino acids phenylalanine, tyrosine, tryptophan, and histidine. [26]
The application of these labeling techniques to study CH–π interactions in protein–ligand complexes offers exciting possibilities in drug-design programs. Importantly, the technique is applicable to protein–ligand systems covering a wide range of binding affinities from μM to nM, indicating the potential for structure-based drug-design programs. We anticipate that the broader implementation of our approach will impact future drug-development programs in several ways. First, simply tracking the extent of chemical shifts induced by ligand binding allows the identification of favorable CH–π interactions relevant for binding, thereby guiding further lead optimization. Monitoring whether the optimal binding modes are retained during compound optimization is consequently straightforward and only requires comparison of the ligand-induced CSPs. Second, numerous chemical scaffolds are found in early stages of FBDD programs. Selection of the most promising fragments suitable for subsequent lead optimization is still a daunting task. Observation of the chemical-shift changes induced by aromatic ligand moieties is indicative of favorable CH–π interactions and may constitute an important fragment-selection criterion. Finally, the growing arsenal of selectively labeled precursors covering all relevant protein CH donor groups will allow unique pharmacophore features to be addressed in hitherto unexplored protein binding pockets, even in the absence of X-ray crystallographic data. We are currently building on the premise of using fragment-induced CSP information from side-chain-labeled protein to guide a chemical design strategy in order to optimally position ligand aromatic rings within the binding site of the protein target of interest.
Figure 5. a) Configuration for the calculation of ΔH values. The CH donor group of one benzene molecule was placed perpendicularly at an H-Y distance of 2.5 Å with respect to the interacting aromatic acceptor group of the second benzene ring. b) Energy surface for an orthogonal scan of the benzene–benzene interaction at a distance of 2.5 Å in units of kcal mol⁻¹. Asymmetry in calculated energies for the x-/y-axes is mainly caused by the constant orientation of the CH-donating benzene ring along the y-axis, along with the contribution of ortho hydrogens to the interaction. c) Calculated isotropic shielding values (Δσ) for the same benzene–benzene interaction at a distance of 2.5 Å in units of ppm.
We anticipate that the ease of implementation and high spatial resolution of PI by NMR will change the paradigm by which chemists optimize drug potency. The possibility to specifically optimize CH–π interactions in protein–ligand complexes clearly has the potential to transform how we design molecular therapeutics.