Membrane proteins pose problems for the application of NMR-based ligand-screening methods because of the need to maintain the proteins in a membrane mimetic environment such as detergent micelles: the micelles add to the molecular weight of the protein, increase the viscosity of the solution, interact with ligands non-specifically, give signals that overlap with protein signals, modulate protein dynamics and conformational exchange and compromise sensitivity by adding highly intense background signals. In this article, we discuss the special considerations arising from these problems when conducting NMR-based ligand-binding studies with membrane proteins. While the use of 13C and 15N isotopes is becoming increasingly feasible, 19F and 1H NMR-based approaches are currently the most widely explored. By using suitable NMR parameter selection schemes that are independent of, or exploit, the presence of detergent, 1H-based approaches require the least effort in sample preparation because of the high sensitivity and natural abundance of 1H in both ligand and protein. On the other hand, the 19F nucleus provides an ideal NMR probe because its sensitivity is similarly high to that of 1H and there is no natural 19F background in biological systems. Despite its potential, the use of NMR spectroscopy is highly underdeveloped in the area of drug discovery for membrane proteins.
Membrane proteins are encoded by up to 30% of typical genomes and constitute the most important class of drug targets: more than 60% of current drugs are targeting membrane receptors, channels or transporters. Among these, the G-protein-coupled receptors (GPCRs) are the largest group of drug targets because of their important role in mediating communication between the inside and outside of the cell in response to an enormous variety of different ligands, ranging from small proteins and peptides to small organic molecules, ions and even light. These ligands can be hormones, odorants, neurotransmitters or other functional classes of biologically active compounds. Despite the importance of membrane proteins as drug targets, they have not been very amenable to structure-based drug design. This is because the hydrophobic nature of their transmembrane regions hampers crystallization as well as NMR-spectroscopic analysis. Progress in membrane protein structure determination by NMR is steadily being made, with some recent spectacular breakthrough achievements in the sizes of protein structures obtained for both β-barrel membrane proteins (1,2) and α-helical proteins (3). Because the structure determination of membrane proteins involves extensive detergent screening and the selection of suitable buffer conditions, it is not a routine application. Thus, NMR structure-based drug design involving membrane protein targets still remains a future goal. However, this does not preclude the application of NMR techniques to membrane protein drug discovery. In particular, NMR spectroscopy can yield high-quality ligand-binding information even in the absence of the structures of the targets. This article will explore the applicability of different NMR-spectroscopic approaches to the study of ligand–membrane protein interactions from a fundamental perspective keeping in mind their potential use in drug discovery.
Although NMR-based screening is only one of many screening tools in drug discovery, its simplicity, wide range of application (including protein–protein and protein–nucleic acid interactions) and superior ability to detect weakly bound molecules have attracted much attention. Nuclear magnetic resonance allows the measurement of multiple parameters at different levels of complexity and information content. Thus, NMR-based methods differ significantly from one another as a result of the particular approach used. Excellent reviews of different NMR-screening methods are provided for example in (4–7). Here, we briefly review the different methods that have been mainly employed with soluble proteins to provide an idea of the scope of approaches with potential or realized applicability to membrane protein ligand screening. Fundamentally, two types of experiments can be distinguished in NMR-based screening approaches: one to detect protein signals (Screening of ligands by detecting target-protein signals) and the other to detect ligand signals (Screening of ligands by detecting ligand signals). There are also specialized improvements in technology to increase throughput or to study particular types of ligands such as those that disrupt protein–protein interactions (Other NMR-based screenings). Because sensitivity of the observed NMR signals in the ligand–protein interacting systems depends on binding affinity, the estimation of the ligand dissociation constant (or binding constant) is also described (Determination of ligand-binding constants by NMR), before we end with a Summary.
Screening of ligands by detecting target-protein signals
In protein-detection-based screening, the identification of ligand binding is based on changes in NMR signals arising from proteins, typically in one-dimensional 1H spectra or two-dimensional 1H,15N-heteronuclear single quantum correlation (HSQC) spectra. Because of the large number of peaks in proteins, two-dimensional experiments will afford better resolution of signals but require that the protein is labeled. The longer data acquisition times for higher dimensional spectra are also a drawback, especially when screening larger numbers of ligands. Recent efforts are therefore aimed at decreasing the acquisition time, including ‘SOFAST-HMQC’ or ‘Ultra-fast experiments’ (8,9).
Binding information can be obtained from one- or two-dimensional spectra regardless of whether the signals are assigned, simply by recording whether signals show altered chemical shifts or line widths, and many screening programs are based on this approach (6). Broadening of the NMR signals is observed when the exchange rate (defined by the population-weighted on/off-rate of the ligand) is similar to the difference in chemical shifts between the free and bound forms (10). Separate signals for the free and bound forms are only observed when the exchange rate is slow, i.e., typically when the ligand binds tightly; when the exchange is fast, a single signal appears at the population-weighted average position. Broadening and changes in chemical shifts of signals upon ligand binding are determined by the differences in chemical shifts, the relative protein/ligand molar ratio and the on/off-rate of the ligand. For an in-depth discussion of the different regimes, see (7,11).
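The comparison between exchange rate and chemical shift difference can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the cited reviews: the function name, the use of k_ex in s^-1 compared against the angular frequency difference 2πΔν, and the factor-of-ten margins separating the regimes are all assumptions made for the example.

```python
import math


def exchange_regime(k_ex_per_s, delta_nu_hz, margin=10.0):
    """Classify two-site exchange on the chemical shift time scale.

    k_ex_per_s  : exchange rate in s^-1 (assumed input, e.g. k_on[L] + k_off)
    delta_nu_hz : shift difference between free and bound forms in Hz
    margin      : arbitrary factor (assumption) separating the regimes
    """
    delta_omega = 2.0 * math.pi * abs(delta_nu_hz)  # rad/s
    if k_ex_per_s > margin * delta_omega:
        return "fast"          # one averaged signal
    if k_ex_per_s < delta_omega / margin:
        return "slow"          # separate free and bound signals
    return "intermediate"      # pronounced line broadening


# Hypothetical values: a 100 Hz shift difference and three exchange rates.
for k_ex in (1e5, 600.0, 10.0):
    print(k_ex, exchange_regime(k_ex, 100.0))
```

Because the shift difference in Hz scales with the spectrometer field, the same ligand can fall into different regimes at different field strengths or for different nuclei.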
Chemical shift changes and line broadening are parameters that can be used for screening even if resonance assignment is not feasible. More information, however, can be obtained when signals are assigned. In that case, the changes in chemical shift or broadness of lines can be used to generate testable hypotheses on what are the residues in contact with the ligand, or which are allosterically modulated by ligand binding.
Even more information can be extracted, if the protein structure is known. In particular, the pioneering work of Fesik at Abbott Laboratories (Illinois, USA) opened a new field in the area of fast and efficient drug discovery, a technique coined structure–activity relationships by nuclear magnetic resonance (‘SAR by NMR’) (12). The Abbott group uses 2-dimensional 1H,15N-HSQC spectra to screen small molecular weight compounds for binding to 15N-labeled proteins of determined structures. Structure–activity relationships by NMR locates the binding site for the ligand on a protein’s surface because the resonances have been assigned prior to ligand screening, and the structure of the protein is known. Comparing the structures of compounds that bind to the same site on a protein provides information about the functional groups involved in ligand binding and can guide the synthesis of lead compounds by medicinal chemistry. This technique is restricted, however, to protein sizes of less than 30 000 D because of limitation by the molecular rotational correlation times leading to broad NMR lines for larger proteins. Many compounds have been discovered by this technique (13), and several compounds emerged in human clinical trials (14).
In cases where protein signals have not been or cannot be identified, other lead optimization methods such as Inter-Ligand NOE (ILOE) and ILOE for Pharmacophore Mapping (INPHARMA) can be used to detect protein-mediated ligand–ligand interactions by detecting ligand signals (15,16). The principle of these methods is based on two ligands binding to the same protein. ILOE is used to identify pairs of small molecules that bind to adjacent sites on the surface of the target protein (15). In contrast to the ILOE’s detection of simultaneous ligand binding at two different but proximal sites in the protein (15), the INPHARMA technique is specialized to identify ligands that compete for the same ligand-binding site (16). The idea in the ILOE approach is similar to the SAR by NMR approach in that the occupation of proximal but initially independent ligand-binding pockets can be combined with a single ligand targeting both pockets to obtain higher affinity ligands. In the INPHARMA approach, the two ligands are never close in space or bound to the protein simultaneously, but rather the observed NOEs are mediated by spin diffusion via the protons on the protein. The advantage of the ILOE and INPHARMA methods is that assignment and structure of the protein do not need to be known for lead optimization.
Screening of ligands by detecting ligand signals
Protein-detection-based methods suffer from the general drawback that NMR lines become broader with the increased size of the molecule under study. This makes it desirable to measure the ligand instead of the protein, because ligands are typically small molecules and will give rise to much sharper and more intense signals. Thus, NMR-based screening has often made use of detecting signals of the ligands that interact with target proteins. There are multiple ways by which ligand signals can carry information on protein binding, and these can be detected by classical NMR parameters such as chemical shift and relaxation. Excellent overviews are provided for example in (6,7).
A popular approach for ligand screening is based on the transferred NOE (trNOE) mechanism. Proton-proton cross-relaxation exhibits positive NOE peaks for small molecules alone (MW < 2000 D) that undergo fast molecular tumbling, whereas negative NOE peaks are observed when the molecular tumbling becomes slow by forming a complex with the target protein. Because ligands are at equilibrium between the free form and bound to the target protein, the NOE intensity that is encoded during the bound state is transferred by the exchange and observed at the free ligand signal position. Other methods that are based on the cross-relaxation mechanism include saturation transfer-difference (STD) experiments, Water-LOGSY, cross-saturation, transient trNOE and NOE pumping (7).
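The change in NOE sign with molecular tumbling can be illustrated with the standard two-spin expression for homonuclear dipolar cross-relaxation, in which the rate is proportional to 6J(2ω) − J(0) with the spectral density J(ω) = τc/(1 + ω²τc²). The sketch below is only an illustration under that simple isotropic-tumbling model; the function name and the example correlation times (0.1 ns for a free small molecule, 20 ns for a protein complex) are assumptions, not values from the text.

```python
import math


def cross_relaxation_sign(tau_c_s, proton_freq_mhz):
    """Return the sign of the 1H-1H NOE for isotropic tumbling.

    tau_c_s         : rotational correlation time in seconds (assumed input)
    proton_freq_mhz : 1H Larmor frequency in MHz
    """
    omega = 2.0 * math.pi * proton_freq_mhz * 1e6  # rad/s

    def spectral_density(w):
        return tau_c_s / (1.0 + (w * tau_c_s) ** 2)

    # Proportional to the cross-relaxation rate; constants are omitted
    # because only the sign matters here.
    sigma = 6.0 * spectral_density(2.0 * omega) - spectral_density(0.0)
    return "positive" if sigma > 0 else "negative"


# Illustrative correlation times at 600 MHz (assumed, not measured).
print(cross_relaxation_sign(1e-10, 600.0))  # fast-tumbling free ligand
print(cross_relaxation_sign(2e-8, 600.0))   # slowly tumbling complex
```

In this model the sign changes where ωτc = √5/2 (about 1.12), which is why the crossover depends on both molecular size and field strength.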
Saturation transfer-difference (STD) experiments detect inter-molecular magnetization transfer by taking the difference of two NMR spectra recorded with and without saturation of protein signals (17). The mechanism of this approach is based on rapid proton spin diffusion in proteins: in large proteins, once a part of the protein signal is irradiated, the saturation is transferred to the entire protein within 0.1 seconds (18). The application of STD to membrane proteins is discussed in 1H NMR-based approaches for membrane proteins.
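The difference step of the STD experiment can be written out as follows. This is a minimal sketch assuming that two spectra with identical frequency axes are available as lists of intensities; the function names and the intensity values are hypothetical and serve only to show the arithmetic.

```python
def std_difference(off_resonance, on_resonance):
    """Point-by-point difference: reference minus protein-saturated spectrum."""
    if len(off_resonance) != len(on_resonance):
        raise ValueError("spectra must have the same number of points")
    return [off - on for off, on in zip(off_resonance, on_resonance)]


def std_fraction(i_off, i_on):
    """Fractional signal loss of one ligand peak upon protein saturation."""
    return (i_off - i_on) / i_off


# Hypothetical peak intensities: the first peak belongs to a binder and
# loses intensity on saturation, the second belongs to a non-binder.
reference = [100.0, 50.0]
saturated = [90.0, 50.0]
print(std_difference(reference, saturated))
print(std_fraction(reference[0], saturated[0]))
```

Signals of non-binding compounds cancel in the difference, so only ligands that received saturation from the protein remain.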
Another mechanism for communication between ligand and protein is via water molecules (19,20). This approach is based on the observation that ligands are often hydrated when bound to protein, or specifically mediate the interactions between ligand and protein via hydrogen bonds. Thus, by excitation of water, ligand and protein sense their proximity. This mechanism is the basis for the Water-Ligand Observation with Gradient Spectroscopy (WaterLOGSY) technique that detects water-ligand NOE transfer. For water-ligand molecules that are located on the target-protein surface, the NOE is negative (19,20).
Other NMR-based screenings
To improve the HTS capabilities of NMR-based approaches, target-immobilized NMR screening (TINS) has been proposed (21). Here, the protein target is immobilized on a gel-based solid support. This is associated with several potential advantages: the target does not need to be soluble or even be a protein; the quantity of required target is reduced, as a single sample of the target is sufficient for a flow-through screen. With TINS, compound libraries can be screened much faster than using a traditional NMR sample in solution.
In addition to screening the binding of ligands to single proteins such as enzymes or receptors, it is becoming increasingly important to investigate ligands that interfere with protein–protein interactions, as these interactions gain importance as targets. A fast and information-rich NMR-based technique to screen antagonists of protein–protein interactions has recently been described by Holak et al. (22). This experiment has been termed the NMR-based Antagonist Induced Dissociation Assay (AIDA) and is used for the validation of inhibitors acting on protein–protein interactions (Figure 1). AIDA detects signals appearing upon the dissociation of the target-protein complexes. The approach requires a large protein fragment (larger than 30 kDa) to bind to a small reporter protein (less than 20 kDa). This methodology has been successfully used to discover novel p53/mdm2 antagonists (23). A more cost-effective 1D AIDA technique has also been described recently, in which tryptophan resonances are used as reporters for ligand-binding events because of their separation from most other signals in proton NMR spectra of proteins (24).
In contrast to the earlier mentioned in vitro assays, there are also efforts to conduct screening in vivo. The approach is called small molecule interactor library (SMILI)-NMR (25). This method records NMR signals of a protein that is over-expressed in Escherichia coli and elucidates changes in signal positions and broadening upon ligand interactions (26). The in-cell NMR approach has also been applied to observe and disrupt protein–protein interactions, coined Structural Interactions (STINT) NMR (27). The advantages of the in vivo studies are the detection of signals of unpurified proteins and information for more biologically relevant in vivo protein structures and interactions. Expansion of in vivo ligand-binding studies to mammalian cells has recently enhanced the relevance and information content of the technique (28).
Determination of ligand-binding constants by NMR
Typically, ligand-protein titrations are conducted by observing protein signals to determine ligand association/dissociation constants. First, based on the equation for the dissociation constant, when the dissociation constant of the ligand is around 1 μM (tentatively defined here as moderate binding), approximately 90% of the protein binds the ligand at 0.1 mM protein concentration with an almost equal amount of the ligand. Upon varying the ligand concentration, the population of the bound form changes accordingly. The titration curve is therefore generated by plotting changes in the peak positions or signal intensities against ligand concentration, and the dissociation constant is determined from this curve. Next, when the dissociation constant is well above 1 μM (tentatively defined as weak binding), larger amounts of the ligand are required to saturate the protein signals to the bound form. Because of limitations in ligand solubility or the appearance of non-specific interactions at high ligand concentrations, the dissociation constant may not be well determined by NMR for very weakly interacting systems. Finally, when the dissociation constant is significantly lower than micromolar, e.g., in the nanomolar range (strong binding), the titration curve becomes so sharp that an accurate dissociation constant cannot be obtained.
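The dependence of the bound population on the dissociation constant can be made concrete with the exact solution for a single-site (1:1) equilibrium, Kd = [P][L]/[PL]. The following is a minimal sketch assuming 1:1 binding with no competing equilibria; the function name and the example concentrations are chosen for illustration only.

```python
import math


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein in the complex for 1:1 binding.

    All three arguments must be in the same concentration unit.
    Solves [PL]^2 - (Pt + Lt + Kd)[PL] + Pt*Lt = 0 for the physical root.
    """
    s = p_total + l_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0
    return complex_conc / p_total


# Equimolar protein and ligand at 100 uM, with Kd values spanning the
# strong, moderate and weak binding cases (all values in uM).
for kd_um in (0.001, 1.0, 1000.0):
    print(kd_um, round(fraction_bound(100.0, 100.0, kd_um), 3))
```

Evaluating the function over a series of ligand concentrations gives the titration curve; for very small Kd the curve approaches a sharp break at the equivalence point, which is why the dissociation constant is then poorly defined by the fit.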
Determining ligand affinity using ligand signals is not straightforward. When the binding is strong, the ligand-saturated point is difficult to detect because ligand signals become broadened upon binding to the protein. When the binding is weak, interaction is better detected using the STD technique and other experiments described earlier. However, it is difficult to determine the dissociation constant accurately because other rate constants, such as cross-relaxation rates, are involved in such experiments.
These issues are illustrated by the case studies of different ligands binding to the model protein bovine serum albumin (BSA). Bovine serum albumin binds a variety of different ligands including moderate-affinity (μM), high-affinity (nM) and low and/or varying affinity multisite binding ligands. For example, L-tryptophan is a moderate-affinity ligand, while naproxen is a high-affinity ligand, and salicylate has been proposed to bind to 76 binding sites in total (29). A systematic review of 1H NMR spectroscopy of these different types of ligands and combinations thereof (30) has yielded the following conclusions: when measuring 1H NMR chemical shifts and line widths, titrations of different ligand/protein ratios are needed to obtain an accurate binding constant. In particular, careful measurements and analyses have to be carried out for multisite ligands: an incorrect 1:1 binding model can provide a visually acceptable fit to the experimental salicylate binding data, while in-depth studies reveal the multiple site binding modes of this ligand (30).
For sub-micromolar affinity ligands where the free ligand peak is unaffected by the bound state, reporter ligands can be used for screening (31). In this approach, the known ligand is prebound and the new ligands in the screen are tested for their ability to displace the bound ligand. For example, in the case of BSA, this approach has been taken to study tryptophan binding: complementary to the 1H NMR studies of BSA described earlier, 19F NMR-based studies of l-5-tryptophan (32) and l-6-tryptophan (33) binding to BSA have been carried out. The extreme sensitivity of the 19F chemical shift resulted in the observation of two distinct peaks, indicating the presence of multiple tryptophan binding sites, a low-affinity and a high-affinity binding site. Competition with non-fluorinated tryptophan can be used to establish relative affinities of these ligands with respect to tryptophan at both sites. Thus, while the 19F approach – unlike the 1H approach – is restricted to ligands that bind at the same sites as 19F-containing ligands do, the 19F NMR studies proved useful in revealing an additional tryptophan-binding site that went undetected with 1H NMR, showing the complementary nature of the approaches.
In summary, NMR techniques for drug discovery are high-content methods: they potentially provide binding information, the location of the binding site and the conformation of the bound ligand. Nuclear magnetic resonance can also supply structural information that enables the docking of the ligand to the protein’s binding pocket. In addition, NMR provides very valuable information about the general behavior of the ligands that other HTS methods do not reveal, including solubility, binding behavior (promiscuous ligands), precipitation potential and aggregation. Because NMR-based screening is sensitive toward finding medium-affinity to low-affinity ligands, the approach can also serve as an effective prescreening tool for subsequent assay-based HTS. Thus, NMR-based screening for small molecular weight drugs is now well established in industry and can be used complementary to HTS methods and computational screening methods.
Challenges in membrane protein NMR spectroscopy
While 1H NMR-based methods to study ligand binding can be carried out with unlabeled protein, more sophisticated applications of NMR-spectroscopic techniques such as SAR by NMR require labeling, typically the biosynthetic introduction of 13C and 15N nuclei. However, many proteins cannot be successfully expressed in E. coli or Pichia pastoris that make uniform 13C, 15N labeling affordable. When proteins need to be expressed in mammalian or insect cell lines to obtain them in functional form, uniform labeling becomes prohibitively expensive when the protein expression levels are not unusually high. In such cases, specific 15N-labeled and/or 13C-labeled amino acids are introduced (34–36). Such proteins are not amenable to structure determination by NMR spectroscopy. Mammalian membrane proteins often belong to this group, e.g., when they are glycosylated or otherwise post-translationally modified in their native form and require the mammalian or insect cell machinery for proper folding.
NMR signal assignment requires well-resolved mono-disperse spectra as a prerequisite, in which a large number of the NMR-active nuclei in the sample are visible and resolved from each other, and the signal intensity for different peaks is as uniform as possible. This in part is the reason for the limit in size of biomolecules that can be studied, but poor quality spectra can also arise from systems that are dynamic and/or prone to aggregation even when the size of the monomeric unit is relatively small, depending on the propensity of the proteins and choice of detergents. Thus, it is critical to choose suitable detergents for each membrane protein. After or complementary to light-scattering experiments, 1H,15N-HSQC spectra are typically recorded to screen for detergents and other conditions, such as salt concentration and pH, under which reasonable NMR spectra can be obtained. Recent developments in microcoil NMR technology have the potential to make the screening of a large number of different detergents for their suitability to support NMR studies more feasible (37).
We will demonstrate these issues using the GPCR rhodopsin as an example. Rhodopsin is a glycosylated and palmitoylated 43 kDa protein containing 348 amino acids. 1H,15N-HSQC spectra of either 15N-lysine-labeled or 15N-tryptophan-labeled rhodopsin are shown in Figure 2A and B, respectively. The protein was dissolved in 20 mM sodium phosphate (pH 6.0) and 10% D2O containing octyl glucoside or dodecyl maltoside detergent micelles. The quality of both NMR spectra is quite poor as evidenced by the heterogeneity in number and intensity of signals (Figure 2). Site-directed mutagenesis and screening of solvent conditions have led to improved spectral quality for some membrane proteins, e.g., diacylglycerol kinase, where the E. coli origin and expression system made such studies possible (38). When an optimal condition for NMR study is not found for the membrane protein of interest, fragments of the proteins may be studied instead (39). Although such fragment studies can provide some limited insight into the structure of the membrane proteins, the fragments typically do not bind ligands in functional form, which excludes such systems from NMR-based ligand screening approaches.
The difficulties in obtaining membrane protein structures by NMR arise largely because NMR signals become broader as the molecular mass increases, which reduces the sensitivity of NMR experiments. Because membrane proteins are studied while surrounded by micelles formed by the detergents, the apparent molecular mass becomes larger than the protein molecular weight. Also, when membrane proteins form biologically functional or non-functional oligomers, the apparent molecular mass, including the surrounding detergent/micelles, increases further and results in additional broadening of NMR signals. Thus, several efforts are underway to detect protein NMR signals of large proteins, which are useful for drug screening and/or signal assignment purposes: fast experiments, TROSY methods and various isotope labeling techniques. TROSY in particular has been crucial in all of the recent determinations of membrane protein structures but requires deuteration. Efforts to detect NMR signals in shorter time, such as ‘SOFAST-HMQC’ or ‘Ultra-fast experiments’, may prove useful for drug-screening or drug validation purposes (8,9). Because the line widths of methyl signals in these experiments are relatively narrow as a result of the methyl three-site jump, and the TROSY selection can further increase sensitivity (40–42), observing the methyl signals becomes advantageous for large macromolecular systems, including membrane protein systems. Several excellent review articles describe these techniques (43–46).
Despite such difficulties in protein expression and sample preparation, there is increasing success in the determination of membrane protein structures by NMR spectroscopy. To illustrate this progress, we downloaded a list of membrane protein structures determined with the help of NMR spectroscopy (footnote a) and analyzed the structures with respect to their transmembrane organization (Figure 3). Of 44 structures, 28 structures were determined using solution NMR (the others utilized solid-state NMR). While these numbers are encouraging, it is important to realize that the majority of these structures still represent either β-barrel proteins (Figure 3, ‘0’ bin) or single transmembrane helices (Figure 3, ‘1’ bin). A recent success was the structure determination of diacylglycerol kinase (Figure 3, ‘3’ bin), which, although consisting of only three transmembrane helices, forms a trimer. The trimeric organization is significant because it is formed via domain-swapping of helices. Thus, with 9 (!) transmembrane helices, the structure actually represents the largest membrane protein whose structure has been determined by NMR spectroscopy to date (3). These results are highly encouraging: a decade ago, only the structures of small membrane proteins with molecular weights less than 10 kDa could be determined by NMR because of the decrease in the molecular tumbling caused by the addition of detergents (47). However, recent developments in NMR methodology and efforts in protein expression and sample preparation enabled the earlier mentioned structure determinations for membrane proteins with molecular weight >20 kDa.
Finally, it should be noted that in the application of NMR-screening methods to membrane proteins by looking at ligand signals, it is important to distinguish whether signal changes are because of ligand–detergent interaction or ligand–protein interaction. It is thus critical to record a suitable reference spectrum in each case.
1H NMR-based approaches for membrane proteins
Solution NMR spectroscopy has dramatically advanced in the scope of its applicability to proteins, especially when studying proteins of increasingly larger size or membrane proteins, by way of using NMR-active isotopes of hydrogen, carbon and nitrogen. While 1H is 100% abundant, 15N and 13C isotopes are used to replace the more abundant 14N and 12C isotopes in proteins, respectively. The ability to introduce these isotopes is therefore one constraint on the applicability of NMR spectroscopy to the study of proteins in general, including protein-ligand interactions. The natural abundance of these isotopes in the detergents and solvents used can significantly add to the background, in particular for 1H NMR spectroscopy, where the 1H isotope is 100% abundant. Additional problems are the low signal-to-noise ratio because of slow molecular tumbling of the protein–detergent complex discussed earlier. In the following paragraphs, we summarize current efforts in overcoming these constraints, with major emphasis on recording 1H NMR spectra. Similar considerations however would also apply to the direct detection of other isotopes such as 13C.
Suppression of background signals in NMR experiments for membrane proteins
As described in ‘Challenges in membrane protein NMR spectroscopy’, membrane proteins require a membrane mimetic, which is provided by detergent micelles when the proteins are studied with solution NMR methods. The detergent concentrations are typically 100 times higher than the protein concentrations to ensure that only one functional protein or protein complex is present per micelle for uniformity purposes. The high signal intensity originating from the detergent leads to the suppression of signal intensities from the protein (dynamic range problem) and also results in overlap with protein peaks. Over-sampling is a feature available in most recent commercial NMR instruments, but if it is not available, large detergent signals also cause other artifacts such as baseline rolling and insufficient digitization of the signal (48). An example is shown for a 0.7 mM solution of rhodopsin in 1% octyl glucoside (Figure 4). At the scale used, the protein signals are not even visible in this Figure, and the spectrum is dominated by the detergent signals. A value of 1% for the detergent concentration is in fact relatively low; in many cases, much higher detergent concentrations are used, making the dynamic range problem even more severe.
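The size of the detergent excess in this example can be estimated with a short calculation. The sketch below assumes that the 1% refers to weight per volume and uses the molar mass of n-octyl-β-D-glucopyranoside (C14H28O6, 292.37 g/mol); both are assumptions for the purpose of the estimate, since the text does not specify them, and the function name is illustrative.

```python
OCTYL_GLUCOSIDE_MW = 292.37  # g/mol for C14H28O6 (assumed detergent species)


def percent_wv_to_millimolar(percent_wv, molar_mass):
    """Convert a % (w/v) concentration to mM; 1% (w/v) is 10 g per liter."""
    grams_per_liter = percent_wv * 10.0
    return grams_per_liter / molar_mass * 1000.0


detergent_mm = percent_wv_to_millimolar(1.0, OCTYL_GLUCOSIDE_MW)
protein_mm = 0.7  # rhodopsin concentration quoted for Figure 4

print(f"detergent: {detergent_mm:.1f} mM")
print(f"detergent molecules per protein: {detergent_mm / protein_mm:.0f}")
```

Each detergent molecule also carries many protons in a few narrow, intense resonances, so the intensity ratio between individual detergent and protein peaks is considerably larger than the molar ratio alone suggests.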
A biochemical solution to the detergent background problem is the use of deuterated detergents. However, their synthesis is typically very expensive. Unless the protein can be studied in commonly used detergents for which deuterated forms can be purchased off the shelf, custom-synthesis is often required. In addition, use of deuterated detergent for screening large numbers of samples may increase the screening cost significantly. The type of detergent that will give rise to optimal NMR spectra while maintaining the function of the protein is largely empirical, requiring extensive screening of different detergents and detergent/lipid mixtures and may settle on non-standard detergents (37,49,50). Membrane proteins have to be continuously maintained in the presence of membrane mimetics during cell extraction (or after refolding from inclusion bodies). Further, all purification and concentration steps require large volumes of buffers. Because of these reasons, typically the protein will be purified in a non-deuterated detergent, followed by exchange with the deuterated detergent. This adds an additional step of complexity to the NMR sample preparation to ensure efficient, homogenous and complete replacement of detergent with minimal protein loss. Thus, use of deuterated detergent may not always be practical based on cost and preparative effort, especially at the relatively large quantities needed for NMR-based screening.
When deuterated detergent is not available, too expensive or not practical, application of multiple solvent suppression experiments, such as WET (51), selective pulse experiments including sculpting (52,53) or coherence selection (54–56), is required. If possible, saturation by radio-frequency is not applied to suppress the water, solvent or detergent signals in protein samples because protein signals underneath the solvents are also saturated and the signal reduction is propagated to the entire protein by the spin-diffusion mechanism (57). Among the water suppression techniques, pulse techniques that use relatively long durations are not efficient to be incorporated into various 3D NMR experiments and coherence selection in combination with pulsed-field gradient is commonly applied.
Because one-dimensional NMR-spectroscopic approaches currently have (and in the foreseeable future will continue to have) broader applicability to membrane proteins, solvent suppression schemes are particularly important, even when they entail a loss of information in some regions of the spectrum. The earlier described AIDA method (53) also relies on focusing on a particular spectral region (see Figure 1). Here, we demonstrate the utility of such an approach using selective excitation sculpting studies of full-length rhodopsin in octyl glucoside micelles as a model system. Rhodopsin is the most extensively studied G-protein-coupled receptor, and knowledge about its structure serves as a template for other related receptors. Because of the large number of members of the GPCR family and their importance as drug targets (see the Introduction), these studies are highly relevant for drug discovery efforts involving these receptors.
One-dimensional 1H NMR spectra recorded by selectively exciting the protein NH region by applying a selective excitation pulse centered around 10–12 ppm show 1H chemical shifts from both backbone and side-chain regions of rhodopsin in octyl glucoside micelles (Figure 5A). Further, excitation of the same region using the hyperbolic secant shaped pulse to remove detergent and water signals significantly increased the intensities of the NH peaks in the range from 6.0–8.5 ppm (Figure 5B) (58,59). Note, however, that the number of peaks observed in the 1D 1H NMR spectrum is significantly reduced. We tentatively propose that the observed signals arise mostly from the backbone C-terminus residues and flexible loop regions. This hypothesis is based on the previous observation (35) that sharp, highly intense and thus slowly relaxing signals are found only for Lys339 in a uniformly 15N-lysine labeled rhodopsin sample (Figure 2). Furthermore, comparison between the observed signals and those obtained with a peptide corresponding to the sequence of the C-terminal residues reveals extensive similarities between the rhodopsin C-terminus and the free peptide in solution (60).
One-dimensional 1H NMR spectra of bovine rhodopsin recorded at different concentrations of octyl glucoside indicated chemical shift dependence of the C-terminus backbone peaks (data not shown), highlighting the need to control the detergent environment quantitatively to obtain reproducible NMR results. To investigate possible detergent–protein interactions, we recorded one-dimensional and two-dimensional 1H-1H selective excitation NOE spectra. We observed differential interactions of the rhodopsin backbone signals with those of the detergent micelles (Figure 6). In particular, a set of strong NOE peaks was observed from rhodopsin protons (Figure 6B, represented by arrows) to a detergent peak at ∼1.85 ppm (Figure 6A, indicated by arrow). The identity of this detergent signal is shown as an inset in Figure 6A, a -CH2- group near the sugar head group. We did not observe intramolecular rhodopsin protein NOE peaks. A potential solution to detect such NOEs could be provided by detergent deuteration.
Using the sculpting experiments, we have successfully identified novel ligands binding to rhodopsin and interacting with cytoplasmic loop and C-terminal residues by measuring chemical shift and line-broadening effects in selectively excited 1H spectra as a function of added ligand, the anthocyanin cyanidin-3-glucoside (61). In this study, we were able to identify chemical shift and intensity changes in receptor and ligand. In dark-adapted rhodopsin an upfield shift of the chemical signals (Figure 7, peaks at position 3, 4, 7, 8, 9 and 10) of the protein was observed. In the case of ligand, some of the peaks corresponding to ligand (compare signals at position 2, 11, 14, 18 and 19 in Figure 7A with 7D) experienced decrease in intensity and some of them disappeared (peaks marked as ‘x’ and at positions 22, and 24 in Figure 7) in the presence of rhodopsin, indicating restriction in mobility upon binding. Further, the comparison of the 1H NMR spectra of rhodopsin upon light activation both in the absence and presence of ligand indicated decrease in peak intensities at peak positions represented as ‘+’ in Figure 7C. Using the selective excitation sculpting method, this study suggested that the binding of anthocyanin ligand, cyanidin-3-glucoside, modulates both the structure and the dynamics of rhodopsin in two different states, the inactive dark state and the light-activated Metarhodopsin II state. The approach is extendable to other conformations, such as G-protein-bound or opsin structures.
The results obtained with rhodopsin show high promise for the extension of the approach to other GPCRs. We have already demonstrated with rhodopsin that multiple conformations can be studied, because the lifetimes of these conformations under the NMR conditions studied are known. For other GPCRs, the stability of the resting, activated or G-protein-bound states also needs to be established, to ensure that the time it takes to acquire an NMR spectrum is meaningful for the particular conformation of interest. Furthermore, while the cytoplasmic loops and the C-terminus of rhodopsin are functionally important regions in the protein (critical for receptor activation and G-protein binding), it remains to be shown whether the same approach is also suitable to study ligands such as retinal that bind in the transmembrane domain of rhodopsin.
Saturation transfer-difference (STD) NMR application to membrane proteins
Of the many techniques developed for screening by NMR, summarized in ‘NMR-based approaches to drug screening’, a particularly promising technique for application to membrane proteins is STD. The technique requires very small amounts of protein (in the nM–μM range) because the ligand is present in 100-fold excess over the protein (7). Protein signals are saturated by irradiation around −1 ppm, and the saturation is transferred within ∼0.1 seconds to the rest of the protein and the ligand. When the ligand off-rate is fast, the information is quickly transferred to the ligand in solution where it decays slowly (within ∼1 second), so that during saturation, the proportion of saturated ligands in solution increases, amplifying the difference signal, up until the ligand excess concentration is reached. Thus, the intensity of the STD spectrum will be higher for ligands with fast off-rates, but even tight binding can still be measured, giving the technique a wide dynamic range. This approach has already been used for the study of ligands targeting membrane proteins by NMR (18,62). In one study, integrins were embedded in DMPC/DMPG liposomes and binding of cyclic peptides was tested (18). An affinity of 30–60 μM was obtained, typical for this class of membrane receptors and demonstrating the particular utility of NMR-based approaches to reliably detect relatively low affinities. From differences in STD responses of individual protons in the cyclic peptide, it was even possible to map the epitope that is in direct contact with the receptor to a phenyl ring in the peptide. Only 0.25 nmol of the integrin was sufficient per assay. Another spectacular application of STD to membrane proteins is the recent study of the interaction of the sweet brazzein protein with the human sweet receptor (62). This receptor is a Class C GPCR, containing a large extracellular ligand-binding domain, coupled to the seven-transmembrane helical bundle typical for GPCRs.
These receptors are challenging and interesting because they contain multiple binding sites in both transmembrane and extracellular domains and have very low affinity for their ligands, ranging from μM to mM. The ligands can bind simultaneously and affect each other’s affinity; thus, it is imperative that the full-length native receptor is studied. One-dimensional 1H,15N HSQC STD experiments demonstrated the binding of brazzein to the sweet receptor (∼100 μg) in membrane suspensions with high intensity, while a non-sweet mutant brazzein protein did not give rise to strong STD signals. This amount of protein, with no purification requirement (because membrane preparations were used), is in our experience relatively straightforward to obtain for many GPCRs. Thus, the approach is likely to have broad applicability to other membrane receptors. Given that the STD technique is highly sensitive and neither limited by protein size nor requires the assignment of the protein, this technique should find wide applicability to screening of ligands for membrane proteins that have lipid or detergent environments surrounding them.
19F NMR-based approaches
19F NMR spectroscopy can be a viable alternative for one-dimensional NMR-spectroscopic measurements, providing complementary results. Because there is no background from 19F nuclei in either biomolecules such as proteins or the detergents used to dissolve membrane proteins, the applicability range of 19F NMR to study ligand binding in soluble and in membrane proteins is identical. In the following paragraphs, we therefore review the extensive literature on 19F NMR-based approaches to study ligand binding to proteins, regardless of whether the proteins under investigation are soluble or membrane proteins. First, we will review 19F ligand-observe studies using fluorinated ligands, including fluorinated phospholipids. We will then cover studies of structure and dynamics of proteins by 19F NMR. These studies will involve not only ligand-induced changes in structure and/or dynamics but also those involving other conformational changes, such as during protein function or protein folding, because the principles are the same.
19F NMR studies of protein structure, dynamics and ligand binding offer several advantages over other NMR-spectroscopic approaches as a result of the unique chemistry of the 19F atom. 19F has 100% natural abundance, and its sensitivity to NMR detection is 83% that of 1H. The presence of nine electrons surrounding the 19F nucleus makes it very sensitive to minor changes in its environment, including both Van-der-Waals and electrostatic interactions, which is reflected in its wide range of chemical shifts. This characteristic increases the probability of obtaining well-resolved peaks of fluorine atoms in different environments. Another major advantage of 19F NMR over other conventional NMR techniques is the appearance of its NMR signals in the absence of any background signals, including membrane mimetic environments and even entire cells. The information content of 19F NMR ligand-based screening, while not as high as SAR by NMR, is higher than that of HTS methods, in particular those employing cell-based approaches. These unique properties of the 19F nucleus suggest that 19F NMR spectroscopy could provide a highly desirable alternative to HTS by conventional NMR-spectroscopic techniques, in cases where the latter methods may not be applicable, such as for many membrane proteins or for in-cell studies. From a practical perspective, 19F labeled compounds are easily accessible by different chemical methods (see ‘Synthesis of 19F containing small molecule compounds’).
Ligand–protein interaction studies include (i) evaluating binding of ligands, (ii) characterizing binding kinetics of the ligands and (iii) determining the structural changes of a protein on ligand binding. These are probed by changes in line shape and/or chemical shift of a free fluorinated ligand on binding to a protein (19F ligand-observe studies) or that of a fluorinated residue in a protein on ligand binding (19F protein observe studies). Both approaches can be employed in the context of drug screening (19F NMR-based ligand screening).
19F ligand-observe studies
Spectral changes of a free fluorinated ligand on binding to a protein – like in the case of 1H NMR – can be either broadening of its line width or changes in its chemical shift depending on the binding affinity of the ligand. Fluorine signals of the ligand bound to the protein are expected to show restricted motion compared to its free state and hence give a broader line shape. It may also undergo chemical shift changes upon binding that may be either upfield or downfield depending on the nature of the change of interactions of the fluorine atom with its environment. A downfield shift indicates a more hydrophobic environment or a greater extent of Van-der-Waals interaction of the fluorine atom. Changes in electrostatic interactions of the fluorine atom with its environment can influence either a downfield or an upfield shift (63). Note however, structural information of the binding site can only be procured by observing changes in fluorinated protein on ligand addition.
Ligands with a low binding affinity rapidly exchange between bound and free forms, which may lead to broadening of their resonances. The advantage of characterizing ligand–protein interactions of such weak binding ligands by studying changes in fluorinated ligands rather than protein observed changes is that less protein is required. Binding constants can be determined by T2 measurements that contain a weighted average of relaxation rates of the free and bound forms of a ligand at different concentrations (64). The utility of T2 measurements has for example been demonstrated for BSA in binding studies of isoflurane, a volatile anesthetic (64). A Kd of 1.4 mM was obtained from T2 measurements of the free ligand and that bound to the protein (64). Another interesting case is the influenza virus M2 membrane protein, which forms proton channels that lead to the disruption of the matrix protein and the release of the viral genome (65). Amantadine is an inhibitor of this process. 1H NMR of amantadine or the protein could not provide information on ligand binding because very broad signals were obtained (66,67). 19F T2 relaxation measurements were used in this case to reveal interactions between the fluorinated amantadine ligand and the M2 protein, as well as interactions between the ligand and the dodecylphosphocholine micelles the protein was dissolved in (67).
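The weighted-average relationship behind such T2-based binding measurements can be sketched as follows. This is a minimal illustration, assuming 1:1 binding and fast exchange, so that the observed transverse relaxation rate R2 = 1/T2 is the population-weighted mean of the free and bound rates; the function names and the numerical values in the usage lines are hypothetical and are not taken from the cited studies.

```python
def bound_fraction_from_r2(r2_observed, r2_free, r2_bound):
    """Bound ligand fraction from R2 = 1/T2 values, assuming fast exchange.

    Assumption: r2_observed = p_bound * r2_bound + (1 - p_bound) * r2_free.
    """
    if r2_bound == r2_free:
        raise ValueError("free and bound relaxation rates must differ")
    return (r2_observed - r2_free) / (r2_bound - r2_free)


def kd_from_bound_fraction(p_bound, ligand_total, protein_total):
    """Dissociation constant for 1:1 binding (same unit as the concentrations)."""
    complex_conc = p_bound * ligand_total
    free_ligand = ligand_total - complex_conc
    free_protein = protein_total - complex_conc
    if complex_conc <= 0 or free_protein <= 0:
        raise ValueError("bound fraction is inconsistent with the concentrations")
    return free_protein * free_ligand / complex_conc


# Hypothetical values: rates in 1/s, concentrations in mM.
p_bound = bound_fraction_from_r2(r2_observed=51.0, r2_free=2.0, r2_bound=100.0)
kd = kd_from_bound_fraction(p_bound, ligand_total=1.0, protein_total=1.0)
```

In practice the bound-state rate is rarely known directly, so measurements at several ligand concentrations are usually fitted together; the sketch only shows how a single weighted average maps onto a bound fraction and a Kd.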
Inhibitors of enzymatic reactions may be detected by a method called fluorine-based biochemical screening (FABS) (68,69). In this method, a substrate is tagged with a fluorinated moiety, and changes in distinct 19F signals for the substrate and product are followed with the progress of an enzymatic reaction in presence of test inhibitors. This method is particularly suited for screening inhibitors with low-binding affinity that remain undetected by regular NMR ligand screening methods. The sensitivity of the method is enhanced in the case of weak affinity ligands by having moieties with three fluorine atoms attached to the ligand and the method is named 3-FABS (69). IC50 value of the inhibitors is obtained by taking the ratio of the integrals of the 19F peaks of the substrate and the product as a function of inhibitor concentration. In addition to screening mixtures of inhibitors, it is also possible to screen mixtures of closely related enzymes to determine selectivity of an inhibitor provided the substrate is specific for the different enzymes. This method has been applied in several cases such as screening inhibitors for kinase AKT1 and protease trypsin (69), caspases (70) and thymidine phosphorylase (71).
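The way an IC50 value can be read from substrate and product integrals can be sketched as follows. This is an illustrative simplification, not the published FABS analysis: it assumes that the fraction of substrate converted is proportional to enzyme activity, normalizes to an uninhibited control, and locates the half-maximal point by log-linear interpolation rather than by fitting a dose-response curve. All names and values are hypothetical.

```python
import math


def fraction_converted(substrate_integral, product_integral):
    """Fraction of substrate turned over, from the two 19F peak integrals."""
    return product_integral / (substrate_integral + product_integral)


def ic50_by_interpolation(inhibitor_concs, substrate_integrals,
                          product_integrals, uninhibited_conversion):
    """Estimate IC50 by log-linear interpolation between bracketing points.

    inhibitor_concs must be positive and in ascending order. Returns None
    when the data never cross 50% of the uninhibited activity.
    """
    activities = [
        fraction_converted(s, p) / uninhibited_conversion
        for s, p in zip(substrate_integrals, product_integrals)
    ]
    for i in range(len(inhibitor_concs) - 1):
        a_lo, a_hi = activities[i], activities[i + 1]
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            t = (a_lo - 0.5) / (a_lo - a_hi)
            log_lo = math.log10(inhibitor_concs[i])
            log_hi = math.log10(inhibitor_concs[i + 1])
            return 10 ** (log_lo + t * (log_hi - log_lo))
    return None


# Hypothetical integrals at two inhibitor concentrations (arbitrary units).
ic50 = ic50_by_interpolation([1.0, 100.0], [40.0, 80.0], [60.0, 20.0], 0.8)
```

Because both integrals come from the same spectrum, their ratio is insensitive to overall signal intensity, which is part of what makes the readout robust in mixtures.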
Information on binding constants and stoichiometry of binding can be obtained by titrating fluorinated ligand and monitoring the changes in the protein-bound peaks and free peaks of the ligand by 19F NMR. In the slow exchange regime, we will observe two peaks, which may be sufficiently resolved in their chemical shift values to be useful for quantitation. Binding constants are determined from the ratios of bound and free ligand concentrations quantified by integrating 19F NMR signals (72).
19F protein observe studies
Studying ligand binding by monitoring changes in 19F signals reporting on protein conformation can be useful under conditions where accurate affinities and binding modes cannot be unambiguously determined from ligand-observe methods, or where it is desirable to increase the information content that can be obtained from 19F NMR studies. If 19F labels are placed on the protein, one can study where the ligand binds, and whether the ligand induces conformational changes, oligomerization or folding transitions. There are two approaches to introduce 19F labels into proteins. In the first approach, a 19F label is introduced biosynthetically as a fluorinated amino acid. As for incorporation of other isotope-labeled amino acids (see above), this method may not be very cost effective for mammalian membrane proteins (including GPCRs) because insect or mammalian cell expression required for such systems in fluorinated amino acid-rich medium can be very expensive. In the second approach, a 19F label is introduced through chemical reaction with activated cysteines. This approach has been shown to work well with GPCRs (73). However, this method is limited to labeling only surface exposed amino acids or those amino acids in membrane proteins for which side chains are exposed to the membrane for ease of entry of labeling reagents. The principle is shown in Figure 8. A receptor will have endogenous cysteines, shown in a homology model of the corticotropin-releasing factor receptor in Figure 8A. The cysteines can be derivatized with a 19F containing ligand directly, but a less invasive approach is to first activate the accessible cysteines and then introduce a trifluoroethylthiol group through disulfide exchange (Figure 8B). This procedure contains minimal perturbation from added chemical groups and retains maximal flexibility from the ethyl side chain.
Using 19F NMR to observe the protein can be useful, for example, if it is of interest to determine whether a receptor is in an active or inactive conformation upon ligand binding. If the specific chemical shifts associated with each state are known, then the appearance of the respective peaks can be used as an indicator of whether a ligand is, for example, an agonist or antagonist or an inhibitor or inducer of oligomerization. This idea is illustrated with bovine rhodopsin: 19F NMR spectroscopy was used to study the conformational changes in rhodopsin upon light activation, to which the 19F chemical shifts were very sensitive (73). In this case, the 19F label was introduced through chemical reaction of trifluoroethylthiol with activated cysteines (Figure 8B), here on rhodopsin. Distinct chemical shifts are found for the dark, inactive and the light-active states at numerous sites on the rhodopsin surface (Figure 9).
Determining structural changes in specific regions of a protein on ligand binding requires the introduction of a 19F label into the protein. More common than the chemical cysteine-labeling approach, is to substitute amino acids in the protein with fluorinated analogs and track the chemical shifts and line widths in 19F NMR spectra. The small size of the fluorine atom has enabled the substitution of residues such as Trp, Tyr, Phe with their fluorinated analogs without perturbations of the native structures of proteins. The observed chemical shift range, expression of the labeled protein in sufficient amount and integrity of the fluorine labeled protein are some of the factors that should be considered when choosing an isomer of a fluorinated amino acid. For example, of the 4-fluoro, 5-fluoro and 6-fluoro isomers available for fluoro-tryptophan, the 6-fluoro isomer has a very narrow chemical shift range and also shows broad unresolved spectra compared to the other two fluoro-tryptophans in lactate dehydrogenase enzyme (75). Moreover, the 6-fluoro isomer-labeled protein shows perturbations in its secondary structure as detected by circular dichroism spectroscopy, and a broad peak is obtained in the 19F NMR spectrum (75). On the other hand, the 4-fluoro isomer labeled protein can be produced in much larger quantity and shows no perturbations of the native structure (75). Assignments of the 19F peaks can be performed by either mutating the fluorinated residue or by nudge mutations, whereby a mutation in an adjacent position changes the chemical shift of the fluorinated residue as a result of change in its environment, or by complexation of a solvent accessible fluorinated residue with paramagnetic ions such as Gd3+ leading to line broadening of that residue (76).
19F NMR has been used to track both allosteric and non-allosteric changes on ligand binding to a protein. For example, in studies of the binding of d-glucose and d-galactose to the fluoro-tryptophan-labeled aqueous chemosensory receptor of E. coli (77), it was seen that sugar binding resulted in changes in chemical shifts of not only those fluoro-tryptophan residues that are adjacent to the binding site but also those tryptophan residues that are distant from the bound sugar by as much as 15 Å (77). These results indicate that sugar binding leads to a global change in the structure of the protein that is translated from the binding site to distant regions on the surface, and this global change can be tracked by 19F NMR (77). A different way of probing conformational change is to observe line broadening by the addition of Gd(III)-EDTA that indicates solvent accessibility of the fluorine-labeled residue (78). Information on binding constants and stoichiometry can be obtained by titrating the ligand and monitoring the shifts in the peaks of fluorinated amino acids (78). 19F NMR has also proved to be suitable for studying protein dynamics by monitoring relaxation rates of fluorinated residues, as illustrated by the study of ligand binding in ionotropic glutamate receptor (GluR2) (76).
Structure and function of membrane proteins in particular are largely influenced by their interactions with lipid bilayers, and 19F NMR can be used to study the detailed mechanisms of these effects. For example, line widths of lactate dehydrogenase become sharper on adding increasing concentrations of lysolecithin in a non-linear fashion (75). Because there was no change in the chemical shifts of the tryptophan residues, it was concluded that lysolecithin is only solvating the protein and not causing a conformational change (75). The number of lipid molecules bound to a protein can be calculated from the variation in line width with lipid concentration. In the case of lysolecithin binding to lactate dehydrogenase, this number was found to be lower than the aggregation number of lysolecithin, suggesting that lactate dehydrogenase is not inserted in the micelles but binds individual lipid molecules that shield exposed hydrophobic surface patches from initiating aggregation and inactivation of the enzyme (75).
19F NMR is a suitable technique in mapping the sites of the interaction of proteins with membranes. The use of solvent induced isotope shifts can provide information on solvent exposure of a residue. However, residues that are not solvent exposed could be either buried in a protein core or face the membrane or be membrane bound. This ambiguity can be overcome by the use of fatty acids in which a paramagnetic spin label is incorporated into the membrane under study, and its interaction with a fluorine probe in the protein is detected by the broadening of the corresponding fluorine peaks in a 19F NMR spectrum (79). The paramagnetic electron of the labeled fatty acid 7 Å from either end of the lipid phase will cause broadening of a fluorine nucleus that is within 15 Å from the label i.e., either in or near the lipid phase (79,80). By labeling specific amino acids with 19F and by their mutagenesis analysis, interactions with lipids can be followed, thus helping in mapping sites of protein–lipid interaction. The amount of broadening observed is inversely proportional to the distance between the label and fluorinated residue raised to the power of six (78). 8-doxylpalmitic acid incorporated in lysophosphatidylcholine is used as the nitroxide spin-labeled fatty acid to map the site of interaction of lactate dehydrogenase with lysophosphatidylcholine (80). Another use of such spin-labeled fatty acids, in the case of lactate dehydrogenase, is to determine whether substrate binding has any effect on the residues in the lipid binding region. Lactate dehydrogenase oxidizes d-lactate, and the electrons produced reduce the nitroxide labeled fatty acid, disrupting its interactions with the fluorine nucleus and recovery of the peak that was lost/broadened because of its interaction with the label (80,81).
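The steep distance dependence of the paramagnetic broadening can be made concrete with a short sketch. It assumes only the inverse sixth-power relationship stated above; the function name and the distances used are hypothetical.

```python
def relative_broadening(distance, reference_distance):
    """Broadening at `distance` relative to that at `reference_distance`.

    Assumes broadening is proportional to 1 / distance**6, so the ratio is
    (reference_distance / distance) ** 6. Both distances use the same unit.
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return (reference_distance / distance) ** 6


# Doubling the distance (e.g. from 5 to 10 angstroms) reduces the
# broadening to 1/64 of its value at the shorter distance.
ratio = relative_broadening(distance=10.0, reference_distance=5.0)
```

This sharp falloff is why broadening of a fluorine peak by the spin label is a useful indicator that the labeled residue lies in or near the lipid phase.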
19F NMR-based ligand screening
The ease of obtaining information from ligand-binding studies by 19F NMR, as mentioned earlier, has extended its applicability to HTS of chemical libraries, which is a routine procedure in the field of drug discovery. The broad chemical shift dispersion of the fluorine nucleus allows for identifying ‘hits’ in a screen with less chance of encountering the problem of spectral overlap from different chemical compounds. The simplicity of the 19F spectra, unlike 1H spectra, decreases the time for deconvoluting the spectra when a large mixture of chemicals is being screened. Changes in chemical shift values and/or line widths of the free fluorinated ligand upon the addition of a protein will indicate whether a compound is binding to the protein or not. Thus, monitoring free ligand peaks allows the use of very low protein concentrations, in the tens of μM range. Information on binding constants and stoichiometry of binding from ligand titration experiments can be further used to rank order ligands in a screen. Such information was obtained while screening a library of compounds for the chaperones PapD and FimC, which are involved in the assembly of pili on E. coli and are essential proteins that represent targets for the development of antibacterial agents (82). 19F NMR studies can also be used to provide further information on binding sites to optimize the lead compound by characterizing the structural changes induced by their binding. This is performed by using proteins substituted at different positions by fluorinated amino acids and monitoring their chemical shift changes on ligand binding. This is much less expensive and easier compared to 1H NMR where the spectra are complicated and further deconvolution requires expensive isotope labeled samples of high concentration.
There is a concern regarding availability of a library of fluorinated compounds. However, about 12% of the compounds in the Available Chemical Directory of Screening Compounds contain fluorine. As described earlier, there are a few drawbacks of the ligand-based screening methods if the ligand (i) has very high affinity, because of the insensitivity of NMR to detect ligand peaks in sub-μM concentration ranges, (ii) has slow kinetics or (iii) binds to the protein via a covalent bond. However, these problems are overcome by ligand-based competition-binding experiments in which the 19F NMR signals of a spy molecule, which has medium to weak affinity for the protein of interest, are monitored as it is displaced by higher affinity ligands during a screen (83,84). This places a constraint on the types of ligands that can be identified with this method, as the ligands have to exhibit sufficient affinity to compete with the spy molecule, thereby limiting the affinity range of binders. Another limitation is that, as in other competition binding experiments, this method can only study ligand binding to a previously known binding site. Control molecules, which do not interact with the protein, can also be used along with the spy molecule in this method. Therefore, the screens are performed by monitoring the relative signal intensities of the spy and the control molecule (83,84). The protein is then added to the mix of spy and control molecule and the NMR signal of the spy molecule disappears as a result of binding to the protein (83,84). A hit in the screening process is indicated by the reappearance of the spy molecule signal at the same place as before the protein was added, indicating displacement of the spy molecule by a compound of higher affinity from the library (83,84). The extent of displacement can be measured from the ratio of the control to spy molecule signal intensity, which will in turn provide the binding constant of the hit (83,84).
The choice of the spy and control molecules can be decided by their solubility in aqueous solution so that non-specific binding to proteins can be ruled out. A major advantage of this method is the requirement of only the spy molecule to be fluorinated and not the ligands being screened. This approach is known as fluorine chemical shift anisotropy and exchange for screening (FAXS) (83). The FAXS method has been successfully used to screen libraries for human serum albumin where the binding constant of a hit was found to be in good agreement with that obtained from other techniques such as fluorescence spectroscopy and isothermal titration calorimetry (85). Human serum albumin concentrations as low as 600 nM were used (85), showing that the use of very low protein concentrations is a major advantage of FAXS over other NMR-screening methods. This is especially beneficial for finding potential ligands for membrane proteins that are important drug targets but are difficult to purify in large amounts. This method was also used to screen ligands for the kinase domain of p21-activated kinase (84). Apart from its use in HTS, FAXS has been very suitable for fragment-based screening of potent ligands. The use has been illustrated in screening fragments against v-Src SH2 domain that has a high affinity for phosphotyrosine (86).
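The displacement logic of such a competition experiment can be sketched numerically. The following is a minimal model, assuming that spy and competitor bind 1:1 to the same site and that the system is at equilibrium; it solves the protein mass balance by bisection and returns the fraction of spy molecule that remains bound. It is an illustration of competitive binding in general, not the analysis used in the cited FAXS studies, and all names and values are hypothetical.

```python
def free_protein(protein_total, spy_total, kd_spy,
                 competitor_total, kd_competitor, iterations=200):
    """Free protein concentration for two ligands competing for one site.

    Solves P + S_tot*P/(Kd_s + P) + C_tot*P/(Kd_c + P) = P_tot by bisection.
    All concentrations and Kd values share one unit; Kd values must be > 0.
    """
    lo, hi = 0.0, protein_total
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        total = (mid
                 + spy_total * mid / (kd_spy + mid)
                 + competitor_total * mid / (kd_competitor + mid))
        if total > protein_total:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


def spy_bound_fraction(protein_total, spy_total, kd_spy,
                       competitor_total, kd_competitor):
    """Fraction of the spy molecule bound to the protein at equilibrium."""
    p = free_protein(protein_total, spy_total, kd_spy,
                     competitor_total, kd_competitor)
    return p / (kd_spy + p)


# Hypothetical values in micromolar: a weak spy alone, then with a tighter binder.
alone = spy_bound_fraction(1.0, 1.0, 0.5, 0.0, 1.0)
displaced = spy_bound_fraction(1.0, 1.0, 0.5, 10.0, 0.01)
```

A drop in the bound fraction of the spy corresponds to the reappearance of its sharp free-state signal, which is the readout described above.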
For HTS of ligands, ligand titrations to obtain binding affinities are not always feasible because of (i) time-consuming titration procedure and performing relaxation experiments for each titration point (ii) aggregation arising from addition of excess ligand during titrations for ligands with medium affinities and (iii) loss of native characteristics of a protein by the addition of the increasing concentrations of ligands dissolved in organic solvents. A different ‘titrationless’ method has been developed based on ηxy and R2 measurements (87). ηxy is transverse cross-correlation rate constant of a fluorine attached to an aromatic ring and its ortho-proton and R2 is the transverse relaxation rate constant. The ratio ηxy/R2 gives a more accurate estimation of the exchange rate constant than that obtained from the more conventional R1ρ (rotating frame relaxation rate) measurement. This in turn gives a more accurate dissociation constant of the ligand (87).
As a proof of concept for extending these approaches to membrane proteins, we screened binding of 19F-labeled small molecules to rhodopsin by mixing the ligands with the receptor. Ligands were in a mixture of 10 compounds at 50 μM concentration each. The receptor concentration was 0.2 mM in detergent solution (fivefold excess). For a ligand with micromolar affinity, these conditions ensure that the majority of the ligand will be bound, and therefore a maximal peak shift is expected for a hit. Excellent signal-to-noise ratio can be achieved with 7 min acquisition time (Figure 10). Both line-width and chemical shift changes were observed.
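The statement that most of a micromolar-affinity ligand is bound under these conditions can be checked with the standard 1:1 binding quadratic. The sketch below uses the concentrations given above (50 μM ligand, 0.2 mM receptor); the Kd values are hypothetical, and the calculation treats a single ligand in isolation, ignoring competition from the other compounds in the mixture.

```python
import math


def fraction_ligand_bound(ligand_total, receptor_total, kd):
    """Fraction of ligand bound for 1:1 binding (all values in the same unit)."""
    b = ligand_total + receptor_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * ligand_total * receptor_total)) / 2.0
    return complex_conc / ligand_total


# 50 uM ligand and 200 uM receptor; Kd values (in uM) are illustrative only.
for kd_um in (1.0, 10.0, 100.0):
    print(kd_um, round(fraction_ligand_bound(50.0, 200.0, kd_um), 3))
```

With receptor in excess, the bound fraction stays close to one for low-micromolar Kd values and falls off only as the Kd approaches the receptor concentration.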
A fragment-based library can be considered complementary to a library of compounds for HTS purposes. Such a library is a collection of fluorinated fragments based on Local Environment of Fluorine (LEF) (88). The collection of chemical fragments covers a wider chemical space than HTS libraries, and the ‘hits’ obtained in a fragment library screen would lead to faster ‘lead’ optimization. Many parameters are kept in mind during the building of such a fluorinated fragment library. For example, local substituents around the fluorine atom influence its chemical shift dispersion and solubility. Usually, a single chemically equivalent fluorine is preferable, because more than one non-equivalent fluorine atom would lead to complex 19F NMR spectra. The fragments are clustered according to their global structural features and local environmental fingerprints into different global and local clusters so that the library has a good coverage of different environments around the fluorine atom. These fragments are then mixed into two batches: one for CF3-containing molecules and the other for CF-containing molecules. The fragments are screened by collecting 19F NMR spectra in the absence and presence of a protein and considering those signals as ‘hits’ that are perturbed on protein addition. The screening can be further confirmed by recording the same spectra in the presence of a known ligand. The advantage of this method is that it uses lower concentrations of the fragments, thus enabling the testing of a large compound mixture and also lowering the protein concentration to be used. The low fragment concentration also means that ligands with low water solubility can still be used.
Comparison of 1H and 19F-NMR-based versus conventional screening of membrane proteins
The main advantage of drug discovery by NMR spectroscopy when compared to traditional HTS methods using other spectroscopic or cell-based assays is its high information content: in addition to ligand binding itself, the location of binding, affinities and conformational changes induced in the protein can be observed. Furthermore, as a result of the high sensitivity of NMR spectroscopy to molecular size, artifacts arising from low solubility of the ligand or from the ability of the ligand to precipitate the protein virtually never go undetected, unlike in traditional HTS approaches. However, the stringent requirements that NMR places on the sample are also its main disadvantage, limiting the applicability of traditional NMR-based approaches to small soluble proteins. These difficulties can be overcome by using specialized 1H-based and 19F-NMR-based approaches, which open the door to the study of proteins that are otherwise out of reach for NMR, including large and/or multimeric soluble protein complexes and full-length membrane proteins in detergent micelles.
An example demonstrating the limitations of traditional HTS methods is the most common membrane protein drug discovery target family, the GPCRs. Because GPCRs are not enzymes and have traditionally been difficult to obtain in soluble form, all current HTS assays are cell-based. Several different approaches are typically employed. Changes in intracellular calcium concentration are measured for Gq-coupled receptors, and the cAMP assay is used for Gi- or Gs-coupled receptors. More recently, reporter genes have been employed, beta-arrestin redistribution has been measured, and receptor internalization has also been used as a reporter for GPCR ligand binding and activity (89). The most sensitive and widely employed assay is the cAMP assay, but it is restricted to Gs- and Gi-coupled receptors. The calcium-based assay employed for Gq-coupled receptors cannot distinguish constitutive activity from basal levels of intracellular calcium concentration, making it difficult to quantitate pharmacological effects. The reporter gene assay requires long incubation with ligands, which causes many problems, including many false positives, stability issues and redistribution of ligands and receptors. Arrestin redistribution is a protein interaction-based assay: arrestin binding to the GPCR is initiated by the phosphorylation of the C-terminus of the GPCR. It has been demonstrated in many instances that binding of proteins at the intracellular side of the receptor, including arrestin binding to the C-terminus but also binding of other proteins, e.g., those involving PDZ domains, alters the ligand-binding properties and pharmacology of receptors. Finally, receptor internalization is a complicated process, and efficient and fast internalization cannot always be assumed. In addition to these assay-specific disadvantages, all of these assays are necessarily indirect and are therefore error-prone.
Moreover, the compounds in screening libraries often have limited solubility, and high concentrations of DMSO are needed to solubilize them. These high DMSO concentrations alter the cell surface properties. Finally, while an HTS will almost always yield a hit, especially when screening large libraries, the quality of the compounds identified may be low, and time-consuming and costly procedures are required to transform the hit into a lead.
NMR-based screening has found increasing application to soluble proteins, not only because of the enormous amount of information that can be obtained from such a screen (12,90) but, most importantly, because NMR-based assays are not prone to artifacts brought about by denaturation, aggregation or precipitation of proteins induced by the ligands. There are estimates that 20% of all hits in HTS are based on nonspecific ligand effects. Such effects are immediately recognized in NMR-based screens because of the direct measurement of protein signals. Furthermore, solubility of the compounds is directly visible from the NMR samples. Another advantage is that weak ligands can be identified easily. A weak but selective ligand can become the starting point for successful screening, as is exploited in the fragment-based screening approach. Thus, even though an NMR-based screen may seem more expensive because of the large protein requirements, in the long run, successful compounds may be found more easily and cheaply when viewed from the end-product perspective. Typically, HTS is evaluated based on the number of compounds screened versus the number of hits, but one has to critically evaluate how many of the hits have actually led to a lead or drug. In fact, there are many cases where HTS in the pharmaceutical industry has not yielded drugs against a desirable target.
Synthesis of 19F containing small molecule compounds
Access to diverse and drug-like screening libraries labeled with 19F is the prerequisite for 19F NMR-based screening technology. A recent database search revealed that more than half a million fluorinated small molecular weight compounds are commercially available. However, many of those compounds do not satisfy drug-like criteria and are rather unlikely to yield expandable hits during screening. A notable exception is the trifluoromethyl group-containing compound nitisinone. This compound was originally developed and is still used as an herbicide. It was recently found to be useful to treat the hereditary orphan disease tyrosinemia type 1 (HT-1) (91). Since its first use for this indication in 1991, it has replaced liver transplantation as the first-line treatment for this rare and mostly deadly condition. This is an interesting example because the compound is by no means ‘drug-like’, containing three strong electrophiles in addition to a nitro group. Nevertheless, the compound is well tolerated, no severe side effects are reported and the drug represents a major therapeutic advancement by increasing the former 4-year survival rate of 29% of newborns with HT-1 to 88%. Because nitisinone is probably an exception rather than the rule and many of the fluorinated compounds that are commercially available will not have the desired properties to make a drug or even to screen for biologically relevant compounds, the development of new libraries containing 19F is highly desirable. Introduction of fluorine into organic compounds is an established area of organic chemistry and can be accomplished by a plethora of techniques (92). Many useful reactions exist to selectively introduce fluorine into organic compounds (Figure 11). To this end, specific reagents have been introduced, e.g., the recently described Togni’s reagent for the electrophilic introduction of trifluoromethyl groups (93) or Buchwald’s nucleophilic aromatic substitution of triflates (94) (Table 1).
Table 1. Compilation of some extreme physico-chemical properties of fluorine, and fluorine moieties that make them so attractive in medicinal chemistry.
Because of the exceptional physico-chemical nature of fluorine, however, the organic chemistry of fluorine often takes different reaction pathways (Table 1). Thus, fluorine introduction is commonly used in medicinal chemistry to alter the drug compound’s profile, including its solubility, metabolism, pKa and logD (lipophilicity). In addition, it is well known that there are distinct stereochemical effects in fluorine compounds as opposed to their non-fluorine counterparts, e.g., the trifluoromethyl group in phenols has an energetic preference for an out-of-plane geometry as opposed to the methyl group (Table 1, entry 9) (96). Fluorine introduction into organic molecules is very popular to protect metabolically labile positions, e.g., in benzene rings. Finally, positron emission tomography has to be mentioned as a special application of fluorine in drug discovery because of its excellent ability to follow the fate of drugs in the human body in a time-resolved and site-resolved manner. Positron emission tomography has found clinical application as a modern diagnostic tool in several indication areas and will gain increasing importance with the rise of molecular markers in clinical trials (97).
Summary and outlook
The prospects of NMR-based screening of small molecule ligand binding to membrane proteins are very good: 1H and 19F NMR-spectroscopic approaches have been developed to overcome many of the challenges associated with solution NMR studies of membrane proteins in detergent micelles. Solvent suppression schemes and STD spectroscopy are powerful 1H-NMR-based approaches to study ligand binding to membrane proteins that are not accessible to structure determination by NMR spectroscopy. At the same time, the prospects for structure determination of membrane proteins by NMR spectroscopy are steadily improving. Complementary to 1H-NMR spectroscopy, the future of 19F-NMR-based drug discovery is particularly bright. Fluorine is an extremely versatile element with many advantages in the drug discovery pipeline. At the chemical synthesis stage, 19F derivatives are easily obtained. The fluorine substitution modulates ligand properties in ways that can lead to better drugs. At the screening stage, the 19F nucleus provides a sensitive probe of both ligand binding and conformation without background signals. At the more biologic level, the cellular uptake and fate of 19F-tagged compounds can be detected in a time-resolved, space-resolved and otherwise label-free manner. This can be extended to in vivo 19F imaging. In this review, we focused mostly on the first three stages and the particular challenges and opportunities for 19F NMR in the context of screening membrane proteins. In the future, such approaches are likely to gain further popularity as the instrumentation capabilities further improve. Nuclear magnetic resonance instrumentation companies are developing increasingly sensitive and versatile 19F NMR cryoprobes. Screening of 19F libraries will become fast and cost efficient, and the discovery of novel small molecular weight ligands by NMR will become possible even for difficult membrane-bound targets such as GPCRs and ion channels.
The broad and general availability of 19F-labeled compound libraries is currently an issue and has to be solved in the future. An inexpensive and efficient method using multicomponent reaction chemistry involving labeled building blocks is proposed.
Kelly Hay is grateful to the Fluorine division of the American Chemical Society for a Moissan Summer Internship. This work was in part supported by the National Science Foundation CAREER grant CC044917, National Institutes of Health Grants NLM108730 and 1R21GM087617-01 and the Pennsylvania Department of Health.