We acknowledge Silke Wiesner for advice and suggestions to the manuscript. We thank Ancilla Neu, Lisa Strittmatter, and other group members for discussions. This study was supported by the Max Planck Society and a Marie Curie Reintegration Grant (FP7/2007-2013, grant agreement no. 239164) to R.S.
LEGO-NMR Spectroscopy: A Method to Visualize Individual Subunits in Large Heteromeric Complexes†
Article first published online: 14 AUG 2013
© 2013 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA. This is an open access article under the terms of the Creative Commons Attribution Non-Commercial NoDerivs License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Angewandte Chemie International Edition
Volume 52, Issue 43, pages 11401–11405, October 18, 2013
How to Cite
Mund, M., Overbeck, J. H., Ullmann, J. and Sprangers, R. (2013), LEGO-NMR Spectroscopy: A Method to Visualize Individual Subunits in Large Heteromeric Complexes. Angew. Chem. Int. Ed., 52: 11401–11405. doi: 10.1002/anie.201304914
- Issue published online: 14 OCT 2013
- Manuscript Received: 7 JUN 2013
- Funded Access
- Max Planck Society
- Marie Curie Reintegration Grant. Grant Number: 239164
- isotopic labeling;
- LSm complex;
- methyl TROSY;
- NMR spectroscopy;
- protein structures
NMR spectroscopy is unique as it provides a means to study bio-molecules with atomic resolution in a near natural environment. Traditionally, NMR spectroscopic analysis of structures, interactions, and dynamics has been reserved for molecular complexes that are smaller than 20 kDa. However, in recent years, the introduction of TROSY techniques,1 protein deuteration,2 and selective methyl-group labeling3 has significantly extended this molecular weight limit.4 Indeed, systems far over 100 kDa have been analyzed in great detail, revealing unique functional aspects of large molecular machines.5
Many NMR spectroscopic studies on large systems have been performed on highly symmetric complexes, as these assemblies are relatively easy to prepare and result in simple NMR spectra in which the resonance signals from all the subunits are identical.5a For large asymmetric assemblies that can be produced in E. coli by co-expression of all the components,6 spectral crowding will lead to NMR spectra that can no longer be analyzed in detail. In a limited number of cases this crowding could be circumvented by in vitro reconstitution of the complex from separately expressed NMR active and NMR inactive subunits.5b,c,h This strategy is, however, not generally applicable. As a result, most eukaryotic systems that are much more complex than their bacterial or archaeal counterparts will remain inaccessible to high-resolution NMR studies.
Herein, we introduce a sequential co-expression method for the preparation of large asymmetric complexes that combines the advantages of in vivo reconstitution and the benefits of partial NMR isotope labeling to reduce NMR spectral complexity. We transform E. coli cells with two plasmids carrying different promoters so that protein expression can be induced independently. In this manner, it is possible to induce protein synthesis for one set of proteins in an NMR active medium (stage 1), whereas a second set of proteins can be produced in an NMR invisible medium (stage 2; Figure 1 A). As all expressed proteins are present in a single E. coli cell, the final complex can assemble in a cellular environment, which prevents the aggregation of subunits that are otherwise unstable in isolation. We refer to our method to “label, express, and generate oligomers” for NMR as “LEGO-NMR”.
The LEGO method requires tightly controlled individual DNA promoters such that the promoter that induces protein expression in stage 1 is completely switched off in stage 2, whereas the promoter for stage 2 is not active in stage 1. In LEGO methods A1 and A2 (Supporting Information, Figure S1), protein production is induced from an araBAD promoter using arabinose in stage 1 and from a T7 promoter in stage 2 using IPTG.7 In this case, the glucose that is present in stage 2 efficiently turns off the araBAD promoter.8 In LEGO method B, we introduce a three-promoter system, where protein production is induced from a T7 promoter in stage 1, and from an araBAD promoter in stage 2. In this case, the T7 promoter is actively switched off by the expression of T7 lysozyme in between stage 1 and stage 2 from a third plasmid that contains a rhamnose inducible promoter.9 This inhibition is required as T7 expression would otherwise continue for over 4 h after the removal of IPTG from the growth medium.7a
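The promoter logic described above can be summarized in a short sketch. This is only an illustrative encoding of the three schemes as stated in the text, not part of the published protocol; the class names, field names, and the `promoters_are_independent` helper are hypothetical and were introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One expression stage: the promoter used and the compound that induces it."""
    promoter: str
    inducer: str


@dataclass(frozen=True)
class LegoScheme:
    """Hypothetical record of a LEGO scheme as described in the text."""
    name: str
    stage1: Stage
    stage2: Stage
    stage1_off_switch: str  # how the stage 1 promoter is silenced for stage 2


ARA = Stage(promoter="araBAD", inducer="arabinose")
T7 = Stage(promoter="T7", inducer="IPTG")

SCHEMES = {
    # Methods A1 and A2: araBAD in stage 1, T7 in stage 2;
    # glucose present in stage 2 turns off the araBAD promoter.
    "A1": LegoScheme("A1", ARA, T7, "glucose in the stage 2 medium"),
    "A2": LegoScheme("A2", ARA, T7, "glucose in the stage 2 medium"),
    # Method B: T7 in stage 1, araBAD in stage 2; T7 is switched off by
    # T7 lysozyme expressed from a third, rhamnose-inducible promoter.
    "B": LegoScheme("B", T7, ARA, "T7 lysozyme from a rhamnose-inducible promoter"),
}


def promoters_are_independent(scheme: LegoScheme) -> bool:
    """A scheme is only usable if the two stages rely on different promoters."""
    return scheme.stage1.promoter != scheme.stage2.promoter


for scheme in SCHEMES.values():
    print(
        f"{scheme.name}: stage 1 {scheme.stage1.promoter} ({scheme.stage1.inducer}), "
        f"stage 2 {scheme.stage2.promoter} ({scheme.stage2.inducer}); "
        f"stage 1 off-switch: {scheme.stage1_off_switch}"
    )
```

The sketch records only which promoter drives which stage and how the stage 1 promoter is silenced. Growth media, timing, and which stage carries the isotope label are deliberately left out.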
To establish the LEGO-NMR methodology, we use two different LSm complexes that play a role in mRNA degradation and pre-mRNA splicing. The LSm1–7 complex10 (containing the LSm1 to LSm7 proteins) and LSm2–8 complex11 (containing the LSm2 to LSm8 proteins) contain seven different protein chains that are arranged in a unique order.11b, 12 As most LSm proteins are insoluble in isolation, neither the LSm1–7 nor the LSm2–8 complex can be efficiently reconstituted in vitro from separately expressed proteins.13 On the other hand, co-expression of the different LSm proteins yields homogeneous NMR samples (Figure 1 B), showing that in-cell reconstitution functions efficiently. However, owing to the large number of unique resonances (649 expected backbone amide signals) the resulting NMR spectra suffer significantly from spectral overlap (Figure 1 B), preventing an accurate analysis. LSm complexes are thus a good example of eukaryotic protein complexes that are currently not accessible for detailed high-resolution NMR spectroscopic techniques.
To reduce the spectral overlap for the LSm2–8 complex by a factor of approximately two, we labeled the LSm5, LSm6, and LSm7 proteins with 15N in stage 1, whereas the LSm2, LSm3, LSm4, and LSm8 proteins were produced in an NMR inactive medium in stage 2. The resulting spectrum of the LSm2–8 complex that only displays LSm5, LSm6, and LSm7 is significantly simplified (Figure 1 C, top left). Importantly, a very good overlay of a subset of the resonances of the fully NMR active LSm2–8 complex is observed, as we intended to achieve. This situation clearly allows for the identification of the resonances in the LSm2–8 complex that result from the LSm5, LSm6, and LSm7 proteins.
To establish the power of the sequential co-expression methodology further, we produced seven different NMR samples of the LSm2–8 complex, in which only a single LSm protein was 15N-labeled in stage 1, whereas the remaining six LSm proteins were expressed in an NMR inactive form in stage 2 (Figure 1 C). The seven spectra of the LSm2–8 complex allow for the unambiguous identification of the resonance signals that result from each individual LSm protein in the LSm2–8 complex. In this manner, a simplification of 89 % can be achieved (74 expected amide signals in the LEGO LSm6 spectrum). Our approach is thus able to deconvolute the complicated spectrum of the hetero-heptameric complex into seven significantly simplified sub-spectra. At the same time, the overlay of the seven NMR spectra of the complexes that contain a single labeled LSm protein yields the spectrum of the uniformly labeled LSm2–8 complex (Figure S2). Note that the proteins that are produced in a deuterated form in stage 1 are efficiently re-protonated at the beginning of stage 2 before the individual subunits are incorporated in the final complex. This eliminates the need for (refolding) methods to re-protonate backbone amides (Figure S3) in the LSm2–8 complex.
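The quoted simplification can be checked with a short calculation. The sketch below assumes that "simplification" means the fraction of expected backbone amide signals removed from the spectrum; the function name is hypothetical, and the signal counts (649 for the full complex, 74 for LSm6, 77 for LSm5) are the ones given in the text.

```python
def simplification_percent(total_signals: int, visible_signals: int) -> float:
    """Percentage of expected signals removed when only one subunit is NMR active."""
    if not 0 <= visible_signals <= total_signals:
        raise ValueError("visible_signals must lie between 0 and total_signals")
    return 100.0 * (1.0 - visible_signals / total_signals)


TOTAL_AMIDE_SIGNALS = 649  # expected backbone amide signals, fully labeled complex
SINGLE_SUBUNIT_SIGNALS = {"LSm6": 74, "LSm5": 77}  # expected signals per labeled subunit

for subunit, n_signals in SINGLE_SUBUNIT_SIGNALS.items():
    pct = simplification_percent(TOTAL_AMIDE_SIGNALS, n_signals)
    print(f"{subunit}: {n_signals} of {TOTAL_AMIDE_SIGNALS} signals, {pct:.1f} % simplification")
```

For LSm6 this gives roughly 88.6 %, which rounds to the 89 % stated above.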
The LSm2–8 complex is part of the U6 snRNP, where it interacts with the 3′ end of the U6 snRNA.11a, 14 To establish which subunits in the LSm2–8 complex contact the RNA substrate, we performed NMR titration experiments with LSm2–8 LEGO complexes that either contained NMR active LSm2, LSm3, LSm4, and LSm8 (Figure S4A) or that contained NMR active LSm5, LSm6, and LSm7 (Figure S4B) in an otherwise NMR inactive background. In both complexes, we observed significant chemical shift perturbations upon complex formation with the RNA. Importantly, the single subunit LEGO spectra (Figure 1 C) establish that all seven LSm proteins are involved in RNA binding (Figure S4A,B) as resonance signals for all the LSm proteins experience chemical shift changes upon interaction with the RNA. To resolve the remaining spectral overlap, we performed an RNA titration experiment with an LSm2–8 complex that was labeled at LSm5 only (Figure 2 A). We then combined information from the previously assigned LSm657 complex,13a the LSm5 LEGO spectrum (Figure 1 C) and an HNCA spectrum of a fully 2H, 13C, 15N-labeled LSm2–8 complex (Figure 2 B) to assign the LSm5 residues in the LSm2–8 complex that contact the RNA. In this case, we exploited the fact that we were able to select the LSm5 resonance signals in the HNCA spectrum of the fully labeled LSm2–8 complex, thus reducing the number of expected resonance signals from 649 to 77, which significantly simplified the assignment process. This approach revealed that the residues that experience large chemical shift perturbations upon interaction with the U6 snRNA are located in loop 5 of the LSm5 protein (Figure 2 C). This loop connects β-strands 4 and 5 in the LSm fold and lines the central pore of the LSm ring. As the RNA we used for the interaction experiments contains only nine bases and as all the LSm proteins are involved in the RNA interaction (Figure S4), our data suggest that the RNA binding site in LSm2–8 is at the central pore.
Additional information to support this observation can be obtained from the assignment of the other LSm proteins in the LSm2–8 complex in an analogous manner. Interestingly, the eukaryotic Sm complex,15 the archaeal LSm complex,16 and Hfq17 have all been shown to use this region to interact with substrate RNA indicating that this binding site is conserved in the eukaryotic LSm complexes.
Methyl TROSY spectroscopy has been shown to be highly suitable for the study of supramolecular complexes that are inaccessible to backbone-directed TROSY spectroscopy. To establish ε-1H-13C methyl labeling of methionine residues in concert with LEGO-NMR, we used the hetero-heptameric LSm1–7 complex whose 1H-15N TROSY spectra are of lower quality compared to those of the LSm2–8 complex (Figure S5). Methionine methyl TROSY spectra of the LSm1–7 complex, where all proteins are fully methionine labeled, display a large number of well resolved methyl resonances in addition to a region that suffers from significant spectral overlap (Figure 3 A, top left). To resolve the spectral overlap and to assign the well-resolved resonances to specific LSm proteins, we prepared seven different LEGO NMR samples of the LSm1–7 ring. In each of these samples a single LSm protein was methionine labeled, whereas the other six LSm proteins were NMR invisible. Methyl TROSY spectra of these hetero-heptameric complexes allowed for the unambiguous assignment of the methionine methyl groups to individual LSm proteins (Figure 3 A). Site-specific assignment of these methyl groups can be made using a mutational approach.5a, 18 In addition, the “singly labeled” LSm1–7 rings significantly resolved the spectral overlap of the spectrum. Methionine methyl TROSY spectroscopy is thus fully compatible with the LEGO-NMR methodology and can provide high-resolution spectra for complexes that are not amenable to 1H,15N-based TROSY spectroscopy. Note that it has been shown recently that methionine methyl groups are excellent probes to study molecular interactions.19
In addition to methionine methyl groups, methyl TROSY spectroscopy is often performed in concert with labeled methyl groups of isoleucine, leucine, valine,20 or alanine21 residues. As opposed to methionine labeling, these amino acids are incorporated into the protein through E. coli metabolization of specifically labeled precursor molecules. For isoleucine residues this is only possible in the presence of glucose as that induces catabolite repression that inhibits metabolic pathways that would otherwise degrade α-ketobutyric acid.22 Stage 1 in method A1, which we used for methionine and nitrogen labeling (Figure S1, Table S2), uses glycerol as a carbon source and thus cannot be used for isoleucine labeling. To label selected subunits with isoleucine methyl groups we thus use method A2 (where the NMR labeling is moved from stage 1 to stage 2) or method B (where an arabinose-inducible vector is used in stage 2; Figure S1).
The high quality of the spectrum of the fully isoleucine-δ1 labeled complex reflects the strength of methyl-group labeling for high molecular-weight complexes (Figure 3 B, gray). We then used method B to prepare a “half-labeled” LSm2–8 LEGO complex that contains NMR active isoleucine-δ1 methyl groups in LSm5, LSm6, and LSm7 (Figure 3 B, black). As observed for the H,N-based spectra (Figure 1 C), a subset of the resonances that result from the labeled proteins can be readily identified. It is worth noting that the LSm5, LSm6, and LSm7 proteins contain 17 isoleucine residues, 16 of which yield well-dispersed resonance signals in the spectrum. To extend the LEGO approach one step further, we used method A2 to prepare an LSm2–8 complex, in which only LSm5 is NMR active (Figure 3 B, olive). The resulting HMQC spectrum displays six distinct resonance signals that result from the six isoleucine residues that are present in the LSm5 protein.
In the examples shown above, we ensured that isotope labeling was restricted to a subset of the subunits in the complex, whereas the remaining subunits were NMR invisible. Interestingly, it is also possible to distribute different labeling schemes over the different subunits. We demonstrate this approach with an LSm2–8 complex that is uniformly 15N labeled, LSm2, LSm3, LSm4, and LSm8 methionine labeled, and LSm5, LSm6, and LSm7 isoleucine labeled (Figure 4). Owing to the spectral separation of methionine and isoleucine methyl groups this approach allows for the independent and simultaneous monitoring of NMR parameters from different parts of a large complex.
NMR spectroscopic studies of large and asymmetric protein complexes suffer from significant challenges related to sample preparation and from spectral crowding owing to a high number of unique resonances. We have introduced a sequential co-expression strategy that tackles both issues simultaneously. Using the LSm1–7 and LSm2–8 complexes, we show that highly homogeneous samples that contain only one NMR active subunit can be readily prepared. Importantly, our strategy is compatible with backbone and methyl group side-chain labeling. LEGO-NMR is thus suitable for the study of large asymmetric complexes including eukaryotic systems that are currently inaccessible to detailed NMR analysis. Interestingly, around 50 % of the assemblies in the Protein Data Bank (PDB) that contain three or more unique chains have been prepared in E. coli, indicating that our method is applicable to a wide variety of complexes.