Debugging Eukaryotic Genetic Code Expansion for Site‐Specific Click‐PAINT Super‐Resolution Microscopy

Abstract Super‐resolution microscopy (SRM) greatly benefits from the ability to install small photostable fluorescent labels into proteins. Genetic code expansion (GCE) technology addresses this demand, allowing the introduction of small labeling sites, in the form of uniquely reactive noncanonical amino acids (ncAAs), at any residue in a target protein. However, low incorporation efficiency of ncAAs and high background fluorescence limit its current SRM applications. Redirecting the subcellular localization of the pyrrolysine‐based GCE system for click chemistry, combined with DNA‐PAINT microscopy, enables the visualization of even low‐abundance proteins inside mammalian cells. This approach links a versatile, biocompatible, and potentially unbleachable labeling method with residue‐specific precision. Moreover, our reengineered GCE system eliminates untargeted background fluorescence and substantially boosts the expression yield, which is of general interest for enhanced protein engineering in eukaryotes using GCE.

Fluorescence microscopy in general and super-resolution microscopy (SRM) in particular can benefit from the use of small and photostable fluorophores. The challenge of direct and specific protein labeling with organic fluorophores inside mammalian cells can be addressed through the genetic encoding of dye coupling modules, such as various protein or peptide tags (for a more comprehensive overview of powerful technologies, see Ref. [1]). One of the most versatile methods to achieve labeling in a residue-specific fashion is the incorporation of noncanonical amino acids (ncAAs) into proteins by using genetic code expansion (GCE). GCE most commonly relies on Amber (TAG) stop codon suppression by means of a tRNA and aminoacyl tRNA synthetase (tRNA/RS) pair orthogonal to the host translational machinery. The RS is typically engineered in such a way that it only accepts the ncAA of choice, which can simply be added to the growth medium. This leads to acylation of the cognate tRNA_CUA only when the ncAA is present, thereby resulting in its residue-specific incorporation in response to an artificially introduced UAG codon in the mRNA coding for the protein of interest (POI; for reviews, see Ref. [3]).
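The logic of Amber suppression described above can be illustrated with a deliberately simplified toy model. The codon table is a small excerpt of the standard genetic code, and the mRNA, the function name `translate`, and the symbol `X` for the ncAA are illustrative assumptions, not part of the reported system.

```python
# Toy model of Amber (UAG) stop-codon suppression. This is an illustrative
# sketch only; it ignores competition with release factors and all kinetics.
CODON_TABLE = {
    "AUG": "M", "GGU": "G", "UAC": "Y", "AAA": "K", "GAA": "E",
    "UAA": "*", "UGA": "*", "UAG": "*",  # "*" marks a stop codon
}

def translate(mrna, suppressor_charged=False, ncaa_symbol="X"):
    """Translate an mRNA codon by codon.

    If suppressor_charged is True, a charged suppressor tRNA_CUA is assumed
    to be available, so UAG is read as the ncAA instead of as a stop.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and suppressor_charged:
            protein.append(ncaa_symbol)  # Amber codon suppressed
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # translation terminates
        protein.append(residue)
    return "".join(protein)

# Hypothetical reporter mRNA: AUG GGU UAG AAA GAA UAA
mrna = "AUGGGUUAGAAAGAAUAA"
print(translate(mrna, suppressor_charged=False))  # truncated product: MG
print(translate(mrna, suppressor_charged=True))   # full-length product: MGXKE
```

Without the ncAA (no charged suppressor tRNA), the toy translation stops at UAG and yields a truncated product. With it, the full-length product carries the ncAA at the Amber position.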
We and others have recently shown that ncAAs containing strained alkyne or alkene moieties can be encoded in living mammalian cells by means of the pyrrolysine tRNA^Pyl/PylRS pair from Methanosarcina. [4] Cyclooctyne and trans-cyclooctene amino acid derivatives can subsequently be labeled through click chemistry reactions, such as ultrafast and bioorthogonal strain-promoted inverse-electron-demand Diels-Alder cycloadditions (SPIEDAC) with 1,2,4,5-tetrazines. Even though such reactions have previously been used to label and study surface proteins and highly abundant cytoskeletal proteins in mammalian cells with SRM, [5] applications to less abundant proteins are largely obscured by the limited efficiency of the GCE system, nonspecific binding (sticking) of the dyes, as well as frequent and highly fluorescent background in the nucleus, particularly in the nucleolus. [4d,5b] This renders an entire major organelle almost inaccessible to SRM through GCE-based labeling.
To improve the potential of GCE for SRM applications, we first aimed to understand the origin of and eliminate the nonspecific nuclear background labeling. We analyzed the widely used M. mazei PylRS protein sequence and, to our surprise, identified a putative nuclear localization sequence (NLS; Figure S1 in the Supporting Information). NLSs are small motifs that direct proteins to the nuclear import machinery, which relocates NLS-bearing proteins into the nucleus. [6] This finding is indeed unexpected, given that archaea, the domain that Methanosarcina belong to, do not possess a nucleus. To test whether PylRS is indeed localized to the nucleus, we first recombinantly expressed PylRS from M. mazei in E. coli to generate a polyclonal antibody (Ab^PylRS; Figure S2 in the Supporting Information). Immunofluorescence (IF) staining with Ab^PylRS of HEK293T and COS-7 cells (HEK and COS) expressing the tRNA^Pyl/PylRS^AF system (AF refers to a previously described PylRS mutant that accepts bulky side-chain moieties such as t-butyloxycarbonyl (BOC)- and trans-cyclooctene ncAAs) [4a,d, 5a, 7] revealed clear localization of the PylRS to the nucleus (see Figure 1 for HEK cells and Figure S3 for COS cells). Since PylRS has a high affinity for its cognate tRNA^Pyl, we used fluorescence in situ hybridization (FISH; see Figure 1c and Figure S3 for COS) to confirm that tRNA^Pyl is also mainly localized to the nucleus.
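The kind of sequence inspection that can flag a putative NLS may be sketched as follows. This is not the analysis used for Figure S1, which the text does not specify. It is a minimal pattern search assuming the classical monopartite consensus K-(K/R)-X-(K/R), and the test sequence is a hypothetical peptide built around the well-known SV40 large T antigen NLS (PKKKRKV), not the PylRS sequence.

```python
import re

# Classical monopartite NLS consensus K-(K/R)-X-(K/R). The lookahead makes
# the search report overlapping matches as well. Real NLS predictors use
# richer scoring; this pattern is only an illustrative assumption.
MONOPARTITE_NLS = re.compile(r"(?=(K[KR].[KR]))")

def find_putative_nls(sequence):
    """Return (1-based position, matched motif) for each consensus hit."""
    return [(m.start() + 1, m.group(1))
            for m in MONOPARTITE_NLS.finditer(sequence.upper())]

# Hypothetical test peptide containing the SV40 large T antigen NLS.
toy_sequence = "MSTPKKKRKVEDAGS"
print(find_putative_nls(toy_sequence))  # [(5, 'KKKR'), (6, 'KKRK')]
```

A hit from such a search is only a candidate. As in the work described here, localization has to be confirmed experimentally, for example by IF and FISH.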
It is important to consider that in eukaryotes, endogenous aminoacyl tRNA synthetases and their cognate tRNAs can be shuttled between the nucleus and the cytoplasm through the action of many different cellular processes and responses. [8] However, the expression of an orthogonal pair (the GCE machinery) is likely to result in impeded GCE efficiency if subjected to similar processes. Whereas NLSs can occasionally be identified in prokaryotes (which include bacteria and archaea), their role in such organisms, which lack a nucleus, is widely debated. [9] For the purpose of achieving efficient, high-yielding GCE in general, we assume that it is neither desired nor expected for the tRNA/RS pair to be mainly localized to the nucleus and thus spatially separated from the translational machinery present in the cytoplasm.
To reinforce cytoplasmic localization, we added a strong nuclear export signal (NES) [10] to the N terminus of the PylRS^AF (NESPylRS^AF), which we hypothesized would outcompete any NLS import signal intrinsic to the PylRS. Indeed, IF and FISH staining revealed a clear cytosolic distribution of both the NESPylRS^AF and tRNA^Pyl (Figure 1b,d and Figure S3). To test whether this also increases the efficiency of the system, we used fluorescence-based flow cytometry of cells expressing an Amber suppression reporter (iRFP-GFP^Y39TAG; iRFP is a near-infrared fluorescent protein) in the presence and absence of an unreactive tert-butoxycarbonyl lysine derivative (BOC) as the ncAA. Our reporter is composed of iRFP, which is fused to the Amber mutant of GFP (Y39TAG) at its C terminus. In this assay, full-length iRFP-GFP is only produced if the TAG codon is suppressed to encode the ncAA. The intensity of the green fluorescence (GFP) indicates the efficiency of Amber suppression, while iRFP fluorescence reports whether the cells were properly transfected. As shown in Figure 1e (and in detail in Figure S4), we observed an up to 15-fold enhancement of Amber suppression efficiency with NESPylRS^AF.
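One way such a dual-color reporter readout can be turned into a fold enhancement is sketched below. The gating scheme, the gate values, and the event lists are hypothetical assumptions chosen for illustration. They are not the analysis pipeline or the data behind Figure 1e and Figure S4.

```python
def suppression_fraction(events, irfp_gate, gfp_gate):
    """Fraction of transfected (iRFP-positive) cells that are GFP-positive.

    events: list of (iRFP intensity, GFP intensity) tuples, arbitrary units.
    """
    transfected = [e for e in events if e[0] >= irfp_gate]
    if not transfected:
        return 0.0
    positive = sum(1 for _, gfp in transfected if gfp >= gfp_gate)
    return positive / len(transfected)

def fold_enhancement(test_events, reference_events, irfp_gate, gfp_gate):
    """Ratio of suppression fractions (test system over reference system)."""
    reference = suppression_fraction(reference_events, irfp_gate, gfp_gate)
    if reference == 0:
        raise ValueError("reference shows no suppression; ratio undefined")
    return suppression_fraction(test_events, irfp_gate, gfp_gate) / reference

# Hypothetical toy events, not measured data.
conventional = [(500, 20), (600, 900), (30, 10), (700, 15), (800, 30)]
with_nes = [(500, 800), (650, 950), (20, 5), (700, 40), (900, 700)]

print(fold_enhancement(with_nes, conventional, irfp_gate=100, gfp_gate=200))  # 3.0
```

Normalizing to iRFP-positive cells separates suppression efficiency from transfection efficiency, which is the purpose of the iRFP half of the reporter.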
We next wanted to test whether this NESPylRS^AF construct also reduces background in fluorescence labeling experiments, in particular the unwanted nuclear background staining. We performed a side-by-side comparison of intracellular labeling experiments with tRNA^Pyl/NESPylRS^AF and the conventional tRNA^Pyl/PylRS^AF system. We used the axial atropisomer {[(E)-cyclooct-2-en-1-yl]oxy}carbonyl)-L-lysine (TCO*a), which we previously determined to be an ideal choice for site-specific labeling with 1,2,4,5-tetrazine-containing Cy5 dye (Cy5-tet) derivatives, [5a, 11] despite the possibility that a click-reaction side product is also formed that could eliminate the dye from the protein. [12] Figure 2a,b shows Amber suppression results using TCO*a of the transcription factor jun-B^348TAG-GFP labeled with Cy5-tet using SPIEDAC (jun-B^348→TCO*a→Cy5). This construct contains a C-terminal GFP fusion, which is only generated when the Amber codon is suppressed. In such a case, the jun-B-GFP signal can be used as a reference to validate proper labeling. [5b] The detrimental effect of using the conventional system is particularly evident in Figure 2, where the Cy5 fluorescence (which should indicate jun-B staining) shows a similar shape to a nucleolus. When inspecting only the Cy5 channel using the conventional system, nucleoli and areas of jun-B^348→TCO*a→Cy5 localization might be easily confused (Figure 2a).
Figure 1 (panels a-d not recovered; see Figure S3 for COS cells). e) Flow cytometry analysis of the reporter iRFP-GFP^Y39TAG to assess the Amber suppression efficiency in the presence of BOC of PylRS^AF (left) and NESPylRS^AF (center), and of NESPylRS^AF without ncAA (right). The analysis shows that the number of bright GFP-expressing cells (i.e. successful Amber suppression) is substantially enhanced for the NESPylRS^AF in the presence of BOC (up to 15-fold; shown here is the average of a full titration, which is detailed in Figure S4). The axes indicate fluorescence intensity in arbitrary units.
This background signal can even be observed when no POI is introduced (Figure S5), and thus most likely originates from ncAA bound or coupled to tRNA and/or RS accumulating in the nucleus. However, for the NES system (Figure 2b), the labeled protein and GFP signals co-localize (see Figure S6 for a quantitative co-localization analysis), and thus faithful identification of jun-B through the use of GCE and click chemistry becomes possible. Besides enhanced expression efficiency, removal of the unwanted nucleolar background is thus beneficial for general fluorescence microscopy, ranging from "simple" confocal imaging (as shown in Figure 2 and Figure S5) to super-resolved microscopy techniques like STORM/GSDIM/STED (analogously to Figure 2).
To extend the repertoire of GCE-compatible SRM techniques, we next wished to demonstrate that our tRNA^Pyl/NESPylRS^AF GCE-based click labeling can be combined with the more recently developed DNA-PAINT SRM [13] in a "Click-PAINT" approach. DNA-PAINT relies on placing a short single-stranded (ss) DNA (the "docking strand") into the POI, to which a complementary ssDNA carrying a small photostable synthetic dye (the "imaging strand") can transiently and reversibly anneal. By means of localization microscopy, the freely diffusing imaging strand can then be discerned from the annealed one and a super-resolved image can be reconstructed. [13b] Although limited to fixed specimens, as the majority of SRM applications still are, the strength of DNA-PAINT arises from several features, which will be summarized below. A particularly evident benefit for the combination of GCE with DNA-PAINT is that the solubility and biocompatibility of many organic fluorophores can be enhanced by coupling them to a biomolecule like DNA, which leads to reduced tendency towards nonspecific binding (stickiness). In addition, GCE permits the imaging strand to be placed in direct proximity to the residue-specifically installed ncAA.
This is in contrast to previously described DNA-PAINT, which was typically based on Ab labeling. [13] Such an approach introduces an Ab linker of up to 10 nm, which can limit the accuracy of the method and lower the achievable labeling density, both of which can be crucial for optimal SRM (see Ref. [2,14]).
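The transient, reversible annealing that DNA-PAINT relies on can be pictured with a minimal two-state simulation of a single docking site. Exponentially distributed dwell times and the specific mean dwell times below are assumptions made for illustration. They are not kinetic parameters reported in this work.

```python
import random

def simulate_binding_trace(n_frames, mean_dark_frames, mean_bright_frames, seed=0):
    """Simulate per-frame occupancy (1 = imaging strand bound, 0 = unbound)
    of one docking site, assuming exponentially distributed dwell times."""
    rng = random.Random(seed)
    trace = []
    bound = False  # start in the unbound (dark) state
    while len(trace) < n_frames:
        mean = mean_bright_frames if bound else mean_dark_frames
        dwell = max(1, round(rng.expovariate(1.0 / mean)))  # at least one frame
        trace.extend([1 if bound else 0] * dwell)
        bound = not bound
    return trace[:n_frames]

def count_binding_events(trace):
    """Count unbound-to-bound transitions (a bound first frame counts once)."""
    previous = [0] + trace[:-1]
    return sum(1 for prev, cur in zip(previous, trace) if cur == 1 and prev == 0)

# Hypothetical parameters: long dark times, short bright times.
trace = simulate_binding_trace(n_frames=1000, mean_dark_frames=50, mean_bright_frames=5)
print(count_binding_events(trace), "binding events in", len(trace), "frames")
```

Because each bright period comes from a fresh imaging strand in solution, bleaching one strand does not exhaust the site. Each binding event yields one more localization for the reconstruction.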
As outlined in Figure 3, a DNA docking strand was first equipped with a compatible 1,2,4,5-tetrazine and reacted with the POI^TAG→TCO*a. Next, an imaging strand containing the synthetic dye Atto655 was added to the cells. The dye was conjugated to the imaging strand such that upon annealing with the docking strand, it was in close proximity to the labeling site. To validate the method, we used a previously described Amber mutant of vimentin^N116TAG. [5b] Figure 3c shows an SRM image of our vimentin^N116→TCO*a→PAINT-mOrange construct, which clearly gives enhanced resolution compared to the diffraction-limited image from the mOrange reference channel (Figure 3b).
We next aimed to test whether the beneficial features of Click-PAINT can enable more demanding microscopy studies involving the imaging of less abundant structures in cells, such as the nuclear pore complex (NPC), a ring-like structure in the nuclear envelope. Thirty-two copies of the protein nucleoporin Nup153 have been recently counted in the NPC, [15] which has an approximate volume of 60 nm^3. Labeling sites for Nup153 are consequently at a substantially lower abundance and density than those for cytoskeletal filaments. We generated a GFP^N149TAG-Nup153 construct (with GFP serving as reference) and subjected it to our method. As shown in Figure 3d, we were able to obtain super-resolved images showing the typical circular appearance of NPCs by using total internal reflection fluorescence (TIRF) microscopy, which shows NPCs facing the coverslip and objective. [15,16] We note that not all rings are closed, since the cells also express wild-type (unlabeled) Nup153, which will compete for incorporation into the NPC with our GFP^N149→TCO*a-Nup153 protein.
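The effect of this competition on ring completeness can be estimated with a simple binomial sketch. It assumes that each of the 32 Nup153 positions is filled independently and that a fixed fraction of the cellular Nup153 pool is the labeled construct. The value 0.25 is a hypothetical illustration, not a measured fraction.

```python
from math import comb

def labeled_copies_distribution(n_sites, labeled_fraction):
    """Binomial probabilities of finding k = 0..n_sites labeled copies,
    assuming independent incorporation at each site."""
    return [
        comb(n_sites, k) * labeled_fraction**k * (1 - labeled_fraction)**(n_sites - k)
        for k in range(n_sites + 1)
    ]

def expected_labeled_copies(n_sites, labeled_fraction):
    """Mean number of labeled copies per complex."""
    return n_sites * labeled_fraction

N_SITES = 32             # Nup153 copies per NPC, as cited in the text
LABELED_FRACTION = 0.25  # hypothetical share of labeled Nup153

distribution = labeled_copies_distribution(N_SITES, LABELED_FRACTION)
print(expected_labeled_copies(N_SITES, LABELED_FRACTION))  # 8.0
print(f"P(fully labeled ring) = {distribution[-1]:.2e}")
```

Under these assumptions a fully labeled ring is extremely unlikely, which is consistent with the observation that some rings appear open.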
In summary, we have eliminated a major flaw in the eukaryotic application of the most popular GCE system, the tRNA^Pyl/PylRS pair from Methanosarcina. Undoubtedly, yield is still one of the major issues of GCE in general, and not just for SRM applications. The main reason for low yields is competition between the host's internal translation termination machinery and the stop codon suppression system. To address this issue in eukaryotes, many approaches, including promoter engineering, better evolution of the RS, and release factor engineering, to name just a few, have been developed. Increasing the amount of tRNA, for example through gene multi-chaining, has been a major focus of many previous studies for GCE yield enhancement (for reviews, see Ref. [3]), simply because the lower the concentration of properly charged suppressor tRNA^Pyl, the more likely it is that the eukaryotic release factor terminates translation. We discovered that PylRS contains an NLS and accumulates together with its cognate tRNA^Pyl in the nucleus, and that appending an NES to PylRS relocates the pair back to the cytoplasm, where it can be translationally active. Our repair ("debugging") strategy is extremely easy to implement, since existing systems only require N-terminal fusion of the NES to the PylRS. Therefore, every user of the eukaryotic pyrrolysine GCE machinery could immediately reap the benefits of enhanced codon suppression efficiency and thus a higher expression yield for any application that requires a more efficient system.
Cytoplasmic relocalization of the PylRS also results in increased contrast in labeling experiments, as most easily recognized by lack of nonspecific nucleolar staining (Figure 2). This enables contrast-enhanced imaging of proteins within the nucleus, a compartment previously not faithfully accessible for GCE-based labeling experiments. This is of benefit for all fluorescent-dye-based imaging modalities (from confocal microscopy to SRM techniques; Figure 2 and Figure S7).
In addition, we presented a combination of the enhanced GCE system with DNA-PAINT, which we term Click-PAINT, in an application to image even low-abundance proteins in the nucleus. DNA-PAINT has multiple features that make it a particularly powerful SRM technique in biology. [13] For example, a large reservoir of imaging strands can help to reduce bleaching problems, and the technique has the potential to enable direct quantification of the number of fluorescent labels in an image. The latter is an important parameter with respect to the ultimate goal of direct quantification of protein concentration in cells through microscopy. [13a] Another indirect benefit is that conjugating dyes to ssDNA makes many fluorescent probes biocompatible and soluble. This can increase the robustness and generality of the method in combination with GCE by lowering nonspecific dye staining, thus providing an alternative to the need for using fluorogenic dyes or optimization of the washing conditions, as is sometimes necessary when coupling dyes directly to ncAAs. [5] We note that GCE still has many limitations with respect to SRM, such as the generation of truncated proteins and the suppression of natural Amber codons. These hurdles need to be addressed in the future, in particular when aiming to quantify the number of expressed proteins and not just fluorescence labels. However, while GCE-based SRM is not yet as simple to implement as fluorescent protein fusions or antibody-based labeling techniques, these techniques do not offer residue-level precision and the versatility of placing a labeling site virtually anywhere in a protein.
Nevertheless, the combination of residue-specific resolution of click chemistry-based GCE with DNA-PAINT clears the way for in-cell structural biology experiments.
Figure 3. a) Schematic representation of the Click-PAINT method. The POI^TAG is first expressed in mammalian cells in the presence of TCO*a when co-transfected with the tRNA^Pyl/NESPylRS^AF. Then, the POI^TAG→TCO*a is subjected to a two-step labeling reaction in which first a tetrazine-functionalized docking DNA strand is chemically ligated in a SPIEDAC reaction and second, a complementary imaging strand conjugated with a dye is added to the cells. b) Fluorescence signal of the fused mOrange protein for the vimentin^N116→TCO*a-mOrange construct used as a reference for protein expression. c,d) DNA-PAINT-based SRM performed by acquisition in the channel that is appropriate for the dye introduced using the Click-PAINT method for vimentin^N116→TCO*a→PAINT-mOrange, resolution 50 nm (c), and GFP^N149→TCO*a→PAINT-Nup153, resolution 25 nm (d; scale bar in insets with zoomed-in nuclear pores is 100 nm, see also Figure S8). The resolution was determined using Fourier ring correlation. [2]
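For readers unfamiliar with Fourier ring correlation (FRC), the principle can be sketched with NumPy. The implementation used for Figure 3 is not described in the text. The sketch below assumes two square images rendered from independent halves of the localization data and a fixed 1/7 threshold, a common convention in localization microscopy. The function names are illustrative.

```python
import numpy as np

def frc_curve(img1, img2):
    """Fourier ring correlation between two square images of equal size.

    Returns one correlation value per integer spatial-frequency ring
    (in cycles per image width), up to the Nyquist ring.
    """
    f1 = np.fft.fftshift(np.fft.fft2(img1))
    f2 = np.fft.fftshift(np.fft.fft2(img2))
    n = img1.shape[0]
    yy, xx = np.indices(img1.shape)
    ring = np.hypot(yy - n // 2, xx - n // 2).astype(int)  # ring index per pixel
    n_rings = n // 2
    inside = ring < n_rings
    cross = np.bincount(ring[inside], weights=(f1 * np.conj(f2)).real[inside],
                        minlength=n_rings)
    power1 = np.bincount(ring[inside], weights=(np.abs(f1) ** 2)[inside],
                         minlength=n_rings)
    power2 = np.bincount(ring[inside], weights=(np.abs(f2) ** 2)[inside],
                         minlength=n_rings)
    return cross / np.sqrt(power1 * power2)

def frc_resolution(curve, image_size_px, pixel_size_nm, threshold=1 / 7):
    """Resolution in nm at the first ring where the FRC drops below threshold.

    Returns None if the curve never crosses the threshold.
    """
    below = np.nonzero(curve[1:] < threshold)[0]
    if below.size == 0:
        return None
    ring_index = below[0] + 1  # skip the zero-frequency ring
    return image_size_px * pixel_size_nm / ring_index

# Sanity check on synthetic data: identical images correlate perfectly.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
print(np.allclose(frc_curve(image, image), 1.0))  # True
```

In practice the two input images come from splitting the localization list into two statistically independent halves. The ring at which their correlation falls below the threshold sets the reported resolution.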