Genetic Encoding and Enzymatic Deprotection of a Latent Thiol Side Chain to Enable New Protein Bioconjugation Applications

Abstract: The thiol group of the cysteine side chain is arguably the most versatile chemical handle in proteins. To expand the scope of established and commercially available thiol bioconjugation reagents, we genetically encoded a second such functional moiety in the form of a latent thiol group that can be unmasked under mild physiological conditions. Phenylacetamidomethyl (Phacm)-protected homocysteine (HcP) was incorporated and its latent thiol group unmasked on purified proteins using penicillin G acylase (PGA). The enzymatic deprotection depends on steric accessibility, but can occur efficiently within minutes at exposed positions in flexible sequences. The freshly liberated thiol group does not require treatment with reducing agents. We demonstrate the potential of this approach for protein modification with conceptually new schemes for regioselective dual labeling, thiol bioconjugation in the presence of a preserved disulfide bond and formation of a novel intramolecular thioether crosslink.


Supporting Figures
Time-dependent HcP deprotection. DiSUMO-I (2) (10 µM) was incubated with 0.001 eq. PGA-His6 and the reaction monitored by ESI-MS.

Modifications of +42 Da and +178 Da were found at varying levels depending on the protein's N-terminal sequence. These modifications, however, did not interfere with the PGA-mediated deprotection of HcP or subsequent bioconjugation. Proteins with an N-terminal His-tag can give rise to covalent modifications in E. coli, with the +178 Da modification corresponding to the previously described α-N-gluconoylation. [1] HcP positions refer to numbering without the start methionine.

dually-labeled FRET sensor holo-5* (right panels). B) Purified Cys-Cys construct 6 (left panel) and dually-labeled FRET sensor holo-6* (right panels). The fluorescence images to detect the AF555 and AF647 fluorophores were recorded with excitation at λ = 532 nm and λ = 635 nm, respectively.

Figure S10: ESI-MS analysis of the regioselective dual-labeling procedure of Cys-HcP construct 5. As indicated in the scheme, the protein was first labeled with AF555 and quenched with DTT to give 5#. Secondly, HcP was deprotected by addition of PGA-SBP and labeled in situ with AF647 to give 5*. Finally, the labeled protein was ppantylated into the holo-form to give the active FRET sensor holo-5* (ca. 91%). The left panel shows an ESI-MS analysis at each stage of the procedure.

Figure S11: Further analysis of dimeric nanobody construct 7-S-S-7. Shown are bioconjugation reactions analyzed by SDS-PAGE. The protein (10 µM) was incubated with PGA-His6 (0.01 eq), TCEP (10 eq) and Cy5 maleimide (5 eq), as indicated, for 1 h at 25 °C. UV illumination of the gel reports on the bioconjugation with Cy5 (bottom panel). Lane 1 (from left to right) shows that the disulfide was formed quantitatively, as no reaction with Cy5 maleimide occurred under these conditions. Following addition of TCEP (lane 2), the disulfide is reduced and Cy5 is conjugated to the reduced Cys residue. Lanes 3 and 4 show that HcP deprotection, mediated by PGA-His6, leads to a free thiol (Hcy) that undergoes bioconjugation with Cy5 maleimide. Together, these results provide independent evidence for the preservation of disulfide bonds during HcP deprotection and bioconjugation and further underline the regioselectivity of the new dual labeling strategy. Note that the disulfide-linked homodimeric nanobody 7-S-S-7 is reduced to the monomers during SDS-PAGE sample preparation.

Figure 6A of the main text. Importantly, intramolecularly crosslinked 11 is resistant to TEV cleavage, whereas linear 9 with the protected latent thiol group of HcP is quantitatively cleaved by TEV protease. PGA-His6 was used for these experiments. DAA = dibromoadipic amide.

Chemical synthesis
Chemicals were purchased from Sigma-Aldrich, Thermo Fisher Scientific, Carl Roth, ChemShuttle, AppliChem, Acros Organics and Fluka. NMR spectra were measured on Bruker Avance II 300 and Bruker Avance II 400 spectrometers. Chemical shifts are referenced to the residual solvent signal. Peak multiplicity is abbreviated as follows: s = singlet, d = doublet, t = triplet, m = multiplet, td = triplet of doublets. Mass spectra were measured on a MicroTof ESI instrument from Bruker Daltonics.

N-(Hydroxymethyl)phenylacetamide
To a flask with phenylacetamide (4.0 g, 29.6 mmol, 1 eq) were added 30% formaldehyde solution (3 mL, 32.6 mmol, 1.1 eq), H2O (3 mL) and KOH (185 mg, 3.30 mmol, 0.1 eq). The suspension was heated to 70 °C for 10 min, thereby forming a clear solution. The mixture was cooled to rt and stirred overnight. DCM was added and the aqueous phase was extracted with DCM (3 × 10 mL). The combined organic phases were dried over Na2SO4 and concentrated under reduced pressure to yield the product as a white solid (4.1 g, 83%). The NMR data are in agreement with the literature. [2]

N-Boc-L-homocysteine methyl ester

Expression and purification of proteins with HcP
For incorporation of HcP (1) by amber stop codon suppression, E. coli BL21 (DE3) gold cells were co-transformed with the respective plasmids coding for the target protein and the mutant Mb PylRS(Y271M, L274A, C313A)/tRNA pair. [5,6] Lysogeny broth (LB) medium was inoculated with overnight cultures and the appropriate antibiotics were added (ampicillin 100 µg/mL, kanamycin 50 µg/mL and chloramphenicol 34 µg/mL). Cells were grown at 37 °C until an OD600 of 0.6–0.8 was reached. LB cultures were then pre-incubated with HcP (2 mM) for 15–30 min prior to addition of L-(+)-arabinose (0.2% w/v) and IPTG (0.4 mM) for induction (for thioredoxin constructs only arabinose was added). Cells were further incubated for 4 h at 37 °C (diSUMO, Trx and Ub constructs) or 28 °C (NRPS and Nb constructs). Next, cells were harvested by centrifugation (4000 rpm, 15 min, 4 °C) and cell pellets were resuspended in Ni-NTA buffer (50 mM Tris, 300 mM NaCl, pH = 7.5 / 8). Cell lysis was performed by sonication (SONOPLUS, Bandelin) and remaining cell debris was removed by centrifugation. Protein purification from the soluble fraction was performed by Ni-NTA affinity chromatography on gravity flow columns. Ni-NTA resins (Cube Biotech) were equilibrated with Ni-NTA buffer containing 20 mM imidazole prior to addition of the lysate. Resins were washed with Ni-NTA buffer containing 20 mM imidazole and the His-tagged target protein was eluted with Ni-NTA buffer containing 250 mM imidazole. Purified protein was dialyzed against Ni-NTA buffer in three steps. The last dialysis step was performed with buffer additionally containing glycerol (10%). diSUMO-V was purified with a yield of 30.2 mg/L, and diSUMO-VI with a yield of 26.0 mg/L.

Recombinant expression and purification of PGA
For production of PGA-His6 and PGA-SBP (streptavidin-binding peptide), E. coli BL21 (DE3) gold cells were transformed with the encoding plasmid. [7] LB medium containing 2 mM CaCl2 and kanamycin (50 µg/mL) was inoculated with an overnight culture. Cells were grown until OD = 1 and induced with IPTG (1 mM). Cells were cultivated for another 4 h at 28 °C and harvested by centrifugation (4000 rpm, 15 min, 4 °C). For purification of PGA-His6 and PGA-SBP, the cell pellets were resuspended in Ni-NTA buffer (pH = 7.5) or buffer W (100 mM Tris, 150 mM NaCl, pH = 8), respectively. Cell lysis was performed using an emulsifier (EmulsiFlex-C5, Avestin). Protein purification from the soluble fraction was performed by Ni-NTA affinity chromatography or SBP-tag affinity chromatography using 2.5% glycerol in washing and elution buffers. [7] Purified protein was dialyzed against Ni-NTA buffer in three steps (2 mM CaCl2 each; 2.5%, 2.5% and 10% glycerol). Unless otherwise stated, PGA-His6 was used in the following experiments.

Western Blot
To detect His-tagged proteins, an anti-6x His epitope tag antibody (rabbit, ROCKLAND) was used as the primary antibody. A corresponding HRP-tagged anti-rabbit antibody (Dako) was used as the secondary antibody. Detection by chemiluminescence was performed with the ECL Western Blotting Analysis System (GE Healthcare).

HcP deprotection to reveal the latent thiol group
If not indicated otherwise, deprotection assays were performed by diluting HcP-containing proteins to a concentration of 10 µM in Ni-NTA buffer (pH = 7.5). PGA was added in varying concentrations and the solution was incubated at 25 °C. The reaction was stopped by addition of formic acid to reach pH = 1–2. Samples were then analyzed as intact proteins by LC-MS. Time-course measurements of the deprotection of 2 (Figure 1E) and deprotection assays of diSUMO and Trx constructs (Figures 2B and C) were performed in two technical repeats.
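The quantities in this assay can be illustrated with a short Python sketch. It converts enzyme equivalents into a molar concentration and estimates the deprotected fraction from two deconvoluted peak intensities. The function names, the intensity values and the assumption that protected and deprotected protein ionize equally well are illustrative only and are not part of the procedure described above.

```python
def enzyme_concentration_nM(protein_uM: float, equivalents: float) -> float:
    """Enzyme concentration in nM for a given substrate protein
    concentration (µM) and enzyme equivalents relative to it."""
    return protein_uM * equivalents * 1000.0


def deprotected_fraction(intensity_protected: float,
                         intensity_deprotected: float) -> float:
    """Fraction of deprotected protein from two deconvoluted peak
    intensities (assumption: equal ionization efficiency)."""
    total = intensity_protected + intensity_deprotected
    if total <= 0:
        raise ValueError("peak intensities must sum to a positive value")
    return intensity_deprotected / total


# 10 µM protein with 0.001 eq. PGA, as in the time-course experiment
pga_nM = enzyme_concentration_nM(10.0, 0.001)

# Hypothetical peak intensities for one time point
fraction = deprotected_fraction(intensity_protected=2.0e5,
                                intensity_deprotected=6.0e5)
print(f"PGA: {pga_nM:.1f} nM, deprotected: {fraction:.0%}")
```

With 0.001 eq. of enzyme, the 10 µM substrate is turned over by only about 10 nM PGA.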

Mass spectrometry of intact proteins
For mass analyses of intact proteins, an UltiMate™ 3000 RS system (Thermo Fisher Scientific Inc., MA, USA) was connected to a maXis II UHR-TOF mass spectrometer (Bruker Daltonik GmbH, Bremen, Germany). A standard ESI source (Apollo, Bruker Daltonik GmbH, Bremen, Germany) was used. Samples were acidified with formic acid to reach a pH of 1–2. Except for Trx, Nb and Ub, samples were reduced for 10 min with TCEP (2 mM) prior to MS analysis to prevent inhomogeneity. After centrifugation (14000 rpm, 2 min, 4 °C), samples were injected into the LC-MS system. A C4 column (Advance Bio RP-mAb C4, 2.1 mm x 50 mm, 3.5 µm, Agilent Technologies, Waldbronn, Germany) was used at a flow rate of 0.6 mL/min with eluents A and B (eluent A: 0.1% formic acid in H2O; eluent B: 0.1% formic acid in acetonitrile). A desalting period (7 min, 5% B) was followed by a steep gradient (5-60% B in 2 min). As MS settings, a capillary voltage of 4500 V, an end-plate offset of 500 V, a dry temperature of T = 200 °C and a mass range of m/z 300-3000 were chosen. The nebulizer was set to 3.5 bar and the flow rate of the dry gas was 8.0 L/min. DataAnalysis 4.4 (Bruker Daltonik GmbH, Bremen, Germany) was used as analysis software. The software includes a MaxEnt algorithm, which was used for deconvolution.
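To see which charge states of an intact protein fall inside the acquisition window of m/z 300-3000, the relation m/z = (M + z · m_proton) / z can be evaluated for each charge z. The following Python sketch does this; the example mass of 25 000 Da and the function names are hypothetical and do not correspond to a specific construct described here.

```python
PROTON_MASS = 1.007276  # Da


def mz(mass_da: float, charge: int) -> float:
    """m/z of the [M + zH]^z+ ion of a protein with neutral mass M."""
    return (mass_da + charge * PROTON_MASS) / charge


def charge_states_in_window(mass_da: float,
                            mz_min: float = 300.0,
                            mz_max: float = 3000.0,
                            max_charge: int = 200) -> list[int]:
    """Charge states whose m/z lies within the acquisition window."""
    return [z for z in range(1, max_charge + 1)
            if mz_min <= mz(mass_da, z) <= mz_max]


# Hypothetical 25 kDa protein
states = charge_states_in_window(25_000.0)
print(f"observable charge states: {states[0]}+ to {states[-1]}+")
```

Deconvolution then maps the resulting charge-state envelope back onto a single neutral mass.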

diSUMO cleavage with SENP1 analyzed by SDS-PAGE
For SDS-PAGE analysis, diSUMO FRET sensor 2* (10 µM) was incubated with and without the SUMO protease SENP1 (0.1 µM) for 2 h at 25 °C in PBS buffer (pH = 7.4). The reactions were stopped by addition of SDS-buffer and heating to 98 °C for 10 min. Samples were visualized by UV illumination (excitation: 609 nm (AF647); 535 nm (AF555)) using an Intas ECL Chemostar and by Coomassie Brilliant Blue staining. For LC-MS analysis of the cleavage reaction products, the reaction was stopped by heating (98 °C) and the sample was acidified with formic acid to reach pH = 1–2.

diSUMO cleavage with SENP1 analyzed by FRET spectroscopy
FRET measurements of diSUMO FRET sensor 2* (2 µM) were conducted in the presence of 2 mM TCEP and 0.1% Tween in PBS buffer (pH = 7.4). Spectroscopic analysis of the SENP1 cleavage assay was performed with a Tecan plate reader (Infinite® M1000 PRO) using 384-well microplates (Greiner bio-one, PS, Flat Bottom, non-binding, Black). Emission spectra (ex.: 520 nm) of 2* incubated with or without SENP1 for 2 h at 25 °C were measured and corrected for direct excitation of the acceptor dye. For this purpose, the emission spectrum of a diSUMO control protein labeled only with the acceptor dye AF647 (ex.: 520 nm) was subtracted from the emission spectrum of 2*. For the SENP1 cleavage assay of 2*, the donor dye AF555 was excited at 520 nm. SENP1 was added 10 min after the beginning of the recording. Emission of the donor dye was measured at 570 nm (ID) and emission of the acceptor dye at 665 nm (IA). The FRET ratio (IA / ID) was corrected for direct excitation of the acceptor and cross talk of the donor as well as for the buffer background. For cross talk determination, diSUMO was labeled with only the donor dye or only the acceptor dye. For direct excitation of the acceptor, the only-acceptor labeled diSUMO was excited at 520 nm and its emission at 665 nm was measured. The donor cross talk was determined from the emission of the only-donor labeled diSUMO at 665 nm when excited at 520 nm. The FRET ratio was normalized to the ratio at t = 0. Experiments were performed in three technical repeats (Figure S8).
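One possible way to apply these corrections is sketched below in Python. The exact formula is not given in the text, so the scheme is an assumption: the buffer background is subtracted from both channels, the direct acceptor excitation (taken from the only-acceptor control at the same concentration) is subtracted from IA, and the donor cross talk is removed as a fixed fraction of the corrected donor signal. All names and numbers are illustrative.

```python
def corrected_fret_ratio(ia_raw: float, id_raw: float,
                         ia_buffer: float, id_buffer: float,
                         acceptor_direct: float,
                         donor_crosstalk: float) -> float:
    """Corrected IA/ID for one time point.

    acceptor_direct: buffer-subtracted 665 nm emission of the
        only-acceptor control (ex. 520 nm).
    donor_crosstalk: ratio of 665 nm to 570 nm emission of the
        only-donor control (ex. 520 nm).
    """
    id_corr = id_raw - id_buffer
    ia_corr = ia_raw - ia_buffer - acceptor_direct - donor_crosstalk * id_corr
    return ia_corr / id_corr


def normalize_to_start(ratios: list[float]) -> list[float]:
    """Normalize a time course to its value at t = 0."""
    return [r / ratios[0] for r in ratios]


# Hypothetical readings before and after SENP1 cleavage
ia = [900.0, 500.0]
id_ = [1100.0, 2100.0]
ratios = [corrected_fret_ratio(a, d, ia_buffer=100.0, id_buffer=100.0,
                               acceptor_direct=50.0, donor_crosstalk=0.05)
          for a, d in zip(ia, id_)]
normalized = normalize_to_start(ratios)
print([round(r, 3) for r in normalized])
```

In this sketch, cleavage of the sensor shows up as a drop of the normalized ratio below 1, because the acceptor signal falls while the donor signal rises.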

FRET measurements with NRPS FRET sensors
FRET measurements with the NRPS constructs holo-5* and holo-6* were performed with a Tecan plate reader (Infinite® M1000 PRO) using 384-well microplates (Greiner bio-one, PS, Flat Bottom, non-binding, Black). holo-5* and holo-6* (0.3 µM) were mixed with and without ATP and L-Phe (2 mM each) in a total volume of 50 µL in assay buffer (50 mM HEPES, 100 mM NaCl, 1 mM EDTA, 10 mM MgCl2, pH = 7) and incubated for 30 min before starting the measurement (25 °C). AF555 was excited at 520 nm and emission was measured from 560 to 800 nm (bandwidth = 5 nm). AF647 was excited at 650 nm and emission was measured from 660 to 800 nm. The emissions obtained at 570 nm (Id) and at 674 nm (Ia) upon donor excitation were normalized to the acceptor-only emission upon acceptor excitation, thus correcting for concentration fluctuations. The FRET ratio Ia/Id was calculated and the values of six replicates were averaged.
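The evaluation described above can be sketched in Python as follows. The replicate values are invented, and the reading that both channels are divided by the same acceptor reference signal is an assumption. Under that reading the factor cancels in Ia/Id, so the normalization affects how individual spectra compare but not the ratio itself.

```python
from statistics import mean, stdev


def normalized_channels(i_d: float, i_a: float,
                        acceptor_ref: float) -> tuple[float, float]:
    """Donor and acceptor emission (donor excitation), each divided by the
    acceptor emission measured at acceptor excitation."""
    return i_d / acceptor_ref, i_a / acceptor_ref


def fret_ratio(i_d: float, i_a: float, acceptor_ref: float) -> float:
    """FRET ratio Ia/Id from the normalized channels."""
    d_norm, a_norm = normalized_channels(i_d, i_a, acceptor_ref)
    return a_norm / d_norm


# Hypothetical replicates: (Id at 570 nm, Ia at 674 nm, acceptor reference)
replicates = [
    (1000.0, 500.0, 2000.0),
    (1100.0, 528.0, 2200.0),
    (900.0, 468.0, 1800.0),
    (1000.0, 510.0, 2000.0),
    (1000.0, 490.0, 2000.0),
    (1000.0, 500.0, 2100.0),
]
ratios = [fret_ratio(*r) for r in replicates]
mean_ratio = mean(ratios)
print(f"FRET ratio: {mean_ratio:.3f} ± {stdev(ratios):.3f} (n = {len(ratios)})")
```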

Mammalian cell culture and nanobody binding assay
HeLa cells were cultured in EMEM (supplemented with 10% fetal calf serum, 1% non-essential amino acids and 1% L-glutamine) at 37 °C and 5% CO2. Half-confluent cells were seeded in 35 mm dishes and used for transient transfection via calcium phosphate coprecipitation with plasmid pMBH63, encoding HA-EGFP-Trx-TMD-mCherry with a signal sequence for transport to the plasma membrane. [8] After 24 h of incubation, the binding assay was performed. Cells were washed three times with PBS and dually labeled Nb 7* was added to 1 mL of fresh medium at a final concentration of 10 nM. After 5 min of incubation at 25 °C, cells were washed three times with PBS and iFluor405-streptavidin conjugate was added to 1 mL of fresh medium at a final concentration of 30 nM. Again, after 5 min of incubation at 25 °C, cells were washed three times with PBS and fixed by addition of 500 µL paraformaldehyde (4%).

Kan
QGTVAEVLGKDFVKFDKDIRRNYWPDAIRAQIAALSPED
MSILQGYADGMNAWIDKVNTNPETLLPKQFNTFGFTPK
RWEPFDVAMIFVGTMANRFSDSTSEIDNLALLTALKDKY
GVSQGMAVFNQLKWLVNPSAPTTIAVQESNYPLKFNQQ
NSQTAALLPRYDLPAPMLDRPAKGADGALLALTAGKNR
ETIAAQFAQGGANGLAGYPTTSNMWVIGKSKAQDAKAI
MVNGPQFGWYAPAYTYGIGLHGAGYDVTGNTPFAYPG
LVFGHNGVISWGSTAGFGDDVDIFAERLSAEKPGYYLH
NGKWVKMLSREETITVKNGQAETFTVWRTVHGNILQTD
QTTQTAYAKSRAWDGKEVASLLAWTHQMKAKNWQEW
TQQAAKQALTINWYYADVNGNIGYVHTGAYPDRQSGH
DPRLPVPGTGKWDWKGLLPFEMNPKVYNPQSGYIANW
NNSPQKDYPASDLFAFLWGGADRVTEIDRLLEQKPRLT
ADQAWDVIRQTSRQDLNLRLFLPTLQAATSGLTQSDPR
RQLVETLTRWDGINLLNDDGKTWQQPGSAILNVWLTSM
LKRTVVAAVPMPFDKWYSASGYETTQDGPTGSLNISVG
AKILYEAVQGDKSPIPQAVDLFAGKPQQEVVLAALEDTW
ETLSKRYGNNVSNWKTPAMALTFRANNFFGVPQAAAE
ETRHQAEYQNRGTENDMIVFSPTTSDRPVLAWDVVAP
GQSGFIAPDGTVDKHYEDQLKMYENFGRKSLWLTKQD
VEAHKESQEVLHVQRMDEKTTGWRGGHVVEGLAGELE
QLRARLEHHPQGQREP and [7]
SENP1 GST-SENP1(E417-L644) pGST-Senp1 pGEX4T-1 Amp [12]

Supporting References