Efficient Chemical Protein Synthesis using Fmoc‐Masked N‐Terminal Cysteine in Peptide Thioester Segments

Abstract We report an operationally simple method to facilitate chemical protein synthesis by fully convergent and one‐pot native chemical ligations utilizing the fluorenylmethyloxycarbonyl (Fmoc) moiety as an N‐masking group of the N‐terminal cysteine of the middle peptide thioester segment(s). The Fmoc group is stable to the harsh oxidative conditions frequently used to generate peptide thioesters from peptide hydrazides or o‐aminoanilides. Because Fmoc‐Cys(Trt)‐OH, in which the Fmoc group is pre‐installed on the cysteine residue, is readily available and routinely used in Fmoc solid‐phase peptide synthesis, the additional steps required for temporary protection of N‐terminal cysteinyl peptides are minimized. The Fmoc group is readily removed after ligation by short exposure (<7 min) to 20 % piperidine at pH 11 in aqueous conditions at room temperature. Subsequent native chemical ligation reactions can be performed in the presence of piperidine in the same solution at pH 7.


Introduction
Native chemical ligation (NCL) [1] of unprotected peptide segments made by solid-phase peptide synthesis (SPPS) [2] has enabled chemical access to a variety of functional protein molecules. A typical functional protein domain comprises 100 to 200 amino acid residues. Chemical syntheses of such large protein domains can only be realized through multisegment native chemical ligation reactions. [3] However, traditional multisegment ligation strategies involve multiple purification and freeze-drying sequences, which are time-consuming, laborious, and low-yielding owing to handling losses in the intermediate purification steps. In this regard, an important step forward in the field of chemical protein synthesis has been one-pot native chemical ligation, in which multiple peptide segments are ligated sequentially without intermediate purifications. [4] Another major advance has been the introduction of fully convergent chemical protein synthesis using kinetically controlled ligation [5a] or peptide hydrazide chemistry. [5b] In convergent synthesis, two halves of the polypeptide chain, synthesized independently in parallel from two or more peptide segments using NCL, are joined in a final step to give the full-length polypeptide chain. Fully convergent chemical syntheses are, in principle, more efficient in terms of purity and yield [6] and have been widely applied to the synthesis of a number of large protein molecules. [5]
Both one-pot multisegment ligation from the C-terminal peptide segment towards the N-terminal segment and fully convergent synthesis mandate temporary protection of the reactive cysteine residue located at the N-terminus of the middle peptide thioester segment(s). In the first demonstration of one-pot NCL, Bang et al. utilized (4R)-1,3-thiazolidine carboxylic acid (Thz) as a cryptic form of the N-terminal Cys residue of the middle peptide thioester segment prepared by Boc-chemistry SPPS. [4a] The same strategy was used for the synthesis of several other moderately sized protein molecules, where peptide thioester syntheses were performed using Boc-chemistry SPPS. [4b,c] Liu and co-workers introduced peptide hydrazides [7a] and Dawson's group introduced peptide o-aminoanilides [7b-d] as surrogates for peptide thioester segments used in chemical protein synthesis. Because peptide hydrazides and o-aminoanilides are readily prepared by Fmoc-chemistry SPPS, they are widely used. However, the thiazolidine group was found to be incompatible with the oxidative NaNO₂ treatment necessary for activation of the peptide thioester surrogates, peptide o-aminoanilides and peptide hydrazides, [7e] prepared by the more widely used Fmoc-chemistry SPPS. This observation compelled researchers to develop alternative chemical tactics for one-pot ligations and convergent syntheses compatible with peptide o-aminoanilides [5e, 7c] and peptide hydrazides. [4d-h, 5b] Although several alternative cysteine protecting groups (Figure 1 A) have been reported for these applications, none has found widespread use because they require either a multistep synthetic route to protect cysteine or careful chemical manipulations to prevent side reactions during deprotection of the cysteine in aqueous solution. [4d-h]
Herein, we report the use of the fluorenylmethyloxycarbonyl (Fmoc) group, [8] pre-installed on cysteine for routine Fmoc-chemistry SPPS, as an operationally simple and robust means of masking the N-terminal reactive cysteine residue of peptide (Cys-peptide) thioester segments. We show that the Fmoc group is stable to NaNO₂ treatment and can be removed quantitatively by brief exposure to 20 % piperidine in aqueous ligation buffer. Moreover, the presence of piperidine in the ligation buffer does not interfere with the subsequent ligation reaction at neutral pH, thereby enabling multisegment peptide ligations in a one-pot manner. Clean conversion in every synthetic step gives high-purity full-length polypeptide in excellent overall yields.

Results and Discussion
The success of the proposed one-pot synthetic strategy (Figure 1 B) relied primarily on the efficient removal of the Fmoc group in aqueous buffer. Using the model peptide Fmoc-Cys-Leu-Tyr-Arg-Ala-Tyr-αCONHNH₂ (1), in which the Fmoc group of the N-terminal Cys residue was left intact (Figure 2 A), we found that 20 % (v/v) aqueous piperidine at pH 11 was optimal, giving quantitative removal of the Fmoc group within 7 minutes (Figure 2 B; for optimization details see Section S3 in the Supporting Information).

Fmoc Removal and NCL in the Same Reaction Mixture
To evaluate the feasibility of Fmoc removal combined with native chemical ligation in the same reaction mixture, we ligated a model thioester peptide Gly-Cys-Pro-Arg-Ile-Leu-Met-Arg-αCOSCH₂CH₂SO₃Na (3) with the model Cys-peptide hydrazide Cys-Leu-Tyr-Arg-Ala-Tyr-αCONHNH₂ (2) in standard ligation buffer (200 mM phosphate, 6 M Gu·HCl, 20 mM TCEP), as shown in Figure 2 A. The Cys-peptide hydrazide 2 was obtained by Fmoc removal from peptide 1 using 20 % piperidine in ligation buffer at pH 11.0. After Fmoc removal, the subsequent NCL with the thioester peptide 3 was carried out in the same solution at neutral pH in the presence of an exogenous aryl thiol catalyst (20 mM 4-mercaptophenylacetic acid (MPAA)) and gave the desired ligated product 4 without formation of any undesired side products (Figure 2 C). The presence of 20 % piperidine in the reaction mixture did not interfere with the native chemical ligation. Since the pKa of piperidine is 11.1, the piperidine presumably remained protonated at neutral pH, thereby preventing nucleophilic attack on the thioester.
Figure 1. A) [… 4h] are previously reported and VII is this work, for one-pot NCL. B) Schematic representation of multisegment one-pot NCL using Fmoc as the temporary masking group of the reactive N-terminal Cys residue.
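The protonation argument for piperidine can be made semi-quantitative with the Henderson-Hasselbalch relation. The sketch below is illustrative only: it assumes ideal dilute-solution behaviour and takes the quoted pKa of 11.1 at face value, neither of which is guaranteed in 6 M Gu·HCl containing 20 % piperidine, and the function name is hypothetical rather than taken from the original work.

```python
def fraction_free_base(pH: float, pKa: float = 11.1) -> float:
    """Fraction of an amine present as the neutral, nucleophilic free base.

    Henderson-Hasselbalch: [B] / [BH+] = 10 ** (pH - pKa).
    Assumes ideal behaviour; pKa = 11.1 is the value quoted in the text.
    """
    ratio = 10.0 ** (pH - pKa)      # [B] / [BH+]
    return ratio / (1.0 + ratio)    # [B] / ([B] + [BH+])


# Compare the ligation pH with the Fmoc-removal pH.
for pH in (7.0, 11.0):
    print(f"pH {pH:4.1f}: free-base fraction = {fraction_free_base(pH):.2e}")
```

Under these assumptions less than 0.01 % of the piperidine is present as the free base at pH 7, compared with roughly 44 % at pH 11, which is consistent with the observation that piperidine does not compete with the ligation at neutral pH.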

Compatibility of Asn-Gly and Asp-Gly Sequences with Piperidine Treatment
Asn-Gly and Asp-Gly sequences make a peptide vulnerable to deamidation or iso-aspartic acid formation through an aspartimide intermediate at high pH, such as the pH 11 used to remove the Fmoc group in our method. [9] It was therefore imperative to check the compatibility of Asn-Gly and Asp-Gly sequences with the Fmoc removal conditions used in our study. To assess their stability, we incubated two model peptides containing Asn-Gly or Asp-Gly in their sequences at pH 11 in the presence of 20 % piperidine. We found no detectable iso-aspartyl peptide formation from the Asp-Gly sequence and only negligible amounts of deamidated product and iso-aspartyl peptide from the Asn-Gly sequence upon exposure to the optimized Fmoc removal conditions at pH 11, even for up to 14 minutes (see Section S4.1 and Figure S7a-c in the Supporting Information).

Multisegment Polypeptide Synthesis
To evaluate the efficacy of consecutive one-pot NCL reactions using Fmoc-Cys-peptide thioester segments, we carried out a typical synthesis of a larger polypeptide through C-to-N one-pot ligations without intermediate purification steps. We selected an 86-residue polypeptide segment (11, Cys217-Cys302) from Plasmodium falciparum protein Pf-AMA1 (3D7 strain), which contains a pH-sensitive Asn-Gly sequence, as the target for synthesis (Figure 3 A).
For the one-pot three-segment C-to-N sequential ligation synthesis (Figure 3 B), the Fmoc protection of the N-terminal Cys residue of both the middle segment and the N-terminal segment was left in place after chain assembly by standard Fmoc-chemistry SPPS. The corresponding peptide thioesters, Fmoc-Cys247-Phe274-αCOSCH₂CH₂SO₃Na (6) and Fmoc-Cys217-Lys246-αCOSCH₂CH₂SO₃Na (7), respectively, were obtained by activating the peptide hydrazides with NaNO₂ followed by thiolysis with sodium 2-mercaptoethanesulfonate (MESNa). The Fmoc group in both peptides remained intact during the NaNO₂ treatment, and the thioester peptides were obtained in excellent crude purity (see Section S6 in the Supporting Information).
With all three peptide segments in hand, we first ligated peptides 5 and 6 in ligation buffer (pH 6.9) containing 100 mM MPAA […] terminal peptide segment 7 in one pot, the pH of the reaction mixture was brought down to 6.9 [Figure 3 C, (d)]. After completion of the ligation reaction [Figure 3 C, (e)], we again raised the pH of the reaction mixture to 11, keeping the piperidine concentration at 20 % (v/v), to remove the Fmoc group from the ligated polypeptide 10 [Figure 3 C, (f)]. After the two ligations and two Fmoc-removal steps in one pot, a single purification was carried out to obtain full-length polypeptide 11 in 37 % yield based on the limiting peptide 5 (Figure 3 D).
Figure 3. Multisegment polypeptide synthesis with one-pot ligation sequences. A) Sequence of the polypeptide segment of P. falciparum protein Pf-AMA1 (3D7 strain) 11. NCL sites are underlined and highlighted in red and green. B) One-pot ligation without any intermediate purification step resulted in the final polypeptide in 37 % overall yield. R' = CH₂CH₂SO₃Na. C) Analytical HPLC monitoring of the one-pot ligation reaction: a) 3 min after the addition of the peptide Cys275-Cys302-COOH (5) and Fmoc-Cys247-Phe274-αCOSCH₂CH₂SO₃Na (6) in the standard ligation buffer (200 mM PB, 6 M Gu·HCl, 20 mM TCEP) containing 20 mM MPAA; 6' and 6'' indicate thiolactone formation from 6; b) the first ligation, within 24 h, resulted in Fmoc-Cys247-Cys302-COOH (8) as the ligated product; c) Fmoc removal from the ligated peptide 8 to give peptide Cys247-Cys302-COOH (9); d) 3 min after the addition of peptide Fmoc-Cys217-Lys246-αCOSCH₂CH₂SO₃Na (7) to the reaction mixture; e) the second ligation was essentially complete within 20 h and furnished polypeptide Fmoc-Cys217-Cys302-COOH (10); f) Fmoc removal from the ligated product 10 to give target peptide Cys217-Cys302-COOH (11). * and ** indicate dibenzofulvene-TCEP adduct and MPAA, respectively. D) Analytical HPLC profile (λ = 214 nm) with ESI-MS showing charge-state distribution (inset) of the purified polypeptide 11; observed mass 10 123.28 ± 0.06 Da (average of the eight most abundant charge states) and calculated mass 10 123.21 Da (average isotope composition).
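To put the 37 % overall yield in perspective, it can be converted into an average yield per step. The following is a minimal sketch, not part of the original work: the text reports only the overall yield, so the per-step figure is a geometric mean that also absorbs losses from the final purification, and the function name is hypothetical.

```python
def mean_step_yield(overall_yield: float, n_steps: int) -> float:
    """Geometric-mean yield per step that reproduces a given overall yield.

    overall_yield is a fraction (0.37 for 37 %); n_steps is the number of
    consecutive steps assumed to share the losses equally.
    """
    if not 0.0 < overall_yield <= 1.0:
        raise ValueError("overall_yield must be a fraction in (0, 1]")
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return overall_yield ** (1.0 / n_steps)


# Two ligations plus two Fmoc removals = four one-pot steps (from the text).
per_step = mean_step_yield(0.37, 4)
print(f"average yield per step: {per_step:.1%}")
```

With these inputs the average comes out near 78 % per step; because the purification loss is folded into this number, the chemical conversions themselves are presumably higher.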
Having established an efficient method for Fmoc removal and NCL reactions in the same reaction solution, we next explored the utility of this method for the total chemical synthesis of a typical protein. Human lysozyme consists of a polypeptide chain of 130 amino acid residues that includes all twenty of the common proteinogenic amino acids (Figure 4 A). It has been employed as a model system in various biochemical and biophysical studies directed towards understanding enzyme catalysis, protein folding, and amyloidogenesis. [10a-c] For the total chemical synthesis, we strategically divided the full-length lysozyme polypeptide into four peptide segments. [5c]

One-Pot Synthesis of the Human Lysozyme Polypeptide Chain
The one-pot synthetic strategy is shown in Figure 4 B. Synthesis of the peptide segments is described in Sections S7.1-S7.4 in the Supporting Information. The first NCL reaction between the Cys-peptide 12 and the peptide thioester 13 at pH 6.7 afforded the ligated product, which was then treated with 20 % (v/v) piperidine at pH 11 to furnish the deprotected peptide 17 within 7 min (see Section S7.5 and Figure S21 in the Supporting Information). We then lowered the pH of the reaction mixture and carried out the second ligation reaction with the peptide segment 14a at pH 6.7. The next Fmoc removal cycle in the same reaction mixture, performed simply by raising the pH to 11.0 while keeping the piperidine concentration at 20 % (v/v), led to complete removal of the Fmoc group from the resultant ligated peptide 18 to afford the deprotected polypeptide 19. The same pH adjustments were repeated for the final ligation reaction with the peptide thioester segment 15 in one pot to deliver the full-length lysozyme polypeptide 20. After five synthetic steps, a single purification was carried out to give full-length human lysozyme polypeptide 20 in good yield (36 % based on the starting peptide segment 12). The homogeneity of the synthetic product and its mass were confirmed by LC-MS [Figure 4 D, (a)]. The full-length polypeptide had an observed mass of 14 700.70 ± 0.04 Da (average isotope composition) and a calculated mass of 14 700.63 Da (average isotope composition).
[Figure caption fragment: … [10f] showing an RMSD for Cα atoms of 0.132 Å.]

Convergent Synthesis of the Human Lysozyme Polypeptide Chain
In principle, fully convergent synthesis has advantages over consecutive synthetic reactions or partially convergent synthesis. [6] In order to demonstrate the application of Fmoc protection of N-terminal Cys in peptide thioester segments in a fully convergent synthesis, we prepared the same lysozyme polypeptide as depicted in Figure 4 C.
For the convergent synthesis of the lysozyme polypeptide chain, the N-terminal half of the full-length polypeptide was prepared by NCL of the Cys-terminal peptide hydrazide 14b and the peptide thioester 15 to afford polypeptide hydrazide 21 (see Figure S23). The C-terminal half of the full-length lysozyme polypeptide was obtained, as discussed for the one-pot strategy, by NCL of Cys-peptide 12 and the Fmoc-protected peptide thioester 13, followed by piperidine-mediated Fmoc removal to give polypeptide 17 (see Figure S24). Both 17 and 21 were purified by reverse-phase HPLC. Polypeptide hydrazide 21 was then converted to the thioester by treatment with NaNO₂ at reduced temperature followed by addition of MPAA, after which Cys-peptide 17 was added to effect the final NCL reaction at pH 6.7, affording the full-length lysozyme polypeptide 20 (see Figure S25) in 48 % yield (based on the starting peptide segment 17) after purification. The full-length polypeptide had an observed mass of 14 700.72 ± 0.03 Da (average isotope composition) and a calculated mass of 14 700.63 Da (average isotope composition). The purities of the full-length lysozyme polypeptide (20) obtained from the convergent synthesis and from the one-pot sequential reactions were similar [Figure 4 D, (a,b)], although the overall yield was slightly reduced.
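The agreement between observed and calculated masses for polypeptide 20 can be expressed as a relative error. The snippet below is a hedged illustration using only the masses quoted in the text for the one-pot and convergent routes; the helper name is hypothetical, and expressing the deviation in parts per million is a choice made here, not something reported in the original work.

```python
def mass_error_ppm(observed: float, calculated: float) -> float:
    """Relative deviation of an observed mass from the calculated mass, in ppm."""
    return (observed - calculated) / calculated * 1e6


CALCULATED_DA = 14700.63  # calculated mass of polypeptide 20 (from the text)

# Observed masses of polypeptide 20 from the two synthetic routes (from the text).
observed_da = {"one-pot": 14700.70, "convergent": 14700.72}

for route, mass in observed_da.items():
    print(f"{route:10s}: {mass_error_ppm(mass, CALCULATED_DA):+.1f} ppm")
```

Both routes deviate from the calculated mass by only about 5 to 6 ppm, that is, by less than 0.1 Da on a polypeptide of roughly 14.7 kDa.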

Folding to Obtain Correctly Disulfide-Linked Human Lysozyme
Finally, the full-length polypeptide 20 was folded with concomitant formation of disulfide bonds in a buffer containing a redox system of 5 mM oxidized glutathione and 2 mM DTT in the presence of 0.8 M Gu·HCl and 1 mM EDTA in 0.1 M TRIS buffer (pH 8) to furnish folded lysozyme protein 22, after HPLC purification, in 38 % yield of isolated product (based on the amount of lysozyme polypeptide 20). [5c] The folded lysozyme protein had an observed mass of 14 692.71 ± 0.07 Da (average of the most abundant isotopologue mass over two observed charge states), 8.0 Da less than the mass of the full-length polypeptide chain (14 700.7 Da), which is in excellent agreement with the formation of four disulfide bonds with loss of eight protons [Figure 4 D, (c)]. The shift to earlier retention time in reverse-phase HPLC compared with the unfolded lysozyme molecule and the altered charge distribution in the ESI-MS, both characteristic of a folded globular protein, were clearly evident for our chemically synthesized lysozyme molecule. To the best of our knowledge, this is only the second successful total synthesis of human lysozyme, [5c] after several unsuccessful attempts made in the 1970s by leading laboratories. [10d,e]
To confirm the correct disulfide bonding pattern and the 3D structure of the folded protein molecule, we crystallized the chemically synthesized lysozyme protein using the reported [5c] crystallization conditions and determined its 3D structure by X-ray crystallography. The lysozyme crystals were obtained by mixing 2 μL of a solution containing 14 mg mL⁻¹ of folded synthetic human lysozyme 22 in 120 mM LiCl, 2.5 mM HEPES, pH 7.5, with 2 μL of reservoir solution consisting of 30 mM sodium phosphate, 2.5 M NaCl, pH 4.9, using the hanging-drop vapor diffusion method. We used a crystal that diffracted to 1.46 Å resolution to solve the structure by molecular replacement using the coordinates from PDB ID: 2NWD as the search model for phase determination. The crystallographic model was finally refined using Phenix. [11] The refined model had an Rwork/Rfree of 16.2 %/19.6 %. The obtained 3D model (
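The mass shift reported above for folding can be checked with simple arithmetic: each disulfide bond forms with loss of two hydrogen atoms. The sketch below is illustrative only. It assumes the average atomic mass of hydrogen (1.00794 Da), uses the one-pot observed mass of polypeptide 20 (14 700.70 Da) as the unfolded reference, and ignores the fact that the folded and unfolded masses are quoted on slightly different bases in the text; the names are hypothetical.

```python
H_AVG_MASS_DA = 1.00794  # average atomic mass of hydrogen (assumed constant)


def disulfide_mass_loss(n_bonds: int) -> float:
    """Expected mass decrease when n_bonds disulfides form (2 H lost per bond)."""
    return 2 * n_bonds * H_AVG_MASS_DA


# Masses quoted in the text (Da).
unfolded_observed = 14700.70   # full-length polypeptide 20, one-pot route
folded_observed = 14692.71     # folded lysozyme 22

expected_shift = disulfide_mass_loss(4)            # four disulfide bonds
observed_shift = unfolded_observed - folded_observed

print(f"expected shift: {expected_shift:.2f} Da")
print(f"observed shift: {observed_shift:.2f} Da")
```

The expected loss of about 8.06 Da and the observed difference of about 7.99 Da agree to within 0.1 Da, in line with the formation of four disulfide bonds.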

Conclusion
In summary, we have developed a robust and operationally simple method for total protein synthesis by multisegment, fully convergent, or one-pot native chemical ligations using the Fmoc moiety as a temporary masking group of the N-terminal reactive Cys residue of the peptide thioester segment(s). The Fmoc group was fully compatible with the NaNO₂ treatment frequently used for the generation of peptide thioesters from peptide hydrazides or peptide o-aminoanilides. Fmoc removal was readily achieved by a short exposure of the peptide to 20 % piperidine in aqueous conditions. The presence of piperidine did not interfere with the subsequent one-pot ligation reactions, which are usually carried out at pH 7. Notably, our method provides an alternative and highly efficient approach for high-yielding chemical protein synthesis.