Protein labeling is a pivotal technique in molecular and cell biology. Strategies include derivatization of cysteine residues,1 labeling lysine or N-terminal amino groups with activated esters, periodate or PLP-mediated oxidation of the N-terminus for oxime ligation,2 and native-chemical ligation.3 Each of these methods has its own associated challenges: selective labeling of a single cysteine residue frequently requires rounds of site-directed mutagenesis to introduce the labeling site and/or remove other cysteine residues, and selective labeling of the N-terminal amino group requires careful control of pH to ensure lysine residues are not also modified.4 Other modern methods for chemoselective labeling often require the introduction of specific recognition sequences5 or nonnatural amino acids into the protein to be labeled.6 In most cases, a substantial excess of the labeling reagent is necessary to ensure complete conversion to the product. We report a method for chemoselective N-terminal labeling of recombinant proteins in quantitative yield using depsipeptide substrates for the transpeptidase sortase A. The method does not require engineering of the protein sequence beyond that typically used in contemporary recombinant protein purification strategies. Unlike previous approaches,7 the method requires only a single N-terminal glycine residue in a sterically unhindered position, a minimal excess of the labeling reagent and substoichiometric quantities of transpeptidase.
Sortase A (SrtA) catalyzes the reversible attachment of virulence factors to the cell walls of Gram-positive bacteria by C-terminal modification of proteins at an LPXTG recognition sequence.8 The enzyme catalyzes the covalent attachment of the LPXT motif to a cysteine residue in the catalytic site to form a thioester intermediate. An N-terminal oligoglycine motif in the peptidoglycan can then react with this intermediate to covalently attach the substrate to the cell wall. SrtA has been exploited extensively for the C-terminal modification of proteins.7b, 9 However, this method has certain constraints: the LPXTG sequence must be engineered into the protein, and because the transpeptidase reaction is reversible, an excess of the nucleophilic labeling reagent is required to push the equilibrium toward formation of product (Scheme 1 a).
N-terminal labeling of proteins using SrtA has the potential advantage that only a single N-terminal glycine is required on the protein.10 Since many commercial expression plasmids incorporate protease recognition sequences which yield an N-terminal glycine after cleavage, such an approach is potentially widely applicable. However, reversibility of the SrtA-catalyzed reaction is a greater problem for N-terminal labeling of proteins.
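The two sequence requirements described above (an LPXTG recognition sequence for the acyl donor and an N-terminal glycine on the protein to be labeled) can be illustrated with a minimal sketch. The function names are hypothetical and introduced here only for illustration. The check is purely sequence-based: it ignores terminal modifications and says nothing about whether the glycine is sterically accessible.

```python
import re

# LPXTG recognition sequence from the text; X is taken to be any residue.
# The lookahead lets overlapping occurrences be reported as well.
SORTASE_MOTIF = re.compile(r"(?=(LP[A-Z]TG))")


def find_sortase_motifs(sequence):
    """Return (start_index, motif) pairs for every LPXTG occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in SORTASE_MOTIF.finditer(seq)]


def has_n_terminal_glycine(sequence):
    """True if the first residue is glycine (one-letter code G)."""
    return sequence.upper().startswith("G")


if __name__ == "__main__":
    # GGSEFG is the model nucleophile named in the text; YALPETGG is the
    # common YALPET sequence followed by two glycines.
    for label, seq in [("nucleophile", "GGSEFG"), ("acyl donor", "YALPETGG")]:
        print(label, seq, find_sortase_motifs(seq), has_n_terminal_glycine(seq))
```

A sequence that passes both checks is only a candidate; accessibility of the N-terminus has to be established experimentally.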
We chose to investigate whether the reaction could be made irreversible by using depsipeptide substrates (Scheme 1 b). We anticipated that the hydroxyacetyl byproduct would not be a substrate for the reverse reaction, thus rendering the labeling reaction irreversible. A similar strategy has been applied previously to subtiligase,11 but overall yields were dependent on specific recognition sequences in both peptides and an excess of depsipeptide was necessary to drive the reaction to completion. Ploegh and co-workers have reported the use of a methyl ester substrate with SrtA;7a however, their method employed stoichiometric quantities of SrtA and excess methyl ester.
We first established an assay to evaluate the efficiency of the SrtA reaction with various peptide substrates. Three classes of labeling substrate (1, 2 and 3) were synthesized (Figure 1 a); these peptides share a common YALPET sequence followed by a single glycine, a glycine amide, or two glycines, respectively. All three peptides were tested as substrates for ligation to a short peptide with the sequence GGSEFG 4 using SrtA in aqueous buffer (Figure 1 a). HPLC analysis using an authentic sample of product 5 as a standard showed that only peptides 2 and 3 act as substrates for SrtA, as indicated previously.12 Neither reaction achieved complete conversion to product 5 as this peptide is itself a substrate for SrtA and can thus react with glycine amide 6 or diglycine 7 to reform the starting peptides. The reaction mixture therefore comes to equilibrium after 50 % conversion (Figure 1 b). Increasing the ratio of the “nucleophilic” peptide 4 to the “electrophilic” acyl donor peptides 2 and 3 leads to an increase in conversion, but the system still reaches equilibrium short of complete conversion (Figure 1 b and Supporting Information).
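The equilibrium behavior described above can be reproduced with a simple mass-action sketch for a reversible exchange, donor + nucleophile ⇌ product + leaving group. The equilibrium constant K = 1 is an assumption chosen because it matches the reported 50 % conversion at a 1:1 ratio; it is not a measured value, and the function name is hypothetical.

```python
import math


def equilibrium_conversion(ratio, K=1.0):
    """Fraction of acyl donor converted at equilibrium.

    Models D + N <=> P + B with the initial donor concentration
    normalized to 1 and the initial nucleophile concentration = ratio.
    K is an assumed equilibrium constant (K = 1 gives 50 % at 1:1).
    """
    if ratio <= 0 or K <= 0:
        raise ValueError("ratio and K must be positive")
    # K = x^2 / ((1 - x)(ratio - x))
    #   => (K - 1) x^2 - K (1 + ratio) x + K ratio = 0
    a = K - 1.0
    b = -K * (1.0 + ratio)
    c = K * ratio
    disc = b * b - 4.0 * a * c
    # This form of the quadratic solution returns the physical root
    # (0 < x < min(1, ratio)) and stays finite when K = 1 (a = 0).
    return 2.0 * c / (-b + math.sqrt(disc))


if __name__ == "__main__":
    for r in (1, 2, 3, 5):
        print(f"nucleophile:donor = {r}:1 -> {equilibrium_conversion(r):.2f}")
```

With K = 1 the conversion is ratio/(1 + ratio), so excess nucleophile raises conversion but never brings it to completion. Passing a very large K mimics, in a crude way, a step whose reverse reaction is negligible.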
Depsipeptides 8 and 9 were synthesized by Fmoc solid phase peptide synthesis (SPPS) using Fmoc-protected TG depsipeptide 12 (Scheme 2). First, alkylation of Fmoc tert-butyl-protected threonine 10 with benzyl bromoacetate in the presence of tetrabutylammonium iodide yielded the tri-protected depsipeptide 11. Hydrogenolysis of the benzyl group gave carboxylic acid 12 which was activated using HCTU for attachment to NovaGel Rink amide resin or glycine-loaded 2-chlorotrityl resin. Standard Fmoc-SPPS protocols were used to extend these depsipeptides to provide compounds 8 and 9 in 87 % and 67 % yield, respectively.
SrtA-mediated ligation of each depsipeptide with GGSEFG peptide 4 was followed by HPLC (Figure 2). When one equivalent of either depsipeptide 8 or 9 was used, 4 was almost quantitatively transformed to the ligation product 5 (Figure 2 and Supporting Information). The presence of 2 mol % SrtA in the reaction mixture prevents the reaction from going fully to completion, as the equilibrium between the product and the thioester intermediate remains significant. In contrast, the corresponding methyl ester 13 reacted slowly under these conditions, leading to only 30 % product formation after 8 h. When the amount of depsipeptide 9 was increased to 1.5 equiv, transformation of 4 to 5 became rapid and quantitative (see Supporting Information).
As depsipeptides proved effective for labeling model peptides, we sought to apply the method to labeling proteins. SrtA-mediated ligations are typically carried out using peptides containing an N-terminal oligoglycine motif; however, this is not frequently observed in proteins. A single N-terminal glycine is, however, commonly produced in recombinant proteins after cleavage with proteases such as thrombin or tobacco etch virus (TEV) protease. Depsipeptide 14 containing a dansyl lysine residue was synthesized using the same methodology as for 8 and 9. Initial labeling experiments were performed using a variant of the human mannose binding protein (ManBP).13 This trimeric protein has a single N-terminal glycine at the end of a three-stranded α-helical bundle. The substrate protein was labeled quantitatively within 4 h when incubated with 1.5 equiv of the depsipeptide labeling reagent per protomer (Figure 3). The degree of labeling was confirmed by electrospray mass spectrometry (ESMS) of the reaction mixture, which indicated complete conversion to the labeled protein (Figure 3 c). A small quantity of acylated SrtA can also be observed in the SDS-PAGE gels of the crude reaction mixtures, corresponding to 10 mol % SrtA present in the reaction mixture. SrtA can be easily removed by affinity purification to yield a pure reaction product. ManBP was also labeled successfully with a fluorescein-modified depsipeptide (see Supporting Information). When working at low protein concentrations, prolonged incubation with SrtA can lead to hydrolysis of the label;10 however, if necessary, increasing the quantity of label to 2–3 equiv can still allow complete conversion to the product.
To confirm the generality of our depsipeptide method we also labeled a sample of the mouse pumilio-2 Puf RNA-binding domain14 in which an N-terminal GT sequence had been produced by TEV protease cleavage. The protein was quantitatively labeled using 1.5 equiv of labeling reagent 14. We also investigated the labeling of both myoglobin and the fly pumilio RNA-binding domain. In both cases protein labeling was unsuccessful; however, short peptides corresponding to the N-terminal sequences of myoglobin, mouse pumilio and fly pumilio could be successfully modified (see Supporting Information). All three peptides displayed very similar reaction kinetics that were only slightly slower than those observed for diglycyl peptide 4. We attribute the differences in protein reactivity to variation in steric bulk in the vicinity of the labeled glycine. In the case of fly pumilio, the glycine is only one residue removed from the globular domain of the protein with the sequence GS (cf. GTG for the mouse paralog). This suggests that a minimum length of flexible peptide is required to ensure that labeling can occur.
The depsipeptide substrates allow rapid labeling of both peptides and proteins. In conventional SrtA-mediated ligations, the rate-determining step is the initial attack of the enzyme to form an acyl–enzyme intermediate;10 this is then attacked by the nucleophile to form a product that is itself a substrate for SrtA. The efficiency of any given substrate is therefore controlled by the rate of its turnover relative to that for the product. The Michaelis constants for peptide acyl donors are typically in the same range as the substrate concentrations used in this study;10, 15 the relative rate of reaction is therefore determined by the specificity constant, kcat/Km. We suggest that this factor accounts for the success of the depsipeptides under the conditions used in this study. The rate of acylation by an ester substrate will be greater than that for an amide, thus increasing kcat for both the depsipeptide and the methyl ester. However, while the affinity of the depsipeptide should be similar to that of a peptide substrate, the methyl ester will presumably bind less tightly. Overall the relative rate of reaction (as determined by kcat/Km) for the methyl ester should be less than that for the product, and product inhibition therefore leads to the observed low overall rate of reaction. The rate of reaction for the depsipeptide will be higher than that for the product and rapid conversion is thus observed.
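The competition between acyl donor and ligation product invoked above can be written out using the standard expression for two substrates competing for one enzyme. The symbols are notation introduced here for illustration (D for the acyl donor, P for the ligation product, [E]_0 for total enzyme); this is the textbook Michaelis-Menten result, not a fit to the data in this study.

```latex
\[
v_i \;=\; \frac{k_{\mathrm{cat},i}\,[E]_0\,[S_i]/K_{m,i}}
               {1 + [D]/K_{m,D} + [P]/K_{m,P}},
\qquad i \in \{D, P\}
\]
\[
\frac{v_D}{v_P} \;=\;
\frac{\left(k_{\mathrm{cat}}/K_m\right)_D\,[D]}
     {\left(k_{\mathrm{cat}}/K_m\right)_P\,[P]}
\]
```

Because the denominator is common to both rates, it cancels in the ratio, and the partitioning of the enzyme between donor and product depends only on their specificity constants and concentrations. A donor with a higher k_cat/K_m than the product is consumed preferentially, whereas one with a lower value is outcompeted as product accumulates.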
In conclusion, we have described the synthesis and application of depsipeptide substrates for sortase-mediated ligation at the N-terminus of proteins. These substrates enable labeling of peptides and proteins with virtually equimolar quantities of each coupling partner and with substoichiometric quantities of sortase, as long as the N-terminal glycine residue is sterically accessible. This approach is ideally suited to the use of high-value reagents and minimizes the requirement for post-reaction purification. While the method introduces an LPET sequence at the N-terminus of the labeled protein, we do not anticipate that this modification would alter the biochemical behavior of a protein any more than introduction of a fluorescent tag by any other method. We thus expect that it will be of widespread utility for the N-terminal modification of proteins.