An important challenge in the field of protein bioconjugation is the development of strategies that can modify a protein once at a single site. Although many bioconjugation reactions can functionalize specific amino acids in aqueous solution,1 most proteins display multiple copies of the targeted residue on their surface. This commonly results in product mixtures that present the new functionality in multiple locations on the protein surface. As it is one of the rarest amino acids,2 cysteine is the most commonly targeted residue when site-selective modification is required, but there remain many situations in which the modification of a unique copy of this residue is inconvenient or impossible. To address these limitations, several chemoselective techniques have been developed to target surface-accessible aromatic residues.3 To complement these methods further, we report herein a biomimetic transamination reaction that can modify the N terminus of proteins and peptides under mild conditions. This technique introduces a uniquely reactive ketone or aldehyde group in a single location, thus allowing further modification through oxime or hydrazone formation. This simple strategy does not require the use of site-directed mutagenesis, and therefore has the potential to introduce virtually any functional group on a wide range of protein substrates.
The unique reactive properties of the N terminus have resulted in several strategies targeting this location to achieve site-selective protein modification. To a limited extent, the lower pKa value of N-terminal amino groups (relative to lysine side chains) can be used to direct acylation reactions to this site through careful control of the reaction pH.4 However, the large number of competing lysine residues (10–40 on most proteins) limits the selectivity of this reaction in most cases. Alternative strategies have targeted the N terminus in combination with specific amino acid side chains. In particular, the reaction of N-terminal cysteine residues with thioesters has been used with great success, and affords fusion proteins through native chemical ligations.5 N-terminal cysteine residues have also been modified through thiazolidine formation by using aldehyde reagents.6 Reactive aldehydes can be formed through periodate oxidation of N-terminal serine and threonine residues,7 and N-terminal tryptophan residues can be modified through Pictet–Spengler reactions.8
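The pH argument above can be illustrated with a rough Henderson–Hasselbalch estimate. The following is a minimal sketch, not a kinetic model: the pKa values (8.0 for the N-terminal amine, 10.5 for lysine side chains) are assumed textbook-style figures that do not come from this work, and all free-base amines are assumed to react at the same intrinsic rate, which real proteins will not obey.

```python
# Illustrative estimate of how pH and lysine count affect N-terminal selectivity.
# ASSUMPTIONS (not from the text): the two pKa values below, and equal intrinsic
# reactivity of every unprotonated amine.

PKA_N_TERMINUS = 8.0   # assumed typical value for an N-terminal alpha-amine
PKA_LYSINE = 10.5      # assumed typical value for a lysine epsilon-amine


def fraction_unprotonated(pka: float, ph: float) -> float:
    """Fraction of an amine present as the neutral free base at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def n_terminal_share(ph: float, n_lysines: int) -> float:
    """Share of all free-base amine that belongs to the single N terminus."""
    n_term = fraction_unprotonated(PKA_N_TERMINUS, ph)
    lysines = n_lysines * fraction_unprotonated(PKA_LYSINE, ph)
    return n_term / (n_term + lysines)


if __name__ == "__main__":
    for ph in (6.5, 7.5, 8.5):
        for n_lys in (10, 40):
            share = n_terminal_share(ph, n_lys)
            print(f"pH {ph}, {n_lys} lysines: N-terminal share = {share:.2f}")
```

Under these assumptions the N-terminal share falls as either the pH or the number of lysine residues rises, which is the qualitative trend described in the text. The absolute numbers should not be read as predicted selectivities.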
An alternative strategy that does not depend on the identity of the amino acid side chain can be envisioned through the oxidation of the N-terminal amino group to an imine, followed by hydrolysis to afford a ketone or an aldehyde. This reaction pathway has indeed been demonstrated for the chemospecific functionalization of N termini by using glyoxylic acid, copper(II) salts, 1 M pyridine, and 1 M HOAc.9 However, these conditions are too harsh to maintain the folded structure of most proteins, and are therefore more appropriate for sequence-analysis applications. In contrast to these reports, initial studies by our research group found that the N-terminal aspartic acid residue of angiotensin I (1) unexpectedly decarboxylated to a small extent (ca. 10 %) when the peptide was simply treated with glyoxylic acid (2 a; Scheme 1). The product of the reaction was identified as pyruvamide 3 through subsequent reaction with O-benzylhydroxylamine (5) to form oxime 6. Screening experiments indicated that the structure of the aldehyde substituent strongly influenced the reaction efficiency (Table 1) and that the decarboxylated product was often accompanied by various amounts of imines 4 a–h. Interestingly, pyridoxal 5′-phosphate (PLP, 2 c) emerged as the most effective aldehyde, thus affording a 65 % conversion of 1 into ketone 3 in 2 h at 37 °C, with no observed imine formation.
|Aldehyde||3 [%]||4 a–h [%]||Aldehyde||3 [%]||4 a–h [%]|
|2 a||ca. 10||trace||2 e||0||25|
|2 b||14||17||2 f||10||50|
|2 c||65||0||2 g||trace||>90|
|2 d||0||0||2 h||20||40|
Subsequent optimization experiments determined that a 1 mM peptide solution could be modified by using 10 mM PLP in 50 mM phosphate buffer at pH 6.5 and 37 °C (Table 2, entry 1). Consistent with other reactions involving carbonyl condensations,10 reactions at pH 6.5 afforded the optimal amount of product, although slight variations were still tolerated (Table 2, entries 2–4). An increase in temperature to 65 °C resulted in rapid conversion of 1 into 3 (>95 % in 2 h; Table 2, entry 5), although excellent conversion could also be obtained after 24 h at room temperature (Table 2, entry 7). The reaction was found to proceed readily in 2-[4-(2-hydroxyethyl)-1-piperazinyl]ethanesulfonic acid (HEPES), 3-(N-morpholino)propanesulfonic acid (MOPS), and phosphate buffers, and, contrary to previous transamination reports,11 did not require the presence of divalent cations or denaturing organic cosolvents. A variety of Lewis acids were screened to confirm this, and little change in rate or conversion was observed.12 Moreover, experiments performed in the presence of 10 mM ethylenediaminetetraacetate (EDTA) to sequester any metal ions in the reaction solution resulted in equivalent product yields.
|Entry||Peptide||Sequence||pH||T [°C]||t [h]||Conv [%]|
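As a practical aid, the optimized peptide conditions (1 mM peptide, 10 mM PLP, 50 mM phosphate buffer) can be converted into pipetting volumes with the usual C1V1 = C2V2 dilution relationship. The sketch below is illustrative only: the stock concentrations and the 100 μL reaction volume are hypothetical choices rather than values from this work, and the function names are introduced here for the example.

```python
# Dilution arithmetic (C1*V1 = C2*V2) for assembling a reaction mixture.
# Final concentrations are taken from the text; the stock concentrations and
# the total volume are HYPOTHETICAL and chosen only for illustration.

def stock_volume_ul(final_mm: float, stock_mm: float, total_ul: float) -> float:
    """Volume of stock (in microliters) needed to reach final_mm in total_ul."""
    if final_mm > stock_mm:
        raise ValueError("stock solution is more dilute than the target")
    return final_mm * total_ul / stock_mm


def reaction_recipe(total_ul: float, targets_mm: dict, stocks_mm: dict) -> dict:
    """Return the volume of each stock plus the water needed to reach total_ul."""
    volumes = {
        name: stock_volume_ul(target, stocks_mm[name], total_ul)
        for name, target in targets_mm.items()
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested total volume")
    volumes["water"] = water
    return volumes


# Final concentrations reported in the text (mM).
targets = {"peptide": 1.0, "PLP": 10.0, "phosphate buffer, pH 6.5": 50.0}
# Hypothetical stock concentrations (mM).
stocks = {"peptide": 10.0, "PLP": 100.0, "phosphate buffer, pH 6.5": 500.0}

recipe = reaction_recipe(100.0, targets, stocks)

if __name__ == "__main__":
    for component, volume in recipe.items():
        print(f"{component}: {volume:.1f} uL")
```

With these hypothetical tenfold-concentrated stocks, each component contributes 10 μL and the remainder is water. In practice the pH of the PLP stock would need to be checked separately, which this arithmetic does not address.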
The most useful feature of this reaction is its selectivity for the N terminus of the peptide. Under the optimized conditions, peptide 7 afforded 90 % conversion into the ketone product (Table 2, entry 8). These screening experiments also revealed that the N-terminal residue does not need to be aspartic acid, as peptides bearing N-terminal methionine (8) and glycine (9) residues still afforded good yields of the desired products. These peptides also provided the singly modified conjugate, despite the presence of internal lysine residues.
In a biological context, PLP is a cofactor that effects a variety of metabolic transformations under enzymatic control, including racemization, elimination, decarboxylation, and transamination.13 Typically, PLP-mediated reactions first involve condensation of the aldehyde with amino groups on a peptide or protein, thus leading to the formation of imine 4 c (Scheme 2). In the absence of an enzyme, this process could occur with all the amino groups on a particular biomolecule. However, the imine formed with the N terminus has an α proton with a much lower pKa value, which allows tautomerization to occur uniquely at this site. Hydrolysis of the resulting glyoxyl imine 10 (accompanied by decarboxylation in the case of aspartic acid) yields an aldehyde or ketone specifically at the N terminus. Although in principle this reaction sequence could take place on the N termini of peptides and proteins in living cells, this pathway has not been reported to date. This is likely due to the low concentration of unbound PLP in the cytoplasm.
The development of the reaction for protein targets was initiated with horse heart myoglobin (12), which has an N-terminal glycine residue. Optimal conditions for myoglobin modification were found to be 50 μM protein and 10 mM PLP for 20 h at 37 °C in 25 mM phosphate buffer (pH 6.5; Figure 1 a). Following treatment of the mixture with p-bromobenzyloxyamine (13), it was determined by ESI-MS that a single product was obtained that corresponded to the expected oxime 14 (Figure 1 b). The overall conversion of the reaction was determined to be greater than 75 % by ESI-MS and 69 % by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis (see below). Following removal of excess PLP, the absorbance of the heme component was found to be unchanged over the course of the reaction (Figure 1 c). This observation provides good evidence that the tertiary structure of the protein was still intact following the modification procedure.
We were able to confirm the site-specificity of this modification after subjecting 14 to a proteolytic digest with trypsin. Analysis of the peptide fragments by MALDI-TOF-MS showed that the reaction had occurred on the N-terminal fragment of the protein (Figure 1 d). The bromine atom of 13 provided a useful analysis tag, as the characteristic isotope pattern was specific to the modified fragment and could be observed clearly (Figure 1 e). Despite the 17 lysine residues on the protein, no other modified peptides were observed, a result that is consistent with the observation that these residues do not participate in this reaction.
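The diagnostic value of the bromine tag follows from the near-equal natural abundance of 79Br and 81Br, which gives any singly brominated ion a doublet of similar-height peaks about 2 Da apart. The sketch below estimates that doublet from standard isotope masses and abundances. It deliberately ignores the 13C, 15N, and other isotopic contributions of the peptide itself, so it is a simplification of a real MALDI-TOF envelope, and the m/z value in the usage example is hypothetical.

```python
# Simplified isotope doublet for an ion carrying exactly one bromine atom.
# Only the bromine isotopes are considered; contributions from 13C and other
# elements in the peptide are ignored (a deliberate simplification).

# (exact mass in Da, natural abundance) for 79Br and 81Br.
BR_79 = (78.9183, 0.5069)
BR_81 = (80.9163, 0.4931)


def bromine_doublet(mz_with_79br: float, charge: int = 1) -> list:
    """Return [(m/z, relative intensity %)] for the 79Br and 81Br peaks."""
    spacing = (BR_81[0] - BR_79[0]) / charge
    heavy_intensity = 100.0 * BR_81[1] / BR_79[1]
    return [(mz_with_79br, 100.0), (mz_with_79br + spacing, heavy_intensity)]


if __name__ == "__main__":
    # Hypothetical m/z for a singly charged, singly brominated fragment.
    for mz, intensity in bromine_doublet(1500.0):
        print(f"m/z {mz:.3f}: {intensity:.1f} %")
```

Unmodified tryptic fragments lack this roughly 1:1 doublet, which is why the pattern singles out the fragment bearing the oxime.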
The oxime linkage is well suited for use as a probe or linker in biochemical systems, as it is selective for ketone and aldehyde functionalities, forms in high yield, and is stable under physiological conditions.14 As an example, the N terminus of myoglobin was fluorescently labeled with chromophore 15. SDS-PAGE analysis revealed selective labeling of the protein after activation with PLP (Figure 2 b, lane 1), whereas no labeling was observed for unactivated myoglobin (lane 2).
This technique can also be used to label proteins when chemistry appropriate for cysteine groups is not available. For example, cysteine 70 in the green fluorescent protein15 (EGFP, 16 a) is reactive toward maleimide reagents and this protein (in our hands) cannot be expressed in fluorescent form when this residue has been substituted. Thus, the labeling of other positions on this protein is difficult to achieve by standard mutagenesis techniques. We have found that the N-terminal modification method can overcome this limitation, and results in the attachment of 15 at the N-terminal valine residue of 16 a (Figure 2 b, lanes 3 and 4) and at the N-terminal glycine residue of mutant 16 b. Subsequent experiments have confirmed that the internal chromophore retains its fluorescence after modification with PLP.12
The attachment of poly(ethylene glycol) (PEG) chains to proteins has been shown to improve circulation lifetime, lower immunogenicity, and confer proteolysis resistance.16 These conjugates are typically prepared through lysine modification, which results in complex mixtures that exhibit varying levels of efficacy. To avoid this, we have used the two-step strategy outlined above to attach a single polymer chain to the N terminus of proteins.17, 18 Specifically, the aldehyde of PLP-activated myoglobin was treated with various equivalents of the previously reported19 PEG alkoxyamine 17. As seen in Figure 3 b (lanes 1–3), a single bioconjugate was obtained even when using a large excess of 17. Using similar conditions, this reaction strategy has also proven effective for EGFP (16 b, lanes 6 and 7), RNase A (19, lanes 8 and 9), and thioredoxin (20, lanes 10 and 11). For comparison, PEG-NHS ester 18 yielded a ladder of polymer conjugates for the labeling of 12 (Figure 3 b, lanes 4 and 5). Clean conversion to a singly modified protein was not observed under any conditions using this approach.
The shift on the gel caused by polymer attachment provides a useful method for the quantitation of reaction conversion by optical densitometry measurements after Coomassie staining (Table 3). For myoglobin, 69 % overall conversion was achieved at 37 °C, and for EGFP (16 b) 41 % conversion was obtained (Table 3, entry 3). High levels of modification could be obtained for 16 a and 16 b at 55 °C. RNase A (19) was cleanly modified in 50 % conversion at 37 °C, as was thioredoxin (20) under identical conditions. The large number of disulfide and free thiol groups that proteins 19 and 20 possess renders them difficult to modify in a single location by other methods. Finally, initial studies have found that protein G′ (21) can be modified to 30 % at 41 °C, thus indicating that proteins with N-terminal methionine residues are compatible with this technique. Studies to increase the conversion for this important class of substrates are in progress, as are studies to evaluate the compatibility of other N-terminal amino acids.
|Entry||Protein||N-terminal amino acid||Conc. [μM]||T [°C]||Conv. [%]|
|2||GFP-1V (16 a)||valine||10||55||67|
|3||GFP-1G (16 b)||glycine||10||37||41|
|4||GFP-1G (16 b)||glycine||10||55||80|
|7[b]||protein G′ (21)||methionine||33||41||30|
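The densitometric quantitation behind the conversions in Table 3 reduces to a ratio of band intensities. The following sketch assumes background-subtracted integrated intensities for the unshifted (unmodified) and shifted (PEGylated) bands, and further assumes that Coomassie staining is proportional to the amount of protein and is not altered by the attached polymer. The band intensities in the usage example are hypothetical, not measured data.

```python
# Conversion estimate from gel densitometry.
# ASSUMPTIONS: intensities are background-subtracted, staining is proportional
# to the amount of protein, and PEG attachment does not change staining.

def percent_conversion(unmodified: float, modified: float) -> float:
    """Percent of protein found in the shifted (modified) band."""
    if unmodified < 0 or modified < 0:
        raise ValueError("band intensities must be non-negative")
    total = unmodified + modified
    if total == 0:
        raise ValueError("no protein signal detected in either band")
    return 100.0 * modified / total


if __name__ == "__main__":
    # Hypothetical integrated band intensities (arbitrary units).
    print(f"{percent_conversion(unmodified=3100.0, modified=6900.0):.0f} %")
```

Because the estimate is a ratio within a single lane, it is insensitive to lane-to-lane loading differences, although it remains sensitive to the staining assumptions stated above.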
We anticipate that proteins possessing N-terminal serine, threonine, cysteine, and tryptophan residues will be incompatible with this technique because of known side reactions with aldehydes.6, 8 It is also presumed that N-terminal proline residues will be unreactive. Outside of these limitations, this reaction provides a generally applicable tool for the site-selective labeling of protein targets. Our current efforts use this methodology for the site-specific placement of fluorophores for sensing applications and the dual modification of both protein termini for device integration.