On the design of a constitutively active peptide asparaginyl ligase for facile protein conjugation

Peptide asparaginyl ligases (PALs) are precision tools for peptide cyclization, cell‐surface labelling, protein semisynthesis and protein conjugation. PALs are expressed as inactive proenzymes requiring low pH activation. During activation, a large portion of the cap domain of the proenzyme that covers the substrate binding site is proteolytically removed, exposing the active site to solvent and releasing a population of heterogenous active enzymes. The availability of a readily active ligase not requiring acid activation and subsequent purification of active forms would facilitate manufacturing and streamline applications. Here, we engineered the OaAEP1b‐C247A hyperactive ligase via serial truncations along the linker connecting the cap and core domain of the proenzyme. The recombinant expression of the truncated constructs was carried out in Escherichia coli. Following a solubilization/refolding protocol, one truncated construct termed ‘OaAEP1b‐C247A‐∆351’ could be overexpressed in the insoluble fraction, purified, and displayed a level of ligase activity comparable to the acid‐activated OaAEP1b‐C247A enzyme. This constitutively active protein can be stored for up to 2 years at −80 °C and readily used for peptide cyclization and protein conjugation. We were able to express and purify a stable constitutively active asparaginyl ligase that can be stored for months without significant activity loss. The removal of the low pH proenzyme activation step eliminates the heterogeneity introduced by this procedure. The yield of purified recombinant active ligase that can be routinely obtained per 100 mL of E. coli cell culture is about 0.9 mg. This recombinant active ligase can be used to carry out protein conjugation.

Enzyme-mediated peptide ligation [1,2] has been exploited for a wide range of applications such as protein/peptide ligation, cyclization and labelling, protein thioester formation [3], protein conjugation to various moieties such as PEG, lipids or fluorescent probes, live-cell-surface labelling [4], nanobody conjugation [5], and antibody-drug conjugation [6-11]. Since its discovery, sortase A has been a popular choice to perform protein conjugation [12], but a significant amount of enzyme is required, often approaching a 1 : 1 molar ratio with the target protein. In addition, a rather large LPXTG tag must be genetically added to the target protein, and the reaction catalyzed by sortase A is reversible.
Asparaginyl endopeptidases (AEPs) and peptide asparaginyl ligases (PALs) were discovered in cyclotide-producing plants, and both enzymes belong to the cysteine protease family C13. AEPs hydrolyze the Asx-Xaa peptide bond (Asx is Asn or Asp) at the P1 position of the polypeptide substrate [13,14]. By contrast, PALs catalyze peptide bond formation. The discovery of hyperactive PALs such as butelase-1 [11] or VyPAL2 [15] and the engineering of the single mutant OaAEP1b-C247A [16,17] have opened possibilities not envisioned before in the field of bioconjugation. Their discovery has thus attracted intense research activity aimed at facilitating the use of PALs for various applications in biotechnology and medicine [1,2].
Structural studies revealed that PALs and AEPs [18-22] share a similar overall fold formed by a core domain linked to a C-terminal cap domain via a flexible linker. The core domain consists of a six-stranded β-sheet surrounded by six α-helices located at its periphery, while the cap domain is formed by a suite of α-helices [17,23,24]. An evolutionarily conserved glutamine residue at the N-terminus of the α6-helix of the cap (Gln347 in OaAEP1b) inserts into the S1 pocket, keeping the proenzyme in an inactive state [17]. Upon activation at acidic pH values ranging from 4.0 to 4.5, the cap domain becomes separated from the core domain via electrostatic repulsion, facilitating cleavage in trans and exposing the enzyme active site to the solvent [19,25-27]. This cleavage allows binding by the PAL of polypeptide substrates containing the N/D-X1-X2 tripeptide motif, where X1 is any residue besides Pro and X2 is a hydrophobic residue [26]. Such motifs are present at the N-terminus and within the linker region and cap domain of the proenzyme, accounting for the autoproteolysis activity observed at these sites. At acidic pH, hydrolysis is favored, leading to the degradation of the cap domain and the N-terminus of the core domain. In vivo, acidic proteolytic activation occurs in the vacuole of cyclotide-producing plants and serves to regulate the activity of these enzymes endowed with proteolytic and cyclization activity [27-31].
So far, all PALs reported were expressed recombinantly as zymogens, and their enzymatically active isoforms were only obtained following incubation at low pH [17,21,24]. Significant heterogeneity is introduced during this low pH activation stage: the proenzyme contains several closely spaced, accessible cleavage sites at both its N and C termini, so auto-activation yields a mixture of activated ligase isoforms that are subsequently difficult to separate via chromatography. This heterogeneity could be an issue for various industrial applications from a manufacturing and quality control perspective. To eliminate the time-consuming low pH activation step of the proenzyme and address the issue of heterogeneity of PALs, we designed several constructs of OaAEP1b-C247A. This highly active PAL variant, derived from Oldenlandia affinis, was selected for this study because it can be readily expressed in Escherichia coli using an available bacterial recombinant expression system [16,17]. We expressed several truncated OaAEP1b-C247A proteins retaining only portions of the linker and the α6-helix region located at the N-terminal end of its cap domain (Fig. 1). All constructs could be overexpressed in E. coli as inclusion bodies. However, only OaAEP1b-C247A-Δ351 could be readily refolded, whereas all other constructs precipitated during the refolding procedure. We found that the purified protein retained a level of ligase catalytic activity comparable to the activated protein recovered after low pH treatment of the proenzyme. Thus, this work represents a cost-effective and faster way to produce large amounts of a hyperactive ligase in E. coli for various attractive biotechnological and industrial applications [1].

Analysis of the interface between the core and cap domains of OaAEP1b
The crystal structure of OaAEP1b (PDB access code: 5H0I) [17] allows a precise analysis of the set of interactions established between the cap and the core domains in the zymogen form (Fig. 1A). In the context of the plant cells, the cap domain appears to regulate the activity of PALs and AEPs to prevent undesired protein processing or protein/peptide ligation. Four residues, Val344-Val345-Asn346-Gln347, preceding the α6-helix (the first N-terminal helix of the cap domain), are located at the interface between the cap and core domain. In particular, Gln347 penetrates deeply into the S1 pocket, establishing several polar interactions with surrounding active site residues [17]. The interface between the cap and the core domain extends over a total surface of 1227 Å² and involves 41 residues of the core domain, which make contact with 31 residues from the cap domain. A total of nine hydrogen bonds and 14 salt bridges are formed between residues from the cap and the core domain, and the estimated total binding energy for this interaction is −18.8 kcal·mol⁻¹ at neutral pH, as estimated by PISA (https://www.ebi.ac.uk/pdbe/pisa/). Of note, seven Glu residues are found in the interface between the cap and the core domain of OaAEP1b (Fig. 1A). Separation of the two domains requires acidification of the milieu to pH values ranging between 4.0 and 4.5 with the addition of nonionic detergents such as N-laurylsarcosine. At these pH values, Glu residues are no longer negatively charged, disrupting the favorable electrostatic interactions between the two domains, and favoring proteolytic cleavage in trans (Fig. 1B).
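The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only a rough estimate: the side-chain pKa of 4.25 is an assumed model-compound value, not one measured for OaAEP1b, and pKa values of buried interface residues may be shifted considerably.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch: fraction of an acidic side chain that carries a negative charge."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Assumption: generic model-compound pKa for a Glu side chain (not from the paper).
ASSUMED_GLU_PKA = 4.25

for ph in (7.0, 4.5, 4.0):
    charged = fraction_deprotonated(ph, ASSUMED_GLU_PKA)
    print(f"pH {ph:.1f}: {charged:.1%} of Glu side chains negatively charged")
```

Under this assumption, essentially all Glu side chains are charged at neutral pH, whereas only about 64% (pH 4.5) to 36% (pH 4.0) remain charged in the activation range, consistent with a substantial weakening of the salt bridges at the cap-core interface.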

Design and expression of a constitutively active OaAEP1b-C247A
From the analysis above on OaAEP1b and from other AEP and PAL crystal structures, it appears that the α6-helix and the four residues Val344-Val345-Asn346-Gln347 immediately preceding this α-helix must play an important role in stabilizing the enzyme in its zymogen form. Moreover, Asn346, Asp349, and Asp351 have been proposed to constitute possible cleavage sites leading to the mature enzyme [17]. Therefore, we designed a series of truncated constructs targeting residues located in the α6-helix region and in the linker between the cap and core domain of OaAEP1b-C247A (Fig. 2A).
All four constructs were expressed in E. coli BL21 T1R and designed to include the core domain of OaAEP1b-C247A (residues Gly55 to Asn324, numbering according to Ref. [17]), discarding the signal peptide region (residues 1-54) [17]. In addition to this core region necessary for activity, the four constructs included incremental sections from the linker and α6-helix encompassing putative acid-activation sites located after Asn or Asp residues, such as Asp328 or Asn336 (Figs 2B and 3). All four constructs showed robust levels of expression in E. coli, although the corresponding proteins were all expressed as inclusion bodies. Next, we attempted to extract proteins from the insoluble fraction by urea solubilization followed by refolding. Out of the four OaAEP1b-C247A constructs tested, only the OaAEP1b-C247A-Δ351 protein could be refolded. For the other three truncated proteins tested, severe precipitation was observed during the refolding procedure, indicating that segments in the region spanning residues Pro325-Asp351 are required for protein solubility.

Refolding and purification of OaAEP1b-C247A-Δ351
OaAEP1b-C247A-Δ351 was expressed at a good level in E. coli inclusion bodies (Fig. 5A). The inclusion bodies were first resolubilized in 8 M urea. The protein was subsequently refolded via stepwise dilution and reduction in urea concentration from 8 to 0 M using buffer 1 and buffer 2, respectively (see Methods and Fig. 4). After the stepwise dialysis, we carried out a two-step purification of the refolded OaAEP1b-C247A-Δ351: metal affinity chromatography (HisTrap column; Cytiva, Marlborough, MA, USA) followed by size exclusion chromatography (Superdex 200 16/600 pg; Cytiva) (Fig. 5B,C). These steps led to a pure monomeric fraction of OaAEP1b-C247A-Δ351 (Fig. 5C). After purification, starting from a bacterial cell culture of 100 mL, we routinely obtained a yield of 1.75 mg of OaAEP1b-C247A-Δ351. Note, however, that active site titration shows that only about 53% of this protein is fully active (see below).

Cyclization activity of OaAEP1b-C247A-Δ351
To evaluate the cyclization activity of the purified OaAEP1b-C247A-Δ351, the enzyme was tested against a linear NH2-GLPVSTKPVATRNAL-COOH peptide substrate (labeled 'LS') (Fig. 6A). The cyclization reaction was performed at 37 °C, and samples were collected every 2 min. MALDI-TOF MS was subsequently used to detect the cyclized product (labeled 'CP'). Successful cyclization of the LS, which has a mass of 1524 Da, by the active ligase would result in a CP with a mass of 1321 Da (Fig. 6A). After 12 min of reaction time, OaAEP1b-C247A-Δ351 had converted the majority of the linear substrate to the cyclized product. No LS peak could be detected, compared with a high CP peak, in the MALDI-TOF mass spectra of the reaction mixture (Fig. 6B), indicating a complete cyclization reaction of the substrate by OaAEP1b-C247A-Δ351.
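The expected masses of the linear substrate and cyclized product can be cross-checked from the sequence. The following is a minimal sketch that assumes ligation occurs after the Asn residue with release of the C-terminal Ala-Leu dipeptide, and that uses standard average residue masses supplied here for illustration (they are not taken from the paper); the peaks observed by MALDI-TOF may additionally include a proton or other adduct.

```python
# Average residue masses in Da (standard textbook values, supplied as an assumption).
AVERAGE_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "L": 113.16, "N": 114.10, "K": 128.17, "R": 156.19,
}
WATER = 18.02


def linear_mass(sequence: str) -> float:
    """Mass of a linear peptide: sum of residue masses plus one water (free termini)."""
    return sum(AVERAGE_RESIDUE_MASS[res] for res in sequence) + WATER


def cyclic_mass(sequence: str) -> float:
    """Mass of a head-to-tail cyclic peptide: residues only, no terminal water."""
    return sum(AVERAGE_RESIDUE_MASS[res] for res in sequence)


LINEAR_SUBSTRATE = "GLPVSTKPVATRNAL"
# Assumption: the Ala-Leu dipeptide following Asn is released during cyclization.
CYCLIZED_CORE = LINEAR_SUBSTRATE[:-2]

print(f"LS: {linear_mass(LINEAR_SUBSTRATE):.1f} Da")
print(f"CP: {cyclic_mass(CYCLIZED_CORE):.1f} Da")
```

Both computed values fall within 1 Da of the masses quoted in the text (1524 Da for LS and 1321 Da for CP); the mass difference corresponds to the released Ala-Leu dipeptide.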

Comparison of ligase activity of constitutively active vs acid-activated PAL
Next, using a FRET ligation assay, we compared the ligase activity of the truncated OaAEP1b-C247A-Δ351 with its acid-activated zymogen counterpart. Briefly, 50 nM of either enzyme was added to a mixture of two peptides, A: PIE(EDANS)YNAL and B: GIK(DABSYL)SIP, mixed in an A : B molar ratio of 1 : 3. Upon ligation, the fluorescence emission of the EDANS moiety of A (λem = 490 nm) becomes quenched by the DABSYL moiety of B (Fig. 7A). This assay allows us to follow the ligation rate between both peptides in real time, giving access to the kinetic parameters of the truncated enzyme. We observed that the truncated purified protein has a ligation activity comparable (about 2-fold less) to its acid-activated zymogen counterpart and to the previously reported OaAEP1b-C247A [17]. The Vmax and Km values are 6.40 RFU·s⁻¹ and 8.16 µM, respectively, for OaAEP1b-C247A-Δ351, compared with 14.32 RFU·s⁻¹ and 8.34 µM for acid-activated OaAEP1b-C247A (Fig. 7B). As the constitutively active PAL was obtained using a refolding protocol, the exact final proportion of OaAEP1b-C247A-Δ351 proteins adopting an active conformation is not known, giving some uncertainty on the determination of the kinetic parameters. Thus, in order to refine the comparison of the activity of the refolded enzyme with the acid-activated one, we performed the titration of their active sites following the procedure outlined in Ref. [32]. Specifically, to understand the difference in Vmax between OaAEP1b-C247A-Δ351 and acid-activated OaAEP1b-C247A, the active sites of OaAEP1b-C247A-Δ351 were titrated using the FRET ligation assay after a 1 h incubation with varying concentrations of a covalent AEP inhibitor, Ac-YVAD-cmk [19]. The titration showed that about 53% of the measured protein concentration is active and amenable to complete inhibition (Fig. 8). Remarkably, this difference in the concentration of active protein matches the measured difference in Vmax and suggests that the activity of OaAEP1b-C247A-Δ351 is very similar to the activity of the acid-activated OaAEP1b-C247A.
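The correspondence between the active fraction and the Vmax difference can be checked with simple arithmetic. This sketch uses only the values reported above and assumes, for illustration, that the acid-activated preparation is fully active, which the text does not state explicitly.

```python
# Values reported in the text (Fig. 7B and Fig. 8).
VMAX_TRUNCATED = 6.40   # RFU/s, OaAEP1b-C247A-Δ351
VMAX_ACID = 14.32       # RFU/s, acid-activated OaAEP1b-C247A
ACTIVE_FRACTION = 0.53  # fraction of refolded protein found active by titration

# Ratio of the apparent Vmax values, before any correction.
observed_ratio = VMAX_TRUNCATED / VMAX_ACID

# Vmax rescaled to the amount of enzyme that is actually active.
# Assumption: the acid-activated reference preparation is 100% active.
corrected_vmax = VMAX_TRUNCATED / ACTIVE_FRACTION
corrected_ratio = corrected_vmax / VMAX_ACID

print(f"apparent Vmax ratio:  {observed_ratio:.2f}")
print(f"corrected Vmax:       {corrected_vmax:.2f} RFU/s")
print(f"corrected Vmax ratio: {corrected_ratio:.2f}")
```

The apparent ratio (about 0.45) is close to the titrated active fraction (0.53), and after correction the refolded enzyme reaches roughly 84% of the acid-activated Vmax under the stated assumption.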

Conjugation of the tRNA methyltransferase, TrmJ with a fluorescent peptide
To evaluate the conjugation capability of OaAEP1b-C247A-Δ351, we conjugated a protein of 20 kDa, the tRNA methyltransferase TrmJ [33]. We modified TrmJ to include at its C-terminus the preferred tripeptide recognition motif of OaAEP1b-C247A-Δ351 (Asn-Ala-Leu). Using 200 nM of OaAEP1b-C247A-Δ351, we were able to conjugate TrmJ present in the solution with a short fluorescent peptide bearing an N-terminal Gly-Ile (GIGGIYRK-FITC). The conjugation rate at 37 °C was analyzed using SDS/PAGE at six different time points. An increase of the FITC signal was observed at every time point, and after an hour of reaction time, most of TrmJ was labeled with FITC (Fig. 9). These results demonstrate that the constitutively active OaAEP1b-C247A-Δ351 can efficiently conjugate a protein.

Fig. 3. Amino acid sequence of OaAEP1b-C247A. The amino acid sequence of the OaAEP1b-C247A proenzyme is shown using the same color code as the 3D structure displayed in Fig. 2. The red line indicates the C-terminus of the amino acid sequence of the respective construct (Fig. 2A). Secondary structure elements are labeled and shown above the sequence. The stretch of amino acids colored in yellow belongs to the signal peptide region, which is removed during the proenzyme maturation, with G55 becoming the N-terminus of the mature enzyme. By contrast, the C-terminal residue of the purified proenzyme is P474, highlighted as a black line [17]. The two catalytic residues Cys217 and His175, as well as the gatekeeper residue Cys247, are highlighted in red in the amino acid sequence.

Discussion
The highly active PAL single mutant OaAEP1b-C247A was selected for this study due to the availability of a convenient bacterial recombinant expression system, while other hyperactive PALs require expression in insect cell systems [15-17]. Here, we showed that the bacterial recombinant expression of a constitutively active PAL is possible by introducing systematic truncations along the α6-helix, which penetrates into the enzyme active site. We found that these truncations resulted in the protein being expressed as inclusion bodies, demonstrating that the cap domain provides a set of polar interactions with the core domain that are essential for soluble recombinant expression of the proenzyme. Moreover, constructs entirely devoid of the α6-helix displayed severe precipitation during the purification process. By contrast, construct OaAEP1b-C247A-Δ351, which retains a small portion of the α6-helix, enabled the purification of the protein from inclusion bodies without any severe precipitation. This result suggests that the presence of a portion of the α6-helix is crucial in maintaining protein stability in solution. In summary, these results indicate that it is possible to express the OaAEP1b-C247A enzyme devoid of its inhibitory cap domain while retaining a level of catalytic activity similar to the acid-activated species.
An important question is whether the OaAEP1b-C247A-Δ351 construct derived in this work gives an economical advantage compared with the original construct that needs acid activation to obtain an enzymatically competent form [16,17]. In our hands, using the OaAEP1b-C247A proenzyme as starting material [17], the final yield after acid activation and purification is about 1-2 mg·L⁻¹ of LB culture, which is significantly lower than what we obtain in the present work, with a yield of refolded and active OaAEP1b-C247A-Δ351 enzyme of ~0.9 mg per 100 mL (equivalent to about 9 mg·L⁻¹).
Despite having retained a very small portion of the cap domain in our design, OaAEP1b-C247A-Δ351 retains high enzymatic activity in an intramolecular cyclization assay. We were able to detect the complete conversion of the linear substrate to the cyclized product (Fig. 6). Likewise, taking into account the number of active enzymes, in an intermolecular ligation assay, the catalytic rate observed for the refolded OaAEP1b-C247A-Δ351 was comparable with its acid-activated counterpart. Nonetheless, the intermolecular ligation of two peptides and the conjugation assays were slower than intramolecular cyclization. An intramolecular cyclization reaction generally proceeds faster due to the incoming nucleophile being present in cis within the peptide substrate. By contrast, for intermolecular ligation, a molar excess of electrophilic and nucleophilic substrate peptides is required for efficient catalysis of the reaction [10].
It is noteworthy that a bacterial recombinant expression of a truncated OaAEP1b-C247A was published during the course of the present work [34]. In this design, the truncated enzyme comprised residue D328 and was completely devoid of the α6-helix. The truncated OaAEP1b-C247A protein was reported to have comparable catalytic kinetics to its acid-activated counterpart [34]. However, the reported yield was much lower than the yield obtained in the present work, which could be due to the complete removal of the α6-helix, resulting in a less stable enzyme during the expression and purification steps.

Conclusion
So far, enzymatically active forms of either PALs or AEPs were only obtained via activation under acidic conditions [15-17]. This step leads to the introduction of a heterogeneous population of enzymes due to the multiple accessible activation sites present in the proenzyme, thus limiting the quality and quantity of homogeneous active PALs obtained.
We carried out systematic truncations of the OaAEP1b-C247A proenzyme to address this issue. As a result, we identified OaAEP1b-C247A-Δ351, which showed both good expression levels and activity in a bacterial expression system. The expression and purification of a constitutively active PAL alleviate the need for a tedious activation step and additional purification procedures. This expression and purification protocol leads to an enzyme with ligation kinetics comparable to those of its acid-activated counterparts. Remarkably, compared with currently available acid-activation methods for PAL expression and purification, the yield of active ligase is increased from 1-2 mg·L⁻¹ of E. coli culture to more than 9 mg·L⁻¹ for the current method. As a cautionary note, scaling up in the laboratory does not necessarily translate into an exact tenfold increase in yield, as large volumes of refolding buffers would have to be handled when using several liters of cell culture. We foresee that the first successful purification of a stable and constitutively active ligase in a bacterial expression system described here could constitute a cost-effective way for the large-scale production of several hyperactive ligases. In turn, these constitutively active enzymes will be convenient tools for various attractive industrial applications that require protein conjugation, such as the manufacturing of antibody-drug conjugates.

Design and expression of constitutively active OaAEP1b-C247A-Δ351
The expression constructs spanning residues Gly55 to Asp351, with an N-terminal hexahistidine tag followed by a TEV cleavage site, were synthesized by BioBasic (Singapore City, Singapore). These constructs were expressed in E. coli BL21 (T1R) cells cultivated at 37 °C to an OD600 of ~1 in LB media (BioBasic). The proteins were overexpressed following induction with 0.5 mM IPTG at 18 °C for 18 h. Cells were pelleted and stored at −80 °C before purification.
Protein samples were collected after resolubilization, dialysis, and purification and were analyzed by SDS/PAGE. Western blot analysis was also carried out using an anti-His antibody obtained from Sigma Aldrich (Saint Louis, MO, USA) (catalog number: SAB4301134) to validate the purification of the protein. The purified protein was trypsin digested using a standard protocol and the digested fragments were analyzed via mass spectrometry (Table S1).

Purification and expression of full-length OaAEP1b-C247A
The full-length OaAEP1b-C247A construct was synthesized by BioBasic and was expressed in E. coli BL21 (T1R) cells. Expression and activation of OaAEP1b-C247A were done according to reference [17].

Kinetics assay
The kinetic properties of the peptide ligation of the constitutively active PAL were studied using a FRET assay. Two peptides synthesized by Genscript Biotech, PIE{EDANS}YNAL and GIK{DABSYL}SIP, were mixed at a molar ratio of 1 : 3. Upon ligation, the peptide PIE{EDANS}YNGIK{DABSYL}SIP is produced. Fifty nanomolar of PAL enzyme was mixed with various concentrations of the peptide mixture. The EDANS fluorescence signal was measured with an excitation wavelength of 336 nm and an emission wavelength of 490 nm. A reduction in EDANS fluorescence signal occurs upon ligation of the two peptides due to quenching by DABSYL. The variation in fluorescence signal for each substrate mixture concentration was measured after the addition of the enzyme to initiate the reaction. The rate of decrease in fluorescence signal during the first 30 s after enzyme addition was plotted against the substrate concentration to obtain the Vmax, kcat, and Km values for each enzyme.
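As an illustration of how Vmax and Km can be extracted from initial rates, the sketch below fits the Michaelis-Menten equation through its Hanes-Woolf linearization. The fitting method actually used in this work is not specified, so this is only one possible approach; the substrate concentrations are hypothetical and the rates are synthetic, generated from the Vmax and Km reported for OaAEP1b-C247A-Δ351.

```python
def michaelis_menten(substrate_um, vmax, km_um):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * substrate_um / (km_um + substrate_um)


def fit_hanes_woolf(substrate_um, initial_rates):
    """Estimate (Vmax, Km) by least squares on the Hanes-Woolf form S/v = S/Vmax + Km/Vmax."""
    y = [s / v for s, v in zip(substrate_um, initial_rates)]
    n = len(substrate_um)
    mean_x = sum(substrate_um) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate_um)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate_um, y))
    slope = sxy / sxx                  # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Hypothetical substrate concentrations (µM); rates are synthetic, not measured data.
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0, 40.0]
rates = [michaelis_menten(s, 6.40, 8.16) for s in concentrations]

vmax_fit, km_fit = fit_hanes_woolf(concentrations, rates)
print(f"Vmax = {vmax_fit:.2f} RFU/s, Km = {km_fit:.2f} µM")
```

With real, noisy data a direct nonlinear fit is usually preferred over a linearization. Converting Vmax in RFU·s⁻¹ to kcat would additionally require a calibration of fluorescence against product concentration and the active enzyme concentration.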

Active site titration
We followed the procedure described in reference [32]. The enzyme preparation was diluted to a concentration of 280 nM using a buffer containing 20 mM sodium phosphate at pH 6.5 and 5 mM 2-mercaptoethanol. Solutions containing serial twofold dilutions of the inhibitor YVAD-cmk [19] were prepared in a black microtiter plate (Greiner Bio-One GmbH, Kremsmünster, Austria) using buffer as diluent. The enzyme was subsequently added to the wells containing the inhibitor to a final volume of 50 µL. The plate was incubated for 1 h at room temperature before adding FRET peptides (PIE{EDANS}YNAL and GIK{DABSYL}SIP), which were mixed at a molar ratio of 1 : 3, giving a final enzyme : substrate molar ratio of 1 : 200. The EDANS fluorescence signal was measured with an excitation wavelength of 336 nm and an emission wavelength of 490 nm. Relative fluorescence units (RFU) of quenched EDANS signal were plotted against time. The value of the initial velocity (Vi) was determined from the slope of the RFU(t) curve. The measured value of Vi was subsequently normalized by dividing by the initial rate obtained in the absence of inhibitor (control V0). The calculated Vi/V0 ratio was plotted against inhibitor concentration, generating an inhibition curve. The titer of the enzyme active site was then inferred from the intercept of this inhibition curve with the x-axis, assuming a 1 : 1 interaction between enzyme and inhibitor, which is in agreement with experimental crystallographic structures of homologous PALs with a peptide substrate published previously [19,35].
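The x-intercept analysis described above can be sketched as a linear regression of Vi/V0 against inhibitor concentration. All numbers below are hypothetical and synthetic (a nominal enzyme concentration of 100 nM with 53% active sites, chosen to mirror the reported active fraction); they are not the measured titration data.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def active_site_titer(inhibitor_nm, relative_rates):
    """Inhibitor concentration at which Vi/V0 extrapolates to zero (x-intercept).

    Assumes 1 : 1 irreversible binding, so the intercept equals the active enzyme concentration.
    """
    slope, intercept = linear_fit(inhibitor_nm, relative_rates)
    return -intercept / slope


NOMINAL_ENZYME_NM = 100.0  # hypothetical total protein concentration in the well

# Synthetic titration: residual activity falls linearly until all active sites are blocked.
inhibitor = [0.0, 10.0, 20.0, 30.0, 40.0]
relative = [1.0 - conc / 53.0 for conc in inhibitor]

titer = active_site_titer(inhibitor, relative)
active_fraction = titer / NOMINAL_ENZYME_NM
print(f"active sites: {titer:.1f} nM ({active_fraction:.0%} of nominal)")
```

Only points in the linear, partially inhibited region should enter the fit; wells with inhibitor in excess of the enzyme plateau near zero activity and would bias the intercept.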

Conjugation of TrmJ
A concentration of 200 nM of OaAEP1b-C247A-Δ351 was used to conjugate 10 µM of TrmJ-NAL [33] with 50 µM of a short fluorescent peptide synthesized by Genscript Biotech: GIGGIYRK-FITC. This reaction was carried out in 20 mM NaH2PO4, pH 6.5, at 37 °C for 1 h with a final volume of 500 µL. A volume of 50 µL of the reaction was mixed with 5× SDS loading dye after 5, 10, 20, 30, and 60 min. The amount of conjugated TrmJ-NAL at all time points was then analyzed using SDS/PAGE.
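The stoichiometry of this reaction can be made explicit with a short calculation based only on the concentrations and volume given above; the helper name is illustrative.

```python
ENZYME_NM = 200.0           # OaAEP1b-C247A-Δ351
TRMJ_UM = 10.0              # TrmJ-NAL
PEPTIDE_UM = 50.0           # GIGGIYRK-FITC
REACTION_VOLUME_UL = 500.0


def picomoles(concentration_um: float, volume_ul: float) -> float:
    """Amount of substance: µM multiplied by µL gives pmol."""
    return concentration_um * volume_ul


enzyme_um = ENZYME_NM / 1000.0
substrate_per_enzyme = TRMJ_UM / enzyme_um  # TrmJ molecules per enzyme molecule
peptide_excess = PEPTIDE_UM / TRMJ_UM       # molar excess of the nucleophile peptide

print(f"enzyme : TrmJ = 1 : {substrate_per_enzyme:.0f}")
print(f"peptide excess over TrmJ = {peptide_excess:.0f}x")
print(f"enzyme {picomoles(enzyme_um, REACTION_VOLUME_UL):.0f} pmol, "
      f"TrmJ {picomoles(TRMJ_UM, REACTION_VOLUME_UL):.0f} pmol")
```

The enzyme is thus used at a 1 : 50 molar ratio to the target protein, in contrast to the near 1 : 1 ratio mentioned in the introduction for sortase A, with the fluorescent peptide in fivefold molar excess.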