Novel Homologs of Isopentenyl Phosphate Kinase Reveal Class‐Wide Substrate Flexibility

Abstract
The widespread utility of isoprenoids has recently sparked interest in efficient synthesis of isoprene‐diphosphate precursors. Current efforts have focused on evaluating two‐step “isoprenol pathways,” which phosphorylate prenyl alcohols using promiscuous kinases/phosphatases. The convergence on isopentenyl phosphate kinases (IPKs) in these schemes has prompted further speculation about the class's utility in synthesizing non‐natural isoprenoids. However, the substrate promiscuity of IPKs in general has been largely unexplored. To address this gap, we report the biochemical characterization of five novel IPKs from Archaea and the assessment of their substrate specificity using 58 alkyl‐monophosphates. This study reveals the IPK‐catalyzed synthesis of 38 alkyl‐diphosphate analogs and discloses broad substrate specificity of IPKs. Further, to demonstrate the biocatalytic utility of IPK‐generated alkyl‐diphosphates, we also highlight the synthesis of alkyl‐L‐tryptophan derivatives using coupled IPK‐prenyltransferase reactions. These results reveal IPK‐catalyzed reactions are compatible with downstream isoprenoid enzymes and further support their development as biocatalytic tools for the synthesis of non‐natural isoprenoids.


Introduction
Isoprenoids belong to one of the most structurally and chemically diverse classes of natural products in existence and are utilized in a broad range of applications throughout the pharmaceutical and biotechnological industries. [1] Despite their diverse forms and functions, all isoprenoids are derived from the two universal precursors isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), which are produced in nature by either the mevalonate (MVA, Scheme 1A) or the deoxy-xylulose-5-phosphate (DXP) pathway. [2] The recent discovery of isopentenyl phosphate kinase (IPK) in Archaea has also led to the identification of an alternate MVA pathway that bifurcates from the classical pathway following the formation of mevalonate-5-phosphate (M5P). [2a,b,d,3] In the classical pathway (dotted box in Scheme 1A), M5P is phosphorylated to a diphosphate before being decarboxylated to form IPP, whereas in the alternate pathway these steps are reversed, with IPK performing the final phosphorylation of isopentenyl monophosphate (IP) to IPP (solid box in Scheme 1A). [2d,3a,b,d-h] Due to the inherent value of these precursors, significant effort has been invested into engineering the pathways for large-scale production of valuable isoprenoids. [1b,4] However, several prominent factors, including essential isoprene production for cellular function, impractical metabolic fluxes, inherent reaction complexity, limited availability of precursors, and the number of required enzymes, have made optimizing the natural pathways a challenging endeavor. To overcome these barriers, several groups have attempted to simplify the process by creating artificial "isoprenol pathways" (Scheme 1B), which leverage the promiscuity of various kinases to bypass the natural pathways and radically reduce the number of required enzymes. [4b,5]
The artificial pathways described thus far utilize promiscuous kinases or phosphatases (acid phosphatase PhoN from Shigella flexneri, [4b] hydroxyethylthiazole kinase ThiM from Escherichia coli, [5b] and choline kinase from Saccharomyces cerevisiae [5a,d]) to phosphorylate the isoprene alcohols (isopentenyl alcohol, dimethylallyl alcohol); the resulting monophosphates are subsequently phosphorylated by an IPK to form the diphosphate precursors (Scheme 1B). Together, the two reactions have formed the basis of several platforms capable of producing both IPP and DMAPP in high quantities, [4b,5] and the convergence of the schemes on IPKs for the second reaction points to IPKs as the best candidates for the diphosphorylation step.
Thus far, the biochemical studies of the IPKs from Methanocaldococcus jannaschii (MJ), [3a,6] Methanothermobacter thermautotrophicus (MTH), [7] Thermoplasma acidophilum (THA), [7] Arabidopsis thaliana (AT) [8] and Haloferax volcanii (HV) [2d] have revealed their high catalytic efficiencies (100-1000 mM⁻¹ s⁻¹). [2d,3a,6-8] In addition, the apo and ligand-bound crystal structures of MJ, MTH, and THA have illuminated the importance of catalytic His and Asp residues in the active site, as well as residues responsible for ATP and alkyl-substrate binding. [6,9] These structural insights have further provided the basis for mutational studies resulting in variant IPKs capable of accepting 10-carbon geranyl and 15-carbon farnesyl monophosphates (GP and FP, respectively) as substrates. [6,10] In addition, substrate specificity studies of THA and MTH revealed them to be promiscuous towards a small set of non-natural alkyl-monophosphate (alkyl-P) analogs. [7,11] However, the breadth and depth of the substrate promiscuity of IPKs as an enzyme class remain underexplored.
To address this question, we describe herein the biochemical characterization and broad substrate specificity assessment of 5 new IPK homologs from various Archaea using a library of 58 synthetic alkyl-P analogs (natural and non-natural). Consistent with previous work, [6-7,9,11b] our studies reveal most IPKs to have broad substrate specificities driven largely by steric factors in the active site, and our analysis of the 5 homologs has enabled the synthesis of 38 unique alkyl-diphosphate (alkyl-PP) analogs. Further, to unequivocally confirm the generation of non-natural alkyl-PPs, we coupled the activity of the IPKs to that of FgaPT2, one of several aromatic prenyltransferases (PTs) known to utilize non-natural alkyl-PPs as substrates. [12] Our results with a representative set of alkyl-Ps indeed confirm the generation and utilization of alkyl-PPs by IPKs and FgaPT2, respectively. Overall, the study illuminates the hypothesized promiscuity of IPKs as an enzyme class while simultaneously providing a toolbox of enzymes for the synthesis of non-natural alkyl-PPs, which may have further utility in generating non-natural isoprenoids.

Initial Screening of IPKs
A set of five novel IPKs was identified from various Archaea [Candidatus Methanomethylophilus alvus (CMA), Candidatus Nitrososphaera gargensis Ga9.2 (CNG), Methanohalophilus mahii DSM 5219 (MHM), Methanosarcina barkeri 3 (MSB), and Thermococcus paralvinellae (TCP)] using a BLAST search against a template IPK sequence from MJ. [3a,6] Recombinant IPK constructs were then overexpressed in Escherichia coli Rosetta2 cells transformed with codon-optimized synthetic genes in pET28a vectors, and the resulting N-His6-fusion proteins were purified using Ni-NTA chromatography. Assessment of IPK activities with the library of alkyl-P analogs (Figure 1A) was performed using the pyruvate kinase-lactate dehydrogenase (PK-LDH) assay under standardized conditions (150-μL reaction, 25 mM Tris pH 7.8, 50 mM KCl, 5 mM MgCl2, 4 μg IPK, 2 U PK, 2 U LDH, 0.6 mM NADH, 1.5 mM PEP, 2 mM ATP, 1 mM alkyl-P; incubated for 30 min at 37°C; Figure 1B, Table S6), and the formation of putative alkyl-PP analogs was subsequently confirmed by high-resolution mass spectrometry (HRMS) for all positive reactions (Table S7).
Cumulative analysis revealed 38 of the 58 tested alkyl-P analogs acted as substrates across the 5 IPKs, with MSB, MHM, CMA, TCP, and CNG accepting 35, 32, 26, 23, and 22 analogs, respectively. In general, the non-allylic and allylic alkyl-Ps bearing 4-6 atoms in their linear chain were good substrates (> 50 % conversion), those with < 4 or 7-9 atoms displayed moderate conversion (10-50 %), and those with ≥ 10 atoms were poor substrates (< 10 %) or not accepted at all. Interestingly, dienes (20-22) did not serve as substrates for any IPKs even with chain lengths of < 10 atoms, and the substrate scope of IPKs with benzylic and heterocyclic alkyl-Ps was generally limited to the core benzylic scaffold with less bulky substituents or heterocyclic analogs of smaller size (56). However, the benzodioxole analog (54) acted as a substrate for MHM and MSB, and similar methoxy-substituted analogs were accepted to either a small extent (52) or not at all (53 and 55) by the IPKs. In cases where direct comparisons could be made within the same category, increasing the size of the alkyl chain correlated with a reduction in turnover (e.g., 2 > 16 > 23; 8 > 15 > 29 > 35; 10 > 32 > 36; 46 > 47 > 48 > 50). Collectively, these analyses indicated steric interactions were the major contributing factors to the substrate scope of the IPKs, with MSB and MHM hypothesized to contain the least constrained active sites based on their broad substrate specificities.

Figure 1. (A) The library of alkyl-Ps utilized in this study. The natural substrate IP and its isomer DMAP are shown in a solid box. Alkyl-P analogs that did not show turnover with any IPK are shown in red. (B) Screening of five novel IPK homologs against the library of alkyl-Ps. Conversions were calculated by measuring the absorbance at 340 nm of each enzymatic reaction just before and 30 min after the addition of IPK at 37°C and comparing it to a positive control (n = 2). Appropriate controls were conducted to account for any ATPase activity. Each reaction consisted of 2 U PK, 2 U LDH, 0.6 mM NADH, 1.5 mM PEP, 2 mM ATP, and 4 μg of IPK incubated in a buffered solution (25 mM Tris pH 7.8, 5 mM MgCl2). All positive reactions were verified using HRMS.

ChemCatChem Full Papers doi.org/10.1002/cctc.202100595

Kinetic Studies
In addition to their generalized ability to utilize non-natural substrates, we also wanted to understand the enzymes' catalytic efficiencies (kcat/KM) with the alkyl-P analogs as an initial gauge of their biosynthetic utility. As such, kinetic analyses were performed with a representative set of substrate/IPK pairs (Table 1) using the PK-LDH assay (1 mM IP (1) and 0.01-2 mM ATP, or 1 mM ATP and 0.025-6 mM alkyl-P analog, in 25 mM Tris pH 7.8, 5 mM MgCl2, 2 U PK, 2 U LDH, 0.6 mM NADH, 1 mM PEP at 37°C). The KM values of all five homologs with ATP and the natural substrate IP (1) were consistent with those previously reported for IPKs. [2d,3a,6-8] However, while the catalytic efficiencies of CMA, MSB, and MHM with the natural substrate IP (1) were consistent with those of previously characterized IPKs (100-1000 mM⁻¹ s⁻¹), [2d,3a,6-8] the values for TCP and CNG were an order of magnitude lower (18 and 28 mM⁻¹ s⁻¹, respectively) mainly due to low kcat values. Furthermore, while DMAP (2) displayed similar efficiencies with MSB/MHM compared to IP, its efficiency decreased 3-fold (for CMA/TCP) and 8-fold (for CNG) due to higher KM values. Interestingly, the chloro-substituted DMAP analog (14) displayed similar efficiency to IP for TCP/CNG, a 3-fold lower value for CMA/MSB, and a 7-fold decrease for MHM, while removing the methyl group of DMAP (13) and cyclizing C4 and C5 with an additional carbon (18) decreased efficiencies by 7- to 50-fold and 30- to 600-fold, respectively, compared to 1. The alkyne (15) and azide (32) installations decreased efficiencies significantly across the board compared to 1 and 2, and though the addition of carbons to the alkyl-chain of 2 (23 and 28) caused decreases in catalytic efficiency, it was not necessarily due to large losses in KM. The most interesting cases of this phenomenon were with MHM and MSB, which appeared to arise from a notable combination of increased KM and decreased kcat.
Additionally, the binding efficiencies for 23 and 28 resembled values obtained for the butene analog 13 with either similar or decreased kcat values. As for the benzylic and heterocyclic analogs, steric factors contributed prominently, although differences in electron density also played a more pronounced role in determining catalytic efficiency compared to the allylic alkyl-Ps. The KM values for the unsubstituted benzyl analog 46 were similar to those of 2, but the kcat values fell by more than an order of magnitude in all cases except CNG (no change in kcat, increased KM). Such similar KM values between 2 and 46 could be related to their similar carbon chain length and planarity, but the effect of these factors appeared to lessen with the sequential addition of fluorine atoms (47, 48), which degraded the homologs' specificity (48 < 47 < 46) mostly by increasing KM values. Considering the similar sizes of fluorine and hydrogen, this was likely caused by slightly unfavorable electronic interactions between the electron-rich substituent and the hydrophobic binding pockets. Chlorination at the para-position (49) had an even more dramatic effect on the enzymes' activity with benzyl analogs (49 < 47 < 46). Three IPKs (CNG, MHM, and TCP) displayed no detectable activity with 49, while the catalytic efficiencies of CMA and MSB were reduced by 3-4 orders of magnitude. These reductions in activity were likely caused by the greater steric bulk and electron density of the chlorine atom compared to fluorine, which would exacerbate the unfavorable interactions hypothesized with 48. Adding other bulky substituents to the benzyl group appeared to further prohibit them from acting as substrates, except in the case of the benzodioxole group (54). Interestingly, this analog was accepted by MSB with catalytic efficiency an order of magnitude lower than 46, mostly due to higher KM values.
This phenomenon could be related to the high conformational constraint imposed by the dioxole ring (compared to methoxy substituents) coupled with the increased binding pocket space hypothesized for MSB. Finally, the heterocyclic thiophene ring of 56 displayed 2-fold higher kcat as well as KM values compared to 46, resulting in similar catalytic efficiency to the core benzylic scaffold. This could result from the reduced steric bulk of the thiophene ring being compensated by the increased electron density of the sulfur atom. Nevertheless, the demonstrated ability of the IPKs to utilize benzylic and heterocyclic analogs at all implied the class may be suitable for drug discovery purposes. [15] In general, the lower catalytic efficiencies of the non-natural analogs compared to IP were due to a combination of decreased kcat and increased KM values, which correlated to the overall size of the substrate. Based on these values, TCP and CNG were the worst catalysts for all substrates; CMA was the best catalyst for 1, 2, 13, and 14; and MSB served as a suitable catalyst for analogs with longer allylic chains (23 and 28) and alkyne (15), azide (32), benzylic (46-49, 54), and heterocyclic (56) substituents. The comparative analysis with natural and non-natural analogs indicated that the IPK with the highest catalytic efficiency with the natural substrate (CMA) may not be the best catalyst for diverse non-natural substrates due to higher KM values compared to homologs with lower efficiency for the natural reaction (MSB and MHM). Alongside the turnover data, this suggests MHM and MSB contain IP binding pockets with additional space to accommodate alkyl-chain variations.
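For readers less familiar with the notation, the kinetic parameters compared above can be related through the standard Michaelis-Menten rate law. The form below is the textbook expression, not an equation taken from the original analysis; the symbols v_0 (initial rate), [E]_0 (total IPK concentration) and [S] (concentration of the varied substrate, alkyl-P or ATP) are introduced here for illustration only.

```latex
v_0 = \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_{\mathrm{M}} + [S]},
\qquad
v_0 \approx \frac{k_{\mathrm{cat}}}{K_{\mathrm{M}}}\,[E]_0\,[S]
\quad \text{for } [S] \ll K_{\mathrm{M}}
```

In the low-substrate limit the rate scales with kcat/KM, so a lower kcat, a higher KM, or both reduce the specificity constant, which is the pattern described for the larger non-natural analogs.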

Sequence Alignment
To understand the differences in binding pocket architecture, we carried out a sequence alignment of the five IPKs from the current study with homologs whose crystal structures [6,9] have been solved previously (Figure S115). Among the 8 sequences, 25-38 % sequence identity was observed between any two IPKs, except in the case of MSB and MHM (51 %). Upon closer inspection, alignment of the enzymes' alkyl-binding sites revealed major differences in 4 previously identified residues corresponding to Y70, V73, V130 and I140 in THA (mapped as R1-R4, respectively, in Figure 2). In most cases, the 4 residues featured hydrophobic side chains (R1 = Y/F/A, R2 = I/V/T, R3 = I/V/T, R4 = I/V). [6,9] However, 2 of the 4 amino acid residues displayed noticeably smaller side chains in MSB, MHM and TCP (R1 = A/S, R2 = T for MSB/MHM; R2 = R3 = T for TCP), while CMA and CNG had only one or no smaller side chains in any of the 4 positions. The corresponding increase in binding pocket space may thus explain the lower KM values for MSB, MHM, and TCP with most of the non-natural alkyl-P analogs compared to CMA and CNG. Furthermore, a comparison of kinetic parameters between CMA, MSB, MHM, and TCP suggests the presence of smaller side chains at R1 and R2 is more beneficial for promiscuity than when similar residues are found at R3. Taken together, these observations suggest that smaller side chains at specific positions in the binding pocket can accommodate substantial changes to the structure of the natural substrate.
More specifically, we hypothesize that the substitution of bulkier residues by A/S and T at R1 and R2, respectively, creates larger and more flexible binding pockets in MHM and MSB that can accommodate longer alkyl chains. This hypothesis is supported by both the initial screening data (Figure 1B) and the two enzymes' specificities with analogs 15, 18, 23, 28, and 32 (Table 1). Although specificity does begin to decrease 15- to 30-fold with the longer chains of 28 and 32, the closeness in value of kcat/KM between 15, 18, and 23 suggests the two enzymes maintain reasonable activity with linear alkyl-chains of 6 atoms or less. Furthermore, in the case of MSB, the trends in KM and kcat showed 23 to bind more efficiently and be phosphorylated at a higher rate than either 15 or 18, which cements its preference for longer-chain non-natural alkyl-Ps. In terms of biocatalytic utility, both MHM and MSB have demonstrated the potential for synthesizing longer-chain precursors for non-natural isoprenoid scaffolds. Overall, the good catalytic efficiencies of the IPKs with most of the tested alkyl-P analogs (≥ 1 mM⁻¹ s⁻¹) point to a generalized promiscuity within the enzyme class that can be harnessed as a biocatalytic tool.

Figure 2. (A) … [9] The catalytic His residue is shown in yellow, and the four residues forming the alkyl-chain binding pocket are shown in purple. (B) Identities of active site residues R1-R4 in all biochemically characterized IPKs as determined by sequence alignment using PROMALS3D (see Figure S115). [16]
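The pairwise identities quoted for the alignment (25-38 % in general, 51 % for MSB/MHM) are the kind of figure that can be computed from any two rows of a multiple sequence alignment. The sketch below is illustrative only: the function name is hypothetical, the toy sequences are invented rather than taken from the IPKs, and the choice to count only columns where neither sequence has a gap is an assumption, since the source does not state which identity definition was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: only columns in which neither row has a gap are counted.
    Other conventions (e.g. dividing by the shorter sequence length) give
    somewhat different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy example with made-up aligned fragments (not real IPK sequences).
row_1 = "AC-DE"
row_2 = "ACFDQ"
print(f"{percent_identity(row_1, row_2):.1f} % identity")
```

Applying such a function to every pair of rows in an 8-sequence alignment yields 28 pairwise values, which is the form in which the identity range above is reported.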

IPK-PT Coupled Platform
As an initial test of this utility, we decided to couple the IPKs' ability to generate alkyl-PPs to an isoprenoid enzyme with compatible substrate requirements in a one-pot synthesis. Considering the demonstrated promiscuity of PTs towards non-natural allylic and benzylic alkyl-PPs, [12,14] we chose to utilize one such catalyst as the auxiliary enzyme. Specifically, the L-Trp C4-PT FgaPT2 from fumigaclavine biosynthesis [17] was selected for its ability to utilize chemically diverse alkyl-PPs (both allylic and benzylic) as donor substrates. [12] The resulting IPK-FgaPT2 coupled system employed standard assay conditions (1.5 mM alkyl-P analog, 2 mM ATP, 1 mM L-Trp, 5 μM MSB, 20 μM FgaPT2, 25 mM Tris pH 7.8, 5 mM MgCl2, 50 mM KCl; incubated for 16 h at 37°C) and included select allylic/benzylic/heterocyclic alkyl-P analogs (non-allylic alkyl-PPs are not substrates of PTs) [12,14] that afforded ≥ 50 % turnover with MSB under standard conditions (Scheme 2). The resulting HPLC analysis revealed incubation of the coupled system with the 10 selected alkyl-P analogs led to good yields of alkylated L-Trp analogs in all reactions (≥ 75 %, Scheme 2, Figure S159), and the regiospecificity of 7 out of 10 products was assigned based on previous work. [12,14] Importantly, this is the first report of an IPK-coupled, PT-catalyzed generation of alkylated L-Trp derivatives using non-natural alkyl-P starting materials.

Conclusion
In summary, the biochemical characterization of 5 new IPKs from Archaea has revealed that the enzymes share a class-wide promiscuity toward non-natural alkyl-Ps and display catalytic efficiencies across two orders of magnitude with the natural substrate. Subsequent analysis of active-site sequence alignments points to residues with smaller side chains at key positions in the alkyl-binding site as being responsible for broader substrate specificity in the most promiscuous IPKs. In addition, the work herein has demonstrated that the IPK-catalyzed synthesis of non-natural alkyl-PP analogs is directly compatible with downstream alkyl-PP-utilizing enzymes (specifically PTs). Additional engineering studies will be needed to understand the role of different residues in the optimal binding of non-natural analogs, but the data presented here imply that fully optimized IPKs could be utilized as a biocatalytic tool in the two-enzyme isoprenol pathways or other metabolic pathways for the synthesis of non-natural isoprenoids. Furthermore, the IPKs' activity with diverse allylic and benzylic analogs, as well as analogs bearing synthetic handles, speaks to the class's overall biosynthetic potential, both in terms of unique scaffold generation and late-stage diversification of existing aromatics. Thus, the current study continues our collective advance towards the efficient synthesis of novel isoprenoids.

Synthesis of Alkyl-P Analogs
Detailed methods for the synthesis of alkyl-P analogs are summarized in the Supporting Information.

IPK Homolog Expression and Purification
Using a BLAST search, hypothetical IPKs were identified in the genomes of Candidatus Methanomethylophilus alvus, Candidatus Nitrososphaera gargensis Ga9.2, Methanohalophilus mahii DSM 5219, Methanosarcina barkeri 3, and Thermococcus paralvinellae using the input sequence of MJ. The corresponding genes were then synthesized and codon-optimized for expression in E. coli by GenScript (Supplementary Information Tables S3-S4). For the purposes of expression and purification, the synthetic genes were inserted into the pET28a vector between the NdeI and EcoRI sites using restriction digest, fusing a His6-tag to the N-terminus of the expressed proteins (Supplementary Information Table S5). The recombinant plasmids were transformed into E. coli Rosetta2 cells using heat shock, and the transformed cultures were then plated on Luria-Bertani (LB) agar supplemented with 50 μg mL⁻¹ kanamycin (KAN) to grow at 37°C. Resulting colonies were individually sequenced, and those bearing the recombinant plasmids were used to make LB broth cultures (supplemented with 50 μg mL⁻¹ KAN) grown at 37°C and stored long-term in 25 % glycerol at −80°C.
Purification began with three complete cycles of freezing at −80°C and thawing at room temperature. Cultures were then lysed through a combination of lysozyme treatment (30 min on ice) and sonication (40 min total, in cycles of 10-s pulses and 25-s breaks). Centrifugation was used to separate the soluble fraction from the insoluble cell debris (16000 rpm, 10°C, 1 h), and the supernatant was subsequently purified using Ni-NTA chromatography. Fractions containing the purified protein were pooled and concentrated in an Amicon centrifugal concentrator (Merck Millipore, Burlington, MA, USA), and repeated cycles of dilution and re-concentration in storage buffer (25 mM Tris pH 8.0, 50 mM KCl, 20 % v/v glycerol) removed a majority of the imidazole (final concentration < 1 mM). The purity of the concentrated protein was checked by SDS-PAGE, and pure proteins were drop-frozen in liquid N2 and stored at −80°C.

High-Throughput Screening of Alkyl-P Analogs
Purified IPK variants were screened against the library of alkyl-Ps using the pyruvate kinase-lactate dehydrogenase (PK-LDH) assay, which has been utilized previously to characterize similar proteins. [5b] Briefly, reactions were conducted in a high-throughput (HT) manner using 96-well plates. Each well contained a 150-μL reaction mixture composed of 4 μg IPK, 2 U PK, 2 U LDH, 0.6 mM NADH, 1.5 mM PEP, 2 mM ATP, and 1 mM alkyl-P in buffer (25 mM Tris pH 7.8, 5 mM MgCl2). Before the IPK was added (10 μL), an absorbance reading at 340 nm was taken to establish each well's initial A340 before any NADH was consumed. After the addition of IPK, the absorbance at 340 nm was monitored every 30 s for 1 h at 37°C, and the difference between the final A340 at 1 h and the initial A340 before the addition of IPK was used to calculate the percentage turnover of NADH. All positive reactions were subsequently confirmed using HRMS as described in the Supplementary Information. Additionally, two control reactions were conducted: one without any alkyl-P to establish the baseline, and one using ADP instead of ATP to identify full conversion.
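The turnover calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, the absorbance readings in the usage lines are invented, and the linear scaling between the two controls (the no-alkyl-P reaction taken as 0 % and the ADP reaction taken as 100 %), together with the clamping to the 0-100 % range, is an assumption about how the controls were applied.

```python
def a340_drop(initial_a340: float, final_a340: float) -> float:
    """Decrease in A340 over the incubation (positive when NADH is consumed)."""
    return initial_a340 - final_a340


def percent_turnover(sample, baseline, full_conversion) -> float:
    """Scale a well's A340 drop between the two control reactions.

    Each argument is an (initial_a340, final_a340) pair. Assumption: the
    no-alkyl-P control defines 0 % and the ADP control defines 100 %.
    """
    sample_drop = a340_drop(*sample)
    baseline_drop = a340_drop(*baseline)
    full_drop = a340_drop(*full_conversion)
    span = full_drop - baseline_drop
    if span <= 0:
        raise ValueError("full-conversion control must exceed the baseline")
    percent = 100.0 * (sample_drop - baseline_drop) / span
    # Clamp so that measurement noise cannot produce values outside 0-100 %.
    return max(0.0, min(100.0, percent))


# Usage with invented plate-reader values (initial, final) for one well.
well = (1.00, 0.50)
no_alkyl_p_control = (1.00, 0.90)
adp_control = (1.00, 0.10)
print(f"{percent_turnover(well, no_alkyl_p_control, adp_control):.0f} % turnover")
```

Because the reaction contains 0.6 mM NADH and 1 mM alkyl-P, the readout saturates once the NADH is exhausted, which is one reason to normalize against a full-conversion control rather than against the nominal substrate concentration.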

Kinetic Studies of IPK Homologs
From the initial screening data, pairs of enzymes and substrates were selected for kinetic characterization using the PK-LDH assay. For studies of the alkyl-P analogs as substrates, each well contained a 150-μL buffered reaction mixture (25 mM Tris pH 7.8, 5 mM MgCl2) composed of 2 U PK, 2 U LDH, 0.6 mM NADH, 1 mM PEP, 2 mM ATP, 0.025-6 mM alkyl-P, and a concentration of IPK suitable to observe � 20 % conversion of NADH in 30 min. The addition of IPK initiated the reaction, and A340 was measured every 30 s for 30 min at 37°C. For kinetic studies of ATP as a substrate, the same conditions were used except 1) IP (1) was used as the phosphoryl acceptor for all enzymes at a constant concentration of 1 mM, and 2) the concentration of ATP was varied in the range 0.01-2 mM. Initial rates were determined from the slope of the line of best fit for the time period in each reaction giving ~10 % conversion after excluding the first 3 min (considered an equilibration period). Slopes were corrected for the degradation of NADH using the slope of a control reaction containing no alkyl-P. The kinetic constants kcat and KM and their associated errors were determined by inputting initial rate data for each substrate concentration into GraphPad Prism (GraphPad Software, San Diego, CA, USA) and conducting a nonlinear regression. Values for the specificity constant kcat/KM were obtained by performing the calculation in Microsoft Excel (Microsoft Corporation, Redmond, WA, USA) and propagating the errors.
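The data-reduction steps described above (background-corrected initial rates, a Michaelis-Menten fit, and error propagation for kcat/KM) can be sketched as follows. This is a hedged stand-in, not the GraphPad Prism or Excel workflow used in the study: all function names are hypothetical, the fit uses a simple profile search over KM rather than Prism's algorithm, the optical path length must be supplied by the user because it is not given in the source, the rate window is simplified to all points after the 3-min equilibration period, and the error propagation assumes uncorrelated errors in kcat and KM. The NADH extinction coefficient (6.22 mM⁻¹ cm⁻¹ at 340 nm) is the standard literature value.

```python
import math

NADH_EXT_COEFF = 6.22  # mM^-1 cm^-1 at 340 nm (standard literature value)


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def initial_rate(times_min, a340, control_slope, path_cm, skip_min=3.0):
    """Background-corrected NADH consumption rate in mM per min.

    Points before skip_min are dropped (equilibration period); control_slope
    is the A340 slope (per min) of the reaction containing no alkyl-P.
    """
    kept = [(t, a) for t, a in zip(times_min, a340) if t >= skip_min]
    ts, absorbances = zip(*kept)
    corrected = slope(ts, absorbances) - control_slope
    return -corrected / (NADH_EXT_COEFF * path_cm)


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def _profile(km, conc, rates):
    """For a fixed KM, return the least-squares Vmax and the residual sum of squares."""
    f = [s / (km + s) for s in conc]
    vmax = sum(v * fi for v, fi in zip(rates, f)) / sum(fi * fi for fi in f)
    ssr = sum((v - vmax * fi) ** 2 for v, fi in zip(rates, f))
    return vmax, ssr


def fit_michaelis_menten(conc, rates, grid_points=400):
    """Nonlinear least-squares fit: coarse log-scale scan of KM, then golden-section refinement."""
    lo, hi = math.log(min(conc) / 100.0), math.log(max(conc) * 100.0)
    grid = [lo + (hi - lo) * i / grid_points for i in range(grid_points + 1)]
    best = min(range(len(grid)),
               key=lambda i: _profile(math.exp(grid[i]), conc, rates)[1])
    a, b = grid[max(best - 1, 0)], grid[min(best + 1, grid_points)]
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(80):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if _profile(math.exp(c), conc, rates)[1] < _profile(math.exp(d), conc, rates)[1]:
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2.0)
    vmax, _ = _profile(km, conc, rates)
    return vmax, km


def specificity_constant(kcat, kcat_err, km, km_err):
    """kcat/KM with its error, assuming uncorrelated errors in kcat and KM."""
    value = kcat / km
    relative = math.sqrt((kcat_err / kcat) ** 2 + (km_err / km) ** 2)
    return value, value * relative
```

To obtain kcat from the fitted Vmax, divide by the molar enzyme concentration in the well (kcat = Vmax/[E]), keeping the units consistent with those of KM.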