Glucose 6‐phosphate dehydrogenase variants increase NADPH pools for yeast isoprenoid production

Isoprenoid biosynthesis has a significant requirement for the co-factor NADPH. Increasing NADPH levels is therefore critical for enhancing isoprenoid yields in synthetic biology. Previous efforts have focused on diverting flux into the pentose phosphate pathway or overproducing enzymes that generate NADPH. In this study, we instead focused on increasing the efficiency of enzymes that generate NADPH. We first established a robust genetic screen that allowed us to screen for improved variants. The pentose phosphate pathway enzyme glucose 6-phosphate dehydrogenase (G6PD) was chosen for further improvement. Different gene fusions of G6PD with the downstream enzyme in the pentose phosphate pathway, 6-phosphogluconolactonase (6PGL), were created. The linker-less G6PD-6PGL fusion displayed the highest activity among the fusions; although its activity was slightly lower than that of the WT enzyme, its affinity for G6P was higher, and it gave higher yields of the diterpenoid sclareol in vivo. A second gene fusion approach was to fuse G6PD to truncated HMG-CoA reductase, which catalyzes the rate-limiting step and is also the major NADPH consumer in the pathway. Both domains were functional, and the fusion also yielded higher sclareol levels. We simultaneously carried out a rational mutagenesis approach with G6PD, which led to the identification of two G6PD mutants, N403D and S238QI239F, that showed 15-25% higher activity in vitro. Sclareol yields were also higher in the strains overexpressing these mutants than in those overexpressing WT G6PD, and these variants should be beneficial in synthetic biology applications.

Isoprenoids (terpenoids) are naturally occurring, biologically significant, diverse hydrocarbons derived from five-carbon isoprene units. Many of these isoprenoids display important and valuable properties that result in their diverse applications. However, extraction and purification of these isoprenoids from their natural sources are neither economical nor sustainable. For this reason, intense efforts are being made to reconstitute these pathways in microbial hosts to enable their production in these organisms. Escherichia coli and Saccharomyces cerevisiae have been the hosts of choice for these plant-derived isoprenoids.
The predominant approach has been to increase the supply of precursor molecules and direct the carbon flux into the desired pathways. This has been combined with preventing the production of undesired metabolites from branched pathways [1].
Another important factor affecting yields is the limited supply of NADPH (nicotinamide adenine dinucleotide phosphate), a key co-factor in the isoprenoid pathway. Increasing the NADPH pools in cells has increased the yields of various products in diverse pathways, including isoprenoids. This has been demonstrated for α-santalene [2], sterols [3], xylitol [4], caffeine, and carotenoids [5] in yeasts reconstituted with these pathways. Strategies for increasing NADPH levels have included the overexpression of enzymes known to generate NADPH, such as glucose 6-phosphate dehydrogenase (G6PD) [5], the overexpression of regulators of the pentose phosphate pathway (STB5) [6], or, in the case of oleaginous yeasts, the overexpression of mannitol dehydrogenase (MDH2) [7]. Other approaches include increasing the flux into the pentose phosphate pathway [1] or using an NADH-dependent HMG-CoA reductase [8] in place of the NADPH-requiring enzyme of yeasts, which catalyzes the rate-limiting step in isoprenoid biosynthesis. However, increasing the efficiency of the enzymes catalyzing NADPH generation, which would impose a lesser burden than overexpression, is an approach that has surprisingly not been explored. Using enzymes with enhanced NADPH generation could be very beneficial for improving S. cerevisiae as a cell factory for isoprenoid production.
In this study, we have attempted to increase NADPH levels by improving the efficiency of the NADPH-generating enzymes. Using a genetic screen that we developed, we evaluated different metabolic enzymes of Saccharomyces cerevisiae known to enhance NADPH levels. We found that glucose 6-phosphate dehydrogenase (G6PD), the first enzyme of the pentose phosphate pathway, was one of the best enzymes to target for further enhancement. Focusing on this enzyme, we evaluated multiple approaches. Firstly, we evaluated the orthologue of this enzyme from the red yeast Rhodosporidium toruloides, which is known to make high levels of isoprenoid carotenoids [9]. We also examined synthetic metabolon approaches (both nature mimics and those not found in nature) and a rational mutagenesis approach targeting residues in the active site pocket. The latter two approaches yielded promising variants, as seen through in vitro activity determinations as well as in vivo evaluations that included comparing the production of the heterologously produced diterpenoid sclareol in strains expressing the WT enzyme and the variants.

Chemicals and reagents
All the chemicals were purchased from commercial sources and were either analytical grade or molecular grade. The growth media components were obtained from BD Difco (Franklin Lakes, NJ, USA) and Himedia (Mumbai, India). The amino acids, glutathione (GSH) (reduced), β-nicotinamide adenine dinucleotide phosphate sodium salt hydrate (NADP), glucose 6-phosphate (G6P), mevalonolactone, and 6-phosphogluconic dehydrogenase from yeast were obtained from Merck (Darmstadt, Germany). Zymolyase-20T was obtained from MP Biomedicals (USA). Ultra centrifugal filters were obtained from Merck (Burlington, MA, USA). Oligonucleotides were obtained from Integrated DNA Technologies (IDT) and Merck (Bangalore, India). Vent DNA polymerase and restriction enzymes were obtained from New England Biolabs (Ipswich, MA, USA). Plasmid miniprep and gel/PCR clean-up kits were purchased from Thermo Fisher Scientific (Waltham, MA, USA). The NADPH kit was obtained from Promega (Madison, WI, USA); nickel-nitrilotriacetic acid (Ni-NTA) agarose and polypropylene columns were obtained from Qiagen (Hilden, Germany).

Strains, media, and growth conditions
The yeast strains used in the study are described in Table S1. The strains were maintained on yeast extract, peptone, and dextrose (YPD) medium and grown at 30 °C. The yeast cells were transformed by the lithium acetate transformation method as described [10]; transformants were selected and maintained on synthetic defined (SD) minimal medium containing 0.17% yeast nitrogen base, 0.5% ammonium sulfate, and 2% glucose, supplemented with leucine, lysine, uracil, and methionine at 80 mg·L⁻¹.
The E. coli strain DH5α was used as the cloning host and the BL21(DE3)pLysS strain as the protein expression host; both were grown at 37 °C. The growth and handling of yeast and bacteria and all the molecular biology techniques used in this study were according to standard protocols [11].

Cloning of genes into plasmid expression vectors
The genes ZWF1 (which encodes G6PD), IDP2, MAE1 with the first 90 bp truncated (tMAE1), and ALD6 were amplified from genomic DNA isolated from the S. cerevisiae BY4741 strain with their respective forward and reverse primers, as shown in Table S2. The genes were cloned under the TEF promoter in the yeast centromeric vector pRS313TEF (the TEF2 promoter of S. cerevisiae is referred to as the TEF promoter [12]) at the sites mentioned in Table S3. ZWF1 and the ZWF1 mutants were also cloned into the pRS313CYC vector, where the genes were under the weaker CYC promoter (the CYC1 promoter of S. cerevisiae is referred to as the CYC promoter [12]). The vector pRS313CYC was created by excising the TEF promoter from pRS313TEF with XbaI and SacI and replacing it with the CYC promoter. The gene RtG6PD was codon optimized and custom synthesized by GenScript (Piscataway, NJ, USA) (Accession no. OQ291226) and cloned into the yeast expression vector pRS313TEF. The G6PD-6PGL and 6PGL-G6PD fusion proteins were constructed by linking ZWF1 (5′ end) with SOL3 (3′ end), and SOL3 (5′ end) with ZWF1 (3′ end), respectively, through a nucleotide sequence encoding a 16-amino-acid poly Gly-Ser linker, using splice overlap extension-polymerase chain reaction (SOE-PCR). This SOE-PCR consisted of three PCRs. The first PCR amplified the ZWF1 gene with the primers ScG6PD BamHI-FP and G6PDlink-6PGL RP. The second PCR amplified the 6PGL gene using the primers G6PD-link-6PGL FP and 6PGL SalI-RP. The third PCR was the joining PCR, which amplified the fusion gene from the products of the first two PCRs using the primers ScG6PD BamHI-FP and 6PGL SalI-RP. The 2.3 kb fusion gene, along with the linker, was cloned into pRS313TEF and pET23 vectors (Novagen, Madison, WI, USA). The linker-less G6PD-6PGL fusion was constructed in the same way, with ZWF1 at the N-terminus fused directly to SOL3 at the C-terminus, using ScG6PD BamHI-FP and G6PD-6PGL RP in PCR1 and G6PD-6PGL FP and 6PGL SalI-RP in PCR2. The products of PCR1 and PCR2 were joined using ScG6PD BamHI-FP and 6PGL SalI-RP, and the fusion was cloned into the centromeric yeast expression vector pRS313TEF and the bacterial expression vector pET23a (Novagen). The G6PD-tHMG1 fusion protein was constructed by fusing G6PD at the N-terminus to tHMG1 at the C-terminus through an 11-aa Gly-Ser linker by SOE-PCR. ZWF1 was amplified using ScG6PD BamHI-FP and ScG6PD-link-SctHMG link RP; tHMG1 was amplified using ScG6PD-link-SctHMG FP and SctHMG1-XhoI RP. Both amplicons were used to construct the G6PD-tHMG1 fusion, which was cloned into pRS313TEF between the BamHI and XhoI sites. The various mutants of ZWF1 were constructed by SOE-PCR (PCR1 with ScG6PD NheI-FP and the corresponding mutant RP; PCR2 with the corresponding mutant FP and ScG6PD XhoI-RP; the third PCR used the PCR1 and PCR2 products as templates with the ScG6PD NheI-FP and ScG6PD XhoI-RP primers) and cloned into the pRS313TEF and pET23a vectors. The sclareol biosynthetic genes copal-8-ol diphosphate synthase (CcCLS) and sclareol synthase (SsSS), which were custom synthesized [13], were subcloned into the yeast centromeric vectors pRS314TEF and p416TEF, respectively, to make them compatible with a four-plasmid yeast transformation system. All the clones constructed were confirmed by sequencing. The construction and cloning of the constructs used in the study are shown in Fig. 1. The plasmids used in the study are listed and described in Table S3.
The accession numbers of the primary nucleotide sequences used for the construction of plasmids used in this study are ZWF1: NM_001183079.1, SOL3: NM_001179294.

Dilution spotting of yeast cells
Yeast transformants were grown overnight in 5 mL of synthetic defined (SD) medium supplemented with leucine, lysine, uracil, and methionine, then re-inoculated into 10 mL of fresh medium and grown to an OD 600nm of 0.8-1.0. The cells were harvested, washed with autoclaved water, and resuspended in sterile water at an OD 600nm of 0.2. These suspensions were serially diluted 1 : 10, 1 : 100, and 1 : 1000, and 10 µL of each suspension was spotted on the desired minimal medium plates containing different concentrations of methionine or reduced glutathione (Merck, Darmstadt, Germany, Cat No. G6529) and amino acid supplements. The plates were incubated at 30 °C, and the images were captured with the Bio-Rad Gel Doc™ XR+ imaging system after 2-3 days.

Expression and purification of proteins
The G6PD fusion enzymes and the G6PD mutants constructed were tagged at the C-terminus with a 6×His tag and cloned into the pET23a expression vector. The expression vectors were transformed into E. coli BL21(DE3)pLysS. The transformants were grown overnight in LB broth with 25 µg·mL⁻¹ chloramphenicol and 100 µg·mL⁻¹ ampicillin and re-inoculated into fresh medium at OD 600nm 0.05. The cultures were grown to an OD 600nm of 0.5, then induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) (Cat No. I2481C, Goldbio, St Louis, MO, USA) and incubated at 30 °C with shaking for 5 h. The cultures were harvested, and the pellet was stored at −80 °C for further analysis.
The pellet was resuspended in lysis buffer [20 mM Tris-HCl buffer, pH 8, containing 10% glycerol, 500 mM NaCl, 1 mM phenylmethylsulfonyl fluoride, and protease inhibitor mixture (Cat No. P2714, Merck, Darmstadt, Germany)]. The cells were lysed by sonication at 20 amplitude, with 10 s sonication cycles alternating with 15 s recovery periods. The sonicate was centrifuged at 10 000 g for 30 min at 4 °C, and the supernatant was collected. The cleared lysate was loaded onto a nickel-nitrilotriacetic acid agarose column equilibrated with purification buffer (20 mM Tris-HCl, pH 8, 500 mM NaCl). The bound protein was washed with purification buffer containing 30 mM imidazole and finally eluted in purification buffer containing 300 mM imidazole. The purified proteins were analyzed by SDS/PAGE (12% gel). The Microcon-30 kDa Centrifugal Filter Unit (Merck, Burlington, MA, USA) was used to remove imidazole by buffer exchange (with purification buffer) and to concentrate the protein. The concentrations of the purified proteins were estimated by Nanodrop (Eppendorf BioSpectrometer® basic, Hamburg, Germany), and the proteins were subsequently used in the enzyme assays.

Determination of in vitro glucose 6-phosphate dehydrogenase (G6PD) activities and kinetic parameters
The purified proteins were added to the assay reaction buffer (100 mM Tris-HCl, pH 8.0) containing 0.2 mM β-nicotinamide adenine dinucleotide phosphate sodium salt hydrate (NADP) (Cat No. N0505, Merck, Darmstadt, Germany), 0.01 M MgCl2, and 0.6 mM glucose 6-phosphate (G6P) (G7879, Merck, Darmstadt, Germany), and the reduction of NADP was monitored over time at 340 nm at 25 °C in the POLARstar Omega plate reader. The specific activities of the enzymes were calculated from the initial velocity; 1 unit is defined as the amount of enzyme required to reduce 1 µmol of NADP per minute at 25 °C. Enzyme activities were also measured at pH 5, 6, 7, 8, and 9 in different buffers: potassium phosphate buffer (100 mM) was used for pH 5 and pH 6, and Tris buffer (100 mM) for pH 7, 8, and 9.
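The unit definition above can be turned into a short calculation. The following is a minimal sketch, not the authors' analysis: it assumes the commonly used molar absorption coefficient of NADPH at 340 nm (6220 M⁻¹·cm⁻¹), and the path length, well volume, slope, and protein amount are hypothetical values chosen only for illustration.

```python
# Sketch: convert an initial A340 slope into specific activity (units per mg).
# Assumptions (not from the text): 1 cm optical path, 200 µL reaction volume,
# and the example slope and protein amount used below.

NADPH_EXTINCTION_M_CM = 6220.0  # NADPH molar absorption coefficient at 340 nm (M^-1 cm^-1)


def specific_activity(slope_a340_per_min, reaction_volume_l, protein_mg, path_length_cm=1.0):
    """Return units per mg protein, where 1 unit = 1 µmol NADP reduced per minute."""
    # Beer-Lambert law: rate of concentration change (M/min) = (dA/dt) / (epsilon * path length)
    molar_rate = slope_a340_per_min / (NADPH_EXTINCTION_M_CM * path_length_cm)
    # Convert mol/L/min to µmol/min in the actual reaction volume.
    micromol_per_min = molar_rate * reaction_volume_l * 1e6
    return micromol_per_min / protein_mg


# Hypothetical reading: 0.0622 absorbance units per minute, 200 µL well, 20 ng enzyme.
example = specific_activity(slope_a340_per_min=0.0622, reaction_volume_l=200e-6, protein_mg=2e-5)
print(round(example, 3))
```

With these made-up inputs the result is about 100 units per mg. In a plate reader the optical path is usually shorter than 1 cm, so `path_length_cm` would need to be set from the actual well geometry.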
The kinetic parameters were determined by varying the concentrations of G6P or NADP from 5 to 300 µM while keeping the other substrate constant. The initial velocities obtained for each concentration were fitted to the Michaelis-Menten equation via non-linear regression, and the Km values were obtained using GRAPHPAD PRISM 5.0 (Dotmatics, Boston, MA, USA).
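For readers without access to GraphPad Prism, the same kind of fit can be sketched in plain Python. This is an illustrative stand-in, not the procedure used in the study: the function name, the search bounds, and the synthetic data are assumptions. It exploits the fact that, for a fixed Km, the best Vmax has a closed form, so only Km needs a one-dimensional search.

```python
import math


def fit_michaelis_menten(substrate, velocity, km_low=1e-2, km_high=1e4, grid_points=200):
    """Least-squares fit of v = Vmax * S / (Km + S); returns (Km, Vmax)."""

    def best_vmax(km):
        # For a fixed Km the model is linear in Vmax, so Vmax has a closed form.
        x = [s / (km + s) for s in substrate]
        return sum(v * xi for v, xi in zip(velocity, x)) / sum(xi * xi for xi in x)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = best_vmax(km)
        return sum((v - vmax * s / (km + s)) ** 2 for s, v in zip(substrate, velocity))

    # Coarse scan over log(Km) to bracket the best region.
    lo, hi = math.log(km_low), math.log(km_high)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    best = min(range(grid_points), key=lambda i: sse(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, grid_points - 1)]

    # Golden-section search refines log(Km) inside the bracket.
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(100):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2)
    return km, best_vmax(km)


# Synthetic, noise-free data generated from Km = 50 µM and Vmax = 100 (arbitrary units).
s_um = [5, 10, 25, 50, 100, 200, 300]
v_obs = [100 * s / (50 + s) for s in s_um]
km_fit, vmax_fit = fit_michaelis_menten(s_um, v_obs)
print(f"Km = {km_fit:.2f} µM, Vmax = {vmax_fit:.2f}")
```

On the synthetic data the fit should recover values close to the generating parameters. Real initial-velocity data are noisy, and a dedicated regression package would also report confidence intervals, which this sketch does not.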

Detection of 6-phosphogluconolactonase (6PGL) domain functionality in vitro
In this assay, the substrate of 6PGL, 6-phosphogluconolactone, was generated by the action of the G6PD domain of G6PD-6PGL on the substrate glucose 6-phosphate. The 6PGL domain in the fusion protein, if functional, will then act on 6-phosphogluconolactone to generate the end product 6-phosphogluconate. We determined the functionality of 6PGL by demonstrating the formation of 6-phosphogluconate. This assay is modified from a previous report describing the coupled assay [14].
In the first step, the purified G6PD-6PGL fusion protein was added to the reaction buffer (100 mM Tris-HCl, pH 8) containing 0.2 mM NADP, 0.01 M MgCl2, and 0.6 mM G6P, and the reaction of the G6PD domain was followed (as NADP+ reduction at 340 nm at 25 °C) until saturation. The fusion protein was then removed from the reaction mixture with a Microcon 10 kDa centrifugal filter. NADP (200 µM) was added to the protein-free reaction mixture, which was incubated for 5 min to confirm that no residual fusion enzyme remained, as seen by a flat curve (no NADP reduction occurs in the absence of the G6PD-6PGL enzyme). One µg of S. cerevisiae 6-phosphogluconate dehydrogenase (6PGD) (Merck, Darmstadt, Germany, Cat No. P4553) was then added (6PGD acts on the end product of 6PGL and reduces NADP in the reaction mixture), and the activity of the 6PGD enzyme was detected through NADPH production at 340 nm. The functionality of the 6PGL domain was thus observed indirectly through 6PGD activity on the end product of the fusion protein (6-phosphogluconate).

HMG2 gene deletion in yeast
The HMG2 gene was deleted in the hmg1Δ deletion background by PCR-mediated homologous recombination. A hmg2::LEU2 deletion cassette with flanking HMG2 regions on either end was generated by PCR using hmg2::LEU2 del-FP and hmg2::LEU2 del-RP and transformed into the S. cerevisiae BY4742 hmg1Δ strain. The hmg1Δ hmg2Δ double-deleted strains were selected for leucine prototrophy on plates containing mevalonate (mevalonolactone, Cat No. M4667, Merck, Darmstadt, Germany) added at 5 mg·mL⁻¹ from a stock of 330 mg·mL⁻¹. The double-deleted strain was confirmed by mevalonate auxotrophy, since double deletions of HMG1 and HMG2 have previously been reported to show mevalonate auxotrophy [15].

Estimation of total pools of NADP and NADPH
As NADPH pools are approximately 95% of the total pools of NADP plus NADPH, we have used the measurement of the total pools as reflective of the NADPH levels. For estimation of the total pools of NADP and NADPH, zwf1Δ met15Δ S. cerevisiae strains transformed with plasmids expressing G6PD, G6PD-6PGL, G6PD-tHMG, N403D, and S238Q under the TEF promoter, or with the control vector, were grown overnight at 30 °C in SD medium containing 200 µM reduced GSH and re-inoculated in fresh SD medium at an initial OD 600nm = 0.2. Cells were grown at 30 °C, with shaking at 220 rpm, to the early exponential growth phase (OD 600nm = 0.6-0.8). An equal number of cells (OD 600nm = 1) were harvested at 2516 g and washed with sterile water, followed by resuspension of the cells in lysis buffer (100 mM KH2PO4, 1.2 M sorbitol, pH 7). Spheroplasts were prepared by adding Zymolyase at a final concentration of 0.3 mg·mL⁻¹ and incubating at 30 °C in a shaking incubator at 100 rpm for 1 h. A 100-µL aliquot of these spheroplasts was mixed with an equal volume of the NADP/NADPH-Glo™ detection reagent from the NADP/NADPH-Glo™ assay kit (Cat No. G9081, Promega). The reaction mixture was incubated at room temperature for 45 min, and readings were taken using the POLARstar Omega luminescence reader. The data were analyzed using GRAPHPAD PRISM 5.0.

Sclareol estimation
The yeast strains expressing the sclareol biosynthesis genes sclareol synthase and copal-8-ol diphosphate synthase, along with G6PD, the G6PD-6PGL and G6PD-tHMG fusion proteins, or the G6PD mutants, were grown overnight in SD medium with shaking at 30 °C and then re-inoculated at OD 600nm 0.02 into a 25 mL secondary culture. After 72 h of growth, 2.5 mL (10% v/v) of dodecane was added as an overlay, and the culture was incubated for another 48 h. The culture was then centrifuged at 4930 g for 10 min, and 1 mL of the dodecane layer was collected and subjected to gas chromatography-mass spectrometry (GC-MS) (Agilent 7890B GC, 5977C MSD, Santa Clara, CA, USA). An HP5-MS capillary column (30 m × 0.25 mm × 0.25 µm) was used, with helium (purity 99.999%) as the carrier gas. A 1 µL sample was injected into a single-mode inlet maintained at 320 °C. The GC oven was programmed for a temperature range of 100-320 °C at a ramp rate of 10 °C with a final hold of 10 min. The flow rate of helium was consistently maintained at 1.4 cm²·s⁻¹. The sclareol peak was observed at a retention time of 13.25 min. Targeted peaks were identified by comparing the mass spectra with the available NIST library and literature, as described earlier [13]. The extracted ion chromatograms for the ions of interest were analyzed (sclareol shows characteristic mass fragments at m/z 177 and 191). A calibration curve covering a range of concentrations of the authentic standard (sclareol; Cat No. 515-03-7, Merck, Darmstadt, Germany) was used for quantification.
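Quantification against a calibration curve can be sketched as follows. This is an illustrative example only: the standard concentrations and peak areas are hypothetical, a simple linear response is assumed, and the conversion to a per-culture titer assumes that all sclareol partitions into the 2.5 mL dodecane overlay of the 25 mL culture.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sclareol_titer(peak_area, slope, intercept, overlay_ml=2.5, culture_ml=25.0):
    """Return (mg/L in dodecane, mg/L of culture) for a sample peak area."""
    in_dodecane = (peak_area - intercept) / slope
    # Assumption: all sclareol is recovered in the dodecane overlay.
    return in_dodecane, in_dodecane * overlay_ml / culture_ml


# Hypothetical standards: concentration (mg/L) versus extracted-ion peak area.
std_conc = [5.0, 10.0, 25.0, 50.0, 100.0]
std_area = [10500.0, 20500.0, 50500.0, 100500.0, 200500.0]
slope, intercept = linear_fit(std_conc, std_area)

in_dodecane, in_culture = sclareol_titer(60500.0, slope, intercept)
print(round(in_dodecane, 2), round(in_culture, 2))
```

Samples whose peak areas fall outside the range spanned by the standards would need dilution or additional standards rather than extrapolation.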

In silico studies of G6PD and mutants
The 3D model of G6PD was downloaded from the AlphaFold Protein Structure Database (https://alphafold.ebi.ac.uk/), and models of two mutants, N403D and S238QI239F, were generated computationally using the Maestro interface of Schrödinger. The AlphaFold2 structure of ScG6PD was used for these studies. All three structures were subjected to protein preparation [protein preparation wizard (PPW) of Maestro], where the structures were preprocessed by adding missing hydrogens, and appropriate bond orders were assigned to the structures. The protonation states of the polar residues were optimized with the protassign module of PPW, which uses PROPKA to predict pKa values (pH 7.0 ± 2.0) and side chain functional group orientations. The prepared structures were further used for the preparation of grids, molecular docking, and molecular dynamics (MD) simulations. The Glide [16] module of the Schrödinger suite was used to dock the substrates (G6P and NADP+) and the products (NADPH and 6PGL) to the three structures, thus obtaining six complexes of the wild-type and mutant G6PD. All six complexes were energy minimized (for 5000 steepest descent steps) and subjected to implicit water MD simulations for 1 ns each, and the structures obtained after the 1 ns simulations were used for MM/GBSA (molecular mechanics with generalized Born and surface area solvation) binding energy calculations using the Prime module of Schrödinger [17].
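The bookkeeping behind an MM/GBSA binding energy is simple enough to write out. The sketch below only illustrates the standard end-point definition (energy of the complex minus the energies of the free receptor and free ligand) and the comparison of a mutant with the wild type; the function names and all energy values are hypothetical and are not taken from the study or from Prime output.

```python
def binding_energy(e_complex, e_receptor, e_ligand):
    """End-point binding energy: E(complex) - E(receptor) - E(ligand), in kcal/mol."""
    return e_complex - e_receptor - e_ligand


def relative_binding(dg_mutant, dg_wild_type):
    """Negative values mean the ligand is predicted to bind the mutant more favorably."""
    return dg_mutant - dg_wild_type


# Hypothetical energies (kcal/mol) for one ligand docked to the WT and to a mutant.
dg_wt = binding_energy(e_complex=-1250.0, e_receptor=-1180.0, e_ligand=-20.0)
dg_mut = binding_energy(e_complex=-1262.0, e_receptor=-1185.0, e_ligand=-20.0)
print(dg_wt, dg_mut, relative_binding(dg_mut, dg_wt))
```

Absolute MM/GBSA values are generally used for ranking related complexes rather than as experimental-grade free energies, so the difference between mutant and wild type is the more informative quantity here.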

Refining a genetic screen to isolate mutant enzymes that lead to increased NADPH levels
We examined previous genetic screens for NADPH levels to develop a robust and sensitive screen. In a screen for NADPH homeostatic genes previously developed in the lab, we used yeast cells depleted of glutathione [18]. The rationale for this screen was that the role of NADPH is generally masked by the presence of glutathione at millimolar (mM) concentrations relative to NADPH [present in micromolar (µM) concentrations] and that in low glutathione concentrations, NADPH levels become important, and thus genes affecting these levels can be identified. However, while it was reasonably successful in investigating the mitochondrial knockout collection [18], the screen was not sufficiently robust. It also lacked the sensitivity required for the current study.
Deletion of the G6PD-encoding gene, ZWF1, in S. cerevisiae (zwf1Δ) leads to a distinct phenotype of methionine auxotrophy, which is known to result from NADPH deficiency. This deletion and its phenotype have previously been used to screen G6PD enzymes from different organisms [19]. However, our preliminary studies with this genetic background revealed that, though robust, it was not sufficiently sensitive in differentiating minor variations. Thus, to increase the sensitivity of the assay and refine the screen, we introduced a met15Δ deletion in the zwf1Δ background. Since met15Δ is an organic sulfur auxotroph, it needs an organic sulfur source, and glutathione, a tripeptide containing cysteine, fulfills this requirement. However, the strain, which is also a methionine auxotroph owing to zwf1Δ, faces an accentuated methionine requirement in this background, even with added glutathione.
For preliminary evaluation of the screen, we decided to use the G6PD enzyme (encoded by ZWF1 in S. cerevisiae) expressed from either a strong promoter (TEF) or a weak promoter (CYC). Differences in the behavior of these two clones would enable us to evaluate the screens. We thus evaluated TEF-G6PD and CYC-G6PD for complementation of the methionine auxotrophy of the zwf1Δ and zwf1Δ met15Δ strains in both glutathione-containing and methionine-containing media across a range of concentrations. We observed that the zwf1Δ met15Δ strain behaved similarly to the single zwf1Δ background in methionine medium. In glutathione medium, however, over a narrow range of concentrations, the screen became more sensitive to minor differences, such as those seen when G6PD was expressed under a strong or a weak promoter (Fig. 2A,B). This genetic background has subsequently been used in all our screens and assays.

G6PD is the most suitable enzyme choice for further improvement, as seen by the zwf1Δ met15Δ screen
Many enzymes in yeast contribute to the NADPH pools. To evaluate which of these enzymes might be most suited for intervention by mutagenesis or other strategies, we cloned the key enzymes known to contribute to the cytosolic NADPH pools and evaluated them by the zwf1Δ met15Δ assay. Thus, in addition to glucose 6-phosphate dehydrogenase (ZWF1/G6PD), we also cloned the cytosolic aldehyde dehydrogenase (ALD6), the truncated malic enzyme lacking the 30-aa mitochondrial signal sequence (tMAE1), and the cytosolic isocitrate dehydrogenase (IDP2). We observed that although Idp2p and tMae1p showed only weak complementation in this assay, the G6PD enzyme was able to confer significant growth (Fig. 2C). Thus, we considered it relevant to focus our efforts on this enzyme of the pentose phosphate pathway (PPP). Ald6p also showed very good complementation (Fig. 2C). However, in a previous study, Ald6p overexpression alone did not lead to increased isoprenoid yields and was observed to decrease cell mass [20]. Isoprenoid production increased only when Ald6p overexpression was coupled with overexpression of a deregulated acetyl-CoA synthetase (Acs1p). Further, in another study [21], ALD6 downregulation (which resulted from the deletion of an upstream gene that overlapped with the ALD6 promoter) led to decreased levels of isoprenoids, although the decrease in that study had multiple causes. Together, the results from these two studies suggested that ALD6 might not be the best choice for the goal of the present study. Thus, we decided not to pursue it further.
The identification of G6PD (encoded by ZWF1) as an enzyme of choice for further enhancement underlined the importance of the pentose phosphate pathway, which has been the target of many investigators.

A comparison of RtG6PD and ScG6PD shows that ScG6PD has significantly higher activity than RtG6PD
G6PD seemed like the best choice for further improvement. However, would the G6PD from a carotenogenic yeast be a better enzyme source? We decided to evaluate the glucose 6-phosphate dehydrogenase enzyme from the red yeast Rhodosporidium toruloides (RtG6PD). Since R. toruloides is an oleaginous and carotenogenic yeast and is known to be the highest producer of carotenoids among yeasts [22], we considered that it might have a higher requirement for NADPH and therefore that the enzyme from this yeast might be superior to the S. cerevisiae enzyme. Thus, the cDNA encoding RtG6PD (the gene is GC rich and has multiple introns) was custom synthesized after codon optimization (Accession no. OQ291226) and expressed downstream of the TEF promoter. This gene was also evaluated in the zwf1Δ met15Δ screen. Surprisingly, RtG6PD conferred less growth than ScG6PD in this screen (Fig. 3A).
The use of the zwf1Δ screen by a previous group for comparing G6PD across species suggested that the in vivo results did not always tally with the in vitro activities [19]. Although we have used a more stringent screen (using zwf1Δ met15Δ rather than only zwf1Δ), we decided that further confirmation was required. Therefore, we also examined the effect on carotenoid pigmentation. We transformed the genes encoding the carotenogenic enzymes GGPP synthase (RtGGPPS) and phytoene dehydrogenase (RtPD) of R. toruloides and phytoene synthase of A. thaliana (AtPS) into S. cerevisiae. These strains are able to make red pigment, and the use of the mono-functional AtPS restricted the carotenoids to lycopene, making the assay more sensitive (M. Wadhwa and A. K. Bachhawat, unpublished observations). However, RtG6PD did not improve pigmentation levels beyond those obtained with ScG6PD (Fig. 3B). The possibility still existed that the in vivo data might not entirely reflect the actual in vitro properties, since protein stability could also contribute to in vivo activities. We therefore expressed the two enzymes as His-tagged proteins in E. coli and purified them on Ni-NTA columns (Fig. S1). Both enzymes showed optimum activity at pH 8 (Fig. S2), and their specific activities were compared at pH 8. We found that RtG6PD had significantly lower activity (11 ± 0.9 units per mg protein) than ScG6PD (92.5 ± 2.2 units per mg protein). Although surprising, this confirmed what was observed in the in vivo assay and suggested that the ScG6PD enzyme was better suited than RtG6PD for further improvement.

Evaluation of G6PD fusions to 6-phosphogluconolactonase (6PGL) as a possible strategy for enhancing NADPH production efficiency
Fusions of consecutive enzymes in a pathway form synthetic metabolons and are an important strategy for making cellular processes more efficient. Channeling the product of one enzyme directly to the next as its substrate allows for more efficient catalysis. Further, the physical proximity of one enzyme's product to the substrate channel of another enzyme enhances efficiency [14].
Interestingly, in some parasitic protozoans, such as Plasmodium species, in which NADPH supply is essential, two of the enzymes of the pentose phosphate pathway are found as fusion enzymes [23,24] that differ in some of their kinetic parameters. Therefore, we sought to explore whether such a fusion might be more effective in an organism (such as yeast) where it does not naturally occur. To identify which type of fusion might be best suited for enzymes in the pathway, we studied the different fusions existing in nature [23].
Although there are three enzymes in the oxidative part of the pentose phosphate pathway, namely glucose 6-phosphate dehydrogenase (G6PD encoded by ZWF1), 6-phosphogluconolactonase (6PGL encoded by SOL3), and 6-phosphogluconate dehydrogenase (6PGD encoded by GND1), the naturally occurring gene fusions occurred only between the first and second enzymes. However, the orientations were different in different organisms. Based on this comparative investigation, we created fusions of G6PD and 6PGL with and without a 16-aa linker between the two enzymes (Fig. 4A). We created them as G6PD-L-6PGL, 6PGL-L-G6PD, and G6PD-6PGL (without a linker). We first evaluated them in the genetic screen (zwf1Δ met15Δ). The growth conferred seemed quite comparable to the growth conferred by the control ScG6PD, indicating that at least the G6PD enzyme was still functioning well (Fig. 4B). However, it was important to determine whether each domain was functional and to measure the kinetic parameters. We cloned these His-tagged proteins in the pET23a expression vector and expressed them in E. coli. Of the various fusions, only G6PD-L-6PGL and G6PD-6PGL could be purified (Fig. 4C), and thus we proceeded with these enzymes.
Fig. 3. Comparison of ScG6PD and RtG6PD using the (A) complementation assay and (B) carotenoid pigmentation assay. For the complementation assay (A), ScG6PD and RtG6PD were expressed under the TEF promoter and transformed in zwf1Δ met15Δ, and serial dilutions were spotted on glutathione plates. For the carotenoid assay (B), ScG6PD and RtG6PD were expressed in a strain containing TEF RtGGPPS (GGPP synthase), TEF AtPS (Phytoene synthase), TEF RtPD (Phytoene desaturase) and serial dilutions were spotted along with the empty vector control. OD 600 of dilution spots from top to bottom: 0.2, 0.02, 0.002, 0.0002. Both experiments were performed in triplicate, and the image represents a single experiment.
The first task was to determine whether the individual G6PD and 6PGL domains were functional. As expected from the complementation results, G6PD activity was also detected in vitro using the G6PD assay, in which NADPH formation is measured. To determine whether the 6PGL region (encoding 6-phosphogluconolactonase activity) was functional, we first ran the G6PD assay to generate 6-phosphogluconolactone, the substrate for 6PGL. The 6PGL activity was then demonstrated by coupling the product of this reaction to the next enzyme, 6-phosphogluconate dehydrogenase (6PGD), which was procured from a commercial source and whose activity can be measured by NADPH detection assays (Fig. S3). After establishing the functionality of both domains, the activity was compared with that of the ScG6PD enzyme. Activity measurements of the linker-containing enzyme indicated a less efficient enzyme (37 ± 0.8 units per mg protein). In contrast, the activity of the fusion without a linker was significantly higher (93.9 ± 1.1 units per mg protein), though still lower than that of the single-domain G6PD protein alone (141.5 ± 3.5 units per mg protein) (Table 1).
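The specific activities above can be put on a common scale with a short calculation. The following is a minimal sketch, not the authors' analysis script: it assumes the usual spectrophotometric readout of NADPH at 340 nm with the standard absorption coefficient of 6.22 mM^-1 cm^-1, and the function names, assay volume, and protein amount are hypothetical. Only the three specific activities in the dictionary are taken from the text.

```python
# Molar absorption coefficient of NADPH at 340 nm (standard literature value).
EPSILON_NADPH_MM = 6.22  # mM^-1 cm^-1

def specific_activity(delta_a340_per_min, assay_volume_ml, protein_mg, path_cm=1.0):
    """Return units per mg protein, where 1 unit = 1 umol NADPH formed per min.

    All argument values are assumptions supplied by the user; the assay
    volume and protein amount used in the study are not given in the text.
    """
    # Beer-Lambert law: concentration change (mM/min) = dA / (epsilon * path)
    rate_mm_per_min = delta_a340_per_min / (EPSILON_NADPH_MM * path_cm)
    # mM multiplied by mL gives umol formed per minute in the cuvette
    umol_per_min = rate_mm_per_min * assay_volume_ml
    return umol_per_min / protein_mg

def relative_activity(variant, reference):
    """Express a variant's specific activity as a percentage of the reference."""
    return 100.0 * variant / reference

# Specific activities reported in the text (units per mg protein).
reported = {"G6PD-L-6PGL": 37.0, "G6PD-6PGL": 93.9, "ScG6PD": 141.5}

for name, value in reported.items():
    pct = relative_activity(value, reported["ScG6PD"])
    print(f"{name}: {pct:.1f}% of ScG6PD")
```

On the reported values, the linker-containing fusion retains roughly a quarter of the ScG6PD activity and the linker-less fusion roughly two thirds.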
Investigation of the kinetic parameters of the Plasmodium falciparum, Plasmodium vivax, and Giardia lamblia enzymes had earlier revealed that although the fusions showed comparable activity, the Km towards G6P was improved [14,24,25]. Therefore, we sought to investigate the affinity of our fusions towards the primary substrates, G6P and NADP. Kinetic measurements were carried out with the linker-containing fusion and the linker-less fusion. Interestingly, in the more active linker-less fusion, the affinity towards NADP was lower, but the affinity towards G6P was almost five times higher (Table 2). This indicated that the synthetic fused enzyme followed the pattern seen in the natural enzymes of Plasmodium falciparum, Plasmodium vivax, and Giardia lamblia, and suggested that this fusion might perform well in vivo, where G6P could be in short supply. Therefore, we considered it essential to evaluate the impact of these fusions on the production of the diterpenoid sclareol. Genes for this diterpenoid were expressed in yeast along with RtGGPP synthase and either WT G6PD or the fusion enzyme, and sclareol was estimated as described in the methods. The experiment was carried out in five replicates, and the three best producers were compared. The fusions showed a 12-18% increase over the WT gene, with the linker-less fusion giving the higher sclareol levels of the two (Fig. 5).
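Why a lower Km can outweigh a lower maximal activity when G6P is scarce can be illustrated with the Michaelis-Menten rate law. The sketch below is illustrative only: the Km values are hypothetical placeholders chosen to reflect the roughly five-fold difference described above (the measured values are in Table 2), and using the reported specific activities as stand-ins for Vmax is an assumption.

```python
def mm_rate(substrate_mM, vmax, km_mM):
    """Michaelis-Menten rate: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_mM / (km_mM + substrate_mM)

# Vmax proxies: specific activities reported in the text (units per mg protein).
VMAX_WT, VMAX_FUSION = 141.5, 93.9

# HYPOTHETICAL Km values for G6P (mM); only the ~5-fold ratio follows the text.
KM_WT, KM_FUSION = 0.10, 0.02

for g6p in (0.01, 0.05, 0.5, 5.0):
    v_wt = mm_rate(g6p, VMAX_WT, KM_WT)
    v_fusion = mm_rate(g6p, VMAX_FUSION, KM_FUSION)
    faster = "fusion" if v_fusion > v_wt else "WT"
    print(f"[G6P] = {g6p:5.2f} mM  WT = {v_wt:6.1f}  fusion = {v_fusion:6.1f}  faster: {faster}")
```

Under these assumed parameters the fusion is faster at low G6P and the WT enzyme is faster once G6P approaches saturation, which is the qualitative behaviour the text anticipates for substrate-limited conditions in vivo.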

Construction of a novel synthetic metabolon, with G6PD fused to tHMG1
Although the G6PD-6PGL and 6PGL-G6PD fusions of S. cerevisiae were constructed based on insights from similar naturally occurring fusions in parasitic protozoans, we also considered creating a fusion not previously found in nature. In this design, we fused the NADPH-generating enzyme ScG6PD with an NADPH-consuming enzyme of the isoprenoid pathway, 3-hydroxy-3-methyl-glutaryl-coenzyme A reductase (HMG-CoA reductase). The expectation was that such a fusion could be more efficient in utilizing NADPH for the enzymatic reaction. Based on this rationale, we constructed a fusion of ScG6PD with the truncated HMG-CoA reductase, tHmg1p (Fig. 6A).
The truncated Hmg1p (tHmg1p) lacks the N-terminal regulatory region of HMG1, thereby preventing its feedback regulation [26,27]. Since this fusion is not known to exist in nature, we first needed to examine whether the two domains were functional. To do so, we evaluated the ability of the fusion to complement the zwf1Δ met15Δ mutant and the mevalonate auxotrophy of the hmg1Δ hmg2Δ mutant [15]. In these in vivo functional assays, both domains were functional, and their functionality was comparable to that of WT G6PD (Fig. 6B) and WT tHmg1p (Fig. 6C), respectively. The pigmentation levels in the carotenoid-producing strain overexpressing the G6PD-SctHmg1 protein hinted at its efficiency in increasing the isoprenoid flux (Fig. 6D).
This novel G6PD-tHMG fusion under the TEF promoter showed about 1.8-fold more sclareol production than the WT G6PD (Fig. 5).

Rational mutagenesis of the ScG6PD based on evolutionary and structural insights identifies mutant proteins with superior activity
The G6PD enzyme is highly conserved across evolution. Although naturally occurring mutants have been detected with lower activity [28], increased activity mutants have not been reported. However, activity measurements in native enzymes across species suggest wide variation, and it might be possible to identify residues that would lead to increased activity upon mutagenesis.
To identify residues that might increase catalytic activity, we utilized the Hotspot Wizard 3.0 server. Hotspot Wizard 3.0 is an online server (https://loschmidt.chemi.muni.cz/hotspotwizard/) that can create mutation libraries and detect residue alterations of a protein structure [29] that can affect the protein's stability or catalytic properties. From the presented residues, one can look at tolerated substitutions based on 200 sequence homologs. We carried out the analysis using the AlphaFold2-predicted structure. As an additional criterion, we also focused on residues that fell in the catalytic pockets but were nonessential for catalysis.
Fig. 6. G6PD-tHMG1 fusion protein and its functionality. Schematic, along with the functional activity of the individual domains and the fusion. (A) Schematic representation of the G6PD-tHMG1 fusion. (B, C) Evaluation of the functionality of the G6PD and tHMG1 domains. (D) Functionality of the fusion evaluated using the carotenoid assay (as explained in the methods) in a WT strain containing the RtGGPPS, AtPS, and RtPD genes, compared with control tHMG1, a known flux enhancer. Details in materials and methods. OD600 of dilution spots from top to bottom: 0.2, 0.02, 0.002, 0.0002.
Six active site pocket residues were identified accordingly (Table S4). These were H161, R226, N403, S238, and I239. Each residue was changed to the most conserved residue among the homologs, with the specific mutations chosen from the permissible substitutions that could preserve the protein's function. Thus, H161 was mutated to H161R (R being the most conserved residue at that position). Similarly, R226 was mutated to R226P, N403 to N403D, S238 to S238Q, and I239 to I239F. We also created a double mutant, S238QI239F. The mutants were made by splice overlap extension PCR. We initially verified them in the yeast screen by expressing them under a weak promoter and examining them in the zwf1Δ met15Δ background (Fig. S4); they were then cloned into an E. coli expression vector with a His-tag. The mutants were purified by Ni-NTA, and their activity was then evaluated.
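The substitution labels used here follow the common convention of wild-type residue, position, and new residue (for example, N403D replaces asparagine 403 with aspartate). As a minimal sketch of how such labels can be checked against a sequence before primers are designed, the helper below parses a label and applies it. The function names are hypothetical and the four-residue sequence is a toy example, not the ScG6PD sequence.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def parse_mutation(label):
    """Split a label such as 'N403D' into ('N', 403, 'D')."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"Unrecognised mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement

def apply_mutations(sequence, labels):
    """Return a copy of the sequence with each point substitution applied.

    Raises ValueError if the residue at a given position does not match
    the wild-type residue named in the label.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if position > len(residues) or residues[position - 1] != wild_type:
            raise ValueError(f"{label}: sequence does not have {wild_type} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)

# Toy sequence for illustration only (NOT the real G6PD sequence).
toy = "MSNI"
print(apply_mutations(toy, ["S2Q", "I4F"]))
```

Checking the wild-type residue at each position catches numbering offsets (for example, from a tag or a different start codon) before any cloning is attempted.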
Interestingly, although the R226P mutant showed lower activity, the single mutants H161R and S238Q showed almost 10-15% higher activity, and the double mutant S238QI239F also showed an approximately 10% increase in activity. More interestingly, the N403D mutant showed an almost 25% increase in catalytic activity (Table 1).
To see whether combining these superior mutations in a single protein would further increase enzyme activity, we introduced the separate mutations into a single clone and evaluated the activity of the combined mutants. In one case, we generated H161R S238QI239F N403D, which showed an approximately 15% increase in activity over the wild type but did not exceed the single mutant N403D and was, in fact, slightly lower. The mutant S238QI239F N403D, which was also created, showed activity comparable to N403D and significantly higher than the wild type, but the combined mutations did not lead to any significant synergistic increase (Table 1).
We also evaluated the relative NADPH levels of the mutants in vivo, since activity measurements in vitro do not consider the protein's stability in vivo. We evaluated these mutants using an in vivo assay kit that measures the total pools of NADPH and NADP. However, since NADPH accounts for ~95% of the total pool, the measurements largely reflect the NADPH pools. This can be seen from the estimations done in a zwf1Δ background. As expected, the zwf1Δ cells had significantly lower levels of NADPH, which could be restored by expressing the ZWF1 gene under the TEF promoter. The mutant S238QI239F showed levels comparable to the WT clone. The N403D mutant seemed to show slightly higher levels of NADPH in this assay (Fig. S5), but the assay itself might not be sensitive enough to differentiate significantly between the variants.
To evaluate how effective the N403D mutant would be relative to WT ScG6PD in vivo in an engineered system making heterologous isoprenoids, we compared the formation of the diterpenoid sclareol with the WT and the N403D mutant, as described earlier and in the methods. The sclareol yields in the strain expressing the N403D mutant were 15-25% higher than with the wild-type G6PD (Fig. 5). The hyperactive N403D mutant was also fused to tHmg1p to see whether it might improve the sclareol yields further. However, the N403D-tHMG fusion did not show enhanced sclareol yields compared to the G6PD-tHMG fusion protein (Fig. S6).

Investigations into the structural basis of the increased activity of the hyperactive mutants
G6PD catalyzes the rate-limiting step of the oxidative pentose phosphate pathway. It converts D-glucose 6-phosphate (G6P) and NADP+ to 6-phospho-D-glucono-1,5-lactone (6PGL), NADPH, and H+. Six complexes were modeled in which the substrates (G6P and NADP+) or the products (6PGL and NADPH) were bound to the wild type, N403D, and S238QI239F variants of G6PD. Energy minimization and implicit-water molecular dynamics simulations were carried out for 1 ns each. The structures obtained after the 1 ns simulations were used to calculate the molecular mechanics generalized Born surface area (MM/GBSA) binding energies of the substrates and products with G6PD. These binding energies were compared in search of an atomistic explanation for the higher activity of the two mutants, N403D and S238QI239F, relative to the WT (Fig. S7). The binding of NADP+ weakens in both mutants compared to the wild type, whereas the binding of the substrate G6P becomes stronger (Table S6). This correlates with the experimentally determined Km values for N403D (but not for S238QI239F), showing that the N403D mutant indeed binds the substrate G6P more strongly than the wild type (Table 2). The total binding energy values (MM/GBSA dG Bind) for the products show that the binding of both NADPH and 6PGL weakens in both mutants compared to the wild type (Table S5), indicating that product inhibition might be lower in the mutants or that the products might be released more easily after the reaction.
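For orientation, the MM/GBSA binding energy is generally estimated as the difference between the free energy of the complex and those of its separated components. The expression below is the general textbook form, given here as a sketch; the force field, solvent model parameters, and whether an entropy term was included in this study are not stated in the text, so the decomposition on the right is an assumption about the usual convention (with the entropic contribution omitted).

```latex
\Delta G_{\text{bind}}
  = G_{\text{complex}} - \left( G_{\text{protein}} + G_{\text{ligand}} \right),
\qquad
G = E_{\text{MM}} + G_{\text{GB}} + G_{\text{SA}}
```

Here $E_{\text{MM}}$ is the molecular mechanics energy, $G_{\text{GB}}$ the generalized Born polar solvation term, and $G_{\text{SA}}$ the nonpolar term derived from the solvent-accessible surface area. A more negative $\Delta G_{\text{bind}}$ indicates stronger binding, so "weakened" product binding corresponds to a less negative value in the mutants than in the wild type.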

Discussion
The need for enhancing NADPH levels in metabolic pathways such as the isoprenoid pathway to overproduce heterologous isoprenoids is well recognized. In this study, we have taken an approach that could be more cost-efficient for the cell than the existing approaches: the use of catalytically superior variants of the key NADPH-generating enzymes. When we scanned the literature for natural variants with superior activity, we found that several human G6PD mutants have been reported, but these all showed lower activity [30]. Only one mutant had higher activity, but investigations revealed that it was a regulatory mutant that affected expression and not the intrinsic enzyme activity [31]. In Plasmodium spp., several natural variants were also investigated, but none showed significantly higher activity [32]. From a synthetic biology perspective, improving the enzyme itself (through mutagenesis, fusions, etc.) does not seem to have been attempted so far, and this is what was investigated in this study.
While evaluating different NADPH-generating enzymes in the genetic screen that we developed, we also evaluated the G6PD enzyme from the carotenogenic yeast R. toruloides. Surprisingly, the R. toruloides G6PD enzyme was almost 9-fold less efficient than the S. cerevisiae enzyme. This was completely unexpected, as we had thought that the higher NADPH requirements of R. toruloides during carotenoid production might be met by a more efficient RtG6PD enzyme. One possible explanation is that there are other important sources of NADPH in R. toruloides. Alternatively, as R. toruloides makes carotenoids, which can quench certain reactive oxygen species, perhaps the need for excess NADPH in that organism is not as high as one would expect.
We created synthetic metabolons through gene fusions using the S. cerevisiae G6PD enzyme. In the initial design of these synthetic metabolons, we mimicked nature in some of the designs (where the first enzymes of the PPP were fused). The G6PD-6PGL fusion proteins were based on similar fusions seen in protozoan parasites that must combat significant oxidative stress in host cells. As active G6PD enzymes are known to be dimeric, we were not sure how synthetic fusions would behave. Therefore, we made fusions in both orientations, with or without a linker. Both orientations yielded functional enzymes that were, interestingly, comparable in activity with the WT enzyme, as seen in the in vivo assays. Fusions of these two sequential enzymes of the PPP are seen in several parasites, where these pathways are so important that they have also been explored as therapeutic targets [32,33]. Interestingly, these fusions also seem useful in the context of synthetic biology, which also requires higher NADPH levels. Although the G6PD domain in the synthetic G6PD-6PGL metabolons was found to be less active than the individual G6PD domain alone in vitro and in vivo, the possibility existed that the fusions could improve the NADPH pools in the cell by channeling the product directly to the substrate binding pocket of the second enzyme [34]. Further, although the affinity for NADP+ was lower in the fusion, it showed a higher affinity for the other substrate, G6P, than the individual G6PD enzyme (Table 2). This might improve the functioning of the enzyme under specific conditions, such as glucose limitation. The overall effectiveness of these fusions relative to WT was evaluated in vivo through the production of the diterpenoid sclareol, and marginally higher sclareol levels were observed.
Overproduction of truncated HMG1 (tHmg1p) is well known to increase the flux in the isoprenoid pathway, and it has also been reported that the yields can be further increased by overexpression of an enzyme that enhances NADPH levels [3]. Here, we opted for a novel approach of fusing tHmg1p with a strong NADPH producer, G6PD. In the carotenoid assay, the G6PD-tHmg1p fusion enzyme compared well both with SctHmg1p alone, a known flux enhancer in the isoprenoid pathway, and with the WT G6PD. This suggested that the fusion was likely to be at least as effective as the parent enzymes. When evaluated through the production of the diterpenoid sclareol, sclareol levels were significantly higher with the fusion than with the G6PD enzyme alone. Since tHmg1p alone can also increase yields, part of this gain may reflect the tHmg1p domain itself; nevertheless, this fusion has the potential for further development and improvement. The sclareol assays were done using a 4-plasmid system, which might account for some of the variability. The use of integrated strains is also likely to result in higher yields.
For rational mutagenesis, we used an online server (Hotspot Wizard) [29] to help identify potential residues and then imposed additional constraints to decide which mutants would be created experimentally. This algorithm has been successfully used to make enzymes with increased activity for industrial purposes [35,36], and thus we sought to use it in this study as well. The strategy proved highly successful: three out of the five mutants we created showed higher catalytic activity. Among these, N403D consistently showed improved activity in vitro and increased sclareol levels compared with the wild-type enzyme. The in silico MD simulation studies suggested a lower binding affinity of the products to the pocket, which may facilitate the release of these products and thereby enhance the activity. Combining the mutations did not improve activity further, but with the single N403D mutation alone we obtained some increase in the production of the diterpenoid sclareol, which suggests that the mutant is effective not just in vitro but in vivo as well.
In conclusion, using a genetic screen that we developed, we evaluated different metabolic enzymes generating NADPH and found that glucose-6-phosphate dehydrogenase was the best suited for further optimization. Multiple approaches were taken to improve the characteristics of the enzyme from S. cerevisiae. In a synthetic metabolon approach, we created and evaluated G6PD fusions with the downstream PPP enzyme and also G6PD fusions with tHmg1p, a key enzyme in the mevalonate pathway that is a major consumer of NADPH. In addition, we took a rational mutagenesis approach that yielded new mutants with increased activity. Thus, both approaches yielded new variants that were beneficial for isoprenoid production. Although we demonstrated these for the diterpenoid sclareol, we believe they could apply generally to other terpenoids.

Fig. 1. Schematic representation of cloning and construction of all the constructs mentioned in the study. The construction of the constructs and their cloning into the vectors pRS313TEF, pRS313CYC, and pET23a are described in the text.

Fig. 2. Comparison of zwf1Δ and zwf1Δ met15Δ as a screen for NADPH levels and evaluation of different NADPH-generating enzymes. (A) zwf1Δ met15Δ and (B) zwf1Δ strains were each transformed with an empty vector or with ZWF1 expressed under either the weak CYC promoter (C ZWF1) or the strong TEF promoter (T ZWF1), and serial dilutions were spotted on plates with different concentrations of glutathione. (C) Different NADPH-generating enzymes in the cell, such as ScG6PD, RtG6PD, cytosolic isocitrate dehydrogenase (Idp2p), trunc-malic enzyme, and cytosolic aldehyde dehydrogenase (Ald6p), were overexpressed in the zwf1Δ met15Δ strain under the TEF promoter, and serial dilutions were spotted on plates containing glutathione. OD600 of dilution spots from top to bottom: 0.2, 0.02, 0.002, 0.0002. The experiments were repeated thrice, and this picture represents one of the experiments.

Fig. 4. Pictorial representation of G6PD-6PGL fusion proteins and the in vivo functionality of their G6PD domains. (A) Schematic representation of the gene fusions created and their comparison with naturally occurring fusions. The fusion proteins made in this study are aligned with the existing fusion proteins in T. vaginalis, T. parva, and P. falciparum. Gene fusions of G6PD and 6PGL were (B) evaluated by complementation of zwf1Δ met15Δ, and (C) their purification profiles were displayed on 12% SDS/PAGE.

Table 1. Specific activities of ScG6PD fusion proteins and mutants. Specific activities were obtained from at least two independent purifications each, with at least three technical replicates of enzyme measurements. ± designates the standard deviation. One unit of the enzyme reduces 1 μmol of NADP per min.

Table 2. Steady-state Km values of G6PD (a), fusion proteins, and mutants for glucose 6-phosphate and NADP+.
(a) Km values were obtained from at least two independent purifications each, with at least two technical replicates of enzyme measurements. ± designates the standard deviation.