Transfer of disulfide bond formation modules via yeast artificial chromosomes promotes the expression of heterologous proteins in Kluyveromyces marxianus

Abstract Kluyveromyces marxianus is a food‐safe yeast with great potential for producing heterologous proteins. Improving the yield in K. marxianus remains a challenge and incorporating large‐scale functional modules poses a technical obstacle in engineering. To address these issues, linear and circular yeast artificial chromosomes of K. marxianus (KmYACs) were constructed and loaded with disulfide bond formation modules from Pichia pastoris or K. marxianus. These modules contained up to seven genes with a maximum size of 15 kb. KmYACs carried telomeres either from K. marxianus or Tetrahymena. KmYACs were transferred successfully into K. marxianus and stably propagated without affecting the normal growth of the host, regardless of the type of telomeres and configurations of KmYACs. KmYACs increased the overall expression levels of disulfide bond formation genes and significantly enhanced the yield of various heterologous proteins. In high‐density fermentation, the use of KmYACs resulted in a glucoamylase yield of 16.8 g/l, the highest reported level to date in K. marxianus. Transcriptomic and metabolomic analysis of cells containing KmYACs suggested increased flavin adenine dinucleotide biosynthesis, enhanced flux entering the tricarboxylic acid cycle, and a preferred demand for lysine and arginine as features of cells overexpressing heterologous proteins. Consistently, supplementing lysine or arginine further improved the yield. Therefore, KmYAC provides a powerful platform for manipulating large modules with enormous potential for industrial applications and fundamental research. Transferring the disulfide bond formation module via YACs proves to be an efficient strategy for improving the yield of heterologous proteins, and this strategy may be applied to optimize other microbial cell factories.


INTRODUCTION
Kluyveromyces marxianus is a budding yeast belonging to the Saccharomyces subclade within hemiascomycetes.Due to its long-standing safe association with human food, particularly dairy products, K. marxianus has been granted GRAS (generally regarded as safe) and QPS (qualified presumption of safety) status in the United States and Europe, respectively 1 .It has been approved as a new food raw material and a new feed additive in China in 2023.Therefore, K. marxianus is a native food-safe yeast and is an ideal host for producing food proteins.Besides the food-grade safety of K. marxianus, this species has many excellent properties that are suitable for industrial production, including thermotolerance 2 , high biomass, high growth rate 1 , and broad substrate spectrum 3 .Starting in the 1980s, K. marxianus emerged as an excellent microbial cell factory for producing diverse recombinant proteins and chemicals.For example, K. marxianus stands out as a top producer of several industrial enzymes.K. marxianus reached a pinnacle in inulinase production, attaining a titer of 896.1 U/ml 4 .Substantial expression was observed for β-1,4endomannanas (6.4 g/l) 5 , endo-β-xylanase (6.4 g/l) 5 , and feruloyl esterase (12.2 g/l) 6 in K. marxianus.In the realm of virus-like particles (VLPs), K. marxianus demonstrated superiority, particularly with VLPs of porcine circovirus type 2 (PCV2, 1.9 g/l) 7 , porcine parvovirus (PPV, 2.5 g/l) 8 , and infectious bursal disease virus (IBDV, 1.89 g/l) 9 , surpassing other microbial and insect systems.Owing to its thermotolerance and wide substrate spectrum, K. marxianus is actively being explored for bioethanol and xylitol production.Remarkably, it achieved a record ethanol production of 55 g/l at 42°C 10 .After optimizing NADPH supply, K. marxianus yielded approximately 204 g/l of xylitol during fed-batch fermentation at 42°C 11 .
With the expansion of K. marxianus application scenarios, a series of genetic tools for the engineering of K. marxianus were developed, including episomal plasmids 12,13 , multiplecassette integration 14 , CRISPR/Cas9 system 15 , Cre-loxP system 16 , flux balance analysis (FBA) model 17 , and genomescale metabolic model 18 .Despite these achievements, the incorporation of large-scale functional modules remains an obstacle.The most commonly used multicopy plasmids in K. marxianus were derived from the endogenous pKD1 plasmid of Kluyveromyces lactis 5,12 .However, the copy number of pKD1-derived plasmids is not stable 5 , making them only suitable for overexpression rather than the stable expression of genes in the functional module.The capacity of centromeric plasmids is limited.Meanwhile, the loss rates of centromeric plasmids were approximately 1%-2% per generation 16 , so they are not suitable for long-term culturing without selective pressure.Large-scale modules can be integrated into the genome through homologous recombination, but the efficiency of integration is low.So far, the largest size for one-piece fragment integration was 5 kb 19 , and that for overlapping fragment integration was below 15 kb 14 .The integration of large DNA fragments into the genome may alter the chromosomal structure and affect the expression of surrounding genes as well as the replication of the host chromosome 20 .In addition, the expression of the function module is highly dependent on the integrated locus and may interfere with the expression of native genes around the locus 20,21 .Considering the problem of integration and episomal plasmids, yeast artificial chromosome (YAC) provides a new option.YACs are shuttle (Escherichia coli and yeast) vectors containing essential functional elements of a typical chromosome, including autonomously replicating sequences (ARS), centromere (CEN), and telomere (TEL) 22 .The expression of heterologous gene loaded by YACs is independent of the host genome, avoiding mutual interference between native genes and loaded modules caused by integration.YACs can be employed for the cloning and manipulation of extremely large DNA inserts (up to 3 Mb) in S. cerevisiae, and larger YACs display increased mitotic stability as they increase in size 22,23 .After exceeding 50 kb in size, YACs exhibit high stability comparable to natural chromosomes, with a loss rate of ~0.3% per generation, which is superior to centromeric plasmids 22,24 .After its invention in 1983, YACs were originally used for the cloning and manipulation of large DNA segments from higher organisms, especially those from mammalians 25,26 .In recent years, YACs received attention from the field of yeast synthetic biology and have been successfully applied as a vector to transfer large functional modules into Saccharomyces cerevisiae and other unconventional yeast.For example, YAC was applied to transfer cellobiose phosphorolysis and xylose consumption modules (~12.2 kb) into Yarrowia lipolytica 27 .In S. cerevisiae, the entire flavonoid pathway was cloned in YACs to generate a small library for screening flavonoid intermediates or derivatives 28 .YAC containing xylose utilization enzymes that enhance xylose utilization was constructed in S. cerevisiae 29 .However, no YAC has been constructed for K. marxianus so far.
Currently, more than 50 heterologous proteins have been successfully expressed in K. marxianus.The highest yield of heterologous proteins in K. marxianus was 12.2 g/l 6 , which is higher than that of S. cerevisiae (5 g/l) 30 , but is significantly lagging behind that of Pichia pastoris (22 g/l) 31 .This indicates that there is still significant room to improve the yield of heterologous proteins in K. marxianus.One crucial step in the folding and maturation of proteins that enter the secretory system is the formation of native disulfide bonds between two cysteine residues.The process involves the oxidation of substrate proteins by Protein disulfide isomerase (Pdi1), which in turn is oxidized by endoplasmic reticulum oxidase (Ero1 and Erv2) 32 .Transferring disulfide bond formation module genes is an efficient strategy to increase the production of heterologous protein.Overexpression of native PDI1 at the GOS1 locus showed a 24% increase in α-amylase production in S. cerevisiae 33 .Duplication of either KlERO1 or KlPDI1 resulted in an approximately 15-fold higher yield of Human Serum Albumin secreted by K. lactis 34 .Simultaneous co-expression of native ERO1 and PDI1 resulted in approximately 30% higher enzyme yields of lipase r27RCL in P. pastoris 35 .The effect of transferring genes involved in disulfide bond formation modules on protein production in K. marxianus has not been evaluated yet.It remains unclear whether transferring additional genes from the disulfide bond formation modules, beyond the few genes mentioned above, would further enhance the protein yield.Furthermore, the compatibility of the introduced genes with the native genes is also unknown.
In this study, YACs for K. marxianus (KmYACs) were constructed and used to transfer disulfide bond formation modules into the organism.The module contained at most seven genes, with a size of 15 kb.KmYAC was stably inherited in K. marxianus as a single copy.The P. pastoris genes introduced by KmYAC were actively expressed in vivo and were compatible with the native genes of the disulfide bond isomerization pathway in K. marxianus.Transfer of the disulfide bond formation module significantly enhanced the secretory expression of various heterologous proteins in K. marxianus, with the highest level reaching 16.8 g/l, the highest level reported to date in K. marxianus.

Design and construction of KmYACs carrying disulfide bond formation modules
The KmYAC shuttle vector contained a replication origin ARS1, a centromere CEN5, two inverted telomeres (either from Tetrahymena or K. marxianus), and three selection markers HIS3, TRP1 and HphMX4 (Figure 1).To construct a linear KmYAC, the shuttle vector was digested with BamHI and NotI.The purpose of BamHI digestion was to release the left and right arms of KmYAC.Exogenous fragments (two cassettes were used in this study) were ligated to arms at the NotI site using Gibson assembly in vitro.The NotI site was in frame with the open reading frame (ORF) of HphMX4, and the insertion would abolish the function of HphMX4.The ligation product was transformed into K. marxianus cells and the transformants containing the linear KmYAC were selected.The same procedure was followed to construct circular KmYAC, except that the BamHI digestion was omitted to maintain the circular formation of KmYAC (Figure 1).
To improve the yield of heterologous proteins in K. marxianus, we designed our module to include genes encoding essential enzymes implicated in disulfide bond formation, including Ero1, Erv2, Pdi1, and Mpd1 (Figure 2A).Ero1 and Erv2 are ER oxidases that catalyze de novo disulfide bond formation and transfer bonds to Pdi1; oxidized Pdi1 then introduces disulfide bonds into reduced substrates 36,37 .Mpd1 is a member of the Pdi1 family and can fully compensate for the deletion of Pdi 38,39 .Downstream of the disulfide bond formation, Ypt1 is a GTPase required for vesicle docking and vesicle targeting during endoplasmic reticulum to Golgi trafficking.Given its importance in protein secretion, YPT1 was also included in our designed module (Figure 2A).A total of three modules, namely P5, P7, and K5, were designed (Figure 2B).Modules P5 and P7 comprised five and seven genes from P. pastoris, including three MPD1 homologs named PpMPD1-1, PpMPD1-2, and PpMPD1-3.Module K5 included five genes from K. marxianus.P5, P7, and K5 modules were loaded into linear KmYAC containing either K. marxianus telomeres or Tetrahymena telomeres, resulting in six assemblies.Three modules were loaded into circular KmYAC containing K. marxianus telomeres, resulting in three assemblies (Figure 2C).The average fidelity of linear KmYACs (23.9%) was higher than that of circular KmYACs (6.9%), suggesting that linear configuration is more suitable for assembly.The average assembly fidelity of linear KmYACs containing K. marxianus telomeres (41.6%) was higher than that of those containing Tetrahymena telomeres (6.1%), suggesting the native K. marxianus telomeres promote accurate assembly (Figure 2C).The average assembly fidelity of KmYACs was 19.2%, lower than assembling five-gene cassettes into the LAC4 locus via single-step genome recombination (62.5%) 14 .However, the single-step recombination led to nonspecific gene insertions in the genome, a concern avoided in the in vitro assembly of KmYACs.
Southern blot was performed to confirm the presence of KmYACs in K. marxianus.The probe detected bands corresponding to both linear and circular KmYACs (Figures 2D and S1).Notably, the size of the linear KmYACs was larger than the predicted size, possibly due to telomere extension.Moreover, the linear KmYAC that contained Tetrahymena telomeres was larger than the one containing K. marxianus telomeres, suggesting additional telomeric sequences were added at the heterologous telomeres (Figure 2D).The difference in intensity between the bands of KmYACs containing Tetrahymena and K. marxianus telomeres might be attributed to differences in loading rather than variations in copy numbers per cell.This was supported by the quantitative PCR (qPCR) results which indicated that the copy numbers of KmYACs per cell were all close to 1 (Figure 2E).

KmYACs are stably propagated without affecting cell growth
The mitotic stabilities of linear and circular KmYACs were investigated by culturing cells without selective pressure (Figure 3A).At 30°C, the average loss rate of KmYACs was ~0.48% per generation, which was comparable to the loss rate of YACs in S. cerevisiae (~0.3%) and was significantly lower than that of YACs in Y. lipolytica (~6%) 22,24,27 .At 40°C, Lin Km_TEL-K5 exhibited a relatively high loss rate (7.7% per generation), while other KmYACs still showed high stability with an average loss rate of 0.75% per generation.The high stability of most KmYACs at elevated temperatures might be associated with the thermotolerance characteristics of K. marxianus 40 .There was no statistical difference between the stabilities of KmYACs carrying K. marxianus telomeres and those carrying Tetrahymena telomeres, suggesting that heterologous telomeres support faithful segregation as well as native telomeres.Additionally, the configurations of KmYACs had no significant effect on stability, as circular KmYACs and linear KmYACs shared similar stabilities.The high stabilities of KmYACs provide advantages for long-term fermentation in various temperatures without selective pressure.The growth rate is a crucial factor in the production of microbial cell factories.Therefore, the effects of KmYACs on the growth of host cells were monitored.The growth curve of control cells was the same as the cells containing linear KmYACs with either K. marxianus telomeres or Tetrahymena telomeres (Figure 3B).Meanwhile, the growth curves of cells containing either linear or circular KmYACs were the same as that of control cells.Therefore, the presence of KmYACs, regardless of their configuration or type of telomere, did not affect the normal growth of host cells (Figure 3C).

KmYACs carrying disulfide bond formation modules improve expression levels of heterologous proteins
Next, the effects of KmYACs on the secretory expressions of heterologous proteins were investigated.Three industrial enzymes were chosen.BadGLA and TeGlaA are both glucoamylases, key enzymes used for the saccharification of starch 41,42 .Xyn-CDBFV belongs to lignocellulolytic enzymes applied in degrading lignocellulose to produce cellulosic ethanol 5 .Additionally, BadGLA, TeGlaA, and Xyn-CDBFV were predicted to contain 1, 4, and 5 pairs of disulfide bonds, respectively, which result in different demands for disulfide bond formation.
A multicopy plasmid expressing glucoamylase BadGLA was transformed into cells with KmYACs carrying diverse disulfide bond formation modules.Compared to the control cells, the secretory activities of BadGLA were significantly increased in cells containing KmYACs (Figure 4A).The magnitude of the increase ranged from 23% to 129%, with the highest increase observed in cells containing the linear KmYAC with K. marxianus telomeres and P5 module (Lin Km_TEL-P5).Three linear KmYACs were chosen to evaluate their effects on the expression of two other enzymes.Compared to the control, the linear KmYAC carrying the P5, P7, and K5 modules increased the activities of glucoamylase TeGlaA by 87%, 65%, and 123%, respectively (Figure 4B).The same set of KmYACs increased the activities of the β-1,4-endoxynlanase Xyn-CDBFV by 90%, 92%, and 56%, respectively (Figure 4C).The improved yield of TeGlaA and Xyn-CDBFV was confirmed by sodium dodecyl sulfatepolyacrylamide gel electrophoresis (SDS-PAGE) analysis (Figure 4D,E).Therefore, the KmYACs carrying disulfide bond formation modules exhibited universality in promoting the expression of heterologous proteins.
We subsequently investigated whether KmYACs promote the yield of heterologous protein in high-density fermentation.For this purpose, we cultivated cells carrying the BadGLA gene and either Lin Km_TEL-P5 (referred to as B-P5), Lin Km_TEL-P7 (referred to as B-P7), or Lin Km_TEL-K5 (referred to as B-K5) in a fed-batch 5 l fermentor.After 72 h of cultivation, the secretory activity of BadGLA in B-P5, B-P7, and B-K5, respectively, reached 7414.6, 8345.7, and 5973.2U/ml, representing increases of 116.5%, 143.7%, and 74.5% when compared to control cells (Figure 4F).The high-yield production of BadGLA was confirmed by SDS-PAGE analysis (Figures 4G and S2).Based on the specific activity of BadGLA (498.02U/mg) (Figure S3), the secretory expression of BadGLA in B-P5, B-P7, and B-K5 was determined to be 14.9, 16.8, and 12 g/l, respectively.It is noteworthy that the yield of 16.8 g/l represents the highest reported yield of a heterologous protein in K. marxianus thus far.In high-density fermentation, the yield of B-P7 was superior to that of B-P5, indicating that a more complete disulfide bond isomerization pathway transferred by KmYAC is more favorable in increasing heterologous protein yield.Additionally, the yields of B-P7 and B-P5 were higher than that of B-K5, suggesting that P. pastoris genes work more efficiently in protein synthesis and transportation than the native genes of K. marxianus.
To confirm the association between the improved yield of heterologous proteins and modules transferred by KmYAC, expression levels of module genes in KmYAC were evaluated.B-P7 and B-K5 were collected for analysis after cultivating 48 and 72 h.Compared to a native housekeep gene (KmSWC4), PpERO1, PpERV2, PpPDI1, PpMPD1-1, PpMPD1-2, PpMPD1-3, and PpYPT1 of the P7 module in B-P7 were all actively transcribed (Figure 5A).The effect of the P7 module on the expressions of homologous genes in K. marxianus was evaluated.After 48 h, B-P7 did not show any significant difference in the expression levels of KmERO1, KmPDI1, KmMPD1, and KmYPT1 compared to the control cells (Figure 5B).After 72 h, B-P7 did not show any significant difference in the expression levels of KmERO1, KmERV2, and KmPDI1 (Figure 5C).The P. pastoris genes introduced by KmYAC were actively expressed in vivo and were compatible with the native genes of the disulfide bond isomerization pathway in K. marxianus.In the case of KmYAC carrying native K. marxianus genes, the expression levels of KmERO1, KmERV2, KmPDI1, KmMPD1, and KmYPT1 were all increased in the B-K5 compared to control cells.The magnitude of the increase ranged from 0.24 to 28.82 times (Figure 5D).Therefore, the transfer of heterologous or native disulfide bond formation modules by KmYACs increased the overall expression of disulfide bond formation genes, likely enhancing the process of disulfide bond formation and facilitating secretory expression.S2.A smear at high molecular weight was proposed to be the hyperglycosylated form of BadGLA, as the smear disappeared after deglycosylation (Figure S3).

KmYACs carrying disulfide bond formation modules enhance flavin adenine dinucleotide (FAD) and lysine biosynthesis and increase metabolic flux into the tricarboxylic acid (TCA) cycle
To get an insight into the mechanism underlying the improved yield of heterologous proteins in cells containing KmYACs, we analyzed the transcriptome and metabolome of B-P7 and B-K5 and compared them with control cells (Figure 6A).We identified a total of 1406 differentially expressed genes (DEGs) between B-P7 and control cells, as well as 2068 DEGs between B-K5 and control cells.A total of 939 DEGs were found to be shared by B-P7 and B-K5 (Table S6).Subsequently, the identified DEGs underwent GO term enrichment analysis.Among the shared upregulated genes, genes implicated in mitochondrial translation, ergosterol biosynthetic process, ATP hydrolysis coupled proton transport, ubiquinone biosynthetic process, and vesicle-mediated transport were significantly enriched.Genes implicated in DNA integration, RNA-dependent DNA biosynthetic process, and vitamin B6 catabolic process were significantly enriched among the shared downregulation genes (p < 0.05, Figure 6A).Untargeted metabolomics analysis revealed that 268 and 308 metabolites were altered in B-P7 and B-K5, respectively, compared with control cells (p < 0.1).Pathways of pyrimidine metabolism, purine metabolism, pentose phosphate pathway, lysine biosynthesis, nicotinate and nicotinamide metabolism, and TCA cycle were both altered in B-P7 and B-K5 (Figure 6B,C).
Consistent changes in gene expression and metabolites in the same pathway suggest that the pathway is significantly altered with a high degree of confidence.For instance, six out of eight genes required for FAD biosynthesis were upregulated in B-P7 and B-K5, while the amount of precursor FMN and GTP was reduced in both strains (Figure 6D).These findings suggest that enhanced disulfide bond formation modules are associated with enhanced FAD biosynthesis, which is expected given that FAD functions as a cofactor of Ero1 and Erv2 in the oxidation of Pdi1 37 .
The expression of pyruvate dehydrogenase complex genes, specifically LAT1 and LPD1, was upregulated in both B-P7 and B-K5 strains.Additionally, the amount of lipoic acid, which serves as a substrate of pyruvate dehydrogenase, was The p-value of PpMPD1-1 group is 0.04, with no significant difference (ns) in other groups.(B, C) Relative mRNA levels of K. marxianus genes homologous to P7 module genes.Cells that did not contain KmYACs were used as a control.B-P7 and control cells were collected after cultivating 48 h (B) and 72 h (C).The mRNA levels of homologous genes in B-P7 and control cells were normalized to that of SWC4, and the relative levels were compared side by side.Statistical significance was determined using an unpaired two-tailed t-test.ns represents p > 0.05.(D) Relative mRNA levels of K5 module genes in B-K5.B-K5 cells were collected after cultivating 48 and 72 h. mRNA levels of K5 module genes in B-K5 were normalized to those of the same genes in the control cells.The p-value of KmMPD1 group is 0.01, with ns in other groups.

FIGURE 6 (Continued).
consistently reduced in both strains.These results suggest that there is a preferred metabolic flux from pyruvate into acetyl-CoA.Following this, the expression of citrate synthase (Cit1 and Cit3), which catalyze the production of citrate from acetyl-CoA, were upregulated, indicating an enhanced flux entering the TCA cycle.Most of the enzymes in the TCA cycle were upregulated in both strains, with the upregulation being more significant in B-K5.Overproduction of heterologous proteins imposes a metabolic burden on the host, potentially resulting in a higher demand for energy supply.In S. cerevisiae, the engineered strains significantly upregulated the TCA cycle and electron transport chain to meet the heightened energy demand for increased α-amylase production 43 .Overexpression of heterologous proteins, such as β-peptidase and the Fab fragment antibody 3H6 12 , marginally elevated TCA cycle flux and ATP production in P. pastoris 44,45 .Consequently, enhancing flux into the TCA cycle is an optional strategy among yeast overexpressing heterologous proteins (Figure 6E).
Amino acids serve as building blocks for protein biosynthesis.In B-K5, all ten genes required for lysine biogenesis showed unanimous upregulation, while the amounts of lysine and its precursor, saccharopine, were reduced (Figure 6F).Notably, this unanimous upregulation of synthetic genes and reduction of the end product was not observed in the biogenesis pathways of other amino acids.Therefore, these results strongly indicate a high demand for lysine in B-K5.To investigate this further, we supplemented the medium of B-K5 with 2 mM lysine and examined its effect on the yield of heterologous proteins.Compared to the medium without supplementation, the addition of lysine resulted in an 8% increase in secretory activities of BadGLA (Figure 6G).Arginine has similar characteristics to lysine, as both are basic amino acids.Six out of seven genes required for arginine biogenesis were upregulated in B-K5, including the rate-limiting gene ARG1 (Figure S4).Similar to the effects of adding lysine, adding arginine increased secretory activities by 5.3% (Figure 6G).As a control, adding isoleucine did not significantly improve the yield.This was consistent with the transcriptomic results, as CHA1, the rate-limiting gene of isoleucine biogenesis, was significantly downregulated in B-K5 (Figure S4).Therefore, cells overexpressing heterologous proteins might have a preferred demand for lysine and arginine.

DISCUSSION
In this study, YAC was constructed for K. marxianus for the first time.The effects of telomeres (Tetrahymena or K. marxianus) and configuration (linear or circular) on the stability of KmYAC were evaluated.Telomeres from Tetrahymena supported the stable propagation of KmYACs as well as K. marxianus telomeres, indicating that functional telomeres were formed at the end of Tetrahymena telomeres.It was observed that yeast can recognize and use telomeres from distantly related organisms.During the replication of linear plasmids in S. cerevisiae, telomere repeat units of S. cerevisiae C 1-3 A were added to the ends of Tetrahymena telomeres C 4 A 2

46
. Our results suggest that the exceptionally long telomeric repeat unit of K. marxianus (25 bp) can also be added to the ends of Tetrahymena telomeres containing short repeat units (6 bp).Therefore, it may be possible to construct YACs compatible with yeast species having long telomeric repeat units, such as Candida maltosa, Candida pseudotropicalis, Candida tropicalis, K. lactis, and Saccharomyces kluyveri by using heterologous telomeres with short repeat units (6-8 bp) 47,48 .Regarding the effect of configuration, our results showed that there was no difference in stability between circular and linear KmYACs.Moreover, both types of KmYACs did not affect the growth of the host.These results are consistent with a previous report on S. cerevisiae, where both linear and circular neochromosomes were present in one copy per cell, stable and innocuous to their host 49 .Circular YACs up to 600 kb can be purified from natural chromosomes for further manipulation, exhibiting better operability compared to linear YACs 50 .Therefore, circular YACs can serve as an alternative to linear YACs for transferring large functional modules in both S. cerevisiae and unconventional yeasts.
The largest module loaded into KmYACs was 15 kb, well below the maximum capacity of YAC to load exogenous DNA (~3 Mb).This was primarily because the disulfide formation modules contained rationally designed promoters, ORFs, and terminators.The rational design approach limited the size of a single fragment obtained either by chemical synthesis or PCR amplification.To address this issue, we used Gibson assembly to ligate two smaller cassettes, instead of a single long cassette, with the KmYAC shuttle vector in vitro (Figure 1).To fully utilize the loading capacity of KmYAC in the future, more cassettes might be assembled with the KmYAC vector in vivo, using transformation-associated recombination.A similar strategy was successfully applied in K. marxianus, where five fragments were ligated together 14 .Meanwhile, random recombination between cassettes could be performed to construct diverse modules in KmYACs, Figure 6.KmYACs enhance flavin adenine dinucleotide (FAD) and lysine biosynthesis and increase metabolic flux into the tricarboxylic acid (TCA) cycle.(A) Gene Ontology (GO) term enrichment analysis of differentially expressed genes (DEGs).Cells that did not contain KmYACs were used as a control.DEGs were identified between B-P7 cells and control cells, as well as between B-K5 cells and control cells (|fold change| ≥ 1.5).DEGs were subjected to GO enrichment analysis.Statistical significance was determined using an unpaired two-tailed t-test.The p-values of enrichment are shown on the x-axis (p < 0.05).(B, C) Metabolomic analysis of B-P7 and B-K5.Changes in metabolites were identified between B-P7 and control cells (B), as well as between B-K5 cells and control cells (C) (p < 0.1).Metabolic pathway enrichment was performed using MetaboAnalyst, based on the changes in metabolites.The p-value of the enrichment and pathway impact is shown in the graph.(D-F) Changes in gene expression levels and metabolites in the FAD biosynthesis (D), TCA cycle (E), and lysine biogenesis (F).Levels of gene expression levels in B-P7 or B-K5 were compared to those in control cells (p < 0.05), as well as levels of metabolites were also compared (p < 0.1).Gray-bordered boxes are used to indicate changes in gene expression levels, whereas black-bordered boxes are used to mark changes in metabolites.Metabolites that did not exhibit any significant change are labeled as blank in the boxes.(G) Secretory activity of BadGLA expressed by B-K5 and control cells in YG medium supplemented with 2 mM lysine, arginine, or isoleucine.Values represent mean ± SD (n = 3) (*p < 0.05).
resulting in a library for screening purposes.For example, seven enzymes from the flavonoid pathway were individually cloned into expression cassettes and then randomly combined on YACs to screen for specific flavonoid compounds in S. cerevisiae 28 .Therefore, by introducing novel strategies to improve the loading capacity and diversity of assemblies, KmYACs are expected to find more applications in metabolic engineering and synthetic biology.
Although K. marxianus is not as widely used as P. pastoris in the production of heterologous proteins, it is regarded as a promising cell factory due to its various advantageous features.For example, K. marxianus has been reported as the fastest-growing eukaryote, with a specific growth rate of up to 0.80 h −151 .The fermentation of K. marxianus usually takes 2-3 days 5 .In contrast, P. pastoris exhibits a specific growth rate of only 0.18 h −1 , and it usually takes 7-8 days to complete fermentation 52 .K. marxianus is well-known for its thermotolerance and its production performance remains stable between 30°C and 37°C 2,40 .On the other hand, P. pastoris requires a growth temperature of 28-30°C, and its production performance rapidly deteriorates at 32°C 53 .Furthermore, the expression of heterologous proteins in P. pastoris relies on methanol induction, which may pose risks to both the production process and the safety of the products 53 .In contrast, K. marxianus uses constitutive promoters to express proteins and its culture medium contains inorganic salts and glucose, ensuring the safety of both production and products 5 .However, the maximum yield of heterologous proteins in K. marxianus remains lower than that in P. pastoris.Therefore, transferring essential genes from P. pastoris that control protein folding and maturation into K. marxianus is a feasible strategy to enhance protein yield.Given this consideration, five or seven genes from the disulfide bond formation modules of P. pastoris were introduced into K. marxianus using KmYAC.These genes effectively functioned in vivo, and their expression levels were compatible with the genes involved in the native disulfide bond isomerization pathway in K. marxianus (Figure 5).
Transcriptomic and metabolomic analyses showed that the introduction of disulfide bond formation modules via KmYAC caused significant changes in various cellular processes, which may have contributed to the improved yield of heterologous protein in B-P7 and B-K5 (Figure S5).Among the enhanced processes, the TCA cycle, mitochondrial respiratory chain, ergosterol synthesis, and DNA synthesis were also upregulated in S. cerevisiae cells exhibiting a high yield of heterologous proteins 43,54,55 , suggesting that upregulation of these processes might be a conserved response to the stress caused by overexpression of heterologous proteins (Figure S5).While the upregulation of FAD synthesis suggests active oxidation of disulfide bonds inside ER, upregulation of the thioredoxin and glutathione antioxidant systems might be required to counteract the aberrant oxidation of free sulfhydryl groups in the intracellular environment (Figure S5) 37,56 .
Adjusting the amount of a single amino acid in the medium has different effects on the yield of heterologous proteins.
For instance, when the amount of methionine in the medium was reduced, the expression of amyloid-β peptides in S. cerevisiae increased by twofold.On the other hand, increasing the concentration of cystine in the medium did not affect the yield 43 .In this study, transcriptomic and metabolomic analysis of B-K5 suggests an urgent demand for the supply of arginine and lysine, but not for isoleucine.Consistently with these findings, adding lysine or arginine, but not isoleucine, further improved the yield of a heterologous protein in the B-K5.Lysine and arginine are basic amino acids commonly found in the target sequence of mitochondrial proteins 57,58 .The import of mitochondrial proteins was significantly upregulated in B-K5 (Figure 6A), and supplementation with arginine and lysine might improve this process, which could be coupled with the enhanced mitochondrial respiratory chain in B-K5.Upregulation of the respiratory chain might improve protein production by increasing the energy support of ATP and NADH 43 .Therefore, supplementation with lysine might promote the production of acetyl-CoA, which was consistent with a preferred metabolic flux into the TCA cycle in B-K5 (Figure 6E).In further studies, it would be intriguing to investigate the effect of supplementing other amino acids on the yield.

Strains and medium
A wild-type K. marxianus strain (FIM1) was deposited at China General Microbiological Culture Collection Center (CGMCC, No 10621) 5 .URA3, HIS3, and TRP1 were deleted in FIM1 and the resultant strain was named FIM-1ΔUΔHΔT, which was used as a host strain in this study 59 .Cells were cultivated at 30°C.FIM1 and FIM-1ΔUΔHΔT were grown in YPD medium (2% peptone, 1% yeast extract, 2% agar for plates).Synthetic dropout medium without histidine and tryptophan (SC-His-Trp) and that without uracil (SC-Ura) were prepared as described before 60 .For expressing heterologous proteins, cells were grown in a YG medium (2% yeast extract and 4% glucose).

Construction of the YAC
The tandem cassette containing PpPDI1, PpMPD1-2, and PpYPT1 was amplified from LHZ1016 and named P5 cassette I, and the cassette containing PpMPD1-1 and PpMPD1-3 was amplified from LHZ1017 and named P5 cassette II.The tandem cassette containing PpPDI1, PpMPD1-2, PpYPT1, and PpERV2 was amplified from LHZ1016 and named P7 cassette I, and the cassette containing PpMPD1-1, PpMPD1-3, and PpERO1 was amplified from LHZ1017 and named P7 cassette II.The tandem cassette containing KmPDI1, KmMPD1, and KmYPT1 was amplified from LHZ1018 and named K5 cassette I, and the cassette containing KmERV2 and KmERO1 was amplified from LHZ1019 and named K5 cassette II.To construct the linear KmYAC, LHZ1014 or LHZ1015 was digested with NotI and BamHI to release two arms and the filler sequence.After dephosphorylation, the digested product of LHZ1014 was ligated with P5 cassette I and II by Gibson assembly to form Lin Te_TEL-P5, ligated with P7 cassette I and II to form Lin Te_TEL-P7, and ligated with K5 cassette I and II to form Lin Te_TEL-K7.Similarly, the digested product of LHZ1015 was ligated with P5 cassettes to form Lin Km_TEL-P5, ligated with P7 cassettes to form Lin Km_TEL-P7, and ligated with K5 cassettes to form Lin Km_TEL-K5.To construct the circular KmYAC, LHZ1015 was digested with NotI and dephosphorylated.The digested product was ligated with P5 cassettes to form Cir Km_TEL-P5, ligated with P7 cassettes to form Cir Km_TEL-P7, and ligated with K5 cassettes to form Cir Km_TEL-K5.The ligation product was transformed into FIM-1ΔUΔHΔT by lithium acetate method 65 , and selected on SC-His-Trp plates.Transformants were identified by colony PCR.Primers used in the construction are listed in Table S1.
Enzymatic assays and SDS-PAGE LHZ1020, LHZ1021, or LHZ745 were transformed into cells containing KmYACs.Transformants were selected on SC-Ura plates and cultivated in 50 ml YG medium at 30°C for 72 h.The supernatant was collected for enzymatic assays and SDS-PAGE analysis.The activities of glucoamylases (BadGLA and TeGlaA) were determined by using 1% (w/v) soluble starch as a substrate.The activity of Xyn-CDBFV was determined by using 1% wheat arabinoxylan as a substrate.One unit of activity of these three enzymes was defined as the amount of enzyme required to release 1 μmol glucose per minute 63 .Fed-batch fermentations of strains expressing BadGLA were conducted in a 5-l bioreactor (BXBIO) as described previously 5 .The supernatant was collected after 24, 48, 60, and 72 h.Samples were subjected to the SDS-PAGE analysis and enzymatic assay.

Southern blot
Southern blot was performed as described before 66 .In brief, cells containing KmYACs were embedded in agarose plugs.Spheroplasts were created and lysed in the plugs.Subsequently, the plugs were subjected to pulsed-field electrophoresis (Bio-Rad).Linear KmYACs were resolved directly.Circular KmYACs were entrapped in the agarose plugs and separated from the natural chromosomes during electrophoresis.Plugs containing circular KmYACs were digested with NotI and subjected to the pulsed-field electrophoresis again to release the linearized KmYACs.KmYACs were transferred from gel to a nylon membrane (Beyotime) for about 16-20 h.The membrane was hybridized with DNA probes prepared using a Biotin Random Primer DNA Labeling Kit (Beyotime) and then detected using Chemiluminescent Biotin-labeled Nucleic Acid Detection Kit (Beyotime).The probe targets HphMX4 and its sequence is listed in Table S1.

Determination of copy numbers of KmYACs
Transformants containing KmYACs were grown in a 3 ml SC-His-Trp medium overnight.Genomic DNA was extracted from the cells and subjected to quantitative PCR, as described previously 63 .The copy number of KmYACs was determined by comparing the level of HphMX4 to that of endogenous LEU2 63 .

Stabilities of KmYACs
Transformants containing KmYACs were cultivated in the 3 ml SC-His-Trp medium overnight.The resulting start culture was diluted into 50 ml YPD medium at an OD 600 of 0.2 and grown for 24 h (about seven generations).This process was repeated for the next 4 days, resulting in a total of 35 generations.Cells from both the initial culture and final culture were collected and spread onto the YPD or SC-His-Trp plate at an appropriate dilution.The stability of the plasmid was determined as the ratio of colonies formed on the SC-His-Trp plate to that on the YPD plate.The loss percentage of the KmYAC per generation was calculated as described previously 16 .The experiment was performed with three parallel cultures.

Growth curves of cells containing KmYACs
Cells containing KmYACs were grown in 3 ml SC-His-Trp medium.The culture was diluted into 50 ml YPD medium at an OD 600 of 0.2 and grown for 72 h at 30°C.OD 600 of the culture was measured after 6, 9, 12, 24, 30, 36, 48, 54, 60, and 72 h.The experiment was performed with three parallel cultures.

qPCR and RNA-seq
Transformants containing LHZ906 and KmYAC (Lin Km_TEL-P7 or Lin Km_TEL-K5) were cultivated in the 3 ml SC-His-Trp medium at 30°C overnight.The overnight cultures were diluted into 50 ml YG medium at an OD 600 of 0.2 and grown at 30°C for 72 h.Cells were collected at 48 h and 72 h after cultivation.For qPCR, RNA was extracted from frozen cells using Quick-RNA Fungal/Bacterial Miniprep kit (Zymo Research) and was reverse-transcribed using a PrimeScript RT Reagent Kit (Takara).qPCR was performed using TB Green Premix Ex Taq (Takara).Primers used in qPCR are listed in Table S1.For RNA-seq, RNA was extracted from cells collected at 48 h by using TRIzol Reagent (Invitrogen), reversed transcribed using TruSeqTM RNA sample preparation Kit (Illumina), and sequenced by Illumina NovaSeq.6000 system (BIOZERON).After the quality control, the raw paired-end reads (150 bp*2) were separately aligned to the K. marxianus FIM1 reference genome and preprocessed by the RNA-seq pipeline from Shanghai BIOZERON Co., Ltd.DEGs (|fold change|≥1.5)were significantly enriched in GO terms, in which the p-value of GO terms was less than 0.05.RNA-seq data are listed in Table S4.

Metabolomics analysis
Cells were collected at 48 h after cultivation as described in RNA-seq.Metabolites were extracted from cells for untargeted metabolomics analysis (BIOZERON).UHPLC-MS/ MS analyses were conducted utilizing a Vanquish UHPLC system (Thermo Fisher Scientific) connected to an Orbitrap Q Exactive™ HF mass spectrometer (Thermo Fisher Scientific).Metabolic pathway analysis was carried out using Metab-oAnalyst (http://www.metaboanalyst.ca).p < 0.1 was considered statistically significant for differential metabolites, while p < 0.05 was considered statistically significant for metabolic pathway enrichment.Metabolomic data are listed in Table S5.

Statistical analysis
Each assay subjected to statistical analysis comprised three biological replicates.Statistical significance was determined using an unpaired two-tailed t-test.p < 0.05 was considered statistically significant.

Figure 1 .
Figure 1.Schematic map of KmYACs construction.The KmYAC shuffle vector was composed of a replication origin (ARS1), a centromere (CEN5), two inverted telomeres (TEL) from either Tetrahymena or Kluyveromyces marxianus, and three selection markers (HIS3, TRP1, and HphMX4).KmYAC shuffle vectors were digested and ligated with exogenous fragments by Gibson assembly in vitro to construct linear or circular KmYAC.The ligation product was then transformed into K. marxianus cells and the transformants containing KmYACs were selected.

Figure 2 .
Figure 2. The construction of KmYACs carrying disulfide bond formation modules.(A) Proteins encoded by the genes in the disulfide bond formation modules.ER oxidase (Ero1 and Erv2), protein disulfide isomerase (Pdi1 and Mpd1), and Rab family GTPase (Ypt1) were included in the module.(B) Composition of P5, P7, and K5 modules.Each module contained genes either from Pichia pastoris or Kluyveromyces marxianus.The module was loaded onto the KmYAC shuttle vector to construct KmYAC.(C) The characteristics of nine KmYACs.Assembly fidelity was calculated as the ratio of positive clones containing expected KmYACs to total clones.(D) Southern blot of KmYACs.(E) The copy number of KmYACs per cell.The copy number of KmYAC was calculated relative to that of endogenous LEU2.The values presented represent the mean ± SD (n = 3).

Figure 3 .
Figure 3. KmYACs are stably propagated without affecting cell growth.(A) Loss percentage of the KmYAC per generation in cells grown without selective pressure at 30°C or 40°C.Values represent mean ± SD from three parallel cultures.(B) Growth curves of cells containing linear KmYACs with telomeres either from Kluyveromyces marxianus or Tetrahymena at 30°C.Cells without any KmYAC served as a control.Values represent mean ± SD (n = 3).(C) Growth curves of cells containing linear or circular KmYACs.

Figure 4 .
Figure 4. KmYACs promote the secretory expression of heterologous proteins.(A-C) The activity of BadGLA (A), TeGlaA (B), and Xyn-CDBFV (C) in the supernatant of culture grown with cells containing KmYACs.The cells containing KmYACs were transformed with a plasmid expressing the indicated heterologous protein, while cells without KmYACs were transformed with the plasmid as a control.Transformants were cultivated for 72 h and the activity in the supernatant was measured.The activity from control cells was designated as unit 1.The values represent the mean ± SD (n = 3).Statistical significance was determined using an unpaired two-tailed t-test.*p < 0.05, **p < 0.01, ***p < 0.001.(D, E) Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) of the supernatant described in (B, C), respectively.(F) Activity curves of BadGLA in supernatant of culture grown in a 5 l fermentor.Values represent the mean from three technical repeats.(G) SDS-PAGE of 16 μl supernatant of culture grown in 5 l fermentor for 72 h.SDS-PAGE analysis of diluted samples was shown in FigureS2.A smear at high molecular weight was proposed to be the hyperglycosylated form of BadGLA, as the smear disappeared after deglycosylation (FigureS3).

Figure 5 .
Figure 5. Module genes transferred by KmYACs are actively expressed.(A) Relative mRNA levels of P7 module genes in B-P7.B-P7 cells were collected after cultivating 48 and 72 h. mRNA levels of P7 module genes were normalized to that of SWC4.Values represent mean ± SD (n = 3).The p-value of PpMPD1-1 group is 0.04, with no significant difference (ns) in other groups.(B, C) Relative mRNA levels of K. marxianus genes homologous to P7 module genes.Cells that did not contain KmYACs were used as a control.B-P7 and control cells were collected after cultivating 48 h (B) and 72 h (C).The mRNA levels of homologous genes in B-P7 and control cells were normalized to that of SWC4, and the relative levels were compared side by side.Statistical significance was determined using an unpaired two-tailed t-test.ns represents p > 0.05.(D) Relative mRNA levels of K5 module genes in B-K5.B-K5 cells were collected after cultivating 48 and 72 h. mRNA levels of K5 module genes in B-K5 were normalized to those of the same genes in the control cells.The p-value of KmMPD1 group is 0.01, with ns in other groups.