Augmenting glycosylation‐directed folding pathways enhances the fidelity of HIV Env immunogen production in plants

Abstract
Heterologous glycoprotein production relies on host glycosylation‐dependent folding. When the biosynthetic machinery differs from the usual expression host, there is scope to remodel the assembly pathway to enhance glycoprotein production. Here we explore the integration of chaperone coexpression with glyco‐engineering to improve the production of a model HIV‐1 envelope antigen. Calreticulin was coexpressed to support protein folding, together with Leishmania major STT3D oligosaccharyltransferase to improve glycan occupancy and an RNA interference construct to suppress the formation of truncated glycans. Nicotiana benthamiana plants lacking α1,3‐fucosyltransferase and β1,2‐xylosyltransferase were used as the expression host to prevent the formation of plant‐specific complex N‐glycans. This approach reduced the formation of undesired aggregates, which predominated in the absence of glyco‐engineering. The resulting antigen also exhibited increased glycan occupancy, albeit to a slightly lower level than the equivalent mammalian cell‐produced protein. The antigen was decorated almost exclusively with oligomannose glycans, which were less processed compared with the mammalian protein. Immunized rabbits developed comparable immune responses to the plant‐produced and mammalian cell‐derived antigens, including the induction of autologous neutralizing antibodies when the proteins were used to boost DNA and modified vaccinia Ankara virus‐vectored vaccines. This study demonstrates that engineering glycosylation‐directed folding offers a promising route to enhance the production of complex viral glycoproteins in plants.


| INTRODUCTION
Molecular farming has long promised the production of low-cost vaccines on a large scale (Rybicki, 2009) and the potential for rapid production time frames to respond to emerging outbreaks . In reality, however, the platform has mostly been confined to niche applications (Stoger et al., 2014) and, with the exception of a few products that have undergone clinical evaluation, the development of many potentially valuable plant-made proteins (PMPs) has not progressed beyond academia (E. . Although this is partly attributable to limited availability of good manufacturing practice (GMP)-compliant production facilities (Stoger et al., 2014), a major obstacle for the widespread acceptance of plant molecular farming is the observation that the host cellular machinery does not always support the required posttranslational modifications that are critical for protein folding and functionality (E. A. . These constraints are particularly problematic for viral glycoproteins, which are important targets for vaccination (Sanders & Moore, 2021) and which often have unusually complex maturation requirements (Watanabe et al., 2019).
Consequently, efforts are increasingly aimed at addressing bottlenecks in planta that impair the production and functionality of PMPs (Margolin, Crispin, et al., 2020). The low yields reported for many viral glycoproteins in plants are probably partly related to the host chaperone machinery, as the coexpression of human chaperones (Protein origami™) has been demonstrated to substantially improve the accumulation of human viral glycoproteins in Nicotiana benthamiana (Margolin, Oh, et al., 2020; Shin et al., 2021). This approach has also been reported to alleviate the endoplasmic reticulum-related stress response and accompanying pathology, which was observed following expression of a soluble HIV-1 envelope (Env) gp140 antigen (Margolin, Oh, et al., 2020). The absence of furin proteases similarly complicates the production of properly matured viral glycoproteins, as has been described for other PMPs that require furin-mediated processing for activation (Mamedov et al., 2019; Wilbers et al., 2016). This can be addressed by the coexpression of furin (Margolin, Oh, et al., 2020; Margolin et al., 2022), or by the use of cleavage-independent antigen designs (Margolin et al., 2019) where a flexible linker can compensate for proteolytic processing (Georgiev et al., 2015; Sarkar et al., 2018). Similar approaches could probably also be applied to other viral glycoproteins if the respective protease required for processing is absent, and this approach is also compatible with the coexpression of human chaperones if both are found to be necessary for production of a given protein (Margolin, Oh, et al., 2020; Margolin et al., 2022).
The impact of the plant glycosylation machinery on viral glycoprotein production and immunogenicity is poorly understood, owing in part to the challenge of producing sufficient recombinant material for analysis and the need for direct comparison of plant-derived material with the equivalent protein when produced in mammalian cells. Nonetheless, the glycosylation of viral glycoproteins is central to the folding of these proteins and host-derived glycans direct the interaction of glycoproteins with various folding partners, co-ordinate their trafficking along the secretory pathway and impose quality control to remove aberrantly folded proteins (Xu & Ng, 2015). PMPs are typically decorated with plant-specific complex glycans that contain β1,2-xylose and core α1,3-fucose moieties (Montero- . Truncated glycans and Lewis A structures are also commonly reported for plant-produced proteins. Recently, a growing number of reports have established that some PMPs may be under-glycosylated (Castilho et al., 2018; Goritzer et al., 2020; Singh et al., 2020) and, although this is a common limitation of heterologous expression systems, it appears to be exaggerated in plants, further complicating the widespread use of the production platform (Margolin, Crispin, et al., 2020; Singh et al., 2020).
The impact of plant-specific glycosylation has been the subject of much debate, but it is important to distinguish between glycosylation patterns that merely reflect the machinery of the host cells and those that are of significance. The longest standing concern for PMPs is the potential of plant-specific glycans to be immunogenic in humans (Gomord et al., 2010). Given that these epitopes are foreign, it has been suggested that immune responses against glyco-epitopes could result in rapid clearance of the recombinant protein, or even anaphylaxis in extreme cases (Gomord et al., 2010). Concerns for the latter are reinforced by reports of hypersensitivity following the clinical administration of mammalian cell-produced recombinant monoclonal antibodies containing foreign glyco-epitopes. This is exemplified by the observation that anaphylactic reactions to the monoclonal antibody (mAb) cetuximab were associated with IgE responses to non-native galactose-α-1,3-galactose, which arose following expression in the mouse SP2/0 cell line (Arnold & Misbah, 2008). Conversely, plant-produced influenza virus-like particles (VLPs) decorated with typical plant-derived N-glycans were safe in immunized volunteers with preexisting plant allergies, although transient induction of IgG and IgE to plant glyco-epitopes was observed (Ward et al., 2014). More recently, plant-produced severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and rotavirus VLPs bearing their respective glycoproteins were similarly shown to be safe in humans (Kurokawa et al., 2021; Ward et al., 2021). The difference between vaccines and mAbs could be due to the very different dosing, with vaccines typically being used at far lower concentrations and total amounts than mAbs.
These observations have resulted in considerable optimism for molecular farming, which has been bolstered by reports of Medicago Inc.'s influenza and SARS-CoV-2 vaccines demonstrating efficacy and safety in phase 3 trials (Hager et al., 2022; Medicago, 2021; Ward et al., 2021). Their rotavirus VLP vaccine also generated promising immune responses in humans in an early-stage clinical trial (Kurokawa et al., 2021). Nonetheless, many other viral glycoproteins may require more faithful recapitulation of their native glycosylation to support appropriate folding and immunogenicity (Margolin, Crispin, et al., 2020).
Considerable effort has been devoted to "humanizing" the N-linked glycosylation of PMPs by eliminating plant-specific glycan moieties and by generating mammalian-type N-glycan extensions, which would not otherwise occur in plants. Notable achievements include the generation of mutant plants that are deficient for β1,2-xylosyltransferase and α1,3-fucosyltransferase expression (termed ΔXF), thereby eliminating the formation of plant-specific complex glycans (Strasser et al., 2008), the suppression of endogenous hexosaminidases (HEXOs) to prevent the formation of truncated glycans (Shin et al., 2017), and increased glycan occupancy following the coexpression of Leishmania major STT3D oligosaccharyltransferase (LmSTT3D; Castilho et al., 2018; Goritzer et al., 2020). Success has also been reported for the production of PMPs containing mammalian-type glycan extensions that are commonly observed on viral glycoproteins (Watanabe et al., 2019), including core α1,6-fucosylation (Castilho et al., 2011), β1,4-galactosylation (Schneider et al., 2015), and even sialylation, although these have not yet been applied to plant-produced viral glycoproteins (Castilho et al., 2010). The development of these approaches presents a powerful toolbox to produce PMPs with tailor-made glycosylation and offers the prospect of producing viral glycoproteins that recapitulate important features of their native glycosylation. Nonetheless, these approaches have not been well explored for viral glycoproteins and it is presently unclear how well they will work in the context of such heavily glycosylated targets (Margolin, Crispin, et al., 2020).
We have pursued plant-based production of the HIV Env as a model viral antigen with complex folding and glycosylation requirements to explore how the plant cellular machinery impacts the production of such complex glycoproteins. Env is highly dependent on the host cell machinery for its biosynthesis (Checkley et al., 2011) and therefore this presents an opportunity to investigate potential constraints along the secretory pathway that could impede the production of other similarly complex viral glycoproteins. The native protein is an extensively glycosylated trimer in which approximately half the mass is attributed to host-derived glycans (Stewart-Jones et al., 2016). These glycans are critical for protein folding as they dictate interactions with the host chaperone machinery (Watanabe et al., 2019). They are also important for viral fitness, as they protect vulnerable epitopes from neutralizing antibodies (Wei et al., 2003). The glycans decorating Env are predominantly oligomannose-type, as their dense clustering and the quaternary structure of the trimer sterically hinder access by host mannosidases, thereby impeding their maturation. More extensively processed, complex glycans are present at the base of the glycoprotein where they can be accessed by the host glycan processing machinery (Struwe et al., 2018). Although glycans can be recognized as part of epitopes targeted by neutralizing antibodies (Wibmer et al., 2016), holes in the glycan shield are more common targets for neutralization (Charles et al., 2021; McCoy et al., 2016; Pejchal et al., 2011; Yang et al., 2020) as protein epitopes are comparatively more immunogenic than host-derived carbohydrates (Zhou et al., 2017). Accordingly, recombinant immunogens with artificial holes in the glycan shield, or misfolded antigens that expose regions which are sequestered in the virion-associated trimer, induce functionally irrelevant antibodies against these regions.
Recapitulating the glycan shield is therefore often an important consideration for vaccine design and investigating plant-based production of the HIV Env glycoprotein provides an opportunity to evaluate how closely the system can reproduce these features where necessary.
To establish the impact of plant-specific glycosylation on viral glycoprotein production, we previously defined the site-specific occupancy of several prototype antigens and compared them to the equivalent human embryonic kidney 293 (HEK293) cell-produced proteins. Notably, the plant-produced material displayed increased oligomannose and truncated glycans, negligible complex-type glycans, as well as elevated levels of under occupancy compared with the equivalent mammalian cell-produced protein. The latter observation could account for the aggregation of recombinant HIV-1 Env gp140 and Marburg virus glycoprotein, which we previously observed following their expression in plants, as well as potentially impaired folding and immunogenicity of the gp140 immunogen (Margolin, Crispin, et al., 2020). In this current study, we sought to address these constraints using HIV Env gp140 as a challenging model antigen exhibiting extensive posttranslational modifications (Margolin et al., 2019), with the intention of integrating approaches to support both chaperone-mediated folding and glycosylation. Remodeling the secretory pathway presents a promising route to enhancing the fidelity of recombinant HIV and other complex viral glycoproteins in plants.
MARGOLIN ET AL. | 2921

2 | RESULTS
2.1 | Production of a "glycan-enhanced" HIV Env gp140 antigen in plants
Site-specific glycan analysis of plant-produced gp140 has previously shown that the antigen was under-glycosylated in plants and that the protein contained truncated glycans that were absent in the equivalent mammalian cell-produced protein. Accordingly, in this study an integrated series of expression approaches (NXS/T Generation™) were conceived to address these constraints to produce an improved variant of the antigen in plants, which is hereafter referred to as "glycan-enhanced" (GE) gp140.
Here we use the phrase GE to reflect material generated in plants using these strategies, with the aim of improving the glycosylation so that it more closely resembles the equivalent mammalian cell-produced protein. The protein was coexpressed with human calreticulin (CRT), as previously described, to improve production yields (Margolin, Oh, et al., 2020); L. major LmSTT3D was coexpressed to improve glycan occupancy (Castilho et al., 2018); and an RNA interference construct was coexpressed to suppress the activity of HEXO3 (HEXO3RNAi), which is responsible for truncating glycans to yield paucimannosidic structures and eventually core structures (Shin et al., 2017). These expression constructs were validated in previous studies and their effect has already been reported (Castilho et al., 2018; Margolin, Oh, et al., 2020; Margolin et al., 2022; Shin et al., 2017). These approaches were combined in N. benthamiana ΔXF to prevent the formation of undesired species-specific complex glycans, with the intention of producing an antigen with improved glycosylation but lacking undesirable plant-specific glycan modifications (Strasser et al., 2008).
The three variants of the protein were purified by sequential gels largely mirrored these observations, although the ability to discriminate between putative trimers and aggregates was markedly decreased owing in part to lower yields for the WT protein (Figure 1b). These data suggest that the host N-linked glycosylation machinery does not support efficient glycosylation of the gp140 and that inefficient glycosylation is responsible for undesired aggregation of the protein in plants.
Western blot analysis of the purified proteins also demonstrated a considerable improvement in the production of the antigen in plants following engineering (Figure 1c). The WT protein was poorly resolved by SDS-PAGE and presented as a diffuse smear, as previously described. In contrast, the GE antigen yielded a defined band, although some higher order products were also observed, which likely represent a minor population of misfolded protein. The GE protein was slightly smaller than the equivalent mammalian cell-produced protein, as is expected given the lack of complex N-linked glycans in N. benthamiana ΔXF, which would be imparted by mammalian cells.

| Site-specific glycosylation of GE gp140
To verify the impact of our integrated engineering approach, the quantitative site-specific glycosylation of the three gp140 variants was determined by liquid chromatography-mass spectrometry.
TABLE 1 Summary of expression approaches implemented to produce recombinant HIV Env gp140 in plants and mammalian cells
Modeling of the site-specific glycosylation data onto a model of the trimer enables the distribution of the various glycans on the trimer to be visualized more easily (Figure 2). This representation also depicts the location of large holes in the glycosylation shield, which are likely targets of antibodies given the observation that protein epitopes are comparatively more immunogenic than host-derived glycosylation (Zhou et al., 2017). As expected, lower levels of oligomannose-type glycans are visible on the mammalian protein where the glycosylation is more sparse and the glycans are more accessible for processing by host mannosidases (Figure 2a). This representation of the glycosylation also reveals that although the GE protein is less glycosylated than the mammalian cell protein at several sites (Supporting Information: Tables S1-S3), engineering has resulted in only a single site with <50% occupancy (N289; Figure 2c). This contrasts with the mammalian cell-produced protein where N625, N637, and N611 are all under occupied in >50% of the sites sampled (Figure 2a). This is similarly observed in the WT protein for N625 and N611 which, in addition to these sites, also exhibits <50% occupancy at several other potential N-linked glycosylation sites (PNGs) (N241, N289, N276; Tables S1-S3). The data in Figure 3 demonstrate a marked decrease in unoccupied PNGs at multiple sites across the GE protein compared with the WT antigen (N130, N139, N160, N230, N289, N386, N393, N448, N611, N625, N637). The observed increase at N160 is encouraging as this glycan forms part of an epitope at the trimer apex that is frequently targeted by broadly neutralizing antibodies (Wibmer et al., 2015).
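The potential N-linked glycosylation sites (PNGs) referred to above are defined by the N-X-S/T sequon, where X is any residue except proline. As an illustration only, the following Python sketch locates sequons in a protein sequence. The function name `find_pngs` and the toy sequence are hypothetical, the input is assumed to be an ungapped one-letter amino acid string, and the positions returned use the input sequence's own numbering rather than the HxB2-aligned numbering used in this study.

```python
import re
from typing import List

# N-X-S/T sequon: asparagine, any residue except proline, then serine or
# threonine. A zero-width lookahead is used so that overlapping sequons
# (for example "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_pngs(sequence: str) -> List[int]:
    """Return the 1-based position of the asparagine in each N-X-S/T sequon.

    Positions refer to the supplied sequence, not to HxB2 numbering.
    """
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MNGTANPSNAS"  # hypothetical toy sequence, not CAP256 Env
    print(find_pngs(toy_sequence))
```

Mapping the reported positions onto HxB2 numbering would additionally require an alignment to the HxB2 reference, which is outside the scope of this sketch.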
A slight increase in glycan occupancy is similarly observed for N386 which forms part of the intrinsic mannose patch  contains similar levels of truncated and core glycans suggesting that suppression of HEXO has a negligible effect (Supporting Information: Table S4).
Comparing the GE protein to the mammalian cell-produced antigen (Figure 4) reveals that despite improvements, the GE antigen still contains lower occupancy at numerous sites (N160, N187, N262, N276, N289, N362, N386, N393, N411). This includes a key broadly neutralizing antibody epitope, the N160 glycan, at the trimer apex, which has been observed to form target epitopes during natural infection, including for the CAP256 virus, which was used to design the immunogens described in this study (Moore et al., 2011).
Nonetheless, the GE protein contains improved glycan occupancy compared with the progenitor WT protein (Supporting Information: Tables S1, S2, and S4). Interestingly, four other PNGs in the GE protein exhibited increased occupancy compared with the mammalian protein (N130, N611, N625, N637). As expected, the mammalian protein also contains a diverse array of complex glycans which are lacking in the GE protein (Supporting Information: Table S5). Nonetheless, oligomannose-type glycans predominate at N160 and N332, and are abundant at N393 (Supporting Information: Table S3), as expected for the intrinsic mannose patch where glycan processing is sterically hindered. Mannose trimming was also notably decreased in the GE protein compared with the antigen expressed in mammalian cells.
FIGURE 2 Site-specific glycosylation of CAP256 gp140 produced in mammalian cells and in plants with either unmodified or modified glycosylation. (a) Heat map depicting the abundance of oligomannose-type glycans when HIV envelope (Env) gp140 is produced in human embryonic kidney 293 (HEK293) mammalian cells. The model was constructed using SWISS-MODEL using the structure of BG505 NFL as a template (6B0N). A representative glycan corresponding to the most abundant glycoform present at each site was modeled and sites are colored according to the abundance of oligomannose-type glycans detected at each site. All glycans are labeled according to the asparagine they are attached to, following alignment with HxB2. Where an N-linked glycan site was occupied by N-linked glycans on <50% of sites, the glycan modeled was colored blue. (b) The model generated for HEK293 produced CAP256 Env was recolored according to the site-specific glycan data for WT material. (c) The model generated for HEK293 produced CAP256 Env was recolored according to the site-specific glycan data for "glycan-enhanced" Env.
An alternate approach to interpreting these data sets is to view them from a global perspective rather than analyzing them for each PNG. This highlights the relative abundance of different glycan species across the entire protein to identify expression-system-dependent patterns that are less confounded by how well the protein is folded. It should be noted, however, that as not all sites are included in the data, this should be considered as a representation of the overall trend of the data rather than as an absolute measure. A more accurate indication of the global changes between the proteins would require further delineation of the site-specific glycan composition at every PNG on each protein to avoid any confounding effect. The GE protein is compared with the WT protein in Figure 5a and then to the mammalian cell-produced antigen in Figure 5b. As before, the glycan composition is depicted as the percentage point change in the GE protein compared with its comparator. Therefore, positive and negative values indicate an increase or decrease in the relative abundance of a particular glycan, respectively. It is important to note that certain sites could not be resolved, for example, N406 on all samples analyzed, and the values presented represent the average of sites that could be obtained. The GE gp140 also displays improved glycan processing as evidenced by decreased Man8 and elevated Man6 and Man5 glycans. The extent of under occupancy is also noticeably reduced in the GE protein.
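The percentage point comparison described above is a simple difference between two relative abundances, restricted to entries that were resolved in both samples. The Python sketch below illustrates that arithmetic together with the <50% occupancy flag used in the figures. It is a minimal sketch and not the analysis pipeline used in this study; the function names are hypothetical, the numbers in the usage example are invented, and unresolved entries are assumed to be represented as `None`.

```python
from typing import Dict, List, Optional

Abundance = Dict[str, Optional[float]]  # label -> percentage, None if unresolved


def percentage_point_change(test: Abundance, comparator: Abundance) -> Dict[str, float]:
    """Return test minus comparator for every entry resolved in both samples.

    Positive values indicate an increase in the test sample.
    """
    changes: Dict[str, float] = {}
    for label, value in test.items():
        other = comparator.get(label)
        if value is None or other is None:
            continue  # skip entries that could not be resolved in one sample
        changes[label] = value - other
    return changes


def low_occupancy_sites(occupancy: Abundance, threshold: float = 50.0) -> List[str]:
    """Return the sites whose occupancy (in percent) is below the threshold."""
    return sorted(
        site for site, value in occupancy.items()
        if value is not None and value < threshold
    )


if __name__ == "__main__":
    # Invented values for illustration only.
    ge = {"Man9": 40.0, "Unoccupied": 10.0, "N406": None}
    wt = {"Man9": 25.0, "Unoccupied": 30.0, "N406": 5.0}
    print(percentage_point_change(ge, wt))
    print(low_occupancy_sites({"N289": 40.0, "N160": 80.0, "N406": None}))
```

Because unresolved sites are dropped rather than treated as zero, the result reflects the trend across resolved sites only, in line with the caveat stated in the text.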
However, when the GE gp140 is compared with the mammalian cell-produced protein (Figure 5b), a slight increase is observed in glycan compositions that likely correspond to glucosylated Man9 structures (HexNAc(2)Hex(10+)). Although this may reflect slightly less efficient folding in plants, this is equally likely to arise from differences in purification methods for the two systems. As observed with the site-specific analysis, the GE protein contains almost exclusively oligomannose-type glycans with considerably less processing than in mammalian cells, where a considerable proportion of complex glycans is observed.
The data from all three protein variants are summarized in Figure 5 and Supporting Information: Tables S1-S3, which quantitatively reflect differences in their glycosylation. Although oligomannose glycans predominate in mammalian cells (57%), this (Margolin, Crispin, et al., 2020). Finally, it is important to note that although the proteins were captured using G. nivalis lectin, which specifically binds high-mannose carbohydrates, it has been shown that this does not unduly bias the glycosylation analysis of HIV Env proteins purified with this resin.
FIGURE 3 Comparison of the glycan occupancy of recombinant HIV envelope (Env) gp140 when produced in N. benthamiana using integrated host and transient glyco-engineering ("glycan-enhanced") or no glyco-engineering expression approaches (wild type [WT]). (a) The percentage point changes in glycan occupancy are labeled onto the model generated in Figure 2. A positive change represents a glycan site that is more occupied on the "glycan-enhanced" material compared with the WT. (b) Site-specific glycan occupancy of recombinant HIV Env gp140 produced in WT and "glycan-enhanced" N. benthamiana. The change in glycan occupancy is colored, with red representing a decrease and blue an increase.
FIGURE 4 Comparison of the glycan occupancy of recombinant HIV envelope (Env) gp140 when produced in N. benthamiana using integrated host and transient glyco-engineering ("glycan-enhanced") or mammalian cells (HEK293). (a) The percentage point changes in glycan occupancy are labeled onto the model generated in Figure 2. A positive change represents a glycan site that is more occupied on the "glycan-enhanced" material compared with the HEK293-derived material. (b) Site-specific glycan occupancy of recombinant HIV Env gp140 produced in WT and "glycan-enhanced" N. benthamiana. The change in glycan occupancy is colored, with red representing a decrease and blue an increase.

| Characterization of GE gp140
Based on the aberrant folding, poor glycosylation and difficulties in producing a homogenous batch of the WT antigen, this was not pursued further. All data generated thus far demonstrated a considerable improvement for the GE protein. Consequently, all subsequent work in this study focused on comparing this improved antigen to the mammalian cell-produced protein.
Although gel filtration is commonly used to purify oligomeric proteins, it cannot discriminate between well-ordered and aberrantly folded proteins (Ringe et al., 2013). The hallmark of a well-folded trimer is regarded as a compact three-lobed structure as viewed by negative-stain electron microscopy (EM) and preferential reactivity with broadly neutralizing antibodies compared with nonneutralizing antibodies. The purified GE and HEK293 cell-produced antigens were therefore visualized by negative-stain EM to investigate whether the antigens formed native-like trimers and to compare the structures of the two proteins (Figure 6). Considerable
In both a and b, the glycosylation of the "glycan-enhanced" protein is presented relative to the comparator protein and, therefore, positive and negative values indicate an increase or decrease in the abundance of a particular glycan species in the "glycan-enhanced" antigen. Schematics above each graph display the HxB2 aligned CAP256 NFL N-linked glycan sites, with sites labeled in black representing sites that were resolved by liquid chromatography-mass spectrometry (LC-MS) and gray the ones that were not.

| Immunogenicity comparison of plant-produced GE and mammalian cell-produced gp140
The immunogenicity of the recombinant proteins was tested in rabbits, which are an accepted small animal model for investigating the induction of antibodies against subunit HIV Env vaccines. This experiment was conceived to determine whether the host engineering approaches implemented in this study yielded an equivalent antigen to the mammalian cell-produced vaccine. Accordingly, rabbits were immunized in two different vaccination regimens as reflected in Figure 7a. Serum samples from both groups were also assessed for neutralization against pseudotyped HIV Subtype C viruses, using the standardized assay that has been described previously. Viruses were selected that represented a range of neutralization sensitivities: Tier 1 viruses are lab-passaged isolates that are unusually sensitive to neutralization, whereas Tier 2 viruses are more neutralization resistant and represent circulating isolates (Seaman et al., 2010). Although the induction of antibodies against Tier 2 viruses is highly desirable, this has only become achievable in  plant-derived and mammalian cell-produced proteins, respectively.
Both proteins also elicited comparable titers against 6644 with median titers of 100 and 118 observed for the plant and HEK293 proteins, respectively. Low levels of background were observed against pseudotyped VSV-G, a negative control, for two of the rabbits immunized with the plant-produced protein, although these were not observed for any of the HIV-Env pseudotyped viruses.
In the priming experiment, neutralizing antibodies were quantif-

| Recombinant HIV Env gp140 production in plants
Transient protein expression in plants was conducted by agroinfiltration of N. benthamiana WT and N. benthamiana ΔXF plants (Strasser et al., 2008). Recombinant Agrobacterium tumefaciens AGL1 strains encoding CAP256 SU gp140 and human CRT have already been reported (Margolin, Oh, et al., 2020). Previously described expression constructs for LmSTT3D (Castilho et al., 2018) and HEXO3RNAi (Shin et al., 2017) were transformed into A. tumefaciens GV3101:pMP90 for this study. The transient coexpression of multiple proteins in plants was achieved by agroinfiltrating a mixed suspension of A. tumefaciens strains, as previously described (Margolin, Oh, et al., 2020).
TABLE 3 Neutralizing antibody titers generated by the DDMMPP regimen investigating plant-produced (Group 3) and HEK293 cell-derived (Group 4) HIV Env gp140

| Gel filtration comparison of plant and mammalian cell-produced HIV Env gp140
Gel filtration profiles for plant-produced and mammalian cell-derived HIV Env gp140 were normalized as previously described and overlaid for comparative purposes.
before separation with the analytical column. The LC conditions were as follows: a 280 min linear gradient consisting of 4%-32% acetonitrile in 0.1% formic acid over 260 min, followed by 20 min of alternating 76% acetonitrile in 0.1% formic acid and 4% acetonitrile in 0.1% formic acid, used to ensure all the sample had eluted from the column. The flow rate was set to 300 nl/min. The spray voltage was set to 2.5 kV and the temperature of the heated capillary was set to 40°C. The ion transfer tube temperature was set to 275°C. The scan range was 375-1500 m/z. Stepped HCD collision energy was set to 15%, 25%, and 45%, and the MS2 for each energy was combined. Precursor and fragment detection were performed using an Orbitrap at a resolution of 120,000 for MS1 and 30,000 for MS2. The AGC target for MS1 was set to standard and injection time set to auto, which involves the system setting the two parameters to maximize sensitivity while maintaining cycle time. Full LC and mass spectrometry (MS) methodology can be extracted from the appropriate Raw file using XCalibur FreeStyle software or upon request.

| Site-specific
Glycopeptide fragmentation data were extracted from the raw file using Byos (Version 3.5; Protein Metrics Inc.). The glycopeptide fragmentation data were evaluated manually for each glycopeptide; the peptide was scored as true-positive when the correct b and y fragment ions were observed along with oxonium ions corresponding to the glycan identified. The MS data were searched using the Protein Metrics 309 N-glycan library with sulfated glycans added manually, combined with the Protein Metrics 57 Plant N-linked glycan library.
The relative amounts of each glycan at each site, as well as the unoccupied proportion, were determined by comparing the extracted ion chromatographic areas for different glycopeptides with an identical peptide sequence. All charge states for a single glycopeptide were summed. The precursor mass tolerance was set at 4 p.p.m. and the fragment mass tolerance at 10 p.p.m. A 1% false discovery rate was applied. Glycans were categorized according to the composition detected.
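The relative quantification described above amounts to summing the extracted ion chromatogram (XIC) areas over charge states for each glycopeptide and then normalizing within each peptide sequence. The following Python sketch shows that calculation, plus a helper for expressing a mass tolerance in parts per million. It is a hedged illustration and not the Byos workflow; the record layout, the function names and the numbers in the usage example are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (peptide sequence, glycan label, charge, XIC area).
# An unoccupied peptide can be represented with the label "Unoccupied".
Record = Tuple[str, str, int, float]


def relative_abundance(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Return the percentage of each glycan label per peptide sequence.

    Areas are first summed over all charge states of a glycopeptide, then
    divided by the summed area of every glycoform sharing that peptide.
    """
    summed: Dict[Tuple[str, str], float] = defaultdict(float)
    for peptide, glycan, _charge, area in records:
        summed[(peptide, glycan)] += area

    totals: Dict[str, float] = defaultdict(float)
    for (peptide, _glycan), area in summed.items():
        totals[peptide] += area

    return {
        key: 100.0 * area / totals[key[0]]
        for key, area in summed.items()
        if totals[key[0]] > 0
    }


def within_ppm(observed_mz: float, theoretical_mz: float, tolerance_ppm: float) -> bool:
    """Return True if the observed m/z lies within the given ppm tolerance."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tolerance_ppm


if __name__ == "__main__":
    # Invented areas for illustration only.
    example = [
        ("PEPTIDE", "Man9", 2, 30.0),
        ("PEPTIDE", "Man9", 3, 30.0),
        ("PEPTIDE", "Unoccupied", 2, 40.0),
    ]
    print(relative_abundance(example))
    print(within_ppm(1000.002, 1000.0, tolerance_ppm=4.0))
```

Normalizing within a peptide sequence assumes that glycoforms of the same peptide ionize with comparable efficiency, which is the usual working assumption behind this type of relative quantification.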
HexNAc(3)Hex(5-6)X was classified as Hybrid, with HexNAc(3)Hex(5-6)Fuc(1)X classified as Fhybrid. Complex-type N-glycans were classified according to the number of HexNAc subunits and the presence or absence of fucosylation. As this fragmentation method does not provide linkage information, compositional isomers are grouped; for example, a triantennary glycan contains HexNAc 5 but so does a biantennary glycan with a bisect. Core glycans refer to truncated structures smaller than M3. Any compositions containing a monosaccharide corresponding to a pentose (e.g., Xylose) are classified in the pentose category. Likewise, any glycan composition detected containing at least one fucose or sialic acid was assigned as "Fucose" and "NeuAc," respectively.
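The categorization rules above can be expressed as a small rule-based classifier over monosaccharide counts. The Python sketch below is one possible reading of those rules and is not the classification code used in this study. Several choices are assumptions made for illustration: the `Composition` type and `classify` function are hypothetical, the pentose rule is given precedence over the others, "smaller than M3" is read as fewer than three hexoses on at most two HexNAc, oligomannose is taken as HexNAc(2) with five or more hexoses and no fucose, the complex label format is invented, and compositions not covered by the quoted rules are returned as "Unclassified".

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Composition:
    """Monosaccharide counts for one detected glycan composition."""
    hexnac: int
    hexose: int
    fucose: int = 0
    neuac: int = 0
    pentose: int = 0


def classify(c: Composition) -> Tuple[str, Set[str]]:
    """Return (category, extra tags) for a glycan composition."""
    tags: Set[str] = set()
    if c.fucose > 0:
        tags.add("Fucose")  # any composition with at least one fucose
    if c.neuac > 0:
        tags.add("NeuAc")   # any composition with at least one sialic acid

    if c.pentose > 0:  # assumption: pentose takes precedence over other rules
        return "Pentose", tags
    if c.hexnac == 3 and 5 <= c.hexose <= 6 and c.fucose <= 1:
        return ("Fhybrid" if c.fucose == 1 else "Hybrid"), tags
    if c.hexnac >= 3:
        # Complex types are grouped by HexNAc count and fucosylation only,
        # because compositional isomers cannot be distinguished.
        label = f"Complex HexNAc({c.hexnac})" + ("(F)" if c.fucose else "")
        return label, tags
    if c.hexose < 3:  # assumption: "smaller than M3" (hexnac is <= 2 here)
        return "Core", tags
    if c.hexnac == 2 and c.hexose >= 5 and c.fucose == 0:  # assumption
        return "Oligomannose", tags
    return "Unclassified", tags


if __name__ == "__main__":
    print(classify(Composition(hexnac=3, hexose=6, fucose=1)))
    print(classify(Composition(hexnac=2, hexose=9)))
```

Keeping "Fucose" and "NeuAc" as tags rather than categories mirrors the text, where these assignments apply to any composition in addition to its structural class.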

| Generating a three-dimensional (3D) representation of the glycan distribution of recombinant gp140
To generate a 3D representation of the glycan shield of CAP256 Env, SWISS-MODEL was used. Template search with BLAST and HHblits was performed against the SWISS-MODEL template library (SMTL, last update: September 15, 2021, last included PDB release: September 10, 2021). The target sequence was searched with BLAST against the primary amino acid sequence contained in the SMTL. A total of 968 templates were found. An initial HHblits profile was built using the procedure outlined in Steinegger et al. (2019), followed by one iteration of HHblits against Uniclust30 (Mirdita et al., 2017). The obtained profile was then searched against all profiles of the SMTL. A total of 1120 templates were found. Models were built based on the target-template alignment using ProMod3 (Studer et al., 2021). Coordinates that were conserved between the target and the template were copied from the template to the model.
Insertions and deletions were remodeled using a fragment library. Side chains were then rebuilt. Finally, the geometry of the resulting model was regularized using a force field. Any modeled ligands were removed, and a representative glycan for each detected glycan composition was added using Coot proximal to the asparagine it is attached to, corresponding to the data generated for CAP256 Env produced in HEK293 cells.

| Resolution of gp140 by PAGE
Purified HIV Env gp140 was resolved under native and denaturing conditions using BN-PAGE and SDS-PAGE systems, respectively. SDS-PAGE gels were subjected to western blot analysis using polyclonal goat anti-gp120, as previously reported (van Diepen et al., 2018).

| Negative-stain EM
Samples were negatively stained using standard protocols (Booth et al., 2011) and imaged at ×10,000 nominal magnification using a Tecnai T20 TEM (Thermo Fisher). Particles were selected by template-free auto-picking using Laplacian-of-Gaussian filtering, extracted into 208 × 208 boxes and subjected to two-dimensional (2D) classification in Relion 3.1 (Scheres, 2012). Nonprotein debris and empty areas of carbon were eliminated during early rounds of classification. The number of images making up the set of 2D classes with approximately the correct size and symmetry was divided by the total number remaining to estimate the proportion of correctly assembled trimeric HIV Env gp140 particles.
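The estimate of correctly assembled trimers described above amounts to a simple ratio, sketched here in Python. The function name, the class identifiers, and the particle counts are hypothetical; deciding which 2D classes have the correct size and symmetry remains a manual judgment outside the code.

```python
def trimer_fraction(class_counts, trimer_like_classes):
    """Estimate the proportion of trimer-like particles after 2D classification.

    class_counts: particles per 2D class remaining after debris and empty
        carbon have been discarded.
    trimer_like_classes: identifiers of classes judged to have approximately
        the correct size and symmetry.
    """
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("no particles remain after classification")
    selected = sum(class_counts[c] for c in trimer_like_classes)
    return selected / total


# Hypothetical counts for three retained 2D classes.
counts = {"class_01": 300, "class_02": 200, "class_03": 500}
fraction = trimer_fraction(counts, {"class_01", "class_02"})
```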

| Immunogenicity assessment
Sera from immunized rabbits were assessed for binding antibodies by ELISA, as previously described, and neutralizing antibodies were quantified using a standardized pseudovirus neutralization assay that was outlined in previous accounts (van Diepen et al., 2019).

| Statistical analysis
Statistical analyses were conducted using GraphPad Prism 5.0. Comparisons between groups over time were analyzed using a two-way ANOVA test, with p < 0.05 considered significant.

| DISCUSSION
The development of approaches to engineer glycosylation and folding pathways in plants potentially enables the production of glycoproteins which could not otherwise be viably produced in the system (Margolin, Crispin, et al., 2020). Nonetheless, recapitulating native protein folding and glycosylation still poses a formidable challenge for vaccine development, and viral glycoproteins in particular often accumulate poorly in plants. Using soluble HIV Env gp140 as a model for a heavily glycosylated viral glycoprotein, we have systematically investigated bottlenecks that impair the folding and glycosylation of viral glycoproteins in N. benthamiana (Margolin, Oh, et al., 2020). In this current study, we have integrated a series of host and engineering approaches to address these limitations and thereby improve the production of the antigen in plants.
Integrating these approaches yielded a dramatic improvement in the production of the recombinant immunogen, which exhibited reduced aggregation and facilitated improved recovery of oligomeric protein. A similar observation was recently described for plant-produced IgA, where the authors reported that coexpression of LmSTT3D improved the formation of dimers in the system by increasing occupancy of PNGs, although the impact was much less dramatic (Goritzer et al., 2020). The GE protein was also better resolved by SDS-PAGE compared with the non-engineered progenitor (WT). The basis for the diffuse products that were consistently observed for the WT protein in this and previous work    (Castilho et al., 2018; Goritzer et al., 2020). In stark contrast, the surface glycoprotein of HIV is amongst the most heavily glycosylated proteins described to date and contains ~30 glycans per protomer, which comprise approximately half the mass of the trimer (Stewart-Jones et al., 2016). This study adds to a growing body of evidence that certain glycoproteins may be less efficiently glycosylated in plants (Castilho et al., 2018; Goritzer et al., 2020; Margolin et al., 2021), although we note that further work is required to determine the extent to which this phenomenon occurs. It is plausible that engineering the protein sequence directly could further improve glycan occupancy, and this has recently been reported for mammalian cell-produced HIV Env trimers (Derking et al., 2021).
The GE protein was decorated almost exclusively with oligomannose-type glycans, with negligible undesired plant-specific modifications. The successful production of a recombinant glycoprotein lacking foreign (plant-specific) glyco-epitopes is a notable achievement given concerns regarding their potential immunogenicity in humans (Margolin, Crispin, et al., 2020). However, the resulting protein also displayed elevated levels of unprocessed oligomannose-type glycans compared with the mammalian cell-produced protein.
These are presumed to arise from inefficient processing by host mannosidases in plants, where the extensive glycosylation present on the recombinant protein may have exceeded the capacity of the system to efficiently mediate trimming. This would reduce the formation of complex glycans, providing an explanation as to why the WT protein did not contain plant-specific complex glycans. It is possible that the use of a weaker promoter driving expression of the HIV Env glycoprotein could improve the efficiency of mannose trimming, although this would be counter-productive where high levels of the target protein are desired for commercial production.
This observation may also partly reflect differences in the purification methods for the different systems. In mammalian cells, the protein is secreted into the extracellular media and therefore all of the protein that was purified would have completed its translocation through the secretory pathway. In contrast, the plant-produced gp140 was purified by homogenizing leaves, thereby releasing protein from all stages of the secretory pathway. Accordingly, this would enrich for proteins in the early stages of the secretory pathway where glycan maturation would not yet have occurred. Given that the majority of the N-glycans were oligomannose and not processed to complex ones, it is difficult to attribute much impact to the HEXO3RNAi construct used in this study for HIV Env gp140.
However, the goal of this study is ultimately to develop a broadly applicable vaccine production platform and this is expected to have a greater impact for other targets with more processed N-glycans.
Although no study has reported the site-specific glycosylation of viral CAP256 Env, it is possible to compare the results obtained in this study with general observations from Env proteins from other strains. Analysis of the BG505 strain revealed two conserved regions of oligomannose-type glycans, one focused around the N332 supersite, termed the intrinsic mannose patch, and another located at the trimer apex, focused around the N160 site (Pritchard et al., 2015; Struwe et al., 2018).
These two regions are key bnAb epitopes and the integrity of glycan processing in these areas is key for a successful immunogen. This is true for all immunogens analyzed in this study, with oligomannose-type glycans present across the outer domain and the apex. Complex-type glycans are located towards the base of previously analyzed BG505 Env trimers and the mammalian-derived CAP256 trimer is similarly processed; however, this is not the case for the plant-derived immunogens. Additionally, underoccupancy on gp41 is a common feature of soluble Env immunogens produced in mammalian cells (Derking et al., 2021), and the underoccupancy of these sites in plant-derived Env is an issue that requires solving for all immunogens.
Visualization of the GE and mammalian cell-produced proteins both failed to yield convincing evidence of native-like trimer formation. The antigen expressed in both systems was designed (van Diepen et al., 2018) using a first generation "native flexible linker" strategy (Sharma et al., 2015), which was subsequently reported to require additional stabilizing mutations to promote efficient trimer formation for some envelopes (Georgiev et al., 2015; Ringe et al., 2015). It is worth noting that the efficiency with which native trimers form is heavily influenced by the genetic background of the virus, and Subtype C isolates are often particularly difficult to produce in their native conformation without extensive engineering of the protein (Guenaga et al., 2017; Julien et al., 2015; Ringe et al., 2017). In many cases, homogeneous preparations of trimer antigen may also require purification with monoclonal antibodies which specifically capture well-folded trimers or eliminate misfolded protein species. This complicates vaccine development in Africa, where the infrastructure is poorly developed and preparation of GMP-grade antibody to support process development would be a further challenge (Margolin, Burgers, et al., 2020). In this study, we deliberately opted not to use mAb-based purification schemes, but instead opted for sequential affinity chromatography and gel filtration steps that could be widely implemented, as would be required for an HIV vaccine.
The promising improvements in glycosylation of the GE gp140 prompted its immunogenicity testing in rabbits, where it was compared with the equivalent mammalian cell-produced antigen.
Binding antibody ELISAs suggested that the plant-produced antigen yielded slightly lower levels of binding antibodies than the mammalian protein. However, both antigens induced highly similar neutralizing antibodies against both Tier 1 and Tier 2 viruses. Although protein-only immunizations did not induce any discernible autologous neutralizing antibodies, priming with DNA and MVA resulted in neutralization of the matched Tier 2 virus following protein boosting.
Both plant and mammalian cell-derived proteins induced similar numbers of responders and comparable neutralizing antibody titers, suggesting that the immunogens were similar. However, it is unclear if the antibodies induced by the plant-produced and mammalian cell-derived proteins targeted different epitopes. This would not be unexpected given the differences in glycan occupancy that were observed between the proteins and a large body of evidence demonstrating that antibodies frequently target holes in the glycan shield of the HIV Env glycoprotein (McCoy et al., 2016; Wibmer et al., 2016; Yang et al., 2020; Zhou et al., 2017). Therefore, it is plausible that unoccupied sites in the plant-produced antigen could induce antibodies that would not be induced by the mammalian cell-produced protein where these epitopes were shielded by glycans. If, however, this was the case, this did not appear to negatively impact the induction of neutralizing antibodies.
This study represents a significant advance in plant-based viral glycoprotein production through engineering of the secretory pathway. These approaches now enable remodeling of the plant factory to better support the maturation requirements of complex glycoproteins, facilitating the production of new vaccines. Further work is ongoing to apply this integrated approach to other viral targets, particularly those which pose a threat of causing disease outbreaks.
In conclusion, we present a novel paradigm for HIV-1 Env and other complex glycoprotein production in plants to facilitate rapid, lower cost reagent or vaccine production for developing countries.

DATA AVAILABILITY STATEMENT
Supporting data for the manuscript has been provided in the supplement.