Conformational switches and redox properties of the colon cancer‐associated human lectin ZG16

Zymogen granule membrane protein 16 (ZG16) is produced in organs that secrete large quantities of enzymes and other proteins into the digestive tract. ZG16 binds microbial pathogens, and lower ZG16 expression levels correlate with colorectal cancer, but the physiological function of the protein is poorly understood. One prominent attribute of ZG16 is its ability to bind glycans, but other aspects of the protein may also contribute to activity. An intriguing feature of ZG16 is a CXXC motif at the carboxy terminus. Here, we describe crystal structures and biochemical studies showing that the CXXC motif is on a flexible tail, where it contributes little to structure or stability but is available to engage in redox reactions. Specifically, we demonstrate that the ZG16 cysteine thiols can be oxidized to a disulfide by quiescin sulfhydryl oxidase 1, which is a sulfhydryl oxidase present together with ZG16 in the Golgi apparatus and in mucus, as well as by protein disulfide isomerase. ZG16 crystal structures also draw attention to a nonproline cis peptide bond that can isomerize within the protein and to the mobility of glycine‐rich loops in the glycan‐binding site. An understanding of the properties of the ZG16 CXXC motif and the discovery of internal conformational switches extend existing knowledge relating to the glycan‐binding activity of the protein.


Introduction
Zymogen granule membrane protein 16 (ZG16) is a mammalian lectin-like protein produced at high levels in the colon and also found in the pancreas, liver, and other tissues [1-3]. ZG16 downregulation is associated with ulcerative colitis and colon cancer [4-9]. Intracellularly, ZG16 is localized to the Golgi apparatus and zymogen granule membranes [10,11]. Extracellularly, it is found in the mucus-associated proteome [12]. The function of ZG16 is not yet known, but it may aid in the packaging of glycoproteins into secretory granules [13] and has been shown to decrease the penetration of bacteria into the mucus that coats the colon epithelium [14]. The ability of ZG16 to bind diverse glycans [15-17] is consistent with a putative role in innate immunity or in handling secretory cargo or both.
Previous structural studies focused on the carbohydrate binding capabilities of ZG16 [15,16], but the redox activity of the protein has not yet been addressed. ZG16 has a CXXC (Cys-Xxx-Xxx-Cys) amino acid sequence (Fig. 1A), which is CSRC in human ZG16 (UniProt O60844). A CXXC motif is a feature shared by thiol-disulfide oxidoreductases of the thioredoxin (Trx) superfamily [18] and sulfhydryl oxidases such as ERV/ALR enzymes [19] and Ero1 [20], all of which display this motif at the amino terminus of a helix. In ZG16, by contrast, the CXXC motif comprises the final four amino acids at the carboxy terminus of the protein, which precludes a similar structural context. No information was available about the ZG16 CXXC from previous crystal structures since the final eight carboxy-terminal residues were omitted from the crystallized variant [21].
Here we report the crystallization and analysis of ZG16 containing the CXXC motif. In efforts to resolve the structure of the CXXC region, we analyzed multiple high-resolution crystal structures, which fortuitously provided information on conformational flexibility within the folded protein domain. In addition to the structural analysis of ZG16 including the CXXC, we also addressed ZG16 redox properties with respect to catalysts of disulfide bond formation that function in the endoplasmic reticulum (ER), Golgi apparatus, and perhaps within the intestinal mucus hydrogel.

Preparation and crystallization of ZG16
Human ZG16, which has no canonical N-linked glycosylation sites, was produced in Escherichia coli fused carboxy-terminally to maltose binding protein (MBP), similarly to a previous expression method for this protein [21]. In our work, the ZG16 sequence spanned residues 21-167, rather than just the core lectin domain (residues 21-159) (Fig. 1A). A His6 tag and a tobacco etch virus (TEV) protease cleavage site were introduced between the MBP and ZG16. However, we found that ZG16 precipitated upon TEV cleavage. Precipitation could be minimized by increasing the ionic strength during cleavage and was not caused by intermolecular disulfide bonding.
While high salt aided in ZG16 purification, it did not sufficiently increase protein solubility to permit preparation of a concentrated stock solution for X-ray crystallography. Reasoning that ZG16 has many exposed aromatic groups, we added arginine to the solution to compete with intermolecular cation-π interactions. Indeed, ZG16 was much more soluble in the presence of arginine and could be concentrated to at least 12 mg·mL⁻¹ (0.75 mM). Crystals were obtained in multiple space groups, all yielding high-resolution diffraction data (Table 1). Phasing was done by molecular replacement using the known ZG16 core lectin domain structure (PDB: 3APA). Hereafter, 'crystal form 1' or 'structure 1' will refer to space group P321, at 1.1 Å resolution, and 'crystal form 2' or 'structure 2' to space group P2₁2₁2₁, at 1.5 Å resolution. Both of these crystals had one ZG16 molecule per asymmetric unit. A second P2₁2₁2₁ crystal was also obtained, diffracting to 1.2 Å resolution and containing two ZG16 molecules per asymmetric unit. This form will be called 'crystal form 3' or 'structure 3'.
Fig. 1. (A) Amino acid sequence of human ZG16. The bent arrow above the sequence indicates the predicted start of the mature protein after signal peptide cleavage. Secondary structure assignments are shown below the sequence and colored to correspond to panel B. The segment from residue 71-76 is presented with a dashed line because the side chain of Ser73, rather than the backbone, hydrogen bonds to the adjacent β-hairpin. The residues flanking the cis peptide bond present in some of the ZG16 structures are highlighted in cyan and the cysteines in yellow. (B) Ribbon diagram of ZG16 in two orientations, made using structure 1 and colored from red (N terminus) to blue (C terminus). Numbers indicate the β-hairpins. A dashed line representing the CXXC motif was added for the purpose of illustration. Loops forming the glycan-binding site are labeled according to [15]. (C) Comparison of two ZG16 structures illustrates the flexibility of the region containing Tyr160 and Pro161. In structure 2, these residues assume a different conformation to accommodate a crystal packing interaction with Trp103 (gray). (D) Conservation of amino acids in the carboxy-terminal tail of ZG16. (E) A Fo-Fc map for chain B of structure 3 displayed at 3σ extending to 6 Å from the protein. The Arg166 side chain was not used to calculate the map but was subsequently placed into the appropriate density. Structure figures were made using PYMOL.

Structures of ZG16 reveal the flexibility of the CXXC region
As described previously, ZG16 has a β-prism fold with a core composed of three β-hairpins [21] (Fig. 1B). Following each of the first two β-hairpins, the polypeptide chain completes a Greek key motif by forming two outer strands hydrogen bonded to either side of the β-hairpin. Lacking downstream strands, the third β-hairpin is flanked by ~25 residues from the amino terminus of the protein. A comparison of the ZG16 structures from the different crystal forms revealed that the protein is flexible at the carboxy terminus, following Val159. Specifically, the peptide bond preceding Pro161 in structure 2 is in the cis isomer (it is trans in the other structures), and the side chain of Tyr160 is repositioned, apparently to facilitate crystal contacts (Fig. 1C). Notably, despite its flexibility, Tyr160 shows a similar high level of conservation in ZG16 orthologs as amino acids involved in core folding (Trp157) or likely to have a particular functional role in the protein (Cys167) (Fig. 1D). In none of the crystal forms does the hydroxyl group of Tyr160 make a direct hydrogen bond to another part of the protein, raising the possibility that this tyrosine and other conserved residues in the ZG16 tail are under strong evolutionary selection for intermolecular interactions or modifications.
Due to the flexibility of the ZG16 carboxy terminus, the CXXC motif could not be atomically modeled in any of the crystal forms. However, its location could be detected in one of the molecules in the asymmetric unit of crystal form 3. For this molecule, electron density likely corresponding to the guanidinium group of Arg166 was seen adjacent to Asp46, anchoring the rest of the tail (Fig. 1E). Nevertheless, the intervening electron density was not of sufficient quality to unambiguously determine the position of the disulfide. Crystallization of single-cysteine mutants or wild-type ZG16 in the presence of reducing agents did not result in improved electron density (data not shown), and it is evident that the carboxy-terminal segment is highly flexible relative to the ZG16 core lectin domain.

ZG16 has a dynamic nonproline cis peptide bond
In addition to the orientation of the carboxy-terminal tail, another difference between the ZG16 structures from different crystal forms is the presence or absence of a nonproline (non-Pro) cis peptide bond. A cis peptide was observed between Gly28 and Glu29 in structures 1 and 3 (Fig. 2A) and was also present in previous structures of the ZG16 lectin domain (e.g., PDB: 3APA and 3VY7). A non-Pro cis peptide has been noted in a similar position in other β-prism lectins [22]. The new observation presented here is that the peptide assumes the trans configuration in structure 2 (Fig. 2A). These isomers are unambiguous in the high-quality electron density maps (Fig. 2B). Due to the isomerization, the Cα atoms of Gly28 deviate by 2.5 Å in a superposition of structures 1 and 2, compared to a Cα rmsd of 0.59 Å for residues 22-159. The side chain of Ser27 hydrogen bonds to the backbone N-H of Leu155 in the final β-hairpin of structure 2, whereas the Ser27 backbone carbonyl performs this task in structure 1, as in conventional hydrogen bonding between β-strands. In the structures with the cis peptide, the two aromatic rings of Tyr26 and Tyr30 sandwich the Ser27-Gly28 peptide plane such that the two rings and the peptide are all roughly parallel, while the Tyr26 ring is almost perpendicular to His156 on its opposite side (Fig. 2C). In the trans structure, the Ser27-Gly28 peptide is perpendicular to Tyr26, and the Tyr26 side chain changes conformation such that it lies parallel with His156 (Fig. 2D). The conformational changes upon cis-trans isomerization thus propagate to nearby side chains, but the perturbation is resolved within about 6 Å to either side of the isomerizing bond.
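For readers who wish to check peptide bond isomers in deposited coordinates, the cis or trans assignment follows from the omega torsion angle defined by the atoms Cα(i), C(i), N(i+1), and Cα(i+1): values near 0° indicate cis and values near 180° indicate trans. The following is a minimal sketch of that calculation, not the procedure used in this work. The function names, the 30° tolerance, and the toy coordinates are illustrative assumptions and are not taken from the ZG16 structures.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def omega_degrees(ca_i, c_i, n_next, ca_next):
    """Torsion angle (degrees) about the C(i)-N(i+1) peptide bond."""
    b1 = _sub(c_i, ca_i)
    b2 = _sub(n_next, c_i)
    b3 = _sub(ca_next, n_next)
    n1 = _cross(b1, b2)
    n2 = _cross(b2, b3)
    b2_len = math.sqrt(_dot(b2, b2))
    # atan2 form of the dihedral; only the magnitude is used below.
    return math.degrees(math.atan2(b2_len * _dot(b1, n2), _dot(n1, n2)))


def classify_peptide(omega, tolerance=30.0):
    """Label a peptide bond; the 30 degree tolerance is an assumed cutoff."""
    magnitude = abs(omega)
    if magnitude <= tolerance:
        return "cis"
    if magnitude >= 180.0 - tolerance:
        return "trans"
    return "distorted"


# Hypothetical planar toy coordinates (in Å), NOT taken from any ZG16 structure.
CIS_TOY = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
TRANS_TOY = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))

for label, atoms in (("cis toy", CIS_TOY), ("trans toy", TRANS_TOY)):
    print(label, classify_peptide(omega_degrees(*atoms)))
```

Applied to the Gly28-Glu29 bond, such a check would require the four backbone atom positions from the relevant coordinate file; the toy inputs above only demonstrate the geometry.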

Alternate backbone conformations in a glycan-binding loop
In addition to the cis and trans peptide isomers observed in different crystal forms, multiple, alternative backbone conformations superposed in crystallographic electron density were another indication of ZG16 conformational switches. In chain A of structure 3, alternate conformations were detected in the region from Ser32 to Gly34 (Fig. 3), corresponding to the glycine-rich 'GG loop' (Fig. 1B), one of the main elements of the glycan-recognition site [15,16]. The high-resolution diffraction data facilitated interpretation of this switch, in which movement of the GG loop main chain is coupled to rotamer changes in the side chain of Arg145 (Fig. 3). The effect of the conformational changes in this loop is to modulate the contours of the glycan-binding pocket, re-configure the constellation of hydrogen bond donors and acceptors, and alter the accessibility of Asp151 (Fig. 3), a conserved amino acid important for mannose recognition, adhesion to pathogenic fungi, and anti-cell proliferative activities observed for ZG16 [3,23].
Fig. 3. Alternate backbone configurations near the glycan-binding site of ZG16. Alternate conformations within the same asymmetric unit of crystal form 3 are shown side-by-side. A glycerol molecule is bound within the glycan-binding site, as seen in a previous ZG16 structure [21]. Dashed lines indicate hydrogen bonds. The side chain of Ser148 was observed in multiple rotamers (mult.rot). Bottom panels are rotated 30° relative to the top panels and show the molecular surfaces of the two conformations. The side-chain oxygen atoms of Asp151 are colored red.

The ZG16 CXXC motif is oxidized in vitro by PDI and QSOX1
The enzyme quiescin sulfhydryl oxidase 1 (QSOX1) is a catalyst of disulfide bond formation localized, like ZG16, to the Golgi apparatus and extracellular fluids [24,25]. Moreover, QSOX1 is highly expressed in goblet cells of the colon [26] and is present with ZG16 in the mucus-associated proteome [27]. We therefore asked whether the ZG16 CXXC motif is a substrate of QSOX1. Reduced ZG16 was prepared and incubated with catalytic amounts of recombinant human QSOX1 at either neutral or moderately acidic pH, the latter relevant to the Golgi. Time points were taken by removing aliquots and quenching with maleimide-functionalized polyethylene glycol (mal-PEG) of molecular mass 5 kDa. At pH 7.4, QSOX1 efficiently catalyzed formation of a disulfide in ZG16, as measured by loss of ZG16 reactivity with mal-PEG following addition of QSOX1 (Fig. 4A). At pH 5.8, the reaction was slower (Fig. 4A), consistent with the pH profile of human QSOX1 activity measured on the small-molecule model substrate dithiothreitol (DTT) (Fig. 4B) and with that previously reported for Trypanosoma brucei QSOX1 [28]. Due to the limitations of ZG16 solubility in physiologically relevant solutions, the Michaelis constant (K_M) could not be established, but we estimate it to be above 150 µM based on oxygen consumption assays (data not shown).
We also tested whether reduced ZG16 can be oxidized by protein disulfide isomerase (PDI), a major oxidoreductase involved in disulfide bond formation in the ER [29] and secreted in certain physiological settings [30]. As PDI does not perform multiple turnovers without an additional electron sink, we supplied it to reduced ZG16 in a 1 : 2 stoichiometry, considering that PDI has two active-site CXXC motifs. PDI rapidly oxidized ZG16 in this experiment (Fig. 4C). In contrast, E. coli Trx reduced the ZG16 disulfide (Fig. 4D), indicating that the redox potential of the ZG16 CXXC motif is between those of PDI (approximately −165 mV) [31] and Trx (−270 mV) [32].
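The bracketing argument can be made quantitative with the Nernst relation, K = exp(nFΔE/RT), which links the difference in standard redox potential between two dithiol-disulfide couples to the equilibrium constant for exchange between them. The sketch below assumes a two-electron exchange at 298.15 K and takes the PDI and Trx potentials as −165 mV and −270 mV. The ZG16 value used in it is a placeholder chosen only for illustration, since the ZG16 redox potential was not measured here.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def equilibrium_constant(e_acceptor_mv, e_donor_mv, n=2, temp_k=298.15):
    """K for electron transfer from the donor couple to the acceptor couple.

    Potentials are in mV; n = 2 for a dithiol-disulfide exchange.
    The 298.15 K default is an assumption, not a reported assay condition.
    """
    delta_e_volts = (e_acceptor_mv - e_donor_mv) / 1000.0
    return math.exp(n * F * delta_e_volts / (R * temp_k))


E_PDI_MV = -165.0        # value cited in the text for PDI
E_TRX_MV = -270.0        # value cited in the text for E. coli Trx
E_ZG16_GUESS_MV = -220.0  # HYPOTHETICAL placeholder, not a measured value

# Oxidized PDI accepting electrons from reduced ZG16.
k_pdi_oxidizes_zg16 = equilibrium_constant(E_PDI_MV, E_ZG16_GUESS_MV)
# Oxidized ZG16 accepting electrons from reduced Trx.
k_trx_reduces_zg16 = equilibrium_constant(E_ZG16_GUESS_MV, E_TRX_MV)

print(f"K (PDI oxidizes ZG16): {k_pdi_oxidizes_zg16:.1f}")
print(f"K (Trx reduces ZG16):  {k_trx_reduces_zg16:.1f}")
```

Any ZG16 potential lying between the two reference values gives K > 1 for both reactions, which is the qualitative behavior observed in Fig. 4C and 4D.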

Discussion
ZG16 is an animal lectin with many reported disease associations [4-9] but a poorly understood physiological function. One aspect of ZG16 that had not been addressed is the nature of its carboxy-terminal cysteines. This study aimed to determine the structural relationship of the ZG16 CXXC motif to the β-prism fold, as well as to provide an initial analysis of ZG16 redox properties. We first and foremost observed that the protein has little tendency to form disulfide bonded dimers or multimers. Instead, the two cysteines in the human ZG16 tail readily formed an intramolecular disulfide, both when purified from bacterial cell lysates (starting material in Fig. 4C) and when oxidized by QSOX1 (Fig. 4A). We found ZG16 to be an excellent substrate for QSOX1 in vitro (Fig. 4A) and thus potentially useful for sulfhydryl oxidase assays in the future. ZG16 may also be a physiological substrate of QSOX1, since these proteins are found together in goblet cells and in intestinal secretions [12,27]. Nevertheless, it should not be concluded on the basis of our in vitro experiments that QSOX1 necessarily oxidizes ZG16 in vivo. We have shown that PDI can also oxidize ZG16 (Fig. 4C), and it may do so in the ER prior to any encounter between ZG16 and QSOX1. Though ZG16 is oxidized in vitro by QSOX1 and PDI, it is reduced by E. coli Trx (Fig. 4D). Whether ZG16 would encounter Trx from intestinal bacteria is unknown, but human Trx, which has a redox potential of −230 mV [33], has been identified in mucus [34]. The roles of various redox-active proteins in the intestinal mucus layer are an intriguing but still largely unexplored topic. When contemplating a possible redox-related function for ZG16, an evolutionary perspective may add insight. Overall, the ZG16 amino acid sequence is highly conserved in the animal species that contain the protein.
Birds appear to have lost ZG16 altogether, but a ZG16 coding sequence has been identified in some reptiles, amphibians, and in many mammals. Most ZG16 orthologs contain the CXXC motif, but Cetartiodactyla (whales and even-toed ungulates) and some other species lack the first cysteine. The lower conservation of the first cysteine can be seen in Fig. 1D. Notably, Tyr160 and Pro161, which are present in the flexible region following the last β-strand of the ZG16 fold (Fig. 1C), are conserved both in ZG16 variants with a CXXC and in those with a single cysteine. As noted above, Tyr160 does not make hydrogen bonds or fixed hydrophobic contacts within ZG16, so this residue may be conserved for functional intermolecular interactions. Presuming that ZG16 uses its carboxy-terminal tail for such interactions, coevolution of a ZG16 partner may explain how a single cysteine could substitute for a disulfide bond in some species. The single-cysteine ZG16 variants have a histidine or tyrosine in place of the missing cysteine, followed almost always by a glutamic acid. This glutamic acid is not found in orthologs with both cysteines. The distinct evolutionary bifurcation of the ZG16 carboxy-terminal tail, as well as the apparent absence of the protein in birds (which have evolved unique aspects of their digestive systems [35]), may provide hints to the physiological role of ZG16.
One open question relating to the structure and function of ZG16 is the relationship between its redox and glycan-binding activities. A previous study noted that ZG16 did not show antimicrobial activity and that reduction of the disulfide did not activate such activity [14]. Based on our data, the CXXC tail appears to be structurally uncoupled from the glycan-binding region. Many sugar-binding studies of ZG16 were performed in the absence of the tail entirely [15,16]. The lack of a requirement for the CXXC motif in glycan binding is expected based on the presence of this motif on the opposite end of the β-prism (Fig. 1B) and its conformational heterogeneity. Nevertheless, interactions made by the ZG16 tail may indirectly affect glycan binding in vivo by controlling localization of the protein and contributing to avidity.
Other conformational changes we observed in ZG16 may, in contrast, be directly linked to glycan binding and may help explain the diversity of binding targets. Our set of ZG16 crystal structures revealed two main regions of flexibility near the glycan-binding site. One is the first strand of the first β-hairpin, and the second is the GG loop following this first strand. The first strand is the site of cis-trans isomerization of the Gly28-Glu29 peptide bond. The part of ZG16 affected by this isomerization is adjacent to the region that shows NMR chemical shift differences upon binding of phosphatidylinositol mannosides and overlaps the region that participates in heparin binding [15,16]. The GG loop, in turn, interacts with bound mannose derivatives [16]. It was previously shown that sugars can bind ZG16 in different orientations [15], but our work raises the possibility of another mechanism for recognition of diverse glycans: structural reorganization of the glycan-binding pocket.
It has been noted that non-Pro cis peptide bonds are detected most frequently in proteins involved in sugar binding and catalysis and that these isomers are usually functionally important [36]. Nevertheless, experimental observation of both the cis and trans isomers of the same protein is rare. Whether isomerization contributes to the activity of ZG16 remains to be determined. Importantly, though the Gly28-Glu29 region of structure 2 (bearing a trans peptide bond) is near a crystal contact, introducing the cis peptide configuration as observed in the other ZG16 structures shows that it would not have induced a steric clash at this contact. Neither do any particularly noteworthy intermolecular interactions seem to be enabled uniquely by the trans form. Thus, it does not appear that crystal packing in crystal form 2 stabilized a high-energy conformation of ZG16 either to avoid unfavorable interactions or to facilitate favorable ones. For reference, another amino acid participating in the same protein-protein interface, Leu133, clearly occupies a different rotamer in structure 2 than in the other structures to avoid a clash. Leu133 is about 10 Å away from Gly28, and the Leu133 rotamer is not coupled in any obvious way to the isomerizing bond. It would be interesting to determine whether physiological interactions of ZG16 with partner or target proteins stabilize the trans form of the Gly28-Glu29 peptide and for what functional purpose. Overall, the set of structures and redox activity assays described here emphasize the malleability of the ZG16 glycan-binding region and show that ZG16 can engage in diverse dithiol-disulfide exchange reactions.

Protein production and purification
The coding sequence for residues 21-167 of ZG16 was inserted into a plasmid downstream of the coding sequences for MBP, a His6 tag, and a TEV protease cleavage site. The signal peptide for entry of ZG16 into the secretory pathway, predicted to span residues 1 through 16 [37], was not necessary for expression in E. coli, and the starting residue was chosen based on previous work [17]. According to the plasmid design, a non-native glycine remained at the amino terminus of ZG16 after TEV cleavage.
The ZG16 expression plasmid was transformed into the BL21(DE3) E. coli strain. Cultures were grown at 37°C in the presence of 100 mg·L⁻¹ ampicillin to an optical density of 0.5 at 595 nm, at which point isopropyl β-D-1-thiogalactopyranoside was added to a concentration of 0.5 mM. The growth temperature was lowered to 25°C, and the cultures were left to shake overnight. Cells were then pelleted, resuspended in 5 mM sodium phosphate, pH 7.5, 400 mM NaCl, and 5 mM imidazole (cell suspension buffer), and frozen at −80°C.
To purify ZG16, cell suspensions were thawed, sonicated on ice, and spun at 25 000 g for 20 min at 4°C. Supernatant was applied to a Ni-NTA column, washed in the cell suspension buffer, and then eluted with an increasing imidazole gradient. Eluted protein was diluted threefold with PBS supplemented with an additional 800 mM NaCl, placed into a 3 kDa cutoff dialysis bag together with His6-tagged TEV protease, and dialyzed against the high-salt PBS overnight at room temperature. High salt prevented protein precipitation at this step. The cleaved protein was then reapplied to a Ni-NTA column equilibrated in 50 mM sodium phosphate buffer, pH 7.5, and 500 mM NaCl. ZG16 was released from the column during a wash with cell suspension buffer, and the His6-tagged MBP and TEV protease were released during a gradient to higher imidazole concentrations. ZG16 protein concentration was measured by absorbance in 6 M guanidine, 20 mM sodium phosphate buffer, pH 6.8, using an extinction coefficient of 31 500 M⁻¹·cm⁻¹.
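The concentration determination rests on the Beer-Lambert law, c = A / (ε·l). As a minimal sketch of that arithmetic, the snippet below uses the extinction coefficient quoted above; the absorbance reading, the 1 cm path length, and the dilution factor are assumed example values rather than measurements from this work.

```python
EXTINCTION_ZG16 = 31500.0  # M^-1 cm^-1, value given in the text


def molar_concentration(absorbance, extinction=EXTINCTION_ZG16,
                        path_cm=1.0, dilution_factor=1.0):
    """Beer-Lambert concentration (mol/L) of the undiluted sample.

    path_cm and dilution_factor defaults are assumptions for illustration.
    """
    return absorbance * dilution_factor / (extinction * path_cm)


# Hypothetical reading: A = 0.315 measured on a sample diluted 10-fold
# into the guanidine buffer.
example_molar = molar_concentration(0.315, dilution_factor=10.0)
print(f"{example_molar * 1e6:.1f} µM")
```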
Human PDI (NP_000909.2) was cloned into the pcDNA3.1 plasmid to produce the full-length protein with a His6 tag following the KDEL sequence at the carboxy terminus. The PDI expression vector was transfected using the PEI Max reagent (Polysciences Inc., Warrington, PA, USA) into HEK 293F cells (Thermo Fisher, Waltham, MA, USA) grown in FreeStyle 293 expression medium. Six days after transfection, the culture was harvested and subjected to centrifugation for 15 min at 500 g to remove cells. The supernatant was transferred to a fresh bottle and centrifuged for 15 min at 2000 g to remove particulate matter. The remaining supernatant was passed through a 0.45 µm filter, and protein was purified by Ni-NTA chromatography. The catalytic region of human QSOX1 was purified in a similar manner using an expression vector previously described [38]. Escherichia coli Trx was purified from bacteria essentially as described except without detergent [39].

Crystallization and structure solution
For crystallization, ZG16 was concentrated to 12 mg·mL⁻¹. To aid in solubility during this process, 200 mM L-arginine was added to the protein in the centrifugal concentrator, and the buffer was exchanged to 10 mM Tris, pH 8.1, 100 mM NaCl, and 200 mM L-arginine by repeated concentration and dilution.
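The mass concentration used for crystallization corresponds to the molar value of 0.75 mM quoted in the Results. The short sketch below shows the conversion. The molecular mass of 16 000 g·mol⁻¹ is an assumption back-calculated from those two figures, not a value computed from the ZG16 sequence.

```python
ASSUMED_MW_G_PER_MOL = 16000.0  # back-calculated from 12 mg/mL = 0.75 mM


def mg_per_ml_to_millimolar(mg_per_ml, mw_g_per_mol=ASSUMED_MW_G_PER_MOL):
    """Convert mg/mL (equal to g/L) to mM for a given molecular mass."""
    return mg_per_ml / mw_g_per_mol * 1000.0


print(f"{mg_per_ml_to_millimolar(12.0):.2f} mM")
```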
Crystals were grown using the hanging drop method by mixing protein stock solution 1 : 1 with well solution. Crystals were transferred to a solution composed of 80% of the well solution and 20% glycerol before flash freezing. Diffraction data were collected at the European Synchrotron Radiation Facility on beamline ID23-1. Phases were obtained by molecular replacement using PDB: 3APA as the search model. Rebuilding and refinement were conducted iteratively using COOT [40] and Phenix [41]. Structure figures were generated using PYMOL [42]. Atomic coordinate files have been deposited in the Protein Data Bank with accession codes 7O4P (structure 1), 7O3I (structure 2), and 7O88 (structure 3).

Oxidation assays
Reduced ZG16 was prepared by adding DTT at a concentration of 20 mM to a 200 µL aliquot of 100 µM protein. After 20 min at room temperature, DTT was removed from the protein solution using a PD-10 column equilibrated in PBS supplemented with an additional 400 mM NaCl. Elution fractions of 500 µL were collected, and the protein concentration in the peak fraction was measured. Mal-PEG of molecular mass 5 kDa was dissolved in water at a concentration of 50 mM and applied to a PD-10 column equilibrated in water to remove any maleimide not conjugated to polyethylene glycol. Reactions were initiated containing 10 µM reduced ZG16 and 50 nM recombinant human QSOX1. At the indicated time points, 9 µL aliquots of the reaction were removed and mixed with 1 µL of 10 mM mal-PEG. For dithiol-disulfide exchange reactions with PDI, PDI (40 µM) was reduced by incubation with 20 mM DTT for 20 min at room temperature. Reduced PDI was desalted on a PD-10 column equilibrated with PBS. Oxidized PDI and reduced ZG16, or vice versa, were mixed at final concentrations of 5 µM ZG16 and 2.5 µM PDI in 10 µL aliquots. At the indicated time points, 1 µL mal-PEG was added to each aliquot. Gel loading buffer was added to the aliquots, and proteins were separated on 12% acrylamide gels.
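The quantities in the QSOX1 assay can be checked with simple dilution arithmetic. The sketch below computes the final mal-PEG concentration after quenching, its molar excess over ZG16 thiols, and the number of substrate molecules per enzyme molecule. It assumes two cysteine thiols per fully reduced ZG16 molecule (the CXXC motif) and that volumes are additive; the variable names are illustrative.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume / total_volume


# Values taken from the assay description; all concentrations in µM.
ZG16_UM = 10.0
QSOX1_UM = 0.05            # 50 nM
MAL_PEG_STOCK_UM = 10000.0  # 10 mM
ALIQUOT_UL = 9.0
QUENCH_UL = 1.0
THIOLS_PER_ZG16 = 2         # assumption: the two cysteines of the CXXC motif

total_ul = ALIQUOT_UL + QUENCH_UL
mal_peg_final = final_concentration(MAL_PEG_STOCK_UM, QUENCH_UL, total_ul)
zg16_final = final_concentration(ZG16_UM, ALIQUOT_UL, total_ul)
thiol_final = THIOLS_PER_ZG16 * zg16_final

print(f"mal-PEG after quench: {mal_peg_final:.0f} µM")
print(f"maximum ZG16 thiol:   {thiol_final:.0f} µM")
print(f"mal-PEG excess:       {mal_peg_final / thiol_final:.1f}-fold")
print(f"ZG16 per QSOX1:       {ZG16_UM / QSOX1_UM:.0f}")
```

Under these assumptions the quench reagent is in large molar excess over thiols, and each QSOX1 molecule must turn over many times to oxidize the ZG16 pool, consistent with the description of QSOX1 as present in catalytic amounts.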