The macro domain as fusion tag for carrier‐driven crystallization

Obtaining well‐ordered crystals remains a significant challenge in protein X‐ray crystallography. Carrier‐driven crystallization can facilitate crystal formation and structure solution of difficult target proteins. We obtained crystals of the small and highly flexible SPX domain from the yeast vacuolar transporter chaperone 4 (Vtc4) when fused to a C‐terminal, non‐cleavable macro tag derived from human histone macroH2A1.1. Initial crystals diffracted to 3.3 Å resolution. Reductive protein methylation of the fusion protein yielded a new crystal form diffracting to 2.1 Å. The structures were solved by molecular replacement, using isolated macro domain structures as search models. Our findings suggest that macro domain tags can be employed in recombinant protein expression in E. coli, and in carrier‐driven crystallization.


Introduction
Carrier-driven crystallization, 1,2 or chaperone-assisted crystallization, 3 describes the crystallization of a target protein by fusing it to a well-behaved protein tag, which may contribute to the formation of the crystal lattice. A small selection of well-characterized proteins has thus far been used as crystallization tags: short fragments of fibrinogen and α-actin have been crystallized by fusing them to E. coli glutathione-S-transferase (GST) 4 or the catalytic domain of myosin II, 5 respectively. The β2 adrenergic G protein-coupled receptor crystallized after inserting T4 lysozyme into a flexible loop region. 6 Several structures obtained by carrier-driven crystallization contain E. coli maltose binding protein (MBP) as a fusion tag 3 : the ectodomain of the human T cell leukemia virus type 1 gp21, 7 the yeast mating regulator MATa1 homeodomain, 8 segments of the amyloid-forming α-synuclein 9 and the fungal Kar3 kinesin motor domain. 10 Corsini et al. demonstrated the importance of using small rigid linkers between the target protein and the fusion tag, when crystallizing the U2AF homology motif of splicing factor Puf60 fused to E. coli thioredoxin A. 11 Given that there are high resolution structures available for most of the fusion proteins used thus far, the fusion tag, besides assisting in the formation of diffracting crystals, can facilitate structure solution by molecular replacement. In addition, an increased stability of the fusion protein is frequently observed. 12,13
We have recently reported that SPX domains of previously unknown biochemical function are sensors for inositol pyrophosphate signaling molecules, controlling phosphate (Pi) homeostasis in fungi, plants and animals. 14 The name SPX originates from the yeast SYG1 and Pho81 and the mammalian XPR1 proteins, all of which contain SPX domains (https://www.ebi.ac.uk/interpro/entry/IPR004331). Comparing different eukaryotic organisms, we located this small, α-helical domain at the N-termini of proteins involved in Pi transport 15,16 and Pi signaling, 17,18 and as a single-domain protein in plants. 14,19,20 Previously, we mapped three sets of domains in the vacuolar transporter chaperone (VTC) complex, a multi-subunit protein assembly embedded in the vacuolar membrane in yeast cells. 21,22 The central catalytic triphosphate tunnel metalloenzyme (TTM) domain synthesizes inorganic polyphosphate chains from ATP, 22,23 and a trans-membrane pore translocates the growing polymer into the vacuole, where inorganic polyphosphate represents an important Pi store. 22,24 VTC subunits Vtc2, 3 and 4 contain additional SPX domains of about 180 amino acids at their N-termini. We could solubly express and purify S. cerevisiae SPX ScVtc2 (residues 1-182) and SPX ScVtc4 (residues 1-178) as well as other SPX domains from various fungal and plant species, but could obtain neither crystals from the purified proteins nor readily interpretable NMR spectra. 14 In addition, carrier-driven crystallization using the established thioredoxin A 11 and maltose-binding protein 3 tags was unsuccessful.
Abbreviations: GST, glutathione-S-transferase; MBP, maltose binding protein; SEC, size-exclusion chromatography; TTM, triphosphate tunnel metalloenzyme; VTC, vacuolar transporter chaperone.
Statement of Significance: A novel fusion tag for recombinant protein expression and carrier-driven crystallization is presented.

SPX domains can be crystallized as macro domain fusion proteins
The VTC SPX domains contain highly conserved N-termini and are connected to the catalytic TTM domain by a short linker 22 (Fig. 1). We attempted to crystallize SPX ScVtc2 and SPX ScVtc4 fused to their respective catalytic TTM domains (residues 1-553 and 1-480 in ScVtc2 and ScVtc4, respectively). We obtained crystals for SPX ScVtc4-TTM-6xHis, diffracting to 4.5 Å resolution and showing clear signs of perfect merohedral twinning. We replaced the TTM domain by thioredoxin A, 11 but this construct again failed to yield well-diffracting crystals [Fig. 2(A)]. Next, we engineered SPX ScVtc4 (residues 1-178) with a C-terminal macro domain tag (residues 181-366), connected via a short Ala-Gly-Ser linker to decrease inter-domain flexibility 11 [Fig. 2(B); see Methods]. We chose the macro domain of human histone variant macroH2A1.1 as fusion tag as it (1) expresses to high levels in E. coli, (2) crystallizes in many and very different crystal lattices 25-27 (PDB-IDs 1YD9, 1ZR3, 1ZR5, 2FXK, 3IID, 3IIF), and (3) has solvent accessible N- and C-termini that allow for the addition of a fusion protein. Importantly, macro domains bind ADP-ribose, and ligand binding results in small conformational changes that alter the crystallization properties of the domain. 26,27 Our construct provides a C-terminal, non-cleavable 6xHis tag for metal affinity purification.
The SPX ScVtc4-6xHis and SPX ScVtc4-macro-6xHis fusion proteins expressed to similar levels in E. coli and both could be purified to homogeneity, while N-terminal fusion constructs were insoluble 14 [Fig. 2(C)]. The fusion protein displayed increased protein stability and yielded needle-shaped crystals in more than 25 different crystallization conditions, after screening 10 different 96-well grid screens using the sitting drop vapor diffusion method. In parallel, screening of a reductively methylated SPX ScVtc4-macro fusion protein 28 produced crystals of a different morphology.

The macro domain can be utilized to solve the phase problem
The non-methylated SPX ScVtc4-macro fusion protein yielded orthorhombic crystals diffracting to 3.3 Å resolution. Structural superposition of the existing macroH2A1.1 models suggests that the N- (residues 180-185) and C-termini (residues 354-368) of the macro domain can adopt different orientations [core r.m.s.d. values are between 0.6 Å and 1.0 Å comparing 169 corresponding Cα atoms; Fig. 3(A)]. We thus used all available macroH2A1.1 structures as search models in molecular replacement searches as implemented in the program PHASER. 29 The best solution comprises two macro domains derived from PDB entry 1YD9 25 in the asymmetric unit, which are related by a pseudo two-fold axis [Fig. 3(B,C)]. We refined this crystallographic dimer, which accounts for 50% of protein atoms in the asymmetric unit, in the program autoBUSTER (see Methods) and used the resulting phase information as starting phases for density modification (solvent content is 0.65). The resulting density-modified map at 3.3 Å revealed the presence of long segments of electron density, which could be modeled as two long α-helices forming the core of the SPX domain [Table I]. 14
[Figure caption: ... cerevisiae Vtc2; UniProt Acc. P43585), ScVtc3 (S. cerevisiae Vtc3; Q02725) and ScVtc4 (S. cerevisiae Vtc4; P47075) and including a secondary structure assignment calculated with the program DSSP 48 and based on PDB entry 5IIG. 14 Invariant and conserved residues are highlighted in dark- and light-purple, respectively. SPX ScVtc2 shows 62% and 32% sequence identity with SPX ScVtc3 and SPX ScVtc4, respectively. The two long core helices of the SPX domain are colored in light blue, surrounding helices are colored from yellow to red. The C-terminal catalytic TTM domain (α-helices in green, β-strands in blue) is connected to the SPX domain via a variable linker (in yellow).]
Monoclinic crystals of the reductively methylated SPX ScVtc4-macro fusion protein diffracted to 2.1 Å (PDB-ID 5IIT, 14 Table I). The structure was solved using the macro domain PDB entry 1ZR3 26 and the two SPX core helices from the low resolution SPX ScVtc4-macro structure as search models in molecular replacement calculations with PHASER, 29 as described. 14 There are four molecules in the asymmetric unit, with the SPX and macro domains making extensive interactions [Fig. 4(A)]. While the two long core helices of the non-methylated and methylated SPX ScVtc4 domain structures closely align (r.m.s.d. is 1.0 Å comparing 71 corresponding Cα atoms), their N- (residues 2-22) and C-termini (residues 154-177) adopt different orientations in our structures: the very C-terminal helix in SPX does not engage in the formation of a three-helix bundle [Fig. 4(B)], but points outwards [shown in red in Fig. 4(A,C)], establishing contacts with a neighboring molecule [Fig. 4(A)]. Remarkably, we found the N-terminal α-helical hairpin motif, initially located in our 3.3 Å map, to be disordered in the 2.1 Å structure [Fig. 4(B,C)]. Importantly, the Ala-Gly-Ser linker is well-defined by electron density in the 2.1 Å structure [Fig. 4(C)]. We used this high-resolution SPX ScVtc4-macro model to confirm the topology of the SPX domain and the directionality of α-helices in our low resolution maps. The refined models revealed that SPX domains fold into three-helix bundles with two central 80 Å long core helices and two shorter C-terminal helices, connected by short loops. The N-terminus appears flexible but can fold into an α-helical hairpin motif, which provides a binding site for inositol polyphosphate signaling molecules 14 [Fig. 4(B,C)].
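The r.m.s.d. values quoted in this section compare corresponding Cα atoms of superposed models. As a minimal sketch of that calculation, the Python snippet below computes the r.m.s.d. for two coordinate lists that are assumed to be already superposed and listed in matching residue order. The function name and the toy coordinates are hypothetical illustrations and are not taken from any deposited structure or from the software used in this study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two lists of (x, y, z) tuples.

    Assumption: both lists are already superposed and hold one C-alpha
    position per corresponding residue, in the same order.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates: model_b is model_a shifted by 1 Å along y.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(f"C-alpha r.m.s.d.: {rmsd(model_a, model_b):.2f} Å")
```

The sketch deliberately leaves out the superposition step itself (for example a least-squares fit), which dedicated structure-analysis programs perform before reporting an r.m.s.d.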

Crystal packing involves SPX and macro domain protein-protein interactions
We next analyzed the arrangement of the SPX ScVtc4 and macroH2A1.1 domains in our two crystal forms (solvent content is 0.65 and 0.55 for the non-methylated and methylated forms, respectively). Crystal lattice formation in our low-resolution structure of non-methylated SPX ScVtc4-macro is achieved by SPX-SPX (interface area: 1050 Å²), by macro-macro (interface area: 870 Å²) and by SPX-macro domain interactions (interface area: 770 Å²), as calculated with the program PISA 30 [Fig. 4(D)]. Similar results were obtained for the crystal form of the reductively methylated SPX ScVtc4-macro fusion protein, although very different crystal contacts are established in the two crystal forms [Fig. 4(D,E)]. In both cases, SPX-SPX interactions are not sufficient to build up a three-dimensional crystal lattice [Fig. 4(D,E)], suggesting that the additional interactions formed by the macro domain allowed for the crystallization of the fusion protein.
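The solvent contents quoted above are conventionally related to the Matthews coefficient V_M (crystal volume per unit of protein molecular mass, in Å³/Da) by the approximation: solvent fraction = 1 - 1.23 / V_M, which assumes a typical protein partial specific volume of about 0.74 cm³/g. The sketch below illustrates this relationship in Python. The function names are hypothetical, and no unit-cell parameters from this study are used, since they are not given in the text.

```python
def matthews_coefficient(cell_volume_A3, molecules_per_cell, mol_weight_Da):
    """V_M in Å^3/Da: unit-cell volume divided by the total protein mass in the cell."""
    return cell_volume_A3 / (molecules_per_cell * mol_weight_Da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (assumes a typical protein density)."""
    return 1.0 - 1.23 / vm


def vm_from_solvent(fraction):
    """Inverse relation: V_M implied by a given solvent fraction."""
    return 1.23 / (1.0 - fraction)


# Solvent contents reported in the text for the two crystal forms.
for label, fraction in (("non-methylated", 0.65), ("methylated", 0.55)):
    print(f"{label}: V_M is about {vm_from_solvent(fraction):.2f} Å^3/Da")
```

Under this approximation, solvent contents of 0.65 and 0.55 correspond to V_M values of roughly 3.5 and 2.7 Å³/Da, respectively.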

Discussion
Our experiments suggest that macro domains are suitable tags for recombinant protein expression and carrier-driven crystallization. Based on our previous work on the histone variant macroH2A1.1, 26,27 we employed this human macro domain as fusion tag as it (1) stably expresses to high levels in E. coli, (2) can be easily purified, (3) is very stable in minimal buffers, (4) crystallizes in many different lattice combinations 25-27 (PDB-IDs 1YD9, 1ZR3, 1ZR5, 2FXK, 3IID, 3IIF), and (5) provides solvent accessible N- and C-termini for the design of both N- and C-terminal (this study) protein fusions. It is of note that macro domains bind ADP-ribose with nano- to micromolar affinity. 26,31 Thus, ADP-ribose could be used as an additive in carrier-driven crystallization. In our previous co-crystallization experiments with the isolated macroH2A1.1 macro domain, ADP-ribose induced the formation of new crystal forms (compared to the existing 'apo' structures 25,26 ), as ligand binding induces a structural re-orientation of the C-terminal helix. 26,27 Notably, the macro ADP-ribose binding pocket can accommodate a MES buffer molecule (e.g., PDB-ID 1ZR3 26 ), and we could only obtain crystals for the SPX ScVtc4-macro fusion when using MES-based protein storage buffers. In addition to ADP-ribose, macro domains can also bind ADP, albeit with lower affinity. 26,31 It might thus be possible to soak or co-crystallize macro fusion proteins with halogenated nucleotide variants, in order to introduce heavy atoms at defined positions in macro fusion protein crystal lattices for experimental phasing. Our work has focused on the use of the macroH2A1.1 macro domain, but many alternative macro domain structures have been reported from human (PDB-IDs 2X47, 32 2L8R, 33 4IQY, 34 3VFQ, 35 4J5S 36 ), trypanosomal (PDB-ID 5FSY 37 ), archaeal (PDB-IDs 1HJZ, 38 2BFQ 31 ), bacterial (PDB-IDs 5CB3, 39 5KIV 40 ) and viral (PDB-IDs 3EJF, 41 3EWP, 42 3GPO, 43 5DUS 44 ) proteins. Given that all these different macro domains produced well-diffracting crystals (n = 61 were refined at a resolution between 1.5 and 2.5 Å), one interesting approach to carrier-driven crystallization would be to screen target proteins in fusion with different macro domains and in the presence or absence of ADP-ribose. In addition, it may be worthwhile to test linkers of different length and composition between the protein of interest and the macro fusion tag.
Linker length is an important factor in carrier-driven crystallization as it determines the molecular dynamics between the target protein and the fusion tag. While linkers that are too short may cause expression and folding problems, long linkers may hinder crystal formation. 11 In the present study a short Ala-Gly-Ser linker was used, which allowed for high-level expression of the SPX ScVtc4-macro fusion protein and which has well-defined electron density in our 2.1 Å structure [Figs. 2(C), 4(C)].
Taken together, crystallization of SPX ScVtc4 as a macro fusion protein allowed us to define the SPX domain fold and borders, which later on enabled the successful crystallization and structure solution of other fungal and human SPX domains. 14 Based on our findings, we suggest that macro domains could be interesting fusion tags for recombinant protein expression and for the carrier-driven crystallization of 'difficult' target proteins.

Cloning and purification of proteins
The S. cerevisiae SPX ScVtc4 (amino acids 1-178) was cloned into plasmid pMH-HC using BspHI and XhoI restriction sites. The plasmid provides a C-terminal non-cleavable 6xHis tag. For carrier-driven crystallization, a new plasmid was constructed which provides a C-terminal macro tag (amino acids 181-366 of human histone macroH2A1.1) followed by a 6xHis tag [pMH-macroHC, Fig. 2(B)]. SPX ScVtc4 and the macro domain are connected by an Ala-Gly-Ser linker. Target proteins can be cloned utilizing the XbaI/BamHI restriction sites.
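The domain order of the fusion construct described above (target fragment, Ala-Gly-Ser linker, macro tag fragment, 6xHis tag) can be sketched at the protein-sequence level as follows. This is only an illustration: the helper names and the toy sequences are hypothetical placeholders, not the real Vtc4 or macroH2A1.1 sequences, and any extra residues introduced by the cloning sites are ignored.

```python
LINKER = "AGS"      # Ala-Gly-Ser linker, as described in the text
HIS_TAG = "H" * 6   # C-terminal 6xHis tag


def residue_range(sequence, first, last):
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first - 1:last]


def build_fusion(target_seq, target_range, tag_seq, tag_range):
    """Assemble target fragment + linker + tag fragment + 6xHis tag."""
    target = residue_range(target_seq, *target_range)
    tag = residue_range(tag_seq, *tag_range)
    return target + LINKER + tag + HIS_TAG


# Toy placeholder sequences (hypothetical, for illustration only).
toy_target = "MKFGEHLSKS"
toy_tag = "GSTAPLEVLQ"
fusion = build_fusion(toy_target, (1, 5), toy_tag, (3, 8))
print(fusion)
```

With the full-length sequences at hand, the ranges given in the text would be passed as (1, 178) for the SPX domain and (181, 366) for the macro domain.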
For recombinant protein expression, plasmids were transformed into E. coli BL21 (DE3) RIL cells and selected on LB-agar plates containing kanamycin and chloramphenicol. Three liters of terrific broth medium containing 30 µg/mL kanamycin and 34 µg/mL chloramphenicol were inoculated and cells were grown to OD600nm = 0.6 at 37 °C. Then, the temperature was reduced to 16 °C and protein expression was induced by adding 300 µM isopropyl-β-D-thiogalactopyranoside. After 16 hours, cells were harvested by centrifugation for 20 min at 4,000g at 4 °C. The pellet was washed with PBS buffer, resuspended in a small volume of lysis buffer (50 mM Tris/HCl pH 7.8, 500 mM NaCl, 2 mM β-mercaptoethanol [β-ME]) and snap-frozen in liquid nitrogen.
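The amount of inducer needed for the 3 L culture follows from the usual dilution relation C1 x V1 = C2 x V2. The short Python sketch below works this out for a final concentration of 300 µM isopropyl-β-D-thiogalactopyranoside. The 1 M stock concentration and the function name are assumptions made for illustration and are not stated in the protocol.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (mL) needed to reach final_conc in final_volume_ml.

    Uses C1 * V1 = C2 * V2; both concentrations must share the same unit.
    The small volume added to the culture is neglected.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_conc * final_volume_ml / stock_conc


# 300 µM inducer in 3 L (3000 mL) of culture.
# Assumption: a 1 M stock solution; concentrations are given in mol/L.
iptg_ml = stock_volume_ml(300e-6, 1.0, 3000.0)
print(f"Inducer stock to add: {iptg_ml:.2f} mL")
```

For the assumed 1 M stock this amounts to 0.9 mL for the 3 L culture.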
For protein purification, pellets from 3 L of bacterial culture were thawed and lysis buffer, supplemented with 0.1% (v/v) IGEPAL, 1 mM MgCl2, 500 units TurboNuclease (BioVision) and 2 tablets Protease Inhibitor Cocktail (Roche), was added to a final volume of 300 mL. Cells were lysed using an EmulsiFlex-C3 (Avestin) and the cell debris was removed by centrifugation for 1 hour at 7,000g at 4 °C. Proteins were purified by Ni2+ affinity chromatography using a 5 mL HisTrap HP column (GE Healthcare). After dialysing proteins against size-exclusion chromatography (SEC) buffer (20 mM MES pH 6.5, 300 mM NaCl, 0.5 mM TCEP) overnight at 4 °C, monomeric SPX ScVtc4 or SPX ScVtc4-macro peak fractions were isolated on a Superdex 75 HR 26/60 column (GE Healthcare). Reductive methylation of SPX ScVtc4-macro was carried out as described previously. 28 The methylated protein sample was re-purified by an additional SEC step. Purified proteins were concentrated and immediately used for crystallization.

Crystallization and data collection
SPX ScVtc4-macro (16 mg/mL in 20 mM MES pH 6.5, 300 mM NaCl, 0.5 mM TCEP) crystals were grown in 2.75 M NaCl, 8.75% (v/v) PEG 6,000. Crystals were cryo-protected by serial transfer through 10 µL drops containing crystallization buffer supplemented with increasing concentrations of ethylene glycol (final concentration: 8% [v/v]), and diffracted to 3.3 Å at beam-line PXII of the Swiss Light Source (SLS), Villigen, Switzerland. The reductively methylated SPX ScVtc4-macro protein (10 mg/mL in 20 mM MES pH 6.5, 300 mM NaCl, 0.5 mM TCEP) crystallized in 19% (v/v) PEG 3,350, 0.1 M (NH4)2SO4, 0.1 M MES pH 6.5 as described. 14 Data processing and scaling were done with XDS (version: May, 2016). 45

Crystallographic structure solution and refinement
The structure of the non-methylated SPX ScVtc4-macro domain was solved using the molecular replacement method as implemented in the program PHASER, 29 and using the isolated macro domain of human histone macroH2A1.1 (PDB-ID 1YD9 25 ) as search model. The solution comprised a dimer in the asymmetric unit, which was refined in autoBUSTER (Global Phasing Limited, version 2.10.3). The resulting phases were used as starting phases for density modification as implemented in PHENIX.RESOLVE. 46 The structure was completed in alternating cycles of manual model building in COOT 25 and restrained TLS refinement in autoBUSTER (Global Phasing Limited), using external reference restraints based on the high-resolution structures of the macroH2A1.1 macro (PDB-ID 1ZR3 26 ) and SPX ScVtc4 domains (PDB-ID 5IIG 14 ), respectively. The side-chains of most amino acids in the SPX domain could not be modeled with certainty and were thus truncated to Ala. The quality of the refined structure was validated using the program MolProbity 47 (Table I) and structural presentations were prepared in PyMOL (Molecular Graphics System, Version 1.7, Schrödinger, LLC).