The TM1244 gene of Thermotoga maritima encodes the PurS subunit of a phosphoribosylformylglycinamidine synthase II [FGAM; Enzyme Commission (EC) 6.3.5.3] with a molecular weight of 9507 Da (residues 1–82) and a calculated isoelectric point of 5.91. PurS is part of the de novo purine biosynthesis subsystem, where it assembles with PurQ (TM1245) and smPurL (TM1246) into formylglycinamide ribonucleotide amidotransferase (FGAR-AT). The FGAR-AT complex is involved in the fourth step of the purine biosynthetic pathway,1–3 where it catalyzes the adenosine triphosphate (ATP)-dependent conversion of formylglycinamide ribonucleotide (FGAR) and glutamine to formylglycinamidine ribonucleotide (FGAM), adenosine diphosphate (ADP), Pi, and glutamate. In Gram-positive bacteria, archaebacteria, and the Gram-negative T. maritima, FGAR-AT is a complex of three proteins: PurS, smPurL, and PurQ. In eukaryotes and other Gram-negative bacteria, FGAR-AT is a multidomain protein with a molecular mass of about 140 kDa (designated as lgPurL). Here, we report the crystal structure of TM1244 that was determined using the semiautomated, high-throughput pipeline of the Joint Center for Structural Genomics (JCSG).4
Materials and Methods.
Protein production and crystallization. PurS from T. maritima [The Institute for Genomic Research (TIGR): TM1244, Swiss-Prot: Q9X0X1] was amplified by polymerase chain reaction (PCR) from genomic DNA using PfuTurbo (Stratagene) and primers corresponding to the predicted 5′- and 3′-ends. The PCR product was cloned into plasmid pMH1, which encodes an expression and purification tag (MGSDKIHHHHHH) at the amino terminus of the full-length protein. The cloning junctions were confirmed by DNA sequencing. Protein expression was performed in a selenomethionine-containing medium using the Escherichia coli methionine auxotrophic strain DL41. Lysozyme was added to the culture at the end of fermentation to a final concentration of 250 μg/mL. Bacteria were lysed by sonication after a freeze/thaw procedure in Lysis Buffer [50 mM Tris, pH 7.9, 50 mM NaCl, 10 mM imidazole, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP)], and the cell debris was pelleted by centrifugation at 3400 × g for 60 min. The soluble fraction was applied to nickel-chelating resin (Amersham Biosciences) pre-equilibrated with Lysis Buffer. The resin was washed with Wash Buffer [50 mM potassium phosphate, pH 7.8, 300 mM NaCl, 40 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP], and the target protein was eluted with Elution Buffer [20 mM Tris, pH 7.9, 300 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP]. The eluate was buffer-exchanged with Buffer Q [20 mM Tris, pH 7.9, 50 mM NaCl, 5% (v/v) glycerol, 0.25 mM TCEP] and applied to a RESOURCE Q column (Amersham Biosciences) pre-equilibrated with the same buffer. The flow-through fraction, which contained the target protein, was buffer-exchanged with Crystallization Buffer [20 mM Tris, pH 7.9, 150 mM NaCl, 0.25 mM TCEP] and concentrated for crystallization assays to 11.5 mg/mL by centrifugal ultrafiltration (Millipore). 
Molecular weight and oligomeric state of TM1244 were determined using a 1.0 × 30 cm Superdex 200 column (Amersham Biosciences) coupled with static light scattering (Wyatt Technology). The mobile phase consisted of 20 mM Tris, pH 8.0, 150 mM NaCl and 0.02% (w/v) sodium azide. The protein was crystallized using the nanodroplet vapor diffusion method5 with standard JCSG crystallization protocols.4 The crystallization reagent, which produced P212121 crystals, contained 20% (v/v) glycerol and 24% (w/v) polyethylene glycol (PEG)-1500. I4122 crystals were obtained from 18% (v/v) PEG-400, 0.2 M CaCl2, 0.01 M trimethylamine HCl, 0.1 M N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES), pH 7.5. Twelve percent (v/v) ethylene glycol was added to the I4122 crystals as a cryoprotectant. The crystals were indexed in orthorhombic space group P212121 and tetragonal space group I4122 (Table I).
Table I. Summary of Crystal Parameters, Data Collection, and Refinement Statistics for TM1244 (PDB code: 1vq3)
Data collection, structure solution, and refinement. Diffraction data from both P212121 and I4122 crystals were collected at the Advanced Light Source (ALS, Berkeley, CA) on beamline 8.2.1. Data sets were collected at 100 K using a Quantum 210 charge-coupled device (CCD) detector. Data were integrated and reduced using Mosflm8 and then scaled with the program SCALA from the CCP4 suite.7 The structure was determined using the JCSG molecular replacement protocol9 with FGAR-AT from Bacillus subtilis [Protein Data Bank (PDB) code: 1t4a; 31% sequence identity] as the search model. Initial molecular replacement attempts with the P212121 data did not produce promising results. Therefore, we used the 2.5 Å I4122 data to solve the structure. Coordinates of a partial model of a dimer of TM1244 from the I4122 crystal form, which lacked residues 30–37 and 51–56, were then used as a search model to phase the P212121 data. We carried out further refinement with the 1.9 Å data from the P212121 crystal using RESOLVE,10 REFMAC5,11 and O.12 Both crystal forms contain four molecules in the asymmetric unit. Data collection, model, and refinement statistics are summarized in Table I.
Validation and deposition. Analysis of the stereochemical quality of the model was accomplished using AutoDepInputTool,13 MolProbity,14 SFcheck 4.0,7 and WHAT IF 5.0.15 Figure 1(B) was adapted from an analysis using PDBsum,16 and all others were prepared with PyMOL (DeLano Scientific). Atomic coordinates and experimental structure factors of TM1244 have been deposited in the PDB and are accessible under the code 1vq3.
Results and Discussion.
The three-dimensional (3D) structure of TM1244 [Fig. 1(A)] was determined to 1.90 Å resolution by the molecular replacement (MR) method as described in the Materials and Methods section. The final model includes four monomers (residues 1–82 for chains A, B, C, D), four residues from the His-tag, and 301 water molecules in the asymmetric unit. No electron density was observed for the remaining residues of the expression and purification tag. The Matthews coefficient (Vm)17 is 2.29 Å³/Da and the estimated solvent content is 45.8%. The Ramachandran plot, produced by MolProbity,14 shows that 98.8%, 99.7%, and 0.3% of the residues are in favored, allowed, and disallowed regions, respectively. The only outlier, Leu68 (in chain A), with ϕ = −160.2° and ψ = −41.7°, is located in the loop between α-helix H2 and β-strand β3 and is involved in a crystal contact.
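The relationship between the Matthews coefficient and the solvent content quoted above can be illustrated with a short Python sketch. It assumes the widely used approximation in which the protein fraction of the crystal is 1.23/Vm; the function names and the constant's default value are illustrative choices, not the procedure used in the JCSG pipeline. The unit-cell volume is taken as an input because the cell parameters (Table I) are not reproduced in the text.

```python
def matthews_coefficient(cell_volume_a3, n_sym_ops, n_mol_per_asu, mol_weight_da):
    """Return Vm in cubic angstroms per dalton.

    Vm is the unit-cell volume divided by the total protein mass in the
    cell (symmetry operators x molecules per asymmetric unit x mass).
    """
    return cell_volume_a3 / (n_sym_ops * n_mol_per_asu * mol_weight_da)


def solvent_fraction(vm, protein_factor=1.23):
    """Approximate solvent fraction as 1 - 1.23/Vm.

    The factor 1.23 assumes a typical protein partial specific volume;
    it is an approximation, not a value taken from this report.
    """
    return 1.0 - protein_factor / vm


# Vm reported for the P212121 crystal form of TM1244.
vm_reported = 2.29
print(f"Estimated solvent content: {solvent_fraction(vm_reported):.1%}")
```

With Vm = 2.29 Å³/Da this approximation gives roughly 46%, close to the reported 45.8%; the small difference presumably reflects a slightly different protein density assumption in the original calculation.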
The TM1244 monomer contains three β-strands (β1–β3) and two α-helices [H1–H2; Fig. 1(A, B)] and belongs to the PurS-like fold of SCOP (Structural Classification of Proteins). The total β-strand and α-helical content is 37.2% and 30.2%, respectively. Most members of the PurS-like fold consist of two α+β subdomains that dimerize and resemble the ferredoxin-like fold. The ferredoxin-like fold has a two-layer α+β architecture composed of four β-strands and two α-helices and is one of the most common folds. However, it is the TM1244 dimer structure that closely resembles the ferredoxin-like fold [Fig. 2(A, B)], where helix H1 and strand β2 come from the neighboring monomer and are segment- or domain-swapped. In the dimer, strand β2 domain-swaps into the adjacent monomer and then continues into its own subunit as strand β3. In this way, the dimer forms two ferredoxin-like domains.
A structural similarity search, performed with the coordinates of TM1244 using the DALI18 server, identified the highly similar structure of PurS (Mth169) from Methanobacterium thermoautotrophicum (PDB: 1gtd),19 with a root-mean-square deviation (RMSD) of 2.2 Å over 74 aligned Cα atoms and a sequence identity of 26%. The MR search model, PurS from B. subtilis (PDB: 1t4a),20 could be superimposed with an RMSD of 1.4 Å over 63 aligned Cα atoms and a sequence identity of 37%. Given the low sequence similarity, the structural similarity of TM1244, Mth169, and B. subtilis PurS is remarkable, and the three proteins can be aligned over all secondary structural elements with no significant structural differences.
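For readers unfamiliar with the RMSD values quoted above, the following Python sketch shows how an RMSD over paired Cα atoms is computed once two structures have already been aligned and superimposed. It covers only this final step: DALI performs its own residue alignment and optimal superposition, which are not reproduced here, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two equal-length lists of
    (x, y, z) tuples that are already paired and superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of each aligned pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (not real TM1244 coordinates): every atom is displaced by 2 Å.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
target = [(0.0, 0.0, 2.0), (3.8, 0.0, 2.0)]
print(rmsd(model, target))  # 2.0
```

Because the value depends on which atoms are paired, an RMSD is only meaningful together with the number of aligned Cα atoms, which is why both numbers are reported for each comparison.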
The monomeric unit of TM1244 is unlikely to be the biologically relevant unit, as the ferredoxin-like fold is formed from dimerization through domain swapping. In both the I4122 and P212121 crystal forms, a weakly associated tetramer is also observed, whereas the dimeric interface is more extensive and tightly packed in both crystal forms. However, the crystal packing of these tetrameric units differs between the two forms. In the P212121 crystals, the tetramers assemble as a polymeric repeat, whereas in I4122 the assembly is octameric. Results from analytical size exclusion chromatography in combination with static light scattering indicate a dimer in solution, consistent with the 2:1:1 ratio of PurS:smPurL:PurQ in the PurSQL complex. A dimer is also consistent with the structure of FGAR-AT (lgPurL) from Salmonella typhimurium (PDB: 1t3t),21 where duplicated PurS domains on the same chain resemble the TM1244 dimer structure. Structural alignment of lgPurL with the TM1244 dimer produces an RMSD of 1.8 Å over 124 aligned Cα atoms and a sequence identity of 12% [Fig. 3(A, B)]. Interestingly, in lgPurL, only five β-strands are conserved and the region corresponding to β-strand β3 of chain A of TM1244 is replaced by a loop in the lgPurL structure that links the N- and C-terminal halves of the domain. It is noteworthy that, between lgPurL and the known structures of PurS homologs, the sequence of this loop region is better conserved than its local structure.
In T. maritima, the FGAR-AT complex comprises three proteins: PurS, smPurL, and PurQ. In eukaryotes and other Gram-negative bacteria, the enzyme is a single, multidomain protein: lgPurL. Previously, we reported the structure of smPurL (TM1246)22 from T. maritima, which corresponds to the middle domain of lgPurL. PurS (TM1244) corresponds to the N-terminal domain of lgPurL, whereas the C-terminal domain of lgPurL corresponds to the PurQ protein (TM1245). A structural superposition showing the putative arrangement of TM1244 and TM1246 in the T. maritima FGAR-AT complex is illustrated in Figure 3(B). The location that PurQ (TM1245) is predicted to occupy in the FGAR-AT complex is indicated by an arrow.
The TM1244 structure reported here represents a PurS protein whose structure has been determined by X-ray crystallography. The information reported here, in combination with the recently solved structure of smPurL (TM1246) and further biochemical and biophysical studies, will yield valuable insights regarding the role of this protein in purine biosynthesis.
Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory (SSRL) and the Advanced Light Source (ALS). The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences). The ALS is supported by the Director, Office of Science, Office of Basic Energy Sciences, Materials Sciences Division, of the U.S. Department of Energy under Contract No. DE-AC03-76SF00098 at Lawrence Berkeley National Laboratory.