The TM1585 gene of Thermotoga maritima encodes a putative glycerate kinase (EC 2.7.1.-)1,2 with a molecular weight of 44,589 Da (residues 1–417) and a calculated isoelectric point of 5.73. This enzyme is part of the Entner-Doudoroff pathway II (non-phosphorylative), where it catalyzes the ATP-dependent conversion of (R)-glycerate to 2-phospho-(R)-glycerate and ADP [Fig. 1(A)]. Two groups of glycerate kinases are known, and they share no significant sequence similarity despite their presumably identical biochemical activities. The first group consists of glycerate kinases from bacterial species, primarily of the Firmicutes group and the gamma subdivision of Proteobacteria. The second group includes TM1585 and glycerate kinases from eukaryotes and archaea in addition to several bacterial species.3 The initial annotation for TM1585 as a glycerate kinase was based on its homology to a gene responsible for complementation in Methylobacterium extorquens AM1 mutants lacking glycerate kinase activity.4 However, other family members related to TM1585 are annotated as putative glycerate dehydrogenases/hydroxypyruvate reductases based on genetic analysis of the tartrate utilization pathway in Agrobacterium vitis.5 Glycerate kinase and glycerate dehydrogenase/hydroxypyruvate reductase catalyze successive steps in the serine metabolism pathway.4 Recent biochemical studies have shown a glycerate-2-kinase activity for TM1585 (A. Osterman et al., personal communication). In humans, deficiency of glycerate kinase leads to D-glyceric aciduria (OMIM 220120), which is characterized by massive amounts of D-glyceric acid in the urine.6 Clinical features include delayed psychomotor development, mental retardation, and seizures.7 Here, we report the crystal structure of TM1585, the first structural representative of the glycerate-2-kinase family, which was determined using the semiautomated, high-throughput pipeline of the Joint Center for Structural Genomics (JCSG).8
Materials and Methods.
Protein production and crystallization. The TM1585 gene (TIGR: TM1585; Swiss-Prot: Q9X1S1) was amplified by polymerase chain reaction (PCR) from genomic DNA using PfuTurbo (Stratagene) and primers corresponding to the predicted 5′- and 3′-ends. The PCR product was cloned into plasmid pMH1, which encodes an expression and purification tag (MGSDKIHHHHHH) at the amino terminus of the full-length protein. The cloning junctions were confirmed by DNA sequencing. Protein expression was performed in a selenomethionine-containing medium using the Escherichia coli methionine auxotrophic strain DL41. At the end of fermentation, 250 μg/mL lysozyme was added to the culture. Bacteria were lysed by sonication after a freeze/thaw procedure in Lysis Buffer [50 mM Tris (pH 7.9), 50 mM NaCl, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP)], and the cell debris was pelleted by centrifugation at 3400 × g for 60 min. The soluble fraction was applied to nickel-chelating resin (GE Healthcare) pre-equilibrated with Equilibration Buffer [50 mM potassium phosphate (pH 7.8), 300 mM NaCl, 10% (v/v) glycerol, 0.25 mM TCEP] containing 20 mM imidazole. The resin was washed with Equilibration Buffer containing 40 mM imidazole, and the target protein was eluted with Elution Buffer [20 mM Tris (pH 7.9), 300 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP]. The eluate was buffer exchanged with Buffer Q [20 mM Tris pH 7.9, 5% (v/v) glycerol, 0.25 mM TCEP] containing 50 mM NaCl and applied to a RESOURCE Q column (GE Healthcare) pre-equilibrated with the same buffer. The protein was eluted using a linear gradient of 50 to 500 mM NaCl in Buffer Q. The appropriate fractions were pooled, buffer exchanged with Crystallization Buffer [20 mM Tris (pH 7.9), 150 mM NaCl, 0.25 mM TCEP], and concentrated for crystallization assays to 10.5 mg/mL by centrifugal ultrafiltration (Millipore). 
Molecular weight and oligomeric state of the target protein were determined using a 1.0 × 30 cm Superdex 200 column (GE Healthcare) in combination with static light scattering (Wyatt Technology). The mobile phase consisted of 20 mM Tris (pH 8.0), 150 mM NaCl, 0.02% (w/v) sodium azide. The protein was crystallized using the nanodroplet vapor diffusion method9 with standard JCSG crystallization protocols.8 The crystallization reagent contained 65% (v/v) 2-methyl-2,4-pentanediol (MPD), 0.1 M Bicine (pH 9.0). The crystals were indexed in orthorhombic space group P2₁2₁2₁ (Table I).
Table I. Summary of Crystal Parameters, Data Collection, and Refinement Statistics for TM1585 (PDB: 2b8n)
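$R_{\mathrm{sym}} = \sum |I_i - \langle I_i \rangle| / \sum |I_i|$, where $I_i$ is the scaled intensity of the $i$th measurement and $\langle I_i \rangle$ is the mean intensity for that reflection.
$R_{\mathrm{cryst}} = \sum \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| / \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{calc}}$ and $F_{\mathrm{obs}}$ are the calculated and observed structure factor amplitudes, respectively.
$R_{\mathrm{free}}$: as for $R_{\mathrm{cryst}}$, but for 5.0% of the total reflections chosen at random and omitted from refinement.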
Typically, the number of unique reflections used in refinement is slightly less than the total number that were integrated and scaled. Reflections are excluded due to systematic absences, negative intensities, and rounding errors in the resolution limits and cell parameters.
Unit cell parameters
a = 61.95 Å, b = 85.17 Å, c = 172.51 Å, α = β = γ = 90.00°
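The R-factor definitions in the Table I footnotes can be written out as a short Python sketch. This is only an illustration of the formulas as stated; the function names, the input layout (a list of `(hkl, intensity)` pairs), and the toy numbers are assumptions for the example and are not taken from the TM1585 data or from the programs used in this work.

```python
from collections import defaultdict


def r_sym(measurements):
    """R_sym = sum|I_i - <I_i>| / sum|I_i|.

    `measurements` is an iterable of (hkl, intensity) pairs, where repeated
    hkl keys are symmetry-equivalent or repeated observations of the same
    unique reflection (hypothetical input layout for this sketch).
    """
    groups = defaultdict(list)
    for hkl, intensity in measurements:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum||F_obs| - |F_calc|| / sum|F_obs| over paired amplitudes.

    Applied to the working set this corresponds to R_cryst; applied to the
    reflections held out of refinement it corresponds to R_free.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Toy values, invented purely to exercise the functions.
toy_measurements = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
]
toy_r_sym = r_sym(toy_measurements)
toy_r = r_factor([10.0, 20.0, 30.0], [11.0, 18.0, 30.0])
```

The same `r_factor` function serves for both Rcryst and Rfree; the only difference is which subset of reflections is passed in.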
Data collection. Multi-wavelength anomalous diffraction (MAD) data were collected at the Stanford Synchrotron Radiation Laboratory (SSRL) on beamline 9-2 at wavelengths corresponding to the high-energy remote (λ1) and inflection point (λ2) of a selenium MAD experiment. The datasets were collected at 100 K using an ADSC CCD detector. The MAD data were integrated and reduced using XDS and then scaled with the program XSCALE.10 Data statistics are summarized in Table I.
Structure solution and refinement. The structure was determined with 2.53 Å selenium MAD data using the CCP4 suite11 and SOLVE/RESOLVE.12 Model completion and refinement were performed with dataset λ1 using COOT13 and REFMAC5.14 Because the data are incomplete to 2.53 Å, we define the nominal resolution as 2.70 Å, which is the resolution of a hypothetical dataset that is 100% complete and contains the same number of reflections as were observed in this dataset. Nevertheless, the 2890 reflections observed between 2.70 and 2.53 Å (52.8% complete for this shell) are included in the refinement. Refinement statistics are summarized in Table I.
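The nominal-resolution definition above can be illustrated with a rough calculation. The sketch below assumes the standard approximation that the number of unique reflections within a resolution limit d scales as 1/d³ (the volume of a sphere of radius 1/d in reciprocal space). This scaling and the function names are assumptions made for illustration and may not match how the nominal resolution was actually computed here. Under this approximation, a nominal resolution of 2.70 Å for data extending to 2.53 Å would correspond to an overall completeness of roughly 82%, a figure implied by the approximation rather than one reported in Table I.

```python
def nominal_resolution(d_min, overall_completeness):
    """Resolution (in Å) of a 100%-complete dataset holding as many unique
    reflections as an incomplete dataset that extends to d_min.

    Assumes the count of unique reflections scales as 1/d^3.
    """
    if not 0.0 < overall_completeness <= 1.0:
        raise ValueError("overall_completeness must be in (0, 1]")
    return d_min / overall_completeness ** (1.0 / 3.0)


def implied_completeness(d_min, d_nominal):
    """Overall completeness to d_min implied by a given nominal resolution,
    under the same 1/d^3 assumption."""
    return (d_min / d_nominal) ** 3


def possible_reflections(n_observed, shell_completeness):
    """Number of theoretically possible reflections in a shell, given the
    observed count and the fractional completeness of that shell."""
    return n_observed / shell_completeness


# Values quoted in the text: data to 2.53 Å, nominal resolution 2.70 Å,
# and 2890 reflections at 52.8% completeness in the 2.70-2.53 Å shell.
completeness = implied_completeness(2.53, 2.70)
round_trip = nominal_resolution(2.53, completeness)
shell_possible = possible_reflections(2890, 0.528)
```

The last helper simply inverts the shell completeness: 2890 observed reflections at 52.8% completeness imply roughly 5470 possible reflections between 2.70 and 2.53 Å.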
Validation and deposition. Analysis of the stereochemical quality of the model was accomplished using AutoDepInputTool,15 MolProbity,16 SFcheck 4.0,17 and WHATIF 5.0.18 Protein quaternary structure analysis was performed using the PQS server.19 Figures were prepared with PyMOL (DeLano Scientific). Atomic coordinates and experimental structure factors for TM1585 at 2.70 Å resolution have been deposited in the PDB and are accessible under the code 2b8n.
Results and Discussion.
The crystal structure of TM1585 [Fig. 1(B)] was determined to a nominal resolution of 2.70 Å using the MAD method. Data collection, model, and refinement statistics are summarized in Table I. The final model includes two TM1585 monomers (residues 4–417 for chains A and B) and 97 water molecules in the asymmetric unit. No electron density was observed for residues 1–3 or the expression and purification tag. The Matthews coefficient (Vm)20 for TM1585 is 2.6 Å³/Da, and the estimated solvent content is 52.4%. The Ramachandran plot produced by MolProbity16 shows that 98.2% and 100.0% of the residues are in favored and allowed regions, respectively. TM1585 is composed of 12 β-strands (β1–β12), 15 α-helices (H1–H7, H9–H16), and one 3₁₀-helix (H8) [Fig. 1(B and C)]. The total β-strand, α-helical, and 3₁₀-helical content is 18.6%, 39.9%, and 0.7%, respectively. The TM1585 monomer comprises two dissimilar α/β domains: an N-terminal, Rossmann-like domain (residues 23–249) and a C-terminal domain that adopts a new fold (residues 4–22 and 250–417). The N-terminal domain has a central, six-stranded, parallel β-sheet with a strand order of 654123, as well as an all-helical subdomain that is formed from the packing of helices H5–H6 and H9–H10. The C-terminal domain contains a six-stranded, mixed β-sheet with strand order 126345 and seven helices packed on both sides of the β-sheet.
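The Matthews coefficient and solvent content quoted above can be approximated from the unit cell parameters in Table I. The sketch below is a back-of-the-envelope check, not the calculation used for the reported values. It assumes four asymmetric units per cell for the orthorhombic space group, two monomers per asymmetric unit, the untagged molecular weight of 44,589 Da given in the introduction, and the conventional factor of 1.23 Å³/Da for the solvent estimate. With these inputs the result comes out close to, but not identical with, the reported 2.6 Å³/Da and 52.4%; the exact inputs behind the reported figures are not stated.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit cell volume in Å^3 for a cell with all angles equal to 90°."""
    return a * b * c


def matthews_coefficient(cell_volume, n_asu_per_cell, n_mol_per_asu, mass_da):
    """Vm in Å^3/Da: cell volume divided by total protein mass in the cell."""
    return cell_volume / (n_asu_per_cell * n_mol_per_asu * mass_da)


def solvent_fraction(vm, protein_factor=1.23):
    """Approximate solvent fraction, 1 - 1.23/Vm.

    The 1.23 Å^3/Da factor is the conventional approximation for a typical
    protein; it is an assumption of this sketch.
    """
    return 1.0 - protein_factor / vm


# Cell parameters from Table I; the remaining inputs are assumptions
# described in the lead-in above.
volume = orthorhombic_cell_volume(61.95, 85.17, 172.51)
vm = matthews_coefficient(volume, n_asu_per_cell=4, n_mol_per_asu=2,
                          mass_da=44589.0)
solvent = solvent_fraction(vm)
```

Using a slightly different molecular mass (for example, one that accounts for the expression tag or selenomethionine substitution) shifts both numbers by a few percent, which is one plausible source of the small difference from the reported values.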
The active site is likely to be in the cleft between the N- and C-terminal α/β domains. Highly conserved amino acids with side-chains pointing into the putative active site include seven residues from the Rossmann-like domain (Lys47, Asp189, and the glycine-rich loop 122-SGGGS-126) and five residues from the C-terminal domain (Glu312, Arg325, Asn326, Asp351, and Asn407) [Fig. 2(A)]. Each of these domains contains one highly conserved basic residue (Lys47 in the N-terminal Rossmann-like domain and Arg325 in the C-terminal domain) that could potentially interact with the triphosphate tail of ATP. However, consistent with the nucleotide-binding function of other canonical Rossmann-fold proteins, it is likely that the N-terminal domain binds ATP. The C-terminal domain is possibly involved in substrate binding with the active site located in the cleft between these domains.
Analysis of the crystallographic packing of TM1585 using the PQS server19 indicates a monomer [Fig. 1(B)]. This finding is inconsistent with results from analytical size exclusion chromatography in combination with static light scattering, which suggest a dimer. However, the monomer contains a catalytically competent active site, which suggests that the enzyme can function as a monomer.
A DALI21 structural similarity search using the N-terminal domain of TM1585 (residues 23–249) as a query found remote structural similarity to a number of Rossmann-fold domains. The top DALI hit is to the structure of glutamyl-tRNA reductase from Methanopyrus kandleri (PDB: 1gpj),22 which aligns with a root-mean-square deviation (RMSD) of 3.2 Å and a sequence identity of 12% over 114 Cα atoms. Of particular interest is the similarity to the N-terminal domain (residues 1–55 and 261–368) of glycerate kinase from Neisseria meningitidis (PDB: 1to6, Midwest Structural Genomics Center), which belongs to the other glycerate kinase family.3 The two domains can be structurally aligned with an RMSD of 3.4 Å and a sequence identity of 16% over 101 Cα atoms [Fig. 2(B)]. Although these domains have a similar overall fold, the putative active site residues are not conserved between these two families of glycerate kinases. A DALI21 search using the C-terminal domain of TM1585 as a query failed to find significant similarities to proteins in which all elements of the glycerate kinase C-terminal domain are present, indicating that this domain is a new fold. Remote structural similarity to cobalt precorrin-4-methyltransferase CbiF domain 2 (PDB: 1cbf)23 was found by ProSMoS (Grishin NV, unpublished).24 Both domains display a central, mixed β-sheet of five β-strands (order 12534) surrounded by α-helices. However, in TM1585, the domain includes an extra α-helix at the N-terminus and an extra α/β insertion between the last two β-strands that attaches to the edge of the five β-strands. Although the topological arrangement of the secondary structural elements of these domains is similar, the packing of the connecting helices around the β-sheet differs significantly. Hence, we classify this domain as a new fold in agreement with the SCOP database [Fig. 2(C)].
TM1585 has homologs in archaea, bacteria, and eukaryotes. Models for TM1585 homologs can be accessed at http://www1.jcsg.org/cgi-bin/models/get_mor.pl?key=TM1585. The TM1585 crystal structure reported here reveals a novel fold and is the first structural representative of the glycerate-2-kinase family. Recent biochemical studies (A. Osterman et al., personal communication) have shown that TM1585 has glycerate-2-kinase activity. Mutagenesis studies of active site residues and further crystallographic studies are required to elucidate the mechanism of enzyme function. The information presented here could form the starting point for the rational design of such experiments.
This work was supported by NIH Protein Structure Initiative grants from the National Institute of General Medical Sciences (www.nigms.nih.gov). Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory (SSRL). The SSRL is a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences).
NOTE ADDED IN PROOF
Recent work by Reher et al. (FEMS Microbiol Lett 259 (2006) 113–119), which proposes a 2-phosphoglycerate kinase activity for a related enzyme from Picrophilus torridus, lends further support to our assertion that TM1585 is a glycerate-2-kinase.