The enzyme 5,10-methenyltetrahydrofolate synthetase (MTHFS), also known as 5-formyltetrahydrofolate cyclo-ligase, is involved in folate metabolism. MTHFS catalyzes the obligatory first step in the irreversible intracellular conversion of 5-formyltetrahydrofolate to other reduced folates, a reaction that requires ATP and magnesium. The product, 5,10-methenyltetrahydrofolate, is then interconverted into other reduced folates involved in one-carbon metabolism.1, 2 The reduced tetrahydrofolates (THFs) serve as cofactors that carry one-carbon moieties for the de novo synthesis of purines (supplying carbons 2 and 8 of the purine ring) and thymidylate (through the methylation of dUMP to dTMP), and for the remethylation of homocysteine to methionine.3, 4 Methionine can be adenylated to form S-adenosylmethionine, a cofactor and one-carbon donor for numerous cellular methylation reactions.5–7 The substrate of MTHFS, 5-formyltetrahydrofolate (5-FTHF), is used in chemotherapy (where it is known clinically as Leucovorin), either to rescue patients from methotrexate toxicity or to enhance the effectiveness of 5-fluorouracil. MTHFS from Mycoplasma pneumoniae shares significant sequence homology with human MTHFS. Although sequence information for MTHFS from various organisms is available, no structural information for this family of enzymes has been obtained. To provide a structural basis for understanding the function of the MTHFS protein family, we have determined the crystal structure of M. pneumoniae MTHFS at 2.2 Å resolution.
Materials and Methods.
Primers (Operon, Emeryville, CA) for PCR amplification contained an NdeI restriction site in the forward primer (5′-CATATGGACAAAAATGCCTTAAGAAAA) and a BamHI site in the reverse primer (5′-GGATCCTTATTCATCATTAATAATTAAGTCTAGTTGTACATCC). PCR was performed using Deep Vent Polymerase (New England Biolabs, Inc., Beverly, MA) and M. pneumoniae genomic DNA. The PCR product was cloned into pCR-BluntII-TOPO vector (Invitrogen Corp., Carlsbad, CA), and the gene insert was confirmed by DNA sequencing. The amplified TOPO vector was digested with NdeI and BamHI, and the gene insert was purified by agarose gel electrophoresis and extraction. This insert was ligated into pSKB3 (a gift from Steve Burley, Rockefeller University, New York) that had been digested with NdeI and BamHI, and the ligation product was transformed into DH5α. The DNA sequence of the gene insert was confirmed.
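The primer design described above can be checked with a short script. The sketch below is illustrative only: it uses the two primer sequences as printed, and it assumes (the text does not say so explicitly) that the ATG inside the NdeI site serves as the start codon and that the triplet following the BamHI site in the reverse primer encodes the stop codon on the coding strand. The function names are hypothetical.

```python
# Primer sequences copied from the text, written 5'->3'.
FORWARD = "CATATGGACAAAAATGCCTTAAGAAAA"
REVERSE = "GGATCCTTATTCATCATTAATAATTAAGTCTAGTTGTACATCC"

NDEI = "CATATG"   # NdeI recognition site
BAMHI = "GGATCC"  # BamHI recognition site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def summarize_primers() -> dict:
    """Report restriction sites and the implied start/stop codons."""
    # Assumption: the ATG within CATATG doubles as the start codon.
    coding_start = FORWARD[3:]
    # The reverse primer anneals to the coding strand, so its reverse
    # complement (minus the BamHI site) reads as the 3' end of the gene.
    coding_end = reverse_complement(REVERSE[len(BAMHI):])
    return {
        "forward_has_ndei": FORWARD.startswith(NDEI),
        "reverse_has_bamhi": REVERSE.startswith(BAMHI),
        "start_codon": coding_start[:3],
        "stop_codon": coding_end[-3:],
    }


if __name__ == "__main__":
    for key, value in summarize_primers().items():
        print(f"{key}: {value}")
```

Under these assumptions the script confirms that both sites sit at the 5′ ends of their primers, so NdeI/BamHI digestion releases the insert in a defined orientation.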
Protein expression and purification:
His-tagged M. pneumoniae MTHFS was expressed in Escherichia coli strain BL21(DE3)/pSJS1244 (ref. 8) cells upon induction with 0.5 mM isopropyl-β-D-thiogalactopyranoside. Selenomethionyl (Se-Met) protein was prepared according to the method of Doublié.9 Bacteria were lysed by sonication in lysis buffer (50 mM Hepes, pH 7, 0.5 mM phenylmethylsulfonyl fluoride (PMSF), 1 μg/mL antipain, 1 μg/mL chymostatin, 0.5 μg/mL leupeptin, and 0.7 μg/mL pepstatin A), and cell debris was pelleted by centrifugation at 39,000 × g for 20 min in a Sorvall centrifuge. The lysate was then spun in a Beckman ultracentrifuge Ti45 rotor at 60,000 × g for 20 min at 4°C to remove membrane proteins. The His-tagged M. pneumoniae MTHFS was affinity purified from the soluble fraction using Talon metal affinity resin (Clontech, Palo Alto, CA) according to the procedure recommended by the manufacturer. Elution was achieved with 300 mM imidazole. The eluted sample was dialyzed against 20 mM Tris-Bis-TrisPropane pH 8.0, 10 mM NaCl, 1 mM dithiothreitol (DTT), 1 mM ethylenediaminetetraacetic acid (EDTA). The target protein was bound to a 5 mL HiTrap Q column (Amersham Biosciences, Uppsala, Sweden) and eluted with a linear gradient of 15 column volumes from 0.01 to 0.25 M NaCl in the same buffer. The purity of the expressed protein was determined by sodium dodecylsulfate (SDS) gel electrophoresis, and the molecular weight was confirmed by electrospray mass spectrometry. The protein was concentrated in 20 mM Tris pH 8.0, 100 mM NaCl, 1 mM EDTA, 1 mM DTT to 30 mg/mL.
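For orientation, the anion-exchange gradient described above (a 5 mL column, 15 column volumes, 0.01 to 0.25 M NaCl) corresponds to 75 mL of elution volume. The following minimal sketch computes the nominal NaCl concentration at any point in that gradient. It assumes an ideal linear gradient with no delay volume or mixing effects; the function name and default arguments are illustrative, not part of the published protocol.

```python
def nacl_at_volume(volume_ml: float,
                   column_ml: float = 5.0,
                   gradient_cv: float = 15.0,
                   start_m: float = 0.01,
                   end_m: float = 0.25) -> float:
    """Nominal NaCl molarity after `volume_ml` of an ideal linear gradient.

    Defaults follow the values quoted in the text; system delay volume
    is ignored (an assumption).
    """
    total_ml = column_ml * gradient_cv  # 5 mL x 15 CV = 75 mL
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError(f"volume must lie between 0 and {total_ml} mL")
    fraction = volume_ml / total_ml
    return start_m + fraction * (end_m - start_m)


if __name__ == "__main__":
    for v in (0.0, 25.0, 37.5, 75.0):
        print(f"{v:5.1f} mL -> {nacl_at_volume(v):.3f} M NaCl")
```

Knowing the elution volume of the protein peak, such a calculation gives a rough estimate of the salt concentration at which the protein leaves the column.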
Crystallization conditions were screened with the sparse matrix sampling method10 using the hanging drop vapor diffusion technique at room temperature with commercially prepared reagents (Hampton Research, Laguna Niguel, CA). Se-Met crystals of the M. pneumoniae MTHFS protein were grown by vapor diffusion at 20°C in a solution containing 30 mg/mL protein, 0.2M ammonium sulfate, 0.1M sodium acetate and 25% polyethylene glycol 4000, pH 4.6. All crystals were flash-frozen in a solution containing 0.17M ammonium sulfate, 0.085M sodium acetate, 21.25% polyethylene glycol 4000, 15% polyethylene glycol 400, pH 4.6 and mounted on loops at 100K prior to data collection.
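The cryoprotectant composition quoted above is numerically consistent with mixing 85 parts of the crystallization solution with 15 parts of polyethylene glycol 400. This mixing scheme is an inference from the numbers, not something the text states. A small sketch of the arithmetic, with hypothetical names:

```python
# Crystallization solution as given in the text.
MOTHER_LIQUOR = {
    "ammonium sulfate (M)": 0.2,
    "sodium acetate (M)": 0.1,
    "PEG 4000 (%)": 25.0,
}


def dilute(components: dict, fraction: float) -> dict:
    """Scale every concentration by the volume fraction that is retained."""
    return {name: value * fraction for name, value in components.items()}


def cryo_solution(peg400_percent: float = 15.0) -> dict:
    """Assumed recipe: mother liquor diluted by adding PEG 400 (v/v)."""
    retained = 1.0 - peg400_percent / 100.0
    solution = dilute(MOTHER_LIQUOR, retained)
    solution["PEG 400 (%)"] = peg400_percent
    return solution


if __name__ == "__main__":
    for name, value in cryo_solution().items():
        print(f"{name}: {value:.4g}")
```

The computed values (0.17 M, 0.085 M, 21.25%) match the published cryoprotectant, which supports the assumed recipe without proving it.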
Structure solution and refinement:
X-ray diffraction data sets were collected at one wavelength corresponding to the selenium absorption peak (λ = 0.9793 Å) at the Advanced Light Source (ALS), Lawrence Berkeley National Laboratory, beamline 5.0.2, using the ADSC Quantum 210 CCD detector placed 200 mm from the sample. The data were processed using the programs HKL2000 and SCALEPACK.11 X-ray data statistics are shown in Table I. The program SOLVE12 was used to locate the selenium sites in the crystal and to calculate initial phases at 2.6 Å resolution. The presence of two molecules per asymmetric unit was used to find the non-crystallographic symmetry (NCS) matrix, and two-fold NCS density averaging was carried out using the DM program.13 The six Se sites were located using the program suite CNS.14 The figure-of-merit (FOM) after phase extension reached 0.87 overall and 0.80 in the 2.6 Å resolution shell. Model building was performed using the program O.15 The preliminary model was then refined to 2.2 Å using CNS,14 with 10% of the data randomly selected for free R-factor cross validation. The refinement statistics are shown in Table I. Atomic coordinates have been deposited at the Protein Data Bank under accession code 1SBQ. The programs MOLSCRIPT16 and GRASP17 were used for preparation of the figures.
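Two quantities implied by the data collection parameters above can be worked out directly: the photon energy for λ = 0.9793 Å, and the d-spacing recorded at a given radius on a flat detector 200 mm from the crystal. The sketch below is a generic calculation using E = hc/λ and Bragg's law. The 105 mm radius in the example is a hypothetical value chosen for illustration and is not taken from the text.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # h*c expressed in keV * Angstrom


def photon_energy_kev(wavelength_a: float) -> float:
    """Photon energy in keV for a wavelength in Angstrom (E = hc / lambda)."""
    return HC_KEV_ANGSTROM / wavelength_a


def two_theta_deg(radius_mm: float, distance_mm: float) -> float:
    """Scattering angle 2-theta for a spot at `radius_mm` from the beam
    centre on a flat detector perpendicular to the beam."""
    return math.degrees(math.atan2(radius_mm, distance_mm))


def d_spacing(wavelength_a: float, two_theta: float) -> float:
    """Bragg's law: d = lambda / (2 sin(theta)), with 2-theta in degrees."""
    return wavelength_a / (2.0 * math.sin(math.radians(two_theta) / 2.0))


if __name__ == "__main__":
    wavelength = 0.9793   # Angstrom, from the text
    distance = 200.0      # mm, from the text
    radius = 105.0        # mm, hypothetical radius for illustration
    print(f"energy: {photon_energy_kev(wavelength):.3f} keV")
    angle = two_theta_deg(radius, distance)
    print(f"d at r = {radius} mm: {d_spacing(wavelength, angle):.2f} Angstrom")
```

The energy comes out close to 12.66 keV, in the region of the selenium K absorption edge, as expected for data collected at the Se peak.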
Table I. Crystallographic Data and Refinement Statistics
Contents of Asymmetric Unit
Rsym = Σhkl Σi |Ihkl,i − 〈I〉hkl|/Σ|〈I〉hkl|, where Ihkl,i is the ith measurement of reflection hkl and 〈I〉hkl is the mean of its measurements.
R = Σ||Fo| − |Fc||/Σ|Fo|, where Fo and Fc are the observed and calculated structure factors, respectively.
The R-free value was calculated using a random 10% of the data that were omitted from all stages of the refinement.
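The two residuals defined in the table footnotes can be illustrated with a few lines of code. This is a minimal sketch on toy numbers, not the procedure used by SCALEPACK or CNS. Two assumptions are made: the Rsym denominator counts the mean intensity once per measurement (the common convention for the otherwise unindexed sum), and the R-factor numerator uses the absolute difference of amplitudes. No scaling between observed and calculated amplitudes is applied.

```python
def r_sym(observations: dict) -> float:
    """Rsym over repeated intensity measurements.

    `observations` maps a reflection index (h, k, l) to a list of
    measured intensities for that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        # Assumption: mean intensity summed once per measurement.
        denominator += abs(mean_i) * len(intensities)
    return numerator / denominator


def r_factor(f_obs: list, f_calc: list) -> float:
    """Crystallographic R = sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    toy_obs = {(1, 0, 0): [10.0, 12.0], (0, 1, 0): [5.0, 5.0]}  # toy data
    print(f"Rsym = {r_sym(toy_obs):.4f}")
    print(f"R    = {r_factor([10.0, 20.0], [9.0, 22.0]):.4f}")
```

R-free is the same R-factor expression evaluated only over the reflections held out of refinement, so `r_factor` can be applied to that subset unchanged.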
A PSI-PHI BLAST18 search with the protein sequence of M. pneumoniae MTHFS revealed 97 sequence homologs with E-values below 4E-3. M. pneumoniae MTHFS has 25% sequence identity with human MTHFS, 26% with mouse MTHFS and 24% with rat MTHFS. Figure 1 shows a sequence alignment of M. pneumoniae MTHFS with its closest homologs as well as with human, mouse and rat MTHFS. The alignment reveals a highly conserved region in this protein family, residues 100 to 127, with the strongest conservation in residues 115 to 127. Notably, this region is rich in aromatic residues, containing one phenylalanine and three tyrosine residues.
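Percent identity figures such as those quoted above depend on how the aligned positions are counted. The sketch below shows one common convention, identical residues divided by the number of aligned columns in which neither sequence has a gap. The text does not specify which convention was used, so this is an assumption, and the example sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over gap-free columns of a pairwise alignment.

    Both inputs must be rows of the same alignment (equal length).
    Columns in which either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Invented toy alignment, not MTHFS sequence data.
    print(percent_identity("MKT-AY", "MRTGAY"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gapped alignments.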
The final model includes all residues of M. pneumoniae MTHFS, most of which are well defined in the electron density map. The final model has been refined at 2.2 Å resolution to a crystallographic R-factor of 22.8%. The root mean square (RMS) deviations from ideal stereochemistry are 0.007 Å for bond lengths, 1.3° for bond angles and 0.90° for improper angles. The mean positional error in atomic coordinates for the refined model is estimated to be 0.31 Å by a Luzzati plot.19 Ninety-nine percent of all residues lie in the allowed region of the Ramachandran plot produced with PROCHECK.20
There are two molecules of M. pneumoniae MTHFS in the asymmetric unit. Each molecule is a single-domain α + β protein with approximate dimensions of 40 × 40 × 30 Å³. The backbone representation of the M. pneumoniae MTHFS monomer is shown in Figure 2(a). The central core of the protein is a four-stranded parallel β-sheet flanked by three helices on one side and a single helix on the other side [Fig. 2(a)]. On the edge of the molecule, a three-stranded antiparallel β-sheet extrudes from the core domain [Fig. 2(a)].
Putative active site:
Mapping the conserved sequence motif region (Fig. 1) onto the three-dimensional structure of M. pneumoniae MTHFS, we observed that this region is located at a turn/loop region between a β-strand and an α-helix [Fig. 2(a)]. The side chains of the most conserved residues (115 to 127) are shown in black in the figure. This cluster of amino acid residues forms a cavity or depression in its center [Fig. 2(b)]. Significant electron density was found in the center of the cavity. This density is modeled as a sulfate ion, since the crystal was grown in the presence of sulfate salt and the hydrogen-bonding pattern is consistent with a sulfate. The sulfate ion occupying the cavity forms hydrogen bonds with the main-chain nitrogen atoms of residues Phe 118, Lys 120, Gly 121, Tyr 122 and Tyr 123, as well as with the side chain of residue Arg 115. [At this point we cannot rule out the possibility that the ion may be a phosphate, because inductively coupled argon plasma (ICAP) detected the presence of phosphorus.] The abundance of aromatic residues in this region suggests favorable hydrophobic interactions with the substrate and/or cofactors of the enzyme.
The molecular surface of M. pneumoniae MTHFS is not highly charged [Fig. 2(b)]. Around the putative active site, we observed a channel-like cavity composed of conserved residues 115 to 127 [Fig. 2(b)]. Residues Arg 115 to Lys 120 form the ‘clamp’ of the channel, and residues Gly 121 to Tyr 123 form the base of the channel [Fig. 2(b)].
Comparison to structural homologs:
A Dali21 search using the monomer crystal structure found two structural homologs with Z-scores higher than 6 in the Protein Data Bank22: (i) E. coli D-ribose-5-phosphate isomerase (Rpia)23, 24 (PDB IDs 1LKZ, 1O8B; Z = 8.3); and (ii) Acidaminococcus fermentans glutaconate coenzyme A transferase25 (PDB ID 1POI; Z = 6.1). These structural homologs have low similarity to M. pneumoniae MTHFS at the sequence level, and they were not found using a BLAST18 search against the sequence database. At the structural level, the overall fold of M. pneumoniae MTHFS is similar to that of these homologs, especially in the central β-strand layer region [Fig. 2(c)].
However, the substrate binding and anion binding sites are different. Structure-based sequence alignment indicates that the active site residues in the homolog proteins are not conserved in the MTHFS protein family, and the conserved residues in the MTHFS protein family are not conserved in these structural homolog proteins. As an example, structural alignment with D-ribose-5-phosphate isomerase (Rpia)23, 24 is shown in Figure 2(c). In Rpia, the active site involves residues Asp 81, Asp 84 and Lys 94,23, 24 and these are not conserved in the MTHFS protein family. The cavity in the Rpia binding site does not exist in the M. pneumoniae MTHFS structure [Figs. 2(b,d)]. Furthermore, the putative active site cavity of M. pneumoniae MTHFS does not exist in the Rpia structure [Figs. 2(b,d)]. Although the two protein families possess a similar overall fold, their functional sites appear to be different, presumably due to the different molecular functions they carry out.
M. pneumoniae MTHFS is the first structure to be determined in the MTHFS family. The function of this protein has been confirmed biochemically (manuscript in preparation). The crystal structure of M. pneumoniae MTHFS revealed a fold similar to that of E. coli Rpia and A. fermentans glutaconate coenzyme A transferase, but they appear to have different active sites, implying different enzymatic functions. Combining sequence and structural information, we propose the putative active site of M. pneumoniae MTHFS in this study and suggest that all members of the MTHFS family have the same active site.
We thank the staff of the Advanced Light Source beamlines 5.0.1 and 5.0.2 for assistance during data collection. We are also thankful to Dr. Alexander F. Yakunin (University of Toronto, Toronto, Canada) for the preliminary enzymatic assay, Dr. David King (Howard Hughes Medical Institute and University of California, Berkeley, CA) for mass spectrometry, Hisao Yokota and Barbara Gold for cloning, Marlene Henriquez and Bruno Martinez for expression studies and cell paste preparation, and John-Marc Chandonia for bioinformatics. This work was supported by National Institutes of Health grant GM 62412 to S-H.K.