Crystal structure of thy1, a thymidylate synthase complementing protein from Thermotoga maritima at 2.25 Å resolution



The thy1 gene of Thermotoga maritima encodes a thymidylate synthase-complementing protein with a molecular weight of 26,005 Da and a predicted isoelectric point of 8.2. Thymidylate synthase-complementing proteins (TSCPs) have been implicated in cell survival in the absence of external sources of thymidylate.1 The mechanism of action of this family of enzymes is unknown, but the structural arrangement of the active site described here suggests that it differs significantly from that of the thyA thymidylate synthases. In this article, we report the first crystal structure of a member of this protein family, determined using a semiautomated high-throughput pipeline at the Joint Center for Structural Genomics (JCSG).

Initial crystallization conditions were obtained using 50 nL protein drops mixed with 50 nL mother liquor. Crystals for data collection were obtained from 1 μL sitting drops and diffracted to 2.25 Å resolution at the Stanford Synchrotron Radiation Laboratory (SSRL) beamline 11-1 (Table I). The structure of TM0449 was determined by using the selenomethionine-based, multiple-wavelength anomalous dispersion (MAD) technique (Table I). The final model includes the protein tetramer, 4 flavin-adenine dinucleotide (FAD) molecules, and 274 water molecules. The Ramachandran plot produced by PROCHECK 3.5 shows 92.4% of all residues in the most favored regions, 7.4% in additional allowed regions, and 0.3% in generously allowed regions. No residues lie in disallowed regions.

Table I. Summary of Crystal Parameters, Data Collection, and Refinement Statistics

Crystal characteristics and data statistics
 Space group                        P2₁2₁2₁
 Unit cell parameters               a = 54.20 Å, b = 116.61 Å, c = 141.83 Å
 Contents of asymmetric unit        Four chains of TM0449 in a functional tetramer

Data collection                     λ0 Se                λ1 MAD Se            λ2 MAD Se            λ3 MAD Se
 Wavelength (Å)                     0.980                0.9793               0.9184               0.9792
 Resolution (Å)                     20–2.25              20–2.7               20–2.7               20–2.7
 No. of reflections                 43020                25592                25660                25716
 (No. of observations)              146268               90934                182162               89725
 Completeness (%)                   99.1                 99.7                 99.7                 99.7
 (In highest resolution shell, %)   100.00 (2.27–2.25)   100.0 (2.77–2.70)    99.3 (2.77–2.70)     99.9 (2.77–2.70)
 Mean I/σ(I)
 (In highest resolution shell)      2.3 (2.27–2.25)      4.0 (2.77–2.70)      3.7 (2.77–2.70)      3.6 (2.77–2.70)
 Rsym on I                          0.061                0.051                0.075                0.054
 (In highest resolution shell)      0.391 (2.27–2.25)    0.162 (2.77–2.70)    0.197 (2.77–2.70)    0.194 (2.77–2.70)
 Sigma cutoff                       0.

Model and refinement statistics
 Data set used in structure refinement   λ0 Se
 Resolution range                        20–2.25 Å
 No. of reflections (total)              43369
 No. of reflections (test)               4367
 Completeness (total)                    99%
 Cutoff criteria                         |F| > 2.0

Stereochemical parameters
 Restraints (RMS observed)
  Bond angle                             1.3°
  Bond length                            0.007 Å
 Average isotropic B-value               30.0 Å²
 Luzzati mean coordinate error           0.272 Å

  1. Rsym = Σ|Ii − 〈Ii〉|/Σ|Ii|, where Ii is the scaled intensity of the ith measurement and 〈Ii〉 is the mean intensity for that reflection.

  2. Rcryst = Σ| |Fobs| − |Fcalc| |/Σ|Fobs|, where Fcalc and Fobs are the calculated and observed structure factor amplitudes, respectively.

  3. Rfree = as for Rcryst, but for 10% of the total reflections chosen at random.
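
The residuals defined in the notes to Table I can be illustrated with a short Python sketch. This is only a minimal illustration of the formulas, not the code used by SCALA or CNS; the function names and the toy numbers are hypothetical and do not come from the TM0449 data sets. In practice Rfree is computed with the same formula as Rcryst, but over a test set of reflections that was excluded from refinement.

```python
import random
from collections import defaultdict


def r_sym(measurements):
    """Rsym = sum|I_i - <I>| / sum|I_i| over all measurements.

    measurements: iterable of (hkl, intensity) pairs; hkl is any hashable
    reflection index, intensity is the scaled intensity of one measurement.
    """
    by_reflection = defaultdict(list)
    for hkl, intensity in measurements:
        by_reflection[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in by_reflection.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum| |Fobs| - |Fcalc| | / sum|Fobs| for paired amplitudes."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def pick_test_set(n_reflections, fraction=0.10, seed=0):
    """Randomly flag a fraction of reflection indices as the Rfree test set."""
    rng = random.Random(seed)
    n_test = round(n_reflections * fraction)
    return set(rng.sample(range(n_reflections), n_test))


# Toy data (hypothetical, not taken from the TM0449 data sets).
toy_measurements = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
                    ((0, 1, 0), 50.0), ((0, 1, 0), 50.0)]
toy_f_obs = [10.0, 20.0, 30.0, 40.0]
toy_f_calc = [9.0, 22.0, 30.0, 38.0]

print(f"Rsym   = {r_sym(toy_measurements):.3f}")
print(f"Rcryst = {r_factor(toy_f_obs, toy_f_calc):.3f}")
```

Applying `r_factor` only to the reflections returned by `pick_test_set` would give the corresponding Rfree-style value for the toy data.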

TM0449 is a tetramer composed of identical 220-residue subunits with a unique tertiary structure. Each monomer consists of a central domain and a long α-helix. The central domain is made up of a five-stranded antiparallel β-sheet with strand order β1-β2-β3-β5-β4, flanked by six α-helices on one side of the sheet. The dimer interface is formed by a partial stacking of the β-sheets, whereas stacking of the two long helices between the dimers results in tetramer formation [Figs. 1(A) and (B)]. A structural similarity search performed with the DALI server, using the coordinates of TM0449, failed to detect significant similarity to any other protein structure, indicating that TM0449 exhibits a novel fold.

Figure 1.

A: Ribbon diagram of Thermotoga maritima thy1 thymidylate synthase complementing protein (TM0449). Each monomer of the tetramer is shown in a separate color, and the FAD molecules are shown in a stick representation. B: Ribbon diagram of the thy1 monomer.

A large active site pocket is formed in the center of the tetramer, with four channels running along the molecular interfaces. The four FAD molecules, one per monomer, are located in the active site, with the AMP moieties pointing toward the central cavity and the riboflavin moieties of each FAD molecule pointing toward the surface of the protein. The density for the AMP moieties is strong (average B value = 41 Å²), but density for the flavin moieties is much weaker, indicating that the flavin rings are not as well ordered (average B value = 60 Å²).
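
To give a feel for what the two average B values mean, the standard isotropic relation B = 8π²⟨u²⟩ can be used to convert them into root-mean-square atomic displacements. The Python sketch below is an illustrative calculation only; the function name is hypothetical, and the conversion treats the B value as purely a displacement term, whereas refined B values also absorb static disorder and partial occupancy.

```python
import math


def rms_displacement(b_iso):
    """Return sqrt(<u^2>) in Å for an isotropic B value in Å^2.

    Uses the standard relation B = 8 * pi^2 * <u^2>.
    """
    return math.sqrt(b_iso / (8.0 * math.pi ** 2))


# Average B values quoted in the text for the two halves of the bound FAD.
for label, b_value in (("AMP moieties", 41.0), ("flavin moieties", 60.0)):
    print(f"{label}: B = {b_value:.0f} Å^2, "
          f"rms displacement ≈ {rms_displacement(b_value):.2f} Å")
```

Under this simple model the flavin rings show a root-mean-square displacement of roughly 0.9 Å compared with roughly 0.7 Å for the AMP moieties.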

Two observations indicate that the mechanism of thy1 proteins differs significantly from that of thyA. First, the proposed active site lacks the invariant catalytic base cysteine found in thyA proteins.2, 3 Second, TM0449 has a tightly bound FAD in the active site, suggesting that, unlike thyA, thy1 enzymes are capable of both methyl transfer and reduction of a substrate or cofactor. It is tempting to speculate that the mechanism of this enzyme is similar to the reductive methylation of uridine catalyzed by the folate-dependent ribothymidyl synthase of S. faecalis.4 This enzyme uses methylenetetrahydrofolate as the 1-carbon donor and FADH2 as the reducing agent. Further studies are in progress to test this hypothesis and to fully understand catalysis by TM0449.

Materials and Methods.

Protein production: The gene encoding the Thy1 thymidylate synthase complementing protein (TIGR: TM0449; GenBank: NP_228259) was PCR amplified by using Pfu DNA polymerase (Stratagene) from Thermotoga maritima strain MSB8 genomic DNA with primer pairs encoding the predicted 5′- and 3′-ends of TM0449. The PCR product was cloned into plasmid pMH1, which encodes a purification tag consisting of the amino acids MGSDKIHHHHHH at the amino terminus of the full-length protein. The cloning junctions were confirmed by sequencing. Protein expression was performed in selenomethionine-containing media using the Escherichia coli methionine auxotrophic strain DL41. Bacteria were lysed by sonication after a freeze-thaw procedure in Lysis Buffer (50 mM Tris, pH 7.9, 50 mM NaCl, 1 mM MgCl2, 3 mM DL-methionine, 0.25 mM TCEP, 1 mg/mL lysozyme), and cell debris was pelleted by centrifugation at 3600 × g for 60 min. The soluble fraction was applied to a nickel chelate resin (Invitrogen) previously equilibrated with Equilibration Buffer (20 mM Tris, pH 7.9, 0.25 mM TCEP, 10% v/v glycerol, 3 mM DL-methionine). The resin was washed with Equilibration Buffer containing 40 mM imidazole, and the protein was eluted with Equilibration Buffer containing 200 mM imidazole. Buffer exchange was performed to remove imidazole from the protein eluate, and the protein in Buffer Q (20 mM Tris, pH 7.9, 25 mM NaCl, 5% v/v glycerol, 0.25 mM TCEP) was applied to a Resource Q column (Pharmacia). Protein was eluted by using a linear gradient to 400 mM NaCl. Appropriate fractions were further purified by size-exclusion chromatography using S200 resin (Pharmacia) with isocratic elution in SEC Buffer (20 mM Tris, pH 7.9, 150 mM NaCl, 0.25 mM TCEP). The protein was concentrated to ∼10 mg/mL by centrifugal ultrafiltration (Millipore).

Crystallization: The protein was crystallized by using the vapor diffusion method with 1 μL sitting drops. The crystallization buffer contained 50% (v/v) PEG-200 as the precipitant in 0.1 M HEPES at pH 7.5, with a final protein concentration of 10 mg/mL. Crystals grew within 28 days at 20°C and were indexed in the orthorhombic space group P2₁2₁2₁ (Table I).
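
As a rough plausibility check on the cell contents listed in Table I, the Matthews coefficient and an approximate solvent content can be estimated from the unit cell and the molecular weight. The Python sketch below makes several assumptions that are not stated in the text: it uses the 26,005 Da molecular weight quoted for the native sequence (ignoring the purification tag and selenium substitution), the four symmetry operators of space group P2₁2₁2₁, and the conventional factor of 1.23 in the solvent estimate, which presumes a typical protein partial specific volume. The function names are hypothetical.

```python
def orthorhombic_volume(a, b, c):
    """Unit cell volume in Å^3 for an orthorhombic cell (all angles 90°)."""
    return a * b * c


def matthews_coefficient(cell_volume, n_sym_ops, chains_per_asu, mw_da):
    """Matthews coefficient Vm in Å^3/Da: cell volume per dalton of protein."""
    return cell_volume / (n_sym_ops * chains_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, using the conventional 1.23/Vm estimate."""
    return 1.0 - 1.23 / vm


# Cell edges from Table I; molecular weight from the text (assumed per chain).
volume = orthorhombic_volume(54.20, 116.61, 141.83)
vm = matthews_coefficient(volume, n_sym_ops=4, chains_per_asu=4, mw_da=26005.0)

print(f"Unit cell volume: {volume:.0f} Å^3")
print(f"Matthews coefficient: {vm:.2f} Å^3/Da")
print(f"Estimated solvent content: {100.0 * solvent_fraction(vm):.0f}%")
```

Under these assumptions the estimate comes out near 2.15 Å³/Da, or roughly 43% solvent, which is compatible with four chains in the asymmetric unit.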

Data collection: Multiple-wavelength anomalous dispersion (MAD) data were collected at SSRL (Stanford, CA) on beamline 9-2 at three wavelengths (Table I). A higher resolution data set was collected at beamline 11-1 and used for refinement (Table I). All data sets were collected at 100 K by using a Quantum 4 CCD detector. Data were reduced by using Mosflm5 and then scaled with the program SCALA from the CCP4 suite.6 Data statistics are summarized in Table I.
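
The average multiplicity of each data set follows directly from the reflection and observation counts in Table I. The short Python sketch below shows the arithmetic; the counts are transcribed from Table I, and the dictionary keys and function name are hypothetical labels introduced only for this illustration.

```python
def multiplicity(n_observations, n_unique):
    """Average number of measurements per unique reflection."""
    return n_observations / n_unique


# (observations, unique reflections) per data set, transcribed from Table I.
datasets = {
    "lambda0 Se": (146268, 43020),
    "lambda1 MAD Se": (90934, 25592),
    "lambda2 MAD Se": (182162, 25660),
    "lambda3 MAD Se": (89725, 25716),
}

for name, (n_obs, n_unique) in datasets.items():
    print(f"{name}: multiplicity ≈ {multiplicity(n_obs, n_unique):.1f}")
```

On these counts the λ2 data set was measured about twice as redundantly as the other three.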

Structure solution and refinement: The structure was solved to 2.7 Å resolution by using SnB,7 the program MLphare from the CCP4 suite,6 and SOLVE.8 Fourfold noncrystallographic symmetry averaging and phase extension with the program DM were used to improve the experimental phases to 2.25 Å resolution. An initial model was built by using the ARP/wARP package.9 The structure refinement was performed by using CNS.10 Refinement statistics are summarized in Table I.

The final model contains four polypeptide chains. Five methionine residues (residues 1, 17, 33, 49, and 168) in each chain were replaced by selenomethionine. The amino-terminal purification tags, including the hexa-His sequence, were excluded from the model because no electron density for these residues was found in the maps. Other residues not included in the model due to lack of electron density were as follows: 32–38, 89–94, and 216–220 in chain A; 32–37 and 93–95 in chain B; 33–36 and 92–93 in chain C; and 32–36 and 216–220 in chain D.

Validation and deposition: Analysis of the stereochemical quality of the model was performed by using the JCSG Validation Central suite, which integrates seven validation tools: Procheck 3.5.4, SFcheck 4.0, Prove 2.5.1, ERRAT, WASP, DDQ 2.0, and Whatcheck. The Validation Central suite is accessible online. Atomic coordinates of the final model and experimental structure factors of TM0449 have been deposited with the RCSB and are accessible under the code 1KQ4.


This work was supported by NIH Protein Structure Initiative grant P50-GM62411 from the National Institute of General Medical Sciences. Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory, a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health, National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences.

Note Added in Proof.

An independent identification of Thy1 as part of the ThyX family of enzymes, which carry out an alternative flavin-dependent mechanism for thymidylate synthesis, was published recently by Myllykallio et al. (Science 2002;297:105–107) and reviewed in conjunction with the JCSG crystal structure of Thy1 by Murzin (Science 2002;297:61–62).