Crystal structure of a tandem cystathionine-β-synthase (CBS) domain protein (TM0935) from Thermotoga maritima at 1.87 Å resolution


The TM0935 gene of Thermotoga maritima encodes a tandem cystathionine-β-synthase (CBS) domain protein, with a molecular weight of 16,425 Da (residues 1–145) and a calculated isoelectric point of 5.1. TM0935 shares distant sequence homology with the tandem CBS domain of inosine monophosphate dehydrogenase (IMPDH). CBS domains are small intracellular modules of unknown function, usually found in two or four copies within a protein. Pairs of CBS domains dimerize to form a stable globular domain.1 IMPDH contains a tandem CBS domain inserted within a loop of the TIM barrel structure of its enzymatic domain; however, the CBS domains are not needed for enzymatic activity.2 The region containing the CBS domains is involved in regulation by S-AdoMet, which suggests that CBS domains bind an as-yet-unidentified small adenosyl-like molecule that regulates the activity of attached enzymatic or other domains.3, 4 CBS domains are also found in the intracellular regions of a number of different integral membrane proteins; for example, two CBS domains are found in intracellular loops of several voltage-gated chloride channels and magnesium transporters.1 Here, we report the crystal structure of TM0935 determined using the semiautomated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG).5

The structure of TM0935 [Fig. 1(A)] was determined to 1.87 Å resolution using the multiwavelength anomalous dispersion (MAD) method. Data collection, model, and refinement statistics are summarized in Table I. The final model includes 1 protein molecule (residues 1–70 and 77–145 plus 3 residues from the N-terminal His-tag) and 223 water molecules. No interpretable electron density was observed for residues 71–76. The Matthews' coefficient (Vm) for TM0935 is 2.40 Å³/Da, and the estimated solvent content is 48.3%. The Ramachandran plot, produced by PROCHECK 3.4,6 shows that 88% of the residues are in the most favored regions and 12% are in additional allowed regions.
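The relationship between the Matthews' coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration, not the authors' calculation: the function names are hypothetical, the constant 1.23 Å³/Da is the commonly used Matthews value and is an assumption here, and the protein mass used by the authors for Vm is not stated, so the sketch does not attempt to reproduce the reported 2.40 Å³/Da from the cell dimensions.

```python
def cell_volume(a, b, c):
    """Volume (Å^3) of a unit cell whose angles are all 90°, e.g. tetragonal."""
    return a * b * c


def matthews_coefficient(volume, n_molecules, mass_da):
    """Vm = V / (N * M): cell volume per dalton of protein, in Å^3/Da.

    n_molecules is the total number of protein molecules in the unit cell,
    mass_da the mass of one molecule in daltons.
    """
    return volume / (n_molecules * mass_da)


def solvent_fraction(vm, k=1.23):
    """Estimated solvent fraction, 1 - k / Vm.

    k = 1.23 Å^3/Da is an assumed constant (typical protein density);
    other values shift the result slightly.
    """
    return 1.0 - k / vm


# Cell dimensions from Table I (a = b = 45.17 Å, c = 177.20 Å)
volume = cell_volume(45.17, 45.17, 177.20)
print(f"unit cell volume: {volume:.0f} Å^3")

# Solvent fraction implied by the reported Vm of 2.40 Å^3/Da
print(f"solvent fraction: {solvent_fraction(2.40):.3f}")
```

With the assumed constant, a Vm of 2.40 Å³/Da corresponds to a solvent content of roughly 49%, in line with the reported 48.3%; the small difference depends on the constant chosen.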

Figure 1.

Crystal structure of TM0935. (A) Stereo ribbon diagram of Thermotoga maritima TM0935 color coded from N-terminus (blue) to C-terminus (red) showing the domain organization. The helices (H1–H6), β-strands (β1–β5), and the disordered residues 71–76 (dashed line) are indicated. Figure 1A produced with PYMOL (DeLano Scientific LLC). (B) Diagram showing the secondary structure elements in TM0935 superimposed on its primary sequence. β-Hairpins are depicted in red and the β-strands making up the two β-sheets (A and B) are labeled. The disordered region is depicted as a dashed line with the corresponding sequence in brackets. Figure 1B adapted from PDBsum (

Table I. Summary of Crystal Parameters, Data Collection, and Refinement Statistics for TM0935 (PDB: 1o50)
  1. ESU = Estimated overall coordinate error.12, 17

  2. $R_{sym} = \sum |I_i - \langle I_i \rangle| / \sum |I_i|$, where $I_i$ is the scaled intensity of the ith measurement, and $\langle I_i \rangle$ is the mean intensity for that reflection.

  3. $R_{cryst} = \sum \big| |F_{obs}| - |F_{calc}| \big| / \sum |F_{obs}|$, where $F_{calc}$ and $F_{obs}$ are the calculated and observed structure factor amplitudes, respectively.

  4. Rfree = as for Rcryst, but for 5.1% of the total reflections chosen at random and omitted from refinement.

Space group: P41212
Unit cell parameters: a = b = 45.17 Å, c = 177.20 Å, α = β = γ = 90°

| Data set | λ0 | λ1 MAD Se | λ2 MAD Se |
| --- | --- | --- | --- |
| Wavelength (Å) | 0.9700 | 0.9793 | 0.9497 |
| Resolution range (Å) | 31.63–1.83 | 32.39–2.50 | 32.39–2.50 |
| Number of observations | 54,281 | 29,626 | 30,878 |
| Number of reflections | 16,963 | 6,550 | 6,520 |
| Completeness (%) | | | |
| Completeness in highest resolution shell (%) | 90.6 | 95.8 | 95.3 |
| Mean I/σ(I) | | | |
| Mean I/σ(I) in highest resolution shell | | | |
| Rsym on I | 0.072 | 0.084 | 0.089 |
| Rsym on I in highest resolution shell | 0.622 | 0.426 | 0.428 |
| Sigma cutoff | 0.0 | 0.0 | 0.0 |
| Highest resolution shell (Å) | 1.88–1.83 | 2.63–2.50 | 2.63–2.50 |

Model and refinement statistics

| Parameter | Value |
| --- | --- |
| Resolution range (Å) | 31.43–1.87 |
| Data set used in refinement | λ0 |
| No. of reflections (total) | 15,974 |
| Cutoff criteria | \|F\| > 0 |
| No. of reflections (test) | 813 |
| Completeness (% total) | 99.4 |
| Rcryst | 0.191 |
| Rfree | 0.255 |
| Stereochemical restraints (RMS observed): bond length | 0.018 Å |
| Stereochemical restraints (RMS observed): bond angle | 1.55° |
| Average isotropic B value | 32.3 Å² |
| ESU based on R value | 0.14 Å |
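
The R-factor definition given in the table footnotes, and two simple ratios that follow from the tabulated counts, can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and the structure factor amplitudes are invented toy values, not data from this structure.

```python
def r_factor(f_obs, f_calc):
    """Sum of | |Fobs| - |Fcalc| | divided by the sum of |Fobs|.

    Rcryst uses the working reflections; Rfree applies the same
    formula to the test reflections omitted from refinement.
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def multiplicity(n_observations, n_unique):
    """Average number of measurements per unique reflection."""
    return n_observations / n_unique


# Toy amplitudes, invented for illustration only
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
print(f"toy R-factor: {r_factor(f_obs, f_calc):.3f}")

# Counts taken from Table I
print(f"test set: {percent(813, 15974):.1f}% of reflections")
print(f"multiplicity (λ0): {multiplicity(54281, 16963):.1f}")
```

The test-set percentage computed from the tabulated counts (813 of 15,974 reflections) agrees with the 5.1% quoted in the Rfree footnote.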

The TM0935 monomer is composed of 5 β-strands (β1–β5), organized in 2 β-sheets (A, B), and 6 α-helices (H1–H6) [Fig. 1(A)]. The total β-strand and α-helical content is 19.1% and 39.9%, respectively. TM0935 folds into an α/β superdomain with internal symmetry that contains 2 individual CBS domains. CBS1 consists of 3 α-helices (H1, H5, H6) and an antiparallel β-sheet composed of β-strands β4 and β5. CBS2 consists of 3 α-helices (H2–H4) and an antiparallel β-sheet composed of β-strands β1–β3 [Fig. 1(A and B)]. The two domains share 20% sequence identity over 49 residues, whose Cα atoms can be superimposed with a root-mean-square deviation (RMSD) of 1.14 Å. The crystallographic packing in the TM0935 structure suggests that a dimer is the biologically relevant oligomeric form. The dimer is formed through interactions of CBS1 with CBS2 of the other subunit, related by crystallographic 2-fold symmetry [Fig. 2(A)]. The dimer interface corresponds to interactions between α-helices H2 and H6, and H3 and H5, with a buried surface area of 2210 Å² per monomer.
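
The RMSD quoted for the two CBS domains is the root-mean-square distance between paired Cα atoms after superposition. A minimal sketch of that final step is given below; it assumes the two coordinate sets have already been superposed and paired (the optimal superposition itself is not shown), and the function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally long lists of (x, y, z) coordinates.

    Assumes atom i in coords_a is paired with atom i in coords_b and that
    the two sets are already superposed in a common frame.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates, invented for illustration: every atom is displaced by 1 Å
domain_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
domain_2 = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(f"RMSD: {rmsd(domain_1, domain_2):.2f} Å")
```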

Figure 2.

(A) Ribbon diagram of the TM0935 dimer. (B) Ribbon diagram of a superposition of TM0935 (rainbow) and the CBS domain (residues 87–214) of IMPDH from Streptococcus pyogenes (gray). Helices and β-strands are indicated for TM0935. Figures produced with PYMOL (DeLano Scientific LLC).

A structural similarity search performed with the coordinates of TM0935 using the DALI server7 did not yield structural homologues, but the Fold and Function Assignment System (FFAS)8 indicates structural similarity to the CBS domain (residues 87–214) of IMPDH from Streptococcus pyogenes [Protein Data Bank (PDB) code: 1ZFJ].2 Subsequently, the structural alignment between TM0935 and residues 87–214 of IMPDH from Streptococcus pyogenes was calculated with TOP.9 The RMSD is 1.7 Å over 95 aligned residues with 19% sequence identity. The main differences between TM0935 and IMPDH from S. pyogenes are spatial shifts of helices (H1, H5, and H6), as well as a 15-amino-acid insertion (residues 65–80) after helix H3 in TM0935 [Fig. 1(B)].
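
Sequence identity over aligned residues, as quoted above, is the fraction of aligned positions carrying the same residue in both sequences. The sketch below shows one common way to compute it from a gapped pairwise alignment; it is an illustration under stated assumptions (gap columns are excluded from the denominator, the function name is hypothetical, and the sequences are invented), not the procedure used by TOP or FFAS.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences contribute a residue
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no aligned residue pairs")
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# Toy alignment, invented for illustration: 5 aligned columns, 4 identical
print(percent_identity("MKV-LA", "MRVGLA"))
```

By this definition, 19% identity over 95 aligned residues corresponds to about 18 identical positions.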

According to FFAS,8 TM0935 has at least 9 distant homologues in the T. maritima proteome: TM0829 with 24% sequence identity, TM0892 with 21% sequence identity, TM1140 with 24% sequence identity, and TM0587 with 17% sequence identity (all are hypothetical proteins), TM0845 (hemolysin-related protein) with 18% sequence identity, TM1347 (guaB) with 19% sequence identity, TM1354 (dehydrogenase-related) with 17% sequence identity, TM0715 (transferase-related) with 19% sequence identity, and TM1161 (MgtE) with 20% sequence identity. Models for TM0935 homologues can be accessed at

The structure reported here, determined by X-ray crystallography using the MAD method, is the first structure of an independent tandem CBS-domain protein from T. maritima. The information reported here, in combination with further biochemical and biophysical studies, will yield valuable insights into both the functional determinants of this protein family and the thermostability of these organisms.

Materials and Methods.

Protein production and crystallization:

TM0935 (TIGR: TM0935; Swissprot: Q9X033) was amplified by polymerase chain reaction (PCR) from T. maritima strain MSB8 genomic DNA using PfuTurbo (Stratagene) and primer pairs encoding the predicted 5′- and 3′-ends of TM0935. The PCR product was cloned into plasmid pMH1, which encodes expression and purification tags consisting of the amino acids MGSDKIHHHHHH at the amino terminus of the full-length protein. The cloning junctions were confirmed by sequencing. Protein expression was performed in selenomethionine-containing medium using the Escherichia coli methionine auxotrophic strain DL41. For the expression of the native protein, a modified Terrific Broth medium [24 g/L yeast extract, 12 g/L tryptone, 1% (v/v) glycerol, 50 mM 3-(N-morpholino)propanesulfonic acid (MOPS) pH 7.6] was used. Lysozyme was added to the culture at the end of fermentation to a final concentration of 1 mg/mL. Bacteria were lysed by sonication after a freeze-thaw procedure in Lysis Buffer [50 mM Tris pH 7.9, 50 mM NaCl, 1 mM MgCl2, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP)], and the cell debris was pelleted by centrifugation at 3400 × g for 60 min. The soluble fraction was applied to a metal chelate affinity resin (Amersham Biosciences) previously charged with nickel and equilibrated with Equilibration Buffer [50 mM potassium phosphate pH 7.8, 0.25 mM TCEP, 10% (v/v) glycerol, 300 mM NaCl] containing 20 mM imidazole. The resin was washed with Wash Buffer [50 mM potassium phosphate pH 7.8, 300 mM NaCl, 40 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP], and the protein was eluted with Elution Buffer [20 mM Tris pH 7.9, 300 mM imidazole, 10% (v/v) glycerol, 0.25 mM TCEP]. Buffer exchange was performed to remove imidazole from the eluate, and the protein in Buffer Q [20 mM Tris pH 7.9, 5% (v/v) glycerol, 0.25 mM TCEP] containing 50 mM NaCl was applied to a Resource Q column (Amersham Biosciences) previously equilibrated with the same buffer. Protein was eluted using a linear gradient of 50–500 mM NaCl in Buffer Q, and appropriate fractions were pooled. Protein was buffer exchanged into size exclusion chromatography (SEC) buffer (20 mM Tris pH 7.9, 150 mM NaCl, 0.25 mM TCEP) and concentrated for crystallization assays to 24 mg/mL by centrifugal ultrafiltration (Millipore). The protein was crystallized using the nanodroplet vapor diffusion method10 with standard JCSG crystallization protocols.5 Two crystals were used in the structure solution. The first crystal, of native protein, was obtained using a solution of 0.2 M ammonium formate, 20% polyethylene glycol (PEG)-3350, pH 6.6. The second crystal, of selenomethionine-substituted protein, was obtained from a solution containing 20% PEG 6000, 1.0 M LiCl, and 0.1 M 2-(N-morpholino)ethanesulfonic acid (MES) at pH 6.0. The crystals were indexed in the tetragonal space group P41212 (Table I).

Data collection:

Diffraction data were collected at Stanford Synchrotron Radiation Laboratory (SSRL, Stanford, CA) using the BLU-ICE11 data collection environment. Data from the native crystal were collected on beamline 9-1 with a MAR-345 image plate detector (λ0). Anomalous diffraction data from the selenomethionine-substituted crystal were collected on beamline 11-1 using a Quantum 315 charge-coupled device (CCD) detector at wavelengths corresponding to the inflection point (λ1 MAD Se) and the high-energy remote (λ2 MAD Se) of a selenium MAD experiment (Table I). Both data sets were collected at 100 K. Data were integrated and reduced using Mosflm12 and then scaled with the program SCALA from the CCP4 suite.9 Data statistics are summarized in Table I.
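
The wavelengths listed in Table I can be converted to photon energies to see how they sit relative to the selenium K absorption edge (close to 12.66 keV). The sketch below uses the standard relation E = hc/λ; the function name is hypothetical and the conversion is offered only as an illustration of the wavelength choice, not as part of the data collection protocol.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def wavelength_to_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in Å (E = hc / λ)."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths from Table I
for label, wavelength in [("λ0", 0.9700),
                          ("λ1 MAD Se", 0.9793),
                          ("λ2 MAD Se", 0.9497)]:
    energy = wavelength_to_energy_kev(wavelength)
    print(f"{label}: {wavelength:.4f} Å = {energy:.3f} keV")
```

The inflection-point wavelength (λ1) corresponds to roughly 12.66 keV, at the selenium K edge, while the remote wavelength (λ2) lies at higher energy, above the edge.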

Structure solution and refinement:

The structure was determined from the selenomethionine data by MAD phasing using the CCP4 suite9 and SOLVE/RESOLVE.13 The initial structure was phase extended and autotraced against the 1.87 Å native data set using ARP/wARP.14 Structure refinement was performed using REFMAC5,9 O,15 and Xfit.16 Refinement statistics are summarized in Table I. The final model includes 1 protein molecule (residues 1–70 and 77–145 plus 3 residues from the N-terminal His-tag) and 223 water molecules in the asymmetric unit. No electron density was observed for residues 71–76.

Validation and deposition:

Analysis of the stereochemical quality of the models was accomplished using the JCSG Validation Central suite, which integrates 7 validation tools: PROCHECK 3.5.4, SFCHECK 4.0, PROVE 2.5.1, ERRAT, WASP, DDQ 2.0, and WHATCHECK. The Validation Central suite is accessible at Atomic coordinates of the final model and experimental structure factors of TM0935 have been deposited with the PDB and are accessible under the code 1o50.


Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory, a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences).