X-ray crystal structure of MTH938 from Methanobacterium thermoautotrophicum at 2.2 Å resolution reveals a novel tertiary protein fold



We have determined the crystal structure of MTH938 (Fig. 1), a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mthe) genome (DNA bases 843,263–862,747),¹ at 2.2 Å resolution by Se-Met multiwavelength anomalous diffraction (MAD) techniques. Se-Met-labeled MTH938 crystallized in space group P4₁2₁2 with one dimer per asymmetric unit. Each monomer of 111 amino acid residues measures about 26 × 30 × 32 Å³. A Dali search² with this MTH938 structure found no significant structural similarity (highest Z-score of 2.7) to any protein of known structure. The crystal structure of MTH938 thus reveals a new tertiary fold consisting of three β-sheets and three α-helices (Fig. 1). There is a disulfide bond between residues Cys 5 and Cys 87 in each monomer. Because Mthe is an anaerobic archaeon and the cysteine pair is not conserved in the amino acid sequence alignment (Fig. 1), the structural and functional significance of the disulfide bond is uncertain. Interestingly, the only eukaryotic homolog in this sequence cluster is an unnamed human protein, suggesting possible lateral gene transfer into the human genome.

Figure 1.

Results of amino acid sequence similarity search using iterative PsiBlast¹¹ on MTH938. The secondary structural elements of MTH938 are indicated above the aligned sequences. MTH938 (gi|7482721), a hypothetical protein from Methanobacterium thermoautotrophicum (strain Delta H); Pyro_abyssi (gi|7518282), a hypothetical protein PAB1927 from Pyrococcus abyssi (strain Orsay); Pyro_horikoshii (gi|7519171), a hypothetical protein PH1505 from Pyrococcus horikoshii; Arch_fulgidus (gi|7482959), a conserved hypothetical protein AF0029 from Archaeoglobus fulgidus; Xylella (gi|9104874), a conserved hypothetical protein from Xylella fastidiosa; and Homo_sapiens (gi|10437033), an unnamed protein from Homo sapiens. The ribbon diagram shows the spatial arrangement of the structural elements of MTH938. The three α-helices are colored orange, the larger five-strand mixed sheet red, the three-strand antiparallel sheet cyan, and the small two-strand parallel sheet green. Amino acid residue numbers are marked at frequent intervals.

The two larger sheets, one from each monomer, associate as a ten-strand mixed β-sheet [Fig. 2(a)] that forms the base of a cleft [Fig. 2(b)]. Molecular modeling and electrostatic potential calculations³ suggest that this cleft could bind double-stranded nucleic acid, with interacting elements from αA and the tip of β5 of either subunit of the MTH938 dimer. The dimer interface surface area of 262 Å², however, corresponds to only about 5.5% of the surface area of a monomer. Dynamic light scattering and gel filtration chromatography also indicate that MTH938 is monomeric in solution. Further biochemical and structural investigations of this protein are in progress.
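As a quick consistency check on the interface figures quoted above, the monomer surface area implied by the two reported numbers can be back-calculated. This is only an arithmetic sketch: the function name is hypothetical, and the text does not state how the surface areas were computed or whether the 262 Å² is counted per monomer.

```python
# Back-calculate the monomer surface area implied by the reported
# interface area (262 Å^2) and its reported share of the monomer surface (~5.5%).
# Both inputs come from the text; the function name is illustrative only.

def implied_monomer_area(interface_area_A2: float, fraction_of_monomer: float) -> float:
    """Return the monomer surface area (Å^2) implied by interface / fraction."""
    if not 0.0 < fraction_of_monomer <= 1.0:
        raise ValueError("fraction_of_monomer must be in (0, 1]")
    return interface_area_A2 / fraction_of_monomer


monomer_area = implied_monomer_area(262.0, 0.055)
print(f"Implied monomer surface area: {monomer_area:.0f} Å^2")
```

Under these assumptions the implied monomer surface is a little under 5,000 Å², so the contact buries only a small part of each subunit, in line with the solution data indicating a monomer.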

Figure 2.

a: Ribbon diagram showing an MTH938 dimer. b: The electrostatic surface of the dimer in approximately the same orientation as in a. Selected surface amino acid residues are labeled.


DNA from M. thermoautotrophicum, bases 843,263 to 862,747, section 74 of 148, was cloned into expression vector pET15b and transformed into Escherichia coli BL21-DE3 cells. The selenomethionine derivative of MTH938 was prepared following a published protocol.⁴ Purified Se-Met-labeled MTH938 containing a 10-amino-acid N-terminal linker with a hexa-His tag was concentrated to about 10 mg/ml in 20 mM Tris-HCl, pH 8.0, with 100 mM NaCl and 5 mM β-mercaptoethanol.

Crystals grown in hanging drops containing 20% PEG 3350, 0.2 M ammonium chloride, and 0.1 M sodium cacodylate at pH 6.2 were used for X-ray diffraction data collection. Diffraction intensity data (Table I) were collected at the Advanced Photon Source (APS) Beamline 14BM-D, Argonne National Laboratory, from a single frozen crystal (100 K) at three wavelengths: the peak (λ1) and inflection point (λ2) of the Se K-edge, and a higher-energy remote wavelength (λ3). The data were processed and scaled to 2.2 Å resolution using Denzo and Scalepack,⁵ respectively. The X-ray data statistics are summarized in Table I. Four Se sites, corresponding to two molecules per asymmetric unit, were located by direct methods as implemented in SnB 2.1,⁶ and MAD phases were calculated to 2.7 Å resolution from the anomalous signal of the Se sites using SOLVE version 1.18,⁷ giving a figure of merit (FOM) of 0.69. The MAD phases were further improved and extended to 2.2 Å resolution using RESOLVE version 1.04⁷ and ARP V5.1.⁸ The model was built manually into electron density maps calculated using phases obtained from these procedures. Cycles of model building using O version 6.1,⁹ followed by least-squares refinement using CNS¹⁰ with bulk solvent correction, yielded the final structure, which includes all 111 amino acids of MTH938, the last two amino acids of the N-terminal His-tag for both molecules in the asymmetric unit, and 85 solvent water molecules. Amino acid residues 74, 78, and 80 in molecule A and 74 and 78 in molecule B were refined as alanines because of poor side-chain density. The final crystallographic R-factor and free R-factor (Table I) were 0.228 and 0.266, respectively, for 12,671 reflections (99.3%) between 20 and 2.2 Å resolution with |F| > 0.0. The refined atomic coordinates and both the unmerged and merged X-ray diffraction data have been deposited in the Protein Data Bank (PDB ID 1IHN).
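To relate the three MAD wavelengths listed in Table I to the Se K-edge, it can help to convert them to photon energies using E (keV) = 12.39842 / λ (Å). The following is a minimal sketch; the function and dictionary names are illustrative and are not part of any data-processing package used in this work.

```python
# Convert the three MAD data-collection wavelengths (Table I) to photon energies.
# E [keV] = h*c / lambda, with h*c = 12.39842 keV*Å.

HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV*Å


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths as reported in Table I (labels are illustrative).
wavelengths = {
    "lambda1_peak": 0.9789,
    "lambda2_inflection": 0.9792,
    "lambda3_remote": 0.9537,
}

energies = {name: wavelength_to_energy_kev(w) for name, w in wavelengths.items()}

for name, energy in energies.items():
    print(f"{name}: {wavelengths[name]:.4f} Å = {energy:.4f} keV")
```

The peak and inflection wavelengths differ by only 0.0003 Å, a few electron volts in energy, while the remote wavelength lies a few hundred electron volts above both.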

Table I. Crystallographic Data and Refinement Statistics

Crystal characteristics and data collection statistics
  Cell constants: a = b = 63.63 Å, c = 116.80 Å; space group P4₁2₁2
  Contents of asymmetric unit: 2 MTH938 molecules; Z = 16 molecules/unit cell
  X-ray source: APS 14BM-D beamline

                                  λ1 (Se)       λ2 (Se)       λ3 (Se)
  Wavelength (Å)                  0.9789        0.9792        0.9537
  Resolution (Å)                  40.0–2.2      40.0–2.2      40.0–2.2
  Number of reflections           23,516        23,544        23,570
  (Number of observations)        (235,768)     (234,960)     (236,414)
  Completeness (%)                99.8          99.8          99.9
  (in 2.24–2.20 Å shell, %)       (100)         (100)         (100)
  Mean I/σ(I)
  (in 2.24–2.20 Å shell)          (6.0)         (9.0)         (5.2)
  R-merge on Iᵃ                   0.092         0.084         0.091
  (in 2.24–2.22 Å shell)          (0.376)       (0.273)       (0.431)
  Sigma cut-off                   I < −3σ(I)    I < −3σ(I)    I < −3σ(I)

  Figure of merit: 0.69 (20–2.7 Å resolution) for 7,848 reflections

Model and refinement statistics
  Data set used in structure refinement     λ2 (Se)
  Resolution range                          20–2.2 Å
  Number of reflections                     12,671 (12,020 in working set; 651 in test set)
  Completeness                              99.3% (94.2% in working set; 5.1% in test set)
  Cutoff criteria                           |F| > 0.0
  Number of amino acid residues             224
  Number of water molecules                 85
  Rcrystᶜ                                   0.228
  Rfree                                     0.266
  R.m.s. deviations
    Bond length                             0.006 Å
    Bond angle                              1.3°
  Luzzati error                             0.27 Å
  Ramachandran plot statisticsᵈ
    Residues in most favored regions        172 (86.9%)
    Residues in additional allowed regions   25 (12.6%)
    Residues in generously allowed regions    1 (0.5%)
    Residues in disallowed regions            0 (0.0%)
  Overall G-factorᵈ                         0.28

  ᵃ $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I(hkl)_i - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} I(hkl)_i$.
  ᵇ $R_{\mathrm{meas}} = \sum_{h} [m/(m-1)]^{1/2} \sum_{i} |I_{h,i} - \langle I_h\rangle| \,/\, \sum_{h}\sum_{i} I_{h,i}$, where $m$ is the multiplicity of each reflection, $\sum_{h}$ is taken over all unique reflections, and $\sum_{i}$ is taken over the set of independent observations of each unique reflection.¹²
  ᶜ $R_{\mathrm{cryst}} = \sum_{hkl} |F_o(hkl) - F_c(hkl)| \,/\, \sum_{hkl} |F_o(hkl)|$, where $F_o$ and $F_c$ are the observed and calculated structure factors, respectively.
  ᵈ Computed with PROCHECK.¹³
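
The residuals defined in the footnotes to Table I can be illustrated with a short calculation. The sketch below applies the footnote formulas to made-up numbers. The function names and the toy intensities are hypothetical, and skipping singly measured reflections is an assumption made here, not something stated in the table. This is not the Scalepack or CNS implementation.

```python
# Illustrative implementations of the Table I residuals on synthetic data.
from math import sqrt
from typing import Dict, List, Sequence, Tuple

HKL = Tuple[int, int, int]


def r_merge(observations: Dict[HKL, List[float]]) -> float:
    """Sum over hkl, i of |I_i - <I>| divided by sum over hkl, i of I_i."""
    numerator = denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:  # assumption: singly measured reflections skipped
            continue
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_meas(observations: Dict[HKL, List[float]]) -> float:
    """Like r_merge, with each reflection weighted by sqrt(m / (m - 1))."""
    numerator = denominator = 0.0
    for intensities in observations.values():
        m = len(intensities)
        if m < 2:  # the multiplicity weight is undefined for m = 1
            continue
        mean_i = sum(intensities) / m
        numerator += sqrt(m / (m - 1)) * sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_cryst(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Sum of |Fo - Fc| divided by sum of |Fo| over matched amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(abs(o) for o in f_obs)


# Synthetic example data (not from the MTH938 data set).
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0, 56.0]}
print(f"R-merge = {r_merge(toy_observations):.4f}")
print(f"R-meas  = {r_meas(toy_observations):.4f}")
print(f"R-cryst = {r_cryst([10.0, 20.0, 30.0], [9.0, 22.0, 30.0]):.4f}")
```

Because the weight [m/(m − 1)]^(1/2) always exceeds 1, R-meas is never smaller than R-merge for the same observations. The free R-factor uses the same expression as R-cryst, evaluated only on the test-set reflections excluded from refinement.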


We thank G. Kornhaber and D. Zheng for helpful discussions, and APS BioCARS staff members for their support in data collection. MTH938 represents structure #8 from the Northeast Structural Genomics Consortium.