The TM0875 gene of Thermotoga maritima encodes a hypothetical protein (residues 1–158), with a molecular weight of 18,447 Da and a calculated isoelectric point of 4.8. TM0875 has no known sequence homologue in other organisms and is, therefore, an “orphan” unique to Thermotoga maritima.1 Here, we report the crystal structure of TM0875 determined using the semiautomated high-throughput pipeline of the Joint Center for Structural Genomics.2
The structure of TM0875 [Fig. 1(A)] was determined to 2.00 Å resolution using the multi-wavelength anomalous dispersion (MAD) method. Data collection, model, and refinement statistics are summarized in Table I. The final model includes one protein molecule (residues 6–154) and 129 water molecules. No interpretable electron density was observed for residues 1–5 and 155–158. The Matthews coefficient (Vm) for TM0875 is 2.32 Å³/Da and the estimated solvent content is 46.6%. The Ramachandran plot, produced by Procheck 3.4,3 shows that 95.5% of the residues are in the most favored regions and 3.8% are in additional allowed regions. One residue, Lys108 (φ = 45.8°, Ψ = −106.5°), is in a disallowed region. Lys108 is the second residue within a distorted type II′ β-turn (residues 107–110). The φ and Ψ angles of this residue deviate from the canonical values of a type II′ β-turn (φ = 60.0°, Ψ = −120.0°) by 14.2° and 13.5°, respectively.
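The relation between the Matthews coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration, assuming the common approximation that the protein occupies a fraction 1.23/Vm of the crystal volume; the constant 1.23 is a textbook value and is not stated in this report, and the function names are chosen here for illustration only.

```python
# Sketch: solvent content from the Matthews coefficient (Vm, in Å^3/Da).
# ASSUMPTION: the constant 1.23 is the commonly used textbook value for a
# typical protein; the report does not state which constant was used.
PROTEIN_FRACTION_CONSTANT = 1.23  # Å^3/Da


def matthews_coefficient(cell_volume_A3, molecules_per_cell, mol_weight_da):
    """Vm = unit-cell volume / (molecules per cell * molecular weight)."""
    if molecules_per_cell <= 0 or mol_weight_da <= 0:
        raise ValueError("molecule count and molecular weight must be positive")
    return cell_volume_A3 / (molecules_per_cell * mol_weight_da)


def solvent_fraction(vm):
    """Approximate solvent fraction of the crystal: 1 - 1.23 / Vm."""
    if vm <= 0:
        raise ValueError("Vm must be positive")
    return 1.0 - PROTEIN_FRACTION_CONSTANT / vm


if __name__ == "__main__":
    vm_reported = 2.32  # Å^3/Da, value quoted in the text
    percent = 100 * solvent_fraction(vm_reported)
    print(f"Estimated solvent content: {percent:.1f}%")
```

With Vm = 2.32 Å³/Da this approximation gives a solvent content of about 47%, close to the reported 46.6%; the small difference presumably reflects rounding of Vm or a slightly different constant in the program used.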
Table I. Summary of Crystal Parameters, Data Collection, and Refinement Statistics for TM0875 (PDB: 1o22)
Values in parentheses correspond to the highest resolution shell.
The TM0875 monomer is composed of nine β-strands (β1–β9), two α-helices (H1, H3), and two 3₁₀-helices (H2, H4) (Fig. 1). The total β-strand, α-helical, and 3₁₀-helical content is 39.4%, 19.5%, and 2.7%, respectively. TM0875 folds into a six-stranded (β1, β2, β6–β9) antiparallel β-sheet A with 612543 topology wrapped around a central α-helix H3, which is flanked by an additional α-helix H1 and a small subdomain comprising β-strand β4 and a two-stranded (β3, β5) antiparallel β-sheet B (Fig. 1). Analysis of the crystallographic packing in the TM0875 structure suggests that a dimer is the biologically relevant oligomeric form. The tightly intertwined dimer is formed through antiparallel interactions of α-helices H1 and of β-strands β3 and β4, which form a four-stranded antiparallel β-sheet with the corresponding strands from the other subunit, related by the crystallographic two-fold axis [Fig. 2(A)]. The dimer interface buries a surface area of 2579 Å² per monomer, with 36 hydrogen bonds and 67% non-polar atoms.4
An initial structural similarity search, performed with the coordinates of TM0875 using the DALI server,5 identified no structural homologue and, hence, indicated a new fold. A more recent DALI search identified significant structural similarity to the recently determined hypothetical protein YggU from E. coli (PDB: 1n91), with an RMSD of 2.8 Å over 60 aligned residues and a sequence identity of 7%. A superposition of TM0875 and YggU [Fig. 2(B)] shows that the structural homology is restricted to the central β-sheet A/helix H3 core, and that the two structures differ by a swap of the N-terminal β-strand β1 in TM0875 for a C-terminal strand in YggU. These two new structures represent a new fold, denoted the YggU-like fold. However, the topology of the β-sheet and the dimerization interface consisting of α-helix H1 and β-strands β3–β5 are unique to TM0875, which can, therefore, be considered a variant of the YggU-like fold. According to the Fold and Function Assignment System (FFAS),6 YggU belongs to a conserved and uncharacterized family of proteins,7 with about one hundred members in all kingdoms of life. It will be interesting to see whether the structural homology between the core domain of TM0875 and the YggU protein is also manifested in a functional similarity.
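For readers unfamiliar with the RMSD figure quoted for the DALI alignment, the quantity itself can be sketched as below. This is only an illustration of the definition, assuming the two sets of Cα coordinates have already been superposed and paired residue by residue; it does not reproduce the DALI alignment or superposition procedure, and the coordinates in the usage example are made up rather than taken from 1o22 or 1n91.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) over paired, pre-superposed atoms.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples,
    where the i-th entries are the aligned atoms (e.g. Cα positions).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical toy coordinates: every atom is displaced by 2 Å along z.
    toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(f"RMSD = {rmsd(toy_a, toy_b):.2f} Å")
```

In practice the reported value depends on which residues are included in the alignment, so an RMSD is only meaningful together with the number of aligned residues, as given in the text.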
The TM0875 structure reported here, determined by X-ray crystallography using the MAD method, represents a new fold from Thermotoga maritima. The information reported here, in combination with further biochemical and biophysical studies, will yield valuable insights into the functional determinants of this protein family and the thermostability of these organisms.
Materials and Methods.
Protein production and crystallization:
TM0875 (TIGR: TM0875; Swissprot: Q9WZX8) was amplified by PCR from Thermotoga maritima strain MSB8 genomic DNA using PfuTurbo (Stratagene) and primer pairs encoding the predicted 5′- and 3′-ends of TM0875. The PCR product was cloned into plasmid pMH1, which encodes an expression and purification tag (MGSDKIHHHHHH) at the amino terminus of the full-length protein. The cloning junctions were confirmed by sequencing. Protein expression was performed in selenomethionine-containing medium using the Escherichia coli methionine auxotrophic strain DL41. Lysozyme was added to the culture at the end of fermentation to a final concentration of 250 μg/ml. Bacteria were lysed by sonication after a freeze-thaw procedure in Lysis Buffer [50 mM Tris pH 7.9, 50 mM NaCl, 0.25 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP)], and cell debris was pelleted by centrifugation at 3400 × g for 60 min. The soluble fraction was applied to a metal-chelate affinity resin (Amersham Biosciences) pre-equilibrated with Equilibration Buffer (50 mM potassium phosphate, pH 7.8, 0.25 mM TCEP, 10% (v/v) glycerol, 300 mM NaCl) containing 20 mM imidazole. The Ni-resin was washed with Equilibration Buffer containing 40 mM imidazole, and the protein was eluted with Elution Buffer (20 mM Tris pH 7.9, 10% (v/v) glycerol, 0.25 mM TCEP, 300 mM imidazole). Buffer exchange was performed to remove imidazole from the eluate, and the protein in Buffer A (20 mM Tris, pH 7.9, 5% (v/v) glycerol, 0.25 mM TCEP) containing 50 mM NaCl was applied to a Resource Q column (Amersham Biosciences) previously equilibrated with the same buffer. The protein was eluted using a linear gradient of 50 to 500 mM NaCl in Buffer A. Appropriate fractions were buffer exchanged into crystal Buffer B (20 mM Tris, pH 7.9, 150 mM NaCl, 0.25 mM TCEP) and concentrated for crystallization assays to 9.4 mg/ml by centrifugal ultrafiltration (Millipore).
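As a small aid for the anion-exchange step, the NaCl concentration at any point of the linear 50 to 500 mM gradient can be sketched as below. The gradient volume is not given in this report, so the position is expressed as a hypothetical fraction of the gradient completed (0 to 1); the function name is likewise an illustrative choice.

```python
def nacl_at(fraction, start_mm=50.0, end_mm=500.0):
    """NaCl concentration (mM) at a given fraction of a linear gradient.

    fraction is the portion of the gradient already delivered, from 0.0
    (start, 50 mM) to 1.0 (end, 500 mM). The defaults are the gradient
    limits quoted in the text; the gradient volume itself is not reported.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_mm + fraction * (end_mm - start_mm)


if __name__ == "__main__":
    for frac in (0.0, 0.25, 0.5, 1.0):
        print(f"{frac:4.2f} of gradient -> {nacl_at(frac):.1f} mM NaCl")
```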
The protein was crystallized using the nanodroplet vapor diffusion method8 with standard JCSG crystallization protocols.2 The crystal grew in Hampton Crystal Screen Cryo #22 (25.5% polyethylene glycol (PEG) 4000, 0.085 M Tris, pH 8.5, 0.17 M sodium acetate, and 15% glycerol). The crystals were indexed in the tetragonal space group P4₃2₁2 (Table I).
Diffraction data were collected at the Advanced Light Source (ALS, Berkeley, USA) using the BLU-ICE9 data collection environment. Anomalous diffraction data from the selenomethionine-substituted protein crystals were collected at 100 K on beamline 5.0.2 using an ADSC CCD detector at wavelengths corresponding to the inflection point (λ1 MAD Se), the peak (λ2 MAD Se), and the high-energy remote (λ3 MAD Se) of a selenium MAD experiment (Table I). Data used in refinement (λ0 MAD Se) were collected from a second crystal. Data were integrated and reduced using Mosflm10 and then scaled with the program SCALA from the CCP4 suite.11 Data statistics are summarized in Table I.
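The wavelengths of a selenium MAD experiment are chosen relative to the Se K absorption edge, and the conversion between photon energy and wavelength can be sketched as below. This is an illustration only: the nominal edge energy of 12.658 keV is a tabulated textbook value and not a number from this report, the actual inflection and peak positions are normally taken from a fluorescence scan of the crystal, and the wavelengths used here are those listed in Table I.

```python
# hc expressed in keV·Å, so that wavelength (Å) = HC_KEV_ANGSTROM / energy (keV).
HC_KEV_ANGSTROM = 12.39842

# ASSUMPTION: nominal tabulated Se K-edge energy, not taken from this report.
SE_K_EDGE_KEV = 12.658


def energy_to_wavelength(energy_kev):
    """Convert photon energy (keV) to wavelength (Å)."""
    if energy_kev <= 0:
        raise ValueError("energy must be positive")
    return HC_KEV_ANGSTROM / energy_kev


def wavelength_to_energy(wavelength_a):
    """Convert wavelength (Å) to photon energy (keV)."""
    if wavelength_a <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_a


if __name__ == "__main__":
    edge_wavelength = energy_to_wavelength(SE_K_EDGE_KEV)
    print(f"Nominal Se K edge: {SE_K_EDGE_KEV} keV = {edge_wavelength:.4f} Å")
```

The nominal edge corresponds to a wavelength of roughly 0.98 Å; the inflection and peak wavelengths lie close to it, and the high-energy remote lies at a shorter wavelength.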
Structure solution and refinement:
The structure was determined from the selenomethionine data by MAD phasing using SOLVE/RESOLVE.12 Structure refinement was performed using CNS,13 Refmac5 with TLS refinement,11 and Xfit.14 Refinement statistics are summarized in Table I. The final model includes one protein molecule (residues 6–154) and 129 water molecules. No interpretable electron density was observed for residues 1–5 and 155–158 or for the N-terminal His-tag.
Validation and deposition:
Analysis of the stereochemical quality of the models was accomplished using Procheck 3.4 and SFcheck 4.0.3,11 Protein quaternary structure and buried surface area were analyzed using the PQS server (http://pqs.ebi.ac.uk/). Figure 1(B) was adapted from an analysis using PDBsum (http://www.biochem.ucl.ac.uk/bsm/pdbsum/) and all other figures were prepared with PyMOL (DeLano Scientific). Coordinates and experimental structure factors of TM0875 have been deposited with the PDB and are accessible under the code 1o22.
This work was supported by NIH Protein Structure Initiative grant P50-GM 62411 from the National Institute of General Medical Sciences (www.nigms.nih.gov). Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory, a national user facility operated by Stanford University on behalf of the U.S. Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences).