Thermus thermophilus HB8, an extremely thermophilic bacterium, optimally grows at 70°C.1 Thus, the proteins from this bacterial strain are quite stable at room temperature, easy to purify, and suitable for X-ray crystallographic studies.
TT1751 is annotated as a conserved hypothetical protein from T. thermophilus HB8. The TT1751 open reading frame (ORF) encodes a polypeptide chain of 129 amino acids, with a molecular mass of 14.1 kDa. A BLAST analysis of the TT1751 sequence against the nonredundant protein database identified a total of 14 homologues. Proteins with sequence similarities to TT1751 ranging from 28% to 57% exist in archaea and prokaryotes, including the pathogenic bacteria Vibrio vulnificus and Legionella pneumophila (Fig. 1). In the Pfam database, these proteins belong to a Domain of Unknown Function, DUF302,4 which is equivalent to COG3439 in the National Center for Biotechnology Information Database of Clusters of Orthologous Groups.6 All of these proteins were annotated as hypothetical proteins, and their average length is approximately 135 amino acid residues. We now report the crystal structure of TT1751 at 2.0 Å resolution, which is the first structure of a member of the DUF302 protein group.
Materials and Methods.
Cloning, expression, and purification: The TT1751 gene from T. thermophilus was amplified by polymerase chain reaction (PCR) and subcloned into the pET11b vector. A selenomethionine derivative of the protein was expressed in E. coli B834 (DE3), induced by isopropyl-β-D-thiogalactopyranoside (IPTG). The cell lysate was incubated at 70°C for 30 min and was then centrifuged to remove the denatured protein. The soluble fraction was applied to a Q Sepharose column (Amersham Biosciences) previously equilibrated with 20 mM Tris-HCl buffer (pH 8.0) containing 50 mM NaCl and 1 mM DTT. The proteins were eluted by using a linear gradient of 0.05 to 1 M NaCl. The fractions containing TT1751 were collected and dialyzed against 20 mM Tris-HCl buffer (pH 8.5) containing 1 mM DTT. The solution was applied to a Mono Q HR 5/5 column (Amersham Biosciences) previously equilibrated with 20 mM Tris-HCl buffer (pH 8.5) containing 1 mM DTT. The proteins were eluted by using a linear gradient of 0 to 1 M NaCl. The fractions containing TT1751 were collected and dialyzed against 50 mM HEPES-NaOH buffer (pH 7.0) containing 150 mM NaCl and 1 mM DTT. The dialyzed proteins were applied to a Superdex 75 HR 10/30 gel filtration column (Amersham Biosciences), equilibrated with 50 mM HEPES-NaOH buffer (pH 7.0) containing 150 mM NaCl and 1 mM DTT. The purified protein was concentrated to 6.0 mg/mL using a Centricon filter (Millipore). The yield of the SeMet-substituted TT1751 was 8.1 mg from 6.9 g of wet cells. The protein concentration was measured with a Protein Assay Rapid Kit (Wako). The N-terminal amino acid sequence of the purified TT1751 protein was confirmed with a Procise amino acid sequencer (Applied Biosystems).
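The yield and concentration figures above can be put on a per-gram and molar basis with a few lines of arithmetic. The following is a minimal sketch, assuming the 14.1 kDa mass quoted for the native polypeptide (selenomethionine substitution raises the true mass slightly, which is ignored here); the variable names are illustrative only.

```python
# Values reported in the purification paragraph.
yield_mg = 8.1         # purified SeMet-substituted TT1751 (mg)
wet_cells_g = 6.9      # wet cell mass (g)
conc_mg_per_ml = 6.0   # final protein concentration (mg/mL)

# Assumption: native polypeptide mass of 14.1 kDa; the SeMet
# derivative is slightly heavier, which is neglected in this sketch.
mass_da = 14100.0

# Yield per gram of wet cells (mg/g).
yield_per_g = yield_mg / wet_cells_g

# Volume the whole yield would occupy at the final concentration (mL).
volume_ml = yield_mg / conc_mg_per_ml

# mg/mL equals g/L; dividing by g/mol gives mol/L, times 1000 gives mM.
molar_mM = conc_mg_per_ml / mass_da * 1000.0

print(f"{yield_per_g:.2f} mg/g wet cells, {volume_ml:.2f} mL, {molar_mM:.2f} mM")
```

Under these assumptions the yield corresponds to about 1.2 mg of protein per gram of wet cells, and a 6.0 mg/mL solution is about 0.43 mM in monomer.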
Crystallization and data collection: The initial crystallization conditions were screened by microbatch crystallization, using an IMPAX Crystallization Robot (Douglas Instruments) against Hampton crystal screening kits (Hampton Research). For crystallization, 1 μL of the protein (6.0 mg/mL), in 50 mM N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES)-NaOH buffer (pH 7.0) containing 150 mM NaCl and 1 mM dithiothreitol (DTT), was mixed with 1 μL of the solution containing 38% polyethylene glycol monomethylether (PEGMME) 5000, 0.17 M ammonium sulfate, and 0.1 M potassium citrate (pH 5.6), in silicone/paraffin (1:1) oil. Crystals of the selenomethionine protein (0.2 mm × 0.2 mm × 0.1 mm) were obtained in 1–2 days at 20°C. The crystals were soaked in 10 μL of cryoprotectant solution containing 38% PEGMME 5000, 0.17 M ammonium sulfate, 0.1 M potassium citrate (pH 5.6), and 25% glycerol for about 1 min before being flash-frozen in a −180°C nitrogen stream. The crystals belong to the space group P2₁, with unit-cell dimensions a = 41.60 Å, b = 78.81 Å, c = 44.69 Å, β = 116.65°.
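The unit-cell dimensions allow a quick plausibility check of the crystal packing through the Matthews coefficient. The sketch below is illustrative and rests on assumptions not stated explicitly in the text: two molecules per asymmetric unit (suggested by the homodimer and the four selenium sites), the 14.1 kDa mass of the native polypeptide, and the conventional 1.23 constant for the solvent estimate. The function names are hypothetical. Space group P2₁ has two asymmetric units per cell.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell: a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(cell_volume, n_asu, mol_per_asu, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return cell_volume / (n_asu * mol_per_asu * mass_da)

def solvent_fraction(vm):
    """Approximate solvent fraction, using the conventional 1.23 constant."""
    return 1.0 - 1.23 / vm

# Cell dimensions reported in the text.
volume = monoclinic_cell_volume(41.60, 78.81, 44.69, 116.65)

# Assumptions: 2 asymmetric units per cell (P2_1), 2 molecules per
# asymmetric unit, and a monomer mass of 14.1 kDa.
vm = matthews_coefficient(volume, n_asu=2, mol_per_asu=2, mass_da=14100.0)
solvent = solvent_fraction(vm)

print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent:.0%}")
```

Under these assumptions the cell volume is about 1.31 × 10⁵ Å³, giving a Matthews coefficient of roughly 2.3 Å³/Da and a solvent content near 47%, values within the range commonly seen for protein crystals.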
Structure determination and refinement: X-ray diffraction data sets were collected at 4 wavelengths, at beamline BL44B2 at SPring-8.7 Data were processed and scaled using the programs DENZO and SCALEPACK.8 We used the SOLVE program package9 to obtain the electron density map. Four selenium sites were found in one asymmetric unit. The RESOLVE program10 was used to normalize the magnitudes of the differences at the peak wavelength and to perform density modification. The resulting electron density map was sufficient for model building. The model was built against the low-remote data using the program TURBO-FRODO.11 Structural refinement was performed using X-PLOR12 (version 3.851) and REFMAC.13 The structure of TT1751 was refined to an R factor of 0.194 (Rfree of 0.259) for all data in the resolution range of 50 Å to 2.0 Å, using REFMAC.
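For readers unfamiliar with the quoted agreement statistics, the crystallographic R factor is the summed absolute difference between observed and calculated structure-factor amplitudes, divided by the sum of the observed amplitudes. The sketch below shows the standard definition only; it is not the REFMAC implementation, the amplitudes are made-up toy numbers, and the calculated values are assumed to be already on the same scale as the observed ones. Rfree uses the same formula, evaluated over reflections excluded from refinement.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Hypothetical toy amplitudes, for illustration only (not TT1751 data).
toy_f_obs = [100.0, 80.0, 60.0, 40.0]
toy_f_calc = [90.0, 85.0, 66.0, 38.0]

toy_r = r_factor(toy_f_obs, toy_f_calc)
print(f"toy R factor = {toy_r:.3f}")
```

A gap between R and Rfree, such as the 0.194 and 0.259 reported here, reflects how much better the model fits the reflections used in refinement than those held out.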
Results and Discussion.
The structure of TT1751 [Fig. 2(A)] was determined to 2.0 Å resolution using the multiwavelength anomalous dispersion (MAD) method. Data collection, model, and refinement statistics are summarized in Table I. The final model contains residues 3–129 (residues 1–2 are not visible), 56 water molecules, and 1 molecule of ammonium sulfate. Structural evaluations of the final TT1751 protein model, using PROCHECK,15 indicated that 96.4% of the residues are in the most favorable regions of the Ramachandran plot, with no residues in the “disallowed” regions. The coordinates of the TT1751 protein have been deposited in the Protein Data Bank (PDB), with the ID code 1J3M.
Table I. Data Collection and Refinement Statistics
The TT1751 monomer consists of 5 α-helices and 5 β-strands. The β-sheet is composed of 5 antiparallel β-strands, in the order 1-5-4-3-2 [Fig. 2(A)]. Two identical subunits form a homodimeric structure in the crystal [Fig. 2(B and C)]. Each homodimer contains 1 molecule of ammonium sulfate, located between the α3 and α5 helices in the crystal packing. Figure 2(D) shows the secondary structure topology of TT1751.
The TT1751 structure was compared with the previously determined structures in the PDB, using the program DALI (http://www.ebi.ac.uk/dali/).16 Several proteins shared weak structural similarity (Z score > 4.0) with the TT1751 structure. The highest DALI Z score was 6.1 (Homo sapiens β2-adaptin; for 81 pairs of aligned Cα atoms; PDBID = 1E42-A), despite the very weak sequence similarity (9% identity). The present structure of TT1751 [Fig. 3(A)] was compared with the β2-adaptin structure17 [Fig. 3(B)]. The two structures are shown in the same orientation and were superimposed using DALI and LSQKAB.16, 18 The RMSD of the superimposition was 3.1 Å. However, the functional relationship between TT1751 and β2-adaptin remains unknown.
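The RMSD quoted for the superimposition is the root-mean-square distance between paired Cα atoms once the two structures have been placed in a common frame. The sketch below computes only that final quantity for coordinates assumed to be already superimposed; it does not perform the alignment or the least-squares fit carried out by DALI and LSQKAB, and the coordinates are hypothetical toy values.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square distance between paired (x, y, z) coordinates.

    Assumes both sets are already superimposed and listed in the
    same atom-pair order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates (angstroms): the second set is the
# first shifted by 3 angstroms along z, so every pair is 3 angstroms apart.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (0.0, 1.0, 3.0)]

toy_rmsd = rmsd(toy_a, toy_b)
print(f"toy RMSD = {toy_rmsd:.1f} A")
```

In the comparison reported here, the same quantity is taken over the 81 aligned Cα pairs.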
Our thanks to Nobuo Kamiya, Taiji Matsu, and Hisashi Naitow for data collection at RIKEN Beamline BL44B2 of SPring-8.