Crystal structure of hypothetical protein PH0734.1 from hyperthermophilic archaea Pyrococcus horikoshii OT3

Authors

  • Ken-ichi Miyazono,

    1. Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-Ku, Tokyo 113-8657, Japan
  • Yozo Nishimura,

    1. Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-Ku, Tokyo 113-8657, Japan
  • Yoriko Sawano,

    1. Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-Ku, Tokyo 113-8657, Japan
  • Tsukasa Makino,

    1. Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-Ku, Tokyo 113-8657, Japan
  • Masaru Tanokura

    Corresponding author
    1. Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-Ku, Tokyo 113-8657, Japan
    • Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo 113-8657, Japan

INTRODUCTION

Proteins from hyperthermophilic organisms are frequently selected as targets of structure determination for the purpose of understanding protein functions from a structural viewpoint. We cloned the gene of PH0734.1, which is a hypothetical protein with unknown function, from Pyrococcus horikoshii OT3¹ and overexpressed its protein product in E. coli to investigate its structure and function.

PH0734.1 is a protein consisting of 172 residues with a molecular mass of 19,601 Da. Analysis of its amino acid sequence showed that PH0734.1 is divided into two domains: an N-terminal DUF1947 domain found in archaeal hypothetical proteins (Pfam database,² accession number: PF09183.1) and a C-terminal PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain. Although the PUA domain is characterized as an RNA-binding domain and is observed in various RNA modification enzymes such as archaeosine tRNA-guanine transglycosylase (ArcTGT)³,⁴ and pseudouridine 55 synthase,⁵ the structure and function of PH0734.1 remain unknown.

To elucidate the structural basis of PH0734.1 function, we determined its three-dimensional structure at 1.73 Å resolution. The crystal structure revealed that, although the PUA domain of PH0734.1 is highly similar to previously determined protein structures, the conformation of the DUF1947 domain is novel, with a unique lysine cluster formed in its N-terminal region. DUF1947 may modulate the binding target of the PUA domain using its characteristic electropositive surface.

Abbreviations

ArcTGT, archaeosine tRNA-guanine transglycosylase; PUA, PseudoUridine synthase and Archaeosine transglycosylase.

METHODS

Cloning, expression, and purification

The PH0734.1 gene was amplified from genomic DNA of P. horikoshii by PCR. The primers used for this amplification were designed according to the start and stop codon regions of the PH0734.1 gene, and Nde I and Bam HI sites were added at the 5′ ends of the primers containing the start and stop codons, respectively. After digestion with Nde I and Bam HI, the amplified fragment was cloned into pET-28a(+), a T7 polymerase-based expression vector for E. coli. The plasmid was transformed into E. coli Rosetta(DE3) pLysS for protein expression.

The E. coli transformants were cultivated at 25°C in LB medium until the optical density at 600 nm reached 0.6. Protein expression was induced by the addition of IPTG to a final concentration of 0.2 mM, and the cultivation at 25°C was continued for 16 h. E. coli cells were collected by centrifugation at 5,000 × g for 10 min. The cells were resuspended in 20 mM Tris-HCl (pH 7.0), 150 mM NaCl, and 15% glycerol, and then lysed by sonication. After centrifugation at 40,000 × g for 30 min, the supernatant was heat-treated at 80°C for 30 min. The heat-treated sample was centrifuged again at 40,000 × g for 30 min, and the resulting supernatant was purified on Ni-NTA agarose (QIAGEN). His-tagged PH0734.1 was eluted with a buffer solution containing 20 mM Tris-HCl (pH 7.0), 150 mM NaCl, 5% glycerol, and 250 mM imidazole. The elution fraction was treated with thrombin to remove the N-terminal His-tag of PH0734.1 and with benzonase to digest contaminating DNA and RNA. PH0734.1 was further purified on a Resource S 6 mL cation exchange chromatography column (GE Healthcare). The protein was eluted with a 120-mL linear gradient of 0–1 M NaCl in a 20 mM Tris-HCl (pH 7.0) buffer solution. Purified protein was dialyzed against 5 mM Tris-HCl (pH 7.0) and concentrated to 12 mg/mL for crystallization.

Crystallization and data collection

Crystallization experiments of PH0734.1 were performed by the sitting drop vapor diffusion method at 20°C. Crystallization drops were made by mixing 1 μL of the protein solution [12 mg/mL in 5 mM Tris-HCl (pH 7.0)] with 1 μL of a variety of reservoir solutions. Crystals of PH0734.1 were obtained under a reservoir solution condition containing 100 mM CAPS (pH 10.5) and 30% (v/v) PEG400. Typical crystals of PH0734.1 were obtained within 3 days.

X-ray diffraction data of PH0734.1 were collected at the AR-NW12 beamline in Photon Factory (Tsukuba, Japan). All X-ray diffraction data measurements were carried out under cryogenic conditions (95 K). The crystal of PH0734.1 diffracted to a resolution of 1.73 Å. The diffraction data were integrated and scaled with the program HKL2000.⁶ The crystal belonged to the space group P3₂21, with unit-cell parameters of a = b = 52.92 Å and c = 133.31 Å. Evaluation of the Matthews coefficient⁷ indicated that the crystal contained one protein molecule per asymmetric unit (V_M = 2.48 Å³/Da). The data collection statistics are summarized in Table I.
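The Matthews coefficient estimate described above can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes six symmetry-equivalent positions for the trigonal space group, uses the sequence-derived mass of 19,601 Da, and applies the common approximation that the solvent fraction is 1 − 1.23/V_M. The function names are hypothetical. With this mass the sketch gives a somewhat larger value than the reported 2.48 Å³/Da; the mass the authors used (for example, whether tag-derived residues were counted) is not stated, so exact agreement should not be expected.

```python
import math


def trigonal_cell_volume(a, c):
    """Volume in cubic angstroms of a cell with a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, n_sym_ops, n_mol_per_asu, mol_mass_da):
    """V_M = cell volume / (total number of molecules in the cell * molecular mass)."""
    return cell_volume / (n_sym_ops * n_mol_per_asu * mol_mass_da)


# Unit-cell parameters from the text; the mass is the sequence-derived value (assumption).
volume = trigonal_cell_volume(52.92, 133.31)
vm = matthews_coefficient(volume, n_sym_ops=6, n_mol_per_asu=1, mol_mass_da=19601.0)

# Common approximation for the solvent fraction of a protein crystal.
solvent_fraction = 1.0 - 1.23 / vm

print(f"cell volume      = {volume:.0f} A^3")
print(f"V_M (1 molecule) = {vm:.2f} A^3/Da")
print(f"solvent fraction = {solvent_fraction:.2f}")
```

Repeating the call with `n_mol_per_asu=2` halves V_M, which is how one would compare alternative packing hypotheses before accepting one molecule per asymmetric unit.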

Table I. Data Collection, Phasing, and Refinement Statistics of PH0734.1

Data collection
  Wavelength (Å)                                1.0000
  Space group                                   P3₂21
  Unit cell (Å)                                 a = b = 52.92, c = 133.31
  Resolution (Å)                                25.0–1.73 (1.79–1.73)
  Number of observed reflections                459,715
  Number of unique reflections                  24,689
  Number of reflections in the Rfree dataset    1,198
  Completeness (%)                              99.8 (100)
  Rmerge (%)                                    6.5 (22.1)
  I/σI                                          21.5 (12.2)
  Redundancy                                    18.8 (19.0)

Refinement
  R/Rfree (%)                                   20.9/22.9
  Number of nonhydrogen atoms
    Protein                                     1,331
    Water                                       182
  RMSD bond length (Å)                          0.007
  RMSD bond angle (deg.)                        1.126
  Ramachandran plot (%)
    Favored                                     97.6
    Allowed                                     2.4
    Disallowed                                  0

Values in parentheses are for the highest resolution shell.

Structure determination

The crystal structure of PH0734.1 was determined by the molecular replacement method. Molecular replacement was performed with the program MOLREP⁸ in CCP4⁹ following a search for homologous structures with the program MrBUMP.¹⁰ The best search model for the molecular replacement was the C-terminal region of a hypothetical protein Ta1423 from Thermoplasma acidophilum (1Q7H, residues: 69–153), which is predicted to be a PUA domain. The initial model was automatically rebuilt and refined using the program ARP/wARP.¹¹ After automated model building, several cycles of manual model rebuilding and refinement were performed using the programs XtalView¹² and Refmac5.¹³ Water molecules were picked from the Fo−Fc map on the basis of peak height and distance criteria. The geometry of the final structure was evaluated with the programs PROCHECK¹⁴ and Rampage.¹⁵ The coordinates of PH0734.1 have been deposited in the Protein Data Bank (PDB) with the accession number 3D79.

Structural analysis

Structural analysis was carried out using a set of programs: Dali¹⁶ for the search of similar structures from the database, SURFACE (CCP4)⁹ for the calculation of protein surface area, DaliLite¹⁷ for the superposition of molecules, APBS¹⁸ for calculation of the electrostatic potential of the protein surface, ESPript¹⁹ for the preparation of alignment figures, and PyMOL (http://pymol.sourceforge.net/) for the depiction of structure.

RESULTS AND DISCUSSION

Structure determination

We collected the diffraction data of PH0734.1 to a resolution of 1.73 Å because the values of Rmerge and R/Rfree deteriorated markedly when higher resolution data were included, even though the value of I/σI in the outermost shell was high. The structure of PH0734.1 was determined by the molecular replacement method with good stereochemistry. The final model contained one protein molecule and 182 ordered water molecules. Because of poor electron density, we could not build the first four residues (1–4) or the last residue (172). R and Rfree values of the final model were 20.9% and 22.9%, respectively. In the Ramachandran plot, 97.6% of the residues were in the favored region, and the rest were in the allowed region. The refinement statistics are summarized in Table I.

Overall structure of PH0734.1

The overall structure of PH0734.1 is shown in Figure 1(A). PH0734.1 is composed of 11 β strands, six α helices, and three 3₁₀ helices with the topology of β1-α1-α2-β2-β3-β4-β5-α3-3₁₀1-β6-3₁₀2-α4-β7-3₁₀3-β8-β9-β10-α5-β11-α6. The structure of PH0734.1 is composed of two domains: an N-terminal DUF1947 domain (5–70) and a C-terminal PUA domain (71–162). In the DUF1947 domain, the first five β strands (β1–β5) form an antiparallel β sheet and face two α helices (α1–α2) on one side. In the PUA domain, the last six β strands (β6–β11) form a barrel-like mixed β sheet and are surrounded by six helices (α3–α5, 3₁₀1–3₁₀3). The C-terminal residues of PH0734.1 (163–172) form an α helix (α6) and interact with the DUF1947 domain. The two domains are tightly connected by hydrophobic interactions and electrostatic interactions with an approximate buried surface area of 6157 Å².
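As a quick consistency check on the topology given above, the element counts can be tallied programmatically. The sketch below is illustrative only: the ASCII labels (`b` for β strands, `a` for α helices, `310_` for 3₁₀ helices) and the function name are stand-ins introduced here, not notation used by the authors.

```python
from collections import Counter

# Topology string from the text, transcribed with ASCII stand-in labels (assumption).
TOPOLOGY = (
    "b1-a1-a2-b2-b3-b4-b5-a3-310_1-b6-310_2-a4-b7-310_3-"
    "b8-b9-b10-a5-b11-a6"
)


def count_secondary_structure(topology):
    """Count strands, alpha helices, and 3-10 helices in a dash-separated topology."""
    counts = Counter()
    for element in topology.split("-"):
        if element.startswith("310_"):
            counts["helix_3_10"] += 1
        elif element.startswith("b"):
            counts["strand"] += 1
        elif element.startswith("a"):
            counts["helix_alpha"] += 1
        else:
            raise ValueError(f"unrecognized element: {element}")
    return counts


counts = count_secondary_structure(TOPOLOGY)
print(dict(counts))
```

The tally agrees with the stated composition of 11 β strands, six α helices, and three 3₁₀ helices.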

Figure 1.

A: Overall structure of PH0734.1. Color-coding runs from blue in the N-terminal region to red in the C-terminal region. Secondary structure assignments are labeled on the ribbon model. B: Superposition of the structure of PH0734.1 (cyan), Ta1423 (green), and ArcTGT (red). C: Electrostatic potential diagram of PH0734.1. Positive and negative potentials are represented by blue and red, respectively.

Comparison with other proteins

A database search using the Dali server¹⁶ revealed that only two protein structures were similar to the structure of PH0734.1. The closest structure was that of the function-unknown protein Ta1423 from Thermoplasma acidophilum (PDB code: 1Q7H, Z-score = 16.3, r.m.s.d. = 2.5 Å, sequence identity = 37%). In addition to this structure, domains C2 and C3 of archaeal tRNA-guanine transglycosylase (ArcTGT) from Pyrococcus horikoshii (PDB code: 1IQ8) showed high similarity (Z-score = 14.9, r.m.s.d. = 2.7 Å, sequence identity = 33%). When these proteins were compared, the structure of the DUF1947 domain of PH0734.1 was quite different from that of the other proteins, although the structures of the PUA domain showed high similarity [Fig. 1(B)]. A Dali search found no proteins structurally similar to the DUF1947 domain of PH0734.1 (Z > 6.0), indicating that this domain has a novel conformation.

In the structure of ArcTGT, domains C2 and C3 (corresponding to the DUF1947 and PUA domains of PH0734.1, respectively) bind the characteristic λ-form tRNA,⁴ although the residues important to the recognition of tRNA are poorly conserved in PH0734.1. The DUF1947 domain of PH0734.1 possesses a unique lysine cluster around the N-terminal region of the α1 helix and loop α2-β2. Lys11, Lys12, and Lys15 of helix α1, and Lys36 and Lys37 of loop α2-β2 form a highly electropositive protein surface [Fig. 1(C)]. Although an electropositive surface is also present on the corresponding domains of Ta1423 and ArcTGT, the electropositive residues and the location of the electropositive surfaces are not conserved among them. This finding suggests that, although PH0734.1 probably uses the electropositive surfaces of its DUF1947 and PUA domains to interact with electronegative macromolecules, as ArcTGT does with tRNA, its binding target differs from those of Ta1423 and ArcTGT. DUF1947 may modulate the binding target of the PUA domain using its characteristic electropositive surface.

Acknowledgements

The synchrotron-radiation experiments were performed at the AR-NW12 beamline in the Photon Factory (Proposal No. 2003S2-002).
