Methyltransferases (MTases) constitute a large family of enzymes that transfer the methyl group of S-adenosylmethionine (AdoMet) to carbon, nitrogen, or oxygen atoms of DNA, RNA, proteins, and small molecules.1, 2 In this process, AdoMet is converted to S-adenosylhomocysteine (AdoHcy), which is a potent inhibitor of MTases. Consequently, the enzymatic activities of MTases are regulated by the ratio of AdoMet to AdoHcy concentrations in the cell.
Structures of >20 different MTases are currently available.3 Most of these structures contain a core domain with a central seven-stranded β-sheet, with the last strand running antiparallel to the first six strands.3 This core domain is associated with the binding of the AdoMet molecule, and additional domains in these enzymes are involved in binding the other substrates.
Recently, a new class of MTases was characterized on the basis of the structures of the RrmA,4 MT1,5 and YibK6 proteins. These enzymes contain a core domain with a different backbone fold, having a central six-stranded fully parallel β-sheet. More importantly, this core domain has a deep trefoil knot at its C-terminus, and this knot appears to be crucial for the binding of the AdoMet molecule and for the dimerization of these enzymes.
In recent years, several structural genomics initiatives have been established to rapidly elucidate protein structures of functional and biological interest, to develop relevant technologies, and to provide a more comprehensive picture of protein conformational space. Significantly, the structures of RrmA,4 MT1,5 and YibK6 MTases have all been determined by structural genomics consortia. The Northeast Structural Genomics Consortium (NESG) is particularly focused on clusters of eukaryotic domain families from several model organisms, including humans, and homologous proteins from bacteria and archaea (http://www.nesg.org).
NESG target IR73 is a 245-residue hypothetical protein, yggJ (HI0303), from Haemophilus influenzae. HI0303 is a member of a widely conserved protein family, represented in a large number of prokaryotic genomes as well as in Arabidopsis thaliana (Fig. 1) and is annotated in Swiss-Prot as a hypothetical protein of unknown function. Here we report the crystal structure of HI0303 at 2.0 Å resolution. On the basis of its structural similarity with the RrmA,4 MT1,5 and YibK6 MTases, and despite the lack of significant sequence similarity to them, we propose that HI0303 is another member of the trefoil-knot class of MTases.
The crystal structure was determined by the selenomethionyl single-wavelength anomalous diffraction (SAD) method7 and deposited in the Protein Data Bank under the accession code 1NXZ (Table I). Out of the residues in the dimer in the asymmetric unit, 89.1% are in the most favored regions of the Ramachandran plot, and 10.6% are in the additionally allowed regions. One residue (Ser87) is in the generously allowed regions, and none are in disallowed regions.

TABLE I. Summary of Crystallographic Information

| Parameter | Value |
|---|---|
| Maximum resolution (Å) | 2.0 |
| No. of observations | 134,504 |
| Rmerge (%)ᵃ | 6.2 (41.0) |
| No. of reflections | 59,377 |
| Figure-of-merit from SAD phasing | 0.24 |
| Resolution range for refinement (Å) | 29.5–2.0 |
| Completeness (%) | 85 (61) |
| R factor (%)ᵇ | 21.4 (22.7) |
| Free R factor (%) | 26.4 (26.1) |
| RMS deviation in bond lengths (Å) | 0.008 |
| RMS deviation in bond angles (°) | 1.1 |
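The footnotes that define Rmerge and the R factor in Table I are not reproduced in this text. As a reading aid, the conventional crystallographic definitions are sketched below; it is an assumption that the table follows them. Here h indexes unique reflections, i indexes repeated observations of a reflection, I is a measured intensity, and F_obs and F_calc are the observed and calculated structure factor amplitudes (all symbols are introduced here for illustration).

```latex
\[
R_{\mathrm{merge}} =
  \frac{\sum_{h}\sum_{i}\left| I_{hi} - \langle I_{h} \rangle \right|}
       {\sum_{h}\sum_{i} I_{hi}},
\qquad
R =
  \frac{\sum_{h}\left|\, |F_{h}^{\mathrm{obs}}| - |F_{h}^{\mathrm{calc}}| \,\right|}
       {\sum_{h} |F_{h}^{\mathrm{obs}}|}
\]
```

Under these definitions, the free R factor is the same expression as R, evaluated only over a test set of reflections excluded from refinement. The ratio of observations to reflections in Table I, 134,504 / 59,377, is roughly 2.3, which is the average number of measurements per reflection if the reflection count refers to unique reflections.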
The structure of the HI0303 monomer is made of two domains. The small domain includes the N-terminal 72 residues of the protein and contains a twisted five-stranded β-sheet (β1–β5) and one helix (α1) [Fig. 2(A)]. The structure of this domain closely resembles that of the RNA-binding domain of the ribosomal protein TL5 (PDB accession code 1FEU),8 even though the amino acid sequence identity among the structurally equivalent residues of the two domains is only 14%.
The core domain contains a central six-stranded parallel β-sheet (β6–β11) that is flanked by five helices (α2–α6) [Fig. 2(A)]. Near the C-terminus of this domain, the structure contains a deep trefoil knot,9 so that the β11–α6 segment (∼25 residues) is threaded through the β9–β10 loop. The structure of this domain is remarkably similar to the core domain of RrmA, MT1, and YibK [Fig. 2(B)]. However, the degree of amino acid sequence conservation among these proteins is very low (<15% identity for structurally equivalent residues).
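The sequence identity figures quoted for structurally equivalent residues (14% for the small domain, <15% for the core domain) are percentages taken over aligned residue pairs only. The following is a minimal sketch of that calculation, assuming the structural superposition has already been reduced to two gapped strings of equal length. The function name and the toy sequences are hypothetical and are not taken from HI0303 or its homologs.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over positions where both sequences have a residue.

    Assumes the inputs come from a structure-based alignment, so that each
    ungapped column is a pair of structurally equivalent residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either sequence.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no structurally equivalent positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical): 8 equivalent positions, 6 identical.
toy_a = "ACDEF-GHIK"
toy_b = "ACQEFWG-LK"
identity = percent_identity(toy_a, toy_b)
```

Because gapped columns are excluded from the denominator, the value can differ from an identity computed over the full sequence length, which is why the text specifies "structurally equivalent residues".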
A dimer of the HI0303 protein is observed in the crystal [Fig. 2(C)], consistent with solution light-scattering results (data not shown). The dimer is formed through contacts of the core domains of the two monomers, with little contribution from the small domains. The knot in the core domain (the β11–α6 segment) mediates a substantial portion of the dimer interface. A total of 1650 Å² of the surface area of each monomer is buried in the dimer interface, which is more than twice the value of 700 Å² that is found in many biologically relevant protein–protein interactions.10 Residues in this interface are generally well conserved among the homologs of this protein (Fig. 1), suggesting that HI0303 and its homologs may function as dimers.
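The buried surface area comparison can be made concrete with a short calculation. The sketch below uses the common convention that the area buried per monomer is half the difference between the summed solvent-accessible surface areas (SASA) of the isolated monomers and the SASA of the dimer. The text does not state how the 1650 Å² value was obtained, so this convention is an assumption, and the SASA inputs are hypothetical numbers chosen only to reproduce that value.

```python
def buried_area_per_monomer(sasa_a: float, sasa_b: float,
                            sasa_dimer: float) -> float:
    """Surface area (in Å^2) buried per monomer upon dimer formation.

    Assumed convention: half of (SASA_A + SASA_B - SASA_dimer).
    """
    return (sasa_a + sasa_b - sasa_dimer) / 2.0


# Hypothetical SASA values (Å^2), chosen to reproduce the reported 1650 Å^2.
buried = buried_area_per_monomer(12000.0, 12000.0, 20700.0)

# Ratio to the 700 Å^2 reference value cited in the text.
ratio = buried / 700.0
```

With the reported 1650 Å² per monomer, the ratio to the 700 Å² reference value is about 2.4.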
On the basis of this structural similarity, we propose that this trefoil knot in the core domain of HI0303 mediates the binding of the AdoMet/AdoHcy substrate, as observed in the structure of the YibK-AdoHcy complex.6 Considering the structural similarity between YibK and HI0303, we generated a model for the complex of AdoHcy and HI0303 [Fig. 2(A), (C), (D)]. The AdoHcy molecule assumes the more commonly observed extended conformation in our model, instead of the strained conformation seen in the YibK-AdoHcy complex. In this model, AdoHcy is located in a cavity near the knot, surrounded by two highly conserved segments 195-GSEGG-199 and 218-LGKRVLRTET-225 in HI0303 (Fig. 1). The AdoHcy molecule shows generally favorable interactions with these conserved residues of the protein, with no bad steric clashes. Therefore, it is likely that HI0303 and its homologs can also bind AdoMet/AdoHcy and are thus also MTases. Attempts at cocrystallizing HI0303 with AdoMet were not successful because the protein precipitated shortly after addition of the compound.
Among the structural homologs of HI0303, RrmA and MT1 have been classified as RNA 2′-O-ribose MTases. The gene for MT1 is located in an operon for ribosomal proteins, whereas RrmA contains the three conserved motifs1, 2 that have been identified for some members of this family. The first motif corresponds to the β6–α2 segment in the structure of HI0303, but the conformation of residues in this region is different among the enzymes [Fig. 2(B)]. The second and third motifs correspond to the β10–α5 and β11–α6 segments, respectively, which comprise the knot of the core domain. These are the same segments that are highly conserved among the HI0303 family members and may interact with the AdoMet molecule. However, the sequence homology between HI0303 and RrmA for the residues in these two motifs is very low.
Our observation that the small domain of HI0303 shares structural similarity with an RNA-binding domain suggests that HI0303 may also be an RNA MTase. This hypothesis is supported by an examination of the electrostatic surface features of the HI0303 dimer [Fig. 2(D)]. There is a long groove on the surface of the dimer, which is flanked on both sides by many highly conserved basic residues [Fig. 2(D)]. Notably, the AdoHcy-binding site is in close proximity to a cluster of conserved basic residues from the small domain (His27, Arg33, and Lys59) and one residue from the interacting monomer (Arg214) [Fig. 2(D)], which may mediate the positioning of the RNA substrate near the active site. Therefore, our structural analyses suggest that HI0303 and its homologs are RNA 2′-O-ribose methyltransferases.