Structure of HI0073 from Haemophilus influenzae, the nucleotide-binding domain of a two-protein nucleotidyl transferase


Introduction.

Nucleotidyl transferases (NTs) are enzymes utilized by all living systems. They catalyze the transfer of a nucleotide to an acceptor hydroxyl group in a wide range of substrates. At the time of this writing, the International Union of Biochemistry and Molecular Biology classifies 61 NTs with distinct specificities [Enzyme Commission (EC) 2.7.7, at http://www.chem.qmul.ac.uk/iubmb/enzyme/], of which 3 have been reclassified as ribonucleases. NTs play roles in diverse biological processes such as polynucleotide synthesis and modification, small-molecule biosynthesis and metabolism, and antibiotic resistance.

The 2 Haemophilus influenzae open reading frames, HI0073 and HI0074, comprise an operon coding for a novel 2-protein NT complex.1 We have shown by size exclusion chromatography (SEC) that HI0073 and HI0074 form a 2:2 heterotetramer in solution. HI0073 is a member of a protein sequence family from archaea and bacteria previously termed “minimal nucleotidyl transferases” (MNTs).2 We have also shown1 that HI0074 is structurally related to the substrate-binding domain of kanamycin nucleotidyl transferase.3 Taken together, these data support the premise that HI0073 is the nucleotide-binding domain, and HI0074 is the substrate-binding domain, of the HI0073/74 enzyme complex. Moreover, sequence analysis revealed that the 2-component NTs form a superfamily comprising at least 8 families, abundant in organisms living in harsh conditions and in some pathogens.1 The substrate specificity and the biological functions of the HI0073/74 family and of members of the other 2-component NT families are unknown.

Here we present the crystal structure of the nucleotide-binding domain HI0073 at a resolution of 1.8 Å. The structure was determined by exploiting the anomalous signal of zinc. Intrinsic tryptophan fluorescence and the fluorescent compound 2′(or 3′)-O-(2,4,6-trinitrophenyl)-adenosine 5′-triphosphate (TNP-ATP) were used to characterize nucleotide binding. Additional information about HI0073 and HI0074 is provided on our Structural Genomics website: http://s2f.umbi.umd.edu.

Materials and Methods.

Cloning of HI0073:

The gene encoding HI0073 from H. influenzae Rd KW20 was amplified using PfuTurbo DNA polymerase (Stratagene; La Jolla, CA), genomic DNA, and 5′- and 3′-end primers. For the forward primer, a sequence encoding a thrombin cleavage site and an NdeI restriction site at the 5′-end of the gene was included. The polymerase chain reaction (PCR) product was introduced into the pET100/D-TOPO expression vector by the TOPO directional cloning procedure (Invitrogen, La Jolla, CA). Recombinant plasmids were isolated from the Escherichia coli TOP10 strain. The expression construct for production of native protein without the His-tag was prepared by digestion with NdeI and self-ligation.

Protein purification:

HI0073 protein was purified from E. coli BL21(DE3) cells grown at 30°C in Luria–Bertani (LB) medium containing 100 μg/mL ampicillin to an A600 of 0.6 and induced with 0.1 mM isopropylthio-β-D-galactoside (IPTG). Harvested cells were disrupted in buffer containing 20 mM Tris-HCl at pH 7.5 and 1 mM ethylenediaminetetraacetic acid (EDTA) (Buffer A) by passage through a French press. After centrifugation, the soluble fraction was passed through a Q Sepharose column (Amersham Biosciences) and eluted in Buffer A using a 0–1 M NaCl linear gradient. Ammonium sulfate was added to a concentration of 2 M, and the solution was applied to a Butyl Sepharose column (Amersham Biosciences). The protein was eluted with a 2–0 M linear gradient of ammonium sulfate. The protein-containing fraction was applied to a Sephacryl S-100 column (Amersham Biosciences) equilibrated with 100 mM NaCl in Buffer A, and the eluted protein was concentrated to ∼15 mg/mL.

Analytical size exclusion chromatography:

Analytical SEC was performed on an ÄKTA Purifier10 using a Superdex75 HR 10/30 column (Amersham Biosciences). Runs were performed at 0.4 mL/min in a solution containing 20 mM N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES)-NaOH (pH 7.5) and 0.1 M NaCl.

Fluorescence measurements:

All spectra were obtained using a Spex FluoroMax2 at 293 K. For nucleotide-binding assays, the fluorescence emission was measured from varying concentrations of equimolar HI0073 and HI0074 solution in 20 mM Tris-HCl (pH 7.5), 10 mM MgCl2, and 100 mM NaCl. The samples were excited at 280 nm, and the emission spectra were recorded at 290–450 nm, with maximal fluorescence emission at 340 nm. Titration curves were measured for the following nucleotides: adenosine 5′-triphosphate (ATP), adenosine 5′-diphosphate (ADP), adenosine 5′-monophosphate (AMP), cytidine 5′-triphosphate (CTP), guanosine 5′-triphosphate (GTP), uridine 5′-triphosphate (UTP), and 2′-deoxythymidine 5′-triphosphate (dTTP). For the titration of the fluorescent nucleotide TNP-ATP (Molecular Probes), a 1 μM TNP-ATP solution in 20 mM Tris-HCl (pH 7.5), 3 mM MgCl2, and 100 mM NaCl was titrated with HI0073/74. The sample was excited at 408 nm, and emission spectra were recorded at 520–620 nm. Maximal fluorescence was observed at 555 nm. Binding constants were calculated using a simple hyperbolic model.
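The hyperbolic model mentioned above can be illustrated with a minimal Python sketch. It assumes a single-site relation, signal = Fmax·[L]/(Kd + [L]), where the signal is the fluorescence change relative to the ligand-free sample and the free ligand concentration is approximated by the total concentration. The function names, the search range for Kd, and the synthetic data are hypothetical and are not taken from the original analysis; the fitting software actually used is not stated.

```python
import math


def best_fmax(concs, signals, kd):
    """For a fixed Kd the model is linear in Fmax, so least squares is closed-form."""
    x = [c / (kd + c) for c in concs]
    return sum(xi * s for xi, s in zip(x, signals)) / sum(xi * xi for xi in x)


def sse(concs, signals, kd):
    """Sum of squared residuals at a given Kd, with Fmax optimized for that Kd."""
    f_max = best_fmax(concs, signals, kd)
    return sum((s - f_max * c / (kd + c)) ** 2 for c, s in zip(concs, signals))


def fit_hyperbola(concs, signals, kd_lo=1e-2, kd_hi=1e4, n_grid=121, iters=60):
    """Return (Fmax, Kd): coarse grid over log10(Kd), then golden-section refinement."""
    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    best = min(grid, key=lambda g: sse(concs, signals, 10 ** g))
    a, b = max(lo, best - step), min(hi, best + step)
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if sse(concs, signals, 10 ** c) < sse(concs, signals, 10 ** d):
            b = d
        else:
            a = c
    kd = 10 ** ((a + b) / 2)
    return best_fmax(concs, signals, kd), kd


# Synthetic, noise-free titration (hypothetical data; concentrations in micromolar).
concs = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
signals = [1.0 * c / (72.0 + c) for c in concs]
f_max, kd = fit_hyperbola(concs, signals)
print(f"Fmax = {f_max:.3f}, Kd = {kd:.1f} uM")
```

Fitting in log10(Kd) keeps the search well scaled across several orders of magnitude; real titration data would also call for an uncertainty estimate, which this sketch omits.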

Crystallization and data collection:

HI0073 was crystallized at room temperature by the vapor diffusion method in hanging drops. The protein solution was mixed with well solution containing 11% 2-propanol, 0.1 M sodium cacodylate (pH 6.5), and 0.2 M zinc acetate and equilibrated against the well solution. For diffraction data collection, the crystals were immersed in cryogenic solution comprising 30% glycerol, 0.1 M sodium cacodylate (pH 6.5), and 0.2 M zinc acetate, and flash-cooled in liquid propane cooled by liquid nitrogen. Crystal parameters are provided in Table I. Multiple-wavelength anomalous dispersion (MAD) data, exploiting the absorption edge of Zn, were collected at the Industrial Macromolecular Crystallography Association–Collaborative Access Team (IMCA-CAT) 17-ID beamline at the Advanced Photon Source (APS; Argonne National Laboratory, Argonne, IL). For data acquisition, the beamline was equipped with an Area Detector Systems Corporation (ADSC) Quantum 210 charge-coupled device (CCD) detector. Single-wavelength data were collected at the IMCA-CAT beamline 17-BM, which was equipped with a MAR 165 mm CCD detector. Data were integrated and scaled using the HKL2000 program suite.4

Table I. Data Collection and Phasing Statistics

Space group                         P6₃
Cell dimensions (Å)                 a = 89.56, c = 61.48
No. molecules per asymmetric unit   2

Zn MAD data statistics
                              λ11              λ12              λ13              λ14              λ2
 Wavelength (Å)               1.2832           1.2830           1.2571           1.2965           0.9800
 Resolution (Å)               2.70             2.70             2.70             2.70             1.80
 No. observed reflections     282,367          282,520          281,912          282,032          960,910
 No. unique reflections [a]   14,988           14,994           14,994           14,988           51,005
 Completeness (%)             99.9 (100) [b]   100 (100) [b]    100 (100) [b]    99.9 (99.9) [b]  99.6 (100.0) [c]
 Rmerge [d]                   7.4 (27.5) [b]   7.5 (34.5) [b]   7.2 (42.8) [b]   6.3 (39.6) [b]   2.8 (26.9) [c]
 〈I/σ〉                        18.7             17.1             16.6             18.7             32.1

Phasing statistics (MAD)
 Phasing power [e]            0.36, 0.45
 Dispersive R [f]             0.97, 0.96
 Anomalous R [g]              0.76, 0.74, 0.73
 Figure of merit              0.40

  • a: The Friedel pairs are treated as independent reflections.
  • b: The values in parentheses are for the highest resolution shell, 2.42–2.37 Å.
  • c: The values in parentheses are for the highest resolution shell, 1.85–1.80 Å.
  • d: $R_{\mathrm{merge}} = \sum_{hkl} \left[ \left( \sum_j |I_j - \langle I \rangle| \right) / \sum_j |I_j| \right]$, for equivalent reflections (anomalous data separated).
  • e: Phasing power $= \sum_j |F_H| / \sum_j |E|$, where $E$ is the lack of closure error.
  • f: Dispersive $R = \sum_{hkl} \big|\, |F_P + F_{H(\mathrm{calc})}| - |F_{PH}| \,\big| / \sum_{hkl} |F_{PH} - F_P|$, where $F_P$ corresponds to the reference data set λ11, and $F_{PH}$ corresponds to data collected at λ12 or λ13.
  • g: Anomalous $R = \sum_{hkl} |\Delta F^{\pm}_{\mathrm{obs}} - \Delta F^{\pm}_{\mathrm{calc}}| / \sum_{hkl} |\Delta F^{\pm}_{\mathrm{obs}}|$, where $\Delta F^{\pm}$ is the structure factor difference between Friedel pairs.
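As an illustration of the Rmerge statistic defined in footnote d, the following Python sketch computes it from repeated intensity measurements. It is a sketch under stated assumptions, not the HKL2000 implementation: it uses the conventional ratio-of-sums form (deviations summed over all reflections, divided by intensities summed over all reflections), it skips reflections measured only once, and the input layout and function name are hypothetical.

```python
from collections import defaultdict


def r_merge(observations):
    """Compute Rmerge from ((h, k, l), intensity) pairs.

    Assumes the indices are already reduced to unique reflections, with Friedel
    mates kept as separate indices when anomalous data are treated separately.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            continue  # assumption: singly measured reflections are excluded
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


# Hypothetical measurements of two unique reflections.
obs = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0), ((0, 1, 0), 50.0), ((0, 1, 0), 56.0),
]
print(f"Rmerge = {100 * r_merge(obs):.1f}%")
```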

Structure determination:

Four-wavelength MAD data were collected at the absorption edge of zinc. The program SOLVE5 was used to identify 7 Zn sites, which were then used to calculate phases using data to a resolution of 2.7 Å. Solvent flattening, assuming a solvent content of 45%, was carried out using RESOLVE.6 The model was built on a Silicon Graphics Octane workstation using the interactive computer graphics program XTALVIEW.7 The structure was later independently determined by single-wavelength anomalous diffraction (SAD) at 1.8 Å resolution, yielding a solvent-flattened electron density map of higher quality than the MAD electron density map. Refinement of the structure (Table II) was carried out using the program SHELXL.8 Water molecules were added using SHELXWAT. Several residual peaks were present in the difference Fourier (Fo-Fc) electron density map in the vicinity of the Zn cluster. We attribute these peaks to alternate positions of zinc ions, noting that the crystals contained a high concentration of zinc acetate. These were not modeled because of the ambiguity of the map in the vicinity of the zinc ions. Structure analysis was carried out using PROCHECK.9 MOLSCRIPT10 and RASTER3D11 were used for depiction of the structure.
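To relate the data-collection wavelengths listed in Table I to the zinc absorption edge, wavelengths can be converted to photon energies with E = hc/λ. The short Python sketch below does this for the tabulated values; the function name is hypothetical, and the roles of the individual wavelengths (peak, inflection, remote) are not stated in the text, so none are assigned here. The Zn K edge lies near 9.66 keV.

```python
# h*c expressed in keV * Angstrom, so that E [keV] = HC_KEV_ANGSTROM / wavelength [Angstrom]
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths (Angstrom) as listed in Table I, keyed by the table's column labels.
wavelengths = {"λ11": 1.2832, "λ12": 1.2830, "λ13": 1.2571, "λ14": 1.2965, "λ2": 0.9800}

for label, lam in wavelengths.items():
    print(f"{label}: {lam:.4f} A = {wavelength_to_kev(lam):.3f} keV")
```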

Table II. Refinement Statistics

Resolution (Å)                  19.8–1.80
Wavelength (Å)                  0.9800
Unique reflections              48,448
Completeness (%) [a]            100 (100)
Number of protein atoms         1653
Number of H2O                   310
Number of heterogen atoms       19
Rcryst (%) [b]                  20.0
Rfree (%) [c]                   28.2
RMSD from ideal geometry
 Bond length                    0.011 Å
 Angle distances                0.034 Å
Ramachandran plot (%)
 Most favored                   91.0
 Allowed                        9.0
 Generously allowed             0.0
 Disallowed                     0.0

  • a: The values in parentheses are for the highest resolution shell.
  • b: $R_{\mathrm{cryst}} = \sum_{hkl} \big|\, |F_o| - |F_c| \,\big| / \sum_{hkl} |F_o|$, where $F_o$ and $F_c$ are the observed and calculated structure factors, respectively.
  • c: Rfree is computed from 2532 reflections that were randomly selected and omitted from the refinement.
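The Rcryst and Rfree definitions in the footnotes of Table II can be sketched in Python as follows. This is an illustrative sketch rather than the SHELXL calculation: it assumes the calculated amplitudes are already on the same scale as the observed ones, and the function names, the boolean free-set flags, and the numbers are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over the supplied reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("f_obs and f_calc must be non-empty and of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, free_flags):
    """Return (R over refined reflections, R over reflections omitted from refinement)."""
    work_obs = [fo for fo, free in zip(f_obs, free_flags) if not free]
    work_calc = [fc for fc, free in zip(f_calc, free_flags) if not free]
    free_obs = [fo for fo, free in zip(f_obs, free_flags) if free]
    free_calc = [fc for fc, free in zip(f_calc, free_flags) if free]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Hypothetical amplitudes; the last reflection belongs to the free set.
f_obs = [100.0, 50.0, 80.0, 20.0, 60.0]
f_calc = [90.0, 55.0, 80.0, 25.0, 66.0]
free_flags = [False, False, False, False, True]
r_work, r_free = work_and_free_r(f_obs, f_calc, free_flags)
print(f"R(work) = {100 * r_work:.1f}%, R(free) = {100 * r_free:.1f}%")
```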

Results and Discussion.

The 114-amino acid structure of HI0073, refined at 1.8 Å resolution, contains 2 molecules in the asymmetric unit. Five N-terminal and 9 C-terminal residues of the first molecule, and 3 N-terminal and 9 C-terminal residues of the second molecule are disordered. The root-mean-square deviation (RMSD) between α-carbon positions of the 2 molecules is 0.7 Å with the largest differences corresponding to a lysine-rich loop region comprising Val37 to Lys43. SEC indicates that HI0073 is predominantly a dimer in solution. However, the crystal structure reveals monomers with few contacts between molecules. In contrast, the HI0073 partner protein, HI0074, forms a tight dimer both in the crystal and in solution.1 This is consistent with our model of the HI0073/74 complex, where HI0074 mediates the dimer association.
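The RMSD quoted above is the root of the mean squared distance between paired α-carbon atoms. A minimal Python sketch follows, assuming the two molecules have already been superposed (the superposition step, for example a least-squares fit, is not shown) and that the coordinate lists are paired residue by residue. The function name and the coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD in the input units between two equal-length lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical, already superposed C-alpha positions (Angstrom).
molecule_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
molecule_b = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
print(f"RMSD = {ca_rmsd(molecule_a, molecule_b):.2f} A")
```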

HI0073 adopts an α/β-fold. A 4-stranded mixed β-sheet with the topology β3 ↑-β2 ↑-β1 ↓-β4 ↑ is flanked by 2 α-helices on 1 face of the sheet, and 1 α-helix on the opposite face [Fig. 1(A)]. The structure comparison program DALI12 identified 7 structurally similar proteins with Z scores greater than 4: TTC1285, a putative MNT from Thermus thermophilus HB27 [NMR structure, Protein Data Bank (PDB) code: 1wot]; tRNA nucleotidyl transferase (PDB code: 1uet); Kanamycin nucleotidyl transferase (PDB code: 1kny)13; DNA-polymerase β (PDB code: 1bpy)14; 2 Poly(A)-polymerase fragments (PDB codes: 1fa0 and 1vfg)15, 16; and the ribosomal protein S6 (PDB code: 1ris).17 With the exception of the ribosomal protein, the proteins with confirmed function correspond to the nucleotide-binding domains of enzymes involved in nucleotide transfer, further substantiating the proposal that HI0073 and its sequence relatives form the nucleotide binding domain of the 2-component NT superfamily. The structure most similar to HI0073 is TTC1285, a member of the 2-component NT superfamily that shares 34% sequence identity with HI0073. The 2 structures superimpose with an RMSD of 2.9 Å over the paired α-carbon atoms.

Figure 1.

Structure and nucleotide binding of HI0073. (A) Ribbon diagram of the fold. α-helices are colored in salmon–red; β-strands are colored in lilac. The N- and C-termini are labeled. (B) Stereoscopic view of the binuclear metal binding site. The aspartic acids (Asp46, Asp48, and Asp79) that serve as ligands to the metal, as well as the glutamic acids from the symmetry-related molecule (Glu67′ and Glu71′), are shown as a stick model with the following color scheme: oxygen, red; carbon, green. Zinc ions are shown as gold spheres, and water molecules as red spheres. The loop following β1 is highlighted in dark salmon–red. Shown in a gold stick model is DCT from the structure of the DNA polymerase β complex. Its position was obtained by superposing the binuclear metal cluster and the 2 aspartic acids common to the 2 enzymes. (C) Progress curve of dTTP titration. Tryptophan fluorescence energy transfer is provided as a function of dTTP concentration. The data were fitted to a hyperbolic function to obtain a Kd value of 72 ± 5 μM.

The crystal structure contains 6 zinc ions, 2 sulfate ions, 1 sodium ion, and 1 glycerol molecule. The environment of 4 of the zinc ions is of functional importance: A pair of zinc ions, 1 pair per protein molecule, binds to Asp46, Asp48, and Asp79 [Fig. 1(B)], which are conserved in the HI0073 sequence family. This region corresponds to the magnesium sites of the nucleotide-binding domains in the previously known structures of NTs. A superposition of HI0073 with the structure of DNA polymerase β determined in complex with Mg2+-dideoxy cytidine triphosphate (DCT)14 shows that, in addition to the presence of a magnesium binuclear center in the same position as the zinc binuclear center, carboxylate groups equivalent to Asp46 and Asp48 are also found in DNA polymerase β (numbered Asp190 and Asp192).

In addition to the 3 HI0073 aspartic acid ligands conserved in the sequence family, the zinc ion pair coordinates water molecules and carboxylate groups of symmetry-related molecules, such that the coordination geometry of each metal is octahedral. Octahedral coordination is typical of magnesium ions, whereas zinc ions exhibit tetrahedral or pentagonal coordination. Moreover, functional zinc ligands usually comprise histidine and cysteine side-chains and carboxylic acids, but not multiple carboxylic acids.18–20 Thus, we propose that the relatively high concentration of zinc in the crystallization solution (0.2 M) enabled the metal to occupy the magnesium sites, employing coordination geometry characteristic of magnesium ions.

While HI0073 and DNA polymerase β both contain 2 analogous aspartic acids coordinated to the metals, the remaining ligands are different. Instead of Asp79 in HI0073, a water molecule serves as a magnesium ligand in the DNA polymerase, and instead of 1 of the water molecules coordinated to the second metal in HI0073, Asp356 serves as a ligand in the polymerase structure. Overall, both proteins utilize 3 carboxylate groups as ligands of a binuclear metal center. The α-phosphate oxygen atom of DCT bridges the 2 metal ions of the DNA polymerase. Glu67 of a symmetry-related molecule mimics this role in the crystal structure of HI0073 [Fig. 1(B)]. Finally, the position of the γ-phosphate oxygen that serves as a ligand to 1 of the magnesium ions in the polymerase structure is occupied by a water molecule in HI0073.

In 1 molecule (molecule A in the PDB entry, 1no5), 2 additional zinc ions are associated with the zinc binuclear center. These sites are located at the interface between symmetry-related molecules and are not likely to have a functional role. They may have arisen because of the high zinc concentration. Finally, 1 zinc ion, with tetrahedral coordination geometry, mediates the interaction between the 2 molecules in the asymmetric unit, and a second zinc ion (also with tetrahedral coordination), together with the 2 sulfate ions and a sodium ion, form a cluster that mediates contacts between symmetry-related molecules.

The loop connecting β1 and β2 contains a helical turn between Gly34 and Gly39. Gly34 and Ser35 are conserved in the HI0073 sequence family. The residue conservation and their proximity to the cofactor binding site suggest that the 34–39 segment plays an important functional role. Indeed, our proposed model of the HI0073/74 complex, which was based on the crystal structure of HI0074 and a comparative model of HI0073,1 places this helical turn at the interface between the 2 proteins. The model also projects the side-chains of the 3 conserved aspartate residues into the HI0074 substrate binding cleft, indicating that the model is consistent with a bound nucleotide available for transfer to the substrate.

The nature of the physiologically relevant nucleotide remains unclear. The fluorescence assays probing intrinsic tryptophan residues confirmed that the HI0073/74 complex binds a variety of nucleotides with a range of dissociation constants [Kd(ATP) = 111 μM, Kd(ADP) = 187 μM, Kd(AMP) = 170 μM, Kd(CTP) = 118 μM, Kd(GTP) = 81 μM, Kd(UTP) = 162 μM, and Kd(dTTP) = 72 μM]. The highest affinity is thus for dTTP [Fig. 1(C)], though the differences in binding affinity are too small to identify a preferred nucleotide conclusively. Nevertheless, for adenine nucleotides, it appears that ATP binds more tightly than ADP or AMP.
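To put the reported dissociation constants in perspective, the Python sketch below computes the fractional saturation, [L]/(Kd + [L]), that each nucleotide would reach at a single ligand concentration under a single-site model. The Kd values are those listed above; the 100 μM concentration, the variable names, and the assumption that free ligand approximates total ligand are illustrative choices, not part of the original analysis.

```python
# Dissociation constants (micromolar) as reported in the text.
KD_UM = {
    "ATP": 111.0, "ADP": 187.0, "AMP": 170.0, "CTP": 118.0,
    "GTP": 81.0, "UTP": 162.0, "dTTP": 72.0,
}


def fractional_saturation(ligand_um, kd_um):
    """Fraction of sites occupied for a single-site model at a given free ligand level."""
    return ligand_um / (kd_um + ligand_um)


LIGAND_UM = 100.0  # hypothetical nucleotide concentration used only for comparison

saturation = {name: fractional_saturation(LIGAND_UM, kd) for name, kd in KD_UM.items()}
tightest = min(KD_UM, key=KD_UM.get)
weakest = max(KD_UM, key=KD_UM.get)
kd_spread = KD_UM[weakest] / KD_UM[tightest]

for name in sorted(KD_UM, key=KD_UM.get):
    print(f"{name:5s} Kd = {KD_UM[name]:5.0f} uM  saturation = {saturation[name]:.2f}")
print(f"Kd spread ({weakest}/{tightest}) = {kd_spread:.1f}-fold")
```

The weakest and tightest binders differ by less than 3-fold in Kd, which is consistent with the limited discrimination noted in the text.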

Direct binding of the fluorescent nucleotide TNP-ATP indicated tighter binding (Kd = 0.6 μM) than the values obtained for physiologically relevant nucleotides using the indirect tryptophan fluorescence method. To further investigate nucleotide binding, the assay was continued after a plateau was reached by adding ATP until the fluorescence signal was quenched to half the maximum value. Next, EDTA was added to the solution, quenching the signal to the base value representing free TNP-ATP. The titration with ATP thus demonstrated displacement of TNP-ATP, confirming that both nucleotides bind at the same site, and chelation of magnesium by EDTA confirmed that nucleotide binding depends on the presence of the cation. [Protein Data Bank coordinates entry code: 1NO5.]

Acknowledgements

We thank John Moult and Eugene Melamud for the use of, and help with, their bioinformatics website. We thank the staff at the Advanced Photon Source, IMCA-CAT, for their help during data collection. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science, under contract W-31-109-Eng-38.
