Comparative sequence analysis suggests that the ydaF gene encodes a protein (YdaF) that functions as an N-acetyltransferase, more specifically a ribosomal N-acetyltransferase. A basic local alignment search tool (BLAST) search suggests that YdaF belongs to a large family of proteins (199 proteins found in 88 unique species of bacteria, archaea, and eukaryotes). YdaF also belongs to COG1670,1 which includes the Escherichia coli RimL protein that is known to acetylate ribosomal protein L12. N-acetyltransferases (NATs) have been found in all kingdoms.2 NAT enzymes catalyze the transfer of an acetyl group from acetyl-CoA (AcCoA) to a primary amino group. For example, NATs can acetylate the N-terminal α-amino group, the ϵ-amino group of lysine residues, aminoglycoside antibiotics, spermine/spermidine, or arylalkylamines such as serotonin.3
The crystal structure of the putative ribosomal NAT protein, YdaF, from Bacillus subtilis presented here was determined as part of the Midwest Center for Structural Genomics. The structure maintains the conserved tertiary structure of other known NATs and high sequence similarity in the presumed AcCoA binding pocket3–6 in spite of a very low overall level of sequence identity to other NATs of known structure.
The B. subtilis acetyltransferase YdaF protein was prepared following procedures described previously.7 The open reading frame of B. subtilis YdaF protein was amplified by polymerase chain reaction (PCR) from E. coli DH5α genomic DNA. The gene was cloned into the pMCSG7 vector8 using a modified ligation-independent cloning protocol.9 This process generated an expression clone producing a fusion protein with an N-terminal His6 tag and a Tobacco Etch Virus (TEV) protease recognition site (ENLYFQ↓S). The fusion protein was overproduced in an E. coli BL21 derivative harboring a plasmid encoding three rare E. coli tRNAs [Arg (AGG/AGA) and Ile (ATA)].
The purification procedure used buffers containing 50 mM N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid (HEPES) pH 8.0, 500 mM NaCl, 5% glycerol, and 10, 20, and 250 mM imidazole for the binding, wash, and elution buffers, respectively. The cells were lysed by sonication after adding fresh lysozyme to a final concentration of 1 mg/mL in the presence of protease inhibitor cocktail (Sigma). The lysate was clarified by centrifugation and passed through a pre-equilibrated Ni-NTA column (QIAGEN), and the bound protein was eluted with elution buffer. The His6 tag was removed by cleavage with recombinant His-tagged TEV protease. The cleaved protein was then separated from the His tag and the His-tagged TEV protease by passing the mixture through a second Ni2+ column. For the crystallization experiments, the sample buffer was exchanged into 10 mM Tris/HCl pH 7.6, 2 mM dithiothreitol (DTT) using a PD-10 column (Amersham Biosciences).
The molecular weight of the protein in solution was determined by size exclusion chromatography on a Superdex-200 10/30 column (Amersham Biosciences) calibrated with ribonuclease A (13.7 kDa), chymotrypsinogen A (25 kDa), ovalbumin (43 kDa), and albumin (67 kDa) as standards. The calibration curve of Kav versus log molecular weight was prepared using the equation: Kav = (Ve − Vo)/(Vt − Vo), where Ve = elution volume for the protein, Vo = column void volume, and Vt = total bed volume.
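The calibration described above can be sketched in a few lines of Python. The molecular weights of the standards come from the text; the elution volumes, void volume, and bed volume below are hypothetical placeholders, not values measured in this work, and the least-squares line is one common way to build such a curve rather than necessarily the procedure used here.

```python
import math


def kav(ve, vo, vt):
    """Partition coefficient Kav = (Ve - Vo) / (Vt - Vo)."""
    return (ve - vo) / (vt - vo)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_mw(ve, vo, vt, slope, intercept):
    """Invert the calibration line to obtain a molecular weight in kDa."""
    return 10 ** ((kav(ve, vo, vt) - intercept) / slope)


# Standards (kDa) are from the text; everything in mL is HYPOTHETICAL.
STANDARDS_KDA = [13.7, 25.0, 43.0, 67.0]
ELUTION_ML = [17.6, 16.5, 15.4, 14.5]  # placeholder elution volumes
VO_ML, VT_ML = 8.0, 24.0               # placeholder void and bed volumes

log_mw = [math.log10(m) for m in STANDARDS_KDA]
kavs = [kav(ve, VO_ML, VT_ML) for ve in ELUTION_ML]
SLOPE, INTERCEPT = fit_line(log_mw, kavs)

if __name__ == "__main__":
    print(f"Kav = {SLOPE:.3f} * log10(MW) + {INTERCEPT:.3f}")
    mw = estimate_mw(15.0, VO_ML, VT_ML, SLOPE, INTERCEPT)
    print(f"Estimated MW at Ve = 15.0 mL: {mw:.1f} kDa")
```

With real data, the placeholder volumes would be replaced by the measured elution volumes of the four standards and of the sample peaks.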
Diffraction-quality crystals of selenomethionine (SeMet)-derivatized YdaF protein were grown at 22°C using vapor diffusion in hanging drops. The crystallization drops consisted of 2 μL of the protein at 8 mg/mL mixed with 2 μL of reservoir solution containing 10% polyethylene glycol (PEG) 6K, Na/KPO4 pH 6.0, and 0.1 M NaCl. Before data collection, crystals were flash-frozen in liquid nitrogen with crystallization buffer plus 25% sucrose as cryoprotectant. The crystals are monoclinic, space group P21 (refer to Table I for complete details).
Table I. Summary of YdaF Crystal Data, MAD Data Collection, and Refinement
Unit cell parameters: a = 59.339 Å, b = 134.185 Å, c = 91.062 Å, α = γ = 90°, β = 104.08°
Space group: P21 (No. 4)
Molecular weight [183 residues (SeMet)]
Molecules per asymmetric unit (a.u.)
Selenomethionine residues per a.u.
Resolution limit (Å)
Number unique reflections
Overall data completeness (%)
Overall data redundancy
Overall Rmerge (%)
Figure of merit (FOM)
Resolution range (Å)
Number of reflections (all)
Number of reflections (observed)
Percent reflections observed
Overall R-value (%)
Free R-value (%)
RMS deviations from ideal geometry
Protein non-hydrogen atoms
Mean B-factor (Å2)
Ramachandran plot statistics (%)
Residues in most favored regions
Residues in additional allowed regions
Residues in generously allowed regions
Residues in disallowed region
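As a rough consistency check on the crystal data, the unit cell volume and Matthews coefficient can be sketched as follows. The cell parameters are those listed in Table I. The monomer mass is taken as half of the expected dimer mass quoted for the size exclusion experiment, which ignores the selenium substitution, and the number of molecules per asymmetric unit is an assumption made for illustration only (six, i.e., one hexamer).

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic cell (alpha = gamma = 90 deg): V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, n_symops, mols_per_au, mw_da):
    """Vm = V / (Z * MW), where Z = symmetry operators * molecules per a.u."""
    return cell_volume / (n_symops * mols_per_au * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the usual 1 - 1.23 / Vm estimate."""
    return 1.0 - 1.23 / vm


# Cell parameters from Table I (angstroms, degrees).
A, B, C, BETA = 59.339, 134.185, 91.062, 104.08
N_SYMOPS_P21 = 2          # P21 has two symmetry operators
MOLS_PER_AU = 6           # ASSUMPTION: one hexamer per asymmetric unit
MONOMER_DA = 42100.0 / 2  # from the expected dimer mass in the text

if __name__ == "__main__":
    volume = monoclinic_cell_volume(A, B, C, BETA)
    vm = matthews_coefficient(volume, N_SYMOPS_P21, MOLS_PER_AU, MONOMER_DA)
    print(f"Cell volume: {volume:.0f} A^3")
    print(f"Matthews coefficient: {vm:.2f} A^3/Da")
    print(f"Estimated solvent fraction: {solvent_fraction(vm):.2f}")
```

Changing `MOLS_PER_AU` shows how sensitive the estimate is to the assumed content of the asymmetric unit.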
A two-wavelength multiple-wavelength anomalous dispersion (MAD) data set was collected at the peak and edge energies, as determined from a fluorescence scan of a crystal containing SeMet-labeled YdaF protein. The data were processed with HKL2000 and the substructure was determined using the MAD method in SOLVE.10 This initial step was performed within the Automated Crystallographic System (ACrS).11 The resulting initial model was 48% complete. The substructure solution was then used to phase the structure within AutoSHARP, utilizing the Bayesian statistical approach in SHARP.12 The resulting phases and refined substructure were submitted for density modification, phase extension, non-crystallographic symmetry (NCS) averaging, and auto tracing with RESOLVE,13–15 and the resulting initial model included 68% of the polypeptide chain.
Final refinement was completed with REFMAC16 using 24 translation, libration, and screw rotation tensor (TLS) groups17, 18 in combination with restrained maximum-likelihood refinement. Manual model building was done using Quanta.19 Crystal characteristics, data collection, structure solution, and refinement statistics are shown in Table I.
Results and Discussion.
The crystal structure of YdaF protein reveals a hexameric unit with 32 symmetry [Fig. 1(a)]. This is consistent with the results of size exclusion chromatography, which suggest that the protein is a mixture of dimers and hexamers. The observed molecular weight of the dimer is 48.8 kDa (expected 42.1 kDa) and that of the hexamer is 128.3 kDa (expected 126.3 kDa). The ratio of dimer to hexamer observed under the conditions of the size exclusion chromatography experiment was 3:2. The crystal structure and solution measurements suggest that YdaF is a trimer of dimers.
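The assignment of the two chromatography peaks can be illustrated with a short calculation. This is a minimal sketch that uses only the masses quoted above; the monomer mass is inferred as half of the expected dimer mass, and the function names are introduced here for illustration.

```python
def apparent_subunits(observed_kda, monomer_kda):
    """Number of monomers implied by an observed mass (not rounded)."""
    return observed_kda / monomer_kda


def percent_deviation(observed_kda, expected_kda):
    """Signed deviation of the observed mass from the expected mass, in percent."""
    return 100.0 * (observed_kda - expected_kda) / expected_kda


MONOMER_KDA = 42.1 / 2  # inferred from the expected dimer mass in the text

# (observed, expected) masses in kDa, as quoted in the text.
PEAKS = {"dimer": (48.8, 42.1), "hexamer": (128.3, 126.3)}

if __name__ == "__main__":
    for name, (observed, expected) in PEAKS.items():
        n = apparent_subunits(observed, MONOMER_KDA)
        dev = percent_deviation(observed, expected)
        print(f"{name}: {n:.2f} subunits, {dev:+.1f}% vs expected")
```

Rounding the apparent subunit counts to the nearest integer gives 2 and 6, matching the dimer and hexamer assignments.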
The monomer adopts an α/β fold characteristic of the acetyltransferase domain. The main β-sheet (β2–β5) is antiparallel, with α-helices α1 and α2 on the left side of the sheet; α-helix α3 is cupped within the curved face of the main sheet, and helix α4 lies to the right of helix α3, forming a cleft [Fig. 1(b)]. Structural comparisons were conducted via the DALI server,20 and the top hits were other NATs, with Z-scores as high as 12 although the sequence identity was only 14%. Examination of these similar structures showed that the conserved motifs A–D are also present in the YdaF structure. Motifs A and B have been identified as the putative AcCoA binding site in the other NAT structures.3–6, 21, 22 Despite the overall low sequence similarity, this region of YdaF appears to exhibit both sequence and structural conservation. Superpositions of the structures found in the DALI search showed low deviation in the core region and higher deviations in the relative positions of helices α1 and α2.
COG1670 is functionally annotated as N-acetyltransferases related to the product of the rimL gene, which encodes a ribosomal N-acetyltransferase that acetylates ribosomal protein L12 to form L7 on the large ribosomal subunit. YdaF is a member of this COG. The sequence identity between YdaF and RimL is 29%, with 54% of the residues similar. Within the putative AcCoA binding site, motifs A and B of YdaF and RimL show 66% and 55% sequence similarity, respectively. YdaF is therefore suspected to be the B. subtilis homologue of RimL.
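For readers unfamiliar with how identity and similarity percentages are derived from a pairwise alignment, a minimal sketch follows. The similarity groups and the toy alignment are assumptions made for illustration; the text does not state which scoring scheme or denominator convention was used, and the strings below are not the YdaF or RimL sequences.

```python
# Conservative-substitution groups: an ASSUMPTION for illustration only.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def _similar(x, y):
    """True if both residues fall in the same illustrative group."""
    return any(x in group and y in group for group in SIMILAR_GROUPS)


def identity_and_similarity(seq_a, seq_b):
    """Percent identity and similarity over aligned columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        return 0.0, 0.0
    identical = sum(x == y for x, y in columns)
    similar = sum(x == y or _similar(x, y) for x, y in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


if __name__ == "__main__":
    # Toy aligned fragments, NOT real YdaF/RimL sequences.
    ident, simil = identity_and_similarity("IK-DGA", "LKAEGW")
    print(f"identity {ident:.0f}%, similarity {simil:.0f}%")
```

Other conventions (for example, dividing by the length of the shorter sequence or scoring similarity with a substitution matrix) give somewhat different percentages for the same alignment.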
Atomic coordinates have been deposited in the Protein Data Bank (PDB) with PDB ID 1NSL. We wish to thank all members of the Structural Biology Center at Argonne National Laboratory for their help in conducting these experiments. This work was supported by the Protein Structure Initiative, NIH grant GM-62414, and JSB was partially supported by American Cancer Society fellowship GMC-98219.