The E. coli gidB gene is the promoter-distal member of the two-gene gid operon1, 2 located in the vicinity of the bacterial origin of replication oriC.3, 4 It has been suggested that coupled transcription from the gid and mioC promoters activates initiation of chromosome replication.5 Transcription from the gid promoter oscillates in the same manner as that of the dnaA gene6 and decreases dramatically after the onset of DNA replication.7
E. coli GidB is representative of a large family of proteins encountered in gram-positive and gram-negative bacteria that are thought to function as S-adenosyl-L-methionine (SAM)-dependent methyltransferases in cell division or chromosome replication.1 GidB has been classified as a member of the minimal gene set for cellular life based on a comparison of completely sequenced genomes of two parasitic bacteria: Haemophilus influenzae and Mycoplasma genitalium.8 Because GidA and GidB homologs were found in M. genitalium, the smallest known self-sustaining living organism with a complement of roughly 517 genes,9, 10 the genes encoding both proteins were believed to be essential. Indeed, gidA has been reported to be essential in Helicobacter pylori, in which it is cotranscribed with the dapE gene encoding an N-succinyl-diaminopimelate deacylase.11 A more unexpected finding came from a study involving global transposon mutagenesis of the minimal M. genitalium and Mycoplasma pneumoniae genomes. This work identified a single, possibly disruptive insertion in the gidB gene,12 suggesting that gidB is dispensable.
GidB is a single α/β domain protein (molecular dimensions 45 Å × 46 Å × 44 Å) with a seven-stranded β-sheet of mixed polarity (parallel β3-β2-β1-β4-β5 followed by antiparallel β7-β6) flanked by seven α-helices [Fig. 1(a)]. The linear arrangement of the secondary structural elements is αA, αB, αC, α1, β1, αD, β2, αE, β3, β4, αF, β5, αG, β6, and β7 (Fig. 2). In many, but not all, methyltransferases, the segment connecting β3 and β4 is an α-helix. In GidB, as in H. influenzae YecO methyltransferase,13 the β3-β4 linker is random coil.
A DALI14 search of the Protein Data Bank (http://www.rcsb.org/, December 2001) with the coordinates of GidB identified 303 structurally similar proteins with Z-scores > 2.0. Only 15 of these proteins gave Z-scores > 10.0 (Table I). The putative SAM-binding site includes the C-terminal portion of β1 and the loop connecting β1 and αD. This 11-residue segment, starting at position 71, matches the SAM-binding consensus sequence (Asp-Val/Ile/Leu-Gly-Thr/Ser/Ala-Gly-X-Gly-X-Pro-Gly/Ala/Ser-Ile/Leu/Val, where X represents any residue). A superposition of GidB and rat catechol O-methyltransferase (PDB ID 1VID, Table I) permitted docking of SAM and identification of GidB residues likely to stabilize cofactor binding [Fig. 1(b)].
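The consensus quoted above can be written as a regular expression, which makes it straightforward to scan any protein sequence for candidate SAM-binding segments. The following is a minimal sketch, not the procedure used in this work: the function name `find_sam_motifs` and the toy sequence are hypothetical, and the GidB sequence itself is not reproduced here.

```python
import re

# One character class per consensus position; "." stands for X (any residue).
# Asp-Val/Ile/Leu-Gly-Thr/Ser/Ala-Gly-X-Gly-X-Pro-Gly/Ala/Ser-Ile/Leu/Val
CONSENSUS = r"D[VIL]G[TSA]G.G.P[GAS][ILV]"

# Wrapping the pattern in a lookahead lets overlapping matches be reported too.
_SCANNER = re.compile(r"(?=(" + CONSENSUS + r"))")


def find_sam_motifs(sequence):
    """Return (1-based start position, matched 11-residue segment) pairs."""
    return [(m.start() + 1, m.group(1)) for m in _SCANNER.finditer(sequence.upper())]


# Hypothetical toy sequence used only for illustration (not GidB).
toy_sequence = "MKTDVGSGAGFPGIAAK"
hits = find_sam_motifs(toy_sequence)
print(hits)
```

On the toy sequence the scan reports a single 11-residue match beginning at position 4. A match of this kind only flags a candidate segment; the structural superposition described above remains the basis for assigning the cofactor site.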
Table I. DALI Search Results for Structures Similar to GidB
PDB, Protein Data Bank; RMSD, root-mean-square deviation; RA, number of residues aligned; %SI, percent sequence identity over the aligned fragments.
Automated homology modeling with MODPIPE15 using GidB as a template yielded 636 models of both prokaryotic and eukaryotic proteins (model score > 0.7, model length > 104 residues) reflecting the highly conserved methyltransferase fold. Models with 30–100% sequence identity to the template sequence were limited to GidB family members. The vast majority of the other models represent known methyltransferases with various substrate specificities or larger methyltransferase domain-containing proteins. Detailed inspection of the more stringent MODPIPE/PSI-BLAST results suggests that GidB is specific for sterol and/or lipid substrates. Of the four “best” models (length > 180 residues, model score > 0.9, PSI-BLAST E-value < 1e-20), one is Salmonella enterica GidB (GI:16762453; E-value = 9e-21; 86% identity to target residues 1–207) with the remainder including Nostoc sp. PCC 7120 γ-tocopherol methyltransferase (GI:17229295; E-value = 1e-23; 15% identity to target residues 1–194), Saccharomyces cerevisiae δ(24)-sterol C-methyltransferase (GI:462024; E-value = 3e-22; 15% identity to target residues 76–264), and Vibrio cholerae cyclopropane-fatty-acyl-phospholipid synthase (GI:15641135; E-value = 2e-20; 15% identity to target residues 169–352). It is also possible that GidB is specific for nucleic acids. Five of the most closely related structures revealed by the DALI search have been annotated as either DNA or RNA methyltransferases. We believe, however, that GidB is more likely to be specific for uncharged, sterol substrates because the putative active site is largely hydrophobic [Fig. 3(a)]. Further biochemical and biophysical studies of the GidB protein will be required to establish its precise functional role in bacteria, which should be facilitated by the availability of the X-ray structure presented in this work.
The full-length gidB open reading frame was amplified by polymerase chain reaction with a forward primer containing a BamHI restriction site (GGTCAGGGATCCATGCTCAACAAACTCTCCTTACTGCTG), a reverse primer containing an XhoI restriction site (GCTGACTCGAGTTAAATTTTATTTGCTTTAATCACCACCAG), and E. coli UT5600 DNA as template, following standard protocols.16 The amplified insert was cloned into the corresponding sites of the pGEX-6P-1 plasmid. S-Met and Se-Met glutathione S-transferase (GST)-tagged protein expression was conducted in E. coli BL21 cells (overnight induction at 18°C). Proteins were purified on glutathione and Sepharose Q resins following established procedures.17 Proteins for crystallization were dialyzed extensively against 20 mM HEPES pH 7.0, 100 mM potassium chloride, and 3 mM dithiothreitol, concentrated to 24 mg/mL, and passed through a 0.1-μm filter. Gel filtration experiments indicated that GidB is monomeric in solution (data not shown). MALDI-MS confirmed the identity of the purified recombinant protein (measured mass = 23855.0 ± 15 Da, predicted mass = 23842.5 Da).
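The primer design described above can be checked with a few lines of code. The sketch below is illustrative only and was not part of the cloning workflow. It confirms that each primer carries the stated recognition sequence (GGATCC for BamHI, CTCGAG for XhoI), that the forward primer continues with an ATG start codon directly after the BamHI site, and that the reverse primer encodes a stop codon on the coding strand directly after the XhoI site. The helper names `reverse_complement` and `after_site` are hypothetical.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence

# Primer sequences as quoted in the text (5' to 3').
FORWARD_PRIMER = "GGTCAGGGATCCATGCTCAACAAACTCTCCTTACTGCTG"
REVERSE_PRIMER = "GCTGACTCGAGTTAAATTTTATTTGCTTTAATCACCACCAG"


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def after_site(primer, site):
    """Return the part of the primer 3' of the first occurrence of site, or None."""
    index = primer.find(site)
    return None if index < 0 else primer[index + len(site):]


forward_tail = after_site(FORWARD_PRIMER, BAMHI_SITE)
reverse_tail = after_site(REVERSE_PRIMER, XHOI_SITE)

# The forward primer should continue with the start codon of the open reading frame.
starts_with_atg = forward_tail is not None and forward_tail.startswith("ATG")
# The first three bases after the XhoI site, read on the coding strand,
# should form a stop codon.
stop_codon = reverse_complement(reverse_tail[:3]) if reverse_tail else None

print(starts_with_atg, stop_codon)
```

For the primers as quoted, the forward primer reads ATG immediately after the BamHI site and the reverse primer encodes a TAA stop codon immediately after the XhoI site.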
Diffraction-quality Se-Met and S-Met GidB crystals (tetragonal bipyramids) were obtained by sitting-drop vapor diffusion at 4°C against a reservoir containing 0.1 M sodium citrate pH 6.5, 15% polyethylene glycol 4000, and 10% isopropanol. Crystals were cryoprotected by transfer to mother liquor supplemented with 20% (v/v) glycerol for 15–30 s, followed by immersion in liquid propane (see Table II for space group and unit cell dimensions).
Table II. Crystallographic Data and Refinement Statistics
PDB ID 1JSX
Crystal characteristics and data collection statistics
Cell constants: a = b = 53.9 Å, c = 150.6 Å
Space group: P4122; 1 molecule per asymmetric unit
Total number of models (model score > 0.7, model length > 104 residues)
Models with >50% sequence identity to template
Models with 30–50% sequence identity to template
Models with <30% sequence identity to template
Diffraction data were collected under standard cryogenic conditions with a MARCCD detector on Beamline X9A (National Synchrotron Light Source, Brookhaven National Laboratory) and were processed and scaled with Denzo/Scalepack.18 The structure of the Se-Met protein was determined with data from a two-wavelength anomalous diffraction experiment.19 Four selenium positions were located with SnB20 and refined with MLPHARE21 (figure of merit of 0.28 at 2.4 Å resolution). Density modification of the MLPHARE-refined phases yielded a good-quality experimental electron density map suitable for automated model building with ARP/wARP22 (98 residues built in three fragments of the seven-stranded β-sheet) and FFFEAR23 (50 residues added in α-helical regions). After partial refinement in CNS,24 missing residues were added manually with O.25 Residues 36–47, 192–193, and five residues from the N-terminal cloning artifact (Gly-Pro-Leu-Gly-Ser) were not visible in the electron density map and were omitted from refinement. The final model, consisting of 193 of 207 residues and 127 water molecules, was refined against a 2.4 Å resolution data set obtained from an S-Met crystal to an R factor of 24.0% with an Rfree value of 27.5% (see Table II for a summary of X-ray data and refinement statistics). Refined atomic coordinates and structure factors have been deposited in the Protein Data Bank (PDB ID 1JSX).
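As a plausibility check on the crystal contents listed in Table II (one molecule per asymmetric unit), the Matthews coefficient can be estimated from the cell constants and the predicted protein mass. The sketch below is not part of the original analysis and rests on three assumptions: space group P4122 contributes eight asymmetric units per unit cell; the predicted mass of 23842.5 Da is taken as the mass of the crystallized species; and the conventional constant 1.23 Å³/Da is used to convert the Matthews coefficient to an approximate solvent fraction. The function names are hypothetical.

```python
def matthews_coefficient(a, b, c, asu_per_cell, molecules_per_asu, mass_da):
    """Return V_M in cubic angstroms per dalton for a cell with all angles 90 degrees."""
    cell_volume = a * b * c  # valid for tetragonal (and orthorhombic, cubic) cells
    return cell_volume / (asu_per_cell * molecules_per_asu * mass_da)


def solvent_fraction(vm, protein_constant=1.23):
    """Approximate solvent fraction; 1.23 is the conventional assumed constant."""
    return 1.0 - protein_constant / vm


# Cell constants from Table II; eight asymmetric units assumed for P4122.
vm = matthews_coefficient(
    a=53.9, b=53.9, c=150.6,
    asu_per_cell=8, molecules_per_asu=1, mass_da=23842.5,
)
solvent = solvent_fraction(vm)

print(round(vm, 2), round(solvent, 2))
```

Under these assumptions the estimate is approximately 2.3 Å³/Da, corresponding to roughly 46% solvent, which is consistent with a single GidB molecule in the asymmetric unit.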
We thank Dr. K. Rajashankar from BNL for his support in data collection, and Drs. D. Jeruzalmi and C. Edo for helpful discussions. This work was supported by NIGMS grant P50-GM62529 (S.K.B.) and NIH grant GM20276 (M.J.R.). S.K.B. is an Investigator in the Howard Hughes Medical Institute. The E. coli GidB protein represents target T35 from the New York Structural Genomics Research Consortium.