The crystal structure of YgbM (EC1530) (Fig. 1), a glyoxylate-induced protein from Escherichia coli, has been determined by the multiple-wavelength anomalous dispersion (MAD) method and refined to 1.63 Å. YgbM is encoded by DNA bases 2862259–2863035 and belongs to the protein family Pfam-B_7694.1 The gene is clustered with genes encoding MutS (DNA mismatch repair protein), serine/threonine protein phosphatase, glycerol-3-phosphate regulon repressor, 3-hydroxyisobutyrate dehydrogenase, l-fuculose phosphate aldolase, gluconate permease, RpoS (RNA polymerase sigma factor), NlpD (lipoprotein NlpD), Pcm (protein-l-isoaspartate O-methyltransferase), and SurE (stationary phase survival protein).
The Se-Met derivative of YgbM crystallized in the C2 space group with unit cell dimensions of a = 104.907 Å, b = 74.368 Å, c = 39.376 Å, β = 98.81°. There is one 258-residue protein per asymmetric unit. The structure adopts the common TIM (triosephosphate isomerase) barrel (β/α)8 fold, in which an eight-stranded cylindrical β-sheet is surrounded by eight helices.2 As in other TIM barrel structures, all of the turns between the α-helices and the subsequent β-strands at the N-terminal end of the barrel are composed of only three or four residues, whereas the corresponding loops at the C-terminal end are longer and form part of the potential active site. Inside the TIM barrel, several hydrophilic side chains from the C-terminal loops and two well-ordered water molecules coordinate a Mg2+ ion, presumably forming an active site (Fig. 2). As expected, a Dali search3 found several structures with relatively high similarity, including 4XIS, 1A0C-A, 1QUM-A, 1DE5, and 1BYB, with Z scores higher than 10. Further biochemical and structural analyses are in progress.
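The reported cell constants and the single molecule per asymmetric unit allow a rough check of crystal packing. The following is a minimal sketch rather than part of the original analysis. It assumes an average residue mass of 110 Da (the molecular mass row of Table I carries no value here), four asymmetric units per C2 cell, and the common 1.23/Vm approximation for solvent content; the function names are illustrative.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell in cubic angstroms: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, n_asu, mass_da):
    """Crystal volume per unit of protein mass (A^3/Da)."""
    return cell_volume / (n_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient (1 - 1.23/Vm)."""
    return 1.0 - 1.23 / vm


# Cell constants as reported in the text.
volume = monoclinic_cell_volume(104.907, 74.368, 39.376, 98.81)

# Space group C2: twofold axis combined with C-centering gives 4 asymmetric units per cell.
N_ASU = 4

# ASSUMPTION: mass estimated as 258 residues x 110 Da; not a value taken from the paper.
mass = 258 * 110.0

vm = matthews_coefficient(volume, N_ASU, mass)
print(f"cell volume      : {volume:.0f} A^3")
print(f"Matthews coeff.  : {vm:.2f} A^3/Da")
print(f"solvent fraction : {solvent_fraction(vm):.2f}")
```

Under these assumptions the Matthews coefficient falls within the range usually observed for protein crystals, which is consistent with one molecule per asymmetric unit.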
Materials and Methods.
Protein Cloning, Expression, and Purification. The ORF of ygbM was amplified by PCR from E. coli genomic DNA (ATCC). The gene was cloned into the NdeI and BamHI sites of a modified pET15b cloning vector (Novagen) in which a TEV protease cleavage site replaced the thrombin cleavage site and a double-stop codon was introduced downstream from the BamHI site. This construct provides an N-terminal hexa-histidine tag separated from the protein by a TEV protease recognition site (ENLYFQ↓G). The fusion protein was overexpressed in E. coli BL21-Gold (DE3) (Stratagene) harboring an extra plasmid encoding three rare tRNAs (AGG and AGA for Arg, ATA for Ile). The cells were grown in LB at 37°C to an OD600 of approximately 0.6, and protein expression was induced with 0.4 mM IPTG. After induction, the cells were incubated overnight with shaking at 15°C. The harvested cells were resuspended in binding buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES pH 7.5, 5 mM imidazole), flash-frozen in liquid N2, and stored at −70°C. The thawed cells were lysed by sonication after the addition of 0.5% NP-40 and 1 mM each of PMSF and benzamidine. The lysate was clarified by centrifugation (27,000 g for 30 min) and passed through a DE52 column preequilibrated in binding buffer. The flow-through fraction was then applied to a metal chelate affinity column charged with Ni2+. The hexa-histidine-tagged protein was eluted from the column in elution buffer (500 mM NaCl, 5% glycerol, 50 mM HEPES pH 7.5, 500 mM imidazole), and the tag was then cleaved from the protein by treatment with recombinant His-tagged TEV protease. The cleaved protein was then separated from the cleaved His-tag and the His-tagged protease by passing the mixture through a second Ni2+ column.
The YgbM protein was dialyzed against 10 mM HEPES, pH 7.5, 500 mM NaCl, and concentrated by using a BioMax concentrator (Millipore). Before crystallization, any particulate matter was removed from the sample by passing it through a 0.2-μm Ultrafree-MC centrifugal filter (Millipore). For the preparation of selenomethionine (SeMet)-enriched protein, YgbM was expressed in the methionine auxotroph E. coli strain B834(DE3) (Novagen) grown in supplemented M9 medium and purified under the same conditions as the native protein. The reducing reagent β-mercaptoethanol (5 mM) was added to all purification buffers.
The protein was crystallized by vapor diffusion in hanging drops. Drops were prepared by mixing 2 μL of the protein solution (9 mg/mL) with 2 μL of 0.1 M HEPES, pH 7, 5% PEG 8000, and 5% glycerol, and were equilibrated at 20°C over 100 μL of the same precipitant solution. Crystals, which appeared after 3 days, were flash-frozen in liquid nitrogen before data collection, with crystallization buffer plus 20% glycerol as cryoprotectant.
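The starting composition of the hanging drop follows from the 1:1 mixing described above. The sketch below is illustrative arithmetic only. It assumes that the protein sample is in the dialysis buffer (10 mM HEPES, 500 mM NaCl), that volumes are additive, and that only the components named in the text are present; the helper `mix` and the dictionary keys are hypothetical names.

```python
def mix(volume_a_ul, conc_a, volume_b_ul, conc_b):
    """Volume-weighted concentrations after combining two solutions.

    conc_a and conc_b map a component name to its concentration;
    a component absent from one solution is treated as zero there.
    """
    total = volume_a_ul + volume_b_ul
    components = set(conc_a) | set(conc_b)
    return {
        name: (volume_a_ul * conc_a.get(name, 0.0)
               + volume_b_ul * conc_b.get(name, 0.0)) / total
        for name in components
    }


# ASSUMPTION: the protein sample is in the dialysis buffer given in the text.
protein_solution = {"protein_mg_per_ml": 9.0, "hepes_mM": 10.0, "nacl_mM": 500.0}

# Precipitant (reservoir) solution as given in the text.
reservoir = {"hepes_mM": 100.0, "peg8000_percent": 5.0, "glycerol_percent": 5.0}

drop = mix(2.0, protein_solution, 2.0, reservoir)
for name in sorted(drop):
    print(f"{name}: {drop[name]:g}")
```

Because every component in the drop starts at half of its stock concentration, water diffuses from the drop to the reservoir during equilibration, which slowly concentrates both protein and precipitant.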
Diffraction data were collected at 100 K at the 19ID beamline of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory. Three-wavelength inverse-beam MAD data to 1.63 Å [peak: 12.6620 keV (0.97946 Å), inflection point: 12.6603 keV (0.97957 Å), high-energy remote: 13.1000 keV (0.93927 Å)] were collected from a single Se-Met-labeled protein crystal (0.2 × 0.2 × 0.2 mm) with a 3-s exposure per 1° frame and a crystal-to-detector distance of 150 mm. The total oscillation range was 170°, as predicted by the strategy module of the HKL2000 suite.4 The space group was C2 with cell dimensions of a = 104.907 Å, b = 74.368 Å, c = 39.376 Å, β = 98.81°. All data were processed and scaled with HKL2000 (Table I) to an Rmerge of 7.2%, 7.0%, and 8.0% for the inflection point, peak, and remote data sets, respectively.
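For reference, the Rmerge statistic quoted above is conventionally defined as the summed absolute deviation of each measurement from the mean intensity of its reflection, divided by the summed intensities. The sketch below implements that textbook definition on invented toy data. It is not the HKL2000 implementation, and skipping singly measured reflections is an assumption of this sketch.

```python
from collections import defaultdict


def r_merge(observations):
    """Rmerge = sum_hkl sum_i |I_i - <I>| / sum_hkl sum_i I_i.

    observations: iterable of ((h, k, l), intensity) pairs in which
    symmetry-equivalent measurements already share the same index.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            # ASSUMPTION: reflections measured once are excluded,
            # since they carry no information about agreement.
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator if denominator else float("nan")


# Invented toy measurements, for illustration only.
toy = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 2, 0), 50.0), ((0, 2, 0), 46.0), ((0, 2, 0), 54.0),
    ((1, 1, 1), 20.0),  # measured once, so skipped
]
print(f"Rmerge = {100 * r_merge(toy):.1f}%")
```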
Table I. Summary of Crystal and MAD Data
a = 104.907 Å, b = 74.368 Å, c = 39.376 Å, β = 98.81°
MW Da (residues)
MAD data collection
Resolution range (Å)
No. of unique reflections
R merge (%)
Structure Determination and Refinement.
The structure was determined by MAD phasing5 using CNS6 and refined to 1.63 Å with CNS against the averaged peak data. The initial model was built automatically with ARP/wARP7 and was adjusted manually throughout refinement by using O.8 The final R was 0.194 and the free R was 0.214 with all data (Table II). Electron density contoured at 1.5 σ is well connected for all of the main chain and most of the side chains, except in a few areas on the surface of the molecule. The stereochemistry of the structure was checked with PROCHECK9 and the Ramachandran plot. The main-chain torsion angles for all residues are in allowed regions.
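The R and free R values reported above compare observed and calculated structure-factor amplitudes over the working and test sets of reflections, respectively. The sketch below shows the standard definition, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, on invented amplitudes. It assumes that Fcalc is already on the scale of Fobs and is not the CNS implementation; the function names are illustrative.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    ASSUMPTION: f_calc has already been scaled to f_obs.
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_work_and_free(reflections):
    """Split reflections by their test-set flag and return (R_work, R_free).

    reflections: iterable of (f_obs, f_calc, is_free) tuples, where
    is_free marks reflections withheld from refinement.
    """
    reflections = list(reflections)
    work = [(fo, fc) for fo, fc, is_free in reflections if not is_free]
    free = [(fo, fc) for fo, fc, is_free in reflections if is_free]
    r_work = r_factor(*zip(*work))
    r_free = r_factor(*zip(*free))
    return r_work, r_free


# Invented amplitudes, for illustration only.
toy = [
    (100.0, 90.0, False),
    (50.0, 55.0, False),
    (200.0, 190.0, False),
    (80.0, 70.0, True),
    (20.0, 25.0, True),
]
r_work, r_free = r_work_and_free(toy)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test-set reflections are never used in refinement, a free R only slightly above the working R, as reported here, suggests that the model is not overfitted.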
Table II. Crystallographic Statistics
Density modification, FOM (1.8 Å)
Resolution range (Å)
No. of reflections
RMSD from ideal geometry
Bond length (1–2) (Å)
Mean B-factor (Å2)
Protein atoms (2149)
Protein main chain
Protein side chain
Ramachandran plot statistics (%)
Residues in most favored regions
Residues in additional allowed regions
Residues in disallowed region
We thank all members of the Structural Biology Center at Argonne National Laboratory for their help in conducting experiments and Lindy Keller for help in preparation of this manuscript.