Crystal structure of the Escherichia coli SbmC protein that protects cells from the DNA replication inhibitor microcin B17


  • Michael J. Romanowski,

    1. Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
  • Salley A. Gibney,

    1. University of Vermont, College of Medicine, Burlington, Vermont
  • Stephen K. Burley

    Corresponding author
    1. Laboratories of Molecular Biophysics, The Rockefeller University, New York, New York
    2. Howard Hughes Medical Institute, The Rockefeller University, New York, New York
    Current affiliation:
    1. Structural GenomiX, Inc., 10505 Roselle St., San Diego, CA 92121.
    • The Rockefeller University, 1230 York Ave., New York, NY 10021


Escherichia coli SbmC, also known as gyrase inhibitory protein GyrI and YeeB, is a 157-residue polypeptide with a predicted molecular mass of 18081.4 Da and a calculated pI = 4.61 (Fig. 1). The gene encoding SbmC is an SOS regulon gene induced by DNA-damaging agents and by the entry of cells into the stationary phase.1 The SbmC protein was originally identified as a factor that protects cells from the ribosomally synthesized 43-residue-long DNA replication inhibitor peptide microcin B17.1 More recently, it was shown that SbmC inhibits the supercoiling activity of the bacterial gyrase complex in vitro and that both overexpression of the protein and expression of the antisense sbmC RNA induce filamentous growth of cells and suppress cell proliferation. Therefore, the protein was renamed to reflect this function.2

Figure 1.

a: Alignment of protein sequences similar to E. coli SbmC identified in iterative PSI-BLAST19 (14 iterations) and DALI3 searches (December 2001). The secondary structural elements from the X-ray structure are shown above the aligned sequences. Gray spheres denote disordered residues from the C-terminus. Color-coding denotes sequence conservation among the proteins (white-to-green ramp, 30–100% identity). SbmC (GI:465566), E. coli SbmC/GyrI; NP456610 (GI:16760993), a putative GyrI ortholog from Salmonella enterica (strain Typhi CT18; E-value = 3e-37; 70% identity); CAC3490 (GI:15896727), a GyrI ortholog from Clostridium acetobutylicum (E-value = 3e-30; 21% identity); B3023 (GI:16130919), a hypothetical protein from E. coli (E-value = 0.9e-32; 23% identity); BH0401 (GI:15612964), residues 143–299 of an unknown conserved protein from Bacillus halodurans (E-value = 4e-31; 25% identity); LIN1814 (GI:16800881), amino acids 156–299 of the lin1814 gene product from Listeria innocua similar to putative AraC-type transcriptional regulators (E-value = 3e-25; 22% identity); ROB (GI:16132213), residues 124–284 of the E. coli right-origin binding protein (E-value = 2e-26; 11% identity); BMRR (GI:2851504), residues 121–279 of the B. subtilis BmrR protein (11% identity); NB: PSI-BLAST failed to detect BmrR but did identify the B. subtilis YdfL protein (GI:16077613; E-value = 0.11; 9% identical to SbmC), which is 27% identical to BmrR. b: Structure-based alignment of residues 1–77 and 78–157 with secondary structural elements denoted as in (a).

X-ray structure determination of SbmC revealed a compact α/β protein (dimensions 39 Å × 40 Å × 51 Å) with two four-stranded antiparallel β-sheets (N-terminal sheet: β5, β2, β4, and β3; C-terminal sheet: β8, β9, β6, and β1) and two α-helices (αA and αB) overlying the N- and C-terminal β-sheets, respectively [Figs. 1(a) and 2(a)]. The linear arrangement of the secondary structural elements within the polypeptide chain is β1, β2, αA, β3, α3 (denoting a three-residue α-helix), β4, β5, β6, αB, β7 (which runs antiparallel to the C-terminal portion of β9), β8, αC (a short α-helix), and β9 [Fig. 1(a)].

Figure 2.

a: RIBBONS20 drawing of SbmC with labeled N- and C-termini and secondary structural elements. The center of the pseudo two-fold symmetry is indicated with an oval. b: PyMOL21-generated structure-based superposition of residues 1–77 (blue) onto residues 78–155 (red).

The protein resembles a cupped hand [Fig. 2(a)] with the β-sheet forming the palm and α-helices αA and αB making up the thenar eminence and the fingers, respectively. Two similar halves of SbmC are related by a pseudo two-fold symmetry [Figs. 1(b), 2(a), and 2(b)], suggesting that the full-length protein arose from duplication and fusion of an ancestral gene encoding a dimeric protein of about 80 residues. Superposition of the N-terminal half (residues 1–77) onto residues 78–155 using the DALI3 server yielded a Z-score of 5.8 and a root-mean-square deviation (RMSD) of 3 Å, with five pairs of secondary structural elements related by the pseudo-dyad [β1–β5, β2–β6, αA–αB, β3–β8, and β4–β9; Figs. 1(b) and 2(b)]. The sequence identity between the two halves of SbmC, derived from the structure-based alignment, is 13% [Fig. 1(b)].

Surface electrostatic calculations revealed an acidic groove, walled off on each side by αA, α3, αB, and αC and with its base formed by β3 and β8 [Figs. 2(a) and 3]. The dimensions of the groove (length = 30 Å, width = 15 Å, height = 17 Å) are compatible with binding of linear peptides, making SbmC structurally, and possibly functionally, similar to the MHC proteins.4

Figure 3.

GRASP22 representations of the chemical properties of the solvent-accessible surface of SbmC calculated by using a water probe radius of 1.4 Å. The surface electrostatic potential is colored red and blue, representing electrostatic potentials from −20 to 20 kBT, where kB is the Boltzmann constant and T is temperature in Kelvin. Calculations were performed with an ionic strength of 0 and dielectric constants of 80 and 2 for solvent and protein, respectively. The worm diagrams show the orientation of the molecule in each panel. α-Helices αA and αB are labeled as are the N- and C-termini. a: The acidic groove, viewed from above. b: View of the opposite face of SbmC. c: End-on view rotated 90° about the horizontal axis from the view in (a). d: End-on view rotated 180° about the vertical axis from the view in (c).

A DALI3 search of the Protein Data Bank (December 2001) with the coordinates of SbmC identified two structurally similar proteins, both of which are transcriptional regulators: the E. coli right origin-binding protein Rob (PDB ID 1D5Y; Z-score = 14.4; RMSD = 2.8 Å for 140 α-carbon pairs; 9% identity) and the drug-binding domain of the Bacillus subtilis BmrR protein (PDB ID 1BOW5, 6; Z-score = 12.0; RMSD = 3.3 Å for 129 α-carbon pairs; 11% identity). Although SbmC resembles fragments of both Rob and BmrR, we do not believe that it interacts with DNA, because the corresponding portions of Rob and BmrR are not responsible for DNA binding.

Automated homology modeling with MODPIPE7 using SbmC as a template yielded 30 models of prokaryotic proteins (model score > 0.7, model length > 75 residues), which can be subdivided into two categories. One group contains models of proteins comparable in length to SbmC that probably represent other gyrase inhibitors. The second group contains 13 models of C-terminal halves of 30–35 kDa proteins belonging to the AraC/XylS family of transcriptional regulators.

Recombinant SbmC was tested for binding to previously crystallized fragments of GyrA8 and GyrB9 and to the basic C-terminal domain of GyrA (residues 517–780; calculated pI = 8.76), but no evidence of any association was found (data not shown). It is possible, however, that SbmC inhibits DNA gyrase activity by binding to the GyrA-GyrB heterotetramer in solution and not to its individual subunits. Alternatively, interactions between SbmC and the gyrase complex may be mediated by one or more additional factors. Further biochemical and biophysical studies of the SbmC protein will be required to establish its precise functional role in bacteria, which should be facilitated by the availability of an X-ray structure.

The full-length sbmC open reading frame was amplified by polymerase chain reaction following standard protocols,10 using a forward primer containing a BamHI restriction site (GTGACGGATCCATGAACTACGAGATTAAGCAGGAAGAGAAACG), a reverse primer containing an XhoI restriction site (GACTGCTCGAGTCAGTGATGTTTTGGCTGCACCGCAAC), and E. coli UT5600 DNA as template. The amplified insert was cloned into the corresponding sites of the pGEX-6P-1 plasmid. S-Met and Se-Met glutathione S-transferase (GST)-tagged proteins were expressed in E. coli BL21 (overnight induction at 18°C). Proteins were purified on glutathione and Sepharose Q resins following established procedures.11 Proteins for crystallization were dialyzed against 20 mM HEPES, pH 7.0, 100 mM potassium chloride, and 3 mM dithiothreitol, concentrated to 30 mg/mL, and filtered through a 0.1-μm filter. Gel filtration experiments documented that SbmC is monomeric in solution (data not shown). MALDI-MS confirmed the identity of the purified recombinant SbmC (measured mass = 18,491.2 ± 5 Da, predicted mass = 18,492.9 Da). MALDI-MS also revealed another major peak with a measured mass of 18,548.12 Da, which probably represents a sodium adduct of the protein.
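The primer design described above can be sanity-checked with a short script. The sketch below is illustrative only and is not part of the published protocol: it confirms that each primer carries the stated restriction site (GGATCC for BamHI, CTCGAG for XhoI) and translates the coding portions with the standard genetic code. It assumes that the reading frame starts at the first ATG of the forward primer and at the first base of the reverse-complemented reverse primer; the helper names are hypothetical.

```python
# Primer sequences copied from the text (5' -> 3').
FORWARD = "GTGACGGATCCATGAACTACGAGATTAAGCAGGAAGAGAAACG"
REVERSE = "GACTGCTCGAGTCAGTGATGTTTTGGCTGCACCGCAAC"

BAMHI_SITE = "GGATCC"
XHOI_SITE = "CTCGAG"

# Standard genetic code, enumerated in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Assumption: the open reading frame begins at the first ATG in the forward primer.
n_terminus = translate(FORWARD[FORWARD.index("ATG"):])
# The reverse primer anneals to the coding strand, so its reverse complement
# is read; the frame is assumed to start at the first base of that sequence.
c_terminus = translate(reverse_complement(REVERSE))

print("BamHI site in forward primer:", BAMHI_SITE in FORWARD)
print("XhoI site in reverse primer:", XHOI_SITE in REVERSE)
print("N-terminal residues encoded:", n_terminus)
print("C-terminal residues encoded:", c_terminus)
```

Under these assumptions, the reverse primer encodes a C-terminal segment ending in His-His followed by a stop codon, which is consistent with the two C-terminal His residues reported as disordered in the electron density map.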

Diffraction-quality Se-Met and S-Met SbmC crystals (tetragonal bipyramids) were obtained by hanging-drop vapor diffusion at room temperature against a reservoir containing 0.1 M sodium acetate pH 4.6, 40 mM ammonium sulfate, and 26% (w/v) monomethyl ether of polyethylene glycol 2000. Crystals were cryoprotected by transfer to the mother liquor supplemented with 20% glycerol for 15–30 s and immersion in liquid propane (see Table I for space group and unit cell dimensions).

Table I. Crystallographic Data and Refinement Statistics

PDB ID: 1JYH

Crystal characteristics and data collection statistics
  Cell constants: a = b = 72.4 Å, c = 80.9 Å, α = β = γ = 90°
  Space group: P4₃2₁2; 1 molecule per asymmetric unit
  X-ray source: NSLS X9A beamline

                            λ1 (Se-Met peak)    λ2 (Se-Met inflection)    S-Met
  Wavelength (Å)            0.97935             0.97954                   0.97962
  Resolution (Å)            30.0–1.8            30.0–1.8                  30.0–1.8
  Number of observations    1986253             1940905                   913338
  Number of reflections     20340               20347                     19846
  Completeness (%)          97.6                97.5                      96.3
    (1.8–1.84 Å shell)      97.5                97.5                      99.3
  Mean I/σ(I)               51.7                52.0                      38.6
    (1.8–1.84 Å shell)      24.8                28.4                      29.4
  R-merge on I [a]          0.058               0.068                     0.053
    (1.8–1.84 Å shell)      0.119               0.136                     0.089
  Cutoff criteria           I < −3σ(I)          I < −3σ(I)                I < −3σ(I)

  Figure of merit [b]: 0.6250 (20.0–1.8 Å resolution) for 17068 reflections

Model and refinement statistics
  Data set used in structure refinement    S-Met
  Resolution range                         18.0–1.8 Å
  Number of reflections                    19658 (17726 in working set; 1932 in test set)
  Completeness                             95.7% (86.3% in working set; 9.4% in test set)
  Cutoff criterion                         |F| > 0.0
  Number of amino acid residues            155
  Number of water molecules                177
  Rfree                                    0.222
  Bond lengths (Å)                         0.005
  Bond angles (°)                          1.30
  Luzzati error (Å)                        0.19

Ramachandran plot statistics [d]
  Residues in most favored regions         125 (94.0%)
  Residues in additional allowed regions   8 (6.0%)
  Residues in generously allowed regions   0 (0%)
  Residues in disallowed regions           0 (0%)
  Overall G-factor                         0.3

MODPIPE statistics [e]
  Total number of models (model score > 0.7, model length > 75 residues)    30
  Models with > 50% sequence identity to template                           2
  Models with 30–50% sequence identity to template                          0
  Models with < 30% sequence identity to template                           28

  • a: $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} |I(hkl)_i - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} \langle I(hkl)_i\rangle$.

  • b: Figure of merit calculated using MLPHARE.15

  • c: $R_{\mathrm{cryst}} = \sum_{hkl} |F_o(hkl) - F_c(hkl)| \,/\, \sum_{hkl} |F_o(hkl)|$, where $F_o$ and $F_c$ are observed and calculated structure factors, respectively.

  • d: Computed with PROCHECK.23

  • e: Computed with MODPIPE.7 Models are publicly available from MODBASE via advanced search with the keyword NYSGRC_1JYH.
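The R-factor definitions given in the footnotes to Table I can be illustrated with a minimal sketch. The function names and the toy intensities and amplitudes below are hypothetical and are not data from this study. The denominator of R-merge is implemented as the sum of all individual observations, which is the conventional reading of the footnote formula. Rfree is conventionally computed with the same expression as Rcryst, restricted to the test-set reflections excluded from refinement.

```python
from statistics import fmean


def r_merge(observations):
    """R-merge over a dict mapping (h, k, l) to repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = fmean(intensities)
        # Sum of absolute deviations of each observation from the reflection mean.
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_cryst(f_obs, f_calc):
    """R-factor between paired observed and calculated structure factor amplitudes."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy values for illustration only (not taken from the paper).
toy_observations = {
    (1, 0, 0): [100.0, 110.0, 90.0],
    (1, 1, 0): [50.0, 50.0],
}
toy_f_obs = [10.0, 20.0]
toy_f_calc = [9.0, 22.0]

print(f"R-merge (toy data): {r_merge(toy_observations):.3f}")
print(f"R-cryst (toy data): {r_cryst(toy_f_obs, toy_f_calc):.3f}")
```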

Diffraction data were collected under standard cryogenic conditions with a MARCCD detector on Beamline X9A (National Synchrotron Light Source, Brookhaven National Laboratory) and processed and scaled by using Denzo/Scalepack.12 The structure of the Se-Met protein was determined with data from a two-wavelength anomalous diffraction experiment.13 Five selenium positions were located with SnB14 and refined by using MLPHARE.15 Density modification of the MLPHARE-refined phases yielded a high-quality experimental electron density map (figure of merit of 0.625), which was suitable for automated model building with ARP/wARP16 (145 of 157 residues built in three fragments). After partial CNS17 refinement, missing residues were added manually with O.18 Two bona fide C-terminal His residues and five residues at the N-terminus originating from a cloning artifact (Gly-Pro-Leu-Gly-Ser) were not visible in the electron density map and were omitted from refinement. The final model, consisting of 155 of 157 residues and 177 water molecules, was refined against a 1.8 Å diffraction data set obtained from an S-Met crystal to an R factor of 19.6%, with an Rfree value of 22.2% (see Table I for a summary of X-ray data and refinement statistics). Refined atomic coordinates and structure factors have been deposited in the Protein Data Bank (PDB ID 1JYH).
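As a rough consistency check on the construct, the five-residue N-terminal cloning artifact (Gly-Pro-Leu-Gly-Ser) should account for the difference between the predicted mass of native SbmC (18081.4 Da) and the predicted mass of the recombinant protein used for MALDI-MS (18,492.9 Da). The sketch below sums average residue masses for the artifact. The per-residue values are standard reference averages supplied here as an assumption and are not figures taken from this paper.

```python
# Average residue masses in Da (free amino acid minus one water).
# Assumed standard reference values, not taken from the paper.
AVG_RESIDUE_MASS = {"G": 57.0513, "P": 97.1152, "L": 113.1576, "S": 87.0773}

NATIVE_MASS = 18081.4       # predicted mass of 157-residue SbmC (from the text)
RECOMBINANT_MASS = 18492.9  # predicted mass of the purified protein (from the text)
MEASURED_MASS = 18491.2     # MALDI-MS measurement (from the text), quoted as +/- 5 Da
CLONING_ARTIFACT = "GPLGS"  # Gly-Pro-Leu-Gly-Ser

# Each added residue contributes its residue mass; the terminal water is
# already included in the native mass.
artifact_mass = sum(AVG_RESIDUE_MASS[aa] for aa in CLONING_ARTIFACT)
expected_mass = NATIVE_MASS + artifact_mass

print(f"Cloning artifact mass: {artifact_mass:.2f} Da")
print(f"Native + artifact:     {expected_mass:.2f} Da")
print(f"Measured - predicted:  {MEASURED_MASS - RECOMBINANT_MASS:+.1f} Da")
```

With these assumed residue masses, the native mass plus the artifact agrees with the quoted recombinant mass to within about 0.1 Da, and the MALDI-MS measurement lies within its stated ±5 Da uncertainty of the prediction.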

We thank Dr. K. Rajashankar from BNL for help with data collection and Drs. J.B. Bonanno and D. Jeruzalmi for useful discussions. This work was supported by NIGMS Grants P50-GM62529 (S.K.B.) and GM20276 (M.J.R.). S.K.B. is an Investigator in the Howard Hughes Medical Institute. The E. coli SbmC protein represents target T473 from the New York Structural Genomics Research Consortium.