The YciO protein of Escherichia coli is a member of a family of proteins that also includes YrdC and HypF in E. coli, YwlC in Bacillus subtilis, and Sua5 in Saccharomyces cerevisiae (PF01300 family in PFAM database1). Sequences similar to YciO are found (a) as independent proteins, (b) as proteins with C-terminal extensions, or (c) as domains in larger proteins (PFAM). The proteins YwlC from B. subtilis and Sua5 from S. cerevisiae are examples of the second category. Sua5 is essential for normal growth on lactate or glycerol medium, although its precise function remains unclear.2 HypF represents the third category, in which this domain is located in the middle of the linear sequence. The hypF and hydN genes form the hydA locus in E. coli, which encodes functions necessary for the formation of hydrogenase activity. HypF, the hydrogenase maturation protein, most likely participates in the maturation of all three E. coli hydrogenase isozymes 1, 2, and 3 from their inactive precursor forms.3
Both crystallographic and nucleic acid binding studies with E. coli YrdC suggested that this protein might exert its function through binding to double-stranded RNA.4 This would imply a role for YciO/YrdC and their homologous domains in larger proteins in the regulation of either transcription or translation. To gain further insight into this family of proteins, we have determined the crystal structure of E. coli YciO at 2.1 Å resolution.
Materials and Methods.
The gene encoding E. coli YciO was cloned, and the protein was expressed and purified as an N-terminal His6-tag fusion with a thrombin cleavage site as described previously for E. coli MoeA.5 Crystals of the fusion protein were obtained by the hanging drop vapor diffusion method by equilibrating drops containing 2 μL of protein (15 mg/mL) in buffer (20 mM Tris, pH 7.5, 40 mM (NH4)2SO4, 60 mM NH4Ac, 5 mM 2-mercaptoethanol) and 2 μL of reservoir solution (10% [w/v] PEG3350, 0.1 M MES buffer, pH 6.5, 0.1 M MgAc2, 5% [v/v] ethylene glycol) suspended over 0.5 mL of reservoir solution. The crystals belong to the orthorhombic system, space group P2₁2₁2₁, with unit cell dimensions a = 48.40, b = 68.77, c = 94.77 Å and one molecule in the asymmetric unit. Before data collection, the crystal was soaked for ∼15 s in a cryoprotecting solution of 23% (w/v) PEG3350, 0.1 M MES, pH 6.5, 0.1 M MgAc2, 5% (v/v) ethylene glycol, and 19% (w/v) glycerol, and was flash cooled in a cold stream of N2 gas to 100 K.
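The unit cell dimensions and the single molecule per asymmetric unit reported above can be checked for plausibility through the Matthews coefficient. The following is a minimal sketch rather than the authors' calculation: the molecular mass is an assumption (206 residues at roughly 110 Da each, ignoring the His6 tag, whose exact mass is not given in the text), and the solvent fraction uses the common approximation 1 - 1.23/V_M.

```python
# Unit cell parameters from the text (orthorhombic, so all angles are 90 degrees).
CELL_A, CELL_B, CELL_C = 48.40, 68.77, 94.77  # Angstrom

# P2(1)2(1)2(1) has four symmetry-equivalent positions; with one molecule per
# asymmetric unit the cell holds Z = 4 molecules.
MOLECULES_PER_CELL = 4

# ASSUMPTION: mass estimated as 206 residues x 110 Da. The true mass of the
# His6-tagged fusion is not stated in the text and will be somewhat higher.
ASSUMED_MASS_DA = 206 * 110.0


def cell_volume_orthorhombic(a, b, c):
    """Volume of an orthorhombic cell in cubic Angstrom."""
    return a * b * c


def matthews_coefficient(volume, molecules_per_cell, mass_da):
    """Crystal volume per unit of protein mass (cubic Angstrom per Da)."""
    return volume / (molecules_per_cell * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / v_m


volume = cell_volume_orthorhombic(CELL_A, CELL_B, CELL_C)
v_m = matthews_coefficient(volume, MOLECULES_PER_CELL, ASSUMED_MASS_DA)
solvent = solvent_fraction(v_m)
```

Because the assumed mass omits the tag, the value of v_m obtained here is an upper estimate; substituting the measured mass of the fusion protein would lower it.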
The structure of YciO was determined by MAD phasing from a SeMet-labeled protein crystal (Table I). Data were processed and scaled with HKL2000.6 All expected Se sites were found (SOLVE7), and the phases were calculated to 2.1 Å (figure of merit of 0.52). The electron density map obtained after density modification (RESOLVE8) allowed automatic tracing of ∼80% of the main chain and building of the side-chains with the program ARP/WARP.9 The model was then adjusted manually by using the program O10 and refined (CNS version 1.011) against data collected at the hard remote wavelength. During refinement, 4.8% of the reflections were set aside for the calculation of Rfree. Water molecules were initially added automatically with CNS and verified by visual inspection of the difference map. The final model contains all residues, 1–206, 127 water molecules, and one sulfate ion, with an R-factor of 0.211 and Rfree of 0.229. The model has good geometry with no residues in the disallowed regions of the Ramachandran plot (Procheck12). The coordinates of YciO are deposited in the Protein Data Bank with the accession code 1KK9.
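The R-factor and Rfree quoted above are both residuals between observed and calculated structure factor amplitudes, computed over the working and the set-aside reflections, respectively. The sketch below illustrates the standard definition R = Σ||Fobs| - |Fcalc|| / Σ|Fobs| and a random selection of a free set. It is not the CNS implementation: the function names are hypothetical, the amplitudes are assumed to be already on a common scale, and the random split is only one plausible way to choose the test reflections.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic residual over paired amplitudes (assumed pre-scaled)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def choose_free_set(n_reflections, fraction=0.048, seed=0):
    """Pick reflection indices to exclude from refinement (4.8% by default)."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    return set(indices[:n_free])


def r_work_and_r_free(f_obs, f_calc, free_indices):
    """Return (R_work, R_free) given the indices of the free reflections."""
    work = [i for i in range(len(f_obs)) if i not in free_indices]
    free = sorted(free_indices)
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free
```

Keeping the free reflections out of refinement is what makes Rfree a cross-validation statistic; a small gap between the two values, as reported here, indicates that the model is not overfitted to the working set.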
Table I. Data Collection and Refinement Statistics
Resolution range (last shell)
Rsym (last shell)
Completeness (last shell)
No. of observations
No. of nonhydrogen atoms
No. of water molecules
Average B-factor (Å2)
RMSD bond length (Å)
RMSD bond angle (°)
In most favorable region (%)
In disallowed regions
Results and Discussion.
The YciO molecule has an overall α/β architecture and consists of a single domain containing a central 10-stranded β-sheet flanked by six α-helices (A–F) (Fig. 1). The β-sheet can be subdivided into two sheets (β1-β8-β2-β3-β7-β4, β4-β6-β5-β9-β10) with a common middle strand β4. There are two helices on each side of the N-terminal part of the β-sheet (A, E and B, F), whereas the remaining two helices (C, D) are on the same side of the C-terminal part of this sheet. Helices are positioned asymmetrically about the β-sheet, with four of them (A, E, C, and D) packing against one face and two (B and F) against the other (Fig. 1).
Analysis of sequences of proteins homologous to YciO identified residues that are highly conserved within this family.4 We analyzed the residue conservation by using a larger set of sequences assembled in the recent version of PFAM.1 The conserved residues are strongly concentrated in two areas: a narrow depression in the protein surface between the C-termini of strands β7 and β10, and a much smaller area on the opposite side of the molecule defined by the side-chains of Asp35 and Ser36. A sulfate ion, originating from the crystallization mother liquor, is clearly visible in the electron density map within this depression. This anion forms several hydrogen bonds with the protein through its oxygen atoms. The O1 atom is H-bonded to the NH group of Ser144 (strand β7) and to NE of Arg57. The O2 atom makes H-bonds to NE2 of His63 and to Wat383 and, through a bridging Wat316, to OG1 of Thr66 and the carbonyl oxygen of Asn64. The O3 atom is H-bonded to NH1 of Arg196 (β10) and to NH2 of Arg57 (loop between helix B and β3). The O4 atom is H-bonded to NH2 of Arg196 and OG of Ser144 and, through the water molecule Wat313, to the side-chains of Thr66, Thr98, and Arg122. The electrostatic potential surface shows a bimodal distribution with opposite ends of the β-sheet having positive or negative potentials. The depression with highly conserved residues is near the boundary between these two areas and is characterized by a positive potential.
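The text does not state how residue conservation was scored across the PFAM sequences. One common choice, shown here purely as an illustration, is the Shannon entropy of each alignment column, where low entropy marks a conserved position. The alignment below is a made-up toy example and not YciO family data, and the function names and the entropy threshold are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps ('-')."""
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_positions(alignment, max_entropy=0.5):
    """Indices of columns whose entropy does not exceed max_entropy."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = zip(*alignment)  # transpose rows (sequences) into columns
    return [i for i, col in enumerate(columns) if column_entropy(col) <= max_entropy]


# HYPOTHETICAL toy alignment, for illustration only.
toy_alignment = ["MDSR", "MDSK", "LDSR"]
toy_conserved = conserved_positions(toy_alignment)
```

Mapping the indices returned by such a scoring onto the structure is what reveals spatial clustering of conserved residues, as described for the depression between strands β7 and β10.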
Sequence comparison of YciO and YrdC shows 27% amino acid identity between the two proteins, and their three-dimensional structures are quite similar. Superposition of YciO with E. coli YrdC4 gives a root-mean-square deviation (RMSD) of 1.52 Å for 131 Cα pairs. YciO is seven amino acids longer than YrdC and has two additional β-strands at the N-terminus (β1, β8). The β8 strand corresponds in YrdC to a stretch that has an extended conformation but does not form proper H-bonds with the next strand. Although the folds of YciO and YrdC are similar, the molecular surfaces in the regions of the hypothetical active sites are somewhat different. YrdC has a wider cleft than YciO. There is also a significant difference in the distribution of their electrostatic potential: the bimodal distribution observed in YciO is much less pronounced in YrdC.4 The structural features of YciO are consistent with binding to an extended RNA molecule.
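The RMSD quoted above is the square root of the mean squared distance between equivalent Cα atoms after the two structures have been superposed. The sketch below shows only that final step, under the assumption that the coordinate pairs have already been matched and optimally superposed by a separate procedure; it does not perform the superposition itself, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between paired (x, y, z) coordinates, assumed already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))
```

For the comparison reported here, each list would hold the 131 equivalent Cα positions from YciO and YrdC. Because the value depends on which pairs are included, an RMSD should always be reported together with the number of atom pairs.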
We thank Robert Larocque for technical help in molecular biology, Dr. Stephane Raymond for help with the structural genomics database, and Dr. Joseph D. Schrag for helpful discussions.