NMR structure of conserved eukaryotic protein ZK652.3 from C. elegans: A ubiquitin-like fold


  • John R. Cort,

    1. Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington
  • Yiwen Chiang,

    1. Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey
  • Deyou Zheng,

    1. Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey
  • Gaetano T. Montelione,

    1. Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey
  • Michael A. Kennedy

    Corresponding author
    1. Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington
    • Pacific Northwest National Laboratory, Environmental Molecular Sciences Laboratory, Richland, WA 99352


Structural proteomics aims to provide one or more representative three-dimensional (3D) structures for every structural domain family in nature. As part of an international effort in structural proteomics, the Northeast Structural Genomics Consortium has targeted clusters of strongly conserved eukaryotic protein families for structural and functional analysis. On this basis, protein ZK652.3 (nesg WR41/WP:CE00949/YOY3_CAEEL/Swiss-Prot P34661/gi|17557033) from Caenorhabditis elegans was selected for structure determination. Sequencing of cDNA libraries shows that homologues of ZK652.3 occur widely in vertebrates and plants (Fig. 1). However, ZK652.3 homologues are conspicuously absent from the yeast and Drosophila genomes. Expression of the ZK652.3 gene has been observed in a transcriptional profile of C. elegans genes, where it was one of a cluster of 89 genes whose expression levels covaried during development.1 The biochemical function of this protein is presently unknown. Here we describe the 3D structure of ZK652.3 determined by nuclear magnetic resonance (NMR) spectroscopy and discuss structural similarities with other proteins that provide clues to potential biochemical functions.

Figure 1.

Multiple-sequence alignment of ZK652.3 from C. elegans with human (Swiss-Prot Q9NZF2) and Arabidopsis thaliana (Swiss-Prot Q9CA23) homologues. Secondary structure regions observed experimentally in the present study are indicated above the sequence (h for helix, b for β-sheet). The sequence of ubiquitin is also shown and aligned with ZK652.3 on the basis of the structural similarity identified by Dali.15

Materials and Methods.

The gene coding for the ZK652.3 protein was subcloned from cDNA clone YK452c8 into expression vector pET15b with a hexa-His N-terminal purification tag, generating plasmid pET15b-WR41. The resulting construct was verified by DNA sequence analysis. E. coli strain BL21(DE3) cell cultures transformed with pET15b-WR41 were grown at 37°C in MJ minimal medium.2 The production and purification of ZK652.3 will be described in detail elsewhere. Sample purity (>95%) and molecular weight (11.5 kDa with purification tag) were verified by SDS-PAGE and MALDI-TOF mass spectrometry. Uniformly 13C, 15N-enriched or 10% 13C, 100% 15N-enriched ZK652.3 protein samples were prepared in 5-mm Shigemi susceptibility-matched NMR tubes, at 1.5 mM protein concentration in H2O solution containing 5% D2O, 10 mM ammonium acetate, 50 mM sodium chloride, and 5 mM DTT at pH 5.50 ± 0.05.

NMR spectra of ZK652.3 were collected at 25°C on 600, 750, and 800 MHz Varian Inova spectrometers at the Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington. Spectra were referenced to external DSS. Two-dimensional 1H-15N HSQC3 and 3D HNHA4 and 15N-edited NOESY-HSQC3 experiments were conducted on [U-15N]-ZK652.3. Two-dimensional 1H-13C HSQC3 and 3D HNCACB,5 HNCO,5 CBCA(CO)NNH,5 CBCACOCAHA,6 HCCH-TOCSY,6 HCC-TOCSY-NNH,7, 8 CCC-TOCSY-NNH,7, 9, 10 CN-NOESY-HSQC,11 and 4D CC-NOESY12 experiments were recorded on [U-13C, U-15N]-ZK652.3. The sample was lyophilized and redissolved in D2O before acquisition of the 4D CC-NOESY data set. NOESY experiments used mixing times of 120 msec (4D CC-NOESY and 3D CN-NOESY-HSQC) or 150 msec (15N-edited NOESY-HSQC).

Spectra were processed and analyzed with Felix (MSI). No chemical shifts for residues L26 and P27 were obtained, and only partial assignments for T21 and F28 were obtained. Stereospecific valine and leucine methyl group assignments were obtained from a 1H-13C-HSQC spectrum recorded on a sample labeled with 10%-13C (Ref. 13). Peak intensities in the NOESY spectra were converted into short (1.8–2.5 Å), medium (1.8–3.0 Å), long (1.8–4.0 Å), and extra-long (1.8–5.0 Å) distance restraints. Pseudoatom corrections for stereochemically ambiguous protons were added to the upper bounds: 1.0 Å for methylene protons, 2.0 Å for chemically equivalent aromatic protons, and 2.4 Å for pairs of methyl groups in leucine and valine residues. All long-range NOEs (i-j > 4) were put initially into the extra-long category, as were all NOEs between side-chains. During structure refinement, the upper bounds of exceptionally intense long-range NOEs were reduced by 0.5–1.0 Å in some cases. Dihedral angle restraints for ϕ were derived from the HNHA experiment.4 Dihedral angle restraints for ψ were added during the later stages of refinement for residues in helix and strand regions where chemical shift and αiNi and αiNi+1 NOESY peak intensities indicated they were appropriate. Hydrogen bond restraints for slowly exchanging amide protons were added late in the refinement when the acceptor atom was identifiable from a preliminary structural ensemble.
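The binning of NOE intensities into distance bounds described above can be illustrated with a minimal Python sketch. The bounds and pseudoatom corrections are taken from the text; the function name, the category labels, and the choice to add one correction per stereochemically ambiguous end of a restraint are assumptions made for illustration and do not reproduce the scripts actually used in this work.

```python
# Lower/upper bounds (in Å) for each NOE intensity category, as listed in the text.
NOE_BOUNDS = {
    "short": (1.8, 2.5),
    "medium": (1.8, 3.0),
    "long": (1.8, 4.0),
    "extra-long": (1.8, 5.0),
}

# Pseudoatom corrections (in Å) added to the upper bound, as listed in the text.
# The key names are hypothetical labels chosen for this sketch.
PSEUDOATOM_CORRECTION = {
    "methylene": 1.0,    # stereochemically ambiguous methylene protons
    "aromatic": 2.0,     # chemically equivalent aromatic protons
    "methyl_pair": 2.4,  # pairs of methyl groups in Leu and Val
}


def restraint_bounds(category, pseudoatoms=()):
    """Return (lower, upper) bounds in Å for one NOE restraint.

    `pseudoatoms` lists the pseudoatom type at each ambiguous end of the
    restraint. Assumption: one correction is added per ambiguous end.
    """
    lower, upper = NOE_BOUNDS[category]
    upper += sum(PSEUDOATOM_CORRECTION[p] for p in pseudoatoms)
    return lower, round(upper, 2)


# A medium-intensity NOE to an unassigned methylene pair.
print(restraint_bounds("medium", ("methylene",)))
# A long-range NOE placed in the extra-long category, involving a Leu methyl pair.
print(restraint_bounds("extra-long", ("methyl_pair",)))
```

Keeping the bounds and corrections in lookup tables makes it straightforward to tighten individual upper bounds later, as was done for exceptionally intense long-range NOEs during refinement.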

A total of 476 NOE distance restraints, 50 hydrogen bond restraints (2 per H-bond), and 71 dihedral restraints (42 ϕ, 29 ψ) were used for calculation of the structural ensemble. Structures were calculated by using X-PLOR 3.840.14 The routines dg_full_embed.inp, dgsa.inp, and refine_gentle.inp were used to generate 26 structures from an extended starting structure. Two high-energy structures were removed to yield an ensemble of 24 structures (Fig. 2). Statistics for this ensemble are compiled in Table I. The structural ensemble and restraints have been deposited in the Protein Data Bank (PDB id 1L7Y), and the chemical shifts have been deposited in BioMagResBank (BMRB-5329).

Figure 2.

Stereoview of the backbone trace for 24 aligned ZK652.3 structures. The conformations of residues 1–11 and 91–94 are not well determined, presumably because they are unstructured; these residues are omitted for clarity. Backbone atoms CA, C′, N, and O from residues 13–21 and 29–89 were superimposed on the average structure. The N- and C-termini of the displayed portion of the protein are labeled. Helical regions of the structure are colored red, β-sheets are blue, other structured portions are green, and the unstructured loop between residues 21 and 29 is black.

Table I. Statistics for the Final Ensemble (24 Structures)

Distance restraints
 Medium-range (1 < |i − j| < 5)                      86
 Long-range (|i − j| ≥ 5)                            219
 Hydrogen bond restraints (2 per H-bond)             50
Dihedral restraints
Total number of restraints (all) per residueᵃ        7.8
Distance restraint violations
 Mean number of violations > 0.0 Å                   21.1 ± 2.1
 Maximum number of violations                        26
 Maximum violation (Å)                               0.12
 Mean RMS violation (Å)                              0.009 ± 0.001
Dihedral restraint violations
 Mean number of violations > 0.0°                    2.6 ± 0.9
 Maximum number of violations                        4
 Maximum violation (°)                               1.5
 Mean RMS violation (°)                              0.12 ± 0.04
Mean rms deviation from the average coordinates (Å)
 Residues 13–21 and 28–89
  Backbone atoms (Cα, C′, N, O)                      0.42 ± 0.09
  All heavy atoms                                    0.95 ± 0.10
 Residues 13–89
  Backbone atoms (Cα, C′, N, O)                      0.75 ± 0.25
  All heavy atoms                                    1.28 ± 0.25
Ramachandran plot (residues 13–89)ᵇ
 In most favored region (%)                          87
 In additional allowed region (%)                    10
 In generously allowed region (%)                    2
 In disallowed region (%)                            1

  • a  Unstructured N- and C-termini (residues 1–12, 90–94) excluded.
  • b  Determined with PROCHECK-NMR.20
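
The restraints-per-residue value in Table I can be checked against the restraint counts given in Materials and Methods. The short Python calculation below assumes that "all" restraints means the sum of NOE, hydrogen bond, and dihedral restraints, and that the residue count is the structured region 13–89 (the termini excluded in footnote a); both are readings of the text, not statements from the authors.

```python
# Restraint counts reported in Materials and Methods.
noe_restraints = 476
hbond_restraints = 50      # 2 restraints per hydrogen bond
dihedral_restraints = 71   # 42 phi + 29 psi

# Structured region, assuming residues 1-12 and 90-94 are excluded (footnote a).
first_residue, last_residue = 13, 89
n_residues = last_residue - first_residue + 1

total_restraints = noe_restraints + hbond_restraints + dihedral_restraints
per_residue = total_restraints / n_residues

print(f"{total_restraints} restraints / {n_residues} residues = {per_residue:.1f} per residue")
# prints "597 restraints / 77 residues = 7.8 per residue"
```

Under these assumptions the result agrees with the 7.8 restraints per residue listed in Table I.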

Results and Discussion.

ZK652.3 adopts a ubiquitin-like α + β fold (or β-grasp fold) with secondary structure elements ordered β-β-α-β-β-α-β along the sequence (Fig. 3). The strands of the β-sheet are ordered 2-1-5-3-4 with strands 2 and 3 antiparallel to the others. Residues 1–12 and 90–94 are not structured, displaying random coil-like backbone chemical shifts and no medium (other than sequential) or long range NOEs. Residues 22–29 appear to be more mobile than the rest of the structured portion of the protein. The backbone amide proton-nitrogen cross peaks for two residues in this region, L26 and F28, could not be located in HSQC spectra, and other residues in the span display strong cross peaks to water and weak cross peaks to other protons in the 15N-edited NOESY spectrum. Consequently, this portion of the structure is not converged in the structural ensemble.

Figure 3.

Ribbon cartoon representations of ZK652.3 and two similar structures identified by Dali,15 ubiquitin (PDB id 1UBI) and the RalGDS Ras binding domain (1LFD chain A). One ZK652.3 structure from the ensemble is shown. The orientation is the same as in Figure 2. The orientations of the other proteins were generated by structural superposition with ZK652.3. The unstructured N- and C-termini of ZK652.3 are colored with alternating gray and black segments; the conformation shown for these regions is not indicative of the actual conformation. The ribbon cartoons were generated with MOLSCRIPT.21

In an initial effort to gain clues to the biochemical function of ZK652.3, its structure was compared with previously determined structures in the Protein Data Bank using Dali.15 The most similar structures identified by Dali were Ras binding domains from RalGDS (PDB id 1LFD chain A, Z = 8.8) and Rap-1a (1C1Y chain B, Z = 8.5), the ubiquitin-like module UBX domain (1H8C, Z = 8.5), and ubiquitin (1UBI, Z = 8.1). All have between 12 and 16% sequence identity with the structured portion of ZK652.3. Two of these structures, 1LFD chain A and 1UBI, are displayed next to ZK652.3 in Figure 3. The Ras binding domains invariably are parts of much larger protein molecules, and although the existence of an isolated Ras binding domain is not inconceivable, attention was focused on ubiquitin and ubiquitin-like modifier proteins. Like ZK652.3, these are small, single-domain proteins.
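
For readers unfamiliar with how a sequence identity figure such as the 12–16% quoted above is obtained from a structure-based alignment, the following Python sketch counts identical residues over the aligned (non-gap) columns of a pairwise alignment. The function name and the toy sequences are hypothetical and are not taken from ZK652.3 or its structural neighbors; the choice of denominator (aligned columns only) is an assumption, and other conventions exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither sequence has a gap.

    Both arguments are rows of the same pairwise alignment and must
    therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment (hypothetical sequences): 5 aligned columns, 4 identical.
print(percent_identity("MKV-LTG", "MRVAL-G"))  # prints 80.0
```

Identities in the 12–16% range fall well below what sequence comparison alone can usually detect, which is why the relationship to ubiquitin-like proteins emerged from the structural comparison.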

Several surface features common to ubiquitin and ZK652.3 are notable. Sequence similarity is particularly high in the C-terminus (Fig. 1), although the -Gly-Gly-COO⁻ motif with which ubiquitin and many ubiquitin-like modifier proteins terminate is replaced with -Val-Gly-His-COO⁻ in ZK652.3, and the chain extends even farther in homologues from other species. Arginine residues R72 and R74 near the C-terminus of ubiquitin, which are essential for its function,16 are conserved in ZK652.3. The side-chains of two residues, I44 and V70, in a critical surface hydrophobic patch on ubiquitin17 are structurally equivalent to I60 and I87 in ZK652.3, whose side-chains also are surface exposed. In the structure of the complex formed by MoaD and MoeB,18 thought to be bacterial progenitors of ubiquitin and ubiquitin E1 activating enzyme, respectively, hydrophobic MoaD residues L59 and F75 at the interface of the two proteins are structurally equivalent to V65 and I87 in ZK652.3. The activating enzyme (E1) in ubiquitin-like modifier systems activates the C-terminal carboxylate of a specific ubiquitin-like modifier protein for subsequent transfer to its conjugating enzyme (E2) and then ligation to a target protein by a ligase enzyme (E3).19 We speculate that ZK652.3 may interact with an as yet unidentified activating enzyme in a new ubiquitin-like modification system.


We thank Prof. Y. Kohara for kindly providing cDNA clone YK452c8. We thank J. Aramini, A. Bhattacharya, and K. Gunsalus for helpful discussions, and Ms. D. Magapal for technical assistance. Acquisition and processing of NMR spectra and structure calculations were performed in the Environmental Molecular Sciences Laboratory (a national scientific user facility sponsored by the U.S. Department of Energy Office of Biological and Environmental Research) located at Pacific Northwest National Laboratory and operated for DOE by Battelle (contract KP130103). ZK652.3 from C. elegans is target WR41 of the Northeast Structural Genomics Consortium.