David W. Smith wrote this manuscript.
Crystal structure of gene locus At3g16990 from Arabidopsis thaliana†
Article first published online: 8 JUL 2004
Copyright © 2004 Wiley-Liss, Inc.
Proteins: Structure, Function, and Bioinformatics
Volume 57, Issue 1, pages 221–222, 1 October 2004
How to Cite
Blommel, P. G., Smith, D. W., Bingman, C. A., Dyer, D. H., Rayment, I., Holden, H. M., Fox, B. G. and Phillips, G. N. (2004), Crystal structure of gene locus At3g16990 from Arabidopsis thaliana. Proteins, 57: 221–222. doi: 10.1002/prot.20213
Coordinates for the crystal structure have been deposited in the PDB under accession code 1Q4M.
- Issue published online: 11 AUG 2004
- Manuscript Accepted: 27 APR 2004
- Manuscript Revised: 26 APR 2004
- Manuscript Received: 2 APR 2004
- NIH National Institute for General Medical Sciences. Grant Number: P50 GM64598
The Center for Eukaryotic Structural Genomics is dedicated to determining the structures of novel proteins from eukaryotic organisms. Open reading frames are scored in thirteen different categories (e.g., new fold prediction, solubility prediction, and a small percentage of low-complexity sequence) and then ranked to indicate their suitability for study by nuclear magnetic resonance (NMR) or X-ray crystallography. Gene locus At3g16990 from Arabidopsis thaliana received a score suitable for study, with the only major demerits being a large cysteine residue count, a moderate new fold prediction, and a predicted low expression based on gene chip results. Here, we report the crystal structure of the protein from Arabidopsis thaliana gene locus At3g16990 as determined by single wavelength anomalous dispersion (SAD) phasing.
Materials and Methods.
The protein was synthesized as a Se-Met derivative, as previously described.1 The Sesame laboratory information management software package2 was used to collect data from cloning, cell growth and protein purification procedures.
The Se-Met-labeled protein was crystallized from a protein solution (15 mg/mL protein in 25 mM NaCl and 10 mM Tris pH 7.5) mixed in a 1:1 ratio with the well solution (55 mM sodium acetate pH 4.5, 1.05 M ammonium sulfate) in a hanging-drop crystallization trial using VDX® trays. The crystals were found to belong to space group P4₁2₁2 with unit cell constants of a = b = 62.700 Å, c = 287.621 Å.
Phasing was accomplished with the SAD method using data collected at the Se peak with the SOLVE3/RESOLVE4 software package. The phasing effort located seven of eight possible selenium sites in the asymmetric unit, which were input into the RESOLVE program to create an initial traceable map. Further modeling and refinement were completed using TNT5 and Turbo,6 while final maps and models were built and refined using consecutive iterations of refinement with Refmac57 and model building with Xfit.8
The final structure consisted of seven alpha helices per molecule and two molecules per asymmetric unit, with the two molecules being separate entities. The final refined model consisted of residues 5–219 (of 221) of chain A and residues 4–219 of chain B, as the first four to five and the final two residues were too disordered to fit. The final model also contained 253 water molecules and four sulfate groups per asymmetric unit. Table I lists the data and refinement statistics. Coordinates and structure factors were deposited with the Protein Data Bank (entry 1Q4M).
|Data Set||Energy (eV)||# Refl||Compl||Redun||Rsym|
|Unit cell (Å)||a = 62.70, b = 62.70, c = 287.62|
|Resolution range (Å)||24.92–2.08 (2.14–2.08)|
|Average B factor (Å²)||32.9|
|RMSD bond lengths (Å)||0.018|
|RMSD bond angles (°)||1.529|
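As a rough consistency check on the crystal parameters reported above, the Matthews coefficient and solvent fraction can be estimated from the unit cell constants, the space group, and the number of molecules per asymmetric unit. The following is a minimal sketch, not a calculation from the original work. It assumes an average residue mass of 110 Da (the sequence-derived molecular weight is not given in the text) and uses the standard approximation that the solvent fraction is 1 − 1.23/V_M. The function and constant names are illustrative only.

```python
# Rough Matthews-coefficient estimate from the reported crystal parameters.
# Sketch under stated assumptions; not the authors' own calculation.

CELL_A = 62.700             # Å, reported unit cell constant a (= b)
CELL_B = 62.700             # Å
CELL_C = 287.621            # Å, reported unit cell constant c
SYMMETRY_OPERATORS = 8      # general positions in space group P4(1)2(1)2
MOLECULES_PER_ASU = 2       # reported: two molecules per asymmetric unit
RESIDUES_PER_CHAIN = 221    # reported chain length
AVG_RESIDUE_MASS_DA = 110.0 # ASSUMPTION: typical average residue mass


def matthews_coefficient(a, b, c, n_sym, n_mol, mass_da):
    """Return V_M in Å^3/Da for an orthogonal (here tetragonal) cell."""
    cell_volume = a * b * c  # all cell angles are 90 degrees
    return cell_volume / (n_sym * n_mol * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction using the common 1.23/V_M rule."""
    return 1.0 - 1.23 / vm


chain_mass = RESIDUES_PER_CHAIN * AVG_RESIDUE_MASS_DA  # assumed, about 24.3 kDa
vm = matthews_coefficient(
    CELL_A, CELL_B, CELL_C, SYMMETRY_OPERATORS, MOLECULES_PER_ASU, chain_mass
)
solvent = solvent_fraction(vm)

print(f"V_M = {vm:.2f} Å^3/Da, solvent fraction = {solvent:.2f}")
```

With the assumed residue mass, this gives a V_M of about 2.9 Å³/Da and a solvent fraction of roughly 0.58, which is within the range commonly observed for protein crystals and is compatible with two molecules per asymmetric unit. A different molecular weight would shift these values proportionally.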
Results and Discussion.
Upon final structure determination, the coordinates for At3g16990 were sent to the DALI server9 to search for proteins with similar three-dimensional structures. Of the solutions returned, four had Z-scores above 10. These four structures correspond to a heme oxygenase fragment (PDB 1N45, 8% IDE, 3.3 Å RMSD), ribonucleotide reductase (PDB 1XSM, 9% IDE, 3.1 Å RMSD), methane monooxygenase hydroxylase (PDB 1MHY-D, 5% IDE, 2.8 Å RMSD), and heme oxygenase (PDB 1J77-A, 9% IDE, 3.4 Å RMSD). Each of these other proteins contains at least one iron atom (either in a heme cofactor or in some other iron binding site). In contrast, At3g16990 shows no evidence of tightly bound iron atoms, as it exhibits no distinct ligand-to-metal charge transfer spectrum typical of iron-sulfur or iron-tyrosinate proteins and no suitable peaks of unidentified intense electron density. Also, in comparing sequences, At3g16990 does contain a heme binding histidine similar to those in 1J77 and 1N45, but unlike both of these structures, the pocket where a heme would bind is filled with a residue side chain (valine in At3g16990 vs. glycine in both 1J77 and 1N45), thereby sterically hindering the potential binding of a heme group. Furthermore, At3g16990 has a pocket (discussed in more detail below along with the unassigned density) in a region similar to the iron binding pocket of 1XSM, but does not have enough iron binding residues pointing inside the pocket to fully capture an iron ion.
A VAST10 search of the same model gave a top Z-score of only 3.9, for PDB 1A32, with an RMSD of 2.6 Å and a sequence identity of 9.6% over 52 residues when compared to the 211 residues of At3g16990. Thus VAST did not return any structurally similar proteins when At3g16990 was queried against the non-redundant PDB database.
A BLAST search11 on the sequence for At3g16990 produced only one other protein with high sequence similarity. This protein, pm36 from Glycine max, was annotated as being involved in seed maturation; however, extensive supporting biochemical evidence is not available. Since At3g16990 is also a plant protein, it could also be involved in the seed maturation process, but at present there is no evidence to support this conjecture.
The final solution of the structure of At3g16990 contains density that cannot be assigned to any amino acid or crystallization solution component. This density is buried within the protein and does not have any apparent access to solvent. There are three hydrogen bonding donors/acceptors pointing directly toward the density. Two are from the side chains of Asp 47 and Glu 210, while the third acceptor is the carbonyl oxygen from Val 39. The density is stacked between two aromatic side chains, Phe 50 and Tyr 143, suggesting that the molecule may be aromatic. Furthermore, molecular modeling indicates that the unassigned density has a size and shape consistent with a purine or a substituted indole. Such molecules play important roles in signal transduction pathways in plants.12 Figure 1 shows the location of this density along with the surrounding amino acid side chains.
We acknowledge other members of the CESG team, financial support from NIH National Institute for General Medical Sciences grant P50 GM64598, and the BioCARS beamline at APS/Argonne National Laboratory.
- 3. Automated MAD and MIR structure solution. Acta Cryst 1999; D55: 849–861.
- 4. Maximum likelihood density modification. Acta Cryst 2000; D56: 965–972.
- 6. Turbo Frodo. Mountain View, California: Silicon Graphics; 1991.
- 7. Refinement of macromolecular structures by the maximum-likelihood method. Acta Cryst 1997; D53: 240–255.
- 11. Basic local alignment search tool. J Mol Biol 1990; 215: 403–410.
- 13. The PyMOL Molecular Graphics System. San Carlos, CA: DeLano Scientific; 2002.