Crystal structure of the protein from gene At3g17210 of Arabidopsis thaliana†
Coordinates for the crystal structure have been deposited in the PDB (accession 1Q4R).
The Center for Eukaryotic Structural Genomics (CESG) is dedicated to determining the structures of novel proteins from Arabidopsis thaliana. Proteins, or at least apparent open reading frames, have been prioritized for structure solution by nuclear magnetic resonance (NMR) and/or X-ray crystallography. The protein from gene locus At3g17210 was in the second-highest tier of the project list, indicating no significant demerits: no known homologous structure, not targeted by another structural genomics group, no predicted transmembrane segments, no identifiable signal sequence, relatively low Cys content, and no proline-rich segments. The target also had several desirable predicted attributes: absence of low-complexity sequences, high predicted solubility, and predicted membership in an unknown structural class. The laboratory information management system used to track progress and data from initial selection to coordinate deposition is the Sesame system.¹
Materials and Methods.
The target was cloned from mRNA from the T87 Arabidopsis thaliana cell line.² The N-terminal tag sequence was GSSHHHHHHSSGLVPRGSH, with GSH remaining after thrombin cleavage. Selenomethionyl protein was expressed in a B834(DE3) pLacI+RARE expression host, using Studier's PASM-5052 autoinducing medium (William Studier, personal communication) for 25 h at 25°C in 500-mL batches in 2-L pop bottles.³ Cells (10.9 g) from one liter of culture were lysed by sonication, and the hexahistidine fusion construct was captured on a 5-mL HiTrap Chelating HP column charged with Ni²⁺ (Amersham). The recombinant protein (254 mg) was eluted with a linear 0–500 mM imidazole gradient, desalted into 100 mM NaCl, 10 mM Tris-HCl (pH 8.0), 1 mM Ca²⁺, and cleaved with thrombin (Novagen) overnight at 4°C. The cleaved target was subjected to subtractive IMAC, and the purified cleaved target was desalted into 100 mM NaCl, 10 mM Na HEPES (pH 7.0), and concentrated to 14 mg/mL. The target identity and the extent of selenomethionine incorporation (>98%) were confirmed by ESI and MALDI mass spectrometry.
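The tag arithmetic above can be checked with a short sketch. It assumes the usual thrombin recognition motif (cleavage immediately after the Arg of LVPR) and that native numbering starts directly after the residual tag residues; the function names are hypothetical and serve only as illustration.

```python
from typing import Tuple

# Tag sequence exactly as given in the text.
TAG = "GSSHHHHHHSSGLVPRGSH"
# Assumption: thrombin cuts immediately after this motif (LVPR|GS).
THROMBIN_SITE = "LVPR"


def cleave_thrombin(sequence: str) -> Tuple[str, str]:
    """Split a sequence after the first LVPR motif.

    Returns (removed_fragment, retained_fragment).
    """
    idx = sequence.find(THROMBIN_SITE)
    if idx < 0:
        raise ValueError("no thrombin site found")
    cut = idx + len(THROMBIN_SITE)
    return sequence[:cut], sequence[cut:]


def construct_to_native(construct_residue: int, remnant: str) -> int:
    """Map a residue number in the cleaved construct to native numbering.

    Assumes the native sequence begins right after the tag remnant.
    """
    return construct_residue - len(remnant)


removed, remnant = cleave_thrombin(TAG)
print("removed by thrombin:", removed)
print("left on the target: ", remnant)
print("construct residue 10 -> native residue", construct_to_native(10, remnant))
```

Under these assumptions the retained fragment is the three-residue GSH remnant named in the text, so numbering in the cleaved construct is offset by three from the native sequence.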
Crystallization conditions for the selenomethionyl protein were very similar to those for native protein. Optimum crystals formed from hanging-drop vapor-diffusion droplets with 2 μL protein + 2 μL reservoir solution against a reservoir of 30% (w/v) MEPEG 5K, 120 mM Na citrate, 100 mM Na HEPES (pH 6.0). Crystals began to form within three days and reached full size within two weeks. As the crystal growth solution was found to be an effective cryosolvent, crystals were mounted in loops and cooled by direct immersion in liquid nitrogen. The space group was P6₂, with a = b = 55.468 Å and c = 57.673 Å. Multi-wavelength anomalous diffraction data around the Se(K) edge were collected at APS/BioCARS/14IDB on a MARCCD detector using inverse-beam geometry, and were reduced with HKL2000⁴ and CCP4 programs.⁵ Optimum wavelengths for the experiment were determined by a fluorescence scan on the crystal (Table I).
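As a plausibility check on the reported cell, the following minimal sketch computes the hexagonal unit-cell volume and a Matthews coefficient. Several inputs are assumptions not stated in the text: cell edges in Å, six asymmetric units per cell (as in space group P6₂), one monomer per asymmetric unit, a rule-of-thumb mass of 110 Da per residue for the 112-residue construct, and the conventional 1.23 factor for estimating solvent fraction.

```python
import math

# Cell edges from the text (assumed to be in Å).
A_EDGE = 55.468
C_EDGE = 57.673
# Assumption: P6(2) has six asymmetric units per unit cell.
ASU_PER_CELL = 6
# Assumption: rough molecular mass, 112 residues x ~110 Da per residue.
ASSUMED_MW_DA = 112 * 110.0


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume of a hexagonal cell (a = b, gamma = 120 degrees), in Å^3."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume: float, asu_per_cell: int,
                         mw_da: float, copies_per_asu: int = 1) -> float:
    """Crystal volume per unit of protein mass, in Å^3/Da."""
    return volume / (asu_per_cell * copies_per_asu * mw_da)


volume = hexagonal_cell_volume(A_EDGE, C_EDGE)
vm = matthews_coefficient(volume, ASU_PER_CELL, ASSUMED_MW_DA)
# Conventional estimate of solvent fraction from the Matthews coefficient.
solvent_fraction = 1.0 - 1.23 / vm

print(f"cell volume      ~ {volume:.0f} Å^3")
print(f"Matthews coeff.  ~ {vm:.2f} Å^3/Da")
print(f"solvent fraction ~ {solvent_fraction:.2f}")
```

With one monomer per asymmetric unit the estimate falls in the range commonly seen for protein crystals, which fits a dimer that sits on a crystallographic two-fold axis.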
Table I. Data and Refinement Statistics
|Resolution range (Å)||all (20–1.90 Å)|
|Reflections|| || || || |
|Completeness|| || || || |
| Outer shell (1.97–1.90 Å)||52.6||49.0||38.3||63.7|
|R(merge)|| || || || |
| Outer shell||27.4||24.6||24.2||26.8|
|Phasing|| || || || |
| Mean FOM after SOLVE|| ||0.54|| || |
|Refinement|| || || || |
| R-factor/free R-factor (%)||18.5/23.2|| || || |
| RMSD bonds (Å)||0.020|| || || |
| RMSD angles (deg)||1.861|| || || |
| Average B factor||21.2|| || || |
| Number of water molecules||109|| || || |
| Number of metal ions||1|| || || |
|Ramachandran plot|| || || || |
| Residues in most favorable region (%)|| || ||95.7|| |
| Residues in additional allowed region (%)|| || ||4.3|| |
| Residues in generously allowed region (%)|| || ||0.0|| |
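For reference, the merging and phasing statistics in Table I are conventionally defined as below. These are the standard textbook forms and are given here as an assumption; the source does not state the exact expressions applied by the data-reduction and phasing programs.

```latex
% Agreement among the i-th repeated measurements of each reflection hkl
R_{\mathrm{merge}} =
  \frac{\sum_{hkl}\sum_{i}\left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i} I_i(hkl)}

% Figure of merit for one reflection with phase probability P(\varphi);
% the tabulated value is the mean over all reflections
m = \frac{\left| \int_0^{2\pi} P(\varphi)\, e^{i\varphi}\, d\varphi \right|}
         {\int_0^{2\pi} P(\varphi)\, d\varphi}
```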
Solution of the Se position and phase improvement were performed with SOLVE/RESOLVE.⁶,⁷ A single, ordered Se atom was found, and the final figure of merit was 0.57. RESOLVE auto-traced 97 of 112 residues in the recombinant target.⁸ The structure was improved by iterative rounds of refinement (REFMAC5⁹ or CNS¹⁰) against the low-energy remote data set, with 5% of the data withheld from refinement, and model building using XFIT.¹¹ A tightly bound metal ion was identified in the final rounds of refinement. The final model comprises 103 contiguous ordered amino acid residues, beginning at residue 10 of the recombinant construct (residue 7 of the native protein) and continuing to the C-terminus of the protein. The model contains 109 ordered waters, including two associated with the bound metal, which, on the basis of its B-value and occupancy in refinement, is surmised to be a fully occupied Mg²⁺ ion. The final R-overall and R-free were 0.185 and 0.232 for all 7209 observed reflections (94.8% complete) between 20 and 1.9 Å. The R-free value in the 1.95–1.90 Å shell was 0.267. The RMS deviations of bond lengths and angles from ideal values were 0.020 Å and 1.86° (Table I).
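The relation between the working R-factor and R-free can be illustrated with a minimal sketch. It uses the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs| without a scale factor, a random 5% free set, and toy amplitudes; none of this reproduces the actual REFMAC5 or CNS procedure, and the function names are hypothetical. It also assumes, for illustration only, that the 7209 reflection count includes the withheld reflections.

```python
import random
from typing import List, Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional crystallographic R-factor (no scaling applied)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


def split_free_set(n_reflections: int, free_fraction: float = 0.05,
                   seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly partition reflection indices into (working, free) sets."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Reflection count from the text; the 5% fraction is also from the text.
work, free = split_free_set(7209, free_fraction=0.05)
print(len(work), "working reflections,", len(free), "free reflections")

# Toy amplitudes (made up for illustration, not from the data set).
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 25.0]
print("toy R-factor:", r_factor(toy_obs, toy_calc))
```

Because the free reflections never enter refinement, R-free computed on that subset acts as a cross-validation check, which is why it is expected to be somewhat higher than the working R-factor, as it is here (0.232 versus 0.185).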
Results and Discussion.
The crystal structure of At3g17210 protein is best described as a dimer on a two-fold crystallographic axis. The monomers have β-α-β-β-α-α-β secondary structure elements, with antiparallel β-strands arranged in a 2,1,3,4 sheet topology. Strand 2 of one monomer hydrogen-bonds with strand 4 of the other monomer to form an elongated eight-stranded β-barrel. The N- and C-termini emerge from the same face of the dimer, which also contains the magnesium binding site. The fold of the protein and the oligomeric state agree with the structure of this protein determined independently by NMR spectroscopy by CESG.¹²
BLAST sequence searches¹³ revealed a high degree of sequence similarity (P(N) < e−10) between At3g17210 and the following SWISSPROT entries: Q8LQD2 Oryza sativa, Q41049 Populus balsamifera pop3 peptide, Q9AR79 Populus tremula Boiling stable protein A, Q8L408 Oryza sativa, Q9FK81 (At5g22580) A. thaliana, Q42482 Populus sp. wound responsive mRNA, and O31026 V. cholerae. Because of the protein's homology to pop3 and the other proteins above, the At3g17210 protein may belong to a class of plant stress response proteins. Although it has an apparent structural similarity to the monooxygenase from the gene actva-orf6 of Streptomyces coelicolor (PDB entry 1LQ9), a structure reported after CESG selected this target, the At3g17210 protein must have a different function, because it does not contain the active-site residues of the monooxygenase.
We acknowledge support from the BioCARS beamline at APS/Argonne National Laboratory and members of the CESG team: D. Aceti, P. Blommel, B. Buchan, J. Cao, C. Corenilescu, J. Doreleijers, D. Dyer, H. Geetha, D. Hruby, T. Kimball, B. Ramirez, N. Rosenberg, M. Runnels, K. Seder, J. Shaw, H. Sreenath, J. Song, E. Tyler, D. Vinarov, F. Vojtik, G. Wesenberg, M. Westler, R. Wrobel, J. Zhang, and Z. Zolnai.