Crystal structure of At2g03760, a putative steroid sulfotransferase from Arabidopsis thaliana



Sulfotransferases catalyze the transfer of a sulfate group from the donor 3′-phosphoadenosine 5′-phosphosulfate (PAPS) to a variety of biological molecules, including steroids. In 1997, the first X-ray crystal structure of a steroid sulfotransferase (SST) was reported [1], and since then only a handful of other three-dimensional structures of these enzymes have been solved [2–4]. The structures are similar, but the source organisms have so far been limited to mice and humans. In mammals, SSTs are thought to deactivate steroids by sulfonating them, which increases their solubility in water and allows them to be more readily excreted in urine [1].

Steroid sulfotransferases are present in plants as well as in mammals. In plants, sulfonation may also deactivate steroids, or it may have an alternative function such as activating the molecule or serving as a defense response to pathogens [5]. Even though Arabidopsis thaliana has not been shown to accumulate O-sulfated metabolites, this model plant species contains at least eight genes potentially encoding sulfotransferases [5]. One of these sulfotransferases, the protein derived from gene locus At2g03760, is the subject of this work.

To date, little work has been done on steroid sulfotransferases from plants. However, seedlings treated with salicylic acid or methyl jasmonate, and mature plants exposed to avirulent pathogens, accumulate mRNA for At2g03760, also known as RaR047 [6]. The RaR047/At2g03760 sequence was used to probe a Brassica napus genomic library, resulting in the isolation of three separate genes. These gene products were studied in terms of their enzymatic ability to deactivate brassinosteroids upon induction with salicylic acid [7]. The three B. napus proteins share 73–87% sequence identity with At2g03760, and all four proteins belong to the SULT4 family [7].

Here we present the structure of the plant-derived putative steroid sulfotransferase At2g03760. This plant sulfotransferase has high structural similarity to two human steroid sulfotransferases: human estrogen sulfotransferase (PDB 1AQU) and dehydroepiandrosterone sulfotransferase (PDB 1J99).

Materials and Methods.

Callus tissue from an Arabidopsis thaliana ecotype Columbia T87 suspension culture was collected by filtration. The callus was then ground to a fine powder in liquid nitrogen, and total DNA was isolated essentially as described previously [8]. Because the open reading frame (ORF) of At2g03760 was not interrupted by introns, it could be amplified directly from Arabidopsis total DNA. A two-step PCR protocol was used to add the GateWay® (Invitrogen, La Jolla, CA) recombination sites and the TEV protease cleavage site to the ORF [9]. The gene from the sequence-verified entry vector was then transferred into a pQE80-derived expression vector that allows expression of the target as an N-terminal fusion with the combination S tag-(His)₆-maltose binding protein (MBP). A TEV protease site is also present in the linker region between MBP and the target protein, At2g03760. The expression vector was transformed into Rosetta cells (Novagen, Madison, WI), which were grown in polyethylene terephthalate (PET) beverage bottles [10] in Terrific broth (TB) medium. The seleno-methionine-labeled protein was produced in a similar manner, but using enriched M9 minimal medium [11] instead of TB medium.

Cells were broken by sonication, and the protein solution was loaded onto Ni-IDA columns (HiTrap chelating, Amersham Biosciences, Arlington Heights, IL) charged with Ni²⁺. The protein was eluted with a linear gradient of 35–300 mM imidazole, and after SDS-PAGE analysis, the fractions containing fusion protein were pooled. Imidazole was removed by gel filtration on a desalting column (HiPrep 26/10, Amersham Biosciences), and the protein was subjected to cleavage at 25°C by adding TEV protease (1/100, w/w). After more than 95% of the fusion protein was cleaved, the protein solution was applied to a second Ni-IDA column. The column was then subjected to a 0–175 mM imidazole gradient, and the fractions containing target protein were combined. The pooled fractions were desalted and then concentrated to a final concentration of 15 mg/ml. Further details of cloning, expression, growth, and purification will be published elsewhere in the context of an evaluation of performance on a large sample of evaluated targets. The Sesame laboratory information management software package [12] was used to collect data pertaining to cloning, cell growth, and protein purification procedures.

Crystals of At2g03760 were grown by the batch method from 0.89 M sodium malonate, 44 mM MES pH 6.0 at 24°C. The crystals belong to space group C222₁ with unit cell dimensions of a = 91.477 Å, b = 120.899 Å, c = 74.315 Å and one molecule per asymmetric unit. The structure was solved using anomalous seleno-methionine data and the SOLVE [13]/RESOLVE [14] package. The initial map showed all three internal methionine residues along with good density for all of the alpha-helices and beta-strands. The model was built through several iterations of model building with the Xfit [15] program and refinement with RefMac5 [16] (with isotropic temperature factors). Once a model was obtained from the MAD data, it was refined against a native data set extending to higher resolution (the MAD data extended to 2.30 Å, whereas the native data extended to 1.89 Å). The final model gives R = 19.1% and Rfree = 22.2% for 279 ordered residues, one malonate molecule, and 216 water molecules. Data collection parameters and final statistics are given in Table 1. Coordinates and structure factors were deposited with the Protein Data Bank (entry 1Q44).
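As a rough plausibility check on the statement of one molecule per asymmetric unit, the Matthews coefficient and solvent content can be estimated from the unit cell. The sketch below is illustrative only and rests on assumptions that are not stated in the text: eight asymmetric units per cell for space group C222(1), a residue count taken as the 279 ordered residues plus the two unmodeled loops (the full construct may be longer), and a rule-of-thumb average residue mass of 110 Da.

```python
# Unit cell dimensions from the text, in angstroms (orthorhombic: V = a * b * c).
A, B, C = 91.477, 120.899, 74.315

# Assumptions that do not come from the text:
ASU_PER_CELL = 8             # general positions in space group C222(1)
N_RESIDUES = 279 + 11 + 31   # ordered residues plus the two unmodeled loops
MEAN_RESIDUE_MASS = 110.0    # Da, rule-of-thumb average for proteins


def matthews_coefficient(a, b, c, z, mass_da, copies=1):
    """Return V_M (cubic angstroms per dalton) for an orthorhombic cell."""
    cell_volume = a * b * c
    return cell_volume / (z * copies * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from the common 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / v_m


mass = N_RESIDUES * MEAN_RESIDUE_MASS
v_m = matthews_coefficient(A, B, C, ASU_PER_CELL, mass)
print(f"V_M = {v_m:.2f} A^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

Passing `copies=2` shows how the estimate would change if two molecules were assumed per asymmetric unit; comparing the two values against the range usually seen for protein crystals is the typical use of this calculation.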

Table 1. Summary of Data Collection, Crystal Structure, and Refinement Parameters
Data set  Energy (eV)  No. refl.  Completeness  Redundancy  R(Symm)

Space group  C222₁
Unit cell (Å)  a = 91.477, b = 120.899, c = 74.315
Resolution range (Å)  25.00–1.89 (1.95–1.89)
Completeness (%)  99.2 (98.8)
Redundancy  7.3 (6.6)
Rsym  0.048 (0.336)
R(cryst) (%)  19.1 (25.1)
R(free) (%)  22.2 (28.5)
Average B factor (Å²)  32.4
RMSD bond lengths (Å)  0.02
RMSD bond angles (°)  1.9

Numbers in parentheses indicate the highest resolution shell.
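
For readers less familiar with the agreement indices listed in the table, their conventional textbook definitions are given below. It is an assumption that the data processing and refinement programs used here follow exactly these forms; R(free) uses the same expression as R(cryst) but is summed only over the reflections set aside from refinement.

```latex
R_{\mathrm{sym}} = \frac{\sum_{hkl}\sum_{i}\left| I_{i}(hkl) - \langle I(hkl) \rangle \right|}
                        {\sum_{hkl}\sum_{i} I_{i}(hkl)},
\qquad
R_{\mathrm{cryst}} = \frac{\sum_{hkl}\bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
                          {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```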

Results and Discussion.

The structure of At2g03760 is composed of 14 alpha-helices and 7 beta-strands with the beta-strands mostly internal and surrounded by the alpha-helices [Fig. 1(A)]. The structure contains two disordered loops that could not be modeled and thus were not included in the final refinement. One of these missing loops, residues 176–186, is small with 11 residues; the other loop, residues 266–296, is larger with 31 residues.

Figure 1.

A: Ribbon diagram of At2g03760. The darkened area highlights the additional secondary structure within the protein (see text). B: Ribbon diagram of human estrogen sulfotransferase (hEST). The darkened loop highlights the long loop in hEST that is disordered in the At2g03760 structure. C: Sequence alignments for At2g03760, hEST, and DHEST. Conserved catalytic residues are noted by asterisks. Images were created with Molscript [19] and Raster3D [20] software.

Upon final refinement, the structural model for At2g03760 was submitted to the DALI server to find structurally similar proteins [17]. Of the 143 similar proteins returned with Z scores above 2.0, only two gave a Z score above 10. These two were also the only structures with sequence identities of at least 29%: human estrogen sulfotransferase (hEST, sequence identity 30%) and human dehydroepiandrosterone sulfotransferase (DHEST, sequence identity 29%). Of the two, hEST has the lower RMSD for all main chain atoms when aligned with At2g03760 (3.12 Å, versus 3.26 Å for DHEST). The At2g03760 structure is also similar to phosphate kinases, the most similar being adenylate kinase (PDB 1ZIN), with an RMSD of 4.0 Å and a sequence identity of 7% [18].
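
The two similarity measures quoted above, percent sequence identity and RMSD, can be illustrated with a minimal sketch. This is not the DALI implementation: the function names and toy inputs are hypothetical, identity is counted here only over alignment columns without gaps, and the RMSD assumes the two coordinate sets are already paired and superposed (finding the optimal superposition is a separate step that is not shown).

```python
import math


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    paired = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not paired:
        return 0.0
    matches = sum(1 for x, y in paired if x == y)
    return 100.0 * matches / len(paired)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) tuples, already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy inputs for illustration only; these are not data from the structures discussed.
toy_identity = percent_identity("MKT-AYLS", "MRTQAY-S")
toy_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
print(f"identity = {toy_identity:.1f}%, RMSD = {toy_rmsd:.2f} A")
```

Because both numbers depend on which residues are paired, values reported by different alignment programs for the same pair of structures are not directly comparable.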

Three residues have been identified as conserved in the steroid sulfotransferases whose three-dimensional structures have been determined [1]. These residues are Lys 48, His 108, and Ser 138 in hEST, which correspond to Lys 44, His 99, and Ser 129 in DHEST and Lys 75, His 140, and Ser 170 in At2g03760 [Fig. 1(C)]. Among these conserved residues, the serine and lysine residues of both human proteins point into the PAPS binding site. At2g03760 was crystallized in the absence of substrate, and the turn containing Lys 75 occupies the pocket where a PAPS molecule or similar molecule would presumably bind. All three structures also contain a histidine residue near the steroid-binding pocket; in hEST, this residue has been shown to aid in the catalytic transfer of the sulfate group. Two Arg residues have also been identified within the brassinosteroid sulfotransferases as being important for either PAPS binding or catalysis [7]. These two residues correspond to Arg 161 and Arg 287 of At2g03760.
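
Residue correspondences of this kind are read off a sequence alignment such as the one in Fig. 1(C). The sketch below shows one way the bookkeeping could be done; the function name, the toy sequences, and the starting residue numbers are all hypothetical and are not taken from the actual alignment.

```python
def map_residue_numbers(aligned_a, aligned_b, start_a=1, start_b=1):
    """Map residue numbers in sequence A to residue numbers in sequence B.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Only columns where both sequences have a residue appear in the mapping.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    num_a, num_b = start_a, start_b
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != "-" and res_b != "-":
            mapping[num_a] = num_b
        # Advance each counter only when that sequence has a residue in this column.
        if res_a != "-":
            num_a += 1
        if res_b != "-":
            num_b += 1
    return mapping


# Toy alignment: sequence B carries a two-residue N-terminal extension.
toy_a = "--ACDKEF"
toy_b = "MGACDKEF"
toy_map = map_residue_numbers(toy_a, toy_b)
print(toy_map[4])  # residue 4 of A (the K) pairs with this residue number in B
```

An N-terminal extension in one protein shifts every downstream residue number by a constant offset until the next gap, which is why equivalent catalytic residues carry different numbers in homologous structures.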

As noted above, the At2g03760 structure has a long disordered loop between residues E266 and G296 that could not be found in the electron density. Both the hEST and the DHEST structures contain similar, unstructured loops in analogous regions. This loop in the hEST (E231-G262) [highlighted in black in Fig. 1(B)] and DHEST (S221-G252) structures lies over the PAPS and steroid binding pocket. This loop is not ordered in the At2g03760 structure, presumably because substrate is not present. Whether substrate binding will impart order on this region requires identification of the appropriate substrates and subsequent crystallographic investigation.

An additional structural difference between At2g03760 and the hEST or DHEST structures deserves comment. At2g03760 contains an additional alpha-helix, turn, beta-strand at the N-terminus [highlighted in black in Fig. 1(A)] that is not present in either the hEST or the DHEST structures [Fig. 1(B)]. Other than this addition to At2g03760, the tertiary structures of At2g03760, hEST, and DHEST are nearly identical.


Although the specific substrates are unknown, it is highly probable that At2g03760 is a steroid sulfotransferase. The structural similarities with two human steroid sulfotransferases suggest that At2g03760 contains both a region for binding a PAPS molecule, and a region for binding a steroid-like molecule. The high sequence identity with brassinosteroid sulfotransferases characterized from B. napus further indicates that At2g03760 may have a similar enzymatic function in A. thaliana.


We acknowledge financial support from NIH National Institute for General Medical Sciences grant P50 GM64598, the BioCARS beamline at APS/Argonne National Laboratory, and members of the CESG team: B. Buchan, J. Cao, C. Cornilescu, J. Doreleijers, D. Dyer, H. Geetha, D. Hruby, S. Leisman, F. Peterson, B. Ramirez, N. Rosenberg, M. Runnels, K. Seder, J. Shaw, J. Song, E. Tyler, D. Vinarov, F. Vojtik, G. Wesenberg, W. Westler, J. Zhang, and Z. Zolnai.