Crystal structure of XC5357 from Xanthomonas campestris: A putative tetracenomycin polyketide synthesis protein adopting a novel cupin subfamily structure



Members of the genus Xanthomonas represent a major group of phytopathogenic bacteria that infect most economically important crop plants. In recent years, extensive genome analyses and comparative genomics have been used to identify novel gene products of biological importance for these phytopathogens.1–5 Xanthomonas campestris pathovar campestris (Xcc) is a Gram-negative, yellow-pigmented bacterium that causes black rot of cruciferous crops. A local Xanthomonas campestris strain 17 from Taiwan has been chosen for structural genomics studies. Its genome has been sequenced (unpublished result from Tsai et al. 2004) and its ORFs annotated using a bioinformatics approach. XC5357 is one of these gene products; it consists of 113 amino acids with a MW of 12.2 kDa. A BLAST search against the UniProt6 database recognized significant sequence identity to the tetracenomycin polyketide synthesis protein from X. campestris ATCC 33913 (Q4URX6),1 X. axonopodis (Q8PN81),1 and H. marismortui (Q5VOH1, tcmJ),7 with percentage identities of 99.1, 82, and 51%, respectively. An InterPro scan8 for sequence motifs against the PROSITE9 and Pfam10 databases returned a cupin motif,11, 12 and sequence matches to existing PDB structures identified several oxalate decarboxylases (1O4T, 1L3J, 1J58, and 1UW8), albeit with low sequence identities (25–30%). Here we report the crystal structure of XC5357 from Xcc and use this information to deduce its possible biological function. The determined structure was indeed found to belong to the cupin superfamily.11, 12 However, it also possesses distinct structural features and belongs to a novel cupin subfamily.
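For readers who wish to see how a percentage identity of this kind is obtained, the following is a minimal sketch that counts identical columns in a pairwise alignment. The function name `percent_identity` and the choice of the full alignment length as denominator are assumptions made for illustration; the text does not state which convention was applied, and the sequences below are placeholders rather than the real XC5357 sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns in which both sequences carry the same residue.

    Assumption: the denominator is the full alignment length (gap columns
    included). Other conventions divide by the shorter sequence length instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if the residues match and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


# Placeholder sequences: 113 columns with a single mismatch at the end.
query = "A" * 112 + "C"
subject = "A" * 113
print(round(percent_identity(query, subject), 1))
```

Under this convention, a single mismatch over 113 aligned positions gives a value that rounds to 99.1%. Whether this corresponds to the actual alignment behind the reported figure is not stated in the text.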

Materials and Methods.

The cloning of the XC5357 gene, along with its expression, protein purification, and crystallization screening, has been described in a previous communication.13 Se-Met labeled XC5357 was produced using the non-auxotrophic E. coli strain BL21(DE3) as a host in the absence of methionine but with ample Se-Met (100 mg/L). Induction was carried out at 37°C for 4 h by the addition of 0.5 mM IPTG in M9 medium consisting of 1 g of NH4Cl, 3 g of KH2PO4, and 6 g of Na2HPO4, supplemented with 20% (w/v) glucose, 0.3% (w/v) MgSO4, and 10 mg of FeSO4. Purification and crystallization of the Se-Met labeled XC5357 were performed using the protocols established for the native protein.13 Crystals suitable for diffraction experiments reached maximum dimensions of 0.2 × 0.2 × 0.1 mm3 after 3 days.13 The data collection statistics have been published previously.13

Data Refinement.

Crystals were soaked in a cryoprotectant solution comprising reservoir solution plus 25% (v/v) glycerol. Diffraction data sets were collected at 100 K from a single Se-Met labeled crystal at the SPring-8 Taiwan beamline BL12B2 using an ADSC Quantum 4R CCD detector. The diffraction data were processed using HKL200014 and indexed according to the same crystal orientation matrix. The refinement of selenium atom positions, phase calculation, density modification, and building of the initial model were performed using the program SOLVE/RESOLVE.15 The model was then rebuilt manually and refined with CNS16 to a final Rcryst of 23.1% and Rfree of 25.3%. The Matthews coefficient for XC5357 is 2.96 Å3/Da, and the estimated solvent content is 58.08%. The refinement statistics are summarized in Table I. The atomic coordinates of XC5357 have been deposited in the PDB (2GU9).
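The relation between the Matthews coefficient and the solvent content quoted above can be sketched as follows. This is a minimal illustration, assuming the commonly used approximation in which the protein fraction of the cell is 1.23/VM; the constant 1.23 Å3/Da and the function name `solvent_fraction` are not taken from the text, and the program actually used for the reported estimate may apply a slightly different protein density.

```python
def solvent_fraction(matthews_coefficient, protein_volume_per_dalton=1.23):
    """Estimate the solvent fraction of a crystal from its Matthews coefficient.

    matthews_coefficient: V_M in cubic angstroms per dalton.
    protein_volume_per_dalton: assumed volume occupied by protein per dalton
        (1.23 cubic angstroms per dalton is a conventional value, an assumption here).
    """
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_volume_per_dalton / matthews_coefficient


# V_M reported for XC5357 in the text.
print(f"{100 * solvent_fraction(2.96):.1f}% solvent")
```

With the reported VM of 2.96 Å3/Da this approximation gives a solvent content of about 58%, in line with the stated estimate; the small difference in the decimals presumably reflects the exact constant used.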

Table I. Refinement Statistics for XC5357

Parameter                                   Value
Resolution range (Å)                        30.0–1.40 (1.46–1.40)
Data cutoff (σF)                            0.0
Completeness of used reflections (%)        96.3 (93.4)
Number of used reflections                  32312 (2854)
Rfree test set size (%)                     5.1
R/Rfree (%)                                 23.1/25.3
Number of nonhydrogen atoms
r.m.s. deviations
  Bond lengths (Å)/bond angles (°)          0.006/1.4

Values in parentheses are for the highest resolution shell.
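
As a reminder of what the R values in Table I measure, the conventional crystallographic residual can be written as below. This is the standard textbook definition rather than a formula quoted from this work; the scale factor k and the symbols W and T for the working and test sets of reflections are notation introduced here for illustration.

```latex
R_{\mathrm{cryst}} =
  \frac{\sum_{hkl \in W} \bigl|\, |F_{\mathrm{obs}}(hkl)| - k\,|F_{\mathrm{calc}}(hkl)| \,\bigr|}
       {\sum_{hkl \in W} |F_{\mathrm{obs}}(hkl)|},
\qquad
R_{\mathrm{free}} =
  \frac{\sum_{hkl \in T} \bigl|\, |F_{\mathrm{obs}}(hkl)| - k\,|F_{\mathrm{calc}}(hkl)| \,\bigr|}
       {\sum_{hkl \in T} |F_{\mathrm{obs}}(hkl)|}
```

Here T is the subset of reflections excluded from refinement (5.1% of the data according to Table I), so Rfree serves as a cross-validation check on Rcryst.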

Results and Discussion.

The structure of XC5357 has been determined to a resolution of 1.4 Å using the MAD method, with the refinement statistics listed in Table I. The final model (Fig. 1) comprises two protein molecules, each spanning residues 2 to 112, and 340 water molecules. No metal ion was found in the electron density map. The Ramachandran plot produced by Pdb-Viewer17 shows that 93.7% of the residues are in the most favored regions and 3.6% are in the additional allowed regions. Two aspartate residues were found to adopt backbone torsion angles that deviate from the allowed values. The first, D38, occupies a strategic position in the twisting β4–β5 loop, with its side chain Oδ1 atom forming a salt bridge with the backbone HN atom and its side chain Oδ2 atom forming an H-bond with the side chain Nϵ atom of the succeeding R40 residue. The second, D60, is located in the tight V59DGH62 turn connecting the β6 and β7 strands, with its backbone HN and side chain Oδ2 atoms forming two H-bonds with the A79 carbonyl oxygen atom.
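The backbone torsion angles plotted in a Ramachandran diagram are dihedral angles defined by four consecutive backbone atoms (C, N, Cα, C for φ and N, Cα, C, N for ψ). The following is a minimal sketch of how such an angle can be computed from Cartesian coordinates; the function name `dihedral_deg` and the test coordinates are illustrative and are not taken from the XC5357 model or from Pdb-Viewer.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    For phi pass (C of residue i-1, N, CA, C); for psi pass (N, CA, C, N of residue i+1).
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    if b2_len == 0.0:
        raise ValueError("central atoms coincide; the dihedral is undefined")
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Illustrative geometry only: the fourth atom is rotated a quarter turn about the central bond.
print(dihedral_deg((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
```

Computing φ and ψ this way for each residue and comparing them with the favored regions is what underlies the percentages quoted above.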

Figure 1.

(a) Diagram showing the secondary structure elements in XC5357 superimposed on its primary sequence. The conserved domains I and II are boxed in green, with the highly conserved amino acids shown in blue. The putative active site residues G36, D38, Q46, and H80 are shown in green and red. (b) Diagram of the XC5357 dimer, with one subunit color coded from blue (N-terminus) to green (C-terminus) and the other from green to red. XC5357 is an all-β protein comprising 12 β-strands in each monomer, with crossover domain swapping between β1 and β8 of the opposing subunits.

The XC5357 monomer is composed exclusively of β-strands and loops (Fig. 1), with a total β-strand content of 57%. It is formed primarily by two entirely antiparallel β-sheets that form a jelly roll β-sandwich, the major feature of the cupin barrel fold [Fig. 1(b)]. The homodimer is formed by domain swapping between the adjacent edge strands β1′ and β8 from the two subunits. The larger β-sheet has a six-stranded 2-3-10-5-8-1′ topology, while the smaller β-sheet has a four-stranded 4-9-6-7 topology. Many hydrophobic residues (L6, L8, L15, F16, L18, V21, W47, F49, V51, L71, V93, and F95) from each subunit form strong hydrophobic interactions at the interface. In addition to the 10 β-strands forming the two major β-sheets, two smaller antiparallel β-strands, β11 and β12, are also present, possibly to enclose a ligand binding cavity.

A structural fold and homolog search performed with the coordinates of XC5357 using the secondary structure matching18 and DALI19 servers identified a considerable number of structures with a similar conserved cupin fold. The twenty structures with the highest Z scores were chosen and superimposed using PdbViewer,17 as shown in Figure 2(a). The rmsd values between these monomer structures and the query XC5357 structure are very low, ranging only from 1.02 to 1.42 Å, indicating that the cupin fold is highly conserved. The rmsd values and the numbers of matched Cα atoms are listed in Figure 2(c). Even though the rmsd values are small, XC5357 adopts a distinct conformation in the β4–β5 motif I region [Fig. 2(a,b)], which exhibits a significantly different twist compared with the corresponding regions in all other structural homologues reported so far. In fact, XC5357 has no conserved HXH residues in the motif I region; they are replaced by GPD residues [Figs. 1(a) and 2(c)]. The close-up view of the domain I regions of XC5357 and 1O4T (an oxalate decarboxylase)20 in Figure 2(b) reveals dramatically different positions for the active site residues. While the metal ion binding residues H61, H63, E68, and H102 of 1O4T coordinate the manganese ion very well [colored in blue in Fig. 2(b)],20 the corresponding residues G36, D38, Q46, and H80 of XC5357 [colored in red in Fig. 2(b)] not only lack the coordination capability but have also shifted considerably in position. This is consistent with the observation that no metal ion was found in this region. The possibility that the conserved HXH cupin motif corresponds to N39R40H41 rather than G36P37D38 [Fig. 2(c)] can be eliminated, since the N39R40H41 residues are located in the β4–β5 loop region and the Nϵ atom of H41 is more than 10 Å away from the metal ion in 1O4T.
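The two geometric quantities used in this comparison, the rmsd over matched Cα atoms and an interatomic distance such as that between H41 Nϵ and the 1O4T metal ion, can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets are already superimposed and paired one to one; it does not perform the superposition itself, and the function names and coordinates are hypothetical rather than taken from the SSM, DALI, or PdbViewer programs.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumption: the structures are already superimposed and the i-th point of
    each list is a matched C-alpha pair.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("no atoms to compare")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def distance(p, q):
    """Euclidean distance between two (x, y, z) points, in the units of the input."""
    return math.dist(p, q)


# Hypothetical coordinates in angstroms, for illustration only.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rmsd(model_a, model_b))
print(distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)) > 10.0)
```

Because the rmsd averages over all matched atoms, a locally divergent region such as the β4–β5 motif can coexist with a low overall value, which is why the per-residue comparison in Figure 2(b) is informative.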

Figure 2.

(a) Superimposition of the twenty structural homologues of XC5357 identified by the secondary structure matching (SSM) and DALI servers. The β4–β5 motif I structure of XC5357 (colored in red) adopts a conformation very different from all other reported cupin domain structures. The PDB codes of these homologues are listed in (c). (b) Superimposed close-up stereo view of the motif I structures of 1O4T (in blue) and XC5357 (in red). The amino acids that act as ligands for the active site metal cofactor are shown in ball-and-stick. (c) Multiple sequence alignment of the representative cupin proteins superimposed in (a). The four β-strands 4, 5, 8, and 9 in motif I and motif II are labeled for 1O4T and XC5357 at the top and bottom of the figure, respectively. The amino acids that act as ligands for the active site metal cofactor are shown in red, except for the corresponding amino acids of XC5357, which are shown in green. Other amino acid residues highly conserved in these motifs are shown in blue. The first column shows the rmsd values and the numbers of Cα atoms matched in the superimposition, while the second column lists the PDB codes of the structural homologues employed in (a).

The cupin domain has been well studied and was found to contain two diagnostic conserved motifs.11, 12 The first motif comprises the β4–β5 region and the second motif the β8–β9 region [Figs. 1(a) and 2(c)], with an inter-motif region of variable length. Several glycine residues in these motifs are highly conserved; they are all located at loop turn positions preceding a β-strand [colored in blue in Fig. 2(c)]. Since the cupin superfamily is among the most functionally diverse of any described to date, how such a well-conserved tertiary structure can encompass so many different biochemical functions is an intriguing question. It has been suggested that this variety of biochemical functions arises from minor variations of the residues in the active site and from the identity of the bound metal ions.11, 12 Here we show a further possibility: functional diversity may also be achieved by twisting the conserved β4–β5 strands and changing the motif I structure, possibly to take on a different biochemical function. The determined XC5357 structure thus represents a novel subfamily of the well-conserved cupin superfamily.

A ProFunc21 search was performed to deduce the possible function of XC5357. A BLAST search against the UniProt6 database recognized significant sequence identity (99.1, 82, and 51%, respectively) to the tetracenomycin polyketide synthesis protein from X. campestris,1 X. axonopodis,1 and H. marismortui.7 The secondary structure matching search against the PDB database identified well-matched protein structures with a possible antibiotic synthesis function from Thermus thermophilus HB8 (1V70), with auxin binding function (1LR5), and with oxalate decarboxylase function (1VJ2, 1O4T, and 1J58). Furthermore, no hit was obtained from any of the 189 enzyme active site templates or from any of the 13374 ligand-binding templates.22 Cleft analysis23 did show a possible binding pocket of 1632 Å3 close to the metal ion binding site in the 1VJ2,24 1O4T,20 and 1J5825 structures. Since no metal ion is present in XC5357, enzymatic reactions requiring metal ions can be excluded for XC5357. From these combined analyses, XC5357 is likely a tetracenomycin polyketide synthesis enzyme with a novel substrate binding cavity requiring no metal ion for activity. The information reported here, along with further biochemical and biophysical studies, will yield valuable insights into this possible antibiotic synthesis protein.


The authors thank the core facilities for protein X-ray crystallography in the Academia Sinica, Taiwan, in the National Synchrotron Radiation Research Center, Taiwan, and in the SPring-8 Synchrotron facility in Japan for assistance with X-ray data collection. The National Synchrotron Radiation Research Center is a user facility supported by the National Science Council, Taiwan, ROC, and the Protein Crystallography Facility is supported by the National Research Program for Genomic Medicine, Taiwan, ROC. This work is supported by NSC grants 94-2113-M005-003 and 95-2113-M005-018 to SHC.