Ki Hyun Nam and Soo-Jin Kim contributed equally to this work.
Crystal structure of engineered β-glucosidase from a soil metagenome
Article first published online: 19 AUG 2008
Copyright © 2008 Wiley-Liss, Inc.
Proteins: Structure, Function, and Bioinformatics
Volume 73, Issue 3, pages 788–793, 15 November 2008
How to Cite
Nam, K. H., Kim, S.-J., Kim, M.-Y., Kim, J. H., Yeo, Y.-S., Lee, C.-M., Jun, H.-K. and Hwang, K. Y. (2008), Crystal structure of engineered β-glucosidase from a soil metagenome. Proteins, 73: 788–793. doi: 10.1002/prot.22199
- Issue published online: 24 SEP 2008
- Manuscript Accepted: 25 JUN 2008
- Manuscript Revised: 18 JUN 2008
- Manuscript Received: 5 APR 2008
- Korea Science and Engineering Foundation. Grant Number: R01-2007-000-20072-0
- National Institute of Agricultural Biotechnology. Grant Number: 05-4-11-16-3
- protein engineering;
- crystal structure;
- β-glucosidase complex with tartaric acid
Intensive screening of microbial isolates over the last several years has resulted in the identification and commercialization of numerous biomolecules, many of which are the products of microbial secondary metabolism.1 Soil microorganisms in particular have been identified as valuable sources of naturally occurring industrially relevant antibiotics.2 It has been estimated, based on the reassociation kinetics of DNA isolated from various soil samples, that the number of distinct prokaryotic genomes in soil ranges from 2000 to 18,000 genomes per gram of soil.3 Metagenomics is a rapidly developing field that involves the study of complex genomes within different environments and microbial niches.4 Several successful attempts to generate novel enzymes with enhanced catalytic activity and thermal stabilities using metagenomics have recently been reported.5–7
The utilization of polysaccharides as an energy, chemical, and carbon source requires the activity of β-glucosidase (Bgl; EC 3.2.1.21).8, 9 Bgl cleaves β-1,4-glycosidic linkages in disaccharide or glucose-substituted molecules through an acid–base catalytic mechanism, analogous to hydrolysis.10 The function of Bgl is essential for carbohydrate metabolism in cells, and the use of Bgl in various biotechnological processes, such as food and chemical synthesis, has been explored.11, 12 Defects in Bgl activity in humans are associated with Gaucher disease, a non-neuronopathic lysosomal storage disorder.13 There are two Bgl homologues, BglA and BglB, which have been shown to be members of the GH-1 family of enzymes. BglA and BglB are involved in the hydrolysis of cellodextrins, but they have different quaternary structures and substrate specificities. BglA is a cellobiase with an unusual octameric structure, while BglB is a monomer that acts as an exo-Bgl, hydrolyzing cellobiose and cellodextrins with a high degree of polymerization.14, 15
Recently, we isolated and characterized a novel bgl gene from uncultured soil bacteria (Usbgl).16 The gene encoded a protein with a predicted molecular weight of 55 kDa and an amino acid sequence that was 56% identical to that of the family 1 glycosyl hydrolase of Chloroflexus aurantiacus. UsBgl exhibited substantial glycosyl hydrolase activity in the presence of natural glycosyl substrates, such as sophorose, cellobiose, cellotriose, cellotetraose, salicin, and arbutin, and was able to convert the major ginsenoside Rb1 into the pharmaceutically active minor ginsenoside, Rd. UsBgl also exhibited thermostable properties, which indicated that it may be useful in the development of novel biotechnical processes.16
Here, we report the X-ray crystal structure of engineered UsBgl. This type of structural analysis will contribute not only to our ability to engineer pharmaceutically active forms of Bgl, but may also lead to a better understanding of the molecular etiology of Gaucher disease. Our results also provide insight into the development and production of value-added products, including pharmaceuticals.
Protein engineering, preparation, crystallization, and activity determination
Recombinant Bgl that lacked a signal peptide (accession no. DQ842022) was amplified from the β-glucosidase gene using the primers 5′-TCGCGGATCCATGACTGAACATGAGCTTCAG-3′ (which contained a BamHI site at the 5′-end) and 5′-GCAAGCTTAATAGCGGGCGCGGCTAGCCC-3′ (which contained a HindIII site at the 3′-end). The amplified DNA was ligated into BamHI- and HindIII-digested pET28a(+) (Novagen) to create pEGLU. Mutations were introduced using the QuikChange Multi Site-Directed Mutagenesis kit (Stratagene) with pEGLU (encoding wild-type UsBgl) as the template. Two primers were designed to introduce the following sets of mutations: the N-terminal primer (5′-TC GGA TCC AAC GTT AAG AAG TTC CCC GAG GGC TTT CTG TGG GGC-3′) introduced the mutations Glu39Asn, Leu40Val, Gln41Lys, Pro42Lys, and Lys45Glu; the C-terminal primer (5′-GC AAG CTT TCA ATC CTC TAG CCC GTT GTT CGC AAT TAC GTC GCG-3′) introduced the mutations Arg477Asn, Ala481Glu, and Ala482Asp. BL21(DE3) cells were transformed with the recombinant plasmids, and cultures were grown at 37°C in LB medium supplemented with kanamycin (50 μg/mL). Expression was induced for 18 h at 22°C by the addition of 1 mM IPTG. Cells were harvested by centrifugation at 4000 rpm for 30 min, resuspended in lysis buffer (50 mM Tris-HCl, pH 8.0; buffer A), and then sonicated. The lysate was centrifuged at 14,000 rpm for 30 min, and the supernatant was collected and loaded onto a His Trap column (GE-Healthcare) that was pre-equilibrated with buffer A. Recombinant protein was eluted using a linear gradient of 0.5 M imidazole in buffer A, and then further purified by Hi-Load 16/60 Superdex 200 prep-grade chromatography (GE-Healthcare) using buffer B (10 mM Tris-HCl, pH 8.0, 50 mM NaCl, 50 mM KCl, and 10 mM DTT). Purified protein was concentrated to 20 mg/mL using a Centriprep column (Millipore).
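As a quick consistency check, the two mutagenic primers can be translated in frame to confirm that they encode the listed substitutions. The following is a minimal sketch, not part of the original protocol: the helper names are hypothetical, and the reading frames are assumptions (the forward primer is read after the two-nucleotide clamp and the BamHI site, and the reverse primer is read on the complementary strand up to its stop codon).

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(dna: str) -> str:
    """Remove the spacing used in the printed primers."""
    return dna.replace(" ", "").upper()


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop."""
    seq = clean(dna)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna: str) -> str:
    return clean(dna).translate(str.maketrans("ACGT", "TGCA"))[::-1]


N_PRIMER = "TC GGA TCC AAC GTT AAG AAG TTC CCC GAG GGC TTT CTG TGG GGC"
C_PRIMER = "GC AAG CTT TCA ATC CTC TAG CCC GTT GTT CGC AAT TAC GTC GCG"

# Forward primer: skip the 2-nt clamp plus the BamHI site (GGATCC), 8 nt in all.
n_peptide = translate(clean(N_PRIMER)[8:])

# Reverse primer: translate the coding strand and keep what precedes the stop.
c_peptide = translate(reverse_complement(C_PRIMER)).split("*")[0]

print(n_peptide)  # NVKKFPEGFLWG
print(c_peptide)  # RDVIANNGLED
```

If the first translated residue of the forward primer is taken as position 39, the peptide reads Asn39-Val40-Lys41-Lys42-Phe43-Pro44-Glu45, which agrees with the N-terminal substitutions. If the final Asp of the reverse-primer peptide is taken as position 482, the C-terminal end reads Asn477-Asn478-Gly479-Leu480-Glu481-Asp482, which agrees with the C-terminal substitutions.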
The initial crystallization of UsBgl was performed with an automated Hydra II crystallization robot at 22°C using the sitting-drop vapor-diffusion method and commercially available kits from Hampton Research. Initial micro-crystals were obtained using 0.8 M Na/K tartrate and Tris-HCl (pH 7.0). Each hanging drop was prepared by mixing 1 μL each of the protein solution and the reservoir solution, and then the mixture was placed over 0.5 mL of reservoir solution. Optimized crystals were obtained from 0.8 M Na/K tartrate and 0.1 M Tris-HCl (pH 7.0–7.4) or 0.1 M MES (pH 6.8–7.0). Enzyme assays were carried out at 55°C using saccharide or related reagent as the substrate.16
X-ray data collection and structure determination
X-ray diffraction data were collected from a cooled crystal using an ADSC Quantum 210 CCD detector at beamline 6C of the Pohang Light Source (South Korea). Crystals were flash-frozen in a liquid nitrogen stream with 25% (v/v) glycerol as a cryo-protectant. The wavelength of the synchrotron X-rays was 1.23986 Å. The raw data were processed and scaled using DENZO and SCALEPACK from the HKL2000 program.17 Initial phases were obtained using molecular replacement. The PHASER program within the CCP4 program suite was employed, using BglA from T. maritima (PDB code 1OIM) without water molecules and metal as the search model.18 The structure was refined using simulated annealing, energy minimization, and individual isotropic B factor refinement in the Crystallography & NMR System (CNS).19 A subset (5%) of the total number of reflections was randomly selected as a test set, and multiple cycles of editing and adjustment of the model were carried out against σA-weighted 3Fo − 2Fc and Fo − Fc maps using the program Coot.20 Final refinement was carried out in REFMAC from the CCP4 suite of programs,21 and final structures were validated with PROCHECK.22 Structural analyses were calculated using CNS and PDBsum.23 Graphical representations of the molecule were generated using PyMOL.24
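For orientation, the stated wavelength can be converted to a photon energy, and Bragg's law gives the scattering angle of the highest-resolution reflections. This is a minimal sketch under stated assumptions: the function names are illustrative, and the 1.6 Å limit is the resolution reported for this structure. The detector distance and geometry are not given in the text, so no detector-edge calculation is attempted.

```python
import math

# h*c expressed in keV·Å, so that E (keV) = HC_KEV_ANGSTROM / wavelength (Å).
HC_KEV_ANGSTROM = 12.398419843


def photon_energy_kev(wavelength_a: float) -> float:
    """Photon energy in keV for a wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_a


def bragg_two_theta_deg(wavelength_a: float, d_spacing_a: float) -> float:
    """Scattering angle 2θ in degrees from Bragg's law, λ = 2 d sin θ."""
    return 2.0 * math.degrees(math.asin(wavelength_a / (2.0 * d_spacing_a)))


WAVELENGTH_A = 1.23986  # wavelength stated in the text
RESOLUTION_A = 1.6      # high-resolution limit reported for the structure

energy = photon_energy_kev(WAVELENGTH_A)
two_theta = bragg_two_theta_deg(WAVELENGTH_A, RESOLUTION_A)

print(f"photon energy: {energy:.2f} keV")            # close to 10 keV
print(f"2-theta at 1.6 A: {two_theta:.1f} degrees")  # roughly 45 to 46 degrees
```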
The coordinates and structure factors for UsBgl have been deposited in the RCSB Protein Data Bank, accession code 3CMJ.
RESULTS AND DISCUSSION
Previously, we reported the isolation and purification of UsBgl from uncultured soil bacteria, and demonstrated that it possessed catalytic activity.16 UsBgl was less stable and had lower solubility than other Bgl family members (data not shown). We identified a putative signal peptide in the N-terminal region of UsBgl, and a likely cleavage site between Ala18 and Lys19, using neural networks and hidden Markov models trained on sequences from gram-negative bacteria.25 In the reported crystal structure of Bacillus cereus oligo-1,6-glucosidase, the N- and C-termini of the protein form a closed loop, and this structural feature is regarded as critical for maintaining enzyme stability.26 Several other Bgl proteins also display a similar structural feature, in which the N- and C-termini interact through hydrogen bonds. BLAST and PSI-BLAST searches27 using UsBgl as a query sequence indicated that UsBgl is 47.4%, 44.1%, and 47.0% similar to Thermotoga maritima BglA (TmBglA), Bacillus polymyxa BglA (BpBglA), and Bacillus polymyxa BglB (BpBglB), respectively. To enhance the thermostability of UsBgl, we engineered several point mutations in the N- and C-terminal regions of the protein, based on the sequence and structure of TmBglA, which is a highly thermostable protein. These mutations resulted in the following amino acid substitutions in UsBgl: Glu39Asn, Leu40Val, Gln41Lys, Pro42Lys, Lys45Glu (N-terminal), and Arg477Asn, Ala481Glu, Ala482Asp (C-terminal) [Fig. 1(a)]. Engineered UsBgl, in which the N- and C-termini were mutated to match those of TmBglA, exhibited enhanced protein solubility (data not shown) and catalytic activity (Table I). Moreover, we obtained crystals of engineered UsBgl under the crystallization conditions discussed earlier, whereas we have not been successful in crystallizing wild-type UsBgl.
|kcat (s−1)||22.6 ± 5.0||33 ± 2.2||0.71 ± 0.014||1.58 ± 0.042||0.76 ± 0.26||1.46 ± 0.33|
|Km (mM)||0.12 ± 0.03||0.097 ± 0.006||4.05 ± 0.68||4.19 ± 0.79||0.95 ± 0.019||0.57 ± 0.058|
|kcat/Km (mM−1 s−1)||179||336||0.176||0.376||0.80||2.58|
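The last row of Table I is the catalytic efficiency, the ratio of the turnover number to the Michaelis constant, which has units of mM⁻¹ s⁻¹ when kcat is in s⁻¹ and Km is in mM. The sketch below recomputes that ratio from the mean values in the first two rows. The columns are simply numbered 1 to 6 in the order they appear, since their labels are not given here, and the small differences from the tabulated values are presumably due to rounding of the published means.

```python
# (kcat in 1/s, Km in mM) pairs copied column by column from Table I.
# Column labels are not available, so columns are numbered 1-6.
TABLE_I = [
    (22.6, 0.12),
    (33.0, 0.097),
    (0.71, 4.05),
    (1.58, 4.19),
    (0.76, 0.95),
    (1.46, 0.57),
]

# kcat/Km values as printed in the last row of Table I.
REPORTED = [179, 336, 0.176, 0.376, 0.80, 2.58]


def catalytic_efficiency(kcat: float, km: float) -> float:
    """Catalytic efficiency kcat/Km in 1/(mM s)."""
    return kcat / km


efficiencies = [catalytic_efficiency(kcat, km) for kcat, km in TABLE_I]

for column, (computed, reported) in enumerate(zip(efficiencies, REPORTED), start=1):
    print(f"column {column}: computed {computed:.3g}, reported {reported}")
```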
Engineered UsBgl eluted as a monomer during size-exclusion chromatography and crystallized under K/Na-tartrate conditions. The crystal belonged to the space group P2₁2₁2₁, with one molecule per asymmetric unit. Unlike other BglB proteins, we did not observe oligomeric structures in the crystallographic packing arrangement. The structure of UsBgl was determined using molecular replacement and was refined to 1.6 Å with an Rwork of 16.5% and Rfree of 19.7% (Table II). The overall structure of UsBgl had the classic TIM barrel structure, which was nearly identical to the structures of other Bgl proteins14, 28–30 [Fig. 1(b)]. Each β-strand was hydrogen bonded to two neighboring strands, and within the barrel, a complete ring of hydrogen bonds was present. The engineered N- and C-termini were sealed by hydrogen bonding, in which the Nδ of Asn478 interacted with the carbonyl groups of Phe43 and Pro44 at distances of 2.87 and 3.10 Å, respectively. The carbonyl group of Asn478 interacted with the amino group of Phe43 over a distance of 2.78 Å, and the carbonyl group of Lys42 and amino group of Leu480 interacted with the hydrogen bond network via water molecules (2.76 and 2.87 Å, respectively) [Fig. 1(c)]. These results indicated that the closed structure of the N- and C-termini may play a role in increasing the stability of the engineered protein. However, we cannot rule out the contributions of the deleted signal peptide or the deleted regions of wild-type UsBgl to the stability of the engineered protein.
|Data collection statistics|
|Unit cell parameters (Å)||a = 70.331, b = 71.002, c = 86.978|
|Resolution range (Å)||20–1.6 (1.66–1.60)|
|Average I/σ(I)||35.57 (3.55)|
|Rmerge (%)a||0.065 (0.186)|
|Reflections in working set||23,215|
|Reflections in test set||2537|
|r.m.s.d. bonds (Å)||0.006|
|r.m.s.d. angles (°)||1.2|
|Ramachandran plot (%)c|
|Average B factor (Å2)|
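The unit cell in Table II allows a rough check of the asymmetric unit content through the Matthews coefficient, Vm = V / (Z × n × M). This is a sketch under stated assumptions, not a calculation reported by the authors: it uses the predicted mass of about 55 kDa as a stand-in for the crystallized construct, Z = 4 for an orthorhombic P2₁2₁2₁ cell, and the conventional estimate of the solvent fraction, 1 − 1.23/Vm.

```python
# Unit cell edges in Å from Table II; the cell is orthorhombic, so V = a*b*c.
CELL_A, CELL_B, CELL_C = 70.331, 71.002, 86.978

ASU_PER_CELL = 4     # P2(1)2(1)2(1) has four asymmetric units per cell
MASS_DA = 55_000.0   # assumption: predicted mass of UsBgl, not the exact construct


def matthews_coefficient(cell_volume: float, molecules_per_asu: int) -> float:
    """Crystal volume per unit of protein mass, in Å^3/Da."""
    return cell_volume / (ASU_PER_CELL * molecules_per_asu * MASS_DA)


def solvent_fraction(vm: float) -> float:
    """Conventional estimate of solvent content from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


cell_volume = CELL_A * CELL_B * CELL_C
vm_one = matthews_coefficient(cell_volume, 1)
vm_two = matthews_coefficient(cell_volume, 2)

for n, vm in ((1, vm_one), (2, vm_two)):
    print(f"{n} molecule(s) per ASU: Vm = {vm:.2f} A^3/Da, "
          f"solvent fraction = {solvent_fraction(vm):.2f}")
```

Under these assumptions, one molecule per asymmetric unit gives a Matthews coefficient near 2 Å³/Da, whereas two molecules would require a negative solvent fraction, which is consistent with the single molecule described in the text.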
When we compared the structure of UsBgl with those of TmBglA (PDB code 1OIM), BpBglA (1BGA), and BpBglB (2O9T), the rms deviations were 0.68, 0.79, and 0.98 Å, respectively.14, 29–30 The entrance to the substrate binding pocket of UsBgl was formed primarily by the four extended loops that connected the strands and helices at the C-terminal side of the barrel, similar to other GH-1 family members. These long segments at the C-termini of the β-strands define the active centre, with the catalytic residues located at the ends of β4 (acid–base Glu203) and β11 (nucleophilic Glu387). These two amino acid residues have been shown to be required for the activity of glycosyl hydrolase.31 Several highly conserved residues in β4 and β11, including Trp49, Tyr113, Arg114, Phe115, Asn254, Asn330, Ile385, and Tyr432, may also play an important role in substrate recognition and the binding of sugar substrates and inhibitors.
The active site pocket of UsBgl bound to tartaric acid contained 13.2 buried vertices, as calculated using the PDBsum program.23 This was similar to BpBglB (2O9T) (13.3 buried vertices). Among BglA family members, the numbers of buried vertices of TmBglA (1OIM) and BpBglA (1BGA) are 16.1 and 9.2, respectively. Typically, BglA proteins have a narrow cavity and active site pocket, whereas the active site pocket of BglB proteins is larger than that of BglA. Thus, while the amino acid sequence and Cα structure of UsBgl are similar to those of TmBglA, the substrate specificity,16 oligomeric state in solution based on size-exclusion experiments, and structure are more similar to those of BglB. Therefore, we propose that UsBgl belongs to the BglB family of proteins.
In the tartaric acid-bound form of UsBgl, conserved active site residues (Glu203 and Glu387) interact with the bound substrate. Interestingly, we found that tartaric acid adopted a slightly twisted form, and interacted with several residues in the active site pocket of UsBgl (Fig. 2). Tartaric acid interacted with the Oϵ atom of Glu203 (acid–base) and the Oϵ atom of Glu387 (nucleophilic) over 2.92 and 2.69 Å, respectively, as well as the conserved Oϵ atom of Gln57, Nϵ atom of His158, Nδ atom of Asn202, Oη atom of Tyr332, Nϵ atom of Trp434, Oϵ atom of Glu441 and Nϵ atom of Trp442 over a distance of 2.74, 2.97, 2.66, 2.71, 2.78, 2.33, and 3.04 Å, respectively. The mechanism of binding of the hydroxyl group of tartaric acid was similar to that of the glucoside geometry in the structure of the TmBglA-glucoside-tetrazole complex (2J7B, to be published).
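Because the contact distances above are given as two "respectively" lists, it can help to pair each atom with its distance explicitly. The sketch below does that and sorts the contacts by length. The 2.5 to 3.5 Å window used to flag unusual values is only an assumed rule of thumb for hydrogen bonds between heavy atoms, not a criterion taken from the paper.

```python
# Contacts between tartaric acid and UsBgl, paired in the order listed in the text.
CONTACTS = {
    "Glu203 Oε": 2.92,
    "Glu387 Oε": 2.69,
    "Gln57 Oε": 2.74,
    "His158 Nε": 2.97,
    "Asn202 Nδ": 2.66,
    "Tyr332 Oη": 2.71,
    "Trp434 Nε": 2.78,
    "Glu441 Oε": 2.33,
    "Trp442 Nε": 3.04,
}

# Assumed rule-of-thumb window for heavy-atom hydrogen bond distances (Å).
HBOND_MIN, HBOND_MAX = 2.5, 3.5

# Sort contacts from shortest to longest.
ranked = sorted(CONTACTS.items(), key=lambda item: item[1])

# Collect contacts that fall outside the assumed window.
outside_window = [
    atom for atom, distance in ranked
    if not HBOND_MIN <= distance <= HBOND_MAX
]

for atom, distance in ranked:
    note = "" if HBOND_MIN <= distance <= HBOND_MAX else "  (outside assumed window)"
    print(f"{atom:<10} {distance:.2f} Å{note}")
```

With this window, only the Glu441 contact at 2.33 Å is flagged as shorter than usual; the contacts to the two catalytic glutamates fall well inside the range.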
In summary, we have determined the crystal structure of engineered UsBgl, in which the N- and C-terminal regions were modified to enhance the stability of the protein. Our results provide insight into the engineering of enzymes for enhanced protein stability and increased activity. On the basis of biological and structural analysis, we propose that UsBgl belongs to the BglB family of GH-1 enzymes. The structure of the tartaric acid-bound form of UsBgl has significant implications for drug discovery efforts.
The authors thank Dr. H. S. Lee, K. J. Kim, and K. H. Kim for assistance during data collection at beamline 6C of the Pohang Light Source, Korea.