Family GH16 glycoside hydrolases can be assigned to five subgroups according to their substrate specificities, including xyloglucan transglucosylases/hydrolases (XTHs), (1,3)-β-galactanases, (1,4)-β-galactanases/κ-carrageenases, “nonspecific” (1,3/1,3;1,4)-β-d-glucan endohydrolases, and (1,3;1,4)-β-d-glucan endohydrolases. A structured family GH16 glycoside hydrolase database has been constructed (http://www.ghdb.uni-stuttgart.de) and provides multiple sequence alignments with functionally annotated amino acid residues and phylogenetic trees. The database has been used for homology modeling of seven glycoside hydrolases from the GH16 family with various substrate specificities, based on structural coordinates for (1,3;1,4)-β-d-glucan endohydrolases and a κ-carrageenase. In combination with multiple sequence alignments, the models predict the three-dimensional (3D) dispositions of amino acid residues in the substrate-binding and catalytic sites of XTHs and (1,3/1,3;1,4)-β-d-glucan endohydrolases; there is no structural information available in the databases for the latter group of enzymes. Models of the XTHs, compared with the recently determined structure of a Populus tremula × tremuloides XTH, reveal similarities with the active sites of family GH11 (1,4)-β-d-xylan endohydrolases. From a biological viewpoint, the classification, molecular modeling and a new 3D structure of the P. tremula × tremuloides XTH establish structural and evolutionary connections between XTHs, (1,3;1,4)-β-d-glucan endohydrolases and xylan endohydrolases. These findings raise the possibility that XTHs from higher plants could be active not only on cell wall xyloglucans, but also on (1,3;1,4)-β-d-glucans and arabinoxylans, which are major components of walls in grasses. A role for XTHs in (1,3;1,4)-β-d-glucan and arabinoxylan modification would be consistent with the apparent overrepresentation of XTH sequences in cereal expressed sequence tag databases.
Progressive changes in the composition of cell walls and in the fine structure of constituent polysaccharides are of central importance in plant development. Different cell types within a plant can be distinguished from each other by the chemistry and organization of their walls, and walls differ in a way that is related to developmental stage or to their exposure to different environmental conditions (Fincher 1993; Pennel 1998; Cosgrove 1999; Carpita et al. 2001). Wall composition and fine structure can also change in response to microbial attack, when rapid deposition of the polysaccharide callose and layers of lignin, together with the formation of a cross-linked protein network, create a physical barrier that is believed to impede the progress of invading fungal and bacterial pathogens (Mengden et al. 1996; Knogge 2002). Thus, the wall is a dynamic structure in which newly deposited polysaccharides can be modified or restructured during normal growth and development, or in response to abiotic and biotic stresses.
Primary walls of dicotyledonous plants consist of a cellulosic network embedded in a matrix of complex polysaccharides of which xyloglucans, glucuronoarabinoxylans, and pectic polysaccharides are most abundant (Carpita and Gibeaut 1993). Extensive intermolecular hydrogen bonding between matrix-phase polysaccharides and cellulose is thought to be an important determinant of overall wall integrity (Fry 1989; Hayashi 1989; Carpita et al. 2001). Walls of the monocotyledons are constructed in essentially the same way, although glucuronoarabinoxylans and (1,3;1,4)-β-d-glucans predominate in the matrix phase, particularly in the grasses; levels of pectic polysaccharides and xyloglucans are relatively low in these species (Carpita and Gibeaut 1993). For example, xyloglucan levels in walls from barley (Hordeum vulgare) aleurone, starchy endosperm, and young leaves are extremely low or absent (Sakurai and Masuda 1978; Fincher 1993). The highest xyloglucan content so far detected in barley cell walls is 8% to 10% in coleoptiles (D.M. Gibeaut, M. Pauly, and G.B. Fincher, unpubl.). Xyloglucan levels in suspension-cultured maize cells are ∼4% (Rose et al. 2002), and low levels are found in walls from the starchy endosperm of rice (Shibuya and Misaki 1978; Shibuya et al. 1983).
Xyloglucans are subject to modification following their initial deposition into the wall. In particular, their molecular mass distribution can be altered to simultaneously generate larger and smaller polysaccharide chains (Smith and Fry 1991; Nishitani 1997). This “disproportionation” is catalyzed by xyloglucan transglycosylases (EC 2.4.1.207) that are abundant in the apoplastic space during wall development. It is now known that xyloglucan-modifying enzymes include both xyloglucan endotransglycosylases (XETs) and xyloglucan endohydrolases (XEHs), which are collectively referred to as xyloglucan transglycosylases/hydrolases (XTHs) (Farkas et al. 1992; Fanutti et al. 1993; Rose et al. 2002).
Our interest in barley XTHs was originally stimulated by observations that XTH sequences were surprisingly abundant in barley expressed sequence tag (EST) databases. These observations were consistent with earlier reports that XET activity extractable from vegetative tissues of grasses was higher than that extractable from dicotyledons (Fry et al. 1992; Wu et al. 1994). Given the low levels of xyloglucans in walls of most barley tissues, it seemed possible that the enzymes might be active on the more abundant matrix-phase polysaccharides, namely, the arabinoxylans and/or the (1,3;1,4)-β-d-glucans. In support of this possibility, it was recently reported that the molecular mass of xylans in walls of suspension-cultured maize cells increased dramatically in the first few hours after deposition in the walls (Kerr and Fry 2003); this was suggestive of (1,4)-β-d-xylan transglycosylation activity.
Here, homology modeling has been used to develop potential three-dimensional (3D) structures for XTHs. The models have subsequently been compared with known 3D structures of enzymes that hydrolyze arabinoxylans and (1,3;1,4)-β-d-glucans, to assess the possibility that XTHs from grasses could bind to and disproportionate these abundant wall constituents. The plant XTHs are all classified in the family GH16 group of glycoside hydrolases, although a small number of microbial XTHs are classified within family GH12 (Coutinho and Henrissat 1999) (http://afmb.cnrsmrs.fr/CAZY/). Initial analyses of the family GH16 enzymes have revealed an evolutionary and structural link between higher plant XTHs and microbial (1,3;1,4)-β-d-glucan endohydrolases, while more detailed comparisons of active site regions have uncovered similarities between the XTHs and family GH11 (1,4)-β-d-xylan endohydrolases. Our results are supported by the recently published 3D structure of the Populus tremula × tremuloides XTH (Protein Data Bank [PDB] entries 1UN1 and 1UMZ; Johansson et al. 2004).
EST database analysis reveals an abundance of XTH sequences
Analysis of cell wall–modifying enzymes and proteins from 345,000 barley ESTs from the National Center for Biotechnology Information (NCBI) Web site revealed an abundance of XTH sequences in the databases. Although xyloglucans account for <5% to 10% of walls in most barley tissues, the XTH ESTs represent >20% of all ESTs encoding enzymes and proteins that have been implicated in the synthesis, restructuring, and/or degradation of major wall polysaccharides in barley (Fig. 1A). Within the XTH ESTs, five genes account for most of the entries (Fig. 1B). These highly represented genes are designated HvXTH1 to HvXTH5 (Fig. 1B). Further analyses of the XTH sequences identified in Figure 1 showed that the barley genome carries at least 22 independent XTH genes (data not shown). The detection of additional XTH ESTs that did not overlap with regions used to define the 22 different genes would suggest that further XTH genes are likely to be present on the barley genome. Examination of the draft rice genome sequence (Goff 2002) indicated that the rice genome carries at least 29 XTH genes (data not shown). Cellulose synthases (CesAs) comprise ∼25% of the barley ESTs, while expansins, which are involved in wall elongation (Cosgrove 1999), comprise ∼15% of wall-modifying proteins in the barley EST databases (Fig. 1A).
The family GH16 glycosyl hydrolases can be divided into five subgroups
The Lipase Engineering Database (LED) system (Fischer and Pleiss 2003) was used to collect ∼260 proteins from the family GH16 group of glycoside hydrolases, as in the CAZy Database (Coutinho and Henrissat 1999) (server at http://afmb.cnrs_mrs.fr/∼cazy/CAZY/index.html). Based on multiple sequence alignment, phylogenetic analysis, and substrate specificity, the proteins were assigned to five subgroups, each of which has distinct substrate specificity. The subgroups included XTHs (110 proteins), (1,3;1,4)-β-d-glucanases (49 proteins), (1,3)-β-galactanases (10 proteins), (1,4)-β-galactanases/κ-carrageenases (5 proteins), and nonspecific (1,3/1,3;1,4)-β-d-glucanases (74 proteins). The sequences of two SKN1 proteins, two KRE6 proteins, and four putative β-d-glucan synthesis-associated proteins were assigned to the nonspecific (1,3/1,3;1,4)-β-d-glucanase subgroup, as proposed previously (Barbeyron et al. 1998). Several proteins classified in the CAZy Database as family 16 glycoside hydrolases (Coutinho and Henrissat 1999), including four (1,3)-β-d-glucan recognition proteins, four Gram-negative bacteria binding proteins, a p50 protein, an allergen, an external agarase, and eight (1,3)-β-d-glucan-binding proteins, could not be assigned to any of the subgroups.
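As a simple bookkeeping aid, the subgroup sizes quoted above can be tallied in a few lines of Python. The counts are taken from the text; the dictionary names and the `tally` and `fraction` helpers are illustrative assumptions and are not part of the LED or GHDB software.

```python
# Subgroup sizes as quoted in the text; names and layout are illustrative only.
ASSIGNED = {
    "XTH": 110,
    "(1,3;1,4)-beta-D-glucanase": 49,
    "(1,3)-beta-galactanase": 10,
    "(1,4)-beta-galactanase/kappa-carrageenase": 5,
    "nonspecific (1,3/1,3;1,4)-beta-D-glucanase": 74,
}

# Proteins that could not be placed in any of the five subgroups.
UNASSIGNED = {
    "(1,3)-beta-D-glucan recognition protein": 4,
    "Gram-negative bacteria binding protein": 4,
    "p50 protein": 1,
    "allergen": 1,
    "external agarase": 1,
    "(1,3)-beta-D-glucan-binding protein": 8,
}


def tally(counts):
    """Total number of proteins in a {label: count} mapping."""
    return sum(counts.values())


def fraction(name, counts):
    """Share of one label within a {label: count} mapping."""
    return counts[name] / tally(counts)
```

With these figures, `tally(ASSIGNED)` gives 248 and `tally(UNASSIGNED)` gives 19, which is broadly in line with the approximate total of ∼260 proteins quoted for the family.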
An unrooted phylogenetic tree showing a random selection of 57 enzymes of the 260 proteins from the five subgroups is shown in Figure 2. Inclusion of all 260 proteins resulted in a cluttered tree in which individual proteins were difficult to discern (Harvey et al. 2000). Enzymes for which 3D structures are available are indicated, together with enzymes used for subsequent modeling experiments (Fig. 2). It appears that the large nonspecific (1,3/1,3;1,4)-β-d-glucanase subgroup is centrally placed in the tree, and that the two smaller subgroups of β-galactanases and the two subgroups of (1,3;1,4)-β-d-glucanases and XTHs lie at the edges of the tree (Fig. 2). Whether this means that the non-specific (1,3/1,3;1,4)-β-d-glucanase subgroup has acted as a common ancestor for the other four subgroups in family GH16 remains to be demonstrated.
Multiple sequence alignments further revealed that the catalytic nucleophile and the catalytic acid/base residues of the β-galactanases and nonspecific (1,3/1,3;1,4)-β-d-glucanase subgroups were separated by four amino acid residues (Fig. 3A), while those in the XTH and (1,3;1,4)-β-d-glucanase subgroups were separated by three amino acid residues (Fig. 3B). In the expectation that the different spacing of catalytic amino acid residues was likely to have an effect on substrate specificity and probably reflected an important evolutionary change in active site geometry (Michel et al. 2001), we have subdivided the family GH16 enzymes into subfamily GH16a, comprising the β-galactanase and nonspecific (1,3/1,3;1,4)-β-d-glucanase subgroups, where the catalytic residues are separated by four amino acid residues, and into subfamily GH16b, comprising the XTHs and the (1,3;1,4)-β-d-glucanases, where the catalytic residues are separated by three amino acid residues (Fig. 3B).
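The spacing rule used to define the two subfamilies can be written down as a small Python sketch. The function names are hypothetical and the rule encodes only the criterion stated above (four intervening residues for GH16a, three for GH16b); it is not the procedure used to build the database.

```python
def intervening_residues(pos_a, pos_b):
    """Number of residues lying strictly between two sequence positions."""
    return abs(pos_b - pos_a) - 1


def subfamily_from_spacing(pos_a, pos_b):
    """Assign a GH16 subfamily from the positions of the two catalytic residues.

    Assumption: the two positions are the catalytic nucleophile and the
    catalytic acid/base, in either order. Returns None when the spacing
    matches neither subfamily.
    """
    n = intervening_residues(pos_a, pos_b)
    if n == 4:
        return "GH16a"
    if n == 3:
        return "GH16b"
    return None
```

For example, catalytic residues at positions 85 and 89 have three residues between them, so `subfamily_from_spacing(85, 89)` returns `"GH16b"`, whereas a hypothetical pair at positions 163 and 168 would be assigned to `"GH16a"`.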
The family GH16 database
The family GH16 glycoside hydrolase database (GHDB) is available at http://www.ghdb.uni-stuttgart.de. In addition to the alphabetic listing of family GH16 glycoside hydrolases, the database provides experimentally determined 3D protein structures, phylogenetic trees, and multiple sequence alignments of the two subfamilies GH16a and GH16b and of each homologous subgroup therein. Functionally relevant residues such as the catalytic amino acid residues, aromatic substrate-binding residues, polar substrate-binding residues, Ca2+-binding residues, cysteine residues involved in disulfide bridge formation, signal peptide sequences, potential N-glycosylation sites, and residues involved in substrate processing are annotated according to the structural analyses described below, literature citations, and additional data extracted from GenBank and PDB (Berman et al. 2000). An automatic BLAST search is incorporated into the database for comprehensive analysis of the stored data.
Homology modeling of 3D structures from family GH16a
To compare the active site structures of family GH16 glycoside hydrolases, we performed homology modeling of selected members of the different subgroups, including XTHs and the nonspecific (1,3/1,3;1,4)-β-d-glucanases, for which no experimentally determined 3D structures were available. The 3D structure of a family GH16 (1,4)-β-galactanase/κ-carrageenase, from Pseudoalteromonas carragenovora (PDB entry 1DYP; Michel et al. 2001) was used to predict the structure of the active site region of a second κ-carrageenase (272 out of 513 amino acid residues) from Zobellia galacinovorans (Barbeyron et al. 1998). The sequence identity between the target and template sequences was 39%, and the models suggested that secondary structural elements of the β-jelly roll fold of the enzymes were conserved, while differences were detected in loop regions (data not shown).
There are no crystal structures available for the nonspecific (1,3/1,3;1,4)-β-d-glucanase group of family GH16 enzymes, but the solved structure of the P. carragenovora κ-carrageenase (Michel et al. 2001) was used successfully as a template for molecular modeling of the nonspecific (1,3/1,3;1,4)-β-d-glucanases from Bacillus circulans (Yahata et al. 1990) and Strongylocentrotus purpuratus (Bachman and McClay 1996). The sequences of the two nonspecific (1,3/1,3;1,4)-β-d-glucanases contained 682 and 499 amino acid residues, respectively, and were about twice as long as the sequence of the available κ-carrageenase template structure. Differences in sequence lengths and domain compositions also made it difficult to align the nonspecific (1,3/1,3;1,4)- β-d-glucanases with each other and with the κ-carrageenase template. However, the PRODOM program (Corpet et al. 1998) used in conjunction with multiple sequence alignment eventually allowed 268 out of 682 amino acid residues of the B. circulans nonspecific (1,3/1,3;1,4)-β-d-glucanase and 276 out of 499 amino acid residues of the S. purpuratus nonspecific (1,3/1,3;1,4)-β-d-glucanase to be aligned with the κ-carrageenase template sequence for homology modeling of the target protein structures (Fig. 4A,B). Both models again showed the typical family GH16 β-jelly roll fold, and the 3D dispositions of the catalytically active region were conserved in both molecular models.
The substrate-binding clefts of both enzymes stretch across the surface of the protein, and in the case of the S. purpuratus nonspecific (1,3/1,3;1,4)-β-d-glucanase model, the cleft appears to be closed at the top (Fig. 4, cf. B and A). Nevertheless, the dispositions of the catalytic amino acid residues and of the aromatic residues that are believed to be involved in substrate binding are similar. In addition, the active site aromatic residue that distinguishes the various subgroups (Trp118 in the κ-carrageenase from P. carragenovora, a Phe residue in the (1,3;1,4)-β-d-glucanases, and a Tyr residue in the XTHs) (Fig. 3A) is located in a similar position in all enzymes (Fig. 4A,B). The conservation of these residues in Cartesian space around the catalytic site (Fig. 4) is consistent with their conservation in the amino acid sequence alignments of the enzymes (Fig. 3A).
The protein sequence identities, root mean square (RMS) deviations, overall G-factors, and Ramachandran plot statistics for template and target sequences of selected members from family GH16a are shown in Table 1. Despite the relatively low sequence identity values, these values indicate that the models constructed for the nonspecific (1,3/1,3;1,4)-β-d-glucanases from the κ-carrageenase template are acceptable.
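For readers who wish to reproduce identity values of this kind, the following Python sketch computes percent identity from a pairwise alignment. It assumes one common convention, in which identical residues are counted over columns where neither sequence has a gap; other conventions exist, and the text does not state which one was used for Table 1. The function name and the toy sequences are illustrative only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

As a toy example, the aligned rows `"ACD-EF"` and `"ACE-EF"` share four identical residues over five compared columns, giving 80.0%.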
Homology modeling of 3D structures from family GH16b
In subfamily GH16b, the 3D structures of two Bacillus (1,3;1,4)-β-d-glucanases have been solved by X-ray crystallography (PDB entries 1GBG [Hahn et al. 1995] and 1BYH [Keitel et al. 1993]). Based on sequence identities of ∼60% between template and target sequences, it was possible to use the solved 3D structures to model the structure of a (1,3;1,4)-β-d-glucanase from Bacillus brevis (Louw et al. 1993) and 219 of 330 amino acid residues of a (1,3;1,4)-β-d-glucanase from Clostridium thermocellum (Schimming et al. 1992). The only differences between the two solved structures and the two models were in loop regions of the enzymes; all secondary structural elements, residues required for substrate binding, and the 3D dispositions of catalytic residues were conserved (data not shown).
Amino acid sequence identities of ∼30% allowed the 3D structures of the two Bacillus (1,3;1,4)-β-d-glucanases to be used as templates to build reliable models of two XTHs, namely, the XTHs from Vigna angularis (175 of 292 amino acid residues) (Okazawa et al. 1993), and Vitis labrusca (197 of 291 amino acid residues) (Ishimaru and Kobayashi 2002). Because the template sequences of the Bacillus (1,3;1,4)-β-d-glucanases lacked ∼100 amino acid residues found at the C termini of the XTH target sequences, these regions of the XTH proteins could not be modeled. However, the catalytic amino acid residues of the two modeled XTHs were conserved and similar in spatial disposition to catalytic residues of the (1,3;1,4)-β-d-glucanases, as were substrate-binding aromatic residues and the distinguishing aromatic residues Phe92 from the (1,3;1,4)-β-d-glucanases and Tyr75 from the XTHs (Fig. 4C–E). Again, the conservation of these residues in Cartesian space around the catalytic site is consistent with their conservation in the amino acid sequences of the enzymes (Fig. 3B), and reliability checks showed acceptable values for the models (Table 1). In contrast, models of the XTHs calculated by using the structure of the Pseudoalteromonas carragenovora κ-carrageenase as a template showed distorted active site structures (data not shown), which indicated that this template was not suitable for modeling XTHs.
The substrate-binding clefts in the V. angularis and V. labrusca XTH models appear to be somewhat wider than those in the (1,3;1,4)-β-d-glucanase templates (Fig. 4C–F). Multiple sequence alignments of template and target sequences, together with alignments of all sequences belonging to the XTH subgroup, showed a set of conserved amino acid residues that are located in the substrate-binding cleft of the XTHs. The frequencies of these residues are shown in Table 2. Amino acid residues corresponding to Tyr75, His83, Phe106, Tyr170, and Trp174 in the XTH of V. angularis are conserved in the substrate-binding cleft of the two modeled XTH structures (Fig. 4D,E; Table 2). The highly conserved His83 and Asp87 residues of the XTHs correspond to Trp103 and Asp107 of the Bacillus macerans (1,3;1,4)-β-d-glucanase, respectively. These residues are likely to stabilize the catalytic nucleophile through hydrogen bonds (Keitel et al. 1993) and Asp87 may take part in proton trafficking (Michel et al. 2001). Similarly, Tyr75 of the V. angularis XTH corresponds to Phe92 of the B. macerans (1,3;1,4)-β-d-glucanase (Table 2), and is 5 to 7 Å from the Oε1 and Oε2 of the catalytic amino acid residues Glu85 and Glu89, while Tyr170 of the V. angularis XTH and corresponding amino acid residues in other XTHs are in the same position as Met180 of the B. macerans (1,3;1,4)-β-d-glucanase (Table 2). Thus, Met and Tyr residues are conserved at this position in the (1,3;1,4)-β-d-glucanase and XTH subgroups of family GH16b enzymes, respectively (Fig. 3; Table 2). Amino acid residues Asn121 and Glu131 of the B. macerans (1,3;1,4)-β-d-glucanase are also involved in substrate binding (Hahn et al. 1995) and are conserved in the XTHs; they correspond to Asn104 and Glu114 of the V. angularis XTH. The two aromatic amino acid residues in positions corresponding to Tyr24 and Tyr94 of the B. macerans (1,3;1,4)-β-d-glucanase could not be detected in the sequences and the modeled 3D structures of the XTHs (Figs. 3, 4C–E).
Because of a low sequence identity between the template and target sequences of Bacillus licheniformis and V. angularis, respectively (Table 1), the C-terminal portion of 90 to 100 amino acid residues of the V. angularis XTH could not be modeled, including the conserved Trp179 (Fig. 3; Table 2). In contrast, the modeled structure of the V. labrusca XTH reached further toward the C terminus, including Trp191 (Figs. 3B, 4D,E).
Another difference between the XTHs and the (1,3;1,4)-β-d-glucanases of the family GH16b subgroup is the insertion in the XTHs of three amino acid residues Pro-TyrXxx, where Xxx is a nonconserved amino acid residue, at a position corresponding to Pro98Tyr99Ile100 in the V. angularis XTH structure (Table 2; Fig. 3). The insertion is located in the modeled structure of V. angularis at the beginning of the central β-sheet, close to the Glu89 at substrate-binding subsite 2 or −2, so that the Tyr residue points toward the catalytic amino acid residues and the bound substrate (Fig. 4D). In the molecular model of V. labrusca XTH the corresponding Tyr residue is not exposed at the surface but covered by the C-terminal region (Fig. 4D); therefore, it would not be expected to contribute to substrate binding. Although the two homologous XTHs are expected to have similar 3D structures, the orientation of the highly conserved Tyr residue of the Pro-TyrXxx motif and the highly conserved Trp residue corresponding to Trp179 of the V. angularis model cannot be predicted.
Comparison of XTH models with the experimentally determined 3D structure from P. tremula × tremuloides XTH
Following submission of this manuscript an experimentally determined 3D structure of P. tremula × tremuloides XTH was released (PDB entries 1UN1 and 1UMZ; Johansson et al. 2004). This allowed a comprehensive evaluation of our XTH models. The amino acid sequence identities between the P. tremula × tremuloides XTH and the V. angularis and V. labrusca XTHs are 83% and 38%, respectively. The calculated RMS deviation values between Cα atoms of the 3D structure of P. tremula × tremuloides XTH and the two XTH molecular models from V. angularis and V. labrusca are 5.8 Å (170 of 175 Cα atoms) and 7.4 Å (177 of 200 Cα atoms), respectively. The high RMS deviation values resulted mainly from variations in the outer loop regions and in the N- and C-terminal regions, while the overall β-sandwich fold was very well conserved. It is worth noting that the central β-sheet of P. tremula × tremuloides XTH deviated from the two models by only 1.4 Å (39 of 175 Cα atoms) and 2.5 Å (39 of 200 Cα atoms). In addition, most of the side chains forming the substrate binding site of the models have a similar orientation (data not shown). The experimental structure thus makes it possible to decide between the two suggested orientations of the highly conserved ProTyrXxx motif in the two XTH models. According to the structure of the P. tremula × tremuloides XTH, the Tyr residue is embedded between the two major β-sheets and points toward the N-glycosylated Asn93 residue. Therefore, an interaction between this Tyr residue and the substrate is unlikely, but it might be essential for the conformational stability of this N-glycosylation site. In contrast, the highly conserved Trp residue (Trp179 in P. tremula × tremuloides XTH) occupies a structural position similar to that of Trp191 in the model of the V. labrusca XTH, and this residue forms a part of the substrate binding site.
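The RMS deviation quoted for paired Cα atoms can be illustrated with a minimal Python sketch. It assumes that the two coordinate sets have already been superposed and that atoms are listed in matching order; the superposition step itself, and the software used for the values reported above, are not shown or implied. The function name and coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMS deviation (same units as the input) between paired 3D coordinates.

    Assumes both sets are already superposed and listed in matching order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared Euclidean distance between one pair of equivalent atoms
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```

Restricting the paired lists to a subset of atoms, such as those of the central β-sheet, gives a local value of the kind compared with the global value above.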
XTH structures share similarities with (1,4)-β-d-xylan endohydrolase 3D structures
When the two models of the family GH16 V. angularis and V. labrusca XTHs were compared with the 3D structure of the family GH11 Aspergillus niger (1,4)-β-d-xylan endohydrolase (Krengel and Dijkstra 1996), significant differences in the shapes of the substrate-binding clefts were evident (Fig. 4D,F). The XTH cleft seems to be relatively broad, which might be related to the highly substituted (1,4)-β-d-glucan backbone of the XTH substrate compared with the presence of less bulky substituents on the (1,4)-β-d-xylan substrate of the (1,4)-β-d-xylanases; the top of the (1,4)-β-d-xylanase cleft appears to be closed. As expected for glycosyl endohydrolases, hydrophobic residues are located along the substrate-binding clefts of both types of enzyme (Fig. 4D,F). Nevertheless, both families GH11 and GH16 exhibit β-jelly roll folds (Coutinho and Henrissat 1999). A comparison of the catalytic and the substrate binding residues between the family GH16 B. macerans (1,3;1,4)-β-d-glucanase, the family GH16 P. tremula × tremuloides XTH, and the family GH11 A. niger β-d-xylan endohydrolase demonstrates a close structural similarity, with the catalytic acid/base and catalytic nucleophile residues having a similar 3D orientation (Fig. 5). The β-strands in this region carry aromatic residues, which have similar 3D dispositions in relation to the catalytic residues. The Asp87 residue (or a corresponding Asp107), which is conserved in family GH16 enzymes and is located between the two catalytic residues in the (1,3;1,4)-β-d-glucanases and XTHs (Fig. 5A,B), is replaced by Tyr81 in the β-d-xylan endohydrolase (Fig. 5C).
For a more detailed comparison, the catalytic residues and the Asp/Tyr residues that are located between these in the XTH structure and the β-d-xylan endohydrolase structure were superposed. This shows that an “aromatic triad” of three aromatic amino acid residues that are important for substrate binding in the A. niger β-d-xylan endohydrolase, namely, Tyr10, Tyr164, and Trp172 (Tahir et al. 2002), has the same relative position in 3D space as a triad of three aromatic amino acid residues, namely, Trp174, Tyr170, and Trp179, in the P. tremula × tremuloides XTH (Fig. 5, cf. B and C). Aromatic amino acid residues corresponding to Tyr6 and Tyr89 of the A. niger β-d-xylan endohydrolase, also identified as important for catalytic activity (Tahir et al. 2002), could not be identified unequivocally through the comparison of the 3D structures. Furthermore, the aromatic platform of Tyr75 in the P. tremula × tremuloides XTH overlaps with that of Tyr70 in the A. niger β-d-xylan endohydrolase (Fig. 5B,C; Table 2). All other subgroups of the family GH16 glycoside hydrolases, namely, the (1,3;1,4)-β-d-glucanases, (1,3)-β-galactanases, (1,4)-β-galactanases/κ-carrageenases, and “nonspecific” (1,3/1,3;1,4)-β-d-glucanases, do not show this “aromatic triad” pattern in the active site (Figs. 3, 4, 5A).
Over 20% of entries that encode cell wall–modifying enzymes or proteins in the large barley EST databases correspond to mRNAs for the XTH group of higher plant enzymes (Fig. 1). Given that barley cell walls usually contain very low amounts of the XTH substrate xyloglucan, which is of relatively low abundance in the Poaceae more generally, this high level of XTH ESTs is surprising. One of several possible explanations for the higher than expected number of XTH entries is that the substrate specificity of the enzymes is broader than originally believed and that they catalyze the hydrolysis or transglycosylation of other polysaccharides that are more abundant than are xyloglucans in barley cell walls. The most abundant noncellulosic polysaccharides of barley walls are arabinoxylans and (1,3;1,4)-β-d-glucans (Fincher 1993; Carpita et al. 2001). We therefore modeled XTH enzymes against the known 3D structures of related family GH16 (1,3;1,4)-β-d-glucan hydrolases and compared the active site regions with those of family GH11 (1,4)-β-d-xylan endohydrolases, in an attempt to identify any structural and hence evolutionary linkages between enzymes that bind xyloglucans, arabinoxylans, and (1,3;1,4)-β-d-glucans.
The higher plant XTHs have been classified by Coutinho and Henrissat (1999) into the family GH16 group of glycosyl hydrolases. This family can be divided into five subgroups on the basis of differences in their substrate specificities (Fig. 2). First, a subgroup of (1,3)-β-galactan endohydrolases (EC 3.2.1.81) can hydrolyze (1,3)-β-galactosyl linkages in complex polysaccharides that include agarose (Michel et al. 2001). The second subgroup of enzymes (EC 3.2.1.83) is specific for (1,4)-β-galactosyl linkages in other complex polysaccharides that contain sulphated and anhydrogalactosyl residues, including κ-carrageenan and keratan sulphate (Kloareg and Quatrano 1988). The third subgroup includes the “nonspecific” (1,3/1,3;1,4)-β-d-glucan endohydrolases (EC 3.2.1.6). These are nonspecific insofar as they can hydrolyze both (1,3)- and (1,4)-linkages in β-d-glucans, provided there is an adjacent (1,3)-β-d-glucosyl residue on the nonreducing terminal side of the linkage hydrolyzed (Anderson and Stone 1975; Høj and Fincher 1995). These enzymes can therefore hydrolyze both (1,3)-β-d-glucans and (1,3;1,4)-β-d-glucans. The fourth subgroup comprises (1,3;1,4)-β-d-glucan endohydrolases (EC 3.2.1.73). These are absolutely specific for the hydrolysis of a (1,4)-β-d-glucosyl linkage, but only if there is an adjacent (1,3)-β-d-glucosyl residue toward the nonreducing end of the substrate. These enzymes can therefore hydrolyze only (1,3;1,4)-β-d-glucans (Parrish et al. 1960; Høj and Fincher 1995; Hrmova and Fincher 2001). The fifth subgroup contains xyloglucan-modifying enzymes that are collectively referred to as xyloglucan transglycosylases/hydrolases (XTHs; EC 2.4.1.207) and include both XETs and XEHs (Farkas et al. 1992; Fanutti et al. 1993; Rose et al. 2002). These enzymes hydrolyze (1,4)-β-d-glucosyl linkages specifically in xyloglucans, but those with XET activity can also catalyze transglucosylation reactions, in which the nonreducing terminal product of the hydrolysis reaction can subsequently be transferred onto another xyloglucan molecule (Rose et al. 2002).
Although members of the XTH subgroup are found exclusively in higher plants while members of the other four subgroups are of microbial origin (Coutinho and Henrissat 1999), the classification of XTHs with microbial (1,3;1,4)-β-d-glucan endohydrolases in family GH16 establishes an evolutionary link between the two enzyme groups. It is noteworthy that higher plants of the Poaceae also synthesize (1,3;1,4)-β-d-glucan endohydrolases, but these have a completely different protein fold, are classified in family GH17, and clearly converged with the microbial (1,3;1,4)-β-d-glucan endohydrolases along a distinct evolutionary route (Høj and Fincher 1995; Coutinho and Henrissat 1999). The plant XTHs and microbial (1,3;1,4)-β-d-glucan endohydrolases of subfamily GH16b show a number of features that distinguish them from the other three subgroups of the family. In particular, their catalytic amino acid residues are separated by three amino acid residues, in contrast to members of the other three subgroups that comprise subfamily GH16a, in which four amino acid residues separate the catalytic amino acid residues (Fig. 3). Moreover, the plant XTHs and microbial (1,3;1,4)-β-d-glucan endohydrolases of subfamily GH16b have a distinguishing Phe or Tyr residue in the vicinity of the catalytic amino acid residues, in a position that is occupied by a Trp in subfamily GH16a enzymes (Fig. 4; Michel et al. 2001). One key difference between the XTHs and the (1,3;1,4)-β-d-glucanases is the insertion of three amino acid residues in the XTHs, at a position corresponding to Pro118Tyr119Ile120 of the V. angularis XTH (Fig. 3). An additional difference is the substitution of Met, at a position corresponding to Met180 of the B. macerans (1,3;1,4)-β-d-glucanase (Fig. 3B), with an aromatic amino acid residue, mostly Tyr, at a position corresponding to Tyr170 of the V. angularis XTH (Figs. 3, 4).
Reliable models of the V. angularis and V. labrusca XTHs could be generated using solved 3D structures for (1,3;1,4)-β-d-glucanases from B. licheniformis (Keitel et al. 1993) and B. macerans (Fig. 4; Table 1; Hahn et al. 1995). Similarly, reliable models of the family GH16a nonspecific (1,3/1,3;1,4)-β-d-glucanases could be generated by using the known 3D structure of the P. carrageenovora κ-carrageenase (Michel et al. 2001) as a template (Fig. 4; Table 1), but reliable models could not be built from 3D structures across the subfamilies GH16a and GH16b (Table 1). It must be remembered that generation of the XTH models necessitated the removal of 90 to 100 amino acid residues from their C termini, and that this C-terminal region could play an important role in substrate specificity.
The recently published 3D structure of P. tremula × tremuloides XTH (Johansson et al. 2004) allowed a detailed evaluation of the modeled XTH enzymes from V. labrusca and V. angularis. The evaluation demonstrated the reliability of our molecular models and resolved the ambiguity about the role of the conserved Tyr residue in the Pro-Tyr-Xxx motif, which seems to play a structure-stabilizing role rather than a role in substrate binding. However, the concept of an aromatic triad (Tyr10, Tyr164, and Trp172 in β-d-xylan endohydrolase and Trp174, Tyr170, and Trp179 in the P. tremula × tremuloides and V. angularis XTHs) that is essential for substrate binding has been confirmed. A highly conserved Trp179 of P. tremula × tremuloides XTH that is located at the surface of the substrate binding site has been shown to be essential for substrate binding (Johansson et al. 2004). The presence of conserved aromatic amino acids and the general structural similarity of the substrate binding sites of XTHs and (1,3;1,4)-β-d-glucanases suggest similarities in their substrate specificity. However, xyloglucans are much bulkier than (1,3;1,4)-β-d-glucans, which is reflected in a slightly more open structure of the XTH substrate binding sites.
In contrast to the apparent similarities between the XTHs and the (1,3;1,4)-β-d-glucanases of family GH16, structural links between the XTHs and (1,4)-β-d-xylan endohydrolases were not so obvious. There are no (1,4)-β-d-xylanases in family GH16; these enzymes are variously placed in families GH8, GH10, and GH11 (Coutinho and Henrissat 1999). The protein folds detected for the glycosyl hydrolase families GH8, GH10, and GH11 that contain (1,4)-β-d-xylan endohydrolases are (α/α)6, (β/α)8 and β-jelly roll, respectively (Coutinho and Henrissat 1999). Because the family GH16 enzymes also adopt a β-jelly roll conformation and because the 3D structure of a family GH11 (1,4)-β-d-xylan endohydrolase from A. niger has been solved (Krengel and Dijkstra 1996), it was possible to compare this structure with the models of the XTHs from V. angularis and V. labrusca (Fig. 4D–F). In addition, selected active site amino acid residues from the experimentally determined structure of the P. tremula × tremuloides XTH and from the models prepared for the family GH16 XTHs from V. angularis and V. labrusca were shown to have a relative 3D disposition similar to that defined by the 3D structure of the family GH11 (1,4)-β-d-xylan endohydrolase from A. niger. Furthermore, in the P. tremula × tremuloides XTH, the structural disposition of the C terminus has been clarified. This finding also supports our prediction that a structural relationship exists between the XTH enzymes and the GH11 (1,4)-β-d-xylan endohydrolases. In the XTHs, the C-terminal region contains an exposed α-helix, which is missing in other GH16 subgroup enzymes. Interestingly, GH11 (1,4)-β-d-xylan endohydrolases also have an exposed α-helix that is positioned at the same end of the β-sheet, and this α-helix faces the opposite side to the active site region.
The α-helices in both groups of the enzymes carry clusters of aromatic amino acid residues and therefore could potentially be involved in binding to cell walls or other polysaccharide epitopes. These models and the recently experimentally determined 3D structure of the P. tremula × tremuloides XTH therefore established a structural and potentially functional linkage between the family GH16 XTHs and the family GH11 (1,4)-β-d-xylan endohydrolases.
In summary, structural links between family GH16 XTHs and (1,3;1,4)-β-d-glucanases, and between the XTHs and family GH11 (1,4)-β-d-xylan endohydrolases have been established. In the context of seeking a potential role for the highly abundant XTHs of the Poaceae (Fry et al. 1992; Wu et al. 1994) in the hydrolysis or modification of polysaccharides other than the assumed substrate of xyloglucan, the homology modeling experiments and the experimentally determined 3D structure of P. tremula × tremuloides XTH suggest that the substrate-binding clefts and the active sites of the XTHs are structurally similar to those of the (1,3;1,4)-β-d-glucanases and the (1,4)-β-d-xylan endohydrolases. Thus, one can postulate that the XTHs of the Poaceae might have evolved to bind these other cell wall components. Higher plants already have (1,3;1,4)-β-d-glucanases and (1,4)-β-d-xylan endohydrolases from families other than GH16 and GH11, so one might further suggest that the need for XTHs with altered specificity in the Poaceae could be related to a requirement for the transglycosylation and attendant disproportionation of cell wall arabinoxylans and (1,3;1,4)-β-d-glucans, rather than a role in hydrolysis of these wall polysaccharides. The first evidence in support of this role was recently provided by Kerr and Fry (2003), who showed an abrupt increase in the molecular mass of xylans in walls of suspension-cultured maize cells in the first few hours after deposition in the walls. This is consistent with (1,4)-β-d-xylan transglycosylation activity, which could be catalyzed by an XTH enzyme that has been modified for action on (1,4)-β-d-xylans, and is reminiscent of the well-characterized action of XTHs on xyloglucans (Smith and Fry 1991; Nishitani 1997). 
Although it is likely to be difficult and time-consuming to separately purify the five XTH isoenzymes that correspond to the five most abundantly expressed XTH genes from barley (HvXTH1 to HvXTH5), we have now cloned cDNAs for each of these barley genes and will attempt to express active enzymes in heterologous systems, and to investigate whether any of the individual, expressed HvXTH enzymes, isolated in highly purified form, are active on arabinoxylans or (1,3;1,4)-β-d-glucans.
Materials and methods
Database construction and evolutionary analyses
A total of ∼260 amino acid sequences of family GH16 glycoside hydrolases was collected through an automated BLAST search from the NCBI-GenBank database (Benson et al. 1999) by using parsing tools from the LED (Fischer and Pleiss 2003) (http://www.ghdb.uni-stuttgart.de). Amino acid residues with potential roles in catalysis and substrate binding were annotated based on the information obtained by multiple sequence alignments with ClustalW (Thompson et al. 1997). Phylogenetic trees were constructed with TREE-PUZZLE version 5.0 using maximum-likelihood and quartet-puzzling (Schmidt et al. 2002). The parallel version of TREE-PUZZLE was run on a Linux PC-cluster of Dual Athlon MP 1800+ with 256 processors and a Myrinet interconnect system. The phylogenetic trees were edited manually and were drawn with the TreeView program (Page 1996). The accession numbers at the GenBank/EMBL, SWISS-PROT, and PDB (Berman et al. 2000) databases of the 57 selected family GH16 glycoside hydrolases are as follows:
The subgroup of “nonspecific” (1,3/1,3;1,4)-β-d-glucan endohydrolases includes the entries from Eisenia foetida (O77072), B. circulans (Q45095, P23903, and BAC06195), Thermotoga maritima (B72428), Thermotoga neapolitana (Q60039), Cochliobolus carbonum (O14421), Lysobacter enzymogenes (AAN77505), Oerskovia xanthineolytica (O68641), S. purpuratus (Q26660), Streptomyces avermitilis MA-4680 (BAC69475), and the KRE6 protein from Candida albicans (P87023).
The subgroup of (1,4)-β-galactanases/κ-carrageenases includes the entries from Zobellia galactanivorans (O84907), Pirellula sp.1 (CAD72787.1), Pseudoalteromonas carrageenovora (1DYP), and Pirellula sp. (CAD73010.1).
The subgroup of (1,3)-β-galactanases includes the entries from uncultured bacterium AguB (AAP49346.1) and AguD (AAP49316.1), Aeromonas sp. (AAF03246), Alteromonas atlantica (Q59078), Pseudomonas sp. Nd137 (BAB79291), Z. galactanivorans (AAF21820), Microscilla sp. PRE1 (AAK62838), Streptomyces coelicolor (P07883), and S. coelicolor A3(2) (NP_627674).
The subgroup of (1,3;1,4)-β-d-glucanases includes entries from Clostridium acetobutylicum (D97245), C. thermocellum (P29716), Pseudomonas sp. (BAC24104), Orpinomyces sp. PC-2 (O14412), Streptococcus bovis (O07856), B. brevis (P37073), Brevibacillus brevis (A48378), Bacillus amyloliquefaciens (P07980), Bacillus subtilis (P04957), Bacillus licheniformis (1GBG), Hypocrea jecorina (AAF82804), Bacillus polymyxa (P45797), B. macerans (1BYH), and Rhizobium meliloti (P33693).
Finally, the XTH subgroup includes entries from Carica papaya (AAK51119), V. labrusca (BAB78506), Arabidopsis thaliana (AAM91637), Daucus carota (AAK30204), V. angularis (A49539), Actinidia deliciosa (AAC09388), H. vulgare (X91659), Nicotiana tabacum (BAA13163), Gossypium hirsutum (T09870), Cicer arietinum (CAA06217), Pisum sativum (BAA34946), Asparagus officinalis (AAF80591), Beta vulgaris (AAL04440), Fagus sylvatica (CAA10231), Ananas comosus (AAM28287.1), Lycopersicon esculentum (S49812), Festuca pratensis (CAC40809), and P. tremula × tremuloides (1UN1).
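As a consistency check on the accession list above, the entries can be tallied per subgroup. The Python sketch below simply transcribes the accession numbers as they are listed in this section; the dictionary keys and variable names are hypothetical labels chosen for illustration.

```python
# Accession numbers transcribed from the five subgroup lists in this section.
SUBGROUPS = {
    "nonspecific (1,3/1,3;1,4)-beta-D-glucan endohydrolases": [
        "O77072", "Q45095", "P23903", "BAC06195", "B72428", "Q60039",
        "O14421", "AAN77505", "O68641", "Q26660", "BAC69475", "P87023",
    ],
    "(1,4)-beta-galactanases/kappa-carrageenases": [
        "O84907", "CAD72787.1", "1DYP", "CAD73010.1",
    ],
    "(1,3)-beta-galactanases": [
        "AAP49346.1", "AAP49316.1", "AAF03246", "Q59078", "BAB79291",
        "AAF21820", "AAK62838", "P07883", "NP_627674",
    ],
    "(1,3;1,4)-beta-D-glucanases": [
        "D97245", "P29716", "BAC24104", "O14412", "O07856", "P37073",
        "A48378", "P07980", "P04957", "1GBG", "AAF82804", "P45797",
        "1BYH", "P33693",
    ],
    "XTHs": [
        "AAK51119", "BAB78506", "AAM91637", "AAK30204", "A49539",
        "AAC09388", "X91659", "BAA13163", "T09870", "CAA06217",
        "BAA34946", "AAF80591", "AAL04440", "CAA10231", "AAM28287.1",
        "S49812", "CAC40809", "1UN1",
    ],
}

# Number of entries per subgroup and overall.
counts = {name: len(accessions) for name, accessions in SUBGROUPS.items()}
total = sum(counts.values())

# Every accession should appear only once across all subgroups.
all_accessions = [acc for accs in SUBGROUPS.values() for acc in accs]
has_duplicates = len(set(all_accessions)) != len(all_accessions)

for name, count in counts.items():
    print(f"{name}: {count}")
print(f"total: {total}")   # 57, matching the number of selected entries
```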
Multiple sequence alignments
Multiple sequence alignments were performed with ClustalW 1.80 (Thompson et al. 1994, 1997; Higgins et al. 1996; Jeanmougin et al. 1998) using the LED database. Individual entries were checked manually by hydrophobic cluster analysis (Lemesle-Varloot et al. 1990) to ensure that integrity of hydrophobic clusters was undisturbed and that both the distribution of secondary structure elements and topology of the active sites remained conserved. The program Bestfit from the University of Wisconsin GCG software package (Devereux et al. 1984), with the implemented gap penalty function and the Smith and Waterman local algorithm (Smith and Waterman 1981), was used to calculate sequence identities and similarities between the template and target protein sequences.
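The identity and similarity values calculated between template and target sequences can be illustrated with a short Python sketch. This is not the Bestfit algorithm: the input is assumed to be an existing pairwise alignment, gap columns are simply skipped, and the residue groups used to call two residues "similar" are a hypothetical stand-in for the scoring matrix that Bestfit uses.

```python
# Hypothetical groups of chemically similar residues. Bestfit derives
# similarity from a scoring matrix instead, so this table is only a stand-in.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) over non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns containing a gap are skipped in this sketch
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment for illustration only.
print(identity_and_similarity("EIDIE-F", "ELDVEWY"))   # (50.0, 100.0)
```

Whether gap columns are counted in the denominator changes the reported percentages, so the convention should be stated whenever such values are compared between studies.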
EST database analysis
Publicly available EST sequences from barley were accessed through the NCBI site (http://www.ncbi.nlm.nih.gov). A total of ∼345,000 barley ESTs available in October 2003 was searched for sequences corresponding to enzymes and proteins involved in cell wall biosynthesis, modification, and/or degradation. These included cellulose synthases (CesA), cellulose synthase–like enzymes (Csl), glucan synthase–like proteins (Gsl), α- and β-expansins, XTHs, β-d-glucan endo- and exohydrolases, (1,4)-β-d-xylan endo- and exohydrolases, (1,4)-β-d-mannan endohydrolases, and α-l-arabinofuranosidases. Stress-related (1,3)-β-d-glucan endohydrolases were specifically excluded from the analysis. No attempts were made to discriminate between isoforms of the various proteins or between individual EST libraries, and total numbers of ESTs were simply used as a first approximation of the relative abundance of mRNA transcripts in the range of tissues from which the EST libraries had been prepared. To estimate the number of genes in the barley XTH gene family, XTH sequences were aligned for the identification of overlapping EST sequences that corresponded to fragments of the same gene, and for the identification of distinct genes. ESTs with homology to previously identified barley and rice XTH genes were aligned and clustered by using the ContigExpress program that is a part of the Vector NTI Suite 7.0 software package (InforMax Inc.).
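The idea of grouping overlapping ESTs into putative genes can be sketched in Python as follows. This is not the ContigExpress algorithm, which is not described here; as a simplified stand-in, two ESTs are placed in the same cluster when they share at least one exact subsequence of length k, and clusters are merged with a union-find structure. The function name, the value of k and the toy sequences are hypothetical, and reverse-complement matches and sequencing errors are ignored.

```python
def cluster_by_shared_kmers(ests, k=12):
    """Group EST names whose sequences share at least one exact k-mer.

    `ests` maps an EST name to its nucleotide sequence. Returns a sorted
    list of clusters, each a sorted list of EST names.
    """
    parent = {name: name for name in ests}

    def find(name):
        # Follow parent links to the cluster representative (path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    first_owner = {}  # k-mer -> first EST in which it was seen
    for name, seq in ests.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            owner = first_owner.setdefault(seq[i:i + k], name)
            if owner != name:
                parent[find(name)] = find(owner)  # merge the two clusters

    clusters = {}
    for name in ests:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy sequences for illustration only: est1 and est2 overlap by 12 bases.
toy_ests = {
    "est1": "ATGGCTAGCTAGGATCCGATTACA",
    "est2": "GATCCGATTACAGGCTTAACGT",
    "est3": "TTTTTTTTCCCCCCCCGGGGGGGG",
}
print(cluster_by_shared_kmers(toy_ests))   # [['est1', 'est2'], ['est3']]
```

Under these assumptions the number of clusters gives a rough estimate of the number of distinct genes, although closely related family members that share identical stretches would be merged into one cluster.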
Homology protein structure modeling
The 3D molecular models of protein sequences of selected representatives of family GH16 glycoside hydrolases were constructed with Modeller 6v2, which is based on homology modeling by satisfaction of spatial restraints (Sali and Blundell 1993). Default settings of the program Modeller 6v2 (Sali and Blundell 1993) and the structurally optimized aligned protein sequences were used. Homologous proteins exhibiting substrate specificities corresponding to nonspecific (1,3/1,3;1,4)-β-d-glucanases, (1,3;1,4)-β-d-glucanases, (1,3)- and (1,4)-β-galactanases, and XTHs were grouped. The N termini of the amino acid sequences were deduced from the literature data when available and/or in combination with multiple sequence alignments (Harvey et al. 2000) and the PRODOM program that analyses domain arrangements (Gouzy et al. 1996).
Evaluation of models
The stereochemical quality (Ramachandran plots) and overall G-factors of the final models, which are measures of normality of main-chain bond lengths and bond angles (Engh and Huber 1991), were calculated with PROCHECK (Ramachandran et al. 1963; Laskowski et al. 1993). Further, the program O (Jones et al. 1991) was applied for the calculation of RMS deviations between Cα positions of the template and target protein 3D structures. Newly derived 3D models and structures were superposed with Swiss-PDBViewer V3.7 (Guex and Peitsch 1997) and viewed with PyMOL (http://www.pymol.org).
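The RMS deviation between Cα positions can be illustrated with a minimal Python sketch. It assumes that the two structures have already been superposed and that the coordinate lists contain equivalent residues in the same order; the superposition and residue pairing performed by the program O are not reproduced. The function name and the toy coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMS deviation between paired (x, y, z) C-alpha positions.

    Both lists must hold the same number of points, in equivalent order,
    and are assumed to be superposed already. The result is in the same
    length units as the input coordinates.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates for illustration only: every atom is displaced by 1 unit.
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(ca_rmsd(template, model))   # 1.0
```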