Plant cell walls are composites of various carbohydrates, proteins and other compounds. Cell walls provide plants with strength and protection, and also represent the most abundant source of renewable biomass. Despite the importance of plant cell walls, comparatively little is known about the identities of genes and functions of proteins involved in their biosynthesis. The model plant Arabidopsis and the availability of its genome sequence have been invaluable for the identification and functional characterization of genes encoding enzymes involved in plant cell-wall biosynthesis. This review covers recent progress in the identification and characterization of genes encoding proteins involved in the biosynthesis of Arabidopsis cell-wall polysaccharides and arabinogalactan proteins. These studies have improved our understanding of both the mechanisms of cell-wall biosynthesis and the functions of various cell-wall polymers, and have highlighted areas where further research is needed.
The cell walls surrounding plant cells serve many important functions. The composition and architecture of cell walls contribute to the remarkable morphological and functional diversity of plant cell types. For example, cell walls strengthen plant cells. Most cells use this strength to resist positive pressure generated within the cell; by harnessing this turgor pressure, plants are able to stand upright. In contrast, the thick cell walls of water-conducting vessels and tracheids enable these cells to resist substantial negative pressure. Plant cell walls also provide the first line of defense against invading micro-organisms and abiotic stresses.
There are many types of cell walls; however, most may be grouped into one of two functional classes: primary walls or secondary walls. Primary walls are synthesized during cell expansion in growing cells, while secondary walls are deposited in certain cell types after cell expansion has ceased. Although cell walls are strong composites deposited outside the plasma membrane, they are sufficiently flexible to allow cell expansion, and their composition is modulated in response to the physiological needs of the plant.
Cell walls are composed of a variety of polymers, which typically include cellulose, hemicelluloses, pectin and structural proteins. In addition, some cell walls contain lignin, which is a polymer of monolignols derived from the phenylpropanoid pathway. Many recent reviews have described the structure and diversity of plant cell-wall polymers as well as their biosynthesis (see, for example, Somerville, 2006; Mohnen, 2008; Popper, 2008; Vogel, 2008; Scheller and Ulvskov, 2010). Cellulose is the main load-bearing structure in the cell wall, and typically constitutes approximately 30–40% of the wall mass. The term ‘hemicelluloses’ covers a diverse group of polysaccharides, which generally exhibit limited solubility and interact with cellulose microfibrils. The hemicelluloses (i.e. xyloglucans, xylans, mannans and mixed-linkage β-glucans) have 1,4-β-linked backbones of sugars in an equatorial linkage configuration; short sidechains are appended to the backbones of some of these polymers. Occasionally, other polymers are considered to be hemicelluloses, but we consider that a clear definition based on common structural features is more useful (Scheller and Ulvskov, 2010). Pectins are very complex acidic polysaccharides with backbones that are rich in galacturonic acid (see below); these polysaccharides are more easily solubilized than hemicelluloses.
About 10% of the genes in Arabidopsis have been estimated to be involved in the various aspects of cell-wall metabolism, including polymer biosynthesis, transport, deposition, remodeling, turnover, and regulation of these processes (McCann and Carpita, 2008). The problems of trying to purify membrane-bound enzymes that use complex substrates using conventional biochemical approaches have been well-documented. Consequently, this has made the identification of enzymes that catalyze cell-wall polysaccharide biosynthesis very difficult, and traditional biochemical methods have led to identification of only a few genes encoding biosynthetic enzymes, despite extensive efforts for many years. The Arabidopsis genome sequence (Arabidopsis Genome Initiative, 2000) has provided an invaluable tool for efficient identification and systematic functional analysis of genes encoding cell-wall polysaccharide biosynthetic enzymes. Arabidopsis is an excellent model plant for cell-wall studies as its cell walls resemble those found in many crop plants and trees. Despite the complexity of plant cell-wall structure and biosynthesis, forward and reverse genetics and other functional genomic strategies in Arabidopsis have led to the identification of many genes encoding proteins involved in cell-wall biogenesis. These studies have had a major impact on our understanding of the synthesis of plant cell-wall polysaccharides, as well as the function of these polymers within the cell wall. We have also gained a better understanding of the roles of plant cell walls in regulating processes such as plant growth and interactions with the environment. There are, however, some notable differences between the cell walls found in Arabidopsis and those found in other plants (Vogel, 2008). 
Because grass cell walls contain mixed-linkage β-glucans and xylan-bound hydroxycinnamate esters, which are not found in Arabidopsis, orthologous genes involved in these biosynthetic pathways may be absent from the Arabidopsis genome, necessitating the use of other model systems to understand these pathways. Many of the other differences between cell walls in grasses and plants such as Arabidopsis may be quantitative rather than qualitative, and hence orthologous genes may be present. In some cases, the absence of orthologous sequences has proven advantageous. For example, gain-of-function experiments performed in Arabidopsis implicated cellulose synthase-like F (CSLF) and cellulose synthase-like H (CSLH) sequences in mixed-linkage β-glucan biosynthesis (Burton et al., 2006; Doblin et al., 2009; Fincher, 2009).
This review describes the current state of knowledge with regard to the identities of genes encoding proteins involved in the biosynthesis of cell-wall polysaccharides and arabinogalactan proteins in Arabidopsis. Much progress in Arabidopsis was made possible by pioneering work involving seminal experiments that led to elucidation of the structures of plant cell-wall polysaccharides and characterization of the enzymes required to synthesize them; however, space constraints prevent detailed discussion of many of these experiments. As many other processes are vital for proper cell-wall biogenesis, including precursor synthesis and transport, transcriptional regulation, signal transduction, transport of cell wall polymers to the apoplast, and integration of these polymers into the cell wall, we refer you to other recent reviews covering these topics (Zhong and Ye, 2007; Hematy and Hofte, 2008; Reiter, 2008; Reyes and Orellana, 2008; Vogel, 2008). The focus of this review is to highlight ways in which the availability of the Arabidopsis genome sequence has enabled the identification and functional characterization of enzymes catalyzing various steps in the biosynthesis of cell-wall polysaccharides and arabinogalactan proteins, and the insights that these discoveries have provided into the function of particular cell-wall components in regulating both cell growth and wall structural integrity.
Cellulose is composed of unsubstituted 1,4-β-glucan chains, and is a major load-bearing and ubiquitous constituent of plant cell walls. The plasma membrane-localized machinery that performs cellulose synthesis is a large multi-protein complex, termed the cellulose synthase complex, which simultaneously synthesizes the 18–36 glucan chains that are found bonded together in a single microfibril. The complex resides within the plane of the plasma membrane, where it is able to access intracellular UDP-glucose and to extrude microfibrils outside the cell, directly into the cell wall (for recent reviews, see Somerville, 2006; Taylor, 2008).
The cellulose synthase complex has been visualized in freeze fracture studies as a six-lobed structure of approximately 30 nm in diameter, known as a rosette (Mueller and Brown, 1980). The first genes found to be involved in cellulose synthesis encoded the cellulose synthase (CESA) proteins. These were first identified as ESTs isolated from developing cotton fibers based on their sequence homology to bacterial genes that encode proteins involved in cellulose biosynthesis and the ability of the corresponding protein to bind to UDP-glucose (Pear et al., 1996). Subsequent analysis has shown that the Arabidopsis genome encodes a large family of proteins that show homology to the bacterial cellulose synthases (CESA superfamily; Richmond, 2000; Richmond and Somerville, 2000); these proteins have been implicated in the biosynthesis of a variety of plant cell-wall polysaccharides (see below). The first functional proof that CESA proteins were essential for cellulose synthesis came from studies of the temperature-sensitive Arabidopsis mutant radial swelling 1 (rsw1), which exhibited decreased cellulose content when grown at the restrictive temperature due to a defect in a member of the CESA gene family (AtCESA1). rsw1 mutants also show a loss of rosette structures from the plasma membrane at the restrictive temperature, demonstrating a clear connection between CESA proteins, cellulose synthesis and the requirement for rosettes in the plasma membrane, and providing functional evidence that the rosette structures visualized were indeed the protein complexes required for cellulose synthesis (Arioli et al., 1998). Forward genetic screens have identified a number of other cellulose-deficient mutants resulting from mutations in CESA genes, including ixr1, ixr2/prc1, irx1, irx3 and irx5 (Arioli et al., 1998; Taylor et al., 1999, 2000, 2003; Fagard et al., 2000; Scheible et al., 2001; Desprez et al., 2002) (see Table 1). 
Sequencing of the Arabidopsis genome revealed a family of ten CESA genes (Richmond, 2000). The irx1 (CESA8), irx3 (CESA7) and irx5 (CESA4) mutants specifically exhibit secondary wall cellulose defects (Taylor et al., 1999, 2000, 2003), while the remainder have a range of phenotypes consistent with primary wall cellulose defects. Comparative analyses of Arabidopsis CESA sequences with those from other higher plants revealed similar sets of CESA orthologs. For example, studies of the rice brittle culm mutants have identified three rice CESA genes that appear to be direct functional analogs of the Arabidopsis CESA4, CESA7 and CESA8 genes (Tanaka et al., 2003). All three secondary wall-associated CESA members in Arabidopsis are required for both complex formation and plasma membrane localization (Gardiner et al., 2003; Taylor et al., 2003). This indicates that cellulose synthase contains combinations of three unique positions each occupied by different CESA subunits, and, in the case of the secondary wall CESA proteins, no subunit can substitute for another. The situation is more complex for the remaining CESA genes. Both CESA1 and CESA3 encode subunits that permanently occupy two of the three positions, with the remaining position being filled by CESA2, CESA5, CESA6 or CESA9 (Desprez et al., 2007; Persson et al., 2007b). These four proteins exhibit partial redundancy and non-overlapping expression profiles, suggesting that they are subtly suited to slightly different functions in different plant tissues. Indeed, when the CESA6 promoter is used to drive the expression of CESA2, the encoded protein can occupy the CESA6 position within the cellulose synthase complex in CESA6-expressing tissues; however, this same construct does not fully rescue the prc1 (cesA6) mutant (Persson et al., 2007b). The function of CESA10, the remaining member of this family, remains unclear.
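The subunit arrangement described above can be written down as a small combinatorial model. The following is a minimal sketch, not a biochemical claim: it assumes three positions per complex, treats the subunits that can fill a position as fully interchangeable, and leaves out CESA10 because its role is unclear. The names `possible_complexes`, `SECONDARY_WALL` and `PRIMARY_WALL` are hypothetical and introduced only for illustration.

```python
from itertools import product

# One tuple of candidate subunits per position in the complex.
# The order of positions is arbitrary in this sketch.
SECONDARY_WALL = (("CESA4",), ("CESA7",), ("CESA8",))
PRIMARY_WALL = (("CESA1",), ("CESA3",), ("CESA2", "CESA5", "CESA6", "CESA9"))


def possible_complexes(positions, missing=frozenset()):
    """List every subunit combination that fills all positions,
    skipping any subunit named in `missing` (a knocked-out gene)."""
    available = [[s for s in options if s not in missing] for options in positions]
    # product() yields nothing if any position has no available subunit.
    return list(product(*available))


wild_type_primary = possible_complexes(PRIMARY_WALL)
irx3_secondary = possible_complexes(SECONDARY_WALL, missing={"CESA7"})
prc1_primary = possible_complexes(PRIMARY_WALL, missing={"CESA6"})
```

In this model, loss of CESA7 leaves no way to fill its position, which matches the requirement for all three secondary wall subunits. Loss of CESA6 still leaves three combinations, but the model overstates the rescue: the text describes only partial redundancy among CESA2, CESA5, CESA6 and CESA9.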
Table 1. Genes involved in cellulose biosynthesis
References are listed for mutant alleles with the exception of TED6, for which RNAi lines were generated. We have designated CESA10 as occupying position 1 due to its sequence similarity to CESA1, but this has not yet been demonstrated experimentally.
To date, the cellulose synthase complex has proven intractable to biochemical analysis. The precise interactions among the three subunits within the large rosette structure are unknown, but recent data show that CESA proteins are able to form oligomers (Atanassov et al., 2009; Timmers et al., 2009). Until very recently, no biochemical data had shown any other proteins to be associated with the complex; however, TED6, a plasma membrane-localized protein of unknown function, has been demonstrated to interact with the CESA7 subunit of the secondary wall complex (Endo et al., 2009). TED6 appears to have a general role in secondary cell-wall formation in vessels, but whether it has a specific function in cellulose deposition is unclear.
Many recent advances in our understanding of cellulose synthesis have come from the field of cell biology. The cellulose synthase complex was studied using electron microscopy and freeze fracture long before the first plant CESA subunit was identified (Giddings et al., 1980; Mueller and Brown, 1980). In more recent years, Arabidopsis has been used for studying the complex within living tissue. Live-cell imaging of fluorescent protein fusions to Arabidopsis CESA subunits has permitted the observation of a number of processes, including movement of the complex within the plasma membrane (presumably during glucan chain formation), delivery of the complex to the plasma membrane, and its relationships to the cortical cytoskeleton (Paredez et al., 2006, 2008; DeBolt et al., 2007a,b; Wightman and Turner, 2008; Wightman et al., 2009).
In addition to the CESA proteins, a number of other proteins have been implicated in cellulose microfibril formation, but none of them have been found to be associated with the cellulose synthase complex. COBRA (COB) was originally discovered in a screen of mutants with cell expansion abnormalities in the root (Benfey et al., 1993). COB encodes a small protein that is anchored to the extracellular face of the plasma membrane by a glycosylphosphatidylinositol anchor and contains a motif with some similarities to a cellulose-binding domain (Roudier et al., 2002). cob mutants exhibit defects in anisotropic expansion as a result of altered cellulose microfibril orientations; however, the exact role of this protein in polymerization or cell-wall deposition is unclear (Schindelman et al., 2001; Roudier et al., 2002, 2005). Within the Arabidopsis genome, there are a further 11 members of the COBRA-LIKE (COBL) family. The cobl4 mutant exhibits secondary wall cellulose defects, and, as for CESA proteins, probably represents a secondary wall-specialized enzyme (Brown et al., 2005). Gene network analysis placed COBL6 in a sub-network with CESA9 (Ma et al., 2007); like CESA9, COBL6 may therefore be involved in cellulose synthesis in pollen and anthers (Persson et al., 2007b). COBL9 has been implicated in root hair development, specifically tip-directed growth (Parker et al., 2000; Jones et al., 2006). From their very specific gene expression profiles, both COBL10 and COBL11 are probably involved in aspects of flower development (Brady et al., 2007).
Another key player in cellulose biosynthesis is KORRIGAN (KOR1), an endo-1,4-β-D-glucanase with a single membrane-spanning sequence (Nicol et al., 1998). Several kor mutant alleles have been reported, all of which result in a marked reduction in cellulose synthesis. Interestingly, defects in this single glucanase result in perturbations in both primary and secondary wall formation, as well as cytokinesis (Lane et al., 2001; Sato et al., 2001; Szyjanowicz et al., 2004). It has been hypothesized that KOR1 functions in removing the growing cellulose chain from some type of primer or initiator (Peng et al., 2002). Alternative hypotheses propose that KOR1 alters the crystallization properties within microfibrils (Szyjanowicz et al., 2004; Takahashi et al., 2009) or that it is involved in the release of completed cellulose microfibrils from cellulose synthase complexes (Szyjanowicz et al., 2004). The effect of KOR1 activity upon the growing cellulose microfibril is further supported by recent data that show lower velocity of the cellulose synthase complex in kor mutants (Paredez et al., 2008).
The KOBITO1 (KOB1) gene encodes another protein that appears to be involved directly in cellulose biosynthesis. The KOB1 protein appears to be either membrane-localized or to exist in the cell wall; there are no recognizable motifs that indicate its molecular function (Pagant et al., 2002; Lertpiriyapong and Sung, 2003). Mutants of other genes have indirect or subtle effects upon cellulose deposition. Of particular interest are those encoding sterol biosynthetic enzymes, thus demonstrating a link between sterols and cellulose formation (Schrick et al., 2004).
The next major goal is to identify the precise molecular functions of each of the encoded activities of the above genes and to decipher where along the cellulose biosynthetic pathway each activity occurs. This will probably involve reconstituting cellulose synthesis in an in vitro system.
At least four types of enzymatic activities are needed to synthesize xyloglucan (XyG), a complex carbohydrate (Faik et al., 2002): (i) UDP-glucose-dependent 1,4-β-glucan synthase to assemble the glucan backbone, (ii) UDP-xylose-dependent 1,6-α-xylosyltransferase to attach xylosyl residues to selected glucosyl residues of the glucan backbone, (iii) UDP-galactose-dependent 1,2-β-galactosyltransferase to attach galactosyl residues to specific xylosyl residues, and (iv) GDP-fucose-dependent 1,2-α-fucosyltransferase to attach fucosyl residues to selected galactosyl residues. One or more genes encoding each type of biosynthetic enzyme needed for XyG biosynthesis have been identified and characterized (Table 2).
Table 2. Biochemically characterized sequences involved in XyG biosynthesis
The xxt1 and xxt2 single mutants have no observed phenotype, whereas the xxt1 xxt2 double mutant has aberrant root hairs and significantly altered cell-wall mechanical properties (Cavalier and Keegstra, 2006; Cavalier et al., 2008).
The Arabidopsis CELLULOSE SYNTHASE-LIKE C4 (CSLC4) gene, encoding a member of the CaZY glycosyltransferase (GT) family GT2 (Cantarel et al., 2009), has been implicated as a 1,4-β-glucan synthase involved in XyG backbone biosynthesis (Cocuron et al., 2007). CSLC4 was identified as the most closely related Arabidopsis sequence to a CSLC gene from nasturtium (Tropaeolum majus) that is highly expressed at a stage of seed development during which mass deposition of storage XyG occurs. Although confirmation of function requires the characterization of XyG in CSLC mutant plants, several compelling lines of evidence support the prediction that one or more members of the AtCSLC family encode XyG β-glucan synthase, including: (i) accumulation of soluble cellodextrins upon heterologous expression of AtCSLC4 and TmCSLC in Pichia pastoris, (ii) production of longer, insoluble β-glucan chains upon co-expression in P. pastoris of AtCSLC4 with an Arabidopsis 1,6-α-xylosyltransferase (AtXXT1, see below), suggesting interaction between these two enzymes, and (iii) coordinated expression of AtCSLC4 and AtXXT1 in Arabidopsis. Because the Arabidopsis CSLC family contains five members, disruption of expression of multiple members of the family may be necessary to perturb XyG biosynthesis. There is also evidence that some CSLC sequences are involved in the synthesis of other polysaccharides (Dwivany et al., 2009).
Several genes encoding glycosyltransferases involved in the addition of xylosyl residues to the glucan backbone of XyG have been identified (Faik et al., 2002; Cavalier and Keegstra, 2006; Cavalier et al., 2008; Zabotina et al., 2008). Each of these 1,6-α-xylosyltransferases is a representative of a seven-membered CaZY GT34 family in Arabidopsis. AtXXT1, the first functionally characterized representative of this family in Arabidopsis (Faik et al., 2002), was identified based on biochemical and sequence similarities to galactomannan 1,6-α-galactosyltransferase from fenugreek (Trigonella foenum-graecum) (Edwards et al., 1999). Cavalier and Keegstra (2006) characterized recombinant AtXXT1 and its closest homolog, AtXXT2, and observed that these enzymes are biochemically similar, each being capable of xylosylating cellohexaose in vitro. These enzymes preferentially xylosylate cellohexaose at the fourth glucosyl residue from the reducing end, and will also produce doubly and triply xylosylated products. The AtXXT5 gene encodes another putative xylosyltransferase, which, when disrupted, yields plants with decreased XyG quantity and degree of xylosylation (Zabotina et al., 2008), but its activity has not been confirmed. Reverse genetic analysis of Arabidopsis xxt1 and xxt2 single mutants indicated that these plants exhibit no significant morphological perturbations, suggesting genetic redundancy of these sequences. In contrast, xxt1xxt2 double mutant plants lack detectable XyG (Cavalier et al., 2008). The apparent absence of XyG in xxt1xxt2 double mutants suggests that XXT1 or XXT2 is required in order for proper xylosylation of XyG to occur. Perhaps the most surprising revelation resulting from studies of xxt1xxt2 double mutants is their relatively mild phenotypic abnormalities; many models of primary walls predict a prominent structural role of XyG, so it may be necessary for the cell-wall research community to reconsider the roles of XyG.
Genetic screening has proven powerful as a tool for identifying genes encoding cell-wall biosynthetic enzymes in Arabidopsis. A screen that examined the monosaccharide composition of cell walls among mutant plants resulted in identification of numerous murus (mur) loci that affect the composition of cell-wall polysaccharides (Reiter et al., 1993, 1997). Among these mutants was mur3, a mutant characterized by a significant reduction in the fucose content of its cell walls (Reiter et al., 1997), and aberrant XyG galactosylation (Madson et al., 2003). Positional cloning of MUR3 revealed that this gene encodes a xyloglucan β-1,2-galactosyltransferase (CaZY family GT47) that specifically galactosylates the third xylosyl residue from the non-reducing end of XXXG, forming XXLG (Madson et al., 2003); the reduction in fucose content observed in walls of the mur3 mutant therefore resulted from the absence of an appropriate glycosylation site for XyG α-fucosyltransferase. Interestingly, the second xylosyl residue from the non-reducing end of XXXG was excessively galactosylated in the mur3 mutant, indicating the contribution of at least one additional XyG β-galactosyltransferase in XyG biosynthesis (Madson et al., 2003). One or more members from a group of MUR3-like genes probably encode this additional β-galactosyltransferase (Li et al., 2004).
AtFUT1 was the first characterized sequence encoding an enzyme involved in XyG biosynthesis. Biochemical purification of the fucosyltransferase from pea was used to identify the Arabidopsis AtFUT1 gene, which encodes the XyG α-1,2-fucosyltransferase that terminally fucosylates XyG (Perrin et al., 1999). A previously identified mutant plant (mur2) (Reiter et al., 1997), exhibiting a deficiency in cell-wall fucose content, was later shown to contain a lesion in the AtFUT1 gene, resulting in a loss of XyG fucosylation (Vanzin et al., 2002). Although the FUT1 protein is a member of a family of ten related CaZY GT37 members in Arabidopsis, the absence of detectable fucosylated XyG in plants with lesions in the FUT1 gene (Vanzin et al., 2002; Perrin et al., 2003) supports the prediction that the other AtFUT sequences do not fucosylate xyloglucan, but rather other molecules (Sarria et al., 2001).
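The dependence of fucosylation on prior galactosylation can be illustrated with the one-letter XyG subunit code used above (XXXG, XXLG). The sketch below is a toy model under stated assumptions: subunits are four-letter strings written from the non-reducing end, the letter F for a fucosylated side chain is taken from the standard XyG nomenclature rather than from the text, and fucosylation is assumed to occur only at the third position, as the mur3 phenotype implies. The function names are hypothetical, and the additional galactosyltransferase acting at the second position is not modeled.

```python
def mur3_galactosylate(subunit):
    """Model MUR3: X -> L at the third residue from the non-reducing end."""
    if subunit[2] == "X":
        return subunit[:2] + "L" + subunit[3:]
    return subunit


def fut1_fucosylate(subunit):
    """Model FUT1: L -> F at the third residue; needs a galactose to act on."""
    if subunit[2] == "L":
        return subunit[:2] + "F" + subunit[3:]
    return subunit


def decorate(subunit="XXXG", mur3=True, fut1=True):
    """Apply the two side-chain steps in order, skipping any disabled enzyme."""
    if mur3:
        subunit = mur3_galactosylate(subunit)
    if fut1:
        subunit = fut1_fucosylate(subunit)
    return subunit


wild_type = decorate()
mur3_mutant = decorate(mur3=False)
mur2_mutant = decorate(fut1=False)
```

With MUR3 disabled, the FUT1 step has no substrate and the subunit stays XXXG, which mirrors the reduced fucose content of mur3 walls. With FUT1 disabled, galactosylation proceeds but the subunit stops at XXLG.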
Until recently, xylan biosynthesis had not received as much attention as the biosynthesis of other plant cell-wall polymers. This changed dramatically with the increased interest in using plant cell-wall material as a potential source of biofuels. As a major component of many woody secondary cell walls, xylan has enormous potential as a source of sugar, but is composed predominantly of pentoses that may prove difficult to ferment efficiently. Xylan may also limit the digestibility of other cell-wall polymers by blocking the access of cellulose-degrading enzymes, providing further impetus to understanding and manipulating xylan biosynthesis (Yang and Wyman, 2004; Jeoh et al., 2007).
Glucuronoxylan found in Arabidopsis is composed of a 1,4-β-linked xylan backbone with glucuronic acid and 4-O-methyl glucuronic acid side chains (Figure 2). This structure is similar to that found in glucuronoxylan from a variety of dicot species, including trees such as birch. A short oligosaccharide sequence (Figure 2) had been identified at the reducing end of xylan from both birch and spruce wood (Johansson and Samuelson, 1977; Andersson and Samuelson, 1983); however, the significance of this sequence was not recognized until its recent identification and characterization in Arabidopsis (Peña et al., 2007). By analogy with xylan from other species, it is likely that some sugars are modified by the addition of acetate groups, but the exact nature of these substitutions in Arabidopsis has not yet been reported.
The Arabidopsis genome sequence has significantly contributed to recent advances in our understanding of xylan biosynthesis, most notably by facilitating the use of genome-wide expression data. Co-expression analysis, using marker genes for secondary cell-wall biosynthesis, identified six glycosyltransferases (IRX7, IRX8, IRX9, IRX10, IRX14 and PARVUS) with a role in xylan biosynthesis (Brown et al., 2005, 2007, 2009; Persson et al., 2005, 2007a; Zhong et al., 2005; Peña et al., 2007; Wu et al., 2009). A mutation in any one of these sequences results in a reduction in 1,4-β-xylan content. In the case of IRX10, this reduction in xylan content is much greater when a closely related gene (IRX10L) is also mutated (Brown et al., 2009; Wu et al., 2009). An allele of IRX7 was identified independently using forward genetics and termed fra8 (Table 3) (Zhong et al., 2005). Recently, a homolog of IRX7/FRA8 termed F8H has been shown to be a functional paralog (Lee et al., 2009).
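The idea behind co-expression analysis can be sketched as ranking candidate genes by how closely their expression profiles track that of a marker ("bait") gene. The example below is only an illustration: the expression values are invented toy numbers, the gene labels are placeholders, and the use of Pearson correlation with a fixed threshold is an assumption rather than the method of the studies cited above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed(bait, candidates, threshold=0.9):
    """Return names of candidates whose profile correlates with the bait."""
    return sorted(
        name for name, profile in candidates.items()
        if pearson(bait, profile) >= threshold
    )


# Toy expression values across four hypothetical tissues (not real data).
BAIT = [1.0, 2.0, 8.0, 9.0]  # stands in for a secondary wall marker gene
CANDIDATES = {
    "candidate_a": [0.5, 1.5, 7.0, 8.5],  # tracks the bait
    "candidate_b": [5.0, 5.2, 4.9, 5.1],  # roughly constant
    "candidate_c": [9.0, 8.0, 2.0, 1.0],  # opposite pattern
}

hits = coexpressed(BAIT, CANDIDATES)
```

Only the profile that rises and falls with the bait passes the threshold. Real analyses work across many more samples and need mutant characterization to confirm any candidate.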
Table 3. Characterization of xylan biosynthesis genes and corresponding mutants
Xylan synthase activity
Possible transferase activity
‘Oligosaccharide’ refers to that found at the reducing end; xylan synthase activity and chain length refer to the mutant relative to wild-type.
IRX7/FRA8 (ortholog F8H)
IRX10 (ortholog IRX10L)
The mutant phenotypes caused by disrupting the genes encoding xylan biosynthetic enzymes listed above may be grouped into two classes. The irx7, irx8 and parvus mutants contain reduced amounts of the reducing-end oligosaccharide, have a more heterogeneous distribution of xylan chain lengths, and microsomal extracts show no change in xylan synthase activity. In contrast, mutants in the second group, comprising irx9, irx14 and irx10, appear to have relatively large amounts of the oligosaccharide, shorter chains and reduced xylan synthase activity (Table 3) (Brown et al., 2007, 2009; Lee et al., 2007; Peña et al., 2007; Wu et al., 2009). The reduced abundance of the oligosaccharide observed among mutants in the first group suggests that these sequences play a role in its synthesis. To date, there is no direct evidence to suggest which enzyme catalyzes which reaction, and tentative inferences can only be made based upon the activities of homologous enzymes (Table 3). IRX9 and IRX14 are related GT43 enzymes that were originally proposed to function as the xylan synthase that extends the xylan backbone (Brown et al., 2007; Peña et al., 2007). The roles of these sequences are less clear now that IRX10, an unrelated GT47 enzyme, has also been demonstrated to be essential for xylan synthase activity (Brown et al., 2009; Wu et al., 2009). One possibility is that IRX10 might transfer a single xylosyl residue prior to processive addition of Xyl residues by IRX9 and IRX14. This would be analogous to glycosaminoglycan biosynthesis in mammals. In the case of heparin biosynthesis, the exostosin proteins EXT1 and EXT2 work together to add alternating GlcUA and GlcNAc residues; however, this processive addition of sugars is preceded by the essential addition of a single sugar to a tetrasaccharide linker by a separate enzyme (EXTL2 or EXTL3) (Esko and Selleck, 2002).
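The two-class grouping above amounts to a simple decision rule over two observations per mutant. The following sketch encodes that rule under simplifying assumptions: each phenotype is reduced to two booleans, chain-length data are ignored, and the class labels are shorthand introduced here rather than terms from the cited studies. The names `XylanPhenotype`, `MUTANTS` and `classify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XylanPhenotype:
    oligosaccharide_reduced: bool    # reducing-end oligosaccharide vs wild-type
    synthase_activity_reduced: bool  # microsomal xylan synthase vs wild-type


# Phenotypes as summarized in the text above.
MUTANTS = {
    "irx7": XylanPhenotype(True, False),
    "irx8": XylanPhenotype(True, False),
    "parvus": XylanPhenotype(True, False),
    "irx9": XylanPhenotype(False, True),
    "irx10": XylanPhenotype(False, True),
    "irx14": XylanPhenotype(False, True),
}


def classify(phenotype):
    """Assign a mutant to one of the two classes, or leave it unclassified."""
    if phenotype.oligosaccharide_reduced and not phenotype.synthase_activity_reduced:
        return "oligosaccharide synthesis"
    if phenotype.synthase_activity_reduced and not phenotype.oligosaccharide_reduced:
        return "backbone elongation"
    return "unclassified"


classes = {name: classify(p) for name, p in MUTANTS.items()}
```

A mutant showing both defects, or neither, falls outside the rule and is returned as unclassified, which is a reminder that the grouping is an inference from phenotype rather than evidence of enzymatic function.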
No candidates for the genes that are required to add the side branches to the xylan backbone have been reported. It is also unclear whether additional genes are required to produce the xylan backbone and reducing-end oligosaccharide (Figure 2). Two additional mutants with altered xylan synthase activity have been described. qua1 exhibits a pleiotropic phenotype that includes aberrant cell separation and reduced xylosyltransferase activity, but comparatively minor reductions in xylose content (Orfila et al., 2005). Similarly, mutants in AtCSLD5 exhibit both reduced xylan synthase and homogalacturonan synthase activities (Bernal et al., 2007). It is not clear whether xylan biosynthesis is the primary function of the products of these two genes, but these examples highlight interconnections between various aspects of cell-wall biosynthesis.
Questions remain about the direction of the xylan backbone biosynthesis. Although it has been possible to demonstrate the addition of xylosyl residues to the non-reducing end of a 1,4-β-xylohexaose acceptor, there are potential problems associated with the use of very high concentrations of artificial acceptors (York and O’Neill, 2008). Similarly, the function of the reducing-end oligosaccharide is unclear; it has been postulated to act either as a primer or as a terminator (York and O’Neill, 2008).
Side chains are present every seventh xylose residue, on average, along the xylan backbone in Arabidopsis (Brown et al., 2007). Although MALDI analysis of the digestion pattern suggests the side branches are not clustered, no direct information is available on the distribution of these side branches or whether the glucuronic and methylglucuronic acid side chains exhibit a particular pattern. A notable aspect of xylan side-branching is the fact that most xylan-deficient mutants possess mostly methylglucuronic acid side chains (Brown et al., 2007, 2009; Peña et al., 2007; Persson et al., 2007a; Wu et al., 2009). As the frequency of branching is maintained (Brown et al., 2007), this represents substitution of glucuronic acid by methylglucuronic acid residues. It has been suggested that this might be the result of limiting quantities of the methyl donor required for conversion of glucuronic acid to methylglucuronic acid, such that, when xylan synthesis is reduced, the available methyl donor is sufficient to convert all glucuronic acid to methylglucuronic acid (Peña et al., 2007), or that there are two alternative pathways of xylan biosynthesis (Persson et al., 2007a). Both explanations are difficult to reconcile with the observation that conversion of glucuronic acid to methylglucuronic acid is essentially complete in mutants that have a range of xylan contents. An alternative explanation involves alteration of sugar nucleotide pools (Brown et al., 2007), but there is no direct evidence to support this idea.
In many species, especially in grasses, xylans also have abundant arabinofuranosyl side branches. However, these structures have not been reported in Arabidopsis.
Mannan polysaccharides are another class of hemicellulose that is widespread among plant species (Popper, 2008). This group of carbohydrates includes mannans (1,4-β-linked homopolymers of mannosyl residues), glucomannans (1,4-β-linked heteropolymers containing mannosyl and glucosyl residues) and 1,6-α-galactosylated species of these carbohydrates (galactomannans and galactoglucomannans, respectively). Mannan polysaccharides serve structural and/or storage roles in many plants and algae (Frei and Preston, 1968; Meier and Reid, 1982; Maeda et al., 2000). Mannan polysaccharides exist in a variety of tissues and cell types of Arabidopsis (Handford et al., 2003; Liepman et al., 2007; Moller et al., 2007); however, relatively little is known about which specific mannan polysaccharides are present.
At least two types of enzymes are needed for the biosynthesis of mannan polysaccharides (Edwards et al., 1999). The 1,4-β-linked backbones of these carbohydrates are polymerized by GDP-mannose-dependent 1,4-β-mannan synthases. Some mannan synthases also have GDP-glucose-dependent 1,4-β-glucan synthase activity, and are therefore called glucomannan synthases. The addition of 1,6-α-galactosyl side chains to mannosyl residues within mannan and/or glucomannan chains is catalyzed by UDP-galactose-dependent 1,6-α-galactosyltransferases.
The first mannan synthase-encoding gene sequence, a member of the CELLULOSE SYNTHASE-LIKE A (CSLA) family (CaZY family GT2), was characterized from guar (Cyamopsis tetragonoloba; Dhugga et al., 2004). Members of the Arabidopsis CSLA family also encode glucomannan synthases (Liepman et al., 2005, 2007). In Arabidopsis, there are nine members of the CSLA family; biochemical analysis of several recombinant Arabidopsis CSLA proteins and a number of homologs from diverse plant species (Dhugga et al., 2004; Liepman et al., 2005, 2007; Suzuki et al., 2006) suggests conservation of (gluco)mannan synthase activity among all CSLA enzymes. These biochemical studies have been complemented by studies of one or more mutants corresponding to each Arabidopsis CSLA gene (Goubet et al., 2003, 2009; Zhu et al., 2003; Ubeda-Tomas et al., 2007). Most of these single mutants displayed no obvious phenotype under laboratory conditions, suggesting genetic redundancy among members of the AtCSLA family (Goubet et al., 2009). In fact, mutation of all three genes atcsla2, atcsla3 and atcsla9 was required for complete disruption of glucomannan accumulation in inflorescence stems. Loss of glucomannan in this triple mutant caused no other detectable mechanical or developmental defects (Goubet et al., 2009). The phenotypes of some csla mutants, expression patterns of particular CSLA genes, and accumulation patterns of mannan polysaccharides suggest other potential roles of these carbohydrates during plant growth and development. For example, a mutant with a lesion in the csla7 gene has defects in pollen tube growth and embryogenesis (Goubet et al., 2003), and another mutant (rat4) with a defective CSLA9 gene is resistant to transformation by Agrobacterium tumefaciens (Zhu et al., 2003). Furthermore, mannan polysaccharides and transcripts of CSLA genes are relatively abundant in Arabidopsis floral tissues (Liepman et al., 2007).
These observations underscore important, yet poorly understood, roles of mannan polysaccharides in Arabidopsis, and probably other plants.
It is not yet clear whether galactomannan or galactoglucomannan is present in Arabidopsis, and therefore whether genes encoding galactomannan 1,6-α-galactosyltransferase exist in this plant. The two Arabidopsis genes most closely related to the fenugreek galactomannan galactosyltransferase (Edwards et al., 1999) are the AtGT6 and AtGT7 sequences (Faik et al., 2002); the activity of the proteins encoded by these genes remains to be determined.
Arabidopsis pectin appears to have a composition similar to that found in other plants (Figure 3). The primary cell walls of Arabidopsis typically contain about 40% pectin, including homogalacturonan, rhamnogalacturonan I (RG-I) and rhamnogalacturonan II (RG-II) (Zablackis et al., 1995). Linkage analysis has shown the presence of branched arabinans and 1,4-β-galactans in approximately equal proportions (Zablackis et al., 1995). Xylogalacturonan is also abundant in Arabidopsis walls (Zandleven et al., 2007). With the exception of some terminal structures, Arabidopsis RG-II is structurally nearly identical to RG-II from a wide variety of other plant species (Glushka et al., 2003). It is worth noting that most studies of Arabidopsis pectin structure have focused on leaves, and that different tissues and cell types differ in their pectin composition. For example, the mucilage of Arabidopsis seeds is rich in a largely unsubstituted form of RG-I (Penfield et al., 2001; Arsovski et al., 2009).
Due to the complex structure of pectin, a large number of enzymes must be required for its synthesis. Mohnen (2008) has estimated that 67 glycosyltransferases, methyltransferases and acetyltransferases are required to synthesize pectin. The first pectin biosynthetic enzyme to be identified was GAUT1, a homogalacturonan 1,4-α-galacturonosyltransferase (Sterling et al., 2006). This enzyme appears to be present in a complex with the homologous GAUT7 protein (Mohnen, 2008). The GAUT proteins form a sub-group within the CaZY GT8 family, and this sub-group contains 15 members in Arabidopsis. Although all 15 GAUT sequences may be α-galacturonosyltransferases, so far this activity has only been demonstrated for GAUT1. Another member of the GAUT family, GAUT9 (QUASIMODO1), is apparently involved in pectin biosynthesis, as the qua1 mutant has a pectin-deficient phenotype (Bouton et al., 2002). However, the mutant shows pleiotropic effects, including defects in xylan deposition (Orfila et al., 2005).
The biosynthesis of RG-I must require many enzymes, but these remain largely unknown despite extensive forward and reverse genetics studies. The Arabidopsis mutant arad1 is specifically deficient in 1,5-α-linked arabinan side chains of RG-I (Harholt et al., 2006). Thus ARAD1, a member of the CaZY GT47 family, is most likely a 1,5-α-arabinosyltransferase, and is so far the only glycosyltransferase known to be involved in RG-I biosynthesis. Activity of the heterologously expressed ARAD1 protein remains to be demonstrated. It is hypothesized that the sub-family of CaZY GT47 to which ARAD1 belongs contains additional proteins required for synthesizing the various linkages in arabinan. The other major type of RG-I side chain, besides arabinan, is 1,4-β-galactan. No data are available to suggest which glycosyltransferases might be involved in synthesizing this important structural component.
Xylogalacturonan is a modified form of homogalacturonan containing β-xylosyl residues linked to O3 of the backbone galacturonosyl residues. Xylogalacturonan is abundant in Arabidopsis leaves, and as much as 30% of the xylose in leaf primary cell walls may be associated with this polysaccharide (Jensen et al., 2008). The Arabidopsis xgd1 mutant completely lacks xylogalacturonan in leaves (Jensen et al., 2008). The XGD1 protein is another member of the CaZY GT47 family, and, when transiently expressed in tobacco leaves, it was shown to catalyze the transfer of xylose from UDP-xylose onto endogenous pectin and exogenous oligogalacturonides. While leaves of the xgd1 mutant had no detectable xylogalacturonan, the mutant retained LM8 xylogalacturonan epitopes in siliques. This suggests that xylogalacturonan in siliques differs in structure from that found in leaves, and indicates that there must be additional xylosyltransferases involved in xylogalacturonan biosynthesis in Arabidopsis (Jensen et al., 2008).
RG-II has a highly complex structure, comprising 12 different sugars in 20 different linkages; however, very few sequences involved in RG-II synthesis have been identified. Three homologous proteins, RGXT1, RGXT2 and RGXT3, which are members of GT77, have been shown to specifically catalyze the transfer of xylose from UDP-xylose onto fucosyl residues via 1,3-α-linkages (Egelund et al., 2006, 2008). This enzymatic activity suggests a specific involvement in formation of the A-chain of RG-II, the only cell-wall carbohydrate structure known to contain α-xylose-(1,3)-fucose. The sugar composition of RG-II isolated from T-DNA insertional mutants of either RGXT1 or RGXT2 showed no significant changes, suggesting genetic redundancy among RGXT isoforms. However, the RG-II isolated from these mutant lines functioned as a specific acceptor substrate in an RGXT biochemical assay. The close linkage between the RGXT1 and RGXT2 sequences on chromosome 4 has so far prevented the generation of a double mutant.
The NpGUT1 protein was identified in a tobacco mutant deficient in cell adhesion, and was suggested to be a glucuronosyltransferase involved in RG-II biosynthesis (Iwai et al., 2002). However, studies of the closest Arabidopsis homologs, IRX10 and IRX10L, have shown an involvement in xylan biosynthesis (see above), and failed to confirm any involvement in RG-II biosynthesis.
All pectic polysaccharides may show methyl esterification of the carboxyl groups of galacturonic acid residues. The qua2 mutant is affected in a putative methyltransferase and has decreased pectin content (Mouille et al., 2007). However, the biochemical activity of QUA2 has not been confirmed. Acetylation on O2 or O3 is also found in pectin. Recently, an Arabidopsis mutant, reduced wall acetylation 2 (rwa2), with decreased acetylation of wall polysaccharides, has been identified (Y. Manabe, Feedstocks Division, Joint BioEnergy Institute, and H.V.S., unpublished results), but it is not yet clear whether the affected protein is a specific pectin acetyltransferase.
Arabinogalactan proteins (AGPs) are abundant and ubiquitous proteoglycans on the cell surface in plants, and play important roles during plant growth and development, and in interactions with micro-organisms (for a recent review, see Seifert and Roberts, 2007). AGPs consist of a core protein backbone decorated heavily by O-glycosylation with complex arabinogalactans. AGPs often contain a glycosylphosphatidylinositol (GPI) lipid anchor. The structure of Arabidopsis arabinogalactans is poorly characterized. Characterization has been very difficult due to heterogeneity of the glycan structures, not only between different AGPs or depending on developmental stage and tissue type, but even on the same peptide sequence in the same tissue (Estevez et al., 2006). Biosynthesis of AGPs involves complex post-translational modifications, including cleavage of an N-terminal signal sequence, hydroxylation of proline residues, GPI anchor modification, and arabinogalactosylation of hydroxyproline residues. The initial steps of these post-translational modifications are known to some degree, but very little is known about the glycosylation process.
Originally AGPs were found as a sub-group of hydroxyproline-rich glycoproteins in cell walls (Fincher et al., 1983), but recent genomics and proteomics research has suggested that a much wider variety of proteins are modified by arabinogalactans. ‘Classical’ AGPs, which are similar in structure to the products of the first cloned full-length cDNAs encoding AGPs (Chen et al., 1994; Du et al., 1994), consist of a central domain rich in Pro, Ala, Ser and Thr, flanked by an N-terminal signal peptide and a C-terminal GPI anchor modification. Many AGPs deviate from this domain structure, including AGPs containing Lys-rich domains and fasciclin-like AGPs. Sequence analysis of the Arabidopsis genome predicted 13 classical AGPs, three AGPs containing Lys-rich domains, approximately 20 fasciclin-like AGPs, and approximately ten arabinogalactan peptides consisting of only 10–13 amino acid residues (Borner et al., 2002; Schultz et al., 2002). Of these proteins, eight classical AGPs were isolated, and the predicted cleavage of the signal sequence and attachment of the GPI anchor were confirmed (Schultz et al., 2004).
The consensus sequence motif that leads to glycosylation with arabinogalactan is not fully understood, but studies using synthetic peptides suggested that clustered non-contiguous hydroxyproline residues tend to be arabinogalactosylated (Kieliszewski and Shpak, 2001). According to this hypothesis, 40% of the GPI-anchored proteins isolated in Arabidopsis (100 of 248) are predicted to be AGPs, of which only 29 are classical AGPs or arabinogalactan peptides (Borner et al., 2003). Essentially any peptide sequence containing a secretion signal and clustered non-contiguous hydroxyproline residues can be considered AGP-like, including extensin-, COBRA-, glycerophosphodiesterase-, SKU5- and receptor-like proteins (Borner et al., 2003).
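The idea of screening sequences for clustered non-contiguous hydroxyproline residues can be illustrated with a minimal sketch. This is not the procedure used in the studies cited above. The pattern definition (three or more consecutive X-Pro dipeptides in which X is not Pro), the assumption that every such Pro is hydroxylated, and the function name `noncontiguous_pro_clusters` are all hypothetical simplifications chosen for illustration.

```python
import re

# Assumption (illustrative only): a "cluster" is three or more consecutive
# X-Pro dipeptides where X is any residue other than Pro, not followed by a
# further Pro. Every Pro in the match is treated as a potential hydroxyproline;
# real hydroxylation depends on sequence context and P4H isoform.
CLUSTER = re.compile(r"(?:[^P]P){3,}(?!P)")


def noncontiguous_pro_clusters(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each putative cluster."""
    return [(m.start(), m.group()) for m in CLUSTER.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequences, not real Arabidopsis proteins.
    print(noncontiguous_pro_clusters("MKSPSPSPAAK"))  # alternating Ser-Pro run
    print(noncontiguous_pro_clusters("SPPPPSPPPP"))   # contiguous Pro blocks
```

Under these assumptions, an alternating Ser-Pro run is reported as a candidate, whereas blocks of contiguous prolines are not. A realistic predictor would also need to check for a secretion signal and account for incomplete hydroxylation.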
Prolyl hydroxylation of AGPs is catalyzed by prolyl-4-hydroxylases (P4H) in the ER. At least six P4H sequences are present in Arabidopsis, of which P4H-1 and P4H-2 encode active P4H isoforms with different substrate specificities (Hieta and Myllyharju, 2002; Tiainen et al., 2005). The efficiency of prolyl hydroxylation may depend on the different P4H isoforms, but also on peptide sequences. For instance, the proline residues in a Ser-Pro-Ser-Pro repeat were almost fully hydroxylated, whereas those in a Val-Pro-Val-Pro repeat were only 60% hydroxylated in Arabidopsis (Estevez et al., 2006).
Many AGPs are considered to be GPI-anchored in Arabidopsis based on cleavage by phospholipase C and attachment to ethanolamine at the C-terminus (Borner et al., 2003; Schultz et al., 2004). Little is known about the structure of GPI in plants, but AGP from suspension-cultured pear cells contained d-Man-(1,2)-α-d-Man-(1,6)-α-d-Man-(1,4)-α-d-GlcN-inositylphosphoceramide lipid (Oxley and Bacic, 1999). The structure was consistent with that found in animals, protozoans and yeast, except for a partial substitution with Gal on the O4 position of the 6-linked mannosyl residue, and the presence of a ceramide lipid instead of glycerolipid. For the conserved portion of the structure, the proteins that catalyze the reactions within this pathway have been well studied in other organisms. Little is known about them in plants, but the Arabidopsis genome contains genes homologous to several animal and yeast genes responsible for GPI anchor synthesis. Mutations in these genes cause severe deficiencies in plants. For example, mutations in the SETH1 and SETH2 genes in Arabidopsis, which are homologs of two components of the GPI-N-acetylglucosaminyltransferase complex involved in the transfer of d-GlcNAc to phosphatidylinositol, specifically blocked pollen germination and tube growth (Lalanne et al., 2004). Another example is the Arabidopsis peanut1 mutant (pnt1), which is defective in a homolog of a mammalian mannosyltransferase that is essential for GPI biosynthesis, and displayed numerous changes in cell-wall polysaccharide composition as well as developmental abnormalities (Gillmor et al., 2005).
Most of the glycan structure of AGPs consists of a 1,3-β-galactan backbone branched at the O6 position, with side chains mainly composed of arabinose, but also containing glucuronic acid, rhamnose, xylose and other sugars (Fincher et al., 1983; Showalter, 1993; Majewska-Sawka and Nothnagel, 2000; Gaspar et al., 2001; Seifert and Roberts, 2007). Arabinogalactan glycosylation may be initiated in the ER (Oka et al., 2010), but mainly occurs in the Golgi apparatus (Kato et al., 2003). Many glycosyltransferases are required to build an entire arabinogalactan structure. For example, assuming that every glycosyltransferase recognizes a specific disaccharide as acceptor and creates a specific glycosidic linkage, an arabinogalactan structure added to a hydroxyproline residue of a synthetic (Ala-Hyp)51 sequence expressed in tobacco would require at least 11 glycosyltransferases (Tan et al., 2004). Bioinformatics analysis using animal 1,3-β-galactosyltransferase sequences suggested involvement of the CaZY GT31 family for synthesis of 1,3-β-galactans present in the arabinogalactan backbone (Qu et al., 2008).
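The counting argument behind such an estimate can be sketched as follows: if each glycosyltransferase is defined by the sugar it adds, the linkage it forms and the disaccharide it recognizes as acceptor, the minimum number of enzymes equals the number of distinct combinations found in the glycan. The `Bond` type, the function name and the toy glycan below are hypothetical and do not reproduce the structure reported by Tan et al. (2004).

```python
from typing import Iterable, NamedTuple


class Bond(NamedTuple):
    """One glycosidic bond, described by the enzyme specificity it implies."""
    donor: str     # sugar being added
    linkage: str   # linkage formed
    acceptor: str  # disaccharide recognized as acceptor


def min_glycosyltransferases(bonds: Iterable[Bond]) -> int:
    """Count distinct (donor, linkage, acceptor) combinations."""
    return len(set(bonds))


if __name__ == "__main__":
    # Toy glycan (illustrative only): repeated bonds of the same type are
    # assumed to be made by the same enzyme.
    toy = [
        Bond("Gal", "b1,3", "Gal-b1,3-Gal"),
        Bond("Gal", "b1,3", "Gal-b1,3-Gal"),
        Bond("Gal", "b1,6", "Gal-b1,3-Gal"),
        Bond("Ara", "a1,3", "Gal-b1,6-Gal"),
        Bond("Ara", "a1,3", "Gal-b1,6-Gal"),
    ]
    print(min_glycosyltransferases(toy))
```

The result is a lower bound under the stated assumption only. Enzymes with broader acceptor specificity would reduce the count, and redundant isoforms would increase the number of genes involved.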
The wealth of information gained from the Arabidopsis genome sequence (Arabidopsis Genome Initiative, 2000), coupled with the powerful tools available to Arabidopsis researchers (Seki et al., 2002; Alonso et al., 2003; Rhee et al., 2003), has facilitated much progress within the cell-wall research community in identification of genes encoding enzymes involved in cell-wall biosynthesis. The progress made since completion of the Arabidopsis genome sequence is illustrated by numerous studies. Within the genome, there are many gene families containing members with related function. For example, the characterization of GAUT1, a GalA transferase involved in pectin biosynthesis, has resulted in the identification of another 14 genes with potential roles as GalA transferases, and this information should facilitate rapid functional characterization of the remaining family members. Similar situations exist for members of other families (Sarria et al., 2001; Faik et al., 2002; Zhong et al., 2003; Liepman et al., 2007).
While the availability of the genome has had a significant impact on the identification of genes encoding enzymes that are essential for xylan biosynthesis, there has been a relative lack of progress in the complementary biochemical analysis. Consequently, no single gene product has been unambiguously assigned a function in xylan biosynthesis. It is clear, however, that identifying these genes is an excellent first step and will facilitate further functional analysis. In the case of XyG biosynthesis, more progress has been made in identifying the activities of individual enzymes; sequences corresponding to at least one representative of each glycan synthase and glycosyltransferase needed for XyG biosynthesis have been identified. The availability of this information has enabled new experimental approaches to examine the role of XyG in the cell wall. For example, a mutant lacking detectable XyG displays only modest phenotypic abnormalities, challenging the validity of many models about the roles of XyG in cell-wall structure and in regulating growth (Cavalier et al., 2008). As corresponding mutants are generated for other polysaccharides, it is likely that current models of cell-wall structure and function will undergo a paradigm shift.
The utility of Arabidopsis in comparative studies is also clearly apparent. The structure of Arabidopsis xylan is broadly similar to that of a range of both gymnosperm and angiosperm trees. Furthermore, genes with similarities to IRX7, IRX8, IRX9 and PARVUS have been identified in poplar and shown to function as true orthologs by complementing the corresponding Arabidopsis mutant (Zhou et al., 2006, 2007; Kong et al., 2009). In contrast, in a comparison of Arabidopsis, poplar and rice, no obvious ortholog of IRX8 was found in rice (Caffall et al., 2009), although orthologs of other genes encoding enzymes involved in xylan biosynthesis clearly exist in monocots and are likely to represent invaluable starting points for further analysis. It will be interesting to determine whether xylan biosynthesis in monocots is essentially a variation of a well-conserved pathway, or whether alternative pathways for xylan synthesis exist. This illustrates a type of comparative study that would not have been possible without the availability of the genome sequence and the tools for functional analysis that are available in Arabidopsis.
Arabidopsis will continue to play an important role in the identification of genes and functional analysis of proteins involved in plant cell-wall biosynthesis. As more steps in wall biosynthesis are identified, there will probably be many more discoveries that challenge our current understanding of cell-wall biosynthesis and the functional roles of various cell-wall components. Furthermore, knowledge gained from these studies will have important practical applications, including enabling the engineering of plant cell-wall biomass tailored for various uses.
A.H.L. is supported by National Research Initiative grant number 2007-03542 from the United States Department of Agriculture Cooperative State Research, Education and Extension Service. N.G. and R.W. are supported by the 7th Framework Program of the European Union, project number 211982, ‘Renewall’. H.V.S. is supported by the US Department of Energy, Office of Science, Office of Biological and Environmental Research, through contract DE-AC02-05CH11231 between Lawrence Berkeley National Laboratory and the US Department of Energy. A.H.L. acknowledges Dr Cory Emal (Eastern Michigan University, Department of Chemistry) and Dr David Cavalier (Michigan State University, Plant Research Laboratory) for helpful discussions.