Interfascicular fiber formation in Arabidopsis requires the regulation of cell fate and differentiation, and the integration of cell biological processes and metabolic pathways leading to cell maturation. A combination of approaches will be necessary to gain a more complete understanding of these complex processes and their regulation. In this study, we assayed global changes in the Arabidopsis transcriptome along a developmental gradient of inflorescence stem maturation that incorporates progressive fiber differentiation. While neither the expression profile of a given gene of unknown function nor sequence similarity to functionally characterized genes can prove biochemical or cellular function, a combination of both techniques on a whole genome level provides a powerful tool for hypothesis building and identification of candidate genes worthy of more detailed study. We have applied this approach to identify candidate genes potentially encoding missing links in lignin biosynthesis and transcriptional regulators of fiber differentiation and maturation. Our results shed light on the coordination of metabolic pathways with cellular differentiation and complement previous genetic, expression profiling, and bioinformatic studies to provide new insights into fiber differentiation and maturation.
Upregulation of the shikimate pathway in conjunction with stem development might be anticipated as one mechanism by which maturing fiber cells could meet the metabolic demand for phenylalanine required for monolignol biosynthesis. Indeed, the expression of genes encoding individual shikimate pathway enzymes is both developmentally regulated and responsive to several environmental stimuli such as light, mechanical wounding, or pathogen infection (Eberhard et al., 1996; Keith et al., 1991; Lee et al., 1997; Mobley et al., 1999). Although most steps of the shikimate pathway have been well characterized in plants (for references on each enzyme see Table S5), no compilation of the gene families encoding this pathway has previously been published. Most Arabidopsis shikimate pathway enzymes are encoded by single genes or small gene families sharing high sequence similarity. This suggests that in most cases all gene family members encode enzyme isoforms with the same biochemical function. More distantly related genes are generally absent from the Arabidopsis genome. An exception is the pair of putative Arabidopsis SK genes, which share relatively high sequence identity with a characterized SK gene from tomato (Schmid et al., 1992) and also share somewhat lower similarity with two additional Arabidopsis genes (Table S4). In general, however, it appears that duplicated genes encoding enzymes of the shikimate pathway have not been recruited by other metabolic pathways, as has apparently happened with genes encoding enzymes related to phenylpropanoid enzymes (Table S2 and references therein).
Our global expression profiling results show that genes encoding enzymes of the shikimate pathway leading to the biosynthesis of phenylalanine are transcriptionally upregulated in primary stems, consistent with an enhanced demand for this amino acid for lignin biosynthesis (Figure 1; Table 1). In contrast, the parts of the pathway that are specific for the other aromatic amino acids are not differentially expressed. This suggests a pattern of transcriptional control of the aromatic amino acid biosynthetic pathway that directly reflects the physiological requirement for individual amino acids. A tight crosstalk between phenylpropanoid metabolism and the shikimate pathway was also observed in Arabidopsis plants mutated in the phenylalanine ammonia lyase-encoding genes PAL1 and PAL2; these mutations result in the accumulation of aromatic amino acids and a transcriptional upregulation of shikimate pathway biosynthesis genes (Rohde et al., 2004). Within some gene families, individual members respond differently along the developmental gradient. For example, only one of the three CM genes in Arabidopsis (CM1) is upregulated, while CM2 and CM3 are not differentially expressed along the axis of primary stems (Figure 7). CM1, CM2, and CM3 display different organ-specific and stress-induced expression patterns (Eberhard et al., 1996; Mobley et al., 1999). Based on enzymatic properties and expression patterns, Mobley et al. (1999) suggested that CM1 might become particularly important when flux through the pathway is rapidly increased, for example, in response to environmental stress, and our data support this hypothesis.
A similar pattern of apparent functional differentiation was detected in the case of DAHP synthase (DHS) genes. Previous studies have found that DHS1 is induced by wounding and pathogen attack, and may thus provide precursors for secondary metabolism, while DHS2 is more constitutively expressed, suggesting a role in providing aromatic amino acids for protein biosynthesis (Keith et al., 1991). Unfortunately, DHS1 was not represented on the array used, but DHS2 expression is only elevated in the oldest part of the stem, while DHS3, an uncharacterized third Arabidopsis gene, is strongly upregulated in coordination with ongoing lignin biosynthesis.
Taken together, in cases where multiple genes exist, the distinct isoforms appear to be differentially regulated and could potentially provide aromatic amino acid precursors for different physiological requirements, for example, protein biosynthesis and secondary metabolism/lignin biosynthesis. In contrast, those enzymes in the pre-chorismate pathway encoded by single-copy genes that are likely targeted to the chloroplast (DQS, DHQD/SD, and CS) must meet all physiological requirements during plant growth and development. In accordance with such broader roles, these genes are constitutively expressed or are only weakly upregulated in primary stems in concert with lignin biosynthesis (Figure 7).
Phenylalanine biosynthesis in plants follows an alternative pathway in which prephenate is first transaminated to form arogenate, which in turn is dehydrated to phenylalanine (De-Eknamkul and Ellis, 1988; Siehl and Conn, 1988). While neither a prephenate aminotransferase (PNT) nor an arogenate dehydratase (ADT) gene has been identified in plants to date, our global expression profiling approach showed that three Arabidopsis genes with similarity to prephenate dehydratase from yeast (ADT3, At2g27820; ADT5, At5g22630; and ADT6, At1g08250) have expression patterns consistent with roles in phenylalanine biosynthesis (Figure 7) and are predicted to be localized to the chloroplast (Table S5), making them candidates for arogenate dehydratase. Our profiling also identified three candidate PNT genes. Among these, At2g20610 has previously been identified in several mutant screens. Loss of At2g20610 function causes a variety of phenotypic abnormalities, including excessive root formation and a drastic increase in endogenous auxin levels (Boerjan et al., 1995 [sur1], Celenza et al., 1995 [afl1], King et al., 1995 [rty], Lehman et al., 1996 [hls3]). This phenotype could be explained by a block in the phenylalanine pathway that re-channels carbon flow into the tryptophan/auxin biosynthesis pathway. However, Mikkelsen et al. (2004) recently showed that sur1 plants lack glucosinolates and accumulate L-cysteine conjugate precursors, and that SUR1 encodes a C-S lyase that likely converts S-(alkylacetohydroximoyl)-L-cysteines to the corresponding thiohydroximic acids.
The second candidate PNT gene, At1g34060, encodes a homolog of Allium spp. alliinases that catalyze the cleavage of cysteine sulphoxide derivatives to produce the volatiles responsible for the typical flavor of onion and garlic (Jones et al., 2004) and thus also belongs to the C-S lyase protein family (Lancaster et al., 2000). The known or likely activity of SUR1 and At1g34060 as C-S lyases makes these two genes unlikely candidates to encode prephenate aminotransferase. However, as aminotransferases can also have C-S lyase activity (Gaskin et al., 1995), we cannot exclude the possibility that these gene products may also be capable of transaminating prephenate.
The third candidate PNT gene (At2g38400) is most likely to encode a true aminotransferase rather than a C-S lyase. It shares 37% amino acid sequence identity with an alanine:glyoxylate aminotransferase (AGT2) from rat and has been described as one of three Arabidopsis AGT2 homologs (Liepman and Olsen, 2003). However, the recombinant protein of one of these AGT2 homologs failed to exhibit glyoxylate aminotransferase activity using several amino acid donors (Liepman and Olsen, 2003), making it possible that AGT3 actually encodes the prephenate aminotransferase of the shikimate pathway.
The identification of candidate genes for prephenate aminotransferase and arogenate dehydratase opens the door to functional tests of their candidacy. One approach would be to test their ability to complement the pha2 (prephenate dehydratase) and aro8/aro9 (aromatic amino acid aminotransferase; Iraqui et al., 1998) mutants of yeast by simultaneously expressing combinations of both candidate genes, thereby potentially establishing the end point of plant phenylalanine metabolism in yeast.
We used the expression pattern of a set of known or inferred monolignol biosynthetic genes as a benchmark to identify candidate genes for less well-characterized steps of lignin biosynthesis. No direct experimental evidence exists regarding the nature of the transporters involved in monolignol export. Nevertheless, ATP binding cassette (ABC) type transporters are plausible candidates fulfilling this function, along with vesicle-mediated secretion of monolignols (Samuels et al., 2002). This class of transporters is involved in the import or the export of a wide variety of soluble metabolites ranging from carbohydrates, fatty acids, and proteins to aromatic molecules (Higgins, 2001). They have been associated with detoxification, xenobiotic, and heavy metal transport in all kingdoms, but it is becoming increasingly clear that in plants ABC transporters are also involved in the transport of secondary metabolites such as terpenoids, alkaloids, very long-chain fatty acids, and anthocyanins (Goodman et al., 2004; Jasinski et al., 2001; Pighin et al., 2004; Shitan et al., 2003). We identified seven candidate ABC transporters that display expression patterns in primary stems consistent with expression profiles of monolignol biosynthetic genes and increased lignin content. No function has yet been attributed to any of these, but two of them (MDR13 and MDR8) have expression patterns that most closely resemble those of monolignol biosynthesis genes. Both proteins are likely targeted to the cytoplasm and belong to the multiple drug resistance (MDR) subfamily of the ABCB class of full transporters featuring two trans-membrane domains and two nucleotide binding domains (TM-ABC)2 (Garcia et al., 2004; Sánchez-Fernández et al., 2001). Interestingly, MDR8 is the Arabidopsis homolog to an MDR from Coptis japonica (CjMDR1) that is involved in the translocation of berberine, a benzylisoquinoline alkaloid (Shitan et al., 2003). 
The other candidate transporter genes identified in our analysis, WBC23, PDR1, PDR8, and PDR13, are related to the human ABCG subfamily (Garcia et al., 2004). While WBC23 encodes a half transporter consisting of a single ABC and TM domain, PDR (pleiotropic drug resistance) genes encode full transporters with an (ABC-TM)2 organization (Sánchez-Fernández et al., 2001). While all three PDR proteins are predicted to be targeted to the chloroplast, making them unlikely monolignol transporter candidates, WBC23 is likely targeted to the cytoplasm. WBC23 is a homolog of the WHITE, BROWN, and SCARLET proteins from Drosophila melanogaster, which are involved in the export of aromatic metabolites such as 3-hydroxykynurenine, a tryptophan derivative (Mackenzie et al., 2000). Thus far, PDR-like genes in plants have only been implicated in the transport of the diterpene sclareol (van den Brûle et al., 2002), while the functions of most family members remain elusive.
Monolignol dehydrogenation and polymerization
Upon transport into the apoplast, polymerization of monolignols is initiated by dehydrogenation of monolignols followed by enzyme-independent radical coupling. Many different types of oxidative enzymes have been associated with the dehydrogenation of monolignols (for review see Boerjan et al., 2003). However, in addition to developmental lignification, oxidative enzymes such as peroxidases are involved in a variety of other physiological responses including auxin catabolism, defense against pathogens, salt tolerance, and oxidative stress (Hiraga et al., 2001). This functional diversity, and the large size of the gene families encoding these enzymes (Table S4 and references therein), makes it difficult to identify isoforms that are specifically involved in lignin polymerization. Our profiling study identified 22 oxidizing enzymes (class III peroxidases, laccases, NADPH oxidase-like enzymes, oxalate oxidases, and copper amine oxidases) that display co-expression with monolignol biosynthetic genes (Figure 6). These belong to five of the six classes analyzed, a range that might reflect the participation of a mixture of divergent oxidizing enzymes in the dehydrogenation process (for references see Table S4). Among these, eight class III peroxidases displayed expression profiles similar to monolignol biosynthetic genes and none of them has previously been implicated in lignin polymerization.
In Arabidopsis, only the peroxidase gene AtPA2 (At5g06720) has been shown, using promoter-reporter gene fusions, to be expressed in lignifying tissues, and its expression level is enhanced in a mutant characterized by elevated lignin levels (Østergaard et al., 2000). AtPA2 expression was also significantly elevated in middle sections of primary stems in our study. However, the fold change remained below twofold, and therefore it was not included in the cluster analysis depicted in Figure 6. Similarly, the two Arabidopsis genes most similar to a poplar peroxidase that has been correlated with lignin biosynthesis (At4g08770 and At4g08780 compared with PXP3-4; Christensen et al., 2001) as well as At4g21960, the closest Arabidopsis relative to a peroxidase from tobacco that has been implicated in lignin biosynthesis (Blee et al., 2003), were excluded from further analysis because they barely missed our stringent criteria for being differentially expressed. Taken together, these results indicate that the eight peroxidases identified may not represent the whole set of peroxidases involved in lignin polymerization, and that a plethora of different enzymes could be involved.
Although peroxidases have traditionally been implicated in monolignol polymerization, laccases, which use molecular oxygen as an electron acceptor and can oxidize monolignols in vitro, have also been implicated in lignification in many plant species (Gavnholt et al., 2002; Kiefer-Meyer et al., 1996; LaFayette et al., 1999; Ranocha et al., 2002). The Arabidopsis genome harbors many genes belonging to the blue copper superfamily to which laccases belong. However, only 17 deduced protein sequences are more similar to any of the characterized laccases from dicots (Table S4) than to an ascorbate oxidase from cucumber (Ohkawa et al., 1989). Among the six laccase genes transcriptionally upregulated over the course of stem development, two, LAC04 and LAC11, are most closely related to LAC01, LAC02, and LAC03 from poplar and a tobacco laccase (Kiefer-Meyer et al., 1996; Ranocha et al., 2002). Antisense suppression of LAC03 from poplar results in adhesion defects in cell walls of xylem fibers (Ranocha et al., 2002). Two other proteins, LAC02 and LAC17, are the only Arabidopsis proteins that group with poplar LAC110 (Ranocha et al., 2002) and all tulip tree laccases (LaFayette et al., 1999) in phylogenetic reconstructions, while the two remaining proteins, LAC05 and LAC12, group with the poplar LAC90 protein sequence (data not shown). This suggests that in Arabidopsis, multiple, phylogenetically divergent laccase isoforms are under similar transcriptional control and may serve redundant functions in lignin polymerization.
Original models for lignin monomer polymerization assumed that radical coupling occurs randomly, driven by the supply of monolignol radicals. However, the discovery of dirigent proteins showed that radical coupling reactions can be guided by binding proteins that provide stereoselectivity to the reaction in lignan (monolignol dimer) biosynthesis (Davin et al., 1997). Dirigent isoforms could fulfill the task of imparting ordered structure to the lignin polymer (Gang et al., 1999). However, an essential role for dirigent proteins in lignin polymerization has yet to be demonstrated (Boerjan et al., 2003).
We identified 21 genes in the Arabidopsis genome with significant sequence similarity to the Forsythia dirigent protein (Table S4), and each is represented on the microarray we used. While none of the dirigent genes is tightly co-expressed with the majority of monolignol biosynthetic genes in Arabidopsis stems (i.e., in expression categories Ia/b; Figure 6), three genes are co-expressed with some monolignol biosynthetic genes, and could play roles in incipient lignin polymerization. In phylogenetic reconstructions (data not shown), Arabidopsis dirigent proteins form three distinct clades. Three related members (DIR9, DIR10, and DIR18) form one cluster and these sequences all have large insertions with repetitive sequences in their coding region that are not found in other dirigent proteins. None of these genes is differentially expressed in primary stems. DIR5, 6, 12, 13, and 14 form a second cluster and are most closely related to the original lignan-specific protein from Forsythia. Among these, DIR5 and DIR6 were differentially expressed with patterns that most closely resemble the expression of monolignol biosynthetic genes in our study. The remaining proteins form a third distinct clade in phylogenetic reconstructions, and among these 16 genes, only DIR11 is co-expressed with PAL3 in expression cluster Ic (Figure 6). Interestingly, five dirigent genes are actually expressed more highly in young parts of the stem and are downregulated in stem sections with ongoing fiber lignification. This suggests that these genes serve biochemical functions in the upper part of developing stems, perhaps related to the high levels of soluble phenolic compounds in these samples. Expression profiling at higher resolution and reverse genetics approaches could shed more light on the functions of the three lignification-associated dirigent candidates we identified.
Transcriptional regulation is likely to play a key role in the complex series of events leading to fiber cell differentiation and maturation along the developmental axis we surveyed by global expression profiling. However, not all genes encoding transcription factors that are involved in regulating these events are necessarily differentially expressed; some could be expressed constitutively and/or be activated post-translationally. Alternatively, subtle but critical differences in expression levels over distances of only a few cells might not be detectable with our approach. Thus, it is not surprising that certain genes that have been implicated in the regulation of lignin biosynthesis and/or fiber differentiation were not differentially expressed in our experiment, for example, AtHB8 and REV/IFL (Baima et al., 2001; Zhong and Ye, 1999). Other functionally characterized transcription factor genes were either not represented on our array (e.g., PAP1/MYB75; Borevitz et al., 2000) or just barely failed to meet our statistical criteria (F: P < 0.01) for inclusion in an expression category. For example, MYB61, which when over-expressed results in ectopic lignification of roots and stems (Newman et al., 2004), was up to fourfold more highly expressed in older stem sections compared with the tip, with an ANOVA P-value of 0.007, but the analysis of the quadratic expression model for this gene resulted in a P-value of 0.014 for the F-statistic, therefore excluding it from further analysis. This suggests that our stringent statistical analysis likely excludes some valid candidate genes. However, we still identified 271 transcription factor genes that were differentially expressed, among which 191 were upregulated along the axis of primary stem development in a manner consistent with potential regulatory functions in fiber differentiation.
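The inclusion rule illustrated by the MYB61 example can be sketched in a few lines. The sketch assumes that a gene enters an expression category only when both the ANOVA P-value and the P-value of the F-statistic for the quadratic expression model fall below 0.01; treating the two tests as a joint requirement is an inference from the MYB61 case, and the function and parameter names are hypothetical rather than part of the analysis pipeline used in the study.

```python
P_CUTOFF = 0.01  # threshold quoted in the text (F: P < 0.01)


def passes_inclusion(anova_p: float, model_f_p: float,
                     cutoff: float = P_CUTOFF) -> bool:
    """Return True if a gene is retained under the assumed joint criterion.

    anova_p   : P-value from the ANOVA across stem sections
    model_f_p : P-value of the F-statistic for the quadratic expression model
    """
    return anova_p < cutoff and model_f_p < cutoff


# MYB61 values as reported in the text: ANOVA P = 0.007, model F-test P = 0.014
myb61_retained = passes_inclusion(anova_p=0.007, model_f_p=0.014)
print(myb61_retained)
```

Under this assumed rule, MYB61 passes the first test but not the second, which is how a gene with up to fourfold higher expression in older stem sections can still be excluded.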
Not all of the candidate differentially expressed transcription factors identified here are necessarily involved in regulating fiber differentiation; some may be involved in other developmental processes within the primary stem. To filter our data, we compared them with an expression map of Arabidopsis root cells and tissues over the course of root development (Birnbaum et al., 2003). Like xylem and interfascicular fiber cells of the stem, certain cells destined to form xylem in the root stele also undergo a differentiation process culminating in the formation of elongated, secondarily thickened, and lignified cells. Furthermore, auxin has been identified as a key signal that regulates both xylem differentiation in the stele and interfascicular fiber differentiation (Birnbaum et al., 2003; Little et al., 2002; Zhong and Ye, 2001). Of the 16 transcription factor genes upregulated specifically in the stele (Birnbaum et al., 2003), nine were also represented in our upregulated expression categories 4, 5, and 8 (Table S2). This suggests conserved and possibly redundant functions of this small set of candidate genes in primary stem fiber and root stele development.
Interestingly, this candidate gene set contains MYB and bHLH transcription factors with the potential to interact with each other to regulate target gene transcription. MYB/bHLH interactions are important in regulating anthocyanin and proanthocyanidin biosynthesis, trichome development, and other developmental processes (Baudry et al., 2004; Goff et al., 1992; Payne et al., 2000 and references therein). Among the genes we identified, MYB43 and MYB20 are closely related based on phylogenetic reconstructions (Stracke et al., 2001). As we can exclude cross-hybridization based on the oligo sequences used as probes, their co-expression suggests conserved and possibly redundant functions of these genes. In contrast, MYB63 has been placed in a different phylogenetic cluster (Stracke et al., 2001), and its closest relative, MYB58, is not differentially expressed in stems or roots. No information is available regarding the potential functions of these genes, or of the co-expressed bHLH genes At1g29950 (AtbHLH144) and At4g29100 (AtbHLH068), but it is interesting that MYB43, MYB63, and AtbHLH068 are at least twofold more highly expressed in developing Arabidopsis secondary xylem relative to bark (Oh et al., 2003; Table S2).
Included among the transcription factor candidate genes is the AP2-EREBP gene At5g07580. This class of transcription factors is unique to plants, and those members for which functions are known play regulatory roles in various developmental and defense-related processes (Riechmann and Meyerowitz, 1998). Within the large AP2-EREBP family, the closest relative of At5g07580 is At5g61590; these two genes are 71% identical, and could thus easily be distinguished by gene-specific probes on our array. Both genes were placed in expression cluster 4, but only At5g07580 was also placed in stele-specific LED 1, while At5g61590 is not associated with an LED in roots (Birnbaum et al., 2003). Therefore, both genes could have similar functions in stems, while no redundancy is expected in roots. The candidate gene KNAT7 (At1g62990), which belongs to class II of the KNOX-like homeobox genes (Serikawa et al., 1996), shares high sequence similarity with KNAT4 (At5g11060) and KNAT3 (At5g25220), neither of which was differentially expressed in our experiment or placed in a root LED expression category by Birnbaum et al. (2003). In comparison with class I KNOX genes, which play roles in regulating cell fate and meristem indeterminacy (Tsiantis, 2001), little is known about the class II subgroup, and to our knowledge no function has yet been assigned to any of its members. The Arabidopsis class I KNOX gene KNAT1/BREVIPEDICELLUS (BP) plays a role in regulating interfascicular fiber differentiation and lignin deposition, in addition to maintenance of meristem indeterminacy (Mele et al., 2003), and the expression of a putative ortholog was correlated with secondary wall formation in a poplar microarray experiment (Hertzberg et al., 2001). Unfortunately, BP was not represented on our array, but these results highlight the possible roles of KNAT genes in general and KNAT7 in particular in fiber differentiation.
Also represented among the nine candidate transcription factor genes is bZIP9 (At5g24880), an uncharacterized member of the basic leucine zipper motif (bZIP) transcription factor class. bZIP transcription factors form homodimers and heterodimers that play important roles in regulating defense responses and development (Jakoby et al., 2002). Interestingly, a second bZIP gene, TGA1 (At5g65210), was placed in expression category 4 and in root LED 7 (upregulated in all root tissues over development). TGA1, together with other TGA partner proteins, is known to play a key role in defense signaling by interaction with the regulatory protein NPR1 (Després et al., 2003), and is also more than twofold more highly expressed in Arabidopsis secondary xylem relative to bark (Oh et al., 2003). The final candidate gene in this set is At5g42200, a member of the C3H ring zinc finger family, whose functions are poorly characterized. Interestingly, another C3H transcription factor, At4g26580, was also placed in our expression category 4, but was not represented on the array used by Birnbaum et al. (2003). This gene is highly upregulated in Arabidopsis secondary xylem (Oh et al., 2003), and a putative poplar At4g26580 ortholog is strongly upregulated in association with the final stages of secondary xylem development in poplar (Hertzberg et al., 2001; Table S2). Thus, these C3H genes are interesting candidates for further functional analyses.
We chose to filter our transcription factor candidate genes against the root expression data set of Birnbaum et al. (2003) as this is one of the most comprehensive and detailed microarray data sets in the literature. However, this would exclude candidate regulators that are stem or interfascicular fiber cell-specific or that regulate late stages of vascular differentiation, given that the root tissues analyzed by Birnbaum et al. (2003) likely do not contain cells with secondarily thickened walls. We therefore also filtered against other published microarray experiments, including Arabidopsis secondary xylem (Oh et al., 2003), weight-induced secondary xylem formation in Arabidopsis (Ko et al., 2004), Zinnia tracheary element trans-differentiation (Demura et al., 2002), and developing poplar xylem (Hertzberg et al., 2001). These results, highlighted in gray in Table S2, identified several other candidates among our set of transcription factors in expression categories 4, 5, and 8, including those in MYB, bZIP, C3H, AP2-EREBP, and other classes (Table S2), that may play important roles in stem fiber differentiation, but are not correlated with stele development.
In summary, by carefully comparing our data set with those of other related experiments, we were able to identify a total of 19 putative transcription factors that are upregulated in our analysis and also in at least two other global expression profiling experiments. Although each experiment had a different focus, in all cases samples enriched for cells undergoing secondary cell wall biosynthesis were compared with samples that are mainly characterized by cells with primary walls. Therefore, we believe that these genes are strong candidates for transcriptional regulators of secondary cell wall formation, and are worthy of further investigation.
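The cross-experiment filter summarized above amounts to counting, for each transcription factor gene upregulated in our analysis, how many other data sets also report it as upregulated, and retaining those supported by at least two. A minimal sketch of that logic follows. The gene identifiers and data-set memberships are made-up placeholders, not values from Table S2, and the function name is hypothetical.

```python
from collections import Counter


def supported_candidates(own_up, other_sets, min_support=2):
    """Genes in own_up that are also upregulated in >= min_support other data sets."""
    own = set(own_up)
    # Count, per gene, the number of external data sets that contain it.
    counts = Counter(g for genes in other_sets.values() for g in set(genes) & own)
    return sorted(g for g, n in counts.items() if n >= min_support)


# Placeholder inputs for illustration only (not real results)
own_up = {"geneA", "geneB", "geneC", "geneD"}
other_sets = {
    "root_stele": {"geneA", "geneB"},
    "secondary_xylem": {"geneA", "geneC"},
    "poplar_xylem": {"geneA", "geneC", "geneX"},
}

candidates = supported_candidates(own_up, other_sets)
print(candidates)
```

With these placeholder inputs, geneA (three data sets) and geneC (two) are retained, while geneB (one) and geneD (none) are dropped. Raising `min_support` makes the filter more conservative.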