Differentiation of xylem cell walls has been well studied at the anatomical and biochemical levels. Before the advent of genomics in plants, only a few genes involved in xylem differentiation had been identified and characterized, and most were for enzymes involved in lignin biosynthesis (for a recent review see Boerjan et al., 2003). This work led to a better understanding of lignin metabolism and, through the use of genetic engineering, to the production of wood with potentially improved properties for paper production (Pilate et al., 2002).
EST sequencing projects on tension wood
The first genomics studies undertaken on trees, through EST sequencing, were focused on wood formation in pine (Allona et al., 1998) and poplar (Sterky et al., 1998). Since then a number of other EST sequencing projects have been initiated on wood, as well as on other tree tissues. In addition to genomics studies, which focused on ‘regular’ wood formation, there is considerable potential in the use of tension wood as a model for wood formation.
In poplar, tension wood can easily be induced by inclining the stem, so that the upper side (with tension wood) and the lower side (without tension wood) of the bent stem can be directly compared by gene expression profiling. With such a model we may expect to identify genes linked to G-layer-specific features, such as its reduced lignin content, its increased cellulose crystallinity, or its reduced MFA. We anticipate that studies on this model will lead to the identification and characterization of specific genes that determine particular mechanical properties of wood.
Genomics studies on reaction wood have already been initiated with loblolly pine (Whetten et al., 2001) and poplar. In Sweden (Umeå Plant Science Centre and KTH-Royal Institute of Technology), 5723 ESTs have been produced from a Populus tremula × P. tremuloides tension wood cDNA library (Sterky et al., 2004). In another project launched in our institute (INRA-Orléans), four different cDNA libraries were prepared from the xylem collected from three P. tremula × P. alba trees, induced to form tension wood by gravitational stimulation (for details see Déjardin et al., 2004). The first cDNA library corresponds to genes expressed in the cambial zone (CZ), which includes cambium and very young xylem cells. The second cDNA library was prepared from the differentiating xylem collected on the tension wood side (DX-TW), and the third from the differentiating xylem collected on the opposite side (DX-OW). The fourth cDNA library corresponds to the genes expressed in the mature xylem (MX) collected on either side of the stem. From the clustering, alignment and annotation of more than 10 000 ESTs, it appeared that a number of expressed sequences did not show any homology with the sequences present in the public databases, suggesting that they may be specific to trees. The 11 consensus sequences containing more than 10 ESTs, whose sequences gave no hit in the Swiss-Prot, EMBL, GenBank, DDBJ and PDB databases, are listed in Table 1. However, using more specific motif searches, four consensus sequences (Table 1, italics) had similarities with arabinogalactan proteins (AGPs), a class of hydroxyproline-rich proteins with poorly defined functions in cell wall formation. Nonetheless, these 11 sequences are likely to correspond to tree- or taxon-specific genes.
Table 1. Consensus sequences containing >10 ESTs and having no homology with sequences in public databases (Swiss-Prot, EMBL, GenBank, DDBJ, PDB)
|Consensus ID (cluster ID)||Number of ESTs||Consensus length (bp)||polyA+ tail detected||Putative ORF position|
|1694 (1235)||32|| 940||Yes||120–629|
|1656 (1217)||28|| 688||Yes|| |
|1970 (1411)||25|| 521||Yes|| 59–259|
|1843 (1333)||24|| 603||No||176–355|
|1618 (1201)||22|| 996||No||157–609|
|1447 (1118)||17|| 647||Yes|| 49–438|
|1621 (1204)||16|| 855||Yes||289–633|
|1751 (1276)||13|| 559||Yes||149–343|
|1616 (1201)||11|| 909||No||110–544|
|1484 (1134)||11|| 413||Yes||148–318 or 72–257|
Functional genomics approaches, using bioinformatics motif searches, expression studies and characterization of transgenic poplars altered in the expression of these genes, will help to elucidate the function of these genes in wood formation and to determine whether they are indeed functionally specific to wood formation in trees.
For any given gene, the frequency of ESTs is representative of the mRNA abundance in the tissue used to construct the cDNA library, provided that sequencing has been done on randomly picked clones. Therefore the comparative analysis of EST distribution in the different wood cDNA libraries reflects the expression level of the corresponding genes (Audic & Claverie, 1997). Our analysis performed on the four cDNA libraries prepared from the wood of bent trees revealed a substantial shift in EST distribution between the CZ and xylem libraries, which is probably linked to the switch from cell expansion to the building of secondary cell walls. Twenty EST clusters appeared over-represented in the CZ library (Table 2). These clusters correspond mainly to genes involved in protein synthesis and cell fate (nine different ribosomal proteins, elongation factor 1-alpha, ubiquitin, peptidyl-prolyl cis–trans isomerase), and are expected to be expressed in young dividing cells. Two other abundant clusters, unique to the CZ library (Table 3), do not show any homology with the sequences available in the public databases (other than poplar EST sequences). Likewise, 18 clusters seem to be present only in the xylem libraries (Table 3), and five others are preferentially found (at a statistically significant level) in the xylem libraries (Table 2). Most of these clusters correspond to genes required for building the secondary cell wall: cellulose synthesis (cellulose synthase, sucrose synthase); methylation of lignin precursors (S-adenosylmethionine synthetase, caffeic acid 3-O-methyltransferase, caffeoyl-CoA O-methyltransferase); and microtubule cytoskeleton organization (two clusters of α-tubulin). In addition, four clusters correspond to AGP genes, three others to hypothetical proteins, and two others do not have any homology with the sequences in the public databases.
Table 2. Clusters that are differentially represented between the cambial zone (CZ) and xylem (X) libraries
|Cluster ID||Putative function||blast score||E value||ESTs in CZ||ESTs in X||CZ/X|
|Total number of ESTs|| || || ||2090||7731|| |
|984||Metallothionein-like protein (Swiss-Prot Q96386)|| 94||e-19|| 7|| 1||25.9***|
|1266||Wound-responsive protein (TrEMBL Q42482)||227||5e-59|| 6|| 1||22.2***|
|1211||60S ribosomal protein L9 (Swiss-Prot P30707)||321||8e-88|| 10|| 2||18.5**|
|1221||60S ribosomal protein L15 (Swiss-Prot O82258)||335||5e-92|| 7|| 2||12.9***|
|1141||60S ribosomal protein L38 (Swiss-Prot O22860)||133||2e-31|| 6|| 2||11.1***|
|1213||Peptidyl-prolyl cis–trans isomerase (Swiss-Prot Q39613)||294||e-79|| 7|| 3|| 8.6***|
|1205-1||40S ribosomal protein S19 (Swiss-Prot Q9FNP8)||265||e-70|| 9|| 4|| 8.3**|
|1139||Unknown protein (Swiss-Prot P93384)||190||e-47|| 6|| 3|| 7.4***|
|1253||Peroxidase precursor (Swiss-Prot Q9SB81)||534||e-151|| 9|| 5|| 6.7**|
|1305||Nucleoid DNA-binding-like protein (Swiss-Prot Q9FL43)||409||e-150|| 7|| 4|| 6.5**|
|1126-1||Elongation factor 1-alpha (Swiss-Prot O49169)||850||0|| 22|| 13|| 6.3**|
|1262||40S ribosomal protein S9 (Swiss-Prot P52810)||264||2e-70|| 6|| 4|| 5.5**|
|1122||40S ribosomal protein S26 (Swiss-Prot P49206)||134||2e-31|| 8|| 6|| 4.6*|
|1231||60S ribosomal protein L12 (Swiss-Prot O50003)||300||2e-81|| 6|| 6|| 3.7*|
|1178||Tonoplast intrinsic protein, gamma (Swiss-Prot P25818)||356||5e-98|| 8|| 9|| 3.3*|
|1224||Ubiquitin (Swiss-Prot P03993)||150||2e-36|| 8|| 9|| 3.3*|
|1163||60S ribosomal protein L10a (Swiss-Prot P59230)||329||6e-90|| 7|| 8|| 3.2*|
|1206||60S ribosomal protein L30 (Swiss-Prot Q9M5M6)||202||4e-52|| 7|| 10|| 2.6*|
|1123||Calmodulin (Swiss-Prot P59220)||296||5e-80|| 9|| 13|| 2.6*|
|1142-2||Glyceraldehyde 3-phosphate dehydrogenase, cytosolic (Swiss-Prot P26518)||573||e-163|| 11|| 17|| 2.4*|
|1144||S-adenosylmethionine synthetase (Swiss-Prot Q96551)||540||0|| 5|| 47|| 0.4*|
|1270||No hits|| || || 2|| 34|| 0.2*|
|1254||Similar to pollen-specific protein SF3 (Swiss-Prot P29675)||282||8e-76|| 1|| 22|| 0.2*|
|1411||No hits|| || || 1|| 26|| 0.1*|
|1187||Tubulin alpha chain (Swiss-Prot P33629)||835||0|| 1|| 27|| 0.1*|
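The CZ/X column of Table 2 appears to be consistent with a ratio of EST frequencies, each count being divided by the total number of ESTs in its library before the two are compared. The source does not state the formula explicitly, so the following Python sketch is only an illustration under that assumption; the function name `normalized_ratio` is hypothetical, while the counts and library totals are taken from Table 2.

```python
# Assumption: the ratio column is (count_a / total_a) / (count_b / total_b).
def normalized_ratio(count_a, total_a, count_b, total_b):
    """Ratio of EST frequencies between two libraries of different sizes."""
    freq_a = count_a / total_a
    freq_b = count_b / total_b
    if freq_b == 0:
        # No EST in the second library: the ratio is unbounded.
        return float("inf")
    return freq_a / freq_b

# Library sizes from the "Total number of ESTs" row of Table 2.
CZ_TOTAL, X_TOTAL = 2090, 7731

# (ESTs in CZ, ESTs in X) for three clusters of Table 2.
examples = {"984": (7, 1), "1126-1": (22, 13), "1187": (1, 27)}

for cluster, (cz, x) in examples.items():
    print(cluster, round(normalized_ratio(cz, CZ_TOTAL, x, X_TOTAL), 1))
# prints:
# 984 25.9
# 1126-1 6.3
# 1187 0.1
```

Under this assumption, the three values agree with the CZ/X entries listed for these clusters. The normalization matters here because the xylem libraries together contain almost four times as many ESTs as the CZ library, so raw counts cannot be compared directly.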
Table 3. EST clusters unique to the cambial zone (CZ) or xylem (X) libraries
|Cluster ID||Putative function||blast score||E value||Number of ESTs||Significance|
|1118||No hits|| || ||21||***|
|1195||Gamma thionin homologue (Swiss-Prot Q9ZUL7)|| 87||1e-17||16||***|
|1134||No hits|| || || 9||**|
|1592||Arabinogalactan protein (TrEMBL Q9SP59)||184||9e-46||96||***|
|1588||Arabinogalactan protein (TrEMBL Q9SP59)||146||9e-52||59||***|
|1399||Sucrose synthase (Swiss-Prot P13708)||866||0||54||***|
|1445||Endochitinase precursor (Swiss-Prot Q09023)||191||3e-48||53||***|
|1549||Arabinogalactan protein (TrEMBL Q9SP59)||210||2e-53||50||***|
|1400||Isoflavone reductase homologue (Swiss-Prot P52577)||454||e-127||43||***|
|1469||Blue copper protein precursor (Swiss-Prot Q41001)||159||5e-39||37||***|
|1409||Caffeoyl-CoA O-methyltransferase (Swiss-Prot Q43095)||499||e-141||36||***|
|1382||Tubulin alpha chain (Swiss-Prot P46259)||852||0||36||***|
|1132-1||Caffeic acid 3-O-methyltransferase (Swiss-Prot Q00763)||743||0||26||**|
|1555||Arabinogalactan protein-like (TrEMBL Q9SP59)||167||9e-41||22||**|
|1392||Cellulose synthase (Swiss-Prot O81368)||407||e-112||19||*|
|1390||Plasmamembrane intrinsic protein (Swiss-Prot Q08733)||520||e-147||18||*|
|1438||Hypothetical protein (TrEMBL Q9FH92)||119||2e-26||18||*|
|1407||B12D-like protein (TrEMBL Q9XHD5)||143||e-33||16||*|
|1380||Fructokinase (Swiss-Prot P37829)||513||e-145||16||*|
|1386||Hypothetical protein (TrEMBL Q8L9M6)||146||e-34||15||*|
|1056||Hypothetical protein (TrEMBL Q8LBY9)||157||6e-38||15||*|
The comparative analysis of EST distribution in the cDNA libraries prepared from the xylem tissues with (DX-TW) or without G-fibres (DX-OW) revealed that the ESTs from three clusters were more abundant in DX-OW. In contrast, nine clusters were significantly over-represented in the tension wood (Table 4), four of which were not represented at all in the DX-OW cDNA library. Among the clusters over-represented in DX-TW, two corresponded to enzymes involved in sugar/cellulose metabolism (fructokinase and sucrose synthase); one was a putative transcription factor similar to the pollen-specific protein SF3; and five clusters corresponded to AGPs.
Table 4. EST clusters differentially represented between the cDNA libraries prepared from tension wood (DX-TW) or opposite wood (DX-OW) differentiating xylem
|Cluster ID||Putative function||blast score||E value||ESTs in DX-TW||ESTs in DX-OW||DX-TW/ DX-OW|
|Total number of ESTs|| || || ||2810||2562|| |
|1592||Arabinogalactan protein-like (TrEMBL Q9SP59)||184||9e-46|| 46|| 0||∞***|
|1588||Arabinogalactan protein-like (TrEMBL Q9SP59)||146||9e-52|| 30|| 0||∞***|
|1589||Hypothetical protein (GenBank AP006090)||315||e-83|| 10|| 0||∞**|
|1620||Arabinogalactan protein-like (TrEMBL Q9SP59)||253||3e-66|| 6|| 0||∞*|
|1549||Arabinogalactan protein-like (TrEMBL Q9SP59)||210||2e-53|| 25|| 1||22.8***|
|1555||Arabinogalactan protein-like (TrEMBL Q9SP59)||167||9e-41|| 10|| 1||9.1*|
|1380||Fructokinase (Swiss-Prot P37829)||513||e-145|| 12|| 2||5.5*|
|1254||Similar to pollen-specific protein SF3 (Swiss-Prot P29675)||282||8e-76|| 11|| 2||5.0*|
|1399||Sucrose synthase (Swiss-Prot P13708)||866||0|| 31|| 9||3.1**|
|1409||Caffeoyl-CoA O-methyltransferase (Swiss-Prot Q43095)||499||e-141|| 8|| 20||0.4*|
|1126-1||Elongation factor 1-alpha (Swiss-Prot O49169)||850||0|| 2|| 10||0.2*|
|1460||Acyl carrier protein, mitochondrial precursor (Swiss-Prot P53665)||165||5e-41|| 0|| 5||0*|
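The text cites Audic & Claverie (1997) for the comparison of EST counts between libraries, but does not spell out how the significance levels in the tables were obtained. As a hedged illustration only, the sketch below assumes the conditional probability published by Audic & Claverie, p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1+N2/N1)^(x+y+1)], where x and y are the EST counts of one cluster in libraries of size N1 and N2. The function names and the one-sided tail definition are assumptions made for this example, not the authors' procedure.

```python
from math import exp, lgamma, log

def ac_log_prob(y, x, n1, n2):
    """Log of p(y | x): x ESTs seen in a library of n1, y in a library of n2."""
    rho = n2 / n1
    # Factorials are computed with lgamma to avoid overflow on large counts.
    return (y * log(rho)
            + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
            - (x + y + 1) * log(1 + rho))

def ac_tail_p(x, y, n1, n2):
    """Smaller of P(Y <= y | x) and P(Y >= y | x) (assumed one-sided tail)."""
    lower = sum(exp(ac_log_prob(k, x, n1, n2)) for k in range(y + 1))
    upper = 1.0 - lower + exp(ac_log_prob(y, x, n1, n2))
    return min(lower, upper)

# Library sizes from the "Total number of ESTs" row of Table 4.
TW_TOTAL, OW_TOTAL = 2810, 2562

# Cluster 1592: 46 ESTs in DX-TW, none in DX-OW.
print(ac_tail_p(46, 0, TW_TOTAL, OW_TOTAL))
# Cluster 1409: 8 ESTs in DX-TW, 20 in DX-OW.
print(ac_tail_p(8, 20, TW_TOTAL, OW_TOTAL))
```

A small tail probability indicates that the difference in counts is unlikely to arise from random sampling of clones alone. The thresholds behind the asterisks in the tables are not given in the text, so no mapping from these probabilities to `*`, `**` or `***` is attempted here.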
The analysis of EST distribution in the cDNA libraries prepared from different wood tissues provided information on gene expression during wood formation. However, this analysis was restricted to highly expressed genes, and cDNA microarrays are required to extend expression studies to genes with a weaker level of expression. Such a transcriptional study could be performed on regular trees (Hertzberg et al., 2001a); on transgenic trees with altered wood formation (Israelsson et al., 2003); as well as on trees induced to differentiate tension wood. Although there is still no publication reporting a global gene-expression study during tension wood formation, we are aware of several ongoing programs on this topic.
Complementary to EST sequencing and transcript profiling, proteomic analyses are necessary to reveal post-transcriptional regulation mechanisms. Several proteomic studies have been reported on the identification and characterization of xylem proteins in differentiating xylem (van der Mijnsbrugge et al., 2000). Two comparative studies performed on tension and normal wood detected several proteins, the amounts of which were altered in tension wood tissues (Baba et al., 2000; Plomion et al., 2003). Nevertheless, at present none of these proteins has been characterized.
What to do next with the genomics?
Xylem is a complex tissue with different cell types at different developmental stages, and with specific patterns of gene expression. cDNA microarrays have been adapted to the xylem tissues through the use of tangential cryosectioning (Uggla & Sundberg, 2002) coupled with a protocol for RNA purification from submilligram amounts of xylem tissue (Hertzberg et al., 2001b). This has made it possible to follow the expression pattern of 3000 different genes from cambium to maturing xylem (Hertzberg et al., 2001a). However, at the moment it is not possible to discriminate cell type-specific gene-expression patterns using global profiling studies. Hence genes with a remarkable expression pattern, identified through gene profiling, will need to be localized accurately in order to determine the cell type(s) in which they are expressed. Tools for such in-depth localization studies have been adapted, or are still in development, for poplar wood tissues: in situ hybridization (Regan et al., 1999; Wu et al., 2000; Hawkins et al., 2003); in situ PCR (Gray-Mitsumune et al., 2004); and immunocytochemistry (Samaj & Boudet, 2002). Although it requires the production of specific antibodies, the latter technique appears to be very informative because, combined with transmission electron microscopy, it shows where in the cell the protein is located. Alternatively, efforts are presently being made to separate different cell types (such as fusiform cells from ray cambial initials) through microdissection, in order to determine the gene-expression profile in each cell type (N. Goué, B. Sundberg and P. Label, personal communication).
Functional genomics aims to elucidate the function of proteins encoded by specific genes identified through EST sequencing and gene profiling. One way may be to find a characterized orthologue from Arabidopsis thaliana, in order to assess its function through a search in mutant databases (http://flagdb-genoplante-info.infobiogen.fr/projects/fst/; http://signal.salk.edu/cgi-bin/tdnaexpress; http://www.mpiz-koeln.mpg.de/GABI-Kat/) and to analyse the corresponding Arabidopsis mutant phenotype, if available. However, this approach is not valid for genes that are specific to trees, or for biological processes (such as tension wood formation) that do not occur in Arabidopsis. In these cases, the best alternative relies on the production of transgenic poplar trees altered in the expression of the target gene, and the fine assessment of their ability to differentiate tension wood.