Wood is a complex and highly variable tissue, the formation of which is developmentally and environmentally regulated. In reaction to gravitropic stimuli, angiosperm trees differentiate tension wood, a wood with specific anatomical, chemical and mechanical features. In poplar the most significant of these features is an additional layer that forms in the secondary wall of tension wood fibres. This layer is mainly constituted of cellulose microfibrils oriented nearly parallel to the fibre axis. Tension wood formation can be induced easily and strongly by bending the stem of a tree. Located at the upper side of the bent stem, tension wood can be compared with the wood located on its lower side. Therefore tension wood represents an excellent model for studying the formation of xylem cell walls. This review summarizes results recently obtained in the field of genomics on tension wood. In addition, we present an example of how the application of functional genomics to tension wood can help decipher the molecular mechanisms responsible for cell wall characteristics such as the orientation of cellulose microfibrils.
Poplar, a model tree for functional genomics of wood formation
Trees, which are large perennials, exhibit specific features including dormancy, phase change and wood formation. Wood is essential for tree growth and development because it provides mechanical support and allows long-distance conductance of water and nutrients from roots to developing leaves.
In order to establish a platform for functional genomics in a tree species, poplar has been subjected to extensive expressed sequence tag (EST) sequencing projects (Sterky et al., 2004). This sequencing effort has been followed by gene-profiling studies during tree growth (e.g. Hertzberg et al., 2001a), and by the development of projects for high-throughput production of transgenic poplar trees to assess the function of tree-specific genes (for detail see Wullschleger et al., 2002). An important part of this effort is devoted to wood formation. These different projects have provided much invaluable data for annotation of the recently sequenced poplar genome (Tuskan et al., 2004). This review focuses on the advantages of tension wood for functional genomics studies that will help decipher the molecular mechanisms responsible for the formation and properties of wood.
Wood is composed primarily of cell walls
Wood formation results from the cyclic activity of cambium (for review see Mellerowicz et al., 2001). Wood (or secondary xylem) is a complex tissue made of the successive stacking of growth rings, year after year.
In the cambial zone, fusiform initials give rise to vessels and fibres, whereas ray initials generate ray parenchyma cells. All xylem cells undergo cell expansion, with both elongation and radial enlargement, before deposition of the secondary cell wall. This cell wall thickens through the ordered deposition, in the different layers of the secondary cell wall, first of cellulose and hemicelluloses, then of lignins. Once fully differentiated, the vessel and fibre cells undergo programmed cell death, which leads to degradation of the cellular contents, leaving only the surrounding cell walls. As fibres and vessels account for the major part of the xylem cell population, wood is mostly made of the cell walls of dead cells.
The different layers of the cell wall can be seen as arrays of cellulose microfibrils coated with hemicelluloses. These microfibrils are randomly oriented in the primary wall but highly oriented in the secondary wall. The layers of the secondary wall display different cellulose microfibril angles (MFA). Variations in MFA are known to influence the cell wall's mechanical properties (Reiterer et al., 1999). Cell wall proteins, often highly glycosylated, are entangled within these arrangements. Subsequently, lignins are laid down, first in the cell corners and middle lamella, then in the primary and secondary cell walls. During xylem cell maturation, lignins and hemicelluloses fix the cellulose microfibrils together as a honeycomb structure in the secondary wall. Mechanically, the plant cell wall can be viewed as a composite material with rigid, stretch-resistant rods (the cellulose microfibrils) embedded in an amorphous, compression-resistant matrix (Fournier et al., 1994).
Tension wood compared with normal wood structure
During differentiation, each wood cell type undergoes specific anatomical changes that affect its shape and size. In addition, the thickness, composition and structure of xylem cell walls vary with external factors. This is why wood is such a highly variable material whose properties depend on both (i) the heterogeneity in the amount and arrangement of the different xylem cell types; and (ii) the structure and chemical composition of the cell walls of these different cells. Different kinds of wood with specific anatomical, chemical and physical characteristics coexist within a single tree (for detailed review see Plomion et al., 2001).
Earlywood occurs in the spring, when the cambium is very active, whereas latewood is formed later in the growing season, when cambium is less active. Therefore wood density is lower in earlywood than in latewood.
Juvenile wood is formed during the rapid early growth of a tree. It is very different from mature wood with respect to density and cell wall composition.
Reaction wood develops in response to the perception of gravity and/or to mechanical stimuli induced by disruption of the natural position of the stem (or branches). The tree ‘reacts’ by bending the displaced stem/branch back to its original position. In gymnosperm trees this reaction wood is named ‘compression wood’ and develops at the lower side of leaning stems and branches; in angiosperm trees such as poplar it is named ‘tension wood’ and occurs at the upper side (Scurfield, 1973). It is often associated with stem eccentricity caused by stimulated cambial growth.
Tension wood differs from normal wood formed in the absence of stimulus, and from opposite wood located on the lower side of the inclined stem, in a number of biochemical, anatomical and mechanical characteristics. Mechanically, longitudinal maturation strains are larger in tension wood than in normal wood (Fournier et al., 1994). The frequency of vessels, and their porosity, are significantly lower in tension wood, whereas fibres and vessels are significantly longer (Jourez et al., 2001). However, in a number of tree species such as poplar the most striking modifications are found in the fibres of tension wood. In these fibres, named G-fibres, one layer of the secondary wall (generally the S3 layer) is replaced by a very thick and poorly lignified layer which is rich in crystalline cellulose (Timell, 1969; Fig. 1). The cellulose microfibrils in this layer are almost parallel to the fibre long axis, which probably contributes significantly to the specific mechanical properties of tension wood.
Tension wood formation and phytohormones
Phytohormones are key regulators in wood formation. The importance of auxin in wood formation (recently reviewed by Sundberg et al., 2000) was demonstrated a long time ago. In transgenic poplars with modified auxin metabolism, wood formation appeared altered (Tuominen et al., 1995). A radial concentration gradient of auxin in the cambial zone was later shown to be associated with the regulation of xylem development and, more particularly, with the duration of xylem fibre expansion (Tuominen et al., 1997). Similarly, tension wood has long been hypothesized to develop in response to an internal gradient of auxin (Wilson & Archer, 1977). However, a recent report demonstrated that tension wood formation was not linked to any alteration in the balance of endogenous auxin (Hellgren et al., 2004). Nevertheless, the expression of several genes from the Aux/IAA gene family, encoding potential mediators of the auxin signal transduction pathway, changed on induction of tension wood formation (Moyle et al., 2002). This is in agreement with the existence of interactions between translocation of the gravitational stimulus and the auxin signal transduction pathway, as proposed by Hellgren et al. (2004). Acting in synergy with auxin, gibberellins are known to stimulate meristematic activity and fibre elongation (Eriksson et al., 2000) but, so far, there are no data indicating the involvement of gibberellins in tension wood formation. Contrary to this, ethylene is likely to be involved, as its production is greatly increased in Eucalyptus on induction of tension wood (Nelson & Hillis, 1978). Moreover, the expression of the gene coding for 1-aminocyclopropane-1-carboxylate oxidase, responsible for ethylene production, is strongly induced during tension wood formation in poplar (Andersson-Gunnerås et al., 2003).
Genomic studies of tension wood
Differentiation of xylem cell walls has been well studied at the anatomical and biochemical levels. Before the commencement of genomics in plants, only a few genes involved in xylem differentiation had been identified and characterized, and most were for enzymes involved in lignin biosynthesis (for a recent review see Boerjan et al., 2003). This work led to a better understanding of lignin metabolism and, through the use of genetic engineering, to the production of wood with potentially improved properties for paper production (Pilate et al., 2002).
EST sequencing projects on tension wood
The first genomics studies undertaken on trees, through EST sequencing, were focused on wood formation in pine (Allona et al., 1998) and poplar (Sterky et al., 1998). Since then a number of other EST sequencing projects have been initiated on wood, as well as on other tree tissues. In addition to genomics studies, which focused on ‘regular’ wood formation, there is considerable potential in the use of tension wood as a model for wood formation.
In poplar, tension wood can easily be induced by inclining the stem; the upper side (with tension wood) and the lower side (without tension wood) of the bent stem can then be directly compared for gene expression profiling. With such a model we may expect to identify genes linked to G-layer-specific features, such as its reduced lignin content, its increased cellulose crystallinity, or its reduced MFA. We anticipate that studies on this model will lead to the identification and characterization of specific genes involved in the definition of particular mechanical properties of wood.
Genomics studies on reaction wood have already been initiated with loblolly pine (Whetten et al., 2001) and poplar. In Sweden (Umeå Plant Science Centre and KTH-Royal Institute of Technology), 5723 ESTs have been produced from a Populus tremula × P. tremuloides tension wood cDNA library (Sterky et al., 2004). In another project launched in our institute (INRA-Orléans), four different cDNA libraries were prepared from the xylem collected from three P. tremula × P. alba trees, induced to form tension wood on gravitational stimulation (for details see Déjardin et al., 2004). The first cDNA library corresponds to genes expressed in the cambial zone (CZ) which includes cambium and very young xylem cells. The second cDNA library was prepared from the differentiating xylem collected on the tension wood side (DX-TW); and the third from the differentiating xylem collected on the opposite side (DX-OW). The fourth cDNA library corresponds to the genes expressed in the mature xylem (MX) collected on either side of the stem. From the clustering, alignment and annotation of more than 10 000 ESTs, it appeared that a number of expressed sequences did not show any homology with the sequences present in the public databases, suggesting that they may be specific to trees. The 11 consensus sequences containing more than 10 ESTs, and giving no hit in the Swiss-Prot, EMBL, GenBank, DDBJ and PDB databases, are listed in Table 1. However, using more specific motif searches, four consensus sequences (Table 1, italics) had similarities with arabinogalactan proteins (AGPs), a class of hydroxyproline-rich proteins with poorly defined functions in cell wall formation. Nonetheless, these 11 sequences are likely to correspond to tree- or taxon-specific genes.
Table 1. Consensus sequences containing >10 ESTs and having no homology with sequences in public databases (Swiss-Prot, EMBL, GenBank, DDBJ, PDB)
Columns: Consensus ID (cluster ID); Number of ESTs; Consensus length (bp); polyA+ tail detected; Putative ORF position.
Note: Corresponding cluster ID in brackets. Open reading frames (ORFs) were detected using ORF Finder software (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). EST clusters potentially corresponding to arabinogalactan proteins according to specific motif searches are in italics.
Functional genomics approaches, using bioinformatics motif searches, expression studies and characterization of transgenic poplars altered in the expression of these genes, will help to elucidate the function of these genes in wood formation and to determine whether they are indeed functionally specific to wood formation in trees.
For any given gene, the frequency of ESTs is representative of the mRNA abundance in the tissue used to construct the cDNA library, provided the sequencing has been done on randomly picked clones. Therefore the comparative analysis of EST distribution in the different wood cDNA libraries reflects the expression level of the corresponding genes (Audic & Claverie, 1997). Our analysis performed on the four cDNA libraries prepared from the wood of bent trees revealed a substantial shift in EST distribution between the CZ and xylem libraries, which is probably linked to the switch from cell expansion to the building of secondary cell walls. Twenty EST clusters appeared over-represented in the CZ library (Table 2). These clusters correspond mainly to genes involved in protein synthesis and cell fate (nine different ribosomal proteins, elongation factor 1-alpha, ubiquitin, peptidyl-prolyl cis–trans isomerase), and are expected to be expressed in young dividing cells. Two other abundant clusters, unique to the CZ library (Table 3), do not show any homology with the sequences available in the public databases (other than poplar EST sequences). Likewise, 18 clusters seem to be present only in the xylem libraries (Table 3), and five others are preferentially found (at a statistically significant level) in the xylem libraries (Table 2). Most of these clusters correspond to genes required for building the secondary cell wall: cellulose synthesis (cellulose synthase, sucrose synthase); methylation of lignin precursors (S-adenosylmethionine synthetase, caffeic acid 3-O-methyltransferase, caffeoyl-CoA O-methyltransferase); and microtubule cytoskeleton organization (two clusters of α-tubulin). In addition, four clusters correspond to AGP genes, three others to hypothetical proteins, and two others do not have any homology with the sequences in the public databases.
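The comparison of EST counts between two libraries of unequal size can be illustrated with the conditional probability published by Audic & Claverie (1997). The following Python sketch is only an illustration under stated assumptions: the function names `ac_prob` and `ac_pvalue` are hypothetical, and the two-sided rule (twice the smaller tail) is an assumption made here, not necessarily the procedure used for Tables 2–4.

```python
from math import exp, lgamma, log


def ac_prob(x, y, n1, n2):
    """Audic & Claverie (1997) probability of observing y ESTs in library 2,
    given x ESTs in library 1, for library sizes n1 and n2.

    p(y|x) = (n2/n1)^y * (x+y)! / (x! * y! * (1 + n2/n1)^(x+y+1))
    Computed in log space to avoid overflow of the factorials.
    """
    r = n2 / n1
    log_p = (
        y * log(r)
        + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(1 + r)
    )
    return exp(log_p)


def ac_pvalue(x, y, n1, n2):
    """Two-sided p-value (assumption: twice the smaller tail, capped at 1)."""
    lower = sum(ac_prob(x, k, n1, n2) for k in range(y + 1))  # P(Y <= y)
    upper = 1.0 - lower + ac_prob(x, y, n1, n2)               # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))


# Hypothetical cluster: 20 ESTs in a 2090-EST library, none in a 7731-EST library.
p = ac_pvalue(20, 0, 2090, 7731)
```

Working in log space with `lgamma` keeps the calculation stable for abundant clusters, where the factorials would otherwise overflow.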
Table 2. Clusters that are differentially represented between the cambial zone (CZ) and xylem (X) libraries
Columns: ESTs in CZ; ESTs in X.
Note: Significance levels: ***, 0.1%; **, 1%; *, 5%. X groups the ESTs present in the cDNA libraries for differentiating xylem and mature xylem from tension and opposite wood. For each cluster, the significance level for the ratio of ESTs present in the X and CZ libraries was calculated according to the equation of Audic & Claverie (1997). BLAST score and E value refer to the sequence that gave the best hit with the consensus sequence. (Accession number for this sequence given in brackets.) CZ/X, enrichment in CZ compared with X = (% in CZ)/(% in X). This formula takes into account the difference in total EST number in the CZ and X libraries (2090 and 7731, respectively).
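The enrichment ratio defined in the note to Table 2 can be written out as a short calculation. This is a minimal Python sketch: the library totals come from the note, whereas the function name `enrichment` and the example counts are hypothetical and are not taken from the table.

```python
CZ_TOTAL = 2090  # total ESTs in the CZ library (from the note to Table 2)
X_TOTAL = 7731   # total ESTs in the pooled xylem libraries (from the note to Table 2)


def enrichment(count_a, count_b, total_a, total_b):
    """Return (% in library A) / (% in library B).

    Dividing each count by its library total corrects for the
    different sequencing depth of the two libraries.
    """
    pct_a = 100.0 * count_a / total_a
    pct_b = 100.0 * count_b / total_b
    if pct_b == 0:
        raise ValueError("cluster absent from library B: ratio undefined")
    return pct_a / pct_b


# Hypothetical cluster with 12 ESTs in CZ and 6 ESTs in X.
ratio = enrichment(12, 6, CZ_TOTAL, X_TOTAL)
```

With these hypothetical counts the raw ratio is only 2, but the normalized enrichment is several times larger because the CZ library is much smaller than the pooled xylem libraries.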
Table 3. EST clusters unique to the cambial zone (CZ) or xylem (X) libraries
Columns: Number of ESTs.
Note: Significance levels: ***, 0.1%; **, 1%; *, 5%. Clusters present only in the cambial zone cDNA library are listed in CZ; those present only in the differentiating and mature xylem cDNA libraries are listed in X. For each cluster the significance level for the ratio of ESTs present in the X and CZ libraries was calculated according to the equation of Audic & Claverie (1997). BLAST score and E value refer to the sequence which gave the best hit. (Accession number for this sequence given in brackets.)
The comparative analyses of EST distribution in the cDNA libraries prepared from the xylem tissues with (DX-TW) or without G-fibres (DX-OW) revealed that the ESTs from three clusters were more abundant in DX-OW. In contrast, nine clusters were significantly over-represented in the tension wood (Table 4), with four not represented in the DX-OW cDNA library. Among the clusters over-represented in DX-TW, two corresponded to the enzymes involved in sugar/cellulose metabolism (fructokinase and sucrose synthase); one was a putative transcription factor similar to the pollen-specific protein SF3; and five clusters corresponded to AGP.
Table 4. EST clusters differentially represented between the cDNA libraries prepared from tension wood (DX-TW) or opposite wood (DX-OW) differentiating xylem
Columns: ESTs in DX-TW; ESTs in DX-OW.
Note: Significance levels: ***, 0.1%; **, 1%; *, 5%. DX-TW/DX-OW: enrichment in DX-TW compared with DX-OW = (% in DX-TW)/(% in DX-OW). This formula takes into account the difference in total EST number between the DX-TW and DX-OW libraries (2810 and 2562, respectively). For each cluster the significance level for the ratio of ESTs present in the DX-TW and DX-OW libraries was calculated according to the equation of Audic & Claverie (1997). BLAST score and E value refer to the sequence which gave the best hit with the consensus sequence. (Accession number for this sequence given in brackets.)
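The star notation and the DX-TW/DX-OW ratio used in Table 4 can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, treating each threshold as inclusive is an assumption, and returning `None` for clusters with no EST in DX-OW is one possible way to handle the four clusters reported as absent from that library.

```python
DX_TW_TOTAL = 2810  # total ESTs in the DX-TW library (from the note to Table 4)
DX_OW_TOTAL = 2562  # total ESTs in the DX-OW library (from the note to Table 4)


def significance_stars(p_value):
    """Map a p-value to the star code used in the table notes.

    Thresholds: *** at 0.1%, ** at 1%, * at 5% (inclusive bounds assumed).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return ""


def tw_over_ow(tw_count, ow_count):
    """Return the DX-TW/DX-OW enrichment, or None if the cluster is absent from DX-OW."""
    if ow_count == 0:
        return None  # ratio undefined: the cluster is unique to DX-TW
    return (tw_count / DX_TW_TOTAL) / (ow_count / DX_OW_TOTAL)
```

Because the two differentiating-xylem libraries are of similar size, the normalized enrichment stays close to the raw count ratio, unlike the CZ versus X comparison.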
The analysis of EST distribution in the cDNA libraries prepared from different wood tissues provided information on gene expression during wood formation. However, this analysis was restricted to highly expressed genes, and cDNA microarrays are required to extend expression studies to genes with a weaker level of expression. Such a transcriptional study could be performed on regular trees (Hertzberg et al., 2001a); on transgenic trees with altered wood formation (Israelsson et al., 2003); as well as on trees induced to differentiate tension wood. Although there is still no publication reporting a global gene-expression study during tension wood formation, we are aware of several ongoing programs on this topic.
Complementary to EST sequencing and transcript profiling, proteomic analyses are necessary to reveal post-transcriptional regulation mechanisms. Several proteomic studies have been reported on the identification and characterization of xylem proteins in differentiating xylem (van der Mijnsbrugge et al., 2000). Two comparative studies performed on tension and normal wood detected several proteins, the amounts of which were altered in tension wood tissues (Baba et al., 2000; Plomion et al., 2003). Nevertheless, at present none of these proteins has been characterized.
What to do next with the genomics?
Xylem is a complex tissue with different cell types at different developmental stages, and with specific patterns of gene expression. cDNA microarrays have been adapted to the xylem tissues through the use of tangential cryosectioning (Uggla & Sundberg, 2002) coupled with a protocol for RNA purification from submilligram amounts of xylem tissue (Hertzberg et al., 2001b). This has made it possible to follow the expression pattern of 3000 different genes from cambium to maturing xylem (Hertzberg et al., 2001a). However, at the moment it is not possible to discriminate cell type-specific gene-expression patterns using global profiling studies. Hence genes with a remarkable expression pattern, identified through gene profiling, will need to be localized accurately in order to determine the cell type(s) where they are expressed. Tools for such in-depth localization studies have been adapted, or are still in development, on poplar wood tissues: in situ hybridization (Regan et al., 1999; Wu et al., 2000; Hawkins et al., 2003); in situ PCR (Gray-Mitsumune et al., 2004); immunocytochemistry (Samaj & Boudet, 2002). Although it requires the production of specific antibodies, the latter technique appears to be very informative because, combined with transmission electron microscopy, it makes it possible to determine where in the cell the protein is located. Alternatively, efforts are presently being made to separate different cell types (such as fusiform cells from ray cambial initials) through microdissection, in order to determine the gene-expression profile in each cell type (N. Goué, B. Sundberg and P. Label, personal communication).
Functional genomics aims to elucidate the function of proteins encoded by specific genes identified through EST sequencing and gene profiling. One way may be to find a characterized orthologue from Arabidopsis thaliana, in order to assess its function through a search in mutant databases (http://flagdb-genoplante-info.infobiogen.fr/projects/fst/; http://signal.salk.edu/cgi-bin/tdnaexpress; http://www.mpiz-koeln.mpg.de/GABI-Kat/) and to analyse the corresponding Arabidopsis mutant phenotype, if available. However, this approach is not valid for genes that are specific to trees, or for biological processes (such as tension wood formation) that do not occur in Arabidopsis. In these cases, the best alternative relies on the production of transgenic poplar trees altered in the expression of the target gene, and the fine assessment of their ability to differentiate tension wood.
Using tension wood to provide insights into control of cellulose microfibril orientation
Functional genomics applied to wood aims to identify key genes whose function is important for specific aspects of wood formation and wood properties. For example, using tension wood as a model, functional genomics may be useful to identify and characterize the molecular partners responsible for the control of cellulose MFA. Presently, genomics studies on tension wood have been limited to EST sequencing. Nevertheless, the comparative analysis of EST distribution in cDNA libraries prepared from wood with or without G-fibres, associated with the knowledge available on cellulose synthesis in other models, already enables us to identify candidate genes that will be investigated further for their function in tension wood.
Xylem cells perceive and react to a mechanical stress
Xylem cells are constantly subjected to mechanical stresses during their differentiation. Mechanical stresses are probably required to maintain the cambium structure (Brown, 1964). Later, during maturation, xylem cells contract in length and expand laterally. As this contraction is restrained by older wood cells, the newly formed cells generate a longitudinal tensile stress, while the obstruction of lateral expansion by neighbouring cells leads to a tangential compressive stress (Fournier et al., 1994). The generation of these stresses, named maturation stresses, is thought to be connected directly to the definition of wood mechanical properties. But how does the xylem cell perceive and react to the mechanical stresses? In tension wood fibres it is likely that this occurs at the interface between the plasma membrane and the inner part of the G-layer. As shown in Fig. 2, in G-fibres the plasma membrane appears to be invaginated in the periplasmic area next to the internal side of the G-layer. The function of these finger-like structures remains to be elucidated, and a potential link with G-layer formation is still to be determined.
Communication between cytoskeleton and cell wall: the candidates
Plant cell walls are highly dynamic structures able to respond to both internal and environmental stimuli (Wojtaszek, 2000). It is likely that the cell response to mechanical stimuli goes through cross-talk between the cytoskeleton and the plant cell wall (Baluska et al., 2003). More precisely, a bidirectional flow of information probably exists between the cortical microtubules (a major component of the cytoskeleton made of heterodimers of α- and β-tubulins) and the cellulose microfibrils, which are known to provide spatial cues for the internal organization of microtubules (Fisher & Cyr, 1998). Microtubules are thought to direct the formation of an oriented scaffold on the plasma membrane that further binds nascent cellulose microfibrils (Baskin, 2001). In poplar wood fibres, numerous parallel and obliquely oriented microtubules appear linked to the plasma membrane once fibres have ceased their elongation and secondary cell wall layers are being deposited. In tension wood G-fibres, these microtubules are arranged axially within the cells and run approximately parallel to the cellulose microfibrils (Chaffey et al., 2002). A number of α- and β-tubulin genes expressed during wood formation have been identified through EST sequencing. Interestingly, two α-tubulin clusters appear either unique (#1382) or preferentially represented (#1187) in the xylem libraries (Tables 2 and 3). However, their distribution in the cDNA libraries does not reveal any regulation during G-fibre differentiation.
Other studies based on Arabidopsis cell wall mutants indicate that both a katanin- and a kinesin-like protein are likely to be involved in microtubule control of the cellulose microfibril orientation (Burk & Ye, 2002; Zhong et al., 2002). Interestingly, gibberellins appear to act on microtubule organization through modulating the katanin level (Bouquin et al., 2003). Several ESTs from poplar differentiating xylem cDNA libraries have been annotated as putative kinesin, but again their distribution does not appear to be linked to tension wood formation.
Furthermore, Baluska et al. (2003) proposed a model where the local mechanical properties of the cell wall are sensed by some plasma membrane-spanning linker molecules that accumulate at the adhesion sites between cytoskeleton and plasma membrane. These linker molecules process this information further and signal it via diverse signal-transducing molecules associated with the dynamic cytoskeleton, further down into the cytoplasm and toward the nucleus. This signalling pathway orchestrates diverse cellular activities in response to the physical properties of the cell wall.
Several potential linker molecules have been proposed: integrins, wall-associated kinases (WAKs), cellulose synthases and AGP (Kohorn, 2000).
Integrins are plasma membrane proteins that associate the cell wall to the cytoskeleton. In animal cells they have been shown to be part of the signalling cascades involved in both ‘inside-to-out’ and ‘outside-to-in’ signalling. In plants, integrin-like molecules have been proposed to participate in a gravitropic signalling pathway (Katembe et al., 1997).
Wall-associated kinases have a cytoplasmic serine/threonine kinase domain, span the plasma membrane, and extend a domain into the cell wall that may bind to pectins. These kinases physically link the plasma membrane to the carbohydrate matrix, but they also have the potential to trigger cellular events directly through their kinase domain. In vitro studies and yeast two-hybrid assays revealed interactions between a WAK and a cell wall glycine-rich protein (Park et al., 2001). However, genes encoding either integrin-like proteins or WAKs seem only poorly represented in poplar wood cDNA libraries.
Cellulose synthase ESTs originated mostly from the mature xylem library. They have been grouped into five different clusters, although most fall into a single cluster. Wu et al. (2000) described a xylem-specific cellulose synthase gene from aspen that is responsive to tensile stress. Sucrose synthase, another enzyme involved in the cellulose biosynthesis complex, is over-represented in the cDNA library prepared from the tension wood differentiating xylem (Déjardin et al., 2004; Table 3). However, there is no evidence for the possible involvement of this plasma membrane-bound enzyme in communication with the cytoskeleton.
Arabinogalactan proteins are highly glycosylated proteins that have both adhesive and signalling properties. A number of different AGP-encoding genes have been found in the poplar differentiating xylem libraries (Déjardin et al., 2004). Their expression appeared specifically linked to secondary cell wall deposition, as no AGP was found in the cambial library. Several AGP clusters were over-represented or even specifically regulated in the tension wood differentiating xylem (Table 4). Bioinformatic analyses indicated that these AGPs belong to the classical family of AGPs. These AGPs share a fasciclin-like domain with predicted adhesion properties. A number are predicted to carry a carboxy-terminal glycosylphosphatidylinositol anchor that keeps them in tight association with the plasma membrane (Schultz et al., 2002). This anchor can be cleaved by phospholipase C in a signalling-mediated manner, leading to a controlled release of AGPs from the plasma membrane to the cell walls.
With regard to the genomics studies performed on poplar tension wood, several specific AGPs appear to be good candidates as molecular partners involved in the cross-talk between cell wall and cytoskeleton that leads to G-fibre differentiation. Functional studies are needed to evaluate these different candidate genes; in particular, the precise location in xylem cells of the corresponding proteins requires in-depth immunocytochemical methods, whereas assessment of their function during tension wood formation will necessitate the production of transgenic trees specifically altered in the expression of these candidate genes and their analysis at the chemical, anatomical and mechanical levels.
The development of functional genomics on tension wood will help identify the links between gene expression, cell wall assembly, wood anatomy and, ultimately, wood properties. This should greatly improve our knowledge on the differentiation of G-fibres and, more generally, on the mechanisms underlying wood formation.
Some data have been produced within the INRA AIP Lignome program. SEM photos were taken at the Centre de Microscopie Electronique de l’Université Claude Bernard Lyon I. The authors are indebted to Dr Krystyna Klimaszewska for correcting the English.
Appendix A1: Supplementary data for Tables 1–4. For each consensus ID from Table 1: (i) list of corresponding ESTs with their GenBank accession numbers; (ii) consensus sequence. For each cluster ID from Tables 2–4: (i) list of corresponding ESTs with their GenBank accession numbers; (ii) consensus sequence for the consensus that groups the largest number of ESTs (from which gene annotation was done).