Towards deciphering phloem: a transcriptome analysis of the phloem of Apium graveolens


For correspondence (fax +33 1 30 83 30 99; e-mail


Events occurring in the phloem tissue are key to understanding a wide range of developmental and physiological processes in vascular plants. While a considerable amount of molecular information on phloem proteins has emerged in the past decade, a unified picture of the molecular mechanisms involved in phloem differentiation and function is still lacking. New models to increase our understanding of this complex tissue can be created by the development of global approaches such as genomic analysis. In order to obtain a comprehensive overview of the molecular biology of the phloem tissue, we developed a genomic approach using Apium graveolens as a model. cDNA libraries were constructed from mRNAs extracted from isolated phloem of petioles. Expression data obtained from the analysis of 989 expressed sequence tags (ESTs) and the transcript profile deduced from a cDNA macroarray of 1326 clones were combined to identify genes showing distinct expression patterns in the vascular tissues. Comparisons of expression profiles obtained from the phloem, xylem and storage parenchyma tissues uncovered tissue-specific differential expression patterns for given sets of genes. The major classes of mRNAs predominantly found in the phloem encode proteins related to phloem structure, metal homeostasis or distribution, stress responses and degradation or turnover of proteins. Of great interest for future studies are the genes we found to be specifically expressed in the phloem but for which the function is still unknown, and also those genes described in previous reports to be up- or downregulated by specific interactions. From a broader perspective, our results also clearly demonstrate that cDNA macroarray technology can be used to identify the key genes involved in various physiological and developmental processes in the phloem.


A key step in the evolution of multicellular terrestrial plants was the development of an efficient system for long-distance transport and supply of water, photoassimilates, organic and inorganic nutrients, and signalling compounds to the whole plant. Plant vascular systems, composed of the xylem and the phloem, share structural similarities throughout the plant taxa, ranging from very primitive vascular plants to the highly evolved angiosperms. The phloem is a complex tissue composed of several highly specialised cell types performing specific functions. The conducting cells of the phloem (sieve elements, SE) consist of files of living cells that share common structural features such as thickened lateral cell walls and perforated end walls (sieve plates) containing callose deposits. Several developmental features are also conserved among the vascular systems from different species, for example the selective degeneration of organelles and the formation of sieve pores, allowing unobstructed flow through the sieve tubes. In angiosperms, SEs form a complex with the ontogenetically related companion cells (CCs) derived either from the procambium in the primary phloem or from the vascular cambium in the secondary phloem (Esau, 1969). The enucleate SEs rely on symplasmically connected CCs for supply of the metabolites necessary for their activity (Oparka and Turgeon, 1999). In source and sink tissues, CCs are involved directly in dynamically controlling assimilate loading into or unloading from the SEs. A third cell type, phloem parenchyma cells, can also be found closely associated with the SE–CC complex. The thick-walled phloem parenchyma cells appear to have a structural and protective function, whereas others interact with the SE–CC complex to affect photoassimilate partitioning. 
These three cell types are characteristically small in size, and as a result, the phloem constitutes only a small fraction of the total plant tissues (less than 0.4% of the total volume in leaves) (Sjölund, 1997).

The phloem is a strategic control point for the interorgan exchanges of a wide range of molecules with varying size and biochemical properties. In seed plants, the phloem controls loading, long-distance transport, and unloading of sugars, amino acids, K+, polyols and hormones between distantly located organs. Apoplastic and symplastic models have been proposed for both the loading and the unloading of molecules from phloem SEs (Oparka and Turgeon, 1999; Schulz, 1997). However, it is the symplastic pathway in association with specialised plasmodesmata that controls the transport of macromolecules, such as proteins and RNAs, between CCs and SE (Ruiz-Medrano et al., 2001).

Several approaches have been taken to increase our understanding of the molecular biology of the phloem. Following one approach, proteins and RNAs have been identified from phloem exudates in species such as wheat, rice, castor bean and cucurbits (Hayashi et al., 2000). Alternatively, the analysis of transcript profiles associated with the formation of the secondary vascular system has been undertaken in wood models (Allona et al., 1998; Hertzberg et al., 2001a; Sterky et al., 1998) and, to some extent, in Arabidopsis (Beers and Zhao, 2001). These studies revealed tissue-specific transcript profiles in the xylem and cambial tissues and, to some extent, in the phloem (Hertzberg et al., 2001a). Although attempts have been made to develop alternative methods, which would allow the content of phloem cells to be isolated (Asano et al., 2002; Hertzberg et al., 2001b), comprehensive studies, conducted with the aim of identifying genes expressed in the phloem on a large scale, have not yet been carried out. In this study, we have adopted a genomic approach in order to investigate the molecular biology of the phloem. We used celery (Apium graveolens) as a model system because of the ease with which the phloem of this species can be separated from the surrounding tissues, a property that has been exploited previously in physiological and molecular studies of carbon metabolism and partitioning (Daie, 1987; Noiraud et al., 2001). Here, we describe the transcript profile of phloem-enriched tissue fractions using both expressed sequence tag (EST) sequencing and cDNA macroarray hybridisation. This approach constitutes the first step towards increasing our understanding of the role of the phloem in physiological and environmental response processes.


Characterisation of the phloem in celery petioles

In celery petioles, vascular bundles are organised into collateral phloem and xylem strands and distributed on the abaxial side of the leaf (Figure 1a). The ‘functioning phloem’, characterised by domains of typical SE and CC, merges on the outside with the ‘bundle cap’ composed of phloem parenchyma cells (Figure 1b,c). SE having typical nacreous wall deposits make up approximately half of the cells in the functioning phloem, the remainder being CC and non-functional SEs. Using a combination of transmission electron (Figure 1d) and optical microscopy, we determined the relative proportions of functioning phloem and bundle cap cells in petiole phloem at two stages of leaf development. In the petioles of emerging leaves, functioning phloem cells predominate and represent approximately 60% of the total phloem cells. During the period of active cell division, a thin vascular cambium appears between the phloem and xylem. The phloem parenchyma of the bundle cap then develops further and ultimately represents 63% of the cells in the phloem of mature leaves. These modifications are accompanied by a reduction in the relative proportion of SE from 30 to 15% of the phloem in mature leaves.

Figure 1.

Organisation of the vascular tissues in the petiole of celery (Apium graveolens cv. Dulce). Thin sections of petioles were stained with alum carmine and iodine green dyes before observation by epifluorescence on binocular light microscopy (a) or on laser confocal scanning microscopy (b, c).

(a) Transverse section of a petiole of a mature leaf of celery (×2.5).

(b) Details of the vascular strands (×10).

(c) Details of the phloem tissue (×40).

(d) Longitudinal section of a sieve tube element observed by transmission electron microscopy; SE, sieve element; CC, companion cell. The phloem tissue observed on these micrographs was used for the preparation of the cDNA libraries and complex probes.

Co, collenchyma; SP, storage parenchyma; VB, vascular bundle; OD, oil ducts; Pa, phloem parenchyma; X, xylem; P, phloem; Ca, cambium; FP, functioning phloem; BC, bundle cap.

Sequence analysis of randomly selected cDNAs

Phloem and xylem strands from detached petioles were separated from the surrounding tissue containing collenchyma, cortical and medullar storage parenchymas and epidermis (hereafter referred to collectively as ‘parenchymas’), and then phloem was separated from xylem and used to prepare phloem cDNA libraries. To increase mRNA diversity, cDNA libraries were prepared from mRNA purified from two pools of phloem: one isolated from newly emerging leaves (PJ library) and the other from mature leaves (PA library). These two stages differ in anatomy, in the proportion of functioning phloem to other cell types, and in physiology (non-photosynthetic versus photosynthetic; non-storage versus storage). A third cDNA library (SJ) was generated after the preparation was enriched for phloem-specific sequences by subtractive suppression hybridisation (SSH; Diatchenko et al., 1996) carried out between cDNA from the phloem and cDNA from the parenchymas of petioles of newly emerging leaves. Individual clones were randomly selected for cDNA sequencing and macroarray spotting. The average size of the cDNA population was 1.2 kb for PA and PJ libraries and 325 bp for the SJ library.

Single-pass sequences were obtained from 989 clones representing 359, 362 and 268 ESTs from PJ, PA and SJ cDNA libraries, respectively. A search was conducted to identify potential chimaeric clones and revealed the presence of 12 overlapping chimaeric cDNA sequences. The average size of the edited ESTs was 698 bp (PA and PJ) corresponding to the 5′ end and 325 bp (SJ) corresponding to the full-length cDNAs. Of the 989 ESTs, at least 20% of the sequences were redundant, yielding 793 contiguous sequences, of which 87% were singletons, 9.5% duplicates and 3% represented three or more times (maximum: 14 times). The ESTs were annotated using BLAST similarity searches that revealed significant matches (threshold E-value = 10⁻⁵) for 80% of the sequences. This value fell to 65% for SJ ESTs, probably reflecting the smaller size of the SJ cDNAs.
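The redundancy figure can be checked from the counts given above. The following is a minimal sketch of that arithmetic only; the variable names are illustrative and are not taken from the study.

```python
# Counts reported in the text.
total_ests = 359 + 362 + 268   # PJ + PA + SJ libraries
contigs = 793                  # non-redundant contiguous sequences

# Fraction of ESTs that repeat a sequence already represented once.
redundancy = 1 - contigs / total_ests

print(total_ests)            # 989
print(f"{redundancy:.1%}")   # 19.8%
```

The result, close to one EST in five, matches the roughly 20% redundancy quoted for the collection.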

The classification of ESTs into 13 main functional categories based on the MIPS standard showed a distribution close to that found in other plant tissue-specific EST collections (data not shown). However, the SJ library showed marked differences in the representation of genes from several functional classes (Table 1). Genes with functions related to photosynthesis, translational machinery, protein folding and the cytoskeleton were poorly represented. ESTs of genes encoding the structural phloem protein AgPP2-1, six metallothioneins, a protease inhibitor, a nodulin-like protein and several histones were abundantly represented. The abundance of ESTs encoding the phloem-specific lectin, AgPP2-1 (Dinant et al., 2003), suggested that other genes enriched by SSH could be preferentially expressed in the phloem. This hypothesis was tested further by cDNA macroarray analysis.

Table 1.  Representative partitioning of selected ESTs in the phloem subtracted cDNA library

Function                            SJ library (SSH), %   PJ library, %
Chlorophyll a/b-binding protein     0                     3.1
Ribosomal protein (40S and 60S)     1.5                   4.8
Cytoskeleton-related protein        0.8                   2.8
Phloem protein PP2-1                3.75                  0.3
Protease inhibitor (γ-thionin)      2                     0
Nodulin 21                          3.75                  0

Examples of the functional classes represented by phloem ESTs, which showed significant enrichment or under-representation in the SSH cDNA library (SJ). The SJ library was constructed from mRNA prepared from the phloem of petioles of newly emerging leaves after subtraction, with storage parenchyma mRNA. The comparison is made with the PJ phloem library prepared with the same mRNA population but not subtracted. The frequencies (%) were calculated taking into account the respective size of each cDNA library.

Expression profile of phloem genes by macroarray analysis

A set of nylon macroarrays was prepared with 1326 cDNA clones (599, PJ; 440, PA; and 287, SJ) arrayed in duplicate within the same square. As controls, 20 clones were spotted twice in random positions on the arrays along with 34 negative controls (water or cloning vectors). The expression levels in the phloem, xylem and parenchymas of celery petioles from mature leaves were monitored by hybridisation with complex tissue-specific probes. A total of five independent experiments were performed from five independent sets of plants. To minimise physiological variation, the complex probes for each experiment were generated from phloem, xylem and parenchyma RNA extracted from the same plant. Moreover, tissue isolation was always performed at the same time of day and for identical lengths of time. The reproducibility of the macroarray experiments in scoring differential gene expression was validated as described by Desprez et al. (1998). Special care was taken for normalisation in order to take into account bias introduced by the enrichment of tissue-specific sequences. Normalisation of the intensity values was performed using a set of 100 reference clones (essentially PA and PJ clones) selected empirically as described in Experimental procedures.
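The idea of normalising against a set of reference clones can be illustrated as follows. This is only a sketch: the exact procedure is the one given in Experimental procedures, and the use of the median as the scaling value, the function name `normalise` and the clone identifiers are all assumptions made for the example.

```python
from statistics import median

def normalise(intensities, reference_ids):
    """Scale one filter's spot intensities by the median of its reference clones.

    Median scaling is an assumption of this sketch, not the published method.
    """
    ref = median(intensities[c] for c in reference_ids)
    if ref <= 0:
        raise ValueError("reference intensity must be positive")
    return {clone: value / ref for clone, value in intensities.items()}

# Hypothetical raw intensities for one tissue probe (arbitrary units).
phloem_raw = {"ref1": 100.0, "ref2": 120.0, "ref3": 80.0, "cloneA": 400.0}
phloem_norm = normalise(phloem_raw, ["ref1", "ref2", "ref3"])
print(phloem_norm["cloneA"])  # 4.0
```

Scaling each filter by its own reference set puts the phloem, xylem and parenchyma hybridisations on a common footing before ratios between tissues are calculated.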

For a majority of genes, significant interexperiment variation was observed in the transcript profiles. This could be in part because of the multiple sources of variation inherent to the array technology. In addition, the plants were grown to maturity under greenhouse conditions, which could have resulted in differences in plant development or physiology between experiments. Other sources of variation could have arisen as a consequence of differences in the purity and conditions of the tissue separation. Analysis of variance (ANOVA) was used to assess the variability caused by the different parameters discussed above and to discriminate tissue-specific expression from these other sources of variation. Four factors were taken into account: gene, tissue, experiment (plant per day) and spotting reproducibility. The analysis showed that ‘experiment’ and ‘tissue’ were the factors primarily responsible for the observed variation in expression. A high F-value for experiment/tissue variation indicated that the two factors are interdependent (data not shown).

A local ANOVA was also performed for each gene, and the P-value was used to identify genes showing significant variation in their expression pattern based on the different factors that were tested. A threshold P-value of 0.05 was maintained throughout. Approximately two-thirds of the clones displayed a significant variation pattern that depended upon the experiment; this included clones showing some tissue specificity. One-third of the clones (430 cDNA clones out of 1346) displayed significant variation in their expression pattern that was found to be dependent on the tissue. These clones corresponded to a set of 359 non-redundant sequences. This statistical analysis resulted in the exclusion of some classes of potentially interesting genes from this study, for example the genes induced only occasionally in the phloem (data not shown). We focused our further analysis on genes with the most stable patterns of expression. For the set of genes that displayed a tissue-dependent expression pattern, the transcript levels in the phloem tissue were compared with those in parenchyma and xylem. The ratios from this analysis (phloem/parenchyma, phloem/xylem, xylem/parenchyma) were then clustered, and a dendrogram was generated that contained three main branches consisting of genes preferentially expressed in the parenchyma, phloem or xylem (Figure 2a).
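The per-gene test can be illustrated with a simplified, hypothetical example. The sketch below computes a one-way F statistic with tissue as the only factor, whereas the analysis described above also took experiment and spotting into account; the intensity values and the function name `tissue_f_statistic` are invented for illustration. The critical value 3.89 is the tabulated F for P = 0.05 with (2, 12) degrees of freedom, which corresponds to three tissues and five experiments.

```python
def tissue_f_statistic(groups):
    """One-way ANOVA F statistic for one gene, with tissue as the only factor.

    `groups` maps a tissue name to its normalised intensities,
    one value per independent experiment.
    """
    values = [v for g in groups.values() for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {t: sum(g) / len(g) for t, g in groups.items()}
    # Variation between tissue means and variation within each tissue.
    ss_between = sum(len(g) * (means[t] - grand_mean) ** 2 for t, g in groups.items())
    ss_within = sum((v - means[t]) ** 2 for t, g in groups.items() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical normalised intensities: five experiments per tissue.
gene = {
    "phloem":     [4.0, 4.5, 5.0, 5.5, 6.0],
    "xylem":      [1.0, 1.5, 2.0, 2.5, 3.0],
    "parenchyma": [1.0, 1.5, 2.0, 2.5, 3.0],
}
F_CRIT_2_12 = 3.89  # tabulated critical F, P = 0.05, (2, 12) degrees of freedom
f_value = tissue_f_statistic(gene)
print(f_value, f_value > F_CRIT_2_12)  # 24.0 True
```

A gene whose F statistic exceeds the critical value would be retained as showing tissue-dependent expression at the 0.05 threshold; the function assumes some within-tissue variation, since identical replicates would give a zero denominator.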

Figure 2.

Expression profile of genes showing tissue preferential expression.

(a) Clustering of the non-redundant gene expression patterns that show significant differential expression in vascular tissues (tissue P-value < 5 × 10⁻²). Each row corresponds to one clone spotted on the filter (i.e. one gene), and each column corresponds to the ratios calculated from normalised intensity values, which were determined for each experiment between two of the tissue samples (Ph, phloem; Xy, xylem; Pa, parenchyma). Each clone is duplicated on the array, and each spot is quantified independently. Ratios are represented with false colours as follows: dark colours (black) correspond to a ratio of 1, red colours indicate a ratio >1 (upregulated genes) and green colours indicate a ratio <1 (downregulated genes). The colour scale is indicated below the figure.

(b) Log2-transformed expression ratios of gene classes harbouring different tissue-dependent expression patterns. PhPa, ratio phloem/parenchyma; XyPa, ratio xylem/parenchyma; PaPa = reference = 1. Pa panel: representative set of genes preferentially expressed in the parenchymas (PhPa and XyPa < 1). V panel: representative set of genes preferentially expressed in the vascular tissue (PhPa and XyPa > 1) and with a similar level of expression in the phloem and in the xylem (Ph ≈ Xy). Xy panels: representative set of genes preferentially expressed in the xylem tissue, with only moderate (panel Xy1) or significant (panel Xy2) expression in the phloem. Ph panels: representative set of genes preferentially expressed in the phloem tissue, with only moderate (panel Ph1) or significant (panel Ph2) expression in the xylem.

Tissue differential expression

A majority of the clones (60% out of 359) appeared to be predominantly expressed in the vascular tissues. These included genes that clustered as being preferentially expressed in the phloem, xylem or both vascular tissues (Figure 2a). A number of genes that were preferentially expressed in the parenchymas showed moderate but significant variation (mean value for (Ph or Xy)/Pa = 0.82 ± 0.44) (Figure 2b, panel Pa). These included housekeeping genes as well as genes associated with photosynthesis, translation and the cytoskeleton (data not shown).

A number of genes in the vascular class also showed moderate but significant variation (mean value for (Ph or Xy)/Pa between 1.1 and 1.4). As this variation is a reflection of the fact that the ANOVA grouped together genes with varying fold changes, we set the limit for a significant expression ratio to 1.5, corresponding to a minimum of 50% increase in the average accumulation in the vascular tissues compared with that in the parenchyma. This isolated 73 genes as being either predominantly or specifically expressed in the phloem (mean Ph/Pa and Ph/Xy ratios > 1.5) (Figure 2b, panels: Ph1 and Ph2) or in vascular tissues (mean of Ph/Pa ratios > 1.5 and Ph/Xy < 1.5) (Figure 2b, panels: V, Xy1 and Xy2). These genes displayed expression ratios ranging from 1.5- to 22-fold, and cluster analysis confirmed that these genes showed reproducible and covariant transcript profiles. In addition, we also observed that genes represented by several independent clones clustered in the same sub-branch (data not shown). Of the non-redundant genes differentially expressed in vascular tissues, half (37) corresponded to clones isolated from the SJ library, 23 of which were exclusive to this library. This again demonstrates that using a subtractive approach can lead to the identification of new vascular-specific genes, as the total number of genes spotted from this library represented only 20% of the total clones spotted on the macroarrays.
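The two ratio criteria above amount to a simple decision rule, sketched here for illustration. The function name `classify` is hypothetical, and the handling of a Ph/Xy ratio of exactly 1.5 (assigned to V+) is an assumption, since the text defines only the strict inequalities.

```python
THRESHOLD = 1.5  # minimum mean ratio, i.e. at least a 50% increase

def classify(ph_pa, ph_xy, threshold=THRESHOLD):
    """Assign a gene to P+ or V+ from its mean expression ratios.

    ph_pa: mean phloem/parenchyma ratio
    ph_xy: mean phloem/xylem ratio
    Returns None when the gene is not enriched relative to parenchyma.
    """
    if ph_pa <= threshold:
        return None
    # Enriched over parenchyma: phloem-specific only if also enriched over xylem.
    return "P+" if ph_xy > threshold else "V+"

print(classify(22.0, 10.0))  # P+
print(classify(2.0, 1.0))    # V+
print(classify(1.2, 3.0))    # None
```

Under this rule a gene is called P+ only when it exceeds the threshold against both parenchyma and xylem, which is why genes with similar expression in phloem and xylem fall into the V+ class.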

Characterisation of the main classes of genes preferentially expressed in the phloem

Table 2 summarises the functions associated with the genes found to be preferentially expressed either in the phloem (P+: 65 of 73 genes) or in the vascular tissues (V+). Nearly 40% of these sequences corresponded to genes of unknown function. Fourteen showed significant similarity with sequences found in other plant species.

Table 2.  Genes showing a predominant or specific expression in the phloem (P+) (average ratio Ph/Pa > 1.5 and Ph/Xy > 1.5) or in the vascular system (V+) (Ph/Pa > 1.5 and Ph/Xy < 1.5)

  Function                Clone   Putative identity                       Accession number  BLAST E-score  Tissue P-value  Tissue
  Metabolism              SJ0305  Alcohol dehydrogenase                   dbj|BAA94770.1|   2E-13          1.2E-03         P+
  Metabolism/jasmonate    PA0561  Allene oxide cyclase                    emb|CAB95731.1|   5E-56          5.6E-03         V+
  Metabolism/lipid        PJ0106  Oxysterol-binding protein               gb|AAF14027.1|    1E-84          3.2E-02         P+
  Metabolism/polyamines   SJ0242  S-Adenosyl methionine decarboxylase     pir||S68989|      6E-10          1.2E-02         P+
  Metabolism/vitamins     PA0067  Thiazole biosynthetic enzyme – AgTBS    sp|O23787         4E-50          1.5E-04         V+
DNA and RNA binding
  DNA binding             PJ0016  Histone H1C                             gb|AAD48472.1|    2E-25          2.3E-02         P+
  DNA binding             SJ0365  Myb-related factor                      gb|AAG08960.1|    2E-20          6.4E-05         P+
  DNA binding             PJ0077  Nucleoid DNA-binding protein            dbj|BAB11161.1|   8E-47          4.5E-02         P+
  DNA binding             PJ0606  Transcription factor scarecrow-like     pir||C71441|      3E-36          1.9E-02         P+
  RNA-binding             PJ0465  Oligouridylate-binding protein          emb|CAB75429.1|   5E-93          3.9E-02         P+
  RNA helicase/RNAseIII   PJ0332  CAF protein                             ref|NP_171612.1|  7E-31          5E-06           P+
Protein synthesis and degradation
  Ubiquitin               SJ0196  Ring-H2 finger protein                  pir||T51854|      3E-66          1E-03           P+
  Ubiquitin               PJ0287  E2 ubiquitin-conjugating enzyme         pir||T46009|      1E-80          1.5E-03         P+
  Ubiquitin               PJ0401  E2 ubiquitin-conjugating enzyme         gb|AAG40371.1|    4E-79          1.6E-04         V+
  Ubiquitin               PJ0815  E2 ubiquitin-conjugating enzyme         emb|CAD29823.2|   1E-85          2E-03           P+
  Protein turnover        SJ0373  Protease inhibitor/γ-thionin – AgIP2    pir||S30578|      1E-10          7.8E-05         P+
  Chaperone               SJ0331  Heat shock protein cognate 70           sp|P24629         2E-06          2.2E-02         P+
  Translation factor      PA0715  Translation initiation factor EIF5A     sp|P56336         5E-82          2.2E-02         V+
Metal homeostasis and transport
  Metal homeostasis       SJ0123  Metallothionein – AgMT1                 gb|AAB70560.1|    4E-09          1E-05           P+
  Metal homeostasis       SJ0187  Metallothionein – AgMT2                 gb|AAC62510.1|    1E-22          5E-04           P+
  Metal homeostasis       PJ0656  Metallothionein – AgMT3                 emb|CAB85630.1|   4E-24          3.2E-04         P+
  Metal homeostasis       SJ0223  Metallothionein – AgMT4                 emb|CAB85630.1|   4E-13          2.5E-02         P+
  Metal homeostasis       SJ0316  Metallothionein – AgMT5                 emb|CAB85630.1|   3E-15          2.2E-03         P+
  Metal homeostasis       SJ0385  Metallothionein – AgMT6                 gb|AAF68995.1|    4E-12          3.4E-05         P+
  Redox potential         PA0642  Blue copper-binding protein I           sp|Q41001         8E-33          4.6E-05         P+
Cell wall
  Cell wall               PA0497  Bacillus cotA (Laccase)                 gb|AAB72167.1|    3E-54          5.7E-05         V+
  Cell wall               PJ0458  Expansin precursor                      gb|AAF32411.1|    5E-84          1.4E-04         P+
  Cell wall               PJ0355  Pectate lyase                           dbj|BAB59066.1|   1E-118         4.4E-02         P+
  Lignin biosynthesis     SJ0272  Cinnamyl/sinapyl alcohol dehydrogenase  gb|AAK58693.1|    7E-12          4.9E-04         P+
  Phloem protein          PJ0658  PP2 phloem lectin – AgPP2-1             pir||T04765|      5E-16          <1E-07          P+
  Cytoskeleton            PJ0094  Actin depolymerising factor 2           gb|AAG16974.1|    4E-66          1.1E-02         P+
  Transporter             PA0383  Sucrose transporter – AgSUT2            gb|AAD45390.1|    4E-96          4E-02           V+
  Transporter             PA0591  Transporter protein                     gb|AAK62597.1|    3E-79          1.9E-04         P+
  Membrane                SJ0270  Multispanning membrane protein          pir||T50793|      2E-05          9E-04           P+
  Membrane                SJ0354  Plasma membrane intrinsic protein       pir||T12440|      7E-49          8E-03           P+
  Membrane                SJ0020  Nodulin 21 – AgNod1                     pir||T00561|      1E-34          4.6E-04         P+
Signal transduction
  Signal transduction     SJ0179  Receptor protein kinase                 gb|AAM65586.1|    8E-59          6.4E-03         P+
  Signal transduction     SJ0139  Ser/Thr protein kinase – AgPK721        dbj|BAB01326.1|   2E-25          1E-06           P+
  Detoxification          SJ0336  Glutathione S-transferase               gb|AAG34828.1|    9E-15          6E-03           P+
  Stress-related          PA0662  Aluminium-induced protein               gb|AAK50814.1|    4E-65          5.4E-03         P+
  Stress-related          PA0361  Cell death associated protein           gb|AAF62404.1|    3E-72          3.6E-02         V+
  Stress-related          SJ0037  Coronatine-induced protein              gb|AAF27046.1|    3E-31          1.9E-03         P+
  Stress-related          PJ0323  Harpin-induced protein                  dbj|BAB09545.1|   4E-56          1.8E-05         P+
  Stress-related          PJ0247  Stress-related protein                  sp|Q9MA63         4E-50          8.5E-03         V+
                          PJ0463  Allergen-like protein Ole e 1           gb|AAF16869.1|    3E-53          6.5E-03         P+
No score
                          SJ0172  No score – AgNS1                                                         1E-03           P+
                          PA0391  No score                                                                 1.3E-02         P+
                          PA0475  No score                                                                 2.2E-04         P+
                          PA0525  No score                                                                 8.9E-05         P+
                          PA0526  No score                                                                 3.3E-05         P+
                          PA0532  No score                                                                 8.5E-05         P+
                          PA0641  No score                                                                 4.7E-03         P+
                          PA0697  No score                                                                 8.8E-03         P+
                          PJ0675  No score                                                                 4.4E-03         P+
                          SJ0122  No score                                                                 1E-02           P+
                          SJ0163  No score                                                                 5E-06           P+
                          SJ0195  No score                                                                 8.1E-04         P+
                          SJ0340  No score                                                                 2.1E-03         V+
                          SJ0350  No score                                                                 3.2E-02         P+
                          SJ0372  No score                                                                 6.9E-05         P+

The putative identity was based on BLAST analysis. The accession number corresponds to the sequence giving the highest similarity score indicated in the BLAST E-score column. The tissue P-value refers to the result of the local ANOVA on the tissue factor.

Genes encoding ribosomal or photosynthesis-related proteins were not represented, even though they were abundant in the EST collection, perhaps reflecting the consistency of the high levels of photosynthetic activity taking place in the petioles (Hibberd and Quick, 2002). Several other categories were poorly represented. These included genes encoding cytoskeleton proteins, kinases, phosphatases, cyclophilins, and membrane-associated proteins, as well as proteins involved in metabolism such as oxysterol-binding protein (P+), S-adenosyl-methionine decarboxylase (P+) and thiazole biosynthetic enzyme (V+). Over-represented categories included genes involved in protein stability and turnover, stress and metal homeostasis, structural proteins, and putative RNA- or DNA-binding proteins.

We considered it plausible that the over-representation of stress-related genes could be linked to the procedure used for phloem isolation. We therefore repeated our macroarray analysis using probes prepared from RNA purified from whole petioles (phloem, xylem and parenchyma). The expression profiles obtained from the whole-petiole RNA preparations were compared with patterns obtained from preparations where the phloem, xylem and parenchyma had been isolated and then re-mixed prior to RNA extraction. No significant differences in expression pattern were observed between samples taken from the same plant and prepared by these two methods. This showed that the expression of the putative stress-related genes we found in the phloem was not induced by the extraction. It therefore appears that phloem isolation had only a mild effect on gene expression patterns, and that several genes related to stress and redox homeostasis, such as glutathione S-transferase (P+), patatin (P+) and allene oxide cyclase (V+), showed increased expression in the vascular tissues.

Interestingly, four genes encoding enzymes related to the ubiquitin-proteasome pathway were preferentially expressed in phloem or vascular tissues. These included E2 conjugating enzymes (P+ or V+), thought to be specialised components of specific SCF complexes, and an E3 ligase subunit (RING-H2 finger protein) (V+), thought to be involved in targeting substrate proteins towards the degradation pathway (Hellmann and Estelle, 2002). We considered it reasonable that chaperones could be involved in the long-distance translocation of proteins in the phloem; however, only a heat shock protein HSP70 (P+) was present in this class. Consistent with previous observations of phloem specificity in other species was the presence of ESTs for genes encoding a putative protease inhibitor (P+), metallothioneins (classes II and III) (P+) and a blue copper-binding protein (P+). The six genes encoding metallothioneins were abundantly represented in the ESTs (8.7% of the SJ ESTs) (Table 2; Figure 3). The phloem lectin gene AgPP2-1 (P+), which we have shown previously to be expressed specifically in the phloem (Dinant et al., 2003), showed the highest level of tissue specificity (tissue P-value < 10⁻⁷). A second phloem lectin gene, AgPP2-2, was also predominantly expressed in the phloem, although only transiently (P-value = 0.1) and at a low level (data not shown).

Figure 3.

Northern blot analysis of selected genes.

Ph, phloem; X, xylem; Pa, storage parenchymas (including cortical and medullar parenchyma, collenchyma strands and epidermis). For each gene, the best E-score for BLAST hits and the tissue P-value for the local ANOVA are indicated. Genes preferentially or specifically expressed in the phloem: AgPP2-1, AgNod1, AgPK721, AgMT3, AgMT4 and AgNS1. Genes preferentially or specifically expressed in the vascular tissues: AgMT1, AgIP2, AgTBS and AgSUT2. Genes preferentially expressed in the storage parenchyma: AgSAMS and AgTub1. AgEIF4A (AY158704) was used as a reference gene as its expression is equivalent in vascular tissues and in storage parenchyma. Total RNA was evaluated by ethidium bromide staining prior to RNA transfer to the nylon blot.

Several genes encoding proteins involved in cell wall biosynthesis in the xylem (Im et al., 2000) or phloem fibres (Li et al., 2001) also exhibited phloem or vascular preferential expression patterns. These included laccase (V+), expansin (P+), pectate lyase (P+), and cinnamyl-sinapyl alcohol dehydrogenase. Other genes encoding two putative membrane proteins (P+) and a putative transmembrane protein with similarities to the nodulin Nod21 (P+) were abundant in the SJ library (3.5% of the SJ library ESTs) and specifically expressed in the phloem, although at a low level (Figure 3). Out of the same group, the sucrose transporter AgSUT2 (V+) was found to be preferentially expressed in the xylem. This localisation pattern has been confirmed by in situ hybridisation (Lemoine, unpublished results).

Several genes frequently described as highly tissue specific and related to signal transduction pathways or the transcriptional machinery were identified. These encoded two protein kinases (P+) and two transcription factors related to Myb and Scarecrow (P+). A putative CAF–DICER-like protein (P+, with a P-value = 5 × 10⁻⁶), thought to be involved in the production of microRNAs (Finnegan et al., 2003) for RNA silencing (Tang et al., 2003), was also identified. This finding is particularly interesting considering the pivotal role of the phloem in the transport and supply of the signalling macromolecules that regulate physiological and developmental processes.

Validation by Northern blot analysis

The macroarray results were verified by Northern blot analysis using a set of 13 genes with different tissue specificities (Figure 3). These included six genes predominantly expressed in the phloem (P+): phloem lectin AgPP2-1, nodulin 21-like AgNod1, serine/threonine protein kinase AgPK721, two type 3 metallothioneins (AgMT3 and AgMT4), and a putative protein AgNS1 of unknown function. Four more genes found to be preferentially expressed in both phloem and xylem (V+) were also analysed: type 2 metallothionein AgMT1, a thiazole biosynthetic enzyme AgTBS, protease inhibitor AgIP2 and the sucrose transporter AgSUT2. Two genes found to be expressed at lower levels in the phloem than in other tissues were also included: S-adenosyl-methionine-synthase AgSAMS and alpha-tubulin AgTub1. The gene for the translation initiation factor eIF4A (AY158704) appeared to be similarly expressed in all three tissues and was included as a control (Mandel et al., 1995). The pattern of mRNA accumulation observed by Northern blot analysis mirrored that deduced from our macroarray analysis. The similarities in expression pattern were observed for abundant mRNAs such as AgPP2-1 and for mRNAs of low abundance such as AgNod1, and for sequences with highly significant P-values (<10⁻⁷) as well as those with P-values close to the threshold (10⁻²). These results highlight the reliability of the macroarray technology and demonstrate that this approach can be used to produce transcript profiles of genes with varying levels of transcription.


A complex framework of gene regulation is needed for the phloem to carry out its central role in plant development, metabolism and stress responses. Such regulation can be considered to operate at three distinct levels in the phloem: intracellular, intercellular (i.e. CC to SE) and long distance (i.e. source to sink organs). The unravelling of the complex genetic programmes of the phloem tissue has been delayed by technical problems encountered when observing and isolating the tissue. By taking advantage of the ease with which the phloem tissue in celery can be isolated, we have developed a transcriptome approach, which has provided valuable insights into the classes of genes involved in phloem formation and function. This survey was based on genes derived from an individual tissue in an individual organ, a situation far less complex than that arising in other plant macroarray studies (Aharoni and Vorst, 2001). As compared to other techniques, such as laser-capture microdissection applied on vascular tissues (Nakazono et al., 2003) or on phloem (Asano et al., 2002), the isolation of the various tissues in celery from whole petioles yielded homogeneous samples in large amounts. We analysed the phloem in celery petioles at two developmental stages that differed in their relative proportion of functioning phloem (primarily SE:CC) to bundle cap (phloem parenchyma) cells and displayed distinct metabolic and physiological properties. Three phloem cDNA libraries were constructed to maximise the diversity of phloem genes represented in our study. Two of the libraries corresponded to abundant transcripts, and one was enriched in phloem-specific transcripts.

A set of 793 non-redundant genes expressed at various levels in the phloem were identified from these libraries and used to construct a set of cDNA macroarrays that were hybridised with complex probes corresponding to the phloem, xylem and parenchyma tissues. The transcript profile deduced from the cDNA macroarrays was analysed by ANOVA (Kerr et al., 2000) and clustered into three groups of genes with spatially distinct expression patterns. One-third of the genes expressed in the phloem displayed tissue-differential expression patterns, with 9% (73 out of 793 independent genes) preferentially expressed in the phloem or in the ontogenetically related vascular tissues (phloem and xylem). These genes represented 7% of the non-subtracted libraries as compared to 14% of the subtracted one. While our experiments were designed to reflect normal developmental conditions, a number of genes were only transiently expressed in the phloem, suggesting additional transcriptional or post-transcriptional regulation. Of the 73 genes showing reproducible and preferential expression in the vascular tissues, approximately 20% showed similarity to genes of ‘unknown’ function identified in other plant species.

The preferential expression of these genes in vascular tissues provides a starting point for future research into their functional characterisation. The majority of the annotated genes fell into five classes: phloem structure, protein stability and turnover, metal homeostasis, stress responses, and DNA-binding factors. In addition, a few genes encoded proteins that can be classified as involved in secondary metabolic pathways. Surprisingly, vascular-specific genes involved in primary metabolism associated with carbohydrate and amino acid mobilisation in the phloem (Bruguière et al., 1999; Williams et al., 2000) were notably absent from our study. The only known element of these pathways was the sucrose transporter AgSUT2 (Noiraud et al., 2000), which showed expression in both the phloem and the xylem. Celery petioles serve as both transport and storage organs. This diversity in function could explain the apparent difference observed in expression pattern between AgSUT2 and the previously reported phloem-specific sugar transporters in other plant species (Williams et al., 2000).

Phloem structure

The SEs of the phloem have a number of distinctive structural attributes that are unique to this cell type. During SE differentiation, P-proteins (phloem proteins) form ultrastructurally distinct bodies and filaments that persist in functioning SEs (Bostwick et al., 1994). The phloem lectin or phloem protein 2 (PP2) appears to be a component of these structures (Read and Northcote, 1983) but is also translocated as a soluble protein (Golecki et al., 1999). In celery, two PP2 genes were identified that are differentially regulated. Both array and Northern blot analysis showed that AgPP2-1 is expressed continuously in the phloem (Figure 3), whereas AgPP2-2 appeared on the macroarray to be only transiently expressed in this tissue. Their expression in the SE:CC was recently confirmed by in situ hybridisation (Dinant et al., 2003), demonstrating the ability of macroarray analysis to detect genes with varying expression patterns.

Another distinctive feature of the phloem is the thickened polylaminate inner cell wall (nacreous wall) of the SEs in some species (Esau, 1969). Cell-wall-related genes were represented by 2% of the total EST collection and encoded various wall-modifying enzymes, such as a beta-1,3-glucanase, beta-1,4-glucanases and expansins. A gene encoding xyloglucan-endo-transglycosylase (XET) was also identified, a result consistent with previous reports of XET activity in celery (Vissenberg et al., 2000). This enzyme is involved in various developmental steps, including the formation of nacreous deposits in SEs (Bourquin et al., 2002). With the exception of one isoform of alpha-expansin that is preferentially expressed in the phloem, these genes showed no clear-cut tissue-specific expression. Expansins have been proposed to play a key role in the cell wall extension required for cell and tissue growth (Cosgrove, 2000) and in vascular cell differentiation (Cho and Kende, 1998), and have been localised previously to xylem in Zinnia elegans (Im et al., 2000). The biological significance of a phloem-specific isoform could be associated with either phloem wall biosynthesis or with the requirement of the SE to be able to respond rapidly to variations in the flow of the translocation stream. Several other vascular isoforms of genes associated with cell wall biosynthesis or structure were identified including those encoding pectate lyase, cinnamyl/sinapyl alcohol dehydrogenase and laccase.

The cytoskeleton (CK) organisation in phloem tissue differs markedly from that of other tissues in that SEs appear to lack organised cortical microtubule arrays (Sjölund, 1997). CK elements have indeed been found in the phloem exudate of many species (Schobert et al., 1998) and probably participate, together with callose, in structurally defining the sieve pores (Chaffey and Barlow, 2002). Eighteen genes related to the cytoskeleton were identified in the celery ESTs. However, only one, encoding an actin depolymerising factor 2, appeared to be preferentially expressed in the phloem. Other classes of genes that were identified and could be related to phloem structure encode membrane proteins, such as several nodulins that contain putative transmembrane domains. Such genes, initially identified as being induced early during Rhizobium/Fabaceae interactions, could fulfil more general roles in plant growth and development. One of these genes, AgNod1, which displays significant similarity with Nod21 (Delauney et al., 1990), was found to be specifically expressed in the phloem, a result confirmed by our Northern blot analysis (Figure 3). This gene belongs to a small multigene family with a highly conserved signature, and is found in a broad range of plant species, including Arabidopsis thaliana (unpublished data). The presence of a tissue-specific isoform suggests a basic function associated with the maintenance of phloem architecture.

Protein turnover and ubiquitin degradation pathway

Research over the past decade has convincingly demonstrated that, in addition to transporting photoassimilates and other low-molecular-weight compounds, the phloem is also involved in the long-distance translocation of a large number of macromolecules (Thompson and Schulz, 1999). Several investigators have suggested that specific mechanisms control the stability of translocated proteins and mRNAs. Although little direct evidence supports such a hypothesis, a number of protease inhibitors (PIs) have been identified in phloem exudates from many species (Dannenhoffer et al., 2001; Schobert et al., 1998; Xu et al., 2001a). We identified three PI genes encoding two cystatins and one putative protease inhibitor related to potato gamma-thionin (Stiekema et al., 1988). The latter was consistently and predominantly expressed in the phloem. Roles in defence or in the control of cell death during SE differentiation have been hypothesised to explain the presence of PIs in the enucleate SEs. It is interesting to note their abundance in phloem tissues even in the absence of any noticeable stress. These observations suggest that PIs in the phloem could be involved in constitutive mechanisms, such as those described as controlling protease activities during or after selective autophagy in SEs (Dannenhoffer et al., 2001; Xu et al., 2001a).

Consistent with previous reports on the presence of ubiquitin in the sap exudates of several species (Schobert et al., 1998), we identified 11 genes related to the ubiquitin degradation pathway as well as two proteasome subunits. Three genes encoding ubiquitin-conjugating enzymes and a RING-H2 finger protein were expressed preferentially either in the phloem or in the vascular tissues, as reported in other species (Nakazono et al., 2003). This supports the idea that specific components of the ubiquitin degradation pathway are active in vascular tissues (Lechner et al., 2002). Because ubiquitin-related protein degradation is absent from the SE, it was assumed that the activity of this pathway in the phloem was related to the maintenance of protein stability (Schobert et al., 1995). Alternatively, as the E3 ubiquitin-mediated degradation pathway is involved in many physiological responses (Del Pozo and Estelle, 2000), it could be hypothesised that these genes are involved in other processes such as signalling. The ubiquitin system and chaperones could participate in both the folding and the stability of proteins in SEs, or be associated with trafficking between the SE:CC and remote tissues (Bachmair et al., 1990). As seen in other species (Schobert et al., 1995, 1998), we observed that several genes encoding chaperones, such as cyclophilins and heat shock proteins (HSPs), were transcribed in the phloem. The presence of chaperones in the phloem could reflect specific roles for these isoforms in the maintenance of protein folding during transport.

Metal homeostasis and transport

We identified transcripts corresponding to several genes encoding heavy metal donors, such as a blue copper protein, and also six distinct metallothioneins (MTs). MTs are cysteine-rich metal-binding proteins implicated in various processes related to metal homeostasis, detoxification, distribution and redox regulation (Cobbet and Goldsbrough, 2002). The six MT genes showed consistent preferential expression in the vascular tissues in all five experiments. MTs have been previously reported in the vascular tissues, including the phloem, of many plant species (Butt et al., 1998; Cobbet and Goldsbrough, 2002; Garcia-Hernandez et al., 1998; Nakazono et al., 2003), but a clear function is yet to be assigned. Metal micronutrients are imported into plants by specific uptake systems in the plasmalemma of root cells (Fox and Guerinot, 1998), and then transported via the xylem to the rest of the plant. The phloem plays a crucial role in the redistribution of these compounds to the newly formed sink organs (Stephan and Scholz, 1993). Metal transport is generally thought to require ligands, both to prevent the cellular damage that free metal ions would induce and to maintain solubility during long-distance transport. MTs could be involved in such complexes along with other high-affinity metal-binding peptides present in the phloem sap (Krüger et al., 2002).

Secondary metabolism

Secondary metabolites are known to be transported by the phloem tissue in a wide range of plants. Of those identified in this study, two genes that have been implicated, directly or indirectly, in defence mechanisms were of particular interest: allene oxide cyclase and thiazole biosynthesis enzyme. Allene oxide cyclase, an enzyme involved in jasmonate synthesis, showed preferential expression in the phloem (Figure 2), a result consistent with observations made recently in tomato (Stenzel et al., 2003) and supporting the hypothesis that this enzyme plays a role in the systemic signalling of plant defence. Thiazole biosynthesis enzyme in plants catalyses the formation of the thiazole ring and is thus essential for the synthesis of thiamine (Belanger et al., 1995). Its activity is required in several important biosynthetic pathways and is also involved in plant defence (Malamy et al., 1996). The vascular expression we observed may be associated with release of thiamine into phloem sap, an idea consistent with the detection of this vitamin in the phloem exudates of many species (Ziegler, 1975).

Genes encoding proteins involved in sterol biosynthesis and transport were also found to be expressed in celery phloem and included a gene preferentially expressed in the phloem displaying strong similarities with an oxysterol-binding protein (OSBP). OSBPs are involved in lipid trafficking via vesicle transport and sterol homeostasis in animals and yeast, although their precise role is still unknown (Xu et al., 2001b). Several putative OSBPs have been identified in Arabidopsis, indicating the presence of a multigene family in plants similar to that already identified in yeast and mammals. The well-known requirement of sterols in vascular patterning (Carland et al., 2002) suggests a role for OSBPs in vascular formation.

Experimental procedures

Plant material sampling and observation

Celery plants (Apium graveolens var. dulce cv. Vert d'Elne) were grown in a greenhouse and sampled at maturity (4 months old) at a regular time of day (11.00–13.00 h). Phloem strands were isolated from the petioles of immature and mature leaves as described by Daie (1987). Newly emerging leaves corresponded to the inner leaves (2–5) and mature leaves to leaves 10–16. Xylem strands and storage parenchyma were isolated at the same time from the same plants. For transmission electron microscopy, petiole samples were treated and embedded in Epon resin essentially as described by Roustaee et al. (2000). Micrographs were taken with a Philips 420 transmission electron microscope at 80 kV (Philips Electron Optics).

Construction of phloem cDNA libraries

Total RNA was isolated as described by Noiraud et al. (2000). Poly(A)+ RNA was isolated from total RNA using oligo(dT) columns (Stratagene, La Jolla, CA, USA). The cDNA libraries were constructed using the Lambda ZAP II-cDNA synthesis kit (Stratagene) and packaged using Gigapack III Gold packaging extract (Stratagene). Mass excision was performed following the manufacturer's recommendations before selected clones were sequenced. The SSH library was prepared from mRNA purified from the phloem of petioles of newly emerging leaves and mRNA from the parenchyma of the same petioles. Subtracted double-stranded cDNAs were synthesised using the SMART PCR cDNA Synthesis Kit (Clontech, Palo Alto, CA, USA). For subtraction, both tester (phloem) and driver (storage parenchyma) RNA were obtained from the same petioles to minimise physiological variation. The tester cDNA pool was subtracted twice with the driver cDNA following the supplier's instructions (SSH; PCR-Select cDNA Subtraction Kit, Clontech). The cDNA fragments were cloned into the ready-to-use pT-Adv vector (Clontech).

Sequence analysis

Single-pass sequencing of cDNA clones was performed using the universal M13 reverse primer. The raw sequences were edited to mask vector, adaptor, poly(dA/dT) and inaccurate regions (>3% of ‘N’). Trimmed ESTs with valid sequences shorter than 70 nt were eliminated. The quality of the sequences was confirmed by estimating the total proportion of ‘N’, which was, on average, 1.05%. Sequence analysis was performed using the programs available in the GCG package (Genetics Computer Group, Wisconsin, USA). Chimaeric clones were identified by the presence of a cloning adaptor within the sequence, releasing two ESTs that were treated independently for sequence analysis. Chimaeric clones were excluded from the macroarray analysis. Redundancy was initially analysed by iterative comparisons using blastn (Altschul et al., 1990) performed against the EST library. Contigs were verified manually, with a threshold for contig formation of 98% identity over more than 60 bp. Similarity searches were performed using BLAST programs (blastx, blastn and tblastx) against databases available at the NCBI. Supplementary analysis was performed at TAIR. Classification into main functional categories was performed according to the MIPS functional catalogue.
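The length filter and the contig criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the GCG-based procedure used in the study: the function names (`n_fraction`, `keep_est`, `same_contig`) are hypothetical, and the input sequences are assumed to have been trimmed already.

```python
MIN_LENGTH = 70        # trimmed ESTs shorter than this (nt) are discarded
MIN_IDENTITY = 98.0    # percent identity required to join a contig
MIN_OVERLAP = 60       # overlap length (bp) that must be exceeded


def n_fraction(seq):
    """Return the fraction of ambiguous 'N' calls in a sequence."""
    seq = seq.upper()
    return seq.count("N") / len(seq) if seq else 0.0


def keep_est(trimmed_seq):
    """Length filter: keep a trimmed EST only if it is at least 70 nt long."""
    return len(trimmed_seq) >= MIN_LENGTH


def same_contig(percent_identity, overlap_bp):
    """Contig criterion: at least 98% identity over more than 60 bp."""
    return percent_identity >= MIN_IDENTITY and overlap_bp > MIN_OVERLAP
```

Averaging `n_fraction` over all retained ESTs would give a quality figure comparable to the mean ‘N’ content reported above.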

Preparation of cDNA macroarrays on nylon filters

Purified plasmid DNA or PCR products amplified from plasmid DNA were spotted at an average concentration of 100 ng µl⁻¹ on nylon membrane filters (8 cm × 12 cm) (Genescreen Plus, Life Science, Promega, Madison, WI, USA). PCR products were amplified using forward and reverse M13 universal primers (95°C for 2 min, then 30 cycles of 95°C for 1 min, 55°C for 30 sec and 72°C for 1 min, and a final cycle of 72°C for 15 min), and purified on Bio-Gel P60 gels (Bio-Rad, Munich, Germany) by gel filtration at 350 g. The spotting was performed using a Q-Bot robot (Genetix, New Milton, UK) equipped with a 384-well plotter. DNA products were arrayed in duplicate at a final density of nine spots per square with a gridding pattern of 3 × 3. Thirty-four controls were spotted on the filter. They included the pBluescript plasmid, PCR-amplified pBluescript polylinker, pT-Adv (Clontech) and PCR-amplified pT-Adv polylinker.

Northern blots, test and reference hybridisation

Northern blot analysis was performed with 10 µg of total RNA prepared as described by Desprez et al. (1998). The RNAs were transferred to a nylon membrane (Genescreen Plus, Life Science) in 10× SSC (1.5 M NaCl, 0.15 M sodium citrate). Pre-hybridisation (4 h) and overnight hybridisation were performed at 50°C as described by Church and Gilbert (1984), prior to washes under moderately stringent conditions (2× SSC, 0.1% SDS at 65°C for 2 × 15 min, followed by 1× SSC, 0.1% SDS for 15 min and 0.1× SSC, 0.1% SDS for 2 × 15 min). The cDNA probes, purified on Spin-X columns (Corning Costar Corp., Cambridge, MA, USA), were labelled using the Prime-a-gene Labelling kit (Promega, Madison, WI, USA) in the presence of α-32P dCTP.

Macroarray test hybridisations were carried out using complex probes initially prepared with poly(A)+ RNAs purified using oligo(dT) columns (PolyA-Tract mRNA Isolation System, Bio-Rad). Total RNAs were subsequently used in place of poly(A)+ RNAs as the template for the preparation of the probes, with negligible effects on the hybridisation results. Probes were prepared by reverse transcription in the presence of α-33P dCTP and 50 µg of total RNA or 0.5 µg of poly(A)+ mRNA, as described by Desprez et al. (1998). Pre-hybridisation (4–8 h) and overnight hybridisation were performed at 42°C in the buffer described above. The washing conditions were identical to those used for the Northern blots. The filters were exposed on imaging plates for 16 h, and signals were detected using a BioImager analyser (Fuji Bas1500, FUJIFILM Medical Systems, Stamford, CT, USA). A maximum of three successive hybridisations, including the T7 hybridisation, was carried out on the same filters, which were stripped between hybridisations using 0.1% SDS (w/v) at 80°C for 30 min. The absence of residual radioactivity after stripping was verified after each hybridisation by exposure on imaging plates.

Reference hybridisations were carried out using 100 ng of the T7 primer (20-mer) labelled with T4 polynucleotide kinase (Invitrogen, Groningen, the Netherlands) in the presence of 50 µCi of γ-33P ATP, as recommended by the manufacturer. Pre-hybridisation and hybridisation were performed at 42°C in the buffer described by Church and Gilbert (1984) in the absence of formamide, followed by washing (2× SSC, 0.1% SDS at 42°C for 2 × 10 min, then 1× SSC, 0.1% SDS at 42°C for 10 min and 0.1× SSC, 0.1% SDS at room temperature for 10 min). Reference hybridisations (T7) were used to select, within the set of filters, those that showed homogeneous spotting, as determined by the coefficient of correlation between signals for each dot. Only sets of membranes showing good correlation coefficients (>0.981) were used for further test hybridisations. This step ruled out filters showing irregular spotting. Clones displaying spotting abnormalities (including duplication irregularities or low concentrations, i.e. an intensity value below a threshold of twice the background) were identified using the reference hybridisation and removed.
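The two quality checks above (filter-to-filter correlation and a minimum spot intensity) can be illustrated with the short Python sketch below. It rests on several assumptions that the text does not settle: the correlation is taken to be a Pearson coefficient computed spot by spot between the T7 signals of two filters, and the function names (`pearson`, `filter_passes`, `weak_spots`) are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_passes(ref_a, ref_b, min_r=0.981):
    """Accept a pair of filters only if their T7 signals correlate above min_r."""
    return pearson(ref_a, ref_b) > min_r


def weak_spots(signals, background, fold=2.0):
    """Return indices of spots whose signal is below fold x background."""
    return [i for i, s in enumerate(signals) if s < fold * background]
```

The sketch does not handle constant signal lists, for which the correlation is undefined.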

Macroarray data analysis

The signal intensities of the hybridisations were quantified using XdotsReader (Cose, Dugny, France). A pre-defined grid, determining the area of signal quantification, was manually set up for recording. Signal quantification was performed for each dot using the sum of the intensity of all pixels within a circle of fixed radius after subtraction of the local background. The blots were first normalised for the amount of radioactivity incorporated into cDNAs using the mean of the intensities of all spots (after subtraction of the background). These values were used to calculate the mean and the standard deviation of the intensities associated with each spot. This allowed us to identify a subset of 100 reference (stable) clones showing the lowest variation, determined by stepwise calculation of a variation coefficient ‘Vc’ for each clone (Vc = standard deviation/mean of the intensities). This set included 54 PJ, 38 PA and 8 SJ clones and was used for the final normalisation of the entire set of hybridisation data. The accuracy of the method was confirmed by calculating the mean ratio of these reference clones for the tissue comparisons (Ph/Xy, Ph/Pa and Xy/Pa ratios), which was close to 1 (±0.35). Hybridisation and sequence data were analysed and stored in a database implemented under 4D (ACI, Paris).
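The two-step normalisation can be sketched as follows. This is a simplified, single-pass illustration in Python rather than the stepwise procedure described above; the function names are hypothetical, the sample standard deviation is an assumed choice, and each blot is assumed to be a list of background-subtracted intensities in the same clone order.

```python
from statistics import mean, stdev


def normalise_by_mean(blot):
    """First pass: divide every spot by the mean intensity of its blot."""
    m = mean(blot)
    return [v / m for v in blot]


def variation_coefficient(values):
    """Vc = standard deviation / mean of the intensities of one clone."""
    return stdev(values) / mean(values)


def stable_clones(blots, n_ref=100):
    """Return indices of the n_ref clones with the lowest Vc across blots."""
    scaled = [normalise_by_mean(b) for b in blots]
    n_clones = len(scaled[0])
    vc = [variation_coefficient([b[i] for b in scaled]) for i in range(n_clones)]
    return sorted(range(n_clones), key=lambda i: vc[i])[:n_ref]


def normalise_by_references(blot, ref_idx):
    """Final pass: divide every spot by the mean of the reference clones."""
    ref_mean = mean(blot[i] for i in ref_idx)
    return [v / ref_mean for v in blot]
```

After the final pass, the reference clones average 1 on every blot, so tissue ratios computed for them should stay close to 1, which is the check reported above.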

Statistical analysis and clustering

An ANOVA was applied to normalised intensity values, using four parameters: clone, tissue, experiment (i.e. set of plants) and clone duplication. The local ANOVA was calculated and enabled us to associate two P-values with each clone spotted on the filters, P-value(tissue) and P-value(experiment). A significance threshold of 5 × 10⁻² was retained. Data handling was carried out using a local application similar to that developed by Didier et al. (2002). To focus on biologically relevant variations in expression level between tissues, a threshold of a 50% increase (ratio tissue1/tissue2 > 1.5) was retained; this cut-off was applied to the mean of the ratios of the normalised intensity values calculated for the tissue comparisons (Ph/Xy, Ph/Pa or Xy/Pa ratios) and determined in the five independent experiments. This estimation gave an indication of the magnitude of the variations measured in the various tissues.
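The combined selection rule (significance threshold plus ratio cut-off) can be expressed as a short Python sketch. It assumes that the tissue P-value has already been computed by the ANOVA, which is not reproduced here, and that the threshold is applied as a strict inequality; the function names are hypothetical.

```python
from statistics import mean

P_THRESHOLD = 5e-2    # significance threshold for the tissue P-value
RATIO_CUTOFF = 1.5    # minimum mean ratio tissue1/tissue2


def mean_ratio(tissue1, tissue2):
    """Mean of the per-experiment ratios of normalised intensities."""
    return mean(a / b for a, b in zip(tissue1, tissue2))


def is_differential(p_tissue, tissue1, tissue2):
    """True if a clone passes both the P-value and the ratio cut-off."""
    return p_tissue < P_THRESHOLD and mean_ratio(tissue1, tissue2) > RATIO_CUTOFF
```

For a phloem-preferential clone, `tissue1` would hold the five phloem values and `tissue2` the matching xylem or parenchyma values.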

Clustering analysis was performed on ratios calculated from the normalised intensity values after hybridisation with probes corresponding to the three tissues considered (phloem, xylem and parenchyma) using xcluster (Sherlock, 2000). Global and local ANOVA, clustering and its representation in false colours were performed through a local interface ‘Interfiltres’ (Amselem, unpublished results).


The authors thank the Genoscope – CNS (Centre National de Séquençage) – for spotting facilities; B. Gélie for her help in microscopy facilities; and F. Divol, M. Gagneux, D. Lefebvre, J. Fromentin, K. Chevalier and N. Noiraud for their technical contributions. The authors acknowledge the help of T. Desprez for initial development of the transcriptome approach and thank Prof. W. de Jong, Dr B. Dubreucq and Dr Y. Chupeau for their helpful discussions and comments. We are grateful to Dr E. Pilling and Prof. G. Thompson for their critical reviewing of the manuscript.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (Accession nos. BU692978–BU693963 and CB275388–CB275397). EST, macroarray raw and clustering data are available from our website.