cDNA microarray analysis of small plant tissue samples using a cDNA tag target amplification protocol


For correspondence (fax +46 8245452; e-mail


Microarray technology is becoming an important comprehensive tool to study gene expression in plants. However, the use of this technology is limited by the large amount of sample tissue needed for microarray analysis. Generally, 50–200 µg of total RNA or 1–2 µg of mRNA is required for each hybridisation, which is equivalent to 50–100 mg of plant tissue. This requirement for large amounts of starting material severely constrains the use of microarrays for transcript profiling in specific tissues and cell types during plant development. Here we report on a robust and reliable target amplification method that enables transcript profiling from sub-mg amounts of plant tissue. Using 0.1 µg of total RNA we show that twofold expression differences can be distinguished with 99% confidence. We also demonstrate the application of this method in an analysis of secondary phloem development in hybrid aspen using defined tissue sections, corresponding to 2–4 cell layers with a fresh weight of ∼0.5 mg.


Developmental processes occur in specialized tissues or cell types and microarray technology offers the possibility to study these events at the transcriptional level. We have produced a large collection of ESTs from the wood-forming zone of Populus tremula x tremuloides, hybrid aspen (Sterky et al., 1998), which is the basis of our current efforts to study gene expression in specific cell types during the vascular differentiation process using cDNA microarrays. In our studies on vascular differentiation in trees, we usually prepare cell-specific samples consisting of 0.5 mg of tissue (Uggla et al., 1996), equivalent to approximately 0.5 µg of total RNA. However, the standard RNA labelling methods for cDNA microarray experiments require large amounts of starting material, i.e. 1–2 µg of mRNA (Duggan et al., 1999). To make such experiments possible, amplification of the signal or the target is required, and this can be performed in various ways (Mahadevappa and Warrington, 1999; Stears et al., 2000). A principle based on RNA polymerase amplification is currently one of the most favoured methods (Eberwine et al., 1992; Wang et al., 2000). Studies using full-length PCR amplification of the transcriptome have also been reported (Xu et al., 1997; Livesey et al., 2000). PCR is an extremely powerful method that allows amplification of mRNA populations from single cells (Karrer et al., 1998; Lambert and Williamson, 1993). A bias for shorter fragments has previously been observed when amplifying the complete transcriptome, comprising mRNA/cDNA species of a few hundred to several thousand base pairs (Van Gelder et al., 1990). However, by random fragmentation of the cDNA population into 200–600 bp fragments and subsequent selection of the 3′ cDNA tag for each transcript, we are able to reduce the length range of the transcriptome and therefore minimise the bias during subsequent PCR amplification.
Here we report on the use of a 3′ tagged cDNA target amplification protocol creating representative cDNA populations for transcript profiling using microarrays.

To assess the accuracy of the 3′ cDNA tag amplification method for cDNA microarray experiments we used 192 hybrid aspen cDNA clones that were printed in triplicate. RNA obtained from xylem and phloem tissue of hybrid aspen was used in these hybridisation experiments to evaluate the amplification protocol. In a final experiment we also used a cDNA microarray containing 2995 cDNA clones to confirm the generality of our method. The utility of the amplification method was furthermore demonstrated by preparing small and defined samples consisting of 2–4 cell layers from different stages of the secondary phloem differentiation process in hybrid aspen.

Results and Discussion

Xylem and phloem RNA were used in this study to assess the accuracy of the amplification method. Multiple hybridisations with different sources of labelled cDNA were compared by calculating the expression ratios of the phloem/xylem hybridisation signals. The targets were either traditionally prepared from 1 µg purified mRNA (hybridisations I, II, III) or from 100 µg total RNA (hybridisation IV), or were subjected to an amplification step starting with 1 µg of total RNA (hybridisations V, VI) or from 0.1 µg of total RNA (hybridisations VII, VIII).

In all cases, expression ratios from the hybridisation experiments were plotted as scatter plots against the expression ratios from hybridisation I, which we regard as the best hybridisation in this study, defined here as the hybridisation with the lowest background, the most quantifiable spots and the best correlation between the three replicates of each spot. Figure 1(a–d) shows a subset of these scatter plots.

Figure 1.

Scatter plots of log2 transformed expression-ratios from hybridisations II, IV, V and VII (a–d) plotted against hybridisation I.

The plots represent the best hybridisation for each category tested, which in this case is defined as the hybridisation with the lowest background, the most quantifiable spots and the best correlation between the three replicates of each spot. No signal was detected for any of the negative control clones. The solid line represents the regression line while the dashed lines represent the 99% confidence intervals. All expression ratios are shown as phloem/xylem. (a) Ratios from hybridisation I plotted against ratios from another direct labelling of mRNA, hybridisation II. (b) Ratios from hybridisation I plotted against ratios from hybridisation IV where total RNA was used for labelling. (c) Ratios from hybridisation I plotted against ratios from hybridisation V where targets were amplified from 1 µg of total RNA before labelling. (d) Ratios from hybridisation I plotted against ratios from hybridisation VII where the targets were amplified from 0.1 µg of total RNA prior to labelling.

The reproducibility using standard labelling from mRNA was estimated by comparing expression ratios from two separate hybridisations (II and III) with hybridisation I. A 99% confidence was obtained for 2.0- and 1.6-fold difference in expression ratios, respectively (Figure 1a). Furthermore, a single hybridisation (IV) with labelled total RNA was compared to hybridisation I giving a 99% confidence with a 1.9-fold difference in expression ratio (Figure 1b).

In a similar manner, expression ratios obtained from amplified 3′ cDNA tag hybridisations were compared with hybridisation I to analyse the accuracy of the method. Two separate hybridisations (V and VI) with amplified 3′ cDNA tags from 1 µg total RNA resulted in a 99% confidence with a 1.7- and 2.1-fold difference in expression ratios compared to hybridisation I (Figure 1c). Furthermore, two additional hybridisations (VII and VIII) with amplified 3′ cDNA tags from 0.1 µg of total RNA resulted in 99% confidence with a 2.1- and 2.2-fold difference in expression ratios compared to hybridisation I (Figure 1d).
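The fold-difference figures quoted above can be related to the scatter of points around the regression line. The following is a minimal sketch of one way to do this, not the SigmaPlot procedure used in this study: it assumes the 99% band is approximated by the residual standard deviation of the log2 ratios multiplied by a normal quantile, and the function names and example values are hypothetical.

```python
import math
from statistics import NormalDist


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fold_change_threshold(x, y, confidence=0.99):
    """Fold difference corresponding to the half-width of the band.

    x and y are log2 expression ratios of the same spots in two
    hybridisations. Assumption: normal approximation of the residuals.
    """
    slope, intercept = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(r * r for r in residuals) / (len(x) - 2))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Convert the half-width from log2 units back to a fold difference.
    return 2 ** (z * s)


# Hypothetical log2(phloem/xylem) ratios for six spots.
reference = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]
compared = [-1.8, -1.1, 0.1, 0.3, 1.1, 1.7]
print(fold_change_threshold(reference, compared))
```

A spot whose ratio differs from the regression line by more than the returned factor would fall outside the band under these assumptions.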

The number of quantifiable spots was used as an estimate of the quality and efficiency of the different labelling methods. In our study, spots with a mean signal that was at least twice as high as the mean background were accepted for further analyses. No difference was observed in the number of spots passing this filter when either mRNA (labelled with fluorescent nucleotides during reverse transcription) or amplified 3′ cDNA tags (labelled with fluorescent nucleotides in an asymmetric cycling protocol) were used. For the data presented in Figure 1, 97–99% of the spots passed the quality filter. The spots disqualified by the quality criteria probably reflect random experimental variations in the hybridisation procedure.
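The quality filter described above can be sketched as follows. This is an illustration only: the dictionary keys and the example intensities are hypothetical, and the sketch applies the criterion to a single signal/background pair per spot, whereas the text does not state how the two channels were combined.

```python
def passes_quality_filter(mean_signal, mean_background, min_ratio=2.0):
    """True if the mean signal is at least min_ratio times the mean background."""
    return mean_signal >= min_ratio * mean_background


def fraction_passing(spots, min_ratio=2.0):
    """Fraction of spots accepted for further analysis."""
    passed = [
        s for s in spots
        if passes_quality_filter(s["signal"], s["background"], min_ratio)
    ]
    return len(passed) / len(spots)


# Hypothetical spot intensities (arbitrary scanner units).
spots = [
    {"id": "spot_1", "signal": 1500.0, "background": 200.0},
    {"id": "spot_2", "signal": 390.0, "background": 200.0},  # below 2x background
    {"id": "spot_3", "signal": 400.0, "background": 200.0},  # exactly 2x, accepted
    {"id": "spot_4", "signal": 5200.0, "background": 250.0},
]
print(fraction_passing(spots))
```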

Analysis of the linear regression plots (Figure 1a–d) shows that the regression coefficients, which theoretically should be 1, for hybridisations II and III were 0.93 and 0.89, respectively, when compared to hybridisation I. When total RNA was used (hybridisation IV) a slightly lower regression coefficient of 0.79 was obtained. The coefficients using the 3′ cDNA tag amplified material (from 1 µg total RNA) were 0.83 and 0.78 (hybridisations V and VI) and when 0.1 µg total RNA was used the corresponding figures were 0.82 and 0.74 (hybridisations VII and VIII). The consistently but slightly lower regression slope in all experiments in relation to hybridisation I is probably due to experimental variation during labelling and hybridisation. Hybridisation I represented a successful experiment in this series with respect to the dynamic range of the obtained expression ratios, naturally leading to lower regression coefficients in the compared experiments. The data for the amplified cDNA tags are consistent with the data obtained from the unamplified material (mRNA/total RNA templates). Furthermore, the coefficient of determination (R²) is sufficiently high (> 0.91) for microarray analysis in all experiments, with only one case of a slightly lower R² value of 0.87 (hybridisation VIII) observed.
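To make the two quantities discussed above concrete, the sketch below computes a regression coefficient (slope) and a coefficient of determination from paired log2 ratios. It is a generic least-squares calculation with hypothetical example data, not a reproduction of the analysis software used here. A slope below 1 with a high R² corresponds to a compressed dynamic range relative to the reference hybridisation rather than to increased scatter.

```python
def slope_and_r_squared(x, y):
    """Least-squares slope of y on x and the coefficient of determination."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    # For simple linear regression, R^2 equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Hypothetical log2 ratios: the compared hybridisation spans a
# narrower range than the reference but follows it closely.
reference = [-2.0, -1.0, 0.0, 1.0, 2.0]
compared = [-1.6, -0.8, 0.0, 0.8, 1.6]
print(slope_and_r_squared(reference, compared))
```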

In order to study the patterns of variation in relation to expression levels, residual plots were created. The obtained expression ratios of individual spots were subtracted from the predicted value (solid line, Figure 1a) and plotted against the product of the Cy5 and Cy3 signals. The result shows a low level of bias, i.e. the variability of individual spots is only weakly related to expression levels (Figure 2). Interestingly, there is no apparent difference when the amplified cDNA tag hybridisations are compared with unamplified material (data from Figure 1a,c), as depicted in Figure 2(a,b). The tendency to observe higher errors at lower signals is due to the fact that these spots have signals close to the background noise. This is also observed when internal duplicate spots on the slide are plotted against each other; the residual plots created in this way show similar results (data not shown).
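A residual plot of the kind described above can be prepared along the following lines. This is a sketch under stated assumptions: the residual is taken as the experimental value minus the predicted value, following the caption of Figure 2, the signal product is put on a log2 scale for plotting (an assumption, not stated in the text), and all names and numbers are hypothetical.

```python
import math


def residual_plot_points(x_log2, y_log2, cy5, cy3, slope, intercept):
    """Return (log2 of Cy5 x Cy3 signal, residual) pairs, one per spot.

    x_log2: log2 ratios from the reference hybridisation
    y_log2: log2 ratios from the compared hybridisation
    cy5, cy3: raw channel signals from the compared hybridisation
    """
    points = []
    for x, y, c5, c3 in zip(x_log2, y_log2, cy5, cy3):
        predicted = intercept + slope * x
        residual = y - predicted  # experimental minus predicted
        points.append((math.log2(c5 * c3), residual))
    return points


# Hypothetical values for three spots.
points = residual_plot_points(
    x_log2=[0.0, 1.0, -1.0],
    y_log2=[0.5, 1.0, -1.2],
    cy5=[4.0, 800.0, 3000.0],
    cy3=[4.0, 400.0, 6000.0],
    slope=1.0,
    intercept=0.0,
)
for intensity, residual in points:
    print(round(intensity, 2), round(residual, 2))
```

Plotting the second element against the first shows whether the spread of residuals widens towards low signal products.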

Figure 2.

Residual plots, plotting the experimental value minus the predicted value from the scatter plots in Figure 1 against the product of the Cy3 and Cy5 raw signal from the dependent value hybridisation (II and V).

(a) Data from Figure 1(a) (ratios from hybridisation I plotted against ratios from another direct labelling of mRNA, hybridisation II). (b) Data from Figure 1(c) (ratios from hybridisation I plotted against hybridisation V where targets were amplified from 1 µg of total RNA before labelling).

In order to further demonstrate that the amplification method is valid for a larger part of the transcriptome we used a recently developed microarray containing 2995 different ESTs (Sterky et al., 1998). Target labelled from the xylem sample amplified from 1 µg of total RNA was hybridised together with xylem target labelled from the 0.1 µg amplification, and the raw signal of Cy3 (1 µg) was plotted against the raw signal from Cy5 (0.1 µg) (Figure 3). The variability corresponds to a 99% confidence with 1.7-fold expression changes. This experiment demonstrates that the amplification method is reproducible using different amounts of starting material and also demonstrates the utility of our method when a larger part of the transcriptome is analysed.

Figure 3.

Scatter plot of raw data from the 2995 array on log2 scale where the signal from the Cy5 channel (target amplified from 1 µg of total RNA) is plotted against the Cy3 channel (target amplified from 0.1 µg of total RNA).

The solid line represents the regression line and the dashed lines represent the 99% confidence interval.

The average correlation coefficients using 1 µg and 0.1 µg of total RNA and the 3′ cDNA tag amplification procedure in our experiments were 0.96 and 0.94, respectively. This can be compared to the reported correlation coefficients of 0.77 and 0.79 for an RNA polymerase amplification strategy using comparable amounts of sample (Pilarsky et al., 1999). Thus, the 3′ cDNA tag amplification approach may have advantages when compared to full-length transcript RNA polymerase strategies for reproducible amplification of the transcriptome. The approach is not only restricted to cDNA arrays as shown here, since the probes of pre-fabricated oligonucleotide arrays are generally complementary to sequences in the 3′ cDNA region.

In order to demonstrate the usefulness of this amplification technique, tissue samples were sectioned out from phloem tissues at different stages of development (Figure 4). Secondary phloem in woody species is derived from the vascular cambium, which occurs as a continuous ring of cells between the xylem and the phloem. The cell files seen in a stem cross-section are the result of a patterned control of numbers, places and planes of cambial cell division, and a subsequent regulated expansion and differentiation of the cambial derivatives into tracheary elements, vessels, fibres, parenchyma, sieve elements and companion cells (Mauseth, 1988). This differentiation process follows a radial distribution, meaning that the cambium contains undifferentiated cells and that cells at increasing distances from the cambium are progressively more differentiated, until they finally become mature. The phloem tissues were sampled using a cryotome as described by Uggla et al. (1996). Each sample was approximately 2 mm × 20 mm × 30 µm, comprising 2–4 cell layers and giving a fresh weight of ∼0.5 mg. This dissection technique gives highly enriched samples from specific differentiation stages in phloem development. The sample positions are shown in Figure 4. Sample A consists of differentiating phloem directly adjacent to the cambial cells, whereas sample B consists of more mature and further developed phloem cells.

Figure 4.

Sample positions.

(Left) Nomarski optics picture showing part of the cambial region of the sampled hybrid aspen stem.

Positions of the tangential tissue sections used for sample A (differentiating phloem adjacent to the cambial cells) and B (more mature and further developed phloem cells) are indicated in the figure.

(Right) Toluidine blue stained cross-section of hybrid aspen in high resolution, showing the bark to the xylem as indicated. The approximate locations of the sample regions are also indicated. Black bars represent 100 µm. Arrows indicate fibre bundles.

Table 1 shows the expression ratios for genes with a mean ratio higher than 2 in two replicate hybridisations. Using these threshold values, 15% of the analysed genes showed differential expression, demonstrating the usefulness of transcript profiling in detecting differences in gene expression between defined cell layers. Our data also emphasise that small and defined samples are useful for tracking changes in gene expression during specific stages of plant development. A Zwille homologue, AI166030, shows higher expression in sample A, originating from the early phase of phloem development. Zwille is believed to be a regulator of meristematic tissue maintenance (Moussian et al., 1998), and this proposed function together with the expression pattern indicates that this gene could be involved in the control of the cells entering differentiation from the cambium mother cells. An expansin-like gene, AI166095, is also up-regulated in sample A, which is in accordance with the rapid expansion of these cells. In the B sample, consisting of older developing phloem, two interesting genes linked to lignification are up-regulated: AI161730, a blue copper protein homologue, and AI166034, a phenylalanine ammonia-lyase coding gene. This indicates that lignification is induced at this specific stage of phloem development. In fact, Figure 4 also shows, using birefringence as a marker for ordered cellulose deposition, that secondary cell wall synthesis has started, as developing fibre bundles are present in sample B. Thus, morphological data also confirm that lignification has started in these cells. The two genes with the highest differential expression are AI161823 (up-regulated 17 times in sample B) and AI166101 (up-regulated 12 times in sample A). The roles of these genes in the specific tissues are not obvious.
AI166101 has the highest homology to a group of proteins containing osmotins, thaumatin and PR-proteins involved in different stress responses, although AI166101 could possibly be involved in osmoregulation in the rapidly growing cells (Singh et al., 1989). AI161823 has similarity to an SF16 protein isolog and a number of hypothetical proteins from Arabidopsis thaliana with unknown functions. SF16 is a protein that so far is believed to be exclusively expressed in pollen, and SF16 could possibly be a nucleic acid binding protein (Dudareva et al., 1994). These results are presented to show the utility of the technology. Since only 192 genes were analysed, a more detailed study of the phloem region with a larger array would give an expression road map of secondary phloem development in hybrid aspen and would also give more information about genes with unknown function. In summary, these experiments demonstrate that with as little as 0.1 µg of total RNA we were able to obtain reliable relative expression data for the genes arrayed. We also demonstrated the utility of the technology, exemplified by phloem development in hybrid aspen. This generic technology opens up new possibilities for transcript profiling where analysis of small tissue samples is a prerequisite. The cDNA tag amplification technique reported here should enable gene expression studies in a few apical meristems of many plant species. It should also contribute significantly to transcript profiling studies in the plant model Arabidopsis thaliana where sample amount is limiting.

Table 1. Expression ratios of sample A/B (see Figure 4) for genes with a mean ratio higher than 2 or lower than 0.5 in two replicate hybridisations

| Annotation | Accession no. | Ratio A/B | Ratio B/A |
| EXPANSIN ATEX6. Arabidopsis thaliana | AI166095 | 5.1 | |
| HYPOTHETICAL 19.8 KD PROTEIN. Arabidopsis thaliana | AI161794 | 4.5 | |
| ZWILLE PROTEIN. Arabidopsis thaliana | AI166030 | 4.1 | |
| PROTEIN F19F24.19. Arabidopsis thaliana | AI166050 | 2.7 | |
| LATEX ALLERGEN HEV B 5. Hevea brasiliensis | AI161743 | 2.6 | |
| UNCONVENTIONAL MYOSIN. Helianthus annuus | AI161796 | 2.6 | |
| GAG-LIKE POLYPROTEIN (FRAGMENT). Fusarium poae | AI166138 | 2.6 | |
| CYCLOPHILIN (EC Arabidopsis thaliana | AI161729 | 2.3 | |
| HISTONE H4. Sesbania rostrata. HIS4 | AI166092 | 2.3 | |
| GC-CYP (EC Vicia faba (Broad bean) | AI166062 | 2.2 | |
| 60S RIBOSOMAL PROTEIN L7A. Oryza sativa | AI161831 | 2.2 | |
| SDL5A. Glycine max (Soybean) | AI161724 | 2.1 | |
| HISTONE H2A. Lycopersicon esculentum | AI166007 | 2.1 | |
| HYPOTHETICAL 55.2 KD PROTEIN. Arabidopsis thaliana | AI165960 | | 2.4 |
| HYPOTHETICAL 74.3 KD PROTEIN. Arabidopsis thaliana | AI161777 | | 2.9 |
| GLUTAREDOXIN ISOLOG. Arabidopsis thaliana | AI166113 | | 3.3 |
| No annotation | AI161820 | | 3.6 |
| F1N21.7. Arabidopsis thaliana | AI166128 | | 5.3 |
| SF16 PROTEIN ISOLOG. Arabidopsis thaliana | AI161823 | | 16.7 |
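The selection rule behind Table 1 (mean ratio higher than 2 or lower than 0.5 in two replicate hybridisations) can be sketched as below. The sketch assumes an arithmetic mean of the replicate A/B ratios and reports the reciprocal for genes higher in sample B; the source does not state how the mean was formed, so both choices, as well as the function name and the example ratios, are assumptions.

```python
def classify_gene(ratios_a_over_b, threshold=2.0):
    """Classify a gene from replicate A/B expression ratios.

    Returns ("higher in A", mean A/B), ("higher in B", mean expressed
    as B/A), or None if the mean ratio lies within the thresholds.
    """
    mean_ratio = sum(ratios_a_over_b) / len(ratios_a_over_b)
    if mean_ratio > threshold:
        return ("higher in A", mean_ratio)
    if mean_ratio < 1.0 / threshold:
        return ("higher in B", 1.0 / mean_ratio)
    return None


# Hypothetical replicate ratios for three genes.
replicates = {
    "gene_1": [5.0, 5.2],
    "gene_2": [0.25, 0.25],
    "gene_3": [1.5, 0.9],
}
for name, ratios in replicates.items():
    print(name, classify_gene(ratios))
```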

Experimental procedures

Microarrays consisting of 192 hybrid aspen cDNA clones (Sterky et al., 1998) in triplicate and 48 human cDNA clones were used in the evaluation of the cDNA tag amplification protocol. The larger array contained 2995 hybrid aspen clones (to be described in detail in another paper) in duplicate and the same human clones as the small 192-clone array. The gene fragments were PCR amplified in 200 µl and, after purification on MultiScreen PCR filter plates (Millipore, Stockholm, Sweden), dissolved in 40 µl of 3× SSC and 0.04% sarcosyl. The clones were spotted using the GMS 417 Arrayer (Affymetrix, Santa Clara, CA, USA). The human clones were randomly chosen and acted as negative controls; no signal from these was detected in the presented hybridisations. Total RNA was obtained from xylem and phloem tissue of hybrid aspen (Hertzberg and Olsson, 1998) and the corresponding poly(A) RNA was isolated using paramagnetic oligo(dT) beads (Dynal, Oslo, Norway), according to the manufacturer's instructions.

For the cDNA tag amplification protocol, 1 or 0.1 µg of total RNA from the respective tissue was used. First and second strand cDNA synthesis was performed according to the manufacturer (Gibco-BRL, Hercules, CA, USA) using 1 µg NotI-oligo dT primer (5′-biotin-GAGGTCCCAACCGCGGCCGC(T)15–3′) and 200 U of Superscript II. The reactions were terminated by the addition of 10 µl of 0.5 M EDTA, followed by phenol-chloroform extraction and ethanol precipitation. The precipitates were dissolved in 40 µl of 1 × TE (10 mM Tris–HCl, 1 mM EDTA) and primers were removed using Chroma Spin TE-100 columns (Clontech, Palo Alto, CA, USA). An inverted sonication probe was used to fragment the cDNA samples using 20 × 10 sec pulses at 90% effect (Sonifier® B-12, Branson Sonic Power, Danbury, CT, USA). Biotinylated 3′ cDNA tags were isolated using 25 µl of paramagnetic streptavidin-coated beads (10 mg ml−1, Dynal) in 40 µl binding/washing buffer with continuous rotation at 37°C. The immobilised cDNA tags were end-repaired using 1.5 U T4 DNA polymerase (MBI Fermentas, Vilnius, Lithuania) in a 30 µl reaction volume at 12°C for 20 min according to the manufacturer. Blunt-end adapters (Sima18: 5′-GGA TCC GCG GTG-3′; Sima19: 5′-TCT CCA GCC TCT CAC CGC GGA TCC-3′) were ligated onto the immobilised, repaired cDNA tags using a ligase buffer (66 mM Tris–HCl pH 7.6, 5 mM MgCl2, 5 mM DTT, 50 µg ml−1 BSA) comprising 0.5 nmol adapter, 0.2 mM ATP and 7 U T4 DNA ligase (MBI Fermentas) in a total volume of 30 µl. Ligation was performed overnight at room temperature with the beads kept in suspension. The cDNA tags were released from the magnetic beads in a 30 µl volume by restriction with NotI (MBI Fermentas). Six microlitres of this eluate (containing cDNA tags) was used as template in a subsequent PCR.
The PCR was performed in 100 µl containing 200 µM of each dNTP, 0.75 µM Sima19, 0.75 µM NotI-oligo dT primer, 65 mM Tris–HCl pH 8.8, 4 mM MgCl2, 16 mM (NH4)2SO4, 0.5 µM BSA and 3 U Ampli-Taq DNA polymerase (Perkin Elmer, Boston, MA, USA). Cycling was performed according to the following procedure: initial incubation at 72°C for 3 min, followed by the addition of Taq DNA polymerase and subsequent cycling: 72°C for 20 min, 95°C for 1 min, 45°C for 5 min, 72°C for 15 min, followed by four cycles (95°C for 1 min, 50°C for 1 min, 72°C for 15 min), followed by 14–25 cycles (95°C for 1 min, 50°C for 1 min, 72°C for 2 min). The optimal number of cycles was defined as 1–2 cycles before PCR product saturation, as determined by agarose gel electrophoresis.
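For planning purposes, the cycling programme above can be written down as data and its total hold time calculated. The sketch below only restates the temperatures and times given in the text; the variable and function names are hypothetical, and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
# Each step is (temperature in deg C, hold time in minutes).
BEFORE_ENZYME = [(72, 3)]                            # initial incubation
FIRST_ROUND = [(72, 20), (95, 1), (45, 5), (72, 15)]
EARLY_CYCLE = [(95, 1), (50, 1), (72, 15)]           # run 4 times
LATE_CYCLE = [(95, 1), (50, 1), (72, 2)]             # run 14-25 times
EARLY_CYCLES = 4


def minutes(steps):
    """Sum of hold times for a list of (temperature, minutes) steps."""
    return sum(hold for _temperature, hold in steps)


def total_hold_minutes(late_cycles):
    """Total hold time of the amplification, excluding ramp times."""
    if not 14 <= late_cycles <= 25:
        raise ValueError("the protocol specifies 14-25 final cycles")
    return (
        minutes(BEFORE_ENZYME)
        + minutes(FIRST_ROUND)
        + EARLY_CYCLES * minutes(EARLY_CYCLE)
        + late_cycles * minutes(LATE_CYCLE)
    )


for n in (14, 25):
    print(n, "final cycles:", total_hold_minutes(n), "min")
```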

Labelling of the amplified cDNA tag populations was carried out in a cycled primer-extension reaction (asymmetric PCR) using the Sima19 primer. The labelling reaction contained 100–200 ng amplified cDNA, 80 µM each of dATP, dGTP and dCTP, 20 µM of dTTP, 60 µM of Cy3- or Cy5-dUTP (Amersham Pharmacia Biotech, Uppsala, Sweden), 1 µM of Sima19 primer, 65 mM Tris–HCl pH 8.8, 4 mM MgCl2, 16 mM (NH4)2SO4, 0.5 µM BSA and 3 U Ampli-Taq Gold in a total volume of 50 µl. The PCR program was 95°C for 60 sec and 10 cycles of 95°C for 30 sec, 50°C for 30 sec and 72°C for 10 min. Labelling of total RNA (100 µg) and poly(A) RNA (1 µg) with Cy3- or Cy5-dUTP was performed according to previously reported protocols. After labelling, the RNA was hydrolysed using NaOH at 70°C.
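The nucleotide concentrations in the labelling reaction imply a fixed share of dye-coupled dUTP among the thymidine analogues. The short calculation below makes this explicit; it uses only the concentrations given in the text (read as µM), and the dictionary and function names are hypothetical.

```python
# Nucleotide concentrations in the labelling reaction, in micromolar.
LABELLING_MIX_UM = {
    "dATP": 80,
    "dGTP": 80,
    "dCTP": 80,
    "dTTP": 20,
    "Cy-dUTP": 60,  # Cy3- or Cy5-dUTP
}


def labelled_t_fraction(mix):
    """Fraction of the dTTP + Cy-dUTP pool that carries a dye."""
    return mix["Cy-dUTP"] / (mix["Cy-dUTP"] + mix["dTTP"])


def t_pool_um(mix):
    """Combined concentration of dTTP and Cy-dUTP."""
    return mix["Cy-dUTP"] + mix["dTTP"]


print(labelled_t_fraction(LABELLING_MIX_UM))  # 0.75
print(t_pool_um(LABELLING_MIX_UM))            # 80, matching the other dNTPs
```

The share of dye-coupled nucleotide actually incorporated by the polymerase is not given in the text and need not equal this fraction.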

Targets labelled from amplified cDNA tags, total RNA and mRNA from the two tissues were mixed, diluted to 500 µl in 1 × TE and concentrated using a Microcon-30 concentrator (Ambicon, Austin, TX, USA); this step was repeated twice. Ten micrograms of tRNA (Sigma, St Louis, MO, USA) and 10 µg of oligo(A) (Sigma) were added and the volume was adjusted to 4.5 µl.

The microarray slides were pre-hybridised for 30 min at 42°C in a solution containing 5× SSC, 5× Denhardt's solution, 100 µg ml−1 CT-DNA (Sigma), 0.5% SDS and 50% formamide. The slides were then washed in water, rinsed in 2-propanol and finally dried using N2 gas. The labelled targets (4.5 µl) were mixed with 1.8 µl of 20× SSC, 2.2 µl of deionised formamide and 0.5 µl of 10% SDS. The mixture was heat denatured for 2 min at 95°C and cooled for 30 sec on ice prior to hybridisation to the microarray. The microarray was then covered with a plastic cover slip (HybriSlip, Surgipath Medical Industries Inc., Richmond, IL, USA), placed in a hybridisation chamber (ArrayIt, TeleChem International, Sunnyvale, CA, USA) and incubated for 13–16 h in a 42°C water bath. After hybridisation the slides were washed successively for 5 min each in 1 × SSC + 0.03% SDS, 0.2 × SSC and 0.1 × SSC. The slides were scanned using a GMS 418 scanner (Affymetrix, Santa Clara, CA, USA) and the obtained data were analysed using GenePix 2.0 software (Axon Instruments, Foster City, CA, USA). Statistical analysis was performed with SigmaPlot 5.0 (SPSS® Inc., Chicago, IL, USA).
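The final composition of the hybridisation mixture follows from the volumes given above. The sketch below works through the arithmetic; it assumes that the 4.5 µl of labelled target contributes no SSC, formamide or SDS, and the names used are hypothetical.

```python
# Volumes in microlitres, as given for the hybridisation mixture.
COMPONENTS_UL = {
    "labelled target": 4.5,
    "20x SSC": 1.8,
    "deionised formamide": 2.2,
    "10% SDS": 0.5,
}


def final_concentrations(components):
    """Final SSC strength, formamide % (v/v) and SDS % in the mixture."""
    total = sum(components.values())
    return {
        "total_volume_ul": total,
        "ssc_x": 20.0 * components["20x SSC"] / total,
        "formamide_percent": 100.0 * components["deionised formamide"] / total,
        "sds_percent": 10.0 * components["10% SDS"] / total,
    }


result = final_concentrations(COMPONENTS_UL)
for key, value in result.items():
    print(key, round(value, 2))
```

Under these assumptions the mixture has a volume of 9 µl and contains about 4× SSC, 24% formamide and 0.56% SDS.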


We thank Deirdre O'Meara, Dr Ove Nilsson and Dr Rishi Bhalerao for valuable comments. This work was supported by grants from the Swedish Foundation for Strategic Research, the Swedish Council for Forestry and Agricultural Research and The Knut and Alice Wallenberg Foundation.