The responses of plants to abiotic stresses are accompanied by massive changes in transcriptome composition. To provide a comprehensive view of stress-induced changes in the Arabidopsis thaliana transcriptome, we have used whole-genome tiling arrays to analyze the effects of salt, osmotic, cold and heat stress as well as application of the hormone abscisic acid (ABA), an important mediator of stress responses. Among annotated genes in the reference strain Columbia we have found many stress-responsive genes, including several transcription factor genes as well as pseudogenes and transposons that have been missed in previous analyses with standard expression arrays. In addition, we report hundreds of newly identified, stress-induced transcribed regions. These often overlap with known, annotated genes. The results are accessible through the Arabidopsis thaliana Tiling Array Express (At-TAX) homepage, which provides convenient tools for displaying expression values of annotated genes, as well as visualization of unannotated transcribed regions along each chromosome.
Being sessile, plants cannot move away from extreme conditions such as heat, cold, high salinity or drought. These stress situations must trigger signals that alter plant physiology and growth to ensure survival in hostile environments. While relatively little is known about the primary receptors that sense these stresses, several downstream signaling cascades have been identified and studied in detail. Stresses may lead to an increase in reactive oxygen species (ROS), cytosolic Ca2+ or inositol phosphates, which in turn can induce signaling events further downstream (Allen et al., 2000; Xiong et al., 2002; Apel and Hirt, 2004; Gao et al., 2004; Pitzschke et al., 2006). Transmission of these signals includes post-translational modifications such as phosphorylation, ubiquitination and sumoylation of important regulatory factors (Halfter et al., 2000; Dong et al., 2006; Miura et al., 2007). Ultimately, many of these signaling cascades result in altered expression of stress-responsive genes. Some of these encode proteins responsible for the biosynthesis of hormones, such as abscisic acid (ABA), which can act as signaling molecules that amplify and spread the initial stress signal. Interestingly, different stresses as well as ABA treatment can change the expression of a common set of genes, indicating that stress responses are mediated in part by overlapping signaling pathways (Ishitani et al., 1997; Shinozaki and Yamaguchi-Shinozaki, 2000; Rabbani et al., 2003). However, these common signaling pathways might be activated in a different temporal and spatial manner by individual stresses (Xiong et al., 2002; Delessert et al., 2004; Kilian et al., 2007; Dinneny et al., 2008). In addition, there are signaling events that are specific to a particular stress (Chinnusamy et al., 2004; Yamaguchi-Shinozaki and Shinozaki, 2006). Together, these differential responses enable the plant to react adequately and specifically to different stresses.
The vast majority of expression analyses have been performed with full-length cDNA arrays or oligonucleotide arrays targeting known transcripts. The main disadvantage of these techniques is that they rely on prior information about potentially transcribed regions based on cDNA cloning, expressed sequence tags (EST) or computational gene predictions. The quasi standard for A. thaliana is the Affymetrix ATH1 array, which allows simultaneous detection of RNA from more than 20 000 genes (Redman et al., 2004; Busch and Lohmann, 2007). However, transcripts of about 10 000 annotated genes cannot be analyzed on this platform. In addition, RNAs appearing only under extreme environmental conditions may well have escaped previous annotation efforts. Especially in the light of the growing appreciation of the roles of non-coding RNAs, a more unbiased detection of stress-induced changes of the Arabidopsis transcriptome is of great importance (Sunkar and Zhu, 2004; Franco-Zorrilla et al., 2007; Reyes and Chua, 2007; Liu et al., 2008; Zhou et al., 2008).
Comprehensive quantification of known genes and the detection of novel transcripts can be achieved by two different methods: whole-genome tiling arrays or direct RNA sequencing (RNA-seq) (Yazaki et al., 2007; Laubinger et al., 2008b; Marioni et al., 2008; Mortazavi et al., 2008; Nagalakshmi et al., 2008; Naouar et al., 2008; Wilhelm et al., 2008). While RNA-seq shows considerable promise, robust statistical methods for interpreting RNA-seq data still need to be developed, whereas tiling arrays present a mature platform for gene expression analysis and the detection of novel transcripts. Here, we utilized whole-genome tiling arrays to analyze stress-induced changes in the A. thaliana transcriptome. We applied salt, osmotic, cold and heat stress as well as ABA treatments to whole seedlings and monitored the transcriptome after 1 h and 12 h of stress treatment. Our results demonstrate that many genes that cannot be characterized with the ATH1 array, including transcription factor genes and pseudogenes, are strongly responsive to certain stresses. Moreover, we identified unannotated regions in the genome that are transcribed after stress treatments. Together with a similar, recently published study (Matsui et al., 2008), our data set advances the understanding of stress-responsive gene expression in A. thaliana. To make our data accessible to the research community, all data have been included in the Arabidopsis thaliana Tiling Array Express (At-TAX) online resource, which allows visualization of gene expression estimates and single-probe intensities along each chromosome.
Results and discussion
Stress treatments and the responses of known stress-regulated genes
After seedlings had been grown for 10 days on solid MS medium at 21°C, they were transferred to liquid MS medium containing no additives (mock control), 200 mM NaCl (salt stress), 300 mM mannitol (osmotic stress) or 100 μM ABA. For cold and heat stress, seedlings were transferred to pre-chilled or pre-warmed liquid MS medium and incubated at 8°C and 30°C, respectively. Samples were taken after 1 and 12 h of continuous stress treatment. RNA was extracted from whole seedlings and converted into double-stranded (ds) DNA targets that were hybridized to whole-genome tiling arrays (Affymetrix Arabidopsis Tiling1.0R®). This procedure allowed robust detection of transcriptional activity in the genome, although it does not reveal which DNA strand is transcribed (Laubinger et al., 2008b). Gene expression estimates for all genes included in the TAIR7 annotation were calculated, and significant differences in expression were determined, as described (Laubinger et al., 2008b).
Global comparison of stress-responsive expression of annotated genes
We first compared global changes in expression of annotated genes in response to different stresses. In addition to 20 583 genes that are also covered by the ATH1 expression array, 9645 annotated genes could be measured with tiling arrays. We analyzed both groups separately to determine the benefit of tiling arrays compared with ATH1 arrays. Depending on the treatment, about 300 to 900 of the ‘ATH1’ genes are up- or downregulated after 1 and 12 h of exposure to stress (Figure 2a). A similar fraction of transcripts, up to 4%, show changes in expression among the genes exclusively represented on the tiling array (Figure 2a). This suggests that analyses with ATH1 microarrays have missed several hundred stress-regulated transcripts. Gene ontology (GO) categorization of stress-regulated genes can be viewed in the Supporting Information (Figures S4 and S5).
The list of stress-responsive genes that are only represented on the tiling array comprises many with regulatory functions, such as transcription factor genes (Figure 2b, Table 1). Other interesting stress-responsive genes include, for example, a cold-inducible gene coding for a zinc-finger protein or genes coding for unknown proteins (Figure 2b). Among the stress-regulated genes are also ones that had been identified previously based on other properties, such as the potential meristem regulator ULTRAPETALA2 (ULT2) (Carles et al., 2005), the expression of which is specifically induced by ABA and cold treatment (Figure 2b).
Table 1. Differentially expressed transcription factors after stress treatment
ATH1 and tiling array
Tiling array only
Genes were separated into those that are represented on both ATH1 and tiling arrays and those that are represented only on the tiling array.
Overlap of the transcriptional reprogramming in response to different stresses
Many genes respond not only to a single treatment, but are differentially regulated in several conditions. Hence, we were interested in comparing the effects of different stresses.
We first assessed to what extent stress-induced transcriptome changes are transient or constitutive. For all treatments, only a minority of ATH1 array genes were differentially expressed in the same direction after both 1 h and 12 h (Figure 3a). The largest overlap was found for heat-responsive genes (21%). The overlap between the two time points was 10–14% for salt, osmotic stress and ABA-responsive genes, and merely 4% for cold-responsive genes (Figure 3a). Interestingly, among the cold-responsive genes, a larger fraction (9%) changed expression in opposite directions at the two time points (Figure S6). We obtained similar results for genes that are represented only on the tiling array (Figure 3b).
A convenient method for visualizing similarity between multivariate data sets is principal components analysis (PCA). A PCA of our stressed samples showed that mock-, salt-, osmotic- and ABA-treated samples grouped together after 1 h, but were more distant in the corresponding 12-h samples (Figure 3c). The gene expression signatures of heat-stressed samples show the largest distance from all other samples, at both the 1-h and 12-h time points. In contrast, the gene expression signatures of cold-treated samples are closer to those of salt-, osmotic- and ABA-treated samples, which might reflect a general involvement of ABA in all of these stresses, but not in heat stress.
Many ATH1 array genes (24–46%) respond to at least two of the following three stresses: salt, osmotic and ABA treatment (Figure 3a). In contrast, ATH1 array genes that changed in expression after 1 h of exposure to cold or heat stress show much less overlap with any of the other treatments (5–10%) (Figure 3a). After 12 h, however, there is greater overlap among cold- or heat-responsive ATH1 array genes and those that respond to salt, osmotic and ABA treatment (10–20%) (Figure 3a). These results suggest that the fast, transient response after cold and heat stress is more specific to these two conditions than the longer-term changes, which apparently include more generally stress-responsive genes. The greater overlap after 12 h of salt, osmotic, ABA and cold treatment is reflected in the PCA results (Figure 3c). That this is not the case for the heat-responsive genes after 12 h, even though there is an even greater overlap with salt-, osmotic- and ABA-responsive genes, may relate to the magnitude of the effect seen on expression of heat-responsive genes compared with the other treatments.
To explore the question of overlap in stress-responsive genes in more detail, we specifically asked how many genes could be detected as differentially expressed in all stresses and how many genes respond significantly only to a single stress. After 1 h we found seven and fifteen genes that are up- or downregulated, respectively, in all five conditions (Figure 4a). The number of genes with a broad response increases after prolonged application of stress. After 12 h we detected 35 and 66 concertedly up- and downregulated genes, respectively (Figure 4b). Again, several genes with a broad expression spectrum can only be analyzed with tiling arrays (Figure 4a,b).
We identified genes that change expression only in response to a specific stress using an entropy-based approach (Schug et al., 2005). Most genes with a specific response are seen after 1 h of cold and heat stress treatment, consistent with the limited overlap between each of these two treatments and all other treatments, while only a much smaller number of genes is specifically regulated by salt, osmotic or ABA treatment (Table 2). Again, these classes include several genes not represented on ATH1 arrays (Table 2).
Table 2. Analysis of genes that were specifically regulated by a particular stress treatment
ATH1 and tiling array
Tiling array only
Genes were separated into those that are represented on both ATH1 and tiling arrays and those that are represented only on the tiling array.
Salt 1 h
Salt 12 h
Osmotic 1 h
Osmotic 12 h
ABA 1 h
ABA 12 h
Cold 1 h
Cold 12 h
Heat 1 h
Heat 12 h
Identification of stress-responsive, unannotated transcribed regions
In addition to capturing information on all annotated genes, a motivation for tiling array applications is the identification of transcriptionally active regions (TARs) that have not yet been annotated. To detect novel TARs outside annotated exons, we applied a segmentation method that we have developed for tiling array data (Laubinger et al., 2008b; Zeller et al., 2008). The accuracy of TARs predicted from the control samples varied between 83 and 85% (calculated per tiling probe, data not shown), which is slightly higher than that observed in a similar study (Laubinger et al., 2008b). To further determine the accuracy of our TAR prediction, we asked to what extent our TARs are also represented by massively parallel signature sequencing (MPSS) tags (Meyers et al., 2004b). Per sample, between 40.2 and 43.1% of the high-confidence TARs contain one or more MPSS tags mapped to the Watson or Crick strand (Table 3). According to the same criteria, only 25.2% of the exons annotated in TAIR7 are supported by MPSS data. We therefore conclude that many of our high-confidence TARs indeed correspond to expressed transcripts. We also compared our TARs with the ‘non-AGI’ TARs identified by Matsui et al. (2008). For more than 87% of their non-AGI TARs, our set of high-confidence predictions contains one or more overlapping TARs.
Table 3. Overlap of high-confidence transcriptionally active regions (TARs) with massively parallel signature sequencing (MPSS) tags
Overlap of all TARs with MPSS tags (%)
Overlap of stress-induced TARs with MPSS tags (%)
Salt 1 h
Salt 12 h
Osmotic 1 h
Osmotic 12 h
ABA 1 h
ABA 12 h
Cold 1 h
Cold 12 h
Heat 1 h
Heat 12 h
Stress-induced TARs were identified among high-confidence predictions for stress-treated samples by applying a statistical test for differential expression relative to mock controls (see Experimental procedures). The accuracy of this approach was determined by reverse transcription PCR (RT-PCR) validation experiments (Figure 5b). Indeed, TARs predicted to be strongly stress-responsive are more abundant in stressed samples than in the corresponding mock control (Figure 5b).
Per individual stress treatment, we found 82–338 unannotated, stress-induced TARs, covering 21–104 kb of the genome (Figure 5a). The size of individual stress-induced TARs ranged from approximately 135 bp (4 tiling array probes) to almost 2 kb (53 tiling array probes). Most of them were found after 12 h of salt stress treatment, while the fewest were identified after 1 h of osmotic stress or ABA treatment (Figure 5a). On average 18.5% of these unannotated, stress-induced TARs contain MPSS tags (Table 3). That the MPSS support for stress-induced TARs is substantially lower than for all high-confidence TARs was not unexpected, since MPSS tags were not sequenced from plants which were subjected to stress comparable to our treatments.
We also asked how specific the stress response of novel TARs is. In a pairwise comparison, we found the greatest overlap between novel TARs after salt stress, osmotic stress and ABA treatment (Figure 5c), resembling the pattern for annotated genes. However, the overall percentage of overlap was lower than for annotated genes.
Genomic location and conservation of novel transcripts
To characterize novel stress-specific TARs in more detail, we determined conservation of the genomic regions that give rise to these TARs in three other plant species for which complete or nearly complete genome sequences are available: Populus trichocarpa, Oryza sativa and Sorghum bicolor (Goff et al., 2002; Yu et al., 2002; Bedell et al., 2005; Tuskan et al., 2006). Compared with annotated exons, novel stress-specific TARs are in general much less conserved (Figure 6a). This could reflect that these novel TARs are evolutionarily younger or less stable. Alternatively, if these TARs are mostly non-coding, primary sequence conservation might be less important.
Novel stress-specific TARs in the genome might either constitute unannotated exons of known genes or they might be independent genes. A simple indicator for these alternatives can be the distance of novel TARs to annotated genes. Per sample we identified between 21 and 69 unannotated stress-specific TARs separated by more than 500 bp from the nearest annotated genes (examples shown in Figure 6b,c; for other samples see Figure S7), while others are in close proximity to or even abut annotated genes (examples in Figure 6d). Because our method did not identify the strand from which transcripts arise, we examined some of these cases by reverse transcription followed by PCR (RT-PCR). In one case, there is apparently an additional exon that is induced under one specific stress, but not others (Figure 6d, left). In another case, a minor transcript form is present under all conditions, but becomes more abundant under a specific stress (Figure 6d, middle). In a third case, it appears that a constitutive exon has simply been missed in previous annotation efforts (Figure 6d, right).
Apart from a comprehensive expression analysis of most annotated genes, we employed our tiling array data for the de novo detection of TARs using the mSTAD algorithm (Laubinger et al., 2008b; Zeller et al., 2008). Among the novel TARs in regions annotated as intergenic, we identified several hundred with an interesting stress-responsive expression pattern. In several cases, where novel TARs were close to annotated genes, these constituted stress-induced exons.
Incorporation of stress data into the At-TAX online resource
For the whole stress data set described here, we integrated gene expression estimates, predicted TARs, as well as single-probe intensities along the chromosomes into our At-TAX visualization tools accessible through http://www.weigelworld.org/resources/microarray/at-tax. In addition, lists with stress-regulated genes and genes with broad and specific stress response are accessible through the At-TAX website. Further supplemental files on the same website contain information on TARs, their genomic location, predicted expression level, location relative to annotated genes, overlap with ESTs, and P-values resulting from tests for induction upon stress. By making this information available we enable the community to analyze the expression behavior of their favorite gene under various stress conditions or further investigate the roles of hitherto unknown transcripts.
Experimental procedures
Plant material and growth conditions
Wild-type Arabidopsis seeds (Col-0) were plated on half-strength MS medium supplemented with 1% sucrose and kept for 3 days at 4°C. Plates were then transferred to continuous light at 21°C. After 10 days a control sample was taken (time point = 0) and plants were subsequently transferred to liquid MS medium with 1% sucrose (mock control). For stress application, MS medium was supplemented with 200 mM NaCl, 300 mM mannitol or 100 μM ABA. Cold and heat stress were induced in pre-cooled and pre-warmed liquid MS medium, respectively, and plants were kept at 8 ± 1 or 30 ± 1°C. Samples were taken after 1 and 12 h and frozen in liquid nitrogen. All experiments were carried out in biological triplicates.
RNA isolation, target preparation and array hybridization
Total RNA was isolated using the RNeasy Plant Mini Kit (Qiagen, http://www.qiagen.com/) and 1 μg RNA was used for all subsequent steps. Preparation of dsDNA hybridization targets, array hybridization, washing and scanning were performed exactly as described previously (Laubinger et al., 2008a,b). Raw array data files were submitted to GEO and are available under the accession number GSE13584.
Detection and comparison of differentially expressed genes
Tiling probes were mapped to gene models annotated in TAIR7 and expression measurements were calculated using RMA as described (Laubinger et al., 2008b). Genes expressed differentially under stress compared with the corresponding mock control were identified using the RankProduct method with a false discovery rate cutoff of 10% (Breitling and Herzyk, 2005; Hong and Breitling, 2008).
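The statistic underlying the RankProduct method can be illustrated with a short sketch. Within each replicate comparison, genes are ranked by fold change, and the geometric mean of a gene's ranks across replicates is its rank product; small values indicate consistent upregulation. The function name and input layout are hypothetical, ties are not treated specially, and the permutation-based estimate of the false discovery rate used by the published method is omitted here.

```python
def rank_products(log_fc):
    """Rank product per gene for upregulation.

    log_fc: dict mapping gene ID -> list of log fold changes
            (stress vs. mock), one entry per replicate comparison.
    Returns a dict mapping gene ID -> geometric mean of its ranks,
    where rank 1 is the largest fold change within a replicate.
    """
    genes = list(log_fc)
    n_reps = len(log_fc[genes[0]])
    product = {g: 1.0 for g in genes}
    for r in range(n_reps):
        # Sort genes from strongest to weakest induction in replicate r.
        ordered = sorted(genes, key=lambda g: log_fc[g][r], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            product[g] *= rank
    return {g: p ** (1.0 / n_reps) for g, p in product.items()}
```

Ranking in ascending order instead would give the corresponding statistic for downregulated genes.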
We compared differentially expressed gene sets for different stress samples at the same time point as well as for the same stress at different time points. For this we calculated the percentage of genes in common between two sets A and B as
additionally requiring for a gene in |A ∩ B| that it changed in the same direction in A and B. Figure S6 shows the percentage of genes upregulated in A but downregulated in B, or vice versa. To define the set of ‘broad response genes’, annotated transcripts consistently up- or downregulated under all five stresses were identified separately for both time points and both platforms. The statistical significance of overlaps was assessed with a hypergeometric test (using the R function phyper; http://www.r-project.org/) (Fury et al., 2006).
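The overlap comparison and its significance test can be sketched as follows. The normalization of the overlap percentage is not reproduced in the text here, so the sketch assumes division by the size of the union of the two sets, by analogy with the TAR comparison; this denominator is an assumption, as are the function names. The tail probability is the quantity that R's phyper returns when called as phyper(k - 1, n_a, n_total - n_a, n_b, lower.tail = FALSE).

```python
from math import comb

def percent_overlap(a, b):
    """Percentage of genes shared by two differentially expressed sets.

    a, b: dicts mapping gene ID -> direction (+1 up, -1 down).
    A gene is shared only if it changed in the same direction in both.
    ASSUMPTION: the shared count is normalized by the union of A and B.
    """
    shared = [g for g in a if g in b and a[g] == b[g]]
    union = set(a) | set(b)
    return 100.0 * len(shared) / len(union) if union else 0.0

def hypergeom_upper_tail(k, n_a, n_b, n_total):
    """P(X >= k) when n_b genes are drawn from n_total genes,
    n_a of which belong to the other set."""
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    numer = sum(comb(n_a, i) * comb(n_total - n_a, n_b - i)
                for i in range(k, upper + 1))
    return numer / denom
```

A small P-value indicates that two gene sets share more members than expected if they had been drawn independently from all genes on the platform.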
Comparison of genes found to be differentially expressed by stress in this and previous studies
Genes found to be stress responsive in this study were compared with those reported in Matsui et al. (2008) by means of Venn diagrams (Figure S3). Raw data from Kilian et al. (2007) were re-analyzed as described applying the GCRMA method available in the current gcrma package (version 2.15.1) (Wu et al., 2004), and extracting genes exhibiting a fold change ≥3. These lists of stress-responsive genes were also compared with the ones identified here by Venn diagrams (Figures S1 and S2). Overlap significance was assessed as described above for the comparison of differentially expressed genes.
Entropy-based detection of genes with a specific stress response
Following the methodology proposed in Schug et al. (2005), we calculated Shannon entropy H relative to other stresses at the same time point for all genes based on the fold change of expression between treatment and control. The possible values for H range from 0 (for genes exclusively responding to a single stress) to log2(5) (for genes with a uniform stress response to all five stresses). As a measure of response specificity we also computed Q for each gene and each stress condition at both time points. Small values of Q are indicative of genes exhibiting a large fold change restricted to a small number of samples including the stress treatment of interest. For histograms of H and Q across all genes (and stresses) see Figures S8 and S9. Files listing H and Q for each gene are available through the At-TAX homepage.
For Table 2 we extracted genes with entropy H < log2(2) and differentially expressed under at least one stress condition and categorized them according to microarray platform representation. Furthermore, we included genes with stress-induced differential expression and a restricted expression pattern obtained for Q < 2 in the same stressed sample, also categorized by platform representation.
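The two specificity measures can be written out in a few lines. Following the definitions in Schug et al. (2005), the relative response p of a gene to each stress is its weight divided by the sum of its weights over all stresses, H is the Shannon entropy of p, and Q for a given stress is H minus log2 of p for that stress. The sketch assumes that non-negative fold-change magnitudes are used as weights; that choice, the function name and the input layout are assumptions.

```python
from math import log2

def entropy_and_q(fold_changes):
    """Shannon entropy H and per-stress specificity Q for one gene.

    fold_changes: dict mapping stress name -> non-negative fold-change
                  magnitude (ASSUMPTION: magnitudes serve as weights).
    Returns (H, Q) with Q a dict mapping stress name -> Q value.
    """
    total = sum(fold_changes.values())
    if total <= 0:
        raise ValueError("gene shows no response under any stress")
    p = {s: v / total for s, v in fold_changes.items()}
    h = -sum(x * log2(x) for x in p.values() if x > 0)
    # Q is small only when H is low and the stress of interest
    # carries most of the response.
    q = {s: (h - log2(x)) if x > 0 else float("inf") for s, x in p.items()}
    return h, q
```

For a gene responding equally to all five stresses, H equals log2(5); for a gene responding to a single stress, both H and the Q value for that stress are 0.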
Detection of unannotated transcriptionally active regions (TARs)
Transcriptionally active regions were detected using the mSTAD segmentation algorithm (Laubinger et al., 2008b; Zeller et al., 2008). We followed the described normalization procedure (Laubinger et al., 2008b), but additionally performed a background correction of the raw array data before quantile normalization (Bolstad et al., 2003). This correction for uneven background was done by subtracting a mean array image (using a 51 by 51 feature sliding window) (Borevitz et al., 2003). After pre-processing the array data, we trained the internal parameters of the mSTAD model on 1-h and 12-h mock controls. Genome-wide predictions for 1-h salt-, osmotic-, ABA-, cold- and heat-stressed samples were made by the models trained on the 1-h mock control; for 12 h of salt, osmotic, ABA, cold and heat stress the models trained on the 12-h mock control sample were used. From the predicted TARs a set of unannotated, high-confidence predictions (referred to as ‘novel TARs’) was extracted as described previously, requiring that the TARs included at least four probes, fewer than 25% repetitive probes, average discrete expression level between 6 and 10 (as modeled by the mSTAD algorithm) and an overlap to annotated exons of at most 25 nucleotides (Laubinger et al., 2008b). A table containing the genome coordinates and additional annotation of all these novel high-confidence TARs is available from the At-TAX homepage.
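The selection criteria for high-confidence novel TARs amount to a simple filter, sketched below. The record type and its field names are hypothetical, and treating the expression-level bounds of 6 and 10 as inclusive is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Tar:
    n_probes: int             # tiling probes within the TAR
    n_repetitive_probes: int  # probes flagged as repetitive
    mean_level: float         # average discrete expression level
    exon_overlap_nt: int      # nucleotides overlapping annotated exons

def is_high_confidence(tar):
    """Apply the four criteria for an unannotated high-confidence TAR."""
    return (tar.n_probes >= 4
            and tar.n_repetitive_probes / tar.n_probes < 0.25
            and 6 <= tar.mean_level <= 10
            and tar.exon_overlap_nt <= 25)
```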
Testing TARs for stress-induced expression
Each TAR meeting the above criteria for an unannotated high-confidence region was tested for a stress-dependent increase in expression level. We employed the Wilcoxon rank-sum test (also known as the Mann–Whitney U-test), computed with the Kruskal–Wallis test implemented in the Matlab statistics toolbox, which generalizes the rank-sum test and is equivalent to it for two groups. The test compared the intensities of all probes within the TAR of interest between the stressed sample and the corresponding mock control (pooling replicate intensities). When the median intensity under stress was higher than that of the control and the P-value of the statistical test was ≤0.05, a TAR was called ‘stress induced’. A table containing the genome coordinates, P-values, neighboring genes and additional information on stress-induced novel TARs is provided on the At-TAX homepage.
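The decision rule for calling a TAR stress induced can be sketched with the standard library alone. The sketch uses the large-sample normal approximation to the two-sided rank-sum test with tie correction and no continuity correction, which for two groups gives the same P-value as the chi-square approximation of the Kruskal–Wallis test. It is not the Matlab implementation used in the study, and the function names are hypothetical.

```python
from math import erfc, sqrt
from statistics import median

def midranks(values):
    """1-based ranks with ties given their average rank; also tie sizes."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes

def rank_sum_p(x, y):
    """Two-sided rank-sum P-value (normal approximation)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, ties = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in ties) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:
        return 1.0
    z = (u - n1 * n2 / 2) / sqrt(var_u)
    return erfc(abs(z) / sqrt(2))

def is_stress_induced(stress, mock, alpha=0.05):
    """Higher median under stress and a significant rank-sum test."""
    return median(stress) > median(mock) and rank_sum_p(stress, mock) <= alpha
```

Here `stress` and `mock` stand for the pooled probe intensities of one TAR in the stressed sample and the mock control.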
RT-PCR analysis of novel stress-induced TARs
The RT-PCR validation experiments were performed as described previously (Laubinger et al., 2008b). Primer sequences for validated TARs are listed in Table S1.
Comparison between TARs identified here and TARs described previously
The so-called ‘non-AGI TUs’ were obtained from the supplementary material of Matsui et al. (2008). We determined how many of these overlap by at least 1 bp with one or more of the high-confidence TARs described here, irrespective of the genomic strand from which they originated.
Overlap between novel TARs identified under different stress conditions
In a pairwise comparison of stress-induced novel TARs, we counted positions where novel TARs induced by different stresses overlapped. Subsequently, we normalized these counts by the total number of non-redundant positions corresponding to novel TARs that were induced by either of the two stress conditions to obtain the percentages shown in Figure 5c. The statistical significance of the observed overlap was assessed by a permutation test. We randomly shuffled the chromosomal location of TARs 10 000 times independently and determined whether the proportion of permutation experiments with a total overlap length exceeding the originally observed overlap length was less than 0.05 or less than 0.0001.
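The overlap percentage and the permutation test can be sketched as follows. The sketch simplifies in several ways that are assumptions rather than details of the original analysis: TARs are half-open intervals on a single chromosome, both TAR sets are shuffled, shuffled TARs keep their length and may overlap one another, and positions are held in Python sets, which is only practical for small examples. All function names are hypothetical.

```python
import random

def covered(intervals):
    """Set of positions covered by half-open intervals (start, end)."""
    return {p for s, e in intervals for p in range(s, e)}

def overlap_percent(a, b):
    """Shared positions as a percentage of all non-redundant positions."""
    pa, pb = covered(a), covered(b)
    union = pa | pb
    return 100.0 * len(pa & pb) / len(union) if union else 0.0

def shuffle_locations(intervals, chrom_len, rng):
    """Place each interval at a random position, keeping its length."""
    out = []
    for s, e in intervals:
        length = e - s
        new_s = rng.randrange(0, chrom_len - length + 1)
        out.append((new_s, new_s + length))
    return out

def permutation_p(a, b, chrom_len, n_perm=10000, seed=0):
    """Proportion of permutations whose overlap exceeds the observed one."""
    rng = random.Random(seed)
    observed = len(covered(a) & covered(b))
    hits = 0
    for _ in range(n_perm):
        ra = shuffle_locations(a, chrom_len, rng)
        rb = shuffle_locations(b, chrom_len, rng)
        if len(covered(ra) & covered(rb)) > observed:
            hits += 1
    return hits / n_perm
```

The returned proportion is then compared with the thresholds 0.05 and 0.0001.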
Comparison of TARs with MPSS tags
The MPSS tags mapped to the Arabidopsis genome were downloaded from the Arabidopsis MPSS Plus Database (Meyers et al., 2004a,b). Only the reliable and significant 20-bp tags were further considered (Meyers et al., 2004b). The TARs (and exons annotated in TAIR7 as a control) were counted as ‘confirmed’ by MPSS if they contained one or more MPSS tag(s) regardless of the genomic strand to which the tag was mapped.
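The MPSS support check reduces to asking whether each region contains at least one 20-bp tag. The sketch below assumes half-open region coordinates on a single chromosome, ignores strand as described, and uses hypothetical names. Because all tags have the same length, only the first tag starting at or after the region start needs to be examined.

```python
from bisect import bisect_left

TAG_LEN = 20  # length of the MPSS tags considered

def percent_confirmed(regions, tag_starts):
    """Percentage of regions containing at least one complete tag.

    regions:    list of half-open (start, end) intervals.
    tag_starts: start coordinates of mapped tags, either strand.
    """
    starts = sorted(tag_starts)
    confirmed = 0
    for r_start, r_end in regions:
        # First tag that begins inside or after the region start.
        i = bisect_left(starts, r_start)
        if i < len(starts) and starts[i] + TAG_LEN <= r_end:
            confirmed += 1
    return 100.0 * confirmed / len(regions) if regions else 0.0
```

The same function can be applied to annotated exons to obtain the control value.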
Assessing evolutionary conservation of novel TARs
Whole-genome alignments between A. thaliana, O. sativa, P. trichocarpa and S. bicolor were obtained from the homepage of the VISTA project (http://pipeline.lbl.gov/downloads.shtml) (Frazer et al., 2004). These whole-genome alignments were generated using methods described previously (Kent, 2002; Brudno et al., 2003; Couronne et al., 2003). As a proxy for conservation of a region of interest, we assessed the number of sequence identities in the alignment corresponding to a TAR. Afterwards, sequence identity counts were normalized by transcript length and the number of aligned species (three). As a control for the novel TARs in each stressed sample, we randomly sampled 100 times as many annotated exons assessing their degree of conservation in the same manner. Resulting histograms are shown for all stressed samples in Figure S10.
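The normalization of identity counts can be sketched as below. The sketch assumes that each pairwise alignment has been projected onto the A. thaliana coordinates of the TAR, so that every aligned sequence has the same length as the reference, with '-' marking gaps or unaligned positions. This input representation and the function name are assumptions.

```python
def conservation_score(ref_seq, aligned_seqs):
    """Identities per reference position and per aligned species.

    ref_seq:      A. thaliana sequence of the region of interest.
    aligned_seqs: one sequence per other species, same length as
                  ref_seq, with '-' where nothing is aligned.
    """
    identities = sum(
        1
        for other in aligned_seqs
        for a, b in zip(ref_seq, other)
        if a == b and a != "-"
    )
    return identities / (len(ref_seq) * len(aligned_seqs))
```

A score of 1 means that every position is identical in all aligned species, and a score of 0 means that no position is.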
Calculating distances between TARs and neighboring genes
For each stress-induced novel TAR we determined the distance between its start and the nearest annotated gene upstream as well as the distance between its end and the nearest gene downstream. The histograms shown in Figures 6b and S7 were computed from the minimum of these two distances. A distance of 1 can result either from a small overlap with one or more exons or from the novel TAR being located in an intron of an annotated gene.
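The distance calculation can be sketched as follows, assuming half-open coordinates on a single chromosome and a simple linear scan over genes. Reporting overlapping, abutting or intronic TARs with a distance of 1 is an interpretation of the remark above, not a documented detail, and the function name is hypothetical.

```python
def distance_to_nearest_gene(tar, genes):
    """Minimum distance from a TAR to any annotated gene.

    tar:   half-open (start, end) interval.
    genes: list of half-open (start, end) gene intervals.
    """
    t_start, t_end = tar
    best = None
    for g_start, g_end in genes:
        if g_end <= t_start:        # gene lies upstream of the TAR
            gap = t_start - g_end
        elif g_start >= t_end:      # gene lies downstream of the TAR
            gap = g_start - t_end
        else:                       # TAR overlaps or lies within the gene
            gap = 0
        if best is None or gap < best:
            best = gap
    if best is None:
        raise ValueError("no annotated genes supplied")
    # ASSUMPTION: the smallest reported distance is 1.
    return max(1, best)
```

A TAR would then count as separated from annotated genes when the returned distance exceeds 500 bp.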
We are grateful to Jim Carrington for critical reading of the manuscript and members of the lab for helpful suggestions and comments. This work was supported by the Max Planck Society (GR, DW), European Community FP6 IP SIROCCO (contract LSHG-CT-2006-037900, to DW), a Gottfried Wilhelm Leibniz Award from the Deutsche Forschungsgemeinschaft (DFG) to DW and a grant for Temporary Positions for Principal Investigators from the DFG to SL.