Keywords:

  • cross-species comparison;
  • diurnal cycle;
  • extended night;
  • microarray

ABSTRACT

  1. Top of page
  2. ABSTRACT
  3. INTRODUCTION
  4. MATERIALS AND METHODS
  5. AIMS, SCOPE AND USE OF MAPMAN
  6. A CASE STUDY IN CROSS-SPECIES KNOWLEDGE TRANSFER
  7. CONCLUDING STATEMENTS
  8. ACKNOWLEDGMENTS
  9. REFERENCES
  10. Supporting Information

MapMan is a software tool that supports the visualization of profiling data sets in the context of existing knowledge. Scavenger modules generate hierarchical and essentially non-redundant gene ontologies (‘mapping files’). An ImageAnnotator module visualizes the data on a gene-by-gene basis on schematic diagrams (‘maps’) of biological processes. The PageMan module uses the same ontologies to statistically evaluate responses at the pathway or processes level. The generic structure of MapMan also allows it to be used for transcripts, proteins, enzymes and metabolites. MapMan was developed for use with Arabidopsis, but has already been extended for use with several other species. These tools are available as downloadable and web-based versions. After providing an introduction to the scope and use of MapMan, we present a case study where MapMan is used to analyse the transcriptional response of the crop plant maize to diurnal changes and an extension of the night. We then explain how MapMan can be customized to visually and systematically compare responses in maize and Arabidopsis. These analyses illustrate how MapMan can be used to analyse and compare global transcriptional responses between phylogenetically distant species, and show that analyses at the level of functional categories are especially useful in cross-species comparisons.


INTRODUCTION

Plant Omics research has come a long way since the first two-colour microarrays were hybridized (Schena et al. 1995). In early experiments, only a few treatments were conducted and significance was often assessed based on whether an arbitrary threshold of a twofold change was exceeded. Since then, microarray technology has become cheaper and more reliable, and is now a standardized commodity that is widely available to many laboratories. These developments make it essential to develop tools that aid the biological interpretation and evaluation of these huge data sets. Many tools have been developed to visualize microbe and animal Omics data, mostly microarray data, on biological pathways (Grosu et al. 2002; Dahlquist 2004). However, when the first ‘full genome’ Arabidopsis arrays became available in 2002, there was still a clear lack of user-friendly tools for plants. This was exacerbated by the slow extension of ontologies to plant genomes. This gap was closed with the MapMan software (Thimm et al. 2004), which relied on its own ontology to classify genes and metabolites, and provided a modular system to visualize the results in the context of pathways and processes. Subsequently, further tools were released that support the functional analysis of Omics data for plant pathways, such as KaPPA-view- (Tokimatsu et al. 2005), Vanted- (Junker, Klukas & Schreiber 2006), Reactome- (Tsesmetzis et al. 2008) and Cyc-related web sites (Zhang et al. 2005). Most other tools still lack comprehensive and comprehensible overviews of plant metabolism. The deficit with respect to plant signalling pathways is even larger.

The ready availability of microarrays (and, increasingly, other Omics technologies) has spurred time-course or multifactorial experiments that include dozens of individual arrays and contain huge amounts of well-replicated data. Examples include extensive diurnal and circadian cycles (Bläsing et al. 2005; Edwards et al. 2006; Michael et al. 2008; Usadel et al. 2008b), or the AtgenExpress stress time series (Kilian et al. 2007). An increasing number of data sets are publicly available through large public domain databases like GEO, NASC (Craigon et al. 2004), ArrayExpress and Genevestigator (Zimmermann et al. 2004). It is becoming important to integrate large numbers of similar treatments performed in different laboratories, in order to arrive at a holistic view. Especially in cases where many genes might change or react just above the noise level, a meta-analysis is useful, as has been demonstrated for diurnal changes (Usadel et al. 2008b) and the circadian cycle (Covington et al. 2008). Often, such comparisons require data compression. To meet these rising demands, we introduced an application termed PageMan, which compresses whole-genome expression data by using statistical analyses to provide an overview of pathway responses. These pathway responses can then be visualized in a compact manner, allowing many different treatments to be viewed simultaneously.

High-throughput data are becoming available for crop plants, and it can be anticipated that the amount will increase dramatically in the near future. Nevertheless, in view of the wealth of knowledge available for Arabidopsis, it would seem beneficial to use the knowledge gained in this reference species to support the analysis and evaluation of transcriptomics data for other plants. A first necessary step would be to compare and then transfer the response of groups of genes. Some studies have already compared the expression of gene families in different species. For example, Brady et al. (2007) compared the developmental and spatial regulation of the COBRA genes at both the tissue- and cell-specific level in Arabidopsis and maize. A further extension of this approach is to compare the global response of different species to a similar perturbation, as Jiao et al. (2005) did for the response of rice and Arabidopsis to light, or to investigate local co-expression domains within the genome using orthology and co-expression information (Ren, Stiekema & Nap 2007).

In this guide, we give step-by-step instructions on how ImageAnnotator can be used to visualize different kinds of Omics data, and how the visualization and statistical tools and filters provided by ImageAnnotator can be used to make the most of these data. Subsequently, we describe how the MapMan family of tools can be used to interpret the response of the crop plant maize, and then investigate whether it is possible to systematically compare global expression responses to carbon depletion in maize and Arabidopsis. Maize is not yet fully sequenced and is phylogenetically quite distant from Arabidopsis; evaluating different strategies to compare responses in these two species should therefore prove especially informative for scientists working on other plant species where the complete genome sequence is not yet available.

MATERIALS AND METHODS

Plant growth

Maize plants (Zea mays L.) were European dent inbred lines named SL (Presterl et al. 2007), proprietary to KWS SAAT AG, Einbeck, Germany. They were grown for 14 d at 25/22 °C (day/night) at a photoperiod of 14/10 h and 500 µmol quanta m−2 s−1, at a relative humidity of 60/70% (day/night). For the experiment, the fourth leaf was sampled and divided into two parts, above and below the ligule, corresponding to tissue in development or expansion. Samples were taken 1 h before the end of the light period and 1 h before the end of the dark period (9 h dark). Plants were then kept in the dark for an additional 6 h beyond the end of the night, when the last samples were taken (16 h dark; extended night time point).

RNA isolation and expression analysis with 17K Affymetrix arrays

Tissues were directly frozen in liquid nitrogen. Subsequently, homogenized tissues from five individual plants were pooled. One hundred milligrams of pooled homogenized tissue was used for total RNA extraction using the RNeasy kit following the manufacturer's instructions (Qiagen, Duesseldorf, Germany).

cDNA synthesis, cRNA labelling and hybridization on the GeneChip genome array were performed exactly as described by Thimm et al. (2004) and Bläsing et al. (2005), and as recommended by the manufacturer (part no. 900385, Affymetrix UK Ltd, High Wycombe, Buckinghamshire HP10 0HH, UK). In brief, biotin-labelled cRNA was hybridized to the Affymetrix GeneChip® Maize Genome 17K array; the hybridization was performed by the RZPD.

Development of a maize ontology/mapping file

A maize ontology/mapping file was developed as follows. The Arabidopsis reference genome was taken as a template, and maize genes were searched by BLAST against the Arabidopsis genes to identify potential matches. Subsequently, genes potentially involved in C4 processes were manually assigned based on annotations and similarities to known genes. Furthermore, a domain search using InterProScan (Zdobnov & Apweiler 2001) was performed, and these data were used in conjunction with the annotation available for the maize genes to manually correct the maize ontology (Doehlemann et al. 2008).

Statistical methods and array processing

All calculations were performed using R (R Development Core Team 2006). In brief, Robust Multichip Average expression measures (Bolstad et al. 2003) were calculated for all arrays used. Subsequently, a linear model was fitted to all arrays using the limma package from the Bioconductor project. Afterwards, contrasts of interest (representing log2 fold changes between different time points or tissues, such as ‘end of night – extended night’, ‘end of day – end of night’ and ‘below – above the ligule’) were extracted. Because the large number of genes and the number of different comparisons generated many P-values, these had to be adjusted for multiple testing. The P-values were therefore first corrected for multiple genes, and then the experiments in which there was most likely a significant change were identified (Smyth 2004). Contrasts having a P-value below 0.05 after false discovery rate control (Benjamini & Hochberg 1995) and an associated log2 fold change of 1 or above (−1 or below) were called significant.
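The final significance call can be illustrated with a short Python sketch. It is not the R/limma code used for this study: it covers only the Benjamini-Hochberg adjustment and the combined thresholds (adjusted P-value below 0.05, absolute log2 fold change of at least 1), and it leaves out the linear model and the nested identification of experiments (Smyth 2004). The function names are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_significant(log2_fc: List[float], p_values: List[float],
                     alpha: float = 0.05,
                     min_abs_log2_fc: float = 1.0) -> List[bool]:
    """Flag contrasts that pass both the FDR and the fold-change threshold."""
    adjusted = benjamini_hochberg(p_values)
    return [q < alpha and abs(fc) >= min_abs_log2_fc
            for fc, q in zip(log2_fc, adjusted)]


if __name__ == "__main__":
    # Illustrative numbers only, not data from this study.
    print(call_significant([1.5, -0.5, -2.0, 0.3], [0.01, 0.04, 0.03, 0.005]))
```

Requiring both criteria means that a gene with a very small P-value but a small fold change is not called significant, which matches the rule stated above.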

AIMS, SCOPE AND USE OF MAPMAN

Terminology

MapMan is composed of a set of Scavenger modules, the ImageAnnotator module and the PageMan module (Fig. 1). Scavengers organize genes, enzymes, proteins or metabolites into functional categories (‘BINs’ and ‘sub-BINs’). The sub-BINs and BINs for a given parameter type are structured in a hierarchical tree and identified by numerical codes that reflect this hierarchy. The resulting ontologies (‘mapping files’) (see Table 1, and Supporting Information MapMan Guide p. 68 ff.) are imported into the ImageAnnotator module, which uses them to organize profiling data. This organized data is subsequently visualized via a false-colour code on diagrams of biological processes (‘maps’).

Figure 1. Overview of MapMan and PageMan. The general structure of the MapMan tools. The data flow from the Scavengers as well as from user-supplied Experimental Data Files into MapMan and PageMan is indicated.

Table 1. Structure of MapMan mapping files. The table reproduces the structure of a MapMan mapping file, including the five columns for the Bincode, the name of the bin, the identifier to be used, its annotation and type

Bincode | Name | Identifier | Description | Type
1 | PS | | |
1.1 | PS.lightreaction | | |
1.1.1 | PS.lightreaction.photosystem II | | |
1.1.1.2 | PS.lightreaction.photosystem II.PSII polypeptide subunits | 244963_at | ATCG00570 ‘PSII cytochrome . . . | T
22.1003 | polyamine metabolism | Putrescine | amino acid degradation; arginine polyamine metabolism | M

TranscriptScavengers classify genes into a redundancy-reduced ontology, i.e. the aim is to assign each gene to the single most appropriate BIN, rather than sorting it into multiple BINs. In some cases this is not possible, but in general this reduction of redundancy aids statistical analysis and visualization, as the same item is not displayed multiple times. This is one of the main features that distinguish the MapMan ontology from the GO ontology. Arabidopsis has provided the main source of information because it was and is still the best annotated plant genome. A skeleton ontology was developed using the TIGR3 annotation (Thimm et al. 2004). Continuous updating using literature, expert inputs and updating to the current TAIR8 version of the Arabidopsis genome (Usadel, Nagel & Thimm 2005; see http://www.gabipd.org/projects/MapMan for the latest updates) resulted in the re-assignment of thousands of genes, and expansion of the ontology from 197 to >1500 categories, of which half describe individual enzyme functions. These ontologies/mapping files are distributed to the user in xml, Excel, text or binary file format (http://www.gabipd.org/database/java-bin/MappingDownloader; see also Supporting Information MapMan Guide p. 3ff).

The ImageAnnotator module visualizes data on diagrams (‘maps’) of biological pathways or processes relying on mapping files. It is written in Java to make it compatible with multiple operating systems and it is distributed to the user from the MapMan web site (http://www.gabipd.org/projects/MapMan/data.shtml; see also Supporting Information MapMan Guide p. 10ff).

The PageMan module uses the same mapping files as the ImageAnnotator module but is best suited to visualize multiple experiments at once. It visualizes data by compressing the response of whole pathways (all of the genes in a sub-BIN or BIN) down to single-coloured rectangles (Supporting Information MapMan Guide p. 72ff).

Mapping files store the association of genes, metabolites or proteins with MapMan BINs. Each file contains five columns: a numerical BINcode identifying the MapMan BIN, a human-readable name for this numerical identifier, a gene/protein/metabolite identifier, a description of the identifier based on genome project annotations, and a type column identifying the item as a metabolite, protein or transcript (Table 1). MapMan is used for different organisms such as Arabidopsis, Medicago, rice, barley and solanaceous species, and different mapping files are available for each species. Mapping files are available either in binary format for faster access or as text/Excel files, which allow the mapping for the species under study to be customized with user-specific annotations.
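As an illustration of this five-column layout and of the hierarchical BINcodes, the following Python sketch parses a small tab-delimited mapping text and collects the identifiers that fall under each BIN and all of its parent BINs. It is a minimal sketch under stated assumptions: the header names, the tab delimiter and all function names are hypothetical, and the example rows are taken from Table 1 (with the truncated description shortened to the locus code).

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MappingEntry:
    bincode: str      # hierarchical numerical code, e.g. "1.1.1.2"
    name: str         # human-readable BIN name
    identifier: str   # gene/protein/metabolite identifier (may be empty)
    description: str  # annotation of the identifier
    item_type: str    # "T" = transcript, "M" = metabolite (as in Table 1)


# Assumed layout: one header row, then five tab-separated columns.
EXAMPLE_MAPPING = (
    "BINCODE\tNAME\tIDENTIFIER\tDESCRIPTION\tTYPE\n"
    "1\tPS\t\t\t\n"
    "1.1\tPS.lightreaction\t\t\t\n"
    "1.1.1.2\tPS.lightreaction.photosystem II.PSII polypeptide subunits"
    "\t244963_at\tATCG00570\tT\n"
    "22.1003\tpolyamine metabolism\tPutrescine"
    "\tamino acid degradation; arginine polyamine metabolism\tM\n"
)


def parse_mapping(text: str) -> List[MappingEntry]:
    """Parse tab-delimited mapping text into MappingEntry records."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    entries = []
    for row in reader:
        if not row:
            continue
        row = (row + [""] * 5)[:5]  # pad short rows to five columns
        entries.append(MappingEntry(*[cell.strip() for cell in row]))
    return entries


def ancestor_bins(bincode: str) -> List[str]:
    """Return the BINcode and all parent codes, e.g. 1.1.1 -> 1, 1.1, 1.1.1."""
    parts = bincode.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def items_per_bin(entries: List[MappingEntry]) -> Dict[str, List[str]]:
    """Collect identifiers under each BIN and every parent BIN."""
    members: Dict[str, List[str]] = {}
    for entry in entries:
        if not entry.identifier:
            continue  # rows that only declare a BIN carry no item
        for code in ancestor_bins(entry.bincode):
            members.setdefault(code, []).append(entry.identifier)
    return members


if __name__ == "__main__":
    for code, ids in items_per_bin(parse_mapping(EXAMPLE_MAPPING)).items():
        print(code, ids)
```

Because the code of a sub-BIN starts with the code of its parent, the hierarchy can be recovered from the BINcode alone, without a separate parent column.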

Maps are diagrams of pathways or processes. They are usually generated by hand, and customized for use via the ImageAnnotator interface (see also Supporting Information MapMan Guide p. 57ff). At present, about 60 images can be downloaded, and this number is continually expanding (see http://www.gabipd.org/database/java-bin/MappingDownloader#pathways). Many of these represent individual metabolic pathways. In addition, overviews are available for many sets of genes, for example, central metabolism, secondary metabolism, protein synthesis, protein degradation and transcription factors. Usually, the pathway images are used with the MapMan ontologies. However, the modular structure of the ImageAnnotator allows other ontologies to be used as well (Supporting Information MapMan Guide S1, pp. 86–91).

Experimental Data Files are supplied by the user and contain the measurements the user has conducted. These should contain log fold changes between a treatment and a reference. Other kinds of data can also be imported and used, e.g. qRT-PCR or deep sequencing data, provided the data is expressed as log fold change.

Getting started with the ImageAnnotator module

The downloadable MapMan application (http://www.gabipd.org/projects/MapMan/, see also Supporting Information MapMan Guide S1, pp. 3,4) is installed with a Wizard-driven software installer (see Supporting Information MapMan Guide S1, pp. 10–21). It consists of the ImageAnnotator software tool, mapping files for Arabidopsis, maize, barley, tomato, Medicago and potato, about 60 maps that have already been pre-customized to allow immediate use for each of these species, and a small set of demonstration Experimental Data Files. The demonstration files cover the response of Arabidopsis to carbon starvation and re-addition of sucrose for 30 min and 3 h (Osuna et al. 2007), and to nitrogen starvation and re-addition of nitrate for 30 min and 3 h (Scheible et al. 2004; see below for how to link your own Experimental Data Files).

After MapMan has been successfully installed and started, the user is presented with the ImageAnnotator user interface (Supporting Information MapMan Guide S1, p. 24). On the left-hand side is a browser. In the latest versions of the ImageAnnotator, the browser groups the included mapping files by organism and/or array types and maps by their metabolic and/or functional context. This facilitates species- or array-specific customization, which is important for users who work with several species or array types. In order to display any of the included experimental data files, the user needs to select it by using the mouse together with the map on which it should be visualized (see Supporting Information MapMan Guide S1, pp. 24–27). Subsequently, the user is prompted to choose an appropriate mapping file. The experimental data is then automatically visualized on the map (Supporting Information MapMan Guide p. 27).

In the display, all parameters (e.g. transcripts) in a BIN are grouped in a block, and the response of each parameter is shown via a colour scale. The user can modify this colour scale and the general appearance of the displayed pathway using a toolbar (see Supporting Information MapMan Guide p. 28).

A mouse-over function allows individual features to be identified in a pop-up box. The features displayed include the annotation of the gene, the BIN it has been assigned to and its log expression change. This information (gene, BIN, annotation and response) can be exported and stored. Further information is available by direct linking to web resources that curate the corresponding genome, such as TAIR, SGN and MIPS (see Supporting Information MapMan Guide S1, pp. 29, 30).

Regardless of the display method chosen, all displays can be exported as images using the export option available in the Pathway menu (Supporting Information MapMan Guide p. 31).

Linking your own experimental data files

To visualize your own experiments, it is necessary to prepare Experimental Data Files and to link them in the ImageAnnotator browser. An Experimental Data File should contain a header in the first row, and the first column should contain row names (e.g. Affymetrix probe set identifiers). All further columns can contain log (fold change) values or derived values (see below; Supporting Information MapMan Guide S1, pp. 35, 36, 57–59). These files can be prepared as xls or tab-delimited txt files; the latter are uploaded more quickly. MapMan does not support multiple values for the same gene or metabolite, thus these should be averaged. It is, however, possible to include the results of statistical analyses as ‘derived values’, which are used to filter and evaluate the results online (see below). To display different types of data simultaneously (see below), the parameters (e.g. levels of transcript 1 . . . i, levels of metabolite 1 . . . n) can be mixed, but each parameter must have a clear identifier and these must be specified in the corresponding mapping file.
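The preparation of such a file can be sketched in Python as follows: repeated measurements for the same identifier are averaged, and the result is written as tab-delimited text with a header row and the identifiers in the first column. This is a minimal sketch, not part of MapMan; the column names, the second probe set identifier and all values are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def average_duplicates(
    rows: Iterable[Tuple[str, List[float]]]
) -> Dict[str, List[float]]:
    """Average the value columns of all rows that share an identifier."""
    grouped: Dict[str, List[List[float]]] = defaultdict(list)
    for identifier, values in rows:
        grouped[identifier].append(values)
    # zip(*value_rows) groups the values column by column.
    return {identifier: [mean(column) for column in zip(*value_rows)]
            for identifier, value_rows in grouped.items()}


def to_tab_delimited(header: List[str],
                     averaged: Dict[str, List[float]]) -> str:
    """Write a header row, then one row per identifier (sorted by name)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for identifier in sorted(averaged):
        writer.writerow([identifier]
                        + [f"{value:.3f}" for value in averaged[identifier]])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical log2 fold changes for two contrasts.
    measurements = [
        ("244963_at", [1.0, -0.5]),
        ("244963_at", [2.0, -1.5]),  # second value for the same probe set
        ("245000_at", [0.25, 0.0]),
    ]
    header = ["Identifier", "EoN_vs_XN", "EoD_vs_EoN"]
    print(to_tab_delimited(header, average_duplicates(measurements)))
```

The text returned by `to_tab_delimited` would be saved with a .txt extension before being linked in the ImageAnnotator browser.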

The Experimental Data File is stored in a chosen folder on the server or hard drive. A link is established by clicking on the appropriate folder of the ‘Experiment’ section of ImageAnnotator browser, upon which the stored file is located on the user's hard drive or server (Supporting Information MapMan Guide S1, pp. 57, 58). The link, but not the file itself, is stored in ImageAnnotator; for this reason, it is advised to store files in clearly named folders on the hard drive, and to avoid moving or renaming these files. After a file has been linked in MapMan, it can be visualized exactly like the included files.

Making and customizing new maps

MapMan can be adapted in several ways to be used with specific, self-made data sets or templates. One of the most basic is to prepare your own diagrams or maps. These can be created via any image processing program and can be stored in a wide variety of formats. A link is established in the ‘Pathways’ section of the ImageAnnotator browser, in an analogous manner to that described already for Experimental Data Files (Supporting Information MapMan Guide S1, pp. 60, 61). Maps can then be accessed and opened by mouse click. When a map is used for the first time, it must be customized. This involves right (apple-) clicking at the position where a sub-BIN or BIN should be shown and entering its number into the pop-up dialog box. This is repeated for all of the BINs/sub-BINs on a particular diagram (Supporting Information MapMan Guide S1, pp. 62–67). This meta-information is saved in the background as an xml file, and is automatically opened the next time the map (i.e. image) is used. The xml file is automatically stored (with the same name but a different extension) in the same folder as the corresponding map. As for Experimental Data Files, it is therefore important to store the maps in a clearly named folder and to avoid moving or renaming the folder.

Display of Omics data from different sources

Other types of Omics data also require tools to allow them to be visualized and analysed. Proteomics is currently progressing from a technique where tens or a few hundred proteins are measured to one where thousands are analysed (Baerenfaller et al. 2008). Therefore, a ProteinScavenger was developed that assigns ∼60 000 proteins from various plants. It was implemented as an extension of the TranscriptScavenger modules; it identifies proteins according to the gene codes of the corresponding gene and assigns them to the corresponding sub-BIN and BIN in a hierarchical tree that has the same structure as that generated by the TranscriptScavenger. This allows quantitative proteomics data to be displayed using exactly the same maps and supporting files as those used for the display of transcript data (e.g. May et al. 2008).

Metabolomics started as an independent branch of Omics research (Fiehn 2002; Trethewey 2004). It has traditionally been strong in plant sciences (Stitt & Fernie 2003; Yonekura-Sakakibara et al. 2008). However, in typical metabolomics experiments, the data is fundamentally different from transcriptomics experiments. Of the thousands of metabolites that are known or thought to be present in a plant sample, only a few hundred are identified and measured. MetaboliteScavengers allow MapMan to be used to visualize metabolite profiling data. The current versions cover metabolites routinely detected by GC-TOF and LC-MS platforms (Thimm et al. 2004; Kopka 2005; and Gibon unpublished). A special overview map called ‘Metabolites’ which focuses on individual metabolites is provided for researchers concentrating on metabolite profiling (see Supporting Information MapMan Guide S1, p. 26).

One of the next challenges is to display these different data sets on the same visual interface. As an interim solution, MapMan and other tools such as KaPPA-view (Tokimatsu et al. 2005) display data from different domains using different glyphs such as squares, triangles or circles. However, further approaches may be needed to provide a less cluttered overview of the results. This is obviously easier when a specific process or a restricted area of metabolism is shown (Gibon et al. 2006; Kusano et al. 2007).

As already pointed out, one can mix these different kinds of data by simply putting them into the same experimental data file or the same column of a file (Supporting Information MapMan Guide S1, p. 59). Again, the user only has to make sure that the data is provided as log fold changes so that the colour scale used in the ImageAnnotator is meaningful. The ImageAnnotator module recognizes the different entities and displays them using circles for metabolites and squares for transcripts.

Online display of time course data for all of the genes in a given sub-BIN or BIN

One major limitation of the first versions of MapMan was that ImageAnnotator provided static pictures, but did not display time courses. In the latest versions, it is possible to select multiple experiments, or time points, by shift- or control-clicking. ImageAnnotator then switches into a multi-data display mode which shows time course data for a set of selected genes as line plots. This serves as an advantage over tools which do not incorporate this feature (see Fig. 5 and Supporting Information MapMan Guide S1, p. 40).

Figure 5. Time course display in MapMan. Time points at the end of the day, the end of the night and an extension of the night in maize were compared with the average expression values of these three time points and displayed simultaneously by selecting all three time-points at once.

Adding statistical tools to the ImageAnnotator module

The initial build of ImageAnnotator used Experimental Data Files that contained the logarithm of the fold change between a control and a treatment (Thimm et al. 2004). It was limited because it did not support statistical analysis. However, the realization that experimental data stemming from Omics analysis platforms should be treated with the same rigorous measures as those applied to ‘classical’ experimental data has resulted in stricter demands on the handling of microarray experiments, their replication and statistical analysis. These standards have been adopted by the plant community, resulting in several recommendations for the analysis of microarray experiments (Nettleton 2006). Successive versions of ImageAnnotator have therefore incorporated an increasing number of tools for online (statistical) analyses.

ImageAnnotator now supports online calculation of whether the response of items in a particular BIN deviates from the response of all other items that have been measured. This is achieved by using a Wilcoxon rank sum test. Like Student's t-test, this test compares two groups, but it does not assume that the data are normally distributed; it thus identifies pathways or processes whose members respond differently from the remaining items. The results can be shown by clicking on a tab below the pathway display labelled ‘Wilcoxon Rank Sum Test’ in the ImageAnnotator module (Supporting Information MapMan Guide S1, pp. 50, 51). These pathway-wise statistics highlight some major features of a given response, for example, small but consistent changes in the expression of many genes in a given functional category. By using this function, it is thus possible to pinpoint which sub-BINs and BINs show coordinated changes, i.e. which pathways are of probable interest. This distinguishes the ImageAnnotator module from many other tools.
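The idea of testing one BIN against all other measured items can be sketched in Python. This is not MapMan's implementation: it is a plain rank sum test that uses average ranks for ties and a normal approximation for the two-sided P-value (an assumption that is only reasonable for larger groups), and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_p_value(in_bin: Sequence[float], rest: Sequence[float]) -> float:
    """Two-sided P-value of the rank sum test (normal approximation)."""
    n1, n2 = len(in_bin), len(rest)
    n = n1 + n2
    combined = list(in_bin) + list(rest)
    ranks = average_ranks(combined)
    rank_sum = sum(ranks[:n1])
    expected = n1 * (n + 1) / 2.0
    # Tie correction for the variance of the rank sum.
    counts: Dict[float, int] = {}
    for value in combined:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return 1.0  # all values identical: no evidence of a difference
    z = (rank_sum - expected) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


def bin_p_values(values: Dict[str, float],
                 members: Dict[str, List[str]]) -> Dict[str, float]:
    """Test every BIN against all measured items that are not in it."""
    result = {}
    for bincode, identifiers in members.items():
        inside = set(identifiers)
        in_bin = [v for key, v in values.items() if key in inside]
        rest = [v for key, v in values.items() if key not in inside]
        if in_bin and rest:
            result[bincode] = rank_sum_p_value(in_bin, rest)
    return result
```

Because the test works on ranks, a BIN in which many genes shift slightly in the same direction can reach a low P-value even when no single gene passes a fold-change threshold, which is the situation described above.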

Another example is a simple online K-means clustering solution. The clustering procedure and its interface can be brought up by right (apple-) clicking on an experiment containing multiple measurements (Supporting Information MapMan Guide S1, pp. 41–49).
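For readers unfamiliar with the method, K-means clustering of expression profiles can be sketched in a few lines of Python. The distance measure, initialization and stopping rule used by ImageAnnotator are not described here, so the choices below (squared Euclidean distance, random initial centroids drawn from the data, stop when assignments no longer change) are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(profiles: Sequence[Sequence[float]], k: int,
            iterations: int = 100,
            seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Cluster profiles into k groups; return assignments and centroids."""
    rng = random.Random(seed)
    # Start from k distinct profiles chosen at random.
    centroids = [list(p) for p in rng.sample(list(profiles), k)]
    assignment = [-1] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids


if __name__ == "__main__":
    # Invented log2 fold changes over three time points for four genes.
    data = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9],
            [-1.0, -2.0, -3.0], [-0.9, -2.2, -3.1]]
    labels, _ = k_means(data, k=2)
    print(labels)
```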

Online filtering in ImageAnnotator based on statistical criteria

ImageAnnotator now also supports online filtering to remove all genes whose response does not meet a statistical criterion for differential expression. This is done by creating a more complex Experimental Data File, which contains not only data values but also so-called ‘derived values’. These are the results of user-performed statistical analyses like significance analyses, cluster assignments (Usadel et al. 2005) or more sophisticated evaluations (see Bläsing et al. 2005). The user configures the files, such that some columns are defined as ‘derived values’. These could be P-values or absolute expression values (Supporting Information MapMan Guide S1, pp. 34–37). Alternatively, column headers can be selected in such a way that the software can automatically detect derived value columns (Supporting Information MapMan Guide S1, p. 36).

Clicking on a filter icon allows these derived values to be used to filter the experimental data (Supporting Information MapMan Guide S1, pp. 37–39). Genes that are filtered out are greyed-out in the ImageAnnotator display. This allows the user to flip between the visualization of different experimental results, without transcripts changing their place on the pathway display (completely removing some of the data points would render this impossible).
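The difference between greying out and removing data points can be made concrete with a small Python sketch. It is a hypothetical model of the behaviour described above, not ImageAnnotator code: every item keeps its slot in the display list, and items that fail the filter simply carry no value. The record layout and the derived-value names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Measurement:
    identifier: str
    log2_fc: float
    # Derived values, e.g. {"p_value": 0.01, "abs_expression": 350.0}.
    derived: Dict[str, float] = field(default_factory=dict)


def apply_filter(
    measurements: List[Measurement],
    keep: Callable[[Measurement], bool],
) -> List[Tuple[str, Optional[float]]]:
    """Return one display slot per item; filtered items get None (grey)."""
    return [(m.identifier, m.log2_fc if keep(m) else None)
            for m in measurements]


if __name__ == "__main__":
    # Invented example values.
    data = [
        Measurement("gene_a", 1.8, {"p_value": 0.003}),
        Measurement("gene_b", 0.4, {"p_value": 0.40}),
        Measurement("gene_c", -2.1, {"p_value": 0.01}),
    ]
    shown = apply_filter(data, lambda m: m.derived["p_value"] < 0.05)
    print(shown)
```

Since the output has the same length and order as the input, switching between experiments or filter settings never changes the position of an item, only whether it is coloured or grey.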

Online filtering in ImageAnnotator based on biological criteria

This capacity for online filtering can also be used to filter data based on a wide range of other criteria, which can be chosen freely by the user. One example would be the absolute expression value; this would allow the user to filter out low-expressed transcripts where the measurements might be less reliable. Other examples would be filtering based on information about the tissues or cell types in which genes are expressed or the subcellular location of the encoded protein in order to focus on genes that are active in a given tissue, cell or compartment. This filtering option can also be used to focus down on transcripts that have previously been shown to respond to another treatment. In all cases, the user creates a complex Experimental Data File, which contains the corresponding meta-values as derived data (see above). Clicking on the filter icon allows them to be used to filter the experimental data in the same way as already described to provide multidimensional biological filtering (see Supporting Information MapMan Guide p. 35ff).

Generating and customizing new mapping files

If the BINs that MapMan provides are not suitable, the user can combine maps with custom mapping files generated in, for example, Microsoft Excel (Supporting Information MapMan Guide S1, pp. 68–71). For example, one possible application could be to distinguish subcellular localization in the case of quantitative proteomic and metabolomics data. This could be done by taking an original mapping file and then tagging identifiers by their subcellular localization and introducing new BINs for them (Supporting Information MapMan Guide p. 71). Two further examples are used in the cross-species comparison described later in this article, where mapping files were generated that were derived from Kyoto Encyclopedia of Genes and Genomes (KEGG; Masoudi-Nejad et al. 2007) or Eukaryotic Orthologous Groups (KOGs; Eukaryotic Clusters of Orthologous Genes, Tatusov et al. 2003).
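The subcellular localization example can be sketched in Python as follows. This is a hypothetical illustration rather than the procedure from the MapMan Guide: the naming scheme for tagged identifiers, the numbering of the new sub-BINs (an arbitrary offset chosen to avoid clashes with existing sub-BIN numbers) and the compartments assigned in the example are all invented.

```python
from typing import Dict, List, Tuple

MappingRow = Tuple[str, str, str]  # (BINcode, BIN name, identifier)

# Arbitrary offset so new sub-BIN numbers do not collide with existing ones.
LOCALIZATION_OFFSET = 9000


def add_localization_bins(
    rows: List[MappingRow],
    compartments_by_identifier: Dict[str, List[str]],
) -> List[MappingRow]:
    """Split items with known localization into per-compartment sub-BINs."""
    all_compartments = sorted(
        {c for comps in compartments_by_identifier.values() for c in comps}
    )
    sub_bin = {c: LOCALIZATION_OFFSET + i
               for i, c in enumerate(all_compartments, start=1)}
    result: List[MappingRow] = []
    for bincode, name, identifier in rows:
        compartments = compartments_by_identifier.get(identifier, [])
        if not compartments:
            result.append((bincode, name, identifier))  # left unchanged
            continue
        for compartment in compartments:
            result.append((
                f"{bincode}.{sub_bin[compartment]}",  # new sub-BIN code
                f"{name}.{compartment}",              # new sub-BIN name
                f"{identifier}_{compartment}",        # tagged identifier
            ))
    return result


if __name__ == "__main__":
    original = [
        ("1.1.1.2",
         "PS.lightreaction.photosystem II.PSII polypeptide subunits",
         "244963_at"),
        ("22.1003", "polyamine metabolism", "Putrescine"),
    ]
    # Invented localization data, for illustration only.
    located = {"Putrescine": ["cytosol", "vacuole"]}
    for row in add_localization_bins(original, located):
        print("\t".join(row))
```

The Experimental Data File would then have to use the same tagged identifiers, so that a value measured in one compartment is matched to the corresponding new sub-BIN.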

KEGG classification of maize transcripts

To establish a mapping file for maize using the KEGG ontology, probe sets were assigned to tentative contigs from the KEGG genes database (Masoudi-Nejad et al. 2007) that had been assigned a KEGG term (ftp://ftp.genome.jp/pub/kegg/genes/organisms_est/ezma/). To this end, individual probes on the Affymetrix chip were searched with blastN against the maize contigs, and hits against contigs were pooled, as has been done for Arabidopsis (ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/README), to determine the best-matching KEGG expressed sequence tag-based hits (EGENES database, which is freely available for academic users at ftp://ftp.genome.jp/pub/kegg/genes/organisms_est/ezma/). The KEGG ontology was then converted into a MapMan-type ontology using the PageMan converter (Supporting Information MapMan Guide p. 86ff).

Classification into KOG clusters for an evaluation in PageMan

To classify maize tentative contigs and Arabidopsis proteins into clusters of orthologous genes, KOG binary profiles were downloaded from the National Center for Biotechnology Information (6.9.2007, ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/little_endian/). Subsequently, all TAIR7 protein models (TAIR7_pep_20070425, ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets/TAIR7_blastsets/) were searched for these KOG models using RPS-Blast (Marchler-Bauer et al. 2005) with a bit-score cut-off of 50, and only the best KOG hit was extracted. The same procedure was performed for the tentative contigs from maize [version 17, downloaded from the Dana-Farber Cancer Institute (DFCI), ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Zea_mays/]. Mapping files to be used with PageMan, consisting of a flat hierarchy, were created for Arabidopsis and maize (Supporting Information Dataset S3 and S4).
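The selection of the best KOG hit per sequence can be sketched as follows. The input triples, the identifiers and the function names are hypothetical, and the sketch assumes that a score equal to the cut-off is retained, which the description above does not specify. The resulting dictionary only mirrors the idea of a flat hierarchy and does not reproduce the layout of a PageMan mapping file.

```python
def best_kog_hits(hits, min_bits=50.0):
    """Keep, per query, the single highest-scoring KOG at or above min_bits."""
    best = {}
    for query, kog, bits in hits:
        if bits < min_bits:
            continue  # below the bit-score cut-off
        if query not in best or bits > best[query][1]:
            best[query] = (kog, bits)
    return {query: kog for query, (kog, _) in best.items()}


def flat_bins(assignment):
    """Group sequences by KOG identifier: one BIN per KOG, no sub-levels."""
    bins = {}
    for query, kog in assignment.items():
        bins.setdefault(kog, []).append(query)
    return bins


# Hypothetical (query, KOG, bit score) triples parsed from a tabular search result.
hits = [
    ("seq1", "KOG0001", 120.5),
    ("seq1", "KOG0002", 75.0),
    ("seq2", "KOG0003", 42.0),  # below the cut-off, discarded
    ("seq3", "KOG0002", 50.0),
]
assignment = best_kog_hits(hits)
bins = flat_bins(assignment)
print(assignment)
print(bins)
```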

Mapping expression responses to chromosomal position

Another recently added and distinguishing feature of the ImageAnnotator module is its ability to map expression values onto the chromosomal position. This involves opening a special folder in the ImageAnnotator Browser called ‘ChromosomeView’, and then selecting experiments as usual (Supporting Information MapMan Guide S1, pp. 52–54). Again, a toolbar is brought up which allows the user to zoom into chromosomes and to modify the appearance of this view.

Compression and visualization of large sets of arrays using the PageMan module

A meta-analysis often requires compression and visualization of huge data compendia or experiment sets. One simple way to condense array data is to look for enriched pathways or classes, classically termed gene set enrichment (for an overview of this technique see e.g. Curtis, Oresic & Vidal-Puig 2005). In essence, the responses of many thousands of individual genes are compressed into a smaller number of categories, defined by an ontology. These categories are then statistically analysed to identify significant responses of sets of functionally related genes. These enrichment (or, by symmetry, depletion) analyses usually yield lists of changed pathways and/or classes, which are often displayed as lists or tables.

To allow the results of such analyses to be viewed as a graphical display, we developed a module dubbed PageMan (Usadel et al. 2006). PageMan is available in a downloadable version and as a simple web-tool (http://mapman.mpimp-golm.mpg.de/general/ora/ora.shtml). The stand-alone tool can be used in conjunction with the ImageAnnotator module. Data is loaded into PageMan using the Experimental Data Files that are used by ImageAnnotator (Supporting Information MapMan Guide S1, pp. 72–77).

However, unlike in the ImageAnnotator module, the user follows three defined steps for each file, aided by a wizard-based interface. In the first step, the Experimental Data File is loaded into PageMan and configured. In the second step, a mapping file is selected and loaded (Supporting Information MapMan Guide S1, p. 77). As PageMan aims at providing a statistics-based overview of enriched functional categories from global Omics responses, the third and last step requires the user to choose how the responses within each category should be compressed. Typical possibilities are to average across a category, or to perform a statistical analysis such as a Wilcoxon or Fisher test (Supporting Information MapMan Guide S1, p. 78).
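To make the third step more concrete, the following sketch shows a textbook two-sided Wilcoxon rank-sum test that compares the genes in one category with all remaining genes. It uses a normal approximation without tie correction and is not PageMan's implementation; the function names and the example values are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(in_bin, rest):
    """Return (z, two-sided P) for the genes in a category versus the rest."""
    n1, n2 = len(in_bin), len(rest)
    ranks = average_ranks(list(in_bin) + list(rest))
    w = sum(ranks[:n1])  # rank sum of the category
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical log2 ratios: a small category that is shifted upwards.
category = [1.2, 1.5, 1.9, 2.2]
others = [-0.5, -0.2, 0.0, 0.1, 0.3, 0.4, -0.1, 0.2]
z, p = rank_sum_test(category, others)
print(z, p)
```

Because the test works on ranks, a category can reach significance through many small but coordinated changes, which is the property exploited when compressing array data into categories.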

The results can be viewed as an interactive image, or can be exported in tabular form. In the image produced by the stand-alone version, the MapMan categories are organized vertically in a tree structure. The values of the statistical analysis are displayed as a false-colour, heat-map-like display (Supporting Information MapMan Guide S1, pp. 79, 80). This provides a useful first overview of a response. It can be used to decide which detailed maps to use when the data is viewed in ImageAnnotator. It also allows many contrasting treatments to be viewed, such as large time and concentration series. This allows simple visual identification of shared and disparate elements in the responses (see Usadel et al. 2008a,b for examples where PageMan has been applied). The treatments are aligned horizontally; their order can be manually changed and gaps can be introduced (Supporting Information MapMan Guide S1, p. 81). Depending on the results, parts of the tree hierarchy can be collapsed or expanded (Supporting Information MapMan Guide S1, p. 84).

Thus, PageMan condenses the numbers of features (e.g. from >22 000 individual genes on the ATH1 array to about 1500 hierarchically organized categories that are defined by the current MapMan ontology). This has several advantages. Firstly, many of these categories have a known function, making it easier to extract a biological meaning. Secondly, the underlying statistics allows the detection of subtle but coordinated changes in sets of related genes (e.g. a small increase in transcripts for ribosomal proteins that would be missed by analyses that are restricted to identifying lists of single genes that pass a response threshold). Finally, by filtering out non-significant categories, it is often possible to get a concise display of the data which can be achieved by either collapsing parts of the hierarchy or by simply displaying significant genes only (Supporting Information MapMan Guide S1, pp. 83, 84).

Extension of MapMan to other species

The sequencing of Arabidopsis was the result of a large international collaborative project lasting several years (Arabidopsis Genome Initiative 2000). Since then, rapid advances in sequencing have yielded full genome sequences for several plant crop species such as grape, rice, papaya and sorghum. This was followed by the development of large-scale arrays for these species (see e.g. Wise et al. 2007 for an overview of currently available plant microarray data). In a parallel development, deep sequencing technologies are making it possible to gather quantitative data for all transcripts above the detection limit. Unlike array technology, this is possible even without a priori knowledge about the genomic base of the organism under study. Rapid improvements in the cost-effectiveness of deep sequencing technologies might soon lead to them being favoured over array technology.

An exciting, emerging challenge in the interpretation of comparative transcriptomics/proteomics experiments is therefore to integrate expression data from different species. This will require the development of techniques that allow transfer and comparison of the results between different species. The first step will probably be to leverage the thousands of experiments conducted in Arabidopsis using the Affymetrix platform, to support and deepen the understanding of array experiments performed on other species. Arabidopsis is the only plant species for which comprehensive expression array data resources (see above) are publicly available.

The first and essential step in using MapMan with a new plant species is to transfer the MapMan ontology to the transcripts and proteins of the studied species. This is necessarily an ongoing process, which has to keep up with new information about the genome sequence and refinements of the annotations of gene structure, while at the same time incorporating emerging knowledge about gene function in Arabidopsis and, increasingly, other model plant species. The importance of the latter is emphasized by the fact that genes for which no information is available at all comprise up to 15% of all genes, and nearly half still lack a structured functional annotation even in well-studied model organisms (Swarbreck et al. 2008).

In the case of maize, the maize mapping file was generated using a mixture of automated and manual approaches (Doehlemann et al. 2008). The latter is always advisable, but for maize it was essential, because maize was the first C4 plant integrated into the MapMan ontology. Continual updates will be required as the Arabidopsis annotations improve and as more genome sequence information and validated gene models become available for maize. Very recently, rice mapping files have been developed (Degenkolbe et al. 2009; Howell et al. 2009). They could prove useful to identify further graminaceous-specific genes and clades, which are mis-annotated or not annotated in the current maize ontology.

The transfer of the MapMan ontology to create mapping files for other species is currently being upgraded to a fully automated process. This will be essential in order to deal efficiently with continual updates of annotations, both of the crop plant and of Arabidopsis genes and Arabidopsis mapping files. The automated process will rely on hundreds of novel classifications and sub-classifications of characterized proteins in any plant species. To this end, all plant proteins that have been manually included in SwissProt, as well as many specific protein domains and clusters that are manually entered into other databases, have been classified using the MapMan ontology. The automated process already reaches an assignment accuracy of more than 70% when compared with manual classifications. Links to other curated databases like Reactome will also be established, and reactions will be made downloadable.

The MapMan ontology is also being expanded to accommodate species or plant family-specific processes that are absent from Arabidopsis (Urbanczyk-Wochniak et al. 2006). This will allow us to extend the MapMan ontology to any plant, while at the same time relying on reference annotation from manual classifications. It will also reduce the number of ‘blind spots’ within MapMan which arise from metabolic processes encountered in a novel species which have not yet been described with any MapMan reference classification.

In summary, MapMan can be used with any (plant) species, where gene models or unigenes exist. As MapMan can be used with any kind of numerical data, quantitative data from deep sequencing platforms can be displayed in MapMan as long as the reads can be clustered or mapped against unigenes which could even be derived from the deep sequencing runs themselves.

A CASE STUDY IN CROSS-SPECIES KNOWLEDGE TRANSFER

To illustrate the flexibility of MapMan, we describe a case study showing how the MapMan tools can be used to interpret the global expression response in the crop plant maize, and investigate strategies to compare the response in maize with that in the reference species Arabidopsis.

We chose maize as an exemplary crop plant, as a high-quality microarray platform is available and because it represents a major food and feed crop and it is a major source for biofuels. More generally, a study of the similarities and differences between Arabidopsis and graminaceous species would allow knowledge transfer to important monocotyledonous crop plants, which diverged about 150–200 million years ago from the dicots (Wolfe et al. 1989; Mitchell-Olds & Clauss 2002).

We investigated the changes of gene expression at the end of the day, the end of the night and after a short extension of the night. This is a simple and generic biological response. All plants are exposed to a diurnal alternation between light and dark. Many buffer against these changes by accumulating part of the new photoassimilate as starch during the light period. The starch is then degraded during the night so that very little of it remains by the end of the night (Smith & Stitt 2007; Stitt et al. 2007). A short extension of the night leads to exhaustion of starch and acute carbon starvation (Gibon et al. 2004). In Arabidopsis, hundreds of genes show changes in expression during the night (Bläsing et al. 2005), and thousands in the first hours of an extended night (Gibon et al. 2004; Usadel et al. 2008b). Comparisons with responses in simpler treatments have been incorporated into a model that shows that these changes are generated by an interplay between clock-, sugar- and light-signalling, with sugars playing an especially large role (Usadel et al. 2008b).

To investigate the analogous responses in maize, we performed an experiment in which maize plants were grown for 14 d in a 14 h light/10 h dark cycle, and the mature part of the fourth leaf was harvested close to the end of the light period, shortly before the end of the night, and after a short 6 h extension of the night. As in Arabidopsis, starch levels were high at the end of the light period and were almost completely depleted by the end of the night, and a 6-h extension of the night led to a very marked decrease in sugar levels (Supporting Information Table S1). In addition, we also harvested the immature segment from below the ligule at the end of the light period, to allow a simple developmental comparison. This tissue was visually very pale and had very low levels of chlorophyll a (Supporting Information Table S1).

Using MapMan's metabolism overview visualization to validate the response of a control experiment in maize

When dealing with novel technologies like the GeneChip Maize Genome Array and the MapMan maize gene ontology, it is important to perform baseline experiments to provide an overview of their performance and limitations. Ideally, such experiments should be scientifically interesting, but at the same time they should provide results that can easily be tested experimentally or checked against the literature.

We analysed the whole data set by principal component analysis and hierarchical cluster analysis, to assess the general effect of the time of harvest and the developmental stage of the maize leaves on the transcript profile. This was tested by asking whether ‘replicate’ samples grouped together. In the principal components plot (Fig. 2a), the strongest visual separation is obtained between the different tissues, followed by the time of harvest, and replicate samples were closely related to each other. This was confirmed by hierarchical clustering using average linkage (Fig. 2b), where, again, the strongest separation is observed for the different tissue type, whereas samples from the same diurnal time point clustered together.

Figure 2. Experimental setup and technical reproducibility control. For the whole maize array dataset, Robust Multichip Average expression measures were calculated; the dataset was then subjected to (a) principal component analysis (PCA) and (b) hierarchical cluster analysis, using 1 minus the pairwise correlation as the distance and average linkage. In the PCA, extended night samples are coloured in black, end of night samples in grey and day samples in orange. Samples from below the ligule at the end of the day are coloured in red. The same colour scheme was applied in the hierarchical cluster analysis.

After this check of basic reproducibility, the transcriptional changes during maize leaf development were chosen as the initial baseline experiment. Leaf material harvested from below the ligule is still in development, is visually very pale and is not photosynthetic. Thus, changes in the expression of genes involved in photosynthesis and related processes were to be expected when comparing the pale tissue below the ligule to the green tissue above the ligule.

The data was analysed using limma (Smyth 2004) to identify transcripts that show a significant change in their level during this developmental switch, and these changes were then visualized and interpreted using the MapMan software. The data as well as the necessary steps to reproduce the analyses described here are provided in the supplemental material (Supporting Information Dataset S1 Robin maize; see also Supporting Information Maize Case Study Tutorial S1). The Experimental Data File loaded in this example contained, for each gene on the maize Affymetrix GeneChip array, the log2 ratio of the signal in the immature compared with the mature tissue. The expression data was loaded into ImageAnnotator and visualized using the ‘metabolism overview’ map to provide a general overview of the transcriptional changes in central metabolic processes. A colour scale was selected in which blue and red correspond to an increase and a decrease, respectively, in mature tissue compared with immature tissue. The scale was set to saturate at a log2 value of 3 (2^3, i.e. an eightfold change). A logarithmic false-colour scale is used to avoid colouring small changes (see Thimm et al. 2004).
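The effect of a saturating scale can be illustrated with a short sketch. It assumes a simple linear mapping of the clamped log2 ratio onto the interval from −1 to 1; the actual colour mapping used by ImageAnnotator is not reproduced here, and the function name is hypothetical.

```python
def to_colour_value(log2_ratio, saturation=3.0):
    """Clamp a log2 ratio to [-saturation, +saturation] and scale it to [-1, 1]."""
    clamped = max(-saturation, min(saturation, log2_ratio))
    return clamped / saturation


# A log2 value of 3 corresponds to a 2**3 = 8-fold change.
fold_at_saturation = 2 ** 3

for ratio in (-4.0, -1.5, 0.0, 1.5, 5.0):
    print(ratio, to_colour_value(ratio))
```

Any change larger than eightfold therefore receives the same full-intensity colour as an eightfold change.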

Inspection of the display revealed several general trends in the immature tissue, including down-regulation of transcripts for proteins involved in the light reactions, tetrapyrrol and carotenoid biosynthesis and the Calvin cycle (see Supporting Information Fig. S1a). This meets the expectation derived through previous published experiments, and thus validates the analytic technologies. Transcripts for phosphoenolpyruvate carboxylase and carbonic anhydrases were down-regulated in immature leaf tissue compared with mature leaf tissue. Both of these enzymes are potentially involved in the C4 photosynthesis of maize. There are also more subtle changes such as an up-regulation of the mevalonate (MVA) pathway in the tissue below the ligule. This could be explained as a consequence of the down-regulation of the chloroplast localized deoxyxylulose 5-phosphate (DXP) pathway. MVA and DXP are alternative pathways for terpenoid synthesis.

The built-in Wilcoxon test was used to identify pathways in which many genes undergo coordinated changes. This revealed, for example, that there is a coordinated decrease of transcripts for plastidial ribosomal subunits in the immature tissue (Supporting Information Fig. S1c). This is in accordance with expectations, as the plastidic ribosomes are mainly involved in translating plastid-encoded genes for components of the photosynthetic apparatus. Cell wall-related processes and the proteasome were up-regulated in the immature tissue compared with the mature tissue (Supporting Information Fig. S1a,b), potentially indicating that the developmental changes require a change not only in the cell wall composition but also of the protein inventory.

Using pathway displays to show shared responses between the diurnal cycle and an artificial extension of the night in maize and Arabidopsis

For the remainder of the analysis, we concentrated on time-dependent changes of transcript levels in the mature leaf samples. We first explored the diurnal response between the end of the day (ED) and the end of the night (EN). After normalizing and analysing these arrays, an Experimental Data File was created containing the logarithm of the ratio of ED : EN and loaded into ImageAnnotator. The response was first visualized using the metabolic overview display. For comparison, we inspected a similar file created from published studies on Arabidopsis (Bläsing et al. 2005; Supporting Information Dataset S2). Screen shots of the displays are juxtaposed in Fig. 3. [Note that EN is taken as the reference in both panels. Thus, transcripts that decrease during the night are assigned a positive sign (blue), whereas transcripts that decrease in the extended night receive a negative sign (red shading).]

Figure 3. The response of maize and Arabidopsis to the diurnal cycle and an extended night. The log2 fold changes between the end of the day (ED) and the end of the night (EN) (a,c) or a night extended by 6 h (XN) versus the end of the night (b,d) in both Arabidopsis (a,b) and maize (c,d) were visualized using the MapMan software. Red indicates a decrease whereas blue indicates an increase (see colour scale in b). The insets in Fig. 3a,c show the response of the cytoplasmatic ribosomes.

Visualization of the changes between dawn and dusk data points in the two species using the metabolic overview map did not immediately reveal marked similarities (Fig. 3a,c). One major difference was identified by the Wilcoxon test feature from within ImageAnnotator; although the cytoplasmatic ribosomal subunits were up-regulated at the end of the day in maize, the response was much weaker in Arabidopsis (not shown). This was confirmed by viewing the diagram for ‘RNA and protein synthesis’ (see inset in Fig. 3a,c). Some shared responses of pathways could nevertheless be identified. These included strong and similar responses of a set of genes involved in starch metabolism in both species. Trehalose-6-phosphate synthase (TPS)-like genes (see below for further discussion) also showed some shared response; many are down-regulated at the end of the day in Arabidopsis and maize. This is in accordance with the important role of trehalose-6-phosphate for sugar signalling (see below for further discussion).

Exposure of Arabidopsis to a short prolongation of the night leads to a much more marked response, affecting thousands of genes (Thimm et al. 2004; Usadel et al. 2008b). The influence of an extended night in maize and Arabidopsis was compared using a similar approach to that outlined above (Fig. 3b,d). The transcriptional changes to an extension of the night seem to be much more prominent in Arabidopsis than in maize. This might be due to technical reasons. Potentially, the maize array might underestimate the true fold change more than the Arabidopsis microarray does (Czechowski et al. 2004). Many common themes emerged between Arabidopsis and maize for this more extreme treatment (see next paragraph), possibly indicating that the day-to-day diurnal change is less conserved and is rather adapted to individual plant species. It will be interesting in the future to compare other monocotyledons with C3 and C4 photosynthesis, to learn if these differences in the expression response might be due to differences in the photosynthetic pathway, to phylogeny or to other factors.

One shared response between Arabidopsis and maize, which was already apparent in the diurnal cycle and is confirmed in the extended night treatment, is the induction of many TPS-like genes. There is mounting evidence that trehalose-6-phosphate acts as a sugar signal in Arabidopsis (Lunn 2007; Stitt et al. 2007). TPS is encoded by a family of 11 genes in Arabidopsis, most of which contain modifications in their sequence, indicating that they lack catalytic activity. These TPS-like genes emerged early in evolution (Lunn 2007). Our present results indicate that they might play similar roles in sugar signalling across divergent plant species. Other shared responses that are evident from Fig. 3b,d include the down-regulation of photosystem and tetrapyrrol biosynthesis, of glycolysis and of several sectors of secondary metabolism. In Arabidopsis, this includes down-regulation of the DXP pathway (Gibon et al. 2006). In maize, several genes of the DXP pathway were about 1.5-fold down-regulated, but the changes of individual genes lacked statistical significance. Nevertheless, as relatively few DXP pathway genes were found on the maize Affymetrix chip, unmeasured genes might explain this behaviour. Indeed, the fact that some genes might not be represented on a particular platform should always be kept in mind when interpreting microarray data.

Pathway display of statistically evaluated data sets increases the signal-to-noise ratio

In order to get a clearer picture of the responses in the two species, the data sets were analysed statistically and reloaded into ImageAnnotator as extended Experimental Data Files, in which P-values were included (see Supporting Information Maize Case Study Tutorial S1, pp. 11–15), and transcripts which did not show a significant change at a false discovery rate of 5% were greyed out using ImageAnnotator's filter function (Fig. 4). This provided a much clearer picture for both Arabidopsis and maize (compare Fig. 4 with Fig. 3). For example, inspecting this filtered display revealed that several genes involved in fatty acid synthesis are down-regulated at the end of the day in maize. Although there was a mixed response for these genes in this category in Arabidopsis, several genes were again down-regulated.
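A filter of this kind can be sketched as follows. The text specifies only a false discovery rate of 5%; the sketch assumes the Benjamini-Hochberg procedure for the adjustment, and the gene identifiers and P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value downwards so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four transcripts.
pvals = {"gene_a": 0.001, "gene_b": 0.02, "gene_c": 0.03, "gene_d": 0.5}
adjusted = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

# Transcripts above the threshold would be greyed out in the display.
kept = {gene for gene, q in adjusted.items() if q <= 0.05}
print(sorted(kept))
```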

Figure 4. The filtered response of maize and Arabidopsis to the diurnal cycle and an extended night. The data from Fig. 3 was filtered based on a statistical assessment of the microarray data performed using R. Subsequently, only significantly changed genes were displayed in MapMan.

In conclusion, filtering the data based on statistical evaluation or based upon prior knowledge represents a potent feature of ImageAnnotator in order to be able to increase the signal-to-noise ratio and focus on fewer but more important changes.

Automated display of the responses of small sets of genes across several arrays

The experiments performed in maize can be placed in a temporal sequence (end of the day, end of the night and extension of the night). We investigated whether any responses followed a clear trajectory along this time sequence. In Arabidopsis, inositol catabolism is repressed at the end of the day, remains repressed during the night, but is strongly induced when the night is extended (Gibon et al. 2006). To visualize the inositol pathway with the data from this short time course in ImageAnnotator, we used the new multiple-experiment mode, which allows the responses of a small set of genes to be compared across multiple treatments. The data sets were first pre-processed by calculating the difference to the average expression level for each of the three time points. The transformed data for the three data sets were selected simultaneously in ImageAnnotator by clicking them with the mouse while holding down the shift and/or control key (see Supporting Information MapMan Guide S1, p. 40 and MapMan Case Study Tutorial S1, pp. 20–23). The selection of multiple experimental time points automatically switches ImageAnnotator into a multi-experiment mode. In this display mode, grey lines depict individual genes or metabolites, the red line represents the average response of the displayed items and the upper and lower green lines depict the average ± one standard deviation (Fig. 5). This display highlights that there is a marked induction of inositol catabolism towards the end of the night in maize, which becomes stronger when the night is extended.
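The pre-processing and the summary lines of the multi-experiment display can be sketched as follows. The sketch assumes that each gene is centred on its own average across the three time points and that the sample standard deviation is used; neither detail is specified above, and the expression values are hypothetical.

```python
from statistics import mean, stdev


def centre_on_gene_mean(profile):
    """Subtract a gene's average across time points from each time point."""
    avg = mean(profile)
    return [value - avg for value in profile]


def summarise(profiles):
    """Per time point, return (mean - SD, mean, mean + SD) across genes."""
    centred = [centre_on_gene_mean(profile) for profile in profiles]
    summary = []
    for column in zip(*centred):  # one column per time point
        m, s = mean(column), stdev(column)
        summary.append((m - s, m, m + s))
    return summary


# Hypothetical log2 expression values at ED, EN and XN for three genes.
profiles = [
    [8.0, 9.0, 10.0],
    [5.0, 5.5, 7.5],
    [6.0, 6.0, 9.0],
]
for label, (lower, average, upper) in zip(("ED", "EN", "XN"), summarise(profiles)):
    print(label, lower, average, upper)
```

The three values per time point correspond to the lower green, red and upper green lines of the display, whereas the centred profiles correspond to the grey lines.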

Comparison of responses at the levels of functional categories using PageMan

To provide a comparative overview of the responses in all of the treatments, we loaded the data into PageMan (Fig. 6, for the steps see Supporting Information Maize Case Study Tutorial S1, p. 25ff) and applied a Wilcoxon test for each category. This reveals whether the genes in this particular category behave differently to all the other genes on the chip. We used the same MapMan files as the ones that were used for the graphical displays. Although the mapping files for Arabidopsis and maize contain different genes, they use the same categories and hierarchy. Thus, PageMan allows direct comparison of data from these two species. An ‘add data’ function in PageMan allows the different data sets to be combined online.

Figure 6. A condensed PageMan display of changed pathways. The log2 fold changes between the end of the day (ED) and the end of the night (EN) and short extension(s) of the night (XN) and the end of the night were imported into PageMan both for maize (Zma) and Arabidopsis (Ath). The data was subjected to a Wilcoxon test in PageMan and the results were displayed false-colour coded. BINs coloured in red are significantly down-regulated relative to the rest of the array, whereas BINs coloured in blue are up-regulated.

This analysis condenses the ca. 14 000 genes present on both the maize and Arabidopsis chip to about 1000 categories. These were further compressed by removing categories that did not show a significantly different change in a given species (Fig. 6). Visual inspection revealed which categories show a qualitatively similar or different response in both species. Again, changes in minor carbohydrates metabolism such as trehalose metabolism during the diurnal cycle and an extended night became immediately visible (Fig. 6). However, other changes could be observed as well. One example is the coordinated change in transcripts for protein synthesis and degradation, which are consistently seen in Arabidopsis and maize. For example, genes in the category protein synthesis are induced at ED compared with EN, and are repressed in the extended night treatment. Other similar responses include changes in genes involved in amino acid and nitrogen metabolism, where a general down-regulation is seen in both species during the night and also in the extended night. A further example is the down-regulation of sets of genes involved in cell wall synthesis and lipid synthesis in both species in an extended night.

Comparison of maize and Arabidopsis responses using an ortholog-driven approach

This visual overview of the responses in maize and Arabidopsis reveals that some biological themes are conserved between these two species. Similarities have also been shown between monocots and dicots for the response to light (Jiao et al. 2005). The following section describes several approaches in which we attempted to carry out a more systematic comparison of the transcriptional responses of Arabidopsis and maize.

In the first approach, we tried to identify the potential orthologous pairs of sequences between Arabidopsis and maize in order to compare their transcriptional responses. Potential orthologous pairs were identified using a reciprocal best blast hit approach, using a range of e-value cut-offs. This approach identified several thousand potential orthologous sequences between maize and Arabidopsis. The expression response of these potential orthologous pairs was analysed to assess the degree of conservation of expression responses in the treatments used in our study. This conservation was captured by displaying the data as a scatter plot and calculating the Pearson correlation coefficients. This was done separately for the diurnal cycle and the extended night treatment. Correlation coefficients of up to 0.26 and 0.34 were observed for the diurnal cycle and the extended night treatment, respectively. In preliminary analyses, we used a range of different similarity thresholds to identify putative orthologous pairs. The best results were obtained with the most stringent setting (e-value < 1 × 10^−150).
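The two steps of this approach, pairing by reciprocal best hit and correlating the responses of the pairs, can be sketched as follows. The best-hit tables are assumed to have been filtered by e-value beforehand; all identifiers and response values are hypothetical.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    return [(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical best-hit tables for the two search directions.
ath_to_zma = {"At1": "Zm1", "At2": "Zm2", "At3": "Zm2", "At4": "Zm4", "At5": "Zm5"}
zma_to_ath = {"Zm1": "At1", "Zm2": "At3", "Zm4": "At4", "Zm5": "At5"}
pairs = reciprocal_best_hits(ath_to_zma, zma_to_ath)

# Hypothetical log2 responses to one treatment in each species.
ath_response = {"At1": 1.0, "At2": 0.7, "At3": -0.5, "At4": 2.0, "At5": 0.0}
zma_response = {"Zm1": 0.5, "Zm2": 0.2, "Zm4": 1.0, "Zm5": -0.3}

r = pearson(
    [ath_response[a] for a, _ in pairs],
    [zma_response[z] for _, z in pairs],
)
print(pairs, r)
```

In this example, At2 is dropped because its best maize hit points back to a different Arabidopsis sequence, which illustrates how the reciprocal criterion discards ambiguous assignments.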

Correlation coefficients of 0.26–0.34 indicate that although there is some conservation between Arabidopsis and maize, it is not very marked. In part, this might be because a simple blast-based approach erroneously flags sequences as orthologs, and/or misses orthologous sequences. This is likely, because the full genome of maize is not yet sequenced and true orthologs of Arabidopsis genes might not be included yet. It is also possible that in some cases genes are identified as orthologs based on sequence similarity that actually have different expression patterns between species.

The MapMan infrastructure was therefore used to investigate the usefulness of two further approaches, which do not rely on finding the true ortholog(s), and thus might be more error-tolerant.

Exploring cluster-wise comparisons of experiments in PageMan

In the first alternative approach, we used groups of orthologous genes, rather than gene pairs, to compare the expression response in the two species. Arabidopsis protein sequences and the maize nucleotide sequences were searched for gene signatures of KOGs (Tatusov et al. 2003) using RPS-Blast and preassembled KOG profiles (Marchler-Bauer et al. 2005). This led to the extraction of 2055 sequence clusters, whose members were found both on the Arabidopsis and the maize arrays. In order to compare the different data in PageMan, a mock mapping file was prepared (Supporting Information Dataset S3 and S4). The numerical KOG-Identifier was used as a BINcode (the hierarchy is therefore completely flat). The PageMan software was then used to calculate the average response per BIN, and P-values for a Wilcoxon test that the response of the BIN (which in this case represents a KOG category) changes relative to the remainder of the genes on the array. PageMan automatically converts these P-values to their corresponding z-scores (e.g. 1.96 for a P-value of 0.05, see also e.g. Sokal & Rohlf 1981). These scores are then plotted (Supporting Information Maize Case Study Tutorial S1, p. 25ff).
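The conversion of a two-sided P-value into a z-score can be reproduced with the inverse of the standard normal cumulative distribution function. The sketch below is an illustration of this arithmetic rather than PageMan's code; the `direction` argument, which attaches a sign to the score, is a hypothetical convention.

```python
from statistics import NormalDist


def p_to_z(p_value, direction=1):
    """Convert a two-sided P-value (0 < p <= 1) into a signed z-score."""
    # p = 0 is not accepted by inv_cdf and would need separate handling.
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return z if direction >= 0 else -z


print(round(p_to_z(0.05), 2))  # 1.96
print(round(p_to_z(0.05, direction=-1), 2))  # -1.96
```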

When the responses of the gene clusters were compared between the two species for the diurnal data sets, we obtained correlation coefficients of 0.248 for the average response and 0.255 for the Wilcoxon-based transformed P-values (Supporting Information Maize Case Study Tutorial S1, p. 39ff). The extended night samples showed somewhat more conservation, with correlation coefficients of 0.303 and 0.324, respectively. These correlation coefficients are similar to, but not better than, those obtained by comparisons at the single-gene level. This indicates that using clusters of orthologous genes instead of single putative orthologs does not improve the agreement between the responses in the two species.
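A comparison of this kind can be sketched as follows. The text does not state which correlation coefficient was used; the sketch assumes Pearson's coefficient, calculated over the categories that received a score in both species. The function names, identifiers and scores are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def cross_species_correlation(scores_a, scores_b):
    """Correlate per-category scores over the categories shared by both species.

    Returns the coefficient and the number of shared categories.
    """
    shared = sorted(set(scores_a) & set(scores_b))
    r = pearson([scores_a[k] for k in shared], [scores_b[k] for k in shared])
    return r, len(shared)


# Toy per-KOG z-scores; categories present in only one species are ignored.
arabidopsis = {"1001": 2.0, "1002": -1.0, "1003": 0.5, "1004": 1.0}
maize = {"1001": 1.0, "1002": -0.5, "1003": 0.25, "1005": 3.0}
r, n_shared = cross_species_correlation(arabidopsis, maize)
```

The same function can be applied to either the average response per category or the transformed P-values, which is how the two coefficients reported for each treatment would be obtained under these assumptions.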

Exploring functional class-wise comparisons of experiments in PageMan

We next used a pathway-based approach to condense and compare the responses in the two species. We did this using either KEGG terms or MapMan BINs. Both of these ontologies collect genes into groups that represent biological pathways and other processes. This might be more tolerant to errors in identifying functionally related genes than gene-by-gene approaches. Furthermore, such an approach should be more robust to species-dependent differences in which steps or genes of a pathway are subject to transcriptional regulation.

The experimental data were loaded into PageMan using either the mapping files supplied by MapMan or, for the KEGG ontology, mapping files produced by a converter available from the PageMan web site, which converts KEGG assignments into the ontology format displayed in Table 1 (Supporting Information MapMan Guide S1, pp. 86–91). In both cases, the Wilcoxon test was used to identify significantly changing categories.

We were able to identify marked similarities between the responses of Arabidopsis and maize to the diurnal cycle as well as to an extension of the night. When the KEGG ontology was used for the class-wise assessment, we obtained correlation values of 0.28 and 0.44 for the species comparison of the end of the night to the end of the day, and of the end of the night to an extended night, respectively (data not shown). The MapMan ontology yielded values of 0.38 and 0.54, respectively (Fig. 7). Examples of categories that showed highly significant changes are noted in Fig. 7. For example, in both treatments there were strong conserved responses of E3 ligases as well as of constituents of the mitochondrial and plastidial ribosomal subunits. Note that, as EN is taken as the reference in both panels, a positive sign in the diurnal change signifies that expression of that gene category decreases during the night. Conversely, for the cytoplasmic ribosomal subunits, a strong change was only detected in maize (see also Fig. 3a,c). During the diurnal cycle, well-conserved responses were also found for pentatricopeptide repeat proteins, histones and amino acid metabolism. In the case of the extended night experiments, conserved responses were also found for ubiquitin, regulation of transcription and posttranscriptional modification.

Figure 7. Comparison of the response of functional categories in Arabidopsis and maize. The data from a PageMan Wilcoxon test were exported for changes between the end of the day (ED) or an extended night (XN) and the end of the night (EN) in Arabidopsis and maize. The corresponding experiments from the two species were visualized in a scatter plot. Some categories that showed strongly significant changes in the Wilcoxon test are annotated.

In summary, the ontology-based analysis was more powerful than the sequence-based analyses, irrespective of whether the latter used single putative orthologs or clusters of putative orthologs. Different ontologies can be used to condense the results and identify conserved responses across the two species. In the example analysed here, MapMan terms yielded higher correlations than KEGG terms. This might be due to the lower redundancy of MapMan terms and/or the manual annotation procedure that followed the automatic annotation. It also cannot be excluded that some of the general MapMan terms, which help to classify genes broadly, contributed to these results. Furthermore, it has to be noted that the MapMan terms include signalling processes as well as metabolism.

More generally, these results show that it is possible to condense and compare results across species with applications like PageMan, which use a priori knowledge to sort genes into a functional ontology and compare the responses of the different gene categories.

CONCLUDING STATEMENTS

In this report, we show that maize expression array data can be visualized and interpreted using MapMan, taking advantage of tools that were initially developed for the reference species Arabidopsis. We also explore different ways in which MapMan tools can be used to compare data across these two species. This depends heavily on the flexible and modular structure of MapMan. It should be noted that Arabidopsis and maize are phylogenetically very distant, and that the analysis was done while still relying on partial genome sequence information for maize. The strategies outlined in this paper should therefore be applicable to a very wide range of crops and wild species. Indeed, MapMan ontologies are already available for Chlamydomonas (May et al. 2008), and for dedicated transcript arrays for tomato (Urbanczyk-Wochniak et al. 2006), Medicago (Tellström et al. 2007), rice (Howell et al. 2009) and others.

Developments in omics technologies will pose new challenges for omics data visualization and interpretation. For example, proteomics has matured in plant research, allowing the quantification of thousands of proteins (Baerenfaller et al. 2008). Emerging metabolomics technologies such as Fourier transform mass spectrometry (FT-MS) allow the identification and quantification of many more metabolites than was previously possible (Giavalisco et al. 2008). The modularity and flexibility offered by MapMan will be useful in keeping up with these changes, allowing it to continue to provide novel ways to organize, evaluate and display data. It will also be adapted to more plant species, relying on increasingly sophisticated methods for knowledge transfer. Thus, MapMan will aid in data visualization, analysis and interpretation in the next 5 years of omics research.

ACKNOWLEDGMENTS

Thanks are due to Dr Florian Wagner at the RZPD Company (Berlin, Germany) for carrying out the microarray hybridization and to Dr Leonard Krall and Borjana Arsova for helpful comments on the manuscript. This research was supported by the Max Planck Society and the BMBF-funded GABI projects GABI MapMen 0315049A and GABI-Cool 0313111D.

REFERENCES

  • Arabidopsis Genome Initiative. (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815.
  • Baerenfaller K., Grossmann J., Grobei M.A., Hull R., Hirsch-Hoffmann M., Yalovsky S., Zimmermann P., Grossniklaus U., Gruissem W. & Baginsky S. (2008) Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science 320, 38–41.
  • Benjamini Y. & Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289–300.
  • Bläsing O., Gibon Y., Günther M., Höhne M., Osuna D., Thimm T., Scheible W.-R., Morcuende R. & Stitt M. (2005) Sugars and circadian regulation make major contributions to the global regulation of diurnal gene expression in Arabidopsis. The Plant Cell 17, 3257–3281.
  • Bolstad B.M., Irizarry R.A., Astrand M. & Speed T.P. (2003) A comparison of normalization methods for high density oligonucleotide array data based on bias and variance. Bioinformatics 19, 185–193.
  • Brady S.M., Song S., Dhugga K.S., Rafalski J.A. & Benfey P.N. (2007) Combining expression and comparative evolutionary analysis. The COBRA gene family. Plant Physiology 143, 172–187.
  • Covington M.F., Maloof J.N., Straume M., Kay S.A. & Harmer S.L. (2008) Global transcriptome analysis reveals circadian regulation of key pathways in plant growth and development. Genome Biology 9, R130.
  • Craigon D.J., James N., Okyere J., Higgins J., Jotham J. & May S. (2004) NASCArrays: a repository for microarray data generated by NASC's transcriptomics service. Nucleic Acids Research 32, D575–D577.
  • Curtis R.K., Oresic M. & Vidal-Puig A. (2005) Pathways to the analysis of microarray data. Trends in Biotechnology 23, 429–435.
  • Czechowski T., Bari R.P., Stitt M., Scheible W.R. & Udvardi M.K. (2004) Real-time RT-PCR profiling of over 1400 Arabidopsis transcription factors: unprecedented sensitivity reveals novel root- and shoot-specific genes. The Plant Journal 38, 366–379.
  • Dahlquist K.D. (2004) Using genMAPP and mappfinder to view microarray data on biological pathways and identify global trends in the data. In Current Protocols in Bioinformatics (eds A.D. Baxevanis, D.B. Davison, R. Page, L. Stein & G. Stormo), pp. 7.5.1–7.5.26. John Wiley & Sons, Inc, New York, USA.
  • Degenkolbe T., Do P.T., Zuther E., Repsilber D., Walther D., Hincha D.K. & Köhl K.I. (2009) Expression profiling of rice cultivars differing in their tolerance to long-term drought stress. Plant Molecular Biology 69, 133–153.
  • Doehlemann G., Wahl R., Horst R.J., et al. (2008) Reprogramming a maize plant: transcriptional and metabolic changes induced by the fungal biotroph Ustilago maydis. The Plant Journal 56, 181–195.
  • Edwards K.D., Anderson P.E., Hall A., Salathia N.S., Locke J.C., Lynn J.R., Straume M., Smith J.Q. & Millar A.J. (2006) FLOWERING LOCUS C mediates natural variation in the high-temperature response of the Arabidopsis circadian clock. The Plant Cell 18, 639–650.
  • Fiehn O. (2002) Metabolomics – the link between genotypes and phenotypes. Plant Molecular Biology 48, 155–171.
  • Giavalisco P., Hummel J., Lisec J., Inostroza A.C., Catchpole G. & Willmitzer L. (2008) High-resolution direct infusion-based mass spectrometry in combination with whole (13)C metabolome isotope labeling allows unambiguous assignment of chemical sum formulas. Analytical Chemistry 80, 9417–9425.
  • Gibon Y., Bläsing O., Palacios-Rojas N., Pankovic D., Hendriks J.H.M., Fisahn J., Höhne M., Günther M. & Stitt M. (2004) Adjustment of diurnal starch turnover to short days: depletion of sugar during the night leads to a temporary inhibition of carbohydrate utilization, accumulation of sugars, and posttranslational activation of ADP glucose pyrophosphorylase in the following light period. The Plant Journal 39, 847–862.
  • Gibon Y., Usadel B., Blaesing O.E., Kamlage B., Hoehne M., Trethewey R. & Stitt M. (2006) Integration of metabolite with transcript and enzyme activity profiling during diurnal cycles in Arabidopsis rosettes. Genome Biology 7, R76.
  • Grosu P., Townsend J.P., Hartl D.L. & Cavalieri D. (2002) Pathway Processor: a tool for integrating whole-genome expression results into metabolic networks. Genome Research 12, 1121–1126.
  • Howell K.A., Narsai R., Carroll A., Ivanova A., Lohse M., Usadel B., Millar H. & Whelan J. (2009) Mapping metabolic and transcript temporal switches during germination in Oryza sativa highlights specific transcription factors and the role of RNA instability in the germination process. Plant Physiology 149, 961–980.
  • Jiao Y., Ma L., Strickland E. & Deng X.W. (2005) Conservation and divergence of light-regulated genome expression patterns during seedling development in rice and Arabidopsis. The Plant Cell 17, 3239–3256.
  • Junker B.H., Klukas C. & Schreiber F. (2006) VANTED: a system for advanced data analysis and visualization in the context of biological networks. BMC Bioinformatics 7, 109.
  • Kilian J., Whitehead D., Horak J., Wanke D., Weinl S., Batistic O., D'Angelo C., Bornberg-Bauer E., Kudla J. & Harter K. (2007) The AtGenExpress global stress expression data set: protocols, evaluation and model data analysis of UV-B light, drought and cold stress responses. The Plant Journal 50, 347–363.
  • Kopka J. (2005) Current challenges and developments in GC-MS based metabolite profiling technology. Journal of Biotechnology 124, 312–322.
  • Kusano M., Fukushima A., Arita M., Jonsson P., Moritz T., Kobayashi M., Hayashi N., Tohge T. & Saito K. (2007) Unbiased characterization of genotype-dependent metabolic regulations by metabolomic approach in Arabidopsis thaliana. BMC Systems Biology 1, 53.
  • Lunn J.E. (2007) Gene families and evolution of trehalose metabolism in plants. Functional Plant Biology 34, 550–563.
  • Marchler-Bauer A., Anderson J.B., Cherukuri P.F., et al. (2005) CDD: a Conserved Domain Database for protein classification. Nucleic Acids Research 33, D192–D196.
  • Masoudi-Nejad A., Goto S., Jauregui R., Ito M., Kawashima S., Moriya Y., Endo T.R. & Kanehisa M. (2007) EGENES: transcriptome-based plant database of genes with metabolic pathway information and expressed sequence tag indices in KEGG. Plant Physiology 144, 857–866.
  • May P., Wienkoop S., Kempa S., et al. (2008) Metabolomics- and proteomics-assisted genome annotation and analysis of the draft metabolic network of Chlamydomonas reinhardtii. Genetics 179, 157–166.
  • Michael T.P., Mockler T.C., Breton G., et al. (2008) Network discovery pipeline elucidates conserved time-of-day-specific cis-regulatory modules. PLoS Genetics 4, e14.
  • Mitchell-Olds T. & Clauss M.J. (2002) Plant evolutionary genomics. Current Opinion in Plant Biology 5, 74–79.
  • Nettleton D. (2006) A discussion of statistical methods for design and analysis of microarray experiments for plant scientists. The Plant Cell 18, 2112–2121.
  • Osuna D., Usadel B., Morcuende R., et al. (2007) Temporal responses of transcripts, enzyme activities and metabolites after adding sucrose to carbon-deprived Arabidopsis seedlings. The Plant Journal 49, 463–491.
  • Presterl T., Ouzunova M., Schmidt W., Moller E.M., Rober F.K., Knaak C., Ernst K., Westhoff P. & Geiger H.H. (2007) Quantitative trait loci for early plant vigour of maize grown in chilly environments. Theoretical and Applied Genetics 114, 1059–1070.
  • Ren X.Y., Stiekema W.J. & Nap J.P. (2007) Local coexpression domains in the genome of rice show no microsynteny with Arabidopsis domains. Plant Molecular Biology 65, 205–217.
  • Scheible W.R., Morcuende R., Czechowski T., Fritz C., Osuna D., Palacios-Rojas N., Schindelasch D., Thimm O., Udvardi M.K. & Stitt M. (2004) Genome-wide reprogramming of primary and secondary metabolism, protein synthesis, cellular growth processes, and the regulatory infrastructure of Arabidopsis in response to nitrogen. Plant Physiology 136, 2483–2499.
  • Schena M., Shalon D., Davis R.W. & Brown P.O. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–470.
  • Smith A.M. & Stitt M. (2007) Coordination of carbon supply and plant growth. Plant, Cell & Environment 30, 1126–1149.
  • Smyth G.K. (2004) Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, Article 3.
  • Sokal R.R. & Rohlf F.J. (1981) Biometry, 2nd edn. Freeman, New York, USA.
  • Stitt M. & Fernie A.R. (2003) From measurements of metabolites to metabolomics: an ‘on the fly’ perspective illustrated by recent studies of carbon–nitrogen interactions. Current Opinion in Biotechnology 14, 136–144.
  • Stitt M., Gibon Y., Lunn J.E. & Piques M. (2007) Multilevel genomics analysis of carbon signaling during low carbon availability: coordination of the supply and of utilization carbon in a fluctuating environment. Functional Plant Biology 34, 526–549.
  • Swarbreck D., Wilks C., Lamesch P., et al. (2008) The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Research 36, D1009–D1014.
  • Tatusov R.L., Fedorova N.D., Jackson J.D., et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4, 41.
  • Tellström V., Usadel B., Thimm O., Stitt M., Küster H. & Niehaus K. (2007) The lipopolysaccharide of Sinorhizobium meliloti suppresses defense-associated gene expression in cell cultures of the host plant Medicago truncatula. Plant Physiology 143, 825–837.
  • Thimm O., Blaesing O.E., Gibon Y., Nagel A., Meyer S., Krueger P., Selbig J., Mueller L.A., Rhee S.Y. & Stitt M. (2004) Mapman: a user-driven tool to display genomics datasets onto diagrams of metabolic pathways and other biological processes. The Plant Journal 37, 914–939.
  • Tokimatsu T., Sakurai N., Suzuki H., Ohta H., Nishitani K., Koyama T., Umezawa T., Misawa N., Saito K. & Shibata D. (2005) KaPPA-view: a web-based analysis tool for integration of transcript and metabolite data on plant metabolic pathway maps. Plant Physiology 138, 1289–1300.
  • Trethewey R.N. (2004) Metabolite profiling as an aid to metabolic engineering in plants. Current Opinion in Plant Biology 7, 196–201.
  • Tsesmetzis N., Couchman M., Higgins J., et al. (2008) Arabidopsis reactome: a foundation knowledgebase for plant systems biology. The Plant Cell 20, 1426–1436.
  • Urbanczyk-Wochniak E., Usadel B., Thimm O., et al. (2006) Conversion of MapMan to allow the analysis of transcript data from solanaceous species: effects of genetic and environmental alterations in energy metabolism in the leaf. Plant Molecular Biology 60, 773–792.
  • Usadel B., Nagel A., Thimm O., et al. (2005) Extension of the visualisation tool MapMan to allow statistical analysis of arrays, display of corresponding genes and comparison with known responses. Plant Physiology 138, 1195–1204.
  • Usadel B., Nagel A., Steinhauser D., et al. (2006) PageMan: an interactive ontology tool to generate, display, and annotate overview graphs for profiling experiments. BMC Bioinformatics 7, 535.
  • Usadel B., Bläsing O.E., Gibon Y., Poree F., Höhne M., Günter M., Trethewey R., Kamlage B., Poorter H. & Stitt M. (2008a) Multilevel genomics analysis of the response of transcripts, enzyme activities and metabolites in Arabidopsis rosettes to a progressive decrease of the temperature in the non-freezing range. Plant, Cell & Environment 31, 518–547.
  • Usadel B., Bläsing O.E., Gibon Y., Retzlaff K., Höhne M., Günther M. & Stitt M. (2008b) Global transcript levels respond to small changes of the carbon status during progressive exhaustion of carbohydrates in Arabidopsis rosettes. Plant Physiology 146, 1834–1861.
  • Wise R.P., Caldo R.A., Hong L., Shen L., Cannon E.K. & Dickerson J.A. (2007) BarleyBase/PLEXdb: a unified expression profiling database for plants and plant pathogens. In Methods in Molecular Biology, Vol. 406, Plant Bioinformatics – Methods and Protocols (ed. D. Edwards), pp. 347–363. Humana Press, Totowa, NJ, USA.
  • Wolfe K.H., Gouy M., Yang Y.W., Sharp P.M. & Li W.H. (1989) Date of the monocot-dicot divergence estimated from chloroplast DNA sequence data. Proceedings of the National Academy of Sciences of the United States of America 86, 6201–6205.
  • Yonekura-Sakakibara K., Tohge T., Matsuda F., Nakabayashi R., Takayama H., Niida R., Watanabe-Takahashi A., Inoue E. & Saito K. (2008) Comprehensive flavonol profiling and transcriptome coexpression analysis leading to decoding gene–metabolite correlations in Arabidopsis. The Plant Cell 20, 2160–2176.
  • Zdobnov E.M. & Apweiler R. (2001) InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17, 847–848.
  • Zhang P., Foerster H., Tissier C.P., Mueller L., Paley S., Karp P.D. & Rhee S.Y. (2005) MetaCyc and AraCyc. Metabolic pathway databases for plant research. Plant Physiology 138, 27–37.
  • Zimmermann P., Hirsch-Hoffmann M., Hennig L. & Gruissem W. (2004) GENEVESTIGATOR. Arabidopsis microarray database and analysis toolbox. Plant Physiology 136, 2621–2632.

Supporting Information

Figure S1. The difference in transcripts from below versus above the ligule was visualized using the MapMan maps (A) Metabolism overview, (B) Proteasome and (D) RNA-Protein Synthesis. In panel (C), a screenshot of the Wilcoxon analysis performed in MapMan is reproduced.

Table S1. Metabolite measurements in maize leaves measured at the end of the night (EN), the end of the day (ED) and in an extended night (XN).

MapMan Guide S1. This guide shows the steps to be taken by the user to become familiar with MapMan. It includes instructions on how to visualize data provided with the StartUp package as well as user-supplied data.

Maize Case Study Tutorial S1. This tutorial explains in detail how the figures shown here can be reproduced by providing step-by-step instructions.

Dataset S1. This shows the statistical analysis of the maize data, including log2 fold changes and an indication if the change is deemed significant. Statistical flags are indicated by a ‘d_’ in front of the experiment name. Otherwise, EN, ED and XN refer to time points close to the end of the night, the end of the day and an extended night, respectively.

Dataset S2. This shows a statistical analysis of the Arabidopsis experiment, including log2 fold changes and an indication if the change is deemed significant. Statistical flags are indicated by a ‘d_’ in front of the experiment name. Otherwise, EN, ED and XN refer to time points at the end of the night, the end of the day and an extended night which has been prolonged by 4, 6 or 8 h, respectively.

Dataset S3. A customized mapping file using KOG terms to sort Arabidopsis Affymetrix probe set identifiers into categories.

Dataset S4. A customized mapping file using KOG terms to sort maize Affymetrix probe set identifiers into categories.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Filename                                                                  Size    Description
PCE_1978_sm_suppmat_case study.pdf                                        2853K   Supporting info item
PCE_1978_sm_suppmat_fig.pdf                                               609K    Supporting info item
PCE_1978_sm_suppmat_MapMan_Guide.pdf                                      3935K   Supporting info item
PCE_1978_sm_supporting_datafile_D1_Robin_maize.txt                        1061K   Supporting info item
PCE_1978_sm_supporting_datafile_D2_ENXNED_Arabidopsis_results.txt         1530K   Supporting info item
PCE_1978_sm_supporting_datafile_D3_ATHAffy_KOG_mapping_file.txt           525K    Supporting info item
PCE_1978_sm_supporting_datafile_D4_MaizeAffy_KOG_mapping_file.txt         338K    Supporting info item