Expression Profiling of Cassava Storage Roots Reveals an Active Process of Glycolysis/Gluconeogenesis
Article first published online: 25 JAN 2011
© 2011 Institute of Botany, Chinese Academy of Sciences
Journal of Integrative Plant Biology
Volume 53, Issue 3, pages 193–211, March 2011
How to Cite
Yang, J., An, D. and Zhang, P. (2011), Expression Profiling of Cassava Storage Roots Reveals an Active Process of Glycolysis/Gluconeogenesis. Journal of Integrative Plant Biology, 53: 193–211. doi: 10.1111/j.1744-7909.2010.01018.x
- Issue published online: 22 FEB 2011
- Article first published online: 25 JAN 2011
- Accepted manuscript online: 7 DEC 2010 04:18AM EST
- Received 25 Nov. 2010 Accepted 1 Dec. 2010
Mechanisms related to the development of cassava storage roots and starch accumulation remain largely unknown. To evaluate genome-wide expression patterns during tuberization, a 60 mer oligonucleotide microarray representing 20 840 cassava genes was designed to identify differentially expressed transcripts in fibrous roots, developing storage roots and mature storage roots. Using a random variance model and the traditional twofold change method for statistical analysis, 912 and 3 386 differentially expressed (upregulated or downregulated) genes, respectively, were identified across the three developmental phases. Among the 25 significantly changed pathways identified, glycolysis/gluconeogenesis was the most prominent. Rate-limiting enzymes were identified from individual pathways, for example, enolase, L-lactate dehydrogenase and aldehyde dehydrogenase for glycolysis/gluconeogenesis, and ADP-glucose pyrophosphorylase, starch branching enzyme and glucan phosphorylase for sucrose and starch metabolism. This study revealed dynamic changes in at least 16% of the total transcripts, including transcription factors, oxidoreductases/transferases/hydrolases, hormone-related genes, and effectors of homeostasis. The reliability of these differentially expressed genes was verified by quantitative real-time reverse transcription-polymerase chain reaction. These results should facilitate our understanding of storage root formation and support cassava improvement.
Cassava (Manihot esculenta Crantz) provides the major source of dietary carbohydrates for almost 750 million people throughout the tropics (Cock 1982; Nassar and Ortiz 2010). The majority of the cassava that is produced is used for human foods, livestock feeds and starches in small-scale industries (Lopez et al. 2005). The storage root is a key organ for the direct production of cassava. In many circumstances, its yield reflects the productivity of the entire plant (Alves and Cameira 2002). The physiological significance of the storage roots is belied by their relative structural simplicity compared with other plant organs: roots largely lack some of the major metabolic pathways, such as photosynthesis, and they have a stereotypical morphology that is conserved throughout the stages of cassava root development and throughout the life cycle of individual plants. This combination of physiological relevance and structural simplicity has made storage roots excellent targets for functional genomic analyses (Jiang and Deyholos 2006). It is possible to reveal the dynamic mechanisms of cassava root development using high-throughput expression profiling technologies, such as microarray and large-scale sequencing. Despite its economic importance, especially in developing countries, this orphan crop has received little attention in the scientific community compared to other major crops, such as rice or maize (Aerni and Bernauer 2006).
Numerous tools are currently available for performing functional genomic analyses on cassava (genetic map, bacterial artificial chromosome (BAC) library, expressed sequence tags (EST) library). A transformation system has also been developed (Taylor et al. 2004), and a draft sequence of the cassava genome was recently released (http://www.phytozome.net/cassava). Global gene expression profiling provides another useful tool to improve our ability to study biological processes in cassava. In plants, cDNA microarrays have been used to study responses to various stresses (Thimm et al. 2001; Seki et al. 2002; Oono et al. 2003; Rabbani et al. 2003), identify genes related to metabolic pathways (Guterman et al. 2002) and analyze gene expression during adventitious root development (Brinker et al. 2004). Oligonucleotide microarrays have been used to investigate lateral root development after nitrate stimulation (Liu et al. 2008). In cassava, differentially expressed genes involved in the incompatible interaction between cassava and Xanthomonas axonopodis pv. manihotis (Lopez et al. 2005), as well as the process of post-harvest physiological deterioration, have been studied using cDNA microarrays (Reilly et al. 2007). Although both cDNA and oligonucleotide microarrays can be used to analyze gene expression patterns, fundamental differences exist between these methods; for example, the oligonucleotide microarrays apparently have high precision, whereas the cDNA arrays show poor concordance (Woo et al. 2004). Short and long oligonucleotide arrays have several advantages over cDNA arrays in terms of specificity, sensitivity and reproducibility. Long oligonucleotides can provide increased signal intensity compared to short ones (Relogio et al. 2002; Shippy et al. 2004).
To increase our understanding of storage root development, a transcriptome analysis of gene expression during the process is needed. Recently, Sojikul et al. (2010) reported the characterization of differentially expressed genes in fibrous and storage roots using cDNA-amplified fragment length polymorphism (AFLP). In this study, we investigated gene expression changes in cassava roots at three developmental stages, i.e., fibrous roots, developing storage roots and mature storage roots, using a cassava 60 mer oligonucleotide microarray. Our study provides new insights into the molecular nature of cassava tuberous root development.
Development of a custom cassava microarray
For the RIKEN cDNA library (Sakurai et al. 2007), 35 400 sequences were assembled into 13 063 unique consensus sequences (unigenes) comprising 6 998 tentative contigs from 29 138 sequences and 6 065 singletons. Only 197 sequences were screened out in the sequence analysis (phrap with minmatch 17 and minscore 40). Sequence comparison with assemblies and singletons from The Institute for Genomic Research (TIGR) indicated that 7 629 TIGR sequences had high similarity with a subset of the 13 063 unigenes (E value < 1e-20), and 7 774 TIGR sequences showed no significant similarity with the unigenes. A total of 20 837 sequences, composed of the 13 063 unigenes and the 7 774 TIGR sequences, were generated for further analyses. The largest EST dataset, 71 520 ESTs from the National Center for Biotechnology Information (NCBI) including both TIGR and RIKEN results, was reduced to 11 536 contigs and 8 222 singletons with the same program and parameters mentioned above (556 sequences were screened out). Only 37 sequences showing low similarity (E value ≥ 1e-20) were found after sequence comparison with the 20 837 sequences that were generated in the previous step. This finding indicates that the combination of the TIGR and RIKEN libraries provides nearly equivalent information to the NCBI public database. Therefore, a dataset of 20 874 unigenes was used to design the oligonucleotide probes for the microarray using Agilent (Palo Alto, CA, USA) technology. Finally, a 4×44K custom oligonucleotide array was designed for the present study containing in situ-synthesized 60 mer oligonucleotides representing 20 840 unique cassava genes; each probe was replicated once at another position on the slide (Agilent Technologies). A total of 20 840×2 in situ-synthesized 60 mer oligonucleotide probes plus 1 264 positive control features and 153 negative control features were designed on each microarray. Four housekeeping genes (Table 1) were used as internal control genes.
|Probe name||Sequence name*||Description|
|CUST_18705||Contig350||Manihot esculenta actin|
|CUST_7767||Contig3493||Manihot esculenta cytochrome P450 protein CYP71E (c15)|
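The sequence counts reported for the array design can be cross-checked with simple arithmetic. The short Python sketch below is only a consistency check on numbers quoted in this article (the TIGR assembly and singleton counts are those given in Materials and Methods); the variable names are illustrative and are not taken from the original analysis.

```python
# Counts quoted in the article; variable names are illustrative only.
riken_total = 35_400        # RIKEN cDNA sequences
riken_screened_out = 197    # removed during the phrap analysis
riken_in_contigs = 29_138   # sequences assembled into contigs
riken_contigs = 6_998       # tentative contigs
riken_singletons = 6_065

# Every RIKEN sequence that survived screening is in a contig or is a singleton.
assert riken_total - riken_screened_out == riken_in_contigs + riken_singletons

riken_unigenes = riken_contigs + riken_singletons   # 13 063 unigenes

tigr_total = 5_189 + 10_214   # TIGR assemblies + singletons
tigr_similar = 7_629          # high similarity to a RIKEN unigene
tigr_only = 7_774             # no significant similarity
assert tigr_total == tigr_similar + tigr_only

combined = riken_unigenes + tigr_only   # 20 837 sequences
ncbi_only = 37                          # low-similarity NCBI sequences
design_set = combined + ncbi_only       # 20 874 unigenes used for probe design

# Features per array: duplicated probes plus positive and negative controls.
features_per_array = 2 * 20_840 + 1_264 + 153

print(riken_unigenes, combined, design_set, features_per_array)
```

The per-array feature total computed this way stays below the nominal 44K capacity of each array on the 4×44K slide.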
The signal-to-noise ratio (SNR) for each spot was calculated as described by Leiske et al. (2006). The average percentages of acceptable spots (SNR > 2.6) and high-quality spots (SNR > 10) were 94.16 ± 1.23% and 73.6 ± 2.51%, respectively, across all 11 arrays. To evaluate the microarray quality, principal component analysis (PCA) of all 11 cassava arrays was conducted using the GeneSpring GX Software (Agilent Technologies), as shown in Supporting Figure S1. The technical replicates (TR-1 and TR-2) grouped together, indicating that the entire experiment from RNA extraction to data extraction was reliable and reproducible. Obvious separation was detected among the three different types of materials analyzed: fibrous root (FR), mature storage root (MR) and technical repeat (TR). FR and MR showed a distinct boundary, while developing storage root (DR) is expected to be an intermediate phase between FR and MR from a developmental point of view (Figure 1). Gene expression profiling may be more informative than morphological observation for the identification of different root development stages. Correlation coefficient (Supporting Figure S2) and hierarchical cluster (Supporting Figure S3) analyses also revealed that among the three biological replicates of developing storage root, DR-1 was similar to the FR group, while DR-2 and DR-3 were related to the MR group. Expression patterns of four housekeeping genes further confirmed the reliable quality of the microarray (Supporting Figure S4).
When the total 20 840 sequences were compared with 27 217 arabidopsis proteins, 20 450 sequences had hits with 11 995 arabidopsis proteins, and 14 813 sequences had hits with 9 158 arabidopsis proteins at E value ≤ 1e-5. After searching the descriptions of the 912 differentially expressed genes in the NCBI non-redundant protein database, the descriptions of 847 hits were identical to The Arabidopsis Information Resource (TAIR) result (836 similar descriptions) from arabidopsis (Supporting File S3).
Gene expression profiling of developing cassava storage roots
Before normalization, the signal intensities of each feature were filtered against negative controls on the array. It was found that 72.13 ± 2.63%, 72.14 ± 2.48%, 68.91 ± 2.24% and 71.61 ± 2.10% of the genes on the chip were expressed in FR, DR, MR and TR, respectively. A comparative study was conducted by comparing gene expression profiles between each pair of the three selected stages (FR, DR and MR) using a twofold change cutoff (FCC). There were 777, 1 410 and 48 significantly upregulated genes (Figure 2A) and 623, 1 349 and 409 significantly downregulated genes (Figure 2B) in DR/FR, MR/FR and MR/DR, respectively, at the cut-off SNR > 2.6 and P-value < 0.05. In total, 3 386 differentially expressed genes were defined (Figure 2C, Supporting File S1). When using the random variance model (RVM) method, 912 differentially expressed genes could be identified at a cut-off P-value < 0.05 and false discovery rate (FDR) < 0.1 (Supporting File S2). In total, 742 common genes were detected by both the FCC and RVM methods (Figure 2D).
Clustering of and pathways represented by differentially expressed genes
Among the 912 differentially expressed genes associated with Gene IDs in the Kyoto Encyclopedia of Genes and Genomes (KEGG), Biocarta and Reactome, 54 related pathway maps were selected for further analysis. Among these pathways, 25 pathways were considered significant at a cut-off P-value < 0.05 and FDR < 0.05 (Supporting File S4). The enrichment of each pathway was calculated according to the given equation and shown in Figure 3. The FDR of most pathways was not higher than 0.01 with only one exception (the FDR of fatty acid metabolism pathway was 0.0125). These pathways are suggested to be important during the initiation and formation of the storage root in cassava; and some of them belong to glucide metabolism, which includes N-glycan degradation, the pentose phosphate pathway, fructose and mannose metabolism, glycan structure degradation, glycolysis/gluconeogenesis, and starch and sucrose metabolism.
A global pathway net was constructed to illustrate the key pathways in the process of root development (Supporting File S5). Twenty out of 25 significant pathways were included in the pathway net (Figure 4), and the five pathways that were omitted are arachidonic acid metabolism, glycan structure degradation, limonene and pinene degradation, N-glycan degradation, and the phosphatidylinositol signaling system. Glycolysis/gluconeogenesis was considered to be the most important node in the net because the component exchanges with other pathways were strongly dependent on its existence.
The top hitting locus identifiers of the 11 995 non-redundant arabidopsis genes, together with the corresponding MR/FR ratio (Supporting File S6), were mapped to KEGG arabidopsis pathways using the KegArray tool (Version 1.2.1) for a detailed view of the pathways of interest (Figure 5). A subset of 2 086 identifiers corresponding to the differentially expressed cassava genes (SNR > 2.6 and P-value < 0.05) was also mapped. There were 4 158 and 691 hits, respectively, corresponding to all identified genes and differentially expressed genes. Changes in enrichment were calculated according to the equation: Re = (nf/n)/(Nf/N), where Nf and N were set as 691 and 4 158, respectively (Supporting File S7). Among the 25 significant pathways mentioned above, 17 were confirmed by the 2 086 identifiers and 14 pathways exceeded the average enrichment, which is consistent with the results of the previous pathway analysis.
Short time-series clustering revealed that six significant expression patterns were involved in root development (Figure 6). There were 144, 195, 92, 38, 39 and 36 genes (Supporting File S8) clustered in profiles 0 (0, –2, –3), 4 (0, –1, –1), 11 (0, 1, 1), 3 (0, –1, –2), 15 (0, 2, 3) and 2 (0, –1, –3), respectively. Two pairs of opposite profiles were identified: profiles 0 and 15, and profiles 4 and 11. The latter pair of profiles (Figure 6B) contained more genes and was selected for further gene ontology analysis. Functional category enrichment evaluation based on the gene ontology (GO) was performed on the upregulated and downregulated genes assigned to profiles 11 and 4, respectively. Significant GO terms were found at a cut-off P-value < 0.05, and the enrichment of each GO term was calculated (Supporting File S9). Significantly enriched GO terms in profile 11 and profile 4 are illustrated in Figure 7. Some reliable GO terms are presented for profile 11 (amylopectin biosynthetic process, basipetal auxin transport, response to wounding, response to cytokinin stimulus and the starch metabolic process) and profile 4 (carbohydrate biosynthetic process, response to auxin stimulus, carbohydrate transport, carbohydrate metabolic process and plant-type cell wall loosening). Three GO terms that are considered basic functions were common to profiles 11 and 4 (oligopeptide transport, regulation of transcription and its subset, DNA-dependent regulation of transcription). Some interesting relationships were found between profiles 11 and 4. For example, basipetal auxin transport was present in the former, and response to auxin stimulus appeared in the latter. Among the three components described by GO annotation (cellular component, molecular function and biological process), biological process might be the most relevant aspect of GO with respect to root development. Therefore, only functional clusters belonging to this component are presented in the selected profiles.
To verify the results of the GO analysis, 11 995 and 2 086 non-redundant arabidopsis gene locus identifiers were annotated with GO terms in TAIR (http://www.arabidopsis.org/tools/bulk/go/index.jsp), in which 3 448 and 1 507 non-redundant GO IDs were hit, respectively. Changes in enrichment were calculated according to the equation: Re = (nf/n)/(Nf/N), where Nf and N were set as 2 086 and 11 995, respectively (Supporting File S10). Among the 64 significant GO terms in profiles 11 and 4 mentioned above, 62 were found in 3 448 GO IDs and 56 were found in 1 057 GO IDs. Among the 56 GO terms, 42 exceeded the average enrichment.
When arabidopsis gene locus identifiers were mapped with KEGG, a BRITE (Biomolecular Relations in Information Transmission and Expression; functional hierarchies and ontologies) view of all hits was constructed. Similarly, a functional categorization was presented after annotation in TAIR (Supporting File S11). The expression of many transcription factors was significantly changed during storage root development. A total of 138 genes (182 distinct gene models) were considered to be transcription factors when the 2 086 identifiers were annotated in TAIR. The number of upregulated transcription factors was nearly equal to the number of downregulated ones. The arabidopsis genome has at least 1 922 predicted transcription factors in the Database of Arabidopsis Transcription Factors (DATF, http://datf.cbi.pku.edu.cn/index.php) and more than 2 192 in the Plant Transcription Factor Database (PlnTFDB, http://plntfdb.bio.uni-potsdam.de/v3.0/index.php?sp_id=ATH). These transcription factors have been classified into 65 and 83 families in DATF and PlnTFDB, respectively. The enrichment of each family was calculated by classification of non-redundant identifiers according to the two databases (Supporting File S12). Several enriched families were identified, such as ARF, C2C2-Dof, CCAAT, CPP, E2F-DP, G2-like and GRF.
Verification of gene expression patterns by quantitative real-time PCR
To validate the microarray data and evaluate the methods used to select differentially expressed genes, real time RT-PCR was performed using the RNA extracted from the three sample replicates at different developmental stages that were used in the microarray experiment. A total of 55 genes were selected for verification, including 42 differentially expressed genes identified by both the FCC and RVM methods, as indicated in Figure 2D, and 13 genes involved in starch and sucrose metabolism.
Cassava actin was used as the normalization standard. The fold changes (log2 ratio) in DR and MR compared with FR are presented in Table 2. In total, 94.55% of the tested genes were consistent with the microarray analysis.
|Sequence name*||TAIR locus||Expect||DR-FR||MR-FR|
The development of cassava storage roots has drawn increasing attention from the cassava research community due to the use of cassava as a staple food crop in the tropics and also as feedstock for bio-ethanol production in many countries (Nguyen et al. 2007a, 2007b; Jansson et al. 2009). Recently, a study related to storage root formation was conducted using AFLP-based transcript profiling (Sojikul et al. 2010). In our study, a custom-designed 4×44K long oligonucleotide (60 mer) microarray was developed and used to investigate the genome-wide gene expression profile related to storage root development in cassava cultivar TMS60444. Differentially expressed genes at different developmental stages were identified, and their potentially relevant functions were studied using pathway and GO analyses, in which 25 pathways were identified as significantly changed. Important genes related to each individual pathway were identified. The results of real time RT-PCR also confirmed the validity of the microarray, as well as the storage root-specific expression patterns. Therefore, our study sheds new light on storage root development in cassava by using the long oligonucleotide (60 mer) cassava microarray.
High-quality microarrays are a prerequisite for reliable analysis of different biological processes. Different types of microarray used in transcriptome studies, including cDNA (long strands of amplified cDNA sequences), short oligonucleotide (25–30 nt), and long oligonucleotide (50–80 nt) arrays, may differ in performance (Petersen et al. 2005; de Reynies et al. 2006; Tsai et al. 2006; Fan 2009; McHale et al. 2009). The custom long oligonucleotide array described here was generated by the Agilent SurePrint ink-jet technology, which provides a flexible platform for revising and updating oligonucleotide probes in the array without additional cost (Hughes et al. 2001; Wolber et al. 2006; Li et al. 2008). Most probes used on the current array are represented in the most recent draft of the cassava genome (http://www.phytozome.net/cassava). In addition, the 4×44K platform used for the array design contains four independent arrays in one slide; this arrangement is cost-effective and can reduce the variation among the arrays within a slide. High background levels in an array might obscure the signal from weakly expressed genes and impede accurate quantification. The average SNR of the current microarray was 602.30, which was much higher than that of most cDNA array platforms (35.1 to 38.3). A high SNR helps to generate sufficient signal for the detection of even low-copy genes.
Two statistical criteria have been applied in the current analysis. Several thousand differentially expressed genes were identified in a pair-comparison at P < 0.05. Because more than 20 000 genes were analyzed in this microarray experiment, it is important to control the proportion of false positives (Tsai et al. 2003). The FDR based on P-values is the expected proportion of true null hypotheses that will be rejected in relation to the total number of null hypotheses that will be rejected (Benjamini and Hochberg 1995). The FDR is a more convenient and natural scale than the P-value scale, and it can provide the probability that a gene value is a false positive (Pawitan et al. 2005). In this study, the false discovery rates of differentially expressed genes selected by the RVM method were controlled at less than 10%, which guaranteed the reliable results of the current microarray experiment. The FDR was also calculated in the pathway analyses, and it was used in the GO analyses for correcting P-values (Dupuy et al. 2007). Quantitative real time PCR has become the gold standard for measuring gene expression, and it is generally used to validate microarray results (Dallas et al. 2005). With a criterion of P < 0.05 in the microarray analysis, the false positives could be effectively controlled (94.55% of qRT-PCR results were consistent with microarray data). The results indicate that microarray analyses in the present study are statistically reliable and accurate.
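To make the FDR control discussed above concrete, the following is a minimal Python sketch of the Benjamini and Hochberg (1995) step-up adjustment combined with the two thresholds used for the RVM gene list (P-value < 0.05 and FDR < 0.1). It is not the R code used in the study; the function names `benjamini_hochberg` and `select_genes` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(p_values, p_cut=0.05, fdr_cut=0.1):
    """Indices of genes passing both the raw P-value and the FDR threshold."""
    fdr = benjamini_hochberg(p_values)
    return [i for i, (p, q) in enumerate(zip(p_values, fdr))
            if p < p_cut and q < fdr_cut]


# Hypothetical P-values for five genes (illustration only).
example_p = [0.001, 0.02, 0.04, 0.30, 0.80]
print(benjamini_hochberg(example_p))
print(select_genes(example_p))
```

With more than 20 000 genes tested, the adjustment matters far more than in this toy example, because the multiplier m/rank is large for all but the smallest P-values.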
Gene expression profiling could provide valuable information related to the biological processes during starchy storage root development, and these processes are expected to be conserved in storage root-bearing species, e.g., sweet potato and yams. Starch accumulation is obviously the main theme in the process, but how this is achieved is still unclear. In the present study, there were 25 significantly enriched pathways, and many of them are related to glucide metabolism, which is strongly correlated to starch accumulation and storage root bulking. In parallel, zeatin biosynthesis was highlighted because it is not only related to the promotion of root cell propagation, elongation and enlargement, resulting in sufficient room for starch granule accumulation and tuberization (Melis and Vanstaden 1985; Gibson 2004), but it is also needed for amyloplast formation and starch accumulation (Miyazawa et al. 2002; Bishopp et al. 2009). Lipid biosynthesis is required to support cell propagation and jasmonic acid biosynthesis. The regulation of tuberization in potato and yam by jasmonates has been reported (Koda and Kikuta 1991; Koda et al. 1991; Vandenberg and Ewing 1991; Ovono et al. 2010). For signal transduction, the inositol phosphate-calcium signaling system may play a comprehensive role in the tuberization process (Cenzano et al. 2008). Furthermore, several secondary metabolic processes, such as flavonol biosynthesis, were also found to be important for plant growth and development (Taylor et al. 2004; Besseau et al. 2007). Based on our study, a molecular model of storage root development was constructed (Figure 8).
In the two opposite development-associated expression patterns (profile 11 and profile 4), three common GO terms were identified (oligopeptide transport, regulation of transcription and DNA-dependent regulation of transcription). The array data resulted in fewer differentially expressed genes (Figure 2A, B) between DR and MR, which suggests that profile 11 and profile 4 are root development-associated. The results of the GO analysis also support this hypothesis. For example, basipetal auxin transport appeared in profile 11 (upregulated pattern), whereas response to auxin stimulus was found in profile 4 (downregulated pattern). Interestingly, metal ion transport and copper ion transport were also highlighted in profile 11, which is highly consistent with the physiological requirement of cassava for copper (Chew et al. 1978). Furthermore, defense response, response to abiotic stimulus, and response to wounding were also noticeable, possibly due to stress responses that occurred during sample collection.
When pathways of interest (glycolysis/gluconeogenesis, starch and sucrose metabolism) were examined (Figure 5), several genes that were either upregulated or downregulated based on the array were subsequently validated by qRT-PCR (Table 2). These genes included six genes involved in glycolysis/gluconeogenesis (Contig4784, AtALDH7B4; Contig6596, AtALDH2B4; Contig6973, AtLOS2; Contig6535, AtALDH2B7; CAS01_007_P13.f, Arabidopsis L-lactate dehydrogenase; DV453892, Arabidopsis Enolase) and seven genes involved in starch and sucrose metabolism (Contig2017, AtAPL3; TA9083_3983, Arabidopsis Glucan phosphorylase; TA5570_3983, AtSBE2.1; Contig5561, AtADG1; TA8522_3983, AtSBE2.2; Contig4441, Arabidopsis Pectinesterase; BM259732, AtATPME2). When we mapped the qRT-PCR ratios (DR/FR and MR/FR) of these 13 genes to KEGG pathways, upregulation of ADP-glucose pyrophosphorylase, glucan phosphorylase and starch branching enzyme was observed, indicating that these are the key enzymes required for starch accumulation in the storage root. ADP-glucose pyrophosphorylase (AGPase), which catalyzes the rate-limiting step in starch biosynthesis in plants, is strongly associated with yield both in grains and root crops (reviewed by Smith 2008 and Zeeman et al. 2010). The downregulation of pectinesterase was also considered to be important because it blocks the flow of carbon to pectate. In glycolysis/gluconeogenesis, downregulation of enolase, L-lactate dehydrogenase and aldehyde dehydrogenase (NAD+) could slow down the entry of carbon into the citrate cycle, pyruvate metabolism and propanoate metabolism, so that less α-D-glucose-6P is converted to glycerate-3P and α-D-glucose-1P. This would result in most of the α-D-glucose-6P being transported into the amyloplast for starch and sucrose metabolism. This result may also suggest that these enzymes are rate-limiting in the two important pathways in starchy root formation (Figure 8).
These findings may facilitate the engineering of cassava for enhanced starch accumulation and a deeper understanding of the biological process of interest, starchy root formation (Figure 8). By comparison with the report by Sojikul et al. (2010), in which only 157 transcript-derived fragments were identified between leaves and roots using cDNA-AFLP, the present study was able to distinguish a large number of differentially expressed genes among the three types of roots, giving more interesting findings related to storage root tuberization.
Several enriched transcription factor families, such as ARF, C2C2-Dof, CCAAT, CPP, E2F-DP, G2-like and GRF, were identified in the present study. C2C2-Dof (Yanagisawa 2002, 2004), G2-like (Bravo-Garcia et al. 2009) and GRF (Kim et al. 2003) are of particular interest. Although their roles related to starchy storage root development have not been determined, it is possible to narrow down the gene candidates to study the comprehensive biological process by developing a hypothetical model. Recently, a transcription factor called RSR1, which belongs to the AP2/EREBP family, was found to regulate starch biosynthesis in rice endosperm (Fu and Xue 2010), and another transcription factor, MADS1, was shown to be involved in tuberous root initiation in sweet potato (Ku et al. 2008).
In conclusion, gene expression during the process of cassava starchy root formation was characterized using the newly developed cassava microarray, and putative rate-limiting enzymes in key pathways have been highlighted, which provides potential targets for cassava genetic engineering. The platform will provide a valuable resource for the scientific community to study the developmental biology, stress response, virus resistance, genetics and genomics in cassava.
Materials and Methods
The cultivar TMS60444, which is frequently used for genetic engineering, was used in the present study. The in vitro plants were transferred to pots for one month of growth in a greenhouse and then planted in a field. Fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR, Supplemental Figure S1) were collected from three independent healthy 4-month-old cassava plants in the field and submerged into liquid nitrogen immediately. All samples were maintained in liquid nitrogen during transportation, and they were stored in an ultra-freezer (−80 °C) until RNA extraction.
Generation of a cassava oligonucleotide microarray
The microarray design was based on the sequence information from a large collection of cassava ESTs from NCBI (71 520 ESTs of cassava, released 28 March 2008) and TIGR (Manihot_esculenta_release_5, released 1 June 2007; 5 189 assemblies, 10 214 singletons) as well as the RIKEN full-length cDNA library of 35 400 sequences (Sakurai et al. 2007). Phrap (http://www.phrap.org/phredphrapconsed.html) and BLAT, the BLAST-like alignment tool (Kent 2002), were employed in sequence analyses. The design of the 60 mer oligonucleotide probes and of the microarray was performed using Agilent technology.
Hybridization and data extraction
Total RNA was extracted from root samples, including FR, DR and MR, using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA). Two RNA samples extracted from stored storage root slices were used as technical repeats (TR) for quality control. RNA quality was checked on a 1% agarose gel using an RNase-free electrophoresis system. RNA labeling and hybridization were conducted by the Shanghai Biochip Corporation (Shanghai, China) following the manufacturer's protocols. Arrays were incubated at 65 °C for 17 h in Agilent hybridization chambers (G2545A) and then washed according to the protocol at room temperature. Hybridized microarray slides were scanned at 5 μm resolution with an Agilent Technologies Scanner (G2505B), and images were saved in JPG format. Both 10% and 100% photomultiplier tube (PMT) settings were selected, and combined images were exported. The signal intensities of all spots on each image were quantified by Agilent Feature Extraction software, and data were saved as .txt files for further analysis.
Normalization and differential gene definition
Two types of statistical analyses were used. First, the signal intensity of each gene was globally normalized with the GeneSpring GX Software (Agilent Technologies) following the work flow guide (Bolstad et al. 2003). The signal-to-noise ratio (SNR) was calculated as the median signal minus the median background signal, divided by the background standard deviation (Leiske et al. 2006). Pair comparison was used to analyze the normalized and averaged data from the three types of samples (FR, DR and MR). P-values from t-tests and fold changes between each comparison for each gene were calculated. Genes induced or suppressed by more than twofold (twofold change cutoff, FCC) were taken as differentially expressed when SNR > 2.6 and P-value < 0.05. Second, raw data were normalized using LOWESS within the R statistics package. The RVM (random variance model) corrective ANOVA was used to analyze the normalized data from the three different samples (FR, DR and MR). The P-values and FDR were calculated by the R program according to Benjamini and Hochberg's method (Benjamini and Hochberg 1995). Genes were taken as differentially expressed when both P-value < 0.05 and FDR < 0.1 (Wright and Simon 2003).
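The first criterion (SNR filter plus twofold change cutoff) can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the GeneSpring workflow itself: the function names are hypothetical, the P-value is assumed to come from a separately computed t-test, and applying a single SNR value per gene is a simplifying assumption.

```python
import math


def snr(signal_median, background_median, background_sd):
    """(median signal - median background) / background standard deviation."""
    return (signal_median - background_median) / background_sd


def passes_fcc(mean_reference, mean_sample, snr_value, p_value,
               fold=2.0, snr_cut=2.6, p_cut=0.05):
    """True if a gene passes the twofold change cutoff (FCC).

    mean_reference and mean_sample are normalized, averaged intensities
    (for example FR and MR); p_value is supplied by the caller.
    """
    log_ratio = math.log2(mean_sample / mean_reference)
    # "More than twofold" in either direction: |log2 ratio| > log2(fold).
    changed = abs(log_ratio) > math.log2(fold)
    return changed and snr_value > snr_cut and p_value < p_cut


# Hypothetical spot: median signal 1200, background median 200, background SD 50.
spot_snr = snr(1200, 200, 50)
print(spot_snr, passes_fcc(100.0, 250.0, spot_snr, 0.01))
```

Working on the log2 scale treats induction and suppression symmetrically, which matches the log2 ratios reported for the qRT-PCR validation.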
All non-redundant sequences considered to be unique cassava genes were searched locally against the TAIR protein database (27 217 Arabidopsis protein sequences), downloaded from ftp://ftp.arabidopsis.org/home/TAIR, using the blastx program in the blastall package (version 2.2.9). The top hits were used for gene annotation, and the corresponding Arabidopsis gene locus identifiers were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the KegArray tool (version 1.2.1). Differential genes identified using the random variance model (RVM) method were searched locally against the NCBI non-redundant protein database using the blastx program in the blastall package. All hits were recorded to confirm the gene annotations from the Arabidopsis protein database, and accession numbers were assigned and used to retrieve Entrez IDs using gene2accession in NCBI. The Entrez IDs were then used to search Gene IDs in the Gene Ontology database (http://www.geneontology.org/) for functional categorization of differentially expressed genes. In addition, the Entrez IDs were converted to Gene IDs in the KEGG database (http://www.genome.jp/kegg/), Biocarta (http://www.biocarta.com/genes/index.asp) and Reactome (http://www.reactome.org/). Significant pathways were identified based on KEGG, Biocarta and Reactome. Fisher's exact test (Yi 2006) was used to select the significant pathways, with the significance threshold defined as P-value < 0.05 and FDR < 0.05. The enrichment Re was given by Re = (nf/n)/(Nf/N), where nf is the number of differential genes within a particular pathway, n is the total number of genes within the same pathway, Nf is the number of differential genes on the entire microarray, and N is the total number of genes on the microarray. A pathway network (Pathway-net) was built according to the direct or systemic interactions between pathways in the KEGG database.
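As a rough illustration of the pathway statistics, the enrichment Re and a Fisher's exact test P-value can be computed from the four counts defined above. The sketch below is not the authors' pipeline: the function names are hypothetical, the test is assumed to be one-sided (over-representation), and the example counts are invented for demonstration.

```python
from math import comb


def enrichment(nf, n, Nf, N):
    """Re = (nf/n) / (Nf/N), using the symbols defined in the text."""
    return (nf / n) / (Nf / N)


def fisher_upper_tail(nf, n, Nf, N):
    """One-sided Fisher's exact test P-value, P(X >= nf).

    X is hypergeometric: the number of differential genes among the n
    pathway genes, when Nf of the N genes on the array are differential.
    Assumption: a one-sided test; the text does not state the alternative.
    """
    total = comb(N, n)
    upper = min(n, Nf)
    favourable = sum(comb(Nf, k) * comb(N - Nf, n - k)
                     for k in range(nf, upper + 1))
    return favourable / total


# Invented counts: a 50-gene pathway with 20 differential genes, on an
# array of 20 840 genes of which 3 386 are differential.
print(enrichment(20, 50, 3386, 20840))
print(fisher_upper_tail(20, 50, 3386, 20840))
```

An Re above 1 means the pathway contains a larger share of differential genes than the array as a whole; the P-values from all tested pathways would still need an FDR correction before applying the thresholds given above.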
Using a strategy for clustering short time-series (STC) gene expression data (Ernst et al. 2005), a set of unique expression profiles was defined. Significant profiles, those containing more genes than expected by chance, were identified using Fisher's exact test and multiple comparisons (Miller et al. 2002; Ramoni et al. 2002). GO analysis was applied to the genes belonging to specific profiles (Ashburner et al. 2000; Harris et al. 2006). Fisher's exact test and the χ2 test were used to classify the GO categories, and the FDR (Dupuy et al. 2007) was calculated to correct the P-value. The FDR was defined as FDR = 1 − Nk/T, where Nk refers to the number of Fisher's tests with P-values less than those calculated using the χ2 test. Within a significant category, the enrichment Re was given by Re = (nf/n)/(Nf/N), where nf is the number of differential genes within the particular category, n is the total number of genes within the same category, Nf is the number of differential genes on the entire microarray, and N is the total number of genes on the microarray (Schlitt et al. 2003).
Quantitative real-time PCR
To validate the array data, the expression of 55 genes of interest was confirmed using quantitative real-time PCR (qRT-PCR) with cassava RNA samples extracted using Plant RNA Reagent (Invitrogen, Cat. No. 12322-012). DNA was removed from the samples by DNase I (TaKaRa, Cat. No. D2215) treatment according to the manufacturer's protocol. RNA quantity and purity were determined using a NanoDrop ND-1000 spectrophotometer (Nano Drop Technologies, Wilmington, DE, USA). A 2 μg aliquot of total RNA was used to synthesize first-strand cDNA using ReverTra Ace (TOYOBO, Code: TRT-101) in a 20 μL reaction volume. The qRT-PCR primers were designed with Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). PCR reactions were performed in a 20 μL volume containing 2× SYBR Green Master Mix (TOYOBO, Code: QPK-201), 50 ng cDNA, 400 nM forward primer and 400 nM reverse primer in a Bio-Rad CFX96 thermocycler. The amplification conditions were 95 °C for 1 min, followed by 40–50 cycles of 95 °C for 15 s and 61 °C for 30 s. Beta-actin was used as the internal control. All samples were measured in triplicate. The comparative Ct method was used to calculate the relative gene expression levels across the samples. The relative expression level of each gene in one sample (ΔCt) was calculated as Ct (target gene) − Ct (beta-actin). The relative expression of each gene between two samples (ΔΔCt) was calculated as ΔCt (sample 1) − ΔCt (sample 2).
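The comparative Ct arithmetic can be written out as a short Python sketch. The function names and Ct values below are hypothetical, triplicates are assumed to be averaged before subtraction, and the final conversion to a fold change (2 to the power of −ΔΔCt) is the conventional step that assumes roughly 100% amplification efficiency; the text itself stops at ΔΔCt.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """ΔCt = mean Ct (target gene) - mean Ct (beta-actin) for one sample.

    Assumption: triplicate Ct values are averaged before subtraction.
    """
    return mean(ct_target) - mean(ct_reference)


def delta_delta_ct(dct_sample1, dct_sample2):
    """ΔΔCt = ΔCt (sample 1) - ΔCt (sample 2)."""
    return dct_sample1 - dct_sample2


def relative_fold_change(ddct):
    """Expression in sample 1 relative to sample 2, as 2 ** (-ΔΔCt).

    Assumption: near-100% PCR efficiency; not stated in the text.
    """
    return 2.0 ** (-ddct)


# Invented triplicate Ct values for one gene in two root samples.
dct_mr = delta_ct([24.0, 24.2, 23.8], [18.0, 18.1, 17.9])
dct_fr = delta_ct([26.0, 26.0, 26.0], [18.0, 18.0, 18.0])
ddct = delta_delta_ct(dct_mr, dct_fr)
print(ddct, relative_fold_change(ddct))
```

A negative ΔΔCt means the target crosses the detection threshold earlier in sample 1 than in sample 2 relative to beta-actin, that is, higher expression in sample 1.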
(Co-Editor: Hai-Chun Jing)
This work was supported by grants from the National Basic Research Program (2010CB126605), the National High Technology Research and Development Program of China (2009AA10Z102), the Earmarked Fund for Modern Agro-industry Technology Research System (nycytx-17), the Chinese Academy of Sciences (KSCX2-EW-J-12) and Shanghai Municipal Afforestation & City Appearance and Environmental Sanitation Administration (G102410). Wenzhi Zhou is acknowledged for downloading and checking the sequences of the latest cassava genome draft.
- (2006) Stakeholder attitudes toward GMOs in the Philippines, Mexico, and South Africa: The issue of public trust. World Dev. 34, 557–575.
- (2002) Evapotranspiration estimation performance of root zone water quality model: evaluation and improvement. Agr. Water Manage. 57, 61–73.
- (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–29.
- (2007) NCBI GEO: Mining tens of millions of expression profiles – database and Tools update. Nucleic Acids Res. D760–D765.
- (1995) Controlling the false discovery rate – a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300.
- (2007) Flavonoid accumulation in Arabidopsis repressed in lignin synthesis affects auxin transport and plant growth. Plant Cell 19, 148–162.
- (2009) Cytokinin Signaling during Root Development. Elsevier Academic Press Inc, San Diego. pp. 1.
- (2003) A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185–193.
- (2009) Specialization of the Golden2-like regulatory pathway during land plant evolution. New Phytol. 183, 133–141.
- (2004) Microarray analyses of gene expression during adventitious root development in Pinus contorta. Plant Physiol. 135, 1526–1539.
- (2008) Phospholipid and phospholipase changes by jasmonic acid during stolon to tuber transition of potato. Plant Growth Regul. 56, 307–316.
- (1978) Influence of soil-applied micronutrients on cassava (Manihot esculenta) in Malaysian tropical oligotrophic peat. Exp. Agr. 14, 105.
- (1982) Cassava – A basic energy-source in the tropics. Science 218, 755–762.
- (2005) Gene expression levels assessed by oligonucleotide microarray analysis and quantitative real-time RT-PCR – how well do they correlate? BMC Genomics 6, 59.
- (2006) Comparison of the latest commercial short and long oligonucleotide microarray technologies. BMC Genomics 7, 51.
- (2007) Genome-scale analysis of in vivo spatiotemporal promoter activity in Caenorhabditis elegans. Nat. Biotechnol. 25, 663–668.
- (2006) NCBI GEO standards and services for microarray data. Nat. Biotechnol. 24, 1471–1472.
- (2005) Clustering short time series gene expression data. Bioinformatics 21(Suppl. 1), i159–i168.
- (2009) Consistency of predictive signature genes and classifiers generated using different microarray platforms. Mol. Cell. Toxicol. 5, 42.
- (2010) Co-expression analysis identifies Rice Starch Regulator1 (RSR1), a rice AP2/EREBP family transcription factor, as a novel rice starch biosynthesis regulator. Plant Physiol. DOI: 10.1104/pp.110.159517.
- (2004) Sugar and phytohormone response pathways: navigating a signalling network. J. Exp. Bot. 55, 253–264.
- (2002) Rose scent: Genomics approach to discovering novel floral fragrance-related genes. Plant Cell 14, 2325–2338.
- (2006) The Gene Ontology (GO) project in 2006. Nucleic Acids Res. D322–D326.
- (2001) Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat. Biotechnol. 19, 342–347.
- (2009) Cassava, a potential biofuel crop in the People's Republic of China. Appl. Energ. 86(Suppl. 1), S95–S99.
- (2006) Comprehensive transcriptional profiling of NaCl-stressed Arabidopsis roots reveals novel classes of responsive genes. BMC Plant Biol. 6, 25.
- (2002) BLAT: the BLAST-like alignment tool. Genome Res. 12, 656–664.
- (2003) The AtGRF family of putative transcription factors is involved in leaf and cotyledon growth in Arabidopsis. Plant J. 36, 94–104.
- (1991) Possible involvement of jasmonic acid in tuberization of yam plants. Plant Cell Physiol. 32, 629–633.
- (1991) Potato tuber-inducing activities of jasmonic acid and related-compounds. Phytochemistry 30, 1435–1438.
- (2008) IbMADS1 (Ipomoea batatas MADS-box 1 gene) is involved in tuberous root initiation in sweet potato (Ipomoea batatas). Ann. Bot. 102, 57–67.
- (2006) A comparison of alternative 60-mer probe designs in an in situ synthesized oligonucleotide microarray. BMC Genomics 7, 72.
- (2008) Characterization of a newly developed chicken 44K Agilent microarray. BMC Genomics 9, 60.
- (2008) Microarray analysis reveals early responsive genes possibly involved in localized nitrate stimulation of lateral root development in maize (Zea mays L.). Plant Sci. 175, 272–282.
- (2005) Gene expression profile in response to Xanthomonas axonopodis pv. manihotis infection in cassava using a cDNA microarray. Plant Mol. Biol. 57, 393–410.
- (2009) Changes in the peripheral blood transcriptome associated with occupational benzene exposure identified by cross-comparison on two microarray platforms. Genomics 93, 343–349.
- (1985) Tuberization in cassava (Manihot esculenta) – cytokinin and abscisic-acid activity in tuberous roots. J. Plant Physiol. 118, 357–366.
- (2002) Optimal gene expression analysis by microarrays. Cancer Cell 2, 353–361.
- (2002) Amyloplast formation in cultured tobacco BY-2 cells requires a high cytokinin content. Plant Cell Physiol. 43, 1534–1541.
- (2010) Breeding cassava to feed the poor. Sci. Am. 302, 78–84.
- (2007a) Energy balance and GHG-abatement cost of cassava utilization for fuel ethanol in Thailand. Energ. Policy 35, 4585–4596.
- (2007b) Full chain energy analysis of fuel ethanol from cassava in Thailand. Environ. Sci. Technol. 41, 4135–4142.
- (2003) Monitoring expression profiles of Arabidopsis gene expression during rehydration process after dehydration using ca. 7000 full-length cDNA microarray. Plant J. 34, 868–887.
- (2010) Tuber formation and growth of Dioscorea cayenensis-D. rotundata complex: interactions between exogenous and endogenous jasmonic acid and polyamines. Plant Growth Regul. 60, 247–253.
- (2005) False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics 21, 3017–3024.
- (2005) Three microarray platforms: an analysis of their concordance in profiling gene expression. BMC Genomics 6, 63.
- (2003) Monitoring expression profiles of rice genes under cold, drought, and high-salinity stresses and abscisic acid application using cDNA microarray and RNA gel-blot analyses. Plant Physiol. 133, 1755–1767.
- (2002) Cluster analysis of gene expression dynamics. Proc. Natl. Acad. Sci. USA 99, 9121–9126.
- (2007) Towards identifying the full set of genes expressed during cassava post-harvest physiological deterioration. Plant Mol. Biol. 64, 187–203.
- (2002) Optimization of oligonucleotide-based DNA microarrays. Nucl. Acids Res. 30, e51.
- (2007) Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response. BMC Plant Biol. 7, 66.
- (2003) From gene networks to gene function. Genome Res. 13, 2568–2576.
- (2002) Monitoring the expression profiles of 7000 Arabidopsis genes under drought, cold and high-salinity stresses using a full-length cDNA microarray. Plant J. 31, 279–292.
- (2004) Performance evaluation of commercial short-oligonucleotide microarrays and the impact of noise in making cross-platform correlations. BMC Genomics 5, 61.
- (2008) Prospects for increasing starch and sucrose yields for bioethanol production. Plant J. 54, 546–558.
- (2010) AFLP-based transcript profiling for cassava genome-wide expression analysis in the onset of storage root formation. Physiol. Plant. 140, 189–198.
- (2004) Development and application of transgenic technologies in cassava. Plant Mol. Biol. 56, 671–688.
- (2001) Response of Arabidopsis to iron deficiency stress as revealed by microarray analysis. Plant Physiol. 127, 1030–1043.
- (2003) Estimation of false discovery rates in multiple testing: application to gene microarray data. Biometrics 59, 1071–1081.
- (2006) Detection of transcriptional difference of porcine imprinted genes using different microarray platforms. BMC Genomics 7, 328.
- (1991) Jasmonates and their role in plant-growth and development, with special reference to the control of potato tuberization – A review. Amer. Potato J. 68, 781–794.
- (2006) The agilent in situ-synthesized microarray platform. Methods Enzymol. 410, 28–57.
- (2004) A comparison of cDNA, oligonucleotide, and Affymetrix GeneChip gene expression microarray platforms. J. Biomol. Tech. 15, 276–284.
- (2003) A random variance model for detection of differential gene expression in small microarray experiments. Bioinformatics 19, 2448–2455.
- (2002) The Dof family of plant transcription factors. Trends Plant Sci. 7, 555–560.
- (2004) Dof domain proteins: plant-specific transcription factors associated with diverse phenomena unique to plants. Plant Cell Physiol. 45, 386–391.
- (2006) Whole pathway scope a comprehensive pathway based analysis. BMC Bioinformatics 7, 30.
- (2010) Starch: its metabolism, evolution, and biotechnological modification in plants. Annu. Rev. Plant Biol. 61, 209–234.
Figure S1 Principal component analysis of all 11 cassava arrays. DR, developing storage roots (orange squares); FR, fibrous roots (red triangles); MR, mature storage roots (blue diamonds); TR, technical repeats (green circles).
Figure S2 High correlation coefficients among samples from fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR).
Figure S3 Hierarchical clustering of samples from fibrous roots (FR), developing storage roots (DR) and mature storage roots (MR). DR-1 was clustered into the FR group; DR-2 and DR-3 were clustered into the MR group.
Figure S4 Stable expression of control genes in different samples. MeCYP71E, Manihot esculenta cytochrome P450 protein CYP71E (c15); MeBeta-6 Tubulin, Manihot esculenta TUB6 (beta-6 tubulin), putative; MeEF-1-alpha, Manihot esculenta elongation factor 1-alpha; MeACT, Manihot esculenta actin.
Supporting File S1 Upregulated and downregulated genes in pair comparison from different samples.
Supporting File S2 Differentially expressed genes selected by the random variance model (RVM) method.
Supporting File S3 Gene annotations of differentially expressed genes based on the National Center for Biotechnology Information (NCBI) and TAIR protein databases.
Supporting File S4 Significant pathways and contributing genes.
Supporting File S5 Interactions between significant pathways and related ones according to Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Supporting File S6 Non-redundant Arabidopsis gene lists representing the total or differential cassava genes on the microarray.
Supporting File S7 Significant pathways present and absent according to KegArray results.
Supporting File S8 Differentially expressed genes assigned in each significant expression profile.
Supporting File S9 Significant gene ontology (GO) terms and contributing genes in profile 11 and profile 4.
Supporting File S10 Enrichment of each significant gene ontology (GO) term calculated based on The Arabidopsis Information Resource (TAIR) GO annotation results.
Supporting File S11 Predicted 138 differentially expressed transcription factors identified by Kyoto Encyclopedia of Genes and Genomes (KEGG) Biomolecular Relations in Information Transmission and Expression (BRITE) and The Arabidopsis Information Resource (TAIR) Functional Categorization.
Supporting File S12 Transcription factor family classifications and enrichments based on the Database of Arabidopsis Transcription Factors (DATF) and Plant Transcription Factor Database (PlnTFDB).
|JIPB_1018_sm_FileS1.xls|2087K|Supporting info item|
|JIPB_1018_sm_FileS2.xls|170K|Supporting info item|
|JIPB_1018_sm_FileS3.xls|372K|Supporting info item|
|JIPB_1018_sm_FileS4.xls|92K|Supporting info item|
|JIPB_1018_sm_FileS5.xls|32K|Supporting info item|
|JIPB_1018_sm_FileS6.xls|5576K|Supporting info item|
|JIPB_1018_sm_FileS7.xls|32K|Supporting info item|
|JIPB_1018_sm_FileS8.xls|163K|Supporting info item|
|JIPB_1018_sm_FileS9.xls|113K|Supporting info item|
|JIPB_1018_sm_FileS10.xls|3939K|Supporting info item|
|JIPB_1018_sm_FileS11.xls|214K|Supporting info item|
|JIPB_1018_sm_FileS12.xls|66K|Supporting info item|