These authors contributed equally to this work.
Genomic expression dominance in the natural allopolyploid Coffea arabica is massively affected by growth temperature
Article first published online: 28 JUL 2011
© 2011 The Authors. New Phytologist © 2011 New Phytologist Trust
Volume 192, Issue 3, pages 760–774, November 2011
How to Cite
Bardil, A., de Almeida, J. D., Combes, M. C., Lashermes, P. and Bertrand, B. (2011), Genomic expression dominance in the natural allopolyploid Coffea arabica is massively affected by growth temperature. New Phytologist, 192: 760–774. doi: 10.1111/j.1469-8137.2011.03833.x
- Issue published online: 19 OCT 2011
- Received: 26 April 2011, Accepted: 14 June 2011
- Coffea arabica;
- genomic expression dominance;
- natural allopolyploid;
- Polyploidy occurs throughout the evolutionary history of many plants and considerably impacts species diversity, giving rise to novel phenotypes and leading to ecological diversification and colonization of new niches. Recent studies have documented dynamic changes in plant polyploid gene expression, which reflect the genomic and functional plasticity of duplicate genes and genomes.
- The aim of the present study was to describe genomic expression dominance between a relatively recently formed natural allopolyploid (Coffea arabica) and its ancestral parents (Coffea canephora and Coffea eugenioides) and to determine if the divergence was environment-dependent. Employing a microarray platform designed against 15 522 unigenes, we assayed unigene expression levels in the allopolyploid and its two parental diploids. For each unigene, we measured expression variations among the three species grown under two temperature conditions (26–22°C (day–night temperatures) and 30–26°C (day–night temperatures)).
- More than 35% of unigenes were differentially expressed in each comparison at both temperatures, except for C. arabica vs C. canephora in the 30–26°C range, where an unexpectedly low unigene expression divergence (< 9%) was observed.
- Our data revealed evidence of transcription profile divergence between the allopolyploid and its parental species, greatly affected by environmental conditions, and provide clues to the plasticity phenomenon in allopolyploids.
Allopolyploidy has long been recognized as an important mechanism in eukaryote evolution (Osborn et al., 2003), especially in flowering plants, including many important agricultural crops such as wheat (Triticum aestivum), cotton (Gossypium hirsutum), sugarcane (Saccharum officinarum) and coffee (Coffea arabica) (Chen & Ni, 2006; Jackson & Chen, 2009). Polyploidy considerably impacts plant species diversity, giving rise to novel phenotypes and leading to ecological diversification and colonization of new niches (Otto & Whitton, 2000; Adams, 2007). As highlighted by Hegarty & Hiscock (2009), the processes by which two genomes adapt to coexistence within the same nucleus are complex. Recent studies have documented dynamic changes in plant polyploid gene expression, which reflect the genomic and functional plasticity of duplicate genes and genomes (Jackson & Chen, 2009). To investigate the effects of genomic merger and doubling, transcriptomic divergence between parents and synthetic nascent allopolyploids or natural allopolyploids has been assessed in several recent studies using genome-wide approaches to measure the deviation from additivity. Indeed, in the additive model, allotetraploid expression would be expected to be equivalent to the average expression of the parental species. Studies in a variety of allopolyploids have revealed a tendency for nonadditive gene expression in Arabidopsis allotetraploids (Wang et al., 2006), in Gossypium allotetraploids (Chaudhary et al., 2009), in Senecio interspecific hybrids and allohexaploids (Hegarty et al., 2006, 2008), and in Triticum allohexaploids (Pumphrey et al., 2009). More recently, Rapp et al. (2009) have shown in nascent Gossypium allopolyploids that only 4–11% of genes exhibit additive expression while 82–92% of genes show up- or down-regulation relative to the level of one of the two parents, reflecting massive expression dominance. Most studies have compared synthetic allopolyploids with their parents. 
As highlighted by Buggs (2008), only a few natural polyploid species of known parentage and for which genomic resources are available have been well characterized. In a recent study comparing five Gossypium allotetraploids that have diverged after a million or so years of evolution, the authors (Flagel & Wendel, 2010) revealed that the magnitude of dominance remains but that the bias in the direction of dominance disappears.
Among the 104 Coffea species so far described (Davis & Rakotonasolo, 2008), all are diploid (2n = 2x = 22) and generally self-incompatible, except Coffea arabica which appears to be the only one that is tetraploid (2n = 4x = 44) and self-fertile (Charrier & Berthaud, 1985). Molecular analyses (Lashermes et al., 1999) have shown that Coffea arabica is an allopolyploid resulting from the hybridization between Coffea eugenioides (E genome) and Coffea canephora (C genome) or ecotypes related to those diploid species. The C. arabica coffee tree model displays some interesting features when compared with Gossypium or Arabidopsis. While diversification of the Coffea subgenus Coffea probably occurred in the second half of the Middle Pleistocene (450 000–100 000 yr before present (BP)), it is most likely that the allopolyploid speciation of C. arabica took place in relatively recent times (10 000–50 000 BP; A. Cenci et al., unpublished). Note also that C. arabica displays little genetic diversity and has experienced spatial isolation in Ethiopia, its centre of primary diversity. As highlighted by Lashermes et al. (2010), little divergence has been observed between the two constitutive genomes of C. arabica (Ea Ca) and those of its parental species. Hybridization followed by polyploidization is sufficiently recent that differences between parental genomes in the polyploid are not eroded (Leitch & Leitch, 2008). However, compared with a neo-synthetic allopolyploid, the C. arabica species is old enough for successive generations since 30 000 BP to have been subjected to natural selection, thus allowing stabilization of its genome. Consequently, the C. arabica allopolyploid is considered as a ‘recent’ natural polyploid model, like Triticum aestivum (c. 8000 BP; Pumphrey et al., 2009). The origin of C. arabica occurred between the origins of more ancient allopolyploid species, such as Gossypium (c. 1.5 million BP; Senchina et al., 2003) and Arabidopsis suecica (12 000–300 000 BP; Jakobsson et al., 2006), and very recent allopolyploid species, such as Spartina anglica (< 150 yr ago; Ainouche et al., 2003), Tragopogon miscellus (< 80 yr ago; Soltis et al., 2004) and Senecio cambrensis (c. 60 yr ago; Abbott & Lowe, 2004).
We recently generated the first 15K coffee microarray, a spotted 70-mer oligo-gene microarray, based on publicly available expressed sequence tags (ESTs; Privat et al., 2011). In a series of experiments, we demonstrated that this microarray (called ‘PUCECAFE’) enabled reproducible global expression analysis with different tissues (seeds, leaves and flowers) and with different coffee species (C. eugenioides, C. arabica and C. canephora). In a gene expression analysis using a 15K coffee microarray in which C. arabica was compared with its related parents, we assessed: the extent of genomic expression dominance between the C. arabica transcriptome and its parents; and whether the divergence between the allopolyploid transcriptome and its parents is modulated by temperature.
Materials and Methods
Fresh mature seeds of Coffea arabica (L.) and Coffea canephora (Pierre) were provided by the Centre de Coopération Internationale en Recherche Agronomique pour le Développement from La Cumplida (Matagalpa, Nicaragua). Coffea eugenioides mature seeds were provided by the Coffee Research Foundation, Kenya. Coffea arabica was represented by two accessions, that is, cv Java derived from the wild Coffea arabica Ethiopian pool and cv T18141, a C. arabica homozygous line derived from a backcross between C. arabica and a natural interspecific hybrid between C. arabica and C. canephora, which thus is introgressed by C. canephora unigenes. Coffea canephora was represented by cv Nemaya, derived from a cross between two wild Congolese genotypes. Finally, Coffea eugenioides accession seeds were collected from trees originating from Mount Elgon forest in Kenya (1°8′N 34°33′E). The coffee seedlings were grown in a glasshouse under natural daylight, at a constant temperature of 24°C, and watered as necessary. After 120 d, the plants were transferred to a phytotron chamber (CRYONEXT, Montpellier, France; model RTH 1200L).
Cultivation in phytotron chambers
The study involved comparing the transcriptomes of different species under two sets of growing conditions, which differed in terms of the diurnal and nocturnal temperatures. In the two phytotron chambers, the photoperiod, humidity and luminosity were set at 12 h : 12 h, light : dark, 80–90% and 600 μmol m⁻² s⁻¹, respectively. The plants were subjected to a diurnal temperature of 26°C and a nocturnal temperature of 22°C in one phytotron (referred to as the coldest temperature) and 30°C and 26°C, respectively, in a second phytotron (referred to as the hottest temperature). In each phytotron chamber, from each accession, three plants were grown in a randomized complete block design. After 45 d for the hottest temperature and 60 d for the coldest, plant size and number of newly formed leaves were similar. Two young leaves were then collected from each plant at midday (6–8 h after lights on), and then flash-frozen in nitrogen and stored at −80°C until extraction.
Two leaves were pooled per plant from three plants to form three biological replicates, which were subjected to RNA extraction and hybridized to the microarrays. We used 72 slides (i.e. 3 (replicates) × 2 (dye-swaps) × 6 (comparisons) × 2 (growth temperature conditions)) following a saturated design as described in Supporting Information Fig. S1.
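The slide count can be checked with a short enumeration. This is only a sketch: it assumes that the six comparisons are all pairwise comparisons among the four accessions (which matches the count but is not confirmed here, as Fig. S1 is not shown), and the labels are illustrative.

```python
from itertools import combinations, product

# Four accessions used in the study; the pairing scheme below is an assumption.
accessions = [
    "C. arabica cv Java",
    "C. arabica cv T18141",
    "C. canephora",
    "C. eugenioides",
]
temperatures = ["26-22C", "30-26C"]   # day-night growth conditions
replicates = [1, 2, 3]                # biological replicates
dye_orders = ["Cy3/Cy5", "Cy5/Cy3"]   # dye swap

# All unordered pairs of four accessions give six comparisons.
comparisons = list(combinations(accessions, 2))

# One slide per (temperature, comparison, replicate, dye order).
slides = list(product(temperatures, comparisons, replicates, dye_orders))

print(len(comparisons), "comparisons,", len(slides), "slides")
```

Under this assumption the product 3 × 2 × 6 × 2 is recovered as 72 slides.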
Total RNA was isolated using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s recommendations, but with slight modifications. A DNase treatment (RNase free DNase set; Qiagen) was carried out on RNA during the extraction protocol and the total RNA was eluted in a smaller volume of 65 μl. Each RNA sample was quantified on a Nanodrop (ThermoFisher Scientific Inc., Waltham, MA, USA). RNA quality was confirmed on a Bioanalyser (Agilent Technologies, Foster City, CA, USA).
The microarray experiments were performed at the Genomix transcriptomic platform (Institut de Génomique Fonctionnelle, Montpellier) using the protocols described previously (Privat et al., 2011).
Coffee gene assembly (Build II)
To create the SGN Coffee Unigene Build II (http://solgenomics.net/), 71 659 EST chromatograms were processed from the following C. canephora sequence libraries: cccl (coffee leaf; 11 655 chromatograms), cccp (coffee pericarp; 10 849 chromatograms), cccs18w (coffee early-stage bean; 1972 chromatograms), cccs30w (coffee middle-stage bean; 15 318 chromatograms), cccs42w (coffee late-stage bean, 42 wk after pollination; 469 chromatograms), cccs46w (coffee late-stage bean, 46 wk after pollination; 10 907 chromatograms), cccwc22w (coffee early-stage whole fruit; 11 660 chromatograms), irdccf (IRD coffee cherry in various developmental stages; 5089 chromatograms), irdccl (IRD, young leaves; 3693 chromatograms) and nDav1 (Nestle Dav1; 47 chromatograms), using phred software (http://www.phrap.org/phredphrapconsed.html) (Ewing et al., 1998). The sequences were processed to remove vector, adaptors and low-complexity sequences using an SGN-developed Perl script. A total of 55 539 sequences passed the filter tests and were used in the assembly. The unigene assembly was created in two steps. First, we used a self-BLAST and an SGN Perl script (precluster.pl) and, secondly, we used cap3 software (http://seq.cs.iastate.edu/) (Huang & Madan, 1999) for each cluster.
Long oligonucleotide microarray design and synthesis
The C. canephora long oligonucleotide set was designed and synthesized by Operon (Cologne, Germany) based on the SGN Coffee Build II (15 721 unigenes; http://solgenomics.net/). An amino linker was attached to the 5′ end of each oligonucleotide. The oligonucleotides, selected to limit the secondary structure, have a 67 ± 3°C melting temperature, 65 ± 5 base length, and 43 ± 5% GC content. More than 98% of the oligonucleotides were within 1000 bases from the 3′ end of the available gene sequence. For 195 unigenes, no adequate oligonucleotide could be designed and they were therefore classified as ‘missing unigenes’ (Table S3). BLAST alignments were performed to identify oligonucleotides that could cross-hybridize with other sequences of the SGN Coffee Build II. Finally, out of 15 522 oligonucleotides designed, 371 oligonucleotides had > 70% overall identity to another unigene and had a contiguous identical length of over 20 nt common to another unigene (Table S4). For the preparation of labelled Cy3- and Cy5-aRNA target, 1 μg of total RNA sample was amplified using the Amino Allyl Message Amp II aRNA amplification kit (Ambion), according to the manufacturer’s instructions. The oligonucleotide probes were printed on reflective epoxysilane-coated slides (Amplislide; Genewave, Ecole Polytechnique, France) using a Lucidea Array printer (GE Healthcare, Bio-Sciences Corp., Piscataway, NJ, USA). The oligo library also included sets of positive and negative controls used for quality control. The two labelled aRNA were added to Microarray Hybridization Buffer Version 2 (GE Healthcare) and applied to the microarrays in the individual chambers of an automated slide processor (GE Healthcare). Hybridization was carried out at 37°C for 12 h. Hybridized slides were washed and were immediately scanned at 10-μm resolution in both Cy3 and Cy5 channels with a GenePix 4200AL scanner (Molecular Devices, Sunnyvale, California, USA).
ArrayVision (GE Healthcare Bio-Sciences Corp., Piscataway, NJ, USA) software was used for feature extraction. Spots with high local background or contamination fluorescence were flagged manually. A local background was calculated for each spot as the median fluorescence intensity of four squares surrounding the spot. This background was subtracted from the foreground fluorescence intensity.
Validation of microarray results
The 15K coffee microarray was created from C. canephora EST libraries (Privat et al., 2011). For C. arabica, public accessibility to EST collections was limited (Lashermes et al., 2008) and no EST resources were available for C. eugenioides. Despite the high genetic similarity of these three species, we wanted to ensure that there was no bias at the time of hybridization in favour of C. canephora. Our choice of no-bias indicator was the number of unigenes significantly expressed compared with background noise. The signals emitted by the negative controls defined the background noise. A unigene was considered expressed if the intensity exceeded twice the median negative control standard deviation. If a unigene was significantly expressed during a comparison (six hybridizations), its signal should be greater than the highest background noise in each hybridization (at most six times). We chose to set this threshold at 5, and we screened the number of genes that reached or surpassed this value for each species. Consequently, a detection limit threshold of 5 meant that, for a given gene in the six hybridizations, we observed a signal that was significantly greater than the background noise at least five times. Correlation analyses were performed in order to check for similarities between biological and technical replicates through Pearson’s product-moment correlation analysis.
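The detection limit rule can be sketched as a simple count over the six hybridizations of a comparison. The function name and the toy values are hypothetical, and the per-hybridization noise cutoffs are taken as given inputs because their exact computation from the negative controls is not spelled out above.

```python
def is_detected(signals, noise_cutoffs, min_hits=5):
    """Return True if the signal exceeds the background cutoff in at
    least `min_hits` hybridizations (threshold of 5 out of 6 in the text).

    signals       -- unigene intensities, one per hybridization
    noise_cutoffs -- background cutoff for each hybridization (assumed given)
    """
    hits = sum(1 for s, c in zip(signals, noise_cutoffs) if s > c)
    return hits >= min_hits


# Toy example: six hybridizations with an arbitrary cutoff of 5.0 in each.
cutoffs = [5.0] * 6
print(is_detected([10, 12, 9, 11, 13, 2], cutoffs))  # five signals above cutoff
print(is_detected([10, 12, 9, 11, 3, 2], cutoffs))   # only four above cutoff
```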
Analyses of expression changes between species and growth conditions
A significance analysis of microarrays (SAM) test was run using the Bioconductor siggenes package (http://bioconductor.case.edu/bioconductor/2.5/bioc/html/siggenes.html). Repeated permutations of the data were carried out to identify significant unigenes (Tusher et al., 2001). Multiple testing adjustments based on the false discovery rate (FDR) method of Benjamini & Hochberg (1995) were performed (FDR ≤ 0.05), allowing a stringent analysis that limits the false positive identification of differentially regulated unigenes. These analyses provided the ranking of significantly expressed unigenes. For each unigene, we had the log2-fold expression ratio of three contrasts: the diploid parents to each other and each diploid parent to the allopolyploid. For each comparison, we determined the number of differentially expressed unigenes. That number of unigenes was compared with the total number of unigenes of the chip (15 522 unigenes) to obtain percentages calculated with the same denominator, thereby following the conventions adopted by Rapp et al. (2009). This calculation method enables exact percentage comparisons under different conditions (i.e. temperatures and species).
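For readers unfamiliar with the Benjamini & Hochberg (1995) adjustment, a minimal pure-Python sketch is given below. It is not the siggenes implementation (SAM estimates its FDR by permutation); the function name and the toy P-values are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(pvals)
# Unigenes retained at FDR <= 0.05
significant = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj, significant)
```

A unigene is then called differentially expressed when its adjusted value is at or below the chosen FDR level (0.05 here).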
Comparisons between species and growth conditions as a function of 12 possible expression pattern categories
Unigenes significantly differentially expressed were binned into 12 possible expression pattern categories, as defined by Rapp et al. (2009), between two diploids and their derived allopolyploid. To assess the distribution of expression intensities between them, we mapped the kernel density of expression for each species using the density estimator (Proc SGPLOT) in the SAS (Statistical Analysis System) software package (SAS Institute, Cary, NC, USA). These were plotted on a standardized scale against the experimental mean to illustrate allopolyploid vs diploid comparisons.
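The binning logic can be illustrated with a simplified sketch. It collapses the 12 patterns into the broader classes named in the Results (‘no change’, ‘additivity’, parent-like dominance, ‘transgression’) and does not reproduce the exact rules of Rapp et al. (2009). Each contrast is assumed to be coded as +1, −1 or 0 (first member significantly higher, significantly lower, or not significantly different); the function name and this coding are hypothetical.

```python
def classify_pattern(poly_vs_e, poly_vs_c, e_vs_c):
    """Collapse three signed contrast calls into a broad expression class.

    poly_vs_e -- allopolyploid vs C. eugenioides (+1, 0 or -1)
    poly_vs_c -- allopolyploid vs C. canephora   (+1, 0 or -1)
    e_vs_c    -- C. eugenioides vs C. canephora  (+1, 0 or -1)
    """
    if poly_vs_e == 0 and poly_vs_c == 0:
        # Polyploid matches both parents; only coherent if parents match too.
        return "no change" if e_vs_c == 0 else "ambiguous"
    if poly_vs_e == 0:
        # Polyploid equals C. eugenioides but differs from C. canephora.
        return "C. eugenioides-like dominance"
    if poly_vs_c == 0:
        # Polyploid equals C. canephora but differs from C. eugenioides.
        return "C. canephora-like dominance"
    if poly_vs_e > 0 and poly_vs_c > 0:
        return "transgression (up)"
    if poly_vs_e < 0 and poly_vs_c < 0:
        return "transgression (down)"
    # Polyploid lies between the two parents and differs from both.
    return "additivity"


print(classify_pattern(0, 1, 1))    # C. eugenioides-like dominance
print(classify_pattern(1, -1, -1))  # additivity
```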
Stability of C. arabica cultivars under different temperature conditions
In order to be able to compare differences in fluorescence intensity in the same genotype as a function of the per-species growth temperature conditions, a lowess normalization method was used to normalize M-values (log-ratios) for dye-bias within each array, and quantile normalization ensured that the intensities had the same empirical distribution across arrays. These two methods were implemented in the limma R package (Smyth & Speed, 2003; Smyth, 2005) of the Bioconductor project (Gentleman et al., 2004). This data normalization method enabled us to conduct a global analysis of the 72 slides.
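The idea behind quantile normalization can be sketched in a few lines of pure Python. This is an illustration of the general technique, not of the limma implementation, which may treat ties and missing values differently; the function name and toy data are hypothetical.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution.

    arrays -- list of equal-length lists of intensities, one list per array.
    Ties are broken by position (stable sort), a simplification.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # replace value by the reference quantile
        normalized.append(out)
    return normalized


print(quantile_normalize([[5, 2, 3], [4, 1, 9]]))
```

After normalization each array contains the same set of values, so between-array differences in overall intensity no longer drive the comparison.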
Estimates of gene expression were used to fit a linear model in the SAS software package, taking the form:
$Y_{ij} = \mu + \delta_i + s_j + e_{ij}$ (Eqn 1)
($Y_{ij}$, the normalized expression intensity of a unigene; $\mu$, the intercept; $\delta_i$, the fixed effect of treatment $i$ (i.e. genotype observed for a growth temperature condition); $s_j$, the random effect of replication $j$; $e_{ij}$, the random error term.) Resulting P-values were adjusted for multiple testing using the Benjamini & Hochberg (1995) method for controlling the FDR. For a highly stringent analysis that minimizes false positive identification of differentially regulated genes, we filtered for a P-value ≤ 10⁻⁶.
In order to estimate the expression stability between the two temperature conditions, we calculated two indices, called the ‘fluorescence intensity differential’ and the ‘variation index’.
The fluorescence intensity differential was calculated between hybridizations at 26–22°C (coldest temperature) and 30–26°C (hottest temperature) using the following equation.
Fluorescence intensity differential:
- (Eqn 2)
(n, the number of unigenes for which we noted a significant treatment effect at P ≤ 10⁻⁶.)
This differential was calculated for C. canephora and for the two allopolyploids.
An intensity variation index was also calculated by the following equation:
- (Eqn 3)
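As a hedged illustration, and going only by how the two indices are reported in the Results (a mean absolute intensity differential, and a mean intensity variation expressed as a percentage of the intensity at 26–22°C), they could be computed as below. The exact forms of Eqns 2 and 3 are assumptions here, and the function names are hypothetical.

```python
def intensity_differential(cold, hot):
    """Mean absolute difference in intensity between the two conditions.

    cold, hot -- per-unigene intensities at 26-22C and 30-26C (same order).
    Assumed form; not taken from the original Eqn 2.
    """
    n = len(cold)
    return sum(abs(h - c) for c, h in zip(cold, hot)) / n


def variation_index(cold, hot):
    """Mean absolute difference relative to the 26-22C intensity, in %.

    Assumed form; not taken from the original Eqn 3.
    """
    n = len(cold)
    return 100 * sum(abs(h - c) / c for c, h in zip(cold, hot)) / n


cold = [10.0, 20.0]
hot = [12.0, 15.0]
print(intensity_differential(cold, hot), variation_index(cold, hot))
```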
Gene ontology functional enrichment analysis
Computational annotation was performed using Blast2GO software v2.4.4 (http://www.blast2go.org). The annotation step was performed using the BlastX algorithm, the NCBI nr database and a Blast expectation value threshold of 1E−3. The Blast2GO tool was then used to obtain GO information from retrieved database matches. All sequences were mapped using default parameters. An InterProScan was also performed to find functional patterns and related GO terms using the specific tool implemented in the Blast2GO software with the default parameters. We utilized gene ontology classifications for molecular and cellular function, coupled with Fisher’s exact test, to identify processes under- or over-represented for unigenes partitioned into ‘transgression’ and ‘dominance’ categories as a function of temperature.
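The over-representation side of such a test can be sketched with the hypergeometric tail probability, which is the one-sided Fisher’s exact test for a 2 × 2 table. This is an illustration only: the Blast2GO implementation may differ (for example, two-sided testing and its own multiple-testing correction), and the function name and counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided P-value for over-representation of a GO term.

    N -- unigenes in the reference set (e.g. the whole array)
    K -- reference unigenes annotated with the GO term
    n -- unigenes in the category of interest
    k -- unigenes in the category annotated with the GO term
    Returns P(X >= k) for X hypergeometric with parameters (N, K, n).
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts: 5 of 5 category unigenes carry a term found in 5 of 10 overall.
print(fisher_enrichment_p(5, 5, 5, 10))
```

A small P-value indicates that the term occurs in the category more often than expected from its frequency in the reference set.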
We hybridized labelled leaf aRNAs to a 15K coffee microarray, assaying 15 522 unigenes for their relative expression levels and determined, for each unigene, the level of expression variation between parents, and between each parent and the allopolyploid. Our results are displayed according to the temperature conditions.
A new set of leaves formed in the three species after growing for 45 d under the hottest conditions and 60 d under the coldest conditions. This tallied with what is commonly found under natural conditions. The vegetative cycles of the coffee trees were more rapid at the hottest temperature. The phytotron growth conditions seemed to mimic natural growing conditions well. Leaf RNAs were then extracted and used to interrogate microarrays. We used 72 slides for this study. There were three replicates and two dye swaps for each comparison, with a total of six comparisons carried out per temperature (Fig. S1). The analysis was therefore focused on a transcriptome comparison of coffee tree leaves from three species, C. canephora, C. eugenioides and C. arabica. The last species was represented by two different accessions, that is, cv Java and cv T18141.
Assessment of DNA microarray quality
Raw quantification and background noise values were calculated for each chip (data not shown). Flags highlighted invalidated spots, which made it possible to see if there were any unusual artefacts on a slide caused, for example, by washing impurities or dust. The distributions of raw intensities, background noise and log-ratios were uniform. Few spots were flagged and background noise was low and constant when the signal intensity increased, indicating that the chips were good quality.
As 70-mer oligonucleotides designed from C. canephora may not hybridize well to C. eugenioides or C. arabica unigenes, we conducted a comparison of the number of unigenes expressed relative to the background noise for each species. Detection limit data were used to determine the quantity of unigenes significantly expressed in comparison to the background noise. We found that more than 8200 unigenes out of the 15 522 unigenes targeted by the microarray were significantly expressed relative to the background noise, irrespective of the species and growth temperature conditions (Table 1). We calculated that from 86.3% to 90.3% unigenes were common to the three species (Table 1). Moreover, a Pearson correlation between replicates was calculated for each unigene in all arrays, with coefficients ranging from 0.87 to 0.96 in pairwise comparisons for each independent experiment (data not shown). This high coefficient demonstrates the high level of repeatability at which the microarray is able to detect transcriptomic data.
| Growth temperature conditions | Detection limit threshold | Expressed unigenes relative to the background noise: C. arabica cv Java | C. arabica cv T18141 | C. canephora | C. eugenioides | Average | CV (%) | Common unigenes among the three species (%) |
Transcriptome divergence between diploid parents
We used the SAM test for statistical analyses of the microarray data, with the probability set at P < 0.05 for the purposes of identifying unigene expression differences between parents. For both temperatures, high levels of expression divergence were observed between parental diploids. At 26–22°C, amongst the 15 522 unigenes of the microarray, 8460 (54.5%) unigenes were differentially expressed between C. eugenioides and C. canephora (Fig. 1a). At 30–26°C, 5547 unigenes (35.7%) were differentially expressed between C. eugenioides and C. canephora (Fig. 1b). Of the differentially expressed unigenes, equivalent proportions were up-regulated in each parent, that is, 27.9% for C. canephora vs 26.6% for C. eugenioides at 26–22°C and 19.4% for C. canephora vs 15.2% for C. eugenioides at 30–26°C (Fig. 1a,b).
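The percentages quoted above follow from dividing each count by the full set of 15 522 unigenes on the array, as described in the Methods. A short check (the function name is illustrative):

```python
TOTAL_UNIGENES = 15522  # unigenes represented on the microarray


def percent_of_array(count, total=TOTAL_UNIGENES):
    """Express a unigene count as a percentage of the whole array."""
    return round(100 * count / total, 1)


print(percent_of_array(8460))  # 54.5 (C. eugenioides vs C. canephora, 26-22C)
print(percent_of_array(5547))  # 35.7 (C. eugenioides vs C. canephora, 30-26C)
```

Using a single denominator keeps the percentages comparable across species pairs and temperatures.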
We sought to estimate the number of unigenes for which differential expression remained stable under both temperature conditions. The intersection of the Venn diagram in Fig. 2 represents unigenes whose differential expression was common to both temperature conditions (3880 unigenes). Of these 3880 unigenes, we found that 3033 maintained the same pattern (i.e. unigenes that were up-regulated at 30–26°C remained up-regulated at 26–22°C) under both temperature conditions (Fig. 2).
Transcriptome divergence between diploid parents and natural allopolyploid species
In Fig. 1(a,b), we only show the results pertaining to cv Java compared with the diploid parents. The data obtained for cv T18141 were very similar and are represented in Fig. S2. At 26–22°C, the percentages of unigenes differentially expressed between the allopolyploid and its parents were high. Overall, 48.5% and 47.8% of unigenes were differentially expressed between the allopolyploid and C. canephora, and between the allopolyploid and C. eugenioides, respectively (Fig. 1a). Equivalent proportions were found at 30–26°C for the comparison between the allopolyploid and C. eugenioides (40.8%; Fig. 1b). However, an unexpected divergence of only 8.9% in gene expression was noted for the comparison between the allopolyploid and C. canephora at 30–26°C (Fig. 1b). Also, we observed that the proportions of up- and down-regulated unigenes for the same comparison varied little. For example, in C. arabica vs C. canephora at 26–22°C, 3727 unigenes (24%) were up-regulated in the allopolyploid, and 3804 unigenes (24.5%) were up-regulated in C. canephora, that is, a difference of 0.5% (Fig. 1a). This difference peaked at 4.2% for C. canephora vs C. eugenioides at 30–26°C (Fig. 1b).
Patterns of differential expression between both C. arabica cultivars and parental species
As defined by Rapp et al. (2009), the unigenes were partitioned into 12 differential expression patterns. This analysis, unlike that used in the previous section, was specifically focused on the two C. arabica cultivars. We have presented the details of these results in Figs 3–6. These results are summarized in Fig. 7 to facilitate the analysis. At 26–22°C, the ‘C. eugenioides-like dominance’ and ‘C. canephora-like dominance’ ranged from 13 to 17% and from 14 to 8% for the allopolyploid cv Java and for the allopolyploid cv T18141, respectively (Fig. 7). There was also substantial transgressive up- and down-regulation in the allopolyploids (16–18%; Fig. 7). Lastly, only 12% of unigenes were in the ‘additivity’ category in the two allopolyploids (Fig. 7). At 30–26°C, we found that the proportion of unigenes in the ‘no change’ category increased considerably from 13–17% to 31–33% (Fig. 7). However, it was the virtual disappearance of some categories and the very marked increase in ‘C. canephora-like dominance’ unigenes that were most surprising. The proportion of unigenes in the ‘C. eugenioides-like dominance’ category declined drastically to only 0.2% and 2% for cv Java and for cv T18141, respectively (Fig. 7). The proportion of unigenes in the ‘transgression’ category also decreased drastically to 3% (Fig. 7). Lastly, the proportion of unigenes in the ‘additivity’ category was under 2% for both allopolyploids (Fig. 7). Under these conditions (30–26°C), we noted dominance of the C. canephora transcription profile. Fig. 8 shows that the 3250 unigenes in the ‘C. canephora-like dominance’ category in the allopolyploid cv Java were derived from all categories of unigenes in the coldest conditions. This was also the case for unigenes in the ‘no change’ category (Fig. 8).
We also sought to determine the proportion of unigenes common to both of the studied allopolyploids. The calculation was focused on categories for which there were over 70 unigenes. At 26–22°C, the two allopolyploids shared 45–56% unigenes irrespective of the category considered (data not shown). At 30–26°C, these proportions were over 70% for the ‘C. canephora-like dominance’ category (i.e. the only one having a significant number of unigenes; data not shown). This high proportion of unigenes common to two very different representatives of the allopolyploid species indicates that the results obtained here could be applied to the entire species.
Impact on physiology and metabolic pathways in allopolyploids
We focused our study in particular on 66 unigenes involved in several major metabolic pathways (sugar and starch degradation, lipid, phenylpropanoid and ethylene biosynthesis, stress response and circadian rhythm). For each of these unigenes, it was thus possible to monitor changes in category according to the growth temperature conditions. We noted that there were category changes for all biosynthetic pathways. However, of the 66 studied unigenes, 10 had an atypical behaviour as they remained in the same category or shifted to a minority category when the environmental conditions changed (Table S1). For instance, this was the case for lipoxygenase (SGN-U347607) and glutathione peroxidase (SGN-U349893), which act in major pathways of fatty acid oxygenation, acylglycerol-phosphate-acyltransferase (SGN-U351326), which is involved in membrane phospholipid biosynthesis, or caffeoyl-CoA-3-O-methyltransferase (SGN-U351599), which is involved in lignin biosynthesis and was always in a transgressive situation under both hot and cold conditions (Table S1). Circadian clocks are known to affect many physiological and developmental processes, including various metabolic pathways and fitness traits in photosynthesis and starch metabolism in plants. The late elongated hypocotyl (LHY; SGN-U351840), which is one of the negative regulators of the central oscillators of the circadian clock in Arabidopsis, was in the ‘additivity’ category in the coldest conditions and in the ‘C. eugenioides-like dominance’ category in the hottest conditions (Table S1).
Stability of C. arabica cultivars in different temperature conditions
The question then arises as to whether regulatory changes at high temperatures affect C. arabica more than C. canephora. Of the 15 522 unigenes, we found 4244 unigenes for which there was a significant between-treatment effect (i.e. genotype observed for a growth temperature condition; Table 2). For this subset of 4244 unigenes, a fluorescence intensity differential (DFI) was calculated for hybridizations conducted at 26–22°C and 30–26°C. This differential was calculated for C. canephora and both allopolyploids. The variation index (IVar) was also calculated. On average, the absolute intensity differential value was 3.22 in C. arabica cv Java and 2.74 in C. arabica cv T18141 as compared with 3.78 in C. canephora (Table 2). The mean intensity variation was 26.48% in C. arabica cv Java and 25.26% in C. arabica cv T18141 as compared with 33.21% in C. canephora (Table 2). A second subset of 385 unigenes was created that corresponded to a ‘transgressive’ situation for allopolyploids at 26–22°C and to a ‘C. canephora-like dominance’ situation at 30–26°C (Table 2). On average, the absolute intensity differential value was 6.14 in C. arabica cv Java and 9.41 in C. arabica cv T18141 as compared with 16.45 in C. canephora (Table 2). The intensity variation between 30–26°C and 26–22°C for the 385 unigenes relative to the intensity at 26–22°C was, on average, 35.60% in C. arabica cv Java and 53.34% in C. arabica cv T18141 as compared with 89.32% for C. canephora (Table 2). Taken together, these results suggest better homeostasis of the allopolyploids as compared with the diploid.
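||C. arabica cv Java||C. arabica cv T18141||C. canephora|
|Subset of 4244 unigenes|
|Mean absolute intensity differential||3.22||2.74||3.78|
|Mean intensity variation (%)||26.48||25.26||33.21|
|Subset of 385 unigenes|
|Mean absolute intensity differential||6.14||9.41||16.45|
|Mean intensity variation (%)||35.60||53.34||89.32|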
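The two summary statistics reported for Table 2 can be illustrated with a minimal sketch. It assumes, from the wording of the text, that the intensity differential is the absolute difference between the intensities measured at 30–26°C and at 26–22°C, and that the intensity variation is that difference expressed as a percentage of the intensity at 26–22°C. The exact definitions of DFI and IVar are those given in Materials and Methods and may differ; the function names and the example values are hypothetical.

```python
def intensity_differential(cold: float, hot: float) -> float:
    """Absolute difference between intensities at the two temperatures (assumed DFI)."""
    return abs(hot - cold)


def intensity_variation(cold: float, hot: float) -> float:
    """Differential as a percentage of the 26-22 degC intensity (assumed IVar)."""
    if cold == 0:
        raise ValueError("intensity at 26-22 degC must be non-zero")
    return 100.0 * abs(hot - cold) / cold


def summarize(pairs):
    """Average both statistics over (cold, hot) intensity pairs of a unigene subset."""
    if not pairs:
        raise ValueError("at least one unigene is required")
    n = len(pairs)
    mean_differential = sum(intensity_differential(c, h) for c, h in pairs) / n
    mean_variation = sum(intensity_variation(c, h) for c, h in pairs) / n
    return mean_differential, mean_variation


# Hypothetical intensities for two unigenes of one genotype: (cold, hot).
example_pairs = [(10.0, 12.0), (20.0, 15.0)]
mean_differential, mean_variation = summarize(example_pairs)
```

Under these assumptions, a genotype with lower averages for both statistics would be the one whose expression levels change least between the two growth temperatures.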
Functional enrichment analysis
In order to shed light on the processes involved under the conditions studied, we tested for enrichment of gene ontology (GO) terms within the ‘dominance’ and ‘transgression’ categories for both growth temperature conditions. Table S2 shows GO terms with a significantly higher frequency in category unigene sets in comparison with the full set of unigenes of the PUCE CAFE array. Unigenes dominantly transcribed in the allopolyploid were enriched for GO terms pertaining to chloroplasts, membrane activities and activities that regulate DNA replication, or which more specifically are involved in photorespiration (Table S2). This is the case for the respiratory chain complex I, which forms part of the mitochondrial respiratory chain, including the NADH dehydrogenase complex. For unigenes transgressively transcribed in the allopolyploid at the coldest temperatures, there were many enrichments of GO terms, especially for functions pertaining to chloroplasts, sugars, lipids and membranes (Table S2).
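The idea of a GO term having a 'significantly higher frequency' in a category set than on the full array can be sketched with a one-sided hypergeometric test. This is only an illustration of the general principle: the test actually applied (Table S2 refers to Blast2GO annotation) and any multiple-testing correction are not described in this section, and the function name, argument names and counts below are hypothetical.

```python
from math import comb


def enrichment_p_value(array_total: int, array_with_term: int,
                       set_total: int, set_with_term: int) -> float:
    """Probability of seeing at least `set_with_term` annotated unigenes when
    `set_total` unigenes are drawn at random, without replacement, from an
    array of `array_total` unigenes of which `array_with_term` carry the term."""
    if not 0 <= set_with_term <= min(set_total, array_with_term):
        raise ValueError("inconsistent counts")
    if set_total > array_total or array_with_term > array_total:
        raise ValueError("subset larger than the array")
    without_term = array_total - array_with_term
    upper = min(array_with_term, set_total)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(array_with_term, k) * comb(without_term, set_total - k)
        for k in range(set_with_term, upper + 1)
    )
    return tail / comb(array_total, set_total)


# Hypothetical counts: 15 522 unigenes on the array, 300 annotated with a term,
# 385 unigenes in the category set, 20 of them annotated with the term.
p_value = enrichment_p_value(15522, 300, 385, 20)
```

A small p-value would indicate that the term is more frequent in the category set than expected from the array as a whole; in practice such values would need correction for the number of GO terms tested.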
Discussion
The tetraploid species C. arabica is an agricultural species of prime importance (65% of world coffee production) that is grown in > 30 intertropical countries. It is native to the tropical forests on the Abyssinian Plateau, at elevations of 1200–1950 m, with an optimum mean annual temperature range of 18–21°C (DaMatta & Cochicho Ramalho, 2006; Davis et al., 2006). The diploid species C. canephora (35% of world coffee production) is native to lowland equatorial rainforests of the Congo River basin and tropical West Africa, extending up to Lake Victoria in Uganda at elevations of 250–1500 m (Davis et al., 2006). In those regions, the annual mean temperature ranges from 22 to 26°C, without substantial oscillations (DaMatta & Cochicho Ramalho, 2006). The wild diploid species C. eugenioides is found in large or small relict forests in highland areas, at elevations of 1000–2000 m, with a mean annual temperature ranging from 18 to 23°C (Davis et al., 2006). In order to monitor gene expression in the natural allopolyploid C. arabica, we grew the two diploid parent species (C. canephora and C. eugenioides) and two C. arabica cultivars in two separate growth chambers in which diurnal and nocturnal temperature conditions were set at 26–22°C for the coldest growth temperature conditions and 30–26°C for the hottest. The coldest temperatures were similar to the hottest conditions in tropical environments under which both C. arabica and C. eugenioides can be found, while the hottest temperature conditions corresponded to extreme growth conditions under which C. arabica and C. canephora can be grown. The moderate lighting conditions were suitable for growing coffee trees, which naturally grow under canopy.
We hybridized labelled leaf aRNAs to a 15K coffee microarray, assaying 15 522 unigenes for their relative expression levels, and determined, for each unigene, the level of expression variation between parents, and between each parent and the allopolyploid. This approach is often used to estimate changes in the expression of several thousand genes in many different species (Hegarty et al., 2006; Wang et al., 2006; Rapp et al., 2009; Flagel & Wendel, 2010). The original feature of our approach is that we compared the three species in varying environmental conditions. We varied the temperature because it is a key parameter affecting most physico-chemical reactions in plants. Our results are displayed according to the temperature conditions.
We confirmed an absence of hybridization bias, as the number of unigenes expressed relative to the background noise was the same whatever the species. The 70-mer oligonucleotides designed from C. canephora thus seemed to hybridize well to the C. eugenioides and C. arabica unigenes. This result could be explained by the high genic sequence identity among these three species (> 98%; A. Cenci et al., unpublished).
Transcriptomic divergence between parents and between each parent and the allopolyploids
As all plants were grown under common controlled conditions, we expected only modest expression divergence among diploids, but high levels of expression divergence were observed between C. eugenioides and C. canephora at both temperatures (Fig. 1a,b). These initial results showed that the transcriptomic regulation of the two species markedly differed despite the high sequence identity between the genomes (A. Cenci et al., unpublished) and was modulated by growth temperature conditions. Our findings were in the same range as those obtained by Rapp et al. (2009) for comparisons between Gossypium arboreum and Gossypium bickii and between G. arboreum and Gossypium thurberi. Of the differentially expressed unigenes, equivalent proportions were up-regulated in each parent; this balance was also reported by Wang et al. (2006) and Rapp et al. (2009).
When the allopolyploid was compared with its two diploid parents, the transcriptomic divergence between two species varied considerably depending on growing conditions. Almost half of the unigenes were differentially expressed between the allopolyploid and each of its two parental species at 26–22°C (Fig. 1a) and between the allopolyploid and C. eugenioides at 30–26°C (Fig. 1b), whereas unexpectedly low unigene expression divergence (< 9%) was noted for the comparison between the allopolyploid and C. canephora at 30–26°C (Fig. 1b). The overall results showed that the C. arabica and C. canephora transcription profiles converged as the temperature increased. It is worth noting that there was little variation in the proportions of up- and down-regulated unigenes for the same comparison.
Impact of temperature on gene expression in allopolyploids
We used a categorically partitioned analysis of the full set of unigenes, as defined by Rapp et al. (2009), and were able to characterize all unigenes studied in each of the comparisons. This analysis was specifically focused on the two C. arabica cultivars. The virtual disappearance of some categories and the very marked increase in ‘C. canephora-like dominance’ and ‘no change’ were most surprising. The new phenomenon revealed by our study was the drastic change in proportions when environmental conditions (i.e. growth temperature) were modified. This suggests that parental regulatory genes might be modulated in allopolyploid species.
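The logic of such a categorical partition can be sketched as follows. This is a simplified reading of the categories named in the text and not the full scheme of Rapp et al. (2009): it assumes that each unigene has already been given a signed call for the allopolyploid against each parent (+1 significantly higher, -1 significantly lower, 0 not significantly different), and it ignores the comparison between the two parents. The function names and the example calls are hypothetical.

```python
from collections import Counter


def classify_unigene(ara_vs_can: int, ara_vs_eug: int) -> str:
    """Assign one unigene to a category from two signed significance calls."""
    for call in (ara_vs_can, ara_vs_eug):
        if call not in (-1, 0, 1):
            raise ValueError("calls must be -1, 0 or +1")
    if ara_vs_can == 0 and ara_vs_eug == 0:
        return "no change"
    if ara_vs_can == 0:
        # Same level as C. canephora, different from C. eugenioides.
        return "C. canephora-like dominance"
    if ara_vs_eug == 0:
        # Same level as C. eugenioides, different from C. canephora.
        return "C. eugenioides-like dominance"
    if ara_vs_can == ara_vs_eug:
        # Above both parents or below both parents.
        return "transgressive (up)" if ara_vs_can > 0 else "transgressive (down)"
    # Above one parent and below the other, i.e. intermediate.
    return "additivity"


def category_proportions(calls):
    """Proportion of unigenes in each category for one temperature condition."""
    counts = Counter(classify_unigene(can, eug) for can, eug in calls)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Hypothetical calls for four unigenes at one growth temperature.
proportions = category_proportions([(0, 0), (0, 1), (1, 1), (1, -1)])
```

Running the same partition separately on the calls obtained at 26–22°C and at 30–26°C would give the two sets of proportions whose contrast is discussed above.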
Among the lines very widely disseminated in Latin America, introgression of chromosome fragments from C. canephora by interspecific hybridization seems to have made the selected lines more adapted to hot temperature conditions (22–26°C), as compared with natural C. arabica lines. This is the case for cv T18141, which we studied here and whose rate of introgression by C. canephora was estimated using molecular markers at over 20% of the C. canephora genome (Lashermes et al., 2000). We noted that the transcription profile of C. arabica cv T18141 seems to be more similar to that of its parent C. canephora than that of the pure C. arabica variety ‘Java’, which makes sense considering the introgression of T18141 by chromosome fragments from C. canephora.
We found that the behaviour at 30–26°C of the two majority unigene categories (i.e. ‘C. canephora-like dominance’ and ‘no change’) derived from all categories of unigenes in the coldest conditions (Fig. 8). We particularly focused our study on 66 unigenes involved in several major metabolic pathways (Wang et al., 2006; Salmona et al., 2008; Joët et al., 2009; Ni et al., 2009) and showed that no biosynthesis pathway was uniformly classified in any category and that, when the growth temperature conditions varied, there were category changes for all biosynthesis pathways. However, there was a minority of genes showing atypical behaviour. Those genes would be interesting targets for testing the hypothesis that the higher plasticity of allopolyploids might be explained by a few genes in particular biological pathways, as suggested by Ni et al. (2009). In the functional enrichment analysis, we found that the enrichments varied at the coldest temperature and that there was considerable depletion at the highest temperature in both of the categories studied (‘dominance’ and ‘transgression’), which suggests stressful growing conditions.
Lastly, we observed the stability of C. arabica cultivars under different temperature conditions. The allopolyploids seemed to display a higher level of homeostasis than their C. canephora parent. It is generally accepted that allopolyploids exhibit greater phenotypic plasticity than their parents. Our results seem to suggest that their greater phenotypic plasticity is based on better homeostasis of gene expression. To explain this mechanism, we propose that the relative contribution of homeologues to the transcriptome varies with growth temperature conditions, while the same global gene expression level is maintained between the two temperatures. This would mean that at the cold temperature one homeologue is mostly recruited to be expressed whereas the other is recruited little or not at all, and that the opposite situation would be observed at the hot temperature. However, the global level of expression would vary little in the allopolyploid and a certain ‘stability’ would be observed.
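The proposed mechanism can be made concrete with a small numerical sketch. All values are invented for illustration, since the array used here cannot separate the two homeologues; the function names are hypothetical, and which homeologue is favoured at which temperature is an arbitrary choice in this example.

```python
def total_expression(canephora_copy: float, eugenioides_copy: float) -> float:
    """Global expression level of a unigene as the sum of its two homeologues."""
    return canephora_copy + eugenioides_copy


def canephora_share(canephora_copy: float, eugenioides_copy: float) -> float:
    """Fraction of the global expression contributed by the C. canephora-derived copy."""
    return canephora_copy / total_expression(canephora_copy, eugenioides_copy)


# Invented homeologue expression levels (arbitrary units) for one unigene.
cold = {"canephora_copy": 20.0, "eugenioides_copy": 80.0}  # 26-22 degC
hot = {"canephora_copy": 85.0, "eugenioides_copy": 15.0}   # 30-26 degC

total_cold = total_expression(**cold)
total_hot = total_expression(**hot)
share_cold = canephora_share(**cold)
share_hot = canephora_share(**hot)
```

In this invented case the global level is identical at both temperatures, while the share of the C. canephora-derived copy rises from 0.20 to 0.85, which is the kind of situation a measurement of total expression alone cannot reveal.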
We studied the transcription profile of a relatively recent natural allopolyploid species and compared this profile with that of its parental species under two temperature conditions during plant growth. By employing a microarray technology for the analysis of 15 522 coffee unigenes, amounting to around half of all coffee tree unigenes, we reached the conclusion that transcriptome divergence in C. arabica, in comparison with its two diploid parents, is modulated by the environment. This finding was not completely unexpected, as it was previously reported that the pattern of parental dominance varies from tissue to tissue (Chaudhary et al., 2009) and under stress (Liu & Adams, 2007). However, it is the extent of the phenomenon that is remarkable. The capacity of an allopolyploid to play on two subgenomes and to adapt the score to the prevailing environmental conditions distinguishes it from its parents. The C. arabica allopolyploid species showed higher homeostasis depending on environmental conditions as compared with the variability noted in the diploid C. canephora species. In our view, this phenomenon could explain the greater plasticity of allopolyploids compared with their parental species in coping with environmental variations and might ultimately explain their better adaptation to environmental conditions.
Our global gene expression analysis did not allow us to determine the relative contributions of homeologues to the transcriptome. To be able to do this, a microarray technology needs to be developed using homeologue-specific (i.e. parent-specific) probe sets or by sequencing as was done by Vidal et al. (2010). Such technologies would enable us to differentiate cases in which similar levels of expression are attained via biased parental expression or via equal expression of the homeologous parental genomes. It would thus be possible to determine whether or not one subgenome is massively expressed to the detriment of the other subgenome, as suggested by Flagel & Wendel (2010), and whether this balance of subgenomic contributions is environment-dependent.
It is still unclear what mechanisms are involved in the modulation of parental regulatory genes in allopolyploid species. Several nonexclusive hypotheses could be put forward. For instance, as suggested by Ni et al. (2009), a few regulatory genes whose activity is dependent on environmental conditions could lead to a cascade of changes in downstream genes and physiological pathways in polyploids. But how are these regulatory genes activated? Ha et al. (2009) suggested that expression variation of miRNAs leads to changes in gene expression, growth vigor and adaptation.
Acknowledgements
We thank C. Dantec and D. Severac (CNRS, Institut de Génomique Fonctionnelle, Montpellier) for microarray generation and quality control. This research was supported by a grant from the Agence Nationale de la Recherche (ANR; Génoplante GPLA 06010 G). All the authors have read the manuscript and agree with the contents.
References
- 2004. Origins, establishment and evolution of new polyploid species: Senecio cambrensis and Senecio eboracensis in the British Isles. In: Leitch AR, Soltis DE, Soltis PS, Leitch IJ, Pires JC, eds. Biological relevance of polyploidy: ecology to genomics. Biological Journal of the Linnean Society 82: 467–474.
- 2007. Evolution of duplicate gene expression in polyploid and hybrid plants. Journal of Heredity 98: 136–141.
- 2003. Hybridization, polyploidy and speciation in Spartina (Poaceae). New Phytologist 161: 165–172.
- 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society 57: 289–300.
- 2008. Towards natural polyploid model organisms. Molecular Ecology 17: 1875–1876.
- 1985. Botanical classification of coffee. In: Clifford MN, Wilson KC, eds. Coffee: botany, biochemistry and production of beans and beverage. London, UK: Croom Helm, 13–47.
- 2009. Reciprocal silencing, transcriptional bias and functional divergence of homeologs in polyploid cotton (Gossypium). Genetics 182: 503–517.
- 2006. Mechanisms of genomic rearrangements and gene expression changes in plant polyploids. Bioessays 28: 240–252.
- 2006. Impacts of drought and temperature stress on coffee physiology and production: a review. Brazilian Journal of Plant Physiology 18: 55–81.
- 2006. An annotated taxonomic conspectus of genus Coffea (Rubiaceae). Botanical Journal of the Linnean Society 152: 465–512.
- 2008. A taxonomic revision of the baracoffea alliance: nine remarkable Coffea species from western Madagascar. Botanical Journal of the Linnean Society 158: 355–390.
- 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research 8: 175–185.
- 2010. Evolutionary rate variation, genomic dominance and duplicate gene expression evolution during allotetraploid cotton speciation. New Phytologist 186: 184–193.
- 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biology 5: R80.
- 2009. Small RNAs serve as a genetic buffer against genomic shock in Arabidopsis interspecific hybrids and allopolyploids. Proceedings of the National Academy of Sciences, USA 106: 17835–17840.
- 2008. Changes to gene expression associated with hybrid speciation in plants: further insights from transcriptomic studies in Senecio. Philosophical Transactions of the Royal Society B 363: 3055–3069.
- 2006. Transcriptome shock after interspecific hybridization in Senecio is ameliorated by genome duplication. Current Biology 16: 1652–1659.
- 2009. The complex nature of allopolyploid plant genomes. Heredity 103: 100–101.
- 1999. CAP3: a DNA sequence assembly program. Genome Research 9: 868–877.
- 2009. Genomic and expression plasticity of polyploidy. Current Opinion in Plant Biology 13: 1–7.
- 2006. A unique recent origin of the allotetraploid species Arabidopsis suecica: evidence from nuclear DNA markers. Molecular Biology and Evolution 23: 1217–1231.
- 2009. Metabolic pathways in tropical dicotyledonous albuminous seeds: Coffea arabica as a case study. New Phytologist 182: 146–162.
- 2000. Molecular analysis of introgressive breeding in coffee (Coffea arabica). Theoretical and Applied Genetics 100: 139–146.
- 2008. Genomics of coffee, one of the world’s largest traded commodities. In: Moore PH, Ming R, eds. Genomics of tropical crop plants. New York, NY, USA: Springer, 203–224.
- 2010. Genetic and physical mapping of the SH3 region that confers resistance to leaf rust in coffee tree (Coffea arabica L.). Tree Genetics and Genomes 6: 973–980.
- 1999. Molecular characterisation and origin of the Coffea arabica L. genome. Molecular and General Genetics 261: 259–266.
- 2008. Genomic plasticity and the diversity of polyploid plants. Science 320: 481–483.
- 2007. Expression partitioning between genes duplicated by polyploidy under abiotic stress and during organ development. Current Biology 17: 1669–1674.
- 2009. Altered circadian rhythms regulate growth vigour in hybrids and allopolyploids. Nature 457: 327–331.
- 2003. Understanding mechanisms of novel gene expression in polyploids. Trends in Genetics 19: 141–147.
- 2000. Polyploid incidence and evolution. Annual Review of Genetics 34: 401–437.
- 2011. The ‘PUCE CAFÉ’ project: the first 15K coffee microarray, a new tool for discovering candidate genes correlated to agronomic and quality traits. BMC Genomics 12: 5.
- 2009. Nonadditive expression of homeologous genes is established upon polyploidisation in hexaploid wheat. Genetics 181: 1147–1157.
- 2009. Genomic expression dominance in allopolyploids. BMC Biology 7: 18.
- 2008. Deciphering transcriptional networks that govern Coffea arabica seed development using combined cDNA array and real-time RT-PCR approaches. Plant Molecular Biology 66: 105–124.
- 2003. Rate variation among nuclear genes and the age of polyploidy in Gossypium. Molecular Biology and Evolution 20: 633–643.
- 2005. Limma: linear models for microarray data. In: Gentleman R, Carey V, Dudoit S, Irizarry R, Huber W, eds. Bioinformatics and computational biology solutions using R and Bioconductor. New York, NY, USA: Springer, 397–420.
- 2003. Normalization of cDNA microarray data. Methods 31: 265–273.
- 2004. Recent and recurrent polyploidy in Tragopogon (Asteraceae): cytogenetic, genomic and genetic comparisons. In: Leitch AR, Soltis DE, Soltis PS, Leitch IJ, Pires JC, eds. Biological relevance of polyploidy: ecology to genomics. Biological Journal of the Linnean Society 82: 485–501.
- 2001. Significance analysis of microarrays applied to the ionizing radiation response. Proceedings of the National Academy of Sciences, USA 98: 5116–5121.
- 2010. A high-throughput data mining of single nucleotide polymorphisms in Coffea species expressed sequence tags suggests differential homeologous gene expression in the allotetraploid Coffea arabica. Plant Physiology 154: 1053–1066.
- 2006. Genomewide nonadditive gene regulation in Arabidopsis allotetraploids. Genetics 172: 507–517.
Supporting Information
Fig. S1 Microarray experimental design.
Fig. S2 Transcriptome divergence and nonadditive gene expression between Coffea arabica cv T18141 and parental diploid species.
Table S1 Monitoring selection of differential responses of 66 unigenes in two growth temperature conditions
Table S2 Blast2GO annotation for genes differentially expressed in the ‘dominance’ and ‘transgression’ categories
Table S3 The missing genes
Table S4 Oligonucleotides that may cross-hybridize with several sequences