Expansion of banana (Musa acuminata) gene families involved in ethylene biosynthesis and signalling after lineage-specific whole-genome duplications



  • Whole-genome duplications (WGDs) are widespread in plants, and three lineage-specific WGDs occurred in the banana (Musa acuminata) genome. Here, we analysed the impact of WGDs on the evolution of banana gene families involved in ethylene biosynthesis and signalling, a key pathway for banana fruit ripening.
  • Banana ethylene pathway genes were identified using comparative genomics approaches and their duplication modes and expression profiles were analysed.
  • Seven out of 10 banana ethylene gene families evolved through WGD and four of them (1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-like (EIL), ethylene-insensitive 3-binding F-box (EBF) and ethylene response factor (ERF)) were preferentially retained. Banana orthologues of AtEIN3 and AtEIL1, two major genes for ethylene signalling in Arabidopsis, were particularly expanded. This expansion was paralleled by that of EBF genes which are responsible for control of EIL protein levels. Gene expression profiles in banana fruits suggested functional redundancy for several MaEBF and MaEIL genes derived from WGD and subfunctionalization for some of them.
  • We propose that EIL and EBF genes were co-retained after WGD in banana to maintain balanced control of EIL protein levels and thus avoid detrimental effects of constitutive ethylene signalling. In the course of evolution, subfunctionalization was favoured to promote finer control of ethylene signalling.


Gene duplication is a major mechanism generating new templates for evolutionary innovation in eukaryotes (Ohno, 1970; Lynch & Conery, 2000). Gene duplicates may originate from single gene duplications such as tandem and proximal duplications, or from large-scale duplications (Maere et al., 2005). Tandem duplicates are adjacent and mainly result from unequal crossing-over, whereas in proximal duplications duplicates are separated by other genes and may result from unequal crossing-over or transposon activities (Wang et al., 2012). Large-scale gene duplications, including whole-genome duplications (WGDs) and segmental duplications (i.e. duplications of a chromosomal region; Koszul & Fischer, 2009), are frequent in the history of angiosperm genomes. A hexaploidization event (γ triplication) occurred near the origin of eudicots (Jiao et al., 2012). It was followed by lineage-specific duplications in some taxa, such as two tetraploidization events in Arabidopsis thaliana (α and β; Blanc et al., 2003; Bowers et al., 2003) and a hexaploidization in Solanaceae (‘T’ triplication; The Tomato Genome Consortium, 2012). Within monocots, two WGD events were characterized in sequenced Poaceae genomes. The ρ WGD occurred c. 50–70 million yr ago (Ma) (Paterson et al., 2004; Salse et al., 2008) and the σ WGD occurred earlier in the monocot lineage (Tang et al., 2010). In addition, a recent WGD occurred in Zea mays c. 5–12 Ma (Schnable et al., 2009). Recently, the genome of banana (Musa acuminata), a monocotyledon from the order Zingiberales, was sequenced, using DH-Pahang, a doubled haploid (523 Mb; 2n = 22) derived from a seedy diploid of the subspecies malaccensis (D'Hont et al., 2012). This subspecies contributed one of the three M. acuminata genomes of the sterile triploid cultivar Cavendish (AAA genome), which accounts for half of world-wide banana production (Lescot, 2011). 
Analyses of the banana genome revealed three rounds of WGD (α, β and γ) that were not shared with the Poales or the Arecales (palms) (D'Hont et al., 2012). The α and β WGDs were estimated to have occurred within a short time frame c. 65 Ma, whereas the γ WGD was dated to c. 100 Ma. The availability of the banana genome sequence offers the opportunity to study the evolution of banana gene families in the context of the three WGDs.

Following duplication, paralogous genes can have different fates. They can become pseudogenes or be lost, and it is now well established that, over evolutionary time, most WGD-derived duplicate genes are lost through fractionation (Lockton & Gaut, 2005). This process has a major impact on the evolution of plant genes, as some of them are preferentially retained after WGD whereas others are found preferentially in a singleton state (Freeling, 2009). In addition, functional categories of genes that are more likely to be retained after WGD have been observed to be less likely to be retained after tandem duplication, and vice versa (Freeling, 2009; Woodhouse et al., 2011; Rodgers-Melnick et al., 2012). In banana, the most preferentially retained gene categories after WGD included transcription factors, signal transduction genes and translational elongation genes, similar to findings in A. thaliana (Blanc & Wolfe, 2004; Maere et al., 2005; D'Hont et al., 2012). Retention of these gene categories has been explained by the gene balance hypothesis (Birchler et al., 2001; Papp et al., 2003; Freeling & Thomas, 2006), which states that genes encoding products that are in a balanced interacting relationship, such as those encoding members of a protein complex or involved in multiple steps in regulatory cascades, will tend to be dosage sensitive because changes in the stoichiometry of individual components will be detrimental. These genes are thus more prone to be co-retained after WGD (Birchler & Veitia, 2007). Other models for duplicate gene retention include neofunctionalization, where one of the duplicates acquires a new function, and subfunctionalization, where the two copies share the function of the ancestral gene (Force et al., 1999).

To analyse gene family evolution, we focused on a key pathway for banana fruit ripening, the ethylene biosynthesis and signalling pathway (Supporting Information Fig. S1). Banana fruits are climacteric; they are characterized by drastic changes in ethylene production with an increased respiration burst during ripening (Burg & Burg, 1965; Liu et al., 1999). In addition, export bananas are ripened by exogenous application of ethylene. In A. thaliana, ethylene is perceived by a family of five ethylene receptors (ethylene response 1 (ETR1), ETR2, ethylene response sensor 1 (ERS1), ERS2 and ethylene-insensitive 4 (EIN4); reviewed by Shakeel et al., 2013). Ethylene receptors act as negative regulators of signalling through constitutive activation of the Ser/Thr kinase constitutive triple response 1 (CTR1; Kieber et al., 1993). The response-to-antagonist 1 (RAN1) protein is a copper transporter that is essential for biogenesis of ethylene receptors (Binder et al., 2010) and reversion-to-ethylene sensitivity 1 (RTE1) is involved in the function of the ETR1 receptor (Resnick et al., 2008). In the presence of ethylene, receptors inactivate CTR1, thus relieving suppression on downstream signalling components. The EIN2 protein, an endoplasmic reticulum-bound protein (Alonso et al., 1999), is processed and its C-terminal domain migrates into the nucleus (Qiao et al., 2012; Wen et al., 2012). There, it activates the EIN3/EIN3-like (EIL) transcription factors which, in turn, initiate the ethylene transcriptional responses by binding to specific elements in promoter regions of genes encoding ethylene response factors (ERFs). Additional regulation of ethylene signalling occurs at the post-transcriptional level: EIN3 protein levels are regulated through EIN3-binding F-box (EBF) proteins which are components of Skp, Cullin, F-box containing (SCF) complexes (Guo & Ecker, 2003; Potuschak et al., 2003; Gagne et al., 2004). 
In banana, genes encoding the two main enzymes of the ethylene biosynthesis pathway (1-aminocyclopropane-1-carboxylate synthase (ACS) and 1-aminocyclopropane-1-carboxylate oxidase (ACO)) were identified based on cDNA amplification (two ACO and four ACS genes; Liu et al., 1999; Inaba et al., 2007). In addition, three ERS genes (Yan et al., 2011), one CTR1-like gene (Hu et al., 2012), five EIL genes (Mbéguié-A-Mbéguié et al., 2008), two EBF genes (Chen et al., 2011; Kuang et al., 2013) and 15 ERF genes (Xiao et al., 2013) were identified but no complete inventory of these gene families could be performed.

Here, we identified all members of 10 gene families of the banana ethylene pathway using genome-scale approaches. We analysed their evolutionary patterns with a specific focus on the EIL and EBF gene families which play a central role in the control of ethylene signalling. Our results showed expansion of several ethylene gene families after banana WGD. Based on expression data, the co-expansion of EBF genes and of a specific subgroup of EIL genes is partly associated with functional redundancy; however, subfunctionalization also occurred in both families.

Materials and Methods

Identification of ethylene pathway genes

For each gene family, clusters of protein sequences from 12 plant species (Table S1) were identified using Pathway Tools databases (Karp et al., 2002) including MusaCyc (http://banana-genome.cirad.fr/musacyc; Droc et al., 2013), the Greenphyl database (http://www.greenphyl.org/cgi-bin/index.cgi; Rouard et al., 2011), InterProScan (Quevillon et al., 2005) and BLASTP clustering using a protein reference list (Table S2). Only the longest sequence of each gene was kept. To identify banana ERFs, APETALA2/ethylene-responsive element-binding proteins (AP2/EREBP) were retrieved by searching the M. acuminata proteome for InterPro domain IPR001471 and were classified following a specific approach (Methods S1, Table S3). For other plants, ERF numbers were retrieved from published data. Gene family expansion was detected by comparing the number of family members in banana with the numbers of family members in other species after standardization using the predicted proteome size of each species. A χ2 test was applied to determine the significance of the observed difference.
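The rule of keeping only the longest sequence per gene can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the record layout (gene identifier, isoform identifier, sequence), the function name and the example identifiers are all hypothetical.

```python
# Minimal sketch: keep only the longest protein sequence per gene.
# The record layout (gene_id, isoform_id, sequence) and the example IDs are
# hypothetical; they are not taken from the study's data files.

def keep_longest_per_gene(records):
    """Return {gene_id: (isoform_id, sequence)} holding the longest sequence per gene."""
    longest = {}
    for gene_id, isoform_id, sequence in records:
        current = longest.get(gene_id)
        # Replace the stored isoform only when the new one is strictly longer,
        # so the first isoform encountered wins ties.
        if current is None or len(sequence) > len(current[1]):
            longest[gene_id] = (isoform_id, sequence)
    return longest


example_records = [
    ("geneA", "geneA.1", "MSTNKLV"),
    ("geneA", "geneA.2", "MSTNKLVQQAR"),
    ("geneB", "geneB.1", "MAAP"),
]
selected = keep_longest_per_gene(example_records)
print(selected["geneA"][0])  # geneA.2
```

Ties are resolved in favour of the first isoform read, which is an arbitrary choice made for this sketch.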

Phylogenetic tree reconstruction

Protein sequences were aligned using MAFFT version 6.717b (Katoh & Toh, 2008). To improve alignments, genes were manually curated when necessary. Maximum-likelihood phylogenetic analysis was performed using PhyML version 3.0 (Guindon et al., 2010) under the LG evolution model (Le & Gascuel, 2008). Tree topology was reconstructed using the best of nearest neighbour interchange (NNI) and subtree pruning and regrafting (SPR) methods. Branch supports were estimated using an approximate likelihood ratio test with a Shimodaira–Hasegawa-like procedure (Guindon et al., 2010). Phylogenetic trees were visualized with FigTree v.1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/).

Identification of gene duplication modes

Duplicated copies of genes were identified by an all-by-all comparison of M. acuminata predicted proteins using BLASTP (E-value cut-off of 1e−10) and the five best nonself protein matches were selected. WGD gene pairs were identified based on Musa ancestral blocks available at http://banana-genome.cirad.fr/dotplot (D'Hont et al., 2012) and in the Plant Genome Duplication Database (http://chibba.agtec.uga.edu/duplication/index/downloads; Lee et al., 2013). Additional small paralogous relationships that could correspond to potential segmental duplications were detected using SynMap (http://genomevolution.org/CoGe/SynMap.pl) with a 3 : 3 quota-align ratio and default parameters (Tang et al., 2011). Fine analysis of duplicated regions was carried out with SynFind (http://genomevolution.org/CoGe/SynFind.pl) and the most conserved pairs deriving from Musa α WGD were identified using SynMap with a quota-align ratio of 1 : 1 (Tang et al., 2011). Tandem or proximal duplications were considered when two duplicated genes were consecutive in the genome or separated by 20 or fewer gene loci, respectively. A χ2 test was used to identify retention bias for ethylene pathway gene families compared with genome-wide retention. For genes from other plant species, gene duplication modes were identified using published WGD data (Bowers et al., 2003; Tang et al., 2010; Schnable et al., 2011b; The Tomato Genome Consortium, 2012) and the approach described for banana. Duplication modes for ACS, EBF, EIL and ERF banana gene families were visualized with Circos (Krzywinski et al., 2009).
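The tandem/proximal rule stated above can be sketched in a few lines. The sketch assumes that each gene is described by its chromosome and its rank in the ordered list of gene loci on that chromosome; this representation and the function name are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the tandem/proximal rule stated in the text.
# Assumption: each gene is given as (chromosome, rank), where rank is its
# 0-based position in the ordered list of gene loci on that chromosome.

MAX_INTERVENING_LOCI = 20  # threshold given in the text


def classify_local_duplication(gene_a, gene_b, max_intervening=MAX_INTERVENING_LOCI):
    """Classify a duplicated gene pair as 'tandem', 'proximal' or 'other'."""
    chrom_a, rank_a = gene_a
    chrom_b, rank_b = gene_b
    if chrom_a != chrom_b or rank_a == rank_b:
        return "other"
    intervening = abs(rank_a - rank_b) - 1  # loci lying between the two genes
    if intervening == 0:
        return "tandem"    # consecutive genes
    if intervening <= max_intervening:
        return "proximal"  # separated by 1 to 20 loci
    return "other"


print(classify_local_duplication(("chr1", 10), ("chr1", 11)))  # tandem
print(classify_local_duplication(("chr1", 10), ("chr1", 31)))  # proximal
print(classify_local_duplication(("chr1", 10), ("chr1", 32)))  # other
```

Pairs classified as "other" here would still need to be checked against the WGD and segmental blocks described above.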

Gene structure and molecular evolution analyses

Exon/intron structures of EIL and EBF genes were retrieved from http://www.arabidopsis.org/ (TAIR10), http://rice.plantbiology.msu.edu/ (MSU7 version) and http://banana-genome.cirad.fr/ (Gaze version 1) and were manually curated if necessary. Protein domains were identified using InterProScan (Quevillon et al., 2005) and published data (Gagne et al., 2004). For molecular evolution analysis, coding sequence alignments were guided by protein sequence alignments using PAL2NAL (Suyama et al., 2006). To estimate variation in selective pressure for EBF and EIL gene families, branch models of the CODEML program in PAML (Yang et al., 2007) were used to estimate ω (= dN/dS), the ratio of nonsynonymous (dN) to synonymous (dS) substitution rates, under two different assumptions. The model assuming a single ω for all branches (the one-ratio model: M0) was compared to the free-ratio model M2 which assumes an independent ω for each selected lineage. A likelihood-ratio test was used to compare the fits of the two models (Yang, 1998).
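For reference, the quantities compared in this test can be written out explicitly. The notation is introduced here for illustration only: ℓ0 and ℓ1 denote the maximized log-likelihoods of the one-ratio and free-ratio models, and p0 and p1 their numbers of free parameters. The sketch assumes the usual form of the likelihood-ratio test, in which the statistic is compared with a χ2 distribution.

```latex
\omega = \frac{d_N}{d_S}, \qquad
2\Delta\ell = 2\,(\ell_{1} - \ell_{0}) \;\sim\; \chi^{2}_{\,p_{1}-p_{0}}
\quad \text{(approximately, under the one-ratio null model)}
```

Under this convention, ω < 1 indicates purifying selection, ω = 1 neutral evolution and ω > 1 positive selection.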

Plant material for gene expression analysis

Banana (Musa acuminata Colla) fruits of the Cavendish cultivar grown at a banana farm in Guadeloupe were harvested at the immature green, early and late mature green developmental stages (40, 60 and 90 d after flowering (DAF), respectively; Mbéguié-A-Mbéguié et al., 2008). After harvest, one fruit per bunch was sampled (T0 condition) and all other harvested fruits were kept for 24 h at 20°C in chambers ventilated with humidified air. Half of the harvested fruits were treated for 24 h with 10 000 ppm of acetylene, an ethylene analogue, followed by storage in ventilated chambers with humidified air at 20°C. One fruit per bunch and per condition was sampled at 2 d (T3 condition) and 4 d (T5 condition) after treatment. The physiological state of fruits was monitored by measuring colour change and extent of softening as in Mbéguié-A-Mbéguié et al. (2008). Pulp and peel tissues corresponding to the median part of the fruit were separately frozen in liquid nitrogen and stored at −80°C.

RNA extraction and quantitative reverse transcription–polymerase chain reaction (qRT-PCR) analysis

Total RNA was extracted from 600 mg of fruit tissue using a TE3D/MATAB protocol (Argout et al., 2008) followed by a lithium chloride (2 M) precipitation step. RNA was treated with RQ1 DNase (Promega, Madison, WI, USA) and purified using the RNeasy® MinElute™ Cleanup Kit (Qiagen, Hilden, Germany). The quantity and quality of RNA were analysed using agarose gel electrophoresis and with Agilent Bioanalyzer 2100 and RNA 6000 Nano LabChips (Agilent Technologies, Waldbronn, Germany). First-strand cDNA was synthesized from 1 μg of RNA using SuperScript® III reverse transcriptase (Invitrogen, Carlsbad, CA, USA). Primers were designed using the Primer Blast and Primer Designer tools (Droc et al., 2009; http://banana-genome.cirad.fr/) and are listed in Table S4. Primer specificities were confirmed by amplicon sequencing and melting-curve analysis. The qRT-PCR experiments (see Methods S2) were performed in duplicate for four biological replicates per condition in 384-well plates using a Light Cycler® 480 system (Roche Applied Sciences, Basel, Switzerland). Normalized transcript abundances ($A = E_{\mathrm{target}}^{-Cp_{\mathrm{target}}} / E_{\mathrm{reference}}^{-Cp_{\mathrm{reference}}}$, where E is the amplification efficiency and Cp the crossing point) were calculated using LightCycler® 480 SW software version 1.5 and MaActin2 (GSMUA_Achr1G05990_001) as a reference gene. Statistical analysis was performed using an ANOVA after a logarithmic transformation of raw data followed by Tukey's test.
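The efficiency-corrected normalization described above can be illustrated with a short calculation. This is a sketch only: the function name is hypothetical, and the efficiencies and crossing points are invented values (E = 2 corresponds to a perfect doubling per cycle), not measurements from this study.

```python
import math


def normalized_abundance(e_target, cp_target, e_reference, cp_reference):
    """Return A = E_target^(-Cp_target) / E_reference^(-Cp_reference)."""
    return e_target ** (-cp_target) / e_reference ** (-cp_reference)


# Hypothetical example: both amplicons amplify with perfect efficiency (E = 2);
# the target crosses the threshold five cycles later than the reference gene.
abundance = normalized_abundance(2.0, 25.0, 2.0, 20.0)
print(abundance)             # 0.03125
print(math.log2(abundance))  # -5.0
```

With equal efficiencies of 2, the log2 of A reduces to the difference in crossing points between the reference and the target, which is why a log transformation is convenient before statistical testing.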


Expansion of banana gene families involved in the ethylene pathway

Members of 10 gene families involved in the core ethylene biosynthesis and signalling pathway (Fig. S1) were identified using the predicted proteomes of 12 plant species (Table S1) including M. acuminata and representatives of monocots (rice (Oryza sativa), Brachypodium distachyon, sorghum (Sorghum bicolor), maize (Zea mays) and date palm (Phoenix dactylifera)) and eudicots (A. thaliana, Thellungiella parvula, grapevine (Vitis vinifera), tomato (Solanum lycopersicum), peach (Prunus persica) and woodland strawberry (Fragaria vesca)). For each gene family, the total number of genes was compared (Table 1). Banana had more ACS genes than the Poaceae species (P = 0.007). The 11 banana ACS genes encode proteins belonging to Type I, Type II and Type III ACS as defined in A. thaliana and tomato (Yamagami et al., 2003; Yoshida et al., 2005; Fig. S2). Banana and maize showed the highest ACO gene numbers, with 12 members. Banana and tomato had three CTR1-like genes compared with one or two in other species. They also had five and six EBF genes, respectively, whereas other species had two (except for maize with four EBF genes). In addition, banana gene families involved in transcriptional regulation of ethylene signalling were significantly expanded, with 17 EIL members (P < 0.001) and 122 ERF members (P < 0.001; Fig. S3). Thus, six out of 10 ethylene pathway gene families (ACS, ACO, CTR1-like, EBF, EIL and ERF) showed high gene numbers in banana with a particular expansion of EIL and ERF transcription factors compared with other species. The genes encoding ethylene receptors and RAN1-like and RTE1 homologue (RTH) proteins were not particularly expanded in banana.

Table 1. Gene numbers of ethylene pathway gene families in 12 plant genomes

             Plant species
Gene family  Ma     Os     Bd     Sb     Zm     Pd     Tp     At     Vv     Sl     Pp     Fv
ACS          11     5      3      3      4      5      9      10     7      11     6      8
ACO          12     8      6      8      12     9      5      5      3      7      5      5
RTH          2      3      2      3      4      1      2      2      2      3      2      2
EBF          5      2      2      2      4      2      2      2      2      6      2      2
EIL          17     7      6      6      9      6      6      6      4      9      4      5
ERF          122    82(a)  NA     53(b)  84(c)  NA     NA     65(a)  82(d)  68(e)  59(f)  NA

Ma, Musa acuminata; Os, Oryza sativa; Bd, Brachypodium distachyon; Sb, Sorghum bicolor; Zm, Zea mays; Pd, Phoenix dactylifera; Tp, Thellungiella parvula; At, Arabidopsis thaliana; Vv, Vitis vinifera; Sl, Solanum lycopersicum; Pp, Prunus persica; Fv, Fragaria vesca; NA, not available; ACS, 1-aminocyclopropane-1-carboxylate synthase; ACO, 1-aminocyclopropane-1-carboxylate oxidase; ERS, ethylene response sensor; ETR, ethylene response; EIN, ethylene-insensitive; RAN, response-to-antagonist; RTH, RTE1 homologue; CTR, Ser/Thr kinase constitutive triple response; EBF, ethylene-insensitive 3-binding F-box; EIL, ethylene-insensitive 3-like; ERF, ethylene response factor.
(a) Nakano et al. (2006).
(b) Yan et al. (2013).
(c) Zhuang et al. (2010).
(d) Licausi et al. (2010).
(e) Sharma et al. (2010).
(f) Zhang et al. (2012).

The expansion of four banana gene families of the ethylene pathway results from preferential retention after whole-genome duplications

To elucidate the origin of banana gene family expansions, gene duplications were classified into four modes: WGDs, potential segmental duplications, tandem duplications and proximal duplications (Table 2). A total of 28 317 out of 36 542 predicted protein-coding genes in banana were found to be duplicates. The main duplication mode was WGD, with 40% of banana genes involved in WGD gene pairs. Potential segmental duplicates corresponded to 7% of banana genes, whereas tandem and proximal duplications involved 3.4% and 4.2% of banana genes, respectively.

Table 2. Modes of duplication of banana genes

Number of genes involved in different duplication modes (a)

              Duplication  WGD     Segmental  Tandem  Proximal  Unknown (c)  Unique (d)
              modes (b)
Genome-wide                14 771  2717       1258    1537      10 728       8225
Gene family
ACS           10           9       3          0       0         1            0
ACO           5            2       0          3       2         7            0
RTH           0            0       0          0       0         2            0
EBF           5            5       0          0       0         0            0
EIL           12           12      6          0       0         5            0
ERF           92           91      4          4       0         30           0

ACS, 1-aminocyclopropane-1-carboxylate synthase; ACO, 1-aminocyclopropane-1-carboxylate oxidase; ERS, ethylene response sensor; ETR, ethylene response; EIN, ethylene-insensitive; RAN, response-to-antagonist; RTH, RTE1 homologue; CTR, Ser/Thr kinase constitutive triple response; EBF, ethylene-insensitive 3-binding F-box; EIL, ethylene-insensitive 3-like; ERF, ethylene response factor; WGD, whole-genome duplication.
(a) The same gene can be involved in different types of duplication.
(b) Number of genes duplicated by at least one identified duplication mode.
(c) Number of genes involved only in unknown duplications.
(d) Number of genes not duplicated (singleton status).

Duplication modes were identified for the majority (74%) of the 184 genes in the ethylene gene families (Table 2). For seven of the 10 gene families (ACS, ERS/ETR/EIN4-like, RAN1-like, EIN2-like, EIL, EBF and ERF), nearly all duplicates originated from WGD events, including all RAN1-like, EIN2-like and EBF genes (Table 2, Figs 1, S2, S4–S6). For the three remaining families (ACO, CTR1-like and RTH), it was not possible to identify a major duplication mode (Table 2, Figs S7–S9). In addition, it was not possible to infer WGD relationships for genes not anchored to the M. acuminata chromosomes (e.g. MaACS8 and MaEIL16).

Figure 1.

Duplication modes for 1-aminocyclopropane-1-carboxylate synthase (ACS), ethylene-insensitive 3-binding F-box (EBF), ethylene-insensitive 3-like (EIL) and ethylene response factor (ERF) banana gene families visualized with Circos. Genes are located on the 11 DH-Pahang chromosomes and on Musa ancestral blocks coloured as previously described in D'Hont et al. (2012). Gene duplication through whole-genome duplications (WGDs), potential segmental duplications or tandem/proximal duplications are indicated in blue, green and red, respectively. Genes present on scaffolds that are not anchored to the 11 M. acuminata chromosomes are not represented here (e.g. MaACS8 and MaEIL16).

Among the 10 ACS genes anchored on M. acuminata chromosomes, nine showed WGD gene pair relationships (Table 2) and originated from five different Musa ancestral blocks as defined in D'Hont et al. (2012) (Fig. 1). Relationships between MaACS4, MaACS2 and MaACS3 involved ancestral blocks G1 (in dark blue; Fig. 1) and G10 (in beige; Fig. 1). Further investigations using the Plant Genome Duplication Database detected additional paralogous relationships around MaACS4 on G1 and MaACS2 and MaACS3 on G10, suggesting that the three genes are present on previously undetected paralogous regions from the same ancestral block. In addition, two potential segmental gene pairs were found involving three ACS genes (MaACS4, MaACS6 and MaACS7) (Table 2, Figs 1, S2). All EBF genes (MaEBF1–5) were related to each other by WGD relationships and probably originated from a unique Zingiberales ancestral gene (Fig. 1). Among the 16 EIL genes anchored on M. acuminata chromosomes, 12 were involved in 23 WGD gene pair relationships and resulted from duplication of three ancestral blocks; six of them were additionally involved in three potential segmental gene pairs (Table 2, Fig. 1). Finally, 101 ERF genes were found distributed on the 12 Musa ancestral blocks and, among the 90 identified ERF gene pairs, 86 showed WGD relationships involving 91 different ERF genes and only two were tandem gene pairs (Table 2, Fig. 1). The ACS, EBF, EIL and ERF gene families showed significant preferential retention of their members after WGD (P = 0.005, P = 0.007, P = 0.011 and P < 0.001, respectively), indicating that their expansions are attributable to duplicate retention after the three banana WGD rounds. Thus, ethylene pathway banana gene families evolved mostly through WGD, which is the main duplication mode in the banana genome.
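One way to read the retention test is as a χ2 goodness-of-fit comparison between the number of family members found in WGD pairs and the number expected from the genome-wide WGD fraction. The sketch below follows that reading, which is an assumption: the exact procedure is not detailed here, and the function name is hypothetical. Counts are taken from the text and Table 2 (36 542 predicted genes, 14 771 genes in WGD pairs). Under these assumptions the resulting P values should be of the same order as those reported above, but with expected counts this small the χ2 approximation is rough.

```python
import math

GENOME_GENES = 36542  # predicted protein-coding genes in banana (from the text)
GENOME_WGD = 14771    # genes involved in WGD pairs genome-wide (Table 2)


def chi2_retention_test(n_family, n_family_wgd,
                        genome_genes=GENOME_GENES, genome_wgd=GENOME_WGD):
    """Chi-square goodness of fit (1 df) of a family's WGD count against the
    genome-wide WGD fraction. Returns (chi2, p_value)."""
    p_wgd = genome_wgd / genome_genes
    expected_wgd = n_family * p_wgd
    expected_other = n_family * (1.0 - p_wgd)
    observed_other = n_family - n_family_wgd
    chi2 = ((n_family_wgd - expected_wgd) ** 2 / expected_wgd
            + (observed_other - expected_other) ** 2 / expected_other)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Family sizes and WGD counts as given in Table 2 (duplicated + unknown, WGD).
for name, total, wgd in [("ACS", 11, 9), ("EBF", 5, 5), ("EIL", 17, 12)]:
    chi2, p = chi2_retention_test(total, wgd)
    print(f"{name}: chi2 = {chi2:.2f}, P = {p:.3f}")
```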

Parallel acquisition of ultra-paralogues for EIL and EBF genes in banana

The EIL transcription factors are central positive regulators of ethylene responses and their regulation by EBF proteins is a key step of ethylene signalling control (Chao et al., 1997; Potuschak et al., 2003; Gagne et al., 2004; Binder et al., 2007). Both families were over-retained after WGD in banana, raising the question of their phylogenetic evolution. The maximum likelihood phylogenetic tree of EIL proteins (85 sequences) showed three main groups which probably originated before the eudicot/monocot divergence as they all contained representatives from analysed species (Fig. 2). The largest group (group I; 42 members) corresponded to homologues of AtEIN3 and AtEIL1, two A. thaliana genes that have been shown to be necessary and sufficient for activation of ethylene-response genes (Chao et al., 1997). Group I comprised the majority of banana EIL family members (13 out of 17) and also genes encoding functional EIL proteins from tomato (SlEIL1–4; Tieman et al., 2001) and rice (OsEIL1; Mao et al., 2006). These banana genes were subdivided into two subgroups: I-m1, which comprises orthologous genes from all analysed monocots, and I-m2, which only contains banana and date palm genes. Based on analysis of reciprocal gene pair duplication relationships, ancestral block relationships and the tree topology, the nine MaEIL genes of the I-m1 subgroup were retained after three Musa WGDs and a potential ancestral segmental duplication (Figs 1, 2). Global expansion of the EIL gene family in banana can be explained by preferential retention of duplicated genes of the I-m1 subgroup. Other species had two to four gene family members in group I, originating from WGD events but also from gene-scale duplications (Fig. 2). The EIL group II (Fig. 2) comprised 22 members with three WGD-derived banana genes and corresponded to homologues of AtEIL3, a transcriptional regulator of plant sulfur metabolism (Maruyama-Nakashita et al., 2006). 
Members of EIL group II might thus have functions other than in ethylene signalling. Finally, one banana sequence (MaEIL17) grouped together with homologues of AtEIL4 and AtEIL5 within EIL group III (21 members; Fig. 2).

Figure 2.

Maximum likelihood phylogenetic analysis of the ethylene-insensitive 3-like (EIL) family. The maximum likelihood cladogram is rooted using a midpoint branch. Branches are coloured according to gene duplication modes: whole-genome duplications (WGDs) (blue), segmental duplications (green), tandem/proximal duplications (red) and unknown (grey). Black branches correspond to speciation events. Branch supports >0.70 (approximate likelihood ratio test statistics with a Shimodaira–Hasegawa-like procedure) are indicated. EIL groups and subgroups are indicated on the right. Thellungiella parvula (Tp), grapevine (GSVIV), tomato (Solyc), woodland strawberry (Fv), A. thaliana (At), peach (Ppa), Brachypodium distachyon (Bradi), rice (Os), maize (GRMZM or AC), sorghum (Sb), date palm (DP) and banana (GSMUA) identifiers are indicated (in brown for banana). Stars indicate previously described banana EIL sequences (Mbéguié-A-Mbéguié et al., 2008). WGD events are indicated with Greek symbols or a T for tomato triplication.

The phylogenetic tree of EBF proteins showed a simple topology that mirrored angiosperm evolution, suggesting that, within monocots and eudicots, all EBF genes evolved from a single ancestral sequence (Fig. 3a). Two orthologous eudicot EBF subgroups were identified. They probably originated from the γ eudicot paleoploidy event and corresponded to homologues of the tomato SlEBF1 and SlEBF2 functional genes (Yang et al., 2010). Tomato EBF genes were further duplicated through the T Solanaceae WGD or other unknown duplication modes. In A. thaliana, the two functional AtEBF1 and AtEBF2 genes (Binder et al., 2007) are orthologous to SlEBF2 and resulted from the Arabidopsis β WGD (Bowers et al., 2003). Within Poaceae, the ancestral gene was duplicated by the ρ WGD event and duplicated copies were retained in each grass species, as for rice OsEBF1 and OsEBF2 (Rzewuski & Sauter, 2008). In banana, the five genes were present on different chromosomal regions showing synteny (Fig. 3b). Further investigations of banana paralogous relationships (see Materials and Methods) identified three pairs of paralogous EBF regions deriving from the most recent α WGD: five of these six regions bear an EBF gene, and the sixth has lost its EBF gene (Fig. 3b). As eight regions were expected after the three WGDs, two additional α segments were lost or substantially rearranged and could not be identified. Based on these results and our tree topology, a possible scenario of banana EBF gene evolution is depicted in Fig. 3(b), where the five EBF genes derive from one γ ancestral region that was further duplicated through the β and α Musa WGDs.

Figure 3.

Expansion of the banana ethylene-insensitive 3-binding F-box (EBF) family through whole-genome duplications (WGDs). (a) Maximum likelihood cladogram of the EBF family rooted using the eudicots–monocots speciation branch. Branch supports > 0.70 (approximate likelihood ratio test statistics with a Shimodaira–Hasegawa-like procedure) are indicated. Branches are coloured according to gene duplication modes: WGD (blue) and unknown (grey). Black branches correspond to speciation events. Thellungiella parvula (Tp), grapevine (GSVIV), tomato (Solyc), woodland strawberry (Fv), A. thaliana (At), peach (Ppa), Brachypodium distachyon (Bradi), rice (Os), maize (GRMZM), sorghum (Sb), date palm (DP) and banana (GSMUA) identifiers are indicated (in brown for banana). WGD events are indicated with Greek symbols or a T for tomato triplication. Stars indicate previously described banana EBF proteins (Kuang et al., 2013). (b) Microsynteny analysis of banana EBF paralogous segments resulting from the α, β and γ banana WGD events. Boxes of the same colour indicate syntenic paralogous genes and white boxes represent genes without a syntenic relationship in these regions.

Thus, WGD represents the main driving force for EBF family evolution, whereas both WGD and gene-scale duplications explain the EIL family evolution. In addition, banana EBF genes and EIL genes of the I-m1 subgroup showed a similar evolutionary pattern with an expansion through large-scale duplications after speciation, resulting in parallel acquisition of ultraparalogues (i.e. genes resulting from lineage-specific duplications; Zmasek & Eddy, 2002).

Purifying selection and structure conservation of banana EBF and EIL genes after WGD

Analyses of exon/intron structures showed that the 17 EIL banana genes have a monoexonic structure similar to that of EIL genes of A. thaliana (except for AtEIL3) and rice. They encode proteins that vary in size but have a conserved EIN3 domain (417–655 amino acids; Fig. S10a). All banana EBF genes have a conserved structure of two exons and one intron, similar to A. thaliana and rice genes (Fig. S10b). The F-box domain (IPR001810) and the leucine-rich repeats are also well conserved within and between species. This structure conservation suggests that banana EIL and EBF genes are functional. To infer the type of selection pressure acting on these genes, we used maximum likelihood codon models in PAML. Using the null model (M0), the ω (= dN/dS) values were < 1 for EIL (ω = 0.1639) and EBF genes (ω = 0.2524), indicating purifying selection (Table S5). We looked for variation of selective pressure between banana EIL groups and subgroups by testing the fits of two models: the one-ratio model M0 and the free-ratio model M2, where different values of ω are assumed for the different branches (Table S5). The EIL group III comprising only one gene (MaEIL17) is under strong purifying selection (ω4 = 0.0428). MaEIL17 could have an essential function that would explain this evolutionary pattern. The three remaining EIL groups are also under purifying selection, with different intensities of selection pressure (Table S5). For the EBF gene family, the branch bearing the WGD gene pair MaEBF1 and MaEBF3 seems more constrained than the other branches (ω3 = 0.1947; Table S5). Thus, banana EIL and EBF genes are under purifying selection and selective constraint varies significantly within the two families, particularly between EIL groups (P < 0.00001 and P = 0.014125, respectively).

Gene expression profiles in banana fruits revealed functional redundancy and subfunctionalization for the ethylene pathway genes retained after WGD

We analysed the expression profiles of the 10 gene families in banana fruits using previously published RNA-Seq data (D'Hont et al., 2012; Methods S3, Table S6). These showed that several ACO genes are expressed in the fruits, with one gene showing higher expression levels (MaACO1; Fig. 4a). By contrast, only one member of the ACS gene family (MaACS1) belonging to Type 1 ACC synthases (Fig. S2) was expressed in our experimental setting. Five out of seven ERS genes were found to be expressed in the six analysed libraries (Fig. 4a), in addition to all members of the RAN1-like, RTH, CTR1-like, EIN2-like and EBF gene families. All MaEIL genes of the I-m1 group were expressed, albeit at different levels, and two out of three genes from group II were also expressed, whereas no expression was detected for group I-m2 and group III genes (Fig. 4a). The ERF genes expressed in the fruits (≥ 10 RPKM; Fig. S11, Methods S3) belonged mainly to ERF groups VII (seven genes), VIII (11 genes) and IX (four genes).
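For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) scales a raw read count by gene length and library size. The Python sketch below shows the standard formula and the ≥ 10 RPKM filter mentioned above. The gene identifiers, read counts, gene lengths and library size are invented for illustration and do not come from the data set analysed here.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical library: gene -> (mapped reads, gene length in bp).
library_size = 25_000_000
counts = {
    "geneA": (500, 2000),
    "geneB": (40, 1500),
    "geneC": (12000, 3000),
}

expression = {
    gene: rpkm(reads, length, library_size)
    for gene, (reads, length) in counts.items()
}

# Keep genes at or above the 10 RPKM threshold used for the ERF family.
expressed = sorted(gene for gene, value in expression.items() if value >= 10)
print(expressed)
```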

Figure 4.

Expression profiles of the ethylene pathway genes in banana fruits. (a) Heatmap visualization of expression levels from RNA-Seq data. The six libraries correspond to banana fruits harvested at 40, 60 and 90 d after flowering (DAF) and not treated (A, B, C) or treated for 24 h with acetylene (D, E, F). The RNA-Seq analysis was carried out on RNA extracted 4 d after acetylene treatment. Transcript abundance was normalized in RPKM and indicated with a rainbow colour scale from blue (1 RPKM) to red (> 100 RPKM). (b) Heatmap visualization of relative mRNA abundances from qRT-PCR data. The libraries correspond to cDNA samples from separated pulp and peel tissues of banana fruits. 40T0, 60T0 and 90T0 correspond to banana fruits harvested at 40, 60 and 90 DAF, respectively. T3 and T5 correspond to fruits treated for 24 h with acetylene (Eth.) or not treated (C.) and stored for 2 or 4 additional days, respectively. Relative transcript abundance was normalized with the banana Actin 2 reference gene and transformed in log2. Gene expression levels are indicated with a rainbow colour scale from blue (very weakly expressed) to red (very strongly expressed) and white boxes correspond to genes without detected expression. Stars indicate previously described banana genes.

We then used qRT-PCR for a targeted analysis of EIL and EBF gene expression profiles in two fruit tissues: pulp sampled directly after harvest at 40, 60 and 90 DAF (40T0, 60T0 and 90T0, respectively), and pulp and peel harvested at 90 DAF (ethylene-responsive stage) and analysed 2 d (T3) and 4 d (T5) after 24 h of acetylene treatment (Fig. 4b, Table 3). The previously known ethylene- and ripening-induced genes, MaACO1 and MaACS1 (Liu et al., 1999; Choudhury et al., 2008), were used as controls for the response to acetylene. As expected, MaACO1 showed a significant rise of expression in the pulp of late mature green fruits (90 DAF; up to 56-fold; P < 0.001) and was induced by acetylene in the pulp (4.5–7-fold; P < 0.001) and particularly in the peel (118–189-fold; P < 0.001). The MaACS1 gene was also induced by acetylene in both pulp and peel (P < 0.001).

Table 3. Differential expression of ethylene-insensitive 3-like (EIL) and ethylene-insensitive 3-binding F-box (EBF) banana genes in banana pulp and peel

Gene name | Developmental stage, pulp: log2(FC) 6T0, 9T0; P-value | Ethylene treatment, pulp: log2(FC) T3E, T5E; P-value | Ethylene treatment, peel: log2(FC) T3E, T5E; P-value
ACS1 | n.s. | +inf, +inf; < 0.0001 | +inf, +inf; < 0.0001
ACO1 | 1.34, 5.82; < 0.0001 | 2.19, 2.80; < 0.0001 | 7.56, 6.88; < 0.0001
EIL1 | n.s. | −1.90, −1.39; < 0.0001 | n.s.
EIL3 | −1.29; < 0.0001 | −1.51, −1.09; < 0.0001 | n.s.
EIL4 | n.s. | −1.20; < 0.0001 | n.s.
EIL5 | 1.10; < 0.0001 | −2.85, −1.9; < 0.0001 | n.s.
EIL6 | −1.34; 0.007 | −3.33, −1.94; < 0.0001 | −1.36; 0.001
EIL7 | n.s. | −2.06; 0.002 | n.s.
EIL9 | −1.16; < 0.0001 | 1.39, 2.79; < 0.0001 | 1.12; 0.001
EIL11 | n.s. | −1.45, −1.48; < 0.001 | n.s.
EBF1 | n.s. | −1.72; 0.011 | n.s.
EBF3 | n.s. | 2.63, 3.50; < 0.0001 | 2.20, 2.06; < 0.0001
EBF4 | n.s. | 1.81; 0.023 | 1.37, 1.49; < 0.0001
EBF5 | n.s. | −1.06; 0.006 | −1.04; < 0.001

ACS, 1-aminocyclopropane-1-carboxylate synthase; ACO, 1-aminocyclopropane-1-carboxylate oxidase; EBF, ethylene-insensitive 3-binding F-box; EIL, ethylene-insensitive 3-like; log2(FC), log2 expression fold change between tested condition and control; 6T0 and 9T0, fruits at 60 and 90 d after flowering (DAF), respectively, versus fruits at 40 DAF; T3E and T5E, acetylene-treated 90 DAF fruits at 2 and 4 d after 24 h of acetylene treatment, respectively, versus untreated controls; n.s., not significant (thresholds: 2-fold change and/or P = 0.05); +inf, gene not expressed in untreated controls. Only differentially expressed genes are represented.
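The log2(FC) values of Table 3 and the fold changes quoted in the text are related by FC = 2^log2(FC), with negative values indicating down-regulation by a factor of 2^|log2(FC)|. The short Python sketch below makes the conversion explicit for a few Table 3 entries; the helper names are ours and serve only as illustration.

```python
def fold_change(log2_fc):
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** log2_fc


def describe(log2_fc):
    """Express a log2 fold change as an up- or down-regulation factor."""
    direction = "up" if log2_fc > 0 else "down"
    return f"{fold_change(abs(log2_fc)):.1f}-fold {direction}"


# Selected log2(FC) values taken from Table 3.
table3_values = {
    "ACO1, pulp, 9T0": 5.82,
    "ACO1, peel, T3E": 7.56,
    "ACO1, peel, T5E": 6.88,
    "EBF3, pulp, T3E": 2.63,
    "EIL11, pulp, T3E": -1.45,
}

for label, value in table3_values.items():
    print(f"{label}: log2(FC) = {value:+.2f} -> {describe(value)}")
```

For example, the MaACO1 value of 5.82 at 9T0 corresponds to the roughly 56-fold increase reported for the pulp of 90 DAF fruits.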

Among the nine EIL genes from group I-m1, two (MaEIL8 and MaEIL10) were not detected by qRT-PCR (Fig. 4b), in accordance with their very weak RNA-Seq expression levels (Fig. 4a). The remaining seven genes showed mostly stable expression patterns in banana fruits, with weak regulation of expression at the late mature green stage (90T0) for four of them (MaEIL3, MaEIL5, MaEIL6 and MaEIL9). The seven MaEIL genes were mostly down-regulated by acetylene in banana pulp (Table 3), except for MaEIL9, which was up-regulated (2.6–7-fold; P < 0.0001). By contrast, their expression in banana peels remained quite stable. Expression of the group II EIL genes MaEIL2 and MaEIL11, although weaker than that of I-m1 genes, was detected, and MaEIL11 showed slightly reduced expression in acetylene-treated banana pulp (2.7-fold; P < 0.001). Based on these results, we propose that the regulation of ethylene signalling in banana fruits is mainly under the control of seven ultraparalogous EIL genes belonging to the I-m1 group (MaEIL1, MaEIL3, MaEIL4, MaEIL5, MaEIL6, MaEIL7 and MaEIL9). These WGD-derived genes have redundant expression patterns, except for MaEIL9, which shows indications of subfunctionalization.

The five EBF genes were expressed at the three developmental stages and showed distinct profiles after acetylene treatment. MaEBF3 was strongly induced by acetylene in the pulp (6–11-fold; P < 0.0001) and also in peel tissues (4-fold; P < 0.0001), whereas MaEBF4 was slightly induced in both tissues (3-fold; P < 0.0001). MaEBF1 and MaEBF5 were either not regulated or slightly down-regulated by acetylene (3- and 2-fold, respectively; Table 3), whereas no differential expression was found for MaEBF2. Thus, EBF duplicates resulting from Musa WGD have different expression patterns in banana fruits, suggesting subfunctionalization after duplication.


Lineage-specific WGDs are responsible for the expansion of ethylene pathway genes in banana

We evaluated the contribution of four gene duplication modes (i.e. WGD and segmental, tandem and proximal duplications) to gene family evolution in banana. Identification of duplicate genes that are retained after WGD can be challenging (Van de Peer, 2004). Loss of synteny after fractionation and many rearrangements of the banana genome after three WGDs (D'Hont et al., 2012) can complicate the detection of gene pairs and make it difficult to distinguish potential segmental duplications from WGDs. Nevertheless, we have found that 77% of banana genes are duplicates and that about half of these (40% of banana genes) result from WGD, indicating that WGD is the main source of duplicate genes in banana. Gene-scale duplications involved 7.6% of banana genes, a proportion quite similar to that found in rice and A. thaliana, albeit in the lower range (Rizzon et al., 2006; Wang et al., 2011). Approximately 29% of banana genes are involved in unknown types of duplication. Distant single-gene transposition events may explain part of these duplications of unknown origin (Cusack & Wolfe, 2007).
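The distinction between the two gene-scale duplication modes can be made concrete with a position-based rule: tandem duplicates are adjacent on a chromosome, whereas proximal duplicates are separated by a few intervening genes. The Python sketch below illustrates this rule only. It is not the pipeline used in this study, the maximum number of intervening genes (max_proximal_gap) is an assumed parameter, the gene positions are invented, and WGD or segmental duplicates, which require synteny analysis, are simply left in an "other" class.

```python
def classify_pair(pos_a, pos_b, max_proximal_gap=10):
    """Classify a homologous gene pair from its genomic positions.

    Each position is a (chromosome, gene_rank) tuple, where gene_rank is the
    ordinal position of the gene along the chromosome. max_proximal_gap is an
    assumed cut-off on the number of intervening genes.
    """
    if pos_a == pos_b:
        raise ValueError("a gene cannot be paired with itself")
    chrom_a, rank_a = pos_a
    chrom_b, rank_b = pos_b
    if chrom_a != chrom_b:
        return "other"
    intervening = abs(rank_a - rank_b) - 1
    if intervening == 0:
        return "tandem"
    if intervening <= max_proximal_gap:
        return "proximal"
    return "other"


# Hypothetical gene pairs (chromosome names and ranks are made up).
pairs = {
    "pair1": (("chr1", 5), ("chr1", 6)),
    "pair2": (("chr1", 5), ("chr1", 9)),
    "pair3": (("chr1", 5), ("chr4", 6)),
    "pair4": (("chr2", 10), ("chr2", 400)),
}

for name, (a, b) in pairs.items():
    print(name, classify_pair(a, b))
```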

The impact of duplication was investigated throughout the ethylene pathway, revealing that, in banana, most of the gene families of this pathway evolved through WGD as a common evolutionary mechanism. Common modes of duplication were previously found for photosynthetic and Calvin cycle soybean (Glycine max) genes and also for circadian clock genes in Brassica rapa (Coate et al., 2011; Lou et al., 2012). Duplication modes of gene families can be conserved across species, as is the case for the EBF family, which largely evolved through WGD with no identified tandem or proximal duplications. By contrast, EIL genes were duplicated through simple tandem duplication not only in peach and woodland strawberry, which have not undergone recent WGD, but also in rice and tomato, although no large tandem clusters were identified. In addition, although banana ERF transcription factors were amplified through WGD and showed very few tandem duplicates, grapevine has two large clusters of ERF genes (Licausi et al., 2010). Studies at the genome scale have shown that gene family members may have common nonrandom patterns of origin that are conserved in different species (e.g. Wang et al., 2011; Rodgers-Melnick et al., 2012). Our findings indicate that duplication modes of gene families are not always strictly conserved across plant species.

Dosage balance and subfunctionalization are the driving forces for over-retention of ethylene gene families in banana

The ACS, EBF, EIL and ERF gene families were preferentially retained after banana WGD, raising the question of the evolutionary forces underlying this retention. The four gene families may interact with each other in different ways. In banana, the genes of the most expanded EIL group, I-m1, are homologues of AtEIN3/AtEIL1 and of tomato and rice genes shown to be functional in ethylene signalling. In the absence of ethylene, AtEIN3 and AtEIL1 are constitutively expressed and their protein products are degraded post-translationally. The regulation of EIN3/EIL1 abundance through SCF-E3 directed ubiquitination using EBF1 and EBF2 proteins as substrate recognition factors is critical for plant development, as overexpression of AtEIN3 and AtEIL1 genes resulted in stunted growth, reduced fertility and early senescence (Chao et al., 1997; Binder et al., 2007; An et al., 2010). In addition, ebf1ebf2 double mutants had elevated EIN3 levels and showed dramatic growth arrest phenotypes (Gagne et al., 2004; Binder et al., 2007). The system is further regulated through induction of AtEBF2 gene expression by EIN3 (Potuschak et al., 2003; Konishi & Yanagisawa, 2008). The banana EIL genes of group I-m1 and EBF genes are under purifying selection without apparent signs of pseudogenization. Several of them were highly expressed, with similar expression profiles in banana fruits, suggesting functional redundancy. In addition, all five EBF genes and seven out of nine MaEIL genes of group I-m1 were found to be highly expressed in pathogen-inoculated and control leaf pieces and roots of banana accessions DH-Pahang and Pahang (http://banana-genome.cirad.fr/transcriptomic), suggesting co-expression in different physiological situations. The co-retention of both EIL and EBF ultraparalogues might have been important to maintain balanced control of EIL protein levels and subsequently of ethylene signalling and its potential detrimental effects. Thus, retention of these two gene families could be explained by the gene balance hypothesis (Veitia, 2004; Birchler & Veitia, 2007). Retention of ERF transcription factors might also be attributable to maintenance of a balance between interacting EIL transcription factors and ERF genes and could be consistent with the extension of the gene balance hypothesis to dosage-sensitive protein–DNA interactions (Birchler & Veitia, 2011; Schnable et al., 2011a). In addition, a dosage balance effect through the interaction of ERF and regulatory sequences of ACS genes for control of ethylene synthesis (Zhang et al., 2009) could explain ACS gene duplicate retention.

Retention of duplicates could also be attributable to subfunctionalization (Force et al., 1999; Lynch & Force, 2000), and half or more of WGD duplicated genes showed differential expression in A. thaliana, rice, soybean and Populus trichocarpa (Blanc & Wolfe, 2004; Rodgers-Melnick et al., 2012; Wang et al., 2012; Roulin et al., 2013). We found that some MaEIL WGD duplicates were either strongly or weakly expressed in banana fruits, and one duplicate (MaEIL9) had a different regulation of expression after acetylene treatment, indicating expression divergence after duplication. Similarly, in tomato, three LeEIL genes were functionally redundant as positive regulators of multiple ethylene responses (Tieman et al., 2001) and a fourth one (LeEIL4) was induced during fruit ripening (Yokotani et al., 2003). In A. thaliana, AtEIN3 and AtEIL1 have overlapping effects in a variety of ethylene responses (Chao et al., 1997; Alonso et al., 2003b; Guo & Ecker, 2003) but they also have functionally distinct roles (An et al., 2010). Thus, in addition to functional overlap, banana EIL genes might also have evolved specific functions, as could be the case for MaEIL9 after ripening induction.

As for banana EBF genes, two were induced by acetylene (MaEBF3 and MaEBF4) whereas three genes were more stable in expression. The A. thaliana AtEBF1 and AtEBF2 genes are duplicates from the β Arabidopsis WGD. They have a synergistic effect but AtEBF2 was more strongly induced than AtEBF1 by ethylene, through EIN3 binding to the AtEBF2 promoter (Binder et al., 2007; Konishi & Yanagisawa, 2008). Mutant analysis suggested that A. thaliana EBF1 is involved in the baseline ubiquitination process of EIN3/EIL1, while EBF2 is important for the removal of excess EIN3/EIL1 after induction of ethylene signalling (Binder et al., 2007). Thus, based on gene expression profiles in the banana fruit, we hypothesize that MaEBF3 and possibly MaEBF4 function in dampening ethylene signalling after treatment with acetylene, whereas the MaEBF1/2/5 genes could be involved in baseline control of EIL proteins. In tomato, the previously identified SlEBF1 and SlEBF2 genes were functionally redundant and essential for fruit ripening (Yang et al., 2010); however, the role of four additional tomato EBF genes that we have identified here remains to be determined. We also found that most banana ACS genes were not expressed or were only weakly expressed in banana fruits, except for MaACS1. Other banana ACS genes have been previously shown to have different expression profiles, such as MaACS2, which was induced by wounding in banana fruits (Liu et al., 1999). Arabidopsis thaliana ACS genes present unique and overlapping expression patterns in different plant tissues or in response to different stresses, suggesting that they control the specificity of the ethylene response (Yamagami et al., 2003; Tsuchisaka & Theologis, 2004). The retention of ACS genes after WGD could be related to tissue- or stress-specific functions of ACS genes. In addition, as ERF genes are also involved in responses to different stresses, their retention in the long term could also be attributable to advantageous properties conferred by some of them.

Based on our observations, we propose that banana ACS, EIL, EBF and ERF genes were over-retained after WGD to maintain a balance between genes involved in the same pathway and that, over evolutionary time, mutations could have been retained in some duplicate genes to promote finer control of ethylene signalling by subfunctionalization (Lynch & Force, 2000). In addition to its role in fruit ripening, ethylene is important for plant development and responses to biotic and abiotic stresses (e.g. Alonso et al., 2003a; Rzewuski & Sauter, 2008; Muday et al., 2012; Vandenbussche et al., 2012). Thus, the co-expansion of ethylene pathway genes in banana might have contributed an advantage in relation to development or response to external stimuli, fostered by subfunctionalization.

The genes involved in the first steps of ethylene perception (i.e. ethylene receptors and RAN1-like and RTH genes) are not expanded in banana. In A. thaliana, RAN1 plays a role in copper delivery to the receptors (Binder et al., 2010) while RTE1 stabilizes the ETR1 receptor conformation, enhancing its ability to repress ethylene signalling (Resnick et al., 2008). A relatively stable gene number could be a consequence of the role of these genes in negative regulation of ethylene signalling, as more receptors could possibly result in reduced sensitivity to ethylene (Ciardi et al., 2000; Klee & Tieman, 2002).

Towards the identification of ethylene pathway genes involved in banana ripening

We identified all members of the key gene families for ethylene biosynthesis and signalling in banana. Our focus on their expression in banana fruits and in response to acetylene treatment allowed identification of those that could play a role in fruit ripening. We have confirmed that MaACS1 and MaACO1 are the key ethylene biosynthesis genes involved in ripening of banana fruits (Domínguez & Vendrell, 1994; Liu et al., 1999). We found that five out of seven genes encoding ethylene receptors were expressed in banana fruits, in addition to all members of the RAN1, RTH and EIN2 gene families; and we identified two CTR1-like genes expressed in banana fruits in addition to the one previously reported (MaCTR1; Hu et al., 2012). We also found that at least 22 ERF genes are expressed in banana fruits. More research is needed to determine whether particular members of these gene families have a specific regulation of gene expression or specific function or if they are functionally redundant. We also identified four EIL genes (MaEIL6, MaEIL7, MaEIL9 and MaEIL11) expressed in the fruits in addition to those described in the literature (Mbéguié-A-Mbéguié et al., 2008). Three of them are members of the I-m1 group, which makes a total of seven MaEIL genes of this group that were expressed in banana fruits. Banana EIL genes of the I-m1 group are therefore the main candidates for regulation of ethylene signalling in banana fruits. Finally, all five banana EBF genes were expressed in the fruits, with different expression profiles, adding three more genes to the previously identified MaEBF1 and MaEBF2 (Kuang et al., 2013). All of them could have a function in fruit ripening but it is likely that MaEBF3 (and possibly MaEBF4) is particularly involved in the control of ethylene signal dampening after ripening induction.

By analysing the impact of treatment with an ethylene antagonist (1-methylcyclopropene (1-MCP)), negative feedback regulation of ethylene biosynthesis was demonstrated in ripening banana fruits (Golding et al., 1998). Further studies showed that ethylene biosynthesis in ripening fruits was negatively controlled in the pulp and positively controlled in the peel (Inaba et al., 2007). We found that the majority of EIL genes were down-regulated in banana pulp after acetylene treatment. We also observed a strong up-regulation of the MaEBF3 gene (and to a lesser extent MaEBF4), which might lead to an increase in MaEBF3 protein synthesis and subsequent EIL protein degradation. These expression profiles are consistent with a dampening of ethylene signalling in banana pulp after ripening initiation. In banana peel, the high expression level of MaACO1, more stable expression of EIL genes and more moderate up-regulation of MaEBF3/MaEBF4 all support positive regulation of ethylene signalling in banana peel during ripening.

By combining a phylogenomics approach at the genome scale with expression analysis, we have increased our knowledge of the banana genes involved in ethylene signalling and biosynthesis. This provides the basis for further studies on the natural ripening process in banana and on initial responses to ethylene in banana fruits. Finally, our findings illustrate the decisive contribution of WGDs to the evolution of genes involved in important physiological processes throughout biological pathways.


This work was supported by Centre de coopération Internationale en Recherche Agronomique pour le Développement (CIRAD). We thank Mr Poumaroux for providing access to his banana plot, Olivier Hubert (CIRAD, Guadeloupe, France) for monitoring banana growth, Corinne DaSilva (Genoscope, Evry, France) for RNA-Seq data normalization, Jean-François Dufayard for phylogenetic expertise and the SouthGreen Bioinformatics Platform (http://southgreen.cirad.fr) for the banana genome hub (http://banana-genome.cirad.fr/) and computational resources.