Macrophylogenetic analyses of the gain and loss of self-incompatibility in the Asteraceae


Author for correspondence: S. Good-Avila Tel: +902 5851798 Fax: +902 5851059 Email:


  • The self-incompatibility (SI) status of 571 taxa from the Asteraceae was identified and the taxa were scored as having SI, partial SI or self-compatibility (SC) as their breeding system. A molecular phylogeny of the internal transcribed spacer (ITS) region was constructed for 211 of these taxa.
  • Macrophylogenetic methods were used to test hypotheses concerning the ancestral state of SI in the Asteraceae, the gain and loss of SI, the irreversibility of the loss of SI and the potential for partial SI or SC to be terminal states.
  • The ancestral breeding system in the family could not be resolved. Both maximum likelihood and parsimony analyses indicated that transitions among all breeding system states provide the best fit to the data and that neither partial SI nor SC is a terminal state. Furthermore, the data indicated that the loss of SI is not irreversible, although breeding system evolution has been more dynamic in some clades than in others.
  • These results are discussed within the context of evidence for the gain and loss of SI, the evolutionary role of partial SI and methodological assumptions of tests of breeding system evolution.


In plants bearing genetic self-incompatibility (SI) systems, self-fertilization and inbreeding are prevented by the gene products of the S-locus which prevent reproduction between individuals sharing one or more SI alleles (de Nettancourt, 2001). Because of the prevalence of hermaphroditic species in the angiosperms and the well-documented deleterious effects of inbreeding, SI is widely believed to have evolved as a strategy to avoid self-fertilization and inbreeding (Richards, 1986; de Nettancourt, 2001). At least 68 angiosperm families have been identified as having some kind of genetic SI system (de Nettancourt, 2001; Silva & Goring, 2001).

In species with both gametophytic and sporophytic SI, the male and female components of the S-locus are linked and transmitted as a single Mendelian character (de Nettancourt, 2001; Silva & Goring, 2001) and are both necessary and sufficient to cause SI in the correct genetic background (Lee et al., 1994; Murfett et al., 1994). This has led researchers to treat SI as a qualitative trait. However, despite the precise action of S-genes, natural populations of plants often show broad variation in the SI response, and modifiers weakening or nullifying SI can be linked or unlinked to the S-locus and controlled by one or multiple genes that may also be influenced by the environment (Levin, 1996; Stephenson et al., 2000; Good-Avila & Stephenson, 2002; Stone, 2002). This means that natural populations of plants may vary from being strictly self-incompatible to showing intra- or interpopulation variation in the strength of SI, known as partial or pseudo self-incompatibility/self-compatibility (PSI hereafter).

After a review of the breeding systems and geographic ranges of many plant species, Stebbins (1974) concluded that the transition from SI to predominant self-fertilization was one of the most common transitions in plant breeding systems. A breakdown in SI has been shown to evolve in populations in which sexual reproduction is limited by mate availability, such as small or colonizing populations (Levin, 1996; Kunin & Shmida, 1997; Hiscock, 2000), and macrophylogenetic studies have identified the independent loss of SI multiple times in virtually all groups examined (Takebayashi & Morrell, 2001). Although the common occurrence of a breakdown in SI is well established, at least two aspects of the macroevolutionary dynamics of breeding system evolution remain unresolved: the evolutionary role of PSI, and the irreversibility of the loss of SI. Firstly, PSI is typically assumed to be a transient state between SI and self-compatibility (SC) (Levin, 1996), but its macroevolutionary role has never been addressed (Goodwillie et al., 2005). In particular, is PSI part of a transition to SC, a stable or terminal state, or likely to revert to SI?

Secondly, macrophylogenetic analyses support the intuition that SI is lost more frequently than gained (Takebayashi & Morrell, 2001), and Igic et al. (2006) recently presented evidence that the loss of SI is essentially irreversible in the Solanaceae. However, although there are arguments for thinking that the loss of SI is irreversible (see Igic et al., 2004), it is not clear how general this conclusion is. A review of seven macrophylogenetic studies assessing rates of the gain and loss of SI found support for the gain of SI in three of the studies (Takebayashi & Morrell, 2001). SI could be regained through either the de novo origination of SI or the restoration of the ancestral SI system. A broad survey of the number of SI systems in the angiosperms estimated that new SI systems have evolved a minimum of 21 independent times (Weller et al., 1995; Steinbachs & Holsinger, 2002) and, indeed, multiple systems have now been identified within the family Polemoniaceae (Barrett et al., 1996; Kohn et al., 1996; de Nettancourt, 2001). The conditions under which an ancestral SI system could be restored have not been well explored but include restoration of a modifier locus (Nasrallah et al., 2004) or complementation of modifiers through hybridization between compatible ancestors or between a self-compatible and self-incompatible ancestor (Rick & Chetelat, 1991).

A macrophylogenetic analysis of breeding systems is performed by reconstructing the phylogenetic relationship of a suite of species and using the character (breeding) states of the extant taxa to infer past evolutionary processes. Using methods based on maximum likelihood (ML) and/or parsimony, one can evaluate variables such as the number of gains or losses of a trait, the transition rates between character states, whether trait evolution is punctual or gradual, and the inferred location of character state changes on the tree topology, and test hypotheses such as a bias in the rate of gain vs loss or the irreversibility of trait evolution (Harvey & Pagel, 1991; Sanderson, 1993; Pagel, 1999a). Collectively, these analyses provide important information about the (ir)reversibility of trait evolution, because the location as well as the frequency of gains/losses is integral to understanding and testing hypotheses about the evolutionary process.

In this paper, we examine the evolutionary dynamics of the gain and loss of SI in the Asteraceae, a family known to possess sporophytic SI, although the gene regulating it is not known (de Nettancourt, 2001). To do this, we collected DNA sequence data and SI status information for 193 species (211 taxa) representing most of the tribes in the Asteraceae. We then employed methods based on both ML (Pagel, 1994, 1999b) and parsimony (Maddison & Maddison, 1989) to ask the following questions.

  • What model best describes breeding system evolution in the family and is there evidence for the gain of SI?
  • What is the evolutionary role of PSI?
  • What are the transition rates between character states?
  • Is there a punctual or gradual mode of trait evolution?
  • What is the ancestral state of SI (SI, PSI or SC) in the Asteraceae?
  • What is the ancestral character state reconstruction of breeding traits on the phylogeny and does the loss of SI tend to occur on terminal branches?

Because of our focus on PSI as well as SC, we refer to the loss of SI as any forward transition from SI to PSI to SC and the gain of SI as any backward transition from SC to PSI to SI.

Materials and Methods

Data collection

SI status  We surveyed different databases to obtain information on SI status from as many species in the Asteraceae as possible. The SI status was obtained by surveying published articles concerning the reproductive ecology and/or systematics of species in the family (database available upon request). The nomenclatural synonyms for each taxon were obtained by searching The International Plant Names Index (2005). Our final database contained information on the SI status of 544 species, but 39 of these were part of species complexes consisting of various subspecies or varieties, and the final data set consisted of 571 taxa. We scored the SI status as SI, PSI or SC depending on evidence from pollination and/or microscope work. Most authors declared a species as self-incompatible when fruit set after hand self-pollination was zero or very low (< 0.05) and/or when they observed no germination or growth of self-pollen on the stigmatic surface. Occasionally, fruit set after self- and cross-pollinations was recorded and the authors did not explicitly state the SI status of a species, but if fruit set after self-pollination was zero or very low (< 0.05), we called it self-incompatible. Authors reported species as having PSI if they found variation among individuals or populations in the strength of SI and sometimes if a breakdown in SI was induced by environmental factors such as temperature or age. The number of studies attributing PSI to individual, population or environmental causes was noted. Most authors defined a species as self-compatible when it set fruit after self-pollination.

Molecular sequence data and phylogenetic reconstruction  To test specific hypotheses concerning the gain and loss of SI, we searched GenBank for DNA sequence data (internal transcribed spacer (ITS), maturase K gene (matK) and ribulose-bisphosphate carboxylase large subunit gene (rbcL)) for the 571 taxa for which we had SI status data. We obtained data for 211 (193 species) of the 571 taxa for the internal transcribed spacer (ITS1 and ITS2, 5.8 S) in the nuclear genome. Differences in the number of taxa within each category between the phylogenetic (n = 211) and full (n = 571) data sets were assessed for significance using a χ2 test. A previous phylogenetic analysis of the Asteraceae using ITS sequence data by Goertzen et al. (2003) recognized 15 tribes in the Asteraceae: the proportion of taxa represented in each tribe in our molecular data set is presented in Table 1. Of the 211 taxa for which we obtained sequence data, there were representatives from all four subfamilies and 11 of the 15 tribes recognized by Goertzen et al. (2003), and five of the 10 subfamilies and 17 of the 36 tribes described by Panero & Funk (2002) (Table 1). The number of taxa is greatest in the subfamily Asteroideae (tribes Astereae + Heliantheae sensu lato) with c. 57% of the taxa. This is probably because it is the largest subfamily in the Asteraceae (Bremer, 1994) and contains two of the most economically (Heliantheae) and ecologically (Madieae) important tribes for which breeding system data are available. The underrepresentation of other tribes can be attributed to the fact that most of the 36 tribes described by Panero & Funk (2002) for which we lack data are monotypic or contain only a few species, and the fact that the species in these tribes are locally endemic and have few or no breeding system data.

Table 1.  Number of species and their relative representation in the phylogenetic analyses, by tribe as described by Panero & Funk (2002) and Goertzen et al. (2003)
Tribe (Panero & Funk, 2002)   Tribe (Goertzen et al., 2003)   Species (n)   % in phylogenetic data set
Barnadesieae                  Barnadesieae                    1             100
Mutisieae                     Mutisieae                       5             40
Cichorieae                    Lactuceae                       124           48
Cardueae                      Cardueae                        24            54
Arctoteae                     Arctoteae                       1             100
Anthemideae                   Anthemideae                     23            26
Calenduleae                   Calenduleae                     15            7
Senecioneae                   Senecioneae                     12            42
Astereae                      Astereae                        104           15
Gnaphalieae                   Gnaphalieae                     7             0
                              Heliantheae                     254           42
Helenieae*                                                    22            45
Heliantheae*                                                  100           36
Coreopsideae*                                                 41            46
Tageteae*                                                     2             50
Eupatorieae*                                                  14            14
Millerieae*                                                   18            6
Madieae*                                                      57            67
Total                                                         570           38.8

The 211 ITS sequences were aligned with Clustal X (Thompson et al., 1997) using stepwise alignment by first aligning species within a genus and then genera within tribes. Finally, the tribes were aligned with reference to the consensus sequences obtained by Goertzen et al. (2003). This final alignment was edited and trimmed using bioedit (Hall, 1999), resulting in a total sequence length of 703 bp. To reconstruct the phylogenetic relationships among taxa, the best model of nucleotide substitution was chosen using maximum likelihood criteria as implemented in modeltest 3.07 (Posada & Crandall, 1998). The Tajima & Nei (1982) model of nucleotide substitution with a gamma distribution of mutation rates among sites (a = 1.37) provided the best fit to the data. The phylogeny was obtained with mega 2 software (Kumar et al., 1993), using minimum evolution as the optimality criterion.

To examine the gain and loss of SI in the Asteraceae, we followed the ML methods of Pagel (1997, 1999b) to (a) investigate whether trait evolution was punctual or gradual, which also tests the importance of including the branch length information, (b) determine the ancestral state of SI in the Asteraceae, (c) test alternative hypotheses concerning the nature of character state evolution in the family, (d) reconstruct the ancestral SI state at all internal nodes on the phylogenetic tree, and (e) use the ancestral state reconstruction to count the proportion of transitions occurring on internal nodes or terminal branches. Additionally, we repeated parts (c), (d) and (e) using the method of parsimony. The analyses were performed on rooted trees and, for the ML method, branch lengths were estimated by the minimum evolution criterion. Trees were rooted with Chuquiraga oppositifolia D. Don, which is a member of the most basal tribe, the Barnadesieae, in the Asteraceae (Jansen & Kim, 1996; Panero & Funk, 2002; Goertzen et al., 2003), and polytomies were resolved by setting the branch length subtending them to 0.0000001, thereby treating them as soft rather than hard. The sensitivity of the results to changes in the topology was assessed by forcing this tree to have the topology of the tribal relations described by Goertzen et al. (2003) or Panero & Funk (2002) and repeating each of the analyses described below.

(a) Test of the importance of branch length information for understanding character evolution (i.e. the mode of evolution). Given the three character states, SI, PSI and SC, the full model of character evolution includes six transitions: SI → SC, SI → PSI, PSI → SI, PSI → SC, SC → SI and SC → PSI. ML methods can be used to infer both the tempo and the mode of character state changes. The tempo is inferred from the transition rate estimates themselves, while the mode is inferred from a scaling parameter, κ, which assesses the probability that transitions are dependent on branch length (Pagel, 1999a). If κ = 0, trait evolution is independent of the branch lengths, indicating a punctual mode of evolution. With κ = 1, trait evolution is dependent on the branch length, indicating a gradual mode of evolution.
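The effect of the scaling parameter can be illustrated with a minimal sketch. It assumes the commonly used form of the κ transform, in which each branch length t is replaced by t raised to the power κ before the likelihood is calculated; the function name and the branch lengths are hypothetical and are not taken from the analyses reported here.

```python
def scale_branch_lengths(branch_lengths, kappa):
    """Raise each branch length to the power kappa (assumed form of the kappa transform)."""
    return [t ** kappa for t in branch_lengths]


# Hypothetical branch lengths, for illustration only.
lengths = [0.01, 0.05, 0.20]

punctual = scale_branch_lengths(lengths, 0.0)   # every branch becomes 1.0
gradual = scale_branch_lengths(lengths, 1.0)    # branch lengths are unchanged
small_kappa = scale_branch_lengths(lengths, 0.12)

print(punctual)
print(gradual)
print(small_kappa)
```

With κ = 0 all branches have the same length, so the amount of change no longer depends on the original branch lengths; with a small κ the differences among branches are strongly compressed but not removed.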

The transition rate estimates are calculated as part of the process of maximizing the overall likelihood of the distribution of character states given the hypothesis, while the value and utility (κ ≠ 0) of including the branch length information are evaluated by comparing the fit of the data to a full model of character evolution with and without the scaling parameter, κ. The fit of the data to models including or excluding κ was assessed using both the likelihood ratio test (LRT) and the Akaike information criterion (AIC). For the LRT, the LR statistic is LR = −2 loge[Hi/Hj], where Hi is the smaller of the two likelihoods. When the models are nested, the LR statistic is χ2 distributed with degrees of freedom equal to the difference in the number of estimated parameters. The AIC statistic is AICi = −2 loge Hi + 2Ki, where Hi is the likelihood for the model and Ki is the number of parameters in the model (Akaike, 1974). To compare two models, which need not be nested, the difference in AIC between the two models is calculated as ΔAICi = AICi − AICmin, where AICmin is the smaller AIC of the two compared models. Burnham & Anderson (2003) suggest that when ΔAICi ≤ 2 the two models are both substantially supported, when 3 < ΔAICi < 10 the model with the larger AIC has considerably less support, and when ΔAICi > 10 the larger AIC model can be excluded. If two models have the same number of parameters, as can occur when they are not nested, a difference in the –ln likelihoods of 2 would be equivalent to a ΔAICi of 4.
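The two statistics can be sketched as follows. The −ln likelihoods are those reported in Table 4 for the full model and for model 6 (all forward transitions + PSI → SI); the parameter counts (six rates plus κ for the full model, two fewer for model 6) are an assumption made for illustration, and only their difference affects the result.

```python
def lrt_statistic(neg_lnl_reduced, neg_lnl_full):
    """LR = -2 ln(H_reduced / H_full), written in terms of -ln likelihoods."""
    return 2.0 * (neg_lnl_reduced - neg_lnl_full)


def aic(neg_lnl, n_params):
    """AIC = -2 ln H + 2K, written in terms of the -ln likelihood."""
    return 2.0 * neg_lnl + 2.0 * n_params


# -ln likelihoods as reported in Table 4.
NEG_LNL_FULL = 163.03
NEG_LNL_MODEL6 = 197.48

# Assumed parameter counts (illustrative): six rates + kappa, and two rates fewer.
K_FULL = 7
K_MODEL6 = 5

# 5% critical value of the chi-square distribution with 2 d.f.
CHI2_CRIT_2DF = 5.991

lr = lrt_statistic(NEG_LNL_MODEL6, NEG_LNL_FULL)
delta_aic = aic(NEG_LNL_MODEL6, K_MODEL6) - aic(NEG_LNL_FULL, K_FULL)

print(f"LR = {lr:.1f}; exceeds critical value: {lr > CHI2_CRIT_2DF}")
print(f"delta AIC = {delta_aic:.1f}")
```

Under these assumptions the sketch gives LR = 68.9 and ΔAIC = 64.9, matching the entries for model 6 in Table 4.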

(b) Ancestral state of SI for the Asteraceae. To determine the ancestral breeding state for the Asteraceae, we used the command ‘fossil’ in the program multistate (Pagel, 2002) to assign the root node as SI, PSI or SC and then calculated the likelihood of the data. Because the three models concerning the hypothetical ancestor were not nested, the models were compared using ΔAICi as described in (a).

(c) Test of hypotheses concerning the ML model of breeding system evolution. To find the model of breeding system evolution that maximized the likelihood of the data, we compared the fit of reduced models to the full model + scaling parameter (as supported by the analyses; see Results) to test specific hypotheses about the gain and loss of SI. Specifically, we restricted transitions to test the following hypotheses: Is (1) PSI or (2) SC a terminal state? (3) Is the loss of SI irreversible, i.e. do only forward transitions occur (SI → PSI → SC, or SI → SC)? (4) Is there a bias in the rates of the gain and loss of SI, i.e. are the rates of forward and backward transitions equal? Because we found strong evidence that backward transitions were required to explain the data, we performed a series of tests to more rigorously examine the importance of the backward transitions. We tried several model building approaches, but present analyses in which all forward transitions were allowed and then one or two backward transitions were added to see if they significantly improved the model. The above analyses were performed with the aid of multistate software (Pagel, 2002) using the LRT and AIC statistics described above to compare models. In addition, the sum of the forward (α) and backward (β) transition rates among the three character states for the best fit model (the full model + the scaling parameter; see Results) was calculated.
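One way to picture the reduced models is as sets of allowed transitions, with the degrees of freedom of each test equal to the number of rates fixed at zero relative to the full model. The sketch below is illustrative only: the names are hypothetical, and it does not reproduce how the constraints were entered in multistate.

```python
FORWARD = {("SI", "PSI"), ("PSI", "SC"), ("SI", "SC")}
BACKWARD = {("PSI", "SI"), ("SC", "PSI"), ("SC", "SI")}
FULL = FORWARD | BACKWARD


def without_exits(state):
    """Allowed transitions when `state` is terminal (no transition leaves it)."""
    return {(a, b) for (a, b) in FULL if a != state}


models = {
    "SC terminal": without_exits("SC"),
    "PSI terminal": without_exits("PSI"),
    "irreversible": set(FORWARD),
    "forward + SC->SI": FORWARD | {("SC", "SI")},
    "forward + SC->SI and PSI->SI": FORWARD | {("SC", "SI"), ("PSI", "SI")},
}


def degrees_of_freedom(allowed):
    """Number of transition rates fixed at zero relative to the full model."""
    return len(FULL) - len(allowed)


for name, allowed in models.items():
    print(f"{name}: d.f. = {degrees_of_freedom(allowed)}")
```

The hypothesis of equal forward and backward rates is a different kind of constraint, because rates are tied to one another rather than removed, so it is not represented in this sketch.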

(d) Ancestral state reconstruction. To assign an ancestral breeding state to the internal nodes, each node was successively fixed as SI, PSI, SC or unresolved and then the likelihood of the data given this model was recalculated. If the likelihood of the data given one particular ancestral state was greater than 2.0 units compared with both other states, the node was set to that state; otherwise it was set to unresolved (Pagel, 1994, 1999b) (equivalent to a ΔAICi of 4.0; see (a) above).
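The decision rule for a single node can be sketched as below. The −ln likelihood values are hypothetical and the function name is an assumption; the sketch covers only the choice between one resolved state and 'unresolved', not the intermediate case in which two states remain equally plausible.

```python
def assign_state(neg_lnl_by_state, threshold=2.0):
    """Return the best-supported state, or 'unresolved' if it does not beat
    every alternative by more than `threshold` log-likelihood units."""
    ranked = sorted(neg_lnl_by_state.items(), key=lambda item: item[1])
    best_state, best_value = ranked[0]
    runner_up_value = ranked[1][1]
    if runner_up_value - best_value > threshold:
        return best_state
    return "unresolved"


# Hypothetical -ln likelihoods with the node fixed to each state in turn.
node_x = {"SI": 163.1, "PSI": 166.0, "SC": 165.4}
node_y = {"SI": 163.1, "PSI": 164.0, "SC": 166.2}

print(assign_state(node_x))
print(assign_state(node_y))
```

Because the alternatives are compared with the best state, it is enough to check the runner-up: if it is more than 2.0 units worse, so is every other state.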

(e) The proportion of gains/losses at internal nodes and terminal branches. Sanderson (1993) describes how the location on the topology of inferred gains and losses of a trait contributes to the evidence for irreversibility. For example, if SI was gained early and subsequently only lost on terminal branches, this would provide greater support for the hypothesis of irreversibility. To assess this, we calculated the proportion of changes falling into each of the six transition categories as a function of the total number of possible changes at both internal nodes and on terminal branches. This was done by counting the number of each type of transition occurring on trees in which the ancestral character state was reconstructed by either ML or parsimony methods and dividing by the total possible number of changes, 210 for internal nodes and 211 for terminal branches.
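The tabulation described above amounts to counting each type of inferred change and dividing by the number of possible changes. In the sketch below the denominators are those given in the text, whereas the list of inferred changes is hypothetical and serves only to show the calculation.

```python
from collections import Counter

N_INTERNAL = 210  # possible changes at internal nodes (from the text)
N_TERMINAL = 211  # possible changes on terminal branches (from the text)


def transition_proportions(transitions, n_possible):
    """Proportion of each (from, to) transition out of all possible changes."""
    counts = Counter(transitions)
    return {pair: n / n_possible for pair, n in counts.items()}


# Hypothetical changes inferred on terminal branches, for illustration only.
terminal_changes = (
    [("SI", "SC")] * 10 + [("SI", "PSI")] * 6 + [("SC", "SI")] * 5
)

proportions = transition_proportions(terminal_changes, N_TERMINAL)
for pair, value in proportions.items():
    print(pair, round(value, 3))
```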

To assess the importance of the gain of SI by an independent method, we compared the results of the ML methods with those obtained using parsimony methods. To do this, we reconstructed the ancestral SI state at all internal nodes using parsimony rules on the tree topology (without branch lengths) as implemented in the program MacClade (Maddison & Maddison, 2002) and then calculated the number of steps and character state changes under different step matrices that correspond to the hypotheses outlined above: (1) PSI terminal; (2) SC terminal; (3) irreversible parsimony, and (4) the full model, equivalent to unordered parsimony. We were unable to test the hypothesis of equal rates of forward and reverse evolution using parsimony. Using the reconstruction of ancestral states by parsimony, we also tabulated the number of transitions occurring at internal nodes and on terminal branches (see (e) above).
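How a step matrix changes the number of steps can be shown with a small sketch of step-matrix (Sankoff) parsimony on a toy tree. This is not MacClade's implementation: the encoding of prohibited changes as infinite cost, the function names and the five-taxon tree are all assumptions made for illustration.

```python
INF = float("inf")
STATES = ("SI", "PSI", "SC")

FORWARD = {("SI", "PSI"), ("PSI", "SC"), ("SI", "SC")}
BACKWARD = {("PSI", "SI"), ("SC", "PSI"), ("SC", "SI")}


def step_matrix(allowed):
    """Cost 0 for no change, 1 for an allowed change, infinity otherwise."""
    return {
        a: {b: 0 if a == b else (1 if (a, b) in allowed else INF) for b in STATES}
        for a in STATES
    }


UNORDERED = step_matrix(FORWARD | BACKWARD)
IRREVERSIBLE = step_matrix(FORWARD)


def sankoff(node, steps):
    """Minimum cost of the subtree below `node` for each state at `node`.
    A tip is a state name; an internal node is a tuple of child nodes."""
    if isinstance(node, str):
        return {s: 0 if s == node else INF for s in STATES}
    child_costs = [sankoff(child, steps) for child in node]
    return {
        s: sum(min(steps[s][t] + cost[t] for t in STATES) for cost in child_costs)
        for s in STATES
    }


def tree_length(tree, steps):
    """Fewest steps over all possible states at the root."""
    return min(sankoff(tree, steps).values())


# Hypothetical tree: four self-compatible tips and one self-incompatible tip.
toy_tree = (("SC", "SC"), ("SC", ("SC", "SI")))

print(tree_length(toy_tree, UNORDERED))
print(tree_length(toy_tree, IRREVERSIBLE))
```

On this toy tree the unordered matrix needs one step (a single gain of SI), whereas the irreversible matrix needs three (the root must be SI, followed by three independent losses). The same logic underlies the larger number of steps required when only forward changes are allowed.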


Results

Robustness of data set and the phylogenetic tree

The frequency of taxa having SI (65%), PSI (10%) or SC (25%) as their breeding system in the full data set did not differ significantly from that in the phylogenetic data set [χ2 = 0.60, degrees of freedom (d.f.) = 2, P = 0.7395; Table 2]. Of the 56 species with PSI in the whole data set, 60% exhibited variation in the strength of SI within a population and 40% exhibited variation among populations (not shown). Interestingly, for the species that exhibited within-population PSI, most authors found that between 8 and 15% of the individuals were self-compatible while the remainder were either partially or fully self-incompatible (not shown).

Table 2.  The proportion (and number) of self-incompatibility (SI), partial SI (PSI) and self-compatibility (SC) taxa in the entire data set and that used for the phylogenetic analyses
Entire data set   Phylogenetic data subset

The minimum evolution tree describing the relationship among tribes provided generally high bootstrap support for internal nodes: 38% of the internal nodes had support equal to or higher than 75%, and 19% of the internal nodes had support ranging from 50 to 74% (Fig. 1), which is high for a family such as the Asteraceae known to have evolved rapidly in the last 40–50 million years (Devore & Stuessy, 1995). Both methods recovered five subfamilies: Barnadesioideae, Mutisioideae, Carduoideae, Cichorioideae and Asteroideae. The first four subfamilies were paraphyletic and consisted of the following tribes: Barnadesieae (subfamily Barnadesioideae); Mutisieae (subfamily Mutisioideae), Cardueae (subfamily Carduoideae) and Cichorieae and Arctoteae (subfamily Cichorioideae). The subfamily Asteroideae was monophyletic and was composed of six tribal groups: (1) Anthemideae, (2) Calenduleae and its sister Senecioneae, (3) Astereae, (4) Helenieae, (5) Eupatorieae, Tageteae and Madieae and (6) Millerieae, Heliantheae and its sister group Coreopsideae (Fig. 2). The topology of the trees generally concurred with those of Goertzen et al. (2003) and Panero & Funk (2002), except in the position of some paraphyletic tribes (not shown). Here, we present the results based on analyses of the minimum evolution tree based on our data but with the topology at the tribe level forced to concur with that presented by Goertzen et al. (2003), because their tree was also based on ITS but included more species (288 species) and was highly similar to our original tree. However, we use the nomenclature of the tribes as described by Panero & Funk (2002) (listed in Table 1). All three topologies gave identical conclusions and influenced only the absolute values of, for example, the likelihood scores, showing that the choice of topology did not affect our results.

Figure 1.

Condensed minimum evolution tree depicting the relationships among tribes of the 211 taxa using internal transcribed spacer (ITS)1, ITS2 and 5.8 S DNA. The values subtending the nodes or on branches represent the bootstrap value for each node (see text for details). The size of each triangle is proportional to the number of species pertaining to each tribe and the branch length.

Figure 2.

Rates of forward (→) and backward (←) transitions between self-incompatibility (SI), partial SI (PSI) and self-compatibility (SC) reconstructed under the full model of evolution by (a) maximum likelihood and (b) unordered parsimony.

(a) Test of the mode of evolution of SI

The fit of our data to the full model was significantly improved based on either the LRT or AIC criterion by the inclusion of the scale parameter κ (Table 3). The κ value was small (c. 0.12), suggesting that trait change was punctual – occurring quickly and then followed by a long period of stasis.

Table 3.  Likelihood, likelihood ratio test (LRT) and difference in the Akaike information criterion (ΔAICi) test comparing the full model of evolution of self-incompatibility (SI) status with and without the branch scaling parameter (κ)
Model   –ln likelihood   d.f.   LRT   ΔAICi
  1. A likelihood ratio > 3.841 indicates that the scaling parameter has a significant effect on the model.

  2. d.f., degrees of freedom.

Full +κ163.0318148.008146.008

(b) Ancestral state of SI status in the Asteraceae

The maximum likelihood test to determine whether SI, PSI or SC was the most likely ancestral state for the family could not resolve what the most likely ancestral state was: the likelihoods for the three models were very close and the difference in AIC was not greater than 2 units for any model (AICSI = 340.38; AICPSI = 340.36; AICSC = 340.84).
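The comparison of the three root-state models can be reproduced from the AIC values reported above. The sketch below only restates that arithmetic; the variable names are hypothetical and the 2-unit cut-off follows the criterion described in the Materials and Methods.

```python
# AIC values reported for the root fixed as SI, PSI or SC.
aic_by_root_state = {"SI": 340.38, "PSI": 340.36, "SC": 340.84}

aic_min = min(aic_by_root_state.values())
delta_aic = {state: value - aic_min for state, value in aic_by_root_state.items()}

# Models within 2 AIC units of the best model are treated as equally supported.
supported = [state for state, delta in delta_aic.items() if delta <= 2.0]

for state, delta in delta_aic.items():
    print(f"root = {state}: delta AIC = {delta:.2f}")
print("Equally supported root states:", supported)
```

All three root states fall within 2 units of the best model, which is why the ancestral state is reported as unresolved.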

(c) Models of evolution of SI

The first four reduced models had a lower fit to the data than the full model, using both the LRT and the AIC criterion (Table 4). We can infer from this that (1) neither SC nor PSI is a terminal state, and (2) including only forward transitions or setting the rate of forward transitions equal to that of backward transitions provides a significantly poorer fit to the data than the full model. This indicates that the loss of SI is not irreversible and that at least some backward transitions are required to explain the data. The relative importance of the backward transitions is tested in models 6–9 (Table 4). This shows that no single reverse transition is sufficient to provide as good a fit to the data as the full model but that inclusion of the transition from SC to SI shows the greatest improvement in the likelihood. The transition rate estimates for the best fit model of character state evolution using ML are shown in Fig. 2. Keeping in mind that these are the instantaneous transition rates over the whole tree (and are not probabilities), this underscores that mating system evolution in the Asteraceae is very dynamic. The total value of the forward transition rates, α, is 0.76 and that of backward transitions, β, is 0.52, giving a relative rate of the loss of SI of α/(α + β) = 0.59.
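The relative rate of loss follows directly from the summed transition rates given above; the short sketch below restates that arithmetic, with variable names chosen for illustration.

```python
alpha = 0.76  # summed forward (loss of SI) transition rates, from the text
beta = 0.52   # summed backward (gain of SI) transition rates, from the text

relative_loss = alpha / (alpha + beta)
relative_gain = beta / (alpha + beta)

print(f"relative rate of loss of SI = {relative_loss:.2f}")
print(f"relative rate of gain of SI = {relative_gain:.2f}")
```

The complementary value, the relative rate of gain, is about 0.41, so gains account for a substantial share of the total transition rate.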

Table 4.  Likelihood, likelihood ratio test (LRT) and difference for the Akaike information criterion (ΔAICi) test for nine hypotheses concerning the evolution of self-incompatibility (SI) in the Asteraceae
Model  Transitions included                  Hypothesis                                  d.f.  –ln likelihood  LRT    ΔAICi  Parsimony steps
1      All six                               Full model                                        163.03                        55
2      SI → PSI → SC, PSI → SI and SI → SC   SC terminal                                 2     196.83          67.6   63.1   88
3      SI → SC → PSI, SC → PSI and SI → PSI  PSI terminal                                2     171.36          15.94  13.2   57
4      All six, but constraints on rates     Forward = backward transition rate          3     174.6           23.14  17.1
5      SI → PSI → SC and SI → SC             Irreversibility (forward transitions only)  3     207.80          89.5   83.1   104
6      All forward + PSI → SI                Forward + one backward                      2     197.48          68.9   64.9
7      All forward + SC → SI                 Forward + one backward                      2     169.49          12.9   8.9
8      All forward + SC → PSI                Forward + one backward                      2     196.83          67.6   64.1
9      All forward + SC → SI and PSI → SI    Forward + two backward                      1     169.72          13.3   11.8

  1. The likelihood ratio test of each reduced model (numbers 2–9) nested in the full model is χ2 distributed and evaluated with the degrees of freedom (d.f.) indicated. When ΔAICi > 10 the model with the larger AIC can be rejected (see text for details).

  2. Likelihood ratios > 3.841 for 1 d.f., > 5.991 for 2 d.f. or > 7.815 for 3 d.f. indicate a significant difference between the full and reduced models.

  3. PSI, partial self-incompatibility; SC, self-compatibility.

(d) Ancestral state reconstruction: overall patterns of gain and loss of SI on the phylogeny

The phylogenetic tree describing the reconstruction of the ancestral breeding states using ML methods (Fig. 3) reveals that the ancestral state of some internal nodes could not be resolved (grey solid lines) and others were identified as either SC or PSI (grey dashed lines). However, of the resolved branches, there are clear differences in the number of gains and losses of SI in each tribe. For example, several clades are predominantly SI, such as (a) the genus Coreopsis L. (Coreopsideae, node A; Fig. 3a), (b) the Madieae clade (silversword alliance, node F; Fig. 3a), (c) the Calycadenia DC clade (node E; Fig. 3a), and (d) members of the genus Helianthus L. (node D; Fig. 3a) and the genus Sonchus L. (node N; Fig. 3b). In these clades the loss of SI is more frequent than its gain and tends to occur on terminal branches. Other clades exhibit mixtures of SI, PSI and SC and many changes in the breeding system are inferred to have occurred. For example, (a) the tribes Astereae and Senecioneae (Fig. 3b) are dominated by SC and PSI and the closely related tribe Cardueae (Fig. 3b) has all three character states approximately equally represented, and (b) the basal members of the tribe Cichorieae are also dominated by SC or PSI, as exemplified by members of the genera Cichorium (node L; Fig. 3b) and Lactuca (node M; Fig. 3b). Striking examples of the gain of SI are inferred: (a) in a section of the Heliantheae, the Encelia Adans. alliance (node C; Fig. 3a), (b) in the Coreopsideae because the ancestral genus, Bidens L., is SC (node B; Fig. 3a), (c) in the genus Malacothrix in the tribe Cichorieae (node K; Fig. 3b), (d) at multiple locations within the tribes Astereae, Senecioneae, Cardueae and Cichorieae (nodes L, M and N) and (e) in the genus Lasthenia in the tribe Helenieae (node G; Fig. 3a).

Figure 3.

Reconstruction of the ancestral state of self-incompatibility (SI) in the Asteraceae using maximum likelihood and the criteria indicated in the text. Character states of internal and terminal branches: solid, SI; dashed, partial SI (PSI); dotted, self-compatibility (SC). Grey solid lines, branches for which the ancestral state was uncertain (SI, PSI or SC equally probable); grey dashed lines, either SC or PSI. Letters indicate genera or alliances with more than three taxa represented in the phylogeny: (a), (A) Coreopsis, (B) Bidens, (C) Encelia, (D) Helianthus, (E) Calycadenia, (F) silversword, (G) Lasthenia; (b), (H) Solidago, (I) Senecio, (J) Centaurea, (K) Malacothrix, (L) Cichorium, (M) Lactuca and (N) Sonchus.

Reconstructing the ancestral state of SI using parsimony rules and four different models of character state change revealed that parsimony methods also strongly prefer models in which both gain and loss of SI can occur (supplementary material Fig. S1, available online, and Table 4). A step matrix of unordered parsimony is parallel to the ML full model and required the fewest inferred changes (55 steps) of all the models examined (Table 4). The transition rates of this model are, overall, quite similar to those found under ML (Fig. 2b). The number of steps required to reconstruct the ancestral states assuming only forward changes (irreversible parsimony) was 104. Constructing parsimony step matrices that are parallel to the ML models of PSI or SC as a terminal state required 57 or 88 steps, respectively (Table 4). This suggests that parsimony methods agree with those based on ML in that the full/unordered model is best, except that parsimony finds that PSI may be a terminal state, probably because parsimony reconstructs transitions to PSI only on terminal branches.

(e) The proportion of gains/losses at internal nodes and terminal branches

Between 5 and 13% of the internal or terminal branches showed a shift in breeding system based on ML or parsimony ancestral character state reconstructions (Table 5). These changes indicate that there are no transitions from partial SI to SC (losses) or to SI (gains) inferred via parsimony (see also Fig. S1). In addition, ML infers that the loss of SI is twice as frequent on internal as on terminal branches and occurs from SI to partial SI and from partial SI to SC, while parsimony infers 2–4 times more losses on terminal than on internal branches and infers them as occurring from SI to SC and from SI to partial SI. Finally, ML infers slightly more gains of SI on terminal than on internal branches but three times more gains than losses on terminal branches, while parsimony infers that 2.5 times more gains occur on terminal than on internal branches. ML calculates these gains as occurring from SC to SI and from partial SI to SI, while parsimony infers them as occurring from SC to SI and from SC to partial SI. Thus, ML suggests that the loss of SI is greater on internal nodes and the gain of SI greater on terminal branches, while parsimony reconstructs both greater loss and greater gain on terminal branches. The slight differences in the transitions described in the internal/terminal branch analyses (Table 5) compared with the transition rate estimates (Fig. 2) are caused by the former being based on the changes assigned by the ancestral state reconstruction (i.e. changes that were justified by the 2 log likelihood criterion for a single node) while the latter were based on the full probabilistic model of character state evolution.

Table 5. Proportion of transitions occurring at internal nodes and on terminal branches using either maximum likelihood (ML) or parsimony ancestral character state reconstruction for the evolution of self-incompatibility (SI) in Asteraceae

| Transition | Internal nodes, ML | Internal nodes, parsimony | Terminal branches, ML | Terminal branches, parsimony |
| --- | --- | --- | --- | --- |
| SI → SC | 0.0000 | 0.0143 | 0.0000 | 0.0427 |
| PSI → SC | 0.0143 | 0.0000 | 0.0095 | 0.0000 |
| SI → PSI | 0.0190 | 0.0048 | 0.0047 | 0.0237 |
| Total loss | 0.033 | 0.0191 | 0.0142 | 0.066 |
| SC → SI | 0.0190 | 0.0190 | 0.0474 | 0.0332 |
| SC → PSI | 0.0000 | 0.0095 | 0.0000 | 0.0284 |
| PSI → SI | 0.0190 | 0.0000 | 0.0000 | 0.0000 |
| Total gain | 0.038 | 0.0285 | 0.0474 | 0.062 |
| Total changes | 0.071 | 0.0476 | 0.0616 | 0.128 |

  1. The proportion of changes occurring for each transition were estimated using the total of possible changes at internal nodes (210) or on terminal branches (211).

  2. PSI, partial self-incompatibility; SC, self-compatibility.
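As a reading aid for Table 5, the proportions can be converted back into approximate numbers of inferred changes by multiplying by the denominators given in the table footnote (210 internal nodes, 211 terminal branches). The sketch below is only arithmetic on the published proportions; the counts are implied by rounding and are not reported in the source, and the column order (ML before parsimony within each group) is an assumption read from the accompanying text.

```python
# Denominators from the Table 5 footnote.
INTERNAL_NODES = 210
TERMINAL_BRANCHES = 211

# Proportions from Table 5, in the assumed column order:
# (ML internal, parsimony internal, ML terminal, parsimony terminal).
TABLE5 = {
    "SI->SC":  (0.0000, 0.0143, 0.0000, 0.0427),
    "PSI->SC": (0.0143, 0.0000, 0.0095, 0.0000),
    "SI->PSI": (0.0190, 0.0048, 0.0047, 0.0237),
    "SC->SI":  (0.0190, 0.0190, 0.0474, 0.0332),
    "SC->PSI": (0.0000, 0.0095, 0.0000, 0.0284),
    "PSI->SI": (0.0190, 0.0000, 0.0000, 0.0000),
}
DENOMINATORS = (INTERNAL_NODES, INTERNAL_NODES, TERMINAL_BRANCHES, TERMINAL_BRANCHES)
LOSSES = ("SI->SC", "PSI->SC", "SI->PSI")
GAINS = ("SC->SI", "SC->PSI", "PSI->SI")

def implied_counts(row):
    """Round proportion x denominator to the nearest whole number of changes."""
    return tuple(round(p * n) for p, n in zip(row, DENOMINATORS))

counts = {name: implied_counts(row) for name, row in TABLE5.items()}
loss_counts = [sum(counts[name][i] for name in LOSSES) for i in range(4)]
gain_counts = [sum(counts[name][i] for name in GAINS) for i in range(4)]

print("implied losses per column:", loss_counts)
print("implied gains per column:", gain_counts)
```

Dividing the implied totals by their denominators recovers the "Total loss", "Total gain" and "Total changes" rows to the precision shown in the table, which is a quick consistency check on the column assignment.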


Our literature survey of the breeding systems of 571 species in the Asteraceae revealed that the majority of plants in the Asteraceae are self-incompatible (63%), but a significant proportion are either partially self-incompatible (10%) or self-compatible (27%). Analyses of the patterns of gain and loss of SI by both ML and parsimony indicate that breeding system evolution has been very dynamic and is best described by a model in which transitions among all character states are allowed to occur. In particular, we can reject two slightly different hypotheses of irreversibility: that the rate of forward transition is not equal to the rate of backward transition, and that the number of backward transitions is zero. Ancestral character state reconstruction by both ML and parsimony showed that breeding system evolution has been more dynamic in some clades than in others, but that gains and losses of SI have occurred on both internal and external branches, providing stronger evidence that the loss, or partial breakdown, of SI is not irreversible in the Asteraceae.

The use of ML methods to study character evolution provides the opportunity to examine the mode of evolution and to investigate whether including branch length information improves the fit of the data to the model of character evolution. Our analyses indicated that breeding system evolution in the Asteraceae is punctuational and that including a scaling parameter, κ, to describe this evolution significantly improves the fit of the data. The finding that breeding system changes occur soon after speciation suggests that they may have a role in speciation, a finding that is not surprising and a hypothesis that could be further examined using ML-based phylogenetic models.
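For readers unfamiliar with κ, the following minimal sketch shows the transformation as it is usually defined in Pagel's framework: each branch length is raised to the power κ before the likelihood is computed, so that κ = 1 leaves branch lengths unchanged (gradual change proportional to branch length) and κ = 0 makes all branches equal (change associated with speciation events rather than elapsed time). The branch lengths and function name are hypothetical, and the value of κ estimated in this study is not restated here.

```python
def scale_branch_lengths(branch_lengths, kappa):
    """Raise every branch length to the power kappa (the kappa transform)."""
    return [length ** kappa for length in branch_lengths]

toy_lengths = [0.1, 0.5, 2.0]  # hypothetical branch lengths

gradual = scale_branch_lengths(toy_lengths, 1.0)        # unchanged: change scales with branch length
punctuational = scale_branch_lengths(toy_lengths, 0.0)  # all equal: change independent of branch length

print(gradual)
print(punctuational)
```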

Although the SI status of the ancestor of the family could not be resolved by ML methods, the finding that 63% of extant species have SI suggests that it is an ancient breeding system in the family, as other authors have argued (Richards, 1986; Lane, 1996). An inspection of Fig. 3 (or the supplementary material Fig. S1 using parsimony) shows that, while the basal group in the family, the Barnadesieae, is self-incompatible, the other basal tribes, the Mutisieae and Cichorieae, are not predominantly self-incompatible and the ancestral character states for the family tend to be reconstructed as SC, PSI or uncertain. Thus, while the phylogeny gives some support to the hypothesis that the family is ancestrally self-incompatible or partially self-incompatible, many species in the subfamilies Cichorioideae and Carduoideae must have subsequently lost SI.

Before we discuss the implications of these results, it is worth discussing some of the assumptions of the methods used to generate them. Firstly, both methods assume that the phylogeny reflects the true topology for the family (Harvey & Pagel, 1991). We are confident that the relationships within tribes reflect the true topology because most internal nodes below the level of subtribes had bootstrap support higher than 80%. The internal nodes that had lower bootstrap values were those for which the true phylogenetic relationships remained unresolved. However, when we employed the topologies of either Goertzen et al. (2003) or Panero & Funk (2002), the main results were identical, giving further statistical support to our conclusions. Secondly, Pagel's ML method (Pagel, 1994, 1999b) assumes that forward and backward transitions are equally probable (unless one imposes a restriction that α = β) and then calculates the likelihood of the data given a model. Pagel (1999a) tested the assumption of steady-state character evolution using a bacteriophage data set and concluded that the assumption was reasonable, but if differential transition rates across the tree are a result of differences in birth or death rates of self-incompatible or self-compatible taxa this may be more problematic (Adam Richman, pers. comm.).

Thirdly, Takebayashi & Morrell (2001) noted that, in studies that use ML methods to infer rates of the gain and loss of SI, the proportion of forward to total transitions (α/(α + β)) tends to be equal to the proportion of selfing (or self-compatible) species in the data set. The authors propose that ML estimators of α and β may be more sensitive to topological information when larger phylogenetic data sets are used. We found that the proportion of forward to total transitions was 0.59 while the proportion of SC + partial SI taxa was 0.35; this suggests that our analyses were more sensitive to the topological information, perhaps because of the inclusion of four times more taxa compared with those reviewed by Takebayashi & Morrell (2001) (211 vs 25–60).
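One way to see why α/(α + β) might track the proportion of derived taxa when the tree carries little information is a simplified two-state continuous-time Markov model, in which α/(α + β) is the long-run probability of being in the derived state. The sketch below is an illustration only: the study fitted a three-state model, the function name is hypothetical, and the two rates are invented values chosen so that their ratio reproduces the reported 0.59.

```python
import math

def prob_derived(alpha, beta, t, start_derived=False):
    """Probability of the derived state after time t in a two-state Markov model.

    alpha: forward rate (ancestral -> derived); beta: backward rate (derived -> ancestral).
    """
    equilibrium = alpha / (alpha + beta)
    start = 1.0 if start_derived else 0.0
    return equilibrium + (start - equilibrium) * math.exp(-(alpha + beta) * t)

# Hypothetical rates chosen only so that alpha / (alpha + beta) = 0.59.
ALPHA, BETA = 0.59, 0.41

for t in (0.0, 1.0, 5.0, 50.0):
    print(t, round(prob_derived(ALPHA, BETA, t), 3))
```

Whatever the starting state, the probability converges on α/(α + β) as time increases. Under this simplification, an estimate of 0.59 alongside an observed proportion of 0.35 for SC + partial SI taxa is what one would expect if the estimate were informed by the topology rather than by tip frequencies alone.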

Fourthly, the ability to detect a difference in the rate of gain vs loss of a character trait and to test for irreversibility depends on the size of the phylogeny, when the trait is first gained and the overall rates of gain and loss of the trait (Sanderson, 1993). Sanderson (1993) concluded that the power of the test for both hypotheses was generally low. Power improved when the analyses were performed on large phylogenies in which rates of both the gain and the loss of traits were high and the first trait gain occurred deep in the phylogeny. The data presented here appear to meet these conditions: SI appears early on some deep internal nodes, the overall rates of character state change are high, and there is evidence that SI was lost preferentially on internal branches and then gained on terminal branches.

Lastly, Igic et al. (2006) showed that using only the character states of extant taxa to infer the ancestral character states can lead to spurious results. Using the trans-generic polymorphism of S-alleles to set internal nodes as SI, they showed that the loss of SI is essentially irreversible in the Solanaceae and that they would have spuriously concluded it was not irreversible without the inclusion of the SI ‘fossil’ data. We were not able to include SI ‘fossil’ data in our data set (because the S-locus is unknown in the Asteraceae), but have assessed the location as well as the frequency and importance of transitions to provide stronger evidence for the reversibility of SI in the Asteraceae.

The breakdown of SI

Although the Asteraceae exhibit a high proportion of species with SI, an important fraction of the species (37%) exhibit a partial or complete breakdown of SI. Our ML results indicated that the proportion of forward to total transitions was 0.59 and that the breakdown of SI occurred approximately twice as frequently on internal nodes compared with external nodes, and in virtually all tribes. The evolutionary transition from SI to either PSI or SC will depend on the genetic mechanisms controlling the breakdown of SI and the strength of selection for self-fertility (Levin, 1996; Good-Avila & Stephenson, 2002). A breakdown of SI is known to be caused by at least three possible mechanisms: (1) mutations at the S-locus, (2) the influence of unlinked modifier loci and (3) polyploidization or S-gene duplication (de Nettancourt, 2001; Stone, 2002). Firstly, mutations at the S-locus that inactivate the S-gene products typically cause self-fertility in mutant individuals (Porcher & Lande, 2005). Depending on the nature of the mutant, the number of S-alleles, the selfing rate and levels of inbreeding depression, populations are expected to evolve to SI or SC (Charlesworth & Charlesworth, 1979). Because the gene regulating S is unknown in the Asteraceae, the role of S-linked mutations in the dissolution of SI is difficult to assess, but mutations at the S-locus are believed to have resulted in the appearance of the SC taxon Stephanomeria exigua Gottlieb ssp. coronaria (Greene) Gottlieb from its SI ancestor Stephanomeria exigua Nutt. (Brauner & Gottlieb, 1987) and the breakdown of SI in Carthamus flavescens L. (Imrie & Knowles, 1971).

Secondly, if a breakdown in SI is caused by an unlinked modifier locus, it is more difficult for SC to completely invade a self-incompatible population (Porcher & Lande, 2005). However, if variation at an unlinked modifier locus causes PSI, PSI may be evolutionarily stable (Vallejo-Marín & Uyenoyama, 2004). If the modification of SI is caused by multiple unlinked loci, as found in Campanula rapunculoides L. (Good-Avila & Stephenson, 2002), then PSI may exhibit even broader conditions for stability if polymorphism in modifiers of SI is similar to polymorphism in alleles promoting selfing vs outcrossing (Latta & Ritland, 1993). Indeed, several detailed crossing studies in the Asteraceae have found evidence that PSI is caused by variation in unlinked modifier loci [Senecio squalidus L. (Hiscock, 2000); Aster furcatus Burgess ex Britton & A. Brown (Reinartz & Les, 1994) and Chrysanthemum L. (Ronald & Ascher, 1975)]. Experimental studies have identified several unlinked modifiers of the S-locus (Hancock et al., 2003), potentially explaining why many SI species exhibit some variation in the degree of SI in response to environmental or genetic background conditions (reviewed in Levin, 1996; Stephenson et al., 2000).

Thirdly, polyploidization may cause a breakdown of SI. A recent review of the effect of polyploidy on the breakdown of SI in angiosperms concluded that the breakdown of SI in polyploid species with gametophytically controlled SI can result from competitive inhibition between heteroallelic pollen but that, overall, there is no strong association between polyploidy and a complete loss of SI in angiosperms, especially for species with sporophytically controlled SI (Mable, 2004). An analysis of the coevolution of the breeding system with polyploidy in the Asteraceae confirms that there is no overall support for a loss of SI after polyploidization (M. M. Ferrer and S. V. Good-Avila, manuscript in preparation). In sum, there is some evidence that the loss of SI may be caused by S-linked mutants or modifier loci in the Asteraceae but not by polyploidization.

Is there evidence that the breakdown in SI is associated with island or small population effects? Baker (1955) predicted that founder effects during island colonization should select for the breakdown of SI as a result of a predicted reduction in the number of S-alleles and compatible mates in founder populations. However, there is not strong support for this hypothesis in the Asteraceae. There are four groups of island species represented in our data set: the Madieae (Hawaiian silverswords) and members of the genus Bidens colonized Hawaii, members of the genus Malacothrix colonized islands off southern California and the genus Sonchus reached the Canary Islands. Two of these groups, Bidens and Malacothrix, exhibit a breakdown of SI with island colonization (Williams, 1957; Davis, 1986; Sun & Ganders, 1986; Sun & Ganders, 1988), while two do not: the Hawaiian silverswords (Carr & Powell, 1986) and island members of the genus Sonchus (Kim et al., 1999). We suggest that the differential response of these groups to island colonization may be related to the evolution of life histories in the groups: the silverswords and many species of Sonchus are long-lived perennials, while many species of Bidens and Malacothrix are annuals or short-lived perennials (M. M. Ferrer and S. V. Good-Avila, manuscript in preparation).

The gain of SI

Perhaps the most surprising finding of this study is the evidence that the loss of SI is reversible. The finding that SC can revert to PSI or SI in the Asteraceae could be explained by several factors, including (1) de novo generation of SI systems and (2) the retention of genes involved in mate recognition. Firstly, in a parsimony analysis of the number of independent gains and losses of SI in the angiosperms, Weller et al. (1995) concluded that SI systems have evolved independently at least 21 times. In this light, it is interesting that one of the broadest macrophylogenetic studies of the evolution of SI was carried out in the Polemoniaceae (Barrett et al., 1996) and multiple origins of SI were needed to explain the phylogeny. Although this was considered unlikely at the time, recent studies reveal the presence of multiple SI systems in the Polemoniaceae: gametophytic SI occurs in Phlox L. (Levin, 1993), sporophytic SI in Linanthus Benth. (Goodwillie, 1997), and heterostylous SI in the genus Aliciella Brand. (M. Tommerup and M. Porter, unpublished). The analyses presented here also suggest that a de novo origin of SI is possible in the subfamily Asteroideae within the tribes Astereae or Senecioneae or within the tribe Cichorieae, because most of the basal members in Lactuca (node M; Fig. 3b) or the sister clade Malacothrix (node K; Fig. 3b) are partially self-incompatible or self-compatible and yet the derived members of the Malacothrix clade and the entire derived Sonchus clade (node N; Fig. 3b) are self-incompatible.

Secondly, a gain of SI could occur because the ancestral SI system is restored. If the partial or complete breakdown of SI is caused by an S-linked mutation, S-allele diversity is expected to erode, making the restoration of SI more difficult (Igic et al., 2004), although the timescale of the erosion of S-allele diversity relative to, for example, speciation is unknown. However, if SI is lost as a result of unlinked modifier alleles, it could theoretically be regained through selection or complementation of modifier loci after hybridization. There is experimental evidence that SI can be restored via these pathways. Studies in the self-compatible plant Arabidopsis thaliana (L.) Heynh. identified remnants of the Brassicaceae SI system: three female-linked (S-locus receptor kinase gene (SRK)) S-alleles have been identified (Shimizu et al., 2004) and SI was restored to one strain by transgenically inserting functional copies of the female (SRK) and male (S-locus Cys-rich gene (SCR)) S-alleles (Nasrallah et al., 2004). In addition, there is evidence that SI can be restored or maintained through hybridization. Restoration of SI was achieved by crossing two self-compatible races of Lycopersicon hirsutum HBK in Peru (Rick & Chetelat, 1991) and hybrid individuals of two self-incompatible species of Rorippa (Brassicaceae) maintain SI (Bleeker, 2004). These experiments show that SC is not always irreversible and suggest that the role of hybridization and gene flow in the evolution of SI warrants further study.

Our analyses indicate that the total rate of loss of SI is as high as or higher than the rate of gain of SI, and yet 65% of the species in the family are SI. This implies that there has been selection to restore SI, as suggested by the relatively high rates of the gain of SI on terminal branches. Although there will be diverse factors determining the final mating system of any species, these results are consistent with the hypothesis that species that lose SI continue to act as outcrossers. There is evidence that species that lose SI adopt other breeding systems such as dioecy (Miller & Venable, 2000) or maintain temporal or spatial separation of the reproductive functions that both encourage outcrossing and reduce interference between male and female functions (Routley et al., 2004). In the Asteraceae, many self-compatible taxa exhibit mixed mating systems: outcrossing rates in Bidens species in Hawaii range from 0.425 to 0.881 (Sun & Ganders, 1988), in H. annuus they range from 0.6 to 0.91 (Ellstrand et al., 1978), and in Prionopsis ciliata DC they were found to have a mean of 0.57, while only a few self-compatible species have been found to predominantly self-fertilize, such as Tragopogon mirus Ownbey (0.07) (Soltis et al., 1995). Partially self-incompatible species have also been found to be predominantly outcrossing, with Rutidosis leptorrhynchoides Hook showing an outcrossing rate of 0.82 (Young et al., 2002) and Flourensia cernua DC an outcrossing rate of 1.00, despite wide variation in strength of SI among individuals (Ferrer et al., 2004).

The evolutionary role of PSI

Our survey of the literature found that 10% of the species in the Asteraceae are partially self-incompatible, although this number is probably a minimum estimate (Levin, 1996). Interestingly, the majority of these species exhibited within-population variation in the degree of SI in which between 10 and 15% of the population was fully self-compatible, something that may indicate the frequency of mutations causing a weakening of SI or an evolutionary strategy. However, the results of our macrophylogenetic analyses were somewhat equivocal concerning the evolutionary role of PSI. The ML analyses indicated that SI can lead to PSI, which can lead to SC, especially on internal branches, while SC can revert to PSI or SI, especially on terminal branches. This result could be caused by the presence of PSI among species designated as self-compatible, or by the restoration of PSI in self-compatible species. However, the parsimony analyses indicated that PSI is a derived state only in terminal taxa. In either case, the only evidence that PSI is maintained over longer evolutionary periods is presented in the tribes Astereae, Calenduleae and Senecioneae, where ML reconstructed PSI as the most probable state at several deep internal nodes. Therefore, our analyses suggest that PSI may be an important component of both the maintenance and breakdown of SI in the Asteraceae, but its full evolutionary role requires further theoretical and empirical study.

In conclusion, we have found evidence that the evolution of the breeding system has been dynamic in the Asteraceae and that SI can be regained. We suggest that the difference between the results presented here and those obtained by Igic et al. (2006) may also be attributed to differences in the genetic basis, physiology and population dynamics of SI between sporophytic SI in the Asteraceae and gametophytic SI in the Solanaceae. We surveyed 571 species in the Asteraceae and 218 in the Solanaceae and found that the proportion of self-compatible to self-incompatible species is essentially reversed in the two families: 27% of the species in the Asteraceae are SC and 73% PSI or SI, while in the Solanaceae 67% are SC and 33% SI or PSI, although the two families are relatively closely related and of similar age (Wikström et al., 2001). Unravelling the mechanisms responsible for differences in the evolution and breakdown of SI between families will require consideration of many additional factors that undoubtedly also influence these dynamics, such as growth habit, life-span, apomixis, clonality and latitude. These coevolutionary dynamics will be explored in future studies.


We thank Adam Richman, Joshua Kohn, Marcy Uyenoyama and Sam Vander Kloet for helpful comments on a previous version of this manuscript. This research was funded by postdoctoral fellowship support to MMF from the Organization of American States (OAS) and by a research grant to SG-A from the Natural Sciences and Engineering Research Council (NSERC) of Canada.