The divergence history of the perennial plant Linaria cavanillesii confirms a recent loss of self‐incompatibility

Many angiosperms prevent inbreeding through a self‐incompatibility (SI) system, but the loss of SI has been frequent in their evolutionary history. The loss of SI may often lead to an increase in the selfing rate, with the purging of inbreeding depression and the ultimate evolution of a selfing syndrome, where plants have smaller flowers with reduced pollen and nectar production. In this study, we used approximate Bayesian computation (ABC) to estimate the timing of divergence between populations of the plant Linaria cavanillesii that differ in SI status and in which SI is associated with low inbreeding depression but not with a transition to full selfing or a selfing syndrome. Our analysis suggests that the mixed‐mating self‐compatible (SC) population may have begun to diverge from the SI populations around 2810 generation ago, a period perhaps too short for the evolution of a selfing syndrome. We conjecture that the SC population of L. cavanillesii is at an intermediate stage of transition between outcrossing and selfing.


Introduction
Hermaphrodite plants have evolved numerous strategies to prevent self-fertilization and thus to avoid the deleterious effects of inbreeding depression (Barrett, 2002). One of the most effective of these is self-incompatibility (SI), a genetic system leading to the rejection of pollen grains by pistils carrying the same alleles at the self-incompatible locus (S-locus) (Castric & Vekemans, 2004;Franklin-Tong, 2008;Shimizu & Tsuchimatsu, 2015). SI is widespread in flowering plants (60% of angiosperm species are considered SI, , but many species have independently evolved self-compatibility (SC) through the breakdown of SI. Indeed, the loss of SI probably represents one of the most common evolutionary transition to have occurred in the history of angiosperms (Stebbins, 1974;Igic et al., 2008;Goldberg et al., 2010;. Understanding why some populations are resistant to the establishment of SC individuals, whereas SC fixes in others, is central to our quest to understand the diversity of plant mating systems. Two main reasons have been given to explain the frequent independent transitions from SI to SC. The first invokes a direct fitness advantage gained by SC individuals that pass an extra copy of their genes to the next generation by self-fertilizing their progeny (automatic selection hypothesis; Fisher, 1941). Mutations conferring SC and increased selfing should thus spread in a population in the absence of countervailing forces, such as inbreeding depression (Charlesworth & Charlesworth, 1987) or pollen discounting, whereby self-fertilization compromises a plant's outcrossing potential (Nagylaki, 1976;Holsinger, 1988;Harder & Barrett, 1995;Harder & Wilson, 1998;Porcher & Lande, 2005).
The second reason invokes an advantage of reproductive assurance (Darwin, 1876;Lloyd, 1965;Jain, 1976;reviewed in Busch & Delph, 2012). In SI individuals, the number of available partners depends both on population size and the presence of external vectors to move pollen from one individual to another. In small and isolated populations, SI alleles can be lost by drift (Wright, 1939), so that different individuals are more likely to carry the same alleles and therefore to reject each other's outcross pollen, with consequently lost mating opportunities (Charlesworth & Charlesworth, 1979;Busch & Schoen, 2008;Young & Pickup, 2010;Stone et al., 2014). Mates may also be limited during episodes of colonization, where single individuals establish at new localities (Baker, 1955;Pannell, 2015;Pannell et al., 2015). Similarly, in environments where pollinators are scarce, obligate outcrossing individuals will be less able to reproduce than their closely related SC counterparts (Baker, 1955;Cheptou & Massol, 2009;Massol & Cheptou, 2011), even if selfing leads to inbreeding depression (e.g. Herlihy & Eckert, 2002; but see Layman et al., 2017). SC may also confer an outcrossing advantage, particularly in small populations, because SC individuals can mate with any individuals in the population, whereas SI individuals will be prevented from mating with individuals with the shared S-alleles.
Major changes are expected to follow a transition from SI to SC and a concomitant increase in the selfing rate. First, deleterious mutations are expected to be purged by selection, thus resulting in decreased inbreeding depression (Lande & Schemske, 1985;Barrett & Charlesworth, 1991). Although purging may be quick, especially if the transition to selfing coincides with strong bottlenecks (e.g. Busch, 2005a;Guo et al., 2009;Ness et al., 2010), many partially selfing species may maintain high levels of inbreeding depression for long periods (Winn et al., 2011). This might be because inbreeding depression in such populations is so strong that no selfed individuals survive to reproductive maturity, thus removing any possibility for selection among selfed progeny that differ in their load of deleterious mutations (Lande et al., 1994). Alternatively, inbreeding depression might require a longer period of time to be purged (Busch, 2005a).
Because plants that are able to self-fertilize autonomously do not need pollinators, the evolution of SC and self-fertilization is often associated with the evolution of a reduced investment in pollinator attraction and reward, such as through smaller flowers, lower pollen/ovule ratios and reduced nectar production (Ornduff, 1969;Goodwillie et al., 2010;Sicard & Lenhard, 2011). This 'selfing syndrome' has been reported in numerous plant species (reviewed in Goodwillie et al., 2010) and appears to be able to evolve rapidly after a transition to SC and self-fertilization (Bodbyl Roels & Kelly, 2011). However, not all SC species show a shift in flower morphology towards the selfing syndrome (e.g. Busch, 2005b;Fenster & Mart en-Rodr ıguez, 2007;Dart et al., 2012). The reasons for this variation are not well understood and understanding the dynamic of its evolution thus remains interesting.
The consequences and timing of a transition to SC have been investigated in a number of cases where related species or populations differ in their SI status. For instance, Capsella grandiflora (Foxe et al., 2009;Guo et al., 2009), Clarkia xantiana spp. parviflora (Runions & Geber, 2000) and Arabidopsis thaliana (Tang et al., 2007) are SC species that show a selfing syndrome and have been recently derived from their SI-related species (Runions & Geber, 2000;Foxe et al., 2009). Coalescent-based analyses have revealed that Leavenworthia alabamica, a predominantly SI species, experienced a transition to SC in at least two of its populations more recently than 150 000 years ago, with the most recent transition (about 48 000 years ago) associated with negligible changes in floral morphology (Busch et al., 2011). Similarly, Foxe et al. (2010) inferred that some populations of Arabidopsis lyrata probably evolved higher selfing rates as recently as 10 000 years ago, also without showing any accompanying shift towards a selfing syndrome.
The perennial herb Linaria cavanillesii shows variable levels of SI among its populations and underwent a recent breakdown of its gametophytic SI system in part of its range (Voillemot & Pannell, 2016). The one known SC population of L. cavanillesii appears to be at an intermediate step of its evolution from SI to self-fertilization. On the one hand, whereas occasional selfing in the SI population is accompanied by high inbreeding depression, inbreeding depression is absent in the SC population (Voillemot & Pannell, 2017). On the other hand, the SC population continues to show substantial outcrossing (selfing rate s = 0.59; Voillemot & Pannell, 2016), a situation in which the purging of inbreeding depression might be ineffectual (Winn et al., 2011). Nor does the SC population show any evidence of a selfing syndrome, with plants from both SC and SI populations producing large flowers with a long nectar-containing spurs, and no reduction in pollen production (Voillemot & Pannell, 2016).
We hypothesized that maintenance of an outcrossing syndrome in L. cavanillesii despite its loss of SI and inbreeding depression might be attributable to a very recent transition towards SC. Accordingly, we used an approximate Bayesian computation framework (ABC) (Beaumont et al., 2002) to infer the time of divergence between SI and SC populations of L. cavanillesii on the basis of molecular variation at 16 microsatellite loci. ABC uses coalescent simulations to generate genetic data expected under a given demographic model. Simulated data are then compared with observed data to test the relevance of the demographic model, bypassing the need to compute full likelihoodsa major limitation for most demographic models (Beaumont et al., 2002;Csill ery et al., 2010). Our results confirm that the SC population of L. cavanillesii is very recently derived from neighbouring SI populations and indeed represents one of the most recent transitions from outcrossing to partial selfing yet documented.
cliffs at sites ranging from 300 to 1400 m in altitude. Between May and June, individuals grow multiple herbaceous shoots from a subwoody perennial base, with inflorescences developing towards the end of the branches. Yellow flowers have long floral nectar spurs and are pollinated by a variety of bees and bumblebees. Around 30 days after fertilization, mature capsules open and seeds are dispersed passively, probably aided by wind.

Sampled populations
Six populations of L. cavanillesii have been the focus of detailed previous analysis (Voillemot & Pannell, 2016): a fully SC population (COV), two SI populations (BER, DEN) and three mixed populations with both SI and leaky SI plants (BUI, RUB, ZAR) (Voillemot & Pannell, 2016). The species has a strong population-genetic structure, with mean pairwise F ST = 0.56 (min F ST = 0.32, max F ST = 0.75), and, for the SC population in particular, high divergence from each of the two closest and northernmost SI populations (F ST = 0.63 and 0.56 for COV-DEN and COV-BER, respectively; Voillemot & Pannell, 2016). Population structure combined with preliminary simulations involving all pairs of populations (Fig. S1) suggests that the SC population diverged recently from one of the two closest populations. We thus focused our investigation on these three northern populations (i.e. the SC COV population, and BER and DEN SI populations; Fig. 1).

Genotyping
We collected leaf material from 28 to 41 individuals from each of the three natural populations and extracted DNA with the DNeasy 96 Plant kit (Qiagen, Hombrechtikon, Germany). Sixteen polymorphic microsatellite markers were amplified by PCR (Biometra thermocycler) using the following reagents: 1 9 PCR mix: 2 ng lL À1 template DNA, 10 9 PCR Buffer, 25 mM MgCl 2 , 5 9 Q-solution, 2.5 mM dNTP, 0.2 lM of each primer and 0.5 U lL À1 of Taq DNA polymerase (HotS-tarTaq â ; Qiagen). The thermocycling conditions were set to 15 min at 95°followed by 32 cycles of 30 s at the annealing temperature (Voillemot & Pannell, 2016), 30 s at 72°C and 30 s at 95°C, followed by  Fig. 1 Geographical range of Linaria cavanillesii in Spain (grey oval), and the populations that have been studied in detail (Voillemot & Pannell, 2016). Three northern populations (COV, DEN and BER) were used in the ABC simulations presented here. Their selfincompatibility status (SC: self-compatible; SI: self-incompatible) revealed by controlled crosses is indicated. one cycle of 1 min at the annealing temperature and a final extension of 30 min at 72°C. PCR products were then sequenced on an ABI3100 sequencer (Applied Biosystems, Foster City, CA, USA). We used the program Genemapper â to analyse microsatellite data.

ABC test of gene flow between natural populations
Two model comparisons were performed with ABC: (i) to test for the possibility of ongoing gene flow among the three sampled populations; and (ii) to test alternative topologies of population trees for the three populations. For each test, a specific ABC approach was adopted.
We tested the hypothesis of ongoing gene flow between L. cavanillesii populations by estimating the relative posterior probabilities of two models: the 'I model' (isolation) and the 'IM model' (isolation/migration). In the I model, an ancestral panmictic population becomes subdivided into two isolated daughter populations with no gene flow between them (Fig. 2). In the IM model, the two daughter populations remain connected by gene flow after an ancestral split (Fig. 2). Specifically, under the IM model, gene flow occurs in both directions at independent rates M i = 4N i m, where M i is the number of migrants received each generation by population i, N i is the effective population size of population i, and m is the proportion of individuals in population that were migrants from the other population the previous generation. The effective population sizes of the two current populations and the ancestral one were randomly sampled from a uniform prior distribution of [0-30 000] individuals and were allowed to vary freely to accommodate the occurrence of a bottleneck at the time of population divergence (T split ). T split was sampled from a uniform prior of [0-100 000] generations. The scaled migration rate 4Nm under the IM model rate was sampled from the uniform prior of .
For each demographic model, we used the software ms to perform 10 6 multilocus coalescent simulations (Hudson, 2002). The program's output was then converted into microsatellite data sets on the basis of a generalized stepwise mutation model (GSM) assuming a mean mutation rate l = 1.25 9 10 À5 (based on current estimates of the microsatellite mutation rate for plants; e.g. Vigouroux et al., 2002;Marriage et al., 2009). The probability of an increase or decrease in the microsatellite repeat number for each mutation was modelled assuming a geometrical parameter a distributed following a uniform prior and sampled on the interval [0-0.5]. For each simulated and observed data set, an array of summary statistics for polymorphism and divergence data was computed using publicly available R-scripts (Illera et al., 2014;Rougemont et al., 2016; https://github.com/QuentinRougemont/Microsa tDemogInference). The computed summary statistics correspond to the mean and standard deviation of the number of alleles over the 16 loci (A), the allelic richness (Ar), the observed and expected heterozygosities (H o and H e ), the allele size in base pairs, the Garza-Williamson index (GW; Garza & Williamson, 2001), G ST (Nei, 1973) and dl 2 (Goldstein et al., 1995).
Relative posterior probabilities of the two models were estimated from comparisons between the vector of observed summary statistics and a reference table comprising summary statistics computed from 2 million simulations (one million simulations under each model). The 500 replicate simulations (out of 2 9 10 6 ) falling nearest to the observed summary statistic values were retained, and these were weighted by an Epanechnikov kernel that peaks when the observed statistic equals the simulated one. From the retained simulations, posterior probabilities of each model were estimated using a feed-forward neural network implementing a nonlinear multivariate regression by considering the model itself as an additional parameter to be inferred under the ABC framework, using the R package abc (Csill ery et al., 2012). Computations were performed using 50 trained neural networks and 15 hidden networks in the regression. The robustness of the ABC model comparison (i.e. the probability of correctly supporting an M model given an estimated posterior probability P) was evaluated as follows. First, 1000 pseudo-observed data sets (PODs) were randomly simulated under each compared model using similar prior distributions. To estimate the relative posterior probabilities of each model, each simulated POD was then treated by ABC in a way similar to that of the Linaria data set. Finally, we estimated the robustness, R, of our inference on the basis of the obtained empirical distribution of the posterior probability P under the two models, with R = P(P I = P | I) / [ P(P I = P | I) + P(P I = P | IM)]. P(P I = P | I) is the probability of correct support for model I with the observed posterior probability P I , and P(P I = P | IM) is the probability of wrongly supporting the I model with the observed posterior probability P, assuming here that the IM model is correct (Fagundes et al., 2007).

DIYABC test of alternative topologies in population trees
The previously described approach is robust to test the existence of migration between two sampled gene pools, a feature not hitherto implemented in the software DIYABC (Cornuet et al., 2014). However, preliminary analysis has shown that DIYABC has greater power to estimate parameters of a model with no gene flow (data not shown), and is able to deal with more than two populations. Thus, after rejecting the IM model (i.e. the hypothesis of ongoing gene flow; see Results), we tested alternative topologies describing possible evolutionary links between populations BER, COV and DEN using DIYABC (v.2.0.4) (Cornuet et al., 2014) (Fig. 3a). Specifically, we compared three alternative topologies of a model involving divergence among three populations (Fig. 3a). This model describes the subdivision T 2 generations ago of an ancestral population of size N 5 into two populations, Pop 3 and Pop 4. Pop 3 has a constant effective population size N 3 from the time of the ancestral split up to the present. In contrast, Pop 4 of size N 4 is further split into the two daughter populations Pop 1 (of size N 1 ) and Pop 2 (of size N 2 ) T 1 generations ago (Fig. 3a). All sizes for current and ancestral populations were sampled independently, fully accommodating the possibility of either a bottleneck or an expansion at the time of divergence (T 1 and T 2 ). All three possible topologies of this model were compared, with Pop 3 identified as one of BER, COV or DEN (Fig. 3a). All topologies shared the same prior distributions and were randomly simulated 300 000 times, with parameter values sampled from uniform distributions. N 1 , N 2 , N 3 , N 4 and N 5 were independently sampled from the uniform prior of [0-30 000]. The oldest time of divergence, T 2 , was sampled from the uniform prior of [0-50 000] generations. Finally, T 1 , the most recent time of divergence, was sampled for each iteration i from the uniform prior [0-T 2-i ], where T 2-i is the sampled value for T 2 at the iteration i. We assumed a mutation rate of l = 1.25 9 10 À5 for the GSM (Estoup et al., 2002). Finally, we computed an array of summary statistics for each iteration obtained for all topologies. These statistics were as follows: the mean number of alleles; the genetic diversity (Nei, 1973); the mean allele size variance; the Garza-Williamson index (GW; Garza & Williamson, 2001); F ST (Weir & Clark Cockerham, 1984); the mean classification index (Pascual et al., 2007); the shared allele distance between populations (Chakraborty & Jin, 1993); and the dl 2 distance (Goldstein et al., 1995). The relative posterior probabilities of the three topologies were estimated using the logistic regression estimate implemented in DIYABC (Fagundes et al., 2007;Beaumont, 2008) from the 1000 simulated data sets closest to the observed one. We then performed a goodness-of-fit test to evaluate the capacity of the estimated parameters to reproduce the observed data set under the best-supported scenario. We simulated 1000 data sets using a random combination of parameters sampled from the estimated joint posterior distribution, and we computed summary statistics for each simulated data set to obtain the expected distribution of the chosen statistics under the estimated scenario.

Results
Our ABC analysis strongly rejected a scenario of ongoing gene flow between the SC population COV and the two geographically nearby SI populations DEN (P Isolation = 1; robustness = 1; Table 1) and BER (P Isolation = 1; robustness = 1; Table 1). Our analysis also provides strong support for a scenario of ongoing gene flow between the two SI populations BER and DEN (P Migration = 0.80; robustness = 1; Table 1). Note that the measures of robustness here represent the probability of correctly supporting the M model with the posterior probability estimated by ABC and can be interpreted as 1 minus the P-value.
Using the software DIYABC (Cornuet et al., 2014), we compared different topologies of a three-population model of divergence (Fig. 3a). Two successive population splits were considered under this model, at two different times in the past T 1 and T 2 , leading to three current populations from a single ancestral one. The best-supported topology corresponds to an ancestral split into two SI populations followed by a more recent split between the current SC population COV and the current SI population BER. This scenario is supported with a relative posterior probability of 0.61 [0.57-0.64]. The alternative scenarios 1 and 2 (Fig. 3a)  Finally, we estimated the parameters describing the best-fitting topology using the same statistical framework implemented in DIYABC (Fig. 3b). With the assumed mutation rate of 1.25 9 10 À5 , current effective population sizes were estimated as N BER = 3460 (95% CI: 1120-11 700), N COV = 850 (95% CI: 193-2580) and N DEN = 9640 (95% CI: 4750-19 800) individuals (Fig. 3b). The ancestral split leading to the two current SI lineages occurred an estimated 10 000 generations ago (95% CI: 4060-31 300), whereas that leading to the current SC COV population occurred 2810 generations ago (95% CI: 714-14 400; Figs 3b and S2). Finally, we used a goodness-of-fit test to assess the BER (SI) COV (SC) N 1 = 3460  N 5 = 8760  N 4 = 23300  N 3 = 9640  N 2 = 850  T 1 = 2810  T 2 = 10800  0 DEN (SI)  The posterior probabilities of the I model against the IM model obtained by ABC analysis are provided. The robustness is the probability that the best-supported model is indeed the correct one given the empirically observed posterior probabilities. Robustness was obtained using 1000 pseudo-observed data set and computing the posterior probabilities of each model by ABC analysis. Populations 1 and 2 refer to the population pair for which the corresponding models were compared. Incompatibility status is indicated in parenthesis for each population (SC: self-compatible vs. SI: self-incompatible).
ability of a scenario with the estimated parameter values to reproduce the observed data set. The best-supported combination of scenario and associated parameter values was able to reproduce all of the 36 observed values of summary statistics (Table S1, Figs S3 and S4), suggesting that the proposed scenario is not only the best among the arbitrary set of compared scenarios, but is also able to explain faithfully most of the patterns of polymorphism and divergence observed among sampled natural populations of L. cavanillesii.

Discussion
We used ABC simulations to compare the fit of alternative demographic models of divergence to empirical data of the plant L. cavanillesii, a species with amongpopulation variation in its SI status (Voillemot & Pannell, 2016). Overall, our analyses are consistent with a scenario where the lineage ancestral to COV (a mixedmating population that has lost its SI system) has been diverging from ancestral SI populations for an estimated 2810 generations. Our analyses reject a scenario of ongoing gene flow between the SC and each of the two SI populations, but are consistent with a scenario of ongoing gene flow between the two SI populations. Our simulations also suggest that the effective population size of SC population is about ten times smaller than the SI populations and the inferred ancestral population. The inferred divergence time between DEN (SI) and the pair of populations COV (SC) and BER (SI) was nearly four times greater than between COV (SC) and BER (SI).

Accuracy and robustness of our results
Our inferences are necessarily based on very simple models of divergence. The utility of such ABC inference depends critically on the number of scenarios tested, as well as on whether the competing models can be accurately distinguished from one another. We believe that, despite their simplicity, the models explored are sufficiently different from one to another to provide a firstorder insight into the history of divergence between L. cavanillesii populations. Moreover, our estimates of robustness based on ABC simulation indicate that the set of summary statistics chosen to describe the genetic data contains sufficient information to discriminate alternative scenarios. An uninformative set of statistics would lead to ambiguous model comparisons, with low variation of relative posterior probabilities among competing scenarios. Importantly, the best-supported scenario was able to reproduce all of the 36 statistics available to describe patterns of polymorphism and divergence in natural populations of L. cavanillesii. This gives us some confidence that much of the historical signal is captured in the selected scenarios. In particular, the absence of migration between the SC and SI populations points to strong genetic isolation and allows an estimate of the divergence time between the two populations.
In principle, we might have tested more complex models including periods of strict isolation followed by secondary contact. However, from our current simple sampling scheme with two or three populations and a limited number of microsatellite markers, increasing the number of model comparisons would have implied comparing submodels nested within those already chosen. For instance, our test for ongoing gene flow does not allow us to infer the origin of the putative migration: either the ongoing gene flow results from a secondary contact or results from continuous allelic exchanges since the populations split. Both theory (Alcala & Vuilleumier, 2014) and simulation studies  show that these two submodels of ongoing gene flow can only be distinguished in extreme situations. For instance, strong statistical support for secondary contact is only possible for cases where it occurred relatively recently after a long period of isolation. One potentially important issue that we did not consider was the effect of a change in the mating system on population sizes and divergence times. Although selfing can compress coalescent times, the SC population is only partially selfing, and modelling has shown that coalescent trees for an SI population (s = 0) vs. one with intermediate selfing (s = 0.5) are difficult to distinguish (Nordborg & Donnelly, 1997), partly because a few generations of outcrossing are sufficient to erase signatures of inbreeding even in populations with a long history of selfing.
Our analysis suggests that the SC and the SI populations began to diverge relatively recently. This inference is of course sensitive to assumptions made for the mutation rate at markers chosen for the analysis. We assumed a microsatellite mutation rate of 1.25 9 10 À5 , which corresponds to a lower bound of values estimated for other plant species (e.g. Vigouroux et al., 2002;Marriage et al., 2009). Lower mutation rates would of course elevate the estimated divergence time proportionally, but it is also possible that the mutation rate for L. cavanillesii was higher than that assumed, so that divergence times might also have been lower than inferred. Clearly, divergence time estimates based on other markers would be valuable, but it seems unlikely that the inferred recent divergence between SC and SI populations would be radically altered.
Our analysis arrived at evidently robust estimates for other model parameters, including the effective population sizes of the populations concerned and rates of migration between them. It is important to emphasize that these estimates are based on the assumption of uniformity in the population sizes and rates modelled. Migration and mutation rates are known to vary substantially across the genome (Hodgkinson & Eyre-Walker, 2011), and such variation has only recently been included in demographic inference approaches (Cruickshank & Hahn, 2014;Roux et al., 2016). Moreover, mutation rates, effective population sizes and migration rates in natural populations are also likely to be heterogeneous over time, and it is not clear how such variation might impact on the details of our results. Our estimates of the effective population sizes for the sampled populations seem implausibly high to us, given that the current census sizes are probably in the order of hundreds rather than thousands of individuals. However, the plants grow on cliffs, and it is possible that we missed subpopulations at inaccessible sites in the vicinity that may be connected by gene flow to the populations we sampled. Alternatively, exaggerated population size estimates might suggest that the mutation rate we assumed was too low. In this case, our estimates of divergence times would be even shorter than those in our analysis. Notwithstanding these uncertainties, the parameter values inferred here conform to patterns we might expect in relative terms, both in the light of theory (Nordborg & Donnelly, 1997;Charlesworth & Pannell, 2001) and conclusions reached from other studies (Gl emin et al., 2006;Wright et al., 2008;Duminil et al., 2009). In particular, the lower effective size of the SC population compared with the SI population is consistent with its lower genetic variation (Voillemot & Pannell, 2016), as well as with the expectation that selfing populations should have a lower N e than outcrossing population in general (Nordborg & Donnelly, 1997;Charlesworth & Pannell, 2001;Roze & Rousset, 2004). The inferred genetic isolation (absence of ongoing gene flow) between the SC and the SI populations is also to be expected, because inbreeding populations tend to invest less in dispersal (reviewed in Charlesworth & Pannell, 2001;Barrett et al., 2014) and thus to be more isolated from one another than outcrossing populations (Hamrick & Godt, 1996;Duminil et al., 2009;Barrett et al., 2014;but see below).

Implications for understanding the loss of SI in L. cavanillesii
Variation in the compatibility status among populations of L. cavanillesii is almost certainly due to the loss of SI rather than its gain (Schoen et al., 1997;Igic et al., 2006;Goldberg & Igi c, 2008. We may thus suppose that the loss of SI in the lineage contributing to the SC population (COV) occurred at the time of, or after, its divergence from the SI populations from which it is derived, that is at least as recently as the inferred 2810 generations ago. L. cavanillesii is a longlived perennial plant, with individuals that likely live for at least a decade in age-structured populations. Although it is difficult to estimate the generation time of L. cavanillesii accurately, if we allow for a mutations rate per year of up to ten times lower than that assumed in our analysis, the age of SC in the species is very plausibly less than 30 000 years. Given the relatively small size of the populations sampled and their inferred genetic isolation, especially the SC population, it is unlikely that they have persisted at the same localities over the entire period of their divergence. Our inference should thus be interpreted broadly; that is, the SC population corresponds simply to the lineage of plants that may have given rise to the sampled populations by colonization from elsewhere, possibly after the time of divergence inferred by our study. Such colonization would likely have reduced the effective size of the populations we sampled, a pattern consistent with the inferred effective population size of the SC population being about ten times smaller than the SI populations and the inferred ancestral population.
To what extent might the relatively recent loss of SI in L. cavanillesii account for the differences between the derived SC and the SI populations in terms of patterns of inbreeding depression (Voillemot & Pannell, 2017) and their floral biology (Voillemot & Pannell, 2016)? Although putative purging has been observed in many selfing SC species (e.g. Weber et al., 2012;Dart & Eckert, 2013) and simulated under different conditions (e.g. Whitlock, 2002;Winn et al., 2011;Marchini et al., 2016), the dynamics of its evolution in natural populations is poorly known and difficult to assess. In L. cavanillesii, it seems likely that sufficient time has elapsed since the origin of SC for the population's genetic load to have been purged (Husband & Schemske, 1996;Crnokrak & Barrett, 2002). Indeed, Winn et al. (2011) performed simulations to see how quickly purging could occur after a transition to self-fertilization and found that even with low selfing rates, purging can occur rapidly (within a few hundred to thousand generations). However, it is not possible with our data to identify which factors were the main drivers of purging in our recently derived SC population: colonization through bottleneck, high selfing rates during colonization or a combination of both these scenarios could also contribute to the selection of SC and/or the purging of inbreeding depression. The relatively small population size, low genetic diversity and high population isolation of the SC population (Voillemot & Pannell, 2016) are all conditions that might accelerate the purging of deleterious alleles from a population (Kirkpatrick & Jarne, 2000;Whitlock, 2002;Pujol et al., 2009;Oakley & Winn, 2012;Pekkala et al., 2012;Lohr & Haag, 2015;Hedrick & Garcia-Dorado, 2016).
As noted in the Introduction, the autonomous SC population of L. cavanillesii shows no evidence for the evolution of a selfing syndrome (Voillemot & Pannell, 2016), that is a reduction in pollen production or allocation to attracting and rewarding pollinators (Ornduff, 1969;Sicard & Lenhard, 2011). One explanation could be that genetic variation in the SC lineage of L. cavanillesii was low, for example as a result of population bottlenecks, and that it was thus unresponsive to selection. Alternatively, it could simply be that 3000 generations have been insufficient for the evolution a selfing syndrome. In their study using the hermaphrodite freshwater snail Physa acuta, No€ el et al. (2016) found that although early-acting inbreeding depression was purged rapidly, male allocation remained unchanged over the 20-generation course of their experiment. In contrast, Bodbyl Roels & Kelly (2011) have shown that morphological changes relevant to a selfing syndrome can evolve within a few generations if there is sufficient variation upon which selection can act, a finding consistent with what is known more generally about the capacity of populations under natural selection to undergo rapid morphological change (e.g. Seeley, 1986;Losos et al., 1997;Reznick et al., 1997;Lendvai & Levin, 2003).
Other investigations of the divergence time among populations with contrasting mating systems have also inferred relatively recent transitions that are not associated with a selfing syndrome (e.g. Foxe et al., 2010;Busch et al., 2011). For instance, in SC populations of the perennial A. lyrata, the loss of SI occurred an estimated 10 000 years ago, and flowers still show an outcrossing syndrome (Hoebe et al., 2009;Foxe et al., 2010). Similarly, in the annual L. alabamica, a population that evolved SC 48 000 years ago displays only slight changes in floral size and pollen production (Busch et al., 2011). These cases, and that of L. cavanillesii, contrast with that of the annual Capsella rubella, where a selfing syndrome has evolved in fewer than an estimated 20 000 years (Foxe et al., 2009;Slotte et al., 2012). It remains unclear why some species evidently evolve a selfing syndrome relatively rapidly, whereas others, such as L. cavanillesii, do not.
On balance, it seems to us implausible that the SC lineage of L. cavanillesii has been unresponsive to selection for increased selfing and reduced allocation towards pollinator attraction and reward for thousands of generations, but observations of other species referred to above clearly present a similar conundrum. Given that inbreeding depression has evidently been purged from the SC lineage of L. cavanillesii (Voillemot & Pannell, 2017), it is difficult to understand why selection should maintain outcrossing and an outcrossing syndrome in this particular case. It of course remains possible that SC evolved even more recently than the split between the lineages sampled, for example that the transition was facilitated by a severe bottleneck that depleted genetic variation, rendering the population unresponsive to selection for increased selfing.

Supporting information
Additional Supporting Information may be found online in the supporting information tab for this article: Table S1 Summary statistics from the ABC simulations. Figure S1 Posterior probability for the time of split between the SC population (COV) and the five other known populations (BER, DEN, BUI, RUB, ZAR; respectively in red, yellow, green, blue and purple line). Figure S2 Posterior probability for the time of split between populations COV and BER (red line), or between DEN and the combined pair COV-BER (blue line). Figure S3 Distributions of estimated posterior probabilities. Figure S4 Principal component analysis of summary statistics using 1000 data sets simulated with the prior distributions for the chosen parameters, the observed data and data from the posterior predictive distribution.