Multiple aspects of the selfing syndrome of the morning glory Ipomoea lacunosa evolved in response to selection: A Qst‐Fst comparison

Abstract The frequent transition from outcrossing to selfing in flowering plants is often accompanied by changes in multiple aspects of floral morphology, termed the “selfing syndrome.” While the repeated evolution of these changes suggests a role for natural selection, genetic drift may also be responsible. To determine whether selection or drift shaped different aspects of the pollination syndrome and mating system in the highly selfing morning glory Ipomoea lacunosa, we performed multivariate and univariate Qst‐Fst comparisons using a wide sample of populations of I. lacunosa and its mixed‐mating sister species Ipomoea cordatotriloba. The two species differ in early growth, floral display, inflorescence traits, corolla size, nectar, and pollen number. Our analyses support a role for natural selection driving trait divergence, specifically in corolla size and nectar traits, but not in early growth, display size, inflorescence length, or pollen traits. We also find evidence of selection for reduced herkogamy in I. lacunosa, consistent with selection driving both the transition in mating system and the correlated floral changes. Our research demonstrates that while some aspects of the selfing syndrome evolved in response to selection, others likely evolved due to drift or correlated selection, and the balance between these forces may vary across selfing species.

in selfing species have been less studied than flower or petal size.
The reasons for these trait reductions remain unclear. While selection favoring the spread of selfing itself is theoretically well supported through the automatic advantage (Fisher, 1941) and reproductive assurance (Busch & Delph, 2012), there has been little effort to determine whether reductions in floral traits that do not directly affect selfing rate result from natural selection or from the accumulation of mutations through genetic drift. Both explanations are plausible. On the one hand, the repeated evolution of the selfing syndrome represents a convergent evolutionary response to a change in mating system, and convergent evolution can often indicate similar selective pressures (Losos, 2011;Stern, 2013). Several types of selection have been proposed to drive selfing-syndrome evolution. For example, because costly floral displays are no longer needed to attract pollinators in highly selfing species, selection could favor reallocation of resources away from these displays to other fitness-enhancing functions, such as higher fruit-to-flower and seedto-ovule ratios (Goodwillie et al., 2010). Similarly, because less pollen is needed for self-fertilization and as a reward for pollen-feeding insects, selection could reallocate resources used to produce pollen to other functions (Goodwillie et al., 2010). Alternatively, reduced floral size and display size may be favored because they reduce attractiveness to herbivores (McCall & Irwin, 2006;Sicard & Lenhard, 2011).
Finally, selection may often favor faster development in highly selfing species because they tend to grow in marginal habitats (Lloyd, 1992;Snell & Aarssen, 2005).
By contrast, genetic drift is also a plausible explanation for the evolution of selfing-syndrome traits. Without the necessity of attracting insect pollinators, selfing plants no longer experience purifying selection to maintain display traits and may accumulate mutations that reduce floral size, pollen production, nectar production, and display size through genetic drift (Duncan & Rausher, 2013a). Moreover, self-pollination increases homozygosity, which reduces effective population size and increases linkage disequilibrium, thereby increasing the potential effects of genetic drift (Charlesworth & Wright, 2001;Pollak & Sabran, 1992). Random mutations in coding sequence generally have an adverse effect on protein function (Eyre-Walker & Keightley, 2007), and, in genes controlling floral size, mutations that alter protein function often cause floral size reduction (Krizek & Fletcher, 2005). Relatedly, inbreeding depression can manifest as generally reduced size as a result of the accumulation of mildly deleterious variants (Charlesworth & Charlesworth, 1987). For these reasons, genetic drift causing the accumulation of mutations that reduce floral traits should not be ignored as a possible cause of the general pattern of reductions in the selfing syndrome in any individual species. Yet to our knowledge, there have been few investigations into the role of selection in shaping selfing-syndrome traits that do not directly affect selfing rate.
Two recent studies-on corolla size in the morning glories Ipomoea lacunosa and Ipomoea cordatotriloba (Duncan & Rausher, 2013a) and flower size in Collinsia heterophylla (Strandh, Jönsson, Madjidian, Hansson, & Lankinen, 2017)-indicated a role for natural selection in the divergence of both floral and reproductive traits. Neither, however, included elements of the selfing syndrome outside of floral dimensions, such as nectar, pollen, or growth traits.
A related but distinct question is whether selection favors the initial spread of self-pollination. Typically, the evolution of increased selfing, whether for reproductive assurance or because of an automatic transmission advantage (Barrett, 2002), is thought to be driven by selection, which is required to overcome the inbreeding depression that is common in outcrossing species (Busch & Delph, 2012).
Qst-Fst comparisons between quantitative traits and neutral markers offer a method for differentiating between selection and drift as possible causes for population or species divergence (Leinonen, McCairns, O'Hara, & Merilä, 2013). When the genes responsible for individual traits are not known, either because the traits are polygenic, because the system is a nonmodel species, or both, molecular signatures of selection (Nielsen, 2005) cannot be used to detect selection on divergent traits. In these situations, a Qst-Fst approach, which uses phenotypic measurements to estimate the differentiation in loci underlying quantitative traits of interest and compares that differentiation to neutral markers, can discriminate between selection and drift as explanations for phenotypic divergence.
In this investigation, we apply Qst-Fst approaches to determine the extent to which natural selection contributed to the evolution of selfing-syndrome traits and of self-pollination in I. lacunosa. This study greatly expands upon the initial investigation by Duncan and Rausher (Duncan & Rausher, 2013a) in several ways. First, it includes selfing-syndrome traits besides floral size, such as nectar production, pollen production, and display size (flower number) and thus constitutes the first attempt to determine whether reductions in these traits in highly selfing species are also caused by selection.
Second, we estimate genetic variance more precisely with measurements taken from individuals of known parentage in the greenhouse rather than wild individuals. Third, this study examines traits that are not traditionally considered part of the selfing syndrome, but are thought to correlate with high selfing rates (inflorescence size and growth rate). We determine whether these traits differ between the species and whether they differ in the directions predicted by the expectations of the selfing syndrome-that is, that floral traits should be generally reduced in the selfing species and growth rate possibly increased. Finally, this study includes a broader array of populations and a much larger number of neutral loci and dramatically changes our previous estimates of Fst.
In addition to exploring how selection has shaped floral changes that follow the mating system transition, we use the Qst-Fst approach to detect selection for increased selfing itself. In self-compatible morning glories, selfing rate is controlled by herkogamy (anther-stigma separation), a quantitative trait known to be evolutionarily labile in many species (Chang & Rausher, 1998;Duncan & Rausher, 2013a;Opedal, Bolstad, Hansen, Armbruster, & Pélabon, 2017). We therefore use the Qst-Fst framework to determine whether selection favored reduced herkogamy in I. lacunosa.  Rifkin, Castillo, Liao, & Rausher, 2019). We have observed both species growing intertwined in the same habitat.

| Samples and plant culture
For this study, we used plants covering a wide geographic range obtained from the Rausher Lab's field collections, an accession from the Baskin Lab, and accessions from the USDA's GRIN seed bank (Rifkin et al., 2019). Our samples included 33 I. cordatotriloba accessions from 13 sites and 31 I. lacunosa accessions from 12 sites. From each site, we used no more than three accessions (a complete list of the accessions used can be found on the Dryad Digital Repository).
One selfed offspring from all but three of these accessions was genotyped for the Fst analysis, which is thus based on 61 individuals.
For Qst estimation, we grew two selfed offspring from each of these accessions and three additional accessions that were not genotyped but were from the same region (Austin, TX) as a genotyped accession that never produced flowers. A total of 114 individuals were thus phenotyped. The limited per-population sample numbers may affect our estimates of within-species population variation, but as our focus is between-species differentiation and given the magnitude of between-species differentiation (see below), this should not affect our conclusions. In addition, with a large number of genetic loci, accurate estimates of Fst can be obtained from even a small number of individuals (Willing, Dreyer, & van Oosterhout, 2012).
To grow plants for genotyping and phenotyping, seeds were scarified and planted in four-inch pots in Fafard 4P soil and maintained in a growth room under 16-hr days at 25.6°C (78°F). After 4 weeks, conditions were changed to 12-hr days at 18.3°C (65°F) to trigger flowering. When flower buds appeared, plants were moved to the Duke Greenhouse and grown under the following conditions: 12-hr days at 23-26°C (74-80°F), 12-hr nights at 16-19°C (61-67°F), 65% relative humidity, and 700µmol s s −1 /cm 2 . The plants were allowed to acclimate for 2 weeks before any measurements were taken.

| Phenotypic measurements
We included measurements of nine characters in our analysis: the traditional selfing-syndrome characters corolla length, corolla width, nectar volume, nectar sugar concentration, and pollen number, as well as characters associated with early growth (length of first three internodes at day 21) and display size (total flowers per day, length of inflorescence from stem to flower base, number of flowers on inflorescence). To determine if selection favored increased selfing in I. lacunosa, we measured herkogamy (the degree to which anthers are positioned below the stigma), a major determinant of selfing rate in these species (Duncan & Rausher, 2013a). Herkogamy was measured as the number of anthers that were below the stigma: in highly outcrossing populations, the style extends above the anthers, while in selfing populations, anthers touch the stigma (Duncan & Rausher, 2013a). For comparison, we also measured three vegetative traits that we have no reason to believe are associated with selfing: sepal length, leaf length/width ratio, and leaf dissection.
Early growth measurements of the first three internodes (cotyledons to first true leaf, first true leaf to second true leaf, and second true leaf to third true leaf) were taken with a rule on day 21 after planting. After plants were moved to the greenhouse, floral measurements were performed between 8:00 a.m. and 12:00 p.m.
Because flowers remain open for less than a day, it is not necessary to standardize flower age. For most individuals, three flowers were measured on different days. Corolla length and width, inflorescence length, and style length were measured using a digital caliper (Mitutoyo Digimatic CD6″ CS).
Nectar volume and sugar concentration were quantified from flowers that had been covered overnight to prevent evaporation.
The day before a flower opened, the bud was capped with a plastic straw covered with parafilm. The following morning, all nectar was extracted from the base of the flower with 2 µl microcapillary tubes (Drummond Scientific). The height of the nectar in the tube was measured with the digital caliper. Because each tube is 32 mm long and holds 2 µl in total, this measurement was converted to volume with the formula V = (2 µl * height of nectar in tube)/32 mm. Nectar sugar concentration was quantified by expelling all the nectar from the microcapillary tube onto a Master-53M ATAGO refractometer. The refractometer was standardized with water at the start of each day's measurements. Because refractometer readings are often imprecise with low volumes, two sugar concentration measurements were taken: undiluted nectar and nectar diluted with 2.5µl water. The relationship between sugar concentration measured as weight/weight (w/w) and as measured by mol/L or mg/ml is nonlinear (Bolten, Feinsinger, Baker, & Baker, 1979). Therefore, we used data on the relationship between w/w and mol/L for sucrose solutions from the CRC Handbook of Chemistry and Physics to convert mol/L into mg/ml values (Rumble, 2018, p. 5-132). Specifically, we converted the mol/L values in the table to mg/ml values by multiplying the mol/L by the molecular weight of sucrose (342.2964 g/mol). We then calculated a regression equation to convert w/w into mg/ml: mg/ml = 0.0524(w/w) 2 + 9.6554(w/w) + 1.3904 (R 2 = 0.9999). For nectar diluted with 2.5µl water, the diluted sugar concentration was first converted into mg/ml, then multiplied by the ratio: (actual nectar amount + 2.5µl)/actual nectar amount. The diluted and undiluted nectar sugar concentration values in mg/ml were averaged to produce the nectar sugar concentration used in our analyses.
Pollen production, measured as pollen per ovule, was quantified by removing anthers the day before anthesis, allowing them to

| Character divergence
To determine which traits differed between the two species, we performed a nested ANOVA using the "Fit Model" platform in JMP Pro 13 (SAS Institute, 2017). For each trait, we analyzed a model in which species was a fixed effect, while site nested within species and female parent (accession) nested within site were random effects, using restricted estimation maximum likelihood fits. Six individuals that never produced flowers were excluded from analysis. Pollen per ovule was log-transformed for analysis; all other traits were approximately normally distributed and were therefore not transformed.
Measurements on two selfed offspring per female parent (accession) provided the error effect. To correct for multiple comparisons, we applied a Benjamini and Hochberg false discovery rate correction (Benjamini & Hochberg, 1995). As a descriptive measure, we also calculated the relative divergence between species means as the absolute value of the difference between the two species means divided by whichever species mean was larger for the trait in question.

| SNP calling
To estimate Fst, we identified SNPs from leaf transcriptomes.
Total RNA was extracted from a single young leaf (0.5-2.0 cm in length) using a modified TRI Reagent (Sigma-Aldrich) protocol that included an additional TRI Reagent:chloroform cleanup step, addition of glycogen, and three ethanol washes. RNA was resuspended in 30µl RNAse-free water. We assessed RNA quality using a 2,200 TapeStation system (Agilent Technologies), and all samples displayed an RNA integrity score of at least 7. RNA sequencing libraries were generated using a KAPA Stranded mRNA-Seq kit (KAPA Biosystems). SNPs were called and filtered using a modified version of the GATK best practices for RNASeq (Van der Auwera et al., 2013). We aligned reads to the I. lacunosa draft genome assembly with STAR 2-pass (Dobin & Gingeras, 2015). Alignment files were cleaned, marked for duplicate reads, assigned read groups, and sorted using PicardTools (http://broad insti tute.github.io/picard). Reads extending into introns were truncated using Split "n" Trim. We used the GATK Joint Genotyper in -erc GVCF mode to call SNPs and hard-filtered based on Fisher Strand Bias < 30, quality-by-depth < 2, minimum depth 10 reads, SNP clustering, and at least 60 individuals (out of 61 total) called. The resulting 66,729 SNPs were then coded as either synonymous, nonsynonymous, or noncoding using an APL (Iverson, 1962) script written by one of the authors (MDR) and an annotated draft I. lacunosa genome generated by our laboratory. A total of 27,079 synonymous SNPs were identified and used for estimating Fst. All scripts are available on GitHub (https ://github.com/ joann arifk in/Ipomo eaQstFst).
Fst can be affected by the manner in which SNPs are ascertained (Bhatia, Patterson, Sankararaman, & Price, 2013). Ideally, only SNPs that were polymorphic in the ancestral population before speciation should be included, but this is often not possible to determine. We therefore examined the effects of two different ascertainment protocols. In the first protocol, we used all synonymous SNPs that were polymorphic in at least one of the two species (N = 27,079). With this protocol, Fst may be biased upward because the set of SNPs may include mutations that have arisen since divergence of the two species as well as ancestral variants that have been lost in one lineage but not the other (which is likely in a highly selfing species). In the second protocol, we used only SNPs that were polymorphic in both species. This approach may yield an estimate of Fst that is biased downward because it excludes ancestral SNPs that have become fixed in one species. To control for these effects, we calculated Fst from two datasets: all synonymous SNPs ("All SNPs," N = 27,079) and synonymous SNPs that were polymorphic in both species ("Shared SNPs," N = 2,352).
We also repeated the "All SNPs" estimate with LD-pruned SNPs separated by either 20kb or 40kb (consistent with the distance of LD decay in I. lacunosa, (Rifkin et al., 2019)), but it did not alter our estimates and we therefore report the estimate based on the full sample of SNPs.

| Qst-Fst comparison
The Qst-Fst test asks whether Qst is significantly greater than Fst.
If divergence in a quantitative trait is due to genetic drift, then it is expected that Qst = Fst. By contrast, if divergence was caused by selection, then it is expected that Qst > Fst (Leinonen et al., 2013).
For each of the traits that exhibit significant divergence between species, we asked whether Qst is significantly greater than Fst.

| Multivariate Qst-Fst comparison
We performed a multivariate Qst-Fst comparison on selfing-syndrome traits that were significantly diverged between species (early growth, flowers per day, inflorescence length, corolla length and width, herkogamy, nectar volume and concentration, and pollen number) according to the methods of Martin et al. . We adapted R code made available by the authors (http://www.isem.univ-montp2. fr/fr/personnel/equipes/metapopulations/martin-guillaume.index/) by adding steps to round matrices to 12 decimal places to avoid loss of symmetry via loss of significance and to transform to positive definite by changing null eigenvalues to very small eigenvalues. This test relies on the property that if covariance matrices are evolving neutrally, then while they may vary considerably, they should on average be proportional to each other with the expected coefficient of proportionality ρ Exp = 2Fst * (1 − Fst) in outcrossing species, or ρ Exp = Fst * (1 − Fst) in a highly selfing species Phillips, Whitlock, & Fowler, 2001;Rogers & Harpending, 1983). If the observed coefficient of proportionality between the G-matrices significantly differs from this expectation, then that suggests the action of nonneutral processes. We chose to use the proportionality constant for a highly selfing species, as one of our species is highly selfing and the other moderately selfing; however, using the constant for an outcrossing species does not change our results.
To estimate the covariance matrices, we performed a MANOVA using accession means of traits transformed to Gaussian distributions and normalized to the means with "Species" as a factor. We estimated the coefficient of proportionality ρ Obs between the Gmatrices G A and G S , where G A is the accession-level G-matrix and is equal to MS A (mean squares at the accession level, based on the residual sum of squares divided by degrees of freedom) and G S is between-species level G-matrix, estimated from MS B (mean squares at the species level) and MS A . A detailed description of these calculations can be found in the supplementary methods and scripts are available at https ://github.com/joann arifk in/Ipomo eaQstFst.
To estimate the neutral coefficient of proportionality Fst * (1 − Fst), we computed Weir-Cockerham estimates of Fst and its 95% confidence intervals using the hierfstat package in R at the species level across all individuals (Goudet, 2005;de Meeûs & Goudet, 2007;Weir & Cockerham, 1984). A confidence interval could not be generated for our coefficient of proportionality estimate because there is only one degree of freedom in the two-species comparison. Therefore, we instead estimated a distribution of values for the observed coefficient of proportionality (ρ Obs ) by resampling accessions within each species with replacement over 1,000 bootstrap replicates. We then compared the 95% confidence interval of this distribution of the observed coefficient of proportionality (ρ Obs ) to the 95% confidence interval of the expected coefficient of proportionality (ρ Exp ).

| Univariate Qst-Fst comparisons
We also performed Qst-Fst comparisons on the traits individually to gain more insight into which traits might be driving the multivariate changes most strongly. Fst was calculated using Weir and Cockerham estimates in hierfstat as described above (Goudet, 2005;de Meeûs & Goudet, 2007;Weir & Cockerham, 1984). We also applied Hudson's estimator (Hudson, Slatkin, & Maddison, 1992 We employed two methods to compare our estimates of Qst and Fst. First, we applied a standard nonparametric bootstrapping approach, which we refer to as "standard bootstrap" analysis. Using an APL program written by MDR, 1,000 bootstrap values of Qst and Fst were generated by randomly choosing accessions (Female Parent) with replacement. For each sample, we calculated Fst and Qst as described above, then calculated the difference D = Qst − Fst. From these values, we calculated the proportion of D values that were ≤0.
We performed this analysis only on characters that showed a significant difference between species in the ANOVA described above.
Second, we used Whitlock and Gilbert's method for parametric bootstrapping of Qst in unbalanced half-sib designs (Gilbert & Whitlock, 2014Whitlock & Gilbert, 2012;Whitlock & Guillaume, 2009). Because the distribution of Qst for neutral traits varies depending on demographic traits, this approach uses the estimate of Fst to simulate a distribution based on the expectation that under neutrality We used the tool QstFstComp in the R package QstFstComp (Gilbert & Whitlock, 2014) in "half.sib.dam" mode (despite the name, it can be set to model any level of relatedness between siblings) with two populations, 1,000 simulations, and "dam.offspring.relatedness" set to 1 to reflect selfed offspring.

| Differences between species
Using a nested ANOVA, we found little evidence for population differentiation within species (a table can be found on the Dryad Digital Repository): after a false discovery rate correction, the population effect was not significant for any of the traits. This does not necessarily mean there is no real population differentiation, only that we were not able to detect it, probably because we scored a maximum of three accessions per population. Variation among accessions within populations was significant for several traits (the early growth traits internode 1 and internode 3 elongation, the floral morphology traits of corolla length and width, and the vegetative trait of leaf shape), indicating the presence of genetic variation in these traits.
Although other traits did not show a significant accession effect, we again do not claim there is no genetic variation for these traits, only that we did not have the power to detect it. In the Qst calculations, we assume that such variation is reflected in the σ 2 BetwAcc component of variance.
Generally, traits typically included in the selfing syndrome (corolla length and width, nectar volume, nectar sugar concentration, and pollen number) were significantly reduced in I. lacunosa, even after correction for multiple comparisons (Table 1). Herkogamy also differed significantly, as expected, with more anthers tending to be below the stigma in I. cordatotriloba. In contrast, the early growth trait of internode length on day 21 after germination was significantly increased in I. lacunosa, indicating faster early growth. In addition, number of flowers produced per day was significantly greater and  of selection against geitonogamy (both consistent with expectation).
Two of the three vegetative traits we examined (leaf length/width ratio and sepal length) did not differ significantly between the species, although I. cordatotriloba did produce more dissected leaves.

| Qst-Fst analysis
The estimate of Fst from all synonymous SNPs (Fst = 0.442, 95% CI 0.437, 0.447) was higher than Fst estimated using only shared polymorphic SNPs (Fst = 0.390, 95% CI 0.375, 0.405). A higher Fst estimate from "all" than from "shared" SNPs is consistent with the expected biases introduced by the two ascertainment protocols.

| Multivariate analysis
If divergence of a set of traits is neutral in a highly selfing species, the between-species and within-species G-matrices will be propor-

| Univariate analyses
To identify which individual traits were subject to selection, we performed a univariate Qst-Fst analysis on each selfing-syndrome trait that differed between the two species. For herkogamy, the estimated Qst was 0.766, which is substantially higher than either estimate of Fst. This difference was significant in the standard bootstrap analysis for both all SNPs and shared SNPs (p < 0.0001 for both). It was also significant in the parametric bootstrap for the analysis using Fst estimated from shared SNPs (p = 0.03) and of borderline significance (p = 0.053) in the parametric bootstrap analysis using Fst estimated from all SNPs. We interpret these results to be consistent with selection having caused the evolution of reduced herkogamy in I. lacunosa.
In the standard bootstrap analysis, four selfing-syndrome traits exhibited significant Qst -Fst differences for both Fst estimates (Table 2,  (Gilbert & Whitlock, 2015). Overall, the analyses suggest that the divergence between the two species in floral morphology was due to selection, and that that selection acted chiefly on floral dimensions, nectar volume, and nectar sugar concentration. In contrast, although early growth, pollen and display traits are also components of the selfing syndrome, we did not find evidence of direct selection on these traits.

| Evolution of increased selfing in I. lacunosa
Mating system transitions, particularly transitions from outcrossing to selfing, are among the most common evolutionary changes in plants ( Barrett, 2002). Increases in selfing rate can influence processes that can affect fitness, such as the magnitude of inbreeding depression experienced, the reduction in pollen transmission to other plants because of pollen discounting, increased reproductive TA B L E 2 Qst-Fst differences in simple and parametric bootstrap assurance, and loss of heterozygosity (Jarne & Charlesworth, 1993).
Given these effects on fitness, it seems likely that in most cases, changes in selfing rate are driven by natural selection. Our results are consistent with this expectation in indicating that selection was responsible for the evolution of reduced herkogamy, and thus higher selfing rates, in I. lacunosa. Since reproductive assurance and the automatic advantage are well-supported theoretical bases for the evolution of selfing, and since as a weedy annual I. lacunosa may have particularly benefited from reproductive assurance, it is possible that they played a role in this transition. At this point, however, we do not know with any certainty the factors responsible for selection favoring selfing in this species.

| The role of selection on selfing-syndrome traits
The floral and life-history changes that constitute the selfing syndrome are potentially distinct from morphological and physiological changes that directly affect selfing, such as reduced herkogamy have been few attempts to determine whether selection is indeed responsible for these changes (Duncan & Rausher, 2013a;Strandh et al., 2017). Genetic drift is also a plausible explanation for these differences. Selfing plants experience relaxed selection to attract pollinators, and their reduced effective population size makes them more susceptible to the effects of drift (Charlesworth & Wright, 2001;Pollak & Sabran, 1992). Random mutations and inbreeding depression are likely to cause floral size reduction (Charlesworth & Charlesworth, 1987;Eyre-Walker & Keightley, 2007). For these reasons, determining whether selection is responsible for these changes is valuable for understanding how and why the selfing syndrome evolved.
We have demonstrated that I. lacunosa has evolved the classic selfing-syndrome traits of reduced flower size, reduced nectar production, and reduced pollen production, as well as inflorescence structure and early growth differences that may be associated with weediness and annuality. Moreover, we have shown that divergence between species in this suite of traits was likely driven by natural selection, based on multivariate Qst-Fst analysis.
From our univariate analyses, we find support for selection having played a role in reducing corolla size and nectar volume and sugar concentration.
By contrast, other selfing-syndrome traits, including several that are strongly diverged between the two species (e.g., inflorescence length, flowers produced each day, early growth, and pollen production), showed no evidence that divergence was caused by selection.
There are three explanations for this result: (a) our Qst-Fst test may not have been powerful enough to detect selection on these traits; (b) these traits may truly not have experienced divergent selection, but diverged due to genetic drift; and (c) divergence in these traits may have been caused by correlated selection on other traits.
Studies in other species suggest that indirect selection on correlated traits may account for divergence in traits that did not display evidence of selection. In general, floral traits have been found to be moderately to strongly correlated, both phenotypically and genetically (Bernacchi & Tanksley, 1997;Fishman, Beardsley, Stathos, Williams, & Hill, 2014;Fishman, Kelly, & Willis, 2002;Georgiady, Whitkus, & Lord, 2002;Goodwillie et al., 2006;Lin & Ritland, 1997;Slotte, Hazzouri, & Stern, 2012). For example, previous studies have found evidence for a tradeoff between flower size and flower number (Ashman & Majetic, 2006;Sargent, Goodwillie, Kalisz, & Ree, 2007). If such a tradeoff exists in Ipomoea, the detected selection favoring decreased flower size could have resulted in the observed increase in flower number in I. lacunosa. Similarly, if pollen production is correlated with flower size, as is the case in Capsella and Mimulus (Fishman et al., 2002;Slotte et al., 2012), the observed selection for reduced flower size may have led to less pollen in I. lacunosa as a correlated response. This is particularly likely in a selfing species. Selfing species tend to show increased integration among traits (Vallejo-Marín et al., 2014), which may reflect stronger correlations due to increased linkage disequilibrium (Fornoni et al., 2016). Therefore, correlated selection remains a plausible explanation for divergence in traits which did not show evidence of selection.
While we find that divergence in four selfing-syndrome characters was likely caused by selection, and can thus rule out genetic drift in these traits, we are not able to determine from our data whether direct or indirect selection acted on each of these characters. Although to our knowledge this issue has not been examined, it is possible that Qst-Fst differences could reflect correlated rather than direct responses to selection. It is thus possible that, say, direct selection to reduce corolla length and width could also reduce nectar production, if reduction in floral dimensions caused a correlated reduction in nectary size. An artificial selection experiment in Eichhornia paniculata found evidence consistent with this when selection for increased flower size also led to increased nectar volume, suggesting underlying genetic correlations (Worley & Barrett, 2000). However, we suspect that a correlated response of this type detectable from a Qst-Fst analysis would require a strong genetic correlation between floral dimensions and nectary size. Identifying the genetic bases of these traits and relationships among them may allow us to distinguish between direct selection, correlated selection, and drift.
As is traditional, we have assumed that selfing-syndrome characters evolved after the evolution of increased selfing and did not themselves contribute directly to an increase in selfing rate.
However, to the extent that the reductions in floral characters in I. lacunosa render flowers less attractive to pollinators, one consequence of selection on these characters may be reduced outcrossing and thus increased selfing. It is thus possible that these selfing-syndrome characters are not so much a consequence of increased selfing but a contributor to it. Further experiments will be required to determine the extent to which these characters may have evolved because they increased selfing rates.
In addition to traditional selfing-syndrome characters, we also examined early growth. Researchers have long observed an association between selfing, weediness, and an annual life history, and selfing plants develop faster at several life stages (Fishman et al., 2014;Snell & Aarssen, 2005). We found that stem elongation during the first 3 weeks after germination was significantly faster in I. lacunosa, consistent with the evolution of more rapid growth. The early growth differences might lead to the expectation of more general vegetative differences in response to a more marginal habitat, and the other vegetative trait that had diverged, leaf dissection, is associated with patterns of local water availability (Nicotra et al., 2011 in much of their range, and the one study that we are aware of testing the relationship between physiology and selfing rate found no relationship (Ivey, Dudley, Hove, Emms, & Mazer, 2016). In addition, the magnitude of morphological divergence is far greater for floral than for vegetative traits. In most species, floral and vegetative traits are not tightly correlated (Ashman & Majetic, 2006), and indeed, we found no evidence of divergence in a flower-adjacent vegetative trait (sepal length, known to be variable in this group of species: Austin, 1978).

| Power and limitations of Qst-Fst analysis
In our analyses, the standard bootstrap approach detected higher levels of significance than the parametric bootstrap approach. This raises the question of which approach provides a better picture of selection on these traits. Unlike parametric bootstrap, the standard bootstrap approach does not account for the possibility that populations may by chance drift in a similar direction (Whitlock, 2008;Whitlock & Guillaume, 2009 & Merilä, 2005), it also has low power when the number of "populations" is small (Gilbert & Whitlock, 2015), as in our study. Low power may thus explain why p-values for the four significant traits in the standard bootstrap analysis are lower than in the parametric analysis.
A more important caveat is that all current methods for Qst-Fst comparisons rely on assumptions that natural populations may not satisfy (Gilbert & Whitlock, 2015;Leinonen et al., 2013;Whitlock, 2008 & Whitlock, 2015). While these issues primarily reduce the power to detect Qst-Fst differences, we nevertheless managed to detect significant differences despite this power reduction. Finally, differences between subspecies or species are likely to contain more nonadditive genetic differences (Whitlock & Gilbert, 2012). Dominance increases Qst and the variance of Qst relative to purely additive inheritance when selection is acting, but does not increase the likelihood of false positives from neutrally diverging traits (Cubry et al., 2017;Santure & Wang, 2008). Thus, the increased nonadditive differences in highly diverged populations may reduce power but should not increase the likelihood of false positives in estimates of Qst-Fst difference.
Selfing affects our Qst-Fst comparisons in two ways: it shapes the estimates generated of Qst and it may alter the relationship between Qst and Fst. The selfing breeding design to generate our families may provide more accurate Qst estimates for selfing species (Goudet & Büchi, 2006), so it is probably not a cause for concern.
The history of inbreeding, on the other hand, may have complex effects on the estimation of Qst. Inbreeding can inflate the variance of Qst (Cubry et al., 2017), although its effects are generally less dramatic than those of dominance (Cubry et al., 2017;Santure & Wang, 2008). Finally, the effects of purging of deleterious alleles and of inbreeding depression are not accounted for in a Qst-Fst framework (Gilbert & Whitlock, 2015).
Overall, these sources of bias are more likely to reduce than to inflate the magnitude of Qst-Fst differences.  & Merilä, 2005). Since multiple aspects of our methods (estimate of Qst, bootstrapping, parametric bootstrapping) are conservative or expected to lack power in diverged, selfing species, we argue that our analysis is conservative overall and the evidence we identify for selection is compelling. Inbreeding and self-pollination are common in plants and may be associated with different selective regimes (Allendorf, Hohenlohe, & Luikart, 2010;Barrett, 2002). To better understand phenotypic evolution in inbreeding populations, quantitative genetic theory should develop methods for Qst-Fst comparisons that can more precisely account for this variation in demographic history or interface with existing demographic inference models to differentiate neutral divergence from drift.

| CON CLUS IONS
Our findings partially support the general expectation that natural selection drives the evolution of selfing-syndrome traits. In particular, we found that this was true for reductions in floral size and nectar traits. Surprisingly, however, we did not find that selection caused reduced pollen production, a key feature of the selfing syndrome in I. lacunosa and other highly selfing species. This complicates and expands upon the conclusions of Duncan and Rausher (2013a), in which both selfing-syndrome traits examined diverged in response to selection. In addition, we found no evidence that selection modified other characters typically associated with highly selfing plants, such as display size and growth rates. In the absence of selection, changes in these traits may have been caused largely by genetic drift or by correlated selection that was too weak to detect. Distinguishing between these two possibilities will require additional experiments examining the genetic correlation structure of selfing-syndrome traits.

ACK N OWLED G M ENTS
We thank Carol Baskin, Lena Hileman, Craig Freeman, Robert Pitman, Michael Flessner, and the officers of the USDA GRIN network for their assistance with seed sourcing. We thank anonymous reviewers for their contributions to improving this manuscript. Funding sources: NSF grant DEB 1542387 to MDR, NSF DDIG grant DEB 1501954 to JLR, Sigma Xi grant to JLR, and Duke University sequencing funds.

CO N FLI C T O F I NTE R E S T
None declared.