Postzygotic barriers persist despite ongoing introgression in hybridizing Mimulus species

The evolution of postzygotic isolation is thought to be a key step in maintaining species boundaries upon secondary contact, yet the dynamics and persistence of hybrid incompatibilities in naturally hybridizing species are not well understood. Here, we explore these issues using genetic mapping in three independent populations of recombinant inbred lines between naturally hybridizing monkeyflowers, Mimulus guttatus and Mimulus nasutus, from the sympatric Catherine Creek population. We discover that the three M. guttatus founders differ dramatically in admixture history, with nearly a quarter of one founder's genome introgressed from M. nasutus. Comparative genetic mapping in the three RIL populations reveals three new putative inversions, each one segregating among the M. guttatus founders, two due to admixture. We find strong, genome‐wide transmission ratio distortion in all RILs, but patterns are highly variable among the three populations. At least some of this distortion appears to be explained by epistatic selection favouring parental genotypes, but tests of inter‐chromosomal linkage disequilibrium also reveal multiple candidate Dobzhansky‐Muller incompatibilities. We also map several genetic loci for hybrid pollen viability, including two interacting pairs that coincide with peaks of distortion. Remarkably, even with this limited sample of three M. guttatus lines, we discover abundant segregating variation for hybrid incompatibilities with M. nasutus, suggesting this population harbours diverse contributors to postzygotic isolation. Moreover, even with substantial admixture, hybrid incompatibilities between Mimulus species persist, suggesting postzygotic isolation might be a potent force in maintaining species barriers in this system.

between species when they come into secondary contact (Coyne & Orr, 2004;Dobzhansky, 1937;Mayr, 1942).Nevertheless, gene flow between recognized species -at rates high enough to have substantial genomic consequences -is not uncommon, even when strong reproductive isolating barriers exist (Caeiro-Dias et al., 2023;Delmore et al., 2015;Ellegren et al., 2012;Larson et al., 2014;Nolte et al., 2009;Renaut et al., 2013;Teeter et al., 2010).At the extremes, hybridization can break down established barriers causing previously isolated lineages to fuse (Taylor et al., 2006) or strengthen reproductive isolation through reinforcement (Hopkins & Rausher, 2012;Jaenike et al., 2006;Servedio & Kirkpatrick, 1997), but theory also suggests partial reproductive isolation can sometimes evolve as a stable optimum (Servedio & Hermisson, 2020).Studies of naturally hybridizing species can provide a window into the causes of these alternative evolutionary outcomes, allowing us to identify and measure the impact of the genetic loci involved in reproductive isolation.
One powerful approach to address these questions is experimental hybridization of incompletely isolated species that occur together in secondary sympatry.In particular, the use of lategeneration hybrids such as recombinant inbred lines (RILs) can enable replication of wild, admixed genotypes and facilitate genetic mapping of barrier loci under controlled environmental conditions.This approach should be especially fruitful for identifying loci involved in postzygotic isolation because strongly incompatible allele combinations are likely to be purged during RIL formation.Indeed, selection against hybrid incompatibilities can be a major source of transmission ratio distortion (TRD) in RIL populations (Colomé-Tatché & Johannes, 2016).Although there are often additional contributors to TRD in experimental hybrid populations (reviewed in Fishman & McIntosh, 2019), hybrid incompatibilities are expected to leave a particular signature of distortion, generating both local peaks and linkage disequilibrium between unlinked loci caused by underrepresented or absent multi-locus genotypes.This signature will likely be particularly apparent in RILs, which have the advantage of multiple generations of recombination, increasing power to detect and localize incompatibilities by amplifying their effects (and revealing recessive phenotypes) through inbreeding (Moyle & Nakazato, 2008).
Despite the clear value of experimental hybridization, and RILs in particular, for illuminating the genetic causes of reproductive isolation, inferences to natural populations can be challenging because such studies are often based on crosses between only two individuals or inbred lines.This limitation might be serious for investigations of postzygotic isolation between species in the early stages of divergence, as hybrid incompatibilities are often polymorphic (Cutter, 2012;Good et al., 2008;Sweigart et al., 2007), particularly in populations with ongoing hybridization (Zuellig & Sweigart, 2018a).Variation in chromosomal structure is also a common feature of natural populations (Hoffmann & Rieseberg, 2008), and polymorphic inversions are often associated with adaptive differences (Feder et al., 2003;Jones et al., 2012;Lowry & Willis, 2010) and reproductive isolation (Noor, Grams, Bertucci, & Reiland, 2001;Sotola et al., 2023).One way to capture a more representative set of natural variation is to generate a multi-parent mapping population, for instance, by performing multiple biparental crosses to a common founder (Yu et al., 2008).Although this strategy is now used routinely in plant breeding programs (Scott et al., 2020) and in dissecting the genetics of complex traits (Aylor et al., 2011;Long et al., 2014), it has rarely been applied to studies of speciation.
In this study we investigate patterns and mechanisms of reproductive isolation in three independent RIL populations constructed from crosses between Mimulus guttatus and M. nasutus (also referred to as Erythranthe guttata and E. nasuta: see Lowry et al., 2019;Nesom et al., 2019).These two species are closely related (~200 KYA diverged: Brandvain et al., 2014), and broadly allopatric across western North America, but occur in secondary sympatry at many sites in their shared range.One such sympatric site occurs at Catherine Creek (CAC) in Washington state, a system of ephemeral seeps and streams that flow into the Columbia River.These Mimulus species have diverged in mating system (M.guttatus is large flowered and predominantly outcrossing, whereas M. nasutus is small flowered and primarily selfing), which acts as a strong premating barrier where the two species co-occur (Martin & Willis, 2007).At Catherine Creek, divergent flowering phenology (Fishman, Sweigart, et al., 2014;Kenney & Sweigart, 2016) and microhabitat adaptation (Mantel & Sweigart, 2019) are also important contributors to premating isolation.Despite these substantial barriers to interspecific mating, studies of population genomic variation at Catherine Creek have revealed clear signatures of historical and contemporary hybridization (Kenney & Sweigart, 2016), resulting in largely unidirectional introgression from M. nasutus into M. guttatus (Brandvain et al., 2014;Kenney & Sweigart, 2016;Sweigart & Willis, 2003).Even in the absence of interspecific gene flow, M. guttatus populations are exceptionally genetically diverse (Flagel et al., 2014;Puzey et al., 2017) and maintain substantial variation for complex phenotypes (Troth et al., 2018).
Here, we investigate the interplay between admixture and postzygotic breakdown in naturally hybridizing species.As a first step, we use whole genome sequence data to characterize introgression within and across CAC M. guttatus genomes, which reveals that all three M. guttatus RIL parents carry substantial, though variable, M. nasutus ancestry.We then leverage this variation in introgression to ask: Is there evidence of structural variation at Catherine Creek, as has been observed in other well-studied populations of M. guttatus?Do structural variants resist introgression or, alternatively, do they facilitate transfer of large tracts of interspecific variation?Is postzygotic isolation maintained between these hybridizing Mimulus species and, if so, does it vary among crosses?Do lines with more admixture have fewer incompatibilities?Is polymorphism in hybrid breakdown driven by introgression suggesting selection against incompatibilities could occasionally favour admixture?Even in this modest sample of three M. guttatus lines, we find segregating variation for inversions and for hybrid incompatibilities with M. nasutus, suggesting these hybridizing species continue to harbour diverse contributors to postzygotic isolation.

| Plant material and crossing design
In this study, we examined patterns of admixture in eight inbred lines of M. guttatus generated from wild-collected seeds from Catherine Creek (CAC; Table S1).These lines were formed through >10 generations of inbreeding with single-seed descent.
We selected three of these lines (CAC110, CAC162, and CAC415; Table S1) with phenotypes typical of CAC M. guttatus (Mantel & Sweigart, 2019) to act as pollen parents in independent intercrosses with a common seed parent, M. nasutus CAC9.We generated three F1 hybrids (flowers from CAC9 were emasculated in the bud) and self-fertilized a single plant from each to form three F2 populations (N = 500 plants each).RILs were then generated through four to five additional rounds of self-fertilization using single-seed descent (RILs; Figure S1).The final collection consisted of 151 CAC110 RILs, 128 CAC162 RILs, and 89 CAC415 RILs.All plants were grown by sowing seeds onto moist Fafard 3-B potting mix (Sun Gro Horticulture, Agawam, MA) in 2.5″ pots with cold-stratification for 7 days at 4°C.Seeds were germinated in a Conviron growth chamber and grown to maturity in the University of Georgia greenhouses under 16-h supplemental light, 23°C days/16°C nights.

| DNA extraction, library preparation, and sequencing
In this study, we generated whole genome sequence (WGS) data for seven CAC M. guttatus lines (including the three RIL parents; Table S1) and reduced representation RADseq data for all 368 RILs.From all sequenced plants, we extracted genomic DNA from bud and leaf tissue using a modified CTAB method in a 96-well format (https:// doi.org/10.17504/protocols.io.bgv6jw9e).We submitted the seven M. guttatus DNA samples to the Duke University Sequencing Core, which prepared standard 500-bp DNA-seq libraries and sequenced them (150-bp paired-end reads) in a partial lane of an Illumina HiSeq4000 sequencer.From the RIL DNA samples, we constructed ddRADSeq libraries using the BestRAD protocol (https:// doi.org/10.17504/protocols.io.6awhafe).Briefly, we digested 50-100 ng of genomic DNA with BfaI and PstI and ligated half plates of individual samples with 48 unique in-line barcoded adapters.We prepared libraries using NEBNext Ultra II library preparation kits for Illumina (New England BioLabs, Ipswich, MA).Each pool of 48 was then indexed with a unique NEBNext i7 adapter and an i5 adapter containing a degenerate barcode and PCR amplified for 12 cycles.We size-selected the libraries to 300-900 bp using BluePippin 2% agarose cassettes (Sage Science, Beverly, MA) and pooled the libraries in equimolar concentrations following quantification via qPCR.
Two rounds of RIL libraries were prepared and sequenced due to insufficient sequencing coverage from the first round.Both rounds of RIL libraries were sequenced (150-bp paired-end reads) in partial lanes of an Illumina HiSeq4000 at the Genome Sequencing Facility at Duke University.

| WGS processing and assessment of introgression in M. guttatus
To assess levels of admixture at Catherine Creek, we used newly generated and previously published WGS data from M. guttatus and M. nasutus both from and outside of CAC (Table S1).We trimmed Illumina adaptors from WGS using Trimmomatic (v0.36;Bolger et al., 2014)
These filtering steps produced data sets of 104,971 SNPs in the CAC110 RILs, 139,320 SNPs in the CAC162 RILs, and 114,133 SNPs in the CAC415 RILs.We also used a subset of these same SNPs -those with >10× coverage -to assess levels of residual heterozygosity in the RILs.
To produce a set of high-confidence markers that were consistent in all three RIL populations, we binned the full filtered SNP dataset into 50-kb windows.Using a custom Python script, we called windowed genotypes as M. guttatus or M. nasutus homozygotes if they had >75% of reads containing SNPs matching one parent and heterozygotes if they had 25%-75% of reads matching one parent.Windows with no SNP calls were coded as missing data.Next, we used Python scripts from the processing steps of the program GOOGA (Flagel et al., 2019) to estimate genotype error rates and recombination rates for each window (given low levels of heterozygosity, we set the expected genotype frequencies to 1:1 for alternative parental homozygotes with heterozygous windows treated as missing data).Following this analysis, we dropped 39 RILs from the dataset because they had excessive missing data (stemming from low read coverage) or high (e1 or e2 > 0.2) genotype error rates in GOOGA.We used GOOGA's HMM to convert windowed genotypes to genotype probabilities in the remaining 323 RILs (CAC110: 143, CAC162: 98, CAC415: 82).We used this step to exclude lowconfidence genotypes (windows with probabilities <.75) and then converted back to hard calls for mapping.
We used LepMAP3 (Rastas, 2017) to construct a linkage map for each of the three RIL populations.First, we used the SeparateChromosomes2 module to assign markers to linkage groups.
In some cases, we ran this module on markers from a single chromosome (i.e. when those markers failed to form a linkage group in the full data set -see Table S2 for details).Next, we performed iterative ordering using the OrderMarkers2 module (grandparentPhase = 1, useKosambi = 1, sexAveraged = 1, 10 iterations/per chromosome) and chose the order with the highest likelihood for each linkage group.For seven linkage groups with poorly resolved marker orders, we also computed likelihoods using the physical order of the IM62 v3 reference genome; in two cases, the physical order had higher likelihoods and was chosen instead (Table S2).In one case (LG11 in the CAC162 RILs), physical order was used despite it having a lower likelihood because the highest-likelihood order was highly fragmented and lacked collinearity with both the reference genome and other RIL maps.To create our final maps, we then manually removed standalone markers with physical positions that differed greatly from their calculated genetic positions.These steps resulted in three linkage maps with 3828, 4389 and 4379 markers, for the CAC110, CAC162, and CAC415 RILs respectively.Retaining only markers that mapped to recombination-informative sites, our final maps contained 794, 606, and 690 markers.

| Assessment of TRD and linkage disequilibrium
To explore genome-wide patterns of TRD, we calculated genotype frequencies and tested for deviations from the 1:1 Mendelian expectation (χ 2 tests, 1 df).We used both uncorrected (α = 0.01) and stringent Bonferonni-corrected (α = 0.000114-0.000400,calculated separately for each chromosome) chi-squared tests to assess the significance of distortion (Table S3).To investigate whether any peaks of TRD were associated with major two-locus hybrid incompatibilities, we used the LDscan command in the package 'pegas' (Paradis, 2010) to calculate r, pairwise linkage disequilibrium (LD), between all markers with unique map positions.To minimize false positives driven by the large number of pairwise comparisons, we plotted only r values greater than the 95th percentile (CAC110 RILs: r > .37,CAC162 RILs: r > .41,CAC415 RILs: r > .39).Because the signal of off-diagonal LD was abundant and somewhat diffuse, we also ran likelihood ratio tests comparing observed to expected genotypes to identify the strongest cases of two-locus epistasis.For all instances of significant off-diagonal LD (i.e.>95th percentile), we ran these tests on the pair of markers with the highest r value (as well as on five markers on each side, 11 total markers per locus).For all interactions with p < .0001,we then analysed whether two-locus epistasis was better explained by selection favouring parental genotypes or by hybrid incompatibilities.

| QTL mapping of fertility traits
To corroborate genomic inference of hybrid incompatibilities, we used quantitative trait locus (QTL) mapping to identify loci that contribute to variation in hybrid male and female fertility.For each RIL, we assessed male fertility by collecting anthers separately from three flowers (second pair or later) into lactophenol-aniline blue stain (Kearns & Inouye, 1993) and inspecting the pollen grains using a compound microscope.For each flower, we estimated pollen fertility by calculating the proportion of darkly stained (viable) pollen grains from a haphazardly selected group of ≥100.When fertility estimates from flowers of the same plant differed by more than 0.2, pollen from up to three more flowers was collected and counted.We then estimated individual pollen viability as the average value from all collected flowers.Because the distribution of pollen viability in each RIL population was left-skewed, we logit-transformed the data to approximate a normal distribution.
To estimate hybrid female sterility, we intercrossed random pairs of RILs two or more times and collected all seeds produced.
We counted the number of seeds produced per fruit for each cross made and ran a linear mixed model (LMM, using the lmer command in the 'lme4' package).In addition to maternal parent genotype (fixed effect), the model included the average pollen viability of the paternal parent (as calculated above, fixed effect), a unique RIX identifier nested within RIX cross type (CAC110 RIL × CAC162 RIL, CAC162 RIL × CAC415 RIL, etc.; random effect) and the year that the cross was made (crosses were made during two separate grow-outs in separate years; random effect).We then calculated least square means of female fertility (using seeds per fruit as a proxy) for each RIL genotype using the 'emmeans' package.This method produced distributions with a long right tail, so we square root transformed the data.
Given the small sample sizes of each RIL population, we took a multistep approach to QTL map hybrid male and female sterility.
For each trait in each RIL population, we first used r/QTL (Arends et al., 2010;Broman & Sen, 2009) to perform 10 iterations of composite interval mapping (CIM; Jansen & Stam, 1994;Zeng, 1993Zeng, , 1994)).Because CIM requires no missing data, missing genotypes are imputed before the scan is run, leading to potential differences among iterations.We set lenient LOD thresholds (α = 0.15) for QTL significance, by performing 5000 permutations, and calculated 1.In these lines, blocks of M. nasutus ancestry are generally smaller and more evenly dispersed throughout the genome (Figures S2 and   S4b), consistent with introgression events that occurred in the more distant past.Even in CAC110, however, there are many very small regions of inferred introgression (Figure S4b), indicating that M. guttatus genomes at Catherine Creek are shaped by a long history of introgression from M. nasutus.

| Comparative linkage mapping
Across most of the genome, the three RIL linkage maps are collinear with similar lengths (Table 1), and marker order generally follows the physical order of the M. guttatus IM62 v3 reference assembly (Figure 2, Figure S5).However, we identified three clear exceptions to this pattern, where maps differ due to parental variation in putative inversions, characterized by a large number of physically dispersed markers mapping to the same genetic position.Two of these putative inversions are on LG1 (inv1.1 and inv1.2; Figure S6) and are potentially unique to Catherine Creek (or locally distributed), having not been reported in other crosses with these Mimulus species (Fishman, Willis, et al., 2014;Flagel et al., 2019).In both cases, the CAC9 M. nasutus parent carries In addition to the effects of structural variation, we observed some instances of map length variation that appear to be associated only with parental differences in introgression.In the CAC110 RILs, introgressed regions on LGs 1, 11, 12, and 13 have elevated genetic distances relative to the same (non-introgressed) regions in the other two RIL populations (Table 1, Figure 2, Figure S5).Rather than reflecting a biological difference in recombination frequency between introgressed and non-introgressed regions, we attribute this effect to technical differences arising from the much lower marker densities in introgressed regions, which are highly similar to the CAC9 parent (Table 1, Figure 2, Figure S5).

| Genome-wide patterns of TRD
With the exception of LG9, we observed strong TRD in at least one RIL population on all linkage groups (Figure 3, Figure S7, Tables S3 and S4).
Globally, patterns of TRD are highly variable across RIL populations.
In the CAC110 and CAC415 RILs, most distorted markers show an excess of M. guttatus genotypes (~59% in CAC110 and ~80% in CAC415), whereas in the CAC162 RILs, there is a strong deficit of M. guttatus

| Tests of hybrid incompatibilities as a source of TRD
To investigate whether peaks of TRD were caused by hybrid incompatibilities, we characterized pairwise LD between all markers in each RIL population (Figures S8-S13, Table S5).
Because we found many instances of significant off-diagonal LD, often affecting large genomic regions, we also used likelihood ratio tests to identify the strongest interactions (Table 2).
Using these tests, we identified 23 cases of two-locus epistasis across the three RIL populations (10 in the CAC110 RILs, 4 in the CAC162 RILs, 9 in the CAC415 RILs: Table 2, Table S5, Figures S8-S13).All but one of these cases are due to a deficit of heterospecific allele combinations, consistent with the action of hybrid incompatibilities reducing viability or fertility during RIL formation.In 16 of these cases, the underlying mechanism seems to be epistatic selection favouring the same parental genotypes -that is, we observed either an excess of M. guttatus genotypes or M. nasutus genotypes at both loci.In six cases, however, there are clear deficits of only one of the two heterospecific genotype combinations and the expected single-locus distortion stemming from selection against the incompatible genotype at each locus (Table 2).This asymmetry is a hallmark of Dobzhansky-Muller incompatibilities.As with overall patterns of TRD, regions involved in two-locus epistasis are largely population specific, suggesting incompatibility loci are often polymorphic in M. guttatus (although in some cases, we likely have limited power to detect shared epistasis due to small sample sizes and genetic background effects).For two of the six incompatibilities, introgression from M. nasutus might be the cause of polymorphism (Table S6).In these cases, RILs with admixed M. guttatus parents lack the incompatibility (presumably because they carry M. nasutus alleles at both loci), suggesting there could be selection favouring introgression in these regions.
TA B L E 1 Summary of linkage map length and intrascaffold recombination rate estimates.Total map length and intrascaffold recombination rate estimates for each chromosome, plus total map length and average recombination rate in each of the three RIL populations.

| Genetics of hybrid fertility traits
Despite six generations of selfing, all three RIL populations exhibited substantial variation in male fertility.Some of this variation may be due to inbreeding depression in M. guttatus parental lines, which showed lower pollen viability than the M. nasutus parent (Figure S14).However, heterospecific interactions are likely also involved, as all three RIL populations include a considerable fraction of individuals (~10%-22%) with pollen viability <50% (well below the M. guttatus parental average of 72%; Figure S14).In the CAC110 RIL population, we mapped five pollen viability QTLs accounting for ~36% of variance, including two separate epistatic pairs (Table 3, Figure 4).One of these epistatic pairs leads to reduced male fertility in RILs with M. nasutus alleles on LG4 and M. guttatus alleles on LG14 (Figure S15a).The pattern of TRD at these same loci (large excess of M. guttatus on LG4 and a slight, non-significant excess of M. nasutus alleles on LG14; Figure S7) is consistent with a gametic incompatibility that reduces pollen viability, but other genetic mechanisms are also possible (e.g. a sporophytic incompatibility that reduces pollen viability to the point of preventing effective self-fertilization/seed production).The second epistatic pair in the CAC110 RIL population -between QTLs on LG12 and LG14 -also arises due to a negative heterospecific interaction (RILs homozygous for M. nasutus alleles on LG12 and M. guttatus on LG14 have the lowest fertility; Figure S15c), although M. nasutus genotypes have higher pollen fertility at each QTL individually (Table 3).
Neither of these incompatibilities involve any loci that map to introgressed regions in any of the M. guttatus parents (Table S6), suggesting polymorphism is due to within-M.guttatus variation at incompatibility loci on LG4 and LG14.At the remaining QTL for male fertility in the CAC110 RILs on LG11, M. nasutus alleles result in lower pollen viabilities (opposite to parental values; Figures S14   and S15b), suggesting there could be additional unmapped loci that interact with this QTL to decrease male fertility.In the CAC162 RIL population, we mapped two QTLs for pollen viability (accounting for ~28% of the variance; Table 3, Figure S17), but because we detected no epistasis between them and M. nasutus alleles increased fertility at both, these effects might be due to inbreeding depression or to incompatibilities with additional loci below our detection thresholds.We mapped no QTLs for male fertility in the CAC415 RIL population (Table 3).
For female fertility, all three M. guttatus parents had much lower fruit production than the M. nasutus parent, suggesting substantial inbreeding depression affecting this trait (Figure S14).Accordingly, all QTLs we identified -three in the CAC110 RIL and one in the CAC415 RIL -were additive with M. nasutus alleles increasing fertility (Table 3, Figures S16 and S18).

| DISCUSS ION
Closely related species living in secondary sympatry -especially those with incomplete reproductive isolation and still experienc- Structural variants appear to have a large impact on these patterns, with two of three new putative inversions showing polymorphism in M. guttatus due to introgression from M. nasutus (Figure S6).In the most admixed of the three M. guttatus parents (CAC110), roughly onequarter of the introgression is within the boundaries of the inv1.2 and inv13 inversions (Figure S2).Additionally, another highly admixed line, CAC262 (Figure S3), carries M. nasutus ancestry across all three inversions (Figure S2).
An open question is whether these inverted regions display elevated levels of divergence as expected if they disproportionately     (Noor, Grams, Bertucci, & Reiland, 2001;Rieseberg, 2001).If so, taking these regions into account might strengthen the previously discovered negative relationship between absolute divergence and local recombination at Catherine Creek (Brandvain et al., 2014;Kenney & Sweigart, 2016), which relied on estimates of recombination from genetic maps between non-CAC accessions.The fact that for two of the three inversions M. nasutus variants have introgressed into M. guttatus suggests they do not large fertility costs (pollen viability is 68% in CAC110 and 95% in CAC262).Unlike reciprocal translocations, which can cause F1 fertility problems through direct impacts on chromosomal segregation (Sotola et al., 2023;Stathos & Fishman, 2014), inversions often do not induce underdominant hybrid sterility in plants (Fishman & Sweigart, 2018;Huang & Rieseberg, 2020;Zhang et al., 2021).Nevertheless, loci for reproductive isolation do sometimes map to inversions (Huang & Rieseberg, 2020;Livingstone & Rieseberg, 2004;Noor, Grams, Bertucci, Almendarez, et al., 2001;Noor, Grams, Bertucci, & Reiland, 2001;Todesco et al., 2020), one reason being that, for species still connected by some degree of gene flow, inversions might prevent selection from purging incompatibilities (Noor, Grams, Bertucci, & Reiland, 2001).We find limited evidence for this scenario at Catherine Creek, with no hybrid sterility QTLs and only one putative incompatibility loci mapping unequivocally to inversions (Figure S6).It is also possible that inversions at CAC play an important role in premating barriers if they lock together multiple linked variants involved in divergent adaptation (Kirkpatrick & Barton, 2006;Rieseberg, 2001).A key question for future studies is whether traits involved in divergent ecological adaptation between CAC M. nasutus and M. guttatus, especially those involved in their divergent drought responses (Mantel & Sweigart, 2019), map to inversions as has been shown in ecotypes of M. guttatus (Lowry & Willis, 2010) and sunflowers (Todesco et al., 2020).

|
5-LOD intervals('lodint' command)  for each significant peak position in any of the 10 iterations.We then performed single-QTL scans using each QTL identified with CIM as an interactive covariate ('intcovar' option) in separate runs.Again, we set LOD thresholds (α = 0.15, 10,000 permutations) for interacting QTL significance and calculated 1.5-LOD intervals for each significant peak position.With these identified peaks, we fit a multiple-QTL model('fitqtl' command; HK method), directly estimating additive QTL effects (at each peak position) and identifying epistatic effects.Finally, we removed any nonsignificant (p > .05)effects and re-fit the model.Admixture in M. guttatus lines All eight CAC M. guttatus lines show clear signatures of introgression from M. nasutus, but the genomic location and proportion of admixture varies considerably (Figure 1, Figures S2 and S3).These introgressed regions were inferred from a Hidden Markov Model (HMM) that uses the spatial distribution of genomic windows with low interspecific divergence to probabilistically infer regions of M. nasutus versus M. guttatus ancestry (see Section 2 and Brandvain et al., 2014).On average, introgression from M. nasutus was inferred at ~14% of the genome in these eight CAC M. guttatus lines (range: 8.61%-23.83%).As expected, genomic windows inferred as introgressed by the HMM show very low values of divergence with CAC9 M. nasutus (average π = 0.0016; Figure S4).The CAC110 RIL parent is the most highly admixed of all the M. guttatus lines, with nearly a quarter of its genome inferred as introgressed from M. nasutus.Although the largest inferred blocks of introgression in CAC110 are only ~1.1 Mb (FigureS4b), they are often spatially clustered and sometimes cover most of a chromosome arm (FigureS2).Given the improbability of inheriting a large number of small introgressions immediately adjacent to one another, we think it likely that several regions in CAC110 (on Chromosomes 1, 8, 11, and 13 -Figure1, FigureS2) represent true contiguous blocks of admixture (the HMM may sometimes break up blocks of introgression -seeBrandvain et al., 2014).In the CAC162 and CAC415 RIL parents, overall proportions of admixture are lower (~9% in each).
the ancestral orientation (inferred from collinearity with the IM62 v3 and other M. guttatus reference assemblies; phyto zome.org) and the M. guttatus parents segregate for the derived variant.At inv1.1, recombination is suppressed in the CAC110 and CAC415 RILs, but not in the CAC162 RIL, indicating collinearity between the CAC9 M. nasutus and CAC162 M. guttatus parents (Figure2, FigureS6).At inv1.2, on the other hand, the genetic maps suggest CAC9 is collinear only with CAC110 (Figure2, FigureS6).Remarkably, this M. guttatus parent also carries a large block of introgression spanning this entire region (Figure1), suggesting that gene flow from M. nasutus has re-introduced the ancestral variant into CAC110.We discovered a similar situation on LG13, where introgression from M. nasutus accounts for collinearity between CAC9 and CAC110 at inv13 (recombination rates in the CAC110 RIL are much higher in this region than in the CAC162 and CAC415 RILs; FigureS5).Additionally, inv13 might have arisen in M. nasutus, as it also appears to be present in the reference genome assembly of M. nasutus SF5 (an accession derived from Shear's Falls, an allopatric M. nasutus population ~70 km from CAC), but not in other recently assembled M. guttatus genomes (phyto zome.org).Notably, all three putative inversions also overlap regions showing evidence of introgression in CAC262 (FiguresS2 and S6), suggesting this highly admixed line likely carries multiple M. nasutus structural variants.Taken together, these results point to introgression from M. nasutus as a major source of large-scale structural variation at Catherine Creek, responsible both for restoring ancestral variants and for introducing newly derived inversions into M. guttatus.

F
Introgression from M. nasutus into sympatric M. guttatus.(a) π calculated in 50 kb windows across the first two chromosomes of the IM62 v3 Mimulus guttatus reference genome, between the parental lines of the three RIL populations (CAC9 vs. CAC110, CAC9 vs. CAC162, and CAC9 vs. CAC415).Blue shaded regions indicate putatively admixed regions in the three focal CAC M. guttatus lines tested, diagnosed by a greater than a 95% posterior probability of M. nasutus ancestry (in 1 kb windows) as inferred via the HMM (i.e. more recent coalescence with included M. nasutus samples than with included M. guttatus samples).Grey shaded regions indicate missing data (fewer than 5000 genotyped sites within any 50 kb window).Red heatmaps above each chromosome indicate the density of centromeric repeats (Fishman & Saunders, 2008), identified via BLAST, in the same 50 kb windows.All chromosomes in all tested M. guttatus lines are shown in Figure S2.(b) Heatmap showing the percentage of M. nasutus ancestry on each chromosome, and across the genome, of each of the three CAC M. guttatus lines used as parents of the RIL populations, as estimated as the percentage of 1 kb windows with a greater than a 95% posterior probability of M. nasutus ancestry as inferred via the HMM.Darker blue shading indicates a higher percentage of M. nasutus ancestry.All tested M. guttatus lines are shown in Figure S3.genotypes across the genome (~89% of distorted markers show an excess of M. nasutus).In a few cases, the same region is even distorted in opposite directions (e.g. at the beginning of LG4: M. guttatus excess in CAC110 RILs, M. nasutus excess in CAC162 RILs; Figure 3, Figure S7).Despite this general variability, there are a few instances of shared distortion in multiple RIL populations, potentially implying the same causal mechanisms.Most notably, the proximal end of LG14 shows an excess of M. guttatus alleles in all three populations.There is also an excess of M. guttatus alleles on LGs 1 and 4 and an excess of M. nasutus alleles on LGs 8, 12, and 13 in two of the three RIL populations (CAC110and CAC162).These regions of shared distortion are largely confined to segments of the genome that are non-introgressed in parental lines.In addition to local peaks of TRD among homozygous marker genotypes, we also investigated levels of residual heterozygosity in the three RIL populations.At the SNP level (RIL windowed genotypes exclude heterozygous calls; see Section 2), we found low levels of heterozygosity across RIL genomes (CAC110 RILs: 1.95%, N = 13,975; CAC162 RILs: 2.34%, N = 33,340; CAC415 RILs: 2.79%, N = 23,178), consistent with the Mendelian expectation of 1.5%-3% for five to six generations of self-fertilization.Residual heterozygosity was broadly uniform across the genome, with no clear regions enriched for heterozygous SNP calls.
ing gene flow -provide an opportunity to investigate the evolution and maintenance of reproductive barriers in nature.Here we constructed three RIL populations in one such system: hybridizing sister species Mimulus guttatus and M. nasutus from the sympatric Catherine Creek site.We found abundant, but variable, admixture from M. nasutus in CAC M. guttatus, with the three M. guttatus RIL parents carrying a representative range of introgression across their genomes (~9%-24%).As has been found in another well-studied M. guttatus population(Lee et al., 2016), we discovered evidence of segregating inversions at Catherine Creek, including some with variants introduced from M. nasutus.Patterns of TRD and long-range LD, along with interacting QTLs for hybrid sterility, also provide evidence of multiple segregating hybrid F I G U R E 2 Map position (cM) versus physical marker position (Mb) for the first two chromosomes of the IM62 v3 Mimulus guttatus reference genome in the three RIL populations.Red heatmaps below each curve indicate the density of markers, measured as the distance in Mb to the nearest neighbouring marker.Blue shaded regions below heatmaps indicate putatively admixed regions (see Figure 1).Low SNP density, and the subsequent inability to genotype markers in areas of the M. guttatus parental genomes showing evidence of introgression from M. nasutus manifests as regions of low marker density on some chromosomes.Regions of low recombination in specific RIL populations indicating a putative inversion between CAC9 and the M. guttatus parent of that population are shaded in grey.All linkage groups are shown in Figure S5.incompatibilities between these naturally hybridizing species.Additionally, these RILs are now a permanent resource to be used in research on adaptation and speciation in Mimulus, facilitating future experiments to identify the genes involved in hybrid incompatibilities and understand their ecological context across environmental conditions.4.1 | M. nasutus as a source of structural variation in sympatric M. guttatusConsistent with previous work showing signatures of ongoing and historical introgression at Catherine Creek(Brandvain et al., 2014;Kenney & Sweigart, 2016), we identified many clear tracks of admixture from M. nasutus across the genomes of CAC M. guttatus (despite having chosen these lines for their species-diagnostic traits).

F
Transmission ratio distortion by map position (cM), (a) Linkage group (LG) 1-7, (b) LG8-LG14.Genotype frequencies in windowed 50 kb markers across the 14 linkage groups of the three RIL populations.Red dots indicate homozygous M. guttatus genotypes, yellow dots, homozygous M. nasutus genotypes.Bars at the top of each plot indicate regions showing significant transmission ratio distortion from the Mendelian 1:1 expectation by χ 2 tests (1 df) at α = 0.01 (thin bars), and a more stringent Bonferonni-corrected α (thick bars, unique α for each chromosome in each RIL population depending on the number of markers tested, see Table S3 for values).Bar colour indicates an excess of M. guttatus (red), or M. nasutus (yellow) genotypes.Markers are ordered by genetic position within each map (cM).The same figure with markers ordered by physical position in the M. guttatus IM62 v3 reference genome (Mb) is shown in Figure S7.TA B L E 2 Positions of two-locus epistasis.Physical (Mb) positions of pairs of loci on each linkage group (LG) in the three RIL populations made up of contiguous 50 kb markers with LD (r) in the 95th percentile, as well as Chi-Square statistics from likelihood ratio tests (3 df) with p < .0001.Counts of each genotypic class in the population are shown.Loci marked with an asterisk have peak r and χ 2 values at adjacent markers (*).Single locus proportions of M. guttatus alleles (G Prop.) were used to determine if the pattern of TRD at each epistatic locus could be explained by a two-locus hybrid incompatibility (NG-or GN-in 'Pattern' column), or if epistatic selection favouring the same parental genotypes better explains the pattern (GG or NN in 'Pattern' column).The best candidates for two locus hybrid incompatibilities (NG-or GN-pattern) are shaded grey.See TableS5for a full list of epistatic loci in the 95th percentile of LD.
Male and female fertility QTL locations and effects in each mapping population.Linkage group (LG), map (cM), and physical (Mb) position of each significant QTL are shown.Additive effects (a) are negative when the M. guttatus allele has the lower value, indicated in the summary column.QTLs with significant epistatic effects are shaded together.Percentage of RIL population variance (PVE) is shown for each additive and epistatic effect.The 1.5-LOD interval of the QTLs underlying female fertility on LG10 in the CAC415 RILs, marked with an asterisk, spans a breakpoint of the known derived inversion in the reference M. guttatus line IM62, leading to a disjunct interval (*).Effect plots are shown in Figures S15-S18.

F
Male and female fertility QTL locations in each mapping population.Physical position (Mb) of the 1.5-LOD interval of each significant QTL is shown.Phenotype and RIL population indicated in key.Arrows indicate the position of peak marker(s).Arrows pointing left indicate the M. guttatus allele has the lower value, and arrows pointing right indicate the M. nasutus allele has the lower value.QTLs with significant epistatic effects are indicated with shared symbols.Centromere locations are indicated in black.Note: The 1.5-LOD interval of the QTL underlying female fertility on Chr10 in the CAC415 RILs spans a breakpoint of the known derived inversion in the reference M. guttatus line IM62, leading to a disjunct interval.