No evidence for maintenance of a sympatric Heliconius species barrier by chromosomal inversions

Mechanisms that suppress recombination are known to help maintain species barriers by preventing the breakup of coadapted gene combinations. The sympatric butterfly species Heliconius melpomene and Heliconius cydno are separated by many strong barriers, but the species still hybridize infrequently in the wild, and around 40% of the genome is influenced by introgression. We tested the hypothesis that genetic barriers between the species are maintained by inversions or other mechanisms that reduce between‐species recombination rate. We constructed fine‐scale recombination maps for Panamanian populations of both species and their hybrids to directly measure recombination rate within and between species, and generated long sequence reads to detect inversions. We find no evidence for a systematic reduction in recombination rates in F1 hybrids, and also no evidence for inversions longer than 50 kb that might be involved in generating or maintaining species barriers. This suggests that mechanisms leading to global or local reduction in recombination do not play a significant role in the maintenance of species barriers between H. melpomene and H. cydno.


Impact Summary
It is now possible to study the process of species formation by sequencing the genomes of multiple closely related species. Heliconius melpomene and H. cydno are two butterfly species that have diverged over the past two million years. These species have different color patterns, mate preferences, and host plants, traits that involve variants of multiple genes spread across the genome. However, the species still hybridize infrequently in the wild and exchange large parts of their genomes. Typically, when genomes are exchanged, chromosomes are recombined and gene combinations are broken up, preventing species from forming. Theory predicts that gene variants that define species might be linked together because of structural differences in their genomes, such as chromosome inversions, that will not be broken up when the species hybridize. We sequenced large crosses of butterflies to show that there are almost certainly no megabase-long chromosome regions that are not broken up during hybridization, and while we find evidence for some small chromosome inversions (on the order of tens of kilobases in size), it is unlikely that these are necessary to keep gene combinations together. This suggests that hybridization is rare enough and mate preference is strong enough that inversions are not necessary to maintain the species barrier.

Introduction
It is now widely appreciated that the evolution and maintenance of new species is constrained by genetic as well as ecological and geographical factors (Seehausen et al. 2014). A classic problem for speciation is that if combinations of divergently selected alleles arise in populations that remain in contact, recombination is expected to break down the associations between alleles and prevent speciation from proceeding (Felsenstein 1981). A large body of work has invoked genetic mechanisms that couple species-specific alleles and so reduce the homogenizing effects of gene flow (Smadja and Butlin 2011;Nachman and Payseur 2012), including assortative mating, one-allele mechanisms (Felsenstein 1981), tight physical linkage, pleiotropy, and multiple (or "magic") traits (Servedio et al. 2011). Here, we focus on the role of chromosomal inversions in suppressing recombination of divergently selected alleles in hybrids.
Inversions have frequently been implicated in speciation (White 1978;King 1993;Ayala and Coluzzi 2005;Hoffmann and Rieseberg 2008;Kirkpatrick 2010). Traits associated with reproductive isolation are often linked to inversions (e.g., Noor et al. 2001;Ayala et al. 2013;Fishman et al. 2013) and genetic divergence between species can increase within inverted regions through reduction of gene flow (Navarro and Barton 2003b;Jones et al. 2012;McGaugh and Noor 2012;Lohse et al. 2015). Theory predicts that inversions can spread by reducing recombination between locally adapted alleles (Butlin 2005;Kirkpatrick and Barton 2006;Feder and Nosil 2009;Ortíz-Barrientos et al. 2016), which can either establish or reinforce species barriers by capturing loci for isolating traits such as mating preferences and epistatic incompatibilities (Dagilis and Kirkpatrick 2016) or allow adaptive cassettes to spread between species via hybridization (Kirkpatrick and Barrett 2015). Reduced recombination within and around inversions has been confirmed in several species (Stevison et al. 2011;Farré et al. 2013), although it is unlikely that gene flow is entirely suppressed within inversions, due to double crossovers and gene conversion (Korunes and Noor 2017), factors addressed in some recent models (Guerrero et al. 2012;Feder et al. 2014).
Several authors have predicted that inversions can enable the formation and maintenance of species barriers in sympatry or parapatry by favoring the accumulation of barrier loci in the presence of gene flow (Noor et al. 2001;Rieseberg 2001;Navarro and Barton 2003a;Faria and Navarro 2010), as opposed to older models where inversions have direct effects on hybrid fertility or viability (discussed in Rieseberg 2001). Especially striking is the fact that most sympatric Drosophila species pairs differ by one or more inversions, whereas allopatric pairs are virtually all homosequential (Noor et al. 2001). In the particular case of Drosophila pseudoobscura and D. persimilis, three chromosomes differ by large, fixed inversions and a fourth chromosome has many varied arrangements Noor et al. 2007), genome differentiation is greater within and near inversions McGaugh and Noor 2012) and sterility factors are associated with inversions in a sympatric species pair, but with collinear regions in an allopatric pair (Brown et al. 2004). In rodents, sympatric sister species typically have more autosomal karyotypic differences than allopatric sister species (Castiglia 2014). Sympatric sister species of passerine birds are significantly more likely to differ by an inversion than allopatric sister species, with the number of inversion differences best explained by whether the species ranges overlap (Hooper 2016). Although inversions are not the only mechanism by which recombination rate can be modified during speciation, and more recently attention has been drawn to the potential role of genic recombination modifiers (Ortíz-Barrientos et al. 2016), the very strong effect of inversions on recombination rate, and the fact that they are completely linked to the locus at which recombination is reduced, means that they are perhaps the most likely mechanism of recombination rate evolution during speciation.
We set out to test the role of inversions in the maintenance of species barriers in Heliconius butterflies. The 46 species of Heliconius have been the focus of a wide range of speciation research (Merrill et al. 2011a;Supple et al. 2013;Kozak et al. 2015;Merrill et al. 2015). Chromosomal inversions are known to play an important role in the maintenance of a complex color pattern polymorphism in Heliconius numata (Joron et al. 2011). However, no other Heliconius inversions have been identified with traditional methods, as Heliconius chromosomes typically appear as dots in chromosome squashes, at least in male tissues (Brown et al. 1992).
Here, we systematically searched for inversions between populations of two Heliconius species, H. melpomene rosina and H. cydno chioneus, which are sympatric in the lowland tropical forests of Panama. These species differ by many traits (Jiggins 2008) including color pattern (Naisbit et al. 2003), mate preference Naisbit et al. 2001;Merrill et al. 2011b), host plant choice (Merrill et al. 2013), pollen load, and microhabitat (Estrada and Jiggins 2002). Hybrid color pattern phenotypes are attacked more frequently than parental forms, indicating disruptive selection against hybrids (Merrill et al. 2012). Assortative mating between the species is strong, and genetic differences in mate preference are linked to different color pattern loci (Merrill et al. 2011b). Matings between H. cydno females and H. melpomene males produce sterile female offspring, but male offspring are fertile, and female offspring of backcrosses show a range of sterility phenotypes (Naisbit et al. 2002). Hybrids are extremely rare in the wild, but many natural hybrids have been documented in museum collections (Mallet et al. 2007) and examination of present-day genomic sequences indicate that gene flow has been pervasive, affecting around 40% of the genome (Martin et al. 2013;Arias et al. 2014). Modeling suggests that the species diverged around 1.5 million years ago, with hybridization rare or absent for one million years, followed by a period of more abundant gene flow in the last half a million years (Kronforst et al. 2013;Martin et al. 2015b), suggesting that the species originated in parapatry, but have been broadly sympatric and hybridizing during their recent history. Although the species are closely related, they are not sister species; several other species such as Heliconius timareta and Heliconius heurippa are more closely related to H. cydno than H. melpomene.
Models predict that inversions, or other modifiers of recombination, can be established during both sympatric speciation and secondary contact (Noor et al. 2001;Rieseberg 2001;Feder and Nosil 2009;Feder et al. 2011;Feder et al. 2014). Furthermore, the genetic basis for species differences between H. melpomene and H. cydno is well understood and would seem to favor the establishment of inversions. Wing pattern differences are controlled by a few loci of major effect (Naisbit et al. 2003), some of which consist of clusters of linked elements. There is also evidence for linkage between genes controlling wing pattern and those underlying assortative mating (Merrill et al. 2011b). The existing evidence for clusters of linked loci of major effect would therefore seem to favor the evolution of mechanisms to reduce recombination between such loci, and hold species differences in tighter association.
We therefore set out to investigate patterns of recombination and test for the presence of inversions between H. melpomene rosina and H. cydno chioneus. H. melpomene melpomene has a high-quality genome assembly with 99% of the genome placed on chromosomes and 84% ordered and oriented (Heliconius Genome Consortium 2012; Davey et al. 2016). Whole genome resequencing has shown that F ST between H. melpomene melpomene and H. melpomene rosina is consistently low across the genome, with only a few small, narrow peaks of divergence, but F ST between H. melpomene rosina and H. cydno chioneus is substantially higher and heterogeneous (Martin et al. 2013), and many gene duplications have been identified between the two species (Pinharanda et al. 2017).
However, H. melpomene and H. cydno have not yet been examined for evidence of large differences in genome structure such as inversions. To test for this, we constructed fine-scale linkage maps for H. melpomene, H. cydno, and H. cydno x H. melpomene hybrids to test for the presence of reduced recombination in hybrids and inverted regions between the species (Fig. 1). Our linkage maps are based on tens of thousands of new single nucleotide polymorphisms (SNPs) discovered and genotyped using RAD Sequencing data from just under 1000 individuals from 24 crosses. We also generated long-read sequencing data and new genome assemblies for both species to test for inversions on smaller scales. This is the first systematic survey of genome structure and recombination at a fine scale in a lepidopteran species, and also one of very few such surveys of both parent species and their hybrids (Ortíz-Barrientos et al. 2016), which we hope will be a valuable test case for the role of inversions in speciation.

LINKAGE MAPS
Full details of methods for our crosses, library preparations, sequencing, and linkage map construction can be found in Supporting Information. In brief, for the within-species crosses of H. melpomene rosina and H. cydno chioneus, wild males were mated to virgin stock females and linkage maps were constructed from F1 offspring, whereas for hybrids, H. cydno stock females were mated to wild H. melpomene males, and F1 males were backcrossed to H. cydno stock females, with linkage maps constructed from backcross offspring (Fig. S1). Grandparents, parents, and offspring were RAD sequenced using the PstI restriction enzyme on Illumina HiSeq 2500 and 4000 machines using 100 bp paired end reads, except for one H. melpomene and one H. cydno trio which were whole genome sequenced with 125 bp paired end Illumina HiSeq 2500 sequencing (previously reported in Malinsky et al. 2016), and 58 hybrid individuals that were sequenced on a HiSeq 2000 using 50 bp single-end sequencing. RAD sequences were demultiplexed with Stacks (Catchen et al. 2013) and Illumina RAD and whole genome reads were aligned to version 2 of the H. melpomene genome (Hmel2; Davey et al. 2016) with Stampy (Lunter and Goodson 2011), Picard (http://broadinstitute.github.io/picard/), and GATK (dePristo et al. 2011) and genotype posteriors called with SAMtools mpileup (Li 2011).
Linkage maps were constructed from genotype posteriors using Lep-MAP. Within-species linkage maps for H. melpomene and H. cydno were built with Lep-MAP2 (Rastas et al. 2016) and some additional bespoke scripts. Due to the more complex cross structure of backcross populations, smaller cross sizes, and lower sequence quality for some crosses, different methods and thresholds were used to construct linkage maps for the H. cydno x H. melpomene hybrid crosses, now incorporated into Lep-MAP3 (https://sourceforge.net/projects/lep-map3/). Most notably, separate linkage maps were built for each large within-species cross, but only one linkage map was constructed for all hybrids, given the small size of the backcross families. The hybrid linkage map was then divided into four separate maps for each pair of grandparents. Full details can be found in Supporting Information.
In brief, SNPs were filtered to ensure each genotype was supported by multiple reads in the majority of individuals, excluding SNPs with rare alleles and segregation distortion. Missing parental genotypes were called based on related parent and offspring genotypes. Markers were identified by clustering together SNPs with almost identical patterns and filtering candidate markers with low support. Markers were separated into linkage groups, setting parameters empirically to identify 21 linkage groups for the expected 21 H. melpomene/H. cydno chromosomes, and markers for each linkage group were ordered. As females do not recombine, maternal markers were easy to identify and unchanging, so we could also make use of thousands of SNPs where both parents were heterozygous, by removing the maternal alleles and so converting the SNPs to paternal-only markers (Jiggins et al. 2005). Initial marker orderings were manually reviewed and edited, and all SNPs were reassigned to the final set of cleaned markers to improve coverage of the genome.

GENOME SCAFFOLDING
Hmel2 scaffolds were manually ordered according to the linkage maps for each of the three Heliconius melpomene crosses wherever possible. A small number of misassemblies in Hmel2 were corrected, with scaffolds being split and reoriented where necessary. Not all scaffolds could be ordered based on the linkage maps alone, so Pacific Biosciences reads were also used. PacBio reads were aligned to Hmel2 scaffolds using BWA mem with -x pacbio option (Li 2013). Scaffolds were ordered by manual inspection of spanning reads between scaffolds, identified and summarized by script find_pacbio_scaffold_overlaps.py. Chromosomal positions were assigned by inserting dummy 100 bp gaps between each pair of remaining scaffolds. Although PacBio sequencing could fill gaps between scaffolds, we chose not to do this for these analyses to avoid disrupting Hmel2 linkage map and annotation feature coordinates.

PERMUTATION TESTING
CentiMorgan values were calculated using the recombination fraction alone, as the maps were sufficiently fine-scale that mapping functions were not necessary (Ziegler and König 2001); see Supporting Information note on crossover detection for further details. Per-cross maps (Fig. S3) and map statistics (Table 1) were calculated for F1 parents within-species and for grandparents for hybrids. Marey maps (Figs. 2 and S3) and total map lengths were calculated using centiMorgan values. Chromosomes were tested for reductions in chromosome-wide recombination rate in the hybrids compared to H. melpomene or H. cydno using a bootstrapped Kolmogorov-Smirnov test suitable for discrete data with ties such as recombination counts (ks.boot in the R Matching package; Sekhon 2011), using a one-tailed test for reduced rates in hybrids with 10,000 bootstrap samples, declaring significance at a 0.05 Fine-scale recombination rates ( Fig. S4) were calculated in windows of 1 Mb with 100 kb steps, counting individual crossovers in each window (see Supporting Information note on crossover detection for further details). One megabase windows were tested for differences in recombination rate by calculating null distributions of rate differences by permutation of species labels across all offspring, testing at a 0.05 false discovery rate over 270,000 permutations, controlling for multiple tests with three comparisons for each of 2549 windows. Ninety-five percent confidence intervals in Figure S4 were calculated by bootstrap, sampling offspring for each species by replacement 10,000 times and calculating centiMorgan values, plotting 2.5 and 97.5% quantiles for each window.

INVERSION DISCOVERY
PBHoney (in PBSuite version 15.8.24 (English et al. 2014)) was used to call candidate inversions between H. melpomene and H. cydno, using alignments of PacBio data to ordered Hmel2 scaffolds made with BWA mem with -x pacbio option (Li 2013), retaining only primary alignments, and accepting alignments with minimum mapping quality of 30 in Honey.py tails, running separately on each of four samples (H. cydno females, H. cydno males, H. melpomene females, H. melpomene males). Break point candidate sets were compiled together into one file and scaffold positions converted to chromosome positions using script com-pile_tails.py. PBHoney was run with default options, requiring a minimum of three overlapping reads from three unique zero-mode waveguides to call a breakpoint candidate. As the H. cydno male sample had low coverage, we also ran PBHoney requiring a min-imum of two reads from one zero-mode waveguide and included these tentative candidates where they overlapped with candidates from other samples called with the default settings.
PBHoney was tested for false positives by simulating PacBio reads with pbsim 1.0.3 (Ono et al. 2013), generating a sample profile using the H. melpomene female sample and simulating 15 "SMRT cells" at 5x coverage each. Simulated data were then aligned with BWA and inversions called with PBHoney as above.
Trio assemblies were aligned to the ordered Hmel2 genome using NUCmer from the MUMmer suite (Kurtz et al. 2004; version 3.23), followed by show-coords with show-Tlcd options, to produce tab-separated output including scaffold lengths, percentage identities, and directions of hits.
Script detect_inversion_gaps.py was used to integrate the PBHoney inversion candidates with the linkage maps, trio alignments, and H. melpomene annotation (from Hmel2). As these data are being used to rule out inversions in regions without recombinations, PBHoney inversion candidates were rejected if at least one recombination for the same species as the candidate was contained within the inversion. PBHoney candidates were also rejected if there was a trio scaffold alignment spanning the candidate inversion, with spanning defined as extending more than half the length of the candidate inversion in either direction. Finally, candidates shorter than 1 kb were rejected, as linkage disequilibrium between SNPs separated by 1 kb or less in H. melpomene is significantly higher than background levels  and so inversions below this size are unlikely to be required to maintain linkage. The retained inversion candidates were then combined into groups by overlap.

POPULATION GENETICS STATISTICS
To look for evidence of variation in gene flow, F ST , d XY , and f d were calculated within and around candidate inversions following Martin et al. (2013Martin et al. ( , 2015a, using all four H. melpomene rosina, four H. cydno chioneus, four H. melpomene French Guiana, and two H. pardalinus samples from Martin et al. (2013). Samples were aligned to Hmel2 using bwa mem version 0.7.12 using default parameters and genotypes were called GATK version 3.4 HaplotypeCaller using default parameters except for setting heterozygosity to 0.02. For each candidate inversion, 11 windows equal to the size of the inversion were generated, one for the inversion itself and five either side of the inversion, except where candidates were at the ends of chromosomes. Statistics were calculated for each window with scripts popgenWindows.py and ABBABABAwindows.py in GitHub repository genomics_general (https://github.com/simonhmartin/genomics_general).

H. erato ANALYSIS
The H. erato version 1 genome assembly was downloaded from LepBase (http://ensembl.lepbase.org/Heliconius_erato_v1)  Table 1 and Fig. S1 for cross designs and Tables S1-S3 for full sample information) and generated PstI RAD sequencing data for a total of 963 offspring as well as whole genome sequencing for parent-offspring trios from H. melpomene cross 2 and H. cydno cross 1 (Tables S1-S3). Linkage maps were constructed from tens of thousands of SNPs discovered and genotyped in RAD sequencing and whole genome trio sequencing data (Fig. S2, Tables S4 and S5); separate maps were constructed for each within-species cross, but, due to the varying size and complexity of the hybrid crosses and heterogeneity of hybrid sequencing data, one single linkage map was constructed for all hybrid crosses using more conservative filters, and the single map was divided into separate F1 crosses post hoc (Table 1; Fig. S2; see Methods and Supporting Information for full details). We also generated Pacific Biosciences long-read data for pools of male and female larvae from H. melpomene cross 2 and H. cydno cross 1 (Table 2).
To improve the accuracy of our recombination rate measurements, we first used the new linkage maps and Pacific Biosciences long-read data for Heliconius melpomene to improve the scaffolding of version 2 of the H. melpomene genome assembly (Hmel2; 795 scaffolds, 275.2 Mb total length, 641 scaffolds placed on chromosomes (274.0 Mb), 2.1 Mb scaffold N50 length; Davey et al. 2016). This resulted in an updated genome assembly with 13 complete chromosomes, the remaining eight chromosomes having one long central scaffold with short unconnected scaffolds at either end (272.6 Mb placed on 21 chromosomes in 38 scaffolds, including 17 minor scaffolds at chromosome ends  ). The variation is largely due to variation in sequencing depth and PstI site occurrence, which are both related to GC content ( Fig. S3; Benjamini and Speed 2012; see Supporting Information note for full discussion). However, SNP density is not correlated with recombination rate, final map lengths, or crossover resolution, and final map lengths are consistent across all crosses (see below), so we do not believe variation in SNP density has affected our results (see Supporting Information note for further details).
Crossing over has previously been shown to be absent in Heliconius females (Turner and Sheppard 1975;Jiggins et al. 2005;Pringle et al. 2007;Davey et al. 2016), and we could find no evidence to the contrary in any of our crosses (Fig. S2), so we focus on paternal crossovers throughout (see Supporting Information note for a discussion and defense of this point). The paternal linkage maps have a mean genetic length of 51 cM and mean recombination rate of 4.2 cM/Mb per chromosome for both species and hybrids (Table 3). Mean crossovers per offspring across 21 chromosomes were 10.8 in H. melpomene (SD 2.4, from 335 offspring) and 10.7 in H. cydno (SD 2.2, from 297 offspring). This is consistent with an expectation of one crossover per chromosome per offspring and a 50% chance of inheritance of one of the 2 recombined gametes (from 4 total gametes).

SPECIES AND HYBRIDS
To identify potential genomic regions that may influence the maintenance of the species barrier, we examined our linkage maps for evidence of reduced recombination in the hybrids compared to the within-species crosses at the genome-wide scale, the chromosome scale, and at fine scale (1 Mb windows). Figure 2 shows Marey maps (Chakravarti 1991)   per species combined (see Fig. S4 and Table 1 for per-cross Marey maps and map lengths). Mean broad scale recombination rates and total genome-wide map lengths were almost identical across H. melpomene, H. cydno, and the hybrids (Tables 1 and 3; mean genome-wide recombination rates were all 3.9 cM; mean chromosome-wide recombination rates were all 4.2 cM; total map lengths were H. melpomene, 1081 cM; H. cydno, 1074.1 cM; hybrids, 1077 cM). Some differences in chromosome-scale recombination rate between the species maps are visible; for example, on chromosome 17, the H. melpomene map is 9.1 cM longer than H. cydno; on chromosome 6, H. cydno is 6.8 cM longer than H. melpomene ( Fig. 2; Table 3). However, we are primarily interested in recombination suppression in hybrids, and only chromosome 2 has a significantly reduced chromosome-wide recombination rate in the hybrid crosses, and only when compared to H. melpomene, not H. cydno (one-tailed Kolmogorov-Smirnov tests; see Methods).
At the fine scale, measuring recombination rate in sliding 1 Mb windows across chromosomes, regions with reduced recombination in the hybrids can be observed ( Fig. S5; for example, chromosome 17, 11-13 Mb and chromosome 19, 13.5-14 Mb), but none of these regions are statistically significant (permutation test for fine-scale variation in recombination in 1 Mb sliding windows at a 5% false discovery rate; see Methods).

INVERSIONS BETWEEN SPECIES
We also examined our recombination maps for evidence of inversions between species (Fig. 1). There are no regions of any map with a detectable reversed region in H. cydno or the hybrids with respect to H. melpomene (Fig. 2). This is true for the species maps and for all individual cross maps (Fig. S4). This indicates there are no large fixed inversions between H. melpomene and H. cydno.
Known or predicted chromosome inversions involved in the maintenance of species barriers are typically megabases long, and models indicate that inversions may have to be very large to become fixed in a population (Feder et al. 2014). Our maps are sufficiently fine scale to rule out the presence of inversions on the megabase scale (H. melpomene mean gap between markers, 115 kb, median 87 kb, maximum 1.38 Mb; H. cydno mean 135 kb, median 101 kb, maximum 1.14 Mb; see Figs. S6 and S7, and Supporting Information notes on our ability to detect and resolve crossovers). Simulation of random inversions indicates that our existing maps give us power to detect ß98% of 500 kb inversions, ß90% of 250 kb inversions and ß75% of 100 kb inversions (Fig. S8). These sizes are smaller than most inversions known to be associated with adaptive traits or species barriers, which are typically on the megabase scale; however, they are on the order of the sizes of the known inversions involved in within-species polymorphism in H. numata (see Introduction). The recombination maps alone do not rule out the presence of an inversion in any remaining gap between markers within H. melpomene or H. cydno.

SEQUENCE READS AND TRIO ASSEMBLIES
To test for the presence of smaller fixed inversions between H. melpomene and H. cydno that were undetectable using our recombination maps, we generated Pacific Biosciences long-read sequence data for pools of male and female larvae from one each of the H. melpomene and H. cydno crosses used to generate recombination maps (Table 2; Figs. S9 and S10). We called candidate inversions from the long-read data using PBHoney to identify reads with clipped alignments, realign the clipped read ends, and detect such split reads with inverted alignments.
We also generated Illumina short-read assemblies of the maternal and paternal genomes of one offspring from the same crosses used to generate the linkage maps and PBHoney candidates. These assemblies were constructed using a trio assembly method that separates maternal and paternal reads from one offspring and constructs haplotypic assemblies of each parental genome, providing longer and more accurate contigs compared to standard Illumina assemblies of heterogenous genomes such as those of Heliconius species (Malinsky et al. 2016; Table 2). We aligned these trio assemblies to the H. melpomene genome and assessed whether the resulting alignments supported or conflicted with candidate split read inversions.
In total, 1494 raw PBHoney split read candidates were identified across the four samples (two sexes for each of two species; Tables 2 and S6). As we consider our linkage maps to be reliable, and we are concerned with regions of the genome where our linkage maps do not contain recombinations, we rejected 438 split read candidates (30%) that spanned recombinations in the linkage maps (Table S6), of which 294 (20%) were longer than 1 Mb, with 36 (2.5%) longer than 10 Mb. The remaining candidates were all in regions that may contain crossovers but where crossover location could not be resolved, or in regions where multiple SNPs showed that there were no crossovers and so recombination could not be used to detect inversions (see Supporting Information note for discussion).
A further 344 candidates (23%) were removed because the candidate was spanned by a trio scaffold from the same species by 50% of the inversion length on either side (Table S6). These rejected candidates are likely to be mostly false positives; when we simulated PacBio reads directly from the ordered Hmel2 reference genome, PBHoney called 49 "false-positive" inversions. Alternatively, they may be generated by polymorphic inversions that are not present in the two parental haplotypes in the trio assemblies, but are present in at least one of the other two parental haplotypes and so detectable in the PacBio data, but as we expect only fixed inversions to contribute to species barriers, we have not considered these candidates any further.
A further 199 of the 1494 candidates (13%) were removed because they were shorter than 1 kb (Table S6) on the grounds that there is already above-background linkage disequilibrium between SNPs separated by 1 kb or less in H. melpomene . The remaining 463 split read candidate inversions from the four samples were merged into 185 candidate groups based on their overlaps. We expect fixed inversions to be present in both sexes for each species, but the four samples were sequenced with variable coverage, with particularly low coverage for the H. cydno males (Table 2). Given this, 173 additional candidates with less robust support that overlapped with the 185 merged groups were included in the dataset (Table S6). Each of the merged groups was then classified based on their presence in either or both species and their support by split read and trio assembly evidence (Table 4; Figs. 3 and S11-S17; Table S7; see Methods for full criteria).
Despite the high rate of likely false positives, PBHoney does appear to detect some genuine inverted sequences relative to the reference genome. Where candidate inversions are identified in the same location from both H. melpomene and H. cydno sequence data, it is likely that these candidates are accurately reflecting a misassembly in the reference genome. This is especially the case where the inversion breakpoints fall at contig boundaries in the assembly, as local misassembly can prevent neighboring contigs from being assembled. There were 59 candidate groups where PBHoney found overlapping inversions in both H. cydno and H. melpomene, 50 (85%) of which span multiple contigs, with most inversion breakpoints falling at or near to the end of a contig (Table 4; Figs. 3 and S15-S17). This indicates either that some whole contigs are inverted, or that the ends of contigs have inverted regions that need to be reassembled (which perhaps explains the failure to fill the contig gaps during assembly). In contrast, candidate inversions specific to one or other species were less likely to span multiple contigs (20 of 65 H. cydno candidates (31%), and 19 of 56 H. melpomene candidates (35%); Figs. S11-S14). We suggest that while some of these species-specific inversions could be explained by misassemblies and incomplete PacBio coverage across both species, many of them could be genuine inversions.

ASSEMBLIES AND POPULATION GENETICS
As the false positive rate for PBHoney is high, we made further use of the trio assemblies to find support for the remaining PBHoney candidate inversion groups (Tables 4 and S7, Figs. S11-S17). Thirteen H. cydno and nine H. melpomene groups were further supported by trio scaffolds aligning with inverted hits within inversion breakpoints (Fig. 3, "Split reads and trio assembly"; Figs. S11 and S13). Of these, eight H. cydno and three H. melpomene candidates did not have inversion breakpoints near contig boundaries, suggesting that they are less likely to be due to genome misassemblies. If these inversions are speciesspecific, as indicated by the PBHoney output, we expect support for the reference genome order in the species that does not possess the inversion candidate. Indeed, six of these H. cydno and all three H. melpomene candidates have trio scaffolds of the other species spanning the whole inversion or one of the breakpoints, supporting the inversion as being species-specific (Figs. S11.2, S11.4, S11.6, S11.8, S11.10, S11.11; Fig S13.6, S13.8, S13.9). Hence, we have likely detected a small number of species-specific inversions. However, the longest of these candidates is Figure  S11.2 at 20,247 bp, far shorter than any known inversion relevant for speciation and shorter than is expected to become fixed in simulations (Feder et al. 2014). Furthermore, this is only slightly larger than the distance at which linkage disequilibrium in H. melpomene reaches background levels (ß10 kb; Martin et al. 2016), such that any effect of reduced recombination would be slight in population genetic terms. We conclude that there are a small number of likely species-specific inversions, but that these are too small to play a role in speciation via reduced recombination. Notably, none of these candidate inversions were located near loci known or suspected to determine species differences in wing pattern or any other trait with known locations (Nadeau et al. 2014, Davey et al. 2016; we have transferred the H. melpomene loci to positions in the ordered Hmel2 genome in Table S8 and also included a table of all candidate inversion positions for comparison in Table S7; the BD region has been narrowed based on the results of Wallbank et al. 2016).
We also calculated F ST , d XY , and f d (Cruickshank and Hahn 2014;Martin et al. 2015a) across inverted regions (Figs. S11-S17) to look for evidence of variation in gene flow at the inversion relative to surrounding regions. An inversion acting as a species barrier typically produces a signal of elevated F ST and reduced admixture (here estimated using f d ; Aulard et al. 2002;Deng et al. 2008;Huynh et al. 2011;Nachman and Payseur 2012;Fontaine et al. 2015;Love et al. 2016), and an inversion enabling the spread of an adaptive cassette between species (Kirkpatrick and Barrett 2015) might produce a signal of elevated f d . However, we see very little evidence for deviations in these statistics within the handful of candidate inversions compared to the surrounding regions, with only one H. cydno inversion (Fig. S11.4, 11,719 bp long) showing a noticeable localized increase in F ST and small increase in d XY . This region contains no annotated features, although of course this does not rule out some functional importance of this region.
Some candidates with only split read evidence, many in only one sex, are hundreds of kilobases long (outliers labeled in Fig. 3, particularly those marked with circles, where breakpoints are not near contig boundaries), which, if real, may be relevant to speciation. However, given the large number of false positives produced by PBHoney, the lack of supporting evidence from trio assemblies, and the lack of clear, localized deviations in F ST , d XY , and f d signals at these candidates, it is unlikely these candidates, even if they are real, are substantial species barriers.

REGIONS
We used the recently completed H. erato genome assembly (Van Belleghem et al. 2017) to investigate the incidence of inversions between more divergent genomes in the Heliconius genus.
Heliconius melpomene and H. erato diverged 10 million years ago (Kozak et al. 2015; Fig. S18), considerably more than the ß1.5 million years between H. melpomene and H. cydno. Despite the substantial divergence time, the chromosomes of the two species are collinear throughout at the large scale, with a few exceptions. There are many regions of the H. erato genome assembly that are inverted relative to the ordered Hmel2 assembly, but they fall within regions where the H. erato or H. melpomene linkage maps were not informative and so may be due to genome misassemblies. For example, H. erato scaffolds Herato0201, Her-ato0202, and Herato0203 on chromosome 2, and the first 300 kb of H. melpomene chromosome 3, may be misoriented rather than genuinely inverted.
However, three large inverted and/or translocated regions are well supported by linkage map markers in both species, and so are likely to be genuine inversions (  (Herato0211, Herato0212, Herato0213, and Her-ato0214) and multiple linkage map markers in both species. On current scaffold ordering, this rearrangement appears to be an inversion followed by a translocation (for scaffold Herato0214), but it is likely to be a single inversion; as scaffolds Herato0212, Herato0213, and Herato0214 are all found at the same marker on the linkage map, it may be that these scaffolds need to be reoriented and reordered, inserting scaffold Herato0212 at the start of the inversion in Herato0211 and inverting Herato0214. Nevertheless, this large region deserves further attention, especially as some pairs of H. erato subspecies appear to have elevated F st in the center of chromosome 2 (see Fig. 2 of Van Belleghem et al. 2017). It is unclear whether this inversion is polymorphic only in H. erato or whether it is present in other Heliconius species.

Discussion
We have systematically tested the hypothesis that inversions causing reduced recombination rates in hybrids might maintain species barriers with gene flow (Ortíz-Barrientos et al. 2016). Highdensity linkage maps and high-coverage long-read sequence data give us considerable power to both measure recombination rate and detect structural rearrangements. We find evidence for some small inversions, but not for inversion differences between H. melpomene and H. cydno at a scale that is likely to influence the speciation process.
Our data have some limitations that might have prevented us from identifying genuine inversions between H. melpomene and H. cydno. First, we have only sequenced crosses from three or four pairs of parents per species, and so may have missed polymorphic inversions absent from our sampling of wild individuals. However, any inversion important for speciation is expected to be fixed between the species, so it should have been detected even in small samples. Second, our ability to detect inversions and differences in recombination is limited by the size of our crosses (roughly 300 individuals for each species and for the hybrids), and the maps contain regions of the genome up to a maximum of 1.3 Mb without crossovers that might conceivably harbor inversions (see Results, recombination maps show no major inversions between species); further crosses could improve resolution in these areas. Third, we have used RAD sequencing data to measure recombinations, which is limited to ß10 kb resolution (using the PstI restriction enzyme); some of the smaller candidate inversions could be confirmed by developing further markers within them at narrower resolution, but this would not change our conclusion that inversions are unlikely to be involved in speciation between H. melpomene and H. cydno.
One important aspect of our experimental design is that we have measured recombination in hybrids as well as investigating gene order in the parental species. This gives power to detect reduced hybrid recombination rate more generally as well as specifically the presence of inversions. We have found no evidence for significantly reduced recombination in hybrids, at the broad (chromosome) scale or megabase scale, suggesting that genic modifiers of recombination are unlikely to have widespread effects in these species. Larger crosses would give greater resolution to this test, and might detect smaller regions of reduced recombination. Nevertheless, we can decisively rule out the presence of any multimegabase rearrangements among these samples.
We complemented the linkage maps with PacBio sequencing and trio assemblies to detect candidate inverted regions at a smaller scale. This approach also has challenges and generated a high rate of false positives. One potential source of difficulties is reliance on alignment to the H. melpomene reference genome assembly. The existing assembly has 25% transposable element content (Lavoie et al. 2013) and is likely missing around 6% of true genome sequence, mostly due to collapsed repeats (Davey et al. 2016). Inversion breakpoints are typically repeat-rich, which increases the likelihood that reads or scaffolds will not align correctly, and that the breakpoint regions could be misassembled or absent in the reference genome and in the trio assemblies. This problem may be worse for H. cydno, where more divergent sequence may align incorrectly or not align at all, and unique H. cydno sequence will not be present in the reference (an additional ß5% of H. cydno sequence did not map to the H. melpomene genome compared to H. melpomene samples; Table 2). These issues may explain the high observed rate of false positives in our data.
Nonetheless, the detection of likely genome misassemblies indicates that our methods do indeed have the power to detect real rearrangements. These are supported by multiple lines of evidence in both species and fall near contig boundaries. These misassemblies could be due to whole inverted contigs, or to misassembled inverted regions at the ends of contigs, which may be preventing the contigs being joined by spanning reads. Misassemblies demonstrate that our methods are capable of detecting large rearrangements in the sampled reads relative to the genome assembly.
In contrast, our candidate species-specific inversions are typically smaller than the misassemblies, and are mostly not supported by multiple lines of evidence. Indeed, we can find no compelling fixed candidate inversions supported by both the split read and trio assembly datasets that also show evidence of an increase in F ST or d XY , except for the 11.7 kb inversion shown in Figure S11.4, which is probably too small to substantially increase linkage across this locus beyond that expected by normal decay of linkage disequilibrium . It is possible that some of the candidates with less robust evidence are genuine, given the limitations described above, but on the existing evidence we cannot identify any inversions that are likely to be involved in maintaining species barriers between H. melpomene and H. cydno.
Although existing models identify situations where chromosome inversions can spread to fixation between two species and maintain a species barrier, they do not show that inversions always spread during speciation with gene flow. For example, in the model of Feder et al. (2014), inversions only fix when the strength of selection on the loci captured by the inversion is considerably lower than migration between the species. Similarly, Dagilis and Kirkpatrick (2016), modeling the spread of inversions that capture a mate preference locus and one or more epistatic hybrid viability genes, show that inversions are unlikely to spread where pre-and post-zygotic reproductive isolation is already strong. In a recent review, Ortíz-Barrientos et al. (2016) also highlight that during reinforcement, assortative mating and recombination modifiers such as inversions are antagonistic; if strong assortative mating arises first, there is only weak selection for reduced recombination.
We considered H. melpomene and H. cydno to be good candidates for the spread of inversions because there are linked loci causing reproductive isolation, because hybridization has been ongoing for much of their history, because an inversion is known to maintain color pattern polymorphism in H. numata (Joron et al. 2011), and because they are a parallel case to that of D. pseudoobscura and D. persimilis, where inversions do appear to maintain the species barrier . Comparisons between sympatric and allopatric populations of the two Heliconius species have shown that almost a third of the genome is admixed in sympatry and that hybridization has been ongoing for a long time (Martin et al. 2013), perhaps at a low rate.
However, strong selection on species differences and assortative mating are not conducive to the spread of inversions. Aposematic warning patterns are strongly selected (Mallet and Barton 1989) with F 1 hybrids twice as likely to be attacked as parental phenotypes (Merrill et al. 2012), and prezygotic isolation in the form of mate preference is almost complete ). Therefore, inversions may not be necessary for divergent loci to accumulate between the species. Thus, in this case, the evolution of strong assortative mating may have been favored by reinforcement selection and close physical linkage between preference and wing-patterning loci (Merrill et al. 2011b), and it is likely that the species barrier between H. melpomene and H. cydno has persisted with gene flow, but without the suppression of recombination by chromosome inversions.
An alternative and complementary explanation is that the rate of production of inversions may simply be low in Heliconius. This is suggested by the low background rate of fixation of inversions in Heliconius genomes. We have shown that, not only is there little evidence for substantial, fixed inversions between H. melpomene and H. cydno, but also that H. melpomene and H. erato, which last shared a common ancestor over 10 million years ago, have largely collinear genomes, and it is also known that there is substantial chromosomal synteny across the Nymphalids (Ahola et al. 2014). The association of multiple inversions with the wing pattern polymorphism in H. numata is all the more remarkable given the low background rate of inversions in these butterflies. This contrasts with, for example, the many fixed or polymorphic inversions in the genomes of Drosophila (Krimbas and Powell 1992), Anopheles (Ayala et al. 2014), and primates (Samonte and Eichler 2002), and especially with the solid case for the influence of inversions on speciation between sympatric populations of D. pseudoobscura and D. persimilis (Noor et al. 2001;Machado et al. 2007;Noor et al. 2007), where major fixed inversions occur on most chromosomes. Although H. melpomene and H. cydno have similarly divergent genomes overall compared to the Drosophila species pair, we do not find evidence for a similar role for inversions in maintaining the species barrier. Although inversions are clearly involved in speciation in many taxa studied to date, they appear to be absent in H. melpomene and H. cydno and in the flycatcher species pair Ficedula albicollis and F. hypoleuca (Ellegren et al. 2012), so the possibility of speciation without inversions should be kept in mind. We conclude that species barriers can persist during speciation with gene flow without substantial suppression of between-species recombination.
AUTHOR CONTRIBUTIONS JWD and CJ conceived and designed the project. JWD carried out Heliconius melpomene crosses, analyzed the data, and wrote the manuscript. SLB extracted DNA samples and prepared RAD libraries. PMR and JWD wrote linkage mapping software and built linkage maps. AP carried out H. cydno crosses and extracted DNA samples. SHM prepared population samples and wrote population genetics software. RMM carried out hybrid crosses and prepared RAD libraries. JWD, RMM, SHM, RD, and CDJ designed experiments and sequencing strategies. WOM managed insectaries in Panama and contributed to crossing designs. All authors read and commented on the manuscript.

ACKNOWLEDGMENTS
We are grateful to A. Tapia, L. Evans, M.-C. Melo, J. Scott, B. v. Schooten, A. Gonzalez-Karlsson, M. Abanto, T. Thurman, and all staff at the Smithsonian Tropical Research Institute for support and assistance with insectary work. The RAD sequencing protocol was developed by P. Etter at the University of Oregon and further enhanced by P. Fuentes-Utrilla and C. Eland at Edinburgh Genomics. PacBio sequencing was carried out by K.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's website: Supporting information Figure S1. Cross design. Figure S2. Genetic and physical maps for each ordered Hmel2 chromosome. Figure S3. SNP density, PstI site density, and GC content for each ordered Hmel2 chromosome. Figure S4. Marey maps of recombinations for each cross separately. Figure S5. Recombination rates in cM/Mb for each ordered Hmel2 chromosome. Figure S6. Gaps in linkage maps where lack of precise recombination data cannot rule out presence of inversions. Figure S7. Histograms of gap lengths for Heliconius cydno and H. melpomene. Figure S8. Probability of detecting random inversions of sizes from 10 kb to 1.5 Mb given existing linkage maps for Heliconius melpomene and H. cydno. Figure S9. Histograms of raw read lengths for Pacific Biosciences sequencing. Figure S10. Histograms of base depths across the genome after alignment of raw PacBio reads to Heliconius melpomene genome assembly Hmel2. Figures S11-S17. Full evidence for each candidate inversion group, separated into the classes shown in Figure 3 and Table 4. S11, H. cydno, split reads and trio assembly. S12, H. cydno, split reads only. S13, H. melpomene, split reads and trio assembly. S14, H. melpomene, split reads only. S15, Both species, split reads and trio assembly. S16, Both species, split reads only. S17, Both species, split reads in one species, trio assembly in both. Figure S18. Oxford grids for ordered Hmel2 scaffolds and H. erato scaffolds. Table S1. Full sample details for Heliconius melpomene crosses. Table S2. Full sample details for Heliconius cydno crosses. Table S3. Full sample details for Heliconius cydno x H. melpomene hybrid crosses. Table S4. SNP and marker information for within-species crosses. Table S5. SNP and marker information for hybrid crosses. Table S6. Counts of raw and filtered PBHoney candidate inversions. Table S7. Locations of candidate inversions. Table S8. Locations of known trait loci.