Parallel clines in a quantitative trait in butterfly co-mimics despite different levels of genomic divergence and selection

Hybrid zones, where phenotypically distinct populations meet and interbreed, give insight into how differences between populations are maintained despite gene flow. Divergence in quantitative traits, controlled by multiple loci, may require stronger barriers to gene flow than traits controlled by few, major effect loci. The butterflies Heliconius erato and Heliconius melpomene are distantly related Müllerian mimics that show parallel divergence in wing colour patterns between geographical races across South America. We investigated intraspecific hybrid zones in which the colour patterns of both species show discrete variation in pigmentation, and quantitative variation in iridescent blue. We tested whether differentiation across hybrid zones persisted due to genome-wide barriers to gene flow maintained by indirect selection, or whether it resulted from direct selection on colour patterns in the face of gene flow. Analysis of phenotypic clines revealed parallel clines in iridescence between species, consistent with direct selection for mimicry, but cline widths were different between species, indicating differences in the strength of selection on iridescence. Genotyping-by-sequencing revealed less defined population structure and weaker genomic differentiation in H. melpomene, suggesting the hybrid zone has evolved differently in the two species. In both species, iridescence clines were not concordant with genome-wide ancestry clines, suggesting that they are targets of direct selection. This is the first direct comparison of divergence in quantitative and Mendelian traits in this classic Müllerian mimicry system.


Introduction 40
Hybrid zones, where genetically differentiated populations are in contact and interbreed, have long 41 been a valuable resource in understanding the evolutionary processes shaping taxonomic boundaries. 42 Hybrid zones can form in continuously distributed populations, where different alleles are favoured at 43 either end of an ecological gradient or transition, a process called primary intergradation. 44 Alternatively, they can form when previously isolated populations, which have become genetically 45 differentiated in allopatry, come into secondary contact (Endler, 1977). Both scenarios can result in 46 clinal variation in quantitative traits and allele frequencies across the hybrid zone, where such clines 47 are maintained by the balance between gene flow following dispersal, and selection (Barton & Hewitt, 48 1985). 49 Total selection acting on a divergent locus is composed of direct selection on that locus, plus 50 the indirect selection it experiences from other loci also under divergent selection, depending on the 51 strength of linkage disequilibrium (LD) between it and those other diverged loci (Barton, 1983; Barton 52 & Gale, 1993). When the strength of direct selection outweighs indirect selection, the width of a cline 53 informs about the strength of selection acting on that trait or genetic locus, with cline width being 54 proportional to the strength of selection for a given dispersal parameter (Barton & Hewitt, 1985). 55 Under this scenario the cline centre indicates the point at which the direction of divergent selection 56 changes, so cline coincidence can indicate a common selective agent (Barton & Hewitt, 1985). The 57 shape and position of clines can also be determined by the number of loci involved and the strength of 58 LD between them. If the strength of direct selection on each locus is outweighed by indirect selection 59 from many loci in LD, the total selection affecting such loci will be approximately equal, resulting in 60 them having similar centres and widths, and being steeper at the centre (or "stepped") where LD is 61 strongest (Kruuk, Baird, Gale, & Barton, 1999;Szymura & Barton, 1991). 62 Differentiation in phenotypes showing discrete variation across hybrid zones can be maintained by 63 strong divergent selection, despite high levels of gene flow across the rest of the genome (Gompert,64 4 (Toews et al., 2016), and monkeyflowers (Stankowski, Sobel, & Streisfeld, 2015). In contrast, studies 66 which examine clines in quantitative traits under divergent selection across hybrid zones, often find that 67 they are accompanied by stepped clines in genetic markers, indicating indirect selection on many 68 different loci across the genome and genetic structure across the hybrid zone, despite some gene flow 69  Kruuk et al., 1999). 77 Therefore, an increased level of overall genome-wide differentiation, and population level genetic 78 structure may be expected across hybrid zones over which quantitative variation is maintained. 79 Here, we studied the clinal variation of two colour pattern traits in South and Central American 80 Heliconius butterflies: the yellow hindwing bar (Mendelian) and iridescence (quantitative). Heliconius 81 erato and Heliconius melpomene are Müllerian co-mimics with bright aposematic wing colour 82 patterns. Where they co-occur, they converge on almost identical wing colour patterns to share the 83 cost of educating predators of their distastefulness (Brown, 1981). Both species comprise many 84 parapatric colour pattern races, or subspecies, connected by hybrid zones (Mallet, 1993;Rosser, 85 Dasmahapatra, & Mallet, 2014). When members of different subspecies hybridise, their offspring can 86 display novel or heterozygous phenotypes (Arias et al., 2008;Mallet, 1989;Mallet, 1986). Predators 87 are less likely to learn to avoid rare phenotypes, causing frequency-dependent selection on colour 88 patterns (Langham, 2004;Mallet & Barton, 1989). This maintains stable hybrid zones (Mallet, 1986;89 Rosser et al., 2014). The diverse colouration seen in the Heliconius genus has been extensively 90 studied, the vast majority of which is determined by a genetic 'tool kit' of around five major-effect 91  (Mallet, 1986; Figure 1). The iridescence is produced by nano-structural ridges on the 98 surface of wing scales, which are layered to produce constructive interference of blue light (Parnell et 99 al., 2018). In a system so well-studied, little is known about how selection acts on structural colour 100 (Sweeney, Jiggins, & Johnsen, 2003) and divergence in this trait as not been previously studied. While 101 the yellow bar is controlled by a single major-effect gene (Mallet, 1986;Nadeau, 2016), iridescence 102 segregates as continuous variation, with conservative estimates suggesting it is controlled by around 103 five additive genetic loci (Brien et al., 2018). 104 Here, we use geographic cline analysis to test whether the colour pattern clines are maintained by 105 direct or indirect selection, and what we can learn about the selection regimes impacting variation and 106 convergence of iridescence. By obtaining genome-wide SNP data we can question the extent to which 107 genome-wide barriers to gene flow may maintain the iridescence clines via indirect selection, or 108 whether differentiation is maintained by direct selection in the face of gene flow, as is seen in other 109 instances of colour pattern divergence in these species (Nadeau et al., 2014). This is also an 110 opportunity to examine how the different genetic architectures of the traits influences the properties of 111 the clines. Individuals with a yellow bar on the underside of the hindwing ( Figure 1C) have genotype ywcyca or 166 ywcywc. As two of the four phenotypes can be produced by two different allele combinations we 167 inferred the allele frequencies at each locality for each species assuming Hardy-Weinberg equilibrium 168 for the three alleles. The frequency of Y could be directly observed from both its heterozygous and 169 homozygous phenotypes. The frequency of yca could be inferred from the frequency of its 170 homozygous phenotype, allowing us to infer the frequency of ywc. Distances between sampling sites were estimated using the great circle distance, which is the 182 shortest distance between two points on the surface of a sphere. This was calculated using the 183 Where p(z) is a gene frequency (or mean score of a trait) at position xi, c is the cline centre and w is the 190 cline width, defined as the ratio between the total change in the frequency of an allele (∆p) or value of 191 the trait (∆z) across the cline and gradient at the cline centre: 192 The other two more complex models are 'stepped' clines. They consist of a central sigmoid step 194 flanked by two exponential tails: 195 The tails describe the pattern of introgression from the centre into the foreign genepool; θ is the rate of solution to the model. To ensure that the likelihood surface was thoroughly explored, independent runs 204 were conducted using a range of initial parameter estimates. After obtaining maximum likelihood 205 (ML) solutions for the three cline models, the most likely model was identified using Likelihood Ratio 206 Tests. As the minimum and maximum mean allele frequency or trait values (p(z)min, p(z)max) were 207 allowed to vary in the tails of the cline, the sigmoid, Sstep and Astep models were described by 2 (c, 208 w), 4 (c, w, θ, B) and 6 parameters (c, w, θ0, θ1, B0, B1), respectively. 209 After model selection, support limits were estimated for each parameter in the ML model. 210 Starting with the optimum fit, and constraining the values of all other parameters, the likelihood 211 surface for individual parameters were explored by making 10,000 random changes of their value. The 212 range of estimates that was within 2 log-likelihood units of the maximum estimate was taken as the 213 support limit for that parameter, and is approximately equivalent to a 95% confidence interval. 214 km) with all other parameters allowed to vary, summing the profiles, and obtaining the ML estimate; 220 MLsum was obtained by summing the ML estimates from each profile. If clines are not coincident or 221 concordant, MLsum is significantly smaller than MLcomp, as determined by a chi-squared test (α = 0.05) 222 with n-1 degrees of freedom, where n is the number of traits. One complication with this method for 223 comparing cline parameters is that the profiles for each trait must be built using the same model. 224 Although the more complex Sstep and Astep models are a significantly better fit than the sigmoid 225 model, the parameters estimates for the cline centre and cline width were highly similar regardless of 226 the model fit (see results). Therefore, all likelihood profiling was conducted with the sigmoid model. 227 To estimate the strength of selection acting on ywc, the following equation was used from 228 Barton and Gale (1993): 229 Where s* is the difference in mean fitness between populations at the edge of the cline, and 231 populations at the centre. This demonstrates the mean strength of effective selection on loci underlying 232 a trait required to maintain a cline of width (w), given the dispersal distance per generation (σ). 233 Dispersal estimates were taken from Mallet et al. (1990) and Blum (2002). 234 235

Sequencing data 236
Restriction-associated DNA (RAD) sequence data were generated for 265 H. erato (SI Table S2), and 237 whole genome re-sequencing was carried out on 36 H. melpomene individuals (SI Table S3). Genomic 238 DNA was extracted from each individual using DNeasy Blood and Tissue Kits (Qiagen). Library 239 preparation and sequencing was carried out by Edinburgh Genomics (University of Edinburgh). 240 Single-digest RAD libraries were prepared using Pst1 restriction enzyme, with eight base-pair 241 barcodes and sequenced on the Illumina HiSeq 2500 platform (v4 chemistry), generating an average of 242 554,826 125 base paired-end reads per individual (see SI Table S2 for coverage and accession 243 information). We demultiplexed the pooled reads using the RADpools program in the RADtools  SI Table S3 for coverage and accession 248 information). 249 We checked the quality of all the raw sequencing reads using FastQC (v 0.11.5) and removed any 250 remaining adapters using Trim Galore (v 0.4.1). We aligned the sequence data of all individuals, both 251 SNP datasets were generated using samtools mpileup (v 1.5) to compute genotype likelihoods and 261 bcftools (v 1.5) for variant calling. For a site to be a variant, the probability that it was homozygous for 262 the reference allele across all samples was required to be less than 0.05. Multiallelic sites, insertions 263 and deletions were ignored. For H. melpomene we identified 30,027,707 single nucleotide 264 polymorphisms (SNPs), and for H. erato we identified 5,088,449 SNPs. SNPs were filtered out that 265 had a phred quality score lower than 30, that lacked sequence data in 50% or more of the individuals, 266 that had a minor allele frequency lower than 0.05 and that were private variants. We pruned based on 267 linkage disequilibrium, discarding SNPs within a 20kb window with r 2 > 0.8, using the bcftools plugin 268  We carried out a principal components analysis (PCA) using PCAngsd (Meisner & 278 Albrechtsen, 2018), which estimates a covariance matrix based on genotype likelihoods. We used 279 eigenvector decomposition to retrieve the principal components of genetic structure. 280 281

Population differentiation 282
To test the extent of genetic differentiation between the iridescent and non-iridescent subspecies, we 283 first measured FST between all the individuals from the iridescent populations south of the hybrid zone, 284 and all the non-iridescent individuals north of the hybrid zone, excluding the sampling site Jaqué, 285 which was in the centre of the hybrid zone in both species. SNP datasets were generated for each 286 species, using samtools mpileup and bcftools (v 1.5). In each species Hudson's FST estimator was 287 calculated among populations (Hudson, Slatkin, & Maddison, 1992): 288 Where Hw is the within-population heterozygosity, Hb is the between-population 290 heterozygosity, and p1 and p2 represent the allele frequencies in each population. This was calculated 291 in R for every SNP with a custom script. Average genome-wide FST was calculated as a ratio of 292 averages, by averaging the variance components, Hw and Hb, separately, as recommended by Bhatia 293 et al. (2013). We also estimated average genome-wide FST between all pairs of populations, including 294 those in the hybrid zone, for each species, and plotted pairwise FST against pairwise geographic 295 distance. 296 297

Comparison of phenotypic and genomic clines 298
If iridescence varies across the hybrid zone in a similar way to the background genetic structure, this 299 may suggest that variation is a result of neutral diffusion across a zone of secondary contact. If 300 iridescence is under divergent selection, we would expect genes which contribute to this trait to 301 introgress less between the iridescent and non-iridescent populations than neutral alleles, and the 302 iridescence cline to be narrower than a cline in average ancestry [e.g. (Scordato et al., 2017)]. We 303 compared the shape and position of clines in iridescence with clines in genetic structure across the 304 hybrid zones. 305 To fit a cline in genetic structure, we used ancestry proportions estimated by NGSadmix for 306 K=2, and calculated mean admixture proportions for each site across the hybrid zone in each species. 307 We then fit geographic cline models to variation in the admixture proportion across sampled 308 populations using ANALYSE, and tested whether iridescence and admixture clines had coincident 309 centres and concordant widths, using the likelihood profiling approach described above. was fixed in all Colombian sampling sites, apart from at some of the northernmost Colombian 317 sampling sites near Bahía Solano (Figure 2 A,B). In H. melpomene, the frequency of ywc gradually 318 decreased, and persisted at comparable frequencies to the North Colombian yellow bar allele (Y) for 319 ~200 km, before the Central American yellow bar allele (yca) became predominant ( Figure 2D). In 320 contrast, in H. erato Y became the predominant allele, with yca approaching fixation towards the end of 321 the transect ( Figure 2C). 322 In both species the blue score, used as a proxy measure for iridescence, decreased across the 323 transect (Figure 2 A,B). The colour measurements used to calculate the blue score were highly 324 repeatable (p<0.001 for both red and blue values in both wing patches measured, Table S5 (Table 1; SI Table S6). 335 Likelihood profiling revealed that both iridescence and the yellow bar clines were coincident 336 (i.e. the same cline centre) between species, however the iridescence cline was significantly wider in 337 H. melpomene (Table 2). The yellow bar widths were not significantly different between species 338 (Table 2), despite non-overlapping support limits (Table 1), possibly due to gaps in sampling. Within 339 each species, clines in both colour traits were coincident and concordant (Table 2). 340 We estimated effective selection (s*) on ywc across the hybrid zone in both species using the 341 ML estimates and two log-likelihood support limits of cline width (Table 1) Figure S3). 364 NGSadmix also supported K=2 for H. melpomene (although K=1 cannot be tested, SI Figure  365 S2D), but revealed a less straightforward population structure. While a "Colombia-like" genetic 366 background could be seen, Panamanian individuals showed mixed ancestry, with the exception of two 367 individuals from the site closest to the centre of the iridescence cline ( Figure 4D). This is supported by

Clines in genomic admixture proportion 387
To test whether variation in iridescence was independent from genetic structure across the 388 hybrid zone, we compared clines in iridescence and admixture proportion across the transect (Figure  389 3E, F). An asymmetrical stepped cline model best fits the variation in admixture proportions across the 390 transect for H. erato (Table 1, SI Table S6), with a steeper right tail, similar to the iridescence cline. 391 However, the admixture proportion cline is not coincident with the cline in iridescence in H. erato 392 (Table 2), with the centre of the iridescence cline estimated to be located significantly further North 393 (Table 2). The admixture proportion cline is also significantly wider than the iridescence cline in H. 394

erato. 395
Neither stepped cline model is a significantly better fit than a sigmoidal model for variation in 396 admixture proportions in H. melpomene (Table 1). The clines in iridescence and admixture proportion 397 are coincident, but not concordant in H. melpomene, with a wider iridescence cline (Table 2). 398 However, the likelihood profiles reveal a wide range of values for admixture cline centres and widths 399 with similar likelihoods (SI Figure S4), suggesting the cline model fits poorly to the data. This may be 400 due to gaps in sampling. indirect selection is dominant (Kruuk et al., 1999). In H. erato, the two colour pattern clines are 431 coincident, and the best fitting cline model for variation in iridescence is stepped, however the stepped 432 cline models are not a significantly better fit than a simple sigmoidal cline for the yellow bar, lending 433 less support to indirect selection. In addition, while the best fitting cline model for the admixture 434 proportions is stepped, its centre is significantly different from the colour pattern clines, which again 435 suggests against indirect selection being the dominant force. Finally, while we do see stepped clines in 436 H. erato, they are both asymmetrical clines, with a very shallow left tail, and a more prominent right 437 tail. It is therefore more likely that these tails reflect genuine asymmetry, due to hybrid zone 438 movement, which has been predicted and documented in these species (Blum, 2002;Mallet, 1986; 439 Thurman, Szejner-Sigal, & McMillan, 2019), or asymmetric selection, rather than being due to 440 indirect selection. 441 In H. melpomene the colour pattern clines are estimated to be more than four times wider than the 442 corresponding traits in H. erato, and are not characteristic of the steep, stepped clines which result 443 from strong LD and indirect selection (e.g. Szymura & Barton, 1991). Wider clines could be a result 444 of greater dispersal distances. While dispersal distances have not been directly measured in Heliconius 445 butterflies, estimates have been made (Blum, 2002;Mallet et al., 1990), and the only study which 446 compares the two species reports higher dispersal distances in H. melpomene (Mallet et al., 1990). 447 However our estimates of the selection coefficient s* (Barton & Gale, 1993) show that even with 448 higher dispersal, colour pattern traits in H. melpomene appear to be under much weaker selection. 449 Other studies on parallel hybrid zones between neighbouring races in this species pair show that H. admixture between the two genetic clusters. In addition, pairwise FST in between-subspecies 464 population pairs is consistently higher than within-subspecies population pairs, even when accounting 465 for geographic distance. This contrasts with Peruvian and Ecuadorian hybrid zones between different 466 H. erato races, which comprise of single genetic clusters, indicating hybrid zone formation via 467 primary intergradation, or ancient secondary contact (Nadeau et al., 2014). It is possible that the H. 468 erato hybrid zone in the present study also formed via primary intergradation, with genetic structure 469 forming as a result of reduced effective migration across the genome due to the combined effects of 470 divergent selection and the build-up of statistical associations between loci (Flaxman et al., 2014). 471 Alternatively, the two ancestral groups may have diverged during isolation, with gene flow resuming 472 upon secondary contact, causing an abrupt discontinuity between genetically differentiated 473 populations (Barton, 1983). When several traits under selection change simultaneously across a 474 contact zone, linkage disequilibrium (LD) can be maintained between them, and indirect selection can 475 therefore predominate (Szymura & Barton, 1986   It must be noted, however, that we had fewer H. melpomene samples, and fewer sampling locations 491 than for H. erato, which may have impacted some of these results. different centres. This is particularly striking given the similarities in the phenotypic clines within this 508 species, and suggests that coincidence of colour pattern clines is not simply due to a pattern of 509 genome-wide differentiation, but more likely due to common selective forces maintaining closely 510 correlated clines. Spatial variation in quantitative traits can also be non-adaptive, arising via purely 511 stochastic processes (Storz, 2002). However, the iridescence cline is significantly narrower than the 512 admixture proportion cline suggesting reduced introgression of genes controlling iridescence relative 513 to the genome-wide average, likely due to selection on colour. It is therefore unlikely that the 514 iridescence cline in H. erato is a result of neutral diffusion following secondary contact. 515 In H. melpomene, we find low overall genetic differentiation. A cline model of admixture 516 proportions based on two genetic clusters has a centre and width not significantly different to the 517 colour pattern clines. However, this is likely due to the poor fit of the cline models to variation in 518 admixture proportions (Figure 4; SI Figure S4), illustrated by the large confidence intervals (Table 1). 519 This is possibly due to gaps in sampling, but the poor fit can also be explained by the less defined 520 population structure in this species. Also, clear phenotypic intermediates in the hybrid zone are not 521 predicted to be of mixed ancestry (Figures 3, 4). This suggests that divergence in iridescence in this 522 species is independent of any extensive genomic differentiation and that the broad phenotypic clines 523 that we see are due to weak selection over large distances. 524

525
Conclusions 526 This unique system allowed us to compare genetic and phenotypic patterns of divergence in 527 convergent traits that show both discrete and quantitative variation. Despite different patterns of 528 population structure and genomic divergence, the co-mimics H. erato and H. melpomene have formed 529 parallel clines in iridescence, and the evidence appears to support direct selection plays a role in 530 maintaining these clines. This is likely due to iridescence playing a role in the mimetic warning signal, 531 despite the weaker selection which appears to be operating in H. melpomene. 532 While pigmentation has a long history in our understanding of the link between genotype and 533 phenotype, we have barely scratched the surface of our understanding of the genetic control of 534 structural colour. This natural system is a promising candidate for association or admixture mapping, 535 and could allow us to identify genomic regions controlling structural colour for the first time.  --------