The Amazon river is a suture zone for a polyphyletic group of co-mimetic heliconiine butterflies

rivers help generate and maintain colour pattern diversity, but to date there is no evidence that they lead to speciation in our study group.


Introduction
Tropical America is the most biologically diverse terrestrial region on the planet ( omas 1999, Jenkins et al. 2013), but the evolutionary and ecological reasons for this remain unclear (Smith et al. 2014, Antonelli et al. 2018. Allopatric speciation, whereby geographic barriers reduce gene flow allowing populations to diverge, is thought to have produced much of the world's biodiversity (Coyne and Orr 2004).
2 However, some of the most diverse neotropical communities are found in the Amazon basin; a region with few obvious geographic barriers that could promote allopatric speciation. Numerous hypotheses have been put forward to reconcile this discrepancy (Haffer 2008). For example, the Pleistocene refugia hypothesis (and variants of it) propose that climatic variations in the Earth's geological history fragmented the now continuous forest into geographically isolated islands of suitable habitat, promoting allopatric divergence (Haffer 1969, Naka et al. 2012, Weir et al. 2015, Naka and Brumfield 2018, Pulido-Santacruz et al. 2018.
Another long-standing hypothesis is that the large Amazonian rivers act as geographic barriers to dispersal that enable allopatric speciation (Wallace 1852, Hershkovitz 1968, Smith et al. 2014. While intuitively appealing, support for this hypothesis in modern analyses has been mixed (Gascon et al. 2000, Boubli et al. 2015, Nazareno et al. 2017, de Castro Godinho and da Silva 2018, Naka and Brumfield 2018, Santorelli et al. 2018. For example, the effectiveness of the rivers as barriers may depend on species-specific traits, as well as the size and type of river (Ayres and Clutton-Brock 1992, Burney and Brumfield 2009, Fouquet et al. 2015. Moreover, studies must discount the possibility that observed associations between biogeographic breaks and rivers are not merely sampling artefacts (Nelson et al. 1990, Oliveira et al. 2016.
Heliconiines comprise about 70 species and 300+ subspecies of brightly coloured neotropical butterflies known for their Müllerian mimicry (Jiggins 2017). Maps of subspecies richness made using museum specimens exhibit a striking pattern, with high richness along the course of the Amazon river and its major tributaries (Rosser et al. 2012). This pattern is apparent even when mapping the mean number of subspecies per species, thereby controlling for variation in species richness (Fig. 1A). In heliconiines, subspecies are defined as wing colour pattern variants that are fixed in at least part of their geographic range (Brown 1976a). (One species, H. numata, is polymorphic to varying degrees throughout its range, and its subspecies names correspond to 'biologically important mimetic morphs' [Brown 1976b]). For most species, areas with subspecies richness > 1 therefore typically indicate transitional, polymorphic regions (i.e. hybrid zones) between different geographic colour pattern subspecies. Regions where multiple species exhibit such hybrid zones, such as appears to be case for heliconiines along the Amazon river, are called 'suture zones' (Remington 1968, Dasmahapatra et al. 2010. A similar pattern of high diversity along the Amazon river is also apparent in birds at the species level (Hawkins et al. 2006), but is less clear or absent in mammals, amphibians and plants (De Oliveira and Daly 1999, Kreft and Jetz 2007, Bass et al. 2010. High diversity due to faunal mixing along the Amazon river is therefore consistent with the river barrier hypothesis. However, museum data are rarely collected systematically, and may be prone to spatial biases (the 'Wallacean shortfall' of distribution data [Lomolino 2004]). Some of the specimens on which the map in Fig. 1A was based are over 150 yr old, and date back to collectors such as Henry Walter Bates. In historical times (and even today), the primary mode of transport throughout Amazonia was by river, and museum data therefore suffer from highly unequal sampling (Brown 1979, Rosser et al. 2012, Oliveira et al. 2016, with most collections made along the course of large rivers (Fig. 1B). Furthermore, although Bates himself recognised the importance of recording accurate locality information (Bates 1863), many museum specimens bear generalised collecting localities and may come from either side of the river, or even reflect where a specimen was purchased (Wallace 1852, Emsley 1965, Brown 1979. Collectors may also actively seek out rare or hybrid forms. Therefore, despite the striking biogeographic pattern with a plausible biological explanation, sampling intensity must first be ruled out as an alternative explanation for the apparent suture zone in Fig. 1A. Recognising this, Rosser et al. (2012) modelled the proportion of museum specimens in grid cells that were intra-specific hybrids as a function of sampling effort, and mapped the residual variation. That analysis supported a high frequency of hybrids along the Amazon river, but nonetheless doubts remain over the reality of the pattern. For example, the paucity of sampling away from the river means it is unclear whether high polymorphism is restricted to the course of the river or is a widespread feature of lowland Amazonia. To distinguish these possibilities and test the hypothesis that the Amazon river represents a suture zone, we systematically collected heliconiine butterflies along a ~900 km transect across the Amazon River and north towards Brazil's borders with Guyana and Venezuela.

Methods
We collected 1470 heliconiines representing 25 species, at approximately 100 km intervals along a north-south transect running from 4.16°S to 3.20°N (Fig. 1C). Butterflies were collected from September to December in 2016 (1202 individuals), with some additional collections made in March and November of 2018 (15 and 253 individuals, respectively). The coordinates for each butterfly were recorded, and specimens were identified by NR and AVLF as subspecies or named colour pattern forms (the latter typically represent hybrids between subspecies). However, some valid taxonomic names may correspond to very minor colour pattern variations, which are unlikely to have much adaptive significance. Conversely, some major colour patterns variations may not have formal taxonomic names attached to them. To deal with this issue, we also classified all specimens to 'morpho-subspecies,' which we define as major colour pattern variants that can be either polymorphic or locally fixed. In the Supplementary material Appendix 1, we detail our decisions for each species and provide photos of all specimens used in the study. Specimens were deposited in the Zoological Collection of the Museum of Biodiversity (ZUEC-LEP) and AVLF's collection at the Univ. of Campinas, São Paulo, Brazil. The full list of taxa collected is given in the Supplementary material Appendix 1 Table A1, together with a map of all the sampling  Fig. A1). We divided our study area into sampling units, comprising equal area grid cells of 100 × 100 km or 25 × 25 km. For each species in a sampling unit, we then calculated a diversity statistic, D, which is the probability that two individuals drawn randomly have different colour patterns (Simpson 1949, Joron et al. 1999. Specifically, Dp s =-å 1 2 , where p s is the proportion of individuals exhibiting the sth colour pattern. As such, D can take values from zero (indicating all individuals in the sample are phenotypically identical) to one (indicating that all the individuals in the sample are phenotypically distinct).
To test for a statistical association between diversity and distance from the Amazon river, we used ordinary least squares (OLS) regression to model the mean value of D among species in each grid cell as a function of distance from the river. Because this response variable is bounded between zero and one, it was logit transformed (Warton and Hui 2011). Distance for each sampling unit was measured by calculating the mean latitude of all the specimens within a grid cell, and then taking the absolute difference from the latitude of the confluence of the Negro and Solimões rivers (−3.14°). The model can be summarised as: D i is the mean value of D among species in grid cell i, and distance i is the corresponding distance from the Amazon river. The residual errors are captured by ε i and are assumed to be normally distributed with equal variance. To test explicitly whether museum data predicted the actual patterns of polymorphism, we replaced distance with the mean number of subspecies/species estimated for each grid cell using museum data. High values of D may arise in communities dominated by a single variable species, rather than multiple variable species (i.e. a suture zone). In light of this, we used linear mixed effects models to test whether D is associated with proximity to the major rivers, using the R package nlme (Pinheiro et al. 2016). Building on Model 1, we modelled logit(D) as a function of distance from the rivers, while allowing each butterfly species to have a random slope and intercept. In addition, we included an ecologically relevant trait (butterflies' mimetic phenotype) as a fixed effect that interacts with distance. To do this, we partitioned species into four categorical mimicry rings (Fig. 1D, Supplementary material Appendix 1 Table  A2). This allowed us to test whether species with similar phenotypes (i.e. co-mimics) exhibited similar spatial diversity patterns. More specifically, we included/excluded mimicry from the model to test whether different mimicry rings differ in mean polymorphism, and whether these differences are associated with distance from the river. This model can be summarised as follows: Model 2 D ij is the value of D for species j in grid cell i and mimicry is a fixed categorical variable that interacts with distance. When the observed value of D ij was zero, we assigned it the minimum observed non-zero value, as recommended in Warton and Hui (2011). spp 0j gives the differences between species in mean logit(D) across all sample sites and spp ij × distance ij gives the variation among species in their response to distance, with the differences among the species assumed to be drawn independently from a normal distribution with mean 0 and variances s a 2 and s b 2 , respectively. The assumption of independent errors might be violated by spatial autocorrelation, leading to unreliable parameter estimates. We therefore included a spatial correlation structure in the model (Zuur et al. 2011 . Non-independent errors may also be produced by shared evolutionary history. For example, if co-mimetic species are phylogenetically closely related, an association between mimicry and distance might be generated by unmeasured traits shared through common ancestry. We therefore built a phylogenetic generalized linear mixed model (PGLMM; Ives and Helmus 2011) using the package phyr (Li et al. 2020), and tested the effect of mimicry when phylogenetic error structures were incorporated into the model.
In this model, phy 0j gives the difference between species in mean logit(D) across all sample sites that has a phylogenetic component, with the matrix Σ spp derived from phylogeny assuming Brownian motion evolution. phy 1j represents the variation among species in their response to distance that has a phylogenetic component. Phylogenetic branch lengths were obtained from an ultrametric Bayesian phylogenetic tree for heliconiines based on 20 nuclear and 2 mitochondrial genes (Kozak et al. 2015), which was pruned to the species in the present study.

Results
When analysing 100 × 100 km grid cells using morpho-subspecies, we found by far the highest values of D in cells overlapping the Amazon river (Fig. 1C). D declined significantly with absolute latitudinal distance from the confluence of the Negro and Solimões rivers (Fig. 1C inset), with the model explaining 61% of the variance in D (Model 1, α = −1.42, SE = 0.26, t = −5.38, p < 0.001; distance = −0.33, SE = 0.08, t = −4.12, p < 0.01, coefficients here and below on a logit scale). D was also significantly predicted by the museum data (α = −6.72, SE = 1.15, t = −5.86, p < 0.001; mean_ subspecies = 2.74, SE = 0.69, t = 3.96, p < 0.01, R 2 = 0.59). When modelling D with a random intercept and slope to account for variation in species responses to distance, we also estimated a significant negative slope (Model 2 excluding mimicry, α = −1.32, SE = 0.27, t = −4.96, p < 0.001; distance = −0.2, SE = 0.07, t = −3.02, p < 0.01). We detected a significant effect of mimicry when it was included (Model 2), showing that phenotypically similar species exhibit similar spatial patterns in polymorphism (likelihood ratio test: χ 2 =17.89, df = 6, p < 0.01). However, only butterflies in the red mimicry ring showed an association between polymorphism and distance from the river (Fig. 1E, slope for red mimics = −0.39, SE = 0.07, t = −5.65, p < 0.001). Finally, we tested whether our results for mimicry held while accounting for shared evolutionary history (Model 3). Mimicry remained significant (likelihood ratio test: χ 2 = 39.6, df = 6, p < 0.001), but again only the red mimics showed an association with distance from the river (slope = −0.39, SE = 0.08, t = −5.02, p < 0.001). The 100 × 100 km grid we used for these analyses was relatively coarse, but a finer scale grid of 25 × 25 km yielded similar results, in spite of the greater discontinuity in the transect and reduced sample size in each grid square (Supplementary material Appendix 1).

Discussion
By systematically sampling along a north-south transect though the central Amazon, we confirmed that the Amazon river is a suture zone for heliconiine butterflies, as predicted by museum data. This result has important implications for interpreting and understanding spatial patterns of colour pattern diversity across the entire Amazon basin in these species. Mean diversity of major colour pattern variants ( D ) of heliconiines was highest in the grid cells near Manaus that overlapped the Solimões, Negro and Amazon rivers (Fig. 1C), and D declined with increasing distance from the river (Fig. 1C inset). However, the influence of the river on spatial diversity patterns depends on the ecology of each species, in line with the conclusions of other recent studies on birds and frogs (Burney andBrumfield 2009, Fouquet et al. 2015).
In particular, river-based geographic structure is confined to heliconiines in the red mimicry ring. These species comprise a polyphyletic group of mutualistically-interacting co-mimics ( Fig. 1D-E) that switch phenotype across the river (Fig. 2). In light of this finding, we returned to the museum data that were the motivation for the present study, and mapped polymorphism of the species in the four mimicry groups separately (Fig. 3). These maps show striking differences in the spatial patterns of polymorphism. Possible reasons for these differences are discussed below. The bright colours of heliconiine butterflies are aposematic, warning predators that they are unpalatable. Müllerian mimicry in each mimicry ring acts as a mutualism, in which each species shares the costs of educating predators with comimics. In practice, however, the more abundant and more unpalatable species tend to drive the mimicry ring under Müller's theory, with the benefit of mimicry in each species varying in proportion to the inverse square of relative abundance or relative unpalatability (Müller 1879, Mallet 2001. Hybrid zones between mimetic subspecies are maintained via frequency-dependent selection by predators, which select against rare forms they do not recognise Barton 1989a, b, Langham 2004). Because these hybrid zones are not related to fixed features in the environment, they will tend to move due to asymmetries in selection, dispersal or density between the subspecies, or even if the mimicry alleles for one subspecies are genetically dominant (Barton 1979, Mallet 1986, 1993. For example, a colour pattern cline between two subspecies of H. erato has previously been shown to move 65 km in 30 yr, likely due to genetic dominance (Blum 2002, Thurman et al. 2019. However, moving clines can easily become trapped at geographic barriers, which can prevent cline movement even when the advancing subspecies has a fitness advantage (Barton 1979, Sherratt 2006, Barton and Turelli 2011. This is because geographic barriers reduce migration by preventing foreign subspecies from reaching the critical intermediate frequency at which frequency-dependent selection would drive a local population to fixation.
In the central Amazon, the huge rivers represent broad bands of entirely unsuitable habitat for heliconiines; at its widest the Rio Negro can be as much as 20 km across, far greater than dispersal estimates for Heliconius erato (2.6 km per generation) and H. melpomene (5.8 km) (Rosser et al. 2014). However, the rivers do not appear to block gene flow completely, as we found that populations close to the river were often polymorphic (see also Brown and Benson [1975]). Frequency-dependent selection is expected to purge rare colour pattern alleles from such populations, so the polymorphisms suggest that butterflies continually disperse across the river and introduce allelic variation. We therefore suggest that the rivers act as semi-permeable barriers that trap moving colour pattern clines. These clines will accumulate at Red sections correspond to phenotypes with a forewing yellow band, namely H. erato lativitta and H. melpomene malleti (although this character is variable and some specimens are likely recombinants between subspecies, especially between the Negro and Solimões). Blue corresponds to H. erato amalfreda (its co-mimic, H. melpomene meriana was never collected, although it is presumed to be present given the presence of H. melpomene pyrforus and H. melpomene malleti hybrids). Pie charts are scaled by log abundance. The inset plot shows concordant clines for presence/absence of hind-wing rays for the eight red-patterned species collected. The blue vertical line indicates the latitudinal value of the Negro and Solimões confluence, and the wings overlaid on the plot show the change in the H. burneyi phenotype from south of the confluence (left) to north of the confluence (right). the river in multiple taxa, in part due to mimicry, leading to the production of a suture zone. This can be contrasted with the classical explanation for suture zones; that they represent secondary contact between populations that differentiated in allopatric Pleistocene refugia (Remington 1968, Hewitt 1988. The hypothesis that major rivers trap moving hybrid zones might seem somewhat idiosyncratic, applying only to butterfly colour pattern clines, and with limited applicability to other organisms. However, any cline maintained by a balance between dispersal and endogenous selection against hybrids (a 'tension zone') is similarly positionally unstable (Key 1968, Barton and Hewitt 1985, Dasmahapatra et al. 2002. A recent genomic study of contact zones between birds in the headwaters of the Teles Pires river concluded that differences between divergent taxa were likely maintained by selection against hybrids due to postzygotic intrinsic incompatibilities (Pulido-Santacruz et al. 2018). Moreover, it has been suggested that the habitat in the contact zone has low suitability for the bird species involved, and may form a population sink (Weir et al. 2015). Thus, the idea that tension zones become clustered at geographic barriers or in regions of low population density may explain suture zones in a range of plants and animals. This explanation is not mutually exclusive to the possibility that populations differentiated in allopatry (e.g. in refugia), but simply highlights that the present positions of hybrid zones/suture zones may not bear any relation to the regions where populations evolved.
An obvious question is why the rivers should be so clearly associated with polymorphism in the red patterned species, but not in the other mimicry rings. Genetic control of colour pattern elements in Heliconius is thought to be broadly conserved across most species in the genus (Huber et al. 2015, Jiggins 2017. While there will be differences in the numbers of loci and dominance relationships between species, mimicry rings comprise polyphyletic assemblages of species, and so there is no phylogenetic expectation that species within a mimicry ring should show more range similarity to one another than to those in different mimicry rings. It therefore seems likely that the differences between mimicry rings are at least in part the product of ecological or behavioural differences. Species in the blue and orange mimicry rings are strong dispersers (Brown 1981), and the river may form less of a barrier for them. For example, while H. wallacei does show phenotypic variability along the transect, its colour pattern clines are very broad and stretch for hundreds of kilometres, suggesting  Table A3. 8 either high dispersal and/or relatively weak selection (Fig. 4). In contrast, the tiger-patterned heliconiines are philopatric, and mimic the phylogenetically unrelated and highly diverse ithomiine butterflies (Nymphalidae: Danainae). Ithomiine species typically exhibit multiple mimetic variants within a single locality, and these are almost certainly the primary drivers of the spatial patterns observed in the tiger-patterned heliconiines Benson 1974, Joron et al. 1999). Polymorphism of tiger-patterned heliconiines indeed peaks in western Amazonia (Fig. 3, Supplementary material Appendix 1 Fig. A3), where tiger-patterned ithomiines are most diverse (Brown 2005). Most heliconiines in the red mimicry ring are also philopatric, but they mimic other heliconiines, all of which are comparatively similar ecologically and genetically. Change in one species may therefore have more potential to drive correlated changes in co-mimics.
A further transitional region occurs in the north of our study area near the town of Rorainópolis, as the transect approaches the savannah of Roraima/Guyana. Here, two of the red species switch to a 'postman' phenotype with a large red patch on the forewing (H. erato amalfreda switches to H. erato magnifica, and H. melpomene meriana switches to H. melpomene pyrforus; Fig. 2). (We note that this reduces the strength of the statistical association between colour pattern diversity and the Amazon river, as polymorphism in these two species is thus approximately bimodal along the transect, rather than linear.) This second transition is concordant with the appearance of H. ricini, and a number of other species either appear, drop out or change colour in the general vicinity of Rorainópolis (Supplementary material Appendix 1  Table A1). Heliconius melpomene appears to be extremely rare north of Manaus until the appearance of H. melpomene pyrforus, whereas H. erato is reasonably common throughout the transect. The hybrid zones are close to where patches of savannah begins to appear, and the more open forests may favour H. melpomene, leading to very high abundance. This increasing density gradient will act in the same way as the rivers act in the south of the transect; i.e. it will produce asymmetrical migration that favours H. melpomene pyrforus and prevents the spread of the rarer, H. melpomene meriana phenotype. A very similar pattern is seen in H. elevatus and H. luciana (two very closely related species that may in fact be conspecific). As with H. melpomene, we collected H. elevatus around the rivers in the south of the transect, but it then becomes extremely rare to the north until replaced by H. luciana towards the savannah. In a previous paper, we hypothesised that H. erato and H. melpomene hybrid zones at the edge of the Andes were trapped in a region of low density produced by heavy orographic rainfall (Rosser et al. 2014). However, the abundance of H. melpomene along that transect follows a similar pattern to that observed here, with the rayed form of H. melpomene rare, while the Andean form that replaces it is common. Therefore, it may be that the more open montane forests favour H. melpomene, leading to a marked increase in abundance which prevents invasion of the lowland forms. This leads to a prediction that when stationary Heliconius hybrid zones (or any tension zones) are not associated with a clear geographic barrier, one form should be markedly more common than the other.
It is interesting to note that the Amazon river appears to be a suture zone for some species of birds (Hawkins et al. 2006), but for subspecies of heliconiine butterflies. This may be due to real biological differences. Birds, especially understory forest species, often shy away from large, open areas, such as a major rivers and even roads (Hayes andSewlal 2004, Laurance et al. 2004), whereas butterflies may be more likely to disperse across them. Thus, the barrier effect of rivers may be sufficient to drive speciation in birds, while greater levels of gene flow prevent speciation in butterflies, and merely trap subspecific hybrid zones and clines. This may explain why the Rio Branco, which is a mere 500 m wide where our transect crossed it, is thought to be a barrier for certain bird species (Naka et al. 2012), but we observed no phenotypic changes in heliconiine subspecies. Similarly, while multiple heliconiines show subspecies differentiation across the major Amazonian rivers, there do not appear to be any species with a distribution limited by the Amazon or Rio Negro, or cases of sister species on either side (Rosser et al. 2012(Rosser et al. , 2015. In contrast, riodinid butterflies (small, understory butterflies with presumably lower dispersal) do show species differentiation across the river (Hall and Harvey 2002). However, the differences between birds and butterflies may also reflect taxonomic culture. Heliconius populations are generally recognised as separate species only when there is strong evidence for sympatric coexistence of divergent taxa differing in multiple genetic traits (Rosser et al. 2019) whereas bird taxonomists may be more likely to recognise allopatric populations as species (Cracraft 1983, Gill 2014. Biogeography is becoming increasingly quantitative, and advances of this nature are allowing increasingly powerful tests of classic hypotheses (Naka et al. 2012, Oliveira et al. 2017, Santorelli et al. 2018. Nonetheless, broadscale analyses suffer from a lack of adequate data, particularly in poorly sampled areas such as the Amazon. Here, we used field collections to validate a striking biogeographic pattern inferred from museum data, highlighting the importance of museum collections in directing research on the evolution of biodiversity (Pyke and Ehrlich 2010). Specifically, we show that the Amazon river forms a suture zone for heliconiine butterflies in the red mimicry ring. Because species with other colour patterns are not similarly affected, we suggest that this may be because moving hybrid zones become stuck there. The position of the suture zone is thus readily explained by a parapatric model that depends on current physical features in the landscape, intrinsic selection against hybrids and mutualistic coevolution.

Data availability statement
All data is included in the manuscript and Supplementary material.