Evidence for frequency‐dependent selection maintaining polymorphism in the Batesian mimic Papilio polytes in multiple islands in the Ryukyus, Japan

Abstract Batesian mimicry is a well‐studied adaptation for predation avoidance, in which a mimetic species resembles an unpalatable model species. Batesian mimicry can be under positive selection because of the protection gained against predators, due to resemblance to unpalatable model species. However, in some mimetic species, nonmimetic individuals are present in populations, despite the benefits of mimicry. The mechanism for evolution of such mimetic polymorphism remains an open question. Here, we address the hypothesis that the abundance of mimics is limited by that of the models, leading to mimetic polymorphism. In addition, other forces such as the effects of common ancestry and/or isolation by distance may explain this phenomenon. To investigate this question, we focused on the butterfly, Papilio polytes, that exhibits mimetic polymorphism on multiple islands of the Ryukyus, Japan, and performed field surveys and genetic analysis. We found that the mimic ratio of P. polytes was strongly correlated with the model abundance observed on each of the five islands, suggesting negative frequency‐dependent selection is driving the evolution of polymorphism in P. polytes populations. Molecular phylogenetic analysis indicated that the southern island populations are the major source of genetic diversity, and the middle and northern island populations arose by relatively recent migration. This view was also supported by mismatch distribution and Tajima's D analyses, suggesting a recent population expansion on the middle and northern islands, and stable population persistence on the southern islands. The frequency of the mimetic forms within P. polytes populations is thus explained by variations in the model abundance rather than by population structure. Thus, we propose that predation pressure, rather than neutral forces, have shaped the Batesian mimicry polymorphism in P. polytes observed in the Ryukyus.

Negative frequency-dependent selection (NFDS) has been offered as a possible mechanism underlying female-limited polymorphic mimicry in butterflies (Barrett, 1976;Kunte, 2009;Turner, 1978), as have other hypotheses such as sexual selection and ecological-physiological trade-offs (Burns, 1966;Cook, Vernon, Bateson, & Guilford, 1994;Ohsaki, 2005;Vane-Wright, 1984). According to the NFDS hypothesis, the abundance of mimetic individuals is limited by the abundance of the model, because the defensive benefit of mimicry increases when the model is relatively more abundant. In other words, the advantage of mimicry decreases as the abundance of the mimics increases relative to that of the models (Barrett, 1976).
Eventually, the mimic ratio (MR; the proportion of the mimic phenotype in the population) approaches an equilibrium state (Kunte, 2009), at which the fitness of mimic and nonmimic types are equal.

K E Y W O R D S
female-limited mimetic polymorphism, mimic ratio, model abundance, phylogenetic analyses, population genetics F I G U R E 1 Mimicry system and sample sites in the Ryukyu Islands. (a) Mimetic females of Papilio polytes resemble the model species Pachliopta aristolochiae, whereas nonmimetic females are similar to males of P. polytes. (b) Map of each island, showing different characteristics in mimicry parameters AI (advantage index of Batesian mimicry) and MR (mimic ratio). Names of islands are abbreviated as follows: AGN, Aguni; AMA, Amamiooshima; HTR, Hateruma; IRO, Iriomote; ISG, Ishigaki; KHM, Kohama; KIK, Kikai; MYK, Miyako; OKI, Okinawa; TKT, Taketomi; TRM, Tarama. Island age, according to Osozawa et al. (2012), is shown in brackets The NFDS hypothesis predicts that this MR equilibrium corresponds to the local abundance of the model species determined by ecological factors.
On the other hand, the polymorphism can be influenced by selectively neutral processes. In other words, the mimic and nonmimic types have equal fitness independent of their frequency, and therefore, their gene frequencies in local populations change through genetic drift and migration (Ackermann & Cheverud, 2004;Wright, 1943). According to this scenario, it is expected that local populations (i.e., island populations) that are genetically closely related should show similar MRs independent of local model abundance.
Therefore, it is essential to quantify the relative effects of neutral and selective processes on the persistence of polymorphism.
Papilio polytes L., a swallowtail butterfly that is common throughout Southeast Asia, exhibits female-limited Batesian mimicry (Clarke & Sheppard, 1972;Katoh, Tatsuta, & Tsuji, 2017;Kunte et al., 2014;Figure 1a). Some of the females mimic other unpalatable butterflies in the family Papilionidae, as a defense against avian predators (Ford, 1975). The nonmimetic females resemble the males (Figure 1a). On several of the Ryukyu Islands in Japan, female P. polytes appear to mimic Pachliopta aristolochiae (Katoh et al., 2017;Uesugi, 2000). Interestingly, the MR of these P. polytes populations is higher on islands with greater abundance of the model species (Uesugi, 2000), implying that NFDS underlies the mimicry patterns. In addition, after the model P. aristolochiae became established on Miyako Island in the Ryukyus, the MR on the island increased rapidly from 1975 to 1989 (Uesugi, 2000), implying rapid local adaptation. The above observations seemed to indicate that natural selection was driving the process; however, to date no population genetic survey has been performed in P. polytes in the Ryukyus to test this view.
In this study, we assess the possible association between the MR of P. polytes and the abundance of the model species in the Ryukyu Islands. Subsequent to a previous study of these two factors performed in 1982 (Uesugi, 2000), the distribution of the model species P. aristolochiae has expanded (Katoh et al., 2017). Therefore, the present investigation of P. polytes and P. aristolochiae in the Ryukyu Islands provides a powerful opportunity to address the microevolution of polymorphic Batesian mimicry in P. polytes populations. In addition, we perform phylogenetic and population genetic analyses with mitochondrial DNA markers to examine whether neutral, nonadaptive forces such as phylogenetic constraint, isolation by distance, or demographic history can explain the MR distribution of P. polytes across the Ryukyu Islands.
Analysis of mitochondrial markers only provides us limited genetic information of a maternally inherited locus. To improve reliability of genetic inference, analyses using multiple loci including nuclear markers are ideal. On the other hand, mitochondrial DNA analysis is reasonable for the first attempt of population genetic analyses of nonmodel organisms, because it is practically convenient. Therefore, we analyzed mitochondrial markers to obtain a first insight into the genetic structure of P. polytes in Ryukyu Islands in the present study.
The butterflies were caught by bag net between 2014 and 2017.
After the species (P. polytes or P. aristolochiae), sex, and mimic type were recorded, live all specimens were stored at −30°C in a freezer until DNA extraction (Table S1), enabling us to be certain that they were neither collected nor recorded twice.  (Table S1).

| DNA extraction
Whole middle and hind legs of the butterflies were used for DNA extraction with a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany). Samples were gently cut with hand shears and manually homogenized using disposable pestles in 180 µl of buffer ATL. The tissue homogenates were digested overnight at 55°C with 20 µl of Qiagen Proteinase K solution (600 mAU/ml) and subjected to DNA extraction. Total DNA was eluted from the spin column in 200 µl of buffer AE and stored at −30°C after confirmation of the concentration and quality using a Nanodrop 2000C (Thermo Scientific).

| PCR amplification and parallel sequencing of mtDNA genes
To determine the mtDNA sequences of the subjects from the Ryukyu Islands, we applied long-range PCR and shotgun sequencing using the MiSeq platform (Illumina). This approach prevented amplification of nonpreferred sequences and contamination by short nuclear pseudogenes of mitochondrial origin, which often produce misleading results in evolutionary analyses (Song, Buhay, Whiting, & Crandall, 2008). Universal primers for insect mtDNA (Simon et al., 1994) with slight modifications based on sequence alignment with closely related species, P. machaon (GenBank accession number: HM243594), P. helenus (KM244656), P. maackii (KC433408), P. syfanius (KJ396621), and Troides aeacus (EU625344) were used to amplify P. polytes mtDNA (GenBank accession number: NC_024742) (Table S3). PrimeSTAR GXL DNA Polymerase (Takara) was used, with the following PCR parameters: 30 cycles at 98°C for 10 s and 64°C for 10 min to amplify targets longer than 3 kbp. To amplify targets shorter than 3 kbp, different PCR parameters were used: 30 cycles at 98°C for 10 s, 60°C for 15 s, and 68°C for 2 min.
The PCR products were used to prepare sequencing libraries with a Nextera XT DNA Library Prep kit, which uses a transposomebased approach, and an Index kit v2 (Illumina). To improve the fidelity of library amplification, the Nextera PCR master mix (NPM) was replaced with the same volume of KAPA HiFi HotStart ReadyMix (Kapa Biosystems). The amplified libraries, containing sequences ranging in length from 300 to 800 bp, were dissected from 1.0% L03 agarose gels (Takara) using a MinElute gel extraction kit (Qiagen), quantified by a Qubit fluorometer with a dsDNA high-sensitivity assay kit (Thermo Scientific), and sequenced using a MiSeq Reagent Kit v2 (Illumina) to generate 2 × 250-bp paired-end reads (run numbers 1 to 3) and 2 × 150-bp paired-end reads (run numbers 4 to 8) (see Table S4).

| Primary data processing, short read mapping, and single nucleotide polymorphism calling
Raw FASTQ sequences of P. polytes mtDNA amplicons (DRA accession numbers: DRA006999 and DRA008115) were subjected to primary data processing. The overall data quality was evaluated by FastQC (Andrews, 2010), and low-quality 3′-tails (Phred quality score < 10) were removed automatically using the Perl script DynamicTrim.pl (Cox, Peterson, & Biggs, 2010). The tail-trimmed sequences were filtered by a custom Perl script to remove reads containing base call failures (N-bases).
Quality-filtered sequences were analyzed to estimate the haplotype sequence of the subject mtDNAs using the software described below for short read mapping and variant calling.
Sequence reads were aligned to the reference mtDNA sequence Command line options were invoked to skip indel calling and define the sample ploidy as haploid. The mt DNA regions where a SNP quality value ≥30 (error rate < 0.001) was achieved with a sequencing depth of ≥10 of among all samples were used for downstream analyses. The obtained SNP information of mtDNA regions described above was used to replace the corresponding nucleotides of reference mtDNA sequence into those of respective samples to generate a FASTA-formatted sequence data (DDBJ/EMBL/ GenBank accession numbers: LC466203-LC466479) using a custom Perl script.

| Molecular phylogenetic and population genetic analyses
To address the demographic history and evolution of Batesian mimicry of P. polytes in the Ryukyu Islands, we performed molecular phylogenetic and population genetic analyses. The mtDNA sequences described above were aligned with MAFFT (Katoh & Standley, 2013), and maximum likelihood (ML) and Bayesian phylogenetic analyses were conducted using Treefinder (Jobb, Haeseler, & Strimmer, 2004) and MrBayes 3.2.2. (Ronquist et al., 2012). Appropriate nucleotide substitution models and parameters were estimated by MrModeltest 2.3 (Nylander, 2004) by gene. In the ML analysis, search depth and number of bootstrap replications were set to 2 and 1,000, respectively. In the Bayesian analysis, the Markov chain Monte Carlo process was performed for 1,000,000 generations, and a total of 10,001 trees were sampled. After manual evaluation of the variation and saturation of the likelihood scores, the first 200 trees were discarded, and the final tree and the posterior probabilities of the branches were determined from 9,801 trees. The haplotype network was estimated using the TCS Network method (Templeton, Crandall, & Sing, 1992) by PopART v1.7 (Leigh & Bryant, 2015).
Pairwise p-distances across individuals from multiple islands were calculated using MEGA 7 (Kumar, Stecher, & Tamura, 2016) and visualized using heat maps. A population-level tree was inferred using the unweighted pair group method with arithmetic mean (UPGMA) method (Sneath & Sokal, 1973) based on genetic distance (Nei's net number of nucleotide differences, D A [Nei & Li, 1979]). Genetic differentiation (Φ ST ) among islands was calculated based on D A divided by the average number of pairwise differences among islands.
Population genetic analyses, including Tajima's D-tests (Tajima, 1989a(Tajima, , 1989b and mismatch distributions (Rogers & Harpending, 1992), were performed with Arlequin v3.5.2.2 (Excoffier & Lischer, 2010) based on pairwise nucleotide differences as genetic distances with no gamma correction. Tajima's D values were assessed by 2,000 permutation tests, and a significant negative value was considered to suggest population expansion in the absence of natural selection (Tajima, 1989a(Tajima, , 1989b. Based on the observed mismatch distributions, the possibility of an instantaneous population expansion was assessed by a generalized nonlinear least square approach, as described in Schneider and Excoffier (1999), and a strong unimodal distribution was considered to suggest a recent expansion (Rogers & Harpending, 1992;Slatkin & Hudson, 1991). The validity of the expansion model was evaluated by the sum of squared deviations between the observed and expected mismatch distributions generated by 5,000 bootstrap resamplings. An expansion time parameter, t, was estimated by the equation t = τ/2μm, where τ is the mutational timescale estimated from the mismatch distribution, m is the DNA sequence length used, and μ is the mutation rate. Three known nucleotide substitution rates for the COI gene, based on insect mtDNA studies (Vila et al., 2011), were adopted as μ values: 6.5 × 10 −9 (slow), 7.5 × 10 −9 (intermediate), and 9.5 × 10 −9 (fast) substitutions per site and year.

| Mimic ratio and advantage index of Batesian mimicry
The MR and the advantage index (AI) of Batesian mimicry of P. polytes were defined as follows: The definition of MR above is identical to relative abundance (RA) as defined by Sekimura, Fujihashi, and Takechi (2014). The definition of AI was modified from those of Uesugi (2000) and Sekimura et al. (2014) by removing "# nonmimic P. polytes females" from the denominator because the nonmimic females (and males) of P. polytes do not possess warning signals for avian predators and thus do not affect the mimicry advantage.

| Mantel and partial Mantel tests
Possible associations among genetic distance D A , geographic distance, and difference in MR among islands were assessed by Mantel test (Mantel, 1967) for the five islands with sufficient sample sizes (KIK, OKI, MYK, ISG, and TKT; Figure 1b and Table S1). Associations between differences in MR and differences in environmental factors (described below) among islands were also assessed by Mantel test.
For calculation of differences in MR and AI among islands, the numbers of the model species, mimic P. polytes females, and nonmimic P. polytes females observed in several surveys (Table S2) were totaled for each of the five islands. D A among islands was calculated by Arlequin v3.5.2.2 (Excoffier & Lischer, 2010) based on the absolute number of substitutions between sequences. Pairwise geographical distances among islands were measured as kilometers between centers of sampling area (i.e., city hall or downtown area) of each island except for OKI where the location of Nakijin village office was used. Climate variables (average temperature, average wind speed, and average rainfall) were transformed into a small number of principal components using R (ver. 3.4.2; the "stats" package), and the absolute values for differences in principal component scores between islands were used as environmental differences (for a summary of the principal components see Table S6). Mantel tests were conducted using Arlequin v3.5.2.2 (Excoffier & Lischer, 2010).
In addition, partial Mantel tests (Smouse, Long, & Sokal, 1986) were conducted to confirm whether AI, rather than other influences, such as historical processes or environmental factors, had shaped the spatial pattern of the MR across the five islands. We estimated the correlation between distance matrices of differences in MR and AI among islands while controlling for the effect of geographic distance, genetic distance D A , or environmental distances. The partial Mantel test statistic was assessed based on 10,000 permutations of the raw data according to Method 1 in Legendre (2000). These calculations were conducted using a script developed on the R platform (ver. 3.4.2; R Core Team, 2017), which is available from one of the authors (HT).

| General results
We obtained a total of 2,823,806 reads (1,

| Phylogeographic analysis of P. polytes in the Ryukyu Islands
Molecular phylogenetic and haplotype network analysis of P. polytes consistently indicated two major clades strongly associated with the geographic distribution (Figure 2, Figure S1). One of the major clades contained only individuals collected from southern islands (ISG, TKT, and TRM; the "Southern-specific haplogroup") with some deep branching, indicating genetically distant haplotypes. Another clade contained mainly northern individuals (OKI and KIK), along with some southern individuals (the "Mixed haplogroup"), with relatively short branches, indicating closer relationships between the haplotypes within the clade. The mimic-morph individuals (shown as black squares in Figure 2 and Figure S1a, and with asterisks in Figure   S1b) did not form a monophyletic group, but instead were present in both major clades.
Heatmap visualization of pairwise genetic distances calculated from 1,273 bp of the COI gene and the neighboring region sorted by geographic location (Figure S2a contained several haplotypes that had diverged from each other to some extent. Some of these were more closely related to those from the northern (KIK and OKI) and middle (MYK) islands (light red shading in Figure S2a,b), whereas the remaining haplotypes were genetically distinct, unique haplotypes of the southern islands (ISG and TKT; dark red shading in Figure S2a (Table 1). In addition, haplotype frequency-based genetic differentiation (F ST ) based on 1,273 bp of the COI gene and the neighboring region was not detected among these islands (F ST ≈ 0.000 in all pairs; data not shown).

| Mantel and partial Mantel tests
Regarding genetic distance (D A ), geographic distance, and MR differences among P. polytes populations, no significant association was detected among the five islands of the Ryukyu Islands (KIK, OKI, MYK, TKT, and ISG; Figure 1b), based on the Mantel test (Figure 4ac). Furthermore, the Mantel test confirmed that neither of the environmental distances was associated with MR differences among the five islands (Figure 4d,e). On the other hand, the partial Mantel tests confirmed a significant association between MR differences and AI differences among the five islands while controlling for the effect of either geographic distance (GD), genetic distance (AGD), or environmental distances (p < 0.01 in all cases; for a summary of the results, see Table 4; Figure 5, white dots). The results were substantially the same when a previous method for calculation of AI (Sekimura et al., 2014;Uesugi, 2000) was applied (p < 0.05 in all cases, Table 4; Figure 5, black dots). MR and AI were also found to be correlated in a previous survey of seven islands conducted in 1982  TA B L E 2 Neutrality test (Tajima's D) (Uesugi, 2000; Figure 5, gray dots), indicating a striking correspondence between P. polytes mimicry and the density of the model species (P. aristolochiae) in the Ryukyu Islands for several decades.

| D ISCUSS I ON
We hypothesized that in P. polytes populations the frequency of the Batesian mimicry morph is limited by the abundance of the model species (P. aristolochiae) and investigated populations of these two butterflies on eight of the Ryukyu Islands, Japan (Figure 1). We found that the mimic ratio and the model abundance (MR and AI, respectively) are positively associated (Figure 5), although each island exhibited unique MR and AI values, independent of geographical position (Figure 1b). We confirmed the significant association between MR differences and AI differences among the five islands, while controlling for the effect of either GD, AGD, or environmental distances (for a summary of environmental factors see Section 2, Table S6, and Table 4; Figure 5). Accordingly, we propose that Batesian mimicry in P. polytes is maintained by negative frequency-dependent selection shaped by model species abundance and that the MR on each island has adjusted to the local abundance of the model species through NFDS. These views have been proposed previously (Barrett, 1976;Kunte, 2009; Note. Tau is the moment estimator of the time to expansion (age of expansion). SSD indicates the sum of squared deviations between the observed and the expected mismatch distribution, assuming a sudden population expansion as a test statistic. p (Sim. SSD ≥ Obs. SSD)-values < 0.05 indicate significant deviation from the expected distribution, assuming a sudden population expansion model. Expansion times (L), (M), and (H) were calculated based on approximate mutation rate (= substitution rate) μ, with slow, intermediate, and fast rates of 6.5 × 10 −9 , 7.5 × 10 −9 , and 9.5 × 10 −9 substitutions per site and year, respectively. Abbreviation: MYA, million years ago. a Data from the Geospatial Information Authority of Japan (2018). b Data from Osozawa et al. (2012).

TA B L E 3 Mismatch distribution analyses
F I G U R E 4 Mantel test for (a) mimic ratio difference (MRD) and average genetic distance (AGD), (b) MRD and geographic distance (GD), (c) AGD and GD, (d) and (e) MRD and environmental distances (ED1 and ED2) among five Ryukyu Islands. AGD was calculated using Nei's net number of nucleotide differences (D A ). ED1 and ED2 are the difference between PC1 and PC2, generated using various climate variables. ED1 mainly reflects differences in average rainfall and average wind speed among islands. ED2 mainly reflects differences in average temperature among islands (see Table S6). Datapoints represent pairwise combinations of the five islands (KIK, Kikai; OKI, Okinawa; MYK, Miyako; ISG, Ishigaki; TKT, Taketomi). The p-value shown in each panel is for the Mantel test  Turner, 1978); however, the present study provides the first systematic evidence from field surveys.
We also investigated the phylogeographic history of P. polytes in the Ryukyu Islands to examine whether neutral forces such as phy- Nilaparvata lugens Stål (Kishimoto, 1976;Seino, Shiotsuki, Oya, & Hirai, 1987); they likely affect the P. polytes habitats also. This view is supported by our results implying recent population turnover and/ or expansion in the middle and northern islands ( Table 4), and the observed MR differentiation cannot be explained by genetic, geographic, or environmental distance (see Figure 4, Table S6). Based on these findings, we propose that the strong predation pressure on this butterfly (Katoh et al., 2017), rather than isolation by distance or other neutral processes, has shaped and maintained the Batesian mimicry patterns of P. polytes observed in the Ryukyu Islands.
It is generally difficult to demonstrate predation pressure on prey species (P. polytes in this case) in the wild or in field surveys; however, our preliminary data for beak marks on P. polytes indicate predation by birds. The strength of the predation pressure can be estimated from beak marks (e.g., Ohsaki, 1995), and that on mimetic females of P. polytes seems to depend upon the abundance of the model Note. Estimation of the correlation between distance matrices of differences in mimic ratio (MR) and advantage index (AI) (MRD and AID) among islands, controlling for the effects of geographic distance (GD), genetic distance (AGD; D A [Nei & Li, 1979]), environmental distance 1 (ED1), or environmental distance 2 (ED2). ED1 and 2 are the difference between PC1 and PC2, generated using various climate variables. ED1 mainly reflects differences in average rainfall and average wind speed among islands. ED2 mainly reflects differences in average temperature among islands. a Partial correlation coefficient. b AI and AID were calculated based on the definition described in the Section 2. c AI and AID were calculated based on the definition described in previous studies (Sekimura et al., 2014;Uesugi, 2000 (2000) and Sekimura et al. (2014). White and black dots represent the mean values for each island. Error bars show standard error calculated using data collected during several surveys conducted on different days (for raw data, see Table S2). Gray dots indicate datapoints based on the 1982 survey (Uesugi, 2000) and the AI calculation method of Uesugi (2000)  These data support our view that predation pressure combined with model species abundance determines the MR of P. polytes females.
According to our population genetic analysis, the KIK (Kikai) population of P. polytes was estimated to have been established in the relatively recent past (31,000 to 45,000 years ago; Table 3), after which it underwent population expansion (Figure 3).
This implication does not conflict with the relatively young age of KIK Island, estimated based on geological data to have arisen about 0.85 million years ago, compared with the other islands investigated (Table 3; Osozawa et al., 2012). In addition, the significantly negative value of Tajima's D (  (Table 3). Although TKT (Taketomi) Island also has a lower elevation (33 m; Table 3), the TKT population was estimated to have persisted at a stable size ( Figure 3 and Table 3). TKT is located close to ISG (Ishigaki) (Figure 1b), and TKT could easily be recolonized by the ISG population ( Figure 2). This view is supported by the absence of significant genetic differentiation between ISG and TKT (Table 1). Altogether, the consistent patterns observed across several types of analysis support these conclusions.
In future works, massive and comprehensive SNP analysis of the nuclear genome could be used to confirm the evolutionary history of the P. polytes populations examined in this study. Restriction site-associated DNA sequencing (RAD-seq), multiplexed intersimple sequence repeat genotyping by sequencing (MIG-seq) (Suyama & Matsuki, 2015), or other methods could be used. In addition, an ecogenomic analysis of the doublesex (dsx) gene could provide a useful perspective, because the dsx gene is thought to be responsible for mimicry polymorphism in P. polytes (Kunte et al., 2014;Nishikawa et al., 2015;Zhang, Westerman, Nitzany, Palmer, & Kronforst, 2017) and may play a key role in the rapid evolution of P. polytes mimicry in the Ryukyu Islands. Molecular analysis to detect the signatures of selection in the dsx gene would also be worthy of future study.

ACK N OWLED G M ENTS
We thank Senshi Nobayashi for assistance with sampling and our colleagues at the University of the Ryukyus for helpful discus- Computations were partially performed on the NIG supercomputer at ROIS National Institute of Genetics, Japan.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
KT and KT-S designed the study. KT-S, YS, MK, and EK performed the research. YS, KT-S, EK, and HT analyzed data. All authors contributed to writing the manuscript.