Phylogeography and population genetics of pine butterflies: Sky islands increase genetic divergence

Abstract The sky islands of southeastern Arizona (AZ) mark a major transition zone between tropical and temperate biota and are considered a neglected biodiversity hotspot. Dispersal ability and host plant specificity are thought to impact the history and diversity of insect populations across the sky islands. We aimed to investigate the population structure and phylogeography of two pine‐feeding pierid butterflies, the pine white (Neophasia menapia) and the Mexican pine white (Neophasia terlooii), restricted to these “islands” at this transition zone. Given their dependence on pines as the larval hosts, we hypothesized that habitat connectivity affects population structure and is at least in part responsible for their allopatry. We sampled DNA from freshly collected butterflies from 17 sites in the sky islands and adjacent high‐elevation habitats and sequenced these samples using ddRADSeq. Up to 15,399 SNPs were discovered and analyzed in population genetic and phylogenetic contexts with Stacks and pyRAD pipelines. Low genetic differentiation in N. menapia suggests that it is panmictic. Conversely, there is strong evidence for population structure within N. terlooii. Each sky island likely contains a population of N. terlooii, and clustering is hierarchical, with populations on proximal mountains being more related to each other. The N. menapia habitat, which is largely contiguous, facilitates panmixia, while the N. terlooii habitat, restricted to the higher elevations on each sky island, creates distinct population structure. Phylogenetic results corroborate those from population genetic analyses. The historical climate‐driven fluxes in forest habitat connectivity have implications for understanding the biodiversity of fragmented habitats.

( Figure 1). The arid desert lowlands can serve as a barrier to dispersal for high-elevation animal species (Holycross & Douglas, 2007;Masta, 2000;Ober, Matthews, Ferrieri, & Kuhn, 2011;Tennessen & Zamudio, 2008). The sky islands are also a region of high species turnover, as dozens of insect genera and hundreds to thousands of insect species reach the northern or southern limits of their ranges there (Felger & Wilson, 1994). As a result of the complex topography, mixing of temperate and tropical climate zones, and convergence of two major mountain ranges, the sky islands are one of the most biologically diverse ecoregions in the world (Skroch, 2008). By studying abundant and ecologically diverse taxa such as insects, we can gain a better understanding of diversification and community structure (Moore et al., 2013).
Neophasia butterflies are an excellent system for investigating how the sky islands have shaped species history. These butterflies are coniferous forest specialists and their evolutionary history likely follows that of their forest habitats, which have been hypothesized to fluctuate in area and connectivity in association with climate changes (Thompson & Anderson, 2000). The near-contact zone of ranges of the two Neophasia species meets at the major transition of ecoregions marked by the sky islands in AZ. Neophasia menapia (Felder & Felder, 1859) ranges from southwestern British Columbia to the Mogollon Rim and White Mountains of AZ (Scott, 1986), and southeastward through New Mexico to Guadalupe Mountains National Park in Texas (Lotts & Naberhaus, 2016). Neophasia menapia has four geographically defined subspecies (Pelham, 2008) F I G U R E 1 Map of Arizona, showing sampling sites and their two-letter identifiers. Elevation data were acquired from the National Resources Conservation Service Geospatial Data Gateway. The 1,300-1,850 m elevation range represents a scenario 18,000 C 14 years before present when the open conifer woodlands occurred in that range, evidenced by packrat middens containing plant matter from open conifer woodlands (Table 3 of Thompson and Anderson (2000)). Elevations above 1,850 m represent the present day start of open conifer woodlands. Inset map shows major ecoregions in black text and shades the state/province-scale distributions (except for Texas, which has only one record in the extreme SW corner) of each Neophasia spp.: blue for Neophasia menapia and red for Neophasia terlooii. Both species occur in Arizona, and it is therefore shaded purple and N. m. magnamenapia is the subspecies in AZ on which this study was focused (henceforth N. menapia). Neophasia terlooii (Behr, 1869) is present in the sky islands of southeastern AZ southward to the Sierra Madre Occidental in central Mexico (Bailowitz & Brock, 1991).
Both N. menapia and N. terlooii are exemplary representatives of the two major ecoregions that meet at the sky islands: N. menapia from the Rocky Mountain ecoregion and N. terlooii from the Madrean ecoregion. Additionally, because Neophasia are locally abundant, easy to spot, collect, and identify, and are fairly weak flyers, they are reliably available for collection in sufficient numbers at multiple sites. Additionally, due to their dependence on climatologically sensitive pine forests, they constitute an excellent model taxon to investigate how climate-induced changes in habitat connectivity influence diversification.
The larval hosts of N. menapia include Abies, Pseudotsuga, Tsuga, Picea, and Pinus, (Cole, 1971;Evenden, 1926;Robinson, Ackery, Kitching, Beccaloni, & Hernández, 2002;Scott, 1986) and N. terlooii feeds only on the latter two genera (Arizona Game & Fish Department, 2001). Pinus ponderosa (Engelmann), a larval host for both Neophasia species, occurs in the more fragmented Madrean evergreen woodlands and is a predominant member of the comparably larger and more contiguous Rocky Mountain coniferous forests (Brown, 1994). There is some overlap in the voltinism of these two species in AZ; N. menapia is univoltine, adults are active from mid-July through August, and N. terlooii is bivoltine and is active in low numbers from June through early September with peak adult activity in October. For both species, eggs are laid on live pine needles. Some populations of N. menapia in Montana have occasionally undergone population irruptions (Dewey, Ciesla, & Meyer, 1974), but in most cases Neophasia are locally abundant and not believed to disperse far from forested habitats containing their host trees.
Given the close and abrupt range disjunction in Neophasia, our main objective was to determine whether Neophasia species residing in forested habitats of different ecoregions of the sky islands and surrounding area have similar concordant population structure.
Using newly discovered SNPs from ddRADSeq libraries generated for this study, we infer with multiple lines of evidence the population structure and phylogeographic history of Neophasia. We hypothesized that, despite their capability of flight, genetic structuring of Neophasia populations would reflect the degree of current habitat connectivity. We discuss the limits of Neophasia dispersal with respect to the geographic layout of their mountain habitats in Arizona.
In addition to physiological constraints and climate differences between habitats (Halbritter, Teets, Williams, & Daniels, 2018), the desert and grassland matrices separating the two species' mountain habitats are likely to be dispersal barriers contributing to allopatry.

| Specimens
Neophasia menapia (n = 88) and N. terlooii (n = 69) were collected from 10 and seven field sites, respectively, throughout AZ in the summer and fall of 2013 and 2014 ( Figure 1, Table 1). Specimens were kept alive until they were stored at −15°C for a minimum of 12 hr. Frozen butterflies were then placed into 95% ethanol stored at the same temperature until they were moved to the University of Florida, Gainesville, FL, USA, at which point they were stored at −80°C.

| Library preparation
Thoracic muscle tissue, roughly 3 mm × 1 mm, was removed from each adult butterfly for DNA extraction. Genomic DNA was extracted from tissues using the OmniPrep extraction kit (G-Biosciences; Cat. # 786-136) following the manufacturer's protocol, with some adjustments to optimize DNA recovery (Appendix S1). Each DNA extract was quantified using a Qubit dsDNA BR Assay Kit (Life Technologies, Co.; Lot # 1681328). Extractions that contained a DNA concentration of at least 10 ng/µl were used for library preparation.
Library preparation for ddRADSeq followed that of Peterson, Weber, Kay, Fisher, and Hoekstra (2012), with some added quality control measures (Appendix S1). We used the restriction enzymes EcoRI (NEB, R3101 20,000 units/ml) and MseI (NEB, R0525 10,000 units/ml). These two enzymes have been successfully used to obtain 20,737 SNPs in an independent population genomics study on N. menapia (Bell, 2012

| Raw data processing
Raw sequences containing the multiplexed reads (10.7 GB total) were first assessed for quality by generating FastQC reports (FastQC 0.11.4;Andrews, 2010). Any reads that did not have at least 90% of bases with a Phred score of ≥20 were discarded using FastX quality trimmer from the FASTX_Toolkit (http://hanno nlab.cshl.edu/ fastx_toolk it/). Filtered sequences were then demultiplexed using FastX_toolkit FASTQ Barcode splitter and either the 14-, 10-, 9-, or 8-base barcodes. FastQC reports, filtering, and demultiplexing were all performed using the UF instance of Galaxy (Afgan et al., 2018).
Barcodes and the 6-base pair enzyme cut site were trimmed from the 5′ end and 5-10 bases were trimmed from the 3′ end of the reads from each individual. Bases were trimmed from the 3′ end because the last 5-10 bases tended to have lower Phred scores.
Reads were all trimmed to be at most 125 bases in length to facilitate downstream analyses in Stacks. Trimming was accomplished using FASTX_trimmer from the FastX _ toolkit. Because we used two restriction enzymes, some of the MseI cut sites were less than 125 bases from the EcoRI cut site at the 5′ end. These shorter reads were removed using FASTX _clipper to yield the final library of demultiplexed files for each individual.

| Discovering loci and calling genotypes
We used staCks 1.35 (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013) for locus discovery and genotyping. Because a reference genome was not available to assemble loci when processing the reads, we ran the staCks program denovo_map.pl to assemble loci.
Unless otherwise specified with a population map, staCks assumes all barcoded individuals belong to a single population. To avoid investigator bias during SNP identification, we did not specify a population map because we did not know if our sampling sites could be  (Table 1).

| Quantifying genetic variability and spatial relationships
For all downstream analyses with the staCks populations outputs, R 3.2.2 was used (R Core Team, 2015). Expected heterozygosity was computed using function Hs from "adegenet" (Jombart, 2008) and deviations from Hardy-Weinberg equilibrium were tested for using the hw.test function from "pegas" (Paradis, 2010). We did not know if our sampling sites represented true populations; therefore, we first quantified genetic differentiation by comparing fixation indices (F ST ) between sampling locations within each species.
To determine between location fixation indices, we ran pairwise.fst from "heirfstat" (Goudat & Jombart, 2015) for populations within each species and computed F ST confidence intervals using boot.
ppfst. Our sampling was spatially clustered with respect to site; therefore, F ST comparisons were appropriate (Manel, Schwartz, Luikart, & Taberlet, 2003). A test for isolation by distance was run for each species using mantel.randtest from "ade4" (Dray & Dufour, 2007). An isolation by distance plot was generated, and F ST values were used to plot genetic distance as a function of geographic distance.
An analysis of molecular variance (AMOVA) was performed on the staCks-generated genotypes for both species using poppr.amova from "poppr" 2.1.1 (Kamvar, Tabima, & Grünwald, 2014). Strata were specified such that each individual was placed in a sampling region (mountain range) and then a site within each region (e.g., there were three sampling sites within the Huachuca Mountains). To test the significance of variations within individuals, between individuals, between sites (i.e., collection sites), and between mountains (i.e., mountain ranges or regions containing one or more sites), randtest from "ade4" was used on the AMOVA objects.

| Identifying population clusters
To identify genetic clusters and estimate cluster membership probabilities of each individual, we used struCture 2.3.4 (Pritchard, Stephens, & Donnelly, 2000) to estimate K, searching from K = 1-15 with 10 replicates per K and a burnin of 100,000 and 150,000 reps after burnin for each replicate. Prior population or location information was not used for inferring clusters and admixture was allowed.
Results were interpreted using Structure Harvester (Earl & von-Holdt, 2012) and plots generated using Structure Plot (Ramasamy, Ramasamy, Bindroo, & Naik, 2014). Substructure was investigated in N. terlooii for sampling sites in which all individuals had at or near 100% probability assignment to one of two clusters (i.e., minimal admixture). Two datasets, each including only minimally admixed sampling sites from one of the clusters, were analyzed again with struCture using the aforementioned parameters, except that only K = 1-5 were tested.

| Phylogenetic analysis
To further investigate genetic structure, we ran the ddRADSeq data through the pyraD pipeline to infer phylogenetic relationships. Eaton (2014) developed pyraD to maximize phylogenetic information across more distantly related samples from RADseq data. staCks and pyraD differ in that pyraD uses a global alignment algorithm that allows indels. Although the staCks pipeline has been used to generate species-level phylogenies (e.g., Jones, Fan, Franchini, Schartl, & Meyer, 2013), pyraD can be equally as effective and generally discovers more loci than staCks (Pante et al., 2015). For example, Mort et al. (2015) utilized the pyraD pipeline to generate a well-resolved phylogeny of a plant genus.
In pyraD, adaptor-trimmed sequences with quality scores were used for demultiplexing. As a quality control measure, reads were kept if there were fewer than 15 sites that had Phred scores ≤20.
The minimum number of reads for cluster formation and putative loci identification was set to seven. Putative pyraD loci were retained if at least ten individuals shared a locus. If we required more individuals to share a locus, we lost a considerable number of loci (see Results); therefore, we kept the minimum number of individuals (i.e., samples in pyraD) to ten. Up to 100 SNPs per locus (default for parameter ## 26.opt) were allowed. Using pyRAD generated data matrices, maximum likelihood trees with 10,000 ultrafast bootstrap replicates (Minh, Nguyen, & Haeseler, 2013) were inferred using iQ-tree (Nguyen, Schmidt, Haeseler, & Minh, 2015). The model used for both species was GTR + I + G4. Tree files were visualized using Figtree (Rambaut, 2007).

| RE SULTS
A cumulative total of approximately 120 million raw reads were generated from 151 individuals across ten N. menapia and seven N. terlooii sampling sites in AZ. All of the N. terlooii individuals were retained after filtering and 82 of the 88 N. menapia were retained.
After filtering, up to 10,740 were loci were recovered per sampling site, depending on the method and species (Table 2). Each locus contained one SNP for staCks and up to 100 SNPs for PYRAD.  (Table   S1). There were significant amounts of variation between mountains and between sampling sites within mountains for both N. menapia and N. terlooii, and there was a significant amount of variation within N. terlooii individuals as indicated by the analysis of molecular variance (Table 4). When plotting genetic distance as a function of geographic distance, there was a positive correlation between the two for N. terlooii (Figure 2b). This was evident with significant isolation by distance (r = .9062, p = .001,) from a Mantel test. Neophasia menapia did not show significant isolation by distance (r = −.081, p = .583), nor did genetic distance correlate with geographic distance (Figure 2a).

| Identifying population clusters
For N. terlooii, Bayesian hierarchical clustering using struCture suggested an optimum K of 2 based on delta K (Evanno, Regnaut, & Goudet, 2005) and the mean log probabilities (Table 5) Neophasia terlooii 9,351 3,125 The same individuals were included in each pipeline. Santa Rita Mountains separated from the three sites in the Huachuca Mountains An optimal K for N. menapia was not clear using these methods with delta K suggesting a K = 9, but mean log probabilities increasing with K and the highest probability being for the maximum K tested: K = 15 (Table 5). Assignment probability plots of individuals to each of several population clustering options are presented in

| Phylogenetic analysis
pyraD allowed us to use the ddRADSeq data for inferring phylogenetic relationships among individuals across our sampling sites.
Specifying that 10% of individuals must share a locus produced large matrices with a lot of missing data in the analysis. However, using a more stringent threshold, such as requiring a minimum of

| Overview
The sky islands are considered a neglected biodiversity hotspot (Felger & Wilson, 1994). Research on the biodiversity of the sky islands has focused primarily on dispersal and vicariance events associated with the Pleistocene epoch with subsequent divergence associated with biotic (Carstens & Knowles, 2007;Tennessen & Zamudio, 2008) and abiotic factors (Smith & Farrell, 2005). Within the sky islands, geographic structuring in genetic relationships has been observed in flightless beetles (Ober et al., 2011;Smith & Farrell, 2005), an ant (Favé et al., 2015), and a jumping spider (Masta, 2000

| Neophasia menapia is panmictic in AZ
Traditionally, population genetic studies required at least tens of individuals per sampling site to make accurate inferences. Willing, Dreyer, and Oosterhout (2012) found that the original Wright (1951) F ST overestimated differentiation with small sample sizes (n ≤ 6), but  Figure 4a). There is some evidence for the populations north of the Grand Canyon to be distinct from those south of the canyon. This is apparent in the struCture plot ( Figure 3b). However, the struCture analysis did not produce a clear, definitive K, which appears to provide further evidence for panmixia. We conclude that N. menapia is most likely panmictic in AZ and flight across the Grand Canyon is possible. However, given the possibility that some ancestral polymorphism was retained, the Grand Canyon may restrict gene flow to some degree given the lack of host plants at the lower elevations across the canyon.
Isolation by distance (IBD; Wright, 1943) occurs when genetic differentiation is due to limited gene flow across long distances rather than biologically relevant barriers in the environment.
Populations arising from dispersal events tend to be geographically clustered and represent some pattern of spatial autocorrelation (Meirmans, 2012). Mantel tests cannot differentiate between patterns arising from such clustering and those from IBD (Meirmans, Goudet, & Gaggiotti, 2011). The latter authors argue that it is best to use a clustering method that accounts for the geographical locations of the samples. The Mantel test suggested that IBD was not significant between N. menapia sampling sites, which also did not show any consistent or well-supported clustering. This makes sense because there were roughly equivalent F ST indices between proximal and distant sites (Figure 2a). Because we ran the Mantel test without specifying the geographical location data and there was evidence for hierarchical clustering within N. terlooii sampling sites, it is not surprising that the Mantel test was significant for N. terlooii. We conclude that the isolation by distance within N. terlooii is more likely due to hierarchical clustering on isolated mountains.

| Sky islands are responsible for Neophasia terlooii population structure
During our surveys, Neophasia butterflies were only observed at elevations between 1,700 and 2,600 m above sea level, and these eleva-

| Dispersal limitations and biogeographical considerations for Neophasia
It is evident that the geographic distance between the sky island mountains is enough to isolate N. terlooii populations. The shortest distance between any two sky islands in our study is 55 km (be- ranges would provide the data necessary to unravel their biogeographic history in western North America and would contribute to our work here and to knowledge of the evolution of biodiversity in the sky islands. Additionally, the timing and direction of dispersal events is key to further exploring their biogeography. This remains for future study and will require more accurate methods of divergence time estimation at the population level.

| CON CLUS IONS
We extracted genome-wide data from 151 individuals and used this data to infer population structure and generate phylogenies for a nonmodel taxon in a neglected biodiversity hotspot.
Despite differences in the number and type of loci discovered across the genomic pipelines used, we reached congruent biological conclusions regarding Neophasia population structure in AZ. As our ability to analyze genomic data catches up with our ability to generate data, complex biogeographical studies using nonmodel taxa at very shallow phylogenetic levels will be increasingly common and affordable. With improved algorithms and faster processors, total-evidence approaches are becoming increasingly feasible, especially for projects with minimal or short-term funding. Lastly, we thank our three anonymous reviewers for their detailed comments and suggestions during revision.

CO N FLI C T O F I NTE R E S T
None declared.