Strong isolation by distance among local populations of an endangered butterfly species (Euphydryas aurinia)

Abstract The marsh fritillary (Euphydryas aurinia) is a critically endangered butterfly species in Denmark known to be particularly vulnerable to habitat fragmentation due to its poor dispersal capacity. We identified and genotyped 318 novel SNP loci across 273 individuals obtained from 10 small and fragmented populations in Denmark using a genotyping‐by‐sequencing (GBS) approach to investigate its population genetic structure. Our results showed clear genetic substructuring and highly significant population differentiation based on genetic divergence (F ST) among the 10 populations. The populations clustered in three overall clusters, and due to further substructuring among these, it was possible to clearly distinguish six clusters in total. We found highly significant deviations from Hardy–Weinberg equilibrium due to heterozygote deficiency within every population investigated, which indicates substructuring and/or inbreeding (due to mating among closely related individuals). The stringent filtering procedure that we have applied to our genotype quality could have overestimated the heterozygote deficiency and the degree of substructuring of our clusters but is allowing relative comparisons of the genetic parameters among clusters. Genetic divergence increased significantly with geographic distance, suggesting limited gene flow at spatial scales comparable to the dispersal distance of individual butterflies and strong isolation by distance. Altogether, our results clearly indicate that the marsh fritillary populations are genetically isolated. Further, our results highlight that the relevant spatial scale for conservation of rare, low mobile species may be smaller than previously anticipated.

but see Wood et al. (2015). At the same time, habitat fragmentation prevents species with low mobility from building large population sizes. Therefore, rare species with low mobility are particularly likely to exhibit strong genetic differentiation and isolation by distance (Putz et al., 2015).
By quantifying genetic differentiation among populations, it is possible to assess the spatial dynamics of low-mobility species, and with sufficient resolution, it is even possible to identify spatial clusters of populations suitable for conservation actions. Conservation planning for rare species will benefit substantially from knowledge of the relationship between genetic differentiation and geographic distance. A correlation between geographic distance and genetic divergence has often been used as evidence for isolation by distance (Slatkin, 1993). The lack of a correlation implies that the barriers among populations are considerably reducing or interrupting gene flow among the populations and that consequently the degree of divergence among populations is governed by genetic drift. With such information, spatial conservation units can be identified and informed decisions can be made about where to prioritize connectivity among habitats and where to expand a species range by forming new habitat networks (Kukkala & Moilanen, 2013;Lehtomaki & Moilanen, 2013).
The marsh fritillary (Euphydryas aurinia) butterfly species is likely to be a case in point. This species has a short flight season in May-June and females emerge with several hundred fully developed eggs, mate shortly after emergence, and typically lay these eggs within meters of their pupation site. Under good conditions, females can produce additional egg batches that may be laid further from the emergence site, but dispersal in the adult stage is mostly rather limited. Similarly, the larvae, which overwinter in family clusters, only move short distances and mostly in the final instar in spring prior to pupation (Porter & Ellis, 2011). This species is listed in the Annex II of the EU Habitats Directive and has experienced considerable range contraction in Europe (Warren, 1994) as well as in Denmark over the past 100 years (Eskildsen et al., 2015). It was once a widespread species in Denmark, but has since declined dramatically and was considered virtually extinct in Denmark around year 2000 (Asbirk & Christensen, 2000). Since then, it has been found again in a number of small patches in Northern Jutland partly as a result of conservation actions and increased efforts to search for remnant populations (Brunbjerg et al., 2017). A key question is, therefore, whether genetic diversity is maintained in a rare species like the marsh fritillary in a fragmented habitat like its range in Denmark.
The main threats to the marsh fritillary butterfly in Denmark are habitat loss, encroachment as a result of eutrophication and the cessation of management, drainage, and the indirect effects of artificial fertilizers and pesticides applied to agricultural fields in the surroundings of marsh fritillary habitat (heaths, marshes, and grasslands) (Tjørnløv et al., 2015). Several conservation programs have been carried out (a) to manage marsh fritillary habitat, (b) to monitor the species demography, and (c) to increase the public awareness of the species and its threats (Larsen, 2008). In 2000, the Danish Nature Agency developed the first management plan of the marsh fritillary butterfly, where the main objectives for the conservation of the species were listed (Asbirk & Christensen, 2000). Unfortunately, too few resources have been devoted to the implementation of the management plan. The distribution of the marsh fritillary in Denmark is fragmented and split into three main regions and a few additional small and isolated populations ( Figure 1). Interpatch movement distances above three kilometers have previously been shown to be very rare for this species (Johansson et al., 2019). Dispersal among these three regions must, therefore, be very rare if it happens at all, and the potential for expansion into new areas is limited given the current habitat availability. According to the management plan, the maintenance of metapopulations of marsh fritillary should be the main priority, with the objective of maintaining a population size of at least 500 larval clusters per metapopulation. Recent demographic studies show that population sizes often fall far below this minimum (Lauridsen, 2015). The species can exhibit large fluctuations in population size due to weather conditions or parasites, which makes it very vulnerable to local extinctions (Joyce & Pullin, 2003). These fluctuations are also strongly affecting the effective population size which will tend to be approximately equivalent to the harmonic mean of the population size across years (Caballero, 1994).
A well-established technique for the detection of the genetic structure is the genotyping-by-sequencing (GBS), which is a reproducible, highly multiplexed next-generation sequencing approach that uses restriction enzymes to reduce genome complexity allowing for simultaneous single nucleotide polymorphism (SNP) discovery and genotyping (Elshire et al., 2011). The major advantages over other available protocols are both technical simplicity (Davey et al., 2011) and that informatics pipelines are publicly available and can be easily adapted to a wide variety of species, either with or without reference genome information (Elshire et al., 2011;Glaubitz et al., 2014). GBS, however, has not previously been used for SNP genotyping in any species of the Nymphalidae family. In the current study, we optimized the GBS protocol for the critically endangered marsh fritillary and genotyped a compressive number of individuals from 10 Danish populations. The resulting SNP dataset was used to analyze the genetic structure of the marsh fritillary in order to examine the spatial scale at which genetic differentiation can be detected for a rare, low mobile species.
We aim to answer the following questions (1) Is there evidence

| Sample collection
A total of 300 3rd or 4th instar larvae of the marsh fritillary were col-  Table 1).
All the genomic analysis and the bioinformatic filtering of the data have been outsourced to the Cornell University Genomic Diversity Facility (US) (https://www.biote ch.corne ll.edu/core-facil ities -brc/facil ities/ genom ics-facility).

| DNA extraction and genotyping-by sequencing (GBS) protocol optimization
The total genomic DNA was extracted from larvae using the DNeasy® Blood & Tissue Kit (Qiagen, Inc., Hilden, Germany) according to the manufacturer's protocol for purification of total DNA from insects. DNA quantity and quality was verified using a fluorometer (Qubit ® , Thermo Fisher Scientific Inc.) and by running 100 ng of each DNA sample on a 1% agarose gel, respectively.
For optimization of the standard GBS protocol (Elshire et al., 2011) for E. aurinia, a single DNA sample (400 ng) was digested for 2 hr with the restriction enzymes ApeKI, EcoT22I, and PstI, in separate essays, using a tenfold excess of enzyme and reaction conditions as specified by the endonuclease manufacturer (New England Biolabs). After ligation of appropriate adapters (adapter amounts were determined by titration as described in REF (Elshire et al., 2011) and PCR (see below)), fragment size distributions of each F I G U R E 1 The current distribution of the marsh fritillary (Euphydryas aurinia) in Denmark as indicated by black 1 × 1 km grid cells on the main map of Northern Jutland where the species has been observed during (2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015). The region in which the marsh fritillary is found in Denmark is indicated on the inset map of the country. The populations sampled for this study cluster in the three main regions A, B, and C of the distribution of the species in Denmark. A few patches occur outside these main regions, but only with small population sizes at one or two sites. The exact location of each sampled population is indicated by a number referring to the following population names: (1) Bruså, (2) Tranum Skydeterraen, (3) Overklitten Sø, (4) Sandmosen, (5) Vågholt Mose, (6) Troldkaervej, (7) Knasborgvej, (8) Videsletengen, (9) Milrimvej, and (10)  test library were visualized using an Experion (Bio-Rad, Hercules, California, USA) ( Figure S1). Based on these results, we selected the libraries obtained from the EcoT22I digests to maximize sequence coverage from GBS.

| Preparation of Illumina libraries for nextgeneration sequencing
Three 96-plex EcoT22I GBS libraries, comprising 285 DNA samples and three negative (no DNA) control, were prepared according to Elshire et al. (2011). Low DNA concentration samples (n = 15) were discarded and not submitted for sequencing. Briefly, individual DNA samples were digested with the restriction enzyme and adapters were ligated as described previously. The adapters comprised a set of 96 different barcodes containing adapters and a "common" adapter.
Individual ligations were pooled, and purified using QIAquick PCR purification kit (Qiagen

| DNA sequence analysis: SNP discovery and genotyping
Illumina raw reads were processed using the default parameter of the Universal Network-Enabled Analysis Kit (UNEAK) (Lu et al., 2013) for species without a reference genome. This pipeline was implemented in TASSEL version 3.0.166 (Glaubitz et al., 2014) and used for tag alignment and subsequent SNP calling. Briefly, the raw Illumina DNA sequence data (100-bp qseq files) were first trimmed to remove barcodes. The sequence remnants were then either trimmed further or padded with 3' A's to 64-bp lengths. Sequences were then aligned to each other, both to identify unique sequences, or "sequence tags", and to generate clusters of related sequences. For each cluster, a network was generated, in which sequence tags were organized according to mutation steps (i.e., mutational relationship). A single base-pair mismatch was allowed among cluster members. Networks were then filtered such that only SNPs originating from reciprocal tag pairs were retained (see Lu et al., 2013). SNPs from more complicated networks that often result from alignment of paralogs and repeats, or sequencing errors were discarded. To further reduce the impact of sequencing errors, we also set the error tolerance rate (ETR) parameter to 0.03, slightly below the expected Illumina sequencing error rate (0.04%). Failed samples (nonblank), defined as those with less than 10% of the mean reads per sample coming from the lane on which they were sequenced, were discarded (n = 5).
The resulting raw SNP dataset from the UNEAK pipeline was further filtered using Golden Helix SNP and Variation Suite (SVS version 7.2.2, Golden Helix, Bozeman, MT) and PLINK v1.07 (Purcell et al., 2007) softwares. First, the dataset was filtered by the application of genotype-level filters to remove genotypes with low read depths (RD) and/or low genotype quality (GQ). Thus, genotypes with RD ≤ 4× and GQ ≤ 98 were considered as missing. Later, we removed all SNPs and individuals with call rates <80%. In addition, SNPs with a minor allele frequency (MAF) < 0.05 were removed. Loci with a mean-observed heterozygosity >0.6 were also discarded to filter out potential paralogs. The SNP set was also pruned for linkage disequilibrium (LD) by excluding markers in strong LD (pairwise genotype correlation r 2 > .5) in a window of 50 SNPs (sliding window overlap 10 SNPs at a time). This filtering process is described in Figure S2.

| Genetic variability and population structure
Genetic variability in each population was assessed by the calcu- there was an overall correlation between geographic distance and genetic divergence (Smouse et al., 1986). Some authors argue that spatial structures in the dataset can enhance isolation by distance (Legendre et al., 2015). To control for such a potential effect, we ran one test based on all pairwise comparisons and one test on a subset of pairwise comparisons excluding pairs from two different regions.

| RE SULTS
The UNEAK pipeline recovered 30,137 bi-allelic SNP loci (n = 280; 5 samples failed the UNEAK pipeline). However, most of these SNPs had low coverage or were only present in a small number of individuals. After the complete filtering procedure, 318 SNPs were maintained in our matrix for 273 individuals with an overall call rate of 93.57% (see Figure S2). Over all of these loci, the mean coverage per Deviations from HWE were found to be highly significant for all the 10 populations investigated (p < .001 in all cases), both when pooled altogether and when considered singularly. All the deviations were due to heterozygote deficiency as can be seen by the positive F IS values ranging from 0.1 to 0.228 (Table 1). Genetic variability parameters, observed heterozygosity (H O ), unbiased heterozygosity (uH E ) and inbreeding coefficient (F IS ), mean percentage polymorphic loci (%P), and mean effective alleles (Ne), are listed in Table 1. Genetic divergence between populations ranged from 0.028 to 0.1 ( Table 2).
All the pairwise F ST values were highly significant (p < .001).
Populations were strongly clustered into three geographic regions; the minimum CVE value in the ADMIXTURE analysis suggested an optimal number of genotypic clusters for K = 6 (CVE = 0.525) ( Figure S3). The graphical visualization of the ADMIXTURE results for the 273 individuals and K ranging from two to 12 clusters is shown in Figure 2. When K = 6, E. aurinia is clearly subdivided into different genetic clusters that mainly corresponded with the three geographic regions of the species distribution (i.e., A, B, C) and each of the sampled populations: Region A (population 1) is characterized by a single genetic cluster; Region B is subdivided into two different clusters, one including populations 2 and 3 (dark blue) and an additional cluster exclusively including population 4 (light blue); and Region C genetic subdivision is more complex and characterized by 3 private clusters (red, green and orange clusters). However, while population 10 is mainly char-

| D ISCUSS I ON
We used a genotyping-by-sequencing (GBS) approach to increase the knowledge on the population genetic structure of the critically endangered marsh fritillary butterfly in Denmark. This study documents the identification of an informative SNP loci panel and demonstrates that the GBS approach represents a powerful tool to define genetic relationships at the intraspecific level. Moreover, this SNP panel provides an important genetic resource for further genetic studies of the marsh fritillary and is a cost-effective and rapid method that can well describe the genetic variability of other nonmodel species with limited genetic resources.
The stringent filtering procedure that we have applied (GQ = 98) (compared with most studies where the GQ filter, is more commonly confirmed by the fact that strong evidence for isolation by distance was found, even when accounting for the spatial structure in our data by omitting comparisons among the three regions in which the marsh fritillary is found (Legendre et al., 2015). Detailed capturemark-recapture studies have demonstrated that most dispersal events are shorter than one kilometer and, only in rare cases, marsh fritillary butterflies disperse more than five kilometers (Johansson et al., 2019;Zimmermann et al., 2011). This suggests that under current levels of fragmentation, isolation by distance can be detected at the same spatial scale as the dispersal capacity of the species. Most Our results suggest that further efforts are needed to maintain genetic diversity in this species in Denmark. Source-sink dynamics will effectively increase mortality from isolated populations because dispersing individuals will be unable to locate suitable habitat. In addition, reduced mixing of populations will affect genetic diversity and ultimately cause inbreeding (Pertoldi et al., 2007;Sigaard et al., 2008).  (Brunbjerg et al., 2017).
Population structure has been studied in many butterfly species using microsatellite markers (Saccheri et al., 2004;Smee et al., 2013;Vandewoestijne et al., 2011;Zeisset et al., 2005). However, due to the formidable challenges involved in developing informative microsatellite markers (Nève & Meglécz, 2000), most studies have relied on a small number of markers with limited resolution (Sigaard et al., 2008

ACK N OWLED G M ENTS
This work was supported by the Aalborg Zoo Conservation and 2017-N-6). The Danish Nature Agency is acknowledged for permission to collect marsh fritillary larvae.

CO N FLI C T O F I NTE R E S T
The authors have no conflict of interest.