Demographic history and genomic signatures of selection in a widespread vertebrate ectotherm

Environmental conditions vary greatly across large geographic ranges, and yet certain species inhabit entire continents. In such species, genomic sequencing can inform our understanding of colonization history and the impact of selection on the genome as populations experience diverse local environments. As ectothermic vertebrates are among the most vulnerable to environmental change, it is critical to understand the contributions of local adaptation to population survival. Widespread ectotherms offer an opportunity to explore how species can successfully inhabit such differing environments and how future climatic shifts will impact species' survival. In this study, we investigated the widespread painted turtle (Chrysemys picta) to assess population genomic structure, demographic history, and genomic signatures of selection in the western extent of the range. We found support for a substantial role of serial founder effects in shaping population genomic structure: demographic analysis and runs of homozygosity were consistent with bottlenecks of increasing severity from eastern to western populations during and following the Last Glacial Maximum, and edge populations were more strongly diverged and had less genetic diversity than those from the centre of the range. We also detected outlier loci, but allelic patterns in many loci could be explained by either genetic surfing or selection. While range expansion complicates the identification of loci under selection, we provide candidates for future study of local adaptation in a long‐lived, widespread ectotherm that faces an uncertain future as the global climate continues to rapidly change.

Individuals can also respond to diverse environmental challenges through phenotypic plasticity (Bodensteiner et al., 2023;Bonamour et al., 2019;Torres-Dowdall et al., 2012).In long-lived iteroparous species, phenotypic plasticity is an important mechanism for adjusting to diverse environmental conditions experienced within an individual's lifetime (Gunderson & Stillman, 2015;Nussey et al., 2007; but see Radchuk et al., 2019).However, in the face of anthropogenic climate change, the capacity of species with long generation times for rapid adaptive evolution is less certain (Moritz & Agudo, 2013).
To understand both the historical evolutionary pressures that shaped species' current ranges and the future risk of extinction from anthropogenic climate change, it is essential to assess the contributions of both phenotypic plasticity and local adaptation to phenotypic change across populations of long-lived species (Valladares et al., 2014).New genomic resources allow the investigation of patterns of selection across the genome, past demographic patterns, and current population genetic structure (Savolainen et al., 2013).
This new information can contribute to our understanding of how long-lived species with widespread geographic distributions inhabit such a wide range of natural environmental conditions and how they may respond to future climate change (Meek et al., 2023).
The predicted survival outcome for many reptiles appears bleak due to the rapid pace of climatic shifts exacerbated by habitat loss and decreased population sizes (Böhm et al., 2016;Gibbons et al., 2000;Sinervo et al., 2010).Thermosensitive traits such as temperaturedependent sex determination (TSD) and anoxia tolerance make these species particularly vulnerable.Temperature changes of only a few degrees could destabilize sex ratios, especially in small populations (Mitchell et al., 2010;Mitchell & Janzen, 2010), or influence individuals' ability to prepare adequately for warming winter conditions (Moss & MacLeod, 2022).In reptiles, most studies investigating potential responses to climate change suggest that plasticity plays an important, and perhaps predominant, role in mediating organismal responses to temperature (e.g., Janzen et al., 2018;Refsnider & Janzen, 2012, 2016;Urban et al., 2014).Although plasticity is likely important for the survival of reptiles with long generation times, the potential contribution of local adaptation to future survival of reptile populations is perhaps less understood than the contribution of phenotypic plasticity (Urban et al., 2014).We currently have a limited grasp of the amount of standing genomic variation in longlived reptile populations, how that variation has been shaped by past demographic events (e.g., population expansions, bottlenecks), and the degree to which this variation might contribute to adaptation.
Investigating population genomic structure, demographic history, and genomic evidence of selection in long-lived reptiles can inform management of these charismatic species and help predict future responses to climate change (Macdonald et al., 2018).
Painted turtles (Chrysemys picta) -long-lived ectotherms with TSD -are an excellent reptile to evaluate genomic variation and signatures of selection across a large geographic range.Painted turtles are widespread across North America, with populations subjected to disparate macro-environmental conditions.For example, in the species' western extent, maximum air temperatures experienced among localities range from <26°C to 35°C and minimum air temperatures range from 0°C to <−22°C.Divergence in colour, scute patterns, nesting behaviour, adult body size, reproductive effort, characteristics of TSD reaction norms (e.g., transitional range of temperatures), and thermal reaction norms for hatchling body mass and incubation time have been documented across populations (Bodensteiner et al., 2019(Bodensteiner et al., , 2023;;Carter et al., 2019;Edge et al., 2017;Ernst & Lovich, 2009;Iverson & Smith, 1993;Janzen et al., 2018;Ultsch et al., 2001).Phenotypic plasticity explains some variation in phenotypes among populations (e.g., date of first nesting, Janzen et al., 2018;nesting behaviour, Refsnider & Janzen, 2012).
However, population genomic structure and genomic signatures of selection have not yet been assessed in this species.In this study, we investigated population genomic structure, demographic history, and genomic signatures of selection in seven locations spanning the western range of painted turtles.We also assessed patterns of divergence in particular genes associated with TSD and extreme anoxia tolerance (e.g., Fanter et al., 2020;Ge et al., 2018), as we hypothesized these two traits may be under differential selection across painted turtle populations experiencing different temperature extremes.Understanding both demographic history and selection dynamics in a widespread, long-lived ectotherm provides insight into how species successfully colonize vast geographic ranges.

| Study organism
Painted turtles (Chrysemys picta) have a unique distribution among turtles, with the widest natural range of any North American turtle species (Ernst & Lovich, 2009).The current distribution encompasses the entire eastern coast of the United States and stretches west across the northcentral United States and southern Canada, with a patchy distribution in the southwestern United States (Figure S1; Bodensteiner et al., 2019;Carter et al., 2019).Fossil evidence suggests the species predates the Pleistocene glaciation period, and limited genetic evidence points to multiple range contractions as glaciers expanded (Starkey et al., 2003).Using one mitochondrial locus, Starkey et al. (2003) found that the southeastern populations were more divergent from other populations and were likely the source of multiple colonizations of the northern United States and Canada following glacial retreat.Further, very low genetic divergence was detected across northern and southwestern populations, particularly in the western half of the species' range, despite the species' age and widespread distribution.This result suggests that glaciation forced a range contraction in the east, followed by a rapid westward radiation of painted turtles.Additionally, the reference genome for painted turtles shows slow rates of sequence evolution compared to other vertebrate lineages, which may further compound a putative lack of genetic diversity observed among populations (Shaffer et al., 2013).
More recent studies of genetic structure in painted turtles, again with limited number of markers, have consistently reported low divergence across much of the range (Jensen et al., 2015), including a few western populations (Reid et al., 2019).The study by Reid et al. (2019) corroborates the hypothesis that the western populations are the result of a rapid radiation westward following glaciation.Taken together, previous work on the genetic structure of this species has relied on a small number of neutral markers and suggests climate has strongly influenced the past and current range of painted turtles.
Though low genetic diversity has been consistently reported across western painted turtle populations, phenotypes do vary across locations from the species' western range.For example, body size and age at maturity of adult painted turtles were positively correlated with latitude and negatively correlated with mean annual temperature (Iverson & Smith, 1993).Additionally, thermal reaction norms for painted turtle hatchling phenotypes varied in commongarden conditions; responses of incubation duration and hatchling mass to temperature varied among hatchlings from different locations, but in a pattern inconsistent with latitude (Bodensteiner et al., 2019).Temperature and precipitation patterns vary widely across the species' range in the western United States (Table S1), which likely exerts different selective pressures on these ectotherms, particularly at thermal extremes.As a model of a long-lived, widespread ectothermic amniote with an exceptional fossil record (see Starkey et al., 2003), a genome-wide study of genetic differentiation in painted turtles is warranted to investigate demographic patterns and putatively adaptive variation.

| Sample collection
We collected clutches of painted turtle hatchlings or eggs from seven locations across the western 2/3 of the species' range (Table 1; Figure S1) and incubated eggs at Iowa State University (ISU) until hatching (see Bodensteiner et al., 2019;Refsnider et al., 2014).
We euthanized hatchlings with a pericardial overdose of 0.5 mL of 1:1 sodium pentobarbital: water, assigned sex according to macroscopic examination of gonads (Schwarzkopf & Brooks, 1985;Warner et al., 2014), and stored hatchlings in 70% ethanol until DNA extraction.All research followed ISU IACUC protocols (6-08-6583-J and 12-03-5570-J) with appropriate permits from the associated agencies of each sampling location (see Data Availability Statement).
To avoid any bias of relatedness, we genotyped only one hatchling from each clutch, with sample sizes ranging from 5 to 38 hatchlings from each location (Table 1).We extracted DNA from hatchling liver tissue using a DNeasy Blood and Tissue Kit (Qiagen, Germantown, MD) or a phenol-chloroform DNA extraction protocol (Sambrook et al., 1989; see Supporting Information).We quantified DNA with a NanoDrop™ 2000 Spectrophotometer and assessed DNA purity and quality using a 1% agarose gel before we performed DNA sequencing on 164 individuals.

| RAD processing and variant calling
We used a restriction site-associated DNA sequencing (RADseq) approach with a single-restriction enzyme, MseI, ligated resulting fragments with two adapters (6 or 8 bp barcode), and pooled samples for size selection.We selected fragments of 275-300 bp with gel electrophoresis and confirmed size with an Agilent® 2100 BioAnalyzer before paired-end 150 bp sequencing on an Illumina HiSeq platform (Illumina, San Diego, CA).Following sequencing, we removed raw reads that contained more than 10% unknown bases, removed reads with more than 50% low quality bases (Q ≤ 5), and trimmed adapter sequences with cutadapt v2.5 (Martin, 2011).Finally, we used the BWA-mem algorithm (Li, 2013) in bwa v0.7.17 (Li & Durbin, 2009, 2010) to align reads to the painted turtle reference genome v3.0.3 downloaded from NCBI, which was generated with a female painted turtle from southern Washington state, USA (Shaffer et al., 2013).
We further filtered the variant and invariant VCFs using VCFtools v0.1.14(Danecek et al., 2011).We removed individuals with low average depth (<2×), excluded variants with mean genotype quality <20.0, and restricted to biallelic sites where applicable.We then removed indels (filtered above), as well as 3 bp upstream and downstream of the indel, from both the SNP and the invariant sites VCFs.Next, we removed SNPs that deviated strongly from Hardy-Weinberg equilibrium (HWE; p < 10 −7 ).As the sampled locations are extremely geographically distant, we set this threshold low to reduce the likelihood of removing real variants generated by substantial population genomic structure (e.g., Martin et al., 2018).We removed sites fixed for the non-reference allele and both variant and invariant sites with >25% of genotypes missing across individuals.This procedure yielded 41,546,386 genotyped sites across the painted turtle genome (genome size ~2.6Gb;Shaffer et al., 2013).We then applied a stricter set of hard filters for population genomic analyses (Pfeifer et al., 2018), which are outlined here.First, we assigned SNP genotypes with a depth less than 4 or greater than 100 as missing to avoid including any low coverage reads or paralogous regions.Second, we partitioned the painted turtle genome scaffolds into 1.5 kb blocks and assigned all SNPs and monomorphic sites to their corresponding blocks using a custom R script (v3.6.3;R Core Team, 2020) with "dplyr" v0.7.5 (Wickham et al., 2018).We removed any blocks with a median distance between SNPs less than 3 or with less than 150 genotyped sites (both variant and invariant) to prevent the inclusion of SNPs in repetitive or misaligned regions.This approach produced a dataset with 37,832,754 invariant and 566,852 variant sites over 153,416 blocks, which was used for calculating nucleotide diversity and percentage of polymorphic loci.
For analyses using only variant sites, we further reduced missing data among SNPs and individuals by removing SNPs with greater than 10% missing genotypes and confirmed that individuals had greater than 75% of SNPs genotyped in the resulting dataset.The resulting VCF file contained 253,365 SNPs across 161 individuals.
Then, we applied a minor allele frequency (MAF) filter such that singletons were excluded, and we retained only one SNP with the least missing data per 1.5 kb block to reduce linkage among markers, for a final dataset of 48,060 SNPs used for population genomic analyses.A summary of filtering steps and number of sites can be found in Table S2, and all custom scripts can be found in the repository (Judson et al., 2023).
To understand population genomic variation, we performed a principal components analysis (PCA) in R using "SNPRelate" v1.22.0Zheng et al., 2012) downloaded from BiocManager v1.30.10 Morgan, 2019).We further assessed groupings in our data using the program STRUCTURE v2.3.4 (Pritchard et al., 2000) with the admixture model and the following parameters: 100,000 burn-in and 100,000 repetitions for each run of K from K = 1 to K = 10, correlated allele frequencies among populations, the alternative, populationspecific ancestry prior (which entails allowing a separate alpha for each population; POPALPHAS = 1), and an initial alpha value of 0.14, where alpha is the result of 1/K and K is the assumed number of populations.These settings were chosen to account for unequal sampling sizes from the seven locations (Wang, 2017).We chose K = 10 to allow some subdividing within sample locations (N = 7), but did not expect population structure within sampling locations given the focused spatial sampling.Each value of K was run 20 times using the above parameters, as recommended by Gilbert et al. (2012).To select the optimal K from the STRUCTURE runs, we used multiple methods (Janes et al., 2017): we assessed the posterior probabilities for each K (Pritchard et al., 2000) and plotted Ln Pr (X|K) (Pritchard & Wen, 2003) and ∆K (Evanno et al., 2005) using CLUMPAK v.
We built a maximum likelihood population tree using TreeMix v1.13, which uses a graph-based approach to model both gene flow and population splits (Pickrell & Pritchard, 2012).We used 500 bootstrap replicates and assumed no migration given the distance among sampled locations.We tested for isolation by distance (IBD) using a Mantel test comparing straight-line geographic distance (km) between locations to pairwise genetic differentiation, F ST / (1−F ST ) (Rousset, 1997), implemented in "vegan" v.2.5.7 (Oksanen et al., 2020) with 10,000 permutations.We also assessed correlations between pairwise distance and differences in annual mean temperature (BIO1) and precipitation (BIO12; Table S1) between locations using a Mantel test, as strong correlations could influence our attribution of population structure to demographic history and selection.

| Demography
Runs of homozygosity (ROH) can be informative of both past demographic changes (e.g., bottlenecks) and recent inbreeding within populations (reviewed in Ceballos et al., 2018).Specifically, long tracts of homozygosity are generated by recent inbreeding among individuals, while more numerous short ROH can be indicative of population bottlenecks farther back in time.We assessed runs of homozygosity in the variant dataset with all filters except the final filter retaining one SNP per block using PLINK v1.9 (Purcell et al., 2007).Before analysis, we separated individuals by sampling location and removed SNPs that were fixed within each location as recommended by Shafer et al. (2016) using GATK (v4.0.4.0, DePristo et al., 2011;McKenna et al., 2010).We used the -homozyg algorithm in PLINK with the following settings: window size of 25 SNPs, defined windows as homozygous if one or less heterozygous sites were found in the window, allowed no more than five missing sites within a window, and set the window threshold for an SNP being defined as in a homozygous segment of a chromosome (hit rate) at 0.05.For a region to be defined as an ROH, segments must include ≥25 SNPs, cover ≥1000 kb, contain a maximum of one heterozygous site, contain minimum SNP density of 1 per 50 kb, and have maximum distance between two adjacent SNPs of ≤1000 kb (Grossen et al., 2018).
Finally, we used StairwayPlot v2.1.1 (Liu & Fu, 2015, 2020) to infer the demographic history of painted turtles from the four sampled locations with a larger sample size (N ≥ 27).We used the variant dataset without the MAF filter or the filter retaining one SNP per block (253,365 SNPs).To create an equivalently filtered monomorphic site comparison for correctly assessing the total number of genotyped sites for demographic analysis, we further filtered the invariant site VCF to remove sites with greater than 10% missing genotypes (Table S2).We created population-specific VCFs using VCFtools v0.1.14(Danecek et al., 2011).To create population-specific folded site-frequency spectra (SFS), we used the python script easySFS (Gutenkunst et al., 2009;Overcast, 2023).We downsampled to maximize the number of SNPs retained and set total length to the sum of variant and invariant sites (26,707,837).We used a generation time of 11 years (Reid et al., 2019) and a mutation rate per site per generation of 4.61 × 10 −9 .This painted turtle mutation rate was based on parent-offspring trio sequencing of turtles from the Illinois sampling site in this study (Bergeron et al., 2023).

| Outlier detection
To detect outlier loci potentially linked to selection across the sampled locations, we first applied the Bayesian approach of BayeScan v2.1 (Foll & Gaggiotti, 2008).For this analysis, we used the variant dataset that contained all filtering steps described above except the final filter retaining one SNP per block (212,670 SNPs; Table S2).We included these potentially linked SNPs to give increased support for any detected outlier SNPs; if selection is causing a deviation from the population mean F ST , the surrounding SNPs should deviate in a similar manner (Storfer et al., 2018).We used a burn-in of 50,000, 100,000 iterations, and prior odds of 100 for the neutral model to reduce the number of false positives.After running BayeScan twice, we evaluated convergence using the R package CODA (Plummer et al., 2006) and identified any loci with a q-value less than 0.05.We performed two additional BayeScan runs with the same settings on a dataset excluding Kansas turtles and two admixed Oregon turtles (hereafter, "reduced dataset"; see Section 3), given the small sample size from Kansas.
We also used a separate approach to detect outlier loci, "pcadapt" v4.3.3 (Privé et al., 2020), which uses a PCA approach to find markers that are significantly related to population structure.After confirming that linkage disequilibrium did not influence outlier detection (Luu et al., 2020), we used the same dataset as the BayeScan analysis including 212,670 SNPs.After assessing values of K from 1 to 20, we used Cattell's rule to select K = 5; higher PCs split individuals from the same sampling location (Figure S2).Outliers were determined using two p-value cutoff methods: q-values with α = .05 (similar to BayeScan) and using Bonferroni correction with α = .001.
We ran the same analyses with the reduced dataset.While we set the threshold for the HWE filter to be low (p < 10 −7 ), the use of an HWE filter can exclude loci that deviate due to population structure and that may be under selection.Thus, we added loci previously excluded for excess homozygosity (see Supporting Information) to previously included loci (217,821 total), excluded Kansas and admixed Oregon turtles, and used pcadapt to detect outliers (Bonferroni correction α = .001).We assessed the genomic location, including whether outliers were in genes or exons, for outlier SNPs using BEDTools v2.27.1 (Quinlan & Hall, 2010).We used VCFtools v0.1.14(Danecek et al., 2011) to calculate F ST for all loci included in outlier detection to assess patterns across the genome.Finally, we assessed whether the gene ontology (GO) functional category of genes represented in the outlier list generated from pcadapt were statistically overrepresented when compared to all represented genes in the SNP dataset using PANTHER v17.0 with FDR correction (Mi et al., 2019;Thomas et al., 2022).
Given painted turtles exhibit two remarkable aspects of biology, TSD and extreme anoxia tolerance (Janzen, 1994;Ultsch & Jackson, 1982), that conceivably could be linked to adaptation to macro-environmental conditions, we further explored potential adaptation at the candidate gene level.First, we queried coverage of genes in our RADseq dataset that are implicated in the sex-determination mechanism, as these genes may play a role in differentiation of the transitional range of temperatures across these populations (Carter et al., 2019).The genes we chose to assess for variation were Doublesex and Mab-3-Related Transcription factor 1 (DMRT1), the expression of which was recently confirmed to be the key sex-determining factor in birds (Ioannidis et al., 2021), and lysine demethylase 6B (KDM6B), a gene that promotes transcription of DMRT1 in red-eared slider turtles (Ge et al., 2018).

| Sampling and RAD variants
In total, 164 painted turtle hatchlings representing seven sampling locations were sequenced for this project (Table 1; NCBI BioProject PRJNA789046, Judson et al., 2023), with an average of 3.77 million high-quality reads per individual and 3.5 million reads mapped to the reference genome.Joint genotyping yielded 3.53 million SNP loci; individual depth filters and site filters resulted in a final dataset of 48,060 SNPs from 161 individuals with an average of 4.43% missing genotypes across individuals (Tables 1 and 2; Table S2).With such a large number of SNPs, a representation of genetic diversity to understand population genetic structure in these locations was likely captured even in locations where the number of individuals sampled was small (Nazareno et al., 2017).

| Population genomics of western painted turtles
Summary statistics suggested that genetic variation varied widely among populations.Polymorphism ranged from 0.48% to 1.23% (  1a).
The second PC separated the New Mexico individuals from the more northern populations, and the third PC separated New Mexico individuals from Nebraska, Kansas, and Minnesota individuals (Figure S3).
The results of the various estimators of best K for the STRUCTURE runs suggested six genetically distinct groups were sampled (Figure 1c; S8).All sampled locations were distinct in the STRUCTURE plot except individuals from Oregon and Idaho, which grouped as a single population.∆K suggested the optimal K was 2 (Figure S9), but this result is not convincing given the frequent recovery of K = 2 using this method (Janes et al., 2017) and other estimators suggesting K = 6.
Painted turtles from the Oregon and Idaho locations were similar in allele frequencies; these individuals overlapped in the PCA and displayed low pairwise F ST and d xy (Table 3).between Oregon and midwestern populations (IL, MN; Figure 1).We calculated individual relatedness using PLINK, and both individuals appeared not to be closely related (see Supporting Information).
The population tree from TreeMix analysis explained 99.8% of the covariance in allele frequencies among locations.There was substantial drift among populations, with Oregon and Idaho grouping closely and the remaining populations showing branching consistent with their east-west location (Figure 2).The Mantel test of IBD supported a significant relationship between pairwise geographic distance and pairwise genetic distance (p = .012;Figure 1b).Mantel tests indicated no correlations between pairwise distance and differences in annual mean temperature (p = .66)and precipitation (p = .89).

| Demography
We assessed the demographic history of western painted turtles using both patterns of ROH and the site-frequency spectrum.We found that the total length of ROH was correlated with the number of ROH (R 2 = .9763;Figure 3a).3b).Instead, populations varied in the total length of the genome that resides in short and intermediate ROH, with Oregon and Idaho individuals having much more numerous ROH <5 MB than other populations (Figure 3c).This result is more consistent with past bottlenecks having generated the observed ROH than with recent inbreeding, particularly when coupled with the linear increase of both ROH length and number with geographic distance among populations (Figure 3a).
To further investigate whether bottlenecks could explain the patterns of ROH across populations, we used StairwayPlot2 to assess demographic history in the four sampling locations with larger sample sizes (Illinois, Minnesota, Nebraska, and Idaho, Table 1).We found that the western populations in Nebraska and Idaho show clear evidence of a bottleneck occurring during and after the Last Glacial Maximum (LGM; Figure 4; Figures S10 and S11), consistent with the ROH results above.Additionally, the severity of the bottleneck in the Idaho population is much greater than that in Nebraska, consistent with serial founder effects during range expansion.
Effective population size estimates are large, but the average values match those of the combined Chrysemys picta bellii group in Reid et al. (2019).While the mutation rate is directly from a population used in this study (Bergeron et al., 2023), generation time could vary in painted turtle populations (Spencer & Janzen, 2010;Wilbur, 1975), and we cannot account for variation in sex ratios, so timing of bottlenecks and estimates of effective population size may not be precise.

| Outlier detection and gene-specific variation
After validating convergence, BayeScan identified 43 F ST outlier loci at the 0.05 FDR for the dataset including all individuals (Figure S12, Table S3) and 53 outlier loci for the reduced dataset (Figure S13, Table S4; discussed here).When we investigated the location of these SNPs in the genome for proximity to genic regions, 34 of the 53 outliers corresponded to genic regions, and two of those were located in a coding region according to the BEDTools query.According to Ensembl's Variant Effect Predictor (VEP; release 110, Cunningham et al., 2022), one variant was downstream of the LDLRAD3 gene, and the other variant was a synonymous SNP in the RBBP6 gene.We investigated genotypic patterns among populations at the 53 outlier loci and found that all 53 outlier loci were segregating in the Illinois population.Further, these loci were often fixed in the sampling sites farthest away from Illinois (Oregon, Idaho, and New Mexico populations).
The second outlier detection method (pcadapt) with q-value α = .05(similar to BayeScan) detected 2802 outliers with the full dataset and 2597 outliers with the reduced dataset.These outliers overlapped with the BayeScan results at 10 SNPs using the full dataset (Table S5) and 13 in the reduced dataset (Figure S14, Table S6), though some of the overlapping SNPs were different.Genes that overlapped in both outlier lists with BayeScan outliers were TNR, ZNF436 -like, CASKIN, and SIM2.We focused on a narrower set of outlier loci using Bonferroni correction with α = .001and found 228 outliers in the reduced dataset (Figure S14).Plotting F ST across the genome (Figure 5, Figure S14) did not show a concentration of high F ST or outliers in any specific genomic regions.Outliers detected with pcadapt tended to include the SNPs with the highest F ST , while BayeScan outliers had lower F ST .Including loci previously excluded with the HWE filter resulted in 476 outliers.New outliers from this SNP set had the highest F ST of all loci (Figure 5), confirming that the HWE filter removed loci strongly differentiated among populations.
Two hundred and twenty-eight of these SNPs (48%) were in 166 unique genes, and 10 were within exons.Of the SNPs that were not downstream of a gene according to VEP, four were synonymous mutations and three were missense mutations (in ZAR1l, AKAP6, and SCML2).The missense mutation in ZAR1l appears to be the result of a de novo mutation in the New Mexico population with a high frequency (0.75).Genotypic patterns were more varied than in the BayeScan outliers: some loci were fixed in the Illinois population and variable in populations farther west.The overrepresentation test in PANTHER found no statistically significant GO category overrepresented in the outlier SNP genes (Figures S15 and S16).
We also assessed variation at specific genes implicated in TSD and anoxia tolerance in painted turtles.Although we did not find coverage of KDM6B, DMRT1 was sequenced in our dataset with 22 SNPs.None of these SNPs were in coding regions, and these SNPs did not appear to be correlated with sex of sampled hatchlings, though most hatchlings were incubated in field conditions and thus we cannot assess the influence of genotype on sex at the pivotal temperature of ~28°C (when theoretical sex ratio is 1:1; Telemeco et al., 2013).Second, we had coverage of ANKRD1, ATF3, CA1, F I G U R E 4 Demographic history from StairwayPlot2 analysis shows evidence of bottlenecks in sampled painted turtle populations.N e is effective population size.Populations are labelled with their state abbreviation.Timing of Last Glacial Maximum shaded in grey (Clark et al., 2012).

| Population genomics and demography of painted turtles
In many widespread species, populations are characterized by strong genetic differentiation from the co-occurring forces of selection and genetic drift (e.g., Rödin-Mörch et al., 2019).During range expansions, and in the absence of significant gene flow, reduction in genetic diversity is often observed at range edges due to serial founder effects (DeGiorgio et al., 2011).Spatial bottlenecks, which may be expected in turtles that utilize freshwater habitat that is not contiguous across the western United States, can drive patterns of stronger genetic structure in populations at range edges than those from the original range (Excoffier et al., 2009;e.g., Graciá et al., 2013).Further, these bottlenecks increase the probability of genetic surfing, which can easily be misinterpreted as selection along clines in environmental conditions along the expansion front (Hoban et al., 2016).Thus, before assessing signatures of selection across the genome, we first investigated the population genomic structure of the western range of painted turtles to test past biogeographic hypotheses of the species' range in North America.
Previous analyses of population genetic differentiation among western painted turtle populations have consistently revealed low genetic divergence.Starkey et al. (2003) investigated the phylogeography of painted turtles across the species' range using a single mitochondrial marker (control region CR) and reported pairwise divergence of 0.151-1.368%within the western populations.Over 60 individuals sampled had the same haplotype, with 28 individuals sharing haplotypes with one or two differences from this main haplotype (Starkey et al., 2003).The authors concluded that this lack of divergence suggests a recent radiation of painted turtles into the western United States and Canada, despite fossil evidence suggesting their presence in the Great Plains as long ago as 1.9 million years (Holman, 1995).A more recent study expanded sampling to include a nuclear marker, PAX-P1, and found that the western range of painted turtles also included low haplotype diversity, though populations did not distinctly group together (Jensen et al., 2015).Finally, Reid et al. (2019) included 11 microsatellite loci to further elucidate population differentiation.Although fewer populations were sampled west of the Mississippi River (N = 3), the results were similar in pattern to those presented here; populations were arranged from east to west in a principal coordinates analysis, and there was a significant relationship between F ST and geographic distance (r = .85,Reid et al., 2019).Ecological niche modelling and demographic modelling suggested a single glacial refugium followed by population expansion (Reid et al., 2019), in contrast to Jensen et al. (2015), which did not find evidence of population expansion.
Despite the low genetic diversity found in the western range by previous studies, our study genetically distinguished most populations except turtles from Oregon and Idaho.Indeed, the F ST values found among populations were high in some cases, particularly when comparing northwestern and New Mexico individuals (F ST = 0.31).Given the filtering of SNPs in this study is conservative, and filters such as the HWE filter can remove loci that are strongly diverged, these F ST values are likely underestimates (e.g., Pearman et al., 2022).These populations reside at the range edges for painted turtles, and as such support the pattern of stronger genetic divergence at range edges compared to the range centre (Excoffier et al., 2009).Importantly, we detected patterns of genetic variation largely consistent with serial founder events driving population genomic structure.The evidence of timing and severity of bottlenecks may have slightly biased our ROH to be lower eastern populations (by inflating heterozygosity, Thorburn et al., 2023).However, the number of markers used and the agreement with demographic results, which have been found to be less sensitive to reference genetic distance when divergence is less than 3% between reference and target species (Prasad et al., 2022), suggest that the increasing number of short ROH from eastern to western populations is likely a close representation of ROH in western painted turtles (Shafer et al., 2016).The conclusion of Reid et al. (2019) that the species' widespread range is the result of the spread of populations westward after the retraction of glaciers following the late Pleistocene is confirmed comprehensively by the genome-wide sequencing performed here.In sum, our results suggest that serial founder effects during range expansion largely shaped genetic variation in western painted turtles.
Serial founder effects characterize not only the genetic structure of painted turtles but also the other Testudines that experienced postglacial colonization of northern ranges.For example, in the spur-thighed tortoise, populations in the northern range were characterized by reduced genetic diversity, strong population differentiation, and clinal variation from southern to northern populations in allele frequencies, all patterns consistent with genetic surfing (Graciá et al., 2013).Similar patterns were reported at microsatellite loci in the European pond turtle (Pereira et al., 2018).
We also find evidence consistent with multiple instances of human-mediated translocation of painted turtles from one population to another.We found two admixed Oregon individuals which appear to be the result of admixture between a northwestern turtle and one sourced from a Midwestern population.Individuals sequenced in a previous study also indicated potential evidence of human-mediated dispersal into the western range (Jensen et al., 2015).This result highlights the importance of managing and reducing the release of pet turtles into native populations, which remains a common practice and has led to both the establishment of invasive turtle populations and the spread of pathogens globally (e.g., Héritier et al., 2017).This concern was raised in the conservation assessment for the western painted turtle in Oregon (Gervais et al., 2009), and we can now establish that introductions have occurred even in areas that previously were not suspected of introductions in Oregon.

| Genomic patterns of selection in painted turtles
Another goal of this study was to evaluate genomic evidence of selection in western populations of painted turtles, which might be expected given the difference in climatic conditions experienced and the phenotypic variation among populations.Examples of phenotypic variation in these western locations include mean adult female body size and average clutch sizes, which vary latitudinally (Iverson & Smith, 1993), and the amount of patterning on the plastron and scute alignment patterns (Ultsch et al., 2001).Additionally, hatchlings from these sampled locations raised in common-garden conditions exhibited population-specific incubation durations and hatchling body mass (Bodensteiner et al., 2019).When we assessed genes that we a priori hypothesized would be under different selective pressures in the widespread sampling locations, we found little  (Hoban et al., 2016;Hofer et al., 2009).Indeed, all F ST outlier tests perform poorly, with many false positives, when range expansion shaped population genetic structure (Lotterhos & Whitlock, 2014;Luu et al., 2017).Our BayeScan outlier loci are similar to those found in invasive house finch populations, which could also be explained by genetic drift from founder events alone, rather than selection (Shultz et al., 2016).
Given the difficulties of detecting outlier loci in species with a history of range expansion, we used an additional method of outlier detection, pcadapt, which does not have the same assumptions with respect to demographic history as BayeScan.pcadapt instead detects outliers by finding loci that are highly correlated with ordination axes of the PCA rather than F ST -related statistics (Privé et al., 2020).Analysis with pcadapt resulted in 228 outliers excluding HWE-filtered loci and 476 including them, many of which had allelic patterns that did not follow the predicted east-to-west pattern and yet were still strongly differentiated among populations.
Additionally, one of the nonsynonymous mutations appears to be a de novo mutation in the New Mexico population that has risen to high frequency.The gene, ZAR1l, is important for coordination of mRNA translation during early embryonic development and is conserved across vertebrate lineages (Heim et al., 2022;Sangiorgio et al., 2008).Some of these outliers, particularly those that appear to have arisen de novo in specific populations, have reached high frequency, and were not segregating in the eastern sampling locations, may be the result of selection as painted turtles spread to inhabit the diverse environmental conditions of the western United States.
However, we note that while pcadapt has lower FDR in range expansion scenarios than BayeScan, it is also prone to false positives in species with history of range expansion (Luu et al., 2017), and these loci should be studied further with alternative methods.
In species with long generation times, it can be difficult to assess local adaptation with reciprocal transplant experiments.
Genomic methods can provide candidate loci for further investigation, but in cases of range expansion, it is clearly difficult to distinguish genetic surfing from selection.For example, in Alpine ibex, a species that experienced a recent bottleneck due to overharvesting, the influence of bottlenecks on genetic variation largely prevented the accurate identification of loci under selection (Leigh et al., 2021).Further, in geographically isolated populations with low levels of gene flow, we would anticipate that local adaptation of a trait would be achieved through divergence at very few large-effect loci and many more small-effect loci (Savolainen et al., 2013).Small-effect loci are often missed in F ST outlier tests (Kemper et al., 2014), and thus we may not be able to effectively identify evidence of local adaptation in the phenotypes that vary among populations using these methods.Finally, in this study we used reduced-representation sequencing, which is an excellent tool for understanding population genetic structure (Catchen et al., 2017), but may miss regions of large differentiation consistent with phenotypic adaptation.Whole-genome sequencing may provide increased resolution for loci of large effect, and genomewide association studies may be a better method for addressing local adaptation in complex phenotypes in species with a history of serial founder effects (Berg & Coop, 2014).
We studied the painted turtle, a widespread ectotherm, to further our understanding of how certain species inhabit entire continents.One frequent feature of widespread temperate ectotherms is a history of isolation in refugia, followed by postglacial colonization (Shafer et al., 2010).Indeed, we found serial founder effects consistent with range expansion following the LGM in this study.While local adaptation can facilitate colonization, long-lived, slowly evolving lineages may require much longer to locally adapt in order to facilitate postglacial spread.Alternatively, phenotypes that ensure survival in a wide range of environmental conditions may predate glaciation.In ectotherms, cold tolerance may be linked to achieving a widespread range during postglacial range expansion (Horreo & Fitze, 2018).Painted turtles possess multiple exceptional phenotypes to survive cold temperatures (e.g., anoxia tolerance and supercooling tolerance of hatchlings).These phenotypes are present across latitudes, suggesting they were acquired at least in some form before range expansion (Packard & Packard, 2003;Reese et al., 2004).Thus, long-lived, widespread temperate ectotherms may achieve their large ranges through pre-existing phenotypes.

B EN EFIT-S H A R I N G S TATEM ENT
Benefits from this research come from the generating and public release of genomic data and the documentation of all code used for analyses, as described in the Data accessibility statement above.

O RCI D
Jessica M.
Genetic variation among painted turtle populations is consistent with genetic drift.TreeMix results of the tree that best explains allelic covariance among individuals with no migration; SE is standard error.ROH < 5 megabases (MB) or long ROH >5 MB, the latter of which indicate recent inbreeding (e.g., Grossen et al., 2018; McQuillan et al., 2008).When comparing long ROH, populations did not vary in the total length of long ROH in the genome, as would be expected under a scenario of increased inbreeding among individuals within certain populations (Figure

F
Patterns of runs of homozygosity (ROH) are consistent with bottlenecks in western painted turtle populations.(a) Relationship between the total length of all ROH in megabases (MB) and number of ROH in each individual, coloured by population.(b) Boxplot of total ROH length in MB including only ROH greater than 5 MB for each population.(c) Boxplot of total ROH length in MB including only ROH less than 5 MB for each population.

F
ST of sampled loci across the assembled chromosomes of the painted turtle genome (excluding unplaced scaffolds) with outlier loci shown in black.Outliers were detected using pcadapt with a Bonferroni correction (α = .001).Plotted F ST and outlier analysis excluded Kansas turtles and two admixed Oregon turtles.SYR61, ENDOD, PTGS2, S100A1, and TNC, which are all identified as differentially expressed during anoxia or re-oxygenation in painted turtles(Fanter et al., 2020).One SNP in CA1 was located in a coding region and changed an amino acid from proline to glutamine.This SNP was found in five Illinois, seven Minnesota, and one Nebraska individual.Four other SNPs associated with amino acid changes in coding regions of CYR61 and TNC were found at low frequency in one to three individuals.4| DISCUSS IONWith genomic data, we can now investigate in greater detail the genomic architecture of local adaptation among populations of widespread species and understand past demographic changes that led to species distributions.In this study, we used RADseq to sequence thousands of markers in painted turtles from seven locations across the western range of the species.We found that populations were genetically differentiated, with patterns of divergence consistent with geographic distance among populations.Demographic analysis suggested that western populations of painted turtles are the result of serial founder effects during range expansion westward following the LGM, consistent with the decrease in genetic variation and increase in shorter ROH from eastern to western populations.This history of range expansion poses a serious challenge to accurate identification of loci under selection: while the results from one outlier detection method can be explained by serial founder effects, another method found that many more outliers may have resulted from local adaptation and thus warrant further investigation.Together, we found candidate loci that are putatively under selection or are the result of genetic surfing during range expansion, and we conclude that serial effects have shaped genomic divergence among western populations of painted turtles.
in the western populations, with Idaho turtles having experienced the most recent and severe bottleneck following the LGM, and the decreasing genetic variation with distance from eastern populations both support range expansion from east to west.Further, the number of short ROH increased, consistent with bottlenecks leading to the current low levels of genetic variation in Oregon and Idaho, and to a lesser extent in the New Mexico population.ROH analysis with RADseq data is only an approximation of ROH dynamics across the genome, and the reference we used (from the northwestern US) Future work should continue to investigate the role of pre-existing phenotypes, local adaptation, and phenotypic plasticity in facilitating the large geographic ranges seen in some ectotherms.In conclusion, we clarified past demography and investigated signatures of selection in a long-lived, widespread ectotherm that faces an uncertain future as the global climate continues to rapidly change.AUTH O R CO NTR I B UTI O N S Jessica M. Judson performed laboratory work, analyses, and wrote the first draft of the manuscript.Luke A. Hoekstra assisted with coding and generating the sequencing data.Fredric J. Janzen acquired funding and performed the initial sampling from hatchlings.All authors contributed to conceptualizing the project and editing the manuscript before approving of the final submitted manuscript.ACK N O WLE D G E M ENTS This work was supported by grants from the National Science Foundation (LTREB DEB-1242510 and IOS-1257857) and the National Institutes of Health (R01-AG049416 to A. Bronikowski).We thank the many people who participated in the collection of samples from sampling locations (B.Bodensteiner, J. Iverson, C. Milne-Zelman, T. Mitchell, J. Refsnider, D. Warner, and the high school, undergraduate, and graduate students at the Turtle Camp sampling location), those who provided comments on drafts of this manuscript (A.Bronikowski, members of the Janzen and Bronikowski lab groups, A. Toth, K. Roe, J. Nason, and five reviewers that greatly improved the manuscript), E. Thornhill for writing the protocol included in the supplemental information, and both E. Thornhill and K. Hagerty for assistance with optimizing the protocol for DNA extraction.This is Kellogg Biological Station contribution number 2304.O PE N R E S E A RCH BA D G E SThis article has earned an Open Data badge for making publicly available the digitally-shareable data necessary to reproduce the reported results.The data is available at https:// doi.org/ 10. 25380/ iasta te.17865320.DATA AVA I L A B I L I T Y S TAT E M E N TRaw sequencing data are available under BioProject PRJNA789046 from the NCBI Sequence Read Archive (SRA), which includes sample and sequence file metadata.Permit numbers, code, genotype files, and results of analyses are available from Iowa State University's data repository (hosted by figshare), which can be found at https:// doi.org/ 10. 25380/ iasta te.17865320.

. To call variants, we used a modified
Sampling information for painted turtle (Chrysemys picta) hatchlings in this study arranged by longitude from east to west.Location abbreviations are the state abbreviation, which is used in following figures; N is sample size and Final N is the number of samples remaining after filtering for sequence quality.

Table 2 )
, with the lowest values of polymorphism and private alleles in the northwestern populations and the highest in the easternmost (Illinois) site.The Illinois population had much greater genetic variation than other sampled locations, with private alleles an order of magnitude more numerous than any other population.Weir and Cockerham's overall weighted F ST across populations was 0.136, and average F ST across loci was 0.089.Pairwise F ST , d xy , PCA, and STRUCTURE results all depicted a similar pattern of genetic structure across the sampled populations.Pairwise F ST values suggested moderate to high differentiation among many populations (Table3).However, Illinois, Nebraska, Minnesota, and Kansas had lower levels of differentiation than other pairwise comparisons, and Oregon and Idaho pairwise F ST was similarly low.
The first three principal components (PC) explained 20% of the variation in the dataset, with percentages of 9.4%, 6.2%, and 4.5% of the variance explained, respectively.The first PC was consistent with geographic positioning of populations from east to west, with Illinois and the northwestern individuals farthest from each other (Figure Two individuals from Oregon were distinguished from the other Oregon individuals in both the PCA and STRUCTURE analyses and may be the result of admixture a All statistics, with the exception of π and % Poly, use the final 48,060 SNP dataset.π and % Poly were calculated with the block-filtered dataset containing 566,852 variant and 37,832,754 invariant sites. Pairwise F ST and d xy across all painted turtle sampling locations.Pairwise F ST estimates among locations are below the diagonal, and interpopulation nucleotide diversity estimates (d xy ) are above the diagonal.Pairwise F ST was calculated with the final filtered dataset of 48,060 SNPs; d xy was calculated with the block-filtered dataset containing 566,852 variant and 37,832,754 invariant sites.
The relationship was linear, with individuals from the most genetically variable population (Illinois) having fewer ROH, individuals from Oregon and Idaho having more of their genome in ROH and more numerous ROH, and other populations displaying intermediate ROH numbers consistent with their geographic distance from the Illinois population (Figure 3a).ROH was then partitioned into either small and intermediate TA B L E 3 variation in genes related to TSD or anoxia tolerance across individuals that would support local adaptation for these traits.BayeScan found only 53 outlier loci (out of 212,670 SNPs), all of which had Berg, J. J., & Coop, G. (2014).A population genetic signal of polygenic adaptation.PLoS Genetics, 10, e1004412.https:// doi.org/ 10. 1371/ journ al. pgen.1004412 Bergeron, L. A., Besenbacher, S., Zheng, J., Li, P., Bertelsen, M. F., Quintard, B., Hoffman, J. I., Li, Z., St Leger, J., & Shao, C. (2023).