Parsing variance by marker type: Testing biogeographic hypotheses and differential contribution of historical processes to population structure in a desert lizard

A fundamental goal of population genetic studies is to identify historical biogeographic patterns and understand the processes that generate them. However, localized demographic events can skew population genetic inference. Assessing populations with multiple types of genetic markers, each with unique mutation rates and responses to changes in population size, can help to identify potentially confounding population‐specific demographic processes. Here, we compared population structure and connectivity inferred from microsatellites and restriction site‐associated DNA loci among 17 populations of an arid‐specialist lizard, the desert night lizard, Xantusia vigilis, in central California to test among historical processes structuring population genetic diversity. We found that both marker types yielded generally concordant insights into population genetic structure including a major phylogenetic break maintained between two populations separated by less than 10 km, suggesting that either marker type could be used to understand generalized demographic patterns across the region for management purposes. However, we also found that the effects of demography on marker discordance could be used to elucidate population histories and distinguish among competing biogeographic hypotheses. Our results suggest that comparisons of within‐population diversity across marker types provide powerful opportunities for leveraging marker discordance, particularly for understanding the creation and maintenance of contact zones among clades.


| INTRODUC TI ON
Population-level processes are a major force in structuring specieslevel genetic diversity across space and time. Especially in the absence of dispersal barriers, comparative tests across multiple species offer important insight into the differential effects of historical and demographic processes on spatial signatures of genetic diversity (Avise, 2009;Knowles, 2009). These comparisons give rise to following questions: why do geographic barriers affect different species or populations in different ways, even for taxa with similar dispersal capabilities and habitat usage (Charlesworth et al., 2003;Irwin, 2002)?
What constitutes a barrier to dispersal, and by what mechanisms do barriers create and maintain these patterns in natural populations living in complex-or simple-habitats? Why do these potential geographic barriers seem to affect different species or even populations within a single species in different ways, even when the underlying biology of those species suggests similar dispersal capabilities and habitat usage (Myers et al., 2019)?
One powerful way to test among competing processes that structure standing genetic diversity is to compare marker types with different (a) mutational properties and (b) response to demographic history (Fischer et al., 2017;Miller et al., 2014). For example, microsatellite markers tend to overestimate within-population diversity due to their rapid mutation rates, large numbers of alleles and the tendency for researchers to select highly polymorphic loci (Putman & Carbone, 2014;Queirós et al., 2015). Single nucleotide polymorphisms from restriction site-associated DNA (RAD) markers are likely to be in slower mutating sections of the genome, since many bioinformatic pipelines select fragments that appear in multiple individuals and finding homologous DNA fragments across many individuals or species relies on consistent enzyme cut site sequences motifs (Lowry et al., 2017). These differences allow within-population comparisons between the marker types, with the goal of identifying the relative timing of demographic events such as population isolation.
Since microsatellites have a higher mutation rate than RAD loci, private microsatellite alleles should on average emerge in an isolated population before private RAD alleles. For example, comparing the two markers within a single population accounts for differences in effective population size and other demographic factors, so we should be able to group timing of population isolation into three separate phases: (1) no private alleles, (2) relative excess of private microsatellite alleles and (3) private alleles for both marker types. Using these three bins, we can tease apart demographic events that otherwise produce indistinguishable genetic signatures (Figure 1), even when both marker types reveal similar phylogeographic patterns in natural populations (DeFaveri et al., 2013;Gärke et al., 2012). Overall, leveraging variation in marker response to historical processes is a powerful, but underutilized, approach to population and evolutionary genetic analyses.
Retrieving information from different types of genetic markers with various characteristics, biases and theoretical behaviours has a long history in population genetics. Notable marker types include electrophoresis polymorphisms in proteins and DNA fragment lengths, Sanger sequence data, microsatellites and next-generation sequencing. In tandem with these approaches, theoretical developments helped to generate testable predictions, including the value of comparing predictions generated by the infinite site and infinite allele models (Griffiths & Tavaré, 1994;Kimura, 1971;Li, 1977;Tajima, 1996). Infinite site model assumes that the states a locus can take on are limited, but there are so many possible mutational loci that sample-wide homoplasy is unlikely. In contrast, the infinite allele model assumes a restricted number of loci that can mutate, but that each locus can take on an infinite number of unique states. Neither of these theoretical constructs occurs in the real world, but rather markers behave on a spectrum between them (Ewens, 1974). Infinite allele models generally capture the behaviour of microsatellites, restriction length polymorphism data and protein electrophoresis alleles, while the infinite site models are often better descriptors of sequence and SNP data.
Marker comparisons specifically within geographically complex and biodiverse regions create the strongest tests of both biogeographic hypotheses and the relative contribution of different historical processes responsible for spatial patterns of standing diversity (Buonaccorsi et al., 2001;Portnoy et al., 2010). California, with its complex geological structure due to tectonic activity, has long been considered an incubator for biogeographic diversity, including high endemism, species richness and complexity of population structure (Calsbeek et al., 2003;Feldman & Spicer, 2006;Gottscho, 2016;Lancaster & Kay, 2013;Wake, 2006). For central California, the most important abiotic factors affecting species distribution and population structure are (a) topographic structure and history, particularly the uplift of the north-south Sierra Nevada and southern Coast Ranges and the tectonic-induced rotation of the east-west Transverse Ranges (Chatzimanolis & Caterino, 2007;Feldman & Spicer, 2006;Lapointe & Rissler, 2005) and (b) rainfall gradients across this topography, especially the replicated rain shadow effects along eastern-facing mountain slopes (Hughes et al., 2009). The disjunct arid habitats in the central California rain shadows have high conservation importance and represent the northernmost distribution of many California desert species (Hill, 2003). These factors all combine to make central California biodiversity an ideal system for marker comparison studies.
In this study, we used an arid-associated lizard species (the desert night lizard, Xantusia vigilis) to identify which of two competing scenarios about the historical distribution and connectivity of populations in central California's xeric ecozones is most consistent with the available data. We then assessed contemporary genetic diversity to inform population management practices. Finally, we compared marker-specific patterns of diversity and allelic evolution to distinguish between two explanations for an unexpected combination of strong genetic affinities across large distances and a deep genetic break across a short geographic distance (<10 km). Together, these analyses help us understand how geography, habitat and history interact to control barriers to migration among populations. F I G U R E 1 Predictions of genetic marker variation supporting different historical biogeographic scenarios. (a) The concordance (or lack thereof) between values of observed heterozygosity (H o ) across RAD and microsatellite loci reflects different population histories (population size through time shown by curved lines in each quadrant, with time increasing towards the top). Most population histories will yield generally concordant levels of heterozygosity (blue points) across both marker types (lower left and upper right quadrants). Discordant heterozygosity values between the two marker types in which one marker is much higher than the other (lower right and upper left quadrants) suggest timing and strength of historical bottlenecks or founder events that will differentially affect these marker types. (b) The predictions from heterozygosity can be integrated with numbers and identity of private alleles across marker types to create a framework for testing among competing biogeographic hypotheses across populations (numbers in circles). Variation in correspondence between these values over time is due to the differential mutational and saturation properties of the two marker types. Recently isolated populations should have few private alleles at either marker type (1). Higher microsatellite mutation rates will generate microsatellite private alleles before RAD private alleles (2). Eventually, private RAD alleles will occur (3), and finally random mutations in other populations could result in size homoplasy, rendering some microsatellite private alleles undetectable (4). (c) In our data, heterozygosity discordance shows two North Transverse populations in or near the upper left quadrant, showing signs of more recent bottlenecks, while Panoche populations occur in the lower right quadrant, indicating an older bottleneck followed by a population rebound. Axes are placed at mean heterozygosity values for each marker type, and the grey line represents the results from a linear regression of RAD heterozygosity against microsatellite heterozygosity. (d) Discordance in private allele rates at our two marker types place the populations on a time-since-isolation axis. The Pinnacles populations show signatures of long-term isolation.

| Biogeographic hypotheses
Xantusia vigilis is a very small (adult mass = 1.5 g), secretive lizard commonly found throughout arid regions of the southwestern US and the Baja California peninsula of Mexico (see Figure 2 inset; Stebbins, 2003). Presumably due to limited dispersal rates and distances, low rates of inbreeding and high effective population size (Davis et al., 2011), this species maintains genetic signatures of historical processes over long periods of time and boundaries between demes tend to be well maintained (Sinclair et al., 2004).
However, this species also shows unexpected and dramatic patterns of genetic similarity among non-neighbouring populations (Leavitt et al., 2007). Xantusia vigilis is intimately tied to plant or rock cover objects, and several authors have suggested the importance of these specialized habitat associations in predicting historical distribution and resolving unexpected patterns in population connectivity (Noonan et al., 2013;Sinclair et al., 2004). These factors have also contributed to the presence of highly fragmented and disjunct F I G U R E 2 Collection locations, habitat and phylogeographic relationships of 17 sampled Xantusia vigilis populations. Lizards in the northern part of the range shelter under the monocot shrub Hesperoyucca whippeli or under bark of fallen grey pine (Pinus sabiana) logs, unlike the mixed Yucca (brevifolia, baccata, schidigera) sheltering sites found in the Mojave desert. The full geographic range of X. vigilis range and the portion of the range sampled here (grey box) is indicated in the inset map. Note that the two main phylogenetic clades meet across a short geographic distance in the Cuyama valley in the Transverse Ranges, demonstrating a biogeographic break that does not follow habitat breaks. The northern and southern clades are reciprocally monophyletic, so no directionality of north-south colonization can be inferred from the tree.
populations across the northern range of X. vigilis, which includes isolated populations in central California and southern Utah (see

| Predictions of marker variance
Diversity metrics of the markers we use here, microsatellites and RADseq data, can depart from co-linearity in several specific scenarios ( Figure 1a), which can improve our understanding of complex biogeographic scenarios. Due to their relatively fast mutation rate, microsatellites should have relatively higher allelic richness in populations that have fairly recently gone through an acute reduction in size, but have since rebounded demographically (Hoelzel, 1999;Martínez-Cruz et al., 2004). Alternatively, this pattern might be produced if a metapopulation with relatively low contemporary migration rates has recently received immigrants (Alexandri et al., 2017), although not all migration processes result in this pattern (Sunde et al., 2020;Zimmerman et al., 2020). In contrast, RAD data should have higher relative heterozygosity in very recently established or post-bottleneck populations compared to the same markers in nonbottlenecked populations. Rare alleles are disproportionately likely to drop out during founder events (Garza & Williamson, 2001). Since microsatellites can have more alleles per locus than sequence data, founder events should reduce the relative allelic richness of microsatellites more sharply than RAD loci (Maruyama & Fuerst, 1985), given that a relatively higher proportion of microsatellite alleles per locus should be in the under 10% frequency category that is most vulnerable to loss in a bottleneck (Luikart, 1998;Tajima, 1989). Low variance in both marker types can be produced by prolonged isolation at low population size, or by a recent and severe founder event (Nei et al., 1975).
The relative proportion of private alleles compared to shared alleles in the populations can distinguish between the scenarios posed above and provide information on the duration of population isolation ( Figure 1b). The longer a population has been isolated, the more likely it is to have private alleles (Harpak et al., 2016). The proportion of private alleles should reflect dynamics in the population that occur post-bottleneck, while the allelic richness analyses should reflect processes that occur during the bottleneck. As such, these two approaches have more explanatory value combined than either does independently.
Due to the difference in mutation rates between our marker types, populations should go through four distinct phases following a demographic event that reduces population size, assuming no migration from surrounding populations. Immediately following the event, neither marker type will have a large number of private alleles, and any private alleles that are present should be due to random sampling events during population subdivision, rather than unique local mutations. In the second phase, new microsatellite alleles will emerge due to local mutations, but new private RAD alleles will still be rare due to their slower mutation rate (Lowry et al., 2017). In the third phase, both marker types will show many private alleles. In a theoretical fourth phase, the rapid mutation rate of microsatellites will lead to size homoplasy with alleles in other populations, rendering former private microsatellite alleles no longer identifiable as private. This pattern could emerge due to the stepwise nature of mutation in microsatellites that leads to re-emergences of some read lengths from different ancestral alleles and the size homoplasy that can arise due to interrupted microsatellite sequences (Estoup et al., 2002). Empirical data show that size homoplasy is widespread in natural systems and increases in prevalence in populations separated by long time frames (Estoup et al., 2002;Lia et al., 2007). The combination of our two marker types should allow us to identify and order the relative occurrence time of demographic events to a greater degree of precision than either of our markers independently.
Using a combination of phylogeographic and demographic analyses, we reconstructed historical patterns of population structure and connectivity in X. vigilis across a complex geological and ecological landscape. In doing so, we leveraged the two marker types to differentiate potential migration patterns across a range of scenarios of patterns of historical connectivity between populations. By assessing biogeographic drivers structuring genetic marker discordance across populations, we provided a clear mechanism for testing among otherwise indistinguishable hypotheses that is useful for other systems with similarly intractable population histories.

| Field collection and tissue acquisition
We collected tissue samples in the field from 354 X. vigilis between 2007 and 2014, which we augmented with five museum samples mainly from the northeastern Mojave desert (Table S1) Table 1) along a north-south latitudinal gradient ( Figure 2, Figure S2). Throughout the manuscript, we refer to our sampling populations by the names of local landmarks and include a numbered identifier referencing the regional genetic deme to which they belong (see Figure 2, Table 1). The Coast Range sites were selected by scanning Google Earth for patches of Hesperoyucca whipplei habitat, followed by an initial survey to check for lizard presence. There is a substantial gap in the lizard's known range between the Transverse Ranges and the Pinnacles and Panoche area, corresponding to an area in which H. whipplei is present only in a few small stands ( Figure S1). We surveyed available H. whipplei stands within that gap and found no evidence of X. vigilis ( Figure S1), although the lizard is difficult to detect in some habitats and populations may be found in this area in the future. For most sites, we captured lizards by lifting, rolling or opening decaying yucca trunks and rosettes (see also Davis et al., 2011). At the Pinnacles site, we captured lizards by first moving fallen Pinus logs onto a white sheet and prying off flakes of bark by hand, as well as flipping associated flakes of talus underneath the logs or on scree slopes. After capture, we took tail clip samples that were stored in 95% ethanol and kept at −20°C until analysis. For outgroup comparisons, we included a sample from X. wigginsi, Wiggin's night lizard.

| Habitat classifications
We collected samples from three major habitat types ( Figure 2).
North of the Transverse Ranges, Hesperoyucca whipplei is distributed into discrete patches on sandstone formations, within which it is the dominant woody shrub species, and therefore all the collection sites were straightforwardly characterized as Hesperoyucca mesohabitat.
In the disjunct populations at Pinnacles National Park, the habitat has been defined as grey pine (Pinus sabiniana)-blue oak (Quercus douglasii) woodland (Sawyer & Keeler-Wolf, 2009) and lizards also occurred in grey pine-manzanita (Arctostaphylos sp.) associations at higher elevations as well as under loose rocks several meters from vegetation. In the Transverse Range collection sites, lizards were also collected in H. whipplei, but here the yucca tended to be interspersed with woody shrubs such as desert tea (Ephedra californica) and common sagebrush (Artemisia tridentata) as well as at least two tree species, California juniper (Juniperus californica) and single leaf pinyon (Pinus monophyla). Our Mojave sites were dominated by creosote bush (Larrea tridentata) and lizards were found mostly under joshua trees (Yucca brevifolia), Mojave yucca (Y. schidigera) and banana yucca (Y. baccata).

| DNA extraction and microsatellite genotyping
We extracted DNA from tissue samples using a standard Qiagen  Table 1) population (Davis, 2012;Davis et al., 2011;Davis Rabosky et al., 2012). We discarded any individual from further analysis that we could not confidently genotype at five or more loci. We checked our microsatellite data for null alleles using the null.all function summary two mean value from the R package 'PopGenReport' (Adamack & Gruber, 2014) and Hardy-Weinberg equilibrium using the R package 'pegas' (Paradis, 2010). One locus was dropped from further analysis due to a frequency of null alleles over 22 percent. The allele calls for the microsatellites are available in a Structure formatted file in our Dryad repository (DOI: 10.5061/dryad.31zcrjdht).

| Next-generation sequencing and data processing
We performed double digest RAD (ddRAD) sequencing on a subset of individuals (N = 104 X. vigilis, plus 10 outgroup samples) following the protocol developed by Peterson et al. (2012). We restricted total genomic DNA using the enzymes EcoR1 and Msp1 and then used a QIAquick gel extraction kit to size select fragments between 100 and 200 base pairs. We used 24 unique barcodes and four unique indices (following Peterson et al., 2012)  After preliminary analysis, we removed individuals that aligned poorly with the remaining dataset. We retained 81 X. vigilis individuals and one X. wigginsi sample as an outgroup. We uploaded fastq files to NCBI's short read archive under BioProject PRJNA649707. We used FastQC to assess the quality of our sequences (Andrews et al., 2011).
The initial results showed adapter contamination in some of our sequences. We processed our RAD results using the ipyrad pipeline (Eaton & Overcast, 2020) using the default settings. The first step in ipyrad includes adapter trimming. After it ran, we reran the cleaned sequences in FastQC to check that the adapter contamination had been successfully removed. The results showed that the adapter contamination had been removed, but the nucleotide ratios in the first 10 nucleotides were not even, and in some individuals the per tile sequence quality scores in the final 10 nucleotides were lower than average. We continued the ipyrad run with a cut-off of 20 individuals per locus sequenced in the output. We used a custom R script to parse the gphocs output file from ipyrad. We first removed all fragments without an X. wigginsi outgroup sequence. We then removed all fragments that were not variable within the X. vigilis samples. For each fragment, we trimmed any base pairs that were uncalled in some individuals along the edges of the sequences, then removed individuals with indels called in the remaining central portion of the sequence. We retained fragments that were longer than 50 bp in length after trimming. We removed fragments with a minor allele frequency below 0.05, and any fragment where one allele was represented only in heterozygotes. We wrote out separate fasta files for each locus, which we used to format program-specific input files for later analysis. The code used to make these per-locus fasta files is available in our Dryad repository (DOI: 10.5061/dryad.31zcrjdht).

| Locus diversity and characteristics
We calculated within-population and within-region allelic richness using the R package 'heirfstat' for microsatellites (Goudet & Jombart, 2022). We removed populations with fewer than three microsatellite genotypes (east Mojave (MOJ1), northeast Mojave (MOJ2), South Chalone (PINN2), Curry Mountain (PINN1) and Quatal (ST3)). We rarefied populations to three individuals and found the allelic richness for each locus. We found the mean value for the richness for each of our seven loci in each population. We repeated the rarefaction 50 times and then found the mean of the 50 mean values for each population. We did the same with our regional designations, which follow our phylogeographic tree (Figure 2).  (Singhal et al., 2017). We then found the mean of the individual heterozygosity values for each population and region.
To identify the proportion of private alleles in each population and region, we used a custom R script for both microsatellites and RAD data. For each locus, we first rarefied the number of individuals to three per population. For each population in turn, we then identified the alleles found in the population and the alleles found in all other populations. We calculated the proportion of the alleles in the focal population that were unique to that population. For example, a population with one allele that occurred only in that population and one that occurred elsewhere would have a privacy proportion of 0.5. We found the mean privacy proportion across all alleles for each population or region. We repeated the rarefaction 50 times and found the mean across results. The scripts for this analysis are available in our Dryad repository (DOI: 10.5061/dryad.31zcrjdht).

| Phylogeography
To reconstruct the phylogenetic relationships and major genetic splits within our focal X. vigilis populations, we used the programs iqtree2 1.6.12 (Nguyen et al., 2015) and astral 5.7.1 (Zhang et al., 2018) to construct a single coalescent tree from trees based on the individual RAD fragments output by our bioinformatics pipeline.
This approach is statistically more robust to incomplete lineage sorting, which could be common in our individual-level dataset. We generated trees for each fragment using iqtree2 with 1000 bootstraps each, setting the X. wigginsi sequence as an outgroup. We collected the output tree files into a single file, then used astral to estimate a single species tree from a randomly selected subset of 1000 gene trees. We used iqtree2 to quantify concordance between our gene trees and the astral-derived species tree. To visually assess the relationship between phylogeny and geography, we mapped each sample from its collection point to its location on the phylogenetic tree using phytools (Revell, 2012).
We used Structure v.2.3 to identify population-level deme groupings and find evidence of admixture in our RAD and microsatellite data (Pritchard et al., 2000). For the RAD Structure input, we randomly selected one SNP per fragment using a custom R script, avoiding selecting from the first or last 10 bp of each sequence due to the FastQC results. We ran each marker type at K values from one to seven. For microsatellites, we performed 2000,000 steps and 1000,000 burnin steps. For the RAD data, we did 200,000 steps with 100,000 burnin steps. We performed 10 runs at each K value for our RAD data and 30 for our microsatellite data and used the Evanno delta K method to select the value of K that best fit the data (Evanno et al., 2005) using the program StructureHarvester web v.0.6.94 (Earl & vonHoldt, 2012). Both marker types identified two demes as the best supported value of K. For both marker types, the location of the split matched the major division in our phylogeographic tree, with Mojave and South Transverse in one group and Pinnacles, Panoche and North Transverse in another. Within each of these groups, we repeated Structure runs from K = 1 to K = 7 for both marker types, with 50 runs at each K value for microsatellites and 10 for RAD data. Other values were the same as our initial runs.
We repeated the Evanno delta K method for both regional groups in both marker types.
We used the Clumpak web server to cluster results across runs at the best supported values of K within the two regional groupings (Kopelman et al., 2015).

| Historical movement corridors
To identify historical patterns of population splitting and migration between our three regional groupings (Mojave + South Transverse, North Transverse and Pinnacles + Panoche), we performed an Approximate Bayesian analysis with Random Forest model selection in the programs diyabc v.1.1.27 and abcranger v.1.16.23 (Collin et al., 2021). We subsampled the data such that each group was represented by an equal number of individuals. To generate the input file, we retained only SNPs from our Structure input that were represented by at least one called individual in each regional deme. We assigned an equal sex ratio to our populations. We set our minimum allele frequency threshold to 0.05. With these constraints, we retained 4782 loci. We set broad priors for population sizes from 0 to 1000,000 individuals, and priors for time since each split between populations between 0 and 1000,000 generations. We based our model selection on the 50 default summary statistics generated by the diyabc platform. We tested all possible topologies of three regional groupings resulting in nine total models. We generated 12,000 simulated datasets, then used abcranger to select the topology that best fit the data based on 2000 random forest trees.
For the best-fit topology, we estimated the number of generations before present that the populations divided using abcranger. To best use our data around regional patterns of uncalled alleles, we created a dataset of only the northern populations (North Transverse, Pinnacles and Panoche). Using the same approach as the full dataset, we tested every possible topology between these three populations and estimated the number of generations since divergence for the best supported topology.
Finally, we assessed the direction of migration between the major population clusters found in the iqtree tree. In this analysis, we followed the philosophy of rangeExpansion package (Peter & Slatkin, 2013) which relies on the observation that one-time, directional population movements often increase the proportion of rare alleles in the expanding population (Hallatschek et al., 2007;Klopfstein et al., 2006). Since rare alleles are more likely to be derived than ancestral, the expected outcome of a range expansion event is to increase the relative proportion of derived alleles in the newly established population (Slatkin & Excoffier, 2012). However, some of our populations have been separated for many generations, leading to many loci differing between populations but being fixed within populations, preventing us from using allele frequencies due to few alleles being variable in both populations in a pair.
To best use the available data, we created a simple pairwise directionality metric between populations. For each RAD locus, we found the putatively ancestral allelic state using a designated outgroup animal, our X. wigginsi sample. If an allele from the X. vigilis sample matched the outgroup allele, we considered that allele to be ancestral. We then found all derived alleles from the X. vigilis individuals in the sample. In cases in which one population had only derived alleles while a second population contained both ancestral and derived alleles, we considered the ancestral + derived allele population to be the source population. For each pair of populations, we resampled sets of three individuals per population 100 times each.
For each set, we recorded the detectable direction, if one could be identified. We found the sum of the directionality estimates for that fragment, then summed again across shared fragments. Finally, we normalized the directionality measures by the number of shared fragments between each pair of populations. We followed the groupings of populations in iqtree output, first finding directionality between the major north and south geographic areas, then between the regional demes nested within each area.

| Locus characteristics, diversity and private alleles
The observed frequency of null alleles in the microsatellite loci re-  (Table 1). We found that many microsatellite alleles were region specific rather than population specific, so we concentrated on patterns of regional privacy going forward.
The ipyrad infile started with 58,359 loci, which we aggressively  Table 1 for full results).

| Phylogeography
Our iqtree analysis recovered two major clades of X. vigilis across For both RAD data and microsatellites, the Evanno delta K method identified Structure runs at K = 2 as the value that best fit the data ( Figure S3a,c). The Evanno delta K method identified K = 2 as the best supported value for both marker types in the southern population ( Figure S3b). Both marker types separated the Mojave population and the South Transverse populations (Figure 3a,b). In the northern population, the best K value for microsatellites was two and the best K value for RADseq data was three ( Figure S3b). In the RAD data, the North Transverse, Pinnacles and Panoche populations all form demes (Figure 3a). The microsatellites show a regional deme in the Panoche area, with all other populations sharing a second deme (Figure 3b). The regional deme signature was especially strong in the southern, smaller and more isolated Panoche populations (PANO1, PANO2 and PANO3). The largest Panoche population (PANO4) retained significant signal of the wider northern microsatellite deme.

Microsatellite allele ranges show that the Northern and Southern
Transverse Ranges had both become fixed for a single allele at some loci, but those loci differed between the two demes ( Figure 3c).
For most loci, the Mojave and Panoche populations had the largest size ranges, with other populations intermediate between them.
The RAD pairwise difference heat map showed lower pairwise distances for within-deme population pairs than pairs from different demes, demonstrating that the Structure demes reflect raw genetic dissimilarities in the data (Figure 3d). Together, the iqtree, Structure and locus characteristic analyses pointed to a strong biogeographic break between northern and southern groups in the Cuyama Valley in the Transverse Ranges.

| Locus comparison
Our comparison between RAD and microsatellite heterozygosity showed populations occupying each of our four quadrants

| DISCUSS ION
In this paper, we leveraged patterns of variation between two types of genetic markers across populations of the desert night lizard, Xantusia vigilis, to test among biogeographic hypotheses that could not be resolved using conventional approaches. Despite a history of range expansion across physical barriers such as the Transverse Ranges in our study area, we surprisingly found that the most extreme phylogeographic break appears without any clear geographic barriers to gene flow. By incorporating population-specific discordance in heterozygosity and allele privacy across our marker types, our methods for leveraging genetic marker type discordance to our analytical advantage can be applied across other systems with similarly intractable population histories.
Despite the inherent biases associated with both of our marker types (Arnold et al., 2013;Putman & Carbone, 2014), they recover highly concordant results for both phylogeographic structure ( Figure 3) and within-population diversity (Figure 1). Our paired analysis ( Figure 3) generally agreed on population groupings and on the strong differentiation between the north and south Transverse Range populations. We also found several areas of disagreement between marker types, for which the nature, magnitude and directionality of discordance were informative for both reconstructing population histories and understanding the creation and maintenance of contact zones among clades. Our study area contains many complex biogeographic patterns which have been influenced by geological and climatic history. To demonstrate how our analytic approaches helped to discriminate among demographic hypotheses and how these methods could be applied to organisms with similar ecologies or geological histories, we discuss four inferences below: one wholistic biogeographic perspective and three regional case studies within our broader study area.

| Tests of biogeographic hypotheses: Expansions and boundaries
One of the unresolved biogeographic questions we tested is whether the northern range-limit populations of X. vigilis represent a historical refugium for the species (North-to-South hypothesis), or whether they are a recent offshoot of the main, Mojave Desert population (South-to-North hypothesis; Morafka & Banta, 1973). We found support for the South-to-North hypothesis, although the expansion process most likely occurred many generations in the past. Our phylogeographic tree showed reciprocal monophyly between the two regions, rather than showing one nested in the other (Figure 2). To with previous mitochondrial trees that showed an expansion of the A-clade X. vigilis approximately 1.5 mya (Leavitt et al., 2007).
Our study also resolved the geographic location of a major phylogenetic break in X. vigilis to the Cuyama valley in California's Transverse Ranges (Figure 2), where the Mojave clade meets the Coast Range clade. In this broad pattern, X. vigilis is similar to many other California species or species groups that show biogeographic breaks in the Transverse Ranges (Chatzimanolis & Caterino, 2007;Gottscho, 2016). However, the break we detect falls in between the edges of Transverse Range-specific biogeographic units detected by F I G U R E 4 Patterns of regional connectivity and migration directionality in RAD data. Historical signatures of expansion between the major clades recovered in our iqtree analysis using RAD loci. We follow the nested structure of the phylogeographic tree, first testing directionality between the northern and southern groups and then between major divisions within the groups. Our analysis shows expansion from south to north (thick arrows). Within the two major groups, expansion from the Mojave populations to the South Transverse populations, and from Pinnacles towards Panoche.  (Chatzimanolis & Caterino, 2007). The phylogeographic patterns used to identify the Transverse Range biogeographic regions may be old enough that they were formed during the Transverse Range uplift, which occurred between 5 and 3 million years ago (Nicholson et al., 1994). In contrast, X. vigilis likely entered the area 1.5 million years ago, dispersing over the Transverse Ranges rather than being divided by their uplift (Leavitt et al., 2007).
Our study populations show a secondary subdivision between the northern Transverse range populations and the Panoche-Pinnacles clade, which could correspond to the glacial lake barrier shown in other Central Valley lizards (Papenfuss & Parham, 2013;Richmond et al., 2017). However, caution should be exercised in this interpre- showing similar patterns in the area include pond turtles Actinemys (Spinks et al., 2010(Spinks et al., , 2014, wood rats Neotoma (Matocq, 2002) and silk moths Calosaturnia (Rubinoff et al., 2021). The Transverse Ranges in general and the Cuyama valley in particular are hotspots of phylogeographic lineage breaks across California herpetofauna (Rissler et al., 2006).

| Transverse ranges populations: A phylogeographic museum
The Cuyama valley, a small area in the northern Transverse Ranges, holds populations from the two major phylogeographic lineages of California A-clade X. vigilis (Figures 2 and 3). Our directionality analysis shows a strong signal of the Mojave population being a source of migrants for the South Transverse populations, and while the North Transverse regions share genetic identity with Panoche and Pinnacles ( Figure 4). However, both the South and North Transverse populations are monophyletic rather than nested within a source population, indicating that they have been in place for many generations and/or have received very few migrants (Figure 2). Our Structure results reflect a similar pattern, with even the two geographically closest populations, Quatal (ST3) and Ballinger (NT1), consistently assigned to opposite demes with little evidence of admixture (Figure 3a,b). Given their apparent long residency and geographic proximity, why do we observe so little admixture between these two regional demes?
With the available evidence, we hypothesize that at least one of the major regional lineages arrived in the Cuyama valley just prior to some paleoclimatic event that dramatically reduced migration rates to their low modern levels. Considering our observed differences in genetic diversity between cover-vegetation types, a reasonable scenario could involve a drying event that replaced contiguous vegetation cover with more xeric vegetation, with fragmented stands of appropriate cover plants interspersed in an inhospitable matrix. This scenario would fit with past work on the species showing patterns of an expansion followed by long-term stasis in the species (Leavitt et al., 2007).
Fault movements are biogeographically significant throughout our study area. The Cuyama valley region in particular is highly geologically active, with both rotation and subduction occurring (DeLong et al., 2007;Luyendyk et al., 1980;Prothero et al., 2008).
The complex geological history of the region has likely contributed to the many species and clade boundaries that occur in the area (Chatzimanolis & Caterino, 2007 Figure S1).
Worldwide, many other areas served as climatically stable refugia during the last glacial maximum, including areas of the Amazon (Bonaccorso et al., 2006), the Eastern Afromontane Biodiversity Hotspot (Demos et al., 2014) and southern Australia (Byrne, 2008).
These locations may also preserve a disproportionate amount of phylogeographic diversity, particularly if conditions have since changed to reduce migration rates of the species concerned. If this mechanism is widespread, high lineage diversity of a variety of low-dispersal organisms, such as snails, plants that spread mostly through vegetative mechanisms or tropical-forest understory specialist birds, might be preserved in historically stable habitat patches. As in the case of Cuyama valley for X. vigilis, these phylogeographic hotspots may not be readily distinguishable from surrounding habitat in the modern day. shows Pinnacles as ancestral, with North Transverse branching before Panoche. A further discordance between marker types occurs within the Panoche population. The microsatellite Structure results detect a regional deme in the southern Panoche populations (PANO1, 2, and 3), which is absent from the RADseq results (Figure 3a,b).

| Panoche Hills: Recovery from old bottleneck results in microsatellite-RAD data conflict
Insights about the demographic histories of our populations from discordance in allelic diversity between our marker types can resolve these observations. The Panoche populations, clustered in the lower right quadrant in Figure 1c, all show a signature of an old bottleneck followed by a rebound (Figure 1a). This signature is   (Rech et al., 2010), the Eastern Arc mountains in Tanzania and Kenya (Burgess et al., 2007;Lovett, 1996), and the Central Asian high plateau north of the Himalayas (Tewari & Kapoor, 2013).

| Curry Mountain: Tectonic drift separates populations that retain strong co-ancestry
Another seemingly paradoxical result we found that may be ex-  (Brothers et al., 2020), in Japan (Hosoi et al., 2020), and in New Zealand (Michailos et al., 2020) could experience similar displacement.

| CON CLUS IONS
Our phylogeographic results show that X. vigilis has the ability to maintain biogeographic breaks between two closely adjacent demes over large timescales. In an apparently contradictory pattern, they also show close population co-ancestry over large geographic distances. When combined with previous work on this species, we hypothesize that the biogeographic history of X. vigilis is largely composed of long periods of population stasis, with little dispersal of any kind. Throughout the species' history in their current range, there have been substantial geographic expansions in which individuals successfully established new populations. For similar scenarios of punctuated dispersal events across long time periods, our work demonstrates the utility of comparing phylogeographic signal in marker types with different mutational properties to successfully resolve complex histories of migration and demographic change. We propose that our approach is applicable to organisms in similarly tectonically active and paleoclimatically complex habitats worldwide.

AUTH O R CO NTR I B UTI O N S
MFW, ARDR, and IAH designed the research. MFW, ARDR, and PJJ collected tissue samples. IAH, IVM, and ARDR performed molecular work, with subsequent data processing by IAH (RAD and microsatellites) and ARDR (microsatellites). IAH conducted the statistical analyses, and IAH, ARDR, and IVM performed data visualization. The manuscript was initially drafted by IAH and ARDR, with substantial revision and feedback from all authors.

ACK N O WLE D G E M ENTS
We thank California Department of Fish and Game, the U.S. Fish and Wildlife Service, and Pinnacles National Park for scientific collection permits. We thank many people for assistance in field collection of We also thank three anonymous reviewers for helpful comments in the preparation of this manuscript. This study was supported by start-up funds from the University of Michigan to ARDR and from the U.S. Bureau of Land Management (L13AS00158) to MFW and ARDR, and the National Institutes of Health and National Institute of Allergy and Infectious Diseases Award T32Il45821 to IAH.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare no competing conflicts of interest.