The ultimate and proximate mechanisms driving the evolution of long tails in forest deer mice

Understanding both the role of selection in driving phenotypic change and its underlying genetic basis remain major challenges in evolutionary biology. Here, we use modern tools to revisit a classic system of local adaptation in the North American deer mouse, Peromyscus maniculatus, which occupies two main habitat types: prairie and forest. Using historical collections, we find that forest‐dwelling mice have longer tails than those from nonforested habitat, even when we account for individual and population relatedness. Using genome‐wide SNP data, we show that mice from forested habitats in the eastern and western parts of their range form separate clades, suggesting that increased tail length evolved independently. We find that forest mice in the east and west have both more and longer caudal vertebrae, but not trunk vertebrae, than nearby prairie forms. By intercrossing prairie and forest mice, we show that the number and length of caudal vertebrae are not correlated in this recombinant population, indicating that variation in these traits is controlled by separate genetic loci. Together, these results demonstrate convergent evolution of the long‐tailed forest phenotype through two distinct genetic mechanisms, affecting number and length of vertebrae, and suggest that these morphological changes—either independently or together—are adaptive.

genomes--we have a much more complete picture of the adaptive process.
Variation among populations of the deer mouse, Peromyscus maniculatus, provides a system for understanding both the organismal and genetic basis of evolution by local adaptation. This species has the widest range of any North American mammal (Hall 1981), and populations are adapted to their local environments in many parts of the range (Fig. 1A; e.g., Dice 1947;Hammond et al. 1999;Storz et al. 2007;Linnen et al. 2009;Bedford and Hoekstra 2015). Perhaps most strikingly, following the Pleistocene glacial maximum in North America, mice migrated from southern grassland habitat northward and colonized forested habitats, where they have become more arboreal (Hibbard 1968). Natural historians have long recognized two ecotypes of deer mice: forest-dwelling and prairie-dwelling forms (e.g., Dice 1940;Hooper 1942;Blair 1950). These ecotypes are differentiated both behaviorally and morphologically; mice found in forests tend to have smaller home ranges (Blair 1942;Howard 1949), prefer forest habitat over prairie habitat (Harris 1952), and prefer elevated perches (Horner 1954) as well as have bigger ears, longer hind feet, and longer tails (Osgood 1909;Blair 1950;Horner 1954) than prairie forms. However, it is unknown if the current, widespread forest populations reflect a single origin of the arboreal morphology that has spread across the continent, or if independent lineages have converged on the forest phenotype, and the genetic architecture underlying these phenotypic changes remains unexamined.
Of the morphological traits associated with forest habitats, arguably the best recognized is tail length, and evidence suggests that tail elongation may be an adaptive response to increased arboreality. Previous work has shown that deer mice use their tails extensively while climbing. For example, Horner (1954) carried out elegant experiments on climbing behavior in Peromyscus, and not only found a correlation between tail length and climbing performance, but also provided experimental evidence that within P. maniculatus, forest mice are more proficient climbers than their short-tailed counterparts and that they rely on their tails for this ability. Moreover, in two other Peromyscus species, tail length correlates with degree of arboreality (Smartt and Lemen 1980), and climbing ability has been shown to be heritable (Thompson 1990).
Here, we investigate the evolution of the deer mouse tail in several complementary ways. First, we reconstruct phylogeographic relationships among 31 populations of P. maniculatus to test hypotheses about the evolution of tail length. We show that forest-dwelling deer mice do not belong to a single phyletic group or genetic cluster, and thus that long-tailed forest forms appear to have independently converged. We also demonstrate that the evolution of longer tails is correlated with living in forest habitats, even when taking account of relatedness among populations.
Second, we investigate the morphological basis of tail length differences in two geographically distant populations, implicating similar mechanisms of tail elongation in eastern and western forest populations. Finally, we show that differences in the constituent traits of tail length between forest and prairie mice, despite correlation of these traits in the wild, are genetically separable in recombinant laboratory populations. Together, these results suggest that natural selection maintains multiple independent locally adapted forest populations and that independent genetic loci contribute to increases in vertebral number and length.

SAMPLES OF MUSEUM SPECIMENS
To quantify the degree of variation in overall tail length in this widespread species, we downloaded records of P. maniculatus from the Mammal Network Information System (MaNIS 2015), the Arctos database (Arctos 2015), and individual museum databases. Because nearly all specimens were present in the collection as prepared skins, we considered the original collector's field data to be the most reliable source of measurements. We excluded all specimens labeled as "juvenile," "subadult," or "young adult" or those having any tail abnormalities or injuries. We also removed any individuals with total length below 106 mm or tail length below 46 mm, which are considered the adult minima for this species (Hall 1981;Zheng et al. 2003). In total, we gathered data from 6400 specimens.
To assign specimens to subspecies, we digitally scanned the original distribution maps from Hall (1981) and georeferenced them in ArcGIS v. 9.2 (ESRI). P. maniculatus has highly stable subspecies ranges and most of the named subspecies have been described as forest or prairie types based on morphology and inspection of habitat (King 1968;Hall 1981;Gunn and Greenbaum 1986). We found that these historical assignments in the majority of cases hold true: based on GIS land cover data, we could assign most of the specimens labeled as a particular subspecies to a single habitat type. Two subspecies, P. m. gambelii and P. m. sonoriensis, were captured in both forest and nonforest habitats, so we have called their habitat type "mixed."

MORPHOMETRIC MEASUREMENTS AND STATISTICS
We calculated two types of dependent variables. First, we calculated the ratio of tail length to body length for all individuals. Second, we addressed potential nonlinear scaling of the two measures by fitting a linear model of tail length versus body length (Fox and Weisberg 2011). Models, including log, square, and Box-Cox transformations of the response variable, as well as quadratic and cubic terms for the predictor, were compared using ANOVA and the adjusted R 2 values. Because the best model had very low bairdii (n = 198) blandus (72) rufinus (827) Osgood (1909) and Hall (1981). Each dot represents a collecting locale from which one to nine samples were obtained. Dot color represents the local GIS land cover-defined habitat of the site (green = forest, tan = nonforest). Mice from all sample locales were used in subsequent genome-wide capture analyses. The red outline for "x-ray samples" indicates that we used samples from those locations for the comparison of vertebral number and length.
(B) Box-and-whisker plot of tail/body length ratio variation among deer mouse subspecies from museum collections. Box color indicates local habitat in which samples were collected based on GIS land cover data for those subspecies; green = forest, tan = nonforest, gray = mixed. ("Mixed" indicates that populations from a single subspecies were captured in both forest and nonforest locations.) explanatory power (R 2 = 0.11), we decided that simple ratios are the best statistic. As the ratios were not normally distributed (P < 0.001, Kolmogorov-Smirnoff test), the means of prairie, forest, and unclassified forms were compared with the Kruskal-Wallis nonparametric ANOVA. The R package car was used for all computations (R Development Core Team 2005; Fox and Weisberg 2011).

MORPHOLOGY AND HABITAT DESIGNATION
Because skeletal preparations of museum specimens often do not have complete tails (e.g., the smallest vertebrae are often lost during sample preparation), we conducted more detailed analyses on whole mouse specimens (i.e., frozen or fluid-preserved).
We focused the next set of genomic and morphometric analyses on 80 individuals from our lab collection as well as loans from researchers and museums, covering multiple regions included in the above analyses (see Table S1 for details). The sampling en-compassed the range of P. maniculatus and included 31 locations, spread across multiple habitats (Fig. 1A). We assigned the animals from both datasets to local habitat types based on their specific sampling locations: we used ArcGIS (ESRI) to extract land cover (i.e., habitat type) information from the North American Land Change Monitoring System 2010 Land Cover Database (NALCMS 2010) with a 1 km-radius buffer around each sampling location. We split land cover categories into forest and nonforest/prairie designations for all analyses: we called classes 1-6 and 14 forest (Temperate or subpolar needleleaf forest, Subpolar taiga needleleaf forest, Tropical or subtropical broadleaf evergreen forest, Tropical or subtropical broadleaf deciduous forest, Temperate or subpolar broadleaf deciduous forest, Mixed Forest, Wetland) and others nonforest/prairie (Tropical or subtropical shrubland, Temperate, or subpolar shrubland, Tropical or subtropical grassland, Temperate or subpolar grassland, Cropland, Urban, and Built-up).

OF SHORT-READ LIBRARIES
To assess genome-wide population structure, we used an arraybased capture library and sequenced region-enriched genomic libraries using an Illumina platform (Gnirke et al. 2009). Our MYselect (MYcroarray; Ann Arbor, MI) capture library sequences include probes for 5114 regions of the Peromyscus maniculatus genome, each averaging 1.5 kb in length (totaling 5.2 Mb of unique, nonrepetitive sequence; see Domingues et al. [2012] and Linnen et al. [2013]). We extracted genomic DNA using DNeasy kits (Qiagen; Germantown, MD) or the Autogenprep 965 (Autogen; Holliston, MA) and quantified it using Quant-it (Life Technologies). 1.5 µg of each sample was sonically sheared by Covaris (Woburn, MA) to an average size of 200 bp, and Illumina sequencing libraries were prepared and enriched following Domingues et al. (2012). Briefly, we prepared multiplexed sequencing libraries in five pools of 16 individuals each using a "with-bead" protocol (Fisher et al. 2011) and enriched the libraries following the MYselect protocol. We pulled down enrichment targets with magnetic beads (Dynabeads, Life Technologies; Carlsbad, CA), PCR amplified with universal primers (Gnirke et al. 2009), and generated 150 bp paired end reads on a HiSeq2000 (Illumina Inc.; San Diego, CA).

SEQUENCE ALIGNMENT
We preprocessed raw reads (fastq files) by removing any potential nontarget species sequence (e.g., from sequence adapters) and by trimming low quality ends using cutadapt v. 1.8 (Martin 2011) and Trim Galore! v. 0.3.7 (options: -q 20 -phred33 -stringency 3 -length 20 -paired -retain_unpaired) before aligning them to the Pman_1.0 reference assembly using Stampy v. 1.0.22 (Lunter and Goodson 2011) with default parameters. We removed optical duplicates using SAMtools v. 1.2 (Li et al. 2009), retaining the read pair with the highest mapping quality. Prior to calling variants, we performed a multiple sequence alignment using the Genome Analysis Toolkit v. 3.3 (GATK) IndelRealigner tool (McKenna et al. 2010;DePristo et al. 2011;Van der Auwera et al. 2013). Following GATK's Best Practice recommendations, we computed Per-Base Alignment Qualities (BAQ) (Li 2011), merged reads originating from a single sample across different lanes, and removed PCR duplicates using SAMtools v. 1.2. To obtain consistent alignments across all lanes within a sample, we conducted a second multiple sequence alignment and recalculated BAQ scores. We finally limited our dataset to proper pairs using SAMtools v. 1.2.

VARIANT CALLING AND FILTERING
We used GATK's HaplotypeCaller to generate initial variant calls via local de novo assembly of haplotypes. We combined the resulting gvcf files using GATK's CombineGVCFs command and jointly genotyped the samples using GATK's GenotypeGVCFs tool. After genotyping, we filtered these initial variant calls using GATK's VariantFiltration to minimize the number of false positives in the dataset. In particular, we applied the following set of filter criteria: we excluded SNPs for which three or more variants were found within a 10 bp surrounding window (clusterWindow-Size = 10) to eliminate redundant variants; there was evidence of a strand bias as estimated by Fisher's exact test (FS > 60.0); the read mapping quality was low (MQ < 60); or the mean of all genotype qualities was low (GQ_MEAN < 20). We limited the variant dataset to biallelic sites using VCFtools (Danecek et al. 2011) and excluded genomic positions that fell within repeat regions of the reference assembly (as determined by RepeatMasker [Smit et al. 2013[Smit et al. -2015). We excluded genotypes with a genotype quality score of less than 20 (corresponding to P[error] = 0.01) or a read depth of less than four using VCFtools to minimize genotyping errors. In addition, we filtered variants on the basis of Hardy Weinberg Equilibrium (HWE) to remove variants that might be influenced by selection and thus distort phylogenetic relationships: a P-value for HWE was calculated for each variant using VCFtools, and variants with P < 0.01 were removed. This HWE filtering excluded <0.7% of sites.

GENETIC PRINCIPAL COMPONENTS ANALYSIS (PCA)
To assess genetic structure across the range of P. maniculatus, we first used SMARTPCA and TWSTATS (Patterson et al. 2006) with the genome-wide SNP data (after filtering for variants genotyped in >50% of individuals, the final dataset contained 7396 variants) from the enriched short-read libraries described above. We detected significant structure in the first two principal components (P < 0.05 by Tracy-Widom theory, as applied in TWSTATS). To further visualize genetic relationships inferred by the PCA, we generated a neighbor-joining tree by computing Euclidean distances between individuals in the significant eigenvectors of the SMARTPCA output. The Python code to produce these trees can be found at github.com/kingsleyevan/phylo_epk.

FORMS
We filtered the capture data to include only the variable sites with information for at least four individuals (14,076 SNPs) and formatted the resultant data in PLINK v. 1.8 (Purcell et al. 2007). The VCF was converted to a fasta alignment using PGDSpider v. 2.0 (Lischer and Excoffier 2012). We generated a phylogenetic tree in RAxML v. 8 assuming the GTR+G model with 500 bootstrap replicates and a correction for ascertainment bias (Stamatakis 2014). We then tested the hypothesis that long-tailed forest individuals are monophyletic, either across the continent or within each of the western and eastern clades, by constraining the tree inference and comparing the likelihood using the Approximately Unbiased test (AU) (Shimodaira 2002) in CON-SEL v. 0.2 (Shimodaira and Hasegawa 2001).

MITOCHONDRIAL DNA ANALYSIS
Mitochondrial sequences were generated for a set of 106 samples, including 35 of those used in the capture experiment as well as P. keeni, P. polionotus, and P. leucopus as outgroups (Table S2). Amplification and sequencing were performed following Hoekstra et al. (2005). Briefly, we amplified the region spanning the single-exon gene CO3 and ND3, with the intervening Glycine tRNA, in a single PCR, and Sanger-sequenced using the primers 5 CATAATCTAATGAGTCGAAATC 3 (forward) and 5 GCWGTMGCMWTWATYCAWGC 3 (reverse). We aligned the resultant 1190 bp sequences using MUSCLE (Edgar 2004). The ML tree estimation and monophyly testing were carried out as above, with a separate partition defined for each codon position.

WITHIN-SPECIES COMPARATIVE ANALYSIS
To test for a correlation between tail length and habitat type, we used a comparative approach that accounts for the phylogenetic relationships among samples. First, we calculated standard Phylogenetically Independent Contrasts (Felsenstein 1985), based on both the nuclear and mitochondrial ML phylogenies, using the R package GEIGER (Harmon et al. 2008). Because we are comparing populations within a single species, this analysis is potentially confounded by nonindependence because of gene flow in addition to shared ancestry (Stone et al. 2011). Nonetheless, these results support the conclusions of the two controlled tests described below.
To control for the nonindependence of populations outside the context of bifurcating tree, we used a generalized linear-mixed model approach as described by Stone et al. (2011). We used two measures of genetic similarity. First, we pared our data down to populations from which we had three or more individuals (now six populations) and estimated global mean and weighted F ST between populations using the method of Weir and Cockerham (1984) as implemented in VCFtools (Danecek et al. 2011; see "in population contrasts" in Table S1). Second, we used a measure of genetic similarity ("-relatedness2" in VCFtools: the kinship coefficient of Manichaikul et al. 2010) between all individuals in the dataset for which we have tail and body measurements ("in individual contrasts" in Table S1). For both the populationand individual-level analyses, we used a generalized linear mixed model approach, as implemented in the R package MCMCglmm (Hadfield 2010) to test for an effect of habitat (forest vs nonforest) on population average tail/body ratio or individual tail/body ratio when including F ST or kinship coefficient as a random effect.

SPECIMENS
To measure tail morphology, we x-ray imaged the skeletons of wild-caught mice from four populations (Fig. 1A [red circles], and see Table S1 for specimen details). We used a Varian x-ray source and digital imaging panel in the Museum of Comparative Zoology Digital Imaging Facility. To measure the lengths of individual vertebra we used ImageJ's segmented line tool, starting from the first sacral vertebra and proceeding posterior to the end of the tail. The boundary between the sacral and caudal segments of the vertebral column is not always clear, so for consistency, we called the first six vertebrae (starting with the first sacral attached) the sacral vertebrae; the caudal vertebrae are all the vertebrae posterior to the sixth sacral vertebra. This approach allowed us to reliably compare vertebrae across individuals.
Because tail length scales with body size in our sample, we fitted a linear model with the lm function in R (R Development Core Team 2005) to adjust all vertebral length measurements for body size. We regressed total tail length (R 2 = 0.62) and the lengths of individual vertebrae (R 2 = 0.14-0.59) on the sum length of the six sacral vertebrae and used the residuals from the linear fit for all subsequent analyses. We obtained similar results when we regressed vertebral length measurements on femur length instead of sacral vertebral length.

F2 INTERCROSS TRAIT CORRELATION ANALYSIS
To examine the genetic architecture of tail traits, we conducted a genetic cross between a prairie and forest population. We originally obtained prairie deer mice, P. m. bairdii, from the Peromyscus Genetic Stock Center (University of South Carolina). We captured forest deer mice, P. m. nubiterrae, at the Powdermill Nature Reserve in Westmoreland County, Pennsylvania, and used 18 individuals to found a laboratory colony, after quarantine at Charles River Laboratories. All mice were housed in the Hoekstra lab at Harvard University. We mated a single male and female of each subspecies-two mating pairs, one in each cross direction-and used their offspring to establish 5 F1 sibling mating pairs. We then x-ray imaged the resulting 96 F2 offspring and measured their tail morphology as described above (see section on Vertebral morphometrics). We used the lm function in R to assess correlations in the resulting measurements. All animals were adults between 80 and 100 days old when x-rayed.

FOREST DEER MICE DO NOT FORM A SINGLE CLADE
We examined museum records of 6400 specimens belonging to 12 subspecies of Peromyscus maniculatus from across North America. Nearly all of the forest forms have substantially longer tails  than the prairie populations (Kruskal-Wallis test, P = 0.001), often equaling the length of their body (Fig. 1B). We used an array-based capture approach to resequence and call SNPs in >5000 genomic regions in a continent-wide sample of animals from 31 P. maniculatus populations, P. keeni, and an outgroup P. leucopus (Fig. 1A). We identified 14,076 high-quality SNPs, corresponding to an average of one variant every 187 kb of the reference genome with 24.73% of the genotypes missing. We explored genetic similarity among the sampled individuals by genetic principal components analysis (PCA; Patterson et al. 2006) and Maximum Likelihood phylogenetic inference (Fig. 2). Both the PCA and phylogenetic methods show that individuals from forests (as determined by GIS) do not compose a single, monophyletic group. Instead, we see the mice of the putatively derived forest forms clustering with nearby nonforest forms in the PCA (Fig. 2A). Trees based on PCA distances (Fig. S1), estimated from DNA sequences (both genome-wide SNPs [ Fig. 2B; Fig. S2] and mtDNA [ Fig. S3]), all identify multiple origins of forest forms. Monophyly is rejected in ML tests using the nuclear and mitochondrial data (AU test, P < 0.001). These results show that forest forms are evolving independently in eastern and western parts of the species range.

WITH HABITAT
To assess whether differences in the length of the tail are significant even when accounting for evolutionary relationships among populations (created by gene flow and/or shared ancestry), we included measures of genetic similarity among populations and individuals in a set of generalized linear mixed models. In these models, we ask whether animals in different habitats have sig-

. Tail length differences between habitats when accounting for genetic non-independence. (A) Tail/body length ratios for each population predicted by a generalized linear-mixed model taking genetic differentiation (F ST ) between populations into account. (B) Tail/body length ratios for prairie and forest individuals predicted by a similar model as in A, but with pairwise genetic relatedness (kinship coefficient [Manichaikul et al. 2010]) between individuals taken into account. Error bars represent 95%
confidence intervals.
nificantly different tail/body ratios when including measures of genetic similarity as random effects. First, we considered whether forest and prairie populations differ in their mean tail lengths. When taking pairwise F ST between populations into account (Fig. S4), we find that habitat has a significant effect in our mixed model (P = 0.002 for weighted mean F ST , P < 0.001 for global mean F ST ; fit by Markov Chain Monte Carlo, MCMC [Hadfield 2010]). Next, we assessed whether individuals from forest and prairie habitats differ in tail length when accounting for genetic similarity. We find a significant effect of habitat on tail/body ratio in our mixed model with kinship coefficient (Manichaikul et al. 2010) as a random effect (MCMC; P < 0.001). We show model-predicted population means in Figure 3A and the predicted individual means by habitat in Figure 3B. In addition, phylogenetically independent contrasts show a correlation between habitat type and tail/body length ratio (adjusted R 2 = 0.30, P < 1 × 10 -4 ). Together, these results robustly show that, even when taking nonindependence of populations into account, deer mice from forested habitats do indeed have significantly longer tails than those from prairie habitats.

CONVERGENCE IN SKELETAL MORPHOLOGY
We x-rayed specimens from two pairs of geographically distant forest-prairie populations (Fig. 1A [red circles] and "x-ray samples" in Table S1) and measured their vertebrae. We focused on these four populations because individuals from the two forest populations represent independently evolving forest lineages. Though both forest populations have longer tails than nonforest populations, eastern and western forest mice do not have identical tail lengths: western mice have longer tails, both absolutely and relative to their body size, than eastern forest mice (Fig. 4A-D).
In addition to the total tail length, we measured the number of caudal vertebrae and the lengths of each caudal vertebra. We found that, in both east and west paired comparisons, forest mice had significantly longer tails, significantly more vertebrae, and significantly longer vertebrae (here we tested the single longest vertebra) than their nearby prairie form ( Fig. 4; Wilcoxon tests, P < 1.6 × 10 -4 for all comparisons). Notably, we performed all these tests on body-size-corrected data, which means that these forest-prairie differences are not driven by overall differences in body length. We also found significant effects of habitat and subspecies on tail length and longest vertebra length in a two-way analysis of covariance (ANCOVA) on log-transformed data with sacral length as a covariate (P < 6.6 × 10 -8 for all effects).
First, we counted the number of caudal vertebrae and found that forest forms have significantly more vertebrae than the prairie forms (Kruskal-Wallis test, X 2 = 29.0, P = 7.2 × 10 -8 ). In both cases, we found that forest mice had, on average, three more vertebrae than the nearby prairie form. However, the western mice had, on average, one more caudal vertebra than the eastern population from the same habitat type (Fig. 4B). Importantly, none of the populations can be distinguished by the number of trunk vertebrae (all samples had 18 or 19), showing that vertebral differences are specific to the tail.
We next compared the lengths of the individual caudal vertebrae along the tail and found that many but not all caudal vertebrae are longer in the forest than in prairie mice. In the eastern population pair, caudal vertebrae 4 through 15 had longer median lengths in the forest mice than the longest caudal vertebra in prairie mice. The corresponding segment in the western population pair is caudal vertebrae 4 through 16 (Fig. 4C).
We also estimated the relative contributions of differences in vertebral length and vertebral number to the overall difference in tail length. To do this, we modeled the forest and prairie mice having an equal number of vertebrae by inserting three long vertebrae into the center of the prairie tails. These simulated "prairie+3" tails compensated for 42% and 53% of the average difference in overall tail length between the eastern and western forms, respectively (Fig. 4D). These estimates represent an upper bound of the contribution of the difference in vertebral number relative to vertebral length in these populations. Finally, a linear model of the form total length ß longest vertebra length + number of caudal vertebrae has an R 2 = 0.98, suggesting that differences in length and number of vertebrae explain nearly all of the difference in total tail length between these populations. Thus, these two morphological traits-length and number of vertebrae-contribute approximately equally to the difference in overall tail length in the western and eastern clades.

ARE GENETICALLY SEPARABLE
Among the four populations examined above, the two forest populations have both more caudal vertebrae and longer caudal vertebrae. To test whether these two morphological differences are controlled by the same or distinct genetic loci, we performed a reciprocal intercross between the eastern populations: long-tailed, forest P. m. nubiterrae, and short-tailed, prairie P. m. bairdii (Fig. 5A). Specifically, we generated 10 F1 hybrids and paired those 10 F1s to produce 96 F2 recombinant individuals. Using vertebra measurements from x-ray images of each of these animals, we show that alleles controlling number of vertebrae act in a semidominant manner as evidenced by intermediate phenotypes in first generation (F1) hybrids, whereas length of vertebrae in F1 hybrids is more similar to the forest form, suggesting the P. m. nubiterrae allele(s) are primarily dominant (Fig. 5B). Next, we tested for a correlation between the number of caudal vertebrae and the length of the longest caudal vertebra. We detect no significant correlation ( Fig. 5C; t = 0.87, df = 94, P = 0.39) between vertebral length and vertebral number in the tails of our F2 animals. A sample of 96 individuals is well powered, allowing an 80% probability of detecting a correlation of r > 0.25 at P < 0.05. These data suggest that differences in these two phenotypic traits-vertebral number and length-are under independent genetic control.

Discussion
In this study, we explored the evolution of a repeated phenotype within a single species. We found that long-tailed forest-dwelling deer mice are evolving independently in eastern and western parts of their range and that tail length does indeed differ between forest and nonforest habitats, even when controlling for genetic relatedness. Furthermore, we showed that longer tails, in both eastern and western forest mice, are a result of differences in both the number of caudal vertebrae and also the lengths of those vertebrae. Finally, we demonstrated that, despite the observation that caudal vertebrae number and length are correlated in nature, the genetic mechanisms producing these differences can be decoupled in a laboratory intercross. Together, these results imply that natural selection is driving differences in tail length and provide insight into the genetic architecture that underlies tail length evolution.
In his large-scale 1950 survey of adaptive variation in Peromyscus, W. Frank Blair described two main forms of P. maniculatus, grassland and forest, supporting previous suggestions (Osgood 1909). Therefore, we examined the phylogeography of the deer mouse in the context of intraspecific ecological adaptation, and specifically in the framework of the classic prairie-forest dichotomy that has been recognized in this species for over one hundred years. Previous work on the phylogenetic relationships among deer mouse subspecies has hypothesized convergence in these forms-allozyme (Avise et al. 1979) and mitochondrial studies (Lansman et al. 1983;Dragoo et al. 2006) hinted at a split between eastern and western forest populations-yet none have explicitly considered morphological and ecological context in a continent-wide sampling. Additionally, only the mitochondrial study of Dragoo et al. (2006) investigated population differentia-tion at a fine scale, but mitochondrial-nuclear discordance in this species (e.g., Yang and Kenagy 2009;Taylor and Hoffman 2012) complicates the interpretation. Here, we used > 7000 genomewide SNPs to directly test for a correlation between habitat and morphology. We chose to use both tree-based and non-tree-based methods, given the difficulty of constructing bifurcating phylogenetic relationships among intraspecific samples from populations experiencing gene flow.
The results of our genetic PCA provide strong evidence that forest mice do not form a single genetic cluster. The pattern of genomic differentiation we see in the PCA is roughly similar to that recovered by Avise et al. (1979), with a group of populations from eastern North America that are clearly separated from populations in the western half of the continent. In addition, we find a similar pattern and extent of divergence in Maximum Likelihood phylogenetic reconstruction among individuals. These results show at least two, one eastern and one western, independently evolving groups of forest P. maniculatus.
When we assigned habitat values to those populations using GIS land cover data, we found that populations captured in forest habitats have longer tails than those in nonforest habitats, even when taking genetic relatedness among populations into account. This is a critical finding that suggests that tail length differences are driven by natural selection: despite a century of investigation, the current study provides the first rigorous evidence for a continent-wide correlation between tail length and habitat type in this species. Thus, we support the conjecture of Lansman et al. (1983): "It is thus very probable that the currently recognized forest-grassland division of P. maniculatus does not reflect a fundamental phylogenetic split. Rather, it is more likely that environmental selection pressures have led to the independent evolutionary appearance of these two morphs in different maniculatus lineages." Our results showing the correlation of morphology with habitat and the presence of this correlation in independent populations imply convergence on the population level: similar environments resulting in similar morphologies in distinct lineages.
Several lines of evidence suggest that these independent extant forest forms may derive from a short-tailed prairie-dwelling ancestor. First, although our SNP-based phylogeny does not imply an obvious ancestral state, the topology of our mtDNA phylogeny (Fig. S3) is consistent with a prairie ancestor to the P. maniculatus group. Second, three studies identified P. melanotis, found primarily in grassy plains and characterized by a very short tail (Álvarez-Castañeda 2005), as the outgroup to the P. maniculatus species group (Bradley et al. 2007;Gering et al. 2009;Platt et al. 2015). Other close relatives of P. maniculatus are also relatively shorttailed, with the exception of another climbing specialist, the cactus mouse P. eremicus. Third, several fossils attributed to P. maniculatus and dated to the Pleistocene late-glacial period-during which the P. maniculatus group is thought to have radiated-were found in regions likely to be mainly scrub grasslands (Hibbard 1968;Bryant and Holloway 1985). While it is always challenging to infer ancestral phenotypic states, these data are consistent with the ancestral P. maniculatus being a short-tailed grassland form from which at least two long-tailed forest forms evolved.
How would the interpretation of our results change if the ancestral P. maniculatus were not short-tailed prairie dwellers? One possibility is that the ancestral state was long-tailed, implying the subsequent evolution of shorter-tailed prairie populations in the middle of the continent and the retention of the ancestral state in the east and west. In that case, we could not rule out a single transition in tail length, from longer to shorter. A second possibility is that the ancestral population had an intermediate or polymorphic length tail, suggesting that there were also transitions to short tails that we did not address here. Regardless of the ancestral phenotype-short, long, intermediate, or polymorphic-we show that the eastern and western forest forms are convergently evolving at the population level. By this, we mean that similar environments (i.e., forests) appear to favor similar phenotypes (i.e., long tails) in two phylogenetic groups that are not closely related to each other (in terms of population differentiation within this species).
Furthermore, in two forest-prairie population pairs, one eastern and one western, we find that both pairs differ in the two components of the caudal skeleton that could vary to produce differences in tail length, namely the number of tail vertebrae and the length of those vertebrae. This coupling of vertebral number and vertebral length could be explained in two ways: (1) number and length of vertebrae are controlled by identical, or linked, regions of the genome, or (2) multiple genetic variants controlling number and length of vertebrae have been independently selected in eastern and western populations. To distinguish between these hypotheses, we examined F2 hybrid individuals from a laboratory intercross between a forest form, P. m. nubiterrae, and a prairie form, P. m. bairdii. If differences in the number and length of vertebrae are controlled by variants in the same region(s) of the genome, we expect them to be correlated in the F2 individuals. On the other hand, if the two morphological traits are under control of variants in different genomic regions, recombination should decouple these traits in the F2 generation, and we should detect no correlation between these traits. We found the latter: we detected no significant correlation between number and length of vertebrae in the F2. This result implies that eastern forest environments have favored alleles affecting differences in number and length at a minimum of two distinct genetic loci (i.e., one or more mutations that increase vertebrae number and elsewhere in the genome one or more mutations that increase vertebrae length). It may be that alleles affecting number and length were present in the source population from which the forest mice evolved, or that new mutations contributed to these distinct morphologies, or both. By identifying the alleles that influence tail length variation, future work can determine whether these locally adaptive alleles were selected from standing variation or have arisen independently from distinct mutations in eastern and western clades, and whether these mutations are shared in the eastern and western populations.
While vertebrae number and length evolved in tandem in our wild Peromyscus populations, this is contrary to what we may have expected based on artificial selection experiments. Rutledge et al. (1974) performed a selection study in which the authors selected for increased body length and tail length in replicate mouse strains. The authors found that, in two replicate lines selected for increased tail length, one line had evolved a greater number of vertebrae and the other evolved longer vertebrae. That the number and lengths of vertebrae can be genetically uncoupled may not be surprising, given the timing of processes in development that affect these traits. The process of somitogenesis, which creates segments in the embryo that presage the formation of vertebrae, is completed in the Mus embryo by 13.5 days of development (Tam 1981), while the formation of long bones does not begin until later in embryogenesis and skeletal growth continues well into the early life of the animal (Theiler 1989). Nonetheless, the contrasting result between natural and artificial selection studies suggests that it may be advantageous, biomechanically, to have both more and longer caudal vertebrae in wild populations that climb in forested habitats.
Together these data convincingly show that longer tails have evolved repeatedly in similar forested habitat, implicating a role of natural selection. Despite separate genetic mechanisms for number and length of vertebrae, both traits contribute approximately equally to the increase in tail length in both the eastern and western forest populations. Future work will explore the biomechanical implications of caudal vertebrae morphology on function (e.g., climbing), and ultimately fitness, as well as the underlying genetic and developmental mechanisms driving the repeated evolution of this adaptive phenotype.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's website: Figure S1. Neighbor-joining trees based on Euclidean distances in principal component space. Figure S2. Maximum Likelihood phylogeny of P. maniculatus individuals based on >14000 nuclear SNPs. Figure S3. Maximum Likelihood phylogeny of P. maniculatus individuals based on the mitochondrial CO3-ND3 fragment. Figure S4. Pairwise weighted F ST among population samples containing >3 individuals. Table S1. Samples genotyped in the genome-wide capture. Table S2. Samples genotyped at the mitochondrion.