Population genetic isolation and limited connectivity in the purple finch (Haemorhous purpureus)

Abstract Using a combination of mitochondrial and z‐linked sequences, microsatellite data, and spatio‐geographic modeling, we examined historical and contemporary factors influencing the population genetic structure of the purple finch (Haemorhous purpureus). Mitochondrial DNA data show the presence of two distinct groups corresponding to the two subspecies, H. p. purpureus and H. p. californicus. The two subspecies likely survived in separate refugia during the last glacial maximum, one on the Pacific Coast and one east of the Rocky Mountains, and now remain distinct lineages with little evidence of gene flow between them. Southwestern British Columbia is a notable exception, as subspecies mixing between central British Columbia and Vancouver Island populations suggests a possible contact zone in this region. Z‐linked data support two mitochondrial groups; however, Coastal Oregon and central British Columbia sites show evidence of mixing. Contemporary population structure based on microsatellite data identified at least six genetic clusters: three H. p. purpureus clusters, two H. p. californicus clusters, and one mixed cluster, which likely resulted from high site fidelity and isolation by distance, combined with sexual selection on morphological characters reinforcing subspecies differences.


MACFARLANE Et AL.
Not all barriers are large. Barriers can be small, such as roads (Clark et al., 2010) and agriculture fields (Bush et al., 2011). Behavior, including strong natal site fidelity, migratory routes, foraging sites, or mate choice (Friesen et al., 2007), can lead to reproductive isolation.
The purple finch (Haemorhous purpureus) is a North American songbird with two recognized subspecies: H. p. purpureus to the east of the Rocky Mountains and H. p. californicus on the West Coast ( Figure 1). The two subspecies exhibit morphological, plumage, vocal, and behavioral differences, in addition to variation within the subspecies (Wootton, 1996). The only genetic work conducted on the purple finch (Marten & Johnson, 1986) supports the current taxonomy, but is limited in both the number of sample sites (n = 2) and sample sizes (n = 2 and 15). Using a more representative sampling regime covering more of the purple finch's distribution, and several molecular makers, the aim of this study was to examine the evolutionary history and the effects of dispersal barriers on contemporary genetic patterns.
We examined three hypotheses related to purple finch evolution. First, we hypothesize that given their wide distribution across North America and large number of morphological differences between subspecies, the two subspecies of purple finch have been isolated for a prolonged period of time and likely survived the LGM in separate refugia. The Atlantic Shelf was a refugium for a number of species, including tree species upon which purple finch rely for food and nesting (Brubaker et al., 2005;Walter & Epperson, 2005).
We predict the Atlantic Shelf and areas south of ice sheets as candidate refugia for the purple finch given their current distribution.
F I G U R E 1 Assignment of individuals based on BAPS analysis of mtDNA (a) and z-linked (b) loci and STRUCTURE analysis of microsatellite loci (c). Mitochondrial locus (a) shows H. p. californicus (purple) and H. p. purpureus clusters (orange and green). Z-linked locus (b) shows four clusters corresponding to a predominantly H. p. californicus group (blue), a widespread (green), H. p. purpureus (yellow), and a cluster with no obvious geographic clustering (gray). Microsatellites (c) show two eastern maritime clusters (orange and yellow), a central continental cluster (green), a western H. p. purpureus and VanIs cluster (teal), two H. p. californicus clusters (purple and blue). Figure c includes all sampling sites. Map shading depicts the breeding ranges of H. p. purpureus (light grey) and H. p. californicus (dark grey) Second, with the morphological and behavioral differences and relatively small contact zone between subspecies, we hypothesize no or low levels of gene flow between subspecies. We predict a large number of genetic differences between subspecies, and little haplotype/allele sharing. Third, based on findings from other studies, we hypothesize that contemporary barriers, that is, bodies of water and geographic distance, isolate H. purpureus populations. The straits separating both Newfoundland and Vancouver Island from the mainland correspond to patterns of genetic isolation in other taxa with similar dispersal mechanisms (Holder et al., 2004;Lait & Burg, 2013).
We therefore predict populations on either side of these boundaries will be genetically distinct. Given the species large range, we predict isolation by distance. Lastly, we predict higher levels of population differentiation in the nonmigratory H. p. californicus than the migratory H. p. purpureus, as many sedentary species have been shown to exhibit low dispersal rates (Barrowclough, Groth, Mertz, & Gutierrez, 2004;Burg, Gaston, Winker, & Friesen, 2005;Spellman, Riddle, & Klicka, 2007). 12-m mist nets and playback recordings. Sampling was restricted to summer to reduce the number of migrants caught. A small (<50 μl) blood sample was taken from the brachial vein and the birds were banded and released on site. Blood was stored in 99% ethanol.

| Mitochondrial and Z chromosome DNA amplification and sequencing
A 707-bp fragment of the ATP region of the mitochondrial genome, hereafter referred to as ATP6, containing both ATP 6 (271 bp) and ATP 8 (445 bp) was amplified using primers L8950 (CCAACCACAGCTTCATACCA) and H9694 (GCTAGTGGGCGGAT GAGTAG) for 243 samples. The thermal cycling profile was one cycle of 120 s at 94°C, 45 s at 54°C, 60 s at 72°C; 37 cycles of 30 s at 94°C, 45 s at 54°C, and 60 s at 72°C; and a final cycle of 5 min at 72°C.
A 515-bp fragment of the AldB region of the Z chromosome was amplified in 178 samples for a total of 306 alleles, using primers AldB6-F and AldB8-R (Hackett et al., 2008). This region on the Z chromosome was amplified using a similar thermal cycling program as the ATP6 reactions with the exception of a 50°C annealing temperature, 2.0 mmol/L MgCl 2 , and crimson buffer (New England Biolabs).
Successfully amplified samples were sent to NanuQ sequencing service at McGill University, Montreal, Quebec. Sequences were checked and aligned using MEGA v. 5 (Tamura et al., 2011).

| Sequence data
AldB sequences from all known male birds were run through PHASE v. 2.0 (Stephens & Donnelly, 2003) in DnaSP v. 5.1 (Librado & Rozas, 2009) to account for the presence of two Z chromosome sequences in males. Four individuals from NS were removed due to highly dissimilar z-linked sequences, although their mitochondrial sequences matched other purple finches. Haplotype (h) and nucleotide diversity (π) indices were calculated in DnaSP 5.1 (Librado & Rozas, 2009). We Genetic differentiation between sampling sites was examined using F ST values calculated in gENALEx v. 6.5 (Peakall & Smouse, 2006). Due to small sample size (n ≤ 6), LAB, CoOR, and VanIs sites were removed from all pairwise F ST analysis. p-values were corrected using the modified false discovery rate (FDR) (Benjamini & Hochberg, 1995).
AMOVAs were performed in Arlequin v. 3.5.1.3 to test for the diversity found between the two subspecies, among the populations, and within the populations (Excoffier & Lischer, 2010). Principal coordinate analysis (PCoA) on Φ ST and F ST values in gENALEx for mitochondrial DNA and z-linked loci, respectively, allowed us to quantify the variance in our data.
A Bayesian clustering program, BAPS v. 5.2, was used to group individuals as BAPS forms clusters with no a priori defined populations.
The relationships between all haplotypes were examined using statistical parsimony networks created in TCS v. 1.21 (Clement, Posada, & Crandall, 2000) for both mtDNA and z-linked loci. All connections were made with a 95% connection limit.

| Microsatellite data
Amplified DNA was run on an acrylamide gel with a ladder and controls on the NEN Model 4300 DNA Analyzer (Licor BioSciences). Individuals missing more than three of the seven loci (n = 3) were removed from the analyses. A total of 260 individuals from all 10 populations were successfully genotyped at the seven loci. Populations and loci were checked for deviations from Hardy-Weinberg equilibrium and linkage disequilibrium with GENEPOP (Rousset, 2008), and evidence of large allele dropout, null alleles, and genotyping errors were examined using Microchecker (Van Oosterhout, Hutchinson, Wills, & Shipley, 2004).
Genotype data were run through the Bayesian clustering program STRUCTURE v. 2.3.4 to determine the number of clusters found in the data (Pritchard, Stephens, & Donnelly, 2000). Because this type of analysis uses individuals and is not as sensitive to levels of sampling within populations, LAB, CoOR, and VanIs were included. A burn-in of 50,000 and MCMC length of 150,000 were used with admixture model and loc priors. Ten iterations were performed for each cluster (K) 1-10. Structure Harvester v. 0.6.94 (Earl & vonHoldt, 2012) was used to calculate average likelihoods for each value of K and implement the Evanno method (Evanno, Regnaut, & Goudet, 2005). ∆K and Ln P(D) were used to determine the optimal number of clusters present. As suggested by Pritchard et al. (2000), a hierarchical analysis was performed by rerunning the analysis for each cluster to test for further structure.

| Isolation by distance
To clarify whether genetic differentiation is a result of geographic distance, Mantel tests were performed comparing genetic and geographic distances to test for isolation by distance. Tests were completed for all three datasets in gENALEx v. 6.5 using shortest geographic distances between sampling sites through suitable forested habitat.

| Spatio-geographic modeling
Spatio-geographic modeling was used to identify potential Pleistocene refugia. Occurrence information for the purple finch (>500,000 occurrences) was obtained from the Global Biodiversity Information Facility. After removing points with no UTMs, potential outliers (outside the expected range), duplicated data, samples from migratory periods, or any data from potentially unreliable sources, 5,710 points remained. Five of the 19 climate layers with ≥0.9 threshold in ENMTools (Warren, Glor, & Turelli, 2008) were used. The climatic layers included maximum temperature of warmest month, mean temperature of warmest quarter, mean temperature of coldest quarter, precipitation of wettest month, and precipitation of driest month. purpureus and the resident H. p. californicus were also run separately in case their separate distributions influenced the models, but these did not change the projections and therefore were not included.
MaxEnt was then used to model possible distributions during the LGM (21,000 years ago) and LIG (120,000-140,000 years ago).

| ATP6 sequence data
Diversity indices for the ATP6 locus varied. Excluding populations with fewer than four individuals, haplotype diversity ranged from 0.270 (SCA) to 0.899 (ND), and nucleotide diversity ranging from 0.0004 (SCA) to 0.0056 (NS) ( Table 1). The average nucleotide difference between subspecies was 2.4%. A coalescence time of 1.1 MYA was calculated using the fixed differences in the mitochondrial dataset between the two subspecies.
The ATP6 parsimony haplotype network contained a total of 56 haplotypes. Nineteen of the haplotypes were found in more than one individual and four of these were restricted to a single population. Of the 56 haplotypes, 44 were found in a single individual or population ( The significant differentiation between subspecies was supported by AMOVAs. The majority of the variance for the mitochondrial locus was explained by genetic differentiation between subspecies variation (90%), while 9% was explained by differences between individuals, and just 1% of the variation between populations.
Principal coordinate analysis for ATP6 showed coordinate 1 explained most of the variation (87.3%), while coordinates 2 (8.9%) and 3 (3.2%) explained the rest. The first two coordinates explained 96% of the variation and clustered geographically proximal populations together ( Figure 3). BAPS identified three clusters for the mitochondrial locus, one H. p. californicus cluster and two predominantly H. p. purpureus clusters (Figure 1a). The two H. p. purpureus clusters showed an east-west gradient with one cluster predominantly occurring in the west and decreasing in frequency to the east, and an eastern cluster that is less common in the west.

| AldB sequence data
The final AldB dataset contained 306 sequences. The z-linked chromosome showed lower diversity values than the mitochondrial ATP6, with a range of haplotype diversity values for populations of n > 4 from 0.086 in SCA to 0.837 for ND. Nucleotide diversity ranged from 0.0005 in SCA to 0.0069 in ND (Table 1). CBC had one of the highest haplotype diversities (h = 0.816) ( Table 1).
The z-linked locus shows more haplotype sharing between the two subspecies, but generally separates the subspecies, with the exception of four CoOR, one SCA, and four CBC individuals (Figure 2b). A total of 47 haplotypes were found at this locus. Thirty-six haplotypes were restricted to one sampling site, including 33 that were found in a single individual (Table 3). CBC had the highest overall number of haplotypes (16) and the highest number of shared haplotypes (11) ( Table 3).  The AMOVA from AldB showed 55%, 43%, and 2% of the variation is explained by differences between subspecies, within populations, and among populations, respectively. For AldB, PCoA coordinates 1 and 2 explained 90% and 7% of the variation (Figure 3)

| Genotype data
None of the seven microsatellite loci showed significant effects of large allele dropout or stuttering causing scoring errors for any of the populations.  Coordinates 1 and 2 explained a total of 70% of the variation (45% and 25%, respectively), while coordinate 3 explained an additional 16% (Figure 3c). Unlike the mtDNA and z-linked loci, there is no obvious clustering in the PCoA (Figure 3c)

| Spatio-geographic modeling
The model resulting from running nine climatic layers and over 5,710 occurrence locations performed significantly better than random.
The average omission rate for this model closely resembled the pre-  T A B L E 3 List of all haplotypes, broken down by population for the mitochondrial locus ATP6 (a) and z-linked locus AldB (b) Unique haplotypes are found in one individual, private haplotypes are found in a single population. refugia south of the ice sheets, MaxEnt identified suitable habitat off the coast of Newfoundland. LIG projections showed isolation between eastern and western areas of suitable habitat (Figure 5c).

| Multiple refugia
Our results, particularly the AldB and ATP6 loci, provided support for the hypothesis that the two subspecies likely experienced prolonged  (Table 1).
ATP and AldB data give a clearer picture of historic distributions of H. p. californicus. The lower genetic diversity in SCA and the high nucleotide and haplotype diversity values in WA (Table 1) (Table 1). Low diversity indices can result from bottlenecks or founder effects, and are more common along the periphery of a species range than in the center (Bush et al., 2011;Grus et al., 2009).  (Magee, 1924) and New York (Yunick, 1987) show 56.8% and 23.6%, respectively (n = 1,770), of banded birds return to breed in the same area. Similar patterns of low population structure of highly migratory species are found in northern pike (Esox lucius) (Miller et al. 2001) and
Significant correlations were found between geographic and genetic distance in mtDNA and z-linked loci, suggesting that physical distance could also restrict gene flow. Other barriers also play a role as some populations, WA and CoOR, that are geographically close are genetically differentiated. No obvious physical barriers exist between the WA and CoOR sites; however, contemporary gene flow could be restricted between these two populations due in part to plumage differences. Duvall (1945) noted a unique group of Pacific Northwest birds which he termed Carpodacus purpureus rubidus (not recognized as a subspecies by the American Ornithologists Union), in which males have darker heads, backs, rumps, and anterior lower parts; females are a darker olive. Both sexes lack mottling in the upper back and nape found in other nearby populations (Duvall, 1945). Similar morphological differences are present between the two recognized subspecies.
H. p. californicus females are more olive green in plumage than H. p.
Behavior also plays a role in isolation, as behavioral differences exist with respect to song. The songs of the two subspecies are distinct, and the H. p. purpureus song is more variable (Wootton, 1996). As song is important in mate selection, it may reinforce reproductive isolation similar to assortative mating in Ficedula flycatchers (Qvarnström et al., 2010).
BARRIER did identify one barrier, which separates East Coast populations (NL and NS) from other populations. Although no obvious physical barrier is present, other studies found population structure in both animal and plant species in eastern Canada. Rueness et al. (2003) found reduced gene flow between eastern and western populations of Canada lynx (Lynx canadensis) south of James Bay, and in jack pine Our study has shown how the evolution of the purple finch has been influenced by a number of evolutionary processes and both historical and contemporary distributions. Prolonged isolation of the two subspecies was maintained during the last glacial maximum and contemporary physical barriers, and behavioral barriers including site fidelity, migration, and assortative mating, have maintained patterns of low gene flow between the two subspecies. Some secondary contact was found in a small portion of their range, although it was limited.
The purple finch's evolutionary history demonstrates how population differences shape behavior and response to barriers to dispersal can influence evolution and, ultimately, speciation.