Parallel evolution of site‐specific changes in divergent caribou lineages

Abstract The parallel evolution of phenotypes or traits within or between species provides important insight into the basic mechanisms of evolution. Genetic and genomic advances have allowed investigations into the genetic underpinnings of parallel evolution and the independent evolution of similar traits in sympatric species. Parallel evolution may best be exemplified among species where multiple genetic lineages, descended from a common ancestor, colonized analogous environmental niches, and converged on a genotypic or phenotypic trait. Modern North American caribou (Rangifer tarandus) originated from three ancestral sources separated during the Last Glacial Maximum (LGM): the Beringian–Eurasian lineage (BEL), the North American lineage (NAL), and the High Arctic lineage (HAL). Historical introgression between the NAL and the BEL has been found throughout Ontario and eastern Manitoba. In this study, we first characterized the functional differentiation in the cytochrome‐b (cytB) gene by identifying nonsynonymous changes. Second, the caribou lineages were used as a direct means to assess site‐specific parallel changes among lineages. There was greater functional diversity within the NAL despite the BEL having greater neutral diversity. The patterns of amino acid substitutions occurring within different lineages supported the parallel evolution of cytB amino acid substitutions suggesting different selective pressures among lineages. This study highlights the independent evolution of identical amino acid substitutions within a wide‐ranging mammal species that have diversified from different ancestral haplogroups and where ecological niches can invoke parallel evolution.

among species can arise from mutations on the same gene or different genes (Arendt & Reznick, 2008). The chance of parallel evolution occurring in natural populations depends on how many beneficial mutations are available from the ancestral sequence and the standing genetic variation present (Orr, 2005). It has been estimated using the expected probability of beneficial mutations that under natural selection, parallel evolution occurs two-thirds of the time compared to a scenario of neutral evolution (Orr, 2005). The probability of parallel evolution arising from standing genetic variation increases when natural selection is strong and particularly for genes that have a large phenotypic effect and large population sizes. In this situation, parallel evolution can occur across multiple loci. If selection is weak and population sizes are small, then parallel evolution is expected at only a few loci (MacPherson & Nuismer, 2017).
The identification of genes under selection can facilitate our understanding of parallel evolution, but functional diversity can often be hard to uncover in natural populations with limited genomic resources. Genes commonly under selection in the mitochondrial genome, such as cytochrome-b (cytB) or cytochrome oxidase I (COI), are limited in studies of nonmodel organisms because they often do not provide the resolution for intraspecific diversity or for delineating population boundaries (Kvie, Heggenes, & Røed, 2016). Although conserved, these genes can undergo bouts of mutational change within species (Castoe, Jiang, Gu, Wang, & Pollock, 2008) leading to increased genetic diversity upon which evolutionary forces such as selection can act, particularly in heterogenous or extreme environments (Da Fonseca, Johnson, O'Brien, Ramos, & Antunes, 2008).
This may lead to independent evolution of identical mutations that are retained in gene pools across the landscape where similar environmental conditions and selection pressures are present for the populations, thereby leading to regional parallel evolution. For example, parallel evolution of three populations of a dwarf ecotype that were each more closely related to the local tall ecotypes has been shown in the forest tree, Eucalyptus globulus (Foster et al., 2007).
Alternatively, retention of different mutations in diverse environments indicates different selection pressures that ultimately lead to divergent evolution. This has been shown within species suggesting that the presence of functional differences at the genetic level can cause phenotypic and/or behavioral differences (Foote et al., 2011).
Furthermore, introgressive hybridization, as an increasingly recognized evolutionary process for adding novel genomic components (Abbott et al., 2013;Hedrick, 2013;Seehausen, 2004), is another source where parallel changes would be predicted to occur on the introgressed genes; however, few studies have assessed signatures of parallel changes at such regions (Dowling et al., 2016;Meier et al., 2017).
North American caribou (Rangifer tarandus) demonstrate significant intraspecific variation with several subspecies, and ecotypes largely conforming to 12 recognized Designatable Units (DUs) having been identified based on ecological niches and calving strategies (COSEWIC 2011). One major identifier among ecotypes is displaying either sedentary or migratory behavior (COSEWIC 2011); for example, barren ground caribou undertake long annual migrations and can have seasonal home ranges as large as 111,000 km 2 (COSEWIC 2011;McDevitt et al., 2009;Nagy, Wright, Slack, & Veitch, 2005;Yannic et al., 2014), whereas woodland caribou have a restricted home range size, with annual population home range sizes from 200 to 1200 km 2 (Rettie & Messier, 2001) and in the summer can be as low as 17 km 2 (Wittmer, McLellan, Serrouya, & Apps, 2007). Additional subspecies of Rangifer also occur throughout Eurasia, where they are named reindeer. The glacial cycles during the Quaternary and suitable refugia led to the evolution of three distinct genetic lineages of Rangifer across their Holarctic distribution: the Beringian Eurasian lineage (BEL), the North American lineage (NAL) (Cronin, Macneil, & Patton, 2005;Flagstad & Røed, 2003;Klütsch, Manseau, & Wilson, 2012;Røed, Ferguson, Crête, & Bergerud, 1991;Yannic et al., 2014), and a more recently identified High Arctic lineage (HAL) (Klütsch, Manseau, Anderson, Sinkins, & Wilson, 2017). The NAL, with proposed refugial origins south of the Laurentide ice sheet, is further genetically subdivided into three haplogroups, the likely result of multiple, distinct glacial refugia within North America during the Last Glacial Maximum (Klütsch et al., 2012). Genetic introgression of the NAL and the BEL likely occurred after glacial retreat in North America, about 11,500 years ago, as there are mixed BEL and NAL haplotypes present in many boreal caribou herds (Klütsch, Manseau, Trim, Polfus, & Wilson, 2016;Klütsch et al., 2012).
There have been several studies characterizing the genetic diversity and population differentiation of North American caribou at neutral loci (Flagstad & Røed, 2003;Klütsch et al., 2012Klütsch et al., , 2016Kuhn, McFarlane, Groves, Mooers, & Shapiro, 2010;McDevitt et al., 2009;Weckworth, Musiani, McDevitt, Hebblewhite, & Mariani, 2012), but little work has been carried out to resolve differences among lineages at the level of functional genes that may convey adaptation to ecological niches or their behavior. Different selective pressures may exist across ecotypes and lineages, particularly in relation to energy requirements and demands facilitated by the mitochondrial genome for migratory or sedentary behavior. Our first objective was to characterize the functional genetic diversity of the cytB gene in divergent caribou and reindeer lineages. Previous genetic analyses of Rangifer using the cytB gene have shown greater genetic diversity in the BEL compared to the NAL (Cronin et al., 2005;Yannic et al., 2014); therefore, we predict greater functional diversity in the BEL.
To test this hypothesis, we characterize nonsynonymous changes within cytB and test for significantly radical amino acid substitutions. Our second objective was to use the presence of two phylogenetic lineages (BEL and NAL), three sublineages (NAL1, NAL2, NAL3), and introgression between the BEL and the NAL to provide a means of testing for the parallel evolution of site-specific changes within lineages that may inform on the various selective pressures unique to each lineage. We define parallel evolution as the occurrence of site-specific changes that occur within ecotypes that may belong to different lineages or sublineages and those changes arose before lineages contacted after the glacial retreat. To test for parallel evolution, we reconstructed the phylogeny based on cytB to investigate whether mutations have evolved independently multiple times in caribou and reindeer lineages and when in time those mutations occurred.

| MATERIAL S AND ME THODS
Cytochrome-b haplotypes for Rangifer tarandus were retrieved from the National Center for Biotechnology Information (NCBI) from three independent studies (Cronin, Macneil, & Patton, 2006;Cronin et al., 2005;Yannic et al., 2014) representing samples from Eurasia and North America (Table S1). Additional samples from North America, originally reported in Klütsch et al. (2012), were sequenced at the cytB gene (Table S2). Using the Klütsch et al. (2012) samples, we can identify which mitochondrial control region haplogroup each sample belongs to, either the BEL or one of the three NAL haplogroups: NAL1, NAL2, or NAL3, originally defined by Klütsch et al. (2012). DNA was extracted by swabbing the mucosal layer of fecal pellets and the use of a Qiagen DNAeasy (Toronto, Ontario) extraction kit; for more detailed extraction methods, see Ball et al. (2007). Amplified products were run on a QiAxcel Electrophoresis system for quantification, and all samples were diluted to 2.5 ng/100 base pairs. Amplified DNA was purified using the New England BioLabs Exonuclease I and Antarctic Phosphotase. The ABI BigDye Terminator sequencing reaction kit was used following the manufactures protocols and products were run on an ABI3730.
Sequence results were analyzed and aligned to the Rangifer tarandus mitochondrial genome reference sequence (NC_007703.1; Wada, Nakamura, Nishibori, & Yokohama, 2010) in Geneious v6.0.5 (Kearse et al., 2012). Ambiguous reads resulting from either low amplitude or double peaks were either resequenced or excluded from subsequent analyses. Unique haplotypes were identified using DnaSP v5 (Librado & Rozas, 2009)  The total haplotype alignment, including all reindeer and caribou sequences retrieved from NCBI, and the newly generated caribou cytB sequences were tested for deviations from neutrality using Tajima's D statistic (Tajima, 1989). The accumulation of deleterious mutations in cytB can have severe repercussions for fitness (Hill, Chen, & Xu, 2014;Peischl, Dupanloup, Kirkpatrick, & Excoffier, 2013) and therefore, significant purifying selection was the ex- replicates. The substitution model was set to the Tamura-Nei model (Tamura & Nei, 1993), as predicted by a corrected Akaike Information Criterion (AICc) analysis in jModel Test v6.02 (Darriba, Taboada, Doallo, & Posada, 2012;Guindon & Gascuel, 2003). As some nonsynonymous substitutions occur across genetic lineages or on different clades of the minimum evolution tree, a Bayesian phylogenetic analysis was used to give approximate dates for those nonsynonymous substitutions. In this way, it can be determined whether the substitutions occurred after the retreat of the ice sheet and in which genetic lineage the substitution first occurred. This analysis was performed in the program BEASTv1.8.4 (Drummond, Suchard, Xie, & Rambaut, 2012) using a strict clock model, the same substitution model as was estimated for the minimum evolution tree, and the caribou cytB-specific mutation rate from Yannic et al. (2014). The Markov Monte Carlo Chain was run for 300,000,000 steps with sampling every 1,000 steps. The program Tracer v1.5 (Rambaut, Suchard, Xie, & Drummond, 2014) was used to assess convergence of the parameters and the program FigTree v1.4.3 (Rambaut, 2016) was used to visualize the resulting tree. A mtCR minimum joining haplotype network (Bandelt, Forster, & Röhl, 1999) was assembled using the mtCR haplotype data from Klütsch et al. (2012) in the program PopART (Leigh & Bryant, 2015). The nonsynonymous cytB substitutions were overlaid onto the mtCR network to visually assess evolutionary relationships among those sequences with nonsynonymous substitutions.

| RE SULTS
A total of 232 samples that have a known mtCR haplotype (Klütsch et al., 2012) Table   S1) and aligned with the cytB sequences from this study. After alignment, all cytB sequences were trimmed to 1,062 bp, a length that included the majority of the cytB gene except for the first 25 amino acids. A total of 144 unique cytB haplotypes were present, of which 27 were distinct from previously published cytB sequences (Table   S2).
The Ka/Ks ratio for the BEL averaged 0.02 and ranged from 0.00 to 0.34 and the Ka/Ks ratio for the NAL averaged 0.23 and ranged from 0.00 to 0.97. Note that the HAL, which includes Peary caribou, had the same cytB haplotype as other barren-ground samples (BEL); therefore, the HAL results will not be reported separately, and samples of this lineage will be included with the BEL. There were a total of 27 nonsynonymous substitutions; 12 were observed in only one individual and hence were assumed to be deleterious and likely to be removed quickly from a population (Kimura, 1983; Eyre-Walker and Keightley 2007) and were therefore not included in further analyses. Of the nonsynonymous substitutions that were observed more than once, there were several that occurred in more than five individuals, including  Table 1). Seven of the 16 nonsynonymous changes that occurred in more than one individual changed the polarity of the amino acid (Table 1).
TreeSAAP was then used to identify those substitutions that had a significant radical change in the amino acids property. The Yannic et al. (2014) sequences were used as training data. An initial analysis was performed with no sliding window, eight categories of radicality, and all physiological properties. This initial   Of all the nonsynonymous changes, only two are exclusive to a mtCR haplotype: Ile19Val is associated with mtCR haplotype H66 and Leu241Met is associated with mtCR haplotype H1. The remaining nonsynonymous changes occur within cytB sequences associated with multiple mtCR haplotypes.

| D ISCUSS I ON
The first objective was to detect functional differences among caribou lineages in the cytB region of the mitochondria. Despite the BEL having a greater number of cytB haplotypes (Cronin et al., 2005(Cronin et al., , 2006Yannic et al., 2014), the NAL had a proportionally large number of nonsynonymous amino acid substitutions and thereby functional diversity compared to the BEL ( Table 2). The second objective was to use phylogenetic methods to test for the parallel evolution of nonsynonymous changes among genetic lineages. The Ala246Thr substitution arose independently in the NAL1, the NAL2, and the BEL suggesting parallel formation and retention of this substitution F I G U R E 1 A minimum evolution tree of all cytB haplotypes with bootstrap support greater than 75% on the node labels. The red box highlights those haplotypes from the North American Lineage (NAL) ancestry. The nonsynonymous changes are overlaid onto corresponding branches with those changes identified as having a significant radical effect on the protein structure represented by a star F I G U R E 2 A Bayesian phylogenetic tree of all cytB sequences, including those retrieved from GenBank. The tree root age with 95% HPDI in brackets is displayed on the root node. The colored hexagon shapes are located on those nodes in the tree in which a nonsynonymous substitution occurred that could be dated (date in years before present with 95% HPDI to the right of each symbol). The purple symbol represents the Ile19Val substitution, orange is the Thr309Met substitution, yellow is the Ala246Thr substitution, and pink is the Leu237Phe substitution.

| Evidence of parallelism
When small fractions of an ancestral population colonize new areas, the greater the background level of standing genetic variation, the higher likelihood for the divergence and fixation of novel genotypes or phenotypes that may only occur at low levels in the ancestral source (Elmer & Meyer, 2011). In caribou, the BEL likely represents the ancestral source for North American caribou and it is characterized by a greater level of genetic diversity compared to the NAL. Most of the nonsynonymous changes present in the cytB gene are found in the BEL; however, only in low frequencies. The same nonsynonymous changes are driven to a higher frequency in the NAL, which can be due to a variety of factors including neutral and selective processes (Nielsen, 2005). It was also estimated that the substitutions in the NAL are older indicating these mutational changes occurred before the NAL and BEL lineages could have North American ice sheet. Introgression between these lineages at that time (Klütsch et al., 2016) may have been the driver of the parallel evolution of this nonsynonymous change. Amino acid substitutions among many species classes overwhelmingly arise in parallel (Rokas & Carroll, 2008), and the parallel evolution of amino acid substitutions has been documented in a wide variety of species including mammals (Stewart, Schilling, & Wilson, 1987;Yeager & Hughes, 1999), fish (Jost et al., 2008;Yokoyama & Yokoyama, 1990), birds (Kornegay, Schilling, & Wilson, 1994), and amphibians (Malyarchuk et al., 2010).
The most widespread example of parallel evolution is the Ala246Thr substitution. The Ala246Thr substitution, based on the topology of the phylogenetic tree arose twice within the NAL, in two F I G U R E 3 A minimum spanning network of the mitochondrial control region (mtCR) haplotypes that correspond to each sample sequenced in the present study at the cytB gene. The size of each circle is proportional to the number of sequences with that haplotype, the small black dots are hypothetical haplotypes, each line represents one mutational step, and the dashed lines are alternative connections between haplotypes. Haplotype number (H) corresponds to the mtCR haplotype designation by Klütsch et al. (2012) and circles are colored based on the haplogroup. CtyB nonsynonymous changes are labeled by a triangle next to each haplotype with those changes identified as having a significant radical effect on the protein structure represented by a star  (Klütsch et al., 2012). The presence of multiple glacial refugia for wide-ranging and highly migratory mammal species has been shown previously (Shafer, Cullingham, Côté, & Coltman, 2010;Sim, Hall, Jex, Hegel, & Coltman, 2016), which supports the assertion of different glacial refugia for NAL1 and NAL2. There are two scenarios that could likely explain the widespread presence of the Ala246Thr substitution in the NAL1 and NAL2 haplogroups. First, the mutation may be present in the ancestral population in low frequency, but driven to fixation after diversification of a new lineage(s) in a novel environment (Elmer & Meyer, 2011). The Ala246Thr substitution is found in one individual from the BEL; the geographic origin of that sample is unknown, but this substitution in the BEL occurred too recently to be aged using the Bayesian phylogenetic method suggesting this is not the ancestral source of this substitution. If this were true, it would be expected that most, if not all the individuals belonging to the NAL1 and NAL2 lineages would possess this mutation due to the proliferation of the substitution from a small subset of the ancestral population. The second scenario is that this nonsynonymous change arose in parallel in the NAL1 and the NAL2 haplogroups. This scenario is more likely because first, the nonsynonymous changes are located on different branches of the tree with ancestral-type sequences at the nodes of these branches. Second, during sequence evolution, amino acid changes occur at single amino acid residues, two-amino acid, and three-amino acid motifs (Yeager & Hughes, 1999). If the nonsynonymous change resulted from shared ancestry, there should be an equal number of one, two and three-amino acid residue changes. Conversely, a larger number of single amino acid polymorphisms are indicative of parallel evolution, as these types of changes are more easily evolved (Yeager & Hughes, 1999). We observe one amino acid change within the cytB region, supporting the evidence for parallel evolution at the amino acid level within caribou.

| Divergence within and among lineages
Almost half (44%) of the nonsynonymous changes observed in caribou changed the amino acid's polarity and two theoretically cause changes to the proteins physicochemical properties. The amino acid substitutions associated with changes in the protein's alpha-helical tendencies (Branden & Tooze, 1999) were found associated with the boreal and eastern-migratory caribou ecotypes. Cytochrome-b functions as a proton pump formed by an alpha-helical channel along the inner membrane of the mitochondria (Da Fonseca et al., 2008).
It cannot be determined based on the available data if these amino acid changes correlate to changes in the function of the protein or if the change is neutral (Storz, 2016). Based on the broad distribution and lineage consistency of these significant changes, this could suggest that these substitutions represent true functional variants between caribou lineages; however, additional molecular biological based analysis would be required to confirm this as there was no statistical evidence for positive selection. Barren-ground caribou (BEL) undertake large annual migrations (COSEWIC 2011;McDevitt et al., 2009;Yannic et al., 2014) which would reasonably entail high metabolic demands. Optimal efficiency may therefore be favorable among migrating subspecies which is supported by the significant purifying selection as indicated by Tajima's D. Radical functional variants in cytB may arise through a relaxing of selective pressures (Gering, Opazo, & Storz, 2009;McClellan et al. 2005) or when inefficiencies confer a benefit (Ballard & Pichaud, 2014). Woodland caribou are more sedentary compared to barren-ground caribou (McDevitt et al., 2009), and different metabolic demands may allow selectively neutral or even beneficial functional variants to have evolved repeatedly. To infer causality between the substitutions and protein function among caribou ecotypes, additional work beyond the scope of this study is needed. Genome wide studies are beginning to reveal that there are several hundred genes that may be responsible for phenotypic differences among ecotypes (Bernatchez et al., 2010) and therefore, genome wide scans will likely be necessary to completely elucidate the adaptive genetic differences among caribou ecotypes.

| CON CLUS IONS
Overall, we observed high levels of genetic diversity at the cytB gene in the BEL haplogroup of caribou, but greater functional diversity in the NAL compared to the BEL due to the parallel evolution and retention of lineage-specific amino acid substitutions. Despite the poor resolution of the cytB gene for phylogenetic inferences of caribou (Kvie et al., 2016), variability in protein diversity is present and provides important information regarding the evolution of lineages and potential selection to the local environment. This may have important implications for the future protection and management of caribou, particularly when trying to determine genetically distinct units, as the difference in functional variability might need to be considered in conjunction with neutral genetic structure. This study demonstrates that despite belonging to the same evolutionary lineage, environment and/or location can drive the parallel evolution of nonsynonymous changes within functional genes, adding a new aspect to consider when assigning management criteria, such as designatable units.

ACK N OWLED G M ENTS
We would like to thank Jill Lalor, Bridget Redquest, and Marina Kerr for technical assistance in the DNA laboratory at Trent University.
We are thankful for collaborators who collected fecal samples for this project including Ontario Ministry of Natural Resources,

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
AM and CFCK were responsible for DNA sequencing. RLH and AM performed data analysis. MM, KA, and PJW were supervisors on the project. AM, RLH, MM, and PJW wrote the manuscript. CFCK, BG, and AM provided manuscript edits.