Selective constraints in cold‐region wild boars may defuse the effects of small effective population size on molecular evolution of mitogenomes

Abstract Spatial range expansion during population colonization is characterized by demographic events that may have significant effects on the efficiency of natural selection. Population genetics suggests that genetic drift brought by small effective population size (N e) may undermine the efficiency of selection, leading to a faster accumulation of nonsynonymous mutations. However, it is still unknown whether this effect might be balanced or even reversed by strong selective constraints. Here, we used wild boars and local domestic pigs from tropical (Vietnam) and subarctic region (Siberia) as animal model to evaluate the effects of functional constraints and genetic drift on shaping molecular evolution. The likelihood‐ratio test revealed that Siberian clade evolved significantly different from Vietnamese clades. Different datasets consistently showed that Siberian wild boars had lower Ka/Ks ratios than Vietnamese samples. The potential role of positive selection for branches with higher Ka/Ks was evaluated using branch‐site model comparison. No signal of positive selection was found for the higher Ka/Ks in Vietnamese clades, suggesting the interclade difference was mainly due to the reduction in Ka/Ks for Siberian samples. This conclusion was further confirmed by the result from a larger sample size, among which wild boars from northern Asia (subarctic and nearby region) had lower Ka/Ks than those from southern Asia (temperate and tropical region). The lower Ka/Ks might be due to either stronger functional constraints, which prevent nonsynonymous mutations from accumulating in subarctic wild boars, or larger N e in Siberian wild boars, which can boost the efficacy of purifying selection to remove functional mutations. The latter possibility was further ruled out by the Bayesian skyline plot analysis, which revealed that historical N e of Siberian wild boars was smaller than that of Vietnamese wild boars. Altogether, these results suggest stronger functional constraints acting on mitogenomes of subarctic wild boars, which may provide new insights into their local adaptation of cold resistance.


| INTRODUC TI ON
Evaluating the relative contributions of selection and genetic drift in shaping genome evolution is one of the central issues in population and evolutionary biology. Although previous studies have suggested the complementary effects of selective constraints and genetic drift on the molecular evolution of mtDNA (Mitterboeck & Adamowicz, 2013;Shen, Shi, Sun, & Zhang, 2009), there is still a lack of compelling cases in which opposite effects of genetic drift and selective constraints are observed. It is feasible to differentiate their effects by comparing populations with different effective population sizes (N e ) and natural selection intensity. Population genetics has predicted that the efficacy of purifying selection on mutations is sensitive to N e (Kimura, 1985). In particular, population with a smaller N e would be less effective in removing slightly deleterious mutations in comparison with population with a larger N e , resulting in a faster accumulation of nonsynonymous mutations (Ohta, 1973). To evaluate the relative effects of genetic drift and selective constraints, it is also important to choose a proper genetic system. In this study, mitogenomes were used, as they have only 1/4 N e that of nuclear genes (Hard & Clark, 1989). The relatively lower N e might facilitate the detection of selection pressure based on the inference of population genetics. In addition, in terms of gene functions, mitochondria are well identified as the power station of cells for thermogenesis and thermoregulation upon which the basic physiological processes are possible (Block, 1994;Rowland, Bal, & Periasamy, 2015;Skulachev, 1999). Owing to these strong functional constraints in mitochondria, the evolution of mtDNA is maintained mainly by purifying selection to remove the accumulation of deleterious nonsynonymous mutations (Sun, Shen, Irwin, & Zhang, 2011). Hence, it is appropriate to use mtDNA as a model to examine the relationship between the effects of genetic drift and functional constraints.
In general, the tempo of molecular evolution may be shaped by selective constraints and N e (Kimura, 1985). As most nonsynonymous mutations are (or at least) slightly deleterious (Kryukov, Pennacchio, & Sunyaev, 2007;Mezmouk & Ross-Ibarra, 2014), the metric of Ka/ Ks, representing the proportion of nonsynonymous mutations relative to synonymous mutations, has commonly been used to indicate the efficacy of selection (Hughes, 2013;Wang et al., 2011). In particular, for mitogenomes, which are mainly subject to purifying selection (Ka/Ks < 1) against deleterious mutations (Stewart, Freyer, Elson, & Larsson, 2008), higher Ka/Ks ratios in a population indicate a faster accumulation of slightly deleterious mutations and less efficiency of selection (Björnerfeldt, Webster, & Vilà, 2006;Wang et al., 2011). Previous studies have found ample cases in which the relaxation of selective constraints and/or reduction in N e result in the faster accumulation of slightly deleterious mutations. For example, the birds and insects with weaker locomotive abilities have higher Ka/Ks in their mitogenomes than those with stronger locomotive abilities (Mitterboeck & Adamowicz, 2013;Shen et al., 2009). Domestic dog and yak also evolve under higher Ka/Ks in mitochondrial genes compared with their wild counterparts (Björnerfeldt et al., 2006;Wang et al., 2011). These observations have been interpreted by either the relaxation of functional constraints on mitochondrial genes or the reduction in N e (Björnerfeldt et al., 2006;Mitterboeck & Adamowicz, 2013;Shen et al., 2009;Sun et al., 2011;Wang et al., 2011). However, hitherto very few evidence has been reported to show which factor, for example, selective constraints or N e , would have larger effects on the tempo of molecular evolution, if the effects of selection and N e on molecular evolution are opposite.
In this study, we chose wild boar (and free-range domestic pigs; Sus scrofa) as a model (Figure 1), due to not only their complex natural and artificial selection history, but also their demographic changes along the process of selections. As one of the most successful animals in spreading from Southeast Asia to Eurasia, wild boars are well adapted to various different ecological conditions and environments, including torrid tropical areas, humid forests, arid semideserts, and freezing subarctic winters (Rothschild & Ruvinsky, 2011).
Unlike tropical areas, cold regions may impose more stringent evolutionary constraints on mitochondria due to its important role in thermal physiology, leading to the expectation of a slower accumulation of slightly deleterious mutations in mitochondrial genes. However, considering the effect of "Out-of-Southeast Asia" dispersal or limited gene flow due to geographic isolation, N e (s) of wild boars in cold regions are expected to be smaller. This rationale may lead to an opposite inference that purifying selection in cold regions might be less effective to remove deleterious mutations than that in tropical areas due to genetic drift. Therefore, wild boar may serve as an intriguing Here, the mitogenomes of wild boars and free-range local domestic pigs from a tropical region (Son La province, Vietnam) and three subarctic regions in Siberia (Tyva, Zabaykalsky Krai, and Novosibirsk) were sequenced and compared. The major objectives of this study include the following: (a) based on phylogenetic structure of both Vietnamese and Siberian wild boars (and local pigs) to evaluate the potential differences in tempos of accumulation of slightly deleterious mutation; and (b) to evaluate the causes of these differences in the context of selective constraints and N e .

| DNA samples and sequencing of mitogenomes
In this study, we sequenced the mitogenomes of wild boars and free-range domestic pigs representing geographic samples spanning the northern-and southern-most ranges of nonintrogressed wild boars and local pigs, with a latitude difference of over 30°.
Nine wild boars and eight free-range pigs from the tropical Vietnam (Son La province, ~20°N) and seven wild boars from subarctic Siberia (three from Tyva, ~51°N, two from Zabaykalsky Krai, ~53°N, and two from Novosibirsk, ~55°N) were surveyed (Figures 1 and 2a). It is worth noting that, to date, there is no evidence to show Siberia as one of the domestication centers for pigs, so it seems impractical for us to incorporate any traditional local breeds from this region. Thus, for Siberian samples, only wild boars can be used. In addition, to alleviate the potential problem of limited diversity caused by smaller sample size of Siberian wild boars, we sampled wild boars from three regions with long distances between each other (>1,400 km between Tyva and Zabaykalsky Krai; >1,100 km between Novosibirsk and Tyva). The sampling locations for Vietnam and Siberia are very different in the monthly average low temperatures ( Figure 2b). For all samples, we extracted total DNA from frozen ear tissues using TIANamp Genomic DNA Kit (DP304). Complete genome sequences were amplified using the previously reported primers (Wu et al., 2007).
PCR products were purified on spin columns and then sequenced using the ABI 3730 DNA Sequencer (Applied Biosystems, Foster City, CA, USA). Sequencing data were edited using the DNASTAR multiple program package (DNASTAR Inc., USA). All 24 complete mitogenomes have been submitted to GenBank with accession ID from KX982629 to KX982652. All experimental procedures were approved by the Institutional Animal Care and Use Committee of Huazhong Agricultural University, Wuhan, China (permit HZAUSW2015-0003). In addition, we de novo assembled 24 mitogenomes from publicly available next-generation sequencing (NGS) data (>25) using the ORGanelle ASeMbler software (http:// pythonhosted.org/ORG.asm/). We removed the adapters and lowquality reads using Trimmomatic (Bolger, Lohse, & Usadel, 2014) and conducted two rounds of de novo assembling. First, the default parameters were set to generate raw sequences. Second, by finetuning our parameters, including the read length, seed sequence, and the input read numbers, we make sure that all sequences were circle. Then, the assembling graphs were built and unfolded to generate complete mtDNA sequences. These sequences were aligned with the reference sequence (NCBI Reference Sequence: NC_000845.1) and manually corrected the orientation of fragments. The annotation was conducted using BLAST comparison (McGinnis & Madden, 2004).

| Ka/Ks calculation
In 13 mitochondrial genes, as ND6 has a different codon usage biases (Hasegawa, Cao, & Yang, 1998) and was thus excluded, the other 12 protein-coding genes and their combined dataset were analyzed independently. The ML method implemented in the program CODEML of PAML was used to evaluate Ka/Ks with F3X4 codon frequencies (Yang, 2007). Two datasets were assembled to evaluate the differences between Ka/Ks ratios of Vietnamese and Siberian  Tables S2-S14). Two-ratios model, which assumes different evolutionary patterns between wild boars from Vietnam and Siberia, was compared with one-ratio model, which assumes the same pattern for wild boars. For VS wf , only phylogenybased Ka/Ks ratios were determined to avoid sample size bias. Eight models were constructed to include major grouping possibilities F I G U R E 2 (a) Sample localities. The brown regions are current distributions of wild boars from IUCN (http://maps.iucnredlist.org/map. html?id=41775). (b) Monthly average low temperature (°C) of three sampling locations in Siberia and one location in Vietnam. Data were extracted from http://www.worldweatheronline.com/. (c) Maximum-likelihood tree of domestic pigs and wild boars from Eurasia based on complete mtDNA sequences. An African warthog (GenBank: DQ409327) was used as the outgroup. The red taxa are Siberian wild boars (subclade K), and blue lineages are Vietnamese wild boars and domestic pigs ( Figures 4 and 5). The null model was "one-ratio model (1ω)," which assumes that all branches share the same ω ratio. The alternative models were the other six models and "free-ratio model (Free ω, or Fω)" (Table 1). These models were then ranked by Akaike information criterion (AIC) and Bayesian information criterion (BIC) to select the best model. Likelihood-ratio test (LRT) was used to determine the significance level for the best model determined by AIC and BIC.
In order to evaluate whether the patterns observed from our samples have a general basis, we downloaded and analyzed mitochondrial Cytb (cytochrome b) gene sequences. This gene was chosen because it had the richest published sequences within all mitochondrial coding genes for wild boars. In total, the dataset contained 61 and 64 sequences of wild boars (including the 24 sequences assembled from NGS data) from northern and southern Asia, respectively (Supporting information removed to avoid the bias caused by sequence dependency. In addition, as the elevated Ka/Ks might be due to either relaxation of purifying selection or positive selection in a few sites, we further assessed the possibility of positive selection using an improved version of branch-site model (Zhang, Nielsen, & Yang, 2005). Branch-site model test, accommodating Ka/Ks ratios to vary among branches and amino acid sites simultaneously, is very conservative but has better power to distinguish between the relaxation of purifying selection and positive selection (Shen et al., 2010;Zhang et al., 2005).
The branches detected with higher Ka/Ks by the above-mentioned branch-model were assigned as foreground branches. LRT was used to compare branch-site model A with the corresponding null model ("M1a").

| Demographic history
To evaluate the role of N e in shaping selection pressure, we employed Bayesian skyline plot (BSP) to infer population history of wild boars ( Figure 6). Considering the sensitivity of BSP analysis to sampling numbers, we used D-loop sequences, which represent the most abundant noncoding marker of mtDNA sequences in NCBI. In total, 408 sequences (including locally assembled NGS data) were incorporated for wild boars from nearby regions of Siberia and Vietnam (160 vs. 248) (Supporting information Table S16). BSPs of N e were produced from sequences using MCMC sampling in the program BEAST (version 1.8) Ho & Shapiro, 2011). MCMC chain length was set to 10,000,000 with the first 10% of sampled generations discarded as burn-in to collect sufficient samples for parameter estimation. Piecewise constant model was assumed to generate figures. Estimates of N e were derived from the Bayesian coalescent rate through time. Generation time and mutation rate were fixed to 5 years and 2.5 × 10 −8 substitutions/site/year, respectively, based on previous reports (Bosse et al., 2012;Groenen et al., 2012). Tracer v1.6 was used to determine convergence of model parameters with ESS values (>200), after multiple F I G U R E 3 Parsimony networks constructed for complete mitogenomic sequences using the median-joining method (Bandelt et al., 1999) as implemented in PopART v.1 (Leigh & Bryant, 2015). The Siberian subclade is highlighted within the big circle times of independent MCMC simulations (Rambaut, Suchard, Xie, & Drummond, 2015).  Figures S1 and S2). Consistent with previous studies Wu et al., 2007), all of these inferred trees revealed two major clades, representing wild boars and domestic pigs from Europe and Asia. Despite this divergence between European and Asian clades, F I G U R E 4 Models used in Ka/Ks calculation. In addition to "1ω" model and "Fω" model, other six models are also used in LRT of branch models. Branches with different colors are different in Ka/Ks ratios. The tree used for LRT was generated from the Bayesian inference. The number around nodes are supportive values. "T," "I," "S," "V," "W," and "D" represent "Terminal branches," "Internal branches," "Siberian branches," "Vietnamese branches," "Wild boars," and "Domestic pigs," respectively. A warthog and a Malaysian wild boar were used as outgroups

| Ka/Ks ratios for each mitochondrial gene
Model selection and LRT were also performed for each mitochon-

| Smaller N e in southern wild boars than northern wild boars
According to the corollary of population genetics, the lowered effective population size (N e ) may reduce the efficiency of purifying F I G U R E 6 Bayesian skyline plots of wild boars from Siberia and Vietnam as well as nearby regions. The x-axis gives units of years before present, and y-axis is equal to log(4N e ). The shade areas are within the 95% highest posterior density interval selection (Kimura, 1985) Table S16). The Bayesian skyline plot (BSP) of 408 D-loop sequences showed that southern wild boars may have larger N e than that of northern wild boars ( Figure 6). Although overlap exists between the 95% HPD (highest posterior density) upper limits of northern wild boars and the 95% HPD lower limits of southern wild boars, probably due to the limited resolution of mtDNA compared with wholegenome data, the overall pattern of this maternal BSP inference is consistent with previous results based on genomewide data (Li et al., 2013). In addition, the lower N e in northern wild boars than in southern wild boars was compatible with the expectation based on population migration. It is well known that the early migration of wild boars was initiated from Island Southeast Asia (Frantz et al., 2013;Wu et al., 2007). During this northward range expansion, because of the founder effect, populations may gradually decrease in N e with distance from the places of origin (Henn et al., 2016).

| D ISCUSS I ON
There is an ongoing debate on how the accumulation of slightly deleterious mutation might be affected by effective population size (N e ) and selection coefficient. Studies based on population genetics have revealed that along the routes of range expansion, geographically isolated populations often exhibit increased ratios of deleterious mutations as a result of steadily decreasing N e with distance from the area of origin (Henn et al., 2016;Peischl, Dupanloup, Kirkpatrick, & Excoffier, 2013). However, ratios of deleterious mutations can also decrease when selection coefficient changes. It has been reported that cold-water and fast-moving fish may have reduced ratios of deleterious mutations in mitochondrial genes probably due to stronger selective constraints (Strohm, Gwiazdowski, & Hanner, 2015;Sun et al., 2011). Thus, it is important to resolve the different implications based on demography and selective constraints. We first de- can survive the cold subarctic climate, energy-related genes might contribute to their local adaptation. As mitochondria are the powerhouses of cells, which play essential roles in both thermogenesis and thermoregulation (Gambert & Ricquier, 2007;Onda et al., 2008;Simonyan & Skulachev, 1998), mitogenomes of Siberian wild boars may be under stronger purifying constraints than those of tropical wild boars, to avoid the accumulation of nonsynonymous mutations and to keep the stability of mitochondrial function. In addition, compared with animals in tropical areas, animals in cold regions may have to produce more total thermal energy to sustain chilling-resistance activities such as shaking and shivering (Brück, Wünnenberg, Gallmeier, & Ziehm, 1970;Wei, 1981). Factors including predators and limited food resources could drive them to spend more kinetic energy for foraging activities. Therefore, these stringent selective constraints in cold conditions may be an important agent in driving the evolution of mitogenomes in Siberian wild boars.
Apart from selective constraints, N e is another important factor that may influence the accumulation of slightly deleterious mutations. Ample evidence has shown that N e might also influence the efficiency of purifying selection and cause changes in proportions of slightly deleterious mutations (Cruz, Vilà, & Webster, 2008;Henn et al., 2016;Kimura, 1962;Schubert et al., 2014). For example, rodents (mouse-rat) may have 1.5 times lower Ka/Ks compared with primates (human-orangutan), probably due to the more effective purifying selection in rodents endowed by their larger N e . Domesticated animals, compared their wild relatives, have accumulated more nonsynonymous mutations in both nuclear and mitochondrial genomes, probably caused by the reduced selective efficiency following domestication bottleneck and the relaxation of selective constraints in human-mediated environment Cruz et al., 2008). If the selection coefficients remain unchanged, a larger N e can increase the efficacy of purifying selection to remove the slightly deleterious mutations, thus resulting in lower Ka/Ks ratios. Based on this rationale, the lower Ka/Ks ratio detected in Siberian wild boars ( Figure 5) might also be attributed to their relatively higher N e .
However, this possibility can be eliminated by their migration history, living conditions, our BSP result, and previous wholegenome results (Li et al., 2013). As one of the most widespread animals, wild boars have adapted to many different environments across Eurasia after migrating out of Southeast Asia millions of years ago (Frantz et al., 2013;Rothschild & Ruvinsky, 2011). The process of long-distance migration can lead to the reduction in N e in regions remote from the place of origin due to the bottleneck effect (Li & Durbin, 2011). In human, non-African populations are found to have lower N e than native African populations (Li & Durbin, 2011;Tenesa et al., 2007). In this sense, considering the migration history of Sus scrofa, it seems improbable that Siberian wild boars may have higher N e than Vietnamese wild boars. Other factors such as harsh environment and food shortage can also lead to lower N e in Siberian wild boars than in tropical wild boars.
In particular, considering relatively longer generation times in cold-region animals (Maiti & Maiti, 2011), the N e of Siberian wild boars might be even smaller. Moreover, previous reports and our BSP analysis also reveal that wild boars in northern Asia have lower N e than those in southern Asia (Li et al., 2013). Therefore, the lower Ka/Ks ratio in Siberian wild boars should be attributed to their more stringent selective constraints rather than changes in N e .
Recent studies have emphasized the prominent role of functional constraints in shaping the evolution of mitochondrial genes (Radzvilavicius, Kokko, & Christie, 2017;Strohm et al., 2015;Tavares & Seuánez, 2017). In contrast to the expectation of lowered selective efficiency of the mitogenome due to its much lower N e than nuclear genes (~1/4), mitochondrial genes are under much more effective purifying selection than genome-wide essential genes (Nabholz, Ellegren, & Wolf, 2012).
Comparative genomics analyses find that mitochondrial genes have 5.3 times higher selection coefficient than nuclear essential genes (Popadin et al., 2012). This unexpected high selection coefficient is probably due to the high levels of gene expressions of mitochondrial genes (Nabholz et al., 2012). In this study, we that Cytb gene may play a more important role in energy production than in heat generation (Sun et al., 2011). The stable ability to produce energy could be very important for Siberian wild boars to forage for food over long distances. This possibility is also supported by the close relationship (Figure 2c, subclade K) between wild boars from three distant localities in Siberia. The functional importance of cytochrome b in producing energy for long-range foraging might facilitate the close relationship among Siberian wild boars.

| CON CLUS IONS
The mitogenomes of wild boars (and free-range local pigs) from

ACK N OWLED G M ENTS
We thank Ivan Jakovlić for some helpful discussion. This study was supported by the NSFC-CGIAR cooperation project (31361140365)  .

CO N FLI C T O F I NTE R E S T
The authors report that no conflict of interests exist.

AUTH O R CO NTR I B UTI O N
JHC, SHZ, and JLH conceived the ideas and designed the research; JHC, PN, EVK, VLP, XDL, and TNTT collected and analyzed data.
JHC led the writing of the manuscript with substantial contribution from NŠ and JLH; all authors commented on the manuscript.

DATA ACCE SS I B I LIT Y
DNA sequences: GenBank (NCBI) accessions KX982629-KX982652.