Past and potential future population dynamics of three grouse species using ecological and whole genome coalescent modeling

Abstract Studying demographic history of species provides insight into how the past has shaped the current levels of overall biodiversity and genetic composition of species, but also how these species may react to future perturbations. Here we investigated the demographic history of the willow grouse (Lagopus lagopus), rock ptarmigan (Lagopus muta), and black grouse (Tetrao tetrix) through the Late Pleistocene using two complementary methods and whole genome data. Species distribution modeling (SDM) allowed us to estimate the total range size during the Last Interglacial (LIG) and Last Glacial Maximum (LGM) as well as to indicate potential population subdivisions. Pairwise Sequentially Markovian Coalescent (PSMC) allowed us to assess fluctuations in effective population size across the same period. Additionally, we used SDM to forecast the effect of future climate change on the three species over the next 50 years. We found that SDM predicts the largest range size for the cold‐adapted willow grouse and rock ptarmigan during the LGM. PSMC captured intraspecific population dynamics within the last glacial period, such that the willow grouse and rock ptarmigan showed multiple bottlenecks signifying recolonization events following the termination of the LGM. We also see signals of population subdivision during the last glacial period in the black grouse, but more data are needed to strengthen this hypothesis. All three species are likely to experience range contractions under future warming, with the strongest effect on willow grouse and rock ptarmigan due to their limited potential for northward expansion. Overall, by combining these two modeling approaches, we have provided a multifaceted examination of the biogeography of these species and how they have responded to climate change in the past. These results help us understand how cold‐adapted species may respond to future climate changes.

An integral part of investigating the past dynamics of organisms is the "species distribution modeling" method (hereafter SDM), a mostly correlative approach for computing habitat suitability maps based on the statistical relationships between presence records and environmental predictors (usually climate; Svenning et al., 2011). The projection of habitat suitability maps at different time (either past or future) allows the visualization and quantification of the changes in distributions through time and the testing of evolutionary and biogeographical hypotheses (Graham, Ron, Santos, Schneider, & Moritz, 2004). Complementing SDM, purely genetic (i.e., haplotype network) as well as coalescent-based methods (i.e., Bayesian skyline plots [BSP]) can be integrated to reveal a more detailed phylogeographic picture of the species' past dynamics (Carstens & Richards, 2007;Porretta, Mastrantonio, Bellini, Somboon, & Urbanelli, 2012).
These methods, however, can have limited sensitivity and as a result might not capture the full dynamics over the desired time scale (e.g., a very flat BSP with only a very recent population size change detected; Grant, 2015).
The advent of more affordable whole genome sequencing and enhancement of the coalescent framework (Marjoram & Tavaré, 2006) has now given rise to the possibility of extracting more information from fewer samples. One such method is the Pairwise Sequentially Markovian Coalescent model (PSMC; Li & Durbin, 2011). It allows for the tracking of species' effective population size (N e ) from 10 thousand years ago (kya) through to Early Pleistocene/ Late Pliocene (~3 million years ago [Mya]) from the genome of just one individual (Hung et al., 2014;Nadachowska-Brzyska, Li, Smeds, Zhang, & Ellegren, 2015;Zhao et al., 2013). It therefore the ideal tool to study whether the changes in range size through time, as determined by SDM, reflect the changes in population size. Combined, the two methods have the ability to reveal demographic history of species at an unprecedented level.
In a previous study, we utilized the PSMC method on three grouse species (willow grouse, Lagopus lagopus; rock ptarmigan, Lagopus muta; and black grouse, Tetrao tetrix) in order to study their reaction to climate change throughout the Pleistocene (Kozma, Melsted, Magnússon, & Höglund, 2016). Both willow grouse and rock ptarmigan are cold-adapted species living all year round in the arctic tundra of the Holarctic (Höglund, Wang, Axelsson, & Quintela, 2013;Holder, Montgomerie, & Friesen, 1999). The black grouse has a more southern distribution, inhabiting boreal forest edges, bogs, and moorland throughout the Palearctic (Corrales & Höglund, 2012).
Counter-intuitively, all three species reacted differently to cold temperatures within the last glacial period, leaving us to speculate that the PSMC was capturing lineage specific dynamics.
The current paper builds upon these previous results to explore the demographic history of the grouse in detail by combining PSMC and SDMs and subsequently test the concordance of their respective results. We focus on the Last Interglacial (LIG, ~130 kya), the Last Glacial Maximum (LGM, ~21 kya), the present distribution, and future scenarios of anthropogenic climate change (2050 and 2070).
Here, in addition to adding SDM analyses, we have analyzed the genomes of individuals from different parts of the range as populations of the same species may have faced different climatic histories.

| Climate data
Datasets for 19 climatic variables at 10-min resolution were downloaded from Bioclim (www.worldclim.org/bioclim) for the present conditions, LIG, LGM, projected year 2050 and projected year 2070. To avoid biasing the results using only one general circulation model for future projections, we averaged each variable over the "CCSM4," "IPSL-CM5A-LR," and "MP-ESM-LR" models using QGIS (QGIS Development Team 2015), assuming the representative concentration pathway (RCP) 4.5 greenhouse gas scenario. To reduce multicollinearity, we calculated the correlation coefficients among all the 19 climatic variables for the present time and plotted them in a dendrograph (Supporting Information Figure S1). For every cluster of variables that were highly correlated (variable distance ≤ 0.5), we chose the variable thought to have more influence on the distribution of the target species bases on their natural history. To further reduce any potential multicollinearity, variance inflation factors (VIF, Heiberger, 2016) were calculated for the selected variables. The final variables used in the modeling were: BIO5 (maximum temperature of the warmest month), BIO6 (minimum temperature of the coldest month), BIO12 (annual precipitation), BIO14 (precipitation of driest month), and BIO15 (precipitation seasonality-coefficient of variation).

| Occurrence data
We used two different presence datasets for all three species in order to take into account potential biases or data gaps: presence records downloaded from GBIF (www.gbif.org) and range maps downloaded from Birdlife international (Birdlife International 2014). GBIF data were filtered to remove duplicates, records without coordinates, or records with a special accuracy coarser than the grain size used in the variables (10 min of acr). To obtain the second presence dataset, we generated random points within the polygons, defining the range map of each species.
Background points were created within the full map extent (Northern Hemisphere). In order to evaluate the SDMs, both "GBIF" and "range map" datasets were partitioned further into two geographic subsets. For the willow grouse and rock ptarmigan, all points west and east of the 15th meridian West were grouped into the "West" and "East" subset, respectively (corresponding to North America, Greenland, and Iceland in west vs. Eurasia in east, Supporting Information Figures S2 and S3).
Geographical partitioning could not adequately predict the current range of the black grouse (see Section 3), so for this species, random partitioning was employed, with 75% of data used for training and 25% used for evaluation. Table 1 summarizes the various subsets used for modeling.

| Modeling and evaluation
For each species, the probability of presence at current climatic conditions was modeled using a weighted polynomial GLM (Lehmann, Overton, & Leathwick, 2002), trained on one subset (e.g., GBIF-West), and evaluated on the other subset (e.g., GBIF-East). Evaluation of the models was performed using the receiver operating characteristic (ROC) and its area under the curve (AUC), where AUC scores above 0.7 were deemed to indicate good model performance (Fielding & Bell, 1997;Swets, 1988). The best model was subsequently used to predict the species' presence across other time periods. Finally, a 10% threshold probability was applied in order to obtain a binary "presence/absence" map, from which the species' total range area was calculated.

| Samples, DNA extraction, sequencing, and filtering
For the PSMC analysis, we have chosen four grouse individuals that expand the cover of the geographic range of these three species relative to the ones used in our previous study (Kozma et al., 2016).
These include two willow grouse, one each from Magadan (Eastern Russia), Paxson-South-Central Alaska (USA), one red grouse from Yorkshire Dales National Park (Northern England), and one rock ptarmigan from southwestern Greenland. One additional willow grouse sample from Frøya (Central coast of Norway) was chosen as a control, to test whether the same demographic pattern is seen as in the previously sequenced Norwegian willow grouse sample.
DNA extraction was performed using the Qiagen DNeasy Blood & Tissue Kit ® , and DNA quality of each individual was checked on an agarose gel and subsequently measured using Quibit ® Fluorometer.
After library preparation with the Illumina TruSeq protocol, the samples were sequenced using an Illumina HiSeq machine to generate 125-bp paired end reads. Quality trimming was performed using Trimmomatic v0.36 (Bolger, Lohse, & Usadel, 2014), following a fourstep procedure: (1) removing Illumina TruSeq adaptors, (2) removing leading and trailing bases with quality score <5, (3) scanning the read with a four base-pair sliding window and cutting when the average quality per base dropped below 15, and (4) removing reads that were <50 bp after trimming.

| Assembly and analysis
Properly paired reads that passed quality control were mapped to their respective willow grouse or rock ptarmigan genome (Kozma et al., 2016) using the BWA-MEM alignment algorithm (Li, 2013) with default settings. As no exclusively red grouse genome exists, the red grouse reads were mapped to the highly related willow grouse genome (red grouse is formally recognized as a subspecies of willow grouse). Duplicate reads were marked with Picard (http://broadinstitute.github.io/picard/), and local realignment around indels was performed with the GATK IndelRealigner tool (DePristo et al., 2011;McKenna et al., 2010). The resultant mean coverage of each individual was 26× for Magadan willow grouse, 26× for Alaska willow grouse, 38× for Frøya willow grouse, 30× for red grouse, and 30× for rock ptarmigan.
Subsequent analysis followed the same procedure as in Kozma et al. (2016). Briefly, a consensus sequence was called using the SAMTOOLS v0.1.19 suite (Li et al., 2009), utilizing the samtools mpileup, bcftools, and vcfutils.pl (vcf2fq) pipeline. As the -C50 samtools options (default in the PSMC manual) for calling consensus sequence are very stringent, especially when reads have been mapped to the genome of a related species, we produced consensus sequences without this option. For individuals with mean coverage less than 10× across 1/3 of the genome, the minimum read depth (−d) was set to 10, to prevent low confidence calls. Unmapped and sex TA B L E 1 The number of presence points/background points used to create the species distributions models for the "GBIF" and "Range map" dataset. In the case of the latter, these represent pseudo-presence points chromosomes were removed from the dataset so that the PSMC analysis was carried out on the resulting autosome sequences using 30 iterations (−N 30), T max (−t) of 10, initial mutation/recombination ratio (−r) of 3 and atomic time interval pattern (−p) of "4 + 25 × 2 + 4+6".
Mutation rate of the willow grouse and rock ptarmigan have been determined previously (willow grouse: μ = 0.299 × 10 −8 ; rock ptarmigan: μ = 0.310 × 10 −8 ; Kozma et al., 2016), and the red grouse was assumed to be the same as that of the willow grouse.

| SDM
For willow grouse and rock ptarmigan, geographical portioning of the GBIF dataset produced good models, where training on the "West" subset and evaluation on the "East" produced the best model (willow grouse AUC = 0.85, rock ptarmigan AUC = 0.86). These models also recovered the current range of the species (Figures 1 and 2).
The same results held true for the range map dataset, where training on the "West" subset produced the best models (willow grouse AUC = 0.71, rock ptarmigan AUC = 0.81), which also recovered the current species range (Supporting Information Figures S4 and S5).
Importantly, both species showed the same pattern of range size change over time across the two datasets ( Figure 3 and Supporting Information Figure S7), thus corroborating the separate approaches.
The warm temperatures of the LIG drastically decreased the range size of both species, with Europe maintaining only a patchy distribution and the remainder of the range being pushed into higher latitudes. Any remaining European rock ptarmigan at this stage would have been effectively cut off from the eastern population surviving in northern Siberia. For the willow grouse, the White Sea separated the European and Asian part of the range, also possibly preventing extensive gene flow. Conversely, both species experienced the largest extent of their ranges during the LGM, when the suitable habitats for both species were well connected and extended further south than the current limit. The increasing temperatures within the next century are predicted to push the species' southern limit further north and contract the overall range to an extension similar to the one modeled during the LIG. Across Eurasia, the two species cannot move beyond the current northern limit. The willow grouse is projected to extend its northern limit in North America, with possible further stepwise colonization of Greenland. The rock ptarmigan is expected to expand its already present range in Greenland.
Neither of these shifts will fully mitigate the range reduction in more southern areas, causing an overall decrease in total range size.
For the black grouse, the GBIF dataset failed to recover the full extent of the current species range, irrespective of partitioning (Supporting Information Figure S6). This is most likely caused to the sampling bias, whereby the Asian part of the range is severely under-represented. Moreover, the conditions in the European range show good synchrony for the willow grouse and rock ptarmigan, we believe that the range map approach adequately predicts the distribution of the black grouse as well, even if the result cannot be corroborated with the GBIF dataset. For this species, the warm temperatures during the LIG also had an adverse effect on the overall range, whereby the southern limit of its extent was shifted north and the main European/West Asian region was separated from the East Asian region. The LGM saw a shift to lower latitudes, an increase in connectivity among the two regions and an overall increase in range size. This was still smaller than current range size. Lastly, the rising temperatures in the future will reduce the range size, mainly by the European habitat becoming more unsuitable and patchy. In the eastern part of its range, the species can buffer the rising temperatures by a shift to higher latitudes, in effect replacing the willow grouse and rock ptarmigan.
Overall, the results for all three species are in line with the species' known ecologies, wherein the cold-adapted willow grouse and rock ptarmigan have the biggest range sizes during the LGM, the more temperate black grouse has the biggest range at the present and all three had the smallest range during the LIG. Further corroborating the underlying ecologies, future warming is expected to impact the willow grouse and rock ptarmigan more severely than the black grouse. Nevertheless, the range size in all three species is expected to contract to similar levels as modeled during the LIG.

| PSMC
The PSMC result of the willow grouse from Norway acting as a control in this study showed the same demographic history pattern ( Figure 5) as the previously published Norwegian willow grouse sample ( figure   3b in Kozma et al., 2016)), supporting the overall PSMC approach.
Furthermore, the PSMC of the red grouse ( Figure 5) showed a highly similar pattern to that of the willow grouse, following the expectations stemming from their shared ancestry until about 6 kya. In both cases, the maximum population was reached around 400 kya, followed by a steady decrease up until 40-50 kya, from then on, the population remained stable. No increase in effective population was captured throughout the LGM. Interestingly, the willow grouse from Siberia and Alaska showed a strikingly different PSMC trajectory following the peak N e at 400 kya ( Figure 5). Both showed a second major population expansion following the onset of the last ice age (approx. 110 kya).   Britain; Clark et al., 2009;Clark & Mix, 2002) or were separated from the remainder of their distribution prior to the LGM (Alaska; Holder et al., 2000), these bottleneck most likely indicate the recolonization of these parts upon termination of the LGM and the creation of the Beringian land bridge, respectively. However, again we advise caution when interpreting recent changes in N e (i.e., the plateaued lines in the PSMC plot from ~60 kya to the present) as they generally indicate the moment when PSMC loses power.

| D ISCUSS I ON
In the case of the black grouse SDM, we expected a continued increase in the number of individuals from 130 kya onwards. We also expected a much smaller population size to be present during the LIG than the LGM, as the suitable habitat during the LIG was constricted to mostly northern parts of Europe and coastal East Asia. Despite this, the previously published PSMC study indicates the opposite-a peak population size being reached around the LIG followed by a steady decrease during the glacial period, with a population recovery occurring prior the LGM (figure 3a in Kozma et al., 2016). SDM seems to support the "subdivision" hypothesis (Kozma et al., 2016), where the subdivision of the black grouse into East Asian and European subpopulations throughout the LIG artificially augments the N e for the duration of the subdivision (see F I G U R E 6 The PSMC trajectory of the Greenland rock ptarmigan. Yellow: LIG, light gray: last glacial period, dark gray: LGM The PSMC trajectory of the willow grouse and red grouse. Yellow: LIG, light gray: last glacial period, dark gray: LGM All three species are expected to suffer from range contractions under future climate change. The black grouse has the least amount of projected range loss, owing to its ability to expand northward into the eastern part of its distribution, but in the process displacing the willow grouse and rock ptarmigan. Remaining populations within Europe are projected to become more fragmented and isolated, which can in turn drastically affect their genetic health and overall potential for population viability (Larsson, Jansman, Segelbacher, Höglund, & Koelewijn, 2008). The situation is projected to be more serious for the cold-adapted willow grouse and rock ptarmigan, which are both expected to lose approximately 30% of their current range. Neither can move further north within the Eurasian range and maintaining connectivity within this extent would be challenging, given their relatively short dispersal distance (~3-10 km, Berlin, Quintela, & Höglund, 2008).
There are assumptions inherent to both approaches used in this study that must be considered. For the PSMC method, the major assumption is over such long timescales, the main drivers of population change are the climatic conditions and colonization events. Grouse fall within the herbivore trophic level across their range (Martin, Doyle, Hannon, & Mueller, 2001) and have a variety of predators.
The main specialist predator is the goshawk (Accipiter gentilis), but others include the golden eagle (Aquila chrysaetos), lynx (Lynx lynx), red fox (Vulpes vulpes), and wolf (Canis lupus; Angelstam, Lindström, & Widén, 1984;Martin et al., 2001;Pekkola, Alatalo, Pöysä, & Siitari, 2014). Furthermore, it is known that the predator-prey dynamics of grouse produce cyclical patterns of population peaks and crashes where the peaks occur periodically from three up to 11 years, depending on the geographical locations, and closely follow the cycles of the alternative prey of the predators-the hare (genus Lepus; summarized in Martin et al., 2001). While predators do drive the dynamics of the grouse populations over the timescale of decades, we assume that over the timescale investigated in this study the predator-prey dynamics are stable.
The key assumption made by SDMs is that species are in equilibrium with the environment and are therefore able to fill all the geographical space with suitable ecological conditions (Elith & Leathwick, 2009;Svenning et al., 2011). Even though this assumption does not hold true for all species, the analysis performed by Araújo and Pearson (2005) over groups of organisms showed that European birds are in equilibrium with climate.
Niche conservatism is another important assumption to be considered when SDMs are going to be applied over long time spans (Svenning et al., 2011). While it may be hard to prove such "niche conservation" (Franklin & Miller, 2009; but also see Martinez, Peterson, & Hargrove, 2004), it is possible to use the fossil record to help judge the accuracy of the models (Lawing & Polly, 2011;Metcalf et al., 2014). For the grouse, the fossil record in Europe does agree with the models during the LGM, where these cold-adapted species show a southward shift as well as a shift to lower altitudes (Holm & Svenning, 2014). All three species were found north of the Alps during the height of the glacial period, which is indeed captured in our model. The remoteness of the rest of the species range and the paucity of accurate fossil data over such timescale does make it hard to support the model predictions across the whole range during the LGM as well as the LIG. The assumption of niche conservatism extents into the future as well. We do acknowledge that species have the potential to change their reaction norms by phenotypic plasticity or even adaptive evolution (Jackson, Betancourt, Booth, & Gray, 2009;Merilä, 2012;Vedder, Bouwhuis, & Sheldon, 2013), but the fact that we are projecting range shift over a short period of time (~50 years) may limit the potential effects of this limitation over our model. Even considering the aforementioned limitations, we believe the models utilized in this study are sufficient to capture the large scale, species-wide changes in range size required to contrast the PSMC patterns.
Sampling bias can also have large effects on SDMs. We do observe inadequate sampling of the black grouse in a key part of its distribution (South-eastern Russia), and the limited known presence points are not enough to reconstruct its known range.
Instead, we have relied directly on the range map of the species to build a model. This approach has been used before (Pedersen, Sandel, & Svenning, 2014;Levinsky et al., 2013) and additionally, we test it with the willow grouse and rock ptarmigan, which do not suffer from such sampling bias. In both of these species, the two models (from occurrence points and range map points) produce the same patterns in range size change across the studied time periods; therefore, we feel confident in applying this approach to the black grouse.
In this study, we have expanded the available knowledge on the demographic history of three grouse species by utilizing species distribution modeling and coalescent-based reconstruction of past effective population size fluctuations. We have also presented the advantage of integrating approaches to overcome the limitations of single analytical methods in extrapolating meaningful conclusions from population and species modeling. In willow grouse and rock ptarmigan, we find evidence for lineage-specific patterns of population change, with multiple recolonization events following deglaciation of its northern range. In black grouse, the two methods also hint at a potential past subdivision, but more samples need to be sequenced in order to further support this hypothesis. The species also lacks adequate occurrence data with which to build more reliable distribution models. This will have to be remedied, given the need to have a better understanding of how the current climate change will impact the demography of the species (Pacifici et al., 2015). In this regard, our models do not forecast a pretty future for the three species, as their range sizes are expected to shrink and become more fragmented, which will most likely result in overall decrease in population size and genetic diversity.

ACK N OWLED G M ENTS
The visit to learn SDM with B.M.B. and J-C.S. was made possible through the Zoologiska Stiftelsen grant awarded to R.K. R.K. would also like to thank André Silva for providing further helpful discussion on SDM's. Additionally, we thank David Newborn from the Game & Wildlife Conservation Trust (www.gwct.org.uk) for being instrumental in the red grouse sampling. Funding for this project was received from the Research Council of Sweden (VR) to J.H.

CO N FLI C T O F I NTE R E S T
None declared.