Complex and divergent histories gave rise to genome‐wide divergence patterns amongst European whitefish (Coregonus lavaretus)

Abstract Pleistocene glaciations dramatically affected species distribution in regions that were impacted by ice cover and subsequent postglacial range expansion impacted contemporary biodiversity in complex ways. The European whitefish, Coregonus lavaretus, is a widely distributed salmonid fish species on mainland Europe, but in Britain it has only seven native populations, all of which are found on the western extremes of the island. The origins and colonization routes of the species into Britain are unknown but likely contributed to contemporary genetic patterns and regional uniqueness. Here, we used up to 25,751 genome‐wide polymorphic loci to reconstruct the history and to discern the demographic and evolutionary forces underpinning divergence between British populations. Overall, we found lower genetic diversity in Scottish populations but high differentiation (F ST = 0.433–0.712) from the English/Welsh and other European populations. Differentiation was elevated genome‐wide rather than in particular genomic regions. Demographic modelling supported a postglacial colonization into western Scotland from northern refugia and a separate colonization route for the English/Welsh populations from southern refugia, with these two groups having been separated for more than ca. 50 Ky. We found cyto‐nuclear discordance at a European scale, with the Scottish populations clustering closely with Baltic population in the mtDNA analysis but not in the nuclear data, and with the Norwegian and Alpine populations displaying the same mtDNA haplotype but being distantly related in the nuclear tree. These findings suggest that neutral processes, primarily drift and regionally distinct pre‐glacial evolutionary histories, are important drivers of genomic divergence in British populations of European whitefish. This sheds new light on the establishment of the native British freshwater fauna after the last glacial maximum.


| INTRODUC TI ON
Understanding the demographic and evolutionary forces shaping differentiation and speciation is a major undertaking in evolutionary biology (Seehausen et al., 2014). This is relevant not only for determining historical processes, but also to generate critical contemporary knowledge on how species are structured across their range, information that is important for conservation (Funk et al., 2012). Genomic differentiation can be rather uniform or localized in specific regions, known as islands of differentiation (Harr, 2006), which are usually considered as important genetic drivers of species divergence. Many factors will influence the distribution and magnitude of such regions (Quilodrán et al., 2020), and detailed knowledge of their patterns can improve the understanding of the evolutionary processes affecting species divergence. For example, divergent selection in populations experiencing gene flow, or secondary contact with heterogeneous gene flow following allopatric divergence, should result in few islands of high differentiation with modest and homogeneous background differentiation (Nosil et al., 2009;Ravinet et al., 2017). On the contrary, allopatric populations with limited or no gene flow may lack genomic islands of differentiation, resulting in a global and random pattern of genome-wide differentiation due to drift (Mattingsdal et al., 2020; but see Cruickshank and Hahn, 2014).
Furthermore, although gene flow can swamp genomic regions undergoing divergence, it has been shown that introgression of favourable alleles from divergent lineages or populations can facilitate local adaptation (e.g. Jacobs et al., 2019;Marques et al., 2019). The use of high throughput sequencing techniques enables us to investigate such genome-wide patterns to better understand the evolutionary processes behind population and species divergence.
Because of the continuous expansion and contraction of the glaciers, Pleistocene glaciations are one of the most important events to have shaped the current distribution of flora and fauna in areas that were glaciated (Schmitt, 2007;Stewart & Lister, 2001). Many species were fragmented into lineages in separate refugia, which then experienced secondary contact at the end of the last ice age as they expanded their range into the areas previously covered by ice, leading to complex evolutionary histories (Bernatchez, 2004;Hewitt, 2004).
The postglacial colonization of Britain is of particular interest from a biogeographic perspective because, with the exception of southern England, it was completely covered by ice during the Last Glacial Maximum (LGM) (Brooks et al., 2011;Clark et al., 2012).
The current understanding is that there are three major mitochondrial lineages of European whitefish that occur in Europe and survived in separate refugia: a northern European clade with contemporary populations ranging from northwest Russia to Denmark, a Siberian clade with populations ranging from the Arctic sea to southwest Norway and a southern European clade with populations ranging from the European Alps to Denmark (Jacobsen et al., 2012;Østbye et al., 2005). In Britain, the European whitefish is only native to seven lakes: Lochs Eck and Lomond in Scotland, Llyn Tegid in Wales, and Haweswater, Red Tarn, Brotherswater and Ullswater in England ( Figure 1). Although the relationship between British and continental European populations is unknown, the genetic relationship between British populations has been investigated using allozymes (Beaumont et al., 1995;Ferguson, 1974), mitochondrial restriction fragment length polymorphism (RLFPs) (Hartley, 1995), and more recently with microsatellites (Crotti, Adams, Etheridge, et al., 2020). Although some contradictions exist between these different data types, they generally suggest a separation between populations in Scotland, Wales and England, with the Welsh population being more closely related to the English populations. However, these studies have lacked contextualization by continental Europe populations and were based on few genetic markers, and therefore it was not possible to infer the drivers of differentiation, the role of complex postglacial demography, and whether major genomic structural differences are involved. The European whitefish in the UK is a priority species within the UK Biodiversity Action Plan, and it is protected under Schedule 5 of the Wildlife and Countryside Act 1981 due to its rarity and vulnerability to pressures such as invasive species, water quality and habitat degradation (Maitland & Lyle, 2013;Winfield et al., 2013). Therefore, a better understanding of the relationship between the British populations, the drivers of the population structuring between countries and the evolutionary drivers of distinctiveness is of high importance not only for evolutionary understanding but also for conservation management and policy.
In this study, a high-density genome-wide and mitochondrial dataset for samples from populations across the European whitefish range, including all seven native British populations, was used to reconstruct the evolutionary history of European whitefish in Britain.
To do so, we contextualize the contemporary genetic diversity and DNA was extracted from fish fin clips using the NucleoSpin Tissue kit (Macherey-Nagel) following the manufacturer's recommendations. The protocol used for the ddRADseq (Peterson et al., 2012) library preparation follows , which is modified from Recknagel et al. (2015)

| Data processing
First, raw reads were demultiplexed with process_radtags in Stacks v.2.4.1 (Catchen et al., 2013;Rochette et al., 2019) and both forward and reverse reads were retained, for a paired-end approach. Reads with quality <20 were removed using Trimmomatic (Bolger et al., 2014). Reads were then mapped to a chromosome-level assembly (GCA_902810595.1) of the European whitefish (De-Kayne et al., 2020) using bwa mem v.0.7.17  with default settings and retained if mapping quality was >20 with samtools v.1.7 . A total of 10 individuals from eight populations did not pass QC due to low coverage and were excluded from the dataset. We assembled loci using Stacks v.2.4.1 and the ref_map.pl script. The raw dataset was further filtered depending on the type of analysis performed.

| Data generation, genetic diversity and population structure
The populations programme in Stacks was used to generate a vcf file with the following settings: -p 9 (minimum number of populations a locus must be present in to be retained), -r 0.8 (minimum proportion of samples in a population required to have a locus), --max_obs_het 0.6 (maximum observed heterozygosity for a nucleotide at a locus), --min_maf 0.05 (minor allele frequency across populations required for a SNP to be included in the dataset) and --write_single_snp (retain one SNP per locus). This file was then filtered further with vcftools v.0.1.15 (Danecek et al., 2011), retaining SNPs that fulfilled the following criteria: a minimum sequencing depth of 5 reads per individual, a minimum mean sequencing depth of 10 reads across individuals, a maximum sequencing depth per individual of 30, a maximum mean sequencing depth across individuals of 30 (to remove possible repetitive reads), a minor allele frequency (MAF) of 0.05 and a 15% missing data threshold. We then used the script filter_hwe_by_ pop.pl (available at https://github.com/jpuri tz/dDoce nt/blob/maste r/scrip ts/filter_hwe_by_pop.pl) to filter sites that were out of HWE within populations, although no such sites were detected. Finally, we excluded a sample from the Baltic region which had more than 90% missing data after filtering. The Russian samples had between 68% and 87% missing data but were retained in the dataset only for the population structure and phylogenetic analyses.  (Goudet, 2005), set to a minimum of three individuals (the size of the HAW population). For all these analyses, we included up to 17 SNPs from the retained loci for a total of 41,929 SNPs, as there is no need to account for linkage disequilibrium.
Multiple approaches were used to assess population structure.
For most of these analyses, we used the retained loci but only used one SNP from each (to reduce the effect of linkage), with a total of 91 samples, 25,751 SNPs, and an average of 12.4 (± 5.5) SNPs per 1 Mb; this is referred as the 'pop-gen dataset'. First, to evaluate population structure at the individual level, we ran a principal component analysis (PCA) with SNPRelate v. 1.16 (Zheng et al., 2012) in the R statistical environment (R Core Team, 2019). To investigate whether differentiation along principal components was driven by specific genomic regions, we used the function snpgdsPCACorr in SNPRelate, which calculates the correlation between a SNP genotype and the eigenvector of each principal component (Zheng et al., 2012). Second, we employed a maximum-likelihood approach for population assignment with Admixture v.1.3.0 (Alexander et al., 2009). We ran analyses with a 20-fold cross-validation (CV), and tested K values ranging 2-11, the optimal value was assigned to the one with the lowest CV-error. Third, to complement the modelbased Admixture with a nonparametric approach (Linck & Battey, 2019), we used the R package adegenet v.2.1.1 (Jombart, 2008) to run a Discriminant Analysis of Principal Components (DAPC) (Jombart et al., 2010), which uses k-means clustering and the Bayesian information criterion to identify the most likely number of genetic clusters in the dataset. The xvalDAPC function was used to determine the number of PCs to be retained by the DAPC analysis. Fourth, between population Weir and Cockerham F ST (Weir & Cockerham, 1984) values were calculated in GenoDive (Meirmans & Van Tienderenn, 2004) and significance was assessed with 10,000 permutations, pairwise Dxy distances were calculated with populations in Stacks, and both were visualized as heatmap matrices.
Finally, to infer population structure via shared ancestry, we used the software fineRADStructure (Malinsky et al., 2018), which uses haplotype-based statistics, and is a modified version of the chromopainter/fineSTRUCTURE package (Lawson et al., 2012) but designed specifically for RADseq-type data. For this, we generated a new dataset specific for fineRADStructure with populations, excluding the Russian individuals due to high missing data (Malinsky et al., 2018). Starting from all retained loci, we retained all SNPs per locus and filtered loci with the following settings: -p 9, -r 0.9, --max_obs_ het 0.6 and --min_maf 0.05. The fineRADStructure dataset had a total of 24,060 loci and 87 individuals. The analysis was conducted with a burn-in of 300,000 iterations, followed by 300,000 MCMC iterations and tree building with default parameters. The results were visualized with the scripts fineradstructureplot.r and finestructurelibrary.r (available at http://cichl id.gurdon.cam.ac.uk/fineR ADstr ucture.html).
We used TREEMIX v.1.13 (Pickrell & Pritchard, 2012) to explore and visualize admixture events across populations using genome-wide allele frequency from the pop-gen dataset. This programme allows modelling migration events by adding discrete mixture events to a bifurcating population tree. Variable numbers of migration edges m, ranging 0 to 11, were tested. To account for linkage disequilibrium, we run the analysis in blocks of 100 SNPs.
Because of the lack of outgroup in our dataset, we did not specify a root for the tree.
To understand whether genetic differentiation between the Scottish and other populations was evenly distributed or localized in specific genomic regions, we calculated differentiation across the genome using a window-based approach in vcftools with the popgen dataset, using a window size of 1 Mb. We compared windowbased F ST between the two Scottish and the remaining populations, and between the Baltic and the remaining populations, given its similar level of differentiation from all populations (see Results).
Windows with fewer than five SNPs were excluded as they could be unreliable (Lehnert et al., 2020). Furthermore, we looked for regions associated with high divergence and considered as outliers any genomic windows with F ST values ≥ 3 standard deviations from the mean (SD) (Lehnert et al., 2020). For this last analysis, as the aim was to determine outliers across whitefish lineages, we used only one representative population from Scotland (Loch Lomond) and one representative population from England (Red Tarn).

| Phylogenetic analysis
For phylogenetic analyses, the pop-gen dataset was converted to phylip format using the script vcf2phylip.py (Ortiz, 2019).
Phylogenetic reconstruction was conducted under a maximumlikelihood framework in RAxML-NG (Kozlov et al., 2019), using a generalized time-reversible accounting for among-site rate variation (GTR+G). Branch support was estimated using 100 bootstrap replicates. The analysis was run on the CIPRES server (https://www. phylo.org/porta l2/home.action). In addition, we also investigated the relationship of European whitefish populations using unrooted phylogenetic networks as implemented in SplitsTree v.4.14.6 (Huson & Bryant, 2006) using the same dataset.

| Inference of evolutionary history
To distinguish between alternative evolutionary scenarios regarding the colonization of Britain, we used coalescence simulations implemented in fastsimcoal2 v.2.6 (Excoffier et al., 2013). Six populations were used for this analysis: Norway, Alpine, Baltic, Lomond in Scotland, Llyn Tegid in Wales and Red Tarn in England. Only one population each from Scotland and England were used because the population structure analyses had revealed very close relationship between populations within these two regions. We generated a dataset using the filtering criteria as the pop-gen dataset, the only difference being that no MAF filter was applied to retain informative low-frequency sites (Jacobs et al., 2020). The script sampleK-genotypesPerPop.py (available at https://github.com/marqu eda/ SFS-scripts) was then used to down-sample each population to seven individuals, to avoid biases due to different sample sizes, and no missing data were allowed. The filtering resulted in a dataset of 42 individuals and 20,510 SNPs. The minor multidimensional site frequency spectra (MSFS) were generated with the script easySFS.
py (available at https://github.com/isaac overc ast/easySFS). To determine absolute values for divergence times and other inferred parameters, we corrected the number of monomorphic sites in the SFS (following Jacobs et al., 2018;Kautt et al., 2016). We used a hierarchical approach (Marques et al., 2019) by firstly optimizing a three populations model (3-Pop) that included the Alpine, Baltic and Norwegian populations from the two major European whitefish lineages and their contact zone (Østbye et al., 2005). The likelihood of each model was maximized from 100 random starting parameters, 40 iterations and 100,000 coalescent simulations. Following previous work, an estimated mutation rate of 1x10 −8 was used, as no accurate mutation rate for salmonids is available (Rougeux et al., 2017).
Once the best 3-Pop model was found, we created four population sample size is large, most of the entries in the SFS will be zero or very low values and be difficult to estimate correctly (Excoffier et al., 2013). Therefore, to reduce the number of zero values, the MSFS for all the models except the 6-Pop were projected down to eight chromosomes per population, whereas the MSFS for the 6-Pop model was projected down to six chromosomes per population (Jacobs et al., 2020). We estimated confidence intervals for the parameters of the 6-Pop model using parametric bootstrap: the point estimates with the highest likelihood from the best fitting model as identified by fastsimcoal were used to simulate 50 MSFS, for which we then run 15 parameter optimizations, starting from the best parameters inferred from the observed data, and the parameter from the run with the highest likelihood for each simulated MSFS was used to compute the confidence intervals. Parameter ranges for all the models are reported in Table S1.

| Mitochondrial analysis
We sequenced mtDNA for a subset of the populations from this study. From the extracted DNA, the whole NADH dehydrogenase subunit 1 (ND1) mtDNA gene (975 bp) was PCR amplified using the primers ND1 F B1NDF: 5′-TAA GGT GGC AGA GCC CGGTA-3′ and ND1 R B1NDR: 5′-TTG AAC CCC TAT TAG CCA CGC-3′ (Schenekar et al., 2014), using the PCR conditions described in Schenekar et al., (2014). The PCR product was sent to DNA Sequencing and Services (MRC PPU) for Sanger sequencing in the forward direction for all samples but two, which were sequenced in the reverse direction.

| Genetic diversity, population structure and admixture
To evaluate the extent to which European whitefish populations evolved independently or underwent significant admixture events, we analysed population structure, differentiation, co-ancestry and admixture within Britain and at the European level. A PCA demonstrated that the Norwegian population was the most distinct from all other populations, separating along PC1, which explained 22.2% of the total variance (Figure 2a). The second principal component (18.8%) separated the two Scottish populations from all other populations, whereas the Alpine population separated on PC3 (9.77%, Figure 2b). The correlation analysis between SNP genotype and PC eigenvectors revealed a genome-wide effect on the variation explained by the principal components ( Figure S1), rather than specific genomic regions, suggesting that divergence between the Scottish and the other populations on PC2 is influenced by genomewide patterns rather than few elevated or adaptive regions.
All populations were significantly differentiated from each other, although there were substantial differences across geographic scales. We found the Scottish populations to be highly differentiated from the other British populations, with F ST ranging from 0.546 to 0.661 (Table 1, Figure S4). Amongst populations within England, all The ancestry analyses showed that most populations were separated in well-defined clusters (Figure 2c,d). Nevertheless, high levels of haplotype sharing were evident between the two Scottish populations, and between the four English populations as recovered by fineRADStructure (Figure 2c). This was paralleled to a certain extent in the DAPC (K = 7) ( Figure S2) Figure S6). When looking for genomic window outliers, we did not find any window with F ST > 3 SD ( Figure S7), again suggesting that differentiation between Scottish and other populations is driven by genome-wide mechanisms rather than particular genomic regions.

| Phylogenetic analysis
The unrooted phylogenetic network based on genomic data showed populations to be well defined (Figure 3a), but highlighted the presence of unresolved lineage sorting between the Baltic, Russian and Norwegian populations. The topology of a phylogenetic analysis based on the same data agrees with the result of that phylogenetic network, with a high support for the monophyly of all populations except for Ullswater and Brotherswater in England ( Figure S8a). With midpoint rooting applied, the Norwegian and Russian populations formed a monophyletic group sister to a clade containing all the other populations. The first population to split in this clade was the Baltic, followed by the two Scottish populations ( Figure S8a).
The four English populations were more closely related to the Welsh population, and these two lineages shared a more recent common ancestor with the Alpine population than with any other population ( Figure S8a).

| Mitochondrial analysis
Scottish populations possessed mitochondrial haplotypes that were predominantly in the northern European mtDNA clade (Figure 3b).
The Welsh populations were placed in the southern European clade, being more similar to the Danish populations of Ringkøbing Fjord and Tange Lake (Jacobsen et al., 2012). The Scottish populations showed unique haplotypes, which were more closely related to Baltic ones ( Figure 3b). Furthermore, the Norwegian and Alpine possessed the same haplotypes and were central in the haplotype network.

| Evolutionary history
Using the inference of phylogenetic relationships in the ML tree, we estimated the demographic history of European whitefish using a hierarchical approach. The demographic scenarios for all the models tested are found in Figure 4. In agreement with the findings of previous studies based on mtDNA and microsatellites (Crotti, Adams, Etheridge, et al., 2020;Hartley, 1995), the Welsh population had the highest genetic variation within Britain and formed its own genetic cluster. The four English populations are very closely related, with high levels of within-population genetic diversity and low between-population genetic differentiation. Interestingly, although allozyme and microsatellite analyses suggested the English population of Red Tarn to be the most distantly related of the four (Beaumont et al., 1995;Crotti, Adams, Etheridge, et al., 2020), our SNPs analyses show a very close relationship between Red Tarn and Haweswater.
We found evidence of elevated genomic differentiation between the Scottish and the remaining populations across the entire genome but did not detect any genomic islands of differentiation. Although in an allopatric model of population divergence, the number of islands is expected to increase with time (Nosil et al., 2009), it has been shown that long periods of isolation and limited gene flow can mask the signals of genomic islands, as it will be swamped by elevated background genetic differentiation due to drift (Mattingsdal et al., 2020;Quilodrán et al., 2020;Wang et al., 2016). and grayling Thymallus thymallus (Gum et al., 2005); alternatively, this lineage could have survived in the putative Lough Hibernia, in the present-day Irish sea (Maitland, 1970). The Welsh and English populations split from the Alpine lineage much more recently, ca. 20 Kya, towards the end of the last glacial maximum (Clark et al., 2012). The Alpine lineage is thought to have had its refuge near the mouth of the river Rhine (Hudson et al., 2011) and colonized the Alpine lakes and the Jutland peninsula through the Rhine and Elbe river systems (Hansen et al., 2008;Østbye et al., 2005). At the time of the split, instead of moving towards central Europe, the ancestral stock that gave rise to the Welsh and English populations must have moved westwards towards Britain. The divergence between the English and Welsh populations at ca. 14 Kya agrees quite well with the retreat of the glaciers from Britain, as parts of the Lake District would have been already free of ice at that time (Brooks et al., 2011;Pinson et al., 2013 (Maitland, 1970). Then as the glaciers started retreating, British lakes were colonized in different events; in fact, the Scottish lochs are low altitude, whereas the Welsh and English lakes are high altitude, suggesting that the European whitefish established in these water systems at different time points (Wheeler, 1977 (Oppo et al., 2001). During this period, temperatures increased and tree vegetation reached Scandinavia (Hall et al., 2002). The retreat of the glaciers could have allowed the ancestral stock of European whitefish to move westward. Through the early drainages of the ice lakes (Østbye et al., 2005), whitefish entered the Baltic sea around 81 Kya, corresponding to the interstadial GI 21.1 (Rasmussen et al., 2014) or MIS 5a (Oppo et al., 2001).
The Baltic population in this study showed the highest genetic diversity. The high genetic diversity could be explained by repeated gene flow between the Baltic population and the other lineages, as suggested by the low differentiation, treemix results and the demographic modelling. These introgression events are also supported by mtDNA analysis (Østbye et al., 2005). In particular, the introgression between the ancestor of the Baltic and Alpine population could explain the presence of northern mtDNA haplotypes in the Alpine whitefish radiation (Hudson et al., 2011;Østbye et al., 2005).
Overall, our results add support to the evidence of the Baltic sea as a recognized contact zone between divergent lineages of aquatic species (Johannesson et al., 2020).
Our analysis based on nuclear data does not find a clear distinction between a southern and a northern European whitefish clade, as suggested by mtDNA (Hudson et al., 2011;Jacobsen et al., 2012;Østbye et al., 2005). Instead, the demographic modelling finds a scenario in which three populations, Norwegian, Baltic and Scottish, split one after the other from the common ancestor of European whitefish within a 30 Ky period, from 92 to 65 Kya; then, we observe no split until 20 Ky, when the Alpine lineage splits from the Welsh and English stock. The Baltic population analysed here is from Achterwasser in northern Germany, and it was previously found to possess the northern mtDNA haplogroup (Jacobsen et al., 2012;Winkler et al., 2011).
The fact that the demographic modelling shows the Baltic population splitting after the ancestor of the Norwegian population, rather than from it, adds support to the findings that the separation between southern and northern mtDNA haplogroups is not found in nuclear data. Furthermore, in our and previous analyses (Winkler et al., 2011), the northern European mtDNA haplotypes are prevalent in the Alpine population of Koppentraun, Austria. However, the nuclear data show a strong divergence between the Alpine and Norwegian populations, adding another line of evidence to the discrepancy between nuclear and mitochondrial DNA signal for European whitefish. Discordance between nuclear DNA and mtDNA is well documented in the literature (reviewed by Toews & Brelsford, 2012) and might be a common feature of species whose demographic history has been heavily influenced by glacial cycles (Dufresnes et al., 2020). Cyto-nuclear discordance has also been recovered by previous studies on whitefish, in particular in the Alpine whitefish radiation, where nuclear data supported a monophyletic origin of all Alpine populations, whereas mtDNA showed the presence of both southern and northern mtDNA haplogroups (Hudson et al., 2011), and in the Danish and Baltic populations, where F ST differentiation between Baltic (northern mtDNA haplogroup) and Danish (southern mtDNA haplogroup) was in some cases an order of magnitude higher for mtDNA than from microsatellite data (Hansen et al., 1999).

| CON CLUS ION
We provide molecular evidence of a deep divergence between, and a dual colonization of, Britain by Scottish and Welsh/English populations of European whitefish, which reflects a distinctive pre LGM pattern at the edge of the species distribution relevant to continental recolonization. Previous studies uncovered a similar structuring between the seven British populations (Beaumont et al., 1995;Crotti, Adams, Etheridge, et al., 2020;Hartley, 1995), but it is the first time it has been possible to estimate the time of divergence, highlighting the increased resolution NGS data can provide for conservation (Funk et al., 2012). Our findings support the delimitation of separate conservation units (CUs) (Funk et al., 2012) for each population and, in particular, highlight the uniqueness of the Scottish lineage which diverged and survived in isolation with limited contact for ca. 50K years. Although we did not detect genomic islands of differentiation between Scottish and remaining populations, future work could look at the adaptive differentiation and demographic history between populations within country, that is between the two Scottish populations and between the four English population, to better understand the adaptive and demographic process operating at the local level (Flanagan et al., 2018). Conservation measures have already been implemented to preserve the genetic stock of most British populations through the use of conservation translocations (Adams et al., 2014;Crotti et al., 2021;Maitland & Lyle, 2013;Thomas et al., 2013;Winfield et al., 2013), and we encourage national conservation bodies to look beyond the simple species-based paradigm as they aim to protect the diversity of European whitefish at the most western edge of its range.

CO N FLI C T O F I NTE R E S T
The authors have no conflict of interest to declare.

AUTH O R CO NTR I B UTI O N S
MC, CEA and KRE designed the research. MC, CWB, ARDG, IJW, JW, GB, KP and CEA provided biological samples. MC and MB conducted laboratory work with support from KRE. MC conducted the analyses. MC wrote the paper with contributions from KRE and CEA. All authors edited/approved the final manuscript.

PEER R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/jeb.13948.

DATA AVA I L A B I L I T Y S TAT E M E N T
Scripts, data and the parameter data files for the demographic modelling in fastsimcoal are archived and made available in the University