The role of cryptic diversity and its environmental correlates in global conservation status assessments: Insights from the threatened bird's‐eye primrose (Primula farinosa L.)

Most of the fundamental questions in conservation biogeography require the description of species geographic boundaries and the identification of discrete biological units within these boundaries. International conservation efforts and institutions rely mainly on traditional taxonomic approaches for defining these boundaries, resulting in significant cryptic diversity going undetected and often extinct. Here, we combine high‐throughput genomic data with publicly available environmental data to identify cryptic diversity in the threatened bird's‐eye primrose (Primula farinosa). We aim to characterize evolutionary lineages and test whether they co‐occur with changes in environmental conditions. These lineages can be used as intraspecific units for conservation to enhance assessments regarding the status of threatened species.


| INTRODUC TI ON
Most of the fundamental questions in conservation biogeography require knowledge of the geographic distributions and ecological niches of individual species (Riddle, Ladle, Lourie, & Whittaker, 2011;Whittaker et al., 2005). This knowledge is essential to better understand species responses in a rapidly changing world and prevent the ever-increasing loss of their diversity (Waldron et al., 2017). However, conservation biologists and international efforts, such as the Red List of the International Union for Conservation of Nature (IUCN), often face the double challenge of describing species geographic boundaries and identifying discrete biological units within these boundaries (Keith et al., 2015;Riddle et al., 2011). The identification of subspecific units is of major importance in order to protect both evolutionary history and ecological processes below the species level (Faith et al., 2010;Moritz, 2002;Palsbøll, Bérubé, & Allendorf, 2006).
Wide-range species, particularly those with limited vagility or dispersal potential, often show strong phylogeographic structure across their distributions (Avise, 2009). This structure has been mostly shaped by past geological, ecological and evolutionary processes, and the resulting intraspecific units (i.e., historically isolated groups of populations; hereafter referred to as "lineages") frequently display distinct ecological characteristics (Allendorf, Luikart, & Aitken, 2013). Although intraspecific lineages have long been recognized as significant units for conservation (Moritz, 1994;Ryder, 1986), relevant studies have focused more on the genetic (using mostly a few neutral markers) and less on the ecological (and potentially adaptive) distinctiveness of these lineages (Funk, McKay, Hohenlohe, & Allendorf, 2012;De Guia & Saitoh, 2007). However, responding to recent and ongoing destructive impacts of global climate change and human activities on intraspecific diversity (Miraldo et al., 2016;Pauls, Nowak, Balint, & Pfenninger, 2013), and therefore to ecosystems (Des Roches et al., 2018), requires the integration of genetic and environmental data. Conservation studies integrating both biodiversity aspects are still scarce but are expected to increase as more genetic and environmental data are becoming available (Jenkins, Yannic, Schaefer, Conolly, & Lecomte, 2018;Wilting et al., 2015;Yannic et al., 2017).
Characterization of intraspecific diversity is often linked to the identification of morphologically cryptic lineages (Bickford et al., 2007;Riddle et al., 2011). Cryptic diversity is commonly defined as the occurrence of distinct evolutionary lineages that are otherwise morphologically indistinguishable within a nominal species (Bickford et al., 2007;Struck et al., 2018). Despite the morphological similarities, cryptic lineages may carry not only unique evolutionary trajectories but also the potential of differing responses to ongoing and future global change (Bernardo, 2011;Feckler et al., 2014;Paaby & Rockman, 2014). Therefore, biodiversity assessments that ignore cryptic lineages, including them in a single species or a single conservation unit, may severely undervalue current and future biodiversity patterns (Bálint et al., 2011;Fišer, Robinson, & Malard, 2018;Riddle et al., 2011). Cryptic diversity is typically assessed using genetic data, and the advent of high-throughput sequencing has the potential to revolutionize the discovery of cryptic lineages across the tree of life (Allendorf, Hohenlohe, & Luikart, 2010;Benestan et al., 2016;Funk et al., 2012). However, the number of studies that use genomewide data to uncover cryptic diversity remains small (see Struck et al., 2018 for a review). Additionally, there is an ever growing need to broaden the definition of cryptic diversity and incorporate environmental data when assessing intraspecific cryptic variation (Espíndola et al., 2016;Freudenstein, Broe, Folk, & Sinn, 2017;Sullivan et al., 2019).

Assessment of environmental conditions across populations
has traditionally relied on time-consuming and costly field observations and measurements of relevant ecological variables, with time and cost increasing with species range, number of populations and number of variables (Albert et al., 2010). However, the ever-increasing public availability of spatial environmental data is advancing biodiversity research and enables the time-and cost-effective discovery of ecological biodiversity patterns across taxonomic and spatial scales (Franklin, Serra-Diaz, Syphard, & Regan, 2017;La Salle, Williams, & Moritz, 2016). Data on abiotic variables directly involved in species' physiological limitations and distributions, such as climate and soil characteristics, are publicly available from several sources either as spatial interpolations of field measurements, satellite observations of the earth's surface (i.e., remote sensing) or a combination of both (Franklin et al., 2017). Additionally, remote-sensing technologies offer the means to closely approximate local biodiversity patterns and biotic interactions at large to moderate spatial scales in the form of land cover classes (He et al., 2015;Lausch et al., 2016). Coupled with high-throughput genomic data, statistically derived and remotely sensed environmental data can significantly enhance our ability to uncover and document diversity patterns, yet studies that integrate these diverse data are lacking (Bush et al., 2017;Yamasaki et al., 2017).
The bird's-eye primrose (Primula farinosa L.) represents an excellent system to investigate potential patterns of cryptic diversity Main conclusion: Our results highlight the need for incorporating intraspecific diversity in global assessments of species conservation status and the utility of genomic and publicly available environmental data in conservation biogeography.

K E Y W O R D S
CHELSA, high-throughput genomic data, IUCN, land cover, MODIS relevant for conservation. Primula farinosa is a cold-adapted herb in the family Primulaceae characterized by having a wide, but disjunct distribution in Europe and north-east Asia (Hambler & Dixon, 2003;Krasnoborov, 2000;Roskov et al., 2018;Shishkin & Bobrov, 1952).
Previous taxonomic treatments based on a wide range of diagnostic traits suggested that P. farinosa is largely absent from western and central Siberia, where it is replaced by closely related species, but it occurs from the east of the Altai mountains through Mongolia to eastern Siberia and the Kamchatka peninsula (Fedorov, 2001;Krasnoborov, 2000;Richards, 2003;Shishkin & Bobrov, 1952). The species shows significant variability with many forms and intermediates over its extended Eurasian distribution (Shishkin & Bobrov, 1952;Krasnoborov, 2000;Theodoridis, personal observations), and the lack of distinct morphological differences among these forms has resulted in significant controversy among taxonomists (Hambler & Dixon, 2003;Krasnoborov, 2000;Richards, 2003;Shishkin & Bobrov, 1952). The species grows in calcareous grasslands and alpine streams or lakes at altitudes between 400 and 3,000 m a. s. l. in southern Europe (Carpathians, Alps, Iberia) and Mongolia, while in northern Europe (British Isles, Baltic region and southern Scandinavia) the species grows exclusively in wetlands between 0 and 400 m a.s.l.
Here, we use P. farinosa to (a) identify potential cryptic genetic diversity relevant for IUCN conservation status assessments and (b) test for environmental correlates of intraspecific cryptic diversity F I G U R E 1 Sampling localities of populations of Primula farinosa across its Eurasian range. Altitude is indicated by colour (purple to white) at each locality. Pictures represent species morphology and variation of habitat type in four distinct geographic regions in species with wide continental distributions. To this end, we use previously published and newly generated high-throughput genomic data to identify intraspecific lineages across the species' range. We also extract statistically predicted and remotely sensed environmental variables to approximate the biotic and abiotic habitat of the species. Finally, we test for significant associations between genetic groups and environmental variables. By combining genomic with publicly available environmental data, we aim at contributing novel ways for assessing intraspecific cryptic diversity in conservation biogeography.

| Sampling
To estimate genetic structure and ecological diversity within P. farinosa, we sampled 71 populations that encompass most of its known Eurasian distribution ( Figure 1). Our sampling design aimed at representing most genetic and ecogeographic variation within P. farinosa (Albert et al., 2010;Gotelli & Stanton-Geddes, 2015). For its European distribution, we used 57 populations reported previously (Theodoridis et al., 2018(Theodoridis et al., , 2017 and selected one to three individuals per population based on the size of the available genomic data (i.e., maximizing the total number of sequencing reads per individual). Additional sampling details are provided in Table S1. Distribution and sampling maps were created using the Strabo library (https ://github. com/spyro stheo dorid is/strabo) in JavaScript.

| Sequencing and bioinformatics
DNA extraction and sequencing followed the approach described by Theodoridis et al. (2017). Briefly, leaf samples collected from 93 individuals were preserved in silica gel and DNA was extracted using a modified CTAB protocol (Doyle & Doyle, 1990). Genomic li- Raw DNA sequences were filtered and trimmed following the recommendations of Minoche, Dohm, and Himmelbauer (2011) as implemented in GonoSpy v0.1 package (https ://github.com/spyro stheo dorid is/gonospy) in Python. Initial analysis of the raw reads indicated the existence of a few invariable organellar sites with extremely high coverage. Therefore, we further excluded these sites by mapping the filtered reads to the chloroplast genome of Primula veris and the mitochondrial genome of Vaccinium macrocarpon (Ericaceae) using Stampy v1.0.28 (Lunter & Goodson, 2011) and applying an expected divergence from the reference (substitution rate) of 0.05 substitutions per site. After filtering, our final dataset consisted of nuclear reads that were 90 bp long. To identify putative homologous loci across all individuals, we employed the StacKS pipeline v.1.44 (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013) and applied the following parameter values: we required a minimum coverage (m) of five identical reads per stack by allowing a maximum of two gaps (max_gaps) between reads; we also removed stacks with coverage of more than two standard deviations above the mean (Catchen et al., 2013); we allowed a maximum distance (M) between stacks of two nucleotides and a maximum number of two stacks at a single de novo locus in each individual; a maximum of six mismatches (n) and a maximum of four gaps (max_gaps) was allowed between putatively homologous loci across all individuals.

| Phylogenomic inference and genetic structure
To assess intraspecific genetic structure and phylogenetic relationships in P. farinosa, we used the Variant Call Format (VCF) file exported by StacKS and filtered it using GonoSpy to exclude SNPs (a) that were represented by ≤5% of the total reads for the corresponding locus within each individual, (b) with a frequency ≤0.05 across all individuals and (c) absent in more than 30% of the sampled individuals. To infer phylogenetic relationships among the sampled individuals of P. farinosa, we used the filtered SNP data (as described above), including all SNPs per locus, and a maximum likelihood approach implemented in raxml v8.2.11 (Stamatakis, 2014). Alignment sites consisting of only heterozygotes and homozygotes for a single allele (but not homozygotes for the alternative allele) are considered invariant by raxml; therefore, we further excluded these sites using GonoSpy. The final filtered data matrix consisted of 4,089 SNPs. To account for the lack of invariant sites, we applied an ascertainment bias correction (--asc-corr = lewis) to the likelihood calculations as recommended by Leaché, Banbury, Felsenstein, Nieto-Montes de Oca, and Stamatakis (2015). We used the GTR + gamma model of sequence evolution determined by the model selection procedure implemented in Iq-tree (Nguyen, Schmidt, von Haeseler, & Minh, 2015). We also applied the rapid bootstrap algorithm with 1,000 replicates and performed a full search for the best scoring tree. The final tree with associated branch support values was visualized in Itol webserver (Letunic & Bork, 2016).
Genetic structure was inferred using the variational Bayesian framework implemented in the software faStStructure (Raj, Stephens, & Pritchard, 2014). Since faStStructure assumes that the investigated loci are unlinked, we selected one random SNP per locus and repeated this analysis 20 times to account for the stochasticity stemming from the random choice of SNPs (Theodoridis et al., 2017). After filtering, the 20 final data matrices each consisted of 1,220 randomly chosen SNPs. We ran the analyses for numbers of groups, K, ranging from 1 to 10 and further applied the chooseK. py program to estimate the most likely number of K. For the K values that best explained our data, the results of the 20 randomly selected SNP replicates were combined using the "greedy" algorithm within clumpp v1.1.2 (Jakobsson & Rosenberg, 2007). The results of the clumpp analysis were visualized using the D3 JavaScript library (Bostock, Ogievetsky, & Heer, 2011).
We further evaluated the degree of genetic divergence among the genetic groups identified from the phylogenomic and genetic structure analyses using Nei's standard genetic distance (Nei, 1972). Following Joly, Bryant, and Lockhart (2015), for each pairwise comparison we first calculated the mean distance of all SNPs within each locus and then calculated the genome-wide distance by estimating the mean across all locus distances using custom Python scripts.

| Biotic environment
To approximate the type of biotic environments across P. farinosa populations, we used the Global Consensus Land Cover database (Tuanmu & Jetz, 2014) that provides information on the prevalence of 12 land cover (including land use) classes at 1-km resolution  (Hunter, 2007) in Python.

| Abiotic environment
We approximated the abiotic conditions of P. farinosa populations using soil, precipitation and temperature variables obtained from two different sources (global reanalysis data from CHELSA and remote sensed data from MODIS; see below). We combined the above variables in two different datasets, one that included the soil, precipitation and the CHELSA temperature data (hereafter abiotic/CHELSA) and a second one that included the soil, precipitation and the MODIS temperature data (hereafter abiotic/MODIS). We then reduced the dimensions of each of the two datasets by performing principal components analysis as described above (see Biotic environment section). The individual abiotic variables are described below.

| Soil attributes
We characterized soil attributes using six soil variables obtained from the web-based global soil information system (SoilGrids; https ://soilg rids.org) made available by the International Soil Reference and Information Center (ISRIC) at 250 m resolution (Hengl et al., 2017). The SoilGrids system provides global predictions for standard numeric soil properties and is generated using automated statistical mapping and machine learning (Hengl et al., 2017). For each sampled population, we extracted information on the following variables: Predicted most probable class following the World Reference Base (predClass), soil pH in H 2 O at 0 cm (pH1), soil pH in H 2 O at 5 cm (pH2), absolute depth to bedrock (absDepth), soil organic carbon content at 0 cm (carbCont1) and soil organic carbon content at 5 cm (carbCont2).

| Temperature and precipitation (CHELSA)
Precipitation and temperature at 2 m above ground for each sampled population were obtained from CHELSA (Climatologies at high resolution for the earth's land surface areas; http://chelsa-clima te.org/) at 30 arc sec (c. ~1 km on the equator) resolution (Karger et al., 2017). CHELSA includes monthly (mean, maximum and minimum) temperature and precipitation patterns for the time period 1979-2013 derived by downscaling the model output temperature and precipitation estimates of the ERA-Interim climatic reanalysis (i.e., downscaled global reanalysis data; Karger et al., 2017). Using the monthly temperature and precipitation values, we generated the following set of seven climatic variables that describe annual and growing season climate trends for the sampled populations: minimum temperature of growing season (tminGrow), maximum temperature of growing season (tmaxGrow), average temperature range during growing season (tDiffGrow), average temperature of growing season (tmeanGrow), growing degree days (GDD), annual precipitation (precAnnual) and precipitation during the growing season (prcGrow).
Growing season for P. farinosa was defined as the period from March to September (Hambler & Dixon, 2003).

| Temperature (MODIS)
In addition to the CHELSA temperature data, we obtained land surface (0 m above ground) temperature derived from thermal infrared measurements of the NASA Moderate Resolution Imaging

Spectroradiometer (MODIS) onboard the Terra and Aqua Earth
Observing System satellites (Wan & Dozier, 1996). MODIS/Terra  and MODIS/AQUA  land surface temperature data are produced daily at 1 km spatial resolution. Because year-round data from both satellites are available since the year 2003, we considered the time period between 2003 and 2017, and for each population, we downloaded and extracted daily temperature data for the growing season (March-September) using the pymodIS v2.0.9 (http://www.pymod is.org/) and choroSpy v0.1 (https ://github.com/spyro stheo dorid is/ chorospy) packages in Python. We then merged the data from both satellites and calculated mean, maximum and minimum monthly temperatures. Subsequently, we used the monthly values to calculate the same five temperature-related variables as described above (tminGrow, tmaxGrow, tDiffGrow, tmeanGrow and GDD).

| Niche overlap and niche similarity tests
To assess ecological differentiation within P. farinosa, we quantified climatic niche differences among the identified genetic groups using the approach described by Broennimann et al. (2012) in two-dimensional environmental space (PCA; see also Theodoridis et al., 2013). We first calculated niche overlap between genetic groups using Schoener's D similarity index (Schoener, 1970), a metric that ranges from 0 (completely discordant niches) to 1 (identical niches).
We then defined a background extent for each genetic group using a 20 km buffer zone around the sampling localities (Theodoridis et al., 2018) and tested whether the ecological niches of the groups tend to be more similar to each other than would be expected by chance using background (or niche) similarity tests (Broennimann et al., 2011;Warren, Glor, & Turelli, 2008). These tests randomly shift the density of occurrences in one range and calculate similarity (Schoener's D) using the observed niche from the other range. We repeated the process 1,000 times for both ranges under comparison, and the observed D value was compared to the distribution of the randomized D values in a one-sided t test (Broennimann et al., 2011).
The tests were conducted for each environmental category (biotic, abiotic/CHELSA, abiotic/MODIS). For each of the two abiotic categories, the soil variables (and the CHELSA precipitation variables for the abiotic/MODIS) were reprojected and resampled to match the projection and resolution of the respective temperature layers using the Geospatial Data Abstraction Library v2.3.2 (GDAL, www. gdal.org). All tests were conducted using the "ecospat" package in R (Di Cola et al., 2017). where P 1 , P 2 , … P k−1 are the posterior probabilities of each individual belonging to the respective ancestral group (i.e., group one to group K − 1) and Ê is the predicted environmental score along the considered PCA axis. For populations represented by more than one individual (i.e., Balkan and Asian populations), and since all individuals within populations showed identical ancestry profiles (see Results), we randomly selected one individual. Linear and polynomial regression models were fitted using the "lm" and "glm" functions in R version 3.4.0 (R Development Core Team, 2017), respectively. To further test the adequacy of linear models in explaining environmental variation across populations, we tested for residual spatial autocorrelation in each model using Moran's I with kernel distance weights implemented in the python package pySal v.2 (https ://pysal.org/) and 1,000 simulations.  Table 1). E = 0 + 1 P 1 + 2 P 2 +…+ k−1 P k−1 F I G U R E 2 Phylogenomic and population structure analyses of Primula farinosa. (a) Maximum likelihood (RAxML) unrooted tree inferred from a matrix of 4,089 variant sites (SNPs). Colours indicate distinct lineages (i.e., groups of populations separated by long branches) and correspond to distinct geographic regions (inset map). Numbers on major branches indicate branch support (1,000 bootstrap replicates). Branches with <70% bootstrap support were collapsed. (b) The results of the fastStructure analyses based on 20 repetitions of randomly chosen SNPs from 1,220 loci. The Balkan and Asian populations are represented by three and two individuals, respectively (separated by black inner ticks)

| Multivariate biotic and abiotic space and niche similarity
The results of the PCA of the biotic (land use) dataset indicated that the Asian populations, particularly those in the Altai mountains, tend to grow in land mainly covered by shrubs and herbaceous vegetation or land that is mostly barren or snow-covered (Figure 3a). In contrast, the European populations, especially those that belong to the central-northern European group, appear to occupy a wide range of land cover classes, from tree dominated land to land that is mainly cultivated or managed (Figure 3a). Niche overlap (Schoener's D) was generally slightly higher in biotic space across all lineage pairwise comparisons, compared to abiotic space (Table 2). Moreover, the majority of the background similarity tests did not support significant similarities between the available habitats of the six lineages in both biotic and abiotic environments, suggesting ecological divergence in P. farinosa (Table 2).

| Isolation by distance and ancestry/ environment associations
The Pearson correlation coefficient between the log-transformed geographic and genetic distances were relatively high among all Eurasian populations (r = 0.75, p < 0.001; Figure 4), a result probably reflecting the large geographic distance separating the European and Asian populations (Figure 1). This correlation was much weaker when considering only the European populations (r = 0.46, p < 0.001), suggesting that geographic distance is a weak predictor of the genetic divergence observed among populations.
In general, the results of the multiple regressions indicated significant associations between individual ancestry proportions and environment (i.e., PCA axes), with varying degrees of fit across datasets, number of groups (K) and PCA axes ( Table 3).
The first PCA axis in both abiotic datasets, that is CHELSA and MODIS (mostly representing soil pH, temperature range during the growing season, minimum and maximum temperatures during the growing season, and annual precipitation; Figure 3c  Note: Backgrounds for the tests were defined by applying a 20 km buffer zone around the sampling localities of each lineage. The background similarity tests were performed twice for each lineage pair. The second number in the parentheses after the p value indicates the lineage whose background was used for the respective similarity tests. Numbers in bold indicate significant p values (<0.05). and CHELSA respectively) compared to the quadratic polynomial model (AIC = 241.77 and AIC = 243.46; Table 3). Coefficients of determination were also significant for the second PCA axis of both abiotic datasets (mostly representing soil pH and annual and growing season precipitation) and K = 5 (adj. r 2 = 0.323 and adj. r 2 = 0.2945 for abiotic/MODIS and abiotic/CHELSA respectively).
Significant associations were also observed between ancestry proportions and the first PCA axis of the biotic dataset (i.e., land cover classes), with much lower degrees of fit, however, compared to the abiotic datasets (adj. r 2 = 0.1727 for K = 4 and adj. r 2 = 0.2747 for K = 5), and significant residual spatial autocorrelation (Moran's I = 0.32 for K = 4 and Moran's I = 0.22 for K = 5). Overall, the abiotic/MODIS dataset showed the best linear fit with no significant spatial autocorrelation in residuals across the top explanatory models (PC1 and PC2 for K = 5, PC1 for K = 4; Table 3).

| D ISCUSS I ON
Identification of nature's units and the geographic description of biodiversity variation have been central objectives in conservation biogeography ever since the birth of the field (Whittaker et al., 2005; see also Riddle et al., 2011). However, challenges related to data availability and integration have hindered progress towards these objectives, particularly in taxa with little or uninformative morphological variation.
Here, we conducted field work and took advantage of high-throughput sequencing technologies and publicly available environmental data to address the above two objectives in the cold-adapted and threatened plant P. farinosa. We identified six major genetic lineages across its Eurasian distribution corresponding to distinct geographic regions and found significant associations between these lineages and their abiotic environments, indicative of high intraspecific diversity and potential local adaptation to different climatic and soil conditions.
Our results highlight the usefulness of genomic and publicly available data in biodiversity research and the need to incorporate intraspecific diversity in global assessments of species conservation status.
According to IUCN (http://www.iucnr edlist.org/detai ls/20339 8/1) and the ITIS Catalogue of Life (Roskov et al., 2018), P. farinosa is widespread in Europe and temperate Asia. However, the species is threatened or extinct in many European countries, mainly because of anthropogenic environmental pressure. While these threats are acknowledged by IUCN, the global status assessment is summarized as follows: "This species demonstrates declines in most of its range, being marked as threatened in several national red lists. It suffers from grazing and lack of management in its grassland habitats. However, the species is still widespread and unlikely to severely decline in the near future towards extinction.
More information on the current population size, trend and the overall rate of decline is needed to review whether it would qualify for threatened under Criterion A." From this report, it is clear that by considering P. farinosa as a widespread species, local extinctions are overlooked because of its relative abundance globally.
The countries where the species has already gone extinct or displays the most severe declines are Hungary, Poland, Slovakia and Romania. Both our current and previous studies (Theodoridis et al., 2018(Theodoridis et al., , 2017 show that populations in these countries form a distinct Carpathian lineage (Figure 2), occupying to a large extent an ecological niche distinct from the rest of the Eurasian populations ( for securing the protection of unique evolutionary lineages. Identification of cryptic diversity has traditionally relied on the use of a few loci to estimate evolutionary relationships or genetic distances among populations within nominal species (Fišer et al., 2018). Despite the rapid progress in high-throughput sequencing technologies in the last decade, studies that harness the power of genomic data remain few (Struck et al., 2018). Moreover, studies that integrate genomic with publicly available environmental data, that is statistically derived and/or remotely sensed data, to identify cryptic diversity are scarce (Dufresnes et al., 2018). In this study, we combined genome-wide polymorphisms (4,089 SNPs) with climate, soil and land cover data to identify cryptic genomic diversity and its environmental correlates within the cold-adapted P. farinosa. We identified six major lineages (four previously known, two newly discovered; Figure 2) and significant lineage-environment associations (Table 3). Specifically, we uncovered strong relationships between TA B L E 3 Statistical relationships between genetic structure (ancestry; see Figure 2b) of the studied populations and the first three PCA axes in their biotic (land cover classes) and abiotic (climate and soil characteristics) environmental space Note: r 2 : coefficient of determination obtained using a linear model and significance level (asterisks); Moran's I: residual spatial autocorrelation for the linear models and significance level (asterisks); Linear and polynomial models were fitted using the results of the fastStructure analysis (i.e., ancestry proportions) for K = 4 and K = 5. Abbreviation: AIC, Akaike's information criterion. *p ≤ 0.05 **p ≤ 0.01 ***p ≤ 0.001 individual ancestry proportions and the abiotic environment that the individuals occupy. Moreover, the abiotic dataset that includes the remotely sensed surface temperature (abiotic/MODIS) showed a better linear fit across PCA axes and numbers of ancestral groups compared to the abiotic/CHELSA dataset (Table 3). This result further stresses the superiority of satellite-obtained estimates of climatic variables to ground station statistical interpolations in capturing climatic variation in plant species distributions, particularly for regions with complex topography or sparse meteorological station data (Deblauwe et al., 2016). Although we could not experimentally test whether the observed genomic divergence among the Eurasian populations is neutral or adaptive, our IBD results (Figure 4) and the ancestry/environment associations (  (Liu et al., 2013).
Morphological stasis, the tendency of closely related taxa to retain a high degree of morphological similarity over extended periods, has been suggested as a major mechanism to explain high levels of cryptic genetic diversity within species despite the lack of noticeable intraspecific morphological variation (Bickford et al., 2007;Gould & Eldredge, 1977). Taxa experiencing strong selection on traits for adaptation to specific environments might not show morphological disparity, despite their long temporal and spatial isolation (Davis et al., 2014;Estes & Arnold, 2007). Our results show that niche overlap (Schoener's D) among lineages is generally higher in the biotic environmental space compared to the abiotic space, albeit no significant niche similarities were detected (Table 2). Additionally, we found weak relationships between individual ancestry profiles and their occupied biotic space (adj. r 2 = 0.1727 for K = 4 and adj. r 2 = 0.2747 for K = 5; Table 3), indicative of the low variation in biotic conditions, that is land cover types, across lineages, despite the long branches separating them (Figure 2a). Although we used a relatively coarse resolution land cover dataset (~1 km), these results may explain the lack of morphological disparity across the Eurasian lineages, adding further evidence for the role of biotic interactions in constraining the evolution of diagnostic traits (i.e., stabilizing selection) over large spatiotemporal scales (Charlesworth, Lande, & Slatkin, 1982).
While the nature and definition of species boundaries continue to be a matter of debate, recent views have focused on the integration of various criteria, including intrinsic reproductive isolation, morphological diagnosability, monophyly and distinct ecological niches (Freudenstein et al., 2017;De Queiroz, 2007). Our results address the last two criteria by revealing strong ecological and evolutionary differentiation among the six P. farinosa lineages, although Nei's standard genetic distances between these lineages (Table 1) fall within the range of values previously reported for intraspecific populations of plant species (Alexander, Poll, Dietz, & Edwards, 2009;Li, Wang, Liu, Zhuang, & Huang, 2018). Whether the identified Eurasian lineages could be assigned to different nominal species or not will depend on additional research on intrinsic reproductive isolation among them, as well as on phylogenomic analyses including closely related and morphologically distinct species within the highly diverse Primula sect. Aleuritia Duby clade.

| CON CLUS IONS
Identifying discrete biological units within threatened species has important implications for conservation. Conservation guidelines rely mainly on traditional taxonomic characters in defining conservation units or species. However, under this approach, significant cryptic diversity may go undetected and often extinct. Integrative approaches that combine genome-wide data with easily accessible environmental information can provide unprecedented resolution below the species level and significantly help conservation agencies in their assessments. Our study demonstrates the utility of these data in the cold-adapted and threatened P. farinosa for uncovering cryptic genetic diversity and its environmental correlates and further underlines the need for bringing together principles from the fields of ecology and evolution in conservation planning.

ACK N OWLED G EM ENTS
We