Looking for the needle in a downsized haystack: Whole‐exome sequencing unravels genomic signals of climatic adaptation in Douglas‐fir (Pseudotsuga menziesii)

Abstract Conifers often occur along steep gradients of diverse climates throughout their natural ranges, which is expected to result in spatially varying selection to local climate conditions. However, signals of climatic adaptation can often be confounded, because unraveled clines covary with signals caused by neutral evolutionary processes such as gene flow and genetic drift. Consequently, our understanding of how selection and gene flow have shaped phenotypic and genotypic differentiation in trees is still limited. A 40‐year‐old common garden experiment comprising 16 Douglas‐fir (Pseudotsuga menziesii) provenances from a north‐to‐south gradient of approx. 1,000 km was analyzed, and genomic information was obtained from exome capture, which resulted in an initial genomic dataset of >90,000 single nucleotide polymorphisms. We used a restrictive and conservative filtering approach, which permitted us to include only SNPs and individuals in environmental association analysis (EAA) that were free of potentially confounding effects (LD, relatedness among trees, heterozygosity deficiency, and deviations from Hardy–Weinberg proportions). We used four conceptually different genome scan methods based on FST outlier detection and gene–environment association in order to disentangle truly adaptive SNPs from neutral SNPs. We found that a relatively small proportion of the exome showed a truly adaptive signal (0.01%–0.17%) when population substructuring and multiple testing was accounted for. Nevertheless, the unraveled SNP candidates showed significant relationships with climate at provenance origins, which strongly suggests that they have featured adaptation in Douglas‐fir along a climatic gradient. Two SNPs were independently found by three of the employed algorithms, and one of them is in close proximity to an annotated gene involved in circadian clock control and photoperiodism as was similarly found in Populus balsamifera. Synthesis. We conclude that despite neutral evolutionary processes, phenotypic and genomic signals of adaptation to climate are responsible for differentiation, which in particular explain disparity between the well‐known coastal and interior varieties of Douglas‐fir.


| INTRODUC TI ON
Conifers have successfully occupied a large number of habitats and different climates during their past-glacial histories, which allowed them not only to survive and establish in harsh environments but also to colonize ecoregions with optimal growing conditions (Farjon, 2010). As such, Douglas-fir (Pseudotsuga menziesii) constitutes one of the most widespread conifers in western North America with a distribution from Southern California up to the Northern British Columbia . Within its current range of occurrence, two distinct varieties are known, which appear genetically and phenotypically different: the coastal variety (also called P. menziesii var. menziesii) and the interior variety (P. menziesii var.glauca). While the coastal variety is mainly found from central California to coastal British Columbia along the Pacific coast, the interior variety has its main south-to-north expansion from Wyoming/Montana over Alberta up to central British Columbia (Hermann & Lavender, 1990;Martinez, 1949). Both varieties hybridize when in contact, resulting, for example, in a 450-km-wide hybrid zone in British Columbia  where also populations represented by both varieties and introgressed trees predominantly from the interior into the coastal variety were described (van Loo et al., 2015). The macrogeographic regions of both varieties differ substantially in climate with cool and moist climate conditions in the Pacific habitats toward more continental and dry growing conditions in the interior . In addition, these habitats are also separated by the Great Basin and Columbia Plateau, respectively. Accordingly, several common garden experiments found intraspecific trait differences among Douglas-fir provenances that were related to growth, physiology, and phenology and that showed associations with differences in seed source climate among provenance locations (Anekonda et al., 2004;Bansal et al., 2016;De La Torre et al., 2021;Kleiber et al., 2017;Malmqvist et al., 2017;Vangestel et al., 2018). While such traitclimate associations can indeed suggest patterns of local adaptation due to spatially varying selection (Leinonen et al., 2013), their interpretation can still be doubtful, because the climatic clines at which provenances occur often show parallelism with recolonization routes after population contraction and expansion (Nadeau et al., 2016). Consequently, the putative adaptive signal obtained from such trait analyses can be confounded by those that were left behind by neutral processes such as genetic drift and demographic processes, which have often formed genotypic clines as well (Caye et al., 2019;Günther & Coop, 2013).
Incorporating information from genetic markers has significantly improved our understanding of the role of neutral processes in population structuring of many plant species and has shed light into their postglacial migration histories (Hewitt, 1999). The main challenge, however, still remained, which is to distinguish truly adaptive regions in the genome from those that show strong divergence due to neutral genetic processes. Identifying true environmental associations and adaptive regions in the genome (e.g., from single nucleotide polymorphisms, SNPs) is of utmost importance, since only adaptive markers have the potential to assist tree breeding of more resistent genotypes and selection of climatically adapted genotypes for forest management under climate change (Grattapaglia et al., 2018;Harfouche et al., 2012;Neale & Kremer, 2011). Yet, one of the strongest limitations in disentangling adaptive from neutral genomic locations is acquiring a marker set, which is large and dense enough to cover the genome in a representative and unbiased fashion. This is particularly difficult in conifers such as Douglas-fir with large genomes in the magnitude of several gigabases (De La Torre et al., 2014;Neale et al., 2017;Nystedt et al., 2013). Nevertheless, recent improvement in massive parallel sequencing and bioinformatics has made it possible to reduce the complexity of such genomes by reducing them to the protein coding region, which is called the exome (e.g., Neves et al., 2013). Exome capture in conifers results in massive reduction in genome complexity allowing to rapidly generate markers at high number and relatively low costs for a thorough study of genomic regions of interest (see, for instance, Capblancq et al., 2020;Suren et al., 2016;Thistlethwaite et al., 2017, for examples on conifers). Even though the exome represents only a small fraction of the total genome, these data are particularly suited for environmental association analyses (EAA), since adaptive candidate SNPs can be further examined when a gene model or annotated reference genome is available. Moreover, by using a subset of SNPs that show no significant deviation from neutrality expectation (as, e.g., examined with F ST statistics), neutral population substructuring can be estimated as well.
In this study, we investigate a 40-year-old common garden experiment with 16 Douglas-fir provenances from across its distribution by combining dendroclimatological methods with modern sequencing technique, which allowed us to generate a dense set of SNPs throughout the entire exome. Based on the first finding that Synthesis. We conclude that despite neutral evolutionary processes, phenotypic and genomic signals of adaptation to climate are responsible for differentiation, which in particular explain disparity between the well-known coastal and interior varieties of Douglas-fir.

K E Y W O R D S
climatic adaptation, common garden experiment, Douglas-fir, environmental association analysis, exome capture provenances showed a strong association between growth traits (response of annual increment to summer temperature) and seed source climate, we used classical F ST outlier approaches, Bayesian inference methods, and latent factor mixed modeling (LFMM) in order to identify SNPs with a truly adaptive signal for climate adaptation. We hypothesize that spatially varying selection has shaped the actual pattern of trait differentiation in Douglas-fir despite a high amount of neutral covariance caused by genetic drift and gene flow.

| Common garden experiment
The analyzed trees originate from a block-replicated common garden experiment established in 1976 in eastern Austria with a total of 49 native Douglas-fir seed sources (provenances). The mean annual temperature of the trial site is 7.4°C, and mean annual precipitation is 650 mm (average over the period from 1961 to 1990; source: Austrian Central Station for Meteorology). Trees have been planted as two-year-old plants in a 2 × 2 m spacing, and each provenance was replicated three times in a random fashion within the trial. In 2012, two cores per tree were taken from a subset of 16 provenances originating from a wide part of the natural distribution encompassing Oregon, Washington, and British Columbia (see Figure 1, Table 1, and more information below). In brief, tree cores were taken at breast height for a total of 178 trees (9-15 per provenance and evenly sampled across the three blocks) and stored in plastic tubes until further processing in the laboratory.
Cores were then cut into approx. 1.4-mm-wide cross sections with a double-blade circular saw, placed on microfilms, and exposed to a 10 kV (24 mA) X-ray source for 25 min. Thereafter, the obtained microfilms were analyzed using the software WinDENDRO 2009 (Regent Instrument, Quebec, Canada) by measuring annual increments to the nearest 0.001 mm for early-and latewood, respectively (see also George et al., 2019George et al., , 2017George et al., , 2015, for more information on the methodology). The 50% percentile of the density distribution between minimum and maximum density of each ring was used for the identification of earlywood-latewood boundaries (Fries & Ericsson, 2009; illustrated in Figure S1). The radial increment data from this common garden experiment, which is in detail described in George et al. (2019), were reanalyzed for this study, and therefore, additional needle samples for DNA analyses were taken from the same 178 trees in May 2019. Needles were shot from the lower part of the tree crowns by using a shot gun and stored in silica gel until DNA extraction.

| DNA analysis, probe design, and SNP calling
For each sample, genomic DNA was extracted from lyophilized needle tissue using a CTAB protocol (van der Beek et al., 1992) with minor modifications made for the processing of 96-well plates. DNA concentration and quality were assessed with Qubit ® Quant-iT dsDNA BR Assay kit and a Qubit 1 Fluorometer (Thermo   Neale et al. (2017). Only probes that hit once to the genome and that showed sufficient GC content were included. This resulted in 20 K probes that were found in 12,272 scaffolds with an average number of 1.6 probes per scaffold and a maximum number of 18 probes per scaffold. Sequence capture was performed using RAPiD Genomics proprietary chemistry and workflows. Samples were pooled equimolar and sequenced using a HiSeq 2x150.
After sequencing, Trimmomatic (Bolger et al., 2014) was used to remove sequencing adapters and the trimmed reads were mapped with BWA (Li & Durbin, 2009) using the default settings to the Douglas-fir reference genome. SNPs within probes were identified with FreeBayes (Garrison & Marth, 2012, available at https://github. com/ekg/freeb ayes). VCFtools (Danecek et al., 2011) was used to store the identified SNPs for further downstream analysis in VCF format by applying the following filter: accepted mean depth across all individuals = 750, minimum total depth per individual = 3, percent of individuals allowed missing to retain a site = 0.4, minimum SNP quality = 10.

| Consecutive SNP filtering and exclusion of trees with confounding effects
In order to create a genomic dataset that is free of sources, which could potentially affect the false-positive discovery rate of adaptive SNPs, we used a restrictive and conservative filtering approach: As such, we removed all SNPs with a minor allele frequency < 0.05 and those SNPs that showed significant deviation from the Hardy-Weinberg expectation (threshold: 10 -6 ). Furthermore, we performed pairwise linkage disequilibrium (LD) pruning between markers with a LD threshold of 0.2 in order to retain only unlinked SNPs for environmental association analysis. For this, we used the snpgdsLDpruning function implemented in the SNPrelate package in R (R Development Core Team, 2003). Furthermore, kinship among individuals was estimated by calculating the identity-by-descent (IBD) methods of moments between all pairs of individuals using a subset of SNPs that was already corrected for LD (see section above). We retained only those trees that showed a kinship coefficient <0.25 to any other tree in the dataset. Trees that showed signals of heterozygote deficit (calculated as 1 − (Het obs /Het Exp ) over all loci) were removed prior to analysis with a threshold of 0.1. Finally, we included only trees with an overall SNP call rate of 0.95 for subsequent analyses. All filtering steps were performed in R by using the packages gdsfmt, SNPrelate (Zheng et al., 2012), and vcfR (Knaus & Grünwald, 2017).

| Climate data
Information on provenance climate was obtained from the ClimateNA database  under http://www.clima tewna.com/. Latitude/longitude and elevation information for provenance origins was used to retrieve a total of 247 annual, seasonal, and monthly climate variables (reference period: 1961-1990). The full list of climate variables can be found in Appendix S1. In order to reduce the complexity of these data, the 247 variables were transformed into principal components and the first four PCs were standardized (i.e., expressed as standard deviations from the mean). These standardized climate PCs (hereafter called environmental PCs) were used for subsequent analyses of environmental association analysis (see Section 2.5 below).

| Phenotypic data
Growth traits for this dataset were derived from the study of George et al. (2019). Briefly, tree-ring series were standardized using a 15- year cubic smoothing spline in order to remove the biological age trend. The standardized series were aggregated to chronologies for each provenance by calculating the year-to-year biweight robust mean among trees in each provenance (Bunn, 2008). For the purposes of this study, we used the bootstrapped response functions that were applied to ring width chronologies of the same Douglas-fir provenances in George et al. (2019). Response functions and in particular response coefficients between a growth trait (in our case, ring width) and trial site climate provide insights of the relative importance of an environmental variable for a trait (Lévesque et al., 2014;Zang & Biondi, 2013). However, when calculated at provenance level, bootstrapped response coefficients can be related to provenance source climate in order to unravel genecological clines (i.e., some provenances may react more sensitive to an environmental factor than others as a result of local adaptation). In this study, we by using the R packages bootRes (Zang & Biondi, 2013) and dplR (Bunn, 2008).

| Outlier detection and environmental association analysis (EAA)
To distinguish SNP markers throughout the exome that have a putative adaptive signal for an environmental factor from nonadaptive (neutral) SNPs, we used four different algorithms. These are based either on F ST outlier detection methods (Arlequin & BayScEnv) or on correlations between allele frequencies and environmental variables (BayEnv2 and LFMM2). Regardless of the algorithm that is applied, controlling for variation in allele frequencies that is simply caused by neutral evolutionary processes such as genetic drift or gene flow is a crucial step for identifying adaptive candidate SNPs (Rellstab et al., 2015).  Jeffreys, 1961) were considered as candidates for environmental association analysis. Since the algorithm in Bayenv2.0 is based on MCMC simulations, each run was repeated three times in order to ensure that results remain robust. We additionally calculated spearman rank correlation coefficients in addition to Bayes factors for each SNP in order to ensure that detected candidates were correctly identified and not confounded by outliers (see Table 2

| Adaptive signal of outlier SNPs and functional interrogation
In order to further interrogate the identified candidate SNPs, we first defined a set of consensus variants. Consensus SNPs were defined as those SNPs that were independently detected by at least two of the four chosen methods according to thresholds in Table 2.
Subsequently, we functionally interrogated these SNPs with the help of the annotated Douglas-fir reference genome (Psme.1_0) available under https://treeg enesdb.org/FTP/Genom es/Psme/v1.0/. We used the integrative genomics viewer (Robinson et al., 2011) and explored scaffold-wise whether consensus SNPs were located within known genes or situated nearby to genes. Finally and in order to prove whether the unraveled consensus SNPs have the ability to discriminate between provenances adapted to different climates, the first and second eigenvectors for each tree were calculated based on consensus SNPs and, for comparison, based on an equal number of randomly chosen neutral SNPs, respectively. For this, the snpgdsPCA function from the SNPRelate package in R was employed.

| Isolation by climate versus isolation by distance and isolation by colonization
In order to disentangle the various sources that could have caused genomic and phenotypic differentiation among Douglas-fir populations, we performed redundancy analysis (RDA). RDA uses a multiple linear regression method in order to model allele frequencies as a function of independent explanatory matrices. As independent matrices, we included five different predictors: the first two climate PCs, geographic information (Lat, long), and information on shared colonization history. For the latter, we used the STRUCTURE algorithm (Pritchard et al., 2000) and estimated ancestry proportions   (2) 908 (2) and eastern cluster (see Section 3 and Figure S2). Latitude, longitude, and Q-values were z-transformed prior to analyses as was done for the climate PCs. We performed different sets of RDAs on the complete SNP dataset (17,489 SNPs) and subsequently for the consensus outlier SNP set described in Section 2.6: Combined models included either both climate PCs, geographic information (latitude and longitude), or Q-values, respectively. In individual models, we also tested for the effect of every single predictor separately (i.e., either climate PC1, climate PC2, latitude, longitude, or Q-value). For this analysis, the vegan package in R was employed (Oksanen et al., 2020). From the 178 trees, 5 showed significant heterozygosity deficiency probably as a result of inbreeding and were hence excluded from subsequent analyses. There was no significant family structure among trees within provenances as revealed by IBD methods of moments so that all remaining 173 trees were included in environmental association analysis.

| Climatic groups, population structure, and growth response functions
The first two principal components of the 250 long-term climatic variables together explained 79.9% of variation (Figure 1). The first principal component (hereafter called climate PC1) was strongly related to temperature variables (correlation with mean annual temp. Response of earlywood growth to July temperature at the trial site varied significantly among provenances. In general, inland provenances from British Columbia and Washington had lower response coefficients than their coastal counterparts ( Figure 2). Response coefficients varied between 0.17 (Newport, WA) and 0.42 (Cascadia, OR). Differences among provenances were statistically significant as suggested by 95% confidence intervals of the bootstrapped F I G U R E 2 Phenotypic differentiation among provenances and relationship between response of earlywood to summer temperature in the common garden (y-axis) and climate at seed origin (x-axis) coefficients. When regressed against climate PC1 and PC2, the relationship was significant with a stronger positive earlywood response toward warmer provenance origins (r = .57; Figure 2a) and toward wetter provenance origins (r = .48; Figure 2b). There were no significant relationships to climate PC3 and climate PC4, and consequently, only the first two climate PCs were considered for environmental association analysis.

| Outlier SNPs and environmental associations
The total number of outlier SNPs and those associated with climate varied significantly among detection methods. Arlequin and LFMM2 found a total of 1,148 and 1,082 SNPs, which showed signals of selection and association with climate PC1, respectively.
However, when corrected for multiple comparisons, only 29 and 2 SNPS, respectively, passed the adjusted p-value threshold (−log 10 corrected = 5.54). BayEnv2 revealed a total of 10 SNPs associated with climate PC1 and BayScEnv exhibited 4 SNPs (Figures 3 and 4a, with BayScEnv were also listed as outliers in Arlequin. As the most promising candidates, SNPs #15099 and #78509 appeared independently as outliers associated with temperature in three of the four employed programs (Figure 5a).

| Interrogation of putative adaptive SNPs
A total of 18 consensus SNPs (11 for climate PC1 and 7 for climate PC2) were selected for functional interrogation and spatial frequency analysis. All 18 SNPs were at least found independently by two or three of the used algorithms when the thresholds described in Table 2 were applied. However, since a large number of common SNPs were found between LFMM2 and Arlequin (194 and 46, respectively), we selected only the top five candidates ranked by the -log 10 LFMM p-value for both climate PCs, respectively.
Out of the 18 consensus SNPs, six were located directly within annotated genes, five were situated in close proximity to known genes (less than 1,000 bp up-or downstream), and the remaining seven were located more distantly from annotated genes (>9 kb).
Ten genes had known functions, and these encompassed mostly well-known proteins, enzymes, and transcription factors involved in signaling, DNA-binding, and methylation (Table 3). Most interestingly, for SNP #78509 which was independently found by three of the four algorithms as outlier, a circadian clock protein involved in photoperiod sensing in Arabidopsis thaliana was found to be coded by Douglas-fir gene PSME_16548.
Finally, our two candidate SNPs #15099 and #78509, which were three times independently identified as truly adaptive outliers, showed at least by trend differentiation in space, since the alternative allele increased in frequency for both SNPs toward the inland and those toward colder provenance locations (Pearson's correlation coefficients between allele frequencies and mean warmest month temperature were −0.31 and −0.32, respectively; Figure 6).
When comparing the first two eigenvectors based on the 18 consensus SNPs with eigenvectors obtained from an equal number of randomly chosen neutral SNPs, provenances could be clearly separated into inland and coastal provenances, whereas no clear differentiation was observed for 18 randomly chosen SNPs (Figure 7).

| Isolation by climate versus isolation by distance and isolation by colonization
Redundancy analysis revealed no significant effect of the tested predictors for among-population differentiation regardless of whether the total set of SNPs was used or the 18 adaptive consensus SNPs (Table 4). The proportion of explained variance ranged between 0.007 (east-west ancestry) and 0.013 (climate) in the combined models and between 0.002 (longitude) and 0.01 (climate PC1) in the individual models. Only climate PC1 (temperature) showed a moderate significant effect for among-population differentiation when tested on the outlier SNP set (significant at α < 0.1).

| D ISCUSS I ON
In this study, we combined observations from a common garden experiment with genomic information in order to unravel  (Heer et al., 2018;Housset et al., 2018;Trujillo-Moya et al., 2018), it is one of the very first environmental association analyses incorporating phenotypic data for the economically important Douglas-fir, which is recently attracting attention outside its natural range due to its ability to cope with climate warming and drought (Eilmann & Rigling, 2012). Hence, we will discuss our findings also in light of adaptive forest management.

| Growth response of trees after four decades in the common garden
The sixteen analyzed provenances showed a strong phenotypic cline with temperature and precipitation at seed origin.
Provenances with earlywood growth strongly responding to July temperature at trial site originated mainly from the warmer and wetter coastal sites in British Columbia, Washington, and Oregon, whereas less responsive provenances came from colder and drier inland sites. Many other studies have revealed such a differentiation pattern between the coastal variety and interior variety in terms of either productivity (Eilmann et al., 2013), physiology (Anekonda et al., 2004), or even stress response (Kleiber et al., 2017). Our findings could indeed add evidence that coastal varieties are generally more productive, since earlywood response F I G U R E 4 Manhattan plots for marker p-values as obtained from LFMM2. Black dots show markers with p < .05, blue dots with p < .001 and red dots show all SNPs that were still significant after adjusting for multiple testing F I G U R E 5 Venn diagrams for identified SNPs among the different algorithms. Numbers in red show SNPs that were discovered three times. Numbers in brackets indicate SNPs still significant after adjusting for multiple testing to summer temperature is a strong indicator for overall productivity. Earlywood cells in conifers mainly ensure that water demand in the crown will be sufficiently covered during phases of high evapotranspiration and trees with larger earlywood fractions can consequently allocate more carbon (Björklund et al., 2017). The climate at our trial site already represents the dry margin of the natural distribution of Douglas-fir and is prone to extreme summer droughts (George et al., 2019). Hence, we can assume that trees have been growing under stressful conditions in most of the years. The study by George et al. (2019) in which the same trees have been analyzed also demonstrated that provenances from warmer locations had less growth reductions during years with extreme water deficit, whereas colder inland provenances had the highest reductions in annual increment. However, it has yet to be confirmed whether the higher drought tolerance of coastal provenances (expressed as ratio between growth during the drought year compared with a predrought period) really mirrors the ability of trees to withstand drought or simply mirrors adaptive growth patterns, which could come at the cost of lower recovery or higher mortality after drought periods (Montwé et al., 2015).

| Outlier SNPs and the role of selection across a steep environmental cline
One of the main goals of this study was to identify polymorphisms, which show true signals of adaptation to climate and which are not confounded by neutral processes such as gene flow, drift, and migration. The analyzed provenances showed strong genetic substructuring which partly coincided with phenotypic differentiation (Figures 1 and 2). This makes disentangling adaptive from nonadaptive signals particularly challenging (Ahrens et al., 2018;  Nadeau et al., 2016;Rellstab et al., 2015). To solve this problem, we employed four conceptually different algorithms, which im- SNPs in other genera such as Fagus (Pluess et al., 2016), Populus (Fahrenkrog et al., 2017), Alnus (De Kort et al., 2014), and Pinus (Ruiz Daniels et al., 2019) comprised between 1.6% (Fahrenkrog et al., 2017) and 11% (Pluess et al., 2016) of all SNPs that were analyzed. Moreover, when the data were further aggregated to consensus SNPs, which were found by more than one algorithm, only 11 candidates out of 17,489 (0.06%) were detected for temperature and 7 (0.04%) for precipitation regimes. Since the analyzed provenances exhibit a strong pattern of population substructuring into clearly delimited clusters, we strongly presume that this neutral variation is the most important cause for this finding. In particular, provenances D'Arcy (BC) and Snowqualmie Pass (WA)

TA B L E 3 Functional annotation and genomic position of outlier SNPs
represented strongly isolated subpopulations based on their neutral genetic pattern (Figure 1a), although their phenotypic signal fit well into the observed environmental cline for both climate PCs ( Figure 2). Provenances Pine Grove (Oregon coastal) and Cle Elum (Washington coastal) are climatically more similar to inland provenances (cold and comparatively dry), but share demographic history clearly with the other coastal varieties (Table 1, Figure 1).
Phenotypically, both provenances had indeed lower response coefficients compared with the remaining coastal provenances, which demonstrates that local adaptation likely happened despite strong gene flow within clusters. However, this pattern can be confirmed only for the temperature gradient, because the cline was insignificant for precipitation (Figure 2b). The 18 consensus outlier SNPs that were either associated with temperature or precipitation were capable of discriminating between two clusters of provenances, which could be clearly assigned to one coastal and one inland group ( Figure 7). This underpins that these SNPs have most likely featured phenotypic and genetic differentiation in Douglas-fir beyond neutral processes such as isolation by distance. Interestingly, provenances D'Arcy (BC) and Snowqualmie Pass (WA) appeared not any longer as single subpopulation clusters when the outlier dataset was used for clustering (compare with Figure 1a with the neutral dataset of 1,500 SNPs). One possible explanation could be that isolation of these provenances occurred relatively late compared with climatic adaptation. This finding would be corroborated by the fact that both populations showed a rather expected phenotypic signal, which was similar as for the other coastal provenances (Figure 2).
Coastal and interior Douglas-fir varieties have most likely diverged during orogeny of the Cascade range, which have led to xerification of the Great Basin during the Pliocene around 2 Ma ago (Brunsfeld et al., 2001;. However, population contraction and expansion around the last LGM 18 ka ago could have caused local spots with limited gene exchange, in particular in areas with high mountain barriers . Nevertheless and despite the overall small number of adaptive SNPs that was found, we were able to extract two "hot candidate SNPs" associated with temperature regime at seed origin. Minor allele frequencies of both SNPs showed significant correlation with mean warmest month temperature in space, which is highly correlated with the initial response climate variable from the common garden (correlation of July temperature and MWMT = 0.99). In both cases, the frequency of the minor C allele increased toward inland areas with lower temperature. In light of the strict filtering applied in this study and given the number of employed algorithms, the evidence strongly suggests that these SNPs could have featured adaptation to temperature regimes in Douglas-fir across the steep gradient that was analyzed here. Although the correlation between allele frequencies and temperature regime was moderate and characterized by a few unexpected outliers, it is likely that the rather small sample size of analyzed trees within provenances can be responsible for that. We hence see this result as a starting point for further investigations including also landscape genomics in order to corroborate these findings with more data.

| Isolation by environment (IBE) versus isolation by distance (IBD) and isolation by colonization (IBC)
We used redundancy analysis in order to disentangle the various drivers of among-population differentiation by including climate, geography, and ancestry as predictors. While this approach has been shown to be very informative in two other conifers for identifying sources of differentiation (Nadeau et al., 2016), it yielded only very little insights in our study. Surprisingly, neither environment nor geography nor ancestry explained literally any variation when applied to the entire set of SNPs. Nadeau et al. (2016) argued that the high proportion of unexplained variation in Pinus strobus and P. monticola could have been caused by environmental drivers that are usually too complex to be taken into account in such studies (e.g., soil or biotic environment).
In addition, the environmental gradient and number of populations sampled across that gradient are probably still too narrow in our study to highlight such contrasts. Nevertheless, when applied to the subset of 18 outlier SNPs, climate PC1 that relates strongly to the temperature regime at seed origin was at least moderately significant at α < 0.1, and therefore, we presume that testing these SNPs in a larger number of populations in landscape genomic studies could shed more light into adaptive population differentiation in Douglas-fir.

| Putative biological functions of candidate SNPs
In total, 11 out of 18 SNPs were directly located either within known genes or in close proximity to those genes, which underline their biological importance. Although not all of these genes refer directly to biological functions related to climate, their further interrogation could be nevertheless promising, since some of the found transcription factors and proteins are involved in carbohydrate metabolism and signaling pathways. For example, TIFY proteins (SNP#13432) are involved in regulation of jasmonic acid signaling pathways in Arabidopsis thaliana (Chung & Howe, 2009). Jasmonic acid, in turn, is an important driver of plant response to abiotic stress (e.g., Ruan et al., 2019). Bifunctional UDP-glucose 4-epimerase and UDP-xylose 4-epimerase 1 are important catalytic proteins in the pathway of cell wall polysaccharide biosynthesis and cell wall organization in stem tissue of higher plants (Rösti et al., 2007), which is an important trait in conifers conferring drought resistance (Isaac-Renton et al., 2018).
Most interestingly, the triple-found SNP #78509 is in very close proximity (<200 bp) to Douglas-fir gene PSME_16548, which codes for a WD repeat-containing LWD1 protein. This protein belongs to the circadian clock protein family and is regulating circadian period length and photoperiodic flowering in Arabidopsis spec. (Wang et al., 2011;Wu et al., 2008). A closely related protein of this family with similar gene ontology (Appendix S3), named GIGANTEA-5, is involved in adaptation to temperature regimes (Cao et al., 2005) and was found to be under strong selection in Populus balsamifera (Fitzpatrick & Keller, 2015;Keller et al., 2012). In the study by Fitzpatrick and Keller (2015), turnover in frequency of SNPs located in this gene was best explained by differences in temperature regimes among provenance origins, which strongly corroborates the findings of this study. Although the candidate SNP may not directly alter the protein itself, it seems plausible that it influences its expression given its close adjacency. We see this finding hence as a putative needle in the haystack, which could be the starting point for further investigations at landscape level.

| Douglas-fir genomic resources for adaptive forest management
Douglas-fir is currently discussed as a promising substitute species outside its natural range given its putatively higher drought tolerance and yield (Chakraborty et al., 2016;Eilmann & Rigling, 2012). Our results could be pertinent for future studies and applications, which are aiming at identifying adapted seed sources for Douglas-fir. For instance, SNP information can be used in order to genotype large arrays of trees from progeny trials or provenance tests in order to decide whether the selected material is generally site-adapted or likely to be maladapted to temperature and drought regime. Additionally, our dataset can be used as reference dataset for genomic assignment of individuals or stands given that many Douglas-fir seed stands in Europe are still without confirmed geographic origin. Therefore, we provide detailed SNP and phenotypic information such as genomic positions, flanking sequences, and rank scores for more than 17,000 SNPs obtained from the various programs in Appendix S4.

| CON CLUS IONS
Based on a steep phenotypic cline observed in a common garden experiment, we were able to disentangle adaptive signals in Douglasfir from those that were simply caused by neutral demographic processes. We showed that combining dendroclimatological data with genomic information can lead to valuable insights into the adaptation history of a widespread conifer. Although a very small fraction of the analyzed polymorphisms stood out as adaptive candidates, their functional interrogation strongly suggests that SNPs could have indeed featured climatic adaptation in Douglas-fir. Therefore, we shed new light into the adaptive history of another conifer with high economic and ecological importance.

ACK N OWLED G M ENTS
This study is part of the Translational Research Program 122 ("Softwood for the Future") and was funded by the Austrian Science Fund (FWF, project P26504). Dr. Marcela van Loo was also funded by the Austrian Science Fund (FWF) project P26504 "Verhalten