Environmental versus geographical determinants of genetic structure in two subalpine conifers

Authors


Summary

  • Alpine ecosystems are facing rapid human-induced environmental changes, and so more knowledge about tree adaptive potential is needed. This study investigated the relative role of isolation by distance (IBD) versus isolation by adaptation (IBA) in explaining population genetic structure in Abies alba and Larix decidua, based on 231 and 233 single nucleotide polymorphisms (SNPs) sampled across 36 and 22 natural populations, respectively, in the Alps and Apennines.
  • Genetic structure was investigated for both geographical and environmental groups, using analysis of molecular variance (AMOVA). For each species, nine environmental groups were defined using climate variables selected from a multiple factor analysis. Complementary methods were applied to identify outliers based on these groups, and to test for IBD versus IBA.
  • AMOVA showed weak but significant genetic structure for both species, with higher values in L. decidua. Among the potential outliers detected, up to two loci were found for geographical groups and up to seven for environmental groups. A stronger effect of IBD than IBA was found in both species; nevertheless, once spatial effects had been removed, temperature and soil in A. alba, and precipitation in both species, were relevant factors explaining genetic structure.
  • Based on our findings, in the Alpine region, genetic structure seems to be affected by both geographical isolation and environmental gradients, creating opportunities for local adaptation.

Introduction

Retrospective studies of population genetic structure and adaptation in forest trees provide insights into how forests will respond to future environmental changes (Petit et al., 2008). Range expansions since the end of the last glaciations and the presence of specific adaptations in recently colonized areas have demonstrated that tree species can adapt rapidly in response to geographically variable selection (Holliday et al., 2010a; Keller et al., 2011). Gene flow has a pivotal role in this process, as it can both promote and prevent local adaptation (Kremer et al., 2012). Conifers are widespread in northern temperate forests. A long generation time and long dispersal distance of pollen and seeds make conifers valuable case studies for investigating genetic differentiation in the presence of gene flow (Savolainen et al., 2007), and for testing hypotheses about the relative importance of population isolation and environment-dependent selection in creating population genetic structure in long-lived organisms (Nosil et al., 2008; Andrew et al., 2012 for short-lived species).

Species that grow across a wide range are subjected to distinct evolutionary forces, which may lead to local adaptation for ecologically important traits (Eckert et al., 2010a; Holliday et al., 2010a). A long history of common garden experiments has repeatedly demonstrated that the interaction of different types of selection, large amounts of standing genetic variation and gene flow, and environmental heterogeneity promotes local adaptation in forest trees (Kremer et al., 2010; Alberto et al., 2013). Genomic regions affected by selection show specific signatures, such as high differentiation among populations (corresponding to an elevated estimate of genetic distance, FST) and a decrease in polymorphism within populations, and increased linkage disequilibrium (Schlötterer & Harr, 2002; Beaumont, 2005; Keller et al., 2012). Genetic differences between populations are expected to increase with increasing distance between them, as a result of higher population isolation (Rousset, 2004). Moreover, environmental differences between sites are also expected to increase with distance. Therefore, genetic structure could also be affected by the association of geographical distance with a specific environmental factor, as demonstrated in both herbaceous species (Leimu & Fischer, 2008) and some forest trees (De Carvalho et al., 2010; Chen et al., 2012).

One of the key goals of ecological genomic studies is the identification of loci that underlie local adaptation; this often involves identifying loci that show a different polymorphism pattern compared with the whole genome, that is, ‘outlier’ loci (Coyer et al., 2011). There are several strategies to detect such outliers, depending on summary statistics and test assumptions (De Mita et al., 2013). Early methods were based on the allele frequency distributions (Lewontin & Krakauer, 1973; Akey et al., 2002), or on the FST distribution of gene diversity (Beaumont & Nichols, 1996). Both approaches focused on the identification of loci that show significantly higher or lower FST than the neutral expectation. A more recent development implemented a Bayesian method that considers a multinomial Dirichlet distribution (Balding, 2003). An alternative is to consider measures of genetic diversity, such as the expected heterozygosity (He), instead of FST. Schlötterer & Harr (2002) developed such an approach for microsatellite markers that strictly follows the stepwise mutation model. Finally, under a model selection framework where selection effects can be included in or excluded from the model, Foll & Gaggiotti (2008) developed a Bayesian method that tests for specific population and locus effects. Recently these approaches have been increasingly applied to detect candidate genes potentially involved in adaptation in forest trees (Namroud et al., 2008; Eckert et al., 2010a). Several other studies focused on the association between genetic polymorphism and climatic variables (Eckert et al., 2010a,b; Holliday et al., 2010b; Prunier et al., 2011; Tsumura et al., 2012). For example, Prunier et al. (2011) found 26 single nucleotide polymorphisms (SNPs) associated with differences in temperature and precipitation in Picea mariana.

To understand the complexity of adaptive processes and the main forces creating population genetic structure in natural populations, it is crucial to investigate the main selective forces acting upon the target species and environments (Joost et al., 2007). Alpine landscapes are characterized by heterogeneous topography as a consequence of the presence of physical barriers, such as valleys and high mountains, that can limit gene flow and influence genetic diversity within and among subalpine plant populations (Theurillat & Guisan, 2001; Körner, 2003). As a consequence of this topographic heterogeneity, steep environmental gradients are also created in the Alps. Alpine plant communities are especially sensitive to change in climate (Theurillat & Guisan, 2001); therefore, temperature, precipitation and altitudinal gradients are expected to be the key forces shaping forest communities at high elevation. Nevertheless, genetic differentiation among populations can also be increased by population isolation and both topographic and climatic factors could be important (Nosil et al., 2008). In this context, it is relevant to determine whether isolation by geographical distance (IBD) or ecologically dependent reproductive isolation (i.e. ‘environmental isolation’ or ‘isolation by adaptation’ (IBA) – the restriction of gene flow as adaptive divergence increases; Nosil et al., 2008) is the main driver of genetic differentiation in alpine ecosystems. Under IBD, it is expected that species’ range fragmentation will cause an increase in genetic differentiation with the distance between populations, while under IBA the differentiation between populations is correlated to the relative influence of landscape and environmental variables on gene flow (Nosil et al., 2008; Andrew et al., 2012).

In this study, we focus on the question as to whether geographical distance or habitat difference is more important for determining patterns of genetic variation in two subalpine forest trees, silver fir (Abies alba) and European larch (Larix decidua), using SNP genotyping data. The genotyped SNPs were located across a set of 150 genes (see details in Mosca et al., 2012a), involved in broadly different cellular mechanisms, which might underlie traits potentially adaptive to climate change, and are thus suitable to identify both general patterns of population structure and specific signatures of selection acting on some loci. In a previous study of four subalpine conifers (Mosca et al., 2012b), weak overall population genetic structure was found in A. alba and L. decidua, whereas some significant correlations were detected between genetic variation and environmental factors presumably as a result of natural selection. This study pointed to environmental selection as a more important force than historical isolation to explain population genetic structure in subalpine conifers. In the current study, we formally tested this hypothesis in different ways. First, we constructed geographical and environmental groups based on known topographical features and main environmental variables, respectively, in the Alps. Then, we tested for genetic differentiation among geographical and environmental groups using hierarchical analysis of molecular variance (AMOVA), and used different approaches to test for outlier loci based on these groups. Secondly, to test for overall geographical versus environmentally driven isolation, population genetic differentiation was tested for association with physical distance (IBD) and for correlation (after removing the geographical distance effect) with altitude, temperature and precipitation (IBA), three major environmental factors in Alpine ecosystems.

Materials and Methods

Focal species, sampling, and SNP data

Abies alba Mill. and Larix decidua Mill. were chosen for the present study because they are primary components of European alpine landscapes and have great ecological importance in these environments. Abies alba is a mountain species, broadly distributed throughout Europe. Its distribution is patchy, as a result of its demographic history, which is characterized by the presence of two main lineages, in the Alps and the Balkans, and by several migratory pathways from southern refugia following the end of the last ice age (Liepelt et al., 2009). Larix decidua is naturally distributed at high elevation (1000–2200 m) in the mountains of Central Europe; it may occur at the timberline in the Central Alps (Farjon, 1990). Its demographic history is characterized by a population expansion after a bottleneck, probably occurring during the last glaciations (Semerikov & Lascoux, 1999). Larix decidua was at that time a key pioneer species that colonized virgin soil after the glaciers retreated (Pleuss, 2011).

For A. alba, 36 natural populations were sampled along its natural distribution in the Italian peninsula, from southernmost Serra San Bruno in Calabria to the Alps, to cover different ecological and climatic conditions (Fig. 1, Supporting Information Table S1). For L. decidua, 24 natural populations were sampled across its range in the Italian Alps, also covering a wide range of environmental conditions (Fig. 1; Table S1); two populations having < 10 successfully genotyped samples were removed. The populations used in this study are a subsample of those included in Mosca et al. (2012b). In particular, populations were sampled across an altitudinal gradient ranging from 650 to 2197 m in A. alba and from 1123 to 2218 m in L. decidua. The average annual temperature varied greatly across sampling locations in A. alba, from 3.71 to 12.08°C, while it ranged from 1.86 to 6.97°C in L. decidua. The cumulative annual precipitation ranged from 606 to 1692 mm in A. alba and from 758 to 1646 mm in L. decidua (Table S1). Each sampled tree was georeferenced using Trimble GPS technology (http://www.trimble.com/) and marked; needles were collected from 25 to 65 individuals per population (n = 1108 in A. alba and n = 824 in L. decidua) for SNP genotyping.

Figure 1.

Sampling locations for Abies alba (a) and Larix decidua (b) across the species' distributions in the Italian peninsula. The map was created using QGIS 1.7 (Quantum GIS Development Team, 2011) software.

Generation of SNP data is explained in detail in Mosca et al. (2012b) and briefly outlined as follows. SNP discovery was carried out using a Sanger re-sequencing approach, using 800 PCR primer pairs from Pinus taeda in 12 individuals belonging to the two studied species (Mosca et al., 2012a). To obtain gene functional annotation, BLASTx and BLASTn analyses were performed on the P. taeda expressed sequence tag (EST) sequences in the National Center for Biotechnology Information, NCBI (http://www.ncbi.nlm.nih.gov/) and the Arabidopsis Information Resource, TAIR (http://www.arabidopsis.org/) databases, using published sequences of Arabidopsis (Notes S1 and Supplementary File 2 in Mosca et al., 2012a). A subset of the discovered SNPs (384 and 528 SNPs in A. alba and L. decidua, respectively) were then chosen for genotyping based on their Illumina (San Diego, CA, USA) design score, their coverage across the amplicon, the putative gene function, and the SNP allele frequency. The SNP genotyping was carried out at the University of California, Davis Genome Center using the Golden Gate platform (Illumina). SNP arrays were displayed on a Bead Array reader (Illumina) and analysed using GenomeStudio V2009.1 software (Illumina).

The successfully genotyped SNPs were selected using a minimum threshold of 0.25 for the GenCall50 score, a value of minor allele frequency > 1% and a Wright's inbreeding coefficient (FIS) in the range of −0.35 to 0.35. In addition to these criteria, SNPs with a percentage of missing data higher than 20% were removed from the analysis. A suite of summary statistics was calculated to quantify the information content across loci in each sampling location and to further control genotyping quality. For each locus, the observed (Ho) and expected (He) heterozygosities and Wright's inbreeding coefficient (FIS = 1 − Ho/He) were calculated by population (Table S1). After this quality check, a total of 231 and 233 high-quality and polymorphic SNPs (conversion rates of 60.16% and 44.13%) remained for further analyses in A. alba and L. decidua, respectively.
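The per-locus summary statistics and filters described above can be illustrated with a short Python sketch. It computes Ho, He and FIS = 1 − Ho/He for one biallelic SNP in one population and applies the minor allele frequency, FIS and missing-data thresholds. This is not the authors' pipeline: the 0/1/2 genotype coding, the function names, the use of He = 2pq without a small-sample correction and the omission of the GenCall50 filter are all assumptions made for the example.

```python
def locus_stats(genotypes):
    """Summary statistics for one biallelic SNP in one population.

    genotypes: list of 0/1/2 counts of one allele per individual;
    None marks a missing call (hypothetical coding for this sketch).
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)                 # frequency of the counted allele
    ho = sum(1 for g in called if g == 1) / n  # observed heterozygosity
    he = 2 * p * (1 - p)                       # expected heterozygosity (assumed 2pq)
    fis = 1 - ho / he if he > 0 else float("nan")
    return {
        "maf": min(p, 1 - p),
        "Ho": ho,
        "He": he,
        "FIS": fis,
        "missing": 1 - n / len(genotypes),
    }


def passes_qc(stats, maf_min=0.01, fis_limit=0.35, missing_max=0.20):
    """Thresholds taken from the text; a NaN FIS fails the range check."""
    return (
        stats["maf"] > maf_min
        and abs(stats["FIS"]) <= fis_limit
        and stats["missing"] <= missing_max
    )


example = locus_stats([0, 1, 1, 2, 0, 1, None, 1, 0, 2])
keep = passes_qc(example)
```

In this toy input, four of the nine called individuals are heterozygous, so Ho = 4/9 and FIS = 0.1, and the locus passes all three thresholds.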

Environmental data

In addition to geographical coordinates (see previous section for the sampling), we recorded elevation, slope and aspect for each sampled tree. The average georeferenced point of each sampling location was used to identify the geographic position of the sampled population (Table S1). Climatic variables were obtained as described in Mosca et al. (2012b). Before the analysis, aspect was transformed into ‘folded aspect’ about the north–south line, using the equation folded aspect = 180 − |aspect − 180|, as suggested in McCune & Dylan (2002). Average monthly data and seasonal quarter averages were calculated for minimum, maximum and mean temperatures for each sampling site. Monthly cumulative precipitations and seasonal cumulative precipitations were calculated for each annual quarter, defined with 1 January to 31 March as the first quarter. The ‘growing degree days’ parameter with base temperature 5°C (GDD5), which is the threshold temperature for growth suggested for boreal conifers (Prentice et al., 1992, 2011; Sork et al., 2010), was computed for each species and site as the difference between average temperature and base temperature. Each sample site was also assigned to either the carbonate or silicate soil type category according to the Ecopedological Soil Map of Italy (European Communities, 2001). For the more densely sampled Trentino Province, a local soil map was used to assign the soil type category (Sartori & Mancabelli, 2009; Panagos et al., 2011). Soil type was used as a categorical variable by assigning a value of +1 to the carbonate soil and −1 to the silicate soil.
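The variable transformations in this paragraph are simple enough to sketch in Python. The folded-aspect formula and the +1/−1 soil coding follow the text; the function names are hypothetical, and truncating GDD5 at zero for temperatures below the base is an assumption that the text does not state.

```python
def folded_aspect(aspect_deg):
    """Fold aspect (degrees, 0-360) about the north-south line: 180 - |aspect - 180|."""
    return 180.0 - abs(aspect_deg - 180.0)


def gdd5(mean_temp_c, base_c=5.0):
    """Average temperature minus the 5 degC base.

    Clamping at zero is an assumption of this sketch, not stated in the text.
    """
    return max(mean_temp_c - base_c, 0.0)


def soil_code(soil_type):
    """Categorical coding used in the text: carbonate = +1, silicate = -1."""
    return {"carbonate": 1, "silicate": -1}[soil_type]


# East- and west-facing slopes fold onto the same value.
east, west = folded_aspect(90.0), folded_aspect(270.0)
```

Folding makes aspects that are mirror images about the north–south line equivalent, so the variable runs from 0 (north) to 180 (south).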

To summarize highly correlated environmental data, a multivariate exploratory data analysis was performed using the FactoMineR package (Husson et al., 2012) in R (R Development Team, 2011). A Multiple Factor Analysis for Mixed Data (AFDM) with both quantitative (climate) and categorical (soils) data was carried out for each species. The contribution of each variable to the axes is listed in Table S2A, where five axes (called ‘dimensions’) were taken into account. The correlation coefficient of each variable and the individuals’ coordinates on the axes were calculated (data not shown). The variable contributions were determined using the square cosine parameter and only variables with square cosine > 0.90 were considered to be variables with major contributions and used to define environmental groups (Table S2B; see next section).

Geographical and environmental groups

Based on current knowledge of the demographic history of subalpine trees (Semerikov & Lascoux, 1999; Liepelt et al., 2009; Kozáková et al., 2011) and the location of physiographical barriers (e.g. the Po plain), four geographical groups were defined across the sampling range: one in the Apennines and three in the Alps, using the Dora Baltea and Adige rivers as physical borders (Table S1). In A. alba, two geographical groupings were tested: one considering two regions that corresponded to Alpine and Apennine populations and another considering four groups, one per geographical group described above. In L. decidua, which is absent from the Apennines, we only considered the three groups across the Alps (see Table S1).

The main explanatory variables detected in the AFDM analysis (see previous section) were used to construct environmental groups, sometimes including populations that are geographically distant from each other in the same group. Nine environmental groups (Fig. S1) were defined in A. alba using March mean temperature (tmean_03) and cumulative precipitation of the first seasonal quarter (Q1_prec). Nine environmental groups were also defined in L. decidua, using the GDD5 measured in June (gdd_06) and the monthly cumulative precipitation of December (prec_dec).

Genetic differentiation (AMOVA)

Genetic differentiation among geographical groups and populations was examined for each species using hierarchical AMOVAs (Excoffier et al., 1992) in arlequin v. 3.5.1.3 (Excoffier & Lischer, 2010). The total variance was divided into three components: among geographical groups, among populations within groups, and within populations. The significance for each variance component was determined using a nonparametric permutation procedure (Excoffier et al., 1992).
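For a three-level hierarchy, the fixation indices follow directly from the three variance components (Excoffier et al., 1992): FCT is the among-group share of the total variance, FSC is the among-population share of the variance within groups, and FST is the share of the total variance not found within populations. The Python sketch below only illustrates these ratios; it is not the arlequin implementation, and the function name is hypothetical. The example values are the variance components reported for the two A. alba regions in Table 1(a).

```python
def amova_indices(var_among_groups, var_among_pops, var_within):
    """Fixation indices and variance percentages from AMOVA variance components."""
    total = var_among_groups + var_among_pops + var_within
    components = (var_among_groups, var_among_pops, var_within)
    return {
        "FCT": var_among_groups / total,
        "FSC": var_among_pops / (var_among_pops + var_within),
        "FST": (var_among_groups + var_among_pops) / total,
        "percent": tuple(100.0 * v / total for v in components),
    }


# Variance components for A. alba, two regions (Table 1a).
regions = amova_indices(0.6096, 0.8036, 26.528)
```

With these inputs the ratios agree, to rounding, with the tabulated indices (0.0218, 0.0294 and 0.0506) and percentages (2.18, 2.88 and 94.94).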

In a similar way as for geographical groups, hierarchical AMOVAs for each species using arlequin v. 3.5.1.3 (Excoffier & Lischer, 2010) were used to test for genetic differences among environmental groups. Finally, we tested the effect of environmental groups nested within the geographical groups using the function test.within of the hierfstat package (Goudet, 2005) in the R environment.

Detection of outlier loci

Outlier locus detection analyses (i.e. detection of loci showing unusually low or high FST values) were conducted for both geographical and environmental groups. In this way, we were able to investigate whether geographical isolation or environment is driving adaptation in subalpine conifers at the molecular level. We used three complementary approaches to identify outliers. First, outliers were detected considering finite island (method 1) and hierarchical island (method 2) models using arlequin v. 3.5.1.3 (Excoffier & Lischer, 2010). The finite island model assumes an equal probability of migration between populations and mutation-drift equilibrium (Beaumont & Nichols, 1996), while the hierarchical island model assumes structured populations (Slatkin & Voelm, 1991). To correct the P-values obtained with arlequin for multiple testing, the function p.adjust was applied in R (R Development Team, 2011), using Bonferroni's correction. Secondly, a Bayesian method considering specific population and locus effects was used (method 3), as implemented in BayeScan v. 2.0 (Foll & Gaggiotti, 2008). The method is based on the estimation of two alternative models (including/excluding the effect of selection on single loci); their respective posterior probabilities are estimated using a Monte Carlo Markov chain. Model choice is based on the Bayes factor. Prior odds of 10 were applied in both species, with a false discovery rate (FDR) of 0.001.
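Bonferroni's correction multiplies each P-value by the number of tests and caps the result at 1. The Python sketch below shows that arithmetic only; the function name is hypothetical, and taking the number of tests to be the 231 A. alba SNPs is an assumption made for the example. The two raw P-values are ones reported in Table 2(a).

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted P-values: min(1, p * n_tests).

    n_tests defaults to the number of P-values supplied.
    """
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Two raw P-values from Table 2(a); 231 tests is an assumed family size.
adjusted = bonferroni([1.8e-04, 1.0e-07], n_tests=231)
```

Under that assumption, a raw P-value of 1.8E-04 adjusts to about 0.042, which is still below 0.05, while 1.0E-07 adjusts to about 2.3E-05.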

The gene functional annotation for outlier loci was obtained from the BLASTx of the P. taeda EST sequence versus Arabidopsis, while the SNP functional annotation (i.e. whether SNPs are noncoding, synonymous or nonsynonymous) refers to the SNPs submitted to the GenBank database (JQ440374–JQ445205) and EMBL website (HE663538–HE663608 and HE681087–HE681096) in the re-sequencing project (Mosca et al., 2012a).

Isolation by distance (IBD) versus isolation by adaptation (IBA)

Mantel tests and multiple regression matrix analyses were both applied to determine whether patterns of overall genetic differentiation in A. alba and L. decidua are attributable to IBD or to environment-driven selection.

First, IBD was investigated using a Mantel test to correlate pairwise physical spatial distances and pairwise genetic differentiation (FST), calculated according to Weir & Cockerham (1984). Both matrices were generated with SPAGeDi 1.3a (Hardy et al. 2009) and scaled prior to the Mantel test. Mantel tests and partial Mantel correlations were also used to investigate the association between genetic distance and environmental variables (IBA), that is, elevation (E), annual mean temperature (T), annual cumulative precipitation (P) and soil type (S), with (to remove purely geographical effects) and without introducing spatial distance as a covariate. IBA analyses were performed considering both all populations and populations within geographical groups (see ‘Geographical and environmental groups’ section). All distance matrices were scaled prior to the analysis. All tests were performed with the mantel function in the ecodist package (Goslee & Urban, 2007) in R (R Development Team, 2011).
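The logic of a simple Mantel test can be sketched in plain Python: correlate the off-diagonal entries of the genetic and geographical distance matrices, then build a null distribution by permuting the population labels of one matrix. This is a minimal one-tailed illustration under stated assumptions, not the ecodist implementation; the function names, the number of permutations and the toy matrices are hypothetical, and the partial Mantel test with a spatial covariate is not shown.

```python
import random
from math import sqrt


def lower_triangle(matrix):
    """Flatten the strict lower triangle of a square matrix into a list."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, n_perm=999, seed=1):
    """Return (Mantel r, one-tailed permutation P-value for r > 0)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = lower_triangle(geographic)
    r_obs = pearson(lower_triangle(genetic), geo_vec)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # relabel populations in the genetic matrix
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(lower_triangle(permuted), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy data: seven populations on a line, FST rising with distance.
positions = [0.0, 1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.0 if i == j else 0.01 * geographic[i][j] + 0.001 * ((i + j) % 3)
            for j in range(7)] for i in range(7)]
r_obs, p_value = mantel(genetic, geographic)
```

Rows and columns are permuted together so that each permuted matrix remains a valid distance matrix; shuffling individual entries would break the dependence among pairwise distances that the test is designed to respect.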

Secondly, to further investigate the effects of environment on genetic distance and test for IBA, environmental distance matrices were constructed (Table S1). Sampling site elevation (E), temperature (T), and precipitation (P) were used directly, while two categories (silicate or carbonate) were used for soil type (S), assigning 0 to silicate soils and 1 to carbonate soils. Then, for each variable, environmental distances among sites were computed as the Euclidean distance between populations with 10 000 simulations. All matrices were scaled before the analysis. Environmental distance matrices were used as a predictor in Multiple Regression Distance Matrix (MRDM) analyses, in which genetic distance was the response variable, both using and not using physical spatial distances as a factor. All matrix correlations were performed with the ecodist package (Goslee & Urban, 2007) using the MRM function in R (R Development Team, 2011). The MRDM analyses were performed both using all sampled populations in each species and for each of the geographical groups defined in the previous section.
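For a single variable, the Euclidean distance between two sites reduces to the absolute difference of their values, and the 0/1 soil coding gives a distance of 0 for sites on the same soil type and 1 otherwise. The Python sketch below builds such matrices; the function name is hypothetical, the middle elevation is an invented value (only 650 and 2197 m appear in the text, as the A. alba range limits), and the scaling step is left out because the text does not say how matrices were scaled.

```python
def env_distance_matrix(values):
    """Pairwise Euclidean distance for one variable (absolute difference)."""
    return [[abs(a - b) for b in values] for a in values]


# Elevations in metres: 650 and 2197 are the A. alba range limits given
# in the text; 1400 is a hypothetical intermediate site.
elevation_dist = env_distance_matrix([650.0, 1400.0, 2197.0])

# Soil type coded as in this section: silicate = 0, carbonate = 1.
soil_dist = env_distance_matrix([0, 1, 1])
```

Each matrix is symmetric with a zero diagonal, so it can be paired entry by entry with the genetic distance matrix in a matrix regression.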

Results

The number of SNPs successfully genotyped in each species was 231 SNPs across 150 genes in A. alba and 233 SNPs across 151 genes in L. decidua. The expected heterozygosity (He) calculated for each sampling site (population) ranged between 0.247 and 0.305 for A. alba and between 0.170 and 0.301 for L. decidua, while the observed heterozygosity (Ho) ranged from 0.237 to 0.315 in A. alba and from 0.180 to 0.406 in L. decidua (Table S3). Levels of linkage disequilibrium (LD) were low (Fig. S2), as observed in other conifers (Pavy et al., 2012).

Genetic differentiation among geographical and environmental groups

AMOVA was used to test for significant genetic differentiation among geographical and environmental groups in each species. For geographical groups (Table 1a) in A. alba, both genetic differentiation among groups (FCT) and genetic differentiation among populations within groups (FSC) were highly significant (P < 0.0001), with a percentage of variation explained of 2.18% and 2.88% for the two regions and 1.32% and 2.50% for the four geographical groups, respectively. In L. decidua, genetic differentiation among groups (FCT) and genetic differentiation among populations within groups (FSC) were also highly significant (P < 0.0001), with percentages of variation explained of 3.26% and 2.34%, respectively. With respect to environmental groups (Table 1b), AMOVAs in both A. alba and L. decidua were highly significant (P < 0.0001) for genetic differentiation among populations within groups (FSC) and less strongly significant (P = 0.002) for differentiation among groups (FCT); the among-group component explained a substantially lower percentage of variance in A. alba (0.61%) than in L. decidua (2.01%). Finally, genetic differentiation of environmental groups nested within the geographical groups was not significant.

Table 1. (a) Analysis of molecular variance (AMOVA) based on single nucleotide polymorphism (SNP) data assuming a hierarchical geographical population structure with two regions and four geographical groups in Abies alba and with three geographical groups in Larix decidua; (b) AMOVA based on SNP data assuming population structure with nine environmental groups in A. alba and in L. decidua

| Species | Code | Source of variation | df | Sum of squares | Variance components | % of variation | P | Fixation index |
| (a) | | | | | | | | |
| A. alba | Regions | Among groups | 1 | 347.818 | 0.6096 | 2.18 | < 0.0001 | 0.0218 |
| | | Among populations within groups | 34 | 2580.713 | 0.8036 | 2.88 | < 0.0001 | 0.0294 |
| | | Within populations | 2180 | 57831.32 | 26.528 | 94.94 | < 0.0001 | 0.0506 |
| A. alba | Geo-groups | Among groups | 3 | 732.914 | 0.36306 | 1.32 | < 0.0001 | 0.0132 |
| | | Among populations within groups | 32 | 2195.617 | 0.6895 | 2.50 | < 0.0001 | 0.0253 |
| | | Within populations | 2180 | 57831.318 | 26.528 | 96.18 | < 0.0001 | 0.0382 |
| L. decidua | Geo-groups | Among groups | 2 | 549.221 | 0.63047 | 3.26 | < 0.0001 | 0.0326 |
| | | Among populations within groups | 19 | 973.052 | 0.45297 | 2.34 | < 0.0001 | 0.0242 |
| | | Within populations | 1626 | 29702.47 | 18.2672 | 94.40 | < 0.0001 | 0.0559 |
| (b) | | | | | | | | |
| A. alba | Env-groups | Among groups | 8 | 872.46 | 0.1676 | 0.61 | 0.0020 | 0.0061 |
| | | Among populations within groups | 27 | 2056.07 | 0.7976 | 2.90 | < 0.0001 | 0.0292 |
| | | Within populations | 2180 | 57831.32 | 26.528 | 96.49 | < 0.0001 | 0.0351 |
| L. decidua | Env-groups | Among groups | 8 | 912.343 | 0.3826 | 2.01 | 0.0020 | 0.0201 |
| | | Among populations within groups | 13 | 609.930 | 0.3965 | 2.08 | < 0.0001 | 0.0212 |
| | | Within populations | 1626 | 29702.47 | 18.2672 | 95.91 | < 0.0001 | 0.0409 |

  1. The population assignments to the groups are reported in Supporting Information Table S1 and described in the 'Materials and Methods' section.

Detection of outlier loci

Three methods were applied to detect outlier SNP loci in each species.

Using geographical groups, the number of outliers detected assuming a finite island model (method 1) was 15 in A. alba and 34 in L. decidua, whereas the hierarchical island model (method 2) detected 12 and 23 outliers in A. alba and L. decidua, respectively (Table S4). These numbers were greatly reduced after adjusting the P-value with Bonferroni's correction (Tables 2, S4). In A. alba, after multiple test corrections, method 1 detected four outlier SNPs, while only one and two outlier loci were found with method 2, considering two regions and four geographical groups, respectively. While two loci were detected as outliers by both methods (2_6313_01_Abal_160 and CL3116Contig1_03_Abal_118), two other loci (0_7009_01_Abal_212 and 0_5361_01_Abal_287) were only found with the island model. In L. decidua, after multiple test corrections, more outliers were found with method 1 (six loci) than with method 2 (two loci). In this species, the outlier loci were characterized by a higher value of FST than those in A. alba.

Table 2. Outlier loci found in Abies alba (a) and Larix decidua (b) with the hierarchical island model (method 2) with two regions or three/four geographical groups (in L. decidua and A. alba, respectively), and with nine environmental groups

(a) Abies alba
| Locus | Two regions | | | | Four geographical groups | | | | Nine environmental groups | | | | P. taeda BLASTx | Protein description | SNP code |
| | He | FST | P | Pp.adj | He | FST | P | Pp.adj | He | FST | P | Pp.adj | | | |
| 0_5361_01_Abal_287 | | | | | | | | | 0.137 | 0.116 | 9.8E-05 | *** | NA | NA | NA |
| 0_7009_01_212 | | | | | | | | | 0.504 | 0.104 | 1.0E-07 | *** | NM_104402 | Kelch repeat-containing F-box family protein | SY^a |
| 2_6313_01_160 | | | | | 0.197 | 0.137 | 1.8E-04 | * | 0.200 | 0.146 | 1.0E-07 | *** | NM_118832 | Unknown protein | NA |
| CL1455Contig1_06_152 | | | | | | | | | 0.377 | 0.115 | 1.0E-07 | *** | NM_106473 | Protein binding | NA |
| CL3116Contig1_03_118 | 0.359 | 0.355 | 1.0E-07 | *** | 0.283 | 0.183 | 1.0E-07 | *** | 0.275 | 0.160 | 1.0E-07 | *** | NM_122008 | Member of RAN GTPase gene family | NS |

(b) Larix decidua
| Locus | Three geographical groups | | | | Nine environmental groups | | | | P. taeda BLASTx | Protein description | SNP code |
| | He | FST | P | Pp.adj | He | FST | P | Pp.adj | | | |
| 0_17790_01_159 | | | | | 0.157 | 0.248 | 1.0E-07 | *** | NM_120865 | Beta-glucuronidase | NA^a |
| 0_7001_01_260 | 0.0836 | 0.3650 | 7.3E-05 | ** | 0.073 | 0.270 | 1.0E-07 | *** | NM_111095 | Flavodoxin family protein | NA |
| 0_7810_01_525 | | | | | 0.090 | 0.172 | 1.0E-07 | *** | NM_106106 | GDSL-motif lipase/hydrolase family protein | NA |
| 0_9284_02_470 | | | | | 0.228 | 0.194 | 1.0E-07 | *** | NM_202043 | Phosphate translocator-related | NA^a |
| CL1045Contig1_01_380 | | | | | 0.143 | 0.000 | 2.0E-04 | * | NM_001035973 | Protein binding/protein homodimerization/transcription repressor | NA |
| CL1077Contig1_02_225 | | | | | 0.232 | 0.163 | 1.0E-07 | *** | NM_121136 | Histone H3 | SY |
| CL1634Contig1_03_108 | 0.1308 | 0.3748 | 1.0E-07 | *** | 0.117 | 0.301 | 1.0E-07 | *** | NM_112228 | Endomembrane protein 70 | NA |

  1. The P-value is adjusted (p.adj) using Bonferroni's correction for multiple testing. Only significant analyses after Bonferroni's correction are shown. The SNP code is as follows: NA, no annotation; SY, synonymous; NS, nonsynonymous. The protein description refers to the BLASTx of the Pinus taeda expressed sequence tag (EST) sequence versus Arabidopsis.

  2. ^a Loci moderately associated with environment in Mosca et al. (2012b).

  3. ***, Pp.adj-value < 0.0001; **, Pp.adj-value < 0.001; *, Pp.adj-value < 0.05.

To identify outliers for environmental groups, only the hierarchical island model (method 2) could be applied. In A. alba, 17 outlier loci were detected among nine environmental groups (Table S4) and five loci were still significant after Bonferroni's multiple testing correction (Table 2). These loci were also considered outliers in analyses based on geographical groups or the island model, with the exception of locus CL1455Contig1_06.Abal.152. In L. decidua, 39 outlier loci were found and seven remained significant after applying Bonferroni's correction (Table 2). Four of these loci were specific to analyses based on environmental groups (0_7810_01.Lade.525, 0_9284_02.Lade.470, CL1045Contig1_01.Lade.380 and CL1077Contig1_02.Lade.225).

Finally, the rate of outliers detected with Foll and Gaggiotti's Bayesian approach (method 3) ranged from 1.73% in A. alba to 6.86% in L. decidua (Table 3; Fig. S3). In both species, the majority of these outliers (three loci out of four in A. alba and 12 loci out of 16 in L. decidua) were only detected with the Bayesian approach (Tables 3, S5) and not with other methods.

Table 3. Detection of outliers using the Bayesian approach (method 3) with false discovery rate (FDR) = 0.001 and prior odds equal to 10

| Species | SNP, ID (fused) | Prob | log10(PO) | Alpha | FST | Sign | P. taeda BLASTx | Protein description | SNP code |
| Abies alba | 0_7009_01_Abal_21246 | 1 | 1000 | 1.158 | 0.125 | *** | NM_104402 | Kelch repeat-containing F-box family protein | SY^a |
| | 1_6493_01_Abal_226119 | 0.998 | 2.795 | −1.724 | 0.01 | *** | NA | NA | SY |
| | CL4354Contig1_01.Abal.14760 | 0.997 | 2.584 | −1.616 | 0.011 | *** | NM_115712 | Protein serine/threonine phosphatase (PP2A-3) | SY |
| | CL4354Contig1_01.Abal.202185 | 1 | 1000 | 1.366 | 0.147 | *** | NM_115712 | Protein serine/threonine phosphatase (PP2A-3) | NC |
| Larix decidua | 0_11772_01.Lade.13725 | 0.997 | 2.493 | 1.180 | 0.111 | *** | NM_118398 | Glycoprotease M22 family protein | SY |
| | 0_14221_01.Lade.22472 | 1 | 1000 | 1.474 | 0.140 | *** | NM_179316 | Serine-tRNA ligase | NA |
| | 0_14591_02.Lade.108212 | 1 | 1000 | 1.325 | 0.124 | *** | AP000423 | CpDNA | NA |
| | 0_17790_01.Lade.159216 | 0.987 | 1.894 | 1.050 | 0.100 | *** | NM_120865 | Beta-glucuronidase | NA^a |
| | 0_18644_02.Lade.469248 | 0.998 | 2.698 | 1.204 | 0.113 | *** | NM_100301 | Unknown protein | NA |
| | 0_5038_01.Lade.226102 | 1 | 1000 | 1.617 | 0.155 | *** | NM_104644 | CF9 mRNA | SY |
| | 0_6659_01.Lade.10138 | 1 | 1000 | 1.628 | 0.156 | *** | NM_103577 | Peroxidase | NA |
| | 0_9284_02.Lade.470138 | 1 | 1000 | 2.369 | 0.258 | *** | NM_202043 | Phosphate translocator-related | NA^a |
| | 2_10352_02.Lade.7216 | 1 | 3.398 | 1.470 | 0.140 | *** | NM_119190 | UDP-glucuronate 4-epimerase/catalytic (GAE1) | NA |
| | 2_6317_01.Lade.233259 | 1 | 1000 | 2.246 | 0.239 | *** | NM_106091 | Heat shock protein 101 | NA |
| | 2_8011_02.Lade.468255 | 1 | 3.398 | 1.375 | 0.130 | *** | NM_116232 | Putative LRR receptor-like serine/threonine protein kinase MRH1 | SY |
| | 2_9465_01.Lade.228215 | 1 | 1000 | 2.408 | 0.266 | *** | NM_123975 | GTP-binding protein | NA |
| | 2_9465_01.Lade.430179 | 1 | 1000 | 1.863 | 0.185 | *** | NM_123975 | GTP-binding protein | NA |
| | CL1077Contig1_02.Lade.225181 | 0.994 | 2.191 | 1.104 | 0.104 | *** | NM_121136 | Histone H3 | SY |
| | CL3832Contig1_05.Lade.8935 | 0.997 | 2.552 | 1.256 | 0.118 | *** | NM_118384 | Exostosin family protein | NA |
| | CL4776Contig1_03.Lade.55260 | 0.997 | 2.467 | 1.216 | 0.114 | *** | NM_115718 | Endonuclease exonuclease phosphatase family protein | NC |

  1. The single nucleotide polymorphism (SNP) code is as follows: NA, no annotation; NC, noncoding; SY, synonymous; NS, nonsynonymous. The protein description refers to the BLASTx of the Pinus taeda expressed sequence tag (EST) sequence versus Arabidopsis.

  2. ^a Loci moderately associated with environment in Mosca et al. (2012b).

Functional annotation of outlier loci

The total number of outlier loci detected with the different methods ranged from eight loci in A. alba to 24 loci in L. decidua (Table S5), corresponding to 3.46% and 10.30% of the total number of SNPs analysed in each species, respectively. The functional annotation of the outlier genes describes the putative protein they code for; for each SNP, annotation type is also reported (NC, noncoding; SY, synonymous; NS, nonsynonymous) (Table S5). Even though most SNPs were found in noncoding regions or were synonymous, one SNP in A. alba was nonsynonymous and located in a gene (CL3116Contig1_03) encoding the GTP-binding nuclear protein Ran-1 in Arabidopsis. In both species, the remainder of the outlier loci were silent and located in genes encoding unknown proteins or proteins broadly involved in several metabolic processes (Table S5; see also the 'Discussion' section).

Isolation by distance (IBD) versus isolation by adaptation (IBA)

The Mantel test based on pairwise genetic and geographical distance matrices was positive and highly significant in A. alba and L. decidua (Table 4; P = 0.00001 in both species). This result suggested that IBD may account for most of the differentiation among populations found in both species. By geographical group, IBD was not significant for A. alba in geographical groups 1 (western Alps) and 4 (Apennines), whereas it was significant in the other groups (central and eastern Alps). Conversely, in L. decidua the genetic distance by geographical group was only correlated with the spatial distance in the central Alps.
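The logic of the Mantel test can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function name, the number of permutations, the random seed and the toy data are all hypothetical. The sketch computes the Pearson correlation between the off-diagonal entries of two distance matrices and obtains a one-tailed P-value by permuting the rows and columns of one matrix together.

```python
import numpy as np


def mantel(x, y, n_perm=999, seed=0):
    """One-tailed Mantel test (alternative: r > 0) for two square distance matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    iu = np.triu_indices_from(x, k=1)  # each population pair once, diagonal excluded
    r_obs = np.corrcoef(x[iu], y[iu])[0, 1]

    rng = np.random.default_rng(seed)
    n = x.shape[0]
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        p = rng.permutation(n)
        xp = x[np.ix_(p, p)]  # relabel populations: rows and columns move together
        if np.corrcoef(xp[iu], y[iu])[0, 1] >= r_obs:
            count += 1
    return r_obs, count / (n_perm + 1)


# Toy data (hypothetical): eight populations on a transect, with genetic
# distance exactly proportional to geographical distance.
coords = np.arange(8.0)
geo = np.abs(coords[:, None] - coords[None, :])
gen = 0.01 * geo
r, p_value = mantel(gen, geo)
```

Because the Pearson correlation does not change when a matrix is rescaled, scaling the distance matrices beforehand does not affect r in this sketch.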

Table 4. Mantel and partial Mantel correlation coefficients used to test for association of genetic distance (F) considering all populations (‘All sites’) and only populations within geographical groups (‘Geo-groups’) with physical distance between pairwise populations (D), elevation (E), annual mean temperature (T), annual cumulative precipitation (P) and soil type (S)
Abies alba    All sites           Geo-group 1         Geo-group 2         Geo-group 3         Geo-group 4
                                  (western Alps)      (central Alps)      (eastern Alps)      (Apennines)
Mantel test   r         P(a)      r         P(a)      r         P(a)      r         P(a)      r         P(a)
F ~ D         0.5491    0.0000*   0.0306    0.4504    0.9228    0.0000*   0.8382    0.0000*   −0.0186   0.4291
F ~ E         −0.0339   0.6340    −0.5532   0.9665    0.0771    0.2593    −0.1088   0.7395    −0.0524   0.5486
F ~ T         0.2140    0.0014*   −0.3753   0.7348    −0.0261   0.5584    0.3611    0.0165    −0.3143   0.8896
F ~ P         0.0902    0.0448*   −0.0488   0.5032    −0.0813   0.7873    0.8078    0.0001*   0.0525    0.4174
F ~ S         0.0124    0.3570    −0.1913   0.5654    0.4963    0.0003*   0.3305    0.0242*   −0.1767   0.6399
F ~ E | D     0.5492    0.0000*   −0.2341   0.6174    0.1488    0.1337    −0.1076   0.7332    −0.0569   0.5504
F ~ T | D     0.5184    0.0000*   −0.4464   0.8346    0.0448    0.3595    0.3554    0.0167    −0.3398   0.8898
F ~ P | D     0.5294    0.0000*   −0.1608   0.6167    −0.2163   0.9885    0.0364    0.3528    0.0634    0.3655
F ~ S | D     0.0183    0.2952    −0.2341   0.6175    0.2292    0.0765    −0.0007   0.4061    −0.1851   0.6380

Larix decidua All sites           Geo-group 1         Geo-group 2         Geo-group 3(b)
                                  (western Alps)      (central Alps)      (eastern Alps)
Mantel test   r         P(a)      r         P(a)      r         P(a)      r         P(a)
F ~ D         0.8519    0.0000*   0.6227    0.0164    0.6843    0.0000*   NA        NA
F ~ E         −0.0081   0.7957    0.1605    0.3488    0.1024    0.1466    NA        NA
F ~ T         0.0546    0.2192    0.1594    0.3669    0.1976    0.0360    NA        NA
F ~ P         0.5628    0.0001*   0.4962    0.0171*   0.6174    0.0001*   NA        NA
F ~ S         −0.0147   0.4713    NA        NA        0.0611    0.2527    NA        NA
F ~ E | D     −0.0715   0.7427    −0.2402   0.9669    0.0610    0.2473    NA        NA
F ~ T | D     −0.0107   0.4948    0.4479    0.2178    0.1311    0.0944    NA        NA
F ~ P | D     0.4416    0.0010*   −0.2366   0.8509    0.2666    0.0152*   NA        NA
F ~ S | D     0.0424    0.2765    NA        NA        −0.0567   0.6536    NA        NA

All distance matrices were scaled before the analysis. Significant values (bold font in the original table) are marked with an asterisk; r is the Pearson coefficient of correlation and P is the associated P-value.
(a) One-tailed P (null hypothesis: r ≤ 0).
(b) Not calculated; only two populations belonged to Geo-group 3.

With respect to environmental distances (not corrected by geographical distance), overall correlations with genetic distance were positive in A. alba for temperature (P = 0.00140) and precipitation (P = 0.04486), but not for elevation. The correlations calculated within geographical groups were positive for soils (P = 0.00026) in geographical group 2 (central Alps), and soils (P = 0.02422) and precipitation (P = 0.00007) in geographical group 3 (eastern Alps). In L. decidua, the (noncorrected) Mantel test including all populations was significant only between genetic distance and precipitation (P = 0.00014). The same result (P = 0.00007) was found within geographical group 2 (central Alps), as well as a weaker correlation with the same variable (P = 0.0170) in geographical group 1 (western Alps).

Partial Mantel correlation was also used to examine the contribution of each environmental variable to the differentiation among populations when the geographical distance was taken into account. In A. alba, the overall correlation between genetic distance and both temperature and precipitation remained significant when taking into account geographical distance (P = 0.00001); moreover, in this case, the correlation was also significant with elevation (P = 0.00001). In L. decidua, after correcting for geographical distance, the correlation between genetic distance and precipitation was still significant considering both all populations (P = 0.0010) and geographical group 2 (central Alps; P = 0.00152).

These results were further confirmed by MRDM analyses (Table S6), showing a significant overall association between genetic distance and both temperature and precipitation in A. alba, as well as for the eastern Alps (geographical group 3). The MRDM analysis also found a strong significant effect of soils for A. alba in the central Alps (geographical group 2) and marginally in the eastern Alps (geographical group 3). In L. decidua, the analysis confirmed the overall association between genetic distance and precipitation, and also for the western and central Alps (geographical groups 1 and 2). Moreover, a weak significant association with temperature was found in the central Alps (geographical group 2).
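For readers unfamiliar with multiple regression on distance matrices, the idea can be sketched as an ordinary least-squares fit on the vectorized distance matrices, with significance assessed by permuting the response matrix. This is a minimal sketch under assumptions: the standardization, the two-tailed permutation test on each coefficient, the function names and the toy data are hypothetical and do not reproduce the analysis reported in Table S6.

```python
import numpy as np


def _upper(m):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    m = np.asarray(m, dtype=float)
    return m[np.triu_indices_from(m, k=1)]


def _zscore(v):
    """Centre a vector and scale it to unit standard deviation."""
    return (v - v.mean()) / v.std()


def mrdm(f, predictors, n_perm=999, seed=0):
    """Standardized regression of distance matrix f on several predictor matrices."""
    f = np.asarray(f, dtype=float)
    cols = [_zscore(_upper(p)) for p in predictors]
    X = np.column_stack([np.ones(len(cols[0]))] + cols)  # intercept + predictors

    def fit(fm):
        y = _zscore(_upper(fm))
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return beta[1:]  # drop the intercept

    beta_obs = fit(f)
    rng = np.random.default_rng(seed)
    n = f.shape[0]
    count = np.ones_like(beta_obs)  # include the observed arrangement
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Assumed test: two-tailed, comparing absolute coefficients.
        count += np.abs(fit(f[np.ix_(p, p)])) >= np.abs(beta_obs)
    return beta_obs, count / (n_perm + 1)


# Toy data (hypothetical): genetic distance depends on geography and environment.
pos = np.arange(8.0)
env = np.array([3.0, 0.0, 5.0, 1.0, 4.0, 2.0, 7.0, 6.0])
d = np.abs(pos[:, None] - pos[None, :])
e = np.abs(env[:, None] - env[None, :])
f = d + 2.0 * e
betas, p_values = mrdm(f, [d, e])
```

Unlike the partial Mantel test, this formulation estimates the contribution of all predictors jointly, which is why it is a natural complement when several environmental distances are considered at once.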

Discussion

This study presents a comprehensive investigation of patterns of genetic variation across environmental gradients in A. alba and L. decidua in the Alps, with a focus on the causes of genetic divergence among populations. Weak but significant genetic structure was found in both species. Among the potential outliers detected, two loci (out of five) in A. alba and two (out of seven) in L. decidua were related to both geography and specific environmental variables (based on environmental groups); four and one additional loci were outliers in relation only to environment in L. decidua and A. alba, respectively. A stronger effect of IBD than of IBA was found for population genetic structure in both species. Finally, we demonstrated how the effect of environment can be partially separated from other confounding factors, such as geography. Specifically, together with historical isolation, both temperature and precipitation in A. alba and only precipitation in L. decidua were identified as relevant factors explaining population genetic structure in these keystone subalpine trees.

The presence of a geographical effect on genetic variation was examined by assigning sampling sites to up to four geographical groups (eastern, central and western Alps, and Apennines). In A. alba, higher genetic differentiation was found in the two regions sampled (Alps and Apennines) than in the four geographical groups. Populations from the Apennines probably belong to a different gene pool from populations from the Alps (Liepelt et al., 2009), which would explain this result; moreover, within the Alps, some genetic structure was found but not clearly corresponding to geographical regions. In L. decidua, genetic differentiation was stronger (FCT = 0.0559), showing that genetic diversity was primarily distributed according to geographical regions. Despite the significant differences detected among geographical regions and among populations within geographical region, the main source of variation occurred within population in both species, which is typical for forest trees (Müller-Starck et al., 1992; Savolainen et al., 2007).

The case for environmental selection was formally tested by defining nine distinct environmental groups in each species, using the principal explanatory variables obtained from an AFDM analysis. Selected climate variables were related to winter precipitation (both species) and to March temperature (A. alba) or GDD5 measured in June (L. decidua). The AMOVA for environmental groups showed low but significant among-groups genetic differentiation, with higher values for L. decidua (2.01% of variation) than A. alba (0.61%). In addition, outlier loci exclusive to environmental groups included one locus in A. alba and four in L. decidua. These findings suggested the presence of environmental discontinuities that might have contributed to shaping population genetic structure, as well as fostering local adaptation. In A. alba, our results suggested differences in the response to climate across the Alps, as previously described based on tree ring growth (Carrer et al., 2010). In L. decidua, our findings suggested that environmental groups based on December precipitation and GDD5 are relevant for adaptive processes. Even when this species loses its needles, minimizing winter desiccation damage, these variables may still be important in regulating species growth phenology. Similar results were found for bud burst in relation to temperature in Picea sitchensis (Mimura & Aitken, 2010) and for water availability in a wild relative of Arabidopsis (Lee & Mitchell-Olds, 2011). The stronger genetic structure and higher number of outlier loci found in L. decidua for environmental groups suggest that adaptation in this species is more environment-dependent than in A. alba. However, these findings could also be related to the higher number of loci potentially under selection found in this species compared with A. alba and/or to sampling biases (i.e. more unbalanced sampling) in L. decidua.

The overall rate of outlier loci was comparable to that found in other genome scans in boreal conifers (3.7% in Picea glauca (Namroud et al., 2008) and 2.4% and 2.7% for temperature and precipitation clustering, respectively, in Picea mariana (Prunier et al., 2011)), and slightly lower (6% and 4% for positive and negative outliers, respectively) than that found in Pinus pinaster (Eveno et al., 2008). The detection of outlier loci was similar across methods for A. alba, but varied for L. decidua from 2.6% using the island model method of Beaumont & Nichols (1996) to 6.86% using the Bayesian approach implemented in BayeScan (Foll & Gaggiotti, 2008), whereas approaches using a more complex demographic model (i.e. the hierarchical island model of Excoffier et al., 2009) produced intermediate numbers (3%). These differences are probably related to population genetic structure in L. decidua that makes the relatively simple model used for data simulation in Beaumont & Nichols (1996) unrealistic (Helyar et al., 2011). Moreover, the method of Excoffier et al. (2009) is known to provide a more conservative estimation of the number of outlier loci, with a higher false-negative rate (Excoffier et al., 2009). In both species, the majority of the outlier loci detected with BayeScan were not significant with the other methods, which reflects the complexity of identifying the precise causal loci for environmental adaptation. Nevertheless, our results indicate more widespread signatures of local adaptation in L. decidua, as well as confirming the advantage of combining several approaches for the identification of candidate functional loci associated with environment (De Mita et al., 2013).

Although gene annotation in nonmodel species is still relatively poor (Ekblom & Galindo, 2011), functional annotation of the Arabidopsis homologues of outlier loci highlighted the presence of several interesting targets for further investigation of subalpine forest tree molecular adaptation. Among the loci detected under the geographical group clustering, the Arabidopsis homologues of locus CL3116Contig1_03 (NM_122008; AT5G20010) found in A. alba encode a protein located in the cell wall, which is produced in response to cadmium ion and salt stress (Meier & Brkljacic, 2010). A member of the same protein family (the RAN GTPase gene family) was found to be associated with phenology and cold tolerance in Pseudotsuga menziesii (Eckert et al., 2009). Another interesting locus detected under the finite island model in L. decidua, CL71Contig1_04 (NM_001036704), encodes a putative disease resistance protein, ADR1-like 1, which is involved in apoptosis and the defence response. Among the outliers detected with the Bayesian simulation, two loci were found in A. alba on a gene (CL4354Contig1_01) encoding a protein serine/threonine phosphatase (PP2A-3; NM_115712) and two other loci were identified in L. decidua on a gene (2_9465_01) encoding a GTP-binding protein (NM_123975). Interestingly, most outlier loci in this study were not among those involved in environmental associations in a previous study (Mosca et al., 2012b), which suggests complex interactions between environment and genetic differentiation, and the existence of other environmental variables (not measured in Mosca et al., 2012b) responsible, at least partially, for patterns of population genetic structure found in subalpine forest trees (e.g. GDD5 and soil type). A similar discordant pattern has been shown in other species (Nosil et al., 2008; Eckert et al., 2010b) but not in P. mariana, where allele frequency was also correlated with temperature and precipitation in 62% of the outlier loci (Prunier et al., 2011). The lack of association between allele frequency and climatic variables may be attributable to a more complex adaptation to local environment, which is only partially explained by the studied variables (Prunier et al., 2011). Finally, two outlier loci in L. decidua were detected in genes significantly associated with a phenotypic trait (autumn cold hardiness or budset timing) in Picea sitchensis: locus 0_6659_01 encoding a peroxidase and locus CL1077Contig1_02 encoding a histone (Holliday et al., 2008, 2010b).

Further insights into geographical and environmental (temperature, precipitation, altitude and soil type) effects on genetic structure came from Mantel tests and MRDM analyses. In both species, geographical distance (i.e. significant IBD) accounts for most of the genetic differentiation among populations. A strong correlation of geographical distance and pairwise genetic distance has been found in other widespread forest trees, such as P. sitchensis (Gapare et al., 2005; Mimura & Aitken, 2007), Pinus mugo (Heuertz et al., 2010), and Picea abies (Tollefsrud et al., 2009). Interestingly, Mantel tests showed a significant effect of both temperature and precipitation in A. alba and of precipitation in L. decidua, after correction for the geographical distance. Alpine vegetation is strongly dependent on climate and plants must be adapted to rapid changes (Rebetez et al., 2004). Several studies found adaptive loci along steep environmental gradients, such as those created by altitude (Gonzalo-Turpin & Hazard, 2009), latitude (Hall et al., 2007; Chen et al., 2012) and precipitation (Tsumura et al., 2012). A significant altitudinal cline for growth in A. alba and other forest species was confirmed using a common garden experiment (Vitasse et al., 2009). A significant effect of soils in the central and eastern Alps was also apparent from MRDMs. To our knowledge, the potential effect of carbonate versus silicate soil type on genetic structure has not been investigated in other forest tree species. The greatest proportion of the forest root system is concentrated in the upper soil horizon (Vogt et al., 1983), where fine roots are often abundant and the effect of bedrock type is minimal. Soil physical and chemical properties are also affected by climate, as was demonstrated along an altitudinal gradient characterized by decreasing temperature and increasing precipitation (Bockheim et al., 2000). 
Although the present study did not include all factors involved in soil formation and affecting soil properties, it is one of the first to report an association between genetic structure and soil type. This result highlights the importance of further investigations on this aspect of forest ecosystems.

Conclusions

Subalpine forest trees have large amounts of gene flow compared with herbaceous and annual plants (Savolainen et al., 2007). Gene flow plays an important role in promoting or constraining adaptive variation (Hendry & Taylor, 2004; Garant et al., 2007; Kremer et al., 2012). This relationship is complex, as a geographical barrier can increase population isolation and still, if genetic variation is high, result in local adaptation. Conversely, strong environmental selective pressure can result in local adaptation even in the presence of high gene flow. In the present study, we have found evidence of both IBD and environment-driven adaptation (IBA) in A. alba and L. decidua. We also demonstrated how the effect of environment can be partially separated from other confounding factors, such as geographical distance. Moreover, we identified winter precipitation and early spring temperature (March) as key environmental factors that contribute to the genetic structure of two conifers of the Alps. This study found some potentially adaptive loci based on outlier detection for environmental groups constructed using these variables. However, further investigations are necessary to confirm their involvement in adaptation. For example, the role of these SNPs in physiological processes could be investigated in large-scale association studies. By integrating climatic analysis with a landscape genetic approach, we found that in both species precipitation is involved in creating population genetic structure, whereas in A. alba temperature and soil may also drive genetic differentiation across the landscape, at least in some regions. Our conclusion is that, in subalpine trees that have large amounts of gene flow and little population genetic structure, environment-driven selection is still important and, together with geographical isolation, may promote genetic adaptation. This process is favoured by the large amount of seeds produced during the life of a tree (i.e. its life-history traits) and the strong levels of selection at early stages of development (i.e. seedlings and saplings).

Acknowledgements

The authors are grateful to Erica Di Pierro, Lorenzo Bonosi and Alessio Fortunati for their useful comments during preparation of the manuscript. We would like to thank Piero Belletti, Andrew J. Eckert, Alessandro Mancabelli and Nicola La Porta for their help with the sampling design, and the Italian State Forest Service of Alpine Regions, David Blanco, Yuri Gori, Stefano Maffei, Ambrogio Molinari, Marta Scalfi and Daniele Sebastiani for their support during the sampling. We acknowledge Katie Tsang and Randi Famula for their laboratory work, Ben Figueroa for managing data storage, and Jill L. Wegrzyn, John D. Liechty, Vanessa K. Rashbrook and Charles M. Nicolet for their support with the genotyping. We thank all members of the GIS and Remote Sensing Unit at Fondazione Edmund Mach for providing us with the environmental data and for their help. The ACE-SAP project was partially funded by the Autonomous Province of Trento (Italy), with the regulation No. 23, June 12, 2008, of the University and Scientific Research Service. Thanks are extended to the ERA-Net BiodivERsA (LinkTree project, EUI2008-03713), with the Spanish Ministry of Economy and Competitiveness as national funder, part of the 2008 BiodivERsA call for research proposals, which supported the work of S.C.G.-M.
