Keywords:

  • association mapping;
  • budset;
  • clinal variation;
  • cold hardiness;
  • single nucleotide polymorphism;
  • Sitka spruce (Picea sitchensis)

Summary

  • Genecological studies in widespread tree species have revealed steep genetic clines along environmental gradients for climate-related traits. In a changing climate, the ecological and economic importance of conifers necessitates an appraisal of how molecular genetic variation shapes quantitative trait variation, and one of the most promising approaches to answer this question is association mapping.
  • We phenotyped a range-wide collection of 410 individuals of the widely distributed conifer Sitka spruce (Picea sitchensis) for budset timing and autumn cold hardiness, and genotyped these individuals for a panel of 768 single nucleotide polymorphisms (SNPs) representing > 200 expressed nuclear genes.
  • After correcting for population structure, associations were detected in 28 of the candidate genes, which cumulatively explained 28 and 34% of the phenotypic variance in cold hardiness and budset, respectively. Most notable among the associations were five genes putatively involved in light signal transduction, the key pathway regulating autumn growth cessation in perennials. Many SNPs with phenotypic associations were also correlated with at least one climate variable.
  • This study represents a significant step toward the goal of characterizing the genomic basis of adaptation to local climate in conifers, and provides an important resource for breeding and conservation genetics in a changing climate.

Introduction

Local adaptation is a fundamental evolutionary process about which there are still wide gaps in our understanding (Orr, 2005). Although a significant body of theory has been developed, and feasible experimental approaches exist, genome-scale empirical studies have been hindered thus far by their high cost. This is beginning to change as the decreasing cost of sequencing and genotyping is coupled with an increasing societal understanding of the importance of maintaining biodiversity, including genetic diversity. Widespread temperate and boreal tree species provide useful evolutionary and crucial ecological contexts in which to study the genomics of local adaptation (Gonzalez-Martinez et al., 2006). These species often inhabit large continuous ranges that are climatically heterogeneous, and although they are often strongly differentiated with respect to climate-related traits, high gene flow via pollen means that little population structure exists at neutral loci (Howe et al., 2003; Savolainen et al., 2007). In a changing climate, standing genetic variation represents an important raw material for adaptive evolution (Barrett & Schluter, 2008), and a coherent description of the genomic basis for local adaptation is therefore becoming increasingly important. Although conservation genetic strategies typically employ anonymous neutral markers to describe patterns of genetic variation, meta-analyses have shown that these markers are not good proxies for variation in heritable fitness-related traits in the wild (Reed & Frankham, 2001). Hence, the search for ecologically relevant genetic markers, that is, those that directly determine the traits or are tightly linked to the loci that do, has taken on additional importance beyond the more general effort to understand the genomic basis and population structure underlying phenotypic variation in complex traits.

Adaptation to local climate in temperate and boreal forest trees is determined by a suite of component traits that together determine the relative susceptibility of local populations to freezing temperatures while enabling sufficiently competitive annual growth (Howe et al., 2003). As growth and cold hardiness are incompatible, the likelihood of freezing injury is a function of the timing of growth initiation in spring and growth cessation in fall, subsequent timing of cold acclimation, and the maximum cold hardiness achieved in mid-winter (Weiser, 1970). Substantial variation for these adaptive traits segregates both within and among populations (Howe et al., 2003; Savolainen et al., 2007). Among forest tree taxa that occupy cold climates, the Picea (spruce) genus harbors some of the most economically and ecologically important species. These include the dominant conifers of the sub-boreal and boreal regions of Scandinavia and Russia (Norway spruce, Picea abies), the sub-boreal and boreal region of Canada (white spruce, Picea glauca), the montane interior of western North America (Engelmann spruce, Picea engelmannii, and its hybrids with white spruce), and the west coast of North America (Sitka spruce, Picea sitchensis). In addition, numerous species with more limited ranges are distributed across North America and Asia. Some of the strongest documented genetic clines are in Sitka spruce (Mimura & Aitken, 2007), which has a latitudinal range that spans c. 3500 km of the Pacific coast of North America, from northern California to southern Alaska.

An attractive approach to understanding the genomic basis of complex adaptive phenotypes in forest trees is association genetics. Large natural populations, low linkage disequilibrium, and efficient gene flow via wind pollination make this method suitable for use in trees (Neale & Savolainen, 2004; Gonzalez-Martinez et al., 2006), and several such studies have already been successfully completed (Gonzalez-Martinez et al., 2007, 2008; Ingvarsson et al., 2008; Eckert et al., 2009). Here we report association mapping of two dormancy-related traits, namely, cold hardiness and budset timing. The results of this study represent a substantial forward step in our understanding of the genomic underpinnings of complex adaptive traits in forest trees, and provide a suite of genetic markers that can be applied to both marker-aided selection (MAS) and conservation genetics, and be used to infer maximum, demographically sustainable rates of adaptation of populations to new climates.

Materials and Methods

Plant material and tissue sampling

A common garden was established at Vancouver, British Columbia, Canada (49°N), in 2003, consisting of c. 1000 open-pollinated offspring of 10–13 seed parents per population from 17 geographic populations (Mimura & Aitken, 2007). Owing to size limitations, this common garden was thinned in 2005, leaving 14 populations and a total of 410 plants (Fig. 1, Table 1). In January 2006, we vegetatively propagated each of these plants by detaching and rooting current-year lateral shoots. Lower needles were removed and cuttings were dipped in a 0.01% solution of indole-3-butyric acid before planting in a 3 : 2 mixture of peat : perlite. Cuttings were kept in a glasshouse at ambient temperature with supplemental bench heating to keep the soil temperature between 16 and 20°C throughout the rooting process. The ramets were misted twice daily for 30 min and, once rooted, were moved to an outdoor patio, and subsequently transplanted to a permanent outdoor common garden. Tissue for genomic DNA extraction was sampled from newly flushing lateral buds from the plants in the original common garden in May 2006.


Figure 1.  Distribution of Sitka spruce (Picea sitchensis; shaded areas) and origins of populations (open circles) sampled for this study. Population names corresponding to two-letter codes can be found in Table 1.


Table 1.   Origins of Sitka spruce (Picea sitchensis) study populations, associated climatic variables, and mean phenotype scores

| Population | State/Prov. | n | Lat./long. | Dist. | MAT | MCMT | MWMT | DD | Precip. | Budset date, mean (SE) | Cold injury, mean (SE) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fort Bragg (FB) | CA | 3 | 39°N, 123°W | 0 | 11.8 | 9.1 | 14.7 | 2479 | 1041 | 291.7 (24.7) | 49.0 (3.5) |
| Redwood (RW) | CA | 38 | 42°N, 124°W | 292 | 11.6 | 8.8 | 14.8 | 2395 | 968 | 266.9 (6.0) | 50.6 (2.0) |
| Columbia River (CR) | OR | 13 | 47°N, 124°W | 741 | 10.6 | 5.8 | 16.0 | 2019 | 1705 | 222.9 (8.6) | 30.4 (4.2) |
| Vancouver (VA) | BC | 39 | 49°N, 123°W | 1099 | 10.0 | 3.6 | 17.1 | 2027 | 1277 | 215.6 (4.7) | 24.7 (2.6) |
| Vancouver Island (VI) | BC | 46 | 49.5°N, 125°W | 1240 | 9.7 | 3.0 | 17.6 | 1960 | 1179 | 208.9 (3.1) | 24.1 (2.1) |
| Ocean Falls (OF) | BC | 48 | 52.5°N, 128°W | 1585 | 8.0 | −1.0 | 16.8 | 1652 | 1702 | 202.1 (2.8) | 12.9 (1.5) |
| Queen Charlotte Islands (QC) | BC | 37 | 53°N, 132°W | 1793 | 8.3 | 3.2 | 15.0 | 1462 | 1398 | 212.5 (4.3) | 20.7 (2.8) |
| Prince Rupert (PR) | BC | 33 | 53.5°N, 130°W | 1914 | 7.1 | 1.3 | 13.5 | 1226 | 2594 | 205.8 (4.0) | 11.0 (1.8) |
| Icy Bay (IB) | AK | 15 | 60.5°N, 141°W | 2812 | 4.2 | −3.4 | 12.0 | 721 | 4074 | 189.5 (0.5) | 7.1 (1.2) |
| Valdez (VD) | AK | 38 | 62°N, 146°W | 3129 | 3.5 | −5.6 | 12.9 | 815 | 1712 | 191.4 (0.5) | 7.8 (1.1) |
| Montague (MI) | AK | 12 | 61°N, 147°W | 3157 | 3.9 | −4.1 | 12.5 | 786 | 2445 | 190.2 (0.8) | 6.7 (1.2) |
| Rocky Bay (RB) | AK | 35 | 58°N, 151°W | 3378 | 4.1 | −2.2 | 11.7 | 735 | 1956 | 193.2 (1.3) | 7.8 (1.2) |
| Iniskin (IN) | AK | 15 | 60°N, 153°W | 3512 | 1.0 | −10.5 | 12.7 | 765 | 1706 | 190.4 (0.7) | 8.1 (2.6) |
| Kodiak Island (KD) | AK | 38 | 57°N, 153°W | 3575 | 4.7 | −1.3 | 12.8 | 769 | 1914 | 190.4 (0.5) | 7.8 (1.6) |

Population abbreviations used in Fig. 1: CA, California, USA; OR, Oregon, USA; BC, British Columbia, Canada; AK, Alaska, USA; Dist., distance of origin of study population from southernmost point in species range (Fort Bragg, CA, USA); MAT, mean annual temperature (°C); MWMT, mean warmest month temperature (°C); MCMT, mean coldest month temperature (°C); DD, degree days (> 5°C); Precip, annual precipitation (mm); budset date, days since January 1; cold injury, index of injury on December 1 (see the Materials and Methods section).

Candidate gene selection

Candidate genes were chosen on the basis of their expression profiles during autumn in Sitka spruce in conjunction with the annotation of their nearest BLASTX hit. Holliday et al. (2008) monitored autumn gene expression both temporally across five autumn time points, and among geographically and phenotypically divergent populations at two of these time points. Transcripts that were significantly up-regulated during the fall or that were significantly more abundant in northern populations (fold difference > 2) were considered potential candidates. As this list comprised more candidates than we could include in the current study, we pared it down according to several criteria. First, amplification was attempted for transcripts up-regulated more than fivefold either temporally or among-population (i.e. higher in the north), regardless of annotation. Secondly, for transcripts with expression differences between two- and fivefold (again, either temporally or among-population), primers were designed and tested for those transcripts with annotations suggestive of a role in budset timing or cold acclimation (i.e. according to the functional categories discussed in Holliday et al. (2008)). Finally, in a few cases, candidate genes were included that were not differentially expressed by these criteria but which had annotations strongly suggestive of involvement in cold acclimation or budset. A total of 202 candidate genes were successfully amplified.

Selection of single nucleotide polymorphisms

Single nucleotide polymorphisms (SNPs) were identified by re-sequencing a discovery panel of 24 Sitka spruce genotypes from across the species range for the candidate genes described in the previous section (Holliday et al., 2010). From these, 665 SNPs were chosen to be genotyped using the Illumina bead array platform in conjunction with the GoldenGate allele specific assay (see the ‘Illumina Golden-Gate genotyping’ section for genotyping details). Three criteria were used to select SNPs from the candidate genes. First, the minor allele frequency (MAF) threshold in the discovery panel was set at 5%. Secondly, a minimum spacing of c. 60 base-pairs (bp) was set as a result of limitations of the GoldenGate platform. Finally, nonsynonymous (i.e. amino acid changing) SNPs were preferred over synonymous SNPs, as the former are more likely to have a functional consequence for the gene product that could then be a target of selection. It should be noted, however, that positive selection may increase local linkage disequilibrium, and therefore synonymous substitutions adjacent to positively selected sites are expected in some cases to associate with the phenotype because of this linkage. We therefore included many synonymous substitutions.
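The three selection criteria can be illustrated with a short sketch. The greedy procedure below is a hypothetical way of applying them within one gene and is not necessarily how the SNPs were actually chosen; the `Snp` class, the function name, and the example values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    position: int        # bp position within the amplicon
    maf: float           # minor allele frequency in the discovery panel
    nonsynonymous: bool  # True if the SNP changes an amino acid


def select_snps(candidates, min_maf=0.05, min_spacing=60):
    """Greedy selection for one gene (hypothetical procedure).

    1. Drop SNPs below the MAF threshold.
    2. Visit nonsynonymous SNPs first, then higher-MAF SNPs.
    3. Keep a SNP only if it lies at least `min_spacing` bp from
       every SNP already kept.
    """
    eligible = [s for s in candidates if s.maf >= min_maf]
    eligible.sort(key=lambda s: (not s.nonsynonymous, -s.maf))
    chosen = []
    for snp in eligible:
        if all(abs(snp.position - c.position) >= min_spacing for c in chosen):
            chosen.append(snp)
    return sorted(chosen, key=lambda s: s.position)


# Invented example: four SNPs in one amplicon.
gene = [
    Snp(39, 0.20, False),
    Snp(70, 0.30, True),
    Snp(199, 0.10, False),
    Snp(210, 0.02, True),   # fails the 5% MAF threshold
]
picked = select_snps(gene)
print([s.position for s in picked])
```

In this toy case the nonsynonymous SNP at position 70 displaces the synonymous SNP 31 bp away, while the synonymous SNP at position 199 is retained because nothing eligible lies within 60 bp of it.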

Phenotyping for association tests

Budset timing was assessed in the rooted cuttings in the summer and fall of 2008, by recording the Julian date on which bud scales became visible on the apical bud of transplanted individuals. This was done on a weekly basis between 7 July 2008 and 1 December 2008. Cold hardiness for each individual was measured on 1 December 2007, using an artificial freeze test (Hannerz et al., 1999). Briefly, needle segments (c. 0.5 cm) were frozen at three temperatures, −5, −10 and −20°C, in a small amount of distilled water (0.2 ml) with a minuscule amount of silver iodide to facilitate ice nucleation. A control sample was kept at 4°C throughout. Following freezing, the electrolytic conductivity of the solution was measured, after which frozen and control samples were heat-killed and measured again. The index of injury (It) for each plant at each test temperature was calculated as follows:

  $$ I_t = 100 \, \frac{R_t - R_o}{1 - R_o} $$

where Rt = Lt /Lk, Ro = Lo /Ld, Lt is the conductance of leachate from the sample frozen at temperature t, Lk is the conductance of the leachate from the sample frozen at temperature t and then heat-killed, Lo is the conductance of the leachate from the unfrozen sample, and Ld is the conductance of the leachate from the corresponding heat-killed, unfrozen sample. A test temperature with an It of 50% would be the LT50 for the plant on that date. For the purposes of the association analysis described in the ‘Association analysis’ section, we used the mean It across all three test temperatures for each plant as the response variable.
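As a worked illustration of the response variable, the sketch below computes the index of injury at each test temperature and then averages across temperatures. It assumes the commonly used form It = 100 (Rt − Ro) / (1 − Ro), which is consistent with the definitions of Rt and Ro given above; the function names and conductance readings are invented for illustration.

```python
def index_of_injury(l_t, l_k, l_o, l_d):
    """Index of injury (%) at one test temperature.

    Assumes I_t = 100 * (R_t - R_o) / (1 - R_o), with
    R_t = L_t / L_k (frozen sample, before / after heat-killing) and
    R_o = L_o / L_d (unfrozen control, before / after heat-killing).
    """
    r_t = l_t / l_k
    r_o = l_o / l_d
    return 100.0 * (r_t - r_o) / (1.0 - r_o)


def mean_index_of_injury(frozen, control):
    """Mean I_t over all test temperatures for one plant.

    frozen  : list of (L_t, L_k) pairs, one per test temperature
    control : (L_o, L_d) pair for the unfrozen control
    """
    l_o, l_d = control
    values = [index_of_injury(l_t, l_k, l_o, l_d) for l_t, l_k in frozen]
    return sum(values) / len(values)


# Invented conductance readings for one plant at -5, -10 and -20 degrees C.
frozen_readings = [(19.0, 100.0), (55.0, 100.0), (100.0, 100.0)]
control_reading = (10.0, 100.0)
print(round(mean_index_of_injury(frozen_readings, control_reading), 1))
```

With these invented readings the middle test temperature gives an index of 50%, so under the definition above it would correspond to the LT50 for that plant.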

Illumina GoldenGate genotyping

Single nucleotide polymorphism genotyping was carried out using the Illumina bead array platform in conjunction with the GoldenGate allele-specific assay in a 96-well, 768 SNP format (Fan et al., 2003; Shen et al., 2005). This is a highly multiplexed assay, and involves adherence of genomic DNA to a solid support, followed by annealing of a locus-specific oligo (LSO) and two allele-specific oligos (ASOs) for each SNP to be assayed. Allele-specific extension is then carried out, followed by amplification of the resulting PCR template using three primers specific to the ASOs and LSOs. These primers carry fluorescent moieties, which enable detection of alternative homozygotes as well as heterozygotes. ‘Address sequences’ unique to each set of PCR primers then enable hybridization of the PCR products to a standard array, which carries 30-fold replication of beads for each SNP. SNP calls are thus made on the basis of the clustering of fluorescent signals from replicate beads, and call quality is evaluated on the basis of the separation between homozygous and heterozygous clusters. The minimum threshold for this ‘GenTrain’ score was set at 0.25 for the current assay. In addition, SNPs were accepted only if they had call rates of > 90%, although most SNPs retained for further analysis had call rates in excess of 95%. Although we selected SNPs from the discovery panel with MAFs > 5%, some SNPs nevertheless had MAFs below this threshold in the total mapping population. Although there is little power to detect associations for these, alleles of large effect size may give significant results, and as a compromise we chose a minimum threshold of 1% for a SNP to be included in the association tests.
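The quality thresholds described above amount to a simple per-SNP filter. The sketch below is illustrative only: the function names and genotype counts are hypothetical, and whether each threshold is inclusive is an assumption (the text gives a minimum GenTrain score of 0.25, call rates > 90%, and a minimum MAF of 1%).

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from counts of the three genotype classes among called samples."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_b = (n_ab + 2 * n_bb) / total_alleles
    return min(freq_b, 1.0 - freq_b)


def passes_qc(gentrain, n_called, n_samples, maf,
              min_gentrain=0.25, min_call_rate=0.90, min_maf=0.01):
    """True if a SNP meets the GenTrain, call-rate and MAF thresholds.

    Treating the GenTrain and MAF thresholds as inclusive is an assumption.
    """
    call_rate = n_called / n_samples
    return (gentrain >= min_gentrain
            and call_rate > min_call_rate
            and maf >= min_maf)


# Hypothetical SNP typed in 400 of 410 trees: 390 AA, 9 AB, 1 BB.
maf_common = minor_allele_frequency(390, 9, 1)
# Hypothetical rarer SNP: 398 AA, 2 AB, 0 BB.
maf_rare = minor_allele_frequency(398, 2, 0)

print(passes_qc(0.60, 400, 410, maf_common))
print(passes_qc(0.60, 400, 410, maf_rare))
```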

Association analysis

As genetic drift in subpopulations can increase type I error in association studies (Neale & Savolainen, 2004), we explored population structure in our mapping population using a separate cohort of 98 SNPs. These SNPs were included on the Illumina array for the purposes of genetic mapping in a separate population that was not part of this study, and most had no significant homology to Arabidopsis. Although we cannot say with certainty a priori that these SNPs do not have a role in adaptation, as they were randomly selected it is unlikely that a substantial fraction have been targets of natural selection related to climate. The same thresholds for quality were applied to these SNPs as those to be tested for associations with phenotypes. Population structure was assessed using both conventional FST estimates (calculated using Arlequin software; http://cmpg.unibe.ch/software/arlequin3/) (Excoffier et al., 2005) and the genetic clustering program Structure (Pritchard et al., 2000), which assigns individuals to one or more genetic clusters on the basis of their multilocus genotype. Five replicate runs of models with a putative number of clusters (k) from one to 15 were tested, with pre- and post-burn-in periods of 50 000 and 100 000 iterations, respectively.

For a largely outcrossing species with a continuous range and high gene flow, such as Sitka spruce, highly variable degrees of kinship are expected. As such, the ability to account for multiple degrees of relatedness further reduces the incidence of spurious associations, particularly as we have a range-wide sample in this study with largely one-dimensional gene flow (Mimura & Aitken, 2007). To incorporate kinship into our association analysis, we estimated pairwise kinship coefficients according to the method of Lynch & Ritland (1999), as implemented in the program SPAGeDI (Hardy & Vekemans, 2002; http://www.ulb.ac.be/sciences/ecoevol/spagedi.html). The same 98 SNPs used for estimating population structure were also used to estimate kinship.

Association analysis was carried out using the mixed model framework of Yu et al. (2006), implemented in the program TASSEL, version 2.1 (http://www.maizegenetics.net). A mixed linear model (MLM) was fitted for each marker and phenotype. To account for the confounding effects of population structure, the relative assignment of each individual to each genetic cluster (the ‘Q’ matrix of Structure) was incorporated into the model as a covariate, and pairwise kinship coefficients were modeled as random effects. The mixed model can be expressed as

  $$ y = S\alpha + Q\nu + Z\mu + \varepsilon $$

where y is a vector of phenotypic observations, α is a vector of SNP effects, ν is a vector of population effects, μ is a vector of kinship effects, ε is a vector of residuals, and S, Q and Z are incidence matrices of 1s and 0s relating y to α, ν, μ, respectively. Probability values obtained from this model were adjusted for multiple testing using the false discovery rate (FDR) implemented in the ‘qvalue’ package in R (http://www.r-project.org/; Storey & Tibshirani, 2003). Note that in this context ‘Q-value’ refers to FDR-adjusted P-values, which reflect the probability that individual genotype–phenotype associations are false positives. By contrast, the ‘Q’ matrix from Structure is a matrix defining individual membership in each of three genetic clusters and incorporated into the association model as a covariate.
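To make the multiple-testing step concrete, the sketch below adjusts a list of P-values with the Benjamini–Hochberg step-up procedure. This is a simpler stand-in, not the estimator used by the ‘qvalue’ package: Storey & Tibshirani's Q-values additionally estimate the proportion of true null hypotheses, and coincide with these adjusted values only when that proportion is fixed at 1. The function name and example P-values are invented.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    Each P-value is scaled by m / rank, then a running minimum is taken
    from the largest P-value downwards so the adjusted values are monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented P-values for four marker-trait tests.
p = [0.01, 0.04, 0.03, 0.20]
q = bh_adjust(p)
significant = [i for i, value in enumerate(q) if value < 0.1]
print([round(value, 4) for value in q], significant)
```

Under the Q < 0.1 threshold used for the candidate SNPs, the first three of these four invented tests would be retained.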

In addition to testing for associations between candidate-gene SNPs and target phenotypic traits, the set of 98 putatively neutral SNPs was also tested for associations while similarly adjusting for population structure. Although this presents a problem in the sense that the neutral SNPs were also used to infer population structure and kinship, the results should not be seriously biased as each of these SNPs only contributes c. 1% to the Q and kinship matrices.

Results

Phenotypic trait variation

There was a highly significant relationship between population of origin and both cold injury and budset timing (P < 0.001 for each), and significant clinal variation was observed between population means and latitude, which explained 91 and 89% of the variation in budset and cold hardiness, respectively (P < 0.001 for each; Fig. 2). When compared in a pairwise manner, difference in latitude between populations was a strong predictor of statistically significant differentiation in phenotype (R2 = 0.601, P < 0.001 for budset timing and R2 = 0.500, P < 0.001 for cold hardiness; Fig. 3). In spite of this strong population differentiation, phenotypes also segregated within-population (Table 1), although it should be noted that variation in budset timing was reduced in the northern populations as some individuals had already set bud by early July when we began monitoring budset. In these cases, budset date was recorded as the day on which we began measurements (July 7).
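The effect of starting observations on 7 July can be sketched as a floor on the recorded budset day. The example assumes that ‘days since January 1’ is the ordinary day of the year (January 1 = day 1); the function names and the example dates are hypothetical.

```python
from datetime import date

MONITORING_START = date(2008, 7, 7)


def day_of_year(d):
    """Day of the year, counting January 1 as day 1 (an assumption)."""
    return d.timetuple().tm_yday


def recorded_budset_day(true_budset):
    """Budset day as it would be recorded under weekly monitoring that
    began on 7 July 2008: earlier budset cannot be distinguished and is
    recorded as the first monitoring day."""
    return max(day_of_year(true_budset), day_of_year(MONITORING_START))


print(day_of_year(MONITORING_START))
print(recorded_budset_day(date(2008, 6, 20)))   # set bud before monitoring
print(recorded_budset_day(date(2008, 8, 4)))    # set bud during monitoring
```

Under this convention the first monitoring day is day 189, which agrees with the northern population means in Table 1 lying just above that value.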


Figure 2.  Regressions of mean budset date and cold hardiness on latitude and the five climate variables given in Table 1. Sitka spruce (Picea sitchensis) budset was monitored weekly starting on 7 July 2008 and expressed as days since January 1. Cold hardiness is expressed as index of injury measured using the electrolyte leakage assay described in the Materials and Methods section. MAT, mean annual temperature; MCMT, mean coldest month temperature; MWMT, mean warmest month temperature; DD, degree days > 5°C; Precip, annual precipitation.


Figure 3.  Relationship between difference in latitude and probability that pairs of populations will be significantly differentiated at the phenotypic level. P-values (log scale) based on Tukey’s honest significant difference for all possible pairwise comparisons of the 14 populations of Sitka spruce (Picea sitchensis) represented in the study.


To better understand the environmental drivers of these strong latitudinal gradients in cold hardiness and budset timing, we regressed our two phenotypic traits on five climate variables relevant to local adaptation, and found numerous significant relationships. Three temperature-related variables, namely mean annual temperature (MAT), mean coldest month temperature (MCMT), and degree days > 5°C (DD), showed strong and significant relationships with each of the phenotypes (Fig. 2). Annual precipitation (Precip) was also significant, but did not explain as much among-population variation as the three aforementioned temperature-related variables. Mean warmest month temperature (MWMT) was the only variable tested that did not give a significant result.
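A regression of this kind can be sketched with the population means in Table 1. The example below fits a simple least-squares line of mean budset date on MAT and reports the coefficient of determination. It is an illustration only: the exact regression models behind Fig. 2 are not specified here, so the value printed should not be read as reproducing the published result.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Population means from Table 1, in table order (FB ... KD).
mat = [11.8, 11.6, 10.6, 10.0, 9.7, 8.0, 8.3, 7.1,
       4.2, 3.5, 3.9, 4.1, 1.0, 4.7]
budset = [291.7, 266.9, 222.9, 215.6, 208.9, 202.1, 212.5, 205.8,
          189.5, 191.4, 190.2, 193.2, 190.4, 190.4]

r2_mat = r_squared(mat, budset)
print(round(r2_mat, 3))
```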

Genotyping

Of the 768 SNPs assayed using the GoldenGate platform, a total of 476 had call rates > 90%, GenTrain quality scores > 0.25, and were polymorphic. This success rate is comparable with other studies in nonmodel systems. For example, a recent 768 SNP GoldenGate assay in white spruce gave 534 successful SNPs that were polymorphic (Namroud et al., 2008). Among the successful SNPs in the current study were the 98 putatively neutral markers that were used to assess population structure and kinship; these were set aside from the candidate set. An additional 39 SNPs had MAFs < 1% and were excluded. The remaining 339 SNPs were distributed across 153 genes, giving an average of 2.2 SNPs per gene.
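The counts in this paragraph can be checked with a few lines of arithmetic; the variable names are illustrative.

```python
assayed = 768          # SNPs on the GoldenGate array
passed_qc = 476        # call rate > 90%, GenTrain > 0.25, polymorphic
neutral = 98           # putatively neutral SNPs kept for structure and kinship
low_maf = 39           # MAF < 1% in the mapping population
genes = 153            # genes represented by the remaining SNPs

candidate_snps = passed_qc - neutral - low_maf
snps_per_gene = candidate_snps / genes
success_rate = passed_qc / assayed
white_spruce_rate = 534 / 768   # Namroud et al. (2008) assay cited above

print(candidate_snps, round(snps_per_gene, 1))
print(round(100 * success_rate, 1), round(100 * white_spruce_rate, 1))
```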

Population structure and relative kinship

Pairwise FST estimates between geographic populations based on the 98 putatively neutral SNPs varied between 0.007 and 0.212, with 50% below 0.05 and 70% below 0.07 (Supporting Information, Table S1). The highest values were those involving the disjunct populations at Fort Bragg, California, and Kodiak Island, Alaska. Contrasts with the Fort Bragg population produced the highest values but were likely not reliable as this population was represented by only three individuals. Contrasts involving the Kodiak Island population produced estimates between 0.072 and 0.111 (excluding the Kodiak–Fort Bragg contrast for which FST = 0.212). Clustering of multilocus genotypes using Structure revealed no clear solution to the number of genetic populations present in our range-wide collection of Sitka spruce. That is, the log-likelihood for successive models (from k = 1 to k = 15) increased steadily, without an obvious plateau. However, a large increase in the likelihood was observed between k = 2 and k = 3 (mean ΔlogeP(D) = 467.5), followed by generally smaller increases for models assuming k > 3 (mean ΔlogeP(D) between c. 25 and 75). This suggested that k = 3 may be the best solution. Simulation studies have revealed that the rate of change between successive values for k provides an accurate way to assess the ‘true’ number of clusters in a metapopulation in the absence of a clear answer based on the likelihoods (Evanno et al., 2005). Indeed, by applying this method, we found that k = 3 was the best solution. This scenario splits the 14 populations studied into three clusters, the first including all populations in California, Oregon, and British Columbia, the second including all populations in Alaska except Kodiak Island, and the final cluster comprising the Kodiak Island population alone (Fig. 4), and these splits were robust to the number of clusters assumed (i.e. they are still apparent if a greater number of clusters is assumed). 
We therefore included the Q-matrix from Structure, where k = 3, as the covariate in our association model described in the Materials and Methods section. It should be noted that although this solution suggests rather striking population differentiation between the Prince Rupert and Icy Bay populations, pairwise FST between these two populations (i.e. Prince Rupert and everything south vs all Alaska populations with the exception of Kodiak Island) was only 0.031 (Table S2). Contrasts between Kodiak Island and the ‘Prince Rupert south’ and ‘Alaska’ populations produced higher values of 0.088 and 0.078, respectively.
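The rate-of-change criterion of Evanno et al. (2005) can be sketched as follows. In this version, the statistic for each k is the absolute second-order difference of the mean log-likelihoods divided by the standard deviation of the log-likelihood across replicate runs at that k. The log-likelihood values are invented and only mimic the qualitative pattern described above (a large jump up to k = 3 and small increases afterwards); they are not the values from this study.

```python
from statistics import mean, stdev


def delta_k(runs_by_k):
    """Delta K for every k that has both neighbours k - 1 and k + 1.

    runs_by_k maps k to the list of ln P(D) values from replicate runs.
    """
    means = {k: mean(values) for k, values in runs_by_k.items()}
    result = {}
    for k in sorted(runs_by_k):
        if (k - 1) in means and (k + 1) in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(runs_by_k[k])
    return result


# Invented log-likelihoods, three replicate runs per k.
toy_runs = {
    1: [-50010.0, -50000.0, -49990.0],
    2: [-49710.0, -49700.0, -49690.0],
    3: [-49240.0, -49230.0, -49220.0],
    4: [-49190.0, -49180.0, -49170.0],
    5: [-49140.0, -49130.0, -49120.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbouring models, it cannot be evaluated for the smallest or largest k tested.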


Figure 4.  Genetic clustering of 410 individuals from across the species range of Sitka spruce (Picea sitchensis) using the program Structure. Vertical bars represent individuals, and colours represent proportional membership of each individual in each of three genetic clusters.


As our mapping population consists of open-pollinated families sampled from wild populations, we expect complex familial relationships to be present, and this is evidenced by a range of kinship estimates (Fig. 5). Although most were close to zero, indicating unrelatedness, c. 5% had values between 0.2 and 0.3, suggesting a half-sib degree of relatedness, and a few had values of c. 0.5, suggesting full-sib relationships or their equivalent.


Figure 5.  Distribution of pairwise kinship estimates for 410 Sitka spruce (Picea sitchensis) genotypes represented in this study.


Associations between phenotype and genotype

Adjustment for population structure significantly reduced the type I error rate, as evidenced by the cumulative distribution of P-values for the 98 putatively neutral SNPs (Fig. 6). Under the null hypothesis of no associations between SNP genotypes and phenotypes, the distribution of P-values should be roughly uniform. This is reflected in the roughly diagonal lines for the models in which adjustments for population structure were made. Models for which no population structure adjustment was made exhibited a skew in the distribution of P-values, suggesting a number of false-positive results.
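One way to quantify the departure from uniformity described above is the largest vertical gap between the empirical cumulative distribution of the P-values and the diagonal expected under the null hypothesis (the Kolmogorov–Smirnov distance to a uniform distribution). This summary is offered as an illustration and is not part of the analysis reported here; both sets of P-values below are synthetic.

```python
def max_cdf_deviation(p_values):
    """Largest gap between the empirical CDF of the P-values and the
    uniform(0, 1) CDF, checked just before and at each observed value."""
    ordered = sorted(p_values)
    n = len(ordered)
    deviation = 0.0
    for i, p in enumerate(ordered):
        deviation = max(deviation, abs((i + 1) / n - p), abs(i / n - p))
    return deviation


n_snps = 98
# Synthetic P-values that are evenly spread, as expected under the null.
well_calibrated = [(i + 0.5) / n_snps for i in range(n_snps)]
# Synthetic P-values skewed towards zero, as with inflated type I error.
inflated = [p ** 3 for p in well_calibrated]

print(round(max_cdf_deviation(well_calibrated), 4))
print(round(max_cdf_deviation(inflated), 4))
```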


Figure 6.  Cumulative distribution of observed P-values for 98 putatively neutral single nucleotide polymorphisms (SNPs) for association models in which adjustments for population structure were made, and those with no adjustment. Skew in the distribution of P-values for the unadjusted model reflects inflated type I error.


For the 339 successfully genotyped candidate adaptive SNPs, 45 significant associations (Q < 0.1) between 35 SNPs and one or more of the measured traits were detected following adjustment for population structure (Table 2). By contrast, after multiple test correction, no significant associations were identified among the neutral set used to infer population structure and relative kinship (data not shown). Among the SNP associations, 16 were unique to budset timing, nine were unique to cold hardiness, and 10 were shared between these two traits. Most of the SNPs with trait associations segregated in all of the populations represented in the study (Fig. 7). SNPs associated with budset each had a percentage of variance explained (PVE) between 1.0 and 4.3% of the phenotypic variance in this trait, whereas those associated with cold hardiness had PVEs ranging from 0.7 to 5.4% (Table 2). Cumulatively, these associations explain 34.4% of the phenotypic variation in budset and 28.1% of the variation in cold hardiness in our mapping population.

Table 2.   Significant genotype–phenotype associations. Single nucleotide polymorphism (SNP)–climate correlations (P < 0.01) with one or more of the climate variables in Table 1

Columns: Locus | Position | Type | BLASTX vs Arabidopsis | Marker effect for budset (F, P, Q, R2) | Marker effect for cold hardiness (F, P, Q, R2) | Correlated climate variables

MAT, mean annual temperature; MCMT, mean temperature of the coldest month; MWMT, mean temperature of the warmest month; DD, degree days > 5°C; Precip, annual precipitation.

‘Position’ indicates the location of the SNP within the amplicon, and ‘type’ reflects whether a SNP is synonymous (S) or nonsynonymous (NS). P-values from the model (P) were adjusted for multiple testing using the false discovery rate, and resulting Q-values (Q) are given.

abc3318SABC transporter6.110.0020.0470.0138.602.E-040.0060.017MAT, MCMT, DD
aec3475NSAuxin efflux carrier5.440.0050.0660.011
ago180Sargonaute5.240.0060.0660.011MAT, MCMT, DD
arg1646NSAuxin/aluminum-responsive7.934.E-040.0140.016
aux1567SAuxin-responsive9.777.E-050.0040.0196.220.0020.0320.012MAT, MCMT, MWMT, DD
cbl3828SCalcineurin B-like5.650.0040.0530.011
chlh2504SMg-chelatase H7.560.0010.0130.015MAT, MCMT, MWMT, DD
co206SConstans-like5.250.0060.0660.011MAT, MCMT, DD
cry2188NSCRYPTOCHROME 15.270.0060.0660.011
264S 7.260.0010.0140.015
cry3123NSCRYPTOCHROME 19.131.E-040.0040.018
207S 9.181.E-040.0040.018
579S 5.280.0050.0660.011
drm2438SDormancy/auxin associated7.280.0070.0910.007MAT, MCMT, DD
erf1455NSEthylene-responsive element-binding factor5.820.0030.0580.0127.370.0010.0140.015
expl2199NSExpansin23.472.E-061.E-040.023
gi374NSGIGANTEA4.890.0080.0830.010
gp498NSGlutathione peroxidase8.213.E-040.0080.016
gp598NSGlutathione peroxidase13.582.E-061.E-040.026
gst4395NSGlutathione S-transferase4.860.0080.0830.010
hta3523SHistone h2A4.740.0090.0910.010
585S 6.610.0020.0320.0137.230.0010.0140.014MAT, MCMT, MWMT, DD
ifr2260NSIsoflavone reductase12.624.E-040.0140.01310.060.0020.0250.010MAT, MCMT, DD
lrr3424SLeucine-rich repeat11.102.E-050.0010.022MAT, MCMT, DD
pal3363SPhenylalanine ammonia lyase5.630.0040.0620.011MAT, MCMT, DD
per3570NSPeroxidase5.730.0040.0590.011
per6573Speroxidase27.835.E-121.E-090.05113.293.E-061.E-040.026MWMT
phya441SPHYTOCHROME A5.170.0060.0670.010
pip5k319SPhosphatidylinositol kinase7.350.0010.0210.015MAT
skip2182NSF-box family6.810.0010.0280.014DD
swap441SSWAP domain-containing5.410.0050.0660.011MAT, MCMT, MWMT, DD
xth139SXyloglucan : xyloglucosyl transferase6.880.0010.0280.01412.675.E-062.E-040.025
 199S 8.363.E-040.0120.0165.390.0050.0640.011MAT, MCMT, DD
 289S 16.132.E-072.E-050.03122.874.E-105.E-080.042
 350S 23.063.E-104.E-080.04330.017.E-132.E-100.054MAT, MCMT, DD

Figure 7.  Frequencies by Sitka spruce (Picea sitchensis) population of single nucleotide polymorphisms (SNPs) with significant trait associations (either cold hardiness or budset timing). Frequency given for the minor allele in the total sample. Key to population abbreviations can be found in Table 1.

Numerous light signaling genes were among those harboring SNP associations. These included a gene similar to PHYTOCHROME A (phya), a CONSTANS-like gene (co), a GIGANTEA-like gene (gi), and two genes similar to CRYPTOCHROME 1 (cry2 and cry3). Effect sizes for these associations were generally small but cumulatively explained 4.2% of variation in timing of budset and 5.1% of variation in cold hardiness. The largest effect sizes were found for synonymous SNPs within a xyloglucan : xyloglucosyltransferase (xth1) and a peroxidase (per6). Each of these SNPs had relatively high PVEs for both budset and cold hardiness. A SNP at position 350 in xth1 had the highest PVE of any SNP in our study, at 5.4% for cold hardiness and 4.3% for budset. A neighboring SNP had PVEs of 4.2 and 3.1% for each of these traits, respectively (Table 2). In total, four SNPs were associated with both budset and cold hardiness in this gene. per6 had the second highest PVE among associated SNPs, at 5.1% for budset and 2.6% for cold hardiness. This gene was one of six putatively involved in mitigation of oxidative stress that harbored SNPs associated with one or both of the phenotypic traits we assayed (the others being gp4, gp5, gst4, ifr2, and per3). Associations were also found for several auxin-related genes, including a putative auxin efflux carrier (aec3), two putative auxin-responsive genes (arg1 and aux1), and an auxin-associated gene (drm2). Finally, chlh2, a gene similar to the recently described abscisic acid (ABA) receptor (Shen et al., 2006), was associated with cold hardiness.

Clinal variation in allele frequencies and correlations with climate variables

Divergent selection resulting from variation in climate across a species range is expected to enhance population differentiation at sites that are the targets of selection. Therefore, in addition to identifying associations between genotype and phenotype while removing effects related to population of origin, the north–south cline in budset and cold hardiness and associated climatic variables presents an additional means by which to assess the effects of genotype on these traits. For SNP loci with additive effects on phenotypes (or linked to loci with such effects), we would expect the homozygote for the nucleotide with a positive effect on cold hardiness or earliness of budset to be present at higher frequencies in the north of the range, the heterozygous state to be more frequent in the center of the range, and the alternative homozygote to be more frequent in the south. Strong clinal variation (R2 > 0.5) in the expected direction was found for nine of the SNPs associated with either budset timing or cold hardiness, or both (Fig. 8). Among these were two of the four SNPs in xth1 (positions 199 and 350), as well as the SNP in per6. SNPs within these genes had the largest effect sizes of all the SNPs we tested. The candidate light-signaling gene co also exhibited clinal variation. By contrast, many other SNPs that had significant associations with phenotypes had a more uniform range-wide distribution (Fig. 7).
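The expected shift in genotype classes along the range can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions within each population, which is an illustrative assumption and not something tested in this study; the function name and the three allele frequencies are hypothetical.

```python
def genotype_frequencies(p):
    """Expected (AA, Aa, aa) frequencies for allele-A frequency p.

    Assumes Hardy-Weinberg proportions (illustrative assumption only).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Hypothetical frequencies of the allele that increases cold hardiness
# at southern, central and northern sites along the range.
cline = {"south": 0.1, "center": 0.5, "north": 0.9}
expected = {site: genotype_frequencies(p) for site, p in cline.items()}

for site, (hom_a, het, hom_b) in expected.items():
    print(f"{site:>6}: AA={hom_a:.2f} Aa={het:.2f} aa={hom_b:.2f}")
```

Under these assumed frequencies, the homozygote for the cold-hardiness allele dominates in the north, the heterozygote peaks where the allele frequency is 0.5, and the alternative homozygote dominates in the south.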

Figure 8.  Clinal variation in nine single nucleotide polymorphisms (SNPs) that were associated with either budset or cold hardiness, or both (see Table 2 for associations). Best-fit lines and associated R2-values are given for regression of allele frequency on distance from the southern limit of the Sitka spruce (Picea sitchensis) species range at Fort Bragg, CA, USA (associated P-values all < 0.001). SNP position is given for genes with multiple associated SNPs.
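The regression of allele frequency on distance can be sketched as an ordinary least-squares fit. The distances and allele frequencies below are hypothetical values chosen for illustration, not data from this study, and the helper name is an assumption.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical populations: distance from the southern range limit (km)
# and frequency of one allele in each population.
distance_km = [0, 500, 1000, 1500, 2000, 2500]
allele_freq = [0.10, 0.18, 0.35, 0.41, 0.58, 0.66]

slope, intercept, r2 = linear_fit(distance_km, allele_freq)
is_strong_cline = r2 > 0.5  # threshold for strong clinal variation used in the text
print(f"slope={slope:.6f} per km, R2={r2:.3f}, strong cline: {is_strong_cline}")
```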

To determine whether climate variables were better predictors of SNP frequencies than position within the species range, we calculated correlation coefficients between allele frequencies by population (using the minor allele in the total sample as the reference) and the five aforementioned climate variables, for each of the 35 SNPs with genotype–phenotype associations. Of the nine SNPs that exhibited clinal variation, six were also significantly correlated (P < 0.01) with one or more climate variables (Tables 2, S3), as were an additional 10 that did not exhibit strong clinal variation. Not surprisingly, temperature-related variables dominated these relationships, particularly MAT, MCMT, and DD, although MWMT was correlated with five of the SNPs. No correlations with precipitation were found. Among the temperature-related variables, MAT was the most common correlation, and in only three instances did we identify an SNP–climate correlation that did not include MAT.
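A correlation of this kind can be sketched as follows. The Pearson coefficient matches the description above, but the permutation test is a stand-in for whatever significance test was actually applied, which is not specified in this section. All population values and function names are hypothetical.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P-value for the correlation between x and y.

    Stand-in significance test (assumption); y is shuffled among populations.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical minor-allele frequencies and mean annual temperature (MAT, deg C)
# for eight populations ordered from south to north.
minor_allele_freq = [0.62, 0.55, 0.51, 0.40, 0.33, 0.30, 0.18, 0.12]
mat = [11.5, 10.8, 9.9, 9.1, 8.0, 6.9, 5.2, 4.1]

r = pearson_r(minor_allele_freq, mat)
p_value = permutation_p_value(minor_allele_freq, mat)
print(f"r={r:.3f}, permutation P={p_value:.4f}, significant at 0.01: {p_value < 0.01}")
```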

Discussion

Here, we report a suite of markers for genes involved in local adaptation to climate in Sitka spruce. Adjusting for population structure reduced the type I error rate substantially, although this approach likely also inflated the type II error rate. This effect is likely to be more pronounced in species such as Sitka spruce, for which population structure covaries with climate. Nevertheless, after correcting for the effects of population structure, numerous significant associations were detected, which individually explained between 1 and 5% of the variance in budset and cold hardiness in our mapping population, but cumulatively explained 34.4 and 28.1% of the phenotypic variation in these traits, respectively. That none of the putatively neutral SNPs exhibited trait association suggests that these results reflect true phenotype–genotype relationships and are not the result of population structure. Many of the SNPs with phenotypic associations were also significantly correlated with one or more temperature variables, reinforcing their likely role in adaptation to local climate. SNP–climate relationships involved MAT in most cases, suggesting that MAT reflects selective constraints related to climate better than the other variables tested.

Most notable among the phenotypic associations were SNPs within five genes putatively involved in light signal transduction, a pathway that regulates budset timing in northern tree species, including a gene similar to PHYTOCHROME A, the most upstream element in this pathway (Howe et al., 1996; Olsen et al., 1997). Although effect sizes in association studies are typically somewhat overestimated (Xu, 2003), that the cumulative effects of the SNPs approach heritability estimates for budset and cold hardiness in Sitka spruce (0.32 and 0.30, respectively; Mimura & Aitken, 2007) suggests that a relatively small number of loci govern climatic adaptation in this species. Given the finite architecture of these traits, and as most adaptive SNPs were present across the species range, the potential for ongoing adaptation to a changing climate appears high. That most of the adaptive polymorphisms segregated in populations well south of the glacial refugia suggests that they may be evolutionarily old, and thus that past climatic adaptation (e.g. during postglacial colonization of the current range) likely proceeded, at least in part, from standing variation rather than new mutations, which may further accelerate the rate of adaptation under contemporary climate change. These results advance our understanding of how populations adapt to local climate using standing variation across many loci, and should facilitate applications of our results to marker-aided selection and conservation genetics by optimizing the cohort of SNPs to be assayed in breeding programs and natural populations.

Phenotypic trait variation

Phenotypic measurements confirm previous work with some of these populations demonstrating the dramatic phenotypic gradients present in this species along its wide latitudinal range (Mimura & Aitken, 2007). Strong clinal variation was observed for population means, and differences in latitude between pairs of populations predicted statistically significant differences in phenotypes. All temperature variables except MWMT covaried significantly with both autumn cold injury and budset timing. As noted by Mimura & Aitken (2007), MWMT has a U-shaped distribution along the range of Sitka spruce. In the south (Oregon and northern California), coastal fog during the summer keeps temperatures lower on average than on the inner south coast of British Columbia (i.e. adjacent to the Vancouver population). Annual precipitation was a weak but significant predictor of both phenotypes, which may reflect covariation of this variable with latitude rather than a true relationship with budset and cold hardiness phenotypes.

Population structure

Population structure in Sitka spruce has been previously characterized for some of the populations employed in this study (Mimura & Aitken, 2007), and although average FST was found to be relatively high for a widespread conifer (= 0.11), it was largely driven by peripheral disjunct populations contributing high pairwise values. We found a similar pattern, with the two disjunct populations having relatively high FST (> 0.1) in pairwise comparisons, whereas even populations separated by > 2000 km within the continuous portion of the range usually had values below 0.05.
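For orientation, a pairwise FST value for a single biallelic locus can be sketched from heterozygosities as FST = (HT - HS) / HT. This simple form assumes equal sample sizes and Hardy-Weinberg proportions and is not necessarily the estimator used in this study; the allele frequencies and function name are hypothetical.

```python
def pairwise_fst(p1, p2):
    """FST for one biallelic locus between two populations.

    Uses FST = (HT - HS) / HT with equal population weights
    (an illustrative assumption, not the study's estimator).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled populations.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies: two populations in the continuous range,
# and a continuous-range population paired with a disjunct one.
fst_similar = pairwise_fst(0.30, 0.35)
fst_divergent = pairwise_fst(0.10, 0.60)
print(f"similar pair FST={fst_similar:.4f}, divergent pair FST={fst_divergent:.4f}")
```

With these assumed frequencies the first pair falls below 0.05 and the second exceeds 0.1, mirroring the contrast described above between continuous-range and disjunct populations.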

Adjusting for population structure was necessary to avoid false-positive associations. Using 98 SNPs assayed in 410 individuals collected from across the species range, we found that accounting for three genetic populations provides the best solution to population subdivision in Sitka spruce. Although it was predictable that the Kodiak Island population would be genetically distinct given previous work on gene flow in Sitka spruce (Mimura & Aitken, 2007), the delineation between Prince Rupert and Icy Bay is clearer than we might have expected given the continuous nature of the range in this area. Although these two populations are separated by c. 900 km, this distance is not sufficient to explain their genetic differentiation. For example, the distance between Prince Rupert and Redwood is even greater (c. 1600 km), but these populations do not appear differentiated. The most obvious explanation is a geographical barrier. This is clearly the case for the Iniskin–Kodiak Island split, with c. 50 km of ocean separating Kodiak Island from the continuous portion of the range. This latter demarcation is exacerbated by founder effects leading to local inbreeding on Kodiak (Gapare & Aitken, 2005; Gapare et al., 2005; Mimura & Aitken, 2007). There are no obvious, strong geographic barriers to gene flow between Prince Rupert and Icy Bay, but the width of the species range is compressed, particularly close to the coast in the vicinity of the St Elias mountains, south of the Icy Bay population (denoted ‘IB’ in Fig. 1). It is possible that this constriction results in reduced effective gene flow and may partially explain the observed pattern of population structure. An alternative, or possibly synergistic, explanation is that a glacial refugium may have been present in southeast Alaska during the Last Glacial Maximum, perhaps on one of the offshore islands. Expansion from both northern and southern refugia could also explain the genetic differences between mainland Alaska populations of Sitka spruce and those comprising the southern portion of the range.

Possible functional roles of SNP polymorphisms

The putative functions of many of the genes harboring SNP associations provide tantalizing insights into the molecular biology of local climatic adaptation in Sitka spruce. By taking advantage of gene expression profiles, we were able to identify a wide variety of genes for the current study that would likely not have been selected on the basis of their functional annotation alone. On the other hand, some of the most interesting SNP associations uncovered here were within genes that were strong a priori candidates on the basis of their annotations, and would not have been included solely on the basis of their expression profiles. These include a homolog of PHYTOCHROME A (phya), the most upstream element in the pathway leading to seasonal growth cessation and budset (Howe et al., 1996; Olsen et al., 1997), as well as downstream genes similar to CO (co) and GI (gi). It was recently shown that the phase of CO expression relative to the light period explained some of the latitudinal variation in growth cessation in European aspen (Populus tremula) (Bohlenius et al., 2006). GI, although not known to be involved in this pathway as it relates to dormancy, is a key component of the photoperiodic flowering pathway that also includes CO and FLOWERING LOCUS T (FT). Two cryptochrome-like genes were also associated with either budset timing or cold hardiness. Cryptochromes are blue light receptors that regulate the transition to flowering in annual plants (Li & Yang, 2007). Although a role for cryptochromes in bud dormancy has not been established, given the apparent overlap between flowering and dormancy-related light signaling pathways, and the associations we have shown here, the role of cryptochromes in bud dormancy and cold hardiness in trees may be underappreciated.

The gene harboring the largest number of SNPs exhibiting phenotypic associations, and with the largest PVE for those associations, was included on the basis of a 5.5-fold increase in expression in Sitka spruce foliage between August and November (Holliday et al., 2008). Four SNPs were tested within this candidate gene, a putative xyloglucan : xyloglucosyltransferase (xth1) (also known as xyloglucan endo-transglycosylase (XET)), and each of these variants was associated with both budset timing and cold hardiness. Two of these SNPs (positions 289 and 350) had very low Q-values and among the highest PVEs of all SNPs tested (up to 4.3 and 5.4% for budset and cold hardiness, respectively, at position 350). As these were all synonymous substitutions, it may be that the true functional variant in this gene is in linkage disequilibrium with the SNPs we detected, particularly considering that all four SNPs we assayed in this gene are associated with both traits. The functional role of xth1 in climatic adaptation is not clear, but the literature points to one interesting possibility. Xyloglucan is a cell wall polysaccharide that provides strength and can be covalently linked by XETs to other polysaccharides, including cellulose (Hrmova et al., 2007). Such linkages may increase the flexibility of the cell wall, which could be advantageous during cold acclimation as the protoplast dehydrates, putting stress on connections between the cell wall and the plasma membrane. This hypothesis is supported by another highly significant association between an expansin-like gene (expl2) and cold hardiness. Expansins are best known for their role in cell wall loosening during growth, but have been shown to be expressed in response to abiotic stress, most notably drought (Choi et al., 2006). Indeed, expansin activity is correlated with drought-induced cellular dehydration and an associated increase in cell wall folding (Jones & McQueen-Mason, 2004). As freezing temperatures lead to cellular dehydration, it is likely that the cell wall must adjust in much the same way as under drought stress, and the associations we have shown for a putative XET and an expansin may reflect this. This hypothesis is also supported by the more than fivefold up-regulation of expl2 during autumn cold acclimation in Sitka spruce (Holliday et al., 2008). It should be noted, however, that these hypotheses would not explain why SNPs within xth1 are also strongly associated with budset timing.

Clinal variation in allele frequencies in the presence of high gene flow

Forest geneticists have long sought to resolve an apparent paradox that exists for outcrossing plant species that inhabit diverse habitats connected by gene flow. Common garden experiments in such species frequently demonstrate among-population variation in quantitative traits related to local adaptation, but molecular analyses often reveal low or nonexistent population differentiation at neutral loci (Howe et al., 2003; Savolainen et al., 2007; Aitken et al., 2008). Under these circumstances, gene flow tends to homogenize populations at selectively neutral loci (e.g. isozymes, amplified fragment length polymorphisms, microsatellites), while divergent selection on locally adaptive alleles is sufficient to maintain population differentiation in the genomic vicinity of these loci (Storz, 2005). Such differentiation may be facilitated by the strongly leptokurtic pollen dispersal recently observed in several tree species (De-Lucas et al., 2008; Kamm et al., 2009).

In this study we have identified nine SNPs associated with climate-related complex trait variation that also exhibited clinal variation in allele frequencies along the north–south range of Sitka spruce, suggesting that selection is sufficiently strong at these loci to resist the homogenizing effects of gene flow. However, it should be noted that for a linearly arranged species such as Sitka spruce, adjusting for population structure, which covaries with climate-related selective pressures, probably adjusts out some true associations for clinally varying SNPs. This is a limitation of association mapping in widespread population samples in general, wherein capturing the greatest quantitative trait variation available also introduces the confounding effects of population structure, which must then be adjusted out to avoid spurious associations. Most genotype–phenotype associations were for SNPs exhibiting weak or nonexistent clines in allele frequency. In these cases, gene flow may have prevented or disrupted differentiation. Although clinal variation in allele frequency provides evidence for the adaptive importance of a particular mutation, clines in phenotypic traits may be established in the absence of clines in SNP frequencies (Barton, 1999), and complex combinations of alleles exhibiting weak or nonexistent clines may have as much to do with divergence in quantitative traits as strong clinal variation in individual alleles (Aitken et al., 2008).

Implications for marker-aided selection and conservation genetics

Phenotypic selection in forest trees has led to substantial genetic gains in growth and yield since its relatively recent inception, and one of the anticipated outcomes of association mapping is that functional genetic markers associated with trait variation may augment the breeding process (Neale, 2007). Potential advantages of marker-aided selection (MAS) include the introduction of alleles from natural populations not currently present in breeding programs, or increasing the frequency of rare but desirable alleles already present, for example those conferring adaptation to warmer, drier environments, as well as the ability to genotype seedlings for informative markers rather than waiting for phenotypes that may not be possible to assay for years (Wu, 2002; Burdon & Wilcox, 2007). Evidence from quantitative trait locus (QTL) studies over the past two decades points to numerous mutations of small effect size controlling adaptive traits, which agrees with results from association studies presented here and elsewhere (Jermstad et al., 2001a,b; Wheeler et al., 2005; Gonzalez-Martinez et al., 2007, 2008; Ingvarsson et al., 2008; Eckert et al., 2009). These small effect sizes are sometimes interpreted as an argument against MAS, since implementation of such a program would require genotyping many markers to determine which crosses to make. However, the decreasing cost and relative technical ease of modern high-throughput genotyping somewhat mitigate these issues. It is also worth remembering that a genetic gain of even 10%, obtained through MAS, would be substantial, and such a gain may in principle be achieved with only a handful of markers already available.

The markers presented here also provide a tool previously unavailable to conservation geneticists. Neutral markers are frequently employed by geneticists to assay genetic diversity within a population or species, and to subsequently craft genetic conservation strategy. However, although there is a relationship between neutral marker diversity and population fitness (Reed & Frankham, 2003), there is no such connection with adaptive traits (Kramer & Havens, 2009). In a seminal meta-analysis, Reed & Frankham (2001) found only a weak correlation between neutral marker and quantitative trait diversity, and for some classes of traits, the relationship was nonexistent or even negative. As such, determining the appropriate reserve size to capture most of the standing genetic variation of a population or species, on the basis of neutral markers, is unlikely to successfully capture the desired proportion of segregating trait variation. For traits related to climatic adaptation, this problem will become acute as species climate envelopes are expected to shift dramatically in the coming century (Hamann & Wang, 2006; Wang et al., 2006). As adjusting frequencies of standing adaptive genetic variation provides one means by which populations can track climate change (Aitken et al., 2008; Barrett & Schluter, 2008), the markers we have described here, augmented by those found in future studies, could be a key tool in determining the potential of our forests to adapt to rapidly changing climatic conditions, and in formulating genetic conservation strategies to advance this goal. Future studies of the transferability of these markers among Picea species may also extend their usefulness beyond Sitka spruce.

Acknowledgements

We would like to thank Makiko Mimura for establishing the original common garden, Carol Ritland, Dylan Thomas, Michelle Tang and Leyla Tabanfar for assistance with DNA isolations and genotyping, as well as Mack Yuen and Ligia Mateiu for computational support. We would also like to thank Pia Smets and Joanne Tuytel for maintenance of the common garden and assistance with vegetative propagation. This work was supported by Genome British Columbia, Genome Canada and the Province of British Columbia (grant to K.R. and S.A.), by the Natural Sciences and Engineering Research Council of Canada (NSERC; grant to S.A.), by an NSERC Postgraduate Scholarship to J.H. and by Virginia Tech ‘Startup Funds’ to J.H.

References

Supporting Information

Table S1 Pairwise Fst estimates for 14 geographic populations included in this study

Table S2 Pairwise Fst estimates for three populations inferred using the Structure software

Table S3 Correlations between single nucleotide polymorphisms with phenotypic associations (Table 2) and five climate variables

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.

NPH_3380_sm_TablesS1-S3.xls (26K): Supporting info item