Natural selection contributes to geographic patterns of thermal plasticity in Plantago lanceolata

Abstract A long‐standing debate in evolutionary biology concerns the relative importance of different evolutionary forces in explaining phenotypic diversification at large geographic scales. For example, natural selection is typically assumed to underlie divergence along environmental gradients. However, neutral evolutionary processes can produce similar patterns. We collected molecular genetic data from 14 European populations of Plantago lanceolata to test the contributions of natural selection versus neutral evolution to population divergence in temperature‐sensitive phenotypic plasticity of floral reflectance. In P. lanceolata, reflectance plasticity is positively correlated with latitude/altitude. We used population pairwise comparisons between neutral genetic differentiation (F ST and Jost's D) and phenotypic differentiation (P ST) to assess the contributions of geographic distance and environmental parameters of the reproductive season in driving population divergence. Data are consistent with selection having shaped large‐scale geographic patterns in thermal plasticity. The aggregate pattern of P ST versus F ST was consistent with divergent selection. F ST explained thermal plasticity differences only when geographic distance was not included in the model. Differences in the extent of cool reproductive season temperatures, and not overall temperature variation, explained plasticity differences independent of distance. Results are consistent with the hypothesis that thermal plasticity is adaptive where growing seasons are shorter and cooler, that is, at high latitude/altitude.

Such information not only helps us to understand how evolutionary processes have molded current geographic patterns of phenotypic diversification, the information also helps us better predict how climate change will alter current geographic patterns. Here, we provide what we believe is the first study to examine whether these processes contribute to population divergence in thermal plasticity, that is, phenotypic plasticity in response to temperature, along largescale latitudinal and altitudinal gradients.
Thus, although the clinal patterns are common, their causes remain largely untested.
Teasing apart the contributions of neutral and non-neutral evolutionary forces in population divergence remains a challenge because both can produce similar geographic patterns (Huey, Gilchrist, Carlson, Berrigan, & Serra, 2000;Nadeau et al., 2016;Orsini et al., 2013;Whitlock, 2008). For example, founder effects such as migration out of refugia, coupled with limited gene flow and genetic drift, can produce a pattern of isolation by distance, that is, a positive correlation between genetic and phenotypic differentiation, and geographic distance between populations (IBD, Hutchison & Templeton, 1999;Wright, 1943). However, this geographic pattern can also be produced by local adaptation when the environmental conditions driving selection against nonlocally adapted migrants are correlated with distance, resulting in isolation by environment/adaptation (IBE/A, Hendry, 2004;Nosil, Vines, & Funk, 2005;Nosil, Funk, & Ortiz-Barrientos, 2009;Orsini et al., 2013). In such cases, sampling the landscape in such a way that reduces the spatial-environment correlation may help disentangle the contributions of these different evolutionary forces.
The natural geographic variation in thermal plasticity of floral reflectance among populations of Plantago lanceolata, a widespread temperate herb, allowed us to address these challenges. In this species, many individuals can modify the color and NIR (near infrared) reflectance of a spike (i.e., an inflorescence of tightly packed flowers) in response to the ambient temperature experienced during flower development (Lacey & Herr, 2005). Typically, nonplastic individuals produce lightly colored/highly reflective spikes, regardless of ambient temperature (Lacey & Herr, 2005). In contrast, plastic individuals produce darker, less reflective spikes in cool temperatures, and lighter, more reflective spikes in warm temperatures ( Figure 1; Lacey & Herr, 2005;Lacey et al., 2010).
The darker, less reflective spikes can absorb more incoming solar radiation than lighter, more reflective spikes, thereby helping to warm reproductive tissues (Lacey & Herr, 2005). Conversely, during warm reproductive periods, lighter, high reflectance spikes absorb less incoming solar radiation, helping to cool tissues (Lacey & Herr, 2005). Thus, the darker, less reflective spikes appear to be beneficial in cool environments where warming tissues are advantageous (Lacey et al., 2010). Manipulative experiments provide evidence that this plasticity improves fitness via increased reproductive success during cool periods of the reproductive season, but does not negatively affect fitness during warm periods (Lacey, Lovin, & Richter, 2012). Furthermore, warming reproductive tissues during cool periods can enhance offspring fitness (Lacey, F I G U R E 1 Under cool growing conditions (e.g., 15°C day/10°C night), Plantago lanceolata individuals vary in their ability to produce dark preflowering spikes (inflorescences of tightly packed flowers prior to stigma emergence) that exhibit low reflectance. When growing conditions are warm (e.g., 27°C day/20°C night), nearly all individuals produce spikes that are lightly colored and highly reflective (as in D). As the color of spikes lighten from left to right (a-d), the percentage of light reflected at 850 nm (i.e., floral reflectance) increases 1996; Lacey & Herr, 2000). At this point, no cost of this type of plasticity has been detected in terms of reproductive output (Lacey et al., 2012).
As latitude or altitude increases, the degree of temperature sensitivity and the proportion of plastic individuals in a population both increase, while the proportion of nonplastic individuals, which constitutively produce light colored/highly reflective flowering spikes, decreases (Lacey et al., 2010). Nonplastic individuals constitutively producing dark/poorly reflective spikes have rarely been found (Lacey et al., 2010).
To assess the evolutionary cause(s) of variation in floral reflectance plasticity in P. lanceolata, we compared neutral genetic differentiation (F ST and Jost's D), phenotypic differentiation (P ST ), geographic distance, and four parameters of the reproductive season environment. F ST versus Q ST or P ST comparisons are an important tool for inferring the relative importance of neutral evolution versus natural selection (Hangartner et al., 2012;Leinonen et al., 2008;Whitlock, 2008). F ST estimates how much population differentiation can be explained in the absence of natural selection; that is, it provides a null hypothesis to that of natural selection. The comparison can be used to determine whether populations have diverged for a trait, and to determine whether a series of populations along an environmental gradient show evidence of local adaptation. Among each pair of populations, the comparison produces one of three possible outcomes: P ST indicating observed differences are best explained by neutral genetic drift, stabilizing selection, or diversifying selection, respectively (McKay & Latta, 2002;Merilä & Crnokrak, 2001).
The goals for our study were to collect and analyze molecular genetic AFLP data in order to (a) estimate neutral genetic population diversity and differentiation (F ST ) for 14 populations of P. lanceolata in their native European environment, (b) infer patterns of migration and admixture potentially shaping population differentiation, (c) test whether isolation by distance explained a significant proportion of the variance in neutral AFLP markers, (d) compare F ST and P ST to test for a signature of natural selection, and (e) assess whether phenotypic divergence in thermal plasticity between populations is best explained by distance between populations, neutral evolution, and/ or specific environmental properties of the reproductive season that might drive population divergence. The environmental variables we examined allowed us to evaluate potential drivers of selection.
We used Mantel tests to examine whether geographic distance and neutral genetic differentiation (F ST or Jost's D) were correlated, and whether geographic distance and population differences in each environmental variable were correlated. We used multiple regression of distance matrices (MRM) analyses to determine whether variation in P ST could be explained by F ST and Jost's D, geographic distance, and/or environmental properties of the reproductive season. We used linear regression models to examine how phenotypic and genetic differentiation varied along axes of geographic distance and environmental differences.
Additionally, our dataset allowed us to test two local adaptation hypotheses. The local adaptation hypothesis predicts that phenotypic properties of different populations should diverge as selective pressures become more different, and should do so at a greater rate than neutral genetic differences (Orsini et al., 2013). The Thermal Magnitude Hypothesis, which we will refer to as the Magnitude Hypothesis, and the Relative Frequency Hypothesis, which we will refer to as the Frequency Hypothesis, attempt to explain why exhibiting thermal plasticity is more adaptive at higher latitudes and altitudes than being nonplastic. The Magnitude Hypothesis states that thermal plasticity is more adaptive at higher latitudes and altitudes because the magnitude of thermal variation is greater in these regions (Ghalambor et al., 2006;Janzen, 1967). The Magnitude Hypothesis predicts phenotypic differentiation will correlate with environmental differences in the magnitude of thermal variation (i.e., temperature range) among populations. The Frequency Hypothesis states that thermal plasticity is more adaptive at higher latitudes and altitudes because thermally variable growing seasons are shorter and individuals experience a relatively greater proportion of their growing season at cool, rather than warm temperatures (Lacey et al., 2010).
The Frequency Hypothesis predicts phenotypic differentiation will correlate with environmental differences in season duration and the proportion of cool temperatures during the growing season among populations. Lacey et al. (2010) have shown that phenotypic patterns of thermal plasticity among European populations of P. lanceolata are consistent with the Frequency Hypothesis and not the Magnitude Hypothesis. We examined whether their conclusions were supported when we compare the phenotypic patterns with genetic patterns generated by our AFLP data. Finally, our results allowed us to predict how climate change is likely to alter latitudinal and altitudinal patterns of thermal plasticity via thermal change affecting the intensity of natural selection.

| Biology of P. lanceolata
Plantago lanceolata L. (ribwort plantain, English plantain) is a temperate perennial herb native to Eurasia. A weedy species, it is an obligate outcrosser that is primarily wind-pollinated. Reflectance (i.e., the amount of light reflected) and the color of a spike (i.e., an inflorescence of tightly packed flowers) are thermally plastic and are determined by the ambient temperature experienced during flower development (Lacey & Herr, 2005). For a single flower on a spike, floral reflectance becomes fixed at the time of flower development. However, floral reflectance is reversible for an individual plant. An individual plant typically produces spikes throughout an often-lengthy flowering season, for example, 2-6 months depending on location. Also, the external temperature changes during that time. These changes induce the individual plant to modify the floral reflectance of its newly produced spikes. Floral reflectance thermal plasticity is strongest in the visible and NIR regions of the electromagnetic spectrum (Lacey & Herr, 2005).
Floral reflectance plasticity is genetically variable within and among natural populations of P. lanceolata and is positively correlated with latitude and altitude (Lacey & Herr, 2005;Lacey et al., 2010;Umbach, Lacey, & Richter, 2009). Individuals from low latitudes and altitudes often display negligible thermal plasticity and produce only highly reflective/lightly colored spikes. Most individuals from higher latitudes and altitudes reduce reflectance/darken spikes in response to cool environments, but do so to different degrees.

| Experimental populations
We selected fourteen European P. lanceolata populations of the 29 used in Lacey et al. (2010). Populations spanned a latitudinal range of 39.3-50.9°N and an altitudinal range of 1-1,886 m (Table 1). When compared to the larger pool of 29 populations from Lacey et al. (2010), the populations we sampled spanned the southern 53% of latitudinal range, covered all of the altitudinal range, and sampled the lower 78% of the total variation in floral reflectance plasticity.
Distance between populations was determined by uploading latitude-longitude coordinates into Google Earth (earth.google.com) as (a) minimum linear Euclidean distance in meters and (b) minimum geographic distance over land as determined using the path tool.
Analyses conducted with Euclidean distance and distance over land produced the same conclusions; those with distance over land are presented because they are the most biologically reasonable.
The phenotypic data for thermal plasticity of the genotypes in our sample populations came from an earlier study (Lacey et al., 2010). In that study, seeds collected in the year 2000 from each population were, grown, and passed through one generation in a common greenhouse environment, followed by within-population pollination in a common growth chamber environment.
Parental temperature effects were largely reduced by controlling the postzygotic temperature during flower development and seed maturation (Case, Lacey, & Hopkins, 1996;Lacey & Herr, 2000).
Second-generation individuals (also distinct genotypes) from each population were initially grown in a common growth chamber environment (details in Lacey et al., 2010). After 10 weeks, each plant was divided into two genetically identical clones and maintained in the common environment to recover. Then, each clone was randomly assigned to either a "cool" growth chamber set to 15°C day/10°C night or a "warm" chamber set to 27°C day/20°C night, and flowering was induced by increasing the day-length. These temperatures are within the natural range experienced by populations during reproduction (Lacey et al., 2010). For each clone, reflectance was estimated as the percentage of light reflected from flower buds (i.e., spikes just before stigma emergence on the lowest flowers) at 850 nm (in NIR region) using a spectrophotometer with an integrated sphere (details in Lacey & Herr, 2005).
An individual's (i.e., genotype's) thermal plasticity in floral reflectance was calculated by subtracting percent reflectance of spikes TA B L E 1 Population locations and characteristics: Country of origin, location within country, population symbol, mean heterozygosity, mean floral reflectance plasticity, and the number of individuals measured (N) produced by a clone grown at cool temperature from the percent reflectance of spikes produced by a clone of the same genotype grown at warm temperature (for complete methodology, see Lacey & Herr, 2005;Lacey et al., 2010). One-sided Pearson correlations showed that the geographic patterns of thermal plasticity in our sample populations were consistent with the overall geographic pattern observed in the earlier, larger study (Lacey et al., 2010).
Mean plasticity increased significantly with latitude (r = 0.528, one-sided p = 0.026) and marginally with altitude (r = 0.418, onesided p = 0.068) of the source population (R Development Core Team, 2013). We used a one-sided test to test the directional hypothesis of a positive correlation.

| Phenotypic differentiation
Phenotypic differentiation (P ST ) in thermal plasticity was calculated as a conservative proxy for quantitative genetic differentiation (Q ST ) associated with floral reflectance plasticity as where 2 PB denotes between-population phenotypic variance, 2 PW within-population phenotypic variance, and h 2 the heritability (Leinonen, Cano, Mäkinen, & Merilä, 2006;Merilä & Crnokrak, 2001). We calculated P ST using the null assumption that h 2 =1, that is, all of the observed phenotypic variation is genetic, for two reasons. First, the plants we used for phenotyping were second-generation plants from each population that had been grown in a common environment, and the phenotypes were scored in a common environment. Thus, environmental effects from the previous two generations were controlled and should not have inflated the between-population variance in our study. Second, we were unable to determine reliable estimates of heritability. By using the assumption h 2 =1, our measure of P ST is a conservative proxy for Q ST . Phenotypic variance components were calculated for plasticity between each pair of populations using ANOVA tests in SPSS (Merilä & Crnokrak, 2001;SPSS, 2009

| Neutral genetic markers
We extracted DNA from leaf tissue of 315 individuals (n = 10-33 individuals/population) using a modified CTAB method (Doyle & Dickson, 1987) and prepared amplified fragment length polymorphism (AFLP) reaction templates following Vos et al. (1995) using 500 ng of DNA digested with EcoRI and MseI. Thereafter, we completed ligation with EcoRI (E) and MseI (M) primers and selective preamplification using standard AFLP EcoRI (E) and MseI (M) primers containing selective nucleotides E + AC and M + CC (Remington, Whetten, Liu, & O'malley, 1999;Vos et al., 1995). Selective amplification was performed using combinations of the following E We used individuals repeated within each primer-pair combination to establish consensus AFLP scoring panels, and all loci were repeated with at least 5 individuals. All of the individuals that were genotyped with multiple dyes were used to create scoring panels.
In cases where one of the samples of repeated individuals was too poor to score, and the other sample was used for scoring. In cases of disagreement, the sample with the clearest standards in that region was used. If both samples were of equal quality, disagreements were treated as missing data. In total, we scored 313 unique AFLP loci in each of 315 individuals.
Within AFLP scoring panels, we determined scoring error for each dye, and between dyes (Supporting Information Table S1). We calculated scoring error in AFLP markers as percent of markers that disagreed (percent disagreement) between multiple runs of an individual within and between FAM and TAMRA dyes. We calculated error within each primer-pair combination as the number of markers in disagreement divided by the total number of markers able to be scored within repeated individuals. To calculate overall error, we summed the average error within all primer pairs, weighted by the number of individuals used, and divided by the total number of individuals used. The percent of AFLP markers that disagreed between multiple runs of the same individual were 5.06 ± 1.79% and 5.51 ± 2.29% for FAM and TAMRA dyes and 8.86 ± 1.67% between dyes (Supporting Information Table S1).
Once AFLP scoring panels were established, they were used to score each individual. In the final AFLP data set, each individual was included once for each primer pair (individuals were not repeated).
Using the criteria described above, we developed a consensus score for individuals for which we had data from multiple runs.

| Environmental variables
We considered the following four environmental characteristics of the reproductive season for our analyses. (a) The "cool proportion" of the season was calculated as the proportion of the flowering season spent at temperatures below 15°C, based on monthly means.
The physiological rationale for this value is described in Lacey et al.  (Table 2).
Finally, we calculated absolute pairwise differences between populations for each environmental variable and for each composite variable.

| Neutral genetic population structure
Scored AFLP markers were used to estimate genetic diversity within populations and differentiation between populations and to conduct population structure analyses. We estimated neutral genetic diversity as population mean heterozygosity for each population in Hickory v1.1 with 10 5 iterations following a burn-in of 5,000 (Holsinger, Lewis, & Dey, 2002). Comparative phylogeographic studies have found evidence of postglacial migration from southern European refugia following the Pleistocene glaciation in other species (Schönswetter, Stehlik, Holderegger, & Tribsch, 2005;Taberlet, Fumagalli, Wust-Saucy, & Cosson, 1998). To determine whether we could identify postglacial migration routes in P. lanceolata, potentially providing information on the origins of neutral patterns of genetic differentiation, we mapped diversity at population locations and looked for emerging patterns (Supporting Information Figure   S1). Two-sided Pearson correlations between heterozygosity with latitude and altitude were calculated in R (Goslee & Urban, 2007;R Development Core Team, 2013).
We calculated neutral genetic differentiation and 95% confidence intervals between all population pairs using two statistics; F ST estimated with the full model as θ II in Hickory v1.1 with 10 5 iterations following a burn-in of 5,000, and Jost's D calculated in SPADE TA B L E 2 Principal components analyses used to combine multiple aspects of the reproductive season into composite variables Jost, 2008). Q ST or P ST and F ST statistics are equivalent measures of population phenotypic and genetic differentiation and thus are derived from the same evolutionary history and respond similarly to the evolutionary processes that give rise to them (i.e., realized migration and genetic drift). Jost's D on the other hand is specific to the loci being measured and is more strongly affected by mutation than migration. Thus, Jost's D is not legitimately equivalent to Q ST or P ST (Whitlock, 2011). However, we included Jost's D because it is better suited for describing the allelic differentiation among populations (Meirmans & Hedrick, 2011).

| Admixture
To explore the genetic groups within samples and infer geographic patterns of admixture across the landscape, nonhierarchical Bayesian clustering was performed in STRUCTURE 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). The correlated allele frequencies model with admixture was used to test values of K. Five replicates for each K from 2-10 were run with a burn-in of 10 5 , followed by 10 6 replicates, with convergence monitored for each run. We combined and interpreted all runs with Structure Harvester (Earl, 2012), using the methods of Evanno, Regnaut, and Goudet (2005) and Pritchard et al. (2000). We used CLUMPP to average admixture proportions over runs (Jakobsson & Rosenberg, 2007) and visualized averaged runs using Distruct (Rosenberg, 2004). To best resolve ancestral relatedness among populations, we visually examined average admixture plots from low to high values of K groups, with regard to the geographic location of populations.
Then, we grouped populations into geographic regions based upon both their physical locations and the groupings provided by STRUCTURE, and we calculated genetic differentiation between these regions in Hickory with 25,000 iterations following a burn-in of 5,000 (Holsinger et al., 2002). We looked for regions separated by higher genetic differentiation that may represent ancestral populations, and regions separated by lower differentiation representing historical postglacial migration routes (Fischer, 1960;Hewitt, 1999;Schmitt, 2007).

| Hypothesis testing
Phenotypic, genetic, and environmental differentiation values were standardized to zero mean and unit variance with the decostand function (R Development Core Team, 2013). Standardized values were used for hypothesis testing in subsequent analyses.

| Isolation by distance
To determine whether isolation by distance could explain geographic patterns of differentiation in neutral genetic markers, we conducted  (Guillot & Rousset, 2013) and for addressing questions that concern dissimilarity matrices (Legendre, Fortin, & Borcard, 2015). However, Mantel test results should be interpreted cautiously because simulations have shown them to reject the null hypothesis of independence too often, producing a higher number of false positives than it should when comparing two nonspatial matrices that might both be spatially autocorrelated (Guillot & Rousset, 2013).

| Natural selection
To test for evidence of natural selection, we compared phenotypic and neutral genetic differentiation. We examined the relationship between phenotypic (P ST ) and neutral genetic ( Whitlock & Guillaume, 2009), because available packages for calculating F-statistics from dominant AFLP markers do not provide the required per-locus components of variance, that is, the coefficients "a", "b", and "c" from Weir and Cockerham (1984).
To The local adaptation hypothesis predicts that phenotypic differentiation should diverge more rapidly than neutral genetic differentiation along selective gradients. Therefore, under natural selection, P ST should (a) increase more sharply than F ST and Jost's D as environments increasingly differ between populations and/or (b) have a higher y-intercept value (Leinonen et al., 2006;Orsini et al., 2013).
Although the magnitude and frequency hypotheses are not mutually exclusive, each predicts that different characteristics of the reproductive season for P. lanceolata drive the evolution of thermal plas-   Table S2). Relatively low genetic differentiation was found between several regions: Spain and southern France, southern and northern France, and northern and western France (0.05 ≤ F ST ≤ 0.10, Figure 2, Supporting Information Table   S2).

| Neutral genetic population structure is consistent with gene flow among populations
STRUCTURE indicated that the best arrangement for AFLP data from the 14 European populations was in K = 7 or 8 groups.
The highest delta K value was observed at K = 7, while K = 8 showed the highest log probability and low run-to-run variability (Supporting Information Figures S2 and S3). As values of K increased from 2 to 8, evidence of admixture among populations also increased (i.e., new groups were spread among multiple populations). Despite the admixture, southern Italian populations (IA, ICa, and ICs) consistently remained different from other populations ( Figure 3). This pattern was noticeable at K = 2 and K = 8 ( Figure 3).  Tables S3 and S4). Removing F ST from models with both geographic distance and F ST had little effect on the variation explained (e.g., Table 3C vs. 3A), whereas removing geographic distance from these models substantially reduced the variation explained (e.g., Table 3C vs. 3B). Thus, on its own, F ST was a poor predictor of phenotypic differentiation.

| Isolation by distance
Our data were able to distinguish the influence of geographic distance on phenotypic differentiation separately from contributions of potentially selective environmental variables because pairwise differentiation in environmental parameters and geographic distance was uncorrelated (Supporting Information Table S5). As a result, the F I G U R E 3 STRUCTURE admixture plots of 14 Plantago lanceolata populations from southern Europe calculated with 313 AFLP markers from 315 individuals. CLUMPP was used to average admixture proportions over runs (Jakobsson & Rosenberg, 2007) and Distruct (Rosenberg, 2004)  significance of environmental variables as predictors of variation in P ST in MRM models was influenced only modestly by including geographic distance in models (e.g., Table 3H vs. 3I).

| P ST analyses
For 36 of the 91 (39.6%) population pairwise comparisons, we found that P ST was greater than neutral genetic differentiation (F ST ) and the 95% confidence intervals did not include values where P ST = F ST ( Figure 4). We did not find statistical support for a difference among the remaining comparisons, and no comparison suggested that F ST > P ST .
MRM models showed that a noteworthy fraction of the variation in P ST could be explained by the cool proportion of the season and Thermal_PC (the composite variable for cool proportion and duration), independent of the contributions of geographic distance and F ST (Tables 3H-K and 4A-D). In models including cool proportion and geographic distance, cool proportion significantly or marginally contributed to the variation in P ST (Table 3I,K). In the model including Thermal_PC and distance, Thermal_PC significantly contributed to the variation in P ST (Table 4B), and the contribution was marginal when F ST was added to this model (Table 4D). Overall, the models that included Thermal_PC had the highest predictive power.
Substituting Mag_Therm_PC for Thermal_PC (i.e., adding magnitude to the composite environmental variable) failed to improve the model's explanatory power when compared to the model with Thermal_PC alone (Table 4A-D vs. 4I-L). Associations with duration, magnitude, and precipitation alone were not significant (Table 3D-G, L-O, P-S).
The linear regression analyses showed that the slopes of P ST differed substantially from F ST and Jost's D when the statistics were regressed on geographic distance between populations and on population differences for two environmental variables characterizing the reproductive season (Table 5). P ST , F ST , and Jost's D increased with increasing geographic distance between populations, but P ST increased at a significantly greater rate than did F ST and Jost's D ( Figure 5a, Table 5). In contrast, only P ST increased significantly with increasing population differences in the proportion of cool temperatures during reproductive season (Figure 5c, Table 5). With respect to population differences in duration of the reproductive season, and also proportion of cool temperatures, the slope of P ST did not significantly differ from F ST or Jost's D, but the y-intercept was significantly greater for P ST than for F ST and Jost's D (Figure 5b,c, Table 5).
As population differences in Thermal_PC (cool proportion and duration variables combined) increased, so also did P ST and F ST (Figure 5e, Table 5). The slopes for both P ST and F ST were significantly positive, but the slope for P ST was significantly steeper than for F ST (Figure 5e, Table 5). Likewise, the slope for P ST was significantly steeper than for F ST along the Mag_Therm_PC axis (Figure 5g, Table 5).
TA B L E 4 Results of multiple regression of distance matrices (MRM) tests of the phenotypic differentiation (P ST ) in temperature-sensitive floral reflectance plasticity matrix on matrices of geographic distance, genetic differentiation (F ST ), and/or environmental differences between Plantago lanceolata populations Slopes for P ST along axes of duration, thermal magnitude, precipitation and Magnitude_PC did not significantly differ from zero ( Figure 5b,d,f,h, Table 5), and slopes for Jost's D were not significantly positive along any environmental principal components axis.
In analyses where slopes did not differ between phenotypic and neutral genetic differentiation, the y-intercepts of phenotypic differentiation were always higher than y-intercepts of neutral genetic differentiation ( Figure 5, Table 5).
Results of runs tests suggested phenotypic and genetic differentiation variables significantly deviated from linearity when regressed along the axis of reproductive season duration (Supporting Information Table S6). These deviations dissipated when season duration was incorporated into composite PC variables (Supporting Information Table S6). The jackknifing procedure that controlled for nonindependence of population pairs in linear regression analyses produced nearly identical results as those above, and conclusions were equivalent (Table 6).

| D ISCUSS I ON
Our study is the first of which we are aware that has tested the local adaptation hypothesis against the "null" of neutral evolution for thermal plasticity in a trait. The results are consistent with natural selection having shaped large-scale latitudinal and altitudinal patterns in thermal plasticity. Approximately 60% of the population pairwise P ST -F ST comparisons did not provide statistical support for a difference between phenotypic differentiation and neutral genetic differentiation. Therefore, these differences presumably could be explained by genetic drift alone (McKay & Latta, 2002;Merilä & Crnokrak, 2001). However, phenotypic differentiation was greater than neutral differentiation among ~40% of the comparisons. Divergent selection better explains these differences (McKay & Latta, 2002;Merilä & Crnokrak, 2001). Moreover, multiple regression of distance matrices and linear regression analyses were consistent with phenotypic divergence in thermal plasticity occurring along clines in the reproductive environment.
Earlier attempts to test the local adaptation hypothesis against the null at large geographic scales have been limited in two ways.
By examining associations with specific environmental parameters and by sampling a variety of populations over both latitudinal and altitudinal gradients, we were able to largely reduce both of the above limitations. Thermal plasticity was first proposed to be more adaptive at higher latitudes and altitudes because the magnitude of thermal variation is greater in these regions (Ghalambor et al., 2006;Janzen, 1967). The Magnitude Hypothesis (previously called climatic or temperature variability hypotheses) is consistent with mathematical models, and empirical data predicting that plasticity should be favored over nonplasticity as environmental variability increases (Gavrilets & Scheiner, 1993b;Kingsolver & Huey, 1998;Moran, 1992;Schlichting, 1986;Schlichting & Pigliucci, 1998;Via, 1993;Via & Lande, 1985). Alternatively, the Frequency Hypothesis is that thermal plasticity is more adaptive at higher latitudes and altitudes because in these regions, thermally variable growing seasons are shorter and experience a greater proportion of the growing season in cool, rather than warm temperatures relative to low latitude and altitude populations (Lacey et al., 2010). This hypothesis is consistent with theoretical models predicting that for organisms living in temporally variable environments, the selective benefit of plasticity should depend on the relative frequencies of time during an organism's life when alternative phenotypes have a selective advantage, that is, the frequencies or durations of different selective environments (Gavrilets & Scheiner, 1993a;Gomulkiewicz & Kirkpatrick, 1992;Levins, 1968;Moran, 1992;Van Tienderen & van der Toorn, 1991 Note. Our jackknife procedure estimated the slope and p-values of each linear regression 91 additional times. Before each regression, one of the 91 pairwise comparisons was eliminated from the analysis. We eliminated a different comparison each time. The means, standard deviations, and pvalues of the jackknifing procedure are reported: bold = p < 0.05, italic = 0.05 < p < 0.10.
TA B L E 6 Jackknifing results of slopes and p-values from linear regression analyses to control for nonindependence of population pairs in cases where the slope of phenotypic differentiation was significantly greater than the slope for genetic differentiation We expect selection on thermal plasticity in nature to be greater than our data suggest. Our P ST values are likely conservative underestimates of Q ST because we conservatively used a heritability value of 1.0 in our calculations, and reflectance data were collected from clones of the same individuals grown at the same controlled temperatures. Additionally, parental environmental effects had been reduced by passing parents of our experimental plants through one generation in a similar environment, and all individuals were grown in the same environment. Therefore, between-population differences are due exclusively to genetic factors (see Lacey et al., 2010).
Temperature increases associated with contemporary climate change and local land-use change, for example, urbanization, are widespread and are already having a strong influence on species distributions (IPCC, 2014;Parmesan, 2006;Visser, 2008;Walther et al., 2002). For example, reproductive seasons are being lengthened, and advances in flowering have been observed in many plant species including P. lanceolata (Abu-Asab, Peterson, Shetler, & Orli, 2001;Cleland, Chuine, Menzel, Mooney, & Schwartz, 2007;Fitter & Fitter, 2002). Phenotypic plasticity has been proposed as a mechanism for coping with climate change because it can provide organisms with the potential to respond rapidly to changes in their environment (Charmantier et al., 2008;Gienapp, Teplitsky, Alho, Mills, & Merilä, 2007;Matesanz, Gianoli, & Valladares, 2010;Przybylo, Sheldon, & Merilä, 2000;Réale, McAdam, Boutin, & Berteaux, 2003;Visser, 2008). However, few empirical studies provide data to test this hypothesis. Given the evidence that thermal plasticity of floral reflectance in P. lanceolata is better adapted to environments with short and cool reproductive seasons, our data suggest that the advantage of thermal plasticity will generally decrease in extant populations of other species as well, as warmer weather becomes more prevalent.
On the other hand, warming should transform areas beyond today's high latitudinal and altitudinal range limits into suitable habitats. In P. lanceolata, floral reflectance plasticity should help facilitate the colonization of new populations toward the poles and higher elevations.
Recent ecological niche models are consistent with these predictions (Valladares et al., 2014). Projections suggest the less plastic southern and low-altitude populations will experience poleward and uphill niche expansion and will retain a large area of suitable habitat over the next few decades (Valladares et al., 2014).
In closing, we emphasize that plastic responses are specific to the traits, environments, and organisms considered and, thus, are as diverse as life itself. As a result, determining whether plasticity, or the evolution of plasticity, can ameliorate the effects of climate change depends on several factors (Munday, Warner, Monro, Pandolfi, & Marshall, 2013;Parmesan, 2006;Visser, 2008;Walther et al., 2002). For example, determining how the environmental sensitivity of different traits functions to influence fitness should help us predict whether or not plasticity will help individuals endure future conditions. In addition, knowledge of the genetic architecture (i.e., the number and effect size of underlying genes) and the inheritance of plasticity should inform our predictions about how traits will evolve under different scenarios. Future studies that examine the relationship between phenotypic plasticity and life history at large spatial scales will improve our understanding of evolutionary processes and thus our ability to predict species persistence in the face of ever changing environmental conditions.

ACK N OWLED G M ENTS
Funding for this research was provided by grants from the UNCG Biology Department. We thank J.C. Patton for help with preparation of maps, D.J. Smith for specimen photography, and S.J. Richter, J.H. Willis, M.D. Rausher, and O. Rueppell for helpful statistical discussions. We also thank the many undergraduates who helped with plant care and data collection and the National Science Foundation (NSF) for supporting the research (DEB 0236526) to E.P.L.

CO N FLI C T O F I NTE R E S T
The authors have no conflict of interest to declare.