Phylogenetic conservatism in plant phenology

Summary

  1. Phenological events – defined points in the life cycle of a plant or animal – have been regarded as highly plastic traits, reflecting flexible responses to various environmental cues.
  2. The ability of a species to track, via shifts in phenological events, the abiotic environment through time might dictate its vulnerability to future climate change. Understanding the predictors and drivers of phenological change is therefore critical.
  3. Here, we evaluated evidence for phylogenetic conservatism – the tendency for closely related species to share similar ecological and biological attributes – in phenological traits across flowering plants. We aggregated published and unpublished data on timing of first flower and first leaf, encompassing ~4000 species at 23 sites across the Northern Hemisphere. We reconstructed the phylogeny for the set of included species, first, using the software program Phylomatic, and second, from DNA data. We then quantified phylogenetic conservatism in plant phenology within and across sites.
  4. We show that more closely related species tend to flower and leaf at similar times. By contrasting mean flowering times within and across sites, however, we illustrate that it is not the time of year that is conserved, but rather the phenological responses to a common set of abiotic cues.
  5. Our findings suggest that species cannot be treated as statistically independent when modelling phenological responses.
  6. Synthesis. Closely related species tend to resemble each other in the timing of their life-history events, a likely product of evolutionarily conserved responses to environmental cues. The search for the underlying drivers of phenology must therefore account for species' shared evolutionary histories.

Introduction

In plants, the timing of seasonal events, such as flowering time, is highly sensitive to climate, making phenology one of the most variable plant traits (Chuine 2010). Along with climate and photoperiod, key phenological drivers include disturbance (e.g. drought or fire), the identity of co-occurring species, competition (Rathcke & Lacey 1985; Ollerton & Lack 1992) and local abiotic conditions (Chuine 2010). Highly variable traits such as these might be expected to show little or no relationship with phylogeny because their expression is, for the most part, environmentally determined. However, high variation in flowering time among species occupying similar environments indicates that phenological responses may be additionally mediated by intrinsic species attributes such as life form, habit, dispersal mode and pollination (e.g. Olsson & Ågren 2002; Bolmgren & Cowan 2008; Jentsch et al. 2009; Davis et al. 2010; Sun & Frelich 2011), which may in turn be phylogenetically conserved (Chazdon et al. 2003; Swenson & Enquist 2007; Kraft & Ackerly 2010). Previous work suggests some evidence for phylogenetic conservatism in phenological traits, including flowering times (e.g. Kochmer & Handel 1986; Bolmgren & Cowan 2008; Willis et al. 2008; Davis et al. 2010), at particular locations. To date, the strength of this phylogenetic component to phenological variation has not been evaluated at broad spatial and taxonomic scales.

Phylogenetic conservatism has attracted much attention recently (Wiens et al. 2010), although its precise definition has proven somewhat problematic. Ackerly (2009) recognized two axes of conservatism: phylogenetic signal and evolutionary rate. Phylogenetic signal captures the tendency for closely related species to resemble each other more closely in their biological characteristics than expected by chance. The evolutionary rate component describes the rate of phenotypic divergence along the branches of the evolutionary tree – low evolutionary rates might be interpreted as trait conservatism. Here, we focus on the former, although we use the terms ‘conservatism’ and ‘signal’ interchangeably. Importantly, if phenology is phylogenetically conserved, such that closely related species share similar life-history attributes, then species cannot be regarded as statistically independent, and the search for correlates and predictors of phenological traits must account for species' phylogenetic relationships (Harvey & Pagel 1991; Harvey et al. 1995). Additionally, strong conservatism in mean timing of phenological events may be of practical benefit in allowing the prediction of phenological schedules for species for which we have information only on evolutionary relationships (Brooks, Mayden & McLennan 1992; Mazer et al. 2013).

Whilst there are reasons to expect phylogenetic conservatism in phenological traits, observations of weak conservatism have at least three possible explanations. First, errors in phylogenetic reconstructions or measured traits might scramble potential signal. Second, phenological schedules may evolve in a manner that is not well predicted by phylogeny, for example, when local adaptation or a similar directional selective force is strong and dictates evolutionary trajectories. Third, phenology may reflect a flexible response to environment that is more or less independent of taxonomic membership (i.e. phenological plasticity), such that phenological schedules for species and populations are largely determined by the environmental conditions (e.g. temperature, precipitation and photoperiod) in which they grow. Absence of phylogenetic conservatism might then suggest that species have rapidly shifted timing of life-history events in response to changing climates historically. In turn, phylogenetic conservatism in phenological plasticity – if we consider plasticity as a quantitative trait potentially under selection (De Jong 2005) – might be more important for predicting the impacts of future climate change on particular taxa (Matesanz, Gianoli & Valladares 2010), but rarely has it been estimated directly (Pigliucci 2005).

Here, we explore phylogenetic conservatism in Northern Hemisphere plant phenology. We examine two key phenological traits for vascular plants: timing of first leaf (FL) and, in angiosperms, timing of first flower (FF). Our data set comprises >70 000 records, from 23 sites providing ~5000 unique site × species observations sampled from multiple ecosystems, latitudes and climates, and across multiple years (NECTAR: Wolkovich, Cook & Regetz 2012; Cook et al. 2012). The scale of our analysis poses numerous challenges. The taxonomic breadth of our data set, spanning the plant tree-of-life, makes comparisons across species difficult because traits are not always comparable between evolutionarily distant lineages. For example, the time of first flower is not a relevant trait for non-angiosperm lineages. In addition, many lineage effects are likely to be subtle and might be detectable only within particular clades (e.g. Davis et al. 2010). Furthermore, as plants are found across the terrestrial biome, they are subject to a great diversity of climatic conditions and may attune to different cues in different environments (Mouradov, Cremer & Coupland 2002; Larcher 2003). Last, phenological events are measured in calendar days, which do not necessarily correspond seasonally in different parts of the world; for example, spring occurs later at higher latitudes and elevations (Schwartz, Ahas & Aasa 2006).

We use comparisons across and within sites to gain insights into the abiotic drivers of phenology and the phylogenetic conservatism of species' biotic responses. If phenology is largely determined by environment, but phenological responses are mediated by biological traits, we would predict that phylogenetic conservatism should be more pronounced within local assemblages than across sites. Weak phylogenetic conservatism across sites might reflect either strong directional selection, resulting in rapid evolutionary adaptation to local climate conditions, or phenological plasticity, which might mask the underlying similarity in phenological responses among more closely related species when they occur in different environments. We suggest that if phenotypic plasticity explains apparent lower conservatism, then the strength of phylogenetic conservatism across sites should converge on local estimates after correcting for cross-site differences in timing of climate cues.

Materials and methods

Phenology data

We used the Network of Ecological and Climatological Timings Across Regions (NECTAR) database on phenological traits for flowering plants (Cook et al. 2012; Wolkovich, Cook & Regetz 2012) and extracted information on day of year of first flower (FF) and first leaf (FL) across 23 sites spanning the Northern Hemisphere (Table 1). These sites encompass a cross section of temporal observations (1739–2010, with time series ranging from 5 to 184 years) and represent a range of species richness (11–1822 species per site), taxa (herbs, shrubs and trees) and biomes (northern temperate, tropical and arid). For most sites, we have an estimate of FF, and for a subset, we have FL, indicating the first recorded event for each species at each site. Because each site represents an independently designed study, data collection protocols differed among them. In the temperate and arid sites, data on FF and FL were generally recorded by direct observation; in tropical sites, direct observation is usually impossible (due to high canopies and low population densities), so phenology was recorded instead using litter trap collections of leaves and flowers (Wright & Calderon 1995).

Table 1. Data sets and site attributes (further details in NECTAR: Cook et al. 2012; Wolkovich, Cook & Regetz 2012)
| Site code | Data collected | Species richness [a] (FL) | Location | Temporal span | Taxa | Habitat/Biome |
|---|---|---|---|---|---|---|
| Washington, D. C. | FF, var(FF) | 749 | Washington, D. C. | 1985–2007 | Various angiosperms | Metropolitan area |
| Fargo | FF, var(FF) | 655 | Great Plains, North Dakota & Minnesota | 1910–1961 and 2007–2010 | Various angiosperms | Temperate grasslands |
| Chinnor | FF, var(FF) | 372 | Oxfordshire, England | 1954–2000 | Various angiosperms | Temperate woodland and grassland |
| Herms Ohio and Michigan [b] | FF, var(FF) | 11 | Michigan & Ohio | 1985–1989 (Michigan) and 1997–2002 (Ohio) | Various angiosperms | Botanical gardens |
| Soederstroem | FF, var(FF), FL | 140 (40) | Karlskrona, Sweden | 1843–1877 | Various angiosperms | Sarmatic mixed forest |
| Harvard | FF, var(FF), FL | 31 (28) | Harvard Forest, Massachusetts | 1990–2009 | Woody species | Temperate woodland |
| Marsham [b] | FF, var(FF) | 11 | Norwich, UK | 1736–1810, 1834 & 1836–1958 | Various angiosperms | Temperate woodland and grassland |
| Concord | FF, var(FF) | 612 | Concord, Massachusetts | 1851–2006 (incomplete) | Various angiosperms | Temperate woodland |
| WPS | FF, var(FF) | 34 | Wisconsin | 1962–2009 | Various angiosperms | Temperate woodland and grassland |
| Gothic | FF, var(FF) | 109 | Gothic, Colorado | 1973–2009 (incomplete) | Various angiosperms | Montane meadow |
| Konza | FF, var(FF) | 224 | Konza Prairie, Kansas | 2001–2009 | Various angiosperms | Tall grass prairie |
| Luquillo | FF, var(FF) | 83 | Luquillo Experimental Forest, Puerto Rico | 1992–2000 and 2006–2007 | Various tropical species | Tropical forest |
| Arnell 1877 | FF, var(FF), FL | 26 (10) | Ångermanland, Sweden | 1877–1916 | Various angiosperms | Taiga |
| Sevilleta | FF, var(FF), FL | 136 (245) | Sevilleta National Wildlife Refuge, New Mexico | 1991–1994 and 2000–2008 | Various angiosperms | Desert grassland and shrubland |
| Mohonk [b] | FF, var(FF) | 18 | Mohonk Lake, New York | 1928–2002 (incomplete) | Perennial angiosperms | Temperate woodland |
| OPG | FF, var(FF) | 19 | Ohio Phenological Gardens, Ohio | NA | Various angiosperms | Phenological gardens |
| Gunnar | FF, var(FF) | 22 | Tärnsjö, Sweden | 1934–2006 | Various angiosperms | Sarmatic mixed forest |
| Wauseon [b] | FL | 26 | Wauseon, Ohio | 1883–1912 | Trees | Temperate woodland |
| UWM | FL | 24 | Saukville, Wisconsin | 2000–2009 | Woody plants | Temperate wetlands |
| BCI | FF | 102 | Barro Colorado Island, Panama | NA | Tropical mixed woody plants | Tropical forest |
| Robertson | FF | 409 | Carlinville, Illinois | NA | Various angiosperms | Mesic temperate woodland |
| Arnell | FF | 553 | Uppsala, Sweden | 1873–1919 | Various angiosperms | Sarmatic mixed forest |
| Kochmer | FF | 1822 | North and South Carolina | NA | Complete flora | Mixed subtropical, temperate and boreal habitats |

[a] Following taxonomic synonymization.
[b] Excluded from site-level analysis because of low species richness after synonymization.

We used FF and FL to describe species phenology. Because both FF and FL might also be influenced by population size – a simple sampling effect would predict that first flower would tend to be recorded earlier in larger populations – it has been argued that peak flowering time might be a better measure (Miller-Rushing, Inouye & Primack 2008). Unfortunately, we did not have sufficient data to estimate peak flowering for the majority of our sites. However, we suspect that variation in population size among species would reduce signal in the data; thus, our evaluation of phylogenetic signal is probably conservative.

Phylogeny reconstruction

First, we constructed a phylogenetic tree for the complete set of taxa using the software program Phylomatic (Webb & Donoghue 2005), which matches a taxon list against a backbone phylogeny of plant family and genus-level relationships and returns a trimmed ‘megatree’ phylogeny for the group. For this analysis, we used a recent hypothesis from the Angiosperm Phylogeny Working Group (APG tree R20081027, archived at http://svn.phylodiversity.net/tot/megatrees/) as our backbone. Unresolved relationships between genera and all species within genera were treated as polytomies; given the very large number of taxa in our study, practical constraints precluded their manual resolution. We used the BLADJ algorithm in the program Phylocom (Webb, Ackerly & Kembel 2008) to make the branch lengths of the phylogeny proportional to time and known ages of plant fossils (Wikström, Savolainen & Chase 2001) as calibration. We refer to this topology as the Phylomatic tree.

Second, to evaluate sensitivity of our results to tree topology, we constructed an alternative phylogeny directly from DNA sequence data. For practical reasons (i.e. aligning a multigene DNA matrix across many thousands of taxonomically disparate species is a computationally challenging task, and many species are missing sequence data), we resolved the tree to genera and included species as polytomies, as in the Phylomatic tree. The phylogeny was assembled using RAxML (Stamatakis, Hoover & Rougemont 2008) and a GTR+G model for each of the seven DNA regions analysed. The support of the resulting tree was assessed using 100 bootstrap replicates. Finally, branch lengths were calibrated in millions of years by enforcing a relaxed molecular clock and multiple fossil calibrations in the software BEAST (Drummond et al. 2012). Further details of tree reconstruction are provided as Supporting Information. We refer to this topology as the ML tree. The two trees therefore differ in branch lengths and resolution above the genus level.

Subsequent analyses were performed using the ‘ape’ (Paradis, Claude & Strimmer 2004) and ‘picante’ (Kembel et al. 2010) libraries in R (http://www.R-project.org; R Development Core Team).

Evaluating phylogenetic conservatism

First, for each species, we determined the mean day of year for FF and FL across all available years for the global data set, first averaging FF and FL within sites and then averaging across sites. Second, we calculated the variance in flowering times as the standard deviation in FF between years for species with ≥5 observations (species × year) at a given site and also averaging across sites. Site data (Table 1) vary in both duration (number of years) and time period (historical sampling dates), and it is therefore possible that climate change in recent decades may have impacted some data sets more than others. Although comparison among species within sites should be unaffected, comparisons among species occupying different sites could be more sensitive (see 'Discussion').
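
The two-stage averaging described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the record layout `(site, species, year, day_of_year)` and the function names are assumptions made for illustration, and plain arithmetic means are used purely for brevity.

```python
from collections import defaultdict
from statistics import mean, stdev


def group_by_site_species(records):
    """Collect day-of-year values per (site, species) pair."""
    grouped = defaultdict(list)
    for site, species, _year, day in records:
        grouped[(site, species)].append(day)
    return grouped


def species_means(records):
    """Average within each site first, then across sites, for each species."""
    per_species = defaultdict(list)
    for (_site, species), days in group_by_site_species(records).items():
        per_species[species].append(mean(days))
    return {sp: mean(site_means) for sp, site_means in per_species.items()}


def species_sd(records, min_obs=5):
    """Between-year SD per site (only where >= min_obs records), averaged across sites."""
    per_species = defaultdict(list)
    for (_site, species), days in group_by_site_species(records).items():
        if len(days) >= min_obs:
            per_species[species].append(stdev(days))
    return {sp: mean(site_sds) for sp, site_sds in per_species.items()}


# Hypothetical records: species "A" at two sites, species "B" at one site.
records = [
    ("S1", "A", 2001, 100), ("S1", "A", 2002, 102), ("S1", "A", 2003, 104),
    ("S1", "A", 2004, 106), ("S1", "A", 2005, 108),
    ("S2", "A", 2001, 120), ("S2", "A", 2002, 124),
    ("S2", "B", 2001, 150),
]
```

Averaging within sites first prevents a long time series at one site from dominating the species mean; a species with fewer than five records at every site simply receives no variance estimate.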

Because it is difficult to define precisely the start and end of the growing season, particularly in aseasonal environments in the tropics, we used a circular transformation to convert day of year to radians and used the Circular R-library (Lund & Agostinelli 2011) to calculate means and variance. For each metric, we then quantified the strength of phylogenetic conservatism in the data using the K-statistic from Blomberg, Garland & Ives (2003) as implemented in the Picante R-library (Kembel et al. 2010). Because sample size of species for which we had data on FF was much larger than for the set of species with data on FL, we additionally estimated K for FF and FL on the subset of species with matching data on both to control for variation in sampling intensity when comparing traits. Next, we calculated the equivalent metrics separately for species within sites, thereby obtaining the mean for FF and FL, and standard deviation in FF for each species (across years) at a given site. In the latter analysis, we include only sites with >20 species because estimates of Blomberg's K derived from low sample sizes may be less reliable.
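
The circular transformation can be sketched as follows. This is an illustrative Python version rather than the R code used in the analysis; the 365-day year and the use of one minus the mean resultant length as the measure of circular variance are assumptions of the sketch.

```python
import math

DAYS_PER_YEAR = 365  # assumption: leap years are ignored in this sketch


def day_to_radians(day_of_year):
    """Map a day of year onto the circle [0, 2*pi)."""
    return 2 * math.pi * (day_of_year % DAYS_PER_YEAR) / DAYS_PER_YEAR


def _mean_components(days):
    """Mean sine and cosine of the transformed days."""
    angles = [day_to_radians(d) for d in days]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    return mean_sin, mean_cos


def circular_mean_day(days):
    """Circular mean, converted back to a day of year."""
    mean_sin, mean_cos = _mean_components(days)
    mean_angle = math.atan2(mean_sin, mean_cos) % (2 * math.pi)
    return mean_angle * DAYS_PER_YEAR / (2 * math.pi)


def circular_variance(days):
    """One common definition: 1 minus the mean resultant length (0 = no spread)."""
    mean_sin, mean_cos = _mean_components(days)
    return 1 - math.hypot(mean_sin, mean_cos)
```

For a species recorded flowering on days 360 and 5, the arithmetic mean (182.5) falls in midsummer, whereas the circular mean falls at the turn of the year, which is why the transformation matters where the growing season has no clear start or end.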

Blomberg's K compares the observed distribution of tip data to expectations derived from a Brownian motion model of evolution in which species differences accumulate over time in a manner analogous to a random walk, with expectation K = 1.0 for a Brownian motion model, and K = 0 for absence of phylogenetic conservatism. Because K is sensitive to tree resolution (Davies et al. 2012), we estimated phylogenetic conservatism by first thinning the phylogeny to one representative taxon per unresolved node, producing a maximally resolved tree topology, and then generated a distribution of Kthinned values by randomly resampling (n = 100) from the species subtending each polytomy. Significance in phylogenetic conservatism was estimated from the variance of phylogenetically independent contrasts relative to tip shuffling randomization on the complete tree, as implemented in the R-library Picante (Kembel et al. 2010).
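
The thinning and resampling step can be sketched in Python as below. The sketch assumes that each polytomy is supplied as a list of tip labels and that `statistic` is any callable returning a number for a set of retained tips; in the analysis this role is played by Blomberg's K computed on the pruned tree, which is not implemented here. All names are hypothetical.

```python
import random
from statistics import mean, stdev


def thin_once(polytomies, rng):
    """Keep one randomly chosen tip from each unresolved node."""
    return [rng.choice(sorted(members)) for members in polytomies.values()]


def thinned_distribution(polytomies, statistic, n_reps=100, seed=0):
    """Mean and SD of a statistic over repeated random thinnings."""
    rng = random.Random(seed)
    values = [statistic(thin_once(polytomies, rng)) for _ in range(n_reps)]
    return mean(values), stdev(values)


# Hypothetical polytomies: species within genera are unresolved.
polytomies = {
    "Acer": ["Acer rubrum", "Acer saccharum"],
    "Quercus": ["Quercus alba"],
}
```

Each replicate yields a maximally resolved set of tips, so the spread of the statistic across replicates reflects only the choice of representative taxa.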

Finally, to evaluate how data on one phenological event might help predict timing of another, we explored correlations between FL, FF and variance in FF using phylogenetic generalized linear models as implemented in the Caper R-library (Orme et al. 2012) with the lambda parameter, which measures phylogenetic signal, set to its maximum likelihood value (Freckleton, Harvey & Pagel 2002).

Adjusting for onset of spring across sites

The start of the growing season varies with climate, across latitudes and between years. To correct for differences in the onset of spring between sites, we used the Spring Indices (SI) first bloom model (henceforth SI; Schwartz 1997; Schwartz, Ahas & Aasa 2006) to standardize times of FF (equivalent to first bloom) by subtracting the estimated start of spring (first bloom) from observed FF dates in units of calendar days. The SI were developed to define the onset of spring using climate observations, which vary from year to year, instead of calendar days, and were calculated separately for each site and year. By standardizing by the SI, we effectively rescale time of FF relative to the start of spring as defined by local climate. The various spring indices are based on statistical models of phenology that require only location (latitude) and observational data (daily Tmin and Tmax from climate stations) as input. They can be computed anywhere daily climate data are available (even if phenological data are lacking), but are currently only appropriate for temperature-limited temperate sites; we therefore excluded the tropical and arid sites (Table 1) from our data set and subsequent analysis. Whilst the various Spring Indices have been largely developed using data from cloned lilacs, they have proven useful for predicting phenology in related species (Schwartz, Ahas & Aasa 2006; Schwartz, Ault & Betancourt 2012). We re-evaluated global phylogenetic conservatism for the standardized SI FF values using Blomberg's K, as described above. Because sufficient quality climate data were only available for a subset of sites (Fargo, Chinnor, Gothic, Harvard, Konza and Mohonk) and taxa (n = 718) to calibrate SI between years, for comparison, we also re-estimated the global K for unstandardized FF across the taxa represented within this same subset of species.
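
The standardization itself is a subtraction performed separately for each site and year, as the following minimal Python sketch shows. The record layout, the dictionary of SI first bloom dates and all values are hypothetical; the SI model that produces the spring onset dates is not reproduced here.

```python
def standardize_ff(ff_records, spring_onset):
    """Express first-flower dates in days relative to the local onset of spring.

    ff_records: iterable of (site, year, species, ff_day_of_year).
    spring_onset: dict mapping (site, year) to the SI first bloom day of year.
    Site-years without an SI estimate are dropped.
    """
    standardized = []
    for site, year, species, ff_day in ff_records:
        onset = spring_onset.get((site, year))
        if onset is None:
            continue  # no climate-based estimate for this site and year
        standardized.append((site, year, species, ff_day - onset))
    return standardized


# Hypothetical inputs, for illustration only.
ff_records = [
    ("Harvard", 2000, "Acer rubrum", 110),
    ("Konza", 2000, "Acer rubrum", 95),
    ("BCI", 2000, "Species X", 40),
]
spring_onset = {("Harvard", 2000): 120, ("Konza", 2000): 100}
```

Negative values indicate flowering before the modelled onset of spring; the tropical record is dropped because no SI estimate exists for it.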

Phylogenetic overlap among sites

If species are phylogeographically clustered such that co-occurring species tend to fall within one or a few clades, it is possible that phenology might appear to be phylogenetically conserved even if largely determined environmentally, because close relatives will be exposed to the same suite of environmental cues. To evaluate phylogeographical structure, we first summed the phylogenetic distances among taxa on the subtrees connecting species within each site. We then compared these values with a null distribution generated from resampling the equivalent number of species at random from the species pool of all sites combined (n = 999). If species within sites are more closely related than species among sites (phylogeographically clustered), we would expect, on average, the observed sum of the branch lengths of the subtrees for each site to be less than those obtained from the randomizations. Importantly, in many sites, the sampled taxa represent only a subset of the complete flora, and we do not have comprehensive data on the pool of Northern Hemisphere angiosperms to evaluate evidence for environmental filtering sensu Webb et al. (2002) or phylogenetic niche conservatism sensu Wiens & Graham (2005). Nonetheless, a significant signal for clustering might indicate a spatial component in our estimates of global K.
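
The randomization can be sketched in Python as follows. The sketch makes several assumptions that are not specified in the text: phylogenetic distances are supplied as a dictionary keyed by unordered species pairs, the site-level statistic is the sum of pairwise distances, and the P-value is the one-tailed proportion of null values no greater than the observed value. Function names are hypothetical.

```python
import random
from itertools import combinations


def summed_distance(species, dist):
    """Sum of pairwise phylogenetic distances among a set of species."""
    return sum(dist[frozenset(pair)] for pair in combinations(sorted(species), 2))


def clustering_test(site_species, pool, dist, n_reps=999, seed=0):
    """Compare a site's summed distance with random draws from the species pool."""
    rng = random.Random(seed)
    pool = sorted(pool)
    observed = summed_distance(site_species, dist)
    null = [
        summed_distance(rng.sample(pool, len(site_species)), dist)
        for _ in range(n_reps)
    ]
    # One-tailed test for clustering: how often is a random draw at least as compact?
    p_value = (1 + sum(value <= observed for value in null)) / (n_reps + 1)
    return observed, p_value


# Hypothetical four-species pool: A and B are sisters, as are C and D.
dist = {frozenset(pair): 10 for pair in combinations("ABCD", 2)}
dist[frozenset("AB")] = 2
dist[frozenset("CD")] = 2
```

In this toy pool, one third of random two-species draws are sister pairs, so a site containing A and B is not significantly clustered; larger assemblages drawn from a single clade would give small P-values.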

Results

The Phylomatic phylogenetic tree for the composite data set is 25% resolved and includes over 3800 taxa. The Newick tree file is included as Supporting Information. The molecular phylogeny is fully resolved and contains 1246 genera. The ML tree and associated DNA matrix are archived in the Dryad online data repository (http://dx.doi.org/10.5061/dryad.td03p886). We focus here on results from the ML tree because relative estimates of phylogenetic conservatism were qualitatively similar for both the Phylomatic and ML topologies; where applicable, matching results for the Phylomatic tree are provided as Supporting Information.

Phylogenetic conservatism in first flower and first leaf

Globally, closely related species tend to have comparable phenologies, flowering and leafing at similar times of year (Fig. 1).

Figure 1.

Phylogenetic distribution of day of year for (a) first flower (FF), (b) variation in FF and (c) first leaf (FL) on the ML tree topology for the global data set. Branches are shaded in proportion to the weighted average of descendant tips; we present this figure for illustration only and caution against overinterpreting ancestral states. A high-resolution image with species names is included as Supporting Information (Fig. S4). Matching illustrations for FF at Harvard and Chinnor are provided as supplementary Figs S1 and S2, respectively.

Across sites, FL and the mean and variance of FF show significant phylogenetic conservatism (all P < 0.001 from randomizations, Table 2), but depart from strict Brownian expectations (K < 1). Strength of phylogenetic signal in FL was greater than for FF (Kthinned = 0.50 ± 0.01 and 0.32 ± 0.01; mean ± SD for FL and FF respectively; Table 2); however, K-values converged when estimated across the matching set of taxa with data on both FF and FL (Kthinned = 0.51 ± 0.005 and 0.52 ± 0.002; mean ± SD for FL and FF respectively, N = 137). In addition, there was a significant positive relationship between FL and FF, although the correlation strength was relatively weak (r² = 0.07, t = 3.15, and P < 0.01 on 123 degrees of freedom, λ = 0.81, from the regression of FL against FF for taxa matched by site and after correcting for phylogenetic non-independence assuming the ML tree topology). In some cases, correlations appeared stronger when regressions were performed within sites (r² = 0.21 and 0.14, correcting for phylogenetic non-independence for Harvard and Sevilleta, respectively), but the number of sites with sufficient data on both FL and FF was limited, and in a third site, Soederstroem, the relationship was not significant (r² = 0.01, P = 0.293 on 24 degrees of freedom, correcting for phylogenetic non-independence). Across sites, there was also a weak, but highly significant trend for late flowering species to be less variable in flowering times than earlier flowering species (r² = 0.06, t = −10.12, P < 0.01 on 1621 degrees of freedom, and r² = 0.06, t = −10.21, P < 0.01 on 1621 degrees of freedom, λ = 0.910 from the regression of FF against variance in FF across species and after correcting for phylogeny, respectively). On average, the correlation was stronger within sites. Particularly high r² values were found for Marsham (0.46), Herms Ohio and Michigan (0.69), and Arnell 1877 (0.62), but sample sizes of species were relatively low (species number = 11, 11 and 26, respectively), and no significant correlation was observed for Harvard (n = 31), WPS (n = 33), Gothic (n = 80), Mohonk (n = 18) and OPG (n = 19).

Table 2. Strength (mean ± SD) and significance of phylogenetic signal in times of first flower (FF), first leaf (FL) and variance in first flower (var[FF]) across sites, estimated on the ML and Phylomatic trees, respectively
| Trait | Kthinned (ML) | P (ML) [a] | Kthinned (Phylomatic) | P (Phylomatic) [a] |
|---|---|---|---|---|
| FF | 0.322 ± 0.011 [b] | <0.001 | 0.246 ± 0.011 | <0.001 |
| FL | 0.502 ± 0.001 | <0.001 | 0.400 ± 0.014 | <0.001 |
| var(FF) | 0.395 ± 0.004 | <0.001 | 0.263 ± 0.004 | <0.001 |

[a] P-values estimated from the variance of phylogenetically independent contrasts relative to tip shuffling randomization on the complete tree, as implemented in the R-library Picante (Kembel et al. 2010).
[b] K for FF estimated on the matching set of taxa with data for both FF and FL (N = 137) converges on estimates of K for FL (K ≈ 0.51).

Within sites, phylogenetic conservatism for species' mean phenology (FL and FF) was uniformly greater than observed across the global data set, with the largest increase for FF (Table 3; see also Table S1 in Supporting Information). At the site level, K for FL and FF was similar (median Kthinned = 0.62 and 0.77 for FL and FF respectively; Table 3); however, whilst phylogenetic clustering for FF was significant for the majority of sites, K for FL did not differ significantly from random (with the exception of Sevilleta). The Harvard site was notable in demonstrating much stronger conservatism in FF (Kthinned = 1.42 ± 0.065; mean ± SD; see Fig. S1 in Supporting Information) than any other site, but again phylogenetic signal in timing of FL was not significantly different from random and sample size of species was low. Chinnor, with larger sample size of species and high K, provides a useful illustration of phylogenetic conservatism in FF (see Fig. S2 in Supporting Information), but lacks data for FL. Significance varied among sites, probably reflecting differences in sample sizes of species, site ecologies and accuracy of the underlying site-level phylogenetic trees. For example, Gunnar, Sevilleta and Soederstroem demonstrated significant phylogenetic signal in FF on the Phylomatic tree but not the ML tree. In contrast, K for FF estimated on both the Phylomatic and ML tree was not significant for BCI or OPG. Sampling of species in OPG was relatively poor (n = 20), perhaps limiting statistical power, but BCI includes records for over 100 species and taxonomic breadth is large. BCI represents a tropical biome, and it is possible that phylogenetic conservatism in phenology may be harder to detect when the growing season is less well defined, and phenologies can vary dramatically among species (e.g. species may flower multiple times each year or only supra-annually; Newstrom, Frankie & Baker 1994).

Table 3. Strength and significance of phylogenetic signal in times of first flower (FF), first leaf (FL) and variance in first flower (var[FF]) within sites, estimated on the ML tree
| Site | FF Kthinned ± SD | FL Kthinned ± SD | var(FF) Kthinned ± SD |
|---|---|---|---|
| Arnell 1877 | 1.018 ± 0** | | 0.853 ± 0 |
| Arnell | 0.659 ± 0.025** | | |
| BCI | 0.606 ± 0.025 | | |
| Concord | 0.780 ± 0.017** | | 0.477 ± 0.010* |
| Fargo | 0.738 ± 0.014** | | 0.453 ± 0.010 |
| Chinnor | 1.078 ± 0.033** | | 0.545 ± 0.013** |
| Gothic | 0.533 ± 0.019* | | 0.550 ± 0.010 |
| Gunnar | 0.825 ± 0.001 | | 0.720 ± 0.003 |
| Harvard | 1.415 ± 0.065** | 0.514 ± 0 | 0.562 ± 0.111 |
| Kochmer | 0.537 ± 0.012** | | |
| Konza | 0.951 ± 0.024** | | 0.567 ± 0.019 |
| Luquillo | 0.699 ± 0.004* | | 0.520 ± 0.005 |
| OPG | 0.903 ± 0.016 | | 0.846 ± 0.009 |
| Robertson | 0.831 ± 0.049** | | |
| Sevilleta | 0.575 ± 0.017 | 0.515 ± 0* | 0.487 ± 0.007 |
| Soederstroem | 0.621 ± 0.014 | 0.826 ± 0 | 0.546 ± 0.007 |
| UWM | | 0.720 ± 0 | |
| Washington, D. C. | 0.650 ± 0.027** | | 0.480 ± 0.010** |
| WPS | 0.796 ± 0.006* | | 0.812 ± 0.023** |

*K significant from random at P < 0.05; **K significant from random at P < 0.01.

Contrasting with the global results, we observed only weak evidence for significant phylogenetic conservatism in variance in FF within sites. In addition, strength of phylogenetic conservatism for variance in FF was lower (median Kthinned = 0.55) than for mean FF values, and the increase in K for variance in FF within sites relative to global K was less than the equivalent increase observed for mean FF. It remains possible that strength of phylogenetic conservatism in variance might itself vary along an environmental gradient, such that variance is highly conserved in some environments, but only weakly conserved in others, and that averaging across sites masks this variation. Although we observe large variation in K-values for variance in FF between sites, only a few sites depart significantly from random expectations (Table 3; see Table S1).

Adjusting for onset of spring across sites

Contrary to predictions if phenotypic plasticity explains lower conservatism across sites than within sites, we found that the global FF standardized by SI shows weaker, not stronger, phylogenetic conservatism than the unstandardized flowering times (Kthinned = 0.75 ± 0.019 versus 0.87 ± 0.020 for standardized and raw FF, respectively, for the subset of species with data on both mean and variance in FF).

Phylogenetic overlap among sites

The sum of the phylogenetic branch lengths within sites was significantly less than from randomizations shuffling species among sites (P < 0.05), indicating phylogeographical clustering of floras. The trend for clustering was also evident at the individual site level, with most sites encompassing less phylogenetic diversity than median expectations based upon sampling the same number of species at random from the phylogeny, although significance varied between sites (see Fig. S3 in Supporting Information). Two sites were found to show significant phylogenetic overdispersion: Washington, D. C. and Kochmer. Phylogenetic overdispersion was unexpected. Both sites sample from a larger geographical extent than the remaining sites; the Washington, D. C. data set represents an amalgamation of smaller sites from urban Washington, D. C. and the nearby neighbourhoods (and includes some non-native flora), whereas the Kochmer data set contains records from across North and South Carolina (Table 1). Perhaps it is therefore unsurprising that these two sites might also sample from a broader phylogenetic pool.

Discussion

Here, we have explored two key life-history traits in plants: timing of first flower (FF) across >4000 species and first leaf (FL) across ~200 species. We show that more closely related species tend to flower and leaf at similar times of the year but that timing of FL is only weakly correlated with time of FF. Evidence for phylogenetic signal in plant phenological traits has been reported previously (e.g. Kochmer & Handel 1986; Bolmgren & Cowan 2008; Willis et al. 2008; Davis et al. 2010), but our study is the first to do so at such large taxonomic and spatial scales. Significance and differences in the relative strength of phylogenetic conservatism within and across sites, and across phenological traits, were robust to differences in reconstructed phylogenies using very different methods.

Phylogenetic conservatism of phenology: traits or geography

We found significant phylogenetic conservatism in flowering times when aggregating data from multiple sites spanning temperate and tropical biomes, with some lineages, for example, asters, flowering later in the year, and others, for example, within Myrtales, flowering earlier in the year. Our results are perhaps surprising because phenology is thought to be driven largely by external environmental cues: species within different environments are therefore predicted to demonstrate different phenologies; and even the same species might vary in phenology if exposed to alternate cues (Wolfe et al. 2005; Schwartz, Ahas & Aasa 2006). However, genetically based variation in flowering time has been well documented within (e.g. Mazer 1987; Stinchcombe et al. 2004; Wilczek et al. 2009; Exner & Zabala 2010; Rhoné et al. 2010) and among populations (Tarasjev 1997; Chamorro & Sans 2010; Kawai & Kudo 2011), and among closely related taxa (e.g. Debussche, Garnier & Thompson 2004; Brearley et al. 2007).

We suggest two broad explanations for phylogenetic conservatism in phenological traits at this gross spatial scale. First, genetically based trait conservatism – plant physiology might dictate species sensitivity to particular cues, hence closely related species that share a large portion of their evolutionary history would be expected to also share similar physiologies and sensitivities (Harvey & Pagel 1991). Second, geographical conservatism – closely related species might be more likely to co-occur because of phylogenetic niche conservatism sensu Wiens & Graham (2005), environmental filtering (Webb et al. 2002), or because of shared centres of origin (Bremer 1992). Closely related species might then share similar phenologies simply because they grow in – and are adapted to – similar environments. We evaluated evidence for phylogeographical patterns in species distributions by comparing the phylogenetic diversity represented by species within sites to a null model selecting species at random from the species pool of all sites combined. Perhaps unsurprisingly, we show that species within sites tend to be phylogenetically clustered. Does our finding of phylogenetic conservatism in phenology therefore simply reflect phylogeography and local responses to environment? We addressed this question by investigating the strength of phylogenetic conservatism in flowering times for species within each site separately, thereby removing the confounding effect of spatial location.

If phylogenetic signal were an emergent product of phylogeography and plastic responses to local environmental cues, such as temperature, precipitation and photoperiod, we would predict weak or no signal within sites. Species within each site are exposed to the same suite of environmental cues, although cues will of course vary across the growing season. We observed considerable variation in strength of conservatism between sites, possibly reflecting variation in taxonomic membership (e.g. the Harvard plots included only woody species, so much of the variation observed in other sites across life forms was not included), data quality (e.g. ‘flower baskets’ that catch falling blossoms versus observational data) and ecological attributes. Nonetheless, in all cases, strength of phylogenetic conservatism for mean FF was greater within sites than observed globally, and in several cases, phylogenetic conservatism was several times greater within a site than across sites. Our results suggest strongly intrinsic phylogenetic conservatism in phenological traits that is most apparent when species are exposed to the same suite of extrinsic environmental cues.

Species' phenological schedules might be attuned to local environmental conditions through either local adaptation or phenotypic plasticity. There is increasing evidence that evolutionary adaptation can occur over ecologically relevant time-scales (e.g. Franks, Sim & Weis 2007; Schoener 2011); nonetheless, at least for perennials, interannual variation in flowering times within sites probably represents phenological plasticity. Weaker phylogenetic signal in mean FF detected across sites (relative to within sites) therefore suggests that it is not the particular day of year that traits are expressed that is phylogenetically conserved, but rather species' responses to environmental cues, which vary from site to site. In very different environments, we might then expect close relatives to flower at very different times, but nonetheless still share similar responses to drivers. Thus, we suggest that phylogenetic conservatism for FF tends to be weaker at larger spatial scales because related lineages might have dispersed to different climate or day-length regimes and subsequently converged on different phenological optima better suited to their local environments.

If our interpretation of phylogenetic conservatism in phenology is correct and temperature is the dominant phenological cue, then standardizing flowering times by the start of spring should help align species' phenologies between sites (Schwartz 1997; Schwartz, Ahas & Aasa 2006). Therefore, we predicted that phylogenetic conservatism in flowering times standardized by the Spring Indices (SI) would be greater than that observed for calendar days and should converge on within-site estimates. However, we found that across sites, standardized timings exhibited weaker phylogenetic conservatism when compared with uncorrected day of year. We propose two explanations for why controlling statistically for among-site variation in the SI resulted in a decrease in phylogenetic conservatism. First, a trend for weaker phylogenetic conservatism across sites for the standardized flowering times is consistent with local adaptation – species' phenologies have evolved towards site-specific optima that cannot be simply aligned by correcting for timing of spring. Second, the SI models were designed to simulate the response of species' phenologies that are primarily driven by temperature. Phenologies for species that respond to a wider range of climate variables (e.g. snowpack, precipitation, irradiance, etc.) or photoperiod will not be well modelled by the SI. Thus, standardizing by SI across species with a broad diversity of phenological drivers might not be appropriate.
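
The logic of the standardization can be illustrated with a minimal Python sketch. It assumes that standardizing amounts to subtracting a site-level spring onset date from each first-flower date; this is a simplifying assumption for illustration, not a description of the procedure used with the SI models. The function name, site labels, species labels and day-of-year values are all hypothetical.

```python
def standardize_by_spring_onset(first_flower_doy, spring_onset_doy):
    """Express first-flower dates as days after each site's spring onset."""
    return {
        (site, species): doy - spring_onset_doy[site]
        for (site, species), doy in first_flower_doy.items()
    }


# Hypothetical values, for illustration only (day of year).
spring_onset = {"warm_site": 80, "cool_site": 110}
first_flower = {
    ("warm_site", "species_a"): 95,
    ("cool_site", "species_a"): 125,
    ("cool_site", "species_b"): 160,
}

relative = standardize_by_spring_onset(first_flower, spring_onset)
```

Here the hypothetical species_a flowers 30 calendar days apart at the two sites but 15 days after spring onset at both, which is the alignment predicted for a purely temperature-driven species. A species cued by photoperiod or snowpack would not be aligned by such an offset.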

Phylogenetic conservatism in variability of flowering times

Our finding of stronger phylogenetic conservatism in flowering times within sites than across sites suggests that species are responding to site-specific cues through local adaptation and/or phenological plasticity (Wilczek et al. 2010). We also show that variance in flowering time is conserved on phylogeny, such that closely related species tend to demonstrate similar variability in flowering times. However, conservatism for variance in flowering time within sites (which we cautiously interpret here as indicative of phenotypic plasticity) was for the most part non-significant and only marginally greater in magnitude than the strength of conservatism observed across the global data set (which might reflect adaptation and/or plasticity). Further, we find support for predictions that early-season species' phenologies should be more sensitive to abiotic cues, for example, because costs of mistimed phenology may be higher (Pau et al. 2011) and therefore demonstrate greater variance in flowering times.

The importance of phenotypic plasticity is well recognized in plant phenology (reviewed in Sultan 2004), and it is perhaps the most relevant indicator of sensitivity to climate change (e.g. Willis et al. 2008). Although variance in flowering times might be a product of local adaptation and/or plastic responses, we found that the strength of covariation with phylogeny was relatively weak (in comparison with first flower). We suggest other factors, such as the period within the growing season that a species flowers (Pau et al. 2011), and local adaptation to alternate environmental cues might have greater direct influence on species variability than taxonomic membership. Identifying the factors that determine phenological plasticity remains a major challenge.

Correlations between leafing and flowering

We explored evidence for covariation between FF and FL, controlling for the shared evolutionary history of taxa (Felsenstein 1985) and found a significant positive correlation between them. Perhaps it is unsurprising that species that flower early should also leaf early. Both FF and FL have been shown to advance (occur earlier) in response to increasing temperatures when evaluated together (e.g. Wolfe et al. 2005; Gordo & Sanz 2009) and are considered to represent similar functional responses to climate (Parmesan & Yohe 2003; Root et al. 2003; Cleland et al. 2007). However, as far as we are aware, their linked evolutionary responses have not been evaluated previously. If shifts in the timing of FF and FL capture similar responses to climate, then the two might be more or less interchangeable. Usefully, FL can be indexed using remote sensing data on ‘green-up’ derived from the normalized difference vegetation index (NDVI) (Zhou et al. 2001; but see White et al. 2009; Schwartz & Hanes 2010), providing an approach for describing plant responses at much larger geographical scales than possible using site-specific observations, which are required for FF. However, we find that the strength of the correlation between FF and FL may vary among different locations. It is possible that FF and FL might demonstrate qualitatively different responses to climate change because they are responding to cues in different parts of the season.
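
For readers unfamiliar with the approach, the following Python sketch outlines phylogenetically independent contrasts in the sense of Felsenstein (1985) and the correlation of two traits through the origin of their contrasts. It assumes a fully bifurcating tree with positive branch lengths below the root. The toy tree, the trait values and the function names `contrasts` and `contrast_correlation` are hypothetical and do not reproduce the analysis reported here.

```python
import math

# Hypothetical toy tree. A node is (content, branch_length): content is a
# species name for a tip, or a pair of child nodes for an internal node.
TREE = (
    (
        ((("sp1", 1.0), ("sp2", 1.0)), 1.0),
        ((("sp3", 1.0), ("sp4", 1.0)), 1.0),
    ),
    0.0,
)


def contrasts(node, trait):
    """Return (node value, adjusted branch length, standardized contrasts)."""
    content, length = node
    if isinstance(content, str):
        return trait[content], length, []
    left, right = content
    x1, v1, c1 = contrasts(left, trait)
    x2, v2, c2 = contrasts(right, trait)
    # Difference between sister lineages, scaled by its expected variance.
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral value: average weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch below this node to reflect estimation error.
    return value, length + v1 * v2 / (v1 + v2), c1 + c2 + [contrast]


def contrast_correlation(tree, trait_x, trait_y):
    """Correlation of two sets of contrasts, forced through the origin."""
    cx = contrasts(tree, trait_x)[2]
    cy = contrasts(tree, trait_y)[2]
    sxy = sum(a * b for a, b in zip(cx, cy))
    return sxy / math.sqrt(sum(a * a for a in cx) * sum(b * b for b in cy))


# Hypothetical first-flower and first-leaf dates (day of year).
first_flower = {"sp1": 100, "sp2": 110, "sp3": 150, "sp4": 170}
first_leaf = {"sp1": 90, "sp2": 95, "sp3": 120, "sp4": 130}

r = contrast_correlation(TREE, first_flower, first_leaf)
```

Four tips yield three contrasts, each statistically independent under a Brownian motion model of trait evolution. A positive `r` indicates that lineages diverging towards later flowering also tend to diverge towards later leafing.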

Conclusion

We have shown that the timing of life-history events covaries with phylogeny such that more closely related species tend to flower and leaf at similar times. Critically, any search for drivers of phenological events must therefore consider phylogeny because species cannot be treated as statistically independent. To date, there have been only a few phylogenetic comparative analyses of phenological traits (see Jia et al. 2011 and Lessard-Therrien, Davies & Bolmgren 2013 for two recent examples). Last, evidence of significant phylogenetic conservatism in species flowering times indicates that the shape of the response curve linking phenology to environment might be evolutionarily constrained. Whilst our study highlights the large potential for climate tracking through phenological plasticity or rapid adaptive evolution, significant phylogenetic conservatism suggests that there may be some limits to plant responses (Wiens et al. 2010). It is possible that phylogenetic conservatism might inhibit species from evolving appropriate phenological responses to new climate landscapes as species approach their phenotypic limits defined by standing genetic variation, potentially destabilizing the tight network of species interactions and trophic links that define ecological communities (Parmesan 2006; Thackeray et al. 2010; Burkle, Marlin & Knight 2013).

Acknowledgements

This work was conducted as a part of the ‘Forecasting Phenology’ Working Group supported by the National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant #EF-0553768), the University of California, Santa Barbara, and the State of California. Additional support was also provided by the USA – National Phenology Network (USA-NPN) and its Research Coordination Network (supported by National Science Foundation Grant IOS-0639794). Support for EMW came from NSF's Postdoctoral Fellow program (Grant DBI-0905806). We thank B. McGill for his contribution to the working group. Special thanks to all data holders and curators, including (in no particular order) David Inouye and George Aldridge (Gothic), Paul Huth, Shanan Smiley and John Thompson (Mohonk Preserve), John O'Keefe (Harvard), Tim Sparks (Marsham), Richard Primack and Abe Miller-Rushing (Concord), Dan Herms (OPG and Herms data from Ohio and Michigan), Jess Zimmerman, Chris Nytch and Jimena Forero-Montaña (Luquillo), K. Vanderbilt and K. Wetherill (Sevilleta), Joe Wright (BCI), Sylvia Orli, with special acknowledgement of the two main data contributors, Aaron Goldberg and the late John Wurdack (Washington, D. C.), A. H. Fitter with special acknowledgement of the late R. S. R. Fitter (Chinnor, UK) and to M. Lechowicz, who provided an electronic version of the data from the late T. Mikesell (Wauseon). Ethel Johansson kindly shared the data collected by her late husband, Gunnar Johansson.

Significant funding for the collection of some data was provided by the NSF LTER program (NSF Grant numbers BSR 88-11906, DEB 9411976, DEB 0080529 and DEB 0217774). Data collection for Washington D. C. was supported by A. Goldberg and J. Wurdack and is organized through the Department of Botany, Smithsonian National Museum of Natural History. Data from Gothic were supported by NSF grant DEB 0238331 and 0922080; data from Luquillo NSF DEB grants: #9411973, #0080538, #0218039, #0620910, #0614659, #0218039. Some data used in this publication were obtained by scientists of the Hubbard Brook Ecosystem Study. The Hubbard Brook Experimental Forest is operated and maintained by the Northeastern Research Station, US Department of Agriculture, Newtown Square, Pennsylvania. Data for Konza were supported by Konza Environmental Education Program (KEEP). The SI models were developed using phenological data that now reside and are available through the National Phenology Database at the USA National Phenology Network.
