Climate refugia, locations where taxa survive periods of regionally adverse climate, are thought to be critical for maintaining biodiversity through the glacial–interglacial climate changes of the Quaternary. A critical research need is to better integrate and reconcile the three major lines of evidence used to infer the existence of past refugia – fossil records, species distribution models and phylogeographic surveys – in order to characterize the complex spatiotemporal trajectories of species and populations in and out of refugia. Here we review the complementary strengths, limitations and new advances for these three approaches. We provide case studies to illustrate their combined application, and point the way towards new opportunities for synthesizing these disparate lines of evidence. Case studies with European beech, Qinghai spruce and Douglas-fir illustrate how the combination of these three approaches successfully resolves complex species histories not attainable from any one approach. Promising new statistical techniques can capitalize on the strengths of each method and provide a robust quantitative reconstruction of species history. Studying past refugia can help identify contemporary refugia and clarify their conservation significance, in particular by elucidating the fine-scale processes and the particular geographic locations that buffer species against rapidly changing climate.
I. Climate refugia: biogeographical and conservation significance
Rapidly growing concern about contemporary climate change impacts on biodiversity has renewed interest in understanding biotic responses to past shifts in climate, and in applying that knowledge to guide current land management (Dawson et al., 2011; Dietl & Flessa, 2011; McMahon et al., 2011). The climates of the glacial cycles of the Pleistocene, in particular the cold period of the Last Glacial Maximum (LGM, from 26 000 to 19 000 calendar yr before present, but cold conditions also prevailed until 14 500 yr ago; Clark et al., 2009), re-organized ecosystems, altered species abundances and forced large-scale movements of taxa, thereby changing biodiversity distribution patterns (Davis, 1976; Huntley & Webb, 1989). At the same time, some locations functioned as climate refugia, preserving local habitats that enabled species to persist in an otherwise inhospitable region, from which they expanded when conditions improved (Box 1; Provan & Bennett, 2008; Keppel et al., 2012). Identifying and characterizing climate refugia provides an important context for understanding the development of modern distributions of species, traits and local adaptation (Davis & Shaw, 2001; Jansson & Dynesius, 2002; Hewitt, 2004; Petit et al., 2005).
Box 1. Defining climate refugia
Several recent reviews provide useful conceptual discussions and definitions of refugia (Bennett & Provan, 2008; Holderegger & Thiel-Egenter, 2009; Rull, 2009; Ashcroft, 2010; Hampe & Jump, 2011; Dobrowski, 2011; Mosblech et al., 2011; Keppel et al., 2012; Tzedakis et al., 2013; Hampe et al., 2013). Here we restate an emerging consensus on the application of the term ‘refugia’ and include a few important modifiers of the term. Bennett & Provan (2008) provided a conceptual classification of refugia, and emphasized that a refugium represents a bottleneck of a population. More recent discussions emphasize the geographic context of refugia. Ashcroft (2010) and Dobrowski (2011) take a bottom-up approach by identifying refugia as physiographic locations that sustain a climate that has become, or is becoming, lost due to climate change. Keppel et al. (2012) defined refugia as locations to which taxa retreat during periods of regionally adverse climate and expand from when conditions become more suitable for the taxa. Tzedakis et al. (2013) state that refugia are locations that provide habitats for the long-term persistence of populations. Thus, a location is classified as a refugium only if the supported population persists to the present. The refugium may represent the entire distribution of a species and be the last holdout before extinction or be an isolated population that is disjunct from a more extensive distribution elsewhere. Refugia may be located within a former distribution as a result of a range contraction (in situ refugia or climate relicts); such populations persisting through several glacial cycles are long-term refugia (Tzedakis, 1993). By contrast, refugia may be located distant from a former distribution (ex situ refugia) as a result of a migration out of a distribution that later is extirpated (Holderegger & Thiel-Egenter, 2009). A refugium that is highly spatially restricted (e.g. limited to a local topographic feature) may be considered a microrefugium (Rull, 2009). The term ‘cryptic refugia’ has also been applied to microrefugia because of the improbability of detecting their specific locations from the fossil record (Provan & Bennett, 2008). The various biogeographic scenarios involving refugia from glacial to interglacial, and onwards to a super-interglacial climate, are shown in Fig. 1.
Phytogeographers have long speculated that centers of endemism and isolated disjunctions are the legacy of species persistence in Pleistocene refugia (Hultén, 1937; Braun, 1955; Whittaker, 1961; Daubenmire, 1975). More recently, high species richness and endemism were found to be partly explained by local climatic stability (Araújo et al., 2008; Médail & Diadema, 2009; Sandel et al., 2011). However, current species distributions in the absence of other evidence provide only limited insight into past range dynamics, as modern patterns of disjunction and endemism alone cannot discriminate among alternative histories (e.g. Reid, 1899; MacArthur, 1972). Pioneering paleoecologists (e.g. Deevey, 1949; Van der Hammen et al., 1971; Davis, 1976) had proposed a central role of low-latitude refugia as locations where species survived through the LGM. However, more recent paleoecological and genetic research has provided support for a long-suspected alternative model (McGlone, 1985) in which certain species persisted through glacial periods in high-latitude refugia that contributed to post-glacial expansion (McLachlan et al., 2005; Stewart et al., 2010; Mosblech et al., 2011). Present-day species distributions also support this view of historical refugia, although the specific geographic context may differ for glacial vs interglacial refugia. High elevations offer locally benign conditions under interglacial conditions for many cold-adapted species, whereas south-exposed and particularly protected areas could have offered benign conditions for warm-adapted species under glacial conditions (Birks & Willis, 2008). Although the emergence of broad acceptance of high-latitude glacial refugia may appear to constitute a paradigm shift, a recent review (Tzedakis et al., 2013) has claimed that in Europe evidence of glacial refugia has been too readily accepted and that all forms of evidence require careful scrutiny.
High-latitude refugia may help resolve the ‘Quaternary conundrum’ (Botkin et al., 2007) – namely, that species distribution models that map ecological niches onto future climate surfaces often suggest high rates of extinction in coming decades (e.g. Thuiller et al., 2005); by contrast, there are few recorded cases of species extinction during the late Quaternary that are clearly attributable to climate change (Jackson & Weng, 1999; Barnosky et al., 2004; Birks, 2008; Magri & Palombo, 2013). This apparent contradiction between future projections and past evidence suggests that we may have underestimated the role of (micro)refugia, local adaptation and/or inherent tolerance to environmental change (e.g. through phenotypic plasticity; Pearson et al., 2006; Aitken et al., 2008; Willis & Bhagwat, 2009; Hof et al., 2011).
Recognizing the location and effectiveness of a refugium requires the passage of time, and thus historical evidence provides much of what we understand about refugia today. Evidence for past climate refugia comes from three independent but highly complementary lines of research: fossil records, phylogeography and species distribution modeling. Paleoecologists, geneticists and species distribution modelers all have well-developed methods for identifying past refugia (Jackson, 2006; Petit et al., 2008; Hu et al., 2009; Nogués-Bravo, 2009). The approaches are potentially synergistic and the breadth of information available from them is immense. However, there exists a need for improved quantitative tools that can harness the joint power of these independent lines of evidence. More dialog among disciplines is necessary to better understand their respective strengths and limitations, and to reconcile the sometimes contradictory inferences (Rodríguez-Sánchez et al., 2010; Fig. 2). Moreover, understanding ecological processes that maintain refugial ecosystems requires a challenging crossover of long timescales (i.e. the late Quaternary) and fine spatial scales (i.e. the spatial reality of where refugia occurred). There is, then, considerable potential to bridge the gaps between paleoecology, phylogeography and species distribution modeling by integrating paleoecological analyses that span the LGM, using new genomic tools that reveal genetic variation at unprecedented resolution, and climate modeling that highlights the fine-grained spatial context for refugia and migration. Publication trends clearly show that researchers are increasingly using a combination of approaches to reconstruct refugia: among the publications addressing refugia, the proportion that refer to more than a single approach has more than doubled over the last 15 yr (Fig. 3).
In this paper we review the strengths, limitations and approaches commonly used to reconstruct species distributions with respect to refugia. We then present three examples that demonstrate how integrating across two or more approaches has led to important new insights about refugia. We conclude with a discussion of potential new integrative approaches, and the relevance of studying past glacial refugia for informing future climate refugia.
II. Approaches for reconstructing refugia: strengths, limitations, and recent advances
Below we provide a brief review of the three major approaches used to identify past refugia. We focus on the sensitivity and reliability of each approach, and the potential for false-positive and false-negative inferences regarding the presence of a species in a refugium (Table 1).
|Method||False-positive errors||False-negative errors||Both types of error|
|Fossil records||Fossil with extra-local provenance (e.g. long-distance pollen dispersal); redeposited fossils from earlier time periods; with respect to environmental DNA, a potential for laboratory contamination||Low detection probability (e.g. due to small populations and/or poor fossilization potential)||Misidentification of fossils; chronological error and re-working of sediments|
|Phylogeography||Areas of high diversity interpreted as refugia are the result of phalanx colonization, past hybridization, or secondary contact zones; apparently endemic alleles not yet discovered in other parts of geographic range; endemic alleles at colonizing front are the result of leading-edge dynamics or allele surfing||Low genetic variation or poor resolution of genetic variation interpreted as recent colonization; sampling is limited to extant populations||Center of genetic diversity shifts through time (range shifts)|
|Species distribution modeling||Dispersal limitation (potential habitat identified by SDM is unoccupied)|| ||Incorrect niche model (variables chosen and/or mathematical expression, insufficient species occurrences to characterize full niche parameters, inclusion of biotic interactions and dispersal processes); incorrect paleoclimate characterization (errors in reconstructed or simulated climate); imprecise characterization of microscale spatial variability; realized niche changes over time (due to no-analog climates, biotic interactions, phenotypic plasticity, and/or niche evolution)|
1. Fossil records
Fossil records of Quaternary plants and animals have been collected over several decades of research by paleoecologists. There is a standing tradition of assembling paleoecological data into spatial databases used to study glacial and post-glacial species dynamics (Davis, 1976; Bernabo & Webb, 1977; Huntley & Birks, 1983; Williams et al., 2004). This process has been accelerated by recent advances in data management and paleoecoinformatics (Brewer et al., 2012). Fossils and sediments are being physically archived and increasingly tracked through digital curation (e.g. LacCore: http://lrc.geo.umn.edu/laccore and IEDA: http://www.iedadata.org) and raw data are being organized into georeferenced databases (Uhen et al., 2013).
Fossils provide the best evidence for the presence of a species within a past window of space and time. In many cases, corroboration of fossil evidence at multiple sites and careful analyses of taphonomy (i.e. processes leading to fossil preservation) can overcome many of the false positives and negatives associated with fossil data (Table 1). Some fossils are found in situ, such as megafossil trees in a rooted position. For smaller fossils (e.g. conifer needles, macroscopic charcoal and large seeds or bones) in low-energy depositional environments and for limnological microfossils (e.g. ostracodes, diatoms), source areas are usually constrained to scales of 10¹–10² m. Microfossils that are formed from the degradation of macrofossils, such as leaf stomata, can also indicate local occurrence (Ammann et al., 2014). Aerially dispersed microfossils such as pollen have larger source radii (usually on the order of 10⁻¹–10² km), but local presence can be inferred from high abundances or consistent presence in a record, especially for heavy pollen types (Lisitsyna et al., 2011). If datable by radiocarbon (i.e. within the last 50 000 yr), temporal precision is often very good (ranging from 10² yr for Holocene samples to 10³ yr for older samples; Bronk Ramsey, 2008). Other absolute dating methods extend through the Quaternary (Walker, 2005), although the error increases markedly. During the earlier Pleistocene, global climate fluctuations reflected by marine isotopic stages and other chronostratigraphic events (e.g. volcanic tephras) can constrain fossil age.
Critically, fossils often reveal the presence of populations of a species that existed outside its present range. The detection of such populations is perhaps the most important way by which fossil data complement phylogeographic approaches, which rely on DNA collected from extant species distributions. Such populations were widespread during the LGM. For example, in eastern North America, species of an extensive boreal flora existed hundreds of kilometers south of their modern southern limits (e.g. Jackson et al., 1997). Arctic sea mammals occurred as far south as San Francisco and New England (Harington, 2008). Similarly, the unglaciated headlands and lowlands on the coast of western Canada, which now lie below sea level, formed an LGM refugium for many taxa (Heaton et al., 1996; Lacourse et al., 2003).
Continuous time-series of fossils can provide direct evidence of species persistence and the dynamics of species turnover within a refugium. A 130 000-yr late-Pleistocene pollen record from western Greece, for example, indicated a muted response to the abrupt climatic changes recorded in the North Atlantic, suggesting a refugium characterized by a buffered local environment (Tzedakis et al., 2002). Similarly, a 48 000-yr pollen record from a biodiversity hotspot in the western Amazon revealed 30 000 yr of compositional similarity followed by a slow warming that would have accommodated elevational habitat tracking (Bush et al., 2004).
Targeted sampling within putative glacial refugia can increase the spatial specificity of refugial populations and data from multiple sites can strengthen the interpretation of rare fossil occurrences (Kaltenrieder et al., 2009). Brubaker et al. (2005) synthesized multiple sites in Beringia to show that trace amounts of pollen of boreal trees existed continuously during late-glacial times in Beringia, which could best be explained by the continuous persistence at low abundances of those species within Beringia, contrasting with single-site interpretations suggesting that trees were eliminated from Beringia in the LGM. Macrofossils, such as wood and charcoal in soil or plant leaves in sediment, can provide strong evidence of species presence (Hopkins et al., 1981, 1993; de Lafontaine & Payette, 2011). Such soil charcoal and macrofossil data have shown boreal tree taxa existing in central Europe and northern Eurasia close to ice sheets before and during the LGM (Willis & van Andel, 2004; Binney et al., 2009). The mammal fossil record supports a similar environmental interpretation (Sommer & Nadachowski, 2006).
Limitations and recent advances
The fossil record is discontinuous in space and time, and is limited to species that fossilize. Taxa that were rare in the past are unlikely to occur as macrofossils and therefore the absence of fossils usually provides little information about the absence of a species (i.e. a high false-negative rate; Birks, 2014). Even within a spatial network of fossil sites, small populations in areas far from fossil sites may remain undetected (McLachlan & Clark, 2004; Lesser & Jackson, 2011). Moreover, by their nature, microrefugial populations are spatially restricted and small and will leave correspondingly sparse representation in the fossil record. Fossil-bearing deposits may be absent or impossible to locate in the areas where a refugium putatively existed. In particular, old land surfaces and unglaciated mountains tend to have few sedimentary basins (Hutchinson, 1957). The settings for potential glacial refugia, such as steep south-facing slopes (north-facing in the southern hemisphere) or coastal shelves exposed during the last glaciation, may also be the least-suited for preserving fossils (Bisconti et al., 2011; Beatty & Provan, 2013).
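The high false-negative rate described above can be made concrete with a back-of-the-envelope calculation. The sketch below is a minimal illustration only: it assumes that fossil sites are independent and that each site has a known per-site probability of recording a species that was actually present. The function name and the detection probabilities are hypothetical and are not drawn from any study cited here.

```python
from typing import Iterable


def prob_undetected(detection_probs: Iterable[float]) -> float:
    """Probability that a species present in a region leaves no fossil
    at any sampled site, assuming sites are independent (an assumption)."""
    miss = 1.0
    for p in detection_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("detection probabilities must lie in [0, 1]")
        miss *= 1.0 - p  # chance that this site also fails to record it
    return miss


# Hypothetical example: ten sites, each with a 5% chance of recording
# a small refugial population that was truly present nearby.
site_detection_probs = [0.05] * 10
false_negative_prob = prob_undetected(site_detection_probs)
print(f"P(no fossil evidence despite presence) = {false_negative_prob:.2f}")
```

Under these illustrative numbers, absence from all ten records remains more likely than not even though the species was present, which is why absence of fossils is weak evidence of absence for small populations.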
Fossil data will mislead if fossils were misidentified, incorrectly dated or transported between death and deposition (e.g. through long-distance pollen dispersal or fluvial transport of macrofossils; Behrensmeyer et al., 2000). Wood charcoal in soil contexts, for example, is easily mixed over glacial–interglacial timescales and therefore identified pieces must be directly dated (Carcaillet & Talon, 1996). However, identified and dated charcoal can provide robust evidence for species presence (de Lafontaine et al., 2014). In tundra environments the highly dispersible pollen of taxa such as Betula and Pinus occurs far beyond their source limits; even moderate percentages may not indicate local presence and thus generate false-positive errors (Table 1; Birks, 2003). Unambiguous species-level identification is often difficult or impossible.
New detection techniques – particularly in the arena of molecular and ancient DNA (aDNA) approaches – may offer new opportunities to identify formerly cryptic refugia (Willerslev et al., 2003). Small quantities of environmental DNA (i.e. extra-cellular, degraded DNA in sediments) are retrievable for a range of organismal groups, including mammals, vascular plants and fungi. They are identified from short but robust sequences (metabarcodes), usually to genus and often to species (Willerslev et al., 2003; Yoccoz et al., 2012). Using this approach, Parducci et al. (2012) identified chloroplast DNA of spruce (Picea sp.) from sediments dating to just after the LGM (17 700 yr ago) on Andøya Island in northwest Norway at a site > 100 km from the nearest Picea trees. This particular study has been criticized on the grounds of potential contamination of the aDNA, the lack of supporting pollen data and the problem of re-worked sediments (Birks et al., 2012; Vorren et al., 2013). Should the technique be validated, however, it would offer a powerful approach.
2. Species distribution models
Species distribution modeling (SDM) refers to statistical and/or mechanistic approaches to assess the determinants of species ranges and predict species occurrence across space and/or time. SDM methods span from purely correlative methods (i.e. statistical assessments of relationships between species presence and a set of environmental variables; Guisan & Zimmermann, 2000; Elith & Leathwick, 2009; Franklin, 2009) to purely process-based methods (i.e. explicit ecological relationships between environmental conditions and organism performance; Kearney & Porter, 2009). The accessibility and benign data requirements of correlative SDMs, coupled with the improved availability of paleoclimate simulations, account for their increasing application to paleobiology (Nogués-Bravo, 2009; Svenning et al., 2011), including the study of glacial refugia.
Correlation-based SDMs have been widely applied because of their simplicity and the accessibility of software such as Maxent (Phillips et al., 2006) and BIOMOD for ensemble modeling (Thuiller et al., 2009). Although these models have been criticized because of their reliance on correlative relationships (e.g. Woodward & Beerling, 1997; Birks et al., 2010), process-based approaches also have limitations because of demanding parameterization (Morin & Thuiller, 2009). SDMs may be used in conjunction with paleoclimate simulations to hindcast species occurrence across large regions. This approach is cheaper and easier than those that rely strictly on fossil or genetic data for inference. Therefore, the greatest strength of SDMs is as an exploratory tool for generating spatial hypotheses regarding the past locations (Porto et al., 2013) and dynamics (Graham et al., 2010) of species, communities and/or refugia. An additional notable strength is the ability of SDMs to accommodate species-specific responses to climatic change, which is important because a single refugium is unlikely to be suitable for all taxa. Of the three approaches discussed in this paper, only SDMs can be used to project future distributions and potential refugia, so it is particularly critical to check their predictions against the paleorecord (Williams et al., 2013).
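The hindcasting workflow described above (fit a species–climate relationship to modern occurrences, then apply it to simulated past climate) can be sketched in a few lines. The example below is a deliberately simplified illustration and not the method of Maxent or BIOMOD: it swaps in a rectilinear climate envelope (the minimum–maximum range of each variable at presence sites), which is among the simplest correlative models. All site names and climate values are hypothetical.

```python
from typing import Dict, List, Tuple

Climate = Dict[str, float]  # e.g. {"temp": deg C, "precip": mm per yr}
Envelope = Dict[str, Tuple[float, float]]


def fit_envelope(presences: List[Climate]) -> Envelope:
    """Record the min-max range of each climate variable at presence sites."""
    variables = presences[0].keys()
    return {
        v: (min(p[v] for p in presences), max(p[v] for p in presences))
        for v in variables
    }


def is_suitable(envelope: Envelope, cell: Climate) -> bool:
    """A cell is suitable only if every variable falls inside the envelope."""
    return all(lo <= cell[v] <= hi for v, (lo, hi) in envelope.items())


def hindcast(envelope: Envelope, grid: Dict[str, Climate]) -> List[str]:
    """Names of grid cells whose (past) climate falls inside the envelope."""
    return sorted(name for name, cell in grid.items() if is_suitable(envelope, cell))


# Hypothetical modern presence records (climate at occupied sites).
modern_presences = [
    {"temp": 6.0, "precip": 800.0},
    {"temp": 9.5, "precip": 1100.0},
    {"temp": 8.0, "precip": 950.0},
]

# Hypothetical simulated LGM climate for three grid cells.
lgm_grid = {
    "north_plain": {"temp": -3.0, "precip": 400.0},
    "south_valley": {"temp": 7.0, "precip": 900.0},
    "coastal_shelf": {"temp": 6.5, "precip": 1300.0},
}

envelope = fit_envelope(modern_presences)
candidate_refugia = hindcast(envelope, lgm_grid)
print(candidate_refugia)
```

The cells returned are spatial hypotheses only. As the limitations discussed next make clear, such output assumes a static niche and ignores dispersal, so it would need to be checked against fossil or genetic evidence.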
Limitations and recent advances
Despite these practical strengths, a number of assumptions and uncertainties limit the utility of SDMs and complicate their interpretation (Guisan & Thuiller, 2005). Foremost is the assumption that the species–climate relationships identified by SDMs, and upon which their mapped predictions are based, reflect causal relationships that remain static despite changes in the environment and biotic interactions. Although assumptions of niche stability may hold for some species under some circumstances (as shown by comparison with fossil data; Martínez-Meyer & Peterson, 2006; Rodríguez-Sánchez & Arroyo, 2008), multiple lines of evidence suggest that species–climate relationships can change over time due to niche evolution (Pearman et al., 2008), changing biotic interactions (Ackerly, 2003; Colwell & Rangel, 2009; Stigall, 2012), human impacts (Tinner et al., 2013; Henne et al., 2013), the emergence of novel climates (Williams & Jackson, 2007) and changing CO2 concentrations that affect water relations in plants (Cowling & Sykes, 1999). A recent analysis of fossil pollen records in eastern North America suggests that shifting realized niches during periods of no-analog climates leads to low predictive ability of SDMs when projected across time (Veloz et al., 2012). The use of SDMs to locate historic refugia in regions with no-analog climates may be problematic (Williams et al., 2007; Fitzpatrick & Hargrove, 2009; Feeley & Silman, 2010).
Given the ongoing need for relatively simple and general models, perhaps the greatest advances will come from improved understanding of community dynamics, and incorporation of the influence of biotic interactions on species distributions and responses to climate change (Kissling et al., 2012; Wisz et al., 2013; Svenning et al., 2014). ‘Community-level’ models (also known as multiresponse models) use species co-occurrence patterns to infer biotic interactions (Ovaskainen et al., 2010) or simultaneously model community structure and the functional responses of individual species to environmental gradients (Ferrier & Guisan, 2006; Olden et al., 2006; Elith & Leathwick, 2007). Community-level models are only now being applied to paleobiology (Blois et al., 2013a) and hold promise for inferring the location of refugia. For example, they may identify regions where temporal turnover in community composition is predicted to remain low (Fitzpatrick et al., 2011), assuming space-for-time substitutions are valid (Blois et al., 2013b).
The second major limitation is that SDMs typically do not accommodate population- and metapopulation-level processes. For instance, they do not recognize ontogenetic niche differences (e.g. the regeneration niche), nor do they routinely consider relevant population and dispersal dynamics or intraspecific variation in climatic tolerances (Holt, 2009). For example, when populations are small and scattered, dispersal among populations may be critical for ensuring regional metapopulation persistence (Stacey & Taper, 1992). Without explicitly modeling dispersal processes or intraspecific niche differentiation, predictions from SDMs largely reflect potential distributions that assume all populations are instantaneously able to track their optimum climate (Araújo & Peterson, 2012).
Progress has been made in incorporating processes such as dispersal, population dynamics and biotic interactions into SDMs and multispecies landscape models (Midgley et al., 2010; Schurr et al., 2012; Thuiller et al., 2013), with numerous studies describing the application of such methods to modeling changes in species distributions in response to climate change (e.g. Fitzpatrick et al., 2008; Pagel & Schurr, 2012; Dullinger et al., 2012; Fordham et al., 2013). However, relatively few studies have used dynamic species-level models in paleoecological contexts (Saltré et al., 2013; Henne et al., 2013). As these techniques continue to improve and as paleoecological datasets accumulate, there will be greater opportunities for the application of dynamic models to identify refugia and for much-needed model validation and inter-model comparisons. Although dynamic simulations have the potential to address many of the limitations of correlation-based SDMs (Schurr et al., 2012), some process-based models likely will always remain inherently limited by their complexity and therefore will be primarily useful for a handful of relatively well-studied species and sites.
The third major limitation is that most SDMs use spatially- and temporally-smoothed climate data that fail to capture climatic variance and the effects of topography on microclimate. Even when high-resolution downscaled datasets are created, they often use simple assumptions about the effects of topography on climate (e.g. applying lapse-rate corrections for temperature) that do not represent the full set of processes affecting microclimates (Dobrowski, 2011; Holden & Jolly, 2011; Ashcroft et al., 2012). In mountainous regions, precipitation, solar radiation, wind speed, temperature and humidity all vary with topographic position and interact to create complex mosaics of available moisture and energy that, among other factors, mediate the distribution of species. Microclimates may differ dramatically from regional climate due to distinctive water and substrate features that would be difficult to account for in coarsely gridded climate data, but that may maintain viable populations over millennia. For example, talus slopes (which would not be distinguished on digital elevation data) may store ice throughout warm seasons, thereby creating pronounced local cooling that supports cool-adapted species (Kong & Watts, 1999; Edenborn et al., 2012). Such sites may effectively buffer against regional climate change. Increasing the realism of modeled topoclimates should (1) increase the spatial heterogeneity of climate on the landscape, resulting in greater overlap in climatic values through time (Ackerly et al., 2010; Dobrowski, 2011), (2) reduce estimates of the velocity of climate change through time (Dobrowski et al., 2013), (3) result in a more precise definition of the climatic niche of a species (Franklin et al., 2013), and (4) reveal locations that are potentially buffered from regional climate change and that serve as in situ refugia (Ashcroft et al., 2009; Patsiou et al., 2014).
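To show why a lapse-rate correction is a 'simple assumption', the sketch below applies one to a coarse grid cell. It is a minimal illustration, not a description of any cited dataset. The default of 6.5 °C per km is the standard-atmosphere environmental lapse rate, used here as an assumed constant, and the site names, elevations and aspects are hypothetical.

```python
def lapse_rate_downscale(
    coarse_temp_c: float,
    coarse_elev_m: float,
    fine_elev_m: float,
    lapse_rate_c_per_km: float = 6.5,  # assumed constant lapse rate
) -> float:
    """Adjust a coarse-grid temperature to the elevation of a finer-scale site.

    Elevation is the only input: aspect, cold-air pooling, substrate and
    other microclimatic processes are ignored by construction.
    """
    return coarse_temp_c - lapse_rate_c_per_km * (fine_elev_m - coarse_elev_m) / 1000.0


# Hypothetical coarse cell: 5.0 deg C at a mean elevation of 1000 m.
coarse_temp_c = 5.0
coarse_elev_m = 1000.0

# Hypothetical fine-scale sites: name -> (elevation in m, slope aspect).
sites = {
    "valley_floor": (800.0, "flat"),
    "north_slope": (1200.0, "north"),
    "south_slope": (1200.0, "south"),
}

downscaled = {
    name: lapse_rate_downscale(coarse_temp_c, coarse_elev_m, elev)
    for name, (elev, _aspect) in sites.items()
}

for name, temp in downscaled.items():
    print(f"{name}: {temp:.2f} deg C")
```

The north- and south-facing slopes receive identical temperatures because they share an elevation, even though their radiation loads differ. This is the kind of topoclimatic contrast that a lapse-rate-only correction cannot represent.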
Compared to the development and testing of new modeling techniques, relatively little attention has been given to developing microclimatic datasets that capture these critical aspects of the environment at the scale relevant to populations. Recent studies have addressed this challenge by deploying dense networks of inexpensive temperature sensors to develop high-resolution climate datasets (Holden et al., 2011; Ashcroft et al., 2012) suitable for mapping potential microrefugia. Topoclimatic grids are now being produced over large and diverse regions (Ashcroft & Gollan, 2012), but to date they have not been extended to past climates. The lack of such datasets currently limits SDM studies of LGM distributions to macrorefugia rather than microrefugia, although recent attempts to link fine-scale climate with global or regional climate models offer a promising approach to project topoclimatic models back to the LGM (Dobrowski et al., 2009; Holden et al., 2011; Kearney et al., 2014). If successful, such approaches could identify refugia at a spatial scale that is in many situations inherently inaccessible to paleoecological or phylogeographical reconstructions (for the reasons stated in the corresponding sections). The task of developing such datasets also requires a focus on the climate variables that most proximally relate to species occurrence. For plants, a climatic water balance approach has the advantage of integrating physical and environmental variables into the key limiting factors of evapotranspiration and moisture deficits (Lutz et al., 2010; Crimmins et al., 2011).
When applied to a paleoclimatic context (i.e. paleo-SDMs that hindcast potential distributions under past climate), issues related to modeling topoclimates are magnified due to the uncertainty in the regional paleoclimate estimates (Bonfils et al., 2004; Varela et al., 2011; McGuire & Davis, 2013). Our comparison of eight recent paleoclimate simulations produced by the PMIP3 project (Fig. 4) revealed that the standard deviation of the anomaly (LGM–present) of growing degree-days or annual precipitation is comparable to the average anomaly itself. For example, eight simulations may indicate a mean temperature anomaly of −8°C for a single grid cell, but individual simulations range from −2 to −14°C. In addition, there may be a warm bias in some General Circulation Model simulations of LGM climate, leading to an overestimation of potential refugia (Harrison et al., 2013). This uncertainty in coarse-scale paleoclimate simulations is being addressed by comparing and assimilating simulated climates with quantitative paleoclimate reconstructions from the paleoecological and geologic record (Bartlein et al., 2011; Schmittner et al., 2011; Braconnot et al., 2012). Regardless of future improvements in paleoclimate reconstruction, there is slowly increasing awareness that SDM projections should always account for this and other sources of uncertainty, hence producing ‘maps of ignorance’ together with the standard potential distribution maps (Rocchini et al., 2011).
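The spread among paleoclimate simulations can be summarized per grid cell before any SDM is projected. The sketch below is a minimal illustration and not the analysis behind Fig. 4. The eight anomaly values are hypothetical, chosen only to match the example in the text (a mean of −8°C with individual simulations ranging from −2 to −14°C), and the function name is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_anomalies(anomalies: List[float]) -> Dict[str, float]:
    """Ensemble mean, spread and range of an LGM-minus-present anomaly."""
    m = mean(anomalies)
    s = stdev(anomalies)  # sample standard deviation across simulations
    return {
        "mean": m,
        "sd": s,
        "min": min(anomalies),
        "max": max(anomalies),
        # Ratio of spread to signal; large values flag uncertain cells.
        "sd_to_mean": s / abs(m) if m != 0 else float("inf"),
    }


# Hypothetical temperature anomalies (deg C) for one grid cell from
# eight simulations; not actual PMIP3 output.
cell_anomalies = [-2.0, -5.0, -6.0, -8.0, -8.0, -10.0, -11.0, -14.0]

summary = summarize_anomalies(cell_anomalies)
print(
    f"mean = {summary['mean']:.1f}, sd = {summary['sd']:.2f}, "
    f"range = {summary['min']:.0f} to {summary['max']:.0f}"
)
```

Repeating this summary for every grid cell, and mapping the spread alongside the ensemble mean, is one simple way to produce the kind of uncertainty layer that should accompany a hindcast distribution map.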
3. Phylogeography
Molecular phylogeographic approaches to locating refugia use spatial patterns of genetic polymorphism sampled from present-day populations to infer population dynamics (Hickerson et al., 2010). In general, one expects higher genetic diversity, increased relative abundance of endemic and ancestral alleles, and less spatial genetic structure despite higher genetic differentiation within refugia relative to recently colonized areas, together with genetic divergence among refugia (Hewitt, 2000; Petit et al., 2003; de Lafontaine et al., 2013). As summarized by Bloomquist et al. (2010), three primary phylogeographic approaches have been applied to the identification of refugia. First, the comparative approach constructs a gene genealogy, identifies nested sets of clades, and finally tests the geographical associations of those clades (Cruzan & Templeton, 2000). Second, the population genetic approach – also known as statistical phylogeography (Knowles, 2004) – typically identifies discrete gene pools and uses coalescent models to infer the demographic processes (e.g. population size, migration, divergence) underlying the observed molecular variation within and among populations (e.g. Hey, 2010). Third, spatial diffusion approaches model geographic spread across the landscape using continuous spatial information and random walk models. The first approach was strongly questioned, with good reasons (e.g. Nielsen & Beaumont, 2009; Panchal & Beaumont, 2010), and has now been progressively replaced by the second approach, whereas the third approach still presents major challenges (see Section IV 'New integrative approaches to reconstructing refugia'; Lemey et al., 2010).
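The expectation that refugial populations hold higher diversity and more endemic alleles can be illustrated with two standard summary statistics. The sketch below is a minimal example under stated assumptions: it computes Nei's unbiased gene diversity, H = n/(n − 1) × (1 − Σp²), from haploid (e.g. chloroplast) haplotype samples and lists haplotypes private to one population. The population names and haplotype samples are hypothetical, and this is not the coalescent-based inference described above.

```python
from collections import Counter
from typing import Dict, List, Set


def gene_diversity(haplotypes: List[str]) -> float:
    """Nei's unbiased gene diversity for a sample of haploid haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq_freq)


def private_haplotypes(populations: Dict[str, List[str]], name: str) -> Set[str]:
    """Haplotypes found in one population and in no other sampled population."""
    elsewhere: Set[str] = set()
    for other, sample in populations.items():
        if other != name:
            elsewhere.update(sample)
    return set(populations[name]) - elsewhere


# Hypothetical haplotype samples (eight individuals per population).
populations = {
    "putative_refugium": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "recolonized_north": ["A", "A", "A", "A", "A", "A", "A", "B"],
}

h_refugium = gene_diversity(populations["putative_refugium"])
h_north = gene_diversity(populations["recolonized_north"])
private_refugium = private_haplotypes(populations, "putative_refugium")

print(f"H refugium = {h_refugium:.3f}, H recolonized = {h_north:.3f}")
print(f"private to refugium: {sorted(private_refugium)}")
```

High diversity and private haplotypes are suggestive rather than conclusive. As Table 1 notes, secondary contact zones can inflate diversity, and apparently endemic alleles may simply be unsampled elsewhere.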
Phylogeographic methods provide at least three important strengths relative to fossil records and SDMs. First, these methods provide insight into historical migration patterns, historical population sizes and even natural selection, all of which add nuance to our understanding of Quaternary biogeography. Coalescent simulations under the discrete population genetic approach allow direct testing of alternative biogeographic hypotheses derived from species distribution models or fossil data (e.g. Carstens et al., 2005; Gugger et al., 2010; Tsai & Manos, 2010). Second, molecular methods address among-population (within-species) differences – variation that is typically ignored in SDMs or cannot be detected through paleoecological reconstruction (Scoble & Lowe, 2010; Leites et al., 2012). This intraspecific resolution has suggested the widespread existence of so-called cryptic LGM refugia for numerous taxa (Carstens et al., 2005; Hu et al., 2009). Third, identifying genetic hotspots or shared refugia across taxa is possible by sampling several partially co-distributed species (Carnaval et al., 2009). Although species responses to past climate change have been individualistic rather than in lockstep (Davis, 1986; Stewart et al., 2010), large changes in climate may drive coarsely similar biogeographic histories of some unrelated species and result in congruent phylogenetic patterns.
Limitations and recent advances
Many phylogeographic studies are based on mitochondrial or chloroplast DNA, which do not recombine but have a slow mutation rate such that nucleotide polymorphisms likely predate the LGM. Hence, phylogeographic studies may be limited by a mismatch between the timescale represented by the observed genetic variation and the period of interest (e.g. the past 21 000 yr). For long-lived species, genetic mutations may be insufficient to identify long-term isolation over 10 000 to 100 000 yr, resulting in too little discriminatory power to select among alternative models of glacial refugia. By contrast, if a recent genetic bottleneck occurred, drastic declines in population size cause nonrandom sampling of, and drastic decreases in, nucleotide diversity, and thus leave an uninformative picture of the variation that existed beforehand. Marked genetic variation may date to events that occurred either before (Marske et al., 2011) or after (Knowles & Alvarado-Serrano, 2010) the LGM, reducing the power to draw inferences about LGM species distributions. A potentially powerful means to address some of these limitations is to recover aDNA from fossil samples (Parducci et al., 2012). Placing fossil data in a phylogenetic framework can add power to analyses that connect extant populations with their glacial refugia (Ramakrishnan & Hadly, 2009), or help test and refine alternate models of glacial refugia and post-glacial expansion. However, population genetics models of temporal samples needed for the analysis of aDNA are still in their infancy (Barton et al., 2010).
A major advance in recent years has been a shift from studies using single-locus mitochondrial or chloroplast DNA toward ever-increasing numbers of nuclear loci that are more representative of genome-wide genetic variation. This trend is driven partly by the reduced cost of high-throughput sequencing methods, but also by the recognition that a better representation of the genome is necessary for robust inference. The individual history of any particular locus is highly stochastic, which means that each gene can yield only limited confidence in reconstructing a species history. Population genetic data from many loci, each giving some independent insight, offer more power to infer the common demographic events that have shaped genetic variation across the entire genome, such as population size changes and migration history (Hare, 2001). These inferences allow examination of more spatially and temporally refined hypotheses about refugia and responses to environmental change. Advances in next-generation sequencing have yielded a wealth of techniques that enable the resolution of thousands of single-nucleotide polymorphism (SNP) markers distributed throughout the genome even in nonmodel organisms (Gnirke et al., 2009; Wang et al., 2009; Davey et al., 2011). These methods are cost-effective and resolve individual-level genetic variants necessary for population genomics and phylogeography (Emerson et al., 2010). Another benefit of these approaches is the reduction of ascertainment bias, which can inflate estimates of genetic variation and exclude geographically informative variation (reviewed by Rosenblum & Novembre, 2007).
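The gain in power from sampling many loci can be illustrated with a toy coalescent calculation. The sketch below assumes a single constant-size, randomly mating diploid population, in which the coalescence time of two gene copies at a neutral locus is exponentially distributed with a mean of 2N generations. It also assumes, unrealistically, that coalescence times are observed directly rather than inferred from mutations. The population size, locus counts and function names are hypothetical choices for this example.

```python
import random
from statistics import mean, stdev


def estimate_n(n_true, n_loci, rng):
    """Estimate N from the mean pairwise coalescence time across loci.

    Each independent locus contributes one exponential draw with mean
    2 * n_true generations (standard neutral coalescent for a pair).
    """
    times = [rng.expovariate(1.0 / (2.0 * n_true)) for _ in range(n_loci)]
    return mean(times) / 2.0


def estimator_spread(n_true, n_loci, n_reps=2000, seed=1):
    """Standard deviation of the estimate of N over replicate data sets."""
    rng = random.Random(seed)
    estimates = [estimate_n(n_true, n_loci, rng) for _ in range(n_reps)]
    return stdev(estimates)


N_TRUE = 10_000  # hypothetical effective population size
spread_1 = estimator_spread(N_TRUE, n_loci=1)
spread_100 = estimator_spread(N_TRUE, n_loci=100)
print(f"1 locus:    SD of estimate = {spread_1:.0f}")
print(f"100 loci:   SD of estimate = {spread_100:.0f}")
```

Under these assumptions the spread of the single-locus estimate is about as large as N itself, whereas averaging over 100 independent loci reduces it roughly tenfold (by the square root of the number of loci). Real analyses must additionally account for mutational variance, linkage among loci and more complex demography.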
A difficult step in any statistical phylogeographical study involves the generation of alternative demographic hypotheses; this is due to insufficient previous knowledge about species' history. Both fossil records (e.g. Gugger et al., 2010) and, especially, SDMs (Carstens & Richards, 2007; Waltari et al., 2007; Peterson & Nyari, 2008; Chan et al., 2011) have been used to suggest alternative scenarios that include different configurations in terms of number and size of refugia. These hypotheses are then tested through coalescent simulations. Importantly, the dependence on paleo-SDMs or fossils to generate alternative demographic hypotheses implies that these tend to be affected by all the inherent limitations of the other approaches. Care must be taken to detect whether biases of some particular data (i.e. SDM output or fossil records) could override the signal present in the complementary genetic data. Considering more scenarios can acknowledge the uncertainty in SDM hindcasts and increase the chance that the true species history is included as a candidate hypothesis, but it would also greatly increase the number of demographic hypotheses to be tested (Carstens et al., 2009; Collevatti et al., 2013).
Phylogeography has traditionally focused on reconstructing neutral evolutionary processes that accompany the demographic processes involved in species' range dynamics, and thus has sought out putatively neutral loci to avoid the confounding force of natural selection. The genome carries both neutral and adaptive signatures of historical events, and provides the code for the ecologically relevant phenotypic variation upon which natural selection acts. As genomic data become available for nonmodel species, adaptation can be studied along with post-glacial demography (Orsini et al., 2013). Population expansion out of refugia makes it more difficult to differentiate adaptive from neutral genetic variation (Excoffier & Ray, 2008), but information about the location of refugia from nongenetic data will facilitate these investigations. Such integrative approaches have been demonstrated in a number of recent studies that combine a phylogeographic perspective with quantitative trait mapping, genome scans for selected loci, or both (Keller et al., 2010; Eckert et al., 2010; Bradshaw et al., 2012). By setting patterns of adaptive genetic variation in an historical context, there is potential to infer the role of adaptation to climate change – a factor commonly neglected in Quaternary paleoecological studies (Davis & Shaw, 2001).
III. Climate refugia of the past: three case studies
Below we highlight three species for which the important role of refugia has been investigated extensively. We chose these examples to illustrate the power of the combined interpretation of fossil records, SDMs and phylogeography. Taxa most suitable for such interdisciplinary studies are normally tree species due to their good representation in the fossil record. Some megafauna species have also been studied quite thoroughly (Lorenzen et al., 2011; Metcalf et al., 2014).
European beech (Fagus sylvatica) distribution since the LGM has been studied in several ways (Fig. 5a). Extensive paleoecological data (pollen and macrofossils) show that beech dating to the LGM or the late-glacial occurred in southern Italy and Greece (Tzedakis et al., 2013), northern Spain and Slovenia (Magri et al., 2006), northeast Italy (Kaltenrieder et al., 2009) and southwest France (de Lafontaine et al., 2014). Magri et al. (2006) combined the fossil data with genetic data (nuclear and chloroplast markers) to infer locations and extent of refugia and their role in the Holocene range expansion of beech. The fossil sites show a general correspondence with refugia inferred from molecular data, suggesting multiple refugia within an areal extent two orders of magnitude smaller than today (Magri, 2008; de Lafontaine et al., 2013). Genetic markers indicate that the colonization of central and northern Europe originated mainly from the northern periphery of Mediterranean refugia (e.g. eastern and western Alps). An SDM supported the presence of beech in most of these refugia and its absence from northern Europe (Svenning et al., 2008), whereas another SDM that incorporated post-glacial dispersal supported a northern refugium in the Carpathian Mountains (marked as ‘?’ in Fig. 5a; Saltré et al., 2013). Together, these findings suggest that the northern edge of Mediterranean refugia played a disproportionately important role in the post-glacial colonization of Europe by beech, with implications for understanding the origin of the species' modern genetic variability.
The Qinghai spruce (Picea crassifolia) of northern China provides an example of how multiple methods together can reconstruct a complex history of repeated fragmentation and dispersal (Fig. 5b). The scattered distribution of this species across alpine meadows and deserts of the Qinghai-Tibet Plateau and the adjacent highlands was suggested to have derived from in situ survival of large populations followed by gradual replacement of these forests by alpine meadow ecosystems from the late Pliocene onwards (Wu, 1980; Chang, 1983; Shi et al., 1998). Although phylogeographic analyses supported a long-term isolation of a northern disjunction in the Helan Shan region, the remaining populations on the plateau shared a common haplotype suggesting a population bottleneck in a single glacial refugium, possibly in the southeast (Meng et al., 2007). In line with this conclusion, palynological data indicated spruce presence before the LGM, very rare spruce pollen during the LGM, a marked increase of spruce pollen during the late Glacial period (16 000–12 000 yr ago), and finally a rapid decline c. 8000 yr ago (Herzschuh et al., 2006, 2010). The fragmentation of these forests and their replacement by alpine meadows may have been caused by increased fire through natural means or human activity (Meng et al., 2007). This possibility was supported by the presence of spruce charcoal, dated to between 8900 and 4000 yr ago, in the soil layers under existing alpine meadow vegetation (Kaiser et al., 2007). Furthermore, SDMs suggested that the modern distribution of spruce is 40% of its potential distribution at the northern end of its range, possibly indicating late-Holocene deforestation (Xu et al., 2012). Thus, this combination of studies revealed a complex history of not only climate change, but also changing disturbance and land use in the Holocene, which gave rise to the modern disjunctions and genetic diversity of this species (Miehe et al., 2009; Zhao et al., 2011).
Statistical phylogeography may also assess whether the fragmented modern distribution of a taxon results from a single refugium or multiple refugia, and thus help understand dispersal history across discontinuous habitat. For example, Gugger et al. (2010) used the fossil record of Douglas-fir (Pseudotsuga menziesii) to generate a set of alternative hypotheses on population subdivision during the LGM and routes of post-glacial expansion or contraction (Fig. 5c). Coalescent simulations were used to predict patterns of DNA sequence variation in each of these alternative scenarios, and then empirical data were compared to these simulations to determine which scenarios of population subdivision could be rejected. These analyses suggested that the modern range of Douglas-fir resulted from both expansion from multiple refugia (some only inferred from molecular data) and contraction in other regions. Paternally-inherited chloroplast markers and maternally-inherited mitochondrial markers revealed asymmetric histories of seed vs pollen dispersal. An SDM for Douglas-fir during the LGM agrees with most of the fossil locations and with several of the phylogeographically inferred refugia, supporting separate interior and coastal southern refugia (Roberts, 2013). Another notable finding was phylogeographic support for a cryptic northern refugium in the Rocky Mountains, although we note that this contrasts with paleoecological data indicating a very cold and dry climate near the ice sheet (Fig. 2).
IV. New integrative approaches to reconstructing refugia
Despite the clear advantages of combined integrative approaches as illustrated in our case studies, disagreement between different data types or lines of evidence is common (Martínez-Meyer & Peterson, 2006; Giesecke et al., 2007; Waltari et al., 2007; Pearman et al., 2008; Marske et al., 2011; Mellick et al., 2012; Saltré et al., 2013; Williams et al., 2013). These debates inevitably raise questions about data uncertainties and the validity of assumptions (Hampe, 2004; Dormann, 2007; Jackson, 2012). In short, predicting the locations of small populations on a heterogeneous landscape, whether under past or future climate, remains a significant challenge.
Increasing recognition of the benefits of integrative reconstructions has stimulated the development of methodological frameworks that pursue joint inferences of species history while addressing drawbacks of existing approaches. Most efforts have emphasized a tighter integration of paleodistribution models and statistical phylogeography. For instance, Knowles & Alvarado-Serrano (2010) and Brown & Knowles (2012) developed a spatially-explicit framework that translates paleo-SDM output into demographic models that account for spatially and temporally variable carrying capacities and migration rates across the landscape. These models can produce maps of past distributions to assess refugial dynamics. However, the transformation of SDM-based suitabilities into demographic parameters (i.e. carrying capacities and migration rates) is not straightforward: choices regarding functional relationships can influence reconstructed species histories (Brown & Knowles, 2012). Hence, the integration of SDMs with phylogeography must be carefully applied (Collevatti et al., 2013; Alvarado-Serrano & Knowles, 2014). Espíndola et al. (2012) used a different approach combining paleo-SDMs and genetic data to reconstruct the late Quaternary range dynamics of a perennial plant in Europe by simulating a spatial diffusion process for each of the four genetic clusters currently identified for the species.
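The sensitivity to functional form noted above, when SDM suitabilities are translated into carrying capacities, can be illustrated with a minimal sketch. The three transformations, the maximum carrying capacity `K_MAX` and all parameter values below are hypothetical choices for this example; they are not the functions used by Brown & Knowles (2012) or any other cited study.

```python
import math

K_MAX = 1000.0  # hypothetical maximum carrying capacity per grid cell


def linear_k(s, k_max=K_MAX):
    """Carrying capacity proportional to suitability s in [0, 1]."""
    return k_max * s


def threshold_k(s, k_max=K_MAX, cutoff=0.5):
    """Cell is either fully occupied or empty, depending on a cutoff."""
    return k_max if s >= cutoff else 0.0


def sigmoid_k(s, k_max=K_MAX, midpoint=0.5, steepness=10.0):
    """Smooth transition between the two extremes above."""
    return k_max / (1.0 + math.exp(-steepness * (s - midpoint)))


# Hypothetical hindcast suitabilities for four grid cells.
for s in (0.1, 0.4, 0.6, 0.9):
    print(
        f"s={s:.1f}  linear={linear_k(s):6.1f}  "
        f"threshold={threshold_k(s):6.1f}  sigmoid={sigmoid_k(s):6.1f}"
    )
```

A marginal cell with suitability 0.4 supports a sizeable population under the linear rule but none under the threshold rule, so the same hindcast can imply either the presence or the absence of a small refugial population depending on the transformation chosen.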
Other approaches avoid using paleo-SDMs as templates to reconstruct species range dynamics in the past. In continuous statistical phylogeography (Lemmon & Lemmon, 2008; Lemey et al., 2009, 2010), a gene tree of georeferenced genetic samples forms the data to be modeled using a spatial diffusion process. Thus, ancestral locations and range shifts are estimated together with the genealogy, providing a spatially explicit reconstruction of species range dynamics (Marske et al., 2012). Another approach uses a phylogeographic clustering method to estimate ancestral locations (Manolopoulou & Emerson, 2012). Finally, dynamic range models (Pagel & Schurr, 2012; Schurr et al., 2012) may be applied to reconstruct past range dynamics. These process-based models, usually built in a Bayesian framework (Marion et al., 2012; Schurr et al., 2012), can assimilate multiple types of data (e.g. current distributions, fossil occurrences, genetic-based inferences of past population presence) to infer population presence at different times, including the location of refugia. Together with the automatic production of rigorous uncertainty estimates, the main advantage of the Bayesian approach is that the limitations of each method or dataset taken alone may be compensated for by strengths in other methods and datasets. For example, genetic data from extant organisms may never be able to underpin the inference of a refugium outside of the current range, but paleoclimatic or fossil data may well indicate that possibility. If the genetic data are included in the analysis in a joint inference, then the plausibility of this refugium is checked automatically via the influence of the geographic scenario on the likelihood of the genetic data. 
If the datasets are congruent, then one scenario will emerge with a high posterior probability; if they conflict, then the posterior probability density will be spread out amongst several options, indicating that the available data do not have enough power to favor any one scenario. Hence, although more data-hungry and computationally challenging than static SDM hindcasting, these models promise to enable a quantitative synthesis or true joint inference of species past distributions, including the location of refugia, based on multiple lines of evidence.
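The behavior of the posterior under congruent and conflicting datasets can be sketched with Bayes' rule over a small set of discrete scenarios. This is a deliberately simplified illustration, not an implementation of a dynamic range model: it assumes the datasets are conditionally independent given the scenario, and the scenario names, prior and likelihood values are all invented for the example. In a real analysis the likelihoods would come from process-based models of fossil and genetic data.

```python
def scenario_posterior(prior, likelihoods):
    """Posterior probability of each scenario given several datasets.

    prior:       dict mapping scenario -> prior probability
    likelihoods: list of dicts, one per dataset, mapping
                 scenario -> P(dataset | scenario)
    Datasets are assumed conditionally independent given the scenario.
    """
    unnormalized = {}
    for scenario, p in prior.items():
        weight = p
        for lik in likelihoods:
            weight *= lik[scenario]
        unnormalized[scenario] = weight
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all scenarios have zero probability")
    return {s: w / total for s, w in unnormalized.items()}


# Hypothetical scenarios with a flat prior.
prior = {"single_south": 1 / 3, "multiple_south": 1 / 3, "south_plus_north": 1 / 3}

# Hypothetical likelihoods of the fossil data under each scenario.
fossil = {"single_south": 0.6, "multiple_south": 0.3, "south_plus_north": 0.1}
# Genetic data that agree with the fossils ...
genetic_agree = {"single_south": 0.7, "multiple_south": 0.2, "south_plus_north": 0.1}
# ... and genetic data that point the other way.
genetic_conflict = {"single_south": 0.1, "multiple_south": 0.3, "south_plus_north": 0.6}

congruent = scenario_posterior(prior, [fossil, genetic_agree])
conflicting = scenario_posterior(prior, [fossil, genetic_conflict])
print("congruent:  ", {s: round(p, 3) for s, p in congruent.items()})
print("conflicting:", {s: round(p, 3) for s, p in conflicting.items()})
```

With these invented numbers, the congruent case concentrates most of the posterior probability on one scenario, whereas the conflicting case leaves no scenario with a majority of the probability, mirroring the qualitative behavior described above.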
V. How can historical refugia inform us about future refugia?
Investigating species and population persistence in historical refugia can illuminate two broad aspects of climate-change biogeography. First, such investigations inform the autecology and history of species and populations in a changing environment. This research may reveal the role of refugia in species persistence and range dynamics, and which functional traits are important. The occurrence of disjunct refugia also influences the rates of migration necessary for species to track a changing climate, and thus helps close the gap between observed migration rates and those inferred from paleorecords (Feurdean et al., 2013). Second, persistence in a refugium suggests some combination of a moderated refugial environment buffered against the regional climate and tolerance to climate change, by pronounced phenotypic plasticity and/or adaptive capacity. Knowledge of where refugia were located in the past can thus inform us about the geographic features that tend to facilitate population persistence through extreme events (Hannah et al., 2002; Avise, 2008; Hampe & Jump, 2011). Complex topographic controls, for example, could buffer local populations against rapid climate shifts and allow species to persist despite regionally unfavorable environments. Conversely, species in low-relief regions may be particularly sensitive to climate change. Knowing which aspects of climate permitted species persistence within a refugium, how the refugium was spatially organized, and whether there were other suitable but unoccupied areas on the landscape due to dispersal limitation or biotic interactions may also help predict whether species might be able to utilize refugia in the future.
Paleoecology, phylogeography and SDMs all underscore the importance of refugia for the persistence of biota in rapidly changing environments. Past experience tells us that many species manage to find suitable spots for persistence even in extremely unsuitable environments at regional scales, especially in heterogeneous regions. This ability could go a long way in explaining the Quaternary conundrum (Botkin et al., 2007) and partially moderate the most alarmist forecasts of future species loss. A critical need is to improve our mechanistic understanding of what makes a refugium: specifically how species traits and landscape features interact to influence persistence (Bhagwat & Willis, 2008; Hampe et al., 2013), and how this information may be used to improve probabilistic forecasts of future refugia (Eckert, 2011). Even if climatically suitable sites could be predicted, dispersal limitations and the likely influence of biotic interactions will add further uncertainty to species distribution forecasts (Blois et al., 2013c).
Conservation organizations are currently identifying potential climate refugia for modern populations at risk from ongoing climate change (Shoo et al., 2011; Olson et al., 2012; Groves et al., 2012). Will current refugia survive into the future? For species in long-term refugia that have persisted for at least one glacial–interglacial cycle, their existence suggests that they are tolerant of major climate change (Hampe & Petit, 2005; Dobrowski, 2011); hence, they may be more likely to remain as refugia in the future by buffering future changes in local climate (Ashcroft, 2010; Ashcroft et al., 2012). Importantly, however, the direction of this effect is now reversed; historical refugia protected species from the cold, whereas contemporary and future refugia must protect species from a warming world (Bennett et al., 1991). Indeed, projected climate changes may produce a ‘super-interglacial’ that resembles the Pliocene or Miocene rather than any other period in the Pleistocene (Moritz & Agudo, 2013). For a species migrating under ongoing climate warming, its trailing-edge populations (i.e. those currently in warmer climates), which are also the most likely to be genetically unique relicts of former glacial refugia, may actually be the first to suffer local extirpation (Fig. 1; Razgour et al., 2013). Thus, it seems that long-term refugia might be either the most resistant, or the most susceptible, to future climate change – further supporting the need for quantitative future predictions based on a mechanistic understanding of historical refugia.
VI. Concluding thoughts
New integrative, multi-faceted approaches should facilitate using knowledge about the past to project future refugia. The accumulation of data and development of novel methods are accelerating our understanding of species distributions, both in the past and into the future. Continuing paleontological sampling and database development are producing a dense array of fossil data. Genome-wide genetic markers are becoming more affordable and more easily attainable, and analytical methods are improving rapidly – making it increasingly possible to uncover the historical processes encoded in the patterns of genomic variation for a wide range of nonmodel species. The next generation of SDMs that include increased biological realism (e.g. accounting for dispersal limitation, biotic interactions or populations' demographic inertia) has begun to play a prominent role (e.g. Dullinger et al., 2012). Uncertainty in climate projections is also continually decreasing. Furthermore, ongoing climate change provides the opportunity to calibrate SDMs with field observations. For instance, field surveys can reveal where recruitment and mortality are actually occurring, which could signal where future populations are more likely to establish (McLaughlin & Zavaleta, 2012; Zhu et al., 2012). The heterogeneity of responses of locally adapted populations also may be observed or quantified experimentally (Davis & Shaw, 2001; Leites et al., 2012). A myriad of complementary approaches is thus recommended, and perhaps the best complement to models is good natural history knowledge and high-quality field data (Ricklefs, 2012).
This paper emerged from a workshop ‘Climate Refugia: Joint Inference from Fossils, Genetics, and Models’ in Eugene, Oregon, August 2012. The authors thank the PAGES project of the International Geosphere-Biosphere Program (IGBP) and the College of Arts and Sciences at the University of Oregon for supporting the workshop. Gordon Bettles and The Many Nations Longhouse at the University of Oregon graciously provided meeting space. D.G.G., S.Z.D., K.D.H. and F.S.H. were supported by the National Science Foundation (grant DEB 1146017). F.R.S. was supported by the European Union Seventh Framework Programme (FP7/2007–2013) under grant agreement 275094. A.L.S. was supported by the National Science Foundation (grant EAR-0922067). We are grateful for detailed comments from Chronis Tzedakis and three anonymous referees.