Congruence between distribution modelling and phylogeographical analyses reveals Quaternary survival of a toadflax species (Linaria elegans) in oceanic climate areas of a mountain ring range


Author for correspondence:

Mario Fernández-Mazuecos

Tel: +34 914203017



  • The role of Quaternary climatic shifts in shaping the distribution of Linaria elegans, an Iberian annual plant, was investigated using species distribution modelling and molecular phylogeographical analyses. Three hypotheses are proposed to explain the Quaternary history of its mountain ring range.
  • The distribution of L. elegans was modelled using the maximum entropy method and projected to the last interglacial and to the last glacial maximum (LGM) using two different paleoclimatic models: the Community Climate System Model (CCSM) and the Model for Interdisciplinary Research on Climate (MIROC). Two nuclear and three plastid DNA regions were sequenced for 24 populations (119 individuals sampled). Bayesian phylogenetic, phylogeographical, dating and coalescent-based population genetic analyses were conducted.
  • Molecular analyses indicated the existence of northern and southern glacial refugia and supported two routes of post-glacial recolonization. These results were consistent with the LGM distribution as inferred under the CCSM paleoclimatic model (but not under the MIROC model). Isolation between two major refugia was dated back to the Riss or Mindel glaciations, > 100 kyr before present (bp).
  • The Atlantic distribution of inferred refugia suggests that the oceanic (buffered)–continental (harsh) gradient may have played a key and previously unrecognized role in determining Quaternary distribution shifts of Mediterranean plants.


The climatic changes of the Quaternary (i.e. the last 2.6 million yr (Myr)) have shaped the distribution and abundance of extant species in a variety of ways (Bennett & Provan, 2008; Stewart et al., 2010). Among temperate species, these climatic shifts have brought about several cycles of contraction and expansion of their geographical ranges. Certain regions acted as refugia for these species during the ice ages, from where they were able to recolonize previously occupied areas during interglacial periods (Hewitt, 1996, 2000; Comes & Kadereit, 1998). In particular, the three Mediterranean peninsulas of southern Europe (Iberian, Italian and Balkan) are known to have acted as refugia during the Quaternary. They remained mostly ice-free during the last glacial maximum, that is, 26.5 to 19–20 kyr before present (bp; Clark et al., 2009), harbouring many European species and enabling the recolonization of central and northern Europe in the post-glacial period (Comes & Kadereit, 1998; Taberlet et al., 1998; Hewitt, 1999; Schmitt, 2007). Recently, however, the historical complexity of these three peninsulas has been recognized with the demonstration that they did not constitute single refugia throughout the Quaternary but instead several independent or interconnected ‘refugia within refugia’ (Gómez & Lunt, 2006; Nieto-Feliner, 2011). It has also been argued that altitudinal migration, rather than large-scale range shifts, may have dominated the Quaternary history of Mediterranean species (Gutiérrez-Larena et al., 2002; Martín-Bravo et al., 2010). Still, different species are adapted to tolerate different ranges of climatic conditions and therefore may respond to climatic shifts in diverse ways (Stewart et al., 2010). Considering the high species diversity in the Mediterranean region, yet the low number of species studied to date, further research is needed to understand the Quaternary history of Mediterranean taxa.

Three main approaches are employed to reconstruct species ranges throughout the Quaternary climatic cycles: the study of the spatial and temporal distribution of macro- and microfossils; the analysis of DNA markers in living populations to reconstruct phylogeographical patterns; and the projection of species distribution models (SDMs) to past climatic conditions. These approaches have been extensively used to study the distribution range of plants, including Mediterranean taxa, in the last few years (Carrión et al., 2003; Benito Garzón et al., 2007; López de Heredia et al., 2007; Picó et al., 2008; Calleja et al., 2009; González-Sampériz et al., 2010; Désamoré et al., 2012). However, the three approaches have been rarely integrated (Rodríguez-Sánchez & Arroyo, 2008; Rodríguez-Sánchez et al., 2009; Besnard et al., 2013), even though combined analyses using different data sources are highly desirable in this field (Taberlet et al., 1998). Particularly in the case of herbaceous taxa, the paleobotanical approach has been seldom implemented, as these plants are only rarely registered in the fossil record. The reconstruction of past distribution ranges of herbs has therefore relied on the investigation of their phylogeographical patterns and the projection of SDMs, two approaches that are increasingly being integrated (Waltari et al., 2007; Chan et al., 2011).

Linaria elegans is an Iberian, annual species of Linaria section Versicolores, a monophyletic assemblage within the largest genus of the snapdragon tribe (Antirrhineae; Sutton, 1988; Fernández-Mazuecos & Vargas, 2011; Fernández-Mazuecos et al., 2013). It is an allogamous, insect-pollinated toadflax (M. Fernández-Mazuecos et al., unpublished), with a diploid chromosome number of 2n = 12 (Heitz, 1926). Divergence between L. elegans and its sister species Linaria nigricans has been dated back to the early Pliocene–early Pleistocene (Fernández-Mazuecos & Vargas, 2011), thus making L. elegans an appropriate species for the investigation of distribution shifts associated with Quaternary climatic cycles. The current distribution of L. elegans encompasses a diverse range of habitats, including siliceous sandy soils up to 1900 m in therophytic communities, open scrubs and woodlands, as well as open habitats as low as 100 m in oceanic areas (Sáez & Bernal, 2009). It is primarily distributed in a discontinuous mountain ring surrounding the northern plateau of Iberia (Duero river basin), a pattern shared with some other north-west Iberian endemics such as Rumex suffruticosus (López, 1990), Luzula lactea and Luzula caespitosa (Fernández Piedra & Talavera, 2010). The observation of such a distribution pattern raises multiple questions about its origin and the extent to which it has been shaped by climatic oscillations. Three main hypotheses can be considered to explain the Quaternary history of mountain ring-like distribution patterns (Fig. 1). First (hypothesis I), the cold temperatures of glacial periods may have caused an altitudinal descent of populations, which may have then expanded their distribution across the central basin, leading to the admixture of populations from different ranges of the mountain ring, followed by post-glacial centrifugal recolonization of the ring (Robledo-Arnuncio et al., 2005). 
A second hypothesis (hypothesis II) is that cycles of altitudinal descent have occurred without an admixture of populations from different mountains, leading to long-term isolation between areas (similar to Ronikier et al., 2008). Finally (hypothesis III), the glacial periods may have caused extinction of populations and a contraction of the distribution range into a few favourable environments (refugia), found probably (although maybe not exclusively) in the south (Comes & Kadereit, 1998). In this case, the ring-like pattern observed nowadays would be the result of post-glacial recolonization from glacial refugia along the mountain ring. Note that all hypotheses assume that the distribution pattern during past interglacial periods was a ring similar to that presently observed. Phylogeographical patterns predicted under the three hypotheses are summarized in Fig. 1. These predictions can be tested thanks to recently developed Bayesian phylogeographical methods, which are capable of estimating split times and migration rates between populations and directions of spread (Lemey et al., 2009; Bloomquist et al., 2010; Hey, 2010).

Figure 1.

Testable hypotheses about the late Quaternary history of a mountain ring species. The ring represents a mountain ring that surrounds an inner basin, with each segment symbolizing a distinct mountain area. Areas filled in black represent those occupied by the species during the last interglacial (LIG), during the last glacial maximum (LGM) and at the present time according to the respective hypothesis. Grey arrows indicate routes of post-glacial colonization. Predictions concerning phylogeographical patterns are outlined for each hypothesis.

In this study, we discriminated among the three hypotheses outlined as applied to the late Quaternary history of L. elegans. To this end, we integrated a time-calibrated phylogeographical analysis of L. elegans based on plastid and nuclear markers with species distribution modelling under current, last glacial maximum (LGM) and last interglacial (LIG) climatic conditions.

Materials and Methods

Mountain ring range

The current distribution of Linaria elegans Cav. in the northern half of the Iberian Peninsula was divided into nine areas (Fig. 2). Seven areas were located along the mountain ring that surrounds the Duero basin: Cantabrian Mountains (CM), Galician-Portuguese Mountains (GP), Serra da Estrela (ES), Sierra de Gata (GA), Sierra de Gredos (GR), Sierra de Guadarrama (GU) and Northern Iberian System (NI). We defined two additional peripheral areas outside the mountain ring: Atlantic lowlands (AL) in the north-west, and Southern Iberian System (SI) in the south-east. The nine areas encompassed three major regions (Fig. 2): the Cantabrian region in the north (CM, GP and AL), the Central Mountain System (henceforth Central System) in the south (ES, GA, GR and GU), and the Iberian Mountain System (henceforth Iberian System) in the east (NI and SI).

Figure 2.

Geographical distribution of the 24 populations of Linaria elegans sampled for phylogeographical analyses (numbered dots; see Table 2) and the 54 localities additionally employed for distribution modelling (triangles; see Supporting Information Table S1). The three major regions of L. elegans distribution (the Cantabrian region, Central System and Iberian System) are indicated, as well as the smaller areas delimited for phylogeographical reconstructions: Cantabrian Mountains (CM), Galician-Portuguese Mountains (GP), Atlantic lowlands (AL), Serra da Estrela (ES), Sierra de Gata (GA), Sierra de Gredos (GR), Sierra de Guadarrama (GU), Northern Iberian System (NI) and Southern Iberian System (SI). The inset shows a photograph of L. elegans flowers.

Species distribution modelling

Species distribution modelling (SDM) was performed to evaluate the potential distribution of L. elegans under present climatic conditions and to project it to the past. We employed the maximum entropy algorithm, as implemented in Maxent v3.3 (Phillips et al., 2006). A set of 19 bioclimatic variables for the Iberian Peninsula under current conditions was retrieved from the WorldClim website (Hijmans et al., 2005). Of these, we selected seven uncorrelated variables (Table 1; see Supporting Information Methods S1 for details). Given that L. elegans is a silicicolous (acidophilous) species (Sáez & Bernal, 2009), we also included a categorical variable distinguishing acidic and basic substrates. This layer was derived from the dominant parent material layer of the European Soil Database (Panagos, 2006). The eight variables were used as predictors to calibrate the distribution model in Maxent. In the occurrence data set, we included 78 reliable point localities (Fig. 2; Table S1), which were randomly split into training data (75%) and test data (25%). Ten subsample replicates were performed. To convert continuous suitability values to presence/absence, we chose the threshold obtained under the maximum training sensitivity plus specificity rule, which has been shown to produce accurate predictions (Jiménez-Valverde & Lobo, 2007).
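The data split and thresholding rule described above can be illustrated with a minimal Python sketch. It is not the Maxent implementation: the function names are hypothetical, the scores in the usage example are invented for illustration, and specificity is computed against background scores standing in for absences, which is an assumption of this sketch.

```python
import random

def split_occurrences(points, train_fraction=0.75, seed=0):
    """Randomly split occurrence records into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def max_sens_plus_spec_threshold(presence_scores, background_scores):
    """Return (threshold, sensitivity + specificity) for the best cut-off.

    A cell is predicted 'present' when its suitability is >= threshold.
    Background scores stand in for absences (assumption of this sketch).
    """
    best_threshold, best_sum = None, -1.0
    candidates = sorted(set(presence_scores) | set(background_scores))
    for t in candidates:
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        specificity = sum(s < t for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best_threshold, best_sum = t, sensitivity + specificity
    return best_threshold, best_sum

# 78 localities split 75% / 25%, as in the text (indices used as placeholders)
train, test = split_occurrences(range(78))
print(len(train), len(test))  # 58 20

# Invented suitability scores, for illustration only
presence = [0.9, 0.8, 0.7, 0.4]
background = [0.1, 0.2, 0.3, 0.6, 0.5]
print(max_sens_plus_spec_threshold(presence, background))
```

In the study, this procedure would be repeated for each of the ten subsample replicates, with the threshold taken from the training data.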

Table 1. Seven uncorrelated bioclimatic variables employed for distribution modelling of Linaria elegans in Maxent

Name | Variable | Current mean (min–max) | LGM-CCSM mean (min–max) | LGM-MIROC mean (min–max) | LIG mean (min–max)
bio3 | Isothermality (%) | 37.7 (26–52) | 45.5 (32–58) | 36.0 (23–50) | 30.0 (18–45)
bio4 | Temperature seasonality (°C) | 5.8 (2.6–8.0) | 5.6 (2.4–8.0) | 5.6 (2.4–8.0) | 7.8 (3.5–10.8)
bio5 | Maximum temperature of warmest month (°C) | 29.1 (9.2–40.1) | 27.9 (10.4–42.1) | 26.4 (9.2–37.7) | 33.3 (14.2–44.9)
bio6 | Minimum temperature of coldest month (°C) | 2.0 (−12.5 to 9.4) | −2.1 (−18.3 to 8.4) | 1.1 (−12.2 to 8.8) | 4.9 (−14.4 to 8.2)
bio13 | Precipitation of wettest month (mm) | 82.2 (15–273) | 131.2 (16–485) | 126.7 (32–411) | 103.1 (25–319)
bio14 | Precipitation of driest month (mm) | 18.6 (0–117) | 24.4 (1–136) | 25.2 (1–135) | 14.5 (0–107)
bio15 | Precipitation seasonality (mm) | 39.3 (11–88) | 41.0 (11–87) | 41.0 (11–87) | 45.0 (15–84)

Mean value and range for each variable in the studied area are shown under current conditions, as well as last glacial maximum (LGM; CCSM and MIROC) and last interglacial (LIG) simulations.

The distribution model under current conditions was projected to the LGM (c. 21 kyr bp) using paleoclimatic layers simulated under two general atmospheric circulation models: the Community Climate System Model (CCSM; Collins et al., 2006) and the Model for Interdisciplinary Research on Climate (MIROC; Hasumi & Emori, 2004). To test the hypothesis of a ring-like distribution pattern before the LGM, we also projected the model to the LIG (c. 120–140 kyr bp) using the climatic model of Otto-Bliesner et al. (2006). It was assumed that the ecological requirements of L. elegans have remained stable throughout at least the last climatic cycle (Nogués-Bravo, 2009), which seems fairly reasonable given the short time-scale, although caution must be exercised on this point.

DNA sampling and sequencing

We sampled 119 individuals of L. elegans from 24 populations (four or five individuals per population) covering the whole distribution range of the species (Fig. 2; Table 2). Twenty of these populations were sampled along the mountain ring, and four in peripheral areas.

Table 2. Populations of Linaria elegans sampled for DNA sequencing

Geographical area | Population number | Locality | Altitude (m) | cpDNA haplotypes | At103 haplotypes
AL | 1 | Spain, Pontevedra, Caldas de Reis | 47 | G (x5) | 2 (x2), 4 (x4), 7 (x2)
AL | 2 | Spain, A Coruña, Santiago de Compostela | 169 | H (x3), I (x2) | 2 (x4), 13 (x2)
GP | 3 | Spain, Ourense, San Xoán de Río | 912 | B (x2), H (x3) | 2 (x2), 5 (x4)
GP | 4 | Spain, Zamora, Ribadelago | 1012 | D, E, F, G, H | 2 (x7), 12
GP | 5 | Portugal, Montalegre | 930 | A, H, J (x3) | 2 (x5), 4
CM | 6 | Spain, León, Puerto de Ventana | 1473 | B (x2), G (x2) | 2 (x2)
CM | 7 | Spain, León, Isoba | 1489 | C (x2), G, H (x2) | 1 (x2), 2 (x4), 20 (x2)
CM | 8 | Spain, León, Villafrea de la Reina | 1160 | G (x5) | 1 (x2), 2 (x2), 21 (x2)
ES | 9 | Portugal, Manteigas | 1046 | K (x5) | 8 (x2), 15 (x2)
GA | 10 | Spain, Salamanca, Navasfrías | 1014 | A (x4), K |
GA | 11 | Spain, Cáceres, Eljas | 1062 | A (x4), K | 2 (x5), 6 (x2), 18 (x2)
GR | 12 | Spain, Ávila, Puerto del Tremedal | 1640 | M (x2), Q (x3) | 2 (x2), 14, 18, 19 (x2)
GR | 13 | Spain, Ávila, Puerto de Peñanegra | 1446 | L, M, Q, S (x2) | 2 (x3), 3 (x2), 11 (x2), 14
GR | 14 | Spain, Ávila, Navarredonda de Gredos | 1605 | L, M (x3), R | 4 (x2), 9, 17, 18 (x2)
GR | 15 | Spain, Ávila, Puerto del Pico | 1385 | L (x2), M (x3) | 3 (x4)
GR | 16 | Spain, Ávila, Puerto de Menga | 1555 | L, M (x2), Q (x2) | 4 (x2)
GR | 17 | Spain, Ávila, Puerto de Casillas | 1461 | L, N, O (x2), P | 2 (x5), 22
GU | 18 | Spain, Madrid, Cercedilla | 1293 | M, S, T (x3) | 2 (x2), 3 (x3), 10, 17, 18 (x2)
GU | 19 | Spain, Madrid, Puerto de Canencia | 1527 | R, S (x3), T | 2, 4 (x3), 16 (x2)
GU | 20 | Spain, Madrid, Somosierra | 1538 | S (x2), T (x3) | 2 (x2), 4 (x2), 18 (x2), 23
NI | 21 | Spain, Burgos, Cerro Grañón | 1688 | S (x5) | 2 (x2), 18, 23 (x5)
NI | 22 | Spain, Burgos, Neila | 1236 | S (x5) | 2 (x3), 9 (x3), 22 (x2)
SI | 23 | Spain, Cuenca, El Tobar | 1304 | S (x5) | 2 (x3), 4, 10 (x3), 18
SI | 24 | Spain, Cuenca, Santa María del Val | 1364 | S (x5) | 2 (x3), 16 (x2), 18

Haplotypes of the combined cpDNA (rpl32-trnLUAG/trnK-matK/trnS-trnG) and At103 data sets are indicated.

Geographical areas: Cantabrian Mountains (CM), Galician-Portuguese Mountains (GP), Atlantic lowlands (AL), Serra da Estrela (ES), Sierra de Gata (GA), Sierra de Gredos (GR), Sierra de Guadarrama (GU), Northern Iberian System (NI) and Southern Iberian System (SI).

Procedures used for DNA extraction, amplification and sequencing followed Fernández-Mazuecos & Vargas (2011). After a pilot study (see Methods S1 for details), we selected five DNA regions that were consistently amplified and variable, including three cpDNA regions: rpl32-trnLUAG (Shaw et al., 2007), trnK-matK (Johnson & Soltis, 1994) and trnS-trnG (Hamilton, 1999); and two nuclear regions: the At103 gene (Rzeznicka et al., 2005; Li et al., 2008) and the nuclear ribosomal internal transcribed spacers (ITS; White et al., 1990). The three cpDNA regions and the At103 gene were amplified and sequenced for all individuals, while the ITS region was only sequenced for one individual per population. As the outgroup, sequences of the same plastid and nuclear regions were obtained for two individuals of the sister species L. nigricans and two or three individuals of additional species of sect. Versicolores (Table S2). Sequences of each DNA region were separately aligned using mafft 6 (Katoh et al., 2002). The three cpDNA regions were concatenated in a single matrix. All new sequences were deposited in the GenBank database (see Table S2 for accession numbers).

Analysis of DNA haplotypes

To reconstruct haplotypes from the diploid, unphased At103 sequences, we employed the Bayesian statistical method phase 2.1 (Stephens et al., 2001; Stephens & Donnelly, 2003) with default parameters. Five runs were performed, and the one with the best average goodness-of-fit was selected. Only highly supported haplotype pairs (P > 0.90) were retained. We used IMgc (Woerner et al., 2007) to generate a reduced, recombination-free matrix, as required by coalescent-based analyses that assume no recombination (see Coalescent-based analyses). This reduced data set was used for all subsequent analyses.

The cpDNA and At103 loci were separately analysed using the statistical parsimony algorithm (Templeton et al., 1992), as implemented in tcs 1.21 (Clement et al., 2000), in order to infer genealogical relationships among haplotypes.

Phylogenetic analyses

Relationships among sequences were assessed using Bayesian inference (BI), as implemented in MrBayes v3.1.2 (Ronquist & Huelsenbeck, 2003), and maximum parsimony (MP), as implemented in tnt 1.1 (Goloboff et al., 2003). We conducted separate analyses on the cpDNA and At103 data sets, after removing identical sequences from both. For the ITS data set, all sequences were included. Outgroup sequences were chosen based on previous phylogenetic results (Fernández-Mazuecos & Vargas, 2011; see Methods S1 for details).

Divergence times of cpDNA sequences

To estimate divergence times among cpDNA lineages, we combined newly generated rpl32-trnLUAG/trnK-matK sequences (one per haplotype) with those of Linaria sect. Versicolores previously published (Fernández-Mazuecos & Vargas, 2011). A relaxed molecular clock approach was implemented in beast v.1.6.2 (Drummond et al., 2006; Drummond & Rambaut, 2007) following procedures of Fernández-Mazuecos & Vargas (2011), except that the divergence time between Chaenorhinum and Linaria was modelled as a normal distribution with mean = 23 Ma and standard deviation = 4. These values were obtained from a dating analysis of plastid ndhF sequences of the tribe Antirrhineae (P. Vargas et al., unpublished; see Methods S1 for details).
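To make the calibration prior concrete, the following short Python sketch computes the central 95% interval implied by a normal distribution with mean 23 Ma and standard deviation 4 Ma. It only illustrates the prior stated above and is unrelated to the beast implementation.

```python
from statistics import NormalDist

# Calibration prior stated in the text: Normal(mean = 23 Ma, sd = 4 Ma)
prior = NormalDist(mu=23.0, sigma=4.0)

# Central 95% interval of the prior on the Chaenorhinum-Linaria divergence
lower = prior.inv_cdf(0.025)
upper = prior.inv_cdf(0.975)
print(f"95% prior interval: {lower:.1f}-{upper:.1f} Ma")
```

Under these values, the prior places most of its mass between roughly 15 and 31 Ma.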

Genetic diversity and differentiation

In order to assess differences in cpDNA and At103 genetic diversity across the geographical range of L. elegans, we calculated the number of haplotypes (h), number of private haplotypes (ph) and haplotypic diversity (H) (Nei, 1987) for each of the nine geographical areas (Fig. 2). The same parameters were also computed for the three major geographical regions. Geographical structure for the two loci was assessed using spatial analysis of molecular variance (SAMOVA; Dupanloup et al., 2002). The nearest-neighbour statistic Snn (Hudson, 2000) was calculated in DnaSP (Librado & Rozas, 2009) to assess genetic differentiation among the nine geographical areas (see Methods S1 for details).
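As a worked example of the diversity statistics, the Python sketch below computes the number of haplotypes (h) and haplotypic diversity (H), taking H as n/(n − 1) × (1 − Σp_i²), the usual form of Nei's (1987) estimator. The haplotype lists are transcribed from the cpDNA column of Table 2; the function name is hypothetical and the software actually used for these calculations is not assumed here.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# cpDNA haplotypes per area, transcribed from Table 2 (one letter per individual)
gp = list("BBHHH" + "DEFGH" + "AHJJJ")            # populations 3-5
gr = list("MMQQQ" + "LMQSS" + "LMMMR" + "LLMMM" + "LMMQQ" + "LNOOP")  # populations 12-17
es = list("KKKKK")                                # population 9

for name, haps in (("GP", gp), ("GR", gr), ("ES", es)):
    print(name, len(set(haps)), round(haplotype_diversity(haps), 3))
```

With these counts the sketch returns h = 8 for both GP and GR, with H of 0.867 and 0.800 respectively, and H = 0 for ES, in agreement with the values reported in the Results.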

Discrete phylogeographical analyses

In order to reconstruct the spread history of L. elegans, cpDNA and At103 sequences were separately analysed using the Bayesian spatial diffusion methodology described by Lemey et al. (2009). Analyses were conducted in beast v.1.6.2 for the complete cpDNA and At103 data sets, excluding outgroup sequences. We defined nine areas (Fig. 2), which were mapped through a discrete phylogeographical analysis (DPA) that employs a standard continuous-time Markov chain (see Methods S1 for details). A Bayesian stochastic search variable selection procedure was implemented to identify parsimonious descriptions of the diffusion process. A Bayes factor (BF) analysis was performed to identify rates (diffusion routes) that were frequently invoked to explain the diffusion process. Rates yielding a BF > 3 were considered to be well supported (Lemey et al., 2009).
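The Bayes factor for a single diffusion rate can be sketched as the ratio of posterior to prior odds that the rate is included in the model. The Python example below assumes a symmetric rate matrix and the truncated Poisson prior on the number of non-zero rates (mean ln 2, offset K − 1) described by Lemey et al. (2009); whether these exact settings were used here is an assumption, as the details are given in Methods S1. The function name is hypothetical.

```python
import math

def bssvs_bayes_factor(posterior_inclusion, n_locations, symmetric=True):
    """Bayes factor for one rate under stochastic search variable selection.

    posterior_inclusion: posterior mean of the rate's 0/1 indicator.
    Prior inclusion probability assumes a truncated Poisson prior on the
    number of non-zero rates with mean ln(2) and offset K - 1 (assumption).
    """
    n_rates = n_locations * (n_locations - 1)
    if symmetric:
        n_rates //= 2
    prior_inclusion = (math.log(2) + n_locations - 1) / n_rates
    prior_odds = prior_inclusion / (1.0 - prior_inclusion)
    if posterior_inclusion >= 1.0:
        return math.inf
    posterior_odds = posterior_inclusion / (1.0 - posterior_inclusion)
    return posterior_odds / prior_odds

# Nine areas, as in this study; the indicator values are illustrative only
for p in (0.4, 0.5, 0.9):
    print(p, round(bssvs_bayes_factor(p, n_locations=9), 2))
```

Under these assumptions, with nine areas a rate needs a posterior inclusion probability of just under 0.5 to exceed the BF > 3 cut-off.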

Coalescent-based analyses

In the case of discordance between loci, coalescent-based statistical methods are needed for the reliable inference of phylogeographical patterns (Knowles & Maddison, 2002; Nielsen & Beaumont, 2009). Here we implemented coalescent-based analyses in order to estimate split times (which are particularly relevant for the testing of our hypotheses; Fig. 1) and migration rates between major populations of L. elegans. In these analyses, the two extensively sampled sequence data sets (cpDNA and At103) were jointly analysed as two independent loci (not concatenated; Hey, 2010).

First, relationships among populations of the nine geographical areas were estimated using the coalescent-based approach implemented in *beast, which does not account for post-divergence migration (Heled & Drummond, 2010; see Methods S1). Based on the result of the *beast analysis (see Results), we defined three plausible input tree topologies for isolation with migration analyses, representing alternative topological relationships among three major populations (designated N, SE and SW; see Results). Isolation with migration analyses were then implemented in IMa2 (Hey, 2010). In each IMa2 analysis, an infinite sites mutation model (Kimura, 1969) was implemented for both loci. The inheritance scalar was set at 0.25 for the haploid locus (cpDNA) and 1 for the diploid locus (At103). A range of mutation rates was calculated for each locus based on dating results obtained in beast (see Methods S1 for details). These ranges, together with a generation time of 1 yr (given that L. elegans is an annual species), were included in the model in order to estimate parameters on demographic scales. For each input tree, we ran six IMa2 analyses with 20 parallel Markov chains each. After 100 000 burn-in steps, a total of 20 000 trees were sampled in each run, with 100 steps between tree savings (see Methods S1 for further details). All analyses reached equilibrium after the burn-in period. The six runs performed for each input tree were combined for subsequent analyses. For each topology, we additionally evaluated four nested models in which one or more migration rate pairs across the Duero basin were constrained to be zero. Model fit was assessed using the Akaike information criterion (AIC; Akaike, 1976).
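The AIC comparison of nested migration models can be sketched as follows. The model names, log-likelihoods and parameter counts in this Python example are hypothetical placeholders, not values from the IMa2 runs; the sketch only shows how AIC = 2k − 2 ln L ranks a full model against constrained alternatives.

```python
def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood

def rank_models(models):
    """Rank models by AIC.

    models: {name: (log_likelihood, n_parameters)}
    Returns a list of (name, AIC, delta_AIC), best model first.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
    best = min(scores.values())
    return sorted(
        ((name, score, score - best) for name, score in scores.items()),
        key=lambda row: row[1],
    )

# Hypothetical values, for illustration only
models = {
    "full migration": (-100.0, 6),
    "no migration across basin": (-100.5, 4),
}
for name, score, delta in rank_models(models):
    print(f"{name}: AIC = {score:.1f}, delta = {delta:.1f}")
```

In this toy case the constrained model is preferred because dropping two migration parameters costs only half a log-likelihood unit.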


Results

Species distribution models

The average species distribution model for current conditions (Fig. 3a) spanned the current species distribution, plus a few additional areas, such as high mountains of south-eastern Spain (Sierra Nevada) and narrow areas of the Pyrenees. The mean area under the receiver-operating characteristic curve (a measure of model fitness) for testing data was high (0.951), which supported the predictive power of the model. The standard deviation of the 10 replicates was low (Notes S1). According to jackknife tests, two temperature (bio6 and bio5) and one precipitation (bio14) variable, followed by the acidic/basic variable, were shown to be the most informative variables for the model (Notes S1).
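For readers unfamiliar with the statistic, the area under the receiver-operating characteristic curve can be read as the probability that a randomly chosen presence locality receives a higher suitability score than a randomly chosen background point. The Python sketch below computes it by direct pairwise comparison; the scores are invented for illustration and the function name is hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: P(presence score > background score); ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Invented test scores, for illustration only
test_presence = [0.9, 0.8, 0.4]
test_background = [0.1, 0.5, 0.3]
print(round(auc(test_presence, test_background), 3))
```

A value of 0.5 corresponds to a model no better than random, whereas values close to 1, such as the 0.951 reported here, indicate that test presences are almost always ranked above background points.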

Figure 3.

Results of species distribution modelling of Linaria elegans. (a) Average distribution model fitted to current climatic conditions. (b) Average projection of the model to the last interglacial (c. 120–140 kyr before present (bp)). (c, d) Average projections of the model to the last glacial maximum (c. 21 kyr bp) using climatic variables under the Community Climate System Model (CCSM) (c) and Model for Interdisciplinary Research on Climate (MIROC) (d) general circulation model simulations. Last glacial maximum (LGM) coastlines are represented in (c) and (d), with current coastlines superimposed as dotted lines.

The projection to the LIG (Fig. 3b) yielded a similar overall distribution, but with unequal occupation areas. While a wider distribution than the current species range was inferred in the Cantabrian region, suitability in the Central System and the Iberian System was restricted to narrow highland areas. A small area was also inferred in Sierra Nevada.

The CCSM and MIROC models for the LGM yielded strongly dissimilar inferences of L. elegans paleodistribution. The CCSM model (Fig. 3c) inferred a restricted potential distribution in north-western Iberia, including putative glacial refugia mostly in the western half of the Central System (ES, GA and GR) and the western half of the Cantabrian region (GP and AL). Suitability was again inferred for Sierra Nevada. By contrast, the MIROC model (Fig. 3d) inferred a large distribution of suitable areas across the Iberian Peninsula, including not only the current distribution range, but also lower lands of the same regions, wide areas of the Duero basin and mountain ranges of south-eastern Spain.

Analyses of cpDNA sequences

The combined analysis of the three cpDNA regions (2542 bp from 119 individuals; Fig. 4a–c) yielded 20 haplotypes of L. elegans, which formed a single network with no loops in the tcs analysis (Fig. 4c). Seven missing haplotypes were inferred, most of which (five) separated haplotype A from haplotype B. Haplotype B had the highest number of connections, specifically to six lineages. Haplotypes and haplotype lineages showed distinct distribution ranges (Fig. 4a), with haplotypes B–J exclusively found in the Cantabrian region, the K–T lineage in the Central System and the Iberian System, and haplotype A in western segments of the Cantabrian region (GP) and the Central System (GA). Relationships among haplotypes inferred by phylogenetic analyses (Fig. 4b) showed that L. elegans sequences formed a monophyletic group (posterior probability (PP) = 1; bootstrap support (BS) = 97%). Within the L. elegans clade, haplotype A was revealed as sister to a clade constituted by the remaining haplotypes (PP = 1; BS = 92%). Relationships within this clade were congruent with those inferred by the network analysis.

Figure 4.

Analysis of cpDNA (rpl32-trnLUAG/trnK-matK/trnS-trnG) haplotypes of Linaria elegans. The 20 cpDNA haplotypes are represented as fill patterns and colours, and named as in Table 2 and Fig. S3(a). (a) Geographical distribution of haplotypes across sampled populations. Pie charts represent haplotype proportions, obtained after sequencing four to five individuals per population. Population groups identified by SAMOVA for K = 5 are delimited by white dotted lines. (b) Fifty per cent majority-rule consensus tree of the Bayesian phylogenetic analysis; numbers above branches are Bayesian posterior probabilities; numbers below branches are bootstrap supports (in percentage) from the maximum parsimony analysis. (c) Statistical parsimony network of cpDNA haplotypes; lines represent single nucleotide substitutions, and dots indicate missing haplotypes (extinct or not found). Circle sizes are proportional to the number of sequences obtained for each haplotype.

Analyses of nuclear sequences

A total of 111 individuals provided clear At103 electropherograms. The aligned matrix of At103 sequences had a total length of 391 bp. Haplotype reconstruction in phase resulted in highly supported haplotype pairs (P > 0.90) for 81 individuals (162 sequences in total). The IMgc algorithm recovered the last 230 bp of 147 sequences as the largest nonrecombining block of the matrix (15 sequences from areas GR, GU, GA and GP were eliminated). Analysis of this reduced data set in tcs yielded 23 haplotypes (Fig. 5a–c), 22 of which formed a network with no loops, while the remaining haplotype (1) was unconnected (Fig. 5c) and separated from all remaining haplotypes by at least 28 substitutions. Geographical structure (Fig. 5a) was poorer than that obtained from the cpDNA data set. Some haplotypes were widely distributed (e.g. haplotypes 2 and 4), while others were restricted to single or adjacent pairs of areas. Phylogenetic analyses (Fig. 5b) yielded relationships congruent with those of the network analysis. Monophyly of L. elegans sequences was retrieved (PP = 0.84; BS = 84%), except for haplotype 1, which was grouped with L. nigricans. We ascribed this pattern to incomplete sorting of ancestral polymorphisms or ancient hybridization (Blanco-Pastor et al., 2012).

Figure 5.

Analysis of nuclear At103 haplotypes of Linaria elegans. The recombination-free data set obtained in IMgc was employed. The 23 At103 haplotypes are represented as fill patterns and colours, and named as in Table 2. (a) Geographical distribution of haplotypes across sampled populations. Pie charts represent haplotype proportions, and pie chart sizes are proportional to the number of sequences obtained for each population. The single population separated from the remaining localities by SAMOVA (K = 2) is delimited by a white dotted line. (b) Fifty per cent majority-rule consensus tree of the Bayesian phylogenetic analysis; numbers above branches are Bayesian posterior probabilities; numbers below branches are bootstrap supports (in percentage) from the maximum parsimony analysis. (c) Statistical parsimony network of At103 haplotypes; lines represent single nucleotide substitutions, and dots indicate missing haplotypes (extinct or not found). Circle sizes are proportional to the number of sequences obtained for each haplotype.

In the phylogenetic analysis of the ITS region (595 bp), L. elegans sequences were grouped together as a monophyletic group (PP = 0.89; BS = 97%; Fig. S1). They formed a large polytomy, except for a weakly supported clade (PP = 0.7; BS < 50%). All populations, except for 9, 11 and 19, yielded a certain number (from one to five) of additive polymorphic sites (Table S3). ITS sequences were not further used given their multicopy nature and concerted evolution (Álvarez & Wendel, 2003).

Genetic diversity and differentiation

Dissimilar patterns of genetic diversity and differentiation were encountered when analysing plastid and nuclear At103 sequences. For the cpDNA data set, two geographical areas were found to harbour the highest genetic diversity (Table 3): GP in the Cantabrian region (h = 8; H = 0.867) and GR in the Central System (h = 8; H = 0.800). Three peripheral areas (ES, NI and SI) yielded no haplotypic diversity. By contrast, At103 diversity was more evenly distributed across geographical areas (Table 3).

Table 3. Genetic diversity parameters of population partitions of Linaria elegans using cpDNA and At103 sequences

                                  cpDNA                     At103
Partition                         n    h   ph  H            n    h   ph  H
All populations                   119  20  NA  0.900        147  23  NA  0.760
Central System (ES, GA, GR, GU)   60   11  9   0.889        67   16  9   0.862
Cantabrian region (AL, GP, CM)    39   10  9   0.802        50   9   7   0.647
Iberian System (NI, SI)           20   1   0   0.000        30   8   0   0.825

  1. Areas are sorted by decreasing cpDNA haplotypic diversity.

  2. n, number of sampled individuals; h, number of haplotypes; ph, number of private haplotypes; H, haplotypic diversity; NA, non-applicable.

  3. Geographical areas: Cantabrian Mountains (CM), Galician-Portuguese Mountains (GP), Atlantic lowlands (AL), Serra da Estrela (ES), Sierra de Gata (GA), Sierra de Gredos (GR), Sierra de Guadarrama (GU), Northern Iberian System (NI) and Southern Iberian System (SI).
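
The summary statistics reported in Table 3 can be illustrated with a minimal Python sketch that counts haplotypes (h) and private haplotypes (ph) and computes haplotypic diversity (H). It assumes that H is the standard unbiased estimator H = n/(n − 1) × (1 − Σ p_i²), which is not stated explicitly in the text. The function names and the toy data are hypothetical and are not the study's data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotypic diversity: H = n/(n-1) * (1 - sum of squared frequencies).

    ASSUMPTION: this is the estimator behind column H; the text does not say so.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are needed")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def private_haplotypes(samples_by_area, area):
    """Haplotypes found in `area` and in no other area."""
    own = set(samples_by_area[area])
    others = set()
    for name, haps in samples_by_area.items():
        if name != area:
            others.update(haps)
    return own - others


# Hypothetical toy data (NOT the study's data): one haplotype label per sequence.
toy = {
    "area_1": ["A", "A", "B", "C"],
    "area_2": ["A", "A", "A", "A"],
}

for name, haps in toy.items():
    n = len(haps)
    h = len(set(haps))
    ph = len(private_haplotypes(toy, name))
    H = haplotype_diversity(haps)
    print(f"{name}: n={n} h={h} ph={ph} H={H:.3f}")
```

Under this estimator, an area in which every sequence carries the same haplotype has H = 0, which matches the zero diversity reported for the cpDNA data of the Iberian System (h = 1).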

In the SAMOVA analysis of cpDNA sequences for K = 5 groups, all Cantabrian populations were included in a single group, while a strong geographical structure was found in the Central System (Figs 4a, S2), with four groups corresponding to the four geographical areas (ES, GA, GR and GU). Populations from the Iberian System were grouped with the easternmost area GU of the Central System. When analysing At103 sequences, no clear geographical structure was found (Figs 5a, S2; see Notes S2 for details). Values of the Snn statistic (Table S4) were fully congruent with SAMOVA analyses.

Divergence times of cpDNA lineages

The relaxed molecular clock analysis of cpDNA sequences (Fig. S3a) estimated a divergence time between L. nigricans and L. elegans in the middle Pliocene to middle Pleistocene (0.89–3.79 Ma). Diversification of L. elegans haplotype lineages occurred during the Pleistocene, with haplotype A diverging at least 420 kyr bp.

Discrete phylogeographical analyses

The maximum clade credibility trees of the DPAs displayed a high uncertainty on the location of ancestors (results not shown). However, the BF analyses supported several different rates (BF > 3; Fig. 6a). Three connections were strongly supported by both cpDNA and At103 sequences: GR–GU, GU–NI and GU–SI. The three areas within the Cantabrian region were connected by either cpDNA or nuclear sequences. The only supported connection between the Cantabrian region and the Central System was the western connection GP–GA supported by cpDNA. No connection was supported between the Iberian System and the Cantabrian region.
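
How a Bayes factor threshold such as BF > 3 is applied to a connection can be illustrated with a minimal Python sketch. It assumes the generic definition of a Bayes factor as the ratio of posterior odds to prior odds that a given rate is included in the model. The inclusion probabilities below are hypothetical values chosen for illustration; they are not results of the DPAs, and the prior actually used by the software is not stated here.

```python
def bayes_factor(posterior_prob, prior_prob):
    """Bayes factor for including a rate: posterior odds divided by prior odds."""
    for p in (posterior_prob, prior_prob):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return posterior_odds / prior_odds


BF_THRESHOLD = 3.0  # support threshold used in the text (BF > 3)

# HYPOTHETICAL inclusion probabilities for two connections (illustrative only).
prior = 0.25
examples = {"connection_x": 0.60, "connection_y": 0.30}

for name, posterior in examples.items():
    bf = bayes_factor(posterior, prior)
    print(f"{name}: BF = {bf:.2f}, supported = {bf > BF_THRESHOLD}")
```

In this sketch a connection counts as supported only when the data raise the odds of its inclusion more than threefold relative to the prior.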

Figure 6.

Results of Bayesian phylogeographical analyses based on cpDNA and nuclear (At103) sequences of Linaria elegans. (a) Results of the discrete phylogeographical analyses (DPAs). The nine geographical areas are delimited: Cantabrian Mountains (CM), Galician-Portuguese Mountains (GP), Atlantic lowlands (AL), Serra da Estrela (ES), Sierra de Gata (GA), Sierra de Gredos (GR), Sierra de Guadarrama (GU), Northern Iberian System (NI) and Southern Iberian System (SI). Lines represent spread routes supported by Bayes factors (BF > 3) in DPAs of cpDNA (red) and nuclear At103 (blue) sequences. Bayes factor values are shown. The three major populations delimited for IMa2 analyses are also indicated. (b) Relationships among L. elegans populations from the nine geographical areas, as inferred from the coalescent-based *beast analysis (see Fig. S3b for the time-calibrated chronogram). Numbers above branches are Bayesian posterior probabilities. The three major populations delimited for IMa2 analyses are indicated. (c) Summary of the isolation with migration models obtained in IMa2, based on the joint analysis of cpDNA and At103 sequences. Results obtained using three alternative tree topologies (left) are shown. Migration rates statistically supported by the test of Nielsen & Wakeley (2001) are indicated as red arrows on the trees (*, P < 0.05; **, P < 0.01; note that directionality of migration events is interpreted in the coalescent direction, i.e. backwards in time). Plots represent the marginal posterior probability distributions for splitting times t0 and t1 (as defined in each topology). European Quaternary stages are indicated, with glacial periods in grey and interglacials in white (Silva et al., 2009).

Coalescent-based analyses

Area relationships estimated in *beast (Figs 6b, S3b) strongly supported common ancestry for the AL, GP and CM populations (henceforth N; PP = 0.98), and also for the GR, GU, NI and SI populations (henceforth SE; PP = 1). Common ancestry of GA and ES populations (henceforth SW) was weakly supported (PP = 0.59). The three full isolation with migration models obtained in IMa2 are summarized in Fig. 6(c) and Table 4. All three models estimated a split between N and SE populations older than the last glaciation (> 100 kyr bp; Fig. 6c; Table 4), and supported significant migration (according to the test of Nielsen & Wakeley, 2001) between N and SW populations. Migration between SW and SE populations was supported by two of the three models. However, AIC values (Table 5) favoured the model without migration rates as the best-fit model for the three topologies, followed by the model without migration between N and SE populations.

Table 4. Estimated split times and effective population sizes estimated from the three isolation with migration (IMa2) models (see Fig. 6c)

Topology  Parameter  Estimate          95% HPD interval
1         t0         164.1 ka          16.2 ka–2.5 Ma
          t1         395.0 ka          214.6 ka–3.6 Ma
          qN         5.13 × 10^5 ind.  2.74 × 10^5–9.28 × 10^5 ind.
          qSW        2.74 × 10^5 ind.  9.13 × 10^4–7.50 × 10^5 ind.
          qSE        4.09 × 10^5 ind.  2.15 × 10^5–7.32 × 10^5 ind.
          qA         7.54 × 10^5 ind.  3.82 × 10^5–2.25 × 10^6 ind.
2         t0         319.3 ka          99.2 ka–2.7 Ma
          t1         3.6 Ma            405.8 ka–3.6 Ma
          qN         5.45 × 10^5 ind.  2.83 × 10^5–1.04 × 10^6 ind.
          qSW        2.65 × 10^5 ind.  9.58 × 10^4–6.69 × 10^5 ind.
          qSE        4.21 × 10^5 ind.  2.36 × 10^5–7.38 × 10^5 ind.
          qA         8.78 × 10^5 ind.  3.73 × 10^5–2.25 × 10^6 ind.
3         t0         413.0 ka          52.3 ka–2.9 Ma
          t1         3.5 Ma            239.9 ka–3.6 Ma
          qN         5.17 × 10^5 ind.  2.76 × 10^5–9.77 × 10^5 ind.
          qSW        2.72 × 10^5 ind.  9.81 × 10^4–6.98 × 10^5 ind.
          qSE        4.02 × 10^5 ind.  2.27 × 10^5–6.84 × 10^5 ind.
          qA         8.26 × 10^5 ind.  3.73 × 10^5–2.25 × 10^6 ind.

  1. Values with the highest probability after smoothing and 95% highest posterior density (HPD) intervals are reported. Split time parameters t0 and t1 are defined as in Fig. 6(c). Parameters qN, qSW, qSE and qA, respectively, denote the effective sizes of populations N, SW, SE (as defined in Fig. 6) and the ancestral population of the whole species.

Table 5. Analysis of nested models in IMa2 (Hey, 2010) using three alternative topologies (Fig. 6c)

Topology  Model                                         Terms  log(P)  AIC      mN > SW   mSW > N   mN > SE   mSE > N   mSW > SE  mSE > SW
1         All migration parameters                      8      9.938   −3.876   0.427     0.000     0.000     0.000     0.427     0.107
          All migration rates 0                         0      9.938   −19.876  [0.000]   [0.000]   [0.000]   [0.000]   [0.000]   [0.000]
          All migration rates across the Duero Basin 0  2      5.498   −6.996   [0.000]   [0.000]   [0.000]   [0.000]   0.000     0.075
          Migration rates in the East 0                 4      9.815   −11.630  0.427     0.000     [0.000]   [0.000]   0.427     0.113
          Migration rates in the West 0                 4      9.137   −10.274  [0.000]   [0.000]   0.427     0.000     0.277     0.348
2         All migration parameters                      8      9.993   −3.986   0.085     0.000     0.000     0.000     0.427     0.000
          All migration rates 0                         0      9.993   −19.986  [0.000]   [0.000]   [0.000]   [0.000]   [0.000]   [0.000]
          All migration rates across the Duero Basin 0  2      −5.153  14.306   [0.000]   [0.000]   [0.000]   [0.000]   0.000     0.216
          Migration rates in the East 0                 4      9.544   −11.088  0.364     0.000     [0.000]   [0.000]   0.218     0.000
          Migration rates in the West 0                 4      6.068   −4.136   [0.000]   [0.000]   0.068     0.000     0.000     0.331
3         All migration parameters                      8      10.030  −4.060   0.084     0.076     0.427     0.000     0.427     0.000
          All migration rates 0                         0      10.030  −20.060  [0.000]   [0.000]   [0.000]   [0.000]   [0.000]   [0.000]
          All migration rates across the Duero Basin 0  2      −6.562  17.124   [0.000]   [0.000]   [0.000]   [0.000]   0.000     0.094
          Migration rates in the East 0                 4      9.609   −11.218  0.427     0.000     [0.000]   [0.000]   0.199     0.254
          Migration rates in the West 0                 4      6.869   −5.738   [0.000]   [0.000]   0.419     0.000     0.427     0.400

  1. For each model, log-likelihood value, Akaike information criterion (AIC) value and estimates of migration rates (m) between major populations (N, SW, SE) are shown. Zero values in brackets represent migration rates fixed to zero in each nested model. Migration from and to ancestral population A (see Fig. 6c) was fixed to zero in all nested models (not shown).
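
The AIC values in Table 5 appear consistent with the usual definition AIC = 2k − 2 log(P), where k is the value in the Terms column. The following minimal Python sketch recomputes and ranks the topology 1 models under that assumption. The function and variable names are illustrative, and only values from Table 5 are used.

```python
def aic(n_terms, log_p):
    """Akaike information criterion, assuming AIC = 2k - 2*log(P)."""
    return 2 * n_terms - 2 * log_p


# Topology 1 rows of Table 5: (model, terms, log(P), reported AIC)
topology_1 = [
    ("All migration parameters", 8, 9.938, -3.876),
    ("All migration rates 0", 0, 9.938, -19.876),
    ("All migration rates across the Duero Basin 0", 2, 5.498, -6.996),
    ("Migration rates in the East 0", 4, 9.815, -11.630),
    ("Migration rates in the West 0", 4, 9.137, -10.274),
]

# Lower AIC indicates a better trade-off between fit and number of terms.
ranked = sorted(topology_1, key=lambda row: aic(row[1], row[2]))

for model, k, log_p, reported in ranked:
    computed = aic(k, log_p)
    # Check that the recomputed value agrees with the reported one.
    assert abs(computed - reported) < 1e-6
    print(f"{computed:8.3f}  {model}")
```

Under this definition, the model with all migration rates fixed to zero ranks first for topology 1, followed by the model without migration between N and SE populations ("Migration rates in the East 0").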


Linaria elegans has long been recognized as a distinct taxonomic entity based on morphological characters (Viano, 1969; Sáez & Bernal, 2009). The well-supported monophyly of L. elegans recovered in the analysis of cpDNA sequences (Fig. 4b) corroborates this view, and contrasts with the absence of monophyly obtained for several other species of Linaria sect. Versicolores (Fernández-Mazuecos & Vargas, 2011). This phylogenetic difference between L. elegans and other Versicolores species may have resulted from the old divergence of L. elegans from L. nigricans (0.89–3.79 Ma), and from the low likelihood of recent hybridization between these two species owing to their geographical isolation. Nevertheless, the high number of additive polymorphic sites in ITS sequences of L. elegans (Table S3) might suggest some signal of hybridization (Fuertes Aguilar & Nieto Feliner, 2003; Paun et al., 2009; Abbott et al., 2010), although this possibility does not appear to affect our reconstruction of recent isolation and migration events.

Modelling late Quaternary paleodistribution

The result of the SDM fitted to current conditions (Fig. 3a) closely resembled the current ring-like distribution pattern of L. elegans (Fig. 2), except for certain peripheral areas, such as Sierra Nevada and the Pyrenees, where L. elegans is absent in spite of their environmental suitability. The latter mismatch derives from the fact that SDMs do not consider information on species history and assume species–climate equilibrium, namely that a species occupies all environmentally suitable areas (Nogués-Bravo, 2009). The projection of the distribution model to the LIG also yielded a ring-shaped distribution (Fig. 3b), thus meeting the first assumption of the three proposed hypotheses. However, the hypotheses were unequally supported by projections to the LGM (see diagrams in Fig. 1). Specifically, neither projection clearly supported hypothesis II, which considered a lowland ring-like distribution during ice ages as a result of survival by altitudinal-descent migration to several isolated refugia. The restricted distribution inferred under the CCSM model (Fig. 3c) included putative glacial refugia in western areas (AL, GP, ES, GA and GR), thus supporting hypothesis III. However, the widespread distribution recovered under the MIROC model (Fig. 3d) included a fairly continuous range across the plateau, as postulated by hypothesis I. Therefore, SDMs alone were not sufficient to determine which of the three hypotheses constitutes the best description of the Quaternary history of L. elegans.

The fact that the Sierra Nevada mountains (south-eastern Spain) appear to have harboured suitable habitats for L. elegans since the last interglacial (Fig. 3), yet no populations have been reported to date, is interesting from a biogeographical perspective. A shared climatic and biogeographical history of Sierra Nevada, the Central System and the Cantabrian Mountains is suggested by several Iberian endemics with a disjunct distribution, such as Campanula herminii (Sáez & Aldasoro, 2001), Senecio boissieri (Peredo et al., 2009), Carex furva (Luceño, 2008) and Gentiana boryi (Renobales, 2012). In the case of L. elegans, the fact that its sister species (L. nigricans) is distributed in lowlands of south-eastern Spain, not far from Sierra Nevada, seems intriguing. It can be speculated that an ancestor of L. elegans may have been distributed in Sierra Nevada before becoming extinct in this area, although this hypothesis cannot be tested without fossil evidence.

Phylogeographical hypothesis testing

The analysis of multi-locus data in a coalescent-based framework has the potential to provide insights into the demographic history of populations, including changes in population size, split times and migration rates (Heled & Drummond, 2008; Hey, 2010). This approach is particularly meaningful for the rigorous statistical testing of alternative phylogeographical hypotheses (Dépraz et al., 2008; Díaz-Pérez et al., 2012; Provan & Maggs, 2012).

Different estimates of divergence times were obtained from the dating analysis of cpDNA sequences and the coalescent-based *beast and IMa2 analyses, which incorporated both nuclear and plastid sequence data (Figs S3, 6c). Divergence times of disjunctly distributed cpDNA sequence lineages (Fig. S3a) were estimated to be older than the split of the corresponding populations, as expected based on coalescent theory (Knowles & Maddison, 2002). When estimating population divergence times in a coalescent framework, more recent estimates were obtained from the *beast analysis (Fig. S3b) than from the IMa2 analysis (Fig. 6c), which was also expected as the former does not account for migration among populations, while the latter does (Heled & Drummond, 2010; Hey, 2010). Given that predictions of our three hypotheses include both divergence times and migration among populations (Fig. 1), the isolation with migration analysis is the most appropriate for our hypothesis testing.

Hypothesis I postulates admixture of L. elegans populations across the Duero basin during the LGM as a result of altitudinal-descent migration (as found in Pinus sylvestris by Robledo-Arnuncio et al., 2005). Accordingly, it predicts a split between populations from different mountain ranges post-dating the LGM (Fig. 1). At first glance, this hypothesis might seem to be supported by the At103 haplotype network given the presence of widespread haplotypes (Fig. 5a). However, hypothesis I can be rejected on the basis of our isolation with migration models. In the three IMa2 models using different topologies (Fig. 6c), the major split between Cantabrian region populations and eastern Central System/Iberian System populations (N vs SE; see Fig. 6) was estimated to have occurred before the LGM (26.5 to 19–20 kyr bp; Clark et al., 2009) and the entire last glacial period (Silva et al., 2009; see Fig. 6c). In fact, the peak of the posterior probability distributions of N–SE split times (not considering SW) was placed around the Riss and Mindel glaciations when using topologies 1 and 2, while more uncertainty was recovered when employing topology 3 (Fig. 6c). Post-divergence migration between northern and south-eastern populations was unsupported by either the full IMa2 models or the testing of nested models (Table 5).

Hypothesis II postulates splits between populations from different mountain ranges older than the LGM. The estimated split across the Duero basin (N vs SE) agrees with this prediction. However, the close relationship (Fig. 6b) and strong signal of recent gene flow (Fig. 6a) between Central System and Iberian System populations disagree with hypothesis II. The linear spread pattern found between the Central System and the Iberian System (Fig. 6a) indicates that hypothesis II does not provide a good explanation for the Quaternary history of L. elegans (Fig. 1). The evidence for migration between southwestern populations and adjacent areas recovered from the IMa2 models (Fig. 6c) is also inconsistent with long-term isolation, as postulated by hypothesis II.

Hypothesis III, by contrast, found definite support in our phylogeographical results. The old divergence between northern and south-eastern areas (Fig. 6c) is consistent with the existence of independent and isolated Quaternary refugia on both sides of the Duero basin. A post-glacial colonization of the Iberian System from the Central System is strongly suggested by the close relationship between populations from both regions (Fig. 6b), together with the spread routes between these areas statistically supported by plastid and nuclear DNA-based DPA (Fig. 6a). The relationships and timing of divergence of south-western populations (SW; Fig. 6) remain uncertain. Nevertheless, the three IMa2 models supported post-divergence migration from Cantabrian populations to this region (Fig. 6c), which is in agreement with DPA results (Fig. 6a). Fine-scale location of glacial refugia is suggested by the geographical distribution of genetic diversity and by the genetic structure revealed by SAMOVA analyses (see Notes S2 for additional discussion).

Although phylogeographical investigations on glacial refugia and post-glacial recolonization are usually restricted to the last cycle of climatic fluctuations (Hewitt, 2004), our coalescent-based models (Fig. 6c) suggest an older history of divergence between L. elegans populations. The estimated split times imply that the isolation between the northern and south-eastern refugia dates back to the Riss or even the Mindel glaciation, and persisted throughout at least the Eemian interglacial and the Würm glaciation. The persistence of an old genetic structure throughout several Quaternary climatic cycles has been previously suggested for a lizard species in this geographical context (Paulo et al., 2001). Population genetic analyses of Mediterranean plants in an isolation-with-migration framework are still scanty (Jaramillo-Correa et al., 2010). Nevertheless, studies from other biogeographical regions have frequently reported an old divergence (> 100 kyr bp) between geographically structured intraspecific lineages, which can be interpreted as recurrent survival in disjunct refugia through several Quaternary climatic cycles (Eckert et al., 2008; Ikeda et al., 2009; Ribeiro et al., 2010; Gutiérrez-Rodríguez et al., 2011; Breen et al., 2012; see also Table S5). In the case of L. elegans, the observed genetic divergence may have resulted from recurrent survival events in several isolated refugia (north and south of the Duero basin), followed by recolonization of the mountain ring during interglacial periods (without admixture of the colonizing lineages), and a certain degree of gene flow between northern and southern populations only in the west. A similar pattern of genetic divergence between Cantabrian and Central System populations has been reported for Saxifraga pentadactylis (Vargas, 2003) and Senecio boissieri (Peredo et al., 2009). 
Further studies involving species with a similar ring-like distribution pattern in north-western Iberia, such as Rumex suffruticosus (López, 1990), Luzula lactea and Luzula caespitosa (Fernández Piedra & Talavera, 2010), may enable the identification of congruent phylogeographical patterns. Additionally, the study of species with a range similar to the potential distribution of L. elegans (including Sierra Nevada; e.g. Campanula herminii, Carex furva and Gentiana boryi) may shed light on biogeographical connections between Sierra Nevada, the Cantabrian Mountains and the Central System mountains (see Peredo et al., 2009 for an example in Senecio boissieri).

Phylogeographical evidence supports paleoclimatic modelling

Incongruence between the two LGM projections of L. elegans paleodistribution probably derives from the different assumptions and methods of the CCSM and MIROC simulations, which generally result in a stronger temperature decrease under CCSM than under MIROC (Alba-Sánchez et al., 2010; Habel et al., 2010). Indeed, the three most relevant variables for the L. elegans model display remarkable differences in the study area between the CCSM and MIROC simulations (particularly bio6, the minimum temperature of the coldest month; see Fig. S4). Significant differences between projections obtained under different paleoclimatic models are frequently found (Oláh-Hemmings et al., 2010; Garcia-Porta et al., 2012; Rebelo et al., 2012), and they are sometimes handled by calculating an average or consensus model (Waltari et al., 2007; Abellán et al., 2011; Flanders et al., 2011). However, this approach does not seem appropriate in the presence of strong incongruence. Instead, independent biological data, such as those derived from phylogeographical or paleontological studies, are needed to validate the models.

Our integration of two sources of evidence (SDM and phylogeography) has enabled a reconstruction of the putative evolutionary and biogeographical history of L. elegans during the late Quaternary (Fig. 7). In this reconstruction, we have favoured the CCSM simulation of the LGM climate over the MIROC simulation, because the CCSM-based model showed paleodistribution patterns more consistent with DNA-based phylogeographical results. If cycles of vast expansion across the northern Iberian plateau had occurred during ice ages, as suggested by the MIROC model, the old split between northern and south-eastern populations without subsequent significant migration (Fig. 6c) would not be expected. This latter phylogeographical pattern supports restricted glacial refugia (hypothesis III), as inferred by the CCSM model. Nevertheless, a certain degree of altitudinal migration within each refugium (as in hypothesis II) cannot be completely ruled out (Kropf et al., 2012).

Figure 7.

Reconstruction of the late Quaternary history of Linaria elegans based on distribution modelling and phylogeographical analyses. Geographical areas are named as in Fig. 2. Major populations (see Fig. 6) are delimited by black dotted lines. The ancient split between northern and south-eastern populations is represented by a white dotted line, and the splitting time estimated by the isolation with migration models is indicated. Areas inferred as continuously suitable under last interglacial (LIG), last glacial maximum (LGM; CCSM model) and current climatic conditions (putative ‘long-term refugia’) are shaded in red. Areas suggested as refugia during the LGM are underlined. Solid arrowed lines represent routes of post-glacial colonization. Dotted arrowed lines represent additional connections between areas.

The survival of populations in refugia is as important during warm periods as it is during cold periods to ensure the long-term persistence of a species (Bennett et al., 1991). Five areas (GP, AL, ES, GA and GR) may be hypothesized as ‘long-term refugia’ (Stewart et al., 2010) of L. elegans, as indicated by the observation that they harboured suitable habitats for the species throughout at least the last climatic cycle (Fig. 7). Additionally, GU may have harboured the species during the LGM (see the small area in Fig. 3c) after having been colonized from GR. Based on fossil and phylogeographical evidence, other European mountains have been proposed as long-term refugia for different plant species, including the Pyrenees, the Alps, the French Massif Central, the Apennines, the Balkans and the Carpathians (Bennett et al., 1991; Liepelt et al., 2009; Kropf et al., 2012; Schmickl et al., 2012), as well as different mountain ranges in the Iberian Peninsula, such as the Central System and the Cantabrian Mountains (Gómez & Lunt, 2006).

From the long-term refugia of L. elegans, two main routes of post-glacial recolonization can be suggested, one through the northern range (from GP to CM), and the other across central Iberia (from GU to the Iberian System), giving rise to the currently observed distribution. Admixture between the two colonizing lineages in the north-east is not evidenced, as shown by the lack of significantly supported migration between northern and southern populations in the isolation with migration models. Such admixture would represent the next natural step to close the mountain ring, possibly giving rise to a secondary contact, but it may have been hindered by the dominant calcareous substrates of this area. By contrast, our results support significant post-divergence gene flow between western Central System (ES and GA) and northern populations, which is not surprising given the fact that these areas are connected by neighbouring patches of long-term suitable habitats (Fig. 7).

The late Quaternary history of L. elegans does not fit classical models of glacial survival and post-glacial recolonization of mainly temperate taxa in Europe (Comes & Kadereit, 1998; Hewitt, 2000), inasmuch as a simple model of latitudinal shift towards warmer, southern regions during ice ages is not supported. Survival during this period through altitudinal shifts alone is not supported either. Instead, consistent refugia were found in western Iberian locations, which benefited from Atlantic climatic buffering. This suggests that the oceanic-continental gradient might have played a key role in determining the location of Quaternary refugia of certain species (Stewart et al., 2010). This usually longitudinal oceanic-continental axis has been frequently ignored in biogeography, but its role in shaping species ranges and diversity is increasingly being recognized (Stewart et al., 2010; Cohen et al., 2011; Conord et al., 2012). Our results therefore support the emerging complexity of Mediterranean peninsulas as glacial refugia (Gómez & Lunt, 2006; Schmitt, 2007; Médail & Diadema, 2009; Nieto-Feliner, 2011). Further analyses are needed to determine whether the oceanic-continental gradient has significantly shaped the Quaternary history of Mediterranean plant species.

Concluding remarks

The results showed that the integration of new methodologies for species distribution modelling and intraspecific phylogeography can enable faithful reconstructions of the late Quaternary history of plant species. This integrative approach is especially helpful as information from one source may help validate the patterns revealed by the other. No investigation conducted to date has validated any of the available simulations of LGM climate using independent biological data. The present phylogeographical results suggest that the CCSM simulation might be more suitable than the MIROC simulation. Further research on additional taxa will determine whether the CCSM model consistently produces more reliable results, thus reducing the uncertainty of distribution models projected to the LGM.


The authors thank David Orgaz, Fidel Fernández-Mazuecos, Alberto Fernández-Mazuecos, Belén Estébanez, Alberto Bañón, Enrique Sánchez-Gullón, Bernardo García and José Luis Blanco-Pastor for field assistance; Jaime Güemes for plant material; Javier Amigo, Carlos Molina, José Luis Benito, Juan José Sánchez and Gonzalo Mateo for assistance in population finding; Emilio Cano, Gemma Andreu, Fátima Durán and Guillermo Sanjuanbenito for laboratory assistance; Jesús Muñoz, Juan Antonio Calleja and Beatriz Vigalondo for advice and comments on distribution modelling; José Luis Blanco-Pastor and Isabel Liberal for assistance with molecular analyses and insightful discussion; editor Richard Abbott and four anonymous reviewers for their comments and suggestions, which greatly improved the quality of the manuscript. This research was supported by the Spanish Ministry of Science and Innovation through project CGL2009-10031, and by the Spanish Ministry of Education through a FPU fellowship (AP2007-01841) to M.F-M.