Patterns of polymorphism resulting from long-range colonization in the Mediterranean conifer Aleppo pine

Authors

  • Delphine Grivet,

    1. Department of Forest Systems and Resources, Forest Research Institute, CIFOR-INIA, Carretera de la Coruña km 7.5, ES–28040 Madrid, Spain
    • These authors contributed equally to this work.

  • Federico Sebastiani,

    1. Department of Agricultural Biotechnology, Genexpress, University of Florence, Via della Lastruccia 14, I–50019 Sesto Fiorentino (FI), Italy
    2. Plant Genetics Institute, Division of Florence, National Research Council, via Madonna del Piano 10, I–50019 Sesto Fiorentino (FI), Italy
    • These authors contributed equally to this work.

  • Santiago C. González-Martínez,

    1. Department of Forest Systems and Resources, Forest Research Institute, CIFOR-INIA, Carretera de la Coruña km 7.5, ES–28040 Madrid, Spain
  • Giovanni G. Vendramin

    1. Plant Genetics Institute, Division of Florence, National Research Council, via Madonna del Piano 10, I–50019 Sesto Fiorentino (FI), Italy

  • Original sequences have been deposited with the EMBL/GenBank Data Libraries under accession numbers FJ588494–FJ588503.

Author for correspondence:
Santiago C. González-Martínez
Tel: +34 913471499
Email: santiago@inia.es

Summary

  • The evolutionary outcomes of range expansion/contraction depend on the biological system considered and the interactions among the evolutionary forces in place. In this study, we examined the demographic history and the local polymorphism patterns of candidate genes linked to drought tolerance of a widespread Mediterranean conifer (Pinus halepensis).
  • To that end, we used cpSSRs and coalescence modelling of nuclear genes to infer the demographic history of natural populations covering the species range. Ten drought-response candidate genes were then examined for their patterns of polymorphism and tested for selection considering plausible demographic scenarios.
  • Our results revealed a marked loss of genetic diversity from the relictual Greek population towards the western range of the species, as well as molecular signatures of intense bottlenecks. Moreover, we found an excess of derived polymorphisms in several genes sampled in the western part of the range – a potential result of the action of natural selection on populations confronted with new environments following long-range colonization.
  • Wide-range expansions–contractions of forest trees are accompanied by strong selective pressures, resulting in distinct evolutionary units. This knowledge is of crucial importance for the conservation and management of forests in the face of climate change.

Introduction

Predicting how demographic and selective processes interact during colonization and affect the adaptive evolution of a given species is a question of central importance in evolutionary and conservation genetics that remains a daunting challenge. For example, it is expected that populations that have been through bottlenecks or founder effects during colonization will exhibit genetic erosion (Nei et al., 1975; Hewitt, 2000); also, declining genetic diversity with increasing distance from the source population has been reported in several study cases (Grivet and Petit, 2003; Ramachandran et al., 2005; Eckert et al., 2008). However, that does not necessarily mean that expanding populations are genetically depleted or that refugial populations always harbour the highest diversity, as has been shown in other studies (e.g. Petit et al., 1999, 2003; Comps et al., 2001; Walter & Epperson, 2001). During range expansion, populations will be confronted with selection and adaptation to different environments as well as to new conspecific and heterospecific interactions (Hewitt, 2000). Whether the mechanisms that generate different genetic patterns in expansion and refugial core populations are able to influence the adaptive potential of a given species will depend on the interplay among all the evolutionary forces acting upon it. In the present study, we examined the genetic consequences of long-range colonization of a widespread Mediterranean conifer.

Aleppo pine (Pinus halepensis) is a conifer that lives in contrasting environments, growing on all substrates and almost all bioclimates of the Mediterranean region, thus reflecting high population adaptability. It is then remarkable that, in large parts of its range, various genetic studies have shown low levels of genetic diversity both in neutral molecular markers and in quantitative traits (Farjon, 1984; Schiller et al., 1986; Bucci et al., 1998; Climent et al., 2008). In addition, neutral genetic diversity in this species appears to be structured along a longitudinal gradient, with eastern Mediterranean populations harbouring greater diversity and more ancient lineages than western ones (Bucci et al., 1998; Morgante et al., 1998; Gómez et al., 2001). These genetic results, combined with common garden experiments pointing to Greece as an outlier (Climent et al., 2008), data on flavonoid content (Barbéro et al., 1998) and few available palynological and dendro-archaeological records (Schiller et al., 1986; Pons, 1992), suggest a scenario in which Aleppo pine populations expanded westwards all around the Mediterranean basin from some refugia located in the southern part of the Balkan Peninsula. This particular population dynamics of Aleppo pine (i.e. long-range colonization probably accompanied by recurrent contractions–expansions caused, for example, by forest fires), combined with its current wide ecological range and scattered distribution across the Mediterranean basin, makes this species especially suitable to study the impact of demographic and genetic processes, as well as their interactions, on the adaptive potential of tree populations.

Screening an extensive number of markers for outliers (i.e. the ‘population genomics’ approach; Luikart et al., 2003) is a recent approach generally used to understand the evolutionary forces at the origin of adaptation, and how these forces shape the spatial–temporal pattern of variability within and between populations. Indeed, all loci of a genome will be affected similarly by demographic processes, while only a subset of them will be influenced by selection (but see Begun et al., 2007 – the first population genetics study based on whole-genome analysis) and depart from the average loci. Thus, looking at outlier loci among an extensive number of markers allows detection of loci subject to selection (see Namroud et al., 2008 for a recent example in a conifer). In nonmodel species with complex demographic history (i.e. not fitting the simple models often assumed in outlier detection methods), such as Aleppo pine, for which only limited genetic resources are available, an alternative approach to outlier loci detection would be the combination of inferences from putatively neutral markers, such as chloroplast or nuclear microsatellites, which are theoretically affected by demographic history but not by selection, and from candidate genes coding for known biological functions, potentially correlated with adaptive traits, such as phenology (Heuertz et al., 2006; Pyhäjärvi et al., 2007), drought tolerance (González-Martínez et al., 2006; Pyhäjärvi et al., 2007; Eveno et al., 2008) or cold resistance (Wachowiak et al., 2009).

Tree species in the Mediterranean climate are facing specific environmental constraints, among them drought. Aleppo pine is well adapted to drought, although this factor can still constitute an important threat to individual growth and survival at all life stages (Sathyan et al., 2005). In the context of climate change in the Mediterranean area, and its consequences on the adaptive potential of Mediterranean vegetation, it is essential to understand the molecular bases underlying adaptation of populations along colonization gradients and how they are affected by population expansions and contractions. Moreover, from a practical point of view, identifying populations harbouring the genotypes best adapted for drought is essential not only for afforestation in current zones of extreme climatic conditions, such as in Israel (Grunwald et al., 1986) and in Morocco (Boulli et al., 2001) in the case of P. halepensis, but also in the light of future climatic changes, as most of the Mediterranean area is predicted to experience a significant increase in temperatures and aridity in the near future (Petit et al., 2005). The widespread distribution of Aleppo pine suggests that its populations may be differentially adapted to local climatic conditions and may show different capacities to survive water-deficit stress (WDS). This view is reinforced by various studies showing differences in drought tolerance among provenances using ecophysiological data and gene expression (Atzmon et al., 2004 and references therein; Sathyan et al., 2005; Voltas et al., 2008). Consequently, genes associated with traits related to stress tolerance (e.g. cellular resistance to low water potentials or high secondary compound production) have probably been main targets for natural selection in Aleppo pine and therefore constitute relevant candidates for testing the effects of long-range colonization and environment heterogeneity on adaptive patterns in this species.

In this study, we used a set of neutral chloroplast microsatellites (cpSSRs) to screen representative natural populations covering the Aleppo pine range and distributed along a longitudinal gradient. Different methods based on cpSSRs (F-statistics, samova and mismatch distributions) were used to investigate the geographic structure of diversity and to infer the demographic history of Aleppo pine populations. Populations were then examined for their local polymorphism patterns at 10 candidate genes related to drought tolerance to: (1) describe geographical patterns of functional nucleotide variation across the Aleppo pine range; (2) infer the past demographic history of this species using coalescent-based modelling at the local scale, by simulating numerous demographic scenarios (in particular, expansion after bottleneck models, as suggested by cpSSR analyses) to determine which one best fitted our empirical data; and (3) examine whether different selective pressures for drought-response candidate genes exist along the longitudinal gradient sampled (covering putative refugial and expansion zones), by comparing the patterns of nucleotide variation observed at each locus with those expected under best-fitting demographic models. Overall, this study contributes to the identification of the evolutionary forces shaping polymorphism patterns in the widespread Mediterranean conifer Aleppo pine.

Materials and Methods

Study species and sampling

Aleppo pine (P. halepensis Mill.) is a species with a scattered distribution occupying extensive areas in the western Mediterranean, while fewer and more disjoint populations are present in the eastern Mediterranean (Fig. 1). This species is well adapted to high-intensity fire regimes (Tapias et al., 2004), as well as dry conditions, and is therefore a species of choice for afforestation in Mediterranean areas. Pinus halepensis coexists with Pinus brutia in its eastern range (Fig. 1) and the two species can intercross but, as they occupy different geographical ranges and bioclimates, hybridization is rare (Bucci et al., 1998).

Figure 1.

 Distribution of Pinus halepensis (grey areas and small dots) and Pinus brutia (grey hatched areas) along with the localities of the six study populations (large circles: Shaharia (Israel), Elea (Greece), Imperia (northern Italy), Aures Beni Melloul (Algeria), Tarrasa (north-eastern Spain), Zaouia Ifrane (Middle Atlas, Morocco)).

In this study, samples were collected from six populations covering the Aleppo pine natural range, from east to west: Shaharia (Israel), Elea (Greece), Imperia (northern Italy), Aures Beni Melloul (Algeria), Tarrasa (north-eastern Spain) and Zaouia Ifrane (Middle Atlas, Morocco) (Fig. 1). Seeds were collected in each population from several trees, separated by at least 50 m to avoid collecting related individuals, and used to screen chloroplast microsatellites and candidate genes related to drought resistance. Samples used for the two sets of markers were not necessarily collected from the same trees, although they were collected from the same populations.

DNA extraction, amplification, genotyping

Genomic DNA from megagametophyte – the haploid maternal tissue surrounding the embryo in conifer seeds – was extracted with the DNeasy Mini Kit (Qiagen). Twelve to twenty-five samples per population (n = 126, cf. Table 1) were used to genotype six chloroplast microsatellite markers: Pt15169, Pt26081, Pt36480, Pt63718, Pt71936 and Pt87268 (Vendramin et al., 1996). Polymerase chain reactions (PCRs) were performed as follows: the 25 μl reaction mix contained 0.2 mM of each dNTP, 2.5 mM MgCl2, 0.2 mM of each primer (the forward primer being labelled with a fluorescent dye, FAM, HEX or TAMRA (Eurofins MWG Operon, Ebersberg, Germany)), 10× reaction buffer (GE Healthcare, Little Chalfont, UK), 25 ng DNA and one unit Taq polymerase (GE Healthcare). The PCR conditions were 5 min at 95°C, 25 cycles of 1 min at 94°C, 1 min at 55°C and 1 min at 72°C, followed by 8 min at 72°C. Genotyping was performed using a 96-capillary MegaBACE 1000 (GE Healthcare) automatic sequencer. Allele sizes were assessed according to the ET400-R size standards (GE Healthcare) using MegaBACE fragment profiler version 1.2 (GE Healthcare).

Table 1.   Population characteristics and genetic diversity for the four polymorphic chloroplast microsatellites

Population           Origin    N    nh   nhp   nhe    Dist.   He
Elea                 Greece    17   11   10    7.05   2.85    0.91
Shaharia             Israel    23   7    3     2.42   0.27    0.61
Imperia              Italy     12   3    2     1.41   0.29    0.32
Tarrasa              Spain     25   5    3     1.40   1.54    0.30
Aures Beni Melloul   Algeria   24   1    0     1      0       0
Zaouia Ifrane        Morocco   25   1    0     1      0       0
North Africa                   49   1    0     1      0       0

  1. Statistics are computed per population and also for the North African populations together.

  2. N, sample size; nh, total number of haplotypes; nhp, total number of private haplotypes; nhe, effective number of haplotypes; Dist., average distance between pairs of individuals; He, unbiased haplotype diversity.

At least eight samples per population (except for locus rd21A-like, see the Supporting Information, Table S1) were used to sequence 10 candidate genes related to drought tolerance identified on the basis of functional studies performed in Pinus taeda and other conifers, or derived from model species such as Arabidopsis thaliana: cpk3 (calcium-dependent protein kinase), sod-chl (Cu : Zn superoxide dismutase, nuclear gene for chloroplast product), dhn-1 (dehydrin), erd3 (early response to drought 3), Aqua-MIP (aquaporin, membrane intrinsic protein), pal-1 (phenylalanine ammonia-lyase 1), lp3-1 (water-stress-inducible protein), rd21A-like (cysteine protease similar to rd21A in Arabidopsis), sams-2 (S-adenosylmethionine synthetase 2) and ug-2_498 (unknown function) (Chang et al., 1996; Richard et al., 2000; see also González-Martínez et al., 2006 and references therein). All PCR products were purified and sequenced from both ends using the DYEnamic ET Dye Terminator Kit and an automatic capillary sequencer MegaBACE 1000 (GE HealthCare). Multiple sequence alignments were done using clustalw (Thompson et al., 1994) and adjusted manually using bioedit (http://www.mbio.ncsu.edu/BioEdit/page2.html).

Statistical analyses

Patterns of neutral genetic diversity at cpSSRs

Total number of haplotypes and private haplotypes (nh and nhp, respectively, direct count), effective number of haplotypes (nhe, Kimura & Crow, 1964), average distance between all pairs of individuals according to a microsatellite stepwise mutation model (Echt et al., 1998; Vendramin et al., 1998), and unbiased haplotype diversity (He, Nei, 1978) were computed for each population using our own spreadsheets. Global FST (Weir & Cockerham, 1984) and global RST (Rousset, 1996) among the six populations, and the pairwise FST estimator were calculated with fstat v. 2.9.3 (Goudet, 2001). In addition, to define the distribution of genetic diversity among populations without explicit a priori definition of population structure, we carried out a spatial analysis of molecular variance (samova; Dupanloup et al., 2002), which defines groups of populations that are geographically homogeneous and maximally differentiated from each other. The significance of the variance components of populations permuted among groups (RCT), of genotypes permuted among populations within groups (RSC) and of genotypes permuted among populations and among groups (RST) was tested by 1000 permutations of individuals for each of the hierarchical levels. We tested K = 2 to K = 6 groups of localities to explore the possible genetic structure. The number of groups was selected according to the highest RCT value using the sum of squared size differences between haplotypes with 100 simulated annealing processes (Dupanloup et al., 2002).
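As an illustration of the per-population haplotype statistics described above, the following minimal sketch computes nh, nhe and He from a list of haplotypes. It assumes the usual forms nhe = 1/Σp² and He = [n/(n − 1)](1 − Σp²), where p is the frequency of each haplotype; the function name and the example data are hypothetical and are not taken from the spreadsheets used in the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Return (nh, nhe, He) for one population.

    haplotypes: list of hashable haplotype labels, one per individual
    (for example, tuples of allele sizes across cpSSR loci).
    Assumed formulas: nhe = 1 / sum(p^2); He = n/(n-1) * (1 - sum(p^2)).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    nh = len(counts)                      # number of distinct haplotypes
    nhe = 1.0 / sum_p2                    # effective number of haplotypes
    he = n / (n - 1) * (1.0 - sum_p2)     # unbiased haplotype diversity
    return nh, nhe, he


# Hypothetical example: four individuals, two haplotypes at equal frequency.
example = [(101, 87), (101, 87), (102, 87), (102, 87)]
print(haplotype_diversity(example))
```

A population fixed for a single haplotype gives nh = 1, nhe = 1 and He = 0 under these formulas.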

Demographic inferences: mismatch distributions at cpSSRs

Mismatch distributions (distribution of the observed number of differences between pairs of haplotypes) were computed using the arlequin version 3.1 software (Excoffier et al., 2005). Chloroplast microsatellites were coded in a binary fashion, following Navascués et al. (2006). These authors also showed, using extensive numerical simulations, that mismatch distributions based on chloroplast microsatellites are sensitive to population growth and therefore suitable to estimate demographic parameters related to population expansion (see also Navascués et al., 2009). Then, Rogers & Harpending’s (1992) demographic model was fitted on the mismatch distributions following Schneider & Excoffier (1999; but see also Excoffier et al., 2005). The goodness-of-fit of the demographic model was tested using 10 000 bootstraps. Mismatch distributions (and fitted demographic models) were examined for each of the six populations except Morocco and Algeria that lacked genetic diversity and were fixed for the same haplotype.
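The observed mismatch distribution itself is a simple tabulation of pairwise differences. The sketch below assumes that haplotypes have already been binary-coded as equal-length strings of 0s and 1s (the coding scheme of Navascués et al. is not reproduced here); the function name and example data are hypothetical, and the model fitting and bootstrap test performed in arlequin are not covered.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(haplotypes):
    """Return the relative frequency of each number of pairwise differences.

    haplotypes: list of equal-length strings (one per individual), assumed
    to be binary-coded already. Element d of the returned list is the
    proportion of pairs that differ at exactly d positions.
    """
    counts = Counter()
    for a, b in combinations(haplotypes, 2):
        if len(a) != len(b):
            raise ValueError("all haplotypes must have the same length")
        counts[sum(x != y for x, y in zip(a, b))] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    return [counts.get(d, 0) / total for d in range(max(counts) + 1)]


# Hypothetical sample of four binary-coded haplotypes.
dist = mismatch_distribution(["000", "000", "001", "011"])
mean_mismatch = sum(d * f for d, f in enumerate(dist))
print(dist, mean_mismatch)
```

A population fixed for one haplotype yields a distribution concentrated entirely at zero differences, which is why no demographic model can be fitted in that case.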

Estimators of population-scaled gene diversity at candidate genes

Gene diversity is controlled by the parameter theta (θ = 4 Neμ, where Ne is the effective population size and μ the per generation mutation rate). Several sample-based estimators of theta (θ) exist, all based on the site-frequency spectrum of the mutations (SFS), that is, the distribution of the proportion of sites where the mutant is at frequency x. In our study, four of them were examined, as defined by Watterson (θw, Watterson, 1975), Tajima (θπ, Tajima, 1989), Fay & Wu (θH, Fay & Wu, 2000) and Zeng et al. (θL, Zeng et al., 2006). These various estimators of population-scaled nucleotide diversity were computed to estimate five summary statistics that were used to, first, infer the demographic history of Aleppo populations (cf. section on Demographic inferences: coalescent simulations at candidate genes), and, second, to perform some neutrality tests aiming at detecting whether some genes were under selection (cf. section on Neutrality tests at candidate genes). The following tests were chosen because, together, they capture most information contained in the SFS and are therefore relevant in inferring the evolutionary processes involved, either demographic or selective (Zeng et al., 2006): (1) Tajima’s D (Tajima, 1989) compares θπ and θw asking about the occurrence of rare and common variants; (2) Fu’s Fs (Fu, 1997) is based on the haplotype (gene) frequency distribution conditional on the value of θπ; (3) Fay and Wu’s H (Fay & Wu, 2000) compares θπ and θH and takes into consideration the abundance of derived high-frequency variants in comparison with intermediate-frequency ones; (4) normalized Fay and Wu’s H (Zeng et al., 2006) is a normalization by the variance of θH and θL; and (5) Zeng et al.’s E (Zeng et al., 2006) compares θw and θL and addresses the relative abundance of very high-frequency and very low-frequency variant classes. 
As Fay and Wu’s H statistics as well as Zeng et al.’s E require the use of outgroups, two genes out of our 10 (sams-2 and ug-2_498) were disregarded, as outgroups were not available for them. Summary statistics were calculated independently for all populations except the two North African ones, which were combined (as they were genetically homogeneous), using scripts kindly provided by S. E. Ramos-Onsins.
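To make the relationship between these estimators concrete, the following sketch computes θw, θπ, θH and θL from an unfolded site-frequency spectrum, together with Tajima's D and the unnormalized Fay and Wu's H. It uses the standard textbook forms of the estimators and is not the set of scripts used in the study; the function name and input format are hypothetical, and the variance terms needed for Hnorm and E (Zeng et al., 2006) are deliberately omitted.

```python
import math


def sfs_statistics(sfs):
    """Compute theta estimators from an unfolded site-frequency spectrum.

    sfs[i-1] is the number of segregating sites at which the derived
    allele is carried by i of the n sampled sequences (i = 1..n-1),
    so the sample size is n = len(sfs) + 1. Outgroup polarization of
    the sites is assumed to have been done beforehand.
    """
    n = len(sfs) + 1
    if n < 3:
        raise ValueError("at least three sequences are required")
    s = sum(sfs)                                   # segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    pairs = n * (n - 1) / 2.0
    theta_w = s / a1
    theta_pi = sum(i * (n - i) * x for i, x in enumerate(sfs, 1)) / pairs
    theta_h = sum(i * i * x for i, x in enumerate(sfs, 1)) / pairs
    theta_l = sum(i * x for i, x in enumerate(sfs, 1)) / (n - 1)
    # Variance constants of Tajima's D (Tajima, 1989).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    var_d = (c1 / a1) * s + (c2 / (a1 ** 2 + a2)) * s * (s - 1)
    tajima_d = (theta_pi - theta_w) / math.sqrt(var_d) if var_d > 0 else float("nan")
    return {
        "theta_w": theta_w,
        "theta_pi": theta_pi,
        "theta_h": theta_h,
        "theta_l": theta_l,
        "tajima_d": tajima_d,
        "fay_wu_h": theta_pi - theta_h,            # unnormalized H
    }


# Hypothetical spectrum for n = 4 sequences: 2 singletons, 1 doubleton, 1 tripleton.
print(sfs_statistics([2, 1, 1]))
```

When the spectrum is proportional to 1/i, as expected under the standard neutral model, all four estimators coincide and both D and H are zero; departures of individual estimators from the others indicate which frequency classes are over- or under-represented.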

Demographic inferences: coalescent simulations at candidate genes

To further infer the past demography of Aleppo pine populations, we simulated numerous demographic scenarios – standard coalescent (i.e. no changes in population size), exponential growth and several bottleneck models – and examined which one best fitted our empirical data. Our approach was similar to that suggested by Richards et al. (2007) in that we used coalescence simulations and summary statistics to test different demographic hypotheses. To do so, we compared the observed distributions of five summary statistics – Tajima’s D, Fu’s Fs, Fay and Wu’s H, normalized Fay and Wu’s H (Hnorm) and Zeng et al.’s E (as defined in the section Estimators of population-scaled gene diversity at candidate genes) – for the 10 candidate genes with those obtained from the different demographic models using the mlcoalsim multilocus coalescent simulator (Ramos-Onsins & Mitchell-Olds, 2007). Coalescent simulations were run independently for each population and for the two North African populations combined (as they were genetically homogeneous): a standard neutral model (SNM) was run first, and when this model was not compatible with our observed data, which was the case for all populations but the Greek one, different models in which population size changed across time were simulated. Because only models consisting of bottlenecks followed by expansion produced simulated values compatible with the observed ones, a more intensive search of the parameter space was done for them. More specifically, we tested times from the end of the bottleneck ranging from 0.0005 to 0.0040 (in units of 4N0 generations from the present) and bottleneck severities (i.e. the magnitude of the reduction in population size) from 0.0100 to 0.0008, keeping the duration of the bottleneck fixed at 0.0015 as in Heuertz et al. (2006).
To simulate population growth after the bottleneck, we used a logistic curve, fixing the shape parameter, ts, to half the growth duration to produce exponential-like growth curves. Ancestral effective population size was assumed to be similar to the current one (N0). More specific details about simulated bottleneck models can be found in Fig. S1. Coalescent simulations were performed with recombination using the average value of 0.00365 per site for the 10 genes obtained from composite-likelihood methods (LDhat; McVean et al., 2002). Watterson’s estimate of population-scaled nucleotide diversity (θw) was used to control for sequence polymorphism in the simulations. When more than one model was compatible with the observed data, likelihood values were computed across all genes and all statistics to obtain an overall likelihood indicating the most compatible model (for details see Ramos-Onsins & Mitchell-Olds, 2007).

Neutrality tests at candidate genes

To test whether some candidate genes were the target of positive selection in P. halepensis populations, five neutrality tests (as defined in the section Estimators of population-scaled gene diversity at candidate genes) were computed: Tajima’s D, Fu’s Fs, Fay and Wu’s H, normalized Fay and Wu’s H (Hnorm) and Zeng et al.’s E. Although under neutrality the different population-scaled diversity estimators (θ) have the same expectation, each of them is sensitive to changes in different parts of the site-frequency spectrum (SFS), thus providing information on the evolutionary forces that have acted on the genes under study (Zeng et al., 2006). Neutrality tests were performed separately for each population (except for the two genetically homogeneous North African populations), as the existence of population structure or differences in demographic history would bias the outcome of these tests. For each test under the best-fitting demographic scenario in each Aleppo pine population (the standard neutral model with recombination for Greece and various bottlenecks followed by expansion models for Israel, Spain, Italy and Morocco–Algeria, see the Results section), 10 000 simulated values were first generated using mlcoalsim (Ramos-Onsins & Mitchell-Olds, 2007). The significance of the tests was then obtained by comparing observed values with the expected distribution obtained from the coalescent simulations.
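The final comparison step can be sketched as an empirical tail probability: the proportion of simulated values that are smaller than or equal to the observed one. The function below is a hypothetical illustration of that idea only; it does not call mlcoalsim, and the simulated values in the example are made up.

```python
def lower_tail_probability(observed, simulated):
    """Proportion of simulated values that are <= the observed value.

    observed: the statistic computed on the real data (e.g. Tajima's D).
    simulated: values of the same statistic from coalescent simulations
    under the chosen demographic model (assumed to be supplied by the
    caller; no simulator is invoked here).
    """
    if not simulated:
        raise ValueError("at least one simulated value is required")
    return sum(1 for v in simulated if v <= observed) / len(simulated)


# Made-up simulated values, for illustration only.
simulated_values = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(lower_tail_probability(0.0, simulated_values))
```

Values of this probability close to 0 or 1 indicate that the observed statistic falls in a tail of the simulated distribution; how the tails map onto significance thresholds is a choice left to the analyst in this sketch.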

Results

Patterns and distribution of neutral genetic diversity (cpSSRs)

Out of six chloroplast microsatellite loci, two were monomorphic (Pt3665 and Pt6375); the four polymorphic fragments resolved 21 haplotypes across the six populations sampled. The Greek population displayed the highest values for all diversity parameters assessed (nh, nhp, nhe, average pairwise distance and He) and a progressive loss of genetic variation for populations located away from Greece was observed: to the east, Israel showed intermediate levels of diversity; to the west, Italian and Spanish populations showed reduced levels of diversity, while North African populations (Morocco and Algeria) were fixed for one common haplotype (Table 1).

When looking at the spatial population structure of the chloroplast markers, a high level of differentiation among populations (FST = 0.394 and RST = 0.255) was observed. Pairwise genetic differentiation estimates indicate that the population in Greece is significantly different from all other populations (P = 0.001), and the Israel population is significantly different from those of Algeria and Morocco (P = 0.010) (Table 2). Morocco and Algeria harbour the same haplotype and are geographically close, thus forming a homogeneous group. The Spain and Italy populations are not differentiated with cpSSRs, but they were still treated separately because they present quite distinct levels of nuclear genetic diversity (cf. section on Nucleotide diversity and natural selection at candidate genes). Israel’s population was studied on its own because it is geographically separate from the western populations and harbours higher levels of within-population genetic diversity (in contrast to the western populations), suggesting a distinct evolutionary history for this population.

Table 2.   Pairwise FST in Pinus halepensis populations based on six chloroplast microsatellites

           Greece    Israel    Italy    Spain    Algeria   Morocco
Greece
Israel     0.38***
Italy      0.43***   0.04
Spain      0.51***   0.05      −0.01
Algeria    0.57***   0.14**    0.06     0.06
Morocco    0.58***   0.15**    0.07     0.06     0.00

  1. Probability obtained after 15 000 permutations and following standard Bonferroni corrections: **P = 0.01; ***P = 0.001.

The samova delimited two groups (K = 2): the Greek population and the pool of all the other populations (Israel, Italy, Spain and Algeria–Morocco) for the highest RCT value of 0.550. In each test of hierarchical structure the values of the among-group component (RCT = 0.540 for K = 2) and that of the among-population and among-group component (RST = 0.550) are similar and much higher than that of the among-population within-groups component (RSC = 0.002). These results highlight that populations outside Greece are genetically alike and, at the same time, genetically distant from the Greek population.

The results from neutral cpSSRs point to different groups of genetically distinct populations with contrasting evolutionary histories: the outlier population (Greece), which is genetically distinct from all other populations and displays high genetic diversity; the eastern population (Israel), with an intermediate level of diversity; the western populations (Italy and Spain), with reduced levels of diversity; and the North African group (Algeria and Morocco), which is genetically depleted.

Demographic inferences

The wavy pattern of the cpSSR-based mismatch distribution for the Greek population indicated a stable demography (Fig. 2). Moreover, Rogers & Harpending’s model of population expansion was rejected for this region (P(simulated ssd ≥ Observed ssd): 0.033). By contrast, Israel, Spain and Italy showed a unimodal mismatch distribution (mismatch observed mean of 0.917, 0.967 and 0.667, with P-values for the Rogers and Harpending’s model of 0.475, 0.757 and 0.749, respectively) that fitted into a pattern of populations undergoing recent range expansion (Fig. 2). Mismatch distributions for Morocco–Algeria could not be computed as these populations are fixed for a single haplotype.

Figure 2.

 Chloroplast microsatellite mismatch distributions for Greece (closed symbols), Israel (open symbols), Spain (light tinted symbols) and Italy (dark tinted symbols).

Gene-based summary statistics (Tajima’s D, Fu’s Fs, Fay and Wu’s H, Fay and Wu’s Hnorm and Zeng et al.’s E) also revealed a marked contrast among populations (Table 3). Greece, in particular, displayed extreme values (more positive for H and Hnorm and less positive for Fs and E), which suggests a distinct evolutionary history for this population. Based on these results, and those from the cpSSR analyses, we examined various demographic models for Greece, Israel, Italy, Spain and Morocco–Algeria separately, although those of Spain and Morocco–Algeria are not discussed in detail because of the low power, owing to the lack of variation (only four and five polymorphic genes respectively, with an average of 1.6 segregating sites per gene for both) to fit the models for these populations (Fig. S2).

Table 3.   Neutrality tests (against best-fitting demographic scenarios) for the 10 drought-response nuclear candidate genes

                          Neutrality test (a)
Population   Gene         D          Fs          H           Hnorm       E
Greece       cpk3         0.820      0.818       na          na          na
             sod-chl      na         na          na          na          na
             dhn-1        0.330      0.356       1.867       1.035       −0.778
             erd3         −0.432     −0.363      1.333       0.819       −1.092
             Aqua-MIP     −0.839     −0.329      −4.500***   −4.045***   3.445
             pal-1        0.384      −0.197      −0.357      −0.321      0.534
             lp3-1        0.986      0.849       0.250       0.540       0.026
             rd21A-like   0.709      2.608       −1.200      −0.534      1.124
             sams-2       na         na          na          na          na
             ug-2_498     −0.689     0.736       na          na          na
             Average      0.159      0.560       0.435       0.418       0.543
Israel       cpk3         na         na          na          na          na
             sod-chl      1.167      0.866       na          na          na
             dhn-1        −1.471*    0.553*      −2.571      −1.267      0.243
             erd3         −1.069     0.009*      0.833       0.567       −1.258**
             Aqua-MIP     1.948      3.276       0.083       0.092       1.054
             pal-1        na         na          na          na          na
             lp3-1        −1.310     0.762       −3.000**    −4.242**    3.138
             rd21A-like   1.799      5.621       −1.500      −0.667      1.826
             sams-2       na         na          na          na          na
             ug-2_498     na         na          na          na          na
             Average      0.177      1.848       1.231       1.103       1.000
Italy        cpk3         −1.362     0.671       −1.361*     −1.950      0.994
             sod-chl      0.334      0.536       na          na          na
             dhn-1        −1.595     2.407       −2.357*     −1.811      0.739
             erd3         0.446      −1.298***   1.667       1.011       −0.683
             Aqua-MIP     −0.654     −0.133      0.361       0.281       −0.687
             pal-1        1.371      3.003       −0.750      −0.829      1.559
             lp3-1        −1.310     0.762       −3.000**    −4.242**    3.138
             rd21A-like   1.573      2.429       −0.900*     −0.950      1.659
             sams-2       na         na          na          na          na
             ug-2_498     na         na          na          na          na
             Average      0.150      1.047       0.906       1.213       0.960

  1. Summary statistics are given by gene and for the averaged loci, and were computed for Greece, Israel and Italy for which reliable demographic inferences have been obtained.

  2. D, Tajima’s D (1989); Fs, Fu’s Fs (Fu, 1997); H, Fay and Wu’s H (2000); Hnorm, normalized Fay and Wu’s H (Zeng et al., 2006); E, Zeng et al.’s E (2006); na, not available or undefined (not enough segregating sites or no outgroup sequence available).

  3. (a) P-values for neutrality tests were obtained by comparison of observed values with those obtained from coalescent simulations under best-fitting demographic scenarios in each range (see text for details): *, 0.05 ≤ P < 0.10; **, 0.01 ≤ P < 0.05; ***, P < 0.01.

For the Greek population, the observed pattern of polymorphism was compatible with the standard coalescence model for all five summary statistics (probability that simulated values be smaller than or equal to the observed values: D = 0.736, Fs = 0.694, H = 0.224, Hnorm = 0.138 and E = 0.902), suggesting an ancient population with a stable demography, in agreement with our interpretation of mismatch distributions based on cpSSRs for this population. For the rest of the range of Aleppo pine, the standard coalescence model and the different population growth models could be rejected because of the observed negative values of H and Hnorm summary statistics (except for Spain, for which the rejection was based on the observed negative value of D), which contrast with the close-to-zero or positive values always obtained in the simulations (data not shown).

Among the 20 different bottleneck models tested, the five summary statistics for Israel were compatible with 12 scenarios. Likelihood values indicate an ancient (t = 0.0040) bottleneck of weak intensity (severity of 0.0100) as the most probable scenario. By contrast, the five summary statistics for Italy were compatible with only four scenarios of a relatively recent bottleneck (t = 0.0018 and 0.0020) of strong intensity (severity of 0.0010 and 0.0008). Likelihood values point to the most recent scenario as the most probable one (t = 0.0018 and severity of 0.0010) (Fig. 3).

Figure 3.

Probability of rejection of different population bottleneck models for Israel and Italy based on Tajima’s D, Fu’s Fs, Fay and Wu’s H, normalized Fay and Wu’s H (Hnorm), and Zeng et al.’s E summary statistics. Models incompatible at the 0.05 significance level are indicated by dark tint, models incompatible at the 0.1 level are indicated by light tint, and models compatible across all five summary statistics are indicated by a black dot. Highest values of overall likelihood (bottom right diagram) indicate the best-fitting model (see text for details) and are indicated with a double frame. The y-axis shows the time of the end of the bottleneck in units of 4N0 generations. The x-axis shows the severity of the bottleneck (i.e. the magnitude of the reduction in population size) in units of N0.

Nucleotide diversity and natural selection at candidate genes

Out of the 10 candidate genes, one was monomorphic across all populations (sams-2) while another was monomorphic in all populations except Greece (ug-2_498). Thus, nucleotide diversity parameters could not be computed for these two genes (Table S1). Similar general patterns of polymorphism were found for nuclear genes and chloroplast microsatellites: the number of segregating sites (S) and different nucleotide diversity estimators (θπ, θw, θH and θL) were highest in the Greek population, while those for Israel and Italy were intermediate and those for North African and Iberian populations were the lowest (Fig. 4a).
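To make the diversity parameters concrete, the sketch below computes the number of segregating sites (S), Watterson’s θw and Tajima’s θπ for a single locus from a toy alignment. The sequences are invented for illustration, values are per locus rather than per site, and the function names are hypothetical. θH and θL are not covered because they additionally require an outgroup to polarize mutations.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def theta_w(seqs):
    """Watterson's estimator: S divided by a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a_n


def theta_pi(seqs):
    """Tajima's estimator: mean number of pairwise differences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Toy alignment of four equal-length sequences (not data from this study).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]

print(segregating_sites(toy), theta_w(toy), theta_pi(toy))
```

When θπ is smaller than θw, as in this toy example, Tajima’s D is negative, which is the direction expected after a population expansion or a selective sweep.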

Figure 4.

 (a) Genetic diversity of Pinus halepensis for nuclear genes plotted according to each population: segregating sites (S) are plotted on the left y-axis, while Tajima’s (θπ), Watterson’s (θw), Fay and Wu’s (θH) and Zeng et al.’s (θL) population-scaled nucleotide diversity estimators are plotted on the right y-axis. (b) Neighbour joining tree of cpk3 ancestral and derived haplotypes with Pinus taeda used as outgroup. The pie charts correspond to the haplotype distribution for each sampled population.

For the Greek population, summary statistics (H and Hnorm) revealed only one gene (Aqua-MIP, Table 3) that departed from expectations under the standard coalescence model (which, in this case, is the best-fitting demographic scenario). By contrast, summary statistics for the other populations detected many more genes that departed from expectations under the best-fitting bottleneck models, as illustrated by Israel and Italy (Table 3), suggesting the action of natural selection.

When looking at the polymorphism pattern of the genes with putative imprints of natural selection, a common trend was observed: for most genes, various haplotypes that were present in Greece were only present in the other populations in the form of a reduced subset, one of these haplotypes (or a closely derived one) often being fixed in the populations furthest away from Greece. Interestingly, the haplotype that was generally fixed in these populations was not, as could have been expected, an ancestral one but presented a substantial number of derived mutations (see example of cpk3 in Fig. 4b): out of six genes for which we could determine ancestral and derived states, four were fixed for a derived haplotype in Spain (cpk3, pal-1, rd21A-like and dhn-1), Morocco (cpk3, pal-1, rd21A-like and lp3-1) and Algeria (cpk3, pal-1, rd21A-like and dhn-1), and two were fixed in Israel (cpk3 and pal-1). When they were not fixed, the derived haplotypes in these populations (including those for the two other genes, Aqua-MIP and lp3-1) were the most frequent in all but two cases. This result suggests the action of natural selection in the Aleppo pine populations with distinct outcomes in Greece as opposed to all other populations.

Discussion

The main finding of our study is that Aleppo pine shows contrasting patterns of polymorphism at neutral and functional loci in eastern vs western populations, the likely consequence of genetic drift and adaptation to new environments in its western range. Our results highlight the importance of colonization processes in evolution, not only in that they shape the spatial genetic pattern of current populations but also in that migration interacts with demographic processes and selection, all eventually affecting the potential for population adaptation. Moreover, the spatial characterization of patterns of polymorphism in Aleppo pine populations identified zones with distinctive evolutionary histories, information that is critical for the management of this conifer across its wide range.

The clear genetic differentiation of the Greek population, and the loss of genetic diversity away from this population, combined with the demographic scenarios inferred from cpSSRs and nuclear gene sequences, point towards older and more stable populations on the eastern side of the Mediterranean Basin. In the western and probably more recently colonized area, reliable demographic inferences were obtained only for Italy, which presents the population with the highest level of variation. The pattern of polymorphism in Italy is in agreement with a recent (t = 0.0018) and strong (severity of 0.0010) bottleneck (Fig. 3). As t is expressed in units of 4N0 generations, if we consider an effective population size for pine of c. 100 000 (Willyard et al., 2007) and a generation time of 25 yr (as in Brown et al., 2004 and in Willyard et al., 2007), the time estimate for the bottleneck compatible with our empirical data for Italy translates into 0.0018 × 4 × 100 000 × 25 = 18 000 yr before present (BP), corresponding roughly to the time of the last glacial maximum (LGM) in Europe.
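The conversion from coalescent time to calendar years can be sketched as follows, assuming (as in the caption of Fig. 3) that t is measured in units of 4N0 generations. The function name is hypothetical; the parameter values are those given in the text.

```python
def bottleneck_time_years(t_coalescent, n0, generation_years):
    """Convert a time expressed in units of 4*N0 generations into years."""
    return t_coalescent * 4 * n0 * generation_years


# Italy: t = 0.0018, N0 of c. 100 000, generation time of 25 yr.
years_italy = bottleneck_time_years(0.0018, 100_000, 25)
print(round(years_italy))
```

The estimate scales linearly with both N0 and the generation time, so uncertainty in either parameter carries over proportionally to the date.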

Although time estimates should be taken with caution, as they are derived from demographic models fitted using a relatively low number of genes and are based on approximate biological information on the species, the inferred demographic scenario for Aleppo pine (an ancient and stable relictual Greek population and western Mediterranean Basin populations expanding around the LGM) is consistent with previous genetic analyses made at the species distribution level (Bucci et al., 1998; Morgante et al., 1998; Gómez et al., 2001), as well as with some flavonoid data (Barbéro et al., 1998) and palaeoecological and fossil records that suggest a relatively recent expansion of the species westwards within the last 10 000 yr (Nahal, 1962; Schiller et al., 1986; Pons, 1992). Thus, the indicative timescale inferred for Aleppo pine bottlenecks in western populations could correspond to a period situated just after the LGM (25 000–18 000 yr BP). This time-frame is much more recent than that estimated so far for cold-tolerant conifers: multilocus patterns of 22 gene loci in Picea abies were found to be compatible with a severe bottleneck occurring several hundred thousand years ago (Heuertz et al., 2006), while those of 16 loci in Pinus sylvestris matched well an ancient bottleneck that took place c. 2 million yr ago (Pyhäjärvi et al., 2007). Our capacity to detect bottlenecks in the Mediterranean Aleppo pine could be connected to a high intensity of founder events combined with a recent and/or rapid colonization process from a limited number of refugia, together leading to a reduced genetic variability in the expanding western populations. By contrast, bottlenecks for cold-tolerant conifers would have been of weaker intensity, and recolonization might have taken place from lineages coming from multiple refugia (as suggested, for example, by mtDNA studies in Pinus sylvestris; Soranzo et al., 2000), which would make the signatures of demographic events more difficult to detect in the polymorphism patterns of present-day populations (Pyhäjärvi et al., 2007).

Identifying the evolutionary forces that may have left imprints on a species’ patterns of polymorphism and how they influence adaptive evolution is complex because they interact with each other in multiple ways. Gene flow (or migration) can have two opposite effects on selection (Lenormand, 2002; Stockwell et al., 2003; Garant et al., 2007): in some cases it counteracts inbreeding depression by increasing genetic variation within populations, thus increasing evolutionary potential (Hedrick, 1995; Ebert et al., 2002; Alleaume-Benharira et al., 2006), whereas in other cases it tends to oppose the effect of selection by introducing (as well as maintaining by recurrent immigration) foreign genes to locally well-adapted populations, therefore limiting local adaptation (Stearns & Sage, 1980; Riechert, 1993; Tufto, 2001) and constraining adaptive divergence (Hendry & Taylor, 2004; Nosil & Crespi, 2004). Demographic processes such as population bottlenecks and founder effects will lead to a reduced overall genetic diversity within the populations, which will limit the effectiveness of selection and may cause deviations of mean phenotype from the local optimum leading to maladapted genotypes (Nei et al., 1975; Novella et al., 1995). In the case of Aleppo pine, different scenarios can be considered to explain the data following the bottleneck events:

  • 1 Most of the genotypes would have contributed to the expanding populations, and natural selection would have favoured a wide range of well-adapted genotypes, depending on environmental conditions, from the original colonization source. However, the overall reduction of genetic diversity in expanding populations shown by both neutral and functional markers is incompatible with such a scenario.
  • 2 A reduced number of genotypes selected at random and with no particular selective advantage would have been spread across the species range, and would have survived under various conditions, given the plasticity of P. halepensis. This scenario, however, does not explain why most genes display derived mutations in expanding areas.
  • 3 Selection through interspecific and intraspecific competition and/or contrasting environmental conditions would have favoured some genotypes now associated with a reduced overall genetic variability in some populations of the expanding areas.

This last scenario would explain both the reduced level of neutral and functional diversity (drift depends only on population demography and affects all genes equally) and the distinct mutations selected across genes in expansion versus core populations. Frequent recurrent forest fires, which play a major role in regulating population dynamics in this species (Barbéro et al., 1998; Tapias et al., 2004), could have contributed to maintaining this pattern of selection and bottlenecks. The evolutionary outcomes of colonization depend greatly on the biological system considered and on the interactions among the evolutionary forces in place, ranging from local extinction to reduced response to selection (Pujol & Pannell, 2008) to adaptation (Rosenblum et al., 2007). Here we propose that the current observed genetic pattern in Aleppo pine resulted from both genetic drift and selection.

Although Aleppo pine is a particularly plastic species, as shown for example by the different levels of resistance to pests, frost and low water potentials or the reproductive features displayed by phenotypic changes across different environmental conditions (Chambel et al., 2007), the influence of selective forces on its adaptive potential should not be underestimated. For example, common-garden tests have shown that total precipitation and dry season duration are important selective factors for water-use efficiency in Aleppo pine provenances spanning the entire species range. These results highlight the selective role of climate variables in determining intraspecific fitness in this species (Voltas et al., 2008; see also Climent et al., 2008 for reproductive features in P. halepensis). In tree species, the role of natural selection in promoting local adaptation has been widely demonstrated in common-garden experiments (Eveno et al., 2008).

The present study, in addition to providing insights into the past demography of Aleppo pine, represents a first step in the characterization of populations for genes potentially linked to drought tolerance across an entire species’ range. Some of the candidate genes that depart from expectations under the best-fitting demographic models have well-established functions. Aquaporins (Aqua-MIP) are membrane pore proteins (i.e. water channels) that play a critical role in controlling the water content of cells. Dehydrins (dhn-1), for their part, have a major role in cell protection against desiccation (via stabilization of endomembrane structures, metal-binding activity and protection from oxidative damage). Remarkably, some of the genes found under selection in this study (erd3, dhn-1, lp3-1) are overexpressed under drought-stress conditions and have been reported to be under selection in other pine species, such as Pinus taeda (erd3; González-Martínez et al., 2006) and Pinus pinaster (erd3, dhn-1, lp3-1; Eveno et al., 2008). They thus constitute relevant candidates to develop future drought-tolerance and adaptation studies in Aleppo pine, and probably other conifers.

From a practical point of view, our work points to different evolutionary and demographically independent units. One of them, represented by the Greek population, encompasses most of the neutral and functional variation found in this study and constitutes, therefore, an important target for conservation purposes. Moreover, the Aleppo pine forests of Greece and the Middle East grow in more arid conditions than those of the western Mediterranean (Barbéro et al., 1998; Arianoutsou & Ne’eman, 2000) and Greek provenances stand out from the others for various adaptive features (e.g. reproductive allocation, ontogeny; cf. Climent et al., 2008 and references therein), pointing to these populations as a potential source material that should be selected not only for its adaptation to current environmental conditions, but also because it will be able to cope with future, more arid and hot conditions. Ultimately, causal inferences on the genetic basis of adaptation, that is, the links between genotypes and phenotypes, will have to be assessed by association studies or complementation and transgenic analyses (Storz, 2005; Wright & Gaut, 2005).

Acknowledgements

We thank Sebastian E. Ramos-Onsins for kindly providing the scripts used to compute various diversity and summary statistics, Ricardo Alía and José Climent for useful insights on Aleppo pine life-history and quantitative genetics, as well as three anonymous reviewers and the Editor for constructive comments on a first version of this manuscript. Thanks are extended to Patricia C. Grant for English language review. The EUFORGEN program (Bioversity International) kindly provided the distribution map used in Fig. 1. This research was funded by the European Commission, EVOLTREE Network of Excellence (http://www.evoltree.eu), the Collaborative Project on ‘Conservation of Forest Genetic Resources’ between the Spanish Ministry of Environment and INIA AEG06-054 and Project CGL2008-05289-C02-02/BOS (VaMPiro) from the Spanish Ministry of Science and Innovation.
