• Hybrid zone theory provides a powerful theoretical framework for measuring and testing gene flow and selection. The Senecio aethnensis and Senecio chrysanthemifolius hybrid zone on Mount Etna, Sicily, was investigated to identify phenotypic traits under divergent selection and to assess the contributions of intrinsic and extrinsic selection against hybrids to hybrid zone maintenance.
• Senecio samples from 14 sites across Mount Etna were analyzed for 24 quantitative traits classified into four groups (QTGs), six allozymes and seven simple sequence repeat (SSR) loci to describe patterns of variation throughout the hybrid zone.
• Narrower cline widths or shifts in cline centre position were observed for three QTGs relative to the molecular clines, indicating that these traits are likely to be under extrinsic environmental selection. Altitude was key to describing species distributions, but dispersal and intrinsic selection against hybrids explained patterns at smaller spatial scales. The hybrid zone was characterized by strong selection against hybrids, high dispersal rates, recent species contact and few loci differentiating QTGs based on indirect measures.
• These results support the hypothesis that extrinsic and intrinsic selection against hybrids maintains the hybrid zone and species distinctiveness despite gene flow between the two Senecio species on Mount Etna.
Hybrid zone theory provides a powerful theoretical framework for exploring the interaction between gene flow and selection on alleles and genotypes in natural populations (Barton, 1983; Barton & Hewitt, 1985, 1989; Harrison, 1993). Hybrid zones form where genetically differentiated populations meet, mate and produce fertile hybrids. The evolutionary outcomes of such parapatric contacts are diverse (Bridle & Vines, 2007). For example, contact can result in selective replacement of one species by another (e.g. Anartia butterflies in Panama; Dasmahapatra et al., 2002), extensive gene flow and convergence (e.g. Raphanus radishes in California; Hegde et al., 2006; Ridley et al., 2008), the formation of stable parapatric species boundaries (e.g. Chorthippus grasshoppers in the Pyrenees; Butlin & Hewitt, 1985; Butlin et al., 1992) and the origin of new hybrid taxa (e.g. Iris irises in Louisiana; Arnold, 2006). Hybrid zones may also behave differently for the same species pairs across different points of range contact (e.g. Ensatina salamanders in California; Wake et al., 1986; Wake, 1997).
Because species pairs that form stable hybrid zones typically exhibit many genetic differences, they allow the spread of molecular and phenotypic markers to be tracked through natural populations, allowing detailed population genetic and ecological analyses. Recombination in hybrids means that the selective importance of molecular and quantitative traits can be assessed independently of their species-specific genetic background, providing valuable insight into the evolutionary forces acting in natural populations (Szymura & Barton, 1991; Gay et al., 2008).
For hybrid zones to be stable, they require some form of divergent selection to maintain differentiation despite gene flow. This divergent selection can involve intrinsic selection against hybrids (Barton & Hewitt, 1989), extrinsic environmental selection (Endler, 1977; Moore & Price, 1993), or a combination of these two forms of selection (Nurnberger et al., 1995; MacCallum et al., 1998). The distinction between the two types of selection is subtle, but important to our understanding of the evolutionary consequences of hybridization. Intrinsic selection against hybrids is mostly determined by interactions among alleles such that hybrids are generally less fit than parental types regardless of external environmental factors. Extrinsic environmental selection occurs when the environment acts on intermediate phenotypes such that hybrid fitness relative to parents is dependent on the environmental context where hybridization occurs. Within this context, different loci may exhibit different patterns of clinal variation, depending on the nature and strength of selection acting directly on these loci or on physically or epistatically linked loci elsewhere in the hybrid genome (Barton & Hewitt, 1989). In practice, however, this distinction is blurred by the associations between loci (linkage disequilibria (LD)) that are generated mainly by gene flow into the hybrid zone. Such LD means that extrinsic selection affects cline widths for characters with no adaptive significance (Barton & Gale, 1993; Nurnberger et al., 1995). Similarly, traits under environmentally divergent selection are affected by intrinsic selection on loci associated with hybrid unfitness.
Nevertheless, describing the width and position of clines is an important first step in hybrid zone analysis, and suggests hypotheses about the relative importance of intrinsic hybrid selection and extrinsic environmental selection that can subsequently be tested in detail (Butlin et al., 1991; Nurnberger et al., 1995; MacCallum et al., 1998; Bridle et al., 2001).
Another key distinction is between ‘dispersal-independent’ and ‘dispersal-dependent’ hybrid zones. ‘Dispersal-independent’ hybrid zones are those where divergent environmental selection dominates over the effects of gene flow, meaning that the genetic composition of a population is relatively independent of its neighbours and is determined primarily by external selective factors. These occur where the environment changes at a scale much larger than the mean dispersal distance of the organism (Barton & Gale, 1993) or where intermediate phenotypes are favoured by extrinsic environmental selection strong enough to override gene flow, so that environment–phenotype associations predominate (Barton & Hewitt, 1989; Barton & Gale, 1993; Moore & Price, 1993; MacCallum et al., 1998; Milne et al., 2003). By contrast, tension zones are ‘dispersal-dependent’ in the sense that clines between parental types form as a result of a balance between intrinsic selection against hybrids (or intermediates) and gene flow between adjacent populations. This tension zone model provides simple generalized predictions about the structuring of genetic and phenotypic variation that can be described by sigmoid or stepped clines (Slatkin, 1973).
Again, this distinction is rarely clear-cut and results from the spatial scale at which parapatric contact is considered. In many cases, a ‘dispersal-independent’ hybrid zone maintained by selection along an environmental gradient might appear to be superficially similar to a ‘dispersal-dependent’ hybrid tension zone maintained by intrinsic selection against hybrids (Endler, 1977; Moore & Price, 1993). This will be particularly the case where the selective gradient is locally steepened, meaning that dispersal has a stronger effect on the ability of populations to track local selective optima (Bridle et al., 2009). Gene flow across such locally steep gradients will generate narrow ‘dispersal-dependent’ sigmoid clines. However, throughout the rest of the same environmental gradient, or for other characters, patterns of trait mean will appear to be ‘dispersal-independent’ where selection changes more gradually in space.
Nevertheless, if environmental selection is important in shaping the hybrid zone, the position of cline centres should vary because the selective optima for different characters are likely to occur at different points along the environmental gradient. By contrast, where the cline is dispersal-dependent, clines are free to move by gene flow to lie in the same position (Hewitt, 1988; Barton & Gale, 1993); clines are also likely to remain associated together as a result of the strong LD generated by the mixing of differentiated genotypes at the hybrid zone centre. In both cases, however (and where LD is sufficiently low), clines for neutral genetic markers are likely to be wider and show greater coincidence of cline centre between markers than quantitative traits of probable adaptive significance. This provides a way of identifying traits or loci that are putative targets for diversifying selection (Butlin et al., 1991; Kruuk et al., 1999; Bridle & Butlin, 2002).
The predictions about cline shape arising from tension zone theory can provide indirect estimates of dispersal rates, selection strength and the number of loci contributing to trait differentiation, when combined with data on the amount of LD and trait covariance and variance at the zone centre. Dispersal brings parental types into contact, and hybridization between parental types generates LD that is progressively reduced through recombination in later-generation hybrids. Selection against hybrids limits recombination, so that elevated LD at the centre of hybrid tension zones persists relative to the balance between parental dispersal into the hybrid zone and selection against hybrids relative to recombination (Barton & Hewitt, 1989; Barton & Gale, 1993). Trait covariance is the equivalent of LD for quantitative traits and can be analyzed in an analogous way (Barton & Gale, 1993; Nurnberger et al., 1995; Bridle & Butlin, 2002). Trait variance in hybrids will be elevated owing to a combination of LD and segregation between a limited number of loci controlling trait divergence between parental types. Thus, elevations in LD, covariance and variance at the hybrid zone centre provide estimates of the balance among dispersal, recombination, selection against hybrids and the number of loci controlling a trait. These are useful estimates where direct measures of the parameters in the field are difficult or impractical, and therefore contribute to our understanding of the hybrid zone under investigation. They may also suggest the effect of ecological or evolutionary processes not included in tension zone models. For example, elevations in LD that are not coincident with cline centres might indicate asymmetric gene flow into the hybrid zone and cline movement, without the need for repeated field measures over time (Cruzan, 2005; Gay et al., 2008).
Similarly, elevated estimates of LD at a cline centre may suggest strong assortative mating, or a complex pattern of contact between parental species (Bridle & Butlin, 2002; Bailey et al., 2004; Bridle et al., 2006).
Natural interspecific hybridization is widespread in plants and has been studied extensively at systematic, population genetic and genomic levels in many plant groups (Grant, 1981; Arnold, 2006; Baack & Rieseberg, 2007). Rather surprisingly, detailed hybrid zone analysis has yet to be widely applied to plants in studies of divergent selection between hybridizing species or morphotypes (but see Cruzan, 2005; Schemske & Bierzychudek, 2007). The species we study here, the ragworts Senecio aethnensis and Senecio chrysanthemifolius, are ecologically, morphologically and genetically distinct species that form a hybrid zone halfway up the slopes of Mount Etna, Sicily (Abbott et al., 2000; James & Abbott, 2005). These species have previously attracted scientific interest as the progenitors of an invasive homoploid hybrid species, Senecio squalidus, in the UK (Abbott et al., 2000, 2009; James & Abbott, 2005). S. aethnensis grows on the higher slopes of Mount Etna, above 2000 m on recent lava flows, whereas S. chrysanthemifolius grows below 1000 m on arable agricultural land and waste ground. Both species are short-lived perennials with generalist hoverfly pollinators and wind-dispersed fruits. Where the species meet, extensive ongoing hybridization and introgression occurs at intermediate altitudes, c. 1100–1900 m. A survey of variation for randomly amplified polymorphic DNA (RAPD) markers that distinguish the two species revealed clines associated with altitude and distance across the hybrid zone, with all plants at intermediate elevations shown to be of mixed ancestry (James, 1999; James & Abbott, 2005). Glasshouse studies have confirmed that reproductive barriers between the species are weak to absent (Fici & Lo Presti, 2003; Chapman et al., 2005). Only flowering-time differences appear to limit hybridization in the wild, with peak flowering gradually delayed by up to 8 wk when moving from the lowest to the highest altitudes (Chapman et al., 2005).
Here, we report an analysis of morphological and genetic variation across the Senecio hybrid zone on Mount Etna to gain insight into the nature of the adaptive traits and associated selective forces acting within the hybrid zone. Our analysis addressed the following specific questions. What traits and ecological functions are divergent between S. aethnensis and S. chrysanthemifolius? What are the clinal characteristics (centre and width) of different quantitative traits and genetic markers? What is the probable nature of divergent selection structuring the hybrid zone (intrinsic hybrid selection, extrinsic environmental selection, or both)? What balance of selection, dispersal, number of loci and recombination best describe patterns of trait variance, covariance and genetic LD in the hybrid zone? We discuss how these findings will inform future research into adaptation and speciation within the Senecio complex on Mount Etna and consider the value of such clinal analyses in exploring evolution in plant hybrid zones more generally.
Materials and Methods
Fruits (achenes) were collected during the summer of 2007 from plants of S. chrysanthemifolius, S. aethnensis and intermediate hybrid phenotypes located at different sites on Mount Etna, Sicily (Fig. 1). During November 2007, 10 achenes from each of 15 maternal plants sampled per site were germinated on damp filter paper in the light at 20°C. Three seedlings per maternal plant were transferred to 10-cm-diameter pots containing a compost and grit mix (3:1), and thinned at random to a single plant per pot after 1 month. Plants were grown in a glasshouse at approx. 20°C with a 16 h day length. Supplementary lighting was provided by halogen lamps.
Quantitative trait measures
Twenty-four life-history, morphological and physiological traits were measured on each individual. These included days to germinate, days to development of first true leaf, days to flowering, height, branch number, leaf number, capitulum number, pedicel length, capitulum length and diameter, ligule number, ligule area, pollen number, floret number, seed and pappus length, photosynthetic rate under normal conditions and after drought and low-temperature treatments, auricle width, leaf perimeter to area ratio, stomata number, and leaf chlorophyll and anthocyanin contents (Supporting Information Table S1).
Genetic marker variation
Allozymes Allozyme variation was resolved among plants cultivated in the glasshouse using a protocol described by Ashton & Abbott (1992). Plants were genotyped at six polymorphic loci that encoded the enzymes aspartate aminotransferase (AAT), acid phosphatase (ACP), glutamate dehydrogenase (GDH), isocitrate dehydrogenase (IDH), phosphoglucose isomerase (PGI) and phosphoglucomutase (PGM).
Simple sequence repeats (SSRs) Genomic DNA extraction was performed according to Comes et al. (1997), with the following modifications: 4% β-mercaptoethanol was added to the cetyltrimethylammonium bromide extraction solution, two chloroform extractions were performed and RNA digestion was conducted on the aqueous fraction of the second chloroform extraction. The PCR products for seven SSR loci (S4, S10, S15, S20, S26, V44 and V45), which amplify in both S. aethnensis and S. chrysanthemifolius (Liu et al., 2004), were scored. PCR amplification was conducted using a protocol described in Table S2 and the products were labelled with D2, D3 or D4 fluorescent dye labels (Sigma-Aldrich, UK) using a three-primer system with universal 18 bp M13 5′-labelled oligonucleotides, 5′ M13-tagged forward primers and normal reverse primers (Schuelke, 2000). Samples were SSR genotyped by running a sequencing buffer solution containing 1.5–2.5 µl of dye-labelled PCR product through a Beckman Coulter CEQ 8000 sequencer and analyzing the output using ceq v9.0 (Beckman Coulter Inc., 2004) and Microsoft Excel 2003 (Microsoft Corporation, 2003).
The seven sites sampled from the southern slope were mapped onto a best-fit transect, and cumulative geographic distances between transect sites were calculated starting with the southernmost S. chrysanthemifolius site through to a final S. aethnensis site (Table 1, Fig. 1). Our hybrid zone analysis was restricted to a southern transect on Mount Etna, along which sites one to seven were sampled. This was because sites from the eastern and western slopes formed only partial transects from S. aethnensis or S. chrysanthemifolius through to hybrid Senecio, respectively, and their distance from the southern slope might restrict gene flow between sites. This ensured that the underlying assumptions of the tension zone model (that clines are fitted across continuous transects along which dispersal occurs) were met (Barton & Hewitt, 1985, 1989; Barton & Gale, 1993). The trade-off of ignoring data collected from an additional seven sites from other parts of Mount Etna was partly compensated by including data from all 14 sites in other nonspatially explicit models to check and corroborate the hybrid cline results.
Table 1. Collection data for sample sites of Senecio aethnensis (S. aeth), Senecio chrysanthemifolius (S. chrys) and their hybrids from Mount Etna
Ripe fruits were harvested from approx. 25–50 separate maternal plants from an area of 200–500 m² per sample site.
S. chrys, Nicolosi, roadsides, mature vegetation
S. chrys, road between Rifugio Sapienza and Nicolosi, grassy vegetation
Hybrid, road between Rifugio Sapienza and Nicolosi, grassy vegetation
Hybrid, road between Rifugio Sapienza and Nicolosi, grassy vegetation
Hybrid, near Monte Serra Pizzuta, grey lava soil with sparse vegetation
Hybrid, Rifugio Sapienza, black lava sand with sparse vegetation
S. aeth, Rifugio Piccolo, black lava sand
S. chrys, Adrano, wasteground with grassy vegetation
Hybrid, road between Adrano and Monte Albano, grassy vegetation
Hybrid, road between Adrano and Monte Albano, grassy vegetation
Hybrid, end of Road to Monte Albano, mature shrubby vegetation
Hybrid, road to Piano Provenzana, recent lava flow
Hybrid, Piano Provenzana, grey lava sand, mature shrubby vegetation
S. aeth, above Piano Provenzana, recent lava flow
Quantitative trait measures
Quantitative traits, transformed as necessary to improve normality, were compared between S. aethnensis and S. chrysanthemifolius, from two sites per species using two-sample t-tests, to measure differentiation between species (S. chrysanthemifolius sites 1 and 2, S. aethnensis sites 7 and 14, Table 1, Fig. 1). Significantly differentiated traits were assigned to one of four trait groups corresponding to distinct ecological characteristics that might experience different selection pressures in the wild: plant architecture (A); inflorescence structure (F); leaf structure (L); and seed and fruit structure (S) (Table 2). Correlations between all pairs of traits within these quantitative trait groups (QTGs) were tested using Pearson's tests, and the least-differentiated trait of each correlated pair was excluded to remove significant correlations between traits within QTGs (Table S3).
Table 2. Comparison of quantitative traits between Senecio aethnensis sites and Senecio chrysanthemifolius sites from Mount Etna
| Quantitative trait (units) | S. aethnensis mean (n, SE) | S. chrysanthemifolius mean (n, SE) | Compare means t values (trans, df, sig) | Functional group discrim. func. score |
| --- | --- | --- | --- | --- |
| Photosynthesis; low temperature (µM CO2 m−2 min−1) | −0.006 (22, 0.002) | 0.007 (18, 0.005) | −2.37 (l, 26) | |
| Leaf number (per primary stem) | 32.57 (28, 1.10) | 29.21 (29, 1.00) | 2.24 (r, 55) | |
| Germination to true leaf (d) | 10.77 (26, 0.37) | 11.86 (28, 0.37) | −2.06 (r, 51) | |
| Stomata number (100 magn. fov−1) | 18.41 (27, 0.99) | 23.00 (27, 1.65) | −2.03 (l, 47) | |
| Ray number (primary capitulum−1) | 12.07 (28, 0.27) | 12.66 (29, 0.18) | −1.75 (r, 47) | |
| Leaf anthocyanin (mM g−1) | 0.61 (26, 0.04) | 0.77 (26, 0.08) | −1.56 (r, 45) | |
| Plant height (cm) | 31.17 (28, 1.21) | 33.65 (29, 1.35) | −1.20 (r, 54) | |
| True leaf to flowering (d) | 74.38 (26, 1.95) | 72.11 (28, 2.15) | 0.83 (r, 51) | |
| Photosynthesis (µM CO2 m−2 min−1) | −0.012 (27, 0.003) | −0.01 (22, 0.003) | −0.55 (l, 40) | |
| | 13.93 (28, 1.29) | 13.31 (29, 1.17) | 0.23 (r, 53) | |
| Photosynthesis; water limited (µM CO2 m−2 min−1) | 0.001 (26, 0.002) | 0.002 (22, 0.004) | −0.15 (l, 36) | |

S. aethnensis and S. chrysanthemifolius measures were each based on two sample sites from the sample range extremes (sites 7 and 14, sites 1 and 2, respectively). Trait means for S. aethnensis and S. chrysanthemifolius are presented with associated sample sizes (n) and standard errors (SE) in parentheses. Species differences were tested using t tests. The t values are presented with the associated transformations to improve data normality (trans), degrees of freedom (df), and significance (sig) in parentheses. Data transformations were log10 (l) and square root (r). Significance assessment was corrected for multiple testing (**, significant at overall α = 0.01; *, significant at overall α = 0.05). Significantly differentiated quantitative traits were assigned to one of four quantitative trait groups (QTGs) based on involvement in similar ecological functions (A, plant Architecture; F, inFlorescence structure; L, Leaf structure; S, Seed and fruit structure). The less differentiated quantitative traits out of each pair of traits within QTGs that were found to be significantly correlated were also excluded (excl. in table) to retain only uncorrelated traits within QTGs. Discriminant function analysis was performed on remaining untransformed traits, on a scale from zero to one, for each QTG for the four species reference sites to calculate discriminant function scores for species assignment probabilities.
Traits within each QTG were scaled between zero and one, and discriminant function analysis was used for the four reference-site samples of S. aethnensis (7 and 14) and S. chrysanthemifolius (1 and 2) to calculate discriminant functions that distinguish the species and to estimate posterior probabilities of species assignment (0, S. chrysanthemifolius; 1, S. aethnensis; hereafter referred to as the QTG hybrid index).
Genetic marker measures
Molecular genotype data were analyzed using genalex v6.1 (Peakall & Smouse, 2006) to obtain diversity statistics and estimates of genetic differentiation (F statistics) for individual and combined allozyme and SSR loci between the four reference sites of S. aethnensis and S. chrysanthemifolius, respectively. Because no locus tested exhibited fixed-allele differences between S. aethnensis and S. chrysanthemifolius, posterior probabilities of species assignment (0, S. chrysanthemifolius; 1, S. aethnensis; hereafter referred to as the molecular hybrid index) were estimated for each individual based on allele-frequency differences between the species using the R package, introgress (Gompert & Buerkle, 2009).
Maximum log-likelihood (ML) sigmoid clines were fitted to the mean hybrid index per site for each of the four QTGs, allozymes, SSRs and combined molecular data against distance along the southern transect using analyse v1.3 (Barton & Baird, 1996). Mean variance was used as the expected variance required by the software to speed up natural log-likelihood (lnL) calculations. Two of the four parameters necessary to describe a cline (mean trait score at the start and end of the cline) were fixed to preserve five degrees of freedom for the model fit. Thus, the estimated cline parameters were cline centre and cline width, corresponding to the point of inflection and the inverse of the maximum slope at this point, respectively. Support limits for these estimates were taken as the maximum and minimum parameter values resulting in clines with lnL within two units of the ML value (Edwards, 1972; Bridle et al., 2001). The similarities of fitted cline centres (coincidence) and widths (concordance) for different molecular markers and QTGs were tested first by estimating the decrease in lnL when clines were refitted to QTGs and molecular data with centres or widths fixed at the ML estimates for the combined molecular data cline. Cline parameters were considered to be significantly different based on likelihood ratio tests, Bonferroni corrected for multiple tests.
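The cline description above (centre at the point of inflection, width as the inverse of the maximum slope) can be illustrated with a minimal sketch. It assumes the common tanh parameterisation of a sigmoid cline with end points fixed at 0 and 1; the function name and the example centre and width are hypothetical, and this is not the fitting procedure implemented in analyse.

```python
import math

def sigmoid_cline(x, centre, width, y_start=0.0, y_end=1.0):
    """Expected mean hybrid index at transect distance x.

    Assumed tanh form: the inflection point lies at `centre` and the
    maximum slope equals (y_end - y_start) / width.
    """
    return y_start + (y_end - y_start) * 0.5 * (
        1.0 + math.tanh(2.0 * (x - centre) / width)
    )

# Hypothetical values for illustration only (km along a transect).
centre, width = 7.0, 2.0
step = 1e-6
# Central-difference estimate of the slope at the cline centre.
slope_at_centre = (
    sigmoid_cline(centre + step, centre, width)
    - sigmoid_cline(centre - step, centre, width)
) / (2.0 * step)
```

With end points at 0 and 1, the numerical slope at the centre approximates 1/width, which is the sense in which width is the inverse of the maximum slope.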
A second test for differences in cline width and position used data from all 14 sample sites and was performed by regressing the mean QTG hybrid index per site against the mean combined molecular hybrid index per site, weighted by sample size per site (Szymura & Barton, 1991; Bridle & Butlin, 2002). This approach has the advantage of not using any spatial information when comparing clines, and so inferences on cline width are less limited by the spatial sampling resolution. Linear relationships between QTG and molecular traits indicated that clines were similar, while cubic relationships indicated significant differences in cline width or centre. Linear and cubic models were fitted using a nonlinear least squares iterative function (nls) with predefined interactions between parameters. The linear model was $y = c + dx$, while the cubic model was $y = c + d(x + 2x(1 - x)(\alpha + \beta(x - (1 - x))))$ (modified from Szymura & Barton, 1991) (y, mean QTG hybrid index; x, mean molecular hybrid index; α, shift in cline centre towards S. aethnensis; β, narrowing of cline width; c, S. chrysanthemifolius mean hybrid index; and d, S. aethnensis mean hybrid index).
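As a sketch of the two regression models, the functions below evaluate the linear and cubic forms for a given molecular hybrid index. The bracketing of the cubic term is an assumption, taken as y = c + d(x + 2x(1 − x)(α + β(x − (1 − x)))); the function names are hypothetical and no fitting is performed here.

```python
def linear_model(x, c, d):
    """Linear relationship between molecular (x) and QTG (y) hybrid index."""
    return c + d * x

def cubic_model(x, c, d, alpha, beta):
    """Cubic relationship: alpha shifts the cline centre, beta changes its width.

    Assumed bracketing: c + d * (x + 2x(1 - x)(alpha + beta(x - (1 - x)))).
    """
    return c + d * (x + 2.0 * x * (1.0 - x) * (alpha + beta * (x - (1.0 - x))))
```

Under this reading the cubic model reduces to the linear model when α = β = 0, and both models agree at x = 0 and x = 1, so α and β only alter the shape of the relationship at intermediate hybrid index values.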
A power analysis of the data set was performed to assess whether the power of these clinal comparisons was sufficient to detect differences in width or centre of the size observed for other traits. Confidence intervals about ML cline fit parameter estimates were converted into standard deviations, assuming normally distributed error about the parameter estimates. These standard deviations, together with the differences in cline widths or centres and the sample sizes, were then used to estimate the minimum differences detectable with these data, and the sample size that would be required to significantly distinguish the parameter differences observed for other traits, at a study power of 0.8.
Tests of different models of hybrid zone structure
The influence of altitude (which is likely to be correlated with several environmental factors) on hybrid index variation was tested using F ratio tests to compare the variance explained by nested pairs of models, with or without altitude as an explanatory parameter (Butlin et al., 1991; Bridle et al., 2001). The first model comparison tested for a dispersal-independent hybrid zone (i.e. one structured mainly by extrinsic environmental selection rather than by gene flow) by comparing null models (no change in mean hybrid index across the transect) with altitude-only models (linear relationship between hybrid index and altitude). The second model comparisons tested for a dispersal-dependent hybrid zone (i.e. one structured only by intrinsic hybrid selection and gene flow between adjacent sites) by comparing null models with cline-only models (ML fitted sigmoid clines describing trait means at each site). The third model comparisons tested for locally variable extrinsic environmental selection generating strong environment–genotype interactions, over and above the expectations of sigmoid fits, by comparing dispersal-dependent cline-only models with the effect of the cline plus the effect of altitude.
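The nested model comparisons can be sketched with the standard extra-sum-of-squares F ratio. The text does not give the exact formula used, so this form is an assumption, and the function name and arguments are hypothetical.

```python
def nested_f_ratio(rss_reduced, rss_full, n_params_reduced, n_params_full, n_obs):
    """F ratio comparing a reduced model with a fuller model that nests it.

    rss_*      : residual sums of squares of the two fitted models
    n_params_* : number of fitted parameters in each model
    n_obs      : number of observations (e.g. site means)
    Returns (F, numerator df, denominator df).
    """
    df_num = n_params_full - n_params_reduced
    df_den = n_obs - n_params_full
    f_value = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_value, df_num, df_den
```

For example, comparing a cline-only model with a cline-plus-altitude model would use the cline-only fit as the reduced model and count altitude as the one additional parameter.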
Tests for elevated variance at the hybrid zone centre
Elevations in QTG and molecular variance associated with intermediate values of the mean were estimated by fitting Gaussian (normal) curves described by four parameters (mean, standard deviation, intercept and scaling factor) to data for all 14 sample sites (Bridle & Butlin, 2002). Estimates of the elevation in variance were obtained by subtracting the residual species variance (mean predicted variance at the curve edges) from the maximum predicted variance. Elevation in variance was used to estimate the number of genes that additively control species differentiation for each QTG according to the relationship $n_e = (0.25 - R)/2 \times (\mathrm{var}(z) - \Delta z^2 R/2)$ ($n_e$ is the number of controlling loci; R is the elevation in linkage disequilibrium associated with intermediate values of the mean (estimated in the following section); var(z) is the elevation in variance; and Δz is the difference between species hybrid index means) (Lande, 1981; Sanderson et al., 1992; Barton & Gale, 1993).
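The four-parameter Gaussian curve can be sketched as follows. The functional form (intercept plus a scaled normal-shaped bump), the use of hybrid index 0 and 1 as the curve edges, and all names are assumptions for illustration; the curve parameters would in practice come from a fit to the site data.

```python
import math

def gaussian_curve(x, mean, sd, intercept, scale):
    """Predicted variance at site mean hybrid index x (assumed form)."""
    return intercept + scale * math.exp(-((x - mean) ** 2) / (2.0 * sd ** 2))

def variance_elevation(mean, sd, intercept, scale, edges=(0.0, 1.0)):
    """Maximum predicted variance minus the mean predicted variance at the edges.

    Assumes scale > 0, so that the curve maximum lies at x = mean.
    """
    peak = gaussian_curve(mean, mean, sd, intercept, scale)
    edge_mean = sum(
        gaussian_curve(e, mean, sd, intercept, scale) for e in edges
    ) / len(edges)
    return peak - edge_mean
```

The same two functions apply unchanged to LD or covariance when those quantities are plotted against site mean hybrid index.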
Tests for elevated LD and covariance at the hybrid zone centre
To calculate LD across the hybrid zone, molecular data were first made biallelic by rescoring alleles according to whether they were most frequent in either S. aethnensis or S. chrysanthemifolius. Standardized LD (R) that corrects for variation in allele frequency between loci was calculated for each pair of polymorphic loci at each of the 14 sample sites using lda v1.0 software (Ding et al., 2002) according to $R = (x_{ii}x_{jj} - x_{ij}x_{ji})/\sqrt{p_i q_i p_j q_j}$ (x, p and q = 1 – p, are the haplotype and allele frequencies at loci i and j, respectively). The paired locus R values were then corrected for allele sharing between species by multiplying by the allele-frequency differences at loci i and j (Barton & Gale, 1993, p. 26). Tests of the expectation of increased LD associated with intermediate values of the trait mean were performed by fitting Gaussian (normal) curves to mean R values against site mean hybrid index, as described earlier for variance.
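A minimal sketch of the standardized LD calculation for one pair of biallelic loci is given below. It follows the expression in the text, with the four haplotype frequencies and two allele frequencies supplied directly; the function names are hypothetical and this is not the lda implementation.

```python
import math

def standardized_ld(x_ii, x_jj, x_ij, x_ji, p_i, p_j):
    """R = (x_ii * x_jj - x_ij * x_ji) / sqrt(p_i * q_i * p_j * q_j)."""
    q_i, q_j = 1.0 - p_i, 1.0 - p_j
    disequilibrium = x_ii * x_jj - x_ij * x_ji
    return disequilibrium / math.sqrt(p_i * q_i * p_j * q_j)

def correct_for_allele_sharing(r_value, delta_p_i, delta_p_j):
    """Scale R by the between-species allele-frequency differences at both loci."""
    return r_value * delta_p_i * delta_p_j
```

Two checks follow from the formula: when only the two coupling haplotypes occur at equal frequency, R is 1, and when all four haplotypes are equally frequent, R is 0.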
The estimated elevation in R at intermediate site mean hybrid index and estimates of molecular cline width were used to estimate dispersal distance per generation across the Senecio hybrid zone according to the relationship $\delta = \sqrt{rRw^2}$ (r, recombination rate; R, elevation in LD; and w, cline width) (Szymura & Barton, 1991; Barton & Gale, 1993). A range of biologically reasonable recombination rates from 0.1 to 0.5 was examined. The indirect measure of dispersal per generation and the estimated molecular cline width were then used to predict the strength of selection against hybrids according to the relationship $s = 8\delta^2/w^2$ (Barton & Gale, 1993) and to predict the number of generations of contact in the absence of selection according to the relationship $T = w^2/(2\pi\delta^2)$ (Endler, 1977).
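The three relationships above chain together, as the following sketch shows. The function names and the input values in the usage note are hypothetical and are not estimates from this study.

```python
import math

def dispersal_from_ld(r, ld_elevation, width):
    """delta = sqrt(r * R * w^2), in the same distance units as width."""
    return math.sqrt(r * ld_elevation * width ** 2)

def selection_against_hybrids(delta, width):
    """s = 8 * delta^2 / w^2."""
    return 8.0 * delta ** 2 / width ** 2

def generations_since_contact(delta, width):
    """T = w^2 / (2 * pi * delta^2), assuming no selection."""
    return width ** 2 / (2.0 * math.pi * delta ** 2)

# Hypothetical inputs: r = 0.5, R = 0.08, w = 2.5 km.
delta = dispersal_from_ld(0.5, 0.08, 2.5)
s = selection_against_hybrids(delta, 2.5)
t_gen = generations_since_contact(delta, 2.5)
```

Because δ is itself derived from w, the width cancels in the later steps: s reduces to 8rR and T to 1/(2πrR), so both depend only on the recombination rate and the elevation in LD.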
Covariance, the equivalent of LD for quantitative traits, was calculated for each pair of QTGs for each of the 14 sample sites. Tests of the expectation of increased covariance associated with intermediate values of the mean were performed by fitting Gaussian curves to covariance against site mean, as described for the variance analysis.
The estimated elevation in covariance at intermediate site means and corresponding estimates of cline width were used to generate an alternative estimate of dispersal distance per generation across the Senecio hybrid zone according to $\delta = \sqrt{2r\,\mathrm{cov}(z_1, z_2)\,w_1 w_2/(\Delta z_1 \Delta z_2)}$ (r, recombination rate; cov(z1, z2), elevation in mean covariance; w1w2, the mean product of estimated cline width between pairs of QTGs; and Δz1Δz2, the mean product of the species differences between pairs of QTGs) (Barton & Gale, 1993).
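The covariance-based estimate can be sketched in the same way. The sketch assumes that the product Δz1Δz2 forms the whole denominator; the function name and the example inputs are hypothetical.

```python
import math

def dispersal_from_covariance(r, cov_elevation, width_product, delta_z_product):
    """delta = sqrt(2 * r * cov(z1, z2) * w1w2 / (dz1 * dz2)).

    width_product   : mean product of cline widths between pairs of QTGs
    delta_z_product : mean product of species differences between pairs of QTGs
    """
    return math.sqrt(2.0 * r * cov_elevation * width_product / delta_z_product)

# Hypothetical inputs: r = 0.5, cov = 0.01, w1w2 = 4.0 km^2, dz1*dz2 = 1.0.
delta_cov = dispersal_from_covariance(0.5, 0.01, 4.0, 1.0)
```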
S. aethnensis and S. chrysanthemifolius differed significantly in mean values for 12 of the 24 measured traits (Table 2). For three of the differentiated traits (leaf perimeter to area ratio, capitulum number and branch number), S. chrysanthemifolius had a larger mean than S. aethnensis (negative t values, Table 2), while for the remainder of the traits the reverse was true. Significant correlations between pairs of significantly differentiated traits within QTGs were avoided by omitting three traits (Table S3), leaving nine traits divided into four QTGs (Table 2). Posterior assignment of individuals to species using discriminant functions was effective at partitioning quantitative trait variation between species and focussing on between-species variation, rather than on within-species variation, to analyse hybridization.
All 13 molecular markers were polymorphic, with allozymes exhibiting fewer alleles and smaller expected heterozygosity values than SSRs (Table 3). In general, low levels of inbreeding or within-site spatial genetic structure were observed, compatible with hybrid zone theory assumptions of outcrossing (Fis values, Table 3). Between-site spatial genetic structure was also appropriate for measurement of hybrid index and hybrid zone analysis because loci exhibited little between-site differentiation within species (Fsr values, Table 3), while significant differentiation between species was present for nine of the 13 loci (Frt values, Table 3). The allozyme locus, ACP, gave the strongest signal of divergence, being almost fixed for different alleles in S. aethnensis and S. chrysanthemifolius. However, ML cline fits provided no evidence for stronger selection on allozymes overall than on SSR loci.
Table 3. Allozyme and simple sequence repeat (SSR) genetic diversity data comparing Senecio aethnensis and Senecio chrysanthemifolius sites from Mount Etna
Columns: mean sample size per site (n); mean allele number per site (P); mean heterozygosity per site (He); among-individual variance/population variance (Fis) with significance; among-population variance/species variance (Fsr) with significance; among-species variance/total variance (Frt) with significance.
S. aethnensis and S. chrysanthemifolius measures were each based on two sample sites from the sample range extremes (sites 7 and 14, and sites 1 and 2, respectively). Mean sample size per population (n), mean allele number per population (P), mean expected heterozygosity per population (He) and partitioning of molecular variance (var.) among individuals (ind., Fis), populations (popn, Fsr) and species (Frt), and their significance (sig), were calculated using Genalex v6.1 (Peakall & Smouse, 2006). Significance assessment was corrected for multiple testing (**, significant at overall α = 0.01; *, significant at overall α = 0.05).
The ML cline estimates for allozymes, SSRs, combined molecular data and the four QTGs generated cline centre positions from 6.67 to 7.82 km and cline widths from 1.49 to 3.70 km, with overlap between support limits (Table 4, Figs 2, 3). Likelihood ratio tests of coincidence of cline centres and concordance of cline widths did not detect any differences between clines for allozymes, SSRs or combined molecular data, but did detect a significant narrowing of the leaf structure cline relative to the combined molecular data cline, and a significant shift towards S. aethnensis for the inflorescence structure cline (Table 5). The second set of cline-comparison tests, nonspatial regressions based on data for all 14 sample sites, confirmed these observed cline differences and also detected a significant narrowing of cline width for inflorescence structure, an almost significant shift towards S. chrysanthemifolius for leaf structure and an almost significant narrowing of cline width for fruit structure (Fig. 4). The power analysis supported the two cline-comparison tests by showing that cline differences for inflorescence, leaf and fruit structure compared with molecular data were close to detectable, while cline differences for other marker and QTG comparisons were probably not significant (Table 6). Furthermore, the power analysis indicated that the increased sample size available to the nonspatial regression test (186) would make many of the cline differences for inflorescence, leaf and fruit structure relative to molecular data detectable as significant (Table 6).
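To make the meaning of cline centre and width concrete, the sketch below evaluates a sigmoid (tanh) cline, a form commonly used in tension-zone models, at the extremes of the centre and width ranges reported above. The tanh form, the orientation of the hybrid index (0 at the low end of the transect, 1 at the high end) and the function name are assumptions made for illustration; they are not a description of the fitting procedure in analyse.

```python
import math

def cline(x, centre, width):
    """Expected hybrid index at transect position x (km).

    Assumed sigmoid form: 0.5 * (1 + tanh(2 * (x - centre) / width)),
    for which width equals 1 / (maximum slope at the centre).
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (x - centre) / width))

# Extremes of the reported ranges: centres 6.67-7.82 km, widths 1.49-3.70 km
narrow = {"centre": 6.67, "width": 1.49}
wide = {"centre": 7.82, "width": 3.70}

for x in (5.0, 6.0, 7.0, 8.0, 9.0):
    print(x, round(cline(x, **narrow), 3), round(cline(x, **wide), 3))
```

Under this parameterization a narrower cline changes more steeply around its centre, which is why a reduced width relative to the molecular cline is read as a signal of stronger selection on the trait.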
Table 4. Summary statistics describing maximum likelihood cline fits for allozymes, simple sequence repeats (SSRs), combined molecular data, and four quantitative trait groups for seven sample sites across the southern slope Senecio hybrid zone on Mount Etna
Columns: marker type/functional group; mean S. chrys score; mean S. aeth. score; width (km) (95% confidence intervals); centre (km) (95% confidence intervals).
Maximum likelihood cline widths and centres and associated loge likelihood (lnL) scores were estimated for each data type using analyse v1.3 (Barton & Baird, 1996). Each quantitative trait group consists of two to three uncorrelated quantitative traits that are involved in similar ecological functions. Confidence intervals were estimated as the largest and smallest cline parameter estimates generating lnL values no more than two lnL units below the maximum likelihood. S. aeth., Senecio aethnensis; S. chrys, Senecio chrysanthemifolius.
Seed and fruit structure
Table 5. Maximum-likelihood cline fit tests of concordance of cline widths and coincidence of cline centres between molecular data and four quantitative trait groups
Columns: marker type/functional group; change in lnL with width fixed to SSR cline; change in lnL with centre fixed to SSR cline.
New clines were fitted to molecular or quantitative trait group (QTG) hybrid index data in the first column with either the width (second column) or the centre (third column) fixed to the cline estimate for different types of molecular data (second-column and third-column headers) using analyse v1.3 (Barton & Baird, 1996). Likelihood ratios of the decrease in loge likelihood (lnL) scores of new clines relative to original clines and their significance are presented. Model comparisons differed by a single degree of freedom, and significance assessment was corrected for multiple testing (**, significant at overall α = 0.01; *, significant at overall α = 0.05; ^, significant at overall α = 0.1). SSRs, simple sequence repeats.
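A likelihood ratio comparison of this kind can be sketched as follows, assuming the standard approach of referring twice the decrease in lnL to a chi-squared distribution with one degree of freedom. The function name is hypothetical, and the multiple-testing correction applied in the table is not reproduced here.

```python
import math

def lrt_p_value(delta_lnL):
    """P value for a one-degree-of-freedom likelihood ratio test.

    delta_lnL is the decrease in lnL when one cline parameter (width or
    centre) is fixed. G = 2 * delta_lnL is compared with chi-squared (1 df),
    whose upper tail probability equals erfc(sqrt(G / 2)).
    """
    if delta_lnL < 0:
        raise ValueError("delta_lnL must be non-negative")
    G = 2.0 * delta_lnL
    return math.erfc(math.sqrt(G / 2.0))

for d in (0.5, 1.92, 2.0, 3.32):
    print(d, lrt_p_value(d))
```

On this scale a decrease of about 1.92 lnL units corresponds to P ≈ 0.05 before any correction, so support limits defined by a two-unit drop in lnL are slightly more conservative than a nominal 95% interval.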
Table 6. Power analysis to detect significant differences in cline width and centre for paired molecular and quantitative trait group comparisons for seven sample sites across the southern slope Senecio hybrid zone on Mount Etna
Columns: first marker type or functional group; second marker type or functional group; detectable cline width difference in km (observed width difference); sample size to detect observed width difference (actual sample size); detectable cline centre difference in km (observed centre difference); sample size to detect observed centre difference (actual sample size).
The detectable cline width and centre differences and the sample sizes required to significantly distinguish observed differences at analysis power 0.8 are presented alongside the observed cline differences and actual sample sizes. Standard deviations of paired cline width and centre differences were derived from 95% confidence intervals of maximum-likelihood (ML) cline width and centre estimates, assuming a normal distribution. Sample size was measured as the mean sample size for each paired cline comparison. SSRs, simple sequence repeats.
0.57 (< 0.01)
7 066 413 (87)
Seed and fruit structure
Tests of different models of hybrid zone structure
Dispersal-independent tests confirmed that altitude was a significant environmental factor explaining molecular variation and QTG variation across the study transect (Table 7). However, dispersal-dependent tests, based on a balance of gene flow and intrinsic hybrid selection, were more effective at explaining molecular variation and QTG variation (with larger F ratios) than were the dispersal-independent altitude tests. Tests combining intrinsic selection and extrinsic selection did not reveal any significant genotype × environment interactions that would indicate an effect of altitude to explain deviations from the expectations of a simple clinal model along the transect (Table 7).
Table 7. Altitudes corresponding to maximum likelihood (ML) cline centre and width estimates and model comparison tests of the relative importance of intrinsic hybrid selection and extrinsic altitudinal selection in influencing clinal variation in hybrid index of allozymes, simple sequence repeats (SSRs), combined molecular data and quantitative trait groups for seven sample sites across the southern slope Senecio hybrid zone on Mount Etna
Columns: marker type/functional group; altitude at centre (km); altitudinal change for width (km); null vs altitude F test (df1, df2) with significance; null vs cline F test (df1, df2) with significance; cline vs cline + altitude F test (df1, df2) with significance.
Altitude estimates are derived by mapping ML distance estimates of cline centres and widths onto the corresponding transect altitude. Variance ratio tests and their significance, comparing three nested pairs of models of hybrid index variation with different explanatory parameters, are presented. Model comparison tests are: dispersal-independent clines (extrinsic altitudinal selection vs no selection); dispersal-dependent clines (intrinsic hybrid selection vs no selection); and dispersal and extrinsic selection clines (intrinsic plus extrinsic selection). Quantitative trait groups are described in Table 2. Significance assessment was corrected for multiple testing (**, significant at overall α = 0.01; *, significant at overall α = 0.05).
Elevation in variance with intermediate values of the mean
Three QTGs, describing inflorescence, leaf-structure and fruit-structure traits, exhibited a significantly positive association between variance and intermediate hybrid index, with estimated elevations in variance of 0.196, 0.134 and 0.209, respectively (Fig. 5). Elevations in variance of these magnitudes implied small numbers of loci responsible for between-species differentiation: 0.59, 0.87 and 0.57 for the inflorescence, leaf-structure and fruit-structure QTGs, respectively (Fig. 5).
Elevation in LD and covariance with intermediate values of the mean
The combined molecular data exhibited a significant, positive association between paired-loci LD and intermediate hybrid index, corresponding to an estimated elevation in LD of 0.028 at the hybrid zone centre (Fig. 6). The strength of intrinsic selection against hybrids maintaining this amount of LD at the hybrid zone centre was estimated to be between 2.2 and 11.2% for recombination rates from 0.1 to 0.5, respectively. The dispersal distances required to maintain this amount of LD in the Senecio hybrid zone were estimated to range from 0.17 to 0.38 km per generation for recombination rates from 0.1 to 0.5, respectively. Finally, this amount of LD at the hybrid zone centre gave estimates from 11 to 57 generations since species contact and hybridization, assuming no hybrid selection, for recombination rates from 0.5 to 0.1, respectively.
There was a significantly positive association between paired-QTG covariance and intermediate hybrid index values for molecular data, corresponding to an estimated elevation in covariance of 0.034 at the hybrid zone centre (Fig. 7). Intrinsic hybrid selection from 8.1 to 40.6%, and dispersal rates from 0.24 to 0.54 km per generation, would be required to generate this amount of QTG covariance for recombination rates from 0.1 to 0.5, respectively. Alternatively, this amount of QTG covariance was estimated to have resulted from 3 to 16 generations of species contact and hybridization, assuming no selection, for recombination rates from 0.5 to 0.1, respectively.
Although clinal analysis provides a powerful way of analyzing hybrid zones, permitting indirect estimation of gene flow, selection and adaptive divergence, it has rarely been applied to plants (but see Cruzan, 2005, and Schemske & Bierzychudek, 2007, for two recent exceptions). This is surprising, given the extent of hybridization and its importance in plant evolutionary processes (Ellstrand et al., 1996; Rieseberg et al., 2003; Arnold, 2006; Baack & Rieseberg, 2007). We applied clinal analysis to the Senecio hybrid zone on Mount Etna, Sicily, and showed that this hybrid zone is shaped by gene flow and selection against hybrids. Our results provide evidence of narrowed cline widths and shifted cline centres for some quantitative traits relative to molecular genetic clines, indicating that environmental selection is important in structuring trait differentiation across the Senecio hybrid zone. However, at a smaller spatial scale across the hybrid zone, gene flow and intrinsic selection against hybrids predominate over altitude-related environmental selection.
Selection against hybrid phenotypes
Clines for QTGs that were identified as being significantly different from molecular clines are informative about the extent and targets of environmental selection, or phenotypes that are likely to be of reduced fitness in natural populations. Loge likelihood cline-comparison tests found that, in comparison to the cline for molecular variation, the cline width for leaf structure was significantly narrower, while the cline centre for inflorescence structure was significantly closer to the high-altitude end of the cline (Table 5). These cline differences were confirmed by nonspatial regression tests that included data from seven additional sites sampled. These analyses also provided sufficient statistical power to infer inflorescence cline widths and fruit-structure cline widths that were significantly, and almost significantly, narrower, respectively (Tables 5, 6). In addition, the cline centre for leaf structure was almost significantly closer to the low-altitude end of the cline (Fig. 4), suggesting that the position of selective optima varies for different traits. Thus, three of the four QTGs showed evidence for stronger selection against hybrids than observed in molecular traits, suggesting that differentiation between these QTGs is adaptive and that LD is sufficiently low at the hybrid zone centre to allow clines to vary in width and position, relative to their own contribution to selection. These QTGs are differentiated in the direction of larger inflorescences and fruits, and less-dissected leaves in S. aethnensis relative to S. chrysanthemifolius (Table 2). Cline differences between these different QTGs may be caused by different agents or intensities of extrinsic environmental selection or different numbers of loci controlling traits affecting selection intensity per locus. Future, more detailed, sampling and analyses are necessary to explore these possibilities further.
Altitude is a powerful environmental variable that is associated with many environmental variables, such as temperature, light and soil conditions. Each of these is likely to be an important selective force constraining species distributions (Angert & Schemske, 2005; Hall & Willis, 2006; Wu & Campbell, 2006; Kimball & Campbell, 2009). Environmental selection correlated with altitude is therefore important for understanding the overall position of the hybrid zone and the ecological trait divergence within it. The influence of altitude was explored by comparing how well different models of selection explained patterns of molecular and QTG variation across the sample transect (Table 7). Altitude was strongly associated with the orientation of the hybrid zone determining species distributions. However, a clinal model, based on a balance between gene flow and intrinsic hybrid selection, was better at explaining the patterns of observed variation, and the addition of altitude as an extra model parameter, following such cline fitting, did not suggest the strong genotype–environment interactions that would be observed in a mosaic hybrid zone (MacCallum et al., 1998). However, altitudinal selection would not be detected by this test if the altitudinal selection gradient was nonlinear in form (see the section on barriers to gene flow discussed later). In particular, a similar result could be obtained if the effects of altitude on selection were steepened in one part of the altitudinal transect. This would make the effect of gene flow on trait means and fitness locally stronger, creating sigmoid clines in a narrow geographical region. Alternatively, there may have been insufficient site and altitude combinations in this small transect study to detect the fine-scale influence of altitude.
Selection against hybrid genotypes
Our results suggest that, in addition to the significantly different clines influenced by environmental selection, clines for other molecular markers and QTGs are also strongly influenced by intrinsic selection against hybrids. Clinal dispersal-dependent models, based on a balance of gene flow and intrinsic selection against hybrids, were highly effective at explaining molecular and QTG variation across the hybrid zone (Table 7). Most clines were largely concordant and coincident (Tables 5 and 6), and thus characteristic of hybrid tension zones where intrinsic selection against hybrids is strong relative to recombination rates. Such intrinsic selection maintains high LD and similar selection intensity across the genome (Barton & Hewitt, 1989; Szymura & Barton, 1991).
Elevated LD, associated with intermediate hybrid index, permitted indirect estimates of selection against hybrids, from 2.2 to 11.2% (Fig. 6). There is some uncertainty about this estimated elevation in LD because, in contrast to expectations, LD was still significantly greater than zero in parents (Fig. 6), possibly reflecting scoring errors introduced by the use of biallelic molecular data to calculate LD. The number of samples with intermediate trait scores was also low. The estimated selection strength corresponds to the contribution of all loci experiencing both intrinsic and extrinsic selection against hybrids (Barton & Gale, 1993). The strength of intrinsic hybrid selection probably also varies between hybrids over generations, a common observation being that early generation hybrids exhibit the lowest relative fitness because they possess the most unfavourable parental allele associations compared with later generation hybrids (Arnold & Bennett, 1993; Fishman & Willis, 2002; Johansen-Morris & Latta, 2006). In synthetic S. aethnensis × S. chrysanthemifolius hybrids, F2 and F3 generations exhibited low relative fitness in terms of seed germination, suggesting strong intrinsic selection against hybrids (Hegarty et al., 2009). In the field, selection against hybrids is likely to be higher than intrinsic hybrid selection, as measured in the glasshouse, because it also includes extrinsic selection against hybrids.
Elevated covariance between QTGs was also used to estimate selection against hybrids as between 8.1 and 40.6%, which is greater than estimated for molecular data (Fig. 7). This difference may reflect selection maintaining covariance between QTGs, while LD decays more rapidly between marker loci. However, such effects are likely to be relatively weak at cline centres, where LD is dominated by gene flow between differentiated populations (Barton & Gale, 1993). In any case, both LD and covariance were significantly elevated, supporting the observed coincidence and concordance of independent molecular and QTG clines.
Mean dispersal rates per generation, from 0.17 to 0.38 km and from 0.24 to 0.54 km, were estimated from the elevations in LD and covariance (Figs 6, 7) and the estimated cline widths (Table 4). Such mean dispersal rates are higher than typical for herbaceous plants that tend to disperse no more than a few tens of metres (Bullock et al., 2002). However, direct measures of dispersal tend to underestimate actual gene flow because they may miss relatively rare, but nonetheless important, long-distance dispersal events (Szymura & Barton, 1991; MacCallum et al., 1998). Furthermore, Senecio, in common with Asteraceae in general, produces fruits (achenes) equipped with a pappus that might promote long-distance, wind-borne dispersal. Alternatively, assortative mating, inbreeding, or site extinction and colonization at the cline centre could all cause departure from the random mating and/or diffusion models assumed here, and so upwardly bias dispersal and selection estimates from those predicted by simple tension-zone models, as seen in Chorthippus grasshoppers (Bridle & Butlin, 2002).
When many loci contribute to selection against hybrids, much of the rest of the genome is linked to these loci, and recombination of neutral alleles from one parental genetic background to the other parent is slowed (Szymura & Barton, 1991; Barton & Gale, 1993; Kruuk et al., 1999). The increased variance in hybrids relative to parents, after accounting for elevated hybrid LD, gives an indication of the number of loci contributing to parental differentiation and divergent selection (Sanderson et al., 1992; Barton & Gale, 1993). Significant elevations in QTG variance in hybrids led to estimates of just one to two loci controlling each of these functional groups (Fig. 5). These estimates of the number of loci differentiating these QTGs (themselves consisting of many phenotypic characters) are surprisingly small, and so few loci would not generate enough linkage and LD with the rest of the genome to produce the observed similarities in clines for independent molecular markers and QTGs. This method of estimating loci number is sensitive to violation of assumptions such as strict additivity of contributing loci (Lande, 1981; Saldamando et al., 2005). Furthermore, QTG discriminant function assignment probabilities, which focus on between-species variation, may have downwardly biased the estimates of number of loci. Overall, the message from these analyses is that LD and covariance were higher than expected, suggesting the presence of spatially complex patterns of contact and/or assortative mating within the hybrid zone in addition to strong selection against hybrids. Distinguishing between these possibilities will require detailed sampling, field experiments and genomic analyses of this hybrid zone.
Time since species contact and hybridization
The possibility that similar clines might also reflect recent contact between the parents was assessed by indirect estimates from 11 to 57 or from 3 to 16 generations of hybridization since contact, based on the LD and covariance analyses, respectively. As generation time in Senecio is one to a few years, these values clearly underestimate the age of the Senecio hybrid zone because herbarium and documentary evidence dates it to at least 300 yr of age (Kent, 1956; Harris, 2002). However, these estimates neglect the influence of selection against hybrids, limiting the power of recombination to transfer alleles between parental genetic backgrounds (Slatkin, 1973; Endler, 1977; Barton & Bengtsson, 1986; Barton & Gale, 1993). Thus, an underestimate of the number of generations since contact and hybridization serves as additional evidence for strong selection against hybrids.
Alternatively, the study transect might represent a relatively recent hybrid zone. S. aethnensis requires disturbed habitats created by frequent volcanic eruptions on Mount Etna that also cause site extinctions, while S. chrysanthemifolius and hybrids benefit from habitat disturbance by humans (Authors, personal observations). The hybrid clines currently present on Mount Etna could therefore represent regions of relatively recent and transient contact between the parents. Frequent extinction and colonization would decrease estimates of time since species contact, but also elevate LD and trait covariance and associated estimates of selection and dispersal.
Barriers to gene flow
Another reason why some of the clines for independent molecular markers and QTGs are similar is that these clines are associated with a strong barrier to gene flow on Mount Etna. Physical barriers create zones of low population density where independent tension-zone clines accumulate over time as a result of asymmetrical dispersal from neighbouring high-density parental populations (Barton & Hewitt, 1989). On Mount Etna, there is no obvious physical barrier to gene flow or any region of low population density along the southern transect (Authors, personal observation). However, altitude might be important if the relationship between altitude and selection is complex. For example, multiple altitude-associated selective factors, such as precipitation, temperature and winter frost, could all change abruptly in the same position, such as at the cloud base, rather than linearly. Such a clinal altitudinal selection gradient could cause clines for different markers and traits to converge and would be difficult to distinguish from a tension-zone cline formed by intrinsic selection (Kruuk et al., 1999). Such an effect of population history makes distinguishing between intrinsic and extrinsic selection difficult in practice (as discussed earlier). It also makes it difficult to distinguish between evolutionary divergence that has occurred in situ, or predates contact between the species at this locality. However, detailed hypotheses about the shape of the selective function associated with altitude could be tested by comparing hybrid zone transects across different altitudinal gradients, or by controlled transplantation and fitness experiments.
With regard to the specific questions posed in the introduction to this paper, this study has demonstrated the following: there is considerable molecular and quantitative trait differentiation between S. aethnensis and S. chrysanthemifolius on Mount Etna (Tables 2, 3); patterns of molecular and QTG variation all form clines across the hybrid transect (Table 4); differences in centre and width between these clines argue that extrinsic selection against intermediates acts most strongly on QTGs for inflorescence, leaf and fruit structures (Table 5, Fig. 4), but that intrinsic selection against hybrids is key to structuring the majority of molecular and QTG variation across the Senecio hybrid zone (Table 7); altitude is important in influencing the distributions of the two parent species but not for structuring small-scale patterns of hybrid variation (Table 7); and estimates of selection against hybrids were moderate to strong, mean dispersal rates were high, and numbers of loci differentiating the species were small (Figs 5–7). Future studies involving additional sites and transects from the Senecio hybrid zone on Mount Etna should aim at direct measures of selection and dispersal that could be compared against our findings with a view to obtaining a better understanding of the maintenance of adaptive differences between S. aethnensis and S. chrysanthemifolius.
We thank David Forbes, Harry Hodge, Amy Millar and Rebecca Ross for their assistance in fieldwork, plant care and measurement. This article benefited from discussions with Roger Butlin, Mike Ritchie and Paris Veltsos and the valuable comments of two anonymous referees. The research was funded by a NERC Grant NE/D014166/1 to R.J.A. as PI. A.L.W. was funded by a China Scholarship Council award.