Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with FST-QST comparisons


Mollie K. Manier, Hopkins Marine Station, Oceanview Blvd., Pacific Grove, CA 93950, USA.
Tel.: +1 831 655 6210; fax: +1 831 655 6215.


Populations of the terrestrial garter snake (Thamnophis elegans) around Eagle Lake in California exhibit dramatic ecotypic differentiation in life history, colouration and morphology across distances as small as a few kilometres. We assayed the role of selection in ecotypic differentiation in T. elegans using FST-QST analysis and identified selective agents using direct and indirect observations. We extended the conventional implementation of the FST-QST approach by using three-level analyses of genetic and phenotypic variance to assess the role of selection in differentiating populations both within and between ecotypes. These results suggest that selection has driven differentiation between as well as within ecotypes, and in the presence of moderate to high gene flow. Our findings are discussed in the context of previous correlational selection analyses which revealed stabilizing and correlational selection for some of the traits examined.


Adaptive divergence is often a compromise between the opposing forces of selection and gene flow. The tension between these forces is especially strong in the case of ecotypic differentiation, in which populations in close proximity may become locally adapted to different selective pressures in the presence of persistent gene flow. An ongoing challenge is to establish the degree and scale of ecotypic differentiation, empirical questions that must be tackled on a taxon by taxon basis. We pursue an analysis of ecotypic differentiation using FST-QST comparison and test a hypothesis of divergence within and between ecotypes. We further identify candidate selective agents using direct and indirect observations of predation events.

The statistical comparison of population differentiation at quantitative traits (QST) and neutral molecular markers (FST) provides a powerful test for the role of selection in phenotypic divergence (Lande, 1992; Spitze, 1993; Merilä & Crnokrak, 2001; McKay & Latta, 2002). Departures from neutral expectations can be used to distinguish between a history dominated by diversifying (QST > FST) as opposed to stabilizing selection (QST < FST). In applications in a wide range of taxa, a priori expectations of the FST-QST relationship have been used both to detect and assay modes of selection acting on quantitative traits (Palo et al., 2003; Wong et al., 2003; Cano et al., 2004) and to test specific hypotheses about local selection (Baker, 1992; Waldmann & Andersson, 1998; Gomez-Mestre & Tejedo, 2004). Here, we test the power of the FST-QST approach by asking whether it can resolve the role of selection in promoting differentiation among populations within ecotypes, as well as between ecotypes. The success of this test may indicate whether the FST-QST approach will generally be able to assess the role of selection in truly subtle instances of phenotypic differentiation.

In our study system, populations of the terrestrial garter snake (Thamnophis elegans) in the Eagle Lake basin of Lassen Co., California show ecotypic differentiation on a scale of several kilometres and in the presence of gene flow high enough to override drift (up to 1.8 effective migrants per generation) at neutral microsatellite markers (Bronikowski & Arnold, 1999; Manier & Arnold, 2005; Sparkman et al., 2007). Previous research has documented ecotypic differences in reproduction, growth and survival between populations along the rocky shoreline of Eagle Lake and those inhabiting the densely vegetated surrounding meadows (Bronikowski & Arnold, 1999). The life history differences constitute a syndrome that may be driven by higher predation rates at the lakeshore. These populations grow faster, reproduce at an earlier age and have larger litters but suffer higher adult mortality than meadow populations. Common garden experiments have demonstrated a genetic basis for the difference in growth rate (Bronikowski, 2000).

Here, we focus on differences in colouration that appear to increase crypticity in rocky lakeshore and grassy meadow environments. To visual predators, the muted colours of lakeshore snakes (dull yellow or tan stripes on a grey background colour) tend to match the rocky substrate of the lakeshore, whereas the meadow snake colour pattern (yellow or orange stripes on a black background) closely resembles dead rushes that litter the shallow meadow substrates (Fig. 1). The difference in colouration between lakeshore and meadow ecotypes may be a result of differential selection for crypticity (Kephart, 1981).

Figure 1.

 Examples of Thamnophis elegans ecotypes. (a) Meadow ecotype. (b) Lakeshore ecotype.

We also examined six scale counts for adaptive differences between T. elegans ecotypes. Vertebral number (measured using ventral and subcaudal scale counts) can vary between different habitats as a function of a snake's ability to utilize substrate irregularities or ‘push-points’ for propulsion during locomotion (Jayne, 1988; Gasc et al., 1989; Kelley et al., 1997). In thickly vegetated habitats with a higher density of push-points, T. elegans and other snake populations have fewer vertebrae, whereas rocky habitats that have fewer push-points support populations with more vertebrae (Klauber, 1941; Kelley et al., 1997; Arnold & Phillips, 1999). Because lakeshore habitats provide fewer push-points than meadow habitats, we expected to see more body and tail vertebrae in lakeshore than in meadow T. elegans. The other scalation traits are likely to reflect a snake's ability to ingest large prey, with high values for these traits promoting extended cranial kinesis (infralabial, supralabial and postocular scale counts) and midsection elasticity (midbody scale count). Because diet studies indicate that lakeshore snakes generally eat larger prey items (fish) than meadow snakes (anuran larvae, leeches; Kephart & Arnold, 1982; Kephart, 1982), we expect selection for ability to swallow larger prey in lakeshore populations and hence higher scale counts. Both scale counts and colouration have been shown to be under selection in these and other populations of snakes (Arnold, 1988; Arnold & Bennett, 1988; Brodie, 1992; King, 1993a; Lindell et al., 1993), making these traits good candidates for our study.

We also ask whether selection has played a role in promoting the more subtle differences in colouration and morphology that appear to characterize populations within each of the two ecotypes. Meadow populations in our system, for example, appear to differ subtly in colouration, such that some populations have a higher incidence of yellow as opposed to orange dorsal stripes. Likewise, meadow populations occupy habitats that differ slightly in water depth, seasonal patterns of drying, phenology and composition of vegetation, as well as in prey availability (Kephart, 1982; Manier & Arnold, 2006). Similar small differences in snake colouration and habitat characteristics can be seen from one lakeshore population to the next. One of our goals is to assess the statistical reality of these impressions of morphological differentiation within ecotypes and to determine whether they might represent responses to selection.

We used estimates of neutral divergence at microsatellite loci to determine whether colouration and scalation traits have experienced diversifying selection, especially between the two ecotypes. We can reject neutrality as an explanation of population differentiation in quantitative traits if QST ≠ FST. Thus, QST > FST suggests diversifying selection, whereas QST < FST suggests stabilizing selection towards the same optimum in different populations (Lande, 1992; Spitze, 1993). Because each ecotype is represented by multiple populations, we can also determine the importance of differentiation within and between ecotypes to FST and QST. Based on our hypothesis of ecotypic differentiation, we expect QST estimates to far exceed FST in ventral scale counts and colouration. We expect that this population differentiation is largely a consequence of strong divergence between ecotypes (QCT >> FCT), with some contribution from slight divergence among populations within ecotypes (QSC > FSC). We used parallel, three-level analysis of variance for microsatellite alleles and for quantitative traits to test these hypotheses. We also present evidence from direct observations of predation and analysis of culmen impressions on snakes that implicate avian predators as the selective agents responsible for this adaptive differentiation.

Materials and methods

Study sites

Phenotypic and genotypic data were collected from six populations (Table 1) in the vicinity of the south-east corner of Eagle Lake in Lassen Co., California (Fig. 2). Eagle Lake is California's second-largest natural lake and supports populations of garter snakes at intervals along its extensive shoreline. These lakeshore populations are separated by stretches of shoreline a few to many kilometres long that are uninhabited by T. elegans. Distances between populations within an ecotype ranged from 1.3 to 19.9 km, whereas between-ecotype distances ranged from 3.8 to 19.5 km. Euclidean distances were used among all sites, except the two lakeshore populations, for which shoreline distance was used. Snakes were found in the open and by lifting cover objects and were collected by hand.

Table 1.   Names, abbreviations, ecotype and sample sizes of study populations for scalation and colour variables (for males and females) and microsatellite markers.

Population name       Abbreviation   Ecotype      Scalation     Colour        Microsat
Gallatin Shoreline*   GAL            lakeshore    387   406     363   382     56
Mahogany Lake         MAH            meadow       56    100     48    65      91
McCoy Flat Res.       MCY            meadow       72    116     23    27      16
Nameless Meadow*      NML            meadow       41    109     57    55      29
Papoose Meadows       PAP            meadow       62    111     36    56      140
Pikes Point           PIK            lakeshore    346   445     77    79      48

*Study site name is informal only, not an official geographic place name.

Measurement of quantitative traits

Colouration traits were scored on 1268 live T. elegans. Population samples ranged from 50 to 745 individuals. We scored colouration traits by matching the dorsal and lateral stripes and the background area between the stripes to colour standards under diffuse, natural light. One person (SJA) did all colour scoring in the field, and snake colours were scored at mid-body under uniform lighting at mid-morning. Predation is most likely at this time of day, when snakes emerge from nocturnal refugia to sun themselves.

Dorsal and lateral stripes were matched to 10 different Pantone® colour swatches, ranging from yellow to orange to tan to pink (127, 128, 134, 135, 136, 137, 141, 148, 155, 162; out of 36 tested). For statistical analyses, we translated the Pantone colour codes into two different three-element colour schemes, HSL (hue, saturation, lightness) and RGB (red, green, blue), which were then evaluated for their ability to detect selection. The RGB system should be more biologically relevant, as it approximates the types of vertebrate wavelength receptors (Jacobs, 1981; Chen & Goldsmith, 1986; Jane & Bowmaker, 1988). HSL, on the other hand, is derived from Munsell codes (Munsell Colour Company, 1976), which classify colour into groups based on human perception. For background, we used a 5-point Kodak grey scale. Sampling bias of certain colour patterns is unlikely, because snakes were not always identified against the dominant background, as when found under cover objects. Furthermore, any sampling bias would most likely result in an underestimate of population divergence, because collectors would preferentially capture snakes with colouration-background mismatch. Variable abbreviations represent all combinations of stripe and colour component; for example, saturation of the dorsal stripe was DORSAT (e.g. Table 2). Background colour (BKGRD) was quantified as various degrees of darkness, with higher numbers corresponding to a darker background.
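
The translation from an RGB triple to HSL components can be sketched as follows. This is only an illustration using Python's standard `colorsys` module; the RGB triple below is a hypothetical orange and is not an official Pantone value, and the function name is our own.

```python
import colorsys

def rgb255_to_hsl(r, g, b):
    """Convert 0-255 RGB values to (hue in degrees, saturation %, lightness %)."""
    # colorsys works on 0-1 floats and returns the order (hue, lightness, saturation).
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s * 100.0, l * 100.0

# Hypothetical RGB triple for an orange swatch (assumption, not a Pantone standard).
swatch_rgb = (255, 128, 0)
hue, sat, light = rgb255_to_hsl(*swatch_rgb)
print(f"hue={hue:.1f} deg, saturation={sat:.0f}%, lightness={light:.0f}%")
```

Each stripe then contributes three scores per scheme (for example, DORHUE, DORSAT and DORLT under HSL), so a single matched swatch yields the full set of stripe variables.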

Table 2.   Heritability estimates for colouration traits.

P    < 0.001    < 0.001    < 0.001    0.007    < 0.001    0.421    0.056    < 0.001    < 0.001    0.014    0.507    < 0.001    0.082

Significance levels (P) and standard errors (SE) were assessed by boot-strapping over the family structure.
N, number of families.

The following six scale counts were made on 2251 preserved and live specimens: number of ventral scales on the body (VENT); number of subcaudal scales (SUB); total number of infralabial (ILAB); supralabial (SLAB); postocular (POST) scales on the left and right sides; and number of dorsal scale rows at midbody (MID), as described by Arnold & Phillips (1999). VENT and SUB correspond, respectively, to the numbers of body and tail vertebrae (Alexander & Gans, 1966; Voris, 1975). Missing values comprised < 4% of the dataset, three-fourths of which were attributed to missing tail tips (SUB). Sex was determined by eversion of hemipenes. Scale counts do not change during the ontogeny of an individual. Experimental studies of the developmental effect of temperature on scalation traits in T. elegans (Arnold & Peterson, 2002) indicate that environmental differences in temperature are unlikely to account for population and ecotypic differences in scalation.

Inheritance of quantitative traits

The heritabilities of colouration traits were assessed using a sample of 325 individuals representing 35 sibships from the Pikes and Wildcat populations. Wildcat is a lakeshore site (see Fig. 2) excluded from the microsatellite analysis because of low sample size. All colouration scores were made on neonates, no more than 1 month after birth. Colouration traits are fully expressed at birth, and no obvious ontogenetic trends were observed when individuals were reared to maturity in the laboratory. Heritabilities (h2), genetic covariances, and environmental covariances were estimated by treating the sibships as unrelated sets of fullsibs using software (H2BOOT; Phillips, 1998) available at a website maintained by P.C. Phillips. Fullsib anova (Falconer & MacKay, 1996) rather than mother–offspring regression was used to estimate heritabilities, because colouration scores for mothers were missing for over a third of the sibships. Multiple paternity, known to occur in garter snakes (Garner & Larsen, 2005), probably had a negligible effect on our estimates (Arnold & Phillips, 1999). Even in the presence of multiple paternity, most littermates are, on average, nearly full-sibs (Schwartz et al., 1989). Estimates of heritability can be inflated by maternal effects (Falconer & MacKay, 1996), but we have shown that scalation traits are buffered against the most conspicuous type of maternal influence, viz., maternal temperature during development (Arnold & Peterson, 2002). Nevertheless, these fullsib estimates should be viewed as upper bounds on narrow sense heritability, because they may be inflated to an unknown degree by dominance variance and a common family environment during gestation (Arnold, 1981; Falconer & MacKay, 1996). Standard errors of heritability were estimated and tests of the hypothesis that h2 ≥ 0 were conducted using 1000 bootstrap samples in H2BOOT. Heritabilities were estimated separately for males (184 individuals in 35 sibships) and females (141 individuals in 32 sibships), and the average of the separate estimates was used in the QST analyses.
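
As a rough illustration of the fullsib approach, heritability can be taken as twice the intraclass correlation from a one-way anova across sibships, with a standard error obtained by resampling whole families. The sketch below is not the H2BOOT code; the function names and the toy scores are hypothetical, and it assumes unrelated full-sib families with no maternal or dominance effects.

```python
import random

def fullsib_h2(families):
    """Full-sib heritability: h2 = 2 * V_among / (V_among + V_within)."""
    k = len(families)
    sizes = [len(f) for f in families]
    n_total = sum(sizes)
    grand = sum(sum(f) for f in families) / n_total
    means = [sum(f) / len(f) for f in families]
    ss_among = sum(n * (m - grand) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for f, m in zip(families, means) for x in f)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective family size for unbalanced designs.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    var_among = (ms_among - ms_within) / n0
    return 2.0 * var_among / (var_among + ms_within)

def bootstrap_se(families, n_boot=1000, seed=1):
    """Standard error of h2 from resampling families with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(families) for _ in families]
        estimates.append(fullsib_h2(resample))
    mean = sum(estimates) / n_boot
    return (sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)) ** 0.5

# Hypothetical colour scores for three sibships (toy data, not from this study).
toy_families = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h2_toy = fullsib_h2(toy_families)
se_toy = bootstrap_se(toy_families, n_boot=200)
```

With so few toy families the estimate exceeds 1, which simply reflects the artificial data; real estimates depend on many sibships, as in the sample described above.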

Figure 2.

 Map showing locations of meadow (MCY, PAP, NML, MAH) and lakeshore (PIK, GAL) sites. An additional lakeshore site, WDC, was used to generate heritabilities for scale counts.

Estimates of heritability for the scale count traits were taken from Arnold & Phillips (1999). Those estimates are based on mother–offspring regressions using a sample of 102 mothers and 911 offspring from Pikes Point and the population at Wildcat Point. We used the average of these male and female heritabilities, shown in Table 6 of Arnold & Phillips (1999), estimated from the genetic and phenotypic variances from this population given in Tables 3 and 4 of Arnold & Phillips (1999).

Microsatellite analysis of population structure

Nine microsatellite loci were scored for a total of 380 individuals representing the six study populations. Population samples ranged from 16 to 140 individuals (Table 1). These samples generally do not overlap with those used for phenotypic traits. A tail tip or piece of ventral scale was clipped and stored in Drierite®, an anhydrous calcium sulphate desiccant. Whole genomic DNA was extracted using sodium dodecyl sulphate-proteinase K digestion followed by a standard phenol–chloroform extraction, NaCl purification and isopropanol precipitation. DNA was PCR amplified in a 12.5 μL reaction with 10 mM Tris-HCl (pH 9.0), 50 mM KCl, 0.1% Triton X-100, 0.2 mM each of dNTPs, 1.5 mM MgCl2, 0.48 μM forward (labelled with fluorescent ABI dye) and reverse primer, and 0.3 U Taq DNA polymerase. PCR profiles consisted of 94 °C for 2 min followed by 36 cycles of 94 °C for 30 s, appropriate annealing temperature for 30 s and 72 °C for 30 s, ending with 72 °C for 2 min. PCR products were separated using an ABI 3100 capillary electrophoresis genetic analyser and data were visualized using Genotyper 3.7 (ABI Prism).

Population genetic analysis of the microsatellite data is described in Manier & Arnold (2005). All microsatellite data were combined for both sexes. These analyses included exact tests for departure from Hardy–Weinberg equilibrium (Guo & Thompson, 1992; Markov chain parameters: 5000 dememorizations; 500 000 steps per chain) calculated in arlequin v. 2.000 (Schneider et al., 2000) and tests for linkage disequilibrium (Slatkin & Excoffier, 1996; Markov chain parameters: 5000 dememorizations, 1000 batches, 5000 iterations per batch), performed in genepop (Raymond & Rousset, 1995). Significance levels were adjusted for multiple comparisons using two methods, the sequential Bonferroni correction (Rice, 1989) and false discovery rate control (FDR; Benjamini & Hochberg, 1995), both of which yielded the same results. Only Mahogany Lake was out of Hardy–Weinberg equilibrium (at one locus), and we found no evidence for linkage disequilibrium (Manier & Arnold, 2005). Number of alleles and observed and expected heterozygosities in each population and over all populations were calculated in genepop (Raymond & Rousset, 1995). Global estimates of FST were calculated using amova (Excoffier et al., 1992) in arlequin v. 2.000, and significance was assessed after 16 000 permutations. Standard errors of FST were estimated by jackknifing in fstat v. (Goudet, 2001).
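
The two multiple-comparison procedures can be sketched as below. This is a generic illustration of the sequential Bonferroni (step-down) rule and the Benjamini–Hochberg step-up rule, not the code used for the analyses; the function names and the example P-values are hypothetical.

```python
def sequential_bonferroni(pvals, alpha=0.05):
    """Step-down rule: compare the i-th smallest P to alpha / (m - i)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger P-values are retained
    return reject

def benjamini_hochberg(pvals, alpha=0.05):
    """Step-up FDR rule: find the largest rank with P <= (rank / m) * alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject

# Hypothetical P-values (toy data, not from this study).
toy_p = [0.001, 0.01, 0.03, 0.039, 0.5]
holm_result = sequential_bonferroni(toy_p)
fdr_result = benjamini_hochberg(toy_p)
```

The two procedures need not agree in general, since FDR control is less conservative; in the toy example the FDR rule rejects more hypotheses than the sequential Bonferroni rule.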

Population differentiation in quantitative traits

Differences in phenotypic traits between males and females were calculated for scale counts and colour scores using anova in sas v. 9.1 (SAS Institute, 2002). Sample sizes are shown in Table 1. In a preliminary analysis, all populations were pooled to assess sexual dimorphism over the entire study area. That analysis revealed that two scale counts (VENT and SUB) and four colour scores (DORGRN, DORHUE, LATGRN and LATHUE) were sexually dimorphic (P < 0.0001), after sequential Bonferroni and FDR control for multiple comparisons. For these traits, the difference between the sexes ranged from 0.2 to 0.4 standard deviations for the colouration traits to 0.7 standard deviations for the scalation traits. Females had more muted colours, and males had more ventral and subcaudal scales. Consequently, we analysed males and females separately.

We used a three-level partitioning of variance to estimate variance components and characterize the population structure of both microsatellite and phenotypic traits. For both kinds of traits, variation was partitioned into within-population, among-population (within ecotypes) and between ecotype components of variance by anova. Variance components were calculated by equating observed mean squares with their expectations. For microsatellite traits, an analysis of molecular variance (amova) was conducted in arlequin v. 2.000, applying results given by Excoffier et al. (1992). Thus, we partitioned the total variance in repeat number at each locus into three parts, V = Va + Vb + Vc, where Va is the between-ecotype component of variance, Vb is the among-population (within ecotype) component of variance and Vc is the within-population component of variance. F-statistics were computed from these descriptive components of variance (Excoffier et al., 1992). FST = (Va + Vb)/V is the proportion of the total variance that resides among populations, including both the between-ecotype and the among-population (within ecotype) components. FCT = Va/V is the proportion of the total variance that resides between ecotypes. FSC = Vb/(Vb + Vc) is the proportion of the total variance within ecotypes that resides among populations. Statistical significance of each F-statistic was computed in arlequin v. 2.000 by permutation analysis (Excoffier et al., 1992). Global and population pairwise estimates of FST were also calculated using amova in arlequin v. 2.000. Significance was assessed after 16 000 permutations for global estimates and 3000 permutations for pairwise estimates, using the default settings in arlequin.
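
The three F-statistics follow directly from the variance components defined above. The short sketch below simply encodes those definitions; the function name is our own and the input values are illustrative only.

```python
def hierarchical_f_statistics(va, vb, vc):
    """F-statistics from between-ecotype (va), among-population within-ecotype (vb)
    and within-population (vc) components of variance."""
    total = va + vb + vc
    return {
        "FST": (va + vb) / total,  # among populations, all levels
        "FCT": va / total,         # between ecotypes
        "FSC": vb / (vb + vc),     # among populations within ecotypes
    }

# Illustrative components expressed as proportions of total variance (assumed values).
f_stats = hierarchical_f_statistics(va=0.03, vb=0.01, vc=0.96)
```

A useful check on any such partition is that the three statistics satisfy (1 − FST) = (1 − FSC)(1 − FCT), which holds by construction for these definitions.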

For phenotypic traits, descriptive components of variance were estimated separately for each sex from a three-level nested anova, computed using the random statement in proc glm in sas v. 9.1. The statistical significance of Vb was evaluated by testing the among-population within-ecotype mean square over the error (within-population) mean square. The statistical significance of Va was evaluated by testing the between-ecotype mean square over the among-population (within-ecotype) mean square. The statistical significance of (Va + Vb) was evaluated by computing separate two-level anovas (in which the ecotype identities of populations were ignored) and testing the among-population mean square over the error mean square. In all cases Type III sums of squares were used. The within-population component of genetic variance, Vc, was estimated by multiplying the observed within-population component of variance, Vw, by the corresponding heritability for that trait. Thus, Vc = h2Vw. The between-ecotype and among-population components of variance were equated with the corresponding genetic components of variance. To produce results analogous to the F-statistics for the microsatellite loci, Q-statistics (Spitze, 1993) were computed from these genetic components of variance using results given by Wright (1943) and Lande (1992) for quantitative traits. Thus,

$$Q_{ST} = \frac{V_a + V_b}{V_a + V_b + 2V_c}, \qquad Q_{CT} = \frac{V_a}{V_a + V_b + 2V_c}, \qquad Q_{SC} = \frac{V_b}{V_b + 2V_c}.$$

These statistics have the same interpretations as the corresponding F-statistics. Notice that because estimates of heritability algebraically affect only a portion of the denominator in estimates of QST, QCT and QSC, the value of heritability (and hence any error in its estimation) has relatively little effect on these Q-statistics. Notice, too, that an overestimate of h2 (e.g. because of dominance or maternal effects) will lead to an underestimate of Q-statistics, and hence a conservative comparison with F-statistics. A Mantel test (Mantel, 1967; Mantel & Valand, 1970; Manly, 1997), implemented in arlequin v. 2.000, was used to assess the correlation between pairwise estimates of FST and QST for each trait (significance over 10 000 permutations).
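
The effect of heritability on the Q-statistics can be illustrated numerically. The sketch below assumes the standard diploid form in which the within-population genetic variance enters the denominator with a factor of 2 (following Wright, 1943 and Lande, 1992); the function name and input values are hypothetical.

```python
def q_statistics(va, vb, vw, h2):
    """Q-statistics from between-ecotype (va) and among-population (vb) components,
    the observed within-population variance (vw) and heritability (h2)."""
    vc = h2 * vw  # within-population genetic variance, Vc = h2 * Vw
    denom = va + vb + 2.0 * vc  # factor of 2 is the assumed standard diploid form
    return {
        "QST": (va + vb) / denom,
        "QCT": va / denom,
        "QSC": vb / (vb + 2.0 * vc),
    }

# Hypothetical variance components (toy values, not from this study).
q_mid = q_statistics(va=0.3, vb=0.1, vw=1.0, h2=0.5)
q_high = q_statistics(va=0.3, vb=0.1, vw=1.0, h2=0.8)  # same data, larger h2
```

Raising h2 from 0.5 to 0.8 lowers every Q-statistic in this example, which is the sense in which an inflated heritability makes the comparison with F-statistics conservative.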

Identification of avian predators

A sample of 46 live T. elegans was captured over a 3-day period in June 2004 at the Gallatin field site and examined for bird culmen marks on their ventral surfaces. Seven of these snakes (15%) had culmen impressions that could unambiguously be attributed to avian attacks. To identify the avian predators responsible for these impressions, we photographed the culmens of a series of candidate avian predators that have been regularly observed at Gallatin over a 30-year period and compared them with culmen impressions on the snakes. The candidate avian predators were Great Blue Heron (Ardea herodias), Ring-billed Gull (Larus delawarensis), American Robin (Turdus migratorius) and Brewer's Blackbird (Euphagus cyanocephalus).


Inheritance of colouration

Heritability estimates for colouration traits ranged from 0.14 to 0.80 in females and from 0.01 to 0.63 in males (Table 2). Sexual averages ranged from 0.08 to 0.65. Focusing on the sexual averages, the highest heritabilities were for DORHUE (0.65) and BKGRD (0.64) and the lowest were for LATRED (0.08) and LATSAT (0.10). Sample sizes were large enough to bound point estimates of heritability that were above about 0.32 away from zero at the 0.05 level. The alternative scoring schemes for colouration (RGB and HSL) showed comparable averages (0.32 and 0.36, respectively) and ranges for heritability. No pairs of traits showed significant genetic or environmental correlation.

Population differentiation in molecular and quantitative traits

Phenotypic traits showed subtle differences among populations within ecotypes but pronounced differences between ecotypes. Histograms of the PANTONE colour scores (Figs 3 and 4) suggest relative uniformity among meadow populations, as well as a striking difference in average colouration between lakeshore and meadow populations. The modal dorsal stripe colour in meadow populations is bright orange, but in lakeshore populations it tends to be tan. Dorsal and lateral stripe colours are more variable in lakeshore populations and include colours (e.g. brown and pink) that are not present in meadow populations. In all populations the colours of lateral stripes tend to be less bright than dorsal stripes. Similar trends are seen in background colouration. Background colour is darker in meadow populations than in lakeshore populations, with lakeshore populations also showing more variation in background colour (Fig. 5).

Figure 3.

 Frequency histograms for dorsal stripe colours. Colours are, from left to right, Pantone 127, 128, 134, 135, 136, 137, 141, 148, 155 and 162. For Gallatin only, the 127 bin includes rare instances of Pantone 106, 120 and 121. The y-axis represents frequency. Localities at the top are furthest from Eagle Lake, such that the first four histograms correspond to meadow populations, and the last two to lakeshore populations.

Figure 4.

 Frequency histograms for lateral stripe colours. For Gallatin only, the 127 bin includes rare instances of Pantone 113, 120 and 121. Other conventions as in Fig. 3.

Figure 5.

 Frequency histograms for background colour. Grey shades are, left to right, 3, 4, 5, 6 and 7. Other conventions as in Fig. 3.

Turning to scalation, the most striking trend apparent in tabulations of population and ecotypic means (Appendix S1 in Supplementary material, Table 3) is that lakeshore populations show higher average counts for all traits in both males and females. Average divergence between ecotypes for both females and males was 0.5 phenotypic standard deviations (SD) for scalation traits and 0.6 SD for colouration traits (Table 3). VENT and SUB showed the strongest divergence among scalation traits (0.6–1.0 SD) and BKGRD showed the largest divergence of colouration traits (2.2–2.3 SD). Statistical analyses reveal that all of these apparent trends are highly significant, and many traits show significant differentiation among populations within ecotypes (Table 4).

Table 3.   Divergence between lakeshore and meadow ecotypes in phenotypic traits.

Average divergence, scalation traits:      0.5    0.5
Average divergence, colouration traits:    0.6    0.6

Divergence is expressed in units of average within-population phenotypic standard deviation (SD).
N, sample size.

Table 4.   Hierarchical analysis of population structure for microsatellite loci, scalation traits and colouration traits.

Microsatellites   FSC            P                      FST           P                      FCT           P
1                 0.02           0.005                  0.02 (0.01)   0.003                  −0.00         0.46
2                                                            (0.02)   < 0.0001               0.05          0.07
3                 0.01           0.002                  0.04 (0.01)   < 0.0001               0.03          0.07
4                 −0.01          0.95                   0.01 (0.01)
5                                                            (0.01)   0.14                   −0.00         0.46
6                 0.03           0.008                  0.06 (0.03)   < 0.0001               0.04          0.20
7                                                            (0.02)   < 0.0001               0.04          0.06
8                                                            (0.02)   < 0.0001               0.04          0.07
9                 0.02           0.006                  0.03 (0.03)   0.002                  0.01          0.26
Average           0.01 (0.004)                          0.04 (0.01)                          0.02 (0.01)

Scalation         QSC            P                      QST           P                      QCT           P
VENT              0.08           < 0.0001, < 0.0001     0.31          < 0.0001, < 0.0001     0.24          0.003, 0.13
SUB               0.16           < 0.0001, < 0.0001     0.35          < 0.0001, < 0.0001     0.23          0.04, 0.13
MID               0.06           0.44, < 0.0001         0.21          < 0.0001, < 0.0001     0.15          0.001, 0.12
ILAB              0.03           0.004, 0.62            0.44          < 0.0001, < 0.0001     0.42          0.01, 0.001
SLAB              −0.02          0.47, 0.78             0.09          0.02, 0.005            0.11          0.07, 0.01
POST              0.31           < 0.0001, < 0.0001     0.28          0.01, 0.01             −0.05         0.37, 0.52
Average           0.10 (0.05)                           0.28 (0.05)                          0.18 (0.07)

Colouration       QSC            P                      QST           P                      QCT           P
DORRED            0.04           0.004, 0.008           0.13          < 0.0001, 0.0002       0.09          0.03, > 0.05
DORGRN            0.05           0.03, 0.0003           0.16          < 0.0001, < 0.0001     0.12          0.02, > 0.05
DORBLU            0.03           > 0.05, 0.05           0.39          < 0.0001, < 0.0001     0.37          0.003, 0.006
LATRED            0.02           > 0.05, > 0.05         0.39          < 0.0001, 0.001        0.36          0.002, > 0.05
LATGRN            0.12           < 0.0001, < 0.0001     0.15          0.02, < 0.0001         0.04          > 0.05, > 0.05
LATBLU            0.02           > 0.05, > 0.05         0.13          < 0.0001, 0.004        0.10          0.03, > 0.05
DORHUE            0.02           > 0.05, 0.007          0.02          0.03, > 0.05           0.01          > 0.05, > 0.05
DORSAT            0.04           0.01, 0.04             0.23          < 0.0001, < 0.0001     0.20          0.01, 0.03
DORLT             0.03           > 0.05, 0.02           0.26          < 0.0001, < 0.0001     0.24          0.01, 0.02
LATHUE            0.10           < 0.0001, < 0.0001     0.21          < 0.0001, < 0.0001     0.13          > 0.05, > 0.05
LATSAT            0.02           > 0.05, > 0.05         0.34          < 0.0001, 0.0004       0.32          0.002, > 0.05
LATLT             0.01           > 0.05, >                            , > 0.05               0.05          > 0.05, > 0.05
BKGRD             0.04           0.002, 0.0003          0.71          < 0.0001, < 0.0001     0.70          0.0002, 0.0006
Average           0.04 (0.01)                           0.25 (0.05)                          0.21 (0.05)

QSC, QST and QCT columns show the average of male and female values; P columns for scalation and colouration traits show significance levels for separate analyses of females and males, respectively. SE values are given in parentheses.

In contrast to results for phenotypic traits, the amova for microsatellite data (sexes combined) revealed that the overwhelming proportion of variance was within populations. Thus, the averages across nine loci were 96% (± 0.6% SE) within populations, 1% (± 0.3% SE) among populations within ecotypes and 3% (± 0.6% SE) between ecotypes. The average percentages for males and females across all scalation traits are, respectively, 58% (± 7% SE) within populations, 15% (± 8% SE) among populations within ecotypes and 28% (± 10% SE) among ecotypes. The average percentages for males and females for colouration traits are, respectively, 64% (± 6% SE) within populations, 6% (± 1.6% SE) among populations within ecotypes and 30% (± 7% SE) among ecotypes. (These summary figures are from anovas whose P-values are given in Table 4 but are otherwise not reported in this article. Within-population variance components were converted to genetic components using heritability estimates.) Thus, for the two sets of phenotypic traits, on the order of 40% of variation resided among populations (vs. 4% for microsatellites), and the majority of among-population variation resided between ecotypes. This discrepancy between microsatellite and phenotypic traits in among-population differentiation suggests that strong diversifying selection has acted on the phenotypic traits. We computed F- and Q-statistics to conduct a more rigorous test of this selection hypothesis.

Fixation statistics revealed major differences between microsatellite and phenotypic traits in population differentiation (Table 4). In the case of the microsatellites, population differentiation (FST) accounted for an average of only 0.04 (± 0.01 SE) of total variation. Although for individual loci, FST ranged from only 0.004 to 0.06, this degree of differentiation was statistically significant (at the α = 0.01 level) for seven out of nine loci. Virtually none of this proportion of among-population differentiation could be attributed to differences between ecotypes. On the average, FCT was only 0.02 (± 0.01 SE), and FCT values were not significant for any of the nine loci. We were able to detect significant variation among populations within ecotypes. FSC averaged only 0.01 (± 0.004 SE), but six out of nine loci showed significant values (at the α = 0.05 level). The large variation in population sample size (16–140) may impact estimates of population subdivision, so we repeated the amova using a dataset with more equalized population sizes (16–30 individuals; described in Manier & Arnold, 2005). These results were virtually identical for FSC and FCT, and similar for FST (0.03 instead of 0.04).

In contrast, average population differentiation was seven times greater (QST/FST) for scalation traits and six times greater for colouration traits. In the case of scalation traits, QST ranged from 0.09 to 0.44 and averaged 0.28 (± 0.05 SE). In the case of colouration traits, QST ranged from 0.02 to 0.71 and averaged 0.25 (± 0.05 SE). These statistics identified the scalation traits ILAB, VENT and SUB, and the colouration traits BKGRD, DORBLU, LATRED and LATSAT as traits that have experienced especially strong diversifying selection. The extreme case was BKGRD, with a value for QST (0.71) nearly 18 times greater than the microsatellite average. Most of the among-population differentiation (QST) in phenotypic traits could be attributed to differences between ecotypes (QCT). In particular, the seven traits just highlighted as having high values for QST also showed large values for QCT that were statistically significant in one or both sexes. It is also apparent that some phenotypic traits have experienced no or only weak diversifying selection. Thus, although QST was 0.28 for POST, this scalation trait showed no significant differentiation between ecotypes. Among colouration traits, DORHUE, LATGRN, LATHUE and LATLT showed statistically nonsignificant values for both QST and QCT. Finally, differentiation among populations within ecotypes (QSC) averaged 0.10 (± 0.05 SE) for scalation traits, and 0.04 (± 0.01 SE) for colouration traits. Although small, these values are, respectively, ten and four times greater than the microsatellite average for FSC.
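The fold differences quoted in this paragraph follow directly from the reported averages. The short sketch below reproduces that arithmetic; the dictionary keys are illustrative labels, and the inputs are the rounded values from the text, so the ratios are approximate.

```python
# Rounded microsatellite averages reported in the text.
F_ST_MEAN = 0.04  # among populations
F_SC_MEAN = 0.01  # among populations within ecotypes

# Rounded phenotypic values reported in the text.
q_st = {"scalation": 0.28, "colouration": 0.25, "BKGRD": 0.71}
q_sc = {"scalation": 0.10, "colouration": 0.04}

# Fold difference between quantitative-trait and marker differentiation.
qst_fold = {trait: value / F_ST_MEAN for trait, value in q_st.items()}
qsc_fold = {trait: value / F_SC_MEAN for trait, value in q_sc.items()}

for trait, fold in qst_fold.items():
    print(f"QST/FST, {trait}: {fold:.2f}")
for trait, fold in qsc_fold.items():
    print(f"QSC/FSC, {trait}: {fold:.2f}")
```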

Mantel tests of pairwise FST and QST matrices showed no evidence of correlated patterns of population differentiation. FST and QST were not significantly correlated for any phenotypic trait. Pearson correlation coefficients varied from −0.237 to 0.302 for males and −0.283 to 0.290 for females.
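For readers unfamiliar with the procedure, a Mantel test can be sketched as a permutation test on the Pearson correlation between the off-diagonal elements of two distance matrices. The code below is a generic, minimal illustration and not the implementation used in this study: the two-sided criterion, the number of permutations, the function names and the toy matrices are all assumptions made for the example.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the above-diagonal elements of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(matrix_a, matrix_b, permutations=999, seed=0):
    """Return (r, two-sided permutation P-value) for two distance matrices."""
    rng = random.Random(seed)
    n = len(matrix_a)
    x = upper_triangle(matrix_a)
    observed = pearson(x, upper_triangle(matrix_b))
    order = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of matrix_b together (relabel populations).
        permuted = [[matrix_b[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if abs(pearson(x, upper_triangle(permuted))) >= abs(observed):
            hits += 1
    return observed, hits / (permutations + 1)


# Toy pairwise matrices for five hypothetical populations (not study data).
fst_toy = [
    [0.000, 0.010, 0.030, 0.040, 0.050],
    [0.010, 0.000, 0.020, 0.035, 0.060],
    [0.030, 0.020, 0.000, 0.015, 0.045],
    [0.040, 0.035, 0.015, 0.000, 0.025],
    [0.050, 0.060, 0.045, 0.025, 0.000],
]
qst_toy = [
    [0.00, 0.30, 0.10, 0.45, 0.20],
    [0.30, 0.00, 0.25, 0.05, 0.40],
    [0.10, 0.25, 0.00, 0.35, 0.15],
    [0.45, 0.05, 0.35, 0.00, 0.50],
    [0.20, 0.40, 0.15, 0.50, 0.00],
]

r, p = mantel(fst_toy, qst_toy)
print(f"Mantel r = {r:.3f}, P = {p:.3f}")
```

Rows and columns are permuted jointly because the pairwise entries of a distance matrix are not independent; shuffling individual cells would overstate the significance of any correlation.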

Putative agents of selection

Birds are probably the selective agents responsible for the observed patterns of selection on stripe colours. The photographs of culmen impressions on seven snakes from the Gallatin population matched the culmens of two avian predators. Six out of the seven snakes had impressions from Great Blue Herons (Ardea herodias), and one had impressions from a Brewer's Blackbird (Euphagus cyanocephalus; Appendix S2). In two instances, birds were directly observed preying on garter snakes near this study site. A Great Blue Heron was observed on the Eagle Lake shoreline, 5 km south-west of Gallatin, attacking and flying off with a large gravid T. elegans (C. Cox, personal communication). Jayne & Bennett (1990) observed an American Robin (Turdus migratorius) capturing and flying off with a juvenile T. sirtalis at a site 15 km from Gallatin. Although these results implicate birds as agents of selection in our study system (especially on colouration), we have not demonstrated that they are actually responsible for the selection we detected. Such a demonstration would require experimental manipulation of avian predators in conjunction with measurement of selection.


The phenomenon of ecotypic variation

Ecotypic variation refers to a repeated spatial pattern of population differentiation that coincides with particular environmental variables (Mayr, 1963). In the botanical literature such spatial coincidence has long been interpreted as evidence for local adaptation to ecological features (Turesson, 1922). With the notable exception of fish systems, such as sticklebacks (e.g. Moodie, 1972; Schluter et al., 2004; Baker et al., 2005), vertebrate biologists have been relatively slow to use this concept. Aside from our work on T. elegans, a few examples of ecotypic variation have been reported in snakes. Fox (1951) reported ecotypic variation in Thamnophis atratus in the Eel River drainage of northern California that parallels the colouration differentiation in our study system.

Colour pattern differences between island and mainland water snakes (Nerodia sipedon) and garter snakes (T. sirtalis) in Lake Erie have been viewed as an equilibrium between selection and migration in which strong selection for crypticity maintains population differentiation in colouration in both species in the presence of considerable gene flow (Camin & Ehrlich, 1958; King, 1993a,b; Lawson & King, 1996; Hendry et al., 2001; King & Lawson, 2001a,b; Bittner & King, 2003; Ray & King, 2006). These examples, together with our results, demonstrate that ecotypic variation and other kinds of local adaptation are relatively common in vertebrates.

Ecotypic variation and local adaptation in T. elegans

The historical importance of selection revealed by the FST-QST analysis reinforces the results of correlational selection analyses on vertebral numbers in these and related garter snakes. A previous study showed that the effect of vertebral numbers on growth rate in the Gallatin population can be portrayed as a bivariate ridge (Arnold, 1988). In other words, both VENT and SUB experience stabilizing selection, whereas the ridge's positive slope reflects correlational selection on VENT and SUB. In a study of the effect of vertebral numbers on locomotory performance in a closely related species (T. radix), Arnold & Bennett (1988) found significant positive correlational selection on VENT and SUB, although coefficients of stabilizing selection on these two traits were not statistically significant. Thus, a goal of future work will be to determine whether selectively driven differentiation, as assayed by our FST-QST analysis, occurs along the selective ridges that were diagnosed in the correlational studies of selection. Although the correlational analysis revealed an intermediate optimum for VENT and SUB in the Gallatin lakeshore population (Arnold, 1988), we have not conducted a parallel study to check for an intermediate optimum for lower counts in meadow populations. Finally, it might be illuminating to compare a correlational analysis of selection on the colouration traits with the FST-QST results.

It should be noted that the FST-QST analysis does not account for correlated responses to selection that arise from genetic correlations among traits (Lande, 1979). The result that QST > FST may suggest diversifying selection on the trait in question or on a genetically correlated trait, producing an illusion of selection on the trait in question. Although genetic correlations among scalation traits (Arnold & Phillips, 1999) and among colouration traits appear to be relatively weak or nonexistent, they may have helped to produce correlated responses to selection. We return to the ambiguity induced by genetic correlations in FST-QST analysis in our concluding remarks.

The population differentiation that we assessed with Q-statistics is not likely to represent environmental effects or sampling artefacts. Laboratory rearing of individuals from multiple populations of both ecotypes in the course of conducting a cross between inland and coastal populations of T. elegans (Arnold, 1988) showed that individual, population and ecotypic differences in scalation and colouration were congenital and ontogenetically stable. Furthermore, F1 progenies in this cross showed maternal effects for birth weight but not for scalation or colouration. Temperature effects during gestation in viviparous snakes were once thought to affect vertebral numbers and the other kinds of scalation traits used in our study (Fox, 1948). More extensive experimentation, however, has shown that these traits are extremely well buffered against temperature effects (Arnold & Peterson, 2002). Thus, the population differentiation that we observed probably does not represent direct environmental effects. Likewise, bias and sampling errors in heritability estimates probably had little effect on our results and conclusions. In principle, the actual value of heritability can fluctuate through time or vary from population to population (Falconer & Mackay, 1996). Analytical models and simulation studies show, however, that heritability and other polygenic genetic parameters can achieve stable equilibria, especially in large populations with persistent selection regimes (Lande, 1976; Jones et al., 2003, 2004). Although the persistence of selection has not been resolved in our study system, the effective sizes of our populations (a few to several hundred individuals; Manier & Arnold, 2005) would help to dampen fluctuations. Furthermore, comparative studies indicate that the heritabilities of scalation traits are relatively stable in garter snakes (Dohm & Garland, 1993; Arnold & Phillips, 1999).

With respect to sampling issues, our use of full-sib estimates of heritability could have led to an upward bias. However, because of the way that heritability enters the calculation of Q-statistics, this potential bias would have produced conservative estimates of population differentiation. Furthermore, Q-statistics are relatively insensitive to the value of h2 and hence to errors in its estimation. For example, in our data, a value of 0.48 for heritability of VENT in females, in conjunction with estimates of Va, Vb and Vc, produced an estimate of 0.39 for QST. Suppose this value of heritability was in error by 100%, so that the true value was 0.24; the corresponding estimate for QST would then have been 0.56. In other words, even an extreme error in heritability would have produced only a 30% error in QST. Such an error is small when we consider that in our study, QST was seven times greater than FST for scalation traits and six times greater for colouration traits.
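The sensitivity example for VENT can be reproduced with a few lines of arithmetic. The sketch below assumes the commonly used form QST = V_among / (V_among + 2 h2 V_within), in which V_among pools the between-ecotype and among-population components and V_within is the phenotypic variance within populations; this form, and all function and variable names, are assumptions made for illustration. The variance components themselves are not given in the text, so the ratio V_among/V_within is back-calculated from the reported pair (h2 = 0.48, QST = 0.39).

```python
def qst(v_among, v_within_phenotypic, h2):
    """QST under the assumed form V_among / (V_among + 2 * h2 * V_within)."""
    v_additive_within = h2 * v_within_phenotypic
    return v_among / (v_among + 2.0 * v_additive_within)


H2_USED = 0.48        # heritability of VENT in females (from the text)
QST_REPORTED = 0.39   # corresponding QST (from the text)

# Scale the within-population phenotypic variance to 1 and back-solve
# the among-population variance implied by the reported values.
v_within = 1.0
v_among = QST_REPORTED / (1.0 - QST_REPORTED) * 2.0 * H2_USED * v_within

# Recompute QST as if the true heritability were half the estimate.
qst_halved = qst(v_among, v_within, H2_USED / 2.0)

# Shortfall of the reported QST relative to this alternative value.
relative_error = (qst_halved - QST_REPORTED) / qst_halved

print(f"QST with h2 = 0.24: {qst_halved:.2f}")
print(f"relative error in QST: {relative_error:.0%}")
```

Because h2 appears only in the denominator, an overestimate of heritability lowers QST, which is why an upward bias in full-sib estimates makes the comparison with FST conservative.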

Differentiation in colouration and scalation traits revealed in the present analyses coincides with ecotypic differences in life history (Bronikowski & Arnold, 1999; Sparkman et al., 2007). Thus, the syndrome of differentiation between ecotypes includes scalation, vertebral numbers, numerous aspects of colouration, as well as growth rate and body size–fecundity relationships. Remarkably, this extensive differentiation occurs over a distance of a few kilometres, between populations connected by moderate to high gene flow (Manier & Arnold, 2005).

As expected, vertebral number showed evidence of diversifying selection between ecotypes, with lakeshore snakes having more vertebrae than meadow snakes. This trend corresponds with a difference in push-point density at these sites (lower on the lakeshore, higher in meadow sites). A similar association was observed in a comparison of coastal and inland populations of T. elegans, but because juvenile snakes with more vertebrae crawl faster at all push-point densities, the biomechanical basis for these associations remains unclear (Kelley et al., 1997; Arnold & Phillips, 1999).

The direction of differentiation in other scale counts is consistent with the hypothesis that selection favours the ability to ingest large prey in lakeshore habitats. Studies of stomach contents reveal that lakeshore populations feed on fish and leeches, whereas the meadow populations feed primarily on anuran larvae and leeches (Kephart & Arnold, 1982; Kephart, 1982). The lakeshore diet includes occasional ingestion of large fish, but feeding performance trials are needed to determine whether the observed differences in scalation do indeed enhance the ability of lakeshore snakes to eat large prey.

The FST-QST and FCT-QCT contrasts suggest that diversifying selection is responsible for population differentiation in scalation and colouration characters. For many of the scalation traits and several of the colouration traits, population differentiation is five- to ten-fold more pronounced than would be expected under a drift-migration balance. Furthermore, most of the population differentiation in phenotypic traits coincides with ecotypic differences. The most likely explanation for these results is local adaptation to lakeshore and meadow habitats in scalation and colouration. Thus, T. elegans in the Eagle Lake basin have adapted to lakeshore habitats by evolving more body and tail vertebrae, more scale rows at midbody, more infralabial and supralabial scales, lighter background colour, and bluer dorsal stripes. Concomitantly, adaptation in all these traits has occurred in the opposite direction in meadow habitats.

The FSC-QSC contrasts suggest that diversifying selection is also responsible for population differentiation within each of the two ecotypes. This level of differentiation in scalation and colouration traits is, on average, ten- and four-fold, respectively, more pronounced than what we would expect by drift. Most of the scalation and colouration traits show highly significant QSC-values, in contrast to FSC- or to QCT-values. Although it should be noted that we have considerably more statistical power for QSC than for QST or QCT, it nevertheless appears that subtle local adaptation within lakeshore and meadow habitats has involved numerous aspects of scalation and colouration. Future studies might determine whether small-scale differences among meadow sites in vegetation and other habitat characteristics contribute to selection on colouration. Our results also suggest that adaptation to local conditions may occur on a much smaller spatial scale than is typically assayed in snakes and other terrestrial vertebrates (but see Hoekstra et al., 2004).

Local adaptation in our study system, revealed by QSC, QST and QCT, is especially remarkable given the close proximity of populations (1.3–19.9 km apart) and moderate levels of gene flow among them. The Gallatin shoreline and Papoose Meadows populations are a case in point. An earlier study established that substantial gene flow occurs among 20 populations in our study system (FST = 0.024; average Nem = 0.4), primarily unidirectionally from the major source population, Papoose Meadows (average Nem from Papoose = 1.4; Manier & Arnold, 2005). Gallatin and Papoose are only 4 km apart and are connected by an intermittent stream, Papoose Creek, which is a known dispersal corridor for T. elegans in wet years (S.J. Arnold, unpublished data). Thus, Gallatin and other lakeshore sites have been able to differentiate in the presence of persistent migration from Papoose Meadows, and both lakeshore and meadow populations have maintained their ecotypic identities on a small spatial scale.

Endler (1990) has persuasively argued that spectrographic measurement of colour is preferable to matching with colour standards. Because spectrographic measurements are time-consuming, we opted for colour matching to maximize sample sizes. We circumvented some of Endler's objections to subjective matching by using one person to score colours under a uniform condition of lighting that coincided with the most likely time of heavy predation (mid-morning). Nevertheless, the human visual system undoubtedly differs from that of birds (Chen & Goldsmith, 1986; Jane & Bowmaker, 1988), although the differences may not be large (Ali & Klyne, 1985; Chen & Goldsmith, 1986). Although stripe colours clearly experience selection in our study system, the details of the selection results might be different if a bird-based rather than a human-based scoring scheme had been employed. For this reason, our selection results for colouration should be viewed with caution. Nevertheless, the RGB and the HSL scoring systems gave comparable results, even though the RGB scheme probably better approximates avian colour vision. This consistency between the two coding schemes suggests that the overall picture of selection may be robust to scoring method.

Concluding remarks

As in many other studies (Merilä & Crnokrak, 2001; McKay & Latta, 2002), QST greatly exceeds FST in our study system. This result implies that diversifying selection acting on phenotypic traits has produced departures from neutral expectations. Although this interpretation is correct per se, it is subject to some qualifications. Differences among traits in QST can reflect differences in both inheritance and selection. For example, a trait may show QST > FST, not because it has been a target of selection, but because it has responded to selection that has acted on one or more genetically correlated traits. Although a truly multivariate FST-QST comparison has not yet been devised, the direct role of selection in differentiation can be diagnosed with multivariate retrospective analyses that use the G-matrix for a set of traits (Lande, 1979; Jones et al., 2004).

Studies of selection within a generation do not suffer from QST’s problem of confounding selection with response to selection. Consequently, multivariate analyses of ongoing selection (Lande & Arnold, 1983; Schluter & Nychka, 1994) are a better vehicle than FST-QST analysis for identifying the actual targets of phenotypic selection. Statistical power is, however, a serious concern in these approaches, especially when stabilizing selection is weak and the phenotypic mean is close to an intermediate optimum (Hersch & Phillips, 2004). The analyses of selection discussed here (Arnold & Bennett, 1988; Arnold, 1988) undoubtedly suffered from this problem.

FST-QST comparisons are limited by their inability to identify the actual agents of selection. Even a few observations of selective events, as in the present study, can help remedy this situation and so direct and illuminate the interpretation of selection analyses. Thus, the overall message of our study is that multiple levels of inquiry into the nature and consequences of selection can yield a synthetic overview of the process that a standard FST-QST comparison alone cannot provide.


We are grateful to many Eagle Lake field workers, especially L.D. Houck. We thank A.M. Bronikowski for discussions and logistical support. M. Blouin, A.R. Kiester, R. Mauricio, P. McEvoy, A. Hendry and anonymous reviewers provided helpful comments on the manuscript. We especially thank S.R. Estes and M.E. Pfrender for encouraging us in this research. We were supported by an EPA STAR Fellowship (U-91552801-5), an NSF GK-12 Teaching Fellowship and an NSF DDIG (DEB-0309017) to MKM; as well as an NSF grant (DEB-9903934) to SJA and M.E. Pfrender, and an NSF grant (DEB-0323379) to SJA and A. M. Bronikowski. CMS was supported by an REU-supplement to DEB-0323379. The specimens and tissue samples used in this study were collected under the auspices of Scientific Collecting Permits granted by the California State Department of Fish and Wildlife, using protocols approved by the IACUC committee at Oregon State University.