Allen’s rule revisited: quantitative genetics of extremity length in the common frog along a latitudinal gradient


Jussi S. Alho, Ecological Genetics Research Unit, Department of Biosciences, PO Box 65, FI-00014 University of Helsinki, Finland.
Tel.: +358 (0)9 19157710; fax: +358 (0)9 19157694; e-mail:


Ecogeographical rules linking climate to morphology have gained renewed interest because of climate change. Yet few studies have evaluated to what extent geographical trends ascribed to these rules have a genetic, rather than environmentally determined, basis. This applies especially to Allen’s rule, which states that the relative extremity length decreases with increasing latitude. We studied leg length in the common frog (Rana temporaria) along a 1500 km latitudinal gradient utilizing wild and common garden data. In the wild, the body size–corrected femur and tibia lengths did not conform to Allen’s rule but peaked at mid-latitudes. However, the ratio of femur to tibia length increased in the north, and the common garden data revealed a genetic cline consistent with Allen’s rule in some trait and treatment combinations. While selection may have shortened the leg length in the north, the genetic trend seems to be partially masked by environmental effects.


Two related ecogeographical rules – Bergmann’s and Allen’s rules – have traditionally and frequently been interpreted as having an adaptive basis in the thermoregulation of endothermic animals (e.g. Mayr, 1963; James, 1970). Bergmann’s rule (1847) (for partial translation see James, 1970) states that animals from colder climates tend to be larger than their counterparts in warmer conditions. Allen’s (1877) rule, on the other hand, posits that animals from colder climates have relatively shorter protruding parts (limbs, tails, wings, ears) than their counterparts in warmer conditions. Both rules have their foundations in the surface area to volume ratio: the greater the ratio, the greater the loss of heat through the surface. The exact definition, underlying mechanisms and even the existence of these rules have been a source of debate and considerable confusion for over a century (e.g. Rensch, 1938; Mayr, 1956, 1963; James, 1970; Blackburn et al., 1999). Nevertheless, Allen’s rule has often been overshadowed by Bergmann’s rule in these debates.

Although both Bergmann’s and Allen’s rules were originally formulated for endotherms, later studies have found that the body size of both vertebrate and invertebrate ectotherms also sometimes follows the predictions of Bergmann’s rule (e.g. Cushman et al., 1993; Ashton, 2002; Ashton & Feldman, 2003). The validity of Allen’s rule in ectotherms has been suggested (e.g. Ray, 1960) but rarely tested in the wild (but see e.g. Bidau & Marti, 2008; Langkilde, 2009). In ectotherms, the theoretical expectations under the thermoregulatory adaptation hypothesis are less obvious than in endotherms. Ectotherms increase their body temperature mainly by exposure to external sources of heat and conserve thermal energy by small relative body surface area – which also necessarily limits heat absorption. However, this could be an advantage in thermally heterogeneous environments, as there may be a need to avoid overheating in hot microhabitats and conserve thermal energy in cold ones (see e.g. Angilletta, 2009).

Both Bergmann’s and Allen’s rules have recently experienced something of a revival in the context of climate change (see e.g. Millien et al., 2006; Yom-Tov et al., 2006; Teplitsky et al., 2008; Gardner et al., 2009; Salewski et al., 2010). It has been argued that global warming would select for decreased body mass and increased extremity length, and such morphological trends in nature have been interpreted as adaptive microevolutionary changes (Smith et al., 1995, 1998; Yom-Tov, 2001; Millien, 2004; Yom-Tov et al., 2006; Salewski et al., 2010). However, the observed phenotypic changes can represent either genetic or plastic responses, and consequently, phenotypic trends alone cannot be used to infer genetic changes (Gienapp et al., 2008). Although some evolutionary genetic research on Bergmann’s rule has been carried out (e.g. Mousseau, 1997; Blanckenhorn & Demont, 2004; Laugen et al., 2005a; Teplitsky et al., 2008), similar studies on Allen’s rule are lacking. In general, the heritability and the degree of genotype–environment interactions in relative extremity length remain largely unaddressed (Serrat et al., 2008; but for estimates of leg length heritability, see e.g. Milner et al., 2000; Gómez et al., 2009). Further, to our knowledge, not a single study has explicitly attempted to separate the environmental and genetic components of relative extremity length in the context of Allen’s rule.

In this study, we investigated the existence of a latitudinal cline in the relative extremity length, as well as its genetic basis, in a widespread amphibian, the common frog (Rana temporaria). We analysed data both from the wild and from a large common garden experiment. Our main aims were (i) to test the validity of Allen’s rule along an approximately 1500-km latitudinal gradient, (ii) to test for possible genotype–environment interactions in the expression of relative femur and tibia length as well as the ratio of femur to tibia length using a common garden experiment and (iii) to estimate the degree of quantitative trait divergence and compare it to divergence in neutral marker loci (QST–FST comparison; see e.g. Merilä & Crnokrak, 2001; Leinonen et al., 2008) to test whether the population differentiation has been driven by directional selection, rather than by random genetic drift.

Materials and methods

Study species

The common frog is the most widely distributed anuran in Europe: it can be encountered from the Spanish Pyrenees to Russia, from sea level to altitudes above 2000 m (Gasc et al., 1997). It is also the only amphibian species in Europe with a distribution range extending to the North Cape and the Barents Sea (Gasc et al., 1997). It is terrestrial as an adult, and breeds in a variety of freshwater habitats from ditches and temporary ponds to marshes along the shores of large lakes. The common frog has been studied extensively and displays large variation in morphological and life-history traits both as larvae and as adults (e.g. Miaud et al., 1999; Laugen et al., 2002, 2003a, 2005a,b; Vences et al., 2002; Hettyey et al., 2005; Jönsson et al., 2009). Previous studies have also shown that many traits have diverged along a latitudinal gradient across the Scandinavian peninsula – sometimes more than would be expected by random genetic drift alone – suggesting local adaptation (Laugen et al., 2003b; Palo et al., 2003). The extensive geographical variation, together with a temperature gradient across Scandinavia, makes the common frogs of the peninsula an interesting model for the study of Allen’s rule.

Data from the wild

Data on wild adults were gathered during the breeding seasons of 1998–1999 from 12 localities (Table 1) as part of other studies (e.g. Laugen et al., 2002, 2003a,b; Hettyey et al., 2005; Jönsson et al., 2009). A total of 109 adult female and 113 male common frogs were collected during the early breeding season, right after emergence from hibernation (April–June, depending on latitude). Live frogs were transported to a laboratory in Uppsala where they were anesthetized and killed with an overdose of MS-222 (tricaine methanesulfonate). Each individual was sexed on the basis of gonadal inspection. Snout–vent length was measured with dial callipers to the nearest millimetre. Frog carcasses were then maintained frozen at −20 °C until measured for femur and tibia lengths.

Table 1.   Populations used in the analyses with sample sizes for data from the wild and the common garden experiment.

| Population | Latitude | Longitude | Wild: N females | Wild: N males | Common garden: N full-sib families | Common garden: N offspring | Common garden: design |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Börringe, Svartesjöhus | 55°30′N | 13°25′E | 6 | 13 | | | |
| Revinge, Tvedöra | 55°42′N | 13°26′E | 10 | 9 | 32 | 581 | Half-sib |
| Blekinge, Hemsjö | 56°19′N | 14°42′E | 10 | 10 | | | |
| Karlstad, Lindrågen | 59°28′N | 13°31′E | 11 | 8 | | | |
| Järlåsa, Häggedal | 59°51′N | 17°14′E | 9 | 9 | 22 | 224 | Half-sib |
| Tärnsjö, Gullsmyra | 60°07′N | 16°56′E | 9 | 9 | | | |
| Umeå, Grytan | 63°49′N | 20°14′E | 7 | 9 | 58* | 912* | Half-sib |
| Kiruna, Esrange | 67°51′N | 21°02′E | 10 | 10 | 32 | 756 | Half-sib |
| Kilpisjärvi, Malla | 69°03′N | 20°47′E | 7 | 6 | 32 | 821 | Half-sib |
| Total | | | 109 | 113 | 184 | 3611 | |

*The common garden data from Umeå involved two separate fertilization groups, one with 27 full-sib families and 325 offspring and another with 31 full-sib families and 587 offspring.

All measurements of femur and tibia lengths were taken by the same person (GH). Carcasses were thawed and the bones were semi-dissected out so that both ends were clear. Measurements were taken with digital callipers and recorded to the nearest 0.01 mm. Each measurement was taken twice for both legs. The measurements from the right leg were used throughout this study, and repeatabilities were calculated from repeated measures as described in the following sections. In the analyses, we used three measures: femur and tibia length and their ratio. The use of femur/tibia ratio provides information about possible latitudinal differences in relative selection pressures on tibia and femur lengths.

Common garden data

Common garden data on metamorphosed juveniles from earlier studies (Laugen et al., 2003b, 2005a; Palo et al., 2003) were analysed to separate the contributions of additive, dominance, maternal and environmental variations to variation in body size–corrected leg length in six populations (Table 1) across the latitudinal gradient. The same three leg traits – femur and tibia lengths and the ratio of femur to tibia length – were used as in the case of the wild adults. We used the data to estimate heritability based on the additive genetic component and to test for possible latitudinal divergence in the traits. As described in the following paragraphs, there were three temperature and two food treatments during the larval stage, allowing us to test for genotype–environment interactions and thus for possible latitudinal genetic divergence in phenotypic plasticity of the traits. Details of the common garden experiments have been published previously (Laugen et al., 2003b, 2005a; Palo et al., 2003), but the pertinent parts will be briefly restated and leg length measurements described in the following sections.

Tadpoles for the common garden experiments were produced in artificial laboratory crosses of adult frogs caught at spawning sites in the beginning of the breeding season. A North Carolina type II breeding design (Lynch & Walsh, 1998) was used, except for the Ammarnäs population, for which eight freshly-laid spawn clutches were collected from the wild. Except for the Umeå and Ammarnäs populations, 16 maternal half-sib families (i.e. 32 full-sib families) were created where eggs from each of eight females were fertilized by sperm from four of the 16 males. The Umeå tadpoles came from a similar design, but for this population two sets of 16 maternal half-sib families were created on different fertilization dates. Because of the large difference in the onset of spawning along the latitudinal gradient (Merilä et al., 2000), the starting dates among the other populations also differed. The fertilizations for the southernmost population (Lund) were performed on 9 April 1998, whereas the corresponding date for the northernmost population (Kilpisjärvi) was 4 June 1998. However, the rearing conditions were the same for all populations. The crosses were carried out following the principles outlined in Laugen et al. (2002). The eggs were divided into three different temperature treatments (14, 18 and 22 ± 1 °C, two bowls per cross in each temperature) at which they were kept until Gosner stage 25 (Gosner, 1960). Water was changed every third day during embryonic development. When most of the embryos in a given temperature treatment had reached Gosner stage 25, eight randomly chosen tadpoles from each cross were placed individually in 0.9-L opaque plastic containers at each of two food levels (restricted and ad libitum). This procedure was repeated for each population in the three temperature treatments, resulting in 48 experimental tadpoles per cross. However, because of mortality during the experiment, the final number of tadpoles per family was typically fewer than 48 (Table 1).
Every seventh day, the tadpoles were fed a finely ground 1 : 3 mixture of fish flakes (TetraMin; Ulrich Baensch GmbH, Melle, Germany) and rodent pellets (AB Joh. Hansson, Uppsala, Sweden). The amount of food given to each tadpole was 15 mg (restricted) and 45 mg (ad libitum) for the first week, 30 and 90 mg for the second week and 60 and 180 mg per week thereafter respectively until metamorphosis. The ad libitum level was selected to be such that the individuals did not consume all the food before the next feeding event at any of the temperature treatments. In the restricted food treatment, the tadpoles at the two highest temperature treatments consumed all of their food resources before the next feeding, indicating food limitation, but in the low temperature treatment, the tadpoles frequently had food left even after 7 days of feeding. The tadpoles were raised in dechlorinated tap water that was aerated and aged for at least 24 h before use. The water was changed every seventh day in conjunction with feeding. The light rhythm was 16L : 8D. As the rearing of the tadpoles continued from mid-April to late August, temperatures were measured in the laboratories at fixed locations twice a day throughout the experiment to check that the water temperature did not change over time. There was no temperature change over time in any of the laboratories (see Laugen et al., 2005a).

When metamorphosis in a given population was anticipated to start, the vials were checked once a day. Metamorphosed frogs (Gosner stage 42) were weighed, and age at metamorphosis (in days since Gosner stage 25) was recorded. Water level in the vials was reduced, and the metamorphs were allowed to absorb their tails before being anesthetized and killed with an overdose of MS-222. After this, they were frozen at −20 °C. Leg measurements were later taken from the thawed metamorphs by measuring their right tibia and femur length under a stereomicroscope with the aid of digital callipers. Snout–vent length was measured similarly. All the measurements were taken by one person – blind with respect to the identity of the samples – and recorded to the nearest 0.1 mm.

Statistical analysis

We calculated the repeatability of femur and tibia lengths and the ratio of femur to tibia length both for the adults caught from the wild and for the juveniles reared in the common garden following Lessells & Boag (1987). In addition, repeatability was calculated for the snout–vent length of the juveniles. In short, one-way analysis of variance using the functions lm and anova in R (R Development Core Team, 2009) was used and the repeatability for each trait was derived as:

$$r = \frac{s_A^2}{s^2 + s_A^2}$$

where $s^2$ was the within-individual mean squares and $s_A^2$ was calculated from

$$s_A^2 = \frac{MS_A - MS_W}{n}$$

where $MS_A$ was the among-individual mean squares, $MS_W$ the within-individual mean squares and $n$ the number of measurements per individual, i.e. two. Ninety-five percent confidence intervals for the repeatability estimates were obtained by nonparametric bootstrap, resampling the data 5000 times.
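The repeatability calculation can be illustrated with a short sketch. It is written in Python for illustration only (the study used lm and anova in R), it assumes the Lessells & Boag (1987) estimator with an equal number of measurements per individual, and the function names are hypothetical. Resampling whole individuals in the bootstrap is also an assumption, as the resampling unit is not stated in the text.

```python
import random


def repeatability(measurements):
    """Repeatability from a one-way ANOVA with individual as the factor.

    measurements: list of per-individual lists, each holding n repeated values.
    """
    k = len(measurements)            # number of individuals
    n = len(measurements[0])         # measurements per individual
    means = [sum(m) / n for m in measurements]
    grand = sum(means) / k
    ss_among = n * sum((mu - grand) ** 2 for mu in means)
    ss_within = sum((x - mu) ** 2
                    for m, mu in zip(measurements, means) for x in m)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    s2_among = (ms_among - ms_within) / n
    return s2_among / (ms_within + s2_among)


def bootstrap_ci(measurements, n_boot=5000, seed=1):
    """Percentile 95% interval, resampling individuals with replacement
    (the resampling unit is an assumption, not taken from the paper)."""
    rng = random.Random(seed)
    k = len(measurements)
    stats = sorted(
        repeatability([measurements[rng.randrange(k)] for _ in range(k)])
        for _ in range(n_boot)
    )
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Made-up femur lengths (mm), two measurements per individual.
example = [[10.0, 10.2], [12.0, 11.9], [14.1, 14.0], [16.0, 16.3], [18.2, 18.0]]
r = repeatability(example)
```

With measurements that agree closely within individuals relative to the spread among individuals, as in the made-up data above, the estimate approaches one.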

The repeatabilities for all traits were generally high. For wild-collected adults, they were 0.98 [95% confidence interval (CI): 0.97–1.00] for femur length, 1.00 (95% CI: 0.99–1.00) for tibia length and 0.80 (95% CI: 0.76–0.98) for the ratio of femur to tibia length. For juveniles reared in the common garden, the repeatabilities were 0.98 (95% CI: 0.97–0.99) for snout–vent length, 0.92 (95% CI: 0.85–0.96) for femur, 0.98 (95% CI: 0.96–0.99) for tibia and 0.61 (95% CI: 0.25–0.79) for the ratio of femur to tibia length.

We used a linear mixed model to investigate variation in femur length, tibia length and their ratio in wild-collected adults. We incorporated snout–vent length as a covariate to correct for the variation in age and body size, and latitude and its square as other covariates to test for a latitudinal effect on the relative extremity length. Sex was included as a fixed effect and population as a random effect to correct for the varying sample size and the nonindependence of the data. Finally, we included the interactions between sex and latitude, sex and the square of latitude, and sex and snout–vent length in the models. The model fitting was performed in R with the lmer function of the lme4 extension package (available through CRAN). P-values are not available for the fixed effects of linear mixed models in R because they involve unresolved statistical issues, and we hence obtained 95% highest posterior density intervals (HPDI) for parameter estimates by Markov chain Monte Carlo (MCMC) methods using functions mcmcsamp and HPDinterval of the lme4 package.

A Bayesian univariate hierarchical model (Gelman et al., 2004) was constructed for femur and tibia lengths and for the ratio of femur to tibia length measured in the common garden environment. For comparison, a similar model was fitted to the snout–vent length data. The model allowed us to estimate simultaneously among- and within-population genetic variances and heritabilities of the traits accounting for the different quantitative genetic variance components, and the degree of population differentiation as measured by QST (see e.g. Merilä & Crnokrak, 2001; Leinonen et al., 2008). In addition, it allowed the estimation of the correlation of population effects with latitude and the correlation of pairwise QST values with pairwise FST estimates (divergence in neutral molecular marker loci; see e.g. Merilä & Crnokrak, 2001; Leinonen et al., 2008) and physical distances separating the populations. We obtained estimates for the parameters of interest from the joint posterior distribution by MCMC simulation (Gelman et al., 2004) using OpenBUGS version 3.0.3 (Lunn et al., 2009). The estimates were summarized as posterior means and 95% credible intervals (CI), i.e. Bayesian confidence intervals. For each trait, we ran three chains, 150 000 iterations each, and thinned them by five. The first 10 000 of the 30 000 thinned iterations were discarded as burn-in. The convergence and mixing of the chains was checked visually.
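The chain bookkeeping in the preceding paragraph can be sketched as follows. This is a minimal Python illustration, not the OpenBUGS setup used in the study: the function names are hypothetical, keeping every fifth draw starting from the first is an assumption about how thinning was applied, and the equal-tailed interval is one common way to summarize a credible interval.

```python
def thin_and_burn(chain, thin=5, burn_in=10_000):
    """Keep every `thin`-th draw, then drop the first `burn_in` thinned draws."""
    thinned = chain[::thin]
    return thinned[burn_in:]


def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval from pooled draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * n)]
    upper = ordered[int((1.0 - tail) * n) - 1]
    return sum(ordered) / n, lower, upper


# Three chains of 150 000 iterations each; the values are placeholders.
chains = [list(range(150_000)) for _ in range(3)]
kept = [thin_and_burn(chain) for chain in chains]
pooled = [draw for chain in kept for draw in chain]
# 30 000 thinned draws minus 10 000 burn-in leaves 20 000 per chain,
# i.e. 60 000 pooled draws per trait.
```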

The Bayesian model had similarities to the one used by Palo et al. (2003). The model was a linear mixed effects model with treatment combination (temperature, food availability)-specific means. The means were given vague normal priors N(μ, σ²), where μ was the observed trait mean and σ² the variance (0.1 for the ratio of femur to tibia length, 10 for other traits). Although laboratory blocks were shared between different populations and in principle the environmental conditions remained always physically the same, we included block as a population-specific fixed effect as there was no complete temporal overlap between populations because of different fertilization times. In the case of the Umeå population, we used separate block effects for the two different fertilization dates. The effect of the first block within each population and fertilization group was fixed to zero, and other block effects were given vague normal priors N(0, 100), where the second parameter is variance. Snout–vent length (mm) and age were included as linear covariates with mean-centred values so that the population and treatment combination–specific means were defined for individuals of average length (15.7 mm) and age (48.9 days). In the analysis of snout–vent length, it was itself obviously omitted from the explanatory part of the model. The regression coefficients of the snout–vent length and age were given vague normal priors N(0, 10) and N(0, 0.1), respectively, except in the case of the femur to tibia length ratio, where the regression coefficient of the snout–vent length was also given the normal prior N(0, 0.1). Population, dam, sire and family were included as population and treatment combination–specific random effects. The variances of dam, sire and family effects and the residual variance were modelled in terms of the underlying variance components (Lynch & Walsh, 1998):

$$V_{\mathrm{sire}} = \tfrac{1}{4}V_A, \qquad V_{\mathrm{dam}} = \tfrac{1}{4}V_A + V_M, \qquad V_{\mathrm{family}} = \tfrac{1}{4}V_D, \qquad V_{\mathrm{residual}} = \tfrac{1}{2}V_A + \tfrac{3}{4}V_D + V_\varepsilon$$

where VA is the within-population additive genetic, VM the maternal, VD the dominance and Vε the microenvironmental variance. The variance of population effects corresponded to the among-population additive genetic variance Vpopulation, derived from QST as described below. All variance components were defined to be population and treatment combination specific but subscripts indicating this have been omitted from the notation for simplicity.
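The mapping between the random-effect variances and the underlying components can be made concrete with a small sketch. It assumes the standard half-sib expectations for a North Carolina II design without epistasis (sire = VA/4, dam = VA/4 + VM, family = VD/4, residual = VA/2 + 3VD/4 + Vε); the exact formulation used in the study may differ, and the function names are hypothetical.

```python
def observational_variances(va, vm, vd, ve):
    """Random-effect variances implied by the causal components
    (assumed standard NC II expectations, no epistasis)."""
    return {
        "sire": 0.25 * va,
        "dam": 0.25 * va + vm,
        "family": 0.25 * vd,
        "residual": 0.5 * va + 0.75 * vd + ve,
    }


def causal_components(sire, dam, family, residual):
    """Invert the mapping to recover VA, VM, VD and V-epsilon."""
    va = 4.0 * sire
    vm = dam - sire
    vd = 4.0 * family
    ve = residual - 2.0 * sire - 3.0 * family
    return {"VA": va, "VM": vm, "VD": vd, "Ve": ve}


# Illustrative values only, not estimates from the study.
obs = observational_variances(va=4.0, vm=1.0, vd=2.0, ve=8.0)
back = causal_components(**obs)
```

Under these assumptions the four observational variances sum to VA + VM + VD + Vε, the total variance that appears in the denominator of the heritability.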

To obtain flat priors for QST values, we parameterized the model in terms of treatment combination–specific QST rather than among-population additive genetic variance. Thus, QST was given a uniform prior U(0, 1) and the treatment combination–specific among-population additive genetic variances Vpopulation were calculated from:

$$V_{\mathrm{population}} = \frac{2\,Q_{ST}\,\mu_{V_A}}{1 - Q_{ST}}$$

where μVA is the across populations mean of the additive genetic variances VA. The other variance components VA, VM, VD, Vε were given gamma priors Gamma (0.001, 0.001). Heritability (h2) was calculated as VA/(VA + VM + VD + Vε) for all populations and in all treatment combinations and summarized as the mean h2 across all populations and all treatment combinations.
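A short sketch of this parameterization follows. It assumes the usual definition QST = Vpopulation / (Vpopulation + 2 VA), with VA replaced by its across-population mean, and uses the heritability formula given above; the function names are hypothetical and the numbers are illustrative only.

```python
def v_population(qst, mean_va):
    """Among-population additive variance implied by a given QST."""
    if not 0.0 <= qst < 1.0:
        raise ValueError("qst must lie in [0, 1)")
    return 2.0 * qst * mean_va / (1.0 - qst)


def qst_from_variances(v_pop, mean_va):
    """Inverse mapping: QST from among- and within-population variances."""
    return v_pop / (v_pop + 2.0 * mean_va)


def heritability(va, vm, vd, ve):
    """Narrow-sense heritability as defined in the text."""
    return va / (va + vm + vd + ve)


v_pop = v_population(0.5, 1.0)            # QST = 0.5 with mean VA = 1 gives 2.0
qst_back = qst_from_variances(v_pop, 1.0)  # recovers 0.5
h2 = heritability(1.0, 1.0, 1.0, 1.0)      # 0.25
```

Placing the uniform prior on QST rather than on Vpopulation keeps the prior flat on the scale on which divergence is reported.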

The data collected from the Ammarnäs population consisted of full-sib families instead of half-sibs (Table 1), and the variance components for this population were thus confounded. However, because parameters of interest were estimated from the joint posterior distribution, the results were valid and the confounding effects expressed themselves solely as possible wider CI for this population.

Pairwise QST values, i.e. QSTs for each two-population combination, were calculated from:

$$Q_{ST}^{\mathrm{pairwise}} = \frac{V_{\mathrm{pairwise}}}{V_{\mathrm{pairwise}} + 2\,\mu_{V_A}}$$

where Vpairwise was the variance of the estimates of the population means of the two populations.

FST (see, e.g. Merilä & Crnokrak, 2001; Leinonen et al., 2008) was used to measure the degree of population divergence in neutral marker loci. The overall and pairwise FST estimates published by Palo et al. (2003) for our study populations based on eight presumably neutral microsatellite loci were used. Correlations between pairwise estimates of QST, FST, and geographical distance were calculated from the posterior distribution for all treatment combinations and traits, using the odds [p/(1 − p)] of pairwise QST and FST values. The FST values were simulated from the pairwise estimates and associated 95% confidence intervals from Palo et al. (2003) assuming normality. The correlations were equivalent to the ones calculated in Mantel tests, and the CI of the correlation coefficients take into account the correlations between pairwise QST estimates (Palo et al., 2003). The correlations between pairwise QST and geographical distance (rQST) and between pairwise FST and distance (rFST) were used to assess the population divergence and role of selection. A significant positive rQST would suggest that the relative leg length differs genetically between populations and rQST > rFST would be evidence of natural selection being a stronger force than genetic drift in driving the divergence. Furthermore, the difference between overall QST and FST in each treatment combination was calculated. A difference significantly greater than zero would also suggest divergent selection. Finally, for each treatment combination, we calculated the correlation between population effects and latitude. A significant correlation would indicate latitudinal divergence.
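The correlation step can be sketched as follows. This is a minimal Python illustration under stated assumptions: the standard deviation used to simulate FST is derived from the 95% interval width under normality, the correlation is a plain Pearson coefficient computed once per posterior draw, and all function names are hypothetical. How simulated FST values outside (0, 1) were handled is not described in the text and is not addressed here.

```python
import math
import random


def odds(p):
    """Odds transform p / (1 - p) applied to pairwise QST and FST values."""
    return p / (1.0 - p)


def sd_from_ci(lower, upper, z=1.96):
    """Standard deviation implied by a symmetric 95% interval under normality."""
    return (upper - lower) / (2.0 * z)


def simulate_fst(estimate, lower, upper, rng):
    """One normal draw around a published pairwise FST estimate."""
    return rng.gauss(estimate, sd_from_ci(lower, upper))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_summary(qst_draws, distances):
    """Posterior mean and 95% interval of r between odds(QST) and distance.

    qst_draws: one list of pairwise QST values per posterior draw
    (intended for a reasonably large number of draws).
    """
    rs = sorted(pearson([odds(q) for q in draw], distances)
                for draw in qst_draws)
    lower = rs[int(0.025 * len(rs))]
    upper = rs[int(0.975 * len(rs)) - 1]
    return sum(rs) / len(rs), lower, upper
```

Because the correlation is recomputed for every posterior draw, the resulting interval reflects the dependence among pairwise estimates that share a population.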

The fit of the Bayesian model was checked by eye and by a formal test. Visual examination of the residuals plotted against the fitted values revealed a weak positive trend which, however, was deemed to have no practical effect on the analysis. The conclusion was supported by formal testing which found no evidence of lack of model fit: The Bayesian P-value for the χ2 discrepancy test was 0.80 for femur length, 0.79 for tibia length, 0.73 for the ratio of femur to tibia length and 0.78 for snout–vent length.

Results


Extremity length in the wild

Generally, patterns in femur and tibia lengths did not conform to simple expectations under Allen’s rule. In the wild, there was no apparent linear trend between either of the traits and latitude after correcting for snout–vent length. However, there was a nonlinear relationship, which peaked at mid-latitudes (Fig. 1, Table 2). In the north, tibia length decreased more steeply with latitude compared to the femur length, causing a strong increase in the ratio of femur to tibia length (Fig. 1). Latitude and its square had a significant effect on the ratio (Table 2). Neither sex nor its interactions had significant effects on any of the traits (Table 2).

Figure 1. The effect of latitude on (a) femur length, (b) tibia length and (c) the ratio of femur to tibia length in wild adult common frogs after correcting for the linear effect of snout–vent length. The solid curves represent predictions for males and the dashed curves for females. The filled circles are partial residuals for males and the open circles for females. The models were linear mixed models incorporating the effects listed in Table 2.

Table 2.   The effects of sex, latitude and snout–vent length (SVL) on femur length, tibia length and the ratio of femur to tibia length.

| Effect | Femur length | Tibia length | Femur/tibia |
| --- | --- | --- | --- |
| Intercept | **−110.9 (−189.1, −33.5)** | **−195.5 (−285.2, −104.4)** | **2.87 (1.87, 3.85)** |
| Sex | 13.9 (−96.8, 124.1) | 30.4 (−100.3, 155.5) | −0.520 (−1.905, 0.894) |
| Latitude | **3.80 (1.32, 6.41)** | **6.73 (3.73, 9.63)** | **−0.067 (−0.099, −0.034)** |
| Latitude² | **−0.030 (−0.051, −0.010)** | **−0.054 (−0.078, −0.030)** | **5.6 × 10⁻⁴ (3.0 × 10⁻⁴, 8.2 × 10⁻⁴)** |
| SVL | **0.300 (0.244, 0.352)** | **0.290 (0.235, 0.356)** | **6.7 × 10⁻⁴ (5.4 × 10⁻⁴, 0.001)** |
| Sex × latitude | −0.45 (−4.15, 3.11) | −0.92 (−5.06, 3.32) | 0.015 (−0.031, 0.060) |
| Sex × latitude² | 0.003 (−0.025, 0.033) | −0.007 (−0.027, 0.040) | −1.2 × 10⁻⁴ (−4.8 × 10⁻⁴, 2 × 10⁻⁴) |
| Sex × SVL | 0.031 (−0.047, 0.110) | 0.023 (−0.069, 0.105) | 2.7 × 10⁻⁴ (−5.6 × 10⁻⁴, 12 × 10⁻⁴) |

The estimates (95% confidence intervals in parentheses) for which the confidence interval does not include zero are in bold. The model was a linear mixed model incorporating population as a random effect. All length measurements are in millimetres and latitude in decimal degrees. Females were used as the reference sex.
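The location of the mid-latitude peak follows from the linear and quadratic latitude coefficients: a curve b1·x + b2·x² has its turning point at x = −b1 / (2·b2). The sketch below applies this to the rounded point estimates for females (the reference sex) in Table 2. The signs of the coefficients are taken to agree with their reported intervals, which is an assumption about the printed values, and the function name is hypothetical.

```python
def vertex_latitude(b_latitude, b_latitude_sq):
    """Latitude at which b_latitude * x + b_latitude_sq * x**2 is extremal."""
    if b_latitude_sq == 0:
        raise ValueError("no quadratic term: the curve has no turning point")
    return -b_latitude / (2.0 * b_latitude_sq)


# Rounded estimates from Table 2; signs assumed to match the intervals.
femur_peak = vertex_latitude(3.80, -0.030)        # about 63.3 degrees N (maximum)
tibia_peak = vertex_latitude(6.73, -0.054)        # about 62.3 degrees N (maximum)
ratio_minimum = vertex_latitude(-0.067, 5.6e-4)   # about 59.8 degrees N (minimum)
```

Because the inputs are rounded, these turning points are approximate, but they fall within the sampled range of roughly 55.5 to 69°N, with the femur/tibia ratio rising north of its minimum.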

Common garden results

Treatment-specific means estimated from the Bayesian model – and thus corrected for snout–vent length – across all populations were 5.8–6.5 mm for femur, 4.5–5.2 mm for tibia length and 1.21–1.31 for the ratio of femur to tibia length (Table 3). Restricted food availability had a tendency to decrease the lengths of both femur and tibia whereas temperature treatment showed no obvious trends (Table 3). Food availability had no direct effect on the ratio of femur to tibia length, but the trait showed a response to temperature under the restricted food level – there was a significant increase in the ratio when moving from 14 to 22 °C, with the 18 °C treatment being intermediate (Table 3). Snout–vent length had a significant effect on both femur and tibia length in all treatments, but only a weak, and in most cases nonsignificant, effect on the ratio of femur to tibia length (Table 3). Age had a positive or nonsignificant effect on both femur and tibia lengths in all treatments, but no effect on their ratio (Table 3).

Table 3.   Treatment-specific means for femur and tibia lengths and for the ratio of femur to tibia length, and the effects of snout–vent length and age on these traits.
Food availabilityTemperature (°C)MeanSnout–vent lengthAge
Posterior mean95% CIPosterior mean95% CIPosterior mean95% CI
  1. Estimates are given as posterior means and associated 95% credible intervals (CI).

Femur length (mm)
 Ad libitum226.−0.0020.013−0.0010.014
Tibia length (mm)
 Ad libitum225.−0.0010.009−0.0010.009
Femur/tibia length
 Ad libitum221.241.191.29−0.022−0.032−0.0120.000−0.0010.002−0.007−0.0180.0030.000−0.0010.001−0.014−0.023−0.006−0.001−0.0020.000

The mean h2 across all populations and treatments was 0.12 (95% CI: 0.07–0.18) for femur length, 0.15 (95% CI: 0.10–0.21) for tibia length and 0.16 (95% CI: 0.12–0.21) for the ratio of the two. For comparison, the mean h2 of snout–vent length was 0.20 (95% CI: 0.14–0.27).

There were five significant correlations between latitude and population effects of femur or tibia and none between latitude and population effects of the ratio of femur to tibia length (Table 4; Fig. 2). The estimation of QST was hampered by very wide CI, but the posterior estimates were relatively low for femur length and the ratio of femur to tibia length and higher for tibia and snout–vent length (Fig. 3). The highest QST value for a leg trait was estimated for tibia length in the 14 °C temperature and low food availability treatment (posterior mean: 0.55; 95% CI: 0.25–0.86). This was the only leg trait and treatment combination for which the correlation between pairwise QST and distance between populations (r = 0.56, 95% CI: 0.16–0.79; results for the other combinations not shown) and the difference between QST and FST (posterior mean difference: 0.32, 95% CI: 0.01–0.63; results for the other combinations not shown) were significantly different from zero. The difference between QST and FST values was also significant for snout–vent length in the 14 °C temperature and low food availability treatment (posterior mean difference: 0.31, 95% CI: 0.01–0.63). The correlation between pairwise QST and FST values was not significant for any treatment combination in any trait (results not shown). There was no significant difference in any trait and treatment combination between the pairwise correlations of QST and distance and of FST and distance (results not shown).

Table 4.   Correlations between latitude and population effects of femur length, tibia length and the ratio of femur to tibia length.

| Trait | Food availability | Temperature (°C) | Correlation (posterior mean) | 95% CI |
| --- | --- | --- | --- | --- |
| Femur length | Ad libitum | 22 | 0.10 | −0.44, 0.57 |
| Tibia length | Ad libitum | 22 | −0.15 | −0.58, 0.34 |
| Femur/tibia length | Ad libitum | 22 | 0.20 | −0.48, 0.69 |

Results are summarized by posterior means of the correlation coefficients and 95% credible intervals (CI). Significant correlations are in bold.

Figure 2. The relationship between latitude and population effects of femur length (a–b), tibia length (c–d) and the ratio of femur to tibia length (e–f) in different treatment combinations as estimated from the common garden data. Plotted are the posterior means. The model included snout–vent length as a covariate, and the effects are thus corrected for body length.

Figure 3. Comparison of the degree of quantitative trait divergence (QST) and divergence in neutral marker loci (FST) in all temperature and food availability (ad libitum/restricted) treatments. Filled circles are posterior means of QST values and vertical bars 95% credible intervals for (a) femur length, (b) tibia length, (c) the ratio of femur to tibia length and (d) snout–vent length. Horizontal dashed lines represent the FST estimate previously published by Palo et al. (2003), and dotted lines mark the associated 95% highest posterior density interval.

Discussion


According to Allen’s rule, we would expect selection to favour relatively shorter extremities in colder environments and thus to observe a negative correlation between latitude and relative leg length. Our results from the common frogs in the wild did not conform to this simple prediction, as there was no apparent linear latitudinal relationship in either femur or tibia lengths after correcting for snout–vent length. However, there was a strong nonlinear relationship with both traits peaking at the mid-latitude populations of the study. Tibia length decreased from its peak value more steeply with increasing latitude than did femur length, resulting in a convex increase in the ratio of femur to tibia length towards the north (Fig. 1). The common garden results revealed significant latitudinal trends with the among-population additive genetic effects on femur and especially tibia lengths decreasing with latitude under some – but not all – experimental treatments, especially under the harshest conditions (low food, 14 °C) (Fig. 2; Table 4). This suggests that there is an underlying genetic trend consistent with Allen’s rule, but also a significant genotype–environment interaction in its expression. The existence of genotype–environment interaction implies that populations have diverged genetically with respect to the degree of phenotypic plasticity they express. Taken together, our results not only suggest that environmental effects can partially counteract the genetic cline but also demonstrate that selection might have operated on genetic variation that is partly hidden by environmental variance (for other similar examples of this see: Conover & Schultz, 1995; Merilä et al., 2001; Garant et al., 2004).

All three leg traits were heritable (mean h2 = 0.12–0.16). The heritability estimates were slightly lower than, but roughly of the same magnitude as, the heritability of snout–vent length (mean h2 = 0.20), and there were no significant differences between the traits (see also Laugen et al., 2005b). The presence of additive genetic variation within populations shows that relative extremity length has the potential to respond to selection. Furthermore, the comparison of divergence in quantitative traits and in neutral molecular marker loci (QST–FST comparison; see e.g. Merilä & Crnokrak, 2001; Leinonen et al., 2008) indicates the occurrence of past divergent selection on tibia length. Although the CI for QST were always wide because of the relatively small number of populations (cf. O’Hara & Merilä, 2005), in one treatment combination – 14 °C and restricted food availability – QST for tibia length was significantly higher than the FST estimate published previously by Palo et al. (2003) for these populations. Furthermore, the point estimates for QST were consistently higher than the FST estimate in those treatment combinations that showed the strongest latitudinal correlation in the among-population additive genetic effects, suggesting that selection might have driven the divergence. In addition to the evidence from the QST–FST comparisons, the latitudinal order in the population effects in several treatments itself supports the hypothesis that populations have diverged in tibia length – and to a smaller extent in femur length – in response to divergent selection rather than random genetic drift alone.
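The logic of the QST–FST comparison can be illustrated with a minimal Python sketch. It assumes the commonly used definition for diploid, outbreeding populations, QST = VB / (VB + 2 VW), where VB is the among-population and VW the within-population additive genetic variance, and it assumes that posterior draws of both variance components are available. The function names, the simulated draws and the FST placeholder are hypothetical and are not values or code from this study; the sketch also compares QST only against an FST point value and ignores the uncertainty in FST.

```python
import numpy as np


def qst_draws(var_between, var_within):
    """Q_ST for each posterior draw: V_B / (V_B + 2 * V_W)."""
    var_between = np.asarray(var_between, dtype=float)
    var_within = np.asarray(var_within, dtype=float)
    return var_between / (var_between + 2.0 * var_within)


def summarize(draws, level=0.95):
    """Posterior mean and equal-tailed credible interval of the draws."""
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return float(np.mean(draws)), float(lower), float(upper)


# Simulated stand-ins for MCMC output (hypothetical, NOT data from the study).
rng = np.random.default_rng(1)
var_between = rng.gamma(shape=4.0, scale=0.05, size=5000)
var_within = rng.gamma(shape=20.0, scale=0.01, size=5000)
fst_placeholder = 0.2  # hypothetical neutral-marker F_ST point value

q = qst_draws(var_between, var_within)
post_mean, ci_low, ci_high = summarize(q)

# Divergent selection is suggested when the whole interval lies above F_ST,
# stabilizing selection when it lies below; otherwise drift is not excluded.
exceeds_fst = ci_low > fst_placeholder
print(f"QST mean = {post_mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print("QST interval above FST placeholder:", exceeds_fst)
```

With few populations, the draws of VB are highly uncertain, which is one way to see why the resulting QST intervals tend to be wide.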

It is noteworthy that the QST estimates for the ratio of femur to tibia length were very low in all treatments (posterior means ranging 0.09–0.20) compared to the other traits and the FST estimate, suggesting possible stabilizing selection, although the CI were again too wide to draw any firm conclusions. The modest population effects on the ratio in the common garden experiment and the resulting small QST values are in striking contrast to the pattern observed in the wild, where the ratio increases steeply in the north. The common garden result probably stems from a positive genetic correlation between tibia and femur length constraining the genetic variation in the ratio, although we did not quantify this correlation in this study. The difference from the data from the wild is likely caused by strong environmental influences on the ratio, as evidenced by the significant and suggestive differences in the trait mean between temperature treatments in the common garden experiment.
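The way a positive genetic correlation can constrain variation in a ratio may be seen from a short derivation. It is only a sketch under stated assumptions: the ratio is treated on a logarithmic scale, and the symbols F (femur length), T (tibia length), V_A (additive genetic variance) and r_A (additive genetic correlation) are introduced here for illustration. The correlation itself was not estimated in this study.

```latex
\begin{align}
z &= \ln\!\left(\frac{F}{T}\right) = \ln F - \ln T, \\
V_A(z) &= V_A(\ln F) + V_A(\ln T) - 2\, r_A \sqrt{V_A(\ln F)\, V_A(\ln T)} .
\end{align}
% Special case of equal variances, V_A(\ln F) = V_A(\ln T) = V:
\begin{equation}
V_A(z) = 2V\,(1 - r_A) \;\longrightarrow\; 0 \quad \text{as } r_A \to 1 .
\end{equation}
```

Under these assumptions, the closer r_A is to one, the less additive genetic variance remains in the ratio, even when femur and tibia lengths are each variable.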

One caveat in our study was that we assumed that the trait values in juvenile and adult phases are correlated and that the measures taken from adults caught in the wild and from juveniles reared in the common garden thus represent the same trait. This assumption seems justified, as age-specific measures of morphology are often strongly positively correlated, also between different life stages in frogs (see e.g. Watkins, 2001). The environmental effects experienced during the larval stage can also carry over into the adult phase (see e.g. Blouin & Brown, 2000; Gomez-Mestre et al., 2010). Another issue is that the distribution range of the common frog extends from the Spanish Pyrenees to the North Cape, and our study – though geographically extensive – still covered only part of the full latitudinal gradient. This might hide patterns that would be evident on a larger scale. Nonetheless, we believe that the studied latitudinal range is an adequate sample of the range of environmental conditions relevant to Allen’s rule. Finally, the studied populations necessarily differ in a number of environmental variables other than latitude, which might influence the observed latitudinal patterns.

The primary purpose of legs is obviously not thermoregulation, and hence, it is interesting and even surprising that at least under some environmental conditions there is a discernible genetic cline in the extremity length consistent with Allen’s rule. Extremity length is likely to be under selection for other reasons as well. For example, a positive correlation between relative extremity length and dispersal ability has been shown in cane toads (Bufo marinus) in Australia (Phillips et al., 2006). Looking solely at the downwardly concave phenotypic trend in femur and tibia length, one might thus hypothesize that this pattern could be a result of simultaneous dispersal from north and south. Such a two-way dispersal history has been indicated for the common frogs in Scandinavia (Palo et al., 2004). However, there is no corresponding downwardly concave pattern in the additive genetic population effects estimated from the common garden data, and thus, the pattern is likely of environmental origin and not reflective of the colonization history of the species. Another potential factor driving the differentiation in relative extremity length is the selective force imposed by predation. Relative leg length in frogs can be related to jumping ability, which in turn may influence the ability to avoid predators (see e.g. Tejedo et al., 2000). We do not have any direct information about variation in predation pressure on metamorphs at different latitudes, but it is possible – perhaps even likely – that the predation pressure on metamorphs declines with increasing latitude (Laurila et al., 2008; Schemske et al., 2009). Yet as evidenced by the results from our food and temperature treatments, as well as results from other similar studies (e.g. Emerson, 1986; Tejedo et al., 2000; Relyea & Hoverman, 2003; Gomez-Mestre & Buchholz, 2006; see Tejedo et al., 2010 for a review), environmental influences on the relative leg length can be large. In particular, and in accordance with most previous studies (Tejedo et al., 2010), our results demonstrate that larval food level directly constrains relative femur and tibia length of metamorphosed juveniles, whereas temperature experienced at the larval stage has no effect on these traits when the length of the larval period is accounted for.

In conclusion, although there was a genetic cline in femur and tibia length consistent with Allen’s rule in some environmental conditions and evidence of a latitudinal divergence driven by selection, we cannot conclusively attribute the cline – nor the divergence – to the thermal mechanism often associated with Allen’s rule. Although it is certainly a plausible hypothesis, experimental work would be needed to establish a thermally adaptive basis. That said, we have demonstrated the existence of a latitudinal trend at the genetic level that is partially masked in the wild by environmental effects and possibly by genotype–environment interactions. This kind of cryptic variation in the relative extremity length shows that a pattern consistent with Allen’s rule can exist underneath environmental effects that hide it from studies relying only on phenotypic data.


We thank Johan Elmberg, Björn Lardner, Jon Loman, Fredrik Söderman and Kilpisjärvi Biological Station for their help in obtaining the material for this study, Heli Kinnunen for measuring the common garden material, and John Loehr, Bob O’Hara and two anonymous reviewers for their comments on earlier manuscript versions. Our research was supported by the Swedish Research Council (IJ, AL, JM), Academy of Finland (JM), LUOVA graduate school (JSA) and Spatial Ecology Program of University of Helsinki (GH).