Evolvability and genetic constraint in Dalechampia blossoms: components of variance and measures of evolvability


  • Thomas F. Hansen,

    1. Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
    2. Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
  • Christophe Pélabon,

    1. Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
  • W. Scott Armbruster,

    1. Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
    2. Institute of Arctic Biology, University of Alaska, Fairbanks, AK 99775, USA
  • Matthew L. Carlson

    1. Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway
    2. Alaska Natural Heritage Program, Environment and Natural Resources Institute, University of Alaska, Anchorage, AK 99501, USA

Thomas F. Hansen, Department of Biological Science, Conradi Building, Florida State University, Tallahassee, FL 32306, USA.
Tel.: +1 850 644 4779; fax: +1 850 644 9829;
e-mail: Thomas.Hansen@bio.fsu.edu


Abstract Many evolutionary arguments are based on the assumption that quantitative characters are highly evolvable entities that can be rapidly moulded by changing selection pressures. The empirical evaluation of this assumption depends on having an operational measure of evolvability that reflects the ability of a trait to respond to a given external selection pressure. We suggest short-term evolvability be measured as expected proportional response in a trait to a unit strength of directional selection, where strength of selection is defined independently of character variation and in units of the strength of selection on fitness itself. We show that the additive genetic variance scaled by the square of the trait mean, IA, is such a measure. The heritability, h2, does not measure evolvability in this sense. Based on a diallel analysis, we use IA to assess the evolvability of floral characters in a population of the neotropical vine Dalechampia scandens (Euphorbiaceae). Although we are able to demonstrate that there is additive genetic variation in a number of floral traits, we also find that most of the traits are not expected to change by more than a fraction of a percent per generation. We provide evidence that the degree of among-population divergence of traits is related to their predicted evolvabilities, but not to their heritabilities.


Evaluating the potential for evolutionary response to natural selection is critical to our understanding of whether, or more accurately, in what sense, macroevolution can be understood as an extrapolation of microevolutionary processes. The neo-Darwinian consensus seems to be that ordinary selection on standing genetic variation is perfectly able to account for even complex evolutionary innovations (e.g. Dawkins, 1996). The empirical basis of this consensus may be found in the optimality, variability and mutability of quantitative characters. The evolvability of quantitative characters is supported by the direct observation of genetic variation (Houle, 1992) and mutability (Lynch, 1988; Houle et al., 1996), by many observations of rapid microevolutionary change (Hendry & Kinnison, 1999), and by the general success of optimality models, which imply that genetic constraints cannot be too severe. It may be premature, however, to conclude that quantitative characters are unconstrained and generally evolvable. One problem is to explain the high degree of stasis that seems to prevail on macroevolutionary time scales (Williams, 1992; Gould & Eldredge, 1993). This is usually done with reference to stabilising selection (e.g. Charlesworth et al., 1982; Williams, 1992), but then begs the question of why the selective optima themselves are so stable. Selective optima are usually the result of a balance among a number of selective factors, at least some of which are more likely to be sensitive to changes in the environment (Travis, 1989; Hansen, 1997). Although some hypotheses, such as tracking of hyperstable niche parameters (Williams, 1992) and internal selection (Wagner & Schwenk, 2000), have been put forward to explain the stability of optima, these are just ideas in need of further testing.

Another problem is that the correspondence between phenotypic variation and adaptive hypotheses is rarely perfect. This is seen in our study species, the neotropical vine Dalechampia scandens (Euphorbiaceae). The specialized flower-like inflorescences (blossoms) of these plants secrete a resin that is attractive to bees that use resin for nest construction (Armbruster, 1984). The blossoms show extensive geographical variation in size and shape. This variation is certainly influenced by selection deriving from different species of resin-collecting bees and from competition with other Dalechampia for the same pollinators (Armbruster, 1985, 1986). Still, attempts at modelling plausible selective factors have only been able to account for a small part of the interpopulation variation (Armbruster, 1990; Hansen et al., 2000). This must at least partially be caused by incomplete characterization of the selective regimes, but limited evolvability of the blossoms may also be involved. Floral optima are influenced by a range of factors such as the composition of the bee community, the abundance of other Dalechampia species, the availability of other resin sources for the bees, and energetic constraints on both plants and pollinators (Armbruster, 1990, 1996). All these factors are ecologically labile, and if the blossoms are not highly evolvable, they may lag behind in adaptation to the current selective regime. In this paper, we assess this possibility by quantifying the short-term evolvability of blossom traits.

Even if we confine ourselves to predicting evolvability over one or a few generations, and ignore constraints caused by pleiotropy and epistasis, evolvability is not easy to measure. The most common measure of short-term evolvability has been the heritability, h2, defined as the fraction of phenotypic variance that is due to additive genetic effects. But in a seminal paper, Houle (1992) demonstrated that heritability is a poor measure of additive genetic variance and therefore of evolvability. Heritability is suspect as a measure of genetic variance because genetic and environmental variances tend to be strongly correlated. Traits with high levels of genetic variation, such as fitness components, may have low heritabilities due to even higher levels of phenotypic variation (e.g. Price & Schluter, 1991; Houle, 1992, 1998; Messina, 1993; Houle et al., 1996; Campbell, 1997; Merilä & Sheldon, 1999; Schluter, 2000; Stirling et al., 2002).

The main justification for using heritability as a measure of evolvability is that heritability, through the breeder's equation R = h2S, predicts the response to selection, R, when the selection differential, S, is known (e.g. Falconer & Mackay, 1996; Roff, 1997). The selection differential is, however, not a measure of selection that is independent of trait variance. Under linear selection it is proportional to the variance of the trait. We therefore expect a negative correlation between S and h2. Traits with high heritabilities often have low levels of phenotypic variance and will need a steeper selection gradient to generate a particular selection differential than will more phenotypically variable traits. Thus, traits with high heritabilities are not necessarily more evolvable.

Houle (1992) used the coefficient of additive genetic variation, CVA, as a scale-free measure of genetic variance, and showed that this is a more sensible predictor of evolvability under many circumstances. This measure still lacks an operational interpretation of its numerical value. Does a CVA of, say, 10% correspond to high or low evolvability? In this paper we show that the mean-standardized additive genetic variance, IA, can be interpreted as a proportional evolutionary response of a trait to a unit strength of directional selection, where a unit strength of selection, which we denote as φ, will be defined as the strength of selection on fitness itself.

van Tienderen (2000) and Morgan (unpublished) have previously suggested that IA is the appropriate measure of evolvability when fitness elasticities are used as measures of selection strength, and Houle (1992) and Burt (1995) suggested IA as a measure of the evolvability of fitness. Although Houle (1992) is certainly right that no measure of evolvability is appropriate in all circumstances, we will argue that interpreting IA as expected proportional response to a unit strength of selection will provide a good perspective on the evolvability of many size- and fitness-related traits on the scale of positive real numbers.

In this paper, we use this interpretation of IA to assess the evolvability of floral traits in a population of D. scandens. This study is part of a larger attempt to understand the links among biological variation on several hierarchical levels in the genus Dalechampia (e.g. Armbruster, 1985, 1986, 1988, 1990, 1991, 1993, 1996, 1997; Armbruster et al., 1997; Hansen et al., 2000, 2003). The overall goal is to understand the basis of adaptive divergence both in the genus as a whole, and in the widespread and morphologically diverse D. scandens. In this paper, we demonstrate that many blossom traits have rather low evolvabilities, and we show that trait diversification among populations is related to the predicted evolvability of the traits.


Measuring evolvability

Evolvability is the ability of a character to respond to selection, and because selection acts on variation, evolvability is ultimately determined by the capability to vary (Wagner & Altenberg, 1996). In the short term, however, evolvability is determined by the standing variation in the population. Here, we focus on short-term evolvability within the framework of the Lande (1976, 1979) equation, which, in the case of a single trait z, describes the response to selection, ΔZ, under linear directional selection as the product $\Delta Z = \sigma_A^2 \beta$, where $\sigma_A^2$ is the additive genetic variance, and the selection gradient, β, is the (partial) regression of relative fitness on the trait. This assumes that the additive genetic variance remains constant and is not tied up in correlations with other characters. A concept of evolvability that accounts for genetic correlations is discussed elsewhere (Hansen et al., 2003; Hansen, 2003).

If evolvability is thought of as the ability to respond to varying selection pressures created by the external environment, it becomes essential to represent selection in terms of the fitness function, because the fitness function describes how the environment relates trait to fitness. This means that the selection differential, S, is not an adequate representation of selection strength for our purposes. To illustrate, if the fitness function has slope β, the selection differential is

$$S = \mathrm{Cov}[w, z] = \beta \sigma_P^2, \tag{1}$$

where w is relative fitness and $\sigma_P^2$ is phenotypic variance in the trait. This makes S a function of the variation of the trait, and the effects of variation are confounded with the selection pressure generated by the environment. Moreover, this makes S negatively correlated with the heritability, $h^2 = \sigma_A^2/\sigma_P^2$. Thus, the partitioning into h2 and S embodied in the breeder's equation does not clearly separate the effect of variability from the effect of selective environment. Therefore, heritability does not measure evolvability in the sense of ability to respond to changes in the external environment.

Seeing evolvability as the ability to respond to a selection pressure represented by a fitness function suggests measuring evolvability as the predicted response to a standardized (directional) selection gradient. Towards such a measure we use a result by Morgan (pers. comm.), Morgan & Schoen (1997) and van Tienderen (2000), who showed that the selection gradient, β, (on relative fitness) standardized with the trait mean, Z = E[z], i.e. βZ, can be interpreted as the elasticity of relative fitness with respect to the trait (i.e. the percentage change in relative fitness per percentage change in the trait). Formally the elasticity is W′(z)Z/E[W], where E[W] is the mean fitness. Note that β = W′(z)/E[W] is the selection gradient on relative fitness, and if the selection gradients are obtained from regressions involving absolute fitness measures, they need to be divided by mean fitness to fit the theory presented here. Elasticities have a number of properties that are desirable for comparison across traits and populations (Caswell, 1989, Chapter 6; van Tienderen, 2000).

Notice that if the trait is taken to be fitness itself, then βZ = 1. This gives the elasticity a natural unit, which we call φ, for fitness. This is motivated by the fact that the strength of selection on fitness is invariant across all species and environments, such that φ itself is a biological invariant. We may formally define φ as the mean-standardized selection gradient of fitness itself. Thus, 1φ represents directional selection of the same strength as selection on fitness itself, 0.1φ represents directional selection that is 10% as strong as on fitness itself, and so on.
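As a small illustration of the mean-standardization described above, the following Python sketch converts a regression slope of absolute fitness on a trait into the elasticity βZ. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def mean_standardized_gradient(slope_abs_fitness, mean_fitness, trait_mean):
    """Return beta * Z, the elasticity of relative fitness with respect to the trait.

    slope_abs_fitness: regression slope of absolute fitness W on the trait z.
    mean_fitness:      E[W], used to convert the slope to relative fitness.
    trait_mean:        Z = E[z]; must be positive for a percentage scale.
    """
    if mean_fitness <= 0 or trait_mean <= 0:
        raise ValueError("mean fitness and trait mean must be positive")
    beta = slope_abs_fitness / mean_fitness  # gradient on relative fitness
    return beta * trait_mean                 # strength of selection in units of phi


# Hypothetical trait: slope of 0.04 offspring per mm, mean fitness 8, mean 20 mm.
strength = mean_standardized_gradient(0.04, 8.0, 20.0)

# Fitness regressed on itself has slope 1 and mean E[W], so beta * Z is one phi.
fitness_itself = mean_standardized_gradient(1.0, 8.0, 8.0)

print(strength, fitness_itself)
```

With these made-up values the trait is under selection roughly one tenth as strong as selection on fitness itself.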

Based on this, we suggest that short-term evolvability can be operationalized as the predicted (%) response per generation to directional selection of strength 1φ. We now demonstrate that IA, the additive genetic variance divided by the square of the trait mean, is such a measure. From the Lande equation, the proportional response in the trait is

$$\frac{\Delta Z}{Z} = \frac{\sigma_A^2 \beta}{Z} = \frac{\sigma_A^2}{Z^2}\,\beta Z = I_A\,\beta Z. \tag{2}$$

Then the evolvability, as predicted proportional response per strength of selection, is

$$\frac{\Delta Z / Z}{\beta Z} = \frac{\sigma_A^2}{Z^2} = I_A. \tag{3}$$

Thus, provided all the additive genetic variance in a trait is available for selection, a value of 100 × IA can be interpreted as the percentage evolutionary change an unconstrained trait can achieve if the strength of directional selection is 1φ. This value we will call the IA-evolvability. The unit of the IA-evolvability is percentage trait change times φ⁻¹.
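The interpretation of IA as a percentage response can be checked numerically. The Python sketch below is a minimal illustration, not part of the original analysis; the function names are ours. The example uses the additive variance (1.33) and mean (20.59) reported for upper bract width (UBW) in Table 3.

```python
def ia_evolvability(additive_var, trait_mean):
    """Mean-standardized additive genetic variance, I_A = sigma_A^2 / Z^2."""
    if trait_mean <= 0:
        raise ValueError("I_A requires a trait measured on a positive scale")
    return additive_var / trait_mean ** 2


def predicted_percent_response(additive_var, trait_mean, selection_strength=1.0):
    """Predicted % change per generation under the single-trait Lande equation.

    selection_strength is the mean-standardized gradient beta * Z in units
    of phi; 1.0 means selection as strong as selection on fitness itself.
    """
    return 100.0 * ia_evolvability(additive_var, trait_mean) * selection_strength


# UBW values from Table 3: additive variance 1.33, trait mean 20.59.
ubw_at_one_phi = predicted_percent_response(1.33, 20.59)
ubw_at_tenth_phi = predicted_percent_response(1.33, 20.59, selection_strength=0.1)

print(round(ubw_at_one_phi, 2))  # 0.31, matching the tabulated IA-evolvability (%)
```

Under selection one tenth as strong as on fitness itself, the predicted change is correspondingly one tenth of this value.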

Any parameter gets its operational meanings from the theoretical contexts in which it appears. The utility of IA as a measure of evolvability depends on the particular theoretical interpretation given above. It is clear that this interpretation can be more or less appropriate depending on (i) how well the Lande equations describe the response to selection, and (ii) on whether relevant evolutionary differences can be described quantitatively on a relative (%) scale. Relative evolvabilities as measured by IA require that traits be measured on a scale that approximates the positive real numbers.

The CVA, and thus the IA, has been criticised as overly sensitive to small trait means, and should perhaps not be extrapolated to compare traits with very different means (Polak & Starmer, 2001; see also Downhower et al., 1987). Traits with distributions that peak close to zero or are otherwise strongly skewed would be problematic, both because the assumptions of the Lande equations are violated, and because a percentage scale will not capture the obvious asymmetry in the evolvability of the two directions. These considerations aside, we note that most quantitative traits are either measured on, or can be transformed to, a positive scale with roughly a symmetric distribution (Lynch & Walsh, 1998). The IA-evolvability should therefore be widely applicable, and be particularly suitable for size- and fitness-related variables.

The IA-evolvability builds on one of the founding ideas in evolutionary quantitative genetics, namely the separation of variability and selection embodied in the Lande equation. Following Lande & Arnold (1983), the use of selection gradients has greatly facilitated the study of selection in natural populations, and it is useful to view evolvability in relation to this representation of selection. It is worth repeating that this separation is different from the one embodied in the breeder's equation. If no indirect selection is involved, the relationship between the two formulations can be illustrated as follows:

$$\Delta Z = h^2 S = \frac{\sigma_A^2}{\sigma_P^2}\,\sigma_P^2 \beta = \sigma_A^2 \beta. \tag{4}$$

Of course, if the selection differential is known or controllable, the heritability is what is needed to predict the response to selection, which accounts for the utility of the breeder's equation in artificial selection (Falconer & Mackay, 1996). Furthermore, when selection intensities, i = S/σP, are used to measure strength of selection, the heritability is the right measure to predict the response (in units of phenotypic standard deviations). Note, however, that heritabilities and intensities are also expected to be negatively related, as S scales with $\sigma_P^2$ and not σP under linear selection. Thus, heritabilities should only be used to predict evolvabilities when they have been measured in the same population under the same conditions as S or i.
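The contrast between the two partitionings can be made concrete with a short Python sketch. It assumes a linear fitness function with a fixed gradient β and compares two hypothetical traits that share the same additive variance but differ in phenotypic variance. All names and numbers are illustrative assumptions, not estimates from this study.

```python
def lande_response(additive_var, beta):
    """Response to selection from the single-trait Lande equation."""
    return additive_var * beta


def breeder_response(additive_var, phenotypic_var, beta):
    """Same response via the breeder's equation, R = h2 * S, with S = beta * var_P."""
    heritability = additive_var / phenotypic_var
    differential = beta * phenotypic_var
    return heritability * differential, heritability, differential


beta = 0.05          # hypothetical selection gradient set by the environment
additive_var = 1.0   # identical additive variance in both hypothetical traits

results = {}
for phenotypic_var in (2.0, 8.0):
    response, h2, s = breeder_response(additive_var, phenotypic_var, beta)
    results[phenotypic_var] = (response, h2, s)
    print(f"var_P={phenotypic_var}: h2={h2:.3f}, S={s:.2f}, R={response:.3f}")
```

In this sketch the trait with the higher heritability has the smaller selection differential, and both traits show the same response, which equals the Lande prediction.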

Limits on evolvability and selectability

Crow (1958) suggested that Iw = Var[W]/E[W]² could be used as a measure of the opportunity for selection (see Endler, 1986; Downhower et al., 1987; Houle, 1992). The opportunity for selection as applied to a trait, $I_P = \sigma_P^2/Z^2$, is the phenotypic analogue of IA, and is a useful upper limit to the evolvability, which is realized when the heritability is 1.

As the IA-evolvability is the predicted response to selection of strength 1φ, and the strengths of selection on most traits are presumably much less than this, we expect actual evolutionary changes to be usually less than the percentage given by the IA-evolvability. We note, however, that it is theoretically possible that the strength of selection may exceed 1φ, as the selection gradient at any one point may be arbitrarily steep. Although it seems implausible for mean-standardized selection gradients to be larger than 1φ over a large range of trait values, the strength of selection is better investigated empirically than theoretically.

Upper bounds to selection strength based on variance in fitness may also be obtained. We can write the relative fitness of an individual with value z for the focal trait and values x = {x1, …} for all other traits as

$$w(z, x) = \beta z + g(x), \tag{5}$$

where g(x) is an arbitrary function that captures the effects of all forms of selection on the organism except for directional selection on z. As g(x) is the residual of a regression of w on z, we assume that g(x) is uncorrelated with z. The variance in relative fitness is then

$$\mathrm{Var}[w] = \beta^2 \sigma_P^2 + I_r, \tag{6}$$

where $I_r = \mathrm{Var}[g(x)]$ is the component of variance in relative fitness due to all forms of selection other than directional selection on z. Now, if we assume g(x) to be uncorrelated with z, which holds if indirect selection on z is either absent or included in the regression parameter [i.e. if g(x) is the residual of a regression of w on z alone], we can write (6) as

$$I_w = (\beta Z)^2 I_P + I_r. \tag{7}$$

Thus, an upper limit to the strength of directional selection is given by the ratio of the phenotypic coefficient of variation in fitness to that of the trait:

$$|\beta Z| \le \sqrt{\frac{I_w}{I_P}} = \frac{CV_w}{CV_P}. \tag{8}$$

As this limit is reached only when there is no residual selection in the population, we can assume that the strength of directional selection is usually much smaller. Nevertheless, if an estimate for the variance of fitness (or the fitness component affected by the trait) is available, Eq. (8) can be used along with IA or IP to put bounds on the potential response.
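The upper limit described above, in which the strength of directional selection cannot exceed the ratio of the coefficient of variation in relative fitness to that of the trait, can be sketched in Python as follows. The function names are ours. The trait values are those reported for UBW in Table 3, whereas the opportunity for selection Iw = 0.5 is a purely hypothetical value.

```python
import math


def max_selection_strength(i_w, phenotypic_var, trait_mean):
    """Upper bound on |beta * Z|, i.e. sqrt(I_w / I_P) = CV_w / CV_P."""
    i_p = phenotypic_var / trait_mean ** 2  # opportunity for selection on the trait
    return math.sqrt(i_w / i_p)


def max_percent_response(additive_var, phenotypic_var, trait_mean, i_w):
    """Largest % response per generation compatible with the variance in fitness."""
    i_a = additive_var / trait_mean ** 2
    return 100.0 * i_a * max_selection_strength(i_w, phenotypic_var, trait_mean)


# UBW from Table 3: var_A = 1.33, var_P = 4.51, mean = 20.59; I_w = 0.5 is assumed.
bound = max_selection_strength(0.5, 4.51, 20.59)
response_bound = max_percent_response(1.33, 4.51, 20.59, 0.5)
print(bound, response_bound)
```

Because the limit is reached only when no other selection acts in the population, values from such a calculation are best read as generous ceilings rather than as predictions.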

Materials and methods


The plants used in the quantitative genetic experiment were derived from seeds collected near Tulum, Territorio de Quintana Roo, Mexico (20°13′N, 87°26′W) in the spring of 1998. This population has a relatively large resin gland and its primary pollinators are medium-sized bees of the genus Euglossa (Armbruster, 1985). It coexists locally with small-glanded Dalechampia schottii, which is primarily pollinated by smaller bees including Hypanthidium spp. Another small-glanded species, D. heteromorpha, also occurs in the region. When conditions are favourable D. scandens will flower year round.

Fruits with seeds were collected from 84 separate individuals, and transported to the greenhouse of the Department of Biology, Norwegian University of Science and Technology, Trondheim. A subsequent ISSR-marker-based study including a dozen of these individuals revealed substantial genetic variation in this population (unpublished data). Several seeds from each fruit were germinated in March–May 1998. Plants were kept at 16 : 8 L : D photoperiod to promote growth and 11 : 13 L : D to stimulate flowering. Artificial light supplemented natural light as necessary. Parental individuals were crossed in a block diallel in October–December 1998. Mature, yet unopened, blossoms were emasculated and pollinated by applying pollen from a freshly opened male flower from the assigned sire. After pollination, each maternal blossom was labelled and bagged to prevent unintentional pollination and to collect the mature seeds. Two or more seeds from each cross were germinated during August–October 1999. Measurements on these were started in December 1999 and continued until September 2000.

Four additional populations were sampled for interpopulation comparisons. These include one additional Mexican population (Chetumal) and three Venezuelan populations (Caracas, Tovar and Puerto Ayacucho). All these populations are genetically and morphologically distinct, and the Puerto Ayacucho population appears to be as genetically different from the other Venezuelan populations as it is from the Mexican populations (based on unpublished ISSR data). These plants were housed together with, and treated similarly to, the Tulum population.

Blossom traits and covariates

The blossoms of D. scandens comprise a pair of large, showy, involucral bracts, usually 10 male flowers, three female flowers and a resin gland composed of 15–30 resin-secreting bractlets. The blossom morphology and the measurements used in this study are illustrated in Figs 1 and 2 and summarized in Table 1. A number of measures from two- or three-fold symmetries (taken to reduce measurement error and for use in a forthcoming study of developmental stability) were averaged into a single measure. Various composite ‘shape’ variables were constructed from the basic measurements.

Figure 1.

Exploded view and floral measurements of Dalechampia scandens. See Table 1 for definition of measurements.

Figure 2.

Side-view of Dalechampia scandens blossom, indicating gland–anther distance (GAD), gland–stigma distance (GSD) and anther–stigma distance (ASD).

Table 1.  Trait definitions and measurements.

| Trait | Definition | Observer | Measurement-error variance (no. of repeated measures) |
|---|---|---|---|
| Upper bract width (UBW) | | CP | 0.0071 (19) |
| Upper bract length (UBL) | (UBL_L + UBL_C + UBL_R)/3 | CP | 0.0230 (22) |
| Lower bract width (LBW) | | CP | 0.0076 (19) |
| Lower bract length (LBL) | (LBL_L + LBL_C + LBL_R)/3 | CP | 0.0378 (22) |
| Gland–anther distance (GAD) | | CP | 0.0074 (24) |
| Gland–stigma distance (GSD) | (GSD_L + GSD_C + GSD_R)/3 | CP | 0.0175 (54, from TFH) |
| Anther–stigma distance (ASD) | | CP | 0.0139 (24) |
| Central male flower diameter (CMD) | | TFH | 0.0048 (54) |
| Gland width (GW) | | CP | 0.0063 (54) |
| Gland height (GH) | (GH_L + GH_R)/2 | CP | 0.0018 (97) |
| Gland depth (GD) | (GD_L + GD_R)/2 | TFH | 0.0037 (54) |
| Peduncle length (PDL) | | TFH | 0.0223 (54) |
| Style length (SL) | (SL_L + SL_C + SL_R)/3 | TFH | 0.0020 (54) |
| Style width (SW) | (SW_L + SW_C + SW_R)/3 | CP | 0.00033 (118) |
| Gland number (GN) | # bractlets in gland | TFH | |
| Gland area (GA) | GH × GW | CP | 0.175 (97) |
| Gland ratio (GR) | 100 × GH/GW | CP | 0.877 (97) |
| Upper bract ratio (UBR) | 100 × UBL/UBW | CP | 0.203 (19) |
| Lower bract ratio (LBR) | 100 × LBL/LBW | CP | 0.229 (19) |
| Upper bract shape (UBS) | 100 × (UBL_L + UBL_R)/(2 UBL_C) | CP | 0.346 (22) |
| Lower bract shape (LBS) | 100 × (LBL_L + LBL_R)/(2 LBL_C) | CP | 0.319 (22) |
| GSD shape (GSDS) | 100 × (GSD_L + GSD_R)/(2 GSD_C) | CP | 17.34 (54, from TFH) |
| SL shape (SLS) | 100 × (SL_L + SL_R)/(2 SL_C) | TFH | 1.369 (54) |
| SW shape (SWS) | 100 × (SW_L + SW_R)/(2 SW_C) | CP | 3.92 (118) |

The first observer (CP) measured a total of 1046 blossoms and the second observer (TFH) a total of 387 from the Tulum population. There are a few missing observations for some of the traits. The measurement-error variance is computed as half the variance of the difference between two repeated measures of the same trait. These are based on a varying number (given in parentheses) of repeated measures. Units of the primary measures are in mm, except for GN, which is an integer. The subscripts L, C and R mean left, central and right. See Fig. 1 for illustration of the measures.

Measurements were made by two observers. The first observer (CP) measured a restricted set of traits for two blossoms of each plant. The second observer (TFH) measured a larger set of traits for one blossom each on a subset of individuals. Both observers used digital callipers with 0.01-mm precision. The extra traits measured by the second observer required dissection of the blossom under a stereoscope. Most analyses in this study are based on measurements from the first observer supplemented with measurements of the remaining traits by the second observer. The second observer made the measurements on the additional populations.

Each blossom goes through a series of well-defined ontogenetic stages. Initially the involucral bracts are shut tightly over the developing floral buds. When the bracts first open, the female flowers are receptive but no male flowers have yet opened. Thereafter, the bracts close during night and open during day. The blossoms usually remain in the female stage for 2 days before they enter the bisexual stage when male flowers start to open. The 10 male flowers are arranged in a determinate, three-branched inflorescence, with a central (terminal) flower surrounded by the three branches, each containing three flowers. The central flower is always the first to open, and remains the sole open flower for at least one and sometimes two days. This is ‘stage 1’. The second and third flowers to open are the central (terminal) flowers on the lateral branches. When one or two of these open, the blossom enters stage two and stage 3 respectively. The blossom also remains in stage 2 or 3 for at least 1 day. Thereafter, the remaining male flowers open successively. At some point the blossom may self-fertilize if female flowers were not already fertilized. After about a week, the male cymule abscises. Thereafter, the involucral bracts turn green and close permanently over the developing fruits. The fruits mature in about a month, the bracts open or abscise, and the fruits dehisce explosively to disperse the seeds.

To reduce ontogenetic variation, all measurements were made on blossoms in stages 1–3, and before abscission of the central male flower. Stage was also included as a fixed effect in most analyses involving data from the first observer, but not from the second observer, as there were fewer stage 2 and 3 blossoms in this set.

To control for temporal variation, the ‘day’ on which the blossom was measured was used as a random effect (two or three adjacent days were sometimes grouped together). This picks up effects due to day-to-day variation in the environment.

Breeding design and estimation of quantitative genetic parameters

The study was designed as a block diallel where 12 sets of five parental individuals were combined in complete 5 × 5 diallels with both reciprocals and selfed offspring. Two individuals were raised from each mating such that there were four full sibs from each parental pair. For measures from the second observer only one blossom from one individual from each mating was measured (except that two selfed sibs were included). All of the initial parents came from seeds collected on separate individuals in the field. Due to inadequate flowering or mortality, some parental individuals were replaced during the experiment, and then often by plants grown from seeds from the same maternal plant (we assumed these to be half sibs). Thus, in addition to selfed sibs, full sibs and half sibs, we also have some half sibs sharing at least one additional grandparent and some first half cousins (sharing at least one grandparent). The coefficients of coancestry and cofraternity as well as the genetic covariances of the relevant relatives are given in Table 2.

Table 2.  Genetic covariances of relatives.

| Type of relative | Θ | Δ | Genetic covariance |
|---|---|---|---|
| Selfed sibs | 1/2 | 1/2 | $\sigma_A^2 + \sigma_D^2/2$ |
| Full sibs | 1/4 | 1/4 | $\sigma_A^2/2 + \sigma_D^2/4$ |
| Half sibs | 1/8 | 0 | $\sigma_A^2/4$ |
| Half sibs (sharing one grandparent) | 1/8 + 1/32 | 1/16 | $5\sigma_A^2/16 + \sigma_D^2/16$ |
| First half cousins | 1/32 | 0 | $\sigma_A^2/16$ |

For each of the five types of relatives used in this experiment we show the coefficient of coancestry, Θ, which is the probability that two alleles drawn randomly from each relative are identical by descent, and the coefficient of cofraternity, Δ, which is the probability that the two relatives have single-locus genotypes identical by descent. This is used to compute the additive and dominance components of the genetic covariance between the relatives (see Lynch & Walsh, 1998, Chapter 7). This assumes that the parental individuals are not inbred and that the grandparents are not related.
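The covariances in Table 2 follow the standard rule that relatives with coancestry Θ and cofraternity Δ share a genetic covariance of 2Θ times the additive variance plus Δ times the dominance variance. The Python sketch below reproduces the tabulated coefficients with exact fractions. The dictionary and function names are ours and serve only as an illustration.

```python
from fractions import Fraction as F

# (coancestry Theta, cofraternity Delta) for each type of relative in Table 2.
RELATIVES = {
    "selfed sibs": (F(1, 2), F(1, 2)),
    "full sibs": (F(1, 4), F(1, 4)),
    "half sibs": (F(1, 8), F(0)),
    "half sibs sharing one grandparent": (F(1, 8) + F(1, 32), F(1, 16)),
    "first half cousins": (F(1, 32), F(0)),
}


def covariance_coefficients(theta, delta):
    """Coefficients on the additive and dominance variances: (2 * Theta, Delta)."""
    return 2 * theta, delta


def genetic_covariance(theta, delta, additive_var, dominance_var):
    """Expected genetic covariance between two relatives."""
    a_coef, d_coef = covariance_coefficients(theta, delta)
    return a_coef * additive_var + d_coef * dominance_var


for name, (theta, delta) in RELATIVES.items():
    a_coef, d_coef = covariance_coefficients(theta, delta)
    print(f"{name}: {a_coef} * var_A + {d_coef} * var_D")
```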

These patterns of relationship were implemented into PROC MIXED in SAS by use of the TYPE = LIN general linear variance structure. This entails using each unique parental pair as a random effect and then reading in matrices describing their pattern of variances and covariances. A typical model is


Parents is then a random effect with variance matrix $\sigma^2 \mathbf{A}$, where $\mathbf{A}$ is a relationship matrix and $\sigma^2$ is the variance component to be estimated. By specifying the entries in the A-matrix to correspond to the coefficients given in Table 2 we obtain an estimate of the additive genetic variance, $\sigma_A^2$. The dominance variance, $\sigma_D^2$, was estimated by adding a matrix with the appropriate coefficients given in Table 2. The maternal variance was estimated by adding the mother of each individual as an additional random effect. Note that maternal effects could be estimated independently of the genetic components due to reciprocal matings. Note also that a standard design with Dam and Sire as random effects would be inadequate as it would treat half sibs where the mother of one is the father of the other as unrelated individuals. For the larger data set, with two blossoms from each individual, the individual was included as a repeated effect. Day was included as a random effect to control for temporal variation. Stage of development is usually the only fixed effect.
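To give a feel for how such a relationship matrix might be laid out, the Python sketch below builds the additive coefficients among the unique parental pairs of a single 5 × 5 diallel. It is a simplified illustration under our own assumptions and not the matrices actually read into PROC MIXED. Selfed offspring and the extra relationships created by replacement parents are ignored, reciprocal crosses are treated as the same pair, and only the full-sib (1/2) and half-sib (1/4) additive coefficients from Table 2 are used.

```python
from fractions import Fraction
from itertools import combinations


def pair_relationship_matrix(parents):
    """Additive coefficients among offspring of all unordered parental pairs.

    Offspring of the same pair are full sibs (coefficient 1/2), offspring of
    pairs sharing one parent are half sibs (1/4), and offspring of pairs with
    no parent in common are treated as unrelated (0).
    """
    pairs = list(combinations(parents, 2))
    coefficient = {2: Fraction(1, 2), 1: Fraction(1, 4), 0: Fraction(0)}
    matrix = [
        [coefficient[len(set(p) & set(q))] for q in pairs]
        for p in pairs
    ]
    return pairs, matrix


# One hypothetical diallel block with five parents labelled A to E.
pairs, a_matrix = pair_relationship_matrix("ABCDE")
for pair, row in zip(pairs, a_matrix):
    print("".join(pair), [str(value) for value in row])
```

Because each pair shares a parent with six of the nine other pairs in a five-parent block, most off-diagonal entries are non-zero, which is why a simple Dam and Sire model would discard information.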

We analyzed selfed individuals separately from the rest. This was done because selfed individuals are expected to have different mean and residual variances, making the fitted model much more complex and computationally burdensome.


PROC MIXED in SAS 6.12 was used to fit the mixed model. Estimation method for variance components was restricted maximum likelihood based on a Newton–Raphson algorithm, and (empirical) generalized least squares were used to estimate the fixed effects (see Lynch & Walsh, 1998, Chapters 26 and 27 for details). Standard errors of the variance components were based on the observed Fisher matrix.

No transformations were used, as visual inspection showed all traits to have unimodal and fairly symmetric distributions, which could not be easily improved by any common transformation. Due to the large number of analyses, residuals were not systematically diagnosed for each individual analysis, but in general, residuals are more likely to be closer to a normal distribution than the variables themselves.

Measurement error was assessed by repeated measures on a subset of the sample. Although measurement error is negligible for most traits (Table 1), we did subtract the measurement variance from our estimates of phenotypic variance.
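The measurement-error variance described here, half the variance of the difference between two repeated measures, can be sketched in Python as follows. The data are invented for illustration, the function names are ours, and the use of the sample (n − 1) variance is our assumption.

```python
from statistics import variance


def measurement_error_variance(first, second):
    """Half the sample variance of the differences between repeated measures."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least two values")
    differences = [a - b for a, b in zip(first, second)]
    return 0.5 * variance(differences)


def corrected_phenotypic_variance(raw_variance, error_variance):
    """Phenotypic variance with the measurement-error variance subtracted."""
    return raw_variance - error_variance


# Hypothetical repeated measurements (mm) of one trait on four blossoms.
first = [10.0, 10.2, 9.9, 10.1]
second = [10.1, 10.1, 10.0, 10.0]
error_var = measurement_error_variance(first, second)
print(error_var)
```

Halving is needed because the difference between two independent measurements of the same blossom carries twice the error variance of a single measurement.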

Likelihood-ratio tests of whether variance components are larger than zero were based on comparing twice the increase in log-likelihood to the chi-square distribution with degrees of freedom equal to the difference in number of parameters.
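A test of this kind for a single added variance component can be sketched in Python as below. The log-likelihood values are hypothetical and the function name is ours. The sketch covers only one degree of freedom, for which the upper tail of the chi-square distribution can be written with the complementary error function.

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Likelihood-ratio statistic and P-value for one extra parameter.

    The statistic is twice the increase in log-likelihood. For one degree of
    freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical fits: adding one variance component raises the log-likelihood by 2.
statistic, p_value = likelihood_ratio_test_1df(-100.0, -102.0)
print(statistic, p_value)
```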


Results

Nongenetic components of variation

The blossoms show considerable temporal variation in many traits (quantified by the day variances in Tables 3 and 4). Some of this variation is on a longer temporal scale, but even if we include the month at which the blossom was observed as a covariate, there is still considerable variation on a short time scale as quantified by the day variance (not shown). The temporal variation was not a simple size effect, as some traits showed different seasonal patterns (not shown). We were not able to identify any variable that could easily explain this temporal variation. There was no evidence that the temporal variation was interacting with the genetic effects. Estimates of additive genetic variation with and without controlling for temporal effects were consistent (not shown). However, we did include day as a random effect to control for the temporal variation.

Table 3.  Evolvabilities, heritabilities, mean and components of variance (±SE).

| Trait | IA-evolvability (%) | h² | Phenotypic variance ($\sigma^2_P$) | Mean | Additive variance ($\sigma^2_A$) | Dominance variance ($\sigma^2_D$) | Individual variance | Day variance | Residual variance |
|---|---|---|---|---|---|---|---|---|---|
| UBW | 0.31 | 0.30 | 4.51 | 20.59 ± 0.24 | 1.33 ± 0.37 | 0.06 ± 0.44 | 0.31 ± 0.18 | 0.60 ± 0.25 | 2.94 ± 0.21 |
| UBL | 0.25 | 0.26 | 3.02 | 17.55 ± 0.21 | 0.78 ± 0.21 | 0.13 ± 0.29 | 0.17 ± 0.12 | 0.56 ± 0.22 | 1.93 ± 0.14 |
| LBW | 0.34 | 0.22 | 6.60 | 20.73 ± 0.31 | 1.47 ± 0.44 | −0.35 ± 0.56 | 0.53 ± 0.25 | 1.28 ± 0.49 | 4.06 ± 0.30 |
| LBL | 0.28 | 0.24 | 4.06 | 18.61 ± 0.23 | 0.97 ± 0.29 | −0.21 ± 0.37 | 0.41 ± 0.16 | 0.69 ± 0.27 | 2.51 ± 0.18 |
| GAD | 0.12 | 0.09 | 0.283 | 4.66 ± 0.05 | 0.026 ± 0.013 | 0.023 ± 0.031 | 0.001 ± 0.001 | 0.037 ± 0.015 | 0.239 ± 0.017 |
| GSD | 0.48 | 0.27 | 0.387 | 4.64 ± 0.06 | 0.103 ± 0.031 | −0.047 ± 0.039 | 0.034 ± 0.018 | 0.026 ± 0.014 | 0.293 ± 0.021 |
| ASD | 1.71 | 0.26 | 0.975 | 3.62 ± 0.08 | 0.250 ± 0.073 | −0.059 ± 0.104 | 0.059 ± 0.045 | 0.024 ± 0.020 | 0.781 ± 0.055 |
| CMD | 0.15 | 0.19 | 0.063 | 2.79 ± 0.03 | 0.012 ± 0.007 | −0.006 ± 0.019 | | 0.008 ± 0.004 | 0.054 ± 0.005 |
| GW | 0.11 | 0.08 | 0.577 | 6.61 ± 0.08 | 0.046 ± 0.024 | −0.006 ± 0.060 | 0.048 ± 0.026 | 0.079 ± 0.032 | 0.433 ± 0.031 |
| GH | 0.31 | 0.11 | 0.235 | 2.92 ± 0.05 | 0.026 ± 0.011 | −0.006 ± 0.024 | 0.008 ± 0.010 | 0.031 ± 0.012 | 0.185 ± 0.013 |
| GD | 0.35 | 0.45 | 0.067 | 2.96 ± 0.03 | 0.030 ± 0.011 | 0.037 ± 0.026 | | 0.005 ± 0.003 | 0.051 ± 0.005 |
| GN | 1.46 | 0.32 | 20.68 | 21.22 ± 0.49 | 6.58 ± 2.58 | 3.72 ± 5.89 | | 1.82 ± 1.05 | 15.57 ± 1.56 |
| PDL | 0.98 | 0.24 | 0.402 | 3.16 ± 0.06 | 0.098 ± 0.005 | −0.11 ± 0.13 | | 0.024 ± 0.022 | 0.351 ± 0.035 |
| SL | 0.49 | 0.28 | 0.683 | 6.27 ± 0.11 | 0.19 ± 0.09 | 0.27 ± 0.20 | | 0.18 ± 0.07 | 0.41 ± 0.04 |
| SW | 0.33 | 0.20 | 0.031 | 1.35 ± 0.02 | 0.006 ± 0.002 | −0.003 ± 0.003 | 0.005 ± 0.001 | 0.006 ± 0.002 | 0.017 ± 0.001 |
| GA | 0.72 | 0.10 | 27.17 | 19.56 ± 0.53 | 2.74 ± 1.19 | −1.18 ± 2.66 | 1.28 ± 1.20 | 3.69 ± 1.45 | 21.00 ± 1.52 |
| GR | 0.18 | 0.18 | 19.01 | 44.10 ± 0.37 | 3.51 ± 1.28 | 1.62 ± 2.52 | 1.16 ± 0.92 | 1.03 ± 0.51 | 15.94 ± 1.14 |
| UBR | 0.10 | 0.22 | 34.28 | 85.45 ± 0.77 | 7.43 ± 2.12 | −3.28 ± 2.56 | 2.37 ± 1.20 | 8.98 ± 3.30 | 19.43 ± 1.41 |
| LBR | 0.06 | 0.13 | 34.63 | 89.79 ± 0.83 | 4.66 ± 1.63 | −4.37 ± 2.61 | 4.85 ± 1.16 | 11.88 ± 4.09 | 15.79 ± 1.15 |
| UBS | 0.02 | 0.08 | 16.98 | 88.71 ± 0.40 | 1.44 ± 0.72 | −0.92 ± 1.64 | −0.24 ± 0.77 | 2.19 ± 1.23 | 14.64 ± 1.05 |
| LBS | 0.02 | 0.18 | 9.38 | 84.99 ± 0.29 | 1.70 ± 0.59 | 2.53 ± 1.46 | 0.02 ± 0.43 | 0.78 ± 0.42 | 8.06 ± 0.57 |
| GSDS | 0.05 | 0.07 | 46.52 | 81.91 ± 0.49 | 3.13 ± 2.51 | 10.27 ± 8.29 | 0.15 ± 3.16 | 1.85 ± 1.10 | 60.41 ± 4.27 |
| SLS | 0.008 | 0.08 | 9.80 | 96.66 ± 0.23 | 0.79 ± 1.02 | −10.12 ± 3.20 | | 0.10 ± 0.34 | 10.67 ± 1.03 |
| SWS | 0.01 | 0.03 | 29.34 | 98.69 ± 0.26 | 0.76 ± 0.84 | −4.14 ± 3.01 | −0.35 ± 1.62 | 0.22 ± 0.33 | 31.92 ± 2.26 |

  1. The model includes additive genetics, individual and day as random effects, and stage as a fixed effect (except that traits measured by the second observer do not include stage and individual). The dominance variance is estimated in a separate analysis that also included a dominance effect. The IA-evolvability is measured as $\sigma^2_A / Z^2$, where Z is the trait mean, the heritability is $\sigma^2_A / \sigma^2_P$, and the phenotypic variation is computed as inline image.

  2. Note: Chi-square likelihood-ratio tests for the additive variance are significant at P < 0.01 for all traits except GSDS (P = 0.12), SLS (P = 0.39) and SWS (P = 0.30). Tests for the dominance variance have P > 0.10 for all traits except GD (P = 0.09) and LBS (P = 0.04). Tests for the individual variance are significant at P < 0.05 for LBW, LBL, SW, UBR and LBR, and at P < 0.10 also for UBW, GSD and GW.
Table 4.  Mean and components of variance (±SE) for selfed individuals.

| Trait | Mean | Selfed-sib variance | Individual variance | Day variance | Residual variance | Phenotypic variance |
|---|---|---|---|---|---|---|
| UBW | 20.83 ± 0.25 | 0.90 ± 0.50 | 1.21 ± 0.55 | 0.65 ± 0.40 | 2.76 ± 0.40 | 5.51 |
| UBL | 17.74 ± 0.20 | 0.36 ± 0.30 | 0.92 ± 0.37 | 0.46 ± 0.29 | 1.71 ± 0.25 | 3.43 |
| LBW | 20.99 ± 0.34 | 1.43 ± 0.76 | 1.83 ± 0.84 | 1.37 ± 0.78 | 4.15 ± 0.62 | 8.77 |
| LBL | 18.84 ± 0.24 | 0.72 ± 0.49 | 1.47 ± 0.57 | 0.54 ± 0.36 | 2.62 ± 0.37 | 5.31 |
| GAD | 4.66 ± 0.06 | 0 | 0.045 ± 0.027 | 0.038 ± 0.024 | 0.208 ± 0.030 | 0.284 |
| GSD | 4.71 ± 0.07 | 0.073 ± 0.033 | 0.031 ± 0.039 | 0.024 ± 0.021 | 0.257 ± 0.039 | 0.368 |
| ASD | 3.86 ± 0.10 | 0.267 ± 0.109 | −0.035 ± 0.118 | 0.002 ± 0.032 | 1.03 ± 0.15 | 1.25 |
| CMD | 2.79 ± 0.03 | 0 | | 0.001 ± 0.008 | 0.049 ± 0.008 | 0.045 |
| GW | 6.69 ± 0.09 | 0.053 ± 0.042 | 0.085 ± 0.055 | 0.062 ± 0.035 | 0.333 ± 0.046 | 0.527 |
| GH | 2.94 ± 0.07 | 0.040 ± 0.020 | 0.036 ± 0.024 | 0.059 ± 0.027 | 0.133 ± 0.020 | 0.265 |
| GD | 2.97 ± 0.04 | 0.033 ± 0.012 | | 0.009 ± 0.007 | 0.044 ± 0.010 | 0.082 |
| GN | 21.57 ± 0.63 | 4.21 ± 3.44 | | 3.46 ± 2.48 | 18.05 ± 3.98 | 25.72 |
| PDL | 3.18 ± 0.08 | 0.14 ± 0.05 | | 0.053 ± 0.031 | 0.19 ± 0.05 | 0.36 |
| SL | 6.39 ± 0.11 | 0.011 ± 0.062 | | 0.20 ± 0.08 | 0.36 ± 0.09 | 0.57 |
| SW | 1.40 ± 0.02 | 0.007 ± 0.003 | 0.004 ± 0.003 | 0.006 ± 0.003 | 0.015 ± 0.002 | 0.032 |
| GA | 19.92 ± 0.71 | 3.32 ± 2.19 | 4.71 ± 2.75 | 5.47 ± 2.57 | 15.69 ± 2.25 | 29.02 |
| GR | 43.90 ± 0.62 | 6.29 ± 2.02 | −0.93 ± 2.09 | 3.31 ± 2.00 | 15.91 ± 2.44 | 23.70 |
| UBR | 85.44 ± 0.75 | 4.43 ± 1.88 | 1.86 ± 1.93 | 7.20 ± 3.13 | 12.00 ± 1.74 | 25.29 |
| LBR | 89.75 ± 0.94 | 6.11 ± 2.37 | 2.03 ± 2.33 | 12.37 ± 5.15 | 14.32 ± 2.12 | 34.60 |
| UBS | 88.06 ± 0.32 | 1.88 ± 1.28 | 1.48 ± 1.58 | 0.04 ± 0.42 | 11.34 ± 1.55 | 14.71 |
| LBS | 84.49 ± 0.38 | 1.78 ± 1.09 | −0.09 ± 1.41 | 0.82 ± 0.84 | 11.01 ± 1.53 | 13.20 |
| GSDS | 82.03 ± 0.55 | 0 | −4.55 ± 6.56 | 0 | 72.60 ± 9.83 | 51.93 |
| SLS | 96.49 ± 0.33 | 0.013 ± 1.33 | | 0.32 ± 0.80 | 9.53 ± 1.99 | 8.49 |
| SWS | 98.23 ± 0.43 | 0 | 1.18 ± 3.68 | 0.43 ± 1.40 | 37.15 ± 5.12 | 35.85 |

  1. The covariance of selfed sibs is inline image. The model includes the random effects corresponding to the listed variance components and stage as a fixed effect. The phenotypic variance is the sum of the estimated variance components minus measurement variance. Based on 222 selfed individuals and 77 parents (except for CMD, GD, GN, PDL, SL and SLS, which are based on 105 selfed individuals and 55 parents).

Maternal components of variance were very small or zero for all traits (not shown). Based on Akaike's information criterion, maternal effects were not included in the model for any trait.

Additive genetic variance and evolvability

Although all the traits, except the three female flower-shape variables (SLS, SWS and GSDS), show clear evidence of additive genetic variation, the IA-evolvabilities and heritabilities are generally small (Table 3). Heritabilities rarely exceed 0.3, and the coefficients of additive genetic variation tend towards the lower end of the range found by Houle (1992) for morphological traits. Evolvabilities, as measured by IA, are considerably less than 1% for most traits.
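The two summary statistics compared here can be computed directly from the variance components. The sketch below is illustrative only: the function names are hypothetical, and the numbers are the UBW and GN entries of Table 3 (additive variance, trait mean, phenotypic variance). It shows how two traits with similar heritability can differ several-fold in IA.

```python
def ia_evolvability_percent(additive_variance, trait_mean):
    """IA: additive genetic variance scaled by the squared trait mean, in percent."""
    return 100.0 * additive_variance / trait_mean ** 2


def heritability(additive_variance, phenotypic_variance):
    """h2: additive genetic variance as a fraction of phenotypic variance."""
    return additive_variance / phenotypic_variance


# (additive variance, trait mean, phenotypic variance) taken from Table 3.
traits = {
    "UBW": (1.33, 20.59, 4.51),
    "GN": (6.58, 21.22, 20.68),
}

for name, (var_a, mean, var_p) in traits.items():
    ia = ia_evolvability_percent(var_a, mean)
    h2 = heritability(var_a, var_p)
    print(f"{name}: IA = {ia:.2f}%, h2 = {h2:.2f}")
```

With these inputs the heritabilities are close (about 0.3 for both traits), whereas IA is roughly 0.3% for UBW and roughly 1.5% for GN.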

Dominance variance

There is no evidence for dominance variance. The estimates of $\sigma^2_D$ are usually small and they are negative as often as they are positive (Table 3). However, the low precision in these estimates makes it hard to conclude that nonadditive variance is without importance. For a few traits the estimated dominance variance is as large as or larger than the additive variance.

Within-individual variation

There is surprisingly little within-individual covariance for many traits (Tables 3 and 4). If the only source of similarity between two blossoms on the same plant was due to additive genetics we would expect the within-individual covariance to equal half the additive genetic variance (when the other half has been accounted for by a parental effect). Most estimates of within-individual covariance are less than this, and many are not even significantly different from zero.

One possible explanation for this puzzling result is that a history of inbreeding in the parental stock upwardly biases our estimate of additive genetic variance. Inbreeding in the parental lines will elevate the covariance among full sibs and half sibs by a factor 1 + f, where f is the inbreeding coefficient of the parents. Thus, our estimate of the additive genetic variance should be adjusted downwards by a factor of 1/(1 + f). The covariance between the two blossoms on the same (unselfed) individual is, however, unaltered, and may therefore appear smaller than expected.
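As a worked illustration of this adjustment, take the additive variance estimate for UBW from Table 3 (1.33) and a purely hypothetical inbreeding coefficient of f = 0.25 for the parents; the source does not report a value of f, so the number only shows the size of the effect under that assumption.

```latex
\hat{\sigma}^2_{A,\mathrm{adj}}
  = \frac{\hat{\sigma}^2_A}{1 + f}
  = \frac{1.33}{1 + 0.25}
  \approx 1.06,
\qquad
\tfrac{1}{2}\,\hat{\sigma}^2_{A,\mathrm{adj}} \approx 0.53 .
```

Under this assumed f, the within-individual covariance expected from additive effects alone would drop from about 0.67 to about 0.53.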

Selfed-sib variance and effects of selfing

In a purely additive model we expect the covariance of selfed sibs to equal the additive genetic variance. A comparison of selfed-sib variances in Table 4 with the additive genetic variances in Table 3 shows no evidence of excess variance. For some traits, the selfed variance is above the additive variance and for some it is below. Most of these differences are small and we see no reason to suspect that they reflect anything but estimation error. Most traits are slightly larger in the selfed individuals, but the differences are very small, and we conclude that there were no biologically significant differences between selfed and outcrossed individuals.

Evolvability and among-population variation

When variation among five distinct populations is plotted against IA-evolvability for all traits, it appears that traits with low evolvabilities differ little among populations whereas traits with moderate to higher levels of evolvability often display greater among-population divergence (Fig. 3; with trait mean values given in Table 5). Each population potentially interacts with different congeneric species, which may induce different selection pressures on the blossoms. Relative to the main study population (Tulum), the nearby population (Chetumal) has somewhat larger blossoms, whereas the three Venezuelan populations have smaller blossoms that are almost certainly adapted to pollination by smaller bees.

Figure 3.

Interpopulation variation in relation to evolvability: each point represents a trait. The interpopulation variation is measured as the variance among the five populations listed in Table 5 scaled by the square of the mean value for the Tulum population (similar results were obtained by scaling with the mean of the population means). The evolvabilities are from Table 3.
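The scaling used for the vertical axis of Fig. 3 can be sketched as follows, using the UBW means from Table 5. The function name is hypothetical, and the use of the sample variance (n − 1 denominator) is an assumption, since the caption does not say which variance estimator was used.

```python
from statistics import variance


def scaled_interpopulation_variance(population_means, reference_mean):
    """Variance among population means, scaled by the squared reference mean."""
    # Assumption: sample variance with an n - 1 denominator.
    return variance(population_means) / reference_mean ** 2


# UBW means from Table 5; Tulum is the reference population.
ubw_means = {
    "Puerto Aya.": 16.74,
    "Caracas": 15.60,
    "Tovar": 19.66,
    "Tulum": 20.43,
    "Chetumal": 21.54,
}

ubw_scaled = scaled_interpopulation_variance(
    list(ubw_means.values()), ubw_means["Tulum"]
)
print(f"UBW scaled interpopulation variance: {100 * ubw_scaled:.2f}%")
```

Because both this quantity and IA are variances divided by a squared mean, they are on the same dimensionless scale and can be plotted against each other directly.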

Table 5.  Trait mean measured in greenhouse for five different populations.

| Trait | Puerto Aya. (Hyp.(?)) | Caracas (Trigona) | Tovar (Hyp.(?)) | Tulum (Euglossa) | Chetumal (Euglossa(?)) |
|---|---|---|---|---|---|
| UBW | 16.74 ± 0.60 | 15.60 ± 0.39 | 19.66 ± 0.49 | 20.43 ± 0.12 | 21.54 ± 0.29 |
| UBL | 18.20 ± 0.54 | 14.54 ± 0.31 | 17.99 ± 0.39 | 17.72 ± 0.09 | 20.05 ± 0.28 |
| LBW | 15.51 ± 0.65 | 16.72 ± 0.44 | 21.08 ± 0.54 | 20.44 ± 0.15 | 22.34 ± 0.34 |
| LBL | 17.44 ± 0.56ᵃ | 15.82 ± 0.37 | 19.53 ± 0.49 | 18.75 ± 0.12 | 21.26 ± 0.34 |
| GAD | 3.32 ± 0.11 | 3.48 ± 0.09 | 3.55 ± 0.11 | 4.63 ± 0.03 | 4.96 ± 0.07 |
| GSD | 4.07 ± 0.12 | 4.50 ± 0.15 | 5.69 ± 0.16 | 4.78 ± 0.04 | 4.88 ± 0.08 |
| ASD | 3.27 ± 0.21 | 0.79 ± 0.16 | 1.29 ± 0.16 | 3.85 ± 0.05 | 4.22 ± 0.12 |
| CMD | 2.26 ± 0.05 | 3.06 ± 0.06 | 3.02 ± 0.05 | 2.80 ± 0.01 | 2.74 ± 0.03 |
| GW | 5.31 ± 0.15 | 5.04 ± 0.15 | 6.02 ± 0.11 | 6.36 ± 0.04 | 7.11 ± 0.08 |
| GH | 2.02 ± 0.09 | 1.72 ± 0.06 | 2.00 ± 0.05 | 2.74 ± 0.03 | 3.05 ± 0.06 |
| GD | 2.41 ± 0.04 | 2.29 ± 0.05 | 2.73 ± 0.05 | 2.97 ± 0.01 | 3.44 ± 0.09 |
| GN | 20.06 ± 0.85 | 17.51 ± 0.55 | 21.00 ± 0.39 | 21.47 ± 0.24 | 21.39 ± 0.50 |
| PDL | 1.66 ± 0.13 | 1.95 ± 0.09 | 1.93 ± 0.09 | 3.18 ± 0.03 | 4.00 ± 0.10 |
| SL | 5.36 ± 0.17 | 6.41 ± 0.05 | 7.01 ± 0.17 | 6.39 ± 0.04 | 6.49 ± 0.09 |
| SW | 1.05 ± 0.03 | 0.89 ± 0.02 | 0.81 ± 0.02 | 1.32 ± 0.01 | 1.29 ± 0.02 |
| GA | 10.84 ± 0.65 | 8.93 ± 0.51 | 12.13 ± 0.45 | 17.73 ± 0.26 | 21.99 ± 0.57 |
| GR | 38.18 ± 1.55 | 34.39 ± 0.77 | 33.45 ± 0.90 | 42.87 ± 0.28 | 42.87 ± 0.57 |
| UBR | 109.9 ± 2.29 | 93.84 ± 1.12 | 91.98 ± 1.15 | 87.22 ± 0.31 | 93.29 ± 0.60 |
| LBR | 114.5 ± 3.56 | 95.07 ± 1.08 | 92.92 ± 1.07 | 92.42 ± 0.36 | 95.24 ± 0.47 |
| UBS | 86.15 ± 0.90 | 87.41 ± 0.53 | 86.88 ± 0.51 | 88.93 ± 0.20 | 89.92 ± 0.32 |
| LBS | 78.99 ± 0.55ᵃ | 86.43 ± 0.52 | 86.86 ± 0.65 | 85.33 ± 0.16 | 86.54 ± 0.32 |
| GSDS | 87.66 ± 2.68 | 90.65 ± 1.99 | 85.44 ± 1.54 | 85.60 ± 0.40 | 85.98 ± 1.03 |
| SLS | 94.34 ± 0.84 | 96.89 ± 0.64 | 97.47 ± 0.96 | 96.63 ± 0.17 | 95.56 ± 0.34 |
| SWS | 96.35 ± 3.13 | 93.98 ± 1.42 | 93.04 ± 1.39 | 97.81 ± 0.31 | 92.62 ± 0.68 |

  1. Pollinator (in parentheses after each population name) is the genus of the principal bee pollinator. Euglossa are medium sized, Hypanthidium are small, and Trigona are very small. Larger bees of genus Eulaema are also important pollinators for many Dalechampia populations. Simple averages with standard errors are given. Data for the Tulum population are included for comparison, and are slightly different from the numbers given in Table 3 as they are based on measurements from the second observer, include selfed individuals and are not controlled for family effects. The gland–stigma distances (GSD) of the Caracas and Tovar populations are larger than what they are expected to be in the field. This may be a greenhouse artefact.

  2. ᵃ Seven unlobed individuals not included.

The number of bractlets in the gland (GN) is the conspicuous outlier in Fig. 3. Despite substantial evolvability (IA-evolvability ∼1.5%) this trait is almost invariant across populations. Due to the close correlation of GN with GA, a trait with different optima in the different populations, this cannot be due to uniform selection. It thus appears that the genetic variation in GN may not be useful for adaptation. We note, however, that the Tovar and Caracas populations do seem to have a somewhat different arrangement of bractlets than the other populations, and the Caracas population has clearly lost some bractlets, which probably contributed to the very small glands of this population.


Discussion

Although we found unequivocal evidence for additive genetic variance in nearly all of the floral traits examined, the main conclusion from this study is that the blossoms have limited short-term evolvability. The IA-evolvabilities predict that most traits can change only a fraction of a percent per generation unless selection is very strong. As illustrated in Fig. 4, if the strength of directional selection is about 0.1φ, it will take hundreds of generations to produce typical interpopulation differences. This is a substantial constraint for a species with a generation time of up to several years. Of course, this observation is still compatible with substantial changes on a macroevolutionary time scale.

Figure 4.

Evolutionary change and evolvability: percentage difference from the Tulum population is shown for all traits for the Chetumal population (◆) and the Caracas population (▮). The unbroken line illustrates the amount of evolutionary change roughly expected over 40 generations of uniform directional selection if the strength of selection is 1φ and the evolvability stays constant. The dashed line illustrates the amount of evolutionary change expected over 40 generations if the strength of selection is 0.1φ.
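The reasoning behind these lines can be reproduced with simple arithmetic. The sketch below assumes, as in the caption, that evolvability stays constant and that the proportional change accumulates linearly, so the change per generation is IA multiplied by the strength of selection in units of φ. The function name is hypothetical; the inputs are the UBW values from Tables 3 and 5.

```python
def generations_needed(percent_difference, ia_percent, strength_phi):
    """Generations of uniform directional selection needed to produce a given
    percentage difference, assuming a constant response of
    ia_percent * strength_phi percent per generation (linear approximation)."""
    per_generation_change = ia_percent * strength_phi
    return abs(percent_difference) / per_generation_change


# UBW: Tulum mean 20.43 and Chetumal mean 21.54 (Table 5), IA = 0.31% (Table 3).
tulum_mean, chetumal_mean, ubw_ia = 20.43, 21.54, 0.31
difference = 100.0 * (chetumal_mean - tulum_mean) / tulum_mean

for strength in (1.0, 0.1):
    gens = generations_needed(difference, ubw_ia, strength)
    print(f"strength {strength}φ: about {gens:.0f} generations")
```

For this trait the roughly 5% difference between Tulum and Chetumal would take about 18 generations at 1φ and about 175 generations at 0.1φ under these assumptions.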

This interpretation is tentative because it is based on the assumption that strengths of directional selection are typically much less than 1φ. No compilation of fitness elasticities exists in the literature. Recent reviews of selection strengths in nature (Hoekstra et al., 2001; Kingsolver et al., 2001; see also Endler, 1986) use phenotypic standard deviations and not trait mean values for standardization, and are thus not directly informative on this issue. These studies do show, however, that strengths of directional selection are very variable and sometimes quite large. Possible and typical strengths of selection need to be investigated in units of φ to put IA-evolvabilities in a firm empirical context.

If the Tulum population is representative of the species as a whole, we would expect that many populations are lagging behind in their adaptation to local environmental changes, for instance in the bee community or in the degree of competition from co-occurring Dalechampia. In a recent comparative study of interpopulation variation in gland area and gland–stigma distance, we found that although the effect of proximity to competitors conformed to predictions from a character–displacement hypothesis, only 10–20% of the spatial variation could be explained in this way (Hansen et al., 2000). In that study we suggested that the remaining variation might be due to variation in secondary selective factors not included in the model rather than lack of adaptation to the local environment. However, the low evolvabilities found in the current study suggest that local lag in adaptation is a reasonable alternative explanation.

It is intriguing that the predicted evolvability of each trait appears to be related to the degree of population diversification in that trait (Fig. 3), although the degree of scatter in the diagram makes this conclusion tentative. Notice, however, that no trait with very low evolvability shows much among-population variance. This is consistent with the idea that low evolvability is a reflection of constraint, although we cannot exclude alternative hypotheses, such as strong uniform stabilizing selection simultaneously removing genetic variation and keeping population mean values similar, or that frequent changes in trait mean values also lead to changes in the genetic architecture that facilitate variability.

It should be emphasized that the evolvabilities reported here are maximal values based on assuming that all the additive genetic variation in individual traits is available for adaptation. In reality, a large fraction of the variation in any one trait may be bound up in pleiotropy with other traits that do not necessarily experience concordant patterns of selection. For example, an unknown quantity of new mutational variation may be due to degenerative changes in housekeeping genes or signalling proteins with a multitude of functions. This may generate seemingly usable variation in any one character, but is unlikely to provide a basis for permanent evolutionary change. To study such pleiotropic constraints, we proposed the concept of conditional evolvability (Hansen et al., 2003). The conditional evolvability of a character y relative to a set of characters x refers to y's evolutionary potential when x is under stabilizing selection. We showed that the conditional evolvability could be obtained by replacing the additive genetic variance with the conditional additive genetic variance (i.e. the residual variance of a regression of the breeding value of y on the breeding value of x). This holds under reasonably general conditions and is approximately independent of the strength of stabilizing selection on x (Hansen, 2003). In a multivariate analysis of the data reported here, we found that conditioning on key traits such as gland area and bract size would often reduce evolvability by 50% or more (Hansen et al., 2003). These results underscore the limited evolutionary flexibility of the blossoms.
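For a single conditioning trait x, the residual variance of the regression described above has a simple closed form: the additive variance of y minus the squared additive covariance divided by the additive variance of x. The sketch below illustrates this special case only. The function names and all numerical values are hypothetical and are not estimates from the study, which conditioned on several traits at once.

```python
def conditional_additive_variance(var_y, var_x, cov_yx):
    """Residual variance of regressing the breeding value of y on that of a
    single trait x: var_y - cov_yx**2 / var_x."""
    if var_x <= 0.0:
        raise ValueError("additive variance of x must be positive")
    return var_y - cov_yx ** 2 / var_x


def conditional_evolvability_percent(var_y, var_x, cov_yx, mean_y):
    """IA computed from the conditional additive variance, in percent."""
    return 100.0 * conditional_additive_variance(var_y, var_x, cov_yx) / mean_y ** 2


# Hypothetical additive (co)variances and trait mean, for illustration only.
var_y, var_x, cov_yx, mean_y = 1.0, 2.0, 1.0, 20.0

unconditional = 100.0 * var_y / mean_y ** 2
conditional = conditional_evolvability_percent(var_y, var_x, cov_yx, mean_y)
print(f"unconditional IA = {unconditional:.3f}%, conditional IA = {conditional:.3f}%")
```

In this example the additive genetic correlation is about 0.71, and conditioning on x halves the evolvability of y, which is the order of reduction reported in the text.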

The heritabilities reported here were generally less than 0.3, which is not unusually low for plants. Low heritabilities in plants may be due to high levels of environmental variation caused by the relative plasticity of many plant traits, rather than lack of genetic variation (see review by Mitchell-Olds, 1996). Campbell (1997) provides an example where low heritability and high CVA were found in Ipomopsis life-history characters. But there are also studies that find high heritabilities of floral traits (e.g. Galen, 1996; Andersson, 1997). Galen (1996, 1999) further showed directly that Polemonium corolla widths are capable of a large response to selection by pollinators. Thus, our findings of low evolvability may or may not be typical.

We have argued that heritabilities should not be interpreted as measures of evolvability when selection is modelled in terms of fitness landscapes, selection gradients or elasticities. In Fig. 5a we plot heritability against IA for the traits reported in Table 3. This shows that heritability is indeed a poor predictor of genetic variance and of evolvability in our sense. This adds to similar results by Houle (1992), Messina (1993) and Campbell (1997). In fact, the only discernible signal in Fig. 5a is due to low heritabilities of some of the shape variables that are practically void of genetic variation. Furthermore, as demonstrated in Fig. 5b, heritabilities do not predict among-population variation. To this we may add the observation that heritabilities were as likely to increase as to decrease when trait variation was made conditional on other traits, although such conditioning necessarily decreases both variability and evolvability (Hansen et al., 2003).

Figure 5.

Heritability and evolvability: (a) plot of h2 against IA for the blossom traits in Table 3 and (b) plot of interpopulation variation, as in Fig. 3, against h2.

The conclusion of low evolvability of our study population is partially a straightforward empirical finding, but it is also influenced by a novel conceptual perspective where evolvability is operationalized as a predicted response to a given slope of the fitness function. This measure is designed to assess evolutionary potential in the context of varying selection regimes, as changes in the causal mechanisms of selection alter the fitness function. We have shown that heritabilities are both theoretically and empirically uninformative about evolvability in this sense, and conclude that the evolutionary potential of quantitative characters needs to be re-examined with more ecologically appropriate measures.


Acknowledgements

We thank Liv Antonsen, Linda Dalen and Torborg Berge for seed collection in the field. Thanks to Liv Antonsen, Trond Einar Brobakk, Linda Dalen, Mathilde Deveaud, Guri Fyhn-Hanssen, Trygve Kjellsen, Sigrid Lindmo, Ane Moe, Heidi Myklebost, Elisabeth Sørmeland and many others for greenhouse assistance. Thanks also to Gunnar Austrheim for moral support and use of his car. We are grateful to David Houle, Martin Morgan, Günter Wagner and anonymous reviewers for discussions and/or comments on the manuscript. We thank Martin Morgan for sharing his unpublished manuscript on fitness elasticities. This research was supported by grants (#123846/410, #123650/410 and #128830/410) from the Norwegian Research Council to TFH and WSA.