Testing the phenotypic gambit: phenotypic, genetic and environmental correlations of colour


J. D. Hadfield, Division of Biology, Imperial College London, Silwood Park, Ascot, Berkshire SL5 7PY, UK.
Tel.: +44-114-2220112; fax: +44-114-2220002;
e-mail: j.hadfield@shef.ac.uk


Evolutionary theory is primarily concerned with genetic processes, yet empirical testing of this theory often involves data collected on phenotypes. To make this tenable, the implicit assumption is often made that phenotypic patterns are good predictors of genetic patterns, an assumption termed the phenotypic gambit. Although this assumption has been validated for traits with high heritability, such as morphology, its generality for traits with low heritabilities, such as life-history and behavioural traits, remains controversial. Using a large-scale cross-fostering experiment, we were able to measure genetic, common environmental and phenotypic correlations between four colour traits and two skeletal traits in a wild population of a passerine bird, the blue tit (Parus caeruleus). Colour traits had little heritable variation but common environment effects were found to be important; skeletal traits showed the opposite pattern. Positive correlations arising from a shared natal environment were found between all traits, obscuring negative genetic correlations between some colour and skeletal traits. Consequently, phenotypic patterns were poor surrogates for genetic patterns and we suggest that this may be common if trade-offs or substantial parental effects exist. For this group of traits, the phenotypic gambit cannot be made and we suggest caution when inferring genetic patterns from phenotypic data, especially for behavioural and life-history traits.


Evolutionary processes can only be fully understood when both phenotypic and genetic data are available (Roff, 2002). However, estimating genetic parameters in many natural populations is problematic, and so the testing of evolutionary hypotheses is often based on phenotypic data alone. To justify this, the implicit assumption is made that phenotypic data are an adequate predictor of the underlying genetics (Lloyd, 1977; Maynard Smith, 1978; Grafen, 1984), an assumption termed the phenotypic gambit (Grafen, 1984). Despite the conceptual importance of the phenotypic gambit, very little work has been done to test its validity across a wide variety of traits. Although many authors continue to recommend caution (Roff, 1996; Lynch & Walsh, 1998; Steppan et al., 2002), it is still a major, and often unstated, assumption of many behavioural ecological studies.

Early examples of circumstances under which the phenotypic gambit would prove misleading were usually framed in the context of Mendelian inheritance (Lewontin, 1978). A case in point is the relationship between sickle cell anaemia and malaria in some human populations (Grafen, 1984). Persistence of the two sub-optimal phenotypes, those that die from sickle cell anaemia and those that have no resistance to malaria, would be baffling without the knowledge that the optimal phenotype is produced by the heterozygous genotype. More recently, through the popularization of quantitative genetic models in evolutionary biology (Lande, 1979), and the realization that many traits show polygenic inheritance (Boake, 1994), the phenotypic gambit has been extensively tested by comparing phenotypic and genetic correlations between traits.

Cheverud (1988) provided the first comprehensive comparison of phenotypic and genetic correlations, and concluded that the two were sufficiently similar to justify making evolutionary inferences from phenotypic data alone. Additional work on a wide range of taxa (Roff & Mousseau, 1987; Roff, 1995; Koots & Gibson, 1996; Bonnin et al., 1997; Reusch & Blanckenhorn, 1998; Waitt & Levin, 1998; Reale & Festa-Bianchet, 2000; Kominakis, 2003; Sheldon et al., 2003; Roff et al., 2004) has emphasized this point, but reservations about its generality have been expressed by the authors themselves and by others (see Willis et al., 1991 for a general discussion).

The single biggest limitation of these studies is that a limited number of highly heritable traits have been used to compare phenotypic and genetic correlations, yet many branches of evolutionary biology concern themselves with traits that are likely to have large environmental components (but see Roff & Mousseau, 1987; Hughes, 1995; Roff, 1996). This problem is compounded by the fact that most of the quantitative genetic estimates are taken from laboratory populations where environmental variance may be reduced (Willis et al., 1991; Simons & Roff, 1996, but see Weigensberg & Roff, 1996). The central role that heritability plays in the relationship between phenotypic and genetic correlations can be seen in the equation:


\[ r_P = r_G h_X h_Y + r_E \sqrt{(1 - h_X^2)(1 - h_Y^2)} \]

where rP, rG and rE are the phenotypic, genetic and environmental correlations between traits X and Y, respectively, and h_X^2 and h_Y^2 denote the heritabilities of traits X and Y respectively (Roff, 1997). It can be seen from this that the phenotypic correlation will tend towards the genetic correlation as the heritabilities of the two traits increase. Under such circumstances the environmental correlation will contribute little to the phenotypic correlation and rP will approach rG. On the other hand, as the heritabilities of the two traits decrease, environmental sources of variation would be expected to shift the phenotypic correlation away from the underlying genetic correlation. For this reason, rP will differ from rG for traits with little heritable variation unless environmental sources of variation perturb developmental pathways in the same way as genetic sources of variation (Cheverud, 1984). In this case, rP and rG would be similar despite the traits having low heritabilities because rE and rG would be expected to be similar. However, the notion that rE and rG should be of opposite signs has a long history in the study of life-history evolution (Lande, 1982; Charlesworth, 1984; Reznick, 1985; Lessells, 1991; de Jong & van Noordwijk, 1992). When rE and rG are dissimilar, phenotypic correlations would be expected to be a very poor surrogate for genetic correlations.
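The dependence of rP on heritability can be made concrete with a small numerical sketch of the standard decomposition r_P = r_G h_X h_Y + r_E sqrt((1 - h_X^2)(1 - h_Y^2)). The function name and all input values below are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
import math


def phenotypic_correlation(r_g, r_e, h2_x, h2_y):
    """Return r_P = r_G*h_X*h_Y + r_E*sqrt((1 - h_X^2)*(1 - h_Y^2)).

    h2_x and h2_y are heritabilities (h^2), so h_X*h_Y = sqrt(h2_x*h2_y).
    """
    genetic_part = r_g * math.sqrt(h2_x * h2_y)
    environmental_part = r_e * math.sqrt((1.0 - h2_x) * (1.0 - h2_y))
    return genetic_part + environmental_part


# Hypothetical inputs: genetic and environmental correlations of opposite sign.
high_h2 = phenotypic_correlation(r_g=-0.6, r_e=0.6, h2_x=0.9, h2_y=0.9)
low_h2 = phenotypic_correlation(r_g=-0.6, r_e=0.6, h2_x=0.1, h2_y=0.1)
print(round(high_h2, 2), round(low_h2, 2))
```

With these illustrative inputs the phenotypic correlation keeps the sign of rG when heritabilities are high, and takes the sign of rE when they are low.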

Using a highly replicated cross-fostering design, coupled with robust statistical procedures, we estimated genetic, phenotypic and common environmental correlations between four colour traits and two skeletal traits in a wild population of a passerine bird, the blue tit (Parus caeruleus). Using a methodology developed for the comparison of geometrical subspaces, we test whether the phenotypic gambit can be made for this group of traits, and whether genetic and environmental sources of variation operate in the same direction.


Study site and species

All work was carried out in the spring of 2002, on a population of blue tits at Silwood Park, a 100-ha deciduous woodland site 40 km west of London, UK (grid reference SU 939693). Of the 200 nest boxes monitored for nest building, 111 commenced egg laying and 107 successfully hatched chicks.

Previous work on the genetics of fledgling plumage colour in this species and the closely related great tit (Parus major) suggests that colour has a large environmental component (Slagsvold & Lifjeld, 1985; Horak et al., 2000; Fitze et al., 2003; Hadfield & Owens, 2006). These studies have focused on fledgling carotenoid-based chest colouration (Partali et al., 1987), and here we include a similar fledgling trait together with cap, back and wing colour, all of which appear to be less dominated by carotenoid pigments.

Adult plumage colouration in the blue tit has been well studied, and several lines of evidence suggest that it is under sexual selection. The structurally based cap colour has been found to be sexually dichromatic (Andersson et al., 1998; Hunt et al., 1998), and under mutual mate choice (Hunt et al., 1999). In addition, two manipulation studies have shown that female blue tits produce fewer male offspring (Sheldon et al., 1999) and provide less parental care (Limbourg et al., 2004) when the ultraviolet reflectance of their mates is experimentally reduced. The adult carotenoid-based chest colouration of the blue tit is less well studied, although a single study has found that this colour may indicate the level of parental care male blue tits provide (Senar et al., 2002). Recent work on the genetics of adult plumage colour also suggests that environmental sources of variation dominate genetic sources of variation (Hadfield et al., 2006). The functional significance of juvenile plumage traits remains unknown, but recent work suggests that they are condition dependent (Johnsen et al., 2003).

Cross-fostering protocol

Genetic sources of similarity between full siblings may be confounded with similarity arising through a shared environment (Lynch & Walsh, 1998). In birds, in particular, this shared environment arises because full siblings are raised in the same brood. To allow both sources of variation – genetic and common environment – to be estimated, we performed a reciprocal cross-fostering experiment (Riska et al., 1985). This involves pairing nests in which the chicks hatch on the same day, and the day after hatching, swapping an equal number of chicks between the two families. This results in roughly equal numbers of nestlings from two different families being raised together. A total of 102 nests were used in the cross-fostering experiment. The remaining five nests hatched on days when an odd number of nests hatched, and reciprocal cross-fostering was not possible.
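The pairing step of this protocol can be sketched in a few lines. The nest labels, the arbitrary pairing order within a hatch day, and the rule of swapping half of the smaller brood are all assumptions made for illustration; the text above specifies only that paired nests hatched on the same day and exchanged equal numbers of chicks.

```python
from collections import defaultdict


def pair_nests_by_hatch_day(hatch_day_by_nest):
    """Group nests by hatch day and pair them off; return (dyads, unpaired)."""
    nests_by_day = defaultdict(list)
    for nest, day in sorted(hatch_day_by_nest.items()):
        nests_by_day[day].append(nest)

    dyads, unpaired = [], []
    for day in sorted(nests_by_day):
        nests = nests_by_day[day]
        # Pair consecutive nests; with an odd count, one nest is left over
        # and cannot take part in a reciprocal swap.
        for i in range(0, len(nests) - 1, 2):
            dyads.append((nests[i], nests[i + 1]))
        if len(nests) % 2 == 1:
            unpaired.append(nests[-1])
    return dyads, unpaired


def chicks_to_swap(brood_size_a, brood_size_b):
    """Assumed rule: move half of the smaller brood in each direction."""
    return min(brood_size_a, brood_size_b) // 2


# Hypothetical nests: three hatch on day 1, two on day 2.
hatch_days = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2}
dyads, unpaired = pair_nests_by_hatch_day(hatch_days)
print(dyads, unpaired)
```

In practice pairs are unlikely to be formed arbitrarily, which is why hatch date and dyad are considered in the variance component analysis.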

On day 15 after hatching, the head-to-bill length and tarsus length of all surviving chicks were measured. Head-to-bill measurements were taken from the tip of the bill to the back of the cranium, and the tarsus measurements were taken from the posterior aspect of the tibiotarsus to the most distal undivided scute (Dhondt, 1982). A total of 840 chicks were measured from 105 nests. Blood samples were also taken from the brachial veins of the majority of chicks for the purpose of sex determination via molecular methods. In addition, three colour measurements were taken from each of four colour patches: the cap, the back, the primary coverts and the upper chest.

Colour measurement and quantification

The colour measurements were made using a well-established protocol (see Andersson, 1996 for details). An Avantes AVS-USB2000 miniature fibre optic spectrometer coupled to a Xenon pulsed light source (XE-2000, Avantes, Boulder, CO, USA) was used for measuring all reflectance spectra. The optic fibre was held at 90° to the colour patch and a Teflon white reference tile (WS-2, Avantes) was used to standardize the reflectance of each measurement. The three measurements taken for each patch on each individual were averaged.

Using the Spec package (http://www.bio.ic.ac.uk/research/iowens/spec), and data on the spectral sensitivity of the blue tit visual system (Hart et al., 2000), we reduced complicated spectral data into four quantal cone catches, denoted VS (very short wavelength), S (short), M (medium) and L (long) (Vorobyev et al., 1998). These cone catches were transformed into three log contrasts with the L cone catch as the denominator (Endler & Mielke, 2005; Hadfield & Owens, 2006). Across plumage traits, between 91% and 97% of the chromatic variation was associated with single opponency channels, and so the log-contrasts were projected onto these channels and treated as univariate measurements. High colour scores indicate relatively more reflection at shorter wavelengths.
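The log-contrast transformation can be illustrated as follows. This is a minimal sketch and not the Spec package: the function names, cone-catch values and channel vector are hypothetical, and the opponency channels used in the study were derived from the data rather than fixed in advance.

```python
import math


def log_contrasts(vs, s, m, l):
    """Return ln(VS/L), ln(S/L), ln(M/L) for four positive cone catches."""
    if min(vs, s, m, l) <= 0:
        raise ValueError("cone catches must be positive")
    return (math.log(vs / l), math.log(s / l), math.log(m / l))


def project(contrasts, channel):
    """Score a log-contrast vector along a channel (normalised to unit length)."""
    norm = math.sqrt(sum(c * c for c in channel))
    return sum(x * c for x, c in zip(contrasts, channel)) / norm


# Hypothetical cone catches and a hypothetical channel direction.
contrasts = log_contrasts(vs=0.10, s=0.20, m=0.30, l=0.40)
score = project(contrasts, channel=(1.0, 1.0, 1.0))
print(contrasts, score)
```

Because every contrast is a ratio to the L cone catch, multiplying all four catches by the same constant leaves the contrasts unchanged, so the scores describe chromatic rather than brightness variation.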

Molecular sex determination

Blood samples were stored in 95% ethanol, and DNA extracted using a Chelex®-based method (Walsh et al., 1991). The P2 and P8 primers were used to amplify the CHD-W and CHD-Z genes located on the avian sex chromosomes following Griffiths et al. (1998). The PCR products were separated by electrophoresis on a 10% polyacrylamide gel and visualized by ethidium bromide staining. A further six birds for which a blood sample was unavailable, or DNA amplification was unsuccessful, were sexed on the presence/absence of a brood patch in the following breeding year. Eighty-nine birds sexed by the presence/absence of a brood patch were also sexed using molecular methods; in all cases the two methods gave the same result.

Variance component analysis

Genetic and common environmental components of phenotypic (co)variance were estimated by fitting a REML multivariate mixed-effect model, with original and recipient nest fitted as random effects. The two nests that make up a cross-fostering pair (dyad) are generally not allocated at random and this can cause upward biases in the estimation of genetic variances (Merila & Sheldon, 2001). Hatch date was fitted as a fixed effect to control for the fact that nest pairs are not assigned randomly with respect to time, and dyad was fitted as a random effect to control for possible spatial nonrandomness because of logistical constraints. No significant dyad effect was found for any trait, so dyad was dropped from the model. Sex was fitted as a fixed effect to control for any sexual dimorphism. Birds from nonexperimental nests (five nests) were excluded from the analysis because genetic and common environmental sources of variation are confounded in these nests. In addition, chicks from two nests that had died before measurement, and chicks for which sexing was not possible, were also excluded. The final analysis included 751 chicks from 100 families.

Genetic (co)variances were calculated as twice the original nest component, as full siblings only share 50% of their genes (Lynch & Walsh, 1998). Nest (co)variances were equal to the recipient nest component, and phenotypic (co)variances were equal to the sum of the original nest, recipient nest and residual variance components. The sampling (co)variances of all parameters were obtained from the inverse of the information matrix, and standard errors for all (co)variance components and variance component functions were calculated using the Delta method (Appendix 1, Lynch & Walsh, 1998). The significance of each (co)variance was tested by comparing the likelihood of the full six-trait model with the likelihood from a model in which the (co)variance was fixed at zero. The likelihoods were compared using a likelihood ratio test with a single degree of freedom (Pinheiro & Bates, 2000). The significance of variance component functions [heritability and phenotypic (co)variances] was assessed using a t-test with a t-value equal to the estimate divided by its standard error, and degrees of freedom equal to 100.
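The arithmetic linking the fitted components to the reported quantities can be sketched as below. The component values and log-likelihoods are hypothetical, the function names are illustrative, and the sketch does not reproduce the REML fitting itself or the Delta-method standard errors.

```python
import math


def variance_components(original_nest, recipient_nest, residual):
    """Derive V_G, V_N, V_P and h2 from the three fitted components."""
    v_g = 2.0 * original_nest  # full siblings share half of their genes
    v_n = recipient_nest       # common (natal) environment
    v_p = original_nest + recipient_nest + residual
    return {"V_G": v_g, "V_N": v_n, "V_P": v_p, "h2": v_g / v_p}


def lrt_p_value(loglik_full, loglik_reduced):
    """Likelihood ratio test with one degree of freedom.

    For a chi-square variable with 1 d.f., P(X > x) = erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical component estimates and log-likelihoods, for illustration only.
parts = variance_components(original_nest=0.25, recipient_nest=0.10, residual=0.65)
p_value = lrt_p_value(loglik_full=-1200.0, loglik_reduced=-1201.92)
print(parts["h2"], round(p_value, 3))
```

The same doubling applies to covariances, so a genetic covariance between two traits is twice the original nest covariance between them.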

Although reciprocal cross-fostering is a powerful way of obtaining relatively clean estimates of additive genetic and common environmental variance, dominance variance and pre-cross-fostering effects are likely to upwardly bias estimates of additive genetic variance (Lynch & Walsh, 1998). However, a recent genetic analysis of chest colouration using an extended pedigree gave identical parameter estimates, suggesting that these effects may not be that large (Hadfield & Owens, 2006). Nevertheless, the expected discrepancy between estimates, even when these confounding effects are substantial, may be small for certain pedigree structures, and more work is needed to address the importance of dominance and early natal conditions for trait determination. The potential directional bias caused by confounded sources of variance is probably offset by the moderate levels of extra-pair paternity (11–14%; Kempenaers et al., 1997) prevalent in this species, but the bias these factors introduce into estimates of genetic correlations is unknown. In addition, quantitative genetic models are only robust to selection bias under restrictive assumptions, and these assumptions may not be met in this study. Further work is required to determine the magnitude by which selection on the parental population affects parameter estimates.

Matrix comparisons

The phenotypic, genetic and natal environment (co)variance matrices (P, G and N respectively) were compared using the methodology described by Krzanowski (1979, 2000) and applied to genetic data by Blows & Higgie (2003) and Blows et al. (2004). Krzanowski's comparison of subspaces is based on geometrical principles and therefore allows straightforward interpretation (Blows et al., 2004): the sets of n eigenvectors from the two covariance matrices define two n-dimensional spaces embedded in p-dimensional trait space; in this instance p = 6. n orthogonal axes are then rotated within each of these two n-dimensional spaces so that they are best aligned with each other, producing two test-statistics. The first is the set of angles between each pair of best-matched axes, so that an n-dimensional comparison produces n angular comparisons, with an angle of 0° meaning the two axes are perfectly aligned, and an angle of 90° meaning the two axes are orthogonal. The second test-statistic (the sum of the eigenvalues of S; see Krzanowski, 1979) has a value lying between 0 and n, and summarizes the overall similarity between the two component spaces. A value of 0 indicates no shared structure, and a value of n indicates identical component spaces. For these tests, n has to be equal to or less than half the dimensionality of the trait space, otherwise common axes between the two n-dimensional spaces will always be recovered (Krzanowski, 2000). In this study, comparisons of up to three dimensions are therefore possible, with the subspaces defined by the first three eigenvectors of each covariance matrix. These three dimensions describe 96% of the genetic variation, 96% of the brood variance and 68% of the phenotypic variance. A fuller description of the methods, and R code for implementing and visualizing the comparisons, can be found in the electronic appendix.
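The two summary statistics can be computed directly from a pair of covariance matrices. The following is an independent sketch in Python with NumPy, not the R code of the electronic appendix; it assumes the usual form S = AᵀBBᵀA, where the columns of A and B are the first n eigenvectors of the two matrices, and the function names and example matrices are hypothetical.

```python
import numpy as np


def leading_eigenvectors(cov, n):
    """Return the n eigenvectors with the largest eigenvalues as columns."""
    values, vectors = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(values)[::-1][:n]
    return vectors[:, order]


def krzanowski_comparison(cov_1, cov_2, n):
    """Return (sum of eigenvalues of S, critical angles in degrees)."""
    p = cov_1.shape[0]
    if 2 * n > p:
        raise ValueError("n must not exceed half the number of traits")
    a = leading_eigenvectors(cov_1, n)
    b = leading_eigenvectors(cov_2, n)
    s = a.T @ b @ b.T @ a
    # Eigenvalues of S are squared cosines of the angles between matched axes.
    eigenvalues = np.clip(np.linalg.eigvalsh(s), 0.0, 1.0)[::-1]
    angles = np.degrees(np.arccos(np.sqrt(eigenvalues)))  # increasing order
    return float(eigenvalues.sum()), angles


# Hypothetical six-trait matrices: identical, then with reversed major axes.
first = np.diag([6.0, 5.0, 4.0, 3.0, 2.0, 1.0])
reversed_axes = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
same_total, same_angles = krzanowski_comparison(first, first, 3)
diff_total, diff_angles = krzanowski_comparison(first, reversed_axes, 3)
print(same_total, same_angles, diff_total, diff_angles)
```

Identical matrices give a sum equal to n with all angles at 0°, whereas matrices whose leading subspaces share no direction give a sum of 0 with all angles at 90°.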

All traits were scaled to have unit variance. This method allows a compromise between matrix comparisons dominated by traits with large variances (Krzanowski, 2000) and matrix comparisons that test for differences in the correlation structure without acknowledging differences in the heritabilities of the traits. Because skeletal and colour traits are measured on different scales, the variances of the two trait types differ by two orders of magnitude and the first issue may be severe. The second issue is more general, as evolutionary responses to selection are dependent on genetic covariances, not genetic correlations (Willis et al., 1991).

Confidence intervals around the matrix comparison summary statistics were estimated by running the full multivariate model on data bootstrapped over dyads (4000 replicates). The full six-trait model converged in 78% of the bootstrap runs for data transformed into scaled PCA scores. Projection of the data onto orthogonal axes does not affect the results of the matrix comparisons (McGuigan et al., 2003).
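Resampling at the level of the dyad keeps each pair of cross-fostered nests, and all of their chicks, together in every bootstrap replicate. The sketch below shows one way this could be organised; the record layout, the `replicate` label used to keep repeated draws of the same dyad distinct, and the nearest-rank percentile rule are assumptions rather than details taken from the analysis.

```python
import random


def bootstrap_over_dyads(records, rng):
    """Resample whole dyads with replacement, keeping their chicks together."""
    by_dyad = {}
    for record in records:
        by_dyad.setdefault(record["dyad"], []).append(record)
    dyad_ids = sorted(by_dyad)

    sample = []
    draws = rng.choices(dyad_ids, k=len(dyad_ids))
    for replicate, chosen in enumerate(draws):
        for record in by_dyad[chosen]:
            # The replicate label lets a dyad drawn twice be treated as two
            # separate dyads (assumed handling; nest labels need the same).
            sample.append({**record, "replicate": replicate})
    return sample


def percentile_interval(values, lower=2.5, upper=97.5):
    """Nearest-rank percentile interval of a bootstrap distribution."""
    ordered = sorted(values)

    def pick(q):
        return ordered[min(len(ordered) - 1, int(q / 100.0 * len(ordered)))]

    return pick(lower), pick(upper)


# Hypothetical records: three dyads with a few chicks each.
records = [{"dyad": d, "chick": i} for d in ("d1", "d2", "d3") for i in range(4)]
sample = bootstrap_over_dyads(records, random.Random(1))
print(len(sample), percentile_interval(list(range(100))))
```

Each replicate data set would then be passed to the mixed model, and the summary statistics from the converged runs would form the bootstrap distribution.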

Tarsus length vs. back colour

As a graphical example of the genetic, common environment and phenotypic relationships between a skeletal and colour trait, we predicted BLUP breeding values and BLUP nest values by fitting an animal model (Mrode, 1996) with animal (genetic) and recipient nest as random effects. The animal model is exactly equivalent to the mixed model with recipient and original nest as random effects. The genetic variance component, however, is estimated directly, rather than being derived as twice the covariance between full siblings, which allows breeding values to be predicted for graphical purposes (Fig. 1).

Figure 1. Relationship between tarsus length and back colour at (a) the phenotypic (weak and nonsignificant correlation), (b) genetic (negative correlation), and (c) common environmental level (positive correlation). An ‘animal model’ was fitted to predict the breeding values. The estimates from the animal model and the standard mixed model are equivalent.

All statistical models and code were implemented in R (R Development Core Team, 2004), except the mixed models, which were fitted using ASReml (Gilmour et al., 2002).


Variance components

The phenotypic covariance matrix (see Table 1) shows significant and positive relationships between tarsus and head, and between the three colour patches: cap, back and wing. Significant phenotypic relationships between skeletal and colour traits are limited to cap colour, which shows a positive relationship with both tarsus and head lengths. Chest colour showed little phenotypic covariance with any trait except back colour.

Table 1.   Phenotypic matrix with variances along the diagonal, covariances in the upper triangle and correlations in the lower triangle.

Trait | Tarsus | Head | Cap | Back | Wing | Chest
Tarsus | 0.89 ± 0.06** | 0.37 ± 0.05** | 0.11 ± 0.04** | −0.04 ± 0.04 | 0.07 ± 0.04 | 0.03 ± 0.04
Head | 0.40 | 0.98 ± 0.07** | 0.16 ± 0.05** | 0.04 ± 0.04 | 0.02 ± 0.05 | 0.06 ± 0.05
Cap | 0.13 | 0.18 | 0.86 ± 0.07** | 0.23 ± 0.04** | 0.37 ± 0.06** | −0.02 ± 0.04
Back | −… | … | … | … ± 0.05** | 0.21 ± 0.04** | 0.13 ± 0.04**
Wing | 0.07 | 0.02 | 0.42 | 0.22 | 0.90 ± 0.06** | 0.06 ± 0.04
Chest | 0.04 | 0.06 | −… | … | … | … ± 0.06**

*P < 0.05; **P < 0.01.
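The lower-triangle correlations in Table 1 follow from the upper-triangle covariances and the diagonal variances through r = cov(X, Y) / sqrt(var(X) var(Y)). As a quick check, the sketch below applies this to two pairs of traits using the values reported in the table; the function name is illustrative.

```python
import math


def correlation_from_covariance(cov_xy, var_x, var_y):
    """Convert a covariance to a correlation using the two variances."""
    return cov_xy / math.sqrt(var_x * var_y)


# Values taken from Table 1 (phenotypic matrix).
tarsus_head = correlation_from_covariance(0.37, var_x=0.89, var_y=0.98)
cap_wing = correlation_from_covariance(0.37, var_x=0.86, var_y=0.90)
print(round(tarsus_head, 2), round(cap_wing, 2))
```

Rounded to two decimal places these give 0.40 and 0.42, matching the tabulated tarsus–head and cap–wing correlations.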

In accordance with a number of previous studies of avian morphology, skeletal measures were found to be highly heritable (h² tarsus = 0.56, P < 0.01, head = 0.17, P < 0.01) (see Merila & Sheldon, 2001). In contrast, the colour of the four patches was found to be less heritable (h² cap = 0.12, P = 0.028, back = 0.09, P = 0.032, wing = 0.05, P = 0.25, chest = 0.12, P = 0.027) (see Table 2). The genetic correlations between the four colour patches, and between tarsus and head, were positive, whereas the genetic correlations between the colour and skeletal traits tended to be negative. Significant genetic correlations were found between head and tarsus (r = 0.59, P < 0.01), and between back colour and tarsus (r = −0.68, P < 0.01). Significant common environment effects were found for all traits, and explained up to a third of the variation for some colour patches. Correlations because of a shared environment were positive between all traits, and often significant (see Table 3).

Table 2.   Genetic matrix with heritabilities along the diagonal, covariances in the upper triangle and correlations in the lower triangle.

Trait | Tarsus | Head | Cap | Back | Wing | Chest
Tarsus | 0.56 ± 0.09** | 0.17 ± 0.06** | −0.04 ± 0.05 | −0.15 ± 0.05** | 0.06 ± 0.05 | 0.05 ± 0.06
Head | 0.59 | 0.17 ± 0.06** | −0.05 ± 0.04 | −0.05 ± 0.04 | −0.01 ± 0.04 | 0.04 ± 0.05
Cap | −0.17 | −0.40 | 0.12 ± 0.05* | 0.03 ± 0.04 | 0.08 ± 0.03** | 0.01 ± 0.04
Back | −0.68 | −0.44 | 0.28 | 0.09 ± 0.06* | 0.01 ± 0.04 | 0.04 ± 0.04
Wing | 0.44 | −… | … | … | … ± 0.05 | 0.05 ± 0.04
Chest | 0.21 | 0.32 | 0.12 | 0.40 | 0.72 | 0.12 ± 0.06*

*P < 0.05; **P < 0.01.

Table 3.   Common environment matrix with the ratio of common environment variances to phenotypic variances along the diagonal, covariances in the upper triangle, and correlations in the lower triangle.

Trait | Tarsus | Head | Cap | Back | Wing | Chest
Tarsus | 0.07 ± 0.04** | 0.11 ± 0.03** | 0.09 ± 0.03* | 0.07 ± 0.02** | 0.04 ± 0.03 | 0.02 ± 0.03
Head | 0.82 | 0.29 ± 0.06** | 0.15 ± 0.05** | 0.03 ± 0.03 | 0.03 ± 0.04 | 0.03 ± 0.04
Cap | 0.64 | 0.48 | 0.41 ± 0.06** | 0.15 ± 0.04** | 0.28 ± 0.05** | 0.01 ± 0.04
Back | 0.79 | 0.16 | 0.69 | 0.13 ± 0.04** | 0.16 ± 0.05** | 0.10 ± 0.03**
Wing | 0.26 | 0.09 | 0.81 | 0.77 | 0.37 ± 0.05** | 0.03 ± 0.04
Chest | 0.… | … | … | … | … | … ± 0.05**

*P < 0.05; **P < 0.01.

Matrix comparisons

Three sets of matrix comparisons are presented: those between P and G, those between N and G, and those between N and P. One-, two-, and three-dimensional comparisons are made for each pair of matrices. The vectors, planes and solids compared in each test are described by the first, the first and second, and the first, second and third eigenvectors of each covariance matrix respectively. The angle between the dominant eigenvector of P and that of G (Schluter's gmax; Schluter, 1996) is 67.5°, suggesting that the two axes are randomly orientated with respect to each other (see Table 4). Contrary to the idea of genetic and environmental sources of variation co-varying, the dominant eigenvectors of G and N are nearly orthogonal: 78.8°, with less than a 0.01 probability of being less than 45°. Inclusion of more dimensions reveals some shared eigenstructure between the three matrices, in particular P and G, but large angles are still associated with vectors closest to the dominant eigenvectors of G and N.

Table 4.   Krzanowski's comparison summary statistics.

Component space | Sum of eigenvalues of S | Corresponding critical angles (degrees)
P vs. G
1 | 0.15 (0.00–0.73) | 67.5 (31.1–88.8)
2 | 1.30 (0.75–1.69) | 20.4 (6.89–38.4) | 48.4 (27.9–86.4)
3 | 2.03 (1.51–2.52) | 5.97 (0.26–17.6) | 21.9 (8.64–47.7) | 62.8 (35.8–88.7)
N vs. G
1 | 0.04 (0.00–0.34) | 78.8 (54.5–89.5)
2 | 0.82 (0.23–1.24) | 36.6 (11.7–63.9) | 67.1 (49.2–88.4)
3 | 1.76 (1.21–2.13) | 9.43 (0.40–29.4) | 33.5 (13.9–58.7) | 72.0 (55.1–89.0)
N vs. P
1 | 0.92 (0.18–0.99) | 16.2 (6.9–64.8)
2 | 1.68 (0.99–1.89) | 9.85 (4.01–18.8) | 32.2 (17.4–82.1)
3 | 2.67 (2.26–2.86) | 3.25 (0.15–9.32) | 12.8 (4.75–26.0) | 30.4 (18.6–53.7)

Comparison of subspaces where component spaces of 1–3 refer to the number of dimensions included in the comparison. Critical angles are in increasing order within each comparison. All statistics are the median values from the bootstrapped distribution and the 95% confidence intervals are in parentheses.

Tarsus length vs. back colour

As an illustration of the between-group correlations, Fig. 1 shows the relationship between tarsus length and back colour at the phenotypic, genetic and common environmental level. In this pair of traits, there is a negative genetic correlation (r = −0.68, P < 0.01), a positive natal effect correlation (r = 0.79, P < 0.01) but no correlation at the phenotypic level (r = −0.04, P = 0.18).


The overall aim of this study was to test the quantitative genetic assumptions made by the phenotypic gambit in the context of avian plumage colour. As such, this is the first study to test whether significant differences exist between P and G, and whether differences exist between G and an explicit component of environmental variation. This study demonstrates that in the context of bird colouration substantial differences exist between phenotypic and genetic relationships, and particularly between genetic and environmental relationships. In light of this, the phenotypic gambit cannot be made for the traits studied here, and caution needs to be exercised when inferring genetic processes from phenotypic data alone.

Most previous studies have shown a high correspondence between phenotypic and genetic correlations, but they have tended to focus on highly heritable traits connected with growth. This bias was noted by Roff (1996), who suggested that for certain types of trait, such as life-history and behavioural traits, the equality of P and G may not hold. However, no firm conclusion could be reached on whether the discrepancy between phenotypic and genetic correlations was real or because of measurement errors in the genetic correlations. Here we show that for certain types of trait, sampling variance cannot wholly account for the observed difference between P and G.

For such traits the relationship between P and G is primarily determined by the relationship between G and the environmental (co)variance matrix. It has been proposed that genetic and environmental sources of variation are likely to perturb developmental pathways in a similar way (Cheverud, 1984) and that the resulting correspondence between the two matrices will lead to a strong relationship between P and G. Contrary to this idea, we show that environmental effects shared by individuals reared in the same brood act orthogonally to genetic sources of variation for this group of traits, and that important genetic patterns can be obscured at the level of phenotypes.

For example, the relationship between nestling tarsus length (which has been used as a reliable index of health in many nestling passerines; see Senar et al., 2002 and references therein) and a colour patch is very different at the genetic, natal and phenotypic levels. Here we show that the negative relationship between tarsus length and back colour has a genetic basis, yet a positive relationship arises at the level of the nest (see Fig. 1). At the phenotypic level, there is no correlation between these two traits, and so conclusions drawn by inferring genetic correlations from phenotypic data, or by estimating genetic correlations without controlling for parental effects, may be especially misleading.

The idea that genetic and environmental correlations between two traits can have opposing signs has a long history in the study of life histories (de Jong & van Noordwijk, 1992; Roff, 2002), and can be expressed in terms of allocation–acquisition rules (van Noordwijk & de Jong, 1986): if two traits are competing for a common resource, allocation to one of those traits necessarily reduces the amount of resource available to the second trait, and the two traits will show negative covariation. However, if the amount of resource also varies, then those individuals obtaining high levels of a given resource will be able to invest more in both traits than an individual that has access to low levels of the resource. In this case, positive covariation will exist between the two traits. In altricial birds, resource acquisition during growth is primarily determined by parental effects (Starck & Ricklefs, 1998), and will be experienced as an environmental effect. Allocation of those resources between traits, however, will be determined within an individual, and may have a strong genetic basis. As many traits interesting to evolutionary biologists are believed to be both constrained by trade-offs (Roff, 2002) and determined by parental effects (Mousseau & Fox, 2000), we suggest negative covariation between genetic and environmental effects may be common. Under these conditions, it may be prudent to be cautious when inferring evolutionary significance from phenotypic data alone.
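The allocation–acquisition argument can be illustrated with a small simulation in which each individual acquires an amount of resource and divides it between two traits. This is a minimal sketch under assumed distributions: the normal distributions, their means and standard deviations, the sample size and the function names are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_traits(sd_acquisition, sd_allocation, n=2000, seed=1):
    """Correlation between two traits that share one acquired resource."""
    rng = random.Random(seed)
    trait_1, trait_2 = [], []
    for _ in range(n):
        # Resource acquired (assumed mean 10) and share given to trait 1.
        resource = max(rng.gauss(10.0, sd_acquisition), 0.0)
        share = min(max(rng.gauss(0.5, sd_allocation), 0.0), 1.0)
        trait_1.append(share * resource)
        trait_2.append((1.0 - share) * resource)
    return pearson(trait_1, trait_2)


acquisition_dominated = simulate_traits(sd_acquisition=3.0, sd_allocation=0.02)
allocation_dominated = simulate_traits(sd_acquisition=0.2, sd_allocation=0.10)
print(round(acquisition_dominated, 2), round(allocation_dominated, 2))
```

When individuals differ mainly in how much resource they acquire, the two traits covary positively; when they differ mainly in how they allocate it, the traits covary negatively, which mirrors the contrast between nest-level and genetic correlations described above.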

This study has focused on the differences between the phenotypic and genetic associations between traits. Although these differences may change the rate and trajectory with which selective optima are approached, they are only likely to prevent stationary selective optima being reached when the genetic covariance matrix is singular. We lacked sufficient power to test this hypothesis, although work on Drosophila suggests that absolute constraints may be rare (Mezey & Houle, 2005). However, information regarding the dynamics and topology of adaptive landscapes remains critical in assessing the degree to which genetic associations between traits can alter the outcome of an evolutionary process, and we suggest empirical work is needed in this area (Arnold et al., 2001).


We thank M. Blows and S. Chenoweth for help with the matrix comparisons, and N. Hart for kindly providing data on the blue tit visual system. We also thank S. Clegg for help during data collection, and A. Phillimore, D. Orme, A. Lord, M. Burgess, B. Clark, J. Metcalf, B. Sheldon, T. Coulson and E. Svensson for comments on the manuscript. This work was funded by the Natural Environment Research Council.