Recent research regarding correlations among behaviors, under the labels of behavioral syndromes and animal personalities, has typically assumed that phenotypic correlations between behaviors are representative of underlying genetic correlations. However, for behaviors, the concordance between phenotypic and genetic correlations has not been rigorously examined. I tested this assumption using published estimates and found phenotypic and genetic correlations to be strongly related, but the average absolute difference between the two was quite high and similar to that observed in other traits. Using absolute differences as the sole criterion, phenotypic correlations do not reliably estimate the magnitude of genetic correlations for behaviors, which is problematic for behavioral syndrome researchers. However, phenotypic correlations explained 75% of the variation in genetic correlations and their sign was typically the same as that of genetic correlations. This suggests that phenotypic correlations between behaviors reliably estimate the direction of underlying genetic relationships and provide considerable information regarding the magnitude of genetic correlations. Thus, if researchers are careful about the questions they ask, phenotypic correlations between behaviors can be informative regarding underlying genetic correlations and their evolutionary implications.

The estimation of genetic covariances and correlations as well as other quantitative genetic parameters represents a fundamental limitation on the inferences that can be drawn in evolutionary ecological research. Genetic correlations must be estimated to properly understand how selection can change the distribution of phenotypes in a population (Lande and Arnold 1983; Phillips and Arnold 1989) and can introduce constraints on the ability of populations to respond to selection by limiting the evolutionary trajectories available in multivariate trait space (Blows and Hoffmann 2005; Roff and Fairbairn 2007). Although pivotal to our evolutionary understanding, estimating genetic correlations requires both large sample sizes and information regarding relatedness among individuals (Lynch and Walsh 1998). Unfortunately, the former is often limited by either the system or question being asked and the latter is typically unknown—particularly for research in natural populations.

The logistically prohibitive nature of estimating quantitative genetic parameters is well recognized and has led some authors to suggest that phenotypic measurements might be a suitable proxy for genotypic values (Cheverud 1988; Grafen 1984; Roff 1996, 1997). The ability to use phenotypic estimates in lieu of genetic parameters is particularly relevant to behavioral ecologists, who often assume that the phenotypic variation they observe is representative of underlying genetic variation. Grafen (1984) called this assumption the “phenotypic gambit”; however, its appropriateness for behavioral traits is unclear.

One area of behavioral research in which the phenotypic gambit has been heavily relied upon has been in the study of correlations between behaviors. Behavioral correlations have become a topic of considerable research interest over the last several years under the label of “behavioral syndromes” or “animal personalities” (Sih et al. 2004; Réale et al. 2007). The study of behavioral syndromes has found phenotypic correlations among a wide array of behaviors for many species (e.g., fish [Bell 2005; Dingemanse et al. 2007]; mammals [Réale and Festa-Bianchet 2003; Dochtermann and Jenkins 2007]; birds [Dingemanse et al. 2004]; and arthropods [Reaney and Backwell 2007]). The behavioral traits studied in syndrome research have also been found to be correlated with fitness and the correlations among behaviors likely to generate fitness trade-offs (Smith and Blumstein 2008). Further, the behavioral correlations revealed by syndrome research require researchers to consider behaviors from a multivariate perspective and thus represent a convergence of quantitative genetics with behavioral ecology (Dochtermann and Roff 2010).

As with correlations among morphological and life-history traits, the correlations inherent in behavioral syndromes are of evolutionary interest due largely to their implications for the evolutionary trajectories available to populations (Dochtermann and Roff 2010; Sih et al. 2004). Unfortunately, the majority of behavioral syndrome research to this point has been restricted to the identification of phenotypic correlations or has examined proximate causes such as the hormonal or neurological basis of behavioral variation. Far less research has examined the genetic architecture of behavioral syndrome structures from a quantitative genetics perspective (but see Bell 2005; Dingemanse et al. 2009; Réale et al. 2009; and van Oers et al. 2004 for notable exceptions). As a result, behavioral syndrome research—and the evolutionary inferences drawn therein—has largely been conducted without knowledge of whether the observed phenotypic correlations correspond to underlying genetic correlations.

The general issue of whether phenotypic correlations correspond to genetic correlations has, however, been addressed in the broader evolutionary literature, although not specifically for behaviors. Cheverud (1988) suggested that modifying a phenotypic covariance matrix by the heritability of traits might provide an appropriate proxy of genetic covariance matrices for the estimation of multivariate responses to selection. More informally, “Cheverud's conjecture” (Roff 1995) suggests that phenotypic correlations might be useful estimates of genetic correlations (Roff 1995, 1996, 1997). The validity of evolutionary inferences drawn in behavioral syndrome research thus depends on the validity of Cheverud's conjecture for behavioral traits.
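
To make the proposal concrete, the following minimal sketch compares a predicted multivariate response to selection computed from a genetic covariance matrix (G multiplied by a selection gradient β; Lande and Arnold 1983) with the response predicted from a phenotypic covariance matrix P scaled by heritability. All matrices, the selection gradient, and the heritability are hypothetical illustration values, and the use of a single heritability shared by both traits is a simplifying assumption of this sketch rather than a detail taken from Cheverud (1988).

```python
# Hypothetical two-trait example; none of these values come from the
# studies reviewed in this article.

def mat_vec(matrix, vector):
    """Multiply a square matrix (a list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def scale(matrix, factor):
    """Multiply every element of a matrix by a scalar."""
    return [[factor * x for x in row] for row in matrix]


# Phenotypic covariance matrix (phenotypic correlation = 0.3).
P = [[1.0, 0.3],
     [0.3, 1.0]]

# Additive genetic covariance matrix (genetic correlation = 0.4).
G = [[0.30, 0.12],
     [0.12, 0.30]]

h2 = 0.3             # assumed heritability, shared by both traits
beta = [0.5, -0.2]   # assumed selection gradient

# Response predicted from the genetic matrix itself: G * beta.
response_from_G = mat_vec(G, beta)

# Response predicted from the phenotypic proxy: (h2 * P) * beta.
response_from_proxy = mat_vec(scale(P, h2), beta)

print("Response using G:      ", response_from_G)
print("Response using h2 * P: ", response_from_proxy)
```

In this toy case the two predictions differ only because the genetic correlation (0.4) is larger than the phenotypic correlation (0.3); the closer the two correlations, the closer the proxy comes to the response predicted from G.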

Despite methodological concerns (Willis et al. 1991), Cheverud's conjecture has received some empirical support. In a review of the then available literature, Roff (1996) found that, after accounting for sampling error, phenotypic correlations were on average unbiased estimates of genetic correlations among morphological traits and between morphological and life-history traits whereas phenotypic correlations were not as useful in estimating the genetic correlations among life-history traits. Too few independent correlations between behaviors were available to allow general conclusions to be drawn in regard to that domain of traits (Roff 1996).

Kruuk et al. (2008) replicated Roff's review with mostly different data and again demonstrated that phenotypic and genetic correlations are highly correlated, with the Pearson's correlation between the two exceeding 0.7. The relationship between the two was also shown as being close to 1:1. However, both Roff (1996) and Kruuk et al. (2008) found that despite this apparent concordance, the absolute difference between phenotypic and genotypic correlations was quite high. The high absolute difference can be interpreted as a lack of precision of phenotypic correlations as estimates of genetic correlations. The difference between genetic and phenotypic correlations was also found to be greater than would be expected if solely due to sampling error in one of the two datasets (Kruuk et al. 2008). Unfortunately, these reviews of Cheverud's conjecture have focused on morphological and life-history traits and so the status of Cheverud's conjecture as it applies to correlations among behavioral traits remains unclear.

Because of the interest in behavioral correlations generated by behavioral syndrome research, I extended the testing of Cheverud's conjecture to new data for behaviors.



To test Cheverud's conjecture as it applies to behavior I analyzed 115 pairs of estimates for genetic and phenotypic correlations from 13 studies distributed among 13 species in six classes of animals (Table 1 and Table S1). Details of the dataset and its construction are available in appendices. Because only 13 studies were available I did not differentiate based on the type of behavior or whether estimates were for laboratory, field, or domestic populations. I also included all estimates of genetic correlations regardless of statistical significance—although there was some degree of reporting bias as two studies did not report nonsignificant correlations. Thus this approach assumes that genetic correlations are estimated with higher precision than is likely the case. Standard errors were not uniformly reported so it was not possible to weight estimates by uncertainty in subsequent analyses.

Table 1.  Representation of genetic and phenotypic correlation estimates by phylum and class of subject.
Phylum/Class               Number of species   Number of estimates
Arthropoda/Insecta         4                   18
Arthropoda/Branchiopoda    1                   12
Chordata/Actinopterygii    1                   6
Chordata/Reptilia          1                   2
Chordata/Aves              3                   54
Chordata/Mammalia          3                   23


To determine the degree to which phenotypic correlations reliably estimate genetic correlations, I used procedures similar to those of Roff (1996) and Kruuk et al. (2008). First I calculated Spearman's correlation (ρ) between the two sets of estimates. I next calculated the linear relationship predicting genetic correlations based on phenotypic correlations using a mixed-model approach. Genetic correlations were used as the response variable, the corresponding phenotypic correlations used as a continuous fixed factor and the study an estimate was published in used as a random factor (Kruuk et al. 2008) (Supporting Information).
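
As an illustration of the first step, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks, which handles tied values. The paired correlation estimates are hypothetical and are not taken from the dataset analyzed here; the function names are likewise illustrative. The mixed-model step is not reproduced because it requires dedicated statistical software.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired estimates (phenotypic, genetic) for six trait pairs.
r_phen = [0.10, -0.05, 0.25, 0.40, -0.20, 0.15]
r_gen = [0.30, -0.10, 0.20, 0.85, -0.45, 0.35]

rho = spearman_rho(r_phen, r_gen)
print(f"Spearman's rho = {rho:.3f}")
```

A rank-based measure is a natural choice here because it asks only whether larger phenotypic correlations tend to accompany larger genetic correlations, without assuming that the relationship is linear.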

For phenotypic correlations to be useful as reliable substitutes for genotypic correlations, the slope between the two should be close to unity (i.e., 1:1; Roff 1996). I tested this relationship using a single parameter t-test by subtracting 1 from the mixed-model estimated slope and dividing the result by the standard error for the slope. There is not currently a clear way to calculate degrees of freedom for mixed-models so I calculated them as the number of studies minus one. In this case the “true” degrees of freedom would be somewhere between the value I used and the total number of estimates. Therefore the degrees of freedom used here are a highly conservative estimate, making the test itself conservative. I also calculated the proportion of the variation (r2) of genetic correlations attributable to phenotypic correlations—after controlling for study—using Nagelkerke's (1991) generalization of the coefficient of determination to maximum likelihood estimation. It is important to note that the question being evaluated here is whether phenotypic correlations can be used to estimate genetic correlations; causality is—of course—expected to be in the opposite direction. In addition, I calculated the absolute disparity between phenotypic and genetic correlations (see also Willis et al. 1991).
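
The single-parameter test of the slope against unity can be sketched as follows, using the slope (1.318), standard error (0.071), and number of studies (13) reported in this article. The P-value is obtained by numerically integrating the t density with Simpson's rule, which is an implementation choice of this sketch and not necessarily how the original value was computed. Because the inputs are rounded, the statistic may differ slightly from the published one.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=10000):
    """Two-tailed P-value via Simpson's rule on the density from 0 to |t|.

    `steps` must be even for Simpson's rule.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area between 0 and |t|
    return 1 - 2 * central_area


def slope_vs_unity(slope, se, n_studies):
    """Test H0: slope = 1, with conservative df = number of studies - 1."""
    df = n_studies - 1
    t_stat = (slope - 1) / se
    return t_stat, df, two_tailed_p(t_stat, df)


t_stat, df, p_value = slope_vs_unity(slope=1.318, se=0.071, n_studies=13)
print(f"t = {t_stat:.2f}, df = {df}, two-tailed P = {p_value:.4f}")
```

Setting the degrees of freedom to the number of studies minus one, rather than to a value closer to the number of estimates, gives the test heavier tails and therefore a larger P-value for the same statistic, which is what makes the test conservative.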

Finally, unlike other reviews, I used a generalized linear mixed model to calculate the odds ratio—and its significance—that the sign of a phenotypic correlation was the same as that of the corresponding genetic correlation. I used a binomial distribution with a logit link function and included the study an estimate was from as a random factor. The response variable was whether the sign of the genetic correlation was negative (0) or positive (1) and the sign of the phenotypic correlation was treated as a categorical (negative or positive) fixed factor (Supporting Information). Contingency table approaches (e.g., chi-square tests) would not be appropriate for this question due to the lack of independence within studies.


The rank order of phenotypic correlations was highly concordant with that of genetic correlations (ρ= 0.87, P≪ 0.001). Consistent with this result, a linear relationship between genetic correlations and phenotypic correlations with study as a random factor explained considerable variation in genetic correlations (r2= 0.75, Fig. 1). However, the slope of the relationship between phenotypic and genetic correlations differed from 1 (slope = 1.318 ± 0.071 SE; t12= 4.47, P= 0.0008, Fig. 1). Further, the average absolute disparity was high between phenotypic and genetic correlations across studies (0.273, SE = 0.018).

Figure 1.

Relationship between phenotypic (x-axis) and genetic correlations (y-axis) grouped by class (the two classes of Arthropods are pooled in this figure) within Animalia (see legend). The dashed diagonal line represents the best-fit regression line through the entire dataset (r2= 0.75; r_gen = 1.318 × r_phen − 0.012). Phenotypic and genetic correlations were significantly correlated (ρ= 0.87, P≪ 0.01). The boxplots along the x and y axes are for the distributions of the absolute values of the correlations. The outer boundaries of the boxes indicate the 25th and 75th percentiles, the whiskers indicate the 90th and 10th percentiles and the stars indicate the 5th and 95th percentiles. The median for the absolute values of genetic correlations (the line within the boxes) was greater than that for the absolute values of phenotypic correlations (0.58 vs. 0.22, respectively).

There was also a strong correspondence between the signs of phenotypic and genetic correlations (Table 2). The odds ratio that the sign of phenotypic correlations corresponded to that of genetic correlations was 478.91 (log-odds ratio: 6.172, z = 5.709, P≪ 0.001). Put another way, a positive phenotypic correlation was 478.9 times more likely to correspond to a positive genetic correlation than to a negative genetic correlation.
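
The relationship between the reported log-odds ratio and the odds ratio can be checked with a short calculation. The sketch below exponentiates the reported log-odds ratio (6.172) and, under the assumption that the reported z is a Wald statistic (estimate divided by its standard error), recovers the implied standard error. That assumption is not stated in the text, so the standard error should be read as a rough back-calculation only.

```python
import math

# Values as reported in the text above.
log_odds_ratio = 6.172
z_value = 5.709


def odds_ratio_from_log_odds(log_odds):
    """Convert a log-odds ratio from a logit-link model to an odds ratio."""
    return math.exp(log_odds)


def implied_standard_error(estimate, z):
    """Back-calculate a standard error, assuming z = estimate / SE (Wald)."""
    return estimate / z


odds_ratio = odds_ratio_from_log_odds(log_odds_ratio)
se_log_odds = implied_standard_error(log_odds_ratio, z_value)

print(f"Odds ratio from rounded log-odds: {odds_ratio:.1f}")
print(f"Implied SE of the log-odds ratio: {se_log_odds:.3f}")
```

The exponentiated value differs slightly from the reported 478.91 because the log-odds ratio is given to only three decimal places.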

Table 2.  A 2 × 2 contingency table for the correspondence of the signs of phenotypic correlations and genetic correlations. Phenotypic correlations had 478.9 times the odds of corresponding to genetic correlations of the same sign than to genetic correlations of the opposite sign.
   r phenotypic  
r genetic   73  2

Because the absolute value of genetic correlations was greater than that of phenotypic correlations (Fig. 1), as also demonstrated by the slope of the relationship between the two, I conducted post-hoc tests of whether this relationship was likely due to sampling error. As not all of the studies included in the analyses reported standard errors I used sample size as a proxy for sampling error. Sample size was determined in two ways: (1) the number of families or cohorts used in breeding designs and (2) the total number of individuals whose behavior was measured. The average value of the absolute genetic correlations for each study was not significantly related to either measure of sample size (Fig. 2).

Figure 2.

Relationship between either the number of families/cohorts (A) in a breeding design or individuals (B) and the average value of the absolute genetic correlations for each study. Because sampling error is expected to decrease with sample size, a negative relationship between sample size measures (x-axes) and the absolute value of genetic correlations (y-axes) would be expected if genetic correlations were biased toward greater absolute values with higher sampling error. Based on a simple linear regression, this was not the case for either measure of sample size (A: F1,8= 0.05, P= 0.83; B: F1,14= 0.022, P= 0.88). Note the break in the x-axis for A (there was no significant relationship between the number of families/cohorts and the genetic correlation even if the outlier after the break was removed (F1,7= 0.78, P= 0.41)).


These results are similar to those found for correlations between other types of traits (Roff 1996; Kruuk et al. 2008), and the correlation between genetic and phenotypic correlations for behaviors (ρ= 0.87) was of similar magnitude to that for other phenotypic traits (r= 0.74, Kruuk et al. 2008). The absolute difference estimated for behaviors (0.273, SE = 0.018) was also similar to that reported by Kruuk et al. (2008) for a sample of correlations consisting primarily of morphological and life-history traits (0.245, SE = 0.222) and, based on the standard errors, the confidence intervals for each likely overlap. This similarity between behavioral correlations and correlations between other types of traits is surprising given that behaviors are sometimes viewed as more plastic than other traits (e.g., Neff and Sherman 2004).

In contrast to similarities in absolute differences, the concordance between phenotypic and genetic correlations was farther from unity (i.e., a slope of 1) for behaviors than found in either previous review. For comparison to the slopes estimated here, Roff (1996) found an average slope of 1.05 for his complete dataset whereas Kruuk et al. (2008) found a slope of 1.06 for their data. In Roff's (1996) review behavioral correlations exhibited an even greater slope (1.68)—although in that review 80% of the behavioral correlations were from a single study. Here, I found a slope of around 1.3 (Fig. 1). Following the rationale of Kruuk et al. (2008) these slopes and absolute differences suggest that Cheverud's conjecture fails as a precise estimator of the genetic correlation between behavioral traits.

These results are potentially problematic for researchers investigating behavioral syndromes because many of the inferences drawn from such research rest implicitly on the validity of Cheverud's conjecture for behaviors. Unfortunately for behavioral syndrome researchers, estimating bivariate genetic correlations in natural populations requires information about pedigrees as well as large sample sizes (Wilson et al. 2010) that may not be available. Given these limitations, what then can be extracted from phenotypic correlations that might be useful to behavioral syndrome researchers interested in the evolutionary implications of trait correlations?

Based on the results presented here several inferences might still be drawn from phenotypic correlations between behaviors. First, the rank order of correlations is highly consistent between phenotypic and genetic estimates. If researchers are primarily interested in which correlations among a set of traits in a behavioral syndrome are most likely to affect evolutionary trajectories, phenotypic correlations will likely be informative. This may be useful to researchers interested in determining which traits to measure in the context of a more thorough quantitative genetics breeding design. Second, the sign of a phenotypic correlation is a reliable indicator of the sign of the genetic correlation (Table 2). If there is knowledge about how specific behaviors relate to fitness, then the sign of the phenotypic correlation may be suggestive of how trait correlations affect evolutionary change. A visual inspection of the data used by Kruuk et al. (2008, Fig. 1 therein) suggests a similar concordance in the sign of correlations for other types of traits. Finally, because of the high correlation (ρ= 0.87) and the considerable amount of variation in genetic correlations attributable to phenotypic correlations (r2= 0.75), Cheverud's conjecture cannot be entirely dismissed based on absolute discrepancies. Thus, despite absolute differences, phenotypic correlations provide considerable information regarding both the direction and magnitude of the underlying genetic correlations.

Of course it is important to note that the estimation methods of genetic and phenotypic correlations are likely to create some autocorrelation between the two, potentially contributing to their observed similarity. Because genetic correlations are estimated based on observed phenotypic correlations, they may be biased toward these observed values via a variety of mechanisms. For example, unaccounted for common environmental effects and parental effects may be estimated as components of genetic variances and covariances, exaggerating their correspondence with phenotypic measures. The former may be particularly important in natural populations. Another way concordance between correlations might be exaggerated stems from the fact that they are typically estimated under the same conditions and with the same study population. The experimental conditions thus represent a special case of shared environmental effects which generates shared error in the estimation of quantitative genetic parameters within experiments. Researchers intent on using phenotypic estimates as proxies for genetic estimates should consider these and other mechanisms that generate autocorrelation.

The difference in absolute magnitudes (medians: 0.58 and 0.22, respectively) reflected in the slope between genetic and phenotypic correlations (Fig. 1) also warrants further consideration, as does the difference in slopes observed for behaviors versus those observed for other traits. The difference in absolute magnitudes may result because genetic correlations are estimated not only from additive genetic covariances but also from the estimation of the additive genetic variance present in each trait. Estimation error in any of these three parameters (two variances and one covariance) results in error in the estimation of the genetic correlation. At the extreme, estimation error can result in correlations greater than |1|. The occurrence of correlations outside of the range [-1:1] decreases with decreasing sampling error (Hill and Thompson 1978) and thus with increasing sample size. Because of these estimation concerns, the greater magnitude of absolute genetic correlations observed here may simply be a result of sampling error. If this were the case then it would be expected that the absolute magnitude of genetic correlations would decrease with sample size but, for these data, the magnitude of genetic correlations was not significantly related to sample size (Fig. 2).
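
The dependence of a genetic correlation on three separately estimated parameters can be illustrated with a short sketch. The variance and covariance values below are hypothetical; they are chosen only to show how an underestimated additive genetic variance can push the estimated correlation beyond 1 even when the covariance is unchanged.

```python
import math


def genetic_correlation(cov_a, var_a1, var_a2):
    """Additive genetic correlation from one covariance and two variances."""
    if var_a1 <= 0 or var_a2 <= 0:
        raise ValueError("additive genetic variances must be positive")
    return cov_a / math.sqrt(var_a1 * var_a2)


# Hypothetical estimates: both variances estimated without error.
r_well_estimated = genetic_correlation(cov_a=0.12, var_a1=0.30, var_a2=0.30)

# Same covariance, but the second variance is badly underestimated.
r_inflated = genetic_correlation(cov_a=0.12, var_a1=0.30, var_a2=0.04)

print(f"Both variances well estimated:  r_G = {r_well_estimated:.3f}")
print(f"One variance underestimated:    r_G = {r_inflated:.3f}")
```

Because the variances enter through the denominator, error that shrinks either variance estimate inflates the absolute value of the correlation, which is why larger samples would be expected to yield fewer out-of-range estimates.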

If not due to sampling error or heritabilities, what else might cause this general pattern? Also, why is this difference greater for behaviors than for other traits? For the first of these questions, one possible answer is suggested by considering the “effective dimensionality” (Hine and Blows 2006) of a covariance matrix. The effective dimensionality of a covariance matrix can be thought of as the number of independent combinations of traits, similar to orthogonal principal components, among a set of covarying phenotypic traits. Dimensionality, its evolutionary implications, and other properties of covariance matrices have been extensively discussed elsewhere (e.g., Blows 2007; Hine and Blows 2006; Walsh and Blows 2009), but the important points to take away from those discussions relative to the results I have reported here are that (1) the number of effective dimensions in the phenotypic covariance matrix (P) generally sets an upper limit to the number of effective dimensions in the corresponding genetic covariance matrix (G) and (2) effective dimensionality is inversely proportional to the absolute magnitude of phenotypic or genetic correlations. Based solely on these relationships (ignoring causality), as the average phenotypic correlation increases, the effective dimensionality of P decreases. As the dimensionality of P decreases, the dimensionality of G decreases with a concomitant increase in the average genetic correlation for that matrix. As a consequence, the magnitude of the average absolute phenotypic correlation sets a lower limit to the average absolute genetic correlation. Thus genetic correlations can, on average, be expected to be greater in absolute magnitude. Further, pleiotropy will tend to increase genetic correlations, decreasing the effective dimensionality of genetic covariance matrices (Walsh and Blows 2009) whereas environmental variation will tend to maintain effective dimensions of phenotypic covariance matrices (McGuigan and Blows 2007). However, the question of why the magnitude of genetic correlations relative to phenotypic correlations is so much greater for behaviors versus other traits remains.
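
The link between correlation magnitude and dimensionality can be seen in the simplest possible case of two traits. The sketch below uses a deliberately simple index, the sum of the eigenvalues divided by the largest eigenvalue, which is a swapped-in measure chosen for illustration and is not the factor-analytic approach of Hine and Blows (2006). For a 2 × 2 correlation matrix with off-diagonal element r, the eigenvalues are 1 + |r| and 1 − |r|. The correlations 0.22 and 0.58 are used only because they match the median absolute phenotypic and genetic correlations reported in this article.

```python
def correlation_eigenvalues(r):
    """Eigenvalues of the 2 x 2 correlation matrix [[1, r], [r, 1]], largest first."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return [1.0 + abs(r), 1.0 - abs(r)]


def effective_dimensions(eigenvalues):
    """Illustrative index: sum of eigenvalues divided by the largest one."""
    return sum(eigenvalues) / max(eigenvalues)


# Index for a range of correlation magnitudes (two-trait case only).
n_dims = {r: effective_dimensions(correlation_eigenvalues(r))
          for r in (0.0, 0.22, 0.58, 0.9)}

for r, n in n_dims.items():
    print(f"|r| = {r:.2f} -> effective dimensions = {n:.2f}")
```

With uncorrelated traits the index equals the number of traits (2), and it falls toward 1 as the correlation strengthens, so a matrix with stronger correlations spans fewer effectively independent directions.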

Roff (1996) found that the magnitudes of genetic correlations were much greater than corresponding phenotypic correlations for behaviors and that this discrepancy was greater for behaviors versus either life-history or morphological traits. The results I have presented here suggest the same when compared to those of Kruuk et al. (2008). This observed disparity between trait types might be related to the expression of plasticity in behavior. Behavioral responses are often expressed continually over time and vary within individuals in response to environmental changes (Dingemanse et al. 2010). In contrast, the relationships between many other types of traits may be more rigid. For example, Cheverud's conjecture is well supported for morphological traits (Roff 1996) and many of the correlations within this domain of traits remain relatively constant after individual maturity (e.g., the correlation between different skull measurements within an individual). For these types of traits the contribution of the environmental correlation to the phenotypic correlation thus remains largely constant after maturity. Contrastingly, for behaviors, the environmental correlation for two traits may change with time. If two behaviors are measured at different times this may bias the absolute magnitude of phenotypic correlations relative to genetic correlations. Regardless of why, what is clear is that at the phenotypic level behaviors vary more independently than suggested by the underlying genetic correlations. From this it might also be reasonable to infer that the influence of behavioral genetic correlations on evolutionary trajectories is greater than suggested by phenotypic correlations.

How genetic correlations between behaviors are generated also remains of considerable research interest. Behavioral genetic correlations can be generated by selection-induced linkage disequilibrium, linkage, or pleiotropy. Observations that some syndrome structures appear easily eroded or quite geographically variable suggest the first of these mechanisms (Bell and Sih 2007; Dingemanse et al. 2007, 2009). More broadly, the hypotheses of pleiotropy or selection-induced linkage disequilibrium can be tested in at least two ways. First, selection experiments can determine the underlying basis of trait correlations and their contingency on current and historical selective environments (Chippindale et al. 2003; Conner 2002). Second, associations between gene expression and behavioral responses may also reveal whether pleiotropy is generating behavioral correlations (Bell and Aubin-Horth 2010). Ideally both approaches would be used together as each provides unique information necessary to our understanding of the evolutionary implications of behavioral correlations.

Associate Editor: M. Blows


I thank A. Takahashi and E. Strandberg for providing me with phenotypic estimates in addition to their published genetic correlations. I also thank K. Burls, C. Downs, M. Forister, and S. Karam for helpful comments on earlier versions of this article. M. Blows, A. Bell, and two anonymous reviewers provided essential criticisms and recommendations that greatly expanded the scope and clarity of this article.