Mark W. Blows, School of Integrative Biology, University of Queensland, Brisbane, 4072, Australia. Tel.: 61 7 3365 8382; fax: 61 7 3365 1655; e-mail: firstname.lastname@example.org
Two symmetric matrices underlie our understanding of microevolutionary change. The first is the matrix of nonlinear selection gradients (γ) which describes the individual fitness surface. The second is the genetic variance–covariance matrix (G) that influences the multivariate response to selection. A common approach to the empirical analysis of these matrices is the element-by-element testing of significance, and subsequent biological interpretation of pattern based on these univariate and bivariate parameters. Here, I show why this approach is likely to misrepresent the genetic basis of quantitative traits, and the selection acting on them in many cases. Diagonalization of square matrices is a fundamental aspect of many of the multivariate statistical techniques used by biologists. Applying this, and other related approaches, to the analysis of the structure of γ and G matrices, gives greater insight into the form and strength of nonlinear selection, and the availability of genetic variance for multiple traits.
‘…it's worth taking the trouble to learn a few basics of matrix algebra, so that you can read the primary theoretical literature.’ Stevan J. Arnold (2005; Evolution, 59, 2059)
‘It is no exaggeration to say that these [symmetric matrices] are the most important matrices the world will ever see.’ Gilbert Strang (1998; Introduction to Linear Algebra, 2nd edn. Wellesley-Cambridge Press, Wellesley)
It is probably true that linear algebra is not the topic that most evolutionary biologists choose for bedtime reading, despite the urging of one of the leading evolutionary biologists of our time (Arnold, 2005). Nevertheless, it is also true that evolutionary biology rests firmly on a foundation of linear algebra. It does not take long after getting past the introductions of classic papers in quantitative genetics (Lande, 1979, 1980, 1981) or the study of natural selection (Lande & Arnold, 1983; Phillips & Arnold, 1989) to appreciate that some of the most widely applicable theory in evolutionary biology employs an elegant sufficiency of linear algebra. Perhaps what is less generally appreciated though, is that the empirical study of that most fundamental aspect of evolution, the response to natural (or sexual) selection of functionally related traits that together confer Darwinian fitness, is also heavily dependent on the principles of linear algebra. In this paper, I highlight the roles of two important matrices in evolutionary biology, and show why the application of a few basics of linear algebra can be so important in empirical investigations of the genetic basis of quantitative traits, and the selection acting upon them.
Adaptation is an inherently multivariate process. Natural selection often acts upon sets of functionally related traits, rather than one-dimensional phenotypes (Lande & Arnold, 1983; Phillips & Arnold, 1989; Schluter & Nychka, 1994). Adaptation to a change in a single aspect of the environment such as an increase in abiotic stress (Hoffmann & Parsons, 1991), or to different habitats or niches (Harmon et al., 2005) will often result in changes in a number of traits. In addition, hundreds of years of experience with plant and animal breeding has shown us that selection for one trait will often result in a change in another as a correlated response. Correlated responses primarily arise as a consequence of pleiotropy (Falconer & Mackay, 1996), and although the genetic basis of functionally related sets of traits may be modular to some (as yet unknown) extent (Cheverud, 1982; Wagner & Altenberg, 1996), the genetic basis of even seemingly disparate traits cannot be assumed to be completely independent. Evolutionary biologists have therefore accepted a considerable challenge when attempting to understand the genetic basis of those traits under selection, and how selection operates on them.
Although it is common in many studies to measure multiple phenotypes, it is far less common for genetic and evolutionary hypotheses to be tested in a true multivariate fashion. A very recent example is QTL analysis, where multiple traits are often measured, and although multivariate approaches are available, they are rarely applied (Xu et al., 2005). Similarly, both empirical investigations of the form and strength of selection (Blows & Brooks, 2003), and the quantitative genetic basis of traits (Blows & Hoffmann, 2005) have also commonly ignored the multivariate nature of the biology under investigation, and here I outline why this restricts our ability to determine how and why traits evolve.
A brief overview of matrix diagonalization
Symmetric matrices, and in particular symmetric variance–covariance matrices, underlie many of the multivariate statistical tools commonly employed by biologists, but the matrices themselves often remain unseen. Symmetric matrices have the unique mathematical property that they can always be diagonalized to find a set of real eigenvalues and orthonormal eigenvectors (Box 1) that describe the space using (Strang, 1998):
$$A = S \Lambda S^{-1}$$
where A is the matrix to be diagonalized, S is a matrix that contains the eigenvectors of A as columns and Λ is a diagonal matrix containing the eigenvalues of A. Principal components analysis is perhaps the most obvious application of diagonalization, where for example, the phenotypic covariance matrix for a set of n traits is diagonalized to result in n new orthogonal principal components which are created by multiplying the loading of the eigenvectors by the original trait values. Typically, only the new PCs and the associated percent variance explained by each PC are encountered, and very often the covariance (or correlation) matrix itself is not seen.
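The diagonalization and its use in principal components analysis can be sketched numerically. The following is a minimal illustration in Python with NumPy, not part of the original analyses: the covariance values, sample size and variable names are all assumptions chosen only to show how A, S and Λ relate.

```python
import numpy as np

# Simulated trait data (hypothetical): 200 individuals, 3 traits.
rng = np.random.default_rng(1)
true_cov = np.array([[4.0, 2.0, 0.0],
                     [2.0, 3.0, 1.0],
                     [0.0, 1.0, 2.0]])
Z = rng.multivariate_normal(np.zeros(3), true_cov, size=200)

# A: the sample phenotypic covariance matrix to be diagonalized.
A = np.cov(Z, rowvar=False)

# eigh is intended for symmetric matrices: it returns real eigenvalues
# (ascending) and orthonormal eigenvectors as the columns of S.
eigvals, S = np.linalg.eigh(A)

# Reorder so that the first component carries the most variance.
order = np.argsort(eigvals)[::-1]
eigvals, S = eigvals[order], S[:, order]
Lam = np.diag(eigvals)

# For an orthonormal S the inverse equals the transpose,
# so A = S Lam S^-1 can be checked as S Lam S^T.
reconstructed = S @ Lam @ S.T

# Percent variance explained by each principal component.
pct_var = 100 * eigvals / eigvals.sum()

# PC scores: centred trait values multiplied by the eigenvector loadings.
scores = (Z - Z.mean(axis=0)) @ S
```

In this sketch the covariance matrix of `scores` is the diagonal matrix `Lam`, which is what is meant by the new components being orthogonal.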
Box 1. Working definitions* of linear algebra terminology.
Canonical variate (axis): commonly used term for an eigenvector in applications of manova or response surface methodology
Diagonalization: refers to finding the eigenvalues of a matrix, which are represented in a diagonal matrix
Eigenvector: a linear combination of the original traits. A set of eigenvectors are orthogonal, and completely describe the space encompassed by the original traits
Eigenvalue: the scale factor with which the eigenvector length changes. For G matrices they are genetic variances, for γ matrices they are quadratic selection gradients
Nonpositive definite matrix: a matrix that does not have all positive nonzero eigenvalues
Rank: the number of nonzero eigenvalues of a matrix
Singular matrix: a matrix that has one or more zero eigenvalues
Space: the n-dimensional area described by an n × n matrix
Subspace: the (n − x)-dimensional area described by a subset (x) of eigenvectors from an original n × n matrix
*For formal definitions (with proofs) the reader is directed to Strang (1998).
Multivariate analysis of variance (manova) is another example of the use of diagonalization, but is a little more involved. Here, the sums of squares and cross-products (SSCP) matrix at the effect level (H) is first scaled by the SSCP error (E) using:
$$T = E^{-1} H$$
and the resultant square matrix T is diagonalized. Hypothesis tests are then concerned with the significance or otherwise of the eigenvalues (for example, the endearingly named Roy's Greatest Root is the hypothesis test for the first eigenvalue). The eigenvectors are usually referred to as canonical variates; linear combinations of the original variables which show evidence of an effect of the treatment. Biologists frequently view such transformations of the original variables with some suspicion, even though plots of canonical variates can be a considerable aid in interpretation (Mardia et al., 1979). It is important to note that if one is comfortable with using manova as a test of significance (based on the eigenvalues) as many biologists appear to be, one has by default already accepted the linear transformation of the original variables (the eigenvectors).
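A small numerical sketch may help to show what is being diagonalized in manova. It assumes that the scaling takes the standard form T = E⁻¹H, and it uses simulated data; the group means, sample sizes and variable names are hypothetical and no hypothesis test is carried out.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 3 treatment groups, 20 individuals each, 2 traits.
group_means = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
groups = [rng.normal(loc=m, scale=1.0, size=(20, 2)) for m in group_means]

all_data = np.vstack(groups)
grand_mean = all_data.mean(axis=0)

H = np.zeros((2, 2))  # SSCP matrix at the effect (among-group) level
E = np.zeros((2, 2))  # SSCP error (within-group) matrix
for g in groups:
    d = (g.mean(axis=0) - grand_mean)[:, None]
    H += len(g) * (d @ d.T)
    r = g - g.mean(axis=0)
    E += r.T @ r

# T = E^-1 H, computed without forming the inverse explicitly.
T = np.linalg.solve(E, H)

# T is square but not symmetric in general, so the general eigen-solver
# is used; its eigenvalues are real because E is positive definite and
# H is positive semi-definite.
eigvals, eigvecs = np.linalg.eig(T)
eigvals, eigvecs = eigvals.real, eigvecs.real
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The largest eigenvalue is the quantity on which Roy's Greatest Root is based.
largest_root = eigvals[0]

# Scores on the first canonical variate: a linear combination of the traits.
canonical_scores = (all_data - grand_mean) @ eigvecs[:, 0]
```

Plotting `canonical_scores` by group is the kind of display of canonical variates referred to above.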
The diagonalization of a matrix is the way in which the space described by the matrix can be conveniently summarized, and the properties of the matrix determined. Diagonalization provides a very different perspective on the relationships among traits than can be gleaned from considering the individual elements of the matrix in isolation. Here, I discuss the diagonalization of two important symmetric matrices, the matrix of nonlinear selection gradients (γ) and the genetic variance–covariance matrix (G), and show how the analysis of selection and the genetic basis of quantitative traits can benefit from this perspective.
Measuring selection on multiple traits
Measuring selection on multiple traits has a long history (Bumpus, 1898), and can be accomplished in a number of ways depending on the type of organism under study and the specific question in mind (Manly, 1985; Endler, 1986). It is not my intention to provide a comprehensive review of the methods available, and their assumptions, benefits and limitations. My focus here is to outline extensions to the commonly employed methods using multiple and second-order polynomial regression (Kingsolver et al., 2001).
Measuring linear selection using linear multiple regression, formalized by Lande & Arnold (1983), has been employed in numerous studies (Kingsolver et al., 2001). Briefly, partial regression coefficients from a multiple linear regression of relative fitness (survival, mating success etc.) on the traits of interest are used as measures of the strength of directional selection (selection gradients) on each trait corrected for the presence of all other traits in the model. Although relatively straightforward to implement, measuring linear selection on multiple traits can cause some problems. I have found three recurring issues that are worth a brief mention here.
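As a minimal sketch of this regression, the following Python example estimates selection gradients from simulated data. All values (trait correlation, true gradients, sample size) are assumptions for illustration only. It also checks the identity β = P⁻¹s from Lande & Arnold (1983), where s is the vector of selection differentials and P the phenotypic covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Two hypothetical correlated traits; only trait 1 affects fitness directly.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
W = 1.0 + 0.2 * z[:, 0] + rng.normal(scale=0.1, size=n)  # absolute fitness

w = W / W.mean()                                   # relative fitness
z_std = (z - z.mean(axis=0)) / z.std(axis=0, ddof=1)  # standardized traits

# Multiple linear regression of relative fitness on the traits;
# the partial regression coefficients are the selection gradients.
X = np.column_stack([np.ones(n), z_std])
coef, *_ = np.linalg.lstsq(X, w, rcond=None)
beta = coef[1:]

# Selection differentials: covariance of each trait with relative fitness.
s = np.array([np.cov(z_std[:, j], w)[0, 1] for j in range(z_std.shape[1])])

# Lande & Arnold (1983): beta = P^-1 s.
P = np.cov(z_std, rowvar=False)
beta_from_s = np.linalg.solve(P, s)
```

In this simulation trait 2 has a clearly positive selection differential (through its correlation with trait 1) but a gradient near zero, which is the distinction between total and direct selection that the partial regression coefficients are meant to capture.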
First, very high correlations among independent variables are well known to cause instability of the partial regression coefficients and their standard errors. Measuring selection on sets of morphological traits for example may often encounter this problem. Recognized as a likely problem from the start, Lande & Arnold (1983, p. 1214) list a number of approaches to address this problem, of which the analysis of the principal components of the phenotypic covariance matrix has the potential to maintain all the variance in the original traits in the analysis.
Secondly, interpretation of partial regression coefficients can often be a challenge, and in some cases counter-intuitive to univariate examination of mean differences between successful and unsuccessful individuals. For example, a trait that displays no difference in mean between those individuals that survive and those that do not may appear as very important in the resulting selection analysis by having a large partial regression coefficient. Such patterns in multiple regression are not uncommon, and Flury (1989) gives a particularly lucid explanation of the correlation structures among independent variables that give rise to such cases.
Finally, tools that are frequently applied in multiple regression situations are often not used to advantage in selection analyses. Variable selection is one such technique that is rarely applied in studies of selection, but can be useful in either conserving precious degrees of freedom, or as another approach to dealing with multicollinearity among independent variables. In addition, Endler (1986) noted that when the fitness measure consists of two groups (dead and alive for example), the vector of directional selection gradients will be proportional to the linear discriminant function between the two groups (it will also be proportional to the first canonical variate from manova outlined above). This immediately suggests that the presence of linear selection can be most effectively tested for by considering the significance of selection on the univariate discriminant function using regression (or even the significance of the first eigenvalue using manova under some circumstances). In a similar fashion, the application of standard approaches to the analysis of response surfaces are often not utilized in analyses of nonlinear selection, a problem which I outline below.
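Endler's (1986) observation about two-group fitness measures can be checked numerically. The sketch below uses simulated survival data (all parameter values are hypothetical) and compares the direction of the regression-based selection gradients with Fisher's linear discriminant function, computed here as the inverse of the pooled within-group SSCP matrix multiplied by the difference in group means.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400

# Three hypothetical correlated traits.
cov = np.array([[1.0, 0.4, 0.2],
                [0.4, 1.0, 0.3],
                [0.2, 0.3, 1.0]])
z = rng.multivariate_normal(np.zeros(3), cov, size=n)

# Simulated survival: probability depends on traits 1 and 3 (assumed values).
p_survive = 1.0 / (1.0 + np.exp(-(0.8 * z[:, 0] - 0.4 * z[:, 2])))
alive = rng.random(n) < p_survive
w = alive.astype(float) / alive.mean()  # relative fitness (dead = 0)

# Directional selection gradients from multiple regression.
X = np.column_stack([np.ones(n), z])
beta = np.linalg.lstsq(X, w, rcond=None)[0][1:]

# Linear discriminant function between survivors and non-survivors.
z1, z0 = z[alive], z[~alive]
mean_diff = z1.mean(axis=0) - z0.mean(axis=0)
r1, r0 = z1 - z1.mean(axis=0), z0 - z0.mean(axis=0)
W_sscp = r1.T @ r1 + r0.T @ r0
ldf = np.linalg.solve(W_sscp, mean_diff)

# Cosine of the angle between the two vectors; 1 means proportional.
cosine = beta @ ldf / (np.linalg.norm(beta) * np.linalg.norm(ldf))
```

Because the two vectors point in the same direction, a single test of selection on the discriminant scores `z @ ldf` addresses the presence of linear selection on the whole set of traits.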
The extension of multiple regression to the measurement of nonlinear selection, and the interpretation of fitness surfaces, using second-order polynomial regression was considered in detail by Phillips & Arnold (1989). It is the measurement and interpretation of nonlinear selection that has caused most confusion (Blows & Brooks, 2003), primarily as a consequence of the multivariate nature of the tool employed being divorced from the biological categorization of the three types of nonlinear selection: stabilizing (or convex), disruptive (or concave) and correlational selection.
Nonlinear selection on a set of traits can be represented by the second-order partial regression coefficients, which have often been divided into two distinct categories. Nonlinear selection on individual traits, γii (the quadratic regression coefficients), have been the coefficients most often interpreted in evolutionary studies (Kingsolver et al., 2001), and have been used as measures of disruptive and stabilizing selection. The second category of regression coefficients, γij (the cross-product regression coefficients), have been used to represent correlational selection; the nonlinear interaction between two traits. Correlational selection gradients have been consistently under-reported in the literature (Kingsolver et al., 2001), probably as a consequence of the difficulty in interpreting all n(n − 1)/2 coefficients in a meaningful fashion.
In matrix form, the nonlinear regression coefficients are arranged in a symmetrical matrix (γ). It is in this form that it can be seen that the two types of regression coefficients describe an n-dimensional space (or surface), rather than simply being bivariate descriptors of second-order curvature, each to be considered in isolation. Phillips & Arnold (1989) showed how the eigenstructure of this matrix could be analysed by adopting approaches from the response surface statistical literature (Box & Wilson, 1951; Box & Draper, 1987). The eigenvectors of γ are linear combinations of the original traits that display the most nonlinear selection (large eigenvalues) or the least nonlinear selection (small eigenvalues). The sign of the eigenvalues indicates whether the curvature is convex (negative eigenvalue) or concave (positive eigenvalue) in nature.
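The canonical analysis of γ amounts to an eigen-decomposition of a symmetric matrix. The sketch below uses a hypothetical two-trait γ (the values are invented for illustration, and no test of statistical significance is attempted) to show how the eigenvalues and eigenvectors are obtained and how the sign pattern of the eigenvalues describes the surface.

```python
import numpy as np

# Hypothetical gamma matrix: quadratic gradients on the diagonal,
# correlational (cross-product) gradients off the diagonal.
gamma = np.array([[-0.15, 0.12],
                  [0.12, -0.15]])

# Eigenvalues (ascending) and eigenvectors (columns of M) of gamma.
eigvals, M = np.linalg.eigh(gamma)


def classify_surface(eigenvalues, tol=1e-12):
    """Label the quadratic surface from the signs of its eigenvalues."""
    if np.all(eigenvalues < -tol):
        return "peak"      # convex curvature along every canonical axis
    if np.all(eigenvalues > tol):
        return "bowl"      # concave curvature along every canonical axis
    if np.any(eigenvalues < -tol) and np.any(eigenvalues > tol):
        return "saddle"    # mixture of convex and concave axes
    return "at least one axis with no curvature"  # label added for this sketch


surface = classify_surface(eigvals)

# Strongest curvature on a canonical axis versus on any single trait.
strongest_canonical = np.abs(eigvals).max()        # 0.27
strongest_univariate = np.abs(np.diag(gamma)).max()  # 0.15
```

Here the eigenvalues are −0.27 and −0.03, with eigenvectors proportional to (1, −1) and (1, 1). The strongest stabilizing selection therefore acts on a contrast between the two traits and is almost twice as strong as the quadratic gradient on either trait alone.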
Canonical analysis of γ provides three benefits to empiricists studying selection that arise from being able to interpret the system as a whole, rather than focusing on individual elements of γ. First, the form and strength of selection can be identified. The importance of being able to view the system as a whole can be seen by considering the concern raised by the review of Kingsolver et al. (2001) in which it was found that nonlinear selection, and surprisingly stabilizing selection, was generally weak. Blows & Brooks (2003) showed that much of the nonlinear selection in empirical systems had been ignored by excluding the off-diagonal correlational selection gradients from measures of stabilizing or disruptive selection, and such selection on trait combinations (the canonical axes) is likely to be far stronger than univariate estimates of quadratic selection gradients indicate. In other words, it tends to be combinations of traits that are under stabilizing or disruptive selection, not the individual traits that empiricists define and measure.
Secondly, by identifying the major axes of nonlinear selection, the presence of nonlinear selection can be tested for in a much more efficient fashion by determining the significance of selection on the n canonical axes, rather than the n(n − 1)/2 coefficients (Blows & Brooks, 2003). This approach avoids the increase in type I error associated with a larger number of tests. More importantly however, it targets the hypothesis testing on those parameters (the eigenvalues) which are more useful in interpreting the nature of selection operating on the set of traits. To put it another way, applying hypothesis tests to bivariate cross-product coefficients implies one wishes to interpret the result, which in many cases appears to be of limited value as has been demonstrated by the way in which these coefficients have been consistently ignored in the empirical literature (Kingsolver et al., 2001).
Finally, the distinction between stabilizing and disruptive selection on one hand and correlational selection on the other is no longer required as the new canonical axes are orthogonal and describe the entire system of nonlinear selection. Note that this does not mean that correlational selection does not exist, or is a concept that has little value. Instead, correlational selection can now be seen as the relationships among the loadings of the original traits to each canonical axis. For example, large positive loadings for two original traits on a canonical axis that describes a substantial amount of nonlinear selection indicates that the two original traits experience correlational selection. This simplification of the classification of selection allows multivariate systems to be classed into more readily interpretable categories; peaks, bowls and saddles (Phillips & Arnold, 1989).
Brooks et al. (2005) showed how this approach allowed multivariate stabilizing sexual selection (a peak) to be identified on a set of male cricket song traits (Fig. 1). In this experiment, five attributes of male cricket song were artificially manipulated to explore the sexual selection fitness surface. From a visual inspection of the γ matrix alone it is difficult to determine the overall pattern of sexual selection operating on these male traits as both positive and negative quadratic and cross-product selection gradients are present. Diagonalization of γ provided a definitive picture of the form and strength of selection, as four of the five eigenvalues were negative indicating stabilizing selection, and only two of the negative eigenvalues were found to represent statistically significant nonlinear selection. Hence, Brooks et al. (2005) concluded that multivariate stabilizing selection was the predominant form of sexual selection operating on these traits.
The genetic basis of multiple traits
When multiple traits are under selection, univariate descriptors of the genetic basis of individual traits (heritability, genetic variance, evolvability) are inadequate to describe the genetic basis of the phenotype under selection. Lande (1979) introduced the genetic variance–covariance (G) matrix that enabled the genetic basis of multiple traits to be associated with selection acting on those multiple traits to result in the response to selection:
$$\Delta \bar{z} = G \beta$$
where the off-diagonal elements of G are the bivariate genetic covariances between traits. Lande (1979, p. 408) noted that the diagonalization of G provided a way of determining how many genetically independent traits were represented by a set of phenotypes. Subsequently, a number of authors (Cheverud, 1981; Amemiya, 1985; Pease & Bull, 1988; Charlesworth, 1990; Arnold, 1992) have outlined the utility of determining the eigenvectors and eigenvalues of G (but see Dickerson, 1955 for a particularly prescient early contribution).
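A short worked example of the multivariate response to selection, Δz̄ = Gβ (Lande, 1979), shows why the off-diagonal elements matter. The G matrix and the selection gradients below are hypothetical values chosen for easy arithmetic.

```python
import numpy as np

# Hypothetical G: equal genetic variances, strong positive genetic covariance.
G = np.array([[1.0, 0.8],
              [0.8, 1.0]])


def response(G, beta):
    """Predicted change in trait means after one generation: G @ beta."""
    return G @ beta


# Selection on trait 1 only: trait 2 changes as a correlated response.
dz_one_trait = response(G, np.array([0.5, 0.0]))   # [0.5, 0.4]

# Selection in the direction favoured by the genetic covariance.
dz_along = response(G, np.array([0.5, 0.5]))       # [0.9, 0.9]

# Selection of the same strength against the genetic covariance.
dz_against = response(G, np.array([0.5, -0.5]))    # approx. [0.1, -0.1]
```

The same strength of selection produces a response nine times larger in one direction of trait space than in the other, although both traits have identical genetic variances.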
Similar to the approach of interpreting individual selection gradients detailed above, individual elements of G (genetic variances and covariances) have almost exclusively been the subject of empirical attention. Although trait genetic variances and bivariate genetic correlations have been the targets of interest in many empirical studies, they may often be of limited value when interpreted in isolation from the larger number of functionally related traits that may be under selection (Pease & Bull, 1988; Charlesworth, 1990; Roff, 2002, p. 103; Blows & Hoffmann, 2005). The key property of genetic variance–covariance matrices that argues against the interpretation of individual elements is that such matrices can be singular even in the presence of genetic variance in all traits (Dickerson, 1955; Amemiya, 1985; Charlesworth, 1990). That is, there can be directions in multivariate trait space in which no genetic variance is available, even though all traits exhibit genetic variance.
The study of genetic correlations, particularly the search for negative genetic correlations among life-history traits, has been motivated by the realization that genetic covariance can act as a genetic constraint. The search for negative genetic correlations has been relatively unsuccessful for a number of reasons (Blows & Hoffmann, 2005), including the fact that they are not even necessary for genetic constraints to be generated. What appears not to have been generally appreciated (Blows & Hoffmann, 2005) is how variances and covariances describe n-dimensional genetic spaces; they are not parameters isolated from each other as the common piecemeal approach to their interpretation would suggest. Genetic correlations will only constrain evolution (in an absolute sense) if they result in no genetic variance in the desired direction of selection.
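The possibility of an absolute constraint without any negative genetic correlation can be illustrated with a hypothetical three-trait G matrix. The values are invented so that the eigenvalues are easy to verify by hand; they are not estimates from any data set.

```python
import numpy as np

# Hypothetical G: every trait has genetic variance and no genetic
# covariance is negative, yet the matrix is singular.
G = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(G)   # eigenvalues 0, 1 and 3
rank = np.linalg.matrix_rank(G)        # number of nonzero eigenvalues

# Direction in trait space with no genetic variance (the null eigenvector).
beta_null = np.array([1.0, 1.0, -1.0]) / np.sqrt(3.0)
genetic_variance_in_direction = beta_null @ G @ beta_null
response_null = G @ beta_null          # predicted response to selection

# The same strength of selection along the leading eigenvector.
beta_max = np.array([1.0, 1.0, 2.0]) / np.sqrt(6.0)
response_max = G @ beta_max
```

Selection favouring an increase in traits 1 and 2 together with a decrease in trait 3 is predicted to produce no response at all, whereas selection along the leading eigenvector produces a response three times the length of β.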
Diagonalization of G matrices offers at least three useful insights into the pattern of multivariate genetic variance that are not available by simply inspecting the genetic variances and covariances in isolation. First, diagonalization in theory allows an unambiguous determination of whether genetic constraints may exist by determining the frequency of zero eigenvalues (Blows & Hoffmann, 2005; Mezey & Houle, 2005). Unfortunately, determining the rank of G (the number of nonzero eigenvalues) is not as simple as estimating the eigenvalues themselves, as most estimates of G tend to be nonpositive definite as a consequence of sampling (Hill & Thompson, 1978). In practice, it is only possible to establish what part of the space of G has statistical support, and very large samples would be required to be able to confidently reject the null hypothesis of full rank (Mezey & Houle, 2005). Determining the dimensionality of covariance matrices has been the subject of considerable debate in the statistical (Amemiya, 1985), ecological (Jackson, 1993; Peres-Neto et al., 2005) and genetic literature (Mezey & Houle, 2005; Hine & Blows, 2006). Although consensus is yet to emerge, factor-analytic modelling of genetic covariance matrices (Fig. 2) within a restricted maximum likelihood framework is a flexible and readily available approach to this problem (Meyer & Kirkpatrick, 2005; Hine & Blows, 2006).
Secondly, diagonalization allows a more efficient search for the presence of statistically significant genetic variance. It is often bemoaned by evolutionary biologists that quantitative genetic experiments require depressingly large sample sizes to obtain statistically significant estimates of genetic variances and covariances (Klein, 1974; Koots & Gibson, 1996). Although this is certainly true, the simple fact that individual genetic variances or covariances are not statistically significant does not indicate that some part of the space of G is not well estimated and would attain statistical significance when tested for in an appropriate framework. By using factor-analytic modelling, for example, to directly model the genetic covariance structure, it will often be the case that the first eigenvector of a G matrix (at least) will display statistically significant genetic variance with hearteningly modest sample sizes. This is because the level of genetic variance accounted for by the first genetic eigenvector for a set of functionally related traits is likely to be substantially higher than that of individual traits. Note, however, that the heritability of this linear combination may not be higher than that of individual traits, and is only predicted to be so under certain restrictive conditions (Cheverud, 1981). The importance of this point should nevertheless not be underestimated; quantitative genetics of functionally related sets of traits can be accomplished with modest sample sizes if the questions of interest involve that part of the genetic space which displays the greatest amount of genetic variance. For example, Hine & Blows (2006) showed how statistical support was available for the presence of over 80% of the estimated genetic variance in eight cuticular hydrocarbons of Drosophila serrata (described by a two-dimensional subspace) with a modest half-sib experiment consisting of 355 individuals from 66 sires.
Finally, taking a multivariate approach to the analysis of the quantitative genetic basis of a set of traits potentially provides a very different perspective on the evolution of those traits than would otherwise be obtained. My own experience has involved the issue of the maintenance of genetic variance in male sexually selected traits of D. serrata. Females of this species exert sexual selection on multiple male cuticular hydrocarbons which act as contact pheromones (Hine et al., 2002; Petfield et al., 2005). Analysis of linear and nonlinear sexual selection suggests that female preferences for male CHCs are primarily linear in form, and open-ended (Blows et al., 2004; Chenoweth & Blows, 2005). Therefore, females appear to prefer males with more extreme CHC combinations, at least within the range of male phenotypes that commonly occur within these populations.
Male CHCs are phenotypically and genetically correlated in ways that suggest both resource acquisition/allocation, and other shared aspects of biochemistry are involved in these genetic associations (Blows et al., 2004; Hine et al., 2004). Virtually all male CHCs, under either laboratory or field conditions, display substantial levels of genetic variance (Blows et al., 2004; Hine et al., 2004), consistent with the high levels of genetic variance found for male sexually selected traits in other taxa (Pomiankowski & Møller, 1995). The application of the multivariate breeder's equation, however, suggested that these traits would change very slowly in the direction of female preference; in the order of 1% of a phenotypic standard deviation or less. Further analysis using matrix projection (Fig. 2) indicated that when β was associated with the multivariate orientation of genetic variance, the vast majority of genetic variance can be shown to be orientated at a substantial angle away from the direction of selection (Blows et al., 2004; Hine et al., 2004).
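The geometry behind this kind of comparison can be sketched as follows. The example assumes that the projection of β onto a genetic subspace uses the standard projection matrix A(AᵀA)⁻¹Aᵀ, where the columns of A are the leading eigenvectors of G; the G matrix and β are hypothetical, and the function name is introduced here only for illustration.

```python
import numpy as np


def angle_to_subspace(beta, A):
    """Angle in degrees between beta and its projection onto the
    subspace spanned by the columns of A."""
    proj = A @ np.linalg.inv(A.T @ A) @ A.T   # projection matrix
    p = proj @ beta
    norm_p = np.linalg.norm(p)
    if np.isclose(norm_p, 0.0):
        return 90.0                            # beta is orthogonal to the subspace
    cos_theta = beta @ p / (np.linalg.norm(beta) * norm_p)
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


# Hypothetical G with eigenvalues 3.5, 0.5 and 0.1.
G = np.array([[2.0, 1.5, 0.0],
              [1.5, 2.0, 0.0],
              [0.0, 0.0, 0.1]])
eigvals, eigvecs = np.linalg.eigh(G)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Hypothetical vector of directional selection gradients.
beta = np.array([1.0, -0.5, 0.5])

# Angle between beta and the major axis of genetic variance (about 73 degrees).
angle_gmax = angle_to_subspace(beta, eigvecs[:, :1])

# Angle between beta and the subspace of the first two eigenvectors
# (about 24 degrees).
angle_2d = angle_to_subspace(beta, eigvecs[:, :2])

# Share of total genetic variance carried by the major axis.
share_gmax = eigvals[0] / eigvals.sum()
```

In this invented example about 85% of the genetic variance lies along an axis that is roughly 73° away from the direction of selection, the kind of pattern described above for male CHCs.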
Here then, is an example of how a multivariate perspective completely changed the interpretation of the genetic basis of these traits, and the consequence of sexual selection acting on them. Rather than concluding that genetic variance in male sexually selected traits is maintained at a high level in this species based on univariate levels of genetic variance consistent with many previous studies (Tomkins et al., 2004), we drew the opposite conclusion: genetic variance in male sexually selected traits does not appear to be maintained in the direction of sexual selection at all. Whether or not sexual selection does in fact deplete genetic variance in male traits requires much more work to be demonstrated, and in particular, manipulative evidence for the effect of selection on the genetic variance will need to be obtained. Nevertheless, the important point here is that univariate and multivariate approaches to the same question have resulted in opposite conclusions.
Our understanding of how selection operates in natural populations and the genetic basis of those traits under selection is far from complete. After the application of a few basics of linear algebra, the perspective that one has on multivariate problems in evolutionary biology can change dramatically. It may be that nonlinear selection is much stronger on functionally related sets of traits than was previously envisaged, and the ubiquitous nature of genetic variance may not be as certain as commonly thought. It is sobering to realize that Kingsolver et al. (2001) found only seven papers in which three or more traits had been subjected to an analysis of selection (Appendix A, Blows & Brooks, 2003), and there has been only one serious attempt to determine the dimensionality of a G matrix (Mezey & Houle, 2005). It is just too early to determine whether or not the perspectives revealed by the application of the multivariate approaches outlined here will turn out to be representative.
Although multivariate approaches can change the way in which selection on multiple traits and their genetic basis may be interpreted, the fundamental limitations of the data underlying selection analyses and quantitative genetic parameters are not escaped. The estimation of both γ and G matrices is usually conducted in mensurative studies, with no manipulative evidence to support the patterns found. In addition to this general limitation, the interpretation of G matrices is further restricted by the summative nature of the individual parameters; genetic variances and covariances represent the average effect of all loci that contribute to the phenotypic traits and depend heavily on allele frequency (Falconer & Mackay, 1996). Although cause for concern and careful interpretation, these limitations should also be the impetus for new experimental approaches. For example, applying selection along the eigenvectors of a G matrix is one way to gain manipulative evidence for the multivariate pattern of genetic covariance (a zero eigenvalue predicts no response along that eigenvector, for example). In addition, genomic approaches may provide the opportunity to increase the resolution of G matrices; transcriptional profiling experiments searching for gene expression differences between divergent treatments or populations that incorporate quantitative genetic designs offer the promise of being able to model the genetic covariance among individual transcripts that responded to selection.
Multivariate approaches in biology are often not utilized by many of us who would benefit most by their implementation. Why? At least in some cases, the implementation of multivariate approaches may be restricted by difficulties associated with actually physically applying them (numerically difficult, access to software etc.). In other cases though, it may be that accepting as useful, or even legitimate, the linear transformations of the original variables required by these approaches may be an important factor contributing to our field's preoccupation with univariate approaches. Although it is certainly true that the biological interpretation of linear transformations can be a torturous and subjective exercise (Peres-Neto et al., 2003), the value of the approaches outlined here lies in the establishment of the multivariate pattern of selection or genetic variance. The importance of the perspective these approaches can bring to empirical investigations is independent of any attempt to interpret the role of individual original traits in the transformed variables. Indeed, they are valuable because they do not rely on interpreting the role of individual traits at all.
Many of the ideas and approaches outlined in this paper have been developed in collaboration with Rob Brooks, Steve Chenoweth and Emma Hine, although any misunderstandings herein are my own. Thanks to J. Cheverud, D. Roff and S. West for comments on a previous draft. My research is supported by the Australian Research Council.