Quantitative genetic methods allow inferences about evolutionary processes through the study of evolutionary divergence patterns and their relationship to intrapopulation adult variation (Lande 1979; Ackermann and Cheverud 2002, 2004; Marroig and Cheverud 2004; Monteiro and Gomes-Jr 2005; Perez and Monteiro 2009). The connection between neutral microevolutionary processes and macroevolutionary patterns centers on the additive genetic variance–covariance matrix (**G**) (Lande 1980; Arnold et al. 2001; Jones et al. 2003; Bégin and Roff 2004), which is thought to determine both the response to selection and the pattern of neutral divergence, at least among populations over a small time scale (Lande 1980; Felsenstein 1988; Zeng 1988).

The expected pattern of phenotypic divergence among populations caused by random genetic drift in correlated traits can be used as a null hypothesis to test for neutral evolution (Lande 1979, 1980). The sampling distribution of the change in trait means in one generation ($\Delta\bar{\mathbf{z}}$) has a mean of 0 and variance–covariance matrix $\mathbf{G}/N_e$, the genetic covariance matrix in a population divided by the effective population size (Lande 1979). If the average phenotype of a population *a* is represented by a column vector $\bar{\mathbf{z}}_a$ of polygenic traits with additive genetic and environmental components following multivariate normal distributions (Lande 1980), the probability distribution Φ after *t* generations will be

$$\Phi(\bar{\mathbf{z}}_a, t) = N\!\left(\bar{\mathbf{z}}_a(0),\ \mathbf{G}\,\frac{t}{N_e}\right),$$

which is a normal distribution with a mean equal to that of the initial population and variance–covariance matrix $\mathbf{G}(t/N_e)$ (Lande 1979). If a number of populations are evolving independently (i.e., without gene flow), the expected among-population phenotypic variance–covariance matrix (**B**) is a function of the genetic covariance matrix (**G**), effective population size ($N_e$), and the number of generations (*t*):

$$\mathbf{B} = \mathbf{G}\,\frac{t}{N_e}.$$

As a result, comparing the among-population phenotypic (**B**) and within-group genetic (**G**) variance–covariance matrices provides a way to determine whether genetic drift, taken as a null model, explains the observed pattern of divergence (Lofsvold 1986, 1988; Roff et al. 1999; Ackermann and Cheverud 2002; Bégin and Roff 2004).
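The drift expectation described above, in which the covariance among population means grows as **G** scaled by *t* over the effective population size, can be checked numerically. The following is only a minimal sketch under simplifying assumptions: **G** is held constant, every population starts from the same mean, and each generation's change in means is drawn directly from a multivariate normal distribution rather than arising from simulated individuals. The function name `simulate_drift_means`, the two-trait matrix, and all parameter values are hypothetical and chosen for illustration.

```python
import numpy as np

def simulate_drift_means(G, n_e, t, n_pops, rng):
    """Simulate population mean vectors after t generations of pure drift.

    Each generation, every population mean takes an independent
    multivariate normal step with mean 0 and covariance G / n_e.
    """
    p = G.shape[0]
    means = np.zeros((n_pops, p))   # all populations share the same starting mean
    step_cov = G / n_e              # per-generation sampling covariance
    for _ in range(t):
        means += rng.multivariate_normal(np.zeros(p), step_cov, size=n_pops)
    return means

rng = np.random.default_rng(1)
G = np.array([[1.0, 0.6],
              [0.6, 2.0]])          # hypothetical two-trait G matrix
n_e, t, n_pops = 100, 50, 5000      # hypothetical parameter values

means = simulate_drift_means(G, n_e, t, n_pops, rng)
B_hat = np.cov(means, rowvar=False)  # observed among-population covariance
B_expected = G * t / n_e             # drift expectation

print(np.round(B_hat, 3))
print(B_expected)
```

With many replicate populations, the estimated among-population matrix should approach the expected one up to sampling error; with the handful of populations typical of empirical studies, the estimate is far noisier.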

Because phenotypic covariances are much easier to estimate than their genetic counterparts, a widely used approach to studying the evolutionary mechanisms of divergence is to replace the average **G** with the pooled phenotypic within-group covariance matrix (**W**), provided that the phenotypic covariance matrices of the diverging populations remain similar (Ackermann and Cheverud 2002, 2004; Marroig and Cheverud 2004; Perez and Monteiro 2009). Cheverud (1988) investigated the relationship between genetic and phenotypic correlation matrices using data taken from the literature and concluded that phenotypic correlations were reasonable estimates of the respective genetic correlations (and generally proportional to them, although perhaps not in a strict mathematical sense). A second conclusion from these data was that phenotypic covariances **W** estimated from large samples might approximate **G** more accurately than genetic covariances estimated from small effective sample sizes, at least for morphometric data (Cheverud 1988; Revell et al. 2010). A number of literature-based meta-analyses and empirical results have to some degree corroborated Cheverud's findings (Roff 1995, 1996; Koots and Gibson 1996; Waitt and Levin 1998; Roff et al. 1999). Nonetheless, this approach has been criticized on several grounds (Willis et al. 1991), mostly because **W** is not mathematically proportional to the average **G** (i.e., the two do not differ by a constant ratio). Apart from the issue of similarity and proportionality between matrices, more specific consideration of the actual consequences of using **W** as a surrogate for the average **G** in empirical studies (Bégin and Roff 2004; Klingenberg et al. 2010) should prove fruitful. One such consequence, the impact on type I error rates, is the focus of the present study.

Quantitative genetic theory predicts that the phenotypic covariance matrix within a single population (**P**) is the sum of the genetic (**G**) and environmental (**E**) covariance matrices, **P** = **G** + **E** (Falconer and Mackay 1996). A part-whole correlation is therefore expected between phenotypic and genetic covariances, and phenotypic covariances can be considered an estimate of genetic covariances with added error due to environmental covariances, even if the two are not mathematically proportional.
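A small numerical illustration may help separate the two ideas in this paragraph: a phenotypic matrix built as the sum of genetic and environmental parts can be strongly correlated with the genetic matrix without being proportional to it. The matrices below are hypothetical values invented for this sketch, not estimates from any data set, and the element-wise correlation is used only as a simple summary.

```python
import numpy as np

# Hypothetical three-trait genetic and environmental covariance matrices.
G = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.5, 0.4],
              [0.2, 0.4, 0.8]])
E = np.array([[0.8, 0.1, 0.0],
              [0.1, 0.6, -0.1],
              [0.0, -0.1, 1.0]])

P = G + E  # phenotypic covariance matrix as the sum of the two components

# Element-wise ratios are constant only if P is strictly proportional to G.
ratios = P / G

# Correlation between the unique (upper-triangle) elements of P and G.
iu = np.triu_indices(G.shape[0])
r = np.corrcoef(P[iu], G[iu])[0, 1]

print(np.round(ratios, 2))
print(round(r, 3))
```

Here the ratios vary from element to element, so strict proportionality fails, while the element-wise correlation remains high because **G** is a component of **P**.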

Most of the discussion on the use of **W** as a surrogate for the average **G** revolves around the similarities and differences between phenotypic and genetic covariances in single populations or from literature reviews, and the differences in empirical comparative results obtained when using one kind of estimate or the other. The latter are rare, owing to the difficulty of estimating genetic parameters for a large number of species at the same time (Bégin and Roff 2004). Lande's (1979, 1980) model expects the among-population covariance matrix **B** to be proportional to the average **G** when genetic drift is the sole evolutionary mechanism. For the purpose of divergence-based tests of neutral evolution, the relevant question is therefore not whether **G** and **P** are exactly proportional in single populations, but whether using the pooled phenotypic within-group covariance matrix **W** instead of the average **G** adds enough error (caused by the environmental covariances) to lead to erroneous conclusions. The tests that have been used to compare among-species phenotypic covariances with genetic covariances (Lofsvold 1988; Ackermann and Cheverud 2002, 2004) do not test for exact proportionality between **B** and average **G**, but for similarity in different matrix features, such as the correlation of principal components and the distribution of eigenvalues. The expectation of proportionality rests on a number of assumptions (Lande 1979) that are probably violated in most natural populations (Lofsvold 1988), for example, through the lack of large effective population sizes (Lofsvold 1988), or because of differences in the starting times of lineages (Revell 2007). Furthermore, error in the estimation of the average **G** might lead to unpredictable deviations from the expectation.

Lofsvold (1988) suggested that acceptance of genetic drift as a null hypothesis will be more robust to violations of the model's assumptions than rejection (so type I error rates are of more concern than power), and in real studies it might be hard to determine the actual cause of a rejection, natural selection being only one of the possible explanations. One might expect that using pooled within-group phenotypic covariances instead of genetic covariances would increase the type I error rate, that is, the probability of rejecting a true null hypothesis of genetic drift.

In this study, we examine the consequences of using pooled within-group phenotypic covariance matrices instead of average genetic covariance matrices in the Ackermann and Cheverud (2002) test of genetic drift (referred to as the AC test from here on) in terms of type I error rates, using a simulation of phenotypic evolution in diverging populations. We identify the most relevant parameters and offer a number of recommendations.
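This section does not spell out the mechanics of the AC test. As the test is commonly described, the among-group variances of the group means, projected onto the principal components of **W**, are regressed on the eigenvalues of **W** on a log-log scale, and a slope of 1 is expected under drift. The sketch below follows that description under stated assumptions and is not the authors' implementation: the function name `drift_slope_test`, the diagonal six-trait **W**, and all parameter values are hypothetical, and group means are drawn directly from the drift expectation rather than from a full population simulation.

```python
import numpy as np

def drift_slope_test(W, group_means):
    """Return slope, its standard error, and the t statistic for slope = 1.

    W           : (p, p) pooled within-group covariance matrix
    group_means : (k, p) array of group mean vectors
    """
    eigvals, eigvecs = np.linalg.eigh(W)         # principal components of W
    centered = group_means - group_means.mean(axis=0)
    scores = centered @ eigvecs                  # group means on W's PCs
    among_var = scores.var(axis=0, ddof=1)       # among-group variance per PC

    x, y = np.log(eigvals), np.log(among_var)
    n = x.size
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    stderr = np.sqrt(np.sum(resid ** 2) / (n - 2) / sxx)
    t_stat = (slope - 1.0) / stderr              # compare with t on n - 2 df
    return slope, stderr, t_stat

# Hypothetical example: six traits, means drawn under pure drift,
# with W standing in for G.
rng = np.random.default_rng(7)
W = np.diag([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
t, n_e, k = 50, 100, 2000
group_means = rng.multivariate_normal(np.zeros(6), W * t / n_e, size=k)

slope, stderr, t_stat = drift_slope_test(W, group_means)
print(slope, stderr, t_stat)
```

Because the regression has only as many points as there are traits, its degrees of freedom are small in practice, which is one reason the behaviour of the type I error rate is worth examining by simulation.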