Introduction
In ecology and many other disciplines, we often aim to compare a response variable among several treatments or groups after removing the effect of a third, concomitant variable. An example is the comparison of body condition among several animal populations or groups. Body condition is often measured as the weight (mass) relative to the length of the animal. Weight and length are strongly correlated, and the concept of body condition emerges as the plumpness, fatness, or well-being of the animal after partialling out that correlation, i.e. the size effect. Similar questions in ecology are the comparisons of richness–area relationships among regions, of root : shoot ratios of plants among several treatments, and of C : N ratios among sites. These types of question were dealt with in about 25% (27 of 99 and 44 of 227) of ecological papers (Table 1). A classical difficulty encountered in these and many other contexts is how to remove the size effect or concomitant variable. Three main statistical procedures have been applied to this end.
Table 1. Number of articles (first page of each article in parentheses; ‘–’ = not given) using certain statistical tests to study body condition or other variables in volume 68 of the Journal of Animal Ecology (1999) and volume 80 of Ecology (1999). The total number of papers was 99 and 227, respectively (excluding 4 ‘Forum’ and 3 ‘Comments’ papers, respectively, in which no statistical analyses were given).

Statistical method used | Journal of Animal Ecology: Condition† | Journal of Animal Ecology: Other variables | Ecology: Condition† | Ecology: Other variables
Statistical test of a ratio | 0 | 1 (698) | 1 (2793) | 12 (–)
ancova‡ | 2 (571, 753) | 20 (–) | 2 (989, 1289) | 25 (–)
Statistical test of regression residuals | 4 (73, 571, 595, 753) | 4 (205, 726, 815, 1010) | 2 (989, 1267) | 3 (735, 806, 1522)
Any of the above§ | 4 | 24 | 4 | 40

A third procedure has appeared recently in the ecological literature. This procedure (hereafter ‘residual index’) was recommended by Jakob, Marshall & Uetz (1996) for studying condition. The residual index is computed for each individual (experimental or sampling unit) as the residual from the simple linear regression of volume or mass (appropriately transformed) on the length variable (see below for the formula and Fig. 1 for an illustration). These residuals are then compared among groups with a t-test or an analysis of variance (anova). This procedure was used in 8% and 2·2% of the 1999 papers of the Journal of Animal Ecology and Ecology, respectively, particularly for the study of condition (Table 1). The objectives of this paper are: (1) to point out several pitfalls of the residual index, and (2) to emphasize the use of ancova as the correct alternative.
The residual index vs. the ancova
The procedure of the ‘residual index’ is conceptually similar to the standard design of the one-factor ancova. The model of this ancova design is:
Y_{ij} = µ + α_{i} + β_{within}(X_{ij} − X̄_{i}) + ɛ_{ij}
where Y_{ij} are the values of the response (dependent) variable for the jth subject of the ith group, X_{ij} is the covariate (concomitant variable), µ is the grand mean (of the variable), α_{i} is the Y-intercept (or treatment effect) for the ith group, β_{within} is the slope, X̄_{i} is the covariate mean for the ith group, and ɛ_{ij} is the random deviation (see, e.g. Sokal & Rohlf 1995: 503). As suggested by Porter & Raudenbush (1987) and Winer, Brown & Michels (1991), this model can be rewritten as:
Y_{ij} − β_{within}(X_{ij} − X̄_{i}) = µ + α_{i} + ɛ_{ij}
The term on the right of this expression is similar to that of a one-factor anova model (Y_{ij} = µ + α_{i} + ɛ_{ij}), whereas the term on the left is similar to the ‘residual index’ (see, e.g. Jakob et al. 1996):
e_{ij} = Y_{ij} − (a + bX_{ij}) = Y_{ij} − Ȳ − b(X_{ij} − X̄)
since a = Ȳ − bX̄, where a and b are the intercept and regression coefficient, respectively, of the simple linear regression. Although the ancova is thus quite similar to an anova of the residuals or to an anova of the adjusted values (Winer et al. 1991: 747), it is not identical and yields a different statistical result for a number of reasons (see Maxwell, Delaney & Manheimer 1985 for a comprehensive discussion). First, standard ancova uses a pooled regression coefficient within groups (b within) for adjusting the data (Sokal & Rohlf 1995: 507). This ‘b within’ is a weighted average of the regression coefficients for each group (Keppel 1991: 313) and generally differs from the conventional regression coefficient found on pooling the data, i.e. ignoring the groups, termed ‘overall b’ by Winer et al. (1991: 757). This overall b is the coefficient used by the residual index. The ‘b within’ is the least-squares estimator in the ancova model, whereas the overall b is the estimator in the linear regression model (ignoring the groups) and should not be used if the objective is to compare the groups (Maxwell et al. 1985; Winer et al. 1991).
Another important difference is that in the standard ancova the error degrees of freedom (d.f.) are N − k − 1, where N is the total sample size and k is the number of groups (Sokal & Rohlf 1995: 509). In contrast, in a t-test or anova of the residual index the error d.f. are N − k, although they should be reduced by one, as was done in a similar procedure by Packard & Boardman (1988), because the slope is estimated from the same data (Maxwell et al. 1985; Keppel 1991: 312; Winer et al. 1991: 749). The different d.f. affect the error mean square and hence the F- and P-values. Leaving aside the different regression coefficients, an overestimated error d.f. implies an underestimated error mean square and hence a higher Type I error rate.
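The effect of the error d.f. alone can be worked out directly. With the sums of squares held fixed, F = MS_groups / (SS_error / d.f.), so replacing the uncorrected error d.f. by the corrected ones rescales F in proportion. The following is a minimal Python sketch of that arithmetic, applied to the two residual-index F-values quoted in the numerical examples of this section; the function name `rescale_f` is illustrative and not taken from any of the cited sources.

```python
def rescale_f(f_value, df_used, df_correct):
    """Rescale an F-statistic when only the error d.f. change.

    F = MS_groups / (SS_error / df_error), so with the sums of squares
    held fixed, F is proportional to df_error.
    """
    return f_value * df_correct / df_used


# Residual-index F-values quoted in this section, obtained with N - k error d.f.
packard_boardman = rescale_f(6.54, df_used=14, df_correct=13)
sokal_rohlf = rescale_f(39.6, df_used=17, df_correct=16)

print(f"Packard & Boardman: F(1,13) = {packard_boardman:.2f}")
print(f"Sokal & Rohlf:      F(3,16) = {sokal_rohlf:.2f}")
```

Because the inputs are the rounded published values, the rescaled statistics are approximate. They still differ from the corresponding ancova results, which is expected: the residual index also adjusts with the overall b rather than with ‘b within’.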
Some numerical examples will illustrate these differences. For instance, Sokal & Rohlf (1995: 512) describe thoroughly an example of the standard design of ancova in which the result for the factor (among groups) is F_{3,16} = 195·98, P < 0·0005. An anova of the residual index for the same data set (without correcting the degrees of freedom) would result in F_{3,17} = 39·6, P < 0·0005. Similarly, an ancova reported in Fig. 2c of Packard & Boardman (1988) was F_{1,13} = 6·10, P = 0·028, whereas the anova of the residual index yields F_{1,14} = 6·54, P = 0·023. An artificial data set for which the two procedures give opposite conclusions (P < 0·0005 for the ancova and P = 0·33 for the residual index) is (X, Y): (1, 2), (3, 3), (5, 4) and (7, 4) for one group and (11, 12), (13, 13), (15, 14) and (17, 14) for another. Although the overall conclusion is probably similar for most real data sets, the statistical results differ for the reasons explained above. For instance, for Sokal & Rohlf’s example the ‘b within’ is 21·1 while the overall b is 17·3, and the error d.f. are overestimated by one in the anova of the residuals.
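The artificial data set can be checked by hand or with a few lines of code. The sketch below, in plain Python, computes ‘b within’, the overall b, the standard ancova F for the group effect and the anova F of the residual index (with uncorrected d.f.). It assumes the usual textbook sums-of-squares formulas for a one-factor ancova; the helper name `sums` and the variable names are illustrative.

```python
def sums(xs, ys):
    """Return (Sxx, Sxy, Syy): corrected sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


# Artificial data set from the text: (X values, Y values) for each group.
groups = [
    ([1, 3, 5, 7], [2, 3, 4, 4]),
    ([11, 13, 15, 17], [12, 13, 14, 14]),
]
k = len(groups)
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
n_total = len(all_x)

# Pooled within-group sums give b within; the pooled data give the overall b.
within = [sums(xs, ys) for xs, ys in groups]
w_sxx = sum(s[0] for s in within)
w_sxy = sum(s[1] for s in within)
w_syy = sum(s[2] for s in within)
t_sxx, t_sxy, t_syy = sums(all_x, all_y)
b_within = w_sxy / w_sxx
b_overall = t_sxy / t_sxx

# Standard ancova: adjusted group effect tested against N - k - 1 error d.f.
sse_within = w_syy - w_sxy ** 2 / w_sxx  # error SS around parallel lines
sse_total = t_syy - t_sxy ** 2 / t_sxx   # error SS around one common line
df_ancova = n_total - k - 1
f_ancova = ((sse_total - sse_within) / (k - 1)) / (sse_within / df_ancova)

# Residual index: one-way anova of the residuals from the pooled regression,
# using the uncorrected N - k error d.f.
mx, my = sum(all_x) / n_total, sum(all_y) / n_total
ss_between = 0.0
ss_error = 0.0
for xs, ys in groups:
    res = [(y - my) - b_overall * (x - mx) for x, y in zip(xs, ys)]
    mean_res = sum(res) / len(res)
    ss_between += len(res) * mean_res ** 2  # the residuals have grand mean 0
    ss_error += sum((r - mean_res) ** 2 for r in res)
df_resid = n_total - k
f_resid = (ss_between / (k - 1)) / (ss_error / df_resid)

print(f"b within = {b_within:.3f}, overall b = {b_overall:.3f}")
print(f"ancova:         F({k - 1},{df_ancova}) = {f_ancova:.2f}")
print(f"residual index: F({k - 1},{df_resid}) = {f_resid:.2f}")
```

Under these assumptions the sketch gives a ‘b within’ of 0·35 against an overall b of about 0·89, and F-values of about 117 (ancova, 1 and 5 d.f.) and 1·14 (residual index, 1 and 6 d.f.), in line with the opposite conclusions described above.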
Another advantage of ancova is its flexibility in implementing any aspect of the experimental design (e.g. factorial, nested or repeated-measures designs). For instance, a special design with the factor × covariate interaction allows us to test the assumption of homogeneity of regression coefficients (parallelism assumption), which is essential to the standard ancova (García-Berthou & Moreno-Amich 1993; Sokal & Rohlf 1995). Jakob et al. (1996) criticized the ‘slope-adjusted ratio index’ because it assumes the homogeneity of slopes, without realizing that their residual index also makes this assumption. A hypothetical example will show that the residual index can lead to incorrect biological conclusions if this assumption is not satisfied. An anova of the residual index of the data in Fig. 1 would not detect significant treatment effects on condition (without correcting the degrees of freedom, F_{1,6} = 1·49, P = 0·27), because these effects are size-dependent (ancova, F_{1,4} = 56·3, P = 0·002), i.e. the treatment increases the condition of larger individuals and decreases the condition of smaller individuals. In these situations, when the parallelism assumption is not met, Huitema (1980: 104) suggests ignoring the adjusted treatment effects test (i.e. the factor of the standard ancova) and computing and reporting separate regression slopes for all groups and/or employing the Johnson–Neyman procedure.
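The test of the parallelism assumption can be sketched as a comparison of two error sums of squares: one from a model with a separate slope for each group and one from a model with a common (pooled within-group) slope. The Python sketch below assumes that textbook formulation. The data are hypothetical values invented for illustration (they are not the data of Fig. 1), and the function names are illustrative.

```python
def sums(xs, ys):
    """Return (Sxx, Sxy, Syy): corrected sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy


def slope_homogeneity_f(groups):
    """F-test of the factor x covariate interaction (equal slopes)."""
    k = len(groups)
    n_total = sum(len(xs) for xs, _ in groups)
    parts = [sums(xs, ys) for xs, ys in groups]
    # Error SS when every group has its own slope.
    sse_separate = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in parts)
    # Error SS when all groups share the pooled within-group slope.
    p_sxx = sum(p[0] for p in parts)
    p_sxy = sum(p[1] for p in parts)
    p_syy = sum(p[2] for p in parts)
    sse_common = p_syy - p_sxy ** 2 / p_sxx
    df_num, df_den = k - 1, n_total - 2 * k
    f_value = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return f_value, df_num, df_den


# Hypothetical data (invented for this sketch): two groups with clearly
# different slopes of Y on X.
hypothetical = [
    ([1, 2, 3, 4], [2.1, 2.9, 4.2, 4.8]),
    ([1, 2, 3, 4], [1.0, 3.1, 4.9, 7.0]),
]
f_value, df_num, df_den = slope_homogeneity_f(hypothetical)
print(f"slopes differ? F({df_num},{df_den}) = {f_value:.2f}")
```

For these invented values the statistic is far above the 5% critical value of F for 1 and 4 d.f. (about 7·7), so the common-slope adjustment of the standard ancova, and equally of the residual index, would not be appropriate.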
A final, more technical problem is that even if the assumptions of the linear model hold for the original variables, they will not hold for the residuals. Theil (1965) showed that the residuals will necessarily be intercorrelated and heteroscedastic (see also Hayes & Shonkwiler 1996).
Conclusions
ancova is a well-established statistical procedure that has received an enormous amount of attention and scrutiny in the literature. General linear models such as anova, ancova or simple linear regression involve the computation of various residuals, and the graphical analysis of residuals is also essential in order to verify the assumptions of most statistical tests. However, there is no statistical reference to justify the direct anova of the residuals (residual index) as a correct alternative to the ancova. On the contrary, Maxwell et al. (1985) already showed in an insightful psychometric paper that anova of residuals is an incorrect procedure. More recently, Hayes & Shonkwiler (1996) thoughtfully pointed out that, although there are some complex equivalencies in the simplest cases, such tests of residuals are unnecessary. Similarly, Smith (1999) has criticized a similar use of the residuals in a comprehensive review of the statistics of sexual size dimorphism. He suggested that multiple linear regression (or other general linear models) should generally be used instead of using residuals as data, which leads to misspecified regression equations. Note that ancova can also be computed via multiple linear regression with dummy variables (see, e.g. Neter, Wasserman & Kutner 1985: 853; Kleinbaum, Kupper & Muller 1988: 299).
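The equivalence with multiple regression can be illustrated on the artificial two-group data set given earlier: (X, Y) = (1, 2), (3, 3), (5, 4), (7, 4) for one group and (11, 12), (13, 13), (15, 14), (17, 14) for the other. The Python sketch below regresses Y on an intercept, a 0/1 dummy variable for group and the covariate, solving the normal equations directly. The small `solve` helper is written here only to keep the example self-contained; in practice a statistical package would normally be used.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Artificial data set from the text; dummy = 0 for the first group, 1 for the second.
x = [1, 3, 5, 7, 11, 13, 15, 17]
y = [2, 3, 4, 4, 12, 13, 14, 14]
dummy = [0, 0, 0, 0, 1, 1, 1, 1]

# Design matrix with columns: intercept, group dummy, covariate.
design = [[1.0, d, xi] for d, xi in zip(dummy, x)]

# Normal equations (X'X) beta = X'y for the least-squares fit.
p = 3
xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
intercept, group_effect, slope = solve(xtx, xty)

print(f"intercept = {intercept:.2f}")
print(f"adjusted group difference = {group_effect:.2f}")
print(f"common slope (b within) = {slope:.2f}")
```

The coefficient of the covariate in this regression is the pooled within-group slope (0·35 for these data), and the coefficient of the dummy variable is the difference between the two groups adjusted for the covariate (6·5), i.e. the quantities that the standard ancova estimates.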
In summary, I suggest that the ‘residual index’ should never be used for statistical analyses of condition or any other variable. The various designs of ancova or other linear models of the original response variable are the reliable, well-known methodology called for. Excellent accounts of ancova are given, for instance, in Neter et al. (1985), Keppel (1991) and Sokal & Rohlf (1995) (see also Winer et al. 1991 for a more mathematical treatment). A more comprehensive monograph on ancova and alternative techniques is by Huitema (1980).