The paradigm of body condition: a critical reappraisal of current methods based on mass and length

Authors


Correspondence author. E-mail: jpeig@ub.edu

Summary

1. Body condition is a major concept in ecology addressed in countless studies, and a variety of non-destructive methods are used to estimate the condition of individuals based on the relationship between body mass M and measures of length L. There is currently no consensus about the most appropriate condition index (CI) method, and various traditions have been established within subdisciplines in which ecologists tend to apply that method used previously by their peers.

2. Here, we present a reappraisal of six conventional CI methods: Fulton’s index ($M/L^3$), Quételet’s index (BMI = $M/L^2$), Relative condition (Kn, computed as the observed individual mass divided by the predicted mass $M_i^* = aL_i^b$, where a and b are determined by ordinary least squares (OLS) regression of M against L), Relative mass (Wr, where a and b above are determined from a reference population), the Residual index (Ri, the residuals from an OLS regression of M against L) and ancova. We compare the performance of these methods with that of the Scaled mass index, a novel method which was previously shown to perform better than Ri as a predictor of fat and other body components [J. Peig & A.J. Green (2009) Oikos, 118, 1883].

3. To be reliable, a CI method must successfully account for the changing relationship between M and L as body size changes and growth occurs (i.e. for the scaling relationship between M and L). Using data from three species of small mammals we show that, unlike the Scaled mass index, all six conventional methods fail to do this, and as a result they consistently lead to significant differences in CIs between age classes and sex that are a mere consequence of changes in body size. The Scaled mass index was also particularly successful at detecting changes in CI resulting from high levels of contaminants.

Introduction

Body condition is intimately related to an animal’s health, quality or vigour (Peig & Green 2009), and has been widely claimed to be an important determinant of fitness. A wide range of morphological, biochemical or physiological metrics have been proposed as condition indices (CIs) (Stevenson & Woods 2006). Here, we are only concerned with CIs based on the relationship between body mass (M) and length measurements (L), whose ultimate goal is to interpret variations of body mass for a given body size as an attribute of the individual’s well-being (most typically, variation in the size of energy reserves).

A variety of formulas and statistical methods have been proposed to standardize body size, and there is much debate about which ones are most suitable as CIs (Stevenson & Woods 2006). Among conventional methods, simple ratios (or ratio indices) between M and L or L raised to a specific power (e.g. $M/L$, $M/L^2$, $M/L^3$) have been in use the longest. Examples include Fulton’s index ‘K’ ($K = M/L^3$), still used in some ecological studies, and Quételet’s index or the body mass index (BMI = $M/L^2$), universally applied in health sciences. In fisheries, a popular CI is the so-called Relative condition ‘Kn’, computed as the observed individual mass ($M_i$) divided by the predicted mass ($M_i^*$, where $M_i^* = aL_i^b$). The estimates a and b are empirically determined by ordinary least squares (OLS) regression of M against L (both log-transformed) for the whole study population (LeCren 1951). Even more popular in fisheries is a variant of the Kn index, called Relative mass ‘Wr’, where a and b are determined from a reference population instead of the population under study (Murphy, Willis & Springer 1991).

In recent years, the most widely accepted CI in terrestrial ecology has been the Residual index ‘Ri’, which uses the residuals from an OLS regression of M against one or more length measurements, usually after log transformation (Jakob, Marshall & Uetz 1996; Schulte-Hostedde, Millar & Hickling 2001; Ardia 2005; Schulte-Hostedde et al. 2005). Another popular approach is to conduct an analysis of covariance (ancova), which combines features of linear regression and anova to estimate directly the treatment effect on M while controlling for a concomitant variable of influence, denoted by L (García-Berthou 2001; Velando & Alonso-Alvarez 2003; Serrano et al. 2008).

Currently, no consensus exists about the best CI or the criteria for selecting the most appropriate method in a particular study, and few authors provide a detailed justification of their choice of method. Ecologists and epidemiologists follow traditions within their discipline, hence the continuing use of ratios in fisheries and health sciences and the widespread use of Ri in terrestrial ecology, despite criticisms of these approaches (Albrecht, Gelvin & Hartman 1993; Packard & Boardman 1999; García-Berthou 2001; Green 2001; Freckleton 2002). As we will demonstrate, results may differ dramatically depending on the method of choice, causing concern about the reliability of studies based on less appropriate methods.

Peig & Green (2009) presented a novel CI method called the Scaled mass index ‘$\hat{M}_i$’, which standardizes body mass at a fixed value of a linear body measurement based on the scaling relationship between mass and length, according to equation 1:

Scaled mass index ($\hat{M}_i$):

$$\hat{M}_i = M_i \left[\frac{L_0}{L_i}\right]^{b_{\mathrm{SMA}}} \qquad \text{(eqn 1)}$$

where $M_i$ and $L_i$ are the body mass and linear body measurement of individual i respectively; $b_{\mathrm{SMA}}$ is the scaling exponent estimated by the standardized major axis (SMA) regression of lnM on lnL; $L_0$ is an arbitrary value of L (e.g. the arithmetic mean value for the study population); and $\hat{M}_i$ is the predicted body mass for individual i when the linear body measure is standardized to $L_0$. Comparing $\hat{M}_i$ values between different populations or studies simply requires use of the same $L_0$ value in eqn 1. In a variety of vertebrate species (five small mammals, one bird and one snake), the Scaled mass index performed better than the Residual index ‘Ri’ as a predictor of variations in fat and protein reserves as well as other body components (Peig & Green 2009).
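The calculation in eqn 1 can be sketched in a few lines of Python. This is a minimal illustration, assuming the published form of the index in Peig & Green (2009), i.e. body mass multiplied by the ratio L0/Li raised to the power bSMA. The function name and the three individuals are hypothetical; the exponent (2·71) and L0 (98·22 mm) are the wood-mouse values reported later in this paper.

```python
def scaled_mass_index(mass, length, b_sma, l0):
    """Mass the individual would be predicted to have at body length l0 (eqn 1)."""
    return mass * (l0 / length) ** b_sma

# Hypothetical individuals: (body mass in g, body length in mm). Not study data.
animals = [(18.0, 90.0), (24.0, 98.22), (30.0, 105.0)]

B_SMA = 2.71  # wood-mouse SMA exponent as reported in Table 2
L0 = 98.22    # wood-mouse mean body length (mm) as reported in the Methods

smi = [scaled_mass_index(m, l, B_SMA, L0) for m, l in animals]

for (m, l), value in zip(animals, smi):
    print(f"M = {m:5.1f} g, L = {l:6.2f} mm -> scaled mass = {value:5.2f} g")
```

An individual whose length already equals L0 keeps its observed mass, a shorter individual is scaled up, and a longer one is scaled down, which is what standardizing all animals to the same length implies.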

The overall objective of this paper is to critically reappraise current CI methods and compare their performance with that of the Scaled mass index ‘$\hat{M}_i$’. Using field data from small mammals, we compare the performance of $\hat{M}_i$, Residual index ‘Ri’, ancova, Body mass index ‘BMI’, Fulton’s index ‘K’, Relative condition ‘Kn’, and Relative mass ‘Wr’. We demonstrate empirically that conventional CIs can be inherently biased with regard to animal size, and tend to change condition scores in larger animals owing to violations of statistical assumptions and failure to account for growth and scaling relationships.

We included ratio methods in our study, because they are still widely in use and have not previously been compared with $\hat{M}_i$. Although ratio methods have been widely criticized, many of their problems are shared by Ri and ancova. Exploring the performance of ancova was also important because, like $\hat{M}_i$, it has been advocated as a more reliable method than ‘Ri’ (García-Berthou 2001). The ancova is not strictly a CI but rather an inferential test where individual scores are absent, making validation via correlations with body components such as fat reserves impossible.

First, to help explain why CIs can be unreliable, we consider basic principles underlying the construction of condition indices based on mass and length data.

First principles for a condition index

As pointed out by Kotiaho (1999), if CIs are assumed to reflect, or be validated by, either the ‘absolute’ or ‘percentage’ amount of energetic tissue, they should not be used for comparisons of condition among individuals of different size. This is because, at each stage of growth and development, there is both an optimal range of structural and energy capital (in absolute or % values) and an optimal distribution of this capital among different body components. In other words, the proportional or absolute amount of energy stores can be expected to change with normal growth processes, even in an optimal environment (e.g. one free from pathogens and disturbance, with non-limiting resources). It would therefore be erroneous to assert that adults were in better ‘condition’ or ‘well being’ than juveniles because of a greater absolute amount of fat or protein, or that juveniles were in better condition than adults because of a greater relative content (as %fat, %protein, etc.). The same generally applies to comparisons between sex or subspecies (Gallagher et al. 1996). Kotiaho’s premise has profound implications for condition estimates and the search for a suitable method. Essentially, a method is required that accounts for normal growth processes (i.e. scaling), thus allowing a valid comparison between individuals of a different body size.

Growth leads to strong correlations between M and most linear body measurements (L) (Hoppeler & Weibel 2005; Peig & Green 2009). Because increases in both M and L are parts of the same growth phenomenon (Thompson 1961), both M and L are indicators of body size per se which require some type of mutual standardisation. Strictly speaking, however, as body growth involves not only variation in body size but also in body composition (Huxley 1932; Calder 1984), the aim of any CI is not to control for body size but rather for growth effects as a whole and their consequences for scaling (Peig & Green 2009).

Virtually all work on biological scaling assumes a power function of the form $Y = \alpha X^{\beta}$, where the parameter α is a constant and the parameter β, the ‘allometric or scaling exponent’, determines the dimensional balance between Y and X. The power (nonlinear) relationship is supported by the notion of growth as a multiplicative process of living matter (Shea 1985). β equals the ratio of the specific growth rates of the dimensions Y and X (i.e. $dY/(Y\,dt)$ and $dX/(X\,dt)$, Shea 1985). As the same time interval applies for changes in M and L, estimating body condition from M-L data is time-independent and unaffected by the rate of growth.

Morphogenesis is the mechanism controlling the body proportions (i.e. between M and L) and thus determines the scaling exponent β (Roth & Mercer 2000). This process is largely controlled by genes which regulate how body parts differentiate and orientate to form a well-structured organism (Conlon & Raff 1999; Hogan 1999). As monomorphic species have their own body plan distinct from other species, the assumption that β is species specific (or sex specific for dimorphic species) is likely to be an adequate approximation for representative sample sizes. This is supported by variation in scaling trends between M and L at inter- and intraspecific levels (e.g. Green, Figuerola & King 2001). Comparisons of CIs are valid when made among groups sharing the same β value for M-L relationships, regardless of variation in growth rate between individuals, populations or sex. Other comparisons of CIs (e.g. between sex that lack a common morphogenetic pattern) are nonsensical.

According to the dimensional balance between volume V (closely related to M) and length L (i.e. $M \cong V \propto L^3$), under ‘isometry’ the scaling exponent that relates M and L is 3. This provides a rationale for the formulation of Fulton’s index ($M/L^3$). In practice, β for M against L usually deviates from the predicted value of 3, owing to ‘allometry’. Body condition should be estimated at no higher than the species level, which comprises the scaling effects owing to ‘heterauxesis’ and ‘individual allomorphosis’ (Gould 1966). ‘Heterauxesis’ (also called ‘ontogenetic allometry’ or ‘growth allometry’) refers to scaling during the growth of an individual, whilst ‘allomorphosis’ (or ‘static allometry’) is scaling among conspecifics at the same stage of determinate growth (usually adults), but varying in size. Although heterauxesis and allomorphosis require different types of data, both types of scaling are relevant to body condition as they are parts of a common morphogenetic phenomenon (Stern & Emlen 1999).

To enable a meaningful comparison between individuals of different sizes, a CI method must remove the effects of ontogenetic growth on the M-L relationship through standardisation. Unless some factor affects the well-being of animals at a specific growth stage (e.g. iguanas can become inefficient foragers when they reach a certain size, Wikelski, Carrillo & Trillmich 1997), mean condition scores should be equal for different age classes, and confirmation of this indicates that size and composition have been properly standardized (i.e. growth effects have been accounted for). Similarly, even if a species shows sexual size dimorphism, if the sexes have a similar body design (i.e. a similar morphogenesis), we should expect no differences in CIs (i.e. well-being) between sex.

Materials and methods

Matching the scaled mass index to first principles

The Scaled mass index (eqn 1) is based on a mathematical rearrangement of the simple power function $Y = \alpha X^{\beta}$ adapted for the study of body condition (see Appendix S1). The scaling exponent β (estimated by bSMA in eqn 1) is the most important parameter of interest, and should ideally be estimated from a large data set (e.g. large reference populations covering the whole range of body size).
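Appendix S1 is not reproduced here, so the following is only a plausible sketch of that rearrangement, not necessarily the authors' own derivation. It assumes that every individual shares the exponent β and that individual differences in condition are carried by an individual-specific constant, written here as $\alpha_i$ (a label introduced for illustration).

```latex
\begin{align}
M_i &= \alpha_i L_i^{\beta}
  \quad\Longrightarrow\quad
  \alpha_i = \frac{M_i}{L_i^{\beta}} \\
\hat{M}_i &= \alpha_i L_0^{\beta}
  = M_i \left(\frac{L_0}{L_i}\right)^{\beta},
  \qquad \beta \approx b_{\mathrm{SMA}}
\end{align}
```

Under these assumptions, substituting the fixed length $L_0$ into each individual's own power curve gives the mass predicted at that length, which has the same form as eqn 1.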

β is estimated by regression between M and L, following natural log transformation of the power equation (i.e. lnM = lnα + βlnL). Because of the natural M-L interdependence and their distinct scale of measurement, the scaled mass index relies on standardized major axis (SMA) regression to estimate the regression coefficient β (≈bSMA). SMA (also called reduced major axis), a form of model II regression, assumes variation in both Y and X variables (see below), and is recommended to ascertain the structural relationship between morphometric data, when the scaling exponent is the parameter of primary interest (Seim & Sæther 1983; LaBarbera 1989; Warton et al. 2006).

The Scaled mass index standardizes all individuals to the same L value and adjusts their body mass to that which they would have at their new L value in accordance with the scaling trend between M and L. In other words, it standardizes all individuals to the same growth phase. For the purpose of comparing condition between individuals, phenomena of body growth variation linked to morphogenesis (e.g. ontogeny or sexual size dimorphism) can be seen as shifts along the line (sensu Warton et al. 2006) described by the scaling exponent (estimated by bSMA).

Thus, unlike conventional CI methods, the Scaled mass index is compatible with the following precepts underlying studies of condition (see Peig & Green 2009 and Table S1):

  • 1 Growth is a multiplicative process of living material (Huxley 1932; Shea 1985), which makes nonlinear (power) models for M-L relationships more appropriate than linear ones.
  • 2 The existence of allometry (i.e. departures from isometry), making empirical estimation of the scaling exponent necessary (Gould 1966; Schmidt-Nielsen 1984).
  • 3 The specificity of the morphogenetic pattern determining the body proportions for a particular species, subspecies, or sex within a species (Conlon & Raff 1999; Hogan 1999), hence the need for a reliable estimate of the scaling exponent in each case.
  • 4 The mutual interdependence of M and L driven by body growth, with both being inherent indicators of true size (i.e. body volume) (Thompson 1961).
  • 5 The concomitant change of body composition with size owing to morphogenesis, such that only individuals at the same growth phase are strictly comparable (Kotiaho 1999).

A case study framework

We compared the performance of seven different CI methods using data from three species of small mammals collected in the Garraf Massif, northeast Spain from 2000 to 2003 for an ecotoxicological survey. Wood mice (Apodemus sylvaticus Linnaeus) (n = 272), Algerian mice (Mus spretus Lataste) (n = 144), and greater white-toothed shrews (Crocidura russula Hermann) (n = 142) were live-trapped in riparian habitats differing in pollution intensity. Two sites, ‘Medium’ and ‘High’ respectively, were affected by landfill leachates rich in heavy metals and organic pollutants from the largest dumping site in Spain (Sánchez-Chardi et al. 2007, J. Peig unpublished data). A third riparian site unaffected by landfill leachates was selected as a ‘reference’ site.

For each individual, body mass (M) was measured with a Pesola® spring balance (±0·1 g). Stomach contents, embryos, and parasites were also weighed using a digital balance (±0·001 g) to correct final body mass. Four linear body measurements (L values) [body (from snout-to-anus), tail, left hind foot, and left ear length] were taken with a square calliper to the nearest millimetre. Each individual was sexed and the degree of sexual maturity determined by standard examination of reproductive structures (e.g. Díaz & Alonso 2003). Specimens were divided into three age classes (I, II or III, see Appendix S2 for details).

Selecting mass–length variables and data set

We examined bivariate linear correlations of M against L (both ln-transformed), using original length variables or a combination of them following principal component analysis (see Appendix S3). As body length was the linear measurement most correlated with M, it was selected as the L variable for all CI methods (Appendix S3). A Gaussian distribution of lnM and lnL was assumed based on the Kolmogorov–Smirnov Z test (P > 0·05) and P-P plots.

Each of the seven methods was used to calculate a CI for all individuals. According to recommended procedures for computing $\hat{M}_i$ and Wr, only the most reliable data were used to estimate key parameters (Murphy, Willis & Springer 1991; Peig & Green 2009). Hence, to calculate $\hat{M}_i$, we first performed bivariate (lnM-lnL) plots to establish which set of individuals provided the most reliable estimate (bSMA) of the scaling exponent β. For all species, the strength of fit improved when only individuals from the reference population were included: the strength of association ($r^2$ × 100) between lnM and lnL increased from 84% (all individuals) to 87% (reference group) for wood mice, from 77% to 93% for Algerian mice, and from 35% to 47% for shrews. A likely reason is that animals inhabiting polluted areas are affected by toxins which disrupt growth trajectories, i.e. the true M-L relationship or β. The phenotypic variance in body length can be used as a measure of developmental instability in small mammals (Velickovic 2007). For each study species, the total variability in body length increased in parallel to pollution intensity, although differences were only significant in one species [variance in body length: $s^2_{\mathrm{Ref.}}$ = 61·30, $s^2_{\mathrm{Med.}}$ = 84·12, $s^2_{\mathrm{High}}$ = 88·62 (n = 272, P = 0·277) for wood mice; $s^2_{\mathrm{Ref.}}$ = 37·88, $s^2_{\mathrm{Med.}}$ = 43·64, $s^2_{\mathrm{High}}$ = 44·96 (n = 144, P = 0·020) for Algerian mice; and $s^2_{\mathrm{Ref.}}$ = 24·60, $s^2_{\mathrm{Med.}}$ = 35·65, $s^2_{\mathrm{High}}$ = 43·43 (n = 142, P = 0·157) for shrews]. Thus, although $\hat{M}_i$ was applied to the whole sample, the bSMA was estimated using the reference sample. Likewise, parameters α and β required for Wr were obtained from the reference population.

Relating condition indices to age, sex and location

Age class and sex were selected as natural factors causing differences in body size, measured as either M or L. Using seven CI methods, we analyzed the differences in CIs between age classes and between the sexes. CIs are also used to compare groups in different habitats or undergoing different treatments. Hence, we considered the influence of study site (i.e. pollution intensity) on CIs. A detailed interpretation of site/pollution effects on CIs will be given elsewhere.

For $\hat{M}_i$ (eqn 1), bSMA was calculated from the reference group using RMA for Java v.1.19 software (Bohanak & van der Linde 2004). The arithmetic mean of body length from the whole population was taken as L0 [98·22 mm (n = 272) for wood mice, 79·92 mm (n = 144) for Algerian mice, and 67·03 mm (n = 142) for shrews]. The Body mass index (BMI = $M/L^2$) and Fulton’s index (K = $M/L^3$) were computed using raw mass-length data. A multiplier of $10^3$ for the BMI and $10^6$ for K was included to give conventional units of kg m$^{-2}$ and kg m$^{-3}$ respectively. For Relative condition ($K_n = M/(aL^b)$), a and b were estimated from the whole sample by OLS regression of the linearized power equation $\ln M = \ln a' + b \ln L$ [where a in Kn equals $e^{\ln a'}$]. The same procedure was used to compute Relative mass (Wr), but in this case a and b were obtained from the reference population. The Residual index (Ri) was calculated from an OLS regression of lnM against lnL. Finally, for ancova, lnM was the dependent variable and lnL the covariate for each analyzed factor (García-Berthou 2001).
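The conventional indices described above can be sketched in plain Python as follows. This is an illustration only: the masses, lengths and the choice of "reference" animals are hypothetical, the helper `ols_fit` is not part of any software used in the study, and ancova is omitted because it yields no individual scores.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of an ordinary least squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical masses (g) and body lengths (mm); not data from the study.
mass = [14.0, 17.5, 18.0, 22.5, 24.0, 29.0, 30.5]
length = [80.0, 85.0, 90.0, 95.0, 100.0, 105.0, 110.0]
reference = [0, 2, 4, 6]  # hypothetical positions of reference-site animals

ln_m = [math.log(m) for m in mass]
ln_l = [math.log(l) for l in length]

# Ratio indices on raw data; the multipliers convert g/mm^2 and g/mm^3
# to kg/m^2 and kg/m^3.
bmi = [1e3 * m / l ** 2 for m, l in zip(mass, length)]
fulton_k = [1e6 * m / l ** 3 for m, l in zip(mass, length)]

# Whole-sample OLS fit of ln M on ln L supplies ln a' and b for Kn and Ri.
ln_a, b = ols_fit(ln_l, ln_m)
kn = [m / (math.exp(ln_a) * l ** b) for m, l in zip(mass, length)]
ri = [y - (ln_a + b * x) for x, y in zip(ln_l, ln_m)]

# Wr uses the same formula, but a and b come from the reference subset only.
ln_a_ref, b_ref = ols_fit([ln_l[i] for i in reference],
                          [ln_m[i] for i in reference])
wr = [m / (math.exp(ln_a_ref) * l ** b_ref) for m, l in zip(mass, length)]
```

When Kn and Ri are built from the same log-log OLS fit, Kn is simply the exponential of Ri, so the two indices rank individuals identically.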

The resulting CI scores were analyzed with respect to age, sex and location with parametric (anova and Student’s t-test) and non-parametric tests (Kruskal–Wallis H test and Mann–Whitney U test) to identify significant (at α < 0·05) or marginal (at 0·05 < α < 0·10) effects. For parametric tests, the assumption of homoscedasticity was explored via the Levene test at α = 0·05. For ancova, we tested the critical assumption of parallelism in regression lines between factor levels at α = 0·05 (García-Berthou 2001).

Complementary biostatistical assays

The different CI methods use different approaches for size standardisation and different assumptions about the relationships between M and L (see Table S1), which we tested. We assessed the underlying nature (linear vs. nonlinear) of the relationship between the selected M and L variables. We also examined the scaling exponents assumed by each CI method. Log-transformation usually normalizes data before using parametric models (linear regression, ancova, etc.), and is also used to fit the conventional M-L relationship ($Y = \alpha X^{\beta}$) (Marshall, Jakob & Uetz 1999; García-Berthou 2001; Green 2001). Thus, the values of β can be compared using the regression coefficients b obtained from different linear models which take the general form $Y = a + bX$. For $\hat{M}_i$, the model used by SMA regression can be described as $Y_1 = b_{\mathrm{SMA}} Y_2$ (McArdle 2003), where there is no independent variable and both variables ($Y_1$, $Y_2$) are subjected to natural variability and measurement error. The slope bSMA minimizes the areas of triangles (in the y- and x-axis direction from a lnM-lnL plot) between data points and the best-fitted line (McArdle 1988). Ri, Kn and Wr indices are based on the OLS (or Model I) regression $Y_i = b_{\mathrm{OLS}} X_i + \varepsilon_i$, where there is an independent (X) variable with no natural variability or measurement error. It thus assumes an extremely asymmetrical relationship between Y and X, the slope bOLS being drawn to minimize the sum of squares between data points and the line in the y-axis direction only. The ancova model can be formulated as $Y_{ij} = a_i + b_{\mathrm{pooled}}(X_{ij} + \bar{X}_i) + \varepsilon_{ij}$, combining features of model I regression (based also on the principle of least squares in the y plane) with one-way anova (García-Berthou 2001). ancova thus computes an average slope (bpooled) of the regression coefficients defined by each factor level ($a_i$), whose lines are also fitted by assuming an independent covariate with no natural variability or measurement error. Finally, the BMI and K index use fixed values of β (referred to below as bpresumed).
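The difference between the OLS and SMA slopes can be made concrete with a short sketch. It relies on the standard result that the SMA slope equals the ratio of the standard deviations of the two variables, carrying the sign of their correlation, which is equivalent to the OLS slope divided by the correlation coefficient r. The data and the function name are hypothetical.

```python
import math

def ols_and_sma_slopes(x, y):
    """Return (b_ols, b_sma, r) for paired samples x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    b_ols = sxy / sxx                               # least squares in y only
    b_sma = math.copysign(math.sqrt(syy / sxx), r)  # ratio of SDs, sign of r
    return b_ols, b_sma, r

# Hypothetical body lengths (mm) and masses (g); not data from the study.
ln_l = [math.log(v) for v in (80.0, 85.0, 90.0, 95.0, 100.0, 105.0, 110.0)]
ln_m = [math.log(v) for v in (14.0, 17.5, 18.0, 22.5, 24.0, 29.0, 30.5)]

b_ols, b_sma, r = ols_and_sma_slopes(ln_l, ln_m)
print(f"b_OLS = {b_ols:.3f}, b_SMA = {b_sma:.3f}, r = {r:.3f}")
```

Because |r| is at most 1, the OLS slope is never steeper than the SMA slope, and the gap widens as the correlation weakens. The reported wood-mouse reference values appear consistent with this relation: 2·34/√0·75 ≈ 2·70, close to the bSMA of 2·71 in Table 2.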

We compared values of bSMA, bOLS, bpooled, and bpresumed and considered their relationship with the above first principles. Furthermore, we evaluated the close relationship between some of the CI methods. Some methods produce CIs which are highly intercorrelated, yet there remain important differences in their interpretation (see also Peig & Green 2009). Except for calculation of bSMA, statistical analyses were performed with SPSS v.15 (SPSS Inc., Chicago, USA).

Results

Comparing condition indices in relation to age, sex and location

Relative age

Body mass and body length differed significantly between age classes for each species (Kruskal–Wallis H test: P < 0·001), except for marginal differences observed for body length in shrews (P < 0·1). As expected, both aspects of body size tended to increase with ontogeny for all species (Figure S1).

There were major differences in the relationship between CI and age class between methods. The Scaled mass index did not show age differences in CI for any species, whereas conventional methods usually indicated major differences between age classes (Table 1, Fig. 1). For all species, Ri, ancova, BMI and Kn showed marked differences in CIs among age classes. Fulton’s index (K) showed significant differences in wood mice, marginal differences in Algerian mice, but no differences in shrews. In contrast, Relative mass (Wr) showed major differences for shrews and wood mice, and marginal differences for Algerian mice.

Table 1. Effects (P-values) of age class, sex and location (population) on body condition estimates for three species of small mammals according to seven different methods. Bold type indicates a significant factor effect at α = 0·10. See Table 2 and introduction for details of each method

| Factor / species | Test | $\hat{M}_i$ | Ri | ancova | BMI | K | Kn | Wr |
|---|---|---|---|---|---|---|---|---|
| Relative age | | | | | | | | |
| Wood mouse | PA | 0·741 | <0·001 | <0·001†* | <0·001 | 0·022* | <0·001 | 0·020 |
| | NP | 0·775 | <0·001 | – | <0·001 | 0·030 | <0·001 | 0·007 |
| Algerian mouse | PA | 0·368 | <0·001 | <0·001†* | <0·001* | 0·073 | <0·001* | 0·093 |
| | NP | 0·471 | <0·001 | – | <0·001 | 0·104 | <0·001 | 0·126 |
| Greater w-t shrew | PA | 0·441 | <0·001 | <0·001 | 0·083 | 0·658 | <0·001 | 0·003 |
| | NP | 0·318 | <0·001 | – | 0·054 | 0·497 | <0·001 | 0·002 |
| Sex | | | | | | | | |
| Wood mouse | PA | 0·792 | 0·099 | 0·086† | 0·037 | 0·290 | 0·135 | 0·350 |
| | NP | 0·508 | 0·014 | – | 0·003 | 0·759 | 0·014 | 0·054 |
| Algerian mouse | PA | 0·450 | 0·119 | 0·119 | 0·096 | 0·345 | 0·105 | 0·356 |
| | NP | 0·982 | 0·263 | – | 0·244 | 0·787 | 0·261 | 0·803 |
| Greater w-t shrew | PA | 0·939 | 0·949 | 0·949 | 0·993 | 0·917 | 0·911 | 0·942 |
| | NP | 0·761 | 0·682 | – | 0·640 | 0·783 | 0·682 | 0·606 |
| Population | | | | | | | | |
| Wood mouse | PA | 0·017* | 0·013 | 0·014* | 0·013* | 0·025* | 0·012* | 0·013* |
| | NP | 0·019 | 0·010 | – | 0·009 | 0·030 | 0·010 | 0·011 |
| Algerian mouse | PA | 0·045* | 0·312* | <0·001†* | 0·219* | 0·059* | 0·202* | 0·057* |
| | NP | 0·109 | 0·570 | – | 0·591 | 0·194 | 0·571 | 0·189 |
| Greater w-t shrew | PA | 0·001 | 0·004 | 0·004 | 0·001 | 0·002 | 0·003 | 0·001 |
| | NP | 0·002 | 0·011 | – | 0·002 | 0·002 | 0·011 | 0·004 |

*P < 0·05 for Levene’s homoscedasticity test.
†Heterogeneity of slopes in the ancova method at α = 0·05.
PA, parametric tests (one-way anova or Student’s t-test).
NP, non-parametric tests (Kruskal–Wallis H test or Mann–Whitney U test).
–, no non-parametric test is reported for ancova.
Figure 1.

 Ontogenetic trends in body condition index (CI) estimated by seven different methods. There were no significant differences between age classes for $\hat{M}_i$. Conventional methods produced CIs that differed between age classes, with the exception of K for shrews (Table 1). Where appropriate, alphabetic characters (a,b,c) below each plot summarize pairwise post-hoc comparisons among age classes after Bonferroni correction (age classes not sharing letters were statistically different). CI values exceeding 1·5 times the interquartile range of boxplots are marked with black dots. *Marginal means and 95% CI for ancova were evaluated at ln (body length) = 4·58 for wood mice, 4·38 for Algerian mice, and 4·20 for shrews.

Condition index values from conventional methods increased consistently with age class, with the exception of a sequential decrease observed for K in wood mice (Fig. 1). The BMI and those CIs based on the principle of least squares along the y-axis (i.e. Ri, ancova, Kn and Wr) showed the same trend with age class as did M and L (Figs 1 and S5). This indicates a systematic bias towards higher CI values in larger individuals.

Sex

Sexual size dimorphism was marked in wood mice (M: ♂ > ♀, P < 0·01; L: ♂ > ♀, P < 0·01) and Algerian mice (M: ♂ > ♀, P < 0·05; L: ♂ > ♀, P > 0·10), but not in shrews (♂ ≈ ♀, P > 0·10 for both M and L) (Figure S1). For all species, comparisons of CIs between the sexes were legitimate because the scaling relationship between M and L was the same for both sexes, as shown by a lack of difference in slope (P > 0·10) when fitting a common log-log SMA regression. Thus, despite sexual differences in size, we assumed the same morphogenetic pattern for males and females.

The Scaled mass index showed no sexual differences for any species (Table 1, Fig. 2). In contrast, all conventional methods (except for the Fulton’s index ‘K’) showed significant or marginal effects of sex on CIs, for at least one species. These statistical differences were exclusively recorded in the size-dimorphic species (i.e. wood mice and Algerian mice, Figure S1). Even when differences between these two species were not significant, most conventional methods tended to inflate CIs for males, owing to their larger body size (Fig. 2).

Figure 2.

 Sexual trends in body condition index (CI) estimated by seven different methods. There were no significant differences between the sexes for $\hat{M}_i$, even for size-dimorphic species (i.e. wood mice and Algerian mice). In contrast, most conventional methods yielded statistical differences in CIs between the sexes (at α = 0·10), and sex-differences in CIs were male-biased in accordance with the pattern of size-dimorphism. CI values exceeding 1·5 times the interquartile range of boxplots are marked with black dots. *Marginal means and 95% CI for ancova were evaluated at ln (body length) = 4·58 for wood mice, 4·38 for Algerian mice, and 4·20 for shrews.

Location

According to $\hat{M}_i$, sampling location had significant effects on condition for all three species (Table 1). Conventional CIs for wood mice and shrews consistently varied with location. For Algerian mice, Ri, BMI and Kn failed to detect a difference between locations. In contrast, ancova, K and Wr suggested there were differences in condition among populations (despite violations of statistical assumptions), in accordance with $\hat{M}_i$. Because Algerian mice were subjected to the same ecological disturbance as the other two species, it seems likely that there was a real difference in condition between locations, which Ri, BMI and Kn failed to detect.

Comparing assumptions about scaling

The relationship between M and L was clearly nonlinear for all species (Appendix S4). The various CI methods relied on different values for the scaling exponent β (Table 2). For all three mammal species, the various regression coefficients (bSMA, bOLS, bpooled) from the linear models suggested an allometric/power relationship between M and L. Although b values of 3 are expected if growth follows simple geometric rules, in reality β may deviate somewhat from this value because growth is not isometric. For instance, bSMA = 2·71 appears to be the most reliable approximation for β in wood mice (Table 2) because $\hat{M}_i$ is the only CI independent of body size (i.e. independent of ontogeny and sex). Fulton’s index ‘K’ assumes an exaggerated value of 3, thus underestimating the relative mass of larger animals and causing a progressive reduction in CIs with ageing and growth (Fig. 1). On the other hand, ‘b’ values below 2·71 (≅ β) in wood mice inflate CIs in larger animals, producing higher values in older individuals and in males (e.g. Ri, ancova, BMI, Kn, Wr in Figs 1 and 2).
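The direction of these biases follows directly from the arithmetic of ratio indices, as the following sketch illustrates. It assumes idealized animals that lie exactly on a power curve with β = 2·71 (the wood-mouse bSMA), so that they differ in size but not in condition; the constant `ALPHA` and the lengths are hypothetical.

```python
BETA = 2.71    # assumed true scaling exponent (wood-mouse bSMA, Table 2)
ALPHA = 1e-4   # hypothetical constant; only the shape of the curve matters

# Hypothetical body lengths (mm) spanning small to large individuals.
lengths = [80.0, 90.0, 100.0, 110.0]
# Masses placed exactly on M = ALPHA * L**BETA, i.e. identical condition.
masses = [ALPHA * l ** BETA for l in lengths]

def ratio_index(mass, length, b):
    """Generic ratio index M / L**b (b = 2 gives BMI, b = 3 Fulton's K)."""
    return mass / length ** b

results = {b: [ratio_index(m, l, b) for m, l in zip(masses, lengths)]
           for b in (2.0, BETA, 3.0)}

for b, values in results.items():
    trend = ", ".join(f"{v:.3e}" for v in values)
    print(f"b = {b:4.2f}: {trend}")
```

With b below β the index rises with length, with b above β it falls, and only b = β returns the same value for every animal, which is the size-independence expected of a reliable CI.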

Table 2. Details of scaling exponents β used by different condition index (CI) methods in three species of small mammals. From the conventional mass-length relationship (i.e. $M = \alpha L^{\beta}$), the slope β can be estimated by linear models fitted to an lnM-lnL biplot: bSMA is the regression coefficient of a standardized major axis (SMA) regression ($Y_1 = \alpha' + \beta Y_2$), bOLS is the regression coefficient of an ordinary least squares (OLS) regression ($Y = \alpha' + \beta X$), and bpooled is the regression coefficient of the ancova model ($Y_{ij} = \alpha_i + \beta(X_{ij} + \bar{X}_i) + \varepsilon_{ij}$). bpresumed are fixed values of β and are not estimated from linear models. n is lower for methods in which a reference population was used to calculate β

| CI method | Type of scaling exponent | Wood mouse n | Wood mouse $r^2$ | Wood mouse value [95% CI] | Algerian mouse n | Algerian mouse $r^2$ | Algerian mouse value [95% CI] | Greater w-t shrew n | Greater w-t shrew $r^2$ | Greater w-t shrew value [95% CI] |
|---|---|---|---|---|---|---|---|---|---|---|
| Scaled mass index ($\hat{M}_i$) | bSMA | 97 | 0·75* | 2·71 [2·43, 2·99] | 30 | 0·88* | 3·26 [2·83, 3·69] | 47 | 0·22* | 2·68 [1·97, 3·39] |
| Residual index (Ri) | bOLS | 272 | 0·70* | 2·18 [2·01, 2·34] | 144 | 0·59* | 2·08 [1·79, 2·36] | 142 | 0·12* | 0·79 [0·44, 1·13] |
| ancova method for Relative age | bpooled | 272 | † | 1·36 [1·11, 1·60] | 144 | † | 1·13 [0·86, 1·39] | 142 | 0·22* | 0·63 [0·30, 0·97] |
| ancova method for Sex | bpooled | 272 | † | 2·13 [1·96, 2·31] | 144 | 0·60* | 2·05 [1·77, 2·34] | 142 | 0·11* | 0·79 [0·44, 1·14] |
| ancova method for Location | bpooled | 272 | 0·71* | 2·18 [2·01, 2·35] | 144 | † | 2·09 [1·80, 2·38] | 142 | 0·18* | 0·87 [0·53, 1·21] |
| Body mass index (BMI) | bpresumed | – | – | 2·00 | – | – | 2·00 | – | – | 2·00 |
| Fulton’s index (K) | bpresumed | – | – | 3·00 | – | – | 3·00 | – | – | 3·00 |
| Relative condition (Kn) | bOLS | 272 | 0·70* | 2·18 [2·01, 2·34] | 144 | 0·59* | 2·08 [1·79, 2·36] | 142 | 0·12* | 0·79 [0·44, 1·13] |
| Relative mass (Wr) | bOLS | 97 | 0·75* | 2·34 [2·06, 2·61] | 30 | 0·88* | 3·03 [2·58, 3·58] | 47 | 0·22* | 1·25 [0·54, 1·96] |

*P < 0·05 in the anova test for fit of the linear model.
†Heterogeneity of slopes in the ancova method at α = 0·05.
CI, condition index.

Likewise, in Algerian mice bSMA = 3·26 is the most reliable β estimate, and lower values of b (< 2·1) led to consistently biased CIs in larger animals (i.e. Ri, ancova, BMI and Kn; Figs 1 and 2). The level of bias was much weaker in methods using higher b values (bpresumed = 3 for K and bOLS = 3·03 for Wr) and only weak age and sex effects were recorded for these methods (Table 1). Finally, in shrews, the low ‘b’ values used by Ri, ancova, Kn and Wr (< 1·26) and by BMI (2·00) explained their systematic bias towards larger animals (Fig. 1). Only the Scaled mass index (bSMA = 2·68) and K (bpresumed = 3) produced CIs independent of size variation. For this species, both methods performed equally well, suggesting that the true β may lie between bpresumed = 3 and bSMA = 2·68. The M-L data for shrews may have been insufficient to evaluate a reliable β value, or morphogenetic pattern, as suggested by the particularly weak correlation between M and L (Appendix S3).
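The direction of these biases follows from simple arithmetic: if M = αL^β holds exactly, a ratio index M/L^b is proportional to L^(β − b), so it falls with size when b > β and rises when b < β. The following minimal Python sketch illustrates this under stated assumptions: the animals are hypothetical, they lie exactly on the scaling curve, α and the lengths are arbitrary illustrative values, and only β = 2·71 is borrowed from the wood mouse bSMA in Table 2.

```python
import math

def ratio_index(mass, length, b):
    """Generic ratio-type index M / L**b (b = 3 gives Fulton's K, b = 2 the BMI)."""
    return mass / length ** b

# Hypothetical animals of identical condition lying exactly on M = ALPHA * L**BETA.
ALPHA, BETA = 0.0001, 2.71            # BETA from the wood mouse bSMA; ALPHA is arbitrary
lengths = [70.0, 80.0, 90.0, 100.0]   # illustrative body lengths
masses = [ALPHA * L ** BETA for L in lengths]

def trend(b):
    """Index values across the size range for an assumed exponent b."""
    return [ratio_index(m, L, b) for m, L in zip(masses, lengths)]

fulton = trend(3.0)     # b > beta: index falls as animals get larger
bmi = trend(2.0)        # b < beta: index rises as animals get larger
matched = trend(BETA)   # b = beta: index stays constant (equal to ALPHA)
```

In this idealised setting only the matched exponent yields an index that does not change with length, which is the property the text uses to judge each method.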

The linear regression models employed for the Scaled mass index, Ri, Kn and Wr were all significant (Table 2). However, five of nine ancovas had violations of the critical parallelism assumption and/or homogeneity of variances (Tables 1 and 2). Furthermore, for ancova, the slope (bpooled) had to be recomputed for each explanatory factor analyzed, leading to repeated changes in the scaling exponent β from the same data set and reducing its biological meaning.

Similarities and differences between some CI methods

In some cases, the CIs produced by different methods were perfectly correlated (Appendix S5). However, these methods were not mathematically equivalent, as the CI values and the manner in which they should be interpreted differ among methods (Appendix S5).

Discussion

Decisions on the methods used for estimating body condition should be made cautiously and in compliance with theoretical precepts. As the aim of CIs based on M and L is to remove the growth effects on M with the ultimate goal of assessing an individual’s health, CIs should be independent of age, sex and other natural causes of body size variation linked to morphogenesis. The pervasive effects of these intrinsic factors on CIs have been recognized as an important limitation of the Body mass index (BMI) in epidemiology (Gallagher et al. 1996; Prentice & Jebb 2001), but are usually ignored in ecological studies. A major finding of our reappraisal is that the Scaled mass index accounted for ontogenetic variation or sexual dimorphism in body size, while all conventional methods failed to consistently standardize these key parameters. As we anticipated, all methods based on the principle of least squares in the y-plane (Ri, ancova, Kn and Wr) were systematically biased towards larger individuals, despite the current acceptance of these methods in ecology or fisheries biology. In fact, in recent years the statistical mechanics have been the main concern of authors proposing particular CI methods (e.g. García-Berthou 2001; Freckleton 2002; Schulte-Hostedde et al. 2005), overlooking the influence of underlying natural factors on condition estimates as the definitive test of a method’s suitability.

Although Fulton’s index (K) has been subjected to much criticism (see Stevenson & Woods 2006), it performed better than more sophisticated techniques (e.g. Ri and ancova), despite producing CIs that decreased with size in the wood mouse. K probably performed well because the scaling relationship it assumes (M ∝ L^3) is closer to reality than those suggested by the biased slopes from OLS regression of M on L. The BMI also showed systematic bias in larger animals. Our results explain why the conventional methods often compel ecologists and epidemiologists to compare condition among individuals that are relatively homogeneous in body size, i.e. by separating ages or sex, thus circumventing undesired effects of growth on CIs (e.g. Deurenberg, Weststrate & Seidell 1991; Bister et al. 2000; Schulte-Hostedde, Millar & Hickling 2001; Velando & Alonso-Alvarez 2003). Otherwise, the basic premise of size-independence is likely to be broken (Gallagher et al. 1996; Green 2001). For broad ranges of body size, BMI and methods based on least squares in the y-axis automatically cause CIs to change with ontogeny or sexual size dimorphism (see also Blackwell 2002; Forsyth et al. 2005). These biases are likely to be reduced by using the Scaled mass index, which is based on SMA regression. However, owing to its reliance on log transformation to calculate bSMA, even the Scaled mass index may be somewhat biased (Packard 2009).

Some reasons why conventional methods may produce misleading CIs were previously pointed out by Green (2001) and Peig & Green (2009). However, the present article represents a comprehensive reappraisal of available methods, based on empirical evidence and biostatistical precepts. The first precept is the existence of scaling and the nonlinear structural relationship between M and L. CI methods based on OLS linear models like Ri or ancova cannot account for the true M-L relationship, with increasing bias for unrepresentative samples (e.g. those covering small ranges of body size, which lead to particularly low r2 values between M and L). A second precept is the existence of allometry (i.e. departures from isometry), which requires the empirical estimation of the scaling exponent β as the best description of the true M-L structural relationship. Presupposing a value of β (e.g. β = 2 in the BMI or β = 3 in the K index) causes a bias in CIs owing to the deviation from the true value of β. The third precept is the mutual interdependence between M and L measurements, and its implications for β. For these two interdependent variables, the true slope will lie somewhere between the extreme slopes fitted by OLS regression of lnM on lnL and the reverse regression of lnL on lnM (McArdle 2003), the exact position depending on the relative importance of each variable as a descriptor of true body size.
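The bracketing described by the third precept can be checked numerically. The Python sketch below computes the OLS slope of lnM on lnL, the SMA slope, and the reverse-regression slope (lnL on lnM, re-expressed as a change in lnM per unit lnL) for the same data. The data are synthetic and every parameter value (the latent size distribution, the error magnitudes, the exponent of 3) is an assumption chosen purely for illustration; none of it comes from the study.

```python
import math
import random

def slopes(x, y):
    """Return (OLS slope of y on x, SMA slope, reverse-regression slope as dy/dx)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b_ols = sxy / sxx                                  # minimises residuals in y only
    b_sma = math.copysign(math.sqrt(syy / sxx), sxy)   # ratio of standard deviations, signed
    b_rev = syy / sxy                                  # 1 / (OLS slope of x on y)
    return b_ols, b_sma, b_rev

# Synthetic animals: a latent "true size" drives both lnL and lnM, each with its own error.
random.seed(1)
size = [random.gauss(4.5, 0.1) for _ in range(200)]
ln_L = [s + random.gauss(0.0, 0.03) for s in size]
ln_M = [math.log(1e-4) + 3.0 * s + random.gauss(0.0, 0.09) for s in size]

b_ols, b_sma, b_rev = slopes(ln_L, ln_M)
```

For any positively correlated sample the three values are ordered b_ols < b_sma < b_rev, and the SMA slope is the geometric mean of the two extremes, which is why it is a natural compromise when neither variable can be treated as error-free.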

The SMA regression is an errors-in-variables model which best accounts for the natural M-L (or L-M) interdependence while accounting for the different scale of measurement (Warton et al. 2006; Peig & Green 2009). For every species, bSMA values not only resolved the confounding effects of growth, but they were also more in accordance with scaling theory and previous literature (reviewed by Green 2001), which suggest that the scaling exponents relating M and body length for mammals lie between 2·5 and 3·2 (Table 2). As the information on body size is partitioned between mass and body length, this range [2·5, 3·2] could serve as a guideline to identify reliable estimates of β in mammals. As there is no universal solution for finding the line of best fit between two interdependent variables (McArdle 1988), the use of reference or standard populations can help to define a suitable estimate of β. For this reason, we used only data from uncontaminated sites to estimate bSMA for the Scaled mass index. However, even when bSMA was computed for the entire study sample, our Scaled mass index estimates still performed better than the Ri, ancova, BMI, Kn and Wr (results not shown). The only noteworthy change was that the differences in the Scaled mass index between age classes became significant for Algerian mice (P < 0·05).
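As an illustration of the workflow described here, the Python sketch below estimates bSMA from a reference sample and then rescales each study animal's mass to a common length L0. It assumes the Scaled mass index takes the form M × (L0/L)^bSMA given by Peig & Green (2009); that formula is not restated in this section. All masses and lengths are hypothetical, and using the mean reference length as L0 is just one arbitrary choice.

```python
import math

def sma_slope(ln_x, ln_y):
    """Standardized major axis slope: ratio of standard deviations, signed by the covariance."""
    n = len(ln_x)
    mx, my = sum(ln_x) / n, sum(ln_y) / n
    sxx = sum((a - mx) ** 2 for a in ln_x)
    syy = sum((b - my) ** 2 for b in ln_y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(ln_x, ln_y))
    return math.copysign(math.sqrt(syy / sxx), sxy)

def scaled_mass_index(mass, length, b_sma, l0):
    """Mass expected for this individual if its length were l0: M * (l0 / L) ** b_sma."""
    return mass * (l0 / length) ** b_sma

# Hypothetical reference animals (e.g. from uncontaminated sites): (mass, length).
reference = [(18.0, 82.0), (21.5, 88.0), (24.0, 91.0), (27.5, 95.0), (30.0, 99.0)]
b_sma = sma_slope([math.log(L) for _, L in reference],
                  [math.log(m) for m, _ in reference])
l0 = sum(L for _, L in reference) / len(reference)   # arbitrary reference length

# Hypothetical study animals scored against the reference scaling.
study = [(20.0, 85.0), (26.0, 97.0)]
smi = [scaled_mass_index(m, L, b_sma, l0) for m, L in study]
```

Because every animal is expressed as a mass at the same length L0, the resulting values can be compared directly across age classes, sexes or sites, provided the same bSMA and L0 are used throughout.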

The use of OLS regression to fit a line to M-L data is often defended because the residual scores are not correlated with body size or, in particular, with the length measurement L (e.g. Jakob, Marshall & Uetz 1996; Marshall, Jakob & Uetz 1999; Schulte-Hostedde, Millar & Hickling 2001). Nonetheless, this is an invalid justification, based on the incorrect assumption that L is the only indicator of body size and is an error-free measure. Ri and other methods based on least squares in the y-plane will tend to underestimate the regression coefficient (Table 2), which becomes more attenuated (closer to zero) as the absolute value of the correlation coefficient decreases (Seim & Sæther 1983; White & Seymour 2005; Arnold & Green 2007). Regression coefficients from conventional methods were well below the isometric value of 3 (with the exception of Wr in the Algerian mouse), with values below 0·8 in shrews. β estimates below the true M-L slope lead CIs to increase with increasing true body size, while β estimates above the true slope make CIs decrease. Very low values of regression coefficients should be taken as an early warning that the model is inappropriate for scaling and CI studies (see also LaBarbera 1989). On the other hand, SMA slopes can be more biased than OLS slopes when error rates in M are more than three times those in L. Moreover, when the correlation between M and L is high (>0·9), the choice of regression method makes little difference (McArdle 1988; Arnold & Green 2007). Furthermore, all major statistical packages are based on OLS techniques, which makes OLS methods more suitable for multivariate analyses.
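The attenuation can be made concrete with the values in Table 2, because for a single data set the SMA slope equals the OLS slope divided by |r|. The short Python check below applies this identity to the Wr row (bOLS and r2) and compares the result with the reported bSMA. It assumes that the Wr and Scaled mass index regressions were fitted to the same reference sample, as their matching n and r2 suggest; the small discrepancies are then attributable to rounding of the published values.

```python
import math

# (bOLS from the Wr row, r2, reported bSMA) as printed in Table 2.
table2 = {
    "wood mouse": (2.34, 0.75, 2.71),
    "Algerian mouse": (3.03, 0.88, 3.26),
    "shrew": (1.25, 0.22, 2.68),
}

def sma_from_ols(b_ols, r2):
    """SMA slope implied by an OLS slope and r2 from the same data: b_OLS / |r|."""
    return b_ols / math.sqrt(r2)

# Implied SMA slope per species, and its absolute gap from the reported bSMA.
implied = {sp: sma_from_ols(b, r2) for sp, (b, r2, _) in table2.items()}
gap = {sp: abs(implied[sp] - table2[sp][2]) for sp in table2}
```

The weaker the correlation, the larger the divergence between the two slopes: in shrews (r2 = 0·22) the OLS slope is less than half the SMA slope, whereas in Algerian mice (r2 = 0·88) the two differ only modestly.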

Despite the popular use of BMI in humans (taking height as L), it has rarely been applied to animals (Stevenson & Woods 2006). The BMI assumes β = 2 instead of 3 as predicted by isometry, because bOLS = 1·94 for a regression of mass on height in humans (Khosla & Lowe 1967). In fact, our results and scaling theory suggest that β is much closer to 3 than to 2, indicating that the BMI will be dependent on size, as illustrated by the way it varied with ontogeny and sexual size dimorphism. We expect that the Scaled mass index is more appropriate for human studies than BMI.

In conclusion, we have presented solid evidence that conventional methods neither satisfy biological precepts concerning M-L relationships nor control successfully for variation in body size, often making them unreliable predictors of an individual’s body condition. Furthermore, from a pragmatic viewpoint, some methods are very sensitive to statistical assumptions, especially ancova which failed to provide reliable results in most of our analyses. Our results, together with those of Peig & Green (2009), suggest that currently fashionable methods (e.g. Ri and ancova) can produce unreliable results.

In contrast, the Scaled mass index emerges as a powerful tool for condition studies from mass-length data. This method provides empirical results in accordance with the theoretical precepts identified for studies of condition, and has previously been validated with data on body components such as fat and protein (Peig & Green 2009). As the method is simple, reliable, comprehensive and easily interpretable, we encourage ecologists to implement it in future studies. It is ideal both for comparing individuals of a single population, and for comparing CIs between populations or even between studies on the same species.

Acknowledgements

We thank Josep Lluis Parra, José Domingo Rodríguez-Tejeiro and Jacint Nadal for valuable comments on earlier drafts. We also appreciate the technical assistance from the Gavà Museum. This work was funded by the Spanish (project #: BOS 2000-0567) and Catalan Government (project #: 2001SGR-00090 and 2001ACOM-00009).
