A novel integrative method for measuring body condition in ecological studies based on physiological dysregulation



  1. The body condition of free-ranging animals affects their response to stress, decisions, ability to fulfil vital needs and, ultimately, fitness. However, this key attribute in ecology remains difficult to assess, and there is a clear need for more integrative measures than the common univariate proxies.
  2. We propose a systems biology approach that positions individuals along a gradient from a ‘normal/optimal’ to ‘abnormal/suboptimal’ physiological state based on Mahalanobis distance computed from physiological biomarkers. We previously demonstrated the validity of this approach for studying ageing in humans; here, we illustrate its broad potential for ecological studies.
  3. As an example, we used biomarker data on shorebirds and found that birds with an abnormal condition had a lower maximal thermogenic capacity and higher scores of inflammation, with important implications for their ecology and health. Moreover, Mahalanobis distance captured a signal of condition not detected by the individual biomarkers.
  4. Overall, our results on birds and humans show that individuals with abnormal physiologies are indeed in worse condition. Moreover, our approach appears not to be particularly sensitive to which set of biomarkers is used to assess condition. Consequently, it could be applied easily to existing ecological data sets.
  5. Our approach provides a general, powerful way to measure condition that helps resolve confusion as to how to deal with complex interactions and interdependence among multiple physiological and condition measures. It can be applied directly to topics such as the effect of environmental quality on body condition, risks of health outcomes, mechanisms of adaptive phenotypic plasticity, and mechanisms behind long-term processes such as senescence.


Body condition is an integral part of animal ecology. First, it may be associated with health outcomes such as infections (Arsnoe, Ip & Owen 2011) or with processes affecting the general health state, such as senescence (Kirkwood & Austad 2000). Secondly, within-individual fluctuations in condition may arise from temporal variation in environmental stressors and provide the flexibility to acclimate to a given environment (Piersma & van Gils 2011). Thirdly, among-individual differences in condition may reflect variation in genetic (Blanckenhorn & Hosken 2003) or habitat quality (Oliva-Paterna, Miñano & Torralva 2003) or arise from life-history trade-offs (Reed et al. 2008) or maternal effects (Hare & Cree 2011). All of the above will contribute variance in fitness and affect the costs and benefits of individual strategies, such as those related to dispersal, mating or reproduction, and favour the evolution of condition-dependent strategies (e.g. Chastel, Weimerskirch & Jouventin 1995; Cotton, Small & Pomiankowski 2006; Kisdi, Utz & Gyllenberg 2012) and phenotypes (Buchanan 2000; Lorch et al. 2003; Bonduriansky 2007).

Despite an abundant literature on these topics, condition remains a poorly defined concept with often vague references to viability, vigour, resource reserves, performance or other descriptors (Johnstone, Rands & Evans 2009; Hill 2011). In addition, it is not clear whether typical indicators such as residual mass, reactive oxidative species, haematocrit or carotenoids (to name a few) always provide a signal on the general condition of an individual (Costantini & Møller 2008). Yet, implicit to many studies interested in condition are the fundamental concepts of the ‘general health state’ of an individual and the ‘optimal physiology in a given environment’. In that sense, the proposition of Hill (2011) to define condition as ‘the relative capacity to maintain optimal functionality of essential cellular processes’ is attractive because it relates directly to these concepts. As Hill stated: ‘To make sense of this concept of condition, researchers need to be able to define the optimal state of body systems […]’.

From that perspective, it seems relevant to shift focus away from single components of condition and move towards measures that provide a more holistic assessment of condition. For example, a measure of homeostasis (Hill 2011) might allow individuals to be graded from ‘normal’ to ‘abnormal’. Thus, integrated methods to measure condition are needed. Physiological biomarkers offer a promising avenue for that purpose. They are generally useful markers precisely because they are integrated in complex physiological regulatory networks for the maintenance of organismal homeostasis, where the levels of different markers are not independent (Cohen et al. 2012; Costantini, Monaghan & Metcalfe 2013), and thus give us a window into underlying system function. They can span various functions (e.g. immune, endocrine, circulation, metabolism, ionic balance, etc.), allowing both the inclusion of numerous relevant variables in the assessment of the condition of an individual (sensu Hill 2011) and testing of specific hypotheses relating condition to physiological functions. For instance, they are commonly used in human studies on ageing, frailty and relative risk of health outcomes (McClearn 1997; Walston 2005).

Recently, we showed how using the joint probability distribution of multiple biomarkers to assess individuals’ physiological condition predicts mortality and health outcomes in an elderly human population (Cohen et al. 2013). The underlying hypothesis was that risk of adverse health outcomes increased with increasing global physiological dysregulation in humans and that an abnormal physiological profile would indicate this dysregulation. Likewise, the physiological response of wild-ranging animals to environmental challenges, including trade-offs (Buehler et al. 2012), may be a system-level property affected by the general condition or health of an individual.

Here, we illustrate and discuss the broad potential of this approach for ecological studies. We apply it to published biomarker data on a captive flock of a migrant shorebird, the red knot Calidris canutus Linnaeus. In the wild, this species is exposed to biotic and abiotic conditions that change drastically over the annual cycle and across the geographical areas occupied during breeding (e.g. high arctic tundra), migration (e.g. open ocean) and wintering (e.g. saline mud flats) periods (Piersma 2007; Buehler & Piersma 2008). Experiments on captive red knots showed that these birds acclimate to seasonal change in temperature by adjusting their thermogenic capacity (Vézina et al. 2007; Vézina, Dekinga & Piersma 2011). They also exhibit seasonal modulations in constitutive immune function, which apparently are tailored to seasonal environmental challenges (Buehler et al. 2008). Here, we further examined whether the general physiological condition of an individual predicted (i) its maximal thermogenic capacity or ‘summit metabolic rate’ (Msum, i.e. its capacity to respond to cold stress) and (ii) its level of foot inflammation (Fscore) likely reflecting an infection by Staphylococcus bacteria (‘bumblefoot’). To assess how normal or abnormal a given biomarker profile was, we used Mahalanobis multivariate statistical distance (DM; Mahalanobis 1936; De Maesschalck, Jouan-Rimbaud & Massart 2000), which assigns each observation a score for distance from the reference population mean (assumed to represent the average or ‘normal’ physiological state). Our hypothesis is that DM can be viewed as measuring underlying physiological condition, while Msum and Fscore are measures of performance and health outcome, respectively. Thus, DM should predict Msum and Fscore if the latter two are determined by condition. Our results are consistent with this hypothesis, and we discuss how they validate the use of Mahalanobis distance as a general measure of condition in ecological studies.

Materials and methods

Captive animals

Previous studies by Buehler et al. (2008, 2012) and Vézina, Dekinga & Piersma (2011) examined how bird physiology and metabolism adjust to changing requirements across the annual cycle. Thirty-one adult red knots (21 females, 10 males) from the northerly wintering subspecies (C. c. islandica) were caught in 2004 and kept in five aviaries at the shorebird facility of Royal Netherlands Institute for Sea Research (NIOZ). The birds were divided into three groups exposed to different temperature treatments: thermoneutral, cold and variable. Birds in the variable and the thermoneutral groups were divided between two aviaries per group; birds in the cold treatment were housed in a single aviary. Phenotypic measurements began in February 2005 and were recorded monthly for 13 months. Details about conditions in captivity, experimental setting and measurements can be found in Vézina et al. (2006), Vézina, Dekinga & Piersma (2011) and Buehler et al. (2008).


Variables used in the current study are shown in Table 1. Since our goal was to illustrate the multivariate distance method to measure condition, biomarker selection was based on availability from previous studies. Nonetheless, the selected biomarkers represent a large range of functional groups. In the original studies, the markers were used to investigate the relationship between metabolism and thermogenic capacity, to cover a range of protective functions (immune measures) and to study covariation between markers from different functional groups (Vézina, Dekinga & Piersma 2011; Buehler et al. 2012). Some variables required log transformation prior to the main analyses (Table 1). Mahalanobis distance can only be calculated using cases that have no missing values. The red knot data set included 259 bird-months, where a bird-month is a measurement taken on an individual bird in a given month.

Table 1. Description of the variables used in this study (a)

| Abbreviation | Description | Functional system (b) | Use | Transformation |
| --- | --- | --- | --- | --- |
| BMR | Basal metabolic rate | Metabolism | DM calculation | log |
| MCEc | Microbial killing activity under exposition to Escherichia coli | Immune | DM calculation | None |
| MCCa | Microbial killing activity under exposition to Candida albicans | Immune | DM calculation | None |
| MCSa | Microbial killing activity under exposition to Staphylococcus aureus | Immune | DM calculation | None |
| Het | Heterophil count | Immune | DM calculation | log |
| Lym | Lymphocyte count | Immune | DM calculation | log |
| Mon | Monocyte count | Immune | DM calculation | log |
| Lys | Hemolysis (complement and antibody immune functions) | Immune | DM calculation | log (value + 1) |
| Agg | Hemoagglutination (complement and antibody immune functions) | Immune | DM calculation | log |
| Cort | Corticosterone | Endocrine | DM calculation | log |
| Hct | Haematocrit | Oxygen transport | DM calculation | log |
| Mass | Mass of the bird at measurement | | Covariate | log |
| Month | Month of the year | | Factor | n.a. |
| Treatment | Temperature treatment (thermoneutral, cold, variable) | | Factor | n.a. |
| Cage | Aviary (nested in treatment) | | Random factor | n.a. |
| Msum | Maximal thermogenic capacity of an individual exposed to cold stress | Metabolism | Response | None |
| Fscore | Foot inflammation score | | Response | None |

(a) Measurements originally done in studies by Buehler et al. (2008, 2012) and Vézina, Dekinga & Piersma (2011).
(b) Given for physiological biomarkers only.

Mahalanobis distance calculation

Mahalanobis distance (DM; Mahalanobis 1936) is a multivariate distance that can serve to assess how close or far an individual's phenotype is from the mean of a reference population (Cohen et al. 2013). The reference population can be the sample at hand or another population for which average trait values and the covariation between them are known (see 'Discussion'). In the present study, we used DM to establish how far the general physiological state of an individual in a given month was from the population's average physiology (including repeated measurements for all individuals). Thus, we interpret the measure as a signal of condition or health state. DM analysis requires the rather strong assumption of a multivariate normal distribution (MVN) of constitutive traits. Whether this assumption holds will depend on the specific traits, their number and their covariation. A violation of MVN is, however, likely to lead to conservative results, that is, a noisier signal of an individual's state (Cohen et al. 2013). DM is given by

$$D_M(x) = \sqrt{(x - \mu)^{T} S^{-1} (x - \mu)} \qquad \text{(eqn 1)}$$

where x is a multivariate observation, μ is the vector of reference population mean values for each variable and S is the reference population variance–covariance matrix.

The 11 red knot biomarkers (Table 1) were first standardized by subtracting the mean and dividing by the standard deviation and then entered in the DM calculation. The full set of observations served as the reference population to compute μ and S. We also calculated the Mahalanobis distance ‘per treatment’ (DM,t) by using the observations on birds exposed to the same treatment as the focal individual to compute μ and S. This allowed us to control for differences in population mean due to treatment effects. For example, if the variable treatment resulted in a physiology intermediate to the warm and cold treatments, individuals in the warm and cold treatments would have DM scores further from the total population mean (and would thus seem to be in poor condition, when in reality the differences may well be adaptive). By calculating DM,t for each individual relative to its treatment, we can account for this possibility. Shapiro tests indicated that log-transformed DM did not depart from normality (P = 0·47) and that DM,t was marginally normal (P = 0·07). In addition to DM and DM,t, which reflect the short-term physiological state of each individual at a given month, we also calculated the average value per bird across months (DM,avg, DM,t,avg), reflecting among-individual physiological differences over the whole year.
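The calculation described above can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the code used in the study (the original analyses were run in R): the function name `mahalanobis_scores`, the synthetic data and the bird and treatment labels are all hypothetical.

```python
import numpy as np

def mahalanobis_scores(X, reference=None):
    """Mahalanobis distance of each row of X from the mean of a reference set.

    X         : (n, p) array of transformed biomarker values
    reference : (m, p) array used to estimate mu and S; defaults to X itself
    """
    X = np.asarray(X, dtype=float)
    ref = X if reference is None else np.asarray(reference, dtype=float)
    mu = ref.mean(axis=0)              # reference mean vector
    S = np.cov(ref, rowvar=False)      # reference variance-covariance matrix
    diff = X - mu
    # Solve S y = diff^T instead of forming the inverse of S explicitly
    y = np.linalg.solve(S, diff.T).T
    return np.sqrt(np.sum(diff * y, axis=1))

# Hypothetical data: 12 birds x 5 months = 60 bird-months, 4 biomarkers,
# birds nested in 3 treatments (4 birds per treatment)
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
bird = np.repeat(np.arange(12), 5)
treatment = np.repeat(["cold", "variable", "warm"], 20)

# Standardise each biomarker (subtract the mean, divide by the SD)
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# DM: the full set of observations is the reference population
dm = mahalanobis_scores(Z)

# DM,t: the reference population is restricted to the focal bird's treatment
dm_t = np.empty_like(dm)
for t in np.unique(treatment):
    rows = treatment == t
    dm_t[rows] = mahalanobis_scores(Z[rows])

# DM,avg: average score per bird across months
dm_avg = {int(b): dm[bird == b].mean() for b in np.unique(bird)}
```

Because Mahalanobis distance already rescales by the covariance matrix, standardising the biomarkers beforehand does not change the scores when the same observations serve as the reference; it mainly keeps the variables on comparable scales for inspection.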

Statistical models

We investigated the relationship between DM and two response variables, Msum and Fscore, using Bayesian generalized mixed-effects models. Msum was chosen because of its adaptive significance: maximal thermogenic capacity is a major determinant of acclimation to changing temperature over the annual cycle. Considering that susceptibility to infection may strongly depend on condition, Fscore was selected because it is an available measure of inflammation driven by infection (probably by Staphylococcus; T. Kuiken, pers. comm.). After the exclusion of missing values – in particular Msum was not measured in July and August to avoid potential frostbite of blood-filled feather pins (Vézina, Dekinga & Piersma 2011) – a total of 182 and 259 bird-months were available for Msum and Fscore analyses, respectively. For all models, we used the nested design of Buehler et al. (2008, 2012) and Vézina, Dekinga & Piersma (2011) by having individuals nested in aviaries nested in treatments. Thus, bird ID and aviary were fitted as random factors, and treatment as a fixed factor. However, the aviary effect variance was negligible for Msum and was excluded from the final Msum model. Month was included as a fixed factor because previous studies showed that red knots in captivity retain the phenotypic variation schedules of wild birds associated with season and migration. We also initially included the interaction between month and treatment (Vézina, Dekinga & Piersma 2011) but then removed it as it showed no significant effect on response variables and substantially decreased the fit of the models. Sex was fitted only for Fscore as it showed no significant effect on Msum in a previous analysis of the same data (Vézina, Dekinga & Piersma 2011; confirmed in our preliminary analyses). The four Mahalanobis distance measures presented above were entered in separate models. In addition, to evaluate whether Mahalanobis distance conveys a signal different from that of individual biomarkers, we also fitted models with the individual biomarkers, both with Mahalanobis distance included and excluded.

Msum was marginally normally distributed (Shapiro's W = 0·986, P = 0·06; Fig. 1a) and was modelled as Gaussian. Since this trait is affected by modulations of body mass (Vézina, Dekinga & Piersma 2011), Msum models were fitted with and without mass as a covariate to assess whether DM was related to mass-independent or total Msum, respectively.

Figure 1.

Distribution of (a) summit metabolic rate and (b) foot score monthly measurements on 31 captive red knots.

Fscore is an ordinal measure of skin inflammation. It ranged from 0 (=no lesion) to 6 (=big lesion) with a mean value of 0·74. Since values included more zeros (62%) than expected from a Poisson distribution (Pr(Fscore=0)=48%, where Fscore ~ Po(μ = 0·74); Fig. 1b), we fitted a zero-inflated Poisson (ZIP) model (see Methods S1). Since the zero-inflation parameter estimate was very small (see 'Foot inflammation score model'), we also fitted a standard Poisson (SP) model and compared the results from both models. The magnitude of enduring differences among individuals was estimated by the repeatability (Nakagawa & Schielzeth 2010; Methods S1). All analyses were conducted in the MCMCglmm package (Hadfield 2010) for R (R Development Core Team 2012).
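The zero-inflation check in the paragraph above is a one-line calculation: under a Poisson distribution with mean μ, the probability of a zero is exp(−μ). A minimal Python sketch, using only the two figures quoted in the text (the variable names are illustrative):

```python
import math

mean_fscore = 0.74           # observed mean foot score (from the text)
observed_zero_share = 0.62   # observed proportion of zero scores (from the text)

# Under Fscore ~ Poisson(mu), Pr(Fscore = 0) = exp(-mu)
expected_zero_share = math.exp(-mean_fscore)

# Share of zeros not accounted for by a plain Poisson model
excess_zeros = observed_zero_share - expected_zero_share

print(f"Expected zeros under Poisson: {expected_zero_share:.0%}")  # 48%
print(f"Excess over expectation: {excess_zeros:.0%}")              # 14%
```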


Results

Correlation between DM and other variables

Correlations between biomarkers ranged from −0·24 (Lys vs. MCEc) to 0·68 (Lym vs. Mon; Fig. 2). DM was either uncorrelated or weakly correlated with biomarkers, with correlations ranging from −0·12 (with Lys) to 0·20 (with Lym; Fig. 2). There was no clear positive or negative association between DM and any functional group of variables. At most, the correlation between DM and two types of leucocytes, lymphocytes (Lym) and monocytes (Mon), was of similar magnitude. However, DM was not significantly correlated with heterophils (Het), which are also leucocytes. Moreover, DM was positively correlated with hemoagglutination (Agg) but negatively with hemolysis (Lys), which are two methodologically and functionally related indices of innate immunity. Interestingly, Agg and Lys were weakly positively correlated [Fig. 2; as previously reported by Buehler et al. (2012)]. This is not surprising considering that, while two biomarkers can be positively correlated, they may show different (within-marker) mean–variance relationships and that the correlation between DM and a biomarker is shaped by this relationship. Therefore, DM appears to provide information that is relatively independent from that of individual biomarkers. Further discussion about correlations between biomarkers themselves, both within and among individuals and across months, can be found in Buehler et al. (2012).

Figure 2.

Correlation matrix between biomarkers and Mahalanobis distance (DM). The colour and width of each ellipse show the strength of the correlation between two variables (a narrower ellipse indicates a stronger correlation), and its tilt shows the direction. An ‘X’ is plotted over the ellipse when the correlation is non-significant (P-value ≥ 0·05). The figure was drawn with the corrplot package for R (https://github.com/taiyun/corrplot).

Maximal thermogenic capacity model

As previously found by Vézina, Dekinga & Piersma (2011), birds from the warm treatment tended to have a lower Msum, while among-month variation was significant (Table 2). The group-effect variance, after correction for treatment, was negligible, with negative estimates but at the bound of zero. Monthly measurements of Mahalanobis distance (DM or DM,t) showed no significant relationship with Msum (Table 2). However, the average value of an individual over the study period (DM,avg) was negatively associated with Msum, implying a higher maximal thermogenic capacity of birds exhibiting a physiology closer to the population average. The effect size was large (Table 2): considering the range of observed DM,avg values (0·96–1·36), the effect was between 3·6 and 5·0 times the magnitude of the warm treatment effect, which showed the highest effect size among fixed factors (i.e. months and treatments). The magnitude of the mass effect was between one and two times that of DM,avg, assessed as the product of effect size by, respectively, the minimal and maximal values observed for mass. Average Mahalanobis distance per treatment (DM,t,avg) had a similar, albeit somewhat weaker, effect on Msum than DM,avg (Table 2). When mass was removed from the model, the decreasing trend of Msum with increasing DM,avg remained of similar, albeit lower, magnitude but was not significant (estimate: −5·00, Bayesian 95% highest posterior density intervals (HPD): −10·63 to 1·19, P = 0·14). This suggests that the effect of DM,avg on Msum is independent of mass and easier to detect when correcting for mass in the model.
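The ratio of 3·6 to 5·0 quoted above can be reproduced from the Table 2 estimates. The sketch below assumes that the comparison multiplies the DM,avg slope by the smallest and largest observed DM,avg values and divides by the warm-treatment estimate; this reading is an assumption, although it does recover the reported figures.

```python
# Posterior-mode estimates taken from Table 2
beta_dm_avg = -6.67   # slope of Msum on DM,avg
beta_warm = -1.80     # warm-treatment effect on Msum

# Observed range of DM,avg reported in the text
dm_avg_min, dm_avg_max = 0.96, 1.36

# Effect of DM,avg at each end of its range, relative to the warm-treatment effect
ratio_low = (beta_dm_avg * dm_avg_min) / beta_warm
ratio_high = (beta_dm_avg * dm_avg_max) / beta_warm

print(round(ratio_low, 1), round(ratio_high, 1))  # 3.6 5.0
```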

Table 2. Effect of Mahalanobis distance, mass, treatment and month on the maximal thermogenic capacity (Msum)

| Fixed effect | Effect size (a): estimate (posterior mode) | Effect size (a): lower 95% posterior interval | Effect size (a): upper 95% posterior interval | Posterior probability (b) |
| --- | --- | --- | --- | --- |
| Mahalanobis distance (DM) | −0·44 | −1·15 | 0·20 | n.s. |
| Per treatment (DM,t) | −0·58 | −1·16 | 0·27 | n.s. |
| Average (DM,avg) | −6·67 | −11·71 | −1·60 | ** |
| Average per treatment (DM,t,avg) | −5·17 | −10·25 | −0·34 | * |
| Treatment = variable | −0·35 | −2·00 | 1·00 | n.s. |
| Treatment = warm | −1·80 | −2·94 | −0·10 | * |
| Month = May 2005 | −0·90 | −1·76 | −0·12 | * |
| Month = June 2005 | −1·31 | −2·06 | −0·70 | *** |
| Month = September 2005 | −1·26 | −1·91 | −0·47 | ** |
| Month = October 2005 | −1·12 | −1·96 | −0·59 | ** |
| Month = November 2005 | −1·26 | −1·96 | −0·61 | *** |
| Month = December 2005 | −1·17 | −1·72 | −0·36 | *** |
| Month = January 2006 | −1·44 | −2·14 | −0·84 | *** |
| Month = February 2006 | −1·53 | −2·24 | −0·94 | *** |

(a) Values for treatment, month and mass are those from the model fitted with DM. Effect sizes from models fitted with one of the three other Mahalanobis distances were comparable.
(b) n.s.: non-significant; *, P < 0·05; **, P < 0·01; ***, P < 0·001.

In spite of their significant relationships with Msum, DM,avg and DM,t,avg showed relatively wide Bayesian 95% highest posterior density intervals (HPD). Since these two variables represent average measures per bird (hence constant over the study period), their effect could have been harder to estimate because of the cofitting of bird ID as a random effect (due to MCMC sampling toggling part of the Msum variation back and forth between DM,avg and bird ID effects). Actually, repeatability was high, accounting for more than 50% of the variance in Msum after controlling for the fixed effects (Table 3). Therefore, on top of the Mahalanobis distance, other unmeasured variables accounted for by repeatability contributed to consistent differences among individuals in terms of Msum.

Table 3. Repeatability (scale [0,1]) of individual measurements of Msum and Fscore after controlling for fixed effects (Mahalanobis distance, mass, treatment and month)

| Response | Distribution modelled | Repeatability: estimate (posterior mode) | Repeatability: lower 95% posterior interval | Repeatability: upper 95% posterior interval |
| --- | --- | --- | --- | --- |
| Msum | Gaussian | 0·62 | 0·45 | 0·75 |
| Fscore | Zero-inflated Poisson | 0·55 | 0·19 | 0·78 |
| Fscore | Poisson | 0·56 | 0·22 | 0·87 |

We fitted a model that included all biomarkers and all previous fixed and random effects. DM,avg was used as the measure of Mahalanobis distance since it showed the strongest association with Msum in the above analyses. This model recovered an effect of DM,avg on Msum of similar magnitude to the model without the biomarkers (estimate: −5·72, HPD: −11·06 to −0·59, P = 0·03). Among biomarkers, only haematocrit showed a significant positive relationship with Msum (estimate: 15·63, HPD: 3·12–23·10, P = 0·006; Table S1). Considering the range of observed values (0·4–0·56), the effect size of haematocrit on Msum was comparable with that of DM,avg. An identical model but with DM,avg removed led to similar estimates of biomarker effects, indicating that the effect (or absence of effect) of biomarkers on Msum is independent from that of DM,avg (Table S1).

Foot inflammation score model

The probability that Fscore = 0 due to zero inflation was very small (0·003), and both distributional assumptions (ZIP and SP) led to a very similar picture of the relationship between explanatory variables and foot inflammation (Table 4). Therefore, we do not distinguish between ZIP and SP except when necessary. Treatment, mass and sex showed no significant relationship with Fscore (Table 4). However, Fscore varied substantially from month to month. A peak occurred in May, when wild birds undertake the spring migration. DM but not DM,avg had a significant positive effect on Fscore. This result suggests that during certain times of the year (but not over the year on average) individuals exhibiting physiology further from the population mean had a higher incidence of foot inflammation (Table 4). Accounting for the observed DM range (0·25–1·93), the effect size (calculated on the observed scale) was between 0·1 and 0·6 times that of the month with the greatest effect size (estimates from the ZIP model). Mahalanobis distance per treatment (DM,t) had a similar, albeit somewhat smaller, effect than DM (Table 4). Again, the HPD for DM,avg and DM,t,avg were wide, especially on the observed scale (Fig. 3), probably for the reasons discussed above. Average measurements of Mahalanobis distance (DM,avg or DM,t,avg) showed no significant relationship with Fscore, with wide HPD on each side of zero (Table 4). As for Msum, repeatability of Fscore was high, accounting for more than 50% of the variance in Fscore after controlling for the fixed effects (Table 3). The aviary effect was much smaller (0·05, HPD: 0·01–0·54).
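The 0·1 to 0·6 range quoted above can be recovered from the ZIP estimates in Table 4 if ‘observed scale’ is read as back-transforming the linear predictor through the exponential (the inverse of a log link). That reading is an assumption made for this sketch, since the exact procedure is given in material not reproduced here; it does, however, reproduce the reported figures.

```python
import math

# Posterior-mode estimates from the ZIP model (Table 4)
beta_dm = 1.02    # effect of DM on Fscore (latent, log scale)
beta_may = 2.46   # May 2005, the month with the largest effect

# Observed range of DM reported in the text
dm_min, dm_max = 0.25, 1.93

# Assumed back-transformation: multiplicative effect = exp(estimate x value)
month_effect = math.exp(beta_may)
ratio_low = math.exp(beta_dm * dm_min) / month_effect
ratio_high = math.exp(beta_dm * dm_max) / month_effect

print(round(ratio_low, 1), round(ratio_high, 1))  # 0.1 0.6
```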

Table 4. Effect of Mahalanobis distance, mass, sex, treatment and month on foot inflammation (Fscore). The first value is from the zero-inflated Poisson model and the second (in brackets) from the standard Poisson model (a)

| Fixed effect | Effect size (a): estimate (posterior mode) | Effect size (a): lower 95% posterior interval | Effect size (a): upper 95% posterior interval | Posterior probability (b) |
| --- | --- | --- | --- | --- |
| Mahalanobis distance (DM) | 1·02 (0·87) | 0·14 (0·17) | 1·83 (1·73) | * (*) |
| Per treatment (DM,t) | 0·72 (1·01) | 0·13 (0·03) | 1·92 (1·78) | * (*) |
| Average (DM,avg) | 0·03 (−0·54) | −7·67 (−7·65) | 6·67 (7·88) | n.s. (n.s.) |
| Average per treatment (DM,t,avg) | 0·62 (0·47) | −8·80 (−10·23) | 6·34 (6·49) | n.s. (n.s.) |
| Mass | −0·02 (−0·01) | −0·03 (−0·03) | 0·01 (0·00) | n.s. (n.s.) |
| Sex | 0·33 (−0·23) | −1·63 (−1·61) | 1·54 (1·62) | n.s. (n.s.) |
| Treatment = variable | 0·68 (0·48) | −3·40 (−2·23) | 4·00 (3·33) | n.s. (n.s.) |
| Treatment = warm | −0·62 (0·04) | −4·12 (−3·46) | 3·20 (2·20) | n.s. (n.s.) |
| Month = May 2005 | 2·46 (1·73) | 0·91 (0·76) | 3·63 (3·55) | *** (***) |
| Month = June 2005 | 2·13 (1·75) | 0·66 (0·73) | 3·31 (3·26) | *** (**) |
| Month = July 2005 | 1·68 (1·64) | 0·63 (0·71) | 3·30 (3·27) | *** (***) |
| Month = August 2005 | 1·91 (1·59) | 0·59 (0·73) | 3·25 (3·29) | *** (***) |
| Month = September 2005 | 1·76 (1·93) | 0·75 (0·84) | 3·42 (3·46) | *** (***) |
| Month = October 2005 | 2·20 (1·85) | 0·86 (0·86) | 3·52 (3·28) | *** (***) |
| Month = November 2005 | 1·71 (1·51) | 0·51 (0·69) | 3·22 (3·32) | ** (***) |
| Month = December 2005 | 1·18 (1·25) | 0·16 (0·23) | 2·93 (2·94) | * (*) |
| Month = January 2006 | 0·92 (0·75) | −0·40 (−0·35) | 2·46 (2·60) | n.s. (n.s.) |
| Month = February 2006 | 2·01 (1·71) | 0·50 (0·72) | 3·24 (3·32) | *** (***) |

(a) Values for sex, treatment, month and mass are those from the model fitted with DM. Effect sizes from models fitted with one of the three other Mahalanobis distances were comparable.
(b) n.s.: non-significant; *, P < 0·05; **, P < 0·01; ***, P < 0·001.

Figure 3.

Predicted foot score as a function of Mahalanobis distance (solid line) and Bayesian 95% highest posterior density intervals (dashed lines).

Again, we fitted models (ZIP and SP) with all the biomarkers included, this time choosing DM as the measure of Mahalanobis distance since it showed the strongest association with Fscore in previous analyses. This analysis recovered an effect of DM on Fscore of similar magnitude to the model without the biomarkers. This effect was slightly higher for the ZIP model (estimate: 1·21, HPD: 0·03 to 1·91, P = 0·03) than for the SP model (estimate: 0·98, HPD: −0·04 to 1·85, P = 0·052). No biomarker except for haematocrit showed a significant relationship with Fscore (Table S2). Haematocrit had a marginally significant negative effect in all models (P-value range: 0·048–0·154), but this biomarker had a substantially higher effect size than DM, when accounting for observed values. Identical models with DM excluded led to estimates of biomarker effects similar to those from models with DM included (Table S2).

Discussion


In this study, we illustrate the potential of a novel method to measure body condition using Mahalanobis distance calculated from biomarker data, interpretable as a measure of physiological dysregulation. The only former application was our study on ageing in humans (Cohen et al. 2013). In both cases (red knots and humans), higher DM provided a clear signal of worse individual condition, as shown by its power to predict variables related to health or physiological performance (i.e. thermogenic capacity, inflammation, survival, ageing).

Mahalanobis distance as a measure of condition

Mahalanobis distance should be more powerful than univariate proxies (e.g. residual mass) to assess condition because (i) it defines condition relative to a reference population, while the absolute value for individual proxies (including biomarkers) can be driven by factors unrelated to condition (e.g. see the discussion about haematocrit below); (ii) it can simultaneously incorporate numerous biomarkers and thus account for the biological integration of regulatory networks for the maintenance of homeostasis. In fact, integration is typically determined from covariance matrices (Klingenberg 2008; Costantini, Monaghan & Metcalfe 2013), just as Mahalanobis distance is.

Our results strongly support the hypothesis that Mahalanobis distance conveys information about the condition of animals. First, the relationships between DM and DM,avg and response variables were not driven by specific biomarkers, as we detected the same effect whether or not all biomarkers were also entered separately in the models. Secondly, all biomarkers except haematocrit failed to show significant relationships with response variables, indicating that Mahalanobis distance captured a separate signal. Thirdly, Msum and Fscore changed in the direction that we would predict if Mahalanobis distance was negatively associated with condition. Specifically, birds with a more ‘abnormal’ physiology (i.e. higher DM or DM,avg) had a lower maximal thermogenic capacity and a higher susceptibility to infections.

While one could consider Msum and Fscore as indirect proxies of condition based on our results, these measures are actually indicators of physiological performance and health outcome and, as such, may exhibit variable relationships with condition (see above). At the same time, because of its nature, DM represents a more direct (hence upstream) measure of underlying condition at the physiological level. Thus, a strength of using DM is the avoidance of circularity that may arise from using proxies (including single biomarkers) of condition to predict other proxies of condition (e.g. performance indicators).

One potential drawback of our method is that correlations among biomarkers, including those used here, can change with time (Buehler et al. 2012). This may complicate the choice of the reference population, especially if variation in correlations relates to trade-offs expressed at certain times only. Thus, some exploration of how condition changes with different reference populations may be warranted (see below).

Biomarker selection

In the red knot example, we intentionally did not pre-select biomarkers but applied the method ‘as is’ to all those available. While this may seem arbitrary, if, as we hypothesize, there is an underlying dysregulatory process that affects most markers, then Mahalanobis distance should not be too sensitive to the choice of markers; our ability to detect a strong signal with whatever variables happened to be available supports this. In our ongoing work on human data, we find that different combinations – even mutually exclusive ones – will usually produce correlated metrics (typical ρ~0·4 to ~0·5). Nevertheless, a few markers can have larger effects on the results, identifiable with sensitivity analyses (Cohen et al. 2013). In addition, if interactions between the physiological variables are important contributors to condition, then the predictive power of Mahalanobis distance for condition-dependent variables should increase with the number of biomarkers, as found previously with human data (Cohen et al. 2013).
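The argument that an underlying dysregulatory process makes the score robust to marker choice can be illustrated with a small simulation in Python with NumPy. Everything here is an assumption made for the example: the generative model (a single latent factor that scales each individual's deviations on all markers), the sample size, the number of markers and the helper name `mahalanobis`. It is a toy model, not an analysis of the red knot or human data.

```python
import numpy as np

def mahalanobis(X):
    """Mahalanobis distance of each row of X from the sample mean of X."""
    diff = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    return np.sqrt(np.sum(diff * np.linalg.solve(S, diff.T).T, axis=1))

rng = np.random.default_rng(1)
n_individuals, n_markers = 2000, 10

# Assumed model: each individual has a latent dysregulation level that
# inflates its deviations from the mean on every marker simultaneously
dysregulation = np.exp(0.5 * rng.normal(size=n_individuals))
X = dysregulation[:, None] * rng.normal(size=(n_individuals, n_markers))

# Scores computed from two mutually exclusive marker sets
dm_a = mahalanobis(X[:, :5])
dm_b = mahalanobis(X[:, 5:])

# Correlation between the two scores; positive when a shared latent
# process drives the deviations in both marker sets
rho = np.corrcoef(dm_a, dm_b)[0, 1]
print(f"Correlation between scores from disjoint marker sets: {rho:.2f}")
```

Under this toy model the two scores share no markers, so any correlation between them comes from the latent factor alone; with independent markers and no latent factor, the correlation would be expected to be close to zero.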

The relative robustness of the results to biomarker selection is also supported by the observation that very different sets of markers in humans and red knots led to similar qualitative conclusions about condition. This implies that the method can be directly applied to other species without much work because (i) many existing ecological data sets already contain sufficient data and (ii) in the field it may be possible to focus on a number of simple, cheap markers measurable in minuscule quantities of blood (uric acid, glucose, triglycerides, blood cell counts, etc.). This is not to say that all biomarkers contribute an equal signal (e.g. see Cohen et al. 2013), and future studies in ecology might reveal that a few specific biomarkers should be included or excluded more systematically in Mahalanobis distance calculation.

Since a majority of biomarkers in this study were from the immune system, one may legitimately ask whether we measured some ‘immunological’ condition instead of the general condition of birds. The above considerations suggest that this risk is limited, especially given that we also included metabolic, endocrine and oxygen transport measures, which give a more general picture of overall condition.

Reference population

Mahalanobis distance was initially developed to identify observations that represented outliers with respect to several possibly correlated variables (Mahalanobis 1936; Penny 1996; De Maesschalck, Jouan-Rimbaud & Massart 2000). For this purpose, the reference population is typically the whole data set. However, in the context of measuring physiological dysregulation, it seems intuitive to choose a reference population in good health. Interestingly, this intuition is not supported empirically in either the red knots (treatment-specific versus overall DM) or in our ongoing work on humans: in both cases, using the whole population appears to produce similar results to using targeted reference populations, and results are not generally sensitive to reference population choice. We hypothesize that this is because dysregulation happens differently in different individuals and that including less healthy individuals may increase the variance of individual parameters, leading to more robust estimation of the variance–covariance matrix. Validation work is still necessary to confirm the exact sensitivity of analyses to reference populations under different conditions; in the meantime, we recommend using the whole population unless there is a clear biological hypothesis necessitating another choice. Another issue pertaining to the reference population is the sample size necessary to adequately estimate the multivariate distribution of biomarkers. This likely depends on the number of markers and their correlation structure, a question that also deserves further investigation.
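To make the role of the reference population concrete, the following sketch computes Mahalanobis distance relative to either the whole data set or a targeted subset and compares the two. It is a minimal illustration under stated assumptions, not the code used for the analyses reported here: the function name `mahalanobis_distance`, the simulated `biomarkers` array and the arbitrary definition of the `healthy` subset are all hypothetical, and biomarkers are assumed to be already transformed and standardized.

```python
import numpy as np


def mahalanobis_distance(X, reference=None):
    """Distance of each row of X from the centroid of a reference population.

    X         : (n_individuals, n_biomarkers) array
    reference : (n_ref, n_biomarkers) array; defaults to X itself,
                i.e. the whole population serves as the reference.
    """
    X = np.asarray(X, dtype=float)
    ref = X if reference is None else np.asarray(reference, dtype=float)
    mu = ref.mean(axis=0)                            # reference centroid
    cov = np.atleast_2d(np.cov(ref, rowvar=False))   # reference variance-covariance matrix
    cov_inv = np.linalg.pinv(cov)                    # pseudo-inverse tolerates near-singular matrices
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))              # clip tiny negative rounding errors


# Simulated data (hypothetical): 300 individuals, 5 biomarkers, with an
# individual-specific spread standing in for dysregulation.
rng = np.random.default_rng(1)
spread = rng.lognormal(mean=0.0, sigma=0.5, size=300)
biomarkers = rng.normal(size=(300, 5)) * spread[:, None]

# Whole population as reference versus a targeted "healthy" subset
# (here, arbitrarily, the half with the smallest simulated spread).
healthy = biomarkers[spread <= np.median(spread)]
dm_whole = mahalanobis_distance(biomarkers)
dm_healthy = mahalanobis_distance(biomarkers, reference=healthy)

# Agreement between the two versions of the distance
agreement = np.corrcoef(dm_whole, dm_healthy)[0, 1]
print(f"correlation between reference choices: {agreement:.2f}")
```

Passing a different array as `reference` (for instance, individuals from one treatment or season) is enough to explore how the choice of reference population affects the resulting distances. The pseudo-inverse is a convenience for small or collinear reference samples; it does not remove the need for an adequate sample size.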

Mahalanobis distance vs. haematocrit

It is noteworthy that haematocrit, an index of erythrocyte concentration that is sometimes used as a measure of condition, was the only biomarker showing a relationship with response variables in red knots. Its value as a surrogate for condition is, however, contested (Dawson & Bortolotti 1997; Fair, Whitaker & Pearson 2007). Haematocrit relates to oxygen transport capacity (Prats et al. 1996; Fair, Whitaker & Pearson 2007), which increases in birds in winter (Swanson 2010) and during the physiological preparation for long-distance flights in shorebirds (Piersma, Everaarts & Jukema 1996; Landys-Ciannelli, Jukema & Piersma 2002). Given the importance of aerobic metabolism in shivering, as well as the increase in Msum in red knots exposed to cold experimental conditions (Vézina, Dekinga & Piersma 2011), the association between haematocrit and Msum is not surprising. The (marginally significant) negative relationship between haematocrit and foot inflammation is more difficult to explain and may support the idea that haematocrit provides a signal on condition. However, haematocrit and Mahalanobis distance were uncorrelated and maintained their respective effects on response variables when they were included in the same model, suggesting that they measure two different things.

Insights into animal physiology

The red knot has become a model to investigate the adaptive basis of physiological flexibility in free-ranging animals (Buehler & Piersma 2008; Piersma & van Gils 2011). The results presented here provide important insights on this topic. Maximal thermogenic capacity is a trait of critical adaptive significance for migratory birds that balances the costs and benefits of increased resistance to cold stress under seasonally variable thermal conditions. Previous studies showed that red knots can adjust Msum to external variations in temperature, principally by changing their mass (Vézina et al. 2006; Vézina, Dekinga & Piersma 2011). Our results further show that Msum is also correlated with mean differences in condition among birds (DM,avg). This suggests that variation in individual quality might exist, perhaps owing to good genes or developmental circumstances, and should prompt research in that direction.

Previously, higher immune assay values were considered better (Folstad & Karter 1992; Sheldon & Verhulst 1996; Lochmiller & Deerenberg 2000), but recently this view has been challenged. Higher values might indicate better abilities for fighting pathogens, but could also reflect current infection (Adamo 2004; Horrocks, Matson & Tieleman 2011). In addition, higher values might reflect excessive (maladaptive) investment in immune function. A salient result of our study is that physiological abnormality in either direction (high or low) correlates positively with infection, providing evidence that moderate, not extreme, physiology is optimal. In contrast to Msum, Fscore was associated with point-in-time (DM) but not mean (DM,avg) measurements of condition, underscoring the importance of considering both within- and among-individual variation when investigating the ecological significance of condition.

Applications to ecology and evolution

The red knot example illustrates how a measure of condition based on Mahalanobis distance opens new avenues of ecological research. First, this measure may be employed to determine whether departure from ‘physiological normality’ reflects some form of dysregulation or frailty modulated by determinants of environmental quality, such as resource availability or toxicity, and whether the magnitude of the departure predicts outcomes such as infections, diseases or death. Secondly, the approach can provide insights into adaptive phenotypic plasticity when fitness data are available. For instance, our results raise the question of whether thermal acclimation through Msum modulation in red knots comes at a fitness cost that depends on individual quality. A related issue is whether abnormal physiology correlates with fitness in a given environment. One approach to investigate that question is to use individuals with the greatest fitness in a given environment as the reference population in the DM calculation. Thirdly, a good measure of condition will be crucial for understanding the evolution of costly secondary sexual traits such as ornaments (Hill 2011). Fourthly, it can be applied to understand how individual strategies/decisions such as dispersal express evolutionary trade-offs involving condition. In some sense, Mahalanobis distance may partly correspond to allostatic load (McEwen & Wingfield 2003) and thus may provide a way to test hypotheses about stress and dysregulation in an ecological context (Wingfield et al. 1998).

Another potential application is the computation of DM from biomarkers belonging to specific functional groups to test more mechanistic hypotheses. For instance, response to oxidative stress is thought to play an important role in self-maintenance, and there is evidence that environmental stressors affect the level of integration in this metabolic system (Ristow et al. 2009; Costantini, Metcalfe & Monaghan 2010; Costantini, Monaghan & Metcalfe 2013). DM could serve to assess the normality of an individual's oxidative profile or as a measure of the level of biological integration within oxidative balance systems.

In conclusion, the use of a multivariate statistical distance based on biomarkers provides a more holistic assessment of body condition than univariate proxies because it relates more tightly to homeostasis, general health state or optimal physiology in a given environment, hence to the functionality of essential cellular processes, which likely is an important determinant of fitness. These are probably the aspects that ecologists aim to grasp when they measure condition, and we thus expect the approach to find many applications in future research.


We thank Maarten Brugge and Anne Dekinga for taking care of the captive red knots and Bernard Spaans for capturing the birds. AAC is a member of the FQRS-supported Centre de recherche sur le vieillissement (CDRV) and Centre de recherche Étienne Le-Bel and is a funded Research Scholar of the FQRS. This research was supported by Natural Sciences and Engineering Research Council of Canada (NSERC) grants to AAC, DMB and FV, a CDRV scholarship to EM, the University of Groningen Ubbo Emmius Scholarship and Schure-Beijerinck-Popping Fonds to DMB, the Netherlands Organization for Scientific Research and the University of Groningen to TP, and operational grants from NIOZ Royal Netherlands Institute for Sea Research to TP.

Data accessibility

Data available from the Dryad Digital Repository: http://doi.org/10.5061/dryad.mf0ns