Mixed conditional logistic regression for habitat selection studies

Authors

  • Thierry Duchesne,

    Corresponding author
    1. Département de Mathématiques et de Statistique, Université Laval, Sainte-Foy, QC, Canada G1V 0A6
  • Daniel Fortin,

    1. Chaire de Recherche Industrielle CRSNG-Université Laval en Sylviculture et Faune, Département de Biologie, Université Laval, Sainte-Foy, QC, Canada G1V 0A6
  • Nicolas Courbin

    1. Chaire de Recherche Industrielle CRSNG-Université Laval en Sylviculture et Faune, Département de Biologie, Université Laval, Sainte-Foy, QC, Canada G1V 0A6

Correspondence author. E-mail: thierry.duchesne@mat.ulaval.ca

Summary

1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies.

2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison.

3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions.

4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands.

5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.

Introduction

The resource selection function (RSF) is currently one of the dominant tools used to quantify habitat selection (McLoughlin et al. 2010). RSFs link animal distribution to spatial patterns of habitat heterogeneity by contrasting the characteristics of animal locations with those of a set of random locations (Manly et al. 2002). Random locations are often drawn across home-ranges of individuals (Compton, Rhymer & McCollough 2002), in which case observed (response variable coded as ones) and random (response variable coded as zeros) locations are generally contrasted with unconditional logistic regressions. Under such a sampling design, however, estimation methods must consider that a certain number of random locations might have been visited, in which case they do not all represent true absences (Keating & Cherry 2004; Johnson et al. 2006). The use of a matched design can then become advantageous. With a matched design, each observed location is associated with a specific set of random locations drawn within a limited spatial domain (Boyce 2006), often corresponding to the distance where the animal could have travelled during the relocation time interval (Boyce et al. 2003). Because the animal could not also be at the random locations when its actual location was acquired, random locations represent true absences. Furthermore, matched designs are appropriate when evaluating the habitat selection of animals with home-ranges that are either not well defined or large relative to the distance individuals move between relocations (Arthur et al. 1996; Compton et al. 2002). RSFs based on a matched design are estimated by conditional logistic regression (Compton et al. 2002; Boyce et al. 2003; Boyce 2006; McDonald et al. 2006), an approach that is becoming increasingly used in habitat selection analysis.

Despite difficulties in assigning a variance–covariance structure (Craiu, Duchesne & Fortin 2008; Koper & Manseau 2009), the value of random effects in RSFs has been largely recognized in the case of models developed from non-matched designs (Gillies et al. 2006; Hebblewhite & Merrill 2008). Mixed effects should be better suited to analyse unbalanced data sets or cases where selection for the different landscape attributes varies among individuals (Gillies et al. 2006). Moreover, mixed-effects models can handle the situation where several matched sets of locations come from the same animal and are thus correlated.

The addition of random effects also provides these advantages in studies based on matched sampling designs, but mixed-effects conditional logistic regressions have been largely overlooked in ecological research (but see Bruun & Smith 2003; Fortin et al. 2009). Moreover, unlike mixed-effects conditional logistic regression, fixed-effects models rely on assumptions that might not faithfully represent certain ecological systems. Fixed-effects models assume that the strength of selection is homogeneous among individuals within the population and thus estimate the population-averaged selection. Fixed-effects conditional logistic regression also implies independence from irrelevant alternatives (IIA, Revelt & Train 1998). The IIA hypothesis states that the strength of preference for (i.e. the odds of choosing) habitat type A over habitat type B does not depend on the other habitat types also available. Because behavioural decisions reflect trade-offs among multiple competing demands, changes in available options may alter individual preferences, thereby violating the IIA assumption. For example, prey often make greater use of patches located in relatively safe areas (Hay & Fuller 1981; Morrison et al. 2004; Hochman & Kotler 2007). The foraging efforts and selectivity of Nubian Ibex (Capra nubiana F. Cuvier, 1825) vary with the distance from the safety of a cliff (Hochman & Kotler 2007). In other words, the strength of preference for a given type of food patch over the baseline patch type depends on the presence or absence of a cliff at close proximity, a spatial dependency that might violate the IIA hypothesis. In this context, fixed effects may yield inappropriate conclusions, potentially leading to unfavourable management actions.

In this study, we illustrate how departures from the assumption of homogeneous selection among individuals due either to inter-individual variability in movement rules or to the violation of the IIA assumption may influence the estimation of RSF parameters under a matched sampling design. We begin by showing how mixed effects can be incorporated into conditional logistic regression model. We follow an approach based on random utility theory (Cooper & Millspaugh 1999) because it is easily interpretable in the resource selection context and because the exponential form of the RSF is robust to misspecification (McFadden & Train 2000). We then explain why, unlike the fixed-effects model, its mixed-effects counterpart remains appropriate under some types of violation of IIA. We follow with a simulation-based investigation of the impact of departures from the homogeneity in selection probabilities and from the IIA assumption on the estimation of RSFs. Finally, we illustrate the methods with an analysis of habitat selection by the free-ranging bison Bison bison (Linnaeus, 1758) of Prince Albert National Park (Saskatchewan, Canada) during the springs of 2005–2008. In the spring, bison occasionally leave the park for adjacent private lands where they sometimes damage fences and crops, disturb livestock, and get killed by hunters. It can be beneficial for management to evaluate whether bison use of farmlands results from an active selection, and to quantify whether cross-boundary movements are made by few individuals or whether it is a widespread behaviour. We show that conditional mixed-effects RSFs are better suited than marginal fixed-effects RSFs to achieve this goal.

Materials and methods

Random effects in conditional logistic regression

As with ordinary (unconditional) logistic regression, random effects can be included in conditional logistic regression models by replacing fixed regression coefficients with random coefficients. Because of the conditioning involved, conditional models have no intercept term and random effects are included as random regression coefficients. In the resulting mixed multinomial logit model (sensu Revelt & Train 1998), each animal assigns a value, termed utility (U), to all landscape locations, and among the locations available at a given time selects the one with the highest utility (Cooper & Millspaugh 1999; McDonald et al. 2006). Let n = 1,…,K represent the individuals, t = 1,…,tn the time steps for individual n and j = 1,…,J the available locations (or a sample of all available locations, McDonald et al. 2006) for animal n at time step t. The mixed multinomial logit model considers utilities as random variables, with Unjt being the utility that animal n assigns to the jth location available at time step t. Let $x^{(1)}_{njt},\dots,x^{(m)}_{njt}$ represent the values of m covariates (e.g. habitat attributes) measured at the jth location available to animal n at time step t. Now let us assume that the utility assigned to a location depends on its attributes, viz.

$$U_{njt} = \beta_1 x^{(1)}_{njt} + \dots + \beta_m x^{(m)}_{njt} + b_{n1} z^{(1)}_{njt} + \dots + b_{nq} z^{(q)}_{njt} + \varepsilon_{njt} = \beta' x_{njt} + b_n' z_{njt} + \varepsilon_{njt}$$ (eqn 1)

where β1,…,βm are the fixed regression coefficients, bn1,…,bnq are animal-level random effects, $z^{(1)}_{njt},\dots,z^{(q)}_{njt}$ are fixed values specifying the structure of the random effects (usually equal to the subset of the covariates $x^{(1)}_{njt},\dots,x^{(m)}_{njt}$ for which coefficients are random), εnjt are independent and identically distributed random error terms, β = (β1,…,βm)′, xnjt = ($x^{(1)}_{njt},\dots,x^{(m)}_{njt}$)′, bn = (bn1,…,bnq)′ and znjt = ($z^{(1)}_{njt},\dots,z^{(q)}_{njt}$)′. We make the assumption that the random errors follow an extreme value distribution, which reduces to the usual exponential RSF when there are no random effects (see below); this assumption is mild and the model thereby specified is very flexible (McFadden & Train 2000). Let the random effects b be independent and identically distributed with density f(b;θ), with θ a vector of unknown parameters. The probability that an animal chooses location j within the set of J locations {1, 2,…,J}, i.e. Unjt > Unit for all i ≠ j, is

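$$\Pr(\text{location } j \text{ chosen}) = \int \frac{\exp(\beta' x_{njt} + b' z_{njt})}{\sum_{i=1}^{J} \exp(\beta' x_{nit} + b' z_{nit})}\, f(b;\theta)\, db$$ (eqn 2)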

Though the distribution of the random effects is typically chosen as the multivariate normal distribution with mean vector 0 and variance–covariance parameters to be estimated (Gillies et al. 2006; Hebblewhite & Merrill 2008), other distributions such as the lognormal, uniform or triangular can be used (Bhat 2001). When all $z^{(k)}_{njt}$ in eqn (1) take on value zero or when the variance of b is null (i.e. b is identically 0), eqn (2) simplifies to

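$$\Pr(\text{location } j \text{ chosen}) = \frac{\exp(\beta' x_{njt})}{\sum_{i=1}^{J} \exp(\beta' x_{nit})}$$ (eqn 3)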

and we get the ordinary (i.e. fixed effects) conditional logistic regression model (McDonald et al. 2006).
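To make eqns (2) and (3) concrete, the following minimal Python sketch computes the selection probabilities for one matched set. It is an illustration under stated assumptions, not the code used in this study: the function names are hypothetical, a single normally distributed random coefficient is assumed, and the integral in eqn (2) is approximated by plain Monte Carlo averaging.

```python
import math
import random

def conditional_logit_probs(x, beta):
    """Eqn (3): selection probabilities over the J locations of one matched set.
    x is a list of J covariate vectors; beta is a list of m coefficients."""
    scores = [sum(b * v for b, v in zip(beta, row)) for row in x]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def mixed_logit_probs(x, beta, sigma, random_index=0, n_draws=5000, seed=1):
    """Eqn (2), approximated by Monte Carlo (an assumption of this sketch):
    average eqn (3) over draws b ~ N(0, sigma^2) added to one coefficient."""
    rng = random.Random(seed)
    acc = [0.0] * len(x)
    for _ in range(n_draws):
        beta_n = list(beta)
        beta_n[random_index] += rng.gauss(0.0, sigma)
        probs = conditional_logit_probs(x, beta_n)
        acc = [a + p for a, p in zip(acc, probs)]
    return [a / n_draws for a in acc]

# Hypothetical matched set with three locations and two covariates.
x_set = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
p_fixed = conditional_logit_probs(x_set, [0.5, -1.0])
p_mixed = mixed_logit_probs(x_set, [0.5, -1.0], sigma=1.0)
```

When sigma is set to zero, every draw of b equals zero and the second function returns the same probabilities as the first, which mirrors the reduction of eqn (2) to eqn (3).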

Random effects, heterogeneity in selection and the dependence from irrelevant alternatives

The addition of individual-level random effects in RSFs relaxes the assumption of homogeneous selection among animals. For example, adding an animal-level random regression coefficient allows for inter-individual variations in the response to covariate x, which means that each individual may respond differently to changes in x. Because the random effects are unobserved random variables that are common to all the locations of a given individual, the mixed-effects model does not assume that the observations of that individual are uncorrelated (Revelt & Train 1998). Note that, though they do not explicitly model the animal-level heterogeneity, fixed-effects models estimated by methods such as generalized estimating equations can handle correlated matched sets (Craiu et al. 2008).

The mixed multinomial logit model relaxes the IIA assumption, but only at a population level. It does so by inducing correlation over alternatives in the stochastic portion of utility (Revelt & Train 1998; Skrondal & Rabe-Hesketh 2003). To illustrate this, we consider a forager, such as the Nubian Ibex (Hochman & Kotler 2007), responding to spatial patterns of risk. Suppose that each location is of one of three types, which we code using covariates xjP and xjC: location j may be a risky food patch, coded as xjP = 1, xjC = 0; a safe cliff, coded as xjP = 0, xjC = 1; or a baseline habitat that offers no food or protection, coded as xjP = 0, xjC = 0. We assume that J > 2 locations are available and that location j = 1 is a food patch (x1P = 1, x1C = 0), and location j = 2 is the baseline habitat (x2P = 0, x2C = 0). If we assume a fixed-effect conditional logistic regression model (McDonald et al. 2006) with RSF proportional to exp(βPxjP + βCxjC), then the ratio of the probability that the animal selects location j = 1 to the probability that the same animal (or another animal chosen at random) selects location j = 2 is given by

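$$\frac{\Pr(\text{location } 1 \text{ chosen})}{\Pr(\text{location } 2 \text{ chosen})} = \frac{\exp(\beta_P x_{1P} + \beta_C x_{1C})}{\exp(\beta_P x_{2P} + \beta_C x_{2C})} = \exp(\beta_P)$$ (eqn 4)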

which does not depend on whether there is a cliff among the other available locations. Now let us assume the same model, but this time with a random slope βP + b for covariate xP instead of the fixed slope βP. Because b remains fixed for all the locations of a given animal, the ratio of the probability that the animal chooses location j = 1 to the probability that it selects location j = 2 is given by

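$$\frac{\exp\{(\beta_P + b) x_{1P} + \beta_C x_{1C}\}}{\exp\{(\beta_P + b) x_{2P} + \beta_C x_{2C}\}} = \exp(\beta_P + b)$$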

which still does not depend on the attributes of the alternate locations, but it depends on b, the unobserved animal-specific random effect. Now, if we consider the ratio of the probability that an animal chosen at random selects location j = 1 to the probability that another animal, again chosen at random, selects location j = 2, then we get

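$$\exp(\beta_P)\left[\frac{\displaystyle\int \frac{\exp(b)}{\sum_{i=1}^{J}\exp\{(\beta_P + b) x_{iP} + \beta_C x_{iC}\}}\, f(b;\theta)\, db}{\displaystyle\int \frac{1}{\sum_{i=1}^{J}\exp\{(\beta_P + b) x_{iP} + \beta_C x_{iC}\}}\, f(b;\theta)\, db}\right]$$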

where the quantity in square brackets now depends on the characteristics of all available locations (Train 2003). This model thus relaxes the IIA assumption at the population level. In other words, by adding random coefficients in the conditional logistic regression model, the population-averaged probability of choosing a given habitat type depends on the local alternatives.
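The contrast between the fixed-effects ratio and the population-averaged ratio can be checked numerically. The short Python sketch below is only an illustration with arbitrary, hypothetical parameter values (βP = 0·5, βC = 1, σ = 1·5) and a normal random effect; the integral over b is approximated with equal-weight quantile nodes, a simple quadrature rule chosen here for convenience rather than taken from this study.

```python
import math
from statistics import NormalDist

def probs(types, beta_p, beta_c, b=0.0):
    """Selection probabilities for locations coded 'P' (food patch),
    'C' (cliff) or 'B' (baseline), given the animal-specific effect b."""
    weights = [math.exp((beta_p + b) * (t == "P") + beta_c * (t == "C"))
               for t in types]
    total = sum(weights)
    return [w / total for w in weights]

def population_ratio(types, beta_p, beta_c, sigma, n_nodes=2000):
    """Pr(location 1) / Pr(location 2), each averaged over b ~ N(0, sigma^2)."""
    if sigma == 0:
        p = probs(types, beta_p, beta_c)
        return p[0] / p[1]
    dist = NormalDist(0.0, sigma)
    nodes = [dist.inv_cdf((k + 0.5) / n_nodes) for k in range(n_nodes)]
    num = sum(probs(types, beta_p, beta_c, b)[0] for b in nodes)
    den = sum(probs(types, beta_p, beta_c, b)[1] for b in nodes)
    return num / den

no_cliff = ["P", "B", "B"]     # location 1 is a food patch, location 2 baseline
with_cliff = ["P", "B", "C"]   # same, but the third location is a cliff

fixed_no, fixed_with = (population_ratio(s, 0.5, 1.0, 0.0) for s in (no_cliff, with_cliff))
mixed_no, mixed_with = (population_ratio(s, 0.5, 1.0, 1.5) for s in (no_cliff, with_cliff))
```

With σ = 0 both choice sets give the same ratio, exp(βP), as in eqn (4). With σ > 0 the two population-averaged ratios differ, which is the population-level relaxation of IIA described above.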

Estimation and inference

We now consider maximum likelihood estimation of the parameters of the model described by eqns (1) and (2) on the basis of data obtained with a matched sampling design. To simplify the notation and without loss of generality, we assume that the location chosen by animal n at time step t among the J available locations is assigned label j = 1 (and thus the locations not chosen are assigned labels j = 2, 3,…,J). Maximum-likelihood estimates of the RSF and random effects distribution parameters are obtained by finding the values of β and θ maximizing:

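$$L(\beta,\theta) = \prod_{n=1}^{K} \int \left\{ \prod_{t=1}^{t_n} \frac{\exp(\beta' x_{n1t} + b' z_{n1t})}{\sum_{j=1}^{J} \exp(\beta' x_{njt} + b' z_{njt})} \right\} f(b;\theta)\, db$$ (eqn 5)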

Because eqn (5) is a valid likelihood function, any likelihood-based inference method for β, such as Wald confidence intervals based on inverting the Hessian of the negative log-likelihood, likelihood-ratio tests, or AIC-based model selection can be applied (McFadden & Train 2000).

According to parsimony principles, the need for random effects in RSFs should be assessed. If random effects are not needed, then fixed-effects conditional regression would improve estimation efficiency and model interpretability (Verbeke & Molenberghs 2000). The fixed-effects model can be considered as a special case of the mixed-effects model where the variance and covariance parameters in f(b; θ) are zero. A likelihood-ratio test for nested models can thus be used to evaluate the need to increase model complexity through the use of random effects. The likelihood-ratio statistic that tests whether the fixed-effects model is reasonable is given by LR = 2(ℓ1 − ℓ0), where ℓ0 and ℓ1 are the values of the maximized log-likelihoods of the fixed- and mixed-effects models, respectively. Because the value zero is on the boundary of the parameter space for variance parameters, the P-value is not simply based on the usual chi-squared distribution but rather on a mixture of chi-squared distributions, with the number of chi-squared variables in the mixture and their respective numbers of degrees of freedom depending on the structure of the variance and covariance parameters set to zero (Verbeke & Molenberghs 2000). Consider for example a mixed-effects model with a single random effect b with distribution N(0, σ²). The likelihood-ratio statistic to test whether b is needed follows, under the null model, an equal-weight mixture of two chi-squared distributions with zero and one degree of freedom, respectively. This reduces the P-value to $\tfrac{1}{2}P(\chi^2_1 > \mathrm{LR})$, with $\chi^2_1$ representing a chi-squared random variable with 1 degree of freedom.
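For the single-random-effect case, the boundary-corrected P-value is easy to compute. The Python sketch below is a hedged illustration: it assumes the equal-weight mixture of chi-squared distributions with zero and one degree of freedom, uses hypothetical function names, and takes as input the maximized log-likelihoods reported in Tables 1 and 2.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt_pvalue(loglik_fixed, loglik_mixed):
    """P-value for H0: sigma^2 = 0 with a single random effect, assuming the
    50:50 mixture of chi-squared(0) and chi-squared(1) under the null."""
    stat = max(0.0, 2.0 * (loglik_mixed - loglik_fixed))
    return 0.5 * chi2_1_sf(stat)

# Maximized log-likelihoods as reported in Table 2 (bison) and
# Table 1, scenario 1 (rounded to three decimals in the tables).
p_bison = boundary_lrt_pvalue(-5947.846, -5930.033)
p_scenario1 = boundary_lrt_pvalue(-22030.021, -22030.020)
```

Because the tabulated log-likelihoods are rounded, a P-value recomputed this way for scenario 1 can differ slightly from the value reported in the text.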

Direct numerical maximization of L(β, θ) given by eqn (5) can be difficult, as it involves integrals that cannot be solved analytically. The numerical maximization of the likelihood is often more likely to converge for a fixed-effects RSF than its mixed-effects counterpart. Bhat (2001) described simulation methods based on Halton quasi-random numbers that can efficiently evaluate the likelihood function. Maximization of the likelihood from eqn (5) can be implemented with this method using the mxlmsl package (Train 2006) for MATLAB R2008a (MathWorks Inc. 2008). We provide the MATLAB code used for our bison case study in Appendix S3.
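The idea behind simulation-based evaluation of eqn (5) can be sketched in a few lines of Python. This is not the mxlmsl package nor the code in Appendix S3; it is a simplified illustration with hypothetical names, a single normal random coefficient, base-2 Halton points, and the same draws reused for every animal, which is an assumption made here for brevity.

```python
import math
from statistics import NormalDist

def halton(index, base=2):
    """index-th element (index >= 1) of the Halton sequence in the given base."""
    value, frac = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * frac
        frac /= base
    return value

def simulated_loglik(beta, sigma, animals, n_draws=100):
    """Simulated log of eqn (5) with one random coefficient b ~ N(0, sigma^2).
    animals: list of animals; each animal is a list of matched sets; each
    matched set is a list of (x, z) pairs with the chosen location first,
    where x is a covariate vector and z the scalar multiplying b."""
    std_normal = NormalDist()
    draws = [sigma * std_normal.inv_cdf(halton(r)) for r in range(1, n_draws + 1)]
    total = 0.0
    for matched_sets in animals:
        avg = 0.0
        for b in draws:
            prod = 1.0  # product over the time steps of one animal
            for locs in matched_sets:
                w = [math.exp(sum(bk * xk for bk, xk in zip(beta, x)) + b * z)
                     for x, z in locs]
                prod *= w[0] / sum(w)
            avg += prod
        total += math.log(avg / n_draws)
    return total

# Hypothetical toy data: one animal, one matched set, two locations.
toy = [[[([1.0], 1.0), ([0.0], 0.0)]]]
ll = simulated_loglik([0.3], 0.8, toy)
```

In practice this function would be passed to a numerical optimizer over β and σ. For long series of matched sets the product inside the loop can underflow, so a production implementation would work on the log scale.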

There are other, albeit less direct, means of maximizing the likelihood from eqn (5). Chen & Kuo (2001) showed how to build a nonlinear Poisson model with random effects whose likelihood is equivalent to a closely related multinomial formulation of eqn (5). The required Poisson model can be fitted by maximum likelihood, where the integrals are evaluated with adaptive Gaussian quadrature or penalized quasi-likelihood. Bruun & Smith (2003) used the latter approach to evaluate habitat selection by European starlings (Sturnus vulgaris Linnaeus, 1758). Mixed conditional logistic regression models can also be fitted with Bayesian methods, but the approach then requires specifying prior distributions (informative or not) for β and θ. R.V. Craiu, T. Duchesne, D. Fortin & S. Baillargeon (unpublished data) propose a numerically stable and efficient two-step method that gives accurate approximations to the maximum-likelihood estimates for mixed-effects conditional logistic regression. Methods based on the results of the first step (i.e. separate models fitted to each animal) of such a two-step approach could perhaps help in determining whether the need for random effects arises from between-animal heterogeneity or the violation of IIA.

Example 1: Simulation of patch selection under predation risk

We use computer simulations to investigate the effect of departures from the assumption of homogeneous habitat selection among individuals. Deviations from the assumption were induced by imposing inter-individual variations in movement rules and by forcing movement decisions that violated the IIA assumption. Individual-based, spatially explicit modelling was conducted using the Spatially Explicit Landscape Event Simulator (Fall & Fall 2001). We simulated the movements of 200 virtual foragers, with each individual starting (time 0) at a random location within the landscape (1000 × 1000 cells), and followed for 50 consecutive moves. Landscapes comprised four types of randomly distributed habitat patches: Patch type H1 offered the most food, followed by H2. Neither H3 nor H4 offered any food. H1 was risky, unless located <15 cells from H3, in which case H1 became safe. H2 was always safe.

We tested four scenarios differing in the movement rules of individuals, with distinct statistical implications. Movements for scenarios 1 and 2 were both consistent with the IIA hypothesis; scenario 1 assumed a homogeneous movement rule, whereas scenario 2 involved inter-individual variation in the rules. Scenarios 3 and 4 both led to violation of the IIA hypothesis at the individual level, because the preference for H1 over H2 depended on whether H3 occurs within 15 cells; a homogeneous movement rule was used for scenario 3 whereas inter-individual variation in movement rules characterized scenario 4. The movement rules as well as the landscape used for each of the four scenarios are described in detail in Appendix S1. To assess the effect of varying patch availability on inferences, scenario 3 was applied to five additional landscapes, where the proportions of H1 and H2 remained unchanged but those of H3 and H4 varied according to Landscape 1: 0·01%, 69·99%, Landscape 2: 0·02%, 69·98%, Landscape 3: 0·03%, 69·97%, Landscape 4: 0·05%, 69·95%, and Landscape 5: 0·06%, 69·94%, respectively.

In all scenarios, each observed location was matched to 10 locations randomly drawn within a 30-cell radius, which was enough to encompass all step distances (Forester, Im & Rathouz 2009). Patch type (H1–H4) was identified at all observed and random locations. Fixed- and mixed-effects conditional logistic regressions were used to build RSFs. Mixed-effects RSFs allowed the coefficient of H1 to vary among individuals according to N(β1, σ²). In all models, H2 was used as the baseline patch type. Models were fitted by maximizing the likelihood given by eqn (5) using a publicly available MATLAB R2008a (MathWorks Inc. 2008) package (Train 2006).
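The matched sampling step can be sketched as follows. This Python fragment is an illustration only and makes assumptions not stated above: random locations are drawn uniformly over the disc, coordinates are treated as continuous rather than as grid cells, and landscape edges are ignored. The function name is hypothetical.

```python
import math
import random

def draw_matched_set(x0, y0, radius=30.0, n_random=10, rng=None):
    """Return n_random points drawn uniformly over the disc of the given
    radius centred on the observed location (x0, y0)."""
    rng = rng or random.Random()
    points = []
    for _ in range(n_random):
        r = radius * math.sqrt(rng.random())  # sqrt gives uniform density by area
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((x0 + r * math.cos(angle), y0 + r * math.sin(angle)))
    return points

# Ten random locations matched to a hypothetical observed location.
matched = draw_matched_set(500.0, 500.0, rng=random.Random(42))
```

Each observed location and its random locations then form one stratum of the conditional logistic regression.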

Example 2: Habitat selection by free-ranging bison

The field study was conducted in the springs of 2005–2008 (9 March–31 May 2005, 1 March–31 May in 2006 and 2007, and 1 March–10 March 2008) in Prince Albert National Park, where the bison population comprised 385 individuals. The bison range is mostly composed of forests (85%), meadows (10%) and water bodies (5%). The range is adjacent to farmlands, where bison are occasionally found.

We followed 24 female bison equipped with Global Positioning System collars (GPS collar 4400M from Lotek Engineering, Newmarket, ON, Canada) taking locations at 06:00 and 18:00 hours. Each observed location was paired with 10 random locations sampled within a 1·6-km radius circle (>90% of all travelled distances between relocations).

Land-cover types at observed and random locations were characterized based on classified Landsat ETM+ satellite images (Fortin et al. 2009). Land-cover types were (i) meadow, including areas near lakes and rivers dominated by grasses, forbs and sedges (MEADOW); (ii) riparian areas composed largely of shrubs and located near streams and rivers (RIPARIAN); (iii) forest consisting of deciduous, conifer and mixed stands (FOREST); (iv) water bodies (WATER); (v) road, including the areas located <15 m from a human-made trail or a road (ROAD); and (vi) farmlands (AGRIC). Fixed- and mixed-effects conditional logistic regressions fitted by maximum likelihood were used to build RSFs. Random effects assuming N(0, σ²) were investigated for AGRIC, with FOREST as the baseline land-cover type.

Results

Example 1: Simulation of patch selection under predation risk

Scenario 1 represented a situation where the IIA hypothesis was valid and where the movement strategy was fixed within the population of simulated foragers. As expected in such cases, the fixed- and mixed-effects RSFs yielded a similar coefficient estimate for H1 of −0·91 ± 0·03 (±SE) (Table 1), which agrees with the theoretical approximation (Appendix S2). Moreover, the standard deviation (SD) of the random coefficient associated with the mixed-effects model did not differ significantly from 0 (likelihood-ratio test: P = 0·46), indicating that a random coefficient for H1 was not required (Table 1).

Table 1. Patch selection estimated by fixed- or mixed-effects conditional logistic regressions with normally distributed coefficients, for virtual foragers travelling in landscapes according to four scenarios. The scenarios differed depending on whether movement rules were similar among all individuals of the population and whether the assumption of independence from irrelevant alternatives (IIA) was violated. H2 was the baseline patch type in all resource selection functions

| Variable | Fixed-effects model: β | SE | 95% CI | Mixed-effects model: β | SE | 95% CI |
|---|---|---|---|---|---|---|
| Scenario 1: no inter-individual variation, IIA assumption respected | | | | | | |
| Fixed coefficient | | | | | | |
| H1 | −0·908 | 0·031 | −0·969, −0·847 | | | |
| H4 | −1·520 | 0·024 | −1·567, −1·473 | −1·520 | 0·024 | −1·567, −1·473 |
| Random coefficient | | | | | | |
| H1 | | | | −0·908 | 0·031 | −0·969, −0·847 |
| SD of coefficient | | | | 0·000 | 0·174 | |
| Max. log likelihood | −22 030·021 | | | −22 030·020 | | |
| Scenario 2: inter-individual variation, IIA assumption respected | | | | | | |
| Fixed coefficient | | | | | | |
| H1 | −0·835 | 0·031 | −0·896, −0·774 | | | |
| H4 | −1·528 | 0·024 | −1·575, −1·481 | −1·528 | 0·024 | −1·575, −1·481 |
| Random coefficient | | | | | | |
| H1 | | | | −0·873 | 0·041 | −0·953, −0·793 |
| SD of coefficient | | | | 0·368 | 0·041 | |
| Max. log likelihood | −22 023·359 | | | −21 999·292 | | |
| Scenario 3: no inter-individual variation, IIA assumption violated | | | | | | |
| Fixed coefficient | | | | | | |
| H1 | 0·073 | 0·027 | 0·020, 0·126 | | | |
| H3 | −0·736 | 0·528 | −1·771, 0·299 | −0·712 | 0·530 | −1·751, 0·327 |
| H4 | −1·458 | 0·026 | −1·509, −1·407 | −1·462 | 0·027 | −1·515, −1·409 |
| Random coefficient | | | | | | |
| H1 | | | | 0·006 | 0·060 | −0·112, 0·124 |
| SD of coefficient | | | | 0·752 | 0·046 | |
| Max. log likelihood | −21 557·935 | | | −21 273·523 | | |
| Scenario 4: inter-individual variation, IIA assumption violated | | | | | | |
| Fixed coefficient | | | | | | |
| H1 | 0·019 | 0·028 | −0·036, 0·074 | | | |
| H3 | −1·540 | 0·724 | −2·959, −0·121 | −1·461 | 0·726 | −2·884, −0·038 |
| H4 | −1·454 | 0·026 | −1·505, −1·403 | −1·450 | 0·026 | −1·501, −1·399 |
| Random coefficient | | | | | | |
| H1 | | | | −0·062 | 0·061 | −0·182, 0·058 |
| SD of coefficient | | | | 0·764 | 0·047 | |
| Max. log likelihood | −21 655·263 | | | −21 362·713 | | |

The IIA assumption remained valid in scenario 2, but this time each animal had a different probability of choosing H1. In this context, the mixed-effects RSF received greater empirical support (Table 1) as its random coefficient was an important addition to the model fit (likelihood-ratio test: P < 0·0001).

We now consider a situation (scenario 3) where all individuals displayed the same movement strategy, but where the IIA assumption was violated because the odds of choosing H1 depended on whether a refuge patch H3 was in close proximity. Selection coefficients for H1 were then systematically lower when estimated by mixed-effects conditional logistic regression than by their fixed-effects counterpart (Fig. 1). Whether H1 was selected or avoided remained generally consistent with both models, with the exception of when H3 made up 0·04% of the landscape. In this case, the fixed-effects RSF suggested a significant selection for H1, whereas the better fitting (likelihood-ratio test: P < 0·0001) mixed-effects model revealed that the average simulated forager had no overall selection for H1 (Fig. 1). In the most complex scenario 4, virtual foragers not only violated the premise of IIA, but the strength of selection for H1 also differed among them. Modelling habitat selection under this scenario required, once again, the use of a random coefficient for H1 (likelihood-ratio test: P < 0·0001).

Figure 1.

 Changes in the selection coefficient (±95% confidence intervals) for patch type H1 by simulated foragers as a function of the percentage of the landscape composed of refuge patch H3, as assessed by resource selection functions estimated from fixed- or mixed-effects conditional logistic regression. Simulations were made according to scenario 3, where the probability that a forager selects H1 compared to H2 increased when a refuge H3 was in close proximity. We also indicate the expected proportion of the population having a positive coefficient for patch type H1, based on the N($\hat\beta_1, \hat\sigma^2$) estimate of the distribution of β1 + b. Note that values for fixed and random coefficients were slightly offset from one another to increase clarity.

Example 2: Habitat selection of free-ranging bison

Compared to the forest matrix, female bison selected meadows, water bodies and roads, but displayed no preference for riparian areas (Table 2). The response of bison to farmlands differed depending on whether fixed- or mixed-effects RSFs were used. The population-averaged fixed-effects RSF indicated a general selection for farmlands over forest areas, whereas the mixed-effects model revealed that bison had no preference for one land-cover type over the other. The mixed-effects model provided a better depiction of bison selection than the fixed-effects RSF (likelihood-ratio test: P < 0·0001). The mixed-effects RSF revealed important heterogeneity in the response to farmlands within the population (Table 2), with 41% (N[−0·275, 1·538]) of female bison having a positive selection coefficient for farmlands.
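The proportion of animals with a positive farmland coefficient follows directly from the estimated normal distribution of the random coefficient. The Python lines below are a minimal sketch of that arithmetic, using the mean and SD reported in Table 2; the function name is hypothetical.

```python
from statistics import NormalDist

def share_positive(mean, sd):
    """P(coefficient > 0) when the coefficient is distributed N(mean, sd^2)."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)

# Mean and SD of the random farmland coefficient (Table 2).
share = share_positive(-0.275, 1.243)  # about 0.41
```

The same calculation applied to the scenario 3 estimates gives the expected proportions of foragers with a positive H1 coefficient shown in Fig. 1.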

Table 2. Resource selection functions for radiocollared female bison in Prince Albert National Park during the springs of 2005–2008, as estimated with fixed- or mixed-effects conditional logistic regressions, with normally distributed coefficients

| Variable | Fixed-effects model: β | SE | 95% CI | Mixed-effects model: β | SE | 95% CI |
|---|---|---|---|---|---|---|
| Fixed coefficient | | | | | | |
| Meadow | 2·024 | 0·046 | 1·934, 2·114 | 2·024 | 0·046 | 1·934, 2·114 |
| Water | 0·399 | 0·094 | 0·215, 0·583 | 0·401 | 0·094 | 0·217, 0·585 |
| Riparian area | −0·315 | 0·163 | −0·635, 0·005 | −0·301 | 0·163 | −0·620, 0·018 |
| Road | 0·942 | 0·143 | 0·663, 1·222 | 0·953 | 0·143 | 0·673, 1·233 |
| Farmlands | 0·348 | 0·118 | 0·117, 0·579 | | | |
| Random coefficient | | | | | | |
| Farmlands | | | | −0·275 | 0·377 | −1·014, 0·464 |
| SD of coefficient | | | | 1·243 | 0·344 | |
| Max. log likelihood | −5947·846 | | | −5930·033 | | |
| Likelihood-ratio test | P < 0·0001 | | | | | |

Discussion

We used spatially explicit simulations to demonstrate how mixed-effects conditional logistic regression can capture inter-individual variation in selection induced by differences in movement rules among simulated foragers and by the presence or absence of refuge patches (which led to the violation of the IIA assumption). When the relative preference of resource patches was the same for all individuals and IIA was true (scenario 1), the fixed- and mixed-effects models estimated almost identical regression coefficients. Fixed-effects RSFs then provided an accurate representation of habitat selection within the population and were more parsimonious than mixed-effects RSFs. In contrast, when habitat selection probabilities varied among individuals but the IIA was still a valid assumption (scenario 2), the likelihood-ratio test indicated that the selection for H1 varied significantly within the population, thereby rejecting the fixed-effects model. These conclusions for scenario 2 also held under scenarios 3 (no inter-individual variation in selection and violation of the IIA assumption) and 4 (inter-individual variability in selection and violation of the IIA assumption). In these cases, RSFs that include random effects gave a more accurate representation of habitat selection in the population.

The simulation study also demonstrated that individual-level heterogeneity can be identified and taken into account in RSFs, even when data are collected under a matched sampling design. Furthermore, the simulations (i.e. scenario 3) stress that inter-individual variations in movement rules are only one potential source of heterogeneity which may entail the use of mixed-effects conditional logistic regression to analyse habitat selection data gathered from matched sampling designs. The trade-offs between food intake and predator avoidance can shape movement decisions, potentially leading to the violation of the IIA assumption. A faulty assumption of IIA may introduce sufficient heterogeneity in the response of animals to their habitat for random effects to be needed to adequately model animal distribution in response to spatial heterogeneity. Situations where the observed selection violates the IIA assumption can still be modelled with fixed-effects models when animals are homogeneous in their landscape preference. In this situation, the strength of preference for one habitat type over another depends on available alternatives and this dependence has to be modelled correctly and explicitly in the RSF using proper interaction terms. This precise knowledge is likely to be missing a priori in many studies and mixed-effects models offer a robust safeguard in such cases.

Findings from the simulations imply that the heterogeneity in selection for farmlands expressed by the female bison of Prince Albert National Park can be due to several factors, including inter-individual variations in movement decisions and the violation of the IIA assumption. Mixed-effects logistic regression can conveniently handle both sources of heterogeneity and thereby provide a robust framework for ecological inference. We concurrently modelled the response of bison to multiple habitat attributes before drawing conclusions about their response to farmlands. For example, we found that bison selected roads, as well as meadows where individuals can find large quantities of high-quality food (Fortin, Fryxell & Pilote 2002; Craiu et al. 2008; Fortin et al. 2009). Fixed- and mixed-effects RSFs then pointed to distinct responses of bison to farmlands. Fixed-effects models implied that bison generally made selective use of farmlands, whereas the mixed-effects RSFs refuted this assessment by revealing heterogeneous selection for farmlands. A likelihood-ratio test revealed that the mixed-effects RSF was superior to its fixed-effects counterpart. We thus conclude that the problem of cross-boundary movements is linked to a subset, though a fairly large one, of individuals within the population, with c. 40% of female bison making selective use of farmlands. The mixed-effects RSF thus gives park managers a very different picture from the general selection for farmlands that was implied by the population-averaged fixed-effects RSF. Solving human-wildlife conflicts may depend on whether the problem originates from a restricted number of individuals. In this case, the translocation of ‘problematic’ individuals can be the solution (Sukumar 1991; Jones & Nealson 2003). On the other hand, this management approach might not be as effective when all members of the population adopt an ‘unacceptable’ behaviour. Management or conservation actions should be tailored to the nature of the problem, and mixed-effects models are often better suited than fixed-effects models to evaluate the situation adequately.

Our study stressed how drawing robust inference from RSFs may require the use of random effects in conditional logistic regression models. We demonstrated how fixed and mixed conditional logistic regression can lead to different conclusions about animal–habitat interactions. Our simulations illustrated that in some situations models with random coefficients, which yield individual-specific inferences, can provide a more accurate assessment of resource selection by animals compared with fixed-effects models that provide population-averaged inference (Fieberg et al. 2009; Koper & Manseau 2009). We found that the inferred selection for agricultural lands by the population of free-ranging bison of Prince Albert National Park can differ depending on whether random coefficients are used or not. Such differences could have important management and conservation implications. Indeed, habitat selection is commonly used to identify critical resources (Arthur et al. 1996), suitable habitat (Fortin et al. 2008), responses to anthropogenic disturbances (Hebblewhite & Merrill 2008) and ecological consequences of species reintroduction (Whittaker & Lindzey 2004; Mao et al. 2005). A biased assessment of habitat selection may therefore result in inadequate management or conservation actions. Matched sampling designs and conditional logistic regressions are increasingly used in ecological research (e.g. for RSFs, Boyce 2006; for step selection functions, Fortin et al. 2005), and fixed-effects models may lead to mistaken inferences about selection whenever hypotheses such as IIA or homogeneous strength of selection among animals are not respected. We suggest that mixed-effects conditional logistic regression should become a valuable, and sometimes necessary, statistical tool for valid inference in ecological research.

Acknowledgements

Funding for this study was provided by Parks Canada Species at Risks Recovery Action and Education Fund, a program supported by the National Strategy for the Protection of Species at Risk, Natural Sciences and Engineering Research Council of Canada, Canada Foundation for Innovation, and l’Université Laval. We are grateful to L. O’Brodovich and D. Frandsen, M.-E. Fortin, K. Dancose and S. Courant for their assistance in the field, and to Pierre Racine for his help with SELES, and James Hodson for his editorial comments on the study.
