## Introduction

New technologies are making it possible to simultaneously, and remotely, collect time series data on an animal's physiological condition and its physical location (Bevan *et al*. 1995; Zub *et al*. 2009; Cagnacci *et al*. 2010a,b; Signer *et al*. 2010; Tomkiewicz *et al*. 2010). It is hoped that these data will provide novel insights into animal behaviour (Cagnacci *et al*. 2010a; Gaillard *et al*. 2010).

Care must be taken, however, when attempting to infer causal relationships from biotelemetry data. The possibility of unmeasured confounders, responsible for driving both physiological measurements and animal movement, must be considered. Further, response values $y_t$ may be predictive of future covariate values $x_{t+1}$ (even after conditioning on $x_1, \ldots, x_t$). When this occurs, the covariate process is said to be endogenous with respect to the response variable (Diggle *et al*. 2002), which has implications for the choice of both statistical estimation targets and estimators of these quantities. For example, unbiased estimation of cross-sectional mean parameters relating $y_t$ and $x_t$ requires a 'working independence assumption' (i.e. for the purposes of estimation, observations are treated as independent, but robust standard errors that allow for correlation may be used for inference) (Pepe & Anderson 1994; Diggle *et al*. 2002). Popular methods for analysing correlated data, such as likelihood-based methods that allow for autoregressive error structures or generalized estimating equations with non-independent working correlation structures, by contrast, do not result in consistent estimators of cross-sectional mean parameters when the covariate is endogenous (see, for example, Section 2·3 of Diggle *et al*. 2002). In particular, these methods require that $E[y_t \mid x_t] = E[y_t \mid x_1, \ldots, x_n]$, which will not be the case when $x_t$ is endogenous (Pepe & Anderson 1994; Diggle *et al*. 2002).
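The contrast between working independence and an autoregressive working structure can be illustrated with a small simulation. The sketch below is not the analysis used in this study: the data-generating model, the parameter values (`beta`, `gamma`, `rho`) and the function names are all hypothetical, chosen only so that the response feeds back into the next covariate value. The AR(1) fit is approximated by quasi-differencing with a fixed, assumed working autocorrelation rather than by a full likelihood fit.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (intercept included)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate_feedback(n, beta=1.0, gamma=0.5, seed=1):
    """Hypothetical model: y_t = beta * x_t + e_t with independent errors,
    and x_{t+1} = gamma * y_t + u_{t+1}, so x is endogenous."""
    rng = random.Random(seed)
    xs, ys = [], []
    x = rng.gauss(0.0, 1.0)
    for _ in range(n):
        y = beta * x + rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(y)
        x = gamma * y + rng.gauss(0.0, 1.0)  # feedback from response to covariate
    return xs, ys


def quasi_difference(values, rho):
    """AR(1) transform with a fixed working autocorrelation rho
    (the first observation is dropped)."""
    return [values[t] - rho * values[t - 1] for t in range(1, len(values))]


xs, ys = simulate_feedback(20000)

# Working independence: ordinary least squares on the raw series.
beta_indep = ols_slope(xs, ys)

# AR(1) working structure: least squares on the quasi-differenced series.
rho = 0.5  # assumed working autocorrelation, not estimated from the data
beta_ar1 = ols_slope(quasi_difference(xs, rho), quasi_difference(ys, rho))

print(f"working independence slope: {beta_indep:.3f}")
print(f"AR(1) working slope:        {beta_ar1:.3f}")
```

Under these assumptions the cross-sectional slope is `beta`. The independence fit should recover it, whereas the quasi-differenced fit is pulled away from it because the lagged error is correlated with the current covariate value.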

The goal of this study is to illustrate some of the challenges involved in modelling time-dependent endogenous variables, using data from a black bear (*Ursus americanus*) biotelemetry study in Minnesota (MN) as a motivating example. We consider models that attempt to relate *y* = log(daily movement rate) to *x* = log(average daily heart rate) and demonstrate that regression parameter estimates are sensitive to assumptions regarding the residual error structure (i.e. whether errors are independent or autocorrelated). We explain the mechanics behind these results by considering relationships between current and lagged values of *x* and *y* and between parameters in the independence and AR(1) models.

Similar issues often arise in epidemiological, biostatistical and econometrics applications; directed acyclic graphs (DAGs), representing causal pathways, are central to understanding potential problems (and their solutions). We suspect an unmeasured confounder may be partially responsible for the behaviour of the regression parameter estimators in our applied example. We construct a DAG representing this possibility and use it to explore the resulting statistical dependencies among current and lagged values of *x* and *y*. In particular, we show that an unmeasured confounder may be responsible for the observed endogeneity between *x* and *y*. We also discuss other possible explanations for this result, including measurement error and causal feedback loops. We conclude with a discussion of the broader implications of this work as it relates to our ability to learn about the ecological and evolutionary consequences of animal movement.
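To make the confounding mechanism concrete, the following sketch simulates a hypothetical autocorrelated latent variable that drives both *x* and *y*, with no direct effect of either variable on the other. The model, the value of `phi` and the function names are assumptions made for illustration and are not taken from the bear data. The sketch then checks whether the current response still helps predict the next covariate value after adjusting for the current covariate value, using a partial regression computed by residualizing on the current covariate.

```python
import random


def slope_and_residuals(xs, ys):
    """Least-squares slope of ys on xs (intercept included), plus residuals."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    resid = [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]
    return b, resid


def simulate_confounded(n, phi=0.8, seed=2):
    """Hypothetical model: a latent AR(1) confounder c_t drives both series;
    x_t = c_t + noise and y_t = c_t + noise, with no x -> y or y -> x effect."""
    rng = random.Random(seed)
    c = rng.gauss(0.0, 1.0)
    xs, ys = [], []
    for _ in range(n):
        c = phi * c + rng.gauss(0.0, 1.0)  # unmeasured confounder
        xs.append(c + rng.gauss(0.0, 1.0))
        ys.append(c + rng.gauss(0.0, 1.0))
    return xs, ys


xs, ys = simulate_confounded(20000)
x_now, y_now, x_next = xs[:-1], ys[:-1], xs[1:]

# Partial regression: remove the part of y_t and of x_{t+1} explained by x_t,
# then regress the residuals on each other.
_, y_resid = slope_and_residuals(x_now, y_now)
_, x_next_resid = slope_and_residuals(x_now, x_next)
partial_coef, _ = slope_and_residuals(y_resid, x_next_resid)

print(f"coefficient of y_t for x_(t+1), adjusting for x_t: {partial_coef:.3f}")
```

A clearly non-zero coefficient here indicates endogeneity in the sense of Diggle *et al*. (2002), even though the simulated response has no causal effect on the covariate: both series are noisy readings of the same latent process, so the current response carries extra information about where that process is heading.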