New technologies have made it possible to simultaneously, and remotely, collect time series of animal location data along with indicators of individuals' physiological condition. These data, along with animal movement models that incorporate individual physiological and behavioural states, promise to offer new insights into determinants of animal behaviour. Care must be taken, however, when attempting to infer causal relationships from biotelemetry data. The possibility of unmeasured confounders, responsible for driving both physiological measurements and animal movement, must be considered. Further, response values may be predictive of future covariate values. When this occurs, the covariate process is said to be endogenous with respect to the response variable, which has implications for both the choice of statistical estimation targets and the choice of estimators of these quantities.
We explore models that attempt to relate x_t = log(daily movement rate) to y_t = log(average daily heart rate) using data collected from a black bear (Ursus americanus) population in Minnesota. The regression parameter for x_t was 0·19 and statistically different from 0 (P < 0·001) when daily measurements were assumed to be independent, but residuals were highly autocorrelated. Assuming an autoregressive model (ar(1)) for the residuals, however, resulted in a negative slope estimate (-0·001) that was not statistically different from 0.
The sensitivity of regression parameters to the assumed error structure can be explained by exploring relationships between lagged and current values of x and y and between parameters in the independence and ar(1) models. We hypothesize that an unmeasured confounder may be responsible for the behaviour of the regression parameters. In addition, measurement error associated with daily movement rates may also play a role.
Similar issues often arise in epidemiological, biostatistical and econometrics applications; directed acyclical graphs, representing causal pathways, are central to understanding potential problems (and their solutions) associated with modelling time-dependent covariates. In addition, we suggest that incorporating lagged responses and lagged predictors as covariates may prove useful for diagnosing when and explaining why some conclusions are sensitive to model assumptions.
New technologies are making it possible to simultaneously, and remotely, collect time series data on an animal's physiological condition and its physical location (Bevan et al. 1995; Zub et al. 2009; Cagnacci et al. 2010a,b; Signer et al. 2010; Tomkiewicz et al. 2010). It is hoped that these data will provide novel insights into animal behaviour (Cagnacci et al. 2010a; Gaillard et al. 2010).
Care must be taken, however, when attempting to infer causal relationships from biotelemetry data. The possibility of unmeasured confounders, responsible for driving both physiological measurements and animal movement, must be considered. Further, response values may be predictive of future covariate values (even after conditioning on current covariate values). When this occurs, the covariate process is said to be endogenous with respect to the response variable (Diggle et al. 2002), which has implications for both the choice of statistical estimation targets and the choice of estimators of these quantities. For example, unbiased estimation of cross-sectional mean parameters relating x_t and y_t requires a 'working independence assumption' (i.e. for the purposes of estimation, observations are treated as independent, but robust standard errors that allow for correlation may be used for inference) (Pepe & Anderson 1994; Diggle et al. 2002). Popular methods for analysing correlated data, such as likelihood-based methods that allow for autoregressive error structures or generalized estimating equations with non-independent working correlation structures, by contrast, do not result in consistent estimators of cross-sectional mean parameters (see, for example, Section 2·3 of Diggle et al. 2002). In particular, these methods require that E(y_t | x_t) = E(y_t | x_1, …, x_T), which will not be the case when x_t is endogenous (Pepe & Anderson 1994; Diggle et al. 2002).
The goal of this study will be to illustrate some of the challenges involved in modelling time-dependent endogenous variables, using data from a black bear (Ursus americanus) biotelemetry study in Minnesota (MN) as a motivating example. We consider models that attempt to relate x_t = log(daily movement rate) to y_t = log(average daily heart rate) and demonstrate that regression parameter estimates are sensitive to assumptions regarding the residual error structure (i.e. whether errors are independent or autocorrelated). We explain the mechanics behind these results by considering relationships between current and lagged values of x and y and between parameters in the independence and ar(1) models.
Similar issues often arise in epidemiological, biostatistical and econometrics applications; directed acyclical graphs (DAGs), representing causal pathways, are central to understanding potential problems (and their solutions). We suspect an unmeasured confounder may be partially responsible for the behaviour of the regression parameter estimators in our applied example. We construct a DAG representing this possibility and use it to explore the resulting statistical dependencies among current and lagged values of x and y. In particular, we show that an unmeasured confounder may be responsible for the observed endogeneity between x_t and y_{t-1}. We also discuss other possible explanations for this result, including measurement error and causal feedback loops. We conclude with a discussion of the broader implications of this work as it relates to our ability to learn about the ecological and evolutionary consequences of animal movement.
Motivating example: relating movement metrics to daily heart rate data
In 2007, the Minnesota Department of Natural Resources (MNDNR) initiated a black bear GPS telemetry study at the north-western edge of the Minnesota (MN), USA bear range (Garshelis et al. 2011). This study area is unique in that it is largely agricultural and privately owned, with a few larger blocks of forest contained on public lands. Unlike the rest of MN where bear numbers are stable or declining (Fieberg et al. 2010), the number of bears in this area appears to be increasing, and bear range appears to be expanding westward. The primary research objectives of this study are aimed at determining behavioural tactics that have allowed bears to thrive in this extremely fragmented and agriculturally dominated landscape.
During the winter den seasons of 2009 and 2010, bears were fitted with global positioning system (GPS) store-on-board collars. GPS collars were programmed to collect locations at designated intervals ranging from 2–6 h, with higher sampling rates reserved for late summer and fall to coincide with bears' increased use of agricultural fields. In addition, a small number of days (six for the bear in our applied example) were selected for more intensive monitoring at 20-min intervals.
A subsample of bears were surgically implanted with Medtronic Reveal XT insertable cardiac monitors (Model 9529, Medtronic Inc., Minneapolis, MN, USA). These devices were designed to monitor human patients' electrocardiogram and other physiological information and were also set up to record an average daytime heart rate (between the hours of 08:00–20:00) and an average night-time heart rate (between the hours of 00:00–04:00) in beats per minute (bpm). These averages were downloaded from the cardiac monitor through the animal's skin during winter den visits. Further information on the cardiac monitoring device, methods used for anaesthetization and methods for surgical implantation are described in greater detail in Laske et al. (2011). All methods and animal handling were approved by the University of Minnesota's Institutional Animal Care and Use Committees.
For the purposes of this study, we consider data from a single female with cubs that had both units function for all of 2010; data from three other bears exhibited similar seasonal movement and heart rate patterns. We consider data collected from 28 March, 2010 to 4 December 2010, which excludes the time period when the bear was hibernating (Fig. 1). We used a time-weighted average of the daytime and night-time heart rates as our response variable. We approximated movements using straight-line distances between consecutive GPS locations (hereafter, step lengths). Step lengths may overestimate movement for short time intervals (because of error associated with the GPS locations) and underestimate movement for longer time intervals (as animals will not likely move in straight lines) (Jerde & Visscher 2002; Bradshaw et al. 2007; Rowcliffe et al. 2012); we return to these issues in the next section.
When forming daily movement rates, we only considered step lengths calculated from observations taken within 6 h of each other. In cases where consecutive locations spanned 2 days, we allocated movement and time to both days in proportion to the amount of the time interval associated with each day. We formed daily movement rates by summing (approximate) distance travelled for each day and then dividing this value by the total amount of time associated with these daily movements.
We begin by exploring a few different regression models relating daily movement rates to average daily heart rates. These analyses were initially considered as part of a larger investigation to determine how habitat use, rate of movement, road density, crop depredation and habitat fragmentation affect the daily average heart rate of bears. We hypothesized that movement rate would have the most influence on the variation in heart rate. To simplify the presentation, we do not consider additional covariates in this study.
We fit five different models relating the time series of x_t = log(daily movement rate) to y_t = log(average daily heart rate) using Program R (R Development Core Team 2010). We list the models and R code below (y and x denote the response and covariate series, and ylag1 and xlag1 their one-day lags):

ols (linear model): lm(y ~ x)
ar(1) (linear model with ar(1) errors): gls(y ~ x, correlation = corCAR1(form = ~Julian), method = 'ML')
auto (autoregressive model): lm(y ~ ylag1 + x)
dlag (distributed lag model): lm(y ~ x + xlag1)
autodlag (autoregressive distributed lag model): lm(y ~ ylag1 + x + xlag1)
Errors in the ols, auto and autodlag models are assumed to be iid N(0, σ²).
As might be expected, based on the cross-sectional association between x_t and y_t depicted in Fig. 1e, the coefficient for x_t is positive and significantly different from 0 in the ordinary least squares (linear) regression model (Table 1), suggesting a positive association between movement rates and heart rates. Yet, the residuals are highly autocorrelated (out to many lags; Fig. 2a). The residuals also exhibit some patterning when plotted over time and against x_t (Fig. 2b,c). Nonetheless, the time series nature of the data might initially lead a researcher to specify an autoregressive correlation structure for the residual error [e.g. an ar(1) model; Theil et al. 2004]. In doing so, one discovers that the coefficient for x_t becomes negative, although not statistically different from 0; the autocorrelation parameter, ρ, is very close to 1 (0·97). The negative regression parameter estimate for x_t is at first perplexing and seems at odds with the linear (and positive) association depicted in Fig. 1e.
Table 1. Estimated regression parameters (standard errors) from models fit to time series data; y_t = log(average daily heart rate) and x_t = log(movement rate) collected from a black bear in Minnesota
ΔAICs represent differences from the minimum AIC among the five models considered. Also, note the auto, dlag and autodlag models require lagged variables that are missing for the first observation. To facilitate comparisons among models, this first observation was dropped (for all models) when calculating AICs.
To understand this result, note that the ar(1) model is also an autoregressive distributed lag model, but with certain constraints on some of the parameters. To see this, we write the ar(1) model as:

y_t = β0 + β1 x_t + η_t (eqn 1)

η_t = ρ η_{t-1} + ε_t (eqn 2)

with the ε_t assumed to be iid N(0, σ²) and ρ between 0 and 1. We can combine these two equations to give the following:

y_t = β0 + β1 x_t + ρ η_{t-1} + ε_t (eqn 3)

Recognizing that η_{t-1} = y_{t-1} − β0 − β1 x_{t-1}, and substituting this expression into (eqn 3), gives the following:

y_t = β0(1 − ρ) + ρ y_{t-1} + β1 x_t − ρβ1 x_{t-1} + ε_t (eqn 4)

This model is similar in form to the autodlag model, except that the coefficient for x_{t-1} is constrained to equal −ρβ1 (note also that the coefficient for y_{t-1} is ρ); that is, ar(1) is an autoregressive distributed lag model (autoregressive, as y_{t-1} serves as a predictor, and distributed lag, as x_{t-1} serves as a predictor). In the (unconstrained) autoregressive distributed lag model (autodlag), the coefficients for both y_{t-1} and x_{t-1} are positive, and the coefficient for y_{t-1} is significantly different from 0 (Table 1). The coefficient for x_t is also positive, but its value is much closer to 0 than in the ols model (and, in fact, the coefficient is not statistically different from 0).
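The equivalence between the ar(1) formulation (eqns 1 and 2) and its constrained autoregressive distributed lag form (eqn 4) is easy to verify numerically. The sketch below, with arbitrary illustrative parameter values of our own choosing, simulates data from eqns 1 and 2 and confirms that eqn 4 holds exactly:

```python
import random

random.seed(1)
b0, b1, rho = 2.0, 0.19, 0.8  # arbitrary illustrative values
n = 200

x = [random.gauss(0, 1) for _ in range(n)]
eps = [random.gauss(0, 1) for _ in range(n)]

# eqns 1-2: y_t = b0 + b1*x_t + eta_t, with eta_t = rho*eta_{t-1} + eps_t
eta = [eps[0]]
for t in range(1, n):
    eta.append(rho * eta[t - 1] + eps[t])
y = [b0 + b1 * x[t] + eta[t] for t in range(n)]

# eqn 4: y_t = b0*(1-rho) + rho*y_{t-1} + b1*x_t - rho*b1*x_{t-1} + eps_t
max_gap = max(
    abs(y[t] - (b0 * (1 - rho) + rho * y[t - 1]
                + b1 * x[t] - rho * b1 * x[t - 1] + eps[t]))
    for t in range(1, n)
)
print(max_gap)  # numerically zero: the two parameterizations are identical
```

Because the identity is algebraic, it holds for every simulated series, not just on average.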
Now, consider the ar(1) model. The strong positive association between y_t and y_{t-1} implies that ρ should be close to 1. We also expect positive associations between (x_t, x_{t-1}) and y_t. Yet, the coefficient for x_{t-1} is constrained to be equal to −ρβ1 (eqn 4). Thus, these variables compete with each other. It turns out that y_{t-1} 'wins' out in this case; the estimate of ρ is close to 1, and the coefficient for x_t is forced to be close to 0 to avoid the suggestion of a negative association between x_{t-1} and y_t.
The positive correlations between x_t, x_{t-1}, y_{t-1} and y_t have important implications for the estimated regression coefficients in the other fitted models as well (Table 1). For example, the regression coefficient for x_t in the ols model is equal to a weighted sum of the coefficients for x_t and x_{t-1} in the dlag model (Diggle et al. 2002; Schildcrout et al. 2011). In addition to the direct effect of x_t on y_t, the regression coefficient for x_t in the ols model captures some of the effect of x_{t-1} on y_t (because of the correlation between x_t and x_{t-1}). Because this correlation is positive, the regression coefficient for x_t in the ols model is considerably larger than the corresponding regression coefficient in the dlag model.
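This weighted-sum relationship can be illustrated by simulation. In the sketch below (coefficients and the covariate autocorrelation are our own illustrative choices, not estimates from the bear data), the true model is a distributed lag model, yet the simple regression of y_t on x_t recovers β1 plus a share of β2 proportional to the autocorrelation of x:

```python
import random

random.seed(2)
n, phi = 200_000, 0.6   # phi: autocorrelation of the covariate process
b1, b2 = 0.5, 0.4       # dlag coefficients for x_t and x_{t-1}

# stationary AR(1) covariate with unit marginal variance
x = [random.gauss(0, 1)]
sd = (1 - phi ** 2) ** 0.5
for _ in range(n - 1):
    x.append(phi * x[-1] + random.gauss(0, sd))

# true (dlag) model: y_t = b1*x_t + b2*x_{t-1} + error
y = [b1 * x[t] + b2 * x[t - 1] + random.gauss(0, 1) for t in range(1, n)]
xt = x[1:]

# ols slope of y on x_t alone
mx, my = sum(xt) / len(xt), sum(y) / len(y)
slope = (sum((a - mx) * (b - my) for a, b in zip(xt, y))
         / sum((a - mx) ** 2 for a in xt))
print(slope, b1 + b2 * phi)  # ols slope approx. b1 + b2*phi
```

With a positively autocorrelated covariate, the single-covariate slope exceeds the direct effect b1, mirroring the inflation of the ols coefficient relative to the dlag model in Table 1.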
These results potentially explain why the models behave as they do, but they do not necessarily help with choosing a particular model. Many ecologists use information-theoretic criteria (e.g. AIC, BIC) to select a 'best' model for inference, while others might use these criteria to calculate model-averaged coefficients (Burnham & Anderson 2002). Akaike Information Criterion (AIC) strongly favours the auto and autodlag models (Table 1). Importantly, the interpretation of the regression coefficients will change depending on the inclusion (and exclusion) of other covariates. Thus, we argue that causal diagrams should be used to select appropriate models (we return to this point in the Discussion).
Endogeneity: causes, consequences, solutions
Importantly, x_t is correlated with y_{t-1} (Fig. 1d), and this correlation remains even if we adjust for x_{t-1} (partial correlation; P < 0·001). Thus, the time-dependent covariate process, x_t, is endogenous with respect to the set of responses, y_t. Endogeneity may be caused by a causal feedback loop in which the current response influences future predictor variables or by the omission of one or more time-varying confounders (i.e. variables correlated with both the response and predictor of interest). In addition, we suggest that measurement error associated with estimating daily movement rates may be partially responsible for the observed relationships. We will explore these possibilities using causal diagrams (i.e. directed graphs) in the subsequent subsections (after first reviewing key concepts from graph theory).
Directed acyclical graphs, confounders and intermediate variables
Directed acyclical graphs (DAGs), representing causal pathways, are central to understanding observational data (see, for example, Fig. 3). Each graph results in a set of conditional independencies, which can be elucidated by considering a few relatively simple rules. At the most basic level, there are three types of patterns that need to be considered (arrows indicate the direction of causal effects; Pearl 1995, 2000): a chain (A → B → C), a fork (A ← B → C) and an inverted fork (A → B ← C).
Assume for now that A, B and C are the only three variables in the system, and consider each of these patterns in turn. In the case of a chain, A and C will be marginally dependent; however, these variables become independent if we condition on B (B is said to block the path between A and C). In this case, B is said to be an intermediate variable on the causal pathway between A and C. In the second case (a fork), A and C will again be marginally dependent (because of their common cause), but independent if we condition on B. In this case, the marginal correlation is spurious as it does not represent a causal relationship between A and C. In the third case (inverted fork), A and C will be marginally independent. However, conditioning on B will unblock the path between A and C causing these variables to become dependent. The variable, B, in this case is referred to as a collider. Conditioning on a collider illustrates another possible mechanism by which spurious correlations may arise.
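The fork and inverted fork patterns can be checked with a small simulation; the linear Gaussian structural equations below are our own illustrative choices (a chain behaves like the fork with respect to conditioning on B). Marginal and partial correlations stand in for the (in)dependence statements:

```python
import random

random.seed(3)
n = 100_000

def corr(u, v):
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def pcorr(u, v, w):
    # partial correlation of u and v after conditioning on w
    ruv, ruw, rvw = corr(u, v), corr(u, w), corr(v, w)
    return (ruv - ruw * rvw) / ((1 - ruw ** 2) * (1 - rvw ** 2)) ** 0.5

# fork: A <- B -> C (B is a common cause)
B = [random.gauss(0, 1) for _ in range(n)]
A = [b + random.gauss(0, 1) for b in B]
C = [b + random.gauss(0, 1) for b in B]
fork_marginal, fork_partial = corr(A, C), pcorr(A, C, B)

# inverted fork: A -> B <- C (B is a collider)
A = [random.gauss(0, 1) for _ in range(n)]
C = [random.gauss(0, 1) for _ in range(n)]
B = [a + c + random.gauss(0, 1) for a, c in zip(A, C)]
coll_marginal, coll_partial = corr(A, C), pcorr(A, C, B)

# fork: marginally dependent, ~independent given B
# collider: ~marginally independent, dependent given B
print(fork_marginal, fork_partial, coll_marginal, coll_partial)
```

Conditioning on the collider induces a spurious (here negative) association between A and C, while conditioning on the common cause removes one.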
These rules also determine statistical independencies when the graph involves more than three variables; however, one has to consider all paths between variables along with the various sets of variables that may be conditioned on (or adjusted for); details and examples are given in Pearl (2000) and Shipley (2002). Rather than reviewing these rules here, we note that one can test for statistical independencies using the dSep function in the ggm package of Program R (R Development Core Team 2010; Marchetti et al. 2012). We illustrate the use of this function to determine statistical independencies in the causal models we consider in subsequent sections.
One of the main implications of this body of knowledge (i.e. graph theory) is that causal diagrams are needed to determine which covariates need to be adjusted for when analysing observational data (Hernán et al. 2002; Hernán, Clayton & Keiding 2011). For example, consider a simple three-variable system (with variables A, B and C) containing either a chain, fork or inverted fork and assume the goal is to estimate the causal effect of A on C. We would not want to adjust for B when it is an intermediate variable in a chain or if it is a collider in an inverted fork. On the other hand, we would want to condition on B if it is a common cause of A and C. Similar considerations apply to more complex systems. Causal diagrams are necessary to determine whether causal effects can be estimated from available data (i.e. if they are identifiable; Pearl 1995, 2000; Paul 2011) and also to determine appropriate estimation methods (Pearl 2000; Diggle et al. 2002).
Algorithms have also been developed that allow one to work backwards, using the set of observed statistical independencies in the data to suggest possible causal models or rule out others (see Chapter 9 of Shipley 2002). This process works by first constructing an undirected dependency graph, with lines connecting variables that are statistically dependent (these lines, as well as the arrows in DAGs, are referred to as ‘edges’). A small set of algorithmic rules are then used to ‘orient’ as many of these lines as possible (i.e. determine the direction of the causal path between pairs of variables). The end result is typically a ‘partially oriented’ graph that is similar to a DAG, except some of the edges will usually remain as lines rather than as arrows. This partially oriented graph determines a set of (usually multiple) DAGs (i.e. causal models) that are consistent with the data. Although multiple models can lead to the same set of statistical independencies, these techniques can often be used to rule out a large number of potential causal models (Shipley 2002). Yet, like other exploratory data analysis techniques, these tools are best used to suggest interesting hypotheses that need to be tested with independent data. In addition, these methods are not without their critics; some have questioned the validity and usefulness of causal discovery algorithms, particularly when applied to small data sets (Freedman & Humphreys 1999).
Causal feedback loops
In some systems, the relationship between x_t and y_{t-1} may reflect a causal link between the response at time t−1 and the predictor at time t (Fig. 3a). Examples from the medical literature include studies in which patient treatment schedules may be altered based on previous response outcomes (see, for example, Schildcrout et al. 2011). A classic example from the econometrics literature involves the challenges in modelling the relationship between price and demand of consumer goods. Consumer demand changes in response to changes in prices (and vice versa). Although not always recognized, we expect these issues also apply to studies that attempt to relate harvest statistics, hunting or trapping effort, price of animal goods (e.g. pelts) and measures of population abundance (see, for example, Kapfer & Potts 2012). Lastly, models with feedback loops may be needed to describe complex animal behavioural patterns. For example, Zucchini et al. (2008) modelled the feeding behaviour of caterpillars Helicoverpa armigera as a function of a latent internal motivational state representing whether the caterpillars were hungry or sated. The caterpillar's future state was determined by both its previous state and also its current behaviour (whether or not it chose to feed).
When a causal feedback loop exists between x_t and y_t, estimation of causal effects can be challenging because y_{t-1} can serve as both an intermediate variable (on the causal path between x_{t-1} and y_t) and a confounder (for the relationship between x_t and y_t) (Fig. 3a; Diggle et al. 2002; Schildcrout et al. 2011). Estimation of the causal link between x_t and y_t requires that one adjust for y_{t-1}. Conditioning on y_{t-1}, however, will alter the regression parameter describing the relationship between x_{t-1} and y_t as y_{t-1} is an intermediate variable on the causal pathway between x_{t-1} and y_t. Thus, it is not possible to estimate the total (direct and indirect) effects of x_t and x_{t-1} on y_t using traditional regression techniques that model marginal associations (Diggle et al. 2002).
In our applied example, this explanation for the observed endogeneity seems unlikely – that is, it is difficult to imagine that a high heart rate on day t would directly cause a bear to move more on day t+1. It is more likely that an unmeasured (and time-varying) confounder is responsible for the correlation between x_t and y_{t-1} (see next subsection). Yet, causal feedback loops may be relevant to other ecological studies, including studies that relate animal movement and physiological measurements when observations are made more frequently than in our study (e.g. intervals measured in seconds or minutes rather than days; Theil et al. 2004). We refer readers to Pearl (1995, 2000), Diggle et al. (2002) and Robins & Hernán (2009) for more in-depth treatment of these issues, including estimation methods aimed at elucidating causal effects when feedback loops are present.
We expect that unmeasured confounders are likely to be highly relevant to research relating animal movement and physiological measurements, as both of these stochastic processes are generated by the individual sample units under study. That is, we should expect animals to respond in multiple ways to various environmental stressors. For example, various forms of disturbance might increase an animal's stress level and also cause the animal to move a large distance from its current location. In this case, measures of stress will be correlated with movement because of a common cause (i.e. the stressor results in a fork). Similarly, changes in temperature, forage availability and digestibility, and other changes in phenology may impact both a bear's movement pattern and its heart rate.
To explore the effect of unmeasured confounders (u_t), we consider the causal diagram depicted in Fig. 3b (both with and without the unmeasured confounders). We begin by creating objects that encode these graphical models, making use of the DAG function in the ggm library of Program R (Marchetti et al. 2012):

Model without unmeasured confounders:
dag.3b.woc <- DAG(X[t] ~ X[t-1], Y[t-1] ~ X[t-1], Y[t] ~ X[t])

Model with unmeasured confounders:
dag.3b.wc <- DAG(U[t] ~ U[t-1], X[t-1] ~ U[t-1], X[t] ~ X[t-1] + U[t], Y[t-1] ~ X[t-1] + U[t-1], Y[t] ~ U[t] + X[t])
We can then test for statistical independencies using the dSep function (also in the ggm library). For example, to test whether y_t is independent of y_{t-1} after conditioning on x_t and x_{t-1} in the model without confounders, we use the following:

dSep(dag.3b.woc, first = 'Y[t]', second = 'Y[t-1]', cond = c('X[t]', 'X[t-1]'))

The dSep function returns a value of TRUE, indicating y_t is independent of y_{t-1} if we condition on x_t and x_{t-1}. Replacing dag.3b.woc with dag.3b.wc, however, returns a value of FALSE. We can perform similar tests to determine whether x_t is independent of y_{t-1} after conditioning on x_{t-1} (again, the dSep function returns a value of TRUE when there are no unmeasured confounders and a value of FALSE when there are unmeasured confounders). In summary:
if we condition on either (or both) of x_t and x_{t-1}, then y_t is independent of y_{t-1} when there are no unmeasured confounders.
The same is not true when u_t is present (in this case, there is a path from y_{t-1} to y_t that passes through u_{t-1} and u_t).
if we condition on x_{t-1}, then x_t is independent of y_{t-1} when there are no unmeasured confounders. However, x_t is not independent of y_{t-1}, even if we condition on x_{t-1}, when u_t is present.
Thus, one possible explanation for the endogeneity between x and y is that one or more unmeasured confounders are present and partially responsible for the changes in both heart rate and movement rate over time. Residual plots from the ols model also suggest one or more important time-varying covariates may have been omitted from the model (Fig. 2b,c). In particular, the model does not fit the data very well near the end of the time series.
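The same conditional independencies can be seen in simulated data. The sketch below (linear Gaussian equations with illustrative coefficients of our own choosing, mimicking the structure of Fig. 3b) computes the partial correlation between x_t and y_{t-1} given x_{t-1}: it is near zero without the confounder and clearly non-zero with it, i.e. the confounder induces endogeneity:

```python
import random

random.seed(5)
n = 100_000

def pcorr(u, v, w):
    # partial correlation of u and v after conditioning on w
    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        sa = sum((s - ma) ** 2 for s in a) ** 0.5
        sb = sum((s - mb) ** 2 for s in b) ** 0.5
        return sum((s - ma) * (t - mb) for s, t in zip(a, b)) / (sa * sb)
    ruv, ruw, rvw = corr(u, v), corr(u, w), corr(v, w)
    return (ruv - ruw * rvw) / ((1 - ruw ** 2) * (1 - rvw ** 2)) ** 0.5

def simulate(confounded):
    u = x = 0.0
    xs, ys = [], []
    for _ in range(n):
        u = 0.8 * u + random.gauss(0, 1)  # u_t: autocorrelated confounder
        x = 0.5 * x + (u if confounded else 0.0) + random.gauss(0, 1)  # x_t
        y = x + (u if confounded else 0.0) + random.gauss(0, 1)        # y_t
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate(False)
p_without = pcorr(xs[1:], ys[:-1], xs[:-1])  # x_t vs y_{t-1} given x_{t-1}
xs, ys = simulate(True)
p_with = pcorr(xs[1:], ys[:-1], xs[:-1])
print(p_without, p_with)
```

Without u_t, conditioning on x_{t-1} blocks every path between x_t and y_{t-1}; with u_t, the path through u_{t-1} and u_t remains open.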
One oft-suggested approach for addressing the potential for unmeasured temporal confounders is to detrend the data or to use a regression method that assumes (or allows for) a smooth temporal effect. For example, we could use regression or smoothing splines to allow E(y_t) to vary smoothly as a function of Julian date (see, for example, Dominici et al. 2000). These approaches may be too simplistic, however. Recent investigations into similar approaches applied to spatial problems have shown that smooth terms can compete with (spatially varying) fixed effects variables (Reich et al. 2006; Hodges & Reich 2010; Paciorek 2010). In other words, the regression terms used to allow for a smooth effect of Julian date may be collinear with x_t. Rather than adjusting for unmeasured confounders, these approaches may result in additional biases.
As noted earlier, using GPS location data to estimate movement rates introduces measurement error into the covariate process. Similarly, environmental variables measured at meteorological stations (e.g. temperature, wind speed) will only approximate conditions experienced by individual animals (Theil et al. 2004).
To explore the consequences of measurement error, let x*_t be the true log movement rate and x_t be our estimate of the log movement rate on day t. We represent this scenario with the causal diagram in Fig. 3c. We again explore the resulting statistical independencies using the dSep function in the ggm library of Program R, observing that:
y_t is independent of y_{t-1} if we condition on either (or both) of x*_t and x*_{t-1}. However, y_t is not independent of y_{t-1} if we condition on x_t, x_{t-1}, or both x_t and x_{t-1}.
x*_t is independent of y_{t-1} after conditioning on x*_{t-1}. However, x_t is not independent of y_{t-1} if we condition on x_{t-1}.
In essence, conditioning on x*_t or x*_{t-1} blocks the path between y_t and y_{t-1}, but the same is not true if we condition on either (or both) of x_t and x_{t-1}. As a result, the covariate process is endogenous to the response only when it is measured with error.
To further explore the impact of measurement error on regression parameter estimators, we consider a simple simulation example that mimics the causal diagram in Fig. 3c. We assume the following equations govern the relationship between the true log(movement rate), x*_t, the estimated log(movement rate), x_t, and the log(heart rate), y_t, for t = 1,…,250:

x*_t = μ_t + δ_t
x_t = x*_t + ε_t (eqn 5)
y_t = x*_t + γ_t

where μ_t is a smooth, modal function of t. The errors, δ_t, ε_t and γ_t, are assumed to be mutually independent, iid N(0,1) random variables. The structure and parameters were chosen to mimic certain characteristics of the data (Fig. 4). In particular, both x*_t and y_t exhibit a modal response over time, but the measured covariate process, x_t, is more noisy than the response process, y_t. Further, x_t, x_{t-1}, y_t and y_{t-1} are all positively correlated (Fig. 4).
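A minimal version of this kind of simulation can be sketched as follows. The smooth trend, error variances and seed are our own choices, so the attenuated slope will differ somewhat in magnitude from the fitted values reported for our simulation, but the qualitative pattern is the same:

```python
import math
import random

random.seed(4)
n = 250

# x*_t: true log movement rate with a smooth, modal trend over time
x_true = [2 * math.sin(math.pi * t / n) + random.gauss(0, 1)
          for t in range(1, n + 1)]
# x_t: estimated log movement rate = truth + GPS-induced error
x_obs = [xs + random.gauss(0, 1) for xs in x_true]
# y_t: log heart rate, driven by the TRUE movement rate (slope = 1)
y = [xs + random.gauss(0, 1) for xs in x_true]

def ols_slope(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

slope_true = ols_slope(x_true, y)  # near the generating value of 1
slope_obs = ols_slope(x_obs, y)    # attenuated towards 0 by measurement error
print(slope_true, slope_obs)
```

Regressing on the error-prone covariate shrinks the slope by the classic attenuation factor Var(x*)/(Var(x*) + Var(ε)).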
The estimated coefficient for x_t is positive (0·48) and statistically different from 0 in the ols model (P < 0·001), but also far from the data generating value of 1. 'Classic' measurement error of this type (i.e. where x_t = x*_t + ε_t) results in ols regression parameter estimators that are biased (towards 0) (Carroll et al. 2006). As in our applied example, the estimated regression parameters also exhibit extreme sensitivity to the assumed error structure. The regression parameter is negative (-0·008) and not statistically different from 0 (P = 0·66) in the ar(1) model. The autocorrelation parameter in the ar(1) model is estimated to be close to 1.
If we fit the models using the true covariate process, x*_t, parameter estimates are similar in the ols and ar(1) models – in both cases, the estimates are close to the data generating value of 1. Further, the autocorrelation parameter in the ar(1) model is close to 0.
These simulation results are somewhat extreme because the associations between x*_t and y_t and between x*_t and x*_{t-1} are both strong relative to the association between x*_t and x_t. In essence, the path between y_t and y_{t-1} (which passes through x*_t and x*_{t-1}) has a larger 'flux' than the path between y_t and x_t (which passes through only x*_t). Nonetheless, we expect these results are informative with respect to our applied example. Together, the simulation results and the conditional independence tests provide an effective means of exploring the impact of measurement error. In particular, we see that measurement error may be responsible for endogeneity between covariate and response variables as well as the sensitivity of regression parameter estimators to the assumed error structure.
As somewhat of an aside, we note that the true measurement error process is likely to be more complicated than we assumed in our simulation example. In particular, we expect that x_t may be biased high for small values of x*_t and also for movements occurring over short intervals (as x_t is formed by connecting nearby locations that are themselves measured with error). By contrast, x_t may be biased low if it is estimated from infrequent locations (because of the assumption of a linear movement path between successive observations). Because true movement rates (and to a lesser extent, sampling frequencies) vary seasonally, we might expect measurement errors to be serially correlated (i.e. there may be a path connecting ε_{t-1} and ε_t). As above, we could construct a causal diagram that would represent this possibility and explore the consequences using a combination of conditional independence tests and simulation methods (the latter being useful for evaluating the potential magnitude of any biases). Hernán & Cole (2009) provide additional examples illustrating the use of causal diagrams to explore the consequences of various forms of measurement error.
Estimates of regression parameters reflect the strength of association between predictor and response variables conditional on other variables included in the model. In observational studies, these associations can be misleading, depending on what other variables are adjusted (or not adjusted) for (as illustrated by the simple three variable systems [chain, fork, inverted fork] considered earlier). By contrast, experimental research relies on large samples and randomization to ensure balance among treatment groups with respect to potential confounding variables.
Early notions of a confounder, often based on statistical correlations (e.g. a variable correlated with both the predictor of interest and the response, but not caused by the predictor), are in many cases too simplistic (McNamee 2003). Recent research in graph theory and causal inference has clarified these concepts, and it is this theory that allows one to determine which variables must be adjusted for (and which should not be adjusted for) when estimating causal effects (Hernán et al. 2002).
Although we have focused our attention on research aimed at relating movement and physiological measurements, these concepts apply more broadly. Causal diagrams should be used to guide selection of variables in regression models fit to observational data and also to explore the consequences of these choices (see, for example, Hernán et al. 2002; Hernán, Clayton & Keiding 2011). Contrast this suggestion with popular analytical approaches in applied ecology. Researchers may eliminate variables from consideration if covariates are collinear with each other (e.g. based on pairwise correlations or variance inflation factors). Subsequently, researchers tend to consider many models and then select (or average among) models using Akaike Information Criterion or some other measure of model fit. We have also noticed a tendency for researchers to fit models containing a single set of 'like' predictor variables (e.g. groups of variables representing weather, habitat, disturbance, etc.), with comparisons among these models used to test 'competing hypotheses'.
Strategies that rely on AIC (for model selection or for averaging across models) are often reasonable when the goal is prediction (Burnham & Anderson 2002; Giudice et al. 2012). These practices seem odd, however, when applied to observational data if the goal is to gain a better understanding of the underlying system (i.e. the mechanisms responsible for patterns in the data). In particular, regression parameter estimates can be sensitive to the inclusion or exclusion of important confounding variables. Without proper adjustment for confounders, estimated regression parameters reflect associations that can be misleading or difficult to interpret (e.g. they may be in the opposite direction of any causal effect; Box 1966; Blyth 1972; Arah 2008; Hernán, Clayton & Keiding 2011). These interpretation challenges are likely to be exacerbated if one chooses to average regression parameters across models with and without important confounders. Lastly, we suspect researchers often neglect model diagnostic tools (e.g. residual plots as in Fig. 2) when fitting many models (Giudice et al. 2012). We expect researchers may entertain fewer (and better) models, and explore these in more depth, if they use causal diagrams to focus their analyses.
Returning to the original motivating data problem, we suggest the best analysis approach is likely question dependent. If the goal were to characterize cross-sectional associations between the two time series, then the simple ols model combined with robust standard errors would appear to offer a reasonable, pragmatic approach (e.g. Lumley & Heagerty 1999; Zeileis 2004). If the goal were to predict the bear's heart rate on the next day, then one might prefer the auto or autodlag models as these models have the smallest AICs. On the other hand, the strong possibility of unmeasured confounders challenges our ability to understand the causal role movement plays in determining variability in the daily heart rate time series. Models with additional variables should be explored, assuming other covariates are available and strong biological arguments can be made for causal links between these variables and the movement and heart-rate series. Alternatively, one could consider models that include a latent variable representing, for example, an individual's time-varying behavioural 'state' (similar to the model Zucchini et al. 2008 used to explain grasshopper feeding behaviour). Regardless of the question and approach, critical thinking is necessary if we are to acquire meaningful insights into observed data patterns. Ultimately, we find it interesting how such a seemingly simple data set can be so hard to analyse.
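To make the robust-standard-error suggestion concrete, the sketch below hand-rolls a Newey-West (HAC) variance with Bartlett weights for a simple regression slope; this is a minimal illustration of the sandwich idea behind the estimators of Lumley & Heagerty (1999) and Zeileis (2004), applied to simulated data (illustrative parameters, not the bear series) in which both predictor and errors are AR(1).

```python
import random

random.seed(3)
n = 1000

def ar1(n, phi, sd):
    """Stationary AR(1) series with autoregressive coefficient `phi`."""
    out = [random.gauss(0, sd / (1 - phi**2) ** 0.5)]
    for _ in range(n - 1):
        out.append(phi * out[-1] + random.gauss(0, sd))
    return out

x = ar1(n, 0.8, 0.6)            # autocorrelated predictor
e = ar1(n, 0.8, 0.6)            # autocorrelated errors
y = [2.0 + ei for ei in e]      # true slope on x is zero

xbar, ybar = sum(x) / n, sum(y) / n
xc = [xi - xbar for xi in x]
sxx = sum(c * c for c in xc)
b = sum(c * (yi - ybar) for c, yi in zip(xc, y)) / sxx
resid = [yi - ybar - b * c for yi, c in zip(y, xc)]

# Naive slope variance, valid only under independent errors.
s2 = sum(r * r for r in resid) / (n - 2)
var_iid = s2 / sxx

# Newey-West (HAC) variance: Bartlett-weighted autocovariances of the
# score contributions g_t = xc_t * resid_t, truncated at lag L.
L = 20
g = [c * r for c, r in zip(xc, resid)]
meat = sum(gi * gi for gi in g)
for lag in range(1, L + 1):
    w = 1 - lag / (L + 1)
    meat += 2 * w * sum(g[t] * g[t + lag] for t in range(n - lag))
var_hac = meat / sxx**2

print(b, var_hac / var_iid)     # ratio well above 1 under autocorrelation
```

The inflated HAC variance widens the confidence interval around the (correctly near-zero) slope, mirroring how the naive independence model in our analysis overstated the precision of its estimate.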
We thank D. Garshelis, K. Noyce, T. Laske, P. Iaizzo and S. Howard for helping collect the data. We thank D. Heisey, J. Schildcrout and S. Hanheuse for insightful discussions on the topic. We thank R. Langrock, D. Garshelis, J. M. Gaillard and an anonymous reviewer for helpful suggestions on a previous draft.