Autocorrelation‐informed home range estimation: A review and practical guide

Modern tracking devices allow for the collection of high‐volume animal tracking data at improved sampling rates over very‐high‐frequency radiotelemetry. Home range estimation is a key output from these tracking datasets, but the inherent properties of animal movement can lead traditional statistical methods to under‐ or overestimate home range areas. The autocorrelated kernel density estimation (AKDE) family of estimators was designed to be statistically efficient while explicitly dealing with the complexities of modern movement data: autocorrelation, small sample sizes and missing or irregularly sampled data. Although each of these estimators has been described in separate technical papers, here we review how these estimators work and provide a user‐friendly guide on how they may be combined to reduce multiple biases simultaneously. We describe the magnitude of the improvements offered by these estimators and their impact on home range area estimates, using both empirical case studies and simulations, contrasting their computational costs. Finally, we provide guidelines for researchers to choose among alternative estimators and an R script to facilitate the application and interpretation of AKDE home range estimates.

areas an animal could explore given their movement abilities. Early translations into a statistical definition include quantifying an animal's probability of using a given location (i.e. utilization distribution; Jennrich & Turner, 1969; Worton, 1989). The concept of home range has been redefined by many authors over the years (Harris et al., 1990); here, we follow the definition of home range as the area repeatedly used throughout an animal's lifetime for all its normal behaviours and activities, excluding occasional exploratory excursions outside of home range boundaries. The characteristic temporal stability of a home range also highlights additional concepts: range residency, defined as the tendency of an animal to remain within its home range; and time-scale parameters that quantify the weakness of this tendency, including the home range crossing time-scale (τ), defined as the average time required for an animal to cross the linear extent of its home range.
Home range area estimates are used to inform conservation practitioners and wildlife managers about protected area sizes and to advocate for conservation policy changes (Bartoń et al., 2019; Lambertucci et al., 2014; Linnell et al., 1997). It is thus crucial to provide a reliable and statistically robust metric that is comparable across individuals, species and sites. Natural landscapes are becoming increasingly fragmented (Curtis et al., 2018; Hansen et al., 2020), imposing new challenges at local, regional and global scales, and unreliable estimations may hinder area-based conservation. Reliable estimates of home ranges, however, have proven to be deceptively difficult to achieve, and have occupied generations of ecologists (Fieberg & Börger, 2012; Horne et al., 2020; Jennrich & Turner, 1969; Worton, 1989). The inherent properties of animal tracking data create unique analytical challenges. Specifically, animal movement data frequently feature some combination of autocorrelation, small sample sizes, missing observations or irregular sampling, and home range estimators that are not designed to handle these issues can both under- and overestimate the sizes of home ranges.
Although many home range estimators exist (Horne et al., 2020), autocorrelated kernel density estimation (AKDE) was the first to explicitly account for temporal autocorrelation in the data (Fleming et al., 2015). Since its introduction, AKDE has grown into a family of related techniques, each aimed at mitigating a different source of bias that can affect home range estimates, including unmodelled autocorrelation (Hemson et al., 2005; Kie et al., 2010; Swihart & Slade, 1997), oversmoothing (Seaman & Powell, 1996; Worton, 1995), autocorrelation estimation bias (Cressie, 2015) and unrepresentative sampling in time (Frair et al., 2004; Horne, Garton, Sager-Fradkin, et al., 2007; Katajisto & Moilanen, 2006). These biases are mitigated, respectively, by the original AKDE (Fleming et al., 2015), the area-corrected AKDE (Fleming & Calabrese, 2017), the perturbative hybrid residual maximum likelihood (REML) parameter estimation and parametric bootstrapping, and weighted AKDE (Fleming et al., 2018). REML is a form of maximum likelihood estimation that reduces biases in variance/covariance estimation. AKDE and associated corrections have been shown to outperform traditional home range estimators across species, degrees of autocorrelation and sample size. The ctmm workflow also allows researchers to partially account for the location errors associated with their tracking datasets. These methods can be run using the programming language R (www.r-project.org) and the ctmm or amt packages (Calabrese et al., 2016; Signer & Fieberg, 2021), or the ctmmweb graphical user interface (https://ctmm.shinyapps.io/ctmmweb; Calabrese et al., 2021). In addition to offering flexible and open-source tools for home range estimation, these software programs allow easy documentation and implementation of new methods by sharing code and workflows.
Such reproducible methods can increase reliability and transparency in ecology (Alston & Rick, 2021; Culina et al., 2020; Powers & Hampton, 2019; Signer & Fieberg, 2021).
Because movement data often violate multiple assumptions of traditional methods, the individual methodological advances offered by the AKDE family of home range estimators can and often should be combined. The costs and benefits of each estimator have previously been described in separate technical papers, so in this paper, we bring all of these estimation methods together in one document.
We describe their effects on the quality of home range estimates, both in isolation and in combination, while evaluating how sample size interacts with multiple different sources of bias. We use tracking data from African buffalo (Syncerus caffer; Cross et al., 2009), lowland tapir (Tapirus terrestris; Fleming et al., 2019) and jaguar (Panthera onca; Morato et al., 2018) as empirical case studies to guide researchers through the application and value of these analyses. Finally, we use simulations to show the improvements offered by combining these techniques and demonstrate their application in real-world problems. We conclude by giving clear guidance on how ecologists can choose among these alternatives to best achieve their study goals. We hope that this review provides a practical guide to why and how to use AKDE methods to estimate home ranges that will be useful for both researchers and practitioners who are unfamiliar with these methods.

SOURCES OF BIAS AND MITIGATION MEASURES
Many biases, including most that affect home range estimates, are exacerbated by small sample sizes. Conversely, large sample sizes in modern tracking datasets are typically achieved through higher sampling frequencies, which exacerbate autocorrelation.
Autocorrelation is a general statistical property of variables measured across geographic and temporal space (Dale & Fortin, 2002; Legendre, 1993), as observations sampled more closely in space or time tend to be more similar. In these conditions, it is thus important to distinguish between two different measures of sample size: absolute sample size (n) and effective sample size (N). Absolute sample size is simply the total number of observations in a dataset. More relevant for home range estimation, however, is the effective sample size. Specifically, the amount of information available to home range estimators is governed not simply by the total number of observations, but by the number of range crossings that occurred during the observation period (i.e. how many times an animal traversed the linear extent of its home range). The effective sample size can be roughly estimated as T/τ, where T is the temporal duration of the tracking dataset, and τ is the average home range crossing time parameter. Increasing sampling frequency leads to larger absolute sample sizes, but does not increase the effective sample size commensurately. For autocorrelated data, the effective sample size is necessarily smaller than the absolute sample size and, very frequently in practice, orders of magnitude smaller. In contrast, small absolute sample sizes commonly occur in very-high-frequency (VHF) tracking data but are becoming rarer in modern GPS tracking data.
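As a rough illustration of the distinction between n and N, the following Python sketch computes the absolute sample size for a regular sampling schedule alongside the approximation N ≈ T/τ given above. The function names and the example values (one year of tracking, a 7-day crossing time) are hypothetical and are not part of the ctmm package; in practice, N is estimated from a fitted movement model rather than from this ratio.

```python
def absolute_sample_size(duration_days, interval_hours):
    """Number of fixes for a regular schedule, ignoring missed fixes."""
    return int(duration_days * 24 / interval_hours) + 1


def effective_sample_size(duration_days, tau_days):
    """Rough approximation N ~ T / tau, as described in the text."""
    return duration_days / tau_days


T = 365.0   # hypothetical tracking duration (days)
tau = 7.0   # hypothetical home range crossing time (days)

# Daily fixes versus hourly fixes: n changes, the approximate N does not.
for interval_hours in (24.0, 1.0):
    n = absolute_sample_size(T, interval_hours)
    N = effective_sample_size(T, tau)
    print(f"interval = {interval_hours:4.0f} h   n = {n:5d}   N ~ {N:.1f}")
```

Under these assumed values, moving from daily to hourly fixes multiplies n by roughly 24, while the approximate N stays at about 52 range crossings.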
We now describe each source of bias and the mitigation measure available to correct it, highlighting the difference each correction makes with real data from multiple case studies. We present the bias sources in order of their general importance, from the largest bias to the smallest. Note that this ranking refers to the typical magnitude of each type of bias, but the order may be different under some conditions.

BIAS I: UNMODELLED AUTOCORRELATION
Traditional home range estimators such as minimum convex polygons (MCPs) and kernel density estimators (KDEs) assume independently and identically distributed (IID) data. When these techniques came into common use in the 1980s, the sheer difficulty of obtaining VHF location fixes ensured that the time interval between successive observations was typically long enough for most of the autocorrelation among observations to have decayed (Swihart & Slade, 1997; Worton, 1989). The IID assumption at the heart of these techniques was therefore usually satisfied by VHF-quality data (Harris et al., 1990). The situation began to change with the arrival of new technologies, most notably GPS tracking systems (Rempel et al., 1995), which now routinely feature large volumes of data with much more frequent temporal sampling than is feasible for VHF-based animal tracking. As autocorrelation arises from observations sampled closely in time also being located closely in space, increasing sampling frequency inevitably leads to more strongly autocorrelated tracking data (De Solla et al., 1999). Automated, high-sampling-frequency tracking data have undoubtedly revolutionized movement ecology (Kays et al., 2015), but these advances have broken the armistice between the statistical assumptions of traditional home range estimators and the reality of the datasets now used to study animal movement (Boyce et al., 2010).
Specifically, feeding autocorrelated data into a home range estimator based on the IID assumption yields negatively biased estimates. Autocorrelation-induced underestimation of home range areas is particularly pronounced when the effective sample size is small. In the recent comparative study of Noonan et al. (2019), 368 of 369 tracking datasets featured strong autocorrelation, and roughly half were also plagued by small effective sample size. In these conditions, conventional estimators (such as MCPs, KDEs and local convex hull polygons) underestimate home range areas by a factor of ~2 to 13 (on average), depending on the method and bandwidth optimizer, which is what determines how tightly KDEs conform to the data. Accordingly, published estimates featuring these traditional methods may severely underestimate animal space-use requirements, hindering conservation and management decisions.
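The link between sampling interval and autocorrelation strength described above can be sketched numerically. The example below assumes an Ornstein-Uhlenbeck (OU) position process, for which the correlation between positions separated by a lag Δt is exp(−Δt/τ); the function name and the 1-day value of τ are hypothetical choices for illustration, not values taken from the case studies.

```python
import math


def ou_position_correlation(lag_days, tau_days):
    """Correlation between two positions of an OU process separated by lag_days."""
    return math.exp(-abs(lag_days) / tau_days)


tau = 1.0  # hypothetical position autocorrelation time-scale (days)

# Short intervals (GPS-like) leave successive fixes strongly correlated;
# long intervals (VHF-like) let the correlation decay towards zero.
for label, lag in (("hourly", 1.0 / 24.0), ("daily", 1.0), ("weekly", 7.0)):
    rho = ou_position_correlation(lag, tau)
    print(f"{label:>6} fixes: correlation between successive positions = {rho:.4f}")
```

With this assumed τ, hourly fixes are almost perfectly correlated with their predecessors, whereas weekly fixes are close to independent, which mirrors the contrast between modern GPS and historical VHF sampling.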

MITIGATION MEASURE I: AKDE
Fortunately, it is not autocorrelation per se that causes errors in home range estimation, but rather autocorrelation that is statistically 'unmodelled'. Home range estimators that account for autocorrelation can therefore avoid the biases and violated assumptions of traditional methods. Autocorrelated kernel density estimation (AKDE) explicitly requires a movement model that accounts for the autocorrelation in the tracking data (Figure 1) and then estimates the home range while conditioned on the same movement model (Fleming et al., 2015). This model is identified via model selection among a set of candidate movement models. In this framework, IID is both a candidate model and one limit of a continuum of possibilities, rather than an a priori assumption. These models are ranked based on Akaike's information criterion adjusted for small sample sizes (AICc) by default, although the ctmm package also offers AIC, Bayesian information criterion (BIC), leave-one-out cross-validation (LOOCV) and half-sample cross-validation (HSCV).
Ad hoc measures such as data thinning (Harris et al., 1990;Rooney et al., 1998) are not necessary, as AKDE allows model assumptions to conform as closely as possible to empirical reality, instead of coercing the data to fit a model with unrealistic assumptions. Feeding IID data into AKDE will not have any adverse effects, as it will simply result in a conventional KDE estimate. This workflow also allows reliable confidence intervals to be determined for home range area estimates, which historically have not been applied to home range estimates. This measure of confidence is fundamental for any statistical estimate (Pawitan, 2001), increasing the comparability of AKDE and its relevance for biogeographical and conservation applications.
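The model ranking step described above can be illustrated with a minimal Python sketch. It uses the textbook AICc formula, which may differ in detail from the AICc implemented in the ctmm package, and the log-likelihoods, parameter counts and sample size are hypothetical placeholders rather than fitted values.

```python
def aicc(log_likelihood, k, n):
    """Textbook small-sample AIC: AIC + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "IID": (-1250.0, 5),
    "OU": (-1105.0, 6),
    "OUF": (-1103.5, 7),
}
n = 200  # hypothetical sample size

scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

# Report each model's AICc difference from the top-ranked model.
for name in sorted(scores, key=scores.get):
    print(f"{name:>3}: delta AICc = {scores[name] - scores[best]:.2f}")
```

In this made-up example the IID model is ranked far below the two autocorrelated models, which is the situation in which AKDE and a conventional KDE would diverge.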

BIAS II: OVERSMOOTHING
Kernel density estimators are best-in-class tools for estimating unknown probability distributions and are used in this capacity across the sciences (Chen, 2017; Silverman, 1986; Wang et al., 2013). In the context of tracking data, KDEs estimate the probability distribution of locations, which is then used to estimate the area of a home range (Powell, 2000; Worton, 1989). Typically, ecologists are more interested in this area estimate than in the distribution itself.
Even when we account for autocorrelation (AKDE), kernel density estimators based on the Gaussian reference function (GRF) remain biased owing to the natural tendency of the GRF approximation to oversmooth (yielding a more spread-out distribution). This bias is estimator specific, and may be either positive or negative (Kie et al., 2010; Worton, 1995): for GRF-KDEs, such as AKDE and h_ref (Silverman, 1986), this bias is positive and, all else being equal, leads to an overestimated home range (Seaman & Powell, 1996). Importantly, for estimators that do not account for autocorrelation, like h_ref but unlike AKDE, this positive bias can be masked by the often stronger negative bias caused by unmodelled autocorrelation. For KDEs based on least-squares cross-validation, h_LSCV, this bias is typically negative (Blundell et al., 2001; Hemson et al., 2005) and exacerbates the autocorrelation-induced underestimation of home range areas.

MITIGATION MEASURE II: KDEc OR AKDEc
Fleming and Calabrese (2017) derived an improved KDE by calculating the bias in area estimation under a GRF approximation and applying a correction in an area-based coordinate system. By pulling the contours of the location distribution estimate inward towards the data without distorting its shape, this correction removes the tendency of GRF-based methods (including AKDE) to overestimate the area of home ranges, particularly at small effective sample sizes (Figure 2). Formally correcting the density function estimate allows us to calculate a more reliable home range area and confidence intervals. This correction can be applied to both conventional and autocorrelated GRF-KDEs (then termed KDEc and AKDEc respectively), and is the default method within the ctmm package. As this source of bias is estimator specific, the mitigation must also be estimator specific, so this correction cannot be applied to non-GRF KDE approaches such as h_LSCV.

FIGURE 1 An example of autocorrelated data (individual six from the African buffalo dataset, available within the ctmm package), and the same data when it achieves independence (IID) after data thinning (from one fix per hour to one fix per week). We calculated the 95% contour of an autocorrelated kernel density estimation (AKDE) and a Gaussian reference function KDE (GRF-KDE). Displayed errors correspond to % bias of full dataset KDE and subset KDE against full dataset AKDE. N: effective sample size, n: absolute sample size.

FIGURE 2 Autocorrelated kernel density estimation (AKDE) and area-corrected AKDE (AKDEc) calculated for one individual from the lowland tapir tracking dataset with: large effective sample size (N ≈ 1,566), medium effective sample size (N ≈ 261) and small effective sample size (N ≈ 30). Displayed errors correspond to % bias of AKDE against AKDEc of the same individual. Note that for large N values the estimates from AKDE and AKDEc overlap considerably.

BIAS III: AUTOCORRELATION ESTIMATION BIAS
The main advantage of AKDE is that it accounts for the autocorrelated structure of animal movement data; for optimal performance, we need to estimate this autocorrelation correctly. Maximum likelihood (ML) estimation is the standard approach to fitting movement models to animal tracking data (Horne, Garton, Krone, et al., 2007; Michelot et al., 2016) due to its versatility, widespread use and relatively good performance (Pawitan, 2001). However, ML performs best at large sample sizes, while parameters related to variances and covariances tend to be underestimated in small sample size conditions (Cressie, 2015). As variance-associated parameters are closely related to home range size, their underestimation propagates into underestimated home range areas.
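The small-sample underestimation of variance by ML can be seen in the simplest possible setting. The sketch below assumes IID Gaussian data with an unknown mean, where the ML variance estimator divides by n and the REML estimator divides by n − 1; this is only an analogy for the movement-model likelihoods used by AKDE, and the five data values are made up for illustration.

```python
def variance_ml(xs):
    """ML variance estimate for IID Gaussian data (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def variance_reml(xs):
    """REML variance estimate for IID Gaussian data (divides by n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


xs = [2.0, 4.0, 4.0, 5.0, 10.0]  # hypothetical small sample (n = 5)

ml = variance_ml(xs)
reml = variance_reml(xs)
print(f"ML variance   = {ml:.2f}")
print(f"REML variance = {reml:.2f}")
print(f"ML / REML     = {ml / reml:.2f}  (equals (n - 1) / n)")
```

With n = 5 the ML estimate is 20% smaller than the REML estimate, and the gap shrinks as n grows, which is why this bias matters most at small sample sizes.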

MITIGATION MEASURE III: pHREML AND PARAMETRIC BOOTSTRAPPING
Residual ML estimation is often used to improve (co)variance parameter estimation with small sample sizes, but it can perform poorly for the class of movement models on which AKDE depends. To mitigate the small sample size bias in autocorrelation [...] HREML), or both small absolute and small effective sample sizes (perturbative Hybrid REML; pHREML). We focus on pHREML here (Figure 3) as it is the most broadly applicable of these methods and has no serious disadvantages relative to the others, because it combines the bias correction of REML and the stability of ML. It is currently the default parameter estimation method in the ctmm package.
The parametric bootstrap method (Efron, 1982) is another standard solution for the biases caused by ML estimation and can be applied on top of REML-based estimations to further reduce biases.
In extreme cases where effective sample sizes are ~5 or less, parametric bootstrapping may result in substantial improvements. However, the high computational cost incurred by bootstrapped pHREML (Supporting Information File 1), coupled with the usually modest improvements it provides, means that it is best used only as a last resort.
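The general idea of a parametric bootstrap bias correction can be sketched as follows. This example again assumes IID Gaussian data and corrects the ML variance estimate additively; it is not the bootstrapped pHREML procedure implemented in the ctmm package, and the function names, data values, number of replicates and seed are all hypothetical.

```python
import random


def variance_ml(xs):
    """ML variance estimate for IID Gaussian data (divides by n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def bootstrap_corrected_variance(xs, n_boot=2000, seed=1):
    """Subtract a parametric-bootstrap estimate of the bias from the ML estimate."""
    rng = random.Random(seed)
    n = len(xs)
    mean_hat = sum(xs) / n
    var_hat = variance_ml(xs)
    sd_hat = var_hat ** 0.5
    boot_estimates = []
    for _ in range(n_boot):
        # Simulate a dataset from the fitted model, then refit it.
        simulated = [rng.gauss(mean_hat, sd_hat) for _ in range(n)]
        boot_estimates.append(variance_ml(simulated))
    bias = sum(boot_estimates) / n_boot - var_hat
    return var_hat - bias


xs = [2.0, 4.0, 4.0, 5.0, 10.0]  # hypothetical small sample (n = 5)
print(f"ML variance                  = {variance_ml(xs):.2f}")
print(f"Bootstrap-corrected variance = {bootstrap_corrected_variance(xs):.2f}")
```

Each replicate requires simulating and refitting the model, so the cost scales with the number of replicates; for movement models, where a single fit is already expensive, this is the source of the computational burden noted above.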

BIAS IV: UNREPRESENTATIVE SAMPLING IN TIME
From a statistical perspective, evenly spaced temporal sampling of tracking data ensures the widest possible range of analytical options. In practice, however, many real-world issues can lead to animal locations being sampled irregularly in time: duty-cycling tags to avoid wasting battery during periods of inactivity, acceleration-informed sampling, device malfunction, habitat-related signal loss and many other causes (DeCesare et al., 2005; Frair et al., 2004; Horne, Garton, Sager-Fradkin, et al., 2007). When unaccounted for, such cases can yield biased datasets, causing area estimates associated with over-sampled portions of home ranges to be too large and those associated with under-sampled parts of home ranges to be too small (Fieberg, 2007).
There is no guarantee that these contrasting biases cancel each other out, so the overall home range area estimate may be either positively or negatively biased.

MITIGATION MEASURE IV: wAKDE
Weighted AKDE (or wAKDE) corrects for unrepresentative sampling in time (Fleming et al., 2018), though the larger bias addressed is where the area is distributed: it optimally upweights observations that occur during under-sampled times, while optimally downweighting observations occurring during over-sampled times. In IID data, optimal weights are uniform (i.e. there is no temporal sampling bias, as all times are equally important) so there is no advantage to weighting. For autocorrelated data with highly irregular sampling, however, the difference between weighted and unweighted AKDE can be considerable (Figure 4).

FIGURE 3 AKDEc calculated with maximum likelihood (ML) and with perturbative Hybrid REML (pHREML) for an individual within the jaguar dataset, showcasing its effect on large absolute but small effective sample size (reduced to a sampling duration of 3 months: n = 363 locations, N ≈ 3.1), and both small absolute and small effective sample size (3 months thinned to n = 5 locations, N ≈ 4). Displayed errors correspond to % bias of ML-fitted AKDE against pHREML-fitted AKDE.

In practice, very few tracking datasets are perfectly regular, so it is essential to handle data irregularity appropriately. Missing data equate to a loss of information, and these errors can propagate into biases in habitat selection or area-based conservation outputs (Frair et al., 2004). For example, areas with good satellite reception (e.g. open flat landscapes) may appear over-used even when animals did not spend more time in them compared to areas with poorer reception. Shifting sampling schedules (based on behavioural or seasonal patterns) is a common strategy employed in animal tracking projects, due to the trade-off between sampling intensity and battery life (Brown et al., 2012); in these circumstances, weight optimization via wAKDE is critical for comparisons between individuals or populations.
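The intuition behind upweighting under-sampled times can be sketched with a deliberately simple heuristic that weights each fix by the time span it represents. This is not the weight optimization used by wAKDE: it ignores the fitted autocorrelation model, so unlike wAKDE it would not reduce to uniform weights for IID data. The function name, timestamps and coordinates are hypothetical.

```python
def time_gap_weights(times):
    """Weight each fix by half the time span to its neighbours, normalised to sum to 1."""
    n = len(times)
    if n < 2:
        raise ValueError("at least two timestamps are required")
    raw = []
    for i in range(n):
        # End points reuse their single neighbouring gap on both sides.
        left = times[i] - times[i - 1] if i > 0 else times[1] - times[0]
        right = times[i + 1] - times[i] if i < n - 1 else times[-1] - times[-2]
        raw.append(0.5 * (left + right))
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical schedule (hours): an hourly burst followed by daily fixes.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 28.0, 52.0]
xs = [0.0, 0.1, 0.0, 0.1, 0.0, 5.0, 5.2]  # hypothetical x-coordinates (km)

weights = time_gap_weights(times)
unweighted_mean = sum(xs) / len(xs)
weighted_mean = sum(w * x for w, x in zip(weights, xs))
print(f"unweighted mean x = {unweighted_mean:.2f} km")
print(f"weighted mean x   = {weighted_mean:.2f} km")
```

In this made-up schedule the unweighted mean is pulled towards the location of the hourly burst, while the weighted mean gives the sparsely sampled period an influence proportional to its duration.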

COMBINATION OF MITIGATION MEASURES
In practice, different sources of bias frequently occur together in the same datasets. This is a key reason why home ranges are so difficult to estimate accurately. However, the mitigation measures described above can be implemented simultaneously when necessary to combat [...] File 1). With an Intel i7 3.9 GHz processor using a single core, and an hourly tracking dataset collected for a year, this could correspond to an increase from a few seconds to approximately 45 min. However, unlike AKDE, conventional KDE does not run any autocorrelation model selection, or numerical optimization of parameter estimates.

FIGURE 4 A uniformly weighted AKDEc and an optimally weighted AKDEc (wAKDEc), calculated from an individual from the African buffalo dataset with an irregular sampling schedule likely due to a device malfunction (nicknamed 'Pepper'; available within the ctmm package). Displayed errors correspond to % bias of AKDEc core area (50%) against wAKDEc core area (50%).

DISCUSSION
The techniques presented in this paper represent a family of home range estimators starting with conventional GRF-KDE and progressing through a series of estimation methods designed to mitigate bias arising when the core assumption of IID data is not met.
These methods are implemented with efficient computational algorithms that work with both small and large animal tracking datasets.
We have brought these techniques together in a single document to demonstrate when each correction is applicable, the degree to which home range estimates can be improved, and when and how they can be combined to handle the unique quirks of each tracking dataset to yield accurate home range estimates.
The AKDE family of estimators are all implemented in the ctmm R package (Calabrese et al., 2016), so we provide an annotated R script in the supplementary material of this paper to guide users through the applications of these techniques (Supporting Information File 2).
The current default settings are pHREML, for estimating movement model parameters, and (A)KDEc, for estimating home ranges. The decision between KDEc and AKDEc is determined using model selection, and depends on whether the data are independently distributed or autocorrelated respectively. We recommend that users keep pHREML and (A)KDEc as the default settings and especially caution against changing these settings for any effective sample sizes below 20. When working with legacy data where small effective sample sizes [...] (Guo et al., 2019; Péron, 2019; Silverman, 1986; Worton, 1995). The positive bias from boundary spillover is likely less influential than the negative bias due to unmodelled autocorrelation; nevertheless, it is possible to correct for hard boundaries by following the workflow presented in appendix 3 of Noonan et al. (2019). Kernel density methods also fail to adequately resolve non-stationary behaviour and nomadism (Lichti & Swihart, 2011; Nandintsetseg et al., 2019), as nomadic species lack site fidelity to movement pathways or key sites (e.g. breeding or wintering areas). Addressing non-stationarity requires home range estimates that accommodate multiple centres and allow for variation in use patterns (Breed et al., 2017). In addition, a misspecified model due to migratory behaviours will affect the accuracy of AKDE area outputs due to the stationary movement models being leveraged (OU, OUF and IID). However, if an animal is not range resident, then the data are not appropriate for any home range estimation method.

TABLE 1 Mean improvement (%) in area estimation for each AKDE method compared to baseline KDE, over small (N < 32), medium (32 < N < 512) and large effective sample sizes (N > 512).

Moving forward, we hope to address two remaining challenges in home range estimation: location error and resource selection (which includes boundary interactions). Home range estimation is not as sensitive to location error as fine-scale quantities, such as speed estimation [...] including autocorrelation modelling, bandwidth optimization and kernel shape, and will likely take multiple research efforts to fully implement in a general use software solution.
Only by estimating home ranges in a comparable way across sampling schedules, study designs and behavioural idiosyncrasies can wildlife researchers provide wildlife managers and practitioners with accurate information for conservation planning and land-use decision-making. Movement ecology has reached an inflection point where it is no longer possible to ignore autocorrelation: using autocorrelated tracking datasets with estimators that assume IID data will result in underestimated home range areas. Although further technological advances will only increase the amount of autocorrelation present in tracking data, autocorrelation is often still present even in VHF data and should not be overlooked. We have provided guidelines to obtain accurate home range area estimates with the AKDE family of home range estimators which, in their current form, provide the most reliable and flexible solution for home range area estimation. These methods were explicitly designed to work synergistically, eliminating discrepancies between empirical reality and estimator assumptions that drive home range under- or overestimation with conventional techniques. Furthermore, these techniques can be implemented with open-source software and code (Calabrese et al., 2016), and new movement processes can be easily added into the AKDE workflow as they are developed. This flexibility 'future proofs' the AKDE family of analyses by allowing it to be tailored to new datasets, movement behaviours and species as necessary.

ACKNOWLEDGEMENTS
This work was partially funded by the Center of Advanced Systems

CONFLICT OF INTEREST
The authors declare no conflict of interest.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.13786.

DATA AVAILABILITY STATEMENT
All empirical datasets used in the manuscript are currently openly accessible: the African buffalo tracking data are archived in the MoveBank Data Repository (Cross et al., 2016) and partially included in the ctmm package; lowland tapir tracking data are archived in the Dryad Digital Repository; jaguar tracking data are available as a data paper (Morato et al., 2018) and partially included in the ctmm package. R scripts, tutorials and outputs are available on GitHub (https://github.com/ecoisilva/AKDE_minireview) and archived on Zenodo (Silva et al., 2021).