Are kernels the mustard? Data from global positioning system (GPS) collars suggests problems for kernel home-range analyses with least-squares cross-validation

Authors


†Present address and correspondence: Graham Hemson, Flat 10/12 Cohen Street, Fairlight, NSW 2094, Australia. E-mail: lionmonkey@aardvark.net.au

Summary

  1. Kernel-density estimation (KDE) is one of the most widely used home-range estimators in ecology. The recommended implementation uses least-squares cross-validation (LSCV) to calculate the smoothing factor (h), which has a considerable influence on the home-range estimate.
  2. We tested the performance of least-squares cross-validated kernel-density estimation (LSCV KDE) using data from global positioning system (GPS)-collared lions subsampled to simulate the effects of hypothetical radio-tracking strategies.
  3. LSCV produced variable results and a 7% failure rate for fewer than 100 locations (n = 2069) and a 61% failure rate above 100 points (n = 1220). Patterns of failure and variation were not consistent among lions, reflecting different individual space use patterns.
  4. Intensive use of core areas and site fidelity by animals caused LSCV to fail more often than anticipated from studies that used computer-simulated data.
  5. LSCV failures at large sample sizes and variation at small sample sizes limit the applicability of LSCV KDE to fewer situations than the literature suggests, and cast doubt on the method's reliability and comparability as a home-range estimator.

Introduction

Home range is the fundamental measure of space use by animals. It is important in determining habitat preferences (Aebischer, Robertson & Kenward 1993), carrying capacities, aspects of species-extinction susceptibility (Woodroffe & Ginsberg 2000; Brashares 2003) and underpins several ecological theories including allometric scaling correlations (e.g. Mace, Harvey & Clutton-Brock 1982; Carbone & Gittleman 2002).

Advances in computing power have allowed ecologists to use increasingly sophisticated methods to estimate home ranges, including contouring methods for estimating complex probability density distributions (Dixon & Chapman 1980; Worton 1989). Contouring methods have considerable advantages over other popular home-range estimation methods such as the minimum convex polygon. They accommodate multiple centres of activity, do not rely on outlying points to anchor their corners and are less influenced by distant points, thereby excluding unused areas and leading to more accurate depictions of space use (Fig. 1).

Figure 1.

The influence of h on home-range size and shape. (a) The raw data (n = 134 points) from a single lion (UG) sampled once every 2 days with a 95% minimum convex polygon fitted (points excluded by the harmonic mean method); (b–d) 50%, 75% and 95% kernel density isopleths using h = 1000 m, 3000 m and 500 m, respectively. Values of h were chosen rather than calculated from the data.

Kernel density estimation (KDE) is widely viewed as the most reliable contouring method in ecology (Powell 2000; Kernohan et al. 2001). It was adapted to home-range analysis by Worton (1989) from a technique for estimating distributions from small samples (Silverman 1986). KDE creates isopleths of intensity of utilization by calculating the mean influence of data points at grid intersections. Each isopleth contains a fixed percentage (e.g. 95%) of the utilization density suggestive of the amount of time that the animal spends in the contour. A critical component of this calculation is the distance over which a data point influences the grid intersections; this value is the smoothing factor, or h. The larger the value of h, the larger and the less detailed the final home-range estimate (Silverman 1986; Worton 1989) (Fig. 1b–d). Conversely, small values of h reveal more of the internal structure of a home range but undersmooth in the outer density isopleths, leading to smaller estimates and creating discontinuous ‘islands’ of utilization (Fig. 1b). As KDE is sensitive to different values of h, the size and shape of the home-range estimates are dependent upon the methods used to calculate h (Silverman 1986; Wand & Jones 1995). This raises the possibility that variation in h may introduce systematic variation into home-range estimation that may complicate or invalidate some inter- and intrastudy comparisons.
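The dependence of the estimate on h can be illustrated with a minimal sketch. It assumes a fixed bivariate Gaussian kernel evaluated at grid intersections and approximates the isopleth area by counting the densest grid cells until the requested share of the gridded utilization is enclosed. The function name, the padding of the grid by three bandwidths and the example coordinates are illustrative assumptions, not the Ranges6 implementation.

```python
import math


def kde_isopleth_area(points, h, level=0.95, grid_size=40, margin=3.0):
    """Approximate area inside the `level` isopleth of a fixed Gaussian kernel.

    Assumption: densities are evaluated at grid intersections and the area is
    taken as (number of enclosed cells) x (cell area).
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Pad the grid by `margin` bandwidths so the kernel tails are captured.
    x_min, x_max = min(xs) - margin * h, max(xs) + margin * h
    y_min, y_max = min(ys) - margin * h, max(ys) + margin * h
    dx = (x_max - x_min) / (grid_size - 1)
    dy = (y_max - y_min) / (grid_size - 1)
    norm = 1.0 / (2.0 * math.pi * h * h * len(points))

    densities = []
    for i in range(grid_size):
        gx = x_min + i * dx
        for j in range(grid_size):
            gy = y_min + j * dy
            total = sum(
                math.exp(-((gx - px) ** 2 + (gy - py) ** 2) / (2.0 * h * h))
                for px, py in points
            )
            densities.append(norm * total)

    # Accumulate cells from the densest downwards until `level` of the
    # gridded utilization is enclosed.
    target = level * sum(densities)
    cumulative, cells = 0.0, 0
    for d in sorted(densities, reverse=True):
        cumulative += d
        cells += 1
        if cumulative >= target:
            break
    return cells * dx * dy


# Hypothetical fixes (metres): a tight cluster plus one distant location.
fixes = [(0.0, 0.0), (400.0, 100.0), (900.0, 300.0), (5000.0, 5200.0)]
for h in (500.0, 1000.0, 3000.0):
    print(h, round(kde_isopleth_area(fixes, h) / 1e6, 1), "km^2")
```

With these made-up fixes the estimated 95% area grows as h increases, which is the qualitative behaviour described above.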

The two preferred methods of calculating h in home-range analysis are the reference smoothing factor (eqn 1) and least-squares cross-validation (LSCV) (eqn 2).

In eqn 1, the reference smoothing parameter function, n is the number of locations and σ is the standard deviation of the x coordinates, with y coordinates transformed throughout the calculations to have the same standard deviation (Worton 1989).

$$h_{\mathrm{ref}} = \sigma\, n^{-1/6} \qquad \text{(eqn 1)}$$
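As a rough illustration, the reference smoothing factor can be computed in a few lines. This sketch assumes the usual form href = σ n^(−1/6), uses the sample standard deviation, and rescales the y coordinates about their mean so that they share the standard deviation of x; the function name and the choice of sample rather than population standard deviation are assumptions.

```python
import statistics


def reference_bandwidth(xs, ys):
    """Return (h_ref, rescaled_ys), assuming h_ref = sigma_x * n**(-1/6)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired coordinates")
    sigma_x = statistics.stdev(xs)  # assumption: sample standard deviation
    sigma_y = statistics.stdev(ys)
    if sigma_y == 0:
        raise ValueError("y coordinates have zero variance")
    mean_y = statistics.mean(ys)
    # Rescale y about its mean so that it has the same spread as x.
    rescaled_ys = [mean_y + (y - mean_y) * sigma_x / sigma_y for y in ys]
    h_ref = sigma_x * n ** (-1.0 / 6.0)
    return h_ref, rescaled_ys


# Hypothetical coordinates in metres.
h_ref, ys_scaled = reference_bandwidth([0, 1000, 2000, 3000], [0, 10, 20, 30])
print(round(h_ref, 1))
```

Because href depends only on σ and n, it changes smoothly with sample size.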

The least-squares cross validation function is shown in eqn 2:

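$$\mathrm{LSCV}(h) = \frac{1}{\pi h^{2} n} + \frac{1}{4 \pi h^{2} n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ \exp\!\left(-\frac{d_{ij}^{2}}{4h^{2}}\right) - 4 \exp\!\left(-\frac{d_{ij}^{2}}{2h^{2}}\right) \right] \qquad \text{(eqn 2)}$$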

where dij is the distance between the ith and jth points and h is a value of the smoothing parameter examined.

LSCV allows h to be chosen so as to minimize the squared distance between the fitted surface and the target surface, integrated over the area. It creates an estimate of this by a formula (eqn 2) derived from the difference between the predicted value at each data point based on a surface fitted using all the data and on one fitted after excluding the data point. This estimate of the error is then minimized by varying the bandwidth (Silverman 1986).
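A minimal sketch of the cross-validation score follows. It assumes the bivariate Gaussian-kernel form commonly given for home-range work, LSCV(h) = 1/(πh²n) + 1/(4πh²n²) ΣΣ [exp(−d²/4h²) − 4 exp(−d²/2h²)], with the double sum taken over all pairs including i = j. The function name is illustrative.

```python
import math


def lscv_score(points, h):
    """LSCV score for a fixed bivariate Gaussian kernel (assumed form)."""
    n = len(points)
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2  # squared distance d_ij^2
            total += math.exp(-d2 / (4.0 * h * h)) - 4.0 * math.exp(-d2 / (2.0 * h * h))
    return 1.0 / (math.pi * h * h * n) + total / (4.0 * math.pi * h * h * n * n)


# Five identical fixes: the score keeps falling as h shrinks, so the
# function has no minimum at any positive bandwidth.
duplicates = [(0.0, 0.0)] * 5
print([round(lscv_score(duplicates, h), 4) for h in (2.0, 1.0, 0.5)])
```

Under this form, a sample made up of identical locations yields a score that decreases without bound as h approaches zero, so a minimizing bandwidth cannot be found.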

Previous studies have recommended KDE as a reliable home-range estimator and LSCV as the best available method of calculating h, while concluding that href oversmooths (Worton 1995; Seaman & Powell 1996; Seaman et al. 1999; Powell 2000). However, these tests have been based on computer simulation of animal locations, not on field data.

Worton (1995) expanded on analyses by Boulanger & White (1990) using simulated data to test the performance of home-range estimators. Worton concluded that kernels were more reliable and accurate than the harmonic mean method approved by Boulanger & White but cautioned that the choice of smoothing factor had a profound effect on the bias observed in the final estimates.

Seaman et al. (1999) expanded on Worton's analysis. They used simulated data of between 10 and 200 points from more complicated distributions to mimic animal movements more closely and to test the influence of sample size and different methods of choosing h on kernel home-range estimates. The precision of KDE improved to an asymptote of 5–20% bias as sample size increased to 50 data points for simple distributions and 200 with complex distributions. They concluded that h-values chosen using LSCV produced the most reliable estimations of the distributions, giving the lowest frequency of poor estimates when compared with href at sample sizes between 20 and 200 points.

Despite concerns over the superiority of kernels (Robertson et al. 1998), LSCV fixed kernels are viewed as applicable in all but a few specific situations (Blundell, Maeir & Debevec 2001). Perhaps worryingly for advocates of LSCV KDE, the method's performance has been reviewed more critically by statisticians (Sain, Baggerly & Scott 1994; Wand & Jones 1995; Jones, Marron & Sheather 1996). They point out that LSCV may underestimate the value of h appropriate for a distribution and that variation in values of h chosen by LSCV (hlscv) may be considerable compared to methods such as the ‘solve-the-equation plug-in’ which have as yet not been adapted to home-range analysis (Wand & Jones 1995; Jones et al. 1996; Kernohan et al. 2001).

Girard et al. (2002) used GPS data from moose Alces alces to test LSCV KDE. Comparing kernels made from fewer locations to those estimated from using the majority of the data, they concluded that up to 300 locations were required for home-range estimates to become accurate and that accuracy improved up to sample sizes as large as 850. As such they advocate the use of GPS telemetry to acquire adequate sample sizes for optimum accuracy.

We extended these tests to a different species and to individuals with markedly different home-range use patterns in order to explore the relationship of sampling intensity with home-range size and stability in more detail. We used four large data sets (> 3000 points) spanning 9–12 months, collected from lions Panthera leo (Linnaeus) with global positioning system (GPS) collars.

Methods

Ten Televilt Simplex Predator 2D collars were placed on lions in the Makgadikgadi Pans National Park in Botswana, between May 2001 and January 2002. The collars were scheduled to take 15 positions in every 24-h period and made 94·5% of fixes attempted. Data were retrieved via a coded VHF transmission and lions were located by radio tracking (Telonics TR-4 receiver and four-element Yagi antenna; Powerserv, Maun, Botswana). The data were received and stored with a four-element Yagi antenna (Televilt Y4-FL) and a receiver/data logger (Televilt RX-900) and decoded using Televilt SPM software. Seventy per cent of collars failed within 200 days (Hemson 2002) and continuous data were available from only four lions.

UG and SA were territorial males yielding 3968 and 4624 positions, respectively. We recorded 5069 positions from NI, a solitary adult female, who denned during the study period. AR was an adult female who left her natal pride shortly after tagging; 5073 locations were downloaded, showing a range divided into two overlapping seasonal areas.

Data were converted to the Universal Transverse Mercator coordinate system and subsampled to simulate radio-tracking strategies using code written for the SAS system (SAS Institute Inc. 1989). Fixes were drawn (without replacement) twice a day, once a day, or at intervals of 2, 4, 7, 14, 21 and 28 days for each animal. The starting point for each subsample was a random day within the first 30. The ‘tracking strategy’ for each animal contained 100 subsamples.
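The subsampling scheme can be sketched as follows. The original analysis used SAS code that is not reproduced here, so this is only an illustration in Python: the data layout (a dictionary of fixes keyed by day number), the function name and the random choice of fixes within a visited day are all assumptions.

```python
import random


def subsample_fixes(fixes_by_day, interval_days, fixes_per_visit=1,
                    start_window=30, rng=None):
    """Draw fixes every `interval_days`, starting on a random early day.

    Assumption: on each visited day, `fixes_per_visit` fixes are drawn at
    random without replacement from that day's GPS fixes.
    """
    rng = rng or random.Random()
    start_day = rng.randrange(start_window)  # random day within the first 30
    last_day = max(fixes_by_day)
    sample = []
    for day in range(start_day, last_day + 1, interval_days):
        candidates = fixes_by_day.get(day, [])
        k = min(fixes_per_visit, len(candidates))
        sample.extend(rng.sample(candidates, k))
    return sample


# Hypothetical data: 300 days with 15 synthetic fixes per day.
data_rng = random.Random(1)
fixes_by_day = {
    day: [(data_rng.uniform(0, 10000), data_rng.uniform(0, 10000)) for _ in range(15)]
    for day in range(300)
}

# One 'tracking strategy': 100 subsamples at a 7-day interval.
strategy = [subsample_fixes(fixes_by_day, 7, rng=random.Random(seed))
            for seed in range(100)]
print(len(strategy), min(map(len, strategy)), max(map(len, strategy)))
```

A twice-daily strategy would correspond to `interval_days=1` with `fixes_per_visit=2` in this sketch.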

LSCV 95% contours were created for all subsamples on a 40 × 40 grid using Ranges6 (Kenward et al. 2002). The LSCV routine begins at 1·51 × href and works downwards in steps of 0·02 × href to 0·09 × href. It stops if it reaches an inflection, at which a decreasing downward slope becomes an upward slope (indicating a local minimum), or at which the slope increases again in a downward direction (indicating that a local minimum would have been likely with a smaller step size than 0·02). This method was preferred over local and global minimum options as it was most sensitive to changes in the gradient of the function and less likely to fail. If it was unable to find an inflection we used the Ranges6 default substitution of hlscv with href. Therefore, if hused ≠ href then hused = hlscv, and if hused = href then LSCV has failed.
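The downward search and the fallback to href can be sketched as below. This is not the Ranges6 code: it implements only the first stopping rule (a decreasing score that turns upward) and omits the second, gradient-based rule. The function name, the `score` callable and the default search limits are assumptions.

```python
def choose_bandwidth(score, h_ref, start=1.51, stop=0.09, step=0.02):
    """Search downwards from start*h_ref for the first local minimum.

    `score` is any callable returning the LSCV score for a bandwidth h.
    Returns (h_used, succeeded); on failure h_used falls back to h_ref.
    """
    n_steps = int(round((start - stop) / step))
    multiples = [start - k * step for k in range(n_steps + 1)]
    scores = [score(m * h_ref) for m in multiples]
    for k in range(1, len(scores) - 1):
        # Local minimum: the score was falling and now stops falling.
        if scores[k] < scores[k - 1] and scores[k] <= scores[k + 1]:
            return multiples[k] * h_ref, True
    return h_ref, False  # no inflection found: substitute h_ref


# Hypothetical score functions, for illustration only.
print(choose_bandwidth(lambda h: (h - 550.0) ** 2, 1000.0))  # has a minimum
print(choose_bandwidth(lambda h: -1.0 / (h * h), 1000.0))    # keeps falling
```

The second call mimics a score that decreases all the way down the search range, in which case the sketch reports failure and returns href.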

Plots of variation of hused, hlscv and href were examined to compare the trends in variability of these estimates between sampling intervals and animals:

  1. The value of h used (hused) to create the contour using the LSCV algorithm (accepting that LSCV failure would cause substitution of hlscv with href).
  2. Only those values created from an inflection point in the LSCV function (hlscv).
  3. The value of h calculated by the reference method (href).

If the home-range estimator used is perfect then all subsampled home-range estimates within each animal should be identical, having been sampled from the same source distribution. As such we used variation of home-range estimates within a particular ‘tracking strategy’ as an index of home-range performance. Stability of home-range estimates was assessed using overlap analysis in Ranges6 (Kenward et al. 2002). The percentage overlap of each range with each other within a ‘tracking strategy’ was calculated, and a matrix created. If all home-range estimates were identical, all values in the matrix would be 100%. The mean and standard deviation of the values were used as indices of stability for each sampling interval.
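The stability indices can be illustrated with a short sketch. It assumes each home-range estimate is represented as a set of occupied grid cells, whereas Ranges6 works on the range outlines themselves, and it defines the overlap of range A with range B as the percentage of A's cells that also fall in B. The function name and the toy ranges are hypothetical.

```python
import statistics


def overlap_stability(ranges):
    """Mean and SD of pairwise percentage overlaps among range estimates.

    `ranges` is a list of sets of grid-cell identifiers (an assumption made
    for this sketch). The diagonal of the overlap matrix is excluded.
    """
    values = []
    for i, a in enumerate(ranges):
        for j, b in enumerate(ranges):
            if i != j:
                values.append(100.0 * len(a & b) / len(a))
    return statistics.mean(values), statistics.stdev(values)


# Hypothetical estimates from three subsamples of one tracking strategy.
identical = [{1, 2, 3, 4}, {1, 2, 3, 4}, {1, 2, 3, 4}]
variable = [{1, 2, 3, 4}, {1, 2}, {3, 4, 5, 6}]
print(overlap_stability(identical))
print(overlap_stability(variable))
```

Identical estimates give a mean overlap of 100% with zero standard deviation; any disagreement among subsamples lowers the mean and raises the standard deviation.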

To test whether the performance of LSCV or the value of h could be predicted from characteristics of the utilization distribution, the following descriptive statistics were calculated for each subsample using the harmonic mean routine (Spencer & Barrett 1984):

  1. Dispersion: ‘the peak density value (at the range centre location) divided by the standard deviation of the density value across all the locations’.
  2. ‘Value’: the probability density score at the peak of this density.
  3. Skew: the distance between the arithmetic centre and the location with the peak density, divided by the standard deviation of the density across all locations. It measures the tendency for fixes to be distributed asymmetrically about the mean.
  4. Kurtosis: an assessment of the size of the tails of the location distribution compared with those of a normal distribution (Kenward et al. 2001).
  5. Sample size: the number of locations in a subsample.
  6. Spread: the mean of distances of locations from the central location.

These statistics describe features of the distribution without circularity in our analysis. Analogous statistics from KDE were not used (e.g. Kenward et al. 2001) because these are dependent upon the value of h.

We analysed the output statistically, addressing three issues.

  1. We used a GLM to ascertain how influential h is in predicting the size of the range.
  2. Binary logistic regression was used to predict the probability of success of LSCV using range statistics standardized (expressed in SD units) to compare the relative contribution of each over the range of observed values (the large sample sizes generated statistically significant P-values throughout).
  3. We explored which attributes affected the multiple of href equivalent to values of hlscv used when LSCV was successful with a forward stepwise general linear model (Minitab GLM procedure).

Results

The value of h used has a significant influence on the size of the range as estimated by KDE using our sample data sets. The predicted 95% area using h, sample size and the animal as predictors was strongly positively correlated with the value of hused (coefficient = 6·13 m2 per unit smoothing factor, F1,3283 = 4167·70, partial r2 = 35·4%, P < 0·001).

The divergence of the hused line from the hlscv line in Fig. 2a–d represents failure of LSCV to find an appropriate value of h and replacement of hlscv with href (% success also shown). If the failure rate is 0% then mean hused = mean hlscv, and if it is 100% then mean hused = href. Thus for animals NI and SA, LSCV begins failing at sample sizes below 100 and produces no values at sample sizes larger than 150. With UG the LSCV algorithm begins to fail at sample sizes of 300 or more and has a 99% failure rate at 550 points. AR is successful throughout the range tested, with only 13% failure at nearly 700 locations.

Figure 2.

(a–d) The mean and standard deviation in the values of hused, hlscv and href produced with changes in mean sample size per subsample and the percentage of LSCV success within each subsample for AR, NI, SA and UG (lines are smoothed interpolations and not fits).

If the relationship between h and sample size is inspected only for those samples where the algorithm succeeded (i.e. the hlscv lines in Fig. 2a–d), the values and variation of hlscv initially tended to decline with increasing sample size. UG and AR both produced hlscv values up to large sample sizes and AR showed signs of an increase in the mean value of hlscv at samples larger than 200, converging towards the mean value of href at around 700 points. Mean hlscv values for UG appear to reach an asymptote at around 120 and did not converge on href. The decline in variation of hlscv at larger samples is in part an artefact of declining numbers of values of hlscv as LSCV failures become more frequent, as is the increased conformity of hused with href. The results for AR (Fig. 2a), which had the highest success at high sample sizes, suggest that actual variation in hlscv remained fairly constant at sample sizes larger than 180 fixes.

The trend in href with sample size was more stable and predictable than that observed for hlscv, and variability of href declined with increasing sample size (Fig. 1b–d). Variation in href was considerably less than in hlscv and approached an asymptote at around 100 locations (Fig. 3). Both are to be expected, as href is a function of the standard deviation and sample size, and estimates of the standard deviation will become less variable as sample size increases. The mean value of hused reflects the percentage of LSCV success and the proportions of href and hlscv used. All animals showed a tendency for an initial steep decline in the value at lower sample sizes while LSCV was still successful. At higher sample sizes for SA, NI and UG, mean hused increased towards href as the LSCV success rate declined and hlscv was replaced more frequently. In AR, hused tracked hlscv very closely as LSCV failure was minimal.

Figure 3.

Plot of the coefficient of variation in href and hlscv against sample size.

The 95% core range estimates were highly variable at low sample sizes, tracking the variation in hused (Fig. 4). Our index of range-estimate stability (mean percentage home-range overlap) increased, and the standard deviation around this mean decreased, as sample size increased. Mean percentage overlap tended towards an asymptote at between 100 and 150 data points for three of the four animals, but at closer to 500 points for UG. (The asymptote for NI and SA is an artefact of the failure of LSCV at sample sizes greater than c. 100.)

Figure 4.

The relationship between mean and standard deviation percentage overlap, as indices of stability, with increasing sample size for 95% cores calculated using the Ranges6 LSCV routine in which href is substituted for hlscv when LSCV fails.

Given the strong dependence of the estimated area on whether or not the LSCV algorithm succeeded, it was of some interest to investigate the attributes of samples that affected the likelihood of LSCV success.

Dispersion and spread were correlated strongly with ‘value’ and sample size, respectively (Table 1), and were not included in the final model (Table 2). Sample size was the best predictor of LSCV failure (Table 2), although correlations among the variables complicate interpretation. ‘Value’, kurtosis and sample size were correlated negatively with the probability of LSCV success.

Table 1.  Pearson's r for correlations between sample size and various range use statistics calculated using the harmonic mean routine in Ranges6

              Sample size   Value     Spread    Dispersion   Skew
Value            0·778
Spread           0·991      0·827
Dispersion       0·758      0·971     0·805
Skew            −0·439     −0·527    −0·460     −0·584
Kurtosis         0·822      0·366     0·771      0·398      −0·301

Table 2.  Outcome of the binary logistic regression modelling LSCV success. Odds ratio (reciprocals in brackets for values lower than 1) signifies size of effect based on standardized values of predictors (all P < 0·001); 97·6% of LSCV successes and 85·5% of LSCV failures were predicted correctly by the model (> 0·5 = success and < 0·5 = failure)

Predictor      Coefficient   Odds ratio      Lower 95% CI   Upper 95% CI
Value           −0·9985      0·37 (2·7)      0·17           0·79
Skew             1·0752      2·93            1·74           4·92
Kurtosis        −1·2995      0·27 (3·7)      0·15           0·51
Sample size     −4·1972      0·02 (50·0)     0·0            0·10
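The odds ratios in Table 2 follow from the logistic coefficients by exponentiation, which the short check below reproduces. The coefficients are copied from the table; the dictionary and function names are illustrative. Reciprocals here are computed from the unrounded ratio, so the value for sample size (about 66) differs from the tabulated 50·0, which matches the reciprocal of the rounded 0·02.

```python
import math

# Standardized logistic regression coefficients from Table 2.
coefficients = {
    "Value": -0.9985,
    "Skew": 1.0752,
    "Kurtosis": -1.2995,
    "Sample size": -4.1972,
}


def odds_ratio(coefficient):
    """Odds ratio for a one-SD increase in a standardized predictor."""
    return math.exp(coefficient)


for name, beta in coefficients.items():
    ratio = odds_ratio(beta)
    # For ratios below 1, the reciprocal expresses the size of the effect.
    note = f" (reciprocal {1.0 / ratio:.1f})" if ratio < 1.0 else ""
    print(f"{name}: {ratio:.2f}{note}")
```

Read this way, a one-SD increase in sample size multiplies the odds of LSCV success by roughly 0·02, by far the largest effect among the four predictors.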

We investigated whether sample size could be used to predict the mean value of hlscv expressed as a multiple of href, among the four lions. However, there was no consistent pattern in the multiple of href calculated as hlscv vs. sample size (Fig. 5). There is some negative correlation between sample size and the multiple of href used at small sample sizes, although it does not appear consistent between animals and variation is considerable.

Figure 5.

The relationship between multiple of href used as hlscv when LSCV was successful and sample size.

A GLM modelling the influence of the measures of the shape of the utilization distribution on the multiple of href equivalent to hlscv when LSCV succeeded left considerable variation unexplained (Table 3). ‘Value’ was significantly negatively correlated with the multiple of href used and had the most explanatory power. Sample size, skew (both positively associated with use of LSCV) and kurtosis (negatively associated) were all significant but of relatively minor effect size. Comparison of the adjusted and sequential sums of squares suggests that collinearity among predictors does not affect conclusions to an important extent, with the exception of sample size, which was a more useful predictor in a model adjusting for the other predictors. Quadratic functions resulted in only minor improvements in the predictive power of the model and were not retained in the final model. A similar observation applies to second-order interaction terms between main effects.

Table 3.  General linear model, using a forward stepwise approach, predicting the multiple of href used by LSCV against standardized measurements of value, skew, kurtosis and sample size based on harmonic mean calculations (all P < 0·001)

              Sequential sums of squares   % Variation explained   Adjusted sums of squares   Coefficient
Lion ID          85·51                     27·78%                   91·43
Value            39·1                      12·7%                    33·74                     −0·38
Sample size      12·32                      4·00%                   18·60                      0·53
Kurtosis         13·39                      4·34%                    9·91                     −0·31
Skew              3·60                      1·17%                    3·60                      0·06
Error           153·99                     50·02%                  153·99
Total           307·86

Discussion

The value of h used is an important determinant of KDE home-range estimates (e.g. Fig. 1b–d). For three of the four lions, LSCV was consistently successful at deriving values of h only at sample sizes of fewer than 100 locations (Fig. 2a–d). However, despite this ‘success’, the variation in values of hlscv was also highest at sample sizes of fewer than 100, suggesting that systematic variation of LSCV at lower sample sizes is considerable (Fig. 3). Indeed, stability of LSCV KDE estimates, as indicated by mutual overlap, was poor for samples of fewer than 200 points for all lions (Fig. 4) and up to samples of 500 locations for UG. The subsequent improvement in stability beyond these sample sizes is an artefact of the increasing substitution of hlscv with the less variable href caused by increased failure rates of LSCV at larger sample sizes. While this substitution is one method of coping with LSCV failure, the relationships between hlscv and href are not consistent with measures of the utilization distribution that might make a basis for an alternative to this method (Table 3 and Fig. 5). Only 39% of LSCV attempts on samples above 100 locations were successful (Fig. 2a–d), most of them from AR.

Failure of LSCV is caused either by a large number of identical points (Silverman 1986) or by points that are very close together relative to href. In the latter situation LSCV would select a value below the range searched. The probability of these occurring increases considerably with increasing sample size and peak density (Value) (Table 2), both of which result from an animal repeatedly visiting restricted areas of its range. Rising failure rate with increasingly leptokurtic distributions reflects a decrease in the standard deviation of the estimated distribution indicative of a narrower and therefore denser peak density value. Intense peaks of density such as that observed with NI (caused by a denning period) also lead to low values of h (Table 3) and undersmoothing. LSCV is most likely to work with platykurtic or homogeneous spatial distributions such as that exhibited by AR, with no areas of repeated high-intensity use and low relative peaks of density. However, it is unclear how prevalent uniform range use is in animals and as such an estimator relying on this property may have limited applicability.

That previous studies using computer modelled distributions have returned results at odds with our own findings warrants explanation. There have been no published attempts to model repeated use of focal sites (such as dens, leks, resource patches, roosts, territorial boundaries, etc.) by animals. One study states that ‘there are few identical points or very tight clusters within our simulated points (in this case multimodal combinations of normal distributions), so LSCV rarely failed’ (Gitzen & Millspaugh 2003) and it is inferred that these data were similar to those used in previous studies (Seaman & Powell 1996; Seaman et al. 1999). This suggests that computer simulated data used were unlike the data from three of four lions in this study, which had very close or identical points and clusters of locations around favoured sites. While the properties of simulated data drawn from known distributions may be well understood, these properties may not mimic the irregularities exhibited by real data sets closely enough. If simulated data sets are to be used to test home-range estimators they should be representative of animal-range use.

In the other study of LSCV based on GPS data (Girard et al. 2002), there was an apparent improvement in precision of fit with the total data set up to very large sample sizes. This was also the case in our study for hused, but the effect was mainly an artefact of increasing substitution of hlscv with the less variable href caused by failure of LSCV at larger sample sizes. As the Girard et al. (2002) estimates were created with RangesV which uses href as a substitute value (Kenward & Hodder 1996), a similar explanation may apply in their case, as the reference ranges used to estimate accuracy of estimates made with different sample sizes were those calculated using three locations per day (mean 1559 locations).

Increasing sample size may be considered desirable for increasing the precision of a range estimate, but it also increases the probability of recording returns to favoured areas of the range, thus reducing the likelihood of LSCV success and of obtaining a representative home range. Thus we caution against the use of LSCV KDE on large samples such as those generated by GPS collars. Despite suggestions that KDE does not require serial independence (De Solla, Bonduriansky & Brooks 1999), it seems likely from our analysis that problems should be anticipated for data with very short sampling intervals (i.e. highly auto-correlated), although in our examples there was more than one location per day only when sample size exceeded 400.

In previous studies, kernels created using href and sample sizes over 100 points had higher area bias (20–60%) than LSCV (0–20%), and consistently overestimated the area of the home range or distribution (Sain et al. 1994; Wand & Jones 1995; Worton 1995; Seaman & Powell 1996; Seaman et al. 1999; Kernohan et al. 2001; Gitzen & Millspaugh 2003). As such we investigated whether it might be possible to use a smaller multiple of href as an appropriate h. However, we could see no evidence of stable relationships between multiples of href and sample size, or other range-use parameters. We conclude that spatial features specific to individual lions and related to environmental variables (e.g. cubs, prey distribution, hard and soft edges to the ranges and socio-ecological range use requirements) account for the variation unexplained by the GLM and the considerable variation between individuals. We therefore consider that substitution of hlscv with a multiple of href is an unsatisfactory method of coping with LSCV failure, despite a better method remaining elusive.

One recent study shows that differences exist between different methods of LSCV implementations (Gitzen & Millspaugh 2003). Different interpretations of these methods may create even greater variation depending on the scale over which the algorithm is allowed to search, use of global or local minima and absolute or gradient inflections, the direction of search and the programming used. An experimental implementation that attempted LSCV with larger increments of href (0·1) and started with 1 × href yielded poorer results, with only 22% of LSCV attempts successful with samples larger than 100 but with similar patterns of variation in h-values. The Ranges6 implementation we used (it can also search for global and local minima) should fail less often than others by searching over a larger range of values (1·51–0·09 href), in smaller increments (0·02 href) and stopping at a gradient inflection rather than a global minimum. Our results may be influenced by our choice of software implementation but are consistent with the mathematical theory and we are confident that higher failure rates would be expected from other LSCV implementations, particularly those that search for global minima. Publications that use LSCV KDE should indicate the methods used for LSCV, failure rate and the treatment of failures.

Less variable alternatives to LSCV, such as ‘solve-the-equation plug-in’ methods (Wand & Jones 1995; Jones et al. 1996), need to be explored for bivariate data, as do alternatives to KDE such as those using local polynomials (Loader 1996; Loader 1999). While the technology of home-range estimation will develop and new methods will be devised, a concerted effort must be made to reach some consensus on which methods perform best in which situations and for which ends (e.g. Powell 2000; Kenward et al. 2001; Kernohan et al. 2001).
