Keywords:

  • clustering indices;
  • distance to regularity;
  • significance testing

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Approaches
  5. Results and discussion
  6. Conclusions
  7. Acknowledgements
  8. References

1. The spatial analysis by distance indices (SADIE) methodology for data analysis is valuable for quantifying spatial patterns of organisms in terms of patches and gaps. Previous research showed that the calculation of the local clustering indices, key SADIE statistics, does not adequately adjust for the absolute location or the magnitude of the counts.

2. We present a new definition of a local clustering index, which overcomes the problem associated with the original cluster indices related to sampling position and count size. The new index is calculated without breaking the link between the observed count and its original position and quantifies the contribution of an observed count at this particular position to the local gaps or patches for the observed pattern relative to the expected under the assumption of spatial independence amongst observed counts. Randomisation-based testing for statistical significance of an individual local clustering index follows naturally from the definition of the new index.

3. New indices, calculated for several simulated and observed data sets, showed that the original indices overestimated the number of points (sites, locations) contributing to the gaps/patches in a spatial grid. Results indicate that the significance (or interpretation) of an individual local clustering index cannot be judged from its magnitude alone and needs to be supported by statistical testing.

4. The newly developed index will provide a valuable tool for quantifying the local pattern and testing for its significance and enhance the value of SADIE methodology in analysing spatial patterns. It can also be used in conjunction with other approaches that test for global clustering.


Introduction

Spatial analysis by distance indices (SADIE) methodology was developed to quantify spatial patterns of organisms for either spatially referenced sampling units or individuals (Perry 1995, 1998). The basis of SADIE for spatially referenced sampling units is to quantify pattern by the total distance that individuals must be moved between sampling units so that the data are as regular as possible. The degree of nonrandomness is quantified by comparing the distance to regularity for the observed data set with distance to regularity for rearrangements of the observed data. One of the SADIE statistics, Ia, is defined as the ratio between the distance moved to achieve the regular pattern for the observed data and the arithmetic mean distance to regularity for randomised samples.
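As a minimal sketch of the ratio just described, the snippet below computes Ia from an observed distance to regularity and a set of randomised distances. The function names and the numbers are hypothetical, and the global P-value shown is the conventional Monte Carlo proportion (observed value included), which is an assumption here rather than something spelled out in the text above.

```python
from statistics import mean

def index_of_aggregation(d_observed, d_randomised):
    """Ia: observed distance to regularity divided by the mean randomised distance."""
    return d_observed / mean(d_randomised)

def global_p_value(d_observed, d_randomised):
    """Assumed convention: share of the m + 1 distances that are >= the observed one."""
    m = len(d_randomised)
    n_as_large = sum(1 for d in d_randomised if d >= d_observed)
    return (n_as_large + 1) / (m + 1)

# Hypothetical distances, for illustration only (not taken from the paper).
d_obs = 120.0
d_rand = [80.0, 100.0, 90.0, 130.0]
ia = index_of_aggregation(d_obs, d_rand)
p_global = global_p_value(d_obs, d_rand)
```

A value of Ia above 1 points towards aggregation and a value near 1 towards a random arrangement; in practice many more than four randomisations would be used.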

The SADIE methodology has evolved into an analysis of spatial patterns using local clustering and association indices (Perry et al. 1999). The contribution of each sampling unit count to the observed pattern is quantified as a scaled and dimensionless clustering index using the observed and randomised data. The clustering indices vi (a positive value for sampling units with counts greater than the mean) and vj (a negative value for sampling units with counts less than the mean) measure the respective degree to which a sampling unit contributes to patches and gaps. Mean vi ($\bar{v}_i$) and vj ($\bar{v}_j$) values for a data set are used as measures of the degree of nonrandomness in addition to Ia; the Ia statistic is strongly correlated linearly with $\bar{v}_i$ and $\bar{v}_j$ (Xu & Madden 2004, 2005). The SADIE method has been further extended to assess the spatial association of two species (Winder et al. 2001; Perry & Dixon 2002) on the basis of individual sampling unit clustering indices for each species.

Spatial analysis by distance indices methodology has been used in a wide range of research disciplines, such as plant disease epidemiology (Turechek & Madden 1999; Oerke et al. 2010), entomology (Tillman et al. 2009; McGraw & Koppenhofer 2010), weed science (Oveisi, Yousefi & Gonzalez-Andujar 2010), biocontrol of agricultural pests (Sciarretta et al. 2010), soil ecology (Yankelevich et al. 2006; Spiridonov, Moens & Wilson 2007), biological invasion (Maltez-Mouro, Maestre & Freitas 2010), forest management (Barbeito et al. 2009), plant ecology (Rodriguez et al. 2009; Wehncke, Medellin & Ezcurra 2009; Gutierrez-Giron & Gavilan 2010) and plant–soil interactions (Maestre et al. 2003). The interpretation of SADIE statistics, however, requires careful consideration. The magnitudes of Ia and $\bar{v}_i$ (or $\bar{v}_j$) are influenced by the absolute sampling position of the counts and not just by the relative positions amongst sampling units, as in autocorrelation and geostatistics approaches (Xu & Madden 2003, 2004). A density map-based method was recently developed to analyse spatial count data and was shown to be less sensitive to edge (sampling position) effects than the SADIE methodology (Lavigne et al. 2010). Moreover, the current method for calculating vi (or vj) does not completely adjust for the effects of the counts and their sampling positions (Xu & Madden 2005), and thus, an individual vi (vj) may not correctly quantify the contribution of an individual count at a particular position to the overall pattern. Finally, there is currently no formal statistical test for the significance of each individual vi (vj).

As the clustering indices are the cornerstone of the SADIE methodology (Perry et al. 1999), any inaccuracy in estimating these indices might result in erroneous conclusions. In this article, we present an alternative definition of clustering indices and compare the derived statistics with those of the current SADIE methodology. More importantly, we propose a direct method for testing the statistical significance of each clustering index.

Approaches

Description of the SADIE clustering indices

The number or count of individuals in a sampling unit is given by x. Sampling units with counts larger than the mean ($\bar{x}$) are donors, designated with an i subscript; that is, individuals are moved out of these units in determining the total moves to regularity. Similarly, sampling units with counts smaller than $\bar{x}$ are receivers, designated with a j subscript. For donor unit i, the total outflow of individuals is given by $x_i - \bar{x}$; for receiver unit j, total inflow is given by $\bar{x} - x_j$. There are a total of ni donor and nj receiver sampling units. The specific moves to regularity are determined by a transportation algorithm (Perry 1995, 1998).

For donor unit i, at position (ai, bi), the outflow to the jth receiver unit at position (aj, bj) is denoted as Vij; the distance (dij) of this flow is $d_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2}$. The average distance (Yi) of outflow from unit i to all receivers is

  $$Y_i = \frac{\sum_j d_{ij} V_{ij}}{\sum_j V_{ij}} \qquad \text{(eqn 1)}$$
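The average outflow distance for a single donor can be sketched as follows. This is an illustration only: it assumes Euclidean distances between unit positions and a flow-weighted mean, and it takes the flows as given (in SADIE they come from the transportation algorithm). The function name, positions and flows are hypothetical.

```python
from math import hypot

def outflow_summary(donor_xy, receiver_xy, flows):
    """Return (Y_i, total distance moved out of donor i).

    donor_xy    : (a_i, b_i) position of the donor unit
    receiver_xy : list of (a_j, b_j) receiver positions
    flows       : outflow from the donor to each receiver, in the same order
    """
    # Assumed Euclidean distance between the donor and each receiver.
    dists = [hypot(donor_xy[0] - a, donor_xy[1] - b) for a, b in receiver_xy]
    moved = sum(d * f for d, f in zip(dists, flows))
    # Flow-weighted average distance of outflow.
    return moved / sum(flows), moved

# Hypothetical donor at the origin sending 2 individuals 5 units away and 1 individual 2 units away.
y_i, moved_i = outflow_summary((0, 0), [(3, 4), (0, 2)], [2.0, 1.0])
```

Summing the second returned value over all donors gives the total distance to regularity for the data set.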

The total distance to regularity for the entire data set is $D = \sum_i \sum_j d_{ij} V_{ij}$. A standardised and dimensionless clustering index (vi) for a donor unit is defined as

  $$v_i = \frac{Y_i \; {}_{o}Y}{{}_{i}Y \; {}_{c}Y} \qquad \text{(eqn 2)}$$

where cY and iY are average values estimated from the randomisations as described below. The original counts are kept track of during each randomisation, and their average distance of outflow is calculated for each randomisation (Yi,k; k = 1, …, m). The average outflow distance (cY) for the observed count (xi) across all the randomisations is defined as ${}_{c}Y = \sum_{k=1}^{m} Y_{i,k}/m$. Similarly, the individual location for the i-th unit (ai, bi) is kept track of during each randomisation, and the average flow distance for this location is determined in each randomisation (k,iY; k = 1, …, m). The average flow distance (iY) for the i-th unit across all the randomisations is then defined as ${}_{i}Y = \sum_{k=1}^{m} {}_{k,i}Y/m$. The statistic oY denotes the average absolute value of cY for a data set over all sampling units, which is equal to the average value of iY over all the units. For inflows, vj is defined similarly, again with the convention that it is negative in sign. There were no objective criteria for assessing statistical significance of individual vi (vj). Instead, values of vi > 1·5 or vj < −1·5 were proposed to indicate membership of a patch and a gap, respectively (Perry et al. 1999).

In estimating the clustering index (vi), the specific spatial patterns resulting from randomisation of a given set of count data range from regular to highly aggregated, but are expected to be random on average. The principle of the SADIE methodology is that a value of an index (vi, |vj|; $\bar{v}_i$, $|\bar{v}_j|$; Ia) close to unity indicates a random pattern, because the index is a ratio between an observed distance value and an average value across all randomisations. Thus, the average vi and vj for a given sampling point over a reasonably large number of randomisations in which this particular point remains at its original position, whilst all others are randomly allocated to other positions, should be close to 1·0. However, this was shown not to be the case, especially for large or small counts located at a corner or edge (Xu & Madden 2005), where the average vi (vj) value over many randomisations deviated considerably from the expected value of 1·0, suggesting that the influence of spatial location and count size on vi and vj has not been completely accounted for. In addition, because of the nature of randomisation, cY estimated for sampling units with the same count may differ considerably.

Description of the new clustering index

The current SADIE methodology separates the count from its position when estimating vi (vj). However, it could be argued that a local clustering index at a particular point should describe the count size at this particular position relative to its neighbours. Thus, we argue that the count and its physical sampling position should be considered as a single entity when assessing the local clustering, which is the basis for the new algorithm estimating clustering indices.

For every sampling point (xi or xj), irrespective of whether it is a donor or receiver, we conduct m randomisations; in each randomisation, the count remains at its original (observed) position, whilst all other (n − 1) counts are randomly assigned to the other (n − 1) sampling points. We calculate the average of the total distance to regularity across the m randomisations for the observed count xi (or xj) at its original sampling point, also including the observed data set. That is, we define

  $$\bar{D}_i = \frac{D_i + \sum_{k=1}^{m} D_{i,k}}{m + 1} \qquad \text{(eqn 3)}$$

where Di is the distance to regularity for the observed data and Di,k is the distance to regularity for the k-th randomisation for point i. Then, the new clustering index for the observed count at this particular point for the observed data is defined as

  $$c_i = \frac{D_i}{\bar{D}_i} \qquad \text{(eqn 4)}$$

cj is defined similarly for receiver points. Similarly, we calculate the clustering index for the k-th randomisation of this observed count at its original position [i.e. point (site, location) i] as

  $$c_{i,k} = \frac{D_{i,k}}{\bar{D}_i} \qquad \text{(eqn 5)}$$

As for vi (vj) of eqn 2, the new indices (ci and cj) are greater or less than zero for the donor and receiver units, respectively. But unlike vi (vj), the average ci or cj value for a given sampling point over all randomisations (including the observed) is by definition equal to 1·0, as $(c_i + \sum_{k=1}^{m} c_{i,k})/(m + 1) = 1$, which is the principle of randomisation testing for statistical significance. The new index describes the contribution of this count at this particular point to the local gaps or patches for a given observed pattern relative to the expected under the assumption of spatial independence amongst observed counts.
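The randomisation scheme and the scaling of the new index can be sketched as below. This is a minimal illustration, not the authors' Visual C++ program: the function names are hypothetical, the counts and distances are made up, and the distance to regularity for each arrangement is assumed to be supplied by a separate transportation-algorithm routine that is not shown.

```python
import random

def fixed_position_permutations(counts, i, m, rng):
    """Yield m rearrangements of counts in which counts[i] stays at index i."""
    others = [c for k, c in enumerate(counts) if k != i]
    for _ in range(m):
        shuffled = others[:]
        rng.shuffle(shuffled)
        # Re-insert the fixed count at its original position.
        yield shuffled[:i] + [counts[i]] + shuffled[i:]

def new_clustering_indices(d_obs, d_rand, donor=True):
    """Scale each distance by the mean over the m + 1 arrangements (observed included).

    Returns (c_observed, [c_k for each randomisation]); receivers get a negative sign,
    so for receivers it is the absolute values that average to 1.
    """
    d_bar = (d_obs + sum(d_rand)) / (len(d_rand) + 1)
    sign = 1.0 if donor else -1.0
    return sign * d_obs / d_bar, [sign * d / d_bar for d in d_rand]

rng = random.Random(1)                      # arbitrary seed
counts = [5, 0, 3, 8, 2, 1]                 # hypothetical counts, flattened grid order
perms = list(fixed_position_permutations(counts, 2, 5, rng))

# Hypothetical distances to regularity for the observed pattern and m = 4 randomisations.
c_obs, c_rand = new_clustering_indices(150.0, [100.0, 90.0, 110.0, 50.0])
```

With these made-up distances the observed index is 1·5, and the five indices average exactly 1·0, which is the property the definition is built around.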

Calculation of Ia and testing the significance of Ia, that is the global significance of aggregation, follows the original SADIE methodology based on the total m × n randomisations and hence is not described here. We only present a method for direct testing of significance of each individual clustering index ci (cj) at a given sampling point. For each point, we have m + 1 clustering indices: the observed (ci, cj) and m randomisations (ci,k, cj,k). A significance test may be conducted for local clustering of the observed count at this point by ranking the m + 1 indices, which is equivalent to ranking the m + 1 values of the distance to regularity for this point. For a donor unit, if the observed index (Di) is ranked at the top k-th position, then ci is significantly greater than expected under the assumption of absence of local aggregation at the significance level of 1 − k/(m + 1). For instance, with 199 randomisations and a significance level of α = 0·05, ci is significant if it is larger than the 190th of 199 Di,k values. Similarly for a receiver unit, if Dj is ranked at the bottom k-th position, then cj is significantly less than expected under the assumption of absence of local aggregation at the significance level of k/(m + 1). As each ci (cj) is calculated from a different set of m randomisations (in which this count is fixed at its original sampling position) and the observed pattern, the proposed significance testing for individual ci (cj) thus does not suffer from potential problems arising from multiple testing, as often encountered in such randomisation-testing procedures, for example, testing significance of spatial statistics (Wiegand & Moloney 2004). However, as in multiple treatment comparisons in conventional analysis of variance, testing for significance of individual ci (cj) values should proceed only when there is evidence for significant global deviations from a random pattern.
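A rank-based test of this kind might be sketched as follows. The function is hypothetical and uses the conventional one-sided Monte Carlo P-value, counting the observed index amongst the m + 1 values and treating ties as at least as extreme; this convention is an assumption and may differ in detail from the ranking rule used in the authors' software.

```python
def local_p_value(c_obs, c_rand):
    """One-sided Monte Carlo P-value for one sampling point.

    c_obs  : observed clustering index (positive for donors, negative for receivers)
    c_rand : indices from the m randomisations with this count fixed in place
    """
    if c_obs >= 0:
        # Donor: large positive indices are extreme.
        k = 1 + sum(1 for c in c_rand if c >= c_obs)
    else:
        # Receiver: large negative indices are extreme.
        k = 1 + sum(1 for c in c_rand if c <= c_obs)
    return k / (len(c_rand) + 1)

# Hypothetical donor: 199 randomised indices 0.01, 0.02, ..., 1.99, and an observed
# index that exceeds 190 of them.
c_rand = [i / 100 for i in range(1, 200)]
p_donor = local_p_value(1.905, c_rand)
```

In this made-up case nine randomised indices are at least as large as the observed one, so the P-value is 10/200, matching the 199-randomisation example in the text.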

The new algorithm was coded in Microsoft Visual C++ as a Windows program. In addition, the original SADIE algorithm (including the transportation algorithm) was also ported to Microsoft Visual C++. The new code was extensively tested against many data sets to ensure that it produces the same original SADIE statistics as those given by the SADIE software (SADIESHELL, kindly provided by Dr Perry of Rothamsted Research, England, UK). This new program is freely available on the website (http://www.emr.ac.uk/pdf/wsadie.zip).

Evaluating the new clustering index

We focus on (i) the absolute differences between the new ci (cj) and original vi (vj) statistics, (ii) the relationship of the significance level of ci (cj) with the corresponding value of vi (vj) as well as ci (cj) and (iii) the relationship of the global significance of clustering with the significance of local clustering. All statistical analyses were carried out using Genstat™ (Payne 2006). The difference between ci (cj) and vi (vj) was expressed as a percentage of the original local index (vi or vj).

Two types of data were used to compare the new indices with the original ones. First, we used four simulated data sets of different sampling grid sizes (3 × 3, 5 × 5, 10 × 10 and 20 × 20) at a nominal unit, which were used in a previous study (Xu & Madden 2005) to investigate the behaviour of SADIE statistics. Individual counts at each sampling point were randomly drawn from a beta-binomial distribution with parameters α = β = 5 and a sample unit size (nsu) of 100, giving a mean of μ = nsuα/(α + β) = 50 and a heterogeneity parameter of 1/(α + β) = 0·1. The beta-binomial distribution was used to generate count data because this distribution often describes quadrat data of plant diseases well (Madden, Hughes & van den Bosch 2007). For each of the four data sets, we conducted 200 random permutations to generate 200 different spatial data sets with the same counts (although the counts occupied different sampling locations in the different permutations). These 800 sets are labelled the permutation data sets below. For each of the 800 permutation data sets, we calculated the following: ci (cj) and vi (vj) for each sampling point; Ia, $\bar{v}_i$ ($\bar{v}_j$) and $\bar{c}_i$ ($\bar{c}_j$) across all the sampling points. The ranking (used to calculate the P-value) of an individual ci (cj) value for each sampling point was based on 99 randomisations (i.e. m = 99).
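The simulation step could be sketched along these lines. This is an illustration under stated assumptions rather than the code used in the study: each count is drawn by first sampling a probability from a beta distribution and then summing Bernoulli trials, the function names and the seed are hypothetical, and only the Python standard library is used.

```python
import random

def beta_binomial_counts(n_units, alpha, beta, n_su, rng):
    """Draw n_units counts: p ~ Beta(alpha, beta), then x ~ Binomial(n_su, p)."""
    counts = []
    for _ in range(n_units):
        p = rng.betavariate(alpha, beta)
        counts.append(sum(1 for _ in range(n_su) if rng.random() < p))
    return counts

def permuted_data_sets(counts, n_perm, rng):
    """Return n_perm random rearrangements of the same counts over the grid."""
    return [rng.sample(counts, len(counts)) for _ in range(n_perm)]

rng = random.Random(2005)                            # arbitrary seed, for reproducibility only
counts = beta_binomial_counts(25, 5, 5, 100, rng)    # one 5 x 5 grid, flattened
perms = permuted_data_sets(counts, 200, rng)         # 200 permutation data sets
```

Every permutation keeps the same multiset of counts, so differences in the resulting indices reflect sampling position alone.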

To investigate the relative differences between the old and new SADIE clustering indices in relation to the sampling position and grid size, three groups of sampling points were formed for each sampling grid size: four corner points; four (in the case of 3 × 3 and 5 × 5) or eight (in the case of 10 × 10 and 20 × 20) central points on the edges; and one point (in the case of 3 × 3 and 5 × 5) or four points (in the case of 10 × 10 and 20 × 20) at the centre of the sampling grid. Variances of the percentage differences between the two indices were then calculated across the 200 permutations for each group of points. In addition, the distance of each sampling point to the grid centre for the 20 × 20 data was calculated. Variances of the percentage differences between the old and new indices were calculated across the 200 permutations for each distance.

A further 4000 random permutations of the 5 × 5 data set (22, 61, 61, 53, 50, 50, 40, 44, 34, 50, 61, 15, 44, 66, 78, 57, 52, 74, 34, 53, 70, 59, 43, 66 and 58) were conducted to compare the new and original indices and to illustrate the bias in the original ones, relative to the new ones, arising from count size and sampling position. We conducted 1000 permutations for each of the following four scenarios: the maximum count of 78 at location (1,1) (i.e. at the corner) and at location (3,3) (i.e. in the centre), and the minimum count of 15 at location (1,1) and at location (3,3). Distributions of the 1000 new and original indices for each of these four cases were then compared.

We then estimated the new indices for three data sets from published studies on the spatial distribution of insects in agricultural land. The first data set consisted of 554 individuals of cereal aphids sampled in 1996 in a 250 × 180 m field of winter wheat in a 9 × 7 rectangular grid at intervals of 30 m (Perry et al. 1999). The second data set consisted of 63 counts totalling 811 arthropods collected on 5 July 1996 in a field of organic winter wheat using the same sampling scheme as for the first data set (Holland, Winder & Perry 1999). The final data set, kindly provided by Dr Lavigne of INRA, France, consisted of 30 counts of codling moth in an apple orchard of c. 0·33 ha; this is the data set labelled as Orchard F in the study of Lavigne et al. (2010), which evaluated a density map-based approach for quantifying spatial patterns.

Results and discussion

Differences between local indices, ci (cj) and vi (vj)

Despite the high correlation (> 0·99) between the new and original indices for the 800 permutation data sets, the relative difference ranged from −10·1% to 10·2% (average = −0·5%, SD = 3·97), from −16·6% to 16·2% (average = 0·1%, SD = 4·70), from −21·6% to 22·4% (average = 0·0%, SD = 4·95) and from −32·2% to 30·4% (average = 0·0%, SD = 5·61) for data sets of 3 × 3, 5 × 5, 10 × 10 and 20 × 20, respectively (Fig. 1). Correlation of the relative difference with vi (vj) was close to zero (< 0·06) except for the 3 × 3 grid (r = 0·15). There was an overall correlation of 0·92 between Ia and (inline image). This agrees with results from both field observational and simulated data (Xu & Madden 2005) that amongst the SADIE statistics, Ia is sufficient to describe the overall aggregation of a single data set.

Figure 1.  Boxplots of the differences between the new (ci or cj) and original (vi or vj) SADIE clustering indices (as % of vi or vj) for the 800 simulated data sets (200 permuted sets for each of four quadrat sizes 3 × 3, 5 × 5, 10 × 10 and 20 × 20). Description of data sets is in Xu & Madden (2005). In the boxplot, the upper and lower limits of the box indicate the upper and lower quartiles of the distribution and the horizontal line through the box indicates the median; the ‘whiskers’ extending beyond the box indicate the range of 10th and 90th percentiles; ‘outlying’ points are shown individually as open circles.

The differences between the two indices for the corner and central edge points increased with increasing sampling grid size (Table 1). There was no clear trend for the centre points. For the 20 × 20 grid, the variance of the percentage difference was 74, 38 and 29 for the corner, central edge and grid central points, respectively. Across all the 200 permutations, the percentage difference was negatively correlated with count size for the four corner points (Fig. 2a,d and g), but not for the central grid points (Fig. 2b,e and h), whereas for the central edge points, the relation was only apparent for the 20 × 20 grid (Fig. 2i). The two ‘segments’ (close to mirror images) in Fig. 2a,d and g and, to a lesser degree, Fig. 2i probably result from the symmetry of ci on the right (larger counts) and cj on the left (smaller counts). For the 20 × 20 grid, the variance of the percentage differences between the old and new indices stayed close to 30 up to a distance of 9 and thereafter increased sharply to a maximum for the corner points (those farthest from the grid centre; Fig. 3). From distance 9 onwards, the variance fluctuated greatly: the variance for the points on the edge was much greater than for the nonedge points.

Table 1.   Variances of the percentage differences between the old and new spatial analysis by distance indices clustering indices across the 200 random permutations for the four points at the corners, four (in the case of 3 × 3 and 5 × 5 grid size) or eight (in the case of 10 × 10 and 20 × 20 grid size) points on the edges and one point (in the case of 3 × 3 and 5 × 5 grid size) or four points (in the case of 10 × 10 and 20 × 20 grid size) at the central location

Grid size   Centre   Corner   Edge
3 × 3         44       17       7
5 × 5         35       34      13
10 × 10       24       53      24
20 × 20       29       74      38

Figure 2.  Plots of the differences between the new and original indices (as percentage of the original indices) against the corresponding counts for the points at (a,d and g) the corners, (b,e and h) the grid centre, and (c,f and i) central edges for the 200 permuted data sets of each grid size (5 × 5, 10 × 10 and 20 × 20). The 3 × 3 grid data are not shown because such a small grid size is not used in real studies.

Figure 3.  Plot of the variance in the differences between the new and original indices (as percentage of the original indices) across the 200 permutations against the distance to the grid (20 × 20) centre. The largest distances represent points at or near the grid corner.

Although there was a fair amount of overlap between the empirical distributions of the new and original indices when the maximum (78) and minimum (15) counts were fixed at the corner or centre of the 5 × 5 grid over 1000 permutations, the distributions were clearly different (Fig. 4). Following the principle of randomisation tests for statistical significance, the average index should be close to 1·0. This is, however, not the case for vi (vj), for which there was considerable underestimation and overestimation, respectively, for the location of the corner and the centre (Fig. 4). In contrast, the average of ci (cj) is 1·0 by definition.

Figure 4.  Empirical density plot of the 1000 new (red) and original (blue) clustering indices where the maximum (78) and minimum (15) count of the 5 × 5 data set was either fixed at the corner or in the centre.

These results suggest that the influence of sampling position and count size is not completely removed in the estimation of the original indices, with the bias varying with count size and sampling position (Xu & Madden 2005). The new index corrects this bias and also allows testing for statistical significance of individual indices.

Relationship of the significance level of ci (cj) with corresponding vi (vj)

The greater the value of ci (|cj|), the more likely it was to achieve statistical significance, as expected (Fig. 5b,d and f). However, there were large variations in the threshold values of vi (vj) and ci (|cj|) for achieving statistical significance for all simulated grid data (Fig. 5). The average values of both original and new indices for achieving 5% significance (based on the empirical distribution of ci [cj] values) increased slightly with increasing grid size. For the 20 × 20 grid data, the threshold vi and vj values for achieving 5% significance ranged, respectively, from 1·37 to 3·11 (average = 2·06) and from −1·44 to −3·06 (average = −2·07; Fig. 5e); corresponding values for ci (|cj|) were 1·55 to 2·95 (average = 2·20) and −1·50 to −2·86 (average = −2·09; Fig. 5f). These results demonstrate the inadequacy of the current SADIE approach where a single threshold value of 1·5 (−1·5) is used to determine the ‘significance’ of the index (i.e. to determine whether the individual index is substantial).

Figure 5.  Plot of (a,c and e) the original (vi or vj) and (b,d and f) the new clustering indices (ci or cj) against the probability that the corresponding new clustering index for an observed point (site, location) is greater (donor units: positive index) or smaller (receiver units: negative index) than that expected under the null hypothesis of no local clustering. Results based on the 100 randomisations for each sampling location for the 200 permutated sets of the 5 × 5, 10 × 10 and 20 × 20 grid size (i.e. 2500, 10 000, and 40 000 randomisations for each permutated data set, respectively, for the three grid sizes). The vertical dotted lines indicate the 5% cut-off threshold for significance of a donor (or receiver) unit. The 3 × 3 grid data were not shown because such a small grid size is not used in real studies (none of indices for nine points was statistically significant).

The extent of the inadequacy of defining ±1·5 as a threshold value for clustering indices can be illustrated by the empirical distributions in Fig. 4. For the maximum count of 78, vi was >1·5 in 189 of 1000 simulations (proportion = 0·189) for the corner position and in 15 of 1000 simulations (proportion = 0·015) for the centre position. Similarly, for the minimum count of 15, vj was <−1·5 in 135 simulations (proportion = 0·135) for the corner and in 1 simulation (proportion = 0·001) for the centre position. Thus, interpreting the original indices in terms of gaps and patches based on the threshold of ±1·5 may lead to misleading conclusions.

As with the simulated data sets, the original indices overestimated the patch and gap sizes relative to those identified by the new indices for the three example data sets (Fig. 6). In this context, a patch is a set of contiguous points (locations) with original indices all ≥1·5, or new indices all individually significant (P < 0·05) based on the randomisations. Likewise, a gap is a set of contiguous points with original indices all ≤−1·5, or new indices all individually significant. For the first data set of aphid counts (Perry et al. 1999), only 3 points with significant ci values and 8 points with significant cj values were found; in contrast, 5 points with vi values >1·5 and 16 points with vj values <−1·5 were found (Fig. 6b,c). For this data set, the number of gaps was reduced from two to one using the new index. For the other two data sets, the number of clear patches or gaps did not change, but the (spatial) size of each gap/patch was considerably smaller for the new indices than for the original ones (Fig. 6e,f,h and i). In addition, Fig. 6 demonstrates that comparing the magnitude of indices alone (without considering their corresponding P-values) may be misleading regarding the relative contribution of points to the gaps/patches. Thus, a contour plot of the P-values for ci and cj can present a clearer illustration of the observed spatial pattern than a plot of the ci and cj values themselves.
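Delimiting patches and gaps from individually significant points amounts to grouping contiguous grid cells, which can be sketched as below. The text does not say whether diagonal neighbours count as contiguous, so this sketch assumes rook adjacency (shared edges only); the function name and the example cells are hypothetical.

```python
def contiguous_groups(flagged):
    """Group flagged grid cells into contiguous sets (rook adjacency assumed).

    flagged : set of (row, col) cells, e.g. donor units with individually significant indices
    returns : list of sets, one per patch (or gap)
    """
    remaining = set(flagged)
    groups = []
    while remaining:
        stack = [remaining.pop()]
        group = set(stack)
        while stack:
            r, c = stack.pop()
            # Visit the four edge-sharing neighbours that are also flagged.
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    group.add(nb)
                    stack.append(nb)
        groups.append(group)
    return groups

significant_donors = {(0, 0), (0, 1), (1, 1), (4, 4)}   # hypothetical cells
patches = contiguous_groups(significant_donors)
largest_patch = max(len(g) for g in patches)
```

Running the same function on the set of significant receiver units would give the gaps; switching to eight-neighbour adjacency would only require extending the neighbour tuple.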

Figure 6.  Plots of (a) the aphid counts (Perry et al. 1999) and the corresponding values of (b) new and (c) original clustering index, (d) the arthropod counts (Holland, Winder & Perry 1999) and the corresponding values of (e) new and (f) original clustering index and (g) the codling moth counts (Lavigne et al. 2010) and the corresponding values of (h) new and (i) original clustering index. The size of circle symbols is proportional to the absolute values of the indices. Symbols filled with black (receivers) and red (donors) colour indicate that the new indices are statistically significant (P < 0·05) or that the absolute values of the original indices are ≥1·5.

Relationship of the global significance level with individual ci (cj) significance levels

The nature (clumped, random or underdispersed) of the overall pattern is characterised by Ia and the global (field scale) P-value. On the other hand, individual ci and cj values, together with their associated P-values, provide explanations at a lower scale for the overall pattern when the global P-value indicates significant deviation from a random pattern. For a global random pattern, we expect that, on average, 10 donor units (for the 20 × 20 quadrat data) would have ci values significantly greater than expected (individual P < 0·05), and 10 receiver units would have cj values significantly less than expected (also at individual P < 0·05). This expectation is consistent with the results obtained for the permuted data sets (Fig. 7a): the number of significant ci values initially increased gradually with decreasing global P-value and then steeply when the global P-value was ≤ 0·1 (Fig. 7a). Similar results were obtained for cj and for other grid sizes except 3 × 3, where none of the new clustering indices was statistically significant. In analysing spatial patterns, we are rarely interested in individual sites alone but rather in patches and gaps. It is expected that a clumped pattern is associated not only with the total number of donor units with significant ci values but also with the size of a contiguous area of these points, as shown in Fig. 7b. Similar results were found for receiver units (data not shown).

Figure 7.  Plot of the total number of donor units with significant new clustering indices (ci) against (a) the probability values for the global pattern associated with Ia of spatial analysis by distance indices (Perry 1995, 1998) and (b) the maximum patch size (number of contiguous donor units with significant ci). In (a), filled and unfilled symbols represent the number of donor units with the ci value significant at the level of 5% and 1%, respectively. In (b), black, green and red-filled circle symbols indicate that the global pattern was significant at the level of 10%, 5% and 1%, respectively.

The P-values of individual ci and cj provide a way to accurately estimate the sizes of patch or gap areas when a global test indicates significant deviation from a random pattern (as demonstrated in Fig. 6). The ci and cj values for patterns at a lower scale may also be used in conjunction with global tests other than the SADIE Ia-based test. For example, a recently developed method that is based on the density map was shown to be less sensitive to edge (sampling position) effects and to detect nonrandom pattern for low counts more frequently than the SADIE methodology (Lavigne et al. 2010); this result is partly achieved through use of an additional parameter in the statistical model (smoothing bandwidth, h). However, their method does not provide measures for local clustering, and thus ci and cj may be used in conjunction with this density map approach. It is interesting to note that we would not expect the gap in the top left corner in the codling moth data set (Fig. 6h) from the contour map produced by the density map approach [see fig. 4 in Lavigne et al. (2010)].

Conclusions

We derived and evaluated a new local clustering index to be used within the SADIE methodology. The new index (ci, cj) does not separate the location from the count in assessing the distance to regularity in a modified randomisation algorithm, and unlike the currently defined clustering index (Perry et al. 1999), it eliminates the influence of sampling position and count size on its calculation. More importantly, we have provided a formal statistical test to determine the significance level of the new index for each location based on randomisations; these significance tests should be used when a global test for spatial pattern shows significance. A single threshold value of ±1·5 proposed previously for determining the importance of original clustering indices is likely to lead to misleading conclusions about patches and gaps and their sizes. The new index complements other methods for testing the global pattern by explaining the pattern at a local scale.

Acknowledgements

This project is funded by the China Natural Science Foundation (Project Number: 30928016).

References
