Correspondence site: http://www.respond2articles.com/MEE/

# Spatial analysis by distance indices: an alternative local clustering index for studying spatial patterns

Article first published online: 11 NOV 2011

DOI: 10.1111/j.2041-210X.2011.00165.x

© 2011 The Authors. Methods in Ecology and Evolution © 2011 British Ecological Society

Additional Information

#### How to Cite

Li, B., Madden, L. V. and Xu, X. (2012), Spatial analysis by distance indices: an alternative local clustering index for studying spatial patterns. Methods in Ecology and Evolution, 3: 368–377. doi: 10.1111/j.2041-210X.2011.00165.x

#### Publication History

- Issue published online: 4 APR 2012
- Article first published online: 11 NOV 2011
- Received 20 June 2011; accepted 12 October 2011. Handling Editor: David Murrell


### Keywords:

- clustering indices;
- distance to regularity;
- significance testing

### Summary

**1.** The spatial analysis by distance indices (SADIE) methodology for data analysis is valuable for quantifying spatial patterns of organisms in terms of patches and gaps. Previous research showed that the calculation of the local clustering indices, key SADIE statistics, does not adequately adjust for the absolute location or the magnitude of the counts.

**2.** We present a new definition of a local clustering index, which overcomes the problem associated with the original cluster indices related to sampling position and count size. The new index is calculated without breaking the link between the observed count and its original position and quantifies the contribution of an observed count at this particular position to the local gaps or patches for the observed pattern relative to the expected under the assumption of spatial independence amongst observed counts. Randomisation-based testing for statistical significance of an individual local clustering index follows naturally from the definition of the new index.

**3.** New indices, calculated for several simulated and observed data sets, showed that the original indices overestimated the number of points (sites, locations) contributing to the gaps/patches in a spatial grid. Results indicate that the significance (or interpretation) of an individual local clustering index cannot be judged from its magnitude alone and needs to be supported by statistical testing.

**4.** The newly developed index will provide a valuable tool for quantifying the local pattern and testing for its significance and enhance the value of SADIE methodology in analysing spatial patterns. It can also be used in conjunction with other approaches that test for global clustering.

### Introduction

Spatial analysis by distance indices (SADIE) methodology was developed to quantify spatial patterns of organisms for either spatially referenced sampling units or individuals (Perry 1995, 1998). The basis of SADIE for spatially referenced sampling units is to quantify pattern by the total distance that individuals must be moved between sampling units so that the data are as regular as possible. The degree of nonrandomness is quantified by comparing the distance to regularity for the observed data set with distance to regularity for rearrangements of the observed data. One of the SADIE statistics, *I*_{a}, is defined as the ratio between the distance moved to achieve the regular pattern for the observed data and the arithmetic mean distance to regularity for randomised samples.

The SADIE methodology has evolved into an analysis of spatial patterns of local clustering and association indices (Perry *et al.* 1999). The contribution of each sampling unit count to the observed pattern is quantified as a scaled and dimensionless clustering index using the observed and randomised data. The clustering indices *v*_{i} (a positive value for sampling units with counts greater than the mean) and *v*_{j} (a negative value for sampling units with counts less than the mean) measure the respective degree to which a sampling unit contributes to patches and gaps. Mean *v*_{i} ($\bar{v}_i$) and *v*_{j} ($\bar{v}_j$) values for a data set are used as measures of the degree of nonrandomness in addition to *I*_{a}; the *I*_{a} statistic is strongly and linearly correlated with $\bar{v}_i$ and $\bar{v}_j$ (Xu & Madden 2004, 2005). The SADIE method has been further extended to assess the spatial association of two species (Winder *et al.* 2001; Perry & Dixon 2002) on the basis of individual sampling unit clustering indices for each species.

Spatial analysis by distance indices methodology has been used in a wide range of research disciplines, such as plant disease epidemiology (Turechek & Madden 1999; Oerke *et al.* 2010), entomology (Tillman *et al.* 2009; McGraw & Koppenhofer 2010), weed science (Oveisi, Yousefi & Gonzalez-Andujar 2010), biocontrol of agricultural pests (Sciarretta *et al.* 2010), soil ecology (Yankelevich *et al.* 2006; Spiridonov, Moens & Wilson 2007), biological invasion (Maltez-Mouro, Maestre & Freitas 2010), forest management (Barbeito *et al.* 2009), plant ecology (Rodriguez *et al.* 2009; Wehncke, Medellin & Ezcurra 2009; Gutierrez-Giron & Gavilan 2010) and plant–soil interactions (Maestre *et al.* 2003). The interpretation of SADIE statistics may, however, need careful consideration. The magnitudes of *I*_{a}, $\bar{v}_i$ and $\bar{v}_j$ (or $|\bar{v}_j|$) are influenced by the absolute sampling position of the counts and not just by the relative positions amongst sampling units, as in autocorrelation and geostatistics approaches (Xu & Madden 2003, 2004). A density map-based method was recently developed to analyse spatial count data and was shown to be less sensitive to edge (sampling position) effects than the SADIE methodology (Lavigne *et al.* 2010). Moreover, the current method for calculating *v*_{i} (or *v*_{j}) does not completely adjust for the effects of the counts and their sampling positions (Xu & Madden 2005), and thus, an individual *v*_{i} (*v*_{j}) may not correctly quantify the contribution of an individual count at a particular position to the overall pattern. Finally, there is currently no formal statistical test for the significance of each individual *v*_{i} (*v*_{j}).

As the clustering indices are the cornerstone of the SADIE methodology (Perry *et al.* 1999), any inaccuracy in estimating these indices might result in erroneous conclusions. In this article, we present an alternative definition of clustering indices and compare the derived statistics with the ones in the current SADIE methodology. More importantly, we propose a direct method for testing the statistical significance of each clustering index.

### Approaches

#### Description of the SADIE clustering indices

The number or count of individuals in a sampling unit is given by *x*. Sampling units with counts larger than the mean ($\bar{x}$) are donors, designated with an *i* subscript, in that individuals are moved out of these units in determining the total moves to regularity. Similarly, sampling units with counts smaller than $\bar{x}$ are receivers, designated with a *j* subscript. For donor unit *i*, the total outflow of individuals is given by $x_i - \bar{x}$; for receiver unit *j*, total inflow is given by $\bar{x} - x_j$. There are a total of *n*_{i} donor and *n*_{j} receiver sampling units. The specific moves to regularity are determined by a transportation algorithm (Perry 1995, 1998).

For donor unit *i*, at position (*a*_{i}, *b*_{i}), the outflow to the *j*th receiver unit at position (*a*_{j}, *b*_{j}) is denoted as *V*_{ij}; the distance (*d*_{ij}) of this flow is $d_{ij} = \sqrt{(a_i - a_j)^2 + (b_i - b_j)^2}$. The average distance (*Y*_{i}) of outflow from unit *i* to all receivers is

$$Y_i = \frac{\sum_{j} d_{ij} V_{ij}}{\sum_{j} V_{ij}} \qquad \text{(eqn 1)}$$
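The flow-weighted average distance described above can be illustrated with a minimal Python sketch. It assumes that *Y*_{i} is the mean of the distances *d*_{ij} weighted by the flows *V*_{ij}, and that the flows have already been obtained from a transportation algorithm (not implemented here). The function name and the example positions and flows are hypothetical.

```python
import math


def mean_outflow_distance(donor_pos, receiver_positions, flows):
    """Flow-weighted mean distance from one donor unit to its receivers.

    donor_pos          -- (a_i, b_i) coordinates of the donor unit
    receiver_positions -- list of (a_j, b_j) coordinates of receiver units
    flows              -- list of outflows V_ij, one per receiver
    """
    total_flow = sum(flows)
    if total_flow == 0:
        raise ValueError("donor unit has no outflow")
    weighted = sum(
        flow * math.dist(donor_pos, pos)  # d_ij is the Euclidean distance
        for pos, flow in zip(receiver_positions, flows)
    )
    return weighted / total_flow


# Hypothetical example: 2 individuals move 5 units, 3 individuals move 2 units.
y_i = mean_outflow_distance((0, 0), [(3, 4), (0, 2)], [2, 3])
print(y_i)
```

In this example the weighted sum is 2 × 5 + 3 × 2 = 16 over a total flow of 5 individuals.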

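The total distance to regularity for the entire data set is $D = \sum_{i}\sum_{j} d_{ij} V_{ij}$. A standardised and dimensionless clustering index (*v*_{i}) for a donor unit is defined as

$$v_i = \frac{Y_i \; {}_{o}Y}{{}_{i}Y \; {}_{c}Y} \qquad \text{(eqn 2)}$$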

where _{c}*Y* and _{i}*Y* are average values estimated from the randomisations as described below. The original *counts* are kept track of during each randomisation, and their average distance of outflow is calculated for each randomisation (*Y*_{i,k}; *k* = 1, …, *m*). The average outflow distance (_{c}*Y*) for the observed count (*x*_{i}) across all the randomisations is defined as ${}_{c}Y = \frac{1}{m}\sum_{k=1}^{m} Y_{i,k}$. Similarly, the individual *location* for the *i*-th unit (*a*_{i}, *b*_{i}) is kept track of during each randomisation, and the average flow distance for this location is determined in each randomisation (_{k,i}*Y*; *k* = 1, …, *m*). The average flow distance (_{i}*Y*) for the *i*-th unit across all the randomisations is then defined as ${}_{i}Y = \frac{1}{m}\sum_{k=1}^{m} {}_{k,i}Y$. The statistic _{o}*Y* denotes the average absolute value of _{c}*Y* for a data set over all sampling units, which is equal to the average value of _{i}*Y* over all the units. For inflows, *v*_{j} is defined similarly, again with the convention that it is negative in sign. There were no objective criteria for assessing statistical significance of individual *v*_{i} (*v*_{j}). Instead, values of *v*_{i} > 1·5 or *v*_{j} < −1·5 were proposed to indicate membership of a patch and a gap, respectively (Perry *et al.* 1999).

In estimating the clustering index (*v*_{i}), specific spatial patterns resulting from randomisation for a given set of count data range from regular to highly aggregated, but will be expected to be random on average. The principle of the SADIE methodology is that a value of an index (*v*_{i}, |*v*_{j}|; $\bar{v}_i$, $|\bar{v}_j|$; *I*_{a}) close to unity indicates a random pattern, because the index is a ratio between an observed distance value and an average value across all randomisations. Thus, the average *v*_{i} and *v*_{j} for a given sampling point over a reasonably large number of randomisations where this particular point remains at the original position, whilst all others are randomly allocated to other positions, should be close to 1·0. However, this was shown not to be the case, especially for large or small counts located at the corner or edge (Xu & Madden 2005), where the average *v*_{i} (*v*_{j}) value over many randomisations deviated considerably from the expected value of 1·0, suggesting that the influence of spatial location and count size on *v*_{i} and *v*_{j} has not been completely accounted for. In addition, because of the nature of randomisation, _{c}*Y* estimated for sampling units with the same count data may differ considerably.

#### Description of the new clustering index

The current SADIE methodology separates the count from its position when estimating *v*_{i} (*v*_{j}). However, it could be argued that a local clustering index at a particular point should describe the count size at this particular position relative to its neighbours. Thus, we argue that the count and its physical sampling position should be considered as a single entity when assessing the local clustering, which is the basis for the new algorithm estimating clustering indices.

For every sampling point (*x*_{i} or *x*_{j}), irrespective of whether it is a donor or receiver, we conduct *m* randomisations; in each randomisation, the count remains at its original (observed) position, whilst all other (*n* − 1) counts are randomly assigned to the other (*n* − 1) sampling points. We calculate the average of the total distance to regularity across the *m* randomisations for the observed count *x*_{i} (or *x*_{j}) at its original sampling point, also including the observed data set. That is, we define

$$\bar{D}_i = \frac{D_i + \sum_{k=1}^{m} D_{i,k}}{m + 1} \qquad \text{(eqn 3)}$$

where *D*_{i} is the distance to regularity for the observed data and *D*_{i,k} is the distance to regularity for the *k*-th randomisation for point *i*. Then, the new clustering index for the observed count at this particular point for the observed data is defined as

$$c_i = \frac{D_i}{\bar{D}_i} \qquad \text{(eqn 4)}$$

*c*_{j} is defined similarly for receiver points. Similarly, we calculate the clustering index for the *k*-th randomisation of this observed count at its original position [i.e. point (site, location) *i*] as

$$c_{i,k} = \frac{D_{i,k}}{\bar{D}_i} \qquad \text{(eqn 5)}$$
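The randomisation scheme described above, in which one count stays at its observed position whilst the remaining *n* − 1 counts are reshuffled, might be sketched as follows. This is an illustration only, not the authors' Visual C++ implementation; the function name is hypothetical, and the short list of counts is simply the first five values of the 5 × 5 data set, treated as a flat list of sampling points.

```python
import random


def permute_holding_one(counts, fixed_index, rng):
    """Return a copy of counts in which the count at fixed_index keeps its
    position and all other counts are randomly reassigned to the other
    positions."""
    others = [c for k, c in enumerate(counts) if k != fixed_index]
    rng.shuffle(others)
    # Re-insert the fixed count at its original position.
    return others[:fixed_index] + [counts[fixed_index]] + others[fixed_index:]


rng = random.Random(1)  # seeded only so that the illustration is repeatable
observed = [22, 61, 61, 53, 50]
shuffled = permute_holding_one(observed, 2, rng)
print(shuffled)
```

Repeating this *m* times for a given point, and computing the distance to regularity for each rearrangement, would give the *m* values of *D*_{i,k} for that point.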

As for *v*_{i} (*v*_{j}) of eqn 2, the new indices (*c*_{i} and *c*_{j}) are greater or less than zero for the donor and receiver units, respectively. But unlike *v*_{i} (*v*_{j}), the average *c*_{i} or *c*_{j} value for a given sampling point over all randomisations (including the observed) is by definition equal to 1·0, as $\left(c_i + \sum_{k=1}^{m} c_{i,k}\right)/(m + 1) = 1$, which is the principle of randomisation testing for statistical significance. The new index describes the contribution of this count at this particular point to the local gaps or patches for a given observed pattern relative to the expected under the assumption of spatial independence amongst observed counts.
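The property that the indices average to 1·0 can be checked numerically. The sketch below assumes, following the description in the text, that the reference distance is the mean of the observed distance and the *m* randomised distances, and that each index is the ratio of a distance to that mean. Only magnitudes are computed; the negative sign for receiver units is a convention applied afterwards. The function name and the distance values are hypothetical.

```python
def new_clustering_indices(d_obs, d_rand):
    """Return (c_obs, c_rand) for one sampling point.

    d_obs  -- distance to regularity for the observed data
    d_rand -- list of m distances from randomisations that keep this
              point's count at its observed position
    """
    m = len(d_rand)
    d_mean = (d_obs + sum(d_rand)) / (m + 1)  # mean includes the observed value
    c_obs = d_obs / d_mean
    c_rand = [d / d_mean for d in d_rand]
    return c_obs, c_rand


# Hypothetical distances: observed 12, three randomisations 8, 10 and 6.
c_obs, c_rand = new_clustering_indices(12.0, [8.0, 10.0, 6.0])
mean_c = (c_obs + sum(c_rand)) / (len(c_rand) + 1)
print(c_obs, mean_c)
```

Because every index for a point shares the same denominator, ranking the indices gives the same order as ranking the distances themselves.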

Calculation of *I*_{a} and testing the significance of *I*_{a}, that is the global significance of aggregation, follows the original SADIE methodology based on the total *m *× *n* randomisations and hence is not described here. We only present a method for direct testing of significance of each individual clustering index *c*_{i} (*c*_{j}) at a given sampling point. For each point, we have *m + *1 clustering indices: the observed (*c*_{i}, *c*_{j}) and *m* randomisations (*c*_{i,k}, *c*_{j,k}). A significance test may be conducted for local clustering of the observed count at this point by ranking the *m + *1 indices, which is equivalent to ranking the *m + *1 values of the distance to regularity for this point. For a donor unit, if the observed index (*D*_{i}) is ranked at the top *k-*th position, then *c*_{i} is significantly greater than expected under the assumption of absence of local aggregation at the significance level of 1 − *k*/(*m* + 1). For instance, with 199 randomisations and a significance level of α = 0·05, *c*_{i} is significant if it is larger than the 190th of 199 *D*_{i,k} values. Similarly for a receiver unit, if *D*_{j} is ranked at the bottom *k-*th position, then *c*_{j} is significantly less than expected under the assumption of absence of local aggregation at the significance level of *k*/(*m* + 1). As each *c*_{i} (*c*_{j}) is calculated from a different set of *m* randomisations (in which this count is fixed at its original sampling position) and the observed pattern, the proposed significance testing for individual *c*_{i} (*c*_{j}) does not thus suffer from potential problems arising from multiple testing, as often encountered in such randomisation-testing procedures, for example, testing significance of spatial statistics (Wiegand & Moloney 2004). 
However, as in multiple treatment comparisons in conventional analysis of variance, testing for significance of individual *c*_{i} (*c*_{j}) values should proceed only when there is evidence for significant global deviations from a random pattern.
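A rank-based test of this kind can be sketched in a few lines of Python. The version below uses a common randomisation-test convention, in which the *P*-value is the proportion of the *m* + 1 values (observed plus randomised) that are at least as extreme as the observed one; this convention is an assumption and may differ in detail from the ranking rule used in the authors' software. Function names and data are hypothetical.

```python
def upper_tail_p_value(d_obs, d_rand):
    """P-value for a donor unit: proportion of the m + 1 distances that are
    greater than or equal to the observed distance."""
    at_least = 1 + sum(1 for d in d_rand if d >= d_obs)  # observed counts once
    return at_least / (len(d_rand) + 1)


def lower_tail_p_value(d_obs, d_rand):
    """P-value for a receiver unit: proportion of the m + 1 distances that
    are less than or equal to the observed distance."""
    at_most = 1 + sum(1 for d in d_rand if d <= d_obs)
    return at_most / (len(d_rand) + 1)


# Hypothetical example with m = 199 randomised distances 1, 2, ..., 199.
d_rand = [float(v) for v in range(1, 200)]
p_upper = upper_tail_p_value(190.5, d_rand)  # larger than the 190th value
p_lower = lower_tail_p_value(9.5, d_rand)    # smaller than the 10th value
print(p_upper, p_lower)
```

With 199 randomisations, an observed distance that exceeds the 190th ordered value is the 10th largest of 200 values, which corresponds to the 5% level in the example given in the text.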

The new algorithm was coded in Microsoft Visual C++ as a Windows program. In addition, the original SADIE algorithm (including the transportation algorithm) was also ported to Microsoft Visual C++. The new code was extensively tested against many data sets to ensure that it produces the same original SADIE statistics as those given by the SADIE software (SADIESHELL, kindly provided by Dr Perry of Rothamsted Research, England, UK). This new program is freely available on the website (http://www.emr.ac.uk/pdf/wsadie.zip).

#### Evaluating the new clustering index

We focus on (i) the absolute differences between the new *c*_{i} (*c*_{j}) and original *v*_{i} (*v*_{j}) statistics, (ii) the relationship of the significance level of *c*_{i} (*c*_{j}) with the corresponding value of *v*_{i} (*v*_{j}) as well as *c*_{i} (*c*_{j}) and (iii) relationship of global significance of clustering with significance of local clustering. All statistical analyses were carried out using Genstat™ (Payne 2006). The difference between *c*_{i} (*c*_{j}) and *v*_{i} (*v*_{j}) was expressed as the percentage of the original local index (*v*_{i} or *v*_{j}).

Two types of data were used to compare the new indices with the original ones. First, we used four simulated data sets of different sampling grid sizes (3 × 3, 5 × 5, 10 × 10 and 20 × 20) at a nominal unit, which were used in a previous study (Xu & Madden 2005) to investigate the behaviour of SADIE statistics. Individual counts at each sampling point were randomly drawn from a beta-binomial distribution with the parameters of *α* = *β* = 5 and a sample unit size (*n*_{su}) of 100, giving a mean of μ = *n*_{su}α/(α + β) = 50 and a heterogeneity parameter of 1/(α + β) = 0·1. The beta-binomial distribution was used to generate count data because this distribution can often describe quadrat data of plant diseases well (Madden, Hughes & van den Bosch 2007). For each of the four data sets, we conducted 200 random permutations to generate 200 different spatial data sets but with the same count data (although the counts occupied different sampling locations in the different permutations). These 800 sets are labelled the *permutation* data sets below. For each of the 800 permutation data sets, we calculated the following: *c*_{i} (*c*_{j}) and *v*_{i} (*v*_{j}) for each sampling point; *I*_{a} and the mean clustering indices across all the sampling points. The ranking (used to calculate the *P*-value) of an individual *c*_{i} (*c*_{j}) value for each sampling point was based on 99 randomisations (i.e. *m* = 99).
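Counts of the kind described above can be simulated with the Python standard library alone. The sketch below draws a probability from a beta distribution for each sampling unit and then a binomial count given that probability, which is one standard way to generate beta-binomial variates; it is not the authors' code, and the function name and seed are hypothetical.

```python
import random


def beta_binomial_counts(n_units, n_su, alpha, beta, rng):
    """Draw n_units beta-binomial counts, each out of n_su individuals."""
    counts = []
    for _ in range(n_units):
        p = rng.betavariate(alpha, beta)  # unit-specific probability
        # Binomial draw as a sum of n_su Bernoulli trials.
        counts.append(sum(1 for _ in range(n_su) if rng.random() < p))
    return counts


rng = random.Random(2011)  # arbitrary seed, used only for repeatability
counts = beta_binomial_counts(400, 100, 5.0, 5.0, rng)  # one 20 x 20 grid
mean_count = sum(counts) / len(counts)
print(mean_count)
```

With *α* = *β* = 5 and 100 individuals per unit, the mean count should be close to the theoretical mean of 50.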

To investigate the relative differences between the old and new SADIE clustering indices in relation to the sampling position and grid size, three groups of sampling points were formed for each sampling grid size: four corner points; four (in the case of 3 × 3 and 5 × 5) or eight (in the case of 10 × 10 and 20 × 20) central points on the edges; and one point (in the case of 3 × 3 and 5 × 5) or four points (in the case of 10 × 10 and 20 × 20) at the centre of the sampling grid. The variance of the percentage differences between the two indices was then calculated across the 200 permutations for each group of points. In addition, the distance of each sampling point to the grid centre for the 20 × 20 data was calculated. Variances of the percentage differences between the old and new indices were calculated across the 200 permutations for each distance.

A further 4000 random permutations of the 5 × 5 data set (22, 61, 61, 53, 50, 50, 40, 44, 34, 50, 61, 15, 44, 66, 78, 57, 52, 74, 34, 53, 70, 59, 43, 66 and 58) were conducted to compare the new and original indices and to illustrate the bias in the original ones, relative to the new ones, arising from count size and sampling position. We conducted 1000 permutations for each of the following four scenarios: the maximum count of 78 at location (1,1) (i.e. at the corner) and at location (3,3) (i.e. in the centre), and the minimum count of 15 at location (1,1) and at location (3,3). Distributions of the 1000 new and original indices for each of these four cases were then compared.

We then estimated the new indices for three data sets from published studies on the spatial distribution of insects in agricultural land. The first data set consisted of 554 individuals of cereal aphids sampled in 1996 in a 250 × 180 m field of winter wheat in a 9 × 7 rectangular grid at intervals of 30 m (Perry *et al.* 1999). The second data set consisted of 63 counts, totalling 811 arthropods, collected on 5 July 1996 in a field of organic winter wheat using the same sampling scheme as for the first data set (Holland, Winder & Perry 1999). The final data set, kindly provided by Dr Lavigne of INRA, France, consisted of 30 counts of codling moth in an apple orchard of *c.* 0·33 ha; this is the data set labelled as Orchard *F* in the study of Lavigne *et al.* (2010), which evaluated a density map-based approach for quantifying spatial patterns.

### Results and discussion

#### Differences between local indices, *c*_{i} (*c*_{j}) and *v*_{i} (*v*_{j})

Despite the high correlation (*r *>* *0·99) between the new and original indices for the 800 permutation data sets, the relative difference ranged from −10·1% to 10·2% (average = −0·5%, SD = 3·97), from −16·6% to 16·2% (average = 0·1%, SD = 4·70), from −21·6 to 22·4% (average = 0·0%, SD = 4·95) and from −32·2% to 30·4% (average = 0·0%, SD = 5·61) for data sets of 3 × 3, 5 × 5, 10 × 10 and 20 × 20, respectively (Fig. 1). Correlation of the relative difference with *v*_{i} (*v*_{j}) was close to zero (*r *<* *0·06) except for the 3 × 3 grid (*r *=* *0·15). There was an overall correlation of 0·92 between *I*_{a} and (). This agrees with results from both field observational and simulated data (Xu & Madden 2005) that amongst the SADIE statistics, *I*_{a} is sufficient to describe the overall aggregation of a single data set.

The differences between the two indices for the corner and central edge points increased with increasing sampling grid size (Table 1). There was no clear trend for the centre points. For the 20 × 20 grid, the variance of the percentage difference was 74, 38 and 29 for the corner, central edge and grid central points, respectively. Across all the 200 permutations, the percentage difference was negatively correlated with count size for the four corner points (Fig. 2a,d and g), but not for the central grid points (Fig. 2b,e and h), whereas for the central edge points, the relation was only apparent for the grid 20 × 20 (Fig. 2i). The two ‘segments’ (close to mirror images) in Fig. 2a,d and g and, to a lesser degree, Fig. 2i result probably from the symmetry of *c*_{i} on the right (larger counts) and *c*_{j} on the left (smaller counts). For the 20 × 20 grid, the variance of the percentage differences between the old and new indices did not vary much, close to 30, until the distance of 9 and thereafter increased sharply to maximum for the corner points (those farthest from the grid centre; Fig. 3). From the distance 9 onwards, the variance fluctuated greatly: the variance for the points on the edge was much greater than for the nonedge points.

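Table 1. Variance of the percentage differences between the original and new clustering indices across the 200 permutations, by grid size and sampling position

Grid size | Centre | Corner | Edge |
---|---|---|---|
3 × 3 | 44 | 17 | 7 |
5 × 5 | 35 | 34 | 13 |
10 × 10 | 24 | 53 | 24 |
20 × 20 | 29 | 74 | 38 |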

Although there was a fair amount of overlap between the empirical distributions of the new and original indices when the maximum (78) and minimum (15) counts were fixed at the corner or centre of the 5 × 5 grid over 1000 permutations, the distributions were clearly different (Fig. 4). Following the principle of randomisation tests for statistical significance, the average index should be close to 1·0. This is, however, not the case for *v*_{i} (*v*_{j}), for which there was considerable underestimation and overestimation, respectively, for the location of the corner and the centre (Fig. 4). In contrast, the average of *c*_{i} (*c*_{j}) is 1·0 by definition.

These results suggest the incomplete removal of the influence of sampling position and count size on the estimation of the original indices, with the bias varying with counts size and sampling position (Xu & Madden 2005). The new index has corrected this bias, which also allows testing for statistical significance of individual indices.

#### Relationship of the significance level of *c*_{i} (*c*_{j}) with corresponding *v*_{i} (*v*_{j})

The greater the value of *c*_{i} (|*c*_{j}|), the more likely it was to achieve statistical significance, as expected (Fig. 5b,d and f). However, there were large variations in the threshold values of the corresponding *v*_{i} (*v*_{j}) and *c*_{i} (|*c*_{j}|) for achieving statistical significance for all simulated grid data (Fig. 5). The average values of both original and new indices for achieving 5% significance (based on the empirical distribution of *c*_{i} [*c*_{j}] values) increased slightly with increasing grid size. For the 20 × 20 grid data, the threshold *v*_{i} and *v*_{j} values for achieving 5% significance ranged, respectively, from 1·37 to 3·11 (average = 2·06) and from −1·44 to −3·06 (average = −2·07; Fig. 5e); corresponding values for *c*_{i} (|*c*_{j}|) were 1·55 to 2·95 (average = 2·20) and −1·50 to −2·86 (average = −2·09; Fig. 5f). These results demonstrate the inadequacy of the current SADIE approach where a single threshold value of 1·5 (−1·5) is used to determine the ‘significance’ of the index (i.e. to determine whether the individual index is substantial).

The extent of inadequacy of defining ±1·5 as a threshold value for clustering indices can be illustrated by the empirical distributions in Fig. 4. For the maximum counts of 78, *v*_{i} was >1·5 in 189 of 1000 simulations (*P *=* *0·189) for the corner position and in 15 of 1000 simulations (*P *=* *0·015) for the centre position. Similarly, for the minimum counts of 15, *v*_{j} was <−1·5 in 135 simulations (*P *=* *0·135) for the corner and in 1 simulation (*P *=* *0·001) for the centre position. Thus, interpretation of original indices in terms of gaps and patches based on the threshold of ±1·5 may lead to misleading conclusions.

As with the simulated data sets, the original indices overestimated the patch and gap sizes relative to those identified by the new indices for the three example data sets (Fig. 6). In this context, a patch is a set of contiguous points (locations) with original indices all ≥1·5, or new indices all individually significant (*P *≤* *0·05) based on the randomisations. Likewise, a gap is a set of contiguous points with original indices all ≤−1·5, or new indices all individually significant. For the first data set of aphid counts (Perry *et al.* 1999), only 3 points with significant *c*_{i} values and 8 points with significant *c*_{j} values were found; in contrast, 5 points with *v*_{i} values >1·5 and 16 points with *v*_{j} values <−1·5 were found (Fig. 6b,c). For this data set, the number of gaps was reduced from two to one using the new index. For the other two data sets, the number of clear patches or gaps did not change but the (spatial) size of each gap/patch reduced considerably for the new indices in comparison with the original ones (Fig. 6e,f,h and i). In addition, Fig. 6 demonstrates that comparing the magnitude of indices alone (without considering their corresponding *P*-values) may be misleading regarding the relative contribution of points to the gaps/patches. Thus, a contour plot of the *P*-values for *c*_{i} and *c*_{j} can present a clearer illustration of the observed spatial pattern than for *c*_{i} and *c*_{j} values.

#### Relationship of the global significance level with individual *c*_{i} (*c*_{j}) significance levels

The nature (clumped, random or underdispersed) of the overall pattern is characterised by *I*_{a} and the global (field scale) *P*-value. On the other hand, individual *c*_{i} and *c*_{j} values, together with their associated *P*-values, provide explanations at a lower scale for the overall pattern when the global *P*-value indicates significant deviation from a random pattern. For a global random pattern, we expect that, on average, 10 donor units (for the 20 × 20 quadrat data) would have *c*_{i} values significantly greater than expected (individual *P* ≤ 0·05), and 10 receiver units would have *c*_{j} values significantly less than expected (also at individual *P* ≤ 0·05). This expectation is consistent with the results obtained for the permuted data sets (Fig. 7a): the number of significant *c*_{i} values initially increased gradually with decreasing global *P*-value and then steeply when the global *P*-value was ≤ 0·1 (Fig. 7a). Similar results were obtained for *c*_{j} and for other grid sizes except 3 × 3, where none of the new clustering indices was statistically significant. In analysing spatial patterns, we are rarely interested in just individual sites but rather in patches and gaps. It is expected that a clumped pattern is associated with not only the total number of donor units with significant *c*_{i} values but also the size of a contiguous area of these points, as shown in Fig. 7b. Similar results were found for receiver units (data not shown).
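As a rough check of the expected count of 10 quoted above, assume that about half of the 400 sampling units in the 20 × 20 grid are donors (plausible for the symmetric beta-binomial with *α* = *β*, but an assumption rather than something stated in the text) and that each donor is tested at the 5% level:

$$E[\text{number of significant } c_i] \approx 400 \times \tfrac{1}{2} \times 0.05 = 10$$

The same arithmetic applies to the receiver units and *c*_{j}.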

The *P*-values of individual *c*_{i} and *c*_{j} provide a way to accurately estimate the sizes of patch or gap areas when a global test indicates significant deviation from a random pattern (as demonstrated in Fig. 6). The *c*_{i} and *c*_{j} values for patterns at a lower scale may also be used in conjunction with global tests other than just the SADIE *I*_{a}-based test. For example, a recently developed method that is based on the density map was shown to be less sensitive to edge (sampling position) effects and to detect nonrandom pattern for low counts more frequently than the SADIE methodology (Lavigne *et al.* 2010); this result is partly achieved through use of an additional parameter in the statistical model (smoothing bandwidth, *h*). However, their method does not provide measures for local clustering, and thus *c*_{i} and *c*_{j} may be used in conjunction with this density map approach. It is interesting to note that we would not expect the gap in the top left corner in the codling moth data set (Fig. 6h) from the contour map produced by the density map approach [see fig. 4 in Lavigne *et al.* (2010)].

### Conclusions

We derived and evaluated a new local clustering index to be used within the SADIE methodology. The new index (*c*_{i}, *c*_{j}) does not separate the location from the count in assessing the distance to regularity in a modified randomisation algorithm, and unlike with the currently defined clustering index (Perry *et al.* 1999), it eliminates the influence of sampling position and count size on its calculation. More importantly, we have provided a formal statistical test to determine the significance level of the new index for each location based on randomisations; these significance tests should be used when a global test for spatial pattern shows significance. A single threshold value of ±1·5 proposed previously for determining the importance of original clustering indices is likely to lead to misleading conclusions on the patch/gap and their sizes. The new index is complementary to other methods for testing the global pattern to provide explanation of the pattern at a local scale.

### Acknowledgements

This project is funded by the China Natural Science Foundation (Project Number: 30928016).

### References

- Barbeito *et al.* (2009) Response of pine natural regeneration to small-scale spatial variation in a managed Mediterranean mountain forest. Applied Vegetation Science, 12, 488–503.
- Gutierrez-Giron & Gavilan (2010) Spatial patterns and interspecific relations analysis help to better understand species distribution patterns in a Mediterranean high mountain grassland. Plant Ecology, 210, 137–151.
- Holland, Winder & Perry (1999) Arthropod prey of farmland birds: their spatial distribution within a sprayed field with and without buffer zones. Aspects of Applied Biology, 54, 53–60.
- Lavigne *et al.* (2010) Spatial analyses of ecological count data: a density map comparison approach. Basic and Applied Ecology, 11, 734–742.
- Madden, Hughes & van den Bosch (2007) The Study of Plant Disease Epidemics. The American Phytopathological Society, St. Paul, Minnesota, USA.
- Maestre *et al.* (2003) Small-scale environmental heterogeneity and spatiotemporal dynamics of seedling establishment in a semiarid degraded ecosystem. Ecosystems, 6, 630–643.
- Maltez-Mouro, Maestre & Freitas (2010) Weak effects of the exotic invasive Carpobrotus edulis on the structure and composition of Portuguese sand-dune communities. Biological Invasions, 12, 2117–2130.
- McGraw & Koppenhofer (2010) Spatial distribution of colonizing *Listronotus maculicollis* populations: implications for targeted management and host preference. Journal of Applied Entomology, 134, 275–284.
- Oerke *et al.* (2010) Spatial variability of fusarium head blight pathogens and associated mycotoxins in wheat crops. Plant Pathology, 59, 671–682.
- Oveisi, Yousefi & Gonzalez-Andujar (2010) Spatial distribution and temporal stability of crenate broomrape (*Orobanche crenata* Forsk) in faba bean (*Vicia faba* L.): a long-term study at two localities. Crop Protection, 29, 717–720.
- Payne (2006) The Guide to GenStat® Release 9 – Part 2: Statistics. VSN International, Hemel Hempstead, UK.
- Perry (1995) Spatial-analysis by distance indexes. Journal of Animal Ecology, 64, 303–314.
- Perry (1998) Measures of spatial pattern for counts. Ecology, 79, 1008–1017.
- Perry & Dixon (2002) A new method to measure spatial association for ecological count data. Ecoscience, 9, 133–141.
- Perry *et al.* (1999) Red-blue plots for detecting clusters in count data. Ecology Letters, 2, 106–113.
- Rodriguez *et al.* (2009) Wildfire changes the spatial pattern of soil nutrient availability in *Pinus canariensis* forests. Annals of Forest Science, 66, DOI: 10.1051/forest/2008092.
- Sciarretta *et al.* (2010) Spatial clustering and associations of two savannah tsetse species, *Glossina morsitans submorsitans* and *Glossina pallidipes* (Diptera: Glossinidae), for guiding interventions in an adaptive cattle health management framework. Bulletin of Entomological Research, 100, 661–670.
- Spiridonov, Moens & Wilson (2007) Fine scale spatial distributions of two entomopathogenic nematodes in a grassland soil. Applied Soil Ecology, 37, 192–201.
- Tillman *et al.* (2009) Spatiotemporal patterns and dispersal of stink bugs (Heteroptera: Pentatomidae) in peanut-cotton farmscapes. Environmental Entomology, 38, 1038–1052.
- Turechek & Madden (1999) Spatial pattern analysis of strawberry leaf blight in perennial production systems. Phytopathology, 89, 421–433.
- Wehncke, Medellin & Ezcurra (2009) Patterns of frugivory, seed dispersal and predation of blue fan palms (*Brahea armata*) in oases of northern Baja California. Journal of Arid Environments, 73, 773–783.
- Wiegand & Moloney (2004) Rings, circles, and null models for point pattern analysis in ecology. Oikos, 104, 209–229.
- Winder *et al.* (2001) Modelling the dynamic spatio-temporal response of predators to transient prey patches in the field. Ecology Letters, 4, 568–576.
- Xu & Madden (2003) Considerations for the use of SADIE statistics to quantify spatial patterns. Ecography, 26, 821–830.
- Xu & Madden (2004) Use of SADIE statistics to study spatial dynamics of plant disease epidemics. Plant Pathology, 53, 38–49.
- Xu & Madden (2005) Interrelationships among SADIE indices for characterizing spatial patterns of organisms. Phytopathology, 95, 874–883.
- Yankelevich *et al.* (2006) Spatial patchiness of litter, nutrients and macroinvertebrates during secondary succession in a Tropical Montane Cloud Forest in Mexico. Plant and Soil, 286, 123–139.