Spatial autocorrelation as a tool for identifying the geographical patterns of aphid annual abundance


  • Nadège Cocu,

    1. Département de Géographie, Université Catholique de Louvain, Place Louis Pasteur 3, 1348 Louvain-la-Neuve, Belgium, * Division of Plant and Invertebrate Ecology, Rothamsted Research, Harpenden, Hertfordshire AL5 2JQ, U.K. and Biologie des Organismes et des Populations Appliquée à la Protection des Plantes, Institut National de la Recherche Agronomique (INRA), Centre de Recherches de Rennes, Domaine de la Motte au Vicomte, B.P. 29, 35650 Le Rheu, France
  • Richard Harrington,

  • Maurice Hullé,

  • and Mark D. A. Rounsevell

    Corresponding author

Mark Rounsevell. Tel: +32 (0) 10 472872; fax: +32 (0) 10 472877; e-mail:


Abstract

1 A spatial autocorrelation analysis was undertaken to investigate the spatial structure of annual abundance for the pest aphid Myzus persicae collected in suction traps distributed across north-west Europe.

2 The analysis was applied at two different scales. The Moran index was used to estimate the degree of spatial autocorrelation at all sites within the study area (global level). The contributions of each site to the global index were identified by the use of a local indicator of spatial autocorrelation (LISA). A hierarchical cluster analysis was undertaken to highlight differences between groups of resulting correlograms.

3 Similarity between traps was shown to occur over large geographical distances, suggesting an impact of phenomena such as climatic gradients or land use types.

4 Outliers, zones of similarity (hot-spots) and zones of dissimilarity (cold-spots) were identified, indicating a strong impact of local effects.

5 Several groups of traps characterized by similarities in their local spatial structure (correlograms, value of Moran's Ii) also had similar values for land use variables (the area occupied by agricultural zones, forest and sea).

6 It is concluded that trap data can provide information about Myzus persicae that is representative of large geographical areas. Thus, trap data can be used to estimate the aerial abundance of this species, even if the suction traps are not regularly and densely distributed.


Introduction

The peach-potato aphid Myzus persicae (Sulzer) is a major pest of potatoes and sugar beet in north-west Europe. It may produce winter eggs on Prunus persica or, where winters are warm enough, continue parthenogenetic reproduction year-round on a range of crop or weed species, particularly among the Brassicaceae and Asteraceae (Blackman & Eastop, 2000). Therefore, it appears likely that the geographical distribution of M. persicae in Western Europe is influenced by climatic factors and by the availability of host plants. However, the combined effect of these factors on M. persicae distributions is difficult to determine, and this difficulty is accentuated by the polyphagous nature of this species.

Ecologists examine the spatial patterns of species to understand the mechanisms that control their distribution (Legendre & Legendre, 1998). Therefore, an understanding of the spatial distribution of insects has important implications for planning pest monitoring programmes (Taylor, 1986; Sharov et al., 1996), for predicting population density at unsampled locations (Liebhold et al., 1991), for improving pest management strategies and for understanding ecological relationships with different environmental factors (Quinn et al., 1991; Harrington et al., 1995).

Previous studies have attempted to describe spatial patterns of organisms with the use of variance-mean methods [s2/m, ICS, ICF, Lloyd's patchiness and crowding indices, Morisita's index (Iδ), and the coefficient of Taylor's Power Law] (Upton & Fingleton, 1985), but these indices have focused on the sample count variance and ignored the spatial location of samples (Liebhold et al., 1991). Moreover, these approaches are based on the assumption that sample values are not correlated, and thus the distance between the samples is ignored and spatial patterns are not identified (Sharov et al., 1996). Therefore, these techniques do not appear to be appropriate for ecological studies where the presence of spatial correlation between samples (i.e. the presence of correlation between values sampled at different points in space) is the norm rather than the exception. Many authors argue that all spatial data fulfil the generalization that values from samples near to one another tend to be more similar than those that are further apart (Liebhold & Sharov, 1998; Koening, 1999; Epperson, 2000). This tendency is termed spatial autocorrelation (SA) (Cliff & Ord, 1973). Consequently, there has been an increasing interest in the use of variograms, covariance functions and correlograms for describing spatial patterns, measuring SA and how its strength varies with distance (Liebhold et al., 1991). These methods, collectively known as geostatistics, provide information about the range (distances) and strength of spatial correlation amongst samples (Sharov et al., 1996). Such techniques can be used whenever a sample value is expected to be influenced by its position in space and its relationship to its neighbours, and this is the case for suction trap data, which monitor the aerial density of airborne insects (Woiwod, 1982). 
The use of geostatistical methods for quantifying spatial patterns in insect counts is relatively new and there are only a few published applications (Sokal et al., 1987; Schotzko & O'Keefe, 1989; Liebhold et al., 1991, 1993; Schotzko & Knudsen, 1992).

For aphids, Hullé & Gamon (1990) highlighted the resemblance of Belgian and French suction trap data in space and time, implying the existence of spatial structure in aphid data and indicating that a single trap was representative of a large area (a circle of 100 km radius being quoted as a reference in areas of similar topography). Moreover, previous studies by Taylor (1979) found a very strong correlation between paired aphid samples in two traps, 81 km apart. In two traps, 389 km apart, there was still a significant positive correlation. Quinn et al. (1991) suggest that similar spatial structure can result from shared environmental characteristics. Knowing how and where insects are caught has important implications for pest management and prevention strategies. This knowledge can also assist the siting of traps to maximize the quality of risk assessment. Moreover, exploratory research of this type can also provide insight into the processes that influence species distributions at different spatial scales.

Although several methods have been used to deal simultaneously with the location and the variable values of ecological data, none of these techniques is as generally applicable to the wide range of data collected by ecologists as the weighted form of Moran's I (Jumars et al., 1977; for a complete review, see also Dale et al., 2002 and Perry et al., 2002). Variables may be nominal, ordinal or interval, and points can be regularly or irregularly distributed (Sokal & Oden, 1978; Sokal, 1978), which is important because trap coverage is never ideal for practical reasons. Traps can rarely be placed at regular intervals and the position and number of traps usually changes with time (Woiwod, 1982).

Although ecologists and biologists can formulate assumptions regarding the processes that generate a spatial phenomenon (Legendre & Legendre, 1998), exploratory approaches can contribute to the development of further hypotheses. The analysis reported here consists of an investigation of the presence or otherwise of global spatial structure in aphid data. If a significant spatial pattern is identified, then subsequent analyses could explore the potential underlying factors. Within this framework, the present study was undertaken to quantify the spatial distribution of M. persicae using SA analysis (Moran's I) at a global and a local scale. Two hypotheses were tested: (i) that there is an optimum neighbourhood size between traps (in terms of similarity between sites) and (ii) that this size changes because of local landscape effects. Therefore, an attempt was made to relate spatial structures to environmental factors such as land use. This exploratory analysis aimed to characterize the global and local spatial structure of population data and its stability in time, to quantify spatial dependency with distance lag between traps and to provide insights regarding the mechanisms that may have initiated the spatial phenomenon. Thus, the analysis focused more on the spatial patterns that exist within the data rather than on the biological processes that underpin these patterns. The identification of areas where aphid populations are relatively homogenous should provide clues as to the possible biogeographical characteristics that influence the observed patterns in the data. Subsequently, this could contribute to further studies of population biology and behaviour. The present study also aimed to identify and develop hypotheses that could be tested by other types of analysis.

Materials and methods

Aphid data

Aphid data for the U.K., France and Belgium (34 sites) were obtained from the EXAMINE database, which covers the entire European network of aphid suction traps and comprises daily data for 29 principal pest aphid species (Harrington et al., 2004). North-west Europe was chosen as a study area because it has the greatest trap density.

Suction traps in this network collect the aerial aphid population in a known air volume (2700 m³/h) and at a height of 12.2 m above ground (Macaulay et al., 1988; Bouchery, 1990).

For the analysis presented here, the variable log10(n + 1) was used with n being the annual total of M. persicae caught at each trap over the years 1989–2001. Annual total is useful as an indicator of broad spatial structure, but masks phenological differences.

Land cover data

An analysis of the relationship between land cover and aphid annual totals was based on the CORINE Land Cover (CLC) database (Communautés Européennes Commission, 1993). The CLC database consists of a European geographical map with a minimum mapping element of 25 ha, which is extracted from satellite data. As a consequence of image data availability, the date of image acquisition varies between countries over a period of 10 years. For example, the Belgian and the U.K. CLC databases were based on data for 1989 and 1990 whereas the French CORINE Land Cover project was carried out using images from 1987 to 1992. Ancillary information was used in combination to refine interpretation and the assignment of the territory into classes described below.

CLC describes land cover (and partly land use) according to a nomenclature of 44 classes that are organized hierarchically in three levels covering the agricultural, urban and natural sectors. The areas occupied by the different land cover classes were extracted for a circle of radius 50 km centred on each trap. To ensure that this circle lies within the trap's area of representation, its radius was chosen to be smaller than the reference values quoted in the literature [i.e. a radius of 80 km is suggested for the British network (Taylor, 1979) and a radius of 100 km for the French Agraphid network (Hullé & Gamon, 1990)].

Only three land cover types were considered: (i) the agricultural area (classes 2.1 arable land, 2.2 permanent crops, 2.3 pastures and 2.4 heterogeneous agricultural areas) that might represent favourable habitats for M. persicae via the presence of host plants; (ii) the forested zone (class 3.1 forests) and (iii) the area occupied by sea (calculated as the difference between the circle area and the surface occupied by all the land cover classes) to represent habitats where M. persicae is unlikely to be found. The use of these three land cover types only allows a coarse characterization of the landscape, which has not changed much during recent decades. However, a small reduction in agricultural areas due to urban expansion could have occurred between the 1990s and the 2000s, as already quantified in Ireland and Germany (Keil et al., 2003), where the update of the CORINE Land Cover database is available for the year 2000.
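The sea area described above is a simple difference, which can be sketched as follows. This is an illustration only: the function name `sea_area_km2` and the land cover areas are hypothetical and are not values from the CLC database.

```python
import math

def sea_area_km2(land_areas_km2, radius_km=50.0):
    """Sea area = area of the circle around the trap minus the summed
    area of all land cover classes found inside that circle."""
    circle_km2 = math.pi * radius_km ** 2
    return circle_km2 - sum(land_areas_km2.values())

# Hypothetical land cover areas (km^2) inside one 50 km circle.
land = {"agricultural": 4200.0, "forest": 900.0, "other land classes": 1100.0}

sea = sea_area_km2(land)
# Share of the circle occupied by agricultural land (hypothetical trap).
share_agri = land["agricultural"] / (math.pi * 50.0 ** 2)
print(round(sea, 1), round(share_agri, 3))
```

A circle of radius 50 km covers roughly 7854 km², so the three land cover types can also be expressed as proportions of that area.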

Spatial autocorrelation analysis

Distance matrix.  The first step in the analysis is to compute the Euclidean distance matrix from the x and y co-ordinates of the individual suction traps (dij is the element in the distance matrix corresponding to the distance between the observation pair i,j). The characteristics of the distance matrices indicated a minimum allowable distance cut-off of 260 km for the years 1989–2001. Beyond that distance, the matrix does not contain any rows consisting only of zero values (i.e. each trap has at least one neighbour). SpaceStat software (version 1.90) can be used to calculate Moran's I for a matrix based on a distance lower than the cut-off distance but, according to Anselin (personal communication), this changes the scale of the statistic and no comparison between years is possible. A correction can be made to compare the results at shorter distances, but this is probably not relevant because very few points have neighbours at distances lower than 260 km. Moreover, in this case, few data are available and, consequently, the calculated values of Moran's I and the resulting correlogram would be less robust and reliable. Therefore, distances less than 260 km were not analysed. This constraint could pose a problem if the autocorrelation was zero at this cut-off distance, because it would then not be possible to conclude anything about the spatial structure of the data. However, if the data show a positive or negative SA for distances greater than 260 km, then it can reasonably be assumed that the SA is even stronger at smaller distances (Tobler, 1970). Thus, identifying the distances at which SA no longer exists is more useful than establishing the presence of SA at smaller distances.
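As a minimal sketch of this step, the distance matrix and the smallest cut-off that leaves no trap without a neighbour can be computed as shown below. The coordinates are hypothetical (not the real trap network), and the helper names `distance_matrix` and `min_cutoff` are illustrative assumptions rather than functions of SpaceStat.

```python
import math

def distance_matrix(coords):
    """Euclidean distance matrix d[i][j] from a list of (x, y) pairs."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]

def min_cutoff(d):
    """Largest nearest-neighbour distance: the distance limit must reach
    this value for every site to have at least one neighbour."""
    n = len(d)
    return max(min(d[i][j] for j in range(n) if j != i) for i in range(n))

# Hypothetical trap coordinates in km.
coords = [(0, 0), (30, 40), (100, 0), (400, 0)]
d = distance_matrix(coords)
cutoff = min_cutoff(d)
print(cutoff)
```

In this toy layout the most isolated site lies 300 km from its nearest neighbour, so any shorter limit would leave it with an all-zero row.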

Spatial weight matrix.  Next, the information in the distance matrix was used to create a spatial weights matrix (wij is the element in the spatial weights matrix corresponding to the observation pair i,j). At this stage, the simplest situation is to compute binary contiguity weights by indicating an upper limit and a lower limit for the distance bands such that points are, or are not, contiguous. That is, wij = 1 for i and j 'neighbours' (i.e. with dij < critical distance) and wij = 0 otherwise. For each year, a distance matrix is created and this is transformed into spatial weights matrices for different contiguity upper limits: 260, 300, 350, 400, 450, 500, 600, 700, 800, 900 and 1000 km. There are no accepted ways of choosing these weights and, according to Jumars et al. (1977), the selection of a very general set of weights can make the test of SA weak whereas a specific set can miss the SA that might be shown by another choice of weightings.
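The construction of binary contiguity weights for a series of upper limits might look like the sketch below. The three-site distance matrix is hypothetical, and `binary_weights` is an illustrative name; the strict inequality follows the definition dij < critical distance given above.

```python
def binary_weights(d, upper, lower=0.0):
    """w[i][j] = 1 when lower <= d[i][j] < upper and i != j, else 0."""
    n = len(d)
    return [[1 if i != j and lower <= d[i][j] < upper else 0
             for j in range(n)]
            for i in range(n)]

# Hypothetical distances (km) between three traps.
d = [[0, 100, 500],
     [100, 0, 450],
     [500, 450, 0]]

# Contiguity upper limits listed in the text.
limits_km = [260, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000]
weights = {upper: binary_weights(d, upper) for upper in limits_km}
print(weights[260])
```

In this toy example the third trap has no neighbour until the upper limit exceeds 450 km, which is the situation the minimum cut-off is meant to avoid.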

Global spatial autocorrelation.  For the most part, measurements of global SA are used to estimate the presence or otherwise of SA across all of the sites within a study area: they are whole-area measurements. Global SA was quantified in this research by the Moran index (I), which is given by:

$$I = \frac{N}{S_0} \, \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}$$

with $z_i = x_i - \mu$ and $z_j = x_j - \mu$

xi and xj are observations for locations i and j (with mean μ); N is the number of observations and S0 is a scaling constant: $S_0 = \sum_i \sum_j w_{ij}$ (i.e. the sum of all weights).

The spatial weights matrix from distances was then row-standardized so that the sum of the weights for a trap was equal to one (i.e. $w^{*}_{ij} = w_{ij} / \sum_j w_{ij}$ such that $\sum_j w^{*}_{ij} = 1$). In this case, which is the preferred way to implement this test, the normalizing factor S0 equals N (because each row sums to 1) (Anselin, 1992). This allowed a comparison of the SA indices between the various neighbourhood distances because, for each of them, identical importance is attached to all the neighbours independently of their number. This also enables comparison of the results of the various years in spite of the fact that the traps do not have the same operating period. The numerator corresponds to the covariance between contiguous observations. This covariance is zero in the absence of SA, positive in the presence of positive SA, and negative in the presence of negative SA. The covariance is standardized by the denominator, which is a measure of the variance of the observations. In the calculation of the index, the mean of the observations is a reference:

  1. When I = 0: the covariance between contiguous observations is zero, the neighbourhood does not play any part; there is no SA, the observed spatial pattern of values is equally as likely as any other spatial pattern (i.e. abundance is independent of trap location);
  2. When I > 0: the places are more alike if they are contiguous; the SA is positive; similar values, either high or low values, are more spatially clustered than is expected to occur purely by chance (i.e. traps with similar locations tend also to have similar aphid abundance); and
  3. When I < 0: the places are more alike if they are far apart; the SA is negative (i.e. traps that are closer to each other have opposing aphid catches, reflecting a lack of clustering, more so than for a random pattern).
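The global index and the three cases above can be illustrated with a short sketch. It assumes the standard form of Moran's I with a weights matrix that may or may not be row-standardized; the four-site chain and the values are hypothetical, and the function names are illustrative.

```python
def row_standardize(w):
    """Divide each row by its sum so that the weights of a site sum to 1."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def morans_i(x, w):
    """Global Moran's I = (N / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(x)
    mu = sum(x) / n
    z = [v - mu for v in x]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den

# Four hypothetical sites on a line; each site neighbours the adjacent ones.
w_bin = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
w_std = row_standardize(w_bin)

gradient = [1.0, 2.0, 3.0, 4.0]      # similar values are adjacent
alternating = [1.0, 4.0, 1.0, 4.0]   # neighbours always differ

print(morans_i(gradient, w_std), morans_i(alternating, w_std))
```

The smooth gradient gives a positive index and the alternating pattern a negative one. After row-standardization S0 equals N, so the leading factor N/S0 reduces to 1.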

Local spatial autocorrelation. Anselin (1992, 1995) developed measures of SA at a local scale. These identify the possible contributions of each site to the global index. The dependence, or spatial association, between the value of the variable taken at a site and the value of the variables in its neighbourhood is then quantified. Local statistics are well suited to identifying the existence of pockets of spatial association, to assess assumptions of stationarity and to identify distances beyond which no discernible association remains (Getis & Ord, 1996). Applied to spatial datasets lacking global SA, the methods may find significant local homogeneous (hot-spots) or heterogeneous (cold-spots) areas (Sokal et al., 1998). Conversely, when faced with global SA, both Ord & Getis (1995) and Anselin (1995) recognize that the SA analysis generates a bias. Ord & Getis (1995) provide a discussion of the issue that local statistics must be interpreted according to the degree of global autocorrelation present in the data, otherwise Type I errors may occur. That is, locations are identified as hot spots simply because they lie in areas of generally high (or low) values (Ord & Getis, 2001). However, in this case, many authors recommend the use of local SA in an exploratory and indicative manner to see which localities contribute more than others to the global index (Anselin, 1995; Sokal et al., 1998; Ord & Getis, 2001).

Analysis of spatial outliers allows the identification of atypical observations. Spatial outliers are defined by Fotheringham et al. (2000) as areas having very different values for a variable in comparison with the values taken by its neighbours. This notion of ‘extremeness’ only indicates the importance of observation i in determining the global statistic (Anselin, 1995). Thus, extreme contributors may be identified by means of simple rules, such as the two-sigma rule (i.e. when values of the variable are two standard deviations away from the mean).
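A two-sigma screen of this kind can be sketched as follows. The local index values are hypothetical, and the use of the population standard deviation is an assumption, because the text does not say which estimator was applied.

```python
import statistics

def two_sigma_outliers(values):
    """Indices of values lying more than two standard deviations from the mean.
    Assumption: population standard deviation (statistics.pstdev)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) > 2 * sd]

# Hypothetical local index values for nine traps; the last one is extreme.
local_values = [0.1, 0.2, 0.15, 0.1, 0.2, 0.15, 0.1, 0.2, 3.0]
extreme = two_sigma_outliers(local_values)
print(extreme)
```

Only the position of the flagged trap is returned, so it can be looked up in a separate list of trap numbers.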

The spatial cluster analysis is used to discover local patterns of spatial association and to identify zones of similarity (hot spots) and of dissimilarity (cold spots) in aphid abundance between traps. Local spatial clusters are defined by Anselin (1995) as sets of contiguous locations for which the local indicator of spatial autocorrelation (LISA) is significant. A LISA is defined by Anselin (1995) as any statistic that satisfies the following two requirements:

  1. The LISA for each observation gives an indication of the extent of significant spatial clustering of similar values around that observation and
  2. The sum of LISAs for all observations is proportional to a global indicator of spatial association.

A local Moran (Ii) can be defined (Anselin, 1992) as:

$$I_i = z_i \sum_j w_{ij} z_j$$

where the observations zi, zj are deviations from the mean. The interpretation of the local Moran as an indicator of local instability follows from the relationship between the local and global statistics. Specifically, the mean of Ii will equal the global I up to a factor of proportionality: $\sum_i I_i = \gamma I$ with $\gamma = S_0 m_2$, where $m_2 = \sum_i z_i^2 / N$. It is also possible to standardize the local index by dividing by m2. When:

  1. Ii = 0: there is no SA;
  2. Ii > 0: the SA is positive [i.e. it conveys the presence of an association of values similar to the place i where the index is measured: traps in the neighbourhood of trap i have similar catches whether these are high or low (hot spots)] and:
  3. Ii < 0: the SA is negative, corresponding to an association of values that are opposed to the place i where the index is measured: surrounding traps have different catches than trap i (cold spots).
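The local index can be sketched in the same way. The example assumes the unstandardized form of Anselin's local Moran (the deviation at site i multiplied by the weighted sum of its neighbours' deviations) with optional division by m2; the data and the function name `local_moran` are hypothetical.

```python
def local_moran(x, w, standardize=False):
    """Local Moran I_i = z_i * sum_j w_ij z_j, optionally divided by m2."""
    n = len(x)
    mu = sum(x) / n
    z = [v - mu for v in x]
    m2 = sum(v * v for v in z) / n
    li = [z[i] * sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]
    if standardize:
        li = [v / m2 for v in li]
    return li

# Row-standardized weights for four hypothetical sites on a line.
w = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

li = local_moran(x, w)
li_std = local_moran(x, w, standardize=True)
print(li, sum(li_std) / len(li_std))
```

With row-standardized weights the mean of the standardized local values equals the global index (0.4 for these data), which is the proportionality property required of a LISA.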

Correlogram.  An exploratory and systematic analysis of local and global Moran's I indices was carried out on each spatial weights matrix. A useful aid in interpreting the autocorrelation coefficients is the correlogram, which is a graphical display of the Moran index plotted against the distance lag. The shape of this curve provides supplementary information. The correlogram usually takes the shape of a decreasing curve, and the distance beyond which there is no more SA can be identified (i.e. where the correlogram stabilizes around a value close to zero).
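A correlogram of this type can be sketched by recomputing the global index for each contiguity upper limit. The six sites, their values and the limits below are hypothetical; the weights are binary (dij < limit) and then row-standardized, and sites without neighbours are simply given zero weights, which is an assumption of this sketch.

```python
import math

def morans_i_for_limit(coords, x, upper):
    """Global Moran's I with row-standardized binary weights (d_ij < upper)."""
    n = len(x)
    mu = sum(x) / n
    z = [v - mu for v in x]
    num = 0.0
    s0 = 0.0
    for i in range(n):
        nbrs = [j for j in range(n)
                if j != i and math.dist(coords[i], coords[j]) < upper]
        if not nbrs:
            continue  # isolated site: contributes nothing
        num += z[i] * sum(z[j] for j in nbrs) / len(nbrs)
        s0 += 1.0  # each standardized row sums to 1
    return (n / s0) * num / sum(v * v for v in z)

# Six hypothetical sites 100 km apart along a line, with a smooth gradient.
coords = [(100.0 * k, 0.0) for k in range(6)]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
limits_km = [150, 250, 550]

correlogram = [(u, morans_i_for_limit(coords, x, u)) for u in limits_km]
for upper, value in correlogram:
    print(upper, round(value, 3))
```

For these data the index falls as the limit grows; once every site neighbours every other site it reaches −1/(N − 1), the value expected in the absence of SA.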

Statistical inference.  An advantage of this technique is that the null hypothesis of no SA can be tested (i.e. it is possible to quantify the probability that a spatial pattern as extreme as that observed could have appeared by chance). This can be tested empirically by Monte-Carlo permutation: the n attribute values are randomly reassigned to the n spatial units many times (999 times in this case) and the Moran index I* is calculated each time, yielding a reference distribution against which the observed I is compared. The proportion of permutations for which I* is at least as large as I estimates the probability that a value as high as the observed I can appear by chance. A confidence interval around I is also given for a fixed level of confidence α (Fotheringham et al., 2000).
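The permutation procedure can be sketched as below. The pseudo p-value (count + 1)/(permutations + 1) is a common convention and is an assumption here, as are the seed, the eight-site chain and the function names; the text does not specify how SpaceStat reports the probability.

```python
import random

def morans_i(x, w):
    """Global Moran's I for values x and weights matrix w."""
    n = len(x)
    mu = sum(x) / n
    z = [v - mu for v in x]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * num / sum(v * v for v in z)

def permutation_test(x, w, n_perm=999, seed=0):
    """Return (observed I, pseudo p-value) from random reassignment of values."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    values = list(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        if morans_i(values, w) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Eight hypothetical sites on a line; adjacent sites are neighbours
# and each row of the weights matrix is standardized to sum to 1.
n = 8
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
    for j in nbrs:
        w[i][j] = 1.0 / len(nbrs)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
observed, p_value = permutation_test(x, w)
print(round(observed, 3), p_value)
```

Because the observed arrangement is counted as one member of the reference set, the smallest attainable p-value with 999 permutations is 0.001.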

Hierarchical cluster analysis

A hierarchical cluster analysis was used to explore the role of site characteristics in explaining aphid data. This statistical procedure combines observations into groups or clusters. Relatively homogeneous groups can be identified based on selected characteristics (e.g. land use type). The technique is based on an algorithm that starts with each trap in a separate cluster and then combines clusters sequentially until only one remains (Webster & Oliver, 1990). The analysis was performed using SPSS (version 10.1; SPSS Inc., Chicago, Illinois) statistical software, using the between-group linkage clustering method with the squared Euclidean distance as a measure of the relationship between the individuals. This method seeks to identify a set of groups that both minimizes within-group variation and maximizes between-group variation. Other techniques, such as single linkage grouping, within-group linkage and Ward's method, were also tested. The different clustering techniques gave similar results, which increased confidence in the results. Only the results of the between-group linkage method are presented in a dendrogram, which is a tree diagram used to represent the steps in hierarchical clustering. It indicates how the clusters are combined and the values of the distance coefficients at each step: connected vertical lines designate joined traps.
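The agglomerative procedure can be sketched in a few lines. This is not the SPSS implementation: it assumes that between-group linkage corresponds to average linkage (the mean of all pairwise squared Euclidean distances between two clusters), and the four land cover profiles are hypothetical.

```python
from itertools import combinations

def sq_euclid(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def average_linkage(points):
    """Merge clusters step by step; return (cluster_a, cluster_b, distance)."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            dist = sum(sq_euclid(points[i], points[j]) for i, j in pairs) / len(pairs)
            if best is None or dist < best[0]:
                best = (dist, a, b)
        dist, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical proportions (agricultural, forest, sea) around four traps.
profiles = [(0.80, 0.10, 0.00),
            (0.70, 0.20, 0.00),
            (0.30, 0.10, 0.50),
            (0.32, 0.08, 0.50)]

merges = average_linkage(profiles)
for left, right, dist in merges:
    print(left, right, round(dist, 4))
```

The sequence of merges and their distance coefficients is the information that a dendrogram displays; the two coastal profiles join first, then the two inland ones, and the two pairs join last.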


Results

General spatial trends

The first step of the analysis consists of observing the spatial distributions within the data for each year to explore the structure of the data and to visualize the general spatial trends. It is expected that these spatial trends will also be highlighted by the SA analysis. Figure 1 shows maps of the spatial distribution of M. persicae annual abundance. To facilitate the observation of the values taken by the variable, the data were interpolated in ArcView 3.2 (Environmental Systems Research Institute Inc., U.S.A.) using the inverse distance weighting method and a nearest neighbours approach. These maps suggest the existence of spatial trends over broad areas in the data, oriented from south-east to north-west.
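Inverse distance weighting with a nearest neighbours search can be sketched as follows. The number of neighbours `k` and the exponent `power` are assumptions, because the ArcView settings are not reported, and the sites and values are hypothetical.

```python
import math

def idw(point, sites, values, k=3, power=2):
    """Inverse-distance-weighted estimate from the k nearest sites."""
    nearest = sorted((math.dist(point, s), v) for s, v in zip(sites, values))[:k]
    if nearest[0][0] == 0:
        return nearest[0][1]  # the point coincides with a sampled site
    weights = [1.0 / dist ** power for dist, _ in nearest]
    total = sum(wt * v for wt, (_, v) in zip(weights, nearest))
    return total / sum(weights)

# Hypothetical trap positions (km) and log10(n + 1) values.
sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
values = [1.0, 3.0, 5.0]

at_site = idw((0.0, 0.0), sites, values)
midway = idw((5.0, 0.0), sites, values, k=2)
print(at_site, midway)
```

The estimate reproduces the observed value at a trap, and halfway between two traps it is the plain average of their values when only those two are used.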

Figure 1.

The distribution of the variable log10(n + 1) annual total Myzus persicae.

It is important to note that the year 1989 appears to be different, with very high values for the variable log10(n + 1). Furthermore, the trap at Libramont, Belgium (56) is characterized by consistently low values, except in 1991 and from 1998 to 2000, suggesting that it has characteristics distinguishing it from the other traps.

In general, the lowest numbers of M. persicae were trapped in the north and west of the U.K. and in Brittany. These traps are under oceanic influence, which is likely to affect aphid numbers. Conversely, the traps that are characterized more often by larger values of the abundance variable are mainly located in the east of England [Broom's Barn (4), Rothamsted (22), Silwood (24), Writtle (27) and Wye (28)] and in the centre and south of France [Auxerre (60), Orléans (67), Valence (72) and Versailles (73)]. These are important areas where hosts of M. persicae occur: secondary hosts, including crops, on which parthenogenesis can occur, are ubiquitous but particularly abundant in the northern regions, and the primary host, Prunus persica, on which the egg is laid, is particularly abundant in the south of France. The observed spatial trends in the data might therefore be due to broad-extent processes that have a climatic or landscape origin.

Global Moran

The years 1989, 2000 and 2001 show no SA and values are rarely significant (Fig. 2). Otherwise, the correlograms are very similar. The correlograms have been averaged over the replicate years to achieve consistency. The main conclusion is that Moran's I decreases with distance, but the index is still quite high (> 0.2) over a wide range of distances. The curve levels off at approximately 700 km. Beyond that distance, the value of the SA index tends towards zero.

Figure 2.

The global correlograms: the Moran Index calculated for each investigated distance. P < 0.05 is indicated by the symbol: □ and P > 0.05 by the symbol ◆.

Local Moran

Following the results of the global analysis, the local Moran test was limited to the distance range: 260 to 500 km.

Outliers.  For the whole of the considered period, the traps that contributed more than their expected share to the global statistics were traps at Valence (72; seven times), Montpellier (66; six times), Libramont (56; five times), Elgin (50; four times) and Ayr (47; four times) (Table 1). These traps, which occur at the geographical limits of the study area, represent local exceptions to general trends.

Table 1.  Extreme contributors (trap number) identified for each year and for each investigated range of distances. The traps Ayr (47), Elgin (50), Libramont (56), Montpellier (66) and Valence (72) are the most represented over the period 1989–2001
Distance (km)  2001  2000  1999  1998  1997  1996  1995  1994  1993  1992  1991  1990  1989
2604866 66 1442475060604762
  72 72 602250 66 5056
       27  72   
  727272664727 7256725056
  48  725671      
3504866 6266554506647604756
  72 6672 22 72566650 
  48 72  27      
  72 7272 27 7256665056
  48      50  56 
    72705622 7256665058
     72 56 50  56 
  48 7272   50 725650

Clusters.  Results for the years 1989, 2000 and 2001 are presented (Table 2) because, in spite of the absence of global SA, local instability was identified. Because the current analysis is exploratory, the level of confidence for testing the significance can be low and α was set to 0.05. Therefore, only Dundee (48) is significant for at least one distance for each of these years and the clusters are mainly cold spots. Some nodes of autocorrelation have been identified, in particular for the year 2001 where the clusters are essentially located in the centre of the U.K. Conversely for the year 1989, the cold-spots are mainly located in Scotland and in Belgium. Zones of heterogeneity and homogeneity are thus highlighted, even if the aphid data are not structured at a global scale. Therefore, the calculation of local indices (Table 2) complements the information given by the overall distribution of aphid abundance (Fig. 1) by showing the local spatial variability around each trap. For example in 2000, Montpellier (66) and Dundee (48) are characterized by high levels of aphid abundance (Fig. 1), but Montpellier is within a homogeneous area (positive Ii) whereas the area surrounding Dundee is heterogeneous (negative Ii). Direct observation of the data makes it possible to identify (visually) traps that differ most from the others, but the local SA adds a criterion of homogeneity or heterogeneity to this information.

Table 2.  Presence of pockets of local spatial autocorrelation (Ii) in the absence of global spatial autocorrelation
YearDistance (km)62145522244827185666499644570
* P < 0.05; ** P < 0.01.

 300−0.54**−0.05 0.02−0.15−1.75*0.14    −0.23*0.16  −0,09
 350−0.30−0.05−0.060.02−0.10−1.75*     −0.17   −0.09
 400−0.17−0.02  −0.07−1.75* −0.30*       −0,09
 450−0.21    −1.06−0.07−0.30*−0.53    −0.29 −0.04
 500 0.08  −0.02 −1.04 −0.30*−0.35   −0.05−0.42* −0.05
2000260 −0.121.11 −0.58−1.29*  0.38*1.780.020.02    
 300 −0.08  −0.38−1.28*  0.38*1.78*0.020.02    
 350 −0.08  −0.25−1.28*  0.221.780.02  −0.29  
 400 −0.05   −1.28* −*0.01  −0.36  
 450     −1.01−0.13  0.590.01  −0.36 −0.61*
 500   0.02 −0.84−0.13  0.59 0.01 0.05−0.33 −0.32
1989260−0.77   −0.59−0.37*   −0.50−0.16   −0.37*−0.04
 300    −0.40−0.31*   −0.50−0.16*   −0.37* 
 350    −0.36−0.31*  −1.21*−0.50−0.16*   −0.37* 
 400    −0.40−0.31*  −1.21*−0.50−0.11   −0.25 
 450   0.05−0.41−0.21  −1.17** −0.08   −0.19 
500     −0.16  −0.80*−0.39  0.03*    

An average local Moran was calculated for each contiguity distance over the replicate years to achieve more consistency and to allow generalization (Table 3). In general, the local index is at its maximum for the smallest investigated distance (260 km). This is in agreement with the observations at the global level. The contiguity distance at which the local index reaches its maximum is the distance that optimizes the resemblance between the total annual catches at a given trap and those of its neighbours (i.e. it indicates the radius of representation of the trap i for which Ii is calculated). For some traps, the shortest distance is not the optimal one, meaning that there is a strong impact of local effects [e.g. Hereford (9), Long Ashton (14), Newcastle (15), Rothamsted (22), Libramont (56), Auxerre (60) and Reims (71)] (Table 3). Environmental factors, such as gradients of land use, may have an influence.

Table 3.  Mean of the significant local index (mean Ii *) over the period 1989–2001, calculated for each contiguity distance and for each trap. Only the traps presenting at least one significant Ii (P < 0.05) across the range of distances are shown
Distance (km)14224827181165566649509471526284716025

The average correlograms for the years 1989–2001 (Fig. 3) show some groups, arbitrarily defined according to similarities in the shape of the correlograms and in the value of the local Moran's Ii. Group A, comprising the traps at Libramont (56) and Auxerre (60), is characterized by low negative values of Ii, whereas Group C, comprising the traps at Ayr (47), Elgin (50) and Montpellier (66), has high positive values of Ii and a decreasing curve. The latter group is composed of traps whose nearest neighbours have very similar catches, whereas the former is characterized by traps surrounded by neighbours with heterogeneous catches. Group B [Hereford (9), Long Ashton (14) and Reims (71)] has correlograms that increase at short distances, with a peak at approximately 350 km, meaning that the closest neighbours are not the most similar with respect to aphid abundance. The other correlograms are more difficult to differentiate, although the traps at Preston (18) and Wye (28) appear to have some similarity of shape (Group D).
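The features used to compare correlograms (value at the shortest lag, position of the peak, and whether the curve decreases with distance) can be summarized programmatically. The following is only a sketch of such a summary: the groups in this study were defined by eye, and the function name `describe_correlogram` and the example values are assumptions rather than part of the analysis.

```python
def describe_correlogram(mean_ii):
    """Summarize a mean correlogram given as {distance_km: mean Ii}."""
    dists = sorted(mean_ii)
    vals = [mean_ii[d] for d in dists]
    return {
        "first_value": vals[0],                        # Ii at the shortest lag
        "peak_distance": dists[vals.index(max(vals))],
        "decreasing": all(a >= b for a, b in zip(vals, vals[1:])),
    }

# Hypothetical correlograms (not values from Fig. 3).
peaked = describe_correlogram({260: 0.1, 300: 0.3, 350: 0.5, 400: 0.2})
declining = describe_correlogram({260: 0.9, 300: 0.6, 350: 0.3})
```

The first example has the shape described for Group B (an increase at short distances with a peak at 350 km), and the second has the high positive, decreasing shape described for Group C.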

Figure 3.

Mean correlograms over the period 1989–2001 for the different traps: (4, Broom's Barn; 9, Hereford; 11, Kirton; 14, Long Ashton; 15, Newcastle; 18, Preston; 22, Rothamsted; 25, Starcross; 26, Tadcaster; 27, Writtle; 28, Wye; 47, Ayr; 48, Dundee; 49, East Craigs; 50, Elgin; 56, Libramont; 60, Auxerre; 65, Loos; 66, Montpellier; 71, Reims). Different groups can be defined according to similarities in the shape of the correlograms and in the value of the local Moran's Ii: Group A traps are characterized by low negative values of Ii; Group B is characterized by correlograms that increase at short distances with a peak at approximately 350 km; Group C shows high positive values of Ii and a decreasing curve; Group D shows a very low positive value of Ii at short distances and no SA with larger distance lags.

Hierarchical cluster analysis

Hierarchical cluster analysis allowed further investigation of the relationships between aphid data structures at a local scale and the main land use categories. The dendrogram (Fig. 4) shows that the four groups identified by correlograms (Fig. 3) can also be identified on the basis of land use variables. The distance between clusters is sufficiently large to imply good differentiation between groups, whereas the distance within clusters is small enough to indicate that the groups are homogeneous. Thus, the groups previously identified by the correlograms comprise homogeneous clusters on a land use basis (i.e. the local spatial structure, whether similarity or disparity, relates to landscape characteristics such as land use type). However, it is necessary to consider the areas occupied by the different land use types within each circle (Table 4) and the pie-charts of their mean relative proportion for each group (Fig. 5) to understand what aspects of land use lead to similar groups and thus to similar correlograms. Comparison of the two main clusters (identified in Fig. 4) and the pie-charts (Fig. 5) suggests that the proximity of traps to the sea influences the local spatial pattern in the aphid data. Traps close to the sea (Groups C and D) are characterized by a decreasing curve (i.e. local Moran Ii decreases with distance) (Fig. 3); the other traps (Groups A and B) are characterized by an absence of local spatial autocorrelation at short distance lags. The traps within Group B are characterized by a prevailing agricultural landscape whereas the traps within Group C present at least 20% of each land use category. Consequently, as distance lags increase, the landscape composition changes: less sea is likely to be present compared with arable land. Thus, as the distance lag increases, the traps in the neighbourhood of a trap in Group C might have less similar catches and it appears that traps within a neighbourhood of 350–400 km radius might be more similar to traps within Group B. On the other hand, the traps within Group A have a considerable forest component whereas the Group D traps are marked by a significant oceanic influence. It is not surprising therefore that the traps within Groups A and D show no, or only slightly negative, local SA because the landscape composition is more heterogeneous. Thus, agricultural areas, forest and sea appear to have an influence on the structure of the aphid data. This supports the assumptions posed earlier following the results of the SA analysis.
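The grouping of traps on a land use basis can be illustrated with a minimal agglomerative clustering sketch. It assumes Euclidean distance between land use proportions (agricultural area, forest, sea) and average linkage; the linkage rule and distance measure used for the dendrogram are not restated in this section, so both are assumptions, as are the function names, the trap labels T1 to T5 and the proportions.

```python
import math

def euclid(p, q):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def agglomerate(points, k):
    """Repeatedly merge the two closest clusters (average linkage)
    until k clusters remain. points maps a label to a feature tuple."""
    clusters = [[name] for name in points]

    def linkage(a, b):
        total = sum(euclid(points[p], points[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical proportions of (agricultural area, forest, sea) per trap.
land_use = {
    "T1": (0.80, 0.15, 0.05),
    "T2": (0.75, 0.20, 0.05),
    "T3": (0.30, 0.30, 0.40),
    "T4": (0.40, 0.20, 0.40),
    "T5": (0.40, 0.55, 0.05),
}
groups = agglomerate(land_use, k=3)
```

With these made-up proportions the two mainly agricultural traps form one group, the two coastal traps a second, and the forested trap remains on its own, which mirrors the kind of separation read from the dendrogram.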

Figure 4.

Dendrogram obtained from the hierarchical cluster analysis. Homogeneous groups of traps based on land use characteristics have been identified. Connected vertical lines designate joined traps. ‘Label’ identifies the traps by their EXAMINE codes; ‘Num’ identifies the number of the trap in the data file; boxes around traps represent the four groups identified by the mean correlograms (Fig. 3): traps 14, 9 and 71 (represented in Fig. 3 by a black line); traps 60 and 56 (represented in Fig. 3 by a grey line); traps 47, 66 and 50 (represented in Fig. 3 by a black dotted line); and traps 18 and 28 (represented in Fig. 3 by a black dashed line).

Table 4.  Surfaces in km² occupied by the different land use categories within each circle of radius 50 km around the traps and the mean values for the traps belonging to each group
  Surface (km²)
Group  Traps  Agricultural area  Forested zone  Sea
Figure 5.

Pie-charts representing the mean composition of the landscape for the four groups identified on the basis of their mean correlograms (Fig. 3). Proportion (%) of agricultural areas, forest zones and sea for each group (mean values for the traps belonging to each group).


Similarities in the spatial distribution of aphid populations were demonstrated by the correlograms of the global Moran index. These similarities were consistent with the hypotheses that spatial structure is present in the aphid data for Western Europe and that individual traps are correlated over distances up to approximately 700 km for M. persicae. This observation is in agreement with the conclusions of Taylor (1979) that the similarity between traps remains important over large distances, suggesting the influence of broad-extent phenomena. This observation is important because the analysis conducted here was based on north-west Europe, which has the highest density of traps. The implication is that traps in other parts of Europe that are less densely distributed can still provide information about M. persicae that is representative of large geographical areas. Thus, it appears that there are reasonable grounds to believe that trap data can be used to estimate M. persicae aerial abundance at unsampled locations across large parts of Europe.

The characteristics of the data, the results of the global analysis at a small scale and, particularly, the shape of the correlograms (i.e. a monotonically decreasing curve where nearly all SA values are significant) suggest the presence of a linear gradient in the data. Legendre (1993) indicates that there are two kinds of gradients. On the one hand, the observed gradient may be deterministic, and ‘true gradients’ can be extracted using trend-surface analysis (Legendre & Legendre, 1998). In this case, no autocorrelation is assumed in the variable of interest and the spatial structure may result from the effect of explanatory variables, which themselves exhibit a spatial structure. The spatial structure may thus be the result of dependence of the studied variable on one or several causal variables that are spatially structured (Legendre & Legendre, 1998). On the other hand, ‘false gradients’ are structures that may look like gradients, but that appear in the studied variable because the process producing the values of annual abundance is spatial and generates autocorrelation in the data (Legendre & Legendre, 1998). According to Legendre & Legendre (1998), it is difficult to determine whether the observed gradient is deterministic (‘true’) or is part of a landscape displaying autocorrelation at small spatial scales (‘false’). The results presented in this study suggest that both effects could occur and that the factors responsible for the spatial structures are primarily of climatic and landscape origin.
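A first-order trend surface of the kind referred to above can be sketched as an ordinary least-squares plane, z = a + b·x + c·y, fitted to the trap coordinates, with the residuals forming the detrended values. The function names, the planar coordinates and the abundance values below are hypothetical, and a linear surface is assumed purely for illustration; higher-order surfaces would require additional polynomial terms.

```python
def fit_plane(xs, ys, zs):
    """Least-squares coefficients (a, b, c) of z = a + b*x + c*y,
    obtained from the 3x3 normal equations. Sites must not be collinear."""
    rows = [(1.0, x, y) for x, y in zip(xs, ys)]
    # Augmented normal-equation matrix [A^T A | A^T z].
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * z for r, z in zip(rows, zs))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - s) / m[i][i]
    return tuple(beta)

def detrend(xs, ys, zs):
    """Residuals after removing the fitted linear trend surface."""
    a, b, c = fit_plane(xs, ys, zs)
    return [z - (a + b * x + c * y) for x, y, z in zip(xs, ys, zs)]

# Hypothetical coordinates (km) and abundances following an exact gradient.
xs = [0, 100, 0, 100, 50]
ys = [0, 0, 100, 100, 30]
zs = [2 + 0.5 * x - 0.25 * y for x, y in zip(xs, ys)]
residuals = detrend(xs, ys, zs)
```

If the residuals of real data still show spatial autocorrelation after such a surface is removed, the structure cannot be attributed to the fitted gradient alone, which is the distinction between the two kinds of gradient drawn above.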

The spatial dependency over large distance lags indicates the possible influence of broad-extent phenomena (e.g. autocorrelation in the habitat, dispersal or some combination of these two factors with dynamic processes such as the evolution of the main climatic gradients). Moreover, large variations in temperature and land use types have a spatial structure at the same scale as the aphid data (i.e. they are all influenced by latitudinal and longitudinal gradients). Topography could also play a significant role in the scale-dependency of these relationships. Areas of high topographic variability tend to have climates with high spatial variability, and the landscape characteristics can change rapidly over relatively short distances (Communautés Européennes Commission, 1993; Parry, 2000).

The analysis on a local scale supported these hypotheses. Values of annual abundance two standard deviations from the mean occurred at both the northern [Elgin (50) and Ayr (47)] and southern [Valence (72) and Montpellier (66)] extremes of the study area. The influence of more local landscape characteristics is also possible. Elgin and Ayr are hilly areas, characterized by cold winter temperatures and, in the latter case, few crop host plants of M. persicae are grown. Moreover, there will be high levels of winter mortality because low temperatures tend to kill the parthenogens and there are no peach trees on which the eggs can be laid. This could explain the low densities of aphids observed in these two places throughout the period considered (Fig. 1). Conversely, in France, peach is primarily produced in the valley of the Rhone and on the Mediterranean coast (Hullé et al., 1998), where the traps at Valence (72) and Montpellier (66) are located. Levels of winter mortality will also be lower in Montpellier because the winter climate is milder. These characteristics are thus more favourable for aphid development and explain the high aphid abundance observed for these traps in Fig. 1. The case of the trap at Libramont (56) is interesting because it often has unusual catches that are probably due to its geographical position. The trap at Libramont is located in an agricultural area, but on a slope, facing an extensive forested region. The samples are hence influenced by winds carrying mainly forest aphid species. Consequently, captures may not be representative of the flight activity in this region and agricultural pests such as M. persicae may be under-estimated. The local spatial autocorrelation analysis has therefore highlighted these particularities and suggests a need for caution when using these data to extrapolate to wider areas.

The methodology employed at a local scale also reveals some similarities between sites with respect to the behaviour of the local Moran Index (Ii) with distance. Some groups have been characterized by their local spatial structure (shape of the correlograms and value of Moran's Ii) and they also have been identified on the basis of land use criteria by a hierarchical cluster analysis. The area occupied by agriculture, forest and sea appears to be related to the total number of M. persicae caught in a year. Moreover, agricultural areas have greater numbers of aphids because they provide host plants.

The present study has illustrated the use of geostatistics for the spatial analysis of annual abundance of an aphid species: the three stages of this analysis provide clues as to the underlying processes that may initiate the observed spatial structure. Thus, it was shown that this structure is influenced by processes at broad and fine extent and caused by climatic or landscape characteristics or by a position effect (spatial component). Detrending the data could potentially be of use to separate these different effects. The aphid data demonstrate global spatial structure over large distances that is quite stable in time. The existence of hot-spots allows the definition of biogeographical areas around traps within which aphid populations are relatively homogeneous. The local indices could be used in association with geographical information systems to create maps that represent these hot-spots and the homogeneous biogeographical zones (Anselin & Bao, 1997). These results have implications for the study of aphid population biology and suggest lines for further investigation using the spatially and temporally extensive aphid database. For example, examination of the interactions between spatial patterns of aphid abundance and various environmental and landscape variables could provide biological explanations and further evidence in support of the hypotheses proposed here.

Other aphid variables such as phenology could also be analysed. Possible mechanisms behind spatial population synchrony, such as the Moran effect, are still a major issue in population ecology (Bjørnstad et al., 1999). Even though Moran (1953) showed that, for linear models, population synchrony would be expected to match that of the corresponding environment, few analyses have been able to establish a parallel between the structure of environmental factors and population synchrony (Goodridge, 1991; Lindström et al., 1996; Sutcliffe et al., 1996; Koenig, 1999). The methodology employed in the present study could be used for environmental data as well as aphid data.

To date, the practical applications undertaken with suction trap catches have primarily been at regional, or even local, scales, providing information to aid pest control decisions. The results presented here suggest that the suction trap data are suitable for studies over a large area and therefore can be important for understanding the likely impacts of processes such as global environmental change.


This work was supported by the European Commission funded project EXAMINE (contract N°. EVK2-CT-1999-2000) and by a grant from the Université Catholique de Louvain. The authors are very grateful to Jon Pickup (SASA, Edinburgh) and Jean-Louis Rolot (CRA, Gembloux) for providing their aphid data; Colin Denholm, Paul Verrier and Damien Maurice for managing the aphid database; Manuel Plantegenest (ENSAR, Rennes), Patrick Bogaert (UCL), Suzanne Clark, Sue Welham and Joe Perry (Rothamsted Research) for stimulating discussion; and Ian Woiwod (Rothamsted Research) and Andy Liebhold (USDA Forest Service) for reviewing a previous draft of the paper. Finally, we gratefully acknowledge Benoit Flahaut (UCL) who provided much assistance with the SpaceStat software. Rothamsted Research receives grant-aided support from the Biotechnology and Biological Sciences Research Council of the United Kingdom.