Studies were identified through a comprehensive search on Web of Science last updated in April 2011, using the search string: ‘landscape AND [agr* OR crop] AND [enem* OR predat* OR parasit* OR pest OR biological control]’. Many studies have investigated the effects of vegetative diversity or complexity at local scales (synthesised by Letourneau et al. 2011); our goal here was to synthesise the results from studies concerned with complexity at a landscape scale, which we define as ≥ 500 m to include land-cover types extending beyond the field edge. Over 900 abstracts were reviewed for relevance, and 46 studies were ultimately selected using the following criteria: (1) a sample size consisting of at least five unique ‘landscapes,’ in which a landscape comprises a field and the area surrounding it, separated by a minimum distance of 1 km from any other field in the study, (2) quantitative measurements of landscape complexity (as defined below) using GIS or other spatial techniques at ≥ 500 m around the farm, and (3) statistics reported as the univariate relationship between landscape complexity and arthropod response or the partial contribution of landscape complexity among other factors. Authors were contacted if the study design met our criteria but the statistics were not reported in a format suitable to our analysis, and in some cases original data were then obtained and reanalysed.
Predictor variables included several categorical variables and one continuous variable (scale), and corresponded to our study questions. (1) Enemies vs. pests: trophic level specified whether the arthropod was an enemy or a pest. In most, but not all, cases, the ‘pest’ in question was a serious economic concern for the crop system studied. A few studies (Jonsen & Fahrig 1997; Holland & Fahrig 2000; Kruess 2003; Ekroos et al. 2010) examined herbivore responses for insects that were not considered pests, and those studies were used because they still provide valuable information about how the response of the second trophic level differs from that of higher trophic levels. (2) Response definitions: arthropod response type included abundance and diversity for enemies and pests, predation or parasitism for enemies only, and population growth and plant damage for pests only. Our category for ‘diversity’ in most cases meant raw or rarefied measures of species richness, although a few studies used Shannon indices. Pest population growth was measured as the difference between pest populations in the presence and absence of resident natural enemies at different sites; thus, while it is listed under pest response here, it is also partly a function of natural enemy response. (3) Landscape definitions: landscape complexity metric included % natural habitat, % non-crop habitat, % crop (inverted), habitat diversity and an ‘other’ category (comprising one study measuring distance to natural habitat and three studies measuring linear features such as length of woody edges at the landscape scale). Habitat diversity was measured using Shannon and Simpson indices; studies purporting to measure diversity but actually using other measures (% non-crop or length of boundary habitat) were reclassified accordingly. The measures for % non-crop and % crop (inverted) were kept separate because of different assumptions regarding the composition of non-crop habitat (see Discussion for more details).
(4) Specialists vs. generalists: arthropod specialization defined each arthropod as either a specialist or generalist, according to how they were described in the current literature. If a study in our meta-analysis did not explicitly define its study species as specialist or generalist, the species name was searched in Web of Science with the terms ‘specialist’ and ‘generalist’ to determine how the species is most commonly characterised. (5) Scale of response: the scale at which landscape complexity was measured (i.e. the radius around the farm within which the landscape was characterised for different measures of complexity).
We converted the test statistic (F, χ², t, or r²) from each response reported in a study to a standard statistic, the correlation coefficient R, in order to compute Fisher’s Z, using the equation (following Rosenthal & DiMatteo 2001)

$$Z = \frac{1}{2}\ln\left(\frac{1+R}{1-R}\right)$$
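The conversion from a reported test statistic to R and then to Fisher’s Z can be sketched as follows. This is a minimal illustration, not the authors’ code: the conversion formulas are the standard ones for single-degree-of-freedom tests, the function names are hypothetical, and the sampling variance 1/(n − 3) is the usual large-sample approximation for Fisher’s Z. Because F, χ² and r² carry no sign, the direction of the effect has to be supplied separately.

```python
import math

# Hypothetical helper names; formulas assume single-d.f. tests (F with 1
# numerator d.f., chi-square with 1 d.f.), which is an assumption here.

def r_from_t(t, df):
    """r from a t statistic with df degrees of freedom; keeps the sign of t."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)

def r_from_f(f, df_error, sign=1):
    """r from F(1, df_error); sign (+1 or -1) gives the effect direction."""
    return sign * math.sqrt(f / (f + df_error))

def r_from_chi2(chi2, n, sign=1):
    """r from a 1-d.f. chi-square statistic based on n observations."""
    return sign * math.sqrt(chi2 / n)

def r_from_r2(r2, sign=1):
    """r from a coefficient of determination for a single predictor."""
    return sign * math.sqrt(r2)

def fisher_z(r):
    """Fisher's Z = 0.5 * ln((1 + r) / (1 - r)), i.e. atanh(r)."""
    return math.atanh(r)

def fisher_z_variance(n):
    """Approximate sampling variance of Fisher's Z for n sampling units."""
    return 1.0 / (n - 3)

# Example with made-up numbers: t = 2 with 12 d.f. gives r = 0.5.
r = r_from_t(2.0, 12)
z = fisher_z(r)
```

The inverse of `fisher_z_variance` is the weight a given effect size would receive in an inverse-variance weighted analysis.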
We use Z rather than the more ubiquitous Hedges’ d because Z estimates the magnitude of the relationship between a predictor variable and its response using any test statistic, while Hedges’ d uses standardised mean differences as its effect size index. Hedges’ d is most applicable to experiments comparing control and treatment groups, whereas the studies in our analysis tended to use continuous designs (testing arthropod response across a landscape gradient).
In this manner, we generated 159 effect sizes (Z) from 46 studies. Effect size was then used as the response variable, weighted by the inverse of its variance, in generalised linear mixed models (R, version 2.9.1, http://cran.r-project.org) with our predictor variables (as defined above) as fixed effects and study as a random effect. Using generalised linear mixed models instead of existing meta-analytical software (e.g. Meta-Win) provides greater analytical flexibility, allowing for the incorporation of random effects to account for multiple non-independent measures from the same study (e.g. measurements for different taxonomic groups, or measures of more than one response type, such as abundance and diversity; see also Prugh 2009).
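The inverse-variance weighting mentioned above can be illustrated in isolation. The sketch below computes only a weighted mean effect size and its standard error. It is a deliberate simplification that omits the fixed-effect predictors and the random effect of study used in the mixed models, and the function name and example numbers are hypothetical.

```python
import math

def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean of effect sizes and its standard error.

    Simplified fixed-effect pooling: no predictors and no random effect of
    study, unlike the mixed models described in the text.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * z for w, z in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return mean, se

# Made-up example: the more precise effect (variance 0.25) dominates.
mean_z, se_z = weighted_mean_effect([0.0, 1.0], [1.0, 0.25])
```

With these weights, an effect size estimated from many landscapes pulls the pooled estimate more strongly than one estimated from few.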
Each study question (see Introduction) was tested with a different model (Table 2). The AIC (Akaike information criterion) score was used as a guide for comparing different models (with a lower AIC corresponding to a more explanatory model, and a difference of > 2 considered to be significant, Burnham & Anderson 2002), but P-values for each factor were also considered. Likelihood-ratio testing was used as a more robust measure for nested models to determine whether the addition of a variable improved the model.
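The two comparison tools can be sketched with values taken from Table 2. This is an illustration rather than the authors’ procedure: it assumes that the d.f. column gives the number of estimated parameters (which reproduces the tabulated AIC for models 1.1 and 1.2) and that the likelihood-ratio statistic is referred to a χ² distribution with d.f. equal to the difference in parameters. Function names are hypothetical.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    nu = 2 - (df % 2)  # start from df = 1 (odd) or df = 2 (even)
    q = math.erfc(math.sqrt(x / 2)) if nu == 1 else math.exp(-x / 2)
    while nu < df:
        # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) e^(-x/2) / Gamma(nu/2 + 1)
        q += (x / 2) ** (nu / 2) * math.exp(-x / 2) / math.gamma(nu / 2 + 1)
        nu += 2
    return q

def likelihood_ratio_test(ll_null, df_null, ll_full, df_full):
    """Return the statistic 2 * (ll_full - ll_null) and its chi-square P-value."""
    stat = 2 * (ll_full - ll_null)
    return stat, chi2_sf(stat, df_full - df_null)

# Model 1.1 (null) vs. model 1.2, values from Table 2.
aic_null = aic(-129.5, 4)   # 267.0
aic_full = aic(-119.0, 8)   # 254.0
stat, p = likelihood_ratio_test(-129.5, 4, -119.0, 8)  # stat = 21.0, p ≈ 0.0003
```

Here the AIC drops by 13 and the likelihood-ratio test agrees, so both criteria favour adding response type to the trophic-level model.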
Table 2. Models tested for study questions, the effect of predictor variables on the response variable, with Akaike information criterion (AIC) scores for comparison. Models 1.2–1.8 are nested within 1.1; 1.2a and 1.2b nested within 1.2; 2.2 and 2.3 nested within 2.1. Log-likelihoods (L-L) and d.f. are reported along with the results of log-likelihood-ratio tests to compare the nested models to the null models, where appropriate. No AIC or L-L statistics are included for models 3 and 4 because they are based on a different subset of studies and therefore comparison to other models would not be meaningful. Lines in bold show the best model in a set of nested models.
| Model | Question | Predictor variables | Response variable | Papers, Obs. | AIC | L-L | d.f. | p (L-L test) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1.1 | 1 | Trophic level | Effect size | 46, 159 | 267 | −129.5 | 4 | (null) |
| 1.2 | 2 | Trophic level × response type | Effect size | 46, 159 | 254 | −119.0 | 8 | 0.0003 |
| 1.2a | 2 | Trophic × response × landscape | Effect size | 46, 159 | 265 | −120.8 | 12 | 0.470 |
| 1.2b | 2 | Trophic × response × specialisation | Effect size | 46, 159 | 256 | −119.4 | 9 | 0.337 |
| 1.3 | 3 | Trophic level × landscape metric | Effect size | 46, 159 | 278 | −130.9 | 8 | 0.567 |
| 1.4 | 4 | Trophic level × specialisation | Effect size | 46, 159 | 270 | −130.1 | 5 | 0.251 |
| 2.1 | (enemy only) | Response type | Effect size | 38, 118 | 163 | −76.6 | 5 | (null) |
| 2.2 | (enemy only) | Response type × landscape metric | Effect size | 38, 118 | 163 | −72.8 | 9 | 0.103 |
| 2.3 | (enemy only) | Response type × specialisation | Effect size | 38, 118 | 160 | −74.0 | 6 | 0.022 |
| 3 | 5 | Trophic level × specialisation | Scale | 26, 87 | -- | -- | -- | -- |
| 4 | 5 | Trophic × specialisation × scale | Effect size | 14, 214 | -- | -- | -- | -- |
The models for questions 2, 3 and 4, regarding the predictors arthropod response type, landscape metric and arthropod specialisation, respectively, were nested hierarchically within the model for question 1 (trophic level), and correspond to models 1.2–1.4 (Table 2). Factors that were found to improve the model significantly were then further nested with the remaining factors to test whether additional improvements could be made (e.g. landscape metric and specialisation were each nested within response type and trophic level; models 1.2a and 1.2b, respectively, Table 2). Our initial analyses (models 1.1–1.4) suggested that a lack of significance in pest response could be masking potentially significant distinctions between the effects of different variables on enemies, which could be more thoroughly examined through an independent analysis of this trophic level. Therefore, response type, landscape metric and arthropod specialisation were further explored for question 4 with a separate set of models in which only enemy response was considered (models 2.1–2.3 in Table 2).
Two additional and independent (non-nested) models (3 and 4 in Table 2) were used to address issues of scale in question 5. The extent to which we are able to detect a trend of arthropods increasing or decreasing with landscape complexity depends on the interaction between the scale of the arthropod’s response to landscape and the scale of complexity in that landscape. Many studies measured landscape complexity at only one scale, but some measured it at multiple scales. To handle this difference among studies in the main analysis, for any study utilising multiple scales, we selected the one most predictive scale for each response variable in each study. Then, to investigate scale effects explicitly (question 5), we also conducted secondary analyses using subsets of the original set of studies. Twenty-six of the 46 studies in our data set tested the same response against multiple scales, although some of these only reported results from the most predictive scale. For this set of 26 studies, we tested whether the most predictive scale was different for enemies and pests and/or specialists and generalists (yielding 88 responses from the most predictive scale; model 3 in Table 2), using scale rather than effect size as our response variable in this case. There were 14 studies out of these 26 that reported effects at all scales measured; we used all scales of this further reduced subset (yielding 219 responses, from 2 to 8 scales per study; model 4 in Table 2) to test scale as a predictor for effect size along with other predictor variables (trophic level and arthropod specialisation). Some of these 14 studies measured responses at scales below as well as above 500 m. For this analysis about scale only, we included measures at all scales, even those below 500 m.
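The selection of a single scale per response for the main analysis can be sketched as below. The text does not state how ‘most predictive’ was determined, so this sketch assumes it means the scale with the largest absolute effect size. The record layout, field names and numbers are hypothetical.

```python
def most_predictive_scale(records):
    """For each (study, response) pair, keep the record whose effect size
    has the largest absolute value.

    Assumption: 'most predictive' is read here as largest |Z|.
    """
    best = {}
    for rec in records:
        key = (rec["study"], rec["response"])
        if key not in best or abs(rec["z"]) > abs(best[key]["z"]):
            best[key] = rec
    return best

# Made-up records: one study measured at three radii, another at one.
records = [
    {"study": "A", "response": "enemy abundance", "scale_m": 500, "z": 0.10},
    {"study": "A", "response": "enemy abundance", "scale_m": 1500, "z": -0.45},
    {"study": "A", "response": "enemy abundance", "scale_m": 3000, "z": 0.30},
    {"study": "B", "response": "pest abundance", "scale_m": 1000, "z": 0.20},
]
best = most_predictive_scale(records)
```

In this example study A contributes its 1500 m measurement to the main analysis, while all three of its scales would enter an analysis that treats scale as a predictor.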
Data were further explored for evidence of publication and representational bias. Publication bias was investigated using three different methods. (1) Funnel plots, which depict the standardised effect size against the study sample sizes, provide a qualitative assessment of publication bias. Unbiased data should be shaped like a funnel in these plots, with a wide scatter of effect sizes at low sample sizes, growing narrower at higher sample sizes (Palmer 1999). (2) A Spearman rank correlation test makes the same comparison statistically, with a significant correlation indicating that studies with large effect sizes are more likely to be published than studies with smaller effect sizes (Begg 1994). (3) We also calculated Rosenthal’s fail-safe number (according to Rosenberg 2005), to determine the number of hypothetical non-significant, unpublished or missing studies that would need to be added to the analysis to make significant overall effects non-significant. If the fail-safe number is sufficiently high (i.e. > 5n + 10, where n is the number of studies included in the meta-analysis), the significant results can be considered robust despite publication bias (Rosenberg 2005).
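Two of these checks reduce to short calculations, sketched below under stated assumptions. The rank correlation uses average ranks for ties. The fail-safe number uses Rosenthal’s classical unweighted formula, (ΣZ)² / z_α² − k with one-tailed z_α = 1.645, where each Z is a study’s standard normal deviate; this may differ from the calculator described by Rosenberg (2005). Function names and example inputs are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def rosenthal_failsafe(z_scores, z_alpha=1.645):
    """Rosenthal's classical fail-safe number (unweighted; an assumption)."""
    return sum(z_scores) ** 2 / z_alpha ** 2 - len(z_scores)

def robustness_threshold(n_studies):
    """Threshold 5n + 10 above which results are considered robust."""
    return 5 * n_studies + 10

threshold = robustness_threshold(46)  # 240 for the 46 studies analysed here
```

A strong rank correlation between effect size and sample size would suggest funnel-plot asymmetry, and a fail-safe number above `threshold` would indicate that the overall effect is unlikely to be an artefact of unpublished null results.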
Representational bias was investigated in post hoc analyses. Several factors (lab group, cropping system and study region) were examined to determine whether they had disproportionate representation in the data set and whether these underlying representational biases were driving the trends seen in our analysis (models 1.5–1.7, Appendix S1). Differences between taxonomic groups were also examined alongside specialist/generalist distinctions to determine if one particular group was driving specialist or generalist responses (model 1.8, Appendix S1). Finally, various measures of diversity for landscape metric and arthropod response type (raw richness, rarefied richness, and Shannon indices, or Shannon and Simpson indices, respectively) were further probed to determine whether different measures had any impact on effect size (Appendix S1).