Quantifying the evidence for biodiversity effects on ecosystem functioning and services

Authors


E-mail: pbalvane@oikos.unam.mx

Abstract

Concern is growing about the consequences of biodiversity loss for ecosystem functioning, for the provision of ecosystem services, and for human well being. Experimental evidence for a relationship between biodiversity and ecosystem process rates is compelling, but the issue remains contentious. Here, we present the first rigorous quantitative assessment of this relationship through meta-analysis of experimental work spanning 50 years to June 2004. We analysed 446 measures of biodiversity effects (252 in grasslands), 319 of which involved primary producer manipulations or measurements. Our analyses show that: biodiversity effects are weaker if biodiversity manipulations are less well controlled; effects of biodiversity change on processes are weaker at the ecosystem compared with the community level and are negative at the population level; productivity-related effects decline with increasing number of trophic links between those elements manipulated and those measured; biodiversity effects on stability measures (‘insurance’ effects) are not stronger than biodiversity effects on performance measures. For those ecosystem services which could be assessed here, there is clear evidence that biodiversity has positive effects on most. Whilst such patterns should be further confirmed, a precautionary approach to biodiversity management would seem prudent in the meantime.

Introduction

Human needs have been, and continue to be, satisfied at the expense of altered land use, climate, biogeochemical cycles and species distributions (MA 2005). As a result, biodiversity is declining a thousand times faster now than at rates found in the fossil record (MA 2005), raising concerns about consequences of such loss for ecosystem functioning, the provision of ecosystem services and human well being (Schläpfer & Schmid 1999; Chapin et al. 2000; Loreau et al. 2001; Kinzig et al. 2002; Díaz et al. 2005; Hooper et al. 2005; MA 2005; Srivastava & Vellend 2005). Such concerns have moved beyond the science community to the global stakeholder and policy community with the publication of the Millennium Assessment (Díaz et al. 2005; MA 2005). That analysis acknowledges that biodiversity probably plays a significant role in directly providing goods and services as well as regulating and modulating ecosystem properties (this term is used here to include ‘processes’ and ‘functioning’) that underpin the delivery of ecosystem services.

Considerable research has gone into teasing out the linkages between biodiversity, functioning and services (Naeem & Wright 2003), and experimental approaches now account for 40% of the publications in this area (Fig. 1). Most experiments have manipulated diversity or have assembled different diversities as a treatment variable and documented the response of ecosystem properties and processes, including modifying effects of environmental factors on such relationships (Naeem et al. 1994; Tilman 1996; McGrady-Steed et al. 1997; Hector et al. 1999). The experimental designs used, results obtained and interpretations made, have not been consistent and the field has been contentious and lively (Grime 1997; Wardle et al. 1997; Huston et al. 2000; Lepš 2004). Attempts have been made to provide common frameworks, identify areas of consensus or future challenges, as well as potential management and policy implications (Schläpfer & Schmid 1999; Loreau et al. 2001; Kinzig et al. 2002; Schmid et al. 2002; Díaz et al. 2005; Hooper et al. 2005), but these syntheses have taken the form of largely subjective assessments through qualitative literature reviews. Such reviews provided an important foundation (in particular Schmid et al. 2002) for us to construct a more complete database using strict selection criteria (Schläpfer & Schmid 1999) for the formal meta-analysis presented here. Specifically, we pose the following questions: (i) what are the most commonly addressed relationships between biodiversity and ecosystem properties? (ii) How do the experimental designs used and the ecosystem properties measured affect the outcomes and interpretation of biodiversity–ecosystem functioning relationships? (iii) What can be learnt about biodiversity–ecosystem service relationships that could be useful for decision makers?

Figure 1.

 The number of biodiversity–ecosystem functioning articles published during the last decade is steadily growing (ISI Web of Science). Experimental work (filled section) has contributed around 40% of the total number of articles (total bar) since the beginning of this century.

Methods

Data collection

One hundred and three publications were included in our database, representing 446 ecosystem property measurements from 1954 to June 2004. These publications were identified from the ISI Web of Science and Biological Abstracts databases using previously established criteria (Schläpfer & Schmid 1999) and the following search terms: biodiversity or species richness and stability or ecosystem function or productivity or yield or food web. Where appropriate, we contacted authors of publications to obtain additional information and additional publications. Information about specifics of experimental designs, the ecosystem properties measured and the significance and size of reported effects was entered into our database. We did not include duplicate records, for example, the same experiment and same measurement reported in a different publication or measured in a different year (repeated measures). If, however, the repeated measures were used to derive a new variable such as temporal variation in the ecosystem property, these data were included. We did not include studies that compared monocultures with mixtures of a single higher diversity level or single-species removal experiments. We used all records that reported effect sizes, allowing us to calculate correlation coefficients for the relationship between biodiversity and ecosystem property, but we excluded studies that reported only significance.

Data analyses

Biodiversity effects were measured as simple or multiple correlation coefficients, r. Using r instead of r² (the coefficient of determination) had the advantage that we could assign negative and positive signs to effects. Maintaining negative and positive effects and using a Z-transformation (see below) allowed us to test the overall distribution for normality and to obtain normally distributed error terms after fitting explanatory terms.

Simple correlation coefficients (365 records) were only available where biodiversity was treated as an independent continuous variable or where a linear or log-linear contrast was made for the factor biodiversity. When biodiversity was analysed as a factor with more than one level (or as a polynomial), we calculated multiple correlation coefficients from the entries in the analysis of variance tables (81 records). We used adjusted r² values to derive correlation coefficients because these correct for the degrees of freedom used to fit a model (Sokal & Rohlf 1995). When the relationship between the levels of the biodiversity factor and the response variable was generally negative, we gave the multiple correlation coefficient a minus sign. In addition to the sign, we also noted the shape of the relationship (see below). To simultaneously analyse simple and multiple correlation coefficients we normalized them using Fisher's z-algorithm (Rosenberg et al. 2000)

$$Z_r = \frac{1}{2}\,\ln\!\left(\frac{1 + r}{1 - r}\right) \qquad (1)$$

and analysed these Zr-values as a new dependent variable. We did all analysis with all 446 correlation coefficients and with the subset of the 365 simple coefficients. Because the results were the same, we only present those from the full analysis.
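As a rough illustration of the conversion described above, the sketch below turns an r² taken from an analysis of variance table into a signed correlation coefficient and then applies Fisher's transformation in its standard form, Zr = ½ ln[(1 + r)/(1 − r)]. The function names, the textbook adjusted-r² formula 1 − (1 − r²)(n − 1)/(n − p − 1), the clamping of negative adjusted values to zero and the example numbers are all assumptions made for this example, not details taken from the analysis itself.

```python
import math

def adjusted_r2(r2, n, p):
    """Textbook adjusted r^2 for n observations and p fitted parameters (assumed form)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def signed_r(r2_adj, negative):
    """Multiple correlation coefficient with the sign assigned by inspection."""
    r = math.sqrt(max(r2_adj, 0.0))  # clamp, because adjusted r^2 can fall below zero
    return -r if negative else r

def fisher_z(r):
    """Fisher's z-transformation (equal to atanh(r)); undefined for |r| = 1."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

# Hypothetical record: r^2 = 0.40, n = 24 plots, 3 fitted parameters,
# relationship judged to be generally negative.
r = signed_r(adjusted_r2(0.40, n=24, p=3), negative=True)
z = fisher_z(r)
```

Because the transformation is odd and monotonic, the sign attached to r carries through to Zr unchanged.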

The common, normalized effects measure allowed us to analyse all data together within a single general-linear modelling framework, despite the overwhelming heterogeneity of studies. Based on major controversies as well as areas of consensus identified in previous qualitative syntheses (Schläpfer & Schmid 1999; Loreau et al. 2001; Kinzig et al. 2002; Schmid et al. 2002; Díaz et al. 2005; Hooper et al. 2005), a set of hypotheses was constructed about possible effects of the specifics of experimental designs and the ecosystem properties measured on the biodiversity effects observed (Table 1). The studies were classified into groups using a separate explanatory factor for each of the hypotheses (Table 1). The significance and explanatory power of these factors and of their interactions were then assessed in mixed-model analyses of variance (anova). Study site and reference were random terms in the model.

Table 1.   Hypotheses tested in the meta-analysis and corresponding explanatory terms in anova
| Explanatory term | Null hypothesis |
| --- | --- |
| Type of diversity measure | H1, biodiversity effects are independent of type of diversity measure used to estimate relationship (e.g. species vs. functional diversity) |
| Type of experimental system | H2, biodiversity effects are independent of type of experimental system (e.g. bottle, field) |
| Ecosystem type | H3, biodiversity effects are independent of ecosystem type (e.g. grassland, forest) |
| Main cause of diversity changes | H4, biodiversity effects are independent of main cause of diversity changes (direct vs. indirect manipulation of diversity) |
| Design for direct species diversity manipulations | H5, biodiversity effects are the same whether total density is held constant (substitutive designs) or not (additive or designs without control of total density) |
| Type of indirect species diversity gradients | H6, biodiversity effects are independent of the type of indirect species diversity gradients [natural variation vs. gradient (e.g. nitrogen addition)] |
| Maximum species number | H7, biodiversity effects are independent of maximum species number in most diverse treatment |
| Trophic-level manipulated | H8, biodiversity effects are independent of trophic level manipulated |
| Trophic level measured | H9, biodiversity effects are independent of trophic level measured |
| Number of trophic links between them | H10, biodiversity effects are independent of number of trophic links between level manipulated and level measured |
| Ecosystem property | H11, biodiversity effects are independent of the ecosystem property measured |
| Organization level of ecosystem property | H12, biodiversity effects are independent of the level of organization at which the ecosystem property was measured (population- vs. community- vs. ecosystem-level) |
| Biotic vs. abiotic ecosystem properties | H13, biodiversity effects are independent of whether ecosystem property is biotic or abiotic |
| Dominant cycle to which ecosystem property belongs | H14, biodiversity effects are independent of whether ecosystem property is associated to water, nutrient, energy or biotic dynamics |
| Nature of ecosystem property | H15, biodiversity effects are independent of whether ecosystem property is a stock or a rate |
| Study site | H16, biodiversity effects are independent of location of study site |

Note: Listed are the null hypotheses we tried to reject.

We compared a small number of alternative models for the fixed terms using adjusted r² values (which gave the same model ranking as Akaike and Bayesian information criteria). The selected final model contained only main effects and no interactions of fixed terms. Because the fixed terms were correlated, we assessed the explanatory power of each in two ways: (i) when it was entered first into the model and (ii) when the terms were entered in a sequence of decreasing order of the F-values they obtained when entered first. The random effects were added after the fixed effects in the sequence study site/reference, imposing a nesting of these terms. In one case, a single publication reported results from two study sites and in another case, a single publication reported results from two separate experiments. In these two cases, we gave each publication two reference IDs to ensure full nesting. To avoid weak pseudo-replication due to measurements of multiple ecosystem properties in single experiments, terms referring to specifics of experimental design and study site could be tested against the reference ID instead of the residual mean square as error term. We used this very strict test but list the mean squares in the anova table so that readers can calculate the more liberal F-test as well. The reciprocal of the variance in the individual Zr values, based on the individual study sizes, was used as a weighting factor in the anova (Crawley 1993). This ensured that studies with small sample sizes were not over-rated in comparison with studies with large sample sizes. Throughout the paper, we report results in terms of these weighted average normalized effect sizes Zr and their standard errors.
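The weighting scheme can be sketched as follows. The sketch assumes the usual large-sample variance of a Fisher-transformed correlation, 1/(n − 3), so that each record receives the weight n − 3; the text above states only that the variance was based on study size, so this formula, the function name and the toy numbers are assumptions. The standard error shown is the simple fixed-effect one, sqrt(1/Σw), which will generally differ from standard errors obtained from a fitted mixed model.

```python
import math

def weighted_mean_zr(zr_values, sample_sizes):
    """Inverse-variance weighted mean of Zr and its fixed-effect standard error.

    Assumes var(Zr) = 1 / (n - 3), so the weight of each record is n - 3.
    """
    weights = [n - 3 for n in sample_sizes]
    if any(w <= 0 for w in weights):
        raise ValueError("each record needs n > 3")
    total = sum(weights)
    mean = sum(w * z for w, z in zip(weights, zr_values)) / total
    return mean, math.sqrt(1.0 / total)

# Made-up records, not taken from the database
mean, se = weighted_mean_zr([0.50, -0.10, 0.20], [13, 53, 28])
```

In this toy case the large study with a slightly negative effect pulls the weighted mean (about 0.06) well below the unweighted mean of 0.20, which is the behaviour the weighting is meant to produce.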

Ecosystem properties that could unequivocally be related to ecosystem services (MA 2003; Díaz et al. 2005), and thus that could be assigned a positive (or negative) value for human well being, were further analysed based on mean values and standard errors of effect sizes. Some judgment is involved in the assignment of positive or negative value, because a particular ecosystem property may not be seen as the same benefit by all stakeholders of biodiversity (Srivastava & Vellend 2005). Only those ecosystem properties for which at least five effect size measurements were available were included in the analysis.
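A minimal sketch of this selection step, using invented effect sizes: records are grouped by ecosystem property, groups with fewer than five measurements are dropped, and effect sizes are sign-flipped for properties treated as being of negative value for human well being. The function name, the choice of which property counts as negative and all numbers are hypothetical. For brevity the example uses unweighted means, whereas the analysis reported here used weighted ones.

```python
from collections import defaultdict
from statistics import mean, stdev
import math

MIN_RECORDS = 5  # threshold stated in the text

def summarise_services(records, negative_value):
    """records: iterable of (property, Zr); negative_value: set of property names."""
    groups = defaultdict(list)
    for prop, zr in records:
        # Flip the sign so that positive always means "beneficial"
        groups[prop].append(-zr if prop in negative_value else zr)
    summary = {}
    for prop, values in groups.items():
        if len(values) >= MIN_RECORDS:
            se = stdev(values) / math.sqrt(len(values))
            summary[prop] = (mean(values), se, len(values))
    return summary

# Invented data: 'plant damage' is treated as undesirable, so its sign is flipped
records = [("plant damage", z) for z in (-0.4, -0.2, -0.3, -0.1, -0.5)]
records += [("soil nutrient supply", z) for z in (0.1, -0.1, 0.2)]
result = summarise_services(records, negative_value={"plant damage"})
```

Only the group with five records survives the filter, and its mean is reported with the sign reversed so that a reduction in damage reads as a positive effect.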

Groupings for specifics of experimental design and ecosystem properties (number of records in parentheses)

Type of diversity measure

These included species richness (393), functional group richness (23), evenness (11) and diversity indices (19). Although we aimed to include diversity effects in the broadest sense of the word, the majority of studies examined species richness effects only. Some studies reported effects of functional group richness, but only a few of these were intentionally designed from the start to examine effects of varying functional diversity.

Type of experimental system

System types were bottle (microcosm studies) or pot (111), greenhouse, including climate chambers (62) and field (273). Pot and greenhouse systems differ from field systems in that the latter experience natural climate and light regimes. Field systems included studies that directly and indirectly manipulated species diversity.

Main cause of diversity change

Direct manipulations (398) of diversity were distinguished from indirect ones (48). Indirect manipulations were found only in field studies and were further categorized as follows.

Type of indirect species diversity gradients

Indirect manipulations of diversity were divided into natural variation (39) and gradient (9). In the first category, naturally varying diversity levels were constructed. In the second category, a natural (succession) or experimental gradient in environmental conditions (nutrient application or multiple factors) generated the differences in diversity levels.

Design of direct species diversity manipulation experiments

Direct manipulations of diversity were subdivided into those which were set up so that total density remained constant, i.e. substitutive experiments (357), and others, mostly additive experiments (41).

Maximum species number

Three levels of maximum diversity were recognized: low (≤10 species, n = 211), intermediate (11–20 species, n = 104) and high (>20 species, n = 131).

Ecosystem type

These encompassed forest (43), grassland (258), marine (32), freshwater (68), bacterial microcosm (7), soil community (15), crop/successional (10) and ruderal/salt marsh (13).

Trophic level manipulated and trophic level measured

Studies that manipulated diversity and/or measured diversity effects at different trophic levels were categorized into: primary producer (319 manipulated and 241 measured), primary consumer (30 and 91), secondary consumer (4 and 13), detritivores (15 and 38), mycorrhiza (47 and 15), multitrophic (31 and 5) and ecosystem level (0 and 43). ‘Multitrophic’ refers to studies where diversity was manipulated on more than one trophic level or where the ecosystem property involves more than one trophic level (e.g. total macrofaunal biomass). Ecosystem level refers to properties measured in the entire ecosystem within the abiotic compartment (e.g. nutrient loss from the system).

Number of trophic links

We counted the number of trophic links between the trophic level manipulated and the level at which the property was measured (Fig. 2).

Figure 2.

 Number of measurements in published biodiversity–ecosystem functioning experiments for different trophic levels manipulated (base of arrow) and trophic levels measured (end of arrow). A dominance of measurements and manipulations of primary producers was observed.

Effect form

The shapes of the biodiversity–ecosystem property relationships were classified into negative (40), negative linear (92), negative log-linear (41), idiosyncratic (113), positive (70), positive linear (56), positive log-linear (34). This classification was performed independently of significance or size of biodiversity effects simply by inspecting results presented in the text and figures of the publications analysed. This variable is similar to the effect size itself and could be used as an alternative dependent variable in log-linear analysis of deviance. We include this variable in the supplementary online material but except for a single case (see below) the only reported dependent variable in the present paper is effect size per se.

Ecosystem properties measured

We included any physical characteristics of the ecosystems, including process rates of energy and nutrient flow. To simplify comparisons, we grouped similar ecosystem properties (EP), which resulted in 28 groups; an additional group was used to collect those measures that could not be assigned. We distinguished between properties of the ecosystem and those of an invader (defined as any species added after the establishment of a community) and we also distinguished between effects on means of properties measured and those that relate to their variances.

Organizational level of the ecosystem property measured

We distinguished between population-level properties, recorded for individual target species, such as density, cover or biomass, and their temporal variance; community-level properties, recorded for multispecies assemblages, such as density, biomass, consumption, diversity and their temporal variance; and ecosystem-level properties, recorded for abiotic components, such as nutrient, water or CO2 and their temporal variance.

Dominant dynamic of ecosystem property

Properties were assigned to the ecosystem cycle in which they predominate: water, nutrient, energy or biotic dynamics.

Nature of ecosystem property

Stock vs. rate measurements of ecosystem properties were distinguished.

Ecosystem service

Ecosystem services are the benefits people obtain from ecosystems. Our classification followed that of the Millennium Ecosystem Assessment (MA 2003; Díaz et al. 2005). A list of ecosystem properties considered to underpin each ecosystem service, as well as the directionality of expected benefits to human well being, is provided below in the Results section.

Groupings according to place of study and identity of experiment (number of groups in parentheses)

Location of study site (60)

Site location of an experiment ranged from a precise place to a broad region, depending on the extent of the study.

Study site (75)

Generally equivalent to location, this term was used to distinguish different studies within a single location. Study site reflects a set of environmental conditions particular to that experiment.

Reference ID (105)

This corresponded to individual publications, except where a single publication reported results from more than one study, in which case this publication received two reference IDs. This ID is used to distinguish between groups of potentially non-independent measurements in order to avoid pseudo-replication.

Results

The overall mean of the standardized effect sizes Zr (weighted by the reciprocal of the variance of the individual Zr-values) was significantly positive (mean Zr = 0.101 ± 0.028, t = 3.57, d.f. = 445, P < 0.001), indicating that negative responses of ecosystem properties to biodiversity manipulations are less frequent or less strong than positive ones. Nevertheless, the reported effect sizes varied greatly, ranging from −2.71 to 2.39. In the following sections, we explore the sources of this variation.

Effects of specifics of experimental design and study site

Some specifics of the experimental design which we originally expected to have an influence on effect sizes in fact could not be included in the final analysis model, suggesting that they need not be a concern when designing future biodiversity experiments. For instance, there was only a weak influence of the type of diversity measure on measured effect sizes (Table 2). Of particular note is that effect sizes were only slightly larger when functional-group rather than species richness was manipulated (adjusted mean values ± SE of Zr-values: 0.191 ± 0.103 vs. 0.116 ± 0.030).

Table 2.   Results from one-way analyses of variance (anovas), listed in the sequence of decreasing F-values, and from a multiway anova using this sequence for fitting the corresponding fixed terms (see Methods for details)
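One-way anova

| H no. | Variable | d.f. | Sum of squares | Mean squares | F | P-value | % Explained variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 12 | Organization level EP | 2 | 2031.7 | 1015.9 | 40.27 | <0.001 | 15.4 |
| 5 | Type direct manipulations* | 2 | 1802.5 | 901.2 | 35.00 | <0.001 | 13.6 |
| 7 | Maximum species number | 2 | 1319.0 | 659.3 | 24.57 | <0.001 | 10.0 |
| 2 | Experimental system | 2 | 1071.0 | 535.3 | 19.54 | <0.001 | 8.1 |
| 3 | Ecosystem type | 7 | 2255.8 | 322.3 | 12.89 | <0.001 | 17.1 |
| 11 | Ecosystem property | 28 | 3241.7 | 115.8 | 4.83 | <0.001 | 24.5 |
| 16 | Study site | 74 | 6168.6 | 83.4 | 4.39 | <0.001 | 46.7 |
| 1 | Type diversity measure | 3 | 377.2 | 125.7 | 4.33 | 0.005 | 2.9 |
| 15 | Nature of EP | 1 | 86.5 | 86.5 | 2.92 | n.s. | 0.7 |
| 8 | Trophic-level manipulated | 5 | 305.1 | 61.0 | 2.08 | n.s. | 2.3 |
| 9 | Trophic-level measured | 6 | 295.2 | 49.2 | 1.67 | n.s. | 2.2 |
| 10 | Number of links | 1 | 37.4 | 37.4 | 1.28 | n.s. | 0.3 |
| 14 | Cycle type EP | 4 | 143.9 | 36.0 | 1.21 | n.s. | 1.1 |
| 13 | Biotic vs. abiotic EP | 1 | 27.3 | 27.3 | 0.93 | n.s. | 0.2 |
| 6 | Type indirect gradient* | 2 | 14.1 | 7.1 | 0.24 | n.s. | 0.1 |
| 4 | Direct vs. indirect | 1 | 2.2 | 2.2 | 0.07 | n.s. | 0.0 |

anova for selected model

| H no. | Variable | d.f. | Sum of squares | Mean squares | F | P-value | % Explained variance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 12 | Organization level EP | 2 | 2031.9 | 1016.0 | 83.69 | <0.001 | 15.38 |
| 5 | Type direct manipulations† | 2 | 1295.5 | 647.4 | 18.19 | <0.001‡ | 9.81 |
| 7 | Maximum species number | 2 | 349.3 | 174.7 | 4.91 | <0.05‡ | 2.64 |
| 2 | Experimental system | 2 | 485.0 | 242.5 | 6.81 | <0.01‡ | 3.67 |
| 3 | Ecosystem type | 7 | 660.3 | 94.3 | 2.65 | <0.05‡ | 5.00 |
| 11 | Ecosystem property | 28 | 1196.6 | 42.7 | 3.52 | <0.001 | 9.06 |
| 16 | Study site | 65 | 2501.7 | 38.5 | 1.08 | n.s.‡ | 18.94 |
| | Reference (within study site) | 26 | 925.5 | 35.6 | 2.93 | <0.001 | 7.01 |
| | Residual | 337 | 3762.4 | 12.0 | | | 28.49 |
| | Total | 444 | 13208.1 | 29.8 | | | 100.00 |

H no., hypothesis number (see Table 1); n.s., not significant (P > 0.05).
*These two terms include the last term (direct vs. indirect) as a category ‘none’.
†This term includes the term ‘direct vs. indirect’ as a category ‘none’.
‡F-test using reference ID as error term.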

In contrast, the type of experimental system employed (bottle vs. greenhouse vs. field) strongly modified biodiversity effects (Table 2). More positive effects were found where environmental variables could be controlled best, such as in greenhouses and climate chambers (0.467 ± 0.084) compared with bottle/pot experiments (0.100 ± 0.051) or field experiments (0.007 ± 0.033).

Effect sizes also varied markedly between different types of ecosystem (Table 2). For the four ecosystem types which were represented most frequently in the data set, average effect sizes were close to zero (grassland 0.039 ± 0.038, freshwater −0.010 ± 0.065, marine −0.006 ± 0.109, forest −0.116 ± 0.076), whereas average effect sizes were larger and positive for the ecosystem types with fewer records (ruderal/salt marsh, 1.058 ± 0.154; bacterial, 0.317 ± 0.095; crop/successional, 0.245 ± 0.052; soil, 0.094 ± 0.086). This could imply that the research community's perception of the magnitude and direction of biodiversity effects may be biased by the focus to date on relatively few ecosystem types that included measures of negative impacts on properties. There was considerable variation among study sites, but this was not significant in the multiway anova using the strict F-test with reference ID as error term (Table 2). In other words, effect sizes varied as much between references within study sites as between study sites.
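The difference between the strict and the more liberal test can be checked directly from the mean squares listed in Table 2. The short sketch below does this for study site; the variable and function names are illustrative, and the inputs are the rounded mean squares from the table, so the results agree with the published F-values only to rounding.

```python
# Mean squares from the 'anova for selected model' part of Table 2
MS_STUDY_SITE = 38.5
MS_REFERENCE = 35.6   # reference (within study site)
MS_RESIDUAL = 12.0

def f_ratio(ms_term, ms_error):
    """F-value of a term tested against the chosen error mean square."""
    return ms_term / ms_error

strict_f = f_ratio(MS_STUDY_SITE, MS_REFERENCE)   # reference ID as error term
liberal_f = f_ratio(MS_STUDY_SITE, MS_RESIDUAL)   # residual as error term

print(f"strict F = {strict_f:.2f}, liberal F = {liberal_f:.2f}")
# prints: strict F = 1.08, liberal F = 3.21
```

The strict ratio reproduces the F = 1.08 listed for study site, whereas the ratio against the residual mean square is roughly three times larger, which shows how strongly the conclusion about study site depends on the choice of error term.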

Although average effect sizes were practically identical for studies that manipulated biodiversity directly or indirectly (hypothesis 4), and between versions of indirect manipulations (hypothesis 6), average effect sizes were smaller if direct manipulations maintained total density constant (substitutive designs, 0.031 ± 0.030) than if they did not (0.868 ± 0.102) (Table 2). This confirms something that has long been known to agricultural scientists and plant ecologists using substitutive designs (Harper 1977): increasing species richness must not be confounded with increasing total density in experiments.

Average effect sizes were positive if the maximum species richness was larger than 20 species (0.344 ± 0.052) and close to zero for the other two categories (two to 10 species: −0.049 ± 0.030; 11–20 species: −0.034 ± 0.081) (Table 2). Yet only 33 of 105 experiments (reference IDs) employed more than 20 species at the highest diversity level. With respect to effect form there was an indication that the odds ratio between linear and log-linear-negative or -positive relationships was greatest in experiments where maximum species richness was lowest (P < 0.05), but even where maximum species richness was high, this ratio was > 1.

There were no overall effects of trophic level manipulated, trophic level measured or number of trophic links between manipulated and response trophic levels (Table 2). Nevertheless, productivity-related effect sizes did significantly decline with increasing number of trophic links (F(1,140) = 5.74, P < 0.05).

Effects of ecosystem properties measured

Biodiversity effects differed significantly among the 29 different groups of ecosystem properties (Table 2). A large fraction of the variance in effect sizes was explained by comparing population-, community- and ecosystem-level measures of ecosystem properties (Organization level EP in Table 2). Biodiversity negatively affected population-level measures (−0.332 ± 0.053), but positively affected community-level measures (0.270 ± 0.036). Ecosystem-level measures showed an intermediate response (0.066 ± 0.046). In contrast, no differences were found between biotic and abiotic ecosystem properties, stocks and rates, nor between those more related to carbon, nutrient, water or biotic cycles (terms ‘biotic vs. abiotic EP’, ‘nature of EP’ and ‘cycle type EP’, respectively, in Table 2).

Biodiversity–ecosystem service relationships

Biodiversity effects were explored in more detail by plotting mean values and SE for groups of ecosystem properties in Fig. 3 and relating these groups to ecosystem services.

Figure 3.

 Magnitude and direction of biodiversity effects (shown are mean values and SE of normalized effect sizes Zr, weighted by the reciprocal of the variance of the individual Zr-values) and number of measurements available for ecosystem properties organized into ecosystem services. Coloured bars show differential effects of trophic level manipulated: green, primary producers; blue, primary consumers; pink, mycorrhiza; brown, decomposer; grey, multitrophic (multiple levels simultaneously manipulated). Ecosystem properties shown in parentheses were considered of negative value for human well being, and thus their effect sizes are shown with the sign reversed.

Productivity is a fundamental supporting ecosystem service that underpins the provision of services such as food or wood (MA 2003; Díaz et al. 2005). Generally, increasing biodiversity at one trophic level increased productivity at the same trophic level (Fig. 3). Plant diversity also appeared to enhance belowground plant and microbial biomass (Fig. 3), indicating positive biodiversity effects on the regulating ecosystem service of erosion control, as large root and mycorrhizal networks are expected to reduce soil erosion.

Positive biodiversity effects (Fig. 3) were found for most ecosystem properties associated with nutrient cycling services. Plant diversity had positive effects on decomposer activity and diversity, and both plant and mycorrhizal diversity increased nutrients stored in the plant compartment of the ecosystem. It is unclear whether plant or detritivore diversity has a general effect on soil nutrient supply.

Increasing the diversity of primary producers contributed to a higher diversity of primary consumers, which we consider here as a supporting service (Fig. 3). Our results also suggest positive effects of biodiversity on the closely related regulating service of pest control; higher plant diversity contributed to lowering plant damage (Fig. 3). The effects of plant diversity on the performance and diversity of predatory insects or other animals that control pests require further investigation. In the case of the regulation of invasive species, a service of economic significance and an area of considerable debate (Levine & D'Antonio 1999; Fargione et al. 2003), we found reduced invader abundance, survival, fertility and diversity when plant diversity was higher (Fig. 3).

Temporal stability is directly linked to reliability of service delivery (Díaz et al. 2005). Our analysis indicates that more diverse systems have greater temporal stability, as well as greater resistance to external forces such as nutrient perturbations and invading species (Fig. 3). However, this was not the case for other stressors such as warming, drought or a high variance in other environmental conditions. In contrast to the suggestion of qualitative reviews (e.g. Srivastava & Vellend 2005), portfolio and insurance effects of biodiversity (Tilman 1996; Naeem & Li 1997; Yachi & Loreau 1999), i.e. effects on variances or disturbance responses of ecosystem properties, are not more common than performance effects of biodiversity, i.e. effects on means of ecosystem properties (F(1,444) = 0.09, P = 0.75).

Discussion

The database assembled here clearly contains an over-representation of some ecosystem types and ecosystem properties, especially grasslands and primary production measures. It is not surprising that experimental grassland plots are often used as model systems in biodiversity studies: grassland is a widespread system, and experiments can be set up relatively easily at constant total density (as opposed to microcosms with strong population dynamics) without requiring very large areas (as opposed to forests). In addition, primary productivity plays a major role in delivering a wide range of ecosystem services. Nevertheless, future biodiversity experiments should embrace a broader range of systems, properties and trophic levels if the generality of these relationships is to be established. In particular, a recent experiment that came to light after our analysis was carried out (Bell et al. 2005) suggests that bacterial systems hold great promise for future research on biodiversity effects on ecosystem functioning.

Notwithstanding this heterogeneity in the database, our analyses indicate an overall significant positive effect of biodiversity on ecosystem processes. We do not believe that this represents a publication bias towards positive effects, because finding a significantly negative effect would be just as interesting and just as likely to be reported. Nevertheless, there was significant variation between studies in the magnitude and direction of biodiversity effects, attributable mainly to specifics of experimental design and the ecosystem properties measured, as also argued in qualitative reviews (Hooper et al. 2005).

Specifics of experimental design and ecosystem properties

A large number of negative effects were associated with population-level measures, whilst positive effects were associated with community-level measures. This result provides perhaps the strongest empirical evidence to date for the prediction that individual populations should fluctuate more with increasing biodiversity, whereas community stability and productivity should be enhanced (May 1981; Tilman 1996).

In contrast to the outcomes of qualitative reviews (Hooper et al. 2005), we could not find a simple dependence of biodiversity effects on the trophic levels manipulated or measured. However, we did find productivity-related biodiversity effects that declined with increasing number of trophic links between those trophic levels which were manipulated and those at which the property was measured. This intuitively compelling result has never been reported before. It is clear that experiments need to be extended beyond the single trophic level approach to better understand such variations in biodiversity effects across an ecosystem (Petchey et al. 2002; Raffaelli et al. 2002).

Variation in biodiversity effects among study sites and references suggests that local environmental or specific unrecognized experimental factors may either increase or decrease biodiversity effects. Previous work (Hector et al. 1999) had already indicated important influences of location on biodiversity effects. The additional variation among references within study sites, which actually made the variation between sites non-significant, is reported here for the first time.

Sufficient information is not available to permit analysis of biodiversity-modifying factors, such as nutrient levels or elevated CO₂ (Hooper et al. 2005), but it is clear that biodiversity effects are significantly weaker in less-controlled experimental systems. Indeed, it is much more difficult to maintain diversity treatments on open field plots than in closed bottles; environmental heterogeneity, unpredictable biotic and abiotic environmental fluctuations and sampling variances are greater in the former. Thus, while our results would suggest that further research under controlled conditions is needed to improve our understanding of biodiversity effects on ecosystem functioning, extrapolation of those results to the larger landscape scale is likely to be hindered by the greater environmental heterogeneity and its effects on ecosystem functioning (Loreau et al. 2001; Hooper et al. 2005). In this respect, field experiments are likely to be more meaningful for extrapolation to the landscape scales at which humans impact on biodiversity and hence service delivery. On the other hand, in a recently constructed grassland experiment in Jena, Germany, Roscher et al. (2005) found a similar plant diversity–productivity relationship in small plots of 12.25 m² and in plots more than 30 times larger (400 m²).

There has been much debate about how our understanding of the relationship between biodiversity and ecosystem functioning is affected by differences in the way biodiversity is manipulated, how experiments are set up, and how response variables are measured (Schmid et al. 2002; Lepš 2004). Different experimental designs and setups are acknowledged to have their own advantages and shortcomings, but the present analysis has allowed a formal assessment of the degree to which these really are important. Surprisingly, we found no significant differences between those experiments where diversity was manipulated directly and those involving indirect manipulations by altering environmental conditions. However, there was clear evidence in favour of substitutive designs with control for constant total density of individuals at the start of an experiment. If total density is allowed to vary, in most cases in parallel with species richness, larger effects are seen, but one cannot unequivocally attribute them to either biodiversity or density. In other words, such experiments are confounded.
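The difference between the two kinds of design can be made concrete with a small planting-plan calculation. The sketch below is illustrative only: the richness levels and the density of 1000 individuals per plot are hypothetical values, not taken from any study in the database, and the label "additive" for designs in which total density rises with richness is a conventional term assumed here rather than one used in the text.

```python
def substitutive_design(richness_levels, total_density):
    """Per-species sowing density when total density is held constant."""
    return {s: total_density / s for s in richness_levels}


def additive_design(richness_levels, per_species_density):
    """Total density when every species is added at the same fixed density."""
    return {s: per_species_density * s for s in richness_levels}


# Hypothetical richness gradient and density (individuals per plot).
levels = [1, 2, 4, 8, 16]
sub = substitutive_design(levels, total_density=1000)
add = additive_design(levels, per_species_density=1000)

for s in levels:
    # Substitutive: the plot total (per-species density x richness) stays fixed.
    # Additive: the plot total grows in step with richness.
    print(f"richness={s:2d}  substitutive total={sub[s] * s:7.0f}  "
          f"additive total={add[s]:7d}")
```

In the substitutive plan, any change in a response along the richness gradient cannot be caused by a change in total density, because there is none. In the additive plan, richness and total density are perfectly correlated, so their effects cannot be separated.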

Using a large number of species at the highest diversity levels of an experiment increases the chances of detecting biodiversity effects, although this must be weighed up against the increased work involved in setting up such an experiment. Nevertheless, there is a clear need to include higher levels of species richness in experiments. Unfortunately, interesting new simulation and empirical studies which used non-random extinction scenarios (Raffaelli 2004; Solan et al. 2004; Zavaleta & Hulvey 2004; Bunker et al. 2005; Schläpfer et al. 2005; Srivastava & Vellend 2005) could not be included in our analysis because they were published after our analyses were complete.

An important question when designing a biodiversity–ecosystem functioning experiment is what expression of diversity to manipulate: richness, evenness or functional groups? The literature is somewhat divided on this issue (Díaz & Cabido 2001; Loreau et al. 2001; Hooper et al. 2005; Petchey & Gaston 2006; Wright et al. 2006), but the predominant view is that functional groups may be more important than species richness, consistent with our own findings.

Biodiversity–ecosystem service relationships

Where ecosystem properties could be related to ecosystem services (Srivastava & Vellend 2005), clear positive effects of biodiversity were found, for both regulating and supporting services. Nevertheless, our ability to make these linkages at spatial (landscape) scales relevant to the human enterprise is limited at present (Kremen 2005). There is an urgent need to extend experimental, observational and theoretical work on biodiversity effects for an array of ecosystem functions that can be linked to ecosystem services, such as water quantity and quality, pollination, regulation of pests and human diseases, carbon storage and climate regulation, waste management and cultural services, and to evaluate biodiversity–ecosystem service relationships at the larger spatial scales relevant to management (Kremen et al. 2004; Balvanera et al. 2005).

The role of biodiversity in buffering environmental variation and thus providing consistent service delivery has received extensive theoretical treatment (Tilman 1996; Yachi & Loreau 1999; Hooper et al. 2005). In general, a positive effect of biodiversity is expected on the stability of ecosystem properties (Tilman 1996; Naeem & Li 1997; Yachi & Loreau 1999; Hooper et al. 2005), and qualitative reviews have suggested that such effects on the variance in processes (stability) may be stronger than the effects on means (stocks and fluxes; Srivastava & Vellend 2005). The quantitative results from our meta-analysis do not support this view, rather indicating that biodiversity effects on disturbance buffering are dependent on the nature of the disturbance. Thus, while biodiversity effects on buffering of nutrient perturbations and invading species were positive, biodiversity effects on buffering influences of warming, drought or high environmental variance were neutral or slightly negative.
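The statistical-averaging intuition behind the theoretical expectation of greater stability can be sketched with a few lines of arithmetic. This is a deliberately simplified illustration, not the model used in any of the cited papers: it assumes that species fluctuate independently, that all species have the same mean and standard deviation (hypothetical values below), and that per-species means do not decline as richness increases. Under those assumptions the coefficient of variation (CV) of the community total falls with the square root of richness. The meta-analysis results discussed above indicate that real buffering effects depend on the type of disturbance, so the sketch shows the expectation and not the observed pattern.

```python
import math


def community_cv(n_species, mean, sd):
    """CV of the summed property of n independent, identical species."""
    total_mean = n_species * mean          # means add
    total_sd = math.sqrt(n_species) * sd   # variances add, so sd scales with sqrt(n)
    return total_sd / total_mean


# Hypothetical per-species mean and standard deviation (arbitrary units).
for n in (1, 4, 16):
    print(n, community_cv(n, mean=10.0, sd=5.0))
```

With positively correlated species responses, for example when all species respond similarly to drought or warming, the variance of the total grows faster than assumed here and the buffering weakens.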

Conclusions

Whilst there are many qualitative reviews and position statements about the effects of biodiversity on ecosystem properties and services, our analysis provides the first extensive quantitative meta-analysis of this relationship. This analysis suggests that simple generalizations among ecosystem types, ecosystem properties or trophic level manipulated or measured will be difficult to sustain. Considerations of the way in which biodiversity is defined and manipulated, and disentangling the many separate effects and the interactions between them, as well as those with environmental heterogeneity, will be a major challenge for the next generation of experiments. We offer our database as a building block for continued synthesis attempts. The advantages of a formal meta-analysis are illustrated by the following novel contributions we have been able to bring to the synthesis: (i) biodiversity effects are weaker if biodiversity manipulations are less well controlled (e.g. field vs. greenhouse or climate chamber); (ii) biodiversity effects are weaker if the highest diversity levels in an experiment are lower (e.g. ≤ 10 vs. > 10 species); (iii) biodiversity experiments should avoid confounding diversity and total density (they should use a substitutive design); (iv) biodiversity effects are weaker at the ecosystem than the community level and negative at the population level; (v) productivity-related biodiversity effects decline with increasing number of trophic links between level manipulated and level measured; (vi) biodiversity effects on stability measures are not obviously stronger than biodiversity effects on performance measures.

There are clear messages for policy makers from these analyses. First, for those ecosystem services that could be assessed in the present study, there is clear evidence that biodiversity has positive effects on the provision of those services and that further biodiversity loss can only be expected to compromise service delivery. Secondly, whilst further research is needed to confirm such linkages, in particular to extend the work to a broader range of systems and properties, society in the meantime should proceed in a precautionary manner in its use and management of biodiversity.

Acknowledgements

We thank the Swiss Biodiversity Forum for administrative support and guidance and SCOPE, with Chris Field, Carlo Heip and Osvaldo Sala for suggestions about the project design. We acknowledge helpful comments from three anonymous referees and S. Naeem which improved the manuscript. We thank the Swiss Agency for the Environment, Forests and Landscape (SAEFL) and the University of Zurich for financial support. We thank Manuel Maass for suggestions and Alberto Valencia and Heberto Ferreira for technical support.
