Testing effects of consumer richness, evenness and body size on ecosystem functioning

Authors


E-mail: g.woodward@qmul.ac.uk

Summary

1. Numerous studies have revealed (usually positive) relationships between biodiversity and ecosystem functioning (B-EF), but the underpinning drivers are rarely addressed explicitly, hindering the development of a more predictive understanding.

2. We developed a suite of statistical models (where we combined existing models with novel ones) to test for richness and evenness effects on detrital processing in freshwater microcosms. Instead of using consumer species as biodiversity units, we used two size classes within three species (six types). This allowed us to test for diversity effects and also to focus on the role of body size and biomass.

3. Our statistical models tested for (i) whether performance in polyculture was more than the sum of its parts (non-additive effects), (ii) the effects of specific type combinations (assemblage identity effects) and (iii) whether types behaved differently when their absolute or relative abundances were altered (e.g. because type abundance in polyculture was lower compared with monoculture). The latter point meant we did not need additional density treatments.

4. Process rates were independent of richness and evenness and all types performed in an additive fashion. The performance of a type was mainly driven by the consumers’ metabolic requirements (connected to body size). On an assemblage level, biomass explained a large proportion of detrital processing rates.

5. We conclude that B-EF studies would benefit from widening their statistical approaches. Further, they need to consider the biomass of species assemblages and whether that biomass is composed of small or large individuals, because even if all species are present at the same biomass, small species (or individuals) will perform better.

Introduction

Biodiversity determines how an assemblage mediates ecosystem processes, many of which have now been shown to increase with species richness across a wide spectrum of organisms and systems (e.g. Duffy 2006; see Balvanera et al. 2006; Reiss et al. 2009). Despite the recent emergence and rapid expansion of biodiversity-ecosystem functioning (B-EF) research, there is still no firm consensus on the underlying drivers of biodiversity effects. With respect to resource consumption, a key question is: do species-rich assemblages show faster rates than species-poor ones? Simply speaking, species can potentially perform in a purely additive fashion, they can influence each other’s performance, and/or they can drive different processes that contribute to an overall effect. Many B-EF studies have focused on linking empirical observations with concepts such as the complementarity or facilitation effect (e.g. Cardinale, Palmer & Collins 2002; Cardinale et al. 2007). The former can arise when species complement each other because they drive different processes (in an additive or non-additive fashion) that are measured together. The facilitation effect can be observed when species enhance each other’s performance. This means they operate in a non-additive way, because performance in mixture is more than the sum of its parts (Reiss et al. 2009). Many (plant) B-EF studies have highlighted that when the best-performing species dominates in polycultures, polyculture performance can be better than that of monocultures, although there may be no species richness effect as such (sampling effect).

Resolving which effects and mechanisms apply to complex natural systems is a challenging task, especially as both biodiversity and ecosystem functioning can be measured in many ways. Increasingly, attention has shifted away from the traditional focus on species richness towards a consideration of functional diversity and evenness (e.g. Dangles & Malmqvist 2004). Evenness is still relatively rarely addressed in B-EF experiments (but see Hillebrand, Bennett & Cadotte 2008; and references therein), despite the fact that dominance is the rule in natural communities and that shifts in evenness are often a critical precursor to species loss (Hillebrand, Bennett & Cadotte 2008; Woodward 2009).

Tightly controlled microcosm experiments provide a powerful tool for studying B-EF relations (e.g. Bell et al. 2005), and many such studies have been carried out in the past two decades (e.g. Naeem et al. 1994; Jonsson & Malmqvist 2000; McKie et al. 2008; Reiss et al. 2010). While there is a general debate in ecological research about the pros and cons of laboratory experiments vs. field studies, the consensus seems to be that highly controlled experiments offer more than just a stimulus to further research (Benton et al. 2007). More importantly, they can produce the parameters that can be fitted in mathematical models (i.e. they inform modelling) that can in turn be tested on natural assemblages. Also, laboratory experiments can draw the focus on particular mechanisms that might operate in natural assemblages but which are likely to go undetected due to the high complexity of natural communities (Benton et al. 2007). There is an unavoidable trade-off between replication and realism in B-EF experiments: for example, levels of richness manipulated in many (animal-based) B-EF experiments do not exceed two or three species (e.g. Jonsson & Malmqvist 2000, McKie et al. 2008 and see Woodward et al. 2009). Despite their apparent simplicity, these experiments have shown how biodiversity effects arise and, further, their species richness levels often mimic local conditions: for instance, 2–3 species typically dominate the ‘shredder’ guild of detritivores in freshwater ecosystems (Dangles & Malmqvist 2004; Hladyz et al. 2009).

The evaluation of B-EF experiments necessitates choosing suitable statistical models that can account for additive and non-additive effects, and most studies have compared predicted performances, estimated from monocultures, with observed performances in polyculture (e.g. Loreau & Hector 2001). In contrast to many plant studies (e.g. Fox 2005), performance of a single species (or type) in polyculture can often not be measured directly in animal-based studies. Hence, statistical models have to allow for the fact that contributions to the mixture are unknown. Moreover, statistical models are needed that can account for the possibility that species or types might perform differently when they are in combination with other types. For most statistical approaches, this can only be circumvented by including additional density treatments (e.g. Underwood 1984; Jonsson & Malmqvist 2003), which soon becomes logistically challenging when running what are already highly complex experiments (Reiss et al. 2009). Further, species (or types) might behave differently when they are dominant, irrespective of species composition, but this cannot be ascertained with existing approaches.

Traditionally, B-EF experiments have been evaluated in the light of hypothetical relationships between biodiversity and ecosystem processes (e.g. the “redundancy” or “idiosyncratic” hypotheses of the effects of species loss), rather than testing explicitly for underlying mechanisms. Recently, however, some important new insights have been made into identifying the underlying drivers of B-EF relations; e.g. species identity can be more important than species richness (‘identity effect’; e.g. Bruno et al. 2006), habitat alteration as a key mechanism behind interspecific facilitation in stream assemblages (Cardinale, Palmer & Collins 2002), or the interdependence of intra- and interspecific competition and species richness (Finke & Snyder 2008). Such approaches highlight the need to find meaningful parameters to include in future analyses, and principal among these is the consideration of specific species traits (Reiss et al. 2009).

In this context, one of the most relevant animal traits might be body mass because it determines the basal metabolic rate, energy demands, and ingestion rates of an individual, as well as the abundance and biomass of a population as a whole, via well-known allometric scaling relations (Peters 1983; Brown et al. 2004; Woodward et al. 2005; Reiss et al. 2009; Woodward et al. 2010). Because an individual’s performance reflects its mass-dependent metabolic needs, an assemblage’s performance is dependent both upon its total biomass and also on how biomass is apportioned among small or large individuals. It is perhaps surprising then that consumer body mass or biomass has rarely been addressed explicitly in B-EF experiments (but see McKie et al. 2008; Perkins et al. 2010). In most studies, biomass has either been controlled for a priori (e.g. Duffy et al. 2001; Emmerson et al. 2001; Stachowicz et al. 2008) or ignored altogether. Because species identity is often confounded with body mass in many studies, this makes it difficult (or impossible) to distinguish between taxonomic and functional diversity or to develop a more general mechanistic and predictive framework.

We sought to address this by testing the effects of multiple aspects of biodiversity (richness, evenness, traits and species combinations) on two key ecosystem processes in fresh waters: litter decomposition rates and the generation of fine particulate organic matter (FPOM) by invertebrate detritivore ‘shredders’. We developed a suite of statistical models and laboratory experiments to do this. We chose a controlled laboratory experiment to reduce potential confounding effects between species richness (or evenness) and body size (and biomass). Two size classes of three consumer species were used to isolate the effects of functional (i.e. size) diversity from those of species richness per se. We assembled these six types in equal proportions to test for richness effects and in unequal proportions to test for evenness effects. The different combinations of types created a gradient of total biomass in our experimental microcosms. Size classes have been used previously in B-EF experiments to separate effects of small and large species (Long & Morin 2005), but, to our knowledge, not within species. We aimed to test whether types performed independently of each other, i.e. whether effects would be additive, or whether types (possibly of the same species) would influence each other. Finally, we aimed to assess the role of body mass as a potentially key underlying driver of performance.

Materials and methods

Two laboratory experiments were each run for 28 days, in 2007 (‘richness’ experiment) and 2008 (‘evenness’ experiment), respectively. Three common European invertebrate detritivore species were used (the amphipod Gammarus pulex (L.), the isopod Asellus aquaticus (L.) and larvae of the trichopteran Sericostoma personatum (Kirby & Spence)), together with a single food resource (alder leaf litter, Alnus glutinosa L.). In both experiments, we used two body size classes (‘small’ and ‘large’) per consumer species, giving six ‘types’ in total (Table 1). We used three richness levels in the ‘richness’ experiment: monocultures, di-cultures and tri-cultures of types mixed in equal proportions. Three different levels were used in the ‘evenness’ experiment: monocultures, and di-cultures run with either equal proportions of types or with one type dominating numerically.

Table 1.   Experimental designs for two experiments (‘richness’ and ‘evenness’ experiment) using three shredder species and two size classes (giving a total of 6 types, A to F). Number of individuals (N), assemblage identities (I) used and number of treatments (T) for both experiments

Cultures | N | I | T
(a) Richness experiment (type richness level)
  1 | 12 | A, B, C, D, E, F | 6
  2 | 6 + 6 | AB, AC, AD, AE, AF, etc. | 15
  3 | 4 + 4 + 4 | ABC, ABD, ABE, ABF, BCD, etc. | 20
  Microbial control | 0 |  | 1
(b) Evenness experiment (type evenness level)
  1 (one type) | 12 | A, B, C, D, E, F | 6
  2 (two types 1 : 1) | 6 + 6 | AB, AC, AD, AE, AF, etc. | 15
  3 (two types 2 : 1) | 8 + 4 | AAB, AAC, AAD, AAE, AAF, etc. | 30
  Microbial control | 0 |  | 1

(a) Richness experiment: All types were assembled in all possible combinations within three levels of richness (1, 2, and 3). Monocultures contained 12 individuals of one type, di-cultures contained 6 individuals of each type, and tri-cultures contained 4 individuals of each type. (b) Evenness experiment: All types were assembled to give three levels of evenness: level 1 = monoculture, level 2 = types in numerically even proportions (1 : 1), level 3 = one type dominates 2 : 1. A, B = small and large Asellus, respectively; C, D = small and large Gammarus, respectively; E, F = small and large Sericostoma, respectively; I = identity, T = number of treatments.

Organisms and experimental set-up

Individuals were classified as ‘small’ or ‘large’ for each species (4–8 vs. 8–12 mm length for A. aquaticus and G. pulex; 9–14 vs. 14–19 mm length for S. personatum). Average dry body mass for small individuals was 2·1 (±0·2 SE) mg, 2·2 (±0·3 SE) mg and 8·7 (±0·7 SE) mg for A. aquaticus, G. pulex and S. personatum, respectively, and 6·2 (±0·5 SE) mg, 6·7 (±0·4 SE) mg and 16·1 (±1·0 SE) mg for large individuals of A. aquaticus, G. pulex and S. personatum, respectively.

Glass microcosms (11·6 cm wide, 6 cm deep, volume 400 mL), sealed with a 1-mm mesh net cover, were used as experimental units. They were filled with air-dried alder litter (3-g air-dried mass, stalk removed) and subsequently submerged in tanks containing 20 L of degassed tap water and 7·5 L of stream water (see Appendix S1).

Combination of types

After a leaf conditioning period of 1 week, 12 individual shredders were introduced to each microcosm and allowed to feed for 28 days.

In the ‘richness’ experiment, two size classes of three species were used, resulting in six types: A (small Asellus), B (large Asellus), C (small Gammarus), …, F (large Sericostoma) (Table 1). These types were assembled in sets of 1, 2 and 3 types, and this design gave us six monocultures, fifteen ‘di-cultures’ and twenty ‘tri-culture’ treatments (Table 1a). Thus, there were 41 faunal treatments (assemblage identities) in total, in addition to one microbial-only control (Table 1a), which were randomly assigned to microcosms in each block. All treatments were replicated four times (four blocks and 168 microcosms in total).

In the ‘evenness’ experiment, three levels were used, defined as monocultures (only one type), di-cultures with equal proportions of two types and di-cultures with one of the two types dominating (proportions type 1: type 2 = 2 : 1) (Table 1b). All assemblage identities (51) and the microbial control were replicated three times, which resulted in 156 microcosms in total.
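The treatment counts in both designs follow from simple combinatorics. As a minimal sketch (the function names are hypothetical and are not part of the original analysis), the assemblage identities and microcosm totals can be enumerated as follows:

```python
from itertools import combinations, permutations

TYPES = "ABCDEF"  # six types: two size classes within three species


def richness_design(types=TYPES):
    """Assemblage identities for mono-, di- and tri-cultures (equal proportions)."""
    return {k: ["".join(c) for c in combinations(types, k)] for k in (1, 2, 3)}


def evenness_design(types=TYPES):
    """Monocultures, even di-cultures (6 + 6) and uneven di-cultures (8 + 4)."""
    mono = list(types)
    even = ["".join(c) for c in combinations(types, 2)]
    # Ordered pairs: the first type dominates 2 : 1, e.g. 'AAB' = 8 A + 4 B.
    uneven = [dom * 2 + minor for dom, minor in permutations(types, 2)]
    return {1: mono, 2: even, 3: uneven}


def n_microcosms(design, blocks):
    """Faunal treatments plus one microbial control, replicated once per block."""
    n_identities = sum(len(v) for v in design.values())
    return (n_identities + 1) * blocks


richness = richness_design()
evenness = evenness_design()
print({k: len(v) for k, v in richness.items()}, n_microcosms(richness, blocks=4))
print({k: len(v) for k, v in evenness.items()}, n_microcosms(evenness, blocks=3))
```

The uneven di-cultures are ordered pairs (AAB differs from ABB), which is why level 3 of evenness has 30 identities rather than 15.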

Biomass determination and response variables

To determine biomass in the microcosms, high resolution digital photographs were taken at ×100 magnification of every individual that was added to a microcosm before and after the experiment. Subsequently, the length of all individuals was measured with the image analysis software Image-Pro® Plus (Media Cybernetics, Inc., Bethesda, MD, USA) and then converted into dry body mass using log–log body length-body mass relationships derived for each species (Asellus aquaticus: y = 2·652x − 1·841, r2 = 0·94; Gammarus pulex: y = 3·015x − 2·242, r2 = 0·94; Sericostoma personatum: y = 1·822x − 1·029, r2 = 0·94; see Appendix S1). Biomass was estimated by summing the body mass of individuals measured from images taken at the start of the experiment. This way, we knew both average body mass and biomass in each microcosm.
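The conversion from length to biomass can be sketched as below. The slopes and intercepts are those quoted above; the log base (10) and the units (mm for length, mg for dry mass) are assumptions that are not stated in this section, although they give masses of the same order as the size-class averages reported earlier. The function names are hypothetical.

```python
import math

# (slope, intercept) of the log-log length-mass regressions quoted in the text.
LENGTH_MASS = {
    "Asellus aquaticus": (2.652, -1.841),
    "Gammarus pulex": (3.015, -2.242),
    "Sericostoma personatum": (1.822, -1.029),
}


def dry_mass_mg(species, length_mm):
    """Dry body mass from body length.

    ASSUMPTION: the regressions use log10, with length in mm and mass in mg.
    """
    slope, intercept = LENGTH_MASS[species]
    return 10 ** (slope * math.log10(length_mm) + intercept)


def microcosm_biomass_mg(individuals):
    """Total biomass: sum over (species, length_mm) pairs measured at the start."""
    return sum(dry_mass_mg(species, length) for species, length in individuals)


# Example: a hypothetical monoculture of twelve 10-mm Gammarus.
example = [("Gammarus pulex", 10.0)] * 12
print(round(microcosm_biomass_mg(example), 1))
```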

The leaf material remaining at the end of the experiment was oven-dried at 80 °C to constant mass and weighed. Leaf mass loss was then calculated after correcting for losses caused by leaching and microbial activity (after Hladyz et al. 2009). In addition, the FPOM accumulated in the microcosms, which consisted of faeces and finely shredded (<1 mm diameter) leaf material, was dried and weighed separately.

Statistical analysis

To assess B-EF relations in the two experiments, we performed a classical nested analysis of variance (anova). We tested a collection of linear models on data from the 41 (‘richness’ experiment) and 51 (‘evenness’ experiment) assemblage identities (see Table 1). Because the models are related and can be ordered in a hierarchy (Fig. 1), anova compares the goodness-of-fit of the linked models and tests whether a model explains the data significantly better than the smaller models it contains.

Figure 1.

 Diagram of models used, showing how the models are related. All the models used in (A) the ‘richness’ experiment and (B) the ‘evenness’ experiment. The diagram is read from the bottom, starting with the smallest model (model ‘Constant’), and going up to the largest model (model ‘Assemblage Identity’). Each step upwards has additional degrees of freedom, which are calculated from the (increasing number of) parameters of the models by subtracting all the degrees of freedom for lower models. The lines show where a model is a special case of the one above. The lines between the models are labelled by the letters a to f. Each letter represents a different hypothetical mechanism explaining how assemblages affect process rates, over and above those mechanisms included in smaller models: (a) level of richness, 2 degrees of freedom (d.f.); (b) types performing in a characteristic fashion while not influencing each other (i.e. in an additive fashion); d.f. = 5; (c) types change their performance for each level of richness, d.f. = 10; (d) types influence each other in mixture in a way that a to c cannot explain (e.g. facilitation), d.f. = 23; (e) when a type is in majority or minority it changes its performance in a way which cannot be predicted from monoculture, nor be explained by a to c, d.f. = 5; (f) types influence each other in mixture in a way that a to e cannot explain, d.f. = 28. The degrees of freedom for the difference between two models is the positive difference between the number of parameters for those two models. The sum of squares for the difference between the models is the positive difference between the residual sums of squares for fitting each of those two models.
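The degrees of freedom quoted in this caption can be reproduced from the parameter counts in Table 2. The sketch below is illustrative only: the nesting order is inferred from Table 2 and the caption (the figure itself is not reproduced here), and the function name is hypothetical.

```python
def step_degrees_of_freedom(models):
    """D.f. of each model: its parameters minus the d.f. of all models below it.

    `models` maps name -> (number of parameters, names of all smaller models)
    and must be ordered from the smallest model to the largest.
    """
    df = {}
    for name, (n_params, below) in models.items():
        df[name] = n_params - sum(df[b] for b in below)
    return df


RICHNESS_MODELS = {
    "Constant": (1, []),
    "Richness": (3, ["Constant"]),
    "Type": (6, ["Constant"]),
    "Richness + Type": (8, ["Constant", "Richness", "Type"]),
    "Richness*Type": (18, ["Constant", "Richness", "Type", "Richness + Type"]),
    "Assemblage Identity": (
        41,
        ["Constant", "Richness", "Type", "Richness + Type", "Richness*Type"],
    ),
}

EVENNESS_MODELS = {
    "Constant": (1, []),
    "Evenness": (3, ["Constant"]),
    "Type": (6, ["Constant"]),
    "Evenness + Type": (8, ["Constant", "Evenness", "Type"]),
    "Evenness*Type": (18, ["Constant", "Evenness", "Type", "Evenness + Type"]),
    "Dominance": (
        23,
        ["Constant", "Evenness", "Type", "Evenness + Type", "Evenness*Type"],
    ),
    "Assemblage Identity": (
        51,
        ["Constant", "Evenness", "Type", "Evenness + Type", "Evenness*Type", "Dominance"],
    ),
}

richness_df = step_degrees_of_freedom(RICHNESS_MODELS)
evenness_df = step_degrees_of_freedom(EVENNESS_MODELS)
print(richness_df)
print(evenness_df)
```

Under this reading, the model with 8 parameters that combines richness and type adds no degrees of freedom beyond those already taken by its two components, which is consistent with steps (a) to (f) listing no separate entry for it.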

Taken together, our models tested simultaneously for a range of possible biodiversity effects: richness, evenness, sampling and identity effects, additive mechanisms and interactions between types (i.e. performance is altered when types are in polyculture). The models were designed to explain the relationship between richness or evenness and the response variables, starting from a small and simple model, through intermediate models (none of which, to our knowledge, has been used in animal B-EF studies before), to the largest one (Fig. 1).

Collection of models for the ‘richness’ experiment

Five linear models (Fig. 1, Table 2a) were considered to test the effects of types (A to F) and combinations of types (AB, AC, ABC, etc.) on each response variable separately. Our ‘null model’ was the model ‘Constant’ (Fig. 1). From here, the next two more complex models were Model 1 and Model 2, followed by the intermediate Models 3 and 4 and finally the largest model, Model 5 (Fig. 1A).

Table 2.   Possible models for the responses leaf decomposition and FPOM production (y) for two experiments (richness experiment and evenness experiment, where the richness and/or evenness of six types is manipulated). The number of independent parameters is shown for each model. For example, for the first model ‘Richness’, there are richness levels 1, 2 and 3, so there are 3 parameters for this model, which assumes that the response y is different for each level of richness and depends purely on this predictor. These parameters are described in more detail in the third column under ‘response depends on’; they are: level 1, 2 and 3; the coefficients ai, bi, ci and di; the constants f, g and h and all identities. For models 2, 3, 4 and 6, we give the linear regression equation underlying the model. These models include covariates x1 to x6 whose values are the number of individuals of a type in a microcosm. ‘Richness’ can be replaced by ‘evenness’, and the models are identical for the two experiments, except for an additional model (model 6) in the evenness experiment

Model | Number of parameters | Response depends on
(a) Richness experiment
  (1) Richness | 3 | Richness of types (1 type or combinations of 2 or 3 types = 3 levels)
  (2) Type | 6 | y = a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6
  (3) Richness + Type | 8 | Level 1: y = f + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6; Level 2: y = g + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6; Level 3: y = h + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6
  (4) Richness * Type | 18 | Level 1: y = b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6; Level 2: y = c1x1 + c2x2 + c3x3 + c4x4 + c5x5 + c6x6; Level 3: y = d1x1 + d2x2 + d3x3 + d4x4 + d5x5 + d6x6
  (5) Assemblage Identity | 41 | Identities: A, B, C, D, E, F, AB, AC etc., ABC, AEF etc.
(b) Evenness experiment
  (1) Evenness | 3 | Evenness of types (3 levels)
  (2) Type | 6 | y = a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6
  (3) Evenness + Type | 8 | Level 1: y = f + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6; Level 2: y = g + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6; Level 3: y = h + a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6
  (4) Evenness * Type | 18 | Level 1: y = b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6; Level 2: y = c1x1 + c2x2 + c3x3 + c4x4 + c5x5 + c6x6; Level 3: y = d1x1 + d2x2 + d3x3 + d4x4 + d5x5 + d6x6
  (5) Assemblage Identity | 51 | Identities: A, B, C, D, E, F, AB, AC etc., AAB, AAC etc.
  (6) Dominance | 23 | Level 1: y = b1x1 + b2x2 + b3x3 + b4x4 + b5x5 + b6x6; Level 2: y = c1x1 + c2x2 + c3x3 + c4x4 + c5x5 + c6x6; Level 3: as for levels 1 and 2, but instead of parameter di, each type has 2 parameters: ji in the case of dominance, ki in the case of minority

Model 1 ‘Richness’. The response depends on the number of types present (i.e. on the presence of 1, 2 or 3 types; Table 2a). This model is very ‘simple’ and is widely used for B-EF experiments (e.g. Jonsson & Malmqvist 2000). It is significant if the response differs significantly among levels of type (or species) richness, which in our case means among mono-, di- and tri-cultures. Note: we used richness as a factor, whereas some other studies have used it as a covariate (e.g. Cottingham, Brown & Lennon 2001; Bell et al. 2005; Roscher et al. 2005).

Model 2 ‘Type’. Each type has a unique effect, which provokes a characteristic response irrespective of whether the type is combined with other types or not. The response simply depends on additive effects of types. Thus, the response on monoculture A should be α1; the response on di-culture AB should be (α1 + α2)/2; and the response on tri-culture ABC should be (α1 + α2 + α3)/3. The number of organisms of type i present in a given assemblage identity was defined as covariates x1,…,x6. The value for xi is the number of organisms of type i in a microcosm. Hence, this model is equivalent to: y = a1x1 + a2x2 + a3x3 + a4x4 + a5x5 + a6x6 (Table 2a). As there are 12 organisms per microcosm, αi = 12ai. In other words, this model assumed that the response was determined by the additive contribution of types (e.g. performance of a type in di-culture is half of the performance in monoculture). If this model explained the data best, then types do not interact in polyculture. This model is similar to some models used in plant experiments, where it has been called ‘null model’ (Fox 2005) or ‘species identity model’ (Kirwan et al. 2009).
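The additive prediction of the model ‘Type’ can be sketched as follows. The monoculture responses used here are invented purely for illustration (they are not fitted values from either experiment), and the function name is hypothetical.

```python
N_PER_MICROCOSM = 12  # every microcosm contained 12 individuals


def predict_type_model(counts, alpha):
    """Additive prediction y = sum_i a_i * x_i, with a_i = alpha_i / 12.

    `counts` maps type -> number of individuals (x_i) in the microcosm;
    `alpha` maps type -> response in monoculture (alpha_i).
    """
    return sum(alpha[t] / N_PER_MICROCOSM * n for t, n in counts.items())


# HYPOTHETICAL monoculture responses in arbitrary units, for illustration only.
alpha = {"A": 6.0, "B": 12.0, "C": 9.0, "D": 15.0, "E": 18.0, "F": 24.0}

mono_a = predict_type_model({"A": 12}, alpha)                 # alpha_A
di_ab = predict_type_model({"A": 6, "B": 6}, alpha)           # (alpha_A + alpha_B) / 2
tri_abc = predict_type_model({"A": 4, "B": 4, "C": 4}, alpha)  # mean of three alphas
print(mono_a, di_ab, tri_abc)
```

Because the prediction depends only on the counts x1 to x6, the same function covers any mixture of 12 individuals, whatever the proportions of the types.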

Model 3 ‘Richness+Type’. The response predicted by the model ‘Type’ is modified by adding a different constant for each level of the richness factor. Hence, this model assumes that, in addition to additive effects, the number of types present has an effect on the response (Table 2a).

Model 4 ‘Richness*Type’. Effects are additive, but the effects of a type are different for each level of the richness factor, e.g. the response of A is β1 in monoculture, but γ1 in di-culture and δ1 in tri-culture. Hence, the response on monoculture A should be β1; the response on di-culture AB should be (γ1 + γ2)/2; and the response on tri-culture ABC should be (δ1 + δ2 + δ3)/3 (Table 2a). This intermediate model allowed a type to behave differently in polyculture relative to its performance in monoculture, but still in an additive fashion (Table 2). Hence, if this model explained the data best, this would show either positive or negative interactions between types which are simply because of the different numbers of types present in each level of richness. Moreover, if this model explained the data best, it would quantify how types alter their performance, thereby removing the need for additional density treatments.

Model 5 ‘Assemblage Identity’. The response depends on the six types and their specific combination (41 parameters: A to DEF; Table 2a). Note: the term assemblage identity here describes all ‘fauna treatments’ and is not used as a synonym for species- or type identity. Similarly to Model 1, this model is widely used in B-EF studies (e.g. Jonsson & Malmqvist 2000), but importantly, the interpretation of what it explains depends on what else is fitted. Because of our choice of smaller models, this meant that our test for Model 5 was actually a test for certain combinations of types performing in an unexpected manner, e.g. because of facilitation.

‘Evenness’ experiment

The difference between the ‘richness’ and ‘evenness’ experiments is that richness level three is replaced by evenness level three, so the statistical models are very similar to those for the ‘richness’ experiment. Level three of evenness consists of uneven di-cultures: if we label these like even tri-cultures in which one type occurs twice (e.g. AAB), then all models from the ‘richness’ experiment apply to this experiment as well, but with some of the model names changed. Our tests for evenness effects differ from some existing statistical approaches (e.g. Cottingham, Brown & Lennon 2001; Kirwan et al. 2009).

Model 1 ‘Evenness’. This model assumes that the response is significantly different for each level of evenness. Note: if evenness is tested, then richness is explicitly tested too, because richness is included in evenness.

Model 2 ‘Type’. In evenness level three, the dominating type can be treated as two ‘even’ types. For example, the response to assemblage identity AAB in the model ‘Type’ should be (2*α1 + α2)/3.

The other models are as follows: Model 3 ‘Evenness + Type’, Model 4 ‘Evenness*Type’ and Model 5 ‘Assemblage Identity’, which can be used to interpret the data as explained above for the richness experiment. However, it is also possible that two different parameters can be assumed for a type depending on whether it is present in majority or minority (Table 2b), which gives an additional model (we have called it Model 6 for consistency with the ‘richness’ experiment, but in the model hierarchy, it resides below Model 5 ‘Assemblage Identity’):

Model 6 ‘Dominance’. This model is the same as model ‘Evenness*Type’ for evenness levels 1 and 2, but for level 3, the response depends on whether a type is present in majority or minority. For example, the response of AAB should be (2*ϕ1 + κ2)/3 and for ABB (κ1 + 2*ϕ2)/3 (Table 2b). This model explains the data best if a type changes performance because it dominates an assemblage in terms of abundance.
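A minimal sketch of the level-3 prediction under the model ‘Dominance’ is given below. The parameter values are invented for illustration only and the function name is hypothetical; the point is that each type carries one parameter for when it dominates (ϕ) and another for when it is in the minority (κ).

```python
def predict_dominance(dominant, minor, phi, kappa):
    """Predicted response of an uneven di-culture (8 dominant + 4 minority).

    `phi[t]` is the parameter of type t when it dominates,
    `kappa[t]` its parameter when it is in the minority.
    """
    return (2 * phi[dominant] + kappa[minor]) / 3


# HYPOTHETICAL parameters in arbitrary units, for illustration only.
phi = {"A": 6.0, "B": 12.0}     # performance when dominant
kappa = {"A": 9.0, "B": 15.0}   # performance when in the minority

aab = predict_dominance("A", "B", phi, kappa)  # (2*phi_A + kappa_B) / 3
abb = predict_dominance("B", "A", phi, kappa)  # (kappa_A + 2*phi_B) / 3
print(aab, abb)

# If phi and kappa coincide for every type, the prediction reduces to an
# additive form with a single parameter per type, (2*alpha_1 + alpha_2)/3.
same = predict_dominance("A", "B", phi, phi)
print(same)
```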

Analysis of variance for the six models and other statistics

All statistical analyses were performed on untransformed data. anova was performed to identify the smallest possible model consistent with the data and the relevant parameters (e.g. parameters α1 to α6 in model ‘Type’) to be included. ‘Block’ effects were added when anova was performed. Because the largest model is defined by the factor ‘assemblage identity’ but some of the smaller models include covariates, there is no standard statistical package that will do the appropriate analysis of the whole collection of models in a single pass. In this situation, the standard procedure is to fit each model individually, then use the residual sums of squares and degrees of freedom from all of the models (see Fig. 1) to calculate the anova table (cf. Bell et al. 2005). For comparison with other B-EF experiments, we further calculated r2 and the Akaike Information Criterion for each model, including Blocks in the model in every case. The whole collection of models was tested in a single statistical analysis. Each row in the anova table represents not a model but the difference between a larger model and the next smaller one (Grafen & Hall 2002; Bailey 2008). Importantly, this means that if a larger model explains the data well, but not significantly better than a smaller model, then the corresponding P-value will reflect this (Grafen & Hall 2002; Bailey 2008).
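How a single row of such an anova table is assembled can be sketched as follows. This is an illustration rather than the code used for the analysis: the function name is hypothetical, and the inputs are the rounded values for ‘Type’ (leaf decomposition, ‘richness’ experiment) as reported in Table 3, so the resulting F-ratio agrees with the tabulated one only up to rounding.

```python
def anova_row(ss_difference, df_difference, ss_error, df_error):
    """Mean square and F-ratio for the difference between two nested models.

    ss_difference: residual SS of the smaller model minus that of the larger one
    df_difference: parameters of the larger model minus those of the smaller one
    ss_error, df_error: residual SS and d.f. of the largest model (with blocks)
    """
    ms_difference = ss_difference / df_difference
    ms_error = ss_error / df_error
    return ms_difference, ms_difference / ms_error


# Rounded sums of squares for 'Type' and 'Error' taken from Table 3(A).
ms_type, f_type = anova_row(0.003859, 5, 0.001138, 120)
print(f"MS = {ms_type:.6f}, F = {f_type:.1f}")
```

Each F-ratio uses the error mean square of the largest model, so a row tests only what its model adds beyond the smaller models beneath it.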

To illustrate the fit of the model ‘Type’, we regressed predicted against observed values in a linear regression. This regression, in combination with the anova table, demonstrates how well the model ‘Type’ can predict performance in polyculture from known monoculture values.

The relationship between faunal biomass in all treatments and the response variables (decomposition rates and FPOM production) was characterized using linear regression. To compare the feeding performance of small and large individuals, leaf mass loss was compared between monocultures using analysis of variance (anova).

Except for those steps carried out by hand, we used the statistical software R, version 2.5.1 and Minitab 15 for our statistical analysis.

Results

B-EF relations: type richness, type evenness and assemblage identities

Testing our collection of five models via anova revealed that levels of type richness (including species richness) had no significant effect on either response variable (Table 3) in the first experiment. Testing the collection of six models gave the same conclusion for levels of type evenness in the second experiment (Table 3). This result was not because of a lack of statistical power: the total number of samples used in our experiments was high (164 and 153 for the ‘richness’ and ‘evenness’ experiments, respectively), which gave 120 error degrees of freedom for the ‘richness’ experiment and 100 for the ‘evenness’ experiment. For the model ‘Richness’ to be statistically significant at the 5% level, the F-value would have to be above 3·07. Increasing the number of samples to infinity would still only change this value to 3·00, which is larger than both F-values for richness and both F-values for evenness.
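The two critical values quoted above can be checked without statistical tables, because the F-distribution with 2 numerator degrees of freedom has a closed-form tail probability. The sketch below relies on that identity; the function names are hypothetical.

```python
import math


def f_critical_2df(df_error, alpha=0.05):
    """Upper-alpha critical value of F with 2 and `df_error` degrees of freedom.

    For 2 numerator d.f., P(F > f) = (1 + 2 * f / df_error) ** (-df_error / 2),
    which can be solved for f directly.
    """
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)


def f_critical_2df_limit(alpha=0.05):
    """Limit for infinitely many error d.f.: F(2, inf) is chi-square(2) / 2."""
    return -math.log(alpha)


print(round(f_critical_2df(120), 2))      # error d.f. of the 'richness' experiment
print(round(f_critical_2df_limit(), 2))   # infinitely many samples
```

With 120 error degrees of freedom the threshold is 3·07, and it approaches 3·00 as the error degrees of freedom grow without bound, so additional replication could lower the threshold only marginally.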

Table 3.   Analysis of variance testing models predicting leaf decomposition and FPOM production in experimental microcosms for (A) the richness experiment and (B) the evenness experiment

Columns 3 to 6 refer to leaf decomposition rate and columns 7 to 10 to FPOM production.

Source | d.f. | SS | MS | F | P | SS | MS | F | P
(A) Richness experiment
  Richness | 2 | 0·000009 | 0·000005 | 0·49 | n.s. | 0·000000 | 0·000000 | 0·16 | n.s.
  Type | 5 | 0·003859 | 0·000772 | 81·37 | <0·0005 | 0·000129 | 0·000026 | 26·47 | <0·0005
  Richness*Type | 10 | 0·000127 | 0·000013 | 1·34 | n.s. | 0·000003 | 0·000000 | 0·35 | n.s.
  Assemblage Identity | 23 | 0·000105 | 0·000005 | 0·48 | n.s. | 0·000037 | 0·000002 | 1·65 | n.s.
  Block | 3 | 0·000067 | 0·000022 |  |  | 0·000061 | 0·000020 |  |
  Error | 120 | 0·001138 | 0·000009 |  |  | 0·000117 | 0·000001 |  |
  Total | 163 | 0·005306 |  |  |  | 0·000347 |  |  |
(B) Evenness experiment
  Evenness | 2 | 0·000021 | 0·000011 | 0·90 | n.s. | 0·000009 | 0·000004 | 2·59 | n.s.
  Type | 5 | 0·003272 | 0·000654 | 54·94 | <0·001 | 0·000470 | 0·000094 | 54·09 | <0·001
  Evenness*Type | 10 | 0·000053 | 0·000005 | 0·45 | n.s. | 0·000015 | 0·000001 | 0·84 | n.s.
  Dominance | 5 | 0·000052 | 0·000010 | 0·87 | n.s. | 0·000008 | 0·000002 | 0·89 | n.s.
  Assemblage Identity | 28 | 0·000415 | 0·000015 | 1·25 | n.s. | 0·000035 | 0·000001 | 0·72 | n.s.
  Block | 2 | 0·000016 | 0·000008 |  |  | 0·000008 | 0·000004 |  |
  Error | 100 | 0·001191 | 0·000012 |  |  | 0·000174 | 0·000002 |  |
  Total | 152 | 0·005020 |  |  |  | 0·000719 |  |  |

Each row in the ANOVA table represents not a model, but the difference between a larger model and the next smaller one. See Fig. 1 for how the models are related.

In the ‘richness’ experiment, the mean values for levels 1, 2 and 3 of richness were 0·0129 ± 0·0005 SE, 0·0127 ± 0·0003 SE and 0·0126 ± 0·0005 SE g leaf DW loss day−1, respectively, for leaf decomposition; and 0·0022 ± 0·0002 SE, 0·0022 ± 0·0001 SE and 0·0021 ± 0·0001 SE g FPOM DW day−1, respectively, for FPOM production. In the ‘evenness’ experiment, the averages for evenness levels 1, 2 and 3 (monocultures, even di-cultures and uneven di-cultures) were 0·0092 ± 0·0008 SE, 0·0081 ± 0·0005 SE and 0·0081 ± 0·0004 SE g leaf DW loss day−1, respectively, for leaf decomposition; and 0·0028 ± 0·0003 SE, 0·0021 ± 0·0002 SE and 0·0020 ± 0·0001 SE g FPOM DW day−1, respectively, for FPOM production.

For both leaf decomposition and production of FPOM, the differences between assemblage identities could be accounted for with the ‘Type’ model, which assumes purely additive effects of types in polycultures (Table 3). Because this model explains the observed data well, the fitted parameters in the ‘Type’ model reflect the average performance (in mg DW day−1) of each type for leaf decomposition and FPOM production (see Appendix S2).
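What ‘purely additive’ means for a prediction can be sketched in a few lines of Python. The sketch assumes one fitted parameter per type, expressed as the rate a monoculture of 12 individuals of that type would achieve, and predicts a polyculture as the abundance-weighted mean of those parameters. The parameter values below are hypothetical placeholders, not the fitted values from Appendix S2, and the function name is illustrative.

```python
# Hypothetical per-type parameters (g leaf DW loss per day for a
# monoculture of 12 individuals); NOT the fitted values from Appendix S2.
TYPE_RATE = {
    "A": 0.008, "B": 0.014,   # small and large Asellus aquaticus
    "C": 0.010, "D": 0.018,   # small and large Gammarus pulex
    "E": 0.009, "F": 0.016,   # small and large Sericostoma personatum
}


def predict_additive(counts, n_total=12):
    """Predict an assemblage's rate under the additive 'Type' model.

    counts maps each type to its number of individuals; each type
    contributes its monoculture rate weighted by its share of individuals.
    """
    if sum(counts.values()) != n_total:
        raise ValueError("counts must sum to the number of individuals per microcosm")
    return sum(TYPE_RATE[t] * n / n_total for t, n in counts.items())


monoculture = predict_additive({"A": 12})
triculture = predict_additive({"A": 4, "D": 4, "E": 4})
print(monoculture, triculture)
```

Under this model an even tri-culture is simply the mean of the three monoculture rates, so no information beyond the six type parameters is needed to predict any assemblage.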

Values for r2 for the model ‘Type’ were 0·74 and 0·55 for leaf mass loss and FPOM, respectively, in the ‘richness’ experiment and 0·65 and 0·67 for leaf mass loss and FPOM, respectively, in the ‘evenness’ experiment. The models ‘Richness’ and ‘Evenness’ had r2 values of ≤0·18 in all cases. Expanding the six-parameter model ‘Type’ to the full model ‘Assemblage Identity’ increased r2 by no more than 0·11 in any case (Appendix S3).

Another way of seeing how well the model ‘Type’ explains the data is to regress the observed means for the assemblage identities against the predicted values from the model ‘Type’: the observed-predicted regression yields an r2 of 0·94 and 0·86 for leaf decomposition rates and 0·76 and 0·88 for FPOM production in the ‘richness’ and ‘evenness’ experiments, respectively. In both experiments, the fitted parameters of the models ‘Type’ and ‘Richness*Type’ (or ‘Evenness*Type’) were very similar (Appendix S2). For both experiments, the predicted values for decomposition and for FPOM were very close in the two models: comparing them in a regression yields an r2 of 0·91 and 0·89 for leaf mass and for FPOM, respectively, for ‘richness’, and an r2 of 0·93 and 0·73 for leaf mass and for FPOM, respectively, for ‘evenness’.

Effects of consumer biomass and body size

In the ‘richness’ experiment, total consumer biomass of the 12 shredders per microcosm showed a strong positive relationship with rates of decomposition (r2 = 0·77, Fig. 2A) and FPOM production (r2 = 0·71, Fig. 2B). In the ‘evenness’ experiment, decomposition rates and FPOM production also increased with biomass, with r2 values of 0·75 and 0·73, respectively (Fig. 2C,D).

Figure 2.

 Relationship between total consumer biomass (g DW) and response variables (g DW) in the ‘richness’ experiment, where the responses were (A) decomposition rate per microcosm and (B) production of FPOM per microcosm. The same responses were measured in the ‘evenness’ experiment (panels C and D, respectively). Data points represent the mean value for all replicates of each treatment. Regression equations are as follows: (A) y = 0·170x + 0·002, r2 = 0·765, P < 0·001, 95% CI: 0·140, 0·201; (B) y = 0·033x − 0·001, r2 = 0·707, P < 0·001, 95% CI: 0·026, 0·040; (C) y = 0·163x − 0·001, r2 = 0·746, P < 0·001, 95% CI: 0·136, 0·190; (D) y = 0·061x − 0·001, r2 = 0·733, P < 0·001, 95% CI: 0·035, 0·071. Note the difference in y-axis scales.

Within the given biomasses, the body mass of consumers also affected process rates, which became obvious in the monocultures, where body masses were not mixed. Within the monocultures, decomposition rates were slower for assemblages of small consumers than for larger conspecifics (F5,18 = 80·52, P < 0·001 and F5,12 = 14·28, P < 0·001 for ‘type richness’ and ‘type evenness’, respectively; Fig. 3), but the opposite was true for rates per unit consumer biomass (F5,18 = 9·08, P < 0·001 and F5,12 = 2·15, P = 0·13 for ‘type richness’ and ‘type evenness’, respectively; Fig. 3). In general, rates differed between faunal and microbial control treatments (Appendix S4).

Figure 3.

 Leaf decomposition (±SE) per 12 shredders and per unit of consumer biomass (1 g shredder DW) in the monocultures of small and large Asellus aquaticus (types A & B), small and large Gammarus pulex (types C & D) and small and large Sericostoma personatum (types E & F). Leaf decomposition was very similar for the small and large individuals of the same species when total leaf decomposition was compared in both the richness (A) and evenness (B) experiments. However, when leaf mass loss was expressed as decomposition rate per unit biomass, the response variable was significantly higher for the small individuals than for larger ones of the same species (***P < 0·001) in both the richness (C) and evenness (D) experiments.

Discussion

In contrast to many previous B-EF studies, we found no effect of species richness on ecosystem functioning. However, the main focus of our controlled laboratory set-up was to extract mechanisms that underlie the performance of organisms. Consumers behaved in ‘an additive fashion’, which means that they did not influence each other regarding process rates and that performance in polycultures was predictable from monoculture. Consumer combinations provoked specific responses that can be described as assemblage identity effects. However, again, this effect was predictable from monocultures. In terms of the statistical models used, this means that the model ‘Assemblage Identity’ did not explain more than the model ‘Type’ (it would have been significant if the ‘type-model’ and intermediate models had not been fitted). This means that we would have reported an assemblage identity effect if we had used a less complex statistical analysis.

Two main characteristics of each ‘type’ were clearly its biomass and body mass. In other words, one aspect of what made a consumer perform in the way it did was the total biomass it represented and how large it was. Although it might seem trivial that resource uptake is mass specific (and is even predicted to follow a simple ¼ power law scaling [cf. Peters 1983; Brown et al. 2004]), this aspect is often ignored in B-EF studies (but see Ruesink & Srivastava 2001; Perkins et al. 2010). Body mass and metabolism (and hence resource uptake) were clearly key predictors of consumer-driven ecosystem processes that operate at the level of the individual and, by extension, at the population and assemblage level. These variables have rarely been addressed explicitly in previous B-EF research, which seems surprising given the potential insight they can provide into the mechanisms behind biodiversity effects. Parallel developments in trophic ecology are also now starting to reveal how the distribution of individual body sizes plays a key role in structuring food webs (Woodward et al. 2010).

Our results show that the evaluation of body size is necessary to allow interpretation of observed biodiversity effects. For example, it is conceivable that many of the strong ‘sampling effects’ (where diverse assemblages are more likely to contain a high-performing species) and ‘identity effects’ (where certain identities perform better than others) described in previous studies might be underpinned by biomass effects, i.e. identities perform best because they have the highest biomass (Jonsson & Malmqvist 2000). A recent meta-analysis of 111 experimental manipulations of species richness within different trophic groups showed that resource depletion by the most species-rich polyculture tends not to differ from the best performing monoculture (Cardinale et al. 2006). Unfortunately, biomass was not included in the analysis, but our results suggest that, as it is an important underlying driver of individual-based rates, it might explain why some species or identities perform better than others: a reanalysis of existing data, where possible, could test these ideas more rigorously.

Performance was connected to body size in our experiments, which shows that it is important to consider body mass distributions within a given biomass to avoid misinterpretation of observed biodiversity effects. Several previous studies have recognized, at least implicitly, that biomass is a potential driver of ecosystem processes (Emmerson & Raffaelli 2000) or assumed it to be constant in experimental set-ups (e.g. Duffy et al. 2001; Emmerson et al. 2001; Stachowicz et al. 2008). In the latter case, it is assumed that detecting biodiversity effects requires the exclusion of biomass effects a priori. However, our results suggest this approach might be problematic: it is important to know whether total biomass is composed of many small or a few large individuals, because the former have higher mass-specific metabolic rates than the latter.
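The point about many small versus a few large individuals can be illustrated with a short calculation. The Python sketch below assumes that individual metabolic rate scales with body mass to the power ¾, so that mass-specific rate scales to the power −¼, following the scaling cited earlier (Brown et al. 2004). The body masses, the normalisation constant and the function name are hypothetical and serve only to show the direction and size of the effect.

```python
def assemblage_rate(n_individuals, body_mass, exponent=0.75, b0=1.0):
    """Total metabolic demand of n identical individuals.

    Assumes individual rate = b0 * body_mass ** exponent (arbitrary units);
    both the exponent and b0 are illustrative assumptions.
    """
    return n_individuals * b0 * body_mass ** exponent


# Two assemblages with the same total biomass (1.0 in arbitrary units)
many_small = assemblage_rate(n_individuals=100, body_mass=0.01)
few_large = assemblage_rate(n_individuals=10, body_mass=0.1)

ratio = many_small / few_large
print(f"small-bodied assemblage demand / large-bodied demand = {ratio:.2f}")
```

Under these assumptions a tenfold difference in body mass at equal total biomass gives the small-bodied assemblage a demand higher by a factor of 10^0.25, which is why holding biomass constant does not by itself remove body size effects.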

It is also difficult to adjust biomasses without creating treatments that differ in total abundance, which could introduce confounding density effects unless large numbers of additional treatments are also added. This situation often forces researchers to use either similar-sized organisms or to use mixtures of either equal biomass or abundance. We chose to control the number of individuals, because we were able to isolate the biomass effect while accounting for density effects by embedding novel statistical approaches within the design of the experiment to test for additivity.

The organisms used in our study all belonged to the same feeding guild and were contributing to the same process. In this case, metabolic requirements determined how the process was sustained. For natural systems, our results imply that ecosystem functioning does not necessarily depend on taxonomic diversity per se and that many consumers are functionally identical for a given process rate, once differences in body mass and metabolism have been accounted for. When species contribute to the same process, do not complement each other and behave in an additive way, then effects of richness may not be evident in the field or in experiments.

Additive effects do not necessarily imply redundancy: in the case of complementarity or multifunctionality, richness effects can arise when species behave in an additive fashion. For example, when species complement each other and perform in an additive way, then ecosystem process rates will decline as species are lost. Non-complementarity and additive effects can also result in species richness effects when multiple responses (i.e. a better approximation to true ecosystem functioning), rather than single process rates, are considered (e.g. Hector & Bagchi 2007; Gamfeldt, Hillebrand & Jonsson 2008), and a new generation of B-EF experiments and analyses is starting to address these questions (Reiss et al. 2009; Ptacnik, Moorthi & Hillebrand 2010).

Widening the context to plant or fungal B-EF studies, it is worth pointing out that here biomass is traditionally used as a response to biodiversity (rather than as a predictor, as in our case) (e.g. Engelhardt & Ritchie 2001; Caldeira et al. 2005). This is not surprising given that body mass of individual plants is not fixed and can vary with nutrient availability, etc. However, it might be important here to use biomass production (rather than standing biomass) as a response variable (e.g. Cardinale et al. 2007). Plant studies typically have a very different experimental design and statistics compared with animal-based studies. For example, seeds are sown, and species richness/evenness effects are observed over longer time frames (e.g. Wilsey & Stirling 2007; Kirwan et al. 2007), which means they are non-orthogonal experiments (see Kirwan et al. (2009) for a statistical approach to richness and evenness effects in plant studies).

Our statistical models can be adapted and applied to a wide range of B-EF experiments because they tested for the effects of richness, evenness, identity and a range of interactions simultaneously (see Appendix S5 for a worked example including our data). They provided a further advantage in that suites of additional density treatments (i.e. as they have been applied in previous experimental designs [e.g. Jonsson & Malmqvist 2003]) can be avoided, because models such as ‘Richness*Type’ and ‘Evenness*Type’ can allow for a type (or species) to perform differently when it is in combination with others while still behaving in an additive fashion. Additive here means that the type can have an unexpected performance when in polyculture, but in a way that is characteristic for the level of richness or evenness. For example, it is conceivable that small A. aquaticus would have altered their performance in tri-culture because their abundances were low and intraspecific interactions changed. However, although they would not have shown the expected value of 1/3 of monoculture performance, there would have still been a single parameter for small A. aquaticus in tri-cultures that would have not been affected by which other types are present in the tri-culture. Importantly, this means their altered performance could have been quantified (value of the fitted parameter). Because body size classes were not the same for all three species, our model ‘Type’ offered a way to test for the effects of both species identity and body size: because it was significant, we could then test whether body size was an underlying mechanism that explained the performance of a type. We did this by additional regressions of body mass and biomass against the response, but this could also be carried out by fitting body mass as a covariate.
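The parameter bookkeeping behind these models can be sketched for the richness experiment. The Python code below assumes that every possible mono-, di- and tri-culture of the six types was used, which is not stated in this section although it is consistent with the degrees of freedom in Table 3. It counts one parameter per type for each richness level in ‘Richness*Type’, as described above; the variable names are illustrative.

```python
from itertools import combinations

types = ["A", "B", "C", "D", "E", "F"]
richness_levels = [1, 2, 3]

# Every distinct assemblage identity: all subsets of 1, 2 or 3 types
# (assumed design; see the hedge in the accompanying text).
assemblages = [
    combo for k in richness_levels for combo in combinations(types, k)
]
n_assemblages = len(assemblages)

# 'Type': one parameter per type.
n_type_params = len(types)

# 'Richness*Type': one parameter per type for each richness level.
n_richness_type_params = len(types) * len(richness_levels)

# Extra parameters needed to give each assemblage identity its own mean.
extra_df_assemblage_identity = n_assemblages - n_richness_type_params

print(n_assemblages, n_type_params, n_richness_type_params,
      extra_df_assemblage_identity)
```

Under these assumptions there are 41 assemblage identities and 18 ‘Richness*Type’ parameters, leaving 23 additional parameters for ‘Assemblage Identity’, which matches the d.f. of that row in Table 3(A).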

Our ‘larger’ statistical models (e.g. the one allowing for altered performance for each level of richness) did not explain our results better than the ‘smaller’ model that assumed purely additive effects of types, i.e. individuals did not alter their performance in combination with others, nor did they show facilitative or negative interactions. However, these models could be extended to give new insights into how biodiversity effects arise (i.e. how and when individuals interact), because they generate values (‘fitted parameters’) for how individuals alter their performance in polyculture. This could be used, for instance, to derive quantitative assessments of facilitative effects (e.g. when using individuals with very different feeding strategies). For most B-EF studies, ANOVA will generally be the method of choice (Schmid et al. 2002), and different experimental designs and questions will require different linear models (or different collections of linear models), as we have shown with our experiments. Because our experiments were very controlled, we were able to develop ‘simple and small’ models that explained the data effectively, and we therefore did not need to apply the types of large and complex models that have been developed elsewhere. We would like to emphasise that statistics addressing ‘richness’ or ‘evenness’ effects cannot necessarily be transferred from one experiment to the next if the experiments differ in design.

It is crucial that experimental analysis and models are as simple and parsimonious as possible (Bailey 2008), especially in a field of research that is already riddled with complexity and severe logistical constraints upon feasible experimental designs (Reiss et al. 2009). We have demonstrated that including individual body mass measures can help provide a deeper interpretation of B-EF relationships, at least in faunal assemblages. Further, it is possible to develop statistical analyses that offer novel ways to reduce experimental treatments and that simultaneously test for different mechanisms behind biodiversity effects. Finally, we advocate more transparency and completeness in the description and statistical design of B-EF experiments (see Bell et al. 2009; Kirwan et al. 2009): the challenge now is to extend these techniques to other assemblages and ecosystem processes to test the generality of the findings reported here.

Acknowledgements

This work was funded by a grant from the Natural Environment Research Council, UK (grant reference: NE/D013305/1), awarded to GW.
