Plant–soil feedbacks: a meta-analytical review

Authors


* E-mail: andrewkulmatiski@hotmail.com

Abstract

Plants can change soil biology, chemistry and structure in ways that alter subsequent plant growth. This process, referred to as plant–soil feedback (PSF), has been suggested to provide mechanisms for plant diversity, succession and invasion. Here we use three meta-analytical models: a mixed model and two Bayes models, one correcting for sampling dependence and one correcting for sampling and hierarchical dependence (delta-splitting model) to test these hypotheses. All three models showed that PSFs have medium to large negative effects on plant growth, and especially grass growth, the life form for which we had the most data. This supports the hypothesis that PSFs, through negative frequency dependence, maintain plant diversity, especially in grasslands. PSFs were also large and negative for annuals and natives, but the delta-splitting model indicated that more studies are needed for these results to be conclusive. Our results support the hypotheses that PSFs encourage successional replacements and plant invasions. Most studies were performed using monocultures of grassland species in greenhouse conditions. Future research should examine PSFs in plant communities, non-grassland systems and field conditions.

Introduction

In the past 5 years, there has been a rapid increase in theoretical and experimental plant–soil feedback (PSF) research. This research has suggested that PSFs are an under-explored factor that can determine plant abundance, persistence, invasion and succession (Bever 1994, 2003; Callaway et al. 2004b; Ehrenfeld 2005; Kardol et al. 2007). In brief, this field of study examines how plants, through root exudation, root deposition and susceptibility to enemies and symbionts, change the soil and whether these changes increase or decrease subsequent plant growth. If plants increase the growth of conspecifics, this process results in a type of positive frequency dependence called positive individual PSF. Positive PSFs are expected to increase plant abundance, persistence and the ability of a plant to invade communities where plants realize negative PSFs. Because of the growing number of experimental studies that measure PSFs, it is now possible to use meta-analyses to test the size and direction of PSFs across plant species and functional groups (e.g. grasses, perennials). However, methodological differences among studies require that previously untested biases associated with these methodologies also be examined.

Which species, processes and ecosystems are most likely to be affected by PSFs?

Research in successional and invaded plant communities has dominated PSF research (Kulmatiski & Kardol 2008). Two competing hypotheses regarding the role of PSFs in succession have been developed. The first hypothesis grew from the observation that enemy accumulation encourages species replacements (Van der Putten et al. 1993). Several experimental studies have supported the hypothesis that enemy accumulation (i.e. negative PSFs) accelerates succession in early-successional communities while positive PSFs encourage persistence in late-successional communities (Van der Putten et al. 1988; Kardol et al. 2006, 2007).

Alternatively, a second hypothesis predicts that PSFs are positive early in succession and become more negative later in succession (Reynolds et al. 2003). In this hypothesis, symbioses are assumed to be critical to plant growth in high stress (i.e. early-successional, high latitude and high altitude) growth conditions (Reynolds et al. 2003). As plant growth increases across successional sequences, pathogen accumulation is expected to produce negative PSFs (Reynolds et al. 2003). Few studies have explicitly addressed the role of PSFs in successional systems, but many PSF experiments have been performed on early-, mid- and late-successional plant species and communities. A review of published data, therefore, can be expected to identify patterns of PSFs created by these species.

Plant–soil feedbacks have also gained attention as a mechanism that could explain the abundance and persistence of non-native, invasive plants (Reinhart & Callaway 2006). More specifically, soils in introduced habitats are expected to be relatively enemy-free and symbiont-rich, because root herbivores and pathogens have not co-evolved to specialize on non-native plant species, while common symbionts are generalists (Callaway & Aschehoug 2000). Not all non-native plants will benefit from enemy release, but those that do are more likely to become successful invaders. If non-native plants can perpetuate or accentuate favourable soil conditions, then non-native plants can be expected to realize less negative PSFs than those realized by congeners in their home range or by other species in their introduced range (Klironomos 2002; Agrawal et al. 2005; Reinhart & Callaway 2006). The increase in non-native plant growth associated with less negative PSFs can be predicted to result in the competitive exclusion of native plants (Bever et al. 1997; Bever 2003).

There has been little discussion of the potential differences in PSFs among different plant functional groups or ecosystems, although differences might be expected. For example, some plant functional groups with fast growth rates and poor constitutive defences, such as annuals, may be more susceptible to belowground enemies and hence more likely to experience negative PSFs than other functional groups. Although this hypothesis has not been previously explored, testing differences in PSFs among plant life forms could be expected to improve understanding of the role of PSFs in different ecosystems or successional stages. As more research addresses the role of PSFs, it is becoming possible to determine whether there is broad support for this hypothesis.

Measuring PSFs

Plant–soil feedback research is founded on two concepts: (i) plants cause species-specific changes to soils and (ii) plants demonstrate species-specific responses to these changes (Bever 1994; Ehrenfeld et al. 2005). Thus, a PSF experiment incorporates two phases. In phase I, soils are cultivated by known plant species. In phase II, plants are grown on self-cultivated (self) and non-self-cultivated (other) soils. The difference in plant growth between these two soil types is a measure of PSF. When PSFs are used to describe the growth of a plant species on self- and other-soils, this is called a direct or individual PSF. When PSFs are used to describe the growth of two plant species on their own and each other's soils, these data can be used to determine an indirect or net-pairwise PSF (Bever et al. 1997; Mills & Bever 1998; Reynolds et al. 2003).

Individual PSFs provide information about a species’ relationship with its soils (i.e. positive or negative). A positive individual PSF occurs when a plant grows better on self-cultivated soils than on other-cultivated soils. A negative individual PSF occurs when a plant grows better on other-cultivated soils than on self-cultivated soils. Net-pairwise PSFs, in contrast, can be used to make predictions of competitive exclusion or coexistence among specific pairs of plant species. Theoretical models have demonstrated that net-pairwise PSFs can be more important to plant growth than the direction or magnitude of individual PSFs (Bever 2003; Eppinga et al. 2006), but few studies have produced the data needed to calculate net-pairwise PSFs. As a result, throughout this review we discuss individual PSFs, which have been found to be positively correlated with plant abundance on the landscape (Klironomos 2002).

Researchers have used many different methods to conduct PSF experiments; each was developed to address particular questions, but each often has limitations (Kulmatiski & Kardol 2008). These methods need to be examined for consistent methodological biases. Soils in phase I, for example, have been cultivated by naturally occurring plants (natural experiment) or by experimentally grown plants (manipulative experiment). The natural experiment approach can eliminate the time required for phase I, and field-collected soils may reflect more natural soil conditions, but this approach is susceptible to uncontrolled differences among sampling sites (Troelstra et al. 2001; Baack et al. 2006; Ellis & Weis 2006).

In phase II, both species-level and community-level plant growth responses to different soils have been measured. Most studies have examined species-level responses, but within this approach the growth of the target species has been measured as the response of a single individual, multiple individuals or individuals of the target species within plant communities. We refer to these differences as phase II neighbourhood differences. In contrast, community-level PSFs are performed by measuring the growth of all plants in a community on self- and other-soils (De Deyn et al. 2004; Kulmatiski et al. 2006; Kardol et al. 2007). Community-level responses have rarely been measured, but they are likely to be important because plants in field conditions grow in communities.

This study addresses the following questions: (i) Do early-successional species (i.e. annuals and biennials) realize more negative feedbacks than late-successional species (i.e. perennials)? (ii) Do native species realize more negative feedbacks than non-native species? (iii) Do PSFs differ among life forms (i.e. grass, forb, shrub or tree)? (iv) Does the presence of conspecific or heterospecific competitors in phase II (i.e. phase II neighbourhood) exaggerate PSFs? (v) Do single plant species cultivate and respond to soil changes differently than plant communities? We also ask whether differences in experimental approaches influence PSFs, more specifically: (vi) Do natural and manipulative experiments produce different PSFs? and (vii) Do greenhouse experiments overestimate PSFs?

Methods

We compiled a data set that included measurements of plant growth of target species on self-cultivated (self) and other-cultivated (other) soils. Self-soils were either experimentally cultivated by a target species or field collected in an area that was described as dominated or co-dominated by the target species. Other-soils were either sterilized or cultivated by non-target plant species. This simple ruleset for data collection provided a robust basis for the meta-analytical approach (Lortie & Callaway 2006).

All manuscripts were located by searching keywords in Web of Science for the terms ‘plant, soil and feedback’, ‘soil, feedback and experiment’ or ‘plant, soil and transplant’, by examining references within these manuscripts and by obtaining unpublished data. We excluded manuscripts that (i) examined only the effects of components of the soil community (e.g. pathogens, fungi or mycorrhizae), (ii) examined only nitrogen-fixing species, because these were expected to produce a sampling bias toward positive PSF, or (iii) focused solely on agricultural systems.

We treated experiments where different species were subjected to the same treatments or the same species were subjected to different treatments, as separate experiments (Gurevitch & Hedges 1999, 2001). Different measured response variables on the same experiment were excluded. Aboveground biomass was the most commonly used response variable. Where other response variables were reported, the response variable that linked best to aboveground biomass was used.

For this meta-analysis, seven research questions were identified. Here we summarize these questions by first referring to the covariate name and then by specifying the comparison to be made among the levels of the covariate.

  1. Length of life cycle: annual vs. biennial vs. perennial.
  2. Life form: grass vs. forb vs. shrub vs. tree.
  3. Species origin: native vs. non-native vs. weedy vs. invasive (in grassland systems).
  4. Phase II neighbourhood: single plant vs. intraspecific vs. interspecific.
  5. Plant community: species-level vs. community-level.
  6. Experimental venue: field vs. greenhouse.
  7. Experimental approach: manipulative vs. natural.

Data classifications (e.g. life form) were derived directly from manuscripts. Appendices S1 and S2 list the complete data set. PSFs have long been thought to be important to successional processes, but assigning species to successional stages can be difficult. To resolve this, we relied on author definitions, and assumed that annuals and biennials are early-successional species. Covariate levels native, non-native, weedy and invasive were assigned according to author descriptions or listings by the USDA Plants Database (http://plants.usda.gov/index.html). Data from non-grassland systems were excluded from the test of Species origin because sample sizes of non-native species were too low for non-grassland systems.

To determine if plant growth differed between self- and other-soils, three statistical approaches were used. First, mixed model meta-analyses (mixed model), which are commonly used in the ecological literature, were performed (Gurevitch & Hedges 2001). Next, to account for sampling and hierarchical dependence in the data set (see below), a hierarchical Bayes linear model (HBLM) approach was used (DuMouchel & Harris 1983; DuMouchel & Normand 2000). The first HBLM accounted for sampling dependence (sampling model) and the second HBLM accounted for both sampling and hierarchical dependence (delta-splitting model; Stevens & Taylor 2008). Sampling dependence occurs when one control group is compared with more than one test group (e.g. plant growth on self-soil is compared with growth on soils cultivated by several other plant species). Hierarchical dependence occurs when many experiments are performed as a part of a single study [e.g. Klironomos (2002) reported PSF values for 61 species].

Effect size

The effect size of interest in each experiment i of this meta-analysis was

$$\theta_i = \frac{\mu_0 - \mu_i}{\sigma}$$

where μ0 is the mean of the control (‘self’) population, μi is the mean of the experimental (‘other’) population and σ is the standard deviation (SD) common to the experimental and control populations in the study. An unbiased estimate for θi is

$$d_i = J(\mathrm{d.f.})\,\frac{\bar{X}_0 - \bar{X}_i}{S_p}$$

where $\bar{X}$ is a sample mean, Sp is a pooled SD and

$$J(\mathrm{d.f.}) = 1 - \frac{3}{4\,\mathrm{d.f.} - 1}$$

with d.f. the degrees of freedom for error in the study. In the mixed model, d.f. were simply − 1. However, several studies reported more than one experimental group, but just one control group. In the Bayes’ models, where we accounted for this sampling dependence, each study was defined to have m experimental (‘other’) samples with sample sizes n1, … , nm and one control (‘self’) sample with sample size n0; then $\mathrm{d.f.} = \sum_{k=0}^{m} n_k - (m + 1)$. A pooled SD was used to obtain more precise estimates of the effect sizes. With the sample SD represented by S, a pooled estimate for σ is

$$S_p = \sqrt{\frac{\sum_{k=0}^{m} (n_k - 1)\,S_k^2}{\mathrm{d.f.}}}$$

The effect size estimate di was defined in this way to maintain the desired interpretation of the difference in plant growth between self- and other-cultivated soils. The value d is measured in units of SD. A positive value of d indicates that plants grow better on self- than other-soils, whereas a negative d indicates that plants grow better on other- than self-soils. Thus, the sign of d is consistent with the direction of PSF.
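The calculation for a single experiment with one self group and one other group can be sketched as follows. This is only an illustration: it assumes the usual small-sample correction J(d.f.) = 1 − 3/(4 d.f. − 1) with d.f. = n0 + ni − 2, and the function name `hedges_d` and the biomass values are hypothetical rather than taken from the data set.

```python
import math

def hedges_d(mean_self, sd_self, n_self, mean_other, sd_other, n_other):
    """Bias-corrected standardized mean difference, computed as self minus other."""
    df = n_self + n_other - 2
    # Pooled SD across the self (control) and other (experimental) samples.
    pooled_sd = math.sqrt(
        ((n_self - 1) * sd_self ** 2 + (n_other - 1) * sd_other ** 2) / df
    )
    # Small-sample correction factor (assumed form, see lead-in).
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    return j * (mean_self - mean_other) / pooled_sd

# Hypothetical aboveground biomass (g): growth is lower on self-cultivated soil.
d = hedges_d(mean_self=4.0, sd_self=1.0, n_self=10,
             mean_other=5.0, sd_other=1.0, n_other=10)
print(round(d, 3))  # negative value, i.e. a negative PSF
```

Because the difference is taken as self minus other, a plant that grows worse on its own soil yields a negative d, matching the sign convention described above.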

We used the conventional interpretation of the magnitude of the effect size provided by Cohen (1969), where 0 indicates no effect, 0.2 is a small effect, 0.5 is medium, 0.8 is large and 1.0 indicates a very large effect. We examined a funnel plot of effect size estimates and sample sizes; the symmetry observed in this plot did not suggest evidence of publication bias (Begg 1994).
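As a small illustration of how these benchmarks might be applied, the sketch below maps the absolute value of d to a verbal label. The binning rule (each benchmark treated as a lower bound) and the label "negligible" for values below 0.2 are assumptions made here for illustration; the text itself describes values that fall between benchmarks as ranges, such as "medium to large".

```python
def cohen_label(d):
    """Return the verbal benchmark whose lower bound |d| meets (assumed binning)."""
    size = abs(d)
    if size >= 1.0:
        return "very large"
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"

# Hypothetical effect sizes, negative and positive.
for d in (-0.58, -0.75, 1.20):
    print(d, cohen_label(d))
```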

Mixed model

For this model, d was calculated as described above. Effect sizes from individual experiments were then combined to create cumulative effect sizes (di+*) for each covariate level. Studies with larger sample sizes were counted more heavily than studies with smaller sample sizes, assuming that larger sample sizes yield more accurate results (Gurevitch & Hedges 2001). Effect sizes were judged significant if the 95% confidence intervals of the effect size excluded 0.

We performed a between-class homogeneity statistical test (QB*) to test the null hypothesis that effect sizes were equal among covariate levels against the alternative hypothesis that at least one true effect size was different. We evaluated the significance of the QB* test using a standard chi-square table. Results were considered significant if P < 0.05, and overlap of 95% confidence intervals was used to determine which classes were different. Formulae for calculating di+* and QB* are outlined in Gurevitch & Hedges (2001).
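The weighting and the between-class test can be sketched as follows. This is a simplified, fixed-effect version: weights are the inverse of each experiment's sampling variance, whereas the mixed model of Gurevitch & Hedges (2001) also adds a pooled between-study variance to each weight, which is omitted here. All function names and the example effect sizes are hypothetical.

```python
import math

def variance_d(d, n_control, n_experimental):
    """Large-sample sampling variance of a standardized mean difference."""
    n_sum = n_control + n_experimental
    return n_sum / (n_control * n_experimental) + d ** 2 / (2.0 * n_sum)

def pooled_effect(effects):
    """effects: list of (d, variance). Returns (weighted mean, SE, total weight)."""
    weights = [1.0 / var for _, var in effects]
    total = sum(weights)
    mean = sum(w * d for w, (d, _) in zip(weights, effects)) / total
    return mean, math.sqrt(1.0 / total), total

def q_between(groups):
    """Between-class homogeneity statistic and its degrees of freedom."""
    pooled_all = [e for level in groups.values() for e in level]
    grand_mean, _, _ = pooled_effect(pooled_all)
    q = 0.0
    for level in groups.values():
        level_mean, _, level_weight = pooled_effect(level)
        q += level_weight * (level_mean - grand_mean) ** 2
    return q, len(groups) - 1

# Chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

# Hypothetical (d, variance) pairs for two covariate levels.
groups = {
    "annual": [(-1.2, 0.20), (-0.9, 0.25)],
    "perennial": [(-0.3, 0.20), (-0.1, 0.25)],
}
q_b, df = q_between(groups)
mean, se, _ = pooled_effect(groups["annual"])
print(f"annual mean = {mean:.2f}, 95% CI = ({mean - 1.96 * se:.2f}, {mean + 1.96 * se:.2f})")
print(f"Q_B = {q_b:.2f}, significant at 0.05: {q_b > CHI2_CRITICAL_05[df]}")
```

Experiments with smaller sampling variance (typically those with larger sample sizes) receive larger weights, which is the sense in which larger studies are counted more heavily.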

Sampling dependence

In several studies for this meta-analysis, the same sample was used as control for more than one experimental group. Because the resulting effect size estimates (di) were based on the same sample of data (specifically the same $\bar{X}_0$ and Sp), this created sampling dependence among the effect size estimates. Studies with this type of sampling dependence are sometimes referred to as ‘multiple-treatment studies’ (Gleser & Olkin 1994). To account for sampling dependence, it is necessary to obtain the estimated sampling covariance matrix V of the effect size estimates. It can be shown that the variance of di can be estimated as

image

where inline image (Stevens & Taylor 2008). When experiments i and h (i ≠ h) have sampling dependence as described above, it can also be shown that the covariance of the effect size estimates di and dh can be estimated as Vi,h = (p − 1)didh (Stevens & Taylor 2008). Alternative estimates of the variance/covariance structure such as those in Gleser & Olkin (1994) can be shown to be asymptotically equivalent to those used here. With the sampling covariance matrix thus estimated, the effect size estimates di can be combined systematically using a linear model (see below).

Hierarchical dependence

Groups of experiments can be considered hierarchically dependent if they were performed as a batch of experiments by the same research team. For example, the Agrawal et al. (2005) study reports 20 experiments on 20 different species. While the effect size estimates from these 20 experiments are not based on the same samples of data (and hence do not have sampling dependence), they can be considered as having come from the same batch (or research team) and hence present the potential for hierarchical dependence.

We combined the effect size estimates from the multiple experiments using a hierarchical Bayes linear model (DuMouchel & Harris 1983; DuMouchel & Normand 2000), accounting for both sampling and hierarchical dependence (Stevens & Taylor 2008). This approach can be summarized in matrix form as the linear model d = Xβ + δ + ε, where d is the vector of effect size estimates from all the experiments; X, a design matrix (to account for the covariates based on indicator variables); β, a vector of parameters (including an intercept term and the effects of the covariates); δ, a vector of hierarchical errors; and ε is a vector of sampling errors. This model assumes the distributions

$$\delta \sim N(0, \Delta), \qquad \varepsilon \sim N(0, V)$$

where V is the sampling covariance matrix defined above and Δ is the hierarchical covariance matrix. Briefly, Δ is a block-diagonal matrix with hierarchical variance τ² on the diagonal for all experiments and hierarchical covariance ς on the off-diagonal for pairs of experiments that are hierarchically dependent. The block-diagonal structure of Δ effectively splits the hierarchical errors δi into two components, a study-specific component and an experiment-within-study component. For this reason, this approach may be referred to as ‘delta-splitting’ (Stevens 2005). The hierarchical dependence concerns the correlation among the experiment-within-study components. By forcing ς = 0, this HBLM can be made to account for only sampling dependence (and ignore hierarchical dependence), creating our ‘sampling’ model. Stevens & Taylor (2008) provide additional details and interpretation.
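The structure of Δ can be illustrated with a short sketch. It assumes experiments are listed so that those from the same study are adjacent, which is what makes the matrix block-diagonal; the function name, the study labels and the values of τ² and ς are hypothetical.

```python
def hierarchical_covariance(study_ids, tau2, varsigma):
    """Build the hierarchical covariance matrix as a list of rows.

    tau2 sits on the diagonal for every experiment; varsigma is placed on
    the off-diagonal only for pairs of experiments from the same study.
    """
    n = len(study_ids)
    delta = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                delta[i][j] = tau2
            elif study_ids[i] == study_ids[j]:
                delta[i][j] = varsigma
    return delta

# Six hypothetical experiments from three studies (A, B, C).
study_ids = ["A", "A", "B", "C", "C", "C"]
delta = hierarchical_covariance(study_ids, tau2=0.30, varsigma=0.10)
# Forcing varsigma to zero leaves a diagonal matrix, as in the sampling model.
sampling_only = hierarchical_covariance(study_ids, tau2=0.30, varsigma=0.0)

for row in delta:
    print(row)
```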

For each of the seven research questions, we defined a design matrix X to include an intercept column of 1’s, and additional columns of 0’s and 1’s representing indicator variables for specific covariate levels. For example, for the question involving levels of Life cycle, we used annual as a reference category and defined two indicator variables for biennial and perennial to include as columns in the design matrix for the question. The non-intercept columns of the design matrix X were centred about their means so that the ‘intercept’ term in β can be interpreted as the population mean effect size (Louis & Zelterman 1994). Bayesian methods were used to make inference on β, with a normal prior on β|(τ,ς), a uniform prior on ς|τ and a log-logistic prior on τ. This approach provided the posterior mean and covariance of β, along with the posterior probabilities for each component of β. To facilitate interpretation and comparison between models, the posterior probabilities were converted to two-sided P-values, as in Louis & Zelterman (1994): Pj = 1 − 2|0.5 − pr(βj > 0|data)|. These are reported in the Results section. The sampling model was obtained from this HBLM by forcing ς = 0. Code for the R environment was used, with numerical integration for the Bayes model performed using the Simpson’s rule approximation.
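The construction of the centred design matrix and the conversion of posterior probabilities to two-sided P-values can be sketched as follows. The function names and the four example experiments are hypothetical, and the sketch covers only these two steps, not the Bayesian fit itself.

```python
def design_matrix(levels, reference):
    """Intercept column plus mean-centred indicator columns.

    One indicator column is created for each non-reference level, then
    centred about its mean so that the intercept can be read as the
    overall mean effect size.
    """
    others = sorted(set(levels) - {reference})
    n = len(levels)
    columns = []
    for level in others:
        indicator = [1.0 if x == level else 0.0 for x in levels]
        mean = sum(indicator) / n
        columns.append([value - mean for value in indicator])
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    return rows, others

def two_sided_p(prob_positive):
    """Convert pr(beta_j > 0 | data) to a two-sided P-value."""
    return 1.0 - 2.0 * abs(0.5 - prob_positive)

# Four hypothetical experiments, with annual as the reference category.
levels = ["annual", "perennial", "biennial", "perennial"]
X, column_names = design_matrix(levels, reference="annual")
print(column_names)
for row in X:
    print(row)
print(two_sided_p(0.99))  # a posterior probability near 1 gives a small P-value
```

A posterior probability of 0.5 (no evidence for either sign) maps to P = 1, while probabilities near 0 or 1 map to P-values near 0.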

We calculated mean effect sizes for covariate levels by looking at linear combinations of the posterior means from the HBLM without centring the design matrix columns. The square root of the posterior variance of each linear combination was also calculated and used as a standard error for visualization purposes. Results in the sampling model were considered significant if P < 0.05, and in the delta-splitting model were considered significant if P < 0.10 because of the conservative nature of the model.
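As an illustration of this step, the sketch below computes the mean and standard error for one covariate level as a linear combination c'β with variance c'Σc, where Σ is the posterior covariance of β. With an uncentred design and annual as the reference, the perennial mean is the intercept plus the perennial coefficient. The posterior values shown are hypothetical, not results from this study.

```python
import math

def linear_combination(coeffs, post_mean, post_cov):
    """Return (estimate, SE) for the linear combination coeffs' * beta."""
    k = len(coeffs)
    estimate = sum(c * b for c, b in zip(coeffs, post_mean))
    variance = sum(
        coeffs[i] * post_cov[i][j] * coeffs[j]
        for i in range(k) for j in range(k)
    )
    return estimate, math.sqrt(variance)

# Hypothetical posterior for beta = (intercept, biennial, perennial).
post_mean = [-1.0, 0.4, 0.6]
post_cov = [
    [0.04, -0.02, -0.02],
    [-0.02, 0.05, 0.01],
    [-0.02, 0.01, 0.03],
]

# Perennial level: intercept plus perennial coefficient.
est, se = linear_combination([1.0, 0.0, 1.0], post_mean, post_cov)
print(round(est, 3), round(se, 3))
```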

Computational details of this approach are provided in Stevens & Taylor (2008). Tools for the implementation of the Bayes model accounting for hierarchical (and sampling) dependence in meta-analysis are incorporated in the forthcoming metahdep package for R.

Results

The full data set included 329 experiments from 45 independent studies of which 40 (89%) were conducted after 2001 (Table 1). Effect sizes were fairly evenly distributed around values of −1.0 to 0.0 (Fig. 1). Unless indicated, analyses were conducted using a smaller subset of 315 experiments and 43 studies, and excluded the two studies investigating whole plant community responses, which were analyzed separately.

Figure 1.

 Number of plant–soil feedback experiments by effect size. Negative effect sizes suggest that plants grow better on ‘other’ than on ‘self’ cultivated soil (n = 329 experiments). The distribution of effect sizes does not suggest a publication bias toward significant results.

Mixed model meta-analyses

Plants, in general, had a medium, negative effect size (d++ = −0.58 ± 0.06 SE, n = 315). The following comparisons of mean effect sizes were different: Life cycle: annual < perennial (QB* = 15.34, P < 0.001); Life form: grass, forb < tree (QB* = 20.78, P < 0.001); Species origin: native < invasive (QB* = 164.53, P < 0.001); phase II neighbourhood: intraspecific < single plant, interspecific (QB* = 16.43, P < 0.001); Plant community: species < community (QB* = 6.49, P < 0.05); Experimental venue: greenhouse < field (QB* = 7.53, P < 0.01); and Experimental approach: manipulative < natural (QB* = 16.59, P < 0.001; Figs 2–4).

Figure 2.

 Mean effect sizes for experiments separated into (a) Length of life cycle, (b) Life form and (c) Species origin for the mixed, sampling and delta-splitting models. Error bars for the mixed model indicate the 95% confidence interval. The square root of the posterior variance of each linear combination of posterior means is used as a standard error for the sampling and delta-splitting models. Sample sizes are indicated at the top. Mixed, sampling and delta-splitting model values with different letters are significantly different at the 0.05, 0.05 and 0.10 significance levels, respectively.

Figure 3.

 Effect sizes for experiments separated into (a) plant community and (b) phase II neighbourhood for the mixed, sampling and delta-splitting models. Error bars for the mixed model indicate the 95% confidence interval. The square root of the posterior variance of each linear combination of posterior means is used as a standard error for the sampling and delta-splitting models. Sample sizes are indicated at the top. Mixed, sampling and delta-splitting model values with different letters are significantly different at the 0.05, 0.05 and 0.10 significance levels, respectively.

Figure 4.

 Effect sizes for experiments separated into (a) experimental approach and (b) experimental venue for the mixed, sampling and delta-splitting models. Error bars for the mixed model indicate the 95% confidence interval. The square root of the posterior variance of each linear combination of posterior means is used as a standard error for the sampling and delta-splitting models. Sample sizes are indicated at the top. Mixed, sampling and delta-splitting model values with different letters are significantly different at the 0.05, 0.05 and 0.10 significance levels, respectively.

Hierarchical Bayes analyses

In all, there were 20 groups of sampling-dependent experiments in this meta-analysis, ranging in size from the group of two Hilaria jamesii experiments in Belnap et al. (2005) to the group of seven Quercus ilex experiments in Puerta-Pinero et al. (2006). There were 35 groups (or batches) of hierarchically dependent experiments in this meta-analysis, ranging in size from the two experiments in Belnap et al. (2005) to the 61 experiments in Klironomos (2002).

In the sampling model, plants in general had a medium to large negative effect size (−0.75 ± 0.11 SE, n = 315). In this model, the following comparisons of mean effect sizes were different: life cycle: annual < perennial (P = 0.001); life form: grass < forb, tree (P = 0.08 and P < 0.001, respectively) and forb < tree (P = 0.003); species origin: natives < invasive (P = 0.007); phase II neighbourhood: intraspecific < single plant (P = 0.02); plant community: species < community (P = 0.02); experimental venue: greenhouse < field (P = 0.004); and experimental approach: manipulative < natural (P = 0.001; Figs 2–4).

As in the mixed and sampling models, the delta-splitting model indicated that no covariate level produced a significantly positive effect size and the overall effect size was negative (−0.70 ± 0.10 SE, n = 315). The only difference among mean effect sizes by categories was for life form: grass < forb (P = 0.001). Nominal differences among mean effect sizes by categories were for species origin: natives < invasive (P = 0.13); and plant community: species < community (P = 0.12), although these differences were not significant by our criterion (Figs 2–4).

Multiple hypothesis testing

To address the problems associated with asking many questions of the same data set, we performed a meta-multiple regression with the Bayes model. This model allowed us to determine which tests were significant when all explanatory variables were present. It also allowed us to use a Holm–Bonferroni adjustment to account for the number of questions asked of the data set. These results (not shown) supported our reported results because the same tests that we report as significant were found to be significant in this meta-multiple regression. It was not possible to use this test to ask questions regarding species origin or plant community because these questions were addressed using only a subset of the total data set.
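For readers unfamiliar with this kind of adjustment, the sketch below implements the Holm step-down procedure, on the assumption that this is the adjustment referred to above. The P-values in the example are hypothetical and the function name is illustrative.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P-value is multiplied by m, the next by m - 1, and so on.
        value = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical P-values for four questions asked of one data set.
raw = [0.001, 0.02, 0.004, 0.04]
adjusted = holm_adjust(raw)
print(adjusted)
```

A test is then declared significant when its adjusted P-value falls below the chosen level, which controls the family-wise error rate across all questions.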

Discussion

Most plants realized negative PSFs. Similarly, all covariate levels realized PSFs that were negative or not different from zero. As a result, the average effect sizes of PSFs on plant growth were between −0.58 and −0.75, in the mixed and sampling models respectively. This range of effect sizes on plant growth was comparable to or larger than those observed in meta-analyses of pathogenic fungi (Levine et al. 2004), leaf-litter addition (Xiong & Nilsson 1999), seed limitation (Clark et al. 2007) and seed feeders (Morris et al. 2007); similar to those observed in meta-analyses of aboveground herbivores, total herbivores, viruses, leaf chewers, root feeders (Morris et al. 2007) and soil warming (Rustad et al. 2001); and smaller than those observed in meta-analyses of competitors, plant diversity (Levine et al. 2004), belowground herbivores, pathogens and nematodes (Morris et al. 2007).

Plant–soil feedbacks may be more important than suggested by comparisons with other meta-analyses because both positive (28% of experiments) and negative (70% of experiments) PSFs were observed. Positive PSFs accounted for a 25% increase in growth while negative feedbacks accounted for a 65% decrease in growth. An implication of this variability in PSF values is that PSFs may be important to individual species even if those species are in a covariate level that demonstrated neutral PSFs (e.g. trees; Reinhart & Callaway 2004). A second implication of this variability is that effect sizes in this study may be underestimated relative to effect sizes in other studies because in other meta-analyses effect sizes tend to be in one direction. For example, competitors rarely facilitated growth in the meta-analysis of biotic resistance so nearly all effect sizes of biotic resistance were negative (Levine et al. 2004).

The absolute value of effect size provides an estimate of effect size that is not affected by the sign of the value. The average absolute value of effect sizes in this study ranged from 1.20 to 1.33 in the mixed and sampling models respectively. These very large effect sizes are comparable to those associated with competitors and species diversity (1.1 and 0.9, respectively; Levine et al. 2004).

This review indicates that PSFs are likely to be important relative to many other plant growth factors. A majority of the data (83%), however, were derived from grassland systems (i.e. grasses and forbs), where PSFs were large and negative. PSFs were less important for shrubs and trees, for which PSF values were not different from zero. There was also a bias in the data set toward studies performed in greenhouse conditions (92%). These studies produced larger negative results than studies in field conditions. Published data, therefore, may overestimate the importance of PSFs relative to field conditions. It is unlikely, however, that published data overestimate the importance of PSFs relative to other plant growth factors because greenhouse experiments were common in other meta-analyses (Levine et al. 2004; Morris et al. 2007). The size and direction of PSFs, therefore, support the hypothesis that PSFs are a strong mechanism that encourages plant coexistence and diversity (Bever et al. 1997).

Differences among the models

We conducted our meta-analyses with three different models: a mixed model, common in the ecological literature, and two newly developed Bayes models, which accounted for either sampling dependence (sampling model) or both sampling and hierarchical dependence (delta-splitting model). The mixed model was the least conservative and showed the most significant differences among covariate levels. The sampling model generally agreed with the mixed model because sampling covariances were not very large in this study. The delta-splitting model produced more conservative estimates of treatment effects because a few of the studies in the data set contained a large number of experiments.

The delta-splitting model essentially makes two adjustments to the data. First, the model lowers the effective sample size from close to 329 (the number of experiments) to closer to 45 (the number of independent studies; Stevens & Taylor 2008). By decreasing the effective sample size, this model decreases statistical power, as was evident in the test of single-species vs. community PSFs. In this test, the standard errors in the delta-splitting model were larger, changing the P-value from 0.01 to 0.13. This indicates that consistent results within a few large studies masked some of the variation that was present in results among studies.

The second adjustment made in the delta-splitting model is that it devalues effect sizes associated with studies with a large number of experiments, especially when results in those studies differ from the remaining studies in the data set (Stevens & Taylor 2008). As an example, of the 67 experiments using annuals, 35 (52%) were from Kardol et al. (2007), all of which had negative effect sizes. The other 48% of the data came from nine studies, many of which had both positive and negative effect sizes. Thus, the mixed and sampling models, which did not down-weight the 52% of the data from Kardol et al. (2007), showed annuals to have more negative PSFs than biennials or perennials, while the delta-splitting model, which down-weighted this data set, did not. Using the delta-splitting model alone, we would conclude that annuals do not have a negative effect size, even though 57 (85%) of the effect sizes for annuals were negative. Our interpretation of the results from all three models is that there is evidence to suggest that annuals realize more negative PSFs than biennials and perennials, but that this conclusion is biased by at least one large study.
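The counts given above can be tallied to show how much of the pattern for annuals rests on a single study. The sketch below uses only the numbers reported in the text; the variable names are ours, and it counts the signs of effect sizes rather than reproducing the weighted estimates from any of the three models.

```python
# Counts reported in the text for experiments using annuals.
total_annual = 67      # all experiments using annuals
kardol = 35            # experiments from Kardol et al. (2007), all negative
negative_total = 57    # experiments with a negative effect size

# Derived counts for the remaining nine studies.
other = total_annual - kardol              # experiments outside Kardol et al.
negative_other = negative_total - kardol   # negative results outside Kardol et al.

share_kardol = kardol / total_annual
share_neg_all = negative_total / total_annual
share_neg_other = negative_other / other

print(f"Share of annual experiments from Kardol et al.: {share_kardol:.0%}")
print(f"Negative effect sizes, all experiments: {share_neg_all:.0%}")
print(f"Negative effect sizes, other nine studies: {share_neg_other:.0%} "
      f"({negative_other} of {other})")
```

Excluding the large study, 22 of the remaining 32 experiments (69%) still showed negative effect sizes. This is consistent with our reading that the pattern is suggestive but less uniform than the pooled 85% implies.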

By interpreting results from the three models, we can determine which results are (i) robust, because results are consistent across all three models (i.e. grasses vs. trees, species vs. community, greenhouse vs. field), (ii) suggestive, but require more studies because the sample size was too low (i.e. native vs. invasive, intraspecific vs. interspecific) and (iii) suggestive, but reflect data from one or two large studies (i.e. annuals vs. perennials, manipulative vs. natural).

Plant types

In the comparisons of PSFs among different plant life-forms, we found that grasses demonstrated the most negative effect sizes. To explain grass sensitivity to belowground enemies, we suggest that competition for water in semi-arid systems encourages root rather than shoot competition and that this results in high growth rates, high root to shoot ratios, greater root longevity and a larger proportion of roots near the soil surface (Gleeson & Tilman 1994; Schenk & Jackson 2002; Wilsey & Polley 2006). We suggest that these characteristics of grassland species increase grass exposure to belowground enemies. Because woody plants did not have large negative PSFs (this study) and they are not as affected by biotic resistance (Levine et al. 2004) or pathogens (Morris et al. 2007), woody plants appear to be less sensitive to belowground enemies and competitors than herbaceous plants.

Trees appeared to be resistant to belowground enemies, but these results were derived exclusively from temperate species – no data were available from tropical forests. Tropical trees are believed to be more susceptible to disease than temperate trees and could be expected to develop more negative PSFs (Janzen 1970). As a potential explanation for this pattern, most tropical trees are believed to form arbuscular mycorrhizal associations. Arbuscular mycorrhizae can form common mycorrhizal networks that may suppress competitive exclusion and therefore encourage species diversity. In contrast, many temperate trees form ectomycorrhizal and ericoid associations. These associations form protective sheaths around tree roots that can inhibit pathogen infection, and trees with these associations may therefore develop less negative PSFs (Duchesne et al. 1989). It could be expected, therefore, that negative PSFs may be more common in tropical than in temperate forests. In support of this hypothesis, monodominant stands of ectomycorrhizal tree species have been observed in a matrix of diverse, arbuscular mycorrhizal tree species in a tropical forest in Guyana (Mayor & Henkel 2006). Future research on PSFs in tropical vs. temperate or ectomycorrhizal vs. arbuscular mycorrhizal forests may reveal a novel mechanism of diversity in tropical forests.

In a pattern that reflected that of grasses and trees, annuals realized very large negative PSFs and perennials realized less negative PSFs. The delta-splitting model indicated that this result reflected data from a few large studies, but, if true, it contradicts the hypothesis that PSFs will become more important and more negative across successional sequences (Reynolds et al. 2003), and supports the hypothesis that negative PSFs increase the rate of succession (Van der Putten 1997; Kardol et al. 2007). Because early-successional species, which typically demonstrate the greatest maximum growth rates, appeared to be most susceptible to negative PSFs, the results are consistent with the hypothesis that there is an inherent trade-off between enemy defence and fast growth rates, as has been observed in above-ground systems (Coley et al. 1985). These are the first results we are aware of that address differences in PSFs among life forms. These results suggest that PSFs may help explain fundamental differences among plant life forms and ecosystem types.

Non-native plants

Identifying mechanisms of non-native plant success and failure is a central theme in invasion ecology. We found evidence that PSFs may help explain why some non-native species become invasive while others do not. More specifically, non-native, invasive plants realized the least negative PSFs, suggesting that the success of these plants can be explained by a release from soil-based enemies. In contrast, non-native, noninvasive plants realized negative PSFs that were similar to those realized by native plants, suggesting that soil-based biotic resistance limits the success of many non-native plants.

Previous reviews have suggested that early-successional species make good invaders because they are well-suited to wide dispersal, fast growth and growth in disturbed sites (Rejmanek 1996; Reichard & Hamilton 1997; Prinzing et al. 2002). We provide an alternative explanation: early-successional species make good invaders because they have the most to gain by increases in PSFs. In support of this hypothesis, we found that PSFs increased four times more for non-native annuals than non-native perennials. More specifically, mixed-model effect sizes for non-native annuals (−0.86) were much less negative than for native annuals (−1.95), while effect sizes for non-native perennials (−0.53) were only slightly less negative than for native perennials (−0.79). Also consistent with this hypothesis, grasses realized the most negative PSFs, and grasslands have realized some of the worst invasions (Sheley et al. 1998).
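The fourfold difference follows directly from the four mixed-model effect sizes quoted above. The short sketch below reproduces that arithmetic; the dictionary layout and the helper name `shift` are ours and are used for illustration only.

```python
# Mixed-model effect sizes quoted in the text, keyed by (life history, origin).
effect = {
    ("annual", "native"): -1.95,
    ("annual", "non-native"): -0.86,
    ("perennial", "native"): -0.79,
    ("perennial", "non-native"): -0.53,
}


def shift(life_history: str) -> float:
    """Increase in PSF (non-native minus native) for one life history."""
    return effect[(life_history, "non-native")] - effect[(life_history, "native")]


annual_shift = shift("annual")        # -0.86 - (-1.95)
perennial_shift = shift("perennial")  # -0.53 - (-0.79)
ratio = annual_shift / perennial_shift

print(f"Annuals:    {annual_shift:+.2f}")
print(f"Perennials: {perennial_shift:+.2f}")
print(f"Ratio:      {ratio:.1f}")
```

The increase is 1.09 for annuals and 0.26 for perennials, a ratio of roughly 4.2.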

Implications of different experimental methods

Experiments using experimentally cultivated soils and greenhouse conditions produced more negative effect sizes than experiments using field-collected soils and field conditions, respectively, although the delta-splitting model indicated that more studies are needed to clearly demonstrate these effects. Highly controlled experiments have similarly been found to produce larger effect sizes in studies of enemy and mutualist effects on plants (Morris et al. 2007). While this difference could be due to lower variability in data from controlled conditions, and while many factors differ between field and greenhouse experiments, results from this analysis are consistent with the hypothesis that microbially rich soils provide functional redundancy and disease suppressiveness (Sanchez-Moreno & Ferris 2007). As a result, we suggest that processes or experimental treatments that create soils with small or low-diversity microbial communities (i.e. tillage, inoculation experiments) are more likely to realize large microbial population fluctuations. Large fluctuations in populations of soil enemies or soil symbionts are likely to have greater effects on plant growth than small fluctuations in these populations.

PSF in whole plant communities

Plant species grown in monocultures produced more negative PSFs than plants grown alone or plants grown with other plant species. This suggests that intraspecific competition exaggerates PSFs, although the delta-splitting model indicated that this result was based on too few studies to be conclusive. Similarly, the mixed and sampling models suggest that species-level PSFs, whether measured using individuals, monocultures or mixed communities, were more negative than community-level PSFs. In fact, plant communities were the only class of data for which the mean effect size was positive, though not significantly different from zero. These results should be taken with caution as they were developed from only two studies (Kulmatiski et al. 2006; Kardol et al. 2007).

Plant–soil feedback models of interacting species provide some insight into why community-level responses may be less negative than species-level responses. Bever et al. (1997) demonstrated that for two species to coexist, plant growth on other-soil had to be greater than plant growth on self-soil, otherwise the species that benefits most from its own growth will competitively exclude the other. From this, we might expect that co-existing species in a community will grow better than species in a monoculture. More generally, it could be expected that plant roots in community-cultivated soils will forage for soils that produce the least negative effects on plant growth. This could be expected to minimize negative PSFs in a community. Theoretical and experimental research is needed to better understand the role of PSFs in plant communities.

We only investigated individual PSFs in this paper, even though, as mentioned earlier, net-pairwise PSFs are needed to predict the outcomes of specific plant–plant interactions. While net-pairwise PSFs are needed to make precise predictions regarding plant–plant interactions, the results from this review and those of Klironomos (2002) indicate that the qualitative nature of individual PSFs (e.g. more or less negative) is highly associated with plant abundance, persistence, invasion and life form. We suggest that individual PSFs are correlated with plant abundance and plant traits because most plant species realize negative PSFs and any release from these negative effects can improve plant growth regardless of potential interactions with other species.

Conclusions

We conclude that PSFs are a strong mechanism of plant coexistence and the maintenance of diversity, especially in grassland systems, because PSFs demonstrated medium to large, negative effect sizes. We conclude that PSFs are also a likely mechanism explaining the success and failure of many non-native plants because invasive plants realized the least negative PSFs. We also conclude that PSFs are likely to encourage the replacement of early-successional species (i.e. annuals and biennials) by late-successional species (i.e. perennials) because early-successional species appeared to realize the most negative PSFs.

Grassland plants (i.e. grasses and forbs), which dominated the literature, produced medium to large negative PSFs. We suggest that this pattern reflects high root to soil contact in these systems. Future research explaining why grasses and grasslands realize the most negative PSFs may reveal fundamental differences among ecosystems.

Furthermore, we suggest that the abundance of negative PSFs in grasslands may help explain why non-native invasions are successful in these systems. More specifically, we suggest that species with the most negative PSFs (grasses and annuals) are the most likely to realize the benefits of enemy release. These same species are most likely to invade systems where negative PSFs are common (grasslands). If supported, this hypothesis may provide a screening tool for potentially invasive species: species with the most negative PSFs in the home range may be most likely to be invasive in their introduced range.

Future research is needed to examine the role of PSFs in more natural settings. More specifically, net-pairwise PSFs have rarely been measured, and little theoretical or experimental data exist that test for the role of PSFs in plant communities. Similarly, research on the role of PSFs under field conditions is needed to link a growing body of theoretical, greenhouse and mesocosm-derived data to plant growth on the landscape.

Acknowledgements

We thank the following authors for providing data: A. Agrawal, R.R. Blank, G. De Deyn, P. Kardol, J. Klironomos, P. Meiman, C. Puerta-Pinero, S. Troelstra and W. van der Putten. We thank A. Croft for assistance with this project. A. Kulmatiski was funded by the Department of Wildland Resources, College of Natural Resources and USU Ecology Center, and support was received from the Utah Agricultural Experimental Station.
