Causal networks clarify productivity–richness interrelations, bivariate plots do not

Authors


Summary

  1. Perhaps no other pair of variables in ecology has generated as much discussion as species richness and ecosystem productivity, as illustrated by the reactions by Pierce (2013) and others to Adler et al.'s (2011) report that empirical patterns are weak and inconsistent. Adler et al. (2011) argued we need to move beyond a focus on simplistic bivariate relationships and test mechanistic, multivariate causal hypotheses. We feel the continuing debate over productivity–richness relationships (PRRs) provides a focused context for illustrating the fundamental difficulties of using bivariate relationships to gain scientific understanding.
  2. Pierce (2013) disputes Adler et al.'s (2011) conclusion that bivariate productivity–richness relationships (PRRs) are ‘weak and variable’. He argues, instead, that relationships in the Adler et al. data are actually strong and, further, that failure to adhere to the humped-back model (HBM; sensu Grime 1979) threatens scientists' ability to advise conservationists. Here, we show that Pierce's reanalyses are invalid, that statistically significant boundary relations in the Adler et al. data are difficult to detect when proper methods are used and that his advice neither advances scientific understanding nor provides the quantitative forecasts actually needed by decision makers.
  3. We begin by examining Grime's HBM through the lens of causal networks. We first translate the ideas contained in the HBM into a causal diagram, which shows explicitly how multiple processes are hypothesized to control biomass production and richness and their interrelationship. We then evaluate the causal diagram using structural equation modelling and example data from a published study of meadows in Finland. Formal analysis rejects the literal translation of the HBM and reveals additional processes at work. This exercise shows how the practice of abstracting systems as causal networks (i) clarifies possible hypotheses, (ii) permits explicit testing and (iii) provides more powerful and useful predictions.
  4. Building on the Finnish meadow example, we contrast the utility of bivariate plots compared with structural equation models for investigating underlying processes. Simulations illustrate the fallibility of bivariate analysis as a means of supporting one theory over another, while models based on causal networks can quantify the sensitivity of diversity patterns to both management and natural constraints.
  5. A key piece of Pierce's critique of Adler et al.'s conclusions relies on upper boundary regression, which he claims reveals strong relationships between production and richness in Adler et al.'s original data. We demonstrate that this technique shows strong associations in purely random data and is invalid for Adler et al.'s data because it depends on a uniform data distribution. We instead perform quantile regression on both the site-level summaries of the data and the plot-level data (using mixed-model quantile regression). Using a variety of nonlinear curve-fitting approaches, we were unable to detect a significant hump-shaped boundary in the Adler et al. data. We reiterate that the bivariate productivity–richness relationships in Adler et al.'s data are weak and variable.
  6. We urge ecologists to consider productivity–richness relationships through the lens of causal networks to advance our understanding beyond bivariate analysis. Further, we emphasize that models based on a causal network conceptualization can also provide more meaningful guidance for conservation management than can a bivariate perspective. Measuring only two variables neither permits the evaluation of complex ideas nor resolves debates about underlying mechanisms.

A seemingly endless debate over bivariate patterns

Ecologists have a fascination with patterns in ecological data. At the same time, ecologists struggle to define reliable generalizations. This struggle reflects the fact that communities and ecosystems represent heterogeneous collections of species and genotypes and that their properties and characteristics (e.g. species assemblages, mix of life-forms, etc.) are contingent on many factors. Ecological theories often serve to provide some basis for defining expected patterns. However, debates about the generality of ecological patterns and about underlying processes often go unresolved. In this paper, we demonstrate that bivariate patterns and simple statistical descriptions of association neither provide much information about underlying processes nor are sufficient for guiding conservation.

Perhaps no other pair of variables has generated as much discussion amongst ecologists as species richness and ecosystem primary productivity. The debate over productivity–richness relationships (PRRs) provides an ideal context for illustrating the fundamental difficulties of using bivariate relationships to gain scientific understanding. Visual impressions are part of the problem. A relevant comparison can be drawn between the interpretations of bivariate plots and the psychologist's Rorschach test. The basis for the classic Rorschach test (Rorschach 1998) is the tendency for simple patterns to trigger complex interpretations in the human mind that depend on individual search patterns and predispositions. The PRR debate shows that visual examinations of bivariate plots evoke support for a wide variety of alternative theories, depending on the viewer's predispositions. Psychological attachments make it difficult for some to consider alternative interpretations.

A discussion of two diametrically opposed critiques of the results reported by Adler et al. (2011) clearly illustrates this issue. Adler et al.'s conclusion that PRRs for 48 grass-dominated communities were weak and variable triggered simultaneous claims that the true relationship in the data was either (i) strongly linear positive (Pan, Liu & Zhang 2012) or (ii) strongly hump-shaped (Fridley et al. 2012). Grace et al. (2012a) responded by pointing out that scientific objectives (specifically, theory confirmation vs. theory evaluation) influence how data are interpreted, contributing to disagreements over evidence. That response also emphasized that a focus on bivariate patterns can stand in the way of making progress in understanding ecological systems. The most recent criticisms of Adler et al. by Pierce (2013) illustrate a need for further demonstration of the inherent limitations to drawing complex interpretations from simple patterns like bivariate plots.

We first provide more detail about the Adler et al. (2011) study, as these details are central to the discussion. Adler et al. analysed productivity and richness data collected from 1126 plots in 48 herbaceous-dominated plant communities spanning five continents. Data were collected using standardized sampling methods to address objections raised previously about studies that used meta-analysis (e.g. Mittelbach et al. 2001). Adler et al. also analysed data in a number of ways, anticipating that any single approach could be criticized from some perspective. PRRs were examined with the data summarized at three spatial extents: (i) within sites, (ii) across sites within biogeographic regions and (iii) across all sites around the globe. Within-site relationships took on all possible forms; most were non-significant (34), five were positive linear, five were negative linear and only one site showed a statistically significant humped relationship. Regional comparisons amongst sites found one region to demonstrate a humped PRR, though the humped relationship depended on highly altered (anthropogenic) sites being included in the sample. Other within-region comparisons failed to reveal humped patterns regardless of the inclusion of sites representing particularly unique ecosystems (one saltmarsh was included in the study). At the global extent, a statistically significant humped relationship amongst sites was found (R2 = 0·11), but when anthropogenic sites and the one saltmarsh in the data set were omitted, a positive linear relationship was found. In sum, Adler et al. concluded that ‘productivity is a poor predictor of plant species richness’ and recommended that future studies should investigate ‘the complex, multivariate processes that regulate both productivity and richness’.

Pierce (2013) has re-examined the Adler et al. study and makes a number of claims, as follows:

  1. ‘… the original analysis demonstrated a significant unimodal [PRR] at the global scale… in agreement with the classic humped-back model…’ (italics added for emphasis)
  2. Analysis of the Adler et al. data using upper boundary regression shows a strongly predictive humped PRR (R2 = 0·81 across all plots; R2 = 0·98 across all sites).
  3. Adler et al. excluded from some analyses anthropogenic sites and one saltmarsh, which are crucial to delimiting the general PRR.
  4. ‘… by abandoning the humped-back model ecologists would be unable to inform conservationists as to the conditions maintaining or suppressing, high species richness’.

We respond to these claims in several ways. We begin by reviewing the historical origins of the so-called hump-back model of productivity and richness, which we then translate into more explicit, testable statements. We then go on to develop an example that illustrates why bivariate descriptions of PRRs yield little definitive information. Our example also shows the value of abstracting systems as causal networks for clarifying ideas and structural equations for quantitative evaluation.

The ‘humped-back model’

To be clear, we do not consider ‘the humped-back model’ referred to by Pierce (hereafter, HBM) as simply a prediction about the shape of the relationship between biomass production and species richness. Numerous theories predict a humped relationship between species richness and variables related to biomass production, and different theories invoke different mechanisms (Abrams 1995). When Pierce refers to the ‘humped-back model’, he references only publications authored or co-authored by Grime (Grime 1973, 2001; Al-Mufti et al. 1977). We take that to mean that the discussion at hand relates specifically to the particular theoretical abstraction described by Grime, rather than alternative models such as the ‘dynamic equilibrium model’ (Huston 1979), the ‘resource-ratio model’ by Tilman (1982), or the ‘habitat template’ model of Taylor, Aarssen & Loehle (1990). We should also point out that there are models and theories that predict other shapes of relationships between productivity and richness. For example, theoretical and mechanistic studies of diversity effects on productivity (Naeem et al. 1994; Tilman, Wedin & Knops 1996) imply a monotonic positive contribution to the PRR. Biogeographic studies of richness along climate and energy gradients have long predicted a positive linear PRR at coarse spatial scales (reviewed in Hawkins et al. 2003) and perhaps even at finer spatial scales (Gillman & Wright 2006). For the purposes of responding to Pierce's commentary, it is important to be as clear as possible about the set of ideas embodied in the phrase ‘the classic humped-back model’.

Most concisely, the humped-back model (HBM) is an abstraction that represents a collection of ideas about the processes and conditions favouring high levels of plant species richness (Grime 1979, 2001). This abstraction is represented by a kind of ‘picture’ (Fig. 1a) meant to summarize the collective action of key processes: (i) dominance, (ii) stress, (iii) disturbance, (iv) niche differentiation and (v) the ingress of suitable species. Note that dominance refers to ‘the tendency of larger plants to suppress the growth and regeneration of smaller neighbours’, that is, competitive effects. The term stress is defined by Grime as the effects of external constraints (e.g. abiotic conditions) on productivity, while disturbance refers to the limitation of biomass accumulation by its partial or total destruction. The use of the phrase niche differentiation refers to ‘opportunities for complementary forms of exploitation and regeneration’ afforded by spatial and temporal variations in the environment and is discussed in terms of opportunities for species coexistence. The fifth process, the ingress of suitable species, is described by Grime as being controlled by the size of the reservoir of suitable species in the surrounding landscape and the rate of migration of species into areas following disturbance.

Figure 1.

(a) Grime's humped-back model (Grime 1973, 2001). Key to processes: (i) Dominance; (ii) Stress; (iii) Disturbance; (iv) Niche differentiation; (v) Ingress of suitable species or genotypes. (b) The relationship between maximum standing crop plus litter and species richness of herbs in habitats subject to various management regimes (Grime 2001, page 266).

The pictorial representation of the HBM (Fig. 1a) summarizes ideas in the form of a two-dimensional (bivariate) relationship between maximum potential species richness and the seasonal maximum standing crop plus litter. Al-Mufti et al. (1977) were the first to express the idea that the collective effects of the processes proposed by Grime (1973) to control species richness could be summarized by a single predictor, standing biomass plus litter. In their study (Al-Mufti et al. 1977), they plotted mean species richness against seasonal maximum biomass (including live biomass, standing dead and litter) for herbaceous plants from 14 specifically selected sites that included the herbaceous layer in woodlands (ignoring the overstorey comprising the majority of the biomass), grasslands and communities dominated by tall herbs. Since that time, a tradition has developed of evaluating the HBM (as well as competing models) by empirically examining two-dimensional plots of species richness vs. some measure related to biomass production. Figure 1b shows a fairly typical example (in our experience) of the kind of real-world data presented as consistent with the HBM. Following the publication by Adler et al. suggesting (once again) that the quantitative strength of association in such plots is weak, Fridley et al. (2012) and Pierce (2013) interpreted that conclusion as an attack on Grime's HBM and have staunchly defended the HBM as informative and useful (even ‘vital’). In this paper, we demonstrate why we urge scientists to move beyond the HBM, abandon a fixation on bivariate patterns and adopt a more sophisticated approach to the problem of understanding the regulation of species richness.

An alternative conceptualization of systems – causal networks

While the discussion of productivity and richness has focused on the relationship between two variables, we know that ecological systems are driven by numerous, interconnected processes that operate simultaneously. One approach to studying systems is structural equation modelling (SEM; Grace 2006; Grace et al. 2012b). A few definitions may be useful here. The philosophy of SEM is based on the ideas that (i) systems can be thought of as being controlled by networks of causal processes (causal networks), (ii) our ideas and hypotheses about those networks can be described in causal diagrams and (iii) where we have data to represent some of the elements in a causal diagram, we can empirically evaluate network hypotheses using SEM. For the interested reader not familiar with SEM, we provide additional information and a brief tutorial in Appendix S1 associated with this paper.

Starting with Grime's graphical representation of the HBM (Fig. 1a), we can translate the simplest prediction of the model into a causal diagram (Fig. 2a). This translation summarizes the idea that maximum potential richness can be predicted from maximum standing crop plus litter (total biomass). The HBM further proposes that total biomass conveys the collective effects of dominance, stress and disturbance. We can elaborate the simple diagram in Fig. 2a to show in Fig. 2b how the HBM relates to the empirical pattern in Fig. 1b. Here, we see that an additional assumption is required. We must assume that observed richness is some function of the maximum potential richness (which is a latent, unobserved quantity), since the HBM only claims to predict the maximum potential richness (a point Pierce emphasizes). We can represent that collective pair of ideas (total biomass predicts potential richness and potential richness predicts actual richness) as a simple probabilistic model made up of two observed variables and a latent mediator, maximum potential richness (Fig. 2c).
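The pair of ideas just described can be written as two equations. This is only a schematic restatement of the verbal argument; the symbols B (total biomass), S_max (maximum potential richness), S_obs (observed richness) and the unspecified functions f and g are notation introduced here for illustration, while U and ε follow the usage in the Fig. 2 caption.

```latex
\begin{aligned}
S_{\max} &= f(B) + U, \\
S_{\mathrm{obs}} &= g(S_{\max}) + \varepsilon .
\end{aligned}
```

Because S_max is latent, only the composite mapping from B to S_obs can be confronted with a bivariate plot; f and g cannot be separated without further assumptions or additional observed variables.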

Figure 2.

(a) Causal diagram for the HBM shown in Fig. 1. Here, we substitute ‘total biomass’ for ‘maximum standing crop + litter’. The symbol U stands for other unspecified influences. (b) Elaboration of causal diagram implied by empirical evaluations. (c) Statistical model used in empirical evaluations. Variables in rectangles are observed variables, while maximum potential species richness is unobserved/latent. Epsilon (ε) represents prediction error.

It is possible to develop a more complete causal diagram that includes the processes mentioned by Grime along with other logical assumptions (Fig. 3). In addition to dealing explicitly with the five concepts listed by Grime (Fig. 1a), we can recognize that stress effects should affect productivity while total biomass reflects both rates of production/productivity and rates of biomass removal through disturbance. Both niche differentiation and the supply of species are simply shown as influences on maximum potential richness in this causal diagram, based on the written descriptions in Grime.

Figure 3.

Causal diagram based on the processes described by Grime (2001) as related to the HBM. Other unspecified forces ‘U-variables’ not shown.

Expressing ideas in causal diagrams helps to clarify hypotheses about how processes are interconnected and promotes less ambiguous statements of theoretical ideas. Such diagrams also clarify how to test alternative possibilities given available data. Using SEM, we can determine the degree of support for proposed combinations of linkages and also ask whether the data suggest additional linkages. When additional links in a network model are discovered, they imply the operation of processes not initially hypothesized to be important. To show how this works and where it can lead, we used SEM to evaluate a model based on the causal diagram developed in Fig. 3 using data from coastal meadows in Finland (Grace & Jutila 1999).

In the study reported by Grace and Jutila (1999), herbaceous plant species richness, total biomass and a number of environmental characteristics were measured for 374 1-m2 plots distributed amongst numerous grazed and ungrazed meadows (grazing was treated as a dichotomous variable). Amongst the environmental factors measured were soil conditions and depth to water-table, which serve here as indicators of soil favourability/stress and water availability/stress. The supply of species in the surrounding landscape was not quantified in this study, nor were indicators of niche differentiation measured.

Figure 4 represents an initial SEM developed using the available variables from Grace and Jutila (1999) and the ideas in Fig. 3. Again, the fundamental prediction of the HBM is that the effects of disturbance and stress on richness will be mediated through total biomass. For this demonstration, we used simplifying assumptions (e.g., no feedbacks). In particular, we assumed that the arrow linking biomass to richness in the SE model represents unknown mechanisms involving the impacts of established dominant plants on light and soil resources and conditions. We emphasize this point because the continuous focus on biomass production in the literature has, in effect, implied some direct effect of biomass on richness. This is the kind of confusion that develops when one focuses on bivariate descriptive relationships.

Figure 4.

Initial structural equation model that attempts to represent the statistical expectations from the causal diagram in Fig. 3 while using variables available in Grace and Jutila (1999). Error terms not shown.

We used SEM procedures (Grace et al. 2012b) and the R software (R Core Team 2013) to evaluate the hypothesis represented in Fig. 4. For simplicity, linear additive specifications were used as approximating functions for what are known to be relationships with more complex form. Model-data discrepancies revealed that some linkages were missing from the initial model, leading to a revised final model with somewhat different form from the initial (Fig. 5). Additional information about this analysis is presented in Appendix S1.

Figure 5.

Structural equation model found to be consistent with data from Grace & Jutila (1999). Numbers next to paths are standardized effects.

The SEM results imply a number of conclusions. As hypothesized by the model, biomass was lower in grazed meadows and increased with increasing soil favourability and soil water availability. We also found support for a negative effect of biomass on richness, which we assume is an indirect effect mediated by unmeasured latent factors. However, processes unrelated to variations in biomass also affected richness (Fig. 5, direct paths from soil favourability and water availability to richness). These results are inconsistent with the basic assumption of the HBM that effects of disturbance and stress on richness can be predicted from total biomass. Finally, the full set of predictors in the model explained variations in species richness in the landscape reasonably well (R2 = 0·55) and substantially better than biomass alone (R2 = 0·14 obtained using a third-order polynomial regression), concordant with previous reviews (Grace 1999). Furthermore, the R2 from a polynomial regression of Y on X does not represent the causal effect of X on Y. In the case of a third-order polynomial, X is being included in the model as if it were three variables (x, x² and x³), some of which are standing in for other factors. Thus, while we can explain 14% of the variation in richness from biomass alone, this number greatly misrepresents any causal effect of biomass on richness and does not predict any responses that would be expected from changes in biomass.
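The contrast between a cubic polynomial in biomass and a model with the full set of predictors can be mimicked with simulated data. The sketch below is not the Finnish meadow analysis: the sample size, the variable names and every path coefficient are hypothetical values chosen only to give biomass-mediated and direct pathways of the kind described above, and ordinary least squares stands in for a full SEM fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # hypothetical number of plots

# Hypothetical exogenous drivers (standardized) and a 0/1 grazing indicator.
soil = rng.normal(size=n)
water = rng.normal(size=n)
grazed = rng.integers(0, 2, size=n).astype(float)

# Hypothetical causal structure: drivers affect biomass, and richness responds
# both to biomass and directly to the drivers. All coefficients are made up.
biomass = 0.5 * soil + 0.4 * water - 0.6 * grazed + rng.normal(scale=0.7, size=n)
richness = (-0.4 * biomass + 0.5 * soil - 0.5 * water - 0.3 * grazed
            + rng.normal(scale=0.7, size=n))

def r_squared(predictors, response):
    """R-squared of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(response)), predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    residuals = response - design @ coef
    return 1.0 - residuals.var() / response.var()

# Bivariate view: richness on biomass, biomass squared and biomass cubed.
r2_cubic = r_squared(np.column_stack([biomass, biomass**2, biomass**3]), richness)
# Multivariate view: richness on biomass plus the measured drivers.
r2_full = r_squared(np.column_stack([biomass, soil, water, grazed]), richness)

print(f"cubic polynomial in biomass: R2 = {r2_cubic:.2f}")
print(f"full set of predictors:      R2 = {r2_full:.2f}")
```

In this toy system the polynomial fit explains far less variation than the multi-predictor fit, and neither R2 is an estimate of the simulated biomass effect itself.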

Interpreting model results further, we see that grazing has two distinct and offsetting effects on richness. First, grazing indirectly promotes species richness by reducing total biomass and lowering competitive effects of dominance in the system (this is an indirect positive pathway made of two links with negative signs, which when multiplied together yield an indirect positive effect). Secondly, there is a direct path from grazing to richness that indicates reductions in richness independent of total biomass, perhaps a result of selective grazing effects on particular species. We point out that bivariate plots do not permit this kind of partitioning of direct and indirect effects.
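The arithmetic behind partitioning direct and indirect effects is simple path multiplication. A minimal sketch follows, using hypothetical standardized coefficients; these are not the estimates shown in Fig. 5, and the helper name `compound_path` is introduced here only for illustration.

```python
# Hypothetical standardized path coefficients (NOT the estimates in Fig. 5).
paths = {
    ("grazing", "biomass"): -0.40,
    ("biomass", "richness"): -0.30,
    ("grazing", "richness"): -0.20,
}

def compound_path(paths, chain):
    """Multiply the coefficients along a chain of variables."""
    effect = 1.0
    for source, target in zip(chain, chain[1:]):
        effect *= paths[(source, target)]
    return effect

# Indirect effect: grazing -> biomass -> richness (two negative links).
indirect = compound_path(paths, ["grazing", "biomass", "richness"])
# Direct effect: grazing -> richness.
direct = paths[("grazing", "richness")]
# Total effect is the sum over all pathways from grazing to richness.
total = direct + indirect

print(f"indirect = {indirect:+.2f}, direct = {direct:+.2f}, total = {total:+.2f}")
```

With these made-up values the two negative links give a positive indirect effect that partly offsets the negative direct effect, which is the kind of cancellation a bivariate plot cannot reveal.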

The HBM makes a fundamental assumption that the effects of stressors and disturbance in the system can be predicted by biomass (Fig. 1a). However, we found the two variables related to stress, represented in this analysis by gradients in soil favourability and water availability, as well as disturbance (grazing), to have unique and even opposing types of direct influences on richness in the SE model (Fig. 5). Model results imply that soil favourability indirectly reduces richness through its positive effects on total biomass. However, richness responds positively and directly to favourable soil conditions in the model. As with grazing, again we see a pair of processes that are partly offsetting because one pathway is positive and the other is negative. Effects of water availability, in contrast, represent a different set of relationships. Here, there is an indirect reduction in richness where water availability leads to high biomass, but there is also a general decline in richness with increasing water availability, probably due to the detrimental effects of soil flooding on potential richness (Gough, Grace & Taylor 1994).

Aside from being more informative and predictive than a bivariate plot, our SE model permits us to explicitly demonstrate difficulties with interpreting bivariate plots. The fundamental problem is elementary, and a basic tenet of both linear algebra and statistics – the number of variables observed needs to exceed (by at least one) the number of parameters/processes of interest if we are to estimate parameters in a model. Most would have first encountered this idea as the principle that you need as many equations as unknown parameters in a set of equations or else the system is ‘under identified’. In that case, one is not able to obtain unique estimates for parameters. We think that perhaps this principle's implications for interpreting bivariate patterns and relevance to the debate over the PRR are not universally appreciated.
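The identification problem can be made concrete with a toy linear system. In the sketch below, a single observed bivariate association is written as the sum of two unknown contributions (a direct effect and an effect of a shared cause). The numbers and the two-term decomposition are hypothetical simplifications, not quantities from the Finnish meadow model.

```python
import numpy as np

# One observed quantity: a bivariate association (hypothetical value).
observed_association = np.array([-0.30])

# Two unknowns: [direct effect, contribution of a shared cause].
# The single equation says: association = direct + shared.
design = np.array([[1.0, 1.0]])

n_equations, n_unknowns = design.shape
rank = np.linalg.matrix_rank(design)
is_identified = rank == n_unknowns  # False: one equation, two unknowns

# Two very different explanations reproduce the observation exactly.
explanation_a = np.array([-0.30, 0.00])   # purely direct effect
explanation_b = np.array([0.20, -0.50])   # positive direct effect, masked
fits_a = np.allclose(design @ explanation_a, observed_association)
fits_b = np.allclose(design @ explanation_b, observed_association)

print(f"equations={n_equations}, unknowns={n_unknowns}, identified={is_identified}")
print(f"explanation A fits: {fits_a}; explanation B fits: {fits_b}")
```

Both explanations fit equally well, so the bivariate association alone cannot choose between them; only additional observed variables supply the extra equations needed.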

To illustrate the problem of trying to choose amongst alternative models using bivariate plots, we started with the model in Fig. 5 and then asked ‘What can we discern about underlying processes from the bivariate relationships between richness and total biomass?’ We then created alternative models comprising subsets of the discovered processes for Finnish meadows and simulated values of richness assuming each of those models. The alternative models chosen are indicated in Fig. 6. Note that from a causal modelling perspective, all these alternatives (except Model A) are misspecifications of the true model. Those misspecifications would be readily detected using SEM procedures. Figure 6 shows results from nine different models (models A-I) that were estimated from the data along with the bivariate PRR patterns implied by those models. Model A represents the full model, and the plot of richness versus total biomass closely resembles the pattern in the raw data (Grace & Jutila 1999; Fig. 2). Model I represents a null model that fits only the mean for richness, while retaining the distributional properties of the raw data. What we hope is clear to the reader is that one cannot discern the differences between models generating the bivariate relations from simply looking at the bivariate plots.

Figure 6.

Models and simulated results. Diagram shows paths referred to in Key to Models. Also shown are the bivariate PRR plots generated by each model. Total biomass is expressed in grams per square metre.

The kinds of results presented in this example are not to be viewed as unique, but are instead typical for SEM studies. There are now many example applications of SEM to ecological problems. Grace (1999) provided an early review of SEM studies of plant species richness. More recent examples involving richness and/or productivity include Borer, Seabloom & Tilman (2012), Bowker, Maestre & Escolar (2010), Carnicer et al. (2007), Cavieres et al. (2013), Condon, Weisberg & Chambers (2011), Frainer, McKie & Malmqvist (2013), Gazol et al. (2013), Klaus et al. (2013), Lamb & Cahill (2008), Maestre et al. (2011), Mokany, Ash & Roxburgh (2008), Paquette & Messier (2011), Prober & Wiehl (2011), Rooney & Bayley (2011), Schultz et al. (2011), Šímová, Li & Storch (2011), Socher et al. (2012), Weiher (2003). We think that such studies are consistently more informative and also provide for a much more comprehensive explanation of system properties than do bivariate patterns.

Upper boundary regression gives invalid results for the Adler et al. data

Both Fridley et al. (2012) and Pierce (2013) commented that the bivariate PRR presented in Adler et al. (2011) appeared to them to be strongly humped. Pierce went on to apply a statistical technique known as upper boundary regression (Blackburn, Lawton & Perry 1992) to the data from Adler et al. and reported strong predictive relationships for boundary relations. Because much of Pierce's critique of Adler et al. depends on this reanalysis, it is important that the methods are appropriate for the data being analysed. The standard approach to analysing boundary relationships is quantile regression (Cade & Noon 2003), which uses all the data in estimating relationships in upper boundaries defined by various quantiles of the y-values. Adler et al. included quantile regressions in their analyses and reported the findings along with conventional regression results. Within sites, only one humped boundary relationship was found out of the 48 examined. Pierce chose not to use quantile regression, but instead upper boundary regression, which selects a non-random subset of the data for independent analyses. Using this method, he reports a very strong predictive relationship (R2 = 0·81) across all plots (an analysis that ignores the non-independence of plots sampled within sites). Across sites at the global scale, quantile regression results reported by Adler et al. indicated a weak but positive linear relationship for the 95th percentile. Only for the 90th percentile could we find a humped boundary and, again, it was weak (R2 = 0·05) and of questionable statistical significance. In contrast, Pierce's upper boundary regression analysis across sites produces a virtually perfect fit to a humped relationship (R2 = 0·98). How can two statistical procedures both designed to quantify boundary relationships lead to such radically different conclusions? As always, the devil is in the details.

While quantile regression fits models to various portions of the full data set (quantiles of the y-variable along the x-axis), upper boundary regression involves defining bins along the x-axis and then selecting some fixed number (not some proportion) of maximal or upper values in that bin while excluding data having lower values. While quantile regression is compatible with a wide variety of x-variable distributions, upper boundary regression has a strict requirement for the x-variable to be from a uniform distribution and for the sampling to capture an equal (and adequate) number of samples across the x-axis bins used for the analysis. Adler et al. made it clear that both biomass production and species richness were log-normally distributed in their sample (histograms are shown in their Fig. 1). Grace et al. (2012a) subsequently emphasized the challenges associated with visually interpreting linear plots of log-normal variates. What is important in the current situation is not simply that the Adler et al. data fail (badly) to conform to the strict requirements for valid inferences using upper boundary regression, but that violating this assumption strongly biases the test towards detection of concave-down (humped) relationships and grossly distorts estimated strengths of association.

How badly can the inappropriate use of upper boundary regression bias results towards finding a humped PRR? A simple simulation illustrates how upper boundary regression can produce evidence for a strongly humped relationship even when dealing with two independent random variables. If we draw 1000 values of two independent variables from independent distributions where the x-variable is drawn from a uniform distribution, upper boundary regression then shows (correctly) that there is no significant relationship (Fig. 7a). However, when the x-variable is drawn from some other distribution (here we use the log-normal, consistent with Adler et al.'s data), upper boundary regression tells us that a strong hump-shaped relationship is present (Fig. 7b). This is a bit shocking, given that we know the two variables are independent and unrelated. Note that similar bias will occur if we draw x from a normal distribution or any other distribution except for the uniform.
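A simulation of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the code behind Fig. 7: it keeps the single largest y in each of 10 equal-width x bins, fits a quadratic rather than the Lorentzian three-parameter model, draws y from a standard normal distribution, uses a log-normal x with a log-scale standard deviation of 0.25, and repeats the exercise 200 times so that the comparison does not hinge on one random draw. The function names are introduced here for illustration.

```python
import numpy as np

def upper_boundary_points(x, y, n_bins=10):
    """Keep the single largest y in each equal-width x bin (skip empty bins)."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_index = np.digitize(x, edges[1:-1])
    bx, by = [], []
    for b in range(n_bins):
        members = np.flatnonzero(bin_index == b)
        if members.size == 0:
            continue
        top = members[np.argmax(y[members])]
        bx.append(x[top])
        by.append(y[top])
    return np.array(bx), np.array(by)

def quadratic_fit(bx, by):
    """Return (quadratic coefficient, R-squared) of a second-order fit."""
    coef = np.polyfit(bx, by, deg=2)  # highest power first
    fitted = np.polyval(coef, bx)
    ss_res = np.sum((by - fitted) ** 2)
    ss_tot = np.sum((by - by.mean()) ** 2)
    return coef[0], 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
n_points, n_reps = 1000, 200
fits = {"uniform": [], "lognormal": []}

for _ in range(n_reps):
    y = rng.normal(size=n_points)  # y is independent of x by construction
    x_draws = {
        "uniform": rng.uniform(0.0, 1.0, size=n_points),
        "lognormal": rng.lognormal(mean=0.0, sigma=0.25, size=n_points),
    }
    for name, x in x_draws.items():
        fits[name].append(quadratic_fit(*upper_boundary_points(x, y)))

summary = {}
for name, values in fits.items():
    curvature, r2 = np.array(values).T
    summary[name] = {
        "mean_r2": r2.mean(),
        "share_concave_down": np.mean(curvature < 0),
    }
    print(name, {k: round(float(v), 2) for k, v in summary[name].items()})
```

The mechanism is that equal-width bins hold very unequal numbers of points when x is not uniform, and the maximum of many draws tends to exceed the maximum of few draws, so the selected upper values trace the sampling density of x rather than any dependence of y on x.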

Figure 7.

Example visualization of upper boundary regression applied to two independent random variables where (a) the x-variable is drawn from a uniform distribution and (b) the x-variable is drawn from a log-normal distribution. Upper boundary regression applied to random data that is not from a uniform distribution (e.g. as in b) leads to the conclusion that there is a strong, nonlinear relationship between variables, despite the fact that none exists. In this case, to be consistent with Pierce, we fit a Lorentzian three-parameter model to the data. Variance explained for the random data in panel (b) using upper boundary regression is 0·45.

There are additional problems with Pierce's reanalysis of the Adler et al. plot-level data, such as ignoring non-independence of plots from the same site, thereby inflating sample size and biasing the p-value downwards. The selection of upper boundary richness values introduces another serious bias in that of the original 48 sites represented in the data set, only seven remain in Pierce's upper boundary analysis. The bottom line is that, for multiple reasons, Pierce's analyses are invalid and misleading and, therefore, do not provide any new, or stronger, evidence for a hump-shaped relationship in the Nutrient Network data set.
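The consequence of ignoring clustering can be made concrete with the standard design-effect approximation for equal-sized clusters. This is a textbook formula offered as an illustration, not a calculation taken from Pierce or from Adler et al.; the symbols m (plots per site) and ρ (intraclass correlation of richness within sites) are introduced here for the purpose.

```latex
n_{\mathrm{eff}} = \frac{n}{1 + (m - 1)\,\rho}
```

With purely hypothetical values of m = 30 and ρ = 0·5, the denominator is 15·5, so the effective sample size would be roughly one-fifteenth of the nominal number of plots. A test that treats every plot as independent would therefore report a p-value that is far too small.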

Quantile regression results for the Adler et al. data

While we do not believe quantile regression will lead to clarification of the processes connecting richness and productivity, we understand some wish to know whether a more defensible methodology, like quantile regression, can detect a humped upper boundary in the Adler et al. data. A more complete description of the analyses we performed is given in Appendix S2, while the R-code used and a copy of the Adler et al. data are given in Appendices S3 and S4.

Quantile regression was performed on the site-level data using the ‘quantreg’ package (Koenker 2013, version 5·05). Figure S1 illustrates Ricker equation fit lines for quantile regression applied to the site-level data. As summarized in Table S1, results show that none of the relationships examined were close to statistical significance. A similar finding was obtained using polynomial regression (second-, third- and fourth-order models were considered), which has greater flexibility to conform to nonlinear data.

Because plots are clustered within sites in the Adler et al. data, simple quantile regression is inappropriate for the plot-level data. Geraci & Bottai (2013) describe a new method for analysing quantile relationships in multi-level data using mixed models (those with both random and fixed effects). Their procedures, implemented in the ‘lqmm’ package, are summarized in Geraci (2013). Analyses were implemented in R version 2·15·1 (R Core Team 2013) using lqmm version 1·02. We considered a number of equational forms for evaluating Pierce's hypothesis that the upper boundary of the regression of richness on biomass production is humped. Ultimately, we only gave serious consideration to equations that could be transformed to linear additive form because of the limitations of the quantreg and lqmm packages. We ended up choosing the Ricker equation (Cade & Guo 2000; Bolker 2008) as the best form for fitting quantiles in these data. This choice is discussed in greater detail in Appendix S2. Evaluation of the plot-level data led to findings similar to those for the site-level data (Table S2). Again, none of the quantile relationships estimated were close to meeting classical criteria for statistical significance.
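For readers unfamiliar with the Ricker form, the following shows why it can be handled by software restricted to linear additive models. The parameterization is the common textbook one, with y as richness, x as biomass production and a, b > 0 as parameters; it is an assumption that this matches the exact specification used in Appendix S2.

```latex
y = a\,x\,e^{-b x}
\quad\Longrightarrow\quad
\ln y = \ln a + \ln x - b\,x
\quad\Longleftrightarrow\quad
\ln\!\left(\frac{y}{x}\right) = \ln a - b\,x ,
\qquad x_{\mathrm{peak}} = \frac{1}{b}
```

After the transformation, the model is linear in ln a and b, and the curve is humped with its maximum at x = 1/b. Because the logarithm is monotone increasing, a quantile of ln y given x is the logarithm of the same quantile of y given x, so quantiles fitted on the transformed scale can be back-transformed directly.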

Certainly, there have been numerous other individual studies in which the bivariate relationship between biomass production and species richness was found to be modal and non-random (see Grace 1999 for a review). It is easy to understand why a visual examination of the Adler et al. data would lead one to suspect that some sort of significant boundary relationship might exist for these data. Our analyses, however, do not show a clear relationship between biomass production and richness. As we emphasized earlier, even if we were able to detect a significant humped relationship in the Adler et al. data, it would not move us forward in our understanding of causal relations.

Site inclusion in analyses

A second focus of Pierce's critique is on our decision to present a variety of results from analyses of different subsets of sites. Specifically, we showed that the shape of the PRR is sensitive to the inclusion of what we termed ‘anthropogenic’ sites, which we defined in the main text and the legend of Fig. 2 in Adler et al. as ‘pastures, old fields and restored prairies’. In contrast to all the other sites in our data set, these anthropogenic sites were either cultivated (the old fields and restored prairies) or seeded (pastures). Because of their land-use history, we worried that these sites might have much smaller species pools than all the other sites, which were never converted, cultivated or seeded, though many are managed with grazing or burning (these managed sites were included in all analyses). In fact, the anthropogenic sites do tend to have lower species richness than the rest of our sites (Fig. 2 in Adler et al. 2011). Including the anthropogenic sites provides evidence for a hump-shaped PRR, but we cannot determine whether this reflects the influence of competitive exclusion at high productivity or the influence of small effective species pools at these sites. Recall that we presented results both with these sites included and with these sites excluded, anticipating correctly that we could be criticized for either approach (as we were by Pan et al. 2012 and Fridley et al. 2012).

Pierce also questions the removal of the one saltmarsh site in our data set from some of the analyses. We take this opportunity to better explain our justification for removing the site. Our reasoning was based on (i) the fact that very few plant species have evolved the tolerances required to live in habitats both flooded and saline (i.e. saltmarshes) and, thus, species pool sizes are very small (Gough, Grace & Taylor 1994) and (ii) a PRR curve based on the inclusion of a saltmarsh might be criticized for implying that the observed low richness for a saltmarsh results from competition instead of low pool size.

While Pierce (2013) cited Fridley et al. (2012), who also questioned our decision to present the analysis of the reduced data set, he did not cite Pan, Liu & Zhang (2012), who criticized us in the same discussion for including the anthropogenic sites and the saltmarsh in some of our analyses. In fact, Pan, Liu & Zhang (2012) argued for culling even more sites to purify the sample, which leads to strong support for a positive linear PRR. Collectively, Fridley et al. (2012) and Pan, Liu & Zhang (2012) show how particular site selection criteria produce support for different hypotheses, which is why we presented multiple analyses to illustrate that our general conclusions were robust to such decisions. Ultimately, the results from all of our analyses support the conclusion that the relationship between productivity and richness is ‘weak and variable’. The strongest association observed, the global comparison of all sites, explained only 11% of the observed variation in richness, hardly the strength of relationship needed to be the basis for effective conservation management.

What is most helpful for conservation decision making?

Pierce's focus on management implications of productivity–richness relationships (PRRs) surprised us. Specifically, he appears to interpret our conclusion that PRRs are weak and variable as a recommendation that European land managers should stop mowing, grazing or burning pastures to maintain high plant species richness. How could our analysis of a global pattern be interpreted as a specific management recommendation? Perhaps through the following logic: (i) A hump-shaped PRR indicates that competitive exclusion limits species richness at high levels of biomass production; (ii) therefore, managers should reduce biomass production to promote species richness; (iii) conversely, if the PRR is not hump-shaped, then competitive exclusion does not limit richness and management to reduce biomass production should not promote richness.

The crucial assumption in this logical chain is that we can make inferences about the role of competitive exclusion from the observed, broad-scale relationship between productivity and richness. As we have shown, no single underlying process, or mechanism, can be inferred from the simple bivariate relationship between productivity and richness. In other words, the presence of a hump would not offer strong evidence for the importance of competitive exclusion, nor would the lack of a hump provide evidence against it. Current management techniques in European pastures are effective in spite of, not because of, the information provided by bivariate PRRs.

To further persuade readers that moving beyond the bivariate focus of the HBM is advantageous, we illustrate some of the features of a quantitative case study. Models based on causal network principles, as in Fig. 5, permit us to assess the predicted sensitivity of richness to variations in both management (grazing in this case) and other conditions in the landscape. While only grazing is managed actively at present in these systems, general conservation efforts might be importantly informed by knowledge of how strongly site conditions such as soil favourability and water availability impact richness. We illustrate the predictive implications of our model in Fig. 8 (see Grace et al. 2012b and Appendix S1 for a discussion of interventions in SEM).

Figure 8.

Intervention scenarios and forecasts. Diagram shows paths referred to in Key to Scenarios. The guide to scenarios uses Pearl's (2009) ‘do’ operator to describe the paths whose influence is set equal to zero or to its maximum value. Also shown are the frequency distributions for richness estimates generated in each scenario. Arrows in frequency plots indicate location of medians.

In this illustration, the histograms in Fig. 8 show predictions (omitting uncertainties) for how richness in the collection of plots sampled might respond if it were possible to change conditions. Two things drive these predictions: (i) the prediction coefficients and (ii) the distribution of values of the predictors; thus, these predictions apply only to the sample, not to some larger population. Comparing Scenarios B to A, results indicate that if it were somehow possible to protect sites from selective grazing impacts (perhaps through altered grazing rotations), no major increase in richness could be obtained. That said, there is a substantial range of richness values throughout the landscape due to the importance of other factors (both known and unknown). One of the most important other factors in the control of richness is soil favourability. Our Scenario C shows that if sites with the lowest level of favourability were chosen for conservation, a greatly reduced range of richness values could be expected, providing fewer conservation opportunities. On the other hand, if sites with favourable soil conditions were selected, much greater levels of species richness might be protected in the local sites (Scenario D). Skipping to Scenario F, results suggest that insufficient water is not limiting for most sites in the study, though the model results (Fig. 5) and further analyses tell us that for those few sites subject to frequent flooding (excess water availability), richness can be greatly reduced. Finally, Scenario E predicts that if, for example, mowing or haying were used to maintain optimal biomass levels (to values below 250 g/m2, based on the bivariate peak), the impact on average richness would be minimal (though sensitivity analyses indicate maximal richness would be constrained for a small percentage of the sites).
Through these scenarios, we show that by using SEM, we are able to make quantitative predictions customized for different sets of conditions and to incorporate the influences of grazing, soil favourability and water availability into computations.
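The mechanics behind such scenario forecasts can be illustrated with a toy linear structural model. Everything in this sketch is hypothetical: the two equations, the coefficient values, the five-plot sample and the function name are invented solely to show how a ‘do’ intervention replaces observed predictor values and how the result propagates through both direct and indirect paths. It is not the fitted model of Fig. 5, and it reads an intervention simply as fixing a predictor at a chosen value for every plot.

```python
import numpy as np

# HYPOTHETICAL coefficients for illustration only; these are NOT estimates
# from the meadow model discussed in the text.
COEF = {
    "biomass": {"intercept": 100.0, "soil": 40.0, "grazing": -30.0},
    "richness": {"intercept": 10.0, "soil": 3.0, "grazing": -1.0, "biomass": -0.01},
}

def predict_richness(soil, grazing, do=None):
    """Predict richness for each plot, optionally under an intervention.

    `do` maps a predictor name to a fixed value that replaces the observed
    values in every plot, e.g. do={"grazing": 0.0}.
    """
    values = {
        "soil": np.asarray(soil, dtype=float),
        "grazing": np.asarray(grazing, dtype=float),
    }
    for name, fixed in (do or {}).items():
        values[name] = np.full_like(values[name], fixed)
    b = COEF["biomass"]
    # Indirect path: predictors influence biomass, which influences richness.
    biomass = b["intercept"] + b["soil"] * values["soil"] + b["grazing"] * values["grazing"]
    r = COEF["richness"]
    return (r["intercept"] + r["soil"] * values["soil"]
            + r["grazing"] * values["grazing"] + r["biomass"] * biomass)

# Hypothetical sample of five plots (soil favourability scaled 0..1, grazing 0/1).
soil = [0.0, 0.25, 0.5, 0.75, 1.0]
grazing = [1, 0, 1, 0, 1]

scenarios = {
    "observed conditions": None,
    "do(grazing = 0)": {"grazing": 0.0},
    "do(soil = minimum)": {"soil": min(soil)},
    "do(soil = maximum)": {"soil": max(soil)},
}
medians = {name: float(np.median(predict_richness(soil, grazing, do)))
           for name, do in scenarios.items()}
for name, med in medians.items():
    print(f"{name}: median predicted richness = {med:.2f}")
```

As in the text, the forecasts depend both on the coefficients and on the predictor values present in the sample, so the medians describe these plots only and carry no uncertainty estimates.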

Finally, natural resource managers certainly benefit from the insights provided by ecological generalization; however, real-world conservation plans consider the many factors that will influence local responses to management (Theobald et al. 2000). Pierce's (and the HBM's) focus on boundary relations implies that one only has to manage biomass to manage richness. Such a prediction is dangerously misleading. Even for the case of managing biomass, Pierce's analysis fails to provide useful information. We must point out that Pierce's regression selects only 7 of the 48 sites for estimation of the boundary relation. How does one propose to inform management for the 41 sites not included in the modelling effort? In contrast, our illustration in Fig. 8 provides predictions for all locations and conditions. We would argue that it is information of this sort that is more relevant for decision makers.

Conclusions and implications

Adler et al. presented two general conclusions about the quest for a canonical PRR: (i) the body of empirical evidence suggests the bivariate association is weak and variable and (ii) we should turn our attention to the study of more complex models. Grace et al. (2012a) went on to state that even if productivity and richness were strongly and consistently correlated, we still would be unable to resolve underlying mechanisms and unable to choose amongst the many proposed theories and models. In this paper, we support the first of these conclusions by showing that the analytical approach, results and empirical conclusions presented by Pierce are invalid and that quantile regression, a more proper approach to the task, fails to find a strong relationship. Thus, we conclude once again that there is no secret, strong signal hidden in the bivariate data presented by Adler et al., just modest associations.

Clearly, our pleas for scientists to ‘focus on fresh, mechanistic approaches to understanding the multivariate links between productivity and richness’ have only been heard by some. We appreciate that this is a complex message and involves ideas and procedures that are unfamiliar to many. One approach to developing and testing more complex, multivariate hypotheses is structural equation modelling. While the roots of SEM lie in biology, these methods have yet to be widely incorporated into the quantitative training of biologists. Therefore, we have used this paper as an opportunity to demonstrate what a causal network perspective can bring to the study of productivity–richness relations (though our analysis is admittedly incomplete).

The HBM leads to only the simplest of predictions and permits only the most basic quantitative test, that there is some non-random, modal relationship between biomass and richness. As a result of this simplicity, it takes only modest evidence to ‘support the HBM’. Pierce uses one of the statistically significant results reported by Adler et al. (ignoring many non-significant results) to declare that our results are ‘in agreement with the classic humped-back model’. This is despite the fact that the vast majority of observed variation in species richness in that particular analysis could not be related to biomass production. We should be concerned that the strongest support for the HBM in the original Adler analysis left 89% of the variation unexplained.

Stated quite frankly, we are not content with the lack of progress on this topic over the past 40 years (since Grime 1973). As discussed in Chapter 12 of Grace (2006), scientists and the public should expect ideas to mature over time, becoming less ambiguous, more completely understood and more predictive. We should also expect debates about competing ideas to be resolvable. Yet, here we are in 2014 discussing the same simple abstraction as in 1973. We see no sign that new evidence or ideas have led to an evolution of the HBM, nor any indication that the debate is moving towards reconciliation.

The simplistic nature of the HBM not only hinders scientific progress but also limits its use for advising managers and decision makers. We live in a quantitative age. It is not sufficient to say only that the eutrophication of systems will lead to a reduction in maximum potential species richness. Managers expect science to say more precisely how much change will lead to what quantity of impact, which sites and systems will be most impacted, and when thresholds will be crossed (Mitchell et al. 2014). The HBM produces no guidance of that sort. In contrast, an approach based on causal networks and structural equation models does permit such refined evaluations and forecasts, as we briefly demonstrate in this paper (for a more detailed example, see Grace et al. 2012b, Fig. 12 for specific predictions of thresholds for eutrophication impacts in National Parks).

Perhaps of greatest concern to us is the continuing emphasis on compiling even more data on only two variables, total biomass and richness. Such efforts are, at best, inefficient ways of clarifying understanding and, at worst, lost opportunities to collect the data needed for more informative and predictive models. We hope that this paper will encourage other researchers to get over the hump and examine more complete hypotheses using the new methods that are available.

Acknowledgements

We acknowledge Heli Jutila for the availability of data used in some of our illustrations and Marco Geraci for advice on the use of the lqmm package for mixed-model quantile regression. We also thank T.M. Anderson and three anonymous reviewers for comments on an earlier version of the manuscript. Support for JBG was provided by the USGS Ecosystems Program. PBA was supported by the National Science Foundation (DEB-1054040) and Utah State University. The use of trade names is for descriptive purposes only and does not imply endorsement by the US Government. Coordination and data management of the Nutrient Network (http://www.nutnet.org) experiment has been supported by funding to E. Borer and E. Seabloom from the National Science Foundation Research Coordination Network (NSF-DEB-1042132) and Long Term Ecological Research (NSF-DEB-1234162 to Cedar Creek LTER) programs and the Institute on the Environment (DG-0001-13). We also thank the Minnesota Supercomputer Institute for hosting project data and the Institute on the Environment for hosting Network meetings. The authors declare that they have no conflicts of interest relative to this work.

Footnotes

  1. While not reported by the authors, removal of only anthropogenic sites, without omitting the saltmarsh site, also supported a positive linear PRR instead of a humped PRR.
