Whilst the volumes of data generated by scientific instruments such as gene sequencers and satellite-mounted sensors can tax the skills of the most quantitative ecologist, a comparable challenge arises with any effort to review and synthesize the primary literature. Certainly, the sheer number of published scientific journal articles is astounding. PubMed Central has details of over 19 million abstracts, and ISI and Scopus both index at least 40 million records. *New Phytologist* has published over 18 000 articles since its first issue in 1902. Long gone are the days when tracking a handful of journals would suffice to keep one abreast of important research developments. As such, review papers have shifted from a narrative-driven to a data-driven approach, taking the results from primary research articles and quantitatively analyzing and synthesizing these data in an attempt to arrive at more robust conclusions.

This meta-analysis approach has become the methodological platform of choice in many areas of empirical science, including ecology. In the process, the statistical tool kit available to the ecological meta-analyst has become more sophisticated and better adapted to the diverse array of experimental designs and reporting formats favored by field biologists (see Lajeunesse, 2011; Koricheva *et al.*, 2012). Concurrent with widening use and increased statistical utility in ecology, the standards of meta-analysis also have been raised. Two recent papers from an international group of peatland ecologists (Limpens *et al.*, 2011 and Limpens *et al.*, 2012, this issue of *New Phytologist*, pp. 408–418) stand out in this regard and bear close examination by those considering embarking on their own meta-analysis (or requiring the same of a graduate student).

As background, Lajeunesse (2010) put forward general recommendations for a high-quality ecological meta-analysis. Of central importance is transparency in the criteria used to select studies for inclusion in the meta-analysis, with a premium on broad inclusivity. Limpens *et al.* make clear in both their meta-analyses of experimental nitrogen (N) additions to peatlands how the literature was searched and which response variables and covariates were extracted from each study. However, one of the study **inclusion** criteria from Limpens *et al.* (2011), that experimental *Sphagnum* plots be exposed to diurnal and seasonal changes in solar irradiance and temperature, resulted in the **exclusion** of all glasshouse-based studies. Limpens *et al.* pose the appropriate follow-on question: do glasshouse and field studies yield similar results?

Another meta-analysis best practice is to account for differences in precision among studies. This often is accomplished by weighting each study effect size by the inverse of the variance (Shadish & Haddock, 1994). Limpens *et al.* take a different, yet effective, approach, graphically examining the relationship between effect size and variance across all studies and testing for differences in within-study variance between glasshouse and field studies. That they found no difference between these two groups is itself an interesting result, since systematically lower variance (higher precision) in glasshouse studies would cause them to be weighted more heavily in meta-analyses (Gurevitch & Mengersen, 2010). The authors further probed the robustness of their results by re-running their models while sequentially removing single studies (a leave-one-out jack-knife approach) and testing for sensitivity to extreme data points.
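The two practices described above, inverse-variance weighting and a leave-one-out sensitivity check, can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not the procedure used by Limpens *et al.*; the function names and the effect sizes and variances are hypothetical and chosen only to show the mechanics.

```python
def inverse_variance_mean(effects, variances):
    """Pooled effect size with each study weighted by 1 / variance.

    Returns the pooled estimate and its variance (1 / sum of weights).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    return pooled, 1.0 / total


def leave_one_out(effects, variances):
    """Re-estimate the pooled effect with each study removed in turn."""
    estimates = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        estimates.append(inverse_variance_mean(kept_effects, kept_variances)[0])
    return estimates


# Hypothetical study-level effect sizes and variances (illustration only).
effects = [0.30, 0.45, 0.10, 0.52, 0.38]
variances = [0.02, 0.05, 0.01, 0.08, 0.04]

pooled, pooled_var = inverse_variance_mean(effects, variances)
jackknife = leave_one_out(effects, variances)

print("pooled estimate:", round(pooled, 3), "variance:", round(pooled_var, 4))
for i, estimate in enumerate(jackknife):
    print("without study", i + 1, "->", round(estimate, 3))
```

A large shift in the pooled estimate when one study is dropped flags that study as influential. In this sketch the most precise study (the third) pulls the pooled value toward its own effect size, which is exactly the behaviour that would favour glasshouse studies if they were systematically more precise.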

A common problem associated with ecological meta-analyses is violation of the assumption of independent effect size estimates (Hedges *et al.*, 2010). This problem can occur in different ways. For example, sampling dependence occurs when experimental designs include multiple treatment responses with a common control, as was the case with many of the N addition studies analyzed by Limpens *et al.* The same form of nonindependence arises in repeated measures designs, where the same plot or individual, sampled at sequential time intervals, is compared to its initial value. Hierarchical dependence occurs when a single research group contributes multiple reports, or effect size estimates, to the meta-analysis, which is quite typical in ecology. Limpens *et al.* take advantage of recent theoretical and computational advances to employ a hierarchical Bayes linear model for their meta-regression that controls for both sampling and hierarchical dependence (Stevens & Taylor, 2009). This is an important development and their approach should find a home in many a meta-analyst’s tool kit.

To illustrate the utility of this approach, we reconsider results from the meta-analysis of Nave *et al.* (2009) summarizing the effects of elevated N inputs on forest soil N mineralization (N_{min}). In this simple example, drawn from the work of McNulty & Aber (1993), N_{min} from replicate control plots is compared to N_{min} from plots receiving four different N addition rates to arrive at an overall estimate of the effect of N addition on soil N_{min}. The effect size estimates for each N addition rate are therefore not independent because they are compared to a common control. This problem can now be easily addressed with readily available statistical software.

Here, we use the software R (R Development Core Team, 2012), the same free and open-source software used by Limpens *et al.* (2011, 2012). Even within R, there are a number of analytical packages that have been developed for meta-analysis. One that can deal with both sampling and hierarchical dependence is the ‘metahdep’ (hierarchical dependence in meta-analysis) package developed by Stevens & Taylor (2009). The metahdep() function within this package can take a variety of modeling approaches, either a fixed or random effects meta-analysis, or a hierarchical Bayes meta-analysis. Within each of these approaches, covariates can be treated as independent, or one can account for sampling and/or hierarchical dependence. In this example, we present a simple fixed effects meta-analysis to illustrate the consequence of sampling dependence on an estimate of overall effect size. Table 1 shows the primary data.

| | Net mineralization, mean (kg N ha^{−1} yr^{−1}) | Net mineralization, SD (kg N ha^{−1} yr^{−1}) |
|---|---|---|
| Control | 15.7 | 1.5 |
| Treatment 1 | 48.2 | 3.2 |
| Treatment 2 | 54.3 | 10.6 |
| Treatment 3 | 61.4 | 12.5 |
| Treatment 4 | 62.3 | 12.8 |

Mean and standard deviations from Table 5 in McNulty & Aber (1993). *N*=

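Following notation and Eqn 1 from Lajeunesse (2011):

$$RR = \ln\left(\frac{\bar{X}_T}{\bar{X}_C}\right)$$

we first calculate the log response ratios (RR) for each control-treatment pair ($\bar{X}_T$, treatment mean; $\bar{X}_C$, control mean), resulting in the vector **E** = [1.122 1.241 1.364 1.378]′. Then, following Lajeunesse’s Eqn 8, simplified for two treatment levels (A and B),

$$\mathbf{V} = \begin{bmatrix} \dfrac{(\mathrm{SD}_T^A)^2}{N_T^A(\bar{X}_T^A)^2} + \dfrac{(\mathrm{SD}_C)^2}{N_C\bar{X}_C^2} & \dfrac{(\mathrm{SD}_C)^2}{N_C\bar{X}_C^2} \\[2ex] \dfrac{(\mathrm{SD}_C)^2}{N_C\bar{X}_C^2} & \dfrac{(\mathrm{SD}_T^B)^2}{N_T^B(\bar{X}_T^B)^2} + \dfrac{(\mathrm{SD}_C)^2}{N_C\bar{X}_C^2} \end{bmatrix}$$

we can calculate the variance–covariance matrix (SD, standard deviation; *N*, sample size, for the control (subscript C) and each (superscript A or B) treatment (subscript T)). The main diagonals of the matrix **V** are the variances of the effect size for each treatment (now expanded to a total of four, from Table 1), and the off-diagonals are the covariances, which arise from the shared control.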

Along with the design matrix, **X** = [1 1 1 1], **V** and **E** are arguments in the metahdep() function in the ‘metahdep’ package. We find that, when all samples are treated as independent (i.e. assuming all the covariances, off-diagonals, are zero), the aggregate response ratio is 1.21 with a variance of 0.0015. Correctly accounting for sampling dependence, the effect size is reduced ( and ). Expanding the analysis to account for hierarchical dependence requires the use of a Bayesian approach, which in the ‘metahdep’ package is straightforward and easily implemented.
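The calculation can be sketched outside R as a plain generalized least squares estimate with an intercept-only design matrix. The Python below is a minimal illustration of the arithmetic behind a fixed effects analysis, not the metahdep() implementation, and all function names are hypothetical. The sample size is an assumption: it is not recoverable from the table note as reproduced here, so `ASSUMED_N = 5` is a placeholder for illustration only. With that assumed value, the independent case comes out close to the 1.21 and 0.0015 reported above.

```python
import math


def log_response_ratios(control_mean, treatment_means):
    """ln(treatment mean / control mean) for each treatment."""
    return [math.log(m / control_mean) for m in treatment_means]


def covariance_matrix(control, treatments, shared_control=True):
    """Variance-covariance matrix of the log response ratios.

    control and each treatment are (mean, sd, n) tuples. Off-diagonals are
    the control term when the control is shared, and zero otherwise.
    """
    c_mean, c_sd, c_n = control
    control_term = c_sd ** 2 / (c_n * c_mean ** 2)
    k = len(treatments)
    V = [[0.0] * k for _ in range(k)]
    for i, (mean, sd, n) in enumerate(treatments):
        for j in range(k):
            if i == j:
                V[i][j] = sd ** 2 / (n * mean ** 2) + control_term
            elif shared_control:
                V[i][j] = control_term
    return V


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fixed_effect(E, V):
    """Pooled estimate (X'V^-1 X)^-1 X'V^-1 E with X a column of ones."""
    w = solve(V, [1.0] * len(E))      # V^-1 1 (V is symmetric)
    total = sum(w)                    # 1' V^-1 1
    estimate = sum(wi * ei for wi, ei in zip(w, E)) / total
    return estimate, 1.0 / total


ASSUMED_N = 5  # ASSUMPTION: sample size is not given in the text as shown
control = (15.7, 1.5, ASSUMED_N)
treatments = [(48.2, 3.2, ASSUMED_N), (54.3, 10.6, ASSUMED_N),
              (61.4, 12.5, ASSUMED_N), (62.3, 12.8, ASSUMED_N)]

E = log_response_ratios(control[0], [t[0] for t in treatments])
est_indep, var_indep = fixed_effect(
    E, covariance_matrix(control, treatments, shared_control=False))
est_dep, var_dep = fixed_effect(
    E, covariance_matrix(control, treatments, shared_control=True))

print("E =", [round(e, 3) for e in E])
print("independent:", round(est_indep, 3), round(var_indep, 4))
print("shared control:", round(est_dep, 3), round(var_dep, 4))
```

Setting `shared_control=False` zeroes the off-diagonals, which reduces the estimate to an ordinary inverse-variance weighted mean. Keeping the covariances lowers the pooled estimate and raises its variance, because the information contributed by the single set of control plots is no longer counted four times.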

Until recently, the primary options available to the ecological meta-analyst when confronting non-independent data were to ignore the problem, to discard data that violated independence assumptions, or to partition the data to avoid aggregating non-independent observations. Fortunately, we now have at our disposal the statistical tools and excellent examples of their use that enable a much better path forward. There are many ways to adapt to life in a data-rich world, but the rigorous, quantitative synthesis of our collective efforts to shed light on the workings of the natural world will only increase in importance as our knowledge base expands.