The influence of climate on biodiversity is a central concept in ecology. Studies include documenting phenological changes (e.g. Both et al. 2009), analysing the responses of biodiversity to past and current climate (e.g. Woodward 1987) and describing population dynamics of species in response to weather variables (e.g. Stenseth et al. 2004; Jonzén et al. 2009). Studies of population dynamics in changing climates have become increasingly important, yet these typically describe effects of climate on individual species or collections of species operating independently (e.g. Keith et al. 2008). A lingering concern is that species can interact through effects such as competition, mutualism, predation and spread of disease, and even subtle interactions can have profound effects on responses of species to climate, making predictions uncertain (Tylianakis et al. 2008). Can we simply add predicted responses across a range of species, and if not, how can we accommodate correlations? Ignoring these interactions is analogous to the single-species vacuum in which models of population dynamics are typically enveloped (Sabo 2008). The paper by Mutshinda, O’Hara & Woiwod (2011), which examined the influence of rainfall and temperature on moth population dynamics, shows how to reveal correlations in the response of species that are not explained by the environmental variables. Their approach has the potential to further our understanding of the response of species assemblages to climate change and other environmental drivers.
The low residual correlation in the study of moth population dynamics (Mutshinda, O’Hara & Woiwod 2011) is good news, essentially confirming that cataloguing the responses of individual species in separate analyses might provide similar results to one, such as this, that integrates across multiple species explicitly. Of course, one moth does not make a summer. This is a single study on one group of species that occupy a similar trophic position and it is viewed through a 30-year window with no more than modest trends in annual climate. The light-trapping data might not be sufficiently precise to reveal correlated changes in abundance. Further, interspecific competition or other changes in the trophic network over that time horizon might be sufficiently small such that unexplained correlations are minor. Species that are linked more directly, for example via parasitism or predation, or that experience larger changes in their immediate trophic network might exhibit stronger correlations in their responses. Nevertheless, it points the way to a new approach to data analysis.
Standard assumptions in many of the analyses described in ecologists’ elementary statistical textbooks include effects being additive and residuals being drawn from independent and identical normal distributions (e.g. linear regression and ANOVA). Modern statistical analyses are relaxing these requirements. Generalised linear models have been used in ecology for some time, providing a suite of distributions beyond the normal (McCullagh & Nelder 1989; Austin, Nicholls & Margules 1990). Generalised additive models and other methods based on machine learning can accommodate nonlinear relationships (Hastie & Tibshirani 1990; Stenseth et al. 2004; Elith & Leathwick 2009). Finally, the assumption of independence is now being relaxed in ecology, with the study by Mutshinda, O’Hara & Woiwod (2011) showing that the entire variance-covariance matrix can be modelled and estimated, rather than assuming independence in which the covariances equal zero.
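To make the contrast between assumed independence and an estimated variance-covariance matrix concrete, the following minimal sketch simulates residuals for two hypothetical species and estimates their 2 x 2 covariance matrix. It is an illustration only and not the model of Mutshinda, O’Hara & Woiwod (2011): the function names, the two-species setting, the correlation of 0.6 and the unrealistically long series are all assumptions chosen so that the estimate is stable.

```python
import random


def simulate_residuals(n_years, rho, seed=1):
    """Residuals for two hypothetical species with true correlation rho."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_years):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix two independent standard normal draws so that corr(e1, e2) = rho.
        e1 = z1
        e2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * z2
        pairs.append((e1, e2))
    return pairs


def covariance_matrix(pairs):
    """Sample 2 x 2 variance-covariance matrix (n - 1 denominator)."""
    n = len(pairs)
    mean1 = sum(p[0] for p in pairs) / n
    mean2 = sum(p[1] for p in pairs) / n
    s11 = sum((p[0] - mean1) ** 2 for p in pairs) / (n - 1)
    s22 = sum((p[1] - mean2) ** 2 for p in pairs) / (n - 1)
    s12 = sum((p[0] - mean1) * (p[1] - mean2) for p in pairs) / (n - 1)
    return [[s11, s12], [s12, s22]]


def correlation(cov):
    """Correlation implied by a 2 x 2 covariance matrix."""
    return cov[0][1] / (cov[0][0] * cov[1][1]) ** 0.5


# Hypothetical example: a long series so the estimate is close to the truth.
cov = covariance_matrix(simulate_residuals(n_years=5000, rho=0.6))
print("estimated covariance matrix:", cov)
print("estimated correlation:", round(correlation(cov), 2))
```

Assuming independence amounts to fixing the off-diagonal element at zero; estimating it lets the data indicate whether that assumption is reasonable.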
Explicitly modelling the covariances in ecological data is important for several reasons. First, the assumption of independence is convenient, but unlikely to be true in ecology. Sites can experience synchronous or asynchronous changes, and effects are likely to propagate through time, leading to both spatial and temporal correlation within species. Interactions between species will lead to correlations among species. Ignoring these correlations will lead to problems that are essentially matters of pseudo-replication (Hurlbert 1984). Modelling the dependencies provides a more honest analysis of the information content of the data and actually allows us to estimate the magnitude of correlation rather than assuming it is zero.
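The link between correlation and information content can be sketched with a standard result: if n observations with common variance share an equal pairwise correlation rho, the variance of their mean is inflated by the factor 1 + (n - 1) * rho relative to the independent case. The equal-correlation structure and the function names below are simplifying assumptions for illustration; real ecological data would rarely be this tidy.

```python
def variance_inflation(n, rho):
    """Factor by which Var(mean) exceeds sigma^2 / n under equal correlation rho.

    Assumes every pair of observations has the same correlation rho,
    with rho >= 0 (hypothetical, simplified dependence structure).
    """
    return 1.0 + (n - 1) * rho


def effective_sample_size(n, rho):
    """Number of independent observations carrying the same information."""
    return n / variance_inflation(n, rho)


# Hypothetical example: 30 annual counts with increasing correlation.
for rho in (0.0, 0.1, 0.5):
    print(rho, round(effective_sample_size(30, rho), 1))
```

Even modest correlation sharply reduces the effective number of observations, which is why treating correlated data as independent overstates precision.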
The correlations can provide ecological insight by suggesting avenues for further research. Low correlations suggest questions about why different species, which could be otherwise similar, are responding differently, e.g. because of niche specialisation or neutrality. High correlations suggest one or more important explanatory factors have been omitted from the analysis, for instance, abiotic variables, species interactions, or phylogenetic or biogeographic history. These correlations provide springboards for further understanding and new lines of enquiry.
Some readers may find the statistical analysis in Mutshinda, O’Hara & Woiwod (2011) complex. It uses state-space models, which have two components (Buckland et al. 2004). The first component models the change in the population size of species, and the second component models the observation of the data given the abundance of the species. This allows the analysis to conform to the way in which the data were collected, rather than forcing the data collection to conform to a particular statistical model. Doing this permits direct modelling of the relationships between population sizes, leading to inference at the level of the ecological processes rather than being restricted to models and inference of the relationships in the data.
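The two components of a state-space model can be sketched by simulation. The example below is a minimal illustration for a single hypothetical species, assuming a linear process model on the log-abundance scale and normally distributed observation error; these choices, the parameter values and the function names are assumptions made for clarity and are not the model fitted by Mutshinda, O’Hara & Woiwod (2011).

```python
import random


def simulate_state_space(n_years, a=0.5, b=0.8, sd_process=0.2,
                         sd_obs=0.4, seed=1):
    """Simulate hidden log-abundances and the noisy observations of them."""
    rng = random.Random(seed)
    x = a / (1.0 - b)  # start at the long-run mean of the process
    states, observations = [], []
    for _ in range(n_years):
        # Component 1, process model: how true log-abundance changes.
        x = a + b * x + rng.gauss(0.0, sd_process)
        # Component 2, observation model: what the survey records given x.
        y = x + rng.gauss(0.0, sd_obs)
        states.append(x)
        observations.append(y)
    return states, observations


def variance(values):
    """Sample variance with an n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


states, observations = simulate_state_space(n_years=2000)
print("variance of true states: ", round(variance(states), 3))
print("variance of observations:", round(variance(observations), 3))
```

The observations vary more than the underlying states because they carry sampling error on top of the real population fluctuations; an analysis of the raw data alone would attribute all of that variation to the population.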
State-space models do not substitute for poor experimental design. For example, ensuring that parameters are identifiable remains an issue (Mutshinda, O’Hara & Woiwod 2011). However, the analysis simplifies interpretation by separating the signal (the change in the abundance of species) from the noise (the data collection, which is an imprecise sample of the true abundance). This separation of the signal from the noise was the original motivation for developing state-space models in aircraft and space navigation, with the Apollo moon missions being some of the first applications (Hutchinson 1984). Yes, this is rocket science.
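Separating signal from noise can be illustrated with the Kalman filter, the classic algorithm for linear Gaussian state-space models. The sketch below is for a single hypothetical species and assumes the process and observation parameters are known, which would not be the case in a real analysis; it is a different technique from the Bayesian estimation used by Mutshinda, O’Hara & Woiwod (2011), and all names and parameter values are illustrative assumptions.

```python
import random


def simulate(n, a, b, sd_process, sd_obs, seed):
    """Hidden states x_t = a + b * x_{t-1} + noise and observations y_t = x_t + noise."""
    rng = random.Random(seed)
    x = a / (1.0 - b)
    states, obs = [], []
    for _ in range(n):
        x = a + b * x + rng.gauss(0.0, sd_process)
        states.append(x)
        obs.append(x + rng.gauss(0.0, sd_obs))
    return states, obs


def kalman_filter(obs, a, b, var_process, var_obs, mean0, var0):
    """Filtered estimates of the hidden state, given known parameters."""
    mean, var = mean0, var0
    filtered = []
    for y in obs:
        # Predict the state one step ahead using the process model.
        mean_pred = a + b * mean
        var_pred = b * b * var + var_process
        # Update the prediction with the new observation.
        gain = var_pred / (var_pred + var_obs)
        mean = mean_pred + gain * (y - mean_pred)
        var = (1.0 - gain) * var_pred
        filtered.append(mean)
    return filtered


def mean_squared_error(estimates, truth):
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


# Hypothetical parameter values, chosen only for illustration.
a, b, sd_p, sd_o = 0.5, 0.8, 0.2, 0.4
states, obs = simulate(500, a, b, sd_p, sd_o, seed=1)
filtered = kalman_filter(obs, a, b, sd_p ** 2, sd_o ** 2,
                         mean0=a / (1.0 - b), var0=1.0)
print("error of raw observations: ", round(mean_squared_error(obs, states), 3))
print("error of filtered estimates:", round(mean_squared_error(filtered, states), 3))
```

The filtered estimates should track the true states more closely than the raw observations do, because each estimate weighs the new observation against what the process model predicts.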
The ability of the statistical analysis to conform to the data, allowing ecological processes to be modelled directly, explains why ecologists and environmental scientists are increasingly using Bayesian methods (Clark 2005; McCarthy 2007). Most Bayesian analyses in ecology are motivated by the ease of analysing hierarchical models. They rarely take advantage of the principal attribute of Bayesian analyses, which is the ability to incorporate prior information. Instead, they usually use vague priors such that the posterior distribution is similar in shape to the likelihood function. Indeed, the Bayesian Markov chain Monte Carlo machinery can be used to generate maximum likelihood estimates that are independent of the prior (Lele, Dennis & Lutscher 2007). However, progress in ecology hinges on synthesis across case studies. Such synthesis suggests that a priori predictions of ecological parameters should be possible (McCarthy, Citroen & McCall 2008). For example, the consistent responses of the species of Lepidoptera to winter rainfall and temperature (Fig. 4a in Mutshinda, O’Hara & Woiwod 2011), and the consistency with previous studies noted by the authors, suggest that informative priors could be established in this case. Embracing informative priors based on synthesis of previous data would change the comparisons of new and old studies. Currently restricted to qualitative statements in the discussion sections of papers, these comparisons would be reported in the methods and results when using informative priors, making them much more quantitative. The paper by Mutshinda, O’Hara & Woiwod (2011) shows how Bayesian analyses can integrate across multiple species within a single study, accounting for both the interactions among species and the nature of the data. Ecology should aim for further steps towards systematic integration across studies.
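The difference between vague and informative priors can be sketched with the simplest conjugate case: a normal prior for a mean, combined with normally distributed data of known standard deviation. The data values, the prior settings and the function name below are hypothetical and serve only to show the mechanics; they are not taken from any of the studies cited.

```python
def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior mean and sd for a normal mean with known data sd (conjugate update)."""
    n = len(data)
    data_mean = sum(data) / n
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_precision * prior_mean
                 + data_precision * data_mean) / post_precision
    return post_mean, post_precision ** -0.5


# Hypothetical estimates of a climate effect from a new study.
effects = [-0.30, -0.10, -0.25, -0.15]
sigma = 0.2  # assumed known standard deviation of each estimate

# Vague prior: the posterior is essentially the likelihood.
print(normal_posterior(effects, sigma, prior_mean=0.0, prior_sd=1e6))

# Informative prior, as might be derived from a synthesis of earlier studies.
print(normal_posterior(effects, sigma, prior_mean=-0.3, prior_sd=0.1))
```

With the vague prior, the posterior mean matches the sample mean and the posterior standard deviation matches the usual standard error. With the informative prior, the estimate is drawn towards the prior mean and becomes more precise, which makes the comparison between new and earlier results an explicit, quantitative part of the analysis.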