Keywords:

  • amphibians;
  • biotic interactions;
  • community assembly;
  • correlated residuals;
  • Eucalyptus;
  • frogs;
  • species covariance

Summary

  1. A primary goal of ecology is to understand the fundamental processes underlying the geographic distributions of species. Two major strands of ecology – habitat modelling and community ecology – approach this problem differently. Habitat modellers often use species distribution models (SDMs) to quantify the relationship between species and their environments without considering potential biotic interactions. Community ecologists, on the other hand, tend to focus on biotic interactions and, in observational studies, use co-occurrence patterns to identify ecological processes. Here, we describe a joint species distribution model (JSDM) that integrates these distinct observational approaches by incorporating species co-occurrence data into an SDM.
  2. JSDMs estimate distributions of multiple species simultaneously and allow decomposition of species co-occurrence patterns into components describing shared environmental responses and residual patterns of co-occurrence. We provide a general description of the model, a tutorial and code for fitting the model in R. We demonstrate this modelling approach using two case studies: frogs and eucalypt trees in Victoria, Australia.
  3. Overall, shared environmental correlations were stronger than residual correlations for both frogs and eucalypts, but there were cases of strong residual correlation. Frog species generally had positive residual correlations, possibly because these species occurred in similar habitats that were not fully described by the environmental variables included in the JSDM. Eucalypt species that interbreed had similar environmental responses but had negative residual co-occurrence. One explanation is that interbreeding species may not form stable assemblages despite having similar environmental affinities.
  4. Environmental and residual correlations estimated from JSDMs can help indicate whether co-occurrence is driven by shared environmental responses or by other ecological or evolutionary processes (e.g. biotic interactions), or whether important predictor variables are missing. JSDMs take into account the fact that distributions of species might be related to each other and thus overcome a major limitation of modelling species distributions independently.

Introduction

The geographic distribution of a species is influenced by its environmental tolerances, as well as by interactions with other species (Hutchinson 1957), but decomposing the roles of abiotic and biotic factors on species’ distributions is far from routine. Species distribution models (SDMs) that correlate the occurrence or abundance of a species with abiotic variables (e.g. climate and topography) are typically used to investigate species-environment relationships (Austin 2002). However, most SDMs only implicitly consider interactions between species (Dormann et al. 2012), despite the potentially important influence of biotic interactions on species’ ranges (Davis et al. 1998; Wisz et al. 2013).

On the other hand, community ecology studies that tackle questions of co-occurrence tend to focus on interactions between species (e.g. trophic dynamics, facilitation or competition). The environment is often inferred to be important if species within local communities are functionally similar (relative to null or randomized communities; Webb et al. 2002). In these randomizations, co-occurrence is often represented by an index (Hardy 2008) that does not account for the amount of co-occurrence that can be attributed to shared environmental responses among species. However, studies are beginning to link the fields of community assembly and species distribution modelling (for a review see Kissling et al. 2012). For example, Helmus et al. (2007) used the residuals from an SDM to calculate a co-occurrence index, thereby considering the effect of environmental variables on co-occurrence estimates.

Likewise, studies that use SDMs are beginning to consider species interactions by restricting the predicted distribution of one species to that of another (Schweiger et al. 2012) or by adding the occurrence or abundance of other species as predictors alongside abiotic variables (Leathwick & Austin 2001; Leathwick 2002; Meier et al. 2010; Pellissier et al. 2010). The addition of biotic interaction terms has generally improved the predictive performance of SDMs (Araujo & Luoto 2007; Heikkinen et al. 2007), and in some cases, biotic predictors have outperformed abiotic variables (Meier et al. 2010). However, this approach only models unidirectional interactions between species and confounds the influence of species interactions and environmental covariates (Kissling et al. 2012).

Similarities in environmental responses of species can be accommodated in multispecies SDMs (Ovaskainen & Soininen 2011; Pollock, Morris & Vesk 2012), and such responses to environmental gradients can be modelled as a function of species traits (Pollock, Morris & Vesk 2012). However, not all features that influence co-occurrence, particularly biotic interactions, will be captured by environmental variables. In this case, residual patterns of co-occurrence will exist. For example, two species might have a 0·5 probability of occurrence at a site, in which case each of the four combinations of co-occurrence (both species present, both absent and one or the other present) would be equally likely if the species occurred independently. However, if the species were perfectly positively associated (taken as one extreme for illustrative purposes), then they would occur together at 50% of sites, and both would be absent from 50% of sites. Alternatively, with perfect negative co-occurrence, one species would be present at 50% of sites, while the other species would only be present at the other 50% of sites, and the species would never occur together. Such residual patterns of co-occurrence can be thought of as correlations in the random (Bernoulli distributed) occurrence of species.
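The three scenarios in the example above can be tabulated with a short sketch. This is an illustration only, not part of the original analysis; the function name `joint_table` and its arguments are hypothetical. Given the two marginal probabilities of occurrence and the probability that both species are present, the remaining three cells follow by subtraction.

```python
def joint_table(p1, p2, p_both):
    """Probabilities of the four presence/absence combinations for two species.

    p1, p2  : marginal probabilities of occurrence of species 1 and 2
    p_both  : probability that both species are present at a site
    """
    return {
        "both": p_both,
        "only_sp1": p1 - p_both,
        "only_sp2": p2 - p_both,
        "neither": 1.0 - p1 - p2 + p_both,
    }

# Both species have a 0.5 probability of occurrence in every scenario.
independent = joint_table(0.5, 0.5, 0.25)       # joint presence = 0.5 * 0.5
perfect_positive = joint_table(0.5, 0.5, 0.5)   # always together
perfect_negative = joint_table(0.5, 0.5, 0.0)   # never together

for name, table in [("independent", independent),
                    ("perfect positive", perfect_positive),
                    ("perfect negative", perfect_negative)]:
    print(name, table)
```

In all three cases the marginal probability of each species stays at 0·5; only the way that probability is distributed across the four combinations changes.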

Hierarchical generalized linear models provide a flexible way to include multiple species in a single SDM and incorporate uncertainties that are common in species distribution data (Gelfand et al. 2003). Multispecies models result in more precise estimates of model parameters for rare species because parameters can ‘borrow strength’ from those of common species (Ovaskainen & Soininen 2011; Pollock, Morris & Vesk 2012). Despite these potential benefits, hierarchical multispecies GLMs also usually ignore interactions between species, as they assume that each species’ response to the environment represents an independent draw from a common distribution of possible responses. In practice, however, interactions between species will induce unmodelled dependence in the residuals of such a model. These residual correlations violate a primary model assumption if not accounted for, but more importantly, can be used to gain insights into the relative roles of biotic and abiotic constraints on species co-occurrence patterns.

Here, we describe a joint species distribution model (JSDM) that introduces correlated occurrence into a hierarchical multivariate probit regression model. The statistical foundation of this general method was introduced over 15 years ago (Chib & Greenberg 1998) but has rarely been applied in the ecological literature. In fact, only four of 458 papers that have cited Chib and Greenberg's (1998) seminal work have dealt with ecological problems, and to the best of our knowledge, only two studies (Latimer et al. 2009; Clark et al. In press) have used a multivariate probit model to fit SDMs (but see Ovaskainen, Hottola & Siitonen (2010) and Sebastián-González et al. (2010) for a similar approach using multivariate logistic models).

In contrast to these earlier applications, we provide a general introduction to the use and interpretation of these models in ecology. We include a step-by-step tutorial on how to fit and assess multivariate probit models in a Bayesian framework and include code for running these types of models in R (R Core Team 2013; see Appendix S1). To illustrate our approach, we examine co-occurrence patterns in natural communities using case studies on frogs and trees in Victoria, Australia. We demonstrate how these models can provide insights into the underlying causes of similarities and dissimilarities in distributions among species.

Materials and methods

Model description

We model species co-occurrence using a multivariate probit regression model (Chib & Greenberg 1998). Probit regression is a generalized linear model similar to logistic regression (McCullagh & Nelder 1989). Probit regression relates a linear predictor, the standard regression equation used in generalized linear models, to probabilities with a standard normal cumulative distribution function or probit link. In contrast, a logistic regression uses a logit link function.

An alternative way of parameterizing a probit model is indirectly with a latent variable formulation, rather than using a probit link directly. Latent (or unobserved) variables are superficially similar to link functions as both are used to relate a continuous linear predictor to discrete binary response data. If we consider a site by species data set, Yij, species j is present at site i when a latent variable, Zij, is > 0 (and absent if less). Here, Zij is a normal random variate with mean Lij and a standard deviation of 1.

We represent this graphically for two hypothetical species (Fig. 1), with the probability of presence being the shaded area under the density function for values of Zij > 0. The mean of the normal distribution, Lij, is the analogue of the linear predictor in a standard probit regression. A large positive value of Lij implies a high probability of presence, while a large negative value implies a low probability of presence. For example, the probability of presence is 0·69 if Lij = 0·5 and is 0·16 if Lij = −1 (Fig. 1).
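The latent-variable formulation can be checked numerically with a minimal sketch, written here in Python for illustration (the models in this paper were fitted with JAGS through R). The function name `presence_probability` is hypothetical; it simply evaluates the area above zero under a normal density with mean Lij and standard deviation 1.

```python
from statistics import NormalDist

def presence_probability(L):
    """Probability that a latent variate Z ~ Normal(L, 1) exceeds zero."""
    return 1.0 - NormalDist(mu=L, sigma=1.0).cdf(0.0)

p_tree_frog = presence_probability(0.5)   # species 1 in Fig. 1
p_toad = presence_probability(-1.0)       # species 2 in Fig. 1

print(round(p_tree_frog, 2))  # 0.69
print(round(p_toad, 2))       # 0.16
```

Because the normal density is symmetric, this area equals the standard normal cumulative distribution function evaluated at Lij, which is the inverse of the probit link.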

Figure 1. Probit regression for the occurrence of two hypothetical species (j = 1, the tree frog, or j = 2, the toad) at a particular site i depicted using probability density functions of the latent normal variate Zij. The species would occur at the site when the latent random variable, which has a standard deviation of 1, is > 0. Thus, the mean of the latent variable (Li1 = 0·5, Li2 = −1·0) determines the probability of occurrence. The probability of occurrence equals the shaded area under the density function greater than zero (0·69 and 0·16). These representations of individual species ignore patterns of co-occurrence.

If the latent variable Zij is independent of the other latent variables in the model (i.e., there is independence among sites and species), then it is a standard probit regression. However, if the latent variables are correlated, indicating that species presences and absences are not independent, then a multivariate normal distribution must be used to model the values of Zij.

The number of dimensions of the multivariate normal distribution is the number of species being modelled. For example, correlation in a latent bivariate normal distribution influences the joint probabilities of presence and absence of two species, with the probability of joint presence or absence increasing with the correlation coefficient (Fig. 2). However, the probability of presence of each species, unconditional of the presence of the other, is unaffected by the correlation. In our hypothetical example, the probability of the presence of species 1 is 0·69 (Fig. 1) regardless of the correlation (summing the probabilities in the two right-hand quadrants in each panel of Fig. 2). Similarly, the probability of the presence of species 2 remains 0·16 as the correlation changes.
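The effect of the latent correlation on joint, but not marginal, probabilities can be illustrated by simulation. The sketch below is an assumption-laden illustration rather than the fitting code used here: it draws pairs of correlated standard normal errors, adds the means from the hypothetical example (0·5 and −1), and counts how often each latent variate exceeds zero. The function name `simulate_joint`, the sample size and the seed are all hypothetical choices.

```python
import random
from math import sqrt

def simulate_joint(L1, L2, rho, n=100_000, seed=1):
    """Monte Carlo estimate of joint presence/absence probabilities for two
    species whose latent variates are bivariate normal with means (L1, L2),
    unit standard deviations and correlation rho."""
    rng = random.Random(seed)
    counts = {"both": 0, "only_sp1": 0, "only_sp2": 0, "neither": 0}
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        # e2 has unit variance and correlation rho with e1
        e2 = rho * e1 + sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        present1 = (L1 + e1) > 0.0
        present2 = (L2 + e2) > 0.0
        if present1 and present2:
            counts["both"] += 1
        elif present1:
            counts["only_sp1"] += 1
        elif present2:
            counts["only_sp2"] += 1
        else:
            counts["neither"] += 1
    return {key: value / n for key, value in counts.items()}

# Same means as the hypothetical example; three latent correlations.
results = {rho: simulate_joint(0.5, -1.0, rho) for rho in (0.75, 0.0, -0.75)}

for rho, probs in results.items():
    marginal_sp1 = probs["both"] + probs["only_sp1"]
    marginal_sp2 = probs["both"] + probs["only_sp2"]
    print(rho, round(probs["both"], 3), round(marginal_sp1, 3), round(marginal_sp2, 3))
```

Up to simulation error, the marginal probabilities stay near 0·69 and 0·16 for every correlation, while the probability of joint presence rises with the correlation coefficient.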

Figure 2. Co-occurrence patterns of the two species from Fig. 1 modelled using a bivariate normal distribution represented as contour plots of probability density, with correlation 0·75 (a), 0·0 (b) and –0·75 (c). The numbers on the contours (the concentric ellipses) are the probability densities that encompass 0·1, 0·3, 0·5, 0·7 and 0·9 of the volume under the bivariate normal distribution. Each species occurs at the site when the corresponding random variate is greater than 0. Thus, species 1 (the tree frog) occurs when Zi1 is greater than zero (the right-hand quadrants), and species 2 (the toad) occurs when Zi2 is greater than zero (the upper quadrants). The joint probabilities of occurrence are indicated by the values in the corners. In all cases shown, the probability of occurrence of species 1 is 0·69 (the sum of the probabilities in the right-hand quadrants) because the mean of Zi1 (Li1) remains 0·5, as in Fig. 1. Similarly, the probability of occurrence of species 2 remains 0·16 because the mean of Zi2 (Li2) remains −1. The correlation changes the probabilities of co-occurrence, but not the unconditional probabilities of occurrence for each species.

In the Chib and Greenberg (1998) model, the probability of presence changes when the location of the bivariate normal changes, while the correlations defining the multivariate normal can stay the same. For example, if the mean of the bivariate normal changes from (0·5, −1), as in Fig. 2, to (0·5, −0·5), as in Fig. 3, the probability of presence of species 2 increases to 0·31 (Li2 changes from −1 to −0·5), but the probability of presence of species 1 remains 0·69 (Li1 remains 0·5). Thus, associations among species are modelled by changing the correlations of the latent multivariate normal distribution, while the (joint) probabilities of presence are modelled by changing the locations of the distribution.

Figure 3. An equivalent representation of co-occurrence patterns of the two species in Fig. 2c, but with a higher probability of occurrence of species 2 (the toad, at 0·31) because the mean of Zi2 (Li2) has increased from −1 (in Fig. 2) to −0·5. The probabilities of co-occurrence of the species have also changed as a result, while the probability of occurrence of species 1 is the same (0·69) as the mean of Zi1 (Li1) is unchanged.

While illustrated schematically here using two species and a bivariate normal distribution, the approach to modelling correlated occurrences extends to any number of J species by using a J-dimensional multivariate normal distribution (Chib & Greenberg 1998). The relationship between correlated normal distributions and correlated Bernoulli events has been used previously to simulate correlated fire events (McCarthy & Lindenmayer 1998, 2000). Here, we use it as a basis to estimate correlations in the occurrence of species.

Model details

We fit a multivariate model where the probability of occurrence is the probability that a latent variable exceeds a threshold (eqn 1). The response is species occurrence, represented by the matrix Y with dimensions n sites by J species with elements Yij. If the jth species is found at the ith site, then Yij is one (or zero if absent). The response is predicted by a data matrix (X) that has dimensions n sites by K predictors. All elements of the first column vector of X are ones, which accounts for the model intercept terms, and the remaining column vectors are K−1 environmental variables centred on zero and scaled by their standard deviations.

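  \[ \Pr(Y_{ij} = 1) = \Pr(Z_{ij} > 0), \qquad Z_i \sim \mathrm{MVN}\!\left(X_i B^{*\top}, \Sigma\right), \qquad B^{*}_{jk} \sim \mathrm{N}\!\left(\mu_k, \sigma_k^{2}\right) \]   (eqn 1)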

The probability that the jth species is present at the ith site equals the probability that the equivalent element of a latent variable matrix, Zij, is greater than zero, that is Zij > 0. The row vectors of the latent variable matrix, Zi, follow J-dimensional multivariate normal distributions. Each multivariate normal distribution has the same variance/covariance matrix, ∑. The mean vector of each multivariate normal distribution (equivalent to Lij above) is the product of the corresponding row vector of the predictor data matrix, Xi, and the transpose of an unscaled J by K coefficient matrix B*. The first column vector of B* contains the unscaled species intercept terms, and each of the remaining K−1 columns is the vector of unscaled regression coefficients for one environmental variable. The elements of the coefficient matrix, Bjk*, are modelled hierarchically by drawing them from normal distributions common to the kth column, with mean μk and standard deviation σk.
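To make the generative structure concrete, the following sketch simulates presence/absence data from a latent multivariate normal, given fixed values of B* and ∑. It is not the JAGS model from Appendix S1 and it omits the hierarchical prior on B*; the helper names `cholesky` and `simulate_occurrence` and all numerical values are hypothetical. Correlated errors are obtained by multiplying independent standard normal draws by a Cholesky factor of ∑.

```python
import random
from math import sqrt

def cholesky(S):
    """Lower-triangular C with C C^T = S, for a symmetric positive definite S."""
    n = len(S)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(C[i][k] * C[j][k] for k in range(j))
            if i == j:
                C[i][j] = sqrt(S[i][i] - s)
            else:
                C[i][j] = (S[i][j] - s) / C[j][j]
    return C

def simulate_occurrence(X, B_star, Sigma, seed=0):
    """Draw Y (sites by species) where Y_ij = 1 if Z_ij > 0 and
    Z_i ~ MVN(X_i B*^T, Sigma). X is n by K, B_star is J by K."""
    rng = random.Random(seed)
    C = cholesky(Sigma)
    J = len(B_star)
    Y = []
    for x in X:
        mean = [sum(xk * bk for xk, bk in zip(x, B_star[j])) for j in range(J)]
        e = [rng.gauss(0.0, 1.0) for _ in range(J)]
        z = [mean[j] + sum(C[j][k] * e[k] for k in range(j + 1)) for j in range(J)]
        Y.append([1 if zj > 0.0 else 0 for zj in z])
    return Y

# Hypothetical inputs: three sites, an intercept plus one scaled covariate,
# two species with a latent correlation of 0.5.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]
B_star = [[0.5, 1.0], [-1.0, 0.5]]
Sigma = [[1.0, 0.5], [0.5, 1.0]]
Y = simulate_occurrence(X, B_star, Sigma)
print(Y)
```

Fitting the model reverses this process: the observed Y and X are used to infer B*, ∑ and the hyperparameters μk and σk.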

Our motivation for using a hierarchical approach to estimate the regression coefficients is both ecologically and computationally driven. Having a hierarchical structure to estimate environmental responses of individual species has previously been demonstrated to have desirable properties for fitting multispecies distribution models (Latimer et al. 2009; Pollock, Morris & Vesk 2012). But in this case there are also advantages of the hierarchical estimation technique that flow on to the correlated occurrence component of the JSDM (see section Model fitting).

A multivariate normal distribution is defined by a variance/covariance matrix, ∑, which governs the correlations among variates. Because this approach is based on probit regression, all standard deviations are equal to 1, by definition. In this case, the variance/covariance matrix is a correlation matrix. Specifying a prior for the correlation matrix is not straightforward because elements of correlation matrices are related to each other. The inverse Wishart distribution has the necessary constraints for a variance/covariance matrix (it is positive definite), but this does not constrain the standard deviations to be one. To ensure that the variance/covariance matrix ∑ conforms to a correlation matrix, the covariance terms must be divided by the corresponding standard deviations (this is the definition of a correlation coefficient).

As Chib and Greenberg (1998) show, this rescaling of the variance/covariance matrix so that it becomes a correlation matrix also requires a rescaling of the coefficients B* so that they can be interpreted as regular probit regression coefficients. Thus, the scaled probit regression coefficients, B, are calculated by dividing B* by the standard deviations of the variance/covariance matrix, which are the square roots of the diagonal elements (Σjj). These scaled regression coefficients Bjk correspond to the regression coefficients of probit regression for the response of species j to environmental variable k. Thus, the probit of the probability of occupancy of species j at site i is:

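  \[ \operatorname{probit}\{\Pr(Y_{ij} = 1)\} = X_i B_j^{\top}, \qquad B_{jk} = \frac{B^{*}_{jk}}{\sqrt{\Sigma_{jj}}} \]   (eqn 2)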
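The rescaling of the coefficients can be sketched as follows. This is an illustrative sketch with hypothetical names (`scale_coefficients`, `occupancy_probability`) and made-up numbers, assuming B* is stored as a J by K list of rows and ∑ as a J by J list of rows.

```python
from math import sqrt
from statistics import NormalDist

def scale_coefficients(B_star, Sigma):
    """Divide each species' row of B* by that species' latent standard
    deviation, the square root of the matching diagonal element of Sigma."""
    return [[b / sqrt(Sigma[j][j]) for b in row] for j, row in enumerate(B_star)]

def occupancy_probability(x_i, B_j):
    """Inverse probit of the linear predictor X_i B_j^T."""
    linear_predictor = sum(x * b for x, b in zip(x_i, B_j))
    return NormalDist().cdf(linear_predictor)

# Hypothetical unscaled coefficients and unscaled variance/covariance matrix.
B_star = [[1.0, -2.0], [0.6, 0.3]]
Sigma = [[4.0, 1.0], [1.0, 9.0]]
B = scale_coefficients(B_star, Sigma)
print(B)                                       # [[0.5, -1.0], [0.2, 0.1]]
print(round(occupancy_probability([1.0, 0.0], B[0]), 2))  # 0.69
```

Only the scaled coefficients B are comparable with ordinary probit regression coefficients, because only they refer to a latent variable with a standard deviation of 1.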

We can use the output of the model to decompose species correlations into the following: (i) residual correlation and (ii) correlation due to similar environmental responses, which may be used to generate hypotheses about mechanisms that explain why species occur together (or not). For example, strong correlations due to the environment may suggest habitat filtering. Strong residual correlations may hint at a biological interaction between species (e.g. facilitation or competition). Residual correlation may also indicate the need for additional explanatory variables.

Correlation parameters
  \[ P_{jj'} = \frac{\Sigma_{jj'}}{\sqrt{\Sigma_{jj}\,\Sigma_{j'j'}}} \]   (eqn 3)

A correlation matrix P can be calculated by rescaling the variance/covariance matrix. To calculate the correlation in the latent distribution between species j and species j′, we divide their covariance by the product of their standard deviations (eqn 3).
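As a small worked sketch of this rescaling (the function name `covariance_to_correlation` and the example matrix are hypothetical), each covariance is divided by the product of the two corresponding standard deviations:

```python
from math import sqrt

def covariance_to_correlation(Sigma):
    """Rescale a variance/covariance matrix to a correlation matrix."""
    sd = [sqrt(Sigma[j][j]) for j in range(len(Sigma))]
    return [[Sigma[j][k] / (sd[j] * sd[k]) for k in range(len(Sigma))]
            for j in range(len(Sigma))]

Sigma = [[4.0, 3.0], [3.0, 9.0]]   # standard deviations 2 and 3
P = covariance_to_correlation(Sigma)
print(P)  # [[1.0, 0.5], [0.5, 1.0]]
```

In a Bayesian fit, this calculation would be applied to each posterior sample of ∑ so that the uncertainty in each residual correlation can be summarised.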

Correlation due to environment
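  \[ \mathbb{P}_{jj'} = \frac{\sum_{k}\sum_{k'} B_{jk} B_{j'k'} \operatorname{Cov}(X_k, X_{k'})}{\sqrt{\left[\sum_{k}\sum_{k'} B_{jk} B_{jk'} \operatorname{Cov}(X_k, X_{k'})\right]\left[\sum_{k}\sum_{k'} B_{j'k} B_{j'k'} \operatorname{Cov}(X_k, X_{k'})\right]}} \]   (eqn 4)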

We can also calculate a second correlation matrix, ℙjj′, that accounts for the component of between-species correlation due to their shared environmental responses. Equation 4 shows that the environmental correlation between species j and j′ is a function of those species’ scaled regression coefficient vectors Bjk and Bj′k and the covariances of the k environmental variables, assuming environmental data have been centred and scaled appropriately as above.
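One way to read this calculation is as the correlation, across sites, between the linear predictors of two species. The sketch below takes that reading as an assumption: it treats the covariance between the linear predictors of species j and j′ as the coefficient vector of j, times the covariance matrix of the environmental variables, times the coefficient vector of j′. Intercepts are excluded because a constant column has zero covariance with every variable. The function name and inputs are hypothetical.

```python
from math import sqrt

def environmental_correlation(B, cov_X):
    """Correlation between species' linear predictors induced by the environment.

    B     : J by (K-1) scaled regression coefficients, intercepts excluded
    cov_X : (K-1) by (K-1) covariance matrix of the environmental variables
    A species with all-zero coefficients has zero variance and would cause a
    division by zero; that case is not handled in this sketch.
    """
    def quad(b_j, b_l):
        return sum(b_j[k] * b_l[m] * cov_X[k][m]
                   for k in range(len(b_j)) for m in range(len(b_l)))

    J = len(B)
    return [[quad(B[j], B[l]) / sqrt(quad(B[j], B[j]) * quad(B[l], B[l]))
             for l in range(J)] for j in range(J)]

# Hypothetical example: two uncorrelated, unit-variance environmental variables.
cov_X = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0],    # species 0 responds positively to variable 1
     [2.0, 0.0],    # species 1 responds in the same direction
     [-1.0, 0.0],   # species 2 responds in the opposite direction
     [0.0, 1.0]]    # species 3 responds only to variable 2
env_corr = environmental_correlation(B, cov_X)
print(env_corr[0][1], env_corr[0][2], env_corr[0][3])  # 1.0 -1.0 0.0
```

Species that respond in the same direction to the same variables have an environmental correlation near 1, species with opposite responses have a correlation near −1, and species responding to unrelated variables have a correlation near 0.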

We include a tutorial and R code for the models described above in Appendix S1. The R package ‘BayesComm’ is also available to run a nonhierarchical version of the model. This package returns residual correlations between species (Pjj′), but the current version does not calculate correlations due to shared environmental responses (ℙjj′). ‘BayesComm’ is available at http://cRAN.R-project.org/web/packages/BayesComm/index.html (Golding 2013a,b).

Model fitting

All models were fit with the Markov Chain Monte Carlo Bayesian modelling software JAGS v3.4.0 run through R v3.0.2 via the package R2jags v0.03-11 (R Core Team 2013; Plummer 2014). For both case studies, we ran three chains for 1 000 000 iterations, with the first 15 000 discarded as burn-in. The remaining samples were thinned by a factor of 1000, so we retained 985 samples per chain for postprocessing.
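The number of retained samples follows directly from these settings, as this short arithmetic check shows (the variable names are illustrative only):

```python
n_iterations = 1_000_000
burn_in = 15_000
thinning = 1000
n_chains = 3

# Samples kept per chain after discarding burn-in and thinning.
retained_per_chain = (n_iterations - burn_in) // thinning
retained_total = retained_per_chain * n_chains

print(retained_per_chain)  # 985
print(retained_total)      # 2955
```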

We used vague priors for all model parameters in both case studies. We used vague normal priors (mean = 0, SD = 100) for the elements of μk and uniform priors in the interval 0–100 for the standard deviations, σk. For the variance/covariance matrix, we used an inverse Wishart prior with J + 1 degrees of freedom and a J by J identity matrix as the scale matrix. Using these parameters for the inverse Wishart distribution implies a uniform prior on the off-diagonal elements of P, the correlation coefficients (Gelman & Hill 2007). We found that without regularizing the unscaled matrix B* by applying the hyperprior to the column vectors, the model would not converge without a more informative prior on ∑. However, using the hyperparameters, μk and σk, allows minimal prior information to be applied to correlation coefficients by setting the degrees of freedom parameter at J + 1. We considered model runs to have converged when, after the burn-in, all elements of the parameter matrix B and the off-diagonal elements of P had potential scale reduction factor values of <1·1.
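For readers unfamiliar with the potential scale reduction factor, the sketch below computes its basic form for one parameter from several chains of equal length. This is an assumption about the general technique, not a reproduction of what R2jags reports; software implementations may apply additional corrections, so values can differ slightly. The function name and the toy chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for one parameter.

    chains : list of equal-length lists of posterior samples, one per chain
    """
    n = len(chains[0])
    within = mean([variance(chain) for chain in chains])          # W
    between = n * variance([mean(chain) for chain in chains])     # B
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)

# Toy chains: the first set agree with each other, the second set do not.
agreeing = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 1, 0]]
disagreeing = [[0, 1, 0, 1], [10, 11, 10, 11]]

print(potential_scale_reduction(agreeing) < 1.1)     # True
print(potential_scale_reduction(disagreeing) < 1.1)  # False
```

Values close to 1 indicate that the variance between chains is small relative to the variance within chains, which is the sense in which the threshold of 1·1 is used above.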

Comparison to co-occurrence indices

We plotted environmental and residual correlations from our model against values calculated from two co-occurrence indices commonly used in community ecology: Schoener's Index and a modified version of Dice's Index (Hardy 2008) using ‘species.dist’ in the picante package (v. 1.6-1) in R (Kembel et al. 2010). Co-occurrence indices are often used to infer ecological processes such as potential species interactions, but unlike JSDMs, these indices are not capable of disentangling the influences of shared environmental responses and residual correlations on co-occurrence. Comparing the output of JSDMs to typical co-occurrence indices therefore provides an assessment of how well co-occurrence indices capture these two processes.
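To show the kind of quantity a co-occurrence index summarises, the sketch below computes one common formulation of Schoener's index from presence/absence vectors. The formula used here, one minus half the summed absolute differences in each species' proportional occurrence across sites, is an assumption about the general index; the ‘species.dist’ implementation in picante may differ in detail. The function name and data are hypothetical.

```python
def schoener_index(occ_a, occ_b):
    """Schoener-type co-occurrence index for two species.

    occ_a, occ_b : presence (1) / absence (0) of each species across the same
    sites; each species must be present at one site or more.
    Returns 1 for identical distributions and 0 for non-overlapping ones.
    """
    total_a, total_b = sum(occ_a), sum(occ_b)
    return 1.0 - 0.5 * sum(abs(a / total_a - b / total_b)
                           for a, b in zip(occ_a, occ_b))

sp1 = [1, 1, 0, 0]
sp2 = [1, 1, 0, 0]   # same sites as sp1
sp3 = [0, 0, 1, 1]   # never found with sp1
sp4 = [1, 0, 1, 0]   # partial overlap with sp1

print(schoener_index(sp1, sp2))  # 1.0
print(schoener_index(sp1, sp3))  # 0.0
print(schoener_index(sp1, sp4))  # 0.5
```

The index uses only the occurrence vectors, so it cannot tell whether two species overlap because they share environmental responses or for some other reason.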

Case study 1: frog communities in greater Melbourne

Our first case study uses data on the occurrence of seven frog species (see Table S1 for a species list) at 104 lentic ponds in parks and gardens around Greater Melbourne, Victoria, Australia. At each site, nocturnal visual searches and acoustic monitoring were conducted three times over two breeding seasons. Anuran assemblages in the study area are strongly influenced by pond size, road cover and the presence of vertical walls surrounding ponds (Parris 2006). We, therefore, used these three environmental variables in our analyses. Pond surface areas were measured in the field or from aerial photographs. Road cover was quantified by calculating the proportion of a 500 m radius surrounding each pond that was covered by sealed roads. The presence or absence of a vertical wall at each pond was determined during field surveys. For further details, see Parris (2006).

Case study 2: eucalypt communities in the Grampians National Park

The Eucalyptus data set includes 12 taxa (see Table S2 for a taxon list) recorded in 458 plots spanning elevation gradients in the Grampians National Park, Victoria, which is known for high species diversity and endemism. The park has three mountain ranges interspersed with alluvial valleys and sand sheet and has a semi-Mediterranean climate with warm, dry summers and cool, wet winters. Plots were based on a nearest-neighbour-sampling approach intended to be at a spatial scale in which species interact. Species and ecological traits are tied to environmental gradients, especially soil type and geology (Enright, Miller & Crawford 1994; Pollock, Morris & Vesk 2012). Here, we use six environmental variables previously found to be important to the focal species. Rock cover, soil sand and loam content were quantified in field plots. Valley bottom flatness identifies areas with poor water drainage that accumulate sediment and was derived from a digital elevation model (Gallant & Dowling 2003). Annual precipitation and temperature variability were estimated using BIOCLIM (Houlder et al. 2000). For a further description of the site and environmental variables, see Pollock, Morris & Vesk (2012).

Results

Our analyses demonstrate the value of partitioning the effects of the environment (ℙjj′) from residual interactions between species (Pjj′). This partitioning revealed contrasting patterns of co-occurrence in the frog and eucalypt case studies (Fig. 4). Specifically, frog species tended to respond similarly to environmental conditions and have positive residual correlations in co-occurrence (Fig. 5), whereas eucalypt species had much more variable covariance patterns, with numerous cases of both negative and positive correlations in environmental and residual co-occurrence (Fig. 5). In both case studies, environmental correlations tended to be stronger than residual correlations (Figs 4 and 5).

Figure 4. Network diagrams representing modelled environmental correlation (ℙjj′) (left panels) and residual correlation (Pjj′) (right panels) between species of frogs (top) and eucalypts (bottom). Black lines are positive correlations between species and grey lines are negative correlations. Line thickness represents correlation strength. Only correlations in which the credible intervals do not cross zero are shown. See Tables S1 and S2 for full species names.

Figure 5. Modelled residual correlation (Pjj′) and environmental correlation (ℙjj′) between species pairs for eucalypts and frogs. Error bars represent 95% credible intervals. Black circles are eucalypt species pairs that interbreed. The open circle represents the pair E. baxteri–E. goniocalyx, species from different subgenera.

Many eucalypt species rarely or never co-occur simply because they occupy distinct habitats (negatively correlated estimates of ℙjj′ in Fig. 5). The more interesting cases are those species that occupy similar environments, yet co-occur more or less than expected. For example, the two species with a particularly high positive residual co-occurrence are from different subgenera, whereas the species with similar environmental responses but negative residual co-occurrence are closely related and able to interbreed (Fig. 5).

Comparison of our results to typical co-occurrence indices

Co-occurrence tends to be positively correlated with environmental correlation in the case of the Schoener Index (Fig. 6) and a modified Dice Index (Fig. S1), although the relationships are generally weak and nonlinear. Co-occurrence has no clear relationship with residual correlation (Figs 6, S1), indicating the JSDM captures complex interactions that the simple co-occurrence metrics do not.

Figure 6. Median (±95% credible intervals) residual correlations (Pjj′) and environmental correlations (ℙjj′) compared with a co-occurrence index (Schoener Index; see methods) for frogs (left panels) and eucalypts (right panels).

Discussion

The importance of including residual correlations in SDMs

Species distribution models (SDMs) are widely used to address issues in ecology, evolution and conservation, but current approaches to fitting SDMs make a range of limiting assumptions (Davis et al. 1998; Guisan & Thuiller 2005). Here, we describe an approach that can help overcome two of the most important of these assumptions – that all relevant environmental covariates are included and that species distributions are independent of interactions with other species. However, like any correlative method that attempts to partition environmental and residual effects, our approach cannot fully disentangle which of these assumptions is being violated. Residual correlations may be due to missing environmental covariates, or ecological (e.g. facilitation) or evolutionary mechanisms (e.g. allopatric speciation). Nevertheless, examining residual co-occurrence patterns in light of the natural history of the species involved may highlight important environmental variables that are missing from a model or may point to cases where further research would provide additional insights into biotic interactions.

Clark et al. (In press) recently demonstrated that JSDMs reduced inflated estimates of community abundance obtained from aggregating independent SDMs, resulting in more realistic predictions of forest response to climate change. In addition to potentially improving estimates of the responses of species to climate change, this modelling approach may help determine whether SDMs should be used to interpolate or extrapolate species’ distributions across geographic or environmental space more generally. Most SDMs assume that species distributions are in equilibrium with current environmental conditions. However, when there are strong residual correlations between species, projections of species’ distributions necessarily assume that interactions will remain constant, which is a questionable assumption given that species will encounter novel biotic communities in different environments. Similarly, if residual correlations are due to missing environmental covariates, then projections of species’ distributions might also be suspect.

Most current approaches to incorporating biotic interactions in SDMs involve adding the occurrence or abundance of other species as predictors (Araujo & Luoto 2007; Heikkinen et al. 2007; Meier et al. 2010). However, adding species as predictors assumes unidirectional interactions and induces multicollinearity within a model when the distribution of a predictor species is governed by similar abiotic variables (Kissling et al. 2012). In contrast, the approach presented here directly estimates reciprocal interactions. Incorporating interactions in the residuals, rather than in the mean response, avoids issues of multicollinearity. However, JSDMs do not explicitly model species interactions. A more direct approach to understanding strong unidirectional interactions (e.g. commensalism) would be to relate one species’ population dynamics or performance directly to the occurrence of another species via the model mean.

Our analyses demonstrate limitations of using a co-occurrence index to identify potential ecological processes in community ecology studies. There are positive relationships (though subject to uncertainty) between co-occurrence and environmental correlations for frogs and eucalypts (Figs 6, S1), which is expected given co-occurring species share environmental responses. However, there are no strong relationships between the co-occurrence indices and residual correlations estimated from JSDMs (Figs 6, S1). JSDMs go a step beyond co-occurrence indices because they are a model-based approach that decomposes co-occurrence into environmental and residual components. Residual correlation does not necessarily indicate a species interaction, but strong residual correlation between species presents a case for further investigation.
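To make concrete what a co-occurrence index summarises, the following minimal sketch computes the standard Dice index from two presence/absence vectors. It is not the modified Dice's index used in the analyses (described in Appendix S2), and the site records and the function name `dice_index` are hypothetical. The index uses only the raw overlap in occurrences, so it cannot separate shared environmental responses from residual association.

```python
def dice_index(occ_a, occ_b):
    """Standard Dice co-occurrence index for two presence/absence vectors.

    Computed as 2 * (sites with both species) / (sites with A + sites with B).
    """
    if len(occ_a) != len(occ_b):
        raise ValueError("occurrence vectors must cover the same sites")
    both = sum(1 for a, b in zip(occ_a, occ_b) if a and b)
    total = sum(1 for a in occ_a if a) + sum(1 for b in occ_b if b)
    # The index is undefined when neither species is ever present;
    # returning 0.0 in that case is a convention chosen for this sketch.
    return 2.0 * both / total if total else 0.0


# Hypothetical presence/absence records at eight sites (1 = present).
species_a = [1, 1, 0, 1, 0, 0, 1, 0]
species_b = [1, 0, 0, 1, 1, 0, 1, 0]

print(dice_index(species_a, species_b))
```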

Identifying potential biotic interactions from model outputs

An important feature of the JSDM presented here is that it can partition out the contribution of environmental variables to co-occurrence. The environmental effect itself is important because it highlights the potential role of habitat filtering in community assembly. Conversely, the residual correlation beyond the environmental effect may indicate that other ecological or evolutionary processes are important, though correlated responses to unmeasured covariates cannot be excluded. Previous studies have also examined residual correlations between species after accounting for the effects of environmental covariates using similar models (Latimer et al. 2009; Ovaskainen, Hottola & Siitonen 2010; Sebastián-González et al. 2010; Clark et al. In press), but to the best of our knowledge, no previous studies have explicitly quantified the contribution of shared environmental responses and residual co-occurrence. Our model identified sets of frog and eucalypt species that occurred together more and less than expected given shared responses to the environmental variables we considered. Below, we discuss several potential reasons for this pattern with respect to previous studies and the ecologies of these communities.
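The partitioning can be illustrated with a minimal sketch. It assumes that the environmental correlation between two species is the correlation of their linear predictors across sites, and that the residual correlation is the scaled off-diagonal element of a residual covariance matrix. All covariate values, coefficients and covariances below are hypothetical, and the function names are illustrative rather than taken from the supplied R code.

```python
import math


def correlation_from_covariance(cov):
    """Scale a covariance matrix (list of lists) to a correlation matrix."""
    sd = [math.sqrt(cov[j][j]) for j in range(len(cov))]
    return [[cov[j][k] / (sd[j] * sd[k]) for k in range(len(cov))]
            for j in range(len(cov))]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def linear_predictor(X, beta):
    """Site-level linear predictor (covariates times coefficients) for one species."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Hypothetical standardised covariates at five sites (two variables).
X = [[-1.0, 0.5], [-0.5, -1.0], [0.0, 0.0], [0.5, 1.0], [1.0, -0.5]]
# Hypothetical coefficient estimates for two species.
beta_1 = [1.0, 0.5]
beta_2 = [0.8, -0.2]
# Hypothetical residual covariance matrix for the same two species.
sigma = [[1.0, -0.3], [-0.3, 1.0]]

# Environmental component: do the species respond similarly to the covariates?
env_corr = pearson(linear_predictor(X, beta_1), linear_predictor(X, beta_2))
# Residual component: association left after accounting for the covariates.
res_corr = correlation_from_covariance(sigma)[0][1]

print(f"environmental correlation: {env_corr:.2f}")
print(f"residual correlation: {res_corr:.2f}")
```

In this hypothetical example the two species have similar environmental responses but a negative residual correlation, the same qualitative pattern reported for the reproductively compatible eucalypt pairs.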

Our analysis of frog co-occurrence patterns revealed that all species generally responded similarly to pond area, road density and the presence of a pond barrier, and the magnitude and direction of these effects are consistent with earlier studies of species richness in the study area (Parris 2006), as well as findings from a wide range of studies on the occurrence of pond-breeding amphibians (Chardon 1998; Popescu & Gibbs 2009; Heard et al. 2013). However, many of the residual correlations between species were positive, suggesting that species co-occurred more than expected given their shared responses to environmental variables. Facilitative interactions between frogs seem unlikely, but positive residual correlations between frog species could have been due to a shared response to an abiotic variable that was not considered in our model, such as the presence of fish (Hamer & Parris 2013).

In contrast to the frog case study, analyses of eucalypt communities revealed considerable variability in shared environmental responses and residual correlations among species. Two eucalypt species that co-occur more than expected in neighbourhood plots given their responses to environmental variables are Eucalyptus arenacea and E. goniocalyx, which are from different subgenera (Fig. 5). Co-dominance of eucalypts from different subgenera is a common pattern known as Pryor's rule (Pryor 1953). A possible explanation for this pattern is that species from different subgenera are able to differentiate resource use, thereby alleviating competition (Austin, Cunningham & Wood 1983). Another potential explanation is that species from different subgenera are not able to interbreed (Ellis, Sedgley & Gardner 1991). If species are able to interbreed and one species has a reproductive advantage (e.g. if selection favouring one species leads to more pollen output from that species), then, with continued back-crossing, all individuals begin to resemble the favoured species (Levin 2006). Simulations suggest interbreeding species usually form unstable assemblages because one species tends to gain a reproductive advantage over the other (Currat et al. 2008).

Interbreeding (i.e. hybridization) has influenced the evolution of eucalypts and may be a mechanism for dispersal (Potts & Reid 1988). In our study, the species pairs that are reproductively compatible (black dots in Fig. 5) occupy similar environments but have negatively correlated residuals (bottom right quadrant of Fig. 5). In other words, these species co-occur less frequently than we would expect given their similar habitat preferences.

Further refinements and applications of the model

One advantage of the hierarchical modelling framework used here is that it can be easily modified to account for additional complexities and uncertainties in the data. Additional correlations between species (e.g. functional similarity or phylogenetic relatedness) could be incorporated into the model. For instance, similarity in specific leaf area (SLA) between eucalypt species increases with shared environmental space, yet slightly decreases with increasing residual correlation (Fig. 7). Specific leaf area is functionally related to species occurrences along the environmental gradients studied here, explaining the positive correlation between SLA similarity and shared environmental space (Pollock, Morris & Vesk 2012). Other species traits may be related to residual co-occurrence. For example, flowering time might be related to residual correlation if temporal niche partitioning is important. In these cases, incorporating functional traits as predictors of environmental responses in the model (Pollock, Morris & Vesk 2012) may reveal additional ecological insights.

Figure 7. Relationship between residual correlation (Pjj′) and environmental correlation (ℙjj′) and similarity in ln-transformed mean specific leaf area (SLA) between pairs of eucalypt species. Black circles are eucalypt species pairs that interbreed. The open circle represents the pair E. arenacea and E. goniocalyx, species from different subgenera.

Our model could also be extended to incorporate imperfect detection probabilities (MacKenzie et al. 2002) or spatial random effects (Parris 2006). In cases where data are available for more than one time period, our approach could also be used to analyse correlates of co-occurrence in community time series, where the effects of biotic interactions may be more readily identified (Mutshinda, O'Hara & Woiwod 2011; Kissling et al. 2012). For example, Sebastián-González et al. (2010) used a similar approach to analyse the effects of heterospecific attraction on the temporal dynamics of seven water bird species. Our model thus offers a flexible approach for examining a wide range of questions in theoretical and applied ecology, and could be adapted to suit a variety of applications.
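As a brief illustration of why imperfect detection matters, the sketch below uses the standard single-season occupancy relationship, in which a species occupying a site with probability psi and detected on each visit with probability p is recorded at least once in K visits with probability psi * (1 - (1 - p)^K). The parameter values and function name are hypothetical, and this is not part of the JSDM fitted here.

```python
def prob_detected_at_least_once(psi, p, visits):
    """Probability that a species is recorded at a site on at least one visit.

    psi    : probability that the site is occupied
    p      : per-visit detection probability, given occupancy
    visits : number of independent survey visits
    """
    if not (0.0 <= psi <= 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("psi and p must be probabilities")
    if visits < 0:
        raise ValueError("visits must be non-negative")
    return psi * (1.0 - (1.0 - p) ** visits)


# Hypothetical values: true occupancy 0.6, detection 0.5 per visit.
for k in (1, 3, 6):
    observed = prob_detected_at_least_once(0.6, 0.5, k)
    print(f"{k} visit(s): naive occupancy estimate approx. {observed:.3f}")
```

With few visits, the proportion of sites where a species is recorded understates true occupancy, which would also bias co-occurrence patterns computed from the raw records.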

Acknowledgements

This study was much improved by discussions with Dave Harris and by suggestions from Otso Ovaskainen, Jane Elith and two anonymous reviewers. This research was supported by the Australian Research Council Centre of Excellence for Environmental Decisions and the National Environmental Research Program Environmental Decisions Hub.

Supporting Information

mee312180-sup-0001-SupplementalMaterial.docx (Word document, 54K)

Fig. S1. Median (±95% credible intervals) residual correlations (Pjj’) and environmental correlations (ℙjj’) compared to a modified Dice's co-occurrence index for frogs (left panels) and eucalypts (right panels).

Table S1. List of frog species included in case study 1.

Table S2. List of eucalypt species included in case study 2.

mee312180-sup-0002-AppendixS1.pdf (application/PDF, 182K): Appendix S1. Tutorial for fitting a Joint Species Distribution Model (JSDM) in R.
mee312180-sup-0003-AppendixS2.doc (Word document, 27K): Appendix S2. Descriptions of Schoener's and Dice's co-occurrence indices.
mee312180-sup-0004-Rcode.R (text/r, 7K)
