Scale dependence of ecological assembly rules: Insights from empirical datasets and joint species distribution modelling

Environmental filtering and biotic interactions are known to jointly rule the assembly of species into local communities from a regional species pool defined by historical contingencies and dispersal (Götzenberger et al., 2012; Lortie et al., 2004; Vellend, 2010). These ecological assembly rules are hypothesized to function hierarchically at different spatial scales (Hart, Usinowicz, & Levine, 2017; McGill, 2010; Pearson & Dawson, 2003; Schneider, 2001; Thuiller, Pollock, Gueguen, & Münkemüller, 2015; Wiens, 1989). Species with abilities to access the site and presenting ecological requirements that match the specific local abiotic conditions (i.e. environmental filtering) are candidates Received: 31 July 2019 | Accepted: 3 May 2020 DOI: 10.1111/1365-2745.13434

Recently, Araújo and Rozenfeld (2014)  are co-occurring due to facilitation they appear together at all scales of observations (Araújo & Luoto, 2007). Negative interactions leading to local extinctions, in contrast, might not be apparent at coarser scales where increase in space allows the coexistence of competitive species without too close proximity (Conti, de Bello, Lepš, Acosta, & Carboni, 2017;Godsoe et al., 2015). However, empirical studies conducted so far found little support for these expectations (e.g. Belmaker et al., 2015).
The traditional methodology to infer species associations deviating from random patterns (proxy for biotic interactions) consists in analysing species co-occurrence matrices using different types of null model approaches (Gotelli, 2000;Gotelli & McCabe, 2002;Götzenberger et al., 2012). While these approaches make it possible to study pairwise interactions at the community level, they make it difficult to discern if non-random species associations result from biotic interactions, species' environmental preferences or dispersal processes (Chalmandrier et al., 2013;D'Amen, Mod, Gotelli, & Guisan, 2018). For example, segregated patterns could be caused by competition (Gutiérrez, Boria, & Anderson, 2014), environmental filtering (Firth & Crowe, 2010) or dispersal limitations (García-Valdés, Gotelli, Zavala, Purves, & Araújo, 2015). To overcome this challenge, one can use methods that account for environmental conditions and/or dispersal in driving species co-occurrences (e.g. constrained modelling or ordination analyses; D' Amen et al., 2018;Peres-Neto, Olden, & Jackson, 2001). A recent implementation of such approach is joint species distribution modelling (JSDM), which is based on a hierarchical model composed of two stages Wilkinson, Golding, Guillera-Arroita, Tingley, & McCarthy, 2019). The first stage fits a GLM to multiple species simultaneously to account for the effects of environmental factors. In the second stage, the residuals of the fitted models are used to infer species associations not explained by the considered environmental variables. Thus, JSDMs can be used to disentangle the degree with which non-random associations between species might be driven by environmental filtering (species pairs with correlated environmental responses) and biotic interactions (species pairs with correlated model residuals ;Pollock et al., 2014;Zurell, Pollock, & Thuiller, 2018). While JSDMs have been used in this context (e.g. D' Amen et al., 2018), it is important to note that JSDM is a correlative approach (Dormann et al., 2012) and that residual correlations may also appear due to missing (environmental) covariate that influence species co-occurrences (Hui, 2016;Pollock et al., 2014). It should thus be clear that JSDMs can nominate possible environmental filtering and biotic interactions but do not provide unequivocal evidence for them.
Understanding how and to which extent environmental filtering and biotic interactions shape ecological communities at different spatial scales is paramount to forecast future changes at different biodiversity levels and set-up efficient management strategies (Guisan & Rahbek, 2011;Heller & Zavaleta, 2009;Kremen, 2005;Levin, 1992;Wainwright et al., 2018). Here, we aim to examine the scale dependency of environmental filtering and biotic interactions influencing assembly of local vascular plant communities using four datasets collected in alpine (Swiss Alps) and arctic (northern Finland) environments. Due to the relatively small extents of the two study areas, the species can be assumed free of dispersal limitations (D'Amen et al., 2018;Pottier et al., 2013;le Roux, Lenoir, Pellissier, Wisz, & Luoto, 2013). In each dataset, species data were sampled within multiple nested plot sizes (i.e. spatial resolutions; sensu scale in Araújo & Rozenfeld, 2014; see also Dungan et al., 2002). Plot sizes varied from 20 × 20 cm (0.04 m 2 ) to 8 × 8 m (64 m 2 ); a resolution at which herbaceous species can effectively have an impact on each other, yet sampling is still feasible (Stoll & Weiner, 2000).
Methodologically, we make benefit of JSDMs  F I G U R E 1 At fine spatial resolution (i.e. small plot size on the left), yellow and green species occur more often together than alone indicating a positive association, whereas yellow and blue species rarely occur together indicating a negative association. At coarse spatial resolution (i.e. large plot size on the right), yellow and green species occur together more often than alone indicating a positive association similar as in fine scale, whereas yellow and blue species now also indicate a positive association due to more frequent co-occurrence enabled by increasing plot size [Colour figure can be viewed at wileyonlinelibrary.com] by fitting one model for each plot size within each dataset. Based on the models, we assess segregation and aggregation of species pairs, whether these associations are likely driven by environmental filtering and/or putative biotic interactions, and how the pairwise species associations and their potential drivers vary across spatial scales.

| Datasets
We examined species pairwise associations across spatial scales using different datasets collected in non-forested sites from two study areas. Each dataset comprises presence-absence observations of all vascular plant species sampled at various nested plot sizes (varying from 20 × 20 cm = 0.04 cm 2 to 8 × 8 m = 64 m 2 ) together with spatially related information on the abiotic environment (Table 1). Species were recorded as present even if only a part of above-ground vegetative growth was within the plot. To ensure reliable model parameter estimation, only species with prevalence of 5-95 % at all plot sizes for a given dataset were retained for statistical analyses, (Figures S1 and S2 in Appendix). For each dataset, we selected the environmental variables based on previous studies (e.g. Dubuis et al., 2013; showing that they comprehensively represent the eco-physiological requirements of plant species in the study areas: temperature, moisture, soil and light conditions (Mod, Scherrer, Luoto, & Guisan, 2016).
The first dataset (AlpineM) was collected in the western Swiss Alps (46°23′N, 7°5′E; e.g. Randin et al., 2006;Randin, Jaccard, Vittoz, Yoccoz, & Guisan, 2009) and consists of 434 sites, each with four nested square plots of 1, 4, 16 and 64 m 2 (Table 1; Figure S3 in the Appendix). Minimum and maximum distances among sites are 200 m and 40 km, respectively, covering an elevation gradient of 375-3,210 m a.s.l. The dataset has a total of 910 species, but only 122 species met our prevalence criteria, while 420 sites presented at least one species across all resolutions (Table 1; Figure S1 in Appendix). The environmental variables (growing degree days, topographic position index, soil pH and solar radiation) for each site were derived from raster layers with a spatial resolution of 25 × 25 m. Resolution of the environmental data was considered to be appropriate with regard to the topographic variability in the study area (Pradervand, Dubuis, Pellissier, Guisan, & Randin, 2014).   Figure S1 in Appendix). The environmental data were the same as for the previous dataset. The second Arctic dataset (ArcticM) comprises all 21 grids.
Here, 1 m 2 is used as the smallest plot size, and species data with larger plot sizes (4 and 16 m 2 ) were formed by dividing the grids into 2 × 2 m and 4 × 4 m plots ( Figure S6 in Appendix). This resulted in 3,360, 840 and 210 plots with resolutions of 1, 4 and 16 m 2 respectively (Table 1). After applying our prevalence criteria, total species richness decreased from 134 to 48 species. Additionally, few plots of 1 and 4 m 2 containing no species were removed. This dataset was modelled with the same environmental variables as for the ArcticCM dataset, but for plots of 2 × 2 m and 4 × 4 m, we used the mean (median for pH) environmental values of the corresponding 1 × 1 m plots.

| Statistical analyses
Pairwise species associations and their potential causes were analysed using multivariate Bayesian JSDM as implemented in the r-package boral (Hui, 2016). A key feature of this JSDM is the ability to incorporate latent variables as a parsimonious model-based ordination to assess correlation between species. The model is fitted in hierarchical manner. First, boral assess environmental responses of each species by fitting a GLM, by regressing a n × p matrix where rows i = 1, …, n are sites and columns j = 1, …, p are species recorded at each site against a n × q  Table 2.
TA B L E 2 Ecological interpretation of different combinations of environmental and residual correlations. Note that due to the correlative nature of joint species distribution models (JSDMs), identified associations only suggest causal ecological process (i.e. environmental filtering and biotic interactions). Residual correlations, in particular, could be related to a missing (environmental) factor that influence distribution of both species in a pair  (Plummer, 2003). Under this framework, both environmental and residual correlations are represented as posterior distributions. Following Bayesian approaches, posterior distributions with 95% highest posterior density (HPD) intervals excluding zero indicate evidence for non-random association within a species pair. One JSDM was fitted to each dataset and plot size, resulting in four models for the AlpineM dataset, two models for the AlpineCM dataset, and three models for both the ArcticCM and the ArcticM datasets (first 12 rows in Table 1). The environmental variables were included both as linear and second-order terms to account for quadratic effects. All models included ran- intercepts and species coefficients related to environmental variables. We used half-Cauchy distributions for standard deviation parameters associated with random effects on sites (Gelman, 2006). Convergence was assessed using the Geweke convergence diagnostic (Geweke, 1992), which is a diagnostic applicable with only one MCMC chain (see Table 1 in Appendix).

| Additional datasets and models
We run additional models to examine the robustness of our results. This issue is typical for datasets modelled with a spatial approach (Guisan & Thuiller, 2005) where environmental data usually present a coarser resolution than species data (Connor et al., 2018;Guisan, Graham, Elith, & Huettmann, 2007). For example while species data were collected at different plot sizes, environmental data had a resolution of 25 m for all alpine datasets and a resolution of 1 m 2 for ArcticCM dataset. To test how the scale mismatch affects the obtained results, we re-modelled the ArcticM dataset using a 16 m 2 resolution for environmental data for all models and plot sizes (1, 4 and 16 m 2 ). TSS and environmental and residual correlations derived from the ArcticM (matching spatial resolution between species and environmental data) and ArcticM env16 datasets (non-matching spatial resolution) were then compared.

| RE SULTS
All models converged well (on average more than 95% of the parameters converged; Table S1 in Appendix). The TSS indicates an overall good model performance for all datasets (M = 0.58; SD = 0.12; Table S1 in Appendix).
For most datasets and plot sizes, the majority of the species pairs presents 95% HPD intervals of environmental and/or residual correlations that exclude zero (i.e. indicating non-randomly associated species pair; Figure 2). Residual correlations are in general more common and stronger than environmental correlations, and positive residual correlations are more frequent and stronger than negative residual correlations (Figures 2 and 3).  (Table S1 in Appendix).
Percentages of species pairs with random/non-random residual correlations (i.e. 95% HPD interval overlapping/excluding zero) and median absolute residual posterior correlations do not vary consistently across plot sizes among datasets (Figures 2 and 3), but incorporating latent variables into the models improves TSS values more at fine than coarse scales for all datasets (Table S1 in Appendix).   (Figures 2 and 3). However, the species pairs that have non-random negative residual correlation at the finest scales have weaker (or even positive for AlpineM and ArcticM 210 datasets) mean residual correlation at coarser scales than the species pairs that have non-random positive residual correlation at the finest scales ( Figure 4).
Comparison between ArcticM and ArcticM env16 , the datasets with matching and non-matching spatial resolution between species and environmental data, shows only slight differences in environmental and residual correlations. Model fits of ArcticM env16 are better than the model fits of ArcticM (Table S1 in Appendix). Using the same number of plots within each resolution (ArcticM 210 ) results in less species pairs with non-random environmental correlations and less variation in residual correlations across scales.

| D ISCUSS I ON
Scale dependency of assembly rules can affect the interpretations of factors driving the composition of communities (Götzenberger et al., 2012). In theory, it is frequently assumed that the process of environmental filtering mainly functions at spatial resolutions coarser than those at which biotic interactions act (Pearson & Dawson, 2003).
However, mixed results can be found in the literature, especially concerning the scale dependency of biotic interactions (Araújo & Luoto, 2007;Firth & Crowe, 2010;Reitalu et al., 2008). Recently, Araújo and Rozenfeld (2014) proposed that the type of biotic interac- (see Figures 1 and 2). The scale dependency of ecological assembly rules is better supported when looking at the model fits (i.e. how well the distribution of species is explained by abiotic predictors and latent variables). Fits of environment-only models improve with increasing plot size, and incorporating latent variables (i.e. surrogate to biotic interactions) improve model fits more at fine than at coarse scales. However, we acknowledge that the here-measured potential influences of environmental filtering and biotic interactions across scales could be related to the chosen environmental variables. For consistency, we used the same environmental predictors to model species data at all plot sizes for a given dataset.
While these predictors are ecologically justified (Mod et al., 2016), depending on the scale, other abiotic factors might be more relevant (Yeager, Deith, McPherson, Williams, & Baum, 2017). The absolute overall residual correlations show no clear scale dependency, possibly due to the higher prevalence of species at larger plot size (see also Zurell et al. (2018), who demonstrated that species prevalence can drive the magnitude of residual correlations in JSDMs), and/or the stronger role of stochastic processes at fine scales (Bowman & Swatling-Holcomb, 2018). However, the differences in residual correlations across scales become more apparent when examining negative and positive residual correlations separately, as postulated in our second hypothesis based on the model of Araújo and Rozenfeld (2014). Especially negative associations occurring at the finest scales weaken slightly more than positive associations with increasing plot size for the majority of the datasets. This is especially the case for the species pairs that indicate competition due to the shared niche (i.e. have positive environmental correlation and negative residual correlation; see Table 2 and Figure 4).
Different scale dependency of negative and positive biotic interactions might explain why the effect of biotic interactions have been demonstrated even at continental scales such as the effect of host on the distribution of a butterfly species (Araújo & Luoto, 2007), whereas evidence of negative interactions (competition, predation, amensalism) is mainly captured at fine scales (Purves & Law, 2002); but see Louthan, Doak, and Angert (2015) for the competitive exclu- Consequently, part of the results obtained with these datasets may actually stem from a methodological rather than an ecological phenomenon (see also Belmaker et al., 2015;Zurell et al., 2018). As the number of plots increases, so does the possibility to find associations between species by chance, possibly over-emphasizing aggregation and segregation patterns. Cutting down the number of plots (as done here for the ArcticM 210 dataset) does not seem to be the unsurpassed solution either, as it artificially removes ecological information from the dataset, but future applications of JSDM might need to include a penalization for datasets with higher n.

| CON CLUS IONS
Taken together, our results provide some support for the hypotheses related to the scale dependency of assembly rules. The role of environmental filtering and biotic interactions in structuring species assemblages did not consistently vary across scales, but the scale dependency of biotic interactions was revealed when partitioning the residual correlations into positive and negative associations.
Influence of negative interactions decreases slightly more than the influence of positive interactions towards coarse spatial scales.
Despite all datasets not unambiguously supporting the hypothesis, the generality of our findings is increased by the matter that we used datasets from two different study areas, in both alpine and arctic environments.
While we found JSDM as an efficient tool to disentangle the roles of ecological assembly rules across scales, it is conditional to the challenges of using statistical methods to infer patterns from empirical data: measured correlations can support but not prove causality (Dormann et al., 2012). Thus, JSDMs only indicate that environmental filtering and biotic interactions are potential causes of species aggregation and segregation. Further, and as previously identified (Pollock et al., 2014;Zurell et al., 2018), we detected that species prevalence, missing or inaccurate environmental predictors and data structure are potential sources of inaccuracy with JSDM.
Thus, while our results are in line with a study based on simulated data (Zurell et al., 2018), we cannot assess the veracity of our findings (as is the case for many studies) due to a global lack of knowledge on assembly rules, especially biotic interactions, actually taking place in nature. The challenge remains to conduct studies across a multitude of communities, environmental conditions and spatial scales to comprehensively understand the role of ecological assembly rules in community building.

AUTH O R S ' CO NTR I B UTI O N S
H.K.M. conceived the initial idea and led the writing, with all the authors contributing; M.L. and A.G. led the collections of the data; H.K.M. and M.C. analysed the data.