Does trait- based joint species distribution modelling reveal the signature of competition in stream macroinvertebrate communities?

1. The occupancy and abundance of species are jointly driven by local factors, such as environmental characteristics and biotic interactions, and regional- scale fac tors, such as dispersal and climate. Recently, it has been shown that biotic in teractions shape species occupancies and abundances beyond local extents. However, for small ectothermic animals, particularly for those


| INTRODUC TI ON
Understanding the factors determining species occupancies and abundances' spatial arrangements has been a longstanding core topic of ecology and biogeography (Mittelbach, 2012;Wisz et al., 2013). These patterns are crucial not only to conceptual issues on the origin, maintenance and distribution of biodiversity but also to more applied research such as conservation biology. For instance, to predict species range shifts caused by ongoing global change, it is essential to identify the determinants of species occupancy and abundance. At broad spatial extents (i.e. landscapes, regions, continents and beyond), species occupancy and abundance patterns have been attributed mainly to climate and other abiotic factors, whereas the effects of biotic interactions have often been neglected (Pearson & Dawson, 2003;Wisz et al., 2013). Recently, this view has been challenged with increasing evidence showing how biotic interactions both within and across trophic levels contribute to these patterns beyond local extents (Araújo & Luoto, 2007;Blois et al., 2013;Gotelli et al., 2010;Kissling & Schleuning, 2014;Wisz et al., 2013). While it is certainly true that abiotic factors and dispersal affect the occupancy and abundance of species, an additional interesting question is how biotic interactions influence these patterns.
Determining the effect of biotic interactions through statistical methods from observational data began with studies of species co-occurrence patterns on islands (Connor & Simberloff, 1983;Diamond, 1975). The key idea was that competition should lead to the lack of co-occurrence of the competing species (Diamond, 1975) compared to co-occurrences predicted by a null model (Connor et al., 2013;Connor & Simberloff, 1983). Recently, the growing interest in biotic interactions has led to a myriad of different statistical methods to study whether species are found together less often (negative association in occupancy; segregation) or more often (positive association in occupancy; aggregation) than predicted by random chance only (see Dormann et al., 2018 for an overview). These techniques include single species distribution and dynamic occupancy models in which other species' occupancy (Heikkinen et al., 2007), phylogenetic relationships (Morales-Castilla et al., 2017) and/or ecological similarity (Beaudrot et al., 2019) are included as covariates. Other examples include partial correlation networks (García-Girón et al., 2020), and different ordination methods, such as distance-based redundancy analysis (Bottin et al., 2016). A technique that is becoming increasingly popular is joint species distribution models (JSDMs; Ovaskainen et al., 2017;Pollock et al., 2014;Thorson et al., 2015). JSDMs enable the simultaneous modelling of multiple species' responses to environmental variables, and the residuals of the model indicate whether species-to-species associations are positive, negative or random, while controlling for environmental variables (Warton et al., 2015).
Considering the effect of environmental variables is crucial since positive associations might be due to preferences for similar habitat characteristics (reflecting species' convergence), especially among close relatives. Likewise, negative associations may be due to differences in habitat preference (reflecting species' divergence) rather than competition.
However, meaningful interpretation of non-random associations is often complicated, and mere non-random associations, even after controlling for environmental factors, may not allow one to separate between abiotic and biotic factors reliably (Dormann et al., 2018). For example, failure to include biologically important environmental variables in a JSDM can also produce non-random associations (Dormann et al., 2018). Hence, speciesto-species associations from JSDMs merely represent hypotheses suggesting which species potentially show biotic interactions (Ovaskainen et al., 2017). Experiments are by far the most reliable way for testing whether species-to-species associations indeed result from biotic interactions (Fayle et al., 2015;Siepielski et al., 2010) but also hypothesis testing framework is a useful method to explore these associations (D'Amen et al., 2018;Kohli et al., 2018). Within hypothesis testing frameworks, the effects of biotic interactions and habitat preferences can be separated, (stream site), in occupancy rather than abundance, and not related to species functional dissimilarity or to their dispersal ability. Thus, all our hypotheses considering possible competition (H1-H4) were rejected. 5. Competition does not appear to be a major driving force of stream macroinvertebrate communities at the spatial grain sizes considered. The observed positive associations in occupancy at small grain (stream site) may be attributed to species' similar microhabitat preferences, whereas at large grain (river basin), they may stem from metacommunity dynamics. Our results highlight that species traits were necessary to interpret whether or not species-to-species associations from joint species distribution models resulted from biotic interactions.

K E Y W O R D S
body size, dispersal, distribution, functional feeding guilds, interspecific competition, joint species distribution models, streams, substrate attachment mode for instance, by linking observed species-to-species associations to species ecological and functional traits (Dormann et al., 2018;Kohli et al., 2018). If ecologically and functionally similar species with similar habitat preferences are negatively associated, this can be interpreted as a result of interspecific competition (Kohli et al., 2018;Mönkkönen et al., 2017). Indeed, the notion that ecologically similar species should compete more intensively than ecologically dissimilar species is not new (Darwin, 1859), and ecological similarity has been used as a proxy for the probability that species compete with one another (Beaudrot et al., 2019).
Functional and phylogenetic information can be included to the modelling framework of JSDMs (Ovaskainen et al., 2017) to study if species that share functional traits respond in a similar fashion to specific environmental factors (Pollock et al., 2012). However, the dependence of species-to-species associations on species functional dissimilarity cannot (yet) be incorporated into the JSDM framework.
We present an approach that combines JSDMs with a hypothesis testing framework. We build a priori hypotheses of the relationship between species functional dissimilarity and speciesto-species associations when biotic interactions affect communities. We then test these hypotheses with species-to-species associations derived from JSDMs, that is, associations independent of the in situ measured relevant habitat characteristics. A hypothesis testing framework has been used previously with a context of separating possible biotic interactions from other factors on observational data as a form of dichotomous logic-trees (D'Amen et al., 2018;Kohli et al., 2018). Each individual species pair has been tested separately whether the hypothesis is rejected or accepted. By contrast, we evaluate the association of each species pair in relation to all other species pairs (Mönkkönen et al., 2017) to obtain a robust perspective. If negative associations of functionally similar species occur with the same probability as negative associations of functionally dissimilar species, controlling for environmental variables, this raises doubts whether competition drives segregation. In addition, we extend our approach by adding hypotheses predicting at which spatial grains, within which species, and whether using occupancy or abundance species-tospecies associations and functional dissimilarity should show a relationship. Hence, we get a deeper sight whether biotic interactions, and competition in particular, drive species-to-species associations derived from JSDMs.
Specifically, we examined species-to-species associations of stream macroinvertebrates at a regional extent. The studies of biotic interactions at regional extents have focused on the endothermic terrestrial fauna or flora (e.g. Wisz et al., 2013) while small ectothermic animals, particularly those living in freshwater environments, have received less attention. In general, it has been suggested that small ectothermic animals show more random occupancy patterns in contrast to endothermic animals (Gotelli & McCabe, 2002).
However, the findings may stem from differences in species richness among datasets (Ulrich & Gotelli, 2010, and the results obtained thus far for insect communities range from a low degree of competition (D'Amen et al., 2015) to severe competition (Fayle et al., 2015). Distributions of stream macroinvertebrates have been found to show spatial segregation (Heino, 2009), although the effect might be dependent on the environmental context (McCreadie & Bedwell, 2013), accompanied by spatial scaling issues (Heino & Grönroos, 2013). At local extents (i.e. multiple sites within one stream), segregation may be interfered by high dispersal rates, the effects of which may be less important at regional extents (Heino & Grönroos, 2013).
We hypothesized that (H1) if competition drives the speciesto-species associations, functionally similar species will show negative associations, whereas functionally dissimilar species show random associations ( Figure 1). In addition, we hypothesized that the relationship between functional dissimilarity and the strength of the association is more pronounced (H2) for abundances rather than occupancies. Studies of potential biotic interactions at regional extents typically use presence-absence data, which is also true for aquatic macroinvertebrate studies (Heino, 2009;McCreadie & Bedwell, 2013; but see García-Girón et al., 2020). However, the effect of biotic interactions must be very strong to lead to total exclusion of other species from a site or a species inability to occupy a site without the other species (D'Amen et al., 2015), especially in aquatic environments where animals move easily with currents (Tonkin et al., 2018). By contrast, changes in abundances might reveal more subtle patterns remaining undetected in occupancy data, due to, for example, mass effects (Leibold et al., 2004). These effects may be strong especially at local extents (e.g. multiple sites within one stream; Heino & Grönroos, 2013). Abundance data may reveal that a species may be less abundant in the presence of another species at high abundance, and vice versa. These associations in abundances could arise from biotic interactions that take place in macroinvertebrates, at least at grain sizes typically less than 1 m 2 (Holomuzki et al., 2010). Hence, while occupancy data are more often available and more robust than abundance data, abundance may better help to reveal possible biotic interactions.
We hypothesized that the relationship between functional dissimilarity and species-to-species association is more pronounced (H3) at small spatial grains rather than at large spatial grains, and (H4) among species having weak dispersal ability than among species with high dispersal ability ( Figure 1). Species with good dispersal ability may be especially prone to mass effects which would lead to more spurious patterns than species with weak dispersal ability. In comparison to competition, positive biotic interactions among macroinvertebrates have been rarely studied in freshwater settings (Holomuzki et al., 2010;Silknetter et al., 2020), possibly reflecting the long-term tendency in ecology to emphasize more negative interactions than facilitation (Bruno et al., 2003).
Positive associations can occur among functionally very different organisms (Silknetter et al., 2020), and they may be most likely due to indirect facilitation by ameliorating environmental conditions (Holomuzki et al., 2010). Thus, we do not have specific predictions which species, if any, show positive associations in occupancy or abundance.

| Study area
The stream sites were located in western Finland with a North-South gradient of more than 500 km, representing geographical variation from 60 to 65°N and from 22 to 26°E (Supporting Information 1). The stream sites were selected to cover as much variation in local stream environmental conditions as possible. The catchments also covered a wide variation in land use from catchments dominated (50%-60%) by agriculture to catchments almost entirely (80%-90%) covered by natural boreal forests (Petsch et al., 2021). The 105 stream sites were collected from 21 river basins, that is, five separate streams draining into each of 21 river basins were surveyed.

| Field surveys
The stream sites were sampled in September 2014, because most aquatic insect species are found in their larval stage in early autumn.
At each stream site, we measured seven variables that describe habitat characteristics and five water quality variables. These variables have been found important in studies of stream macroinvertebrate communities in the boreal region (Grönroos & Heino, 2012;Malmqvist & Mäki, 1994). For habitat characteristics, we measured (1) current velocity (m/s) with a Schiltknecht MiniAir 2 flow meter (Schiltknecht, Gossau, Switzerland) from the middle of bottom and surface and (2) depth (cm) at 30 random spots in each stream site.
The 30 measurements were taken from three spots in each of the 10 random cross-channel width measurements (see below): one from the middle of the channel, and the other two from both sides at a distance of ¼ of the channel width from the shoreline. In addition, we measured (3) percentage coverage (%) of mosses and particle size distributions at 10 random locations in each site. Inorganic particle size classes (%) were visually estimated in 0.25 m 2 using a modified Wenthworth's (1922) (4) mean grain size (mm) which was used as an instream habitat variable. We also measured (5) mean stream width based on ten random cross-channel measurements at each site, and (6) visually estimated shading of a stream by riparian vegetation (%) and (7) riparian deciduous tree cover (%) at each stream site. For water chemistry variables, we measured (1) pH and (2) conductivity F I G U R E 1 A schematic figure of the framework of the study. Each dot represents a species-to-species association derived from a joint species distribution model. These species-to-species associations reflect whether species show positive, negative or random association when environmental characteristics have been controlled for. We hypothesize that if competition drives these associations (H1) functionally similar species show negative associations, whereas functionally dissimilar species show random associations. Furthermore, we hypothesize that this relationship between functional dissimilarity and species-to-species associations is more pronounced (H2) at small grains that at large grains, (H3) with abundance than with occupancy, and (H4) among species with low dispersal ability than among species with high dispersal ability (H1) Functionally similar species are mostly negatively associated, whereas functionally dissimilar species show random associations In the laboratory, all invertebrates were separated from debris and they were identified mostly to species level (76% of taxa). However, young instars and individuals lacking species-level taxonomic keys were identified to genus level. Chironomidae, Ceratopogonidae, Simuliidae, Hydracarina, Oligochaeta and Turbellaria were identified to a coarser taxonomic level, and were thus excluded from the analyses.
We selected the macroinvertebrate species occupying at least 10 sites and for which we found at least 100 individuals, resulting in a set of 40 species for the statistical analyses. We classified the species according to their primary substrate attachment mode (i.e. burrower, crawler, semi-sessile, swimmer) and functional feeding guild (i.e. gatherer, filterer, predator, scraper, shredder). We calculated mean potential maximum size (dry weight mg) of the aquatic stage of species using the length-weight relationships obtained from the literature for body size measure (e.g. Benke et al. 1999 measures to what extent species j 1 and j 2 are found together more or less often than expected by chance in the scale from −1 to +1. We modelled the occupancies and abundances of the 40 species in the 105 sites with a hurdle model consisting of two parts: we modelled the occupancy of the species by a probit model, and conditionally on the presence, we modelled the abundance (log-transformed count, normalized to zero mean and unit variance within each species) with a normal model. In both parts of the hurdle model, we included random effects to reflect the hierarchical sampling scheme as stream sites (105)

| Effect of species traits
We divided aquatic macroinvertebrate species according to their primary substrate attachment mode, functional feeding guild and body size which are key traits of aquatic macroinvertebrates (Tachet et al., 2010). The selected 40 species covered a range of biological traits (Supporting Information 3). Most of the species were classified as crawlers in their substrate attachment mode (29 from the total of 40 species), scrapers as their functional feeding guild (14 species) and aerial active as dispersal type (21 species).
With species substrate attachment mode, functional feeding guild and log 10 -transformed body size as traits, we calculated pairwise functional dissimilarity using Gower distance (function 'daisy' of the r package 'cluster'; Maechler et al., 2016). If competition drives the associations, we expect negative associations between the most similar species, whereas dissimilar species would show random associations (H1). This would lead to a positive relationship between species-to-species associations and species dissimilarity. Moreover, we expect that this positive relationship is more pronounced between species having weak dispersal ability than between species having high dispersal ability (H4) because species with high dispersal ability may be especially prone to mass effects leading to more spurious patterns in comparison to species with weak dispersal ability.
Thus, we divided species pairs to three groups differing in their dispersal ability: (a) both partners in the pair are aerial dispersers (highest dispersal ability), (b) both are aquatic dispersers (lowest dispersal ability) and (c) one species is an aerial disperser and the other is an aquatic disperser (medium dispersal ability). We applied a simple Spearman rank correlation between species-to-species association and species dissimilarity, for the three dispersal groups separately.
This was repeated for each of the four species-to-species association measures (model: occupancy/abundance; grain: river basin/ stream site). Because of the interdependence of the observations, assessing the level of statistical support through, for example, the p values associated with the linear regressions would not be adequate.
We thus generated a null distribution by randomizing the order of species when calculating dissimilarity and re-calculated the correlations with these randomized data. We repeated the procedure 1,000 times and calculated 2.5% and 97.5% percentiles of the resulting parameter values. In addition, we repeated the procedure without predatory species, as for predator-prey relationships, the pattern of increased segregation with increased similarity was not expected.
All analyses were carried out with r version 3.5.1 (R Development Core Team, 2017). We provide the codes for HMSC and the following dissimilarity analyses in Supporting Information 5.

| RE SULTS
The mixing of MCMC chains was good for the abundance model as the potential scale reduction factor was smaller than 1.1, and the effective sample sizes were close to 2,000 for the studied parameters (Supporting Information 5). For the occupancy model, the mixing was adequate for omega-parameters, although for some pairs in stream sites, the convergence was not ideal (Supporting Information 5). For beta-parameters (species-specific regression parameters de-scribing the response to the environmental data that are not considered in the current study), the effective sample sizes were smaller than 2,000. The potential scale reduction factors were sometimes much larger than 1.1 (Supporting Information 5). This is not an unusual phenomenon with non-normally distributed, large and multidimensional data (Tikhonov et al., 2020). For occupancy, the mean explanatory power (measured as Tjur's R 2 ) of the model was 0.340 and the mean predictive power was 0.146. For abundance, the mean explanatory power (measured as R 2 ) of the model was 0.457 and the mean predictive power was 0.085. There was much variation among species in the explanatory and predictive powers of both parts of the hurdle model (Figure 2).
We found at least moderate (posterior probability at least 80%) species-to-species associations in occupancy for a large number of species pairs at both grains (Figure 3a,b). At the river basin grain, we found both positive and negative associations (Figure 3a), whereas at the stream site grain, the majority of them were positive (Figure 3b).
The negative associations at the stream site grain were related mainly to two species (Asellus aquaticus, Plectrocnemia conspersa; a handful of species (Figure 3d). Thus, without even taking species functional similarity account, we can already discard our hypothesis (H2) stating that species-to-species associations should be more pronounced for abundance than occupancy.
Disagreeing with our hypothesis (H1), there was no positive relationship between species functional dissimilarity and speciesto-species association's strength and sign (Figure 4a,b). This also disagrees with the hypotheses stating that the relationship should F I G U R E 3 Estimates of species-to-species associations using occupancy (a, b) and abundance (c, d), and river basin (a, c) and stream site (b, d) as a grain. The positive (red) and negative (blue) species-to-species associations are shown for each pair if the association has either sign with at least 80% posterior probability. Each species is coded by including the first three letters from its genus and species name (see Table S2 in Supporting Information for the full scientific names of the species)   Table 1). Exclusion of the predatory species did not qualitatively affect the results (Supporting Information 6).
Since all but five of the associations in abundance were weakly supported ( Figure 3c,d), we did not execute further analyses for them.

| D ISCUSS I ON
Stream macroinvertebrates showed both negative and positive species-to-species associations while controlling for in situ measured habitat characteristics. However, the negative associations were mostly at large grain (river basin) than at small grain (stream site), in occupancy rather than abundance, and not related to species functional dissimilarity. Thus, we rejected all of our hypotheses considering possible competition, and we can conclude that competition is not a major driving force behind the negative associations at the spatial scales studied. However, the analyses revealed interesting patterns showing grain dependency of positive associations and their relationships with species functional dissimilarity on specific dispersal modes. This raises the question of the key mechanism behind positive associations.
Our analyses showed that a significant proportion of the species-to-species associations in occupancy were non-random when accounting for habitat characteristics, thus disagreeing with the findings of random co-occurrences of stream macroinvertebrates at regional extents (Heino & Grönroos, 2013;McCreadie & Bedwell, 2013). Also, the sign and strength of the species-tospecies associations were dependent on the spatial grain: at the larger river basin grain, there were both negative and positive associations, whereas at the smaller stream site grain, the associations were more frequent and predominantly positive. This is in line with a study of mammals showing the increase in prevalence of positive co-occurrences at local versus landscape scales (Kohli et al., 2018).
There were both positive and negative associations in occupancy at the river basin grain, but the associations were not related to F I G U R E 4 The relationship between species functional dissimilarity and pairwise association using occupancy at river basin grain (a) and stream site grain (b). The solid grey line indicates zero associations. The different pair's dispersal groups are indicated by different colours. Please note that the trend lines represent fitted linear models while more robust Spearman rank correlation is used in the formal analyses  At the stream site grain, positive associations in occupancy constituted the vast majority of the non-random associations. The reason for these positive associations is likely to be related to microhabitat variation. Although our modelling procedure included 11 environmental variables, which have previously shown to be important for stream macroinvertebrates in boreal regions (Grönroos & Heino, 2012;Malmqvist & Mäki, 1994), there are important environmental characteristics which tend to vary at very fine scales (e.g. algal biomass, fine organic material, and leaves of different aquatic macrophytes and shoreline trees). Unfortunately, it is not feasible to measure these characteristics in such a large-scale study as presented here due to logistic and time constraints. Support for the similar small-scale habitat preferences comes from the result showing that for species pairs with low dispersal ability, the functionally similar ones had the most positive associations in occupancy. This pattern remained undetected in species with higher dispersal ability, possibly because of the strong mass effects. Interestingly, some species pairs had negative associations at large (river basin) grain and positive association at small (stream site) grain. Although this may seem counterintuitive, it stems from those species that occupy different drainage basins more often than predicted by chance, given the habitat characteristics. Still, when they do co-occur in a drainage basin, they also co-occur in a stream site.
The negative associations in occupancy at the stream site grain occurred only for two species, Asellus aquaticus and Plectrocnemia conspersa, which tended to have negative associations with almost all other species. The mechanisms behind these negative associations for the two species are likely to differ. A. aquaticus is able to tolerate various environmental conditions, and it is also often abundant in streams with relatively poor water quality (Maltby, 1991), where most species are not likely to occur. P. conspersa, in turn, is an effective predator, particularly in small stream sites (Edington & Hildrew, 1995), and may thus suppress other species. The general lack of negative associations at the small grain may be because of two reasons. First, the selected species are relatively generalist and able to shift their resource use (Mihuc, 1997), and thus they may not actually compete for resources. Second, the actual grain size was relatively large (c. 50 m 2 , on average) compared to the size of the studied animals, which allows for spatial segregation of the species within a study grain.
Contrary to our predictions, species-to-species associations in abundance were mostly weak, irrespective of the spatial grain. This may be related to mass effects where individuals disperse from a good site (i.e. source) to other sites (i.e. sinks), regardless of environmental characteristics at sink sites (Leibold et al., 2004). A good (source) site is a location where a species can maintain a viable population, which is opposite to an unsuitable (sink) site where a species goes extinct without a flux of immigrants. The characteristics of a good (source) site depend on each species' ecological preferences.
In addition, weak species-to-species associations in abundance are at least partly related to the modelling approach: the abundance was modelled conditionally on presence and thus absences were considered as missing data. This obviously leads to the decrease of data in comparison to the occupancy model, and is also shown as a rather weak predictive power of the abundance model. Another limitation of our approach is the number of species as we were only able to include only 40 of the most abundant species, thus excluding other rare species. Moreover, it may also be that biotic interactions occur among species groups (García-Girón et al., 2020;Schuwirth et al., 2016) and within the same genus, rather than among individual species.
To conclude, our results corroborate the potentiality of a traitbased approach when interpreting species-to-species associations based on joint species distribution models (Dormann et al., 2018;Kohli et al., 2018). Although we found many strong associations based on occupancy information, they were not generally linked to species traits in a way that would support biotic interactions, at least competition. Thus, species traits were used to interpret if species-to-species associations were the result of biotic interactions. The use of two spatial grain sizes in this study enabled the formulation of hypotheses to separate habitat characteristics and biotic interactions as a cause for non-random species-to-species associations. We see value in conducting further research to disentangle whether biotic interactions and abiotic environmental factors jointly affect community assembly even across large spatial extents. Our results add to the previous studies showing that distributions and abundances of stream macroinvertebrates are governed by abiotic environmental conditions (Heino & Mendoza, 2016;de Mendoza et al., 2018). They emphasize using of