Classifying ecosystem stressor interactions: Theory highlights the data limitations of the additive null model and the difficulty in revealing ecological surprises

Understanding how multiple co‐occurring environmental stressors combine to affect biodiversity and ecosystem services is an on‐going grand challenge for ecology. Currently, progress has been made through accumulating large numbers of smaller‐scale empirical studies that are then investigated by meta‐analyses to detect general patterns. There is particular interest in detecting, understanding and predicting ‘ecological surprises’ where stressors interact in a non‐additive (e.g. antagonistic or synergistic) manner, but so far few general results have emerged. However, the ability of the statistical tools to recover non‐additive interactions in the face of data uncertainty is unstudied, so crucially, we do not know how well the empirical results reflect the true stressor interactions. Here, we investigate the performance of the commonly implemented additive null model. A meta‐analysis of a large (545 interactions) empirical dataset for the effects of pairs of stressors on freshwater communities reveals additive interactions dominate individual studies, whereas pooling the data leads to an antagonistic summary interaction class. However, analyses of simulated data from food chain models, where the underlying interactions are known, suggest both sets of results may be due to observation error within the data. Specifically, we show that the additive null model is highly sensitive to observation error, with non‐additive interactions being reliably detected at only unrealistically low levels of data uncertainty. Similarly, plausible levels of observation error lead to meta‐analyses reporting antagonistic summary interaction classifications even when synergies co‐dominate. Therefore, while our empirical results broadly agree with those of previous freshwater meta‐analyses, we conclude these patterns may be driven by statistical sampling rather than any ecological mechanisms. Further investigation of candidate null models used to define stressor‐pair interactions is essential, and once any artefacts are accounted for, the so‐called ‘ecological surprises’ may be more frequent than was previously assumed.

where stressors interact in a non-additive (e.g. antagonistic or synergistic) manner, but so far few general results have emerged. However, the ability of the statistical tools to recover non-additive interactions in the face of data uncertainty is unstudied, so crucially, we do not know how well the empirical results reflect the true stressor interactions. Here, we investigate the performance of the commonly implemented additive null model. A meta-analysis of a large (545 interactions) empirical dataset for the effects of pairs of stressors on freshwater communities reveals additive interactions dominate individual studies, whereas pooling the data leads to an antagonistic summary interaction class. However, analyses of simulated data from food chain models, where the underlying interactions are known, suggest both sets of results may be due to observation error within the data. Specifically, we show that the additive null model is highly sensitive to observation error, with non-additive interactions being reliably detected at only unrealistically low levels of data uncertainty. Similarly, plausible levels of observation error lead to meta-analyses reporting antagonistic summary interaction classifications even when synergies co-dominate. Therefore, while our empirical results broadly agree with those of previous freshwater meta-analyses, we conclude these patterns may be driven by statistical sampling rather than any ecological mechanisms. Further investigation of candidate null models used to define stressor-pair interactions is essential, and once any artefacts are accounted for, the so-called 'ecological surprises' may be more frequent than was previously assumed.

| INTRODUC TI ON
Ecological communities are being subjected to a wide variety of external stressors (Halpern et al., 2015) that act across terrestrial, freshwater and marine biomes and threaten ecosystems and their services (Scheffers et al., 2016). These stressors, also termed drivers, factors or perturbations (Orr et al., 2020), are frequently anthropogenic in origin (Geldmann et al., 2014;Vörösmarty et al., 2010), but are capable of being abiotic or biotic (Przeslawski et al., 2015), and are able to act at the local to global scales (Ban et al., 2014;França et al., 2020). While individual stressors (e.g. climate change, habitat alteration or pollution) are themselves capable of inducing changes in biodiversity or ecosystems and their services (Dirzo et al., 2014;Newbold et al., 2015;Tittensor et al., 2014), ecosystems are frequently, if not predominately, acted upon by multiple stressors simultaneously (Crain et al., 2008). Despite the negative connotations surrounding the term stressor, stressors are capable of inducing effects that are either beneficial or detrimental to the affected ecosystem (Kroeker et al., 2017). One of the grand challenges facing ecologists is to be able to detect, understand and predict how these different types of ecosystem stressors interact to affect biodiversity and ecosystem services (Hodgson & Halpern, 2019); although the challenge is more difficult since the observed interactions can substantially deviate from what is anticipated (Christensen et al., 2006). Ultimately, knowledge of how stressors interact is important in guiding conservation and management initiatives, and in helping to prevent remediation measures from being ineffective, or even potentially harming those ecosystems they are intended to preserve (Brown et al., 2013;Côté et al., 2016).
Aquatic ecosystems and communities are particularly threatened by multiple stressors (Birk et al., 2020). For instance, Halpern et al. (2008) describe how every marine area is subjected to human influence, with 41% of these areas being impacted by multiple stressors.
Moreover, freshwaters represent some of the most at-risk ecosystems and are frequently exposed to a wide range of stressors (He et al., 2019;Hecky et al., 2010;Ormerod et al., 2010;Woodward et al., 2010), with freshwater biodiversity declining at rates exceeding even those of the most impacted terrestrial ecosystems (Sala et al., 2000), and potentially endangering vital ecosystem services (Malaj et al., 2014). While stressors often interact to impact freshwater ecosystems (Birk et al., 2020), their presence in freshwater ecosystems is not a new phenomenon, with some freshwater bodies having been subjected to stressors for several centuries (Dudgeon et al., 2006). However, the stressors that freshwater systems are currently facing has expanded, with the introduction of novel stressors, such as nanomaterials, while existing ones are continuing to have severe impacts (Reid et al., 2019). The cumulative impact of multiple stressors has been identified as one of the most pressing and emerging threats to freshwater biodiversity, but despite this, our current understanding of both how stressors interact, and the severity of their effects, is poor (Reid et al., 2019).
The term ecological surprise (sensu Paine et al., 1998) is often used to describe the changes in a biological response variable that contrast those anticipated when multiple stressors interact (e.g. Christensen et al., 2006;Jackson et al., 2016). Although an ecological surprise may be defined as an interaction that is either greater than, or less than, the expected magnitude from a null model, particular focus has been on interactions of stressors which interact synergistically; that is, where the combined effect is greater than the sum of the individual effects. Synergistic interactions of multiple stressors are important to document, firstly due to their potential to have a dramatic effect on ecological communities, and secondly because the presence of a synergistic interaction means management strategies can potentially have a large effect by mitigating against just one of the interacting stressors (Brown et al., 2013;Côté et al., 2016;Haller-Bull & Bode, 2019). Because of their potential impact, there has been a great deal of effort in recording the frequency of synergy in stressors across different ecosystems and communities (Côté et al., 2016). However, there is always a danger that an emphasis on their importance could lead to over-estimating the frequency of synergisms or other forms of ecological surprise (e.g. antagonisms) within the multiple stressor literature and, as highlighted by Côté et al. (2016), there is little evidence to suggest that stressors predominately interact in a synergistic manner. A pertinent question which has yet to be addressed is whether ecological surprises should be expected, or whether the prevalence of these interactions are skewed in some way by reporting biases, statistical sampling or both.
There is relatively little ecological theory that generates expectations of when and how often the cumulative effects of pairs of stressors should be synergistic, or indeed any other type of interaction. This is in contrast to other ecological interactions, such as the effects of multiple predators on prey density and biomass, where a much richer body of theoretical knowledge that has been used to generate a number of hypotheses for testing (Schmitz, 2007;Sih et al., 1998). Instead, progress on ecosystem stressor interactions has been made largely by meta-analyses across a number of experiments, realms, trophic levels, measured traits, taxonomic groups and stressor types (e.g. Crain et al., 2008;Darling & Côté, 2008;Jackson et al., 2016;Wu et al., 2011). Within ecosystem stressor research, the most popular approach is to use the additive null model where the stressor interaction is predicted to be simply the sum of their individual effects (e.g. Crain et al., 2008;Darling & Côté, 2008;Jackson et al., 2016;Strain et al., 2014), although the multiplicative null model, the log-transformed version of the additive model, is also relatively common (e.g. Bancroft et al., 2008;Gruner et al., 2008;Harvey et al., 2013;Rosenblatt & Schmitz, 2014). These null models classify interactions as either being null (i.e. the additive or multiplicative effect of interacting stressors), synergisms (i.e. greater than the null) or antagonisms (i.e. less than the null). While distinctions are increasingly being made for various forms of antagonistic interactions in this simple scheme (e.g. Jackson et al., 2016), there exists a range of other classification schemes (Orr et al., 2020), and these have been implemented across a number of studies (e.g. Piggott et al., 2015;Travers-Trolet et al., 2014). The profusion of null models can make it difficult to generalize results across different studies.
A 'synergistic' or 'antagonistic' interaction may have contrasting definitions depending on the scheme being used leading to the same interactions being labelled differently under contrasting schemes; hence the biological and statistical interpretation is therefore dependent on the null model being applied. One way round this issue is to pool published data together to harness increased statistical power and conduct a meta-analysis to search for generalities under a particular null model (examples listed in Côté et al., 2016). However, despite their potential, these meta-analyses have to date not identified any general covariates capable of explaining the broad patterns of multiple stressor interactions, meaning we still lack general predictions of the consequences of multiple stressors (Côté et al., 2016).
Given the lack of consistent generalities from empirical studies, the development of ecological theory within multiple stressor research may represent an approach capable of providing novel insights. Some theory has been developed for particular case studies (e.g. Brown et al., 2013;Galic et al., 2018), but only a few studies (e.g. Haller-Bull & Bode, 2019) have so far investigated more general insights. Of primary interest is the generation of theory which can provide a mechanistic underpinning to the field, and potentially allow for an increased understanding of multiple stressor interactions, compared to that provided solely by a null model approach (De Laender, 2018). However, theory could also be used to better understand the results obtained from the null model approach to empirical classification of stressor interactions. In particular, we know of no study that has investigated how robust the null models are to noisy data (i.e. sampling uncertainty and/or process variation); yet understanding this is important before we can draw strong conclusions from the empirical analyses. This knowledge is also important for evaluating the relative performances of the profusion of null models, and is therefore something which may help guide the end-user to decide which null model may be both appropriate and likely to yield important results in the face of what is often noisy and/or limited data.
Here, we begin to close these gaps in understanding by testing for the prevalence of non-additive effects of co-occurring pairs of stressors in freshwater ecosystems. We first develop classical community ecology models based on Lotka-Volterra consumer-resource dynamics in order to simulate data from biologically simple food webs impacted by pairs of stressors. This provides us with 'data' where we know the underlying stressor-pair interactions. We then use these simulated data to investigate the ability of the additive null model to recover interactions under a range of different levels of data uncertainty which we model as observation error. With a better understanding of the statistical null model, we then review the experimental literature to compile and analyse the largest (in terms of the number of interactions) dataset for the effects of co-occurring stressor interactions on the biomasses and densities of freshwater organisms. In particular, we ask whether ecological surprises are common in freshwater stressor interactions. The simulation experiments allow considerable insights into our empirical analyses and help prevent over-interpretation of our results.

| Theoretical models
In order to provide a better understanding of the empirical results that follow, we built food chain models based on the classical Lotka-Volterra consumer resource equations (Heath et al., 2014). We chose these models since we believe stressors may act directly on population-and trophic-level patterns, but also indirectly via trophic cascades (e.g. a species may be indirectly affected if its primary resource is directly affected by a stressor). This approach is also broadly in line with our empirical data which focuses on population and community-level metrics as the responses to stressor treatments (see below). To increase the robustness of our conclusions, we considered two forms of the model; one where (within trophic level) density dependence affects the death rates of each trophic level, and the other where consumer uptake is density regulated (Table 1). Both these scenarios were analysed by Heath et al. (2014) to investigate the roles of different types of density dependence on trophic cascades (see details therein). In both models, the basal level of the chain describes the dynamics of a key nutrient that limits the productivity of the food chain, and we assumed nutrients are added TA B L E 1 Equations used to establish theoretical food chains. The equations, sets and a brief description of the equivalent ecological trophic are shown

Equation type Equation Description
1a Density dependence dx n dt = n n x n−1 x n − n x n − n x 2 n Change in density of apex consumer (x n ) 1b Density dependence Consumer uptake regulation dx n dt = n n x n−1 x n 1 + n x n − n x n Change in density of apex consumer (x n ) 2b Consumer uptake regulation Consumer uptake regulation Change in density of nutrients (x 0 ) at a constant rate, ω (Heath et al., 2014). Each subsequent equation then describes a different type of consumer. The first level is wholly dependent on the nutrients and may represent a primary producer such as an algal species that requires a key mineral such as silica.
The second level consumes the first trophic level and is in turn consumed by a third trophic level, and so on until the apex consumer is reached. In the density dependence model (Equation 1; Table 1), the consumer (trophic level i) exploits the resource (trophic level i -1) with a constant consumption/attack rate, i , and the conversion efficiency parameter, i , determines the proportion of the consumed resource that is converted into new consumers (Heath et al., 2014).
Under density dependence, the consumer is self-regulated by the intraspecific density dependence parameter i , which leads to an increase in death rate as the consumer density increases (Heath et al., 2014). In contrast, the consumer uptake regulation model (Equation 2; Table 1), assumes the effect of increasing consumers is to slow down the consumption of the resource, perhaps due to increased interference (Heath et al., 2014). In this case, the parameter ν i , determines the consumer density at which the maximum per capita uptake rate is halved, defined as the density x i = 1/ν i (Heath et al., 2014).
Using these equations, we established food chains comprising either three, four or five trophic levels, and the equation for each trophic level models how the biomass or density changed over time.
For simplicity, we assumed all key parameters (nutrient input ω; consumption rates i ; conversion efficiencies i ; uptake regulators i ; density independent i , and dependent death rates i , for trophic level i) do not vary over time, and we investigated the effect of stressors on equilibrium biomasses/densities. The models also ignore spatial structure in the community, which also remain closed to immigration from outside apart from the constant input of the nutrient. Hence these models represent the simplest form of community dynamics that can be used to investigate the effects of multiple stressors as well as the manner in which they interact.
Stressors to the food chains were modelled by changing the values for parameters and comparing the resultant equilibrium densities across all trophic levels to the equilibria for a set of baseline parameter values. Equations 1 and 2 are not mechanistic models for specific stressors (e.g. pollution, temperature) but instead capture the net effect of stressors on the ecological processes of the food web species. For simplicity, we assumed each stressor had either a positive or negative effect on one model parameter (i.e. ω, i , i or i ), and we investigated how pairs of stressors interact to affect community densities. The baseline parameters were drawn from uniform distributions with ranges given in Table 2.
Therefore, for a given food chain, the baseline parameters for all trophic levels were independently sampled from the distribution of values given in Table 2. Similarly, the processes (parameters) affected by each stressor were randomly selected from the possible candidates, and the intensity of its effect on the baseline rate was drawn from a uniform distribution with the ranges shown in Table 2. The baseline parameter set therefore represented the control community, and as in experimental studies that employ the factorial design approach (e.g. Davis et al., 2018;Matthaei et al., 2010), we manipulated our model communities by investigating the effect of each stressor acting alone, as well as the stressors acting in combination. From these cases, we then computed the type of stressor interaction and how they combined to alter the community densities (see below for definitions of how stressor interactions are computed). We therefore chose one trophic level at random from the entire food chain, excluding the nutrient level.
We focused on this population/trophic level and mirrored it in our selection of empirical data (see below). This also means the species or trophic levels under scrutiny were not always directly affected by the stressor but could be affected solely due to a trophic cascade effect. It is also important to note that a stressor could have led to either an increase or a decrease in parameter value relative to the baseline; and that multiple stressors could have acted (1) and (2), with the mechanism they reflect, alongside the minimum and maximum values for the ranges of baseline parameter values. Parameter values were drawn from a uniform distribution U~(a, b) with lower limit, a, and upper limit, b, with the limits differing between the baseline and stressed parameters. The method for determining stressed parameter values is detailed in Supplementary Information S1 Parameter Ecological mechanism Baseline value range Stressed value range α

TA B L E 2 Explanation of the different parameters within Equations
The rate at which a trophic level predates upon the trophic level directly below.
The efficiency at which a trophic level can transform consumed matter into new individuals.
The density independent mortality rate of a trophic level.
The constant rate at which a resource (x 0 ) is input into the food chain.
The density dependent mortality rate of a trophic level.
The parameter was not under selection for alteration by a stressor ν A limit to the uptake rate of a consumer through a trait-mediated response that may be behavioural or otherwise. theoretical interactions, and for each community, we randomly selected a single trophic level for the focus of our estimation of the stressor interaction. All subsequent analyses of the theoretical data were performed on this group of 360,000 theoretical interactions.
This subsetting was required as there was a negative relationship between the number of trophic levels and the likelihood of the community being both stable and feasible, which biased the full dataset towards communities with only three trophic levels. The final 360,000 stressor interactions were selected with weighted probabilities to ensure approximately one third (i.e. ~120,000) were from each of the three food chain lengths, and that each model (Table 1) was also approximately equally represented.
Unlike the empirical studies used in the meta-analyses below, the , were from one of 50 different levels, ranging from 1 × 10 −6 to 5 × 10 −1 , in consistent logarithmic increments (e.g. 8 × 10 −6 , 9 × 10 −6 , 1 × 10 −5 , 2 × 10 −5 , etc.). The interpretation of is straightforward, as we would expect 99.7% of all observations to fall within 3 's of the 'true' stressor effect (i.e. the biomass/density in the absence of any observation error). Supplementary Information S1 details a complete overview of how observation error was incorporated into the theoretical data.

| Collation of empirical data
Through Web of Science we searched the primary scientific literature, for papers published before 1st January 2019, which investigated the impacts of multiple stressors on freshwater communities.
In order to be incorporated, papers needed to report results where there was a factorial design, namely: (i) a control (without stressors); (ii) each stressor acting individually; and (iii) the stressors acting simultaneously. We required papers to report the mean value of the response, the number of replicates, and standard deviation or standard error for each treatment in the factorial design; failure to report any of this information led to the study being excluded from our analysis. Additionally, papers were required to report at least one of the following untransformed metrics: biomass, abundance, density or chlorophyll-a of one or more groups of organisms within the stressed community. Hence, and in line with our trophic models, the focus of our effort was directed towards studies that report the effects of stressors acting at the population and community levels.

| The determination of effect sizes and the classification of interactions
Across both the theoretical and empirical datasets, we used the same method to determine the classification of an interaction, using the factorial form of the effect size metric, Hedges' d (Gurevitch et al., 2000). Hedges' d is frequently used to investigate the impacts of multiple stressors as it estimates the standardized mean difference between the means of stressed and control samples and is unbiased by small sample sizes (Hedges & Olkin, 1985). It is calculated by comparing the effect of the interaction on ecological communities to the sum of effects of the stressors acting individually; namely, an additive null model. In line with current methods, we inverted the sign of the interactions when the expected effect of the additive null model was negative (Piggott et al., 2015). This method allowed for interaction effect sizes to be compared regardless of their directionality. We therefore focused on the classification of the interaction as opposed to the absolute magnitude/polarity of the effects.
Supplementary Information S3 gives a complete breakdown of the equations used for calculating Hedges' d.
Once Hedges' d for a given interaction of stressors was calculated, we then classified the interaction into one of four types as illustrated in Figure 1 and following the convention of Jackson et al.

| Vote-counting
Following the classification of all interactions, we implemented a vote-counting method to determine the relative proportions of the interaction classes across both the theoretical and empirical datasets. To consider the effect of different strengths of observation error on the ability to detect the 'true' stressor interaction in the modelled data, we computed the frequency of interaction types for each level of observation error investigated.

| Summary effect sizes
Alongside the vote-counting method, we calculated summary effect For the empirical analysis, summary effect sizes were determined by using a weighted random effect model and implemented in the metafor package (Viechtbauer, 2010) in R. Random effects were specified as being the identity (ID) of the study group of organisms nested within the ID for study. The random effects were specified in order to account for both within-and between-study variation contained within the empirical dataset. Additionally, some empirical studies considered multiple intensities of one or more stressors, and as such, calculations of the interaction class for each intensity of stressor used the same control. To account for any covariance between the different intensities of a single stressor, we incorporated covariance-variance matrices within the metaanalytical models. For the empirical dataset, mixed effect models were also conducted with the fixed effects of stressor pair or organism group, alongside the previously described random effects (see Supplementary Information S5). The summary effect size for the theoretical dataset was also determined using a similar process.
However, due to computational limitations caused by the number of interactions under analysis (360,000 interactions at each level of observation error), fixed effect models for the theoretical data were fitted using the lm function. The models applied to both the theoretical and empirical datasets are explained in further detail within Supplementary Information S5. While we detail the results of both the vote-counting and summary effect size methods, our results primarily focus on summary effect sizes, in line with recommendations for meta-analyses (Gurevitch et al., 2018).
The overall effect from a meta-analysis needs to be checked for consistency among effect sizes, termed as heterogeneity (Nakagawa et al., 2017). We used the I 2 statistic, which is bounded between 0% and 100%, with 25%, 50% and 75% being suggested as levels for, low, medium and high heterogeneity, respectively (Higgins et al., 2003). Ecological meta-analyses often report high levels of heterogeneity (Senior et al., 2016), perhaps due to the variation in study organisms common to the questions being addressed, and we may have expected a high value here due to both range of study organism and range of stressor type. To explore the potential causes of heterogeneity within the empirical meta-analysis, we conducted separate meta-analyses on a sub-group of the dataset, a similar process to running a meta-regression (Nakagawa et al., 2017), using organism group (i.e. producer or consumer) as the categorical moderator to explore heterogeneity (see Supplementary Information S6). We also considered publication bias (see Supplementary Information S6); although it should be noted that common tests for publication bias within meta-analyses can be limited by high heterogeneity (Nakagawa et al., 2017).

| Stressor interactions within theoretical data
We found no strong difference between the classification of stressor interactions from either form of food chain model (Table 1)

| Theoretical expectations
In summary, our theoretical analyses led us to conclude that at bio-  Figure 3a). Additionally, the summary effect size for the entire dataset was negative with 95% confidence intervals that did not overlap zero (−0.632 ± 0.260), indicative of an antagonistic/reversal summary interaction class (Figure 3b).
Our meta-analysis showed medium-level heterogeneity (I 2 = 48.5%) although this was considerably lower than the mean heterogeneity (I 2 = 91.7%) found in an analysis of previous ecological meta-analyses (Senior et al., 2016). Furthermore, two additional meta-analyses were conducted on sub-groups of the empirical dataset, with the categorical moderator of organism group used to explore this heterogeneity (Nakagawa et al., 2017). However, these additional meta-analyses failed to uncover any source of this heterogeneity, with the meta-analysis for consumers reporting mediumlevel heterogeneity (I 2 = 42.5%) and the producer meta-analysis reporting high-level heterogeneity (I 2 = 67.7%; see Supplementary Information S6).

| Comparison of empirical and theoretical interaction classifications
Overall, we found close agreement between our theoretical models with biologically reasonable levels of observation error and the freshwater empirical data (Figure 3)

| DISCUSS ION
There has been much interest in understanding and cataloguing the joint effects of stressors on ecological communities and ecosystems (Côté et al., 2016;Schäfer & Piggott, 2018;Thompson et al., 2018b Figure 2). We believe that once these statistical aspects are considered, the so-called 'ecological surprises' (sensu Paine et al., 1998) may in fact be more prevalent in both our freshwater dataset, and more widely.  Figure 3a; white circles denote additive interactions, green squares denote antagonistic interactions, yellow diamonds denote synergistic interactions, purple triangles denote reversal interactions. Figure 3b; closed circles denote significant summary effect sizes (i.e. 95% confidence intervals do not overlap zero), and open circles denote non-significant summary effect sizes (i.e. 95% confidence do overlap zero)

| Null model sensitivity to observation error
The choice of the null model is hotly debated within ecological stressor research (Schäfer & Piggott, 2018), and it has been argued that null models should be able to accurately predict the combined effects of stressors (Orr et al., 2020). Our results ( Figure 2) are the first attempt to quantify the degree of accuracy for the most commonly used null model, and we conclude that for all but the very lowest levels of observation error it is difficult to correctly reject the additive null interaction (Figure 2a). In other words, we find weak statistical power to recover the underlying stressor-pair interactions. On this basis, and given that most experiments have low sample sizes (we report a mean of 3.83 with a maximum of 16 per treatment in our empirical data), we consider it premature to conclude that most stressor interactions are truly additive in the freshwater data we collected. Instead, we should be careful to conclude that in the majority of cases we do not have sufficient evidence to reject the null (additive) interaction. However, it means that we should take notice whenever a non-null interaction is returned by the additive model, since only strong non-additive effects are likely to be detected (see Supplementary Information S1).
Perhaps more surprising is our finding that meta-analyses using the additive null model report antagonism as the summary interaction classification when observation error is non-negligible, despite synergies co-dominating in our simulation data (Figure 2b). The high sensitivity to estimation uncertainty may be key reasons why stressor synergies are not as often reported as may be expected (Côté et al., 2016;Darling & Côté, 2008), although other reasons may also contribute, and we can also not rule out that the empirical results do truly reflect the underlying interactions.
However, we believe our finding of high sensitivity to observation error in the null model is more general than either our theoretical results, or our freshwater dataset, and we suggest future studies should investigate other null models for their robustness to observation error and sample sizes. Such analyses would build on previous descriptions of the null models (e.g. Folt et al., 1999;Sih et al., 1998Sih et al., , 2004 and would be particularly useful if analyses considered the effect of sample size on statistical power, as this will help guide future empirical studies to improve the detection rate of non-null stressor interactions.

| Theoretical expectations for interaction frequencies
Our food chain models imply that, given adequate sample sizes (see above), we should expect synergistic and antagonistic interactions to co-dominate at the population and trophic levels, whereas additive interactions and reversals should be relatively rare. It may well be the case that our models are not good descriptors of the data we analyse; certainly, we ignore much important detail that is likely a feature in the data, such as spatial structure and temporal variation in parameters caused by external perturbations not linked to the stressors, and more complex food web structure involving omnivory or parasitism. Unfortunately, the null model sensitivity to observation error implies we do not yet have the tools with which to discern the relative abilities of different theoretical models to capture the empirical data. However, our key theoretical finding for the relative rarity of additive interactions appears to be echoed in the few other theoretical studies on stressor interactions in ecological communities (e.g. Haller- Bull & Bode, 2019; Thompson et al., 2018a;Travers-Trolet et al., 2014). This agreement is despite a variety of key differences in the model assumptions. In particular, Haller-Bull and Bode (2019) focused on populations rather than multispecies communities, but found dominant roles for synergistic and antagonistic interactions, with additive interactions occurring most frequently for stressors affecting the carrying capacity. Similar to our model, Thompson et al. (2018a) also focused on multispecies communities, but they assumed biological interactions were constant, whereas we allow interactions (consumption and conversion rates) to be modified by stressors, an assumption that seems likely to be met on a regular basis. For example, stressors have been shown to influence resource competition (Kroeker et al., 2013); susceptibility to parasitism in oysters (Lenihan et al., 1999); and modify the flow of energy through aquatic food webs by inducing changes in trophic links (Schrama et al., 2017). Despite this difference, Thompson et al. (2018a) found additive interactions were most prominent when species facilitated each other (i.e. positive species interactions), but that synergy or antagonism in combined stressor effects on species richness or community biomass were more common when species interactions are negative (competition or resource use).
The apparent rarity of additive interactions in all of these models may appear at odds with the possible interpretation that two stressors acting on different species within a community could lead to an additive joint effect (Jackson et al., 2016). However, feedbacks in the food web, like those found in our models, mean that even if a species is unaffected directly by a stressor, it is highly likely that top-down or bottom-up effects will lead to indirect interactions for many species, and as a result, additive interactions are extremely unlikely in the absence of uncertainty (e.g. observation error). Indeed, we anticipate that additive interactions may only truly occur in scenarios where species in different and very weakly interacting sub-communities are affected by different stressors, or, as found by Thompson et al. (2018a), where species interactions are predominantly positive. We believe there will be an increasing role of theory in generating hypotheses for the ways in which stressors interact (De Laender, 2018), and the most progress will be made when the theory is developed so it can be directly compared against empirical data, much as we have done here.

| Mechanistic understanding of multiple stressors
In this study, we sought to address the question of how multiple stressors interact. This approach, when applied across both theoretical and empirical datasets can allow us to discern what may be expected across the interactions of multiple stressors. Future research may seek to address the question of why multiple stressors interact in the manner that they do. Undoubtedly, these two questions are entwinned, with the answers to each of these questions highly likely to be dependent on the other. However, while the use of null models is essential in determining the combined effect of multiple stressors (Thompson et al., 2018b), the adoption of a mechanistic approach to investigating multiple stressors may provide novel insights which address these joint questions (De Laender, 2018;Schäfer & Piggott, 2018). For instance, a mechanistic understanding may allow for responses such as co-tolerance or co-susceptibility (Todgham & Stillman, 2013) to stressors to be more thoroughly understood from an ecological perspective.
Ultimately, as our results imply, such an understanding is likely to require a large amount of empirical data to fully understand; however, there is ample scope for theoretical ecology to help fill this gap in our collective understanding of multiple stressors, and to generate specific hypotheses to be tested. Similarly, a mechanistic understanding of multiple stressor interactions would prove invaluable when mitigating the effects of stressors or implementing conservation initiatives.

| Future developments
Our analysis represents a novel approach combining both theoretical and empirical methods. While this analysis provides a solid foundation, there are several aspects that could be adjusted in future research. Firstly, there is a clear need to better understand the limitations and data requirements of the null models (e.g. Gurevitch et al., 2000;Lajeunesse, 2011;Thompson et al., 2018b) that are used to classify stressor interactions. Such knowledge would be very useful in guiding experimental design that would maximize the probability of uncovering non-null stressor interactions and would therefore provide a better understanding of their true prevalence. Knowing how many data points are required before we can realistically hope to detect a particular type of pattern, in this case a stressor interaction type of a given strength, is a critical component of experimental design. Moreover, our work has also uncovered some hitherto undescribed biases that lead to meta-analyses potentially over-emphasizing antagonisms, and it is important to investigate other null models for this feature as well as looking for methods to reduce this bias. Secondly, the theoretical communities manipulated here combine multiple populations each on a separate trophic level. While this builds upon similar research conducted on a single population (Brown et al., 2013; Haller-Bull & Bode, 2019), there is scope for this approach to be expanded to consider more complex communities, for instance with multiple populations on a single trophic level (e.g. Thompson et al., 2018a). Finally, the manner in which stressors interact at the parameter or process level can occur in numerous ways, for instance either additively or multiplicatively (Haller-Bull & Bode, 2019). However, whether a process or parameter is impacted in an additive or multiplicative manner, will cause a stressed parameter value to change by differing degrees, with this in turn potentially resulting in contrasting frequencies of interaction classifications at the population level. Accordingly, the manner in which a process or parameter (e.g. feeding rate, mortality) is impacted may be determined by the individual stressors; for instance, if two simultaneously acting stressors are entirely independent of one another then their effect on an ecological process may be additive (Haller-Bull & Bode, 2019). Consequently, allowing stressors to impact the same process undoubtedly represents an area for expansion, particularly when considering how impacts at the parameter level affect population level properties.

| CON CLUS IONS
Determining the ways multiple stressors interact is key when attempting to mitigate their effects, with the class of the observed interaction potentially outlining whether the removal of a stressor will have a beneficial, limited or detrimental impact to an ecosystem (Brown et al., 2013;Côté et al., 2016). Our results show the value of developing a theoretical framework which can aid in the interpretation of environmental stressor interactions, and we hope more general theory that makes specific predictions based on ecological mechanisms (e.g. De Laender, 2018;Fu et al., 2018;Thompson et al., 2018a) will be developed and tested in future.
However, our results also highlight the urgent need to better understand the strengths and limitations of the null models that are used to classify the cumulative effects of community stressors, and we also believe a unified approach to the meta-analyses of individual studies will increase our understanding of how environmental stressors combine.

ACK N OWLED G EM ENTS
We thank four anonymous reviewers for their insightful comments that improved the manuscript. We similarly thank Rory Gibb,

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available in the supplementary information of this article.