### Introduction

Species extinctions have reached an unprecedented rate (Barnosky *et al*. 2011), making biodiversity loss one of the most severe threats to ecosystems around the world (Reich *et al*. 2012). Often, extinctions stem from anthropogenic habitat loss, over-harvesting and climate change (Pereira *et al*. 2010), and are likely to have profound effects on important ecological services (Worm *et al*. 2006). To forecast extinction risk, we would like to estimate the probability of each and every species becoming extinct in an ecological network. Certain key species traits are likely to influence extinction vulnerability: for example, large body size, high trophic level and low density all increase the probability of extinction (Gaston & Blackburn 1995; Purvis *et al*. 2000; Cardillo *et al*. 2005; Davidson *et al*. 2009; Lee & Jetz 2011). However, species are not isolated, but rather depend on each other for sustenance, forming a complex network of ecological interactions. Therefore, the extinction of a single species could affect other species with which it interacts, directly or indirectly (Ebenman & Jonsson 2005), potentially setting in motion a cascade of secondary extinctions through the community. These secondary extinctions can emerge from either bottom-up effects (consumers losing their resources) or top-down effects (resources responding to the loss of their consumers).

Traditionally, there have been two main approaches to the analysis of secondary extinctions in ecological networks – often called network robustness. The first line of research, originating from studies of other complex networks (Albert, Jeong & Barabási 2000), focuses exclusively on the presence or absence of consumer–resource relationships. Thus, only the qualitative network structure is taken into account. Typically, one removes species, either randomly or systematically, and tests how network robustness varies with network properties such as number of species or connectance (Sole & Montoya 2001; Dunne, Williams & Martinez 2002; Memmott, Waser & Price 2004; Srinivasan *et al*. 2007). This so-called topological approach has the advantage of requiring only the network structure as an input: for an adjacency matrix *A*, whose rows and columns represent the species, a nonzero coefficient in row *i* and column *j* signifies that *i* is a prey of *j*. Although this simplicity makes it possible to analyse very large networks, the approach has several limitations. For example, in the topological case, secondary extinctions only occur when a consumer loses all of its resources: the extinction risk does not grow until all resources are lost, at which point the extinction risk equals one. Also, in the topological approach, all species are usually assumed to have the same baseline probability of extinction, whereas in natural systems some species are more vulnerable than others.
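The topological rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `topological_cascade` and the four-species web are hypothetical, and the code assumes the convention stated above (a nonzero entry in row *i*, column *j* means that *i* is a prey of *j*).

```python
def topological_cascade(adjacency, removed):
    """Return all species lost (primary plus secondary) after removing `removed`.

    adjacency[i][j] == 1 means species i is a prey of species j.
    A consumer goes secondarily extinct only once ALL of its resources are gone;
    basal species (no resources) are never lost secondarily.
    """
    n = len(adjacency)
    extinct = set(removed)
    changed = True
    while changed:  # repeat until no further species is lost
        changed = False
        for j in range(n):
            if j in extinct:
                continue
            resources = [i for i in range(n) if adjacency[i][j] == 1]
            if resources and all(i in extinct for i in resources):
                extinct.add(j)
                changed = True
    return extinct


# Hypothetical web: 0 and 1 are plants, 2 is a herbivore eating 0,
# 3 is a predator eating 1 and 2.
A = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

print(topological_cascade(A, {0}))     # herbivore 2 follows plant 0
print(topological_cascade(A, {0, 1}))  # the whole web collapses
```

Note that the predator survives the loss of plant 0 even though half of its diet has disappeared, which is exactly the limitation discussed in the text.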

An alternative line of research attempts to explicitly model population dynamics, that is, changes in abundances or biomasses over time, for all species in the network (Ebenman, Law & Borrvall 2004; Eklöf & Ebenman 2006; Riede *et al*. 2011; Stouffer & Bascompte 2011). Using dynamical models one can capture, in addition to the purely topological extinctions, other types of extinctions. For example, through the propagation of indirect effects in the network, the primary extinction of a top predator might lead to the secondary extinction of some of its resources [top-down extinctions, for example, Ebenman & Jonsson (2005)]. Additionally, dynamic models often include an extinction threshold, a population density below which a species is considered extinct, in order to account for processes such as demographic stochasticity (Eklöf & Ebenman 2006). As such, a resource that persists at low density could still be insufficient to support its consumers. However, dynamical models require an extensive set of parameters, making an empirical parameterization of large food webs next to impossible. Typically, this approach has been used mostly to study synthetic webs generated using physiological scaling of species interaction strengths (Binzer *et al*. 2011). Moreover, because dynamical ecological systems are highly nonlinear, slightly different initial conditions can lead to very different outcomes. This makes it necessary to simulate numerous replicates for each parameterization. Finally, even if one were to measure all parameters correctly, this approach is difficult to extend to the study of very large networks due to limited computing power.

A middle-ground approach is to consider the probability that species will be present or absent in a complex system. Such a framework requires relatively few parameters and assumptions, yet it can account for a wide range of extinction types. One recent example of this type of model is the stochastic ecological network occupancy (SENO) model, which takes the topological network structure as well as colonization and extinction rates as input parameters, addressing the changes in species probabilities over space and time (Lafferty & Dunne 2010). Extensive simulations will converge on the actual probability of extinction for each species, but exact solutions (in the absence of top-down effects) can alternatively be found using Bayesian networks (Lafferty & Dunne 2010).

Here, we explore the use of Bayesian networks (Jensen 1996) to directly calculate the marginal probability of species extinction in a network without requiring simulation. We add considerable flexibility in the assumptions about how consumers respond to the loss of resources.

A Bayesian network is simply a collection of random variables (here species are represented as Bernoulli random variables determining their presence/absence) with arrows describing their conditional dependencies (feeding relationships). As such, the probability of extinction of each species depends on the state of its resources, which in turn depends on the state of their resources. The use of Bayesian networks has several advantages over the more traditional ways of modelling species extinctions. First, in Bayesian networks, one can directly assign to each species a different baseline probability of extinction. This baseline probability is then combined with the network structure to estimate extinction risk. This is useful for conservation, where lists of endangered species (‘Red Lists’) are often available. The main benefit from a modelling standpoint is that we need few parameters and thus few assumptions about the biological interactions.
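To make the idea concrete, the following Python sketch computes marginal extinction probabilities for a tiny acyclic food web by summing over every presence/absence configuration. It is only an illustration under stated assumptions: the names `marginal_extinction`, `topological` and `linear` are hypothetical, the linear form `pi + (1 - pi) * f` is one plausible way of combining a baseline probability with the fraction of resources lost, and brute-force enumeration (exponential in the number of species) stands in for the efficient inference algorithms available for Bayesian networks.

```python
from itertools import product


def topological(pi, f):
    """Assumed topological response: certain extinction only when all resources are lost."""
    return 1.0 if f >= 1.0 else pi


def linear(pi, f):
    """Assumed linear response: risk rises from the baseline pi to 1 as f goes from 0 to 1."""
    return pi + (1.0 - pi) * f


def marginal_extinction(resources, baseline, response):
    """Exact marginal P(extinct) for each species, by enumerating all states.

    resources[i]: list of the resources of species i (the web must be acyclic).
    baseline[i]:  baseline probability of extinction of species i.
    response(pi, f): P(extinct | fraction f of resources extinct).
    """
    n = len(resources)
    marginals = [0.0] * n
    for state in product([0, 1], repeat=n):  # 1 = extinct, 0 = extant
        prob = 1.0
        for i in range(n):
            prey = resources[i]
            f = sum(state[r] for r in prey) / len(prey) if prey else 0.0
            p_ext = response(baseline[i], f)
            prob *= p_ext if state[i] else 1.0 - p_ext
        for i in range(n):
            if state[i]:
                marginals[i] += prob
    return marginals


# Hypothetical web: species 0 and 1 are basal, species 2 feeds on both.
resources = [[], [], [0, 1]]
baseline = [0.1, 0.1, 0.1]

print(marginal_extinction(resources, baseline, topological))
print(marginal_extinction(resources, baseline, linear))
```

In this toy web the consumer's marginal risk is about 0.109 under the topological response and about 0.19 under the linear one, showing how the choice of response changes the forecast even when the network and baselines are identical.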

Bayesian networks can be solved numerically very efficiently – multiple simulation reiterations are not needed, and therefore, computation time is greatly reduced. There is no need for artificial ‘sequences’ of extinctions, since all possible cases are considered simultaneously. Finally, as we show here, many simulation-based approaches are in fact simulating a Bayesian network that can be solved more efficiently. These benefits stress the importance of connecting ecology with the vast literature on graphical models.

Using Bayesian networks, we introduce a flexible method in which all the possible responses of consumers to resource loss can be modelled. To test our Bayesian network method, we parameterize a model based on differential equations that is frequently used to simulate complex food web dynamics (Berlow *et al*. 2009; Binzer *et al*. 2011). We first perform in silico (simulated) extinctions for the full-fledged dynamical model. We then use our method and attempt to predict the observed extinctions. We evaluate the goodness-of-fit for alternative responses of consumers to resource extinction using likelihoods. We find that a sigmoid response, in which consumers’ extinction risk grows sharply after a critical fraction of resources is lost, best accounts for the observed extinctions. Moreover, adding information on resource importance further improves the forecasting ability of our Bayesian network method.

### Results


We tested five different algorithms for calculating a species' marginal probability of extinction, each of which is described by two terms. The first term (topological, linear, nonlinear) describes the consumer's functional response to resource loss. Note that the nonlinear response could assume the shape of both the topological and the linear response, if this were to lead to the best likelihood. The second term (binary, flow) describes the quantity used to compute the fraction of resources lost. In the binary case, each resource is equally important, while in the flow case, the importance of a resource depends on its contribution to the diet of the consumer. Thus, the five algorithms are as follows: (i) topological (as binary and flow are exactly the same in this case), (ii) linear binary, (iii) linear flow, (iv) nonlinear binary and (v) nonlinear flow. For each network, we computed the likelihood that a given algorithm exactly reproduced the extinctions observed in the ATN simulations as well as the maximum likelihood. We repeated the whole procedure for the case in which the baseline probability of extinction is equal for all species (‘Uniform’) as well as for the case in which it depends on the trophic level of each species *i* (‘Trophic Level Based’).
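The difference between the binary and flow descriptions can be illustrated with a short Python sketch. The function names and the example diet are hypothetical, and the flow values are arbitrary numbers chosen only to show how the two measures of the fraction of resources lost can diverge.

```python
def fraction_lost_binary(resource_extinct):
    """Fraction of resources lost when every resource counts equally.

    resource_extinct: dict mapping each resource to True (extinct) or False.
    """
    return sum(resource_extinct.values()) / len(resource_extinct)


def fraction_lost_flow(resource_extinct, flows):
    """Fraction of the consumer's incoming flow carried by extinct resources.

    flows: dict mapping each resource to its contribution to the diet.
    """
    total = sum(flows[r] for r in resource_extinct)
    lost = sum(flows[r] for r, gone in resource_extinct.items() if gone)
    return lost / total


# Hypothetical consumer with three resources; resource "a" dominates the diet.
state = {"a": True, "b": False, "c": False}
flows = {"a": 7.0, "b": 2.0, "c": 1.0}

print(fraction_lost_binary(state))       # one of three resources lost
print(fraction_lost_flow(state, flows))  # most of the diet lost
```

When every resource is extinct, both functions return 1, which is why the topological algorithm does not distinguish between the binary and flow cases.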

We found a clear ranking for the performance of the five algorithms. The best performing algorithm was always nonlinear, with more networks favouring the use of flow-based interactions rather than a binary description of the interactions for the resources (Fig. 4). Although these models necessarily fail to predict some of the extinctions, they still produce results close to the maximum-likelihood outcome for all networks (Fig. 5).
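A likelihood of this kind can be sketched as a product of Bernoulli terms, one per species. The Python snippet below is a minimal illustration, not the exact procedure used in the study: the function name `log_likelihood` is hypothetical, and it assumes that each species comes with a predicted probability of extinction (for instance, conditional on the observed state of its resources) and an observed outcome.

```python
import math


def log_likelihood(observed_extinct, predicted_prob):
    """Log-likelihood of observed extinctions under predicted probabilities.

    observed_extinct: sequence of booleans (True = species went extinct).
    predicted_prob:   sequence of predicted probabilities of extinction.
    Returns -inf if an observed outcome was assigned probability zero.
    """
    total = 0.0
    for gone, p in zip(observed_extinct, predicted_prob):
        p_obs = p if gone else 1.0 - p
        if p_obs == 0.0:
            return -math.inf
        total += math.log(p_obs)
    return total


# Hypothetical outcome for three species and two competing sets of predictions.
observed = [True, False, False]
print(log_likelihood(observed, [0.9, 0.1, 0.1]))  # good predictions, higher value
print(log_likelihood(observed, [0.2, 0.5, 0.5]))  # poor predictions, lower value
```

Working on the log scale avoids numerical underflow when the product runs over many species and replicates.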

For the nonlinear functional forms, we searched for the maximum-likelihood parameters that best fit the observations. As such, we can plot the resulting maximum-likelihood parameter values to determine which shapes are favoured by the data. We start by examining the two nonlinear binary cases, obtained for uniform and trophic level-based baseline probabilities (Fig. 6). In both cases, the most favoured shape is a sigmoid curve (α > 1, β > 1, red), with the steepest increase in the probability of extinction (mean of the distribution) being around 50% of resources lost – the points are close to the α = β line. In many cases, the transition is quite sharp (α ≫ 1, β ≫ 1, in red). A few networks favour the use of a quasi-topological response (α≪1, β≪1, in which the probability of extinction is constant until almost all the resources are lost, in green), and a few other networks yield almost linear responses (α≈1, β≈1). Very few cases yield convex functions (α > 1, β ≤ 1, in blue), and all but two are close to the linear case. In the nonlinear flow models (Fig. 6), we observe two main differences: (i) the transitions tend to be very sharp (α ≫ 1, β ≫ 1), and (ii) there are severe deviations from the line α = β, meaning that the transition does not happen when about 50% of the flow is lost, but rather at higher levels of loss.
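The shapes described above (linear for α = β = 1, sigmoid for α > 1 and β > 1, midpoint near 50% when α = β) are those of the cumulative distribution function of a Beta(α, β) distribution. Assuming that this is the functional form in question, the Python sketch below evaluates it with a simple midpoint rule. The names `beta_cdf` and `nonlinear_response` are hypothetical, the way the baseline is combined with the curve is an assumption, and the integration scheme is only intended for α ≥ 1 and β ≥ 1, where the integrand is bounded.

```python
import math


def beta_cdf(x, alpha, beta, steps=10000):
    """Approximate the Beta(alpha, beta) CDF at x with the midpoint rule.

    Intended for alpha >= 1 and beta >= 1 only.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # Log of the normalising constant 1 / B(alpha, beta).
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    h = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th subinterval, strictly inside (0, x)
        total += math.exp(
            log_norm + (alpha - 1.0) * math.log(t) + (beta - 1.0) * math.log(1.0 - t)
        )
    return total * h


def nonlinear_response(pi, f, alpha, beta):
    """Assumed form: baseline pi plus the remaining risk scaled by the Beta CDF."""
    return pi + (1.0 - pi) * beta_cdf(f, alpha, beta)


# With alpha = beta = 1 the curve is linear; with alpha = beta = 20 it is a sharp sigmoid.
for f in (0.25, 0.5, 0.75):
    print(f, nonlinear_response(0.1, f, 1.0, 1.0), nonlinear_response(0.1, f, 20.0, 20.0))
```

For the sharp symmetric case the risk stays close to the baseline until roughly half of the resources are lost and then climbs steeply, matching the sigmoid behaviour reported for most networks.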

Because for each network and replicate we ran two simulations modifying solely the baseline probability of extinction (using ‘Uniform’ or ‘Trophic Level Based’ approaches), we compare the robustness of each network for the two treatments (Fig. 7). In all cases, the Trophic Level-Based treatment makes the network more robust: preferentially removing species from the top produces fewer secondary extinctions than those obtained when all species have the same probability of going primarily extinct. Compared to the results obtained using the ATN model, the algorithms based on linear response tend to underestimate robustness, while the topological approaches tend to grossly overestimate it. The two nonlinear cases closely reproduce the ATN results, consistent with the results found using likelihoods. Although the nonlinear cases are more flexible in fitting the data than the topological and linear cases, the difference in likelihoods is typically so large that any model selection technique would prefer the best performing model judged using likelihoods alone (Fig. 5).

### Discussion


Traditionally, two main approaches have been taken to study food web robustness: the extremely simplistic topological approach and the considerably more complicated fully dynamical one. Here we applied a Bayesian network approach to a third framework that combines the simplicity of the topological approach with some important features typical of dynamical models, but without the extensive set of parameters needed to track population dynamics. Using Bayesian networks, when provided with a network structure and a baseline probability of extinction for each species, one can calculate the marginal probability of extinction of each species exactly.

In topological models, the only input parameter is network structure, and a species goes extinct whenever it loses all of its resources (Dunne *et al*. 2002). The probability of extinction does not increase with the fraction of resources lost – a consumer with only a small fraction of its resources remaining is as viable as one that has suffered no loss. This assumption has repeatedly been criticized as unrealistic and contrasts starkly with the biological understanding of extinction risk (Purvis *et al*. 2000; Eklöf & Ebenman 2006; Srinivasan *et al*. 2007). In our Bayesian network method, this issue is addressed by explicitly modelling the functional form of the response to loss of resources, that is, how the probability of extinction of a consumer increases with increasing loss of resources. We tested three functional forms – topological, linear and nonlinear.

In analyses of food web robustness, it is often assumed that all nodes are equal – each species has the same probability of undergoing extinction or the probability is solely determined by network properties (Dunne *et al*. 2002). Although the simplicity of the approach is appealing, this extreme simplification is in contrast to empirical results. One of the strengths of our Bayesian network method is therefore the possibility of including more precise ecological knowledge about species extinction risks. Each and every species is assigned a baseline probability of going extinct, which is taken into account in the likelihood evaluation, and the variation in extinction among species does not need to follow a sequence. This is useful since we know that some species are more vulnerable than others, regardless of their pattern of interactions: large-bodied species, species depending strongly on particular habitats, and overexploited or rare species are all likely to go extinct for causes other than network structure (Purvis *et al*. 2000). Using the method presented here, this information can be taken into account, for example, by assigning the baseline probability of extinction using lists of endangered species. For practical applications, there is a need to bridge the gap between theoretical ecology and conservation biology, and including results from conservation-oriented research into algorithms for the analysis of networks is a first step in this direction.

We compared our results from the Bayesian model with the results obtained from the widely used ATN model (for example, Berlow *et al*. 2009; Binzer *et al*. 2011) and assumed that extinctions in the ATN model were ‘true’ extinctions. There are two main reasons for this choice. First, experimental data in which replicate extinction experiments in large networks are performed through manipulation are completely lacking – although sorely needed. Second, we wanted to contrast our results with those from a commonly used dynamical model that included several ecologically relevant parameters. The ATN model meets this criterion. We find that the Bayesian network approach predicts the majority of the secondary extinctions that the ATN model generates. However, secondary extinctions of consumers whose resources are all extant cannot be predicted using our approach. These extinctions can, in the dynamical model, be caused by purely top-down interactions, for example, disruption of predator-mediated coexistence, or still be bottom-up effects in which resources decrease to a level insufficient to support the consumer (Ebenman & Jonsson 2005). In such cases, the network structure itself would not change (the resources would be diminished, but would not become extinct), and thus, our method would predict no increase in the extinction risk. Our simulations show that these extinctions are quite rare (20%) when all species have the same probability of going primarily extinct, but more abundant (44%) when species at the top of the food web are more likely to go extinct than those at lower trophic levels (Fig. 3). Because in natural ecosystems apex predators and large-bodied species have high probability of extinction due to anthropogenic effects, this result highlights a severe limitation of purely network-based studies of robustness. Thus, when we assume that all species are equally likely to go extinct, we tend to grossly underestimate the robustness of the system (Fig. 7), while using a purely topological algorithm, we would encounter the opposite problem (Fig. 7).

An important advantage of the Bayesian network approach is that it does not require simulated extinctions. The most common methodology for analysing food web robustness has been to simulate primary species removals – sequentially remove species, either at random or based on some topological properties (Dunne *et al*. 2002; Srinivasan *et al*. 2007) and record the number of secondary extinctions following each removal. Whereas in the topological approach each primary extinction can have only one outcome [unless the consumers have some degree of adaptability and can form new interactions if resources are lost, see Staniczenko *et al*. (2010) and Thierry *et al*. (2011)], the outcome of dynamical models depends on the initial conditions of the model. Therefore, there is a need for numerous replicates for each removal (Eklöf & Ebenman 2006; Curtsdotter *et al*. 2011; Riede *et al*. 2011). This makes the simulations computationally intensive and restricts the choice of network sizes. The Bayesian network algorithm, on the other hand, takes into account all the possible outcomes along with their probabilities, without the need to compute them all. This is promising given that data on large ecological networks are appearing more frequently in the literature (Dunne 2006).

On the other hand, Bayesian networks have some disadvantages. The first is that top-down effects cannot be implemented in a Bayesian network. If these are important in nature, possible alternatives are dynamical models or the SENO model (Lafferty & Dunne 2010). In addition, simulations produce a large number of possible scenarios that might occur in nature, while Bayesian networks produce the means of these scenarios. For some questions, a range of scenarios is of interest, such as the identification of alternative states or expected correlations among species within a community. Of course, these can also be simulated by sampling outcomes from Bayesian networks.
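Sampling scenarios from a Bayesian network can be done by visiting species from the bottom of the web upwards and drawing each state conditional on the states already drawn for its resources. The Python sketch below illustrates this under stated assumptions: the names `sample_scenario` and `topological`, the three-species web and the baseline values are all hypothetical, and a topological response is used purely for brevity.

```python
import random


def topological(pi, f):
    """Assumed topological response: certain extinction only when all resources are lost."""
    return 1.0 if f >= 1.0 else pi


def sample_scenario(resources, baseline, response, order, rng):
    """Draw one presence/absence scenario from the network.

    resources[i]: list of the resources of species i.
    order: species listed so that every resource precedes its consumers.
    Returns a dict mapping species to True (extinct) or False (extant).
    """
    extinct = {}
    for i in order:
        prey = resources[i]
        f = sum(extinct[r] for r in prey) / len(prey) if prey else 0.0
        extinct[i] = rng.random() < response(baseline[i], f)
    return extinct


# Hypothetical chain: species 0 is basal, 1 eats 0, 2 eats 1.
resources = [[], [0], [1]]
baseline = [0.3, 0.1, 0.1]
rng = random.Random(1)

draws = [sample_scenario(resources, baseline, topological, [0, 1, 2], rng) for _ in range(5000)]
both = sum(d[0] and d[2] for d in draws) / len(draws)
print("estimated P(species 0 and species 2 both extinct):", both)
```

From a collection of such draws one can estimate joint quantities, such as how often two species are lost together, that the marginal probabilities alone do not provide.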

In order to further incorporate biological realism in Bayesian network approaches, we include information on the energy flow between each consumer–resource pair. These flows weight the importance of the links, giving a higher importance to interactions between two species with higher energy flow, an approach often taken in ecosystem studies (Banašek Richter *et al*. 2009; Ulanowicz 2009). This is a possible way to move beyond purely topological structures and start accounting for interaction strengths. In addition to flows, network analysis should take into account how resource needs change from one life stage to another. This has been successfully simulated in past studies (Rudolf & Lafferty 2011; Lafferty 2012), and Bayesian networks can easily incorporate this through specification of the marginal probabilities. However, whenever life stages introduce cycles in the network, one would need to use approaches that can deal with ‘loopy Bayesian networks’ (Jensen & Nielsen 2007).

The earliest topological studies (Dunne *et al*. 2002) already stressed that the lack of a mechanism for consumers to adapt to loss of resources by ‘rewiring’ to other resource species could potentially lead to overestimating the number of secondary extinctions following the removal of single species. This mechanism has now been investigated (Staniczenko *et al*. 2010; Thierry *et al*. 2011), and it has been shown that rewiring increases food web robustness. Although rewiring is not included in the present exercise, it could be included in our method, for example, by modelling the conditional probability of rewiring.

In conclusion, our study shows that a Bayesian network approach can capture the majority of the secondary extinctions in food webs without the need for tracking population dynamics. Our hope is that this method will be a useful complement to existing tools for analysing the robustness of food webs and other ecological networks, reducing the gap between food web theory and conservation-oriented research.