iucn_sim: Improved predictions of future extinctions using IUCN status assessments

1. The ongoing environmental crisis poses an urgent need for predicting future extinction events, which can aid in targeting conservation efforts. Commonly, such predictions are based on conservation status assessments produced by the International Union for Conservation of Nature (IUCN). However, when researchers apply these conservation status data to predict future extinctions, important information is often omitted, which can substantially affect the accuracy of the predictions.
2. Here we present iucn_sim, a command-line program that implements an improved approach for simulating future extinctions based on IUCN status data. In contrast to previous approaches, iucn_sim explicitly models future changes in conservation status for each species, based on information derived from the IUCN assessment history of the last decades. Additionally, the program considers generation length information when translating status information into extinction probabilities, as intended by the IUCN definition.
3. The program implements a Markov-chain Monte Carlo estimation of extinction rates for each species, based on the simulated extinctions. These estimates inherently contain the chances of conservation status changes and the generation length of each given species.
4. Based on an empirical data example including all birds (class Aves), we find that our improved approach has a strong effect on the estimated species-specific extinction rates as well as on the overall number of predicted extinctions. Using simulated data, we show that iucn_sim reliably estimates extinction rates with high accuracy if run for a sufficient number of simulations.


Introduction
IUCN conservation status assessments have been used in numerous scientific studies to infer future biodiversity loss (Cooke et al., 2019; Davis et al., 2018; Faith, 2015; Mooers et al., 2008; Oliveira et al., 2019; Veron et al., 2016). The challenge in this approach is to meaningfully transform the IUCN-defined conservation status categories into explicit extinction probabilities. In these previous studies, researchers have used specific extinction risks, which per IUCN definition are associated with the threatened statuses VU, EN, and CR. Sometimes these risks are also extrapolated to species of the statuses LC and NT (e.g. Davis et al., 2018; Mooers et al., 2008; Veron et al., 2016).

In order for IUCN to decide on assigning a species to one of the threatened categories VU, EN, or CR, this species must meet at least one of five assessment criteria (A-E). One of those criteria (E) is associated with a specific extinction probability, while the other criteria (A-D) mostly encompass estimates of decreasing population trends and fragmentation. The IUCN extinction probability thresholds defined in criterion E are as follows:

• VU: 10% extinction probability within 100 years
• EN: 20% extinction probability within 20 years or 5 generations, whichever is longer (maximum 100 years)
• CR: 50% extinction probability within 10 years or 3 generations, whichever is longer (maximum 100 years)

Even though these extinction probabilities only apply to species assessed under criterion E, they are commonly applied equally to all species sharing the same conservation status (e.g. Davis et al., 2018; Mooers et al., 2008). The underlying assumption that the minimum extinction risks defined for criterion E can be meaningfully transferred to species listed under one of the other four criteria (A-D) is difficult to test empirically, but is a necessary simplification in order to model the extinction probabilities for the majority of species. However, there are several other important aspects that can be easily incorporated but are commonly neglected when translating IUCN conservation statuses into extinction probabilities.

Neglected information
To the best of our knowledge, there are two key elements that are usually not incorporated when using IUCN data for future extinction predictions: generation length (GL) and expected future conservation status changes.

Generation length is defined as the average turnover rate of breeding individuals in a population (IUCN Standards and Petitions Committee, 2019) and therefore reflects the turnover between generations. Generation length should not be confused with age of sexual maturity, which can be used in the calculation of generation length, but is not equivalent. As per the IUCN definition stated above, the extinction probability for the categories EN and CR is to be understood in the context of the GL of the given species, if 5 × GL exceeds 20 years for EN species, or if 3 × GL exceeds 10 years for CR species. We argue that including GL data should be the standard practice when modelling extinction risks based on IUCN data, particularly because GL data is readily available for many

Here we introduce iucn_sim, a command-line program that uses available IUCN status assessments of species and generation lengths to 1) simulate future changes in IUCN status, 2) simulate possible times of extinction across species, and 3) estimate species-specific extinction rates for any given set of extant species over a user-defined time span (Fig. 1).
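The generation-length rule of criterion E described in the Introduction can be illustrated with a short Python sketch. The function name `criterion_e_window` and its return format are hypothetical and chosen for illustration only; they are not part of iucn_sim.

```python
def criterion_e_window(status, generation_length=None):
    """Return (extinction probability, time window in years) under criterion E.

    Hypothetical helper, not part of iucn_sim. If no generation length (GL)
    is given, the minimum time window of the category is used.
    """
    if status == "VU":
        return 0.10, 100.0
    # status -> (probability, minimum window in years, number of generations)
    rules = {"EN": (0.20, 20.0, 5), "CR": (0.50, 10.0, 3)}
    prob, min_years, n_generations = rules[status]
    if generation_length is None:
        return prob, min_years
    # "whichever is longer", capped at the 100-year maximum
    window = max(min_years, n_generations * generation_length)
    return prob, min(window, 100.0)
```

For a CR species with GL = 15 years, the window in this sketch becomes 3 × 15 = 45 years instead of 10 years, which spreads the same 50% extinction probability over a longer time span.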

The program, including all software dependencies, is easy to install with a sin-

status there will also be a marginal effect of generation length in the extinction risk for species currently assigned to other IUCN categories (see Fig. 2a).

There are two main input types the user needs to provide for this function: A) the name of a reference group, which will be used to calculate status transition rates, and B) the list of target species names for which to simulate future extinctions, including estimates of GL (if available).

Reference group
We model the changes in IUCN status as a stochastic process defined by transition

in the reference group is $t_i$, the program applies an MCMC to sample the annual transition rate $q_{ij}$ from the following posterior:

where the log likelihood function is that of a Poisson process describing status

given the notable recent worsening of almost all vulture species' conservation status, the trends observed over all birds may not be representative of this group.

It is not an analytical requirement to choose a monophyletic clade as a reference group. As a general guideline, we recommend choosing sufficiently large reference groups of more than 1000 species to minimize stochastic effects (see Fig. S1). Ideally, though not necessarily, this group should contain all of the target species.
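The MCMC estimation of an annual transition rate from reference-group data can be sketched as follows. This is a minimal illustration under explicit assumptions, not the program's implementation: we assume that the data consist of a count of observed i → j status changes (`n_transitions`) and the cumulative time spent in status i (`total_time`, corresponding to $t_i$), that the Poisson-process log likelihood takes the standard form $N \log q - q\,t$, and that the prior on the rate is flat. The function names and the multiplicative proposal are hypothetical choices.

```python
import math
import random


def log_posterior(q, n_transitions, total_time):
    """Log posterior of a Poisson-process rate q under a flat prior on q > 0.

    Assumed form: N * log(q) - q * t (constant terms dropped).
    """
    if q <= 0:
        return -math.inf
    return n_transitions * math.log(q) - q * total_time


def sample_rate(n_transitions, total_time, n_iter=20000, burn_in=2000, seed=1):
    """Metropolis-Hastings sampler for one transition rate (illustrative only)."""
    rng = random.Random(seed)
    q = (n_transitions + 1) / total_time  # starting value
    current = log_posterior(q, n_transitions, total_time)
    samples = []
    for i in range(n_iter):
        # Multiplicative proposal keeps q positive; its Hastings ratio is the multiplier.
        multiplier = math.exp(0.5 * (rng.random() - 0.5))
        proposal = q * multiplier
        candidate = log_posterior(proposal, n_transitions, total_time)
        log_ratio = candidate - current + math.log(multiplier)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            q, current = proposal, candidate
        if i >= burn_in:
            samples.append(q)
    return samples
```

Under these assumptions the posterior is a Gamma distribution with shape N + 1 and rate t, so the sampled mean can be checked against (N + 1) / t. With few observed transitions the posterior becomes wide, which is consistent with the recommendation to use large reference groups.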

Target species list and GL data
Besides the reference group that is used for status transition rate estimation, the user also provides a list of target species, which are the species whose future extinctions are being simulated. For all these species, get_rates fetches the current IUCN conservation status, if available. To translate these categories into explicit extinction probabilities to be used for future simulations, we transformed the extinction probabilities ($E_t$) associated with threatened IUCN statuses (see Introduction), defined over specific time frames ($t$), into annual extinction probabilities ($E_1$), using the formula provided by Kindvall & Gärdenfors (2003):

$$E_1 = 1 - (1 - E_t)^{1/t}$$

From these annual extinction probabilities for threatened categories, we extrapolated the annual extinction probabilities for statuses LC and NT by fitting a power function to these points (Appendix 1), estimating the parameters $a$ and $b$:

$$E_1 = a x^b$$

with $x$ representing the index of the IUCN category, sorted by increasing severity

To properly model the extinction probabilities linked to the IUCN categories EN and CR for individual species, we strongly encourage users to provide GL estimates for all target species. For species that are lacking GL information, this aspect is disregarded. When ignoring GL information, the extinction risk for species with moderate or long generation times (>3.33 years) will be overestimated (Fig. 2), based on the IUCN extinction risk assumptions outlined in the Introduction.
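The conversion to annual extinction probabilities and the extrapolation to LC and NT can be sketched in Python as follows. Several details are assumptions made for illustration: the category index is coded as LC = 1, NT = 2, VU = 3, EN = 4, CR = 5, the minimum time windows are used (no GL information), and the power function is fitted by ordinary least squares on log-transformed values, which may differ from the fitting procedure described in Appendix 1. The function names are hypothetical.

```python
import math


def annual_probability(prob, years):
    """Convert an extinction probability over `years` into an annual probability."""
    return 1.0 - (1.0 - prob) ** (1.0 / years)


def fit_power_function(xs, ys):
    """Fit y = a * x**b by least squares on log(y) = log(a) + b * log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    b = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y)) / sum(
        (u - mean_x) ** 2 for u in log_x
    )
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Annual probabilities for the threatened categories (assumed index coding 3, 4, 5).
threatened = {
    3: annual_probability(0.10, 100),  # VU
    4: annual_probability(0.20, 20),   # EN, minimum time window
    5: annual_probability(0.50, 10),   # CR, minimum time window
}
a, b = fit_power_function(list(threatened), list(threatened.values()))

# Extrapolate to the non-threatened categories (assumed indices 1 and 2).
extrapolated = {"LC": a * 1 ** b, "NT": a * 2 ** b}
```

When GL data are available, the time windows passed to `annual_probability` for EN and CR would be replaced by the GL-dependent windows (5 × GL and 3 × GL, respectively, capped at 100 years), which lowers the resulting annual probabilities for species with long generation times.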

The user may provide multiple GL estimates for each species, representing the uncertainty around the GL estimate of each species, in which case get_rates will calculate separate extinction probabilities for the statuses EN and CR for each provided GL estimate. In that case each simulation replicate will draw randomly from the produced EN and CR associated extinction probabilities, in order to incorporate the uncertainty surrounding these estimates into the simulations.

The function will simulate future extinction dates, which are then used to infer averaged extinction rates.

with the rates obtained from the get_rates:

The transition rates between statuses are sampled from their posterior distri-

The function allows the user to simulate different future conservation scenarios.
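To make the simulation step more concrete, the following Python sketch simulates a single species through status changes and possible extinction, drawing the EN and CR extinction probabilities at random from a set of GL-based values in each replicate. It is a simplified illustration and not the run_sim implementation: we assume a continuous-time process with competing exponential waiting times, convert annual extinction probabilities into rates via $-\log(1 - E_1)$, and use hypothetical function names and data structures (`q_matrix` as a nested dictionary of transition rates).

```python
import math
import random


def simulate_extinction_time(start, q_matrix, annual_ext_prob, t_max=100.0, rng=random):
    """Return the simulated extinction time, or None if the species survives t_max."""
    state, t = start, 0.0
    while True:
        # Annual extinction probability -> instantaneous extinction rate.
        mu = -math.log(1.0 - annual_ext_prob[state])
        moves = q_matrix.get(state, {})
        total = mu + sum(moves.values())
        if total <= 0.0:
            return None  # nothing can happen to this species
        t += rng.expovariate(total)
        if t >= t_max:
            return None
        # Choose the event in proportion to its rate.
        u = rng.random() * total
        if u < mu:
            return t
        u -= mu
        for new_state, rate in moves.items():
            if u < rate:
                state = new_state
                break
            u -= rate


def simulate_replicates(start, q_matrix, base_probs, en_probs, cr_probs,
                        n_replicates, seed=0):
    """Run replicates, drawing EN and CR probabilities from GL-based values."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_replicates):
        probs = dict(base_probs)
        probs["EN"] = rng.choice(en_probs)
        probs["CR"] = rng.choice(cr_probs)
        times.append(simulate_extinction_time(start, q_matrix, probs, rng=rng))
    return times
```

In this sketch, `en_probs` and `cr_probs` would hold one annual extinction probability per provided GL estimate, so that the spread of the GL estimates propagates into the simulated extinction times.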

For example, one can simulate an increase in conservation efforts by a specific factor. This factor is then applied to all rates in the q-matrix leading to an improvement

where $D$ is the number of instances in which $w \le t_{max}$, i.e. the number of species

where $P(\mu_i)$ is a uniform prior distribution set on the extinction rate, $U[0, \infty]$.

the resulting rate estimates against the true rates that were used to simulate the data (Fig. S1). Based on the results, we recommend choosing reference groups of preferably more than 1,000 species, because stochastic fluctuations of status counts below that threshold preclude the estimation of transition rates with any meaningful accuracy, particularly for low rates.

Extinction rates
We simulated extinction times for 1000 species under known extinction rates, to evaluate the accuracy of the estimated extinction rates produced by the run_sim function. The extinction rates (µ) that were used for these simulations were ran-

This simulation was repeated for 100, 1,000, and 10,000 simulation replicates, in order to test how many replicates are necessary for an accurate rate estimation. The results show that iucn_sim estimates extinction rates with high accuracy, yet it requires around 10,000 simulation replicates to ensure this accuracy also for very low rates, such as those of species starting as LC (Fig. 4).
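The dependence of estimation accuracy on the number of simulation replicates can be reproduced in principle with a small Python experiment. The sketch below is a stand-in, not the run_sim estimator: it assumes exponentially distributed extinction times censored at `t_max` and uses the closed-form maximum likelihood estimate (number of extinctions divided by the total time at risk) in place of the program's MCMC. The function name is hypothetical.

```python
import random


def estimate_rate(true_rate, n_replicates, t_max=100.0, seed=0):
    """Simulate censored extinction times and return the estimated rate.

    Returns 0.0 if no extinction occurred within t_max in any replicate.
    """
    rng = random.Random(seed)
    n_extinct, time_at_risk = 0, 0.0
    for _ in range(n_replicates):
        waiting_time = rng.expovariate(true_rate)
        if waiting_time <= t_max:
            n_extinct += 1
            time_at_risk += waiting_time
        else:
            time_at_risk += t_max  # species survives the whole time span
    return n_extinct / time_at_risk


# Compare estimates for a low rate across replicate numbers.
for n in (100, 1000, 10000):
    print(n, estimate_rate(1e-4, n, seed=n))
```

For a rate of 1e-4 and a 100-year time span, only about 1% of replicates end in extinction, so 100 replicates yield on the order of a single extinction event and the estimate is dominated by chance. This illustrates why very low rates require many replicates.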

Empirical data example
We ran iucn_sim to estimate future extinction events for all birds over the next 100 years.

Running iucn_sim
We provided the list of IUCN bird species names and the 100 GL estimates for each species as input for get_rates (Supplementary Code sample 1). As reference group we used the whole class Aves (∼11,000 species).

Table 1

Figure 1: Workflow of iucn_sim. The user defines a reference group for status transition rate estimation, as well as a list of target species for which future extinctions and status changes will be simulated. The user is encouraged to also provide GL estimates for each target species, which are applied in calculating the extinction risks associated with the statuses EN and CR. The current conservation status of all species is determined, using available IUCN information. All of these steps take place within the get_rates function, as indicated by the grey box in the top right of the figure. The estimated transition rates, calculated extinction risks, and current status distribution of all target species are passed on to the run_sim function. Next, these data are applied to simulate future status changes and extinctions. Finally, extinction rates are estimated from the simulation output, and various summary statistics and plots are produced as output.

[Figure 3 panels; x-axis: Extinction rate]
Figure 3: The effect of generation length (GL) and status change (SC) on estimated extinction rates. The plots show histograms of the posterior density of extinction rates estimated with iucn_sim for two different species: the Turkey vulture (Cathartes aura, GL = 9.9 years, Least Concern), panels a) and c); and the Red-headed vulture (Sarcogyps calvus, GL = 15 years, Critically Endangered), panels b) and d). Upper panels show that the extinction rate estimates slightly decrease when including GL data in the simulations (purple) compared to ignoring GL data (red) for both LC and CR species. Bottom panels show that modeling future status changes slightly increases the extinction rate of LC species (c), but leads to a decrease for CR species (d). Note that the effect of future status changes on extinction rates depends on the estimated status transition rates and is therefore expected to change depending on the chosen reference group.

[Figure 4 panels; x-axis: True extinction rates]
Figure 4: Increasing precision and accuracy of extinction rate estimates with more simulation replicates. We plotted the true extinction rates that were used to simulate extinction times for 1000 putative species (x-axis) against the extinction rates estimated with the run_sim function (y-axis). We ran three analyses with (a) 100, (b) 1,000, and (c) 10,000 simulation replicates. The plots show the mean values (blue dots) and the 95% credible interval (grey vertical lines). The dotted horizontal line shows the minimum extinction rate estimate based on the empirical dataset for all birds (10,000 simulation replicates). Extinction rates below this line are therefore unlikely to occur in empirical data sets. The diagonal red line shows a theoretical perfect correlation for reference.