Assessing species richness trends: Declines of bees and bumblebees in the Netherlands since 1945

Abstract Estimating and predicting temporal trends in species richness is of general importance, but notably difficult because detection probabilities of species are imperfect and many datasets were collected in an opportunistic manner. We need to improve our capabilities to assess richness trends using datasets collected in unstandardized procedures with potential collection bias. Two methods are proposed and applied to estimate richness change, which both incorporate models for sampling effects and detection probability: (a) nonlinear species accumulation curves with an error variance model and (b) Pradel capture–recapture models. The methods are used to assess nationwide temporal trends (1945–2018) in the species richness of wild bees in the Netherlands. Previously, a decelerating decline in wild bee species richness was inferred for part of this dataset. Among the species accumulation curves, those with nonconstant changes in species richness are preferred. However, when analyzing data subsets, constant changes became selected for non‐Bombus bees (for samples in collections) and bumblebees (for spatial grid cells sampled in three periods). Smaller richness declines are predicted for non‐Bombus bees than bumblebees. However, when relative losses are calculated from confidence intervals limits, they overlap and touch zero loss. Capture–recapture analysis applied to species encounter histories infers a constant colonization rate per year and constant local species survival for bumblebees and other bees. This approach predicts a 6% reduction in non‐Bombus species richness from 1945 to 2018 and a significant 19% reduction for bumblebees. Statistical modeling to detect species richness time trends should be systematically complemented with model checking and simulations to interpret the results. Data inspection, assessing model selection bias, and comparisons of trends in data subsets were essential model checking strategies in this analysis. Opportunistic data will not satisfy the assumptions of most models and this should be kept in mind throughout.


| INTRODUC TI ON
Species richness, the number of species present in a community or assemblage, is an important component of biodiversity. Methods for estimating species richness have been given a lot of attention (Gotelli & Colwell, 2011). In particular, obtaining local estimates of species richness by means of multispecies occupancy modeling (Dorazio & Royle, 2005; Guillera-Arroita, Kéry, & Lahoz-Monfort, 2019) has seen a recent surge in interest. In comparison to such estimation of local occupancy and the estimation of sizes of local and total species pools, assessing species richness changes over time seems to have received less attention in method development. There seems to be no general strategy for comparing more than two samples or for estimating a time trend when not all species have been sampled or detected. For data collected in an opportunistic manner without standardized protocols or a planned sampling scheme, even the analysis of a trend in a single focal species is not straightforward (Isaac, Strien, August, Zeeuw, & Roy, 2014).
However, this contrasts with the need to obtain adequate trend estimates even when only opportunistic data are available. For example, in the face of climate change and when accounting for geographical crop variation, the conservation of pollinator richness levels might be crucial to mitigate effects of ecosystem change (González-Varo et al., 2013;Klein, Müller, Hoehn, & Kremen, 2009).
It also seems that an overall and substantial decline in total biomass, such as observed for insects in nature reserves in Germany (Hallmann et al., 2017), will reduce species richness inevitably.
Species richness trends of pollinators providing essential ecosystem services therefore merit a lot of attention and different drivers of pollinator abundance and richness change have already been identified (Potts, Imperatriz-Fonseca, Ngo, Aizen, et al., 2016;. Marshall et al. (2018) stated that bumblebees in Europe have been in a steady decline for decades. However, in a recent analysis of species richness trends, Carvalheiro et al. (2013) concluded that declines in species richness have slowed down for several taxa in NW-Europe, including bumblebees. A reassessment of that analysis (Van Dooren, 2016) concluded that it only provided support for decelerating declines for the bees Anthophila in the Netherlands, conditional on accepting the inference strategy and parameter estimates as valid. An inspection of the scripts executing the analysis of Carvalheiro et al. (2013) found that many of their estimates have uncorrected errors or lead to anticonservative inference (Van Dooren, 2016).
The Dutch wild bee data in these studies are a heterogeneous mix: the records (sensu Isaac & Pocock, 2015) were collected in different ways (observations, observations submitted to a website, samples deposited in collections, by hundreds of observers and collectors) and relatively unplanned. This might be exemplary for most data collected in citizen science efforts. However, this heterogeneity was unaccounted for in previous analyses. In Carvalheiro et al. (2013), data were binned in arbitrary time periods while a year-toyear continuous time analysis would be insightful and could provide more detail on patterns of change, in particular on whether decelerating declines occur. For the reanalysis, I chose to reconsider the inference strategy. It applies two promising approaches for assessing species richness trends among four explored (Appendices S1-S3) and compares their merits, assisted by simulations. The first analyses time patterns in species richness by modeling time-dependent asymptotes of species accumulation curves. In an alternative statistical analysis, richness will not be estimated itself for reasons detailed below. There, time patterns in the rates of colonization and local (i.e., within the Netherlands) extinction will be estimated. Both assume nonexhaustive sampling of the species present and assess richness trends at the level of the entire Netherlands. The opportunistic nature of the data implies the absence of any spatial sampling scheme or design in the data. I therefore decided not to attempt to assess important local variability in richness trends (Yoccoz, Nichols, & Boulinier, 2001).

| MATERIAL AND ME THODS
Species richness is the horizontal asymptote of a species accumulation curve representing the expected number of species in a sample as a function of sample size. Estimators of this asymptote are often biased (Walther & Moore, 2005), or the precision of an estimate can be limited (O'Hara, 2005). In a comparison of two assemblages, species accumulation curves can cross (Chao & Jost, 2012;Lande, DeVries, & Walla, 2000;Thompson & Withers, 2003;Van Dooren, 2016), such that patterns in the relative numbers of species found at low sampling efforts or at rarefied sample sizes (e.g., Biesmeijer et al., 2006) can be independent of actual species richness differences. Gwinn, Allen, Bonvechio, Hoyer, and Beesley (2016) found high sensitivity of bias and precision of change estimates to the relative abundances of species in a dataset. Gonzalez et al. (2016) argued that the risk of biased estimates of change needs to be considered in short time series of species richness. Similarly, Cardinale, Gonzalez, Allington, and Loreau (2018) drew attention to risks of using estimates of the earliest time point as baseline, pointing to the possibility of mistaking regressions to the mean for actual responses.
Additionally, nonlinear data transformations can in particular affect the impression time series give us, as the distance between data points will change more in some intervals due to the transformation than in others.

| Data
The dataset analyzed here consists of 73 years of opportunistic data.
Bee records in the EIS (European Invertebrate Survey Netherlands) database in the period 1945-2018 are analyzed, extending periods investigated in Biesmeijer et al. (2006), Carvalheiro et al. (2013) and Van Dooren (2017). The data were provided by EIS as a list of records with species names, years of collection, and the square kilometer cell where each record was collected. The dataset analyzed in Van Dooren (2017) turned out to contain 150,047 spurious records, presumably due to a query error and therefore had to be redone. Sampling has not occurred in a standardized manner across the study period, and not in a spatially homogeneous or balanced manner either (Figure 1). In particular from mid-eighties until around 2005, increasingly larger numbers of records have been collected, more often by observation and with records from a larger number of square kilometer grid cells (Figure 1). For both non-Bombus and Bombus bees, there have been years where the number of records was less than twice the number of species sampled over the entire study period (four, resp. eleven times). In particular for bumblebees, 1966-1971 and 1981-1983 were periods of consecutive years with such relatively low sampling intensities. These years are expected to contribute little to the estimation of species richness and indicate that insufficient data are available for an analysis at finer spatial scales. The observation data contributed by volunteers through the website https ://waarn eming.nl since 2005 is from a steadily increasing number of grid cells per year (Figure 1), such that the number of observations per grid cell with records now steadily decreases, again complicating any analysis at smaller spatial scales. The samples per year of non-Bombus species show important time trends in the heterogeneity parameter σ of Poisson lognormal distributions (Engen, Lande, Walla, & DeVries, 2002). This parameter σ is a compound measure representing abundance variation between species rescaled by species-specific sampling effort if present. Its changes suggest that the heterogeneity in numbers of individuals per species in the samples has increased during the study period.
I assessed patterns of species richness change over time in the entire Netherlands and therefore fitted models with explanatory variables that changed over time to all data or data subsets that in principle span the entire Netherlands. Previous conclusions were drawn at that scale and I expect that changes at smaller scales are increasingly affected by (nonrandom) sampling variability and lack of consistent protocol. The models fitted did not contain any parameters or variables representing spatial species richness differences or heterogeneity. Only the number of grid cells sampled was added as an explanatory variable because it can be expected to affect predicted nationwide trends. To facilitate comparisons with Carvalheiro et al.
F I G U R E 1 Sampling strategy and volume of wild bee data in the Netherlands have changed over the years. Top row. Total numbers of records per year have changed substantially over the period (thick black lines). Note the difference in scale between both panels. The fractions of records that represent individuals deposited in collections have become a minor fraction of the totals in recent years (thin blue line). The difference is caused by an increasing number of observations without specimens deposited in collections, of which the fraction contributed by volunteers on the website https ://waarn eming.nl is increasing (thin black line: all records except those contributed through https ://waarn eming.nl). Horizontal lines are drawn at twice the total number of species sampled over the entire period per taxonomic group. For points below this line, there are on average fewer than two observations per species. Middle row. Records are obtained in a variable number of square kilometer grid cells per year (thick: total numbers, thin blue line: samples deposited in collections, thin black: samples in collections plus observations not contributed via https ://waarn eming.nl). Bottom row. When distributions of records per species per year are analyzed using Poisson lognormal models (Engen et al., 2002), the σ heterogeneity parameter for their distribution shows a gradual change over the years (estimates for different years connected by a line, approximate 95% confidence intervals for each estimate. Thick/thick: samples deposited in collections, thin/dotted: observations)  1980 1940 2020 1980 1940 2020 1980 1940 2020 1980 1940 2020 1980 1940 2020 1980 Number Each approach accounts for imperfect detection of species, which should never be ignored (Guillera-Arroita, 2017) and for time trends in sampling strategy. Within each approach, model simplification is carried out with the purpose of obtaining the best possible predictions without a priori assuming that the true model is among the candidates compared, hence by means of AICc comparisons (Akaike information criterion adjusted for finite sample sizes, Burnham & Anderson, 2002) as AIC(c) model selection is efficient (Claeskens & Hjort, 2008). Model selection, parameter estimation, and inference in general were backed up as follows. Next, to an analysis of all data, two subsets of records were analyzed separately and for each method: (a) Records for which the individual was deposited in a museum collection, excluding for example observations. (b)

Non−Bombus Genera Bombus
Records collected in square kilometer grid cells that were sampled in three periods of approximately equal length. These are the last period where the number of observations and grid cells have drastically increased (1994-2018; Figure 1), a period of equal length right after the end of WWII  and the years in between . In this subset, spatial locations that contributed samples in a restricted time period only were excluded. This avoided that in one part of the study period richness might have been assessed for a different set of grid cells than in another.
To the entire dataset and to the subsets, different models were fitted. Model simplification increases the precision of the remaining individual parameter estimates, but this can come at the cost of larger estimation bias. In each statistical analysis presented below, I assessed effects of this bias-variance trade-off (Claeskens & Hjort, 2008) by comparing the predicted time trend of species richness between the best maximal model fitted and a simplified minimum adequate model (see below). I will only conclude that a deceleration occurs when it is detected in the full data and in the data subsets and does not suddenly emerge as a result of model selection bias.
To understand patterns of estimation bias, the analysis was further complemented with simulations.

| Generalized nonlinear models
In a first approach, the number of records and the number of species per year in a dataset were used to estimate nonlinear species accumulation curves (Gotelli & Colwell, 2011, Appendix S3) with parameters that change over time. In other studies, nonlinear power functions without horizontal asymptote have been fitted (Ugland, Gray, & Ellingsen, 2003) that predict species richness by extrapolating from a smaller area to a larger fixed range. Here, species accumulation curves are represented by Michaelis-Menten curves a(t) , estimating species number with two time-dependent non-negative functions a(t) and b(t) and with x(t) the number of records in year t, representing sample size. Function a(t) is the timedependent horizontal asymptote of the curve, hence it is this part of the model which describes the change in species richness over time.
I assume that functions a(t) and b(t) change gradually across years and therefore used smooth functions (Wood, 2006) to parameterize them and assess their gradual changes. We can expect that b, which quantifies the increase in species number with sampling effort shows a time pattern affected by changes in sampling method and the number of grid cells sampled. The ratio b/a is an estimate of the Simpson diversity in random samples (Lande et al., 2000). If species richness, given by the asymptote a, has a decelerating decline, we should be able to observe that in the predicted time pattern for a(t). Note that this model representing a species accumulation curve combines the estimation of state variable a(t) with properties of the observation/detection process captured by function b(t). This approach does not need specification of a distribution of capture probabilities per species, nor a distribution of species abundances (e.g.,

O'Hara, 2005 for some examples).
Michaelis-Menten curves were fitted to numbers of species sampled per year. I used the function gnlr() in R (Lindsey, 1997) which maximizes the model likelihood using a general purpose optimizer.
None of the estimation methods proposed by Raaijmakers (1987) nor the nonlinear least squares used by O'Hara (2005) Colwell et al., 2012). The variance regression equation ensured that the error variance never became smaller than this multinomial sampling variance (Appendix S3 gives an example of the code). This is reasonable given that we do not know how much the sampling was different from random, which is an extra source of uncertainty. For the curvature function b(t), alternative models were fitted with splines (up to eight df) of the total number of grid cells with data per year or of the σ parameter of the Poisson lognormal distribution (Engen et al., 2002) fitted to the data per year for the group analyzed. Model simplification occurred in a manner which slightly reduced the total number of models to fit and compare. I fitted all models with splines of 8, 5, 3, and 1 degrees of freedom and the model with no explanatory variables for a or b. Among the models in this set, I selected the one with the lowest AICc. Then, models with one df added to each spline in this model or with one df removed were also fitted, to check whether these modifications would reduce the AICc further. The model with the lowest AICc among all the ones fitted is called the "minimum adequate" model. This model is compared with the "maximal" model: the model with lowest AICc among those with splines of 8 df for both a(t) and b(t).
A bootstrap estimate of estimation bias (Davison & Hinkley, 1997) was obtained as follows. One hundred bootstrap pseudodatasets were constructed by randomly drawing for each year a number of individuals equal to the number of records in that year and repeating the analysis above for each of these sets of pseudodata.
The covariates calculated from the original data were kept except for the multinomial sampling variances which were recalculated for each sample. The maximal and adequate models were fitted to each of these datasets, and species richnesses per year predicted. The estimated bootstrap bias equaled the average of these predictions per year minus the predicted value obtained from the original data. Note that this straightforward bootstrap differs from the one proposed by Chao et al. (2014), in that the number of undetected species is not estimated and incorporated in the resampling.

| Capture-recapture analysis of species encounter histories
Given the recent interest in occupancy methods and the possibility they offer to estimate local and total species richness (Kéry & Royle, 2008), it needs to be argued why these methods were not used here.  2019) found that estimation bias can be substantial, in particular when 15% or more of the total richness has not been sampled.
Inference on parameters representing changes in species richness might proceed differently from the estimation of species richness itself and patterns of bias and precision do not need to be the same for different parameters. A relevant case is found in the difference in estimation bias between estimates of population size and of individual survival between two time points (Cormack, 1972). Survival is a parameter determining changes in population size and is generally much less biased than population size estimators. Hillebrand et al. (2018) rightfully state that richness change in itself should be decomposed into trends in immigration and local extinction. Estimates of these rates might additionally be less biased than of species richness itself.
The analogy between estimating population size and species richness goes deeper. Estimation methods of the capture-recapture framework for individuals can be applied to assemblages of aggregated species encounter histories (MacKenzie et al., 2006;Nichols & Pollock, 1983).
Each is a sequence of zeroes and ones indicating for which years at least one record is present for that species in the dataset. Williams, Nichols, and Conroy (2002) noted that the modeling of such species encounter histories when assuming open communities with colonization and extinction is similar to individual capture-recapture histories in the presence of temporary emigration. However, the dataset and the subsets analyzed here have no repeated within-year occasions where the assemblage can be assumed to be closed, such that "temporary emigration," which here is a component of the colonization and extinction dynamics cannot be estimated because a robust design model cannot be fitted (Pollock, 1982). Temporary emigration, that is the probability that an individual is absent and cannot be sampled, can also be estimated for individual capturerecapture histories assuming an unobservable state (Kendall & Nichols, 2002;Schaub, Gimenez, Schmidt, & Pradel, 2004). Such a model as constructed for capture histories of individuals cannot deal with individual species jointly present inside and outside of the Netherlands and we would have to use occupancy models. For parameter estimability, we would further have to assume that there are no time trends in temporary absence and survival (Schaub et al., 2004) such that the time-dependent models of interest cannot be fitted. Therefore, encounter histories per species were analyzed using capture-mark-recapture analysis for open populations (Pradel, 1996) and trends in temporary absence were assessed otherwise as explained below. Models were fitted to species encounter histories across years using Mark software (White & Burnham, 1999) called via RMark in R (Laake, 2013). I parameterized using the "Pradrec" model, which estimates local survival and colonization, where I note that the first is a probability per species (in between zero and one) and the second the number of colonizing species relative to the number present at the beginning of a time interval (non-negative). Local survival probabilities, species colonization, and detection probabilities were estimated as being constant, with a linear trend over time (in the linear predictor), with categorical effects per year ("time-dependent"), or with regression models that have the total number of records, number of grid cells with records per year, and σ of the Poisson lognormal distribution of the taxon concerned as year-specific covariates ("regression"). This assumes that changes in the grid cells sampled primarily affect detection probabilities and that survival and colonization apply to the entire Netherlands. Correlated explanatory variables were jointly included in the detection probability model because the aim was to obtain the best predictions for survival and recruitment, not to interpret parameter estimates of the detection model or do hypothesis testing on them. The total number of records per species was added as a species-specific (individual) covariate in some models. The Pradel models fit parameters that express per-year gains and losses in species richness and not richness itself. This is expected to reduce estimation bias (Cormack, 1972) but it is also known that capture heterogeneity needs to be addressed (Abadi, Botha, & Altwegg, 2013) to avoid biased estimates of local survival and colonization. For that reason, mixtures of species effects on detection probability were fitted, with two or three components ("PradelRecMix" in RMark, Pradel, Choquet, Lima, Merritt, & Crespin, 2009). Models were compared using AICc to select the minimum adequate model with lowest AICc.
If the probability that a species is temporarily absent remains constant over time, it will not affect estimates of species richness change, only detectability. The presence of time-inhomogeneous temporary absence could affect time trends in species richness and was assessed as follows. If the probability of being temporarily absent in year t is p e (t) and the probability of detection when present p d (t), then the probability of detection p(t) in the time-dependent Pradel model should equal the product of p d (t) and (1 − p e (t)).
Assuming that our regression model for detection probabilities with variables characterizing sampling effort and method fits in fact p d (t) multiplied by a constant, we can investigate the presence of elevated temporary absence in some time intervals by checking time patterns in the difference between detection probabilities predicted by the regression and by the fully time-dependent models. Large positive differences in a sequence of consecutive years can indicate temporary absence affecting species richness trends.
Parameter estimates of local survival and colonization were used to calculate the change in species number in 2018 relative to 1945. The growth rate λ t of species richness in between year t and t + 1 is the sum of the survival probability s t and colonization f t . The predicted relative change in richness in year t + relative to reference year t is equal to, Confidence intervals for these changes were calculated from the confidence interval boundaries of parameters in the minimum adequate models.
To assess estimation bias in the parameters of Pradel models, simulations were used. Appendix S5 and Figures S3 and S4 present results of simulations of local survival, colonization, temporary absence and sampling, which allowed another assessment of estimation bias and of the statistical power to detect decelerating declines and temporary absence concentrated in a certain period.

| Generalized nonlinear models
Prediction intervals of the best maximal models and the models with the most favorable AICc (adequate model) are shown in Figure 2 for the full data and the second subset, that is records from grid cells sampled in three predefined periods (results on collection specimens are available in Figure S1, Table S1).
The overall impression from Figure 2 is that there are moderate declines of species richness for the non-Bombus genera. For Bombus, the analysis of the complete dataset produces a complex pattern in species richness over time. The complexity disappears when the data are restricted to grid cells that have been sampled in three periods and when we consider the minimum adequate model. Comparison with the analyses on the other grid cells that were not sampled in three periods (Appendix S4, Figure S2) reveals that the complex pattern might originate from this data subset and thus from data heterogeneity. For the minimum adequate model on collection specimens only, a straight line remaining within the prediction intervals can be drawn from the start to the end of the study period ( Figure S1).
The bootstrap-bias assessment shows that the predictions of bias-corrected models remain within the original confidence intervals for the non-Bombus genera but not for Bombus ( Figure 2). There, the bootstrap predicts substantial bias. A first explanation is that this is due to the fact that I did not add rare species in the bootstrap procedure (Chao et al., 2014) and thus never sampled more species than in the data for that year. The bootstrap bias disappears in the minimum adequate model for the grid cells that were repeatedly sampled ( Figure 2). This suggests that bias might be larger when complex models overfit the bumblebee data.  [18,20]. Note that the estimated relative loss of richness is larger for Bombus than the other bee genera but when relative losses are calculated from confidence intervals limits, they overlap and both touch zero loss.  Figures 3 and 4). The time-dependent models show negligible colonization and some years with reduced survival. Table 1 gives AICc values of the adequate models and a set of other models that are useful for comparison. In these tables, "t" indicates time categorical effects fitted, "T" a linear effect in the linear predictor, which was linked to the data using log (colonization) or logit (local survival) link functions. The number of mixture TA B L E 1 A selection of Pradel models fitted to the data. The minimum adequate model is given on the first line per taxon. The first three columns list the covariates in the models for local survival, colonization and detection, the last two columns confidence intervals for the time trends in survival and colonization, when estimated. T refers to a linear effect of year, t to categorical effects of year, "1" indicates a model without explanatory variables. The numbers of components in the mixture distributions for detection probabilities are given. Abbreviations for the year-dependent variables are σ for the heterogeneity parameter of the Poisson-log normal distribution, N rec is the total number of records per year, N grid the number of square kilometer grid cells sampled components in the detection model is given. In the last two columns of the tables, estimates of the linear effect of T are given when fitted. At least one estimate for local survival or colonization needs to be significantly positive for a deceleration in the decline of species richness to be possible. This does not occur in the analyses of the different data subsets either (Table S2). However, this is in agreement with simulation results (Appendix S5, Figure   S3) that indicate that there is a bias toward estimating negative trends in local survival. However, the simulations also indicate that survival intercept parameters are estimated relatively well, such that we can use these to estimate species richness change while our minimum adequate models are not including any time effects

| D ISCUSS I ON
To investigate trends in species richness, a dataset with records and observations of Bombus and non-Bombus wild bee species in the Netherlands spanning 73 years was analyzed using two approaches. In the analysis of such an opportunistic dataset, emerging patterns can be caused by sampling heterogeneity across F I G U R E 3 Results obtained from fitting Pradel models to encounter histories for species from non-Bombus wild bee genera. Left column: Parameter estimates and predictions of the fully time-dependent model without fitting mixtures; right column: the adequate model with a mixture for detection probability (Table 1) This was the main source of records before 2000, but now contributes a minor fraction of the data. Neither of the approaches then rejected the hypothesis of a constant decline or no decline at all in favor of a decelerating decline. In fact, the species richness decline is small and maybe absent for non-Bombus bees. For bumblebees, if we take advantage of the larger precision of the capture-recapture analysis, a moderate and significant decline of 19% is inferred. Among the generalized nonlinear models, there was one data subset for which the preferred model had a constant decline in species richness for both bumblebees and other wild bee species, while preferred models for the other subset and the full dataset had smooth regressions of species richness with at least three parameters. Therefore we can conclude that the approach is capable of selecting a nonlinear model, but it did not do so consistently. Several estimation issues were encountered.
Generalized nonlinear modeling showed potential estimation bias for bumblebees, which might result from the lack of adding rare species in the bootstrap, or from fitting too complex models to the data. Simulations indicated that the modeling of species richness changes via encounter histories suffers from estimation bias of parameters representing time trends, making it difficult to reliably detect decelerating declines when using that method alone.

| Wild bee species richness trends
The fraction of species richness lost since 1945 or the species richnesses at the end of the study period could be predicted using either approach. From the capture-recapture analysis, the conclusion is that species richness has declined since 1945 for the Bombus genus.
For the non-Bombus genera small declines are estimated with confidence intervals including zero. The decrease estimated for the Bombus genus is comparable to a study using rarefied richness on Swedish data (Bommarco, Lundin, Smith, & Rundlöf, 2011). On the other hand, the estimated overall decrease for the other genera seems smaller than in an analysis of bee species richness in the UK with relative decreases between 10% and 30% in most study sites, when comparing periods separated by 33 years (Senapathi et al., 2015). However that study used the same methods as Carvalheiro et al. (2013) and no confidence intervals for the changes were given.

| Methodological developments
Risks of species richness loss seem taxon-specific (Bombus vs. non-Bombus), and thus call for trend analysis in smaller taxonomic groups in general. However that requires sufficient data for each of them.
Boyd (2013)  bootstrap methods differently, to estimate magnitudes of estimation bias in the different types of analysis but without imputing rare species. For assessments of estimation of bias, independent simulations seemed more useful than the data bootstrap.
Issues with models for encounter histories are probably alleviated when robust design models (Pollock, 1982) can be fitted to the data. If records had been collected throughout the study period independently by observations and via specimens deposited in collections, then these two sampling methods could have been used to fit a robust design. Heterogeneity in species detection probabilities can be expected in all methods addressing species richness (e.g.,  and it was incorporated into the capture-recapture models by means of mixtures. Note that models for open populations were used in the analysis of species encounter histories. Models exist to estimate species richness in open occupancy models Yamaura et al., 2011), but these often depend on the assumption of individual random sampling which is untenable for the opportunistic data analyzed here. Statistical modeling of how collectors vary in their sampling efforts over time clearly deserves more study and wider application, given that more and more data of the kind analyzed here are mined and as people are encouraged to collect citizen-science data .
Ideally, these developments should result in methods where local or landscape-specific diversity trends and spatio-temporal patterns in sampling effort can be studied jointly. Also studies on abundance trends in different taxa (e.g., Inger et al., 2015) could benefit from multimodel inference and assessments of bias-variance tradeoffs and sampling heterogeneity. Isaac and Pocock (2015) propose that an effort should be made to model observer responses, suggesting the application of data collected from mobile phones to assess different kinds of sampling bias.
This seems to call for an effort to model recording syndromes and to integrate them into the development of statistical methods for uncontrolled data collection.
A very different avenue of research which could be rewarding might be the development of Focused Information Criteria (FIC; Claeskens & Hjort, 2003) for the estimation of species richness trends. Such information criteria have a focus (hence the F), a statistic we are most interested in, and models are preferred that are expected to produce the most precise predictions of that statistic.
We would thus require a FIC for the selection of models that predict species richness trends best.
Meanwhile, how can we decide which approach is reliable enough for inference of trends? Here I resorted to comparisons of consistency in the results from different approaches and subsets.
Additional simulations with similarities to the dataset were essential and revealed critical estimation bias. Therefore, ranking estimates from different approaches based on consistency across data subsets should never occur without accounting for other criteria. For opportunistic data, there seems to be currently no safe alternative other than a pluralistic but conservative modeling approach assisted by simulations which provide some guidance.
To conclude, I want to point out a potential example of how nonrandom data collection might be generated. Note that a recent Dutch reference work on wild bees (Peeters et al., 2012) explicitly pointed to recent records of the wild bee species Andrena coitana in Germany and the habitat type in the Netherlands where the species might be seen again after a long period without records. Recently, the rediscovery of A. coitana was reported (Nieuwenhuijsen, 2016).
We cannot exclude the possibility that a new reference work or other public exposure motivates an increased effort to collect particular species. This can thwart any effort to achieve random sampling, so essential to most of our inference methods, by replacing it with recording syndromes (Isaac & Pocock, 2015).

ACK N OWLED G M ENTS
I thank Menno Reemer for providing the EIS data. Roger Pradel and two reviewers gave useful comments on previous versions.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N
TJMVD designed the study, carried out the analysis and wrote the manuscript.

O PE N R E S E A RCH BA D G E S
This article has earned an Open Materials Badge for making publicly available the components of the research methodology needed to reproduce the reported procedure and analysis. All materials are available at [https ://doi.org/10.5061/dryad.34tmp g4fm].

DATA AVA I L A B I L I T Y S TAT E M E N T
The raw data used are available upon request from http://www.