Study area and plant data
Switzerland is a small western European country with an area of about 41 000 km². Annual precipitation ranges from 438 to 2950 mm and mean annual temperature from −10.5 to 12.5 °C. Switzerland is very mountainous, with 60% of its area in the Alps (up to an elevation of about 4600 m a.s.l.) and 10% in the Jura Mountains (up to an elevation of 1679 m). Elevation ranges from 193 to 4634 m, with an average of 1300 m (Wohlgemuth et al. 2008).
The Swiss BDM scheme was designed to measure changes in biodiversity in Switzerland, mainly to meet the information needs of the general public and politicians (Weber, Hintermann & Zangger 2004). Species richness is assessed for various taxa, including vascular plants, at the local, landscape and national levels (Weber, Hintermann & Zangger 2004). We used data from the Swiss BDM indicator ‘species richness in landscapes’ (Z7), which aims to monitor vascular plant diversity at the landscape scale in Switzerland as a whole. Based on the national coordinate system of 41 285 1-km² cells, a stratified random sample of 520 1-km² quadrats was laid out across Switzerland (Pearman & Weber 2007). After excluding quadrats consisting of 100% water surface, as well as quadrats that were too rugged for safe field work, 451 quadrats were surveyed for vascular plants.
Based on the existing national coordinate system, a continuous transect of 2.5 km length was placed as close as possible to the quadrat diagonal (Plattner, Birrer & Weber 2004). The transect was designed to follow existing trails, streets or paths. In rough terrain where no pre-existing paths could be followed, transects were identified and permanently marked with yellow spots on trees or rocks. During each growing season, about 100 transects were surveyed, each by one of a crew of currently a dozen trained field botanists, and each transect was surveyed within a single year between 2004 and 2008. Botanists follow the established transect and walk it back and forth, recording all vascular plants within 2.5 m on either side of the transect.
To accommodate different flowering phenologies, field botanists sample transects once in spring and once in late summer (Plattner, Birrer & Weber 2004; Pearman & Weber 2007). These two surveys are conducted in the same year in a given quadrat. Survey dates at different elevations were specified according to the known average length of the vegetation period (see Table S1 in Supporting Information). For 36 quadrats at very high elevation (above 2200 m), that is 8% of the total of 451 quadrats, only a single, unreplicated survey was carried out between 10 July and 25 August. We note that this lack of replication is not a problem for our modelling. It simply means that these sites do not contribute any information about the magnitude and seasonal profile of detection probability.
We wanted to be able to extrapolate formally from our study species to the much larger set of all vascular plant species in Switzerland. Currently, about 3000 vascular plant species are known in Switzerland (Landolt et al. 2010). By 2008, nearly 1700 species had been recorded within the Swiss BDM Z7 quadrats. To obtain an estimate of the average magnitude of detection errors in the Swiss flora, we randomly sampled 100 plant species from among the nearly 1700 species that were detected. Then, to understand the factors affecting detection probability, including life form (LF), we applied a constrained randomization in the selection of our study species by restricting our sampling to species that were detected in at least 18 quadrats. This left us with 886 species, from which we randomly chose 25 from each of four LFs: grass, forb, shrub and tree.
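The two-step selection described above can be sketched as follows. This is only an illustration under stated assumptions: the species list is made up, and the function name, the record layout and the seed are hypothetical rather than taken from the BDM data or the authors' scripts.

```python
import random

LIFE_FORMS = ("grass", "forb", "shrub", "tree")

def select_species(records, n_random=100, min_quadrats=18, n_per_lf=25, seed=1):
    """records: list of (species_name, life_form, n_quadrats_detected) tuples."""
    rng = random.Random(seed)
    # Sample 1: simple random sample from all detected species.
    random_sample = rng.sample(records, n_random)
    # Sample 2: restrict to species detected in at least `min_quadrats`
    # quadrats, then draw the same number of species from each life form.
    eligible = [r for r in records if r[2] >= min_quadrats]
    stratified = {}
    for lf in LIFE_FORMS:
        pool = [r for r in eligible if r[1] == lf]
        stratified[lf] = rng.sample(pool, n_per_lf)
    return random_sample, stratified

# Made-up species list for illustration only (NOT the BDM data).
_rng = random.Random(0)
records = [(f"sp{i}", LIFE_FORMS[i % 4], _rng.randint(1, 200)) for i in range(1700)]
random_sample, stratified = select_species(records)
```

Note that `rng.sample` raises a `ValueError` if a life form has fewer eligible species than requested, which makes a too-strict threshold visible immediately.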
We applied a multispecies site-occupancy model (Dorazio & Royle 2005; Russell et al. 2009; Zipkin, Dewan & Royle 2009) to our data of two samples of 100 species from 451 sites. This framework formally accommodates false-negative detections by distinguishing a latent state of occurrence, $z$, which is modelled jointly with a binomial detection process that describes the error-prone mapping of $z$ onto the observed detection/nondetection data $y$ (Royle & Dorazio 2008; Kéry 2011). Specifically, let $z_{i,k}$ be the latent occurrence state at quadrat $i$ ($i = 1, 2, \ldots, 451$) of species $k$ ($k = 1, 2, \ldots, 100$), such that $z_{i,k} = 1$ denotes presence and $z_{i,k} = 0$ denotes absence. Our basic model for the ecological process underlying the true pattern of occurrence of our study species, $z_{i,k}$, is then a Bernoulli random variable,
$$z_{i,k} \sim \mathrm{Bernoulli}(\psi_{i,k}) \qquad \text{(eqn 1)}$$
where $\psi_{i,k}$ is the probability of occurrence of species $k$ at quadrat $i$. Conditional on the outcome of that Bernoulli random variable, that is $z = 1$ (presence) or $z = 0$ (absence), the observation process is modelled as another Bernoulli random variable; in doing so, we assume that there are no false-positive records. Hence, for the observed detection/nondetection data, $y_{i,j,k}$, for quadrat $i$ ($i = 1, 2, \ldots, 451$), replicate survey $j$ ($j = 1, 2$) and species $k$ ($k = 1, 2, \ldots, 100$), we assume
$$y_{i,j,k} \mid z_{i,k} \sim \mathrm{Bernoulli}(z_{i,k}\, p_{i,j,k}) \qquad \text{(eqn 2)}$$
where $p_{i,j,k}$ is the detection probability for species $k$ ($k = 1, 2, \ldots, 100$) at quadrat $i$ ($i = 1, 2, \ldots, 451$) during survey $j$ ($j = 1, 2$). We note that although the surveys in the 451 quadrats were conducted over multiple years, the replicate surveys of any particular quadrat were conducted within the same year; thus, the closure assumption (an unchanged occurrence state of a quadrat across its replicate surveys) was not violated.
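To make the two-level structure concrete, the following minimal sketch simulates data for a single species: a latent Bernoulli occurrence state per quadrat, and detections that can only occur where the species is present. It assumes, purely for illustration, constant occupancy and detection probabilities (`psi` and `p`, hypothetical values), whereas the model in the text lets both vary with covariates.

```python
import random

def simulate_detections(psi, p, n_sites=451, n_surveys=2, seed=42):
    """Simulate latent occurrence z and detection histories y for one species."""
    rng = random.Random(seed)
    # Ecological process: z_i ~ Bernoulli(psi).
    z = [1 if rng.random() < psi else 0 for _ in range(n_sites)]
    # Observation process: y_ij | z_i ~ Bernoulli(z_i * p), so a species
    # that is absent (z_i = 0) is never recorded (no false positives).
    y = [[1 if (z_i == 1 and rng.random() < p) else 0 for _ in range(n_surveys)]
         for z_i in z]
    return z, y

# Hypothetical parameter values, chosen only for illustration.
z, y = simulate_detections(psi=0.4, p=0.6)
true_occupied = sum(z)
observed_occupied = sum(1 for row in y if any(row))
```

Because detection is imperfect, `observed_occupied` can never exceed `true_occupied`; the gap between the two is the false-negative error that the occupancy model is meant to correct.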
We introduced effects of covariates to accommodate spatial and taxonomic variation in occupancy probability and spatial, temporal and taxonomic variation in detection probability, or equivalently, to test for the effects of the associated covariates. Owing to the extreme altitudinal gradient in Switzerland, elevation serves as a covariate that summarizes the effects of a very large number of environmental variables that act more directly on the probability of species occupancy. To account for that and to allow for nonmonotonic relationships, we fitted the following model for occupancy with a quadratic effect for elevation (E):
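$$\mathrm{logit}(\psi_{i,k}) = \alpha_{00,LF(k)} + \alpha_{0,k} + \alpha_{1,k} E_i + \alpha_{2,k} E_i^2 \qquad \text{(eqn 3)}$$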
Here, $\alpha_{00,LF(k)}$ denotes the effect of the LF of species $k$ on the probability of occupancy of species $k$ at quadrat $i$, $\alpha_{0,k}$ is the deviation of species $k$ from the LF to which it belongs, $\alpha_{1,k}$ and $\alpha_{2,k}$ are the linear and quadratic effects of elevation for species $k$, and $E_i$ is the mean elevation of quadrat $i$ ($i = 1, 2, \ldots, 451$). For the random sample of 100 plant species from 1700 species, we did not include the effects of the LF of species $k$ (see Appendix S1 in Supporting Information).
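As a minimal sketch of this occupancy model, reading eqn 3 as the sum of the terms just defined on the logit scale, the function below evaluates the occupancy probability for one species at a standardized elevation. All coefficient values a user would pass in are hypothetical; `peak_elevation` is an added helper, not part of the original analysis, showing why the quadratic term permits a unimodal (nonmonotonic) response.

```python
import math

def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

def occupancy_prob(e_std, alpha_lf, alpha0_k, alpha1_k, alpha2_k):
    """Occupancy probability at standardized elevation e_std.

    alpha_lf: life-form effect; alpha0_k: species deviation from its life form;
    alpha1_k, alpha2_k: linear and quadratic elevation effects.
    """
    eta = alpha_lf + alpha0_k + alpha1_k * e_std + alpha2_k * e_std ** 2
    return inv_logit(eta)

def peak_elevation(alpha1_k, alpha2_k):
    """Standardized elevation at which occupancy peaks (requires alpha2_k < 0)."""
    if alpha2_k >= 0:
        raise ValueError("no interior maximum unless the quadratic effect is negative")
    return -alpha1_k / (2.0 * alpha2_k)
```

For example, with the hypothetical values `alpha1_k = 1.0` and `alpha2_k = -0.5`, `peak_elevation` returns a standardized elevation of 1.0, that is one standard deviation above the mean elevation of the surveyed quadrats.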
Similarly, we modelled the effects of LF, species, elevation, survey date (D) and their interactions on detection probability with a logit-linear function:
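$$\mathrm{logit}(p_{i,j,k}) = \beta_{0,LF(k)} + \beta_{1,k} + \beta_{2,k} E_i + \beta_{3,k} E_i^2 + \beta_{4,k} D_{i,j} + \beta_{5,k} D_{i,j}^2 + \beta_{6,k} E_i D_{i,j} + \beta_{7,k} E_i D_{i,j}^2 + \beta_{8,k} E_i^2 D_{i,j} \qquad \text{(eqn 4)}$$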
Here, $\beta_{0,LF(k)}$ denotes the effect of the LF of species $k$, $\beta_{1,k}$ is the deviation of species $k$ from the mean value for its LF, $\beta_{2,k}$ through $\beta_{8,k}$ are the effects of elevation, elevation squared, date of the survey, date squared and their interactions, and $D_{i,j}$ is the survey date at quadrat $i$ ($i = 1, 2, \ldots, 451$) during survey $j$ ($j = 1, 2$). Originally, we intended to fit a model with a full interaction of all covariates, that is including an effect of $E_i^2 \times D_{i,j}^2$. However, we never obtained numerical convergence for this model, so we removed that term from the model specification. For the random sample of 100 plant species from 1700 species, we included only the effects of date of the survey and date squared (see Appendix S1). We standardized all covariate data for the analyses.
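The detection submodel and the covariate standardization can be sketched as below. The assignment of the indices 2 to 8 to particular terms follows the order in which the effects are listed in the text and should be read as an assumption; the helper names are hypothetical, and the standardization shown (centring on the mean and dividing by the sample standard deviation) is one common choice that the text does not spell out.

```python
import math

def standardize(values):
    """Centre on the mean and scale by the sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def detection_prob(e, d, beta_lf, beta):
    """Detection probability for standardized elevation e and survey date d.

    beta_lf: life-form effect; beta: dict mapping the indices 1..8 to the
    species-level coefficients (index order is an assumption, see above).
    """
    eta = (beta_lf + beta[1]
           + beta[2] * e + beta[3] * e ** 2          # elevation, elevation squared
           + beta[4] * d + beta[5] * d ** 2          # date, date squared
           + beta[6] * e * d                          # elevation x date
           + beta[7] * e * d ** 2                     # elevation x date squared
           + beta[8] * e ** 2 * d)                    # elevation squared x date
    # The elevation squared x date squared term is omitted, as in the text.
    return 1.0 / (1.0 + math.exp(-eta))
```

Standardizing both covariates keeps the polynomial and interaction terms on comparable scales, which generally helps the sampler mix.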
Consistent with the scope of our study and with the sampling scheme, which resulted in the 100 randomly selected species, we treated all parameters indexed by $k$ as random effects, that is as draws from a prior distribution whose parameters we estimated. Specifically, we made the assumption that all sets of species-specific random effects come from normal distributions with mean $\mu$ and variance $\sigma^2$ that were both estimated. The only exceptions were the LF effects $\alpha_{00,LF(k)}$ and $\beta_{0,LF(k)}$, which were treated as independent (i.e. fixed) effects. We further note that the species-specific intercepts, $\alpha_{0,k}$ and $\beta_{1,k}$, were expressed as deviations from the LF means; therefore, the prior distributions for these parameters were centred on zero.
We chose a Bayesian analysis of the model and used vague priors that were meant to introduce little or no information about the estimated parameters. Specifically, we chose uniform distributions, U(a, b), for all parameters, with the interval from a to b sufficiently wide as to not affect the posterior distributions. For the variance parameters, which we specified on the scale of the standard deviation, a was zero (reflecting the fact that a variance cannot be negative) (see Appendix S2 for a description of the model in the BUGS language).
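The random-effects and prior structure described in the two preceding paragraphs can be summarized compactly as follows. The subscripted hyperparameter names ($\mu_{\alpha_m}$, $\sigma_{\alpha_m}$ and so on) are notation introduced here for illustration only, and the bounds $a$ and $b$ are left symbolic because their values are given in Appendix S2 rather than in the text.

$$
\begin{aligned}
\alpha_{0,k} &\sim \mathrm{Normal}(0, \sigma_{\alpha_0}^2), &
\beta_{1,k} &\sim \mathrm{Normal}(0, \sigma_{\beta_1}^2), \\
\alpha_{m,k} &\sim \mathrm{Normal}(\mu_{\alpha_m}, \sigma_{\alpha_m}^2), & m &= 1, 2, \\
\beta_{m,k} &\sim \mathrm{Normal}(\mu_{\beta_m}, \sigma_{\beta_m}^2), & m &= 2, \ldots, 8, \\
\alpha_{00,LF},\ \beta_{0,LF},\ \mu_{\cdot} &\sim \mathrm{U}(a, b), &
\sigma_{\cdot} &\sim \mathrm{U}(0, b).
\end{aligned}
$$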
We carried out the analysis in WinBUGS 1.4.3 (Lunn et al. 2000; Spiegelhalter et al. 2003), which we called from R through the package R2WinBUGS (Sturtz, Ligges & Gelman 2005). We ran three Markov chains for $10^5$ iterations each, discarded the first half as burn-in and thinned by one in 50. The Gelman–Rubin $\hat{R}$ statistic (Gelman & Rubin 1992) indicated acceptable convergence for all parameters (i.e. $\hat{R}$ values were between 1.0 and 1.1 for all primary structural parameters of the model). We report posterior means as point estimates and the 2.5th and 97.5th percentiles of the posterior samples as 95% Bayesian credible intervals (CRI). We conducted a Bayesian analogue to a significance test by checking whether the CRI for a parameter contained zero, in which case we assumed non-significance. Further, we assumed that the difference in detection probability between two LFs was non-significant if the CRI for the derived difference in detection probability between the two LFs contained zero. Finally, using the maximum per-visit detection probability of each species (i.e. the higher of the detection probabilities for the first and the second surveys), we estimated the minimal number of surveys required to detect a species with a probability of 95% during the optimal survey season (McArdle 1990).
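The last step follows from a standard argument: if a species present at a site is detected with probability p on each of n independent surveys, it is detected at least once with probability 1 − (1 − p)^n, so the smallest n that reaches 95% is the ceiling of log(0.05)/log(1 − p). The sketch below implements that calculation; the function name and the assumption of independent, equally effective surveys are ours, not details taken from the original analysis.

```python
import math

def min_surveys(p_max, target=0.95):
    """Smallest number of surveys n with 1 - (1 - p_max)**n >= target.

    p_max: per-visit detection probability of a species that is present.
    Assumes independent surveys with the same detection probability.
    """
    if not 0.0 < p_max <= 1.0:
        raise ValueError("p_max must lie in (0, 1]")
    if p_max == 1.0:
        return 1  # a single survey always detects the species
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_max))
```

With a per-visit detection probability of 0.5, for instance, five surveys are needed, because four surveys reach only 1 − 0.5^4 = 0.9375.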