Imperfect detection is the rule rather than the exception in plant distribution studies


  1. Imperfect detection can seriously bias conventional estimators of species distributions and species richness. Plant traits, survey-specific conditions and site-specific characteristics may influence plant detection probability. However, the generality of the problems induced by imperfect detection in plants and the magnitude of this challenge for plant distribution studies are currently unknown.
  2. We address this question based on data from the Swiss Biodiversity Monitoring, in which vascular plants are surveyed twice in the same year along a 2.5-km transect in 451 1-km² quadrats. Overall, 1700 species were recorded. We chose a random sample of 100 species from the 1700 species to determine general detection levels. To examine the effects of covariates on detection, we chose a stratified random sample of 100 species from the 886 species that were detected in at least 18 locations, with 25 each from four life-forms (LF): grass, forb, shrub and tree. Using a Bayesian multispecies site-occupancy model, we estimated occurrence and detection probability of these species and their relation to covariates.
  3. Based on the random sample of 100 species, detection probability during the first survey ranged from 0.03 to 0.99 (median 0.74) and during the second survey from 0.03 to 0.99 (median 0.82). Based on the stratified random sample of 100 species, detection probability during the first survey ranged from 0.02 to 0.99 (median 0.87) and during the second survey from 0.01 to 1 (median 0.89). Detection probability differed slightly among the four LFs. In 60 species, survey season or elevation had significant effects on detection. We produced detection probability maps for Switzerland based on the modelled relationships with environmental covariates.
  4. Synthesis. Our findings suggest that even in a standardized monitoring program, imperfect detection of plants may be common. Without a correction for detection errors, maps in plant distribution studies will be confounded with spatial patterns in detection probability. We presume that these problems will be much more widespread in the data sets that are used for conventional plant species distribution modelling. Detection probability should be estimated, even in distribution studies of plants and other sessile organisms, to better control detection errors that may compromise the results of species distribution studies.

Introduction


Species occurrence is of central importance in ecology and its applications. The collection of sites where a species occurs represents the distribution of that species (Guisan & Thuiller 2005), while the collection of species occurring at a site represents species richness, the most widely used metric of biodiversity (Gotelli & Colwell 2001). However, occurrence is typically not observed perfectly; instead, there are two possible errors that can be made when dealing with species distributions: false-negative errors, also called errors of omission or nondetection, and false-positive errors, also called errors of commission or misclassification (Miller et al. 2011). Given the widespread nature of detection errors (Yoccoz, Nichols & Boulinier 2001), species occurrence is at least partly a latent state that needs to be estimated for unbiased inference about species distributions and species richness (Royle & Dorazio 2008). Treating observed occurrence and species distributions as the true occurrence and distribution, that is failing to make amendments for imperfect detection, may lead to problems in species distribution studies (Kéry 2011), habitat models (Gu & Swihart 2004) and biodiversity management (Chades et al. 2008).

Over the last decades, a plethora of statistical models has been developed to correct for imperfect detection in population analyses for inference about distribution, abundance and vital rates (Seber 1982; Buckland et al. 2001; Borchers, Buckland & Zucchini 2002; Williams, Nichols & Conroy 2002; Royle & Dorazio 2008; King et al. 2010; Kéry & Schaub 2012). However, with few exceptions, use of these models and the associated sampling designs has been restricted to studies in animal ecology. Plant ecologists have been slow to acknowledge the possible need for such methods in their research, presumably because they know that plants do not run away (Harper 1977).

Only a handful of plant distribution studies have formally dealt with the problem of imperfect detection. For instance, in several vegetation surveys, 20–30% of the species were overlooked (Nilsson & Nilsson 1985; Scott & Hallam 2002; Archaux et al. 2006). Trained botanists recorded more species in plant inventories than untrained ones (Ahrends et al. 2011). The overlooking of some species was recognized as a methodological problem when estimating turnover of plants on islands (Nilsson & Nilsson 1982, 1983). Nevertheless, only very few studies have formally estimated the magnitude of detection errors in plants using adequate protocols and analytical methods (e.g. Alexander, Slade & Kettle 1997; Shefferson et al. 2001; Kéry & Gregg 2003; Slade, Alexander & Kettle 2003; Kéry et al. 2006; Chen et al. 2009). In all of these, detection probability was found to be less than one and sometimes depended on covariates such as plant size or life state.

These studies still give an incomplete description of imperfect detection of plants, because species were not selected randomly and survey methods are not comparable between studies. They do, however, emphasize a need to better understand the magnitude and the patterns of imperfect detection in space and time and its influence on plant distribution studies (Kéry, Gardner & Monnerat 2010a; Kéry 2011).

In this study, we estimate detection probability and study patterns in detection related to life-form (LF), space and time for a large random sample from an entire national flora. We conducted two analyses: one for a random sample, to obtain the best possible estimate of average detection probability in the entire flora, and another for a stratified random sample restricted to more common species (operationally defined as those with at least 18 detections), to obtain estimates of the magnitude and of the patterns of detection and occurrence as related to biotic and abiotic covariates. We used data from the Swiss Biodiversity Monitoring (BDM; Weber, Hintermann & Zangger 2004), where detection/nondetection data are collected twice in each year at each sample site. This within-season replication in the sampling protocol enables site-occupancy models (MacKenzie et al. 2002; Tyre et al. 2003) to be applied to jointly estimate occurrence and detection probability. We applied a recently developed multispecies site-occupancy model (Dorazio & Royle 2005; Russell et al. 2009; Zipkin, Dewan & Royle 2009), which combines data from multiple species in a hierarchical model to estimate the mean and variance of hyperdistributions describing the variability among species. Treating the effects of individual species as random is consistent with the intended scope of our analyses, namely the entire Swiss flora. Thus, the two samples of 100 species actually studied were simply regarded as replicates of the larger, statistical population of species that could have been selected in our study, that is all Swiss vascular plant species except for the very rare ones.

Our study had three aims. First, we assessed the magnitude of imperfect detection caused by false-negative errors in field survey for plants in a well-designed and well-conducted national BDM program. Secondly, we explored the differences in detection errors among species and LFs. And thirdly, we aimed to identify factors affecting detection probabilities over space and time.

Materials and methods

Study area and plant data

Switzerland is a small western European country with an area of about 41 000 km². Annual precipitation ranges from 438 to 2950 mm and mean annual temperature from −10.5 to 12.5 °C. Switzerland is a very mountainous country, with 60% of its area in the Alps (up to an elevation of 4600 m a.s.l.) and 10% in the Jura Mountains (up to an elevation of 1679 m). Elevation ranges from 193 to 4634 m, with an average of 1300 m (Wohlgemuth et al. 2008).

The Swiss BDM scheme was designed to measure changes in biodiversity in Switzerland, mainly to meet the information needs of the general public and politicians (Weber, Hintermann & Zangger 2004). Species richness is assessed for various taxa, including vascular plants, at the local, landscape and the national level (Weber, Hintermann & Zangger 2004). We used data from the Swiss BDM indicator ‘species richness in landscapes’ (Z7), which aims to monitor vascular plant diversity at the landscape scale in Switzerland as a whole. Based on the national coordinate system of 41 285 1-km² cells, a stratified random sample of 520 1-km² quadrats was laid out across Switzerland (Pearman & Weber 2007). Excluding quadrats of 100% water surface, as well as quadrats that were too rugged for safe field work, 451 quadrats were surveyed for vascular plants.

Based on the existing national coordinate system, a continuous transect of 2.5 km length was placed as close as possible to the quadrat diagonal (Plattner, Birrer & Weber 2004). The transect was designed to follow existing trails, streets or paths. In rough terrain where no pre-existing paths could be followed, transects were identified and permanently marked with yellow spots on trees or rocks. During each growing season, about 100 transects were surveyed, each by one of a crew of currently a dozen trained field botanists, with each transect being surveyed within 1 year between 2004 and 2008. Botanists follow the established transects and walk them back and forth, recording all vascular plants within 2.5 m on either side of the transect.

To accommodate different flowering phenologies, field botanists sample transects once in spring and once in late summer (Plattner, Birrer & Weber 2004; Pearman & Weber 2007). These two surveys are conducted in the same year in a given quadrat. Survey dates at different elevations were specified according to the known average length of the vegetation period (see Table S1 in Supporting Information). For 36 quadrats at very high elevation (above 2200 m), that is 8% of the total of 451 quadrats, only a single survey was carried out, between 10 July and 25 August. We note that this lack of replication is not a problem for our modelling. It simply means that these sites do not contribute any information about the magnitude and seasonal profile of detection probability.

The desire to formally extrapolate from our study species to the larger number of all vascular plant species in Switzerland was very important to us. Currently, about 3000 vascular plant species are known in Switzerland (Landolt et al. 2010). By 2008, nearly 1700 species had been recorded within the Swiss BDM Z7 quadrats. To obtain an estimate of the average magnitude of detection errors in the Swiss flora, we randomly sampled 100 plant species from among all the 1700 species that were detected. Then, to understand the factors affecting detection probability, including LF, we applied a constrained randomization in the selection of our study species, by restricting our sampling to species that were detected in at least 18 quadrats. This left us with 886 species, from which we randomly chose 25 from each of four LFs: grass, forb, shrub and tree.

Statistical methods

We applied a multispecies site-occupancy model (Dorazio & Royle 2005; Russell et al. 2009; Zipkin, Dewan & Royle 2009) to our data for the two samples of 100 species from 451 sites. This framework formally accommodates false-negative detections by distinguishing a latent state of occurrence, $z$, which is modelled jointly with a binomial detection process that describes the error-prone mapping of $z$ onto the observed detection/nondetection data $y$ (Royle & Dorazio 2008; Kéry 2011). Specifically, let $z_{i,k}$ be the latent occurrence state at quadrat i (i = 1, 2, …, 451) of species k (k = 1, 2, …, 100), such that $z_{i,k} = 1$ denotes presence and $z_{i,k} = 0$ denotes absence. Our basic model for the ecological process underlying the true pattern of occurrence of our study species, $z_{i,k}$, is then a Bernoulli random variable,

$$z_{i,k} \sim \mathrm{Bernoulli}(\psi_{i,k}) \qquad \text{(eqn 1)}$$

where $\psi_{i,k}$ is the probability of occurrence of species k at quadrat i. Conditional on the outcome of that Bernoulli random variable, that is $z = 1$ (presence) or $z = 0$ (absence), the observation process is modelled as another Bernoulli random variable. In doing so, we assume that there are no false-positive records: a species can be recorded only where it is present. Hence, for the observed detection/nondetection data, $y_{i,j,k}$, for quadrat i (i = 1, 2, …, 451), replicate survey j (j = 1, 2) and species k (k = 1, 2, …, 100), we assume

$$y_{i,j,k} \mid z_{i,k} \sim \mathrm{Bernoulli}(z_{i,k}\, p_{i,j,k}) \qquad \text{(eqn 2)}$$

where $p_{i,j,k}$ is the detection probability for species k (k = 1, 2, …, 100) at quadrat i (i = 1, 2, …, 451) during survey j (j = 1, 2). We note that although the surveys in the 451 quadrats were conducted over multiple years, the replicate surveys from a particular quadrat were within the same year; thus, the closure assumption was not violated.
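The two-level structure of eqns 1 and 2 can be illustrated with a small simulation. The sketch below is not the model fitted in this study: it assumes a single hypothetical species with occupancy and detection probabilities that are constant across quadrats and surveys, and the function name and the values of psi and p are invented for illustration. It shows only how false-negative errors make the observed number of occupied quadrats fall short of the true number.

```python
import random


def simulate_detections(psi, p, n_sites=451, n_surveys=2, seed=1):
    """Simulate detection/nondetection data for one hypothetical species.

    psi: occupancy probability (assumed constant across quadrats)
    p:   per-survey detection probability (assumed constant)
    """
    rng = random.Random(seed)
    # Ecological process (eqn 1): latent presence/absence per quadrat.
    z = [1 if rng.random() < psi else 0 for _ in range(n_sites)]
    # Observation process (eqn 2): a detection is possible only where z = 1,
    # so there are no false positives by construction.
    y = [
        [1 if (zi == 1 and rng.random() < p) else 0 for _ in range(n_surveys)]
        for zi in z
    ]
    return z, y


z, y = simulate_detections(psi=0.4, p=0.5)
true_occupied = sum(z)
observed_occupied = sum(1 for surveys in y if any(surveys))
print("truly occupied quadrats:   ", true_occupied)
print("quadrats with a detection: ", observed_occupied)
```

Because detections occur only in occupied quadrats, the observed count can never exceed the true count, which is why uncorrected occupancy estimates are biased low.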

We introduced effects of covariates to accommodate spatial and taxonomic variation in occupancy probability and spatial, temporal and taxonomic variation in detection probability, or equivalently, to test for the effects of the associated covariates. Owing to the extreme altitudinal gradient in Switzerland, elevation serves as a covariate that summarizes the effects of a very large number of environmental variables that act more directly on the probability of species occupancy. To account for that and to allow for nonmonotonic relationships, we fitted the following model for occupancy with a quadratic effect for elevation (E):

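$$\mathrm{logit}(\psi_{i,k}) = \alpha_{00,LF(k)} + \alpha_{0,k} + \alpha_{1,k} E_i + \alpha_{2,k} E_i^2 \qquad \text{(eqn 3)}$$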

Here, $\alpha_{00,LF(k)}$ denotes the effect of the LF of species k on the probability of occupancy of species k at quadrat i, $\alpha_{0,k}$ is the deviation of species k from the LF to which it belongs, $\alpha_{1,k}$ and $\alpha_{2,k}$ are the linear and quadratic effects of elevation for species k, and $E_i$ is the mean elevation of quadrat i (i = 1, 2, …, 451). For the random sample of 100 plant species from 1700 species, we did not include the effects of the LF of species k (see Appendix S1 in Supporting Information).
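As a minimal sketch of how eqn 3 translates into an occupancy probability, the code below evaluates the logit-linear predictor for one species and locates the elevation at which occupancy peaks. The function names and coefficient values are hypothetical and elevation is assumed to be standardized; the peak at $-\alpha_{1,k} / (2\alpha_{2,k})$ follows from the quadratic form and exists only when $\alpha_{2,k} < 0$.

```python
import math


def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def occupancy_probability(elev_std, a_lf, a0, a1, a2):
    """Occupancy probability from eqn 3 at standardized elevation elev_std.

    a_lf: life-form effect, a0: species deviation from its life-form,
    a1, a2: linear and quadratic elevation effects (all hypothetical here).
    """
    return inv_logit(a_lf + a0 + a1 * elev_std + a2 * elev_std ** 2)


def optimum_elevation(a1, a2):
    """Standardized elevation with the highest occupancy, or None if the
    quadratic has no interior maximum (a2 >= 0)."""
    if a2 >= 0:
        return None
    return -a1 / (2.0 * a2)


# Hypothetical coefficients for one species.
coef = {"a_lf": -1.0, "a0": 0.5, "a1": 1.0, "a2": -0.5}
e_opt = optimum_elevation(coef["a1"], coef["a2"])
print("optimum (standardized) elevation:", e_opt)
print("occupancy at optimum:", occupancy_probability(e_opt, **coef))
```

The optimum is on the standardized scale; converting it to metres would require the mean and standard deviation of quadrat elevation used for standardization.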

Similarly, we modelled the effects of LF, species, elevation, survey date (D) and their interactions on detection probability with a logit-linear function:

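$$\mathrm{logit}(p_{i,j,k}) = \beta_{0,LF(k)} + \beta_{1,k} + \beta_{2,k} E_i + \beta_{3,k} E_i^2 + \beta_{4,k} D_{i,j} + \beta_{5,k} D_{i,j}^2 + \beta_{6,k} E_i D_{i,j} + \beta_{7,k} E_i^2 D_{i,j} + \beta_{8,k} E_i D_{i,j}^2 \qquad \text{(eqn 4)}$$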

Here, $\beta_{0,LF(k)}$ denotes the effect of the LF of species k, $\beta_{1,k}$ is the deviation of species k from the mean value for its LF, $\beta_{2,k}$ through $\beta_{8,k}$ are the effects of elevation, elevation squared, date of the survey, date squared and their interactions, and $D_{i,j}$ is the survey date at quadrat i (i = 1, 2, …, 451) during survey j (j = 1, 2). Originally, we intended to fit a model with a full interaction of all covariates, that is including an effect of $E_i^2 D_{i,j}^2$. However, we never obtained numerical convergence for this model, so we removed that term from the model specification. For the random sample of 100 plant species from 1700 species, we included only the effects of date of the survey and date squared (see Appendix S1). We standardized all covariate data for the analyses.
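The following sketch shows how a detection probability could be computed from standardized covariates under a logit-linear model of this kind. Several choices are assumptions rather than details taken from the analysis: the coefficient names, the assignment of the three interaction terms (E·D, E²·D and E·D²), the use of the population standard deviation for standardization and all numerical values.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Centre and scale a covariate (population SD; an assumption here)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def detection_probability(e, d, beta):
    """Detection probability at standardized elevation e and date d.

    beta is a dict of hypothetical coefficients on the logit scale:
    'lf', 'species', 'E', 'E2', 'D', 'D2', 'ExD', 'E2xD', 'ExD2'.
    """
    eta = (
        beta["lf"] + beta["species"]
        + beta["E"] * e + beta["E2"] * e ** 2
        + beta["D"] * d + beta["D2"] * d ** 2
        + beta["ExD"] * e * d
        + beta["E2xD"] * e ** 2 * d
        + beta["ExD2"] * e * d ** 2
    )
    return 1.0 / (1.0 + math.exp(-eta))


# Two illustrative survey dates (day of year), standardized.
d_first, d_second = standardize([148, 236])

# Hypothetical species that becomes easier to detect later in the season.
beta = {"lf": 1.0, "species": 0.2, "E": -0.3, "E2": -0.1,
        "D": 0.8, "D2": 0.0, "ExD": 0.0, "E2xD": 0.0, "ExD2": 0.0}
print("first survey: ", detection_probability(0.0, d_first, beta))
print("second survey:", detection_probability(0.0, d_second, beta))
```

With a positive date coefficient and no interactions, detection at a given elevation is higher in the second survey, which is the kind of seasonal contrast the model is designed to capture.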

Consistent with the scope of our study and with the sampling scheme, which resulted in the 100 randomly selected species, we treated all parameters indexed by k as random effects, that is as draws from a prior distribution whose parameters we estimated. Specifically, we made the assumption that all sets of species-specific random effects come from normal distributions with mean $\mu$ and variance $\sigma^2$ that were both estimated. The only exceptions were the LF effects $\alpha_{00,LF(k)}$ and $\beta_{0,LF(k)}$, which were treated as independent (i.e. fixed) effects. We further note that the species-specific intercepts, $\alpha_{0,k}$ and $\beta_{1,k}$, were expressed as deviations from the LF means; therefore, the prior distributions for these parameters were centred on zero.

We chose a Bayesian analysis of the model and used vague priors that were meant to introduce little or no information about the estimated parameters. Specifically, we chose uniform distributions, U (a, b), for all parameters, with a and b sufficiently wide as to not affect the posterior distributions. For the variance parameters on the scale of the standard deviation, a was zero (reflecting the fact that a variance cannot be negative) (see Appendix S2 for a description of the model in the BUGS language).

We carried out the analysis in WinBUGS 1.4.3 (Lunn et al. 2000; Spiegelhalter et al. 2003), which we called from R through the package R2WinBUGS (Sturtz, Ligges & Gelman 2005). We ran three Markov chains for 100 000 iterations each, discarded the first half as a burn-in and thinned by one in 50. The Gelman–Rubin R-hat statistic (Gelman & Rubin 1992) indicated acceptable convergence for all parameters (i.e. R-hat values were between 1.0 and 1.1 for all primary structural parameters of the model). We report posterior means as point estimates and the 2.5% and 97.5% percentiles of the posterior samples as 95% Bayesian credible intervals (CRI). We conducted a Bayesian analogue to a significance test by checking whether the CRI for a parameter contained zero, in which case we assumed non-significance. Further, we assumed non-significance of the difference in detection probability between two LFs if the CRI for the derived difference in detection probability for the two LFs contained zero. Finally, using the maximum per-visit detection probability of each species (i.e. the higher of the detection probabilities for the first and the second surveys), we estimated the minimal number of surveys required to detect a species with a probability of 95% during the optimal survey season (McArdle 1990).
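The credible-interval check described above can be sketched as follows. The posterior draws are simulated from normal distributions purely for illustration, the function names are hypothetical and the number of draws is arbitrary. The percentiles are taken with `statistics.quantiles`, whose 40-quantile cut points fall at 2.5% steps.

```python
import random
from statistics import mean, quantiles


def central_interval(samples):
    """Return the 2.5% and 97.5% percentiles of a set of posterior draws."""
    cuts = quantiles(samples, n=40, method="inclusive")  # 39 cut points
    return cuts[0], cuts[-1]


def is_significant(samples):
    """Bayesian analogue of a significance test: True if the central 95%
    interval of the draws excludes zero."""
    lower, upper = central_interval(samples)
    return not (lower <= 0.0 <= upper)


# Simulated draws for two hypothetical coefficients on the logit scale.
rng = random.Random(42)
draws_clear = [rng.gauss(0.8, 0.2) for _ in range(3000)]
draws_weak = [rng.gauss(0.05, 0.3) for _ in range(3000)]

for name, draws in (("clear effect", draws_clear), ("weak effect", draws_weak)):
    lower, upper = central_interval(draws)
    print(f"{name}: mean={mean(draws):.2f}, "
          f"CRI=({lower:.2f}, {upper:.2f}), significant={is_significant(draws)}")
```

The same check applies to derived quantities, such as the difference in detection probability between two LFs, by computing the difference draw by draw before taking the percentiles.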


Results

Based on the random sample of 100 plant species from the 1700 species ever detected by the Swiss BDM, detection probability during the first survey ranged from 0.03 to 0.99 (median 0.74, Fig. 1a) and during the second survey from 0.03 to 0.99 (median 0.82, Fig. 1b).

Figure 1.

Frequency distribution of the species-specific detection probability for the first survey (a) and the second survey (b) under the multispecies site-occupancy model analysis in Appendix S1. These results are from the random sample of 100 plant species with at least one observed occurrence in the Swiss Biodiversity Monitoring (BDM). The vertical line indicates the median of the species-specific detection probability for these 100 plant species.

For the 100 plant species that had at least 18 observed occurrences, combining both annual surveys, the observed number of occupied quadrats ranged from 18 to 368 (among 451), which translated into apparent occupancy estimates of 4.0–81.6%. Correcting for imperfect detection, the estimates of the number of occupied sites among the 451 study sites under our site-occupancy model ranged from 19 to 370. The difference between the estimated and the observed number of occupied quadrats ranged from 0 to 34 (mean 5) quadrats, representing a negative relative bias of 0–43.8% (mean 6.6%). Thus, the overall detection error in the Swiss BDM for both surveys combined was relatively small in most of these fairly widespread species.

However, for a single survey, the magnitude of detection errors was substantial and depended on the survey season. Detection probability during the first survey varied from 0.02 to 0.99 (median 0.87) and during the second survey from 0.01 to 1 (median 0.89) (see Table S2 for estimates for all 100 species). For the shrub Rubus fruticosus, for example, detection probability during the first survey was 0.98 and increased to 1.0 during the second survey. For the grass Digitaria ischaemum, detection probability during the first survey was 0.07 and increased to 0.63 during the second survey. For the forb Oxalis acetosella, detection probabilities during the first and the second surveys were both 0.97. However, for another forb species, Sherardia arvensis, detection probabilities during the first and the second surveys were both 0.55.
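To see how per-survey values combine over the two visits, the sketch below computes the probability of detecting a species at least once in an occupied quadrat. It assumes that the two surveys are independent and that the reported per-survey values apply to the quadrat in question, which ignores their dependence on elevation and date; the function name is hypothetical.

```python
def combined_detection(p_first, p_second):
    """Probability of at least one detection over two independent surveys."""
    return 1.0 - (1.0 - p_first) * (1.0 - p_second)


# Per-survey detection probabilities quoted in the text.
species = {
    "Rubus fruticosus": (0.98, 1.0),
    "Digitaria ischaemum": (0.07, 0.63),
    "Oxalis acetosella": (0.97, 0.97),
    "Sherardia arvensis": (0.55, 0.55),
}

for name, (p1, p2) in species.items():
    print(f"{name}: {combined_detection(p1, p2):.4f}")
# Digitaria ischaemum -> 0.6559, Sherardia arvensis -> 0.7975
```

Under these assumptions, a species such as Sherardia arvensis would still be missed in roughly one in five occupied quadrats even after both surveys.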

The mean per-survey detection probability for grasses, forbs, shrubs and trees, respectively, was 0.77, 0.84, 0.87 and 0.88. Detection probabilities were not significantly different among forbs, shrubs and trees, nor between grasses and forbs (Fig. 2). Detection probability differed significantly between grasses on the one hand and shrubs and trees on the other (Fig. 2).

Figure 2.

Detection probability for the four life-forms (LFs) under the multispecies site-occupancy model analysis in Appendix S2. These results are from the stratified random sample of 100 plant species with at least 18 detections in the Swiss Biodiversity Monitoring (BDM) survey. For each LF, the solid circle shows the estimated mean detection probability, and the 2.5% and 97.5% percentiles of the posterior samples are shown to allow comparison of detection probabilities among the four LFs. LFs are coded as 1 (forb), 2 (grass), 3 (shrub) and 4 (tree).

Elevation, survey season or both had significant effects on the detection probability of 60 of the 100 studied species. Among these species, the detection probability of 30 species varied significantly with elevation (see Table S2). For 42 species, detection probability at the optimum elevation (for occurrence) during the first survey differed significantly from that during the second survey (Fig. 3). In 11 species, the joint effect of elevation and survey season (i.e. at least one of the interaction terms) was significant for detection probability (see Table S2).

Figure 3.

Per-survey detection probabilities of 100 Swiss plant species during both the first survey and the second survey. These results are from the stratified random sample of 100 plant species with at least 18 detections in the Swiss Biodiversity Monitoring (BDM) survey. Estimates were based on the site-occupancy model analysis in Appendix S2. Per-survey detection probabilities were based on the elevation where a species had the highest estimated occupancy probability. In 12 species (square), detection probability was higher in the first than in the second survey. In 30 species (circle), detection probability was higher during the second than during the first survey. In 58 species (triangle), detection probabilities did not change significantly over the season, so their estimates fall on the 1 : 1 line.

For example, during the first survey, below 1500-m elevation, detection probability of the forb Galeopsis tetrahit was high, while above 1500 m, detection probability decreased with elevation. During the second survey, below 2300-m elevation, detection probability of this species was uniformly high and did not change with elevation, while above 2300 m, detection probability decreased strongly (Figs 4a and 5). For the grass Luzula campestris, above 1000-m elevation, detection probability in the first survey was relatively high and did not change very much with elevation. During the second survey, however, detection probability of L. campestris was lower than that of the first survey and increased significantly with elevation in the area below 3000 m (Figs 4b and 5). Detection probability of another forb species, Ranunculus ficaria, did not change with elevation in either survey. However, overall detection in this species differed considerably between surveys (Figs 4c and 5). Detection probabilities of R. ficaria at the optimum elevation (with highest estimated occupancy probability) during the first and second surveys were 0.52 and 0.01, respectively.

Figure 4.

Effects of survey date and elevation on detection probability for Galeopsis tetrahit (a), Luzula campestris (b) and Ranunculus ficaria (c) in Switzerland based on the multispecies site-occupancy model analysis in Appendix S2. These three species were from the stratified random sample of species with at least 18 detections. For each species, detection probability was predicted using only those covariates for which the central 95% interval of the posterior samples of the parameter did not contain zero. Survey date was the standardized Julian date. The standardized Julian date for the first and second surveys was 148 and 236, corresponding to 28 May and 24 August, respectively. Elevation was the mean elevation of the quadrat surveyed.

Figure 5.

Maps of projected occupancy and detection probability for Galeopsis tetrahit (three maps in the top row), Luzula campestris (three maps in the middle row) and Ranunculus ficaria (three maps in the bottom row) in Switzerland based on the multispecies site-occupancy model analysis in Appendix S2. These three species are from the stratified random sample of species with at least 18 detections in the Swiss Biodiversity Monitoring (BDM) survey. To express the effects of survey season on detection probability, we used values of minus and plus one standard deviation of standardized date for the first and second surveys, which correspond to 28 May and 24 August, respectively. We then projected on the map the relationship between elevation and detection probability in the first survey (second column) and in the second survey (third column).

For 92 of the 100 studied species that had at least 18 observed occurrences, the maximum per-visit detection probability was higher than 0.7, and two surveys would be sufficient to detect their occurrence in a quadrat with a probability of 95%. For another three species, the maximum per-visit detection probabilities were between 0.6 and 0.7, and three visits would be required to detect these species with a probability of 95%. Finally, for five species, per-visit detection probability was < 0.6, and four visits would be required to detect them with a probability of 95%.
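The number of visits needed for a given per-visit detection probability can be sketched with the usual geometric argument: the probability of missing an occupied quadrat on all n independent visits is (1 − p)^n, so n must satisfy 1 − (1 − p)^n ≥ 0.95. The function name and the example values of p are illustrative, and visits are assumed to be independent with constant p.

```python
import math


def min_surveys(p, target=0.95):
    """Smallest number of independent visits n with 1 - (1 - p)**n >= target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 1
    n = math.log(1.0 - target) / math.log(1.0 - p)
    return max(1, math.ceil(n))


for p in (0.78, 0.65, 0.55):
    print(f"p = {p:.2f}: {min_surveys(p)} visits")
```

Under this formula, two visits reach the 95% target once p exceeds about 0.78, so the groupings reported above may reflect rounding or a slightly different rule.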


Discussion

On the basis of two samples of 100 plant species each, we assessed the large-scale patterns of imperfect detection over an entire country. Detection was less than one for most species, suggesting caution in the interpretation of distribution studies of plants or other sessile organisms that do not formally control for detection probability. Our results emphasize the need for field protocols that accommodate estimation of detection error in the design of new schemes.

Despite sampling randomly, we still expect that our two samples of 100 plant species are probably biased high with respect to the average detection probability of all Swiss plant species. Species with a restricted distribution are often also locally rare (Gaston & Lawton 1990), and local abundance is an essential factor affecting detection probability (Royle & Nichols 2003). It is understood that nearly 40% of plant species of the Swiss flora have not been recorded within the Swiss BDM Z7 quadrats; thus, we expect that there are many elusive and rarer species that have not been detected. It is reasonable to assume that our results are biased high for the overall average detection probability and even more so in our stratified random sample, when considering the Swiss flora as a whole. In other words, had substantial data also been available for the very rare species, we might have found lower average detection probability than in the current study.

Taxonomic patterns in detection probability

In the Swiss BDM program, observers walking a 2500-m transect were more likely to overlook grasses than members of the other three LFs (forbs, shrubs and trees). The low detection probability of grasses may in part be explained by morphology: grasses represent an elusive gestalt for detection. Trees and shrubs, in contrast, are large and therefore relatively distinctive to observers, resulting in a rather high detection probability. We note, however, that juvenile individuals of trees or shrubs may be as hard to detect as a forb. For example, observers are unlikely to overlook a shrub such as R. fruticosus along the transect, since this species has distinctive flowers and fruits and typically grows in large clumps. It is therefore intuitively reasonable that this species had a detection probability near 1.0 during both surveys. Grasses, in contrast, are much smaller and have an elusive gestalt, resulting in lower detection probabilities. Nevertheless, many forbs are as small as many grasses, and yet forbs generally had higher detection probability than grasses. For example, Oxalis acetosella, a very small forb much smaller than most grasses, had a high per-visit detection probability of 0.97 during both surveys. Its white flowers and its distinctive and often extensive vegetative parts probably contributed to its high detection probability.

In our case, grasses consisted mostly of species within the families Gramineae and Cyperaceae. It is possible that grasses had the lowest detection probability because field botanists had difficulty distinguishing the subtle morphological differences among these species. The likelihood of false-positive errors caused by misclassification of grasses might thus increase, or doubtful cases may have been ignored, leading to increased false-negative detection errors. In contrast, for shrubs and trees, the morphological differences among species were much greater. Field botanists could thus identify and detect shrubs and trees more easily. In short, compared with the other three LFs, grasses often lacked distinctive flowers, fruits or vegetative parts, which, in addition to their relatively small size, could partly explain their low detection probability.

Additionally, local abundance may also explain the detection differences among species. Considering the positive relationship between detection and abundance (Royle & Nichols 2003), it was reasonable for R. fruticosus to have detection probabilities of essentially 1.0 during both surveys, since this shrub is widespread and common on transects of Swiss BDM and typically grows in extensive clumps. Similarly, Sherardia arvensis had a low detection probability of 0.55 during both surveys because this species is cryptic and rare.

Nevertheless, plant traits such as size and survival strategies (e.g. geophyte, hemicryptophyte and therophyte), rather than membership of a specific LF, may have greater effects on detection probability. Further studies should thus be designed to develop an index that quantifies the gestalt of a species during field surveys as relevant to detection error. This may enable researchers to select a field protocol and analysis method that allow imperfect detection to be estimated more directly.

Temporal patterns in detection probability

Survey season had effects on the detection probability of 42 species in our study. This is probably because during these two time periods, plants were often in different life stages (e.g. flowering or wilted to the ground); thus, their gestalt was very different. One group of grasses flowers in spring (e.g. Alopecurus pratensis); thus, they have higher detection probabilities during the first survey (see Table S2). Another group flowers in late summer (e.g. Digitaria ischaemum); thus, these species have higher detection probabilities during the second survey (see Table S2).

In the other three LFs, different life stages could also explain the differences in detection between surveys. For example, the yellow flowers of Ranunculus ficaria are distinctive in the spring, and its above-ground parts often wilt back in the summer. Accordingly, R. ficaria was detected with a probability of 0.52 during the first survey, but with a probability of only 0.01 during the second survey. The projected detection maps clearly showed the temporal pattern of detection probabilities of R. ficaria over Switzerland (Fig. 5).

Spatial patterns in detection probability

In a mountainous country such as Switzerland, elevation has repeatedly been found to be an extremely important factor determining the patterns of biodiversity (Wohlgemuth et al. 2008; Randin et al. 2009). Elevation can be considered a catch-all surrogate for a large number of environmental factors in Switzerland (Kéry, Gardner & Monnerat 2010a).

We believe that the effects of elevation on detection probability can be interpreted in two ways that are not mutually exclusive. On the one hand, for some common species, populations at different elevations might be in different life stages at a given survey time and thus differ in morphology, which leads to a different gestalt for detection. In the first survey, detection probability of G. tetrahit was low above 1500 m and decreased with elevation. In the second survey, detection probability was high and did not change with elevation below 2300 m, but it was low above 2300 m (Figs 4a and 5). This might be because, during the first survey, populations below 1500 m are in the flowering stage, while populations between 1500 and 2300 m are still in the vegetative stage, since high-elevation areas are colder than low-elevation areas. During the second survey, in summer, populations below 2300 m are in the flowering or fruiting stage at both low and high elevations and thus have a similar chance of being detected (Figs 4a and 5). For the grass Luzula campestris, detection probability in the second survey increased with elevation (Figs 4b and 5). This might be because, in late summer, the above-ground parts of L. campestris wilt away in low-elevation areas, while the species maintains its above-ground parts in higher-elevation areas.

On the other hand, species usually find their habitat requirements best met at a certain elevation, leading to distinct elevation profiles in abundance. Via the abundance–detection relationship exploited formally in the heterogeneity site-occupancy model of Royle & Nichols (2003), abundance is probably at the root of such elevation patterns in detection probability. Detection probability of G. tetrahit was higher below 2300 m than above 2300 m (Figs 4a and 5). A plausible explanation is that G. tetrahit prefers habitats below 2300 m in Switzerland, so that its abundance is relatively high in these areas, which contributes to a high detection probability. Above 2300 m, however, G. tetrahit becomes rare, local abundance decreases and detection probability is low (Figs 4a and 5).
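
The abundance–detection link invoked here can be made concrete with a short sketch. It assumes, in the spirit of the Royle & Nichols (2003) formulation, that each of N individuals at a site is detected independently with the same probability r, so that the species is detected with probability p = 1 − (1 − r)^N. The function name and the value r = 0.05 are illustrative assumptions, not estimates from the Swiss BDM data.

```python
def species_detection_probability(r: float, n_individuals: int) -> float:
    """Probability of detecting a species at a site holding n_individuals plants,
    assuming each individual is detected independently with probability r."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    # The species is missed only if every single individual is missed.
    return 1.0 - (1.0 - r) ** n_individuals


# Hypothetical per-individual detection probability (assumption for illustration only).
R_ASSUMED = 0.05

if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        p = species_detection_probability(R_ASSUMED, n)
        print(f"N = {n:3d}  ->  p = {p:.3f}")
```

Under these assumptions, detection probability rises monotonically with local abundance, so an elevation profile in abundance translates directly into an elevation profile in detection probability.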

Accounting for imperfect detection of plants in ecology and biodiversity management

It is a widely held belief among ecologists that imperfect detection is not an issue for sessile organisms, including plants (Araujo & Guisan 2006). However, our study revealed that a substantial degree of imperfect detection was common in plants as well when only a single survey is considered, which may be the more typical protocol in studies collecting data used to model plant distributions. Without correcting for this systematic error, conventional species distribution models risk modelling the apparent distribution, that is, the combined pattern of occurrence and detection, rather than the true distribution pattern (Kéry 2011).
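
The difference between apparent and true distribution can be illustrated with a small numerical sketch. It assumes the basic single-survey setting of a site-occupancy model, with occupancy probability psi and detection probability p and no false-positive records. The function names and the example values (psi = 0.6, p = 0.5) are hypothetical and chosen only for illustration.

```python
def apparent_occupancy(psi: float, p: float) -> float:
    """Probability that the species is recorded at a site in a single survey:
    the site must be occupied (psi) and the species detected (p)."""
    return psi * p


def occupancy_given_nondetection(psi: float, p: float) -> float:
    """Probability that a site is occupied although the species was not
    recorded there in a single survey (Bayes' rule)."""
    not_recorded = 1.0 - psi * p      # all sites without a record
    if not_recorded == 0.0:
        raise ValueError("nondetection is impossible when psi = p = 1")
    missed = psi * (1.0 - p)          # occupied sites where the species was overlooked
    return missed / not_recorded


if __name__ == "__main__":
    psi, p = 0.6, 0.5                 # hypothetical values
    print(f"apparent occupancy: {apparent_occupancy(psi, p):.2f}")
    print(f"P(occupied | not detected): {occupancy_given_nondetection(psi, p):.2f}")
```

With these hypothetical values, the species would be recorded at only half of the sites it occupies, and a substantial share of the apparent absences would in fact be overlooked presences.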

Naturally, with more surveys, the combined detection error will decrease compared with the single-survey detection error. Thus, detection probability may indeed be close to one in well-surveyed areas; that is, the presences observed over a long period of time will reflect the true occurrence pattern of a plant species more accurately. This can explain why the conclusions of conventional distribution studies based on large samples of floras or herbarium records may not have been very biased, even though these studies did not deal with imperfect detection explicitly (Myers et al. 2000; Qian & Ricklefs 2007). However, it is hard to know how biased the estimates are when detection errors cannot be estimated formally. Moreover, the fact that per-visit detection probability was less than one for most species emphasizes the need to incorporate detection probability explicitly into future modelling of species distributions in order to infer species absence reasonably. Grasses had the lowest detection probability among the four LFs. With the same number of visits, grass species might therefore have a higher probability of false absences in ordinary plant inventories. Also, the effect of survey time on the detection probability of R. ficaria suggests that a nondetection of the species in summer may simply be a ‘false absence’, since the species is then not available for detection. A related case of nondetection because of temporal unavailability in a perennial plant is Mead's milkweed (Asclepias meadii), which may be dormant in a particular year (Alexander, Slade & Kettle 1997). Given that many historical distribution data sets have no replicate surveys, the site-occupancy approach to estimate detection probability, and therefore to obtain unbiased estimates of occurrence, may not always be an option (though see, for instance, Kéry et al. 2010b, who extracted replicated detection–nondetection data from what is essentially a multispecies presence-only data base). In addition, under some conditions, it may be possible to estimate occupancy and detection separately from single-survey data by assuming strong covariate relationships (Lele, Moreno & Bayne 2012).
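
The decrease of the combined detection error with additional surveys can be sketched as follows, assuming that surveys are independent and that the species is present throughout. The function names are hypothetical; the example values 0.52 and 0.01 are the per-survey detection probabilities reported above for R. ficaria, and the 0.95 target is an arbitrary illustrative choice.

```python
from math import prod


def combined_detection(p_surveys):
    """Probability of detecting a species at least once over several
    independent surveys with per-survey detection probabilities p_surveys."""
    return 1.0 - prod(1.0 - p for p in p_surveys)


def surveys_needed(p: float, target: float) -> int:
    """Smallest number of surveys with constant detection probability p
    for which the combined detection probability reaches the target."""
    if not 0.0 < p <= 1.0 or not 0.0 < target < 1.0:
        raise ValueError("require 0 < p <= 1 and 0 < target < 1")
    k = 1
    while 1.0 - (1.0 - p) ** k < target:
        k += 1
    return k


if __name__ == "__main__":
    # Two surveys with the detection probabilities reported for R. ficaria.
    print(f"combined detection: {combined_detection([0.52, 0.01]):.4f}")
    # Surveys needed if every survey were as good as the first one.
    print(f"surveys needed for 0.95: {surveys_needed(0.52, 0.95)}")
```

Under the independence assumption, a second survey at an unfavourable time adds very little to the combined detection probability, whereas repeated surveys at a favourable time reduce the false-absence probability quickly.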

The mechanisms underlying the spatial patterns in detection probability in our study may appear trivial and readily understood (they are probably mainly due to life stage and abundance), but the implications for species distribution modelling are not trivial at all. Elevation, as a factor that affects both occurrence and detection in a majority of species, is precisely one of the factors that cannot be standardized either at the design stage or at the analysis stage of a study. Unless inference on plant distribution is based on a site-occupancy model, which allows the effects of such a factor on occurrence and on detection to be modelled separately, species distribution studies will be automatically biased, and sometimes seriously so, by all the patterns that we found in our species detection maps (Fig. 5).
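
How a shared covariate confounds the apparent distribution can be illustrated with a minimal sketch. It assumes that both occupancy and single-survey detection probability decline with elevation along logistic curves; all coefficients and function names are hypothetical and are not estimates from our models.

```python
from math import exp


def logistic(x: float) -> float:
    """Inverse logit link."""
    return 1.0 / (1.0 + exp(-x))


def true_occupancy(elev_m: float) -> float:
    """Hypothetical occupancy probability declining with elevation (m)."""
    return logistic(1.0 - 0.001 * (elev_m - 1000.0))


def detection(elev_m: float) -> float:
    """Hypothetical single-survey detection probability declining with elevation (m)."""
    return logistic(2.0 - 0.002 * (elev_m - 1000.0))


def apparent_occupancy(elev_m: float) -> float:
    """Probability of recording the species in a single survey:
    the product of occupancy and detection at that elevation."""
    return true_occupancy(elev_m) * detection(elev_m)


if __name__ == "__main__":
    print("elev (m)  true  detection  apparent")
    for elev in range(500, 3000, 500):
        print(f"{elev:8d}  {true_occupancy(elev):.2f}  "
              f"{detection(elev):9.2f}  {apparent_occupancy(elev):8.2f}")
```

In this hypothetical example, the apparent occupancy declines far more steeply with elevation than the true occupancy, so a conventional model fitted to the raw records would overstate the elevation effect on occurrence.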

The Swiss BDM enabled us to use site-occupancy models with plant monitoring data and thereby to estimate plant detection probability explicitly. But there are many obstacles to the widespread use of site-occupancy models in distribution studies. Lack of replicated observations may often prevent the use of site-occupancy models. It is possible to deduce replicated observations from checklist-type data (Kéry, Gardner & Monnerat 2010a; Kéry et al. 2010b) to improve models of distributions based on flora checklist data. However, survey-specific differences in effort, skill, methodology and other covariates might cause differences in detection. It is therefore necessary to explore the effects of such detection heterogeneity on the estimation of occupancy. The different detection patterns in the two survey seasons show the extent to which occupancy estimates would be biased if the effects of survey date on detection were not modelled explicitly (Fig. 5).

In conclusion, our study shows that, in a broad-scale monitoring program, imperfect detection was more common than not. We presume that the problem will be much more widespread, and more severe, in the data sets that are normally used for plant species distribution modelling. There is now a large number of articles and books dealing with survey designs that properly accommodate detection probability in population analyses, including Williams, Nichols & Conroy (2002), MacKenzie et al. (2006), Royle & Dorazio (2008) and Kéry & Schaub (2012). We therefore argue that imperfect detection should be estimated whenever possible, even in distribution studies of plants and other sessile organisms, to gain control over errors that may compromise the results of species distribution studies.


We thank the Swiss Federal Office for the Environment (FOEN) for providing the data of the Swiss BDM. We thank Tobias Roth for comments and suggestions on an earlier version of the manuscript. We thank Thomas Stalling for helping G.C. and M.K. to become familiar with the survey work of the Swiss BDM in the field. We thank Bernhard Schmid for supporting G.C. during his visit to Switzerland.