Importance of sampling design and analysis in animal population studies: a comment on Sergio et al.




  1. The use of predators as indicators and umbrellas in conservation has been criticized. In the Trentino region, Sergio et al. (2006; hereafter SEA) counted almost twice as many bird species in quadrats located in raptor territories than in controls. However, SEA detected astonishingly few species. We used contemporary Swiss Breeding Bird Survey data from an adjacent region and a novel statistical model that corrects for overlooked species to estimate the expected number of bird species per quadrat in that region.
  2. There are two anomalies in SEA which render their results ambiguous. First, SEA detected on average only 6·8 species, whereas a value of 32 might be expected. Hence, they probably overlooked almost 80% of all species. Secondly, the precision of their mean species counts was greater in two-thirds of cases than in the unlikely case that all quadrats harboured exactly the same number of equally detectable species. This suggests that they detected consistently only a biased, unrepresentative subset of species.
  3. Conceptually, expected species counts are the product of true species number and species detectability p. Plenty of factors may affect p, including date, hour, observer, previous knowledge of a site and mobbing behaviour of passerines in the presence of predators. Such differences in p between raptor and control quadrats could have easily created the observed effects. Without a method that corrects for such biases, or without quantitative evidence that species detectability was indeed similar between raptor and control quadrats, the meaning of SEA's counts is hard to evaluate. Therefore, the evidence presented by SEA in favour of raptors as indicator species for enhanced levels of biodiversity remains inconclusive.
  4. Synthesis and application. Ecologists should pay greater attention to sampling design and analysis in animal population estimation. Species richness estimation means sampling a community. Samples should be representative for the community studied and the sampling fraction among communities compared should be the same on average, otherwise formal estimation approaches must be applied to avoid misleading inference.


The use of top predators as umbrella and flagship species in conservation has been criticized (Andelman & Fagan 2000). However, a recent study described by Sergio et al. (2006; subsequently SEA; see also Sergio, Newton & Marchesi 2005) detected consistently more breeding bird, tree and butterfly species in 1 km quadrats within territories of six raptor species in the Italian Trentino region than in control quadrats. SEA concluded that top predators may be used validly as indicator and as umbrella species; they occur in biodiversity hotspots and, furthermore, networks of protected sites constructed on the basis of their occurrence captured more species than did networks based on the occurrence of lower-trophic species.

Unfortunately, there are issues with the manner in which SEA conducted fieldwork and carried out analyses. Here, we comment on their results on the number of breeding bird species, but similar issues may well apply for the other taxonomic groups they surveyed. SEA state (p. 1051):

All point counts were conducted in May–June, during the first 4 h after sunrise. At each site (breeding or control sites, see below), we conducted a 10-min point count and then slowly walked 500 m in each of the four main cardinal directions, noting all bird species not previously recorded. Therefore, each assessment reflected the biodiversity of an area of approximately 1 km2.

Thus, SEA sampled bird species richness in a 1 km quadrat along a 2 km transect route. They repeated this at 25 raptor nests and at 25 randomly selected control locations. Their analysis consisted of t-tests for the equality of mean species counts in 1 km quadrats that contain a raptor nest and ones that do not.

Here, we point out two issues in SEA that cast doubts on the validity of their inference, and suggest that their results are inconclusive. The first anomaly is that SEA detected on average only 6·8 avian species. Because we estimate here, on average, 32 species in similar quadrats in the same region, SEA presumably missed almost 80% of all bird species present in their quadrats. We argue that the differences in the observed species counts, as opposed to the differences in true species richness between raptor and control quadrats, could have arisen easily by similar differences in the detectability of species, caused by factors that include date, time of day, observer identity, previous knowledge of a site and possible mobbing behaviour of passerines in the presence of predators. The second issue is that the precision of the mean species counts in SEA appears too great: in two-thirds of cases, their standard errors are substantially smaller than in the unlikely case that all quadrats had exactly equal numbers of equally detectable species. This suggests the presence of substantial heterogeneity among species in detection probability. Most probably, SEA counted consistently only a biased and therefore unrepresentative subset of common and easily detectable species.

We believe that valid inference from data such as those of SEA requires direct correction for species detection probability by use of modern statistical models for estimation of community quantities, such as species richness. Various such methods have been developed over the past years (e.g. Nichols et al. 1998a,b; Hines et al. 1999; Dorazio & Royle 2005; Gelfand et al. 2005; Royle et al. 2007). We argue that in the absence of this, or of any quantitative evidence that species detection probability is indeed similar in raptor and control quadrats, there is no scientifically defensible way of ascribing the observed differences in species counts to true differences in species richness.

Estimation of the proportion of avian species detected by Sergio et al.

The numbers of breeding bird species detected by SEA (Fig. 1; see also the quasi-identical Fig. 1a in Sergio et al. 2005) appear strikingly low and suggest that SEA detected only a small fraction of all species present. To test this hypothesis, we compared the species counts in their Trentino study area with data collected during the Swiss Breeding Bird Survey at the same time in an adjacent region of Switzerland, the SE Alpine slope. Numerical values from SEA were obtained using Adobe Acrobat 6·0 Professional.

Figure 1.

Numbers of diurnal breeding bird species in 1 km quadrats in two adjacent regions of the Southern Central Alpine slope; SE Switzerland and the Italian Trentino region. Filled black circles show the true number of species in 85 1 km quadrats in SE Switzerland in relation to elevation estimated under a multispecies site occupancy model (Dorazio & Royle 2005); the black line is a loess smooth (span = 0·75). In the Trentino region, means and standard errors of the observed species richness from 25 1 km quadrats centred either on nests of six raptor species (triangles up) or of spatial control locations (triangles down) are shown; these data are from Sergio et al. (2006). The dashed line shows the Trentino grand mean. The Trentino data are plotted at the average elevation of the occurrence of each raptor species (from Pedrini et al. 2005).

As demonstrated by the two most recent atlas studies of these regions (Trentino: survey period 1986–2003, Pedrini et al. 2005; Switzerland, including the SE Alpine slope: survey period 1993–96, Schmid et al. 1998), the Swiss and Trentino regions are closely comparable in terms of area, elevational range and breeding bird community (Table 1). Both the Swiss SE Alpine slope and Italian Trentino regions are examples of the avian community found on the southern slope of the Central Alps. There was a total of 158 breeding bird species occurring in both regions together and 86% were shared species. Hence, we felt justified using data from the Swiss Breeding Bird Survey to estimate the true number of diurnal breeding bird species expected to occur in a 1 km quadrat at different elevations in the Trentino.

Table 1.  Comparison of the Trentino study region of Sergio et al. (2006) and the region of the Swiss SE Alpine slope in terms of area, elevation and avian species richness. Trentino data are from Pedrini et al. (2005) and SE Swiss data from the Swiss Breeding Bird Survey (MHB) (Schmid et al. 2004; Kéry & Schmid 2006)
                                         Trentino region   SE Switzerland
Area (km²)                               6206              3804
Min. elevation (m a.s.l.)                67                193
Max. elevation (m a.s.l.)                3769              3400
Total number of breeding bird species    146               148

We used data from 85 1 km quadrats surveyed 2002–05 either annually as part of the national Swiss Breeding Bird Survey ‘Monitoring Häufige Brutvögel’ (MHB; Schmid et al. 2004; Kéry et al. 2005; Kéry & Schmid 2006) or once every 5 years for the Swiss Federal Biodiversity Monitoring Programme (BDM) (Weber et al. 2004). For MHB quadrats surveyed in more than 1 year, we chose the first year with data. As both MHB and BDM use identical protocols, we refer collectively to their results as ‘MHB data’. Distances between the six raptor study areas of SEA and the 85 MHB quadrats on the Swiss SE Alpine slope ranged from 55 to 200 km, while the distance between the two most distant raptor study areas in SEA was about 90 km.

The Swiss MHB quadrats represent a systematic random (grid) sample which guarantees that inference from the sampled quadrats is valid for the entire region. Between 15 April and 15 July, three surveys (two in quadrats above the treeline) were conducted in each 1 km quadrat by a qualified volunteer using territory mapping. Mean quadrat elevation averaged 1571 m (range 250–2650 m) and forest cover 54% (range 0–98%). Surveys followed a quadrat-specific, irregular transect route of length 0·2–8·4 km (mean 4·2). It is this variation that provides the information from which we can derive the species richness expected from a transect length of 2 km (see below). Mean duration of a single survey ranged 30–400 min (mean 243); total survey duration ranged 1·5–18·75 hours (mean 12·75) per quadrat.

To estimate true richness of diurnal bird species, we used a model for animal community structure that is based on species-specific models of occurrence that account for imperfect detection probability (Dorazio & Royle 2005). This is a multispecies version of the site–occupancy model proposed by MacKenzie et al. (2002). An important feature of the model is that the true community size at each site, Nj for site j = 1, 2, ... , 85, is treated as unknown, so the site-specific parameters Nj are estimated under the model. Analysis of the model was based on a strategy known as data augmentation (Royle et al. 2007), which yields a relatively simple implementation of the multispecies site occupancy model in WinBUGS (Spiegelhalter et al. 2003); see Dorazio et al. (2006) and Kéry & Royle (in press) for examples. The model includes species-specific detection and occurrence probability parameters, for instance pi and ψij, for species i and quadrat j.
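The structure of this model can be sketched compactly as follows. This is an illustrative outline of a multispecies site-occupancy model with data augmentation, not a transcription of the WinBUGS code used in the analysis: the latent indicators w_i and z_ij, the number of surveys K_j, the detection frequency y_ij, the inclusion probability Ω and the size M of the augmented species list are symbols introduced here for illustration.

```latex
\begin{align}
w_i &\sim \mathrm{Bernoulli}(\Omega), & i &= 1, \dots, M \\
z_{ij} \mid w_i &\sim \mathrm{Bernoulli}(w_i\,\psi_{ij}), & j &= 1, \dots, 85 \\
y_{ij} \mid z_{ij} &\sim \mathrm{Binomial}(K_j,\; z_{ij}\,p_i) \\
N_j &= \sum_{i=1}^{M} z_{ij}
\end{align}
```

Under this reading, a species that is present (z_ij = 1) can still go undetected in all K_j surveys, which is why the observed species count is, on average, smaller than N_j.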

SEA suggest that their biodiversity assessments pertain to an area of 1 km2, although only a route of length 2 km was surveyed. Swiss MHB routes vary greatly in length, but most are greater than 2 km. Hence, our species richness estimates may be associated with a larger effective sampling area than those in SEA, making a direct comparison unfair. To obtain a better estimate of the proportion of species detected by SEA within their effective sampling area, we used the multispecies occupancy framework to model the effect of route length as a covariate. We then downscaled the species richness estimates to correspond to a route length of 2 km, i.e. we obtained predictions at a smaller spatial scale than the observations. Specifically, we developed species-specific models for the occurrence probability parameters ψij, modelling the effects of route length and elevation (linear and quadratic) on the logit-transform of ψij. Subsequently, the species- and site-specific ψij parameters were used to obtain estimates of occurrence for each species on each of the j= 1, 2, ... , 85 quadrats using a hypothetical 2 km sample route in place of the actual sample route length.
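The downscaling step can be illustrated with a minimal sketch. It assumes a logit-linear model for occurrence with elevation (linear and quadratic) and route length as covariates, as described above. The three species, their coefficients and the standardized elevation are hypothetical values chosen for illustration, and summing the occurrence probabilities is a simplification of the posterior prediction carried out in the actual analysis.

```python
import math


def inv_logit(x: float) -> float:
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def occupancy(intercept, b_elev, b_elev2, b_length, elev, length_km):
    """Occurrence probability psi for one species at one quadrat.

    elev is a standardized elevation; length_km is the transect length.
    """
    eta = intercept + b_elev * elev + b_elev2 * elev ** 2 + b_length * length_km
    return inv_logit(eta)


# Hypothetical coefficients per species:
# (intercept, elevation, elevation^2, route length)
SPECIES = {
    "common species": (1.5, 0.2, -0.5, 0.30),
    "intermediate species": (-0.5, 0.8, -0.8, 0.25),
    "rare species": (-2.5, -0.3, -0.2, 0.20),
}


def expected_richness(elev: float, length_km: float) -> float:
    """Expected number of species present: the sum of psi over species."""
    return sum(occupancy(*coef, elev, length_km) for coef in SPECIES.values())


# Prediction at the mean MHB route length versus a hypothetical 2 km route.
richness_actual = expected_richness(elev=0.0, length_km=4.2)
richness_2km = expected_richness(elev=0.0, length_km=2.0)
print(f"expected richness at 4.2 km: {richness_actual:.2f}")
print(f"expected richness at 2.0 km: {richness_2km:.2f}")
```

With positive route-length coefficients, the prediction for the 2 km route is necessarily lower than that for the longer route, which is the sense in which the estimates are downscaled.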

Applying our model to the Swiss MHB data, the average estimated true avian species richness on the south slope of the Central Alps has a fairly constant value of around 32 up to 1500 m (Fig. 1). This includes the mean elevation of occurrence of all six study species in SEA (Pedrini et al. 2005). Lumping their raptor and control quadrats, SEA detected only 6·8 bird species on average. Hence, we estimate that SEA detected only 21% of all species present in their 1 km quadrats, 4·2 or 3·6 times less than the Swiss (89%; Kéry & Schmid 2006) or North American Breeding Bird Surveys (76%, Boulinier et al. 1998a). For a reduced sampling effort corresponding to a route of length 2 km, the expected avian species richness was about 23 species. Hence, our analyses suggest that SEA missed 71–79% of all bird species that were present in their study areas.
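The arithmetic behind these proportions can be retraced in a few lines. The sketch below uses only the rounded values quoted in the text (6·8, 32, 23, 89% and 76%); the variable names are ours. With the rounded value of 23 species the proportion missed comes out at about 70%, so the 71% quoted above presumably reflects the unrounded richness estimate.

```python
observed_mean = 6.8    # mean species count reported by SEA
richness_full = 32     # estimated richness per quadrat (Swiss MHB data)
richness_2km = 23      # estimated richness for a 2 km route (rounded)

detected_full = observed_mean / richness_full
detected_2km = observed_mean / richness_2km

print(f"detected, full quadrat: {detected_full:.1%}")
print(f"missed, full quadrat:   {1 - detected_full:.1%}")
print(f"missed, 2 km route:     {1 - detected_2km:.1%}")

# Detection in other surveys relative to the 21% estimated for SEA.
ratio_swiss = 0.89 / 0.21
ratio_north_america = 0.76 / 0.21
print(f"Swiss survey detects {ratio_swiss:.1f} times more")
print(f"North American survey detects {ratio_north_america:.1f} times more")
```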

Precision of mean species counts

Our second point concerns the striking precision of SEA's estimate of the mean species count of the population from which their 25 sample quadrats were drawn. In other words, the standard errors (SEs) in their Fig. 1a (our Fig. 1) appear very small. This may seem to be a subtle point, but it is nevertheless very important. In the absence of any species differences in detection probability, species counts are a binomial random variable with parameters N and p. N is the number of species present and p is the detection probability of those species. The variance of a single species count is $N p (1-p)$ and the SE of the mean of 25 counts is $\sqrt{N p (1-p)/25}$ if all N are exactly equal and if p is identical for each species.
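Under the pure binomial assumptions just described, the SE of the mean of n counts is the square root of N p (1 − p)/n. As a rough numerical check, the sketch below evaluates this expression for illustrative inputs: N = 23 (the rounded richness for a 2 km route) and p set to the ratio of SEA's grand mean count to N. These inputs are our assumption for illustration; the species-specific values used for Table 2 differ slightly.

```python
import math


def binomial_se_of_mean(n_species: int, p: float, n_quadrats: int = 25) -> float:
    """SE of the mean of n_quadrats binomial(n_species, p) species counts."""
    return math.sqrt(n_species * p * (1 - p) / n_quadrats)


N = 23          # illustrative number of species present per quadrat
p = 6.8 / N     # illustrative mean detection probability
se = binomial_se_of_mean(N, p)
print(f"p = {p:.3f}, expected SE of the mean = {se:.3f}")
```

The result, roughly 0·44, falls in the range of the binomial SEs listed in Table 2.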

For illustration, we simulated the expected distribution of SEs of the mean of 25 species counts in SEA under pure binomial sampling. For each species studied, we rounded to the nearest integer our species richness estimate (N) associated with a 2 km transect. We then sampled all N species with probability p equal to the mean detection probability as determined by our comparison of SEA's mean observed number of species and the expected number of species N. This was repeated for each of 25 quadrats, each with identical N, and every time the total number of species detected, i.e. the simulated species count, in a quadrat, and the mean count across the 25 quadrats were recorded. This procedure was repeated 1000 times. The standard deviation of the resulting distribution of the n = 25 means constitutes a single empirical estimate of the expected SE of the mean of 25 species counts, when the sole component of variation present in the counts is binomial sampling variation. To obtain the empirical SE distribution of an n = 25 mean of binomial counts with identical N and p, we repeated this procedure 5000 times for each species. In eight of 12 cases, SEs in SEA were much smaller than expected under binomial sampling (Table 2).
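The simulation can be sketched as follows. This is a minimal re-implementation of the procedure described above, not the original code: the function names, the random seed and the inputs (N = 23, p = 6·8/23) are our assumptions, and the numbers of repetitions are reduced from 1000 and 5000 to 200 and 40 so that the example runs in about a second.

```python
import random
import statistics


def species_count(n_species, p, rng):
    """Number of species detected when each is found with probability p."""
    return sum(rng.random() < p for _ in range(n_species))


def mean_count(n_species, p, n_quadrats, rng):
    """Mean species count across quadrats that all hold n_species species."""
    counts = [species_count(n_species, p, rng) for _ in range(n_quadrats)]
    return statistics.fmean(counts)


def empirical_se(n_species, p, rng, n_quadrats=25, n_means=200):
    """One empirical SE: the standard deviation of n_means simulated means."""
    means = [mean_count(n_species, p, n_quadrats, rng) for _ in range(n_means)]
    return statistics.stdev(means)


def se_distribution(n_species, p, rng, n_reps=40, **kwargs):
    """Repeat the SE estimate n_reps times to obtain its distribution."""
    return [empirical_se(n_species, p, rng, **kwargs) for _ in range(n_reps)]


rng = random.Random(1)   # fixed seed, for reproducibility only
N = 23                   # illustrative richness for a 2 km route
p = 6.8 / N              # illustrative mean detection probability

ses = se_distribution(N, p, rng)
cuts = statistics.quantiles(ses, n=40)   # cut points at 2.5%, 5%, ..., 97.5%
lower, upper = cuts[0], cuts[-1]
print(f"2.5-97.5% range of simulated SEs: {lower:.2f}-{upper:.2f}")
```

Because fewer repetitions are used here, the simulated range is wider than those in Table 2, but it is centred on the same value of about 0·44.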

Table 2.  Comparison of observed standard errors (SE) of the means of 25 bird species counts at raptor and control sites in Sergio et al. (2006) (SEA) and the expected standard error when the sole factor introducing uncertainty in these estimates is binomial sampling variation. We summarize the theoretical distribution by the 2·5 and 97·5% percentiles
                  SEs of mean species counts in SEA   SE under binomial sampling
Species           Raptor site     Control site
Pygmy owl         0·26            0·26                0·41–0·45
Tengmalm's owl    0·45            0·28                0·42–0·45
Tawny owl         0·37            0·24                0·42–0·45
Long-eared owl    0·49            0·47                0·44–0·48
Scops owl         0·26            0·28                0·41–0·44

In real life, the mean of 25 species counts will hardly ever contain only pure binomial variation. Rather, there will be effects of additional sources of variation such as spatial variation among quadrats in the number of species present (N) and species heterogeneity in p. The former will make the observed SEs larger than in our simulation, and the latter will make them smaller. In particular, quadrats are unlikely ever to contain exactly the same avian community (consider the considerable spread of the Nj estimates around the trend line in Fig. 1), which would considerably increase the magnitude of SEs of counts relative to that in our simulation. The small SEs of SEA are therefore suggestive of strong heterogeneity in detection probabilities in their communities.

For illustration, assume that there are n0 species detected virtually always, i.e. with p0 ≈ 1. The distribution of the observed species counts would then be n0 + Bin(N − n0, p), where N is the total number of species present at a site and p the detection probability for the remaining species. The n0 species will then contribute nothing to the expected variance of the species counts. (The same argument with qualitatively similar results can be made with p0 < 1.) This implies that SEA detected not only a very small fraction of the avian community present but also some undefined and biased subset. Inference to the general ‘biodiversity’ of the whole avian community would be compromised.
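A small numerical sketch shows how strongly such heterogeneity can shrink the SE. It assumes that n0 species are always detected and that the remaining N − n0 species share a single detection probability chosen so that the expected count stays at SEA's grand mean of 6·8. The values N = 23 and n0 = 5 are hypothetical and serve only as an illustration.

```python
import math


def se_with_core_species(n_total, n_core, mean_count, n_quadrats=25):
    """SE of the mean count when n_core species are always detected.

    The remaining species are detected with a common probability p_rest,
    set so that the expected count equals mean_count.
    """
    n_rest = n_total - n_core
    p_rest = (mean_count - n_core) / n_rest
    variance = n_rest * p_rest * (1 - p_rest)   # core species add no variance
    return math.sqrt(variance / n_quadrats)


se_homogeneous = se_with_core_species(23, 0, 6.8)     # all species alike
se_heterogeneous = se_with_core_species(23, 5, 6.8)   # 5 always detected
print(f"SE with homogeneous detection:   {se_homogeneous:.2f}")
print(f"SE with 5 always-detected species: {se_heterogeneous:.2f}")
```

Under these assumptions the SE drops from about 0·44 to about 0·25, which is close to the smallest SEs reported by SEA (Table 2).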


Sergio et al. (2006; see also Sergio et al. 2005) counted considerably more bird species in raptor territories than in controls and claimed that species richness was greater in raptor quadrats than in the wider landscape. Unfortunately, there are two anomalies in their study that make their results hard to evaluate: first, they appear to have missed about 80% of all bird species present, and secondly, they probably observed a highly biased, and therefore unrepresentative, sample of all species present at their study locations.

Conceptually, the expected number of species detected is a product of the number of species present and mean detection probability of these species (Williams et al. 2002). Hence, the difference between the observed number of species in raptor territories and control areas may be due to genuine differences in the number of species present or in the detection probability of those species, or to a combination of the two. For instance, a mean species detectability in raptor quadrats of 0·37 and of 0·20 in control quadrats would yield exactly the observed pattern of SEA's Fig. 1a. Such fairly slight effects could arise easily owing to different habitats, identity as opposed to the number of species present, different population sizes of the species present, prior knowledge by the observer of the bird community in raptor but not in control quadrats, mobbing behaviour or alarm calls of potential prey species elicited by the presence of raptors, differences in the day of the year when surveys were conducted, differences in the daytime when surveys were conducted and difference between the observer that conducted the surveys in raptor and control quadrats, as well as by a host of other factors.
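The point can be made concrete with the detectabilities quoted above. In the sketch below both quadrat types hold the same number of species. The value N = 23 is a hypothetical choice, and the ratio of the expected counts does not depend on it.

```python
def expected_count(n_species: float, p: float) -> float:
    """Expected number of species detected: richness times detectability."""
    return n_species * p


N = 23   # hypothetical richness, identical in raptor and control quadrats
raptor = expected_count(N, 0.37)
control = expected_count(N, 0.20)
ratio = raptor / control

print(f"expected count, raptor quadrats:  {raptor:.1f}")
print(f"expected count, control quadrats: {control:.1f}")
print(f"ratio: {ratio:.2f}")
```

The ratio of 1·85 corresponds to counting almost twice as many species in raptor quadrats, with no difference at all in true species richness.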

The observation that counts in SEA appear ‘too precise’ may seem a subtle one, but in reality it is perhaps even more important than the point made in the previous section. Random sampling of some sort is a cornerstone of all empirical sciences, as it is the sole guarantee that the value of a trait measured in a sample is, on average, the same as that in the wider population about which one wants to learn something. SEA probably observed highly biased samples of the bird communities present in their quadrats; hence they lose the ability to generalize from the particular set of species that they observed to the complete avian community. Interestingly, therefore, a spatial random sample is not enough to ensure a random community sample when detection probability varies among species. (For demonstrations of such heterogeneity see Boulinier et al. 1998a; Nichols et al. 1998a; Hines et al. 1999; Dorazio & Royle 2005; Kéry et al. 2005; Kéry & Plattner 2007; Kéry & Royle in press.)

Counting birds fundamentally means sampling a community. To obtain a sample that allows generalization to the wider community studied, it needs to be (i) representative, i.e. random, and (ii) the sampling fraction, i.e. the proportion of species in the community that appears in the sample, must be the same on average when different samples are compared. We believe that ecologists should pay greater attention to these two requirements when studying communities. (Note that the same applies to individuals and populations.)

Over recent years, novel methods have been developed in community ecology that allow one to separate detection probability from true species richness when comparing species counts. Perhaps the most important, because most general, branch of this work uses capture–recapture ideas to obtain estimates of true species richness corrected for detection probability. (We note that recording a species leads to equivalent data as capturing it; hence, physical capture is not required in general.) Examples include Boulinier et al. (1998a,b, 2001), Nichols et al. (1998a,b), Hines et al. (1999), Cam et al. (2002), Hausner et al. (2002), Lekve et al. (2002), Doherty et al. (2003), Dorazio & Royle (2003, 2005), Gelfand et al. (2005), Dorazio et al. (2006), Karanth et al. (2006), Kéry & Schmid (2006), Kéry & Plattner (2007), Royle et al. (2007), Kéry & Royle (in press). Indeed, in the hierarchical modelling framework of Dorazio & Royle (2005) applied in this paper, it is possible to test for a raptor ‘treatment effect’ directly (also see Hausner et al. 2002). This would be the most efficient, direct and unbiased way of testing for a presumed relationship between the occurrence of one or more raptor species and that of the remainder of the avian community.


In ecology, we have an acute inability to observe the state variable that is the object of inference. We are frequently interested in spatial and temporal variation in abundance or occurrence of species, but we typically only observe counts that are biased by imperfect detection and variation due to factors that are not directly relevant to the ecological process or the science of the problem. Thus, in order to draw inferences from observational data it is critical that we make an explicit distinction between variation due to ecological process and that due to the observation process.

A case in point is the study by Sergio et al. (2005, 2006). Two anomalies in their data complicate greatly the interpretation of their results. We doubt whether the evidence presented by SEA can alone support claims in favour of raptors as indicator species for enhanced levels of biodiversity. We believe that ecologists should pay much greater attention to sampling design and analysis in the difficult field of animal population analysis. Species counts are simply samples from a community; therefore every effort must be made to ensure that these samples are representative of the community studied. Furthermore, the sampling fraction should, on average, be the same when different communities, or communities at different times, are compared, and quantitative evidence for this should be presented. If these conditions are not met, then formal estimation approaches must be applied to avoid misleading inference. Recent years have seen important developments in statistical models for estimation of community quantities, and these should be considered more often.


We thank the many dedicated and excellent volunteers who conducted the field work in the Swiss Breeding Bird Survey. For comments and other help we thank two anonymous referees, and L. Jenni, I. Keller, C. Marfurt, J. Nichols, B. Schmidt and J. von Hirschheydt.