Estimating multispecies abundance using automated detection systems: ice-associated seals in the Bering Sea



  1. Automated detection systems employing advanced technology (e.g. infrared imagery, auditory recording systems, pattern recognition software) are compelling tools for gathering animal abundance and distribution data since investigators can often collect data more efficiently and reduce animal disturbance relative to surveys using human observers.
  2. Even with these improvements, analysing animal abundance with advanced technology can be challenging because of potential for incomplete detection, false positives and species misidentification. We argue that double sampling with an independent sampling method can provide the critical information needed to account for such errors.
  3. We present a hierarchical modelling framework for jointly analysing automated detection and double sampling data obtained during animal population surveys. Under our framework, observed counts in different sampling units are conceptualized as having arisen from a thinned log-Gaussian Cox process subject to spatial autocorrelation (where thinning accounts for incomplete detection). For multispecies surveys, our approach handles incomplete species observations owing to (i) structural uncertainties (e.g. in cases where the automatic detection data do not provide species observations) and (ii) species misclassification; the latter requires auxiliary information on the misclassification process.
  4. As an example of combining an automated detection system and a double sampling procedure, we consider the problem of estimating animal abundance from aerial surveys that use infrared imagery to detect animals, and independent, high-resolution digital photography to provide information on species composition and thermal detection accuracy. We illustrate our approach by analysing simulated data and data from a survey of four ice-associated seal species in the eastern Bering Sea.
  5. Our analysis indicated reasonable performance of our hierarchical modelling approach, but suggested a need to balance model complexity with the richness of the data set. For example, highly parameterized models can lead to spuriously high predictions of abundance in areas that are not sampled, especially when there are large gaps in spatial coverage.
  6. We recommend that ecologists employ double sampling when enumerating animal populations with automated detection systems to estimate and correct for detection errors. Combining multiple data sets within a hierarchical modelling framework provides a powerful approach for analysing animal abundance over large spatial domains.


Several promising approaches have been developed to monitor animal populations using advanced animal detection technology. Pattern recognition algorithms (e.g. Kogan & Margoliash 1998) applied to automated auditory collection systems (cf. Blumstein et al. 2011) are capable of discriminating different species, sexes and groups of animals. Ecologists have deployed acoustic arrays to study a range of taxa including terrestrial (Blumstein et al. 2011), marine (Moretti et al. 2010; Ward et al. 2012) and amphibian (Waddle, Thigpen & Glorioso 2009) species. Another active area of research is application of object-based image analysis to automate animal counts from remotely sensed high-resolution images (see e.g. Groom et al. 2013). In this case, a computer algorithm is trained to automatically count animals on a sequence of images. Lastly, when animals give heat signatures different from their surrounding environment, infrared imagery can be used to enumerate animal populations. This approach is often combined with digital photography to provide information about species identity and has been used to monitor bighorn sheep (Bernatas & Nelson 2004), pinnipeds (Chernook, Kuznetsov & Yakovenko 1999; Speckman et al. 2011), polar bears (Amstrup et al. 2004) and, most frequently, ungulates (see e.g. Kissell & Nimmo 2011; Franke et al. 2012, and references therein).

Historically, researchers employed human observers to conduct large-scale animal population surveys, and a variety of sampling designs and statistical models are available to cope with imperfect detection when estimating density and abundance from such records (see e.g. Williams, Nichols & Conroy 2002, for a review). Advanced technologies (e.g. infrared imagery, automated acoustic detectors, pattern recognition software) are a promising alternative for increasing survey coverage and reducing detection error, but are far from perfect. For instance, advanced technologies may still miss animals and may also pick up non-target signatures (resulting in false positives). In multispecies surveys, species misidentification errors may also be present. To accurately estimate abundance from automated detection data, it is thus often necessary to collect sufficient auxiliary information to estimate and correct for multiple error types. However, few statistical methods have been developed to incorporate these error rates into abundance estimates (but see Marques et al. 2013).

In this paper, we develop a hierarchical modelling framework to estimate animal abundance on landscapes surveyed using an automated detection system. Our approach assumes that the investigator collects independent data using a different sampling approach (hereafter, ‘double sampling’) over a subset of the survey area to help estimate error rates. In particular, we require that double sampling data be collected in such a manner that it can be used to estimate the probability of false negatives (missed animals), false positives (erroneous detections) and species misidentification, if applicable. Further, we assume that double sampling data can be used to accurately measure individual covariates (e.g. group sizes for clusters of animals).

We demonstrate our approach on simulated data and also on aerial survey data of ice-associated seals. In both cases, automatic detection data consisted of thermal imagery and double sampling consisted of automated high-resolution digital photography. Under our approach, thermal imagery is used to find ‘hot spots’ – points in the infrared video that have more extreme heat signatures when compared to the surrounding environmental matrix. Digital photographs with matched time stamps can then be searched to get information on the species composition of each hot spot, as well as the number of animals present. Further, independent searches of photographs can be conducted to estimate the proportion of animals missed.

Our manuscript is organized as follows. First, we describe the data necessary to conduct a joint analysis of automatic detection and double sampling data. Next, we describe a model-based framework for estimating animal abundance from such records. After describing a simple simulation study, we analyse a test data set of flights conducted over the eastern Bering Sea in the spring of 2012. In this case, we wish to make inference about the abundance of four ice-associated seal species from data that are contaminated by species misclassifications and anomalous thermal readings.


Data Requirements

We suppose that the investigator partitions their survey area into J (possibly irregular) sampling units, each of which has area Aj (see Table 1 for a complete list of notation). In practice, the size of the sampling unit will likely be constrained by the resolution of available habitat covariates (e.g. remote sensing data). We assume that transects through each sampling unit occur more or less randomly with respect to available habitat so that the investigator is not making fine-scale adjustments within units to target areas of higher habitat quality. We suppose that L (L ≤ J) sampling units are surveyed using an automatic detection system and Rj gives the proportion of unit j that is surveyed. We suppose that the spatial domain surveyed by the double sampling method in unit j is a subset of that surveyed by the automatic detection system. As such, we allow for the possibility that some (potentially a large fraction) of automatic detections are not double-sampled.

Table 1. Definitions of parameters and data used in the hierarchical model for automatic detection and double sampling data. Symbols appearing in boldface represent vectors or matrices.
Symbol | Definition
$N_s$ | Total abundance of species s in the study area ($= \sum_j N_{js}$)
$N_{js}$ | Abundance of species s in sampling unit j ($= N_{js}^{\mathrm{obs}} + N_{js}^{\mathrm{unobs}} + N_{js}^{\mathrm{unsurv}}$)
$N_{js}^{\mathrm{obs}}$ | Number of observed animals in sampling unit j that are truly of species s
$N_{js}^{\mathrm{unobs}}$ | Number of undetected animals in surveyed regions of sampling unit j that are of species s
$N_{js}^{\mathrm{unsurv}}$ | Abundance of species s in unsurveyed regions of sampling unit j
$G_{js}$ | Number of groups of animals of species s located in sampling unit j
$G_{js}^{\mathrm{obs}}$ | Number of groups of animals of species s located in the surveyed region of sampling unit j detected by the automatic detection system
$\nu_{js}$ | The log of abundance intensity for species s in sampling unit j
$\tau_{\nu_s}$ | Precision of the log of abundance intensity for species s; possibly used to impart overdispersion relative to the Poisson distribution
$\tau_{\eta_s}$ | Precision parameter for spatial random effects associated with species s
$\lambda_{js}$ | Abundance intensity for species s in sampling unit j ($\lambda_{js} = A_j R_j p_{js} \exp(\nu_{js})$)
$\boldsymbol{\beta}_s$ | Parameters of the linear predictor describing variation in the log of abundance intensity as a function of landscape and habitat covariates for species s
$\boldsymbol{\eta}_s$ | Vector of spatial random effects for species s
$\boldsymbol{\alpha}_s$ | Vector of reduced-dimension random effects for species s [when restricted spatial regression (RSR) is employed]
$\boldsymbol{\theta}_s$ | Parameters describing the distribution of individual covariates at the population level for species s
$S_{ij}$ | True species associated with the ith automatic detection obtained while surveying sampling unit j
$\pi_{ij}^{O|s}$ | Probability that the ith group of animals encountered while surveying sampling unit j is assigned observation type O given that it is truly of species s
$p_{js}$ | Probability that a member of species s associated with the area surveyed in sampling unit j is detected ($p_{js} = p_s a_{js}$)
$a_{js}$ | Probability that an animal of species s is available to be detected at the time(s) when surveys are conducted in sampling unit j (for seals, this is their haul-out probability)
$p_s$ | Probability that a member of species s will be detected by the automatic detection system given that it is available to be detected
$Y_j$ | Total count of automatic detections recorded during surveys of sampling unit j
$Z_{ijk}$ | The value of the kth individual covariate associated with automatic detection i in sampling unit j
$I_{ij}$ | Indicator for whether the ith automatic detection recorded in the jth sampling unit was also subject to double sampling
$\mathbf{X}_s$ | Design matrix associated with abundance intensity model for species s
$A_j$ | The area of sampling unit j (perhaps scaled to its mean)
$R_j$ | Proportion of sampling unit j that is sampled via the automatic detection method during the survey
$O_{ij}$ | Observation type for the ith automatic detection in sampling unit j (e.g. observed species)
$J$ | Total number of sampling units in the study area
$L$ | Total number of sampling units in the study area that are actually sampled
$\mathbf{W}$ | Association matrix describing spatial neighbourhood structure of sampling units
$\mathbf{Q}$ | Structure matrix for spatial random effects (note the precision matrix for random effects is given by $\tau_{\eta_s}\mathbf{Q}$)
$\mathbf{K}_s$ | Design matrix for spatial random effects when dimension reduction (RSR) is employed

For each unit that is sampled, we suppose that an automatic detection algorithm is employed on remotely sensed data (e.g. thermal imagery, audio recordings) to compile a list of detections of focal taxa. In practice, some tuning of this algorithm may be needed to balance the resulting sensitivity and specificity; making the algorithm too sensitive can markedly increase the number of false positives, while making it too specific can result in a large number of missed animals. Note that we allow for both false positives (anomalies) and false negatives (non-detections) in subsequent modelling.

Data notation

Let Yj denote the total number of automatic detections that are recorded in survey unit j. We assign an indicator Iij = 1 to automatic detections for which a species observation can be made, and set Iij = 0 otherwise (here, i ∈ {1, 2, …, Yj} identifies the ith automatic detection in surveys of sampling unit j). Note that for some automatic detection data (e.g. thermal imagery), double sampling data may actually be necessary to make species determinations, while for others (e.g. auditory detections using speech recognition algorithms), Iij may equal one for every record. The investigator assigns each automatic detection with Iij = 1 an observation type, Oij. There is considerable latitude in selecting species classification schemes (see e.g. 'Species misclassification model' and subsequent examples). The investigator also records any individual covariates, Zijk (e.g. group size), where k identifies the kth covariate (Table 1). For the present development, we require that covariates are available for each record where Iij = 1.
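As an illustration of this notation, the following minimal sketch shows one way the per-detection records could be stored and summarized by sampling unit. The `Detection` class, its field names, the `summarise_unit` helper and the example observation labels are all hypothetical and are not part of our software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """One automatic detection i in sampling unit j (hypothetical record layout)."""
    unit: int                          # j, index of the sampling unit
    double_sampled: bool               # I_ij: True if a species observation could be made
    obs_type: Optional[str] = None     # O_ij, only available when I_ij = 1
    group_size: Optional[int] = None   # Z_ij1, only available when I_ij = 1


def summarise_unit(detections, unit):
    """Return (Y_j, list of observation types, count of unclassified detections)."""
    in_unit = [d for d in detections if d.unit == unit]
    classified = [d.obs_type for d in in_unit if d.double_sampled]
    unclassified = sum(1 for d in in_unit if not d.double_sampled)
    return len(in_unit), classified, unclassified


# Illustrative records only; the labels are made up for this example.
records = [
    Detection(unit=1, double_sampled=True, obs_type="bearded/certain", group_size=1),
    Detection(unit=1, double_sampled=True, obs_type="other", group_size=1),
    Detection(unit=1, double_sampled=False),
    Detection(unit=2, double_sampled=True, obs_type="unknown", group_size=2),
]
y1, classified1, unclassified1 = summarise_unit(records, 1)
```

Here `y1` plays the role of Yj for unit 1, while `unclassified1` counts the detections with Iij = 0.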


The observed data include a set of species classifications for each sampling unit, a count of unclassified automatic detections for each sampling unit (i.e. those for which Iij = 0), together with individual covariates such as group size. We also allow for the possibility that the investigator has auxiliary data (through double sampling or some other mechanism) to estimate components of the detection process (e.g. detection probability, species misclassification probabilities) and has gathered habitat covariates to help explain variation in abundance. Our next task shall be to devise a way to conduct inference on animal abundance and species–habitat relationships from such a seemingly disparate data amalgam.

When conceptualizing how the observed data arise, we find it intuitive to break the problem down into several components within a hierarchical modelling framework (e.g. Fig. 1). First, we consider the way in which expected abundance for each species varies over the landscape. When space is discretized into individual sampling units (as we have done here), a common way to relate counts to habitat covariates is through a spatial regression model. In our case, we do not know the actual abundance in each sampling unit, but we can still borrow this framework to describe variation in expected abundance in each cell. Secondly, realized animal counts in a given sampling unit will typically differ from the expected abundance for several reasons, including random variation, incomplete coverage of the sampling unit and detection probabilities that are <1. We refer to the model describing the relationship between true species counts and expected abundance as the ‘Local abundance model’. Finally, the type of observations that are spawned when a group of animals is detected depends on (i) an observation process relating the true species to different observation classifications and (ii) a process relating the true species to individual covariate values. We refer to models for these processes as the ‘Species misclassification model’ and ‘Individual covariate model’, respectively.

Figure 1.

Directed, acyclic graph for the model proposed for multispecies abundance estimation from thermal imagery and digital photography (adapted from Conn et al. 2013, fig. A1). Notation is defined in Table 1 (subscripts and superscripts omitted for clarity).

We now describe each of these four components (spatial regression, local abundance, species misclassification and individual covariate models) in turn. In doing so, we make a number of distributional choices that may require refinement in certain sampling scenarios (see 'Discussion'). We then describe Markov chain Monte Carlo (MCMC) methods and approaches for generating posterior predictions of abundance across the study area. Throughout, we use bold symbols to denote vectors and matrices. For a fuller mathematical treatment, see Appendix S1.

Spatial regression model

For each species s, we write the log of expected abundance in each sampling unit as a function of habitat covariates, spatially autocorrelated random effects and unstructured random effects. For the moment, we treat all sampling units as if they were the same size (adjustments for unequal area are made in the following section). In particular, we express the log of expected abundance (νjs) across the collection of sampling units as

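$$\boldsymbol{\nu}_s = \mathbf{X}_s \boldsymbol{\beta}_s + \boldsymbol{\eta}_s + \boldsymbol{\varepsilon}_s \qquad \text{(eqn 1)}$$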

where Xs denotes a design matrix relating environmental and habitat covariates to expected abundance, βs gives related regression parameters, the ηs specify random effects with spatially autocorrelated, Gaussian errors, and εs represents mean zero Gaussian error with precision parameter τνs.

There are several common choices for inducing spatial autocorrelation in hierarchical spatial regression models (see e.g. Banerjee, Carlin & Gelfand 2004). In the following, we specify an intrinsic conditionally autoregressive prior distribution (ICAR; Besag & Kooperberg 1995; Rue & Held 2005) for ηs such that

$$\boldsymbol{\eta}_s \sim \mathcal{N}\left(\mathbf{0}, (\tau_{\eta_s}\mathbf{Q})^{-1}\right)$$

where $\mathcal{N}$ denotes a multivariate normal (Gaussian) distribution and τηsQ gives the precision of the Gaussian spatial process. Here, τηs is a precision parameter to be estimated, and Q is defined as Q = DW, where W is an association matrix describing the spatial neighbourhood structure of sampling units and D is a diagonal matrix with elements −W1 (1 being a column vector of ones). For purposes of this paper, we use a formulation for W that approximates thin-plate splines (Rue & Held 2005, section 3.4.2). This approach implies a greater degree of smoothing than first-order formulations for Q, a potentially useful feature when analysing sparse data from abundance surveys (see 'Discussion'). In an effort to eliminate parameter redundancy and confounding between spatial regression parameters and spatial random effects, we also implemented a restricted spatial regression (RSR; Reich, Hodges & Zadnik 2006; Hodges & Reich 2010; Hughes & Haran 2013) version of eqn 1 (see Appendix S1 for further details).
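To make the role of the structure matrix concrete, the sketch below builds Q for a small grid under a simpler, first-order formulation than the thin-plate spline approximation used in this paper. It assumes a rook (shared-edge) neighbourhood and the common convention in which Q has the number of neighbours on the diagonal and −1 for each pair of neighbours; the function names are hypothetical.

```python
def rook_adjacency(n_rows, n_cols):
    """Binary association matrix W for a grid: 1 if two cells share an edge."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Link each cell to its southern and eastern neighbours (symmetric).
            for dr, dc in ((1, 0), (0, 1)):
                rr, cc = r + dr, c + dc
                if rr < n_rows and cc < n_cols:
                    k = rr * n_cols + cc
                    W[i][k] = W[k][i] = 1
    return W


def icar_structure(W):
    """First-order ICAR structure matrix: neighbour counts on the diagonal, -W elsewhere."""
    n = len(W)
    Q = [[-W[i][k] for k in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = sum(W[i])
    return Q


W = rook_adjacency(3, 3)
Q = icar_structure(W)
```

Every row of such a Q sums to zero, so the matrix is singular; this is what makes the prior "intrinsic" (improper) and is one reason constraints or dimension reduction are used in practice.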

Local abundance model

The preceding formulation describes variation in the log of abundance intensity, but does not include other factors affecting the expected number of animals encountered by surveys in a given cell. For instance, sampling units may vary in size, the proportion of area surveyed may vary across sampled units, automatic detections may miss animals, and not all animals associated with sampling unit j may be present while surveys are being conducted. Also, we expect random fluctuations in the number of animals present relative to the expected abundance intensity. For these reasons, we model the number of automatic detections of species s in sampling unit j ($G_{js}^{\mathrm{obs}}$) as $G_{js}^{\mathrm{obs}} \sim \mathrm{Poisson}(\lambda_{js})$, where λjs = AjRjpjs exp (νjs) and pjs = ajsps (recall that notation is defined in Table 1). In conjunction with our choice of a Gaussian distribution for νjs, this formulation implies that the actual number of detections for each species is a realization of a thinned version of the log-Gaussian Cox process (see e.g. Rathbun & Cressie 1994; Møller, Syversveen & Waagepetersen 1998).
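The thinning in this model can be illustrated with a short simulation. The sketch below computes λjs = AjRjpjs exp(νjs) for one species in one sampling unit and draws a Poisson count; all numerical inputs are invented for illustration, and the helper names are hypothetical.

```python
import math
import random


def expected_detections(area, prop_surveyed, availability, p_detect, log_intensity):
    """lambda_js = A_j * R_j * p_js * exp(nu_js), with p_js = a_js * p_s."""
    p_js = availability * p_detect
    return area * prop_surveyed * p_js * math.exp(log_intensity)


def rpois(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


rng = random.Random(1)
# Illustrative values only: unit area 1, 10% surveyed, a_js = 0.6, p_s = 0.95.
lam = expected_detections(area=1.0, prop_surveyed=0.1, availability=0.6,
                          p_detect=0.95, log_intensity=math.log(50.0))
count = rpois(lam, rng)
```

With these inputs an intensity of 50 groups per unit is thinned to an expected 2·85 detections, which shows how strongly partial coverage and imperfect detection reduce the counts that are actually observed.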

The data collected on aerial surveys do not provide sufficient information to estimate availability (ajs), so auxiliary data or strong priors are needed on these parameters (see e.g. 'Example: Ice-Associated Seals'). One approach for getting information on ps is to conduct an unaided search of double sampling data, and to treat animals found in the unaided search as trials to test the false-negative rate of the automatic detection algorithm; in this case, the number of successful automatic detections can be treated as binomial with success probability ps.

Species misclassification model

The preceding sections describe how animals (or animal groups) of each species are detected. However, in order to allow imperfect species observations, we need to specify a model relating the true species to actual observations. For observations where species can be assigned (Iij = 1), we suppose that observations Oij arise according to a multinomial process conditional on the true species Sij and classification probabilities $\pi_{ij}^{O|s}$. In practice, this specification requires that we treat the true species as a latent parameter (i.e. that we admit uncertainty about its value).

Automatic detection data are typically not sufficient to estimate the misclassification parameters, π, so strong priors or auxiliary data are needed to provide structure on these (see subsequent examples). In the following sections, we suppress dependence on individual and transect (i.e. we set $\pi_{ij}^{O|s} = \pi^{O|s}$). However, we suspect that expressing classification parameters as a function of covariates using a multinomial logit link (Agresti 2002) will be useful in future applications (see e.g. Conn et al. 2013, for further discussion). Although there is considerable flexibility for structuring the species classification matrix, we use a formulation specifically tailored to our seal study in subsequent applications. This formulation requires observers to classify observations by both species and certainty, and also permits them to record species as ‘unknown’ or ‘other’ (Table 2). The ‘other’ category accounts for false positives.
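A minimal sketch of the multinomial observation process follows. The classification probabilities in `PI` are entirely hypothetical (they are not the values in Table 2 or Appendix S2), and the reduced set of observation types is chosen only to keep the example short.

```python
import random

# Possible observation types O (a reduced, hypothetical set).
OBS_TYPES = ["species A", "species B", "unknown", "other"]

# Hypothetical classification probabilities: PI[s][m] is the probability that
# a detection whose true class is s is recorded as OBS_TYPES[m].
# Each row must sum to one.
PI = {
    "species A": [0.80, 0.05, 0.14, 0.01],
    "species B": [0.10, 0.75, 0.14, 0.01],
    "other":     [0.02, 0.02, 0.06, 0.90],   # thermal anomalies, non-target taxa
}


def classify(true_species, rng):
    """Draw one observation type O_ij given the latent true species S_ij."""
    return rng.choices(OBS_TYPES, weights=PI[true_species], k=1)[0]


rng = random.Random(42)
observations = [classify("species A", rng) for _ in range(10)]
```

In estimation the direction is reversed: the observation types are data and the true species is a latent quantity, which is why informative priors or auxiliary data on the rows of such a matrix are needed.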

Table 2. Species classification probabilities used in the hierarchical seal abundance model. True species appear in columns, while observation types occur on rows. The column (and row) for ‘Other’ indicate non-seals (e.g. thermal anomalies, non-target taxa)
Obs index | Obs species | Confidence | True species

Individual covariate model

We allow for the possibility that automatic detection i in sample unit j has k associated individual covariates, which are only assumed to be observed if Iij = 1 (see 'Discussion'). The most important of these is likely group size, the number of animals that are associated with a specific automatic detection. In cases where automatic detections can consist of more than one animal, one must include the group size distribution when generating an overall abundance estimate. Our approach is to parametrically model covariates as

$$Z_{ijk} \mid S_{ij} = s \sim f_s(\boldsymbol{\theta}_s)$$

where fs(θs) gives a probability mass or density function with parameters specific to species s. In the applications that follow, we only use one individual covariate (group size), which we give a zero-truncated Poisson distribution with a species-specific intensity parameter to be estimated. We also require that each automatic detection be composed of like species.
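For group size, the zero-truncated Poisson places probability θ^k exp(−θ) / [k! (1 − exp(−θ))] on k = 1, 2, …, giving a mean group size of θ / (1 − exp(−θ)). The sketch below draws from this distribution by inverting its cumulative distribution function; the function names and the value θ = 1 are illustrative assumptions.

```python
import math
import random


def ztp_mean(theta):
    """Mean of a zero-truncated Poisson with intensity theta."""
    return theta / (1.0 - math.exp(-theta))


def rztp(theta, rng):
    """Draw a group size (>= 1) by inverting the zero-truncated Poisson CDF."""
    u = rng.random()
    k = 1
    pmf = theta * math.exp(-theta) / (1.0 - math.exp(-theta))  # P(Z = 1)
    cdf = pmf
    # The pmf > 0 guard stops the loop if rounding leaves cdf just below u.
    while u > cdf and pmf > 0.0:
        k += 1
        pmf *= theta / k          # P(Z = k) from P(Z = k - 1)
        cdf += pmf
    return k


rng = random.Random(11)
theta = 1.0                        # hypothetical species-specific intensity
group_sizes = [rztp(theta, rng) for _ in range(20000)]
mean_size = sum(group_sizes) / len(group_sizes)
```

Because every detected group contains at least one animal, the truncation at zero matters most for species that usually occur singly, where θ is small and the mean group size is close to one.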

MCMC sampler

Writing our model hierarchically, we are able to envision how broad landscape-scale processes can ultimately be translated into observed data in a probabilistic fashion. Provided that we believe our model is a reasonable approximation to reality and are willing to assign prior distributions for model parameters, Bayesian calculus provides a convenient way of making inference about the data-generating process (including parameters describing species–habitat relationships and animal abundance). We used a hybrid Gibbs–Metropolis sampler to draw samples from the joint posterior distribution symbolically specified by Fig. 1 (see Appendix S1 for a mathematical specification). This involves iteratively sampling model parameters from their full conditional distributions. Owing to our judicious choice of Gaussian error on log-scale abundance intensity (i.e. νjs), many of the parameter groups can be sampled directly using the same strategies commonly used in Bayesian analysis of linear models (see e.g. Gelman et al. 2004, Chapter 14). Our strategy for updating parameters shares many features with some of our past work on distance sampling (see e.g. Conn, Laake & Johnson 2012; Conn et al. 2013) and is presented in Appendix S1.

Posterior prediction and model comparison

Our models provide estimates of parameters explaining variation in animal abundance (i.e. spatial regression parameters) as well as species-specific abundance estimates for animals detected in sampled areas. To extend inference to the abundance of species s over the entire landscape, we use posterior predictive distributions. For sampling units that we do not sample, posterior predictions can be simulated as

display math

where θs gives the zero-truncated Poisson intensity parameter for group size.
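One way such predictions could be simulated is sketched below. It assumes, purely for illustration, that the number of groups in an unsampled unit is Poisson with mean Aj exp(νjs) and that each group's size is zero-truncated Poisson with intensity θs; this is not necessarily the exact sampler described in Appendix S1, and all function names and numerical values are hypothetical.

```python
import math
import random


def rpois(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def rztp(theta, rng):
    """Zero-truncated Poisson draw by rejecting zeros from a Poisson(theta)."""
    while True:
        k = rpois(theta, rng)
        if k > 0:
            return k


def predict_unsampled(area, log_intensity, theta, rng):
    """One simulated abundance for an unsampled unit (assumed compound Poisson)."""
    n_groups = rpois(area * math.exp(log_intensity), rng)
    return sum(rztp(theta, rng) for _ in range(n_groups))


rng = random.Random(5)
# In practice (area, nu_js, theta_s) would come from successive posterior draws;
# here a single illustrative parameter set is reused.
predictions = [predict_unsampled(1.0, math.log(3.0), 1.0, rng) for _ in range(20000)]
mean_prediction = sum(predictions) / len(predictions)
```

Repeating the draw once per retained MCMC iteration, with that iteration's parameter values, would propagate parameter uncertainty into the predicted abundance.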

For units that are sampled, we generate posterior samples of abundance that are a combination of (i) animals detected during surveys ($N_{js}^{\mathrm{obs}}$; note that the total number of such animals is fixed, but species can vary), (ii) animals that are associated with sampled areas, but were not detected ($N_{js}^{\mathrm{unobs}}$), and (iii) animals in portions of sampled cells that were not surveyed ($N_{js}^{\mathrm{unsurv}}$), such that $N_{js} = N_{js}^{\mathrm{obs}} + N_{js}^{\mathrm{unobs}} + N_{js}^{\mathrm{unsurv}}$. This approach is attractive in that it implicitly includes a finite population correction. For instance, if all animals are detected in a given sampling unit and there is no species misclassification, then abundance in that unit is known with certainty (Ver Hoef 2008; Johnson, Laake & Ver Hoef 2010). Specifically, we can generate abundance predictions as

display math
display math

Predictions of total abundance across the study area can then be calculated as Ns = ∑j Njs.

We also compute a posterior predictive loss statistic to compare the performance of alternative models with different sources of variation in modelled abundance (e.g. different combinations of covariates, presence/absence of spatial autocorrelation). Suggested by Gelfand & Ghosh (1998), this approach measures the ability of a given model to generate data sets similar to the one collected. In particular, a loss statistic is computed for each model m as $D_m = G_m + P_m$, where $G_m$ is a measure of posterior loss and $P_m$ is a penalty for variance. Models with a smaller overall $D_m$ are favoured in this context. Our implementation largely follows that of Conn et al. (2013); see Appendix S1 for further details.
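The sketch below shows a common squared-error form of this statistic, in which the loss term is the summed squared difference between each observed count and its posterior predictive mean, and the penalty is the summed posterior predictive variance. This generic form is an assumption made for illustration; the exact implementation is given in Appendix S1, and the function name and toy inputs are hypothetical.

```python
from statistics import fmean, pvariance


def posterior_predictive_loss(observed, replicates):
    """Return (D, G, P) under squared-error loss.

    observed   : list of observed counts y_j
    replicates : list of replicate data sets, replicates[t][j] being the
                 count simulated for observation j at MCMC iteration t
    """
    loss = 0.0      # G: squared distance from predictive mean to data
    penalty = 0.0   # P: predictive variance summed over observations
    for j, y in enumerate(observed):
        draws = [rep[j] for rep in replicates]
        loss += (y - fmean(draws)) ** 2
        penalty += pvariance(draws)
    return loss + penalty, loss, penalty


# Toy inputs chosen so the result can be checked by hand.
D, G, P = posterior_predictive_loss([2, 4], [[1, 3], [3, 5]])
```

In this toy case the predictive means equal the data, so G = 0, while each observation has predictive variance 1, so P = 2 and D = 2. A richer model tends to lower G but can raise P, which is how the statistic balances fit against complexity.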

Example: Simulated Data

To assess the ability of our proposed model to accurately estimate abundance, we simulated a survey of four species over a 30 × 30 grid (J = 900). We generated true abundance using the same general model structure as used in estimation. Log abundance for each species was generated as a function of several covariates (easting, northing and a Matérn-distributed hypothetical covariate) as well as spatially correlated error where spatial random effects were generated for each species assuming an ICAR (τ = 20) distribution. Covariate relationships were configured such that for species one, abundance intensity increased linearly in both eastern and northern directions; species two exhibited a low but constant abundance across the landscape; species three exhibited a high abundance on the western edge of the landscape which declined slightly towards the east; and species four had a strong relationship with the hypothetical covariate. We also included a fifth ‘species’ in an attempt to mimic anomalous readings (false positives), where expected abundance intensity was set to be constant across the landscape. In some cases, the ICAR random effects obscured the covariate relationships (Fig. 2).

Figure 2.

Simulated (left panels) and estimated (right panels) abundance across a landscape for five hypothetical species. Red circles on estimated abundance panels indicate sampled cells.

We simulated a survey over 200 randomly selected grid cells, assuming that each survey covered 10% of its target sampling unit (Fig. 2) and that double sampling was conducted for 80% of automatic detections. Total sample coverage was thus ≈2·2% of the population. The observation model was built to resemble our seal example (see 'Example: Ice-Associated Seals'); double-sampled animals could be classified as belonging to any of the four target ‘species’ or could be recorded as ‘unknown’ or ‘other’. In addition, there were three classes of target species classification certainty: ‘certain’, ‘likely’ and ‘guess’ (Table 2). Observations were determined according to a multinomial distribution, with probabilities given in Appendix S2.
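The stated coverage follows directly from the design, as the short calculation below shows (variable names are ours and purely illustrative).

```python
n_units = 30 * 30          # J, sampling units in the simulated landscape
n_surveyed = 200           # units visited by the simulated survey
prop_within_unit = 0.10    # fraction of each visited unit that is covered

# Fraction of the whole landscape covered by the automatic detection system.
coverage = n_surveyed / n_units * prop_within_unit
```

This gives a coverage of about 0·022, that is, roughly 2·2% of the landscape.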

We supplied our hierarchical model with the same covariates that were used to generate the data (thus utilizing the ‘correct’ functional form and assuming no covariate measurement error), and permitted estimation of RSR ICAR random effects. We fixed overdispersion relative to the Poisson distribution to be small (τν = 100) to stabilize estimation (see 'Discussion'). We summarized the posterior distribution by running the Markov chain for 600 000 iterations, discarding 100 000 iterations as a burn-in and recording values from every 100th iteration to save disk space. This procedure took ≈ 2·5 days on a 2·93-GHz Dell Precision T1500 desktop with 8·0 GB of RAM.
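For reference, the number of posterior samples retained under this run length, burn-in and thinning interval can be computed as follows (variable names are illustrative).

```python
n_iterations = 600_000   # total MCMC iterations
n_burn_in = 100_000      # iterations discarded as burn-in
thin = 100               # keep every 100th iteration

# Number of posterior samples stored for summarizing the posterior.
n_retained = (n_iterations - n_burn_in) // thin
```

The posterior summaries are therefore based on 5000 stored samples.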

Example: Ice-Associated Seals

We conducted aerial surveys of four ice-associated seal species (bearded seals Erignathus barbatus, ribbon seals Histriophoca fasciata, ringed seals Phoca hispida and spotted seals Phoca largha) over the eastern Bering Sea between 10 April and 22 May 2012. Our strategy was to use infrared cameras as an automatic detection procedure and to use a set of independent, automated digital photographs as a form of double sampling (Fig. 3). Two aircraft were used in surveys: a NOAA DeHavilland DHC-6 Twin Otter and an AC-690 Aero Commander. The Twin Otter was configured with three FLIR SC645 far-IR infrared cameras with 25 mm lenses measuring data in the 7·5- to 13-μm wavelength range, each of which was paired with a 21 megapixel high-resolution digital single-lens reflex (SLR) camera fitted with a 100 mm lens. All six cameras were mounted in the belly port of the airplane. To avoid counting the same animal twice, the infrared cameras were mounted such that their thermal swaths abutted each other but did not overlap. At a target altitude of 300 m, this configuration produced a thermal swath width of c. 470 m. The Aero Commander was similarly configured with two sets of infrared and SLR cameras, resulting in a thermal swath width of c. 280 m. SLRs were automated to take pictures approximately every 1–1·4 s; at a target speed of 130 kts, photographs covered ≈84% of the thermal swath.

Figure 3.

A composite image showing two high-resolution digital photographs (left) with matched thermal hot spots (right). Thermal videos are screened for such hot spots, and corresponding photographs are searched (when available) to provide information on species identity.

As the quantity and distribution of sea ice varied considerably over the course of the surveys, we selected 10 flights that provided good spatial coverage within a 1-week period (20–27 April) for analysis (Fig. 4), assuming that abundance was constant over the study area during this period. Analysis was also limited to one set of cameras from each plane. In total, our analysis included 9076 km of survey effort (40·7 h of flying time). We limited effort to times and locations when altitude was 228·6–335·3 m and roll was <2·5° from centre. Aircraft yaw could not be calculated reliably and was not included in area calculations.

Figure 4.

Map of eastern Bering Sea study area showing 25 × 25 km sampling units and survey lines for flights that were included in the analysis. The western boundary of the study area was determined by the U.S. Exclusive Economic Zone (EEZ); the southern boundary was determined by limiting analysis to cells that had ≥1% sea ice for at least one day from 1 April 2012 to 20 May 2012. Cells comprised of >99% land were excluded from analysis.

We compiled several covariates we thought might be useful in predicting seal abundance in our study area. These included marine ecoregion (cf. Piatt & Springer 2007), distance from mainland, distance from 1000-m depth contour, sea ice concentration, distance from southern ice edge and distance from 10% sea ice contour (Fig. 5). Remotely sensed sea ice data were obtained at a 25 × 25 km resolution from the National Snow and Ice Data Center, Boulder, CO, USA, on an EASE Grid 2.0 projection. We used this projection and same resolution to define sampling units (Fig. 4). Calculations of covariates were made relative to the centroid of each sampling unit.

Figure 5.

Covariates assembled to help predict seal abundance in the eastern Bering Sea. Covariates include average proportion of sea ice while surveys were conducted (‘ice_conc’), distance from 1000-m depth contour (‘dist_shelf’), distance from mainland (‘dist_mainland’), distance from 10% sea ice contour (‘dist_contour’), distance from southern sea ice edge (‘dist_edge’) and ecoregion (see Piatt & Springer 2007). All covariates other than ice_conc and ecoregion were standardized to have a mean of 1·0 prior to plotting and analysis. Unsampled ecoregions were combined with the closest sampled ecoregion for estimation.

To estimate the probability of detection associated with infrared detections (ps), a technician manually searched an independent, systematic random sample of 11 724 digital photographs (out of a total of 117 225 images) for the presence of seals. The technician spent c. 120 h searching photographs and found a total of 70 seal groups. We then examined whether these seal groups were also detected as hot spots using our infrared hot spot detection method, finding that 66 (94·3%) of them were detected. As species could not always be identified, we set ps = p for all species and used these data to help estimate the overall probability of detection (see below). For reference, the conditional probability of detection for our technician (calculated using seals detected by infrared) was lower at 66/82 = 80·5%.
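The proportions quoted in this paragraph follow directly from the counts. The short sketch below reproduces them and, assuming a binomial likelihood with a uniform Beta(1,1) prior on the infrared detection probability, summarizes the resulting Beta posterior. Variable names are illustrative and are not taken from the authors' R scripts.

```python
# Counts reported in the text.
photos_searched = 11_724
photos_total = 117_225
groups_found_in_photos = 70    # seal groups found by the technician
groups_also_hot_spots = 66     # of those, also detected by infrared
ir_seals_checked = 82          # infrared-detected seals used to assess the technician

sampling_fraction = photos_searched / photos_total
p_infrared = groups_also_hot_spots / groups_found_in_photos
p_technician = groups_also_hot_spots / ir_seals_checked

# Assumed model: Beta(1, 1) prior + binomial data -> Beta(1 + hits, 1 + misses).
alpha = 1 + groups_also_hot_spots
beta = 1 + (groups_found_in_photos - groups_also_hot_spots)
posterior_mean = alpha / (alpha + beta)

print(f"photo sampling fraction: {sampling_fraction:.1%}")
print(f"infrared detection:      {p_infrared:.1%}")
print(f"technician detection:    {p_technician:.1%}")
print(f"posterior for p: Beta({alpha}, {beta}), mean {posterior_mean:.3f}")
```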

We obtained data on availability probability (ajs) from ARGOS-linked satellite transmitters affixed to spotted, bearded and ribbon seals in the Bering Sea from 2004 through 2012. A conductivity sensor placed on each transmitter provided hourly data on the proportion of time each tag was dry. As in previous analyses (e.g. Bengtson et al. 2005; Ver Hoef, London & Boveng 2010), dry-time percentages were converted into Bernoulli responses to analyse seal haul-out behaviour, where a success was recorded whenever tags were mostly (≥50%) dry in a given hour (seals could only be detected by thermal imagery when they were out of water). Because we were only interested in explaining variation in haul-out behaviour during spring, we limited analysis to records between 1 February and 31 July of each year, treating each individual-year combination as an independent replicate (i.e. data for individuals obtained in two separate years were treated as if they were statistically independent). This approach resulted in a total of 19 individual-year combinations for bearded seals, 92 for ribbon seals and 55 for spotted seals. These data were analysed within a generalized linear mixed modelling framework that explicitly acknowledges temporal autocorrelation in responses (see Ver Hoef, London & Boveng 2010). For our purposes, the linear predictor was written as a function of hour of day and day of year. Hour of day was treated as a categorical variable with 24 levels, while day of year was calculated as proportion of year since 1 February. We modelled linear, quadratic and cubic effects for day of year and included all interactions between day of year and hour of day. After separate models were fitted to data for each species, predictions in logit space (Fig. 6) and an associated variance–covariance matrix could be computed for any set of availability probabilities (ajs) of interest using standard mixed model theory (see e.g. Littell et al. 1996; Ver Hoef et al. 2013).
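To make the data preparation concrete, the sketch below converts hourly dry-time percentages to Bernoulli responses and builds one row of fixed-effect covariates (hour of day, a cubic polynomial in day of year, and their interactions). It illustrates the covariate structure only and is not the authors' mixed model: the reference-level coding (hour 0 as baseline) and the division by 365 are assumptions, and the temporal autocorrelation structure is not shown.

```python
from datetime import date


def hauled_out(percent_dry: float) -> int:
    """Bernoulli response: 1 if the tag was mostly (>= 50%) dry in the hour."""
    return 1 if percent_dry >= 50.0 else 0


def day_of_year_covariate(d: date) -> float:
    """Proportion of a year elapsed since 1 February (assumed: days / 365)."""
    return (d - date(d.year, 2, 1)).days / 365.0


def design_row(d: date, hour: int) -> list:
    """Intercept, 23 hour indicators (hour 0 is the assumed reference level),
    cubic day-of-year terms, and all hour-by-day interactions."""
    t = day_of_year_covariate(d)
    poly = [t, t ** 2, t ** 3]
    hour_ind = [1.0 if h == hour else 0.0 for h in range(1, 24)]
    interactions = [h_i * p for h_i in hour_ind for p in poly]
    return [1.0] + hour_ind + poly + interactions


row = design_row(date(2012, 4, 20), hour=13)
responses = [hauled_out(x) for x in (0.0, 49.9, 50.0, 100.0)]
```

With this coding each row has 1 + 23 + 3 + 69 = 96 columns.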

We used the following procedure to produce prior samples of detection probabilities (pjs) for surveyed sampling units:

  1. Determine an average time of day and day of year when sampling was conducted on each sample unit.
  2. For rep ∈ {1, 2, …, 1000}, sample logit(a^(rep)) ~ MVN(μ, Σ), where μ gives mixed model haul-out (availability) predictions in logit space and Σ gives the prediction variance–covariance matrix.
  3. For rep ∈ {1, 2, …, 1000}, sample infrared detection probability as p^(rep) ~ Beta(67, 5), i.e. the 66 detected and 4 missed seal groups, each incremented by one. This formulation implies a conjugate Beta(1,1) prior on ps.
  4. Compute samples of detection probability (availability × infrared detection) by sampling unit as p_js^(rep) = a_js^(rep) × p^(rep).

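The sampling steps above can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the values of μ and Σ are made up, the vector is taken to index three sampling units for a single species, and the Beta(67, 5) draw follows from combining the 66 of 70 detections reported earlier with a Beta(1,1) prior.

```python
import math
import random


def cholesky(S):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L


def sample_mvn(mu, L, rng):
    """One multivariate normal draw: mu + L z, with z standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    return [m + sum(L[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mu)]


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def prior_detection_samples(mu, Sigma, n_rep=1000, seed=1):
    rng = random.Random(seed)
    L = cholesky(Sigma)
    samples = []
    for _ in range(n_rep):
        avail = [inv_logit(x) for x in sample_mvn(mu, L, rng)]  # step 2
        p_ir = rng.betavariate(67, 5)                           # step 3
        samples.append([a * p_ir for a in avail])               # step 4
    return samples


# Hypothetical logit-scale predictions for three sampling units.
mu = [0.2, -0.1, 0.5]
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.05, 0.01],
         [0.00, 0.01, 0.03]]
samples = prior_detection_samples(mu, Sigma)
```

Each row of `samples` is one joint draw of detection probability across sampling units. Keeping the draws joint preserves the correlation induced by the shared infrared detection probability and by Σ.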
Figure 6.

Predicted haul-out (availability) probability as a function of day of year and time of day for each species. Note that estimates are currently unavailable for ringed seals.

Samples of pjs could then be used as a prior distribution within a Metropolis–Hastings step to account for detection probabilities that varied by hour, day of year and species (see Appendix S1 for further details). Note that there were no availability data for ringed seals, so ajs was set to 1·0. As such, ringed seal abundance estimates are uncorrected for availability.

An independent experiment was performed to generate a prior distribution of species classification probabilities (B. McClintock, unpublished data). This analysis used readings by multiple observers and certainty categories (certain, likely, guess) to produce posterior predictions of classification probabilities, with a constraint that observations recorded as ‘certain’ were 100% accurate. These predictions were used directly as a joint prior distribution for the species classification matrix (see Appendix S1). The classification matrix specified by the posterior mean of these predictions is provided in Appendix S2.

We considered several model formulations for each species. Based on prior surveys in the region (see e.g. Conn et al. 2013; Ver Hoef et al. 2013), our a priori expectation was that ribbon and spotted seals would be concentrated in the southern portions of our study area, whereas bearded and ringed seals would be primarily located farther north. We also expected that abundance would be nonlinearly related to sea ice concentration, where zero seals would be detected in cells with no ice and few seals (possibly with the exception of ringed seals) would be detected in cells with 100% ice. Ideally, a model for ribbon seal abundance would be written as a function of the distance from the continental shelf, where nutrient upwelling supports an abundant prey base. However, models with continuous predictors proved problematic for ribbon seals, as covariates (and combinations of covariates) were often maximized in the south-west corner of our study area, producing estimates of abundance that were implausibly high (note that there were considerable gaps in sampling in this region). To avoid extrapolation past the range of observed data, we thus wrote all models for ribbon seals as a function of ecoregion and sea ice only. For the remaining species, we fit two possible models to the data. In the first, the log of abundance intensity was written as an additive function of ice_conc, ice_conc², dist_mainland, dist_shelf, dist_contour and dist_edge. In the second model, the log of abundance intensity was the same as for ribbon seals, namely an additive function of ice_conc, ice_conc² and ecoregion.
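As an illustration of the first model's linear predictor, the sketch below builds one design row and evaluates the log of abundance intensity. The coefficient values are hypothetical and chosen only to show how a negative coefficient on ice_conc² yields an interior maximum. The inclusion of an intercept and the standardization by dividing by the mean (so that covariates have mean 1·0) are assumptions about implementation details not spelled out in the text.

```python
def standardize_to_mean_one(values):
    """Rescale a covariate so that its mean is 1.0 (assumed: divide by the mean)."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


def model1_row(ice_conc, dist_mainland, dist_shelf, dist_contour, dist_edge):
    """Design row: intercept, ice_conc, ice_conc^2 and four distance covariates."""
    return [1.0, ice_conc, ice_conc ** 2,
            dist_mainland, dist_shelf, dist_contour, dist_edge]


def log_intensity(beta, row):
    """Additive linear predictor on the log scale."""
    return sum(b * x for b, x in zip(beta, row))


def ice_conc_at_peak(beta_ice, beta_ice_sq):
    """Ice concentration maximizing a quadratic with a negative squared term."""
    if beta_ice_sq >= 0:
        raise ValueError("no interior maximum unless the squared term is negative")
    return -beta_ice / (2.0 * beta_ice_sq)


# Hypothetical coefficients, for illustration only.
beta = [1.0, 7.0, -5.0, 0.0, 0.0, 0.0, 0.0]
peak = ice_conc_at_peak(beta[1], beta[2])
at_peak = log_intensity(beta, model1_row(peak, 1.0, 1.0, 1.0, 1.0))
```

A quadratic of this form can reproduce the expected pattern of few seals at both very low and very high ice concentrations, but it does not by itself force predicted abundance to zero in ice-free cells.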

We initially tried to fit models that included RSR ICAR random effects, but these often produced overinflated estimates of abundance in areas where there were large gaps in spatial coverage, even when the spatial neighbourhood defining the Q structure matrix was a relatively smooth RW2 structure (as in Rue & Held 2005, section 3.4.2). As such, we limit estimation to pure trend surface models that do not include spatial autocorrelation (i.e. νs = Xsβs+εs), acknowledging that posterior predictions of abundance likely overstate precision (see 'Discussion'). Initial runs also produced positive predictions of seal abundance in cells without ice, likely because we only surveyed cells that had ice. To anchor this intercept at zero, we introduced dummy data into estimation that indicated we encountered zero seals in cells with <0·1% sea ice. As with the simulated data example, we set τνs = 100 and summarized the posterior distribution by running the Markov chain for 600 000 iterations, discarding 100 000 iterations as a burn-in and recording values from every 100th iteration to save disk space. This procedure took ≈3·5 days on a 2·93-GHz Dell Precision T1500 desktop with 8·0 GB of RAM.
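For reference, the chain settings above imply 5000 retained posterior samples per parameter. The sketch below shows the arithmetic, assuming that thinning starts at the first post-burn-in iteration (the exact indexing convention is an assumption).

```python
N_ITER = 600_000   # total Markov chain iterations
BURN_IN = 100_000  # iterations discarded as burn-in
THIN = 100         # keep every 100th iteration

retained_iterations = range(BURN_IN, N_ITER, THIN)
n_retained = len(retained_iterations)


def thin_chain(chain, burn_in, thin):
    """Drop the burn-in, then keep every `thin`-th value."""
    return chain[burn_in::thin]


# Small demonstration on a toy chain of 1000 values.
toy = thin_chain(list(range(1000)), burn_in=200, thin=100)
```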


Simulated Data

Posterior predictive distributions for estimated abundance reasonably approximated the spatial distribution for each species (Fig. 2), and posterior predictive distributions of total abundance captured true abundance in all cases (Fig. 7). This suggests that our estimation scheme produces reasonable estimates, at least for the sample coverage (≈2·2%) and high frequency of double sampling (80%) assumed here.

Figure 7.

Posterior predictive distributions for species abundance as estimated from simulated data. True values are indicated in red.

Ice-Associated Seals

Our posterior loss statistic favoured model 1 (with continuous covariates for all species other than ribbon seals; math formula) over model 2 (where ecoregion was used for all species; math formula), although estimated seal abundance was similar for each. Patterns in seal abundance conformed to our a priori expectations regarding species distributions for each model; for brevity, we present overall abundance estimates (Fig. 8) and mean posterior prediction abundance maps (Fig. 9) from model 1 only. Posterior mean density estimates, calculated using an effective study area of 767 114 km², were 0·39 bearded seals km⁻² (95% CI 0·32–0·47), 0·24 ribbon seals km⁻² (95% CI 0·19–0·30) and 0·60 spotted seals km⁻² (95% CI 0·51–0·73). We also were able to estimate the relationship between seal abundance and ice concentration, finding that for most species abundance was maximized when the proportion of sea ice in a sampling unit was in the 0·6–0·8 range (Fig. 10).
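The reported densities can be converted back to approximate totals by multiplying by the effective study area. The sketch below does this arithmetic. Because the densities are rounded to two decimals, the results are only rough indications of the posterior mean abundances shown in Fig. 8, not the authors' exact estimates.

```python
EFFECTIVE_AREA_KM2 = 767_114

# Posterior mean densities (seals per km^2) as reported in the text.
density = {"bearded": 0.39, "ribbon": 0.24, "spotted": 0.60}

# Approximate totals implied by the rounded densities.
implied_abundance = {
    species: round(d * EFFECTIVE_AREA_KM2) for species, d in density.items()
}

for species, n in implied_abundance.items():
    print(f"{species:8s} ~{n:,} seals")
```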

Figure 8.

Posterior predictive distributions of seal abundance in the eastern Bering Sea from Model 1. Estimates of ringed seal abundance are uncorrected for haul-out (availability) probability.

Figure 9.

Mean posterior predictions of seal abundance across our study area in the eastern Bering Sea. Legends indicate posterior predictions of abundance in 25 × 25 km grid cells. Abundance for ‘other’ indicates abundance of other heat signatures that were not seals (e.g. sea lions, walrus, melt pools, birds).

Figure 10.

Mean posterior prediction of abundance for each seal species as a function of sea ice concentration. Predictions for each species were made by setting all other modelled covariates to their means, so that they are best interpreted as the relative effect of sea ice on a species-specific basis (absolute values of predictions are not necessarily biologically meaningful). For instance, predicted ribbon seal abundance was calculated by averaging predicted abundance over all ecoregions, some of which had a predicted abundance near zero.


Automated detection systems offer several potential advantages over human observer surveys. For example, infrared survey flights can be flown faster and at higher altitudes than conventional (human observer) surveys, increasing the effective area that can be surveyed, decreasing the likelihood of animal disturbance and making surveys safer for pilots and crew. Surveys using automated detection devices have the added advantage of providing a physical, archivable record of animal detections. However, such surveys can still miss animals or pick up non-target signatures. Here, we have shown that double sampling (in the seal example, digital photography) is a viable avenue for allowing species-specific inferences about abundance from automated detection data. However, this approach requires rather sophisticated hardware and software, as well as modelling techniques to account for the vagaries of the detection process, including imperfect detection, availability <1, anomalies (false positives) and species misclassification (note that these factors also occur in studies with human observers, even if they are usually ignored!). Despite the complexity, the simulation study suggested that our approach is capable of estimating maps of species distributions that capture large-scale trends in abundance, with posterior predictive distributions of total abundance including true values.

The subset of seal data we used was quite sparse, with survey tracks covering about 0·4% of the study area. Nevertheless, we were able to fit trend surface models to these data and generate posterior predictions for abundance that largely reflected our a priori expectations. For instance, our seal density estimates compared favourably to results from 2006 helicopter transect surveys over a 279 880-km² subset of our study area (Ver Hoef et al. 2013), where densities were estimated as 0·22 bearded seals km⁻² (95% CI 0·12–0·61), 0·22 ribbon seals km⁻² (95% CI 0·13–0·68) and 0·84 spotted seals km⁻² (95% CI 0·49–2·83). In addition to the actual numbers, the relationships between abundance and underlying landscape and environmental covariates may also be of interest. For instance, we were able to relate seal abundance to landscape features (e.g. distance from land), remotely sensed sea ice data and ecoregion, and to compare alternative models via a posterior loss statistic. Seal density appeared to peak at slightly higher values of sea ice concentration than previously observed (cf. Ver Hoef et al. 2013), possibly due to the uncharacteristically high levels of ice in the Bering Sea in 2012. We plan to build upon this modelling framework to arrive at more definitive estimates of seal abundance and covariate relationships in the near future. This effort will likely include adding a temporal dimension in the process model to account for changing sea ice conditions (Ver Hoef et al. 2013) and expanding the survey grid to include data from concurrent Russian surveys in the western Bering Sea. Owing to current CPU run-times (i.e. 3·5 days for the seal analysis), this effort will also likely require improvements to computer code (e.g. by using parallel processing).
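One crude way to make the comparison with the 2006 helicopter surveys explicit is to check whether the reported 95% intervals overlap. The sketch below does so using the intervals quoted in the text. Overlap is only an informal consistency check and not a formal test, particularly since the two surveys covered different areas and years.

```python
# 95% intervals for density (seals per km^2), as quoted in the text.
this_study = {
    "bearded": (0.32, 0.47),
    "ribbon": (0.19, 0.30),
    "spotted": (0.51, 0.73),
}
helicopter_2006 = {
    "bearded": (0.12, 0.61),
    "ribbon": (0.13, 0.68),
    "spotted": (0.49, 2.83),
}


def intervals_overlap(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


overlap = {
    species: intervals_overlap(this_study[species], helicopter_2006[species])
    for species in this_study
}
```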

Although not presented here, our experience with fitting models to both simulated and real data is that there needs to be relatively intense spatial coverage to support estimation of overdispersion (i.e. τνs) and/or spatial random effects using our modelling approach. Since modelling occurs on the log of abundance intensity, the tendency with overparameterized models is for positive bias, particularly in unsampled cells. The robustness of our approach is best viewed along a continuum. With low spatial coverage, trend surface models (i.e. those without spatial autocorrelation) may still do a reliable job of predicting abundance at the expense of overstated precision. However, even with trend surface models, investigators should take care to avoid situations where the linear predictor for abundance has maximum values in unsampled areas. With higher levels of spatial coverage (and low species misclassification rates), estimation of spatial random effects and overdispersion may be more reliable, particularly when considering reduced rank spatial models like the RSR approach outlined in Appendix S1.
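One way to see why uncertainty on the log scale tends to produce positive bias is through the mean of a lognormal variable: if the log of intensity is normal with mean μ and standard deviation σ, the expected intensity is exp(μ + σ²/2), which exceeds exp(μ) by a factor that grows quickly with σ. The sketch below tabulates that factor. It is an illustration of the general mechanism under a normality assumption, not a reproduction of the authors' model, and the σ values are arbitrary.

```python
import math


def lognormal_mean(mu, sigma):
    """Mean of exp(X) when X ~ Normal(mu, sigma^2)."""
    return math.exp(mu + 0.5 * sigma ** 2)


def inflation_factor(sigma):
    """Ratio of the lognormal mean to exp(mu); it depends on sigma only."""
    return math.exp(0.5 * sigma ** 2)


mu = math.log(100.0)  # e.g. a median of 100 animals in a cell (hypothetical)
for sigma in (0.25, 0.5, 1.0, 2.0):
    print(f"sigma={sigma:4.2f}  mean={lognormal_mean(mu, sigma):8.1f}  "
          f"inflation={inflation_factor(sigma):5.2f}x")
```

Poorly informed cells, such as those far from any survey track, are where σ is largest, which is consistent with the inflated predictions described above.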

The methods developed in this paper were largely motivated by our seal data, and we recognize that further developments and refinements may be needed when different automatic detection systems and double sampling strategies are employed. For example, our use of double sampling data to estimate detection probability implicitly relies on the assumption that animal detections in each data set (automatic detection, double sampling) are independent. We think this assumption is reasonable in our seal example, but would likely fail in terrestrial applications where habitat cover affects thermal and visual detections similarly (Franke et al. 2012). For many terrestrial applications, as well as surveys using automated image processing, an alternative double sampling data set would likely be needed (e.g. using surveys of known animals). For auditory surveys, assessment of error rates could be conducted using test data sets where the true species is known. However, auditory surveys would likely need to account for additional factors such as cue rate and variation in auditory detection distances (Marques et al. 2013), as well as availability probability (Diefenbach et al. 2007).

An additional consideration is the amount of double sampling that needs to be conducted. Required coverage largely depends on the amount of information provided by double sampling, as well as the propensity for spatial variation in detection errors. In our seal example, double sampling (i.e. automated digital photography) was used in at least three ways: (i) to provide species observations (including false positives), (ii) to estimate detectability and (iii) to examine species identification errors via an experiment with multiple image readers. Since species distributions and false-positive rates varied considerably across the landscape, it was necessary to have considerable coverage in double sampling data (e.g. 79% of detected hot spots had associated photographs). By contrast, we only searched ≈10% of available images to estimate detection probability and used 716 photographs for our species identification experiment. We did not expect these error rates to vary spatially, and target proportions were largely informed by power analysis (J. Ver Hoef, unpublished data). We do not expect to increase these latter sample sizes in future work (i.e. even when increasing the number of flights), since we expect to use the same technology and transect protocols.

Our approach was to account for missed animals by including detection probability as a thinning parameter relative to the log-Gaussian Cox process, which is likely appropriate for many populations. However, when automated detection of animals is a function of individual-level covariates (e.g. size, distinctiveness), an alternative approach such as data augmentation (Royle 2009; Conn, Laake & Johnson 2012) would likely be necessary since detection probability must then be modelled at the level of the individual animal. Additional approaches to account for overdispersion (e.g. zero inflation, variance inflation factors) would also be useful and are a subject of current research.
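The thinning idea can be illustrated with a small simulation: if the number of animals in a unit is Poisson with mean λ and each animal is detected independently with probability p, the detected count is Poisson with mean λp. The sketch below checks this numerically for a single unit with fixed λ. It omits the log-Gaussian random effects and the area surveyed, and the values of λ and p are arbitrary.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def thinned_count(lam, p, rng):
    """Return (true count, detected count) for one sampling unit."""
    n = poisson(lam, rng)
    detected = sum(1 for _ in range(n) if rng.random() < p)
    return n, detected


rng = random.Random(42)
lam, p = 5.0, 0.75
draws = [thinned_count(lam, p, rng) for _ in range(20_000)]
mean_true = sum(n for n, _ in draws) / len(draws)
mean_detected = sum(d for _, d in draws) / len(draws)
corrected = mean_detected / p  # dividing by p recovers lam on average
```

Here `mean_detected` should be close to λp = 3·75 and `corrected` close to λ = 5. When detection depends on individual covariates, a single p per unit no longer applies, which is the situation the data augmentation alternative addresses.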

It is important to contrast our approach in this paper, which uses double sampling to estimate detection probability, to approaches that rely on temporal replication or distance data. For instance, N-mixture models (Royle, Dawson & Bates 2004) also specify a hierarchical framework for spatially replicated animal count data; in this case, a population closure assumption and temporal replication render detection probability estimable. However, since detection probability includes a number of processes (e.g. detectability, availability; cf. Nichols, Thomas & Conn 2009), it is usually not possible to scale up to absolute abundance with N-mixture models. Another related approach uses hierarchical models for distance sampling data, which rely on the assumption of declining detection with increased distance from the transect line to help estimate detection probability (cf. Schmidt et al. 2011; Conn, Laake & Johnson 2012; Ver Hoef et al. 2013).

Despite the complexities associated with modelling the detection process, we are optimistic about the future of automated detection systems as a tool for estimating animal abundance over large spatial domains. These tools provide the means to markedly increase survey coverage and reduce data processing times. Hierarchical models, like the one we have developed in this paper, provide a natural framework to combine multiple data sets that can be used to estimate different components of the detection process, and to correctly propagate uncertainties associated with each component into final estimates.


We thank all NOAA personnel and contractors who helped collect and process seal data and J. Jansen, D. Johnson, J. Laake and an anonymous reviewer for providing helpful comments on a previous version of this manuscript. Funding for aerial surveys was provided by the U.S. National Oceanic and Atmospheric Administration and by the U.S. Bureau of Ocean Energy Management (Interagency Agreement M12PG00017). Most of the bearded seal haul-out data were collected and made available by the Native Village of Kotzebue and the Alaska Department of Fish and Game, with support from the Tribal Wildlife Grants Program of the US Fish and Wildlife Service (Grant Number U-4-IT). The satellite telemetry studies were conducted under the authority of Marine Mammal Protection Act Scientific Research Permits 15126, 782-1765, 782-1676 and 358-1787. Views expressed are those of the authors and do not necessarily represent findings or policy of any government agency. Use of trade or brand names does not indicate endorsement by the U.S. government.

Data accessibility

R scripts: uploaded as online supporting information.

Seal survey data: uploaded as online supporting information.