Accounting for imperfect detection and survey bias in statistical analysis of presence-only data

Authors

  • Robert M. Dorazio

    Corresponding author
    1. Southeast Ecological Science Center, US Geological Survey, Gainesville, FL, USA
    • Correspondence: Robert Dorazio, US Geological Survey, Southeast Ecological Science Center, 7920 NW 71 Street, Gainesville, FL 32653, USA.

      E-mail: bdorazio@usgs.gov


  • Editor: Niklaus Zimmermann

Abstract

Aim

During the past decade, ecologists have attempted to estimate the parameters of species distribution models by combining locations of species presence observed in opportunistic surveys with spatially referenced covariates of occurrence. Several statistical models have been proposed for the analysis of presence-only data, but these models have largely ignored the effects of imperfect detection and survey bias. In this paper, I describe a model-based approach for the analysis of presence-only data that accounts for errors in the detection of individuals and for biased selection of survey locations.

Innovation

I develop a hierarchical statistical model that allows presence-only data to be analysed in conjunction with data acquired independently in planned surveys. One component of the model specifies the spatial distribution of individuals within a bounded, geographic region as a realization of a spatial point process. A second component of the model specifies two kinds of observations: the detection of individuals encountered during opportunistic surveys and the detection of individuals encountered during planned surveys.
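The two-component structure described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the model specified in the paper: the region is discretized into grid cells, the intensity is log-linear in a single covariate, opportunistic detection depends on a single effort covariate through a logit link, and planned surveys are repeated counts with a constant detection probability. The function name `simulate_surveys` and all coefficient values are hypothetical.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate_surveys(seed=0, n_cells=2500, n_planned=50, n_visits=3):
    """Simulate latent abundance plus two kinds of observations.

    Assumptions (illustrative only): a gridded region, a log-linear
    intensity, and arbitrary coefficient values.
    """
    rng = np.random.default_rng(seed)

    # Component 1: individuals as a realization of a Poisson point process.
    # Counts in disjoint cells are independent Poisson variables.
    x = rng.normal(size=n_cells)            # covariate of occurrence
    expected = np.exp(-1.0 + 1.0 * x)       # expected individuals per cell
    abundance = rng.poisson(expected)       # latent, never observed directly

    # Component 2a: opportunistic (presence-only) detections.
    # Each individual is detected independently with a probability that
    # varies with survey effort, which is the source of survey bias.
    w = rng.normal(size=n_cells)            # covariate of survey effort
    p_opportunistic = expit(-1.0 + 1.5 * w)
    presence_only = rng.binomial(abundance, p_opportunistic)

    # Component 2b: planned surveys at randomly selected cells, with
    # repeated counts that carry information about detectability.
    planned_sites = rng.choice(n_cells, size=n_planned, replace=False)
    p_planned = 0.6
    n_rep = np.repeat(abundance[planned_sites][:, None], n_visits, axis=1)
    planned_counts = rng.binomial(n_rep, p_planned)

    return {
        "abundance": abundance,
        "presence_only": presence_only,
        "planned_sites": planned_sites,
        "planned_counts": planned_counts,
    }


if __name__ == "__main__":
    sim = simulate_surveys(seed=1)
    print("individuals in region:", sim["abundance"].sum())
    print("presence-only detections:", sim["presence_only"].sum())
    print("planned-survey counts, first site:", sim["planned_counts"][0])
```

In this sketch the presence-only detections cover every cell but are thinned unevenly by effort, whereas the planned counts cover only a few cells but are collected under a known protocol.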

Main conclusions

Using mathematical proof and simulation-based comparisons, I demonstrate that biases induced by errors in detection or biased selection of survey locations can be reduced or eliminated by using the hierarchical model to analyse presence-only data in conjunction with counts observed in planned surveys. I show that a relatively small amount of high-quality data (from planned surveys) can be used to leverage the information in presence-only observations, which usually have broad spatial coverage but may not be informative of both occurrence and detectability of individuals. Because a variety of sampling protocols can be used in planned surveys, this approach to the analysis of presence-only data is widely applicable. In addition, because the point-process model is formulated at the level of an individual, it can be extended to account for biological interactions between individuals and temporal changes in their spatial distributions.
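One way to see why presence-only observations alone may not separate occurrence from detectability is the following sketch. It relies on assumed functional forms that are not stated in this abstract: a log-linear intensity $\lambda(s)$ with covariates $x(s)$ and, purely for illustration, a log-linear detection probability $p(s)$ with covariates $w(s)$. The symbols $\beta_0$, $\beta$, $\alpha_0$ and $\alpha$ are hypothetical.

```latex
% Independent thinning of a Poisson point process with intensity \lambda(s)
% by detection probability p(s) yields a Poisson process with intensity
% \lambda(s)\,p(s).
\begin{align*}
\lambda(s) &= \exp\{\beta_0 + \beta^\top x(s)\}, \\
p(s)       &= \exp\{\alpha_0 + \alpha^\top w(s)\}, \\
\lambda(s)\,p(s) &= \exp\{(\beta_0 + \alpha_0) + \beta^\top x(s) + \alpha^\top w(s)\}.
\end{align*}
```

Under these assumed forms, the presence-only locations depend on $\beta_0$ and $\alpha_0$ only through their sum, so the two intercepts cannot be estimated separately from those data alone. Counts from planned surveys, in which detectability can be estimated directly, supply the missing information.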
