Point pattern modelling for degraded presence-only data over large regions


Avishek Chakraborty, Department of Statistics, Texas A&M University, TAMU 3143, College Station, TX 77843, USA.
E-mail: avishekc@stat.tamu.edu


Summary. Explaining the distribution of a species by using local environmental features is a long-standing ecological problem. Often, available data are collected as a set of presence locations only, which precludes the presence–absence analysis that would otherwise be preferred. We propose that it is natural to view presence-only data as a point pattern over a region and to use local environmental features to explain the intensity driving this point pattern. We use a hierarchical model to treat the presence data as a realization of a spatial point process whose intensity is governed by the set of environmental covariates. Spatial dependence in the intensity levels is modelled with random effects involving a zero-mean Gaussian process. We augment the model to capture highly variable and typically sparse sampling effort as well as land transformation, both of which degrade the point pattern. The Cape Floristic Region in South Africa provides an extensive class of such species data. The potential (i.e. non-degraded) presence surfaces over the entire area are of interest from a conservation and policy perspective. The region is divided into about 37000 grid cells. To work with a Gaussian process over such a large number of cells we use a predictive spatial process approximation. We also implement a bias correction by adding a heteroscedastic error component. We illustrate the approach by modelling six different species. We also compare it with the now popular Maxent approach, although the latter is limited with regard to inference. The resultant patterns are important in their own right but also enable a comparative view, for example, to investigate whether a pair of species are potentially competing in the same area. An additional feature of our modelling is the opportunity to make inference about biodiversity through species richness, i.e. the number of distinct species in an areal unit. Such an investigation follows immediately within our modelling framework.
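
As a rough illustration of the degradation idea in the summary, the sketch below computes expected presence counts per grid cell when a log-linear potential intensity is scaled down by land transformation and by sampling effort. It is a minimal sketch under stated assumptions, not the model fitted in the paper: the multiplicative form, the function name degraded_cell_intensity, the variable names and all numerical values are hypothetical, and the random effects are drawn independently rather than from a Gaussian process.

```python
import numpy as np


def degraded_cell_intensity(X, beta, w, area, untransformed, sampled):
    """Expected count per grid cell under an assumed degraded intensity.

    X             : (n_cells, p) environmental covariates (hypothetical)
    beta          : (p,) regression coefficients (hypothetical)
    w             : (n_cells,) random effects on the log-intensity scale
    area          : (n_cells,) cell areas
    untransformed : (n_cells,) proportion of each cell not transformed
    sampled       : (n_cells,) proportion of each cell that was sampled
    """
    # Potential (non-degraded) intensity: log-linear in the covariates.
    potential = np.exp(X @ beta + w)
    # Assumption: degradation acts multiplicatively on the potential intensity.
    return area * potential * untransformed * sampled


rng = np.random.default_rng(0)
n_cells = 500  # far fewer than the roughly 37000 cells mentioned in the summary

# Intercept plus one standardized covariate; values are made up for illustration.
X = np.column_stack([np.ones(n_cells), rng.normal(size=n_cells)])
beta = np.array([0.5, 0.8])

# Stand-in for spatial random effects: independent draws, NOT a Gaussian process.
w = rng.normal(scale=0.3, size=n_cells)

area = np.ones(n_cells)
untransformed = rng.uniform(0.2, 1.0, size=n_cells)  # land transformation
sampled = rng.uniform(0.0, 0.3, size=n_cells)        # sparse sampling effort

mu = degraded_cell_intensity(X, beta, w, area, untransformed, sampled)
potential_mu = area * np.exp(X @ beta + w)

# Observed presence counts: Poisson with the degraded mean in each cell.
counts = rng.poisson(mu)
```

Comparing mu with potential_mu shows how far sparse sampling and transformed land can pull the observed pattern below the potential surface, which is the quantity of interest for conservation.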