Using density surface models to estimate spatio-temporal changes in population densities and trend

Funding – Centre for Research into Ecological and Environmental Modelling, University of St Andrews and U.S. Geological Survey provided funding for this analysis through a studentship to RJC.


Appendix 1. Variance estimation through posterior simulation
A simple and general method for computing the variance of non-linear functions of model parameters for a fitted generalized additive model (GAM) is outlined here (see also Marra et al. 2012 andWood 2017). This is achieved by simulating a large number of replicate parameter sets from the posterior distribution of the GAM coefficients βˆ. A prediction matrix, X p , maps the model parameters, βˆ, to the linear predictor, η p , as η p = X p βˆ.
A large number of replicate parameter value sets are drawn from the posterior distribution of β which is multivariate normal. The linear predictor is calculated for each of the replicate sets, and resulting quantiles are then used to compute CIs or the variance can be computed empirically.
A smoothing parameter uncertainty correction is incorporated in the posterior simulations as the covariance matrix which is computed as a first-order Taylor expansion (see pg 302 in Wood 2017). In mgcv, version 1.8-17, this correct covariance matrix is computed via vcov(m, unconditional=TRUE) provided m is estimated using REML. We found there to be minor computational costs in computing the unconditional covariance matrix and a wider correction factor in incorporating the smoothing parameter uncertainty (Appendix 1 Fig. 1).
Histograms of 1000 replicate parameter sets from the posterior distribution using a covariance matrix conditional (vcov(m, unconditional=FALSE); left panel) and unconditional (vcov(m, unconditional=TRUE); right panel) on the smoothing parameter.
R code for the above procedure follows. Although these data have been processed successfully on a computer system at the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data for other purposes, nor on all computer systems, nor shall the act of distribution constitute any such warranty. The USGS or the U.S. Government shall not be held liable for improper or incorrect use of the data described and/or contained herein.  (var(as.vector(as.matrix(PredVar.b[i,])))) } Appendix 3. Diagnostics for the distance sampling model Appendix 3 Fig. 4. Detection function plots for the model selected to estimate 'ākepa detection probability. Plots represent the detection probability (left panel) and probability density (right panel). Appendix 3 Fig. 5. Quantile-quantile plot for the detection function model selected to estimate 'ākepa detection probability. The fitted cumulative distribution function (cdf) is plotted against the empirical cdf. The points seem to fall about the straight line, which provides evidence the function adequately fits the data.
Appendix 3 Table 1. Detection function models used to compute density estimates from point-transect distance sampling surveys on Hakalau Forest National Wildlife Refuge, Hawai'i, between 1987 and 2017. Base models (Fun) include half-normal (HN) and hazard-rate (HR) key detection functions with cosine (Cos), hermite polynomial (Hp) and simple polynomial (Sp) adjustment terms (Adj). Covariates were incorporated with the highest AIC-ranked base model included cloud cover, rain, wind strength, gust strength, elevation (Elev), habitat type, minutes since survey start (MinSS), observer and year of survey (Yr). All covariates were treated as factor variables, except minutes since survey start was treated as a continuous variable. Also presented are the number of estimated parameters (Par), negative log-likelihood (-LogLike), AIC values, and change in AIC (∆AIC). † Key model selected.
Key    was less deviation from the straight line (Appendix 5 Fig. 7, 8, 9, 10 and 11). The over-dispersion parameter for the negative binomial was 1.94. In-spection of the Tweedie distribution diagnostic plots revealed that the residual errors were approximately equal to those of the Poisson distribution, but that the Tweedie distribution did not perform as well as the negative binomial dis-tribution. Again, this was particularly apparent in the QQ-plot. The Tweedie over-dispersion parameter, p, was 1.01 and the scale parameter, φ, was 1.009 for the full model.
Thus, the Tweedie distribution tended to a Poisson probability function (Shono 2008).
In addition to model selection using residual diagnostics, we computed Akaike's information criterion (AIC). The negative binomial had the lowest AIC value (Appendix 5 Table 6). Given the diagnostics there was marginally more support that the deviance residuals of the negative binomial model showed acceptable behaviour and had the lowest AIC value, and was therefore chosen for the GAM (Appendix 5 Appendix 5 Fig. 9. Violin plot of deviance residuals for the east term of the spatio-temporal GAM fitted to the 'ākepa count data. A violin plot is a combination of a box plot and density plot that shows the distribution shape of the data. The red dot is the median and the black bar the mean. The interquartile range is indicated by the box and the whiskers the upper and lower adjacent values. Black dots indicate outliers. The density plot portion reveals the distribution of the data showing probability, relative amplitude, of observations. The distribution of the east residuals were highly concentrated around the median with many outliers.
Appendix 5 Fig. 11. Violin plot of deviance residuals for the year term of the detection probability and spatio-temporal GAM fitted to the 'ākepa count data.
There was a concentration of the residuals around the median in the year term and there were a large number of outliers. The medians of the residuals were consistently below the zero line indicating the predicted values tended to be too high.
Appendix 5 Fig. 12. Estimated model terms for the spatio-temporal GAM fitted to the 'ākepa count data. The distribution of the data is visualised in the rug plot along the x-axis for the 1D Easting, Northing and Year plots, while the EDFs are presented on the y-axis labels. The ribbon illustrates the error bounds. The locations of the points are plotted as black dots on the 2D contour plots and the EDF is provided in the plot panel title. Contours at plus and minus 1 SE of estimator variability relative to the mean are shown as blue lines. Estimates provided on the scale of the link function.
Appendix 5 Fig. 13. Estimated model terms for the 3-way smooths of the spatiotemporal GAM fitted to the 'ākepa count data. Maps shown for every three or four years between 1987 and 2017. Variability is minimal during the middle of the time series (middle row) and more variable early and late in the time series (bottom and top rows, respectively). The EDF is provided on the y-axis label and contours at plus and minus 1 SE of estimator variability relative to the mean are shown as black lines.
Appendix 6 Table 9. Effective degrees of freedom (EDF), reference degrees of freedom (rf), χ 2 statistic (χ 2 ) and p-value for the habitat covariate and each smoother term in the fitted spatio-temporal model.