Many studies have considered the effect of temperature, and of change points in that effect, on increased mortality. However, the relationship between temperature and mortality cannot be described by a parametric model and depends strongly on the number of change points. Identifying the temperature change points may help prevent further weather-related mortality. Currently available methods consist of two steps: they first estimate the model and then detect change points without testing; methods that simultaneously identify the nonlinear relationship and detect the number of change points are quite limited. Therefore, in this paper, we propose a unified approach that simultaneously estimates the nonlinear relationship and detects multiple change points. Our unified approach is a semiparametric single index multichange points model that adjusts for several other covariates. We also provide a permutation-based testing procedure to detect multiple change points, together with a criterion for predetermining the maximum possible number of change points, which the permutation test requires. Our approach is unaffected by the degree of smoothing of the nonparametric function. We compare the proposed model to the generalized linear model and the generalized additive model in simulations and a real application, and it outperforms both in model fitting and in detection of change points. We also establish the asymptotic properties of the permutation test for the semiparametric single index multichange points model, showing that the estimated number of change points is consistent. The advantage of our approach is demonstrated using mortality data from Seoul, South Korea.
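The permutation idea behind such change-point tests can be illustrated with a minimal sketch for a single mean-shift change point. This is a toy illustration of the permutation principle only, not the paper's semiparametric single index procedure, and the function name is hypothetical:

```python
import numpy as np

def changepoint_perm_test(x, n_perm=999, seed=0):
    """Permutation test for a single mean-shift change point.

    Toy illustration of the permutation idea only; the paper's test is
    built on a semiparametric single index multichange points model."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)

    def stat(z):
        # largest absolute difference in means over all binary splits
        return max(abs(z[:k].mean() - z[k:].mean()) for k in range(2, n - 1))

    obs = stat(x)
    null = [stat(rng.permutation(x)) for _ in range(n_perm)]
    # permutation p-value with the usual +1 correction
    return (1 + sum(t >= obs for t in null)) / (1 + n_perm)
```

Under the null of no change point, the series is exchangeable, so permuting it and recomputing the statistic gives a valid reference distribution without any parametric assumptions.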

Spatio-temporal analysis of small-area health data often involves choosing a fixed set of predictors prior to the final model fit. In this paper, we propose a Bayesian spatio-temporal approach that carries out model selection for particular areas of the study region as well as particular years of the study period. We examine the usefulness of this approach through a large-scale simulation study accompanied by a case study. Our results suggest that a special case of the model selection methods, a mixture model with a weight parameter indicating whether the appropriate linear predictor is spatial, spatio-temporal, or a mixture of the two, offers the best option for fitting these spatio-temporal models. In addition, the case study illustrates the effectiveness of this mixture model within the model selection setting: it easily accommodates lifestyle, socio-economic, and physical environmental variables to select a predominantly spatio-temporal linear predictor.

Under the finite population design-based framework, the spatial coordinates of population locations have traditionally been used to develop efficient sampling designs rather than for estimation or prediction. We propose to enhance design-based individual prediction by exploiting the spatial information derived from geography, which is available for each population element before sampling. Individual predictors are obtained by reinterpreting deterministic interpolators under the finite population design-based framework, making it possible to derive their statistical properties. Monte Carlo experiments on real and simulated data assess the performance of the proposed approach in comparison both with estimators that do not use spatial information and with kriging. We found that under the conditions most favorable to kriging, the proposed predictor performs at least as well, while it outperforms kriging for small sample sizes.
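A familiar example of the deterministic interpolators the abstract reinterprets is inverse-distance weighting. The sketch below is a generic illustration of such an interpolator, not the paper's design-based predictor, and the function and its defaults are our own:

```python
import numpy as np

def idw_predict(coords_s, z_s, coords_u, power=2.0):
    """Inverse-distance-weighted prediction of z at unsampled locations.

    A generic deterministic interpolator: each prediction is a weighted
    average of the sampled values, with weights decaying in distance."""
    coords_s = np.asarray(coords_s, dtype=float)
    coords_u = np.asarray(coords_u, dtype=float)
    z_s = np.asarray(z_s, dtype=float)
    d = np.linalg.norm(coords_u[:, None, :] - coords_s[None, :, :], axis=-1)
    out = np.empty(len(coords_u))
    for i, di in enumerate(d):
        hit = di == 0.0
        if hit.any():                 # exact match: return the observed value
            out[i] = z_s[hit][0]
        else:
            w = di ** -power          # inverse-distance weights
            out[i] = w @ z_s / w.sum()
    return out
```

Because the interpolator is a fixed function of the sample, its randomness under the design-based framework comes entirely from which units were sampled, which is what makes design-based properties derivable.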

In the analysis of most spatiotemporal processes in environmental studies, the observations have skewed distributions. Usually, a single transformation of the data is used to approximate normality, and stationary Gaussian processes are assumed for the transformed data; the choice of transformation is key for spatial interpolation and temporal prediction. We propose a spatiotemporal model for skewed data that does not require any data transformation. The process is decomposed as the sum of a purely temporal structure and two independent components that are taken to be partial realizations of independent spatial Gaussian processes for each time t. The model has an asymmetry parameter that may vary with location and time; when it equals zero, the usual Gaussian model is recovered. Inference is performed under the Bayesian paradigm, so uncertainty about parameter estimation is naturally accounted for. We fit our model to various synthetic data sets and to monthly average temperatures observed between 2001 and 2011 at monitoring stations in the south of Brazil. Several model comparison criteria and analysis of the posterior distributions of some parameters suggest that the proposed model outperforms standard models used in the literature.

We propose a simple piecewise model for a sample of peaks over threshold that is nonstationary with respect to multidimensional covariates, and estimate it using a carefully designed and computationally efficient Bayesian inference scheme. Model parameters are themselves parameterized as functions of covariates using penalized B-spline representations, allowing detailed characterization of non-stationary extreme environments. The approach gives inferences similar to those of a comparable frequentist penalized maximum likelihood method, but is computationally considerably more efficient and allows a more complete characterization of uncertainty in a single modelling step. We use the model to quantify the joint directional and seasonal variation of storm peak significant wave height at a northern North Sea location and to estimate the predictive directional–seasonal return value distributions needed for the design and reliability assessment of marine and coastal structures.

We discuss a numerical algorithm for calculating a large class of analytically intractable theoretical variogram functions that arise in studies of random fields on regular lattices. Examples of such random fields include conditional and intrinsic autoregressions, fractional Laplacian differenced random fields, and regular block averages of continuum random fields. Typically, the variogram functions for these random fields take the form of multi-dimensional integrals, often with singularities at the origin, and the algorithm evaluates these integrals using quadrature rules combined with regression formulas based on the integrals' asymptotic expansions, so that the singularities at the origin can be handled in a straightforward manner. This numerical algorithm opens new avenues for advancing geostatistical data analysis, solving kriging and estimation problems, and exploring the properties of various lattice-based random fields. The usefulness of the method is illustrated by fitting theoretical variogram functions to ocean color and Walker Lake data. Copyright © 2016 John Wiley & Sons, Ltd.

The problems of finding confidence limits for the mean and an upper percentile of a gamma distribution, and upper prediction limits for the mean of a future sample from it, are considered. Simple methods based on the cube root transformation and a fiducial approach are proposed for constructing confidence limits and prediction limits when samples are uncensored or censored. Monte Carlo simulation studies indicate that the methods are accurate for estimating the mean and percentile and for predicting the mean of a future sample as long as the percentage of nondetects is not too large. Algorithms for computing confidence limits and prediction limits are provided, and the necessary R programs are supplied as a supplementary file. The methods are illustrated using some real uncensored and censored environmental data sets.
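The combination of a cube root (Wilson–Hilferty) transformation with fiducial simulation can be sketched as follows. This is a simplified illustration in Python rather than the paper's R programs; the function name and the use of the normal third-moment identity E[Y³] = μ³ + 3μσ² for the back-transformed mean are our own simplifications:

```python
import numpy as np

def gamma_mean_upper_limit(x, conf=0.95, n_sim=10_000, seed=1):
    """Sketch of a fiducial upper confidence limit for a gamma mean.

    Cube roots Y = X**(1/3) are treated as normal (Wilson-Hilferty).
    Fiducial draws of the normal (mu, sigma^2) are mapped to the gamma
    mean via E[Y^3] = mu^3 + 3*mu*sigma^2, and the upper confidence
    limit is read off as a quantile of the simulated means."""
    rng = np.random.default_rng(seed)
    y = np.cbrt(np.asarray(x, dtype=float))
    n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)
    # fiducial draws: sigma^2 ~ (n-1)s^2 / chi^2_{n-1}, then mu | sigma^2
    sig2 = (n - 1) * s2 / rng.chisquare(n - 1, n_sim)
    mu = ybar - rng.standard_normal(n_sim) * np.sqrt(sig2 / n)
    mean_draws = mu**3 + 3.0 * mu * sig2
    return np.quantile(mean_draws, conf)
```

The same fiducial draws, pushed through a different functional of (μ, σ²), would give limits for a percentile or a future-sample mean, which is why this style of method handles all three problems in the abstract with one machinery.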

Expensive computer codes, particularly those used for simulating environmental or geological processes, such as climate models, require calibration (sometimes called tuning). When calibrating expensive simulators using uncertainty quantification methods, it is usually necessary to run the calibration algorithm on a statistical surrogate, called an emulator, in place of the computer code. Although emulators based on Gaussian processes are typically many orders of magnitude faster to evaluate than the simulators they mimic, many applications have sought to speed up the computations further by using regression-only emulators instead, arguing that the extra sophistication brought by the Gaussian process is not worth the extra computational cost; this was the case for the analysis that produced the UK climate projections in 2009. In this paper, we compare the effectiveness of the two emulation approaches within a multi-wave calibration framework, popular in the climate modeling community, called "history matching." We find that Gaussian processes offer significant benefits over regression-only approaches in reducing parametric uncertainty. We also find that in a multi-wave experiment, regression-only emulators in the initial waves followed by Gaussian process emulators in the refocussing waves can be nearly as effective as using Gaussian processes throughout, at a fraction of the computational cost. We further identify a number of design- and emulator-dependent features of the multi-wave history matching approach that can cause apparent, yet premature, convergence of the estimates of parametric uncertainty. We compare these approaches to calibration in idealized examples and apply them to a well-known geological reservoir model.
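The contrast is between a fitted regression surface and a Gaussian-process posterior mean that interpolates the simulator runs. The latter can be written in a few lines; this is a generic sketch with fixed, hand-picked hyperparameters, not the emulators used for the UK climate projections:

```python
import numpy as np

def gp_emulator(X, y, X_new, lengthscale=1.0, sig2=1.0, nugget=1e-8):
    """Posterior mean of a zero-mean Gaussian process with an RBF kernel.

    Minimal sketch of GP emulation: hyperparameters are fixed by hand
    here, whereas a real emulator would estimate them from the runs."""
    X, y, X_new = (np.asarray(a, dtype=float) for a in (X, y, X_new))

    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return sig2 * np.exp(-0.5 * d2 / lengthscale**2)

    K = k(X, X) + nugget * np.eye(len(X))   # jitter for numerical stability
    return k(X_new, X) @ np.linalg.solve(K, y)
```

Unlike a regression-only emulator, this surface passes (up to the nugget) through every training run, and the same kernel algebra yields a predictive variance, which is what history matching uses to rule out implausible parameter settings.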

No abstract is available for this article.

New methods for modeling animal movement based on telemetry data are developed regularly, and with advances in telemetry capabilities, animal movement models are becoming increasingly sophisticated. Despite a need for population-level inference, animal movement models are still predominantly developed for individual-level inference, and most efforts to upscale the inference to the population level are either *post hoc* or complicated enough that only the developer can implement the model. Hierarchical Bayesian models provide an ideal platform for developing population-level animal movement models but can be challenging to fit because of computational limitations or the extensive tuning required. We propose a two-stage procedure for fitting hierarchical animal movement models to telemetry data. The two-stage approach is statistically rigorous and allows individual-level movement models to be fitted separately and then resampled using a secondary MCMC algorithm. Its primary advantages are that the first stage is easily parallelizable and the second stage is completely unsupervised, allowing for an automated fitting procedure in many cases. We demonstrate the two-stage procedure with two applications: a spatial point process approach to modeling telemetry data, and a more complicated continuous-time discrete-space animal movement model. We fit these models to simulated data and to real telemetry data arising from a population of monitored Canada lynx in Colorado, USA.

Ground-level ozone is most harmful to human health at its most extreme levels, and meteorological conditions play a critical role in such episodes. In this work, our aim is to better understand how the effects of the primary meteorological drivers on extreme ozone vary over the Southeast and Mid-Atlantic regions of the USA.

We employ a model based on a bivariate extreme value framework that finds the linear combination of a set of meteorological covariates having the strongest tail dependence with ground-level ozone. To learn about the spatial behavior of the meteorological drivers, we model the coefficients relating the covariates to extreme ozone spatially. Because inference for our extreme value model is not likelihood based, we use a two-stage modeling procedure: we first estimate the coefficients of the extreme value model and their associated uncertainties, and then use these to fit a multivariate spatial process.

We analyze data from 160 air quality stations located in the Environmental Protection Agency Regions 3 and 4, producing estimated spatial surfaces via the fitted spatial process and co-kriging. We find that the relative contribution of the driving meteorological variables to extreme ozone differs between the northern and southern portions of the study region. For instance, we find that air temperature is more important in the northern portion of the region, while low humidity is more influential in the southern portion of the region.
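The notion of tail dependence underlying this framework can be illustrated with the standard empirical estimate of the tail-dependence coefficient χ(u) = P(Y > q_Y(u) | X > q_X(u)). This is a textbook diagnostic, not the paper's fitted bivariate extreme value model:

```python
import numpy as np

def empirical_chi(x, y, u=0.95):
    """Empirical tail-dependence coefficient chi(u): the proportion of
    joint exceedances of the u-quantiles among exceedances of x alone."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    qx, qy = np.quantile(x, u), np.quantile(y, u)
    return np.mean((x > qx) & (y > qy)) / np.mean(x > qx)
```

Values near 1 indicate that extremes of the two variables tend to occur together; a covariate combination maximizing this kind of dependence with ozone is what the model in the abstract seeks.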

Local likelihood-based density estimation methods are developed for multivariate interval-censored data. An extension of the Nadaraya–Watson local regression estimator to the interval-censored data context arises in a natural way. The conditional density estimation scheme is used to study the holdover distribution of lightning-caused wildfire ignitions in Northern Ontario.
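The classical (uncensored) Nadaraya–Watson estimator that the paper extends is, in its simplest Gaussian-kernel form, a standard construction; the sketch below is that classical version, not the interval-censored extension itself:

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h):
    """Classical Nadaraya-Watson kernel regression with a Gaussian kernel:
    a locally weighted average of y_train around each evaluation point,
    with bandwidth h controlling the degree of smoothing."""
    x_train, y_train, x_eval = (np.asarray(a, dtype=float)
                                for a in (x_train, y_train, x_eval))
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w * y_train).sum(axis=1) / w.sum(axis=1)
```

The interval-censored setting replaces the exact responses with intervals known only to contain them, which is what motivates the local likelihood machinery in the abstract.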

A clustering procedure for time series based on the total variation distance between normalized spectral densities is proposed in this work. The approach classifies time series in the frequency domain by considering the similarity of their oscillatory characteristics. As an application, an algorithm for determining stationary periods in time series of random sea waves is developed, a problem in which changes between stationary sea states are usually slow. The proposed clustering algorithm is compared with several other methods that are also based on features extracted from the original series; its performance is comparable to the best available methods and, in some tests, better. The clustering method may be of independent interest.
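The dissimilarity at the heart of the procedure can be sketched as follows, using raw periodograms in place of the smoothed spectral density estimates a careful implementation would use:

```python
import numpy as np

def tv_spectral_distance(x, y):
    """Total variation distance between the normalized periodograms of
    two equal-length series: 0 for identical spectra, at most 1."""
    def norm_spec(z):
        z = np.asarray(z, dtype=float)
        p = np.abs(np.fft.rfft(z - z.mean())) ** 2   # periodogram
        return p / p.sum()                           # normalize to a density
    return 0.5 * np.abs(norm_spec(x) - norm_spec(y)).sum()
```

Because each spectrum is normalized to integrate to one, the distance compares only the shape of the oscillatory content, ignoring overall variance, which is what lets the procedure group series by sea state.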

The gamma distribution has found extensive applications in the modeling and analysis of environmental data. For groundwater and other environmental monitoring, an upper tolerance limit (i.e., an upper confidence limit for a percentile) can be used for checking compliance and is recommended in some of the standards published by the U.S. EPA. In addition, the proportion of samples in which measurements exceed a threshold (the proportion of non-compliance, for example), referred to as an exceedance fraction, is of obvious interest. The computation of an accurate upper tolerance limit and of an accurate upper confidence limit for the exceedance fraction is therefore important. These problems are somewhat challenging because environmental samples are typically not large, so large-sample methods cannot be used. In the present investigation, we obtain accurate solutions to both problems using a small-sample modification of the likelihood-based method. The results are applied to a data set on vinyl chloride concentrations in groundwater monitoring wells.
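For comparison, a simple normal-approximation upper tolerance limit for gamma data can be sketched via the Wilson–Hilferty cube-root transformation. This is a well-known approximation, not the small-sample likelihood modification the paper develops, and the function name is illustrative:

```python
import numpy as np
from scipy import stats

def gamma_upper_tolerance_limit(x, p=0.95, conf=0.95):
    """Approximate (p, conf) upper tolerance limit for a gamma sample.

    Cube roots of gamma data are approximately normal (Wilson-Hilferty),
    so the exact normal one-sided tolerance factor, based on the
    noncentral t distribution, is applied on the cube-root scale and
    the limit is cubed back to the original scale."""
    y = np.cbrt(np.asarray(x, dtype=float))
    n = len(y)
    ybar, s = y.mean(), y.std(ddof=1)
    k = stats.nct.ppf(conf, df=n - 1,
                      nc=stats.norm.ppf(p) * np.sqrt(n)) / np.sqrt(n)
    return (ybar + k * s) ** 3
```

A limit of this kind is meant to exceed the true p-th percentile with confidence conf, so at least 100·conf% of future compliance checks against it should pass when the site is in fact clean.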
