Bayesian networks (BNs) have been widely applied in environmental modelling to predict the behavior of an ecosystem under conditions of change. However, this approach does not take time into account. To address this limitation, an extension of BNs, the dynamic Bayesian network (DBN), has been developed in mathematics and computer science but has scarcely been applied in environmental modelling. This paper presents the application of DBNs to water reservoir systems in Andalusia, Spain. The aim is to predict changes in the percent fullness of the reservoirs under the irregular rainfall patterns of Mediterranean watersheds. In comparison to static BNs, DBNs provide results that can be extrapolated to a particular time so that a climate change scenario can be studied in detail over time. Because results are expressed as density functions rather than single values, several metrics can be obtained from them, including the probability of particular values. This allows the probability that the water level in a reservoir reaches a certain threshold to be computed directly.
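
Since a DBN returns a predictive density rather than a point forecast, the probability of exceeding a management threshold falls out directly. A minimal sketch, using synthetic Beta-distributed draws as a stand-in for the DBN's predictive samples (the distribution and the 50% threshold are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical predictive samples of percent fullness at a future time step,
# standing in for the density function a DBN would output (values in [0, 100]).
samples = rng.beta(2.0, 3.0, size=10_000) * 100.0

def prob_at_least(samples, level):
    """Monte Carlo estimate of P(fullness >= level) from predictive samples."""
    return float(np.mean(np.asarray(samples) >= level))

p50 = prob_at_least(samples, 50.0)  # chance the reservoir is at least half full
```

Any other functional of the predictive density (quantiles, expected shortfall below a level) can be estimated from the same samples in the same way.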

This work presents a case study evaluating water quality dynamics in each of the 56 major catchments in Scotland over a period of 10 years. Data are obtained by monthly sampling of water contaminants in order to monitor discharges from the land to the sea. We are interested in the multivariate time series of ammonia, nitrate, and phosphorus. These time series present issues that make their analysis complex: non-linearity, non-normality, weak dependency, seasonality, and missing values. The goals of this work are the classification of the observations into a small set of homogeneous groups representing ordered categories of pollution, the detection of change-points, and the modeling of data heterogeneity. These aims are pursued by developing a novel spatio-temporal hidden Markov model whose hierarchical structure is motivated by the data set under study: the observations are displayed on a cylindrical lattice and driven by an anisotropic and inhomogeneous hidden Markov random field. As a result, four hidden states were selected, showing that catchments can be grouped spatially, with a strong relationship to the dominant land use. This method provides water managers with a useful tool for obtaining a nationwide picture in combination with temporal dynamics.
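
The state classification rests on hidden Markov machinery. A minimal forward filter for an ordinary two-state hidden Markov chain sketches the idea, though it omits the spatial (random field) component entirely, and the transition and emission values below are made up, not estimated from the Scottish data:

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])             # toy transition matrix between 2 pollution states
init = np.array([0.5, 0.5])            # initial state distribution
# Emission likelihoods p(y_t | state) for an observed sequence of length 4.
lik = np.array([[0.8, 0.3],
                [0.7, 0.4],
                [0.2, 0.9],
                [0.1, 0.9]])

def forward_filter(init, P, lik):
    """Return filtered state probabilities p(state_t | y_{1:t}) for each t."""
    alpha = init * lik[0]
    alpha /= alpha.sum()
    out = [alpha]
    for l in lik[1:]:
        alpha = (alpha @ P) * l        # predict one step, then reweight by the likelihood
        alpha /= alpha.sum()
        out.append(alpha)
    return np.array(out)

filt = forward_filter(init, P, lik)
```

The filtered probabilities track the drift of the observations from the first state toward the second, which is the same mechanism the full model uses to assign catchments to ordered pollution categories.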

Understanding energy consumption patterns of different types of consumers is essential in any planning of energy distribution. However, obtaining individual-level consumption information is often either not possible or too expensive. Therefore, we consider data from aggregations of energy use, that is, from sums of individuals' energy use, where each individual falls into one of *C* consumer classes. Unfortunately, the exact number of individuals of each class may be unknown due to inaccuracies in consumer registration or irregularities in consumption patterns. We develop a methodology to estimate both the expected energy use of each class as a function of time and the true number of consumers in each class. To accomplish this, we use B-splines to model both the expected consumption and the individual-level random effects. We treat the reported numbers of consumers in each category as random variables with distribution depending on the true number of consumers in each class and on the probabilities of a consumer in one class reporting as another class. We obtain maximum likelihood estimates of all parameters via a maximization algorithm. We introduce a special numerical trick for calculating the maximum likelihood estimates of the true number of consumers in each class. We apply our method to a data set and study our method via simulation.
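
A toy version of the aggregation model helps fix ideas: if the observed aggregate at time *t* is the sum over classes of (count × per-class mean-use curve) plus noise, then with known curves the counts are identifiable. The curves, counts, and plain least-squares fit below are illustrative simplifications of the paper's B-spline, maximum-likelihood treatment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Aggregate A(t) = sum_c N_c * f_c(t) + noise. Curves and counts are made up.
t = np.linspace(0.0, 1.0, 48)
f = np.stack([1.0 + 0.5 * np.sin(2 * np.pi * t),    # class 1: residential-like curve
              2.0 + 0.3 * np.cos(2 * np.pi * t)])   # class 2: commercial-like curve
true_N = np.array([120.0, 40.0])
aggregate = true_N @ f + rng.normal(0.0, 0.5, size=t.size)

# With known class curves, the class counts are recoverable by least squares.
N_hat, *_ = np.linalg.lstsq(f.T, aggregate, rcond=None)
```

In the paper the curves themselves are unknown and modeled with B-splines, and the reported counts enter as noisy observations of the true counts, which is what makes the full likelihood machinery necessary.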

Stream water temperature is an important factor in determining the impact of climate change on hydrologic systems. Near-continuous monitoring of air and stream temperatures over large spatial scales is possible due to inexpensive temperature recorders. However, water temperature data are commonly missing due to the failure or loss of equipment. Missing data create difficulties in modeling the relationship between air and stream water temperatures, and pose challenges when the objective is a downstream analysis, for example, clustering streams in terms of the effect of changes in water temperature. In this work, we propose a novel spatial–temporal varying coefficient model to impute missing water temperatures. Modeling the relationship between air and water temperature over time and space increases the effectiveness of imputing the missing water temperatures. A parameter estimation method is developed that utilizes the temporal covariation in the relationship, borrows strength from neighboring stream sites, and is useful for imputing sequences of missing data. A simulation study is conducted to examine the performance of the proposed method in comparison with several existing imputation methods. The proposed method is applied to cluster 156 streams with missing water temperatures into groups with meaningful interpretations.
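
The borrowing-strength idea can be sketched crudely: regress water on air temperature at neighbouring sites and impute a missing site from their averaged coefficients. This is a deliberately simplified stand-in for the spatial–temporal varying coefficient model, with made-up sites and coefficients:

```python
import numpy as np

rng = np.random.default_rng(1)

# Water temperature at each site is roughly linear in air temperature, with
# coefficients that vary slowly across sites (values below are illustrative).
air = rng.uniform(0, 30, size=(3, 100))                   # 3 sites, 100 days of air temp
coef = np.array([[2.0, 0.6], [2.2, 0.62], [2.1, 0.61]])   # per-site (intercept, slope)
water = coef[:, :1] + coef[:, 1:] * air + rng.normal(0, 0.3, size=air.shape)

# Suppose site 0's water record is missing entirely; fit its neighbours 1 and 2
# and impute site 0 from their averaged coefficients.
fits = [np.polyfit(air[s], water[s], 1) for s in (1, 2)]  # each fit: (slope, intercept)
slope, intercept = np.mean(fits, axis=0)
imputed = intercept + slope * air[0]
rmse = float(np.sqrt(np.mean((imputed - water[0]) ** 2)))
```

The full model lets the coefficients vary smoothly in both space and time rather than averaging them, which is what makes long gap sequences imputable.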

We consider geostatistical regression models to predict spatial variables of interest, where likelihood-based methods are used to estimate model parameters. It is known that parameters in the Matérn covariogram cannot be estimated well, even when increasing amounts of data are collected densely in a fixed domain. Although a best linear unbiased predictor has been proposed for the case when model parameters are known, a predictor with estimated parameters is nonlinear and may not be the best in practice. Therefore, we propose an adjusted procedure for the likelihood-based estimates to improve the predictive ability of the nonlinear spatial predictor. The adjusted parameter estimators, based on minimizing a corrected Stein's unbiased risk estimator, tend to have less bias than the conventional likelihood-based estimators, and the resulting spatial predictor is more accurate and more stable. Statistical inference for the proposed method is justified both theoretically and numerically. To verify the practicability of the proposed method, a groundwater data set from Bangladesh is analyzed.
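
Stein's unbiased risk estimator, which the adjusted procedure corrects and minimizes, has a well-known classical form, stated here for the Gaussian model (the paper's corrected, spatial-prediction version differs in detail): for $y \sim N(\mu, \sigma^2 I_n)$ and a weakly differentiable estimator $\hat{\mu}(y)$,

```latex
\mathrm{SURE}(\hat{\mu})
  = -n\sigma^{2}
  + \lVert y - \hat{\mu}(y) \rVert^{2}
  + 2\sigma^{2} \sum_{i=1}^{n} \frac{\partial \hat{\mu}_{i}(y)}{\partial y_{i}},
```

which is an unbiased estimate of the risk $\mathbb{E}\lVert \hat{\mu}(y) - \mu \rVert^{2}$; minimizing a (corrected) estimate of this risk over the covariance parameters is what steers the adjusted estimators toward better predictive performance.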

The problem of choosing spatial sampling designs for investigating an unobserved spatial phenomenon arises in many contexts, for example, in identifying households to select for a prevalence survey to study disease burden and heterogeneity in a study region. We study randomized inhibitory spatial sampling designs to address the problem of spatial prediction while taking account of the need to estimate covariance structure. Two specific classes of design are *inhibitory designs* and *inhibitory designs plus close pairs*. In an inhibitory design, any pair of sample locations must be separated by at least an inhibition distance *δ*. In an inhibitory plus close pairs design, *n* − *k* sample locations in an inhibitory design with inhibition distance *δ* are augmented by *k* locations, each positioned close to one of the randomly selected *n* − *k* locations in the inhibitory design, uniformly distributed within a disk of radius *ζ*. We present simulation results for the Matérn class of covariance structures. When the nugget variance is non-negligible, inhibitory plus close pairs designs demonstrate improved predictive efficiency over designs without close pairs. We illustrate how these findings can be applied to the design of a rolling Malaria Indicator Survey that forms part of an ongoing large-scale, 5-year malaria transmission reduction project in Malawi.
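
Both design classes are easy to simulate. The sketch below uses simple sequential rejection on the unit square (not the specific randomised construction analysed in the paper), with illustrative values of *n*, *k*, *δ*, and *ζ*:

```python
import numpy as np

rng = np.random.default_rng(7)

def inhibitory_plus_close_pairs(n, k, delta, zeta, rng, max_tries=100_000):
    """n - k inhibitory points at least delta apart, plus k close-pair points."""
    pts = []
    tries = 0
    while len(pts) < n - k and tries < max_tries:
        cand = rng.uniform(0, 1, size=2)
        if all(np.linalg.norm(cand - p) >= delta for p in pts):
            pts.append(cand)                      # keep candidates respecting inhibition
        tries += 1
    pts = np.array(pts)
    # Close pairs: uniform in a disk of radius zeta around randomly chosen parents.
    parents = pts[rng.integers(0, len(pts), size=k)]
    r = zeta * np.sqrt(rng.uniform(0, 1, size=k)) # sqrt makes the draw uniform on the disk
    theta = rng.uniform(0, 2 * np.pi, size=k)
    close = parents + np.column_stack([r * np.cos(theta), r * np.sin(theta)])
    return np.vstack([pts, close])

design = inhibitory_plus_close_pairs(n=30, k=5, delta=0.08, zeta=0.02, rng=rng)
```

The inhibitory points spread coverage for prediction, while the close pairs supply the short-distance information needed to estimate the covariance structure, which is the trade-off the paper quantifies.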

In the analysis of most spatiotemporal processes in environmental studies, observations present skewed distributions. Usually, a single transformation of the data is used to approximate normality, and stationary Gaussian processes are assumed to model the transformed data. The choice of transformation is key for spatial interpolation and temporal prediction. We propose a spatiotemporal model for skewed data that does not require any data transformation. The process is decomposed as the sum of a purely temporal structure and two independent components that are considered to be partial realizations of independent spatial Gaussian processes, for each time *t*. The model has an asymmetry parameter that may vary with location and time; if it equals zero, the usual Gaussian model is recovered. The inference procedure is performed under the Bayesian paradigm, and uncertainty about parameter estimation is naturally accounted for. We fit our model to different synthetic data sets and to monthly average temperatures observed between 2001 and 2011 at monitoring stations in the south of Brazil. Different model comparison criteria and analysis of the posterior distribution of some parameters suggest that the proposed model outperforms standard ones used in the literature.
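
The role of an asymmetry parameter can be illustrated with the familiar half-normal construction of skewed draws: adding a scaled |X₀| term to a Gaussian term produces skewness, and a zero coefficient recovers the Gaussian case. This omits all spatial and temporal structure and is not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(3)

def skew_draws(lam, size, rng):
    """lam * |X0| + X1 with X0, X1 standard normal: skewed for lam != 0."""
    x0 = np.abs(rng.standard_normal(size))   # half-normal "skewing" component
    x1 = rng.standard_normal(size)           # symmetric Gaussian component
    return lam * x0 + x1

def sample_skewness(x):
    x = x - x.mean()
    return float(np.mean(x ** 3) / np.mean(x ** 2) ** 1.5)

right = skew_draws(2.0, 200_000, rng)        # positive asymmetry parameter
sym = skew_draws(0.0, 200_000, rng)          # lam = 0 recovers the Gaussian case
```

In the paper this asymmetry coefficient is itself allowed to vary over locations and times, so different regions of the field can be skewed to different degrees.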

No abstract is available for this article.

Under the finite population design-based framework, the spatial coordinates of population locations have traditionally been used to develop efficient sampling designs rather than for estimation or prediction. We propose to enhance design-based individual prediction by exploiting the spatial information derived from geography, which is available for each population element before sampling. Individual predictors are obtained by reinterpreting deterministic interpolators under the finite population design-based framework, making it possible to derive their statistical properties. Monte Carlo experiments on real and simulated data help to assess the performance of the proposed approach in comparison both with estimators that do not employ spatial information and with kriging. We found that, under the conditions most favorable to kriging, the proposed predictor performs at least as well, while outperforming kriging for small sample sizes.
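
Inverse-distance weighting is a typical example of the deterministic interpolators that can be reinterpreted as design-based predictors; a minimal sketch with illustrative coordinates and values:

```python
import numpy as np

def idw_predict(coords, values, target, power=2.0):
    """Predict the value at `target` as an inverse-distance-weighted average."""
    d = np.linalg.norm(coords - target, axis=1)
    if np.any(d == 0):                       # target coincides with a sample point
        return float(values[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * values) / np.sum(w))

# Four sampled locations on the unit square with illustrative values.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([1.0, 2.0, 3.0, 4.0])
center = idw_predict(coords, values, np.array([0.5, 0.5]))
```

Unlike kriging, the interpolator itself is deterministic; under the design-based view, its randomness (and hence its statistical properties) comes from the random selection of the sampled locations.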

Spatio-temporal analysis of small area health data often involves choosing a fixed set of predictors prior to the final model fit. In this paper, we propose a spatio-temporal Bayesian model selection approach that allows model selection to vary over areas of the study region as well as over years in the study timeline. We examine the usefulness of this approach by way of a large-scale simulation study accompanied by a case study. Our results suggest that a special case of the model selection methods, a mixture model allowing a weight parameter to indicate whether the appropriate linear predictor is spatial, spatio-temporal, or a mixture of the two, offers the best option for fitting these spatio-temporal models. In addition, the case study illustrates the effectiveness of this mixture model within the model selection setting by easily accommodating lifestyle, socio-economic, and physical environmental variables to select a predominantly spatio-temporal linear predictor.

The problems of finding confidence limits for the mean and an upper percentile of a gamma distribution, and upper prediction limits for the mean of a future sample from it, are considered. Simple methods based on a cube-root transformation and a fiducial approach are proposed for constructing confidence limits and prediction limits when samples are uncensored or censored. Monte Carlo simulation studies indicate that the methods are accurate for estimating the mean and percentile and for predicting the mean of a future sample as long as the percentage of nondetects is not too large. Algorithms for computing confidence limits and prediction limits are provided. The necessary R programs for calculating confidence limits and prediction limits are also provided as a supplementary file. The methods are illustrated using some real uncensored and censored environmental data sets.
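
For contrast with the paper's closed-form limits, a generic nonparametric bootstrap upper confidence limit for a gamma mean makes the target quantity concrete. This is not the cube-root or fiducial method itself, and the data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic gamma sample standing in for an environmental data set; true mean 6.
sample = rng.gamma(shape=2.0, scale=3.0, size=50)

def bootstrap_upper_limit(x, level=0.95, n_boot=4000, rng=None):
    """Upper confidence limit for the mean via the bootstrap percentile method."""
    rng = rng or np.random.default_rng()
    means = rng.choice(x, size=(n_boot, x.size), replace=True).mean(axis=1)
    return float(np.quantile(means, level))

upper = bootstrap_upper_limit(sample, rng=rng)
```

The paper's cube-root and fiducial limits are designed to stay accurate for small and censored samples, situations where a naive bootstrap like this one can be poorly calibrated.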

Many studies have considered the effect of temperature, and of change points in that effect, on increased mortality. However, the relationship between temperature and mortality cannot be described by a parametric model and is highly dependent on the number of change points. Knowing the change points of temperature may help prevent further weather-associated mortality. The currently available methods consist of two steps: they first estimate the models and then detect change points without testing. Methods for simultaneously identifying the nonlinear relationship and detecting the number of change points are quite limited. Therefore, in this paper, we propose a unified approach that simultaneously estimates the nonlinear relationship and detects multiple change points. Our unified approach is a semiparametric single-index multiple-change-point model that adjusts for several other covariates. We also provide a permutation-based testing procedure to detect multiple change points. A criterion for predetermining the maximum possible number of change points, required by the permutation test procedure, is introduced. Our approach is unaffected by the degree of smoothing of the nonparametric function. Our proposed model is compared to the generalized linear model and the generalized additive model using simulation and a real application, and it outperforms these models in both model fitting and detection of change point(s). We also establish the asymptotic properties of the permutation test for the semiparametric single-index multiple-change-point model, showing that the estimated number of change points is consistent. The advantages of our approach are demonstrated using mortality data from Seoul, South Korea.
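
The permutation logic can be sketched in its simplest form: a test for a single mean-shift change point in a raw series. The paper's test operates inside a semiparametric single-index model with multiple change points, so this is only an illustration of the resampling idea, on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(11)

def max_shift_stat(y):
    """Largest absolute difference in means over all candidate split points."""
    n = y.size
    return max(abs(y[:k].mean() - y[k:].mean()) for k in range(5, n - 5))

def permutation_pvalue(y, n_perm=500, rng=None):
    """p-value for a mean shift: compare the observed statistic with its
    distribution under random permutations of the series."""
    rng = rng or np.random.default_rng()
    obs = max_shift_stat(y)
    perm = [max_shift_stat(rng.permutation(y)) for _ in range(n_perm)]
    return float((1 + sum(p >= obs for p in perm)) / (1 + n_perm))

# Synthetic series with a genuine mean shift at t = 60.
y = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(1.5, 1.0, 60)])
pval = permutation_pvalue(y, rng=rng)
```

Permutation breaks any ordering in the series, so a genuine change point yields an observed statistic far out in the permutation distribution and a small p-value.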

Expensive computer codes, particularly those used for simulating environmental or geological processes, such as climate models, require calibration (sometimes called tuning). When calibrating expensive simulators using uncertainty quantification methods, it is usually necessary to use a statistical model called an emulator in place of the computer code when running the calibration algorithm. Though emulators based on Gaussian processes are typically many orders of magnitude faster to evaluate than the simulators they mimic, many applications have sought to speed up the computations by using regression-only emulators instead, arguing that the extra sophistication brought by the Gaussian process is not worth the extra computational cost. This was the case for the analysis that produced the UK climate projections in 2009. In this paper, we compare the effectiveness of both emulation approaches within a multi-wave calibration framework that is becoming popular in the climate modeling community, called “history matching.” We find that Gaussian processes offer significant benefits for the reduction of parametric uncertainty over regression-only approaches. We also find that, in a multi-wave experiment, using regression-only emulators initially, followed by Gaussian process emulators for refocussing experiments, can be nearly as effective as using Gaussian processes throughout, at a fraction of the computational cost. We also discover a number of design- and emulator-dependent features of the multi-wave history matching approach that can cause apparent, yet premature, convergence of our estimates of parametric uncertainty. We compare these approaches to calibration in idealized examples and apply them to a well-known geological reservoir model.
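
History matching rules out parameter settings via an implausibility measure, comparing the emulator's prediction with the observation relative to all variance sources. A sketch with a toy one-dimensional "emulator" (the quadratic mean, variances, and observation below are all illustrative; the cutoff of 3 is the conventional choice):

```python
import numpy as np

def implausibility(emu_mean, emu_var, z, obs_var, disc_var):
    """I(x) = |z - E[f(x)]| / sqrt(emulator var + observation var + discrepancy var)."""
    return np.abs(z - emu_mean) / np.sqrt(emu_var + obs_var + disc_var)

x = np.linspace(-3, 3, 601)              # candidate parameter settings
emu_mean = x ** 2                        # toy emulator mean of the simulator output
emu_var = np.full_like(x, 0.05)          # emulator (code) uncertainty
z, obs_var, disc_var = 4.0, 0.1, 0.1     # observation, its variance, model discrepancy

I = implausibility(emu_mean, emu_var, z, obs_var, disc_var)
not_ruled_out = x[I < 3.0]               # the "not ruled out yet" parameter space
```

Each wave of a multi-wave experiment rebuilds the emulator on the surviving region, shrinking the variances and refocusing the search, which is where the choice between regression-only and Gaussian process emulators matters.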
