Analysis of runoff extremes using spatial hierarchical Bayesian modeling
Mohammad Reza Najafi,
Department of Civil and Environmental Engineering, Portland State University, Portland, Oregon, USA
Corresponding author: H. Moradkhani, Department of Civil and Environmental Engineering, Portland State University, PO Box 751/Engineering Bldg. 202J, 1930 SW 4th Avenue, Suite 200, Portland, OR 97207-0751, USA. (firstname.lastname@example.org)
A spatial hierarchical Bayesian method is developed to model extreme runoff over two spatial domains in the Columbia River Basin, USA. The method pools the limited data available at different locations. The two spatial domains contain 31 and 20 gage stations, respectively, with daily streamflow records ranging from 30 to over 130 years. The generalized Pareto distribution (GPD) is employed for the analysis of extremes. Temporally independent data are generated using a declustering procedure, in which runoff extremes are first grouped into clusters and the maximum of each cluster is then retained. The GPD scale parameter is modeled with a Gaussian geostatistical process, and additional variables including latitude, longitude, elevation, and drainage area are incorporated by means of a hierarchy. A Metropolis-Hastings within Gibbs sampler is used to infer the parameters of the GPD and the geostatistical process and to estimate the return levels across the basins. The performance of the hierarchical Bayesian model is evaluated by comparing the estimates of 100 year return level floods with the maximum likelihood estimates at sites that are not used during the parameter inference process. Various prior distributions are used to assess the sensitivity of the posterior distributions. The selected model is then employed to estimate floods with different return levels in time slices of 15 years in order to detect possible trends in runoff extremes. The results show cyclic variations in the spatial average of the 100 year return level floods across the basins, with consistent increasing trends distinguishable in some areas.
1. Introduction
Several factors can affect the mean and peak flow in a basin. Changes in precipitation and temperature as a result of climate change can directly influence the streamflow trend [Arnell et al., 2001; Moradkhani et al., 2010; Gao et al., 2011; Kundzewicz et al., 2013]. Because of the long-term increase in temperature, snowmelt, which depends on seasonal temperature, may shift earlier from spring to winter, and this may affect the peak flows [Jung et al., 2012]. Although analyzing the factors that contribute to streamflow generation provides a general view of its behavior, it does not give an accurate picture of the characteristics of the flow regime in a basin. Studying the historical records of streamflow is a meaningful way to detect long-term trends, in the presence of natural variations, that result from climate and land use change [Fu et al., 2010; Karl et al., 2009; Piao et al., 2011a].
The U.S. Geological Survey (USGS) maintains streamflow records at gage sites across the United States as a valuable resource for flood estimation and water resource management. However, gage stations are not uniformly distributed across the river basins, with fewer stations in mountainous regions [Durrans and Tomic, 1996]. Also, recently installed gages provide only short records, which makes a robust flood assessment almost impossible.
The analysis of runoff extremes is possible through extreme value theory [Beirlant et al., 2004; Coles, 2001], which comprises various extreme value distributions. Commonly, the block maximum [Huerta and Sansó, 1997; Kharin and Zwiers, 2012; Sang and Gelfand, 2010] and the extremes over a specific threshold [Durman et al., 2001] are adopted in hydrologic applications. Several studies have performed at-site analysis of extreme events [Frei et al., 2006; Guttorp and Xu, 2007; Halmstad et al., 2011; Kharin et al., 2009; Towler et al., 2009; Villarini et al., 2010]. However, one may expect similar distributions for extreme runoff records at gages that are close to each other. Furthermore, limited records of hydroclimatic extremes in space and time [Fuentes et al., 2013] require the collection of information from different locations in order to reduce the uncertainty of the simulations and provide more reliable results. Methods have been developed for regional frequency analysis (RFA) that are shown to be superior to at-site flood estimation [Burn, 1990; Chokmani and Ouarda, 2004; Dalrymple and Survey, 1960; GREHY, 2011; Gupta et al., 1996; Ouarda and El-Adlouni, 2011; Stedinger and Tasker, 2011]. The index flood method [Dalrymple and Survey, 1960] is an approach that combines extreme data from different locations in order to improve the accuracy of the estimates and to predict floods at ungaged sites. This method was further improved by Hosking and Wallis, who used the L-moments approach to estimate the parameters of the extreme value distribution. They divided the method into three steps: outline a homogeneous region, divide the extreme data at each gage by the index flood, and then fit a distribution to the combined data from all gages. RFA, however, does not consider the spatial components of the point data (e.g., geographic coordinates, elevation) and cannot incorporate additional variables (i.e., covariates) into the analysis. Moreover, it is not possible to explicitly estimate the uncertainties with the L-moments approach.
Recently, with accessible records of spatially scattered or gridded data and high-performance computing, there is growing interest in the analysis of spatially distributed extremes. Applications are found in studies of wind [Fawcett and Walshaw, 2006], precipitation, and temperature [Aryal et al., 2009; Cooley et al., 2007; Sang and Gelfand, 2010; Schliep et al., 1975], among others. For this purpose, it is possible to consider a univariate extreme value distribution at each point (or grid cell) and to build a spatial model on its parameters [Cooley and Sain, 2010; Cooley et al., 2007; Lima and Lall, 2008; Renard, 2011; Sang and Gelfand, 2004; Schliep et al., 1975]. The spatial dependence can also be modeled using the theory of max-stable processes [Coles, 1993; De Haan and Pereira, 2006; Padoan et al., 2010]. In addition, the Bayesian approach is a formal way to quantify uncertainties [Majda and Gershgorin, 1999; Najafi et al., 2010; Tebaldi et al., 1957; Moradkhani et al., 2012] and is flexible in combining different sources of uncertainty [Tebaldi et al., 1957]. By treating the parameters of the extreme value distributions as random variables, one can use the Bayesian method to find their corresponding distributions.
Attempts to develop models for characterizing the spatial dependencies in hydroclimate data and the underlying processes date back to the 1950s [Cressie, 1992; Robinson and Bryson, 2010]. Besag introduced the concept of Markov random fields, which consist of random variables with Markov properties. Hierarchical spatial models became popular after the introduction of Markov Chain Monte Carlo methods [Berliner, 1996; Wikle et al., 2013]. Arab et al. and Banerjee et al. provide a detailed explanation of the origins of hierarchical spatial models. Casson and Coles performed one of the earliest studies on hierarchical spatial modeling, applied to hurricane wind speed. Recently, there has been growing interest in the development and application of hierarchical spatial models for climate variables [Fawcett and Walshaw, 2006; Sang and Gelfand, 2004, 2010; Schliep et al., 1975]. Cooley et al. analyzed precipitation extremes over Colorado using 56 gage records. Cooley and Sain studied the changes in precipitation extremes using regional climate model data for historical and future periods. Sang and Gelfand and Schliep et al. utilized a spatial autoregressive model for the annual maximum rainfall. Renard developed a procedure to account for spatial dependency using copulas in the analysis of rainfall extremes.
Although much attention has been directed toward extremes of climate variables such as precipitation and temperature and their spatial variations, fewer attempts have been made to provide more reliable models for hydrologic extremes such as streamflow. Lima and Lall employed a hierarchical Bayesian model based on a block maxima distribution and incorporated the drainage area as an additional variable to describe the parameters of the distributions at each gage station. However, no spatial analysis was performed for the hierarchical model. Furthermore, this approach disregards other extreme events that are lower than the block maximum. Because extreme events are rare, the peaks-over-threshold approach provides additional data for the analysis [Coles, 2001].
 In this study, a procedure is developed to model the runoff extremes recorded at USGS gage stations given their spatial variations (e.g., latitudes and longitudes), drainage areas, and elevations. This is done by spatially modeling the parameters of an extreme distribution through a hierarchical Bayesian process with latent parameters considered as random variables and simulated using Markov Chain Monte Carlo techniques. Estimates of the return levels for gages not being used in model fitting process are compared with point fit model results in order to validate the procedure. Furthermore, the trend in runoff extreme is assessed using the estimated parameters for time windows of 15 years, starting from 1906 until 2011.
Section 2 starts with the extreme value theory and the generalized Pareto distribution, which form the basis of the hierarchical Bayesian model. The spatial hierarchical Bayesian modeling is then described along with the Markov Chain Monte Carlo (MCMC) parameter estimation. In section 3, the case study is presented along with the model test and trend analysis of the runoff extremes, followed in section 4 by concluding remarks, a brief summary, and future research.
2. Theory and Methodology
 In this section, a procedure is presented to model the runoff extremes recorded at gage stations, given their spatial variations (e.g., latitudes and longitudes), drainage areas, and elevations. This is done by spatially modeling the parameters of an extreme distribution through a hierarchical Bayesian process with latent parameters considered as random variables and simulated using Markov Chain Monte Carlo techniques.
2.1. Extreme Value Theory
Extreme value analysis is developed to characterize the tail of a distribution [Coles, 2001]. Based on asymptotic results, it provides different classes of distributions for characterizing extremes. In hydrologic applications, the generalized extreme value (GEV) distribution is commonly used when data are considered in annual, seasonal, or monthly blocks whose block maxima are taken as independent and identically distributed random variables. A pitfall of the GEV distribution is that it considers only one extreme value (the maximum) in each block and ignores other useful data. In order to increase the number of data, the peaks-over-threshold approach is preferred and used in this study. Given a marginal distribution F, the distribution of the threshold exceedances is given by
$$F_\upsilon(x) = P(Y - \upsilon \le x \mid Y > \upsilon) = \frac{F(x + \upsilon) - F(\upsilon)}{1 - F(\upsilon)} \tag{1}$$
where x > 0 and Y represents a set of iid random variables Yi (e.g., daily runoff). υ is the threshold that distinguishes the extreme events of Y. F is not known in practice; therefore, this distribution is approximated for high values of υ. The asymptotic result (for sufficiently large υ) suggests that the distribution of the observations exceeding υ converges to the generalized Pareto distribution (GPD) [Pickands, 2011b]:
$$P(Y \le y \mid Y > \upsilon) = 1 - \left(1 + \kappa\,\frac{y - \upsilon}{\sigma}\right)^{-1/\kappa} \tag{2}$$
with the probability density function of
$$f(y) = \frac{1}{\sigma}\left(1 + \kappa\,\frac{y - \upsilon}{\sigma}\right)^{-1/\kappa - 1} \tag{3}$$
where κ and σ represent the shape and scale parameters, respectively, and y > υ. The shape parameter characterizes the tail of the distribution, resulting in exponential type (light tail with κ = 0), Pareto type (heavy tail with κ > 0), and beta type (bounded tail with κ < 0) distributions. The GPD considers all data exceeding υ, which highlights the importance of the threshold, since it affects the GPD scale and shape parameters. In general, one may choose the flood level in a region as a threshold. However, the threshold is generally selected statistically, with a trade-off between a higher value that reduces the bias by ensuring the reliability of the asymptotic tail approximation and a lower one that increases the precision (i.e., reduces the variance) of the parameter estimates. There are several methods available for threshold selection, including mean residual life (MRL), dispersion index, and threshold choice [Anagnostopoulou and Tolika, 2012; Lang et al., 2005]. The classical threshold selection approaches are based on graphical diagnostics that assess the model fit with different threshold choices. In the MRL method, for example, the empirical estimates of the sample mean exceedances are plotted versus thresholds. The lowest level above which the plot follows a straight line is chosen as the threshold [Coles, 2001].
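The MRL diagnostic can be sketched in a few lines. The following is a minimal illustration on synthetic data, not the procedure used in this study; the function name `mean_residual_life` and the exponential test series are assumptions made for the example. For an exponential tail the mean excess is roughly constant, so the MRL curve is flat above any threshold.

```python
import numpy as np

def mean_residual_life(values, thresholds):
    """Return the mean excess over each candidate threshold (NaN if no exceedances)."""
    values = np.asarray(values, dtype=float)
    mean_excess = []
    for u in thresholds:
        excess = values[values > u] - u
        mean_excess.append(excess.mean() if excess.size else np.nan)
    return np.array(mean_excess)

# Hypothetical daily flows with an exponential tail (illustrative data only).
rng = np.random.default_rng(0)
flows = rng.exponential(scale=100.0, size=20000)

# Candidate thresholds taken at a few upper quantiles of the record.
candidates = np.quantile(flows, [0.80, 0.90, 0.95, 0.98])
mrl = mean_residual_life(flows, candidates)
```

In practice the mean excess values would be plotted against the candidate thresholds and inspected for the lowest level above which the curve is approximately linear.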
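Let us denote ε = P(Y > υ); then equation (2) can be written as
$$P(Y > y) = \varepsilon\left(1 + \kappa\,\frac{y - \upsilon}{\sigma}\right)^{-1/\kappa} \tag{4}$$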
 The exceedance rate (ε) is the proportion of runoff observations exceeding the threshold (υ).
2.2. Spatial Hierarchical Bayesian Modeling
The power of the hierarchical Bayesian model compared to other extreme analysis techniques lies in the spatial modeling in the process stage, which enables it to borrow strength from data at different locations. The spatial hierarchical model is divided into the so-called data, process, and prior stages [Arab et al., 2007; Banerjee et al., 2004; Cooley et al., 2007; Sang and Gelfand, 2004]. In the first stage of the hierarchy (i.e., the data stage), the extreme data are modeled. At location i and time t, Yi,t exceeds the threshold υi and follows the generalized Pareto distribution with separate scale parameters (σi) and exceedance rates (εi) at each gage. A homogeneous shape parameter (κ) over the study region is assumed in this study. The Yi are assumed to be independent conditional on σi. The likelihood function in the first stage of the model is then the product of the GPD density (equation (3)) over space and time:
$$p(Y \mid \psi, \kappa) = \prod_{i=1}^{n}\prod_{t}\frac{1}{\exp(\psi_i)}\left(1 + \kappa\,\frac{y_{i,t} - \upsilon_i}{\exp(\psi_i)}\right)^{-1/\kappa - 1} \tag{5}$$
with ψi = log (σi), where the inner product runs over the exceedances at gage i.
In the second stage of the model, the GPD parameter is defined based on a latent process that affects the runoff extremes. The log-transformed scale parameter (ψ) is specified through a Gaussian process:
$$\psi \sim \mathrm{MVN}\left(\mu_\psi, \Sigma_\psi\right) \tag{6}$$
This corresponds to a multivariate normal distribution with the mean (μψ) and covariance (Σψ) defined by the latent parameters λ and χ. n denotes the total number of gage stations for each area under study, and q the total number of covariates (equation (7)) plus one (for the intercept). Consequently, a distinct distribution is generated for the scale parameter at each gage station. The scale parameter can be modeled in several ways depending on the definition of its mean and covariance. The mean is taken as a linear function of the covariates:
$$\mu_\psi = \lambda_{\psi 0} + \lambda_{\psi 1}\,\mathrm{Cov}_1 + \dots + \lambda_{\psi, q-1}\,\mathrm{Cov}_{q-1} \tag{7}$$
where Cov is a covariate such as elevation, area, or any other physical/geophysical characteristic of the basin. Spatially distributed hydrologic data are recorded either at gage sites (i.e., geostatistical data) or at areal units such as grids or pixels (i.e., lattice data). Variograms are commonly used for the spatial analysis of gage data, and the conditional, intrinsic, and simultaneous autoregressive (CAR, IAR, SAR) models for grid data [Banerjee et al., 2004; Cressie, 1992]. Having the daily runoff data at gage location i exceeding the threshold υ (i.e., Y(i)), we utilize a geostatistical model based on variograms to spatially define the covariance of the scale parameter (i.e., σ(i)). This rests on the assumption that the scale parameter follows a so-called intrinsically stationary process. The variogram is then defined as
Assuming an isotropic process, that is, one in which the variogram depends only on the distance between gages (d) and not on the direction, a parametric function is considered to define the variogram. There are several functions available for this purpose, such as the linear, exponential, spherical, Gaussian, rational quadratic, and Matérn. Equation (9) corresponds to an exponential covariogram with a nugget value of zero.
$$\mathrm{Cov}\left(\psi_i, \psi_j\right) = \chi_{\psi 1}\exp\left(-\chi_{\psi 2}\, d\right) \tag{9}$$
where d represents the geographic distance between gage stations. χψ1 and χψ2 represent the “sill” and “decay” parameters of the covariogram, respectively. Usually, the “range” of a covariogram is used rather than “decay,” which is defined by the distance at which the correlation between the extreme runoff data is less than 0.05. In exponential covariograms, this distance is approximately 3/χψ2 [Banerjee et al., 2004]. Let us consider a scenario where the spatial correlation between gages is high only for a short distance where there are no nearby points. In this scenario, the model performs the prediction (i.e., of the scale parameter) with a value close to the parameter's mean, as the other points do not provide information to deviate from the mean. Therefore, a mean structure (of the scale parameter) with informative covariates, such as drainage area and elevation, may mostly describe the spatial variation of the parameter.
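The exponential covariogram with zero nugget can be illustrated numerically. The sketch below assumes planar coordinates and Euclidean distance; the function name `exponential_covariance`, the example coordinates, and the parameter values are hypothetical. It also checks that the correlation at a distance of 3/decay is just under 0.05, which is the definition of the range given above.

```python
import numpy as np

def exponential_covariance(coords, sill, decay):
    """Covariance matrix sill * exp(-decay * d) for all pairs of sites (no nugget)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # pairwise Euclidean distances
    return sill * np.exp(-decay * dist)

# Three hypothetical gage locations; the first two are 5 distance units apart.
coords = [[0.0, 0.0], [3.0, 4.0], [30.0, 40.0]]
cov = exponential_covariance(coords, sill=2.0, decay=0.5)

# Correlation remaining at the effective range d = 3 / decay.
corr_at_range = np.exp(-0.5 * (3.0 / 0.5))
```

The diagonal of `cov` equals the sill, and distant pairs contribute almost nothing, which is why the mean structure dominates predictions far from any gage.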
 The third stage of the hierarchical model defines the priors of the latent parameters. Independence between the parameters is assumed in all stages of the model including the prior stage:
 These stages are then combined in a Bayesian framework:
where and with ψ = log (σ), which lets ψ take both positive and negative values. σ and κ are GPD scale and shape parameters. Equation (11) assumes an uninformative uniform prior in the range of for the shape (κ) parameter (i.e., π(κ) = 1 indicating the prior distribution function). λ and χ are the mean and covariance latent parameters. n represents the number of USGS gage stations and q the number of covariates plus the intercept.
Extreme value distributions are developed based on the assumption that the data are independent and identically distributed. In some cases, two consecutive days may experience flood events as a result of a single storm. Fawcett and Walshaw developed a first-order Markov chain model to account for the serial dependence but ignored lags greater than one in their dependence model. Declustering is another approach to deal with temporal dependency, in which clusters of extremes over a specified threshold are formed from the extremes that occur on consecutive days. A time lag of 1 day is used to separate the clusters. The maximum of each cluster is then taken as a single independent extreme event [Fawcett and Walshaw, 2007; Ghosh and Mallick, 2011; Katz, 2013]. In this study, declustering is used to remove temporal dependence. Furthermore, conditional independence is assumed between the spatial data. The assumption of conditional independence implies that, given the GPD parameters, the extreme values at different gages are spatially independent. Extreme climate (e.g., temperature) events tend to be smooth across space; however, because of the conditional independence assumption, a nonsmooth surface of the model parameters may be obtained. Although the spatial dependence at large scale is considered in the hierarchical parameterization, the dependencies of the data at small scale may not be represented [Sang and Gelfand, 2010].
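The declustering step described above can be sketched as a single pass over a daily series. This is an illustrative implementation under the stated 1 day separation rule, not the code used in the study; the function name `decluster`, the `min_gap` parameter, and the sample series are assumptions.

```python
def decluster(flows, threshold, min_gap=1):
    """Return (day index, value) of the peak of each cluster of threshold exceedances.

    A cluster ends once `min_gap` consecutive non-exceeding days are observed.
    """
    peaks = []
    current = None   # (index, value) of the running cluster maximum
    gap = 0          # consecutive non-exceeding days since the last exceedance
    for day, flow in enumerate(flows):
        if flow > threshold:
            if current is None or flow > current[1]:
                current = (day, flow)
            gap = 0
        elif current is not None:
            gap += 1
            if gap >= min_gap:
                peaks.append(current)
                current = None
    if current is not None:          # close a cluster that runs to the end of the record
        peaks.append(current)
    return peaks

# Hypothetical daily series with three runs of exceedances above 4.
series = [1, 5, 7, 6, 1, 1, 8, 1, 9, 10]
cluster_peaks = decluster(series, threshold=4)   # [(2, 7), (6, 8), (9, 10)]
```

Increasing `min_gap` merges nearby runs into one cluster, which lowers the exceedance rate further.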
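The hydroclimate extremes can be expressed in a more descriptive manner through the return levels. The runoff rate with a τ year return level (zτ) is the rate that is exceeded by an extreme event with a probability of 1/τ. Setting the exceedance probability in equation (4) to once every N observations results in
$$z_N = \upsilon + \frac{\sigma}{\kappa}\left[(N\varepsilon)^{\kappa} - 1\right] \tag{12}$$
and for κ = 0, $z_N = \upsilon + \sigma\log(N\varepsilon)$.
In practice, a τ year return level flood is expressed by
$$z_\tau = \upsilon + \frac{\sigma}{\kappa}\left[(\tau\, n_p\, \varepsilon)^{\kappa} - 1\right] \tag{13}$$
and for κ = 0, $z_\tau = \upsilon + \sigma\log(\tau\, n_p\, \varepsilon)$, where np is the number of observations per year (e.g., np = 120 for this study).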
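As a worked illustration of the τ year return level, the sketch below evaluates the standard GPD return level expression, including the κ = 0 limit. The function name `return_level` and the numerical inputs (threshold, scale, shape, exceedance rate) are hypothetical values chosen for the example; only np = 120 is taken from the text.

```python
import math

def return_level(tau, threshold, sigma, kappa, rate, n_per_year):
    """GPD return level for a tau-year event.

    rate is the exceedance rate (fraction of observations above the threshold).
    """
    m = tau * n_per_year * rate          # expected number of exceedances in tau years
    if abs(kappa) < 1e-12:               # exponential-tail limit
        return threshold + sigma * math.log(m)
    return threshold + (sigma / kappa) * (m ** kappa - 1.0)

# Hypothetical gage: threshold 1000, scale 200, shape 0.1, exceedance rate 0.02.
z100 = return_level(100, 1000.0, 200.0, 0.1, 0.02, 120)
z10 = return_level(10, 1000.0, 200.0, 0.1, 0.02, 120)
```

Applying the function to each posterior draw of the scale and shape parameters yields a posterior distribution of the return level rather than a single value.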
2.3. Parameter Estimation
 Evidently, the number of model parameters increases with the number of gages resulting in a complex hierarchical model. Because the analytical solution of the multidimensional integral over all the parameters is practically intractable, the MCMC technique is utilized to estimate the parameters.
2.3.1. Metropolis-Hastings Within Gibbs Sampler
Gibbs sampling [Gelfand and Smith, 1990], along with the Metropolis-Hastings updating algorithm [Cooley et al., 2007; Gelman et al., 2003], is employed to infer the posterior estimates of the parameters in equation (11). As an initial step, uniform priors are considered for the mean latent parameters λψ, as well as for the covariance latent parameters χψ, with constrained ranges obtained from an exploratory analysis of the data as described previously. A Metropolis-Hastings step with a Gaussian proposal distribution is used to update the scale parameter, log (σ). The latent parameters and the GPD parameters are initialized from the maximum likelihood estimates (MLE) of the GPD parameters. The MCMC procedure shown in Appendix A is applied in order to estimate the posterior distributions of the model parameters.
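The Metropolis-Hastings update of the log-scale parameter can be illustrated for a single gage. This is a deliberately simplified sketch, not the Appendix A procedure: the shape parameter is held fixed, and the spatial Gaussian-process prior is replaced by a fixed normal prior, whereas in the full sampler the prior for each ψi would be conditional on the other sites and on the latent parameters. The function names and all numerical settings are assumptions.

```python
import numpy as np

def gpd_log_lik(excess, psi, kappa):
    """Log-likelihood of threshold excesses under a GPD with log-scale psi."""
    sigma = np.exp(psi)
    z = 1.0 + kappa * excess / sigma
    if np.any(z <= 0):                       # outside the support of the GPD
        return -np.inf
    if abs(kappa) < 1e-12:                   # exponential limit
        return -excess.size * psi - excess.sum() / sigma
    return -excess.size * psi - (1.0 / kappa + 1.0) * np.log(z).sum()

def mh_update_psi(psi, excess, kappa, prior_mean, prior_sd, step, rng):
    """One random-walk Metropolis-Hastings update with a Gaussian proposal."""
    def log_post(p):
        return gpd_log_lik(excess, p, kappa) - 0.5 * ((p - prior_mean) / prior_sd) ** 2
    proposal = psi + step * rng.normal()
    log_ratio = log_post(proposal) - log_post(psi)
    if np.log(rng.uniform()) < log_ratio:
        return proposal, True
    return psi, False

# Synthetic excesses drawn from a GPD (sigma = 50, kappa = 0.1) by inversion.
rng = np.random.default_rng(42)
true_sigma, kappa = 50.0, 0.1
u = rng.uniform(size=500)
excess = true_sigma / kappa * ((1.0 - u) ** (-kappa) - 1.0)

psi, draws = 3.0, []
for _ in range(3000):
    psi, _accepted = mh_update_psi(psi, excess, kappa, prior_mean=4.0,
                                   prior_sd=2.0, step=0.1, rng=rng)
    draws.append(psi)
posterior_mean_psi = float(np.mean(draws[1000:]))   # discard the first 1000 as burn-in
```

In a Metropolis-Hastings within Gibbs scheme, an update of this kind is cycled over all gages and alternated with updates of the shape and latent parameters.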
3. Case Study
This study is performed using USGS daily streamflow data over two regions in the Columbia River basin (CRB), as shown in Figure 1. The CRB is located in the western United States, covering parts of seven states along with the province of British Columbia in Canada. With a drainage area of 238,000 mi², it is the third largest basin in the United States in terms of flow volume. The mountainous regions of the CRB are snow dominated and receive most of their precipitation in winter [Matheussen et al., 1992]; consequently, temperature fluctuations due to climate change have a significant impact on the intensity, frequency, and seasonality of the streamflow [Hamlet and Lettenmaier, 2012; Lettenmaier et al., 2007; Payne et al., 2012].
The northern region (CRB-N) includes 31 gage stations, while the southern one (CRB-S) consists of 20 gage stations, with records between 1905 and 2011. This number of gages was sufficient to produce the variogram plots (variogram versus distance) and provide the priors for the latent parameters. Separate GPDs were fitted to the extremes at each gage based on maximum likelihood estimation of the parameters. Variogram plots were then generated using the at-site scale parameters. The variogram describes the amount of spatial correlation between locations, with higher variogram values indicating lower correlations. In other words, it describes how nearby points deviate from the mean. The variogram model shows stronger correlations at shorter distances, which weaken with increasing distance (as shown in Figure 2).
As there is no a priori information about the parameters, we initially consider uninformative uniform priors. For the mean latent parameters λψ, uninformative uniform priors are designated, which provide proper posteriors as explained in Banerjee et al. and shown in the following sections. As recommended in previous studies [Banerjee et al., 2004; Cooley et al., 2007], bounded priors are assigned to the covariance latent parameters χψi.
 The prior distribution of sill parameter (χψ1) for CRB-N and CRB-S is taken to be uniform in the range of [0.001, 6] and [0.001, 3], respectively, using the variogram plots.
 Similarly, the decay parameter (χψ2) is uniformly distributed for both CRB-N and CRB-S in the range of [0.12, 3] and [0.15, 3], respectively. This represents a minimum range of 1 mi for both areas, a maximum range of 25 mi for CRB-N, and a maximum range of 20 mi for CRB-S.
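The correspondence between the decay bounds and the ranges quoted above follows from the approximation range ≈ 3/decay for the exponential covariogram. The short check below reproduces that arithmetic; the helper name `effective_range` and the dictionary layout are illustrative only.

```python
def effective_range(decay):
    """Distance at which an exponential correlation exp(-decay * d) drops to about 0.05."""
    return 3.0 / decay

# Prior bounds on the decay parameter for the two regions (from the text).
decay_bounds = {"CRB-N": (0.12, 3.0), "CRB-S": (0.15, 3.0)}

# A large decay gives a short range, so the bounds swap order.
range_bounds = {
    region: (effective_range(high), effective_range(low))
    for region, (low, high) in decay_bounds.items()
}
```

The resulting bounds are 1 to 25 mi for CRB-N and 1 to 20 mi for CRB-S, matching the values stated above.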
3.1. Model Setup
 In this study, the 95th percentile of the streamflow records from December through March at each gage site was selected as the threshold, because it meets the threshold selection criteria mentioned previously (e.g., plot of mean excess versus the threshold). The period of December to March is chosen for the analysis as the largest streamflows occur during this period. Because of the possible temporal dependencies in the exceeding values, one may consider dependencies in the modeling or provide independent data by discarding some consecutive records (declustering). As a result, the value of exceedance rates will decrease. After declustering of the extremes over the threshold, the distributions of ε for the gage sites for the full records of data are shown as histograms in Figure 3. Most of the ε values range from 0.005 to 0.03 for CRB-N showing the wider range as compared to CRB-S with ε = 0.01–0.02.
To ensure stationary posterior distributions, the simulations are performed for 150,000 iterations with a burn-in period of 30,000 iterations. To reduce the dependence between draws and improve the mixing of the posterior samples in the Markov chain, we thin the chains by keeping every 30th draw. Three parallel chains are generated each time with different initial values, and the chains are then merged to produce the posterior distributions.
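The bookkeeping implied by these settings can be made explicit. The sketch below uses placeholder random draws in place of real MCMC output (an assumption for illustration) and shows that each chain retains 4,000 draws after burn-in and thinning, for 12,000 pooled draws in total; the function name `post_process` is hypothetical.

```python
import numpy as np

def post_process(chains, burn_in, thin):
    """Drop the burn-in, keep every `thin`-th draw, and pool the chains."""
    kept = [np.asarray(chain)[burn_in::thin] for chain in chains]
    return np.concatenate(kept)

# Placeholder draws standing in for three chains of 150,000 iterations each.
rng = np.random.default_rng(1)
chains = [rng.normal(size=150_000) for _ in range(3)]

pooled = post_process(chains, burn_in=30_000, thin=30)
draws_per_chain = pooled.size // len(chains)
```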
 In an MCMC algorithm, when samples are sufficiently drawn from the posterior distribution, adding new samples may not alter the mean of the draws. A plot of the sample means of a parameter distribution versus iteration, the so-called running mean plot, is an effective way to evaluate the convergence of a chain. The results show that the means of parameter samples do not change after 30,000 iterations indicating convergence of the model. In addition, one approach to examine the MCMC efficiency is to plot the autocorrelation of the samples. Slow attenuation in the autocorrelation plot reflects slow mixing of samples resulting in slow convergence. The autocorrelation values for all the parameters are low and they attenuate quickly explaining the satisfactory mixing of MCMC samples in obtaining the posterior distributions.
As mentioned before, in order to determine the true stationary posterior distribution, three initially overdispersed parallel chains are generated. Gelman and Rubin defined an MCMC convergence criterion for each parameter of the model by comparing the variation within the chains to the total variation across the chains for the final n iterations. They developed the scale reduction factor ($\hat{R}$) as follows:
$$\hat{R} = \sqrt{\left(\frac{n-1}{n} + \frac{m+1}{mn}\,\frac{B}{W}\right)\frac{df}{df-2}}$$
where m is the number of parallel runs (chains) with different starting points (here m = 3) and n is the number of iterations in each chain after the burn-in period. B represents the variance between the means of the m parallel chains, W is the mean of the m within-chain variances, and df is the degrees of freedom of a Student t distribution that approximates the posterior distribution of the random variable (such as the scale parameter). When $\hat{R}$ is higher than 1.2, the chains should be run longer to improve convergence to the stationary distribution. The factor should decline to 1 as n increases indefinitely. Brooks and Gelman extended the scale reduction factor to concurrently monitor the convergence of several parameters. The “boa” package [Smith, 2012] was used in this study to evaluate the convergence. The 0.975 quantiles of the scale reduction factors for all parameters are less than 1.20, which indicates the convergence of the model.
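A basic version of the scale reduction factor can be computed directly from the chains. The sketch below is a simplified form that omits the degrees-of-freedom correction and takes B as n times the variance of the chain means (the usual Gelman and Rubin convention, assumed here); it is not the “boa” implementation, and the function name and synthetic chains are hypothetical.

```python
import numpy as np

def scale_reduction_factor(chains):
    """Basic Gelman-Rubin factor for an (m chains) x (n draws) array, without df correction."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    between = n * chains.mean(axis=1).var(ddof=1)     # B: n times variance of chain means
    within = chains.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    var_hat = (n - 1) / n * within + (m + 1) / (m * n) * between
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(3)
mixed = rng.normal(size=(3, 4000))                    # three chains sampling the same target
shifted = mixed + np.array([[0.0], [5.0], [10.0]])    # chains stuck at different levels

r_mixed = scale_reduction_factor(mixed)
r_shifted = scale_reduction_factor(shifted)
```

Well-mixed chains give a factor close to 1, while chains centered at different values give a factor far above the 1.2 cutoff.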
3.2. Model Structure
 In order to assess the degree of influence of each covariate on the GPD scale parameter, a generalized Pareto distribution (GPD) is fitted to the recorded extremes at each gage site using the maximum likelihood estimation. The correlations between the scale parameters and the covariates are then calculated (Table 1). For both regions, the linear dependence between the scale parameter and drainage area is higher than the one between the scale parameter and gage elevation; however, all the correlations are significant considering the p values. An analysis of variance test considering both drainage area and elevation shows that elevation does not add significant information to the model. The results are in accordance with the previous studies suggesting the close relationship between the parameters of the generalized extreme value distribution and the drainage area [Lima and Lall, 2008; Villarini et al., 2010].
Table 1. Linear Correlation Between the Scale Parameter of the Fitted Generalized Pareto Distribution at Each Gage (ψ) and the Corresponding Drainage Area as Well as the Gage Elevation
Models listed for each of the two regions: ψ ∼ x, ψ ∼ y, and ψ ∼ x + y (table values and footnote not recovered).
The structure of the hierarchical model is summarized in Figure 4, which shows the results for USGS gage 14220500 in CRB-N using the full record time series. This figure includes drainage area and elevation as covariates along with the exponential covariogram. It shows all the prior and posterior distributions for the parameters in the different stages of modeling. The posterior distributions are obtained through MCMC sampling as explained in the previous section. The posteriors are approximately normal except for the decay (range) parameter, which is positively skewed. The distribution of the sill parameter shows a small positive skewness as well. The distribution of the coefficient related to the elevation is centered close to zero, which is in accordance with the results given in Table 1.
 As shown in Figure 5, using the scale parameter σ obtained for each gage station combined with the observed exceedance rates and the shape parameter κ, the τ year return level flood distribution is then obtained.
 In this study, several models are generated based on different covariates, including drainage area and elevation as well as covariograms (exponential, Gaussian, rational quadratic, and Matérn at ν = 3/2 [Banerjee et al., 2004]) and are compared according to the deviance information criterion (DIC) [Spiegelhalter et al., 2011]. Because of its simplicity and effectiveness, DIC has been extensively applied for model selection in numerous studies [e.g., Reich, 2012; Sang and Gelfand, 2004; So and Chan, 2013].
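The posterior distribution of the deviance statistic is obtained using the likelihood function of
$$D(\theta) = -2\log p(Y \mid \theta)$$
The posterior expectation of the deviance and the effective number of parameters are then calculated by
$$\bar{D} = E_{\theta \mid Y}\left[D(\theta)\right], \qquad p_D = \bar{D} - D(\bar{\theta})$$
where $\bar{\theta}$ is the posterior mean of the parameters. The deviance information criterion (DIC) is then determined by
$$\mathrm{DIC} = \bar{D} + p_D$$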
A smaller DIC value indicates a better model. DICs are calculated based on the posterior samples of each model. For each of the three model runs (chains), a separate DIC is calculated and the average of the DICs is then obtained. A model with no covariate is considered as the base model, which has only two parameters, ψ and κ. Results indicate that incorporating the covariograms (spatial model) reduces the DIC. The models that contain covariograms and covariates present close DIC values that are significantly lower than that of the base model (Table 2). This indicates the importance of the process stage in characterizing the extreme runoffs. Furthermore, the results provided in Tables 1 and 2 (with a DIC difference of 0.1) suggest that the model incorporating drainage area and elevation is not different from a model with drainage area only. For this study, the model including the covariates of latitude, longitude, and drainage area is chosen for the rest of the analyses. The exponential family is also selected because of its simplicity and because it is a valid variogram in all dimensions.
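The DIC calculation from posterior samples can be sketched as follows, using the standard definition (mean deviance plus effective number of parameters). The function name `dic` and the toy log-likelihood values are assumptions; in an application the inputs would be the log-likelihood evaluated at each retained MCMC draw and at the posterior mean of the parameters.

```python
import numpy as np

def dic(log_lik_samples, log_lik_at_mean):
    """Return (DIC, p_D) from per-draw log-likelihoods and the log-likelihood at the posterior mean."""
    deviance = -2.0 * np.asarray(log_lik_samples, dtype=float)
    d_bar = deviance.mean()                 # posterior expectation of the deviance
    d_at_mean = -2.0 * log_lik_at_mean      # deviance at the posterior mean
    p_d = d_bar - d_at_mean                 # effective number of parameters
    return d_bar + p_d, p_d

# Toy values: deviances of 20 and 24 average to 22; deviance at the mean is 21.
dic_value, p_d = dic([-10.0, -12.0], log_lik_at_mean=-10.5)   # DIC = 23.0, p_D = 1.0
```

Averaging the DIC over chains, as described above, amounts to calling the function once per chain and taking the mean of the results.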
Table 2. GPD Scale Parameter (ψ) Is Spatially Modeled With Different Combinations of Covariates; Model Selection Is Performed Using the Deviance Information Criterion (DIC)
Covariate sets compared (DIC values not recovered):
None (homogeneous scale parameter)
Latitude, longitude, area
Latitude, longitude, elevation
Latitude, longitude, area, elevation
Since the spatial hierarchical model combines extreme data from different locations, one may expect an increase in the precision of the return level estimates. To verify this, the R package “ismev” [Coles, 2001] is used to calculate the maximum likelihood estimates along with the 95% confidence intervals of the GPD parameters for each gage separately. The 95% confidence intervals of the 100 year flood obtained from the hierarchical model are compared with the ones from the MLEs. As seen in Figure 6, a considerable reduction of uncertainty is obtained when estimating the return level using the hierarchical model, since data from different locations are combined in the second stage of the modeling. The results suggest that hierarchical spatial models can be applied in flood intensity and frequency analyses and that the uncertainty can be reduced effectively even at gage sites with fewer recorded extreme events. The accuracy of the model is further evaluated in the model test section.
3.3. Prior Sensitivity
 In Bayesian analysis, the choice of priors can have a significant effect on the final inference. It is therefore necessary to perform a sensitivity analysis on the priors. The posterior distributions of χψi are shown to be sensitive to the priors [Berger et al., 2001]. As discussed earlier, bounded uniform priors are considered for the sill and decay latent parameters. Two other uninformative prior distributions are additionally assigned to evaluate the variations in the posterior distributions of the latent parameters. Furthermore, the sensitivity of the scale parameter posterior distributions to the prior distributions of the covariance latent parameters is assessed. As shown in Figure 7, the prior distributions for the two regions are
 Application of the above prior distributions shows that the posterior distributions of the latent parameters are sensitive to the choice of prior. The decay parameter depends more strongly on the prior than the sill parameter does. The density of the prior distributions near the lower bounds also affects the posterior distributions and causes shifts in their means. No significant change is seen in the scale parameter posterior distribution in either study region.
3.4. Model Test
 In this study, the performance of the model is assessed by predicting flood events at ungaged sites using the full data record. For this purpose, 20% of the gages are randomly selected and withheld from the parameter estimation. Using the remaining gage data, the distributions of the latent parameters and of the shape parameter are determined through the MCMC process. The distribution of the scale parameter at each ungaged site is then obtained from the estimated latent parameter distributions. Generating such a multivariate normal distribution is not straightforward, however. For each set of latent parameter values taken from the posterior distribution, together with the covariate values of each ungaged station, the mean and covariance of a multivariate normal distribution are obtained. One thousand samples are taken from this distribution, and their average is considered as one estimate of the scale parameter at the ungaged site. This procedure is repeated for the remaining latent parameter posterior values to obtain the distribution of the scale parameter at each ungaged site. The distribution of the 100 year return level is then calculated and compared with the return level obtained by fitting the GPD to each gage separately, with its parameters determined through maximum likelihood estimation. In Figure 8, the results for the two study regions demonstrate that the 95% confidence interval of the predicted distribution encompasses the MLE at each gage.
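 The sampling step for the ungaged sites can be illustrated with a short sketch. It assumes a linear mean in the covariates (with the intercept supplied as a leading 1 in each covariate row), an exponential covariance sill · exp(−decay · distance), and planar coordinates. It also does not condition on the values at the gaged sites. The function names, the argument layout, and these simplifications are assumptions made for illustration and may differ from the implementation used in the study.

```python
import math
import random


def exponential_covariance(coords, sill, decay):
    """Covariance matrix C[i][j] = sill * exp(-decay * distance(i, j))."""
    return [[sill * math.exp(-decay * math.dist(a, b)) for b in coords] for a in coords]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_log_scale(covariates, coords, betas, sill, decay, n_draws=1000, rng=random):
    """Average of n_draws multivariate normal draws of psi = log(sigma) per site."""
    # Mean of psi at each site from one posterior draw of the mean latent parameters.
    mean = [sum(b * x for b, x in zip(betas, row)) for row in covariates]
    lower = cholesky(exponential_covariance(coords, sill, decay))
    n = len(mean)
    totals = [0.0] * n
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for i in range(n):
            # mean + L z is a draw from N(mean, L L^T).
            totals[i] += mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
    return [t / n_draws for t in totals]
```

 Calling `sample_log_scale` once per posterior draw of the latent parameters would give one estimate of the scale parameter per draw at each withheld site, and the collection of these estimates approximates its distribution.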
3.5. Parameter Estimates
 The median and the 95% confidence intervals of the latent parameters are shown in Table 3.
Table 3. The Median and 95 Percentile Highest Probability Density Interval of the Estimated Latent Parameters Corresponding to the Scale Parameter; the Model Has Two Covariates
Range = 3/χ1: The distance at which correlation is less than 0.05.
 These estimates correspond to a spatial hierarchical model with latitude, longitude, and area as covariates and with uniform priors for the parameters. The range of the sill parameter is higher in CRB-N than in CRB-S because of the larger area and number of gage sites. The latent parameter pertaining to drainage area is positive for both regions.
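 The range definition given with Table 3 can be checked numerically. The sketch below assumes an exponential correlation function exp(−χ1 · h), which is consistent with the exponential family selected earlier. The function names and the example decay value are hypothetical.

```python
import math


def exponential_correlation(distance, decay):
    """Correlation between two sites separated by `distance`."""
    return math.exp(-decay * distance)


def effective_range(decay):
    """Distance 3 / decay, at which the exponential correlation drops below 0.05."""
    return 3.0 / decay


decay = 0.02  # hypothetical decay value, in inverse distance units
rho = exponential_correlation(effective_range(decay), decay)  # equals exp(-3)
```

 Since exp(−3) is about 0.0498, the correlation at the effective range is just below 0.05 whatever the decay value.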
3.6. Analysis of the Temporal Trend
 It is widely accepted that hydroclimate events are not stationary because of changes in land use and the impact of climate change [Risley et al., 2011; Najafi et al., 2000]. In this paper, the changes in extreme events are investigated for the two study regions in time windows of 15 years. Streamflow records from the gages available in each time window are used to perform the MCMC simulations and sample the parameters. The posterior samples of the return levels for different return periods are calculated from the posterior samples of the shape and scale parameters along with the observed exceedance rates. This allows the uncertainties associated with the return level assessments to be quantified. The temporal analysis is conducted for the period of 1905–2011 in CRB-N and 1950–2011 in CRB-S.
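 The return level calculation from posterior samples can be sketched as follows. The sketch assumes the standard peaks-over-threshold expression z = u + (σ/κ)[(Tλ)^κ − 1], where u is the threshold, λ is the observed number of exceedances per year, T is the return period in years, and σ = exp(ψ). The sign convention for the shape parameter, the function names, and the use of an equal-tailed 95% interval are assumptions made for illustration.

```python
import math


def gpd_return_level(threshold, log_scale, shape, exceedances_per_year, return_period_years):
    """Return level for one posterior draw of (psi, kappa), with sigma = exp(psi)."""
    sigma = math.exp(log_scale)
    m = return_period_years * exceedances_per_year  # expected exceedances in T years
    if abs(shape) < 1e-9:
        # Limiting (exponential) case as the shape parameter tends to zero.
        return threshold + sigma * math.log(m)
    return threshold + sigma / shape * (m ** shape - 1.0)


def return_level_summary(threshold, posterior_draws, exceedances_per_year, return_period_years=100):
    """Equal-tailed 2.5%, 50%, and 97.5% quantiles over the posterior draws."""
    levels = sorted(
        gpd_return_level(threshold, psi, kappa, exceedances_per_year, return_period_years)
        for psi, kappa in posterior_draws
    )

    def quantile(q):
        return levels[min(len(levels) - 1, int(q * len(levels)))]

    return quantile(0.025), quantile(0.5), quantile(0.975)
```

 Applying `return_level_summary` to the draws from each time window gives a median and an interval that can be compared across windows.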
 The 15 year time window is selected so that a sufficient number of records is included while the temporal trends in the extreme value parameters are still represented. Time windows of 20 years [Dias et al., 2013] and 30 years [Najafi et al., 2010] have also been used to compare future hydroclimate conditions with historical conditions; however, no specific criterion has yet been suggested in the literature. The choice of window length and its influence on the results of climate change studies require further study.
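 The time slicing can be illustrated with a small helper. This sketch assumes that consecutive windows share their boundary year and that the final window absorbs any leftover years, which reproduces periods such as 1980–1995 and 1995–2011. The function names and the rule for assigning boundary years to a window are hypothetical.

```python
def time_windows(start_year, end_year, length=15):
    """Consecutive windows of `length` years; the last window absorbs the remainder."""
    windows = []
    year = start_year
    while year + 2 * length <= end_year:
        windows.append((year, year + length))
        year += length
    windows.append((year, end_year))
    return windows


def split_by_window(records, windows):
    """Group (year, value) records; windows are half-open except the last one."""
    groups = {w: [] for w in windows}
    for year, value in records:
        for idx, (lo, hi) in enumerate(windows):
            is_last = idx == len(windows) - 1
            if lo <= year < hi or (is_last and year == hi):
                groups[(lo, hi)].append(value)
                break
    return groups
```

 Each group of records would then be passed to a separate MCMC run, so the parameter estimates of one window do not borrow information from another.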
 The trends of the scale parameters and the 100 year return level floods over the three periods of 1965–1980, 1980–1995, and 1995–2011 are shown for each gage in CRB-S (Figure 9). Each circle represents the mean of the scale parameter distribution or of the return level flood distribution for one time period. Different colors correspond to different values, and larger circles indicate higher magnitudes. The scale parameters (σ) range from 150 to 8000, and the 100 year return level floods range from 1500 to 75,000 cfs. Both the scale parameter and the 100 year return level flood show increasing trends at the majority of the gages for the latest two periods compared with 1965–1980.
 In Figure 10, each box plot on top represents the medians of the return level distributions for all gages. A cyclic change is detectable for CRB-N, where increases in extreme runoff occurred during 1920–1935, 1950–1965, and 1995–2011. For CRB-S, runoff extremes tended to decrease from 1935–1950 to 1965–1980, while they increased from 1965–1980 to 1995–2011. Correspondingly, the median and 95% confidence intervals of the shape parameters are shown in Figure 10 (bottom). The variations in the shape parameters agree with those in the return levels.
 A spatial hierarchical Bayesian method is developed to model the extreme runoffs at USGS gage sites over two regions in the Columbia River basin, namely CRB-N and CRB-S. The regions include 31 and 20 streamflow gage stations, respectively. Extreme events that occurred during the months of December through March are selected for this analysis. The generalized Pareto distribution is the basis of the model, with its scale parameter spatially characterized in a hierarchical Bayesian approach.
 The declustering process provides temporally independent data for the hierarchical model. The scale parameter of the extreme value distribution is spatially modeled through generalized linear relationships considering the covariates of latitude, longitude, drainage area, and elevation. The parameters of the spatial hierarchical model are estimated through an MCMC procedure called Metropolis-Hastings within Gibbs sampler. The Gelman criterion is used to inspect the MCMC convergence. The performance of the model is verified by predicting the 100 year return level floods for several test gages using the hierarchical model and comparing the resulting distributions with the at-site maximum likelihood estimates.
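 A convergence check of this kind can be sketched as follows. The sketch assumes that the criterion is the potential scale reduction factor computed from several chains of equal length, which is the statistic usually associated with Gelman's name; the source does not spell out the exact variant. The function name is hypothetical.

```python
def potential_scale_reduction(chains):
    """Potential scale reduction factor for one parameter from m chains of n draws."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(chain) / n for chain in chains]
    grand_mean = sum(means) / m
    # Between-chain variance of the chain means, scaled by the chain length.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    # Average of the within-chain sample variances.
    within = sum(
        sum((x - mu) ** 2 for x in chain) / (n - 1) for chain, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5
```

 Values close to 1 suggest that the chains have mixed, while values well above 1 indicate that they are still exploring different regions. A cutoff such as 1.1 is a common rule of thumb and is not taken from this study.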
 Results show a significant increase in the precision of the flood return level estimates compared with a simple maximum likelihood estimator, since the information content of the data from different locations is combined by the spatial hierarchical model. In addition, the performance evaluation of the model in predicting floods at ungaged sites shows satisfactory results. The basic assumption for the covariogram function of the scale parameter is that it is a realization of an intrinsically stationary random function.
 The spatial dependence among extreme events at different locations is weaker than that of the daily data themselves, because the extremes do not necessarily occur at the same time [Sang and Gelfand, 2010]. The assumption of conditional independence implies that the extreme values at different gages are spatially independent given the GPD parameters. This assumption is widely adopted in spatial hierarchical modeling [Banerjee et al., 2004]. Another approach is to explicitly consider the small-scale dependencies of the extremes through a copula density function and obtain continuous surface realizations [Renard, 2011; Sang and Gelfand, 2010]. This is still an open area of research. In this study, homogeneous shape parameters were assumed for the two study regions. Considering the spatial heterogeneity of this parameter in the hierarchical model would increase the model complexity and requires further investigation.
 We run the model for historical time periods of 15 years in order to detect possible trends in runoff extremes. The results show cyclic variations in the spatial average of the 100 year return level floods. For some areas, however, consistent increasing trends are distinguishable. Developing spatiotemporal hierarchical models of hydrologic extreme events, as indicated in Sang and Gelfand and Vanem et al., is an alternative to time window analysis and is left for future studies.
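 MCMC implementation to obtain the posterior distributions of the GPD and latent parameters
 1. Calculate the mean and the covariance of the scale parameter.
 2. Find the scale parameter for the new iteration, using the Fisher information matrix.
 Draw ψ* from the proposal distribution.
 Evaluate the acceptance ratio, where p is the GPD function.
 Draw a random number from the uniform distribution.
 If the random number is smaller than the acceptance ratio, accept the proposed value; otherwise, keep the current value.
 3. Find the “scale” mean latent parameter corresponding to each covariate, proposing a new value with a jump rate.
 Draw a random number from the uniform distribution.
 If the random number is smaller than the acceptance ratio, accept the proposed value; otherwise, keep the current value.
 4. Calculate the “scale” covariance latent parameter, proposing new values with a vector of jump rates and using the prior probability density function of this parameter.
 Draw a random number from the uniform distribution.
 If the random number is smaller than the acceptance ratio, accept the proposed value; otherwise, keep the current value.
 5. Find the value of the shape parameter for the new iteration, proposing a new value with a jump rate.
 Draw a random number from the uniform distribution.
 If the random number is smaller than the acceptance ratio, accept the proposed value; otherwise, keep the current value.
 Repeat until convergence.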
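 The accept or reject logic of these steps can be illustrated with a deliberately simplified sketch. It handles a single site, omits the spatial latent layer, uses flat priors on ψ = log(σ) and κ, and replaces the Fisher-information-based proposal with plain random-walk proposals. All of these are assumptions made for brevity, and the function names and default jump rates are hypothetical.

```python
import math
import random


def gpd_log_likelihood(excesses, log_scale, shape):
    """Log-likelihood of threshold excesses under a GPD with sigma = exp(psi)."""
    sigma = math.exp(log_scale)
    total = 0.0
    for y in excesses:
        if abs(shape) < 1e-9:
            total += -log_scale - y / sigma  # exponential limit
        else:
            arg = 1.0 + shape * y / sigma
            if arg <= 0.0:
                return -math.inf  # outside the GPD support
            total += -log_scale - (1.0 / shape + 1.0) * math.log(arg)
    return total


def metropolis_within_gibbs(excesses, n_iter=3000, jump_psi=0.1, jump_kappa=0.05, seed=0):
    """Alternate random-walk Metropolis updates of psi and kappa (flat priors)."""
    rng = random.Random(seed)
    psi = math.log(sum(excesses) / len(excesses))  # crude starting value
    kappa = 0.0
    current = gpd_log_likelihood(excesses, psi, kappa)
    draws = []
    for _ in range(n_iter):
        # Update psi with kappa held fixed.
        proposal = psi + rng.gauss(0.0, jump_psi)
        candidate = gpd_log_likelihood(excesses, proposal, kappa)
        if math.log(1.0 - rng.random()) < candidate - current:
            psi, current = proposal, candidate
        # Update kappa with psi held fixed.
        proposal = kappa + rng.gauss(0.0, jump_kappa)
        candidate = gpd_log_likelihood(excesses, psi, proposal)
        if math.log(1.0 - rng.random()) < candidate - current:
            kappa, current = proposal, candidate
        draws.append((psi, kappa))
    return draws
```

 The comparison is done on the log scale, so accepting when the log of a uniform draw is below the log-likelihood difference is equivalent to accepting when the uniform draw is below the likelihood ratio. In the full model, the same pattern would be repeated for the mean and covariance latent parameters, with their priors added to the log ratio.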
Random variable in GPD.
ψ = log (σ)
Mean latent variable.
Covariance latent variable.
 Partial financial support for this project was provided by the National Science Foundation, Water Sustainability and Climate (WSC) program (grant no. EAR-1038925).