Hierarchical Bayesian model averaging for hydrostratigraphic modeling: Uncertainty segregation and comparative evaluation

Authors

  • Frank T.-C. Tsai (corresponding author)

    Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, Louisiana, USA
    Correspondence: F. T.-C. Tsai, Department of Civil and Environmental Engineering, Louisiana State University, 3418G Patrick F. Taylor Hall, Baton Rouge, LA 70803, USA. (ftsai@lsu.edu)

  • Ahmed S. Elshall

    Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, Louisiana, USA

Abstract

[1] Analysts are often faced with competing propositions for each uncertain model component. How can we judge whether we have selected the correct proposition(s) for an uncertain model component out of numerous possible propositions? We introduce the hierarchical Bayesian model averaging (HBMA) method as a multimodel framework for uncertainty analysis. HBMA allows for segregating, prioritizing, and evaluating different sources of uncertainty and their corresponding competing propositions through a hierarchy of BMA models that forms a BMA tree. We apply HBMA to conduct uncertainty analysis on the reconstructed hydrostratigraphic architectures of the Baton Rouge aquifer-fault system, Louisiana. Due to uncertainty in model data, structure, and parameters, multiple possible hydrostratigraphic models are produced and calibrated as base models. The study considers four sources of uncertainty. With respect to data uncertainty, the study considers two calibration data sets. With respect to model structure, the study considers three variogram models, two geological stationarity assumptions, and two fault conceptualizations. The base models are produced following a combinatorial design to allow for uncertainty segregation. Thus, these four uncertain model components with their corresponding competing model propositions result in 24 base models. The results show that the systematic dissection of the uncertain model components along with their corresponding competing propositions allows for detecting the robust model propositions and the major sources of uncertainty.

1. Introduction

[2] When developing a conceptual model to represent a subsurface formation, uncertainties in model data, structure, and parameters always exist. To accommodate different sources of uncertainty, strategies such as model selection, model elimination, model reduction, model discrimination, and model combination are commonly used to reach a robust model, using single-model approaches [Cardiff and Kitanidis, 2009; Demissie et al., 2009; Engdahl et al., 2010; Feyen and Caers, 2006; Kitanidis, 1986; Gaganis and Smith, 2001, 2006, 2008; Irving and Singha, 2010; Nowak et al., 2010; Wingle and Poeter, 1993] or multimodel approaches [Doherty and Christensen, 2011; Li and Tsai, 2009; Morales-Casique et al., 2010; Neuman, 2003; Refsgaard et al., 2006; Rojas et al., 2008, 2009, 2010a-2010c; Singh et al., 2010; Troldborg et al., 2010; Tsai and Li, 2008a, 2008b; Tsai, 2010; Ye et al., 2004, 2005; Wöhling and Vrugt, 2008].

[3] Although the single-model approach is commonly used for model prediction and uncertainty assessment of hydrologic systems, it has several flaws. Beven and Binley [1992] and Beven [1993] introduced the concept of equifinality by pointing to the nonuniqueness of catchment models, which is the possibility that the same final solution can be obtained by many potential model propositions. This concept, as coined by von Bertalanffy [1968], means that unlike a closed system, whose final state is unequivocally determined by the initial conditions, the final state of an open system may be reached from different initial conditions and in different ways. The problem of model nonuniqueness is salient to almost any field-scale hydrogeological model due to uncertainty about data, model structure, and model parameters. Thus, a single model may result in failing to accept a true model or failing to reject a false model [Neuman and Wierenga, 2003; Neuman, 2003]. In addition, even if a single model could explicitly segregate and quantify different sources of uncertainty, Neuman [2003] points out that adopting one model can lead to statistical bias and underestimation of uncertainty. The hierarchical treatment in this study clearly illustrates this point.

[4] The multimodel approach aims at overcoming the aforementioned shortcomings of the single-model approach by utilizing competing conceptual models that adequately fit the data. Multimodel methods rank or average the considered models through their posterior model probabilities. The most general multimodel method is the generalized likelihood uncertainty estimation (GLUE) [Beven and Binley, 1992], which is based on the equifinality concept [Beven, 1993, 2005]. In the first step, different models are generated by Monte Carlo simulation and accepted as behavioral according to a user-defined threshold on their residual errors. In the second step, the posterior model probability for each of the accepted models is calculated based on observation data for a given likelihood function.

[5] Variant GLUE methods can be developed by modifying the first step of model generation and acceptance. For example, to move from equifinality to optimality, Mugunthan and Shoemaker [2006] show that calibration performs better than GLUE both in identifying more behavioral samples for a given threshold and in matching the output. However, this is a debatable point. For example, Rojas et al. [2008] remarked that by including a calibration step in multimodel approaches, errors in the conceptual models will be compensated by biased parameter estimates during calibration, and the calibration result will be at risk of being biased toward unobserved variables in the model [Refsgaard et al., 2006]. This study proposes a hierarchical Bayesian averaging approach to address this concern by explicitly segregating different sources of uncertainty.

[6] Variant GLUE methods can also be developed by modifying the second step through different likelihood functions for model averaging. Formal GLUE [Beven and Binley, 1992] uses an inverse weighted variance likelihood function, but the method is flexible, allowing for diverse statistical likelihood functions such as the exponential function [Beven, 2000] or even possibilistic functions [Jacquin and Shamseldin, 2007]. Exponential and inverse weighted variance likelihood functions do not account for model complexity and the number of data points, and may lack statistical bases [Singh et al., 2010]. Rojas et al. [2008, 2010a-2010c] introduce Bayesian model averaging (BMA) in combination with GLUE to maintain equifinality. Although using BMA is statistically rigorous, a typical problem with BMA is that it tends to favor only a few best models [Neuman, 2003; Troldborg et al., 2010]. For example, several studies [Rojas et al., 2010c; Singh et al., 2010; Ye et al., 2010b] show that model averaging under formal BMA criteria (AIC, AICc, BIC, and KIC) tends to eliminate most of the alternative models, which may underestimate prediction uncertainty and bias the predictions, while GLUE probabilities are more evenly distributed across all models, resulting in superior prediction. To maintain the use of statistically meaningful functions while avoiding underestimation of uncertainty, Tsai and Li [2008a, 2008b] propose a variance window to allow the selection of more models, which may simultaneously enlarge the magnitude of uncertainty while satisfying the constraints imposed by the background knowledge.

[7] All the previously cited studies use collection multimodel methods, in which all models are at one level. Wagener and Gupta [2005] remark that an uncertainty assessment framework should be able to account for the level of contribution of the different sources of uncertainty to the overall uncertainty. In the groundwater area, to advance beyond collection multimodel methods, Li and Tsai [2009] and Tsai [2010] present a BMA approach that can separate two sources of uncertainty, arising from different conceptual models and different parameter estimation methods. These were the first two studies to extend the collection BMA formulation of Hoeting et al. [1999] to two levels. The current study generalizes the work of Li and Tsai [2009] and Tsai [2010] to a fully hierarchical BMA method. To our knowledge, this is the first work that extends the BMA formulation in Hoeting et al. [1999] to any number of levels for analyzing the individual contribution of each source of uncertainty with respect to model data, structure, and parameters.

[8] The hierarchical BMA provides more insight than collection BMA into model selection, model averaging, and uncertainty propagation through a BMA tree. Each level of uncertainty represents an uncertain model component with its different competing discrete model propositions. For example, the variogram model selection can be one source of uncertainty, and its competing propositions could be exponential, Gaussian, and pentaspherical variogram models. The proposed HBMA method serves as a framework for evaluating the competing propositions of each source of uncertainty, prioritizing the different sources of uncertainty, and understanding uncertainty propagation through dissecting the uncertain model components.

[9] We test the HBMA method on an indicator hydrostratigraphy model to characterize the Baton Rouge aquifer-fault system in Louisiana. The outline of the study is as follows. Section 'Hierarchical Bayesian Model Averaging' shows the derivation of HBMA under maximum likelihood estimation. Section 'Case Study' describes the indicator hydrostratigraphy model and the segregation of its uncertain model components, which are calibration data, variogram model, geological stationarity assumption, and fault conceptualization. Through the BMA tree of the hydrostratigraphic models, section 'Results and Discussion' presents the evaluation of the competing model propositions, the uncertainty propagation, and the prioritization of the uncertain model components. Section 'Conclusions' draws conclusions about the main features of the HBMA.

2. Hierarchical Bayesian Model Averaging

2.1. Terminology and Notation

[10] We start by defining some basic terminology of the BMA tree. Figure 1 shows a BMA tree, which is a hierarchical structure of models at different levels. The growth of the BMA tree reflects the expansion of the number of sources of model uncertainty, which entails the expansion of the number of models. Each source of uncertainty is represented by one level. A collection is the set of all models at one level. A superior or subordinate level is a level that is ranked higher or lower, respectively. The top level of the hierarchy consists of one model, which is termed the hierarch BMA model, and its level number is zero. The immediate subordinate level of the top level is level 1, which tackles the first source of uncertainty; the immediate subordinate level of level 1 is level 2, which tackles the second source of uncertainty, and so forth. Ranking is the arrangement of levels. The ranking of different sources of uncertainty in the BMA tree depends on the analyst's preference.

Figure 1.

A BMA tree.

[11] The systematic segregation of different sources of model uncertainty is the central idea of the HBMA method. The base level of the hierarchy is the collection of all candidate models that result from all considered sources of uncertainty. The base models can be viewed as basic elements and are the same in either the collection BMA method or the hierarchical BMA method. The only exception is that the base models of hierarchical BMA are developed following a combinatorial design to achieve a systematic representation of the competing propositions of all sources of uncertainty. The base level tackles the last source of uncertainty. The Bayesian model averaging starts from the base level. All models superior to the base level are BMA models. A parent model is a vertex in the BMA tree, which is the average of its child models. Each set of child models represents the competing propositions for one source of uncertainty.
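The combinatorial design can be sketched in a few lines of Python. The proposition labels below are illustrative placeholders (in particular, the two fault options are simply named for the sketch, not taken from the study's terminology); only the counts match the case study of section 3:

```python
from itertools import product

# Competing propositions for each uncertain model component
# (labels are illustrative placeholders; counts follow the case study).
components = {
    "calibration_data": ["data_set_1", "data_set_2"],
    "variogram": ["exponential", "gaussian", "pentaspherical"],
    "stationarity": ["stationary", "nonstationary"],
    "fault_conceptualization": ["fault_option_1", "fault_option_2"],
}

# Combinatorial design: each combination of propositions is one base model.
base_models = list(product(*components.values()))
print(len(base_models))  # 2 * 3 * 2 * 2 = 24 base models
```

Because every proposition appears in the same number of base models, the design later allows each source of uncertainty to be isolated in its own level of the BMA tree.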

[12] The subscript is an important index to determine the hierarchy and branch relationships among models. Consider the model $M_{i_1 i_2 \cdots i_p}$ at level $p$. The subscript $i_1 i_2 \cdots i_p$ locates the model hierarchically top down from the first level, to the second level, and so forth, down to level $p$. For example, $M_i$ is model $i$ at level 1; $M_{ij}$ is model $j$ at level 2, which is a child model of parent model $i$ at level 1; and $M_{ijk}$ is model $k$ at level 3, which is a child model of the parent model $j$ at level 2 and a grandchild model of model $i$ at level 1. From the bottom up, a parent model $M_{i_1 \cdots i_{p-1}}$ at level $p-1$ is composed of the child models $M_{i_1 \cdots i_p}$ at level $p$. Models $M_{i_1 \cdots i_{p-2}}$ at level $p-2$ are composed of models $M_{i_1 \cdots i_{p-1}}$ at level $p-1$, and so forth, until the hierarch BMA model is reached. Following these notations, next we formulate the hierarchical BMA posterior model probabilities, hierarchical BMA prediction means, and hierarchical BMA prediction covariances.
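A minimal sketch of this indexing scheme, assuming models are identified by tuples of subscripts (the tuple representation is our illustration, not part of the original formulation):

```python
# A model in the BMA tree is identified by its subscript tuple (i1, ..., ip);
# the hierarch BMA model is the empty tuple at level 0.

def parent(index):
    """Parent model index: drop the last subscript."""
    return index[:-1]

def level(index):
    """Level of a model in the BMA tree."""
    return len(index)

m_ijk = (1, 2, 3)                      # model k = 3 at level 3
assert parent(m_ijk) == (1, 2)         # child of model j = 2 at level 2
assert parent(parent(m_ijk)) == (1,)   # grandchild of model i = 1 at level 1
assert level(()) == 0                  # the hierarch BMA model
```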

2.2. Posterior Model Probability and Conditional Posterior Model Probability

[13] Selecting base models for both collection BMA and hierarchical BMA is based on the acknowledged sources of uncertainty. Both collection BMA and hierarchical BMA can deal with base models that are mutually exclusive and collectively exhaustive if all sources of uncertainty and all propositions become available. However, it is practically impossible to exhaust all uncertain model components and all possible propositions. Accordingly, uncertainty arising from uncertain model components not accounted for cannot be evaluated by either the collection BMA or the hierarchical BMA. Therefore, it is understood that the number of considered propositions is not exhaustive. The collection BMA and hierarchical BMA can still deal with nonexhaustive models, but require the base models to be mutually exclusive, which can be achieved in practice by not including nested models.

[14] Consider base models at level $p$. According to the law of total probability, the posterior probability for predicted quantity $\Delta$ given data $D$ is

$$p(\Delta \mid D) = E_{i_1 i_2 \cdots i_p}\left[\, p(\Delta \mid D, M_{i_1 i_2 \cdots i_p}) \,\right] \quad (1)$$

where $E_{i_1 i_2 \cdots i_p}$ is the expectation operator with respect to models $M_{i_1 i_2 \cdots i_p}$ at level $p$, and $p(\Delta \mid D, M_{i_1 i_2 \cdots i_p})$ is the posterior probability of predicted quantity $\Delta$ given data $D$ and models $M_{i_1 i_2 \cdots i_p}$ at level $p$. The expectation $E_{i_p}$ is posterior probability averaging at level $p$. That is

$$E_{i_p}\left[\, p(\Delta \mid D, M_{i_1 \cdots i_p}) \,\right] = \sum_{i_p=1}^{m} p(\Delta \mid D, M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}}) \quad (2)$$

where $i_p = 1, 2, \ldots, m$ and $m$ is the number of child models at level $p$ under the branch of the parent model $M_{i_1 \cdots i_{p-1}}$ at level $p-1$.

[15] $\Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}})$ is the conditional posterior model probability of model $M_{i_1 \cdots i_p}$ at level $p$ under model $M_{i_1 \cdots i_{p-1}}$ at level $p-1$. The conditional posterior model probabilities will be used to develop a BMA tree of posterior model probabilities. Note that model $M_{i_1 \cdots i_p}$ is a child model under the parent model $M_{i_1 \cdots i_{p-1}}$ because both have the same subscripts for the first $p-1$ levels. Equation (2) is the BMA at level $p$, which can be written as

$$p(\Delta \mid D, M_{i_1 \cdots i_{p-1}}) = \sum_{i_p} p(\Delta \mid D, M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}}) \quad (3)$$

[16] According to equation (3), one can derive the posterior probability of prediction using BMA over models at any level, say level $n$:

$$p(\Delta \mid D, M_{i_1 \cdots i_n}) = E_{i_{n+1}} \cdots E_{i_p}\left[\, p(\Delta \mid D, M_{i_1 \cdots i_p}) \,\right] \quad (4)$$

[17] For collection BMA, only one level of models is considered. Given the law of total probability, equation (1) becomes [Hoeting et al., 1999]

$$p(\Delta \mid D) = \sum_{i_1=1}^{m} p(\Delta \mid D, M_{i_1}) \Pr(M_{i_1} \mid D) \quad (5)$$

where $M_{i_1}$, $i_1 = 1, 2, \ldots, m$, are the base models at level 1. Equation (5) is the model averaging over all base models. To develop a multilevel method that separates different sources of uncertainty, we represent BMA in its general hierarchical form. Then equation (1) for hierarchical BMA becomes

$$p(\Delta \mid D) = E_{i_1} E_{i_2} \cdots E_{i_p}\left[\, p(\Delta \mid D, M_{i_1 i_2 \cdots i_p}) \,\right] \quad (6)$$

[18] Based on the Bayes rule, the posterior model probability for the base models is

$$\Pr(M_{i_1 \cdots i_p} \mid D) = \frac{p(D \mid M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p})}{\sum_{i_1} \cdots \sum_{i_p} p(D \mid M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p})} \quad (7)$$

where $p(D \mid M_{i_1 \cdots i_p})$ is the likelihood of a base model, which often refers to the model weight in BMA, and $\Pr(M_{i_1 \cdots i_p})$ is the prior model probability of a base model. The conditional posterior model probability of a base model under its parent model is

$$\Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}}) = \frac{p(D \mid M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_{p-1}})}{\sum_{i_p} p(D \mid M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_{p-1}})} \quad (8)$$

where $\Pr(M_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_{p-1}})$ is the conditional prior model probability of a base model $M_{i_1 \cdots i_p}$ under its parent model $M_{i_1 \cdots i_{p-1}}$. Equation (8) is also referred to as the conditional model weight. The likelihood for parametric base models is

$$p(D \mid M_{i_1 \cdots i_p}) = \int p(D \mid \boldsymbol{\beta}_{i_1 \cdots i_p}, M_{i_1 \cdots i_p})\, p(\boldsymbol{\beta}_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_p})\, d\boldsymbol{\beta}_{i_1 \cdots i_p} \quad (9)$$

where $\boldsymbol{\beta}_{i_1 \cdots i_p}$ is a vector of model parameters for base model $M_{i_1 \cdots i_p}$. The likelihood of a parent model, $p(D \mid M_{i_1 \cdots i_{p-1}})$, is

$$p(D \mid M_{i_1 \cdots i_{p-1}}) = \sum_{i_p} p(D \mid M_{i_1 \cdots i_p}) \Pr(M_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_{p-1}}) \quad (10)$$

[19] By considering equal conditional prior model probabilities, $\Pr(M_{i_1 \cdots i_p} \mid M_{i_1 \cdots i_{p-1}}) = 1/m$, we calculate the conditional posterior model probability for the base models under their parent models as follows:

$$\Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}}) = \frac{p(D \mid M_{i_1 \cdots i_p})}{\sum_{i_p} p(D \mid M_{i_1 \cdots i_p})} \quad (11)$$

[20] Using equations (10) and (11) under the consideration of equal conditional prior model probabilities, the conditional posterior model probability for models at level $n$ under their parent models is

$$\Pr(M_{i_1 \cdots i_n} \mid D, M_{i_1 \cdots i_{n-1}}) = \frac{p(D \mid M_{i_1 \cdots i_n})}{\sum_{i_n} p(D \mid M_{i_1 \cdots i_n})} \quad (12)$$

where the likelihood $p(D \mid M_{i_1 \cdots i_n})$ of a BMA model is obtained recursively from equation (10).

[21] And the posterior model probability at level $n$ is the product of the conditional posterior model probabilities along the branch:

$$\Pr(M_{i_1 \cdots i_n} \mid D) = \prod_{k=1}^{n} \Pr(M_{i_1 \cdots i_k} \mid D, M_{i_1 \cdots i_{k-1}}) \quad (13)$$

[22] Therefore, each model at any level in Figure 1 has its own posterior model probabilities as in equation (13) and conditional posterior model probabilities as in equation (12). As a result, a BMA tree of posterior model probabilities can be obtained.
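The construction of such a BMA tree can be sketched for a hypothetical two-level tree with two propositions per level; the base-model likelihood values below are made up for illustration, and equal conditional priors are assumed throughout:

```python
from statistics import mean

# Hypothetical base-model likelihoods p(D|M_{i1 i2}) for a two-level tree
# (values are illustrative only).
lik = {(1, 1): 0.8, (1, 2): 0.4, (2, 1): 0.2, (2, 2): 0.1}

def likelihood(index, lik, n_children=2):
    """Eq. (10) with equal conditional priors: a parent's likelihood is the
    average of its children's likelihoods."""
    if index in lik:   # base model
        return lik[index]
    return mean(likelihood(index + (i,), lik, n_children)
                for i in range(1, n_children + 1))

def cond_prob(index, lik, n_children=2):
    """Eqs. (11)-(12): conditional posterior model probability under the parent."""
    siblings = [index[:-1] + (i,) for i in range(1, n_children + 1)]
    return likelihood(index, lik) / sum(likelihood(s, lik) for s in siblings)

def post_prob(index, lik, n_children=2):
    """Eq. (13): product of conditional probabilities along the branch."""
    p = 1.0
    for k in range(1, len(index) + 1):
        p *= cond_prob(index[:k], lik, n_children)
    return p

# Posterior model probabilities of the base models sum to one.
total = sum(post_prob(i, lik) for i in lik)
```

Note that the product of conditional probabilities along a branch reproduces the base-model posterior probability obtained directly from equation (7) with equal priors, so the BMA tree is internally consistent.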

2.3. Prediction Means and Prediction Covariances

[23] Based on the law of total expectation, the expectation of prediction over $p$ levels of models for hierarchical BMA is

$$E(\Delta \mid D) = E_{i_1} E_{i_2} \cdots E_{i_p}\left[\, E(\Delta \mid D, M_{i_1 \cdots i_p}) \,\right] \quad (14)$$

where $E(\Delta \mid D, M_{i_1 \cdots i_p})$ is the expectation of prediction for given data $D$ and models at level $p$. Moreover, the hierarchical BMA not only shows the total expectation of prediction over all levels of models, but also shows the expectation of prediction at a desired level where models are used. According to equation (4), the expectation of prediction using models at level $n$ is

$$E(\Delta \mid D, M_{i_1 \cdots i_n}) = E_{i_{n+1}} \cdots E_{i_p}\left[\, E(\Delta \mid D, M_{i_1 \cdots i_p}) \,\right] \quad (15)$$

where $n = 0, 1, \ldots, p-1$. Equation (15) provides thorough information for analysts, who have the flexibility to see all possible averaged predicted quantities using various BMA models at different levels, while typical (one-level) BMA only provides one overall expectation over all models. Using equation (15) at any level in Figure 1, a BMA tree of prediction means can be obtained.
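As a sketch of equation (15), the prediction mean at any vertex of the BMA tree can be computed recursively from the base-model means and the conditional posterior model probabilities (all numbers below are illustrative):

```python
# Illustrative two-level BMA tree: base-model prediction means and conditional
# posterior model probabilities (all numbers are made up for the sketch).
mean_pred = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 20.0, (2, 2): 30.0}
cond_prob = {(1,): 0.8, (2,): 0.2,
             (1, 1): 0.7, (1, 2): 0.3, (2, 1): 0.6, (2, 2): 0.4}

def bma_mean(index, n_children=2):
    """Eq. (15): the prediction mean of a BMA model is the average of its
    children's prediction means, weighted by conditional probabilities."""
    if index in mean_pred:   # base model: mean known directly
        return mean_pred[index]
    return sum(bma_mean(index + (i,)) * cond_prob[index + (i,)]
               for i in range(1, n_children + 1))

m_level1 = bma_mean((1,))    # 10*0.7 + 14*0.3 = 11.2
m_hierarch = bma_mean(())    # 11.2*0.8 + 24.0*0.2 = 13.76
```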

[24] The law of total covariance for hierarchical BMA is

$$\mathrm{Cov}(\Delta \mid D) = E_{i_1} \cdots E_{i_p}\left[ \mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_p}) \right] + \sum_{n=1}^{p} E_{i_1} \cdots E_{i_{n-1}}\left[ \mathrm{Cov}_{i_n}\left( E(\Delta \mid D, M_{i_1 \cdots i_n}) \right) \right] \quad (16)$$

where $\mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_p})$ is the covariance of prediction for given data $D$ and base models at level $p$, and $\mathrm{Cov}_{i_n}$ is the covariance operator with respect to models $M_{i_1 \cdots i_n}$ at level $n$. $\mathrm{Cov}_{i_n}\left( E(\Delta \mid D, M_{i_1 \cdots i_n}) \right)$ is the between-model covariance:

$$\mathrm{Cov}_{i_n}\left( E(\Delta \mid D, M_{i_1 \cdots i_n}) \right) = \sum_{i_n} \left[ E(\Delta \mid D, M_{i_1 \cdots i_n}) - E(\Delta \mid D, M_{i_1 \cdots i_{n-1}}) \right]\left[ E(\Delta \mid D, M_{i_1 \cdots i_n}) - E(\Delta \mid D, M_{i_1 \cdots i_{n-1}}) \right]^{T} \Pr(M_{i_1 \cdots i_n} \mid D, M_{i_1 \cdots i_{n-1}}) \quad (17)$$

where $T$ is the transpose operator.

[25] When considering the collection BMA that has only one level ($p = 1$), the covariance in equation (16) returns to the usual form [Hoeting et al., 1999]

$$\mathrm{Cov}(\Delta \mid D) = \sum_{i_1} \mathrm{Cov}(\Delta \mid D, M_{i_1}) \Pr(M_{i_1} \mid D) + \sum_{i_1} \left[ E(\Delta \mid D, M_{i_1}) - E(\Delta \mid D) \right]\left[ E(\Delta \mid D, M_{i_1}) - E(\Delta \mid D) \right]^{T} \Pr(M_{i_1} \mid D) \quad (18)$$

where $M_{i_1}$ are the base models at level 1.

[26] When considering two levels ($p = 2$) as in Li and Tsai [2009] and Tsai [2010], the covariance in equation (16) is

$$\mathrm{Cov}(\Delta \mid D) = E_{i_1} E_{i_2}\left[ \mathrm{Cov}(\Delta \mid D, M_{i_1 i_2}) \right] + E_{i_1}\left[ \mathrm{Cov}_{i_2}\left( E(\Delta \mid D, M_{i_1 i_2}) \right) \right] + \mathrm{Cov}_{i_1}\left( E(\Delta \mid D, M_{i_1}) \right) \quad (19)$$

[27] For this study, we will consider four sources of uncertainty, i.e., four levels ($p = 4$). The hierarchical BMA formulation following equations (15) and (16) is

$$E(\Delta \mid D) = E_{i_1} E_{i_2} E_{i_3} E_{i_4}\left[\, E(\Delta \mid D, M_{i_1 i_2 i_3 i_4}) \,\right] \quad (20)$$

$$\mathrm{Cov}(\Delta \mid D) = E_{i_1} E_{i_2} E_{i_3} E_{i_4}\left[ \mathrm{Cov}(\Delta \mid D, M_{i_1 i_2 i_3 i_4}) \right] + E_{i_1} E_{i_2} E_{i_3}\left[ \mathrm{Cov}_{i_4}\left( E(\Delta \mid D, M_{i_1 i_2 i_3 i_4}) \right) \right] + E_{i_1} E_{i_2}\left[ \mathrm{Cov}_{i_3}\left( E(\Delta \mid D, M_{i_1 i_2 i_3}) \right) \right] + E_{i_1}\left[ \mathrm{Cov}_{i_2}\left( E(\Delta \mid D, M_{i_1 i_2}) \right) \right] + \mathrm{Cov}_{i_1}\left( E(\Delta \mid D, M_{i_1}) \right) \quad (21)$$

[28] Similarly, the hierarchical BMA permits the evaluation of the prediction covariance when different BMA models at different levels are proposed. The basic information from the base models is their covariance of prediction $\mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_p})$ at level $p$ and their mean of prediction $E(\Delta \mid D, M_{i_1 \cdots i_p})$. Then, the covariance of prediction using individual models at level $n$ is

$$\mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_n}) = E_{i_{n+1}} \cdots E_{i_p}\left[ \mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_p}) \right] + \sum_{k=n+1}^{p} E_{i_{n+1}} \cdots E_{i_{k-1}}\left[ \mathrm{Cov}_{i_k}\left( E(\Delta \mid D, M_{i_1 \cdots i_k}) \right) \right] \quad (22)$$

[29] Therefore, the within-model covariance of prediction using models at level $n$ is

$$E_{i_{n+1}}\left[ \mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_{n+1}}) \right] = \sum_{i_{n+1}} \mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_{n+1}}) \Pr(M_{i_1 \cdots i_{n+1}} \mid D, M_{i_1 \cdots i_n}) \quad (23)$$

[30] The between-model covariance of prediction using models at level $n$ is equation (17). The within-model covariance in equation (23) contains the sum of the within-model covariance and the between-model covariance at level $n+1$. Stepping into the within-model covariance $\mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_{n+1}})$ in equation (23), one can see that this term is composed of the within-model covariance and the between-model covariance at level $n+2$. In other words, except for the base models, the within-model covariance at level $n$ is composed of the within-model covariances and the between-model covariances at levels $n+1$, $n+2$, and so forth, up to $p$. Using equation (22) for each model at any level in Figure 1, a BMA tree of prediction covariances can be developed.

[31] From the calculation procedure, one needs to first obtain the expectation and covariance for all base models, i.e., $E(\Delta \mid D, M_{i_1 \cdots i_p})$ and $\mathrm{Cov}(\Delta \mid D, M_{i_1 \cdots i_p})$, because the base models are the basic elements of either the collection BMA or the hierarchical BMA. Then, the expectation of prediction using models at level $n$ in equation (15) needs to be calculated starting from level $p$, then level $p-1$, then level $p-2$, and so forth, until it reaches level $n$. Similarly, the within-model covariance in equation (23) and the between-model covariance in equation (17) at level $n$ need to start from level $p$, then level $p-1$, then level $p-2$, and so forth, until it reaches level $n$.
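This bottom-up recursion over equations (17), (22), and (23) can be sketched for scalar predictions as follows; the tree, probabilities, means, and variances are all illustrative, and the vector case would replace squared differences with outer products:

```python
# Illustrative two-level BMA tree for scalar predictions: base-model variances,
# base-model means, and conditional posterior model probabilities (made up).
var_pred = {(1, 1): 1.0, (1, 2): 2.0, (2, 1): 1.5, (2, 2): 0.5}
mean_pred = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 20.0, (2, 2): 30.0}
cond_prob = {(1,): 0.8, (2,): 0.2,
             (1, 1): 0.7, (1, 2): 0.3, (2, 1): 0.6, (2, 2): 0.4}

def bma_mean(index, n_children=2):
    """Eq. (15): weighted average of children's prediction means."""
    if index in mean_pred:
        return mean_pred[index]
    return sum(bma_mean(index + (i,)) * cond_prob[index + (i,)]
               for i in range(1, n_children + 1))

def bma_var(index, n_children=2):
    """Eq. (22): total variance = within-model (eq. 23) + between-model (eq. 17)."""
    if index in var_pred:   # base model: no between-model term
        return var_pred[index]
    children = [index + (i,) for i in range(1, n_children + 1)]
    within = sum(bma_var(c) * cond_prob[c] for c in children)
    between = sum((bma_mean(c) - bma_mean(index)) ** 2 * cond_prob[c]
                  for c in children)
    return within + between

total_var = bma_var(())   # total variance of the hierarch BMA model
```

Evaluating the recursion at the hierarch model reproduces the total variance of the one-level collection BMA over the base models, consistent with the discussion in section 2.5.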

[32] These derivations show that the calculation of the posterior model probabilities for the hierarchical BMA is the same as collection BMA since all models above the base level are BMA models. However, the conditional posterior model probability calculation in the hierarchical BMA is different since it only takes place for child models under their parent models, allowing for the segregation of the competing model propositions and the segregation of the uncertain model components. This is different from collection BMA, in which all child models are treated as one set.

2.4. Computation of Posterior Model Probability With Variance Window

[33] Computation of posterior model probability can be done through sampling techniques or information-theoretic criteria. Markov chain Monte Carlo (MCMC) simulation has been the most common tool to infer posterior distributions [Wöhling and Vrugt, 2008; Rojas et al., 2010b]. Although accurate, MCMC simulation requires a large ensemble to achieve stable convergence, which is computationally expensive. Alternatively, information-theoretic criteria such as the Akaike information criterion (AIC) [Poeter and Anderson, 2005; Singh et al., 2010], the Bayesian information criterion (BIC), and the Kashyap information criterion (KIC) [Neuman, 2003; Ye et al., 2004; Tsai and Li, 2008a; Singh et al., 2010] are inexpensive, fair estimators of posterior model probability. We note that, due to differences in their basic statistical assumptions, different information-theoretic criteria can often lead to different model rankings and posterior model probabilities [Tsai and Li, 2008a; Rojas et al., 2010c; Singh et al., 2010; Foglia et al., 2013]. A debate on the selection of BIC and KIC under the Bayesian paradigm is given by Ye et al. [2010a] and Tsai and Li [2010]. Assuming a large data size and a Gaussian distribution for prior model parameters [Raftery, 1995], this study adopts the BIC. Nevertheless, other sampling techniques or information-theoretic criteria can be considered in HBMA. The BIC derivation is skipped here; readers are referred to Draper [1995] and Raftery [1995].

[34] The likelihood for a base model in equation (9) is approximated by

$$p(D \mid M_{i_1 \cdots i_p}) \approx \exp\left( -\tfrac{1}{2}\, \mathrm{BIC}_{i_1 \cdots i_p} \right) \quad (24)$$

[35] The Bayesian information criterion is

$$\mathrm{BIC}_{i_1 \cdots i_p} = -2 \ln p(D \mid \hat{\boldsymbol{\beta}}_{i_1 \cdots i_p}, M_{i_1 \cdots i_p}) + d_{i_1 \cdots i_p} \ln n \quad (25)$$

where $\hat{\boldsymbol{\beta}}_{i_1 \cdots i_p}$ are the maximum-likelihood estimated model parameters in model $M_{i_1 \cdots i_p}$, $d_{i_1 \cdots i_p}$ is the dimension of the model parameters $\boldsymbol{\beta}_{i_1 \cdots i_p}$, and $n$ is the size of data set $D$. $p(D \mid \hat{\boldsymbol{\beta}}_{i_1 \cdots i_p}, M_{i_1 \cdots i_p})$ is the maximum likelihood value. By considering equal prior parameter probabilities for $\boldsymbol{\beta}_{i_1 \cdots i_p}$ and a multi-Gaussian distribution for fitting errors to observation data, the BIC in equation (25) is simplified to [Tsai and Li, 2008a; Li and Tsai, 2009]

$$\mathrm{BIC}_{i_1 \cdots i_p} = \mathrm{WSSE}_{i_1 \cdots i_p} + d_{i_1 \cdots i_p} \ln n \quad (26)$$

where

$$\mathrm{WSSE}_{i_1 \cdots i_p} = \left( \Delta^{\mathrm{cal}} - \Delta^{\mathrm{obs}} \right)^{T} C_{\Delta}^{-1} \left( \Delta^{\mathrm{cal}} - \Delta^{\mathrm{obs}} \right) \quad (27)$$

is the sum of the weighted squared fitting errors between the calculated $\Delta^{\mathrm{cal}}$ and the observed $\Delta^{\mathrm{obs}}$, and $C_{\Delta}$ is the error covariance matrix.
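A numerical sketch of equations (26)-(27), assuming made-up observations, a hypothetical two-parameter model, and a diagonal error covariance matrix:

```python
import numpy as np

# Illustrative calculated and observed values and error covariance matrix.
delta_cal = np.array([1.1, 2.0, 2.9])   # calculated values (made up)
delta_obs = np.array([1.0, 2.0, 3.0])   # observed values (made up)
C = np.diag([0.01, 0.01, 0.01])         # error covariance matrix

resid = delta_cal - delta_obs
wsse = resid @ np.linalg.inv(C) @ resid   # eq. (27): weighted squared errors
d = 2                                     # dimension of model parameters
n = delta_obs.size                        # data size
bic = wsse + d * np.log(n)                # eq. (26)
```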

[36] Substituting the likelihood in equation (24) into equation (7) and assuming equal prior model probabilities for $M_{i_1 \cdots i_p}$, the posterior model probability for the base models is

$$\Pr(M_{i_1 \cdots i_p} \mid D) = \frac{\exp\left( -\tfrac{1}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)}{\sum_{i_1} \cdots \sum_{i_p} \exp\left( -\tfrac{1}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)} \quad (28)$$

where $\Delta\mathrm{BIC}_{i_1 \cdots i_p} = \mathrm{BIC}_{i_1 \cdots i_p} - \mathrm{BIC}_{\min}$, and $\mathrm{BIC}_{\min}$ is the minimum BIC value among all the base models. Using $\Delta\mathrm{BIC}$ in equation (28) is a common practice [e.g., Neuman, 2003; Li and Tsai, 2009] to avoid numerical difficulty when the $\mathrm{BIC}_{i_1 \cdots i_p}$ are large numbers. The conditional posterior model probabilities of the base models under their parent models are

$$\Pr(M_{i_1 \cdots i_p} \mid D, M_{i_1 \cdots i_{p-1}}) = \frac{\exp\left( -\tfrac{1}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)}{\sum_{i_p} \exp\left( -\tfrac{1}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)} \quad (29)$$

[37] Once we obtain the posterior model probabilities and conditional posterior model probabilities for base models, the posterior model probabilities and conditional posterior model probabilities at any level can be obtained via equations (13) and (12), respectively.
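Equation (28) in code: a sketch converting illustrative base-model BIC values into posterior model probabilities via ΔBIC (equation (29) is the same computation restricted to the sibling models under one parent):

```python
import numpy as np

# Illustrative base-model BIC values (not from the study).
bic = np.array([100.0, 102.0, 108.0, 120.0])

delta_bic = bic - bic.min()           # subtract BIC_min to avoid overflow
weights = np.exp(-0.5 * delta_bic)    # eq. (28) numerator
post_prob = weights / weights.sum()   # normalized posterior model probabilities
```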

[38] This study adopts the variance window [Tsai and Li, 2008a, 2008b; Li and Tsai, 2009] to calculate the posterior model probabilities for the base models. The variance window introduces a scaling factor

$$\alpha = \frac{s_1}{s_2 \sqrt{2n}} \quad (30)$$

into equation (28) as follows:

$$\Pr(M_{i_1 \cdots i_p} \mid D) = \frac{\exp\left( -\tfrac{\alpha}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)}{\sum_{i_1} \cdots \sum_{i_p} \exp\left( -\tfrac{\alpha}{2}\, \Delta\mathrm{BIC}_{i_1 \cdots i_p} \right)} \quad (31)$$

[39] The parameter $s_1$ is the $\Delta\mathrm{BIC}$ value corresponding to the significance level in Occam's window, and $s_2$ is the width of the variance window in units of $\sqrt{2n}$. The selection of the significance level for $s_1$ and of the window size for $s_2$ is subject to the analyst's preference. If the scaling factor is zero, all base models are weighted equally. If the scaling factor is unity, the base models are weighted according to Occam's window. If the scaling factor is smaller than unity, we enlarge Occam's window to accept more models. The scaling factor can be seen as analogous to the smoothed information criteria [Hjort and Claeskens, 2003, 2006]. For more details, we refer readers to groundwater studies that compare the use of the variance window and Occam's window [Tsai and Li, 2008a, 2008b; Li and Tsai, 2009; Singh et al., 2010].
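The effect of the scaling factor in equation (31) can be sketched as follows; the alpha and BIC values are illustrative only:

```python
import numpy as np

# A smaller scaling factor tempers the ΔBIC weights, so probability is
# spread over more models than under Occam's window (values illustrative).
def scaled_probs(bic, alpha):
    delta_bic = bic - bic.min()
    w = np.exp(-0.5 * alpha * delta_bic)   # eq. (31) numerator
    return w / w.sum()

bic = np.array([100.0, 106.0, 112.0])
occam = scaled_probs(bic, 1.0)   # alpha = 1: Occam's window weighting
wide = scaled_probs(bic, 0.1)    # alpha < 1: enlarged window, flatter weights
flat = scaled_probs(bic, 0.0)    # alpha = 0: all models weighted equally
```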

[40] The posterior model probability is an epistemic probability [Ellison, 2004; Williamson, 2005]. Under the epistemic probability stance, probabilities are viewed as neither physical, mind-independent features of the world nor arbitrary, subjective entities, but rather as objective degrees of belief [Williamson, 2005]. Thus, the validity of a posterior model probability is subject to our knowledge, and the estimated posterior model probabilities will be subject to revision should new knowledge become available. The term knowledge here is not merely limited to our knowledge about the different propositions of the model data, structure, parameters, or processes, but also extends to the statistical measures that are used to calculate the posterior model probabilities.

[41] This provides a complete formulation for hierarchical BMA with the variance window, from which we can draw the following basic concepts. Similar to the collection BMA, the base level of the hierarchical BMA represents the individual models given the full array of different propositions with corresponding posterior model probabilities. All models above the base level are BMA models. For each level, the posterior model probability, conditional posterior model probability, prediction mean, and covariances (within-model covariances, between-model covariances, and total model covariances) can be obtained for each BMA model and presented through a BMA tree. From the base level up to level 1, each level distinguishes uncertainty arising from one source. The top level of the hierarchy contains the hierarch BMA model, which contains information from all subordinate models. In other words, the hierarch BMA model is identical to the BMA model of the collection BMA.

2.5. Similarities and Differences Between Collection BMA and Hierarchical BMA

[42] The previous analysis shows that the hierarchical BMA provides the general form of the BMA in Hoeting et al. [1999]. The result of the hierarch model of the hierarchical BMA is identical to the result of the collection BMA. However, Gupta et al. [2012] comment that "while model averaging provides a framework for explicitly considering (conceptual) model uncertainty, it currently lumps all errors into a single misfit term and does not provide insights into model structural adequacy." While the hierarchical BMA can be used for model averaging similar to collection BMA, GLUE, or other averaging methods [e.g., Seifert et al., 2012], it also facilitates a different purpose altogether, namely, to learn about the individual model components with their competing propositions. This is in line with the conclusion of Gupta et al. [2012] that "a systematic characterization of different aspects of model structural adequacy will help by explicitly recognizing the role of each aspect in shaping the overall adequacy of the model." In other words, hierarchical BMA can serve as the "multiple hypothesis methodology" proposed by Clark et al. [2011], in which competing hypotheses are systematically constructed and evaluated, providing a learning tool that can lead to a considerably more scientifically defensible model. To serve this purpose, hierarchical BMA offers four features additional to collection BMA, as follows.

[43] First, through model dissection following a combinatorial design, the hierarchical BMA provides a systematic representation of the competing propositions of all sources of uncertainty. This can also be done with collection BMA, yet since it is not a prerequisite to collection BMA, it is not a common practice. Second, model dissection allows the evaluation of the competing model propositions of each uncertain model component through the BMA tree of posterior model probabilities. Although this can be partly inferred from model ranking (e.g., Foglia et al. [2013] and this study), the BMA tree of posterior model probabilities provides more detailed information. Third, the segregation of the between-model variance for each uncertain model component allows for the prioritization of different sources of uncertainty. Fourth, hierarchical BMA illustrates the change of the BMA prediction due to the addition of each source of uncertainty, while collection BMA (one level) only provides one overall BMA prediction of all models. Similarly, the total model variance for each uncertain model component depicts the uncertainty propagation resulting from adding up different sources of uncertainty. Thus, the hierarchical BMA allows for uncertainty segregation, comparative evaluation of competing model propositions, prioritization of the uncertain model components, and depiction of the prediction and uncertainty propagation. These features, which advance our knowledge about the uncertain model components, are not readily obtainable through collection BMA. We illustrate these four features in the following case study.

3. Case Study

3.1. Baton Rouge Aquifer-Fault System, Louisiana

[44] The Baton Rouge aquifer system in Louisiana is located at the southern limit of the Southern Hills regional aquifer system [Buono, 1983]. The aquifer system consists of a complexly interbedded series of sand and clay formations that gently dip south [Tomaszewski, 1996]. This sequence of aquifers/aquicludes extends to a depth of 3000 ft (914.4 m) in the Baton Rouge area. The study area shown in Figure 2 focuses on the Miocene-Pliocene deposits [Griffith, 2003] of the "1200 foot" sand, the "1500 foot" sand, the "1700 foot" sand, the "2000 foot" sand, and the "2400 foot" sand. These sand units were classified and named by their approximate depth below ground level in the Baton Rouge industrial district [Meyer and Turcan, 1955].

Figure 2.

The study area. Black dots represent the locations of electrical well logs. The dashed lines in West Baton Rouge Parish represent the approximate locations of the faults. inline image is a cross section for Figure 5.

[45] The Baton Rouge fault system consists of the Baton Rouge fault and the Denham Springs-Scotlandville fault (Tepetate fault) [McCulloh and Heinrich, 2012]. This east-west trending fault system crosscuts the aquifer/aquiclude sequences in the study area as shown in Figure 2. The low-permeability Baton Rouge fault is important from a resource point of view since it separates the sequences of freshwater and brackish aquifers north and south of the fault, respectively. Heavy groundwater pumping reversed the flow direction, which was originally southward, resulting in saltwater encroachment from south of the Baton Rouge fault [Rollo, 1969; Tomaszewski, 1996; Tsai, 2010]. No detailed hydrogeological information on the Denham Springs-Scotlandville fault is available in the literature. To better understand the hydrogeological setting of the Baton Rouge aquifer-fault system, it is important to study the detailed hydrostratigraphic architecture of the system.

3.2. Hydrostratigraphic Model and Its Uncertainty

[46] Due to uncertainty in the model data, structure, and parameters, multiple potential hydrostratigraphic models are produced and calibrated. The central idea of the HBMA method is to segregate the different uncertain model components along with their corresponding competing model propositions. We illustrate these concepts of uncertain model components and competing model propositions in Figure 3. As shown in Figure 3, this case study considers four uncertain model components in the hydrostratigraphic model: two calibration data sets, three variogram models, two geological stationarity assumptions, and two conceptualizations of the Denham Springs-Scotlandville fault. Conversely, Figure 3 shows that the hydrofacies data interpretation and the parameter estimation technique are not considered uncertain model components since only one proposition is considered for each of these model components.

Figure 3.

Uncertainty segregation through dissection of model components with their competing modeling propositions.

[47] The four uncertain model components with their corresponding competing propositions result in 24 calibrated models. We use these models to perform hierarchical multimodel characterization of the hydrostratigraphy of the Baton Rouge aquifer-fault system and to present the main features of the HBMA method. In this section, we present a detailed description of the hydrostratigraphic model and its uncertain model components.

[48] We define our hydrostratigraphic model on a scale that is smaller than the regional scale and larger than the channel scale to describe the strongly bimodal spatial distribution of intercalated pervious and impervious units. This scale is the same as the hydrofacies assemblage complex scale [Rubin, 2003] and the depositional environment scale [Koltermann and Gorelick, 1996]. This strongly bimodal scale of characterization fits our objective of delineating the thickness, lateral extent, and depth of sand and clay units underneath Baton Rouge. To reconstruct the hydrostratigraphic architecture, we adopt the indicator hydrostratigraphy method [Johnson and Dreiss, 1989; Johnson, 1995]. The study analyzes 288 electrical well logs and interprets the sand-clay bimodal sequence for each log based on electrical resistivity, spontaneous potential, and gamma ray readings.

[49] For model calibration, we use lithologic data from 33 drillers' logs. However, different interpretations of the drillers' logs lead to multiple calibration data sets. Sand and gravel are considered sand facies with indicator 1. Silt and clay are considered clay facies with indicator 0. The interpretation uncertainty arises from indistinct lithologic terms such as “sand with shale,” “shaly sand,” “sand with strikes of shale,” and so forth. Two data sets are proposed. Data set I interprets the indistinct lithologic terms as clay facies with indicator 0. Data set II interprets the indistinct lithologic terms as sand facies with indicator 1.
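The construction of the two calibration data sets can be sketched as follows. This is a minimal illustration; the helper name `facies_indicator` and the term lists are hypothetical and do not reproduce the study's full vocabulary of lithologic terms.

```python
# Illustrative mapping of drillers' log lithologic terms to binary facies
# indicators under the two data-set propositions (term lists are examples).
DISTINCT_SAND = {"sand", "gravel"}   # sand facies, indicator 1 in both data sets
DISTINCT_CLAY = {"silt", "clay"}     # clay facies, indicator 0 in both data sets
INDISTINCT = {"sand with shale", "shaly sand", "sand with strikes of shale"}

def facies_indicator(term: str, data_set: str) -> int:
    """Map a lithologic term to a facies indicator (1 = sand, 0 = clay).

    data_set = "I"  treats indistinct terms as clay facies (indicator 0);
    data_set = "II" treats indistinct terms as sand facies (indicator 1).
    """
    term = term.lower().strip()
    if term in DISTINCT_SAND:
        return 1
    if term in DISTINCT_CLAY:
        return 0
    if term in INDISTINCT:
        return 0 if data_set == "I" else 1
    raise ValueError(f"unclassified lithologic term: {term!r}")
```

For example, `facies_indicator("shaly sand", "I")` returns 0, while the same term under data set II returns 1, so the two propositions differ only on the indistinct terms.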

[50] With respect to the hydrostratigraphic model structure, the first uncertain model component is the choice of the spatial correlation function of the hydrofacies units. In this study, we use three competing propositions: the exponential, pentaspherical, and Gaussian variogram models. The second source of uncertainty concerning the model structure is the geological stationarity assumption. If geological stationarity is shown to be inappropriate, it is helpful to divide the system into zones that are likely to be stationary [Koltermann and Gorelick, 1996; Rubin, 2003; Deutsch, 2007]. For the uncertainty analysis, we adopt two geological stationarity propositions. The global stationarity proposition assumes geological stationarity over the entire modeling domain, resulting in one global variogram model. The local stationarity proposition assumes stationarity within each model zone as separated by the fault system, resulting in a local variogram model for each zone. Under the global variogram model proposition, correlation between data across the faults is still prevented, yet the experimental variograms from all zones are used to fit one theoretical variogram model. Besides the aforementioned mathematical structure uncertainty, model structure uncertainty also includes geological conceptualization uncertainty. For example, different fault characterizations can lead to different model structures [Chester et al., 1993; Bredehoeft, 1997; Salve and Oldenburg, 2001; Fairley et al., 2003; Nishikawa et al., 2009]. This study investigates the geological effect of the Denham Springs-Scotlandville fault. While the Baton Rouge fault is significant to fluid flow, the Denham Springs-Scotlandville fault was not considered in many groundwater models [Torak and Whiteman, 1982; Huntzinger et al., 1985; Tsai and Li, 2008a; Li and Tsai, 2009; Tsai, 2010] due to the lack of significant evidence of hydraulic discontinuity across the fault.

[51] We test two geological conceptualization propositions: the two-zone proposition and the three-zone proposition. The two-zone proposition does not consider the Denham Springs-Scotlandville fault, similar to Rollo [1969], and thus the model domain is separated into two zones by the Baton Rouge fault; correlation between the well log data across the Denham Springs-Scotlandville fault is allowed. The three-zone proposition explicitly accounts for the Denham Springs-Scotlandville fault, and thus the model domain is separated into three zones; correlation between the well log data across the Denham Springs-Scotlandville fault is prevented.

3.3. Model Parameters and Calibration

[52] This section presents the inverse procedure used to estimate the unknown model parameters. The first model parameter is the formation dip ϕ, which establishes the data correlation. Different formation dips have a significant effect on the variogram structure and the selection of data points. To obtain prior information that constrains the search space, we calculate the formation dip to be inline image from the U.S. Geological Survey (USGS) cross-sectional map in the area [Griffith, 2003]. We assign a range of inline image for the formation dip. The second model parameter is the sand-clay cutoff θ, which rounds the indicator estimate Δ to a binary value. We set the range of the cutoff to inline image.
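The role of the cutoff can be sketched in a few lines. This is an illustration only; whether the boundary case Δ = θ is assigned to sand or clay is not stated in the text and is assumed here.

```python
def round_by_cutoff(delta: float, theta: float) -> int:
    """Round an indicator kriging estimate delta to a binary facies value:
    sand (1) if delta >= theta, otherwise clay (0). The >= convention for
    the boundary case is an assumption, not from the original study."""
    return 1 if delta >= theta else 0

# A cutoff below 0.5 classifies more locations as sand than a fixed 0.5 would.
estimates = [0.35, 0.45, 0.62]
sand_at_041 = [round_by_cutoff(d, 0.41) for d in estimates]  # [0, 1, 1]
sand_at_050 = [round_by_cutoff(d, 0.50) for d in estimates]  # [0, 0, 1]
```

The comparison of the two cutoffs illustrates the later calibration finding that a fixed cutoff of 0.5 would underestimate the sand proportion relative to the calibrated value of about 0.41.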

[53] To estimate the unknown model parameters, we formulate the inverse problem by minimizing the fitting errors between the estimated and observed facies as follows

display math(32)

where inline image and inline image are the data sizes of the sand facies and the clay facies, respectively, and inline image, inline image, and inline image are the observed sand facies indicator, the observed clay facies indicator, and the indicator estimate at a location x, respectively. To make the calibration consistent with equation (27), equation (32) includes the variance term inline image, which is the sum of the data variance and the kriging variance at location x. The data variance for the two calibration data sets is 0.128, as calculated from the differences between electrical and drillers' logs where both are available at the same locations.

[54] Given two fault conceptualizations, two calibration data sets, two geological stationarity assumptions, and three variogram models, the combinatorial design results in 24 base models. The unknown model parameters (ϕ,θ) are independently estimated for each of the 24 models. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [Hansen et al., 2003], a global-local derivative-free optimization algorithm, is used to solve the inverse problem in equation (32) according to the following procedure. First, the CMA-ES generates candidate solutions (ϕ,θ). Second, for each candidate solution, the experimental variograms for each zone are calculated given the formation dip ϕ, and a theoretical variogram model is automatically fitted to the experimental variograms using the direct search method of Hooke and Jeeves [1961]. Third, indicator kriging is used to estimate the facies at the locations of the observation data, and the indicator kriging estimates are rounded to indicators by the sand-clay cutoff θ. Fourth, the fitting error is calculated by comparing the estimated indicators to the observation data set, which is data set I or data set II, according to equation (32). This procedure is repeated until the fitting error is minimized.

4. Results and Discussion

4.1. Calibration and BIC

[55] For the results and discussion, we use the following short forms. The first level of uncertainty concerns the conceptualization of the Denham Springs-Scotlandville fault, resulting in the two-zone (Z2) and three-zone (Z3) propositions. The second level concerns the calibration data, containing data set I (D1) and data set II (D2). The third level has the global (G) and local (L) stationarity assumptions. The fourth level of uncertainty has three propositions: the exponential (Exp), Gaussian (Gus), and pentaspherical (Pen) variogram models. The short forms of the propositions form the names of the 24 base models and their corresponding hierarchical BMA models. For example, Z3D1LExp is the name of the base model with three zones (Z3), calibration data set I (D1), the local stationarity assumption (L), and the exponential variogram model (Exp). The name Z3D1L represents a BMA model of the Z3D1LExp, Z3D1LGus, and Z3D1LPen models under the propositions Z3, D1, and L. Similarly, the Z3D1 model represents a BMA model of the Z3D1L and Z3D1G models under the propositions Z3 and D1. The Z3 model is the BMA of the Z3D1 and Z3D2 models under the proposition Z3. At the topmost level, the hierarch model is a BMA of the Z2 and Z3 models.
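The combinatorial design behind these names can be reproduced directly from the four proposition lists (a minimal sketch using Python's itertools; the variable names are illustrative):

```python
from itertools import product

# Short forms of the competing propositions at each level of the BMA tree.
fault_zones  = ["Z2", "Z3"]           # fault conceptualization (level 1)
data_sets    = ["D1", "D2"]           # calibration data set (level 2)
stationarity = ["G", "L"]             # global/local stationarity (level 3)
variograms   = ["Exp", "Gus", "Pen"]  # variogram model (level 4)

# Combinatorial design: every combination of propositions is a base model.
base_models = ["".join(combo) for combo in
               product(fault_zones, data_sets, stationarity, variograms)]
# 2 x 2 x 2 x 3 = 24 base model names, e.g., "Z3D1LExp"
```

Because every proposition is crossed with every other, each uncertain model component can later be isolated by averaging over the remaining components, which is what makes the uncertainty segregation possible.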

[56] Table 1 shows the calibration results of the 24 models for the formation dip and the sand-clay cutoff. The mean sand-clay cutoff of 0.41 is in agreement with the sand proportion of 0.40 calculated from the electrical logs, which implies that the sand-clay cutoff can be interpreted as the probability of occurrence [Chilès and Delfiner, 1999]. Previous studies [Johnson and Dreiss, 1989; Falivene et al., 2007] considered a sand-clay cutoff of 0.5 a reasonable assumption; however, the calibration results show that a fixed cutoff of 0.5 would underestimate the sand proportion in this case. The minimum, mean, and maximum formation dips for the 24 models are 0.17°, 0.32°, and 0.45°, respectively. This agrees with the geological information that the aquifer system gently dips south [Tomaszewski, 1996] and with the dip inline image estimated from Griffith [2003].

Table 1. Calibrated Model Parameters, Fitting Errors (Equation (32)), Q, ΔBIC, and Posterior Model Probabilities for Base Modelsᵃ

| Base Model | Dip (deg.) | Cutoff | Fitting Error | Q | ΔBIC | Occam's Window | Variance Window 1% (1σD / 2σD / 3σD) | Variance Window 5% (1σD / 2σD / 3σD) |
|---|---|---|---|---|---|---|---|---|
| Z2D1GExp | 0.23 | 0.39 | 2.010 | 10,010 | 854 | 0.00 | 0.00 / 0.04 / 1.42 | 0.00 / 0.52 / 3.54 |
| Z2D1GGus | 0.44 | 0.44 | 2.181 | 10,620 | 1464 | 0.00 | 0.00 / 0.00 / 0.09 | 0.00 / 0.01 / 0.57 |
| Z2D1GPen | 0.30 | 0.41 | 2.208 | 10,927 | 1771 | 0.00 | 0.00 / 0.00 / 0.02 | 0.00 / 0.00 / 0.23 |
| Z2D1LExp | 0.44 | 0.42 | 2.080 | 10,062 | 906 | 0.00 | 0.01 / 0.79 / 6.50 | 0.19 / 3.74 / 9.51 |
| Z2D1LGus | 0.32 | 0.41 | 2.111 | 10,243 | 1087 | 0.00 | 0.00 / 0.00 / 0.49 | 0.00 / 0.13 / 1.77 |
| Z2D1LPen | 0.44 | 0.42 | 2.005 | 9679 | 523 | 0.00 | 0.00 / 0.02 / 1.12 | 0.00 / 0.38 / 3.03 |
| Z2D2GExp | 0.19 | 0.40 | 2.100 | 10,190 | 1034 | 0.00 | 0.00 / 0.01 / 0.62 | 0.00 / 0.18 / 2.07 |
| Z2D2GGus | 0.44 | 0.42 | 2.160 | 10,362 | 1206 | 0.00 | 0.00 / 0.00 / 0.28 | 0.00 / 0.06 / 1.24 |
| Z2D2GPen | 0.18 | 0.41 | 2.621 | 11,243 | 2087 | 0.00 | 0.00 / 0.00 / 0.00 | 0.00 / 0.00 / 0.09 |
| Z2D2LExp | 0.44 | 0.42 | 2.087 | 10,132 | 976 | 0.00 | 0.00 / 0.01 / 0.81 | 0.00 / 0.25 / 2.46 |
| Z2D2LGus | 0.28 | 0.40 | 2.151 | 10,349 | 1193 | 0.00 | 0.00 / 0.00 / 0.30 | 0.00 / 0.07 / 1.29 |
| Z2D2LPen | 0.17 | 0.42 | 2.372 | 11,226 | 2070 | 0.00 | 0.00 / 0.00 / 0.01 | 0.00 / 0.00 / 0.09 |
| Z3D1GExp | 0.29 | 0.40 | 2.087 | 10,087 | 931 | 0.00 | 0.00 / 0.02 / 1.00 | 0.00 / 0.33 / 2.81 |
| Z3D1GGus | 0.18 | 0.41 | 2.219 | 10,928 | 1772 | 0.00 | 0.00 / 0.00 / 0.02 | 0.00 / 0.00 / 0.23 |
| Z3D1GPen | 0.30 | 0.41 | 2.000 | 9544 | 388 | 0.00 | 0.08 / 2.73 / 12.09 | 0.96 / 8.37 / 14.23 |
| Z3D1LExp | 0.45 | 0.41 | 1.934 | 9156 | 0 | 100.00 | 99.91 / 96.32 / 71.80 | 98.84 / 84.95 / 45.34 |
| Z3D1LGus | 0.34 | 0.40 | 2.105 | 10,194 | 1038 | 0.00 | 0.00 / 0.01 / 0.61 | 0.00 / 0.17 / 2.04 |
| Z3D1LPen | 0.28 | 0.41 | 2.150 | 10,253 | 1097 | 0.00 | 0.00 / 0.00 / 0.47 | 0.00 / 0.12 / 1.71 |
| Z3D2GExp | 0.29 | 0.40 | 2.092 | 10,159 | 1003 | 0.00 | 0.00 / 0.01 / 0.72 | 0.00 / 0.21 / 2.27 |
| Z3D2GGus | 0.18 | 0.42 | 2.188 | 10,845 | 1689 | 0.00 | 0.00 / 0.00 / 0.03 | 0.00 / 0.00 / 0.29 |
| Z3D2GPen | 0.30 | 0.41 | 2.249 | 11,118 | 1962 | 0.00 | 0.00 / 0.00 / 0.01 | 0.00 / 0.00 / 0.13 |
| Z3D2LExp | 0.45 | 0.42 | 2.067 | 10,042 | 886 | 0.00 | 0.00 / 0.03 / 1.23 | 0.00 / 0.43 / 3.22 |
| Z3D2LGus | 0.30 | 0.40 | 2.164 | 10,394 | 1238 | 0.00 | 0.00 / 0.00 / 0.24 | 0.00 / 0.05 / 1.12 |
| Z3D2LPen | 0.44 | 0.42 | 2.174 | 10,548 | 1392 | 0.00 | 0.00 / 0.00 / 0.12 | 0.00 / 0.02 / 0.71 |

ᵃZ3D1LExp is the best model.

[57] Given two unknown model parameters and the fitting residual Q, we use equation (26) to calculate inline image. To obtain the BMA tree, we calculate the posterior model probabilities using inline image for both Occam's window and different variance windows. BICmin is the minimum BIC value among all models, which is inline image for the best base model, Z3D1LExp. The number of data points is 31,500. Table 1 shows ΔBIC and the posterior model probabilities for the base models using Occam's window and variance windows based on scaling factors at the 1% and 5% significance levels and three different standard deviations σD of the fitting residual Q [Tsai and Li, 2008a, 2008b]. Due to the large data size, Occam's window, as expected, singles out only the best model. The posterior model probabilities of the less influential models increase as the significance level and the standard deviation σD increase, which in turn decreases the posterior model probabilities of the best models. Adjusting the scaling factor of the variance window is subject to the analyst's decision, and the posterior model probabilities change accordingly, as shown in Table 1. However, adjusting the scaling factor does not change the model ranking; it only increases the inclusion of base models [Tsai and Li, 2008a, 2008b; Li and Tsai, 2009; Singh et al., 2010]. Nevertheless, propositions under different variance windows are not mutually exclusive. To illustrate the variance propagation from the base models to the hierarch model, we use a large variance window of 5% and inline image for the subsequent analysis.
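The conversion from ΔBIC values to posterior model probabilities can be sketched as follows. This assumes the standard BIC-based weight exp(-ΔBIC/2), with a scaling factor α standing in for the variance-window adjustment; the exact window definition (tied to the significance level and σD) follows Tsai and Li [2008a, 2008b] and is simplified here.

```python
import math

def posterior_probabilities(delta_bic, alpha=1.0):
    """Posterior model probabilities from BIC differences.

    delta_bic : iterable of BIC_k - BIC_min values, one per model
    alpha     : illustrative scaling factor; alpha = 1 gives Occam's-window
                behavior, while a smaller alpha (a wider variance window)
                admits more models without changing the model ranking
    """
    weights = [math.exp(-0.5 * alpha * d) for d in delta_bic]
    total = sum(weights)
    return [w / total for w in weights]

# With the large data size here, even the second-best model (ΔBIC = 388)
# receives essentially zero weight under alpha = 1:
occam = posterior_probabilities([0.0, 388.0, 523.0])
# A much smaller alpha spreads probability across the competing models:
window = posterior_probabilities([0.0, 388.0, 523.0], alpha=0.005)
```

This reproduces the qualitative behavior in Table 1: the ranking is preserved for any positive α, but widening the window raises the probabilities of less influential models at the expense of the best model.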

4.2. Model Propositions Evaluation Using the BMA Tree

[58] Figure 4 shows the BMA tree for the four uncertain model components with their corresponding propositions. The best branch runs from the hierarch model to the Z3 model, to the Z3D1 model, and to the Z3D1L model. The best branch coincides with the branch of the best base model because the best base model has a dominant posterior model probability. Two outcomes can be drawn from the BMA tree. First, model dissection through the BMA tree allows for spotting the propositions that result in good models. Looking at the propositions of the best model, the three-zone (Z3), data set I (D1), local stationarity (L), and exponential variogram (Exp) propositions generally show higher posterior model probabilities than their competitors. As expected, the worst model, Z2D2GPen, does not share a single proposition with the best model. The second worst model, Z2D2LPen, shares only the local stationarity (L) proposition with the best model.

Figure 4.

The BMA tree of the posterior model probabilities (model weights) and the conditional posterior model probabilities (conditional model weights) for the four uncertain model components. Models with posterior model probabilities less than 1% are not shown in the figure.

[59] Second, since the posterior model probabilities in the BMA tree are based on the evidence of the data, they may provide an opportunity to recognize the robust propositions. In other words, we examine whether the posterior model probabilities relate to our understanding of the model under study. Starting with the base level of the BMA tree as shown in Figure 4, models with the exponential variogram proposition (Exp) have higher posterior model probabilities in most branches, followed by the Gaussian variogram proposition (Gus) and finally the pentaspherical variogram proposition (Pen). This is not surprising, since the exponential model is indicative of sharp transitions occurring between blocks of different values [Rubin, 2003]. Thus, the exponential function honors the binary conceptualization of sand and clay.

[60] The third level of the BMA tree in Figure 4, which represents the global (G) and local (L) stationarity propositions, shows that the local proposition has consistently higher conditional posterior model probabilities, yet in general the conditional posterior model probabilities of the local and global propositions are not markedly different. Pooling data for common processing over a reasonably defined geological region cannot be refuted from the data a priori, but it can be shown to be inappropriate a posteriori [Deutsch, 2007]. However, Z2D2G and Z2D2L can both be regarded as plausible a posteriori since their conditional posterior model probabilities are relatively similar.

[61] The second level of the BMA tree indicates that calibrating the models against calibration data set I (D1) is more robust than against data set II (D2). This is anticipated because D1 is in agreement with the electrical log interpretation, which assigns sand and gravel sequences to the sand facies with indicator 1.

[62] The first level of the BMA tree compares the two-zone (Z2) and three-zone (Z3) propositions. The posterior model probability of the Z3 proposition, which explicitly accounts for the Denham Springs-Scotlandville fault, is relatively higher than that of the Z2 proposition. In Figure 5, we visually evaluate whether the Denham Springs-Scotlandville fault displaces the sand units along the fault plane. The Z2 model in Figure 5a implies a fault throw in the “2000 foot” sand but shows no obvious displacement in the “1500 foot” sand. The Z3 model in Figure 5b resolves the displacement in both the “1500 foot” sand and the “2000 foot” sand, suggesting that the Denham Springs-Scotlandville fault does displace the sand units along the fault plane. This is in agreement with the higher posterior model probability of the Z3 proposition. It is interesting that the hierarch model in Figure 5c is very similar to the Z3 model, yet shows high total model variance around the Denham Springs-Scotlandville fault in Figure 5d due to the Z2 proposition.

Figure 5.

The BMA model estimates for the cross section inline image (see Figure 2): (a) Z2 model, (b) Z3 model, and (c) hierarch model. White areas are sand and black areas are clay. (d) Total model variance for the hierarch model.

4.3. Uncertainty Propagation and Prioritization

[63] The total uncertainty, as expressed by the total model variance, is the sum of the between-model variance and the within-model variance. The between-model variance depicts the estimation differences between competing models. Moving to the superior level, the total model variances of the child models combine into the within-model variance at that level. This section presents the variance propagation of the within-model variance, between-model variance, and total model variance, and aims at prioritizing the uncertain model components based on their corresponding between-model variances. For this purpose, we use the cross section south of the Denham Springs-Scotlandville fault shown in Figure 6, which follows the fault line shown in Figure 2 but is rendered in two dimensions for clarity. The grid spacing is 50 m along the fault line and 1 ft (0.3048 m) in the vertical direction.

Figure 6.

The BMA model estimates for the cross section south of the Denham Springs-Scotlandville fault for the best branch: (a) Z3D1L model, (b) Z3D1 model, (c) Z3 model, and (d) Hierarch model. White areas are sand and black areas are clay.

[64] To trace and understand the patterns of uncertainty propagation, Table 2 shows the mean values of the variances for all BMA models in the BMA tree. Table 2 shows the prediction variances and conditional posterior model probabilities for the BMA models at each level, which are obtained from the child models in the subordinate level. For example, level 3 shows the results from the different variogram propositions; level 2 shows the results from the different stationarity propositions; level 1 shows the results from the different calibration data propositions; and the hierarch level shows the results from the different fault propositions. Following the best branch from the Z3D1L model up to the hierarch model, the total model variance is, as expected, monotonically increasing because the variances are adding up. This is not necessarily the case for other branches. For example, if a model has a high total model variance and a low posterior model probability, such as the Z2D2G model, then at the next superior level the within-model variance, which averages the total model variances of the Z2D2G and Z2D2L models, will be less than the total model variance of the Z2D2G model. Similar to the total model variance, the within-model variance depends on its subordinate levels and tends to increase moving up to superior levels, since it accumulates the between-model variances of its subordinate levels. Yet unlike the total model variance, the within-model variance is not necessarily monotonically increasing along the best branch and can decrease depending on the posterior model probabilities. The best branch in Table 2 illustrates this observation: the within-model variance of the hierarch model (0.244) is less than that of the Z3 model (0.248).

Table 2. Mean Values of the Within-Model Variance (WMV), the Between-Model Variance (BMV), and the Total Model Variance (TMV), and the Conditional Posterior Model Probabilities (cPr.) for the Cross Section South of the Denham Springs-Scotlandville Fault

Level 3:

| BMA Model | WMV | BMV | TMV | cPr. |
|---|---|---|---|---|
| Z2D1G | 0.183 | 0.020 | 0.203 | 0.23 |
| Z2D1L | 0.184 | 0.020 | 0.204 | 0.77 |
| Z2D2G | 0.187 | 0.039 | 0.226 | 0.47 |
| Z2D2L | 0.187 | 0.031 | 0.218 | 0.53 |
| Z3D1G | 0.185 | 0.017 | 0.202 | 0.26 |
| Z3D1L | 0.208 | 0.011 | 0.220 | 0.74 |
| Z3D2G | 0.180 | 0.013 | 0.193 | 0.35 |
| Z3D2L | 0.209 | 0.034 | 0.243 | 0.65 |

Level 2:

| BMA Model | WMV | BMV | TMV | cPr. |
|---|---|---|---|---|
| Z2D1 | 0.204 | 0.015 | 0.218 | 0.72 |
| Z2D2 | 0.222 | 0.010 | 0.232 | 0.28 |
| Z3D1 | 0.215 | 0.033 | 0.248 | 0.90 |
| Z3D2 | 0.226 | 0.024 | 0.250 | 0.10 |

Level 1:

| BMA Model | WMV | BMV | TMV | cPr. |
|---|---|---|---|---|
| Z2 | 0.222 | 0.004 | 0.226 | 0.26 |
| Z3 | 0.248 | 0.001 | 0.250 | 0.74 |

Hierarch level:

| BMA Model | WMV | BMV | TMV |
|---|---|---|---|
| Hierarch | 0.244 | 0.028 | 0.271 |

[65] Unlike the within-model variance and the total model variance, the between-model variance is independent of the subordinate levels, as indicated by equation (17) and illustrated in Table 2. This feature allows for prioritizing the relative impact of each uncertain model component on the overall model uncertainty. For example, the small between-model variance at level 1 in Table 2 indicates that the calibration data set propositions contribute insignificantly to the overall model uncertainty. The between-model variances of the three other uncertain model components in Table 2 are high, indicating that each of them contributes substantially to the overall model uncertainty, since the between-model variances are additive as shown in the following discussion. To further understand the uncertainty propagation, we plot the model estimation, within-model variance, between-model variance, and total model variance for the best branch.
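The variance bookkeeping described above can be sketched for a single point estimate. This is a minimal illustration of the law of total variance underlying the decomposition; the function name and the numbers are hypothetical, not values from the study.

```python
def bma_combine(means, variances, weights):
    """Combine child-model means and total model variances into one BMA model.

    Returns (bma_mean, within_model_var, between_model_var, total_var), where
      WMV = sum_k w_k * var_k                 (weighted child total variances)
      BMV = sum_k w_k * (mean_k - bma_mean)**2
      TMV = WMV + BMV                         (law of total variance)
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    bma_mean = sum(w * m for w, m in zip(weights, means))
    wmv = sum(w * v for w, v in zip(weights, variances))
    bmv = sum(w * (m - bma_mean) ** 2 for w, m in zip(weights, means))
    return bma_mean, wmv, bmv, wmv + bmv

# Hypothetical child models (e.g., a local and a global proposition) at one
# grid point: indicator means, total model variances, and posterior weights.
mean, wmv, bmv, tmv = bma_combine(
    means=[0.9, 0.4], variances=[0.02, 0.05], weights=[0.74, 0.26])
```

Applied level by level up the BMA tree, the TMV of each child feeds the WMV of its parent, while the BMV at a level depends only on that level's competing propositions, which is why the between-model variance isolates the contribution of each uncertain model component.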

[66] The BMA models in Figure 6 show the estimated sand-clay distribution south of the Denham Springs-Scotlandville fault. The predictions of the Z3D1L, Z3D1, and Z3 models are almost identical. This indicates that the BMA model prediction is stable with respect to these sources of uncertainty because the models are relatively similar and the best base model dominates the results. Unlike these three BMA models, the hierarch model in Figure 6d is marginally different, particularly for the “2400 foot” sand, because the Z2 and Z3 propositions produce different estimations, as previously illustrated in Figures 5a and 5b.

[67] Figure 7 shows the between-model variance for the four uncertain model components. The Z3D1L, Z3D1, and Z3 models have similar variance patterns, yet with different values. The similar variance patterns indicate again that the best base model dominates the results. A high between-model variance indicates that competing propositions are important and the competing models are considerably different. For example, the local and global stationarity assumptions are both good propositions, as indicated by their posterior model probabilities, and thus result in a high between-model variance, as shown in Figure 7b. Also, averaging the Z2 and Z3 models, as shown in Figure 7d, results in a high between-model variance since their estimations are noticeably different. Conversely, a small between-model variance is due to the similarity of the competing models or the presence of a dominant competing proposition. For example, using different calibration data sets results in similar models with the same dip and cutoff, yet with different fitting errors, as shown in Table 1. Also, the D1 proposition outperforms the D2 proposition, as shown by their posterior model probabilities in Figure 4. Thus, the insignificant impact of the calibration data set propositions on the overall model uncertainty is due to the combined effect of these two factors.

Figure 7.

The between-model variance for the cross section south of the Denham Springs-Scotlandville fault for the best branch: (a) Z3D1L model, (b) Z3D1 model, (c) Z3 model, and (d) Hierarch model.

[68] As shown in Figure 8a, the within-model variance of the Z3D1L model is the weighted average of the variances of the three base models Z3D1LExp, Z3D1LGus, and Z3D1LPen. Regions close to the electrical logs have lower variance. Figure 8b illustrates that the within-model variance at the next level, for the Z3D1 model, is the weighted average of the total model variances of its subordinate models Z3D1G and Z3D1L. Similarly, we obtain the within-model variances for the Z3 model and the hierarch model as shown in Figures 8c and 8d, respectively. Comparing Figure 8c with Figure 8d illustrates the previous remark that the within-model variance does not monotonically increase in value, yet more uncertain regions occur. This leads to an interesting observation: the areal extent of uncertainty always increases as more sources of uncertainty are added, yet the variance magnitude can decrease.

Figure 8.

The within-model variance for the cross section south of the Denham Springs-Scotlandville fault for the best branch: (a) Z3D1L model, (b) Z3D1 model, (c) Z3 model, and (d) Hierarch model.

[69] Figure 9 shows the total model variance, which is the sum of the between-model variance in Figure 7 and the within-model variance in Figure 8. Figure 9 shows the monotonic increase of the variance in both value and area along the best branch. Another noticeable observation is that the between-model variance overtakes the within-model variance, which indicates that the uncertainty arising from the uncertain model components is higher than the uncertainty arising from the kriging variance.

Figure 9.

The total model variance for the cross section south of the Denham Springs-Scotlandville fault for the best branch: (a) Z3D1L model, (b) Z3D1 model, (c) Z3 model, and (d) Hierarch model.

[70] The evaluation of model propositions and the prioritization of the uncertain model components discussed above are features of the hierarchical BMA that are not possible to obtain through collection BMA. However, the estimation (Figure 6d) and total model variance (Figure 9d) from the hierarch BMA model are identical to those from collection BMA.

5. Conclusions

[71] The hierarchical Bayesian model averaging (HBMA) method provides a framework for incorporating competing knowledge about the model data, structure, and parameters to advance our understanding of model prediction and uncertainty. Since uncertainty arises because models are not perfect simulators of reality, it is common to consider multiple models. Similar to collection BMA, the hierarchical BMA utilizes multiple base models for model prediction under a Bayesian statistical framework such that model importance is based on the evidence of the data. Unlike collection BMA, which results in a single BMA model, the hierarchical BMA develops a hierarchy of BMA models through systematic dissection of the uncertain model components. The hierarchy of BMA models allows for uncertainty segregation, comparative evaluation of competing model propositions, and prioritization of uncertain model components.

[72] The HBMA supports the rejection of a single representation of the system in favor of many system representations. The HBMA method illustrates the fact that model uncertainty is likely to be underestimated if only the best model is used. The HBMA method explains this observation by distinguishing the within-model uncertainty and the between-model uncertainty for each uncertain model component. Analyzing the uncertainty propagation across the uncertain model components in the BMA tree shows that the within-model uncertainty can increase or decrease depending on the posterior model probabilities. However, the contribution of the between-model uncertainty to the total model variance is cumulative. Therefore, adding more sources of uncertainty, which increases the number of uncertain model components and/or the number of corresponding competing propositions, increases the total uncertainty through the between-model variance. The between-model variance represents an important origin of uncertainty that cannot be discarded.

[73] The advantages of using the hierarchical BMA over collection BMA have been illustrated in the hydrostratigraphic modeling of the complex Baton Rouge aquifer-fault system in Louisiana. Comparative evaluation of the competing propositions for each uncertain model component is accomplished through the posterior model probabilities and the conditional posterior model probabilities in the BMA tree. The conditional posterior model probabilities of the BMA tree suggest that explicit representation of the Denham Springs-Scotlandville fault is favorable, that the hydrofacies interpretation of calibration data set I is considerably better, that local geological stationarity due to the presence of the faults is favorable, and that the exponential variogram model is preferable. The prioritization of different sources of uncertainty can be carried out through the between-model uncertainty. For this case study, uncertainty arising from the different conceptualizations of the Denham Springs-Scotlandville fault is most prominent, followed by uncertainty arising from the variogram models and the stationarity assumptions. Uncertainty arising from the different calibration data sets appears insignificant due to the small between-model variance. The HBMA model is an epistemic model that relies heavily on data evidence and knowledge advancement. Thus, the present understanding of the Baton Rouge aquifer-fault system is subject to revision should new data, expert knowledge, model propositions, sources of uncertainty, calibration parameters, or statistical inference methods become available.

[74] A key feature of the hierarchical BMA is knowledge update. One means of knowledge update is to drop a level of uncertainty after gathering sufficient evidence from the posterior model probabilities, the model solutions, and expert knowledge that one model proposition is more robust than the other propositions at the same level. A second means of knowledge update is to collect data for further model evaluation until inappropriate propositions show insignificant posterior model probabilities. These are potential applications of the hierarchical BMA.

Acknowledgments

[75] This study was supported in part by the National Science Foundation under grant 1045064 and grant/cooperative agreement G10AP00136 from the United States Geological Survey. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the USGS. The authors acknowledge two anonymous reviewers for providing constructive comments which improved the manuscript.
