A Bayesian mixture-modeling approach for flow-conditioned multiple-point statistical facies simulation from uncertain training images


Corresponding author: M. Khodabakhshi, Petroleum Engineering Department, Texas A&M University, College Station, TX 77843-3116, USA. (m.khodabakhshi@pe.tamu.edu)


[1] Multiple-point statistics (MPS) provides a systematic approach for pattern-based simulation of complex discrete geologic objects from a conceptual training image (TI) as prior model. The TI contains the general shape, geometry, and connectivity structures of complex patterns and encodes the related higher-order spatial statistics of the expected features. Conditioning MPS simulated facies on flow data poses a challenging nonlinear inverse problem for estimating discrete parameter fields. Additionally, the pattern-imitating nature of MPS simulation implies that the simulated facies inherit the spatial structure of the features in the TI. Since TIs are constructed from uncertain geologic information and imperfect assumptions, the resulting simulated facies may fail to predict the correct flow and transport behavior in the subsurface environment. It is, therefore, prudent to account for the full range of structural variability in describing the geologic facies distribution by considering multiple TIs. Here, we present a Bayesian mixture model for adaptive and efficient sampling of conditional facies from multiple uncertain TIs. We partition the posterior distribution of facies into individual conditional densities of the TIs and estimate the corresponding mixture weights from the likelihood function for each TI. To implement the conditional sampling, we apply a recently developed ensemble Kalman filter (EnKF)-based probability conditioning method, whereby EnKF is used to invert the flow data and obtain a facies probability map (soft data) to guide conditional facies simulation from each TI. We demonstrate the suitability of the proposed Bayesian mixture-modeling approach using several numerical experiments in fluvial formations with uncertain orientation and structural connectivity.

1. Introduction

[2] Subsurface systems pose some of the most challenging characterization and modeling problems in science with significant environmental, public health, and energy security implications. The main difficulties in understanding and modeling subsurface phenomena are related to the inaccessibility and heterogeneity of geologic formations, together with the complex interactions between fluids and rocks over a wide range of temporal and spatial scales. Consequently, significant uncertainty is introduced into predictions of the related flow and transport processes, thereby complicating the development of subsurface hydrological, energy, mineral, and environmental resources.

[3] Parallel to advances in numerical forward modeling [Peaceman and Rachford, 1955; Aziz and Settari, 1979], significant progress has been made in inverse modeling to integrate diverse and disparate data sets into numerical models of complex groundwater and hydrocarbon reservoirs [e.g., Hill and Tiedeman, 2007; Oliver et al., 2008]. A particularly important aspect of inverse modeling is the quantification of uncertainty that results mainly from data scarcity and the lack of an adequate understanding and modeling of the involved physical processes and subsurface heterogeneity [e.g., Moore and Doherty, 2005; Hendricks Franssen et al., 2009; Blazkova and Beven, 2009; Gotzinger and Bardossy, 2008; Solomatine and Shrestha, 2009; Thyer et al., 2009; Tonkin and Doherty, 2009; Zhang et al., 2008].

[4] The dynamic response of an aquifer to forced disturbances contains valuable information pertaining to both local and global trends in aquifer hydraulic properties and their connectivity. Constraining subsurface flow models to dynamic response measurements of head, concentration, or flow rates is more involved than integration of static data. This complexity is primarily attributed to the nonlinearity and computational complexity in mapping input model parameters (model space) onto dynamic aquifer response (data space), together with the integral (spatially averaged) and sparse nature of the available data. Over the last several decades, various deterministic and probabilistic inversion techniques have been developed and applied to solve subsurface flow-model calibration problems [e.g., Sun, 1994; de Marsily et al., 1999; Carrera et al., 2005; Yeh et al., 2007; Hill and Tiedeman, 2007; Oliver et al., 2008].

[5] Deterministic inverse methods seek a single “best” solution by minimizing a suitable cost function that penalizes the discrepancies between predicted and observed dynamic and static data, as well as departure from direct and/or indirect prior information about the solution. Inference of heterogeneous hydraulic rock properties such as spatial distribution of permeability from flow measurements typically leads to ill-posed nonlinear inverse problems that have multiple solutions and provide different flow and transport predictions [Yeh, 1986; Carrera and Neumann, 1986a-1986c; Carrera, 1987; McLaughlin and Townley, 1996; de Marsily et al., 1999; Carrera et al., 2005; Hill and Tiedeman, 2007; Oliver et al., 2008]. Probabilistic methods, on the other hand, address the issue of nonuniqueness and uncertainty quantification by characterizing the solution of an inverse problem in terms of probability distributions. The Bayesian inversion theory provides an elegant framework for combining prior model parameter distributions with observed model responses [Tarantola, 2004]. A practical approach to apply the Bayesian inversion to large-scale nonlinear inverse problems is Monte Carlo approximation of the posterior distribution using a finite number of samples. This approach has become particularly popular primarily due to availability of powerful and inexpensive computational resources, development of relatively simple ensemble model calibration techniques, and suitability for systematic uncertainty quantification and risk assessment analysis [Sahuquillo et al., 1992; LaVenue et al., 1995; RamaRao et al., 1995; Gomez-Hernandez et al., 1997; Sambridge and Mosegaard, 2002; Lorentzen et al., 2003; Nævdal et al., 2005; Chen and Zhang, 2006; Wen and Chen, 2006; Nowak, 2009].

[6] Many of the existing inversion techniques are suitable for calibrating flow models with spatially distributed parameters that are amenable to second-order statistical characterization. Although conventional second-order (two-point) geostatistics is widely applied to represent the variability in spatial distribution of hydraulic properties in groundwater models, the connectivity structures in some geologic formations such as meandering fluvial channels are far too complex to model using second-order descriptions. The popularity of variogram-based modeling techniques is rooted more in their mathematical simplicity, computational efficiency, and ease of implementation than in their geological interpretation and realism. However, many complex geologic structures such as those containing discrete geologic objects with sharp discontinuities across facies boundaries cannot be described with two-point statistical techniques [e.g., Gomez-Hernandez and Wen, 1998; Deutsch and Journel, 1998; Carle et al., 1998; Western et al., 2001; Zinn and Harvey, 2003; de Marsily et al., 2005]. Of particular importance in subsurface flow and transport are the extreme phenomena that induce preferential flow paths (e.g., channels and fractures) or flow barriers (e.g., thin shale layers) that can dominate the behavior of local and global flow regimes. These complex extreme features do not lend themselves to conventional second-order geostatistical modeling descriptions. In addition, stochastic processes with distinctly different higher-order statistics can sometimes be indistinguishable when only their second-order characterization is considered, indicating the importance of higher-order statistics in describing geologic formations with more complex spatial connectivity [Strebelle, 2002; Caers et al., 2002].

[7] Two common approaches for generating multiple realizations of geologic facies that honor a prior statistical representation and various types of measured and interpreted data are pixel-based approaches such as sequential indicator simulation [Journel, 1983; Isaaks, 1990; Srivastava, 1992; Goovaerts, 1997; Chiles and Delfiner, 1999] and object-based (Boolean) methods, e.g., marked point process, that are better able to describe the continuity in geobodies with well-defined shapes [Haldorsen and Lake, 1984; Stoyan et al., 1987; Deutsch and Wang, 1996; Holden et al., 1998]. Object-based methods, however, lack the flexibility of grid-based simulation techniques, rendering the data integration aspect particularly cumbersome.

[8] Multiple-point statistics (MPS) [Guardiano and Srivastava, 1993; Strebelle, 2002; Caers and Zhang, 2004] presents a grid-based pattern-imitating simulation method to model complex geological connectivity structures that are not amenable to variogram-based modeling techniques. Instead of using merely point-to-point statistical correlations, MPS accounts for the higher-order statistics captured by multiple-point patterns in a prior training image (TI). Because of its grid-based implementation, conditioning MPS realizations to facies measurements at well locations and soft (e.g., 3-D seismic) data is not difficult [Strebelle, 2002; Journel, 2002; Remy et al., 2009]. However, calibrating the output of MPS simulation against dynamic flow data remains an important research area.

[9] The nonlinear and indirect relation between hydraulic properties and dynamic flow data presents the main difficulty in constraining MPS simulation results to reproduce flow measurements. In recent years, several authors have proposed alternative approaches to address the problem of conditioning non-multi-Gaussian fields to flow data [Sarma et al., 2008; Jafarpour and McLaughlin, 2008, 2009a; Capilla and Llopis-Albert, 2009; Sun et al., 2009; Alcolea and Renard, 2010; Zhou et al., 2011; Mohammad-Khaninezhad et al., 2012a]. Sarma et al. [2008] apply a nonlinear parameterization to the permeability field via kernel principal component analysis to preserve the higher-order statistics of the prior model during calibration. Jafarpour and McLaughlin [2008, 2009a] applied discrete cosine parameterization with ensemble Kalman filter (EnKF) to improve facies continuity and reduce dimensionality. Capilla and Llopis-Albert [2009] present a gradual deformation-based inverse method for conditioning transmissivity fields to various static and dynamic data types. Sun et al. [2009] use an EnKF with grid-based localization and a Gaussian mixture-model clustering to update multimodal parameter distributions from dynamic data. They consider block updating and dimension reduction to reduce the computational costs of their proposed schemes and report improved performance over the regular EnKF implementation. Alcolea and Renard [2010] use a blocking moving window algorithm for conditioning MPS simulations to hydrogeological data such as connectivity constraints and heads. Zhou et al. [2011] report EnKF performance improvement by applying a normal-score transform to the original state vector to ensure univariate Gaussianity prior to update and to preserve the univariate (non-Gaussian) prior statistics after the update (via a back transformation). Mohammad-Khaninezhad et al. [2012b] apply the sparse K-SVD dictionary for reconstruction of geologic models from dynamic flow data and show that the method is robust against prior uncertainty and is able to preserve the geologic continuity in the prior model.

[10] An alternative approach is to incorporate the nonlinear dynamic flow data into the MPS simulation algorithm to generate conditional facies realizations. The probability perturbation method of Caers and Hoffman [2006] uses a parameterization of the simulation probabilities to condition the MPS facies realizations on flow data. This approach can converge slowly because it lacks a direct feedback mechanism to adapt and improve the predictive performance of subsequent facies realizations. Mariethoz et al. [2010] present an iterative spatial resampling as a general transition kernel to preserve the prior spatial model during conditional simulation. In a recent paper [Jafarpour and Khodabakhshi, 2011], we introduced a probability conditioning method (PCM) that is used to condition facies simulation from a given TI to nonlinear dynamic flow measurements. We showed that although EnKF updates do not preserve the categorical (discrete) nature of MPS facies realizations, they may be applied to infer probabilistic information about facies distribution in space (i.e., a facies probability map) from flow data. We then used the obtained facies probability maps to guide pattern-based MPS facies simulation from a known TI and draw conditional facies samples that reproduce the observed dynamic measurements.

[11] A standing challenge in MPS-based model calibration, however, is the uncertainty in the prior TI. This issue becomes particularly important considering the strict pattern-imitating nature of MPS simulation that restricts the spatial variability of the resulting facies to the structural connectivity and encoded patterns in the given TI. Specifically, realizations of facies maps from TIs with different structural connectivities can exhibit distinctly different flow and transport predictions, which can be detrimental for development planning.

[12] The main objective of this paper is to develop an adaptive sampling strategy when multiple TIs are used to acknowledge the uncertainty in the geologic continuity model. A key question to address is how to identify and sample from relevant TIs in a list of candidate prior TIs. We introduce a Bayesian mixture-modeling algorithm for generating conditional facies realizations from multiple uncertain TIs. Data scarcity and low resolution, together with errors in geologic modeling and imperfect assumptions, can leave significant uncertainty in the interpretation of the existing patterns in a prior TI model. Figure 1 shows a satellite view of a section of the Mississippi River near Baton Rouge. The river structure, orientation, and thickness vary in different regions, implying that the distribution of naturally occurring features, such as fluvial systems, can be too complex to represent with a single stationary TI. As depicted in Figure 1, the consistent TI for the fluvial system inside the left box is different from that on the right, even though the two sections of the river are close to each other. Underground fluvial or turbidite systems portray a similar complexity. One approach to deal with the uncertainty in describing the geologic continuity in a TI is to consider several TIs that capture the full range of geologic variability for a given formation. These TIs could be obtained based on different plausible geological scenarios, for example, from independent interpretations by different geologists or by stochastic treatment of parameters in a geologic modeling study that is used to identify possible connectivity patterns in the formation.

Figure 1.

Sections of the Mississippi River as an example of naturally occurring fluvial systems with varying channel structure and orientation. Two sections of the river with (left) strong meandering features and (right) straight channels are highlighted. The corresponding conceptual models (TIs) are also shown next to each selected area.

[13] We combine a Bayesian mixture model with the PCM for adaptive conditional sampling from multiple TIs. This is accomplished by initially generating unconditional facies realizations from multiple TIs using an initially equal weight for each. The TI weights are then updated based on their predictive performance (likelihood function). For conditional sampling, we first convert the dynamic flow data into a facies probability map using the PCM presented by Jafarpour and Khodabakhshi [2011]. We then incorporate the generated probability map as input into MPS simulation to draw new conditional facies samples from each TI according to the weights assigned to each TI based on their likelihood to match the observed data. This leads to an adaptive facies sampling technique where fewer (more) realizations are generated from the TIs with inconsistent (consistent) geologic continuity.

2. Review of Previous Results on PCM

[14] The proposed algorithm consists of two main steps. First, probabilistic information about the facies distribution in the aquifer, i.e., a facies probability map, is inferred from flow data using a dynamic data integration algorithm (EnKF in this case). In the second step, the probability map estimated in the first step is used to generate conditional facies samples from multiple prior TIs as uncertain representations of the form of geologic continuity in the aquifer/reservoir formation. This enables us to formulate an adaptive sampling approach for consistent simulation of facies realizations from multiple TIs. The developments in this paper are based on our previous work on the PCM [Jafarpour and Khodabakhshi, 2011] and a recent conference presentation [Khodabakhshi and Jafarpour, 2011] in which we discussed a variant of PCM for flow-model calibration under TI uncertainty. Before presenting the new Bayesian mixture-modeling approach, we briefly review our previous work.

2.1. Estimation Approach

[15] We briefly describe the EnKF that is used to invert the flow data into a facies probability map.

2.1.1. EnKF

[16] The ensemble Kalman filter [Evensen, 1994] is a Monte Carlo implementation of the standard Kalman filter [Kalman, 1960] for application to nonlinear dynamical state space models. The standard Kalman filter is a recursive state estimation algorithm for linear dynamical systems where an up-to-second-order characterization of the state and measurement distributions is desired. In the case of jointly Gaussian forecast (prior) state and measurement distributions, the Kalman filter provides the first two statistical moments that completely characterize the Gaussian posterior distribution. Similar to the Kalman filter (KF), EnKF consists of a sequence of forecast and update steps that can be compactly written as follows:

display math(1)
display math(2)
display math(3)

where i = 1, ..., N indexes the N ensemble realizations; x_{t|t-1} and x_{t|t} are the predicted and updated states, whereas d_t and y_{t|t-1} are the observed and predicted measurements, respectively, all at time t. In the above equations, as in KF, additive model error w and measurement noise v have been assumed. The functions inline image and inline image represent the nonlinear state transition function and the measurement operator, respectively; α_{t-1} and u_{t-1} denote model input parameters and control forcings, respectively. The notations inline image and inline image represent the sample cross-covariance matrix between the states and the measurements and the sample measurement covariance matrix, respectively. The analysis form in equation (3) updates individual members of the ensemble. A similar equation can be used to update the ensemble mean of the states as follows:

display math(4)

where the overbar represents the expectation over the ensemble. Additional information about the properties of the EnKF forecast and update equations can be found in Evensen [2006].
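The update step described above can be illustrated with a minimal NumPy sketch. It assumes the common perturbed-observation variant of the EnKF with an explicit diagonal measurement-noise covariance; the function name `enkf_update` and the argument `obs_err_std` are illustrative choices and are not taken from the paper, whose exact form of equations (1)-(4) may differ.

```python
import numpy as np

def enkf_update(X_f, Y_f, d, obs_err_std, rng):
    """One EnKF analysis step (perturbed-observation variant, an assumption).

    X_f         : (n_state, N) forecast ensemble, one column per realization
    Y_f         : (n_obs, N) predicted measurements for each realization
    d           : (n_obs,) observed measurements
    obs_err_std : (n_obs,) measurement-noise standard deviations (assumed known)
    rng         : numpy.random.Generator used to perturb the observations
    """
    d = np.asarray(d, dtype=float)
    obs_err_std = np.asarray(obs_err_std, dtype=float)
    N = X_f.shape[1]

    # Ensemble anomalies (deviations from the ensemble mean)
    X_anom = X_f - X_f.mean(axis=1, keepdims=True)
    Y_anom = Y_f - Y_f.mean(axis=1, keepdims=True)

    # Sample state-measurement cross-covariance and measurement covariance
    C_xy = X_anom @ Y_anom.T / (N - 1)
    C_yy = Y_anom @ Y_anom.T / (N - 1)
    R = np.diag(obs_err_std ** 2)

    # Each member is updated against its own noisy copy of the observations
    D = d[:, None] + obs_err_std[:, None] * rng.standard_normal((d.size, N))

    # Gain K = C_xy (C_yy + R)^{-1}, computed with a linear solve
    K = np.linalg.solve(C_yy + R, C_xy.T).T
    return X_f + K @ (D - Y_f)
```

Averaging the columns of the returned ensemble gives the updated ensemble mean, which is the quantity the PCM later converts into a facies probability map.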

[17] The EnKF has been evaluated favorably for several reasons, including its simple implementation, uncertainty quantification mechanism with relatively modest computational demand, and effectiveness for moderately nonlinear systems that are characterized primarily with second-order statistics. However, the small ensemble sizes used in practical applications of EnKF to large-scale realistic systems can result in sampling errors that create several issues, including spurious (nonphysical) correlations, filter divergence, and filter inbreeding that can lead to ensemble collapse. Practical remedies such as localization and covariance inflation have been developed to address some of these issues. In addition, the EnKF update is not constrained to preserve the balance equations used in modeling physical systems. Ehrendorfer [2007] recently reviewed some of the main difficulties in practical application of the EnKF.

[18] Despite its theoretical and practical limitations, EnKF is established as a promising nonlinear data integration method for large-scale systems in a number of disciplines, including subsurface model calibration [e.g., Aanonsen et al., 2009; Lorentzen et al., 2001, 2003; Nævdal et al., 2005; Gu and Oliver, 2005; Wen and Chen, 2006; Chen and Zhang, 2006; Skjervheim et al., 2007; Jafarpour and McLaughlin, 2009b; Jafarpour and Tarrahi, 2011].

2.1.2. EnKF Implementations for Probability Conditioning

[19] Application of EnKF for parameter estimation, which is the problem of interest in groundwater and reservoir model calibration, can be carried out by treating model parameters as time-invariant augmented states. Since EnKF is a state estimation method for continuous variables, it is not suited for discrete facies estimation. In Jafarpour and McLaughlin [2009a, 2009b] and Jafarpour and Khodabakhshi [2011], we observed that although the EnKF update violates the discrete nature of facies, it can provide information about the distribution of the facies in the field. In Jafarpour and Khodabakhshi [2011], we used this observation to probabilistically interpret the estimated permeability maps for discrete channel systems to guide MPS facies simulation using the PCM. We applied EnKF to update the mean of permeability distributions and used it to construct a facies probability map. The conditional realizations generated after incorporating the updated probability map into MPS simulation could identify the main trends in facies distribution, which resulted in more accurate reproduction of the dynamic flow data. Here, we extend the PCM formulation using a Bayesian mixture-modeling approach to adaptively simulate facies realizations from multiple TIs.

2.2. Facies Simulation With Multiple-Point Geostatistics

[20] For MPS facies simulations in this paper, we use the snesim algorithm [Strebelle, 2002]. As a grid-based simulation method, snesim has the flexibility to condition facies simulations on hard data. This is conveniently performed by assigning the hard data to their corresponding grid blocks in the simulation grid and excluding those grids from the random path of the simulation. In this paper, we are interested in conditioning snesim simulation on soft data (a facies probability map). Journel [2002] presents a simple framework for integration of multiple probabilistic sources of information based on the permanence of updating ratios approximation.
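The hard-data bookkeeping described above (assign known facies to their grid blocks, then leave those blocks out of the random simulation path) can be sketched as follows. This is only an illustration of the idea and is not the snesim or SGeMS implementation; the function name `init_grid_and_path`, the use of -1 as an "unsimulated" marker, and the dictionary format for hard data are all assumptions.

```python
import numpy as np

def init_grid_and_path(shape, hard_data, rng):
    """Place hard data on the grid and build a random path over the other cells.

    shape     : (n_rows, n_cols) of the simulation grid
    hard_data : dict mapping (row, col) -> facies code (hypothetical format)
    rng       : numpy.random.Generator
    """
    grid = np.full(shape, -1, dtype=int)        # -1 marks cells still to simulate
    for (i, j), facies in hard_data.items():
        grid[i, j] = facies                     # hard data are frozen in place

    free_cells = np.flatnonzero(grid.ravel() < 0)
    path = rng.permutation(free_cells)          # random visiting order (flat indices)
    return grid, path
```

A sequential simulator would then visit the flat indices in `path` one by one, so the hard-data cells are never resimulated.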

2.2.1. Soft Data Conditioning

[21] The permanence of updating ratios in Journel [2002] is used for combining different sources of information by assuming that the relative contribution of any single data event is independent of any combination of other data events. Let A denote the occurrence of a facies type at a sample location, with prior probability P(A), and let P(A|B) and P(A|C) denote the conditional probabilities of A given the TI (data type B) and the soft data (data type C), respectively. The permanence of updating ratios assumption then leads to the following equation for estimating the posterior distribution P(A|B,C) [Journel, 2002]:

display math(5)

where the exponent τ(B,C) determines the importance of data types B and C. For values of τ greater than one, data type C assumes more significance than data type B. In this paper, data types B and C represent the TI and (indirectly) the flow measurements. In the absence of any additional information about the quality of the soft data and TI, an equal weight for the two sources of information is achieved by setting τ = 1.
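As a worked illustration, the sketch below combines the two conditional probabilities using the commonly stated single-exponent form of Journel's permanence-of-ratios (tau) model, in which each probability is first converted to a "distance" ratio (1 - P)/P. The function name `tau_model`, the clipping constant `eps`, and the assumption that this form matches equation (5) term by term are illustrative.

```python
import numpy as np

def tau_model(p_a, p_a_given_b, p_a_given_c, tau=1.0, eps=1e-6):
    """Combine P(A|B) and P(A|C) into P(A|B,C) with the tau model.

    p_a         : prior (marginal) probability of event A
    p_a_given_b : probability of A given the TI (data type B)
    p_a_given_c : probability of A given the soft data (data type C)
    tau         : exponent weighting data type C relative to data type B
    eps         : clipping constant to avoid division by zero (assumption)
    """
    p_a = np.clip(p_a, eps, 1.0 - eps)
    p_b = np.clip(p_a_given_b, eps, 1.0 - eps)
    p_c = np.clip(p_a_given_c, eps, 1.0 - eps)

    a = (1.0 - p_a) / p_a      # prior distance to A
    b = (1.0 - p_b) / p_b      # distance after seeing B
    c = (1.0 - p_c) / p_c      # distance after seeing C

    x = b * (c / a) ** tau     # permanence of ratios: x / b = (c / a)^tau
    return 1.0 / (1.0 + x)
```

With τ = 1 and soft data that carry no information (P(A|C) = P(A)), the function returns P(A|B) unchanged; increasing τ above one pulls the result further toward what the soft data suggest.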

3. Mixture Models for Adaptive Sampling From Multiple TIs

[22] One way to apply the EnKF-based PCM to conditional facies simulation under several TIs is to randomly draw an ensemble of facies realizations from different TIs to account for the full range of variability in geologic continuity. Jafarpour and McLaughlin [2009b] applied the regular EnKF with an ensemble derived from fluvial channel TIs with different channel widths and showed that an ensemble composed of a mixture of samples from different TIs may still be able to retrieve the overall continuity structure from flow data. A similar experiment can be set up using the PCM approach. Appendix A shows the effect of using an incorrect prior TI on the performance of regular EnKF and the PCM implementation. In Appendix B, it is shown that uniform sampling from multiple TIs with different structural connectivity does not provide an effective solution. As an effective alternative method, we introduce a Bayesian mixture-model formulation for adaptive sampling from multiple prior TIs.

3.1. Bayesian Mixture Model

[23] We now present a Bayesian mixture-model formulation for estimating TI importance weights, which will subsequently be used for adaptive sampling from the TIs. To this end, we consider the mixture weights inline image such that inline image and inline image. To compute the desired posterior density inline image, we must calculate the individual posterior densities inline image for each TI Tj and the corresponding mixture weights inline image. Assuming that the facies model m is a mixture model of J different density functions (TIs here), we write the posterior density inline image as

display math(6)

In our ensemble formulation, the mixture weights determine the sampling weights for each TI and refer to the relative number of realizations taken from each TI. The individual posterior densities for a given TI can be expressed as

display math(7)

where the equality inline image is used. The first and second terms of the numerator in equation (7) can be computed from the observation model and the posterior at the previous time step, respectively. The denominator in equation (7) is a normalization constant and is not trivial to calculate. In our ensemble framework, we are interested in drawing samples from the above posterior density function for each given Tj according to the mixture weights. The mixture weights inline image for each TI can be computed by invoking the Bayes rule as follows:

display math(8)

where inline image is independent of Tj and can be treated as constant, and inline image is the likelihood of observing the data under TI Tj, which is hard to compute unless simplifying approximations are applied. For example, under Gaussian posterior approximation, inline image may be computed in closed form using equation (7). Since the posterior is not Gaussian in our examples, we follow an alternative approximation approach using Monte Carlo simulation.
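One way to write the relations described in this paragraph is sketched below. The notation is assumed and may differ from the original symbols: m is the facies model, d_{1:t} the data up to time t, T_j the j-th TI, and w_j^t the mixture weight of T_j at time t.

```latex
\begin{align*}
p(m \mid d_{1:t}) &= \sum_{j=1}^{J} w_j^{t}\, p(m \mid d_{1:t}, T_j),
  \qquad w_j^{t} \ge 0, \quad \sum_{j=1}^{J} w_j^{t} = 1, \\[4pt]
p(m \mid d_{1:t}, T_j) &=
  \frac{p(d_t \mid m, T_j)\; p(m \mid d_{1:t-1}, T_j)}
       {p(d_t \mid d_{1:t-1}, T_j)}, \\[4pt]
w_j^{t} = p(T_j \mid d_{1:t}) &=
  \frac{p(d_{1:t} \mid T_j)\; p(T_j)}{p(d_{1:t})}
  \;\propto\; p(d_{1:t} \mid T_j)\; p(T_j).
\end{align*}
```

The second line relies on the conditional independence p(d_t | m, d_{1:t-1}, T_j) = p(d_t | m, T_j): once the facies model is given, earlier data add nothing to the prediction of the current measurement. In the third line, p(d_{1:t}) does not depend on T_j, so the weights only need to be normalized to sum to one.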

[24] For sequential conditioning in time, we can compute the likelihood at each time step using the sequential form

display math(9)

which after substituting in equation (8) leads to

display math(10)

To compute the mixture weights, we must obtain the individual likelihood functions in equation (10). These likelihood densities can be calculated through integration over m, i.e.,

display math(11)

For most realistic problems, finding the exact solution of the integration in equation (11) is not feasible. However, Monte Carlo approximations can be used to find samples from inline image. This is the approach we follow in this paper. The steps involved in sampling from the TIs using our PCM approach are outlined below:

3.2. Initialization

[25] Initialize the algorithm by assigning an equal weight inline image to each component of the mixture model, i.e., each TI (the initial weights can differ if existing knowledge suggests so).

For time steps t = 1: Nt

 (A) Prediction step

 (1) Draw Nens realizations from the TIs based on the current TI weights (initially 1/J) and probability map (initially homogeneous) and assign permeability values to each facies type.

 (2) Solve multiphase flow equations for each realization of the permeability ensemble from the initial time step to predict the observed measurements at current time.

 (B) Update mixture weights and probability map

 (3) Compute the sample weights for each realization of permeability according to the following proportionality:

display math(12)

 These weights will be used for adaptive sampling from the TIs for the next time step.

 (4) Using the EnKF analysis equation, update the mean of permeability realizations by integrating the observations at the current time step.

 (5) Use the mean permeability map in Step (4) to construct an updated probability map (following the procedure shown in Figure 2).

Figure 2.

Schematic of steps involved in the proposed workflow for mixture-model formulation of conditional sampling from uncertain TIs.
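The weight update in Step (3) can be sketched with a simple Monte Carlo approximation: the likelihood of each TI is estimated as the average data likelihood of the realizations drawn from it and then multiplied by the previous TI weight. The sketch assumes independent Gaussian measurement errors with known standard deviations; the function name `update_ti_weights` and its argument layout are illustrative and are not taken from the paper, and the exact form of equation (12) may differ.

```python
import numpy as np

def update_ti_weights(prev_weights, ti_index, Y_pred, d_obs, obs_err_std):
    """Monte Carlo update of TI mixture weights (Gaussian errors assumed).

    prev_weights : (J,) TI weights from the previous assimilation step
    ti_index     : (N,) integer label (0..J-1) of the TI behind each realization
    Y_pred       : (N, n_obs) predicted measurements for each realization
    d_obs        : (n_obs,) observed measurements at the current step
    obs_err_std  : (n_obs,) measurement-noise standard deviations
    """
    prev_weights = np.asarray(prev_weights, dtype=float)
    ti_index = np.asarray(ti_index)

    # Gaussian log-likelihood of each realization, up to a common constant
    resid = (np.asarray(Y_pred) - np.asarray(d_obs)) / np.asarray(obs_err_std)
    log_lik = -0.5 * np.sum(resid ** 2, axis=1)
    lik = np.exp(log_lik - log_lik.max())   # shift avoids numerical underflow

    new_weights = np.zeros_like(prev_weights)
    for j in range(prev_weights.size):
        members = ti_index == j
        if members.any():
            # Sample average approximates the likelihood of the data under TI j
            new_weights[j] = prev_weights[j] * lik[members].mean()

    total = new_weights.sum()
    if total == 0.0:                         # degenerate case: keep old weights
        return prev_weights.copy()
    return new_weights / total
```

One consequence of this simple scheme is that a TI with no realizations in the current ensemble receives zero weight and cannot recover later; keeping a small minimum number of samples per TI is one possible safeguard.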

[26] Since the EnKF update equation requires parameter covariance and cross-covariance between parameters and predicted responses, the log-permeability ensemble is used to derive these statistics. The updated log-permeability ensemble mean is then used to obtain probabilistic information about spatial facies distribution. To do this, we consider the probability of each facies at a grid block to be linearly related to the difference between the updated permeability and the value of permeability assigned to each facies. Hence, we only use the update equation to condition the permeability mean on flow data and use the updated permeability mean to infer a probabilistic description for the facies distribution (a probability map). The updated probability map is used to select an updated ensemble of facies models to perform the sequence of EnKF forecast and update for the next time step.
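For a two-facies (channel/background) system, the linear mapping from the updated mean log-permeability to a facies probability map can be sketched as below. The specific interpolation between the two facies log-permeabilities, the clipping to [0, 1], and the function name `channel_probability_map` are assumptions made for illustration; the text only states that the probability is linearly related to the difference between the updated value and the value assigned to each facies.

```python
import numpy as np

def channel_probability_map(mean_log_k, log_k_background, log_k_channel):
    """Map updated mean log-permeability to a channel-facies probability.

    mean_log_k       : array of updated ensemble-mean log-permeability values
    log_k_background : log-permeability assigned to the background facies
    log_k_channel    : log-permeability assigned to the channel facies
    """
    mean_log_k = np.asarray(mean_log_k, dtype=float)
    # Linear interpolation: 0 at the background value, 1 at the channel value
    p = (mean_log_k - log_k_background) / (log_k_channel - log_k_background)
    # EnKF updates can overshoot the facies values, so clip to a valid range
    return np.clip(p, 0.0, 1.0)
```

The probability of the background facies is then one minus the returned map, and the map can be passed to the MPS simulation as soft data.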

[27] Note that, following each update step, a new probability map is generated and used to draw an updated ensemble of conditional facies models from the available TIs. For parameter estimation, the states (pressures and saturations) are derived by rerunning the forward simulations from the initial time step using the updated facies models. This eliminates the loss of conservation principles (at the cost of a computational overhead) that is widely known for joint state and parameter estimation with EnKF. It is also worthwhile to note that the permeability updates are expected to be more accurate in the proximity of the wells and somewhat inconclusive away from the wells. The PCM approach is designed to combine the flow data with a prior TI model such that the former is used to deduce the local trends around the observation points (where flow data are more informative) and the latter to describe the facies connectivity away from observation points, where the data tend to be less conclusive [Jafarpour and Khodabakhshi, 2011]. When multiple TIs are used, the simulated facies samples tend to be similar around the well locations and become more dissimilar with increasing distance from the wells.

[28] Note that EnKF is only used to update the facies probability map, whereas the likelihood function is used to update the TI consistency weights. In the next section, we demonstrate the performance of the mixture-model PCM approach under TI uncertainty using two numerical experiments for adaptive conditional sampling.

4. Results and Discussion

[29] In this section, two examples are presented to investigate the uncertainty in structure and direction of channel facies in a fluvial formation. In the first example, the structural connectivity of the TI is uncertain, whereas, in the second example, channel orientation for a given structural connectivity is considered unknown. We use an adaptive sampling strategy to identify the correct structural connectivity in Example 1 and the consistent channel direction in Example 2. The MPS simulations in the following examples are carried out using the snesim implementation in the Stanford Geological Modeling Software [Remy et al., 2009]. The general simulation and data integration parameters used for all the experiments are summarized in Table 1.

Table 1. General Simulation/Assimilation Information
Parameter                          Special Value/Condition
Simulation Parameters
Phases                             Two-phase (oil/water)
Cell dimensions                    10 m × 10 m × 10 m
Rock porosity                      0.20 (constant)
Initial oil saturation             0.90 (uniform)
Initial pressure                   13.8 MPa (uniform)
Injection well constraints         Water flow rate
Production well constraints        Bottom hole pressure
Facies type                        Fluvial formation
Geostatistical simulation          Snesim
Assimilation Information
Observation at injection wells     Bottom hole pressure
Observation at production wells    Oil and water flow rate

4.1. Bayesian Mixture Model for Sampling From Multiple TIs

[30] In this section, we apply the mixture-model PCM algorithm to sample from multiple TIs adaptively. That is, we assume that the number of samples included from each TI is proportional to its likelihood function (i.e., its data prediction performance at previous assimilation steps). We present two different examples in fluvial systems and explore the uncertainty in TI structural connectivity and channel orientation.
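
The proportional allocation described above can be sketched as follows. This is a minimal illustration, assuming that the N ensemble members are split among the TIs with a largest-remainder rounding rule; the rounding rule and the function name allocate_samples are assumptions, not details given in the paper.

```python
import numpy as np

def allocate_samples(weights, n_total):
    """Split n_total realizations among TIs in proportion to their weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize defensively
    ideal = w * n_total                  # fractional sample counts
    counts = np.floor(ideal).astype(int)
    remainder = n_total - counts.sum()   # samples left after rounding down
    # hand the leftover samples to the TIs with the largest fractional parts
    order = np.argsort(-(ideal - counts), kind="stable")
    counts[order[:remainder]] += 1
    return counts
```

With equal weights and N = 300, this gives 100 realizations per TI, which is the uniform sampling case considered in Appendix B.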

4.1.1. Example 1: Uncertainty in Connectivity Structure

[31] In the first example, we use numerical experiments in two-dimensional two-phase (oil-water) systems. The model includes a nine-spot well configuration with one water injection well in the center and eight symmetrically located oil producers on the reservoir boundaries. The initial oil saturation is 0.90 and the initial pressure is 13.8 MPa everywhere in the reservoir. A total of 0.7 pore volume of water is injected during 72 months of simulation. The production wells operate at a constant pressure of 13.5 MPa for wells inside the high-permeability (channel) facies in the reference model and 7 MPa for wells completed in the nonchannel facies. Under these conditions, the injection pressure and the water and oil production rates were measured and used as calibration data. The measurements were obtained every 6 months by running the forward simulation with a specified reference permeability field. The channel and background facies are assigned permeability values of 200 mD and 10 mD, respectively. The TIs for all the experiments in this paper are shown in Figure 3.

Figure 3.

Three TIs with different structural connectivities and three corresponding unconditional MPS realizations from them: (a) the TI with straight channels and (b) corresponding realizations; (c) the TI with intersecting straight channels and (d) corresponding MPS realizations; and (e) the TI with meandering channels and (f) corresponding MPS realizations.

[32] Three TIs (Figures 3a, 3c, and 3e) are considered to reflect the uncertainty in the prior connectivity models. In the proposed mixture-model method, we use an ensemble with N = 300 realizations. This number is selected based on our previous experience with similar problems [Jafarpour and McLaughlin, 2009b; Jafarpour and Khodabakhshi, 2011]. A simple linear mapping is applied to convert the mean log-permeability to a facies probability map (Figure 4). The lower and upper bounds Cmin and Cmax determine the level of confidence that is placed in the flow data (i.e., via the probability map). We chose Cmin = 0.10 and Cmax = 0.90 in the examples that follow.
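
As a rough illustration of this mapping, the sketch below converts a mean log-permeability field to a channel probability map bounded by Cmin and Cmax. It assumes that the linear function interpolates between the log-permeabilities of the background and channel facies (10 mD and 200 mD in Example 1) and is clipped outside that range. The exact functions are those of Figure 4, and the name probability_map is hypothetical.

```python
import numpy as np

def probability_map(mean_logk, logk_background, logk_channel, c_min=0.10, c_max=0.90):
    """Map mean log-permeability to channel probability in [c_min, c_max]."""
    mean_logk = np.asarray(mean_logk, dtype=float)
    # relative position of the mean between the two facies values
    frac = (mean_logk - logk_background) / (logk_channel - logk_background)
    frac = np.clip(frac, 0.0, 1.0)       # assumed saturation outside the facies range
    # bounds keep the map from ever fully overriding the TI patterns
    return c_min + (c_max - c_min) * frac
```

Narrowing the interval between Cmin and Cmax reduces the influence of the flow data on the simulated facies, whereas widening it places more confidence in the probability map.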

Figure 4.

Linear functions used to convert mean log-permeability fields to probability maps. The conversion is based on the distance between the estimated mean and reference permeability values.

[33] The results of estimating the meandering channel facies using the mixture-model PCM approach are shown in Figure 5. Figures 5a and 5b show the reference facies map and the field setup. Figure 5c illustrates the changes in TI consistency weights with each EnKF analysis step. The weight associated with the correct TI (with meandering channels) is shown with a solid blue line. Within about five EnKF updates, the consistent TI is correctly estimated, and almost all facies realizations are taken from the consistent meandering TI. Comparing the results from the mixture-model approach (Figure 5f) with those from the EnKF and PCM updates in Figure A1 reveals that the probability map estimated by the mixture-model approach better represents the meandering feature in the true facies map. The level of variability in the simulated facies ensemble can be adjusted by the bounds specified for the probability map and/or by changing the τ value in equation (5).

Figure 5.

Facies estimation results for mixture-model-based PCM approach in Example 1: (a) true log-permeability field, (b) its corresponding well configuration, and (c) evolution of the TI consistency weights throughout data integration steps. (d)–(f) The probability map (first row), three sample log permeabilities (second to fourth rows), and the mean and variance of ensemble log permeability (fifth and sixth rows, respectively) are shown for the initial time, after 6 months, and 72 months of data assimilation, respectively.

[34] Without adaptive sampling, the standard PCM approach uses a large number of samples from the two inconsistent TIs. As a result, the sample covariance that is used in the EnKF update becomes inaccurate and degrades the quality of the update (Figure A1). However, by updating a weight that represents the TI consistency in reproducing the flow data, the proposed approach can continuously improve the EnKF analysis, since the updated covariance matrix after adaptive conditional resampling from the TIs better represents the spatial connectivity and correlation in the field. It only takes about four updates in this example to identify the correct TI. However, even after identifying the correct TI, some uncertainty, mainly about channel boundaries, still remains in the calibrated ensemble of facies models (see Figure 5, last row). This uncertainty may be attributed in part to the scattered nature of the data and the weak sensitivity of available flow and pressure data to the exact location of the channel boundaries.

[35] The production forecast performance is measured using normalized ensemble-based root-mean-square error (RMSE) and normalized ensemble spread defined as follows:

Equation (13): normalized ensemble-based RMSE
Equation (14): normalized ensemble spread

where inline image is the total number of observation times, nwell is the total number of wells, and N is the ensemble size. The spread and RMSE values become very similar when the estimated forecasts become unbiased (forecast mean approaches the true forecast).
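
One form of these two measures that is consistent with the definitions above can be sketched in Python. The sketch assumes that the RMSE averages the squared, normalized errors of all N members about the observations, that the spread averages the squared, normalized deviations of the members about the ensemble mean, and that a per-well factor (scale) is used for normalization. The normalization, the symbol n_t for the number of observation times, and the function name are assumptions rather than the paper's definitions.

```python
import numpy as np

def ensemble_rmse_and_spread(forecasts, observations, scale):
    """Normalized ensemble RMSE and spread (illustrative form only).

    forecasts    : (N, n_well, n_t) forecasts of one quantity for each member
    observations : (n_well, n_t) observed values
    scale        : (n_well,) assumed normalization factor for each well
    """
    f = np.asarray(forecasts, dtype=float)
    obs = np.asarray(observations, dtype=float)
    s = np.asarray(scale, dtype=float)[None, :, None]
    mean = f.mean(axis=0, keepdims=True)
    # np.mean averages over all N * n_well * n_t entries
    rmse = np.sqrt(np.mean(((f - obs[None]) / s) ** 2))
    spread = np.sqrt(np.mean(((f - mean) / s) ** 2))
    return rmse, spread
```

Under these assumptions the squared RMSE equals the squared spread plus the mean squared normalized bias of the ensemble mean, so the two measures coincide when the forecast mean matches the observations.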

[36] Figure 6 summarizes the results from predicting the reservoir dynamic response. The results are also presented in terms of the spread and RMSE performance measures in Table 2. Figure 6 shows that, before calibration, the flow response of the initial ensemble deviates from the observed measurements, whereas the forecast with the calibrated ensemble better follows the observed trend in the data. Compared with Figures A1a and A1c, the proposed approach does not underestimate the ensemble spread. This is attributed mainly to the resampling step after each data integration step to generate new facies models. It is also important to recognize that, similar to other probabilistic sampling techniques, PCM may generate conditional facies realizations that do not reproduce the observed flow response [Jafarpour and Khodabakhshi, 2011]. From Table 2, it can be confirmed that, after data assimilation, the RMSE of the flow predictions is significantly improved, whereas the spread is only slightly reduced. A comparison between the RMSE and spread after data assimilation shows that the two measures become very similar, implying that the forecast estimates after calibration are less biased. Assigning higher τ values in the MPS simulation or changing the bounds (Cmin and Cmax) in the probability map to increase the similarity between the generated samples are mechanisms that can be used to improve the sampling results and reduce the ensemble spread.

Figure 6.

(top) Initial and (bottom) final production forecasts in Example 1. The first column (left) shows the injection bottom hole pressure, and the second and third columns display field oil and water production rates, respectively. Observations are displayed using red circles, while the forecasts for individual realizations and their mean are shown with thin gray and thick black lines, respectively.

Table 2. RMSE and Spread for Production Forecasts in Example 1
                BHP (Bottom Hole Pressure)    Oil Production Rate    Water Production Rate

4.1.2. Example 2: Uncertainty in Channel Orientation

[37] We applied the proposed method in a separate example to identify channel orientation in a fluvial system. For this example, we use the channel facies structures from the TI shown in Figure 3c but assume that the channel direction is unknown. This was done by using the same TI and specifying a rotation angle for the simulated facies realizations in the snesim algorithm. The channel direction was allowed to vary in the range [−90°, 90°], which covers all possible channel orientations. The reference model and well configuration for this test case are depicted in Figures 7a and 7b, respectively. The channel direction in the correct TI varies between −20° and 20°, based on a histogram obtained from 1000 samples. Based on this directional variability in the TI, we used five specific directions, −72°, −36°, 0°, 36°, and 72°, to cover the entire range of channel directions. With this specification, the problem is reduced to adaptively sampling from the TI using the five rotation angles mentioned above. The total simulation time is 36 months. The simulation and model calibration parameters are kept the same as in the previous example.
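
The choice of the five rotation angles can be reproduced with a short sketch. It assumes that the candidate directions are the centers of five equal bins partitioning [−90°, 90°], so that each candidate covers a 36° interval, comparable to the roughly ±20° directional variability of the TI. The function names are hypothetical.

```python
import numpy as np

def candidate_directions(n_dirs=5, lo=-90.0, hi=90.0):
    """Centers of n_dirs equal-width bins covering [lo, hi] degrees."""
    width = (hi - lo) / n_dirs
    return lo + width * (np.arange(n_dirs) + 0.5)

def nearest_direction(angle_deg, centers):
    """Candidate rotation angle closest to a given channel direction."""
    centers = np.asarray(centers, dtype=float)
    return float(centers[np.argmin(np.abs(centers - angle_deg))])
```

For instance, a channel direction of −6° falls in the bin centered at 0°, so the horizontal candidate is the one expected to gain weight for such a case.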

Figure 7.

Facies estimation results for mixture-model-based PCM approach in Example 2: (a) reference log-permeability field; (b) its corresponding well configuration; and (c) evolution of TI consistency weights throughout data integration steps. (d)–(f) The probability map (first row), histogram of channel directions (second row) with the reference direction indicated with the red line, two sample log permeabilities (third and fourth rows), and the ensemble log-permeability mean and variance (fifth and sixth rows, respectively) are shown for the initial step and after 6 and 12 data assimilation steps, respectively.

[38] The results of applying the mixture-model PCM to this problem are given in Figure 7. Figure 7c shows the evolution of the TI directional consistency weight after each data integration step. The weight for the correct (horizontal) direction consistently increases with each update. The red line in the second row of Figures 7d–7f indicates the main continuity direction for the reference model (−6°). After eight steps of assimilation, only the consistent TI makes a significant contribution. The final results (Figure 7f) show that the mixture-model PCM identifies the correct directionality and provides an estimate for the probability map that is consistent with the features in the reference model. The third and fourth rows in Figures 7d–7f show two sample facies (out of 300) at different time steps. The improvement in the directionality of these facies samples is quite evident. We note that some (about 15%) of the sample facies at the final step do not reproduce the correct structure of the reference model. One explanation for this variability, besides the randomness introduced during snesim sampling, is the small number of observations in space (five locations). Note that, since the samples are conditioned on hard data, the variance at the well locations is zero. The final variance map shows that the main uncertainty is associated with channel edges and with locations far from the observation points.

[39] In the above examples, we assumed that the reference model belonged to one of the prior TIs. Since the TIs had different features, the correct TI was identified as a single consistent TI. In some cases, the reference features in the solution may be adequately captured by more than one of the prior TIs or, in a more pessimistic but possible case, by none of them. In such cases, the TI consistency weights and the probability map may provide useful information for designing a new TI based on preliminary calibration results. When none of the TIs provides an adequate representation of reality, a large bias in the forecast ensemble may hint at a fundamental problem in the data integration process and the prior model used. In a more problematic situation, the ill-posed nature of the problem may result in incorrect geologic models that explain the sparse data reasonably well. In such cases, additional data are required to diagnose possible errors in the solution. Ultimately, the quality of any inversion method is impacted by the validity of the prior model used [Jafarpour and Tarrahi, 2011]. This becomes even more important when the solution is constrained to preserve the higher-order statistics of the prior.

5. Conclusion

[40] We presented a Bayesian mixture model for adaptively conditioning the simulation of geologic facies from multiple TIs to nonlinear pressure and flow measurements. The mixture-model approach uses the PCM data integration algorithm that we recently developed [Jafarpour and Khodabakhshi, 2011] to adaptively draw conditional facies realizations from multiple TIs. The PCM converts dynamic pressure and flow data into a facies probability map, which, in turn, is used to guide facies simulation from the prior TIs. At each sampling stage, the TI weights are estimated from the likelihood function of the individual prior TIs and determine the number of samples drawn from each. After presenting the implementation details of the proposed approach, we examined its performance under several prior assumptions, including uncertainty in formation connectivity and in channel direction (orientation) for a given TI in fluvial systems. When the channel direction was uncertain (Example 2), more update steps were needed for the algorithm to identify the correct directionality. This can be attributed to at least two main differences between these problems. First, in the example with structural uncertainty, the TIs had the same global directionality (horizontal), unlike in the example with unknown channel direction. Hence, the sample covariance calculated in the latter included structural correlations from all directions, resulting in a more pronounced degradation of the covariance. The second major difference between the two problems is related to the location of the channel features in the reference model relative to the well configuration. In Example 2, the channel feature is less directly observable from the data because it is intersected by fewer wells. In all cases, however, the method was eventually able to identify, and accordingly sample from, the correct TI.

[41] Although the pattern-imitating nature of MPS simulation from prior TIs presents an opportunity to model more complex geologic phenomena, it also poses an important risk when the prior TI fails to represent the correct facies connectivity. Since the resulting facies models often dominate fluid displacement behavior, it is important to take the TI uncertainty into consideration in MPS simulation. Conditioning the MPS simulation results on nonlinear dynamic data remains an important topic in the application of this method to modeling complex subsurface systems. The complexity of the conditional simulation problem increases when the uncertainty in the TI model has to be acknowledged and incorporated. When prior knowledge about the structural connectivity is not adequate to overwhelmingly support the use of a single TI, it is prudent to consider a wider range of possible structural connectivity models (TIs) and rely on the dynamic flow data to distinguish between alternative TI candidates and adaptively sample from them based on their predictive performance (likelihood to reproduce the observed data).

Appendix A: Model Calibration Under Incorrect TI

[42] We repeat Example 1 with uncertain structural connectivity and consider an inconsistent TI as the prior model. For data integration, we use two different approaches, standard EnKF and regular PCM. We use the nonmeandering TI shown in Figure 3c as a prior model, which is adapted from Mirowski et al. [2008]. Figure A1 summarizes the estimation results with the standard EnKF and regular PCM.

Figure A1.

Facies estimation results for standard EnKF and PCM facies model calibration in appendix examples: (a) true log-permeability field and (b) well configuration; one final facies sample (first row), the mean of final facies ensemble (second row), and the production forecast with final replicates (third to fifth rows) are shown. The results for EnKF and PCM with incorrect prior model are shown in (c) and (d), respectively; (e) and (f) contain the same results when an equal number of samples are used from each TI in applying standard EnKF and PCM approaches, respectively. In the forecast figures (third to fifth rows), the red circles show the observed response of the reference model; the forecasts with each log-permeability realization are shown with thin gray lines, whereas the mean forecast is displayed with thick black lines.

[43] The final estimation results when the standard EnKF is used, Figure A1c, readily confirm that the EnKF update equation cannot preserve the continuity of facies distribution. Even though the updated samples do not capture the connectivity in the reference facies model, note that the data mismatch does not appear to be as pronounced (see Jafarpour and Khodabakhshi [2011] for a discussion). Figure A1d presents the final estimation results with the PCM method. The first and second rows in Figure A1d show one sample and the ensemble mean of log-permeability distribution. Because in PCM, the samples are updated from the prior TI based on a probability map, the updated replicates preserve the discrete nature of the facies. The ensemble forecasts with the final replicates for this case are depicted in the third to fifth rows of Figure A1d. The forecasts with the PCM approach are inferior to those with the standard EnKF. This is explained by observing that the connectivity structure in the TI cannot be corrected. Therefore, PCM is strongly restricted by the incorrect TI, as reflected by the biased solutions.

Appendix B: Uniform Sampling From Multiple TIs

[44] To highlight the significance of adaptive sampling, we consider three TIs in Example 1 and use the standard EnKF and PCM methods, where an equal number of samples (100) are drawn from each TI. Figure A1e shows the results for the EnKF. The permeability features and the resulting flow responses are better in this case than those in Figure A1c, which is consistent with the observations in Jafarpour and McLaughlin [2009b] that a mixture ensemble containing consistent features has an overall better performance than a biased ensemble. Although the estimated results are not discrete random fields, they contain important information about the location of channel facies, a key property that is exploited in the PCM [Jafarpour and Khodabakhshi, 2011]. The estimation results with the PCM and using the three TIs with uniform sampling are displayed in Figure A1f. Since two thirds of all samples are taken from the incorrect TIs, the final facies ensemble and the resulting forecasts exhibit more variability than in the case where a single TI is assumed (Figure A1d). However, with three TIs containing very different structural connectivity, a uniform sampling strategy is ineffective. Hence, one must resort to an adaptive sampling strategy such as the mixture-model method described in this paper.


[45] The authors acknowledge Crisman Institute for Petroleum Research at Texas A&M University for funding this project.