Geophysical Research Letters

Aquifer structure identification using stochastic inversion



[1] This study presents a stochastic inverse method for aquifer structure identification using sparse geophysical and hydraulic response data. The method is based on updating structure parameters from a transition probability model to iteratively modify the aquifer structure and parameter zonation. The method is extended to the adaptive parameterization of facies hydraulic parameters by including these parameters as optimization variables. The stochastic nature of the statistical structure parameters leads to nonconvex objective functions. A multi-method genetically adaptive evolutionary approach (AMALGAM-SO) was selected to perform the inversion given its search capabilities. Results are obtained as a probabilistic assessment of facies distribution based on indicator cokriging simulation of the optimized structural parameters. The method is illustrated by estimating the structure and facies hydraulic parameters of a synthetic example with a transient hydraulic response.

1. Introduction

[2] Determining field-scale parameters in sufficient detail to capture aquifer heterogeneities is one of the greatest challenges for predicting flow and contaminant transport in large-scale subsurface systems. Uncertainty in the form of conceptual model bias can be introduced in modeling groundwater flow and contaminant transport when aquifer structure is fixed based on sparse or incomplete geophysical data [Chen and Rubin, 2003]. Once the structure has been fixed in this way, focus is placed on model calibration of zoned parameters through parameter estimation. This type of inverse method has been applied in many instances to estimate flow and transport parameters at various spatial scales [e.g., Cooley, 1983; Carrera and Neuman, 1986; Dai and Samper, 2006]. Deterministic aquifer structure may introduce larger bias and uncertainty into a model than an inappropriate choice of facies hydraulic parameters [Ye et al., 2004; Lu and Robinson, 2006].

[3] Given uncertainty in both aquifer structure and hydraulic parameters, Sun [2005] suggested “adaptive parameterization” to couple structure identification and parameter estimation in contaminant transport model calibration. This provides a more complete inversion of the model, allowing the optimization to explore combinations of structure geometry and hydraulic parameters. Previous studies usually assumed that the aquifer parameter zonations are randomly distributed. However, recent geological and geostatistical studies indicate that aquifer facies distributions are spatially correlated [Carle and Fogg, 1997; Ritzi et al., 2004].

[4] We propose a structure identification method that accounts for spatial correlation by means of a stochastic inversion of a transition probability model, in an analytical framework [Dai et al., 2007], describing facies volume proportions, mean lengths, and juxtapositioning. The transition probability model provides a nonparametric, Markov chain approach to indicator geostatistics that is well suited to applications with sparse information [Carle and Fogg, 1997]. The analytical solution allows structure identification to be cast as a conventional inverse modeling problem, using statistical structure parameters (such as facies volume proportions and mean lengths) to iteratively update the transition probability model. The facies proportions and mean lengths define the transition probability matrix. Indicator cokriging simulation produces the aquifer facies distributions, ensuring the statistical properties defined by the transition probability model are maintained. In this way, the aquifer structure is updated in the inversion, while the information provided by the conditional data (the sparse geological and geophysical data used to describe the facies distribution in boreholes) is honored. The optimization of the model inversion is performed using a genetically-adaptive multi-method search algorithm called AMALGAM-SO. This method was chosen as it combines the strengths of several different evolutionary search approaches and has been shown to achieve good efficiency across a range of difficult synthetic benchmark problems [Vrugt et al., 2008]. While other optimization algorithms could potentially be implemented to drive the stochastic inversion, AMALGAM-SO was selected to illustrate the stochastic inversion methodology, without comparing its performance to other optimization algorithms. 
This decision was partly based on the assumption that, although an analytical solution of the transition probability model is utilized here, gradient-based methods would still have difficulty given the stochastic nature of the structural variables, which serve as inputs to the stochastic simulation. The analytical solution of the transition probability model provides the computational efficiency necessary for the large number of model evaluations required by the stochastic inversion. The combination of the analytical solution of the transition probability model and AMALGAM-SO provides a robust, computationally efficient model inversion with the ability to deal with complex fitness response surfaces. While in the past, the stochastic inversion described here was not possible, due to computational and algorithmic limitations, we propose that through the use of modern computers and analytically derived structure parameters [Dai et al., 2007], this type of inversion can be realized.

2. Facies Transition Probability Model

[5] Transition probability models have been used by geologists to describe sediment facies distributions for a few decades [e.g., Agterberg, 1974; Carle and Fogg, 1997]. Recently, Ritzi et al. [2004] and Dai et al. [2005] incorporated the work of Carle and Fogg [1997] to relate the structure of the indicator random variables to proportions, geometry, and pattern of the aquifer facies. Under two assumptions: (1) the cross-transition probabilities depend on facies volumetric proportions only, and (2) the juxtapositional tendencies between categories k and j are assumed symmetric in the direction ϕ, Dai et al. [2007] derived an analytical solution for the transition probability model as

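\[ t_{ki}(h_\phi) = p_i + (\delta_{ki} - p_i)\exp\left(-\frac{h_\phi}{\lambda_I}\right), \quad k, i = 1, \ldots, N \tag{1} \]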

where $t_{ki}(h_\phi)$ is the probability of transitioning from facies k to facies i over a lag distance h in direction ϕ, $p_i$ is the proportion of facies i, $\delta_{ki}$ is the Kronecker delta, $\lambda_I$ is the indicator correlation length, and N is the number of facies. By taking the partial derivative of the auto-transition probability (equation (1) with k = i) with respect to h, the mean length ($\bar{L}_{k,\phi}$) can be related to $\lambda_I$ as [Dai et al., 2007]

\[ \bar{L}_{k,\phi} = \frac{\lambda_I}{1 - p_k} \tag{2} \]
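The step from the auto-transition probability to the mean length can be sketched as follows. This is an illustrative derivation, not a reproduction of Dai et al. [2007]: it assumes the exponential form $t_{ki}(h_\phi) = p_i + (\delta_{ki} - p_i)\exp(-h_\phi/\lambda_I)$ for equation (1), together with the property commonly used in Markov chain facies models that the slope of the auto-transition probability at zero lag equals minus the reciprocal of the mean length.

```latex
\[
t_{kk}(h_\phi) = p_k + (1 - p_k)\exp\left(-\frac{h_\phi}{\lambda_I}\right)
\quad\Longrightarrow\quad
\left.\frac{\partial t_{kk}(h_\phi)}{\partial h_\phi}\right|_{h_\phi = 0}
  = -\frac{1 - p_k}{\lambda_I}
  = -\frac{1}{\bar{L}_{k,\phi}},
\]
\[
\text{so that}\quad \lambda_I = \bar{L}_{k,\phi}\,(1 - p_k).
\]
```

Under these assumptions, a clay proportion of 0.3 and a clay mean length of 300 m would give an indicator correlation length of 300 × 0.7 = 210 m.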

Equation (2) defines the relationship between indicator correlation length and the statistical parameters of facies proportion and mean lengths. Using equations (1) and (2), the continuous-lag transition probability matrix T in direction ϕ can be defined by the facies proportions and mean lengths as $T(h_\phi) = (t_{ki}(h_\phi))_{N \times N}$. The transition probability matrix can be used for indicator simulation of aquifer facies using the indicator cokriging method [Carle and Fogg, 1997].
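A minimal Python sketch of how facies proportions and a mean length might be turned into a transition probability matrix is given here. It assumes the exponential form t_ki(h) = p_i + (δ_ki − p_i) exp(−h/λ_I) for equation (1) and λ_I = mean length × (1 − p_k) for equation (2); the function names are hypothetical, and the input values are the sand and clay proportions and the clay mean length in x quoted for the synthetic example.

```python
import math


def correlation_length(mean_length, proportion):
    """Indicator correlation length implied by one facies' mean length
    and proportion (assumed form of equation (2))."""
    return mean_length * (1.0 - proportion)


def transition_matrix(proportions, lam, lag):
    """Continuous-lag transition probability matrix T(h) for one direction
    (assumed exponential form of equation (1))."""
    decay = math.exp(-lag / lam)
    n = len(proportions)
    return [
        [proportions[i] + ((1.0 if k == i else 0.0) - proportions[i]) * decay
         for i in range(n)]
        for k in range(n)
    ]


# Illustrative inputs: proportions of sand and clay, clay mean length in x (m).
proportions = [0.7, 0.3]
lam_x = correlation_length(mean_length=300.0, proportion=0.3)

for lag in (0.0, 100.0, 1000.0):
    T = transition_matrix(proportions, lam_x, lag)
    print(lag, [[round(t, 3) for t in row] for row in T])
```

In this form each row of T sums to one, T is the identity matrix at zero lag, and every row approaches the facies proportions as the lag grows.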

3. Stochastic Inverse Method

[6] The transition probability model establishes a bridge between aquifer statistical parameters and aquifer facies distributions. By estimating these statistical parameters, we are able to formulate a model inversion to identify aquifer structure. A flow diagram of the stochastic inversion method is presented in Figure 1.

Figure 1.

Stochastic inversion flow diagram.

[7] The transition probability model is updated by calculating the transition probability matrix, using values of facies lengths and proportions generated as offspring of the previous generation of solutions. The structure is updated by indicator cokriging simulation using the updated transition probability model, where a single realization is used to represent the collection of equally probable realizations of the given transition probability model. Updated facies hydraulic parameters are applied to the structure zonation. Flow simulation is performed using the Finite Element Heat and Mass transfer (FEHM) code [Zyvoloski et al., 1997], employing observed or assumed flow and boundary conditions, producing simulated transient head data.

[8] The inverse modeling is performed with the goal of minimizing residuals of hydraulic head, where the objective function, J, can be defined as

\[ J = \min_{\beta \in B} \sum_{i=1}^{M} \left( \hat{h}_i(\beta) - h_i \right)^2 \tag{3} \]

where $\hat{h}(\beta)$ is an estimated head computed with the parameter values in the vector β, β is constrained to the region $B \subset \mathbb{R}^p$ defined by the upper and lower parameter bounds, p is the number of parameters, h is a measured head, and M is the number of measured heads.
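The fitness evaluation can be sketched in Python as below. The sketch assumes that J is the plain sum of squared head residuals over the M measurements and that candidates outside the bounds B are simply rejected; the forward flow simulation is not shown, so the simulated heads are passed in directly, and all names and numbers are hypothetical.

```python
def within_bounds(beta, lower, upper):
    """True if every parameter in beta lies inside its [lower, upper] bound."""
    return all(lo <= b <= hi for b, lo, hi in zip(beta, lower, upper))


def objective(simulated_heads, measured_heads):
    """Sum of squared residuals between simulated and measured heads."""
    if len(simulated_heads) != len(measured_heads):
        raise ValueError("simulated and measured heads must have equal length")
    return sum((s - m) ** 2 for s, m in zip(simulated_heads, measured_heads))


# Hypothetical candidate: clay proportion, clay mean length (m), clay mean thickness (m).
beta = [0.3, 300.0, 20.0]
lower = [0.1, 100.0, 5.0]
upper = [0.5, 500.0, 50.0]

if within_bounds(beta, lower, upper):
    # In the full method these heads would come from the flow simulation.
    simulated = [100.0, 98.0, 96.5]
    measured = [99.0, 98.5, 96.5]
    print(objective(simulated, measured))
```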

[9] Equation (3) defines the fitness function optimized with AMALGAM-SO [Vrugt et al., 2008]. In general, AMALGAM-SO allocates the number of offspring at each population size, $N^l = \{N_1^l, \ldots, N_q^l\}$, to q algorithms using a weighting scheme based on each algorithm's previous performance, where l is the population size index. In this way, AMALGAM-SO is able to exploit the individual strengths of selected algorithms at various stages of the optimization. AMALGAM-SO employs a population incrementing restart strategy as its basis for collecting algorithm performance information used to update the offspring allocation [Vrugt et al., 2008]. This method has the advantage of combining individual algorithm strengths by allowing algorithms to exchange search information, and by adaptively distributing preference to algorithms exhibiting superior performance. In the current study, Covariance Matrix Adaptation (CMA), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO) strategies were selected for the q = 3 optimization algorithms, as this combination has shown improved performance over other combinations [Vrugt et al., 2008]. The sequence of population sizes used in the current research was l = 5, 10, 15, 20. For more details on the settings of AMALGAM-SO, refer to Vrugt et al. [2008].
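The idea of a performance-weighted offspring split can be illustrated with the short Python sketch below. It is not the AMALGAM-SO update rule, which is specified by Vrugt et al. [2008]; it is a generic proportional allocation under assumed, non-negative performance scores, with a hypothetical floor of one offspring per algorithm so that no method is dropped entirely.

```python
def allocate_offspring(population_size, scores, minimum=1):
    """Split population_size offspring among algorithms in proportion to
    their non-negative scores, keeping at least `minimum` for each."""
    q = len(scores)
    if population_size < q * minimum:
        raise ValueError("population too small for the requested minimum")
    spare = population_size - q * minimum
    total = sum(scores)
    if total <= 0:
        shares = [spare / q] * q  # no information yet: split evenly
    else:
        shares = [spare * s / total for s in scores]
    counts = [minimum + int(share) for share in shares]
    # Hand out offspring lost to truncation, largest fractional part first.
    leftovers = population_size - sum(counts)
    order = sorted(range(q), key=lambda j: shares[j] - int(shares[j]), reverse=True)
    for j in order[:leftovers]:
        counts[j] += 1
    return counts


# Hypothetical scores for CMA, GA, and PSO at the population sizes used in the study.
scores = [3.0, 1.0, 0.0]
for size in (5, 10, 15, 20):
    print(size, allocate_offspring(size, scores))
```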

[10] The uncertainty associated with the resulting optimized transition probability model is evaluated using conditional simulation and presented as the final result in the form of a structural probability map, which in the present case defines the shape and location of clay facies within a background of sand. A map of clay probability is produced by approximating the one-location marginal probabilities for clay by the sample mean of the clay indicator spatial function $I_{clay}(x)$ with respect to the optimized structure parameters as

\[ \Pr\{I_{clay}(x) = 1\} \approx \frac{1}{R} \sum_{r=1}^{R} I_{clay}^{(r)}(x) \tag{4} \]

where R is the number of realizations.
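The sample-mean calculation can be sketched in Python as follows. The sketch assumes each realization is a grid of clay indicators (1 for clay, 0 for sand) of identical shape; the function name and the tiny 2 × 3 grids are hypothetical and stand in for the conditional simulations.

```python
def clay_probability_map(realizations):
    """Cell-wise mean of clay indicators (1 = clay, 0 = sand) over R realizations."""
    if not realizations:
        raise ValueError("at least one realization is required")
    r = len(realizations)
    n_rows = len(realizations[0])
    n_cols = len(realizations[0][0])
    return [
        [sum(real[i][j] for real in realizations) / r for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Three hypothetical indicator realizations on a 2 x 3 grid.
realizations = [
    [[1, 0, 0], [1, 1, 0]],
    [[1, 0, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 1, 0]],
]
for row in clay_probability_map(realizations):
    print([round(p, 2) for p in row])
```

Cells that are clay in every realization map to a probability of one, cells that are never clay map to zero, and intermediate values mark locations where the optimized structure parameters leave the facies uncertain.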

4. Synthetic Example

[11] The stochastic inversion method is illustrated by means of a synthetic example of a confined aquifer with borehole geophysical data and transient head measurements from a synthetic pump test. The distribution of clay and sand in cross-section, where clay facies are embedded within a background of sand, is illustrated in Figure 2. In this example, the permeability of sand is $10^{-10}$ m$^2$ and the permeability of clay is $10^{-13}$ m$^2$. The synthetic structure was generated by conditional simulation using prespecified structure parameters. The proportion of sand ($p_s$) in the synthetic example is set at 0.7, while the proportion of clay ($p_c$) is 0.3. The mean lengths of the clay in the x (length) and y (thickness) directions are 300 m and 20 m, respectively. Conditional data, which comprise the indicator data of the facies distribution, are collected as continuous bore log data from observation wells defined along the transect at x = 0, 250, 500, 750, and 1000 m, indicated by the vertical lines in Figure 2. The well at x = 500 m is set as the pumping well with a constant flow rate of Q = 10.2 kg/s. The boundaries at x = 0 m and x = 1000 m are set as constant head boundaries with heads of 100 m and 95 m, respectively. The top and bottom boundaries at y = 0 m and y = 200 m are no-flow boundaries for this confined model formulation. Measured transient heads were collected at 20 discrete times over a one-year simulation, where the size of the time step increased over the simulation. The measurements were collected at 8 locations along the three central wells, indicated by white dots in Figure 2, where it has been assumed that the pumping well has the ability to measure pressures while pumping.

Figure 2.

Synthetic distribution where light grey represents sand and dark grey represents clay. The black vertical lines indicate observation and pumping wells, as well as locations where facies indicator data has been collected. The white dots indicate head observation locations.

[12] This synthetic example was constructed to emulate typical applications encountered in practice, while still allowing a complete evaluation of the robustness and efficiency of the stochastic inversion with respect to a known structure and parameterization.

5. Results and Discussion

[13] Figure 3a presents a plot of the objective function as a function of the number of model evaluations, where it is apparent by the step-like decreases in the objective function that AMALGAM-SO continues to locate improved solutions throughout the inversion. Figures 3b and 3c present the distributions resulting at the lower and upper bounds of the parameters, respectively, defined in the first two rows of Table 1. These two scenarios represent two points along the convex hull of the solution space $B \subset \mathbb{R}^p$ that the stochastic inversion is required to explore. Inspection of these two plots indicates the diversity of structures considered in the inversion. Figures 3d, 3e, and 3f present distributions at key points during the progression of the inversion, while Figure 3g presents the distribution of the optimal parameters. By inspecting these distributions, the transformation towards the synthetic distribution is apparent. These stages of the optimization are indicated in Figure 3a by their subfigure letter.

Figure 3.

(a) A plot of the objective function versus the number of model evaluations and (b–g) the aquifer structure at different stages of the stochastic inversion, with their corresponding locations noted in Figure 3a. Refer to Table 1 for detailed information on these aquifer structures. (h) Also shown is the clay probability map produced by stochastic simulation of the optimized structural parameters.

Table 1. Structure and Hydraulic Parameters at the Lower (1a) and Upper (1b) Parameter Bounds, at Various Stages of the Model Inversion, and for the Synthetic Example^a
Figure | Iteration Number | Proportion | Clay Mean Length, m | Clay Mean Thickness, m | Perm. log, m^2 | Objective Function
^a Refer to the figures identified in column 1 for plots of the distributions.


[14] Corresponding tabular information on these distributions, and on the synthetic distribution (Figure 2), is presented in Table 1, including iteration number, parameter values, and objective function values. By inspecting the decrease of the objective function plotted in Figure 3a and listed in Table 1, it is apparent that dramatic improvements are made during the course of the optimization, resulting in an extremely small objective function value for the optimized solution. This indicates that an estimate of the aquifer structure has been obtained that closely mimics the hydraulic response of the aquifer. Although an excellent fit to the observed aquifer response has been achieved, it is apparent in Table 1 that there is some discrepancy between the parameters used to generate the synthetic aquifer response and those estimated with AMALGAM-SO. This is explained by the stochastic nature of this inverse problem. Multiple realizations for the same parameter combination result in widely varying values of the objective function. For instance, when evaluating 100 realizations generated using the optimized parameters, objective function values are obtained that vary between 0.5 and 200, with one outlier around 600. The standard deviation of these objective function values is approximately 68.9. This stochasticity allows different parameter combinations to generate nearly identical aquifer responses. The interest is therefore not so much in the exact values of the parameters as in the optimized structure that has been identified. The structure has been identified successfully, considering the very small value of the objective function and the close similarity between the true and inversely estimated facies distributions. Furthermore, 95.5% of the optimized facies grid is assigned to the correct facies with respect to the synthetic example.
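One plausible way to compute the share of grid cells assigned to the correct facies is a cell-by-cell comparison, sketched in Python below. The cell-by-cell definition is an assumption, since the calculation behind the reported percentage is not spelled out, and the function name and the small grids are hypothetical.

```python
def facies_match_fraction(estimated, reference):
    """Fraction of grid cells whose estimated facies equals the reference facies."""
    if len(estimated) != len(reference):
        raise ValueError("grids must have the same number of rows")
    matches = 0
    cells = 0
    for est_row, ref_row in zip(estimated, reference):
        if len(est_row) != len(ref_row):
            raise ValueError("grid rows must have the same length")
        matches += sum(e == t for e, t in zip(est_row, ref_row))
        cells += len(ref_row)
    return matches / cells


# Hypothetical 2 x 4 grids: 1 = clay, 0 = sand.
reference = [[0, 1, 1, 0], [0, 0, 1, 0]]
estimated = [[0, 1, 1, 0], [0, 1, 1, 0]]
print(facies_match_fraction(estimated, reference))
```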

[15] The final result is presented in a probability map of clay facies distribution in Figure 3h where the estimated one-location marginal probabilities of clay (equation (4)) were calculated from 100 realizations based on the optimized structure parameters. While this does provide the uncertainty of the structure based on the optimized transition probability model, it does not indicate the uncertainty of the structure with respect to the aquifer response. Future work will expand this uncertainty analysis by identifying the set of plausible structures with regard to aquifer response. It is apparent from a comparison of the true distribution (Figure 2) and the resulting clay probability map (Figure 3h) that the large structural features are captured in the analysis. These results demonstrate that the stochastic inversion is able to reduce the objective function to a reasonably low value (indicating that the response of the aquifer is modeled accurately) and that the large structural features of the aquifer are identified adequately given the limited amount of information provided in the synthetic data.

6. Conclusions

[16] A method has been developed for aquifer structure identification using stochastic inversion. The method is based on updating structure parameters of a facies transition probability model to identify aquifer structure. This approach allows the problem to be formulated in an inverse modeling framework that can be extended to adaptive parameterization of facies hydraulic parameters by taking advantage of the efficiency in model evaluations with the analytical solution of the facies transition probability model. The inversion is driven by a multi-method genetically adaptive evolutionary optimization approach [Vrugt et al., 2008], providing a robust inversion capable of traversing the complicated fitness landscape. Results are obtained as a probabilistic estimate of the existence of facies at a particular location given the optimized statistical structure parameters. This method can be applied to pump-test datasets, where some geophysical data are available, to provide probabilistic estimates of the facies distributions based on the hydraulic responses of the aquifer.


[17] The reported research was supported by Los Alamos National Laboratory's Directed Research and Development Project. The first author is supported as a Graduate Research Assistant at the Los Alamos National Laboratory. The fourth author is supported by a J. Robert Oppenheimer Fellowship from the LANL postdoctoral program. The authors thank Edward Kwicklis, George Zyvoloski, and Zhiming Lu for their constructive comments on the manuscript.