## 1. Introduction and Approach

### 1.1. Motivation: The Biasedness of Linearizations

[2] The principle of linearized error propagation can look back on a long history of successes in stochastic hydrogeology and, more generally, in describing uncertain dynamic systems [e.g., *Schweppe*, 1973]. Typical hydrogeological applications include geostatistical inversion [e.g., *Zimmerman et al.*, 1998; *Keidser and Rosbjerg*, 1991; *Yeh et al.*, 1996; *Kitanidis*, 1995], maximum likelihood estimation of covariance parameters [e.g., *Kitanidis and Vomvoris*, 1983; *Kitanidis and Lane*, 1985; *Kitanidis*, 1995, 1996], and geostatistical optimal design [e.g., *Cirpka et al.*, 2004; *Herrera and Pinder*, 2005; *Feyen and Gorelick*, 2005]. However, the crux of linearization techniques is the trade-off between their computational efficiency and conceptual ease on the one hand and their limited range of applicability on the other.

[3] Jacobian-based linearizations have been shown to be biased for nonlinear problems. A first goal of this study is to find a new type of linearization that has better properties. The biasedness of a linearization is defined as the systematic deviation of a tangent from the actual nonlinear function. It can occur for all processes that depend nonlinearly on their parameters, including all flow and transport processes in heterogeneous subsurface environments.

[4] This general statement is easily supported by the fact that all scale-dependent physical processes depend nonlinearly on their parameter fields. For any scale-dependent process, inserting the ensemble average of a parameter field into the original equation does not predict the ensemble mean behavior. Instead, the correctly averaged equation requires effective parameters or even has a different mathematical form [e.g., *Rubin*, 2003; *Zhang*, 2002].
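This mismatch is easy to demonstrate numerically. The following is a minimal, hypothetical sketch (not taken from this study): for steady one-dimensional flow through blocks in series under a fixed head drop, the flux is governed by the harmonic mean of the conductivities, a nonlinear function of the field, so inserting the ensemble-averaged field yields a biased prediction of the mean flux.

```python
import numpy as np

# Hypothetical illustration: steady 1-D flow through blocks in series.
# For a fixed head drop, the flux scales with the harmonic mean of the
# block conductivities -- a nonlinear function of the parameter field.
rng = np.random.default_rng(0)
K = rng.lognormal(mean=0.0, sigma=1.0, size=(100_000, 50))  # ensemble of 1-D fields

# Correct ensemble mean behavior: average the nonlinear (harmonic mean) response.
flux_true_mean = np.mean(K.shape[1] / np.sum(1.0 / K, axis=1))

# Naive prediction: insert the ensemble-averaged field into the equation.
# The harmonic mean of the constant field E[K] is just E[K] itself.
flux_from_mean_field = np.mean(K)

# The naive prediction systematically overestimates the mean flux:
assert flux_from_mean_field > flux_true_mean
```

For lognormal conductivities the two predictions differ by more than a factor of two, which is exactly the kind of systematic deviation that effective parameters must correct for.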

[5] A powerful example is solute transport in heterogeneous aquifers, which will serve as the illustrative application throughout this study. For solute transport, the effective ensemble mean equation is macrodispersive, whereas the transport equation with ensemble mean parameters is only locally dispersive and underestimates dispersion [e.g., *Rubin*, 2003].

[6] The current study will revise the concept of linearization. A linearization scheme for nonlinear equations (and stochastic partial differential equations in particular) will be derived from the principles of unbiasedness and minimum approximation error in the ensemble mean sense. Because of its properties, it will be called best unbiased ensemble linearization (EL). Being based on ensemble statistics, EL will also guarantee adequate treatment of solute dispersion.

[7] The remainder of the introduction will review where biasedness occurs in geostatistical inverse modeling, and how it may be overcome by conditional simulation. Conditional simulation methods may be split into realization-based (MC-type) ones that treat individual realizations one by one, and ensemble-based ones, which work on entire ensembles at a time. The latter are computationally more efficient, yet can be stochastically rigorous and conceptually straightforward, and can outperform realization-based methods [e.g., *Hendricks Franssen and Kinzelbach*, 2008b].

[8] Recently, ensemble-based methods have been modified from pure state space data assimilation tools [e.g., *Evensen*, 1994] toward joint updating of parameters and states [e.g., *Chen and Zhang*, 2006; *Hendricks Franssen and Kinzelbach*, 2008a; *Evensen*, 2007, p. 95]. The most recent trend is further modification toward pure parameter space updating, where updated states are obtained via simulation with updated parameters [e.g., *Liu et al.*, 2008]. The contribution of this work, summarized at the end of the introduction, may be seen as the final step in the transformation of ensemble Kalman filters toward geostatistical inversion.

### 1.2. Biasedness in Geostatistical Estimation Techniques

[9] A large class of linearizing methods can be found in geostatistical inversion of flow and tracer data, or in the generation of conditional realizations. These methods obtain expected values, cross covariances, and autocovariances by Jacobian-based linearized error propagation. Jacobians are derived in sensitivity analyses of the involved flow and transport models, e.g., by adjoint state sensitivities [e.g., *Townley and Wilson*, 1985; *Sykes et al.*, 1985]. Examples are the quasi-linear geostatistical method of *Kitanidis* [1995] and the successive linear estimator of *Yeh et al.* [1996], later revisited by *Vargas-Guzmán and Yeh* [2002].

[10] Dependent state variables are estimated by inserting the current estimate of the parameter field into the original equation. This technique is accurate only to zeroth order, and clearly biased. Returning to the example of solute dispersion, the bias appears as a lack of dispersion: because of the scale dependence of transport, dispersion is systematically underrepresented on estimated conductivity fields that are smoother than conditional realizations. The trivial lore “not to use best estimates of conductivity fields for transport simulations” is a direct consequence. Likewise, the corresponding first-order concentration variance fails to represent the uncertainty of concentration related to macrodispersive effects. All interpretations of concentration data based on such linearized approaches are bound to produce inaccurate results.

[11] Successive linearization about an increasingly heterogeneous conditional mean conductivity field may gradually include more effects of heterogeneity [e.g., *Cirpka and Kitanidis*, 2001], but will still underrepresent dispersion and fail to interpret concentration data accurately. *Rubin et al.* [1999, 2003] derived dispersion coefficients that apply to estimated conductivity fields, but require perfect separation of scales between large blocks of estimated conductivity and small-scale dispersion phenomena. Simultaneous estimation of a space-dependent dispersivity also helps to overcome the lack of dispersion [*Nowak and Cirpka*, 2006], but entails a scale dependence on the support volume of available tracer data.

### 1.3. Realization-Based Conditional Simulation

[12] The same lore that recommends not performing transport simulations on estimated parameter fields suggests simulating solute transport on conditioned conductivity fields instead, pointing toward the Monte Carlo framework. Each conditioned random conductivity field honors both data and natural variability, and therefore represents solute dispersion accurately.

[13] Available techniques are computationally quite expensive: the pilot point method of *RamaRao et al.* [1995] and *LaVenue et al.* [1995] [see also *Alcolea et al.*, 2006], sequential self-calibration by *Gómez-Hernández et al.* [1997] and *Capilla et al.* [1997] [see also *Hendricks Franssen et al.*, 2003], or Markov chain Monte Carlo methods [e.g., *Zanini and Kitanidis*, 2008]. Detailed discussion of realization-based methods is provided by H. J. Hendricks Franssen et al. (A comparison of seven methods for the inverse modelling of groundwater flow. Application to the characterisation of well catchments, submitted to *Advances in Water Resources*, 2009).

[14] These realization-based methods rely less on linearized error propagation for data interpretation. For transport simulations, they avoid estimated conductivity fields and can legitimately use the hydrodynamic (local) dispersion tensor without the above dispersion and biasedness issues.

[15] The most widespread ones, i.e., the pilot point method and sequential self-calibration, replace indirect data types (such as hydraulic heads) with a selection of pilot points or master blocks, which are then used for kriging-like interpolation just like direct data (conductivity data values). Values for the pilot points or master blocks are found by quasi-linear optimization, such that the random conductivity fields comply with the given data values. Their drawbacks include (1) the approximate character of this substitution, (2) a lack of options to enforce a prescribed distribution shape of measurement/simulation mismatch, and (3) the computational effort involved in optimizing individual realizations. The first two drawbacks may be minor and of little relevance to the resulting ensemble statistics, as shown in the comparison study by Hendricks Franssen et al. (submitted manuscript, 2009), but the issue of computational effort remains.

[16] The Markov chain Monte Carlo method of *Zanini and Kitanidis* [2008] can be seen as a postprocessor to the quasi-linear geostatistical approach (QLGA) [*Kitanidis*, 1995] that improves the quality of conditional realizations. It is stochastically rigorous, but still involves a massive computational effort for conditioning individual realizations. Without that upgrade, the QLGA requires either a single linearization about the conditional mean (at the cost of inaccuracy unless the conditional covariance is very small) or individual linearizations for each conditional realization (at excessively high computational costs). The most rigorous, fully Bayesian end-member of conditional simulation with a minimum of assumptions and simplifications is the method of anchored inversion by Z. Zhang and Y. Rubin (Inverse modeling of spatial random fields using anchors, submitted to *Water Resources Research*, 2009). At the current stage, its advantages come at even higher computational costs, which may be reduced by further research.

### 1.4. Ensemble-Based Methods

[17] The advantages of conditional simulation may be exploited at substantially reduced computational costs when conditioning entire ensembles rather than individual realizations. The current study will use the EL concept along these lines, obtaining a quasi-linear generator for conditional ensembles. Not surprisingly, the resulting method is quite similar to an ensemble Kalman filter (EnKF) and is therefore called the Kalman ensemble generator (KEG).

[18] The EnKF was proposed by *Evensen* [1994], later clarified by *Burgers et al.* [1998], and extensively reviewed by *Evensen* [2003]. EnKFs update transient model predictions whenever new data become available. Designed for real-time forecasting of dynamic systems, they have a strictly forward-in-time flow of information. In other words, they do not update the past with present data. Their key elements are a transient prediction model, a measurement model, and a forward-in-time Bayesian updating scheme.

[19] Similar to other conditioning techniques, EnKFs require expected values as well as cross covariances and autocovariances between all model states. These are extracted from an ensemble of realizations that is constantly being updated. The most compelling motivation for using ensemble statistics is to avoid computationally infeasible sensitivity analyses and the storage of excessively large parameter autocovariance matrices. At the same time, EnKFs behave more robustly for nonlinear problems because the ensemble statistics can be evolved accurately in time with the nonlinear models themselves. *Burgers et al.* [1998] showed that EnKFs retain higher-order terms compared to the original Kalman filter or the extended Kalman filter [e.g., *Jazwinski*, 1970].
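As a deliberately minimal sketch of such an ensemble-based update, the analysis step below estimates all required covariances from the ensemble itself, so no Jacobian of the forward model is ever formed. The state dimensions, measurement operator, and data values are arbitrary illustrations, not taken from this study.

```python
import numpy as np

# Minimal EnKF analysis step: covariances come from ensemble anomalies,
# replacing Jacobian-based linearized error propagation.
rng = np.random.default_rng(1)
n_state, n_obs, n_ens = 20, 3, 500

X = rng.normal(size=(n_state, n_ens))          # prior state ensemble
H = np.zeros((n_obs, n_state))
H[np.arange(n_obs), [2, 9, 15]] = 1.0          # illustrative measurement model
R = 0.1 * np.eye(n_obs)                        # measurement error covariance
y = np.array([1.0, -0.5, 0.3])                 # illustrative observed data

# Ensemble anomalies yield the required cross and autocovariances:
A = X - X.mean(axis=1, keepdims=True)
P_HT = A @ (H @ A).T / (n_ens - 1)             # Cov(x, Hx)
S = (H @ A) @ (H @ A).T / (n_ens - 1) + R      # innovation covariance
K = P_HT @ np.linalg.inv(S)                    # Kalman gain from ensemble statistics

# Update each member with perturbed observations [Burgers et al., 1998]:
Y = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens).T
X_upd = X + K @ (Y - H @ X)
```

The perturbed-observation step is the clarification by *Burgers et al.* [1998]: without it, the updated ensemble would underestimate the posterior variance.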

[20] The EL concept derived in this study will add a new angle to the theoretical foundation of EnKFs: it links the choice of ensemble covariances to the fundamental principles of unbiasedness and minimum approximation error. Seen from this angle, the EnKF and the KEG use linearizations that are optimal for the entire ensemble, providing them with excellent computational efficiency. Moreover, they are conceptually straightforward, stochastically rigorous, easy to implement, and require no intrusive modification of simulation software. The accuracy of the conditional statistics obtained by the KEG and its strikingly low computational costs will be demonstrated later.

### 1.5. From State Space to Parameter Space

[21] Recent developments indicate a transition of the EnKF from the state space toward the parameter space. In their mostly meteorological and oceanographic applications, EnKFs focused on the state space alone. State space methods use measurements of state variables to update the prediction of state variables. Time-invariant physical parameter fields play only an insignificant role, and the notion of geostatistical structure is entirely absent.

[22] This differs from hydrogeostatistical applications, which focus on the parameter space. Soil parameters are modeled as time-invariant (static) random space functions. The main motivation is to identify static parameter fields of soil properties, much less to combine real-time predictions with incoming streams of observed data. The concept of forward-in-time flow of information does not apply. Instead, great attention is paid to the geostatistical structure of variability, because it plays a major role in the effective behavior of heterogeneous porous media [e.g., *Rubin*, 2003].

[23] A somewhat intermediate concept is the ensemble-based static Kalman filter [e.g., *Herrera*, 1998; *Herrera and Pinder*, 2005; *Zhang et al.*, 2005], sKF for short. It involves a steady state rather than a forward-in-time prediction model, but is still a state space method. Its primary objective is still to improve model predictions, not to condition geostatistical parameter fields.

[24] On the basis of their past successes, EnKFs have received quickly growing attention in hydrogeological studies, as summarized by *Chen and Zhang* [2006] and *Hendricks Franssen and Kinzelbach* [2008a, 2008b]. The aforementioned studies (and other works cited therein) included geostatistical parameters in the list of variables to be updated by the EnKF.

[25] *Wen and Chen* [2006] demonstrated the improvement of accuracy when restarting the EnKF once the parameter values have been conditioned. *Hendricks Franssen and Kinzelbach* [2008a] tested the restart principle and found little or no improvement, probably because their model equations are much closer to linearity. The restart accurately reevaluates the ensemble statistics of state variables using the original equations. This moves the EnKF from a state space or mixed state/parameter space method toward a parameter space method.

[26] The KEG introduced in the current study will complete the transformation of EnKFs into classical parameter space methods. Parameters will be seen as static random space functions. Only the parameter space will be updated from measurements. Updated states will be obtained indirectly by simulation with the updated parameters, which is somewhat similar to an enforced restart in the work of *Hendricks Franssen and Kinzelbach* [2008a] and *Wen and Chen* [2006]. In the examples provided here, the KEG will generate ensembles of log conductivity fields conditional on flow and tracer data, and on measurements of log conductivity itself.

[27] The rigorous theoretical foundation via best unbiased ensemble linearization and the successful history of the EnKF strongly advocate the further use of the KEG, sKF, and EnKF methods in hydrogeostatistical applications. In the tradition of geostatistical inversion methods, the KEG will allow for measurement error, but not for model error. Kalman filters require model error to conceptualize the measurement/simulation mismatch; in the geostatistical tradition, this mismatch is instead attributed to as yet uncalibrated parameters and boundary conditions.

[28] Erroneous model assumptions can lead to biased parameter estimates, and considering model error may increase the robustness in such situations. The highly flexible EnKF framework, however, gives little reason not to include additional uncertain quantities in the list of parameters for updating, thus reducing the arbitrariness and potential errors in model assumptions. A good example is the joint identification of uncertain conductivity and recharge fields by *Hendricks Franssen and Kinzelbach* [2008a], or the joint identification of unknown boundary values. Of course, extensions of the KEG toward model error will remain possible.

[29] A remaining concern in the current study is the original state space character of EnKF-like methods and the KEG. Before the KEG is used extensively in applications that place high demands on geostatistical structure, it will undergo deep scrutiny in the current study. *Chen and Zhang* [2006] tested the ability of EnKFs to cope with inaccurate assumptions on geostatistical structures. Since their synthetic data set was almost exhaustively dense, the EnKF was still able to converge toward the reference conductivity field from which the synthetic data were obtained.

[30] By contrast, the rationale behind the tests in the current study is to investigate whether the KEG maintains a prescribed geostatistical model in the absence of strong data, or whether the spatial statistics of the parameter field degenerate during the updating procedure. This is somewhat related to the filter inbreeding problem discussed, e.g., by *Hendricks Franssen and Kinzelbach* [2008a]. Filter inbreeding is the deterioration of ensemble statistics due to an insufficient ensemble size and leads to underestimated prediction variances. While *Hendricks Franssen and Kinzelbach* [2008a] and others cited therein only tested one-point statistics (the field variance), the current study will include two-point statistics (covariances) in order to test geostatistical properties. Further quality assessment includes the compliance of measurement/simulation mismatch statistics with the assumed distribution shape for measurement errors.

### 1.6. Contributions and Organization of the Current Study

[31] The new contributions of the current study can be summarized as follows.

[32] 1. The quasi-linear Kalman ensemble generator (KEG) finalizes the trend [e.g., *Chen and Zhang*, 2006; *Hendricks Franssen and Kinzelbach*, 2008a; *Liu et al.*, 2008] of ensemble Kalman filters (EnKFs) toward geostatistical conditional simulation.

[33] 2. The underlying idea of EnKFs is to use ensemble covariances in their updating equations. This concept is rederived from the principles of best unbiased linearization, providing an additional theoretical foundation to EnKF-like methods. The rederivation also clarifies the advantages of the KEG over Jacobian-based linearized conditioning techniques.

[34] 3. A two-step updating approach like the one by *Hendricks Franssen and Kinzelbach* [2008a] is used. The KEG first processes direct data (linearly related to the parameter field) to update the parameters prior to any simulation of state variables. Then, it processes indirect data (nonlinearly related) by updating the parameters to reduce the measurement/simulation mismatch. For the indirect data, it employs a quasi-linear iteration scheme, stabilized by a geostatistically driven Levenberg-Marquardt technique [*Nowak and Cirpka*, 2004].

[35] 4. Updated model states are always guaranteed to be physical because they are obtained indirectly via simulation with updated parameters. In combination with the above, this significantly improves the accuracy of results.

[36] 5. The accuracy of maintaining a prescribed geostatistical structure (two-point covariances) during the conditioning step is assessed. Previous studies looked at one-point statistics only. The filter bias is shown to be zero, complying with the rederivation from best unbiased ensemble linearization.
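To make the parameter-space idea of points 3 and 4 concrete, here is a hedged toy sketch (the forward model, dimensions, and data values are invented for illustration and are not the models used in this study): the ensemble cross covariance between parameters and simulated measurements drives the update, and states are obtained only by re-running the forward model with the updated parameters, so they always satisfy the governing equations.

```python
import numpy as np

# Toy parameter-space update: only parameters are conditioned on data;
# states are re-simulated afterwards and therefore remain physical.
rng = np.random.default_rng(2)
n_par, n_obs, n_ens = 30, 4, 400

def forward(s):
    """Illustrative nonlinear forward model mapping parameters to states."""
    return np.tanh(s)

S_ens = rng.normal(size=(n_par, n_ens))          # prior parameter ensemble
obs_idx = [1, 7, 12, 25]                         # observed state locations
R = 0.05 * np.eye(n_obs)                         # measurement error covariance
y_obs = np.array([0.5, -0.2, 0.1, 0.4])          # illustrative data

Y_sim = forward(S_ens)[obs_idx, :]               # simulated measurements
A_s = S_ens - S_ens.mean(axis=1, keepdims=True)
A_y = Y_sim - Y_sim.mean(axis=1, keepdims=True)

C_sy = A_s @ A_y.T / (n_ens - 1)                 # parameter-measurement cross covariance
C_yy = A_y @ A_y.T / (n_ens - 1) + R             # measurement covariance
G = C_sy @ np.linalg.inv(C_yy)                   # ensemble-based gain

# Update parameters only, with perturbed observations:
Yp = y_obs[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens).T
S_upd = S_ens + G @ (Yp - Y_sim)

# Re-simulate: updated states are physical by construction.
X_upd = forward(S_upd)
```

In contrast to joint state/parameter updating, the re-simulation step never produces state values that violate the forward model, which is the point of the enforced restart discussed above.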

[37] The current study is organized as follows: section 2 derives the concept of best unbiased ensemble linearization. Section 3 briefly summarizes the geostatistical framework to establish the necessary notation. Section 4 introduces the quasi-linear Kalman ensemble generator and discusses its similarities to and differences from the EnKF and sKF methods in more detail. In a computationally intensive test case, section 5 assesses the geostatistical properties of the KEG and discusses its computational efficiency.