## 1. Introduction

[2] Subsurface hydrological properties, such as permeability and porosity, can be inferred through inverse modeling of indirect measurements (e.g., hydraulic head and tracer concentrations) at discrete points and times, provided that such data are sufficiently sensitive to the properties of interest. However, the calibration of distributed groundwater models based on limited measurements is generally an underdetermined inverse problem [e.g., *McLaughlin and Townley*, 1996]. To overcome this limitation, simplifying assumptions regarding spatial variability are commonly made [e.g., *Moore and Doherty*, 2006]. For example, parameter zonation divides the model into a discrete number of zones, each with spatially uniform properties. The number and shape of zones, which may also contain parameterized property variations, may be iteratively determined during the calibration process with increasing granularity as supported by the calibration data [*Sun et al.*, 1998; *Tsai et al.*, 2003; *Berre et al.*, 2009], or a level set formulation that flexibly calibrates zone shapes during inversion of joint hydrogeophysical data may be used [*Cardiff and Kitanidis*, 2009]. Another option is to apply regularization, which enforces some form of spatial variability or smoothing in the property of interest, to make underdetermined inverse problems well posed [*Yeh*, 1986; *Carrera et al.*, 2005]. For example, a heterogeneous property can be cast as a spatially correlated random field if a statistical model of correlation (e.g., a semivariogram) can be inferred from site characterization data or concurrently estimated on the basis of the available calibration data [*Kitanidis*, 1995; *Hu*, 2000; *Caers*, 2003; *Finsterle and Kowalsky*, 2008]. While such simplifications may allow for a unique solution of the inverse problem, they result in a simplified picture of the subsurface that may or may not be adequate, depending on the application [*Moore and Doherty*, 2006]. 
Decisions regarding parameterization (i.e., how to represent a heterogeneous distribution with a limited number of parameters that are amenable to estimation) are of great importance for successful application of inverse modeling. In this study we aim to highlight the importance of proper spatial parameterization of subsurface heterogeneity, as errors in the model structure are partly compensated for by estimating biased property values during the inversion. These biased estimates—while potentially providing an improved fit to the calibration data—may lead to wrong interpretations and conclusions and reduce the ability of the model to make reliable predictions.
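The compensation effect described above can be illustrated with a toy sketch that is not part of this study. It considers steady one-dimensional Darcy flow through two zones in series. All names and values (`TRUE_BOUNDARY`, `TRUE_K`, `fit_zone_conductivities`, the observation locations) are hypothetical, and the zone resistances 1/K are fitted by ordinary least squares rather than by the inversion procedure used in this work.

```python
# Toy example: steady 1-D flow, unit Darcy flux, two zones in series.
# All values are hypothetical and chosen only for illustration.

X_OBS = [0.2, 0.4, 0.6, 0.8, 1.0]   # observation locations along the column
TRUE_BOUNDARY = 0.5                  # true position of the zone interface
TRUE_K = (10.0, 1.0)                 # true conductivities of zones 1 and 2


def head_drop(x, boundary, k1, k2, flux=1.0):
    """Head drop between 0 and x from Darcy's law: flux * integral of dx / K."""
    return flux * (min(x, boundary) / k1 + max(x - boundary, 0.0) / k2)


def fit_zone_conductivities(x_obs, drops, boundary, flux=1.0):
    """Least-squares fit of (K1, K2) for an assumed interface position.

    The head drop is linear in the resistances r = 1/K, so the 2x2 normal
    equations are solved directly with Cramer's rule.
    """
    rows = [(flux * min(x, boundary), flux * max(x - boundary, 0.0))
            for x in x_obs]
    a11 = sum(a * a for a, _ in rows)
    a12 = sum(a * b for a, b in rows)
    a22 = sum(b * b for _, b in rows)
    b1 = sum(a * d for (a, _), d in zip(rows, drops))
    b2 = sum(b * d for (_, b), d in zip(rows, drops))
    det = a11 * a22 - a12 * a12
    r1 = (b1 * a22 - a12 * b2) / det
    r2 = (a11 * b2 - a12 * b1) / det
    return 1.0 / r1, 1.0 / r2


# Synthetic, noise-free "measurements" generated with the true structure.
observed = [head_drop(x, TRUE_BOUNDARY, *TRUE_K) for x in X_OBS]

# Calibrate once with the correct interface and once with a misplaced one.
k_correct = fit_zone_conductivities(X_OBS, observed, boundary=0.5)
k_misplaced = fit_zone_conductivities(X_OBS, observed, boundary=0.7)

print("correct interface:  ", k_correct)
print("misplaced interface:", k_misplaced)
```

With the interface misplaced at 0.7, the fit returns conductivities of roughly 4.2 and 0.75, both below the true values of 10 and 1, even though the data are noise free. In this toy setting the structural error is absorbed as bias in the estimated properties.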

[3] Starting with a geostatistical parameterization, we treat the base 10 log permeability as a spatially correlated random field. To estimate its spatial distribution through inverse modeling of secondary measurements, we use an implementation of the popular pilot point method [*de Marsily*, 1978]. Many variations of this method [e.g., *de Marsily et al.*, 1984; *Certes and de Marsily*, 1991; *Lavenue and Pickens*, 1992; *RamaRao et al.*, 1995; *Gómez-Hernández et al.*, 1997; *Doherty*, 2003; *Kowalsky et al.*, 2004, 2005; *Alcolea et al.*, 2006, 2008; *Finsterle and Kowalsky*, 2008] and related methods [*Rubin et al.*, 2010] have been proposed over the years and are reviewed by *Hendricks Franssen et al.* [2009]. Heterogeneous log permeability distributions are generated using sequential Gaussian simulation (SGSIM) [*Deutsch and Journel*, 1992], such that they reflect the spatial correlation specified by the semivariogram and are conditioned to (i.e., are a function of) so-called pilot point values, which are estimated in the inversion procedure. This geostatistical approach requires that the inversion procedure be repeated multiple times, each time with a different initial random field (based on a different seed number). The multiple inversion realizations result in a multitude of parameter distributions, which provide log permeability estimates at each location in the model with corresponding uncertainty [*RamaRao et al.*, 1995].
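The role of the pilot points can be sketched with a minimal one-dimensional example. Note that it does not use SGSIM; it conditions an unconditional random field to the pilot point values by simple kriging of the residuals, which is a different but related conditioning technique. The exponential covariance, unit variance, correlation length, mean, pilot point locations and values, and all function names are assumptions made for illustration only, and the pilot point values are held fixed here rather than estimated by inversion.

```python
import math
import random
import statistics


def exp_cov(h, corr_len):
    """Unit-variance exponential covariance (assumed correlation model)."""
    return math.exp(-abs(h) / corr_len)


def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def unconditional_field(n, rng, corr_len):
    """Zero-mean field on a unit-spaced 1-D grid with exponential covariance.

    In 1-D this covariance corresponds to a first-order autoregressive process.
    """
    rho = math.exp(-1.0 / corr_len)
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        z.append(rho * z[-1] + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0))
    return z


def conditioned_field(n, pilots, mean, seed, corr_len=5.0):
    """Random field that honors the pilot point values exactly."""
    rng = random.Random(seed)
    z = unconditional_field(n, rng, corr_len)
    idx = list(pilots)
    c_pp = [[exp_cov(i - j, corr_len) for j in idx] for i in idx]
    # Residual between each pilot value and the unconditional field there.
    resid = [(pilots[i] - mean) - z[i] for i in idx]
    w = solve(c_pp, resid)  # dual kriging weights
    return [mean + z[x] + sum(wk * exp_cov(x - i, corr_len)
                              for wk, i in zip(w, idx))
            for x in range(n)]


N_CELLS = 60
MEAN_LOG_K = -11.0                           # hypothetical mean log10 permeability
PILOTS = {10: -10.5, 30: -11.5, 45: -11.0}   # cell index -> pilot point value

# One realization per seed; the spread across seeds reflects uncertainty.
ensemble = [conditioned_field(N_CELLS, PILOTS, MEAN_LOG_K, seed)
            for seed in range(50)]


def cell_stats(cell):
    """Ensemble mean and standard deviation of log permeability at one cell."""
    values = [field[cell] for field in ensemble]
    return statistics.mean(values), statistics.stdev(values)


print("at a pilot point:    ", cell_stats(30))
print("between pilot points:", cell_stats(20))
```

In this sketch, every realization reproduces the pilot point values, so the ensemble spread is essentially zero at the pilot points and grows with distance from them. In an actual inversion the pilot point values would be adjusted for each seed until the simulated measurements match the data.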

[4] While there is no set rule for determining the optimal positioning of pilot points for a given scenario, *Jung et al.* [2011] provide an excellent review of studies that have considered how to add pilot points (e.g., predefining them or adding them sequentially) and how best to select their locations, including empirically based approaches (e.g., random placement or a uniform density of pilot points) and sensitivity-based approaches that optimize pilot point placement on the basis of measurement locations (e.g., on the basis of the adjoint sensitivity technique of *Lavenue and Pickens* [1992] or the D optimality criterion proposed by *Jung et al.* [2011]). Motivated by the lack of studies offering guidelines on implementing pilot points in hydrogeological applications, *Doherty et al.* [2010] offer a variety of such guidelines based largely on the mathematical basis of the pilot point method, and they lay out some future related research directions. Among numerous recommendations, they cite the need for further synthetic studies to elucidate pilot point placement and other implementation details.

[5] The majority of pilot point applications rely on simplistic numerical experiments, and most limit their scope to hydraulic head measurements (and hydraulic conductivity measurements) rather than transient tracer measurements. There is a continued need for numerical experiments with pilot points in applications that are of practical relevance and contain a variety of data types. There is also a need for more examples in which the methods are applied to field data from experiments with real-world complications and limitations, to help identify, refine, and improve guidelines for successful inverse modeling.

[6] Heterogeneity can also be parameterized using geophysical data in a variety of ways, such as through tomographic constraints in the inversion of tracer data [*Linde et al.*, 2006]. Coupled hydrogeophysical approaches have combined traditional hydrological measurements with geophysical data, such as seismic data [*Hyndman et al.*, 1994; *Hyndman and Gorelick*, 1996], which are related to lithological zonation, or electrical resistivity data [e.g., *Pollock and Cirpka*, 2008, 2010, 2012; *Kowalsky et al.*, 2011], which are sensitive to solute concentration and therefore provide secondary measurements that can be used to estimate hydrological properties.

[7] In general, a zonation parameterization is also of great value, as it is conducive to incorporating characterization data, such as those from geophysical measurements, hydrological tests, or core data, into a model. Parameterization techniques can also be combined, as in the zonation–kriging method of *Tsai* [2006], which integrates the conditional estimates of a kriged field (within a geostatistical framework) with those of a zonal structure that honors a set of sampled data.

[8] *Dafflon et al.* [2011] evaluated several parameter estimation approaches for an application similar to the one considered here, a tracer experiment in an unconfined aquifer. Aside from gaining valuable insight into the variable-density flow phenomena in their experiment, resulting from the high concentration of saline tracer used, they evaluated the usefulness of various sources of geophysical and hydrological information. Their study demonstrated some of the challenges in dealing with real field experiments (e.g., their hydrological model could not properly fit the concentration breakthrough, which they speculate was due to inadequacies in the conceptual model or boundary conditions or in parameterization of heterogeneity).

[9] The current work is more directly motivated by a previous study (M. B. Kowalsky, S. Finsterle, A. Englert, K. H. Williams, C. Steefel, and S. S. Hubbard, Inversion of time-lapse tracer data for estimating changes in field-scale flow properties during biostimulation, submitted to *Journal of Hydrology*, 2012), which analyzed time-lapse tracer data collected in two consecutive biostimulation field experiments that were conducted in a flow cell at a uranium-contaminated aquifer at Rifle, Colorado, in 2002 and 2003. They performed hydrological inverse modeling of the tracer data, using a geostatistical parameterization, to estimate the heterogeneous log permeability distribution for each year. With a goal of identifying subtle changes in flow properties, such as those expected to occur during biostimulation, they concluded there was insufficient information in the tracer data of that particular experiment to accurately infer changes in permeability of less than half an order of magnitude from one year to the next. They hypothesized that the coarse well spacing of the experiment, relative to the length scale of heterogeneity, contributed to nonuniqueness in the inverse problem. The study also pointed to the need for a better understanding of how potential errors in the model parameterization could affect the solution of such inverse problems and how additional site characterization data might be included to reduce uncertainty in parameter estimates.

[10] This study is based on a subsequent field-scale tracer experiment conducted at the Rifle site. Described in detail by *Williams et al.* [2011], the experiment took place in 2007 within a flow cell having a closer well spacing than that used for the 2002–2003 study (Kowalsky et al., submitted, 2012). After providing some details about the site and the experiment in section 2, we describe the hydrological inverse modeling approach in section 3, including details of the hydrological model, parameterization techniques, and the inverse modeling procedure itself. Then we use a synthetic example in section 4 to examine how decisions made regarding parameterization affect the solution of the inverse problem, and we examine issues that can arise when zonation information, such as from core logs or geophysical data, is known to varying degrees of completeness and accuracy and is implemented in the inversion procedure. The inverse modeling approaches are applied to actual field data in section 5, first using a 2-D model with a geostatistical parameterization and testing, among other things, the previous assumption of uniform porosity and gradient direction. The model is then extended to 3-D by employing a geostatistical parameterization together with a zonation parameterization that incorporates facies information derived from geologic well log descriptions, while accounting for uncertainty in the facies geometry. Comparisons are then made between the estimated permeability values and those derived from slug test data, and between the porosity values estimated in this study and those inferred from other sources.

[11] The importance of this work is exemplified by the fact that one of the main difficulties in building reactive transport models for complex field sites continues to stem from uncertainty in the basic heterogeneous hydrological properties, such as permeability and porosity. Thus, testing, improving, and refining techniques for estimating such properties continue to be essential research topics in hydrogeology. Furthermore, there is a lack of studies highlighting the impact of decisions related to parameterization and quantifying how they affect inverse modeling results. We intend for this work to add to the relatively limited number of synthetic and field applications offering some guidance for the use of pilot points in complex real-world experiments involving tracer data (as opposed to hydraulic head data). There remains a clear need for further synthetic examples that test the approach as new applications arise and quantify the impact of certain modeling assumptions, and for applications to field data from complex real-world experiments.