## 1. Introduction

[2] Accurate identification of heterogeneous material properties in the subsurface is essential for predictive modeling of fluid flow and solute transport. Direct observations of the aquifer properties are limited, and data must be expanded and/or interpolated from the observation locations to populate the model domain. Furthermore, available observations of state variables such as pressure and/or concentration can be brought to bear on estimation of hydraulic parameters through inverse modeling approaches [*de Marsily et al.*, 1999; *Carrera et al.*, 2005]. Recently, the importance of identifying the spatial distribution of hydrogeological heterogeneity has been increasingly centered on transport processes in the subsurface for applications including environmental remediation [*Murakami et al.*, 2010; *Zachara*, 2010], geological storage of CO_{2} [*Baines and Worden*, 2004; *Department of Energy* (*DOE*), 2007], and water resources management [*South Florida Water Management District*, 2005; *Young et al.*, 2010]. For strongly nonlinear problems such as reactive transport processes and groundwater flow in highly heterogeneous formations, a majority of inverse parameter estimations are computationally expensive and the impact of small-scale heterogeneity on model prediction can be profound. Additionally, quantification of the uncertainty associated with model predictions resulting from estimated parameters is receiving increased attention in order to better understand different sources of uncertainty [*Renard et al.*, 2010] and provide decision-makers with reliable predictive models [*Keating et al.*, 2010].

[3] Considerable research has been devoted to developing theories and techniques for solving inverse parameter estimation problems in groundwater modeling, which have been reviewed thoroughly in the literature [*Carrera*, 1987; *Carrera et al.*, 2005; *McLaughlin and Townley*, 1996; *Hendricks Franssen et al.*, 2009]. These techniques can generally be classified by considering (1) how the variables being estimated are parameterized (e.g., lumped versus distributed parameters), (2) whether the result of the inverse procedure is a deterministic best approximation or a probabilistic assessment of the uncertainty, and (3) the assumed degree of linearity in the relationship between material properties and observed state variables. Approaches that provide estimations of spatially varying distributed parameters include the generalized likelihood uncertainty estimation (GLUE) method [*Beven and Binley*, 1992], the geostatistical inverse approach [*Kitanidis*, 1996; *Li et al.*, 2005, 2007], Bayesian approaches to formally integrate prior information [*Woodbury and Ulrych*, 2000; *Jiang et al.*, 2004; *Fu and Gómez-Hernández*, 2009], the method of stochastic differential equations [*Rubin and Dagan*, 1987; *Guadagnini and Neuman*, 1999; *Hernandez et al.*, 2006], gradual deformation methods [*Hu*, 2000], ensemble filtering approaches [*Nowak*, 2009; *Bailey and Bau*, 2011], sequential self-calibration [*Gómez-Hernández et al.*, 2003], and pilot point methods [*Lavenue and de Marsily*, 2001], among others. Recently, *Hendricks Franssen et al.* [2009] compared seven inverse methods commonly used in parameter estimation for groundwater flow in two synthetic heterogeneous transmissivity fields. They concluded that, although the formulation and parameterization of each method differ, overall performance (i.e., predictive accuracy) does not vary significantly among the methods. However, the comparison was limited to synthetic groundwater flow problems in two-dimensional (2-D) heterogeneous domains.

[4] A number of previous works have estimated more than one spatially heterogeneous parameter from the same set of observations in 2-D fields. Most commonly, heterogeneous transmissivity (T) and storativity (S) fields have been estimated from some combination of steady state and transient head observations [*Hendricks Franssen et al.*, 1999; *Li et al.*, 2005, 2007]. Other works have focused on the simultaneous estimation of a heterogeneous T field and a spatially varying transport parameter such as the distribution of sorption coefficient, dispersivity, or contaminant source terms [e.g., *Huang et al.*, 2004; *Nowak and Cirpka*, 2006; *Tonkin et al.*, 2007; *Hosseini et al.*, 2011]. These approaches have generally been demonstrated on horizontal fields, although other researchers have estimated multiple properties in 2-D cross-sectional domains [e.g., *Fienen et al.*, 2009]. A question remains in all of these studies regarding the relationship between the heterogeneous parameters. Most commonly, independence between the estimated parameter values is assumed, often out of necessity because observed values are lacking for one of the parameters.

[5] Over the past decade, a number of studies have increasingly focused on parameter estimation in heterogeneous 3-D fields. *Lavenue and de Marsily* [2001] used observations of a sinusoidally varying pressure test with pilot point parameterization to estimate heterogeneous hydraulic conductivity (*K*) in a two-layered medium. *Hendricks Franssen and Gómez-Hernández* [2002] estimated *K* in a 3-D fractured medium from steady and transient head observations with the locations of fracture zones specified a priori. *Llopis-Albert and Capilla* [2010] extended this work to account for stochastic representation of complex fracture structures similar to that presented by *Gómez-Hernández et al.* [2001]. *Li et al.* [2008] used the geostatistical inverse approach to estimate a 3-D *K* field from steady state drawdown and vertical flowmeter surveys. *Riva et al.* [2008] used a stochastic Monte Carlo method to describe 3-D random geological facies and hydraulic conductivities in order to capture the features of the depth-averaged breakthrough curve (i.e., temporal moments and long tailing). In particular, *Riva et al.* [2008] used an empirical relationship between *K* and porosity (φ) to generate the φ distribution with a dual medium model. Recently, *Schöniger et al.* [2012] improved ensemble Kalman filter (EnKF) approaches by applying nonlinear, monotonic transformations to the observed states in parameter estimation from 3-D hydraulic tomography in multi-Gaussian log *K* fields. *Huber et al.* [2011] used the pilot point method to calibrate a 3-D groundwater flow model of a strongly heterogeneous aquifer with the hydraulic conductivities at a limited number of locations and the leakage coefficients for a number of zones, which were updated in real time with EnKF.

[6] *Doherty* [2003] applied the pilot point method for parameter estimation in the context of underdetermined problems. This allowed pilot points to be distributed through the model domain, resulting in highly parameterized inversion. Stable and unique inverse estimation of the highly parameterized model was accomplished through regularization approaches such as truncated singular-value decomposition (TSVD) and Tikhonov regularization (i.e., hybrid subspace) [*Tonkin and Doherty*, 2005]. In addition, highly parameterized inversion using pilot points has led to the development of computationally efficient, calibration-constrained analysis of model prediction uncertainty by Doherty and coworkers [*Moore and Doherty*, 2005; *Tonkin et al.*, 2007; *Tonkin and Doherty*, 2009]. Highly parameterized inverse estimation with pilot points has also been used in 3-D domains to simultaneously estimate three heterogeneous parameters (porosity and horizontal and vertical hydraulic conductivity) in a model with 11–14 layers conditioned to both head and concentration data [*Tonkin and Doherty*, 2005; *Tonkin et al.*, 2007]. While parameter estimation with pilot points has become relatively common [*Alcolea et al.*, 2006, 2008; *Riva et al.*, 2010; *Doherty and Hunt*, 2010; *Doherty et al.*, 2010], its application to estimation of 3-D fields has been limited.
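As a minimal illustration of why regularization is needed in underdetermined problems, the sketch below applies TSVD and Tikhonov regularization to a small linear system with more parameters than observations. It is a toy example under stated assumptions, not the hybrid subspace scheme of *Tonkin and Doherty* [2005]: the random sensitivity matrix `J`, the data vector `d`, the zero prior `p_prior`, and the function names are all hypothetical.

```python
import numpy as np


def tsvd_solve(J, d, k):
    """Truncated-SVD estimate: keep only the k largest singular values of J."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    k = min(k, len(s))
    coeffs = (U[:, :k].T @ d) / s[:k]
    return Vt[:k].T @ coeffs


def tikhonov_solve(J, d, alpha, p_prior):
    """Tikhonov estimate: minimize |J p - d|^2 + alpha * |p - p_prior|^2."""
    n_par = J.shape[1]
    A = J.T @ J + alpha * np.eye(n_par)
    b = J.T @ (d - J @ p_prior)
    return p_prior + np.linalg.solve(A, b)


# Hypothetical underdetermined problem: 5 observations, 20 parameters.
rng = np.random.default_rng(0)
J = rng.normal(size=(5, 20))      # assumed (linearized) sensitivity matrix
p_true = rng.normal(size=20)      # synthetic "true" parameters
d = J @ p_true                    # noise-free synthetic observations
p_prior = np.zeros(20)            # assumed preferred parameter values

p_tsvd = tsvd_solve(J, d, k=5)
p_weak = tikhonov_solve(J, d, alpha=1e-6, p_prior=p_prior)
p_strong = tikhonov_solve(J, d, alpha=1e6, p_prior=p_prior)
```

In this sketch the TSVD estimate reproduces the data while selecting a single minimum-norm parameter set from the infinitely many that fit, and increasing `alpha` pulls the Tikhonov estimate toward the prior values at the cost of data fit.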

[7] Here we focus on highly parameterized inverse modeling of spatially heterogeneous fields for a 3-D transport problem. The high level of parameterization is used to estimate fields of spatially correlated properties, and the pilot point method is used to parameterize these spatial fields. Parameter estimation with pilot points often utilizes multiple locations with existing property measurements to condition the estimated field. The experimental data set examined here is unique in that it provides extremely fine-scale (0.25^{3} cm^{3}), exhaustive 3-D observations of concentrations in a 3-D flowcell over a 10–20 cm scale [*Zhang et al.*, 2007; *Yoon et al.*, 2008]. In contrast to most parameter estimation studies, however, there are no direct measurements of *K* or φ within the model domain, although independently measured *K* and φ values are available [*Zhang et al.*, 2007; *Yoon et al.*, 2008]. The exhaustive 3-D observations available from magnetic resonance imaging (MRI) in the flowcell packed with sands are an uncommon measurement set for inversion, but are analogous to time series imaging of transport in a geologic unit with geophysical techniques. This unique concentration data set provides an opportunity to explore the application of inverse parameter estimation for multiple variables using highly parameterized models in 3-D groundwater flow and solute transport problems.

[8] The objectives of this work are to: (1) Utilize a set of spatially exhaustive 3-D observations to simultaneously estimate heterogeneous *K* and φ fields through application of highly parameterized models. In particular, we evaluate the capability of six different approaches for quantifying the relationship between *K* and φ in the estimation process to recover the observed tracer transport. This evaluation also includes the straightforward approach of using the known zonation of the sands and the laboratory-measured *K* and φ values of each sand. (2) Examine the ability of highly parameterized estimations of *K* and φ to explain breakthrough curves (BTCs) with and without the addition of a dispersion term in forward solute transport models. If *K* and φ are parameterized at a high resolution, we hypothesize that it should be possible to use a purely advective transport model to match the first temporal moments of BTCs (or mean arrival times, *m*_{1}) as a surrogate for more complex transport in the inversion. The estimated fields should then provide accurate transport results when used for forward runs with the advection-dispersion equation (ADE). (3) Determine the impact of varying the number of parameters and the number of observations on the ability of the estimated 3-D fields to match observed data at different scales. Additionally, the impact of the averaging scale of the observations (i.e., 0.25^{3} and 1.0 cm^{3}) on the resulting estimated fields is examined to evaluate the impact of subscale heterogeneity on forward transport results with ADE.
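To make the mean arrival time concrete, the sketch below computes the normalized first temporal moment of a BTC by trapezoidal integration. It assumes the common definition in which *m*_{1} is the integral of *t*·*c*(*t*) divided by the integral of *c*(*t*); the exact normalization is not stated in this section, and the Gaussian pulse and function names are hypothetical stand-ins for an observed BTC.

```python
import numpy as np


def trapezoid(y, t):
    """Trapezoidal integral of samples y over (possibly uneven) times t."""
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


def mean_arrival_time(t, c):
    """Normalized first temporal moment (assumed definition): m1 = int(t*c) / int(c)."""
    t = np.asarray(t, dtype=float)
    c = np.asarray(c, dtype=float)
    m0 = trapezoid(c, t)  # zeroth moment (area under the BTC)
    if m0 <= 0.0:
        raise ValueError("BTC has no positive area; m1 is undefined")
    return trapezoid(t * c, t) / m0


# Hypothetical BTC: a Gaussian pulse centered at t = 40 (arbitrary time units).
t = np.linspace(0.0, 100.0, 2001)
c = np.exp(-0.5 * ((t - 40.0) / 5.0) ** 2)
m1 = mean_arrival_time(t, c)
```

Because *m*_{1} condenses each BTC to a single number, matching it during inversion requires far less information per observation location than matching the full curve, which is consistent with its use here as a surrogate.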