## 1. Introduction

### 1.1. Background and Literature Review

[2] Understanding transport processes and developing mathematical models capable of simulating observed solute plumes are fundamental to environmental risk assessment and the remediation of contaminated sites. Historically, innovations in solute transport modeling have been developed and tested using extensive data sets collected during controlled experimental field studies. These data sets usually include measurements of solute concentration and hydraulic conductivity (*K*), which are essential to properly characterizing subsurface heterogeneity and transport behavior. In the last 2 decades, for example, data collected from a tracer test site at the Columbus Air Force Base in Mississippi, commonly known as the Macrodispersion Experiment (MADE) site, have been invaluable for the advancement of new transport theories and mathematical models [*Zheng et al.*, 2011]. The importance of this site is mainly due to its extreme heterogeneity, indicated by the high variance of the natural logarithm of the measured hydraulic conductivity *K* (*σ*_{lnK}^{2} ≈ 4.5) [*Rehfeldt et al.*, 1992], which is significantly higher than that of other aquifers for which similar data sets exist [e.g., *Mackay et al.*, 1986; *LeBlanc et al.*, 1991].

[3] Three large-scale natural gradient tracer tests, usually referred to as MADE-1, MADE-2, and MADE-3 (also known as NATS) experiments, were conducted at the MADE site [*Boggs*, 1991; *Boggs et al.*, 1993; *Julian et al.*, 2001]. Measured concentrations revealed that transport behavior is characterized by highly asymmetric plumes, with significant mass accumulation near the source and extensive mass spreading to the far field. Several studies applied different modeling approaches to simulate the concentration distributions observed during these tracer tests [*Adams and Gelhar*, 1992; *Eggleston and Rojstaczer*, 1998; *Berkowitz and Scher*, 1998; *Zheng and Jiao*, 1998; *Harvey and Gorelick*, 2000; *Feehley et al.*, 2000; *Julian et al.*, 2001; *Baeumer et al.*, 2001; *Schumer et al.*, 2003; *Barlebo et al.*, 2004; *Salamon et al.*, 2007; *Zhang and Benson*, 2008; *Llopis-Albert and Capilla*, 2009]. All these studies share the conclusion that the classical advection-dispersion model (ADM) is not able to reproduce the transport behavior observed at the MADE site unless physical heterogeneity is adequately resolved. When small-scale variations of water flux due to aquifer heterogeneity are not explicitly described, the ADM underestimates the extensive spreading or “tailing” along the flow direction and is not able to reproduce the substantial mass accumulation near the injection points. This conclusion was further confirmed by two recent forced-gradient tracer tests [*Liu et al.*, 2010; *Bianchi et al.*, 2011].

[4] As initially suggested by *Harvey and Gorelick* [2000] and *Feehley et al.* [2000], a reasonable hypothesis to explain the failure of the ADM is the presence of a network of interconnected highly permeable sediments embedded in a less conductive matrix. This conceptual model of heterogeneity can in fact favor the fast movement of a fraction of solute mass along preferential flow paths (PFPs), while most of the mass stagnates in the matrix. This hypothesis was proposed after modeling results showed that the dual-domain mass transfer model can reproduce the transport behavior observed at the MADE site more accurately than the ADM. The dual-domain model conceptualizes the aquifer as consisting of distinct, but coexisting, mobile and immobile domains, and this separation is particularly appropriate when reproducing transport in the presence of connected high-*K* structures embedded in a low-*K* matrix [*Gorelick et al.*, 2005; *Liu et al.*, 2004; *Bianchi et al.*, 2008]. *Harvey and Gorelick* [2000] and *Feehley et al.* [2000] considered the efficacy of the dual-domain model in reproducing the solute plumes observed during the MADE-1 and MADE-2 experiments to be indirect proof of the existence of a PFP network controlling solute transport at the MADE site. This hypothesis was also proposed by *Julian et al.* [2001] and more recently by *Llopis-Albert and Capilla* [2009]. *Zheng and Gorelick* [2003] investigated more specifically the transport behavior in a field characterized by a binary dendritic *K* distribution, generated using an invasion-percolation algorithm. Their numerical experiments offered support to the PFP network hypothesis by demonstrating that solute transport in a hypothetical networked *K* field displays highly non-Fickian characteristics similar to those observed at the MADE site.
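
The mobile/immobile separation described above can be illustrated with a minimal sketch. It assumes a well-mixed (zero-dimensional) batch system with first-order exchange between the two domains, and it omits advection and dispersion entirely, so it is not the model used in the cited studies. The function name `dual_domain_batch`, the porosities, and the rate coefficient are hypothetical values chosen only for illustration.

```python
def dual_domain_batch(c_mobile, c_immobile, theta_m, theta_im, beta, dt, n_steps):
    """Explicit Euler integration of first-order exchange between two domains.

    Assumed governing equations (batch system, no advection or dispersion):
        theta_m  * dCm/dt  = -beta * (Cm - Cim)
        theta_im * dCim/dt = +beta * (Cm - Cim)
    """
    history = [(c_mobile, c_immobile)]
    for _ in range(n_steps):
        # Mass exchanged per unit bulk volume and time, positive from mobile to immobile
        exchange = beta * (c_mobile - c_immobile)
        c_mobile -= dt * exchange / theta_m
        c_immobile += dt * exchange / theta_im
        history.append((c_mobile, c_immobile))
    return history


# Hypothetical parameters: small mobile porosity, large immobile porosity
theta_m, theta_im = 0.05, 0.30
beta = 0.01  # first-order mass transfer rate coefficient (1/day), assumed

# All solute starts in the mobile domain
history = dual_domain_batch(1.0, 0.0, theta_m, theta_im, beta, dt=0.1, n_steps=5000)
c_m_end, c_im_end = history[-1]

total_mass_start = theta_m * 1.0
total_mass_end = theta_m * c_m_end + theta_im * c_im_end
print(f"final Cm = {c_m_end:.4f}, final Cim = {c_im_end:.4f}")
print(f"mass at start = {total_mass_start:.4f}, mass at end = {total_mass_end:.4f}")
```

In this sketch most of the solute mass ends up stored in the immobile domain because it holds most of the pore volume, which is the qualitative behavior the dual-domain concept relies on to produce near-source mass retention and late-time tailing.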

[5] The traditional approach for reproducing hydraulic conductivity fields in heterogeneous porous media has been based on the assumption of a multivariate Gaussian distribution of ln*K*. With this approach, ln*K* is considered a spatially correlated random variable. An important characteristic of multi-Gaussian fields is that entropy (disorder) is maximized and therefore extreme values tend to cluster in isolated zones rather than being arranged in connected structures [*Silliman and Wright*, 1988; *Rubin and Journel*, 1991; *Journel and Deutsch*, 1993; *Gómez-Hernández and Wen*, 1998; *Zinn and Harvey*, 2003]. The multi-Gaussian approach has become popular due to its relative mathematical simplicity and ease of interpretation. However, several studies have demonstrated the importance of connectedness rather than randomness in heterogeneous aquifers [e.g., *Anderson*, 1989; *Sánchez-Vila et al.*, 1996; *Koltermann and Gorelick*, 1996; *Webb and Anderson*, 1996; *Tsang and Neretnieks*, 1998; *Fogg et al.*, 1998, 2000].

[6] *Fogg* [1986], for example, suggested that groundwater flow in the Wilcox aquifer in Texas is controlled by the continuity and connectivity of large-scale sand bodies. *Scheibe and Yabusaki* [1998] demonstrated that methods for upscaling the *K* distribution, which lead to a good match between simulated and observed heads, may not be adequate to reproduce transport behavior because transport is strongly affected by the existence and connectivity of high-*K* zones. *Labolle and Fogg* [2001] recognized that the connectivity of highly permeable channel hydrofacies is the most important factor controlling solute migration in the alluvial system at the Lawrence Livermore National Laboratory (LLNL). The hydrostratigraphic reconstruction of the LLNL aquifer showed that about 80% of the channel hydrofacies forms a connected network that percolates in three dimensions. *Proce et al.* [2004] applied transition probability and sequential indicator simulations to simulate the assemblage of facies in a system of buried valley aquifers. A multiscale realization of aquifer heterogeneity showed the presence of interconnected pathways resulting from the significant connectivity of sand and gravel facies. The influence on groundwater flow patterns exerted by the distribution of different lithofacies with significant contrasts in *K* was also investigated by *Heinz et al.* [2003]. By using particle tracking calculations, they showed that sedimentary processes are responsible for the heterogeneities that determine local groundwater flow in aquifers. The importance of connectivity on fracture flow has also been acknowledged [e.g., *Journel and Alabert*, 1989; *Tidwell and Wilson*, 1999].

[7] Numerical studies have most commonly been used to quantify the connectivity of *K* fields. *Gómez-Hernández and Wen* [1998] analyzed groundwater travel times in four alternative unconditional representations of a 2-D synthetic *K* field sharing the same Gaussian histogram and covariance function, but different in terms of connectivity patterns. Results showed that travel times in the multi-Gaussian model could be 10 times longer than those observed in the other models. *Western et al.* [2001] applied connectivity functions to produce unconditional 2-D fields with almost identical histograms and omnidirectional variograms, but with very different connectivity. They concluded that standard geostatistical approaches based on variogram models do not properly capture connectivity. *Zinn and Harvey* [2003] showed that unconditional 2-D fields with connected structures can have the same lognormal probability density function and isotropic covariance function as multi-Gaussian fields without connected structures. Since rate-limited mass transfer may be a significant process in *K* fields with highly permeable connected structures, their results highlighted the importance of identifying the connectivity of porous media in order to choose the most appropriate transport model.

[8] One of the first attempts to establish criteria for ranking *K* fields on the basis of their connectivity was made by *Deutsch* [1998]. The method is essentially based on measuring the number and the size of connected bodies in a 3-D Cartesian grid. More recently, *Knudby and Carrera* [2005] proposed and evaluated nine different indicators of connectivity in order to assess the possibility of predicting flow connectivity from statistical connectivity and, consequently, transport connectivity from flow connectivity. From the lack of correlation between indicators measuring different types of connectivity, they concluded that connectivity is a process-dependent concept. *Lee et al.* [2007] used 3-D field data to investigate connectivity in the LLNL aquifer. Several realizations of aquifer heterogeneity were generated using sequential Gaussian simulation (SGS) and transition probability indicator simulation (T-PROGS). Simulated *K* fields were also used as input to a groundwater flow model to simulate a pumping test performed at the LLNL site. Measures of spatial connectivity showed that the network of high-*K* materials is characterized by greater lateral connectivity in the T-PROGS realizations than in the SGS fields. T-PROGS realizations were also more accurate in reproducing the observed drawdown. *Vassena et al.* [2009] investigated the effects of facies heterogeneity on flow and transport in small blocks (1 m^{3}) sampled from the alluvial sediments of the Ticino valley in Italy. Numerical tracer experiments in these systems combined with the use of connectivity indicators suggested that transport and statistical connectivity indicators are correlated with dispersivity. In a recent study of MADE site single-well injection-withdrawal test data, *Ronayne et al.* [2010] used a hybrid model, which combines 3-D lithofacies representing submeter connected channels with a matrix described by a correlated multivariate Gaussian hydraulic conductivity field, to show that intrafacies heterogeneity is responsible for local-scale mass transfer.
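
The idea of counting and sizing connected bodies in a 3-D Cartesian grid can be sketched as follows. This is only an illustration of the general technique, not the procedure of *Deutsch* [1998]: it assumes a binary indicator obtained by thresholding *K*, face-to-face (6-neighbor) connectivity, and a breadth-first flood fill. The function names, the cutoff value, and the small *K* field are hypothetical.

```python
from collections import deque


def threshold(k_field, k_cut):
    """Convert a nested-list K field [z][y][x] into a 0/1 indicator (1 = high K)."""
    return [[[1 if k >= k_cut else 0 for k in row] for row in layer] for layer in k_field]


def connected_bodies(indicator):
    """Return the sizes (cell counts) of face-connected bodies of 1-cells, largest first."""
    nz, ny, nx = len(indicator), len(indicator[0]), len(indicator[0][0])
    # Assumption: two cells belong to the same body only if they share a face
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    sizes = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if indicator[z][y][x] != 1 or (z, y, x) in seen:
                    continue
                # Breadth-first flood fill starting from an unvisited high-K cell
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                size = 0
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in neighbors:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and indicator[n[0]][n[1]][n[2]] == 1:
                            seen.add(n)
                            queue.append(n)
                sizes.append(size)
    return sorted(sizes, reverse=True)


# Hypothetical 2 x 3 x 3 K field (arbitrary units), indexed [z][y][x]
k_field = [
    [[10.0, 10.0, 0.1],
     [0.1, 10.0, 0.1],
     [0.1, 0.1, 10.0]],
    [[0.1, 0.1, 0.1],
     [0.1, 10.0, 0.1],
     [0.1, 0.1, 0.1]],
]

sizes = connected_bodies(threshold(k_field, k_cut=1.0))
print("number of connected bodies:", len(sizes))
print("body sizes (cells):", sizes)
```

The choice of neighborhood matters: allowing edge or corner contacts (18 or 26 neighbors) would generally merge bodies that face connectivity keeps separate, so any ranking of fields by such counts depends on that assumption and on the cutoff used to define high-*K* cells.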

### 1.2. Objectives

[9] The main objective of the present study is to investigate connectivity in a small block of the aquifer at the MADE site and the role of that connectivity in advective transport. To attain this goal, 3-D conditional realizations of the geologic heterogeneity were first generated using three different geostatistical methods: sequential Gaussian simulation, sequential indicator simulation, and transition probability indicator simulation. Geostatistical realizations are conditioned to *K* values estimated through grain size analyses of 19 newly collected cores (20 cores were initially collected, but one was found to be unusable due to incomplete depth records). This approach differs from our previous studies [*Gorelick et al.*, 2005; *Liu et al.*, 2007; *Bianchi et al.*, 2008] in which connectivity and solute transport were investigated in synthetic aquifers characterized by a dendritic distribution of the high-*K* zones generated using an unconditional invasion-percolation algorithm.

[10] Since at present there is no agreement in the scientific community about which geostatistical method best represents connected features in heterogeneous aquifers, we chose to apply three of the most commonly used methods in order to obtain a wider spectrum of possible representations of the aquifer heterogeneity. In this way we also tried to reduce the possibility that our conclusions regarding the connectivity within the aquifer are biased by the characteristics of a particular geostatistical method. Particle tracking calculations were used to assess the influence of connectivity on advective transport and to analyze the geometric characteristics of the connected pathways. Indicators similar to those presented by *Deutsch* [1998] and *Knudby and Carrera* [2005] were then used to quantify the connectivity of each of the generated *K* fields. We also present an exceptionally detailed 3-D data set that provides a close representation of the heterogeneity of an actual aquifer block. Another distinguishing attribute of our study, made possible by this unique data set, is that we considered connectivity in 3-D geostatistical realizations conditioned to *K* measurements.

[11] The remainder of the paper is organized as follows. After a brief description in section 2 of the hydrogeological setting of the MADE site aquifer, section 3 presents the core sampling method, the grain size analysis used to estimate the vertical distribution of *K* along each core, and the descriptive statistics of the collected data set. Section 4 discusses the geostatistical methods for generating 3-D *K* fields using the core data, the groundwater flow and particle transport models, and the parameters used for measuring spatial connectivity. Section 5 illustrates and discusses the nature of the connectivity of the studied portion of the MADE aquifer based on the results of the geostatistical conditional simulations, the characteristics of simulated breakthrough curves and particle paths, and the values of the connectivity indicators. Finally, section 6 presents general insights drawn from this study.