Spatial connectivity in a highly heterogeneous aquifer: From cores to preferential flow paths



[1] This study investigates connectivity in a small portion of the extremely heterogeneous aquifer at the Macrodispersion Experiment (MADE) site in Columbus, Mississippi. A total of 19 fully penetrating soil cores were collected from a rectangular grid of 4 m by 4 m. Detailed grain size analysis was performed on 5 cm segments of each core, yielding 1740 hydraulic conductivity (K) estimates. Three different geostatistical simulation methods were used to generate 3-D conditional realizations of the K field for the sampled block. Particle tracking calculations showed that the fastest particles, as represented by the first 5% to arrive, converge along preferential flow paths and exit the model domain within preferred areas. These 5% fastest flow paths accounted for about 40% of the flow. The distribution of preferential flow paths and particle exit locations is clearly influenced by the occurrence of clusters formed by interconnected cells with K equal to or greater than the 0.9 decile of the data distribution (10% of the volume). The fraction of particle paths within the high-K clusters ranges from 43% to 69%. In variogram-based K fields, some of the fastest paths are through media with lower K values, suggesting that transport connectivity may not require fully connected zones of relatively homogenous K. The high degree of flow and transport connectivity was confirmed by the values of two groups of connectivity indicators. In particular, the ratio between effective and geometric mean K (on average, about 2) and the ratio between the average arrival time and the arrival time of the fastest particles (on average, about 9) are consistent with flow and advective transport behavior characterized by channeling along preferential flow paths.

1. Introduction

1.1. Background and Literature Review

[2] Understanding transport processes and developing mathematical models capable of simulating observed solute plumes are fundamental to environmental risk assessment and the remediation of contaminated sites. Historically, innovations in the discipline of solute transport modeling were developed and tested by using extensive data sets collected during controlled experimental field studies. These data sets usually include measurements of solute concentration and hydraulic conductivity (K), which are essential to properly characterizing subsurface heterogeneity and transport behavior. In the last 2 decades, for example, data collected from a tracer test site at the Columbus Air Force Base in Mississippi, commonly known as the Macrodispersion Experiment (MADE) site, have been invaluable for advancement of new transport theories and mathematical models [Zheng et al., 2011]. The importance of this site is mainly due to its extreme heterogeneity indicated by the high variance of the natural logarithm of the measured hydraulic conductivity K (σ²lnK ≈ 4.5) [Rehfeldt et al., 1992], which is significantly higher than that of other aquifers for which similar data sets exist [e.g., Mackay et al., 1986; LeBlanc et al., 1991].

[3] Three large-scale natural gradient tracer tests, usually referred to as MADE-1, MADE-2, and MADE-3 (also known as NATS) experiments, were conducted at the MADE site [Boggs, 1991; Boggs et al., 1993; Julian et al., 2001]. Measured concentrations revealed that transport behavior is characterized by highly asymmetric plumes, with significant mass accumulation near the source and extensive mass spreading to the far field. Several studies applied different modeling approaches to simulate the concentration distributions observed during these tracer tests [Adams and Gelhar, 1992; Eggleston and Rojstaczer, 1998; Berkowitz and Scher, 1998; Zheng and Jiao, 1998; Harvey and Gorelick, 2000; Feehley et al., 2000; Julian et al., 2001; Baeumer et al., 2001; Schumer et al., 2003; Barlebo et al., 2004; Salamon et al., 2007; Zhang and Benson, 2008; Llopis-Albert and Capilla, 2009]. All these studies have in common the conclusion that the classical advection-dispersion model (ADM) is not able to reproduce the transport behavior observed at the MADE site unless physical heterogeneity is adequately resolved. When small-scale variations of water flux due to aquifer heterogeneity are not explicitly described, the ADM underestimates the extensive spreading or “tailing” along the flow direction and is not able to reproduce the substantial mass accumulation near the injection points. This conclusion was further confirmed by two recent forced-gradient tracer tests [Liu et al., 2010; Bianchi et al., 2011].

[4] As initially suggested by Harvey and Gorelick [2000] and Feehley et al. [2000], a reasonable hypothesis to explain the failure of the ADM is the presence of a network of interconnected highly permeable sediments embedded in a less conductive matrix. This conceptual model of heterogeneity can in fact favor the fast movement of a fraction of solute mass along preferential flow paths (PFPs), while most of the mass stagnates in the matrix. This hypothesis was proposed after modeling results showed that the dual-domain mass transfer model can reproduce the transport behavior observed at the MADE site more accurately. The dual-domain model conceptualizes the aquifer as consisting of distinct, but coexisting, mobile and immobile domains, and this separation is particularly appropriate when reproducing transport in the presence of connected high-K structures embedded in a low-K matrix [Gorelick et al., 2005; Liu et al., 2004; Bianchi et al., 2008]. The efficacy of the dual-domain model in reproducing the solute plumes observed during the MADE-1 and MADE-2 experiments was considered by Harvey and Gorelick [2000] and Feehley et al. [2000] an indirect proof of the existence of a PFP network controlling solute transport at the MADE site. This hypothesis was also proposed by Julian et al. [2001] and more recently by Llopis-Albert and Capilla [2009]. Zheng and Gorelick [2003] investigated more specifically the transport behavior in a field characterized by a binary dendritic K distribution, generated using an invasion-percolation algorithm. Their numerical experiments offered support to the PFP network hypothesis by demonstrating that solute transport in a hypothetical networked K field displays highly non-Fickian characteristics similar to those observed at the MADE site.

[5] The traditional approach for reproducing hydraulic conductivity fields in heterogeneous porous media has been based on the assumption of a multivariate Gaussian distribution of lnK. With this approach, lnK is considered a spatially correlated random variable. An important characteristic of multi-Gaussian fields is that entropy (disorder) is maximized and therefore extreme values tend to cluster in isolated zones rather than being arranged in connected structures [Silliman and Wright, 1988; Rubin and Journel, 1991; Journel and Deutsch, 1993; Gómez-Hernández and Wen, 1998; Zinn and Harvey, 2003]. The multi-Gaussian approach has become popular due to its relative mathematical simplicity and easy interpretation. However, several studies have demonstrated the importance of connectedness rather than randomness in heterogeneous aquifers [e.g., Anderson, 1989; Sánchez-Vila et al., 1996; Koltermann and Gorelick, 1996; Webb and Anderson, 1996; Tsang and Neretnieks, 1998; Fogg et al., 1998, 2000].

[6] Fogg [1986], for example, suggested that groundwater flow in the Wilcox aquifer in Texas is controlled by the continuity and connectivity of large-scale sand bodies. Scheibe and Yabusaki [1998] demonstrated that methods for upscaling the K distribution, which lead to a good match between simulated and observed heads, may not be adequate to reproduce transport behavior because transport is strongly affected by the existence and connectivity of high-K zones. Labolle and Fogg [2001] recognized that the connectivity of highly permeable channel hydrofacies is the most important factor controlling solute migration in the alluvial system at the Lawrence Livermore National Laboratory (LLNL). The hydrostratigraphic reconstruction of the LLNL aquifer showed that about 80% of the channel hydrofacies forms a connected network that percolates in three dimensions. Proce et al. [2004] applied transition probability and sequential indicator simulations to simulate the assemblage of facies in a system of buried valley aquifers. A multiscale realization of aquifer heterogeneity showed the presence of interconnected pathways resulting from the significant connectivity of sand and gravel facies. The influence on groundwater flow patterns exerted by the distribution of different lithofacies with significant contrasts in K was also investigated by Heinz et al. [2003]. By using particle tracking calculations, they showed that sedimentary processes are responsible for the heterogeneities that determine local groundwater flow in aquifers. The importance of connectivity on fracture flow has also been acknowledged [e.g., Journel and Alabert, 1989; Tidwell and Wilson, 1999].

[7] Numerical studies most commonly have been used to quantify the connectivity of K fields. Gómez-Hernández and Wen [1998] analyzed groundwater travel times in four alternative unconditional representations of a 2-D synthetic K field sharing the same Gaussian histogram and covariance function, but different in terms of connectivity patterns. Results showed that travel times in the multi-Gaussian model could be 10 times slower than those observed in the other models. Western et al. [2001] applied connectivity functions to produce unconditional 2-D fields with almost identical histograms and omnidirectional variograms, but with very different connectivity. They concluded that standard geostatistical approaches based on variogram models do not properly capture connectivity. Zinn and Harvey [2003] showed that unconditional 2-D fields with connected structures can have the same lognormal probability density function and isotropic covariance function as multi-Gaussian fields without connected structures. Since rate-limited mass transfer may be a significant process in K fields with highly permeable connected structures, their results highlighted the importance of identifying the connectivity of porous media in order to choose the most appropriate transport model.

[8] One of the first attempts to establish some criteria for ranking K fields on the basis of their connectivity was made by Deutsch [1998]. The method is essentially based on measuring the number and the size of connected bodies in a 3-D Cartesian grid. More recently, Knudby and Carrera [2005] proposed and evaluated nine different indicators of connectivity in order to assess the possibility of predicting flow connectivity from statistical connectivity and, consequently, transport connectivity from flow connectivity. From the lack of correlation between indicators measuring different types of connectivity, they concluded that it is a process-dependent concept. Lee et al. [2007] used 3-D data from a real aquifer to investigate connectivity in the LLNL aquifer. Several realizations of aquifer heterogeneity were generated using sequential Gaussian simulation (SGS) and transition probability indicator simulation (T-PROGS). Simulated K fields were also used as input to a groundwater flow model to simulate a pumping test performed at the LLNL site. Measures of spatial connectivity showed that the network of high-K is characterized by greater lateral connectivity in the T-PROGS realizations compared to the SGS fields. T-PROGS realizations were also more accurate in reproducing the observed drawdown. Vassena et al. [2009] investigated the effects of facies heterogeneity on flow and transport in small blocks (1 m³) sampled from the alluvial sediments of the Ticino valley in Italy. Numerical tracer experiments in these systems combined with the use of connectivity indicators suggested that transport and statistical connectivity indicators are correlated with dispersivity. In a recent study of MADE site single-well injection-withdrawal test data, Ronayne et al. [2010] showed that intrafacies heterogeneity is responsible for local-scale mass transfer based on a hybrid model that combines 3-D lithofacies to represent submeter connected channels in a matrix based on a correlated multivariate Gaussian hydraulic conductivity field.

1.2. Objectives

[9] The main objective of the present study is to investigate connectivity in a small block of the aquifer at the MADE site and the role of that connectivity in advective transport. To attain this goal, 3-D conditional realizations of the geologic heterogeneity were first generated using three different geostatistical methods including sequential Gaussian simulation, sequential indicator simulation and transition probability indicator simulation. Geostatistical realizations are conditioned to K values estimated through the grain size analyses of 19 newly collected cores (20 cores were initially collected, but one was found to be unusable due to incomplete depth records). This approach differs from our previous studies [Gorelick et al., 2005; Liu et al., 2007; Bianchi et al., 2008] in which connectivity and solute transport were investigated in synthetic aquifers characterized by a dendritic distribution of the high-K zones generated using an unconditional invasion-percolation algorithm.

[10] Since at present there is no agreement in the scientific community about which geostatistical method best represents connected features in heterogeneous aquifers, we chose to apply three of the most commonly used methods in order to obtain a wider spectrum of possible representations of the aquifer heterogeneity. In this way we also tried to reduce the possibility that our conclusions regarding the connectivity within the aquifer are biased by the characteristics of a particular geostatistical method. Particle tracking calculations were used to assess the influence of connectivity on advective transport and to analyze the geometric characteristics of the connected pathways. Indicators similar to those presented by Deutsch [1998] and Knudby and Carrera [2005] were then used to quantify the connectivity of each of the generated K fields. We also present an exceptionally detailed 3-D data set that provides a close representation of the heterogeneity of an actual aquifer block. A further distinguishing attribute of our study, made possible by this unique data set, is that we considered connectivity in 3-D geostatistical realizations conditioned to K measurements.

[11] The remainder of the paper is organized as follows. After a brief description in section 2 of the hydrogeological setting of the MADE site aquifer, section 3 presents the core sampling method, the grain size analysis used to estimate the vertical distribution of K of each core segment, and the descriptive statistics of the collected data set. Section 4 discusses the geostatistical methods for generating 3-D K fields using the cores data, the groundwater flow and particle transport models, and the parameters used for measuring spatial connectivity. Section 5 illustrates and discusses the nature of the connectivity of the studied portion of the MADE aquifer based on the results of the geostatistical conditional simulations, the characteristics of simulated breakthrough curves and particle paths, and the values of the connectivity indicators. Finally, section 6 presents general insights drawn from this study.

2. Site Description

[12] The hydrostratigraphic setting of the MADE site is characterized by a shallow unconfined aquifer, averaging about 11 m in thickness, underlain by an aquitard unit represented by the marine clay-rich deposits of the Eutaw formation. The aquifer is composed of poorly sorted to well-sorted sandy gravel and gravelly sand with small amounts of silt and clay. Extensive sampling of the aquifer [Boggs et al., 1990, 1992] revealed a predominance of sandy to gravelly clay deposits in the surficial 2 m, overlying an interval, about 8 m in thickness, characterized by sandy gravel and gravelly sand. A deeper interval of about 1 m to 2 m in thickness, composed of a mixture of sand and fine sediments, characterizes the bottom part of the aquifer and represents the transition from alluvial facies to the marine sediments of the Eutaw formation. Recently, Bowling et al. [2005] used ground-penetrating radar and direct current resistivity data, integrated with previously collected borehole flowmeter measurements and sediment cores [Boggs, 1991; Boggs et al., 1993], to assemble a hydrogeological conceptual model of the aquifer consisting of four major hydrostratigraphic units: a meandering fluvial system at the top (from 0 to about 3 m in depth), a braided fluvial system in the middle portion of the aquifer (from 3 m to 10 m), fine-grained sands at the bottom of the aquifer, and the underlying clay aquitard.

[13] Rehfeldt et al. [1992] investigated the spatial distribution of K by performing flowmeter tests in 66 fully penetrating wells distributed in an area of approximately 90 m × 270 m. The vertical spacing of the measurements within each flowmeter well was approximately 15 cm. Statistical analysis of the collected data (nearly 2500 measurements) indicated a lognormal distribution of K and an extremely high heterogeneity. The geometric mean of K derived from the flowmeter measurements is approximately 5 × 10⁻³ cm/s, while the variance of lnK is 4.5.

3. Data Collection

3.1. Soil Core Collection and Grain Size Analysis

[14] The soil cores analyzed in this study were collected with a Geoprobe® core sampler (model 5410) using direct-push technology to minimize the disturbance of the samples and preserve the actual heterogeneity and structure of the aquifer. The cores were collected from a portion of the MADE site aquifer that was previously investigated by the borehole flowmeter measurements performed during the MADE-1 and MADE-2 experiments (Figure 1). In particular, the new cores are representative of the block located in the proximity of boreholes K-41, K-42, and K-45 as labeled by Boggs [1991], Rehfeldt et al. [1992], and Boggs et al. [1993]. Coring was performed over a 4 m by 4 m area using a regularly spaced sampling grid with a new well installed at the center (Figure 2). Soil cores (5.08 cm in diameter) are spaced 1 m apart and located within 2 m of the well. They were drilled in 1.22 m sections, and at least five sections were collected for each core in order to sample over the depth interval from the water table to the bottom of the aquifer. Immediately after removal from the Geoprobe® core sampler, the cores were sealed to prevent moisture loss and placed in an on-site freezer stocked with dry ice. They were then transported to the laboratory where they remained frozen until grain size analysis was performed.

Figure 1.

Map of the Macrodispersion Experiment (MADE) site showing the location of the flowmeter boreholes (open circles), the location of injection wells used during the MADE-1 and MADE-2 experiments (solid square), and the area investigated in this study. Boreholes are labeled as in Boggs [1991], Rehfeldt et al. [1992], and Boggs et al. [1993]. The Y axis is rotated 12° counterclockwise from the north.

Figure 2.

Location of the 19 cores collected for grain size analysis and hydraulic conductivity estimation. IW indicates the injection well installed at the center of the sampling grid. Core 3 (not shown) was collected, but grain size data are not usable due to missing depth information.

[15] The frozen 1.22 m long core sections were horizontally dissected into approximately 5 cm segments with a masonry block saw under frozen conditions. At the beginning of the grain size analysis, core segments were wet sieved to determine the weight percentage of silt and clay. During this step, segments were first dried at 105°C in an oven for a 24 h period to eliminate any residual moisture in the sample and then cooled for 1 h before the initial weight was recorded. Using a 230-mesh sieve screen (63 μm opening) as a filter, segments were rinsed with water for 5 min and placed back in the oven for a second 24 h drying period. Segments were weighed again after cooling, and the amount of silt and clay was calculated as the difference between the two measured weights. The fine sand, medium sand, coarse-very coarse sand, and gravel fractions of the core segments were determined using a stack of sieves with 230 mesh (63 μm opening), 60 mesh (250 μm), 35 mesh (500 μm), and 10 mesh (2 mm) screens.

[16] Grain size analysis confirmed the significant heterogeneity of the sediments constituting the aquifer at the MADE site. In all 19 cores, the weight percentages of fines, of fine, medium, and coarse sand, and of gravel fluctuate significantly along the vertical extent of the cores (Figure 3). The average percentage of fines (<0.063 mm) is around 8% with a tendency to increase toward shallow depths. In all the cores, fines are not organized into uniform silt/clay layers but are disseminated throughout the sands. Gravel content is highly variable, ranging from almost 0% to 95%. The diameters of the soil particles at 10% (d10) and 60% (d60) cumulative weights, the coefficient of uniformity (Cu = d60/d10), and the porosity of each core segment were determined from the grain size distribution curves. In particular, the porosity was estimated using the following empirical relation [Kasenow, 2002]:

$$n = 0.255\left(1 + 0.83^{C_u}\right) \qquad (1)$$

Considering all the cores, the average values for the d10 and d60 are 0.40 mm and 6.16 mm, respectively. The high d60 value indicates the relative coarseness of this sector of the MADE aquifer. The average Cu is equal to 16.56, which is typical of poorly sorted sediments, while the average porosity is 0.28. This value is comparable to the porosity (0.32) determined from the analysis of previously collected soil cores [Boggs et al., 1992].
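As a rough plausibility check on these averages, the sketch below computes the uniformity coefficient and a porosity estimate. It assumes that the empirical porosity relation has the form n = 0.255(1 + 0.83^Cu); the function names are hypothetical, and evaluating the relation at average values (rather than segment by segment, as in the study) is only illustrative.

```python
def uniformity_coefficient(d10_mm: float, d60_mm: float) -> float:
    """Coefficient of uniformity, Cu = d60 / d10 (dimensionless)."""
    if d10_mm <= 0:
        raise ValueError("d10 must be positive")
    return d60_mm / d10_mm


def porosity_from_cu(cu: float) -> float:
    """ASSUMED empirical form: n = 0.255 * (1 + 0.83**Cu)."""
    return 0.255 * (1.0 + 0.83 ** cu)


# Ratio of the reported average diameters (not the average of the ratios).
cu_from_averages = uniformity_coefficient(0.40, 6.16)
# Porosity evaluated at the reported average Cu.
n_at_average_cu = porosity_from_cu(16.56)

print(f"Cu from average d10 and d60: {cu_from_averages:.1f}")
print(f"Porosity at Cu = 16.56:      {n_at_average_cu:.3f}")
```

The ratio of the average diameters (about 15.4) need not equal the average of the segment ratios (16.56), and because the assumed relation is nonlinear in Cu, the porosity evaluated at the average Cu (about 0.27) need not equal the average of the segment porosities (0.28).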

Figure 3.

Cumulative grain size distribution measured in core 5. Colors represent relative abundance of fines (F), fine sand (Sf), medium sand (Sm), coarse sand (Sc), and gravel (G).

3.2. Hydraulic Conductivity Data and Descriptive Statistics

[17] The hydraulic conductivity of each 5 cm core segment was not measured with permeameter or in situ methods. Instead, approximate values of K were estimated using empirical equations based on the data determined from the grain size analysis of the core segments.

[18] These can be generalized as

$$K = \frac{g}{\nu}\, C_s\, f(n)\, d_e^{2} \qquad (2)$$

where g is the gravitational acceleration, ν is the kinematic viscosity, Cs is the sorting coefficient, f(n) is a function dependent on the porosity, and de is an effective grain diameter. Several forms of equation (2) can be used that differ in the values assigned to the parameters Cs, f(n), and de. After comparison, we selected K estimates based on the Hazen formulation [Hazen, 1892], where Cs is equal to 6 × 10⁻⁴, f(n) is equal to [1 + 10(n − 0.26)], and de is equal to d10. Although the Hazen equation is usually recommended for well-sorted sediments with Cu < 5, we did not find a significant difference between the estimates calculated using the Hazen equation and those from other equations [e.g., Breyer, 1964]. We recognize that petrophysical models explicitly accounting for infilling of pores by fine particles suggest a substantial reduction in hydraulic conductivity values [Koltermann and Gorelick, 1995, 1996; Conrad et al., 2008]. However, we believe that the relative variations of hydraulic conductivity are captured by the above formula. We proceed with this caveat.
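The Hazen parameterization described above can be sketched as follows. The sketch assumes the generalized form K = (g/ν) Cs f(n) de², with de = d10 expressed in centimeters so that K is in cm/s. The kinematic viscosity value is an assumption (water near 15°C), since the temperature used in the study is not stated, and the function name is hypothetical.

```python
G_CM_S2 = 981.0      # gravitational acceleration, cm/s^2
NU_CM2_S = 0.0114    # ASSUMED kinematic viscosity of water near 15 degC, cm^2/s
C_HAZEN = 6.0e-4     # sorting coefficient Cs given in the text


def hazen_k(d10_mm: float, porosity: float, nu: float = NU_CM2_S) -> float:
    """K in cm/s from K = (g / nu) * Cs * [1 + 10 (n - 0.26)] * d10**2."""
    d10_cm = d10_mm / 10.0                      # mm -> cm
    f_n = 1.0 + 10.0 * (porosity - 0.26)        # Hazen porosity function
    return (G_CM_S2 / nu) * C_HAZEN * f_n * d10_cm ** 2


# Illustrative call with the average d10 and porosity reported for the cores.
k_example = hazen_k(d10_mm=0.40, porosity=0.28)
print(f"K for d10 = 0.40 mm, n = 0.28: {k_example:.3f} cm/s")
```

With these inputs the estimate is on the order of 0.1 cm/s. Because K scales with the square of d10, doubling d10 quadruples the estimate, which is why the method resolves relative contrasts between segments even if absolute values are biased.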

[19] The grain-size derived data set presented in this work consists of 1740 estimates of hydraulic conductivity distributed in a 4 m × 4 m × 6 m sector of the MADE site aquifer (Figure 4). The frequency distribution and the univariate statistics of the ln-transformed K values are illustrated in Figure 5. The mean and variance of the lnK data are equal to −3.48 (mean K = 0.28 cm/s) and 4.35, respectively. The minimum of 8.68 × 10⁻⁴ cm/s is observed at a depth of 4.1 m in core 17, while the maximum of 7.96 cm/s is located at 5.77 m below ground surface in core 2. The comparison between the normal probability plot of the cumulative distribution function of the data and that of a normal distribution having the same mean and variance (Figure 6) shows that the distribution of the lnK data approximates normality for values ranging from −6.5 to −2 but deviates significantly at both tails of the distribution.
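The relation between the lnK statistics and the corresponding K values can be checked with a few lines of arithmetic. The sketch below uses only the values quoted in the paragraph above; the lognormal mean is computed under the assumption that lnK is exactly normal, which the tails of the data do not strictly satisfy.

```python
import math

MEAN_LNK = -3.48   # mean of ln K (K in cm/s), from the text
VAR_LNK = 4.35     # variance of ln K, from the text

# Geometric mean of K is the exponential of the mean of ln K.
geometric_mean_k = math.exp(MEAN_LNK)

# Arithmetic mean of K IF ln K were exactly normal (an idealization).
lognormal_mean_k = math.exp(MEAN_LNK + 0.5 * VAR_LNK)

# Extremes of the data expressed in ln units.
ln_k_min = math.log(8.68e-4)
ln_k_max = math.log(7.96)

print(f"geometric mean K:          {geometric_mean_k:.3f} cm/s")
print(f"lognormal arithmetic mean: {lognormal_mean_k:.2f} cm/s")
print(f"ln K range:                {ln_k_min:.2f} to {ln_k_max:.2f}")
```

The geometric mean is about 0.03 cm/s, whereas the idealized lognormal arithmetic mean is about 0.27 cm/s. The latter is close to the reported mean K of 0.28 cm/s, which suggests (though the text does not state it) that the reported value is an arithmetic mean.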

Figure 4.

Three-dimensional spatial distribution of the 1740 hydraulic conductivity values estimated from grain size analysis of the core segments. Cores are labeled as in Figure 2.

Figure 5.

Frequency distribution and descriptive statistics of the lnK estimates.

Figure 6.

Normal probability plot of the cumulative distribution function of the lnK estimates.

[20] To qualitatively test the reliability of the grain size analysis, we compared the distribution and the statistics of the K estimates with those of the 2483 log-transformed flowmeter measurements of Rehfeldt et al. [1992]. The mean of the log-transformed flowmeter data is −5.23, significantly lower than that of the lnK estimates presented in this study. This difference can be explained by the nonstationarity of the K field at the MADE site, which is shown by the distribution of depth-averaged flowmeter measurements. This distribution shows that the MADE site is characterized by zones of lower conductivity at the south and northeast ends and by a zone of higher conductivity near the center of the site [Boggs et al., 1990; Rehfeldt et al., 1992; Bowling et al., 2005]. The block of aquifer investigated in this study is located approximately at the boundary between a zone with K ranging from 10⁻² cm/s to 10⁻¹ cm/s and a zone characterized by lower K with values ranging from 10⁻³ cm/s to 10⁻² cm/s. The discrepancies between the flowmeter data and the K estimates obtained in this study can also be related to the empirical method used to determine K from the grain size analysis. It has in fact been shown that the Hazen equation tends to overestimate K in heterogeneous poorly sorted sediments [Carrier, 2003]. However, it is important to clarify that for the purpose of this study, our primary concern was measuring relative, highly resolved variations of K rather than absolute values. In this sense, it is noteworthy that the two data sets have almost the same variance, indicating that our K estimates are representative of the actual level of heterogeneity of the MADE site aquifer.

4. Geostatistical Analysis and Flow Modeling

4.1. Geostatistical Analysis

[21] Two variogram-based approaches, sequential Gaussian simulation (SGS) and sequential indicator simulation (SIS), and the transition probability approach (T-PROGS) were used to generate conditional realizations of the lnK field. Realizations were conditioned to the 1740 K values estimated from the grain size analysis of the cores. The interpolated domain is 6 m long by 6 m wide, with a thickness of 6.2 m, and was discretized with an interpolation grid with resolution of 20 cm in the horizontal directions (x and y) and 10 cm in the vertical direction (z). A total of 60 conditional realizations of the K distribution, consisting of 20 realizations for each simulation method, were generated.
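The size of the simulation grid implied by these dimensions can be worked out directly. The sketch below only restates the numbers in the paragraph above; the helper name is hypothetical, and the conditioning fraction is an upper bound because several 5 cm core segments may fall within a single 10 cm thick cell.

```python
def n_cells(extent_m: float, spacing_m: float) -> int:
    """Number of cells along one axis, rounded to the nearest integer."""
    return round(extent_m / spacing_m)


nx = n_cells(6.0, 0.20)   # x direction, 20 cm cells
ny = n_cells(6.0, 0.20)   # y direction, 20 cm cells
nz = n_cells(6.2, 0.10)   # z direction, 10 cm cells
total_cells = nx * ny * nz

# Upper bound on the share of cells that hold a conditioning datum.
max_conditioned_fraction = 1740 / total_cells

print(f"grid: {nx} x {ny} x {nz} = {total_cells} cells per realization")
print(f"at most {max_conditioned_fraction:.1%} of cells are conditioned")
```

Each realization therefore contains 55,800 cells, of which only a few percent can coincide with core data; the remainder is filled by the geostatistical model.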

4.1.1. Sequential Gaussian Simulation (SGS)

[22] With the SGS approach the estimate of a normal random variable Z at any point in space is calculated by randomly selecting a value drawn from the normal distribution defined by the kriging mean and variance. A specified number of data and previously simulated values are allowed to condition each new simulated point. To honor the assumption of normality, the original lnK estimates were initially transformed to their normal scores. The spatial correlation of the normal scores was then evaluated using the experimental semivariogram. As in other geostatistical analyses at the MADE site [Rehfeldt et al., 1992; Salamon et al., 2007; Llopis-Albert and Capilla, 2009], spatial correlation was investigated only in the horizontal (dip = 0°) and vertical directions (dip = 90°). In this way we assumed that the depositional structures responsible for the variability of K are horizontal. Directional anisotropy in the horizontal plane was also investigated, but variograms did not show preferential directions of spatial correlation. The experimental variograms were fitted with an exponential isotropic model with nugget, sill, and effective range equal to 0.2, 0.8, and 0.8 m, respectively. Isotropic conditions are indicated by the similarity between the omnidirectional horizontal, the omnidirectional, and the vertical experimental variograms presented in Figure 7. This result contrasts with the results of the geostatistical analyses of the flowmeter data, which indicated that the aquifer at the MADE site is characterized by anisotropy. However, the scales of concern are quite different. As shown by Rehfeldt et al. [1992] and Salamon et al. [2007], the horizontal correlation scale of the flowmeter measurements is about 40 m and therefore significantly larger than the dimension of the sampled block (4 m by 4 m). Hence, it is possible that we were not able to sample significant variations in grain size within our sampling block that can lead to an anisotropic behavior of the experimental variogram. Moreover, the formula used to estimate K from the grain size analysis does not consider the effects of the ratio of sand to gravel. Since this ratio varies significantly in the vertical direction (Figure 3), it is possible that the K estimates based on the d10 do not represent the “true” vertical variations. Sequential Gaussian conditional realizations were generated with the SGSIM code of the Geostatistical Software Library (GSLIB) [Deutsch and Journel, 1998]. Simulations were performed in normal-score space and then the simulated values were back transformed.
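Two ingredients of this workflow, the normal-score transform and the fitted exponential variogram, can be sketched in a few lines. This is an illustration, not the SGSIM implementation. It assumes that the reported sill of 0.8 is the structured contribution added to the nugget of 0.2 (giving a total sill of 1 for unit-variance normal scores) and that the effective range follows the common convention in which the model reaches about 95% of its sill at that distance. The sample values and function names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal-score transform (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, idx in enumerate(order):
        # Plotting position (rank + 0.5) / n keeps probabilities inside (0, 1).
        scores[idx] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores


def exponential_variogram(h, nugget=0.2, sill=0.8, effective_range=0.8):
    """gamma(h) = c0 + c1 * (1 - exp(-3 h / a)) for h > 0; gamma(0) = 0."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-3.0 * h / effective_range))


lnk_sample = [-7.0, -5.2, -3.5, -3.4, -1.0, 2.0]   # made-up ln K values
scores = normal_scores(lnk_sample)
gamma_at_range = exponential_variogram(0.8)

print([round(s, 2) for s in scores])
print(f"gamma at the effective range: {gamma_at_range:.3f}")
```

Under these assumptions the model variogram at a lag of 0.8 m is already within a few percent of the total sill, so cells more than about one meter apart are essentially uncorrelated in the SGS realizations.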

Figure 7.

Omnidirectional, omnidirectional horizontal, and vertical experimental variograms of the normal scores of the lnK estimates. The exponential model used for the sequential Gaussian simulation (SGS) realizations is also shown. Experimental variograms are calculated for log-K values.

4.1.2. Sequential Indicator Simulation (SIS)

[23] Unlike the SGS method, which assumes a multi-Gaussian spatial distribution, the SIS method does not require a particular type of distribution of lnK. Even though studies have suggested that, because of this characteristic, the SIS approach can provide a better representation of connected structures [e.g., Journel and Alabert, 1989; Rubin and Journel, 1991; Koltermann and Gorelick, 1996; Anderson, 1997; Gómez-Hernández and Wen, 1998], results are not conclusive [e.g., Blöschl, 1996]. With the indicator approach, the distribution of the lnK estimates, which was considered a continuous variable, was discretized into mutually exclusive classes bounded by the thresholds zk (Table 1). In this work, the thresholds correspond to the nine deciles of the univariate distribution of the lnK estimates. This choice of cutoff values is the most commonly used [Isaaks and Srivastava, 1989] and was also applied by Salamon et al. [2007] and Llopis-Albert and Capilla [2009] to simulate conditional K fields at the MADE site using the flowmeter measurements. As for the normal scores, the analysis of the nine indicator variograms revealed isotropic behavior, and for each threshold the model that best represents spatial continuity is an exponential function. Nugget, sill, and range values of the nine indicator variogram models are presented in Table 1. The effective ranges of the exponential models are between 0.4 m and 0.8 m, and the spatial continuity increases from the 0.2 decile to the 0.6 decile and then slightly decreases until the 0.9 decile. Sequential indicator conditional realizations were generated with the code SISIM of GSLIB [Deutsch and Journel, 1998].
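The indicator coding that precedes the variogram analysis can be illustrated as follows. This is a minimal sketch, not the SISIM implementation: the lnK values are made up to stand in for the 1740 estimates, the function names are hypothetical, and the cumulative convention (indicator equal to 1 when the value does not exceed the threshold) is assumed.

```python
from statistics import quantiles


def decile_thresholds(values):
    """Nine cut points z_k that split the data into ten equal-probability classes."""
    return quantiles(values, n=10)


def indicator_transform(value, thresholds):
    """Cumulative indicators: 1 if value <= z_k, else 0, for each threshold."""
    return [1 if value <= z else 0 for z in thresholds]


# Made-up ln K values spanning roughly the observed range (-7 to 2).
lnk_sample = [-7.0 + 0.09 * i for i in range(101)]

z_k = decile_thresholds(lnk_sample)
indicators = [indicator_transform(v, z_k) for v in lnk_sample]

# Share of data at or below each threshold; close to 0.1, 0.2, ..., 0.9.
proportions = [sum(column) / len(lnk_sample) for column in zip(*indicators)]
print([round(p, 2) for p in proportions])
```

Each of the nine indicator columns is then treated as a separate binary variable with its own variogram, which is what allows the spatial continuity to differ between low, intermediate, and high K classes.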

Table 1. Geostatistical Parameters of the Isotropic Variogram Models Used for the Sequential Indicator Simulations^a
zk | Probability Density Function Values | Model | Effective Ranges (m) | Nugget c0 | Sill c1
^a The nine thresholds zk correspond to the deciles of the probability density function of the lnK estimates.


4.1.3. Transition Probability Indicator Simulation (T-PROGS)

[24] We also applied the transition probability approach (T-PROGS) based on the Markov chain model [Carle and Fogg, 1996, 1997; Carle et al., 1998]. T-PROGS is an indicator approach that can model the full transition probability function, including cross correlations representing commonly observed geologic juxtapositional tendencies with a small number of parameters and a single Markov Chain equation in each direction. The method can also be used to transparently account for quantitative and qualitative geologic information, such as volumetric proportions and mean lengths of categorical classes and depositional facies relationships [Carle et al., 1998; Ritzi, 2000; Ritzi et al., 2004; Dai et al., 2005]. The transition probability approach uses transition probabilities to represent the spatial structure of the data. They are defined as the probability that a certain category j occurs at the location u + h conditioned to the occurrence of another category i at the location u. Considering a number N of categories or classes, the transition probabilities between classes can be modeled by a 1-D Markov chain model in the form [e.g., Carle and Fogg, 1996]

$$T(h_\phi) = \exp(R_\phi h_\phi)$$

where T(hϕ) is the N × N matrix of transition probabilities, hϕ is the distance or lag in the direction ϕ, and Rϕ is the transition rate matrix whose elements rij represent the rate of change from class i to class j, per unit length, in the direction ϕ. Conditional simulations are generated in two steps consisting of a preliminary generation of the distribution of the categorical variable by using sequential indicator simulation and a simulated quenching stage to improve the agreement between measured and modeled transition probabilities. Thorough descriptions of the algorithm are given by Carle and Fogg [1996, 1997] and Carle et al. [1998].
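As an illustration of the 1-D Markov chain model, the sketch below evaluates transition probabilities from a rate matrix, assuming the matrix-exponential form T(h) = exp(R h) of Carle and Fogg [1996]. The three-class rate matrix is hypothetical, and the small pure-Python `expm` helper is only a stand-in for a library routine; none of this reproduces the T-PROGS code.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m, terms=20, squarings=10):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scale = 2 ** squarings
    a = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result


def transition_probabilities(rate_matrix, lag):
    """T(h) = exp(R h) for a lag h in the same length unit as 1/R."""
    return expm([[r * lag for r in row] for row in rate_matrix])


# Hypothetical transition rates (1/m); each row sums to zero
R = [[-2.0, 1.5, 0.5],
     [1.0, -2.5, 1.5],
     [0.5, 1.0, -1.5]]
T = transition_probabilities(R, 0.2)
```

Because each row of R sums to zero, each row of T(h) sums to one, and T(0) is the identity matrix.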

[25] Since the transition probability approach requires categorical variables, the 1740 lnK estimates were distributed in five mutually exclusive classes (Figure 8) and a single value of lnK was assigned to each class. In detail, the 0.1, 0.3, 0.7, and 0.9 deciles were chosen as the limits of the five classes (Figure 8), and the mean of the lnK estimates falling within the cutoffs was considered as the homogeneous lnK value assigned to each class. The deciles of the lnK distribution, including those delimiting the T-PROGS classes, are tabulated in Table 1. Transiograms were calculated in the vertical and horizontal directions and then a 3-D Markov chain model was fitted to the sample transiograms. As with the variogram-based approaches, isotropy is assumed. The transition probabilities of the five lnK classes are shown in Figure 9, where the solid line represents the Markov chain model used to generate the conditional realizations of the lnK field. Transition probability analysis and stochastic simulations were performed using the software T-PROGS [Carle, 1999].
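The zonation step can be sketched as follows. Only the −0.34 cutoff (the 0.9 decile quoted in section 4.3) comes from the text; the other three cutoffs and the lnK values are hypothetical, and placing a value equal to a cutoff in the upper class is an assumption.

```python
from bisect import bisect_right
from statistics import mean, pvariance


def assign_classes(values, cutoffs):
    """Class index 0..len(cutoffs); a value equal to a cutoff goes to the upper class."""
    return [bisect_right(cutoffs, v) for v in values]


def zoned_field(values, cutoffs):
    """Replace every value by the mean of the data falling in its class."""
    classes = assign_classes(values, cutoffs)
    class_mean = {k: mean(v for v, c in zip(values, classes) if c == k)
                  for k in set(classes)}
    return classes, [class_mean[c] for c in classes]


# Hypothetical cutoffs standing in for the 0.1, 0.3, 0.7, and 0.9 deciles
cutoffs = [-6.5, -4.8, -2.2, -0.34]
lnk = [-7.5, -6.9, -5.5, -5.0, -4.1, -3.6, -3.0, -2.5, -1.8, -1.0, -0.3, 0.2]
classes, zoned = zoned_field(lnk, cutoffs)
```

Averaging within classes preserves the overall mean but removes the within-class variability, so the variance of the zoned values cannot exceed that of the original data.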

Figure 8.

Quantile plot of the lnK estimates showing the thresholds used to define the classes of homogeneous conductivity used for the conditional T-PROGS realizations. The numbers within parentheses are the mean values of the data falling within the thresholds and represent the lnK assigned to each class.

Figure 9.

Vertical-direction transition probabilities for the classes defined in Figure 8. Plots on the main diagonal represent autotransitions between like classes, while plots off the main diagonal represent cross transitions between different classes. The isotropic Markov chain model was used for the conditional transition probability indicator simulation (T-PROGS) realizations.

4.2. Groundwater Flow and Advective Transport Modeling

[26] Groundwater flow and advective transport were simulated for each of the generated 3-D conductivity fields. Steady state flow was simulated with MODFLOW [Harbaugh et al., 2000] using a 3-D finite difference grid with the same dimensions and resolution as the interpolation grid. No-flow boundary conditions were imposed on the four faces perpendicular to the x and z axes (Figure 10). Specified heads were assigned to the other two boundaries perpendicular to the y axis to impose a mean hydraulic gradient of 0.01 across the entire domain. The simulated groundwater flow direction was oriented approximately as the actual average flow direction at the MADE site as indicated by potentiometric surface maps derived from head measurements [Boggs et al., 1992].

Figure 10.

Schematic showing the geometry and boundary conditions of the model used for groundwater flow and particle tracking simulations. The modeling domain is 6 m wide (x direction), 6 m long (y direction), and 6.2 m thick (z direction). The finite difference grid used for the simulations has a regular grid spacing of 0.2 m in the x and y directions and 0.1 m along the z direction.

[27] Flow paths and particle travel times in the simulated flow fields were calculated using MODPATH [Pollock, 1994]. Only advective transport was simulated, while the effects of mechanical dispersion and molecular diffusion were not considered. A uniform porosity equal to 0.28 was assigned to the model domain. At the beginning, 15,624 particles were evenly distributed (eight for each cell) across the upstream face of the model domain. This configuration mimics an instantaneous, distributed source, and each particle is viewed as a solute particle having an indivisible mass. Spatially averaged breakthrough curves (BTCs) were then calculated by counting the number of particles exiting the system at the face located down gradient with respect to the starting positions. In order to smooth the shape of the BTCs, time intervals of 5 h were used to count the particles. Since the specified heads and no-flow conditions are directly adjacent to the study volume, the influence of the no-flow boundaries on the particle path geometry and distribution was also assessed. For all the realizations, this influence is very limited, and only very few particle paths are the expression of unnatural fluxes on the boundary faces.
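A minimal sketch of how a spatially averaged BTC might be assembled from particle arrival times with 5 h counting intervals is given below. The arrival times are hypothetical, and normalizing by the total number of particles follows the definition of normalized concentration used for the BTC figure; this is not the postprocessing code used in the study.

```python
def breakthrough_curve(arrival_hours, bin_hours=5.0):
    """Fraction of particles exiting the domain in each counting interval."""
    n = len(arrival_hours)
    n_bins = int(max(arrival_hours) // bin_hours) + 1
    counts = [0] * n_bins
    for t in arrival_hours:
        counts[int(t // bin_hours)] += 1
    return [c / n for c in counts]


# Hypothetical arrival times in hours
arrivals = [1.0, 4.0, 6.0, 12.0, 12.5, 30.0]
btc = breakthrough_curve(arrivals)
```

Wider intervals smooth the curve at the cost of temporal resolution, which is the trade-off behind the 5 h choice.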

4.3. Measures of Spatial Connectivity

[28] Two groups of indicators were used to quantify connectivity in the 3-D hydraulic conductivity fields. The first group refers to a set of parameters that measure connectivity based on the spatial characteristics of clusters of a certain categorical class or geologic facies [Deutsch, 1998]. In this work, we focused on the connectivity of high-K zones defined by clusters of face-connected cells (i.e., sharing a face) with lnK equal to or higher than the 0.9 decile of the data distribution. The 0.9 decile of the lnK estimates (−0.34 corresponding to 0.71 cm/s) is also the threshold used to define the class with the highest conductivity for the SIS and T-PROGS realizations. Connectivity of the high-K clusters was evaluated by calculating the following parameters: total number of high-K clusters and their volume fraction; dimension (number of cells) of the largest connected high-K cluster and its specific surface (surface area per unit volume); dimension (number of cells) of the second largest connected high-K cluster; and fraction of single cell high-K clusters.

[29] The code GEO_OBJ [Deutsch, 1998] was used to identify the high-K clusters and to calculate these quantities. Since the considered high-K clusters occupy a volume proportion of 0.10, they are not expected to fully percolate, meaning that they are not expected to span any two opposing domain boundaries. This is because the volume proportion of the high-K cluster is less than the percolation thresholds measured for correlated three-dimensional cubic lattice media [Harter, 2005; Guin and Ritzi, 2008].
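The cluster statistics listed above can be illustrated with a small face-connectivity search. This is a sketch under stated assumptions, not the GEO_OBJ algorithm: the 2 × 3 × 3 lnK block is hypothetical, the function names are illustrative, and the specific surface is omitted.

```python
from collections import deque

FACE_STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cluster_sizes(high):
    """Sizes (in cells) of face-connected clusters in a [z][y][x] boolean grid."""
    nz, ny, nx = len(high), len(high[0]), len(high[0][0])
    seen = set()
    sizes = []
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                if not high[k][j][i] or (k, j, i) in seen:
                    continue
                seen.add((k, j, i))
                queue = deque([(k, j, i)])
                size = 0
                while queue:  # breadth-first search over shared faces
                    ck, cj, ci = queue.popleft()
                    size += 1
                    for dk, dj, di in FACE_STEPS:
                        nb = (ck + dk, cj + dj, ci + di)
                        inside = 0 <= nb[0] < nz and 0 <= nb[1] < ny and 0 <= nb[2] < nx
                        if inside and high[nb[0]][nb[1]][nb[2]] and nb not in seen:
                            seen.add(nb)
                            queue.append(nb)
                sizes.append(size)
    return sorted(sizes, reverse=True)


def summarize(high):
    """Cluster indicators; assumes the grid holds at least one high-K cell."""
    sizes = cluster_sizes(high)
    n_cells = len(high) * len(high[0]) * len(high[0][0])
    return {
        "n_clusters": len(sizes),
        "volume_fraction": sum(sizes) / n_cells,
        "largest": sizes[0],
        "second_largest": sizes[1] if len(sizes) > 1 else 0,
        "single_cell_fraction": sum(1 for s in sizes if s == 1) / len(sizes),
    }


LNK_CUTOFF = -0.34  # 0.9 decile of the lnK estimates quoted in the text
lnk = [  # hypothetical block indexed [z][y][x]
    [[0.1, 0.0, -2.0], [-3.0, -0.2, -4.0], [-5.0, -1.0, 0.3]],
    [[-2.5, -1.2, -3.3], [-0.9, -0.1, -2.2], [0.2, -4.1, -0.3]],
]
high = [[[v >= LNK_CUTOFF for v in row] for row in layer] for layer in lnk]
stats = summarize(high)
```

Cells that touch only along an edge or at a corner are treated as separate clusters, which matches the face-connected definition adopted here.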

[30] The second group of connectivity indicators are measures of flow and transport connectivity and includes some of the parameters presented by Knudby and Carrera [2005], here indicated as CI1 and CI2. In this work we refer to flow and transport connectivity in the sense originally used by the same authors such that flow connectivity indicates the presence of preferential flow paths and transport connectivity indicates the existence of fast paths allowing early solute arrival. The first indicator CI1 is defined as [Knudby and Carrera, 2005, equation (6)]

$$CI_1 = \frac{K_{eff}}{K_G}$$

that is, the ratio between the effective conductivity (Keff) and the geometric mean of the 1740 K estimates (KG). Darcy's law between the upstream and the downstream faces of the groundwater flow modeling domain was applied to calculate Keff,

$$K_{eff} = \frac{Q\,L}{A\,(h_1 - h_2)}$$

where Q is the volumetric flow rate across the downstream face of area A, h1 and h2 are the constant heads applied at the upstream and the downstream faces, and L is the distance between these two faces. Since the upper bound of the effective conductivity is the arithmetic mean KA of the K estimates, and the lower bound is the harmonic mean KH, CI1 can assume any positive value between KH/KG and KA/KG. CI1 is an indicator of flow connectivity, and it is a function of the degree of flow channeling or, in other words, of the fraction of the total flow in a small portion of the porous medium. The effective K is larger than the geometric mean (CI1 > 1) for K fields characterized by the presence of preferential flow paths, while it is smaller (CI1 < 1) for fields in which the high-K media are poorly connected [Sánchez-Vila et al., 1996; Zinn and Harvey, 2003]. CI1 is equal to 1.0 in statistically random fields with no connectivity.
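The calculation of Keff and CI1 can be sketched as below, writing Darcy's law with an explicit flow length between the two specified-head faces. The face area (6 m × 6.2 m) and head drop (0.01 × 6 m) follow the model geometry described in section 4.2; the discharge and the K values are hypothetical, and all quantities are kept in meters and seconds for consistency.

```python
from math import exp, log


def effective_conductivity(q, area, h_up, h_down, length):
    """Darcy's law solved for K: K = Q * L / (A * (h1 - h2))."""
    return q * length / (area * (h_up - h_down))


def arithmetic_mean(k):
    return sum(k) / len(k)


def harmonic_mean(k):
    return len(k) / sum(1.0 / v for v in k)


def geometric_mean(k):
    return exp(sum(log(v) for v in k) / len(k))


def ci1(k_eff, k_values):
    """Flow connectivity indicator: effective K over geometric mean K."""
    return k_eff / geometric_mean(k_values)


k_values = [1e-5, 1e-4, 1e-3, 1e-2]  # hypothetical K data, m/s
k_eff = effective_conductivity(q=1.86e-4,       # hypothetical discharge, m3/s
                               area=6.0 * 6.2,  # downstream face, m2
                               h_up=0.06, h_down=0.0,  # gradient 0.01 over 6 m
                               length=6.0)
indicator = ci1(k_eff, k_values)
lower = harmonic_mean(k_values) / geometric_mean(k_values)
upper = arithmetic_mean(k_values) / geometric_mean(k_values)
```

For these illustrative numbers Keff is 5 × 10⁻⁴ m/s and CI1 is about 1.58, which lies between the bounds KH/KG and KA/KG.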

[31] The second connectivity indicator CI2 is related to transport behavior and is defined as the ratio between the average arrival time (tave) and the arrival time of a smaller fraction (5%) of particles [Knudby and Carrera, 2005, equation (8)]:

$$CI_2 = \frac{t_{ave}}{t_{5\%}}$$

where t5% is the time at which 5% of the particles has arrived at the exit face of the simulation domain. Both tave and t5% were calculated from the distribution of particle arrival times. A breakthrough curve skewed toward smaller arrival times, with an early peak and significant tailing, will result in a higher value of CI2 when compared to a Gaussian-shaped breakthrough curve [Knudby and Carrera, 2005]. A high CI2 can also be interpreted as an indication of channeling.
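A minimal sketch of the CI2 calculation from a list of particle arrival times follows. The arrival times are hypothetical, and taking t5% as the arrival time of the particle that completes the first 5% of the population (rounded up) is an assumed convention.

```python
def ci2(arrival_times, percent=5):
    """Ratio of the mean arrival time to the time by which `percent` % have arrived."""
    s = sorted(arrival_times)
    n = len(s)
    k = max(1, -(-n * percent // 100))  # ceil(n * percent / 100) in integer arithmetic
    t_ave = sum(s) / n
    return t_ave / s[k - 1]


# Hypothetical arrival times (days): early arrivals followed by a long tail
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 10, 12, 15, 20, 25, 30, 40, 50, 60]
uniform = [10.0] * 20
```

Here `ci2(skewed)` is 15.15 while `ci2(uniform)` is 1.0, illustrating how early arrivals combined with tailing raise the indicator.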

5. Results and Discussion

5.1. Generated 3-D Hydraulic Conductivity Fields

[32] The ensemble mean and variance of the lnK fields generated with the SGS method are equal to −3.60 and 4.33, respectively. These are similar to the corresponding values of the SIS (−3.67 and 4.11) realizations and also comparable to the mean and the variance of the original lnK data. The average mean of the T-PROGS realizations is −3.45, while the average variance is 3.98. The lower variance of the T-PROGS realizations is due to the zonation of continuous K values into classes of homogeneous conductivity. Consequently, the appearance of T-PROGS fields is smoother than that of the fields generated with the variogram-based methods, where a cell in the domain can assume, in theory, any value. However, the average lnK variance of all the T-PROGS realizations is only about 8% lower than the data variance, indicating that the actual variability is almost fully preserved.

[33] A summary of the average values of the arithmetic (KA), harmonic (KH), and geometric means (KG) of the generated K fields is presented in Table 2, together with the average effective conductivity (Keff) calculated from the groundwater flow simulations. Average values of KA and KH confirm the similarity between the SGS and SIS realizations, while the average KA of the T-PROGS fields is lower than the corresponding value of the variogram-based methods by about 16%. The reduced variance of the conductivity fields generated with the T-PROGS method due to the K zonation is corroborated by the observation that the standard deviations of the KA, KH, and KG values of the SGS and SIS realizations are significantly higher than those of the T-PROGS realizations.

Table 2. Average Values of the Arithmetic (KA), Harmonic (KH), and Geometric Means (KG) of the Generated lnK Fields and of Their Average Effective Conductivities (Keff)^a
^a Values are expressed in cm/s.


[34] Three realizations of the 3-D conductivity fields generated with SGS, SIS, and T-PROGS are shown in Figure 11. From the simple qualitative observation of the 3-D conductivity fields, it is hard to distinguish a connected arrangement of high K cells. For this reason, we further investigated the connectivity of the generated fields by analyzing groundwater flow and transport and by comparing the indicators of spatial connectivity. The results of these two analyses are the topics of sections 5.2 and 5.3.

Figure 11.

Three-dimensional conductivity fields generated with different geostatistical methods. One realization is shown for each method.

5.2. Flow and Transport Simulation Results

[35] As expected, the average Keff values for the SGS and SIS fields are almost identical (lnKeff equal to −2.93 and −2.92, respectively), while that of the T-PROGS realizations is greater by a factor of about 1.25, indicating a higher average flow velocity. Spatially averaged breakthrough curves calculated from particle tracking simulations are shown in Figure 12, while Table 3 compares the average values of temporal parameters resulting from the analysis of particle arrival times. The ensemble means of the BTCs from the SGS and SIS fields are similar in shape and show a sharp and well-defined peak at early times and an extensive tail at later times. Although the scale of our investigation is significantly smaller than that of the natural gradient tracer tests conducted at the MADE site, these characteristics of the simulated BTCs are consistent with the transport behavior observed during the large-scale tests and the hypothesis that transport is controlled by a network of preferential flow paths embedded in lower K matrix.

Figure 12.

Spatially averaged breakthrough curves resulting from particle tracking simulations. The normalized concentration is defined as the fraction of particles exiting the model domain.

Table 3. Average Temporal Parameters Calculated From Particle Tracking Simulations^a
^a Values are expressed in days.


[36] The average arrival time of the peaks of the BTCs (tpeak) is 3.1 days for the SGS realizations and 3.0 days for the SIS fields. All the other temporal parameters (tave and t5%) are also similar for the SGS and SIS realizations. In particular, the average arrival times (tave) are 11.2 days and 10.8 days, while the average of the arrival times of the fastest particles corresponding to 5% of the total (t5%) is about 1.6 days for both the SGS and SIS realizations.

[37] The shape of the BTCs calculated in the T-PROGS fields is more asymmetric and jagged than that of the BTCs in the SGS and SIS fields. Multiple peaks of arrival times are also observed. The average arrival time of the peaks (tpeak) is 2.0 days, about 1 day earlier than the average for the SGS and SIS fields. The average t5% observed in the T-PROGS realizations is also earlier by a factor of about 1.6 than the respective values in the SGS and SIS fields. On the other hand, the average of the particle arrival times (tave) is higher in the T-PROGS realizations than in the variogram-based realizations. The higher degree of asymmetry of the BTCs calculated in the T-PROGS fields is also indicated by the ratio between tave and the average arrival time of the peaks (tpeak), which is equal to 6.7, while for the variogram-based fields it is about 3.5.

[38] In order to test the significance of the conditioning K estimates on the geostatistical simulations and on the transport behavior, the BTCs calculated for the conditional sequential Gaussian fields were compared to those in unconditional realizations. Unconditional K fields were generated using the same variogram model, and flow modeling and particle tracking analysis were performed with the model setup previously described. The comparison clearly shows that the shape of the BTCs for conditional and unconditional fields is significantly different (Figure 13). It is remarkable that the BTCs for unconditional fields do not show any of the characteristics typical of transport behavior influenced by high flow and transport connectivity (i.e., early time sharp peak and significant tailing) that are conversely shown by the BTCs in the conditional fields.

Figure 13.

Comparison of breakthrough curves for conditional and unconditional K realizations. Conditioning is based on the 1740 K data.

[39] Anomalous transport behavior and significant connectivity in the simulated K fields are also indicated by the slope of the tails of the simulated BTCs (Figure 12). The slope for each BTC was calculated by plotting the BTC on a log-log scale and then performing a least squares regression of the tail values [Riva et al., 2008]. The average slope for the SGS realizations is 2.42, with values ranging from 1.51 to 2.97. For the SIS realizations, the average slope is equal to 2.63, with a minimum slope of 1.80 and a maximum of 3.11. The average slope of the BTCs generated for T-PROGS realizations is slightly lower (1.89), with values ranging from 1.46 to 2.39. The low values for the slopes of the tails are a clear indication of anomalous transport. Assuming a homogeneous domain with conductivity equal to the mean effective conductivity of the SGS realizations, the slope of the tail of the BTC generated by solving the advection-dispersion equation for an instantaneous pulse is much steeper and equal to 8.92 (longitudinal dispersivity equal to 1.0 m). More interestingly, the slope of the simulated BTCs suggests a high degree of connectivity. According to Willmann et al. [2008], the slope of a log-log plot BTC is mainly influenced by the connectivity of the K field, and it asymptotically approaches 2 as connectivity increases.
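The tail-slope estimate can be sketched as an ordinary least squares fit in log-log space. The selection of which points belong to the tail is left to the caller and is an assumption here, as is reporting the slope as a positive magnitude; the synthetic power-law tail is hypothetical.

```python
from math import log


def tail_slope(times, conc):
    """Magnitude of the least squares slope of log(conc) versus log(time)."""
    pairs = [(log(t), log(c)) for t, c in zip(times, conc) if t > 0 and c > 0]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    return -sxy / sxx


# Synthetic tail following c ~ t^(-2.4), times in days
tail_times = [5.0, 8.0, 12.0, 20.0, 35.0, 60.0]
tail_conc = [t ** -2.4 for t in tail_times]
slope = tail_slope(tail_times, tail_conc)
```

Empty counting intervals (zero concentration) are skipped because their logarithm is undefined.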

5.3. Connectivity of the Generated K Fields

[40] The analysis of the characteristics of the simulated flow fields and of the particle tracking simulations showed that the generated K fields are characterized by a high degree of flow connectivity. This is suggested by the trajectories of the particles with fastest arrival times (within the shortest 5% of the arrival time distribution) that tend to converge along preferential flow paths (Figures 14, 15, and 17) whose distribution within the model domain is clearly influenced by the occurrence of the high-K clusters. In particular, preferential flow paths are less scattered in the xy plane in the variogram-based K fields compared to the T-PROGS realizations (Figure 15, left). Projected on the plane perpendicular to the x direction (Figure 15, middle), preferential flow paths are more tortuous in the variogram-based fields. For all realizations, the exit locations of the fastest particles tend to cluster in certain patches (Figure 15, right), showing that particles, which initially entered the system as a distributed cloud, converged along preferential flow paths and exited the system in a few selected locations. These fast flow paths (first 5% to arrive) exert a strong control on the flow field as shown by the significant fraction of the total discharge flowing through their exit locations. In the SGS fields, the average cumulative discharge through the exit locations of the fastest particles is equal to 39.9% of the total discharge, with a standard deviation equal to 3.6%. Similar results were calculated for the SIS (44.2% on average with a standard deviation of 3.4%) and the T-PROGS fields (40.1% on average with a standard deviation of 3.1%).
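One way to compute the share of discharge leaving through the exit locations of the fastest particles is sketched below. The data structures (a list of arrival time and exit cell pairs, and a dictionary of cell discharges on the downstream face) and all values are hypothetical; how exit locations were mapped to cells in the study is not described in the text.

```python
def fast_path_discharge_fraction(particles, cell_discharge, percent=5):
    """Fraction of total discharge through the exit cells of the fastest particles.

    particles: list of (arrival_time, exit_cell) pairs
    cell_discharge: dict mapping each exit-face cell to its discharge
    """
    n_fast = max(1, -(-len(particles) * percent // 100))  # ceil in integer arithmetic
    fastest = sorted(particles, key=lambda p: p[0])[:n_fast]
    exit_cells = {cell for _, cell in fastest}
    total = sum(cell_discharge.values())
    return sum(cell_discharge[c] for c in exit_cells) / total


# Hypothetical exit-face discharges and 40 particles
cell_discharge = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
particles = [(1.0, "a"), (1.2, "a")] + [(5.0 + i, "bcd"[i % 3]) for i in range(38)]
fraction = fast_path_discharge_fraction(particles, cell_discharge)
```

In this toy case the two fastest particles leave through the same cell, which carries 40% of the discharge.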

Figure 14.

Three-dimensional perspective views of the spatial distribution of the cells with K less than the median (in blue) penetrated by the fastest pathlines (black lines). Particle paths start at the upgradient face (Y = 167.9 m).

Figure 15.

Spatial distribution of the 20 largest high-K clusters (green areas) and of the paths of the fastest particles (black lines) in one realization for each geostatistical method. In each diagram, high-K clusters and particles paths are projected on the plane formed by two Cartesian axes. Black dots in the right-hand plots represent the particle exit locations.

[41] Probabilistic maps showing the exit locations of the fastest particles were generated by considering all the K-fields for each geostatistical method (Figure 16). These maps show that for the simulated portion of the MADE site aquifer, preferential flow paths are most likely located between 5 m and 8 m below the land surface, while large areas at the top and at the bottom of the simulated domain are devoid of exit locations. In particular, an area of higher density of exit locations of about 0.5 m in thickness and 1 m in length along the x direction is shown by the T-PROGS realizations. Other areas of high concentration of exit locations are located along the right boundary (at x = −15.25) of the simulated domain in the SGS and SIS realizations, but their position can be influenced by the no-flow conditions imposed along this border.

Figure 16.

Probabilistic contour maps showing the spatial density of the exit locations of the fastest particles.

[42] The spatial relationship between the particle paths and the high-K clusters was also analyzed by calculating the cumulative percentage of fastest path trajectory (i.e., pathline travel distance) that falls below a certain lnK threshold (Table 4). Results show that about 43% of the total length of the fastest pathlines is located within the high-K clusters in the SGS fields while higher percentages were calculated for the SIS (57%) and the T-PROGS realizations (69%). These percentages suggest the importance of the extreme values of the lnK distribution in determining the geometry of preferential flow paths and the degree of flow and transport connectivity.
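The cumulative percentage of pathline length below a given lnK threshold can be sketched as follows. The representation of a pathline as (length, lnK) segments, one per traversed cell, is an assumption, and the segment values are hypothetical; only the −0.34 threshold (0.9 decile) comes from the text.

```python
def cumulative_length_below(segments, thresholds):
    """Percentage of total path length in cells with lnK below each threshold.

    segments: list of (length, lnK of the traversed cell) pairs
    """
    total = sum(length for length, _ in segments)
    return [100.0 * sum(length for length, k in segments if k < z) / total
            for z in thresholds]


# Hypothetical pathline segments: (length in cm, lnK of the cell)
segments = [(20, -0.1), (30, -0.2), (10, -1.5), (20, 0.1), (20, -3.0)]
below = cumulative_length_below(segments, [-2.0, -0.34])
in_high_k = 100.0 - below[-1]  # share of the path within cells at or above the 0.9 decile
```

The complement of the last cumulative value gives the share of the path inside the high-K class, the quantity reported as 43% to 69% for the three methods.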

Table 4. Average Cumulative Percentages of Trajectories (i.e., Pathline Travel Distances) of the Fastest Particles, Defined as the First 5% to Arrive, That Fall Below a Certain lnK Threshold^a
lnK | Probability Density Function Values | SGS Average % | SGS SD | SIS Average % | SIS SD | T-PROGS Average % | T-PROGS SD
^a The nine thresholds correspond to the deciles of the probability density function of the lnK estimates. For the T-PROGS realizations, only the deciles that represent the cutoffs of the five lnK classes (0.1, 0.3, 0.7, and 0.9 deciles) are considered (see Figure 8).


[43] Another important aspect revealed by Table 4 is the different characteristics of the fastest pathlines in variogram-based fields (SGS and SIS) and the T-PROGS fields. In the SGS and SIS fields, sections of the pathline trajectory are within cells with K values significantly lower than the 0.9 decile of the K data set. In particular, about 5% of the pathline trajectory is through cells with K less than the median, about 10% of the pathline trajectory is through cells with K less than the 0.6 decile, and more than 15% of the total pathline length is through cells with K less than the 0.7 decile. Figure 14 shows the cells with K less than the median penetrated by the fastest pathline trajectories. These values indicate that significant transport connectivity does not necessarily require fully connected zones of homogenous K. Rather, particles can travel along preferential flow paths by “jumping” (or leaking) from one high-K cluster to another, passing through much lower K zones in between. This finding is consistent with that of Trinchero et al. [2008], who studied point-to-point connectivity in a convergent flow field. In the T-PROGS fields, however, we observe that only 1.3% of the total length of the pathlines is within cells with a K value less than the 0.7 decile of the K distribution. This small fraction indicates a greater connectivity of K values within the upper tail of the distribution and that the fastest particle paths remain in zones of relatively homogenous high K.

[44] Other observations resulting from the flow and transport simulations also suggest high connectivity. Regarding flow connectivity, for example, the geometric mean of the generated K fields is lower than the calculated effective conductivity for all the generated K fields. As demonstrated by Zinn and Harvey [2003] and Sánchez-Vila et al. [1996], the effective K deviates significantly from the geometric mean in K fields with nearly identical lognormal univariate conductivity, but different in the patterns by which high- or low-conductivity regions are connected. High transport connectivity is also suggested by the highly asymmetric shape of the BTCs, which indicates that a significant number of particles moved faster than others in the simulated domain due to the presence of preferential flow paths. Finally, particle arrival time distributions show multiple peaks, especially in the T-PROGS realizations, and this feature can also be interpreted as the effect of channeling. Multiple peaks can in fact be generated when particles initially move slowly out of zones of low K and then travel at higher velocity along the preferential flow paths [see also Liu et al., 2007].

[45] A more quantitative assessment of the degree of connectivity in the SGS, SIS and T-PROGS realizations is shown by the values of the indicators presented in Table 5. As previously described, the first group of indicators (group 1) is based on the spatial characteristics of the high-K clusters. The average number of high-K clusters in the SGS realizations is almost 3 times higher than in the SIS realizations and more than 6 times higher than the corresponding average in the T-PROGS realizations. The volume fraction of the channels in the SGS and T-PROGS methods honors the actual frequency of high-K values in the data (10%), while in the SIS realizations the average volume fraction of high-K clusters is slightly lower (7.9%). Even though the SGS realizations have the largest number of high-K clusters, the average number of cells in the largest and second-largest connected zones is lower than in the SIS and T-PROGS fields. The average number of cells composing the largest connected high-K cluster in the SGS realizations is in fact 987, and it is lower than the corresponding values in the SIS and T-PROGS realizations by a factor of 1.26 and 2.34, respectively.

Table 5. Average Spatial Connectivity Indicators
Indicator | SGS Average | SGS SD | SIS Average | SIS SD | T-PROGS Average | T-PROGS SD
Group 1
Total number of high-K clusters | 1608 | 85.6 | 594 | 779.6 | 260 | 43.2
Volume fraction | 0.
Number of cells in the largest connected high-K cluster | 987 | 470.4 | 1243 | 554.9 | 2305 | 809.6
Surface area/volume | 2.
Number of cells in the 2nd largest connected high-K cluster | 562 | 344.0 | 616 | 262.3 | 914 | 397.1
Surface area/volume | 3.
Fraction of single cells | 0.67 | 0.01 | 0.63 | 0.02 | 0.59 | 0.04
Group 2

[46] The disparity between the different methods is reduced when we consider the dimensions of the second largest connected high-K cluster, which is composed of an average of 562 (SGS), 615 (SIS), and 913 cells (T-PROGS). In addition, Gaussian realizations have the highest percentage of single cell bodies (almost 70%) and also the highest surface area to volume ratio for the largest and second-largest high-K clusters. This ratio is an indication of the shape of the connected zones since it is proportional to the degree of tortuosity. For a fixed volume, the greater the surface area, the more tortuous the connected high-K clusters. The characteristics of the generated SGS fields are consistent with the results of previous studies suggesting that extreme values are less inclined to cluster in multi-Gaussian random fields [e.g., Rubin and Journel, 1991; Sánchez-Vila et al., 1996; Gómez-Hernández and Wen, 1998]. However, observing the spatial distribution of the largest, second largest, and third largest connected high-K clusters (Figure 17), it is possible to recognize a spatial arrangement that resembles that of a PFP network, even in the SGS realizations, and this has an impact on the distribution of the fastest paths. The highest degree of connectivity is clearly shown by the T-PROGS realizations, where the high-K clusters are more continuous, especially in the horizontal plane.

Figure 17.

Three-dimensional perspective views of the spatial distribution of the third largest, second largest, and the largest high-K cluster (in green). Black lines represent the paths of the fastest particles corresponding to the 5th percentile of the arrival time distribution. The complete set of high-K clusters is shown in Figure 15.

[47] The results of the first group of connectivity indicators indicate that connected high-K clusters can be present in the investigated portion of the MADE site aquifer. Among all 60 realizations, the largest connected high-K cluster occurs in T-PROGS realization 17 and is composed of 4530 cells. This value corresponds to a volume that is about 7.5% of the total. On the other hand, the smallest of the largest connected high-K clusters was observed in SGS realization 3 and is composed of only 268 cells.

[48] Flow and transport connectivity in the generated K fields is also confirmed by the values of the second group of indicators (group 2 in Table 5). These are calculated from the results of the groundwater flow and particle tracking simulations. The indicator of flow connectivity CI1 shows that for all generated K fields the average effective conductivity is higher than the average geometric mean by a factor ranging from 1.97 for the SGS realizations to 2.16 for the T-PROGS fields. According to Zinn and Harvey [2003] and Knudby and Carrera [2005], such results suggest that the flow fields are characterized by the presence of preferential flow paths. Values of the indicators CI2 and CI3 confirm that particle transport in the generated K fields is highly asymmetric, determining anomalous peaks of arrival times and extensive tailing in the simulated BTCs. This is mostly evident for the T-PROGS realizations that have an average CI2 value of 13.15, but it is also indicated by the averages calculated for the SIS and SGS fields.
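As a rough consistency check, the indicators can be approximated from the ensemble values quoted in sections 5.1 and 5.2. Two assumptions are made: the geometric mean is taken as the exponential of the ensemble mean lnK of the SGS fields (rather than of the 1740 data), and CI2 is taken as the ratio of the averaged times, which is not identical to the average of the per-realization ratios.

```python
from math import exp

# SGS ensemble values quoted in the text: ln(Keff) = -2.93, mean lnK = -3.60
ci1_sgs = exp(-2.93 - (-3.60))  # Keff / KG with KG approximated as exp(mean lnK)

# Ratios of the averaged tave and t5% (days) quoted for SGS and SIS
ci2_sgs = 11.2 / 1.6
ci2_sis = 10.8 / 1.6
ci2_tprogs = 13.15  # average CI2 reported directly for T-PROGS

ci2_mean = (ci2_sgs + ci2_sis + ci2_tprogs) / 3
```

These approximations give CI1 of about 1.95 for SGS, close to the reported 1.97, and a mean CI2 of about 9 across the three methods.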

6. Conclusions

[49] Previous experimental and modeling studies supported the hypothesis that the connectivity of highly permeable sediments exerts a significant control on solute transport in the highly heterogeneous aquifer at the MADE site. In this study we tested this notion by analyzing the connectivity of 3-D simulated K fields, representative of a small portion (6 × 6 × 6.2 m3) of the aquifer. Geostatistical simulations were conditioned to 1740 values of hydraulic conductivity estimated from the grain size analysis of soil samples corresponding to small segments (5 cm in thickness) of 19 newly collected cores. On the basis of results of previous studies showing that different geostatistical methods can produce realizations with very different degrees of connectivity, we applied three different geostatistical simulation methods. This choice was made with the twofold objective of producing a wider range of possible representations of the actual heterogeneity and minimizing the chance that our results could have been biased by the characteristics of a single geostatistical simulation method.

[50] The 3-D data set itself represents one of the contributions of this work. The level of detail at which heterogeneity is documented, especially along the vertical direction, is extraordinary and quite valuable. For this reason, it can be used for the development and testing of new approaches for the characterization of the heterogeneity of porous media and the conceptualization of the mechanisms of solute transport.

[51] Groundwater flow, advective transport simulations, and two groups of connectivity indicators were used to assess and eventually quantify flow and transport connectivity in the generated K fields. Several observations derived from the flow and particle tracking simulations suggested a high degree of connectivity. Particle tracking simulations showed that the fastest particles converge along preferential flow paths and exit from the model domain in just a few selected small patches. In this study, these patches are concentrated mostly in the middle sector of the domain, at depths ranging from 5 m to 8 m below the land surface. About 40% of the total discharge is through patches where the first 5% of particles exited the system. The distribution of the preferential flow paths and particles exit locations is clearly influenced by the occurrence of the highest K zones. The average percentage of the fastest particle paths' trajectory that is within the high K zones ranges from 43% in the SGS realizations up to 69% in the T-PROGS fields. It was also observed in the SGS and SIS realizations that sections of the fastest paths' trajectory are within cells with K significantly lower than the 0.9 decile of the synthetic K distribution. This indicates that particles traveled along preferential flow paths jumping (or leaking) from one high-K cluster to another with transitions through lower K zones, suggesting that significant transport connectivity may not require connected zones of relatively homogenous high K.

[52] Other characteristics of the simulated flow fields and transport behavior that are consistent with significant connectivity and channeling include the ratio between the effective conductivities and their corresponding geometric mean values, and the shape of the simulated BTCs. The K ratio is about 2 on average, suggesting that the K field as a whole is about twice as conductive as its geometric mean because of the connectedness of the higher K zones. The simulated BTCs are characterized by sharp peaks at early times and extensive tailing. Moreover, multiple peaks are observed, especially in the K fields generated with T-PROGS. All these characteristics can be interpreted as the result of a transport mechanism consisting of the fast movement of a significant number of particles along preferential flow paths.

[53] The high degree of connectivity suggested by the flow and transport simulations was confirmed by the values of two groups of connectivity indicators. The first group [Deutsch, 1998] indicates that connected high-K clusters can be present in the investigated portion of the MADE site. On average, the largest connected high-K body in the simulated fields represents a small fraction of the total volume, ranging from 1.6% in the SGS realizations to 3.8% in the T-PROGS fields. The largest connected body among all the realizations occupies about 7.5% of the total interpolated volume. Beyond their size, the high-K clusters are not randomly distributed in space, and their arrangement supports the formation of preferential flow paths. This is indicated by the values of the second group of connectivity indicators, which contain information on the degree of flow channeling and transport connectivity [Knudby and Carrera, 2005]. In all the generated K fields the effective conductivity is higher than the geometric mean conductivity by a factor of about 2. The average arrival time of the particles is about 9 times greater than the arrival time of the fastest 5% of the particles. The values of the second group of connectivity indicators are consistent with the results of the groundwater flow and transport simulations. A significant proportion of particles traveled faster than the average along preferential flow paths and generated the highly asymmetric shape of the BTCs.
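
The indicators discussed above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in the study: cells are treated as connected only through shared faces, the K field is a nested list indexed `k[z][y][x]`, and the fast arrival time is taken as the time by which 5% of the particles have arrived. All function names are hypothetical.

```python
import math
from collections import deque


def geometric_mean(values):
    """Geometric mean of strictly positive K values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def flow_connectivity(k_eff, k_values):
    """Ratio of effective K to the geometric mean K of the field."""
    return k_eff / geometric_mean(k_values)


def transport_connectivity(arrival_times, fraction=0.05):
    """Ratio of the mean arrival time to the arrival time of the fast fraction."""
    t = sorted(arrival_times)
    n_fast = max(1, math.ceil(fraction * len(t)))
    return (sum(t) / len(t)) / t[n_fast - 1]


def largest_cluster_fraction(k, threshold):
    """Volume fraction of the largest face-connected body with K >= threshold."""
    nz, ny, nx = len(k), len(k[0]), len(k[0][0])
    steps = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))
    seen = set()
    largest = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if (z, y, x) in seen or k[z][y][x] < threshold:
                    continue
                # Breadth-first search over one connected high-K body.
                size = 0
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                while queue:
                    cz, cy, cx = queue.popleft()
                    size += 1
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and n not in seen and k[n[0]][n[1]][n[2]] >= threshold:
                            seen.add(n)
                            queue.append(n)
                largest = max(largest, size)
    return largest / (nz * ny * nx)
```

Values of the two ratios well above 1, such as the averages of about 2 and about 9 reported here, point to channeling; a field without preferential paths would be expected to give ratios closer to 1.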

[54] Anomalous transport behavior and significant connectivity in the simulated K fields are also indicated by the slope of the tails of the simulated BTCs. The slope for each BTC was calculated by plotting the BTC on a log-log scale and then performing a least squares regression on the tail values. The average slope is 2.42 for the SGS realizations and 2.63 for the SIS realizations. The average slope of the BTCs generated for the T-PROGS realizations is lower, at 1.89. The low values of the tail slopes are a clear indication of anomalous transport due to significant connectivity.
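
The tail-slope calculation described above can be sketched as an ordinary least squares fit in log-log space. The sketch assumes the tail is defined as all points at or after a user-chosen time `t_start`; how the tail was delimited in the study is not stated here, so that cutoff is a hypothetical parameter. The function returns the signed slope, which is negative for a decaying tail; the values quoted above are positive and are presumably magnitudes.

```python
import math


def tail_slope(times, concentrations, t_start):
    """Least squares slope of log10(C) versus log10(t) for t >= t_start.

    Points with non-positive time or concentration are skipped because
    their logarithm is undefined.
    """
    points = [
        (math.log10(t), math.log10(c))
        for t, c in zip(times, concentrations)
        if t >= t_start and t > 0 and c > 0
    ]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two tail points for a regression")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Hypothetical synthetic tail that decays as t^(-2.5).
times = [1.0, 10.0, 100.0, 1000.0]
concentrations = [t ** -2.5 for t in times]
slope = tail_slope(times, concentrations, t_start=10.0)
```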


[55] This material is based upon work supported by the National Science Foundation under grants EAR 0538011, EAR 0537668, and EAR 0738960. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The authors also wish to thank Xavier Sánchez-Vila, Daniel Fernàndez-Garcia, and two anonymous reviewers for their very helpful comments, which led to significant improvement of this work.