The simulation of conservative solute transport in a heterogeneous unsaturated soil depends on the description of spatial variability of soil hydraulic and chemical properties. The data from the Las Cruces Trench Site were used to explore the impact of alternative ways of describing the variability of hydraulic properties. Three different approaches were considered: Miller and Miller scaling, Leverett scaling, and a multistep approach involving categorization of water retention curves. Conditional sequential geostatistical simulation was used to generate equally probable realizations of soil properties for each approach, and these realizations were then used as input to a numerical simulator to quantify the resultant uncertainty in solute transport predictions. Simulation results show that the scaling techniques seem to oversimplify the description of heterogeneity in the Las Cruces Trench, leading to very narrow spaces of uncertainty. Because the multistep approach allows the reproduction of existing patterns of continuity of soil classes and of contrasting values of hydraulic properties in the field, it led to solute plumes that split into several preferential pathways that were not observed in the simulations based on the scaling approaches, which increased the standard deviation of the solute plume's moments. The results indicate that measurements of both water retention curves and saturated hydraulic conductivity need to be collected for more realistic, conservative studies of flow in heterogeneous unsaturated soils.
 The simulation of water flow and contaminant transport in a heterogeneous unsaturated soil requires a description of the spatial variability of the soil's unsaturated hydraulic conductivity and soil water retention curves. Depending on the model assumed to represent these properties, one may need as many as six parameters, each one varying in space. To reduce the number of variables, attention is usually focused on the parameters deemed most important: the soil water retention curve shape factors and the saturated hydraulic conductivity [Jury et al., 1987b]. However, dealing even with this reduced number of parameters is not easy in a heterogeneous system: it usually requires a large number of measurements of the soil water retention curve and of the hydraulic conductivity, which may be time consuming and costly to perform. Scaling theories, especially those of Miller and Miller and of Leverett, have been invoked to make up for the lack of data and have been widely used in both unsaturated flow [Philip, 1975; Hopmans et al., 1988; Russo, 1991; Tseng and Jury, 1993; Desbarats, 1995; Chen and Neuman, 1996; Rockhold et al., 1996; Deurer et al., 2000; Tartakovsky et al., 2003] and multiphase flow [Essaid et al., 1993; Dillard et al., 1997; Gerhard and Kueper, 2003] simulations. Using the Miller and Miller or the Leverett scaling theories, one can obtain a description of the variability of both hydraulic properties given data for only one of them. For example, the variability in saturated hydraulic conductivity can be used to estimate the variability in soil water retention [Gerhard and Kueper, 2003] and vice versa [Rockhold et al., 1996].
 The appeal of the scaling approach is that it reduces the description of the spatial variability of θ(ψ) and K(ψ) to a single parameter, instead of two, three, or even four, depending on the functions chosen to represent these relationships. Selecting such an approach implies, however, that scaled water retention and scaled hydraulic conductivity curves always have the same general shape and are shifted only by a constant of proportionality, the scale factor α. In a truly heterogeneous soil, it is very unlikely that these properties will have such similar shapes. Failing to reproduce such patterns of variability in the description of soil properties could affect predictions of water flow and of the fate of contaminants in these systems. For example, Lemke et al., simulating DNAPL (dense nonaqueous phase liquid) entrapment and removal, observed notable differences between simulations using models that incorporated Leverett scaling of permeability and models that used uncorrelated air-entry pressure and permeability fields.
 Despite its widespread use, the scaling of hydraulic properties may be an oversimplification that potentially yields erroneous predictions of solute transport at the field scale. For example, Figure 1a shows a set of typical soil water retention curves occurring in a natural field soil. The curves not only vary in shape but also frequently cross one another. Scaled curves, on the other hand, have the same general shape and never cross one another (Figure 1b). With this in mind, the objectives of this study are (1) to propose an alternative methodology for describing the soil hydraulic properties that more closely approximates the heterogeneous nature of real field soils and (2) to compare unsaturated zone transport simulation results obtained using the alternative method to those obtained using the scaling approaches.
2.1. Unsaturated Flow and Transport Theory
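 The equation describing two-dimensional, isothermal, transient unsaturated flow of water in a nondeformable heterogeneous soil can be written as
$$
\frac{\partial \theta}{\partial t} = \frac{\partial}{\partial x}\left[K_x(\psi)\frac{\partial \psi}{\partial x}\right] + \frac{\partial}{\partial z}\left[K_z(\psi)\left(\frac{\partial \psi}{\partial z} + 1\right)\right] - F \tag{1}
$$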
where θ is the water content [L3/L3]; ψ is the water pressure head [L]; t is time [T]; x and z are the horizontal and upward vertical directions, respectively [L]; F is an external source or sink term (positive for sink) [T−1]; Kx(ψ) and Kz(ψ) are the components of the unsaturated hydraulic conductivity tensor. In this study, the medium is assumed isotropic at the grid scale, and hysteresis is neglected.
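 The movement of a single nonreactive solute in such a system is given by
$$
\frac{\partial (\theta C)}{\partial t} = \frac{\partial}{\partial x}\left(\theta D_{xx}\frac{\partial C}{\partial x} + \theta D_{xz}\frac{\partial C}{\partial z}\right) + \frac{\partial}{\partial z}\left(\theta D_{zx}\frac{\partial C}{\partial x} + \theta D_{zz}\frac{\partial C}{\partial z}\right) - \frac{\partial (\theta v_x C)}{\partial x} - \frac{\partial (\theta v_z C)}{\partial z} - F C_s \tag{2}
$$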
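where C is the solute concentration [ML−3]; Cs is the concentration of the source/sink term [ML−3]; and Dxx, Dzz, Dxz = Dzx are the components of the dispersion coefficient tensor [L2T−1], given by
$$
D_{ij} = D_T\,|v|\,\delta_{ij} + (D_L - D_T)\frac{v_i v_j}{|v|} + D^{*}\tau_w\,\delta_{ij} \tag{3}
$$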
where D* is the coefficient of molecular diffusion [L2T−1]; τw is the tortuosity factor (dimensionless), given by τw = θ^(7/3)/θs^2 [Simunek et al., 1999]; DL and DT are the longitudinal and transverse dispersivities [L] (here considered uniform throughout); δij is the Kronecker delta (i.e., δij = 1 if i = j, and δij = 0 if i ≠ j); ∣v∣ = (vx^2 + vz^2)^(1/2); and vi and vj are the ith and jth components of the pore water velocity [LT−1], respectively, given by Darcy's law:
$$
v_x = -\frac{K_x(\psi)}{\theta}\frac{\partial \psi}{\partial x}, \qquad v_z = -\frac{K_z(\psi)}{\theta}\left(\frac{\partial \psi}{\partial z} + 1\right) \tag{4}
$$
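 To solve (1) numerically, one needs to parameterize the soil water retention and the unsaturated hydraulic conductivity curves. The Brooks-Corey relationship is assumed here to describe the soil water retention curve. It is given by [Brooks and Corey, 1964]
$$
\Theta = \frac{\theta - \theta_r}{\theta_s - \theta_r} =
\begin{cases}
\left(\psi_e/\psi\right)^{\lambda}, & \psi < \psi_e \\
1, & \psi \ge \psi_e
\end{cases} \tag{5}
$$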
where Θ is the effective saturation (dimensionless); θ is the water content [L3/L3]; θs and θr are the saturated and residual water contents, respectively [L3/L3]; λ is the pore size index (dimensionless); and ψe is the air entry pressure head [L]. The unsaturated hydraulic conductivity is often estimated from the saturated hydraulic conductivity coupled with the soil water retention curve, using, for example, the Burdine model [Burdine, 1953], which, when coupled with the Brooks-Corey model, gives [Brooks and Corey, 1966]
$$
K(\Theta) = K_s\,\Theta^{\,3 + 2/\lambda} \tag{6}
$$
where Ks is the saturated hydraulic conductivity [LT−1].
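The Brooks-Corey retention curve and the Burdine-based conductivity can be sketched in a few lines. This is a minimal illustration, assuming the standard Brooks-Corey form Θ = (ψe/ψ)^λ for ψ < ψe and the Burdine exponent 3 + 2/λ; the function names are hypothetical and the parameter values are the reference values quoted later in the text (λ* = 0.2631, ψ*e = −4.728 cm H2O, θs = 0.32, θr = 0.05, K*s = 4.11 m/d).

```python
def effective_saturation(psi, psi_e, lam):
    """Brooks-Corey effective saturation; psi and psi_e are negative heads (cm H2O)."""
    if psi >= psi_e:              # wetter than air entry: pores remain saturated
        return 1.0
    return (psi_e / psi) ** lam   # ratio of two negative heads is positive


def water_content(psi, psi_e, lam, theta_s, theta_r):
    """Volumetric water content from the effective saturation."""
    return theta_r + (theta_s - theta_r) * effective_saturation(psi, psi_e, lam)


def conductivity(psi, psi_e, lam, k_s):
    """Brooks-Corey/Burdine conductivity: K = Ks * Theta**(3 + 2/lam)."""
    return k_s * effective_saturation(psi, psi_e, lam) ** (3.0 + 2.0 / lam)


if __name__ == "__main__":
    # Reference values quoted in the text; the head of -100 cm is an arbitrary example.
    lam, psi_e, theta_s, theta_r, k_s = 0.2631, -4.728, 0.32, 0.05, 4.11
    for psi in (-1.0, -10.0, -100.0, -300.0):
        print(psi, water_content(psi, psi_e, lam, theta_s, theta_r),
              conductivity(psi, psi_e, lam, k_s))
```

Because the exponent 3 + 2/λ is large for small λ (about 10.6 for λ = 0.2631), conductivity drops much faster than water content as the soil drains.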
2.2. Stochastic Simulation of Soil Hydraulic Properties
 One way to deal with the uncertainty in soil hydraulic properties is to treat them as random space functions. As a consequence, the dependent variables (water pressure head, water content and solute concentration) are also random space functions [Russo and Bouton, 1992]. Conditional, spatially correlated random fields of soil properties can be readily generated within the geostatistical sequential simulation approach [Deutsch and Journel, 1998]. Each soil property is a regionalized variable, z(u), that varies in space stochastically and represents a realization of a random function, Z(u). Usually, Z(u) is assumed to be second-order stationary, meaning that the expected value is invariant and the autocovariance does not depend on u, only on the separation vector h, such that
$$
E\{Z(\mathbf{u})\} = m \quad \text{for all } \mathbf{u} \tag{7}
$$
$$
C(\mathbf{h}) = E\{Z(\mathbf{u})\,Z(\mathbf{u}+\mathbf{h})\} - m^2 \tag{8}
$$
Within this approach, equally probable realizations of a property, z(u), are generated, each one reproducing the measured values at sample locations, the sample histogram, and the covariance model [Goovaerts, 1997]. The approach to sequential simulation can be either parametric or nonparametric. In the parametric approach, a normal (Gaussian) function is chosen to describe the distribution of the property z(u) or a transform of z(u) (such as the logarithm of z(u)) so that its univariate distribution is completely characterized by the mean and the variance, and the bivariate distribution requires only the additional knowledge of the covariance function [Deutsch and Journel, 1998]. The process of sequential Gaussian simulation (SGS) involves a prior normal score transformation, in which the cumulative distribution function of the spatial attribute, F(z), is converted into a standard normal cumulative distribution function, G(y), with y having a standard normal probability density function with mean 0 and variance 1 [Goovaerts, 1997; Deutsch and Journel, 1998]. The simulations are performed in normal score space and are then back transformed using the one-to-one correspondence between G(y) and F(z).
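The normal score transform and its back transform can be illustrated with a short sketch. It is not the GSLIB implementation: it assumes no tied values, uses the plotting position (rank + 0.5)/n, and back transforms by linear interpolation between sorted data values without any tail extrapolation. The function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, variance 1


def normal_scores(values):
    """Map each datum to the standard normal quantile of its empirical cdf value."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n            # F(z) estimated at the datum (assumed convention)
        scores[i] = _STD_NORMAL.inv_cdf(p)
    return scores


def back_transform(y, values):
    """Map a normal score y back to data space through G(y) = F(z)."""
    sorted_vals = sorted(values)
    n = len(sorted_vals)
    pos = _STD_NORMAL.cdf(y) * n - 0.5  # fractional rank matching normal_scores
    if pos <= 0:
        return sorted_vals[0]           # no lower-tail extrapolation in this sketch
    if pos >= n - 1:
        return sorted_vals[-1]          # no upper-tail extrapolation in this sketch
    lo = int(pos)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[lo + 1] - sorted_vals[lo])


if __name__ == "__main__":
    data = [0.087, 1.2, 4.7, 12.0, 57.63]   # hypothetical |psi_e| values, cm H2O
    y = normal_scores(data)
    print(y)
    print([back_transform(yi, data) for yi in y])
```

Simulated normal scores that fall between data quantiles are mapped to intermediate values, which is how a simulation in normal score space reproduces the sample histogram after back transformation.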
 In the nonparametric (or indicator) approach, no assumption regarding the distribution of z(u) has to be made. The cumulative distribution function (cdf) of z(u) is divided into a number of classes, which are used for a binary coding of each sample value. For continuous variables, the coding is based on whether z(u) exceeds a class threshold [Journel, 1983]. For categorical variables, the cdf gives the probability of being no greater than a given class, which implies a ranking of these classes. In the latter case, the sequential indicator simulation (SIS) approach is used to generate realizations of categorical variables, which involves the prior coding of categorical attributes into indicator data as
$$
i(\mathbf{u}; k) =
\begin{cases}
1 & \text{if category } k \text{ occurs at } \mathbf{u} \\
0 & \text{otherwise}
\end{cases}
\qquad k = 1, \ldots, N_k \tag{9}
$$
where Nk is the number of categories.
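The indicator coding can be sketched as follows. This is a minimal illustration with hypothetical function names; for the continuous case it assumes the common convention that the indicator is 1 when the value does not exceed the threshold.

```python
def categorical_indicators(category, n_categories):
    """One indicator per category: 1 for the category observed at the location, else 0."""
    return [1 if category == k else 0 for k in range(1, n_categories + 1)]


def threshold_indicators(z, thresholds):
    """One indicator per threshold: 1 if z does not exceed it (assumed convention)."""
    return [1 if z <= t else 0 for t in thresholds]


if __name__ == "__main__":
    # A sample belonging to class 3 out of 5 classes
    print(categorical_indicators(3, 5))
    # A continuous value coded against four ordered thresholds
    print(threshold_indicators(0.45, [0.278, 0.376, 0.512, 0.657]))
```

Averaging each categorical indicator over all samples gives the global proportion of that class, which is the target proportion that an indicator simulation is expected to honor.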
 The indicator and the parametric sequential simulation techniques have been combined in what is known as the hierarchical or multistep approach [Alabert et al., 1990; Damsleth et al., 1992]. In this approach, the spatial variability of soil types (or facies) is modeled with indicator semivariograms for each class, and the spatial distribution of these classes is then simulated. This yields the distribution of major qualitative features, for instance, the distribution of coarse and fine soils. This step is followed by the simulation of the properties of interest within each of these classes. In this way, the variability within each class is accounted for, and distinct populations with distinct statistics can be handled, as shown by Dillard et al. [1997].
2.3. Scaling of Soil Hydraulic Properties
2.3.1. Miller and Miller Scaling
 According to the scaling approach for the description of the variability in soil water retention and unsaturated hydraulic conductivity, a single scaling factor, varying in space, is able to describe the ratio between each curve at a given point in space and its respective average curve over the entire range of water pressure heads [Peck et al., 1977; Ahuja et al., 1984; Vogel et al., 1991; Clausnitzer et al., 1992; Hopmans, 1992]. The scaling approach is derived from the similitude concept introduced by Miller and Miller [1955a, 1955b], which states that two porous media are similar if they have identical microscopic geometries and differ only in their scale. The water pressure head of a soil u, ψu, can be related to the water pressure head of a reference (or mean) soil, ψ*, by a scaling factor, αu, such that
$$
\psi_u = \frac{\psi^{*}}{\alpha_u} \tag{10}
$$
as long as both media are at the same water content, θ. Similarly, the two unsaturated hydraulic conductivities at the same θ, Ku(θ) and K*(θ), are related by
$$
K_u(\theta) = \alpha_u^{2}\,K^{*}(\theta) \tag{11}
$$
 The criterion for Miller and Miller similitude is rarely met in heterogeneous soils since soil structural properties vary spatially. Accordingly, Warrick et al. [1977] and Russo and Bresler [1980] relaxed some aspects of the Miller and Miller theory. First, they scaled by the degree of saturation (S = θ/θs) instead of by water content so that soils with different porosities could be scaled. Second, they assumed that the scale factor for the soil water retention curve need not equal the scale factor for the hydraulic conductivity curve. The result of relaxing these two assumptions is that dissimilar media can be scaled, although the scale factor loses its physical meaning, as it is no longer associated with a soil microscopic length. The scaling equations then become
$$
\psi_u(S) = \frac{\psi^{*}(S)}{\alpha_h} \tag{12}
$$
$$
K_u(S) = \alpha_K^{2}\,K^{*}(S) \tag{13}
$$
where αh and αK are the scaling factors for the soil water retention and unsaturated hydraulic conductivity curves of a soil u, respectively. In the practical reality of contaminant transport studies, however, the unsaturated hydraulic conductivity curve is rarely measured and is therefore most often inferred from the parameters of the water retention curve, ψ(S). Thus most scalings are performed only with respect to the water retention curve [Shouse et al., 1995; Rockhold et al., 1996; Deurer et al., 2000]. When (5) is used to describe the soil water retention curve, all scaled water retention curves share the same pore size index, λ*, and are shifted by a scaling factor that is the ratio between the air entry pressure head of the reference curve, ψ*e, and the air entry pressure head, ψe, such that
$$
\alpha_h = \frac{\psi_e^{*}}{\psi_e} \tag{14}
$$
2.3.2. Leverett Scaling
 Another approach for scaling soil water retention curves was proposed by Leverett, who found that experimental soil water retention curves from different unconsolidated sands could be plotted on the same curve, called the J function, when normalized in the following manner:
where Θ is the effective saturation (dimensionless); σ is the surface tension [MT−2]; k is the intrinsic permeability of the porous medium [L2]; and n is its porosity (dimensionless). Because intrinsic permeability (k) and saturated hydraulic conductivity (K) are directly related [Bear, 1972], Leverett's scaling function allows two different media containing the same fluid to be related to one another by
where the superscript * refers to the reference soil and the subscript u refers to a different soil. If porosity is considered uniform, then Leverett scaling yields
which then gives
Equations (17) and (12) are similar to one another, differing only in the definition of their scaling factors. When only the saturated hydraulic conductivity is considered, equations (18) and (13) are also similar. Usually, the terms “Miller scaling” or “similar media” are used when the soil water retention curves are the primary information for describing the spatial variability in soils [Desbarats, 1995], and “Leverett scaling” is used when the saturated hydraulic conductivity is the primary source [Gerhard and Kueper, 2003], although the terms are sometimes used interchangeably [Essaid et al., 1993; Dillard et al., 1997], perhaps because of the convergence of the two theories.
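The convergence of the two scalings can be made concrete with a small sketch. It rests on assumptions that should be checked against the scaling equations: porosity is uniform, the scale factor is defined so that ψe = ψ*e/α and Ks = α² K*s, and all function names are hypothetical. Under these assumptions, the Miller and Miller route starts from ψe and infers Ks, while the Leverett route starts from Ks and infers ψe.

```python
import math


def miller_scale_factor(psi_e, psi_e_ref):
    """Scale factor from air entry heads (assumed definition: alpha = psi_e_ref / psi_e)."""
    return psi_e_ref / psi_e


def ks_from_scale_factor(alpha, ks_ref):
    """Saturated conductivity implied by a scale factor: Ks = alpha**2 * Ks_ref."""
    return ks_ref * alpha ** 2


def leverett_scale_factor(ks, ks_ref):
    """Scale factor from conductivities, assuming uniform porosity: alpha = sqrt(Ks / Ks_ref)."""
    return math.sqrt(ks / ks_ref)


def psi_e_from_scale_factor(alpha, psi_e_ref):
    """Air entry head implied by a scale factor: psi_e = psi_e_ref / alpha."""
    return psi_e_ref / alpha


if __name__ == "__main__":
    psi_e_ref, ks_ref = -4.728, 4.11   # reference values quoted in the text (cm H2O, m/d)
    psi_e = -2.0                       # hypothetical coarse-textured location
    alpha = miller_scale_factor(psi_e, psi_e_ref)
    ks = ks_from_scale_factor(alpha, ks_ref)
    print(alpha, ks)
    # Going back through the Leverett route recovers the same air entry head
    print(psi_e_from_scale_factor(leverett_scale_factor(ks, ks_ref), psi_e_ref))
```

In both routes Ks and ψe are perfectly correlated, which is the property that the multistep approach described later relaxes.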
 Studies of the impact of variability in soil water retention curves have already been published, but they usually rely on hypothetical data fields generated from prior knowledge of the spatial correlations of the soil hydraulic parameters [Russo, 1991; Harter and Yeh, 1996, 1998; Russo et al., 1998, 2001; Zhang and Lu, 2002]. For example, Harter and Yeh investigated the influence of local measurements of saturated hydraulic conductivities and soil water tension using conditional simulation for predicting solute transport in five hypothetical soils. On the other hand, Essaid et al. [1993], Rockhold et al. [1996], and Dillard et al. [1997] used data on soil hydraulic properties (either measured or inferred) collected at real sites to generate only a single simulation of the properties' spatial variability, precluding an assessment of the uncertainty arising from the description of the soil heterogeneity. In this study, only data on soil hydraulic properties collected at a real field site were used to compare the impact of different ways of describing their spatial variability on solute transport simulations, with no a priori assumptions made regarding the characteristics of the soil hydraulic properties at the site. In addition, the uncertainty in solute transport predictions associated with each approach was investigated through the simulation of multiple realizations.
3.1. Field Site Database
 The soil hydraulic properties used in this study are those in the Las Cruces Trench Site database [Wierenga et al., 1989]. The database was generated as part of a comprehensive field study near Las Cruces, New Mexico, undertaken for testing deterministic and stochastic flow and transport models in the unsaturated zone [Wierenga et al., 1991]. A 24.6 m long by 6.0 m deep trench wall was excavated and 450 samples were taken at nine layers, with 50 equally spaced samples per layer. The saturated hydraulic conductivity was measured in undisturbed samples in the laboratory and 489 measurements were made in situ with a borehole permeameter, 30 cm offset from where the samples were collected. Soil water retention curves were determined for 448 samples with the water content measured at water pressure heads of −10, −20, −40, −80, −120, −200, −300 cm H2O as well as at −1, −5 and −15 bar. More information on the data collection methodology and the experimental protocols is given by Wierenga et al. . The soil characterization and infiltration experiments at this site have been analyzed in a number of articles [Wierenga et al., 1991; Jacobson, 1990; Hills et al., 1991; Rockhold et al., 1996] and reports [Wierenga et al., 1990; Hills and Wierenga, 1991; Hills et al., 1993].
3.2. Simulation of Soil Properties
 All geostatistical simulations of soil properties were conditioned, within each approach, using the entire data set available (448 sets of water retention curves or 489 measurements of saturated hydraulic conductivity). The geostatistical domain was 25 m wide and 6 m deep, with a grid spacing of 10 cm, entailing the simulation of 15,000 grid cells. The high sampling density of the data mitigated the destructuration effect (lack of correlation of extreme values) commonly associated with SGS. Since the spacing of the geostatistical grid is approximately the size of the sample support, there was no need for upscaling the soil properties. Experimental semivariograms were calculated and zonal, anisotropic models were fitted using a combination of nested structures of either spherical models (Sph), given by
$$
\gamma(h) =
\begin{cases}
c\left[1.5\,\dfrac{h}{a} - 0.5\left(\dfrac{h}{a}\right)^{3}\right], & h \le a \\
c, & h > a
\end{cases} \tag{19}
$$
or exponential models (Exp), given by
$$
\gamma(h) = c\left[1 - \exp\left(-\frac{3h}{a}\right)\right] \tag{20}
$$
where c is the variance contribution, h is the distance separating two locations, and a is either the actual (in (19)) or the effective (in (20)) range of the models. The sequential Gaussian simulations and the sequential indicator simulations were performed by the GSLIB programs SGSIM and SISIM, respectively [Deutsch and Journel, 1998].
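A nested, anisotropic semivariogram built from these two structures can be evaluated with the sketch below. It assumes the usual GSLIB-style conventions (spherical model with actual range a, exponential model with effective range a, so the factor 3 appears in the exponent) and handles anisotropy by rescaling the lag components by the directional ranges. The function names and the parameter values in the usage example are hypothetical, not those of Table 2.

```python
import math


def spherical(h, c, a):
    """Spherical model with sill contribution c and actual range a."""
    if h >= a:
        return c
    r = h / a
    return c * (1.5 * r - 0.5 * r ** 3)


def exponential(h, c, a):
    """Exponential model with sill contribution c and effective range a."""
    return c * (1.0 - math.exp(-3.0 * h / a))


def nested_semivariogram(hx, hz, nugget, structures):
    """Sum of nested structures; each is (model, c, ax, az) with directional ranges."""
    if hx == 0 and hz == 0:
        return 0.0                              # semivariogram is zero at zero lag
    total = nugget
    for model, c, ax, az in structures:
        h_reduced = math.hypot(hx / ax, hz / az)  # anisotropic distance in range units
        total += model(h_reduced, c, 1.0)
    return total


if __name__ == "__main__":
    # Hypothetical model: nugget 0.1, short spherical and long exponential structure
    structures = [(spherical, 0.4, 5.0, 1.25), (exponential, 0.5, 20.0, 5.0)]
    for hx in (0.5, 2.0, 5.0, 20.0):
        print(hx, nested_semivariogram(hx, 0.0, 0.1, structures))
```

Setting one directional range much larger than the other reproduces zonal behavior, because the corresponding structure then contributes almost nothing along that direction over the extent of the domain.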
3.2.1. Miller and Miller Scaling
 To implement the Miller and Miller scaling, the multiple linear regression method with “dummy” variables described by Draper [1981] was applied to the 448 sets of capillary pressure head–water content points in the Las Cruces data set. This technique, suggested by Rockhold et al. [1996], requires that θr be equal to 0. Only the data for pressure heads in the range −300 cm H2O ≤ ψ ≤ −40 cm H2O were considered here, following the guidelines of Corey and Brooks. A single slope λ* = 0.2631 was obtained from the regression analysis, with values of ∣ψe∣ ranging from 0.087 cm H2O to 57.63 cm H2O. The values and the summary statistics obtained here (Table 1) are in very good agreement with those presented by Rockhold et al. [1996].
Table 1. Summary Statistics for ∣ψe∣ Values Obtained With Multiple Linear Regression for Miller and Miller Scaling
 In addition to the slope and intercept, one needs a value of air entry pressure for the reference curve, ψ*e, and a reference saturated hydraulic conductivity, K*s, to complete the soil characterization. Since most scaling techniques require that the mean of the scaling factors be equal to 1 [Warrick et al., 1977; Russo and Bresler, 1980; Clausnitzer et al., 1992], the reference value for ψ*e was taken as the ψe value that, when used in (14), provided a set of 448 scaling factors with a mean of 1. For this data set, ψ*e was found to be −4.728 cm H2O. Given the direct relationship between ψe and α, and also the need to use a normal score transform in the SGS approach, simulating either ψe or α would result in the same spatial variability structure. Therefore the values of ψe were directly transformed to normal scores and a directional, zonal anisotropic semivariogram was modeled. The semivariogram parameters are listed in Table 2. The vertical component was approximated as the omnidirectional semivariogram since the data configuration did not allow the estimation of a strictly vertical semivariogram. The horizontal component, on the other hand, showed a very well defined structure, as shown in Figure 2.
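Finding the reference air entry head that gives scaling factors with a mean of 1 reduces to a one-line calculation once the scale factor definition is fixed. The sketch below assumes α = ψ*e/ψe, in which case the required ψ*e is the harmonic mean of the ψe values; if the factor were instead defined as ψe/ψ*e, the arithmetic mean would play that role. The function names and sample values are hypothetical.

```python
def reference_air_entry(psi_e_values):
    """psi_e_ref such that mean(psi_e_ref / psi_e) == 1, i.e. the harmonic mean."""
    n = len(psi_e_values)
    return n / sum(1.0 / p for p in psi_e_values)


def scale_factors(psi_e_values, psi_e_ref):
    """Scale factors under the assumed definition alpha = psi_e_ref / psi_e."""
    return [psi_e_ref / p for p in psi_e_values]


if __name__ == "__main__":
    psi_e = [-0.5, -2.0, -8.0, -20.0]      # hypothetical air entry heads, cm H2O
    ref = reference_air_entry(psi_e)
    alphas = scale_factors(psi_e, ref)
    print(ref, alphas, sum(alphas) / len(alphas))
```

The harmonic mean is dominated by the smallest ∣ψe∣ values, so a few very coarse samples can pull the reference head well below the arithmetic average.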
Table 2. Semivariogram model parameters for the normal scores of ψe and Ks, the indicators of classes 1 to 5, and the normal scores of λ1 to λ5. The models are given by γ(h) = γ0 + γ1 Str(h1) + γ2 Str(h2), with hk = [(hx/axk)^2 + (hz/azk)^2]^(1/2) for k = 1, 2, where Str is the model structure (Sph or Exp), γ0 is the nugget effect, γ1 and γ2 are contributions to the variance, ax1 and ax2 are the ranges in the horizontal direction hx, and az1 and az2 are the ranges in the vertical direction hz.
 The geometric mean of the measured saturated hydraulic conductivity values is typically used as a reference value for scaling [Rockhold et al., 1996; Dillard et al., 1997]. Accordingly, the value of 4.11 m/d was used for K*s. One hundred realizations of the spatial distribution of ψe were generated and the saturated hydraulic conductivity field was obtained from (14) and (18):
3.2.2. Leverett Scaling
 The 489 in situ measurements of saturated hydraulic conductivity were used to determine the scaling factors for the Leverett scaling approach. The values of Ks follow a lognormal distribution; the statistics are summarized in Table 3. As in the Miller and Miller approach for ψe, the K*s value that provided a set of 489 scaling factors with a mean of 1 was found to be 6.043 m/d. Since the scaling factors are a direct rescaling of Ks values, they also follow a lognormal distribution.
Table 3. Summary Statistics for Ks Values Measured in Situ (Based on 489 Values)
 The Ks values were transformed to normal scores and the omnidirectional, as well as the directional, horizontal semivariograms were calculated (the data configuration did not allow the estimation of vertical semivariograms). The spatial variability of the normal scores of Ks was modeled using a combination of anisotropic, exponential models (Table 2) where the vertical semivariogram was assumed to be identical to the omnidirectional semivariogram (Figure 3). One hundred realizations of Ks field were generated, from which soil water retention curves were derived through the application of (17). The reference values for λ* and ψ*e were taken as 0.2631 and −4.728 cm H2O, respectively, as determined in the Miller and Miller scaling.
3.2.3. MultiStep Approach
 To preserve the information about the soil water retention curves seen in Figure 1a, an approach based on a multistep or hierarchical stochastic simulation of soil properties [Alabert et al., 1990; Damsleth et al., 1992; Dillard et al., 1997] may be useful. In this approach, the spatial distribution of soil classes (or facies) is simulated first, followed by the simulation of the properties of interest within these classes. To implement such an approach here, the same 448 sets of water content–pressure head data used in the Miller and Miller scaling were fit with (5), following the guidelines of Corey and Brooks. θr was taken as the value that provided the best fit of (5) to the logarithm of the values of Θ and ∣ψ∣. Using nonzero values for θr resulted in significantly different values for λ from those obtained assuming θr = 0, as shown in the scatterplots in Figure 4. The best fit θr values ranged from 0 to 0.13. Zero was the best fit value for θr for only 141 out of 448 samples, leading to significant differences in the statistics of ψe and λ (Table 4).
Table 4. Summary Statistics for the Brooks-Corey Parameters Assuming Either θr ≥ 0 or θr = 0 (Based on 448 Values). Column groups: assuming θr ≥ 0 and assuming θr = 0, each including ∣ψe∣ in cm H2O. Surviving entries: 1.20 × 10−5 and 2.22 × 10−7.
 Since the water content at ψ = −100 cm H2O seems to capture the breadth of variation in the soil water retention curve, the effective saturation at a water pressure head of −100 cm H2O, Θ100, was determined for the 448 fitted water retention curves. The population of soil water retention curves was then divided into 5 classes of equal size using as limits the 20th, 40th, 60th, and 80th percentiles of the cumulative distribution function of Θ100, namely, 0.278, 0.376, 0.512, and 0.657 (Figure 5a). Following this classification, five clusters of ψe versus λ values appear on the scatterplot (Figure 5b). Each cluster can be modeled by
$$
|\psi_e| = a_i\,\lambda + b_i + \varepsilon_i, \qquad i = 1, \ldots, 5 \tag{22}
$$
where ai and bi are the slope and intercept of the line describing class i, and εi is the random deviation of the model for each class, assumed to follow a normal distribution with mean 0 and variance σi^2, εi ∼ N(0, σi^2) [Draper, 1981]. The values of ai, bi, and σi are shown in Table 5, along with the coefficient of determination, R2, and the statistic from the F test of significance of the regression, F. The R2 values indicate a moderate to good fit. A higher proportion of variance could be explained with the addition of more classes; however, sufficient data must be present in each class to allow a good modeling of the spatial variability and the conditioning of the realizations to hard data. The use of Θ100 as a classification criterion is not unique. A breakdown into classes similar to that shown in Figure 5a was obtained by evaluating d∣ψ∣/dΘ at Θ = 0.9; however, there was more overlap between classes using this method than using the quantiles of Θ100 [Oliveira, 2004].
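The classification and the per-class regression can be sketched as follows. The class limits are the Θ100 percentiles quoted in the text; everything else is an assumption made for illustration: Θ100 is computed from the Brooks-Corey form, a value exactly on a limit is assigned to the upper class, the regression is ordinary least squares of ∣ψe∣ on λ, and the function names are hypothetical.

```python
from bisect import bisect_right

CLASS_LIMITS = [0.278, 0.376, 0.512, 0.657]   # 20th to 80th percentiles of Theta100


def theta_100(psi_e, lam):
    """Brooks-Corey effective saturation at a pressure head of -100 cm H2O."""
    if psi_e <= -100.0:
        return 1.0
    return (psi_e / -100.0) ** lam


def soil_class(theta100):
    """Class 1 (easiest to drain) to class 5 (most difficult to drain)."""
    return bisect_right(CLASS_LIMITS, theta100) + 1


def fit_line(x, y):
    """Least squares slope, intercept, and residual standard deviation (n - 2 dof)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, (sse / (n - 2)) ** 0.5


if __name__ == "__main__":
    # The reference curve from the scaling analysis falls in the middle class
    print(soil_class(theta_100(-4.728, 0.2631)))
    # Hypothetical (lambda, |psi_e|) pairs belonging to one class
    print(fit_line([0.10, 0.20, 0.30, 0.40], [1.2, 2.9, 5.1, 6.8]))
```

Fitting each class separately is what allows the five lines to have different slopes, intercepts, and residual variances.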
Table 5. Results of Regression Analysis of ∣ψe∣ Versus λ (Equation (22))
Values of F ≥ 6.73 indicate that the hypothesis that the regression slope is equal to zero can be rejected with a 99% confidence.
 To generate different realizations of soil properties, first, the spatial distribution of the five classes of soil water retention curves was simulated using sequential indicator simulation (SIS). Figure 6 shows the spatial distribution of soil classes in the Las Cruces Trench profile. Class 5 (the most difficult to drain) dominates the bottom of the soil profile whereas class 1 (the easiest to drain) prevails in the upper middle section. The remaining classes are scattered throughout the profile. Class indicator semivariograms were calculated and modeled [Oliveira, 2004] as anisotropic with the vertical range taken as approximately one fourth of the observed horizontal range. The parameters of the semivariogram models are listed in Table 2.
 Next, the values of λ were analyzed and simulated for each class separately. This is necessary because, as the statistics listed in Table 6 show, the expected values and standard deviations of λ differ from one class to another. This approach ensures that the statistics and the spatial structure of λ within each class are reproduced, and that the values of ψe fall within the expected range. The values of λ were transformed to normal scores, for which semivariograms were calculated and then modeled with the parameters shown in Table 2 (semivariogram graphs can be found in Oliveira [2004]). Finally, the statistics of the residual water content, θr, differed among classes. The average values of θr in classes 4 and 5 were 0.030 and 0.015, respectively, which are considerably different from the average value of 0.080 found for θr in classes 1, 2 and 3. Therefore these values were used in the multistep approach, in an attempt to generate soil realizations as close to reality as possible. The average saturated water content, θs, ranged from 0.318 to 0.327; therefore the value of 0.32 was used for all classes, the same value used with the scaling approaches. In the scaling approaches, θr was set equal to 0.05, the average value found in the original data set.
Table 6. Summary Statistics of λ Values Within Each Soil Class
 To incorporate the uncertainty about the spatial distribution of soil classes and λ values, the two-stage approach suggested by Damsleth et al. [1992] was implemented here. Ten realizations of the distribution of soil classes were generated, each one coupled with ten realizations of λ values, resulting in one hundred realizations of soil water retention curves. The ψe value assigned to each grid cell was determined by the value of λ and the appropriate form of (22) for the soil class assigned to the same grid cell. Given the randomness introduced by the last term in (22), positive values of ψe or very small values of ∣ψe∣ could be predicted, in which case εi was set to 0 to ensure a realistic soil representation. Such corrections were made, on average, to 1.5% of the grid cells. ψe values that were not reproduced at sampling locations could have been corrected a posteriori, but this was deemed unnecessary here since the number of simulated values is much larger than the number of sampled data. Figure 7 shows one realization of the spatial distribution of soil classes, in which the predominance and continuity of classes 1 and 5 in their respective regions of occurrence are reproduced very well by the model (these two classes have lower relative nugget effect values). Class 3 is distributed in a patchy pattern following the sampled data, whereas classes 2 and 4 are scattered throughout the domain, partly because the sampled data are scattered and partly because of the higher relative nugget effect values associated with these classes.
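The pairing of realizations and the assignment of ∣ψe∣ to a grid cell can be sketched as below. This is an illustration under stated assumptions: the regression has the linear form ∣ψe∣ = a λ + b + ε, the cutoff for a "very small" value (`min_abs`) is a hypothetical parameter because the text does not give one, and the regression coefficients in the usage example are invented rather than taken from Table 5.

```python
import random
from itertools import product


def draw_abs_psi_e(lam, slope, intercept, sigma, rng, min_abs):
    """Draw |psi_e| for one cell; fall back to the regression line if the draw is unrealistic."""
    trend = slope * lam + intercept
    value = trend + rng.gauss(0.0, sigma)    # add the random deviation epsilon
    if value < min_abs:                       # negative or very small |psi_e|
        value = trend                         # epsilon set to 0
    return value


# Two-stage design: every class realization is paired with every lambda realization
N_CLASS_FIELDS = 10
N_LAMBDA_FIELDS = 10
REALIZATION_PAIRS = list(product(range(N_CLASS_FIELDS), range(N_LAMBDA_FIELDS)))


if __name__ == "__main__":
    rng = random.Random(42)
    print(len(REALIZATION_PAIRS))             # number of retention-curve realizations
    # Hypothetical class coefficients and cutoff, for illustration only
    cells = [draw_abs_psi_e(0.3, 30.0, 1.0, 4.0, rng, 0.05) for _ in range(5)]
    print(cells)
```

Because the fallback replaces only the offending draws, the residual variance is slightly reduced in the affected cells, which is acceptable when such cells are a small fraction of the grid.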
 Because the uncertainty in the spatial distribution of Ks must also be incorporated, the soil water retention curve realizations were coupled with one hundred conditional realizations of Ks, assuming independence between Ks and the parameters λ and ψe. The assumption of independence is a total departure from the Miller and Miller and the Leverett scaling theories, in which Ks and ψe are completely correlated. Still, it is an assumption often made in the literature [Russo and Bouton, 1992; Harter and Yeh, 1998; Russo et al., 2001; Zhang and Lu, 2002] and one that is corroborated by an analysis of soil data in this study, presented in the form of scatterplots in Figure 8, and elsewhere [Hills et al., 1992]. Because the distributions of Ks values in the five classes were very similar, the same mean and variance (Table 3) were used in all five classes. The multistep approach could incorporate class-specific statistics of Ks data in the same way that the values of λ were treated. In this data set, however, Ks varies over only 4 orders of magnitude, since the site is composed of basically sandy soils. Had Ks varied over more orders of magnitude, for example, with mixed populations of sands and clays, the assumption of common statistics might not have been valid, and individual simulations of Ks within each soil class should have been adopted, as Dillard et al. [1997] proposed.
3.3. Numerical Flow and Transport Model
 The realizations of soil hydraulic properties generated as described in Section 3.2 with the three models of spatial heterogeneity were input into a numerical water flow and solute transport model, a modified version of HYDRUS-2D [Simunek et al., 1999]. The code was modified so that soil physical properties such as Ks and the parameters describing the θ(ψ) curve could be assigned to elements instead of to nodes in the finite element mesh [Oliveira, 2004], and also to compute metrics for the subsequent analysis, such as the time of first arrival at different horizontal planes. On the basis of the work by Rockhold et al. [1996], D* was set equal to 1 × 10−4 m2/d, and DL and DT were set equal to 3 × 10−2 m and 3 × 10−3 m, respectively.
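With these parameter values, the components of the dispersion coefficient tensor can be evaluated for any pore water velocity. The sketch below assumes the standard form D_ij = DT ∣v∣ δij + (DL − DT) vi vj/∣v∣ + D* τw δij with τw = θ^(7/3)/θs^2; the function names and the velocity used in the example are hypothetical.

```python
import math

D_MOL = 1.0e-4   # molecular diffusion coefficient, m2/d
D_L = 3.0e-2     # longitudinal dispersivity, m
D_T = 3.0e-3     # transverse dispersivity, m


def tortuosity(theta, theta_s):
    """Tortuosity factor tau_w = theta**(7/3) / theta_s**2."""
    return theta ** (7.0 / 3.0) / theta_s ** 2


def dispersion_tensor(vx, vz, theta, theta_s):
    """Return (Dxx, Dzz, Dxz) in m2/d for pore water velocity components in m/d."""
    speed = math.hypot(vx, vz)
    diffusion = D_MOL * tortuosity(theta, theta_s)
    if speed == 0.0:
        return diffusion, diffusion, 0.0      # pure diffusion when water is stagnant
    dxx = D_T * speed + (D_L - D_T) * vx * vx / speed + diffusion
    dzz = D_T * speed + (D_L - D_T) * vz * vz / speed + diffusion
    dxz = (D_L - D_T) * vx * vz / speed
    return dxx, dzz, dxz


if __name__ == "__main__":
    # Hypothetical downward pore water velocity of 0.1 m/d at theta = 0.2, theta_s = 0.32
    print(dispersion_tensor(0.0, -0.1, 0.2, 0.32))
```

For purely vertical flow the off-diagonal term vanishes and the mechanical parts of Dzz and Dxx reduce to DL ∣v∣ and DT ∣v∣, so their ratio equals the 10:1 ratio of the dispersivities before diffusion is added.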
 In this study, the center of the trench was considered. A subdomain 11 m wide and 6 m deep (the horizontal center of the subdomain coincides with the horizontal center of the trench) was discretized by a regular finite element mesh of triangles with horizontal and vertical sides of 10 cm. Trial runs with a finer grid of 10 cm (horizontal) × 2.5 cm (vertical) produced results with similar moments and concentration profiles but increased the total computational time from 25 minutes to 358 minutes, a roughly 14-fold increase. Accordingly, the coarser grid was used in this study. The boundary conditions for water flow consisted of no-flow boundaries at the sides of the domain, a unit gradient boundary condition at the bottom boundary, and a variable volumetric flux at the top boundary:
for 4.9 m ≤ x ≤ 6.1 m, and
for 0 m ≤ x < 4.9 m and for 6.1 m < x ≤ 11 m. The constant rate of 23 cm/year corresponds to the precipitation reported for the site [Wierenga et al., 1989] and the pulse of 1.82 cm/d corresponds to the infiltration rate applied in experiment 2 [Wierenga et al., 1990]. For the solute transport, a no-flow boundary condition was set everywhere except at the bottom boundary where a free flow condition was set, and at the top boundary for 4.9 m ≤ x ≤ 6.1 m, where a variable solute mass flux was set of
where C0 is a normalized concentration. To obtain an initial field of water pressure heads that reflected the heterogeneity at the site, the domain was subjected to a uniform infiltration rate of 23 cm/year at the top boundary for 1000 days, with a unit gradient boundary at the bottom and no-flow boundary conditions on the sides. This period was sufficient for the pressure heads in the system to reach a quasi-steady state. Cumulative water and solute mass balance errors were, on average, on the order of 0.001% and 1%, respectively.
 The effect of the various spatial variability models on the transport of a conservative solute was assessed through the analysis of metrics derived from the simulations' results. The spatial moments of the solute plume at a time, t, are given by [Russo, 1991]
The zeroth moment, M00, is the amount of mass within the domain. The first normalized moments represent the position of the centroid of the plumes in the x and z directions and were computed, respectively, as
and the second normalized moments, which represent the spread of the plume about the center of mass in the x and z directions, were computed as
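For reference, a standard form of these moment definitions, consistent with the descriptions above, is sketched below. The weighting by the volumetric water content θ and the symbol Ω for the simulated domain are assumptions made here; the exact expressions used in the study may differ in detail.

```latex
M_{ij}(t) = \int_{\Omega} \theta(x,z,t)\, C(x,z,t)\, x^{i} z^{j} \, dx \, dz

x_c = \frac{M_{10}}{M_{00}}, \qquad z_c = \frac{M_{01}}{M_{00}}

\sigma_{xx}^{2} = \frac{M_{20}}{M_{00}} - x_c^{2}, \qquad
\sigma_{zz}^{2} = \frac{M_{02}}{M_{00}} - z_c^{2}
```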
The mean values of these metrics were calculated, after 150 days of simulation, for increasing numbers of realizations up to 100, by which point they were observed to have stabilized. Thus 100 realizations of the distribution of soil hydraulic properties were generated using the Miller and Miller, the Leverett, and the multistep approaches, and were then input into the modified version of HYDRUS-2D.
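The moment calculation and the check on the number of realizations can be sketched as follows. This is a minimal illustration, not the code added to HYDRUS-2D: it assumes cell-centered values on a regular grid, weights concentration by water content, and uses hypothetical function names (`spatial_moments`, `running_mean`).

```python
def spatial_moments(theta, conc, xs, zs, cell_area):
    """Normalized spatial moments of a plume on a regular grid.

    theta, conc: 2-D lists indexed [row][col] (water content, concentration).
    xs, zs: coordinates of column and row centers; cell_area: area of one cell.
    Assumed definitions: M_ij = sum(theta * C * x^i * z^j * cell_area).
    """
    m00 = m10 = m01 = m20 = m02 = 0.0
    for j, z in enumerate(zs):
        for i, x in enumerate(xs):
            mass = theta[j][i] * conc[j][i] * cell_area
            m00 += mass
            m10 += mass * x
            m01 += mass * z
            m20 += mass * x * x
            m02 += mass * z * z
    if m00 == 0.0:
        raise ValueError("no solute mass in the domain")
    xc = m10 / m00
    zc = m01 / m00
    return {
        "M00": m00,
        "xc": xc,
        "zc": zc,
        "sxx2": m20 / m00 - xc * xc,  # spread about the centroid in x
        "szz2": m02 / m00 - zc * zc,  # spread about the centroid in z
    }


def running_mean(values):
    """Mean of the first n values for n = 1..len(values)."""
    out = []
    total = 0.0
    for n, v in enumerate(values, start=1):
        total += v
        out.append(total / n)
    return out
```

Plotting `running_mean` of a metric such as zc against the number of realizations gives a simple visual check of whether the ensemble mean has stabilized.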
 Times of first arrival for a normalized concentration threshold of 10^−4 at a depth of 3 m were also recorded for each simulation. The statistics for the ensemble of times of first arrival for each scenario can be used, for example, to assess how long it would take for the plume to reach a point of compliance, such as the water table or a drinking water well, and the uncertainty associated with this time.
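A post-processing sketch of this metric is given below. It assumes that, for each realization, the peak normalized concentration on the 3 m plane is available at a series of output times; the linear interpolation between output times and the function names are assumptions made for illustration, not a description of the modified HYDRUS-2D code.

```python
from statistics import mean, stdev


def first_arrival_time(times, concentrations, threshold=1e-4):
    """Earliest time at which the concentration reaches the threshold.

    Interpolates linearly between the last output time below the threshold
    and the first one at or above it. Returns None if never reached.
    """
    prev_t = prev_c = None
    for t, c in zip(times, concentrations):
        if c >= threshold:
            if prev_t is None:
                return t  # already above threshold at the first output time
            frac = (threshold - prev_c) / (c - prev_c)
            return prev_t + frac * (t - prev_t)
        prev_t, prev_c = t, c
    return None


def ensemble_summary(arrival_times):
    """Basic statistics over realizations in which the threshold was reached."""
    reached = [t for t in arrival_times if t is not None]
    return {
        "n": len(reached),
        "mean": mean(reached),
        "sd": stdev(reached),  # sample standard deviation; needs n >= 2
        "min": min(reached),
        "max": max(reached),
    }
```

Realizations in which the threshold is never reached are excluded from the summary here; reporting their count alongside the statistics avoids understating the uncertainty.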
4. Results and Discussion
Figure 9 shows maps of the first of the one hundred realizations of the spatial distribution of Ks, ψe and λ generated using the three approaches described in Section 3.2. The ψe field generated using Miller and Miller scaling (top row) has a balanced proportion of low and high values, with high values occupying predominantly the bottom of the domain. Consequently, the Ks field generated by scaling has a similar balance of low and high values, with lower values in the bottom part of the domain. The Ks field resulting from the SGS algorithm (middle row) displays a larger proportion of high values than the Ks field generated by the Miller and Miller scaling, but it reflects the real distribution of Ks in the field. For both scaling approaches, the λ fields correspond to a single uniform value throughout. The ψe and λ fields obtained with the multistep approach (bottom row) are significantly different from those created with the scaling techniques: values of λ and ψe vary over their respective expected ranges, and, most importantly, the prior categorical simulation generated clear patterns of continuity and contrasting values, especially for λ. Low values of λ corresponding to class 5 predominate in the bottom of the domain, juxtaposed with higher values corresponding to class 1. Conversely, high values of ψe, typical of class 5, dominate the bottom region, whereas lower values, typical of class 1, are clustered in the middle section of the trench. The same kind of pattern for ψe is seen in the scaling approaches; however, the λ pattern is unique to the multistep approach.
 The differences in soil hydraulic properties that the three approaches generate can be readily grasped by examining Figures 10 and 11. Figure 10 shows a complete correlation between the Ks and ψe fields generated with the Miller and Miller and also with the Leverett scaling approaches, while the value of λ is constant regardless of the value of ∣ψe∣. This leads to water retention curves (Figures 11, left, and 11, middle) that lack the behavior observed from field data (Figure 1a). With the multistep approach, on the other hand, Ks and ψe are not correlated, as in the original data set. Furthermore, the points on the ψe versus λ scatterplot are clustered following the five classes depicted in Figure 5b, with their correlation as well as their variances being reproduced. These distinct relationships lead to water retention curves that have different shapes (Figure 11, right), with multiple crossings, mimicking the original set of curves measured in the field (Figure 1a).
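The complete correlation seen for the scaling approaches follows from the form of the scaling relations themselves. As a sketch, assuming the usual similar-media form with a local scale factor a (a symbol introduced here for illustration), reference values ψe* and Ks*, and, for Leverett scaling, a constant porosity, both theories tie Ks to ψe through a single power law:

```latex
\psi_e = \frac{\psi_e^{*}}{a}, \qquad K_s = a^{2} K_s^{*}
\quad\Longrightarrow\quad
\log K_s = \log K_s^{*} + 2 \log\lvert\psi_e^{*}\rvert - 2 \log\lvert\psi_e\rvert
```

Under these assumptions every point falls on a line of slope −2 in a log Ks versus log ∣ψe∣ plot, so specifying one field fixes the other.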
 These differences in the approaches to describe the spatial variability of soil hydraulic properties are reflected in the solute transport simulations. Figures 12 and 13 show solute plumes and velocity profiles, respectively, obtained after 150 days of solute transport simulation using the fields of soil hydraulic properties depicted in Figure 9. The Miller and Miller approach produced plume profiles that were, in general, a little wider and shallower than those obtained with the Leverett scaling, but both methods generated very smooth plume shapes, as if they were simulated in homogeneous domains, with very little variation among realizations. Their velocity profiles were also very smooth, with virtually no signs of preferential flow. Interestingly, the plume shapes resulting from the simulation of scaled soil hydraulic properties are very similar to those reported by Rockhold et al., even though that study used another part of the trench and a slightly different set of initial conditions. On the other hand, the multistep approach produced complex plume shapes, as exemplified in Figure 12, a direct consequence of several preferential pathways that were created in the domains (Figure 13). These preferential flow paths changed from one realization to another, leading to plumes of various shapes.
 The reason for this divergent behavior of simulated plumes using the multistep approach lies in the fact that the shape of the unsaturated hydraulic conductivity curve, K(ψ), depends on λ (equation (6)). The value of λ used in the scaling approaches was 0.2631. In the multistep approach, however, the λ values were higher, with a median of 0.420 (Table 4), a direct consequence of the approach used to fit the Brooks-Corey function (section 3.2.3). These high values of λ cause a rapid reduction of unsaturated hydraulic conductivity as ∣ψ∣ increases. Therefore adjacent cells having high values of λ will act as barriers to water flow, since they lose their capacity to conduct water very quickly, while adjacent cells having low values of λ will give rise to preferential pathways, since they maintain relatively high unsaturated hydraulic conductivities at high ∣ψ∣.
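The sensitivity of K(ψ) to λ can be illustrated with a short calculation. The sketch below assumes a Brooks-Corey conductivity of the form K/Ks = (∣ψe∣/∣ψ∣)^(2 + cλ), with c = 3 as in the Burdine-type exponent; equation (6) is not reproduced here and may use a different exponent, so the function name and the default value of c are illustrative only.

```python
def relative_conductivity(psi_ratio, lam, c=3.0):
    """Return K/Ks for a Brooks-Corey soil.

    psi_ratio is |psi| / |psi_e|. Assumed form (not taken from the source):
    K/Ks = psi_ratio ** -(2 + c * lam), with c = 3 by default.
    """
    if psi_ratio <= 1.0:
        return 1.0  # at or wetter than air entry the soil conducts at Ks
    return psi_ratio ** -(2.0 + c * lam)


# lambda values quoted in the text: scaling approaches vs. multistep median
for ratio in (2.0, 10.0, 100.0):
    k_scaling = relative_conductivity(ratio, 0.2631)
    k_multistep = relative_conductivity(ratio, 0.420)
    print(f"|psi|/|psi_e| = {ratio:6.1f}  K/Ks: {k_scaling:.3e} vs {k_multistep:.3e}")
```

With c = 3, the ratio between the two curves grows by a factor of about 3 for each decade of ∣ψ∣/∣ψe∣, which is the mechanism by which neighboring cells with contrasting λ behave as barriers or as conduits.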
 The spatial distribution of the soil water retention curve parameters dominates the determination of the flow pathways, while the saturated hydraulic conductivity plays a secondary role. This is demonstrated here by observing that the Leverett scaling and the multistep approaches share the same Ks fields; therefore the differences in their predicted solute plumes may be attributable to the variability of λ and ψe, which control the soil water retention curve and the unsaturated hydraulic conductivity function. This is further demonstrated by Oliveira, who generated and ran two sets of simulations: one where the Ks field was kept constant while ψe and λ changed and another where the Ks field changed while ψe and λ were kept constant. There, also, a clear predominance of the water retention curve parameters over the saturated hydraulic conductivity in determining the preferential pathways was observed.
Figure 14 shows the temporal evolution of each of the spatial moments for the three approaches examined here. Both scaling methods provided a very similar evolution of moments despite the large differences observed for the Ks and ψe fields (Figure 9). The first and second moments of the 100 simulations produced using the scaling methods had very small standard deviations (Table 7). The larger spread in values of xc and zc for the multistep approach reflects the fact that the plumes assumed various shapes, sometimes shallow and wide, sometimes deep and narrow. For the multistep approach, the spaces of uncertainty of the first moments, xc and zc, encompass the ones obtained with the scaling techniques (Figure 14 and Table 7), indicating that this method provides conservative estimates for these metrics, as its results cover a broader spectrum of possibilities. Its ensemble metrics therefore show a very different picture from those resulting from the scaling techniques, with higher values for the mean and for the standard deviation of σxx^2 and σzz^2.
Table 7. Mean and Standard Deviation of the First and Second Spatial Moments of the Contaminant Plume After 150 days of Simulation
 First arrival times for a normalized concentration of 10^−4 at a plane located at a depth of 3 m were recorded for each simulation. Basic statistics of these sets of values are shown in Table 8. Normal probability density functions for each one, derived from their respective means and standard deviations, are shown in Figure 15. The distributions of times of first arrival predicted by the three approaches did not overlap much, with the spread around the mean being much smaller for the scaling approaches than for the multistep approach. The predicted times are shorter using Leverett scaling than using Miller and Miller scaling, with mean values of 56.4 and 59.2 days, respectively, which may be explained by the higher values of Ks present in the domains obtained with Leverett scaling. The multistep simulations predicted a mean time of first arrival of 50.6 days, and the upper tail of its pdf overlaps the pdf of the Leverett scaling times. In addition, the mean time predicted with the Miller and Miller scaling exceeds the maximum time obtained with the multistep approach (Table 8). Thus the multistep approach would lead to a more conservative assessment of the uncertainty about the time that a solute would take to move through the unsaturated zone.
Table 8. Statistics for Times of First Arrival of a Normalized Concentration Threshold of 10^−4 at a Depth of 3 m (values are in days)
 The results presented here show that the scaling approaches, despite having different Ks and ψe fields, led to concentration plumes with very similar shapes, similar spatial moments and similarly narrow spaces of uncertainty. Although the primary data fields used in these methods (either the soil water retention curves or the saturated hydraulic conductivity) vary among realizations, the resulting concentration plumes were all alike and their shapes resemble those obtained using homogeneous fields, even though the heterogeneity of the Las Cruces Trench is well established [Hills et al., 1992]. The assumption of a strict correlation between Ks and ψe fields, along with a uniform λ field, seems to be unrealistic and produces results that potentially obscure the real properties of the site. The water contents coalesced very well in the Miller and Miller approach [Rockhold et al., 1996; Oliveira, 2004]; yet using the same scaling factors to predict Ks couples the properties in a way that is not supported by field data. Similarly, starting with saturated hydraulic conductivity information and applying Leverett scaling to predict water retention curves establishes a strict correlation that may not exist. The results here therefore echo the conclusions of Jury et al. [1987a], that both the soil water retention parameters and the saturated hydraulic conductivity need to be analyzed separately. Another drawback of the scaling methods is that if only measurements of one property are available, choosing the reference parameters for the second set of properties (either ψe* and λ* in the Leverett scaling or Ks* in the Miller and Miller scaling) is either subjective or requires some additional measurements. Depending on the set of reference parameters chosen, the scaling procedure can potentially result in unrealistic predictions of soil properties in the domain. For example, a quick calculation shows that the lowest permeability value in Dillard et al., 10^−18.75 m^2, corresponds to a value of 6.5 × 10^−4 m^−1 for the van Genuchten shape parameter α [van Genuchten, 1980], producing a soil with an air-entry pressure head of approximately −60,000 cm H2O.
 Assuming the existence of measurements of both soil water retention curves and saturated hydraulic conductivities, the multistep approach showed the potential to generate preferential flow paths, thus capturing the impacts of the heterogeneity of a soil's hydraulic properties on the flow of water and solute transport in the unsaturated zone. It may be argued that the differences seen in the multistep technique result solely from the use of additional information. Even so, it shows the importance of measuring and analyzing Ks and θ(ψ) independently, rather than relying on one property to generate information about the other. Because the Ks distributions looked similar across the soil classes (Figure 8), the same mean and variance were used throughout (Table 3). The multistep approach could, however, incorporate class-specific statistics of Ks data in the same way values of λ were generated. The potential spatial cross correlation between soil classes was not considered nor was the spatial cross correlation between λ and ψe within classes. These aspects deserve further investigation, particularly for sites where the hydraulic conductivity varies over several orders of magnitude, such as the Bemidji site, where Ks varies over nine orders of magnitude [Dillard et al., 1997].
 We would like to thank Jirka Simunek for providing the source code of HYDRUS-2D and for helpful comments concerning the modification of the code. Thanks are also owed to Peter Wierenga and Richard Hills for providing the Las Cruces Trench Site data. Mark Rockhold is acknowledged for helpful discussions regarding the Las Cruces data. We also would like to thank the two anonymous reviewers for their comments and suggestions. Funding for this work was provided by a fellowship from CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico–Brazil) to the first author under grant 200765/98-1 and from the Department of Civil and Environmental Engineering at the University of Michigan.