Hydraulic tomography is a method that images the hydraulic heterogeneity of the subsurface through the inversion of multiple pumping or cross-hole hydraulic test data. Transient hydraulic tomography differs from steady state hydraulic tomography in that it utilizes transient hydraulic head records to yield the distribution of hydraulic conductivity (K) as well as specific storage (Ss) of an aquifer. In this paper we demonstrate the robustness of transient hydraulic tomography through the use of hydraulic head data obtained from multiple cross-hole pumping tests conducted in a laboratory sandbox with deterministic heterogeneity. We utilize the algorithm developed by Zhu and Yeh (2005) to conduct the transient inversions and validate the K and Ss tomograms using a multimethod and multiscale validation approach previously proposed by Illman et al. (2006). Validation data consist of cross-hole tests not used in the inversion as well as other hydraulic tests that provided local (core, single-hole tests) as well as large-scale (unidirectional flow-through tests) estimates of hydraulic parameters. Results show that the algorithm is able to yield consistent estimates that agree with independently collected local as well as large-scale hydraulic parameter data. In addition, we find that transient hydraulic tomography requires fewer pumping tests than steady state hydraulic tomography to produce a K tomogram of similar quality, as the former approach utilizes more data from each pumping test. Overall, we find that transient hydraulic tomography is a robust subsurface characterization technique that can delineate the subsurface heterogeneity in both K and Ss from multiple pumping or cross-hole hydraulic tests.
 Subsurface investigations for water supply assessment and contaminant transport rely on the characterization of hydraulic parameters. For mathematical convenience, traditional subsurface characterization approaches treat the medium as homogeneous to yield equivalent parameters that are valid only for bulk behaviors of flow in aquifers [Yeh, 1992]. This is the case despite the fact that subsurface geology and associated hydraulic parameters are heterogeneous at multiple scales. For many engineering purposes and site investigations of groundwater flow, these equivalent or averaged parameters are considered to be adequate. However, past and recent studies [e.g., Zheng and Gorelick, 2003; Illman and Hughson, 2005] have shown that capturing subsurface heterogeneity is important for predictions of contaminant transport. The knowledge of the detailed three-dimensional distributions of hydraulic conductivity (K) is especially critical in contaminant transport studies because of their sensitivity to small-scale K heterogeneities and their connectivity [Mas-Pla et al., 1992; Yeh et al., 1995]. Likewise, the knowledge of the subsurface distribution of specific storage (Ss) is critical in assessing groundwater storage [Wu et al., 2005].
 Information about the spatial variability of flow parameters is most commonly obtained through inference from small-scale measurements of cores, slug/bail tests, traditional aquifer tests [e.g., Theis, 1935; Cooper and Jacob, 1946], and single-hole pressure tests by geostatistical methods, which require numerous measurements. This requires drilling numerous boreholes and conducting multiple measurements within various depth intervals in each of them using sophisticated equipment. The approach is expensive and time-consuming. Besides, the physical meaning of the flow parameter estimates from either traditional aquifer tests [Wu et al., 2005] or slug tests [Beckie and Harvey, 2002] is considered to be dubious. Furthermore, it is not clear that geostatistical analysis of data collected on relatively small support scales is necessarily indicative of medium properties that impact flow and transport on scales that are much larger. Moreover, kriging and related interpolation tools generally extrapolate and interpolate values based on the ensemble structure of geologic media. They do not necessarily provide a realistic mapping of heterogeneity in one realization.
 One alternative to traditional geostatistical analyses is hydraulic/pneumatic tomography [e.g., Gottlieb and Dietrich, 1995; Butler et al., 1999; Yeh and Liu, 2000; Vesselinov et al., 2001; Liu et al., 2002; Bohling et al., 2002; Brauchler et al., 2003; McDermott et al., 2003; Zhu and Yeh, 2005, 2006; W. A. Illman et al., Steady-state hydraulic tomography in a laboratory aquifer with deterministic heterogeneity: Multi-method and multiscale validation of hydraulic conductivity tomograms, submitted to Journal of Hydrology, 2006, hereinafter referred to as W. A. Illman et al., submitted manuscript, 2006]. Hydraulic tomography is potentially a cost-effective technique for characterizing subsurface heterogeneity of hydraulic parameters. During hydraulic tomography surveys, hydraulic heads induced by sequential pumping or injection tests at different locations of an aquifer are collected at a large number of subsurface locations. These hydraulic head data are then used to interpret the spatial distribution of hydraulic parameters of the aquifer. Observed hydraulic heads from pumping or injection at different locations provide more constraints on this interpretation. Pneumatic tomography is similar in concept to hydraulic tomography, but the well tests are conducted with air in the unsaturated zone [Illman and Neuman, 2001, 2003; Vesselinov et al., 2001].
 In particular, Zhu and Yeh [2005] developed an algorithm for transient hydraulic tomography through the use of the sequential successive linear estimator (SSLE). Their approach combines the traditional geostatistical approach with the physical principles governing flow to interpolate and extrapolate at locations where samples are not available. As a consequence, the SSLE as implemented in hydraulic tomography yields more realistic estimates than kriging and deterministic/zone-based inverse modeling approaches that consider principles of flow and use one pumping or injection data set only. They showed that K and Ss distributions can be obtained through data sets from numerically simulated pumping tests in synthetic heterogeneous aquifers. They suggested that transient hydraulic tomography is a potentially cost-effective and high-resolution technique for mapping spatial distributions of K and Ss in aquifers.
 While various algorithms for hydraulic tomography have been developed and some of them have been validated in sandbox experiments [Liu et al., 2002; W. A. Illman et al., submitted manuscript, 2006], to date, a comprehensive validation of transient hydraulic tomography has not been done in either a laboratory or a field setting. A field validation is the ultimate goal, but prior to that, laboratory validations are necessary because the baseline heterogeneity is largely known and all forcing functions and errors can be controlled, unlike in field applications. The main objectives of this paper are (1) to sequentially invert cross-hole pumping test data obtained in a synthetic aquifer with deterministic heterogeneity in sandbox experiments to obtain K and Ss tomograms using the transient hydraulic tomography algorithm of Zhu and Yeh [2005] and (2) to validate the K and Ss tomograms using various independent data and methods.
2.1. Synthetic Aquifer Description
 The synthetic heterogeneous aquifer constructed in the sandbox was designed to test the hydraulic tomography algorithms. The dimensions of the sandbox are 193.0 cm in length, 82.6 cm in height, and 10.2 cm in depth. Forty-eight ports, 1.3 cm in diameter, were cut out of the stainless steel wall to allow coring of the aquifer, to allow installation of horizontal wells and pressure instrumentation, and to serve as pumping ports. The flow system for the cell is provided by two constant-head reservoirs, one at each end of the sandbox. A Mariotte device was connected to the reservoirs to maintain a constant head on both boundaries in cross-hole pumping tests. Three constant-head boundaries can be developed by ponding water over the top of the sand, effectively connecting the two reservoirs. We chose this boundary condition configuration for all cross-hole tests used in hydraulic tomography as they were most stable during each test. The stability of boundary conditions is one of the most important requirements for the laboratory validation of hydraulic tomography. As the design of this sandbox did not allow for maintaining a no-flux boundary at the top very effectively, we did not consider such a boundary at the top of the sandbox.
 Four different commercially sieved sands (20/30 and 40/30, U.S. Silica; F-75 and F-85, Unimin Corporation) were used to pack the sandbox by hand. In particular, we packed eight rectangular sand bodies consisting of lower K material (40/30, F-75, and F-85) within high K sand (20/30). Each rectangular sand body consists of the material of different Ks (see Figure 2a). The sandbox was wetted from the bottom and the water levels increased while packing. Figure 1 is a computer-aided design (CAD) drawing of the sandbox containing the synthetic heterogeneous aquifer showing its dimensions and port locations. In Figure 1, eight rectangular dashed lines indicate the locations of low K sand bodies and port numbers used for pumping are in bold.
 The data acquisition system consists of 50 pressure transducers, a 64-channel data acquisition board, and a dedicated computer. Additional details of the sandbox construction and data acquisition system are given by Craig [2005] and W. A. Illman et al. (submitted manuscript, 2006).
2.2. Methods for Characterization of the Sandbox
 Different hydraulic tests were performed to characterize the hydraulic parameters in the sandbox, including determination of K using core samples, in situ slug tests, and in situ pumping tests. Details of each test and analysis are provided below.
2.2.1. Determination of K From Core Samples
 We first determined the K of the four types of sands from the horizontal cores obtained during the placement of ports. The extracted cores were 1.28 cm in diameter and 10.16 cm in length. These cores were then attached to a custom-made constant-head permeameter [Klute and Dirksen, 1986] for determination of K. Details of the core extraction method and the design of the constant-head permeameter are provided by Craig [2005]. The K values from cores were calculated using Darcy's law.
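As a minimal sketch of the Darcy's law calculation for a constant-head permeameter, the function below evaluates K = QL/(AΔh) using the core dimensions quoted above. The discharge and head drop are hypothetical readings chosen only for illustration, and the function name is not taken from the original study.

```python
import math


def core_conductivity(discharge_cm3_s, head_drop_cm,
                      diameter_cm=1.28, length_cm=10.16):
    """Darcy's law for a constant-head permeameter: K = Q * L / (A * dh)."""
    # Cross-sectional area of the cylindrical core, normal to flow.
    area_cm2 = math.pi * (diameter_cm / 2.0) ** 2
    return discharge_cm3_s * length_cm / (area_cm2 * head_drop_cm)


# Hypothetical reading (not from the paper): 0.05 cm^3/s under a 5 cm head drop.
k_core = core_conductivity(discharge_cm3_s=0.05, head_drop_cm=5.0)
print(f"K = {k_core:.3e} cm/s, ln K = {math.log(k_core):.3f}")
```

The natural logarithm is printed as well because the summary statistics in this paper are computed on ln-transformed values.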
2.2.2. In Situ Slug Tests
 We also conducted slug tests at each of the 48 ports. Because of the small size and configuration of the ports on the sandbox, an external well was attached to the ports instead of boring vertical wells into the sandbox. A slug was introduced to perturb the water level in the horizontal well connected to the port, and the corresponding recovery was monitored using a pressure transducer. Because existing analytical solutions cannot be used to interpret the slug tests with our current setting, we analyzed the data by manually calibrating VSAFT2 [Yeh et al., 1993], available at www.hwr.arizona.edu/yeh, by treating the model domain to be a two-dimensional, homogeneous medium [Craig, 2005]. A fine numerical grid (1.64 cm by 1.64 cm) was developed for the slug test analysis. VSAFT2 was chosen to analyze the test data for consistency because the code contains the forward model used later for hydraulic tomography. We report the geometric mean of 40 values that we deem to match the numerical solution well in Table 1. Results obtained revealed that the K values are several orders of magnitude smaller than the core values.
Table 1. Summary of Hydraulic Properties Determined From Core, Slug, Single-Hole, Cross-Hole Pumping Test Data, and Flow-Through Experiments (K in cm s−1; Ss in cm−1)
The volume-weighted mean and its corresponding variance are −1.920(1.467 × 10−1) and 1.560, respectively.
 We then conducted pumping tests at 46 of the 48 available ports. Ports 36 and 38 had been damaged, so we did not pump from them. During the pumping tests, the top and two sides of the aquifer served as constant-head boundaries, as described earlier, while the bottom remained a no-flow boundary. Pumping rates ranged from 150 to 190 mL/min in most cases. For each test, data collection started before the pump was activated to obtain the initial hydraulic head in the sandbox, and data were collected from all ports at a constant interval of 0.75 s throughout the experiment. This interval was selected to capture the expected rapid transient change in hydraulic head at the monitoring ports. A peristaltic pump was then activated at the pumping port and allowed to run until steady state flow conditions developed. The pump was then shut off, and recovery data were collected until the hydraulic head recovered fully. During each pumping test, pressure heads were collected at all 48 ports.
 The data sets were analyzed in several ways. First, we analyzed the 48 drawdown-time data sets induced by pumping at port 22 and those induced by pumping at port 28 by manually calibrating VSAFT2 and assuming that the aquifer is homogeneous. The numerical setup for the calibration is identical to that of the slug test analysis. For the pumping test at port 28 (located in 20/30 sand), all 47 cross-hole intervals were matched and one single-hole match was made, which yielded a total of 48 estimates for that pumping test. The pumping test at port 22 (located in F-75 sand) also yielded 47 cross-hole matches between observed and simulated drawdown. The single-hole match in this case was unattainable because of the very large drawdown, which VSAFT2 could not simulate. Analysis of the two pumping tests thus yielded 95 estimates of K and Ss for the equivalent homogeneous medium. These two tests will be denoted as cross-hole tests hereinafter.
 Out of the 46 pumping tests, we analyzed the drawdown-time data at nine selected pumping ports (2, 5, 14, 17, 32, 35, 44, 46, and 47) using VSAFT2 to yield local or single-hole estimates of K and Ss. These results are denoted as the single-hole results.
2.2.4. Hydraulic Tomography Analysis
 In addition to the single-hole analysis for the aforementioned nine pumping ports, we also selected eight out of the nine pumping tests, together with the drawdown-time observations at the other 47 ports during each test, as the data sets of a hydraulic tomographic survey. These eight pumping/drawdown data sets were then used for the steady state hydraulic tomography analysis as well as the transient hydraulic tomography analysis, which are discussed in detail in a later section. The remaining test was reserved for validation purposes.
2.2.5. Unidirectional Flow-Through Experiments
 We also conducted nine flow-through experiments through the entire sandbox to obtain the effective hydraulic conductivity (Keff) of the entire sandbox under steady state unidirectional flow conditions. Specifically, each of these nine experiments was conducted by changing the height of the reservoirs on both sides of the sandbox. After the flow reached a steady state condition, we measured the discharge from one side of the sandbox. We also measured the difference between the heights of the water column in the two constant-head reservoirs to determine the hydraulic gradient. The nine pairs of gradient and discharge were then used with Darcy's law to obtain the Keff.
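One way to combine several gradient-discharge pairs into a single Keff is a least-squares fit of Q = Keff·A·i through the origin. The sketch below is an illustration of that idea, not the procedure documented for this study; the flow cross-section, the gradients, and the noise-free synthetic discharges are all hypothetical.

```python
def effective_conductivity(gradients, discharges_cm3_s, area_cm2):
    """Least-squares slope through the origin for Q = Keff * A * i."""
    numerator = sum(q * i for q, i in zip(discharges_cm3_s, gradients))
    denominator = area_cm2 * sum(i * i for i in gradients)
    return numerator / denominator


# Assumed flow cross-section (height x depth of the sand pack), in cm^2.
area = 75.6 * 10.2
# Nine hypothetical hydraulic gradients, one per flow-through experiment.
gradients = [0.005 * n for n in range(1, 10)]
# Synthetic, noise-free discharges generated from an assumed K of 0.17 cm/s.
k_assumed = 0.17
discharges = [k_assumed * area * i for i in gradients]

k_eff = effective_conductivity(gradients, discharges, area)
print(f"Keff = {k_eff:.4f} cm/s")
```

With real measurements the fitted slope would differ from any single-pair estimate Q/(A·i), and the scatter of the nine single-pair values gives a sense of the measurement variability.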
Table 1 summarizes the results from all these tests. The mean estimates were obtained by computing the arithmetic mean of the natural logarithm transformed data. The variance was likewise computed using the natural logarithm transformed data set. We also calculated a volume-weighted mean and variance of the core values, which are also listed in Table 1. The volume-weighted mean and variance upscale the core K values to the size of the finite element grid used for the inversion so that the two can be compared later.
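The statistics described above can be sketched as follows. The function computes the mean and variance of ln-transformed values, optionally weighted (for example, by the volume each sand type occupies). Two choices here are assumptions rather than details from the text: the variance is normalized by the total weight (a population-style variance), and the K values and volume fractions in the example are invented.

```python
import math


def log_moments(values, weights=None):
    """Mean and variance of ln(values), optionally weighted (e.g., by volume)."""
    logs = [math.log(v) for v in values]
    if weights is None:
        weights = [1.0] * len(logs)
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, logs)) / total
    # Population-style variance: normalized by the total weight (an assumption).
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, logs)) / total
    return mean, variance


# Hypothetical K values (cm/s) for two sands and hypothetical volume fractions.
k_values = [0.30, 0.02]
volume_fractions = [0.8, 0.2]

plain_mean, plain_var = log_moments(k_values)
weighted_mean, weighted_var = log_moments(k_values, volume_fractions)
print(f"unweighted: mean ln K = {plain_mean:.3f}, variance = {plain_var:.3f}")
print(f"weighted:   mean ln K = {weighted_mean:.3f}, variance = {weighted_var:.3f}")
```

Because the mean is taken on ln K, exponentiating it gives the geometric mean of K.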
2.3. Inverse Model Description
 Mapping of the spatial distribution of hydraulic properties in the sandbox was carried out using the transient hydraulic tomography algorithm developed by Zhu and Yeh [2005] and hydraulic head data from the six pumping tests. A brief description of the algorithm is given below.
 The inverse model assumes a transient flow field, and the natural logarithms of K (ln K) and Ss (ln Ss) are both treated as multi-Gaussian, second-order stationary, stochastic processes. The model additionally assumes that the mean and correlation structure of the K and Ss fields are known a priori.
 The estimation procedure starts with a weighted linear combination of parameter measurements and transient hydraulic head data at different locations to obtain the first estimate of the parameters. The weights are calculated using the means and covariances of parameters, the covariances of hydraulic heads in space and time, and the cross covariances between heads and parameters. The first estimate is then used in the mean flow equation to calculate the heads at observation locations and sampling times through a forward simulation. At the end of this forward simulation, the differences between the observed and simulated hydraulic heads are calculated and a weighted linear combination of these differences is then used to improve the previous estimates. Iterations between the forward simulation and estimation continue until the improvement in the estimates diminishes to a prescribed value.
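The procedure described above can be written schematically as follows. This is a sketch in notation chosen here for illustration (f for the deviation of ln K or ln Ss from its prior mean, h* for observed heads, h̄ for heads simulated with the mean parameters, and λ, μ, ω for weights); it is not copied from Zhu and Yeh, and the stabilizing term θ is an assumption commonly used to keep the weight system well conditioned.

```latex
% First estimate: weighted linear combination of parameter and head data
\hat{f}^{(0)}(\mathbf{x}_0) = \sum_{i} \lambda_{i}\, f^{*}(\mathbf{x}_i)
  + \sum_{j} \mu_{j}\,\bigl[ h^{*}(\mathbf{x}_j, t_j) - \bar{h}(\mathbf{x}_j, t_j) \bigr]

% Successive improvement using the misfit from the forward simulation h^{(r)}
\hat{f}^{(r+1)}(\mathbf{x}_0) = \hat{f}^{(r)}(\mathbf{x}_0)
  + \sum_{j} \omega^{(r)}_{j}\,\bigl[ h^{*}(\mathbf{x}_j, t_j) - h^{(r)}(\mathbf{x}_j, t_j) \bigr]

% Weights from the head covariance and head-parameter cross covariance
\bigl[ \boldsymbol{\varepsilon}^{(r)}_{hh} + \theta\,\mathbf{I} \bigr]\,
  \boldsymbol{\omega}^{(r)} = \boldsymbol{\varepsilon}^{(r)}_{hf}
```

Here ε_hh denotes the covariance of heads in space and time and ε_hf the cross covariance between heads and parameters, both evaluated at iteration r.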
 SSLE can handle measurement error through the specification of two convergence criteria in the algorithm. The criteria are the change of variance and pressure heads between two consecutive iterations, which are both set to 0.01 in our model.
 The transient hydraulic tomography algorithm developed by Zhu and Yeh [2005] allows for the sequential inclusion of pumping test data. Some modifications were made to the code for the present study to account for variations in the constant-head boundary conditions from one pumping test to the next, as the tests are sequentially included.
2.4. Inverse Model Parameters
 To obtain the K and Ss tomograms, the synthetic aquifer was discretized into 741 elements and 1600 nodes with element dimensions of 4.1 cm × 10.2 cm × 4.1 cm. The numerical grid is somewhat smaller (161.3 cm by 75.6 cm by 10.2 cm) than the actual dimensions of the sandbox (193.0 cm by 82.6 cm by 10.2 cm) as we only model the portion of the sandbox that contains the porous medium. Both sides and the top boundary were set to be constant-head boundaries, while the bottom boundary of the sandbox was considered a no-flow boundary. This grid setup is consistent with that used in the steady state hydraulic tomography by W. A. Illman et al. (submitted manuscript, 2006).
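The stated element and node counts can be checked with a short calculation. The split into 39 elements along the length, 19 along the height, and one across the depth is an assumption: it is one decomposition consistent with 741 elements and 1600 nodes, but the text does not state it.

```python
# Assumed element counts along length (x), height (z), and depth (y).
nx, nz, ny = 39, 19, 1

n_elements = nx * nz * ny
# Hexahedral elements share corner nodes, so each direction has one more node.
n_nodes = (nx + 1) * (nz + 1) * (ny + 1)

# Implied average element size from the modeled domain (161.3 cm by 75.6 cm).
dx = 161.3 / nx
dz = 75.6 / nz

# Nominal element volume from the quoted dimensions 4.1 cm x 10.2 cm x 4.1 cm.
element_volume = 4.1 * 10.2 * 4.1

print(n_elements, n_nodes)
print(f"dx = {dx:.2f} cm, dz = {dz:.2f} cm, volume = {element_volume:.2f} cm^3")
```

Under this assumption the implied element sizes are close to, but not exactly, the nominal 4.1 cm.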
 Inputs to the inverse model include initial guesses for the K and Ss, estimates of variances and the correlation scales for both parameters, volumetric discharge (Qn) from each pumping test where n is the test number, available point (small-scale) measurements of K and Ss, as well as head data at various times selected from the head-time curve. Although available point (small-scale) measurements of K and Ss can be input to the inverse model, we do not use these measurements to condition the estimated parameter fields to test the inversion algorithm.
2.4.1. Hydraulic Parameters K and Ss
 A number of methods can be used to obtain the initial guess of K and Ss. One can either set an arbitrary value that is reasonable for the geologic medium considered or estimate the average or effective hydraulic conductivity (Keff) and specific storage (Sseff) for an equivalent homogeneous sandbox. If small-scale data are available, then a geometric mean of these data (i.e., core, slug, and single-hole data) can be calculated. An alternative is to use the equivalent hydraulic conductivity and storage estimates obtained through the analysis of cross-hole test data by treating the medium as homogeneous. Yet another approach is to use the results from the flow-through experiments to obtain the Keff. We elect to utilize the mean values of K and Ss obtained from the analysis of cross-hole tests that treats the medium as homogeneous, as these estimates are commonly and readily available in real field situations.
2.4.2. Variance and Correlation Scales
 The variances and correlation scales of the K and Ss fields are also required inputs to the inverse model. However, estimation of variance always involves uncertainty. A previous numerical study by Yeh and Liu [2000] showed that the variance has negligible effects on the K estimated using the inverse model. We expect the same for both K and Ss in transient hydraulic tomography. Therefore we obtain variance estimates from the available small-scale data and use these as our input variance in the inverse model for the real data set.
 Correlation scales represent the average size of heterogeneity, which is difficult to determine accurately without a large number of data sets in the field. The effects of uncertainty in correlation scales on the estimate based on the tomography are negligible because the tomography produces a large number of head measurements, reflecting the detailed site-specific heterogeneity [Yeh and Liu, 2000]. Therefore the correlation scales were approximated based only on the average thickness and length of the discontinuous sand bodies.
2.4.3. Transient Hydraulic Head Data
 Transient hydraulic head records are the required observation data for transient hydraulic tomography. We used only records from ports whose data were not excessively noisy. The retained data were then treated with the error reduction schemes discussed by W. A. Illman et al. (submitted manuscript, 2006). Briefly, these schemes consisted of accounting for pressure transducer drift, removing data at the pumping port affected by skin effects, and averaging the drawdown data at steady state. We then calculated the drawdown at each port for each pumping test and fit a sixth-order polynomial curve to each transient drawdown record. From each fitted curve we extracted four to five evenly spaced points to capture the transient head record. Drawdown curves that could not be properly fitted were manually excluded from the analysis. The average R2 of the fitted curves selected for inverse modeling is 0.9963, which indicates that the curves were well fitted. In total, we utilized six independent cross-hole tests for the analysis. Other cross-hole tests were available, but these six were considered to be the best of the data set. In addition, we selected the pumping tests at these ports because the ports are evenly distributed along two vertical profiles through the sandbox. More specifically, we utilized four drawdown data from each of 19 ports (76 points) in each of cross-hole tests 14 and 17, four drawdown data from each of 30 ports (120 points) in cross-hole test 32, four drawdown data from each of 26 ports (104 points) in pumping test 35, four drawdown data from each of 36 ports (144 points) in pumping test 44, and five drawdown data from each of 26 ports (130 points) in pumping test 47. In total, we utilized 650 drawdown data from six different tests in our transient inversions. W. A. Illman et al. (submitted manuscript, 2006) made use of two additional tests for steady state hydraulic tomography. Here we do not use these data because their transient drawdown records are noisy.
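The bookkeeping of the selected drawdown data can be verified with a short tally. The numbers of points per port and ports per test are those listed in the paragraph above; the variable names are only illustrative.

```python
# (pumping test, points per port, number of ports), as listed in the text.
selection = [
    (14, 4, 19),
    (17, 4, 19),
    (32, 4, 30),
    (35, 4, 26),
    (44, 4, 36),
    (47, 5, 26),
]

# Number of drawdown data contributed by each pumping test.
points_per_test = {test: points * ports for test, points, ports in selection}
total_points = sum(points_per_test.values())

for test, count in points_per_test.items():
    print(f"test {test}: {count} drawdown data")
print(f"total: {total_points}")
```

The tally reproduces the per-test counts and the overall total of 650 quoted above.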
3. Results From Transient Hydraulic Tomography
Figure 2a is a drawing of the sandbox showing the synthetic aquifer with each sand type marked. Figures 2b–2f are the K tomograms obtained by inverting head data induced by the six pumping tests (14, 17, 32, 35, 44, 47; see Figure 1) conducted in that order. Results from using the first two pumping tests (Figure 2b) reveal detailed heterogeneity patterns near the top of the sandbox, where pumping took place. However, little detail of the heterogeneity pattern is revealed near the bottom of the sandbox. This improves as additional cross-hole tests are included sequentially, and the heterogeneity structure of the entire aquifer emerges. Upon inclusion of all six pumping tests in the inversion, a vivid image (Figure 2f) of the heterogeneity structure appears. However, the two low-K blocks near the bottom boundary are still not well resolved. In addition, a high-K zone is apparent near the top of the sandbox. This may be attributed to the lack of compaction of sands near the top.
 Despite the lack of resolution near the bottom, the results collectively show that the inversion algorithm is capable of capturing the pattern of the K distribution, which is critical for an analysis of contaminant migration. Another qualitative observation is that transient hydraulic tomography reaches the same quality of result with fewer pumping tests than the steady state approach, which required eight pumping tests (W. A. Illman et al., submitted manuscript, 2006). We also visually compared the K tomogram resulting from transient hydraulic tomography with that obtained from steady state hydraulic tomography (Figure 2g). This comparison revealed that the tomograms are very similar.
Figures 3b–3f show the corresponding Ss tomograms, which were estimated simultaneously. In contrast to Figures 2b–2f, the structure consisting of variable size sand bodies visible in the K tomograms is not visible in the Ss tomograms. This can be attributed to the fact that sands of relatively low compressibility (of various sizes) were used to construct the synthetic aquifer. However, a decreasing trend in Ss with depth in the synthetic aquifer is apparent. Physically speaking, this makes sense because the sands in the upper portion are less compressed, while the deeper sands are more compressed due to the stress exerted by the overlying material. This finding suggests that the K values are not significantly correlated with the Ss values in this sandbox.
4. Comparisons of K and Ss Fields From Different Tests and Analyses
4.1. Visual Comparisons of Patterns of Heterogeneity of Different Tests and Analyses
Figure 4a shows the contour map of the K values estimated from the 48 core samples. The map, as expected, outlines the distribution of the blocks of low conductivity values, indicating the distribution of these core measurements. The contour map of the K values estimated from the 40 slug tests reveals a similar pattern (see Figure 4b). Finally, a map of the K estimates based on the 48 hydrographs induced by pumping at port 28 (the cross-hole test) is shown in Figure 4c. As suggested by Wu et al. [2005], each estimated K represents some kind of average of the hydraulic conductivities over the cone of depression, but it is influenced by the hydraulic conductivity values near the pumping well and the observation well. As a result, the distribution of these estimates is smooth and does not necessarily show the pattern of the heterogeneity in the sandbox. The distribution of the K estimates based on the cross-hole tests with pumping at port 22 shows a very similar pattern.
 Next, we plot in Figure 4d the spatial distribution of the estimated Ss values from the cross-hole tests using port 28 as the pumping well. It is interesting to observe that this spatial distribution is in some agreement with that resulting from the transient hydraulic tomography (see Figure 3f). That is, higher specific storage values are found in the upper portion of the sandbox. This result appears to support the finding by Wu et al. [2005] that the Ss estimates from the cross-hole analysis with the assumption of homogeneity reflect the Ss values between the pumping well and the 48 observation wells that spread over the entire sandbox.
 According to the above visual comparisons, we may conclude that the measurements using core samples and slug tests can satisfactorily map the heterogeneity pattern, but not the actual values of K, in the sandbox if the number of tests or samples is sufficient. In contrast, the cross-hole tests that utilize the homogeneous medium assumption produce only inconsistent average values of the hydraulic conductivity in the sandbox. However, they apparently can be used to estimate the spatial distribution of Ss values in our sandbox.
4.2. Comparison of Statistical Moments
 After the visual evaluations of spatial patterns of K and Ss estimates from different tests and analyses, we quantitatively evaluate the K tomogram by comparing its sample and population means of estimated log-transformed hydraulic conductivity (mean ln K) and its sample and population variances (σ²lnK) with the corresponding sample statistical moments obtained from other tests. The sample mean of the tomogram is computed by taking the geometric mean of the hydraulic conductivity estimates at 48 elements corresponding to port locations, while the population mean is computed from the hydraulic conductivity estimates at all 741 elements.
 First, we compare the sample mean (−1.860) and the population mean (−1.729) of K values obtained from transient hydraulic tomography using six pumping tests (Table 2) to the ln Keff (−1.757) obtained from the flow-through experiment (Table 1). This comparison shows that the estimated means of the tomogram approach the mean K value of an effective homogeneous medium after including the six pumping tests in the inversion for this particular experimental setup. We note that the difference between the mean values and the ln Keff value determined from the flow-through experiment decreases quickly with the addition of each pumping test, after which the values stabilize (Table 2). These mean estimates are also in agreement with the mean of the estimates from the two cross-hole analyses. On the other hand, the means of the estimates from core samples, slug tests, and single-hole tests are smaller than those from the tomography, flow-through tests, and the cross-hole analysis.
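To make the comparison of log-transformed means more tangible, the values quoted above can be converted back to K in cm/s and expressed relative to the flow-through Keff. This is only an illustrative back-calculation from the three reported numbers; the dictionary keys are labels introduced here.

```python
import math

# Mean ln K values quoted in the text (natural logarithm, K in cm/s).
ln_means = {
    "tomogram sample mean": -1.860,
    "tomogram population mean": -1.729,
    "flow-through Keff": -1.757,
}

# Exponentiating a mean of ln K gives the corresponding geometric-mean K.
k_values = {name: math.exp(value) for name, value in ln_means.items()}
k_ref = k_values["flow-through Keff"]

# Relative difference of each K with respect to the flow-through Keff.
relative_difference = {
    name: (k - k_ref) / k_ref for name, k in k_values.items()
}

for name in ln_means:
    print(f"{name}: K = {k_values[name]:.4f} cm/s, "
          f"relative difference = {relative_difference[name]:+.1%}")
```

Both tomogram means correspond to K values within about 10 percent of the flow-through Keff.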
Table 2. Mean and Variance (σ²lnK) of the Log-Transformed K Estimates for the Real Inversions With Number of Cross-Hole Tests Used in the Analysis (K in cm s−1)
Number of Cross-Hole Tests Included in Analysis (Test Number)
2 (14 + 17)
3 (14 + 17 + 32)
4 (14 + 17 + 32 + 35)
5 (14 + 17 + 32 + 35 + 44)
6 (14 + 17 + 32 + 35 + 44 + 47)
 We next compare the sample and population variances (0.768 and 0.906, respectively) of the K tomogram with the variance obtained from the available core K data (1.498). This comparison shows that both the sample variance and the population variance are much smaller than the variance of the K values of the core samples. This difference can likely be attributed to several factors: (1) the core value represents an averaged value over a smaller volume (V = 13.04 cm³) of the sand in comparison with the averaging volume of the element (V = 171.46 cm³) for the K estimates from the hydraulic tomography, that is, there is a disparity in scale between the two types of estimates; (2) each core value is an effective K for a one-dimensional flow situation, whereas the K values of the tomograms represent conditional “effective” K values from six multidimensional flow fields; (3) the core K values also may be subject to higher variation because we disturbed the geometry and/or removed the compressive forces when extracting these sediments. Therefore the results are consistent with our expectation.
 Likewise, and unsurprisingly, the variance of the tomograms is greater than those of the cross-hole analyses and flow-through experiments since each K estimate from the cross-hole analyses and flow-through experiments represents a spatially averaged value over the cone of depression or the entire sandbox. In comparison with the variances from slug tests and single-hole analyses, the variance of the tomograms is slightly higher but is of the same order of magnitude as those of the slug tests and single-hole analyses.
 Since no estimates of Ss for the cores were derived, we evaluate the Ss tomogram by comparing its statistical moments with those estimated from the available single-hole and cross-hole pumping tests. We first compare the sample mean (−7.966) and population mean (−8.047) of log-transformed specific storage of the Ss tomogram obtained from the six sets of pumping tests (Table 3) with the mean of the available single-hole estimates (−7.960) as well as the mean of the equivalent estimates (−8.378) derived from the two cross-hole tests (Table 1). The results are in broad agreement, which suggests that the cross-hole analysis that assumes homogeneity of the sandbox retains some usefulness, perhaps when the number of observations is sufficiently large.
Table 3. Mean and Variance of the Log-Transformed Ss Estimates for the Real Inversions With Number of Cross-Hole Tests Used in the Analysis
Number of Cross-Hole Tests Included in Analysis (Test Number)
Mean of ln Ss (Ss in cm−1)
Variance of ln Ss (Ss in cm−1)
2 (14 + 17)
3 (14 + 17 + 32)
4 (14 + 17 + 32 + 35)
5 (14 + 17 + 32 + 35 + 44)
6 (14 + 17 + 32 + 35 + 44 + 47)
 We next compare the sample (0.214) and population (0.190) variance of log-transformed specific storage from the Ss tomogram (Table 3) with variance estimates obtained from single-hole (1.897) and the two cross-hole (0.047) tests (Table 1). There is a large discrepancy between the variances of the tomogram and the variance of the Ss estimates from the single-hole tests. We expect the variance of the estimates from the single-hole tests to be greater than that from the tomogram, but not by a factor of about 10. This may be due to the relatively small number of single-hole Ss estimates.
 The sample and population variances of the estimates from hydraulic tomography are, as we anticipated, larger by about a factor of 4 than the variance of the equivalent Ss obtained by the cross-hole analyses that treat the medium as homogeneous.
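The factors quoted above for the Ss variances can be checked with a few lines of arithmetic. The sketch below uses only the variance values reported in the text; the variable names are illustrative and are not taken from the original analysis.

```python
# Variances of log-transformed Ss as reported in the text (Tables 1 and 3).
tomogram_var = {"sample": 0.214, "population": 0.190}
single_hole_var = 1.897   # single-hole tests
cross_hole_var = 0.047    # two cross-hole analyses (homogeneous assumption)

# For each tomogram variance, compute:
#   how many times larger the single-hole variance is, and
#   how many times larger the tomogram variance is than the cross-hole one.
ratios = {
    name: (single_hole_var / var, var / cross_hole_var)
    for name, var in tomogram_var.items()
}

for name, (vs_single_hole, vs_cross_hole) in ratios.items():
    print(f"{name}: single-hole/tomogram = {vs_single_hole:.2f}, "
          f"tomogram/cross-hole = {vs_cross_hole:.2f}")
```

The single-hole variance exceeds the tomogram variance by roughly a factor of 9 to 10, and the tomogram variance exceeds the cross-hole variance by roughly a factor of 4 to 4.6, depending on whether the sample or population value is used.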
4.3. Comparison of Local Values
 To examine the performance of different tests in greater detail, we next compare local K values from the K tomogram to the K estimates at sample locations from the core measurements, single-hole tests, as well as cross-hole analyses of the responses at each observation port induced by pumping at port 22 and port 28 (Figure 5). A correlation coefficient, R, is used to quantify the spatial correspondence of log-transformed estimates from tomography, $\chi_i$, and the estimates from other tests, $\hat{\chi}_i$,
$$R = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(\chi_i - \mu_{\chi}\right)\left(\hat{\chi}_i - \mu_{\hat{\chi}}\right)}{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\chi_i - \mu_{\chi}\right)^2 \, \frac{1}{N}\sum_{i=1}^{N}\left(\hat{\chi}_i - \mu_{\hat{\chi}}\right)^2}}$$
where N is the total number of elements and $\mu_{\chi}$ and $\mu_{\hat{\chi}}$ are means for the estimates from tomography and the estimates from other tests and analyses, respectively. The R values are also given in Figure 5.
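As a minimal sketch of how such a correlation coefficient could be evaluated, the following Python function computes the standard (Pearson) correlation between two sets of log-transformed estimates. The function name and the K values are hypothetical and serve only as illustration; they are not data from the sandbox.

```python
import math

def correlation_coefficient(chi, chi_other):
    """Pearson correlation between two equally long sequences of
    log-transformed estimates (e.g., tomography vs. another test)."""
    if len(chi) != len(chi_other) or len(chi) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(chi)
    mu = sum(chi) / n
    mu_other = sum(chi_other) / n
    # The 1/N factors in numerator and denominator cancel, so sums suffice.
    cov = sum((a - mu) * (b - mu_other) for a, b in zip(chi, chi_other))
    ss = sum((a - mu) ** 2 for a in chi)
    ss_other = sum((b - mu_other) ** 2 for b in chi_other)
    return cov / math.sqrt(ss * ss_other)

# Hypothetical K values (cm/s): the second set is uniformly half the first,
# i.e., biased low by a constant factor.
k_tomography = [0.12, 0.30, 0.05, 0.20]
k_other = [0.5 * k for k in k_tomography]

ln_k_tomography = [math.log(k) for k in k_tomography]
ln_k_other = [math.log(k) for k in k_other]

r_biased = correlation_coefficient(ln_k_tomography, ln_k_other)
print(f"R = {r_biased:.3f}")
```

Because a constant multiplicative bias in K becomes a constant offset in ln K, it leaves R unchanged. This is why a high correlation can coexist with a noticeable bias between two sets of estimates.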
Figure 5 shows that there is considerable scatter but little bias when the local K estimates from the tomography are compared with the core values; the correlation value of the two types of estimates is 0.57. On the other hand, there is a noticeable bias when the local K values from the tomogram are compared with the single-hole test data, but the correlation value is high (0.85), indicating a similar spatial pattern between the two estimates. This bias (i.e., systematically lower values for the estimates from the single-hole tests) can be explained by near-well effects, that is, the skin effect discussed by W. A. Illman et al. (submitted manuscript, 2006) that can cause local K estimates from single-hole tests to be slightly smaller. Alternatively, it may be attributed to the scale disparity of these estimates.
 Estimates of K values at the monitoring ports from the cross-hole analyses for the two pumping locations (22 and 28) appear as vertical, narrow clusters in Figure 5. The narrow clusters suggest the estimates are smooth (averaged values) in comparison with those from the tomogram at the same locations.
 We next compare local Ss values from the Ss tomogram to the Ss estimates from the single-hole tests (Figure 6). Here the number associated with each data point indicates the port at which the pumping took place. Overall, we see that the Ss estimates are larger near the top of the sandbox and decrease as we move deeper into the sandbox, as noted earlier for the Ss tomograms (Figures 3a–3f). The correlation between the two estimates is 0.9, indicating a very similar spatial pattern of the two estimates. However, the variability of Ss from the single-hole tests, which treat the medium as homogeneous, is much larger than that of the Ss estimates from hydraulic tomography.
 The plot of the estimates from the cross-hole analyses for pumping ports 22 and 28 in Figure 6 yields a narrow vertical cluster, indicative of similar mean values but small variation in the estimates from the two cross-hole analyses and great variation in the Ss tomogram.
4.4. Comparison of K Tomogram Obtained From Steady State Hydraulic Tomography
 We also compare the K tomogram obtained from the transient hydraulic tomography with that obtained from its steady state counterpart (Figure 7). Here the K data come from Figure 4d of W. A. Illman et al. (submitted manuscript, 2006). The results show that the K values estimated using transient head data closely match those based on the steady state head data, with a correlation value of 0.83 (i.e., the patterns are very similar). The difference between the two perhaps reflects the influence of Ss parameters on the estimation of K. The dashed line in Figure 7 indicates the 95% confidence interval.
5. Validation of K and Ss Tomograms
 As shown in the previous section, some of the tests and analyses yield similar K and Ss statistics and patterns. It is difficult, however, to validate them against each other because of scale disparity, different flow conditions, etc. As a consequence, an appropriate validation approach is to test the predictability of the estimates under different flow scenarios. In order to do this, we first verify the K and Ss tomograms by simulating the last pumping test at port 47 used in the construction of the tomogram. We then compare the simulated and measured transient drawdown at early (3 s), intermediate (10 s), and late (20 s) time periods (Figure 8a) at all ports except for the pumped port. The correlation values between the simulated and observed drawdowns at the three times as well as the means and variances of their differences are reported in Figure 8a. This plot and the quantitative measures show that the comparison is excellent, providing us with further confidence that SSLE can provide an unbiased estimation of the drawdown distribution. Note that the simulated drawdowns are not expected to match the observed ones perfectly, because the tomograms are conditional effective K and Ss fields and because the observations contain noise. Notice that simulated drawdown values are larger than the observed ones when the values are large. This is consistent with the fact that uncertainty in the predicted drawdown grows with the mean gradient according to stochastic analysis (to be explained later).
 A better validation of the K and Ss tomograms is to simulate an additional pumping test that was not used in the inversion and to examine whether the drawdown at various sampling ports of this independent test can be predicted accurately at various times. For this, we utilize the K and Ss tomograms obtained from the inversion of the six pumping tests (Figures 2f and 3f) and simulate a test with pumping taking place at port 46. The pumping test at port 46 was chosen for validation purposes because it was one of the pumping tests with the cleanest data devoid of external factors. We note that the proximity of port 46 to port 47 does not make it easier to simulate the observed behavior.
Figure 8b shows the results of this comparison. In Figure 8b the observed drawdowns are plotted against the predicted drawdowns at times 3, 10, and 20 s at the 48 observation ports. The figure also shows the mean and variance of the difference between the observed and predicted drawdowns as well as the correlations between the observed and predicted values. According to Figure 8b, the data pairs are scattered along the 45-degree line, indicating that the predicted drawdown distributions generally are statistically unbiased in comparison with the observed ones. The high correlation values for the three different times (0.995) suggest that the predicted drawdown distribution is almost identical to the observed distribution, at least for the drawdowns at the three times at the observation ports. This is an exciting result because it indicates that using the K and Ss fields derived from hydraulic tomography, one can obtain an excellent prediction of the drawdown behavior in the sandbox. Again, the predicted drawdown values using the tomograms are slightly greater than the observed values at all the sampling ports at the three times.
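The three measures reported for this type of comparison (mean and variance of the difference, and the correlation between observed and predicted drawdowns) could be computed along the following lines. This is a sketch under stated assumptions: the drawdown values are hypothetical, the difference is taken as predicted minus observed, and the population form of the variance is used; the source does not specify either convention.

```python
import math
from statistics import fmean, pvariance

def validation_statistics(observed, predicted):
    """Summary measures for observed vs. predicted drawdowns at one time.

    Assumptions (not stated in the source): difference = predicted - observed,
    and the variance is the population variance of those differences.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    differences = [p - o for o, p in zip(observed, predicted)]
    mu_o, mu_p = fmean(observed), fmean(predicted)
    cov = sum((o - mu_o) * (p - mu_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mu_o) ** 2 for o in observed)
    ss_p = sum((p - mu_p) ** 2 for p in predicted)
    return {
        "mean_difference": fmean(differences),
        "variance_difference": pvariance(differences),
        "correlation": cov / math.sqrt(ss_o * ss_p),
    }

# Hypothetical drawdowns (cm) at four ports; predictions are 10% too large,
# so the absolute overprediction grows with the drawdown itself.
observed = [0.5, 1.0, 2.0, 4.0]
predicted = [0.55, 1.1, 2.2, 4.4]

stats = validation_statistics(observed, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A correlation near 1 combined with a nonzero mean difference indicates that the predicted pattern matches the observed one while the magnitudes are offset, so the correlation and the difference statistics are best read together.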
Figures 9a–9c present another way to validate the tomograms. In Figure 9 the horizontal axis denotes the simulated drawdown values at the sampling ports based on the mean (or effective) K and Ss values from their tomograms. The vertical axis represents the observed head values at the same locations as well as those simulated using the K and Ss tomograms from Figures 2f and 3f. In addition, a 45-degree line, along which drawdowns equal those resulting from the mean parameters, is plotted along with an upper and a lower bound of the drawdown that denote the variability in drawdowns due to the heterogeneity of K and Ss ignored by the homogeneous assumption. The upper and lower bounds were constructed by adding and subtracting, respectively, twice the square root of the predicted drawdown variance at each observation location to and from the simulated drawdown based on the mean K and Ss of the tomograms. The drawdown variances were calculated from the SSLE algorithm using our initial guess values for the correlation scales of ln K and ln Ss and the mean and variance values of the estimated K and Ss tomograms. Note that the true correlation scales, means, and variances are unknown; the variances of the tomogram should be smaller than the true ones; the means are close to the true ones, as illustrated earlier. Therefore the upper and lower bounds are, in effect, uncertain themselves, but the mean behavior is likely reasonable. Figure 9 thus suggests that, using effective parameters, a flow model assuming homogeneity predicts drawdowns quite different from those observed at the 48 ports in the heterogeneous aquifer (i.e., the sandbox). That is, the predicted drawdowns are consistently smaller near the pumping port and greater away from the pumping location than the observed ones.
On the contrary, using the K and Ss fields resulting from hydraulic tomography, a classic governing equation for groundwater flow can yield an excellent prediction of the drawdowns at three different times at all the 48 observation ports in the heterogeneous sandbox. This result is exciting in view of inherent nonuniqueness of the estimation and errors in the measurements. Certainly, more independent pumping events would make this validation more significant.
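The construction of the upper and lower bounds used in Figure 9 (the drawdown simulated with mean parameters, plus or minus twice the square root of the predicted drawdown variance) can be sketched as follows. The drawdown variances are treated here as given inputs, since in the study they come from the SSLE algorithm; all numbers and function names are hypothetical.

```python
import math

def drawdown_bounds(mean_drawdown, drawdown_variance):
    """Lower and upper bounds: mean-parameter drawdown -/+ 2 * sqrt(variance)."""
    if len(mean_drawdown) != len(drawdown_variance):
        raise ValueError("inputs must have the same length")
    lower, upper = [], []
    for s, var in zip(mean_drawdown, drawdown_variance):
        if var < 0:
            raise ValueError("variance must be non-negative")
        half_width = 2.0 * math.sqrt(var)
        lower.append(s - half_width)
        upper.append(s + half_width)
    return lower, upper

def fraction_within_bounds(observed, lower, upper):
    """Fraction of observed drawdowns that fall inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return inside / len(observed)

# Hypothetical values at three ports (cm and cm^2).
mean_drawdown = [1.0, 2.0, 4.0]        # simulated with mean K and Ss
drawdown_variance = [0.04, 0.09, 0.25]  # assumed predicted variances
observed = [1.3, 2.9, 3.5]

lower, upper = drawdown_bounds(mean_drawdown, drawdown_variance)
frac = fraction_within_bounds(observed, lower, upper)
print(f"fraction of observations within bounds: {frac:.2f}")
```

Since the bounds inherit the uncertainty of the assumed correlation scales, means, and variances, a count of observations inside them is better read as a qualitative indicator than as a formal confidence statement.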
 Last, we believe that these results manifest the utility of the hydraulic tomographic surveys and the robustness of the SSLE algorithm. More importantly, these results implicitly validate the classical governing equations for groundwater flow in heterogeneous porous media (at least in our sandbox).
 The main objective of the study is to validate the recently developed hydraulic tomography concept and an analysis algorithm (i.e., SSLE) in a heterogeneous sandbox. In order to accomplish this goal, we investigated the ability of various hydraulic tests and analyses to characterize the heterogeneous sandbox. These tests and analyses include determination of the K from core samples, slug tests, single-hole analyses, cross-hole analyses, and unidirectional flow-through test.
 On the basis of the results of this investigation, we draw the following major conclusions: (1) With the number of samples and test locations used in our experiment, the estimated K values from core samples and slug tests can delineate the heterogeneity pattern of the sandbox, but the hydraulic tomography provides considerably more details; (2) the average of the 95 K estimates from the cross-hole tests and analyses yields an averaged K value that is close to the effective K determined from the uniform flow experiment or the mean value of the estimates from hydraulic tomography; (3) estimates of Ss from the single-hole analysis and those from hydraulic tomography exhibit physically plausible Ss distributions; (4) while some of the tests and analyses yield similar K and Ss statistics and patterns, it is difficult to validate them against each other because of scale disparity, different flow conditions, etc. As a consequence, an appropriate validation approach is to test the predictability of the estimates under different flow scenarios.
 This is the approach we chose to validate the hydraulic tomography concept and the SSLE algorithm. On the basis of this approach, we demonstrate that using the estimated K and Ss fields from the hydraulic tomography, a classic governing flow equation predicts drawdown distributions caused by an independent pumping event in close agreement with the observed distributions at three different times. We thereby conclude that the hydraulic tomography concept and the analysis algorithm (SSLE) are viable tools for characterizing aquifers at high resolution, although field tests are needed to further substantiate this claim for a real-world problem.
 This research was supported by the Strategic Environmental Research and Development Program (SERDP) as well as by funding from the National Science Foundation (NSF) through grants EAR-0229713, EAR-0229717, IIS-0431069, IIS-0431079, EAR-0450336, and EAR-0450388. We thank the anonymous reviewers for their comments which improved our manuscript.