Abstract
[1] Three conceptual models are evaluated for estimating transmissivity (T) fields using data from sequential pumping tests at a field site and data from similar tests simulated in a synthetic aquifer. The three approaches are (1) an equivalent homogeneous approach, (2) a heterogeneous approach based on a single pumping test, and (3) a heterogeneous approach based on joint interpretation of the sequential pumping tests (i.e., hydraulic tomography, HT). They are evaluated on the basis of their abilities to obtain representative estimates of the T field of the aquifer and, more importantly, on the ability of their estimates to predict drawdown distributions in the aquifer induced by independent validation pumping tests. Results show that the first approach yields scenario-dependent T estimates, which vary with the location of the pumping well. Independent validation tests show that the predicted drawdowns in both aquifers are biased and dispersed. While the second approach produces scenario-dependent T spatial distributions capturing the general pattern of the aquifer, the T fields consistently yield better drawdown predictions than those based on the first approach. Lastly, the joint interpretation approach reduces the scenario dependence of the T estimates and improves the quality of the T estimates as more data sets from sequential pumping tests are included. More importantly, the resultant T estimates lead to the best prediction of different flow events. The robustness of the joint interpretation is then elucidated.
1. Introduction
[2] Knowledge of hydraulic properties of aquifers is essential to groundwater resource management and groundwater contamination prevention and remediation. Numerous methods have been developed over the past several decades for determining aquifer hydraulic properties. One popular method is the so-called aquifer test, analyzed with the analytical solution of Theis [1935] or that of Cooper and Jacob [1946]. These solutions assume that the aquifer is homogeneous and has infinite lateral extent, although aquifers are inherently heterogeneous and bounded in nature. As a result, it is generally believed that the hydraulic properties estimated by these methods represent spatially averaged hydraulic properties over the cone of depression [e.g., Meier et al., 1998; Sanchez-Vila et al., 1999]. On the other hand, Wu et al. [2005] showed that the analysis yields ambiguous spatially averaged hydraulic properties of an aquifer, which vary with the location of observation. Analyses of field pumping tests by Straface et al. [2007] and Wen et al. [2010] confirm the finding. Wu et al. [2005] further suggested that in order to obtain representative hydraulic properties of a heterogeneous aquifer within the cone of depression using the Theis solution, hydrographs from many observation wells must be used simultaneously.
[3] To improve our ability to characterize aquifers, a hydraulic tomographic (HT) survey has been developed [Gottlieb and Dietrich, 1995; Vasco et al., 2000; Yeh and Liu, 2000; Bohling et al., 2002; Brauchler et al., 2003, 2010; Zhu and Yeh, 2005; Li et al., 2007; Liu et al., 2007; Illman et al., 2007, 2010; Fienen et al., 2008; Castagna and Bellin, 2009; Cardiff et al., 2009; Yin and Illman, 2009]. HT is a series of cross-well interference tests (or sequential pumping tests) that exploits a well field to the fullest extent possible for hydraulic testing.
[4] HT has been tested successively in synthetic porous aquifers [Yeh and Liu, 2000; Zhu and Yeh, 2005], fractured aquifers [Hao et al., 2008], laboratory sandboxes [Liu et al., 2002, 2007; Illman et al., 2007, 2010; Xiang et al., 2009; Liu and Kitanidis, 2011], plot-scale fields [Bohling et al., 2007; Straface et al., 2007; Li et al., 2007, 2008; Brauchler et al., 2010, 2011], and a fractured granite field site [Illman et al., 2009]. In particular, Liu et al. [2007] and Xiang et al. [2009] show that transient HT not only identifies the pattern of the heterogeneous hydraulic conductivity (K) and specific storage (S_{s}) fields of the sandbox but also yields estimates that can satisfactorily predict the drawdown caused by a pumping test that was not used in the HT analysis.
[5] Illman et al. [2010] and Berg and Illman [2011] evaluated different aquifer characterization methods (including the Theis aquifer test, kriging using 48 core samples, kriging with 48 single-well test estimates, and HT) for characterizing a heterogeneous K distribution in a fully saturated sandbox. The performance of these methods was evaluated on the basis of their ability to predict steady state drawdowns at the 48 sampling locations of 16 independent pumping tests at different locations in the sandbox. Note that these drawdowns were not used in the inverse analysis of HT. They concluded that the K estimate from kriging using 48 core samples yielded satisfactory predictions. However, the estimate based on HT yielded the best predictions.
[6] While the robustness of HT has been substantiated by these sandbox experiments and some recent field applications, its ability for high-resolution aquifer characterization under field conditions remains to be fully assessed. In particular, the benefits of HT have not been demonstrated and evaluated in comparison with other approaches under field conditions. The objective of this paper is therefore to quantitatively evaluate three approaches for estimating T fields using data from sequential pumping tests at a field site and data from similar tests simulated in a synthetic aquifer. The synthetic aquifer case represents the ideal case where the true T field, boundary conditions, head measurements, and pumping rates are exactly known and noise-free. The three approaches to be evaluated are (1) the equivalent homogeneous approach, (2) a heterogeneous approach based on a single pumping test, and (3) a heterogeneous approach based on joint interpretation of the sequential pumping tests (i.e., HT). For the synthetic aquifer case, the estimated T fields from the three approaches are appraised by comparing them with the true field. The robustness of these estimated T fields of the synthetic and field aquifers is subsequently validated in terms of their ability to predict drawdown distributions of the aquifers induced by independent validation pumping tests. This type of validation of T estimates from field pumping tests has rarely been carried out previously. Finally, insights into joint interpretation of sequential pumping tests are discussed with respect to other approaches.
3. Analysis of the Sequential Pumping Tests
[11] Steady state drawdown data at the 10 observation wells during each of the seven pumping tests at boreholes (BH01, BH02, BH03, BH06, BH07, BH08, and BH09) in either the synthetic or the NYUST aquifer were used in the analysis. The remaining four tests were reserved for validation purposes (see section 5.3). The steady state drawdown of each observation well used in the inverse modeling of the NYUST aquifer was the average value of the drawdowns recorded over a 48 h time interval after the drawdown stabilized (see Figure 2). Three approaches were used to analyze the drawdown-time data sets from the sequential pumping tests: (1) an equivalent homogeneous conceptual model, (2) a heterogeneous model based on an individual pumping test, and (3) a heterogeneous model based on joint interpretation of all of the sequential pumping tests (i.e., HT).
3.1. Approach 1: Equivalent Homogeneous Approach
[12] This approach conceptualizes a heterogeneous aquifer as an equivalent homogeneous aquifer in the same way as widely used traditional aquifer test analyses (e.g., Thiem, Jacob, and Theis approaches). The objective of this approach is to estimate the effective or apparent T for the equivalent homogeneous aquifer, which will mimic the average drawdown over the entire aquifer during a given pumping test. Meier et al. [1998] and Sanchez-Vila et al. [1999] showed that applications of the Jacob method to drawdown-time data at observation wells at different locations during pumping at a well in a synthetic heterogeneous aquifer yield a very similar transmissivity estimate. On the contrary, Wu et al. [2005] argued that the heads implied in the governing equation for flow through an equivalent homogeneous aquifer represent the spatial trend of the heads in a heterogeneous aquifer, while the head observed at a single borehole in the heterogeneous aquifer represents the perturbation around the trend. Analyzing heads observed at a single borehole in a heterogeneous aquifer with the Theis solution, which is built upon the equivalent homogeneous conceptual model, is therefore comparing apples (perturbations) and oranges (trends). Wu et al. [2005] and Wen et al. [2010] advocated that a large number of spatial head observations in the aquifer must be used to obtain a representative effective parameter for the equivalent homogeneous aquifer. Following this concept, we estimated the effective T for the aquifer using steady state drawdowns at 10 observation wells during each of the seven pumping tests. Because the medium is assumed to be homogeneous and drawdown and discharge (pumping rate) are known, uniqueness of the solution to this inverse problem under this condition is guaranteed [Nelson, 1960]. Moreover, the number of observations is greater than the number of parameters to be estimated. Estimating T for each test therefore is an overdetermined and well-defined inverse problem [Yeh et al., 2011], which has a unique solution. According to Meier et al. [1998], Sanchez-Vila et al. [1999], Wu et al. [2005], and Wen et al. [2010], this approach should yield the same T estimate in spite of different pumping locations. The estimate should predict head fields that capture the general flow pattern induced by any stress. That is, the predicted head field for other pumping events will be unbiased but scattered. The scatter reflects the effects of heterogeneity.
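The overdetermined character of approach 1 can be illustrated with a minimal sketch. It is not the procedure used in this study, which relies on the numerical model described in section 3.4. As a stand-in, the sketch assumes the Thiem relation for steady radial flow in a homogeneous aquifer, in which drawdown is linear in 1/T, so a single T can be fitted to many drawdowns in closed form. The pumping rate, radius of influence, and well distances below are hypothetical values chosen only for illustration.

```python
import math

def thiem_drawdown(T, Q, r, R):
    """Steady drawdown (m) at distance r (m) from a well pumping at rate Q (m^3/s)
    in a homogeneous aquifer of transmissivity T (m^2/s).
    R is an assumed radius of influence (m); Thiem relation, used here as a stand-in."""
    return Q / (2.0 * math.pi * T) * math.log(R / r)

def fit_effective_T(drawdowns, distances, Q, R):
    """Least-squares effective T from several steady drawdowns of one pumping test.
    Because s_i = (1/T) * g_i with g_i = Q/(2*pi) * ln(R/r_i), the best-fit 1/T
    is sum(g_i * s_i) / sum(g_i^2): one unknown, many observations."""
    g = [Q / (2.0 * math.pi) * math.log(R / r) for r in distances]
    inv_T = sum(gi * si for gi, si in zip(g, drawdowns)) / sum(gi * gi for gi in g)
    return 1.0 / inv_T

# Hypothetical test: 10 observation wells, known T, noise-free drawdowns.
Q = 0.001                      # pumping rate (m^3/s), assumed
R = 30.0                       # radius of influence (m), assumed
distances = [2.0, 3.0, 4.5, 5.0, 6.5, 7.0, 8.0, 9.5, 11.0, 12.0]
true_T = 0.003                 # m^2/s, assumed
drawdowns = [thiem_drawdown(true_T, Q, r, R) for r in distances]

T_hat = fit_effective_T(drawdowns, distances, Q, R)
print(f"fitted effective T = {T_hat:.5f} m^2/s")
```

With noise-free drawdowns from a truly homogeneous aquifer the fit recovers the assumed T. In a heterogeneous aquifer the same fit would return a value that depends on which well is pumped, which is the scenario dependence examined in this paper.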
3.2. Approach 2: Heterogeneous Approach Based on Individual Pumping Test
[13] This approach adopts a heterogeneous conceptual model to estimate the T value at every location in the aquifer using each pumping test and its steady heads at 10 observation wells, excluding the head at the pumping well. As a result, this is a highly parameterized inverse problem. Since the number of parameters (the T value in each element of a finite element model for the aquifer) to be estimated is much greater than the number of head observations for a given pumping test, the inverse problem is ill-defined (or underdetermined) [Yeh et al., 2011]. For such an ill-defined inverse problem, as discussed in section 3.4, the successive linear estimator algorithm (SLE) based on geostatistics is an appropriate method for obtaining a conditional mean solution with a specified unconditional mean and a covariance function conditioned on observed heads [Yeh et al., 1996].
3.3. Approach 3: Heterogeneous Approach Based on Joint Inversion of Pumping Tests
[14] Unlike approach 2, this approach estimates T at every location in the aquifer by including several pumping test data sets from the sequential pumping tests simultaneously (i.e., HT). According to the principle of HT, as more nonredundant pumping test data sets from the sequential pumping tests are included in the analysis, the resolution of the estimated T field will improve. This is true even though the problem is highly parameterized and ill-posed.
3.4. Mathematical Models
[15] VSAFT2 (available at http://www.hwr.arizona.edu/yeh) was used for both forward and inverse modeling exercises in this study. The model is a two-dimensional, finite element model that solves governing equations for flow and transport in geologic media [Yeh et al., 1993]. To conduct the numerical analysis, the NYUST aquifer was treated as a two-dimensional, depth-averaged aquifer. The horizontal dimension of the aquifer used in the analysis was 51 m × 51 m, discretized into 2601 elements, each 1 m × 1 m in size. The initial total head (head hereafter) data observed at the 11 wells prior to the sequential pumping tests were interpolated onto each node of the grid and used as the initial steady head condition of the aquifer at the site. The extrapolated head at the boundary grids was assumed to be the constant boundary head condition of the aquifer. The extrapolation yields a head of 43.88 m for the north and east boundaries and 43.65 m for the south and west boundaries. Note that in the following discussions of the NYUST aquifer, only an area of 21 m × 21 m centered on the aquifer will be considered. For the synthetic aquifer, the inverse model setup is identical to that described in the numerical experiment section (2.2).
[16] The inverse algorithm used here is SimSLE [Xiang et al., 2009], which can simultaneously include results from several pumping tests to estimate the hydraulic property, and is based on the successive linear estimator algorithm (SLE) developed by Yeh et al. [1996]. This algorithm is included in VSAFT2. Here a brief description of the SLE is presented. Mathematically, SLE is a stochastic estimator that seeks the conditional approximate mean field of a stochastic parameter field. Specifically, it estimates the parameter value at any location successively utilizing the linear combination of weighted measurements of parameters and/or heads. The weights are determined from the spatial correlation between the head and the parameter, the spatial correlation between heads, and that between the parameters. With a given mean and covariance function of a stochastic field, and observed heads, SLE always yields a unique estimate since it is the conditional expectation (the best unbiased estimate) as opposed to conditional realizations. Conditional realizations are possible realizations of parameter fields that can yield head fields that honor observed heads and measured parameters, if any [see, e.g., Hanna and Yeh, 1998; Castagna and Bellin, 2009; Bohling and Butler, 2010]. On the other hand, the conditional expectation is the average of all conditional realizations. When the number of observations is greater than or equal to the number of parameters to be estimated and other necessary conditions are met [Yeh et al., 2011], only one conditional realization exists. This conditional realization is the conditional mean and is the true field. Conversely, if the necessary conditions are not met, a large number of conditional realizations exist, which have a common, unique conditional mean field. This conditional mean field is not the true field but it is the best unbiased estimate that honors all observations. Deviations from the true field are then quantified by the residual covariance.
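The weighting idea described above can be sketched as a single linear conditioning step. This is only an illustration under stated assumptions, not the SLE or SimSLE implementation in VSAFT2: the actual algorithm applies such updates successively because heads depend nonlinearly on T, and it obtains the head sensitivities from the flow model. Here the sensitivity matrix J, the observation values, and the one-dimensional cell layout are hypothetical; the exponential covariance with a correlation scale of one element length follows the description in this section.

```python
import math

def exp_cov(coords, variance, corr_scale):
    """Exponential covariance matrix of the parameter field."""
    return [[variance * math.exp(-abs(a - b) / corr_scale) for b in coords]
            for a in coords]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def linear_estimate(prior_mean, C_ff, J, d_obs):
    """One linear step: f_hat = mu + C_fh * C_hh^-1 * (d_obs - J mu)."""
    C_fh = matmul(C_ff, transpose(J))   # parameter-head cross-covariance
    C_hh = matmul(J, C_fh)              # head-head covariance
    resid = [d - sum(j * m for j, m in zip(row, prior_mean))
             for row, d in zip(J, d_obs)]
    lam = solve(C_hh, resid)            # C_hh^-1 applied to the head residual
    return [m + sum(c * l for c, l in zip(row, lam))
            for m, row in zip(prior_mean, C_fh)]

# Hypothetical setup: 5 cells of 1 m, 2 head observations.
coords = [0.5, 1.5, 2.5, 3.5, 4.5]
C_ff = exp_cov(coords, variance=1.0, corr_scale=1.0)
mu = [0.0] * 5                                   # prior mean of the perturbation
J = [[0.5, 0.3, 0.1, 0.0, 0.0],                  # assumed head sensitivities
     [0.0, 0.1, 0.2, 0.4, 0.3]]
d_obs = [0.2, -0.1]                              # assumed head perturbations

f_hat = linear_estimate(mu, C_ff, J, d_obs)
print([round(v, 4) for v in f_hat])
```

Although only two observations constrain five unknowns, the estimate is unique for the given mean and covariance, and in this linearized setting it reproduces the observations exactly, which mirrors the statement that the conditional mean honors all observations.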
[17] Inverse modeling using the SLE of VSAFT2 requires specification of values for the mean and variance of T, and a correlation structure. In approach 1, we used the mean and variance of the synthetic aquifer for either aquifer case. The effective T obtained in approach 1 for a given pumping test (Table 1) was then used as the mean for the analysis in approach 2 for the corresponding pumping test. The variance is the same as that in approach 1. Approach 3 used the same mean and variance as those of approach 1. The correlation structure in all analyses was assumed to be an exponential function with a correlation scale equal to one finite element length in all directions. The stopping criterion for the iteration of SimSLE is the stabilization of the L2 norm of the drawdown [see Xiang et al., 2009].
Table 1. Homogeneous T Value Estimated Using a Single Pumping Test (Approach 1)^{a}
Pumping Wells  Synthetic Aquifer (10 Heads) (m^{2} s^{−1})  Synthetic Aquifer (441 Heads) (m^{2} s^{−1})  NYUST Field Site (m^{2} s^{−1})
BH01  0.00579  0.00586  0.00334 
BH02  0.00328  0.00326  0.00249 
BH03  0.00326  0.00337  0.00188 
BH06  0.00154  0.00114  0.00121 
BH07  0.00264  0.00094  0.00209 
BH08  0.00394  0.00431  0.00072 
BH09  0.00374  0.00444  0.00137 
Mean  0.00346  0.00333  0.00187 
Variance  1.69E–06  3.18E–06  7.67E–07 
Seven pumping wells  0.00300  0.00215  0.00164 
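As a quick arithmetic check, the Mean and Variance rows of Table 1 appear consistent with the arithmetic mean and the sample (n − 1) variance of the seven single-test estimates in each column. The short sketch below recomputes them from the tabulated values; the dictionary keys are labels introduced here for convenience.

```python
from statistics import mean, variance

# Single-test effective T estimates (m^2/s) from Table 1, BH01 to BH09.
table1 = {
    "synthetic_10_heads": [0.00579, 0.00328, 0.00326, 0.00154,
                           0.00264, 0.00394, 0.00374],
    "synthetic_441_heads": [0.00586, 0.00326, 0.00337, 0.00114,
                            0.00094, 0.00431, 0.00444],
    "nyust": [0.00334, 0.00249, 0.00188, 0.00121,
              0.00209, 0.00072, 0.00137],
}

# statistics.variance is the sample variance (divides by n - 1).
summary = {name: (mean(vals), variance(vals)) for name, vals in table1.items()}

for name, (m, v) in summary.items():
    print(f"{name}: mean = {m:.5f}, variance = {v:.2e}")
```

The spread across pumping wells, rather than the mean itself, is what quantifies the scenario dependence of the effective T.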
3.5. Performance Statistics
[18] The standard correlation coefficient (COR) (1 ≥ r ≥ 0), mean absolute error (L1 norm), and mean square error (L2 norm) were the performance statistics used to measure the similarity between the true and the estimated T field for the cases involving the synthetic aquifer where the true T field is known. These metrics are also employed to evaluate the similarity between the predicted and the observed drawdown fields of the four validation tests in the synthetic and NYUST aquifers.
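The three statistics can be sketched as follows, taking the definitions at face value: L1 as the mean absolute error, L2 as the mean square error, and COR as the standard (Pearson) correlation coefficient. The exact formulas used in the study are not reproduced in this section, so the normalization here is an assumption, and the drawdown values are hypothetical.

```python
import math

def l1_norm(predicted, observed):
    """Mean absolute error between predicted and observed values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def l2_norm(predicted, observed):
    """Mean square error between predicted and observed values."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def correlation(predicted, observed):
    """Standard (Pearson) correlation coefficient."""
    n = len(observed)
    mp = sum(predicted) / n
    mo = sum(observed) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(predicted, observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    return cov / (sp * so)

# Hypothetical drawdowns (m) at four observation wells.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.41]

L1 = l1_norm(predicted, observed)
L2 = l2_norm(predicted, observed)
COR = correlation(predicted, observed)
print(f"L1 = {L1:.4f}, L2 = {L2:.6f}, COR = {COR:.3f}")
```

A small L1 and L2 together with a high COR indicates predictions that are both unbiased and tightly clustered around the 45° line of a scatterplot.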
5. Validation of Estimated T Fields
[29] One of the ultimate goals of aquifer characterization is to improve our ability to predict responses of the aquifer under any stress. Accurate mapping of hydraulic property distributions in aquifers is one of the necessary steps to accomplish this goal. In the case of the synthetic aquifer where the true hydraulic properties are known precisely, the estimated properties by any approach can be compared with their true values (as done in section 4). Alternatively, two possible approaches can be used to validate the estimates using the same well field for the estimation. One approach is to test the ability of the estimated hydraulic properties for predicting the drawdown observed at wells that are excluded from the inverse modeling efforts. The other is to test the ability of the estimate for predicting aquifer responses at all observation wells induced by new stresses at locations which are not considered in the inverse modeling exercises, for example, Illman et al. [2007, 2009], Liu et al. [2007], and Xiang et al. [2009]. These two means are qualified as an independent validation of the estimated parameters.
[30] According to Wu et al. [2005], the drawdown at an observation well during the late time period of a pumping test is correlated with heterogeneity everywhere within the cone of depression and, in particular, is highly correlated with the T values near the observation and the pumping wells. This implies that the drawdown measured at an observation well during a pumping test in a heterogeneous aquifer will likely be different from the drawdown recorded at another observation well at a different location at the same distance from the pumping well. The claim of independence for the first validation method is thus substantiated. Likewise, the cross-correlation suggests that the drawdowns observed at the same observation well induced by pumping at different locations in a heterogeneous aquifer will be different. This corroborates the independence of the second validation method.
[31] Notice that the cross-correlation pattern of the cross-hole pumping test based on the groundwater flow model is the same regardless of which one of the pair of wells is the pumping well or the observation well. Specifically, for a given pair of wells, say, well A and well B, the drawdown-time curve observed at well A because of pumping at well B will be the same as that observed at well B when well A is pumped in any heterogeneous aquifer. This is the so-called principle of reciprocity [Bruggeman, 1972]. As a result, drawdown/pumping data sets of any pair of observation and pumping wells that have been used in the inversion theoretically are redundant (i.e., perfectly correlated), and thus cannot be used to validate the estimated T field unless noise is considered.
[32] In both the synthetic and NYUST aquifers, there were 11 pumping events, and in each event, drawdowns at 10 wells were sampled, excluding the pumping well. Therefore, there are 110 steady drawdown data in total from the sequential pumping tests. Approaches 1 and 2 use drawdown data from 10 observation wells induced by one of the 11 pumping tests to estimate the T field. As a result, the 100 heads resulting from the remaining 10 pumping tests are nonredundant and are qualified and used for validating the estimate based on a given pumping test.
[33] On the other hand, for approach 3, only half of the 110 heads are nonredundant according to the principle of reciprocity. Figure 7 shows a scatterplot of the 55 normalized steady state observed drawdowns against the other 55 normalized drawdowns of the NYUST field experiment. These drawdowns are normalized with their corresponding pumping rates. For the synthetic aquifer, the scatterplot of the 55 pairs of drawdowns exactly forms a 45° line, indicating the validity of the principle of reciprocity. The scatter of the normalized drawdown data of the pumping tests at the NYUST field site could be attributed to noise in the measurements of the heads or flow rates. On the other hand, the scatter may indicate that the principle of reciprocity does not strictly hold in real-world scenarios. The failure of the principle in this situation may be attributed to possible physical incorrectness of the governing flow equation (e.g., turbulent flow, see Delay et al. [2011]). Nevertheless, proof of this principle for real-world aquifers is beyond the scope of this paper.
[34] For the HT analysis of the study, results of seven out of the 11 pumping tests were used for the joint estimations, and results of pumping tests at boreholes BH04, BH05, BH10, and BH11, were reserved for validation purposes. Therefore, we have 16 pairs of an observation well and a pumping well that were not used in the joint inversion. Since drawdowns at the pumping well were not recorded during the pumping test, we have only 12 pairs of drawdown data left for validation. Only half of the 12 drawdowns (i.e., six drawdowns) are nonredundant if the principle of reciprocity is deemed to be valid.
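The counts quoted in this section follow from simple combinatorics, which the sketch below reproduces. It assumes the 11 wells are labeled BH01 through BH11, consistent with the borehole names given in the text; an ordered pair is (pumping well, observation well), and reciprocity makes the two orderings of a pair redundant.

```python
from itertools import combinations, permutations

all_wells = [f"BH{i:02d}" for i in range(1, 12)]          # BH01 ... BH11
validation_wells = ["BH04", "BH05", "BH10", "BH11"]       # reserved tests

# Ordered (pumping, observation) pairs; the pumping well itself is excluded.
all_pairs = list(permutations(all_wells, 2))              # every recorded drawdown
unique_all = list(combinations(all_wells, 2))             # nonredundant under reciprocity

# Pairs in which neither well pumped in the seven tests used for inversion.
validation_pairs = list(permutations(validation_wells, 2))
unique_validation = list(combinations(validation_wells, 2))

print(len(all_pairs), len(unique_all))                    # 110 55
print(len(validation_pairs), len(unique_validation))      # 12 6
```

The four validation tests still provide 4 × 10 = 40 observed drawdowns in total, which is the number used when reciprocity is ignored because of measurement noise.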
5.1. Validation of Estimates of Approach 1
[35] Figures 8a–8h show scatterplots of observed versus predicted steady drawdowns at 10 observation wells of the 10 validation pumping tests in the synthetic aquifer. The predicted drawdowns in each figure represent the simulated drawdowns using the estimated effective T derived from the heads from all nodes of the synthetic aquifer associated with a specified pumping test (Table 1). The predicted drawdowns in Figure 8h denote those simulated with the effective T estimated from simultaneous inclusion of all data sets from the seven pumping tests. Because heads from the seven pumping tests are used for calibration in this case, only 40 drawdowns are qualified as nonredundant data for validation and are shown in Figure 8h. Performance statistics (L1, L2, and COR) are also given in each figure. The scatterplots and the statistics indicate that the prediction based on the effective T derived from simultaneous inclusion of the seven pumping tests is the best. The predicted drawdowns using the effective T based on individual pumping test data are generally unsatisfactory: they are either much smaller or much greater than those observed.
[36] Results of the same analysis for the NYUST field experiments are shown in Figures 9a–9h. Overall, the predictions are also unsatisfactory as are those of the synthetic aquifer. The scatter in these figures is more significant than that in figures for the synthetic aquifer, possibly because of noise or mathematical model errors, or differences in the scale of the observed and simulated drawdowns. Similar to the synthetic aquifer case, the prediction based on the effective T obtained using heads of all pumping tests is the best and unbiased. These best unbiased predictions corroborate the cross-correlation analysis by Wu et al. [2005]. The inclusion of more pumping tests brings forth more samples of heterogeneity and, thus, a more representative effective T has been obtained. The unsatisfactory predicted results also illustrate the limitations of traditional Theis and Jacob analyses.
5.2. Validation of Estimates of Approach 2
[37] Comparisons of observed versus predicted drawdowns using heterogeneous T estimates based on each one of the seven single pumping tests in the synthetic aquifer are shown in Figures 10a–10g. Similar results for the NYUST field experiment are illustrated in Figures 11a–11g. Overall, the heterogeneous approach yields T fields that lead to improved predictions of drawdowns for the 10 validation pumping tests compared to those based on the equivalent homogeneous approach. The improvement in the predicted drawdown is much more obvious in the cases associated with the synthetic aquifer than in the cases associated with the NYUST aquifer. Comparing Figures 8 and 10 of the synthetic aquifer, it is clear that the systematic bias in predicted drawdowns associated with the scenario dependence of the T estimate has been reduced significantly using the heterogeneous conceptual model. Likewise, the scatter, which is attributed to heterogeneity omitted by the conceptual model, also becomes much smaller. These improvements are, however, not very clear in the cases with the NYUST aquifer, possibly because of errors in the measurements in the heads or pumping rates, as well as mathematical model errors and other issues such as 3D effects. Nevertheless, the performance statistics of the cases show minor improvements over the results based on the homogeneous conceptual model. That is, even under the situation where data are noisy and infested with errors, a highly parameterized conceptual model is not a drawback to an inverse modeling effort.
5.3. Validation of Estimates of Approach 3
[38] Figures 12a–12g show the scatterplots of the 1764 pairs of true and simulated drawdowns at every node of the synthetic aquifer (21 × 21), excluding the boundary nodes during the four validation pumping tests. Each figure represents the comparison of the true and the predicted heads using the estimated heterogeneous T field based on a given number of sequential pumping tests. The large circles in these figures represent the six nonredundant heads, based on the principle of reciprocity. As indicated in these figures, the predicted drawdown at every node of the aquifer (not just the nodes in the area surrounded by the wells) improves as data of additional sequential pumping tests are included in the estimation. However, after incorporating the data of four sequential pumping tests, data of additional tests do not improve the bias of the predicted head significantly although scatter is reduced. Notice that the improvement in the predicted drawdowns at the three wells during the four tests (the six nonredundant drawdowns) is noticeably greater than that at many other locations. This can be attributed to their proximity to all the observation wells. Again, we emphasize that improvements are not limited to locations near and in between observation wells but also extend to locations far away from the wells and near the boundary. That is, improvement (in terms of scatter) progressively extends to the entire aquifer as more sequential test data are included. This result again disagrees with that by Bohling and Butler [2010].
[39] Comparisons between the observed and predicted drawdowns at the NYUST field aquifer during the four validation pumping tests are illustrated in Figures 13a–13g. All 40 pairs of observed and simulated drawdowns (ignoring the principle of reciprocity) are used in the comparison because of the presence of noise in the observed data. The scatterplots and the performance statistics again show that drawdown prediction progressively becomes better as data from more pumping tests are included. More importantly, the T estimates from the joint inversion of the sequential pumping tests in the NYUST aquifer yield the best predictions of drawdowns induced by the validation pumping tests in comparison with those based on either approach 1 or 2.
6. Discussion and Conclusions
[40] Results of our analysis show that the equivalent homogeneous approach (approach 1) yields scenario-dependent effective T estimates. The estimate varies with the pumping well location although the estimates are close to the geometric mean of T values of the synthetic aquifer. In comparison with the observed heads of the validation tests, the drawdowns predicted using the estimated T are dispersed and biased in both the synthetic and NYUST aquifers. The bias in the predicted drawdown also depends on the pumping well location employed in the estimation of the T field. This raises a salient question about the utility of the estimates from Theis or Jacob analyses for predicting drawdown and, in turn, flow, a utility that was defended by Butler [2008].
[41] Likewise, approach 2, a heterogeneous approach using data from a single pumping test, produces T spatial distributions capturing the general pattern of the true field of the synthetic aquifer. Although the general pattern of each estimated T field is similar, the estimated T distributions for both the synthetic and NYUST aquifers again vary with the pumping location. Comparisons of the predicted drawdowns of the validation tests show that the estimated T fields, based on approach 2 for both aquifers, consistently yield much better results than those based on approach 1. That is, even though the estimation problem may be ill-defined, a highly parameterized conceptual model results in more information regarding the pattern of heterogeneity, which leads to a better prediction of the drawdown field under different stresses.
[42] Lastly, approach 3 (HT), which also adopts a highly parameterized conceptual model but jointly interprets data sets from the sequential pumping tests, progressively removes the scenario dependence of the T estimates in both the synthetic and NYUST aquifers as more data sets from the sequential pumping tests are included. In addition, the quality of the estimate is continuously improved. The improvement in the T estimate is not limited to areas near the wells but extends to the entire aquifer. This improvement is also manifested in the predicted drawdown of the validation pumping tests, confirming advantages of joint interpretation of the sequential pumping tests (HT) and a highly parameterized conceptual model.
[43] The results of this study are consistent with the cross-correlation analysis of the steady drawdown at a point in the aquifer due to pumping and T values everywhere in the aquifer, and refute the inflated warning of limitations of HT by Bohling and Butler [2010], which largely stems from their inappropriate interpretation methodology.
[44] Bohling et al. [2002] advocated that drawdown reaches a steady state shape (a condition in which hydraulic gradient between points does not change with time) much earlier than a steady state condition before being affected by boundaries. In addition, drawdown under steady state shape conditions can be analyzed as a steady state flow. However, the fact is that the steady shape condition exists only in the sense of approximation, and that early time head data are affected only by heterogeneity close to wells, not by heterogeneity over as broad an area as the steady state head is. Moreover, boundary conditions do not play an important role in steady flow analysis if the pumping rate and many heads are known [Nelson, 1960].
[45] Likewise, the cross-correlation analysis and our results expose a crucial weakness of the pilot point approach used by Bohling and Butler [2010] and others. Briefly, the pilot point approach employs the maximum likelihood approach to convert some observed head information to obtain hydraulic parameter estimates at some selected pilot points. The number of pilot points is generally much smaller than the number of grid points where parameters are to be filled. The hydraulic parameters at the pilot points are then used to interpolate the parameter at the other grid points based on a given spatial covariance function (or variogram) of the property (note that Bohling and Butler [2010] used a spline interpolation function). As a consequence, new head information from the sequential pumping tests improves only the estimates at the pilot points and offers little improvement on the estimates at other locations. This limitation is attributed to the fact that the spatial covariance function (or the spline function), which remains the same, does not reflect effects of change in the flow field due to change of the pumping location as indicated by the cross-correlation analysis. Thus, the information about heterogeneity contained in the head data resulting from the sequential pumping tests is not fully utilized.
[46] Our results also suggest that interpretation tools based on a ray-tracing technique used in geophysics [e.g., Vasco et al., 2000; Brauchler et al., 2010, 2011], or based on a 2D radial-vertical flow model [Bohling et al., 2002; Bohling, 2009; Bohling and Butler, 2010], do not fully consider the cross correlation between the head and T heterogeneity over the entire cone of depression. While these tools are useful, again, they do not maximize the information about the heterogeneity contained in the head measurement.
[47] In conclusion, results of our analysis of sequential pumping tests in both the synthetic and NYUST aquifers promote the use of a highly parameterized conceptual model for inverse modeling with a conjunctive use of the hydraulic tomography interpretation. There is no doubt that hydraulic tomography combined with an appropriate analysis and interpretation can maximize the utility of a well field to improve the characterization of the aquifer. In contrast to arguments from Butler [2008], our results support the call for change on the way data are collected and analyzed [Yeh and Lee, 2007].