The conventional analysis of pumping tests by type-curve methods is based on the assumption of a homogeneous aquifer. Applying these techniques to pumping test data from real heterogeneous aquifers leads to estimates of the hydraulic parameters that depend on the choice of the pumping and observation well positions. In this paper, we test whether these values may be viewed as pseudo-local values of transmissivity and storativity, which can be interpolated by kriging. We compare such estimates to those obtained by geostatistical inverse modeling, where heterogeneity is assumed in all stages of estimation. We use drawdown data from multiple pumping tests conducted at the test site in Krauthausen, Germany. The geometric mean values of transmissivity and storativity determined by type-curve analysis are very close to those obtained by geostatistical inversion, but the conventional approach failed to resolve the spatial variability of transmissivity. In contrast, the estimate from geostatistical inversion reveals more structure. This indicates that the estimates of the type-curve approaches cannot be treated as pseudo-local values. Concerning storativity, both analysis methods show strong fluctuations. Because the variability of all terms making up the storativity is small, we believe that the estimated variability of storativity is biased. We examine the influence of measurement error on estimating structural parameters of covariance functions in the inversion. We obtain larger correlation lengths and smaller prior variances if we trust the measured data less.
 Pumping tests are common techniques for hydrogeological site investigation. During pumping tests, water is injected into or extracted from a production well, and the changes of water level are monitored in adjacent observation wells as well as in the production well itself. Conventional pumping tests are restricted to a single pumping well. Analysis of these tests provides hydraulic properties over a large influence zone, essentially an ellipse, between the production and observation wells [Butler and Liu, 1993; Gottlieb and Dietrich, 1995]. The obtained transmissivity is a weighted average and does not provide detailed spatial information [Yeh and Liu, 2000].
 In this paper, we analyze pumping test data obtained at the test site in Krauthausen of the Jülich Research Center, Germany [Vanderborght and Vereecken, 2001; Vereecken et al., 1999, 2000]. The research center has conducted a sequence of aquifer tests; that is, water was pumped at different wells and head changes were monitored at adjacent wells. Although the way of conducting these tests differs from the suggested three-dimensional setups for hydraulic tomography in the references given above, it still follows the same philosophy, namely stressing the aquifer at different locations and observing the response at other locations. It can be viewed as a special case of two-dimensional hydraulic tomography.
 Type-curve methods are the basis of conventional analysis of hydraulic aquifer tests [e.g., Meier et al., 1998]. The conventional approaches are straightforward and easy to implement, but they are based on the assumption of a homogeneous isotropic formation and an infinite domain, which may lead to biased estimates of hydraulic parameters. The results of the conventional approaches are apparent uniform values related to particular stress/observation points. In contrast to the conventional methods, the geostatistical inverse approach is based on the assumption that the parameter fields are spatially correlated random functions, which may be more consistent with the heterogeneous nature of aquifers [e.g., Hoeksema and Kitanidis, 1984; Rubin and Dagan, 1987]. The disadvantages of these inverse approaches are that they are more difficult to implement and require significantly higher computational effort than the type-curve methods.
 Several studies have shown that the conventional analysis of pumping tests can provide valuable information about real heterogeneous media in spite of the assumption of homogeneity. Meier et al. [1998] and Sánchez-Vila et al. [1999] assessed the applicability of Jacob's approach in heterogeneous aquifers. They simulated a conventional pumping test in a virtual two-dimensional aquifer where heterogeneous transmissivity and homogeneous storativity fields were used. Jacob's approach was applied to the late-time response of drawdown curves. The estimated transmissivity closely agreed with the real effective transmissivity. The estimated storativity values varied strongly from one pumping test to the other, although the real storativity was uniform. Based on Theis' method, Leven and Dietrich [2006] estimated the hydraulic parameters for different pumping test configurations in a virtual aquifer. In their study, only the pumping wells were monitored. The estimated transmissivity from multiple pumping tests showed a close agreement with the actual distribution. The estimated transmissivity from the conventional pumping tests showed low variation and approached the effective transmissivity of the virtual aquifer. Like Meier et al. [1998] and Sánchez-Vila et al. [1999], Leven and Dietrich [2006] observed similar behavior in estimating storativity, namely a strong variation in the estimate, despite the fact that the actual field of storativity was uniform. Schad and Teutsch applied the conventional analysis of pumping tests in a real alluvial aquifer and successfully estimated the effective length scale of the heterogeneous structures. Neuman et al. developed a type-curve approach to estimate the variance and integral scale of log transmissivity in real media. This type-curve method can be used to estimate the variance and the integral scale if a sufficient number of pumping wells are available.
 In the literature, a number of studies on radial flow toward a well in heterogeneous aquifers have been reported [Sánchez-Vila et al., 1999; Copty and Findikakis, 2004]. Dagan  and Guadagnini et al.  derived analytical relationships between the effective transmissivity and drawdown. These analytical solutions do not assume homogeneity of aquifers and have the potential of analyzing pumping tests to obtain the hydraulic parameters of a formation.
 The purpose of this paper is to test whether transmissivity and storativity estimates, obtained by conventional type-curve analysis, can be viewed as local measurements of the hydraulic parameters themselves. If this assumption were valid, a continuous image of the field could be obtained by geostatistical interpolation, i.e., kriging. We will compare the kriged results of conventional pumping test analysis with the best estimate from the geostatistical inverse approach, which is conceptually more consistent. We perform the estimates based on the two-dimensional hydraulic tomography data from the test site in Krauthausen. Unlike Leven and Dietrich [2006], who considered only head measurements in multiple pumping wells, we also include the drawdown information from adjacent wells.
 As the geostatistical inverse approach, we apply the quasi-linear method of Kitanidis [1995] to invert temporal moments of drawdown [Li et al., 2005]. In this approach, we estimate a spatially variable but smooth parameter field that maximizes the posterior probability density of the parameters, linearized about the estimate itself. We develop a strategy to analyze data from multiple two-dimensional tomographic pumping tests. Unlike Zhu and Yeh, we apply the method to field data, where the true parameters are not known and the geostatistical parameters are also uncertain.
 If the functional relationship between measurements and parameters is linear, the most likely value of our inverse approach is identical to the mean of conditional realizations. When the nonlinearity of the functional relationship is pronounced, the most likely value obtained by our method differs from the conditional mean. To overcome this shortcoming and obtain the unbiased best estimate, one may generate multiple realizations meeting the measurements via Monte Carlo simulations [Sahuquillo et al., 1992; Gutjahr et al., 1994], which may be computationally demanding. An alternative method of computing the conditional mean with high-order accuracy is based on conditional nonlocal ensemble moment equations [Guadagnini and Neuman, 1999a, 1999b; Hernandez et al., 2006]. This method relies neither on multiple realizations nor on linearizations. It comes, however, with significantly higher computational costs than the quasi-linear approach used in our study. In order to solve the underlying integrodifferential equations, the conditional covariance matrix of the parameters must be computed explicitly and stored in each iteration, which may become rather demanding for large-scale problems discretized by hundreds of thousands of nodes.
 Both kriging and the quasi-linear geostatistical approach of inversion require the knowledge of structural parameters, such as the variance and correlation length. We estimate these parameters from the data using the restricted maximum likelihood approach [Kitanidis and Vomvoris, 1983], and discuss the influence of the measurement error on the identifiability of the structural parameters. The advantages of this approach are threefold. First, it avoids biased results of conventional experimental variogram analysis. Second, it can infer the structural parameters also from related secondary information, such as hydraulic heads. Third, the hydraulic parameter fields and the structural parameters are estimated jointly.
2. General Approaches
 Conventional analysis of pumping tests consists of fitting analytical solutions of the flow equation to the measured time curves of drawdown [e.g., Theis, 1935; Cooper and Jacob, 1946]. The analytical solutions are for infinite, homogeneous, isotropic, two-dimensional formations. The estimated transmissivity and storativity values represent apparent parameters, since the real formation may not comply with the underlying assumptions of the analytical solutions. By repeating the previous procedure for all series of pumping tests at different locations, we obtain a set of pseudo-local values of transmissivity and storativity. Then, we interpolate these local values, producing a continuous image of the hydraulic parameters. Strictly speaking, this procedure is only allowed in case of small-scale pumping tests, because the support volume of the measurements is neglected.
 In geostatistical inverse modeling, we assume that the logarithms of the hydraulic parameters are random space variables exhibiting second-order stationarity [e.g., Kitanidis and Vomvoris, 1983]. On the basis of Bayesian analysis, we obtain the most likely set of the hydraulic parameters by maximizing the linearized posterior probability density function, conditioned on the temporal moments of all drawdown curves in all pumping tests. The approach also gives a lower bound of the uncertainty of the estimated parameters.
 Both methods can be extended by a further Bayesian updating step to estimate the structural parameters of the prior covariance functions, i.e., the prior variances, the correlation length, and the prior correlation between log transmissivity and log storativity.
Figure 1 shows a flowchart of the general procedures applied in the conventional analysis and in geostatistical inversion of the pumping test data. The left branch shows the procedure in the conventional analysis of pumping tests; the right branch illustrates the process of geostatistical inversion using temporal moments of drawdown.
3. Governing Equations
3.1. Governing Equation of Drawdown
 We assume two-dimensional regional groundwater flow in a confined aquifer. Prior to the pumping test, the system is assumed to be in steady state. Then the drawdown s [m] observed during the pumping test meets the following equation:
$$S \frac{\partial s}{\partial t} - \nabla \cdot \left( T \nabla s \right) = Q(t)\, \delta(x - x_w)$$
with the initial and boundary conditions:
$$s = 0 \quad \text{at } t = 0$$
$$s = 0 \quad \text{on } \Gamma_{Diri}$$
$$n \cdot \left( T \nabla s \right) = 0 \quad \text{on } \Gamma_{Neu}$$
where S [–] and T [m²/s] are the depth-integrated coefficients of storativity and transmissivity, respectively, Q(t) [m³/s] denotes the pumping rate (positive for extraction), δ(x − x_w) [1/m²] is the Dirac delta function, x_w [m] is the location of the well, $\Gamma_{Diri}$ and $\Gamma_{Neu}$ denote Dirichlet and Neumann boundaries, and n [–] is the unit vector normal to the boundaries.
3.2. Temporal Moments of Drawdown
 The kth temporal moment $m_k(s(x))$ [m s$^{k+1}$] of drawdown and the kth moment $m_k(Q)$ [m³ s$^k$] of the pumping rate Q are defined by:
$$m_k(s(x)) = \int_0^\infty t^k\, s(x,t)\, dt$$
$$m_k(Q) = \int_0^\infty t^k\, Q(t)\, dt$$
 As derived by Li et al. [2005], the temporal moments of drawdown meet the following elliptic equation provided that the pumping rate Q(t) drops off to zero at late times:
$$\nabla \cdot \left( T \nabla m_k \right) = -k\, S\, m_{k-1} - m_k(Q)\, \delta(x - x_w) \tag{7}$$
in which we have dropped the argument s(x, t) in the notation of the temporal moment $m_k$ for simplicity. Equation (7) directly relates the temporal moments to the hydraulic parameters. Thus, we can use a series of steady state equations to represent transient groundwater flow.
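The moment definition can be illustrated with a minimal sketch that approximates the kth temporal moment of a sampled curve by trapezoidal integration. The function name `temporal_moment`, the sampling interval, and the synthetic exponential pulse response are assumptions made for illustration only; they are not taken from the field data.

```python
import math

def temporal_moment(times, values, k):
    """Approximate the k-th temporal moment, the integral of t**k * f(t) dt,
    by the trapezoidal rule on the sampled curve (times, values)."""
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        f_left = times[i - 1] ** k * values[i - 1]
        f_right = times[i] ** k * values[i]
        total += 0.5 * (f_left + f_right) * dt
    return total

# Synthetic pulse response (hypothetical): s(t) = (m0 / tau) * exp(-t / tau),
# whose exact moments are m0 and m1 = m0 * tau.
m0_true, tau = 2.0, 300.0
times = [10.0 * i for i in range(1001)]          # 0 to 10000 s, sampled every 10 s
drawdown = [m0_true / tau * math.exp(-t / tau) for t in times]

m0 = temporal_moment(times, drawdown, 0)
m1 = temporal_moment(times, drawdown, 1)
print(m0, m1, m1 / m0)   # m1 / m0 approximates the characteristic time tau
```

The ratio m1/m0 has units of time and characterizes how quickly the drawdown responds, which is why the zeroth and first moments together carry information on both transmissivity and storativity.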
4. Geostatistical Framework
4.1. Prior Distribution of the Parameters
 We assume that the logarithms of transmissivity and storativity are second-order stationary multi-Gaussian random space variables. In a discretized domain with $n_e$ elements, we have to estimate $2n_e$ parameters. We aggregate the log transmissivity and log storativity of all elements into the $2n_e \times 1$ vector Y, which can be expressed by the sum of a deterministic trend and random fluctuations Y′ about the trend:
$$Y = X \beta + Y'$$
in which X is the $2n_e \times n_\beta$ matrix of discretized base functions, β is the $n_\beta \times 1$ vector of trend coefficients, and $n_\beta$ is the number of trend terms. In the simplest model, we assume a uniform mean for both lnT and lnS, so that $n_\beta = 2$, and X has entries of unity in the first column for all elements of Y representing log transmissivity and in the second column for the elements representing log storativity; the other entries are zeros.
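 The $2n_e \times 2n_e$ prior covariance matrix $R_{Y'Y'|\theta} = E[Y' \otimes Y']$ consists of four blocks representing the discretized auto-covariance functions of log transmissivity and log storativity as well as the discretized cross-covariance function:
$$R_{Y'Y'|\theta} = \begin{bmatrix} R_{\ln T \ln T|\theta} & R_{\ln T \ln S|\theta} \\ R_{\ln T \ln S|\theta}^{T} & R_{\ln S \ln S|\theta} \end{bmatrix}$$
in which $R_{\ln T \ln T|\theta}$ is the auto-covariance matrix of lnT for given structural parameters θ, $R_{\ln S \ln S|\theta}$ is the auto-covariance matrix of lnS, and $R_{\ln T \ln S|\theta}$ is the cross-covariance matrix of lnT and lnS.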
 In a continuous description, the various covariance matrices are continuous covariance functions, depending on the distance vector h between two points. At first, we assume that the correlation structures of lnT and lnS are identical, and that the quantities are correlated. This leads to:
$$R_{\ln T \ln T|\theta}(x + h, x) = \sigma_{\ln T}^2\, \rho(h)$$
$$R_{\ln S \ln S|\theta}(x + h, x) = \sigma_{\ln S}^2\, \rho(h)$$
$$R_{\ln T \ln S|\theta}(x + h, x) = \sigma_{\ln T \ln S}\, \rho(h)$$
in which $R_{\ln T \ln T|\theta}(x + h, x)$, $R_{\ln S \ln S|\theta}(x + h, x)$, and $R_{\ln T \ln S|\theta}(x + h, x)$ are the auto-covariance functions of lnT and lnS and the cross-covariance function of these quantities for given structural parameters θ; $\sigma_{\ln T}^2$ and $\sigma_{\ln S}^2$ are the prior variances of lnT and lnS, and $\sigma_{\ln T \ln S}$ is the covariance between lnT and lnS at zero separation; ρ(h) is the correlation function. Grain-size analysis and flowmeter data of the test site under consideration [Vereecken et al., 2000] suggest that the correlation function is an exponential model. We assume that ρ(h) is isotropic:
$$\rho(h) = \exp\left( -\frac{|h|}{\lambda} \right)$$
in which ∣h∣ is the length of the distance vector h and λ is the correlation length. We aggregate $\sigma_{\ln T}^2$, $\sigma_{\ln S}^2$, $\sigma_{\ln T \ln S}$, and λ into the $n_\theta \times 1$ vector θ of structural parameters, with $n_\theta$ being the number of structural parameters (here four).
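A minimal sketch of how such a block covariance matrix could be assembled for a handful of element centers is given below. It assumes the isotropic exponential correlation function described in the text; the function names, the argument names, and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def exponential_correlation(distance, corr_length):
    """Isotropic exponential correlation, exp(-|h| / lambda)."""
    return math.exp(-abs(distance) / corr_length)

def prior_covariance(coords, var_lnT, var_lnS, cov_lnTlnS, corr_length):
    """Assemble the 2n x 2n prior covariance of Y' = [lnT'; lnS'] for n points,
    using the same correlation function for all four blocks."""
    n = len(coords)
    R = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            rho = exponential_correlation(math.hypot(xi - xj, yi - yj), corr_length)
            R[i][j] = var_lnT * rho              # lnT auto-covariance block
            R[n + i][n + j] = var_lnS * rho      # lnS auto-covariance block
            R[i][n + j] = cov_lnTlnS * rho       # cross-covariance block
            R[n + i][j] = cov_lnTlnS * rho       # transposed cross-covariance block
    return R

# Two points 5 m apart, correlation length 5 m (illustrative values only).
R_example = prior_covariance([(0.0, 0.0), (3.0, 4.0)], 1.0, 0.25, 0.1, 5.0)
for row in R_example:
    print(["%.4f" % value for value in row])
```

For large grids a dense matrix of this kind is not practical, which is why the paper relies on periodic embedding and spectral methods instead of explicit storage.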
 In later steps of the analysis, we will assume that lnS is spatially uniform. Then, the prior variance of lnS and the cross-correlation coefficient become zero. The prior covariance matrix contains nonzero entries only in the auto-covariance matrix of lnT, while the other blocks become zero matrices. In these cases, the number of structural parameters to be estimated becomes two: the prior variance of lnT and the corresponding correlation length of the covariance function of lnT.
 We assume that the trend coefficients β are uncertain prior to conditioning. β* is the nβ × 1 vector of the prior mean of β, and the prior uncertainty of β about β* is quantified by the nβ × nβ prior covariance matrix Rββ.
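 We assume that the statistical distributions of Y′ and β are (multi)Gaussian:
$$Y' \sim N\left( 0,\, R_{Y'Y'|\theta} \right), \qquad \beta \sim N\left( \beta^*,\, R_{\beta\beta} \right)$$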
4.2. Posterior Distribution of Parameters
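 We now consider the $n_Z \times 1$ vector of measured values $Z_m$ of a dependent variable Z(Y′, β), where $n_Z$ is the total number of measured values. We assume a Gaussian likelihood function $p(Z_m|Y', \beta)$ of $Z_m$:
$$p(Z_m|Y', \beta) \propto \exp\left( -\tfrac{1}{2} \left( Z_m - Z(Y', \beta) \right)^T V_{ZZ}^{-1} \left( Z_m - Z(Y', \beta) \right) \right)$$
where $V_{ZZ}$ is the $n_Z \times n_Z$ covariance matrix of the corresponding epistemic error and Z(Y′, β) is the model prediction.
 Applying Bayes' theorem, we can obtain the conditional distribution $p(Y', \beta|Z_m, \theta)$ of the parameters Y′ and β given the measurements $Z_m$ and structural parameters θ:
$$p(Y', \beta|Z_m, \theta) \propto \exp\left( -\tfrac{1}{2} Y'^T R_{Y'Y'|\theta}^{-1} Y' - \tfrac{1}{2} (\beta - \beta^*)^T R_{\beta\beta}^{-1} (\beta - \beta^*) - \tfrac{1}{2} \left( Z_m - Z(Y', \beta) \right)^T V_{ZZ}^{-1} \left( Z_m - Z(Y', \beta) \right) \right) \tag{20}$$
The most likely estimate maximizes the conditional probability density $p(Y', \beta|Z_m, \theta)$; that is, it minimizes the doubled negative logarithm of the right-hand side of equation (20):
$$L(Y', \beta) = Y'^T R_{Y'Y'|\theta}^{-1} Y' + (\beta - \beta^*)^T R_{\beta\beta}^{-1} (\beta - \beta^*) + \left( Z_m - Z(Y', \beta) \right)^T V_{ZZ}^{-1} \left( Z_m - Z(Y', \beta) \right)$$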
4.3. Kriging of Pseudo-Local Values
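 In interpolation by kriging, we consider local measurements of lnT and lnS. The functional relation Z(Y′, β) between the measurements and parameters Y can be expressed as:
$$Z(Y', \beta) = H Y = H \left( X \beta + Y' \right)$$
where H is a $2n_\ell \times 2n_e$ extraction matrix with a single unit element per row and $n_\ell$ is the number of calculated lnT and lnS pairs. Then, our objective function becomes:
$$L(Y', \beta) = Y'^T R_{Y'Y'|\theta}^{-1} Y' + (\beta - \beta^*)^T R_{\beta\beta}^{-1} (\beta - \beta^*) + \left( Z_m - H (X \beta + Y') \right)^T V_{ZZ}^{-1} \left( Z_m - H (X \beta + Y') \right) \tag{23}$$
 The optimal set of parameters minimizes the objective function given in equation (23). It can be shown that the most likely estimate can be expressed as:
$$\hat{Y} = X \hat{\beta} + R_{Y'Y'|\theta} H^T \xi \tag{24}$$
in which $\hat{\beta}$ is the estimate of the trend coefficients and ξ is a $2n_\ell \times 1$ vector of weights. These coefficients are computed by solving [Kitanidis, 1995]:
$$\begin{bmatrix} H R_{Y'Y'|\theta} H^T + V_{ZZ} & H X \\ X^T H^T & -R_{\beta\beta}^{-1} \end{bmatrix} \begin{bmatrix} \xi \\ \hat{\beta} \end{bmatrix} = \begin{bmatrix} Z_m \\ -R_{\beta\beta}^{-1} \beta^* \end{bmatrix} \tag{25}$$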
 The conditional covariance matrix of the parameter Y is defined as:
It may be worth noting that the simple structure of H does not require computing matrix-matrix products explicitly. $H R_{Y'Y'|\theta} H^T$ is the auto-covariance matrix of Y evaluated only for the elements of Y for which measurements exist. Also, because the relationship between measurements and estimated parameters is linear, the most likely value is identical to the expected value of the conditional distribution.
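The kriging step can be sketched for a single variable along a one-dimensional transect. The sketch below assumes NumPy, an exponential covariance, and a uniform unknown mean with diffuse prior knowledge (the limiting case in which the inverse prior covariance of the trend coefficient vanishes); the paper itself uses a Gaussian prior for β and jointly estimates lnT and lnS in two dimensions. The function name `krige` and all numbers are hypothetical.

```python
import numpy as np

def krige(x_grid, x_obs, z_obs, variance, corr_length, err_var):
    """Kriging with an unknown uniform mean (diffuse prior on the trend coefficient).

    Solves [[H R H^T + V, H X], [X^T H^T, 0]] [xi; beta] = [z; 0] and evaluates
    the estimate X beta + R H^T xi on x_grid.
    """
    def cov(a, b):
        return variance * np.exp(-np.abs(a[:, None] - b[None, :]) / corr_length)

    n = len(x_obs)
    Qyy = cov(x_obs, x_obs) + np.diag(err_var)   # covariance among measurements plus error
    HX = np.ones((n, 1))                         # uniform-mean base function at the data points
    A = np.block([[Qyy, HX], [HX.T, np.zeros((1, 1))]])
    rhs = np.concatenate([z_obs, [0.0]])
    sol = np.linalg.solve(A, rhs)
    xi, beta = sol[:n], sol[n]
    estimate = beta + cov(x_grid, x_obs) @ xi
    return estimate, beta

# Illustrative pseudo-local lnT values at three well positions (made-up numbers).
x_obs = np.array([0.0, 10.0, 25.0])
z_obs = np.array([-6.0, -5.5, -6.4])
x_grid = np.linspace(0.0, 30.0, 7)
est, beta = krige(x_grid, x_obs, z_obs, 0.5, 10.0, np.full(3, 0.01))
print(beta, est)
```

With zero measurement error the estimate reproduces the data exactly at the measurement points; a nonzero error variance lets the interpolated field deviate from the data and pulls it toward the estimated mean.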
4.4. Hydraulic Tomography
 In our geostatistical inverse approach, we use the temporal moments of various transient drawdown curves as measurements. In this paper, the zeroth and the first temporal moments are used. We introduce a 2nm × 1 vector Zm containing the measurements of the calculated moments, in which nm is the number of moment pairs.
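 Because the functional relationship Z(Y) between the parameters Y and the measurements $Z_m$ is nonlinear, an iterative scheme has to be applied. Linearization about the last estimate $\hat{Y}_k$ yields:
$$Z(Y) \approx Z(\hat{Y}_k) + H_k \left( Y - \hat{Y}_k \right)$$
in which $H_k$ is the $2n_m \times 2n_e$ sensitivity matrix about $\hat{Y}_k$. Now we introduce a $2n_m \times 1$ vector $Z_0$ of corrected measurements:
$$Z_0 = Z_m - Z(\hat{Y}_k) + H_k \hat{Y}_k$$
Then, the objective function becomes:
$$L(Y', \beta) = Y'^T R_{Y'Y'|\theta}^{-1} Y' + (\beta - \beta^*)^T R_{\beta\beta}^{-1} (\beta - \beta^*) + \left( Z_0 - H_k (X \beta + Y') \right)^T V_{ZZ}^{-1} \left( Z_0 - H_k (X \beta + Y') \right) \tag{29}$$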
The expression is formally identical to that in equation (23) and the same general approach can be used to obtain the most likely value of Y. The structure of H, however, is more complicated than in the case of interpolation. Because Hk and Z0 depend on the current estimate, an iterative approach is needed where Hk and Z0 are updated after each iteration [Kitanidis, 1995]. We stabilize the approach by a modified Levenberg-Marquardt method [Nowak and Cirpka, 2004], compute the sensitivities by the continuous adjoint-state method [Sun and Yeh, 1990], and accelerate the matrix-matrix multiplications involved by periodic embedding and spectral methods [Nowak et al., 2003].
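The iteration can be sketched on a toy problem. The following is a minimal illustration, assuming NumPy, a diffuse prior on a uniform mean, and a hypothetical forward model in which the "measurements" are exp(Y) at three cells; it omits the Levenberg-Marquardt stabilization, the adjoint-state sensitivities, and the spectral acceleration used in the paper, and all names and numbers are made up.

```python
import numpy as np

def quasi_linear_inversion(forward, jacobian, z_meas, R, X, V, y_init, n_iter=20):
    """Plain quasi-linear iteration: relinearize, correct the measurements,
    and solve the kriging-like system about the current estimate."""
    y = y_init.copy()
    n_z, n_b = len(z_meas), X.shape[1]
    for _ in range(n_iter):
        H = jacobian(y)                          # sensitivity about the current estimate
        z0 = z_meas - forward(y) + H @ y         # corrected measurements
        HX = H @ X
        A = np.block([[H @ R @ H.T + V, HX], [HX.T, np.zeros((n_b, n_b))]])
        sol = np.linalg.solve(A, np.concatenate([z0, np.zeros(n_b)]))
        xi, beta = sol[:n_z], sol[n_z:]
        y = X @ beta + R @ H.T @ xi              # updated estimate
    return y

# Toy setup (hypothetical): 20 cells, exponential prior covariance, uniform mean.
n = 20
x = np.arange(n, dtype=float)
R = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :]) / 5.0)
X = np.ones((n, 1))
obs = [3, 10, 16]

def forward(y):
    return np.exp(y[obs])                        # nonlinear "measurement" model

def jacobian(y):
    H = np.zeros((len(obs), n))
    H[np.arange(len(obs)), obs] = np.exp(y[obs])
    return H

z_meas = np.array([1.5, 0.8, 2.0])
V = 1e-6 * np.eye(len(obs))
y_hat = quasi_linear_inversion(forward, jacobian, z_meas, R, X, V, np.zeros(n))
print(np.exp(y_hat[obs]))                        # close to z_meas for small V
```

For strongly nonlinear problems such a plain iteration may overshoot or oscillate, which is the motivation for the stabilization mentioned in the text.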
4.5. Estimation of Structural Parameters
 In the previous derivation, we have assumed that the structural parameters are known. In reality, these parameters have to be estimated from the data as well. For this purpose, we apply the restricted maximum likelihood approach [Kitanidis and Vomvoris, 1983].
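 The optimal set of θ maximizes the conditional probability density $p(\theta|Z_m)$ of the structural parameters given the measurements $Z_m$. In the framework of Bayes' theorem, $p(\theta|Z_m)$ is given by:
$$p(\theta|Z_m) = \frac{p(Z_m|\theta)\, p(\theta)}{p(Z_m)}$$
in which $p(Z_m)$ is a scalar, which does not depend on θ.
 We assume that the prior probability density function p(θ) of θ is multi-Gaussian:
$$p(\theta) \propto \exp\left( -\tfrac{1}{2} (\theta - \theta^*)^T R_{\theta\theta}^{-1} (\theta - \theta^*) \right)$$
in which θ* is the $n_\theta \times 1$ vector of the prior mean of θ and $R_{\theta\theta}$ is the $n_\theta \times n_\theta$ prior covariance matrix.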
 Assuming that the linearization about the most likely value of Y is permissible, it can be shown that $p(\theta|Z_m)$ is identical to [Kitanidis, 1995]:
where Ξ is defined as:
The optimal set of θ maximizes $p(\theta|Z_m)$ or minimizes the doubled value of its negative logarithm $L(\theta|Z_m)$:
where all terms that do not depend on θ are included in the constant. In case of diffuse prior knowledge about θ, the third term in equation (35) disappears. In this paper, the optimal set minimizing $L(\theta|Z_m)$ is determined by the Nelder-Mead simplex method [e.g., Press et al., 1992, p. 408]. Finally, the estimation covariance of θ is approximated by:
which is determined at the most likely value.
 Because of the underlying nonlinearity of the functional relation between the measurements and hydraulic parameters, the estimate of structural parameters depends on the estimate of the hydraulic parameters. Thus, an iterative procedure is needed in which the hydraulic parameters and the structural parameters are estimated in an alternating manner [Kitanidis, 1995].
5.1. Description of the Test Site
 The Krauthausen test site is located in the southern part of the Lower Rhine Embayment, Germany [Vanderborght and Vereecken, 2001; Vereecken et al., 1999, 2000]. It covers an area of 180 m × 50 m. All studies at the test site have focused on the uppermost aquifer with a thickness of approximately 10 m. This aquifer is part of a floodplain, consisting mainly of gravel and sand sediments. The site is equipped with 73 monitoring wells (approximately 5 cm in diameter) and a single well, approximately 17.5 cm in diameter, that was used as the pumping well in a large-scale pumping test.
 Because the drawdowns during the small-scale pumping tests were considerably smaller than the thickness of the aquifer, it is permissible to use the equations for confined conditions in our analysis. We analyze the values of storativity for each pumping test using Theis' [1935] method, resulting in values ranging from 2.5 × 10⁻⁴ to 0.017, which indicates confined conditions.
5.2. Pumping Tests and Data Preparation
5.2.1. Description of Pumping Tests
 From March to August 2000, a series of small-scale pumping tests with a discharge rate of 2 m³/h was conducted at 29 different pumping wells at the test site in Krauthausen [Lamertz, 2001]. For each pumping test, approximately 10 wells adjacent to the production well were used as observation wells. Well locations are marked by circles in Figures 2 and 4. The distances between pumping and observation wells range from 1.63 m to 130.91 m. Both the pumping and observation wells were equipped with automatic loggers of hydraulic head with a resolution of 1 mm to measure and store the head changes at time intervals of 10 s. The pumping continued for 2 hours.
 In addition, the Jülich Research Center has conducted a large-scale pumping test with a discharge rate of 80 m³/h. The production well was located at the center of the test site. The large-scale pumping test lasted for approximately 7 hours.
5.2.2. Analysis by Theis' Approach
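 We use Theis' [1935] approach to estimate lnT and lnS from each transient drawdown curve of the small-scale pumping tests. We obtain the optimal values of lnT and lnS by minimizing the following objective function M:
$$M = \sum_{i} \frac{\left[ s(t_i) - s_e(t_i) \right]^2}{\sigma_s^2} \tag{37}$$
where s is the measured drawdown, the sum runs over all measurement times $t_i$ of the drawdown curve, $\sigma_s$ is the epistemic error of drawdown, and $s_e$ is the model output defined by:
$$s_e(t) = -\frac{Q}{4 \pi T}\, \mathrm{Ei}\left( -\frac{r^2 S}{4 T t} \right)$$
in which r is the distance between the production and monitoring wells and Ei(·) is the exponential integral function.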
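As an illustration of the model output, the sketch below evaluates the Theis drawdown in pure Python, using the well function W(u) = E1(u) = −Ei(−u) with u = r²S/(4Tt). The series and continued-fraction evaluation of E1 is a standard numerical approach and is an implementation choice made here, not something described in the paper; the parameter values are illustrative, with only the discharge of 2 m³/h taken from the test description.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(u):
    """Exponential integral E1(u) = -Ei(-u) for u > 0."""
    if u <= 0.0:
        raise ValueError("u must be positive")
    if u <= 1.0:
        # Power series: E1(u) = -gamma - ln(u) + sum_{k>=1} (-1)^(k+1) u^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 40):
            term *= -u / k                       # term = (-u)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(u) + total
    # Continued fraction (modified Lentz scheme) for u > 1
    b = u + 1.0
    c = 1.0e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return h * math.exp(-u)

def theis_drawdown(t, r, Q, T, S):
    """Theis drawdown [m] at time t [s] and distance r [m] for pumping rate Q [m3/s]."""
    u = r * r * S / (4.0 * T * t)
    return Q / (4.0 * math.pi * T) * exp_integral_e1(u)

# Illustrative parameters: Q = 2 m3/h; T, S, and r are made-up values.
Q = 2.0 / 3600.0
for t in (60.0, 600.0, 7200.0):
    print(t, theis_drawdown(t, r=5.0, Q=Q, T=1e-2, S=1e-3))
```

In a fit, lnT and lnS would be varied (with T = exp(lnT) and S = exp(lnS)) until the weighted squared misfit between this model output and the measured curve is minimal.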
 The epistemic error $\sigma_s^2$ of drawdown is assumed uncorrelated and identical for all measurements. $\sigma_s^2$ includes random and systematic contributions. In most circumstances, $\sigma_s^2$ is not known and needs to be estimated. If $\sigma_s^2$ reflects the real uncertainty of the drawdown measurements, the value of M should statistically follow a χ² distribution with m degrees of freedom, where m is the number of observations minus the number of estimated parameters. The proper value of $\sigma_s^2$ is determined by enforcing M to meet its expected value, which equals m.
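Enforcing M to equal its expected value amounts to dividing the sum of squared residuals of the best fit by the number of degrees of freedom. A minimal sketch, with a hypothetical function name and made-up residuals, is:

```python
def calibrate_error_variance(residuals, n_params):
    """Error variance for which the chi-square objective equals its expected value,
    i.e., the number of observations minus the number of fitted parameters."""
    dof = len(residuals) - n_params
    if dof <= 0:
        raise ValueError("need more observations than fitted parameters")
    return sum(r * r for r in residuals) / dof

# Made-up residuals [m] between measured and fitted drawdown; two fitted parameters.
residuals = [0.002, -0.001, 0.003, -0.002, 0.001]
sigma_s2 = calibrate_error_variance(residuals, n_params=2)
M = sum(r * r for r in residuals) / sigma_s2
print(sigma_s2, M)   # M equals the degrees of freedom (here 3) up to rounding
```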
 The estimation covariance matrix Qc of the estimated lnT and lnS is approximated by the inverse Hessian matrix at the optimal set of the parameters U:
where U is the vector containing the two variables lnT and lnS, σlnT2 and σlnS2 are the estimation variances of the parameter of lnT and lnS, respectively, and ClnT lnS denotes the cross-covariance between lnT and lnS.
 We obtain a pair of lnT and lnS values for each drawdown curve. At a location which has been used either as an observation point or as a pumping well in multiple pumping tests, we compute a weighted average from all parameters obtained at this location, resulting in a single pair of lnT and lnS values:
where $w_i$ is the weight and ℓ is the number of pairs of lnT and lnS at this location, and $\overline{\ln T}$ and $\overline{\ln S}$ are the weighted averages of the estimates $U_{\ln T}$ and $U_{\ln S}$ of lnT and lnS, respectively. We consider these weighted averages as pseudo-local measurements in kriging.
 The associated measurement error is composed of the weighted average of the parameter uncertainty in fitting Theis' solution to the single drawdown curves and the variability of the parameter estimates among the different tests:
where lnT2 and lnS2 are the weighted estimation variances of the estimated parameter lnT and lnS, respectively; lnT is the arithmetic mean of the estimated lnT at this location; lnS is the arithmetic mean of the estimated lnS; and lnT lnS is the weighted cross-covariance between lnT and lnS. In most cases, the latter contribution dominates the computed measurement error. lnT2, lnS2 and lnT lnS compose the term VZZ in equation (23):
in which m is the number of the measurement locations.
5.2.3. Computation of Temporal Moments
 As has been shown earlier [Li et al., 2005], the temporal moments of drawdown for a unit-pulse extraction can be derived from measurements of drawdown observed during continuous pumping. Because the head measurements fluctuated, we obtained more stable estimates of the moments by fitting a parametric function to the observations. The maxentropic semi-infinite distribution for given zeroth and first temporal moments is the exponential one [Gibbs, 1902]. Like Bakker et al., we use the latter expression to parameterize the drawdown for pulse-like extraction. This results in the following expression for continuous pumping:
$$s_e(t) = Q\, m_0 \left[ 1 - \exp\left( -\frac{m_0}{m_1}\, t \right) \right] \tag{45}$$
where $m_0$ and $m_1$ denote the zeroth and first temporal moments for pulse-like extraction, respectively. The optimal pair of temporal moments is determined by minimizing the weighted difference between the observed and simulated drawdown curves. The corresponding objective function is similar to equation (37). The estimation covariance matrix of $m_0$ and $m_1$ is given by the inverse Hessian matrix of the objective function at the optimal point.
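The parameterization can be sketched as follows. The sketch assumes that the unit-pulse response is the exponential function (m0/τ) exp(−t/τ) with τ = m1/m0, so that integrating it over the pumping period at a constant rate Q gives a drawdown that approaches Q·m0; this functional form follows from the exponential assumption stated in the text, while the function name and the numbers are hypothetical.

```python
import math

def drawdown_continuous(t, Q, m0, m1):
    """Drawdown for continuous pumping at rate Q, assuming an exponential
    unit-pulse response with zeroth moment m0 and first moment m1."""
    tau = m1 / m0                                # characteristic time of the response
    return Q * m0 * (1.0 - math.exp(-t / tau))

# Illustrative values: Q = 2 m3/h; m0 and m1 are made up.
Q = 2.0 / 3600.0          # [m3/s]
m0 = 90.0                 # [s/m2], zeroth moment for a unit pulse
m1 = 27000.0              # [s2/m2], first moment for a unit pulse
tau = m1 / m0             # 300 s

final_drawdown = drawdown_continuous(1.0e6, Q, m0, m1)
print(final_drawdown, Q * m0)                    # late-time drawdown approaches Q * m0
print(drawdown_continuous(tau, Q, m0, m1) / (Q * m0))   # fraction reached at t = tau
```

The two moments are thus tied to two easily recognized features of the curve: the final drawdown fixes m0, and the time needed to approach it fixes the ratio m1/m0.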
 At late times, when the exponential part in equation (45) approaches zero, the difference between s and se is dominated by the difference between s and Qm0. Qm0 is equivalent to the final drawdown. Statistically, Qm0 is determined by averaging the measured values of drawdown at late times. According to the statistical formulation, the uncertainty of the estimated final drawdown is defined as the ratio between the squared measurement error of drawdown and the number of the measured drawdown data used. The uncertainty decreases with increasing number of measurement points. However, this uncertainty of the final drawdown does not reflect the real measuring process, for which an accuracy beyond the resolution of the device is impossible. To account for such nonrandom effects, we add an additional measurement error to the estimated uncertainty of the final drawdown. This additional measurement error of final drawdown propagates to the estimation variances of the zeroth and first temporal moments and the cross-variance between the estimated m0 and m1:
where m02 and m12 are the total uncertainties of the determined m0 and m1 values, respectively, and m0m1 is the cross-variance between m0 and m1. σm02, σm12, and Cm0m1 are computed from the inverse Hessian matrix of the objective function at the optimal point. In most cases, the contribution of the additional measurement error dominates the total uncertainties. m02, m12, and m0m1 are the terms in VZZ of equation (29):
in which nm is the number of moment pairs.
 This additional measurement error is not necessary for the previous approach where Theis' method is used to estimate lnT and lnS, because Theis' approach does not rely on measurements of final drawdown.
6. Implementation and Results
6.1. Kriging of Pseudo-Local Values
 Using the pseudo-local values of lnT and lnS obtained from type-curve analysis, we can estimate the structural parameters of the covariance functions.
 In this paper, we assume diffuse prior knowledge of the structural parameters. The prior knowledge of the trend coefficients β is given in Table 1. To avoid negative values in the estimates of $\sigma_{\ln T}^2$, $\sigma_{\ln S}^2$, and λ, we estimate the logarithms of these parameters. Correspondingly, the uncertainties of these parameters are quantified by the factor of variation (FV), which is the exponential of the standard deviation of the log parameter. Concerning $\sigma_{\ln T \ln S}$, we estimate its related correlation coefficient $r_{\ln T \ln S}$:
$$r_{\ln T \ln S} = \frac{\sigma_{\ln T \ln S}}{\sigma_{\ln T}\, \sigma_{\ln S}}$$
To guarantee that $r_{\ln T \ln S}$ remains within the range between −1 and 1, we apply the error function as transformation between an auxiliary variable μ, ranging between −∞ and ∞, and $r_{\ln T \ln S}$, and estimate μ:
$$r_{\ln T \ln S} = \mathrm{erf}(\mu)$$
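The transformations can be sketched in a few lines. The sketch assumes that the correlation coefficient is obtained directly as the error function of the auxiliary variable; the function names and the numbers are hypothetical.

```python
import math

def correlation_from_mu(mu):
    """Map an unbounded auxiliary variable to a correlation coefficient in [-1, 1]."""
    return math.erf(mu)

def covariance_from_mu(mu, var_lnT, var_lnS):
    """Cross-covariance at zero separation implied by mu and the two prior variances."""
    return correlation_from_mu(mu) * math.sqrt(var_lnT * var_lnS)

def factor_of_variation(std_of_log):
    """Factor of variation: exponential of the standard deviation of the log parameter."""
    return math.exp(std_of_log)

# Illustrative values only.
print(correlation_from_mu(0.5))                  # a moderate positive correlation
print(covariance_from_mu(0.5, 0.01, 0.24))
print(factor_of_variation(0.3))                  # uncertainty expressed as a multiplicative factor
```

Estimating the unbounded quantities (log variance, log correlation length, μ) lets an unconstrained optimizer such as the simplex method be used without producing inadmissible structural parameters.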
Table 1. Field and Grid Information, Prior and Posterior Information of Hydraulic Parameter Fields in Section 6.1 and 6.2, and the Results of Comparison in Section 6.3
 For structural parameter estimation, we apply the approach described in section 4.5. The estimated structural parameters and their corresponding uncertainties are listed in Table 1. With the estimated structural parameters, we estimate lnT and lnS on a regular grid. The field dimension and resolution are listed in Table 1. We subsequently apply equations (25) and (24) to obtain the parameter fields of lnT and lnS. The uncertainties of lnT and lnS are quantified by equation (26). Figure 2 shows the fields of lnT and lnS and their corresponding standard deviations of estimation.
 The estimated prior variance of lnT is very small, namely 0.01, indicating an almost uniform distribution of transmissivity. In comparison, the estimated prior variance of lnS is higher, namely 0.24. As listed in Table 1, the total variances of measurements of lnT and lnS are fairly small, while the associated “measurement” errors are relatively large. As explained in section 5.2.2, the pseudo-local values are obtained by averaging all measurements in which a particular well is involved either as a pumping or as a monitoring well. Obviously, the variability between measurements, involving the same points, is larger than the variability between the averages obtained at different points. Thus, using the results of type-curve analysis as point-like values is not permitted.
 The relatively large variance in the estimated field of lnS and small variance of lnT in our study reflects the findings of other studies [Meier et al., 1998; Sánchez-Vila et al., 1999; Leven and Dietrich, 2006], which were based on numerical simulations only. In these studies, a small variability of lnT and high variability of lnS were found by type-curve analysis despite using uniform storativity values in the simulations. In general, the variance of storativity in natural systems is thought to be relatively small [Meier et al., 1998]. Under confined conditions, compressibilities of rock, water, and the pore space determine the storativity. The contributions of the rock and water compressibilities are considered to be small at all local scales [e.g., Meier et al., 1998]. Under phreatic conditions, the storativity becomes the porosity, which varies only within a small range [e.g., Meier et al., 1998; Vesselinov et al., 2001a, 2001b]. That is, we believe that the estimate of the variability of lnS is biased. The bias may be caused by the inconsistent assumption of homogeneity in the conventional pumping test analysis. It is worth analyzing whether the bias disappears when we apply the geostatistical inverse approach, in which the hydraulic parameter fields are assumed spatially variable in all stages of the estimation procedure.
6.2. Quasi-Linear Geostatistical Inversion of Temporal Moments
 In contrast to section 6.1, we cannot directly derive the structural parameters from the computed measurements of m0 and m1, because the functional relation between the temporal moments and hydraulic parameters is nonlinear. The estimation of structural parameters depends on the current estimates of lnT and lnS, so we have to start with the estimation of lnT and lnS. To do that, we need initial values of the related structural parameters. For the initial value of the correlation length, we take the value estimated in section 6.1. Because we expect more variation in the estimate of lnT with the geostatistical inverse approach, we take unity as the initial value of the prior variance σ²_lnT, a much higher value than the one estimated from the pseudo-local values in section 6.1. As the initial value of the prior variance of lnS, we take the value estimated in section 6.1.
 We use the same grid resolution as in section 6.1. To reduce the influence of boundary conditions, we enlarge the domain on each side by 50 m. Zero drawdown is assumed on the boundaries of the enlarged domain. Following the approaches described in sections 4.4 and 4.5, we start the quasi-linear geostatistical inversion of temporal moments with the large-scale pumping test and use the resulting estimates as the initial guess for analyzing the small-scale pumping tests. We implement the small-scale pumping tests in a sequential way, beginning with a single small-scale pumping test. Once the optimal parameter set is obtained, we add new pumping test data to the inversion and estimate the hydraulic parameters using the data of all pumping tests accounted for at the current stage. The estimate from the previous sequential step serves as the initial guess for the following estimation. We keep adding new pumping tests until all available tests are used. This successive addition of new measurements stabilizes the inverse procedure. The approach differs from the sequential successive linear estimator (SSLE) developed by Yeh et al. in how information is propagated from one sequential step to the next. While Yeh et al. take the estimate and the approximated conditional covariance matrix as prior mean and covariance in the next step, we consider the previous estimate only as an initial guess. As prior values, we always start from the unconditional distribution, which we then condition on all data accounted for so far. This approach has the advantage that we do not have to perform expensive matrix-matrix multiplications of conditional covariance matrices.
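 The sequential scheme can be sketched on a toy problem with a single scalar parameter s = lnT. Everything below is an assumption made for illustration: the forward model `forward`, the pumping-rate-like factors q, the prior and error variances, and the function names. The sketch only shows the control flow in which each step conditions the unconditional prior on all data used so far and reuses the previous estimate solely as the starting point of the iteration.

```python
import math

def forward(s, q):
    """Toy forward model (assumed): response proportional to q / T, with s = ln T."""
    return q * math.exp(-s)

def map_estimate(data, s_init, prior_mean=0.0, prior_var=1.0, err_var=0.01,
                 tol=1e-10, max_iter=100):
    """Quasi-linear (Gauss-Newton) MAP estimate of a scalar s from all data passed in."""
    s = s_init
    for _ in range(max_iter):
        num, den = 0.0, 1.0 / prior_var
        for q, y in data:
            h = forward(s, q)
            sens = -h  # dh/ds for this particular forward model
            # Linearize about the current estimate, then condition the prior.
            num += sens * (y - h + sens * (s - prior_mean)) / err_var
            den += sens * sens / err_var
        s_new = prior_mean + num / den
        if abs(s_new - s) < tol:
            return s_new
        s = s_new
    return s

# Synthetic, noise-free "tests" generated with a true value s = 0.5.
tests = [(q, forward(0.5, q)) for q in (1.0, 2.0, 0.5)]

s_est = 0.0   # initial guess for the first step
used = []     # data accounted for so far
for test in tests:
    used.append(test)
    # The prior stays unconditional; only the starting point is carried over.
    s_est = map_estimate(used, s_init=s_est)
print(s_est)
```

 Because the prior is never replaced by a conditional distribution, no conditional covariance has to be stored or propagated between steps in this sketch.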
 After we obtain the most likely distribution of lnT and lnS for all pumping tests based on the initial guess of the structural parameters, we start alternating the estimation of the structural and hydraulic parameters. In the alternating procedure, we use only the measurements of temporal moments of the small-scale pumping tests.
6.2.1. Impact of Measurement Error on the Estimation of Structural Parameters
 The measurement error influences the spatial variability of the estimated hydraulic parameter fields. In the following, we take the estimation of the structural parameters as an example to illustrate this influence. Following the sequential and alternating inversion procedure described previously, we estimate σ²_lnT, σ²_lnS, and λ. Figure 3a shows the resulting correlation length λ as a function of the measurement error. With increasing measurement error, we obtain an increasing value of λ, which makes the estimates of the hydraulic parameters smoother. The estimate of the correlation length is more sensitive to changes in the measurement error than the estimates of the prior variances. For the measurement errors used, the estimated prior variances do not show significant changes. Therefore, we do not show the estimation results for lnT and lnS. Meanwhile, we calculate the value of the objective function according to equation (29), which decreases with increasing measurement error. This is to be expected because a larger measurement error implies that we trust the measurements less, and therefore an identical misfit between measured and simulated moments results in a smaller value of the objective function.
 The influence of the measurement error on the estimation of the prior variances becomes clear if we fix the correlation length λ. We take the correlation length estimated in section 6.1 as the fixed value of λ. Figure 3b displays the resulting σ²_lnT and σ²_lnS as functions of the measurement error. The prior variances decrease when the measurement error increases, which smooths the estimated fields of hydraulic parameters.
 If the measurement error reflects the real uncertainty of the model, the value of the objective function should follow a χ² distribution with 2n_m degrees of freedom, where n_m is the number of temporal moment pairs. In our current studies, we have 169 pairs of measurements of temporal moments. In general, the 95% confidence interval is used, which implies an interval between 320.11 and 390.82 for our case studies. Since a large measurement error will smooth the estimated fields, it is preferable to choose a small measurement error among all acceptable values. In the following, we have chosen 1.15 mm as our measurement error.
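 How the measurement error enters such an acceptance check can be sketched as follows. The sketch covers only a data-misfit term of the assumed form sum of ((measured - simulated) / error)², which need not coincide with equation (29). The moment values are invented, and the function names are hypothetical. The interval bounds and the value 380.02 are taken from the text rather than recomputed.

```python
def objective(measured, simulated, sigma):
    """Sum of squared misfits, each normalized by the measurement error sigma."""
    return sum(((m - s) / sigma) ** 2 for m, s in zip(measured, simulated))

def is_acceptable(value, lower, upper):
    """True if the objective value lies inside the given confidence interval."""
    return lower <= value <= upper

# Invented moments, used only to show the scaling with sigma.
measured = [1.00, 2.10, 2.95, 4.20]
simulated = [1.10, 2.00, 3.05, 4.00]

j_small = objective(measured, simulated, sigma=0.1)
j_large = objective(measured, simulated, sigma=0.2)
# Doubling sigma divides the objective by four for an identical misfit.
print(j_small, j_large)

# Interval as reported in the text for 2 * 169 = 338 degrees of freedom.
print(is_acceptable(380.02, 320.11, 390.82))
```

 The quadratic dependence on the error explains why a range of measurement errors can be acceptable and why the smallest acceptable one is the least smoothing choice.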
 In the examples illustrated in Figure 3, the estimated values of the prior variance of lnS are larger than the estimated values of the prior variance of lnT. As we have discussed previously, the variance of lnS in natural confined systems is thought to be small [Meier et al., 1998]. That is, we believe that the estimated variability of lnS is biased, as in the conventional pumping test analysis. Although the geostatistical inverse approach is consistent in the sense that the estimated variability of the hydraulic parameters is accounted for in all stages of the estimation procedure, the results for the storativity distribution do not appear very reliable. We believe that this is an effect of aliasing. The estimated distribution of lnT is smoother than the real field. The unresolved small-scale variability has a larger effect on the simulated first temporal moments, representing the characteristic time of drawdown, than on the zeroth temporal moments, representing final drawdown. Given an overly smooth distribution of lnT, the inverse approach attributes the derived variability in m1 to variations in lnS. In both methods of analyzing the pumping tests, we use a two-dimensional description of the aquifer and ignore any vertical flow. That is, the unresolved variability may mainly be caused by neglecting the three-dimensional nature of the real formation. Working with measurements obtained in fully screened wells, however, we cannot resolve the vertical variation of hydraulic parameters.
6.2.2. Estimation With Uniform Storativity
 Presuming that the variance of lnS is relatively small in nature, we now restrict the analysis to the case of a uniform value of storativity. Then, the prior variance of storativity and the cross-correlation between lnT and lnS become zero. To make the estimate of lnT more comparable with that in section 6.1, we fix the correlation length using the value estimated from the pseudo-local values. Following the procedure described previously, we estimate the spatial lnT distribution and, as the remaining structural parameter, its prior variance σ²_lnT given a uniform field of lnS. The mean values of lnS and lnT are estimated as well.
 As listed in Table 1, the estimated prior variance σ²_lnT of lnT for a uniform value of lnS is 1.57, and S is 0.006. These values are for a measurement error of 1.15 mm. Figure 4 shows the estimated lnT field and the corresponding standard deviation of estimation. In Figure 4c, we plot the kriged lnT field of section 6.1 in the same color scale as for the lnT distribution estimated in this section. The lnT field estimated by the geostatistical inverse approach reveals more structure than the field obtained by interpolating the pseudo-local values from type-curve analysis. The estimated prior variance σ²_lnT is much higher than the one from the pseudo-local values. The improvement in revealing the structure of the lnT field is caused by the consistent assumption of heterogeneity in the geostatistical inverse approach.
 The validity of the estimated fields is tested by analyzing the orthonormal residuals [Kitanidis, 1991]. If the model is correct, the sum of squares of orthonormal residuals of the measurements follows a χ² distribution. In our application, this value is 380.02, which is within the 95% confidence interval (320.11–390.82).
6.3. Comparison of Estimates
 In this section, we compare the results from kriging of pseudo-local values (section 6.1) to those from geostatistical inversion (section 6.2.2). The estimated average properties, the variations of the estimated fields, and the uncertainties of the estimations are investigated.
 We compute the mean values of the estimated fields of lnT and lnS from kriging and inversion. Table 1 lists these values. The mean values of lnT and lnS obtained by the conventional approach are almost identical to the ones of geostatistical inversion. That is, in our application, conventional type-curve analysis of pumping tests leads to reliable estimates of the average properties. This is, to some extent, consistent with the underlying assumption of uniformity in the type-curve analysis.
 We calculate the variances of the estimated lnT field for both kriged and inverse results. Table 1 lists the resulting variances. The variance of the resolved inverse lnT field is higher than the one obtained by kriging. Again, this is consistent with the underlying conceptual assumptions. In geostatistical inversion, we assume that lnT varies in space, and we try to resolve the fraction of heterogeneity that can unambiguously be identified from the data. In type-curve analysis, we start with the assumption of uniformity, but obtain results varying with the combination of pumping and monitoring wells, which is similar to the results of cross-hole pneumatic injection tests of Illman and Neuman.
 We may examine the estimated lnT values at the points of head measurements. Figure 5 shows a comparison of these values for the two types of analysis. The circles stand for the values obtained by geostatistical inversion. The error bars indicate the best estimate of lnT of the geostatistical inversion plus and minus one corresponding standard deviation of the estimate. The crosses denote the pseudo-local values obtained by the conventional method. With very few exceptions, the calculated pseudo-local values are within the acceptable confidence interval.
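 The point-wise comparison described here amounts to checking whether each pseudo-local value falls within the inverse estimate plus or minus one standard deviation. A minimal sketch, with invented numbers and a hypothetical helper name, could look as follows.

```python
def within_one_sigma(estimates, std_devs, pseudo_local):
    """Flag, per well, whether the pseudo-local value lies inside estimate +/- one std. dev."""
    return [abs(p - e) <= s for e, s, p in zip(estimates, std_devs, pseudo_local)]

# Invented values at three wells (not data from the site).
lnT_inverse = [-6.0, -5.5, -6.8]   # best estimates from inversion
lnT_std = [0.4, 0.3, 0.2]          # corresponding estimation standard deviations
lnT_pseudo = [-6.1, -5.7, -6.2]    # pseudo-local values from type-curve analysis

flags = within_one_sigma(lnT_inverse, lnT_std, lnT_pseudo)
fraction_inside = sum(flags) / len(flags)
print(flags, fraction_inside)
```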
 To quantify the difference between the two lnT fields throughout the domain, we compute the normalized root mean square error (NRMSE):

$$\mathrm{NRMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\frac{\left(p_i^{\mathrm{inv}} - p_i^{\mathrm{kri}}\right)^2}{\sigma_{\ln T,c,i}^{2}}}$$

where p_i is the estimated lnT value in element i, n is the number of elements, the index inv stands for the estimate of inversion, the index kri for that of interpolated pseudo-local values, and σ²_lnT,c is the estimation variance of lnT in the inverse method. The NRMSE value of 0.5984 indicates that the values from the conventional approach can produce reasonable fields of hydraulic parameters, although most of the spatial variability is missing.
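 One plausible way to evaluate such a normalized error is sketched below. It assumes that each squared difference between the two fields is divided by the local estimation variance of the inversion and that the result is averaged over all elements before taking the square root. This form, the function name, and the input values are assumptions for illustration.

```python
import math

def nrmse(p_inv, p_kri, var_inv):
    """Root mean square of the differences, each normalized by the inverse estimation variance."""
    n = len(p_inv)
    total = sum((a - b) ** 2 / v for a, b, v in zip(p_inv, p_kri, var_inv))
    return math.sqrt(total / n)

# Two elements that differ by 1 in lnT, each with an estimation variance of 4.
value = nrmse([0.0, 0.0], [1.0, 1.0], [4.0, 4.0])
print(value)
```

 Under this definition, a value below one means that the two fields differ on average by less than one estimation standard deviation of the inversion.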
 We have analyzed multiple pumping tests conducted at the test site in Krauthausen, Germany, using conventional type-curve analysis and geostatistical inverse modeling. The results show that the conventional approach provides good estimates of the mean hydraulic parameters, but fails to reproduce the spatial variability of the formation.
 Concerning the estimate of lnT, the results from the conventional approach show small variation and are close to the geometric mean. Small variation in estimated lnT has been observed by several researchers [Meier et al., 1998; Sánchez-Vila et al., 1999; Leven and Dietrich, 2006] when they applied conventional analysis to simulated pumping tests. The geometric mean of the pseudo-local transmissivity is very close to the one estimated by the geostatistical inverse approach. Due to more consistent assumptions in the geostatistical inverse approach, it reveals more variability in the estimated lnT field than the conventional approach. The estimated prior variance of lnT and the variance of the resolved lnT field in inversion are higher than their counterparts using the conventional approach. Although the conventional approach leads to small variations in lnT, its results are within the uncertainty range of the lnT field obtained by geostatistical inversion. Overall, kriging lnT values obtained by type-curve analysis may be acceptable in cases where detailed knowledge about the variability of the parameter fields is not required.
 The estimate of lnT obtained by our inverse approach is a smooth estimate, maximizing the posterior distribution of the parameter fields. If the functional relation between the measurements and the parameters is linear, this estimate is identical to the mean of conditional realizations. The higher the variance of the parameter fields, the more nonlinear is their relation to drawdown. Then the most likely value obtained by our method differs from the conditional mean. The conditional mean may not even meet the measurements within the given epistemic error. In such cases, one may rely on other methods such as generation of conditional realizations [e.g., Sahuquillo et al., 1992; Gutjahr et al., 1994] or conditional nonlocal ensemble moment equations [Guadagnini and Neuman, 1999a, 1999b; Hernandez et al., 2006], which are computationally more demanding than the quasi-linear approach used in our study.
 Regarding storativity, the geometric means of S obtained by the conventional approach and the geostatistical inverse approach are close to each other. Both approaches show strong variations, which appear unrealistic. Meier et al. [1998], Sánchez-Vila et al. [1999], and Leven and Dietrich [2006] obtained similar results in the analysis of simulated pumping tests using a uniform storativity and spatially variable transmissivities. In our estimation and in previous studies [e.g., Sánchez-Vila et al., 1999], the aquifer is assumed to be two-dimensional. This assumption is consistent with data obtained in fully screened wells. The unresolved vertical variability, however, may be the major cause for the presumed bias in estimating the variability of log storativity. Performing three-dimensional inversion may help to overcome the difficulty of aliasing. Then, the transmissivity is replaced by conductivity and storativity by the specific storage coefficient. Such analysis requires measurements in all flow directions. Dealing with field data, we cannot exclude that conceptual uncertainties (e.g., regarding the validity of treating the system as a confined formation) may contribute to biased results.
 It may be worth noting that, to our knowledge, no reliable information about the variability of storativity exists. Meier et al. [1998], among others, conjectured that the variability should be small because of the small variability of the quantities making up storativity. Experimental data on the variability of lnS, however, hardly exist, and our study indicates that such data may be biased.
 In this study, we have exclusively analyzed pumping test data, trying to estimate both the hydraulic parameter fields and the corresponding structural parameters. We have shown that the estimated correlation length depends on the uncertainty of the measured data. If we trust the measurements less, we obtain a larger correlation length. This indicates that head information alone may be insufficient to characterize the structure of the subsurface. Since our geostatistical approach is based on Bayesian analysis, information from other surveys, such as geophysical ones, may be added. Such extensions, however, are beyond the scope of the current study. Our approach of estimating structural parameters requires that the type of geostatistical model is known. In our study, we could rely on previous analysis at the site. An alternative would be to fit parameters of different correlation functions and choose the model that performs best. However, we doubt that such choices can be made with indirect information (in our case, the temporal moments of drawdown) alone.
 We want to thank two anonymous reviewers for their comments helping to improve the quality of the paper.