### Abstract

- Top of page
- Abstract
- 1. Introduction
- 2. Background
- 3. Objectives
- 4. Geostatistical Approach to Estimating the Historical Distribution of a Contaminant
- 5. Adjoint State Formulation and Implementation
- 6. Application to the Estimation of the Historical Contaminant Distribution in Two-Dimensional Aquifers
- 7. Conclusions
- Acknowledgments
- References

[1] As the incidence of groundwater contamination continues to grow, a number of inverse modeling methods have been developed to address forensic groundwater problems. In this work the geostatistical approach to inverse modeling is extended to allow for the recovery of the antecedent distribution of a contaminant at a given point back in time, which is critical to the assessment of historical exposure to contamination. Such problems are typically strongly underdetermined, with a large number of points at which the distribution is to be estimated. To address this challenge, the computational efficiency of the new method is increased through the application of the adjoint state method. In addition, the adjoint problem is presented in a format that allows for the reuse of existing groundwater flow and transport codes as modules in the inverse modeling algorithm. As demonstrated in the presented applications, the geostatistical approach combined with the adjoint state method allows for a historical multidimensional contaminant distribution to be recovered even in heterogeneous media, where a numerical solution is required for the forward problem.

### 1. Introduction

[2] When determining the effect of historical groundwater contamination, the distribution of a plume at a given point back in time is often required to establish exposure of wells or individuals to the contaminant. For example, in the case of *Woodrow Sterling et al. versus Velsicol Chemical Corporation* [1986], a class of people who owned property in the vicinity of a chemical waste burial site sought damages for personal injury and damages to their property suffered when water in their home wells became contaminated by hazardous chemicals escaping from Velsicol's site. Velsicol admitted that some of the wells were contaminated with chemicals from its waste burial site, but disputed that all members of the class had been exposed and did not agree with the plaintiffs as to the intensity and duration of exposure. Therefore the case centered on estimating the past distribution and concentration of the chemical plume, in order to determine concentrations in the plaintiffs' wells at given times [*Michalak*, 2001]. Emerging inverse modeling methods can be applied to solve such problems.

[3] One set of inverse methods is based on geostatistical principles and allows for the estimation of unknown functions based on the dual criterion of reproducing available observations while maintaining an assumed correlation structure. Methods falling under this category have been used for some time for estimating subsurface hydraulic conductivity or transmissivity distributions based on hydraulic head and other data [e.g., *Kitanidis and Vomvoris*, 1983; *Kitanidis*, 1995; *Zimmerman et al.*, 1998]. More recently, these types of methods have also been applied to contaminant release history identification in groundwater systems [*Snodgrass and Kitanidis*, 1997; *Michalak and Kitanidis*, 2002, 2003, 2004]. In this paper the geostatistical method is extended to the estimation of the antecedent distribution of a contaminant at a given point back in time, making it applicable to cases such as the one described.

[4] In the geostatistical approach to inverse modeling, the solution involves the calculation of a sensitivity matrix relating each point in the discretized unknown function to each observation, which typically requires one forward run for each point in the discretized unknown function. Because inverse problems associated with groundwater systems are typically strongly underdetermined, in the sense that the number of points in the discretized unknown function *m* is greater than the number of available measurements *n*, the computational cost of calculating the sensitivity matrix can be prohibitive. This is especially true when the function to be estimated is itself multidimensional. In this work the adjoint state method is used to efficiently populate the full sensitivity matrix by solving a series of adjoint problems instead of the traditional approach of solving a series of forward problems. The combination of the adjoint and geostatistical methodologies makes the identification of a multidimensional contaminant distribution in a heterogeneous domain feasible.

[5] Note that throughout this paper, we will use the term “historical contaminant distribution” to describe the plume at a single, given point in the past. We avoid the term “prior distribution” to prevent confusion with the terms “prior” and “posterior,” which have a different, very specific definition in the context of Bayesian inverse modeling. Also, although in the presented applications the historical distribution will be recovered for a single point in time, the presented algorithm could directly be applied for a series of times, yielding a time-dependent description of the history of a plume.

### 4. Geostatistical Approach to Estimating the Historical Distribution of a Contaminant

[17] Geostatistical inverse modeling follows a Bayesian approach. Bayes' theorem states that the posterior probability density function (pdf) of a state vector **s** given an observation vector **z** is proportional to the likelihood of the state given the data, times the prior pdf of the state. Symbolically,

$$p(\mathbf{s} \mid \mathbf{z}) \propto p(\mathbf{z} \mid \mathbf{s})\, p(\mathbf{s}), \tag{1}$$

where the vertical bar means “given.” In this context, prior and posterior probability density functions are with respect to using the data **z**. In the geostatistical approach the prior represents the assumed spatial or temporal correlation structure of the unknown function, as described by a covariance function. The likelihood of the data represents the degree to which an estimate of the unknown function **s** reproduces the available data **z**.

[18] Overall, the objective is to estimate the unknown function **s**. The standard estimation problem may be expressed in the form

$$\mathbf{z} = \mathbf{h}(\mathbf{s}, \mathbf{r}) + \boldsymbol{\varepsilon}, \tag{2}$$

where **z** is an *n* × 1 vector of observations and **s** is an *m* × 1 state vector obtained from the discretization of the unknown function. Whereas in past applications of the geostatistical approach to inverse modeling **s** represented the release history from a known source [*Snodgrass and Kitanidis*, 1997; *Michalak and Kitanidis*, 2002, 2003, 2004], in the case examined here **s** is the spatial distribution of a contaminant at a previous time *T*_{a}. The vector **z** contains the available groundwater concentration measurements. The vector **r** contains other parameters needed by the model function **h**(**s**, **r**). The measurement error is represented by the vector **ɛ**. This error encompasses both the actual measurement error associated with collecting the data and any random numerical or conceptual inaccuracies associated with the evaluation of the function **h**(**s**, **r**).

[19] When the function **h**(**s**, **r**) is linear in the unknown **s**, as will be the case in the applications presented in this work, the function **h**(**s**, **r**) can be written as

$$\mathbf{h}(\mathbf{s}, \mathbf{r}) = \mathbf{H}\mathbf{s}, \tag{3}$$

where **H** is a known *n* × *m* matrix, the Jacobian representing the sensitivity of the observations to the function **s** (i.e., *H*_{i,j} = ∂(*z*_{i} − ɛ_{i})/∂*s*_{j}). In the case of the identification of the historical distribution of a contaminant, **H** represents the sensitivity of available observations to the concentration of the contaminant at given spatial locations and a single previous time. The components of **H** could be obtained numerically by performing one run of a groundwater transport model for each component of **s**. When **s** is discretized finely or when it varies in multiple dimensions, the computational cost quickly becomes prohibitive. This is the issue that will be addressed by the implementation of the adjoint state method in the next section.
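To make the cost of the forward-run approach concrete, the following minimal sketch assembles **H** one column at a time by placing a unit concentration in a single component of **s** and recording the simulated observations. The names `jacobian_by_forward_runs` and `toy_forward_model` are hypothetical, and the two-observation, four-cell linear model is only a stand-in for a groundwater transport code; the point is that the loop requires *m* model runs.

```python
from typing import Callable, List

Vector = List[float]


def jacobian_by_forward_runs(
    forward_model: Callable[[Vector], Vector], m: int
) -> List[Vector]:
    """Build the n x m sensitivity matrix H with one forward run per unknown.

    Assumes the model is linear in s and maps s = 0 to z = 0, so a unit pulse
    in component j returns column j of H directly.
    """
    columns = []
    for j in range(m):
        unit = [0.0] * m
        unit[j] = 1.0                        # unit concentration in cell j only
        columns.append(forward_model(unit))  # one transport run gives column j
    n = len(columns[0])
    # Transpose the list of columns into row-major H (n rows, m columns).
    return [[columns[j][i] for j in range(m)] for i in range(n)]


def toy_forward_model(s: Vector) -> Vector:
    """Hypothetical linear stand-in for a transport code (2 observations, 4 cells)."""
    z1 = 0.5 * s[0] + 0.3 * s[1] + 0.2 * s[2]
    z2 = 0.1 * s[1] + 0.6 * s[2] + 0.3 * s[3]
    return [z1, z2]


H = jacobian_by_forward_runs(toy_forward_model, m=4)
```

In this toy case four runs are needed to fill a 2 × 4 matrix; for a finely discretized two-dimensional field the same loop would require one transport simulation per grid cell.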

[20] Following geostatistical methodology and returning to equation (2), **s** and **ɛ** are represented as random vectors. We assume that **ɛ** has zero mean and known covariance matrix **R**. The covariance of the measurement errors that will be used is

$$\mathbf{R} = \sigma_R^{2}\,\mathbf{I}, \tag{4}$$

where σ_{R}^{2} is the variance of the measurement error and **I** is an *n* × *n* identity matrix. We model **s**, the unknown function, as a random vector with expected value

$$E[\mathbf{s}] = \mathbf{Y}\boldsymbol{\beta}, \tag{5}$$

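where **Y** is a known *m* × *p* matrix and **β** are *p* unknown drift coefficients that can represent the mean of the process as well as linear and/or nonlinear dependence on auxiliary variables. For example, for a linear drift in two dimensions, *p* = 3,

$$\mathbf{Y} = \begin{bmatrix} 1 & x_{1,1} & x_{2,1} \\ \vdots & \vdots & \vdots \\ 1 & x_{1,m} & x_{2,m} \end{bmatrix}, \tag{6}$$

and **β** represents the mean and trend of the unknown function, such that at a point (*x*_{1,i}, *x*_{2,i}) the a priori expected value of the function **s** is

$$E[s_i] = \beta_1 + \beta_2\, x_{1,i} + \beta_3\, x_{2,i}. \tag{7}$$

The prior covariance function of **s** is

$$\mathbf{Q}(\boldsymbol{\theta}) = E\left[(\mathbf{s} - \mathbf{Y}\boldsymbol{\beta})(\mathbf{s} - \mathbf{Y}\boldsymbol{\beta})^{T}\right], \tag{8}$$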

where **Q**(**θ**) is a known function of unknown parameters **θ**. This function represents the correlation between the historical contaminant concentration at various points, which, for most models, decays as the separation distance between the points increases. In the case of a cubic generalized covariance function (GCF) in two spatial dimensions, which is the function that will be used in the presented applications, the covariance matrix can be written as

$$Q_{ij}(\theta) = \theta h^{3} = \theta\left[(x_{1,i} - x_{1,j})^{2} + (x_{2,i} - x_{2,j})^{2}\right]^{3/2}, \tag{9}$$

where (*x*_{1,i}, *x*_{2,i}) and (*x*_{1,j}, *x*_{2,j}) are the *x*_{1} and *x*_{2} coordinates of the *i*th and *j*th locations at which the contaminant distribution is to be estimated and *h* is the separation distance between these locations.
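As an illustration of how the drift and covariance matrices might be assembled for a set of estimation locations, the sketch below builds **Y** for a linear drift in two dimensions and **Q** for a cubic GCF. It assumes the cubic GCF takes the form θ*h*³; the function names and the four-point layout are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def linear_drift_matrix(points: List[Point]) -> List[List[float]]:
    """m x 3 matrix Y for a linear drift: a column of ones plus both coordinates."""
    return [[1.0, x1, x2] for (x1, x2) in points]


def cubic_gcf_matrix(points: List[Point], theta: float) -> List[List[float]]:
    """m x m matrix with entries theta * h**3, where h is the separation distance."""
    return [
        [theta * math.hypot(a1 - b1, a2 - b2) ** 3 for (b1, b2) in points]
        for (a1, a2) in points
    ]


# Hypothetical 2 x 2 block of estimation points with unit spacing.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
Y = linear_drift_matrix(points)
Q = cubic_gcf_matrix(points, theta=2.0)
```

Note that the diagonal of this **Q** is zero and its entries grow with distance, which is expected for a generalized covariance function and is why the drift coefficients must be estimated jointly with **s**.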

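[21] The method used to obtain the structural parameters, in our case θ and σ_{R}^{2}, follows a restricted maximum likelihood approach, as detailed by *Kitanidis* [1995]. In short, the parameters are estimated by maximizing the probability of the measurements

$$p(\mathbf{z} \mid \theta, \sigma_R^{2}) \propto |\boldsymbol{\Sigma}|^{-1/2}\, \left|\mathbf{Y}^{T}\mathbf{H}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{H}\mathbf{Y}\right|^{-1/2} \exp\left[-\tfrac{1}{2}\,\mathbf{z}^{T}\boldsymbol{\Xi}\,\mathbf{z}\right], \tag{10}$$

where | · | denotes the matrix determinant and

$$\boldsymbol{\Sigma} = \mathbf{H}\mathbf{Q}\mathbf{H}^{T} + \mathbf{R}, \tag{11}$$

$$\boldsymbol{\Xi} = \boldsymbol{\Sigma}^{-1} - \boldsymbol{\Sigma}^{-1}\mathbf{H}\mathbf{Y}\left(\mathbf{Y}^{T}\mathbf{H}^{T}\boldsymbol{\Sigma}^{-1}\mathbf{H}\mathbf{Y}\right)^{-1}\mathbf{Y}^{T}\mathbf{H}^{T}\boldsymbol{\Sigma}^{-1}. \tag{12}$$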

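[22] Once these parameters have been estimated, and returning to the Bayesian notation outlined in equation (1), the posterior probability density of the unknown vector **s** is Gaussian:

$$p(\mathbf{s} \mid \mathbf{z}) \propto \exp\left[-\tfrac{1}{2}(\mathbf{z} - \mathbf{H}\mathbf{s})^{T}\mathbf{R}^{-1}(\mathbf{z} - \mathbf{H}\mathbf{s}) - \tfrac{1}{2}(\mathbf{s} - \mathbf{Y}\boldsymbol{\beta})^{T}\mathbf{Q}^{-1}(\mathbf{s} - \mathbf{Y}\boldsymbol{\beta})\right], \tag{13}$$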

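where the first term represents the likelihood and the second term represents the prior probability density function of **s**. The system of equations that allows us to obtain the best estimate and posterior covariance of **s** is [e.g., *Michalak and Kitanidis*, 2003]

$$\begin{bmatrix} \mathbf{H}\mathbf{Q}\mathbf{H}^{T} + \mathbf{R} & \mathbf{H}\mathbf{Y} \\ (\mathbf{H}\mathbf{Y})^{T} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \boldsymbol{\Lambda}^{T} \\ \mathbf{M} \end{bmatrix} = \begin{bmatrix} \mathbf{H}\mathbf{Q} \\ \mathbf{Y}^{T} \end{bmatrix}, \tag{14}$$

where **Λ** is an *m* × *n* matrix of coefficients and **M** is a *p* × *m* matrix of multipliers. The best estimate of the function is

$$\hat{\mathbf{s}} = \boldsymbol{\Lambda}\mathbf{z}, \tag{15}$$

and its posterior covariance is

$$\mathbf{V} = -\mathbf{Y}\mathbf{M} + \mathbf{Q} - \mathbf{Q}\mathbf{H}^{T}\boldsymbol{\Lambda}^{T}. \tag{16}$$

The diagonal elements of **V** represent the posterior variance of individual elements of the best estimate **ŝ**.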
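The estimation step can be sketched in a few lines of NumPy. The block below assumes the linear system has the saddle-point form with blocks **HQH**ᵀ + **R**, **HY**, and (**HY**)ᵀ on the left and **HQ**, **Y**ᵀ on the right, and that **V** = −**YM** + **Q** − **QH**ᵀ**Λ**ᵀ; the function name `geostatistical_estimate` and the six-point, five-observation toy problem are hypothetical and chosen only so that the system is well conditioned.

```python
import numpy as np


def geostatistical_estimate(z, H, Y, Q, R):
    """Solve the (n + p) linear system for Lambda and M; return s_hat and V."""
    n, p = H.shape[0], Y.shape[1]
    Sigma = H @ Q @ H.T + R
    HY = H @ Y
    lhs = np.block([[Sigma, HY], [HY.T, np.zeros((p, p))]])
    rhs = np.vstack([H @ Q, Y.T])
    sol = np.linalg.solve(lhs, rhs)
    Lambda = sol[:n].T                 # m x n matrix of coefficients
    M = sol[n:]                        # p x m matrix of multipliers
    s_hat = Lambda @ z                 # best estimate
    V = -Y @ M + Q - Q @ H.T @ Lambda.T  # posterior covariance
    return s_hat, V


# Hypothetical toy problem: six estimation points, five observations.
points = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
h = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
Q = 0.01 * h ** 3                                     # cubic GCF, theta = 0.01
Y = np.column_stack([np.ones(len(points)), points])   # linear drift, p = 3
H = np.array(
    [
        [1, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 1],
        [0.25, 0.25, 0, 0.25, 0.25, 0],
    ],
    dtype=float,
)
R = 1.0 * np.eye(H.shape[0])
z = np.array([1.0, 2.0, 0.5, 3.0, 1.2])
s_hat, V = geostatistical_estimate(z, H, Y, Q, R)
```

One property worth noting is that the second block row enforces **ΛHY** = **Y**, so data that follow the drift exactly are reproduced as the drift itself, whatever covariance model is used.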

[23] In short, once the form of the prior covariance model has been selected, the values of the required structural parameters as well as the measurement error variance can be optimized using a restricted maximum likelihood approach. The inverse problem can then be solved by formulating a set of *n* + *p* algebraic equations to obtain a best estimate for the contaminant distribution, **ŝ**, as well as an estimate of its posterior covariance, **V**. Conditional realizations, which are equally likely realizations of the historical contaminant distribution **s**, can also be generated [e.g., *Michalak and Kitanidis*, 2003].
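The structural-parameter step summarized above can be illustrated with a short sketch. It assumes the restricted likelihood has the standard form |**Σ**|^{−1/2} |(**HY**)ᵀ**Σ**^{−1}**HY**|^{−1/2} exp(−½ **z**ᵀ**Ξz**) with **Σ** = **HQH**ᵀ + **R**, a cubic GCF **Q** = θ*h*³, and **R** = σ_{R}^{2}**I**. The function name, the toy data, and the coarse grid search (used here in place of a proper optimizer) are all hypothetical.

```python
import numpy as np


def neg_log_restricted_likelihood(theta, sigma_r2, z, H, Y, h):
    """Negative log restricted likelihood, up to an additive constant."""
    n = z.size
    Sigma = H @ (theta * h ** 3) @ H.T + sigma_r2 * np.eye(n)
    HY = H @ Y
    Sigma_inv_HY = np.linalg.solve(Sigma, HY)
    A = HY.T @ Sigma_inv_HY
    # Xi projects out the unknown drift, so the objective does not depend on beta.
    Xi = np.linalg.inv(Sigma) - Sigma_inv_HY @ np.linalg.solve(A, Sigma_inv_HY.T)
    # slogdet returns (sign, log|det|); only the log magnitudes are used here.
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_a = np.linalg.slogdet(A)
    return 0.5 * (logdet_sigma + logdet_a + z @ Xi @ z)


# Hypothetical toy problem: six estimation points, five observations.
points = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
h = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
Y = np.column_stack([np.ones(len(points)), points])
H = np.array(
    [
        [1, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 1],
        [0.25, 0.25, 0, 0.25, 0.25, 0],
    ],
    dtype=float,
)
z = np.array([1.0, 2.0, 0.5, 3.0, 1.2])

# Coarse grid search over (theta, sigma_R^2); a real application would use an optimizer.
candidates = [(t, r) for t in (0.001, 0.003, 0.01) for r in (1.0, 2.0, 4.0)]
best_theta, best_sigma_r2 = min(
    candidates,
    key=lambda c: neg_log_restricted_likelihood(c[0], c[1], z, H, Y, h),
)
```

Because **ΞHY** = **0**, adding any multiple of the drift to the data leaves the objective unchanged, which is the sense in which the likelihood is "restricted."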

### 5. Adjoint State Formulation and Implementation

[24] In the solution of inverse problems, the number of observations is often significantly lower than the number of estimate locations (i.e., *n* ≪ *m*). In such cases, the application of the adjoint state method can significantly reduce the cost of computing the Jacobian **H**. Note that adjoint methods have traditionally been used primarily for nonlinear sensitivity analyses to a small number of parameters. We are instead interested in deriving the sensitivity to a spatially variable function **s** in a linear system.

[25] Note that in the remainder of this paper, *x* denotes the spatial coordinate, whereas *X* denotes locations at which measurements are taken. Furthermore, *t* denotes the temporal coordinate, *t* = *T*_{b} is the time at which measurements are taken, and *t* = *T*_{a} is the time for which the contaminant distribution is to be estimated.

#### 5.1. Multidimensional Advection Dispersion Equation

[26] A generic form of the advection-dispersion equation for solute transport is

$$\frac{\partial (\eta C)}{\partial t} = \frac{\partial}{\partial x_i}\left(\eta D_{ij}\frac{\partial C}{\partial x_j}\right) - \frac{\partial (\eta v_i C)}{\partial x_i} + q_s C_s - q_o C, \tag{17}$$

where repeated index notation is used, *t* is time, *x*_{i} are the spatial directions (*i* = 1, 2, 3), **x** = (*x*_{1}, *x*_{2}, *x*_{3}), *C* is resident concentration, η is porosity for porous media and unity for other cases, *D*_{ij} is the *i*, *j*th entry of the dispersion tensor, *v*_{i} is the fluid velocity in the direction of *x*_{i}, *q*_{s} is the source flow rate per unit volume, *C*_{s} is the source strength in mass per unit volume, and *q*_{o} is the sink flow rate per unit volume. The initial conditions are

$$C(\mathbf{x}, T_a) = C_a(\mathbf{x}), \tag{18}$$

where *T*_{a} is the time at which the initial condition is specified and *C*_{a} is the concentration distribution at that time. The possible boundary conditions are listed in Table 2, where Γ_{1}, Γ_{2}, Γ_{3} are subsets of the domain boundaries, *n*_{i} is the outward unit normal vector in the *x*_{i} direction, *q*_{D} is a dispersive mass flux per unit volume, and is a specified concentration. The first two terms on the right-hand side of equation (17) represent the divergence of the dispersive and advective mass fluxes, respectively. The advection-dispersion operator is

In our application, the initial condition in equation (18) represents the time for which the contaminant distribution is to be estimated, regardless of when the contaminant was originally released. As such, the initial condition does not necessarily represent the time at which the contaminant was introduced into the aquifer. We will be defining simulations that will allow us to compute the sensitivity of the available observations **z** to the unknown distribution of the contaminant in a given area, at a given time in the past *T*_{a}.

Table 2. Possible Boundary Conditions for Solute Transport

| Type | Name | Meaning | Expression | Boundary |
|---|---|---|---|---|
| First | Dirichlet | fixed concentration | *C* (*x*, *t*) = | on Γ_{1} |
| Second | Neumann | fixed dispersive flux | −*D*_{ij}(∂*C*/∂*x*_{j})*n*_{i} = *q*_{D} | on Γ_{2} |
| Third | Cauchy | fixed total mass flux | [η*v*_{i}*C* − η*D*_{ij}(∂*C*/∂*x*_{j})]*n*_{i} = η*v*_{i}*n*_{i} | on Γ_{3} |

#### 5.2. Adjoint State Formulation

[27] A sensitivity analysis approach [e.g., *Sykes et al.*, 1985] can be used to derive the adjoint of the advection-dispersion equation presented in equation (17). *Neupauer and Wilson* [2001] presented such a derivation, and their steps are summarized here, modified as needed for the current application.

[28] A performance measure *P* that quantifies some state of the system is defined as

$$P = \int_{t}\int_{\Omega} \zeta(s, C)\, d\Omega\, dt, \tag{20}$$

where ζ(*s*, *C*) is a functional of the state of the system, *s* is a parameter or set of parameters that we are interested in estimating, *C* is resident concentration for solute transport, Ω is the spatial domain, and integration is over the entire space-time domain. In the derivation of *Neupauer and Wilson* [2001], *s* was the strength of an instantaneous point source. In our case, *s* is the solute concentration distribution in a region of interest Ω_{a} at some point in the past, *T*_{a}. The performance measure *P* is the predicted concentration *C*(*X*,*T*) at an observation location, and the function ζ(*s*,*C*) is defined accordingly.

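[29] The marginal sensitivity of this performance measure with respect to a parameter *s* is obtained by differentiating equation (20):

$$\frac{dP}{ds} = \int_{t}\int_{\Omega} \left[\frac{\partial \zeta(s, C)}{\partial s} + \frac{\partial \zeta(s, C)}{\partial C}\,\psi\right] d\Omega\, dt, \tag{21}$$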

where *dP*/*ds* is the marginal sensitivity that we are interested in and ψ is the state sensitivity, ψ = ∂*C*/∂*s*. Because the state sensitivity ψ is unknown, adjoint theory is used to eliminate it from equation (21), and the marginal sensitivity is obtained in terms of the adjoint state.

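[30] Differentiating the governing equation (17) with respect to a distributed parameter *s* to obtain the governing equation in terms of the state sensitivity, ψ, and assuming that the boundary conditions, porosity, dispersion tensor, fluid velocity, and source and sink flow rates do not depend on the solute distribution at the time of interest, we obtain

$$\frac{\partial (\eta \psi)}{\partial t} = \frac{\partial}{\partial x_i}\left(\eta D_{ij}\frac{\partial \psi}{\partial x_j}\right) - \frac{\partial (\eta v_i \psi)}{\partial x_i} - q_o \psi, \tag{22}$$

where ψ has homogeneous boundary conditions and, because we defined *s* as the distribution of *C*_{a} in the region Ω_{a}, the initial condition becomes

$$\psi(\mathbf{x}, T_a) = \begin{cases} 1, & \mathbf{x} \in \Omega_a \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$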

[31] Taking the product of each term in equation (22) with an arbitrary function ψ* (the adjoint state), integrating over time and space, adding this equation to the right-hand side of equation (21) and integrating by parts yields [*Neupauer and Wilson*, 2001]

[32] Because ψ* is not defined at this stage, we can prescribe its properties in a manner that is most convenient to our goal of eliminating ψ from equation (24). The second term in the integral can be eliminated by defining an appropriate governing equation for ψ*. The remaining terms that contain ψ are the spatial and temporal divergence terms. Integrating the temporal divergence term over the time domain, applying Gauss's divergence theorem to the spatial divergence terms, and substituting the initial and boundary conditions on ψ, it can be shown that the remaining terms containing ψ vanish if the final condition on ψ* is set to ψ* (**x**, *T*_{b}) = 0, and the boundary conditions on ψ* are homogeneous on Γ_{1}, Γ_{2}, and Γ_{3} [*Neupauer and Wilson*, 2001]. Specifying ψ* in this way and defining backward time as τ = *T*_{b} − *t*, the adjoint of the governing operator and its initial and boundary conditions are

where we assume steady flow (i.e., ∇ · η**v** = −*d* = *q*_{s} − *q*_{o}, and η*v*_{i}(∂ψ*/∂*x*_{i}) − *q*_{o}ψ* = ∂(η*v*_{i}ψ*)/∂*x*_{i} − *q*_{s}ψ*). In equation (25), *L** [ ] is the adjoint operator of equation (19) and ψ* is the adjoint state. Note that the governing equation and its adjoint differ in that the signs on the first-derivative terms are reversed. In addition, the form of Dirichlet (first-type) boundary conditions remains unchanged, whereas the adjoints of Neumann (second-type) boundary conditions are Cauchy (third-type) boundary conditions, and vice versa. For our setup the marginal sensitivity of the performance measure in equation (24) simplifies to

According to equation (23), given that in our setup *s* is the concentration distribution in the subdomain Ω_{a} at time *T*_{a}, this equation simplifies to

where the integral of ψ* is over the subdomain Ω_{a}, because ψ(**x** ,*T*_{a}) = 0 everywhere else.

#### 5.3. Adjoint State Source Terms

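[33] In our setup the performance measure *P* is *C*(**X**, 0), the solute resident concentration at a measurement location defined in three spatial directions **X** = (X_{1}, X_{2}, X_{3}), and τ = 0. The load term for the adjoint state, ∂ζ/∂*C*, is defined such that the integral of the performance functional ζ evaluates to the observation value. Assuming a point measurement, ζ is given by

$$\zeta(s, C) = C(\mathbf{x}, \tau)\,\delta(\mathbf{x} - \mathbf{X})\,\delta(\tau). \tag{29}$$

The Dirac delta function in time causes the integral to be evaluated only at τ = 0. Using this ζ, the resulting governing equation for the adjoint state is

$$\frac{\partial (\eta \psi^{*})}{\partial \tau} = \frac{\partial}{\partial x_i}\left(\eta D_{ij}\frac{\partial \psi^{*}}{\partial x_j}\right) + \frac{\partial (\eta v_i \psi^{*})}{\partial x_i} - q_s \psi^{*} + \delta(\mathbf{x} - \mathbf{X})\,\delta(\tau). \tag{30}$$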

[34] Having defined the governing equation, boundary conditions, and source terms on ψ*, the only remaining task is to derive **H** from the results of the adjoint runs. Given the form of the performance functional ζ (*s*,*C*) in equation (29), it is clear that the direct contribution to the marginal sensitivity as defined in equation (28) (i.e., ∂ζ(*s*,*C*)/∂*s*) is zero. In a discretized domain the individual contaminant regions that we are interested in, Ω_{a}, are simply the grid cells within the area where the historical contaminant distribution is to be estimated. Therefore, for each observation location, one adjoint run is performed using a source term as defined in equation (30), and the marginal sensitivities of this observation to the discretized unknown contaminant distribution **s** are defined simply by ψ* (**x**, τ = *T*_{b} − *T*_{a}), where **x** are the grid points at which **s** is to be estimated. An adjoint run with a source term at the location of observation *z*_{i} thus defines one full row of the **H** matrix, *H*_{i,j} for *j* = 1, …, *m*.

#### 5.4. Implementation

[35] Many general purpose codes as well as case or site-specific models are available for the solution of the advection-dispersion equation. Although implementing inverse methods that provide function estimates of sources or historical distributions has up to this point required the development of custom groundwater flow and/or contaminant transport codes, there are many advantages to reusing existing models, especially if modifications to these models can be avoided. In such cases, these models would essentially be used as external program modules by the inverse modeling code. The use of modules has been shown to improve code maintainability [*Glass and Noiseux*, 1981; *Lientz and Swanson*, 1980] and comprehension [*Shneiderman and Mayer*, 1979] and is compatible with the notions of encapsulation and abstraction advocated by object-oriented design [*McConnell*, 1993]. Because groundwater flow and transport codes offer a collection of services in a way that allows for an external program to interact with them cleanly, they are perfectly suited for being coupled with an additional inverse model. For example, *Neupauer and Wilson* [2001] described the possibility of using existing groundwater transport codes for performing adjoint simulations.

[36] In this section we present an implementation of this idea for the problem of deriving the historical distribution of a contaminant. The flow field, boundary conditions, and load terms in the transport model need to be set up in a manner that reflects the adjoint model described in sections 5.2 and 5.3. The setup of adjoint transport simulations is described here, with additional implementation details presented by *Michalak* [2003]. Note that the setup for the flow field and boundary conditions would be similar for various applications of the adjoint state method, and is also described by *Neupauer and Wilson* [2001] for a different problem. The initial conditions and performed simulations, on the other hand, are specific to the problem being addressed.

##### 5.4.1. Flow Field

[37] The steady state flow field should first be calculated in the same manner as if forward simulations were to be run. The flow field is then reversed because the time parameter τ is defined as reverse time in the derivation of the adjoint state methodology, starting at the time at which observations were taken (τ = *T*_{b} − *t*). The simplest way to do this is to change the sign on all flow terms in the output file of the flow model.

##### 5.4.2. Boundary Conditions

[38] First-type boundary conditions remain first-type, second-type boundary conditions become third-type, and third-type boundary conditions become second-type. Furthermore, all boundary conditions are homogeneous (i.e., the right-hand side is zero) for the adjoint runs (see equation (26)). Note that if the velocities normal to a boundary are zero, second-type and third-type boundary conditions have equivalent forms and can therefore be simulated even if the transport code only supports one of these boundary types (see also section 6.2).

##### 5.4.3. Initial Conditions

[39] Because we are working in discretized space, the Dirac delta function δ (**x** − **X**) δ (τ) that was derived as the initial condition in equation (30) becomes a Kronecker delta function in numerical applications. Therefore, if we are interested in estimating the historical contaminant distribution in an aquifer, each adjoint run has an initial concentration of zero everywhere, except in the grid cell containing the observation, where the concentration is set to one.
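The setup steps in sections 5.4.1 through 5.4.3 can be sketched as a small preprocessing routine. The `Boundary` container, the boundary names, and the function `adjoint_setup` are hypothetical; an actual implementation would instead edit the input and output files of whichever flow and transport codes are being reused.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Boundary-type mapping for adjoint runs: first stays first, second and third swap.
ADJOINT_KIND = {"first": "first", "second": "third", "third": "second"}


@dataclass
class Boundary:
    kind: str     # "first", "second", or "third"
    value: float  # right-hand side of the boundary condition


def adjoint_setup(
    velocities: List[float],
    boundaries: Dict[str, Boundary],
    n_cells: int,
    obs_cell: int,
) -> Tuple[List[float], Dict[str, Boundary], List[float]]:
    """Return the reversed flow field, adjoint boundaries, and initial condition."""
    reversed_velocities = [-v for v in velocities]   # reverse the flow field
    adjoint_boundaries = {
        name: Boundary(ADJOINT_KIND[b.kind], 0.0)    # swapped type, homogeneous
        for name, b in boundaries.items()
    }
    initial = [0.0] * n_cells
    initial[obs_cell] = 1.0                          # Kronecker delta at the observation
    return reversed_velocities, adjoint_boundaries, initial


forward_boundaries = {
    "west": Boundary("first", 10.0),
    "east": Boundary("second", 0.0),
    "north": Boundary("third", 2.0),
}
velocities, boundaries, initial = adjoint_setup(
    [0.5, 0.5, 0.25], forward_boundaries, n_cells=5, obs_cell=3
)
```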

##### 5.4.4. Simulations

[40] The transport model is run once for each observation. The total duration of the run is equal to the amount of time elapsed between the time at which the contaminant distribution is to be estimated and the time at which the observation was made. Once the simulation has been run for the appropriate time, the concentration is recorded at each point in the discretized area of interest. The concentrations in this zone resulting from each adjoint run represent sensitivities of that observation to a historical concentration at each of the points in the discretized zone. As such, the results of each adjoint run allow for one row of the sensitivity matrix **H** to be filled.
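The equivalence between the two ways of filling **H** can be checked on a toy problem. The sketch below uses a hypothetical one-dimensional explicit scheme (upwind advection with central dispersion and uniform coefficients) in place of a real transport code, and uses the transpose of the discrete step matrix as a stand-in for rerunning that code with a reversed flow field; for uniform velocity the two coincide. The forward approach needs *m* runs, the adjoint approach *n*.

```python
import numpy as np


def transport_step_matrix(m: int, courant: float, diffusion: float) -> np.ndarray:
    """One explicit time step of 1-D upwind advection plus central dispersion."""
    A = np.zeros((m, m))
    for i in range(m):
        A[i, i] = 1.0 - courant - 2.0 * diffusion
        if i > 0:
            A[i, i - 1] = courant + diffusion   # inflow from the upstream cell
        if i < m - 1:
            A[i, i + 1] = diffusion             # dispersion from the downstream cell
    return A


def run(step_matrix: np.ndarray, initial: np.ndarray, n_steps: int) -> np.ndarray:
    """Advance a concentration (or adjoint state) vector by n_steps time steps."""
    c = np.array(initial, dtype=float)
    for _ in range(n_steps):
        c = step_matrix @ c
    return c


m, n_steps = 30, 40                 # m unknowns; elapsed time T_b - T_a in steps
obs_cells = [18, 22, 27]            # n = 3 observation locations
A = transport_step_matrix(m, courant=0.4, diffusion=0.1)
unit = np.eye(m)

# Forward approach: m runs, each fills one column of H.
H_forward = np.zeros((len(obs_cells), m))
for j in range(m):
    H_forward[:, j] = run(A, unit[j], n_steps)[obs_cells]

# Adjoint approach: n runs, each fills one row of H.
H_adjoint = np.zeros((len(obs_cells), m))
for i, cell in enumerate(obs_cells):
    H_adjoint[i, :] = run(A.T, unit[cell], n_steps)
```

Here 3 adjoint runs reproduce what 30 forward runs compute; the saving grows with the ratio *m*/*n*.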

### 7. Conclusions

[61] The work presented in this paper extends the geostatistical approach to inverse modeling to the recovery of a historical contaminant distribution, implements an adjoint methodology that improves the efficiency of solving underdetermined inverse problems, allows existing groundwater flow and transport codes to be used as modules of the inverse model, and presents the first application of an inverse modeling method to the identification of a historical, multidimensional contaminant distribution in a heterogeneous medium.

[62] The method was tested using three applications. The Idealized Case demonstrated the method's ability to precisely and accurately recover the historical contaminant distribution in an aquifer when the quantity and quality of available data are sufficient. The Homogeneous and Heterogeneous Cases demonstrated the method's ability to recover a reasonable best estimate of the contaminant distribution and to accurately gauge the precision of that estimate. Although the method was applied here to derive the historical contaminant distribution at a single time, the method is also applicable to obtaining a time-dependent description of the history of a plume. In that case the distribution of the adjoint state ψ* in the adjoint runs would be recorded for a series of times τ, and the inversion would be performed for each of these times.
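The time-dependent extension can be illustrated with a short sketch in which a single adjoint run per observation is sampled at several backward times, each sample giving one row of **H** for a different estimation time. The step matrix below is a hypothetical one-dimensional explicit scheme, and `adjoint_snapshots` is an illustrative name rather than part of any existing code.

```python
import numpy as np


def adjoint_snapshots(A: np.ndarray, obs_cell: int, record_steps) -> dict:
    """Run one adjoint simulation and store psi* at each requested backward step."""
    psi = np.zeros(A.shape[0])
    psi[obs_cell] = 1.0                 # unit pulse at the observation cell
    wanted = set(record_steps)
    snapshots = {}
    for step in range(1, max(wanted) + 1):
        psi = A.T @ psi                 # one backward-time step
        if step in wanted:
            snapshots[step] = psi.copy()
    return snapshots


# Hypothetical 1-D explicit transport step (upwind advection, central dispersion).
m = 20
A = (
    np.diag(np.full(m, 0.4))
    + np.diag(np.full(m - 1, 0.5), k=-1)   # inflow from the upstream cell
    + np.diag(np.full(m - 1, 0.1), k=1)    # dispersion from the downstream cell
)
rows = adjoint_snapshots(A, obs_cell=15, record_steps=[5, 10, 20])
```

Each entry of `rows` is the sensitivity of the same observation to the concentration field at a different time in the past, so no additional adjoint runs are needed to build **H** for each estimation time.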

[63] Finally, although the adjoint state methodology was presented with an application to geostatistical inverse modeling in mind, several of the other inverse modeling methods described in Table 1 could benefit directly from this work. Methods such as Tikhonov regularization, nonregularized nonlinear least squares, and minimum relative entropy all require the calculation of a sensitivity matrix analogous to **H**. Therefore, although the specific inverse modeling algorithms differ from the geostatistical approach, the adjoint method implemented in this work would allow for similar computational savings in calculating the sensitivity matrix if applied with these methods.