A constrained robust least squares approach for contaminant release history identification



[1] Contaminant source identification is an important type of inverse problem in groundwater modeling and is subject to both data and model uncertainty. Model uncertainty was rarely considered in the previous studies. In this work, a robust framework for solving contaminant source recovery problems is introduced. The contaminant source identification problem is first cast into one of solving uncertain linear equations, where the response matrix is constructed using a superposition technique. The formulation presented here is general and is applicable to any porous media flow and transport solvers. The robust least squares (RLS) estimator, which originated in the field of robust identification, directly accounts for errors arising from model uncertainty and has been shown to significantly reduce the sensitivity of the optimal solution to perturbations in model and data. In this work, a new variant of RLS, the constrained robust least squares (CRLS), is formulated for solving uncertain linear equations. CRLS allows for additional constraints, such as nonnegativity, to be imposed. The performance of CRLS is demonstrated through one- and two-dimensional test problems. When the system is ill-conditioned and uncertain, it is found that CRLS gave much better performance than its classical counterpart, the nonnegative least squares. The source identification framework developed in this work thus constitutes a reliable tool for recovering source release histories in real applications.

1. Introduction

[2] Groundwater is vulnerable to contamination from point and nonpoint sources. One of the first steps in any environmental remediation project is to identify the locations and release histories of contaminant sources so that a cost-effective remediation strategy can be made and so that cleanup costs can be partitioned among liable parties. In most cases, source locations and release histories are unknown when contamination is first detected. The reconstruction of contaminant source locations and release histories from observed concentration records is a special type of inverse problem.

[3] Sun [1994] classified inverse problems in groundwater modeling into five types: namely, the identification of parameters, boundary conditions, initial conditions, sinks and sources, and the simultaneous identification of more than one of these components. From a theoretical point of view, the identification of source terms in a partial differential equation is simpler than identification of the equation coefficients [Isakov, 1990]. In practice, however, the source identification problem is very challenging because of its ill-posed nature and lack of data [Sun and Sun, 2002, 2005].

[4] Contaminant source identification has been studied for over two decades in groundwater hydrology. Three subproblems are often considered under this topic: finding the release history of a source, finding the location of a source, and recovering the initial distribution of a contaminant plume. Various deterministic and statistical methods have been devised to solve these problems. Atmadja and Bagtzoglou [2001a] and Michalak and Kitanidis [2004] provided extensive literature reviews on this subject. Existing approaches can be roughly classified into three categories.

[5] The first category of approaches formulates the source identification problem as an optimization problem and solves it using either a linear or a nonlinear programming technique. For example, Gorelick et al. [1983] used nonlinear programming to identify pollution sources and disposal episodes. Wagner [1992] presented a nonlinear maximum likelihood methodology for simultaneously identifying flow and transport model parameters, as well as pollutant sources. By recognizing the fundamental ill-posed nature of source recovery problems, Skaggs and Kabala [1994] used Tikhonov regularization (TR) to reconstruct the source release history of a one-dimensional transport problem. Alapati and Kabala [2000] used a nonlinear least squares method without regularization to recover the release history of a one-dimensional problem. Mahar and Datta [2001] considered identifying contaminant sources in conjunction with optimal monitoring network design using nonlinear optimization techniques. Aral et al. [2001] used a progressive genetic algorithm to solve the nonlinear optimization problem for source identification.

[6] The second category of approaches adopts a probability-based method. For example, Woodbury and Ulrych [1996] and Woodbury et al. [1998] used minimum relative entropy (MRE), a Bayesian inference approach, for source recovery. Given prior information in terms of lower and upper bound and a prior “best estimate” of the model, MRE yields closed form expressions for the posterior density function of the estimate by minimizing a measure of relative entropy. Snodgrass and Kitanidis [1997] used a geostatistical method to recover the release history of a point source in a one-dimensional steady state flow field. The unknown release history is regarded as a statistical field characterized by a few statistical parameters. The geostatistical method has been extended to two- and three-dimensional cases [Butera and Tanda, 2002; Michalak and Kitanidis, 2004].

[7] The third category of approaches solves the advection-dispersion equation (ADE) backward in time. Since the dispersion process is irreversible, the ADE cannot be solved backward simply with negative time steps. Bagtzoglou et al. [1992] and Wilson and Liu [1994] reversed the advection part while keeping the dispersion part unchanged. Their method gives a backward location probability or a backward traveltime probability. Later, Neupauer and Wilson [1999, 2001, 2004] used the adjoint state method to compute these probabilities. Atmadja and Bagtzoglou [2001b] derived a backward beam equation for the ADE to obtain the backward-in-time solution, and later, Bagtzoglou and Atmadja [2003] showed this method might produce better results than those obtained by the quasi-reversibility method of Skaggs and Kabala [1995]. The backward-in-time method is an effective direct method if field measurements can provide the final condition for the backward solution. For two- and three-dimensional problems, the backward beam equation is difficult to solve because the advection dispersion operator is squared in the equation.

[8] From our literature review and the comparison-of-methodology table provided by Michalak and Kitanidis [2004], we see that in most previous studies the effect of model error was typically evaluated only through ad hoc sensitivity studies. For the contaminant source identification problem, model error can be caused by oversimplified model structure, inexact model parameters, and numerical error (e.g., the error caused by numerical dispersion or numerical discretization). The approach we present below allows a modeler to directly incorporate his or her knowledge about model uncertainty into the estimation.

[9] The contaminant source identification problem can be transformed into a direct least squares problem because of the linearity of the ADE. Although effective in principle, this method is seldom used in practice because of its numerical instability [Lawson and Hanson, 1995; Bjorck, 1996]. The least squares matrix is often ill-conditioned, so that small errors in the observation data, even if normally distributed, may cause a significant change in the solution. The TR method [Tikhonov and Arsenin, 1977] is often used [Skaggs and Kabala, 1994] to stabilize the solution and prevent it from growing without bound. There is, however, no rigorous way of determining the regularization parameter. As a result, TR may yield either overregularized or underregularized solutions and thus does not allow confidence in the final result [Schubert, 2003]. The total least squares (TLS) method, pioneered by Golub and Van Loan [1980] and further refined by Van Huffel and Vandewalle [1989, 1991], considers errors in both the coefficient matrix and the right-hand side of the least squares equations. TLS has been successfully applied in many different fields in the last decade [e.g., Van Huffel, 1997; Van Huffel and Lemmerling, 2002], but the standard TLS is not well suited for contaminant source identification because it can break down easily when the errors are not independent and identically distributed (IID) with zero mean.

[10] In recent years, a robust counterpart of the ordinary least squares, the robust least squares (RLS) method, was developed based on advances in robust convex programming [El Ghaoui and Lebret, 1997; Chandrasekaran et al., 1996; Ben-Tal and Nemirovski, 1995, 1998]. In the context of this paper, robustness is defined as the resistance, or immunity, of an estimator to model and data uncertainty. Uncertainties can be characterized either in a set theoretic setting, which consists of defining bounds for the uncertain variables, or in a probabilistic setting, which relies on finding probability density functions. Many parameter estimation techniques assume that the error distribution is Gaussian. Real world uncertainty, however, also includes non-Gaussian noise, nonwhite noise, and systematic errors. These uncertainties can easily be considered in a set theoretic setting [Walter and Piet-Lahanier, 1990; Ben-Tal and Nemirovski, 1997; Goldfarb and Iyengar, 2003], and this is the setting used in RLS, which assumes that the system is subject to unknown but bounded perturbations and attempts to reduce the sensitivity of the optimal solution to these perturbations. Robustness, however, does not come without a price: all robust estimators achieve robustness through some type of regularization and thus introduce bias into the solution. One of the key motivations behind using a robust estimator like RLS, as Bertsimas and Sim [2004, p. 1] wrote, is that “robustness assures that the solution remains feasible and near-optimal when data are uncertain.”

[11] RLS has been successfully applied to robust controller analysis [El Ghaoui and Lebret, 1997; De Fonseca et al., 2001], image processing [Schubert, 2003], structural analysis [Ben-Tal and Nemirovski, 1997; Mares et al., 2002], and financial analysis [Bertsimas and Sim, 2004]. To the best of our knowledge, RLS has not been applied to contaminant source identification.

[12] In this paper, we formulate a new variant of RLS, i.e., a constrained robust least squares (CRLS) method, for solving the source identification problem. Prior information on variable bounds, as well as additional linear constraints, can be incorporated directly into our formulation. The presented method is effective (no iteration is needed), robust (bound of model error is incorporated), and practical (system based). It can be readily combined with a mass transport model, such as MT3DMS [Zheng and Wang, 1999], for conducting real case studies.

[13] The paper is organized as follows: we present our system-based approach for source identification in section 2 and discuss various methods for solving a system of linear equations in section 3, with emphasis on RLS and our extension of RLS (i.e., CRLS). Finally, both one- and two-dimensional numerical examples are given in section 4. The impacts of observation errors and model errors (including numerical dispersion error) on the estimated source strengths are discussed, and comparisons with other estimation methods are made.

2. Problem Formulation for Source Identification

[14] A general parameterization of a space and time-dependent source function, s(x, t), is

$$
s(\mathbf{x}, t) = \sum_{m=1}^{M} \sum_{n=1}^{N} z_{mn}\, f_m(\mathbf{x})\, g_n(t) \tag{1}
$$

where t is time, x is the spatial coordinate vector, {fm(x)} and {gn(t)} are two sets of basis functions, M and N are the dimensions of the parameterization in space and time, respectively, and {zmn} are weighting coefficients. With parameterization (1), the source identification problem becomes one of identifying the unknown coefficients {zmn}, which are collected into the following vector form,

$$
\mathbf{z} = (z_1, z_2, \ldots, z_K)^T \tag{2}
$$

For instance, assume M contaminant sources are located in spatial domains Ω1, Ω2, …, ΩM, and these M sources have different source strengths during N consecutive time intervals (or release periods), T1, T2, …, TN. If the source strengths are constant in each release period, we can choose δ(Ωm) for the spatial basis function fm(x) and δ(Tn) for the temporal basis function gn(t), where δ(·) is the Kronecker delta (equal to one inside the indicated domain or period and zero elsewhere). In this case, zk in equation (2) simply corresponds to the strength of the mth source, located in Ωm, during the nth time period Tn.

[15] Mass transport in porous media can be modeled by the following ADE [cf. Bear, 1979; Sun, 1996],

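$$
\frac{\partial C}{\partial t} = \frac{\partial}{\partial x_i}\left( D_{ij} \frac{\partial C}{\partial x_j} \right) - \frac{\partial}{\partial x_i}\left( V_i C \right) + s(\mathbf{x}, t) \tag{3}
$$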

with initial and boundary conditions

equation image

In equation (3), C(x, t) ([M/L^3]) is the concentration distribution at time t, Dij ([L^2/T]) are the components of the dispersion tensor D, and Vi ([L/T]) are the components of the seepage velocity vector V. In equation (4), Φ denotes the flow region with boundary segments Γ1 and Γ2, υ is the normal vector to the boundaries, and T is the total simulation time, i.e., $T = \sum_{n=1}^{N} T_n$.

[16] Although we did not include reaction and decay terms in equation (3), it is worth noting that the methodology presented below works also for any linear reaction model, such as linear equilibrium sorption, first-order decay and first-order kinetics.

[17] Let Cmn0(x, t) be the solution to the following equation:

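$$
\frac{\partial C_{mn}^{0}}{\partial t} = \frac{\partial}{\partial x_i}\left( D_{ij} \frac{\partial C_{mn}^{0}}{\partial x_j} \right) - \frac{\partial}{\partial x_i}\left( V_i C_{mn}^{0} \right) + f_m(\mathbf{x})\, g_n(t) \tag{5}
$$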

subject to the same initial and boundary conditions given in equation (4). By noting the linearity of the advection-dispersion equation, equation (3), it is easy to verify that the following relationship holds between solutions of equation (3) and equation (5):

$$
C(\mathbf{x}, t; \mathbf{z}) = \sum_{m=1}^{M} \sum_{n=1}^{N} z_{mn}\, C_{mn}^{0}(\mathbf{x}, t) \tag{6}
$$

Assume L observed concentrations, Clobs(xl, tl), are taken at different observation locations and times

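$$
C_l^{obs}(\mathbf{x}_l, t_l) = C(\mathbf{x}_l, t_l; \mathbf{z}) + \varepsilon_l, \qquad l = 1, 2, \ldots, L \tag{7}
$$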

where ɛl is the observation error associated with the unknown true value, C(xl, tl; z). Substituting equation (6) into equation (7), we get a set of linear equations

$$
\mathbf{A}\mathbf{z} \approx \mathbf{b} \tag{8}
$$

where $\mathbf{A} \in \Re^{L \times K}$, K = M × N is the total number of unknowns, ℜ is the real space, $\mathbf{z} \in \Re^{K \times 1}$ is the unknown source vector to be identified, and $\mathbf{b} \in \Re^{L \times 1}$ is the observation data vector. The elements of A and b are given by $a_{lk} = C_k^{0}(\mathbf{x}_l, t_l)$ and $b_l = C_l^{obs}(\mathbf{x}_l, t_l)$, respectively. The approximation sign is used in equation (8) because we assume in this paper that both A and b are uncertain.

[18] In this paper, MODFLOW and MT3DMS are used to solve the forward problem. In order to use MT3DMS, we use δ(Ωm) and δ(Tn) as basis functions. In this case, the source term in equation (5) becomes a unit injection in a single domain (Ωm) during a single “stress period” Tn, and the solution of equation (5) is the result of a unit pulse injection. Under such representation, it is possible to assign physical meanings to rows and columns of A. Each column of A represents responses of all observation wells to a source (a unit strength source in one time period is regarded as one source) at all observation times, whereas each row of A corresponds to responses of one observation well to all sources at one observation time. For a real case study, if a mass transport model exists, matrix A can be obtained by running the model K times. The kth source is assigned unit strength in the kth run (k = 1, 2, ⋯, K), whereas all others are assigned zero strength. The model outputs corresponding to all L observations form the kth column of matrix A. Note that to identify initial concentration distributions (e.g., catastrophic releases), the term δ(Tn) is replaced by δ(0). In this case, matrix A can be obtained by running the mass transport model M times, where M is the number of spatial domains. The elements of A represent responses of a system to pulse excitations; thus A is often referred to as the response matrix, transfer function matrix or sensitivity matrix in the literature.
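The column-by-column construction of A described above can be sketched as follows. This is a minimal illustration, not the authors' MODFLOW/MT3DMS workflow: the `forward` callable, its signature, and the `toy_forward` stand-in are hypothetical and only mimic a linear transport model that maps a source-strength vector to simulated concentrations at the L observation points.

```python
from typing import Callable, List, Sequence

# Hypothetical forward-model signature: takes a source-strength vector of
# length K and returns the L simulated concentrations at the observation
# points. In the paper this role is played by a MODFLOW/MT3DMS run.
ForwardModel = Callable[[Sequence[float]], List[float]]


def build_response_matrix(forward: ForwardModel, K: int) -> List[List[float]]:
    """Return A as a list of L rows, each with K entries (K model runs)."""
    columns = []
    for k in range(K):
        unit_source = [0.0] * K
        unit_source[k] = 1.0                  # kth source on, all others off
        columns.append(forward(unit_source))  # model output = kth column of A
    L = len(columns[0])
    # Transpose the K columns into L rows.
    return [[columns[k][l] for k in range(K)] for l in range(L)]


def toy_forward(z: Sequence[float]) -> List[float]:
    """Stand-in linear model with L = 3 observations and K = 2 sources."""
    return [0.5 * z[0] + 0.1 * z[1],
            0.2 * z[0] + 0.4 * z[1],
            0.0 * z[0] + 0.3 * z[1]]


A = build_response_matrix(toy_forward, K=2)

# Superposition check: A z reproduces a direct run with the full source vector.
z = [4.0, 10.0]
Az = [sum(a * zk for a, zk in zip(row, z)) for row in A]
direct = toy_forward(z)
```

Because the K runs do not depend on one another, the loop body is the natural unit to distribute across processors.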

[19] The superposition method presented above can be used to test whether or not a certain location corresponds to a real source. In other words, the source vector in equation (2) should include all potential sources. The final solution should reveal what the real sources are. Since matrix A only depends on the observation locations and times and the proposed source structure (source locations and release durations), and is independent of the actual source strengths and observation values, we can test the effectiveness of an observation network during design stage using synthetic data.

[20] To identify K sources, only K model runs are needed. Since each model run is independent of the others, our approach can be readily implemented in a distributed computing environment to further speed up the solution process. Of course, in the special case where the flow velocity is constant and all release periods have equal lengths, only one model run is required to generate the first column of A, and all the other columns can be obtained by time shifting.

[21] Before leaving this section, we give the following necessary conditions for source identifiability: (1) for each source to be identified, there should be at least one observation point that responds to the source. Otherwise, the source should be excluded from the identified sources because it is not identifiable with the existing observation system; (2) no pair of sources being identified should result in identical responses by the observation system. This latter condition was presented in the paper of Alpay and Shor [2000] as a theorem: “Given any source pair, there exists at least one sensor that can distinguish their signatures.” Violation of the first condition produces a zero column in A, and violation of the second produces two or more identical columns in A; in either case A is rank deficient.
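Both necessary conditions can be screened directly on a candidate response matrix before any inversion is attempted. The sketch below is illustrative only: the function name, the row-list layout of A, and the absolute tolerance `tol` are assumptions introduced here, and passing the check does not by itself guarantee identifiability.

```python
from typing import List, Sequence, Tuple


def identifiability_report(A: Sequence[Sequence[float]], tol: float = 1e-12
                           ) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (silent sources, indistinguishable source pairs).

    A is given as L rows of K entries; `tol` is an assumed absolute tolerance.
    """
    K = len(A[0])
    cols = [[row[k] for row in A] for k in range(K)]
    # Condition (1): a source no observation responds to gives a zero column.
    silent = [k for k, c in enumerate(cols) if all(abs(v) <= tol for v in c)]
    # Condition (2): two sources with identical signatures give equal columns.
    twins = [(i, j)
             for i in range(K) for j in range(i + 1, K)
             if all(abs(p - q) <= tol for p, q in zip(cols[i], cols[j]))]
    return silent, twins


# Source 1 is never observed; sources 0 and 2 cannot be told apart.
A = [[0.5, 0.0, 0.5],
     [0.2, 0.0, 0.2],
     [0.1, 0.0, 0.1]]
silent, twins = identifiability_report(A)
```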

3. Constrained Robust Least Squares

[22] Methods for solving equation (8) have been studied for a long time, probably since Gauss first used the ordinary least squares (LS) to calculate orbits of objects in the solar system [Sheynin, 1994]. It is well known that when A is exact, b is subject to IID noise, and the system is well-conditioned, the LS method gives the optimal solution in the L-2 norm sense. Mathematically, the LS method tries to minimize the L-2 norm of the following residual

$$
\min_{\mathbf{z}} \|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2^2 \tag{9}
$$

for which the solution is given by

$$
\mathbf{z}_{LS} = \mathbf{A}^{+}\mathbf{b} \tag{10}
$$

where ∥·∥2 represents the L-2 norm, and $\mathbf{A}^{+} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T$ is referred to as the pseudoinverse of A [Golub and Van Loan, 1989]; this expression assumes that A has full column rank. In practice, A is often ill-conditioned. Ill conditioning means that a small perturbation of the data can cause a much larger perturbation of the solution. Being ill-conditioned does not mean a meaningful approximation cannot be obtained. Instead, it means standard methods, such as LS, cannot be used in a straightforward manner [Hansen, 1994], and more sophisticated and robust methods have to be used.
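The sensitivity of LS on an ill-conditioned system can be seen on a small example. The matrix and perturbation below are invented for illustration and are unrelated to the test problems of section 4; the sketch assumes NumPy is available.

```python
import numpy as np

# Two nearly collinear columns make A ill-conditioned (made-up example).
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
z_true = np.array([1.0, 1.0])
b = A @ z_true                      # exact data

# LS solution with exact data recovers z_true.
z_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# A perturbation of size 1e-3 in b (about 0.05% of its entries) ...
b_noisy = b + np.array([0.0, 1e-3, -1e-3])
z_noisy, *_ = np.linalg.lstsq(A, b_noisy, rcond=None)

# ... moves the LS solution from (1, 1) to roughly (-9, 11).
kappa = np.linalg.cond(A)
print("condition number:", kappa)
print("LS, exact data  :", z_ls)
print("LS, noisy data  :", z_noisy)
```

The perturbed solution also violates nonnegativity, which is one motivation for the constrained formulation introduced later in this section.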

[23] Numerous methods have been introduced for stabilizing an ill-conditioned system in order to get a useful solution, e.g., the TR technique. TR amounts to finding a weighted least squares solution to an augmented system [Hansen, 1994]

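$$
\min_{\mathbf{z}} \left\| \begin{bmatrix} \mathbf{A} \\ \lambda \mathbf{L} \end{bmatrix} \mathbf{z} - \begin{bmatrix} \mathbf{b} \\ \lambda \mathbf{L} \mathbf{z}^{*} \end{bmatrix} \right\|_2^2 \tag{11}
$$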

where λ is called the regularization parameter, L is typically either an identity matrix or a discrete approximation to a derivative operator, and z* is the initial guess. By bounding solutions to a reasonable size, TR alleviates one problem associated with the lack of robustness, namely, solutions being unstable and growing without bound. As mentioned in section 1, however, TR suffers from the drawback that selection of the optimal λ is usually not obvious [Golub and Van Loan, 1980]. Heuristic methods often have to be used for finding λ. When λ is too small, the stabilizing effect of the regularization operator is lost, and the solution will be overly sensitive to errors in the data. On the other hand, when λ is too large, the solution will be artificially smooth, and the fine details of the true solution are lost [Skaggs and Kabala, 1994]. In addition, the standard TR given in equation (11) does not explicitly take into account model errors.
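A minimal sketch of TR as a least squares problem on an augmented system is given below. It assumes the common form in which the objective is ∥Az − b∥² + λ²∥L(z − z*)∥²; the function name, the default choices L = I and z* = 0, and the test matrix are illustrative assumptions, and the value of λ is picked by hand rather than by any selection rule.

```python
import numpy as np


def tikhonov(A, b, lam, L=None, z_star=None):
    """Solve min ||A z - b||^2 + lam^2 ||L (z - z_star)||^2 by stacking rows."""
    K = A.shape[1]
    L = np.eye(K) if L is None else L
    z_star = np.zeros(K) if z_star is None else z_star
    A_aug = np.vstack([A, lam * L])                    # augmented matrix
    b_aug = np.concatenate([b, lam * (L @ z_star)])    # augmented data
    z, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return z


# Ill-conditioned made-up system with slightly perturbed data.
A = np.array([[1.0, 1.0],
              [1.0, 1.0001],
              [1.0, 0.9999]])
b_noisy = A @ np.array([1.0, 1.0]) + np.array([0.0, 1e-3, -1e-3])

z_unreg = tikhonov(A, b_noisy, lam=0.0)    # plain LS: unstable
z_reg = tikhonov(A, b_noisy, lam=0.01)     # hand-picked lambda: stabilized
print(z_unreg, z_reg)
```

With L = I and z* = 0 the stacked problem is equivalent to solving (AᵀA + λ²I)z = Aᵀb, which makes the stabilizing role of λ explicit.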

[24] Over the last two decades, the sensitivity of estimates to errors in A and b has received much attention. The TLS method considers errors in both A and b and solves the following minimization problem

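$$
\min_{\mathbf{z},\, \Delta\mathbf{A},\, \Delta\mathbf{b}} \left\| [\Delta\mathbf{A}\ \ \Delta\mathbf{b}] \right\|_F \quad \text{subject to} \quad (\mathbf{A} + \Delta\mathbf{A})\mathbf{z} = \mathbf{b} + \Delta\mathbf{b} \tag{12}
$$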

where $[\Delta\mathbf{A}\ \ \Delta\mathbf{b}]$ is the augmented matrix of errors in A and b, namely, ΔA and Δb, and ∥·∥F denotes the Frobenius norm, which is defined as $\|\mathbf{A}\|_F^2 = \sum_{i,j} a_{ij}^2$. TLS solves a consistent linear system that is closest to A and b in the Frobenius norm sense. In other words, rather than lumping all the errors into the right-hand side of equation (8) as LS does, TLS corrects errors in both A and b. Golub and Van Loan [1980, 1989] solved equation (12) using the singular value decomposition (SVD): assume $\mathbf{A} \in \Re^{L \times K}$, and let $[\mathbf{A}:\mathbf{b}] = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^T$ be the SVD of the augmented matrix [A:b], where $\mathbf{U} \in \Re^{L \times (K+1)}$ and $\mathbf{V} \in \Re^{(K+1) \times (K+1)}$ both have orthonormal columns, and Σ is a diagonal matrix containing the singular values of [A:b], which are denoted here as $\{\bar{\sigma}_1, \ldots, \bar{\sigma}_K, \bar{\sigma}_{K+1}\}$ in descending order. Further let $\{\sigma_1, \ldots, \sigma_K\}$ denote the singular values of A in descending order. If $\bar{\sigma}_{K+1} < \sigma_K$, the TLS solution is given by

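$$
\mathbf{z}_{TLS} = \left( \mathbf{A}^T\mathbf{A} - \bar{\sigma}_{K+1}^2 \mathbf{I} \right)^{-1} \mathbf{A}^T \mathbf{b} \tag{13}
$$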

and the minimum augmented error matrix is

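$$
[\Delta\mathbf{A}_{TLS}\ \ \Delta\mathbf{b}_{TLS}] = -\bar{\sigma}_{K+1}\, \mathbf{u}_{K+1} \mathbf{v}_{K+1}^T \tag{14}
$$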

In the above, $\mathbf{u}_{K+1}$ is the last column of U, and $\mathbf{v}_{K+1}$ is the last column of V.
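The SVD route to the TLS solution can be sketched in a few lines. The sketch assumes NumPy, L ≥ K + 1, and the generic case in which the smallest singular value of [A:b] is strictly smaller than the smallest singular value of A; the function name and the random test data are illustrative only. The solution is read off the last right singular vector, which is an equivalent way of writing the closed-form TLS solution.

```python
import numpy as np


def tls(A, b):
    """Total least squares solution and minimal correction from SVD of [A : b]."""
    K = A.shape[1]
    C = np.column_stack([A, b])
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    sigma_A = np.linalg.svd(A, compute_uv=False)
    if not s[K] < sigma_A[K - 1]:
        raise ValueError("generic TLS solution does not exist")
    v = Vt[K]                                   # last right singular vector
    z = -v[:K] / v[K]                           # TLS solution
    correction = -s[K] * np.outer(U[:, K], v)   # minimal [dA : db]
    return z, correction


# Made-up, well-conditioned test problem with small noise in b.
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 3))
z_true = np.array([1.0, 2.0, 3.0])
b = A @ z_true + 0.01 * rng.normal(size=8)

z_tls, correction = tls(A, b)
dA, db = correction[:, :3], correction[:, 3]
# The corrected system is consistent: (A + dA) z_tls equals b + db.
print(z_tls, np.linalg.norm((A + dA) @ z_tls - (b + db)))
```

Note that nothing in this computation limits the size of dA, which is the lack of robustness discussed next.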

[25] The standard TLS as given above does not regulate the magnitude of the matrix correction, $\Delta\mathbf{A}_{TLS}$. As a result, it is not robust and can lead to a solution in which the effect of errors is overcorrected. A frequently cited example in the literature is the case in which A is known to be almost exact but b is far from the range space of A; it can be shown that TLS then overcorrects A [Chandrasekaran et al., 1996]. Like the LS method, TLS is known to be extremely sensitive to data outliers. The breakdown point of TLS, defined as the smallest number of contaminated data that can cause an estimator to take on values arbitrarily far from the true estimate, is only one [Rousseeuw and Leroy, 1987]. Several variants of TLS have been proposed to address this issue. For example, Sima et al. [2003] and Renaut and Guo [2005] solved the regularized TLS problem using the TR technique.

[26] RLS was initially introduced to rectify some common drawbacks associated with LS and TLS. The key assumption in RLS is that the errors in A and b are unknown, but have deterministic bounds. RLS then seeks to find the worst-case model in the bounded region and solves a standard residual minimization problem for this worst-case scenario. As a consequence, the robustness of RLS is achieved naturally through considering the worst-case scenario. As with the TR technique, RLS aims at desensitizing ill-conditioned systems, but RLS allows one to incorporate bounds on model uncertainty (RLS remains feasible in the presence of perturbations that are not necessarily small) and RLS provides a mathematically rigorous way for determining the optimal regularization parameter. All these features of RLS make it very appealing for our source identification problem.

[27] The subtle difference between RLS and sensitivity analysis is also worth noting. Sensitivity analysis quantifies locally the stability of the nominal solution with respect to infinitesimal data perturbations, but it does not say how to improve this stability. Robust optimization addresses this problem [Ben-Tal and Nemirovski, 2000] by incorporating the worst-case scenario as a bound.

[28] In the now classical work of El Ghaoui and Lebret [1997], RLS is formulated as a problem of minimizing the worst-case residuals

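$$
\min_{\mathbf{z}} \max_{\left\| [\Delta\mathbf{A}\ \ \Delta\mathbf{b}] \right\|_F \le \rho} \left\| (\mathbf{A} + \Delta\mathbf{A})\mathbf{z} - (\mathbf{b} + \Delta\mathbf{b}) \right\|_2 \tag{15}
$$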

where ρ is the upper bound on the norm of the augmented error matrix, $[\Delta\mathbf{A}\ \ \Delta\mathbf{b}]$. El Ghaoui and Lebret [1997] solved the problem using convex second-order cone programming, which is a subclass of the more general semidefinite programming [Alizadeh and Goldfarb, 2003; Lobo et al., 1998]. Ben-Tal and Nemirovski [1995] considered uncertain second-order cone problems with ellipsoidal uncertainty, of which RLS can be considered a special case. Chandrasekaran et al. [1996] considered a cost function similar to that given in equation (15) and solved it using SVD. In their approach, the regularization parameter is shown to be the solution of a secular equation, which is solvable with any one-dimensional root finder. The drawback of the SVD solution of Chandrasekaran et al. [1996] is that additional constraints, such as nonnegativity or bounds on individual decision variables, cannot be incorporated.

[29] El Ghaoui and Lebret [1997] assumed that the upper bound of the augmented error matrix $[\Delta\mathbf{A}\ \ \Delta\mathbf{b}]$ is known. Our tests suggest that imposing a bound on $[\Delta\mathbf{A}\ \ \Delta\mathbf{b}]$ may lead to overcorrection in A, especially when $\|\Delta\mathbf{b}\|_2$ is much larger than $\|\Delta\mathbf{A}\|_2$. We will work with the following alternative RLS formulation, which is the starting point of Chandrasekaran et al. [1996] and Mares et al. [2002]:

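$$
\min_{\mathbf{z}} \max_{\Delta\mathbf{A},\, \Delta\mathbf{b}} \left\| (\mathbf{A} + \Delta\mathbf{A})\mathbf{z} - (\mathbf{b} + \Delta\mathbf{b}) \right\|_2 \tag{16}
$$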

subject to

$$
\|\Delta\mathbf{A}\|_2 \le \rho_A, \qquad \|\Delta\mathbf{b}\|_2 \le \rho_b
$$

For a fixed z, there may be many feasible pairs (ΔA, Δb). The solution to equation (16) is the z that minimizes the worst-case residual over all of them. Note that in equation (16) the L-2 norms are used, whereas in the TR and LS formulations, the squares of the L-2 norms are used. From the triangle inequality,

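$$
\left\| (\mathbf{A} + \Delta\mathbf{A})\mathbf{z} - (\mathbf{b} + \Delta\mathbf{b}) \right\|_2 \le \|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2 + \|\Delta\mathbf{A}\|_2 \|\mathbf{z}\|_2 + \|\Delta\mathbf{b}\|_2 \le \|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2 + \rho_A \|\mathbf{z}\|_2 + \rho_b \tag{17}
$$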

The right-hand side of the inequality in equation (17) provides an upper bound for ∥(A + ΔA)z − (b + Δb)∥2. To reduce the min-max problem in equation (16) to a minimization problem, Chandrasekaran et al. [1996] proved that the upper bound in equation (17) is actually achievable. As a result, the RLS problem becomes: given $\mathbf{A} \in \Re^{L \times K}$, $\mathbf{b} \in \Re^{L \times 1}$, and the bounds $\rho_A$ and $\rho_b$, determine $\mathbf{z} \in \Re^{K \times 1}$ that solves the minimization problem

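$$
\min_{\mathbf{z}} \left( \|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2 + \rho_A \|\mathbf{z}\|_2 + \rho_b \right) \tag{18}
$$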

subject to

equation image

In lieu of solving equation (18) using the SVD approach, as done by Chandrasekaran et al. [1996], we cast it into a second-order cone programming problem

$$
\min_{\mathbf{z},\, \eta,\, \tau} \; \eta + \rho_A \tau \tag{19}
$$

subject to

$$
\|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2 \le \eta, \qquad \|\mathbf{z}\|_2 \le \tau
$$

where η and τ are slack variables. The two constraints that appeared in equation (19) are the so-called quadratic cone constraints [Alizadeh and Goldfarb, 2003], which are defined mathematically as

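$$
\mathcal{Q}^{d} = \left\{ \mathbf{y} = (y_0, y_1, \ldots, y_{d-1})^T \in \Re^{d} : \; y_0 \ge \sqrt{\sum_{i=1}^{d-1} y_i^2} \right\} \tag{20}
$$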

where the unknown vector y consists of a slack variable y0 and the unknown decision variables that are being sought for, and d is the cone dimension. It can be shown that the solution to equation (19) is given in the following form (Appendix A)

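$$
\mathbf{z}_{RLS} = \left( \mathbf{A}^T\mathbf{A} + \mu \mathbf{I} \right)^{-1} \mathbf{A}^T \mathbf{b} \tag{21}
$$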

The connection between RLS and TR can be recognized immediately from the above equation. The difference is that the regularization parameter μ is obtained in a mathematically rigorous way, and it is optimal for robustness. In equation (21), the degenerate case (i.e., μ = 0) occurs when b is in the column space of A, and the solution for this case coincides with the LS solution. When ρA increases, the RLS solution becomes smaller. In the extreme, when ρA is greater than the largest singular value of A, the RLS solution becomes identically zero [Chandrasekaran et al., 1996].
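The step from the min-max problem to a plain minimization rests on the triangle-inequality bound being attained by some admissible perturbation. The sketch below checks this numerically for one arbitrary z using a rank-one ΔA aligned with the residual; this particular construction and the random data are assumptions made here for illustration, not the proof given by Chandrasekaran et al. [1996]. It requires Az ≠ b and z ≠ 0.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 3))     # made-up nominal model
b = rng.normal(size=6)          # made-up data
z = rng.normal(size=3)          # any fixed candidate solution
rho_A, rho_b = 0.3, 0.2         # assumed perturbation bounds

r = A @ z - b
u = r / np.linalg.norm(r)                          # unit residual direction

# Perturbations that sit exactly on the bounds and push along u.
dA = rho_A * np.outer(u, z) / np.linalg.norm(z)    # spectral norm = rho_A
db = -rho_b * u                                    # 2-norm = rho_b

worst = np.linalg.norm((A + dA) @ z - (b + db))
bound = np.linalg.norm(r) + rho_A * np.linalg.norm(z) + rho_b
print(worst, bound)   # the two values agree up to rounding
```

Setting the gradient of ∥Az − b∥2 + ρA∥z∥2 to zero (where both norms are nonzero) gives (AᵀA + μI)z = Aᵀb with μ = ρA∥Az − b∥2/∥z∥2, which is one way to see the Tikhonov-like structure of the RLS solution.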

[30] Second-order cone programming problems can be solved in polynomial time by primal-dual interior point methods [Nesterov and Nemirovski, 1994; Nesterov and Todd, 1997; Fujisawa et al., 1997]. Interior point methods are much faster than the classical sequential quadratic programming method for problems with a large number of decision variables.

[31] The previous versions of RLS are not ideal for the source identification problem because the mass transport process is a positive system, which means that the nonnegativity constraint must be imposed on equation (19). In addition, it is often desirable to incorporate other types of prior information, such as bounds on source strength, into the estimation process. This leads us to consider the following RLS problem with additional linear constraints:

$$
\min_{\mathbf{z},\, \eta,\, \tau} \; \eta + \rho_A \tau \tag{22}
$$

subject to

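$$
\|\mathbf{A}\mathbf{z} - \mathbf{b}\|_2 \le \eta, \qquad \|\mathbf{z}\|_2 \le \tau, \qquad lb_k \le z_k \le ub_k, \quad k = 1, 2, \ldots, K
$$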

where lbk and ubk are the lower and upper bounds of the kth source, zk, respectively. We call our method the CRLS method. In this work, we used the Matlab [The MathWorks, 2000] package SeDuMi written by Sturm [1998] to solve the general CRLS problem in equation (22).
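For readers without a second-order cone solver at hand, the bound-constrained problem can be prototyped with a general-purpose optimizer. The sketch below is not the SeDuMi-based implementation used in this work: it assumes that the slack variables can be eliminated so that the objective becomes ∥Az − b∥2 + ρA∥z∥2 with simple bounds on z, and it swaps in SciPy's L-BFGS-B routine with finite-difference gradients. The function name, the test data, and the value of ρA are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def crls(A, b, rho_A, lb, ub):
    """Minimize ||A z - b||_2 + rho_A ||z||_2 subject to lb <= z <= ub."""
    def worst_case_residual(z):
        return np.linalg.norm(A @ z - b) + rho_A * np.linalg.norm(z)

    # Start from the LS solution projected onto the bounds.
    z_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    z0 = np.clip(z_ls, lb, ub)
    result = minimize(worst_case_residual, z0, method="L-BFGS-B",
                      bounds=list(zip(lb, ub)))
    return result.x


# Made-up problem: nonnegative responses, one inactive source, noisy data.
rng = np.random.default_rng(0)
A = np.abs(rng.normal(size=(8, 3)))
z_true = np.array([5.0, 0.0, 20.0])
b = A @ z_true + 0.05 * rng.normal(size=8)

lb = np.zeros(3)            # nonnegativity
ub = np.full(3, 100.0)      # assumed upper bound on source strength
rho_A = 0.05                # assumed bound on the model perturbation
z_hat = crls(A, b, rho_A, lb, ub)
print(z_hat)
```

The objective is not differentiable where Az = b or z = 0, so a dedicated cone solver remains the more reliable choice for ill-conditioned or large problems.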

4. Numerical Examples and Discussion

[32] Since the methodology presented in section 2 is generic, there is no special requirement on the flow and transport solver. One can construct the response matrix A using either existing flow and mass transport models, or analytical solutions.

[33] An appropriately estimated error bound is important for ensuring that solutions obtained with CRLS are robust but not overly conservative. Methods for evaluating uncertainty propagation have been studied extensively in the last two decades in the field of geohydrology. Summaries of developments along this line can be found in the monographs of Dagan [1989], Zhang [2002], and Rubin [2003].

[34] As mentioned before, model error may arise at different levels (e.g., from uncertainty in input parameters, in model structures, and from numerical artifacts). We primarily focus on model error related to parameter uncertainty in this study. Obviously, the most complete way of characterizing parameter uncertainty is via statistical distributions of parameters, with which one can solve a stochastic model to obtain statistical moments of the quantity of interest. In practice, however, a full characterization of the parameter probability distribution is rarely possible and in many cases, only ranges of possible parameter variations are known [Babuška et al., 2002; Babuška and Oden, 2005]. Since CRLS only requires the knowledge of ρA, our task reduces to finding the worst-case perturbation of A. For convex sets that are prevalent in many engineering applications, only vertices of the set need to be considered [Ben-Tal and Nemirovski, 1997]. In other cases, the worst-case perturbation can be found, for example, by sampling the parameter space using Latin hypercube sampling [Helton and Davis, 2003].
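One way to turn parameter ranges into a value of ρA is to rebuild the response matrix at the corners of the parameter box and keep the largest deviation from the nominal matrix. The sketch below assumes that the parameters vary in a box and that checking vertices is sufficient, which is guaranteed when A depends affinely on the parameters and is only a heuristic otherwise. The `build_A` callable and the toy affine model are hypothetical stand-ins for a transport model run.

```python
import itertools
import numpy as np


def rho_A_from_vertices(build_A, nominal, lower, upper):
    """Largest spectral-norm deviation of A over the vertices of a box."""
    A0 = build_A(np.asarray(nominal, dtype=float))
    vertices = itertools.product(*zip(lower, upper))   # 2^P corner points
    return max(np.linalg.norm(build_A(np.array(v)) - A0, 2) for v in vertices)


# Toy model in which A depends affinely on two parameters (hypothetical).
B0 = np.array([[1.0, 0.0], [0.0, 0.0]])
B1 = np.array([[0.0, 0.0], [0.0, 1.0]])


def build_A(p):
    return p[0] * B0 + p[1] * B1


rho_A = rho_A_from_vertices(build_A,
                            nominal=[1.0, 1.0],
                            lower=[0.9, 0.8],
                            upper=[1.1, 1.2])
print(rho_A)   # largest deviation comes from the second parameter
```

For models that are strongly nonlinear in the parameters, the vertex loop would be replaced by a sampling loop over the interior of the box.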

[35] In the following, we demonstrate the identification of contaminant sources using CRLS via both one- and two-dimensional examples. Matlab is used to solve all one-dimensional problems, and MODFLOW/MT3DMS is used to solve a two-dimensional example. We impose nonnegativity constraints on the unknown source vector. The performance of CRLS is compared with that of the LS, TLS, and nonnegative LS (NNLS) methods. The NNLS method, in which solutions are subject to nonnegativity constraints, is a special case of the more general constrained LS method [Lawson and Hanson, 1995]. Compared with LS, NNLS is more robust because the nonnegativity constraints limit oscillations in the estimate.

4.1. Source Recovery for One-Dimensional Transport

[36] The mathematical model of the continuous release of chemicals into a semi-infinite, one-dimensional flow field is

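$$
\frac{\partial C}{\partial t} = D_L \frac{\partial^2 C}{\partial x^2} - V \frac{\partial C}{\partial x} \tag{23}
$$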

with initial and boundary conditions

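$$
C(x, 0) = 0, \quad x \ge 0; \qquad C(0, t) = C_0(t), \quad t \ge 0; \qquad C(\infty, t) = 0, \quad t \ge 0 \tag{24}
$$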

where V is the flow velocity, DL is the longitudinal dispersion coefficient, and C0(t) is the unknown source release history. Pore-scale diffusion is negligible for advection-dominated problems. Therefore DL can be written as αLV, where αL is the longitudinal dispersivity [Bear, 1979].

[37] The following analytical solution to equation (23) is available when C0 is constant [Bear, 1979; Sun, 1996]

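$$
C(x, t) = \frac{C_0}{2} \left[ \operatorname{erfc}\left( \frac{x - V t}{2\sqrt{D_L t}} \right) + \exp\left( \frac{V x}{D_L} \right) \operatorname{erfc}\left( \frac{x + V t}{2\sqrt{D_L t}} \right) \right] \tag{25}
$$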

When C0 is time varying, it is often approximated as a piecewise constant function where the average concentration is used in each time period. The number of time periods used to discretize C0(t) depends on the information available, such as sampling intervals. By using equation (25) as a fundamental solution, problems involving multiple release periods can be solved. Consider, for example, a single release period problem where the duration of injection lasts from t0 to t1, the boundary condition in equation (24) then becomes

$$
C(0, t) = \begin{cases} C_1, & t_0 < t \le t_1 \\ 0, & t > t_1 \end{cases} \tag{26}
$$

On the basis of the superposition principle, the solution to the above problem can be obtained by adding a source of negative strength (i.e., −C1) at x = 0 after time t1

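$$C(x, t) = \begin{cases} C_1 F(x, t - t_0), & t_0 < t \le t_1 \\ C_1 \left[ F(x, t - t_0) - F(x, t - t_1) \right], & t > t_1 \end{cases} \tag{27}$$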

where F(x, t) is the solution corresponding to a unit pulse input, which is the expression given in equation (25) with C0 set to 1. In general, the solution corresponding to the ith release period is

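$$C_i(x, t) = \begin{cases} C_i F(x, t - t_{i-1}), & t_{i-1} < t \le t_i \\ C_i \left[ F(x, t - t_{i-1}) - F(x, t - t_i) \right], & t > t_i \end{cases} \tag{28}$$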

With equation (28), synthetic plumes corresponding to any number of release periods can be constructed.
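The superposition construction can be sketched in a few lines of Python. The sketch assumes the standard Ogata-Banks form of the constant-source solution with DL = αLV. The function names (`unit_response`, `pulse_response`, `plume`), the default parameter values, and the unit-length release periods are illustrative assumptions and are not taken from the original Matlab implementation.

```python
import math


def unit_response(x, t, velocity=1.0, dispersivity=0.1):
    """F(x, t): concentration from a continuous unit-strength source at x = 0.

    Assumes the Ogata-Banks solution with D_L = dispersivity * velocity.
    Returns 0 for t <= 0, i.e., before the release has started.
    """
    if t <= 0.0:
        return 0.0
    d_l = dispersivity * velocity
    denom = 2.0 * math.sqrt(d_l * t)
    advective = math.erfc((x - velocity * t) / denom)
    # exp() overflows once x / dispersivity exceeds roughly 700; fine for x of order 10.
    boundary = math.exp(velocity * x / d_l) * math.erfc((x + velocity * t) / denom)
    return 0.5 * (advective + boundary)


def pulse_response(x, t, t_start, t_end, **params):
    """Response to a unit release lasting from t_start to t_end.

    A source of strength -1 is switched on at t_end to cancel the
    continuous unit source that was switched on at t_start.
    """
    return (unit_response(x, t - t_start, **params)
            - unit_response(x, t - t_end, **params))


def plume(x, t, strengths, period_edges, **params):
    """Concentration from a piecewise-constant release history."""
    return sum(
        c * pulse_response(x, t, period_edges[i], period_edges[i + 1], **params)
        for i, c in enumerate(strengths)
    )


# Illustration only: unit-length release periods are an assumption here.
strengths = [5, 10, 30, 60, 20, 100, 50]
edges = list(range(8))
print(plume(2.0, 3.5, strengths, edges))
```

Evaluating `pulse_response` at every observation location and time for a given release period gives the entries of the response matrix associated with that period.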

[38] Dimensionless parameters are used for examples in sections 4.1.1 and 4.1.2. A source is placed at the origin with true strengths of 5, 10, 30, 60, 20, 100, and 50 over seven equal-length intervals. The uncertain model parameters are αL and V, with true values of 0.1 and 1.0, respectively. Figure 1 shows the evolution of the synthetic plume in time. Because of the variation in source strengths, the plume exhibits a clear multimodal shape at later times. For reference, the observation locations are marked with circles on the x axis. Our goal is to recover the source strengths in each time period for different levels of variability in the model parameters. To demonstrate the robustness of CRLS, we use only 3 observation locations (x = 2, 6, 8) and 8 observation times (t = 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, and 7.5).

Figure 1.

Plots of the synthetic one-dimensional plume for different times. The locations of observation points are marked with circles on the x axis.

[39] All response matrices considered in this section are ill-conditioned. The ill-conditioning of A is usually measured via the matrix condition number, κ, which is the ratio of the largest singular value of A to the smallest. The condition number is a measure of stability or sensitivity of a matrix (or the linear system it represents) to numerical operations [Golub and Van Loan, 1989]. Matrices with condition numbers near 1 are said to be well-conditioned, and matrices with condition numbers much greater than 1 are said to be ill-conditioned. For the examples in sections 4.1.1 and 4.1.2, κ(A) is equal to 962. For the example in section 4.1.3, κ(A) is equal to 3.0 × 10¹².
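As a small illustration of this definition, the condition number can be computed from the singular values with NumPy. The two matrices below are toy examples chosen for this sketch, not the response matrices of this study.

```python
import numpy as np


def condition_number(matrix):
    """Ratio of the largest to the smallest singular value (2-norm condition number)."""
    singular_values = np.linalg.svd(np.asarray(matrix, dtype=float), compute_uv=False)
    smallest = singular_values[-1]
    return np.inf if smallest == 0.0 else singular_values[0] / smallest


# Toy matrices (not from the paper): nearly parallel columns inflate kappa.
well_conditioned = np.array([[1.0, 0.0], [0.0, 0.5]])
nearly_dependent = np.array([[1.0, 1.0], [1.0, 1.0001]])
print(condition_number(well_conditioned))   # 2.0
print(condition_number(nearly_dependent))
```

`np.linalg.cond` returns the same quantity by default; the explicit version is shown only to mirror the definition in the text.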

4.1.1. Impact of Velocity Uncertainty

[40] In the first numerical experiment, we assume that αL is exact, but V varies in the range V0(1 ± PV), where V0 is the true velocity and PV is the level of variation in percent. For benchmarking purposes, we use the true system as the nominal system in this example. In real life, the true system is never known, and thus the nominal system corresponds to the one that is observed most often. Many classical methods ignore model uncertainty by solving the problem with nominal data. Such a solution may be very poor if the nominal data themselves are uncertain [Ben-Tal and Nemirovski, 1998]. One of the first steps in applying CRLS is evaluating the worst-case deviation of A from its nominal state. For the current case, ∥ΔA∥₂ is a function of V, and it can be verified numerically that the worst-case deviation corresponds to the lower bound of V. This can be seen from Figure 2, in which ∥ΔA∥₂ is plotted as a function of V. Note that the lower and upper bounds of model parameters can often be assessed in practice. The upper bound of ∥ΔA∥₂ can be estimated for systems known to be convex by evaluating the model at the vertices of the parameter space; otherwise, sampling-based methods can be used.

Figure 2.

Plot of ∥ΔA∥₂ as a function of the advective velocity, V, for the one-dimensional example, where the deviation of A from the true system (V = 1.0) is considered. The velocity varies in the interval [0.8, 1.2], and the maximal value of ∥ΔA∥₂ occurs at the lower bound of V.
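A sampling-based estimate of the worst-case deviation might look like the following sketch. The helper `worst_case_deviation` and the stand-in `toy_matrix` are hypothetical; in an actual application the matrix builder would assemble the response matrix from the transport model for a given velocity. In the toy case the maximum falls at the lower bound only because the entries scale as 1/V, so it should not be read as a reproduction of Figure 2.

```python
import numpy as np


def worst_case_deviation(build_matrix, nominal, lower, upper, n_samples=41):
    """Scan a parameter interval; return (parameter, max ||A(p) - A(nominal)||_2)."""
    a_nominal = build_matrix(nominal)
    grid = np.linspace(lower, upper, n_samples)
    # ord=2 gives the spectral norm (largest singular value) of the deviation.
    norms = [np.linalg.norm(build_matrix(p) - a_nominal, 2) for p in grid]
    worst = int(np.argmax(norms))
    return float(grid[worst]), float(norms[worst])


def toy_matrix(velocity):
    """Hypothetical stand-in for the response matrix; entries scale as 1 / velocity."""
    return np.array([[1.0, 0.2], [0.0, 1.0], [0.5, 0.5]]) / velocity


v_worst, rho_a = worst_case_deviation(toy_matrix, nominal=1.0, lower=0.8, upper=1.2)
print(v_worst, rho_a)
```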

[41] In this numerical experiment, we use Monte Carlo simulation to compare different methods with 500 realizations. For each realization, a random V is drawn from a uniform distribution in the prescribed range, matrix A is generated, and different methods are used to obtain estimates.
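The Monte Carlo loop described here can be outlined as below. This is a generic sketch: `build_matrix` and `estimator` are placeholder callables, and the scalar toy problem is made up purely to exercise the loop.

```python
import random


def monte_carlo_estimates(build_matrix, estimator, b, v0, variation,
                          n_realizations=500, seed=0):
    """Draw V ~ U[v0(1 - variation), v0(1 + variation)] and re-estimate each time."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    lower, upper = v0 * (1.0 - variation), v0 * (1.0 + variation)
    draws, estimates = [], []
    for _ in range(n_realizations):
        v = rng.uniform(lower, upper)
        draws.append(v)
        estimates.append(estimator(build_matrix(v), b))
    return draws, estimates


# Scalar toy problem (hypothetical): the "matrix" is a = 1 / v, so z = b / a.
draws, estimates = monte_carlo_estimates(
    build_matrix=lambda v: 1.0 / v,
    estimator=lambda a, b: b / a,
    b=50.0, v0=1.0, variation=0.05,
)
print(min(estimates), max(estimates))
```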

[42] Mean solutions obtained by CRLS, LS, NNLS, and TLS are listed in Table 1. It is assumed that the velocity variation level, PV, is 5%, and the observed data are error-free. The relative error of estimation is defined as the ratio between the residual norm and the norm of the true solution, that is, ∥ẑ − z∥₂/∥z∥₂, where ẑ is the mean estimated source strength vector and z is the true source strength vector.
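The error measure can be written out directly. In the sketch below the two "realizations" are made-up numbers, chosen so that their errors cancel in the mean; the function name is illustrative.

```python
import math


def relative_error_of_mean(estimates, truth):
    """||mean(estimates) - truth||_2 / ||truth||_2 for a list of estimate vectors."""
    n = len(estimates)
    mean = [sum(est[i] for est in estimates) / n for i in range(len(truth))]
    error_norm = math.sqrt(sum((m - z) ** 2 for m, z in zip(mean, truth)))
    return error_norm / math.sqrt(sum(z * z for z in truth))


truth = [5, 10, 30, 60, 20, 100, 50]
# Two made-up realizations whose errors cancel in the mean.
estimates = [[6, 10, 30, 60, 20, 100, 40], [4, 10, 30, 60, 20, 100, 60]]
print(relative_error_of_mean(estimates, truth))  # 0.0
```

Because the measure is computed on the mean estimate, it can be small even when individual realizations scatter widely, so the spread of the estimates is worth examining alongside it.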

Table 1. Comparison of Estimation Methodsᵃ

Period | True Strength | CRLS | LS | NNLS | TLS
Relative error, % | | 4.4 | 406.6 | 64.2 | 817.0

ᵃ Results are based on mean estimates of 500 simulations. The second column lists the true source strengths in each time period. The last row shows the mean relative error of each method, which is defined as ∥ẑ − z∥/∥z∥, where ẑ is the mean estimated solution and z is the true solution. The velocity variation level is 5%.

[43] It can be seen from Table 1 that both LS and TLS failed to give meaningful results, especially for the last three time periods. The conditions for LS and TLS to give optimal performance (see section 3) are not satisfied here. In addition, recall that the elements in A, which correspond to unit pulse injections, are in general much smaller than the elements in b. This may have caused the TLS method to overcorrect the errors in A. CRLS and NNLS give comparable results for the first four time periods; it is the last three time periods that distinguish the two. Here, the difficulty arises because the sampling time is not long enough for the contaminant plume to be sampled adequately in the last three periods, especially the last period. Of course, the ill-conditioning can be alleviated or even completely removed with a better designed observation network and sampling schedule. Our goal, however, is to demonstrate source identification for ill-conditioned systems. While NNLS partially regularizes the solution by preventing it from becoming negative, it does not put a bound on the worst-case residual. CRLS ensures that the worst-case residual does not exceed a finite limit. As a result, CRLS yields the most robust estimates for the last three periods.
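To make the notion of a worst-case residual concrete, the following sketch assumes that only A is perturbed and that the perturbation is bounded in the spectral norm by ρA. Under that assumption the triangle inequality gives ∥(A + ΔA)z − b∥₂ ≤ ∥Az − b∥₂ + ρA∥z∥₂, and a rank-one ΔA attains the bound. The matrices are random toy data and the function names are illustrative; this is not the CRLS solver itself.

```python
import numpy as np


def worst_case_residual(a, b, z, rho):
    """Closed form of max ||(A + dA) z - b||_2 over ||dA||_2 <= rho."""
    return np.linalg.norm(a @ z - b) + rho * np.linalg.norm(z)


def worst_case_perturbation(a, b, z, rho):
    """Rank-one dA with ||dA||_2 = rho that attains the bound (needs A z != b, z != 0)."""
    residual = a @ z - b
    return rho * np.outer(residual / np.linalg.norm(residual), z / np.linalg.norm(z))


rng = np.random.default_rng(0)
a = rng.normal(size=(6, 3))
b = rng.normal(size=6)
z = np.abs(rng.normal(size=3))  # a nonnegative candidate solution
rho = 0.2

d_a = worst_case_perturbation(a, b, z, rho)
attained = np.linalg.norm((a + d_a) @ z - b)
print(np.isclose(attained, worst_case_residual(a, b, z, rho)))  # True
```

The ρA∥z∥₂ term penalizes large solution norms, which is one way to see why an estimator that minimizes the worst-case residual avoids the very large estimates that a nominal-residual method can return for an ill-conditioned system.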

[44] Solutions of CRLS and NNLS for PV = 10% and 20% are shown in Table 2. The LS and TLS methods are not considered further because of their poor performance. Compared with that of PV = 5%, the relative error of CRLS is little changed when PV is equal to 10%, and increases slightly to 14.76% when PV is increased to 20%. In contrast, the relative error of NNLS increases significantly.

Table 2. Comparison of CRLS and NNLS for PV = 10% and 20%ᵃ

Period | True Strength | CRLS, PV = 10% | CRLS, PV = 20% | NNLS, PV = 10% | NNLS, PV = 20%
Relative error, % | | 4.0 | 14.3 | 92.0 | 155.1

ᵃ PV is the velocity variation level. The results are based on 500 simulations. The LS and TLS solutions are not shown because of their poor performance.

[45] Although Monte Carlo simulation is used in this example to benchmark different estimation methods, in practice it may be expensive and inefficient to adopt Monte Carlo simulation as a way of obtaining estimates under uncertainty. It is more likely that a single estimate is desired for a given set of data and a given model. In such a situation, rather than an estimator that may be accurate in the mean sense, one would naturally prefer an estimator that can yield a reliable and thus useful estimate for ill-conditioned systems. If an upper bound of model uncertainty is available, then CRLS offers such a choice. This point is clearly demonstrated in Figure 3, which shows histograms of the estimated strength of the 7th time period for PV = 20%. The most striking feature revealed by Figure 3 is the contrast in the range of variation: the CRLS estimates vary approximately from 10 to 70, while the NNLS estimates vary from 0 to 800 (recall that the true source strength for the 7th time period is 50). The robustness of CRLS is apparent in this example.

Figure 3.

Histograms of source strengths estimated by (a) CRLS and (b) NNLS for the seventh time period based on 500 Monte Carlo runs with PV = 20%. (The true source strength for the seventh period is 50.)

4.1.2. Impact of Outliers in Measured Data

[46] In the previous example, we assumed that b is error free. In our second numerical experiment, we simulate the impact of data error on the estimates. The velocity variation level is fixed at 10% and the dispersivity variation level at 20%. Other parameters are the same as those in the previous example. In the base case, b is error free. In the second and third cases, the 5th and 11th elements of b are subject to a large random relative error, which is assumed to be uniformly distributed in the range ±ɛo. In the second case, ɛo is set to 50%, and in the third case, ɛo is set to 100%.
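The outlier mechanism can be sketched as follows. The observation vector here is placeholder data, the function name is illustrative, and the element numbering is assumed to be 1-based, so that the 5th and 11th elements map to indices 4 and 10.

```python
import random


def add_outliers(b, indices, eps, rng):
    """Scale selected entries of b by (1 + e), with e ~ U[-eps, eps]."""
    perturbed = list(b)
    for i in indices:
        perturbed[i] *= 1.0 + rng.uniform(-eps, eps)
    return perturbed


rng = random.Random(42)
# Placeholder data: 3 observation locations x 8 observation times = 24 values.
b_clean = [float(k) for k in range(1, 25)]
# The 5th and 11th elements are taken to be 1-based, hence indices 4 and 10.
b_noisy = add_outliers(b_clean, indices=(4, 10), eps=0.5, rng=rng)
print(b_noisy[4], b_noisy[10])
```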

[47] Table 3 lists the outputs of CRLS and NNLS from three Monte Carlo simulation runs, with each run based on 500 realizations. All runs used the same seed for the random number generator. The relative error of the CRLS estimates increases only slightly with the increase in ɛo. Interestingly, the relative error of NNLS in this example actually decreases with ɛo. An examination of the NNLS solutions from all realizations reveals that this phenomenon is largely an artifact created by the Monte Carlo simulation. We again plot the histograms of the estimated source strengths for the 7th time period. As shown in Figure 4, NNLS gives zero estimates in almost half of the realizations (the true source strength is 50) and much higher estimates in the remaining half. With such a large variance, the seeming improvement of NNLS is unreliable. In contrast, the spread of the CRLS estimates is much narrower, and more importantly, all the estimates are greater than zero.

Figure 4.

Histograms of source strengths estimated by (a) CRLS and (b) NNLS for the seventh time period based on 500 Monte Carlo runs with ɛo = 100%. (The true source strength for the seventh period is 50.)

Table 3. Comparison of CRLS and NNLS for ɛo = 0, 50, and 100%ᵃ

 | CRLS, ɛo = 0% | CRLS, ɛo = 50% | CRLS, ɛo = 100% | NNLS, ɛo = 0% | NNLS, ɛo = 50% | NNLS, ɛo = 100%
Relative error, % | 5.2 | 6.5 | 10.6 | 113.9 | 112.5 | 101.1

ᵃ Here ɛo is the level of random perturbation imposed on the 5th and 11th elements of b. The results are based on 500 simulations for each ɛo.

4.1.3. Problem of Skaggs and Kabala

[48] Skaggs and Kabala [1994, equation 25] created a one-dimensional example to demonstrate their source recovery strategy. This synthetic problem is unique in that the shape of the source release history is complex, and the resulting response matrix is underdetermined and severely ill-conditioned. The problem has become a classical problem and has been pursued by several researchers, for example, Woodbury and Ulrych [1996], Snodgrass and Kitanidis [1997], Neupauer et al. [2000]. All of these studies focused on measurement error. Here, we are interested in illustrating the impact of model uncertainty that is caused by an uncertain V.

[49] We calculated the response matrix following the procedure of Neupauer et al. [2000]; namely, the total release period was divided into 100 equal intervals, and the plume was sampled at T = 300 days at 25 locations. The resulting source vector thus consists of 100 unknowns. We recovered the release history for different levels of PV. The results are shown in Figure 5. It can be seen that CRLS recovered the true release history quite well for a negligible PV of 0.1% (Figure 5a). The solution deteriorates as PV increases. It deteriorates in a robust way, however, in the sense that the main features of the true source remain discernible.

Figure 5.

Impact of velocity uncertainty illustrated with the problem of Skaggs and Kabala [1994]. The horizontal axis is time (days), the vertical axis is relative concentration, the solid line corresponds to the true release history, and the dotted line corresponds to the recovered release history. All PV values are in percentage.

4.2. Source Recovery for Two-Dimensional Transport

[50] The velocity variation was directly imposed in the previous examples because V is a parameter in the analytical solution. In reality, uncertainty in velocity results directly from the uncertainty in flow parameters (e.g., in hydraulic conductivity). In this example, we illustrate the impact of uncertain hydraulic conductivity on contaminant source recovery.

[51] Contaminant transport in a two-dimensional, irregularly shaped, unconfined aquifer is considered. The rectangular box surrounding the aquifer is 5000 m long in the east-west direction and 2500 m wide in the north-south direction. The porosity is 0.25. Longitudinal and transverse dispersivities are 10 m and 1 m, respectively. The flow domain is discretized with a uniform numerical grid of 50 rows and 100 columns. Constant head boundaries of 120 and 60 m are imposed on the upper left and lower right sections of the aquifer, respectively (see Figure 6). The aquifer is assumed homogeneous with a nominal horizontal hydraulic conductivity of 25 m/day, and the flow field is at steady state.

Figure 6.

Domain geometry and steady state head distribution for the two-dimensional example. Constant head boundaries of 120 and 60 m are set at the northwestern and southeastern corners of the domain, respectively. The location of source (the square in the upstream) and the locations of 11 observation wells (circles with crosses) are also marked on the plot.

[52] A time-varying contaminant source, which has a size equal to four numerical cells, is placed near the upstream boundary. The true source release function is listed in Table 4. The source release function has two peaks: one is between 360 and 540 days, and the other between 1080 and 1440 days. The source release essentially terminates after the 13th period. The total time of interest is 4320 days and is split into 15 time periods. The observation network consists of 11 observation wells. Not all observation wells collect samples during all time periods, a consequence of possible mechanical failure or human errors. The earliest observation time is at 700 days. The locations of the source and observation wells are also indicated in Figure 6, along with the steady state head contour solved by MODFLOW. A plot showing the observation times of all wells is given in Figure 7. It can be seen that the time interval between 1000 and 1500 days is sampled most frequently, while no sample is taken between 3500 and 4000 days.

Figure 7.

Observation times of all observation wells. The horizontal axis indicates well number, and the vertical axis corresponds to observation times (day). The interval between 1000 and 1500 days is sampled the most, while no sample is taken between 3500 and 4000 days.

Table 4. True Source Release History for the Two-Dimensional Exampleᵃ

Period | Start Time, day | End Time, day | True Concentration, g/m³

ᵃ The second and third columns are the starting and ending times of each release period, and the last column is the true source strength during each period.


[53] Figures 8a, 8b, and 8c show the simulated concentration profiles at 720, 2160, and 3240 days, respectively. It can be seen that the plume has already reached the most downstream observation well at t = 3240 days.

Figure 8.

Simulated contaminant plumes at (a) t = 720 days, (b) t = 2160 days, and (c) t = 3240 days. Concentration values are in g/m³.

[54] Assume that the hydraulic conductivity is found to have a 10% maximum variation around its nominal value (i.e., between 22.5 and 27.5 m/day). Our goal is to recover the source release history using CRLS by incorporating the information on hydraulic conductivity uncertainty. For this case, the relationship between ∥ΔA∥₂ and the uncertain parameter is again linear, and the upper bound of ∥ΔA∥₂ (i.e., ρA) corresponds to the lower bound of the hydraulic conductivity, namely, 22.5 m/day. The value of ρA is 0.226, and the condition number of the nominal A is 606.

[55] We again use Monte Carlo simulation to compare CRLS with NNLS. In Figure 9, the true source release history is plotted together with the mean estimates obtained using CRLS and NNLS for 100 random hydraulic conductivity realizations. The mean relative estimation error of CRLS is 11.8%, while that of NNLS is 33.1%. From Figure 9, it can be seen that CRLS captured the first peak much better than NNLS did. Although observation did not start until after 700 days, observed concentrations from later times can help to decipher information about plumes released at early times. Obviously, it is important to institute a well-designed observation network so that the dispersed plume can still be captured. As pointed out by Skaggs and Kabala [1994], the accuracy of the recovered plume depends on the accuracy of the characterization of the current plume and on the extent to which the plume has dissipated. In the current case, a larger conductivity results in faster plume movement, whereas a smaller conductivity results in slower plume movement. CRLS minimizes the worst-case residual (i.e., the difference between the nominal and the slowest plume) for each release period. For comparison, the minimum and maximum solutions obtained by each estimator are also plotted in Figure 9. It can be seen that CRLS in general shows much smaller variability than NNLS and is thus more robust and reliable.

Figure 9.

Comparison of the mean solutions obtained by (left) CRLS and (right) NNLS based on 100 realizations. The dashed line corresponds to the true source release history, the circles correspond to the CRLS solution, and the crosses correspond to the NNLS solution. The minimum and maximum estimates obtained by each method in the Monte Carlo simulation also are shown on the plot.

5. Summary

[56] A robust framework is introduced to recover contaminant source release history. First, the response matrix is derived using a superposition technique. Our formulation is general and does not make specific assumptions about the underlying flow and transport parameters, as long as the transport problem is linear. If there is no model error and the measurement error is normally distributed, the ordinary LS or TLS gives the optimal solution. In reality, model uncertainty is unavoidable. One approach for dealing with uncertainty is via the classic Bayesian approach, in which the prior probability distributions of model parameters have to be acquired first. We present an alternative approach, i.e., CRLS, for encapsulating uncertainty.

[57] CRLS incorporates directly one's knowledge about model uncertainty and measurement error and uses the information to determine a regularization parameter, which is optimal for robustness. When the observation network is imperfect and the quantity and quality of observed data are poor, the resulting system can be ill-conditioned. CRLS becomes most useful for obtaining estimates in such a situation. It does not require prior probability distributions of model parameters. Instead, CRLS assumes the system is subject to unknown but bounded perturbations and it seeks a solution that remains feasible and near-optimal through solving a minimax problem. CRLS is a deterministic method and thus does not yield confidence intervals as statistical methods do. It is possible, however, to combine set theoretic and probabilistic settings to cope with situations where uncertainty is partly described by bounds and partly by probabilistic density functions.

[58] We illustrated the application of CRLS for one- and two-dimensional examples, both of which involve ill-conditioned systems. For the one-dimensional example, the impacts of velocity uncertainty (model error) and measurement outliers (data error) were incorporated into CRLS estimates. For the two-dimensional example, the model error was caused by uncertainty in the hydraulic conductivity. Our results show that the LS and TLS methods failed to give meaningful solutions for the error structure we considered. Although the NNLS method prevents the solutions from becoming negative, it does not put a limit on the worst-case residual. CRLS outperformed NNLS in both examples studied. Monte Carlo simulation was used in order to make a fair comparison between methods. In practice, one often has to obtain estimates without Monte Carlo simulation. The small variance of CRLS estimates makes CRLS a reliable choice for such a situation.

[59] The current approach allows one to include potential sources at multiple locations in the analyses. The recovered source release history will tell whether or not a source is a true source. Such a strategy, however, requires detailed prior information on potential source locations. Alternatively, CRLS can be combined with a global optimization algorithm to automate the location search process. This will be a subject of future research.

Appendix A

[60] A proof is provided here for the CRLS solution shown in equation (21). For an introduction to the second-order cone programming, readers may refer to Alizadeh and Goldfarb [2003] and Lobo et al. [1998].

[61] The dual problem of the second-order cone programming problem given in equation (21) is [El Ghaoui and Lebret, 1997]

equation image

where w and u are dual variables. If η = τ at the optimum, then z = A⁺b and η = τ = ρA∥z∥₂. Now assume η > τ at the optimum. Setting the objectives of the primal and the dual equal to each other, we have

equation image

The term bᵀw can be expanded as

equation image

From equation (A1), ∥u∥₂ ≤ ρA and ∥w∥₂ ≤ 1; we can then construct w and u as

equation image

In addition, at the optimum we have

equation image

Substituting equation (A5) and equation (A4) into Aᵀw + u = 0, we get

equation image

Solving for z from equation (A6), we get the result given in equation (21), that is,

equation image


[62] Funding for this project is provided by the Advisory Committee for Research at Southwest Research Institute® (20.R9530). We want to thank Stuart Stothoff, the Associate Editor, and the anonymous reviewers for their constructive comments. The first author also wants to thank Ne-Zheng Sun at UCLA and L. El Ghaoui at UC Berkeley for their input.