## 1. Introduction

[2] In groundwater systems, a source of contamination may not be known to exist until contamination is observed in downgradient wells or surface water bodies. After the contamination is observed, characteristics of the source can be reconstructed using any of a variety of methods that are based on the existing distribution of contamination and on knowledge of the flow and transport processes. As a contaminant moves through the aquifer, dispersive processes cause a loss of information about the contaminant's past; therefore, a complete reconstruction of the source characteristics is not possible. The reconstruction is further complicated by errors in measured concentrations and by uncertainty and variability in the flow and transport parameters. In this paper, we focus on identifying the location or release time of an instantaneous point source of contamination. Because the source characteristics cannot be known exactly, we represent the source location and source release time as random variables.

[3] In recent years, the source identification problem has received much attention. One type of source identification problem that has been extensively studied is the reconstruction of the release history from a known source of contamination. This problem has been addressed using linear programming and function-fitting [*Gorelick et al.*, 1983], the maximum likelihood method [*Wagner*, 1992], Tikhonov regularization [*Skaggs and Kabala*, 1994, 1999; *Liu and Ball*, 1999; *Neupauer et al.*, 2000], the method of quasi-reversibility [*Skaggs and Kabala*, 1995], geostatistical approaches [*Snodgrass and Kitanidis*, 1997; *Butera and Tanda*, 2003; *Michalak and Kitanidis*, 2004], classical Bayesian methods [*Woodbury and Ulrych*, 1996, 1998; *Woodbury et al.*, 1998; *Neupauer et al.*, 2000; *Michalak and Kitanidis*, 2003], non-linear optimization [*Alapati and Kabala*, 2000; *Mahar and Datta*, 2000], the marching-jury backward beam equation method [*Atmadja and Bagtzoglou*, 2001], and genetic algorithms [*Aral et al.*, 2001]. In this type of problem, the location of the contamination source is known, whereas our approach is used to identify the location of an unknown contaminant source.

[4] Another source identification problem, and the one addressed in this paper, is the identification of the location or release time of the source. *Dimov et al.* [1996] used concentration measurements and marginal sensitivities of concentration to source mass to identify the location of a point source of contamination in a one-dimensional domain. For an instantaneous point source in a one-dimensional domain, the ratio of the marginal sensitivity to the measured concentration will be equal to the reciprocal of the source mass at (at most) two points, one of which will be the true source location. If a second sample is taken, the ratio of its marginal sensitivity to measured concentration will also be equal to the reciprocal of the source mass at two points, including the source location. *Dimov et al.* [1996] identified the true source location as the point at which both ratios are equal. This approach works only if the measured concentrations, the model parameters, and the conceptual model are all accurate, an unlikely situation in any groundwater modeling application. *Neupauer and Wilson* [1999, 2001, 2002] presented an approach for developing probability density functions (PDFs) of the random location or random release time of an instantaneous point source of contamination based on one or more sampling locations or times. These PDFs are related to the marginal sensitivity of concentration to source mass. To calculate these PDFs, the observation location is treated as an instantaneous point source of probability occurring at the time of sampling, and the probability density function is propagated upgradient and backward in time to identify possible former positions (spatial distribution of location PDF at a particular backward time) or possible travel (release) times (temporal distribution of travel time PDF at a particular upgradient location) of the observed particle. These backward probabilities take into account the observation location and time, but do not account for the sampled concentration; therefore, they are based more on the spatial and temporal distribution of the sampling network than on the actual distribution of the contaminant.
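
The two-sample ratio argument described above can be illustrated with a minimal Python sketch. It is not the implementation of *Dimov et al.* [1996]; it assumes a one-dimensional domain with uniform velocity `v` and dispersion coefficient `D`, a known source mass and elapsed time since release, unit cross-sectional area and porosity, and error-free measurements. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def sensitivity(x_obs, x_src, t, v, D):
    """Marginal sensitivity dC/dM: concentration at x_obs per unit mass
    released at x_src a time t earlier (1-D advection-dispersion,
    unit cross-sectional area and porosity assumed)."""
    spread = 4.0 * D * t
    return math.exp(-(x_obs - x_src - v * t) ** 2 / spread) / math.sqrt(math.pi * spread)


def candidate_sources(x_obs, c_meas, mass, t, v, D):
    """Source locations at which sensitivity / c_meas equals 1 / mass.
    The sensitivity is Gaussian in x_src, so there are at most two."""
    peak = 1.0 / math.sqrt(4.0 * math.pi * D * t)  # largest possible sensitivity
    target = c_meas / mass                          # sensitivity value required
    if target <= 0.0 or target > peak:
        return []
    offset = math.sqrt(max(0.0, -4.0 * D * t * math.log(target / peak)))
    center = x_obs - v * t
    return sorted({center - offset, center + offset})


# Hypothetical setup: true source at x = 0, two wells sampled at the same time.
v, D, t, mass, x_true = 1.0, 0.5, 10.0, 1.0, 0.0
wells = (8.0, 11.0)

# Synthetic, error-free "measurements" generated from the forward model.
measured = [mass * sensitivity(x, x_true, t, v, D) for x in wells]

cand_a = candidate_sources(wells[0], measured[0], mass, t, v, D)
cand_b = candidate_sources(wells[1], measured[1], mass, t, v, D)

# The source is the candidate location shared by both samples.
common = [a for a in cand_a for b in cand_b if abs(a - b) < 1e-6]
print("candidates from well 1:", cand_a)
print("candidates from well 2:", cand_b)
print("shared location:", common)
```

Each sample alone leaves two candidate locations, and only the true source appears in both sets. If the synthetic concentrations are perturbed, the two candidate sets will generally no longer share a point, which is the sensitivity to measurement and model error noted above.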

[5] Other source identification problems aim to identify both the source location and release history. *Mahar and Datta* [2000] used non-linear optimization to estimate the release concentrations of multiple hypothetical contaminant sources over discrete time intervals using breakthrough curve data at several downstream locations. In their work, the locations of two or three sources were known, and the optimization method was used to determine the source flux at each of these sources. The source fluxes were constant over discrete time intervals, and the durations of these time intervals were assumed to be known. For some of the sources, the true source flux was zero; therefore they also tested the capability of distinguishing between locations that released contamination and those that did not. They found that the estimated source fluxes differed from the actual source fluxes by approximately 10–30%, depending on the quantity of available breakthrough curve data (number of breakthrough curves and number of data points in each breakthrough curve). *Aral et al.* [2001] and *Mahinthakumar and Sayeed* [2005] used genetic algorithms to identify the location and release rate of a contamination source. *Aral et al.* [2001] used breakthrough curve data from four observation wells in a genetic algorithm to estimate the release history of contamination from a point source with an unknown, but constrained, location. Even with measurement errors in the breakthrough curve data of up to 12%, they were able to determine the source location to within approximately 3% of the travel distance between the true source location and the most distant observation well. *Mahinthakumar and Sayeed* [2005] used a genetic algorithm to determine the concentration and the location of an instantaneous, non-point source with uniform initial concentration. They used discrete breakthrough curve data at several downstream wells to optimize their solution. They evaluated several different hybrid genetic algorithm-local search methods and found that some hybrid methods accurately identified the source location and concentration to within 1% of the true values for the hypothetical examples they investigated.

[6] A final category of source identification problems is the identification of the historical distribution of a contaminant plume. *Michalak and Kitanidis* [2004] used measured concentration data in a geostatistical inverse method to identify the configuration of a contaminant plume at some time in the past. With accurate concentration measurements taken at a finite number of points in a contaminant plume, they were able to reconstruct the contaminant distribution at a desired time in the past. They did not identify a source location, but rather a representation of the contaminant plume, which may represent a plume that developed from an upgradient point or distributed source, or it may represent the distributed source itself if the source were instantaneous.

[7] In this paper, we extend the work of *Neupauer and Wilson* [1999, 2001, 2002] by developing a method for conditioning backward location and travel time probability density functions on measured concentrations. We show that conditioning on measured concentration significantly increases the accuracy and decreases the variance of the PDFs when model and measurement errors are small, leading to an improvement in the identification of contaminant source locations and the release time of contaminants from the source. This work differs from the approaches of *Mahar and Datta* [2000], *Aral et al.* [2001], *Michalak and Kitanidis* [2004], and *Mahinthakumar and Sayeed* [2005] in that we assume an instantaneous point source and identify the unknown source location or the release time of contaminant from a known source. At this time, our approach has only been developed for an instantaneous point source and is not capable of reproducing the historical contaminant distribution.

[8] In the next section, we present equations for the unconditioned location and travel time PDFs for one or more observations. In the subsequent sections, we develop equations for conditioning these PDFs on measured concentrations, and we demonstrate the conditioning approach using a hypothetical example.