A numerical scheme for the optimal control of groundwater pollution

A scheme is proposed for the numerical study of optimal control problems related to groundwater quality management in the case of nonpoint‐source pollution. The corresponding state equation is a system of coupled nonlinear partial differential equations. The approximation is given by a finite volume scheme based on a two‐point flux approximation with upwind mobilities embedded in an iterative fixed point approximation. The analysis of the convergence of the scheme allows to establish existence and uniqueness results under reasonable assumptions. Numerical illustrations of the performance of the algorithm are given in realistic situations.


INTRODUCTION
The water resource is today threatened by different kinds of pollution linked to human activity, most of them coming from economically necessary production activities. A trade-off between the benefits of production and the decontamination costs must therefore be sought. Such a question enters in the framework of the optimal control theory.
Modeling the spread of a miscible pollutant in groundwater is equivalent to consider the flow of incompressible miscible fluids in porous media. Both convection, diffusion and reaction phenomena should be taken into account in order to be more realistic, leading to a system of coupled nonlinear partial differential equations. The corresponding model will play the role of the state equation constraining the optimization problem. Although there is a large literature considering the question of optimal policy for managing pollution, most works neglect important components of the problem. In particular many studies reduce the hydrogeological model to ordinary differential equations without taking into account any space dependency. Thus the delay between the application of any policy and its effects, induced by the small flow speed in aquifers is either neglected or modeled by a given fixed delay (e.g., [2,6,8,20]). The importance of taking a spatial dimension into account was however underlined in [7]. The space dependency is fully included in the present article.
We treat the optimal control problem from the point of view of numerical analysis. This allows us both to show the relevance of a scheme and to prove a result of theoretical analysis. In summary, we propose a finite volume architecture embedded in an iterative fixed point approximation to handle the full 3D optimal control problem. The choice of the finite volume approximation is relevant because, by construction, finite volume schemes are locally conservative. This property ensures that the overall iterative process quickly converges to an accurate description of the optimal control. Also the scheme has been built to deal with the strong parameter and scale contrasts induced by concrete applications.
There is of course a substantial literature on the numerical approximation of equation-constrained optimal control problems. A major issue in this context is that the control approximations may be highly oscillating: see [18] (resp. [14]) and the references therein for finite differences schemes (resp. for finite element schemes), both dealing with linear elliptic problems. As emphasized in [18], it seems that the oscillations mainly occur when using an approach in the form "discretize then optimize". Here we choose the opposite approach, optimize (that is derive the adjoint problem) then discretize. Unlike the work [14,18], we do not compute precise error estimates but simply establish the convergence of the scheme. Indeed, we have chosen to keep reasonable assumptions and we have poor regularity results for the weak solutions we treat (compared to the H 2 regularity in [14] or even the C 4 or C 2 assumptions in [18]). Another important remark explains why we do not attempt to improve our error estimates. In general, the performance of our algorithm will be doubly constrained: by those of the fixed point algorithm and by the structure of the two coupled advection-diffusion equations, in general with dominant advection, the advection in the adjoint problem being equal to the negative advection of the direct problem. The latter structure is induced by a change in the time variable that we make for handling the anti-diffusive form of the adjoint equation. In our nonlinear parabolic setting, we thus find the difficulty already highlighted in [14] for the elliptic linear setting. The behavior of the numerical method applied to the coupled PDEs is worse than the one of the method applied to the governing PDE only. This structural difficulty is unlikely to be controllable, especially since we want an approach general enough to simulate multiple nonlinear phenomena.
The article is organized as follows. The optimal control problem is introduced in Section 2. Section 3 is devoted to the description of our numerical scheme and to a sketch of the main results of the article. These results are then properly stated and proved in Sections 4-6 through the numerical analysis of the scheme. Notice that we have chosen to detail the proofs to emphasize the ease with which the scheme can be adapted to other optimal control problems. Finally, numerical experiments are performed in Section 7.

The optimal control problem
Let Ω ⊂ R , ∈ N, ≤ 3, be a bounded domain, representing the area affected by the pollution and containing the groundwater collection wells. We assume that the boundary of Ω is such that ∈ 1 . In the case of the 3D example treated in the last section, the boundary of Ω is divided into six subsets for taking into account different physical boundary conditions, namely = ∪ 6 i=1 Γ i where Γ 1 , Γ 2 , Γ 3 , Γ 4 are the laterals faces and respectively Γ 5 , Γ 6 are the top, bottom face. For the theoretical analysis however, it is sufficient to consider a division of into two components, Γ 0 and Γ. Moreover Γ is a (possibly very small) part of Γ 0 where we use a Dirichlet boundary condition for the hydraulic head. We assume that Γ is of nonzero measure for ensuring the validity of a Poincaré type inequality which facilitates the treatment of the elliptic equation of the model. The time horizon is denoted by T, 0 < T < ∞. We set Ω T = Ω × ]0, T[. We consider the following control problem with a single control p: given the problem find the scalar control function p which maximizes the functional subject to (2)-(4) and to the constraint The state equations correspond to a hydrogeological model for the displacement of the pollution in the underground. They are given in the system (1)-(2) modeling the displacement of a miscible pollutant of concentration c in the porous medium Ω under the coupled effect of convection, diffusion and reactions (see e.g., [4]). The velocity of the flow v is related to the hydraulic head by the Darcy law for an incompressible fluid (1) where is the mobility of the fluid in the soil and = q I −q P is a source term taking into account the recharge of the aquifer (q I ) and the water withdrawal in production wells (q P ). In (2), the dispersion effects due to the heterogeneity of the velocity field at the microscopic scale are given by the dispersion tensor S(v) = S m I R + S p (v), I R denoting the × identity matrix. For concision in the numerical analysis below, we retain only the diffusive component, thus setting S p ≡ 0. Nevertheless the work extents to a general dispersive model under classical assumptions (see e.g., [1]). The porosity of the soil is given by the function . The kinetic behavior of the pollutant is described by two terms, namely the retardation factor R and the function r(c), for taking into account reactions that may take place on different time scales. For the Neumann boundary conditions we denote by n the outward normal to the boundary.
The pollution load is characterized by the rate of application p and its location in a subset ℒ ⊂ Ω, ℒ being the characteristic function of ℒ . The function p is actually the control of the problem. Note that this function depends on both space and time. Thus, in particular, it contains two pieces of information: the amount of pollution entering the soil and its location within ℒ , which can vary over time. The control is subject to the constraint (6) meaning from a mathematical point of view that it is bounded. From an application point of view, p > 0 is the maximal pollution load (for instance the maximal fertilizer load that can be applied on a field). The non-negativity means that no remediation occurs.
The optimal control problem is in the Bolza form. We consider a central planner objective given by (5), following for instance the standard setting [20]. The function D correspond to the cleaning costs depending on the position of the production wells and on the pollutant concentration in the groundwater c. In particular, function D may contain the characteristic function of a subset of Ω for evaluating the costs only in a precise part of the domain of interest, for instance the water production wells. This allows to modulate the importance we give to environmental concerns in the model. The pollution is assumed to be linked with some economical activity. The corresponding benefits are described by the function f and have to balance somehow the cleaning costs. The real numbers , 0 < < 1, and ≥ 0 are respectively the social discount rate and a weight for the terminal costs.

Mathematical framework
The state equations being a partial differential equations problem, we are naturally going to use the usual functional spaces associated to the mathematical analysis of PDEs. The set of admissible controls thus reads and our problem is to find p * such that p * = arg max p∈E J(p) subject to state constraints (1) − (4).
As already mentioned, we plan to use the optimality conditions of the first order instead of a direct approximation of (7). It is worth to mention that the characterization via any equivalent approach linked with the Pontryagin principle gives only necessary conditions for the optimal trajectory and optimal open-loop control. However, this approach makes sense because we can rely on an existence result for (7). Assume first: (H0) For the sake of the simplicity in the notations, we treat the case where where k and S m are given positive real numbers. Likewise the porosity and the factor R are assumed to be positive real numbers. The initial data c 0 belongs to L 2 (Ω). The characterization of p * using the first order necessary optimality conditions of the first order now makes sense. Assume further: (H6) The function r (resp. D) is differentiable (resp. differentiable w.r.t. c).
(H7) The benefit function f belongs to 1 and the codomain of the inverse function The first order necessary conditions are derived using the Lagrangian associated with the problem (7) (see e.g., [3,5,9]). Canceling the variations of the Lagrangian associated to the problem with respect to the control p, to the state variable c and to the final state c(T, x), respectively provides the additional initial condition in (12), the optimality condition (10), the adjoint equation (11) and the boundary conditions in (12) below. With the state equations, we get the characterization of the optimal control p * by the following system in Ω T completed by the following initial and boundary conditions Notice that we have used a time variable change in the adjoint unknown to avoid handling an anti-diffusive equation for the adjoint variable. The time inversion operator, denoted by ℐ T , is defined by for any time-dependent function f. We end this section with a proper definition of a weak solution of (8)- (12).

DEFINITION OF THE NUMERICAL SCHEME AND SKETCH OF THE MAIN RESULTS
We propose a discretization of the Euler-Lagrange's version (8)-(12) of the problem. The difficulty lies in the coupling of the diffusive Equation (9) with the antidiffusive Equation (11). Of course, there are not enough starting conditions available at either end point to produce a unique approximated solution. Such problems are known as two-point boundary value problems and are notably more difficult to solve than initial value problems. Due to the structure of the Pontryagin's maximum principle, the numerical solution of forward-backward problems is a classical issue in optimal control and a consequent literature exists on the subject. There are three main classes of methods, namely direct and indirect (multiple) shooting, relaxation, iteration. From a computational performance point of view, it is generally accepted that the first two methods are the most efficient. Nevertheless, all of these methods require an intelligent guess for the missing initial conditions (even if very effective refinements exist, see for instance [12]). Moreover the convergence speed calculations for the three categories of methods are more or less based on the same type of hypothesis, roughly speaking the function describing the solution trajectories of the differential system must be Lipschitz in time. This leads to problems that can become very troublesome when handling with PDEs.
As we want to prove Theorem 1 through the convergence of our iterative scheme without too many restrictive assumptions, we choose here to use another iterative method. Basically, the idea is to construct a finite volume architecture embedded in an iterative fixed point approximation for handling the nonlinear coupling due to (10) within the problem. The finite volume scheme is based on a two-point flux approximation with upwind mobilities. The time dynamics is treated with a Euler scheme. The mathematical analysis of the discrete problem, in particular its convergences properties, will allow us to ensure the existence of a solution to the optimal control problem in the sense of Definition 2.2. Then, the performance of the scheme will be illustrated at the end of the article by numerical illustrations.
We first introduce some notations. Let be a regular and admissible mesh of the domain Ω, constituted of open convex polygons called control volumes with maximum size (diameter) denoted by h. For all K ∈ , we denote by |K| the measure of K. Let N(K) the set of the neighbors of K that is, the set of cells of which have a common interface with K. More precisely, N int (K) is the set of the neighbors of K located in the interior of while N ext (K) is the set of edges of K on the boundary . Recall (see [11]) that the admissibility of implies that Ω = ∪ K∈ K, K ∩ L = ∅ if K and L are two distinct elements of , that there exists a finite sequence of points (x K ) K∈ such that x K ∈ K and such that the straight line x K x L is orthogonal to the edge K,L . For all L ∈ N int (K) denote by K,L the distance between x K and x L , by K,L the interface between K and L, by K,L the unit normal vector to K,L outward to K. And for all ∈ N ext (K), denote by K, the distance from x K to . Further, we assume the classical shape regularity property for the mesh: Let N T be the number of time steps. We set Δt = T∕N T , t n = nΔt, 0 ≤ n ≤ N T . The discrete unknowns are denoted by (for w = c or w = ) To lighten the notations a bit, we will write the scheme in the case where the flow regime is independent of time, that is, the case where the source term does not depend on time. For the initialization of the fixed point algorithm, we use an initial control The discrete problem reads as follows: such that for any n ∈ [0, N T ] ∩ N, for any K ∈ and for any m ∈ N Sketch of the results. The main results of this article may be summarized as follows: • There exists at least one solution of the discrete problem (16)- (20). This result, along with estimates and stability properties, are proved in Section 4. • It can be proven that the sequence to a solution of the optimal control problem (7) (see . Appropriate assumptions ensuring the uniqueness of the optimal control are also provided.
With these results, we prove on the one hand the existence of a solution to the problem in the sense of Definition 2.2 and we give on the other hand a theoretical basis to numerical simulations.
Remark 3.1 The latter results improve the existence result stated in [1]: the assumptions are relaxed and a numerical scheme devoted to the approximation of the optimal control problem is proposed and analyzed.

ANALYSIS OF THE DISCRETE PROBLEM (16)-(20)
Once again, we introduce some notations. We denote by L 2 h (Ω) ⊂ L 2 (Ω) the space of functions which are piecewise constant on each control volume K ∈ . For any function P h ∈ L 2 h (Ω) and for all K ∈ , we denote by P K the constant value of p h in K. Conversely, we associate h , c m,n h , m,n h and p m,n h with the set of values defined by the scheme (16)- (20). In the same way, we define (c h , h ) associated with We define some approximate gradients which are piecewise constant on dual diamond cells. To this aim, if K ∈ and L ∈ N(K) with common (open) vertex (or face in the three-dimensional case) K,L , let T K,L be the convex polygon defined by If an edge of K belongs to N ext (K), we define T ext K, by The semi-norm ||p h || H 1 h coincides with the L 2 (Ω) norm of ∇ h p h , actually: We first derive some a priori estimates.
and, if Δt < R ∕4 * , Furthermore, the approximate control is such that Remark 4.2 A careful look at the proof of (22)-(23) below shows that it is sufficient to ensure that Δt < R ∕4|| min{0, }|| ∞ , meaning that the constraint is necessary only when water withdrawals from the aquifer are greater than natural recharge.
Proof. The bound (24) is of course a direct consequence of the assumption (H7). To prove the estimate (21), we multiply the hydraulic charge equation (17) by K . Summing the resulting equation over K ∈ and setting K,L = for the description of the transmissivity through the interface K,L , we get Then, by using a discrete integration by parts, the definition of K and the Cauchy-Schwartz inequality and Young inequalities, we obtain ∑ for any fixed > 0 (we will choose its value later). Then, according to the discrete Poincaré inequality, there exist a positive constant C 1 such that Thus, choosing such that 1∕ − C 1 > 0, we obtain (21). For the pollutant estimate, we inject (17) in the convective terms of (18) and we multiply the resulting equation by c m,n+1 After a discrete integration by parts, the latter equation reads Using the Cauchy-Schwarz and Young inequalities together with the assumptions on r, p, and f 1 , we infer from the latter result that, for any , > 0, there exist two positive real numbers C 3 and C 4 such that Notice that the last two terms of the right-hand side of the latter relation come from the continuity of the discrete trace function. The result also reads R 2Δt Now, choosing small enough for ensuring S m ∕2 − C 4 > 0, setting ∕2R , K 2 = 2 ( + C 4 + * ) ∕R , we claim that, according to the discrete Gronwall lemma, as soon as 0 < Δt ≤ Δt 0 , Δt 0 < 1∕K 2 . Some computations show that the latter condition may be ensured if Δt < R ∕4 * . Next, with (26) at hands, (25) implies the existence of a positive real number C 5 such that It remains to justify the estimate of the time derivative announced in (22). To this aim, multiply (18) by ∕Δt and sum over K in order to obtain A discrete integration by parts yields We rewrite f 1 in the form f 1 = ∑ i∈ℐ f 1,i Γ i where Γ 1 ∪ Γ 3 = ∪ i∈ℐ Γ i and f 1,i are given real numbers. Using the Cauchy-Schwartz and Young inequalities, we get for any > 0. Next, choosing small enough, multiplying the latter relation by Δt and summing over n yields (22) is proved. The proof of (23) follows the same lines. We now aim at stating an existence result for the discrete problem (16)- (19). First we recall the following Lemma. We have sufficient tools for proving the following existence result for the discrete problem (16)- (19).
Notice that ℱ h is a well-defined and continuous map. Bearing in mind Lemma 4.3, we aim at proving that [ℱ h (C ℳ ) , C ℳ ] > 0 for ||C ℳ || = r if r is large enough. Using Proposition 4.1 and repeating the computations that gave (25) we write If , and Δt are small enough we thus have C( , , Δt), C 1 ( , Δt) and C 3 are given positive real numbers. Due to the regularity of the mesh, ||c m,n+1 We now aim at analyzing the properties of the scheme. We begin by considering the stability of the scheme with regard to the initialization of the control.  ) and it satisfies Proof. The existence of a solution to (28)-(30) is ensured by an obvious analogous of Proposition 4.1. Set P m,n ,K = p m,n ,K − p m,n K , Ψ m,n ,K = c m,n ,K − c m,n K and Υ m,n ,K = m,n ,K − m,n K . These quantities solve the following problem: We have P 0,n ,K = . Thus multiplying (32) and (33) with m = 0 by respectively Ψ 0,n+1 ,K and Υ 0,n+1 ,K and summing for K ∈ we obtain Hence, after a discrete integration by parts in (34), using the Lipschitz character of r and the Gronwall Lemma we get first for any n ∈ [0, N T − 1] Next, using the continuity of r ′ and c D and (23), we infer from (35)-(36) that Then, using the continuity of the map ( p f ) −1 defining the discrete controls p with regard to the discrete adjoint unknowns , we infer from the latter estimates that there exists a positive real number C 4 (h, Δt) such that The next result is a first stone towards the construction of compactness properties.
Proposition 4.6 (space and time translations). There exist positive real numbers C, C 1 , C 2 , and C 3 such that, for any m ∈ N, for any ∈ R and any ∈]0, T[, we have Proof. With the estimates in Proposition 4.1 at hands, follow the lines of the proofs of Lemmas 4.3 and 4.7 in the classical textbook [11]. ▪

CONVERGENCE ANALYSIS
In this section, we claim and prove various convergence results. They are summarized in the following theorem. Proof. For the proof of Theorem 5.1 we could mind a priori that the outcome depends on the way we pass to the limit m → ∞, (h, Δt) → (0, 0) in (16)- (20). We thus detail all the possible combinations.
Moreover, according to Proposition 4.1, there exists m ∈ ( L 2 (Ω T ) ) such that, up to a subsequence, ∇ h c m h weakly converges to m in as h, Δt → 0. It remains to identify ∇c m and m in the sense of distributions. To this aim, it is enough to show that, as h → 0, Let h be small enough so that vanishes in T ext K, for all K ∈ . Since K,L = − L,K we have on the one hand for all t ∈]t n , t n+1 [ On the other hand, we infer from the definition of the discreet gradient that Inserting the two latter relations in the definition of E h , we get Due to the smoothness of , we have Thus, with the Cauchy-Scharwz inequality and the first estimate in (22), we obtain ) and such that, up to a subsequence, the following convergence results hold true as h, Δt → 0:  (20) as follows (see e.g., Section 3.6 in [19]) where p m,n h is defined by (20). The convergence results (43)-(48) are sufficient for passing to the limit (h, Δt) → (0, 0) in the latter three equations. The bounded nonlinear terms may be handled with the Lebesgue's dominated convergence theorem and the term containing c D with the Severini-Egorov's theorem (since c D has at most an affine growth with regard to c, if A is the arbitrarily small part of the domain where the uniform convergence of c m h is not ensured, the quantity | ∫ A c D h )| is also arbitrarily small). We obtain an analogous of the variational problem given in Definition 2.2 with (c m , m ) instead of (c, ).
It remains to let m → ∞. Since all the estimates in Proposition 4.1 are uniform with respect to m, they also hold for (c m , m ) (property of the lim inf). Hence The classical Aubin's compactness argument then allows to pass to the limit m → ∞, once again with the Lebesgue's and the Severini-Egorov's theorems for handling with the nonlinearities.

Convergence 2: letting simultaneously
The reader may check that we can follow the lines developed for (43)-(48) to obtain the analogous results as (h, Δt, m) → (0, 0, ∞), namely, up to a subsequence: where c n h and n h are defined by the following strongly coupled scheme: for any n ∈ [0, N T ] ∩ N, for any K ∈ Because of the strong coupling, this scheme has a strongly implicit form. But in any case, we can also pass to the limit m → ∞ in the estimates of Proposition 4.1. We get Furthermore, The analogous of Theorem 4.4 may be retrieved and the arguments developed for passing to the limit (h, Δt) → (0, 0) in the Convergence 1's setting may be reproduced. Due to the result of Convergence 2, there is a subsequence that satisfies c n h → c and n h → in L 2 (Ω T ) and a.e in Ω T , Theorem 5.1 is proved. ▪

A MANGASARIAN'S TYPE RESULT FOR TURNING BACK TO THE CONTROL PROBLEM
The latter section ensures the convergence of the numerical scheme (16)- (20). Of course, it remains to find the link between the limit function p = ( p f ) −1 ( ) and the optimal control p * solution of our original problem (7). It is given in the following theorem.

Proof.
We turn back to the maximization problem. Recall (c, p), defined in Theorem 5.1, satisfies the Euler-Lagrange's problem (8)- (12). Let q ∈ E. We denote by c q the concentration solution of the state equation associated with the control q, Using the convexity properties of f and −D, we compute Bearing in mind that p f (x, p) = ℐ T and c D(c(T, x)) = R (0, x), that c D(c) is given by the adjoint equation (11), integrating by parts, we obtain We thus study the sign of the last term in (54). Setting − = max{0, − }, we multiply the adjoint equation (11) by − and we integrate by parts over Here, since divv ∈ L ∞ (Ω T ) and curl v = 0, the velocity belongs to L ∞ (Ω T ). Since moreover r ′ (ℐ T c) + R + ℐ T ∈ L ∞ (Ω T ), using the Cauchy-Schwarz and the Young inequalities, we infer from the latter relation that, if c D ≥ 0 (that is if the cost function D is increasing with the pollutant concentration c, which seems like a reasonable choice), The Gronwall lemma then ensures that − = 0, that is ≥ 0 almost everywhere in Ω T . Turning back to (54), we conclude that if r is a concave function then J(p) ≥ J(q) for any q ∈ E, meaning that the limit value p obtained by the scheme (16)- (20) necessary solves the optimal control problem (7). Furthermore, if one of the functions −f , D, −r is strictly convex, we get J(p) > J(q) for any q ∈ E meaning that the solution of (7) is unique. It implies that there is no need to extract subsequences in the previous subsection. This ends the proof of Theorem 6.1. ▪

NUMERICAL EXPERIMENTS
In this part, some numerical results are given for illustrating the efficiency of the scheme (16)- (20). The scheme was implemented in the Python language using the sparse matrices defined in the Scipy module. This choice was guided by the impossibility of representing the different systems to be solved in a dense way in memory, especially in the three-dimensional case. For a reasonable discretization (up to 2 ⋅ 10 4 unknowns), the simulations was run on a laptop computer (2.3 GHz processor, 16 GB memory). For more important simulations ( ( 10 5 ) unknowns), we deported the calculations on a modest computer (Xeon E5-2630 quadcore 2.60GHz, 32 Go). The codes were neither optimized nor parallelized. For the examples below, the calculations took about 12 hours in the 3D case, a few hours in the 2D case. We also checked the stability of the results with mesh refinement for comforting this discretization choice. By the way, we are currently studying the possibility of carrying out our simulations on GPUs (many libraries exist but are not always stable) in order to significantly improve the speed of execution and thus increase the discretization.
First notice that for all the simulations the time and space steps for the discretization are given by Δt = h = 10 −2 . The number of loops (that is the number of increments m in (20)) necessary to arrive at the stopping criterion is denoted by M. Namely M = min m∈N {m + 1}, such that ||p m+1 h − p m h || L 2 (0,T;L 2 (Ω)) ≤ where is a given threshold. We used = 5.10 −6 . The aquifer is simply represented by a parallelepiped, namely (x, y) ∈ ]0,900[ 2 , z ∈ ] − 9, 0[, the values are given here in meters. We consider such a shallow aquifer because most of the groundwater reservoirs are thin geological formations. The aquifer is rescaled into the unit cube for the readability. More precisely, we introduce reference quantities, hereafter denoted with the subscript R for scaling the model. For instance, c R is the reference concentration such that c = c Rĉ . The surface length scale is x R = 900, the vertical length scale is z R = 9, the time scale is t R . We setx = x∕x R ∈ (0, 1) and y = y∕x R ∈ (0, 1),ẑ = z∕z R + 1 ∈ (0, 1) andt = t∕t R ∈ (0, T∕t R ). We skip the details. The rescaled version of equation (9) reads for instance The discretization with h = 10 −2 (resp. Δt = 10 −2 ) is then used in [0, 1] 3 (resp. in (0, T∕t R )).
A water production well is located on the parcel, at the point of coordinates (450; 450). It penetrates 2/5 of the thickness of the aquifer. Furthermore, we assume that the water is pumped all along the borehole and not only in deep areas that could be for a while unaffected by pollution. The water withdrawal in production wells is thus in the form q P = 100 wells . We will use this expression in the definition of the cost function below. The parameter is introduced for handling different pumping rates. Our simulations correspond to a production rate in the range 10 3 -10 4 m 3 h −1 . For the sake of the simplicity we assume that the recharge occurs in all the domain Ω. Globally the balance between outputs and inputs does not put the reservoir at risk: = q I − q P is assumed positive, in average of the order 10 −3 m 3 h −1 . Such values correspond in particular to French data (see [17]). The physical parameters have been chosen in agreement with the classical literature (see for instance Bear [4]). The soil characteristics are = 39.04 m⋅day −1 and = 0.3. In such a shallow aquifer the fluid displacement is essentially horizontal. Here moreover, for ensuring the interpretability of the results, the boundary conditions for the fluid velocity equation (1) are nonhomogeneous Neumann boundary conditions (scaled by parameters v for the inlet side and v r for the outlet side) such that the quasi-horizontal flow goes from left to right (see Figure 1) with a Darcy velocity of about 600 m year −1 . Once the inlet and outlet flow are known, the boundary conditions (4) for the concentration are of Robin's type. We choose f 1 such that they correspond to convective conditions, namely f 1 = v x c ext , x = or r, c ext being initialized at the value of the natural concentration, that is before pollution. This latter value is also used for c 0 .
The water molecular diffusion is S m = 9.4 ⋅ 10 −8 m 2 s −1 . It follows that the Péclet number is of order 600. For such a value, the dispersion cannot be neglected. Nevertheless, it is known to be well approximated by the Fickian hypothesis. In the present shallow aquifer with essentially horizontal water displacement, only the longitudinal dispersion is dominant. The last two points lead to consider a diffusion tensor rather than a diffusion coefficient, of the form: where S = 10 3 S m to account for dispersive effects (see [15] and references therein). One easily checks that using the matrix S m instead of a real number S m in the theoretical analysis do not change the results obtained in the previous sections.
We choose to illustrate the problem of agricultural pollution due to nitrates fertilization. Nitrates are almost not adsorbed by the soil and we thus set R = 1. For the reaction term r, we use the function r(c) = r c 2 with r = 10 −3 . The corresponding Damköhler number is thus of order 5 ⋅ 10 −3 . Small values of the Damköhler number, that is smaller than one, correspond to the fact that nitrate removal within the groundwater is controlled by water flux rather than reaction rate. A test using r = 10 −1 is also performed. The initial condition is c 0 = 5 mg L −1 . The benefit function is defined by with K 1 = 11.7888, K 2 = 8.6 ⋅ 10 −3 , K 3 = 50.465 and p = 1.5. It comes from Godart et al. [13] where the crop yield is depending on the yield value without nitrogen input and the asymptotic value when the input becomes important. Values of the parameters depend on the crop species. Here we choose the example of wheat crops. Notice that the Godard function has been modified for ensuring that the The real number depends on the price of cultivated species per ton and accounts for the production flow rate of the well and cost of treatment per cubic meter. The real numbers and in the definition of f and D may also be used for possibly tuning the weights given to the economic and environmental concerns in the model. For instance, choosing ≪ 1 and ≃ 1 allows to privilege the quality of the water. In all cases we used reasonable parameters to seek the trade-off between the polluter's benefits and the water treatment cost. Thus, for example, in all the simulation results there is indeed fertilizer application but the immediate vicinity of the water production well is systematically protected. The other parameters in the functional J are and . We fix = 0.05. The value of was chosen in the set { 10 −2 ,10,100 } for giving more or less weight to the legacy of pollution. The results of our first experiments are represented in Figures 2 and 3. They were especially designed to verify that the code gives intuitively logical results. But for decision support, a rigorous calculation is sometimes more convincing than an intuition. The performance of the iterative algorithm is illustrated by the second line of figures with in all cases the maximum number of tests M such that M ≤ 50. The greater the ratio ∕ , the more a policy of water resource protection is favored. The simulations results given in Figure 2 respect this point. They also show that the owner of the left parcel is the most penalized. This can be explained by the direction of the groundwater velocity, from left to right in our figures. Thus, everything that is spread on the left part is transported to the water production well. This penalty becomes less severe over time because what is applied in the last few days does not have time to reach the well before the end of the study period. In the last few days there is even a patch of high concentration (around y = 16 decameters) indicating that the farmer can afford to compensate for the lack of application during the first few days. In Figure 3, we focus on the influence of the prize of the pollution legacy by varying the parameter . Once again, the results are qualitatively in line with what one would expect. However, we note that the legacy cost may be as important as the immediate cleanup costs (the reader may also compare the results in Figures 2 and 3 where the pollution legacy weight was set as = 1). Such results make sense of a taxation policy that is uncorrelated with the immediate cost of water treatment.
For all remaining simulations the parameter was set to the value 1. The duration of the experiment in Figures 2-3 is T = 100 days which corresponds to the fertilizer application period for most cereal crops. In Figure 4 however, the simulations are performed for a longer duration, namely for T = 3000 days. First of all, there is a great stability in time of the quantity of fertilizer to be spread The only notable changes occur during the last 10 days of spreading, a windfall effect since the pollution will not have time to reach the water extraction well. Obviously we also notice that the spreading is totally forbidden on the left plot, except during the last 10 days. The optimal solution in long time is thus very different from the one in shorter time. From the point of view of the algorithm on the other hand, the convergence is not at all impacted by the change of scale. So far we have taken advantage of the symmetries of the problem to make only 2D representations. In order to convince the reader of the validity of this approach, two figures finally give the 3D representation of the aquifer. Figure 5 provides an opportunity to verify that the essential mechanism controlling the nitrate removal is the water flux rather than the reaction rate (bear in mind that the flow is oriented from the left side of the cube to the right side; as v = v r , the smaller v is, the higher the water velocity is). This is even clearer when looking at the nitrate concentration of the input (note the colorbar in Figure 6). There is another way to analyze these results. Indeed, we have chosen = 1, = 100 and yet the owner of the plot on the right is not particularly advantaged (compared to the 2D

FIGURE 5
Three-dimensional representation of the optimal fertilizer load. The simulation duration is T = 100 days, the results are figured at the 1st, 20th, and 80th days. Changing the parameter r in the reaction term highlights the influence of the natural remediation of the soil. Changing the entry and exit velocities v and v r illustrates the influence of the water flux FIGURE 6 Same experiment as in Figure 5. Nitrate concentrations on day 20 case presented in Figures 2-4). To arrive at this result we assumed for the 3D simulations that the geological nature of the soil to the right of the studied area is much less permeable to nitrate propagation. For the mathematical model this is taken into account by weighting the Robin boundary condition (9) so that it approaches a homogeneous Neumann boundary condition. For interpretation, this means that the importance of the nature of the soils adjacent to the plots studied should not be forgotten, since it can even balance out the effects of groundwater flow.

CONCLUSION
We have constructed a numerical scheme for the optimal control of groundwater pollution. By proving its convergence, we have also proved the existence and uniqueness of a solution to the optimal control problem. Numerical simulations were provided. Looking at the log scale representations of the convergence curves, we can see the good performance of the algorithm with a convergence speed that approaches the geometric convergence of a fixed point algorithm.
We would like to point out that our work can be extended to many other applications. The nonpoint sources of groundwater pollution linked to a profit are obviously numerous, agriculture, industry, mining and so forth. On the other hand, environmental pollution has many facets, behind the obvious cleaning costs that were considered in the present article. Health risk issues and thus cost of care could be included in the objective functional (see [16]). More surprisingly and perhaps less philanthropically, work has already been developed to express the devaluation of property values as a function of pollution levels (see [10]).

ACKNOWLEDGMENT
This work was partially supported by the New Aquitaine region and Europe (CPER ECONAT).

DATA AVAILABILITY STATEMENT
Data sharing not applicable to this article as no datasets were generated or analysed during the current study ORCID Catherine Choquet https://orcid.org/0000-0003-0632-0358