## 1. Introduction

[2] Coupled simulation-optimization models are increasingly used as decision models to find optimal solutions to groundwater management problems like optimal groundwater remediation design, optimal extraction of groundwater from coastal aquifers, and wetland management [*Gorelick*, 1983; *Gorelick et al.*, 1984; *Ahlfeld and Heidari*, 1994; *Hallaji and Yazicigil*, 1996; *Emch and Yeh*, 1998; *Wang and Zheng*, 1998; *Das and Datta*, 1999a, 1999b, 2000; *Cheng et al.*, 2000; *Mantoglou*, 2003; *Mantoglou et al.*, 2004; *Katsifarakis and Petala*, 2006; *Ayvaz and Karahan*, 2008; *Datta et al.*, 2009]. A major disadvantage of coupled simulation-optimization models is the huge computational burden caused by the repeated calls of the simulation model by the optimization algorithm. Recent studies have used nontraditional optimization techniques for solving groundwater management problems. These include genetic algorithms [*Aly and Peralta*, 1999; *Cheng et al.*, 2000; *Qahman et al.*, 2005; *Bhattacharjya and Datta*, 2005], evolutionary algorithms [*Mantoglou et al.*, 2004], simulated annealing [*Rao et al.*, 2004], and differential evolution [*Karterakis et al.*, 2007]. Population-based optimization algorithms like genetic algorithms can be used effectively to solve optimization problems that consider multiple objectives at a time, because the entire nondominated front of solutions can be obtained in a single run of the optimization model. However, with population-based optimization algorithms like genetic algorithms, several thousand evaluations of the simulation model may be required before an optimal solution is obtained. One possible approach to reducing the computational burden is to replace the simulation model with approximate surrogate models.
Despite their wide use in coupled simulation-optimization approaches, surrogate models are rarely accepted as reliable models for simulating groundwater flow and transport in practical applications. They are most often disfavored because of the inherently uncertain nature of such “black box” models.

[3] Use of the surrogate models adds an uncertainty component to the simulation-optimization framework. Predictive uncertainty of the surrogate models may have ambiguous effects on the optimality or even the feasibility of the obtained solutions. In the present study, we develop a coupled simulation-optimization model based on an ensemble of surrogate models for optimal management of coastal aquifers under the predictive uncertainty of the surrogate models. The model determines optimal extraction strategies for management. The ensemble of surrogate models is utilized to quantify the predictive uncertainty. The ensemble is then used with stochastic-optimization models to derive optimal extraction strategies.

[4] A number of different approaches have previously been used to solve the problem of optimal and sustainable extraction of groundwater from coastal aquifers. These approaches use either sharp interface or diffuse interface modeling of saltwater intrusion processes within a simulation-optimization framework. Analytical solutions exist for the sharp interface modeling approach and are comparatively easy to use in a simulation-optimization framework [*Iribar et al.*, 1997; *Dagan and Zeitoun*, 1998; *Mantoglou*, 2003; *Park and Aral*, 2004; *Mantoglou and Papantoniou*, 2008]. The diffuse interface modeling approach considers the flow and transport equations, which are linked by density dependence and need to be solved simultaneously. The coupled flow and transport equations are highly nonlinear and complex. Linking a numerical model that solves these equations with an optimization algorithm involves a huge computational burden [*Das and Datta*, 1999a, 1999b; *Dhar and Datta*, 2009].

[5] In the past few years surrogate models have been used as substitutes for the numerical simulation model within the optimization algorithm. A wide range of approximation surrogates have been used in different studies. Artificial Neural Networks (ANN) have been widely used as approximation surrogates for groundwater models [*Ranjithan et al.*, 1993; *Rogers et al.*, 1995; *Aly and Peralta*, 1999]. Neural network-based approximation surrogates were developed by *Bhattacharjya and Datta* [2005, 2009], *Yan and Minsker* [2006], *Kourakos and Mantoglou* [2009], and *Dhar and Datta* [2009] for use in simulation-optimization models. *McPhee and Yeh* [2006] used ordinary differential equation surrogates to replace the partial differential equations of groundwater flow and transport.

[6] Most of these surrogate modeling approaches assume a fixed surrogate model structure and optimize the surrogate model parameters to obtain the best fit between the explanatory and response variables. Even the most popularly used neural network surrogate modeling approach determines the optimal model architecture by trial and error [*Bhattacharjya and Datta*, 2005; *Rao et al.*, 2004].

[7] Irrespective of the method used, developing surrogate models from numerical simulation models results in a certain amount of uncertainty in the predicted variable. This is due to the uncertainty in the structure and parameters of the surrogate model. When used in a coupled simulation-optimization framework to derive optimal groundwater management strategies, the uncertainties in the surrogate model predictions affect the optimality of the resulting solution. Thus, while computational efficiency is achieved, the surrogate model introduces additional uncertainty, arising from its prediction residuals, into the simulation-optimization framework. Depending on the amount of uncertainty, the derived optimal solution may be rendered suboptimal or even infeasible. Hence, it is important to quantify the uncertainty in the surrogate model predictions and reformulate the optimization problem to address this uncertainty.

[8] In our study, an ensemble-based surrogate modeling approach based on genetic programming is used to predict the salinity intrusion into coastal aquifers resulting from groundwater extraction. Genetic programming (GP) has been used in hydrological applications in a few recent studies [*Dorado et al.*, 2002; *Makkeasorn et al.*, 2008; *Parasuraman and Elshorbagy*, 2008; *Wang et al.*, 2009]. GP has been used to develop prediction models for runoff, river stage, and real-time wave forecasting [*Babovic and Keijzer*, 2002; *Sheta and Mahmoud*, 2001; *Gaur and Deo*, 2008]. *Zechman et al.* [2005] developed a GP-based surrogate model for use in a groundwater pollutant source identification problem. An ensemble-based GP framework is able to quantify the uncertainty in both the model structure and parameters. *Parasuraman and Elshorbagy* [2008] illustrated the use of an ensemble-based genetic programming framework in the quantification of uncertainty in hydrological prediction. *Sreekanth and Datta* [2010] used genetic programming to develop surrogate models for coastal aquifer management and compared them with modular neural network-based surrogate models. Genetic programming-based surrogate models have the advantage that the surrogate model structure need not be fixed prior to the model development. Instead, the optimum model structure is evolved by the self-organizing ability of the genetic programming algorithm. It was found that GP-based surrogate modeling can produce simpler yet effective surrogate models with as few as 30 model parameters, against the 1155 weights used in the neural network model. It was also demonstrated that the evolution of surrogate model structures by GP, together with its parsimony in identifying the input variables, makes it more effective than an ANN model whose structure is determined by trial and error and arbitrary selection of variables.
In the present work we use GP to develop an ensemble of surrogate models that differ from each other, and we use this ensemble to obtain more reliable predictions of coastal aquifer processes for the management model.
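The idea of using the spread of an ensemble as a measure of predictive uncertainty can be illustrated with a minimal sketch. The three functions below are hypothetical stand-ins for GP-evolved surrogates; their forms and coefficients are invented for illustration and are not the models developed in this study. Each maps a list of pumping rates to a predicted salinity at one monitoring location.

```python
from statistics import mean, stdev

# Hypothetical stand-ins for GP-evolved surrogate models (illustrative only).
# Each takes a list of pumping rates and returns a predicted salinity.
ensemble = [
    lambda q: 0.020 * q[0] + 0.010 * q[1],
    lambda q: 0.018 * q[0] + 0.012 * q[1] + 0.5,
    lambda q: 0.025 * q[0] + 0.00001 * q[1] ** 2,
]


def ensemble_prediction(surrogates, pumping):
    """Return the mean and sample standard deviation of the ensemble predictions."""
    predictions = [model(pumping) for model in surrogates]
    return mean(predictions), stdev(predictions)


# Central estimate and spread for one (assumed) pumping pattern.
center, spread = ensemble_prediction(ensemble, [100.0, 200.0])
print(f"mean salinity = {center:.3f}, spread = {spread:.3f}")
```

A larger spread indicates that the surrogates disagree more for that pumping pattern, which is the kind of information a stochastic optimization formulation can exploit.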

[9] Different stochastic optimization techniques have been used in the past for optimal decision making under uncertainty [*Wagner and Gorelick*, 1987; *Tiedeman and Gorelick*, 1993; *McPhee and Yeh*, 2006]. Chance-constrained programming has been used in groundwater management by *Wagner and Gorelick* [1987, 1989], *Morgan et al.* [1993], and *Datta and Dhiman* [1996]. Another method for stochastic simulation optimization is the multiple-realization approach [*Wagner and Gorelick*, 1989; *Morgan et al.*, 1993; *Chan*, 1993; *Feyen and Gorelick*, 2004]. In this method, numerous realizations of uncertain model parameters are considered simultaneously in an optimization formulation. *He et al.* [2010] used a set of proxy simulators in a coupled simulation-optimization model for groundwater remediation design under parameter uncertainty of the proxy simulators. The proxy simulators were based on a stepwise response surface analysis. The residuals in the prediction were treated as stochastic variables and their deterministic equivalent was incorporated into the optimization model.
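As a rough illustration of the multiple-realization idea, the sketch below evaluates one candidate pumping strategy against several realizations of a salinity model at once. The realizations, the salinity limit, and the function names are assumptions made for this example; they are not taken from the cited studies.

```python
def satisfies_all_realizations(pumping, realizations, limit):
    """True if every realization predicts salinity at or below the limit."""
    return all(model(pumping) <= limit for model in realizations)


def reliability(pumping, realizations, limit):
    """Fraction of realizations in which the salinity limit is met."""
    met = sum(1 for model in realizations if model(pumping) <= limit)
    return met / len(realizations)


# Hypothetical realizations: the same linear response with an uncertain coefficient.
realizations = [lambda q, k=k: k * sum(q) for k in (0.010, 0.012, 0.015)]

strategy = [100.0, 200.0]  # assumed pumping rates at two wells
print(satisfies_all_realizations(strategy, realizations, limit=4.0))
print(reliability(strategy, realizations, limit=4.0))
```

Requiring the constraint to hold in every realization is the strictest form; a formulation may instead require it to hold in a specified fraction of realizations, which is what the `reliability` helper measures.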

[10] Most real-world groundwater management problems are multiobjective in nature, i.e., they involve more than one objective, and these objectives conflict with each other. The solution to such problems is an entire nondominated front of solutions which gives a trade-off between the different objectives considered. Population-based nontraditional optimization algorithms like genetic algorithms are well suited to solving such problems, as different members of the population can converge to different parts of the nondominated front, thus deriving the entire Pareto-optimal front in a single run of the optimization algorithm. The multiobjective genetic algorithm NSGA-II [*Deb*, 2001] is used in this study to solve the multiobjective optimal coastal aquifer management problem.
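The notion of a nondominated front can be made concrete with a small sketch. The code below is a plain pairwise dominance filter, not NSGA-II itself, and it assumes that every objective is to be minimized; the candidate objective values are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (objective 1, objective 2) for five candidates.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = nondominated(candidates)
print(front)
```

A maximization objective can be handled in this sketch by negating its values before filtering.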

[11] In this work our main objective is to develop an ensemble of surrogate models for predicting the saltwater intrusion process in coastal aquifers. The ensemble of surrogate models is used in a stochastic multiobjective optimization using the multiple-realization approach to derive robust optimal extraction strategies which are less sensitive to the uncertainties in the surrogate model predictions. This study considers the uncertainties of the surrogate models alone, and the numerical model is assumed to be certain. Two objectives of management are considered subject to the constraint of controlling saltwater intrusion. The first objective is to maximize the total pumping from the production wells tapping the aquifer. The second objective is to minimize the total pumping from a set of barrier wells which are used to hydraulically control saltwater intrusion. The salinity levels resulting from pumping are simulated using the surrogate simulation model. A chance-constrained optimization model is also developed for coastal aquifer management, taking into consideration the cumulative distribution function of the error residuals of surrogate model predictions. The optimal solutions obtained using these two methods are compared with the solution obtained using only a single surrogate model in the coupled simulation-optimization model.
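One common way to turn a chance constraint into a deterministic one is to shift the surrogate prediction by a quantile of the residual distribution. The sketch below uses an empirical quantile and assumes that a residual is defined as simulated minus predicted salinity; the residual values, the limit, and the reliability level are invented for illustration and do not reproduce the formulation developed later in this paper.

```python
import math


def empirical_quantile(residuals, alpha):
    """Smallest residual r such that at least a fraction alpha of the residuals are <= r."""
    ordered = sorted(residuals)
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[index]


def chance_constraint_met(predicted, residuals, limit, alpha):
    """Deterministic equivalent of P(predicted + residual <= limit) >= alpha."""
    return predicted + empirical_quantile(residuals, alpha) <= limit


# Hypothetical residuals (simulated minus surrogate-predicted salinity).
residuals = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, -0.2, 0.1, 0.4, -0.3]

print(empirical_quantile(residuals, 0.9))
print(chance_constraint_met(3.5, residuals, limit=4.0, alpha=0.9))
print(chance_constraint_met(3.7, residuals, limit=4.0, alpha=0.9))
```

Raising the reliability level `alpha` increases the quantile added to the prediction, so fewer pumping strategies remain feasible; this is the expected trade-off between reliability and total extraction.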

[12] The remaining part of this paper is structured as follows. Section 2 describes the framework of the coastal aquifer management model. Section 3 describes the development of the ensemble of surrogate models. Section 4 describes the formulation and implementation of optimization models using multiobjective genetic algorithms. Section 5 illustrates the application of the methodology using a case study. Section 6 summarizes and concludes the paper.