Approximation surrogates are used to substitute for the numerical simulation model within optimization algorithms in order to reduce the computational burden of the coupled simulation-optimization methodology. The practical utility of surrogate-based simulation-optimization has been limited mainly by the uncertainty in surrogate model simulations. We develop a surrogate-based coupled simulation-optimization methodology for deriving optimal extraction strategies for coastal aquifer management considering the predictive uncertainty of the surrogate model. Optimization models considering two conflicting objectives are solved using a multiobjective genetic algorithm. The objectives are to maximize the pumping from production wells and to minimize the barrier well pumping required for hydraulic control of saltwater intrusion. The density-dependent flow and transport simulation model FEMWATER is used to generate input-output patterns of groundwater extraction rates and resulting salinity levels. The nonparametric bootstrap method is used to generate different realizations of this data set. These realizations are used to train different surrogate models using genetic programming for predicting the salinity intrusion in coastal aquifers. The predictive uncertainty of these surrogate models is quantified, and an ensemble of surrogate models is used in the multiple-realization optimization model to derive the optimal extraction strategies. The multiple realizations refer to the salinity predictions obtained using different surrogate models in the ensemble. Optimal solutions are obtained for different reliability levels of the surrogate models. The solutions are compared against the solutions obtained using a chance-constrained optimization formulation and a single-surrogate-based model. The ensemble-based approach is found to provide reliable solutions for coastal aquifer management while retaining the advantage of surrogate models in reducing computational burden.
 Use of the surrogate models adds an uncertainty component to the simulation-optimization framework. Predictive uncertainty of the surrogate models may have ambiguous effects on the optimality or even the feasibility of the obtained solutions. In the present study, we develop a coupled simulation-optimization model based on an ensemble of surrogate models for optimal management of coastal aquifers under the predictive uncertainty of the surrogate models. The model determines optimal extraction strategies for management. The ensemble of surrogate models is utilized to quantify the predictive uncertainty. The ensemble is then used with stochastic-optimization models to derive optimal extraction strategies.
 Previously, a number of different approaches have been used to solve the problem of optimal and sustainable extraction of groundwater from coastal aquifers. These approaches use either sharp interface or diffuse interface modeling of saltwater intrusion processes within a simulation-optimization framework. Analytical solutions exist for the sharp interface modeling approach and are comparatively easy to use in a simulation-optimization framework [Iribar et al., 1997; Dagan and Zeitoun, 1998; Mantoglou, 2003; Park and Aral, 2004; Mantoglou and Papantoniou, 2008]. The diffuse interface modeling approach considers the flow and transport equations, which are linked together by the density dependence and need to be solved simultaneously. The coupled flow and transport equations are highly nonlinear and complex. Linking a numerical model which solves these equations with an optimization algorithm involves a huge computational burden [Das and Datta, 1999a, 1999b; Dhar and Datta, 2009].
 Most of these surrogate modeling approaches assume a fixed surrogate model structure and optimize the surrogate model parameters to obtain the best fit between the explanatory and response variables. Even the most popularly used neural network surrogate modeling approach determines the optimal model architecture by trial and error [Bhattacharjya and Datta, 2005; Rao et al., 2004].
 Irrespective of the method used, developing surrogate models from numerical simulation models results in a certain amount of uncertainty in the predicted variable. This is due to the uncertainty in the structure and parameters of the surrogate model. When used in a coupled simulation-optimization framework to derive optimal groundwater management strategies, the uncertainties in the surrogate model predictions affect the optimality of the resulting solution. Thus, while computational efficiency is achieved, the surrogate model introduces additional mathematical uncertainty, resulting from the residuals, into the simulation-optimization framework. Depending on the amount of uncertainty, the derived optimal solution may be rendered suboptimal or even infeasible. Hence, it is important to quantify the uncertainty in the surrogate model predictions and reformulate the optimization problem to address this uncertainty.
 In our study, an ensemble-based surrogate modeling approach based on genetic programming is used to predict the salinity intrusion into coastal aquifers resulting from groundwater extraction. Genetic programming (GP) has been used in hydrological applications in a few recent studies [Dorado et al., 2002; Makkeasorn et al., 2008; Parasuraman and Elshorbagy, 2008; Wang et al., 2009]. GP has been used to develop prediction models for runoff, river stage, and real-time wave forecasting [Babovic and Keijzer, 2002; Sheta and Mahmoud, 2001; Gaur and Deo, 2008]. Zechman et al. developed a GP-based surrogate model for use in a groundwater pollutant source identification problem. An ensemble-based GP framework is able to quantify the uncertainty in both the model structure and parameters. Parasuraman and Elshorbagy [2008] illustrated the use of an ensemble-based genetic programming framework in the quantification of uncertainty in hydrological prediction. Sreekanth and Datta [2010] used genetic programming to develop surrogate models for coastal aquifer management and compared them with modular neural network-based surrogate models. GP-based surrogate models have the advantage that the surrogate model structure need not be fixed prior to model development. Instead, the optimum model structure is evolved by the self-organizing ability of the genetic programming algorithm. It was found that GP-based surrogate modeling can produce simpler yet effective surrogate models, with as few as 30 model parameters against the 1155 weights used in the neural network model. It was also demonstrated that the evolution of surrogate model structures by GP, together with its parsimony in identifying the input variables, makes it more effective than an artificial neural network (ANN) model whose structure is determined by trial and error and arbitrary selection of variables. In the present work we use GP to develop an ensemble of surrogate models that differ from each other, and we use this ensemble to obtain more reliable predictions of coastal aquifer processes for use in the management model.
 Most real-world groundwater management problems are multiobjective in nature, i.e., they involve more than one objective, and these objectives conflict with each other. The solution to such problems is an entire nondominated front of solutions which gives a trade-off between the different objectives considered. Population-based nontraditional optimization algorithms like genetic algorithms are well suited to such problems because different members of the population can converge to different parts of the nondominated front, thus deriving the entire Pareto-optimal front in a single run of the optimization algorithm. The multiobjective genetic algorithm NSGA-II [Deb, 2001] is used in this study to solve the multiobjective optimal coastal aquifer management problem.
 In this work our main objective is to develop an ensemble of surrogate models for predicting the saltwater intrusion process in coastal aquifers. The ensemble of surrogate models is used in a stochastic multiobjective optimization based on the multiple-realization approach to derive robust optimal extraction strategies which are less sensitive to the uncertainties in the surrogate model predictions. This study considers the uncertainties of the surrogate models alone, and the numerical model is assumed to be certain. Two objectives of management are considered subject to the constraint of controlling saltwater intrusion. The first objective is to maximize the total pumping from the production wells tapping the aquifer. The second objective is to minimize the total pumping from a set of barrier wells which are used to hydraulically control saltwater intrusion. The salinity levels resulting from pumping are simulated using the surrogate simulation model. A chance-constrained optimization model is also developed for coastal aquifer management, taking into consideration the cumulative distribution function of the error residuals of the surrogate model predictions. The optimal solutions obtained using these two methods are compared with the solution obtained using only a single surrogate model in the coupled simulation-optimization model.
 The remaining part of this paper is structured as follows. Section 2 describes the framework of the coastal aquifer management model. Section 3 describes the development of the ensemble of surrogate models. Section 4 describes the formulation and implementation of optimization models using multiobjective genetic algorithms. Section 5 illustrates the application of the methodology using a case study. Section 6 summarizes and concludes the paper.
2. Outline of the Coastal Aquifer Management Methodology
 The proposed coastal aquifer management methodology using coupled simulation optimization has essentially two components. The first one is the ensemble of surrogate models for simulating the physical process under consideration. In this work we consider the saltwater intrusion in coastal aquifers as a function of the groundwater extractions from the aquifer. The second component is an optimization model used to optimize the groundwater extraction strategies such that the resulting salinity levels are maintained within prespecified limits. The genetic programming-based surrogate models are trained using randomly generated input-output patterns of extraction rates and resulting salinity levels. The input-output patterns are generated using a three-dimensional simulation model for simulating coupled flow and transport called FEMWATER [Lin et al., 1997]. Nonparametric bootstrap method [Efron and Tibshirani, 1993] is used together with genetic programming to construct the ensemble of surrogate models. The ensemble models are then linked to a multiobjective genetic algorithm to obtain the optimal groundwater extraction rates. The different elements of the proposed methodology for developing optimal coastal aquifer management strategies are described in detail in sections 3 and 4.
3. Ensemble of Surrogate Models
 The following procedure was adopted to develop the ensemble of surrogate models.
3.1. Design of Experiments
 The design of experiments is the first step required for training the GP-based surrogate models. Developing a surrogate model based on genetic programming involves learning from input-output patterns. In the case of the coastal aquifer management problem, the inputs are the rates of groundwater abstractions from different potential locations within the aquifer and outputs are the resulting salinity concentrations. The decision space for the problem under consideration is a multidimensional space representing the combinations of groundwater abstraction rates from different locations at various time periods. For the surrogate models to perform satisfactorily, the training patterns should be representative of the entire decision space. Uniformly distributed Latin hypercube samples (LHS) of input patterns are generated from the decision space to train the genetic programming-based surrogate models.
 LHS, a stratified-random procedure, provides an efficient way of sampling variables from their distributions [Iman and Conover, 1982]. LHS involves sampling N values from the prescribed distribution of each of k variables X1, X2, …, Xk. The cumulative distribution for each variable is divided into N equiprobable intervals. A value is selected randomly from each interval. The N values obtained for each variable are paired randomly with those of the other variables.
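 The sampling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes uniform distributions between fixed bounds, uses NumPy, and borrows the pattern count (230), the number of decision variables (33), and the pumping bounds (0 to 1300 m3/d) from the case study in section 5. The function name latin_hypercube is hypothetical.

```python
import numpy as np

def latin_hypercube(n_samples, lower, upper, rng):
    """Latin hypercube sample assuming uniform distributions on [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    k = lower.size
    # One random value inside each of the n_samples equiprobable intervals, per variable.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, k))) / n_samples
    # Pair the values randomly across variables by shuffling each column independently.
    for j in range(k):
        u[:, j] = rng.permutation(u[:, j])
    return lower + u * (upper - lower)

rng = np.random.default_rng(0)
# 230 extraction patterns for 33 pumping variables bounded by 0 and 1300 m3/d.
samples = latin_hypercube(230, np.zeros(33), np.full(33, 1300.0), rng)
```

 Each column of samples then contains exactly one value in each of the 230 strata of its variable, which is the property that makes the patterns representative of the whole decision space.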
3.2. Numerical Simulation Model
 Once the input patterns of groundwater abstractions are generated, the resulting salinity levels corresponding to each pattern are computed. The numerical simulation model FEMWATER [Lin et al., 1997] is used for this. FEMWATER is a finite element-based 3-D coupled flow and transport simulation model. The density dependent flow and transport equations used in FEMWATER are given as follows [Lin et al., 1997; Sreekanth and Datta 2010]:
where F is the storage coefficient, h is the pressure head, t is time, K is the hydraulic conductivity tensor, z is the potential head, q is the source and/or sink, ρ is the water density at chemical concentration C, ρ0 is the referenced water density at zero chemical concentration, ρ* is the density of either the injection fluid or the withdrawn water, θ is the moisture content, β′ is the modified compressibility of water, n is the porosity of the medium, S is the saturation, μ is the dynamic viscosity of water at chemical concentration C, μ0 is the referenced dynamic viscosity of water at zero chemical concentration, k is the permeability tensor, ks is the relative permeability or relative hydraulic conductivity, Kso is the referenced saturated hydraulic conductivity tensor, a1 and a2 are the parameters used to define the concentration dependence of water density, and C is the chemical concentration. In the transport equation, ρb is the bulk density of the medium, C is the material concentration in the aqueous phase, Sa is the material concentration in the adsorbed phase, V is the discharge, ∇ is the del operator, D is the dispersion coefficient tensor, α′ is the compressibility of the medium, λ is the decay constant, m is qCin (artificial mass rate), q is the source rate of water, Cin is the material concentration in the source, Kw is the first-order biodegradation rate constant through the dissolved phase, Ks is the first-order biodegradation rate constant through the adsorbed phase, |V| is the magnitude of V, δ is the Kronecker delta tensor, aT is the lateral dispersivity, aL is the longitudinal dispersivity, am is the molecular diffusion coefficient, and τ is the tortuosity.
3.3. Genetic Programming
 Genetic programming [Koza, 1994] is used in this study to evolve surrogate models for modeling the salinity intrusion in the coastal aquifers resulting from groundwater abstraction. Genetic programming is an evolutionary algorithm similar to genetic algorithm in that it uses the concepts of natural selection and genetics in evolutionary computation. For a given model structure and predefined parameter space, the genetic algorithm optimizes the parameter values. Genetic programming has an additional degree of freedom which allows an optimum model structure to evolve parallel to optimizing the parameter values. Thus, genetic programming identifies the best model structure for simulating the process under consideration while simultaneously estimating the optimal parameter values. Genetic programming learns from examples. The major inputs for the genetic programming model are (1) patterns for learning, (2) fitness function (e.g., minimizing the squared error term), (3) functional and terminal set, and (4) parameters for the genetic operators like the crossover and mutation probabilities.
 The functional set consists of basic mathematical operators and functions such as addition, subtraction, multiplication, division, and trigonometric functions. The choice of the functional set determines the complexity of the model. For example, a functional set with only addition and subtraction results in a linear model structure, whereas a functional set which includes trigonometric functions results in highly nonlinear model structures. The terminal set consists of the constants and variables of the model. The total number of parameters used can be limited to a prespecified number in order to prevent overfitting of the model. By using the functional and terminal sets, valid, syntactically correct programs can be developed. The parse tree notation of two such programs is illustrated in Figure 1. Two parent genetic programs are shown in Figures 1a and 1b. The parent programs are crossed over at the dashed sections, and the mutation operator changes the value of the constant 2 to 6, to generate the two new offspring genetic programs shown in Figures 1c and 1d.
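 The parse tree operations described above can be illustrated with a small Python sketch. The two parent programs below are hypothetical and are not the trees of Figure 1; programs are represented as nested tuples (operator, left, right), with strings as variables and numbers as constants. The sketch swaps one subtree between the parents (crossover) and then changes the constant 2 to 6 (mutation).

```python
import operator

# Functional set: addition, subtraction, and multiplication.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, variables):
    """Evaluate a parse tree for given values of the terminal variables."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, variables), evaluate(right, variables))
    if isinstance(tree, str):
        return variables[tree]  # variable from the terminal set
    return tree                 # numeric constant from the terminal set

def subtree_at(tree, path):
    """Return the subtree reached by following child indices (1 = left, 2 = right)."""
    for idx in path:
        tree = tree[idx]
    return tree

def replace_subtree(tree, path, new_subtree):
    """Return a copy of tree with the subtree at path replaced by new_subtree."""
    if not path:
        return new_subtree
    op, left, right = tree
    if path[0] == 1:
        return (op, replace_subtree(left, path[1:], new_subtree), right)
    return (op, left, replace_subtree(right, path[1:], new_subtree))

parent_a = ("+", ("*", "x", 2), "y")    # x*2 + y
parent_b = ("-", "x", ("*", "y", "y"))  # x - y*y

# Crossover: exchange the left branch of parent_a with the right branch of parent_b.
child_a = replace_subtree(parent_a, (1,), subtree_at(parent_b, (2,)))
child_b = replace_subtree(parent_b, (2,), subtree_at(parent_a, (1,)))
# Mutation: change the constant 2 in child_b to 6.
child_b = replace_subtree(child_b, (2, 2), 6)

point = {"x": 3, "y": 4}
print(evaluate(child_a, point), evaluate(child_b, point))
```

 Because both offspring are built only from members of the functional and terminal sets, they remain syntactically valid programs.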
 In the present work, the operators addition, subtraction, and multiplication are considered in the initial functional set. Other functions are then added to the functional set one at a time, in order of increasing complexity and nonlinearity. In general, an addition or subtraction operation would be considered in the functional set before multiplication; however, given the nonlinear nature of the saltwater intrusion process, multiplication and division are considered in the initial functional set itself. An additional function or operator is retained only if its inclusion improves the fitness measure.
 GP starts with a set of randomly generated, syntactically correct programs. Each program is evaluated on N instances, where N is the number of patterns in the training data set generated using Latin hypercube sampling and the numerical simulation model. The input-output data set is split into halves. One half is used to train the GP models and the other half is used to test the developed genetic programs. Testing here refers to the validation of the model. The testing data set is not used in the fitness function evaluation; instead, it is used to evaluate how the model performs on a new set of data. The evaluations based on the testing data set are also used to pick the best programs from the population.
 The fitness value is assigned by comparing the outcome of the program on each of these patterns with the actual outcome. The fitness function is usually the root mean square error (RMSE). The programs are ranked based on the fitness value, and new programs are created using the crossover and mutation operators. This process of evolving new programs by means of genetic operators, followed by fitness evaluation, is performed for a specified number of generations to obtain the best-fit genetic program.
3.4. Nonparametric Bootstrap Method
 The nonparametric bootstrap method is used to generate different realizations of the actual input-output patterns of groundwater abstractions and salinity concentrations. Each realization of the data set is then used to train a separate surrogate model, which yields an ensemble of surrogate models for the prediction of salinity levels. Each surrogate model is distinctly different from the rest of the ensemble because of the difference in the training data set and because the population-based optimization leads the search algorithm to identify multiple optima. The differences in model structure and parameters among the surrogate models are a manifestation of the uncertainty in the model structure and parameters themselves. The methodology used by Parasuraman and Elshorbagy [2008] is followed to accomplish nonparametric bootstrap sampling. The data set obtained using Latin hypercube sampling and the numerical simulation model is assumed to be a representative set of input-output values from the entire population in the decision space. A training data set T of size N is generated using Latin hypercube sampling and the numerical simulation model. Different realizations of this data set are obtained using the nonparametric bootstrap method. For this, a bootstrap size B is chosen. Then B different data sets, each of size N, are obtained by repeated random sampling with replacement from the set T. Thus, in each bootstrap sample set TB, some input-output patterns from the training data set T are repeated several times. The bootstrap sample sets TB differ from each other only in the repetition of some patterns and the elimination of others from the original data set. The repetition of patterns in the bootstrap sample causes differential weighting of these patterns. This results in models which differ in their predictive capability in different regions of the decision space of the prediction model. It also triggers convergence to multiple optimal solutions while training the prediction model. Thus, each surrogate model is an optimal model for the prediction, but the models differ in their predictive capability in different regions of the decision space, depending on the weights assigned to the patterns from each region.
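 The resampling step can be sketched as follows. This is a minimal illustration using NumPy, with placeholder random data standing in for the FEMWATER input-output patterns; the function name bootstrap_realizations and the bootstrap size of 30 used in the example are assumptions made for illustration only.

```python
import numpy as np

def bootstrap_realizations(X, y, n_boot, rng):
    """Yield n_boot resampled data sets, each of the same size as (X, y)."""
    n = X.shape[0]
    for _ in range(n_boot):
        # Sampling with replacement: some patterns repeat, others are left out.
        idx = rng.integers(0, n, size=n)
        yield X[idx], y[idx], idx

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1300.0, size=(230, 33))  # placeholder extraction patterns
y = rng.uniform(0.0, 1.0, size=230)           # placeholder salinity values
realizations = list(bootstrap_realizations(X, y, 30, rng))
```

 Each realization would then be passed to a separate surrogate model training run; the repeated indices in idx are what produce the differential weighting of patterns described above.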
 The performance of each surrogate model is determined by evaluating the root mean square error (RMSE) on the testing data set. After the RMSE has been computed for each surrogate model in the ensemble, the standard deviation and coefficient of variation of these errors are computed. The coefficient of variation of these errors is taken as a measure of the predictive uncertainty of the models. The number of surrogate models in the ensemble is determined by performing an incremental statistical analysis of the ensemble performance, i.e., surrogate models are added to the ensemble sequentially and the resulting uncertainty is evaluated. The RMSE of the resulting ensemble is also computed after the addition of each surrogate model, using the testing data sets of all the surrogates in the ensemble taken together at each stage of addition. Specifically, an ensemble of 10 surrogate models is considered initially. The RMSE of the salinity concentration predictions of each surrogate model is computed, and the coefficient of variation of these RMSE values is taken as the measure of uncertainty in the ensemble of models. New surrogate models are then added to the ensemble one at a time, and the resulting RMSE and uncertainty are computed. This procedure is repeated until further addition of surrogate models produces no significant change in the uncertainty of the ensemble. The number of surrogate models in the ensemble at this stage is taken as the optimum ensemble size.
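 The incremental analysis can be sketched as below. The text does not state what counts as a significant change, so the tolerance tol is an assumed parameter, and the RMSE values in the example are made up purely to exercise the function. The function names are hypothetical.

```python
import numpy as np

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()

def choose_ensemble_size(member_rmse, start=10, tol=0.01):
    """Grow the ensemble one model at a time from `start` members and stop when
    adding a model changes the coefficient of variation by less than `tol`.
    `tol` is an assumed threshold for 'no significant change'."""
    cv_prev = coefficient_of_variation(member_rmse[:start])
    for m in range(start + 1, len(member_rmse) + 1):
        cv_new = coefficient_of_variation(member_rmse[:m])
        if abs(cv_new - cv_prev) < tol:
            return m - 1  # the m-th model no longer changes the uncertainty
        cv_prev = cv_new
    return len(member_rmse)

# Made-up testing RMSE values for 30 candidate surrogate models.
member_rmse = [1.0, 2.0] * 5 + [1.5] * 20
print(choose_ensemble_size(member_rmse))
```

 With real surrogate models, member_rmse would hold the testing RMSE of each model in the order in which the models are added to the ensemble.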
4. Optimization Models
 The main objective of this study is to develop a coastal aquifer management model which uses an ensemble of surrogate models to simulate the saltwater intrusion process. Two approaches of optimization addressing the uncertainty in surrogate model predictions are used in this study. The first one is based on a stochastic simulation-optimization method called multiple realization or stacking approach [Wagner and Gorelick, 1989; Morgan et al., 1993; Chan, 1993; Feyen and Gorelick, 2005]. The second approach uses a chance-constrained optimization model [Morgan et al., 1993; Datta and Dhiman, 1996].
 The stochastic optimization accounts for the uncertainty in the surrogate model structures and parameters. In the multiple-realization approach all the surrogate models in the ensemble are independently linked to the optimization model, i.e., if the ensemble consists of 10 different surrogate models then the optimization formulation has a stack of 10 constraints representing the surrogate models. Thus the optimal solution will be subject to satisfying each of these constraints representing the different surrogate models which differ from each other due to the model structure and parameter uncertainty.
4.1. Multiobjective Optimization Using a Multiple-Realization Approach
 Two conflicting objectives are considered in this study. The first is the maximization of the total beneficial pumping from the aquifer, and the second is the minimization of the total pumping from the barrier wells which are used to hydraulically control saltwater intrusion. The constraints limit the salinity concentrations resulting from the groundwater extraction to specified values. The mathematical formulation of this multiobjective optimization problem using the multiple-realization approach is as follows:
Here the decision variables are the pumping from the nth production well during the tth time period and the pumping from the mth barrier well during the tth time period, and the constrained quantity is the rth realization of the concentration at the ith location at the end of the management time horizon. This realization is obtained from the rth surrogate model for the salinity at the ith location, as given by (9). M, N, and T are, respectively, the total number of production wells, total number of barrier wells, and total number of time steps in the management model. Constraint (10) imposes the maximum permissible salt concentration at the monitoring well locations. Constraints (11) and (12) define the lower and upper bounds of the pumping from the production wells and barrier wells, respectively.
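 For orientation, a formulation consistent with the definitions above can be written as follows. All symbols here are assumed for illustration and need not match the original notation: Q and q denote production and barrier well pumping, ξ denotes a surrogate model, and the sums over n and m run over all production and barrier wells, respectively.

```latex
\begin{align}
\text{Maximize}\quad f_1 &= \sum_{n}\sum_{t=1}^{T} Q_n^t \tag{7}\\
\text{Minimize}\quad f_2 &= \sum_{m}\sum_{t=1}^{T} q_m^t \tag{8}\\
c_i^r &= \xi_i^r(\mathbf{Q}, \mathbf{q}) \tag{9}\\
c_i^r &\le c_{\max} \quad \forall\, i, r \tag{10}\\
Q_{\min} \le Q_n^t &\le Q_{\max} \tag{11}\\
q_{\min} \le q_m^t &\le q_{\max} \tag{12}
\end{align}
```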
 With the multiple-realization approach, optimal solutions with different reliability values can be obtained. The reliability value is the fraction of surrogate models in the entire ensemble whose salinity predictions satisfy the imposed constraints of maximum salinity levels in the optimization model. For example, if there are N different surrogate models in the ensemble, it is possible to obtain an optimal solution with a reliability of n/N by constraining the optimization model to satisfy the constraints imposed by at least n surrogate models. The reliability of the optimal solution is close to 1 when the constraints imposed by all N surrogate models are satisfied. However, this reliability pertains to the uncertainty in the ensemble of surrogate models only.
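 The reliability measure can be sketched in Python as the fraction of ensemble members whose predictions respect the salinity limit. The four linear "surrogates" below are toy stand-ins invented for the example, not GP models, and the function names are hypothetical.

```python
def reliability(candidate, surrogates, c_max):
    """Fraction of surrogate models whose predicted salinity does not exceed c_max."""
    satisfied = sum(1 for model in surrogates if model(candidate) <= c_max)
    return satisfied / len(surrogates)

def is_feasible(candidate, surrogates, c_max, n_required):
    """Feasible if at least n_required surrogate models satisfy the salinity limit."""
    satisfied = sum(1 for model in surrogates if model(candidate) <= c_max)
    return satisfied >= n_required

# Toy ensemble: salinity grows linearly with total pumping at four different rates.
surrogates = [lambda q, b=b: b * sum(q) for b in (1e-4, 2e-4, 3e-4, 4e-4)]
candidate = [1000.0, 500.0]  # pumping rates in m3/d

print(reliability(candidate, surrogates, c_max=0.5))
```

 Requiring n_required equal to the ensemble size corresponds to the most conservative solution, and lowering it trades reliability for additional pumping.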
4.2. Chance-Constrained Approach
 The optimal solutions obtained by the multiple-realization approach for different reliabilities are compared to the solutions obtained using a chance-constrained optimization formulation. The chance-constrained formulation uses the same objective functions and pumping bounds as in (7), (8), (11), and (12). The constraints given by (9) and (10) are replaced as follows:
where ci is the salinity concentration at the ith location at the end of the management time horizon, the error term is the error in the salinity concentration prediction for the ith location, and the ensemble mean is the average of the salinities at the ith location predicted by the ensemble of surrogate models. Rel is the reliability level of the ensemble prediction that the predicted concentration is less than cmax. This reliability is based on the cumulative distribution function of the error residuals in the salinity level prediction by the surrogate models. The reliability is constrained to be greater than or equal to a specified value. The probabilistic constraint in (14) is converted into its deterministic equivalent as follows:
Here the inverse cumulative distribution function of the residuals in salinity prediction at the ith location gives the prediction error corresponding to the specified reliability.
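 The conversion to a deterministic equivalent can be sketched as follows, assuming the residual is defined so that the true concentration equals the ensemble-mean prediction plus the residual, and using the empirical quantile of a residual sample as the inverse cumulative distribution function. The residual values in the example are synthetic, and the function names are hypothetical.

```python
import numpy as np

def error_at_reliability(residuals, rel):
    """Empirical inverse CDF of the residuals evaluated at reliability rel."""
    return np.quantile(residuals, rel)

def satisfies_chance_constraint(ensemble_predictions, residuals, c_max, rel):
    """Deterministic equivalent: ensemble mean plus the rel-quantile of the
    residuals must not exceed the permissible concentration c_max."""
    c_bar = np.mean(ensemble_predictions)
    return bool(c_bar + error_at_reliability(residuals, rel) <= c_max)

# Synthetic residuals spread evenly between -0.1 and 0.1 kg/m3.
residuals = np.linspace(-0.1, 0.1, 101)

print(satisfies_chance_constraint([0.40, 0.41, 0.42], residuals, c_max=0.5, rel=0.9))
print(satisfies_chance_constraint([0.43, 0.44], residuals, c_max=0.5, rel=0.9))
```

 A higher reliability selects a larger error quantile and therefore a tighter effective limit on the ensemble-mean prediction.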
 A coupled simulation-optimization model with a single-surrogate model predicting the salinity levels at each monitoring location is also developed for comparative evaluation. The same optimization formulation as in (7)–(12) is used for this purpose except that salinity prediction by the ensemble represented by (9) is replaced as follows:
Here the salinity at the ith location is predicted by the best surrogate model, in terms of the least value of the objective function obtained in the GP model. The original data set is used to develop this surrogate model instead of a bootstrap sample.
4.3. Multiobjective Genetic Algorithm
 The multiobjective genetic algorithm NSGA-II [Deb, 2001] is used to solve the multiobjective coastal aquifer management problem. Like a GA, NSGA-II uses a population of candidate solutions together with the GA operators crossover, mutation, and selection to evolve improved solutions to the optimization problem over a number of generations. In addition, NSGA-II organizes the members of the population into nondominated fronts after each generation, based on the conflicting objectives of optimization. Thus, in a single run, NSGA-II is able to generate the entire Pareto-optimal set of solutions at the end of the specified number of generations.
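 The notion of a nondominated front for the two objectives of this study can be sketched as below. This covers only the extraction of the first nondominated front; it is not a full NSGA-II implementation, which also involves ranking of subsequent fronts, crowding distance, and the genetic operators. The candidate values are invented for the example.

```python
def dominates(a, b):
    """a and b are (production pumping, barrier pumping) pairs.
    Production pumping is maximized and barrier pumping is minimized."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def nondominated_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Invented candidate solutions: (total production pumping, total barrier pumping) in m3/d.
candidates = [(8000, 1000), (9000, 1500), (7000, 1200), (9500, 2500), (9000, 2000)]
front = nondominated_front(candidates)
print(front)
```

 The points on the front show the trade-off: more production pumping is obtained only at the cost of more barrier well pumping.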
4.4. Ensemble-Based Coupled Simulation-Optimization Model
 The coastal aquifer management model makes use of a coupled simulation-optimization framework to derive the optimal groundwater extraction strategies for coastal aquifers. The ensemble of surrogate models for simulating the aquifer responses in terms of salinity concentrations is coupled with the optimization model by linking each surrogate model separately with the optimization algorithm. The multiobjective genetic algorithm randomly generates candidate solutions, which are the groundwater extraction rates for the different time periods within the management horizon. The aquifer responses corresponding to each of these patterns of extraction are obtained from the ensemble of surrogate models. All generated candidate solutions are evaluated for feasibility and fitness. New candidate solutions are generated using the genetic algorithm operators. The procedure is repeated for a number of generations until the termination criteria are satisfied. The solutions are progressively improved to converge to the final Pareto-optimal front. A schematic representation of the ensemble-based simulation-optimization model is shown in Figure 2.
 Once the optimal solution is obtained, its validity is checked by simulating the aquifer processes by using the optimal pumping values in the actual numerical simulation model FEMWATER. The residual in the salinity prediction, i.e., the difference between the surrogate-predicted value and the numerically simulated value, is evaluated for five optimal solutions in different regions of the Pareto-optimal front. This is performed for the optimal solutions obtained using the three optimization models, namely, single-surrogate model, ensemble-based model, and the chance-constrained model.
5. Case Study
 In order to illustrate the application of the proposed methodology, it is applied to derive optimal extraction strategies for an illustrative coastal aquifer system. The aquifer is 2.52 km² in areal extent, with eight potential locations for groundwater extraction for beneficial use and three potential barrier well locations for hydraulic control of salinity intrusion. The aquifer considered is single layered with an average depth of 60 m. The boundaries of the study area are all no-flow boundaries, except for the seaward boundary, which is a constant head and constant concentration boundary with a concentration value of 35 kg/m³. The aquifer system is illustrated in Figure 3. The eight potential locations for beneficial groundwater extraction are shown as PW1–PW8. The barrier well locations for hydraulic control of saltwater intrusion are shown as BW1–BW3. The salinity concentrations were monitored at three locations, C1, C2, and C3, at the end of the management time horizon.
 The time horizon for the management model was fixed as 3 years, with the extraction rates considered uniform within each management period of 1 year. The groundwater recharge is specified as a constant rate of 0.00054 m/d. The lower and upper limits on groundwater abstractions for both beneficial and barrier wells are 0 and 1300 m³/d. The total number of decision variables in the optimization model is 33, corresponding to pumping from 11 wells for three time periods. The management model specifies maximum permissible salt concentrations of 0.5, 0.6, and 0.6 kg/m³ at the monitoring locations C1, C2, and C3, respectively. The parameters used for the FEMWATER model are given in Table 1.
Table 1. Parameters for Aquifer Simulation
Hydraulic conductivity in x direction
Hydraulic conductivity in y direction
Hydraulic conductivity in z direction
Molecular diffusion coefficient
Density reference ratio
7.14 × 10−7
 A three-dimensional coupled flow and transport simulation model was used to simulate the aquifer processes resulting in salinity intrusion due to groundwater abstraction in this study area. Different groundwater extraction scenarios were generated using Latin hypercube sampling. The salinity concentrations resulting from each of these pumping patterns were simulated using FEMWATER. The simulated salinity levels and the corresponding pumping rates form the input-output patterns. Altogether 230 extraction patterns were used in this study. Different realizations of this input-output data set were generated using the nonparametric bootstrap method. Each of these data sets was used to build a surrogate model, and together these models form the ensemble of surrogate models. Each data set was split into halves for training and testing the genetic programming (GP) models. Adaptive training [Sreekanth and Datta, 2010] was performed to reduce the number of patterns required for training.
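The resampling step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `bootstrap_realizations` and `split_halves` are hypothetical, the patterns are synthetic stand-ins for FEMWATER output, and the split simply takes the first and second halves of each realization, which may differ from how the halves were actually chosen in the study.

```python
import random

def bootstrap_realizations(patterns, n_realizations, seed=0):
    """Nonparametric bootstrap: each realization resamples the original
    patterns with replacement and keeps the original data set size."""
    rng = random.Random(seed)
    n = len(patterns)
    return [[patterns[rng.randrange(n)] for _ in range(n)]
            for _ in range(n_realizations)]

def split_halves(realization):
    """Split one realization into a training half and a testing half."""
    mid = len(realization) // 2
    return realization[:mid], realization[mid:]

# Synthetic placeholder patterns: (33 pumping rates in m3/d, salinity in kg/m3).
# Real patterns would come from FEMWATER runs; these values are made up.
data_rng = random.Random(42)
patterns = [([data_rng.uniform(0.0, 1300.0) for _ in range(33)],
             data_rng.uniform(0.0, 1.0))
            for _ in range(230)]

# One realization per ensemble member (30 members per monitoring location).
realizations = bootstrap_realizations(patterns, n_realizations=30)
train, test = split_halves(realizations[0])
print(len(realizations), len(train), len(test))
```

Because sampling is with replacement, a given pattern can appear more than once in a realization, which is what makes each surrogate see a slightly different data set.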
 Surrogates were developed for predicting salinity at three different locations. For each location, 30 models in the ensemble were found to be sufficient to characterize the uncertainty. All the genetic programming surrogate models used a population size of 500, mutation frequency of 95, and crossover frequency of 50. The commercial genetic programming software Discipulus was used to develop the surrogate models. The parameter values used in developing the models were selected as per the guidelines, after performing a sensitivity analysis. The functional set in the developed GP models contained the operations addition, subtraction, multiplication, division, comparison, and data transfer. The maximum number of surrogate model parameters was limited to 30 to prevent overfitting of the model. Squared deviation from the actual value was used as the fitness function. At the end of model training and testing, source code for each model in the C language was generated using the interactive evaluator of the software and was then coupled with the multiobjective optimization algorithm NSGA-II.
6. Results and Discussion
6.1. Uncertainty in Surrogate Models
 The uncertainty in the surrogate models was quantified using the coefficient of variation of the root mean square errors (RMSEs) of the individual surrogate models. The RMSEs of the individual surrogate model salinity predictions C1, C2, and C3 are shown in Figures 4, 5, and 6. The RMSEs are computed over the testing data set used for evaluating the genetic programming-based surrogate models. It can be observed that, for different realizations of the same data set, the RMSEs differ between surrogate models. This is due to the predictive uncertainty of the surrogate models. The RMSEs for the ensemble of models predicting salinity C1 are plotted against the number of surrogate models in the ensemble, starting from an initial ensemble size of 10, in Figure 7. As the number of models in the ensemble increases, the RMSE of the ensemble prediction decreases, at least in this example.
 The coefficient of variation of the RMSEs, as a measure of uncertainty in prediction of salinity, is plotted against the number of surrogate models in the ensemble for each ensemble predicting C1, C2, and C3. The plots are shown in Figures 8, 9, and 10. Uncertainty of the ensemble model has a definite decreasing trend with the increasing number of models in the ensemble. For each of the salinity concentrations C1, C2, and C3, the uncertainty in the ensemble of surrogate models decreases with the number of models in the ensemble and reaches a constant value when the number of models in the ensemble is around 30. Hence the optimum number of models in the ensemble for coupled simulation optimization is chosen as 30. The optimum number of surrogate models depends on the uncertainty level in the model structure and parameters. For more complex systems the uncertainty in the model structure and parameters of surrogate models will be larger, and hence more surrogate models will be required in the ensemble. The sensitivity of the derived Pareto-optimal solutions to the number of surrogate models in the ensemble is analyzed in section 6.4.
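As a rough sketch of this uncertainty measure, the coefficient of variation can be tracked as members are added to the ensemble. The interpretation used here is an assumption: the coefficient of variation is taken as the sample standard deviation of the first k member RMSEs divided by their mean. The function names and the RMSE values are hypothetical and do not reproduce the results in Figures 8 to 10.

```python
import math
import statistics

def rmse(predicted, observed):
    """Root mean square error of one surrogate model on its testing set."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(observed))

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (an assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)

def cv_by_ensemble_size(member_rmses, start=2):
    """Coefficient of variation of the first k member RMSEs, k = start..n."""
    return {k: coefficient_of_variation(member_rmses[:k])
            for k in range(start, len(member_rmses) + 1)}

# Hypothetical testing-set RMSEs (kg/m3) of six ensemble members.
member_rmses = [0.021, 0.034, 0.027, 0.030, 0.025, 0.029]
cv_curve = cv_by_ensemble_size(member_rmses)
for size, cv in cv_curve.items():
    print(size, round(cv, 3))
```

In practice the ensemble would be grown until the curve flattens, which is the criterion the text uses to settle on about 30 models.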
6.2. Multiobjective Optimization
 The multiobjective optimization algorithm NSGA-II was used to solve the optimization formulations of both the multiple-realization and chance-constrained approaches. Similar to an ordinary genetic algorithm, NSGA-II has a population-based approach for deriving the optimal solutions. The population size used in this study is 200. NSGA-II was run for 750 generations to obtain the optimal solution. Thus a total of 200 × 750 = 150,000 evaluations of the aquifer response to specific groundwater extraction patterns are required before obtaining the solutions. The NSGA-II parameters used were crossover probability 0.9 and mutation probability 0.02. The sensitivity of the optimal solution to population size, number of generations, and NSGA-II parameters was evaluated through numerical experiments in which NSGA-II was run with different combinations of the parameters. It was found that with fewer than 750 generations and a population size smaller than 200, convergence to the Pareto-optimal front was not achieved. However, convergence was obtained for a smaller population size with a larger number of generations. It is noted that reducing the population size affects the spread of solutions in the Pareto-optimal front. Some regions of the Pareto-optimal front are eliminated as a result of the reduction in population size. The optimization problems have 33 variables, which are the pumping rates from 11 locations for three time periods. The optimization by the multiple-realization approach has 90 constraints, corresponding to three ensembles of 30 surrogate models each, predicting the salinity levels C1, C2, and C3.
6.3. Pareto-Optimal Front
 Pareto-optimal solutions refer to a nondominated front of solutions obtained for the coastal aquifer management problem. On the Pareto-optimal front any improvement in one objective function requires a corresponding decline in the other objective function. These sets of solutions are obtained for the coastal aquifer management problem using multiobjective optimization for both the multiple-realization and chance-constrained approaches. All the solutions on the front are nondominated, and water managers can choose a solution from the front and implement the corresponding pumping pattern so as to maximize the benefits while limiting the aquifer contamination.
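The notion of nondominance for the two objectives considered here (maximize production pumping, minimize barrier pumping) can be illustrated with a brute-force filter. This is a minimal sketch, not the fast nondominated sorting procedure that NSGA-II itself uses, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if solution a dominates solution b.

    Each solution is a (production_pumping, barrier_pumping) tuple.
    Production pumping is maximized and barrier pumping is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better

def nondominated(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions)]

# Hypothetical candidates: (total production, total barrier pumping) in m3/d.
candidates = [(5000, 1000), (6000, 1500), (5500, 1600),
              (4000, 1000), (6000, 1400)]
front = nondominated(candidates)
print(front)  # [(5000, 1000), (6000, 1400)]
```

Moving along the resulting front from (5000, 1000) to (6000, 1400) gains production pumping only at the cost of more barrier pumping, which is the trade-off described above.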
 The Pareto-optimal solutions for different reliabilities obtained by the multiple-realization and chance-constrained methods are compared in Figures 11–14. Figure 11 illustrates the Pareto-optimal front for a reliability of 0.99. In the multiple-realization approach this set of solutions satisfies the constraints imposed by all the surrogate models linked with the optimization model. In the chance-constrained formulation this set of solutions corresponds to an error in prediction corresponding to a reliability of 0.99. Similarly, Figures 12, 13, and 14 illustrate the fronts corresponding to reliability levels of 0.8, 0.66, and 0.5. Figure 14 also compares the fronts of reliability level 0.5 to the Pareto-optimal front obtained using a single-surrogate model in optimization.
 For the multiple-realization approach the reliability refers to the fraction of surrogate models in the ensemble whose imposed constraints are satisfied in the optimization. For the chance-constrained method the reliability is obtained from the inverse cumulative distribution function of the residuals in the salinity prediction by the ensemble of surrogate models for salinities C1, C2, and C3. The cumulative distribution functions corresponding to C1, C2, and C3 are shown in Figures 15, 16, and 17. The errors are more or less symmetrically distributed, with a probability of 0.5 for zero residual in all three cases.
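The two notions of reliability can be sketched as feasibility checks for a single salinity constraint. Several details here are assumptions made for illustration only: the function names are hypothetical, the residual is taken as the numerically simulated value minus the surrogate-predicted value, the inverse cumulative distribution function is approximated by an empirical quantile, and all numbers are made up.

```python
import math

def required_count(n_models, reliability):
    """Smallest number of ensemble members whose constraint must hold."""
    return math.ceil(reliability * n_models - 1e-9)

def multiple_realization_feasible(predictions, limit, reliability):
    """Feasible if enough ensemble predictions stay within the limit."""
    satisfied = sum(1 for p in predictions if p <= limit)
    return satisfied >= required_count(len(predictions), reliability)

def empirical_quantile(residuals, probability):
    """Empirical inverse CDF: smallest residual whose CDF >= probability."""
    ordered = sorted(residuals)
    index = max(0, math.ceil(probability * len(ordered) - 1e-9) - 1)
    return ordered[index]

def chance_constrained_feasible(prediction, residuals, limit, reliability):
    """Feasible if the prediction plus the residual quantile is within the limit."""
    return prediction + empirical_quantile(residuals, reliability) <= limit

# Hypothetical ensemble predictions of C1 (kg/m3) for one pumping strategy.
predictions = [0.48] * 27 + [0.52] * 3
print(multiple_realization_feasible(predictions, 0.5, 0.8))   # True
print(multiple_realization_feasible(predictions, 0.5, 0.99))  # False

# Hypothetical residuals (kg/m3) pooled from the ensemble.
residuals = [-0.02, -0.01, 0.0, 0.01, 0.02]
print(chance_constrained_feasible(0.49, residuals, 0.5, 0.5))   # True
print(chance_constrained_feasible(0.49, residuals, 0.5, 0.99))  # False
```

With 30 models, a reliability of 0.99 requires all 30 constraints to hold, which matches the description of the 0.99 front as satisfying the constraints of every surrogate model in the ensemble.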
 It can be noted that Pareto-optimal solutions with a higher reliability level appear to be inferior to those with a lower reliability level. The plausible reason is that, as reliability decreases, the probability of these solutions violating the constraints increases. Therefore, the apparently better solutions may not be feasible. In Figure 14 the Pareto-optimal fronts obtained for a reliability level of 0.5 are compared against the Pareto-optimal front obtained using only the best surrogate model in the coupled simulation optimization. It can be observed that the front obtained using the single-surrogate model is very close to, and slightly better than, the fronts obtained for a reliability level of 0.5 using the multiple-realization and chance-constrained methods. In accordance with the general trend of variation of the Pareto-optimal front with reliability, it can be deduced that the reliability level of the solutions obtained using a single-surrogate model linked with the optimization algorithm is less than 0.5. In using a single best surrogate model in the coupled simulation optimization, it is assumed that the surrogate model prediction has zero residual, i.e., that the surrogate model simulation is equivalent to the numerical model simulation. However, it can be observed from the cumulative distribution functions that the probability of zero residual is 0.5. Since most of the optimal solutions are limit state designs, i.e., optimal solutions lying on the constraint bounds, the uncertainty in the surrogate model structure often causes the optimal solution to move into the infeasible region.
 Salinity levels corresponding to five different optimal solutions in the Pareto-optimal front, obtained using the best surrogate model in the coupled simulation-optimization model, are shown in Table 2. It can be observed that, in the optimal solutions, the salinity levels C1 and C3 converge to the permissible maximum concentration, and hence the solutions are on the constraint boundaries. Hence, a small error in the surrogate model prediction can move these solutions into the infeasible zone. The salinity levels corresponding to these solutions are simulated using the actual simulation model and compared with the values obtained using the surrogate model. It can be observed that some of the actual salinity levels obtained from the numerical simulation model violate the constraints, thus forcing the derived optimal solutions into the infeasible zone. The errors in the predicted salinity level for the optimal solutions are given in Tables 3 and 4, which correspond to the multiple-realization and chance-constrained approaches, respectively. The errors refer to the difference between the salinity levels obtained using the actual numerical simulation model and the surrogate model. In both cases, it is evident that the errors are smaller when the reliability level is high.
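The validation step can be sketched as follows. This is an illustration only: the function name `validate_solution` is hypothetical, the residual is assumed to be the numerically simulated value minus the surrogate-predicted value, and the salinity values are invented to show a limit state design and do not reproduce any entry of Tables 2 to 4.

```python
import math

# Permissible maximum salinity (kg/m3) at the three monitoring locations.
LIMITS = {"C1": 0.5, "C2": 0.6, "C3": 0.6}

def validate_solution(surrogate, numerical, limits=LIMITS):
    """Compare surrogate and numerical salinities for one optimal solution.

    Returns, per location, the residual (numerical minus surrogate) and
    whether the numerically simulated salinity satisfies the limit.
    """
    report = {}
    for location, limit in limits.items():
        residual = numerical[location] - surrogate[location]
        report[location] = (residual, numerical[location] <= limit)
    return report

# Hypothetical limit state solution: the surrogate places C1 and C3 exactly
# on their bounds, while the numerical model gives slightly different values.
surrogate = {"C1": 0.500, "C2": 0.420, "C3": 0.600}
numerical = {"C1": 0.504, "C2": 0.415, "C3": 0.598}

for location, (residual, feasible) in validate_solution(surrogate, numerical).items():
    print(location, round(residual, 3), feasible)
```

In this made-up case a residual of only 0.004 kg/m3 at C1 is enough to make the solution infeasible, because the surrogate-based optimum sits exactly on the constraint bound.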
Table 2. Salinity Levels Corresponding to Five Optimal Solutions From Single-Surrogate Model-Based Optimization (a)
Column groups: C1 ≤ 0.5 kg/m3, C2 ≤ 0.6 kg/m3, C3 ≤ 0.6 kg/m3; each group reports SM and NM in units of 10−3 kg/m3.
(a) SM = surrogate model, NM = numerical model.
Table 3. Residuals in Salinity Prediction for Five Optimal Solutions Obtained by Multiple-Realization Optimization
Reliability Solution Number
Table 4. Residuals in Salinity Prediction for Five Optimal Solutions Obtained by Chance-Constrained Optimization
Reliability Solution Number
 The ensemble-based surrogate modeling approach quantifies the uncertainties in the model structure and parameters. Reliable optimal solutions for coastal aquifer management were obtained using the ensemble surrogate models with the stochastic multiple-realization and chance-constrained optimization models.
6.4. Sensitivity Analysis
 Comparison of the Pareto-optimal fronts for different reliabilities shows that, for 30 surrogate models in the ensemble, the multiple-realization approach identifies the same front as the chance-constrained optimization approach for identical reliability levels. This implies that the constraints imposed by stochastic optimization using multiple realizations are as rigid as the chance constraints when the number of surrogate models in the ensemble is large enough to quantify the uncertainty in the model structures and parameters.
 In order to investigate the effect of the number of surrogate models in the ensemble, numerical experiments were performed with 15, 10, and 5 models in the ensemble for the multiple-realization optimization approach for each reliability level. The corresponding Pareto-optimal fronts for reliability level 0.99 are compared in Figure 18 with the fronts obtained using 30 models and the chance-constrained model. As the size of the ensemble decreases, the fronts move further out toward seemingly better solutions, which may actually be infeasible. Similar results were obtained for the other reliability levels. Hence, it can be inferred that the size of the ensemble has an effect on the stochastic optimization using multiple realizations. With a sufficiently large number of models in the ensemble, the multiple-realization approach performs similarly to the chance-constrained optimization approach.
7. Summary and Conclusions
 Surrogate models are widely used in research to substitute complex numerical simulation models in solving groundwater management problems using coupled simulation-optimization. However, their practical applications have been limited, primarily because of concerns about the reliability of the surrogate model predictions. The reliability of surrogate model predictions is dependent on the uncertainties in the model structure and parameters. Uncertain surrogate models, when used in a coupled simulation-optimization framework, affect the quality as well as the reliability of the optimal solutions obtained. Because most optimal design solutions are limit state in nature, the error in the surrogate model predictions could even make the derived optimal solutions infeasible. As a possible remedy for these issues, this study proposed and evaluated the performance of a simulation-optimization model based on an ensemble of surrogate models. The ensemble of surrogate models is also used to quantify the uncertainty in the surrogate model structure and parameters. The salinity prediction by each surrogate model in the ensemble differs from the others because of the model structure and parameter uncertainty. Two different optimization formulations were used to derive the optimal abstraction rates. In the first method, each surrogate model in the ensemble was independently linked to the multiobjective genetic algorithm NSGA-II, using the multiple-realization formulation. In the second method, the errors in the salinity predictions were quantified using the ensemble of models, and the cumulative distribution function of the errors was obtained. Based on the cumulative distribution function, the chance-constrained optimization problem was formulated and solved using the multiobjective genetic algorithm NSGA-II.
The reliability of the chance-constrained model is analogous to the reliability obtained using the ensemble surrogate model approach, as the management model is constrained by the permissible maximum limits on salinity concentrations. The Pareto-optimal sets of solutions obtained using the two methods for different reliability levels were compared. These fronts were also compared with the Pareto-optimal set obtained using the best surrogate model in the coupled simulation optimization. It was observed that the front obtained using the single-surrogate model in the optimization was close to the front corresponding to a specified reliability of 0.5. It could be argued that the reliability of the optimal solution obtained using a single-surrogate model in the linked simulation-optimization model for coastal aquifer management roughly corresponds to 0.5. However, using an ensemble of surrogate models with stochastic optimization helps improve the reliability of the salinity predictions and of the subsequent optimal solutions.
 Ensemble-based surrogate modeling in coupled simulation optimization has significant advantages over the single-surrogate modeling approach. The single-surrogate modeling approach does not take the predictive uncertainty into consideration and assumes that the surrogate model prediction is equivalent to numerical simulation. The ensemble-based methodology is able to quantify the predictive uncertainty and use it in a stochastic optimization model. Thus the ensemble-based approach accounts for the error in surrogate model prediction due to predictive uncertainty, which is difficult to accomplish using a single-surrogate model. The ensemble-based approach is found to derive more reliable optimal solutions while retaining the computational advantages of the surrogate modeling approach.
 It should be possible to use ensemble surrogate models in coupled simulation-optimization groundwater management studies that consider the uncertainty in the groundwater parameters. An ensemble of surrogate models could be used to substitute groundwater models with different hydraulic conductivities and other uncertain parameters. For this, each member of the ensemble has to be trained using a different data set obtained by using a particular realization of the uncertain groundwater parameters in the numerical simulation model. The ensemble can then be used in a stochastic-optimization framework to derive groundwater management strategies under groundwater parameter uncertainty.
 This study was funded by CRC for Contamination Assessment and Remediation of the Environment, Australia. We are grateful to the four reviewers for the constructive comments which helped in improving the presentation of this paper.