### Abstract

- Top of page
- Abstract
- 1. Introduction
- 2. Real-Time Modeling With EnKF
- 3. Study Area
- 4. Model Components and Online Data
- 5. Results
- 6. Discussion
- 7. Conclusions
- Acknowledgments
- References
- Supporting Information

[1] Urban groundwater is frequently contaminated, and the exact location of the pollution spots is often unknown. Intelligent monitoring of the temporal variations in groundwater flow in such an area assists in selectively extracting groundwater of drinking water quality. Here an example from the city of Zurich (Switzerland) is shown. The monitoring strategy consists of using the ensemble Kalman filter (EnKF) for optimally combining online observations and online models for the real-time characterization of groundwater flow. We conducted numerical simulation experiments for the period January 2004 to December 2007 with a 3-D finite element model for variably saturated groundwater flow. It was found that the daily assimilation of piezometric head data with EnKF results in a better characterization of piezometric heads than does a model which is inversely calibrated with historical data but not updated in real time. The positive impact of model updating with observations can still be observed 10 days after the update. These simulations also suggest that parameters (hydraulic conductivity and leakage) are successfully updated: 1 and 10 day piezometric head predictions are better with than without updating of parameters. Additional experiments with a synthetic model for the same site, in which the only difference is that certain parameter values are selected as the unknown “true” conditions, show that EnKF also successfully updates unknown parameters. However, this is only the case if spatially distributed hydraulic conductivities and leakage coefficients are jointly updated and if a damping parameter is used. The mean absolute error of estimated log leakage coefficients decreased by up to 63%; for log hydraulic conductivity a decrease of up to 27% was observed. Since January 2009 the method has been operational at the Water Works Zurich and has performed well up to the present (October 2010).

### 1. Introduction

[2] Groundwater resources supply drinking water for numerous people globally. The quantity and quality of groundwater resources are threatened by overpumping, salinization, and various types of contamination. The groundwater below large cities is often contaminated because of leakages from petrol stations and zones of industrial activities. Therefore, it may seem unsuitable to pump groundwater for drinking water purposes in or near large cities. On the other hand, it is attractive to pump drinking water close to large cities because this limits the transportation costs and large cities are often located close to rivers which recharge an aquifer. If groundwater is pumped in an urban area with multiple pollution sources, the operation requires a sophisticated quality control. In this paper we present a methodology which has been put into practice for the pumping of drinking water in the city of Zurich (Switzerland). The methodology is of interest for a large class of cases involving sites that are relatively close to a larger contamination that is not remediated. It is also of interest for the selective withdrawal of colder winter infiltration water if summer temperatures become too high.

[3] We propose an optimal monitoring strategy for the groundwater flow and solute transport at the site through both online measurements and online models. The groundwater flow and transport models are calibrated with historical information. A monitoring network is developed that measures, in real time, piezometric heads (and also concentrations of solutes or temperature). These data are sent to a central server, where piezometric head data are used to update the model predictions. Data and models can be combined in an optimal way by data assimilation methods. In this work, the ensemble Kalman filter (EnKF) [*Evensen*, 1994; *Burgers et al.*, 1998] is used to update the models in real time with observations. EnKF provides an optimal estimate (for linear Gaussian systems) of the current spatial distribution of piezometric heads, concentrations, and temperatures and is therefore very well suited for an optimal characterization of the risk of pumping contaminated water. In this paper we focus on the impact of the assimilation of hydraulic head data on the distribution of piezometric heads.

[4] In this study, the optimal real-time characterization of the groundwater flow situation was used to optimize the management of the water works' well field. The management of the site was adapted if the groundwater flow vector pointed from the city center (with potentially contaminated sites) to the Hardhof area (containing the drinking water well field). The management was adapted for the next few days using fuzzy control techniques and genetic algorithms. The adapted management resulted in a groundwater flow vector that was less likely to introduce potentially contaminated city water into the pumping wells. However, the real-time management is not the subject of this paper. It is described by *Bauser et al.* [2010].

[5] The material of this paper is novel in the sense that EnKF, including parameter optimization, is applied here for a more complex and realistic subsurface flow situation than presented in the literature up to now. This more complex subsurface flow situation includes an unconfined aquifer, the unsaturated zone, and, particularly, river-aquifer interaction. It is shown that information from fluctuating river stages, which strongly influence the groundwater heads, yields important information on river bed and aquifer properties. An additional complicating aspect is that the uncertainty in the values of two different parameters (hydraulic conductivity and leakage coefficients) is roughly equally important in this study. This requires updating (calibrating) both uncertain parameter values, which normally is difficult as they are highly correlated and therefore poorly identifiable. An additional important aspect is that we also test the EnKF for a real-world case (Zurich, Switzerland), which presents additional challenges compared to a synthetic study, as new sources of uncertainty might play a role and conceptual model errors cannot be excluded. Moreover, EnKF not only was applied to a real-world case study but was also made operational for the same area. Since January 2009, the subsurface flow situation has been calculated in real time and used to adapt the management of the groundwater well field (pumping wells, artificial recharge basins, and artificial recharge wells) in real time. To our knowledge, this is one of the few cases where stochastic subsurface hydrology is put into practice and the first case where it is made operational as a basis for the improved management of a well field. The regulators needed to be convinced of the benefit of this methodology, which includes reducing the risk of pumping contaminated water, while at the same time requiring considerable investment in infrastructure (online sensing equipment, software development, and computing infrastructure). See *Renard* [2007] for an overview of problems associated with putting into practice stochastic groundwater hydrology.

[6] This paper first explains the EnKF for the real-time updating of a groundwater flow model using observations. Then the study area, the groundwater flow model, and the calibration of the model with historical data are introduced. Finally, the results are presented and discussed.

### 2. Real-Time Modeling With EnKF

[7] The modeling of groundwater flow and solute transport is associated with large uncertainties. These uncertainties may be related to a possible misspecification of the conceptual model (for example, the position of the aquifer boundaries and aquifer bottom and the presence of springs), the unknown external forcing of the aquifer (e.g., recharge and boundary conditions), and, particularly, parameter uncertainties, or all of these at once. The most uncertain parameter values of groundwater flow models in humid climates are, in general, the spatially distributed hydraulic conductivities. Hydraulic conductivity is very uncertain due to its large spatial variability, the scarcity of measurements, and measurement errors. Model calibration, or inverse modeling, helps to reduce the uncertainty of groundwater flow models by adapting parameters (and possibly boundary conditions, initial conditions, and forcing terms as well) to fit steady state hydraulic heads [e.g., *Kitanidis and Vomvoris*, 1983], historical time series of hydraulic heads [e.g., *Carrera and Neuman*, 1986], or both hydraulic heads and concentration data [e.g., *Medina and Carrera*, 1996]. Inverse methods often focus on obtaining a single best estimate, and the uncertainty of that estimate can be characterized by the posterior covariance matrix obtained from a linearized analysis. However, this uncertainty analysis relies on a Gaussian assumption and underestimates the true variance [*Carrera and Neuman*, 1986]. This and the fact that the calibration process does not have a unique solution were the main motivations to develop methods that generate multiple equally likely solutions to the groundwater inverse problem.
This so-called Monte Carlo (MC) type inverse modeling was first formulated for 2-D steady state groundwater flow [*Sahuquillo et al.*, 1992; *RamaRao et al.*, 1995] and later for 2-D transient groundwater flow with the joint calibration of spatially variable transmissivity and storativity fields [*Hendricks Franssen et al.*, 1999], 3-D flow in fractured media [*Gómez-Hernández et al.*, 2001; *LaVenue and de Marsily*, 2001], and coupled groundwater flow and solute transport [*Hendricks Franssen et al.*, 2003; *Wen et al.*, 2003]. All the mentioned MC-type inverse modeling approaches calibrate a large number of spatially distributed fields of hydraulic conductivity (and possibly also other parameters) with derivative-based nonlinear optimization methods, using the adjoint state method to calculate the gradient of the objective function efficiently. The dimensionality of the optimization problem (and therefore also of the gradient vector) is reduced with parameterization techniques that use master blocks or pilot points [*de Marsily*, 1978]. MC-type inverse modeling is not limited to formations with a mild spatial variability of hydraulic conductivity and is very well suited for the characterization of uncertainty. A comparison study showed that MC-type inverse modeling methods outperformed other inverse modeling methods [*Hendricks Franssen et al.*, 2009]. However, some methods like the inverse moment equations method [*Hernandez et al.*, 2003] or the regularized pilot points method in its conditional estimation variant [*Alcolea et al.*, 2006] yielded almost as good results as the MC-type inverse methods but with less CPU time. In principle, MC-type inverse modeling would be suited for the real-time uncertainty characterization, recalibrating the model each time new measurements become available. 
However, the main limitations of MC-type inverse modeling for real-time modeling are (1) recalibration with all historical data is very CPU intensive, (2) an optimal characterization of the actual conditions is not guaranteed with this approach (the optimum with MC-type inverse modeling being balanced over a historical set of observations), and (3) the application to physically complex systems (for which an adjoint model has to be formulated) with many different sources of uncertainty is difficult. Alternatives are nonderivative-based MC-type inverse methods like Markov Chain Monte Carlo methods (MCMC) or the EnKF. Currently, MCMC methods are still relatively slow for the inverse modeling of groundwater flow [*Oliver et al.*, 1997; *Fu and Gómez-Hernández*, 2009]. The EnKF [*Evensen*, 1994; *Burgers et al.*, 1998] is a very fast method for the sequential updating of the model states each time new measurements become available. The EnKF was reformulated for subsurface hydrology applications so that with an augmented state vector approach both states and parameters can be updated [*Chen and Zhang*, 2006; *Hendricks Franssen and Kinzelbach*, 2008; *Liu et al.*, 2008; *Nowak*, 2009], an approach that was introduced slightly earlier in petroleum engineering [e.g., *Naevdal et al.*, 2003; *Wen and Chen*, 2006] and surface hydrology [*Moradkhani et al.*, 2005; *Vrugt et al.*, 2005]. For a synthetic study, *Hendricks Franssen and Kinzelbach* [2009] found that EnKF yielded parameter estimates that had approximately the same error as MC-type inverse modeling parameter estimates but required a factor of 80 less CPU time. This was the case for both mildly and strongly heterogeneous transmissivity fields, although EnKF could have been expected to perform worse for more strongly nonlinear problems, as the method relies on Gaussian statistics.

[8] After indicating why we implemented EnKF for the real-time updating of states and parameters, we now present the formulation of EnKF, tailored to our specific problem. The governing equation is the equation for 3-D unsaturated-saturated transient groundwater flow including interaction with rivers [e.g., *Bear*, 1979]:

where *S* is saturation (dimensionless), *ρ* is density [*M L*^{−3}], *n* is porosity (dimensionless), *p* is pressure [*M L*^{−1}*T*^{−2}], *t* is time [*T*], **k** is permeability [*L*^{2}], *k*_{r} is relative permeability (dimensionless), *μ* is dynamic viscosity [*M L*^{−1}*T*^{−1}], *g* is the gravitational acceleration [*L T*^{−2}], *z* is the elevation with respect to a reference [*L*], *q* represents sinks (abstractions) and sources (recharge) [*M L*^{−3}*T*^{−1}], and the nabla operator ∇ [*L*^{−1}] is three-dimensional, referring to spatial coordinates **x**. In this paper, the van Genuchten parameterization was applied for modeling the saturation as a function of pressure [*van Genuchten*, 1980]. The boundary conditions that are used to solve equation (1) also include the leakage coefficient *r* [*T*^{−1}]. The leakage coefficient is given by *Q*/(*A*(*h*_{river} − *h*_{gw})), where *Q* is the exchange flux between river and groundwater [*L*^{3}*T*^{−1}], *A* is the surface for the exchange flux [*L*^{2}], *h*_{river} is the river stage [*L*], and *h*_{gw} is the groundwater level [*L*].
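
For reference, a mass-balance form of equation (1) that is dimensionally consistent with the symbols defined above can be sketched as follows. This is a reconstruction under the usual sign convention (sources positive), not necessarily the exact form printed in the original; *ρ* and *μ* denote density and dynamic viscosity, and the second line simply restates the leakage coefficient definition given in the text.

$$
\frac{\partial (S \rho n)}{\partial t} = \nabla \cdot \left[ \rho \, \frac{\mathbf{k}\, k_r}{\mu} \left( \nabla p + \rho g \nabla z \right) \right] + q \qquad (1)
$$

$$
r = \frac{Q}{A \left( h_{\mathrm{river}} - h_{\mathrm{gw}} \right)}
$$

Each term of the first expression has units of [*M L*^{−3}*T*^{−1}], matching *q*, and the second expression has units of [*T*^{−1}], matching *r*.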

[9] Equation (1) is solved using the finite element method. We will refer to the numerical model as *M.* The hydraulic conductivity **K**_{c} and the leakage factor *r* are stochastic parameters in equation (1), and a large number of stochastic realizations of these parameters are generated. In section 4.2, more details are given on the stochastic generation. The ensemble Kalman filter scheme proceeds in the following steps:

where *i* refers to a stochastic realization (*i* = 1, …, *P*) and **x**_{i,h} is part of the vector **x**_{i} and contains states from the previous time step (superscript minus) or the actual time step (superscript 0). An augmented state vector approach is used to update states and parameters jointly. The augmented vector is

where the subscript *Y* refers to log_{10} hydraulic conductivities and *L* refers to log_{10} leakage coefficients. The vector **x**_{i} is of dimension *N* + *E* + *N*_{l}, where *N* is the number of nodes, *E* is the number of elements, and *N*_{l} is the number of leakage zones. The covariance matrix of dimension ((*N* + *E* + *N*_{l}) × (*N* + *E* + *N*_{l})) is estimated from the series of stochastic realizations. For the first time step, these stochastic realizations are unconditional or conditioned only on measurements of *Y* and *L.* For subsequent time steps the stochastic realizations are also conditional on state information. The covariance matrix is given by

where the subscript *hh* refers to covariances between modeled hydraulic heads at two locations (grid nodes), *hY* refers to cross covariances between a modeled hydraulic head value at one grid node and a log hydraulic conductivity value at an element, *hL* refers to cross covariances between modeled nodal hydraulic head and the log leakage coefficient for a zone, *YL* refers to cross covariances between log hydraulic conductivity and log leakage coefficient, *YY* refers to covariances between log hydraulic conductivities at two locations, and *LL* refers to covariances between log leakage coefficients for two zones. The observations for the current time step are stored in the vector **y**^{0} of dimension *n.* The data are perturbed (following *Burgers et al.* [1998]) according to

where the added perturbation is a vector of random numbers, drawn from a normal distribution with expectation zero and standard deviation equal to the expected measurement error standard deviation.

[10] The model predictions (equation (2)) and the perturbed observations (equation (5)) are combined to yield an updated ensemble of states (hydraulic heads) and parameters (log hydraulic conductivities and log leakage coefficients), according to

where the left-hand side is the updated vector containing updated states (hydraulic heads) and parameters (*Y* and *L*) for stochastic realization *i*; the damping matrix is filled with zeros, except for the diagonal elements, which contain damping factors that take a value between 0 and 1 (for the damping factors related to the states the entries are always equal to 1; i.e., no damping is used); and **H** is a linear operator (*n* × (*N* + *E* + *N*_{l})) that maps the augmented state vector to the observation locations. The damping factor reduces the correcting influence of the head measurements on updating the log hydraulic conductivity field and the log leakage coefficients. In a previous study by *Hendricks Franssen and Kinzelbach* [2008], damping the perturbation was found to give improved results by reducing filter inbreeding problems. **K** is the Kalman gain matrix ((*N* + *E* + *N*_{l}) × *n*):

where **K**_{h} is related to the states (hydraulic heads), **K**_{Y} is related to the log hydraulic conductivities, and **K**_{L} is related to the log leakage coefficients. **K** is obtained from

where **C**^{0} is the covariance matrix for the actual time step (estimated from the ensemble of stochastic realizations) and **R**^{0} (*n* × *n*) is the measurement error covariance matrix for the actual time step, which is estimated a priori.
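
The steps described in paragraphs [9] and [10] can be collected in one place as follows. This is a sketch reconstructed from the verbal definitions above and from the standard EnKF with perturbed observations: the symbols $\boldsymbol{\varepsilon}_i$ (observation perturbation), $\boldsymbol{\alpha}$ (diagonal damping matrix), and the superscript $+$ (updated vector) are notation assumed here, and the equation numbers follow the in-text references.

$$
\begin{aligned}
\mathbf{x}_{i,h}^{0} &= M\!\left(\mathbf{x}_{i}^{-}\right) && (2)\\
\mathbf{x}_{i} &= \begin{pmatrix} \mathbf{h} \\ \mathbf{Y} \\ \mathbf{L} \end{pmatrix}_{i} && (3)\\
\mathbf{C} &= \begin{pmatrix} \mathbf{C}_{hh} & \mathbf{C}_{hY} & \mathbf{C}_{hL} \\ \mathbf{C}_{Yh} & \mathbf{C}_{YY} & \mathbf{C}_{YL} \\ \mathbf{C}_{Lh} & \mathbf{C}_{LY} & \mathbf{C}_{LL} \end{pmatrix} && (4)\\
\mathbf{y}_{i}^{0} &= \mathbf{y}^{0} + \boldsymbol{\varepsilon}_{i} && (5)\\
\mathbf{x}_{i}^{+} &= \mathbf{x}_{i}^{0} + \boldsymbol{\alpha}\,\mathbf{K}\left(\mathbf{y}_{i}^{0} - \mathbf{H}\,\mathbf{x}_{i}^{0}\right) && (6)\\
\mathbf{K} &= \begin{pmatrix} \mathbf{K}_{h} \\ \mathbf{K}_{Y} \\ \mathbf{K}_{L} \end{pmatrix} && (7)\\
\mathbf{K} &= \mathbf{C}^{0}\mathbf{H}^{T}\left(\mathbf{H}\,\mathbf{C}^{0}\mathbf{H}^{T} + \mathbf{R}^{0}\right)^{-1} && (8)
\end{aligned}
$$

The dimensions agree with the text: $\mathbf{C}^{0}\mathbf{H}^{T}$ is $(N + E + N_l) \times n$, the bracketed matrix is $n \times n$, and $\boldsymbol{\alpha}$ is a square diagonal matrix of size $N + E + N_l$.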

[11] The ensemble of updated vectors **x**_{i} (hydraulic heads, log hydraulic conductivities, and log leakage coefficients) is the input of the groundwater flow model for the next time step. Equations (2)–(8) are applied each time new observations are available. In several simulation experiments that will be presented in sections 5.1.1 and 5.2.1, only the states are updated and not the parameters, which implies that the equations above are applied excluding log hydraulic conductivities and log leakage coefficients from the expressions.
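
The update cycle described in this section can be illustrated with a short NumPy sketch. It is a minimal toy version under stated assumptions, not the code used in the study: the function name `enkf_update`, the linear toy "model" that generates the heads, and all numbers are made up for illustration, and the ensemble is stored column-wise (one realization per column).

```python
import numpy as np

def enkf_update(X, y_obs, H, R, damping, rng):
    """Damped EnKF analysis step with perturbed observations.

    X       : (n_aug, P) forecast ensemble of augmented vectors [h, Y, L]
    y_obs   : (n,) observed heads
    H       : (n, n_aug) operator selecting observed heads from the vector
    R       : (n, n) measurement error covariance
    damping : (n_aug,) diagonal of the damping matrix (1 for states)
    """
    P = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)        # ensemble anomalies
    HA = H @ A                                   # anomalies at observation points
    CHt = A @ HA.T / (P - 1)                     # sampled C^0 H^T
    S = HA @ HA.T / (P - 1) + R                  # H C^0 H^T + R^0
    K = np.linalg.solve(S.T, CHt.T).T            # Kalman gain C^0 H^T S^-1
    eps = rng.multivariate_normal(np.zeros(len(y_obs)), R, size=P).T
    D = y_obs[:, None] + eps                     # one perturbed data set per member
    return X + damping[:, None] * (K @ (D - H @ X))

# Toy demonstration with made-up numbers (not the Zurich model).
rng = np.random.default_rng(0)
n_h, n_Y, n_L, P = 3, 2, 1, 100
Y = rng.normal(-3.0, 0.5, size=(n_Y, P))         # log10 hydraulic conductivity
L = rng.normal(-5.0, 0.5, size=(n_L, P))         # log10 leakage coefficient
a = np.array([[2.0], [1.5], [1.0]])              # hypothetical sensitivities
b = np.array([[1.0], [1.0], [0.5]])
h = 400.0 + a * Y[0] + b * L[0] + rng.normal(0.0, 0.2, size=(n_h, P))
X = np.vstack([h, Y, L])                         # augmented ensemble

H = np.zeros((2, n_h + n_Y + n_L))
H[0, 0] = H[1, 1] = 1.0                          # heads 0 and 1 are observed
R = 0.05 ** 2 * np.eye(2)
y_obs = np.array([390.0, 391.2])
damping = np.concatenate([np.ones(n_h), np.full(n_Y + n_L, 0.1)])

X_upd = enkf_update(X, y_obs, H, R, damping, rng)
```

Setting the damping entries of *Y* and *L* to zero reproduces the state-only experiments mentioned above, because the parameter rows of the ensemble are then left unchanged. The sketch forms only the product of the covariance with the transposed observation operator rather than the full covariance matrix, a common choice when the state vector is large.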

### 3. Study Area

[12] The studied aquifer lies below parts of the city of Zurich and is mainly fed by the rivers Sihl and Limmat but also receives water from infiltrating precipitation and lateral inflow from hills. Figure 1 shows an overview of the situation. The river Sihl has, on average, a limited discharge rate (6.8 m^{3} s^{−1}; Swiss Federal Office for the Environment (FOEN)), but it has elevated peaks as a response to intense rainfall. The river Sihl joins the river Limmat in the eastern part of the city. The river Limmat is the outflow of Lake Zurich and has an average discharge of 95.8 m^{3} s^{−1} (FOEN). Two weirs on the river Limmat fall within the study area, with the Hoengg weir being located close to the groundwater well field Hardhof. The rivers Sihl and Limmat infiltrate into the groundwater, except for the downstream, western part, where the aquifer exfiltrates into the river Limmat. The Limmat can show considerable river stage fluctuations, and the aquifer response in most of the study area to these stage fluctuations provides important information about aquifer and river bed hydraulic properties [e.g., *Yeh et al.*, 2009]. No direct measurements of leakage were available, and the leakage coefficient was calibrated for five different river sections via the numerical model. Details will be provided in section 4.

[13] The recharge of the aquifer from rain is limited because of the generally sealed soil surface. Recharge from precipitation is calculated as the difference between precipitation and actual evapotranspiration for the nonsealed areas. Details of the calculations will be given in section 4. The aquifer receives lateral inflow from the hills at its northern boundary (Kaeferberg) and, particularly, at its southern boundary (Uetliberg). The amounts of lateral inflow are, however, not very large. These lateral inflows are calculated as functions of the recharge. The largest inflow occurs from the south (Uetliberg) and in particular close to river Sihl.

[14] The mean hydraulic conductivity of the aquifer is around 2 × 10^{−3} m s^{−1}. This value is obtained from averaging estimated **K**_{c} from small-scale pumping tests along transects of boreholes. The aquifer consists of coarse material, mainly sandy gravel, which has been deposited by the river Sihl and as glacial moraine [*Kempf et al.*, 1986]. The hydraulic conductivity is, in general, larger for the upper aquifer layers than for the lower aquifer layers. The heterogeneity of the hydraulic conductivity is high, and for small-scale measurements we have . There is some evidence that the spatial distribution of log_{10} conductivity *Y* shows a complex spatial pattern consisting of small channels and lenses with high and low hydraulic conductivities caused by changing river courses. The aquifer storativity is assumed to be 0.15. The aquifer thickness is on average around 20 m, but in the eastern part of the study area reaches up to 70 m.

[15] Around 20% of the drinking water for the city of Zurich is pumped in the Hardhof area. Figure 2 gives an overview of the Hardhof area. The drinking water is pumped from four horizontal wells. In addition, 19 bank filtration wells along the river Limmat pump water which is used for artificial recharge and distributed over 12 infiltration wells and 3 recharge basins. These artificial recharge facilities are located in the southern part of the Hardhof area and are supposed to create a hydraulic barrier between the city center and the Hardhof area. Below the city center, diffuse pollution is present, which could reach the pumping wells if the abstraction rates are large. Tracer tests and additional analysis on the basis of electrical conductivities revealed that a considerable part (up to 30%) of the water pumped by two of the four horizontal wells originates from the city area.

### 6. Discussion

[47] These results indicate the positive impact of assimilating hydraulic head data, which yields better predictions of hydraulic head distributions, including heads away from the measurement locations and at a time horizon of prediction of 10 days, than a conventional model calibrated with historical data. Results can be improved further if uncertain parameter values (*Y* and *L*) are jointly updated together with the states. Jointly updating *Y* and *L* gives better results than updating *Y* or *L* alone. We think that this result is related to the fact that both *Y* and *L* are affected by errors. If only *Y* or *L* is corrected, the updated parameter also accounts for the error in the other parameter. This results in overcorrection and worse results compared to the case where it is acknowledged that both parameters might be affected by errors. However, some questions remain because in the synthetic case, parameter estimates of *Y* deteriorated later in the simulation period in one case (out of three). Even in that case, the estimates of *Y* at the end of the simulation period were still better than the prior values. There is evidence that the increase in the mean absolute error of the hydraulic conductivities in the second half of the simulation period is related to filter inbreeding. In the real-world case the prediction of hydraulic head distributions including parameter updating did not improve in one case (out of six) compared to predictions based on the same amount of data but without parameter updating. It was found that for that specific case, estimated hydraulic head distributions were much better than those obtained without parameter updating for the first 700 days of simulation but much worse for the last 150 simulation days. This result was related to numerical instabilities and points to the risks of continuously updating parameter distributions in real time. We think that the better results for the experiments where the assimilation is carried out every 10 days instead of each day (and parameters are updated) can also be explained by reduced filter inbreeding. Less frequent updating reduces problems with filter inbreeding [*Hendricks Franssen and Kinzelbach*, 2008]. On the other hand, this may not be the only reason. After a longer time period without assimilation, the model residuals are larger and contain clearer signals about parameter values that are too small or too large.

[48] To avoid filter inbreeding, we previously suggested several measures [*Hendricks Franssen and Kinzelbach*, 2008]. The use of a damping parameter is very important in reducing problems with filter inbreeding, and a large ensemble is also important. For large simulation models this requires parallelization of the code and many processors. That setup was not possible in this study, and the number of stochastic realizations (100) was relatively small; better results are expected for a larger number of stochastic realizations. Problems with filter inbreeding can be reduced further by excluding spurious large positive or negative covariances at large separation distances [*Houtekamer and Mitchell*, 1998]. Results for the synthetic study would have been better if we had taken additional measures against filter inbreeding, but nevertheless, all results (in terms of hydraulic heads and parameter updates) were better than simulations with updating of states only.
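
Excluding covariances at large separation distances can be sketched as an element-wise mask on the sampled cross covariance between states and observed heads. The example below is only an illustration of the idea with a hard cutoff radius. It was not part of the experiments reported here, and the function names, coordinates, and the 500 m radius are hypothetical.

```python
import numpy as np

def cutoff_mask(state_xy, obs_xy, radius):
    """1.0 where a state location lies within `radius` of an observation, else 0.0."""
    d = np.linalg.norm(state_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    return (d <= radius).astype(float)

def localized_cross_covariance(A, HA, mask):
    """Element-wise product of the sampled covariance C H^T with the mask."""
    P = A.shape[1]
    return mask * (A @ HA.T) / (P - 1)

# Two state locations, one of them 1000 m away from the single observation.
rng = np.random.default_rng(0)
state_xy = np.array([[0.0, 0.0], [1000.0, 0.0]])
obs_xy = np.array([[0.0, 0.0]])
X = rng.normal(size=(2, 50))                     # made-up ensemble, 50 members
A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
HA = A[:1]                                       # observation sits on state 0

mask = cutoff_mask(state_xy, obs_xy, radius=500.0)
CHt_loc = localized_cross_covariance(A, HA, mask)
```

With a small ensemble, the sampled covariance between the distant point and the observation is generally nonzero purely by chance; the mask sets it to zero so that the observation cannot update that point. A smooth taper instead of a hard cutoff would avoid discontinuities in the update, at the cost of one more tuning choice.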

[49] The situation is more complicated for the real-world case study. Here instabilities developed at the end of a long assimilation experiment over 4 years. In this assimilation experiment, fewer data were assimilated. Because parameter estimates improve mainly at the beginning of the assimilation period and stabilize later (or deteriorate again in the case of filter inbreeding), our suggestion for calibrating real-world models with EnKF and for operational applications is to limit parameter calibration to the beginning of the simulation period with, for example, around 30 parameter updates and then update parameters later in the simulation period only very occasionally. The optimal frequency for parameter updating will depend on the case, and it is important to gain experience from additional simulation experiments to determine the role of the updating frequency and provide measures against possible instabilities that could develop over time.
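
The suggested schedule can be written down as a small helper. This is only a sketch of the recommendation: the function name and the interval of 100 assimilation steps between later updates are hypothetical, since the text states only "around 30" initial updates and "very occasionally" afterward.

```python
def update_parameters_now(step, initial_updates=30, later_every=100):
    """Return True if parameters should be updated at this assimilation step.

    Steps are counted from 0. Parameters are updated at each of the first
    `initial_updates` steps and afterward only once every `later_every` steps.
    """
    if step < initial_updates:
        return True
    return (step - initial_updates + 1) % later_every == 0

# One year of daily assimilation: count how often parameters would be updated.
schedule = [update_parameters_now(k) for k in range(365)]
n_param_updates = sum(schedule)
```

At steps where the helper returns `False`, the damping factors for *Y* and *L* can be set to zero so that only the hydraulic heads are updated.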

[50] Nevertheless, EnKF was also successful for updating states and parameters of a subsurface flow system (for both a real-world case and an operational case) which was considerably more complex than the synthetic 2-D saturated groundwater flow problem used by *Hendricks Franssen and Kinzelbach* [2008, 2009]. As mentioned in section 1, real-time modeling is used as a basis for real-time optimization of the water management at the site (fixing the amount to be abstracted and infiltrated but optimizing the spatial distribution of artificial recharge, taking into account constraints like the capacity of artificial recharge basins and wells). Simulations indicate that without optimization (traditional management) the fraction of city water (i.e., potentially contaminated water) is 11% at well C (5% in case of optimization) and 6% at well D (2% in case of optimization). If the amount of water to be infiltrated is allowed to increase, an estimated fraction of city water of 0% at all wells can be achieved. The improved quality of the pumped water in online mode was confirmed by the evolution of its electrical conductivity, indicating a reduced fraction of pumped city water. See *Bauser et al.* [2010] for further details. The next extension of the method to more general problems is under preparation. It concerns the calibration with data assimilation of fully distributed, integral hydrological models that include overland flow as well as evapotranspiration from the unsaturated zone. This is only feasible by assimilating various types of data and by parallelizing the code and running it on a supercomputer (at least for the testing stage) to process sufficiently large ensembles.

### 7. Conclusions

[51] This paper presents data assimilation with the ensemble Kalman filter (EnKF) for variably saturated subsurface flow including river-aquifer interaction, implemented in a finite element model with 173,599 elements. Simulation experiments are carried out for the Limmat Valley aquifer for the period January 2004 to December 2007 and also for a synthetic case which is similar to the Limmat Valley aquifer, except for the fact that a certain parameter distribution was selected as the virtual truth. Finally, results are presented for the online implementation of this same model for the period May 2009–September 2010. To our knowledge, this is the first online implementation of a data assimilation framework for subsurface flow and the first to adapt online the optimal pumping strategy.

[52] Results indicate that data assimilation with EnKF but without parameter calibration improves 1 day and 10 day hydraulic head predictions, at both assimilation and prediction locations. The reduction of the mean absolute error (MAE) is slightly larger for the synthetic experiments (e.g., 59% reduction of MAE for 1 day predictions and assimilation of 87 hydraulic head data) than for the 99 stochastic realizations from the real-world case (53% MAE reduction for the same case). The difference is larger for 10 day predictions based on 87 assimilated data (23% reduction for the synthetic case versus 15% for the real-world case). Further improvements are obtained if log hydraulic conductivities (*Y*) and log leakage coefficients (*L*) are updated along with the states. If *Y* and *L* are updated simultaneously, MAE(*h*) is much lower than for the unconditional case, with an 85% reduction for 1 day predictions and an 84% reduction for 10 day predictions. For the real-world case, MAE(*h*) is reduced by 73% (1 day predictions) or 66% (10 day predictions) when compared to the same sets of simulation experiments. The improvements are smaller for 1 day predictions at verification locations: 54% MAE(*h*) reduction for the synthetic case and 44% reduction for the real-world case. The synthetic experiments allowed us to verify that the parameter estimates indeed improved for all experiments where both hydraulic conductivities and leakage coefficients were updated. The improvement was largest if all observations were assimilated but only every 10 days. In that case the MAE reduction for log hydraulic conductivity was 27%, and the MAE reduction for the log leakage coefficient was 63%. It was also observed that for one of the synthetic simulation scenarios the log hydraulic conductivity estimates initially improved strongly but got worse later during the simulation period, although they were still better than the prior estimates at the end of the simulation period. It is believed that this behavior is linked to filter inbreeding because the ensemble of 100 realizations was relatively small, resulting in numerical covariances subject to considerable sampling fluctuations. Also, the fact that states are nonnormally distributed in this synthetic experiment might have contributed to the suboptimal results. In addition, an already inversely calibrated model was updated in the off-line experiments for the real-world case study. This model yielded the best hydraulic head predictions, which were also improved in the case when additional data were assimilated and were further improved if parameters were adapted in real time.
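
The percentage reductions quoted above compare the MAE of one run against the MAE of a reference run. A minimal sketch of that arithmetic follows. The head values are made up, and the exact averaging over time steps and locations used in the study is not specified here, so a plain mean over all supplied values is assumed.

```python
import numpy as np

def mae(predicted, observed):
    """Mean absolute error over all supplied values."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(observed))))

def mae_reduction_percent(mae_reference, mae_new):
    """Relative MAE reduction of a run with respect to a reference run, in percent."""
    return 100.0 * (mae_reference - mae_new) / mae_reference

# Made-up piezometric heads (m) at four locations.
observed = [400.0, 401.0, 399.0, 400.5]
without_assimilation = [400.4, 400.6, 399.4, 400.9]
with_assimilation = [400.1, 400.9, 399.1, 400.6]

mae_ref = mae(without_assimilation, observed)
mae_da = mae(with_assimilation, observed)
reduction = mae_reduction_percent(mae_ref, mae_da)
```

In this toy case every reference error is 0.4 m and every assimilated error is 0.1 m, so the reduction is about 75%.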

[53] The hydraulic head estimates obtained with the online operational model are better if data assimilation is applied. The absolute errors are larger than for the off-line model because the model forcings have so far been characterized less well.

[54] These results corroborate the potential of the ensemble Kalman filter for the operational updating of large-scale groundwater flow models showing a highly dynamic response to surface water bodies and confirm that unknown model parameters can be calibrated and improved in real time. The results also indicate that very frequent updating of parameters might give poorer results than less frequent updating because of filter inbreeding and the risk of numerical instabilities.