## 1. Introduction

[2] A diagonal weight matrix is commonly used to represent system-state observation errors in inverse models of groundwater systems, with the weights calculated as the inverse of observation error variances [e.g., *Hill and Tiedeman*, 2007; *Singh et al*., 2008; *James et al*., 2009; *Liu and Kitanidis*, 2011; *Majdalani and Ackerer*, 2011; *Kowalski et al*., 2012; *Yoon and McKenna*, 2012]. This representation and other common simplifications assume that observation errors are uncorrelated. The methods needed to include error correlations in inverse modeling have been available for decades [e.g., *Neuman and Yakowitz*, 1979; *Cooley*, 1982; *Carrera and Neuman*, 1986; *McLaughlin and Townley*, 1996; *Hill and Tiedeman*, 2007] and can be implemented, for example, using the inverse modeling software PEST [*Doherty*, 2008, 2010] and UCODE_2005 [*Poeter et al*., 2005]. These methods include the correlations by using a full observation variance-covariance matrix to represent observation errors; this matrix is inverted to obtain the full weight matrix. Despite the availability of these methods, however, the difficulty of quantifying the terms that characterize observation error correlations often leads modelers to omit them and to use a diagonal observation weight matrix for convenience.
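The distinction between the two weighting choices can be illustrated with a minimal NumPy sketch. The three standard deviations and the 0.8 correlation are hypothetical values chosen only for illustration, and the variable names (`sd`, `corr`, `cov`, `w_full`, `w_diag`) are assumptions, not quantities from this study.

```python
import numpy as np

# Hypothetical error standard deviations for three observations.
sd = np.array([0.10, 0.10, 0.20])

# Hypothetical error correlation of 0.8 between observations 1 and 2;
# observation 3 is assumed to be uncorrelated with the others.
corr = np.eye(3)
corr[0, 1] = corr[1, 0] = 0.8

# Full observation variance-covariance matrix.
cov = np.outer(sd, sd) * corr

# Full weight matrix: inverse of the variance-covariance matrix.
w_full = np.linalg.inv(cov)

# Diagonal weight matrix: inverse variances only, correlations ignored.
w_diag = np.diag(1.0 / sd**2)

print(np.round(w_full, 2))
print(np.round(w_diag, 2))
```

In this sketch the two matrices agree for the uncorrelated third observation and differ for the correlated pair, where the full weight matrix has nonzero off-diagonal terms.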

[3] Various types of error correlations are widespread in hydrologic models. Error correlations can result from phenomena such as barometric pumping of wells [*Weeks*, 1979] and entrapped air in the unsaturated zone [*Healy and Cook*, 2002] that create spatially and temporally correlated anomalies in groundwater levels. Error correlations also arise from the use of multiple observations that derive from a single direct measurement, a common situation in hydrologic studies. For example, the water table elevation at a monitoring well is usually calculated from measurements of well elevation and depth of water, so error in the well elevation propagates to all head observations over time at that well. Streamflow observations are usually estimated as nonlinear functions of water depth and empirical rating curve constants. For both stream and groundwater depth, estimates often depend on the nonlinear equations and empirical constants used to estimate pressure from the voltage and temperature at a pressure transducer [*Freeman et al*., 2004]. Additional examples include multiple observations of temporal changes in hydraulic heads that depend on an instantaneous head measurement [e.g., *Hill et al*., 2000] and multiple observations of flow change between stream gauging stations that depend on a single flow estimate. Although error correlations in hydrologic models have not been extensively characterized, these and other examples indicate that error correlations are potentially widespread.
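The monitoring-well example can be sketched with first-order propagation of measurement error. The sketch assumes that each head is computed as the well elevation minus a depth measurement and that the direct measurements have independent errors. The standard deviations and all variable names are hypothetical, chosen only to show how a shared measurement produces off-diagonal covariance terms.

```python
import numpy as np

sd_elev = 0.05    # hypothetical std. dev. of the surveyed well elevation (m)
sd_depth = 0.01   # hypothetical std. dev. of each depth measurement (m)
n_times = 3       # number of head observations at the well

# Direct measurements [elevation, depth_1, depth_2, depth_3],
# assumed here to have mutually independent errors.
cov_meas = np.diag([sd_elev**2] + [sd_depth**2] * n_times)

# head_i = elevation - depth_i, so each row of the Jacobian is
# +1 for the shared elevation and -1 for that observation's depth.
J = np.hstack([np.ones((n_times, 1)), -np.eye(n_times)])

# Propagated variance-covariance matrix of the head observations.
cov_head = J @ cov_meas @ J.T

# Corresponding error correlation matrix.
sd_head = np.sqrt(np.diag(cov_head))
corr_head = cov_head / np.outer(sd_head, sd_head)

print(np.round(corr_head, 3))
```

Under these assumptions every pair of heads shares the elevation variance as its covariance, so the error correlation approaches 1 as the elevation error comes to dominate the depth error.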

[4] In this work, we explore the effect of system-state observation error correlations on parameter estimates, predictions, and uncertainty measures. We use both an analytical expression and a groundwater reactive transport model of denitrification to compare results obtained with and without error correlations. For the transport model, geochemical observation errors are correlated because selected direct measurements are used to calculate more than one calibration observation. The correlations are calculated by propagation of measurement error, a method that has precedent for geochemical data [e.g., *Ballentine and Hall*, 1999; *Aeschbach-Hertig et al*., 1999, 2000; *Peeters et al*., 2002] but, to our knowledge, has not been used previously to calculate error correlations in observation weight matrices for hydrogeologic investigations. *Hill* [1992] and *Christensen et al*. [1998] use a less general approach applied to streamflow gains and losses. In the context of inverse modeling, full weight matrices can be used to represent observation, model, and parameter error.

[5] Few groundwater studies have considered correlated observation errors. *Christensen et al*. [1998] used a full weight matrix to represent correlated error in base flows, by expanding the method proposed by *Hill* [1992] to derive the error covariance terms for a system of branching streams. *Christensen et al*. [1998] did not compare results using a full and a diagonal weight matrix, but *Foglia et al*. [2009] reported that an unpublished follow-up comparison found that base flow error correlations had a small effect on the parameter estimates but a larger effect on parameter uncertainty.

[6] *Cooley* [2004] accounted for both observation and model errors in the weight matrix by summing the variance-covariance matrices of the two error types. Model error was formulated as the difference between stochastic representations of the true and the spatially averaged parameter distributions. *Cooley and Christensen* [2006] and *Christensen and Doherty* [2008] used synthetic models to examine the consequences of using a diagonal weight matrix instead of the correct full weight matrix calculated by the method of *Cooley* [2004]. Their weight matrices were dominated by model error. *Cooley and Christensen* [2006] reported that the variance of head prediction residuals was larger in a model calibrated with a full weight matrix than in the same model calibrated with a diagonal weight matrix. They also found that confidence intervals on predictions calculated with a diagonal weight matrix were much too small, whereas those calculated with a full weight matrix were nearly correct. Their results underscore the importance of using the correct full weight matrix when observation and model error correlations exist. *Christensen and Doherty* [2008] found that inversion with a full weight matrix produced less accurate predictions, in contrast to expected results based on *Cooley* [2004]. They noted that their result was most likely related to difficulties with generating (by a Monte Carlo method) and inverting the variance-covariance matrix of total error. *Lu et al*. [2013] considered correlations between total errors in the context of weighting for model averaging. They found that using the variance-covariance matrix of total errors to calculate the weights resulted in better predictive performance for models of both synthetic and experimental uranium transport than using the variance-covariance matrix of observation errors.

[7] A full weight matrix is commonly used for parameter errors. For example, when pilot points and regularization are used to estimate the spatial variability of a parameter field, a full variance-covariance matrix for prior information error is often used to represent the spatial correlation of these errors [e.g., *Bentley*, 1997; *Alcolea et al*., 2006; *Singh et al*., 2008; *Hendricks-Franssen et al*., 2009]. To our knowledge, there are no studies that evaluate the effect of including versus excluding these correlated errors.

[8] The implications of observation error correlations also extend to Bayesian methods in hydrologic modeling. In rainfall-runoff modeling, streamflows are the primary system-state observations used for calibration. Observation errors largely stem from using stage-discharge rating curves to determine the flows [e.g., *McMillan et al*., 2010], and these errors can be correlated, as suggested by *Foglia et al*. [2009]. Recent rainfall-runoff modeling research has comprehensively examined methods for characterizing both model and observation errors and their effects on prediction uncertainty [e.g., *Thyer et al*., 2009; *Schoups and Vrugt*, 2010; *Renard et al*., 2010, 2011]. In these papers, error models are developed for the different error sources, and the parameters of these models are estimated together with rainfall-runoff model parameters. The results show that error correlations can be pronounced and that accurate estimates of observation uncertainty are critical for predictive capabilities and for the decomposition of input and structural errors [*Thyer et al*., 2009; *Renard et al*., 2010].

[9] In this paper, we first present methods for model calibration, error propagation, and calculation of uncertainty. We next consider parameter uncertainty in a simple one-parameter, two-observation inverse model, and derive a new analytical expression for the errors in uncertainty estimates when observation error correlations are omitted. For this model, we also compare uncertainty estimates that are typically obtained in practice with those derived from theoretical calculation of the variances. A reactive transport model of denitrification is then introduced, calibrated using both full and diagonal weight matrices, and used to predict future nitrate concentrations and uncertainty. Model error is accounted for by considering multiple realizations of the geology. The derived analytical expression helps to explain the differences in reactive transport model parameter uncertainty in the calibrations with and without error correlations and provides some general guidance about the importance of using a full weight matrix in inverse models for which observation errors are correlated. Results are expected to have broad relevance, as most of the methods and analyses presented here for exploring the effects of correlated errors apply regardless of the source of these errors.
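To give a sense of the one-parameter, two-observation setting, the following minimal sketch uses standard weighted least squares theory for a linear model. It is not the analytical expression derived in this paper. The function name `parameter_variances`, the sensitivities `X`, and the error covariance `C` are all hypothetical. The sketch compares three quantities: the parameter variance reported when a diagonal weight matrix is assumed correct, the actual variance of that estimate when the errors are in fact correlated, and the variance obtained with the full weight matrix.

```python
import numpy as np

def parameter_variances(X, C):
    """Return (reported_diag, actual_diag, full) parameter variances for a
    linear model with one parameter, sensitivities X (n x 1), and true
    observation error variance-covariance matrix C (n x n)."""
    W_diag = np.diag(1.0 / np.diag(C))   # correlations omitted
    W_full = np.linalg.inv(C)            # correlations included

    # Variance reported when the diagonal weights are taken as correct.
    A = np.linalg.inv(X.T @ W_diag @ X)
    reported_diag = A

    # Actual variance of the diagonal-weighted estimate under the true C.
    actual_diag = A @ X.T @ W_diag @ C @ W_diag @ X @ A

    # Variance of the estimate obtained with the full weight matrix.
    full = np.linalg.inv(X.T @ W_full @ X)

    return reported_diag.item(), actual_diag.item(), full.item()

# Hypothetical example: unit sensitivities, unit variances, correlation 0.5.
X = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5], [0.5, 1.0]])
reported, actual, full = parameter_variances(X, C)
print(reported, actual, full)
```

In this hypothetical case the diagonal-weight calculation reports a variance of 0.5, whereas the actual variance of the estimate is 0.75, so omitting a positive error correlation understates parameter uncertainty. With unequal sensitivities the full-weight variance can also fall below the actual variance of the diagonal-weighted estimate.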