## 1. Introduction

[2] In recent years, there has been a dramatic increase in the computational complexity of hydro(geo)logical models. This increase has been driven by new problems addressing large-scale relationships such as global warming, reactive transport at the catchment scale, or CO₂ sequestration. The computational burden becomes even more severe when facing the ubiquitous need for uncertainty quantification and risk assessment in the environmental sciences [*Christakos*, 1992; *Oreskes et al.*, 1994; *Rubin*, 2003] or stochastic inverse techniques for incorporating field data into uncertain hydrogeological models [e.g., *Kitanidis*, 1995; *Gómez-Hernández et al.*, 1997; *Evensen*, 2007; *Franssen et al.*, 2009]. For that reason, reducing the complexity of hydro(geo)logical models has been the focus of many research efforts [e.g., *Hooimeijer*, 2001].

[3] The goal of mathematical (not conceptual) model reduction is to reduce the computational costs or, alternatively, to admit more conceptual complexity, finer resolution, or larger domains at the same computational costs, or to make a brute-force optimization task more feasible [*Razavi et al.*, 2012]. The computational demand of stochastic models for space- and time-dependent hydro(geo)logical systems shall serve as an example. This demand can be broken down into contributions from spatial, temporal, and stochastic resolution, i.e., spatial grid resolution, time step size, and the number of repeated simulations dedicated to uncertainty. The latter may involve, for example, Monte Carlo (MC) simulation [e.g., *Freeze*, 1975; *Smith and Schwartz*, 1980, 1981; *Robert and Casella*, 2004], polynomial chaos expansions (PCE) [e.g., *Wiener*, 1938; *Li and Zhang*, 2007; *Oladyshkin et al.*, 2011a, 2012], or statistical moment-generating equations [e.g., *Neuman and Orr*, 1993; *Neuman*, 1993; *Zhang*, 2002]. Reducing model complexity with adequate techniques (rather than merely coarsening the spatial resolution) allows the modeler to largely maintain the required numerical prediction quality while controlling the computational costs.

[4] In this work, we will focus on temporal complexity, which stems from the dynamic character of hydro(geo)logical systems. This dynamic character appears in time-dependent system response curves. Examples include aquifer reactions to recharge events, tidal pumping, or changing river stages [e.g., *Yeh et al.*, 2009], drawdown curves (DC) caused by the excitation of the subsurface water level in pumping tests [e.g., *Fetter*, 2001], solute breakthrough curves (BTC) during the injection of water-borne tracers and contaminant spills [e.g., *Fetter*, 1999], and reactions of river discharge to precipitation in hydrological models [e.g., *Nash and Sutcliffe*, 1970].

[5] In our opinion, the most powerful contribution to temporal model reduction has been made by *Harvey and Gorelick* [1995]. Their approach converts dynamic models to steady state models by applying a Laplace transform to the time dimension. After a Taylor expansion of the Laplace coefficients (LC), one can directly simulate characteristics of the time-dependent response curves, the so-called temporal moments (TM), with steady state equations. Alternatively, TM can be derived by projecting the time-dependent governing equations onto a series of monomials *t^k* of order *k* [e.g., *Cirpka and Kitanidis*, 2000a]. The generating equations for TM are steady state equivalents of the original governing equations and thus allow for swift evaluation. TM are said to capture the most significant aspects of the system response, such as strength, delay, and duration, and often have well-defined physical meanings (see section 2.2). Thus, TM intuitively bear a high information density, dramatically reducing computational costs at a comparatively small loss of information. The only prerequisites are that the governing equations must be linear (systems of) PDEs or ODEs and that their coefficients must be time invariant.
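To make the projection onto monomials concrete: the *k*-th raw temporal moment of a response curve *c(t)* is *m_k* = ∫ *t^k c(t)* d*t*, and normalizing by the zeroth moment yields, e.g., the mean arrival time and the temporal variance of the curve. A minimal numerical sketch, using a hypothetical gamma-shaped breakthrough curve (not a curve from this study), whose moments are known in closed form:

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal-rule approximation of the integral of y(t) dt."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def temporal_moment(t, c, k):
    """k-th raw temporal moment m_k = integral t^k c(t) dt."""
    return trapezoid(t**k * c, t)

# Hypothetical breakthrough curve c(t) = t * exp(-t/2): a gamma shape with
# mean arrival time 4 and temporal variance 8 in closed form.
t = np.linspace(0.0, 50.0, 2001)
c = t * np.exp(-t / 2.0)

m0 = temporal_moment(t, c, 0)                  # zeroth moment: total mass
mean_arrival = temporal_moment(t, c, 1) / m0   # first normalized moment
var_arrival = temporal_moment(t, c, 2) / m0 - mean_arrival**2
print(mean_arrival, var_arrival)
```

The first few normalized moments thus condense an entire curve into a handful of numbers (strength, delay, duration), which is precisely the reduction exploited by the steady state moment-generating equations.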

[6] Even though TM are capable of covering many challenges in hydro(geo)logy (prediction, uncertainty quantification, calibration/inversion, or probabilistic risk assessment), they have found relatively limited application in simulation-based studies. The simulation-based studies that we could find (including the number of moments considered) range from the prediction of groundwater age or life expectancy by *Goode* [1996] (two TM), *Molson and Frind* [2011] (two TM), and *Varni and Carrera* [1998] (two TM), and the probabilistic assessment of well vulnerability zones [*Enzenhöfer et al.*, 2011] (five TM), to solute travel time analysis [*Cirpka and Nowak*, 2004] (two TM). *Cirpka and Kitanidis* [2000a] (two TM), *Li et al.* [2005] (two TM), *Zhu and Yeh* [2006] (two TM), *Pollock and Cirpka* [2008] (two TM), and *Yin and Illman* [2009] (two TM) applied TM to make calibration or stochastic inverse modeling more efficient. *Lawrence et al.* [2002] extended the pioneering work of *Harvey and Gorelick* [1995] and theoretically derived conditional TM for multirate mass transfer (MRMT) models, whereas *Luo et al.* [2008] derived the moment-generating equation for mass transfer described by the moments of an arbitrary memory function. *Cirpka and Kitanidis* [2000b] (three TM), *Cirpka and Kitanidis* [2000c] (three TM), and *Enzenhöfer et al.* [2011] (five TM) deterministically related TM to transport characteristics, while *Cunningham and Roberts* [1998] (four TM) used TM to characterize the impact of nonequilibrium sorption models on solute transport.

[7] We observed that almost all applications involve hypothetical data and scenarios to test and demonstrate their methods, and almost no field applications exist. This raises the question of why there are so few applications, since many models used by practitioners indeed satisfy the prerequisites for applying TM. We believe that hydro(geo)logists have simply not yet become comfortable with TM-based methods and that reservations exist about the possible loss of information when using only a few TM. The goal of the current study is to overcome these reservations and to push the use of TM, including higher orders, further toward practical application. Among all work known to us, only *Varni and Carrera* [1998], *Cunningham and Roberts* [1998], and *Nowak and Cirpka* [2006] compared their numerical simulations against field measurements of groundwater age, grain-size distributions, and solute breakthrough curves, respectively. *Nowak and Cirpka* [2006] and *Yin and Illman* [2009] reduced measured real breakthrough curves and experimental drawdown curves, respectively, to TM and then used these TM in geostatistical inversion.

[8] Furthermore, all of the applications of TM cited above predominantly employed low-order moments. For example, the studies by *Zhu and Yeh* [2005] and *Yin and Illman* [2009] stated that using TM induces a loss of information in inverse modeling, based on an analysis up to the first-order TM. The studies we could find rarely provided reasons for their choice of order, and none of them assessed the information lost by not considering higher orders. The absence of a systematic assessment raises the research question: How many TM are required to properly capture the behavior of a system?

[9] Another way of removing the time dependence from the governing equation is to apply Laplace transform techniques. These have proven suitable for forward model reduction (including reconstruction of the full time series from simulated LC) when considering sequences of more than 10 and up to 100 LC [e.g., *Sudicky*, 1989]. Here, the questions are: How many, and which, LC are required to properly represent the system?
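For illustration, a Laplace coefficient is simply the transform ĉ(*s*) = ∫ e^(−*st*) *c(t)* d*t* evaluated at a chosen value of *s*. A minimal numerical sketch, again using a hypothetical response curve whose transform is known in closed form (so the numerical coefficients can be checked):

```python
import numpy as np

def laplace_coefficient(t, c, s):
    """Laplace coefficient c_hat(s) = integral exp(-s*t) c(t) dt
    (trapezoidal rule on a truncated time axis)."""
    y = np.exp(-s * t) * c
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

# Hypothetical response curve c(t) = t * exp(-t/2); its exact Laplace
# transform is 1 / (s + 1/2)^2, which serves as a reference.
t = np.linspace(0.0, 80.0, 4001)
c = t * np.exp(-t / 2.0)

for s in (0.1, 1.0, 2.0):
    print(s, laplace_coefficient(t, c, s), 1.0 / (s + 0.5)**2)
```

Each choice of *s* yields one steady state quantity; the open question in the text is how many such values of *s*, and which ones, suffice to represent the dynamic system.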

[10] In addition to integral transformations, snapshot-based model reduction methods have drawn increasing attention [*Vermeulen et al.*, 2004; *McPhee and William*, 2008]. Via proper orthogonal decomposition (POD) into dominant spatial patterns [*Papoulis*, 1991], the model is reduced to a small number of orthogonal base functions in physical space with time-dependent coefficients. In other disciplines, this method is referred to as principal component analysis (PCA) [*Pearson*, 1901] or the Karhunen-Loève transform (KLT) [*Loève*, 1955]. We refer to these methods as spatial reduction methods, since the model is effectively reduced in physical space while the time-related model complexity remains untouched. The scope of this work, however, is strictly limited to temporal reduction methods. This strict focus is legitimate because reduction methods in time can be evaluated independently of spatial methods: since space and time are independent coordinates, reduction techniques in space and in time can be combined arbitrarily. For that reason, we do not consider spatial reduction methods any further.
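The snapshot-based idea can be sketched in a few lines: collect system states as columns of a snapshot matrix and truncate its singular value decomposition, keeping the left singular vectors as orthogonal spatial base functions. The following minimal POD sketch uses purely synthetic data (two dominant spatial patterns plus noise), not any model from this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic snapshot matrix: each column is the spatial state at one time,
# built from two dominant spatial patterns plus a little noise.
x = np.linspace(0.0, 1.0, 200)
times = np.linspace(0.0, 1.0, 50)
snapshots = (np.outer(np.sin(np.pi * x), np.cos(2.0 * np.pi * times))
             + 0.3 * np.outer(np.sin(3.0 * np.pi * x), np.sin(2.0 * np.pi * times))
             + 1e-3 * rng.standard_normal((x.size, times.size)))

# POD: the left singular vectors are the dominant spatial base functions.
U, sv, Vt = np.linalg.svd(snapshots, full_matrices=False)
energy = np.cumsum(sv**2) / np.sum(sv**2)
r = int(np.searchsorted(energy, 0.999)) + 1   # modes needed for 99.9% energy

reduced = (U[:, :r] * sv[:r]) @ Vt[:r, :]     # rank-r reconstruction
rel_err = np.linalg.norm(snapshots - reduced) / np.linalg.norm(snapshots)
print(r, rel_err)
```

Note that the reduction acts on the spatial dimension only: the rank-*r* model still carries fully resolved time-dependent coefficients (the rows of `Vt`), which is why such methods leave the temporal complexity untouched.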

[11] A principal difference between POD and TM (besides operating on space rather than on time), however, is that TM employ nonorthogonal base functions. The advantage of working with orthogonal base functions is that they have very elegant properties in the solution of many mathematical and physical problems (e.g., Hermite polynomials are the optimal set of base functions when expanding functions of normally distributed variables [*Cameron and Martin*, 1947]). This leads to the next research question: Would other (and possibly orthogonal) base functions in temporal reduction work better than the nonorthogonal monomials that lead to TM?
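One practical consequence of nonorthogonality can be illustrated numerically: the Gram matrix of the monomials 1, *t*, ..., *t*^(n−1) on [0, 1] is the notoriously ill-conditioned Hilbert matrix, whereas an orthogonal family yields a nearly diagonal, well-conditioned Gram matrix. The sketch below uses shifted Legendre polynomials merely as one convenient orthogonal family on [0, 1]; it is an illustration, not part of the analysis in this paper:

```python
import numpy as np

# Trapezoidal quadrature weights on [0, 1].
t = np.linspace(0.0, 1.0, 20001)
w = np.full(t.size, t[1] - t[0])
w[0] *= 0.5
w[-1] *= 0.5

n = 6
monomials = np.stack([t**k for k in range(n)])
legendre = np.stack([np.polynomial.legendre.Legendre.basis(k)(2.0 * t - 1.0)
                     for k in range(n)])

def gram(basis):
    """Matrix of inner products <b_i, b_j> = integral b_i(t) b_j(t) dt."""
    return (basis * w) @ basis.T

cond_mono = np.linalg.cond(gram(monomials))   # ~Hilbert matrix: ill-conditioned
cond_leg = np.linalg.cond(gram(legendre))     # nearly diagonal: well-conditioned
print(cond_mono, cond_leg)
```

Already for six base functions the monomial Gram matrix is conditioned many orders of magnitude worse than the orthogonal one, which motivates asking whether orthogonal temporal base functions would serve model reduction better.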

[12] We will address the research questions raised above in the following order: We first consider whether other base functions in temporal reduction work better than the nonorthogonal monomials. To this end, we theoretically derive classes of temporal base functions that can reduce hydro(geo)logical models, and compare them to the structural aspects of TM. We perform this analysis solely in terms of computational efficiency, without considering options for reconstructing the original time series from the reduced model.

[13] Second, we discuss the research question: How many TM or LC are required to properly capture the dynamic system behavior? We answer it by measuring the information density of TM with regard to the underlying fully resolved time series. For this analysis, we apply a novel method called PreDIA (pre-posterior data impact assessor) [*Leube et al.*, 2012], a nonlinear Bayesian filtering scheme originally developed for data worth analysis. We apply it here to measure the informational value of knowing any given number of TM of an otherwise unknown curve. PreDIA, as used here, may be seen as a generalized analysis of variance (ANOVA) [*Harris*, 1994] technique. We perform this analysis on an example featuring aquifer pumping tests. The example is based on a single, specific linear PDE and, hence, can by no means be seen as a general assessment for all linear PDEs involved in hydro(geo)logical problems. However, the very same analysis can easily be applied to other cases, e.g., the advection-dispersion equation or linear hydrological models.
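The essence of such an information-density analysis can be sketched with a toy example (this is not the PreDIA implementation itself, and all curve shapes, priors, and error levels below are hypothetical): draw an ensemble of response curves from uncertain parameters, weight the ensemble by a Gaussian likelihood of the observed TM, and compare the remaining (conditional) variance of a prediction target with its prior variance.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy ensemble of drawdown-like curves s_i(t) = A_i * (1 - exp(-t / tau_i)),
# with uncertain amplitude A and time constant tau (hypothetical priors).
t = np.linspace(0.0, 10.0, 201)
n = 5000
A = rng.lognormal(0.0, 0.3, n)
tau = rng.lognormal(0.5, 0.5, n)
curves = A[:, None] * (1.0 - np.exp(-t[None, :] / tau[:, None]))

# Temporal moments of orders 0 and 1 over the observation window
# (trapezoidal quadrature weights).
w = np.full(t.size, t[1] - t[0])
w[0] *= 0.5
w[-1] *= 0.5
tm = np.stack([curves @ (t**k * w) for k in range(2)], axis=1)

truth = 0                 # realization 0 plays the role of the "true" system
target = curves[:, -1]    # prediction target: drawdown at the final time

def conditional_variance(num_moments, rel_err=0.05):
    """Weight the ensemble by a Gaussian likelihood of the observed moments
    (nonlinear Bayesian weighting in the spirit of PreDIA) and return the
    remaining variance of the prediction target."""
    sigma = rel_err * np.abs(tm[truth, :num_moments])
    misfit = (tm[:, :num_moments] - tm[truth, :num_moments]) / sigma
    wgt = np.exp(-0.5 * np.sum(misfit**2, axis=1))
    wgt /= wgt.sum()
    mean = wgt @ target
    return float(wgt @ (target - mean) ** 2)

prior_var = float(target.var())
var_1tm = conditional_variance(1)   # knowing the zeroth moment only
var_2tm = conditional_variance(2)   # knowing the zeroth and first moments
print(prior_var, var_1tm, var_2tm)
```

The drop in conditional variance from `var_1tm` to `var_2tm` is the informational value of the added moment; repeating this for increasing numbers of TM yields a curve of diminishing returns, which is exactly the quantity the second research question asks about.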

[14] These research questions need to be weighed against the purpose and context of model reduction. Sometimes, TM directly correspond to the physical quantities of interest. In some cases of forward modeling, however, the ability to reconstruct a full time series from the reduced model will be relevant. In inverse modeling, only the information content of the measured time series that is captured by the reduced model will matter. We will address these issues throughout our study wherever appropriate.

[15] The remainder of this paper is organized as follows: Section 2 summarizes the concept of TM. Section 3 discusses possible alternative integral transformations, and section 4 compares their adequacy for model reduction with that of TM (both addressing the first research question). Section 5 presents the analysis of information density using the example of drawdown curves from aquifer tests, thereby addressing the second research question.