## 1. Introduction

[2] Hydrology is one of the oldest fields of interest in science and has been studied on both small and large scales for about 6000 years. The goal of the present work is to achieve good predictions of flow in a sewage system. Black box models have provided good prediction results, often much better than those of conceptual or physical models, depending on how well the actual system is known. *Carstensen et al.* [1998] showed that data-driven models are more reliable for online applications in sewers than stationary deterministic models.

[3] Black box models have been used in hydrology for decades; *Sherman* [1932] presented the first black box model by introducing the theory of the unit hydrograph. The unit hydrograph is an impulse response function and as such is estimated directly as a finite impulse response (FIR) model, i.e., the flow is modeled as a weighted sum of lagged values of precipitation. The unit hydrograph describes the relation between effective precipitation and quick flow. Hence, for the flow data, a base flow separation must be performed, and the effective precipitation must be calculated from the precipitation data. Quite often, physical equations are used for effective precipitation calculations, e.g., Horton's infiltration formula [*Horton*, 1935] or Philip's equation [*Philip*, 1969]. Effective rain identification can also be incorporated in the hydrograph modeling process itself [e.g., *Hsu et al.*, 2002].
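
The FIR form of the unit hydrograph can be illustrated with a minimal sketch. The function name `fir_flow`, the hydrograph ordinates and the rain pulse below are hypothetical and chosen only for illustration; they are not taken from the data or the models of this paper.

```python
def fir_flow(effective_rain, unit_hydrograph):
    """Quick flow as a FIR filter: q[t] = sum over k of h[k] * p[t - k]."""
    flow = []
    for t in range(len(effective_rain)):
        total = 0.0
        for k, h_k in enumerate(unit_hydrograph):
            if t - k >= 0:  # rain before the start of the record is unknown
                total += h_k * effective_rain[t - k]
        flow.append(total)
    return flow


# Hypothetical unit hydrograph ordinates (they sum to 1, so volume is conserved).
h = [0.2, 0.5, 0.3]
# A single unit pulse of effective rain reproduces the hydrograph itself.
pulse = [1.0, 0.0, 0.0, 0.0]
print(fir_flow(pulse, h))
```

Because the model is linear and time-invariant, the response to any rain series is the superposition of shifted and scaled copies of the same hydrograph.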

[4] For the purpose of flow predictions, ARX (autoregressive exogenous) and ARMAX (autoregressive moving average exogenous) models are in most cases more successful than FIR models. In these models the flow is modeled not only as a function of precipitation but also of past flow values, so that all the available information is used. *Todini* [1978] used an ARMAX model for online flow predictions, and *Novotny and Zheng* [1990] used an ARMAX model for deriving a watershed response function; their paper provides an overview of how ARMAX models, transfer functions, Green's functions and the Muskingum routing method are related.
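
The difference from the FIR model can be sketched as a one-step ARX prediction written in regression form, in which past flow enters alongside past precipitation. The function name `arx_predict`, the model orders and the coefficient values are assumptions made for illustration only; the moving average part of an ARMAX model is omitted.

```python
def arx_predict(past_flow, past_rain, a, b):
    """One-step ARX prediction in regression form.

    y_hat[t] = sum_i a[i] * y[t-1-i] + sum_j b[j] * u[t-1-j],
    where the last element of each list is the most recent observation.
    """
    ar_part = sum(a_i * past_flow[-1 - i] for i, a_i in enumerate(a))
    x_part = sum(b_j * past_rain[-1 - j] for j, b_j in enumerate(b))
    return ar_part + x_part


# Hypothetical first-order example: one lag of flow and one lag of rain.
flow = [1.0, 2.0]
rain = [0.0, 4.0]
print(arx_predict(flow, rain, a=[0.5], b=[0.25]))  # 0.5 * 2.0 + 0.25 * 4.0 = 2.0
```

Setting all coefficients in `a` to zero recovers a FIR model, which shows that the ARX structure contains the FIR structure as a special case.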

[5] Both the FIR models and the ARMAX models are linear time-invariant models. These models are simple and easy to use and in many cases provide acceptable results, particularly when the volume of the flood is large compared to the infiltrated volume. Nevertheless, the rainfall-runoff process is believed to be highly nonlinear, time-varying and spatially distributed [e.g., *Singh*, 1964; *Chiu and Huang*, 1970; *Pilgrim*, 1976]. With increased computer power, nonlinear models have become increasingly popular. *Capkun et al.* [2001] handle the nonlinearity by using an ARX model and by modeling the variance as a function of past rainfall. Bayesian methods have also been applied; *Campbell et al.* [1999] used such a procedure for parameter estimation in their nonlinear flood event model. *Iorgulescu and Beven* [2004] used nonparametric techniques to identify the rainfall-runoff relationship by direct mapping from the input space to the output space, with good results. During the last decade neural networks have been popular, as in the work by *Hsu et al.* [1995] and *Shamseldin* [1997] and, more recently, the SOLO-ANN model by *Hsu et al.* [2002]. *Karlsson and Yakowitz* [1987] used a nonparametric regression method, which they refer to as the nearest neighbor method. They compare FIR, ARMAX and nearest neighbor models. Their results favor the nearest neighbor and the ARMAX models; however, their results do not distinguish between the ARMAX and the nearest neighbor models. *Porporato and Ridolfi* [1996] used a nearest neighbor model and found that the local linear model with small neighborhoods gave the best results. *Porporato and Ridolfi* [1997] detected strong nonlinear deterministic components in the discharge series. They used noise reduction techniques specifically proposed for the field of chaos theory to preserve the delicate nonlinear interactions, and then they used nonlinear prediction (NLP) with good results.
*Porporato and Ridolfi* [2001] followed up on these methodologies for multivariate systems. *Previdi and Lovera* [2004] tackle the nonlinearity by using time-varying ARX models, which they refer to as nonlinear parameter varying models (NLPV). The parameter variation is defined as the output of a nonlinear function, and the optimization is performed by using neural networks. *Young et al.* [2001] considered the time variable parameters to be state dependent, and the method is thus referred to as the SDP approach. For nonlinear phenomena this leads to a two-stage procedure, called the data-based mechanistic approach (DBM) [*Young*, 2003]. In recent years fuzzy methods have been tested for flood forecasting [e.g., *Chang et al.*, 2005; *Nayak et al.*, 2005].

[6] In the present paper conditional parametric models are used to develop models for flow predictions in a sewage system. A conditional parametric model is a linear regression model where the parameters vary as a smooth function of some explanatory variable. Thus the method presented here is in line with the SDP and the NLPV methodologies. The name conditional parametric model originates from the fact that if the argument of the functions is fixed, then the model is an ordinary linear model [*Hastie and Tibshirani*, 1993; *Cleveland*, 1994]. In the models presented here, the parameters vary locally as polynomials of external variables, as described by *Nielsen et al.* [1997]. In contrast to linear methods like FIR and ARX, this methodology allows a fixed input to produce different outputs depending on external circumstances.
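
The idea can be illustrated with a minimal sketch for a single regressor, y = θ(x) z, in which the coefficient θ at a chosen value of x is estimated by kernel-weighted least squares. Several simplifications are assumed here and are not taken from this paper: the fit is local constant (a polynomial of degree zero) rather than a higher-order local polynomial, the tricube kernel and the bandwidth are arbitrary choices, the names `tricube` and `local_coefficient` are hypothetical, and the data are synthetic.

```python
def tricube(u):
    """Tricube kernel: smooth weight that is zero for |u| >= 1 (assumed choice)."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_coefficient(x0, x, z, y, bandwidth):
    """Estimate theta(x0) in y = theta(x) * z by kernel-weighted least squares.

    This is a local constant fit (a polynomial of degree zero in x).
    It raises ZeroDivisionError if no observation lies within the bandwidth.
    """
    num = 0.0
    den = 0.0
    for x_i, z_i, y_i in zip(x, z, y):
        w = tricube((x_i - x0) / bandwidth)
        num += w * z_i * y_i
        den += w * z_i * z_i
    return num / den


# Synthetic data (hypothetical): the coefficient is 0.2 when x is near 0
# and 0.8 when x is near 1, so the same input z gives different outputs.
x = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
z = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [0.2 * z_i for z_i in z[:3]] + [0.8 * z_i for z_i in z[3:]]

theta_low = local_coefficient(0.1, x, z, y, bandwidth=0.5)
theta_high = local_coefficient(1.1, x, z, y, bandwidth=0.5)

# With x fixed, the model is an ordinary linear model in z.
print(theta_low * 2.0, theta_high * 2.0)
```

In this sketch the same input z = 2 yields a small output when x is near 0 and a large output when x is near 1, which is the behavior that a single FIR or ARX coefficient cannot reproduce.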

[7] This paper is organized as follows: In section 2 the models are described, followed by section 3 with a description of the parameter estimation method. Section 4 contains results, and in section 5, online prediction and control in sewage systems are discussed. Finally, in section 6 conclusions are drawn.