### Abstract

[1] When constructing a hydrological model at the macroscale (e.g., watershed scale), the structure of this model will inherently be uncertain because of many factors, including the lack of a robust hydrological theory at that scale. In this work, we assume that a suitable conceptual model structure for the hydrologic system has already been determined; that is, the system boundaries have been specified, the important state variables and input and output fluxes to be included have been selected, the major hydrological processes and geometries of their interconnections have been identified, and the continuity equation (mass balance) has been assumed to hold. The remaining structural identification problem, then, is to select the mathematical form of the dependence of the output on the inputs and state variables, so that a computational model can be constructed for making simulations and/or predictions of the system input-state-output behavior. The conventional approach to this problem is to preassume some fixed (and possibly erroneous) mathematical forms for the model output equations. We show instead how Bayesian data assimilation can be used to directly estimate (construct) the form of these mathematical relationships such that they are statistically consistent with macroscale measurements of the system inputs, outputs, and (if available) state variables. The resulting model has a stochastic rather than deterministic form and thereby properly represents both what we know (our certainty) and what we do not know (our uncertainty) about the underlying structure and behavior of the system. Further, the Bayesian approach enables us to merge prior beliefs, in the form of preassumed model equations, with information derived from the data to construct a posterior model. As a consequence, in regions of the model space for which observational data are available, errors in the preassumed mathematical form of the model can be corrected, improving model performance. For regions where no such data are available, the “prior” theoretical assumptions about the model structure and behavior will dominate. The approach, entitled Bayesian estimation of structure, is used to estimate water balance models for the Leaf River Basin, Mississippi, at annual, monthly, and weekly time scales, conditioned on the assumption of a simple single-state-variable conceptual model structure. Inputs to the system are uncertain observed precipitation and potential evapotranspiration, and outputs are estimated probability distributions of actual evapotranspiration and streamflow discharge. Results show that the models estimated for the annual and monthly time scales perform quite well. However, model performance deteriorates for the weekly time scale, suggesting limitations in the assumed form of the conceptual model.
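
The interplay between prior assumptions and data described above can be illustrated with a deliberately simplified sketch. It is not the estimation algorithm of this paper: it swaps in a textbook conjugate-normal update with known variances, applied independently to discrete bins of a hypothetical state variable. The function name `posterior_by_bin`, the bin layout, and all numerical values are assumptions made purely for illustration.

```python
def posterior_by_bin(prior_mean, prior_var, observations, obs_var):
    """Conjugate-normal update (known variances), applied bin by bin.

    prior_mean   : prior mean of the output relationship in each bin
    prior_var    : prior variance, shared by all bins (assumed)
    observations : list of lists; observed values falling in each bin
    obs_var      : observation-error variance (assumed known)
    Returns a list of (posterior_mean, posterior_variance) tuples.
    """
    results = []
    for m0, obs in zip(prior_mean, observations):
        n = len(obs)
        # Precisions (inverse variances) add under the normal model.
        precision = 1.0 / prior_var + n / obs_var
        # Posterior mean is the precision-weighted blend of prior and data.
        mean = (m0 / prior_var + sum(obs) / obs_var) / precision
        results.append((mean, 1.0 / precision))
    return results


# Hypothetical prior: the preassumed equation gives 0.5 in every bin.
prior_mean = [0.5, 0.5, 0.5]
# Hypothetical data: well observed, sparsely observed, unobserved bin.
observations = [[0.2, 0.2, 0.2, 0.2], [0.3], []]

posterior = posterior_by_bin(prior_mean, 0.04, observations, 0.04)
for (mean, var), obs in zip(posterior, observations):
    print(f"n_obs={len(obs)}  posterior mean={mean:.3f}  variance={var:.4f}")
```

In this toy setting the well-observed bin is pulled strongly toward the data and its variance shrinks, the sparsely observed bin moves only partway, and the unobserved bin retains the prior mean and variance unchanged.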