#### 2.1. Theoretical Framework

[5] The spread of cholera epidemics along river networks is addressed by viewing the environmental matrix as an oriented graph. Nodes represent human communities (cities, towns, villages), while edges are hydrologic links between communities [*Bertuzzo et al.*, 2007, 2010]. Edge direction indicates the direction of water flow. The model is assembled by coupling: i) a local epidemic model at node level, ii) a hydrologic model for the water balance of each node, and iii) a transport model for the spreading of the disease agent through the river network. Cholera dynamics in a generic node *i* of the network are described via a compartmental SIR-like model with five state variables, namely the abundances of susceptible (*S*_{i}), infected (*I*_{i}) and recovered (*R*_{i}) individuals, the abundance of *V. cholerae* (*V*_{i}), and the volume of the local water resources (*W*_{i}). The dynamics of the system can be described by the following system of nonlinear differential equations:
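
A form of equations (1) to (5) consistent with the term-by-term description of the model can be sketched as follows. The symbol *l*_{A} for the advection rate and the exact arrangement of the two transport terms in equation (5) are assumptions made here, chosen by analogy with the dispersion rate *l*_{D}. Sums up to *n*_{i}^{up} run over the nodes immediately upstream of *i*, and sums up to *n*_{i} run over all nearest neighbors of *i*.

```latex
\begin{align}
\frac{dS_i}{dt} &= \mu (H_i - S_i) - \beta(t)\,\frac{V_i/W_i}{K + V_i/W_i}\,S_i + \rho R_i \tag{1}\\
\frac{dI_i}{dt} &= \beta(t)\,\frac{V_i/W_i}{K + V_i/W_i}\,S_i - (\gamma + \mu + \alpha)\, I_i \tag{2}\\
\frac{dR_i}{dt} &= \gamma I_i - (\rho + \mu)\, R_i \tag{3}\\
\frac{dW_i}{dt} &= J_i - L_i - Q_i + \sum_{j=1}^{n_i^{up}} Q_j \tag{4}\\
\frac{dV_i}{dt} &= p I_i - \mu_V V_i
  - l_A \Big( V_i - \sum_{j=1}^{n_i^{up}} V_j \Big)
  - l_D \Big( n_i V_i - \sum_{j=1}^{n_i} V_j \Big) \tag{5}
\end{align}
```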

The evolution of the susceptible compartment (equation (1)) is described by the balance between population demography, infections due to contact with *V. cholerae*, and immunity losses. The host population is assumed to be at demographic equilibrium, with a constant recruitment *μH*_{i}, where *H*_{i} is the size of the local community, and a constant mortality rate *μ*. Susceptible people become infected at a rate *β*(*t*)(*V*_{i}/*W*_{i})/(*K* + *V*_{i}/*W*_{i}), where *β* is the transmission parameter accounting for contact with contaminated water, and (*V*_{i}/*W*_{i})/(*K* + *V*_{i}/*W*_{i}) is the logistic dose-response curve (*sensu* *Codeço* [2001]), with *K* the concentration at which the response reaches half its maximum. The dynamics of the infected compartment (equation (2)) are described as the balance between newly infected individuals and losses due to recovery and natural or cholera-induced mortality, with *γ* and *α* being the rates of recovery and of mortality due to cholera, respectively. Recovered individuals (equation (3)) lose their immunity and become susceptible again at a rate *ρ*. The environmental water reservoir of a generic node *i* (equation (4)) depends on the balance between local rainfall *J*_{i}, losses due to infiltration and evapotranspiration *L*_{i}, and the difference between outflowing discharge (*Q*_{i}) and inflowing discharge, where *n*_{i}^{up} is the number of nodes immediately upstream of *i*. Equation (5) describes the dynamics of the free-living bacteria in the local water reservoir. Infected people contribute to vibrio abundance at a *per capita* rate *p*. Free-living bacteria are also assumed to die at a constant rate *μ*_{V}. The last two terms in equation (5) implement the transport model of *V. cholerae* through the river network, as explained in the following.
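
The local balances described above can be sketched for a single node as follows. This is a minimal illustration, not the authors' implementation: the function name `local_rates` is hypothetical, the transport terms of equation (5) are omitted, the water volume *W*_{i} is passed in as a given value, and all numerical values in the example call are placeholders chosen only to exercise the code.

```python
def local_rates(S, I, R, V, W, H, beta, p, mu, gamma, alpha, rho, mu_V, K):
    """Local right-hand sides for one node, without transport terms.

    Returns (dS/dt, dI/dt, dR/dt, dV/dt) as described in the text:
    demography, dose-dependent infection, recovery, immunity loss,
    bacterial shedding and bacterial mortality.
    """
    conc = V / W                       # vibrio concentration in the local reservoir
    force = beta * conc / (K + conc)   # saturating dose-response infection rate
    dS = mu * (H - S) - force * S + rho * R
    dI = force * S - (gamma + mu + alpha) * I
    dR = gamma * I - (rho + mu) * R
    dV = p * I - mu_V * V              # shedding by infected minus bacterial death
    return dS, dI, dR, dV


# Placeholder values (assumptions for illustration only).
dS, dI, dR, dV = local_rates(
    S=9000.0, I=500.0, R=500.0, V=2.0e5, W=1.0e4, H=10000.0,
    beta=1.0, p=10.0, mu=4.0e-5, gamma=0.2, alpha=0.004,
    rho=0.001, mu_V=0.2, K=10.0,
)
print(dS, dI, dR, dV)
```

When *S*_{i} + *I*_{i} + *R*_{i} = *H*_{i}, the three host equations sum to −*αI*_{i}, i.e., the only net change in the host population is cholera-induced mortality, which gives a quick consistency check on the signs.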

[6] A complex set of processes is known to drive the dispersion of vibrios and the consequent spreading of the disease among communities [*Colwell*, 1996; *Pascual et al.*, 2002; *Lipp et al.*, 2002]. A primary mechanism of propagation is dispersion through surface waters. *V. cholerae* can in fact survive outside the human host in the aquatic environment and may also live in symbiosis with phytoplankton and zooplankton [*Islam et al.*, 1994; *Colwell*, 1996]. As a result, the bacteria, and therefore the disease, can spread through hydrologic pathways among human communities. This mechanism is modeled by assuming that vibrios are transported downstream by advection at a given rate (third term on the RHS of equation (5)). The advection rate is assumed to be proportional to the local river velocity *u*_{i}(*t*) and therefore, in general, it can vary in space and time due to hydrological fluctuations. Other transport pathways can be superimposed on this baseline mechanism, related, e.g., to the short-range distribution of water for irrigation or consumption, or to the movement of bacteria attached to phyto- and zooplankton species (like algae and copepods) that can in turn be transported by larger organisms [*Lipp et al.*, 2002]. Moreover, human mobility can also enhance the spreading of the disease because susceptible and/or infected individuals can act as vectors of pathogens [*Bertuzzo et al.*, 2011; *Chao et al.*, 2011]. All these processes can also promote transport against the flow direction and are therefore conceptually modeled as isotropic dispersion, i.e., vibrios diffuse from any node *i* to each nearest neighbor at a rate *l*_{D} (last term on the RHS of equation (5)). In the equations, *n*_{i} is the number of nearest neighbors of node *i*. Note that in a river network each node has only one downstream node; therefore *n*_{i} = *n*_{i}^{up} + 1.
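
The two transport mechanisms described above can be sketched on a small tree-like network. This is an illustration under stated assumptions: the function name `transport_rates`, the symbol `l_A` for the advection rate, and the list-based network encoding are hypothetical, and the outlet node is assumed to lose bacteria by advection out of the network while exchanging by dispersion only with in-network neighbors.

```python
def transport_rates(V, downstream, l_A, l_D):
    """Net transport contribution to dV_i/dt for every node.

    V          : list of bacterial abundances, one per node
    downstream : downstream[i] is the index of the node below i,
                 or None for the outlet
    l_A, l_D   : advection and dispersion rates (l_A is an assumed symbol)
    """
    n = len(V)
    upstream = [[] for _ in range(n)]
    for j, d in enumerate(downstream):
        if d is not None:
            upstream[d].append(j)

    rates = []
    for i in range(n):
        # Nearest neighbors: all upstream nodes plus the single downstream node.
        neighbors = list(upstream[i])
        if downstream[i] is not None:
            neighbors.append(downstream[i])
        # Advection: loss to the downstream node, gain from upstream nodes.
        advection = -l_A * (V[i] - sum(V[j] for j in upstream[i]))
        # Isotropic dispersion: exchange with every nearest neighbor.
        dispersion = -l_D * (len(neighbors) * V[i] - sum(V[j] for j in neighbors))
        rates.append(advection + dispersion)
    return rates


# Five-node example: node 0 is the outlet; nodes 1 and 2 drain into 0;
# nodes 3 and 4 drain into 1. Values are placeholders.
V = [5.0, 3.0, 2.0, 1.0, 4.0]
downstream = [None, 0, 0, 1, 1]
print(transport_rates(V, downstream, l_A=0.5, l_D=0.3))
```

In this sketch dispersion alone redistributes bacteria without changing their total, whereas advection alone removes bacteria from the network only through the outlet.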

[7] Seasonal variations of the hydrologic regime produce changes in the volume of the local water reservoir *W*_{i}(*t*), in the river velocity and, hence, in the vibrio advection rate. They are thus expected to significantly affect disease dynamics. The effect of hydrologic fluctuations on the dispersion rate *l*_{D} is instead hard to predict because of the variety of potential underlying mechanisms. In the absence of field or experimental evidence on its behavior, we keep it constant in the present analysis.

#### 2.2. Application

[8] In order to broaden the applicability of the analysis, we adopt Optimal Channel Networks (OCNs) [*Rinaldo et al.*, 1992] as a general model of hydrological networks. OCNs are constructs shown to yield forms indistinguishable from real-life river networks (Figure 2, right). Moreover, we impose a uniform distribution of population, i.e., *H*_{i} = *H* ∀ *i*. These two assumptions allow us to single out the hydroclimatologic controls on the prevalence patterns in a non-specific geographical context and to exclude other sources of variability. In addition, in the case of uniform population, gravity-like models that have been successfully employed to model human mobility [*Mari et al.*, 2011] can be effectively approximated by the dispersion mechanism introduced above.

[9] The nature of the exercise suggests some simplifications of the hydrological model. In particular, for the monsoon-like climate analyzed herein, the hydrological dynamics of a local node (equation (4)) are fast with respect to temporal variations of precipitation. Therefore, one can reasonably assume that at any time the water reservoir of each node is in instantaneous equilibrium with the external forcings (i.e., *dW*_{i}/*dt* = 0). If, in addition, we consider that the precipitation (*J*_{i}) and loss (*L*_{i}) terms are approximately uniformly distributed in space, for any pair of nodes *i* and *j* we obtain the relation *Q*_{i}(*t*)/*Q*_{j}(*t*) = *A*_{i}/*A*_{j}, where *A*_{i} is the drainage area of node *i*, i.e., the total number of nodes upstream of *i*. At any time, the river velocity can also safely be assumed to be uniform in space (*u*_{i}(*t*) = *u*(*t*)), a reasonable approximation for many landscapes [see *Leopold et al.*, 1964]. As a result, the advection rate is also uniform in space and, being proportional to the velocity, it can be expressed as its time average multiplied by *u*(*t*)/*ū*, where the overline indicates time averaging. The water volume of each node *W*_{i} is proportional to the river cross-sectional area *Q*_{i}(*t*)/*u*(*t*); thus we finally obtain the relation *W*_{i}(*t*) ∝ *Q*_{i}(*t*) ∝ *A*_{i}: at a fixed time, the water reservoir increases downstream in proportion to the drainage area. Focusing on a single node, instead, the river velocity, the cross-sectional area and, in turn, the water volume vary in time following the seasonal variation of the hydrologic regime. Adopting a stage-discharge approach [*Leopold et al.*, 1964], one gets *u*(*t*) ∝ *Q*_{i}(*t*)^{a} and consequently *W*_{i}(*t*) ∝ *Q*_{i}(*t*)^{1−a}. In the following we employ the exponent *a* = 0.4, which is derived under the hypothesis of uniform flow in a large floodplain [*Leopold et al.*, 1964]. Overall, the hydrologic conditions described above are deemed representative of a significant range of cases of interest. Moreover, they allow the spatiotemporal evolution of the water volume *W*_{i}(*t*) to be derived directly from the discharge observed at a single node, without explicitly solving equation (4) and thus without further assumptions on the processes of precipitation, evapotranspiration and infiltration. Specifically, we employ the discharge time series observed in the Bengal area (Figure 1). The model resulting from the above assumptions is detailed in the auxiliary material.
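
The scaling relations above suggest how water volumes, velocity and advection rate could be derived from a single discharge record. The following is a rough sketch under stated assumptions: the function name `hydrology_from_discharge`, the argument `l_mean` (the time-averaged advection rate) and the short discharge series are hypothetical, and volumes are returned in arbitrary units because only proportionalities are specified.

```python
def hydrology_from_discharge(Q_ref, areas, a=0.4, l_mean=1.0):
    """Derive relative velocity, advection rate and water volumes.

    Q_ref  : discharge time series observed at one reference node
    areas  : drainage area A_i of every node
    a      : stage-discharge exponent, u(t) proportional to Q(t)**a
    l_mean : assumed time-averaged advection rate
    """
    u_raw = [q ** a for q in Q_ref]
    u_mean = sum(u_raw) / len(u_raw)
    u_rel = [u / u_mean for u in u_raw]      # u(t) divided by its time average
    l_adv = [l_mean * u for u in u_rel]      # advection rate proportional to u(t)
    # W_i(t) proportional to A_i in space and to Q(t)**(1 - a) in time.
    W = [[A * q ** (1.0 - a) for q in Q_ref] for A in areas]
    return u_rel, l_adv, W


# Placeholder seasonal discharge (arbitrary units) and two nodes.
Q_ref = [1.0, 4.0, 16.0, 4.0]
u_rel, l_adv, W = hydrology_from_discharge(Q_ref, areas=[1.0, 3.0])
print(u_rel)
print(W[0])
print(W[1])
```

By construction the advection rate averages to `l_mean` over the record, and at any fixed time the ratio of water volumes between two nodes equals the ratio of their drainage areas.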

[10] During and after the monsoon period, a large fraction of the region under study, up to 70% during the most severe floods, is inundated and people crowd the unaffected areas. This is typically accompanied by widespread failure of the sanitation and sewage systems, which results in reduced access to treated water and in increased environmental contamination. From a modeling standpoint, this translates into higher values of the exposure rate *β*, the rate at which susceptibles ingest contaminated water and food, and of the contamination rate *p*, the *per capita* rate at which stools reach and contaminate the water resource [*Codeço*, 2001]. We assume that at each node these rates are proportional to the fluctuations of the local water volume around its mean value (see auxiliary material).

[11] To run the model, we also employ climatological forcings typical of the Indian subcontinent. In particular, we model the effects of the temperature annual cycle (Figure 1) on the survival of cholera bacteria in the environment. Warmer temperatures are known to favor the attachment, growth, and multiplication of *V. cholerae* in the surface water [*Colwell*, 1996; *Lipp et al.*, 2002]. This is accounted for by assuming that the mortality of the bacteria in freshwater environments depends on temperature *T* as follows:
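
A linear form consistent with the properties stated for this dependence (constant mortality for *ϵ* = 0, null mortality at the maximum temperature for *ϵ* = 1, and a mortality equal to its average value at the average temperature) is given below. It is a reconstruction under those constraints and should be read as an assumption; the overlined symbols denote the average mortality and the average temperature.

```latex
\mu_V(T) = \overline{\mu}_V \left[ 1 - \epsilon \, \frac{T - \overline{T}}{T_{max} - \overline{T}} \right]
```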

where *μ̄*_{V} is the average vibrio mortality and *T*_{max} and *T̄* are the maximum and the average temperature, respectively. The parameter 0 ≤ *ϵ* ≤ 1 quantifies the effect of temperature on *V. cholerae* mortality. With *ϵ* = 0, the mortality rate is assumed to be constant throughout the year, whereas with *ϵ* = 1 the net mortality rate of cholera pathogens is null during the warmest period.
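
The role of *ϵ* can be illustrated with a short sketch. The linear dependence on temperature used here is an assumption chosen to satisfy the two limiting cases just described, and the function name `vibrio_mortality` and the numerical values are hypothetical.

```python
def vibrio_mortality(T, mu_V_mean, T_mean, T_max, eps):
    """Temperature-dependent bacterial mortality (assumed linear form).

    Equals mu_V_mean at the average temperature, is constant for eps = 0,
    and vanishes at T = T_max for eps = 1.
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must lie in [0, 1]")
    return mu_V_mean * (1.0 - eps * (T - T_mean) / (T_max - T_mean))


# Placeholder annual cycle sampled at three temperatures (degrees Celsius).
temperatures = [20.0, 25.0, 30.0]
for eps in (0.0, 0.5, 1.0):
    rates = [vibrio_mortality(T, 0.2, 25.0, 30.0, eps) for T in temperatures]
    print(eps, rates)
```

Because the form is linear in *T*, the time-averaged mortality equals the average value for any *ϵ*, so *ϵ* changes only the seasonal amplitude.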

[12] Finally, to model the natural presence of *V. cholerae* in the coastal aquatic environment, we impose at the outlet of the river network a concentration of pathogens proportional to the measured chlorophyll concentration (Figure 1a), a proxy for phyto–zooplankton and, in turn, *V. cholerae* concentration [*Lobitz et al.*, 2000; *Jutla et al.*, 2010]. We analyze the long-term behavior of the system by simulating the process for 30 years, each with the same hydroclimatological pattern. To ensure that the system has reached an endemic state without memory of its initial conditions, we discard the first 10 years and analyze the patterns of cholera prevalence for the remaining 20 years.
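
The simulation protocol (a one-year forcing repeated for 30 years, with the first 10 years discarded) can be sketched as follows. The helper names `periodic_forcing` and `endemic_window`, and the three-step toy year, are hypothetical and serve only to show the bookkeeping.

```python
def periodic_forcing(one_year, n_years=30):
    """Repeat a one-year forcing pattern identically for n_years."""
    return list(one_year) * n_years


def endemic_window(series, steps_per_year, total_years=30, burn_in_years=10):
    """Drop the burn-in years from a simulated series and keep the rest."""
    if len(series) != steps_per_year * total_years:
        raise ValueError("series length does not match the simulated period")
    return series[steps_per_year * burn_in_years:]


# Toy example with three time steps per year (placeholder values).
one_year = [0.1, 0.2, 0.3]
forcing = periodic_forcing(one_year)
analysed = endemic_window(forcing, steps_per_year=len(one_year))
print(len(forcing), len(analysed))
```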