Generation and application of river network analogues for use in ecology and evolution

Abstract Several key processes in freshwater ecology are governed by the connectivity inherent to dendritic river networks. These have extensively been analyzed from a geomorphological and hydrological viewpoint, yet structures classically used in ecological modeling have been poorly representative of the structure of real river basins, often failing to capture well‐known scaling features of natural rivers. Pioneering work identified optimal channel networks (OCNs) as spanning trees reproducing all scaling features characteristic of natural stream networks worldwide. While OCNs have been used to create landscapes for studies on metapopulations, biodiversity, and epidemiology, their generation has not been generally accessible. Given the increasing interest in dendritic riverine networks by ecologists and evolutionary biologists, we here present a method to generate OCNs and, to facilitate its application, we provide the R‐package OCNet. Owing to the stochastic process generating OCNs, multiple network replicas spanning the same surface can be built; this allows performing computational experiments whose results are irrespective of the particular shape of a single river network. The OCN construct also enables the generation of elevational gradients derived from the optimal network configuration, which can constitute three‐dimensional landscapes for spatial studies in both terrestrial and freshwater realms. Moreover, the package provides functions that aggregate OCNs into an arbitrary number of nodes, calculate several descriptors of river networks, and draw relevant network features. We describe the main functionalities of the package and its integration with other R‐packages commonly used in spatial ecology. Moreover, we exemplify the generation of OCNs and discuss an application to a metapopulation model for an invasive riverine species. In conclusion, OCNet provides a powerful tool to generate realistic river network analogues for various applications. It thereby allows the design of spatially realistic studies in increasingly impacted ecosystems and enhances our knowledge on spatial processes in freshwater ecology in general.


| INTRODUC TI ON
The central goal of ecology is to causally understand patterns and processes in ecological systems, such as species coexistence, biodiversity patterns, or the unfolding of species invasions (Elton, 1958;Gause, 1934). Much of early ecological theory and empirical work has either focused on local patterns and dynamics or has taken a spatially implicit perspective. However, virtually all natural ecosystems are spatially structured, and the relevance of the spatial dimension on ecological systems can hardly be overestimated (Hanski & Gaggiotti, 2004;Holyoak, Leibold, & Holt, 2005;Levin, 1992).
A direct consequence of this spatial approach to ecology is the need to describe and understand the spatial structure and layout of natural ecosystems. While initial models of spatial dynamics assumed spatially implicit networks of populations or communities (Levins, 1970), all natural ecosystems follow spatially explicit structures. These structures, such as those typically found in coral reefs and atolls, mountainous landscapes and their elevational gradients, or tidal pools, are shaped by general geophysical processes resulting in characteristic landscape structures. Arguably the most iconic (but also among the most widespread) landscape structure is found in riverine networks (Leopold, Wolman, & Miller, 1964;Rodriguez-Iturbe & Rinaldo, 2001): erosional forces balancing uplift create dendritic networks of rivers and streams following universal patterns. These networks are characterized by their fractal, scale-free structure, as well as by universally applicable laws regarding many geomorphological and hydrological variables of direct relevance to ecology, such as catchment area, river bed width and depth, or variation in discharge (Horton, 1945;Leopold & Maddock, 1953;Rodriguez-Iturbe & Rinaldo, 2001). In contrast to these specific features of natural landscape structures, much of ecological and evolutionary theory and experiments, but also much of the species-distribution modeling has assumed either random networks or simply structured linear, circular or Cartesian networks, in which local patches are connected to their 2, 4, or 8 nearest neighbors (e.g., Bascompte & Solé, 1996;Bell & Gonzalez, 2011;Holland & Hastings, 2008). This oversimplification of spatial network structures may limit the plausibility and relevance of the findings. An application to more realistic network structures has, however, often been hindered by the lack of formalized, spatially correct, and generalizable network structures as well as easily accessible tools generating them.
All of these studies use networks that may at first sight look like "river" networks, but do not satisfy the necessary constraint posed by draining a given surface ( Figure 1). Furthermore, these constructs do not adequately represent the connectivity and several geometric properties (like the distributions of upstream and downstream lengths, and of total contributing area at a point) inherent to natural river networks, and lack the space-filling attribute of small to smallest streams not only incrementally flowing into larger streams, biodiversity, dispersal, ecological modeling, landscape, metacommunity, optimal channel network, river networks, spanning trees but also the common direct inflow of very small streams into large streams. As such, all of this work has been ignoring the extensive and long-lasting knowledge from geomorphology that has appropriately acknowledged and formalized the spatial unfolding of dendritic river networks.
In particular, the fractal character of river networks, epitomized by Horton's laws (Horton, 1945) on bifurcation and length ratios, was observed with regard to several morphological and hydrological characteristics of river basins and expressed by means of a number of power-law relationships, which are the signatures of fractal behavior (Mandelbrot, 1983;Maritan, Rinaldo, Rigon, Giacometti, & Rodríguez-Iturbe, 1996). Notable examples are Hack's law (Hack, 1957) L ∼ A h , linking the maximum upstream channelized length L at any location in the river with the corresponding drainage area A; the slope-area relationship s ∼ A −1 , where s is the channel slope (Tarboton, Bras, & Rodriguez-Iturbe, 1989); the scaling of the probability distribution of drainage areas P (A ≥ a) ∼ a − (Rodriguez-Iturbe, Ijjász-Vásquez, Bras, & Tarboton, 1992). Typical values observed in real rivers for the scaling exponents are h ≈ 0.57, ≈ 0.5, and ≈ 0.43 (Rinaldo, Rigon, Banavar, Maritan, & Rodriguez-Iturbe, 2014). From a hydraulic geometry viewpoint, Leopold's relationships (Leopold & Maddock, 1953) express how mean river depth, width, and velocity change, both at-a-station and along the river's course, as power-law functions of discharge, which in turn scales linearly with drainage area (strictly speaking, this applies to landscape-forming discharges; Rodriguez-Iturbe & . Such scale-invariant properties of river networks prompted the development of a model of idealized stream networks: optimal channel networks (OCNs). OCNs are "optimal" inasmuch as their configuration corresponds to a minimum of total energy expenditure and reproduces all scaling features of real rivers (Maritan et al., 1996;Rinaldo et al., 1992Rinaldo et al., , 2014Rodriguez-Iturbe, Rinaldo, et al., 1992). Importantly, OCNs are exact stationary solutions of the general equation describing landscape evolution (Banavar, Colaiori, Flammini, Maritan, & Rinaldo, 2001). The OCN construct allows the generation of an unlimited number of different network replicas spanning the same drainage domain, therefore enabling one to run computational experiments and derive results that are independent of the shape of a single river network, which would not be the case if real rivers were used as landscapes. Moreover, OCNs enable the investigation of spatial processes occurring not only in dendritic river networks, but also along the elevational gradients of fluvial landscapes (Bertuzzo et al., 2016;Giezendanner, Bertuzzo, Pasetto, Guisan, & Rinaldo, 2019). To this regard, it is worthwhile to note that the elevational landscape generated by an OCN is such that the graph obtained by following the steepest descent directions reproduces the OCN structure (Balister et al., 2018).
Despite the long-standing establishment of the OCN concept, its application especially in ecology and evolutionary biology has been lagging behind, likely because easily accessible code or appropriate tools have been lacking. This is particularly regrettable considering the recent bloom of tools allowing the statistical analysis of data from real dendritic networks (e.g., the R-package SSN; Ver Hoef, Peterson, Cliord, & Shah, 2014). However, such tools are specifically designed for real river networks, while their applicability to virtually generated networks is limited. To fill this gap, we here describe the methodological and mathematical frameworks that underly OCNet, an R-package for the generation and analysis of optimal channel networks, and provide guidelines and examples to facilitate the use of this tool.

| THE O CNE T PACK AG E
The OCN concept is based on the assumption that river network configurations occurring in nature correspond to a minimum of total energy dissipation across the landscape. Both this assumption and the ensuing algorithm generating OCNs are well supported by a comparison with river networks globally  Rodriguez-Iturbe, Rinaldo, et al., 1992). This section is structured as follows: first, we provide an overview on the theoretical background underlying the generation of an OCN; second, we outline the structure of the OCNet package; third, we clarify some concepts concerning the various aggregation levels at which an OCN can be defined and used.

| Theoretical background
Let us consider a regular lattice made up of N cells, where each cell represents the generic node i of the network. Each node i is connected via a link to one of its nearest neighbors. The energy dissipation across the ith network link is proportional to Q i Δh i , where Q i is the landscape-forming discharge in the link , and Δh i = s i L i the corresponding elevation drop, with s i identifying slope and L i link length. By assuming Q i , where A i is the area drained by link i, and the slope- (Tarboton et al., 1989), the functional representing total energy expenditure across a landscape formed by N cells reads.
Note that link lengths do not appear in the above formula, as they can be considered constant with no loss of generality. The OCN configuration is defined by an adjacency matrix W whose corresponding set of drainage areas A = A 1 , …, A N yields a local, dynamically accessible minimum of Equation (1). Note that the correspondence between A and the adjacency matrix W of a tree is subsumed by the relationship I N − W T A = 1, where I N is the identity matrix of order N, and 1 a N-dimensional vector of ones (Bertuzzo et al., 2015). (1)  which only explores close configurations, actually mimics the type of optimization that nature performs, at least in fluvial landscapes . Notably, restricting the search of a network yielding a minimum of Equation (1) to spanning, loopless configurations entails no approximation, because every spanning tree is a local minimum of total energy dissipation (Banavar, Colaiori, Flammini, Maritan, & Rinaldo, 2000). The shape of the so-obtained OCN retains the heritage of the initial flow configuration, although the extent to which this occurs is partly controlled by the cooling schedule adopted ( Figure 2). This underpins the concept of feasible optimality, that is, the search for dynamically accessible configurations.

| Overall setup of the package
The OCNet package consists of a series of functions that allow con-

| Aggregation levels
Before moving to the illustration of some possible applications of the package, we here clarify some concepts and terminology with respect to the aggregation of OCNs. Additional details are provided in the package documentation. Networks produced by the OCN algorithm can be used in a variety of fashions (see Table 1 for a review) by exploiting different connectivity metrics that are embedded in the OCN construct. At a first, nonaggregated level, each cell of the lattice (also termed as pixel) constitutes a node of the network (see  (1)) and temperature T (i.e., cooling schedule of the simulated annealing algorithm) for the 6 OCNs displayed above. Values of H and T are normalized by the energy of the initial network state. Note that, for graphical reasons, the initial states shown in panels (a, e) refer to 25 × 25 lattices to OCNs at the RN level). In a third, aggregated level (AG), nodes correspond to sources, confluences and outlet(s) of the river network identified at the RN level. The whole lattice is then partitioned into areas that directly drain into the nodes at the AG level, or the edges departing from them, thereby constituting the fourth, subcatchment level (SC).

| Number of outlets, boundary types, and elevational gradients
Although the OCN principle is primarily intended to be applied to networks spanning the whole drainage domain (where the area drained by the outlet is equal to the lattice area, see an example in Arrows on the other pixels identify flow directions; note that the these are not representative of an OCN, but are here presented only for illustrative purposes. Numbers represent the cumulative drainage area (in number of pixels). At the FD level, all 64 pixels belong to the network. To obtain the RN level, a threshold area A T = 5 pixels is applied to distinguish pixels belonging to the river network. Bottom row: the same procedure is applied to a single-outlet, 20 × 20 OCN (obtained via the script presented in Generation of an OCN). Aggregation is performed with A T = 5 pixels. Note that river width is proportional to the square root of drainage area (Leopold & Maddock, 1953) formulation presented in Theoretical background holds when multiple-outlet pixels are imposed. Technically, this is done by preventing these pixels to drain into their neighboring pixels. In this case, the sum of the areas drained by all outlet pixels is equal to the lattice area. In the package, the multiple-outlet option can be activated by means of the optional input nOutlet of function create_OCN. In the limiting case, all pixels at the lattice boundary can be treated as outlets (Bertuzzo et al., 2017;Sun, Meakin, & Jøssang, 1994). This is done by setting nOutlet = "All" in create_ OCN. A graphical representation of an OCN obtained for the latter case is shown in Figure 3d.
When a pixel's flow direction is rewired during the search for an optimal network configuration, possible directions are generally those toward the eight neighboring pixels. This is not the case for the outlet pixels (which cannot be rewired) and the pixels at the lattice boundaries, which can be rewired to either three (corner pixels) or five (side pixels) neighboring cells. This latter assumption can be  Figure 3b,e,h. Importantly, the slope-area relationship only holds for the channeled portion of the domain, which implies, strictly speaking, that the OCN must not be aggregated if one aims at making use of a three-dimensional landscape generated by an OCN. Moreover, the slope-area relationship is actually multiscaling (Tarboton et al., 1989), therefore the simple recursive application of (as performed by the function landscape_OCN) to yield an elevational landscape is to be considered as a first approximation, suitable for ecological applications. Methods to account for the scaling of the variance of the slope-area relationship exist (Grimaldi, Teles, & Bras, 2005), but are beyond the scope of this work.

| Relationship between threshold drainage area and number of nodes
Owing to the somewhat heuristic procedure for the definition of an  Here, 50 observation points have been sampled along the network by means of a binomial design, and their distance from the outlet is displayed practical applications. To this extent, it is worthwhile to note that the OCN construct is invariant under coarse graining Rodriguez-Iturbe & Rinaldo, 2001), which means that the choice of the lattice dimension N does not affect the scaling of drainage areas. As in the example above, such choice should rather be based on geomorphological arguments, that is, the answers to the questions: What is the area that the OCN is supposed to drain? What are the expected values of maximum stream order and drainage density on this area?
However, as a rule-of-thumb indication, we suggest to perform aggregation with a threshold not greater than A T = 0.02 ⋅ N, such that the obtained configuration is not affected by the finite-size scaling effect; this corresponds to an expected N AG ≥ 30 (see Figure 5a).

| Compatibility with existing R-packages
Specific functions of OCNet enable transformation of OCNs into objects that can be used by other commonly used R-packages in spatial ecology. In particular, compatibility with igraph (Csardi & Nepusz, 2006), a package for network analysis and visualization, is provided by function OCN_to_igraph. Moreover, function OCN_to_ SSN transforms an OCN at a desired aggregation level into an object that can be read by SSN, a package on spatial statistical modeling and prediction for data on stream networks (Ver Hoef et al., 2014).
Examples for these functions are shown in Figure 6. Finally, output from OCNet can be used in combination with R-packages for geostatistical modeling such as gstat (Pebesma, 2004), based on the coordinates of nodes of an OCN given at any aggregation level.
Remarkably, adjacency matrices and other information can easily be extracted as base R objects, which guarantees compatibility with virtually every R-package and even other programming languages.

| APPLI C ATI ON E X AMPLE: A ME TAP OPUL ATION MODEL
In order to show a possible application of the OCNet package, we now apply a simple metapopulation model for an invasive riverine species to an OCN.

| Generation of an OCN
Let us build an OCN with the following assumptions: it spans a 20 × 20 lattice, with a single outlet located close to the southwestern corner of the lattice, and each pixel represents a square of side 500 m (total size of the catchment is therefore 100 km 2 ); the elevation, slope and channel width of the outlet node are 0 m a.s.l., 0.01, and 5 m, respectively; the threshold area used to aggregate the network is equal to 1.25 km 2 . The code lines to build such network are the following: set.seed(1) # use fixed random number generator.
The resulting OCN is shown in Figure 4.

| Metapopulation model
Let us build a discrete-time, deterministic metapopulation model on the previously built OCN, according to the following assumptions: (a) the model is run on the OCN aggregated at the RN level (consisting of N n nodes); (b) population growth at each node follows the Beverton-Holt model (Beverton & Holt, 1957), with baseline fecundity rate r = 1.05 constant for all nodes, and carrying capacity  Figure 7. When p d = 0.5, the invading species rapidly reaches the equilibrium in the initially occupied node, while colonization of the downstream patches is delayed. When a preference for downstream movement is attributed (p d = 0.7), local population growth in the onset (green) node is slower, whereas invasion of the outlet node occurs faster, both in terms of initial growth and establishment of the equilibrium (see colored vertical lines in Figure 7a). Colonization of the headwater that is farthest from the onset node is also delayed with respect to the default case. As a result, when p d = 0.7, the metapopulation size initially grows faster than when p d = 0.5, due to fast invasion of the downstream nodes and growth of the local populations therein (see Figure 7b); in a second phase, the growth rate of the metapopulation is reduced, because invasion of the upstream nodes is hampered by the low p u value, and the establishment of the equilibrium is delayed. As for the spatial spread of the metapopulation, when a preference for downstream movement is adopted, local population sizes at equilibrium tend to increase in the downstream nodes and decrease in the upstream nodes with respect to the default case (Figure 7a), resulting in a slightly lower overall metapopulation size at equilibrium (Figure 7b).

| CON CLUS IONS
The importance of adequately representing spatial processes in ecological and evolutionary studies cannot be overstated. In the realm of freshwater ecology in particular, it is essential to consider how geomorphology shapes the structure of dendritic river networks and the ensuing connectivity configuration, which in turn control the variability of physical habitats and environmental variables, the dispersal of species and pathogens, and the spatial patterns of biodiversity and ecosystem processes. To this end, we presented OCNet, an R-package that enables the generation of optimal channel networks, spanning trees that reproduce all scaling features of real river networks throughout the globe. These can be used as realistic riverine landscape analogues for a number of ecological, epidemiological, ecohydrological and evolutionary studies. We reviewed the theoretical background of the OCN concept and the existing applications on problems of ecological relevance, provided an overview of the main functionalities of the package, and proposed an example of application in the context of an invasive riverine species. We believe that this tool will allow a leap forward in the way spatial processes in river networks are investigated.