Global mapping of river sediment bars

Recently, deep learning has been increasingly applied to global mapping of land‐use and land‐cover classes. However, very few studies have addressed the problem of separating lakes from rivers, and to our knowledge, none have addressed the issue of mapping fluvial sediment bars. We present the first global scale inventory of fluvial gravel bars. Our workflow is based on a state‐of‐the‐art fully convolutional neural network which is applied to Sentinel‐2 imagery at a resolution of 10 m. We use Google Earth Engine to access these data for a study site that covers 89% of the Earth's surface. We count 8.9 million gravel bars with an estimated area of 41 000 km2. Crucially, the workflow we present can be executed within a month of highly automated processing and thus allows for global scale, monthly, monitoring of gravel bars and associated rivers.


| INTRODUCTION
Our capacity to map river systems globally and our ability to reconstruct their geomorphological trajectories of the recent past (last decades/years) will determine our ability to detect freshwater biodiversity losses, associated ecosystem services availability, such as water resources for energy and food production, as well as flood mitigation strategies (Fryirs, 2017;Tickner et al., 2020).Over the last decade, emerging remote sensing technologies have boosted this capacity to observe river systems from local to global scales (Piégay et al., 2020).Global databases of river features such as river order, slope, discharge and width have been derived indirectly by integrating globally available digital elevation models (Altenau et al., 2021;Grill et al., 2019;Lehner et al., 2008), hydrological modelling, hydraulic geometry relationships and, more rarely, satellite observations (Frasson et al., 2019).These global databases have demonstrated their value allowing us, for instance, to explore the role of major dams globally affecting the natural flow of rivers.Additionally, databases of global estimates of water, sediment and nutrient river fluxes have highlighted the significant roles of human activities of altering such fluxes over the last century (Syvitski et al., 2022).However, these global databases do not yet rely on image-based information or direct observations of landforms but on topographic information used to derive fluxes and river network characteristics.Direct forms of river mapping from space at global scale have been mostly limited to identifying water surface extent as part of comprehensive land-use classifications (e.g., Karra et al., 2021;Zanaga et al., 2022) or its changes over the last decades (e.g., Pekel et al., 2016).However, these studies are not yet able to distinguish lakes and wetlands from rivers and streams.Even more importantly, we have not yet mapped river sediment bars at global scales.This is probably due to the fact the water is easier to classify and isolate using multispectral information whereas fluvial landforms such as sediment bars have a very similar spectral signature when compared with bare soil and agriculture crops bordering river systems.Moreover, identifying water is achievable by using spectral information (Gao, 1996;Xu, 2006), whereas distinguishing rivers from lakes would require an analysis of the shape of the water surface and its interpretation.These tasks require a step forward from a simple pixel-based classification approach to more sophisticated methods based, for instance, on AI, which are able to include in their predictive power information on object shape and texture.This technology is already advanced in other disciplines of computer vision (Long et al., 2015), but it is still not fully exploited in river system mapping where such advances are needed in order to generate novel river geomorphic information capable to advance our understanding of river systems and to better support management needs.Specifically, the use of AI-based monitoring could track the extent rivers, lakes, wetlands, oxbows and so on and associated sedimentary deposits thus delivering novel information on change at global scales.
Low flow channels together with sediment bars composed the active channel and together with vegetated islands and the riparian corridor composed the river corridor (Ham & Church, 2002).Being able to map these fluvial landforms and their dynamics over time at the global scale would open to a new era in our ability to explore river forms and processes.Recently Nyberg et al. (2023) proposed a convolutional neural network (CNN) approach to identify the river channel belt, defined as the corridor of river channel migration formed during one river avulsion cycle, followed by the use of spectral indices across multiple years to inventory rivers (water inside channel belts) and lakes (water outside channel belts).They present a global map of the extent of river channel belts and distinguish between single versus multi-threaded river channels.This is indeed a river mapping contribution going beyond water surface extent mapping and which starts to provide more geomorphologically informed data.However, this approach does not explicitly map sediment bars.Failing to observe sediment bar presence and their spatial configuration over time, we neglect the possibility to generate basic information to understand river behaviours (Brierley et al., 2013).
In this letter, we present an operational workflow capable of semantic classification of streams, rivers and exposed fluvial sediment bars.For the purpose of this work, gravel bars are defined as any subaerial and unvegetated sediment deposit adjacent to a river.This workflow uses Sentinel-2 data at 10 m spatial resolution downloaded from Google Earth Engine (Gorelick et al., 2017).The core classification task is carried out with a fully convolutional network (FCN) dubbed Tiramisu (Jégou et al., 2017), a type of deep learning architecture capable of pixel-level semantic class predictions that preserve the resolution of the input image.Notably, the trained model and classification workflow separates rivers and streams from lakes with a commission error of 8%, and it separates fluvial bars from other types of bare soil with a commission error of 16%.We deploy this workflow at global scale on a study site equivalent to 89% of the Earth's land area.This allows us to present maps of global river density and, for the first time, global densities of fluvial sediment bars.
The workflow can be executed in less than a month meaning that monthly monitoring of rivers and associated changes becomes possible at global scale.

| Model objectives and semantic classes
Our aim is to produce global scale maps that can identify rivers distinctly from lakes and associated fluvial sediment deposits and thus highlight spatial and temporal patterns.We choose the following semantic classes.Class 0 is the background.The type of deep learning model we intend to use requires a background class which is in essence the non-detections.In our case, this will include vegetation, urban areas and barren areas.Class 1 is rivers.Here, we require the presence of water, and we include artificial canals.Dry ephemeral channels will not be classified as rivers.Class 2 is lakes.Mapping lakes is considered beyond the remit of this work.However, in order to teach the model to distinguish rivers from lakes, the lakes need to be in a separate semantic class.In this class, we include abandoned meanders (Oxbow lakes).One challenging scenario is where rivers emerge from lakes and reservoirs and where a boundary needed to be set between the classes.In such cases, the user made a somewhat subjective decision based on a qualitative criteria of seeing the beginning of fluvial forms emerging from the lake.This usually corresponds to the development of curved banks and sometimes of gravel bars.
Class 3 covers all forms of fluvial deposits, but we limit this to those deposits adjacent to a river.These are the four classes actually predicted by the model.We also add four so-called 'inherited' classes because they will be determined at a later stage using external data.Class 4 is the ocean.Class 5 is glaciated terrain.Class 6 is not used in the paper but is set aside for snow.Class 7 is clouds, and Class 8 is a no-data class for cases where no satellite imagery is available.

| Google Earth Engine
Our workflow is based on Sentinel-2 data, and we use Google Earth Engine as a download tool (Gorelick et al., 2017).We download Sentinel-2 Bands 8 (Near-Infrared), 4 (Red) and 3 (Green) to an expanded Google Drive cloud storage space.We keep the native spatial resolution of 10 m.In addition to the image layers, we add a cloud mask derived from the s2cloudless product (Skakun et al., 2022) as an extra band.Spatially, the download process is organised as per the Military Grid Reference System (MGRS) that further divides each UTM zone of 6 in longitude into zones with 8 of latitude.The download area is composed of the 468 MGRS zones that constitute the non-polar world (89% of the Earth's landmass, see the outline of Figure 1).Given that 1 is equivalent to 110 km at the equator but only 55 km at latitudes of ±60 , imagery and all derived products are projected to the appropriate UTM coordinate system in order to preserve the constant, latitude-invariant, 10 m spatial resolution of the original Sentinel-2 imagery.Temporally, we choose a sample duration of 1 month in order to capture changes at smaller time scale recognising that such a short temporal window is not capable of delivering global imagery which is 100% cloud free.The choice of the exact monthly period is the result of a compromise.We discounted boreal winter given that snow can cover as much as 45 000 000 km 2 (Déry & Brown, 2007).We also found that tropical areas of South America and Africa have persistent high levels of cloud at any time of the year.We also noted that boreal summer months with their longer days gave a better overall global coverage of low-cloud data given that most of the Earth's landmass is in the Northern Hemisphere.
Consequently, we chose the month of July for the year 2021.We acknowledge that this will leave data gaps in the tropics and in the area of the Indian Monsoon.Over this period, Google Earth Engine will composite all available images with less than 20% cloud cover by using the median value of available pixels.

| External data
We also use several global databases at various stages of this work.
We use HydroSHEDS vector products (Lehner et al., 2008) to define the ocean area and inform post-processing algorithms that correct errors in large lakes and large rivers.We also use the Randolph Glacier Inventory (RGI Consortium, 2017) and define glaciated terrain as a 5 km buffer extending around glaciers.

| Deep learning and image processing pipeline
Deep learning is rapidly establishing itself as the de facto method for water mapping at large scales (e.g., Isikdogan et al., 2017Isikdogan et al., , 2020;;Moortgat et al., 2022).We use a FCN that can deliver pixel-level semantic classifications (Long et al., 2015).Within this family of models, several architectures exist.We have chosen to use a fully convolutional DenseNet architecture proposed by Jégou et al. (2017) and dubbed 'Tiramisu'.This architecture delivers state-of-the-art results that outperform other models (e.g., SegNet, Badrinarayanan et al., 2017;DeepLab, Chen et al., 2018) against established benchmarks (Brostow et al., 2009) First, we use a large, but more error-prone, training dataset of 740 000 samples derived from an automated labelling of ESRI 2020 land-cover data (Karra et al., 2021).Second, we fine-tune the model with another training phase that uses the manually labelled data from 293 randomly selected locations of Figure 1, the remaining 50 locations are set aside for validation.For each of these sites, we extract an image of 15 Â 15 km and manually label these images.We then use data augmentation in order to get 59 000 very high quality samples of 2.24 by 2.24 km.These data are used to train and validate the model (see the Supporting Information for more detail).A key step of our workflow is a set of novel image post-processing filters.Deep learning semantic classification models produce prediction probabilities for each class with each pixel being assigned to the class of highest probability.However, we find the value of the second highest probability can be combined with contextual and external information to improve our predictions.We postulate that if two adjacent pixels have a different class, there could be a classification error.An examination of the probabilities for the most-likely class (the attributed class) and the second most-likely class is then used to determine if an error has occurred.Additionally, we use vector datasets from the HydroSHEDS database to provide further contextual corrections for large lakes in excess of 5 km 2 and large rivers with widths potentially in excess of 2.24 km.Full code and details on the post-processing filters can be found in Carbonneau (2023a).Figure 2 summarises our deep learning workflow.

| Validation
First, we use 50 manual validation sites set aside before the training process.The data from these tiles have 112 million background pixels, six million river pixels, nine million lake pixels and 770 000 bar pixels.
These data are used to produce confusion matrices and estimate the user's and producer's accuracies.They are also used to produce validation results for some intermediate processing steps (see the Supporting Information).We also merge the river and lakes classes for both truth and model data in order to produce error estimates for a 'water body' class that are more comparable to existing work.Second, we build a validation dataset with 10 000 randomly selected river locations across the full study site.We repeat this random selection for the lakes class and bar class (30 000 locations in total).We then use Google Earth Engine to extract an image for each of these samples.We edit the pixel values at the sample location to create a blue cross composed of five 4-connected pixels, and we save the image in a folder corresponding to the class.We then examine each image in each class folder and assess if the predicted classification of the point marked by the blue cross was correct or incorrect.This procedure will estimate the user's accuracy for the river, lake and bar classes at global scales.

| Validation
Table 1 gives the validation outcomes for the manually labelled data.
Table S1 gives validation results for the intermediate steps in our processing pipeline.Accuracy metrics for the background class are in F I G U R E 1 Study area.Green points show the location of the 343 labelling sites, distributed in catchments that exceed 500 000 km 2 (blue).The full study area is the combined blue and black outlines and represents 89% of the earth's landmass.[Color figure can be viewed at wileyonlinelibrary.com] excess of 98%.Any averaging of the accuracy statistics across classes would be artificially high, especially if weighted by pixel counts, and it is therefore important to consider class accuracies individually.The confusion matrices show that the main source of error for the river class is confusion with lake pixels with producer's and user's accuracies of 95% and 92%, respectively.In the case of bars, user's accuracy is 84% with a lower producer's accuracy of 61%.Here, the main source of confusion is the background class.This means that pixels predicted as bars are reliable, but overall, we can expect a significant portion of global bar area to be absent from the survey.Our bar results should therefore be seen as a lower bound accounting for a little over half the global population of bars.Validation results for intermediate stages of our workflow are available in the Supporting Information.
In the case of a merged water body class, we find user's accuracy of 87% and producer's accuracy reaching 95%.Specifically for the water body class, Zanaga et al. ( 2022) report user's accuracy of 89% and producer's accuracy of 86% for the latest version of the WorldCover product.Karra et al. ( 2021) report user's accuracies of 83% and producer's accuracies ranging from 86% to 94%.Isikdogan et al. (2017) report commission errors of 8% (92% user's accuracy) and omission errors of 13% (producer's accuracy of 87%).However, we stress that none of these comparators studies present data where lakes and rivers constitute distinct semantic classes.Nyberg et al. (2023) do provide a semantic class product with distinct rivers and lakes, but they only validate the channel belt and active channel width predictions resulting from their CNN model.User's and producer's accuracies, or any other form of accuracy assessment (e.g., F1 scores and kappa scores), are not available for their global semantic class product.
At larger scales, we found that a global random check of 10 000 samples from the fine-tuned filtered predictions resulted in slightly lower user's accuracies of 85%, 78% and 77% for rivers, lakes and bars, respectively.We followed up this validation by a full visual inspection of all the data.For rivers, the main source of error remains confusion with lake pixels, most often occurring with elongated lakes or reservoirs.For lakes, the main source of error is shadows.In the case of bars, we observe two sources of error worth noting: (1) faint clouds that have not been detected by s2cloudless and occurring over rivers and (2) arid areas adjacent to  rivers and canals with the same spectral characteristics as bars but not of fluvial origin.
Figure 3 shows four examples of local results.In these four locations, we also show the equivalent WorldCover 2021 data (Zanaga et al., 2022), the HydroRIVERS vectors (Lehner et al., 2008) and the SWORD v2 vectors (Altenau et al., 2021).In the case of Hydro-RIVERS, we used the Strahler order in the attributes data and the estimated widths from Downing et al. (2012) to buffer the vector lines.In the case of the SWORD v2 data, we used the channel width information present in the attributes.Figure 3 illustrates our final model's ability to effectively detect rivers and bars.However, we also note some issues.In row B, we can see that our model has falsely classified small clouds as bars.Also, some very small channels are detected as water bodies in WorldCover data but as background (undetected) in our data.In the case of the vector data, the rightmost column of Figure 3 shows reasonable correspondence and highlights the challenges of using these vector databases in combination with raster-based semantic classifications.Even if these vectors have information in their attributes that relates to local channel widths, this information was never intended to accurately map channel boundaries.It is therefore not possible to perform a direct quality estimation of our results based on these data.
The compressed global set of classification rasters at full 10 m resolution requires 10 GB of storage (Carbonneau, 2023b).Figure 4 shows resulting global maps for the surface densities for rivers and bars.On the top, we show River and Stream Surface Area (RSSA) expressed in ppm of the surface area (10 000 ppm = 1%).The map further divides the study area in grid squares of 2 Â2 .Within each square, we used Python to extract and inventory the semantic classes.
If we consider the total inventory of river area, we observe a total river area of 433 544 km 2 .If we assume that cloudy areas in a given  intensive validation approach with a very large number of scattered validation sites.As a result, their overall validation accuracy is cited as 74.4%.A fundamental issue is the absence of common benchmark datasets that could allow us to cross compare models and results.In computer vision, there are several high-quality benchmark datasets such as ImageNet (Deng et al., 2009) and COCO (Lin et al., 2015) that can be used to evaluate model performance.There is an urgent need to establish such benchmarks for Earth Observation.

| CONCLUSION
Overall, this work shows the potential of deep learning when applied to global scale studies of fluvial systems.Of particular importance is the fact that the workflow presented here required 4 weeks of computation with a very high level of automation.This means that monthly monitoring of river systems at global scale is now possible with 10 m resolution Sentinel-2 data.On this basis, future models that distinguish sedimentary features based on their formation mechanism such as point bars, alluvial fans and ephemeral channels are now within reach.
, while having a low number of trainable parameters and corresponding training times.Crucially, fully opensource implementations of the model are readily available.We use this architecture with the common image chip size of 224 Â 224 pixels because (1) this results in models that are cross-comparable with a wider range of other models (e.g., VGG16, Simonyan & Zisserman, 2015 and ResNET, He et al., 2016) and (2) the final models are more transferable and can be run on more modest GPUs with 8 GB of memory.Our final model is the result of two training phases.

F
I G U R E 2 Deep learning workflow chart.We use a main training phase with 740K medium-quality, automatically generated samples followed by a fine tuning training phase with 59K highquality samples.The final model is then used for inference (i.e., prediction) followed by our post-processing filters.The model predicts three classes: rivers, lakes and bars.External is then used to define the ocean and glaciated terrain.[Color figure can be viewed at wileyonlinelibrary.com]T A B L E 1 Confusion matrices, producer's accuracy and user's accuracy for the manually labelled validation data.

2
Â2 square have the same river density as the non-cloudy area, and if we include cloud-driven no-data values, then we can correct for clouds and we find an estimated total river area of 562 925 km 2 .On the bottom of Figure 4, we show the global densities of bars.This reveals a high concentration of bars at high northern latitudes with additional hotspots South-West of the Amazon, in South East Africa and on either side of the Himalayas.Visual examination revealed that these observations are valid.However, it also confirmed that apparent bar density hotspots around the Nile valley and North of the Arabian F I G U R E 3 Result examples of our data from July 2021 with comparison to ESA WorldCover 2021, HydroRIVERS and SWORD v2 data.(a) River McKenzie, Canada; (b) Jacaré River (Amazon Basin), Brazil; (c) Po River, Italy; and (d) Popigay River, Russia.Greyscale background is Band 8 (NIR) of Sentinel 2. [Color figure can be viewed at wileyonlinelibrary.com]Peninsula are in fact arid lands adjacent to canals falsely classified as bars.In total we find 8.9 million gravel bars.Total bar area is estimated as 41 165 and 33 665 km 2 with and without cloud correction, respectively.

Figure 5
Figure 5 shows global scaling relationships for the bars mapped in Figure 4 (bottom).In Figure 5a, we show the magnitude frequency distribution of global bar sizes in pixels.The relationship is close to a log-linear power law decay with a slope of À1.59 with a slight