Predicting lake bathymetry from the topography of the surrounding terrain using deep learning

Lake morphometric features such as surface area, volume, and mean and maximum depth are important predictors of many physical, biological, and ecological processes. Lake bathymetric maps that present the lake basin contours are thus an integral part of limnological investigations. Accurate but cumbersome traditional bathymetric surveys measure the depth using a lead line or echosounder. Recently, airborne bathymetric mapping using imagery or laser scanning has been attempted in shallow freshwater and coastal habitats. However, these methods depend on the ability of light to penetrate the water column, which can be problematic in eutrophic lakes and shallow lakes. To alleviate these issues, we developed and tested a deep learning model (based on the U-net) using data from 153 lakes in Denmark to predict bathymetry using the topography of the surrounding terrain. The deep learning model performed much better (pixel-wise mean absolute error: validation set = 1.75 m and test set = 2.15 m) than baseline interpolation approaches (validation set = 3.12 m). In addition, the deep learning model generated more realistic bathymetry maps that did not suffer from interpolation artifacts. We find that the model performance improves slightly with increasing model size (number of trainable parameters) and the extent of the surrounding terrain. In addition, our pretraining procedure improved performance and reduced the time for model convergence. Because the model relies only on digital elevation data, which are widely available, it can be fine-tuned and used to predict lake bathymetry in other geographical regions.

Lakes are prominent landscape features covering approximately 2% of the global land area (Pekel et al. 2016). These important ecosystems harbor many species and play an essential role in element cycling along the freshwater continuum (Tranvik et al. 2009; Biggs et al. 2017). The topography of the global land surfaces can be mapped at high spatial resolution using remote sensing methods (Verpoorter et al. 2012; Abrams et al. 2020). However, information regarding underwater surfaces, water column depth, or lake bottom elevation is much more limited. Improved bathymetric maps are a prerequisite for many kinds of local to regional scale studies of biological, chemical, and physical lake processes.
Lake morphometry influences physical properties (turbulence and stratification-mixing dynamics), water chemistry, and the distribution of organisms (Fee et al. 1996; Dolson et al. 2009). Morphometric variables such as lake volume, mean depth (z_mean), and maximum depth (z_max) are important predictors of many processes, including water retention time, nutrient loading and cycling, and the productivity of phytoplankton, zooplankton, and fish (Håkanson 2005). Even more usefully, hypsographic relationships and bathymetric maps communicate the extent of littoral zones colonized by submerged plants (Seekell et al. 2021) and of hypolimnion zones that might be subject to seasonal oxygen depletion (Deeds et al. 2021), and can also be incorporated into physical models (Fricker and Nepf 2000). Lake bathymetry and derived morphometric variables are thus used extensively to investigate lake functioning. Yet, while historical data exist for some lakes, particularly for larger and more iconic lakes, they are not available for the large majority of lakes.
Traditionally, lake bathymetry is determined by measuring depth along regularly distributed sampling points or transects using a lead line or, in recent times, an echosounder that provides more accurate information on depth and position (Høy 1988). Even more recently, remote sensing imagery or "Light Detection and Ranging" (LiDAR)-based methods have been used to measure depths in aquatic habitats (Hilldale and Raff 2008; Dörnhöfer and Oppelt 2016). Although these methods are more efficient and can cover larger areas, the associated costs and analytical complexity are high (Gao 2009). Furthermore, as both imagery and LiDAR methods depend on light penetrating the water column, their use is restricted by high turbidity (Tripathi and Rao 2002). This becomes a challenge in both eutrophic and shallow lakes, where turbidity is often high due to phytoplankton, colored dissolved organic matter, or re-suspended sediment particles (Balogh et al. 2009), and may exhibit pronounced spatiotemporal variation (Martinsen et al. 2022).
Lake volume, z_mean, and z_max can be estimated from empirical relationships with lake surface area (A), the slope of nearby terrain, and other topographic features (Hollister et al. 2011; Sobek 2011; Delaney et al. 2022). However, many research objectives can only be fulfilled by referring to accurate bathymetric maps rather than morphometric features like volume and z_max. Hollister and Milstead (2010) suggest a simple method using only the shoreline and z_max to predict lake bathymetry as a function of distance from the shoreline. Predictions can be further improved by including remote sensing imagery (Hodúl et al. 2018; Yunus et al. 2019) or a digital elevation model (DEM; Zhu et al. 2019).
Using a DEM to predict lake bathymetry overcomes some limitations of imagery in systems with high turbidity. This method assumes that lake bathymetry is related to the topography of the surrounding terrain, as both the lake and landscape have been formed by the same processes (Hutchinson 1957). This should be the case for lakes of glacial origin, which are the majority of lakes globally, and for reservoirs or riverine lakes formed by damming (Kalff 2002). However, the functional relationship between lake bathymetry and the surrounding topography is unknown and likely variable between lakes and geographical regions.
In recent years, deep learning methods have proved successful in many applications. Deep learning is the process of training multi-layered neural networks with many tunable parameters from data, whereby the network gradually "learns" to extract valuable representations at each layer of the network (LeCun et al. 2015; Schmidhuber 2015). Neural networks can thus approximate complex functional relationships between paired input and output data in an automated fashion and have proved to be suitable for predicting the content of masked (missing) pixels in images (image inpainting; Elharrouss et al. 2020). This is equivalent to predicting lake bathymetry from the topography of the surrounding terrain: the model input is a DEM with all lake pixels masked, and the model is tasked with predicting elevation in each lake pixel. For image inpainting tasks, models can be trained in an unsupervised manner by providing inputs with pixels masked at random and tasking the model with reconstructing the original image. Similarly, this approach could be applied to pretrain models for interpolation of missing pixels in DEMs, forcing the model to learn useful landscape representations which may be advantageous for downstream tasks such as predicting lake bathymetry (Erhan et al. 2010). A few studies have applied deep learning methods to interpolate missing regions in DEMs (Qiu et al. 2019; Yan et al. 2021) but, to our knowledge, this has never been applied to predict lake bathymetry.
In this study, we investigate methods for predicting lake bathymetric maps from the topography of the surrounding terrain. Using bathymetric maps from 153 Danish lakes, we compare the performance of a deep learning approach with four baseline methods. We explore the influence of unsupervised pretraining and model size (estimated as the number of trainable parameters) on model performance. We hypothesize that: (1) deep learning yields superior lake bathymetric predictions; and (2) model performance improves with pretraining and increased model size.

Study region
In Denmark, approximately 180,000 lakes have been identified, the majority (97.5% < 1 ha) being small (median = 498 m² and mean = 4002 m²; SDFE 2021). Except for southwest Jutland, the country was covered by ice during the recent Weichsel glaciation, and most lakes, specifically pothole lakes, are of glacial origin. The elevation range of the landscape is small (minimum = −18 m and maximum = 172 m). In recent times, particularly the last 200 yr, human activity has influenced the water table (e.g., tile drainage and damming) and bathymetry (e.g., dredging) of many Danish lakes.

Sources
Bathymetric data are derived from the work of land surveyor Thorkild Høy, who generated accurate lake bathymetric paper maps that were subsequently published in the book series Lakes of Denmark (Høy 1988; Høy and Dahl 1991, 1993, 1995, 1996; Høy et al. 2004). The Danish Agency for Data Supply and Infrastructure provided us with digitized versions of these paper maps (SDFE 2021). We excluded lakes formed by coastal processes or peat dredging and small lakes with a surface area (A) of less than 10 ha and z_max less than 1 m, because of the insufficient resolution and vertical accuracy of the DEM. Lakes larger than 10 ha make up approximately 60% of the total lake surface area in Denmark. Initial experiments showed that the bathymetry of the small, shallow lakes was difficult to predict at the applied resolution. In addition, many of these small lakes are artificial, created for various purposes (for example, as reservoirs or gravel pits), which further complicates modeling of bathymetry based on the surrounding terrain. The initial selection left 153 lakes distributed across Denmark for further analysis (Fig. 1), most of which are in recently glaciated central Jutland and Zealand. For each lake, we extracted surface elevations and the topography of the surrounding terrain from a national DEM (10 × 10 m resolution), created from a very high-resolution (1.6 × 1.6 m) DEM covering Denmark.

Processing
For a given lake, the surrounding area where topography influences lake bathymetry is difficult to quantify, likely varies between lakes, and depends on landscape history (Heathcote et al. 2015). To investigate how this influences model performance, we created DEM grids of three sizes for each lake (Fig. 2a). We expected the extent of this area to scale with A and created grids using three buffer distances (33%, 66%, and 100% of √A). The grids were further enlarged until both width and height were multiples of 16, to conform with the model architecture. Finally, the DEM grid and Høy's bathymetric maps were merged and elevation values were rescaled to a −1 to 1 range.
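The grid preparation described above can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, and the linear min-max rescaling (with the bounds passed in explicitly) is an assumption, since the text only states that grids were enlarged to multiples of 16 and that elevations were rescaled to a −1 to 1 range.

```python
import math
import numpy as np

def buffer_distance_m(area_m2: float, fraction: float) -> float:
    """Buffer distance as a fraction of sqrt(A), with A given in square metres."""
    return fraction * math.sqrt(area_m2)

def round_up_to_multiple(n: int, base: int = 16) -> int:
    """Smallest multiple of `base` that is greater than or equal to n."""
    return ((n + base - 1) // base) * base

def rescale_to_unit_range(elev: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Linearly map elevations from [lo, hi] to [-1, 1] (assumed min-max scaling)."""
    return 2.0 * (elev - lo) / (hi - lo) - 1.0

# A 100 ha lake (1e6 m^2) has sqrt(A) = 1000 m, so the three buffers are
# roughly 330 m, 660 m, and 1000 m.
for fraction in (0.33, 0.66, 1.00):
    print(fraction, buffer_distance_m(1_000_000.0, fraction))

# A hypothetical crop of 137 x 90 cells would be enlarged to 144 x 96 cells.
print(round_up_to_multiple(137), round_up_to_multiple(90))

# Illustrative bounds only; the bounds actually used are not given in the text.
elev = np.array([-18.0, 77.0, 172.0])
print(rescale_to_unit_range(elev, lo=-18.0, hi=172.0))
```

With a 10 m DEM, a buffer of 330 m corresponds to 33 cells on each side of the lake's bounding box before the grid is enlarged to the next multiple of 16.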

Baseline
We use multiple interpolation methods as baselines for the suggested deep learning approach; like that approach, they rely only on the topography of the surrounding terrain as input. We do not use distance-based methods (Hollister and Milstead 2010), as this would require an independent model to predict z_max for each lake. We use two general purpose grid interpolation methods (Baseline linear and Baseline cubic) implemented in the Python SciPy package (Virtanen et al. 2020) and two methods developed for image inpainting tasks (Baseline Telea and Baseline Navier-Stokes; Bertalmio et al. 2001; Telea 2004) implemented in the Python OpenCV package (Bradski 2000). Baseline linear and Baseline cubic use the entire DEM grid for interpolation, while the performance of Baseline Telea and Baseline Navier-Stokes does not depend on the size of the DEM grid because the masked region is filled starting from the inner boundaries (lake shoreline). Predictive performance of the baseline methods, defined as the pixel-wise mean absolute error (MAE) for the mask region, was assessed on both the validation set, for comparison with deep learning methods, and the entire dataset.
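As a rough illustration of the grid interpolation baselines, the sketch below fills a masked region with `scipy.interpolate.griddata` and scores it with the pixel-wise MAE over the mask. The helper names and the synthetic planar surface are assumptions made for the example; the exact calls and settings used in the study are not stated in the text.

```python
import numpy as np
from scipy.interpolate import griddata

def interpolate_masked(dem: np.ndarray, mask: np.ndarray, method: str = "cubic") -> np.ndarray:
    """Fill cells where `mask` is True by interpolating from the unmasked cells."""
    rows, cols = np.indices(dem.shape)
    known = ~mask
    points = np.column_stack([rows[known], cols[known]])
    filled = dem.astype(float).copy()
    filled[mask] = griddata(points, dem[known], (rows[mask], cols[mask]), method=method)
    return filled

def masked_mae(pred: np.ndarray, truth: np.ndarray, mask: np.ndarray) -> float:
    """Pixel-wise mean absolute error restricted to the masked (lake) region."""
    return float(np.mean(np.abs(pred[mask] - truth[mask])))

# Synthetic check: a tilted plane with a rectangular "lake" cut out of it.
rows, cols = np.indices((32, 32))
truth = 0.5 * rows + 0.25 * cols
mask = np.zeros(truth.shape, dtype=bool)
mask[10:20, 12:22] = True

pred = interpolate_masked(truth, mask, method="linear")
print(masked_mae(pred, truth, mask))  # close to zero: linear interpolation reproduces a plane
```

A plane is the easy case. A real lake basin lies below the surrounding terrain, so interpolating across the mask from shoreline elevations cannot recover the depression, which is consistent with the weak baseline performance reported in this study.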

Model architecture
We used a U-net architecture to predict lake bathymetry from the surrounding terrain. The U-net is a fully convolutional neural network in which both the input and output are rectangular images or grids, and it is frequently used for semantic segmentation and image inpainting tasks (Ronneberger et al. 2015; Elharrouss et al. 2020). The input to the U-net in our setting is a DEM grid with a masked region (the lake): the model is tasked with predicting elevation values for each pixel within this region, that is, the lake bathymetry (Fig. 2). The model is trained, updating its parameters, to minimize the discrepancy between its predictions and the ground truth through multiple iterations (epochs) on the training data set.
For the model input and output dimensions to be identical, the input height and width should be divisible by 16, because the U-net encoder downsamples the input four times, halving the spatial dimensions each time (2^4 = 16). Likewise, the U-net decoder upsamples the spatial dimensions four times.
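The divisibility requirement can be checked with simple arithmetic. The sketch below assumes that each downsampling step floors odd sizes (as a stride-2 operation without padding would); the function name is hypothetical.

```python
def unet_roundtrip(size: int, depth: int = 4) -> int:
    """Spatial size after `depth` halvings (floor division) and `depth` doublings."""
    for _ in range(depth):
        size //= 2
    for _ in range(depth):
        size *= 2
    return size

for n in (96, 100, 144):
    # Only sizes divisible by 2**4 = 16 come back unchanged.
    print(n, unet_roundtrip(n), n % 16 == 0)
```

A side length of 100 returns as 96 under this assumption, so the output grid would no longer align with the input; 96 and 144 are preserved.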

Pretraining
We considered two ways of initializing the trainable parameters of the U-net, either at random (U-net X-Random) or using parameters obtained from unsupervised pretraining (U-net X-DEM). Obtaining good initializations from unlabeled data with pretraining has proven to improve model performance when fine-tuned with labeled data (Erhan et al. 2010). We pretrained the models by generating DEM grids and lake-shaped masks to simulate the task of predicting lake bathymetry from the topography of the surrounding terrain (Supporting Information Fig. S1). Specifically, we selected 10,000 DEM grids of different sizes and aspect ratios, resizing each to 256 × 256 pixels. To create lake-like mask regions, we randomly selected the shapes of 1000 Danish lakes. Furthermore, to reduce overfitting, we applied augmentations to the DEM grids and masks that simulate feasible lake shapes. Augmentations were performed using the albumentations Python package (Buslaev et al. 2020) and included shifting, scaling, rotating, flipping, and, for the masks, morphological operations (dilation and erosion).
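To make the mask augmentations concrete, the following sketch implements binary dilation and erosion with plain NumPy. It is a stand-in for illustration and does not use the albumentations API; the 4-neighbour structuring element, the number of steps, and the function names are assumptions.

```python
import numpy as np

def dilate(mask: np.ndarray) -> np.ndarray:
    """One step of binary dilation with a 4-neighbour (cross-shaped) element."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1] | p[1:-1, :-2] | p[1:-1, 2:]

def erode(mask: np.ndarray) -> np.ndarray:
    """One step of binary erosion, computed as the complement of the dilated
    complement. Cells outside the grid are treated as lake, so masks touching
    the grid edge are not eroded from that side."""
    return ~dilate(~mask)

def augment_mask(mask: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly flip the mask, then grow or shrink it by 0 to 2 steps."""
    out = np.fliplr(mask) if rng.random() < 0.5 else mask
    op = dilate if rng.random() < 0.5 else erode
    for _ in range(int(rng.integers(0, 3))):
        out = op(out)
    return out

# A 3 x 3 "lake" in a 5 x 5 grid: dilation grows it, erosion shrinks it.
lake = np.zeros((5, 5), dtype=bool)
lake[1:4, 1:4] = True
print(int(lake.sum()), int(dilate(lake).sum()), int(erode(lake).sum()))  # 9 21 1

augmented = augment_mask(lake, np.random.default_rng(0))
print(augmented.shape)
```

Growing and shrinking a real lake outline in this way yields many plausible shoreline variants from a limited set of lake shapes.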

Model training
The U-net models were trained to predict lake bathymetry from the topography of the surrounding terrain. During training, we applied augmentations similar to those described for pretraining. Augmentations are used to reduce overfitting and improve model training, in particular when ground truth data are scarce. Models were implemented in the PyTorch framework (Paszke et al. 2019) and trained using PyTorch Lightning (Falcon and The PyTorch Lightning team 2019). Models were trained on a desktop computer with a dedicated graphical processing unit (GPU; RTX A5000, 24 GB RAM; Nvidia, California, USA).

Hyperparameters
This section describes specific choices for designing the various deep learning models reported in this work. These technical details are reported here primarily for the sake of reproducibility. We used a U-net with four down- and upsampling blocks, each consisting of two convolutional layers (transposed convolutional layers used for upsampling) followed by an activation function (leaky ReLU) and skip connections between parallel downsampling and upsampling blocks. The final activation function, "hard-tanh," results in output ranging between −1 and 1. To assess the influence of model size, that is, the number of trainable parameters, we trained models (U-net 4, U-net 8, U-net 16, and U-net 32) of four sizes (0.121, 0.485, 1.9, and 7.8 million trainable parameters).
Models were trained using MAE as the loss function for the mask regions and the Adam optimizer (Kingma and Ba 2014) using a learning rate of 0.0001. For the pretraining task, models were trained for 1000 epochs using a batch size of 32. For the lake task, models were trained for 500 epochs and, to avoid resizing DEM grids and maintain the original spatial dimensions, we used a batch size of 1 combined with gradient accumulation every 8th iteration, which is equivalent to a batch size of 8.
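The equivalence between gradient accumulation and a larger batch can be demonstrated on a toy problem. The sketch below uses a one-parameter linear model with an MAE loss instead of a U-net; the data, the model, and the plain gradient step are all assumptions made for the example.

```python
import numpy as np

def mae_grad(w: float, x: np.ndarray, y: np.ndarray) -> float:
    """Gradient of mean(|w * x - y|) with respect to the scalar weight w."""
    return float(np.mean(np.sign(w * x - y) * x))

rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = 2.0 * x + rng.normal(scale=0.1, size=8)
w = 0.5

# Gradient computed on one batch of 8 samples.
full_batch = mae_grad(w, x, y)

# Gradient accumulated over 8 iterations with batch size 1, each scaled by 1/8.
accum_steps = 8
accumulated = 0.0
for xi, yi in zip(x, y):
    accumulated += mae_grad(w, np.array([xi]), np.array([yi])) / accum_steps

print(np.isclose(full_batch, accumulated))

# A single parameter update is applied only after the 8th iteration.
learning_rate = 1e-4
w_updated = w - learning_rate * accumulated
```

In the lake setting each sample is a grid of different size, so accumulation averages the per-lake losses without requiring the grids to be resized to a common shape.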

Model selection and evaluation
The 153 lakes were randomly partitioned into train-validation-test sets (60%-20%-20%) of 91, 31, and 31 lakes, respectively. To identify the best model configuration, we evaluated four model sizes, three buffer distances, and two types of model initialization (24 models in total). The predictive performance (MAE) of these models was evaluated on the validation set. Finally, the performance of the best model was assessed on the test set. In addition to MAE, we evaluated performance using the root mean squared error (RMSE), the Pearson product-moment correlation coefficient (r_corr), and the relationship between observed and predicted lake bottom elevation.
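The partitioning, the configuration grid, and the per-lake metrics can be sketched as follows. The rounding rule used to arrive at 91, 31, and 31 lakes, the random seed, and all function and label names are assumptions; only the set sizes, the 24 configurations, and the choice of metrics come from the text.

```python
import itertools
import numpy as np

def split_indices(n: int, rng: np.random.Generator, val_frac: float = 0.2, test_frac: float = 0.2):
    """Random train/validation/test split; the training set takes the remainder."""
    n_val, n_test = round(val_frac * n), round(test_frac * n)
    perm = rng.permutation(n)
    return perm[n_val + n_test:], perm[:n_val], perm[n_val:n_val + n_test]

def lake_metrics(pred: np.ndarray, truth: np.ndarray, mask: np.ndarray) -> dict:
    """MAE, RMSE, and Pearson r computed over the masked (lake) pixels only."""
    p, t = pred[mask], truth[mask]
    err = p - t
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r": float(np.corrcoef(p, t)[0, 1]),
    }

train, val, test = split_indices(153, np.random.default_rng(0))
print(len(train), len(val), len(test))  # 91 31 31

# Four model sizes x three buffer distances x two initializations.
configs = list(itertools.product((4, 8, 16, 32), (0.33, 0.66, 1.00), ("random", "dem")))
print(len(configs))  # 24

# Toy check: a prediction offset by a constant 1 m from the ground truth.
truth = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
metrics = lake_metrics(truth + 1.0, truth, mask)
print(metrics)
```

The toy check shows why correlation alone is not sufficient: a prediction that is uniformly 1 m too shallow still has r = 1, while MAE and RMSE both report the 1 m error.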

Lakes
The 153 lakes used to investigate the performance of different methods for predicting lake bathymetry were generally small to medium-sized (median = 44.8 ha and mean = 161.7 ha) and shallow (median z_max = 5.6 m and mean z_max = 8.1 m; Table 1). The deepest (Lake Fure, z_max = 38.2 m), the largest (Lake Arre, 3954.5 ha), and the most voluminous (Lake Esrum, 0.234 km³) lakes in Denmark were all included.

Baseline methods
As expected, none of the four baseline methods accurately predicted lake bathymetry (Fig. 3). Of the four, the two inpainting methods (Baseline Telea and Baseline Navier-Stokes) performed the worst. The two more general methods for grid interpolation (Baseline linear and Baseline cubic) performed better, with Baseline cubic being the best, albeit with slightly decreasing performance as the size of the DEM grid increased (validation set MAE at 33% = 3.12 m, 66% = 3.15 m, and 100% = 3.17 m). All baseline methods performed slightly better on the validation set alone than on the entire dataset.

Model performance
We used U-net models in different configurations to predict lake bathymetry from the topography of the surrounding terrain. To identify the best-performing model, we tested U-nets of differing model sizes and initializations. In almost all cases, using the U-net improved performance on the validation set, relative to the baseline (Fig. 3). The best-performing deep learning model, U-net 32-DEM, was the largest model, initialized from pretrained weights and trained using the largest DEM grid buffer size (100%). This model had an average MAE of 2.15 m on the test set (1.75 m on the validation set and 1.38 m on the entire dataset). This is a relative improvement of 45% on the validation set, compared to Baseline cubic. Performance metrics (MAE, RMSE, and r_corr) calculated for each lake indicate that the model performed well, with most lakes having low MAE and RMSE combined with high r_corr and good agreement between predicted and observed average lake bottom elevations for the test set (Fig. 4).
Table 1. Summary statistics (minimum, 1st quartile, median, mean, 3rd quartile, and maximum) of lake surface area, surface elevation, maximum depth, and mean depth for all lakes (n = 153) included in the analysis.

Model experiments
Increasing the size of the DEM grids improved performance for all U-net models (Fig. 3), but the improvements were minor (2.3% relative improvement from 33% to 100% for U-net 32-DEM). Performance improved with increasing model size (20.2% relative improvement from U-net 4-DEM to U-net 32-DEM using the 100% sized DEM grids). Finally, using pretrained model parameters also improved performance (average MAE of 2.96 and 1.93 m for all U-net X-Random and U-net X-DEM type models, respectively). The pretraining procedure involved simulating data and training the neural network to interpolate lake-shaped holes in DEM grids. During pretraining, models approached convergence after 1000 epochs, after which improvements were minor (Supporting Information Fig. S2). While U-net models with pretrained parameters performed slightly better, they also converged much more quickly than models with random initialization (Supporting Information Fig. S3), suggesting that pretraining is suitable when the amount of ground truth data is sparse.

Visual perception of lake bathymetry
The comparison between baseline methods and the suggested deep learning approach shows that the latter's performance was superior in terms of MAE aggregated per lake. More importantly, the predicted lake bathymetry is much more realistic than the Baseline cubic interpolation (Fig. 5). Lake bathymetry predicted from Baseline cubic suffers from interpolation artifacts that require further postprocessing, for example, applying filters or manual editing to remove artifacts, on a lake-by-lake basis. Predictions from the U-net models do not suffer from these problems as there is good agreement between the observed and predicted bathymetries, which is essential for applications beyond estimating volume, z_max, or z_mean. Per pixel predictions generally corresponded well with ground truth for most lakes in the test set; however, a few lakes were seemingly more difficult, with the model failing to provide good predictions (Supporting Information Fig. S4). For the two lakes in the test set with the highest MAE, lake depths predicted using the U-net model were either strongly overestimated (Lake Halle), despite being surrounded by steep terrain, or underestimated (Lake Glenstrup; Supporting Information Fig. S5).
Fig. 3. Validation set (n = 31) performance (white; mean absolute error) of four baseline approaches for interpolation and U-net models of four sizes (U-net 4, U-net 8, U-net 16, U-net 32) initialized at random (U-net X-Random) or using pretrained parameters (U-net X-DEM) on the three buffer sizes of digital elevation model cutouts (33%, 66%, and 100%). Performance on all observations is also shown for the baseline methods, which do not require supervised training (gray; n = 153).

Lake bathymetric maps
We have shown that a widely used deep learning architecture, the U-net, can be trained to successfully predict lake bathymetry solely from the topography of the surrounding terrain. The suggested deep learning approach improves pixel-wise accuracy and produces realistic and ready-to-use bathymetric maps for unvisited lakes without further postprocessing. Predictions require only a DEM grid and lake shoreline, both of which are available on a global scale as remote sensing products, enabling large scale applicability (Abrams et al. 2020). However, optimal performance in other geographical regions with differing landscape history likely requires fine-tuning with local data. Furthermore, because the U-net architecture provides predictions for each pixel with input and output grid dimensions left unchanged, resizing the input grid before or after prediction is unnecessary, thus maintaining the geospatial resolution and avoiding potential interpolation artifacts. We also demonstrated the usefulness of pretraining using simulated data, which overcomes the lack of large amounts of labeled data. The presented approach to predict bathymetry is likely insufficient for individual lakes if high local accuracy is required. However, for collections of lakes, upscaling of depth-dependent processes can be improved (Cael and Seekell 2022).
Traditionally, important lake morphometry variables such as volume, z_max, and z_mean have been predicted by empirical models using lake surface area and, in some cases, elevation range, slope, or other topographical metrics in a buffer zone around the lake (Kalff 2002; Sobek 2011; Messager et al. 2016). In this study, z_max and z_mean derived from the predicted bathymetric maps were in good agreement with the ground truth; however, the spread for z_max was larger and z_max tended to be underestimated at higher values (Supporting Information Fig. S6). Poorer performance for lake morphometry variables was expected, as the model is trained to minimize pixel-wise error. While modeling of lake-level morphometric variables is sufficient for some applications, others require bathymetric maps, for example, analyses of the distribution and biomass of submerged macrophytes (Duarte and Kalff 1986; Lehmann 1998), greenhouse gas emissions (Li et al. 2020), and extent of anoxia (Deeds et al. 2021). For such applications, having bathymetric maps available enables improved quantification of these processes on regional to global scales.

Predicting lake bathymetry
Predictions may deviate from ground truth for several reasons, including sediment deposition and anthropogenic effects such as excavation. Although sediment deposition over time reduces lake depths (Downing et al. 2008), these changes are relatively minor. However, the variation between lakes might be more pronounced due to differences in system hydrology, productivity, morphometry, and wind exposure (Håkanson 1977; Blais and Kalff 1995; Anderson et al. 2020). Even so, most lakes analyzed here have a very similar history, as they result from processes occurring during the most recent glaciation.
Recent studies have leveraged satellite imagery to predict lake and coastal bathymetry using analytical and empirical approaches (Gao 2009; Dörnhöfer and Oppelt 2016). While such techniques tend to yield good predictive performance, their use is limited to shallow, low-turbidity systems. Some of these limitations also hold for LiDAR methods, which are increasingly used for bathymetric applications due to higher accuracy and lower workload associated with data collection, for example, compared to manual echo sounding (Abdallah et al. 2013).
By contrast, the lakes investigated here are susceptible to high turbidity as they are generally nutrient-rich and shallow, resulting in a dominance of phytoplankton and re-suspended sediment particles (Kristensen et al. 1992; Martinsen et al. 2022). Using a DEM to predict lake bathymetry overcomes the limitations of approaches that are restricted by high light attenuation in water columns. However, it relies on the assumption that similar processes have formed bathymetry and topography of the surrounding terrain. Given our successful application, we find this assumption reasonable in a Danish setting, where the landscape is heavily influenced by processes occurring during the most recent glaciation (Houmark-Nielsen 2011), though this might not be the case in other geographical regions. Two recent studies have also used a DEM to predict bathymetry: Zhu et al. (2019) progressively determined lake depth from the shoreline based on the edge slope, and, similar to here, Hosseiny (2021) applied a U-net to predict the extent and depths in Utah's Green River but relied on both a DEM and hydrological model outputs. Remote sensing imagery and DEM grids complement each other regarding strengths and weaknesses and have been used extensively to determine lake volume fluctuations and estimate hydrographic relationships (Dörnhöfer and Oppelt 2016; Schwatke et al. 2020) but have also been found to improve the prediction of lake bathymetry (Getirana et al. 2018). It is straightforward to combine the DEM grids with additional data in the U-net model, e.g., widely available remote sensing imagery, or even soil type, which likely affects bathymetry on longer time scales (decades to centuries). The use of deep learning thus appears to be an attractive option for predicting the bathymetry of aquatic habitats in general.
Fig. 5 (panels). Examples of ground truth (a, d, g) and bathymetry predicted using the best deep learning model (b, e, h; U-net 32-DEM) and simple cubic interpolation (c, f, i; Baseline cubic) from three lakes. The first row is Lake Almind (area = 52.7 ha and z_max = 21.1 m), the second row is Lake Hampen (area = 72 ha and z_max = 12.9 m), and the third row is Lake Kaje (area = 25.7 ha and z_max = 4 m). Lake bathymetry elevations are exaggerated by a factor of 10 to improve visualization.

Deep learning
For this analysis, we designed a pretraining procedure that proved to have several advantages. Pretraining improved model performance, especially for the smaller models, but not by a large margin. However, models initialized from pretrained parameters did converge much more quickly. As expected, predicting lake bathymetry is very similar to predicting missing data in DEM grids in general. This suggests that pretrained models can be used as a starting point and fine-tuned to predict lake bathymetry in other regions, other aquatic habitats, or elevation in areas veiled by forest canopies or buildings, and thus with unknown ground elevation. The ability to perform model pretraining highlights another advantage of deep learning, known as transfer learning: trained models can be fine-tuned on new datasets using less labeled data. We expect that the advantages of pretraining decrease with increasing availability of diverse training data.
In this study, we used a dynamic buffer distance, scaled to the lake surface area, to extract a zone of surrounding terrain believed to influence lake bathymetry. This procedure is similar to that of Heathcote et al. (2015) and others (Sobek 2011), who found that the predictive performance for volume and z_max decreased when increasing the buffer distance. In the present study, the scaling of the buffer zone had only a minor influence on the prediction of lake bathymetry. The different findings are likely a consequence of the applied methodology. The convolutional neural networks used here are likely superior for extraction of local landscape features as opposed to manually defining and extracting global features (i.e., slope or elevation) in a buffer zone as done traditionally. In general, the deep learning approach was robust to changes in data (buffer distance) and model size; that is, beyond the smallest model, U-net 4-DEM, improvements in model performance were minor. We used a DEM with a resolution of 10 m, which is probably sufficient for the medium-large lakes in our dataset; however, increasing the number of high-quality bathymetric maps for small lakes and the DEM resolution provides an opportunity to drastically increase the number of applicable lakes using the methods presented here.

Conclusions
Lack of bathymetric data, or reliance on morphometric features such as volume, z_mean, or z_max, can introduce uncertainty when quantifying depth-dependent processes in lakes. Dedicated surveys of one or a few lakes can overcome this limitation, but this is not an option for large collections of lakes. Here, we have described a deep learning model to predict bathymetry relying only on shoreline and elevation data. On a spectrum with high-accuracy methods at one end (in situ measurements or, where possible, remote sensing products with site-specific calibration) and simple interpolation or geometric shape assumptions at the other, the approach described here falls in between in terms of the trade-off between accuracy, effort, and data requirements. When possible, bathymetry should be included when quantifying depth-dependent processes and properties ranging from the distribution of organisms to greenhouse gas emissions in lakes.

Fig. 1. Topographic map of the study region (Denmark) with the lakes included in the study (open points). The stippled line is the glacial maximum during the most recent glaciation.

Fig. 2. The approach used to create three DEM grids (buffer distances of 33%, 66%, and 100% of √A) for each lake and predict lake bathymetry from the surrounding terrain. (a) Lake Borre (blue) and the topography of the surrounding terrain and the three buffer distances used to predict lake bathymetry. (b) Lake Borre (masked) and topography of the surrounding terrain (33% buffer). This is the input for the deep learning model. (c) Lake bathymetrical map of Lake Borre (ground truth). (d) Bathymetry of Lake Borre predicted by the deep learning model. (e) Difference between the ground truth (c) and predicted (d) lake bathymetry. The differences are used to calculate the loss (mean absolute error) when training the model.

Fig. 4. Performance of the best deep learning model (U-net 32-DEM) on the test set (n = 31). Performance metrics are calculated using ground truth and predicted lake bathymetry elevations per lake. (a) MAE. (b) RMSE. (c) Pearson product-moment correlation coefficient. (d) Mean predicted vs. observed lake bottom elevations. The dotted line shows the 1:1 relationship.

Fig. 5. Three-dimensional representations of lake bathymetric maps colored relatively by depth ranging from shallow (light blue) to deep (dark blue).