Evaluating preprocessing methods of digital elevation models for hydrological modelling

With the introduction of high‐resolution digital elevation models, it is possible to use digital terrain analysis to extract small streams. In order to map streams correctly, it is necessary to remove errors and artificial sinks in the digital elevation models. This step is known as preprocessing and will allow water to move across a digital landscape. However, new challenges are introduced with increasing resolution because the effect of anthropogenic artefacts such as road embankments and bridges increases with increased resolution. These are problematic during the preprocessing step because they are elevated above the surrounding landscape and act as artificial dams. The aims of this study were to evaluate the effect of different preprocessing methods such as breaching and filling on digital elevation models with different resolutions (2, 4, 8, and 16 m) and to evaluate which preprocessing methods most accurately route water across road impoundments at actual culvert locations. A unique dataset with over 30,000 field‐mapped road culverts was used to assess the accuracy of stream networks derived from digital elevation models using different preprocessing methods. Our results showed that the accuracy of stream networks increases with increasing resolution. Breaching created the most accurate stream networks on all resolutions, whereas filling was the least accurate. Burning streams from the topographic map across roads from the topographic map increased the accuracy for all methods and resolutions. In addition, the impact in terms of change in area and absolute volume between original and preprocessed digital elevation models was smaller for breaching than for filling. With the appropriate methods, it is possible to extract accurate stream networks from high‐resolution digital elevation models with extensive road networks, thus providing forest managers with stream networks that can be used when planning operations in wet areas or areas near streams to prevent rutting, sediment transport, and mercury export.

are also important because the length of stream networks changes dynamically between high and low flows (Ågren et al., 2015;Blyth & Rodda, 1973;Jones, 2000).
Recent advances in remote sensing and digital terrain analysis have paved the way for new techniques and better understanding of forest hydrology (Creed, Sass, Wolniewicz, & Devito, 2008;Murphy, Ogilvie, Castonguay et al., 2008;Ågren, Lidberg, Strömgren, Ogilvie, & Arp, 2014;Laudon et al., 2016). The better understanding of forest hydrology is partly due to the availability of better hydrological maps derived from high-resolution digital elevation models (DEMs) generated from Light Detection And Ranging (LiDAR; Murphy, Ogilvie, Castonguay et al., 2008). Early DEMs were created from photogrammetry, whereas modern DEMs are often derived from LiDAR point clouds and can have resolutions of less than 0.5 m × 0.5 m (a grid resolution of 0.5 m × 0.5 m will, from now on, be written as 0.5 m; Reutebuch, McGaughey, Andersen, & Carson, 2003). The amount of country-wide LiDAR datasets is rapidly increasing, and some examples of countries with a national DEM created from LiDAR are as follows: Denmark (Danish Geodata Agency), Finland (National Land Survey of Finland), and Sweden (Swedish Mapping, Cadastral and Land Registration Authority). These new DEMs are increasing in popularity amongst managers and are often used to map hydrological features such as stream networks (Vaze & Teng, 2007). Streams extracted from DEMs have three main advantages: First, they form an integrated drainage network (O'Callaghan & Mark, 1984); second, they are highly accurate (Goulden, Hopkinson, Jamieson, & Sterling, 2014) and follow actual channel depression in the DEM ; and third, they can easily be adjusted for seasonal variations and also display where ephemeral streams appear (Ågren et al., 2015).
Before any hydrological modelling can be applied to a DEM, it needs to be adjusted in order to be hydrologically correct (Jenson & Domingue, 1988;O'Callaghan & Mark, 1984). Water can only move downhill in a DEM, which means that sinks need to be removed to allow water to continue towards the outlet. Sinks are defined as areas surrounded by cells with higher elevations, which prevent water from moving further (Jenson & Domingue, 1988;Lindsay, 2015;Martz & Garbrecht, 1998;O'Callaghan & Mark, 1984;Zhang & Montgomery, 1994). They can be real depressions in the landscape or artefacts from urban features such as bridges. Thus, preprocessing of DEMs is important, especially because any errors in the input data will be amplified with each subsequent calculation (Kenward, Lettenmaier, Wood, & Fielding, 2000;Wise, 2000). There are two commonly used methods to handle sinks: filling (O'Callaghan & Mark, 1984;Wang & Liu, 2006) and breaching (Martz & Garbrecht, 1998;Martz & Garbrecht, 1999;Rieger, 1993). A fill algorithm examines the cells surrounding a sink and increases the elevation of the sink cells to match the lowest outlet cell (Planchon & Darboux, 2002;Wang & Liu, 2006). A breaching algorithm instead lowers the elevation of cells along a path between the lowest cell in the sink and the outlet of the sink (Martz & Garbrecht, 1998).
There are a number of studies that show how different preprocessing methods affect a DEM. Lindsay and Creed (2005) analysed the impact of the removal of artefact sinks from a 5-m DEM and found that methods combining filling and breaching had the least impact on the spatial and statistical distribution of terrain attributes. Poggio and Soille (2012) analysed the effect of preprocessing methods on stream networks and concluded that a combination of breaching and filling produced the most accurate stream network on a 30-m DEM. Lindsay (2015) demonstrated a flexible hybrid breaching-filling sink removal method on six large DEMs with resolutions of 30 and 90 m and concluded that the hybrid method performed similar to the highly efficient fill algorithm by (Wang & Liu, 2006) in terms of processing time. Preprocessing of high-resolution (<2 m) DEM introduces new challenges. There are mainly two problems associated with increasing the resolution of DEMs.
The first problem is processing time, which increases drastically when the resolution increases and thus the number of data points increases (Barnes, Lehman, & Mulla, 2014;Qin & Zhan, 2012). The second problem is that features such as road-stream intersections become detectable, and, because roads are slightly elevated above surrounding terrain, they often appear to block the streams they cross. In reality, water may be draining underneath the road in a culvert or bridge (Shortridge, 2005).
Higher resolution also produces more detailed hydrographic features such as stream networks (Dehvari & Heck, 2013;Goulden et al., 2014;Vaze & Teng, 2007;Yang et al., 2014) but does not improve the detection of large features such as wetlands (Creed, Sanford, Beall, Molot, & Dillon, 2003) or topographic wetness index (Ågren et al., 2014). LiDAR is also sensitive to noise from low-lying vegetation and saturated soil surfaces, which need to be dealt with during the preprocessing (Goulden et al., 2014;Gyasi-Agyei, Willgoose, & Troch, 1995). An important advantage of high-resolution data is that it may contain information of forest ditching and similar small-scale features that impact drainage.
In the small country of Sweden, more than 210,000 km are forest roads built to extract timber from 227,000 km 2 of forested land. That equals roughly to 1 km of roads for every square kilometre of forest landscape. Ågren et al. (2015) mapped stream networks from a high-resolution DEM and found 2-5 km of streams per square kilometre of forested land, depending on season. This highlights the importance of handling sinks caused by road embankments correctly during the preprocessing stage; otherwise, the resulting hydrologically modelled maps will contain misplaced streams. The location of culverts needs to be incorporated into DEMs to prevent this error (Goulden et al., 2014;Shortridge, 2005). It can be done by breaching a path across roads if their locations are known, but this is rarely the case, and mapping culverts in the field is both time-consuming and costly. Much previous work has focused on coarser resolution DEM without small-scale anthropogenic features such as roads (Lindsay, 2015;Poggio & Soille, 2012); however, recent studies have addressed this problem (Lindsay & Dhun, 2015;Schwanghart, Groom, Kuhn, & Heckrath, 2013) using high-resolution data on small geographical areas.
In this study, we focus on digital terrain analysis to extract streams from DEMs with a range of different resolutions, in watersheds containing a large number of small-scale anthropogenic artefacts, which are mostly roads. The first research question in this study is, "Which preprocessing methods most accurately route water across road impoundments at actual culvert locations?" For this purpose, a large field inventory has been conducted in northern and central Sweden, where over 30,000 road culverts in 10 watersheds have been located and mapped manually. This is a unique dataset and a rare opportunity to evaluate the performance of preprocessing methods with focus on road impoundments.
We assume that one wants to enforce continuous flow to the outlet without losing important information from the original DEM. The second research question in this study is therefore, "How much of the landscape is affected by the different preprocessing methods?" Here, we evaluate area changed and the difference in absolute volume between original DEMs and preprocessed DEMs.

| Study sites
This study consists of nine large catchments in central Sweden The preprocessing methods that have been evaluated can be sorted into three categories: algorithms that fill sinks, algorithms that breach sinks, and algorithms that utilize a combination of both filling and breaching to remove sinks. In this study, we focus on efficient algorithms capable of handling large DEMs (~1,000 km 2 at 2-m resolution). The following is a short introduction to the evaluated algorithms.
Each method is given a short name in this study (in italics), and all methods are summarized in Table 1. 2.2 | Fill algorithms (also known as incremental methods) Wang and Liu (2006) introduced the priority flood algorithm, which examines each cell on the basis of its spill elevation, starting from the edge cells, and visiting cells from lowest order using a priority queue. This algorithm was modified to work with larger LiDAR DEMs and implemented in SAGA

| Breaching algorithms (also known as decremental methods)
The first breaching methods were introduced by Martz and Garbrecht (1998) and Rieger (1993Rieger ( , 1998 and worked by identifying and breaching the lowest outflow in a sink if specific criteria of depth and breach length were meet. Studies by Soille (2004), Schwanghart and Kuhn (2010), and Schwanghart et al. (2013) propose breach algorithms based on the least cost auxiliary topography. This algorithm is included in MATLAB TopoToolbox R2013b and will be referred to as LCAT breach. An even more efficient breaching algorithm was introduced by Lindsay (2015) and is available in a small program called GoSpatial. This method will be referred to as complete breach.

| Hybrid algorithms (incremental and decremental combined)
GoSpatial also offers the possibility to combine breaching and filling into a hybrid solution using a priority flood algorithm where sinks can be resolved by selective or constrained breaching. Constrained breach and selective breach was run with a maximum breach length of 50 grid cells and maximum breach depth of 2 m. This means that sinks that would require a breaching path of more than 50 grid cells or sinks deeper than 2 m will be filled instead of breached. The main difference between constrained and selective breach is that selective breaching does not breach sinks that do not meet the criteria above, whereas constrained breaching creates a partial breach up to the above-defined criteria in order to reduce the interior sink size (Lindsay, 2015). For example, constrained breaching will breach a channel of 50 m before applying fill, whereas selective breaching will stop and fill without breaching that specific sink.
There is also an option to burn a known stream network into a DEM.
Unfortunately, forest hydrology is often poorly mapped, and only streams distinguishable from aerial photos are displayed on current maps, which makes stream burning questionable (Lindsay & Dhun, 2015). Even so, it is still reasonable to assume that the location of a stream-road crossing would be easier to distinguish from aerial photos because of the opening in the canopy along roads, making these locations more reliable. Therefore, streams from existing maps were burned into the DEM where they crossed a road, and only a short distance (maximum 50 m) that would correspond to the distance necessary to burn across the largest road embankments in the catchments. This step was done using the tool "burn streams at roads" in Whitebox GAT and will be referred to as "BR." Here we applied (fill, complete breach, selective breach, constrained breach, and LCAT) separately to the stream-road-burned DEM. Methods where the stream-road intersections were burned into the DEM have "BR" added to the name to clarify this (BR fill, BR complete breach, BR constrained breach, BR selective breach, and BR LCAT).

| Evaluation
Field mapping an entire stream network is not an easy task, and stream networks are tricky to compare in a reliable way (Molloy & Stepinski, 2007). Instead of comparing the entire stream network, we focus on locations where streams intersect roads. Our unique dataset of field- . This means that culverts located in areas near a water divide, before a stream has been initiated, will not be intersected by this stream network. A lower flow initiation threshold would produce a more extensive stream network and intersect more of the fieldmapped culverts, but we decided that it would be more relevant to use a realistic flow initiation. A stream-road intersection was only considered to be accurate if a stream passed within 10 m of both ends of a culvert. The 10-m search radius was chosen to avoid nearby culverts at road intersections and similar locations.

| Effects of preprocessing methods on DEMs
Preprocessed DEMs were compared to original DEMs in order to analyse how preprocessing methods changed the DEM. This comparison included area changed and absolute volume changed, which are commonly used to assess the impact of preprocessing methods (Lindsay & Creed, 2005;Poggio & Soille, 2012). Absolute volume change is the sum of the absolute height difference for all cells in the catchment before and after the preprocessing multiplied by the total number of cells (Equation 1).
Abs volume ð Þ¼a ∑ N i¼1 z i;orig −z i;proc À Á : (1) a is the area of a raster cell, z i,orig is the elevation for raster cell i in the original DEM, z i,proc is the elevation for raster cell i in the preprocessed DEM, and N is the number of raster cells in the DEM.
LiDAR is absorbed by water, so elevation data in these surfaces were interpolated from surrounding terrain during the DEM creation. They were also flattened using lake and river polygons from a topographical map and given an arbitrary slope towards the coast. These areas were excluded from the evaluation of preprocessing methods impact on the DEMs.

| Correct stream-road crossings
Stream networks from all preprocessed DEMs were intersected with over 30,000 field-mapped road culverts, and the number of correct The accuracy of topographically derived stream networks increases with increasing digital elevation model (DEM) resolution.
Preprocessing methods that prioritize breaching over filling lead to more accurate stream networks on all DEM resolutions stream-road crossings was used to evaluate accuracy of each method.

| Preprocessing effects on DEMs
The impact of each method was defined by changes in DEM area and absolute volume between the original DEMs and the preprocessed DEMs. Methods that prioritized breach over fill made the least changes, to both area and absolute volume (Figure 3). This was the case for all resolutions. All methods changed larger areas on higher resolution DEMs, especially fill. The difference in area changed between methods that prioritize breaching and methods that prioritize filling also increased with increasing DEM resolution. Burning streams from the topographic map across roads from the topographic map ( We also found that the accuracy increased when streams from the topographic map were burned across roads from the topographical map, before applying a complete preprocessing method. In this study, Increasing DEM resolution increased the area affected by preprocessing, especially for methods that prioritize fill. This is likely due to sinks caused by small-scale features, which become visible at higher resolutions. Features such as road embankments could explain increasing differences in area changed, and number of correct stream-road crossings, between breaching and filling at higher resolutions. BR LCAT created the most accurate stream network, and it changed 52% less of the study area compared to the classic fill method on the 2-m DEM. This is also consistent with recent findings. Lindsay and Dhun (2015) evaluated preprocessing algorithms on a 1-m DEM in a landscape dominated by agriculture and found that breaching changed an area 86.5% smaller than filling. This is consistent with results from a study on a 30m DEM by Poggio and Soille (2012). Some of the difference in impact between filling and breaching can be attributed to flat areas in our catchments. If a road crosses the outlet of a flat area, the fill algorithms will fill up the whole area in order to remove the sink, whereas breach algorithms will breach a channel across the road. Burning streams from the topographic map, across roads from the topographic map, reduced the impact on the DEMs and improved the accuracy for all methods but especially for filling methods. This shows just how sensitive filling is to road embankments.
Previous studies have shown that methods that change the DEM less produces more accurate stream networks (Lindsay & Creed, 2005;Poggio & Soille, 2012). This is consistent with our results, but there is no reason why minimizing the impact should be a goal by itself.
The aim of any preprocessing method is to create accurate flow directions and by extension accurate stream networks. One of the most important advantages with breaching instead of the filling used here is the behaviour of flow paths upstream of a road embankment.
Streams from both methods might cross the road in a correct location, but fill will produce straight parallel streams across the filled area, whereas breach uses the flow path information of the unfilled DEM to the beaching point. This means that filling fails to utilize information about flow directions in the filled areas ( Figure 4). However, there are other filling methods such as the one described by Martz and Garbrecht (1998)  Selecting appropriate thresholds for maximum breach depth and length can be difficult and will vary with resolution because the origin of sinks changes with resolution.
Road embankments, bridges, and culverts are some of the biggest issue to address in order to create reliable stream networks from highresolution LiDAR DEMs (Schwanghart et al., 2013). One advantage with high-resolution LiDAR DEMs is that they might contain information about small-scale anthropogenic features such as ditches that can be incorporated in the hydrological models in order to improve the accuracy of stream networks. This would allow us to shed some light on the unknown headwaters described by Bishop et al. (2008).
Forest managers could use these stream networks to better plan operations in wet areas near streams in order to prevent rutting (Ågren et al., 2015) and subsequent sediment transport (Kreutzweiser & Capell, 2001) and mercury export (Munthe & Hultberg, 2004).

| CONCLUSIONS
The accuracy of stream networks, in terms of correct culvert intersections, increased with increasing DEM resolution. Stream networks extracted from DEMs that had been breached instead of filled created more accurate stream networks on all resolutions and had less impact in terms of change to area and absolute volume. The difference in accuracy between breaching and filling increased with increasing resolution. The accuracy also increased when streams from the topographic map were burned across roads from the topographical map, for all methods and resolutions.

ACKNOWLEDGMENTS
Big thanks to the personnel at the Swedish Forest Agency who conducted the field survey for the five large study catchments. Anneli M. Ågren http://orcid.org/0000-0002-6758-3971