Quantifying the dynamics of microtopography during a snowmelt event

Knowledge of soil microtopography and its changes in space and over time is important to the understanding of how tillage influences infiltration, runoff generation and erosion. In this study, the use of a terrestrial laser scanner (TLS) is assessed for its ability to quantify small changes in the soil surface at high spatial resolutions for a relatively large surface area (100 m2). Changes in soil surface morphology during snow cover and melt are driven by frost heave, slaking, pressure exertion by the snowpack and overland flow (erosion and deposition). An attempt is undertaken to link these processes to observed changes at the soil surface. A new algorithm for soil surface roughness is introduced to make optimal use of the raw point cloud. This algorithm is less scale dependent than several commonly used roughness calculations. The results of this study show that TLSs can be used for multitemporal scanning of large surfaces and that small changes in surface elevation and roughness can be detected. Statistical analysis of the observed changes against terrain indices did not yield significant evidence for process differentiation. © 2019 The Authors. Earth Surface Processes and Landforms Published by John Wiley & Sons Ltd. © 2019 The Authors. Earth Surface Processes and Landforms Published by John Wiley & Sons Ltd.


Introduction
Soil microtopography can be defined as soil surface morphology at the scales of the aggregate and soil clod, including rills, vegetation and washed on sediments (Ploeg et al., 2011). Soil surface morphology, in particular hydrological connectivity and surface storage, is an important driver for hillslope response to rainfall (e.g. Appels et al., 2011;Frei and Fleckenstein, 2014;Thompson et al., 2010) and snowmelt. Its importance for the understanding and simulation of overland flow processes has long been recognized (Römkens and Wang, 1987). As a derived surface morphological characteristic, random roughness (RR, m) is an important variable in many overland flow and erosion models. RR can be used to estimate a roughness factor such as Manning's n (e.g. De Roo et al., 1996;Deng et al., 2005;Jinkang et al., 2007) or as a variable in the energy gradient in hydraulic equations (Fiedler and Ramirez, 2000;Esteves et al., 2000). Some models treat soil surface roughness as a dynamic parameter that changes during an event or simulation period (Loon, 2002;Kirkby et al., 2008). Another derived parameter, depression storage capacity (DSC, m) is used in the water balance of several models to determine a lag period before an area starts to produce overland flow (e.g. De Roo et al., 1996). Temporary storage and infiltration of overland flow in sinks also determine how hydrological connectivity develops during a runoff event (Darboux and Huang, 2001;Peñuela et al., 2016;Wang et al., 2018). At the plot and field scales, quantitative knowledge of microtopography therefore is a key factor for understanding and simulating overland flow.
Microtopography changes over time due to several processes. On agricultural soils, the most significant of these are tillage, precipitation impact, overland flow, erosion and sedimentation, slaking and, in cold climates, frost heave.
Much work has been done on the quantification of roughness development as a result of rainfall and surface runoff (Zobeck and Onstad, 1987;Cremers and Dijk, 1996;Römkens and Wang, 1987;Haubrock et al., 2009). Surface roughness concerns morphology at the spatial scale of small aggregates down to soil particles. Roughness at this scale is not directly included in catchment scale runoff models, but is relevant for runoff and erosion models developed for simulations at high spatial resolutions, i.e. centimetre scale (Fiedler and Ramirez, 2000;Esteves et al., 2000). This roughness influences the infiltration process because it is related to surface storage and crust formation, and decreased initial infiltration rates as a result Singer, 1992, 1993). Surface roughness and its change over time are also relevant for studies that utilize synthetic-aperture radar (SAR), where they are used to interpret backscatter (Louis et al., 2004;Snapir et al., 2014).
Roughness metrics are strongly dependent on the spatial resolution at which the calculations are carried out (Kamphorst et al., 2000). This can be problematic in multitemporal 44, 2544-2556 (2019) measurements if the spatial scope of the expected changes is unknown. Algorithms that show a high correlation between slope gradient and roughness are less likely to distinguish sub-grid cell roughness elements from the overall slope within this grid cell.
Little is reported about roughness development during snow cover and melt (Blackburn et al., 1990). Slaking, or sloughing, is a process of smoothing out small elements at the soil surface. It is a form of soil structure decline, driven by, for example, the impact of frost-thaw cycles. No studies were found to report on the magnitude or characteristics of slaking during snow cover and subsequent melt.
The quantification of erosion and sedimentation processes at the level of microtopography has been reported in laboratory studies (Helming et al., 1998;Rieke-Zapp and Nearing, 2011;Moritani et al., 2010;Bogner et al., 2012) and field studies at limited spatial scales (Vidal Vázquez et al., 2010;Haubrock et al., 2009). Eltner et al. (2018) use a combination of unmanned aerial vehicles (UAV) and terrestrial laser scanner (TLS) to quantify erosion and sedimentation at the hillslope scale.
When conditions are characterized by frost and high soil moisture conditions, the process that is likely to have the largest impact on surface elevation change is frost heave. Frost heave rates can be several orders of magnitude higher than erosion and sedimentation rates. Beskow (Black and Hardenberg, 1991) reported road heave values in Sweden of 0.6 m. The same study indicated that frost heave rates can be 0.1 mm h 1 in a fine sandy soil. Bronfenbrener (2009) reported values of 3.7 mm d 1 for the same medium in an experimental setup. Hermannson and Spencer Guthrie (2005) measured maximum heave values between 0.20 and 0.25 m during a winter season at a field site. While progress in the theoretical understanding and modelling of frost heave is ongoing (Peppin and Style, 2013), the spatial heterogeneity of the driving forces and soil physical parameters limits the possibilities for simulations at high spatial resolutions in and for field conditions.
Typical for most soil roughness and erosion and sedimentation studies is that the processes affecting microtopography are isolated or that the magnitude of the lesser processes is assumed small in comparison to the main process studied. In field studies where the comparative magnitudes are unknown, no conclusions about the contribution of the individual processes to surface morphology dynamics can be drawn without explicitly addressing process differentiation. With the exception of Eitel et al. (2011), Vidal Vázquez et al. (2010 and Kaiser et al. (2018), little is known about the correlation between the changes to microtopography over time and their correlations with the processes that drive them.
Obtaining quantitative information about changes in soil surface morphology can only be undertaken by means of multitemporal georeferenced measurements of a certain area of interest. Soil microtopography can been studied with several different technologies, ranging in resolution from pin-frames (several centimetres) to TLS (sub-centimetre) (Jester and Klik, 2005). More recently, Thomsen et al. (2015) provided a comparison of these methods with regard to their ability to quantify surface roughness. Photogrammetric methods, most notably structure-from-motion (SfM) techniques, are increasingly being used to quantify soil surface roughness on agricultural soils (Gilliot et al., 2017) and its development over time (Snapir et al., 2014;Bauer et al., 2015;Hänsel et al., 2016). One-dimensional laser scanners have been in use for almost 30 years (Huang and Bradford, 1992), and continue to be employed (Martinez-Agirre et al., 2016), but their frame mounting limits the spatial extent of the measurable object.
Most other methods to obtain three-dimensional datasets are either intrusive or can only cover limited surface areas. They are therefore not suited to multitemporal measurements at the larger plot scale.
TLSs have been used to characterize soil surfaces at different resolutions. Barneveld et al. (2013) and Nield et al. (2013) indicated that TLSs can be used to obtain high-resolution digital elevation models (DEM) of good quality for areas over 100 m 2 . Eitel et al. (2011) correlated TLS-measured soil roughness values with runoff characteristics and the quantification of erosion and deposition for artificial runoff events. The maximum value of surface lowering obtained in their study was 6.0 10 2 m. Overland flow velocities as a result of snowmelt during early spring in southern Norway are likely to be much smaller than in Eitel's study. The effects on microtopography can therefore be expected to be smaller too. Examples of gully erosion monitoring with multitemporal TLSs can be found in Goodwin et al. (2016) and Perroy et al. (2010). Hohenthal et al. (2011) provided an overview of how airborne and terrestrial laser scanning can be used to monitor river bed dynamics. Gully development and fluvial dynamics, however, result in relatively large changes to the soil surface or stream profile (0.01 m and over). Eltner and Baumgart (2015) reported that small elevation changes as a result of erosion and deposition can be detected by multitemporal TLS. The smallest changes detected reliably in their study were 1.5 10 2 m. With the exception of Marx et al. (2017), no studies of TLS measurements of surface elevation changes smaller than a centimetre have been found for on-field experiments.
Data quality in any multitemporal measurement must be such that errors and uncertainties are smaller than the expected changes to be observed (Williams, 2012). Errors that can be expected with the use of TLSs for creating DEMs of difference (DoD) can be classified broadly into measurement inaccuracies, misalignment of subsequent terrain models and miscellaneous errors. Measurement accuracy and precision are primarily dependent on the equipment used. Misalignment between elevation models occurs when reference points move during the time between measurements. The most notable miscellaneous error that may influence the quality of the DoD for soil surfaces can be vegetation growth.
The objective of this study was to quantify changes to the surface of an agricultural soil at the plot scale in field conditions typical for the spring climate of southern Norway. The main challenge with comparing DEMs is that fixed reference points for calibration and quality assurance are usually not available on agricultural soils. As a consequence, there is no fully reliable method to differentiate measurement bias and noise from actual soil surface changes. This study hypothesizes that changes in soil surface elevation and roughness reflect the processes that have driven these changes. Slaking, erosion and deposition and frost heave are typically not randomly distributed over the surface area of an agricultural field. The primary objective of this study is to correlate changes in the DEM with the processes that drive these changes locally. The secondary objective is to define an algorithm for RR that is less sensitive to spatial resolution and local slope.

Experimental site and event
The experiment was carried out on a runoff plot situated at Bjørnebekk (C59ř 39 0 9.2 00 , C10ř 50 0 12.01 00 ) in Akershus County, southeastern Norway. The soil type at the location is a silty clay loam. Since tillage is the principal source of roughness for agricultural soils, a mouldboard ploughed surface, typical for wheat fields in Norway, was studied. The plot is 5.5 m and 21 m long, in the direction of the slope. The slope shape is largely linear, with an average inclination of 12.4%. Tillage was undertaken in late September 2010, in parallel with the slope. In the course of the winter 2010-2011, a snow cover of 0.52 m of depth on average, or 0.147 m snow water equivalent (SWE, m), accumulated. Snowmelt occurred between 17 March and 9 April. Peak melting rates occurred between 22 and 24 March. During this period, the development of the snowpack was monitored by taking snow depth and SWE measurements at around 12:00 am. Depth was monitored with a ruler, while SWE was measured by taking and weighing cores of a known volume. Both the depth and SWE of the snowpack were spatially homogeneous throughout the winter and melting periods. Tipping bucket counts of the runoff collector of the plot were taken simultaneously ( Figure 1). The maximum melting rate recorded was 1.9 mm h 1 (SWE) on 24 March at 1:30 pm. The maximum runoff rate on the same day was 3.6 mm h 1 at 3:00 pm. Note that the soil temperature drops considerably after the bulk of the snowpack melted on 24 March. Melting continued gradually without inducing runoff at the same order of magnitude as on 24 March.

Soil surface elevation
The study plot was scanned in mid-October 2010, several weeks before the first snow. No precipitation occurred between scanning and the beginning of snow cover. The next scan was carried out less than a week after all snow had melted completely, in the second week of April 2011. Again, no precipitation occurred between the completion of the melting process and the TLS scan.
Soil surface elevation data were obtained with a Leica ScanStation2 TLS. This scanner issues a pulsed green laser signal to measure distance (time-of-flight). It records a single return signal and this, combined with a vertical and horizontal angle, yields a three-dimensional vector (Vosselman and Maas, 2010). Its maximum scan rate is 50 000 points per second (Leica Geosystems, 2007). The scanner was mounted on a tripod, so that the effective scanning height was between 1.7 and 2.0 m above ground level ( Figure 2). A series of scans from eight viewpoints around the plot were required to ensure sufficient coverage to create a reasonably fine DEM with a grid cell length of 0.02 m. With the eight viewpoints, shading due to the ploughing ridges could be minimized. Milenkovic et al. (2015) quantified the importance of scan geometry, and concluded that roughness features smaller than 5.0 10 2 m are  underrepresented when a soil object is covered by only a single scan. Georeferencing the viewpoints and the DEMs at the different points in time was carried out by installing 10 iron pins of 0.60 m length along the sides of the plot. The pins were driven into the ground for at least 0.40 m to avoid frost deformation or displacement. It was confirmed that the pins were not bent or otherwise disturbed after snowmelt. In the absence of any unmoveable structures in the direct vicinity of the runoff plots, changes in the exact positions of the pins could not be ruled out. Detachable, high-contrast visual targets were mounted on the pins during the TLS measurements. The total plot size was 115.5 m 2 and the number of points of the combined point clouds totalled 11.5 million. The average point density therefore was 10.0 points cm 2 .
Point cloud registration and error analysis The goal of this study was to quantify differences in elevation and surface roughness between two terrain models. The ability to detect these differences depends on their magnitude relative to the errors associated with data capture and post-processing. The precision and accuracy of the TLS depend on the equipment used, atmospheric and light conditions, the scanning geometry and the object of the measurement (Soudarissanane et al., 2011). The principal sources of error are mixed pixels, where the laser beam obtains multiple returns because of partially reflecting objects, errors in co-referencing multiple scans and inaccuracies in the time-of-flight recordings.
The raw point clouds were registered with Leica Cyclone (v. 7.0, Leica Geosystems, Inc.). Point cloud registration generally consists of two steps: coarse and fine registration. Coarse registration in the Cyclone software package consists of a transformation of successive point clouds, so that they have a consistent geometry in a certain coordinate system. The parameters for these transformations are calculated by manually identifying common features in the paired point clouds. In our case, these were the cross-points in the reference targets that were placed along the boundary of the plot. Fine registration is then undertaken by a bundle adjustment algorithm (Triggs et al., 2000), during which further local transformations are undertaken until the residual error between paired point clouds reaches a minimum. A similar sequence of transformations is undertaken for the registration of the multitemporal point clouds. Here, the possibility that the reference targets have moved in 4 months between the measurements introduces an additional potential source of errors. The algorithms for accuracy assessment used in the Cyclone were not known to the authors at the time of writing. Moreover, it is unclear how the accuracy metrics generated by the software are influenced by the complex geometry of the soil surface. A registration accuracy assessment is therefore proposed that takes into account the effect of grid cell size and the roughness of the terrain. The method is based on two premises: 1. The average horizontal distance between points within a grid cell is a function of the cell dimensions, or resolution. 2. The average vertical distance between points within a grid cell is a function of the surface roughness.
As a logical consequence, the average point-to-point distance (d P2P , m) in a grid cell of size zero without roughness is equal to zero.
The final point clouds, i.e. including the measurements from all the viewpoints, were tested for their residual error in the following way. First, the average d P2P within a grid cell and the standard deviation of the elevation ( 2 z , m) were calculated. This was done for grid cell lengths between 0.01 and 0.10 m, with 0.01 m increments. The results for the different cell sizes were combined into a single file, with the average d P2P as the dependent variable and 2 z , and the cell size as the independent variables. A nonlinear regression was carried out using the R statistical package (R Development Core Team, 2008). The model to be fitted was given by Following premises 1 and 2, was taken to be the compound error as a result of data capture and post-processing. The regression was carried out by assigning grid-size-dependent weights, since the number of observations decreases quadratically with increasing grid sizes. The regression model was then compared to the terrain indices, described below ('Process differentiation'), in order to test for any spatial bias.

Point cloud filtering
The raw point cloud contains non-ground points that are the result of the presence of vegetative material and mixed pixels. The registered point clouds were filtered to discard these. Non-ground points were identified by the iterative application of an algorithm proposed by Kraus and Pfeifer (1998). The algorithm is designed for filtering vegetation from LIDAR data of forested areas, but is also able to remove mixed pixels, vegetation and other non-ground elements in a TLS point cloud. The algorithm assigns a weight to individual points proportional to the vertical distance to an initial estimate of the ground surface level (Equation 2). This estimate is then used in the next iteration for the new vertical distance of each point to the new average.
where w i is the weight-assigned point measurement z i , z is the standard deviation of the vertical distances within the grid cell (m) and a and b are dimensionless parameters. The values of a and b were set at 12 and 2, respectively. For b the value is based on the original research by Kraus and Pfeifer (1998). The value of a was set at 12, which is higher than the value proposed by Kraus and Pfeifer, who used 4. This ensures a sharper vertical cutoff in Equation 2, which makes the algorithm faster on surfaces with sparse vegetation. Points lower than the initially estimated average elevation within the grid cell, and those higher than N z C 2 z are discarded in order to reduce the number of iterations. Filtering was undertaken by means of a raster grid with a cell length of 0.02 m. Iteration stops when the difference between the current and the new estimated average surface elevation is below a certain threshold. The number of iterations typically required in the absence of vegetative material is two to four (Barneveld et al., 2013).

Process differentiation
Without the presence of a reference surface, a direct distinction between observed changes and measurement error cannot be made. Besides the random error of the measurement itself, the main error source for multitemporal comparisons of a soil surface is expected to be caused by misalignment of the DEMs at the different moments. Errors in vertical georeferencing would result in overestimations of erosion or deposition values across the entire surface area. Referencing errors in the horizontal plane would result in patterns that do not match terrain characteristics such as local slope and catchment area. Differentiating measurement errors from actual changes in soil surface therefore requires a conceptual model of what these changes entail and where in the runoff plot they are expected to be more pronounced. For this purpose, a series of structural topographic indices were calculated for comparison with the observed changes in surface elevation and roughness and the quantification of frost heave (Table I). These indices are calculated for each grid cell in the DEM and then compared to the observed changes in surface elevation and roughness in that particular cell. The DEM used for calculation of the indices was created by importing the filtered point cloud into a raster grid in SAGA GIS (Conrad, 2001), with a cell length of 0.02 m. The correlations between terrain index and observed roughness or elevation change were tested by means of linear regression analysis in R (R Development Core Team, 2008) and the coefficients of determination of these linear functions.
Surface roughness development In the absence of any impact of raindrops, the development of surface roughness during snow cover and melt is driven by slaking (Blackburn et al., 1990), overland flow and possibly the pressure exerted by the snowpack. All three are expected to lead to a reduction in surface roughness, although (Eitel et al., 2011) suggest the emergence of new roughness elements in zones of concentrated flow, mainly due to new incisions in the soil surface.
Surface roughness is a scale-dependent quantity (Vidal Vázquez et al., 2005). The scale at which the relevant changes in roughness occur cannot always be defined precisely. Govers et al. (2000) classified four scales for microtopography, where soil roughness was in the range of the individual soil aggregate. However, the order of magnitude would vary between several millimetres and several centimetres. Choosing a spatial resolution is required by all methods for the calculation of soil roughness. The sensitivity of the final result to this choice, however, varies with the method. A much-used roughness parameter -RR (Doren and van Linden, 1986) -is defined as a comparison of the elevation value in a central cell with the values in the eight neighbouring cells. This method is sensitive to the resolution at which the calculations are done and can be expected to correlate strongly with the local slope. Local slope gradients in a high-resolution DEM of a ploughed agricultural soil can be well over 100%. The implication is that roughness elements smaller than the slope multiplied by the grid cell length will not be recognized. The roughness calculations in this study were therefore based on the vertical distance of the individual points of the filtered point cloud to a fitted reference plane, rather than average values within the limits of a raster cell.
In order to minimize the scale dependency of the calculation, three methods to define a reference plane were tested for their scale dependency. All three use the point cloud and are based on the standard deviation of residual topography (Brubaker et al., 2013), where the average distance from the points to a reference level is a measure for roughness. The algorithms were programmed in C++.
First, a maximum scale is chosen and used as the cell size for the calculations. Any roughness feature larger than this scale will be ignored, or at least underrepresented. The average distance of the individual points within a grid cell to a reference plane is defined to be a measure for the local soil roughness. In the first method, the reference plane is defined as a horizontal plane with the same elevation as the geometric centroid of the points contained by the grid cell (Figure 4a), and RR is given by where RR is random roughness, z i is the elevation of point i (m), n is the number of observations and N z is the elevation of the centroid (m).
In the second method, the reference plane is defined by the vectors between the centroid C and two points in the point cloud (Figure 4b). These two points, A and B, are identified by first searching for the point with the largest horizontal distance to the centroid. After that, the point with the largest distance to the centroid in the adjacent quadrant (in the x or y direction) is identified (Figure 3). Point pairs were selected from adjacent quadrants rather than diagonals in order to minimize the likelihood that they were situated on the same line (i.e. in opposite directions from the centroid). The vectors E CA and E CB are then used to calculate the point-normal equation of the reference plane in the form of where a, b, c and D are constant for the plane thus defined, on which point (x i , y i , z i ) is situated (Figure 4b). The proposed measure for soil roughness is the orthogonal distance from point i in the point cloud to this projected plane, given by where O z z i is orthogonal distance between the plane and the point, and .c x c y c z / is the dot product of E CA and E CB. The third method is similar to the second, but selects four pairs of points in each grid cell quadrant, and subsequently defines four separate reference planes (Figure 4c). Equations 5 and (4) are then applied to all of the four planes, depending on the position of point i within the grid cell. This method is expected to follow the actual point cloud more closely than the method with a single equation of a plane, and therefore to be less scale dependent.
The three methods were applied at grid cell lengths from 0.01 to 0.10 m (0.01 m increments) and from 0.12 to 0.20 m (0.02 increments) in order to quantify their sensitivity to scale and dependence of local slope.
The magnitudes of the effect of slaking, overland flow and pressure are not expected to be spatially homogeneous. Slaking is expected to be more pronounced on steeper slopes, where the centre of gravity of the aggregate or clod has a large lateral offset from the overall slope plane. Increased exposure to the sun increases the amplitude of diurnal frost-thaw cycles, and is expected to accelerate the slaking process.
Runoff processes occur everywhere, but higher shear stress values are expected to result in smoother surfaces. Only in situations where small, i.e. sub-grid-cell size, incisions occur would roughness increase. The weight of the snowpack is expected to result in higher shear stress values on aggregates protruding from steeper slope segments. The changes in roughness over time (RR) were compared to local slope values, an index that describes the relative position in the plough ridge-furrow system, and an aspect indicator. Local slope values S (m m 1 )were calculated as follows: where the slope in the x-direction S x is given by where z i the elevation in neighbouring cell i, N z is the average elevation in a 3 3 cell window and dx i is the horizontal distance from the centre of the window to the centre of the ith cell. S y is calculated in a similar way. The position in the ridge-furrow system is a comparative predictor for the amount of runoff that can be expected to pass a grid cell.
The DEM was used to calculate the dimensionless ridge-furrow index, I RF , a topological indicator that compares the elevation values within a moving window to the value of the central cell. In this study, the index is used to describe a location's position in the ridge-furrow system typical for mouldboard plouging. I RF is calculated as I RF D n higher =n lower (8) where n higher is the number of cells with higher elevation values and n lower is the number of lower values within the window. The window size should be chosen so that it covers the full ridge-and-furrow system only once. In this study, the size of the moving window was set at 0.30 m. In this way, grid cells on the ridges obtain values close to one, while those at the bottom of the furrows obtain values close to zero. The I RF algorithm was implemented in a C++ program. Since slaking is a frost/thaw-driven process (Kvaernø and Øygarden, 2006), a grid cell's orientation towards the sun will determine the daily temperature amplitude and maximum value. The ridge-furrow index is expanded to include a categorical aspect index. The ridge-furrow index (I RF ) differentiates between north-and south-facing slopes in combination with the position either on the plough ridge or in the furrow, to correct for exposure to sunlight. It is determined by classifying the terrain model into four classes: north or south facing and on the ridge or in the furrow, so that Locations with I RF values below 0.5 were classified as furrows, and higher than 0.5 as ridges.
The subsets were analysed for normality and variance. Two-tailed analyses of variance and paired t-tests for samples with equal variance were carried out using the R statistical package (R Development Core Team, 2008) to assess whether the RR was different for the four I RF classes.

Frost heave
What is known about frost heave is that it is unlikely to occur in small restricted areas. Although no literature was found on the minimum extent at which the process is observable, it can be assumed that heave patterns on a mouldboard ploughed surface occur at scales similar to or larger than the ridge-furrow topography in our experimental plot. The most appropriate procedures to differentiate smaller-from larger-scale processes in terrain analysis is DEM detrending or -the opposite -smoothing (Vieux, 1993). The lower limit of an appropriate window size is given by which topographical features are to be ignored. The ridge-furrow system as a result of tillage is the largest feature to be smoothed out. The window size therefore should be 0.50 m or larger. The presence of ridges and furrows creates a boundary effect when a moving window is applied. Preliminary tests with artificial data showed that the magnitude of this boundary effect is of the order of centimetres. This is larger than the expected changes in surface elevation. A field margin with a width of half the window size was therefore ignored in the final calculations. In order to rule out effects of scale, detrending was carried out with different moving window sizes (0.5-2.5 m, with 0.5 m increments).
To obtain the magnitude of frost heave, the detrended DEM of October 2010 was subtracted from the detrended DEM of April 2011. To rule out the possible effects of sedimentation, only the plough ridges (I RFA D 1 and I RFA D 2) were used in the calculation. Peppin and Style (2013) showed that all explanatory and predictive models of frost heave pressure have water content as the principal variable; either directly or through a water pressure term. Several studies (McNamara et al., 2005;Grayson et al., 1997) report a soil moisture gradient at the end of a wet season that is positive in the direction of the slope. Soil water content in the experimental plot was not measured. It can, however, be assumed that, under the wet conditions of autumn 2010, soil water content would show a positive trend in the downward direction.
Soil detachment and deposition Quantification of the erosion and deposition process at the spatial resolution used in this study by means of simulation would require dynamic hydraulic modelling. In order to obtain an accurate sediment budget for a simulation at plot scale, such a model would require meticulous calibration against measured data (e.g. Fiedler and Ramirez, 2000;Esteves et al., 2000). At plot scale, the spatial variability of the parameters that drive soil surface deformation can be expected to be considerable. Instead of attempting a comprehensive simulation of the runoff and erosion processes, two terrain indices -a convergence and a sedimentation index -are proposed to link the observed change in soil surface elevation with overland flow. Each of the indices is based on a characteristic assumption. The first index is based on the assumption that sediment is more likely to be deposited in cells where overland flow is diverged. Sediment transport capacity of overland flow is an exponential function of flow depth and velocity (Singh, 1997), so when runoff is distributed over a larger plan, the amount of particles it can carry is reduced. Positive values of the convergence index, CI (unitless), indicate converging flowlines, whereas negative values indicate diverging lines. Therefore, a negative correlation is expected between CI and surface elevation change; for example, the more flow diverges, the more likely is sedimentation to occur. CI was calculated with the Convergence Index module of SAGA GIS (Conrad, 2001). The second index looks at the vertical profile along the stream lines. Deposition is more likely to be predominant on concave sections with a small slope and erosion is more likely on convexities with larger slopes. A sedimentation index, SI (degrees per metre), was defined based on the assumption that sedimentation is more likely to occur at less steep slopes and in slope segments with decreasing slopes. SI was therefore defined as the product of S (here in degrees) and slope curvature, S 0 (degrees per metre). SI D S 0 S (10) Figure 5 shows the general principle of the sedimentation index on a unitless slope. Besides CI and SI, the correlations between both slope and curvature and surface elevation change were also determined. Since a large part of the surface area consists of sinks, the calculations were carried out for sink and non-sink cells in the DEM. The terrain indices were used The net change in surface elevation, n z, was calculated as the measured difference in surface elevation minus the change in elevation that resulted from frost heave. Correlations between observed net elevation change and the terrain indices were expressed as the Pearson correlation coefficient, r, and by means of fitting linear regression functions.
Finally, the DEMs were analysed for concentrated forms of erosion. Overland flow is likely to result in incisions in the soil surface that will shorten the average length of flow paths. These changes were detected by comparing average flow length values and local changes in catchment area. Since the erosive force of runoff is a function of flow velocity and discharge, these incisions are expected to be more pronounced towards the lower end of the runoff plot.

Point cloud co-registration and filtering
The number of points that were discarded during the filtering processes amounted to 34.1% for the October scan and 34.2% for the April scan. The subsequent decrease in residual error of the final DEMs is given in Table II. Equation (1) was used to map the predicted residual error EOE as a function of d P2P and 2 z for l grid D 0.02 m, with the corresponding fitted parameters a and b, so that EOE D d P2P 9.20 10 3 . 2 z / 1.169 The Pearson correlation coefficient r of EOE with the terrain indices SI and RF, roughness RR and point cloud density (number of observations per grid cell, n) was calculated in order to test for spatial bias in the residual error. While there was some degree of correlation with RR ( 0.17), the r-values for SI ( 0.05), RF and n (both 0.00) were much lower.

Frost heave
A profile was extracted by calculating the average surface level change (m) and elevation for each row of grid cells in the 0.02 m DEM. Figure 6 shows the average surface-level change after the winter period as a function of elevation. The profile is characterized by a clear negative correlation between elevation and frost heave rate. At the bottom of the slope, however, corresponding to the lower 2.0 m of the runoff plot, the trend is disturbed.

Surface roughness
Compared sensitivity of roughness calculations Comparison of the methods to define the reference plane for the determination of RR indicates that fitted planes decrease Table II. Residual errors ( ) before and after filtering Scan before filtering after filtering October 6.2 10 4 m 6 . 0 10 4 m April 9.0 10 4 m 3 . 7 10 4 m Figure 6. Frost heave (m) as a function of elevation and distance along the profile (two moving window sizes) the overall average RR in comparison to using the average elevation of the grid cell. Compared to a single plane per grid cell, average RR is smaller when four planes are defined. The increase in RR between a cell size of 0.01 and 0.20 m for the horizontal plane method is 14.0 10 3 m, 13.0 10 3 m for the single plane and 9.5 10 3 m for the four-plane method (see Figure 7a). Comparison of the methods for the effect of local slope on RR shows that the dependence of RR on local slope is largest when the grid cell average elevation is used as the reference plane. Figure 7b shows the Pearson correlation coefficient (r) as a function of the spatial resolution. The difference between the horizontal reference and four-plane methods is smaller, but RR by the four-plane method is less dependent on local slope than when one reference plane is used as the reference elevation.
Further roughness calculations were based on the four-plane method, at a grid cell length of 0.02 m. At this resolution, the correlation between local slope and RR is still considerable, but this cell size was chosen because it allows for the identification of changes in the soil surface that are small in comparison to the soil aggregate. At this resolution there is also a sufficient number of point measurements available per grid cell (n D 36 on average, after filtering).

Effect of topography on roughness decrease
The overall mean RR decreased from 1.30 10 3 m to 1.19 10 3 m during the winter and snowmelt event (p < 0.001). This general smoothing was more pronounced in the non-sink areas (RR D 1.9 10 4 as opposed to 1.3 10 4 m within the sinks) when absolute values are regarded. However, RR in the sink areas was less than in the non-sink areas in the pre-winter situation (by a mere 8.0 10 5 m). The relative decrease in the sink areas is somewhat higher: 33% in the sinks, as opposed to 21% outside. The decrease in RR was more pronounced in concavities than in convexities: 3.0 10 4 m and 2.4 10 4 m, respectively. Similarly, areas facing north showed a larger decrease ( 4.0 10 4 m) than south-facing slopes ( 1.0 10 4 m). Table III gives an overview of the average decrease of RR in each of the I RF classes. The ANOVA significance test on I RF against RR showed that all four aspect categories have different magnitudes of RR decrease (p <0.001). A Tukey HSD (honest significant differences) test further revealed that all four I RF categories have different average RR values (p < 0.001). The frequency distributions of both SI and CI indicate a normal distribution. Slope and curvature show positive skewness. The graphs for curvature, SI and CI also indicate that exclusion of the sinks in this part of the data analysis does not mean that all deposition is neglected: a considerable volume of soil is distributed in non-sink areas: deposition occurs on 46% of the surface area not classified as sinks. The apparent trends of the classified data cannot be confirmed by linear regression. The coefficients of determination, r 2 , for surface elevation change as a function of slope, curvature and the convergence index  were 0.09. For the sedimentation indices, r 2 was 0.08. The Pearson correlation coefficient for slope was 0.04. Slightly better correlations were established for the convergence index (r D 0.20), curvature (r D 0.21) and sedimentation index (r D 0.24). All of these weak correlations are significant, with p 0.001. Figure 9b gives an overview of where soil erosion and deposition occur within the runoff plot. The observed net surface elevation change was calculated for sink and non-sink areas. The average n z for the sink areas (6.30 m 2 ) was 2.78 10 3 m. For the non-sink areas (56.49 m 2 ) this was 0.67 10 3 m. Note that the total area (62.79 m 2 ) is less than the total area of the runoff plot, due to boundary effects of the moving window for the calculation of frost heave. At plot scale, the volume of detached soil is 3.79 10 2 m 3 , while 1.75 10 2 m 3 is deposited. Assuming a bulk density of 1.52 10 3 m 3 (Skøien et al., 2012) this would correspond to a net erosion rate of 4.83 t ha 1 .

Erosion and deposition
To illustrate the significance of soil loss and local redistribution for the hydrological behaviour of this soil surface, the pre-winter contributing areas for individual cells were compared to those after snowmelt. The average difference of all the runoff plot's grid cells is 0.066 m 2 . Figure 9a shows this difference in contributing area for the runoff plot. Similarly, the average length of the flow path from the individual cells to the runoff collector at the bottom or any other sink decreased from 0.301 m to 0.286 m during the melting event.

Discussion
The magnitude of frost heave was expected to be larger in the lower parts of the plot, where soil water content can be assumed to be higher. The general trend in Figure 6 confirms this expectation, with the exception of the lower two or so metres. A possible explanation for this would be the collapse of the plough ridges as a result of saturation after thawing and runoff. Visual inspection of the ridges did indeed reveal signs of severe structural deterioration in the lower part of the plot. The effect of soil moisture content on aggregate stability under Figure 10. Soil roughness decrease as a function of (a) slope and (b) ridge-furrow position. Bar graphs indicate relative frequency freeze-thaw cycles has been documented by (among others) Staricka and Benoit (1991), Oztas and Fayetorbay (2003), and Kvaernø and Øygarden (2006). The magnitude of this change could be larger than the effect of frost heave, but no metrics were derived from the data to confirm this.
At the upper plot boundary, frost heave is more pronounced than would be expected. The area immediately above the plot is under a permanent grass cover. The observation therefore could be explained by higher soil moisture conditions at the top of the plot compared to areas closer to the centre. Since no soil moisture content measurements were undertaken during the experiment, this explanation could not be confirmed by field data.
Data analysis confirmed the assumption that the four-plane method was less sensitive to the choice of cell size (Figure 7). Without exact prior knowledge concerning the size of the features that are affected by the processes studied, relative scale independence is an important quality of any metric used. In most cases, however, the choice of spatial resolution is still more significant for the magnitude of RR than the choice for any of the three methods applied in this research.
Both the single-and four-plane methods show a gradual decrease with increasing resolution. This is likely due to the fact that slope values decrease with increasing resolution, whereas RR is more likely to increase due to the presence of more roughness elements, such as cavities and protruding aggregates. The horizontal plane shows an initial increase and this could be the result of the compound effect of scale and the poorer fit of the horizontal plane to the actual point cloud.
This reduction in the correlation between local slope gradient and RR resembles the statistics presented by Chu et al. (2012). They apply a less sscale-dependent RR algorithm, developed by Hansen and Sibbesen (1999), but reduce slope dependency on tilled soils by DEM preprocessing.
The overall reduction in RR during snow cover and melt is in accordance with all studies that quantify this change (e.g. Römkens and Wang, 1987;Haubrock et al., 2009;Wang et al., 2018). The decrease in RR in this study is an order of magnitude smaller than studies characterized by intense runoff, such as . Some studies (Haubrock et al., 2009;He et al., 2018) report local increases in RR as a result of rill formation -locally or towards the lower boundary of a runoff plot. These local increases could not be quantified in this study. This can either be because of the limited extent of rill formation, or its more homogeneous distribution over the runoff plot (Figure 9a). North-facing areas show a larger reduction in RR than those facing south. This is contrary to the expectation that the larger daily temperature amplitudes on south-facing slopes would result in a more marked reduction in soil roughness. This could be explained by a difference in water content. Before the onset of winter, north-facing areas can be expected to evaporate less of their moisture due to the smaller amount of incoming solar energy.
The difference between the average reduction in RR between the concave and convex is small (0.30 and 0.24 10 3 m, respectively) but statistically significant. This could be explained by the fact that concave areas where material is deposited can be expected to smooth during an erosive event. The correlation between curvature and net elevation change (Figure 1b) confirms this presumption: concave areas are more likely to accumulate sediment.
Despite the apparent trends in Figure 10, neither the linear regression nor the (Pearson) correlation coefficients were significant for both slope and I RF . A possible explanation for this apparent independence could be that I RF does not perform well as a hydrological terrain index. Overland flow can be expected to concentrate in the furrows, but its ability to reshape the soil surface will also depend on the force it exerts. This force will depend on the catchment area of each particular point in the furrow rather than on the ridge-furrow index.
The derived net soil loss rate for this spring period is at a realistic level. Although only representative for the spring season, this value compares favourably with the average measured value at Bjørnebekk for the autumn ploughed plots between 1994 and 2004, which is approximately 6.00 t ha 1 yr 1 (Skøien et al., 2012).
The effect of soil redistribution during snow cover and melt on the plot's hydrological characteristics is in accordance with the changes reported for rainfall events. Darboux and Huang (2001) quantified the reduction in surface storage capacity as the result of soil deposition in sinks during consecutive rainfall events. Average deposition in the sinks in this study was indeed positive (2.78 10 m 3 ). Average flow path length in the study plot decreased, while the average contributing area increased. This observation is in accordance with the conclusion by,for example, Peñuela et al. (2016) and Wang et al. (2018) that hydrological connectivity increases during a sequence of runoff events.

Conclusion
The first objective of this research was to correlate small changes in soil microtopography during a winter/spring interval by means of TLS technology. Several observations did indicate that the error associated with data acquisition and pre-processing was small in comparison to the actual changes that occurred. Most notably, the residual error is distributed over the runoff plot independent of the observed changes. The decreasing roughness, and its spatial distribution in the runoff plot, is a clear quantification of a general smoothing of the soil surface that can be expected during a runoff event. In a similar fashion, the quantification of net accumulation of soil material in the depressions in the terrain, compared to the net soil loss from the areas in between, can be regarded as a confirmation that TLS is able to change in the same order of magnitude of the residual error. After correcting for frost heave, the calculated changes to the soil surface at the scale of the runoff plot yield results that are both consistent with our understanding of how overland flow, the pressure of the snowpack and freeze-thaw cycles affect soil surface morphology. The derived erosion and deposition rates are realistic when compared to long-term sediment measurement series.
With regard to soil surface roughness, TLS technology can be applied to quantify the kind of minute changes that occur during a single event. The difference in surface roughness decrease in the plough furrows and on the ridges was small but significant. This outcome corresponds to the expectation that roughness decrease is more pronounced in areas where flow processes are the main driver for a change in surface roughness.
Since no ground truth was available against which to evaluate the observed surface elevation changes, secondary terrain indicators had to be developed and applied. Each of these terrain indicators is connected to a certain understanding of the different processes leading to changes in microtopography. Most of the indicators displayed some degree of correlation with the observed dynamics in microtopography when presented as class average values. The large variance of the data within the classes complicated the objective confirmation of these trends. Additional research might investigate the appropriateness of non-continuous terrain indices for better linking surface dynamics to processes. This research has shown that categorical distinctions, such as I RF and sink/non-sink, provided unambiguous confirmation that soil surface changes can be observed when linked to their respective locations in the runoff plot.
The categorical indices used in this study, I RF , but also the distinction between sink and non-sink areas, provided much less disputable correlations than continuous indices such as slope, curvature and the sedimentation index.
A question that was not addressed in this research is how the decrease in surface roughness is correlated with surface elevation change. On average, their orders of magnitude are roughly the same. In erosive zones, flow processes may result in both elevation and roughness decrease, while deposition zones can be expected to display elevation rise and roughness decrease. Further quantification of this correlation could provide useful information in situations where measurements have lower spatial resolutions. Understanding the link between roughness decrease and surface elevation change then could complement understanding about the process of soil redistribution at the very local scale. Of the roughness metrics tested in this study, the four-plane method proved to be least sensitive to both spatial resolution and local slope. As such, it is a useful method to calculate RR in situations when the spatial scale of the expected changes is not clearly defined. It also provides an alternative to DEM pre-processing (detrending) in order to minimize the effect of local slope. This also is a matter of scale, especially on agricultural surfaces where local slopes are a composite of general topography (hillslope) and microtopography (tillage ridges).