Grain‐Size Distribution and Propagation Effects on Seismic Signals Generated by Bedload Transport

Bedload transport is a key process in fluvial morphodynamics, but difficult to measure. The advent of seismic monitoring techniques has provided an alternative to in‐stream monitoring, which is often costly and cannot be utilized during large floods. Seismic monitoring is a method requiring several steps to convert seismic data into bedload flux data. State‐of‐the‐art conversion approaches exploit physical models predicting the seismic signal generated by bedload transport. However, due to a lack of well‐constrained validation data, the accuracy of the resulting inversions is unknown. We use field experiments to constrain a seismic bedload model and compare the results to high‐quality independent bedload measurements. Constraining the Green's function (i.e., seismic ground properties) with an active seismic survey resulted in an average absolute difference between modeled and empirically measured seismic bedload power of 11 dB in the relevant frequency band. Using generically estimated Green's function parameters resulted in a difference of 20 dB, thus highlighting the importance of using actual field parameters. Water turbulence and grain hiding are unlikely to be the cause of differences between field observations and our analysis. Rather, they may be either due to the inverted model being particularly sensitive to the coarse tail of the grain‐size distribution, which is least well constrained from field observations, or due to the seismic model underestimating effects of the largest grains.

The advent of seismic monitoring techniques, a stream-side monitoring method (Burtin et al., 2016), may allow monitoring bedload flux under conditions that have been previously inaccessible. In this method, the target process is monitored via the elastic waves generated by sediment particle impacts on the river bed. Because the seismic sensors are deployed on river banks, they are neither affected by nor do they affect the flow (Burtin et al., 2008;Govi et al., 1993). The method is noninvasive, provides high temporal resolution, is relatively low-cost, and can be deployed on nearly any river bank. On the contrary, the seismic approach is very sensitive to human activities in the frequency band of interest, which is an important drawback especially in urban or rural areas. Seismic river monitoring has been used at a variety of scales and river systems, from small torrents (e.g., Bakker et al., 2020;Barrière et al., 2015;Burtin et al., 2008;Roth et al., 2016) to low gradient gravel and sand bed rivers (Dietze et al., 2019;Polvi et al., 2020;Schmandt et al., 2017) and large streams (Burtin et al., 2011;Schmandt et al., 2013).
Several processing steps are required to convert an indirect time series of seismic ground motion into an estimate of bedload flux. One frequently used approach relies on the assumption that sediment transport is the dominant process causing seismic waves during a flood, or that it can be clearly isolated from other seismic sources, in particular the turbulent flow of water. Indeed, previous studies have suggested that, if the seismic station is located at an appropriate distance from the river, the signal generated by turbulent flow occurs at lower frequencies than the one generated by bedload (Gimbert et al., 2014;Schmandt et al., 2013). The appropriate distance depends on many parameters (mainly bed material grain size distribution, river gradient, phase wave velocity, and ground quality factor, see Table 1), but is dictated by the ability to record both signatures with sufficient power above background, but also with the least overlap in the frequency domain (Dietze et al., 2019). Under such circumstances, the seismic energy within a chosen frequency band is expected to be proportional to bedload flux (Gimbert et al., 2016;Roth et al., 2014;Tsai et al., 2012). Thus, bedload flux may be related to seismic power by fitting linear mixing models (Roth et al., 2014), regressing in-stream sampling derived bedload flux values (Bakker et al., 2020) Figure 2a) are denoted by * . Quantities that are not constant through frequency (see Figure 5 for entire range) are denoted by * * . GF denotes parameters relevant for estimating the Green's function, GSD denotes parameters relevant for estimating the grain size distribution. -denotes others parameters. † The active seismic survey implies that Q is independent of frequency. Thus, η equals 0 in equation Q Q f f  0 0 ( / )  from Tsai et al. (2012).  Tsai et al.(2012) flux (Cook et al., 2018), by scaling analysis (Bakker et al., 2020;Burtin et al., 2011), by dedicated physical model inversion (Bakker et al., 2020), or by inversion of combined physical models (Dietze et al., 2019;Polvi et al., 2020). Tsai et al. (2012) has developed a model to predict the seismic spectrum generated by the impact of bedload particles moving along the channel bed. The model requires knowledge of stream and sediment characteristics to constrain the source terms, for example, the channel geometry and grain size distribution, and ground properties affecting the wave propagation, that is, frequency-dependent wave velocity or attenuation characteristics. To date, this model has been tested in a flume experiment (Gimbert et al., 2019) and dedicated field studies (Bakker et al., 2020;Dietze et al., 2019), but with limited attention given to the effects of the grain size distribution, neither in terms of model representation nor impact of the tails on model results.
The aim of this study is to constrain the Tsai et al. (2012) seismic model with dedicated field experiments and to test it against independent bedload measurements. This approach is similar to that adopted in Bakker et al. (2020), although (i) we target a quite different river, the Nahal Eshtemoa in Israel, that has a different geometry (e.g., smaller slope and grain sizes) and experiences different flow conditions, since it is located in a semi-arid region in which water flow only occurs during episodic flash floods, (ii) we make use of a newly implemented code to run the model with discrete, measured grain-size classes, and (iii) we constrain ground seismic properties relevant for the description of wave propagation in the model, using array-based active seismics, as opposed to performing empirical calibration from simpler rock impacts. We exploit hydraulic and bedload flux time series, recorded by a state-of-the-art stream observatory, to invert the seismic data and reconstruct bedload flux, and then compare the results against high-quality independent observations.

Study Site and Monitoring Station
The Nahal (river) Eshtemoa is a gravel-bed river draining 119 km 2 of the Southern Hebron Mountains and the Northern Negev, northeast of Beer Sheva, Israel (Figure 1). The climate in the catchment is semi-arid, with a mean annual precipitation of 286 mm. Rainfall mainly occurs between October and May. The river is ephemeral with flash floods occurring on average five times per year (Alexandrov et al., 2009). The recurrence interval of the bankfull discharge of 26 m 3 s −1 has been estimated to be 1.25 years (Powell et al., 2012).  Bedload fluxes are high by worldwide standards with sediment transport as much as 400 times more efficient than in a typical perennial humid river (Laronne & Reid, 1993;Reid & Laronne, 1995).
The river is equipped with a monitoring station in a straight channel section with a trapezoidal cross section. The banks are nearly vertical, 1.2 m high, and comprise loess-derived aeolian fines and interbedded gravel. The mean channel slope is 0.0075 radians, which is generally mirrored by the water surface slope, with exception during the arrival of a flashflood bore (Meirovich et al., 1998). At the monitoring site, the surface layer of the channel bed is composed of pebbles and cobbles with a 5%-10% sand-granule matrix. The Eshtemoa bed (Powell et al., 2012) is characterized by bars featuring coarse sediment with a median diameter D 50 of 38 mm alternating with "flats" with a D 50 of 27 mm. Values of D 95 exceed 100 mm on the bar surface and are activated at contemporary nondimensional shear stresses above 4 (Powell et al., 2001). Particles with a diameter approaching 150 mm may be mobilized during supra-bankfull floods (Bergman et al., 2007).
The monitoring station features five Reid-type bedload slot samplers, pipe, and plate microphones (Laronne et al., 1992;Rickenmann et al., 2014) to monitor bedload flux, as well as vented pressure transmitters to monitor water depth. Water depth is also monitored at five longitudinal locations to determine water surface slope and its variation during arrival of flash flood bores (Meirovich et al., 1998). Of the five slot samplers, only the three central ones are presently operating, because the channel has narrowed due to deposition of silt on the banks and in the channel bed, thereby causing a considerable increase in the shear stress required for initiation of motion (Barzilai et al., 2013). The samplers are equipped with pressure-pillows, allowing for time-resolved bedload flux quantification. The slots are 11 cm wide. Thus, bedload particles with diameters close to these sizes or larger are not sampled (Bergman et al., 2007). The bedload samplers are emptied after each bedload transporting event to determine the temporal and cross sectional evolution of bedload grain size (Powell et al., 1999(Powell et al., , 2001. Due to the restricted capacity of the samplers (0.48 and 0.34 m 3 for the central and lateral devices, respectively), they usually fill before the cessation of bedload flux. Thus, to provide material flux information for an entire event, a calibrated pipe microphone and plate microphone are used (Gray et al., 2010;Mizuyama et al., 2010). Since 2016, there is also a broadband seismometer (Nanometrics Trillium Compact 120s) installed 5.75 m from the river centerline, which is sampled by a Nanometrics Centaur data logger at 200 Hz. The station is fully automatic; the seismometer records data continuously, and the other sensors are activated when a flood approaches.

Theoretical Background
The mechanistic model by Tsai et al. (2012) predicts the vertical power spectral density (PSD) due to Rayleigh waves caused by impacts of saltating particles on the river bed. Rayleigh waves are the only surface waves that have a vertical ground motion component, and body waves are assumed to play a negligible role. The model is based on classic wave source and propagation physics, in which ground motion is the result of the combined action of the forces applied and the medium's response. The coarsest particles (D > D 90 ) of the bedload grain size distribution (GSD) are expected to cause the most powerful impacts on the bed, generating the largest seismic signals. The ground properties are described using the Green's function. The model requires 13 input parameters (Table 1).
In the model, the final PSD corresponds to: Here, PSD* is a PSD per unit bedload flux (dimension Hz −1 s −1 ), taking into account all parameters (Section 2.2.2) apart from bedload flux (q bd ). If the model adequately describes the physical processes and if other signal sources are negligeable compared to bedload transport, PSD final equals the observed seismic data PSD seismic . Thus, it is possible to calculate the bedload flux q bd , as q bd = PSD seismic /PSD*.
The mechanical model by Gimbert et al. (2014) predicts the vertical PSD of Rayleigh waves generated by water flow interacting with roughness elements along the riverbed. We use this model to investigate the LAGARDE ET AL.
10.1029/2020WR028700 potential impact of water turbulence on the seismic signal. We use the same parameters for this model as used for the bedload model (Table 1), except for the bedload flux, which is not required, and the bedload GSD.
For the water flow model by Gimbert et al. (2014), we used the GSD measured on the riverbed instead of the bedload GSD (see Section 2.2.2.2).

Model Parameters
The bedload model requires constraints on 13 parameters (Table 1). These parameters can be grouped into three classes: the GSD parameters, the Green's function parameters obtained from the active seismic survey (GF in application column of Table 1, except r 0 ), and the river morphological parameters (parameters without reference (−) in application column of Table 1).
In the bedload seismic model, some of the required parameters are approximated using auxiliary functions. In order to make the distinction between the parameter values obtained empirically and the parameter values computed using the auxiliary functions, these latter are indicated by a subset aux. For example, N 11 is considered frequency-independent ( 11 aux N ), and the Green's function parameters v c and v u are described using empirical functions: Here, v c0 is the phase velocity of the Rayleigh wave at a reference frequency f 0 = 1 Hz, and ξ is an exponent.
Typically, the GSD of rivers is measured in a few, discrete classes instead of by continuous parametric functions (e.g., Parker, 2008). Thus, in order to allow application of the bedload model to such cases, we have implemented the support of empirical GSD to the model (Dietze 2018a, v.0.5.0). Thus, the input GSD can be defined by discrete measurement classes rather than being defined by the parametric log-raised cosine function. All calculations were done in the statistic programming language R v. 3.6 (R Core Team, 2020), using the package eseis v. 0.5.0 (Dietze, 2018b).

Directly Measured Parameters
The depth of the river H was obtained using the vented pressure transmitters, with a temporal resolution of 1 min. The channel width W is indirectly inferred using H and the cross section geometry. Field observations suggest that the section of the channel can be approximated by a trapezoid shape. The base and top length were measured with tape. Knowing H, current channel width W is then obtained with a temporal resolution of 1 min. As the channel does not have a perfect trapezoid shape, the uncertainty in W is set to 0.2 m, corresponding to 10% of the difference between the trapezium base length and the trapezium top length. We determined the distance between the seismometer and the channel center line r 0 by measuring tape (uncertainty 0.01 m), and the channel bed slope θ by a total station (uncertainty 0.0004 radians).
Bedload flux q bd was monitored by the bedload slot samplers and the plate microphone, at 1-min interval.
As the sampler boxes were progressively filled, and sampler data were available only for parts of the event, three different methods were used to compute q bd . During a first phase (Phase 1 in Figure 2a), q bd corresponds to an average of the fluxes of the three slot samplers, with the flux from the central sampler multiplied by 1.5 to correct for its 150% wider cross section. In a second phase (Phase 2 in Figure 2a), the right and central slot samplers were full, but the left slot sampler was still operating: its bedload flux is multiplied by 1.3 to represent the entire section. The product derives from the average bedload flux in the center relative LAGARDE ET AL.
10.1029/2020WR028700 5 of 20 to the side samplers. Finally, in the third phase (Phase 3 in Figure 2a), all of the samplers were full, and q bd was determined using the calibrated plate microphone located immediately upstream of the left sampler (R 2 = 0.76). The calibration was done using a mass aggregation method (Halfi et al., 2020), with the constant mass interval of 4 kg. This mass corresponds to the sensitivity of the sensor and allows to discriminate noise from the real bedload flux measurement. The difference in terms of values between each computation method are compared ( Figure 2a).

Grain Size distribution
After the flood, the sampler boxes were emptied, and their contents were sieved, giving nine different class-intervals. For each class-interval, the total mass was measured. However, for the largest class (D ≥ 76 mm), information may be incomplete, because we do not know the maximum transported grain size D max . For each class, the geometric mean of the limits was used as the representative size. For the largest measured class, in the interval between 51 to 76 mm, the representative size is equal to 62 mm.
Since the slot samplers cannot resolve particles larger than 110 mm, the maximum mobile grain size D max is unlikely to be caught by the samplers (Bergman et al., 2007). To estimate the impact of this observational limit, we extrapolate the measured GSD by fitting the log-raised cosine function to the empirical GSD using least-square regression. Model predictions were made with six different GSDs (Table 2, thin lines), truncated at three different values. The lowest cut-off value of 62 mm is representative of the largest class mid-point of the empirical data. The second cut-off value of 110 mm is the slot width of the sampler. The largest cutoff value of 250 mm corresponds to the largest clasts found on the channel bed and is used to explore the effect of particles that are too large to be directly sampled. Each of the three limits was used to cut either the fitted log-raised cosine function or the measured discrete GSD (Table 2). Model PSDs were calculated using these six different approaches (Table 2), keeping all other model parameters constant (Table 1). In order to compare the grain-size effect in a time-resolved manner throughout the flood event, we also inspected the differences between modeled and measured spectrograms for  Table 2 for the definitions of abbreviations), using q bd from the bedload samplers ( Figure 4).
In addition, we explored the sensitivity of the model to the log-raised cosine function prediction GSD funct on the largest grain sizes. Within the Tsai et al. (2012) model, the grain size approximately affects the PSD by power 3, that is, PSD ∼ D 3 . Thus, the largest moving grains have a strong impact on the predicted PSD. However, because the bedload samplers could capture grains with a maximum diameter of 110 mm, the largest grains may not have been sampled. Thus, we extrapolated the empirical GSD using the log-raised cosine function to describe the largest grain size fractions. To explore potential effects of the extrapolation on the PSD, we used a Monte Carlo approach. We randomly sampled the log-raised cosine function one hundred LAGARDE ET AL.  Table 2 Summary of the Different Abbreviations Used to Describe the GSD times, with D 50 ranging between 8 and 16 mm and σ g ranging between 1 and 1.7 (Figure 9, Section 4.5), generating a uniform distribution within the parameter range. For each test, the empirical GSD was extended between 62 and 110 mm, and the corresponding PSD was computed.
For the turbulence model (Gimbert et al., 2014), we used the GSD of the bed. In 2015, pebbles were sampled randomly on the bed and sieved, giving nine different class-intervals. We fitted the empirical bed GSD with the log-raised cosine function mentioned above using least-square regression.

Green's Function Parameters From Active Seismic Survey
To obtain the Green's function parameter values N 11 , Q, v c , and v u (Table 1), we determined local seismic ground properties by an active seismic survey in November 2017. A total of seven geophone transects were deployed on both sides of the river, both parallel and perpendicular to the channel. The total geophone transect length was between 20 and 100 m. For each line, 23 one-component and 6 three-component PE6/B 4.5 Hz geophones were placed at different distances, with Digos DataCube loggers sampling at 400 Hz with a gain of 32. Signals were produced by 10 sledge hammer blows onto a 40 × 50 × 5 cm iron plate for each transect configuration, with the plate placed at one extremity of the transect. Sensor locations were determined by measurement tape and total station surveys. Rayleigh wave dispersion curves vary broadly depending on the selected line and could induce an inversion bias. As we are interested in the attenuation of the waves traveling from the river to the seismometer, and the sensor integrates signals of a linearly extensive source over a certain length along the stream, the line parallel to the channel on the seismometer bank was selected for the subsequent detailed analyses (see section Appendix B for comparison with dispersion patterns from the other transects).
The obtained signals were processed with the "COMPUTER PROGRAMS IN SEISMOLOGY" software (Herrmann, 2013). Processing consisted of two stages. In stage one, the hammer blow signals were used to obtain the shear wave velocity v s as a function of depth, from the inversion of Rayleigh wave dispersion curves (Xia, 2014). This shear-wave velocity profile v s is further called "ground model". In stage two, the ground model was used to obtain N 11 , Q, v c , and v u as a function of frequency.
In stage one, hammer blow signals were stacked on the selected line. Then, the stacked signals were processed to extract Rayleigh wave dispersion curves, that is, relationships of phase velocity and frequency, using the p-omega stacking of McMechan and Yedlin (1981). To build the ground model, the maximum depth of the model was fixed at 15 m corresponding to 1/3 wavelength, a common assumption in classical wave physics (Shearer, 2009). A prior assumption for the shear wave velocity v s within the ground model was set to 0.53 km/s, corresponding to the mean of the Rayleigh wave velocity computed previously (Figure 5b). The ground model was obtained by using iterative linearized least square inversion code of Herrmann (2013) to fit the observed dispersion by a stepwise model of increasing shear wave velocity. Once the ground model is obtained, it can be used for stage two, that is, to compute and display the anelastic attenuation coefficient, the eigenfunctions and theoretical dispersion as a function of depth and frequency, thus obtaining the parameters of interest Q, N 11 , v c , and v u (Figures 5c and 5d).
In contrast to these empirically based parameter characteristics, the bedload model by Tsai et al. (2012) uses a simpler implementation (Section 2.2.2, Equations (2) and (3)). We therefore simplified the active seismic survey results to match the model inputs:  (Table 2) and keeping all other model parameters constant (Table 1). To compare the effect of Green's function formulations in a time-resolved manner during the flood event, we inspected the differences between modeled and measured spectrograms (Figure 6), using 110 mix GSD , q bd from the samplers, and keeping all other model parameters constant, except for flow depth and channel width that changed with time (Table 1).

Bedload Flux Reconstruction
The measured seismic data were deconvolved to remove the amplitude and phase of the seismic signal linked to sensor characteristics. The signal was also detrended and the mean was removed to avoid artifacts in the application of the Fourier transform. The spectrogram of the seismic data from the continuously recording T120s sensor (Figure 2b) was calculated with 20 s resolution using the Welch method (Welch, 1967) with 15 s sub-windows, each overlapping by 90%.
For the same minute-long time intervals as the hydraulic data sets, PSD seismic was obtained using the autoregressive option of the function with default values (eseis v. 0.5.0, Dietze, 2018b). The autoregressive method implies that the output is based on its own previous values, leading to a smoothing process. For each minute, PSD* (Equation (1)) was computed with GSD set as 110 mix GSD and keeping all other model parameters constant, except for flow depth and channel width that vary with time (Table 1) and using empirical Green's function parameters v c , v u , and N 11 . Each value of PSD seismic was divided by the corresponding value of PSD*. The average of this ratio in the 20-60 Hz band (river activity band, as has been determined in the previous study of Dietze et al., 2019) gives the reconstructed bedload flux q bd . The reconstructed bedload flux was visually compared with the bedload flux measured by the bedload samplers.

Characteristics of the Monitored Flood
We chose a flood event that occurred on February 22, 2016 because of its high bedload flux with peaks exceeding 1 kg/sm and water level between 0.5 and 0.8 m (Figure 2a, Dietze et al., 2019;Halfi et al., 2018). The beginning of the event is defined as the onset of bedload flux recorded by the samplers at 05.40 UTC, considered to be minute 0 in this study. The following 112 min of the event (05:40-07:32 UTC), during which bedload transport occurred continuously was analyzed although the recession continued during the following 13 h, in a typical logarithmic fashion. In addition to an increase in bedload flux as water depth initially increased, and a decrease when stage decreased toward the end of the monitored part of the event, bedload flux showed an undulating rising and falling pattern, with four peaks in the first 50 min (Phase 1), and two peaks in the following 40 min (Phase 2). The methods used to compute the bedload flux during phases 1 and 2 gave similar results, with an average difference of 47%. However, the method used in Phase 3 steadily recorded higher bedload fluxes in comparison to the measurements in the other two phases, by an average of 79% compared with method two during Phase 2. Thus, the bedload flux measured after minute 87 is likely overestimated by ca. 1 kg/sm. Seismic power in the 20-60 Hz band increases between minutes 0 and 60 (Figure 2b). A pattern similar to this can be observed at frequencies above 70 Hz. However, since there is no physical representation of such a repeated frequency pattern in any of the models we used, we exclude the >70 Hz frequency part from the subsequent analysis (Dietze et al., 2019). Variations in bedload flux coincide visually with the variation in seismic signal, considering that bedload flux recorded in Phase 3 is very likely overestimated. In the 20-60 Hz band, the signal is mainly above −120 dB during Phases 1 and 2. However, there are also periods when both times series deviate. For example, there is a decrease in the seismic signal between minutes 12 and 20, whereas the bedload flux is roughly constant during this period. Machines operating at the observatory generate narrow seismic bands, predominantly at 22 and 44 Hz. The two broadband seismic spikes (vertical lines in the PSD of Figure 2b) at minutes 5 and 9 were caused by people working at the observatory to collect data and conduct station maintenance duties.

Impact of Grain Size Characteristics
The best fit of the log-raised cosine function to the bed GSD, while keeping a physical meaning, is obtained with 50 bed D = 74 mm and  g bed = 2. The associated coefficient of determination (R 2 ) is 0.88.
The best and physically meaningful fit of the log-raised cosine function to the bedload GSD from the samplers is obtained with D 50 = 12 mm and σ g = 1.35. The associated root mean square deviation is 23.7 m −1 . With these parameters, the function deviates from the empirical data in several cases (Figure 3). At 0.63 mm, the GSD sampler is more than twice as high as the GSD funct . GSD funct misses the double-peaked shapes of GSDsampler . The coarse tails are also different: GSD sampler ends at 62 mm, whereas if not truncated, GSD funct ends above 250 mm.
The comparison of PSDs generated by is extended with the log-raised cosine function, the spectral difference between the non-extended ( ) and the extended GSD (GSD mix ) rises, to 9 dB for D max = 110 mm and 18 dB for D max = 250 mm.
When the spectrogram is modeled using 62 sampler GSD , the absolute difference to the measured spectrogram is mostly above 20 dB in the 20-60 Hz band (Figure 4b), whereas when 110 mix GSD (which corresponds to the largest grain sizes expected to travel in the channel, based on analyses of grain size distribution of bedload in prior events Bergman et al., 2007) is used (Figure 4c), the difference between both spectrograms is almost always below 20 dB on the 20-60 Hz band, with an absolute average difference of 11 dB. Between minutes 60 and 80, there are three areas where the difference (vertical blue bar) is marked on the whole frequency domain. These areas correspond to the time where q bd dropped to almost 0 (Figures 2a and 2b). After minute 80, for all the GSD formulations, the difference between the spectrograms extracted from the data and reconstructed from the bedload model tends to become positive. This period corresponds to Phase 3, during which the q bd value is likely overestimated.

Green's Function Parameters
Evaluation of the active seismic survey data from the transect along the right bank of the river shows that the Rayleigh wave velocity (Figure 5a (Figure 5c). u aux v , has a similar shape to c aux v , and is on average 120 m/s below it for frequencies above 10 Hz (Figure 5c). The parameter N 11 is roughly constant for frequencies above 50 Hz, and varies for lower frequencies (Figure 5d). The mean value of N 11 , set equal to the constant model parameter 11 aux N , is 0.6.
The PSD computed using empirical Green's function parameters (Figure 6a) shows a two-step-like shape, with a first plateau at 10 Hz and a second starting at 22 Hz, whereas the PSD computed with the model functions monotonically rises (Figure 6a). Within the 20-60 Hz band, there are on average 10 dB of difference between both approaches. When using the Green's function parameterization as implemented in the bedload model, the absolute average difference between the spectrograms is of 20 dB between 20 and 60 Hz (Figure 6c), whereas when the Green's function parameters are based on the active seismic survey, the difference between the spectrograms is almost always below 20 dB on the 20-60 Hz band, with an absolute average difference of 11 dB (Figure 6d). Between minute 60 and 80, there are three areas where the difference (vertical blue bar) is marked on the whole frequency domain. These areas correspond to the time where q bd dropped to almost zero (Figures 2a and 2b).
10.1029/2020WR028700 10 of 20 ). The red line shows the interpolation of the empirical, sieved GSD. The blue line shows the best fit version of the log-raised cosine function (Tsai et al., 2012). The inset shows details of the fitted function for the coarse tail of the distribution, note that y-axis is in log scale. Dashed vertical lines depict extrapolated limits of the fitted distribution at 62, 110, and 250 mm. (b) Cumulative sieved GSD extended to 110 mm using the log-raised cosine function.

Reconstruction of Bedload Flux
The RMSE between the modeled and measured bedload flux, ignoring the maintenance caused spikes at minute 5 and 9, is 47.5 kg/sm for the entire event, and more specifically 67.4 kg/sm for Phase 1, 23.3 kg/sm for Phase 2, and 4.3 kg/sm for Phase 3.
The predicted bedload flux range overestimates measured bedload flux (Figure 7) by about two orders of magnitude during Phases 1 and 2. For Phase 3, the discrepancy reduces to one order. However, in Phase 3, there is a systematic error in the bedload measurement, which leads to an overestimation of the values by ∼1 kg/sm. The wavy temporal variation of the bedload flux demonstrates pulses of bedload (e.g., Dhont and Ancey, 2018); it is not visible in the reconstructed seismic data.

Discussion
Previous studies on along-stream seismic bedload quantification have revealed different limitations, each related to the specific method. Roth et al. (2014) discussed issues of nonlinear effects and the drawback of overlapping signals due to discharge and bedload movement. Bakker et al. (2020) found inversion uncertainties of less than an order of magnitude when using an updated version of the Tsai et al. (2012) model, which accounts for nonideal particle impact mechanics quantified experimentally in Gimbert et al. (2019). Dietze et al., (2019) found much smaller model deviations for a combined bedload and water turbulence inversion approach, but this was due to not using empirically constrained model parameters and considering those as free parameters, as opposed to in Bakker et al. (2020) and presently, where these are constrained independently. Since Dietze et al. (2019) tested their approach at the same site and on the same flood as in the present paper, it provides an ideal case to compare with the fully constrained case. The Monte Carlo-based inversion routine Dietze et al. (2019) used for the joint inversion of water turbulence and bedload movement found the most likely parameter combinations to match the measured seismic spectra, but these combinations were not based on observations. Specifically, Dietze et al. (2019) conducted sensitivity tests to constrain the Green's function parameters rather than using empirical values from an active seismic survey. This approach yielded results that were in much better agreement with the independent measurements for bedload flux, but did not describe the ground properties well when compared to the results from the active seismic survey ( Figure 5; Table 1 from Dietze et al., 2019). Consequently, the model parameters resulting from the active seismic survey analysis differ from those of Dietze et al. (2019) (Figure 5). Moreover, while in both studies, GSD parameters are the same (D 50 = 10 mm, σ g = 1.35), Dietze et al., (2019) used the whole range of the log-raised cosine function, whereas here we truncated it at a maximum value of 250 mm ( Figure 3). As shown in Figure 4, D max impacts the resulting PSD. Due to these differences in approach, there are two important outcomes from the results of our study. First, the temporal evolution of measured bedload flux was not always reflected in the variation of seismic power (Figure 2), and, second, there is a discrepancy of one to two orders of magnitude between the reconstructed and the measured bedload flux (Figure 7). These effects can be due to a variety of reasons. Below, we discuss the potential effects within five broad areas: (i) potential near-field effects on Rayleigh wave propagation, (ii) seismic signal due to water turbulence, (iii) effects specific to seismic monitoring of bedload, (iv) the sensitivity to the parametrization of the Green's function, and (v) sensitivity to the GSD of moving bedload.  (Table 2). (b)-(d) Differences between empirical and modeled PSDs using different GSDs, using q bd from the bedload samplers and empirical Green's function parameters. Areas outside the river activity band are in gray.

Near-Field Effects
A "near-field" situation occurs when r 0 k < 1, with k = 2πf/v c the wave number of the Rayleigh wave (Gimbert et al., 2014). In our study we find r 0 k < 1 when f < 18.4 Hz. As the river activity band is within 20-60 Hz (Section 2.2.3), we did not take in account the Rayleigh wave attenuation due to near-field geometrical spreading.

Water Turbulence
Seismic power could be primarily caused by turbulent flow instead of by bedload. To test that, we used a model that predicts seismic power due to turbulent river flow (Gimbert et al., 2014). We use the same parameters for this model as used for the bedload model (Table 1), but replaced the bedload GSD by the riverbed GSD. During the flood, this model predicted the highest spectral power in the 20-30 Hz band (Figure 8), whereas the highest spectral power of the measured seismic data were in the 30-50 Hz band (Figure 2b). Moreover, seismic power predictions using the turbulence model were on average 35 dB lower than the measured seismic power in the 20-60 Hz band (Figures 2b and 8). In contrast, seismic power predictions using the bedload model showed an absolute average difference of 11 dB compared to the measured seismic LAGARDE ET AL.
10.1029/2020WR028700 12 of 20 power in the 20-60 Hz band (using 110 mix GSD and empirical Green's function parameters v c , v u , and N 11 ). Thus, according to this turbulence model, water flow does not appear to be a first order factor for the generation of seismic signal in the studied flood event. The turbulence model (Gimbert et al., 2014) predicts the vertical PSD of Rayleigh waves generated by water flow interacting with roughness elements along the riverbed. It is based on simplified physical assumptions, including steady and uniform flow, and spherical grains that are fully exposed to the flow. More realistic assumptions of asymmetrical and angular grains that are partially hidden in the bed would likely increase the seismic energy due to turbulent flow predicted by the model. However, it is unlikely that this increase would be sufficient to account for the remaining difference to the measurements: the roughness length of the riverbed, the parameter most impactful here (Figure 4, Gimbert et al., 2014), computed as k s = 3D 50 , would need to decrease by a factor of more than 100 to match the observations, which is physically unreasonable.

Specifics of the Method
The seismic signal integrates over the width of the river and a certain distance upstream and downstream, not only a single cross section as is the case for direct measurements with slot samplers. Moreover, the seismometer is located 23 meters upstream of the samplers, introducing a time shift between the instruments due to the finite velocity of bedload particles. Both of these effects, and particularly so the averaging of the seismic signal over a channel reach, cause a spatial averaging of the seismic signal that leads to a smoothing of the pulsating bedload pattern, rather than a large systematic bias. Thus, it cannot explain the discrepancy of one to two orders of magnitude between the reconstructed and the measured bedload flux (Figure 7). Direct measurements from the bedload samplers were consistent between phases 1 and 2, but in Phase 3, the indirect bedload measurements by the plate microphone overestimated the bedload flux. Because in Phase 3 indirect rather than direct measurements were used, we judge them to be less reliable than the slot measurements used during phases 1 and 2. The large difference between the sampling methods and the seismic data illustrates the importance of high-quality measurements as benchmarks for seismic techniques.

Green's Function Parametrization
The use of measured Green's function parameter values, based on the active seismic survey, reduces the absolute average difference between the modeled spectrogram and the spectrogram calculated from the measured seismic data in comparison to the calculations based on the generic model by 9 dB (Figures 6b  and 6c). However, a discrepancy of one to two orders of magnitude between the reconstructed and the measured bedload flux remained even when Green's function parameter values based on the active seismic survey were used (Figure 7). Discrepancies between the Green's function parameter values used in the bedload model ( 11 , , c u aux aux aux N v v ) and those from our active seismic survey (N 11 , v c , v u ) were highest in the 15-30 Hz band (Figures 5c and 5d). This band overlapped with the 20-60 Hz band that is typically dominated by bedload-related seismic signals (Figure 2b). It remains unclear whether or not this is particular to this site. Active seismic surveys to constrain the Green's function in parallel with high-quality hydraulic and bedload data from other sites with a range of substrates would be necessary for a systematic investigation of this issue.
LAGARDE ET AL.

Parametrization of Grain Size Distribution
We obtained the GSD from the bedload samplers (Section 2.2.2.2), and used a constant GSD for the whole event. As such, we were able to identify three potential GSD-related effects: (i) the model sensitivity to the choice of the hiding exponent, (ii) the role of the largest moving grain sizes, and (iii) the effect of physical processes not explicitly described by the Tsai et al. (2012) model.
In the mobile bed layer, not all particles are equally mobile due to hiding effects. The relative particle mobility as a function of their size is described by the hiding exponent. In the Tsai et al. (2012) model, the hiding exponent is implemented with a default value of 0.9. Although we did not constrain this parameter with our field experiment, we tested its influence on the model results. Varying the exponent between the physical limits of 0-1, we obtained a maximum PSD spread of 3.8 dB (see Appendix A). Thus, calibration of the hiding function cannot explain the observed discrepancy (Figure 7).
In the calculations including grains with D > 110 mm ( 250 mix GSD ), the largest grain size class contributed less than 0.15% of the total bedload mass, but decreased the absolute average difference between model and data spectrograms at frequencies between 20 and 60 Hz by 8 dB compared to the spectrograms obtained for  (Figures 4b-4d). This implies that the maximum grain size has a disproportionately large impact on the PSD, and thus on the quality of the bedload inversion. The grain size percentile yielding the largest PSD is D 99 for all the GSDs investigated in this article. Hence, we conclude that the approximations in the GSD, that is, the need to estimate the largest mobile grain size D max , and an uncertain extrapolation of the GSD beyond the measured values, can account for most of the observed differences between reconstructed and measured bedload flux.
We used a Monte Carlo approach on GSD funct to explore how the different maximum particle sizes may impact the PSD. With increasing maximum particle size, the range of pD/D decreased. For example, at D = 34 mm the pD/D range was 30 m −1 , whereas at D = 60 mm, the pD/D range was 2 m −1 (Figure 9). Extending the GSD to up to 110 mm, which corresponds to the largest grain sizes expected to travel in the channel (Bergman et al., 2007), using the log-raised cosine function with the different values for D 50 and σ g had a maximal impact of 5.4 dB on the PSD. Thus, caution needs to be applied when using the log-raised cosine function. However, in most practical situations, measured GSDs of the transported bedload will not be available. In these cases, the mobile GSD should be predicted from state-of-the-art bedload models (e.g., Wilcock & Crowe, 2003).  The model by Tsai et al. (2012) accounts only for saltating grains. However, sliding and rolling particles also emit seismic waves. Particles tend to roll if the fluid shear stress is close to the threshold of motion (Auel et al., 2017), which can be expected for the largest moving grains. Laboratory experiments indicate that rolling particles deliver more seismic energy per distance traveled than saltating particles under the same flow conditions (Tsakiris et al., 2014;Turowski & Rickenmann, 2009). This seems to be related to the frequent contacts with the bed in comparison to saltating, even though a single contact delivers less energy on average. Moreover, field observations by Turowski et al. (2015) indicate that the largest grain sizes deliver more energy than what would be expected by the linear scaling with mass implemented in the Tsai et al. (2012) model, corresponding to a scaling of PSD ∼ D 3 , independent of grain size. The strong sensitivity to large grains that emerged from our comparison of model predictions, together with the observations of Turowski et al. (2015), suggests that a more complete description of the physics of motion of the largest bedload grains is the most important next step to improve the accuracy of model inversion.

Conclusion
Inverting seismic data using a physical model of bedload transport (Tsai et al., 2012) to estimate bedload flux provides a promising approach in river monitoring with several advantages over in-stream monitoring concepts, including cost, accessibility and small demands on infrastructure and staff. The quality of bedload flux measurements from seismic data strongly depends on the quality of the input data for the model. Direct measurements of these parameters, chiefly seismic ground properties needed for the Green's function and the GSD of the moving bedload, considerably improved the quality over generic approaches using empirical or theoretical functions.
When using the best approximation ( 110 mix GSD , Green's function parameter values based on the seismic active survey), the absolute average difference between empirical and modeled PSD is equal to 11 dB for frequencies between 20 and 60 Hz. However, this still results in a bedload flux overestimation by two orders of magnitude. The Tsai et al. (2012) model demonstrates strong sensitivity to the coarse tail of the imposed GSD, Figure 9. Monte Carlo distribution applied to the log-raised cosine function GSD. Red points depict the sampler's GSD data used to obtain the log-raised cosine function GSD. Gray lines correspond to the log-raised cosine function with D 50 ∈[5; 16]mm and σ g ∈[1; 1.70]. The red line corresponds to the parameters used in this study (D 50 = 12 mm, σ g = 1.35).
which is most difficult to measure (e.g., Bakker et al., 2020). From an observational perspective, this may pose a challenge. The largest grain size class is, by definition, at the upper most end of the GSD. As such, the largest grains move less often and over shorter distances than grains of more common sizes (Wilcock, 1997). Data or estimates of the size of the largest moving grains are thus associated with large uncertainties, and may be only locally applicable. However, seismic methods may also offer an opportunity. Precisely because they are most sensitive to the motion of the largest mobile bedload grains, they can be used as a complementary method to provide constraints on the motion of these grains. To achieve this, the effects of sliding and rolling grains on the delivery of seismic energy needs to be included in the model. High-quality field data from a range of sites, including accurate information on the largest mobile bedload grains, for example using tracer methods, will be necessary for calibrating and validating such a model. dicular to the channel do not show a similar trend: on line C1 a mode is clearly identifiable between 10 to 70 Hz; and on line D no mode is clearly identifiable above 20 Hz. As we are interested in the attenuation of the waves traveling from the river to the seismometer, and the sensor integrates signals of a linearly extensive source over a certain length along stream, line B1, parallel to the channel on the seismometer bank was selected for the subsequent detailed analyses.

Data Availability Statement
The raw seismic data are available at Lagarde et al. (2020b). The supporting information (Lagarde et al., 2020a) contains the information necessary to reproduce the results of this study. All other codes are available through R Core Team (2020); Dietze (2018aDietze ( , 2018b.