The Impact of Sediment Supply on the Initiation and Magnitude of Runoff‐Generated Debris Flows

Rainfall intensity‐duration (ID) thresholds are commonly used to assess the potential for runoff‐generated debris flows, but the sensitivity of these thresholds to sediment supply, which can change rapidly with time, is relatively unexplored. Furthermore, debris flows often self‐organize into distinct surges, but the factors controlling the magnitude and frequency of these surges, including sediment supply and grain size, are poorly constrained. We use a combination of numerical modeling and debris flow monitoring data from Chalk Cliffs, Colorado, USA, to explore how sediment supply influences rainfall ID thresholds for debris flows and surge properties. Results suggest that rainfall ID thresholds only become sensitive to sediment supply below a sediment thickness threshold. Surge magnitude is a nonmonotonic function of sediment supply (i.e., channel bed sediment thickness and grain size) with the largest surges tending to form at intermediate values of sediment availability with intermediate grain sizes.


Introduction
Runoff-generated debris flows, which are common in recently burned areas (Cannon, 2001; Gabet & Bookter, 2008; Kean et al., 2011; Nyman et al., 2011) and unburned alpine settings (Berti et al., 1999; Coe et al., 2008), represent a significant and widespread hazard. Rainfall intensity-duration (ID) thresholds are commonly used to forecast the initiation of runoff-generated debris flows. Rainfall ID thresholds are often determined from historical data, within a geographic region, that characterize the rainfall intensities that have produced both debris flows and water-dominated flows (e.g., Bacchini & Zannoni, 2003; Bel et al., 2017; Caine, 1980; Cannon et al., 2008, 2011; Destro et al., 2017; De Vita et al., 2013; Giannecchini et al., 2016; Guo et al., 2016; Ma et al., 2017; Marra et al., 2016; Staley et al., 2013, 2015, 2017). Given that specific ratios of water and sediment result in debris flows (Iverson, 1997), we expect that sediment supply should also influence rainfall ID thresholds. However, due to a paucity of data on sediment supply conditions during debris-flow-producing rainstorms, no relationship between sediment availability and rainfall ID thresholds is established for runoff-generated debris flows.
Debris flow magnitude, in addition to initiation thresholds, is a critical piece of information for hazard assessment and mitigation purposes. Many metrics of debris flow magnitude, such as peak discharge, are influenced by the tendency for debris flows to self-organize into distinct surges (e.g., Davies, 1990; Hungr, 2000; Iverson, 1997; Iverson et al., 2010; Takahashi, 1981; Zanuttigh & Lamberti, 2004a, 2004b). In longer, low-gradient channel reaches, debris flow surge formation may result from progressive instabilities that share many similarities with the roll waves that commonly form in water-dominated flows and viscous fluids (Zanuttigh & Lamberti, 2007). However, in steep headwater regions, debris flow surges may form as a result of regressive instabilities, such as the temporary creation and subsequent failure of sediment dams. Kean et al. (2013) demonstrated that debris flow surges at Chalk Cliffs, Colorado, USA, often (though not exclusively) form through this type of regressive instability where bedload sediment preferentially deposits in low-slope portions of the channel and continues to build until the resulting sediment dam fails en masse. Despite the potential hazard caused by debris flows, comparatively little is known about the factors, including sediment supply and grain size, that control the frequency and magnitude of debris flow surges that form through regressive instabilities. Using a numerical model and data from an intensively monitored study area at Chalk Cliffs, Colorado, USA, we explore how debris flow initiation thresholds and debris flow surge magnitude and frequency vary as functions of rainfall intensity, sediment supply, and grain size. The model is first applied to simulate runoff and debris flow initiation during one debris-flow-producing storm at Chalk Cliffs, which occurred during a period with an abundance of available sediment in the channel.
Then, the model is tested on four other flow events in the same year at Chalk Cliffs, where progressively less sediment is available for transport with each storm. After demonstrating that the model is capable of capturing observed differences between these contrasting debris flow and water-dominated flow events, we perform a series of numerical experiments designed to investigate the factors controlling debris flow initiation thresholds and surge properties (magnitude and frequency). The model-predicted rainfall ID thresholds are compared with existing ID thresholds for Chalk Cliffs that were derived through traditional empirical methods (Coe et al., 2008). Results provide insight into (1) the utility of rainfall ID thresholds in situations where sediment supply and grain size vary with time, which is especially applicable in recently disturbed environments, and (2) the factors controlling debris flow surge properties.

Study Area
Our study area, Chalk Cliffs, lies on the southern flank of Mount Princeton in the Sawatch Range of central Colorado, USA (Figure 1a). An average of two to three debris flow events occur each year in the study basin between May and October (Coe et al., 2008). Sediment generally accumulates in the channel network throughout the winter via dry ravel and rockfall, sourced from steep hillslopes and rock outcrops. Debris flows initiate during summer rainstorms when runoff concentrates in channels and mobilizes the sediment (Coe et al., 2008; Kean et al., 2013). Based on standard sieve and hydrometer methods (up to 64 mm), the median grain size of channel sediment, excluding larger cobbles and boulders, is within the range of 3 to 10 mm (McCoy et al., 2012, 2013). For more details about grain size measurement, we refer to McCoy et al. (2012, 2013). Chalk Cliffs contains several different monitoring stations, but we focus on data recorded at the upper monitoring station, which has an upstream drainage area of 0.06 km² (Figure 1a). A typical debris flow event at Chalk Cliffs consists of multiple coarse-grained granular surges separated by water-rich, intersurge flow (McCoy et al., 2010, 2011). The debris flow channels in Chalk Cliffs fill up with sediment during cold periods due to a combination of frost weathering and dry ravel processes (Coe et al., 2008; Rengers et al., 2020). Rengers et al. (2020) detail the processes controlling sediment production from bedrock slopes at Chalk Cliffs. Debris flows that occur earlier in the summer storm season remove most of the sediment from the channels in the upper parts of the basin, resulting in more flood-dominated responses by late summer. Kean et al. (2013) reported that the debris flows at Chalk Cliffs generally consist of several small surges at low rainfall intensity (<30 mm/hr), larger-amplitude, regularly repeating surges at a similar frequency at intermediate rainfall intensity (30-60 mm/hr), and a single, large surge front followed by small, high-frequency fluctuations at high rainfall intensity (>60 mm/hr).

Monitoring Debris Flow Activity at Chalk Cliffs
Debris flow monitoring began at Chalk Cliffs in 2004 with rainfall and soil moisture measurements (Coe et al., 2008) and expanded in 2008 (McCoy et al., 2010, 2011, 2012, 2013). Here, we make use of rainfall and stage time series at the upper monitoring station. Rainfall is measured by a tipping bucket rain gage, and stage is estimated by a laser distance meter installed over the center of the channel. Debris flow surges appear in the stage time series as a sharp rise in stage followed by a more gradual decline. Additional details regarding the monitoring equipment setup can be found in Kean et al. (2015). Data used in this study can be found in Kean et al. (2020).
We selected rainstorms on 4 July 2014 and 31 July 2014 that produced debris flows and rainstorms on 1, 4, and 10 August 2014 that produced water-dominated floods and debris floods to assess model performance in both sediment-rich and sediment-limited conditions. The storm on 4 July 2014 was the first significant rainstorm of the season and therefore occurred during a period with abundant sediment supply. Following the accumulation of ravel during the winter, approximately 0.34 m of sediment was in the channel at the upper station before the debris flow on 4 July 2014. The depth of sediment in the channel prior to the storm was determined by the stage of the laser above the bedrock channel datum before any flow. The rest of the storms in July and August 2014 occurred with a reduced channel sediment supply as a result of erosion from previous storms.

Numerical Model
The numerical model is designed to represent infiltration, rainfall interception, fluid flow, sediment transport, and mass failure of bed material, as described by McGuire et al. (2016, 2017). The model has been previously applied to simulate runoff, sediment transport, and debris flow initiation processes (McGuire et al., 2017; Tang et al., 2019). Here, we only provide a brief overview of the model and describe the addition, in the present study, of a new bedload sediment transport component. The addition of this model component was necessitated by the coarse sediment at Chalk Cliffs relative to other sites where the model has been applied previously.
Water flow and sediment transport are modeled using the two-dimensional, nonlinear shallow water equations coupled with advection equations to track the movement of sediment within the water column. The Hairsine-Rose (HR) soil erosion model (Hairsine & Rose, 1991, 1992a, 1992b) can represent sediment detachment by both raindrop impact, which we neglect at Chalk Cliffs since typical particle sizes are greater than several millimeters, and overland flow, which is proportional to unit stream power. The HR model represents sediment as a two-layer system, where sediment that has yet to be detached from the soil surface has different erodibility properties than sediment that has been previously detached and transported. Once detached and then deposited in a new location, sediment becomes easier to entrain. Throughout a modeled rainstorm, substantial amounts of sediment may be deposited in locations within the channel where hydrologic conditions are favorable for deposition. In addition to tracking the thickness of this deposited sediment layer as it evolves over time, we also assess the stability of this sediment layer at each model time step. The mass failure of sediment within the deposited layer occurs whenever the driving forces acting on the bed sediment layer exceed the resisting forces acting on that sediment layer (McGuire et al., 2017; Tang et al., 2019). If a bed failure occurs in a particular location, all the sediment in the deposited layer in that location is instantaneously added to the water column. A debris flow may be generated through this mechanism. Debris flows can also be generated in the absence of any mass failure mechanism if entrainment rates are sufficiently high. We neglect the mass failure of sediment that is not part of the deposited layer (i.e., cohesive soil on hillslopes).
The bedload sediment transport model employed here (Rickenmann, 2001) is the same as that used by Kean et al. (2013) in their study of debris flows at Chalk Cliffs. The bedload sediment discharge per unit width of the channel follows Rickenmann (2001) and is a function of the densities of the water and sediment (ρ_f and ρ_s), the nondimensional shear stress (τ*), the nondimensional critical stress for the initiation of bedload transport (τ*_c), the Froude number (Fr), the particle grain size (D), and the acceleration due to gravity (g) (see Table S1 in the supporting information for symbol descriptions and units). Although this particular formulation for bedload transport was not employed in previous versions of the model, it replaces the bedload flux component in Equations 10 and 11 of McGuire et al. (2016). Bedload transport will only occur when the nondimensional shear stress is greater than the nondimensional critical stress for the initiation of bedload transport. It should be mentioned that the amount of sediment being transported as bedload does not influence the sediment concentration within the water column. Only sediment within the deposited layer can be transported as bedload. If the initial thickness of the deposited layer is set to zero, this means that sediment must be detached and deposited by overland flow before it is able to be transported as bedload. In practice, however, we initialize the model with a deposited sediment layer thickness that is greater than zero within the channel network in order to account for the observation that sediment within the channel network is readily available to be transported as bedload (additional details in section 3.3). In all simulations, we assume a single representative grain size, and the nondimensional critical shear stress is calculated based on Ferguson (2012). Since the downslope component of grain weight is large enough to significantly alter the slope dependency in critical shear stress when slopes are greater than roughly 15°, we modify the nondimensional critical shear stress following Ferguson (2012) and Kean et al. (2013). For further details regarding bedload transport in this model, we refer readers to section 5.2 in Kean et al. (2013).
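As a rough illustration of the threshold behavior described in this section, the following Python sketch evaluates a single-grain-size bedload relation of the Rickenmann (2001) type. The functional form, the coefficient 3.1, the exponents, the default densities, and all function and argument names are assumptions based on the commonly cited version of that relation; they are not taken from this study, and the slope correction to the critical stress is not included.

```python
import math


def bedload_flux(tau_star, tau_star_c, froude, grain_d,
                 rho_s=2600.0, rho_f=1000.0, g=9.81):
    """Bedload discharge per unit channel width (m^2/s) for one grain size.

    Assumed form (illustrative, not confirmed by this paper):
        q_b = 3.1 * sqrt((s - 1) g D^3) * tau*^0.5 * (tau* - tau*_c)
              * Fr^1.1 * (s - 1)^-0.5,   with s = rho_s / rho_f
    The default densities (kg/m^3) are placeholder values.
    """
    # No bedload transport at or below the critical nondimensional stress.
    if tau_star <= tau_star_c:
        return 0.0
    s = rho_s / rho_f  # relative sediment density
    scale = math.sqrt((s - 1.0) * g * grain_d ** 3)
    excess = tau_star - tau_star_c
    return (3.1 * scale * math.sqrt(tau_star) * excess
            * froude ** 1.1 / math.sqrt(s - 1.0))
```

With a 5-mm grain (`grain_d=0.005`), the function returns zero until `tau_star` exceeds `tau_star_c` and increases with excess stress beyond that point.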

Model Parameters and Calibration
We derive numerical model parameters from literature searches, field measurements, and calibration (see Table S2). The rainstorm on 4 July 2014 was chosen to calibrate model parameters associated with hydrologic and sediment transport processes, whereas the other four rainstorms serve primarily as test cases.
Simulations of the five monitored rainstorms are performed using a 1-m resolution digital elevation model derived from airborne lidar. We simulate sediment transport using one particle size class with a representative grain diameter of 5.0 mm, which is within the range determined by field measurements (McCoy et al., 2012). The number of parameters in the HR model can be substantially reduced because many are associated with raindrop-driven detachment of sediment, a process that we neglect at Chalk Cliffs due to the coarse nature (3 to 10 mm) of the sediment. One parameter in the HR model that does require calibration for this study is the fraction of stream power that is effective at detaching sediment via flow-driven processes. This detachability parameter influences the modeled hydrograph by altering the timing and magnitude of debris flow surges. Two other parameters, a wetting front suction head (H_f) and saturated hydraulic conductivity (K_s), are needed to solve the Green-Ampt infiltration equation in the model. The saturated hydraulic conductivity on soil-mantled hillslopes is set to a value of 20 mm hr⁻¹ based on field measurements (McCoy et al., 2012). Bedrock-mantled portions of hillslopes are identified based on a slope angle above 45° and assigned K_s = 0 (Figure S1), since the bedrock at Chalk Cliffs is composed of quartz monzonite and assumed to be relatively impermeable. The saturated hydraulic conductivity of ravel deposits in the channel, which was not constrained by field measurements, is set to 100 mm hr⁻¹ based on a model calibration (Figure S5). There were no field-based constraints on wetting front suction head for different landscape positions; we assume that it is negligible everywhere given the amount of bedrock exposed at our study site.
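To make the consequence of a negligible suction head concrete, the sketch below uses the standard textbook Green-Ampt capacity, f = K_s (1 + H_f Δθ / F), which reduces to f = K_s when H_f = 0. The function names, the moisture-deficit argument, and the way excess rainfall is computed are illustrative assumptions, not the model's actual implementation. The K_s values in the usage note are those quoted above.

```python
def infiltration_capacity(k_s, h_f, delta_theta, cum_infiltration):
    """Green-Ampt infiltration capacity, in the same units as k_s.

    k_s: saturated hydraulic conductivity (e.g., mm/hr)
    h_f: wetting front suction head (mm)
    delta_theta: soil moisture deficit (dimensionless)
    cum_infiltration: cumulative infiltrated depth F (mm)
    """
    if k_s == 0.0:
        return 0.0  # impermeable surface, e.g., exposed bedrock
    if h_f == 0.0 or delta_theta == 0.0:
        return k_s  # negligible suction: capacity equals K_s
    if cum_infiltration <= 0.0:
        return float("inf")  # Green-Ampt limit at the start of infiltration
    return k_s * (1.0 + h_f * delta_theta / cum_infiltration)


def rainfall_excess(rain_rate, k_s, h_f=0.0, delta_theta=0.0,
                    cum_infiltration=0.0):
    """Rainfall rate left over for runoff after infiltration."""
    capacity = infiltration_capacity(k_s, h_f, delta_theta, cum_infiltration)
    return max(rain_rate - capacity, 0.0)
```

For a rainfall rate of 30 mm hr⁻¹, this sketch gives an excess of 10 mm hr⁻¹ on soil-mantled hillslopes (K_s = 20), 30 mm hr⁻¹ on bedrock (K_s = 0), and none on channel ravel deposits (K_s = 100).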
For the first storm (i.e., 4 July 2014), the Manning coefficient, saturated hydraulic conductivity in the channel deposit, and effective fraction of stream power are calibrated following Rengers et al. (2016) and Tang et al. (2019) through a comparison between the simulated hydrograph and the actual hydrograph as recorded at the upper monitoring station. We use the correlation coefficient as the objective function to assess model performance (see the supporting information). More specifically, we ran a series of simulations with different parameter values and then selected the best fit parameters by maximizing the correlation coefficient between the modeled and actual hydrographs (Figures S2, S3, S4, and S5). Then, the best fit Manning coefficient and effective fraction of stream power are set as constants and used in later storms without recalibrating. Each simulation for different storms begins with a constant thickness of sediment throughout the entire channel network, consistent with field observations of widespread accumulation of dry ravel in the channel prior to the summer rainy season (Rengers et al., 2020). The thickness of the sediment layer (H_s) within the channel prior to a particular storm is determined by model calibration. If sufficient erosion occurs during simulations for these ravel deposits to be completely eroded in any location, then we assume that there can be no further erosion in that location (e.g., no erosion into the bedrock channel). A summary of model parameters used to simulate the response to the five runoff-producing storms in 2014 is included in the supporting information (Table S2).
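The calibration loop described above can be sketched as a simple search over candidate values that keeps the one with the highest correlation coefficient. This is a minimal illustration for a single parameter: `simulate` is a hypothetical stand-in for a full model run that returns a modeled stage series sampled at the same times as the observations, and the Pearson coefficient is assumed to be the correlation measure.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # undefined for a constant series
    return cov / math.sqrt(var_x * var_y)


def calibrate(candidates, simulate, observed):
    """Return (best_value, best_cc) over the candidate parameter values.

    simulate(value) is a hypothetical callable returning the modeled
    series for that parameter value.
    """
    best_value, best_cc = None, -math.inf
    for value in candidates:
        cc = pearson_cc(simulate(value), observed)
        # A NaN score never compares greater, so it is skipped.
        if cc > best_cc:
            best_value, best_cc = value, cc
    return best_value, best_cc
```

The same loop could be applied to the initial sediment thickness for the later storms, with all other parameters held fixed.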
Simulations of the observed rainstorms during 2014 serve to constrain necessary input parameters (storm on 4 July 2014) and initial conditions (sediment thickness in the channel), to test the numerical model at Chalk Cliffs (remaining storms), and to demonstrate the sensitivity of debris flow surge properties to variations in sediment supply. For example, the thickness of sediment within the channel network prior to each storm is unknown and must be calibrated. We hypothesize that the modeled hydrograph will be sensitive to this initial condition and, therefore, the calibration process will reveal different values for the initial sediment thickness within the channel at the start of the rainstorms (Table S2; Figures S2 and S6). If, however, debris flow surge properties do not depend strongly on the sediment supply within the channel, then simulations should reveal that the modeled stage is relatively insensitive to changes in the prescribed channel bed thickness (H_s).

Rainfall ID Thresholds
The hydrological responses to designed rainstorms are used to estimate rainfall ID thresholds with the same best fit model parameter set determined by the correlation coefficient for the storm on 4 July 2014. We determine whether or not an average intensity of I mm hr⁻¹ for a given duration is above or below the ID threshold by simulating the hydrological response to a rainstorm that lasts for that duration and has an average intensity of I mm hr⁻¹. Rainfall intensity is spatially uniform and varies in time following a normal distribution (Figure S7). The designed rainstorms have durations of 15, 30, and 60 min and average rainfall intensities ranging from 5 to 40 mm hr⁻¹ with increments of 1 mm hr⁻¹ (Figure S7). If a debris flow forms within the model in response to a particular designed storm, then it implies that the rainfall ID threshold was exceeded during that storm. During each simulation, the criterion for a debris flow is a flow depth above 10 cm and, simultaneously, a sediment concentration above 40% at the location of the upper monitoring station (McGuire et al., 2017). Therefore, the proposed approach enables us to identify the rainfall ID threshold, at durations of 15, 30, and 60 min, to within approximately 1 mm hr⁻¹. Further, in order to quantify the impact of sediment supply on rainfall ID thresholds, we conduct simulations for all designed rainstorms with a range of values for the initial thickness of sediment in the channel. Therefore, we can determine rainfall ID thresholds as a function of sediment supply that can be compared with the ID thresholds derived for Chalk Cliffs using an empirical approach by Coe et al. (2008).
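The threshold search described above can be sketched as a scan over the designed intensities that stops at the first storm producing a debris flow. In this minimal illustration, `run_model` is a hypothetical callable standing in for a full simulation; it is assumed to return time series of flow depth (m) and sediment concentration (fraction) at the upper monitoring station. The 10-cm and 40% criteria and the 5 to 40 mm hr⁻¹ range are those stated in the text.

```python
def is_debris_flow(depth_m, concentration, min_depth=0.10, min_conc=0.40):
    """Debris flow criterion: both conditions must hold at the same time."""
    return depth_m > min_depth and concentration > min_conc


def storm_triggers_debris_flow(depths, concentrations):
    """True if any time step satisfies the debris flow criterion."""
    return any(is_debris_flow(h, c) for h, c in zip(depths, concentrations))


def lower_threshold(duration_min, run_model, intensities=range(5, 41)):
    """Lowest tested average intensity (mm/hr) that produces a debris flow.

    run_model(intensity, duration_min) is a hypothetical callable that
    returns (depths, concentrations) at the monitoring station.
    Returns None if no tested intensity produces a debris flow.
    """
    for intensity in intensities:
        depths, concentrations = run_model(intensity, duration_min)
        if storm_triggers_debris_flow(depths, concentrations):
            return intensity
    return None
```

Because the intensities are scanned in 1 mm hr⁻¹ steps, the returned value brackets the threshold to within about 1 mm hr⁻¹. Repeating the scan for several initial sediment thicknesses would give the threshold as a function of sediment supply.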

Surge Magnitude and Frequency
In addition to quantifying the threshold for debris flow initiation, simulations quantify the magnitude and frequency of debris flow surges at the location of the upper monitoring station. Debris flow magnitude is quantified using peak discharge and total discharge volume during all debris flow surges throughout a modeled rainstorm. We calculate the discharge per unit channel width as

$$q = h\sqrt{u^2 + v^2},$$

where h represents the flow depth, and u and v are the velocities in the x and y directions, respectively. We then integrate over the channel cross section to obtain total discharge, Q, as a function of time at the location of the monitoring station. The peak discharge, Q_p, is the maximum value of Q. The total volume of water and sediment during debris flow periods (i.e., not counting times when the flow is identified as being water dominated) is calculated at the upper station according to

$$Q_{df} = \sum_{i=1}^{K} \int_{t_i}^{t_{i+1}} Q \, dt,$$

where K is the number of debris flow periods and t_i and t_{i+1} denote the start and end time of the ith debris flow period. The fraction of time during simulations when debris flow activity is identified at the upper station is given by

$$\Phi = \frac{1}{T} \sum_{i=1}^{K} \left(t_{i+1} - t_i\right).$$

Here, T is the total simulation time. The above metrics are useful for quantifying and summarizing differences in flow properties among simulations.
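A minimal sketch of how these four summary metrics could be computed from a sampled hydrograph is given below. It assumes a discharge time series and a per-sample flag marking debris flow conditions; the trapezoidal integration, the rule that an interval counts only when both of its endpoints are flagged, and all names are illustrative assumptions rather than the authors' implementation.

```python
def surge_metrics(times, discharge, debris_flags):
    """Return (q_peak, volume, n_periods, phi) for one simulation.

    times: sample times (s), increasing
    discharge: total discharge Q at each time (m^3/s)
    debris_flags: True where the flow is classified as a debris flow
    """
    q_peak = max(discharge)  # peak discharge over the whole series
    volume = 0.0             # discharge volume during debris flow periods
    df_time = 0.0            # total time spent in debris flow periods
    n_periods = 0
    previous_active = False
    for i in range(len(times) - 1):
        # An interval belongs to a debris flow period only if both
        # of its endpoints are flagged.
        active = debris_flags[i] and debris_flags[i + 1]
        if active:
            dt = times[i + 1] - times[i]
            volume += 0.5 * (discharge[i] + discharge[i + 1]) * dt
            df_time += dt
            if not previous_active:
                n_periods += 1  # a new period starts here
        previous_active = active
    total_time = times[-1] - times[0]
    phi = df_time / total_time if total_time > 0 else 0.0
    return q_peak, volume, n_periods, phi
```

With this convention, an isolated flagged sample contributes no volume or duration, which may or may not match how periods are delimited in the actual analysis.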

Model Results for Observed Storms
Hydrographs from the first two observed rainstorms are characterized by high-frequency debris flow surges when the 15-min average rainfall intensity (I_15) exceeds the empirical threshold (19 mm hr⁻¹) from Coe et al. (2008). Kean et al. (2015) identified seven distinct periods of debris flow surges during the first rainstorm on 4 July 2014. The timing of modeled debris flow surges generally corresponds with the timing of observed surges (Figures 1b and 1d). The modeled hydrograph similarly includes high-frequency debris flow surges (Figures 1b and 1d), where sediment concentrations exceed 40%. The best fit sediment thickness calibrated for the first storm is 75 cm, likely reflecting an abundance of ravel deposits within the channel at the start of this storm.
The observed flow depth during the storm on 31 July 2014 is also characterized by high-frequency debris flow surges (Figures 1c and 1e). However, the frequency of surges during this storm is much higher relative to that on 4 July 2014, and the combined duration of all debris flow surges is shorter. The calibration process demonstrated that a sediment thickness of 25 cm provided the best fit between the modeled and observed stage data (Table S2 and Figure S6). Recall that sediment thickness is the only calibrated input parameter for this storm and the subsequent three storms. All other parameters are fixed based on their calibrated values for the storm on 4 July 2014. The inferred reduction in sediment thickness likely reflects the influence of widespread erosion within the channel by previous storms, including the storm on 4 July 2014. Simulations indicate that a higher number of debris flow periods with greater combined duration would have occurred if more sediment had been in the channel.
The third substantial rainstorm of the year, on 1 August 2014, resulted in a combination of water- and debris-dominated flow. Model results are consistent with this type of flow behavior, with simulations suggesting water-dominated flow during the initial stages of the storm that transitions into flow behavior characterized by periodic debris flow surges (Figures S8a and S8b). The overall model performance, based on the correlation coefficient (CC = 0.42) and a visual comparison between the modeled and observed hydrographs, is worse for this storm relative to the other four. The last two substantial rainstorms of the year, on 4 and 10 August 2014, produced water-dominated floods. The timing of modeled flow compares well with observations. The correlation coefficient is higher for both of these storms (CC = 0.80, CC = 0.78) than it is for the calibration case on 4 July 2014 (CC = 0.64) despite the fact that all parameters are fixed with the exception of the initial sediment thickness (Table S2). Calibrated sediment thicknesses decrease systematically with each storm, dropping from 0.75 m to 0.25, 0.20, 0.15, and finally 0.10 m by 10 August 2014 (Table S2). This set of model experiments suggests that the model is capable of simulating flow behavior across a range of sediment supplies and that reduced sediment supply, rather than rainfall characteristics, is primarily responsible for the decrease in debris flow activity between the storms on 4 July 2014 and 10 August 2014 (Figures S2 and S6).

Rainfall ID Thresholds
Simulations with the idealized storms are applied to estimate rainfall ID thresholds at durations of 15, 30, and 60 min for Chalk Cliffs (Figure 2). The empirical rainfall ID thresholds for Chalk Cliffs are 19, 11, and 7 mm hr⁻¹ for durations of 15, 30, and 60 min, respectively (Coe et al., 2008). The rainfall intensity threshold for a 15-min duration, for a given channel sediment thickness, was estimated using the model. We refer to these thresholds as the lower thresholds (LTs) for debris flow initiation. We define another rainfall intensity threshold, for the 15-min duration only (for computational reasons), that delineates the transition between debris flows and floods as the upper threshold (UT). The transition back to floods at greater intensities can be driven by an increase in water runoff that reduces sediment concentration. Both LT and UT are functions of sediment thickness (Figure 2a). LT and UT increase initially with increasing sediment thickness and then become roughly constant for sediment thicknesses greater than 40 cm (Figure 2a).
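Given the outcome of each designed storm at one sediment thickness, the lower and upper thresholds can be read off as the bounds of the intensity window that produces debris flows. The sketch below is one possible way to do this; the dictionary input format and the choice to leave the upper threshold undefined when the most intense tested storm still produces a debris flow are assumptions made for illustration.

```python
def lower_and_upper_thresholds(outcomes):
    """Return (LT, UT) in mm/hr from simulated outcomes.

    outcomes maps each tested 15-min average intensity (mm/hr) to True
    if a debris flow formed and False if the response was a flood.
    UT is None when the highest tested intensity still yields a debris
    flow, because the transition back to floods is then not bracketed.
    """
    intensities = sorted(outcomes)
    triggering = [i for i in intensities if outcomes[i]]
    if not triggering:
        return None, None  # no debris flows at any tested intensity
    lower = triggering[0]
    upper = triggering[-1] if triggering[-1] < intensities[-1] else None
    return lower, upper
```

For example, with hypothetical outcomes in which only intensities from 12 to 30 mm hr⁻¹ produce debris flows, the function returns `(12, 30)`. Repeating this for each initial sediment thickness would trace out curves like those described for Figure 2a.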
We also derived LTs for I_15, I_30, and I_60 (Figures 2b, 2c, and 2d) as functions of grain size for different initial sediment thicknesses. When grain size is relatively small (i.e., 1 mm), debris flows can initiate even when there is a limited sediment supply. However, debris flows are unable to initiate with a limited sediment supply as the grain size increases. An initial sediment thickness of approximately 40 cm is required before debris flows begin to initiate if the grain size is set to 10 mm, while a thickness of approximately 5 cm is sufficient to generate debris flows when the grain size is 1 mm. In all cases, the critical rainfall intensity for debris flow initiation generally increases as a function of sediment supply such that it roughly matches the empirical thresholds when the system is not supply limited (Figure 2). For sediment thicknesses above 40 cm, the percent differences between the empirical and model-predicted rainfall thresholds for durations of 15, 30, and 60 min are 16%, 18%, and 28%, respectively.

Figure caption (fragment): We used four variables to summarize model results: peak discharge (Q_p), total discharge volume (Q_df), number of debris flow periods (K), and ratio of time during debris flow periods to total simulation time (Φ). Box with quartiles represents 75% of the data range, blue-filled circle represents the mean, and red cross represents outliers.

Debris-Flow Surge Magnitude and Frequency
We performed an extensive parameter study to provide insight, beyond initiation thresholds, into the controls on debris flow surge properties. We focus on the factors controlling peak discharge (Q_p), total discharge volume during debris flow periods (Q_df), the number of debris flow periods (K), and the ratio of time during debris flow periods to total simulation time (Φ), with changes in three variables: rainfall intensity (I_15), grain size (D), and sediment thickness in the channel (H_s). For each set of simulations, we vary one variable (i.e., I_15, D, or H_s), while keeping other parameters fixed at the values used for the storm on 4 July 2014. In cases when I_15 is fixed, it is set to 20 mm hr⁻¹.
The parameter study of rainfall intensity (D = 5 mm and H_s = 75 cm) shows that the peak discharge (Q_p) increases from 0.7 to 3.7 m³ s⁻¹ when the rainfall intensity increases from 5 to 50 mm hr⁻¹ (Figure 3a). Meanwhile, the total discharge during the debris flow periods (Q_df) increases from 19 to 202 m³ (Figure 3d). As rainfall intensity increases, the number of debris flow surges (K) grows to a peak of 10 surges at 40 mm hr⁻¹ and then drops to roughly five surges at an intensity of 50 mm hr⁻¹ (Figure 3g). Debris flows are moving past the upper station during roughly 20% of simulation time when I_15 = 30 mm hr⁻¹, but that percentage is reduced to almost zero when I_15 = 50 mm hr⁻¹ (Figure 3j). The parameter study of grain size (I_15 = 20 mm hr⁻¹ and H_s = 75 cm) shows that Q_p increases slightly as grain size increases and then rapidly decreases (Figure 3b). Q_df decreases with increasing grain size (Figure 3e). As the grain size increases, K grows to a peak at 9 mm and rapidly drops to zero above 30 mm (Figure 3h). Debris flows are much more common than water-dominated flows when grain sizes are small relative to when they are large (Figure 3k). Assuming I_15 = 20 mm hr⁻¹ and D = 5 mm, simulations show a distinct sediment thickness threshold for debris flow initiation (Figures 3i and 3l). No debris flow surges are observed when the initial channel sediment thickness is less than 20 cm. Both Q_p and Q_df are generally greatest near the transition from water-dominated flows to debris flows (i.e., when sediment thickness is approximately 20 cm).

Discussion
Rainfall ID thresholds are commonly used to assess the potential for runoff-generated debris flows. Model results suggest that rainfall ID thresholds are likely to be robust until the sediment supply is depleted below a threshold (Figure 2). Interestingly, decreases in sediment supply, all else being equal, do not lead to increases in rainfall ID thresholds for debris flows. Rather, they lead to decreases in surge magnitude (Figure 3c) and decreases in the range of rainfall intensities over which debris flows are produced (Figure 2a), which is consistent with the conclusion from Pastorello et al. (2020). Once sediment supply drops below some critical level, runoff-generated debris flows may be triggered by less intense rainfall, especially if the grain size is small. A decrease in rainfall thresholds with a decrease in grain size is consistent with what one would expect if using a critical dimensionless discharge threshold to predict debris flow initiation (Figure 2b) (Gregoretti & Fontana, 2008; Tognacca et al., 2000). In our model, the reduction in the ID threshold in these cases is driven by the limited water storage capacity within thin layers of bed sediment and the fact that finer sediment is relatively easy to entrain and transport. In systems with limited sediment availability, however, even modest rainfall intensities (i.e., >20 mm hr⁻¹) appear to produce sufficient runoff to dilute the sediment concentration below that typical of a debris flow, resulting in a flood response (Figure 2a). Therefore, the likelihood of a flood may be increased relative to a debris flow even though initiation thresholds for debris flows would be relatively low.
In determining how debris flow threats may be influenced by future land use and climate changes, results here underscore the need to consider how recharge rates for sediment supply will vary over both short and long time scales (Borga et al., 2014; Bovis & Jakob, 1999; Jakob et al., 2005). Still, the relative insensitivity of rainfall ID thresholds across a range of sediment supply conditions is promising for the use of rainfall ID thresholds in cases where sediment supply is known to vary. Sediment supply within debris-flow-prone alpine channel networks is frequently variable, such as at Chalk Cliffs (Coe et al., 2008; Kean et al., 2013) and the Illgraben in Switzerland (Berger et al., 2011; Schlunegger et al., 2009). Sediment supply can also change rapidly due to disturbances, including wildfire (Florsheim et al., 2016; Lamb et al., 2011; Tang et al., 2019). Results here also support those of Tang et al. (2019), who noted no substantial changes in rainfall ID thresholds for debris flows throughout the first year following a wildfire in the San Gabriel Mountains of California, USA, a result that could be attributed to maintaining sediment supply above the threshold. A major difference, however, between Chalk Cliffs and recently burned areas is that the channel provides the main source of debris flow sediment at Chalk Cliffs, whereas both hillslopes and channels can be important sources of sediment for debris flows in burned areas (Nyman et al., 2011; Staley et al., 2014; Tang et al., 2019).
The rainfall ID thresholds derived here, like those employed in other studies, only consider the occurrence of debris flows within a given drainage basin regardless of debris flow properties, such as surge magnitude and frequency. In practice, it is useful to know both the threshold for initiation and the conditions that are most likely to give rise to impactful debris flow events. When modeled debris flows form in response to low rainfall intensities (e.g., 20 mm hr⁻¹), they consist of small, low-frequency surges with a relatively high background (intersurge) sediment concentration (Figures S11c and S11d). At moderate rainfall intensity (e.g., 35 mm hr⁻¹), greater runoff and erosion increase the formation of sediment dams that periodically fail and produce debris flow surges. As a result, large-amplitude, high-frequency debris flow surges are commonly observed in modeled hydrographs (Figures S11e and S11f). At high rainfall intensity (e.g., 50 mm hr⁻¹), the model often indicates the formation of a single large debris flow surge followed by several small, short-period surges with a relatively modest background flow (Figures S11g and S11h). A large amount of water is converted to runoff during high-intensity storms, which can decrease sediment concentration and the number of locations within the drainage network where sediment dams are capable of forming. As a result, fewer but larger dams form, and when these sediment dams do fail, they produce large-magnitude surges. Similar patterns between debris flow surge properties and rainfall intensity have been observed at Chalk Cliffs and at recently burned sites in southern California.
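Surge magnitude and frequency, as used above, can be extracted from a discharge time series in many ways. The sketch below shows one simple, assumed approach in which a surge is counted as a local maximum that exceeds a user-chosen discharge. The function names, the `min_peak` parameter, and the sample hydrograph are hypothetical and are not taken from the model or the monitoring data.

```python
def find_surges(discharge, min_peak):
    """Return indices of local maxima whose discharge is at least min_peak."""
    peaks = []
    for i in range(1, len(discharge) - 1):
        rising = discharge[i] > discharge[i - 1]
        not_exceeded_next = discharge[i] >= discharge[i + 1]
        if discharge[i] >= min_peak and rising and not_exceeded_next:
            peaks.append(i)
    return peaks


def surge_summary(discharge, dt, min_peak):
    """Summarize surge count, largest peak, and surges per unit time.

    discharge: evenly sampled discharge values
    dt: sampling interval (any time unit; frequency is reported per that unit)
    """
    peaks = find_surges(discharge, min_peak)
    duration = dt * (len(discharge) - 1)
    max_peak = max((discharge[i] for i in peaks), default=0.0)
    frequency = len(peaks) / duration if duration > 0 else 0.0
    return {"count": len(peaks), "max_peak": max_peak, "frequency": frequency}


# Made-up hydrograph (arbitrary units), for illustration only.
hydrograph = [0.1, 0.5, 2.0, 0.4, 0.3, 1.5, 0.2, 0.1, 0.6, 0.1]
print(surge_summary(hydrograph, dt=1.0, min_peak=1.0))
```

The choice of `min_peak` controls which fluctuations count as surges, so any comparison of surge frequency between simulations should use the same value.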
Grain size (D) influences debris flow surge properties through its controls on bedload sediment discharge and particle settling velocity. When grain size is small (e.g., 1-2 mm), modeled debris flows tend to organize into low-frequency, long-lasting debris flow surges with moderate magnitude (Figures 3b, 3h, 3k, and S12). We interpret this as resulting from efficient entrainment processes that can maintain high sediment concentration throughout the channel network. Similar behavior has been observed in postwildfire settings, where an abundance of fine-grained sediment is often available for transport. Tang et al. (2019) describe how the first runoff event following the 2016 Fish fire likely consisted of two phases: a fine-grained slurry followed by a series of more discrete debris flow surges. Model results here suggest that this change in flow behavior could be associated with a coarsening of the sediment available for transport. As grain size increases from 2 to 10 mm, surge frequency increases. Due to the more limited mobility of particles within this size range, the periodic creation and mass failure of sediment dams becomes more common (Figures S12c, S12d, S12e, and S12f). When D > 10 mm, the particles become difficult to transport, sediment dams do not regularly form, and the model only produces water-dominated floods (Figures S12g and S12h). Kean et al. (2013) similarly explored links between debris flow surge properties and grain size using a 1-D model for a small (≈10 m), idealized channel reach and found that grain size had minimal effects on surge frequency for particle diameters ranging from 1 to 5 mm. Differences between the results obtained by Kean et al. (2013) and those reported here could be caused by the more complex topography used in the present study or the inclusion of suspended load sediment transport, which would have more of an impact when grain sizes are small.
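To illustrate how strongly particle settling velocity varies over the grain sizes considered here, the sketch below evaluates the Ferguson and Church (2004) settling-velocity equation. It is offered as an assumption for illustration, not as the formulation used in the model. The coefficients (C1 = 18, C2 = 1.0, typical of natural grains), the submerged specific gravity (1.65), and the kinematic viscosity of water are assumed values.

```python
import math


def settling_velocity(grain_size_m, submerged_specific_gravity=1.65,
                      kinematic_viscosity=1.0e-6, c1=18.0, c2=1.0, g=9.81):
    """Settling velocity (m/s) from the Ferguson and Church (2004) equation.

    w = R g D^2 / (C1 nu + sqrt(0.75 C2 R g D^3))
    All default parameter values are assumptions chosen for illustration.
    """
    if grain_size_m <= 0:
        raise ValueError("grain size must be positive")
    r_g = submerged_specific_gravity * g
    numerator = r_g * grain_size_m ** 2
    denominator = (c1 * kinematic_viscosity
                   + math.sqrt(0.75 * c2 * r_g * grain_size_m ** 3))
    return numerator / denominator


# Evaluate across the range of grain sizes discussed in the text.
for d_mm in (1, 2, 5, 10):
    w = settling_velocity(d_mm / 1000.0)
    print(f"D = {d_mm:>2} mm -> w = {w:.3f} m/s")
```

With these assumed parameters, settling velocity rises from roughly 0.13 m/s at D = 1 mm to roughly 0.46 m/s at D = 10 mm, which is consistent with finer sediment being easier to keep in suspension.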
It should also be noted that debris flows usually have a wide range of grain sizes, which may quickly sort after motion begins, causing large grains to collect at the front and sides of the surge (e.g., Iverson, 1997; Johnson et al., 2012). These processes could lead to a more complex relationship between grain size, surge magnitude, and surge frequency than that presented here using one uniform grain size. The volume of runoff-generated debris flows is also likely to be a function of watershed relief, steepness, and size. We focus here on a relatively small watershed, so results may be interpreted as a representation of debris flow surge properties near initiation zones (Cannon et al., 2010; Gartner et al., 2014).
Results here support the use of rainfall ID thresholds across a range of sediment supply conditions and also elucidate the first-order interactions between entrainment, deposition, and debris flow initiation processes that control the size and frequency of surges within runoff-generated debris flows. Surge development at Chalk Cliffs and within the numerical model is often driven by the formation and subsequent failure of sediment dams within the channel (i.e., a regressive instability), which differentiates these surges from those that form through progressive instabilities in slurries and non-Newtonian fluids (Zanuttigh & Lamberti, 2007). Debris flow surge development through this type of regressive instability has also been inferred from modeling studies in other steep drainages (McGuire et al., 2017; Tang et al., 2019) and is important to understand from a hazard prevention and mitigation perspective. Simulations suggest that high-frequency, large-magnitude debris flow surges are most commonly formed during storms with moderate rainfall intensity and in cases with moderate-to-high sediment availability and intermediate grain sizes. In these cases, sediment dams form and fail regularly. While the absolute values of the rainfall thresholds derived here are specific to the Chalk Cliffs site, the model employed here is general and potentially portable to other locations.

Conclusions
Rainfall ID thresholds are commonly used to predict the initiation of runoff-generated debris flows. In this study, we combine field observations with a series of numerical experiments at Chalk Cliffs to quantify relationships between rainfall ID thresholds and sediment supply as well as relationships between debris flow surge magnitude/frequency and sediment supply, grain size, and rainfall intensity. Simulations suggest that, above a grain-size-dependent threshold, sediment supply does not strongly influence rainfall ID thresholds. Moreover, our parameter studies show a systematic relation between the magnitude or frequency of debris flow surges and the amount of erodible sediment, grain size, and rainfall intensity. Peak debris flow discharge reaches a local maximum at intermediate values of grain size, when sediment is small enough to lead to efficient transport and building of sediment dams but not so small that it remains in suspension and creates long-duration, slow-moving surges. Peak discharge generally increases with sediment supply and rainfall intensity. Results support the use of rainfall ID thresholds for debris flow initiation across a range of sediment supply conditions.

Data Availability Statement
Data used in this manuscript are stored in the USGS ScienceBase archive (Kean et al., 2020). Code for the numerical model is stored in the Community Surface Dynamics Modeling System (CSDMS) model repository (https://csdms.colorado.edu/wiki/Model:SWEHR).