4.2. Effects of Formula Calibration and Complexity
 Accuracy was also considered in relation to the degree of formula calibration and complexity. The number and nature of calibrated parameters determine the degree of formula calibration, which in turn determines equation complexity. In general, formulae computed by grain size fraction (di), using site-specific particle size distributions, are more calibrated and more complex than those determined from a single characteristic particle size. Moreover, formulae that are fit to observed bed load transport rates and that use site-specific hiding functions (e.g., the Parker et al. (di) equation) are more calibrated and complex than those that use a hiding function derived from another site (e.g., our use of the Andrews function in variants of the Parker et al. and Meyer-Peter and Müller equations). The Bagnold formula does not contain a hiding function, is based on a single grain size, has a limited number of user-calibrated parameters, and is therefore ranked lowest in terms of both calibration and complexity. However, we have ranked the Bagnold (dmqb) variant higher in terms of calibration because the modal grain size varied with discharge and was calculated from the observed bed load transport data. We consider the Ackers and White equation equal in terms of calibration and complexity to both the Meyer-Peter and Müller (di) equation and the Parker et al. (di via Andrews) equation because all three are calculated by di, have a similar number of calibrated parameters, and contain “off the shelf” particle-hiding functions that are calibrated to other sites, rather than to site-specific conditions.
 Results from our prior analyses (Figures 5 and 6) indicate that the most complex and calibrated equation (i.e., Parker et al.  (di)) outperforms all other formulae except for the Ackers and White  equation. However, we find no consistent relationship between formula performance and degree of calibration and complexity. Size-specific formulae (those calculated by di) do not consistently outperform those based on a single characteristic particle size (d50ss, dmqb or dmss), nor does a site-specific hiding function (i.e., Parker et al.  (d50ss)) guarantee better performance than an “off the shelf” hiding function (i.e., Parker et al.  (di) via Andrews ).
4.3. A New Bed Load Transport Equation
 The bed load equations examined in sections 4.1 and 4.2 are among the most commonly used equations for gravel bed rivers. However, their performance is disconcerting and raises the question of whether a better alternative exists.
 Similar to Whiting et al., we find that bed load transport at our sites is generally well described in log10 space (0.50 < r2 < 0.90) by a simple power function of total discharge (Q)
$$q_b = \alpha Q^{\beta} \qquad (3)$$
where qb is bed load transport per unit width, and α and β are empirical values [Leopold et al., 1964; Smith and Bretherton, 1972; Vanoni, 1975]. Figure 7 shows a sample fit of this function at the Boise River study site. Moreover, we find that equation (3) performs within the accuracy specified in section 4.1.3 (Freese's  χ2, α = 0.05) and is superior to the other bed load equations examined in terms of describing the observed transport (Figures 5 and 6). In particular, the median critical error, e*, for equation (3) is significantly lower than that of the other equations (paired χ2 test, α = 0.05) and is within the specified accuracy (E = 1 log10 unit). We expect this result because equation (3) is empirically fit to the data and thus fully calibrated. Nevertheless, the results demonstrate that a power function of discharge may be a viable alternative to the other equations examined in sections 4.1 and 4.2. To generalize equation (3) and make it predictive, we next parameterize α and β in terms of channel and watershed characteristics.
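 As an illustration of how equation (3) can be fit, the following minimal sketch estimates α and β by ordinary least squares on log10-transformed discharge and transport data. The function name `fit_power_law` and the synthetic data are hypothetical, and the sketch assumes that zero or negative observations are simply excluded before the log transform, which may differ from the treatment used in the analysis above.

```python
import numpy as np

def fit_power_law(Q, qb):
    """Fit qb = alpha * Q**beta by least squares in log10 space.

    Returns (alpha, beta, r2), where r2 is the coefficient of
    determination of the log10-log10 regression.
    """
    Q = np.asarray(Q, dtype=float)
    qb = np.asarray(qb, dtype=float)
    # Assumption: drop non-positive values, for which log10 is undefined.
    mask = (Q > 0) & (qb > 0)
    x = np.log10(Q[mask])
    y = np.log10(qb[mask])
    # A degree-1 polynomial fit returns (slope, intercept).
    beta, log_alpha = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return 10.0 ** log_alpha, beta, r2

# Synthetic, purely illustrative data (not from any study site).
Q_demo = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
qb_demo = 1e-5 * Q_demo ** 2.5
alpha_fit, beta_fit, r2_fit = fit_power_law(Q_demo, qb_demo)
```

 Fitting in log space weights relative rather than absolute errors, which suits transport rates that span several orders of magnitude.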
4.4. Parameterization of the Bed Load Transport Equation
 We hypothesize that the exponent of equation (3) is principally a function of supply-related channel armoring. Emmett and Wolman [2001] discuss two types of supply limitation in gravel bed rivers. First, the supply of fine material present on the streambed determines, in part, the magnitude of phase I transport (motion of finer particles over an immobile armor) [Jackson and Beschta, 1982]. Second, supply limitation occurs when the coarse armor layer limits the rate of gravel transport until the larger particles that make up the armor layer are mobilized, thus exposing the finer subsurface material to the flow (phase II transport [Jackson and Beschta, 1982]). Mobilization of the surface material in a well-armored channel is followed by a relatively larger increase in bed load transport rate compared to a similar channel with less surface armoring. Consequently, we expect that a greater degree of channel armoring, or supply limitation, will delay mobilization of the armor layer and result in a steeper bed load rating curve (larger exponent of the bed load equation (3)) compared to a less armored channel [Emmett and Wolman, 2001].
 Dietrich et al. [1989] proposed that the degree of channel armoring is related to the upstream sediment supply relative to the local transport capacity, and presented a dimensionless bed load transport ratio, q*, to represent this relationship. Here, we use q* as an index of supply-related channel armoring and examine its effect on the exponent (β) of the observed bed load rating curves (equation (3)).
 We define q* as
$$q^* = \left( \frac{\tau_{Q_2} - \tau_{d_{50s}}}{\tau_{Q_2} - \tau_{d_{50ss}}} \right)^{1.5} \qquad (4)$$
where τQ2 is the total shear stress at Q2 calculated from the depth-slope product (ρgDS, where ρ is fluid density, g is gravitational acceleration, D is flow depth at Q2 calculated from hydraulic geometry relationships, and S is channel slope) and τd50s and τd50ss are the critical shear stresses necessary to mobilize the surface and subsurface median grain sizes, respectively. Channel morphology and bed load transport are adjusted to bank-full flows in many gravel bed rivers [e.g., Leopold et al., 1964; Parker, 1978; Andrews and Nankervis, 1995], hence bank-full is the relevant flow for determining q* in natural rivers [Dietrich et al., 1989]. However, we use Q2 because it is a bank-full-like flow that can be determined objectively from flood frequency analyses without the uncertainty inherent in field identification of bank-full stage (section 4.1.2). The critical shear stresses are calculated as
$$\tau_{d_{50s}} = \tau^*_{c50} (\rho_s - \rho) g d_{50s} \qquad (5a)$$
$$\tau_{d_{50ss}} = \tau^*_{c50} (\rho_s - \rho) g d_{50ss} \qquad (5b)$$
where τ*c50 is the dimensionless critical Shields stress for mobilization of the median grain size and ρs is sediment density. We set τ*c50 equal to 0.03, corresponding with the lower limit of dimensionless critical Shields stress values for visually based determination of incipient motion in coarse-grained channels [Buffington and Montgomery, 1997].
 Values for q* range from 0 for low bed load supply and well-armored surfaces (d50s ≫ d50ss and τQ2 ≈ τd50s) to 1 for high bed load supply and unarmored surfaces (d50s ≈ d50ss and τd50s ≈ τd50ss). As demonstrated by Dietrich et al. [1989], q* does not measure absolute armoring (i.e., it is not uniquely related to d50s/d50ss), but rather relative armoring (a function of transport capacity relative to bed load supply). The denominator of (4) is the equilibrium transport capacity of the unarmored bed and is a reference transport rate (theoretical end-member) that quantifies the maximum bed load transport capacity for the imposed boundary shear stress and the size of supplied bed load material. The numerator is the equilibrium transport rate for the actual bed load supply (equilibrium excess shear stress), with equilibrium transport achieved by textural adjustment of the bed (fining or coarsening in response to bed load supply) [e.g., Dietrich et al., 1989; Buffington and Montgomery, 1999b]. Hence q* is a relative index of armoring (textural adjustment as a function of excess shear stress that provides equilibrium bed load transport). It describes armoring as a function of bed load supply relative to boundary shear stress and transport capacity. Consequently, q* is not a measure of absolute armoring (d50s/d50ss). For the same degree of armoring, one can have different values of q*, depending on the bed load supply and the corresponding equilibrium excess shear stress [Dietrich et al., 1989; Lisle et al., 2000]. Similarly, for a given q*, a lower degree of armoring will occur for lower values of equilibrium excess shear stress [Dietrich et al., 1989; Lisle et al., 2000].
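 The calculation of q* can be sketched as follows. This is a minimal illustration, assuming the Dietrich et al. form of the ratio (excess shear stresses raised to the power 1.5), a Shields-type critical stress τ*c50(ρs − ρ)gd50 with τ*c50 = 0.03, and typical densities for quartz sediment and water (2650 and 1000 kg m−3). The function names, default values, and example inputs are hypothetical and are not taken from the study data.

```python
def depth_slope_stress(depth_m, slope, rho=1000.0, g=9.81):
    """Total boundary shear stress (Pa) from the depth-slope product rho*g*D*S."""
    return rho * g * depth_m * slope

def critical_stress(d50_m, shields=0.03, rho_s=2650.0, rho=1000.0, g=9.81):
    """Critical shear stress (Pa) needed to mobilize a median grain size (m)."""
    return shields * (rho_s - rho) * g * d50_m

def q_star(tau_q2, d50s_m, d50ss_m, exponent=1.5):
    """Dimensionless transport ratio q* for surface (d50s) and subsurface (d50ss) sizes."""
    tau_s = critical_stress(d50s_m)    # surface median grain size
    tau_ss = critical_stress(d50ss_m)  # subsurface median grain size
    # q* is undefined when the median grain size is immobile at Q2.
    if tau_q2 < tau_s or tau_q2 <= tau_ss:
        raise ValueError("median grain size is immobile at Q2; q* is undefined")
    return ((tau_q2 - tau_s) / (tau_q2 - tau_ss)) ** exponent

# Illustrative values only: 1 m deep flow at Q2 on a 0.5% slope,
# 60 mm surface and 20 mm subsurface median grain sizes.
q_mid = q_star(depth_slope_stress(1.0, 0.005), 0.06, 0.02)
```

 With these definitions, an unarmored bed (d50s = d50ss) returns q* = 1, and a bed whose surface is just at the threshold of motion at Q2 returns q* = 0, matching the end-members described above.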
 We determined values of q* at 21 of the 24 study sites. Values of q* could not be determined for three of the study sites because their median grain sizes were calculated to be immobile during Q2 (Salmon River at Shoup, Middle Fork Salmon River at Lodge and Selway River). Results show an inverse relationship between q* and the exponent of our bed load power function (Figure 8) supporting the hypothesis that supply-related changes in armoring relative to the local transport capacity influence the delay in bed load transport and the slope of the bed load rating curve. We parameterize q* in terms of low-flow bed material for practical reasons (safety during grain size measurement and feasibility of future application of our approach). However, surface grain size can change with discharge and transport rate [Parker and Klingeman, 1982; Parker et al., 2003], thereby potentially making β discharge dependent. Nevertheless, β is an average value across the range of observed discharges (including channel-forming flows) and any textural changes are empirically incorporated into our relationship between β and q*.
Figure 8. Relationships between (a) q* and the exponent of the bed load rating curves (equation (3)) and (b) drainage area and the coefficient of the bed load rating curves (equation (3)) for the Idaho sites. Dashed lines indicate 95% confidence interval about the mean regression line. Solid lines indicate 95% prediction interval (observed values) [Neter et al., 1974; Zar, 1974].
 Two sites (Thompson Creek and Little Buckhorn) appear to be outliers and therefore were removed from the analysis (shown as open diamonds in Figure 8). The anomalous Thompson Creek q* value may be due to an extensive network of upstream beaver dams. The availability of sediment at all but the greatest flows is likely influenced by dam storage, delaying transport and increasing β by compressing the effective flows into a smaller portion of the hydrograph. In contrast, the large q* value for Little Buckhorn may be due to a lack of peak flow information. The Q2 at this site was calculated from a drainage area versus Q2 relationship developed from the other 23 Idaho study sites where peak flow information was available. This relationship may overpredict Q2 at Little Buckhorn, resulting in an anomalously high q* value.
 Because q* is a relative measure of armoring (i.e., relative to bed load supply and transport capacity), it is unlikely to be biased by site-specific conditions (climate, geology, channel type, etc.). For example, the relative nature of q* implies that channels occurring in different physiographic settings and possessing different particle size distributions (e.g., a fine gravel bed stream versus a coarse cobble bed one) may have identical values of q*, indicating identical armoring conditions relative to transport capacity and bed load sediment supply and thus identical bed load rating curve slopes. Although q* is not uniquely related to absolute armoring (d50s/d50ss), we also examined the effect of absolute armoring on the exponent of our transport function (3) and found that the relationship was not significant (F test, α = 0.05). Hence relative armoring (q*) is more important than absolute armoring (d50s/d50ss). Because q* is a relative index of armoring, it should be a robust predictor of the exponent of our bed load power function and unbiased by changing physiography and channel morphology.
 In contrast, the coefficient of the bed load power function (α) describes the absolute magnitude of bed load transport, which is a function of basin-specific sediment supply and discharge. In general, sediment transport rate (qb) and discharge (Q) both increase with drainage area (A) [Leopold et al., 1964]; however, discharge increases faster, such that the coefficient of the bed load power function is inversely related to drainage area (a surrogate for transport rate relative to discharge, α ∝ 1/A ∝ qb/Q) (Figure 8). The rate of downstream increase in unit bed load transport rate (qb) also depends on (1) downstream changes in channel width (a function of discharge, riparian vegetation, geology, and land use) and (2) loss of bed load material to the suspended fraction due to particle abrasion [Cui and Parker, 2004]. Factors that affect channel width also influence flow depth, boundary shear stress, and surface grain size and thus may influence, and be partially compensated by, β. We hypothesize that the Figure 8 relationship is a region-specific function of land use and physiography (i.e., topography, geology, and climate). Consequently, care should be taken in applying this function to other regions. In contrast, prediction of the exponent of our bed load transport equation may be less restrictive, as discussed above.
 On the basis of the relationships shown in Figure 8, we propose the following empirically derived total bed load transport function, with units of dry mass per unit width and time (kg m−1 s−1):
$$q_b = 257 A^{-3.41} Q^{(-2.45 q^* + 3.56)} \qquad (6)$$
where the coefficient and exponent are parameterized in terms of channel and watershed characteristics. The coefficient is a power function of drainage area (a surrogate for the magnitude of basin-specific bed load supply) and the exponent is a linear function of q* (an index of channel armoring as a function of transport capacity relative to bed load supply).
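 The structure of equation (6) can be expressed compactly in code. The sketch below is illustrative only: the regression constants `a`, `b`, `c`, and `d` are hypothetical argument names for the fitted values of the drainage area and q* relationships (Figure 8) and must be supplied by the user, and the units of discharge and drainage area must match those used to fit the regressions.

```python
def bed_load_rate(Q, area, q_star, a, b, c, d):
    """Unit bed load transport rate qb = alpha * Q**beta.

    alpha = a * area**b  (coefficient: power function of drainage area)
    beta  = c * q_star + d  (exponent: linear function of q*)
    """
    alpha = a * area ** b
    beta = c * q_star + d
    return alpha * Q ** beta
```

 Keeping the constants as explicit arguments makes it straightforward to substitute a regionally refit coefficient relationship while retaining the q*-based exponent.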
 The 17 independent test sites (Table 1) allow us to consider two questions concerning our bed load equation (6). First, how well can we predict the coefficient and exponent of the bed load power function at other sites? Second, how does our bed load formula perform relative to those examined in section 4.1? These questions are addressed in the next two sections.
4.5. Test of Equation Parameters
 We test our parameterization of equation (6) by comparing predicted values of the formula coefficient (α) and exponent (β) to observed values at 17 independent test sites in Wyoming, Colorado and Oregon (Figure 1). The independent test sites cover a generally similar range of slopes and particle sizes as the 24 Idaho sites used to develop equation (6) (Table 1). However, the East Fork River test site occurs at the gravel/sand transition [e.g., Sambrook Smith and Ferguson, 1995; Ferguson et al., 1998; Parker and Cui, 1998] and is significantly finer than the coarse-grained Idaho study sites. The Idaho study sites and the supplemental test sites are all snowmelt-dominated streams, except for Oak Creek which is a rainfall-dominated channel. The geology is also similar across the study and test sites. The channels are predominantly underlain by granitics, with some metamorphic and sedimentary geologies, except for Oak Creek which is underlain by basalt. Bed load transport was measured with Helley-Smith samplers at all sites, with the exception of the East Fork and Oak Creek sites, where slot traps were used [Milhous, 1973; Leopold and Emmett, 1997].
 As expected, the exponent of our bed load function is better predicted on average at the 17 test sites than the coefficient (Figure 9). The observed exponents are reasonably well predicted by equation (6) with a median error of less than 3%. This suggests that q*, determined in part through measurements of the surface and subsurface material during low flow, is able to accurately predict the rating curve exponent over a range of observed discharges (including channel-forming flows) despite any stage-dependent changes in surface grain size [Parker and Klingeman, 1982; Parker et al., 2003]. Moreover, the rating curve exponents are accurately predicted across different climatic regimes (snowmelt- and rainfall-dominated), different lithologies (basalt and granite), and different bed load sampling methods (Helley-Smith and slot samplers), despite the fact that the predictive equation is derived from a subset of these conditions (i.e., snowmelt rivers in granitic basins, sampled via Helley-Smith). In particular, β is reasonably well predicted at the two test sites that are most different from the Idaho study sites (Oak Creek and East Fork; observed β values of 2.55 and 2.19 versus predicted values of 2.43 and 1.82, respectively). We suspect that the success of our exponent function is due to the robust nature of q* to describe supply-related channel armoring regardless of differences in physiography and channel conditions (section 4.4).
Figure 9. Box plots of predicted versus observed values of (a) coefficient and (b) exponent of our bed load transport function (6). Median values are specified.
 In contrast, the predicted coefficients are considerably less accurate and were over 3 times larger than the observed values at many of the 17 test sites (Figure 9). Prediction errors, however, can be significant for both parameters, which is expected given the spread of the 95% prediction intervals shown in Figure 8. The largest errors in predicting the coefficient occurred at the Oak Creek and East Fork sites (3 orders of magnitude overprediction, and 2 orders of magnitude underprediction, respectively). The cause of the error at these sites is examined below.
 The Oak Creek watershed is unique relative to the 24 Idaho study sites in that it is composed primarily of basalt, rather than granite, and has a climatic regime dominated by rainfall, rather than snowmelt. Because equation (6) accurately predicts the exponent of the Oak Creek bed load rating curve, as discussed above, the overprediction of total bed load transport at this site is principally due to prediction error of the transport coefficient (observed α of 1.9 × 10−4 versus predicted α of 0.39), which may be due to differences in basin geology and sediment production rates. Basalt is typically less erodible and produces less sediment per unit area than the highly decomposed granites found in the Idaho batholith [e.g., Lisle and Hilton, 1999]. Consequently, one would expect equation (6) to overpredict the transport coefficient at Oak Creek, as observed. Alternatively, the bed load supply and transport coefficient may be influenced by climate and runoff regime; however, the relationship between these two variables is not well documented. Previous studies suggest that in temperate climates bed load supplies may be higher in rainfall regimes than in snowmelt-dominated ones [Lisle et al., 2000]. Therefore our α prediction, which is derived from snowmelt streams, would be expected to underpredict transport rates in the rainfall-dominated Oak Creek, contrary to what we observe. Consequently, differences in runoff regime do not appear to explain the observed error at Oak Creek. Regardless of the exact physical cause, the prediction error highlights the site-specific nature of our coefficient function (α) (discussed further in section 4.7).
 In contrast, the underprediction of the transport coefficient at East Fork may be due to a difference in channel type. The East Fork site occurs at the gravel/sand transition and has a finer, more mobile bed than the coarser-grained Idaho sites. The gravel/sand transition represents a shift in the abundance of sand-sized material that likely increases the magnitude of phase I transport and total sediment load compared to gravel bed channels. Consequently, one might expect equation (6) to underpredict the coefficient at East Fork, as observed.
 The Oak Creek and East Fork sites also differ from the others in that bed load samples were obtained from channel-spanning slot traps, rather than Helley-Smith samplers. Recent work by Bunte et al.  shows that differences in sampling method can dramatically affect bed load transport results, although Emmett  demonstrates reasonably good agreement between slot and Helley-Smith samples at the East Fork site. Consequently, differences in sampling method do not explain the observed prediction error of the transport coefficient, at least at the East Fork site.
 Differences in climate and runoff regime may also influence the rating curve exponent (β). This is not a source of error in our analysis (β is accurately predicted by equation (6), even at Oak Creek), but rather a source of systematic variation in β. A rainfall-dominated climate produces greater short-term variability in the annual hydrograph (i.e., flashier hydrograph) than one dominated by snowmelt [Swanston, 1991; Lisle et al., 2000] and typically generates multiple peak flows throughout the year versus the single, sustained peak associated with spring snowmelt. Consequently, the frequency and magnitude of bed load events differ between rainfall- and snowmelt-dominated hydrographs. The magnitude of flow associated with a given return period is typically greater in a rainfall-dominated watershed than in a similarly sized snowmelt-dominated watershed [Pitlick, 1994]. This is seen at our study sites in that the highest Q2 unit discharge (0.44 m3 km−2) occurs at the rainfall-dominated Oak Creek test site and is almost twice the second highest Q2 unit discharge (0.27 m3 km−2) at the snowmelt-dominated Dollar Creek study site. Furthermore, because of the greater short-term variability of rainfall-dominated hydrographs, the duration of intermediate flows is reduced, which can lead to fining of the bed surface and a decrease in the degree of channel armoring [Laronne and Reid, 1993; Lisle et al., 2000; Parker et al., 2003]. Consequently, one might expect less armoring and lower bed load rating curve exponents (β) in rainfall-dominated environments compared to snowmelt ones. However, β and the degree of armoring are also influenced by bed load supply and boundary shear stress (section 4.4), so those parameters must be factored into any comparison of runoff regimes. Lack of data (only one rainfall-dominated site in our data set) precludes further examination of this issue here.
4.6. Comparison With Other Equations
 To compare the accuracy of our bed load transport equation (6) to those presented in section 4.1, we performed a test of six formulae (including equation (6)) at the 17 test sites. The test procedure was similar to that used in section 4.1.3; however, we assume no transport observations are available for formula calibration (i.e., blind test), and therefore we do not include the two variants of the Parker et al.  (di and d50ss) equation or the Bagnold  (dmqb) equation which require measured bed load transport data. Consequently, only five of the eight variants of the formulae from section 4.1 are included here, plus our power law equation (6).
 Similar to section 4.1, incorrect zero-transport predictions are a problem for threshold-based equations, but the number of zero predictions is significantly less at the test sites (about 50% less compared to those shown in Figure 3 for the Idaho study sites). For both variants of the Meyer-Peter and Müller equation the median percentage of incorrect zero-transport predictions is about 22%, while the median percentage of incorrect zero predictions for the Bagnold (dmss) equation is 28%. In contrast, the Ackers and White equation incorrectly predicted zero transport at only one test site (Oak Creek), for 35% of the observations at that site. The significance of incorrect zero-transport predictions at the 17 test sites is similar to that at the 24 Idaho sites. The Qmax/Q2 ratios for the two variants of the Meyer-Peter and Müller (d50ss and di) equation had median values of 29% and 26%, respectively, at the Idaho sites and about 30% at the test sites. The Qmax/Q2 ratio for the Bagnold (dmss) equation decreased from a median value of 40% at the Idaho sites to 27% at the test sites.
 Figure 10 shows the distribution of log10 differences across the 17 test sites and demonstrates a significant improvement in the performance of both versions of the Meyer-Peter and Müller  equation and the Bagnold  (dmss) equation due to fewer incorrect zero transport predictions; median log10 differences improve from an underprediction of almost 10 orders of magnitude at the 24 Idaho sites to an overprediction of only 1.3 to 2.2 orders of magnitude at the test sites. The performance of both the Parker et al.  (di via Andrews ) equation and the Ackers and White  equation decreased slightly at the test sites with median log10 differences increasing from 2.73 and 0.25, respectively, at the Idaho sites to 3.27 and 0.80, respectively, at the test sites. Our bed load equation (6) had the lowest median log10 difference (0.62) at the 17 test sites.
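 The log10 differences summarized above can be computed as in the following minimal sketch. Two assumptions are made here and are not confirmed by this section: the difference is taken as predicted minus observed (so that negative values indicate underprediction), and a small constant `eps` is added to both values so that incorrect zero-transport predictions can be log-transformed. The function name, the value of `eps`, and the sample data are hypothetical.

```python
import numpy as np

def log10_differences(predicted, observed, eps=1e-15):
    """log10(predicted + eps) - log10(observed + eps) for paired transport rates."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    # eps is an assumed offset that keeps zero predictions finite in log space.
    return np.log10(p + eps) - np.log10(o + eps)

# Illustrative values only: one incorrect zero prediction, one tenfold
# overprediction, and one exact prediction.
diffs = log10_differences([0.0, 1e-3, 1e-2], [1e-5, 1e-4, 1e-2])
median_diff = float(np.median(diffs))
```

 Under these assumptions a single zero prediction produces a difference of many orders of magnitude, which illustrates why the frequency of incorrect zero-transport predictions dominates the median performance of threshold-based equations.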
Figure 10. Box plots of the distribution of log10 differences between observed and predicted bed load transport rates at the 17 test sites. Median values are specified. MPM stands for Meyer-Peter and Müller.
 As in section 4.1.3, the performance of each formula was evaluated using Freese's  χ2 test, with results similar to those of the Idaho sites; all formulae perform significantly worse than the specified accuracy (E = 1 log10 unit, α = 0.05), including equation (6). We also evaluated the critical error, e* [Reynolds, 1984], at each test site and, like the Idaho study sites, we found that a given formula may occasionally provide the required accuracy, but generally no equation performs within the specified accuracy (Figure 11, all median e* values >E).
Figure 11. Box plots of the distribution of critical error, e*, for the 17 test sites. Median values are specified. MPM stands for Meyer-Peter and Müller.
 Nevertheless, our bed load transport equation (6) outperformed all others at the 17 test sites, except for the Ackers and White equation, which was statistically similar to ours (paired χ2 test of e* values, α = 0.05) (Figure 11). As with the Idaho sites, the worst performers were the Bagnold (dmss) equation and both variants of the Meyer-Peter and Müller equation; the two Meyer-Peter and Müller variants were statistically similar to one another but different from the Bagnold (dmss) equation (paired χ2 test, α = 0.05). Critical errors for the Parker et al. (di via Andrews) equation were between these two groups of best and worst performers and statistically different from them (paired χ2 test, α = 0.05). Overall, the patterns of formula performance were similar to those of the Idaho study sites, but the 17 test sites tended to have lower values of critical error (see Figures 6 and 11).