Continental-scale relationship between bankfull width and drainage area for single-thread alluvial channels


Abstract

We explore the bankfull width (Wbf) versus drainage area (Ada) relationship across a range of climatic and geologic environments and ask (1) is the relationship between ln(Wbf) and ln(Ada) best described by a linear function and (2) can a reliable relationship be developed for predicting Wbf with Ada as the only independent variable. The principal data set for this study was compiled from regional curve studies and other reports that represent 1018 sites (1 m ≤ Wbf ≤ 110 m and 0.50 km2 ≤ Ada ≤ 22,000 km2) in the continental United States. Two additional data sets were used for validation. After dividing the data into small, medium, and large-size basins which, respectively, correspond to Ada < 4.95 km2, 4.95 km2 ≤ Ada < 337 km2, and Ada ≥ 337 km2, regression lines from each data set were compared using one-way analysis of covariance (ANCOVA). A second ANCOVA was performed to determine if mean annual precipitation (P) is an extraneous factor in the Wbf versus Ada relationship. The ANCOVA results reveal that using Ada alone does not yield a reliable Wbf versus Ada relationship that is applicable across a wide range of environments and that P is a significant extraneous factor in the relationship. Considering data for very small basins (Ada ≤ 0.49 km2) and very large basins (Ada ≥ 1.0 × 10^5 km2), we conclude that a two-segment linear model is the most probable form of the ln(Wbf) versus ln(Ada) relationship. This study provides useful information for building complex multivariate models for predicting Wbf.

1. Introduction

River channel morphology and evolution are critical topics in the study of earth-surface processes [Howard, 1996; Whipple and Tucker, 2002]. In addition, predicting change in response to changing drivers is an important goal in fluvial research [Shreve, 1979; Newson, 2002; Hardy, 2006]. Progress on these topics facilitates holistic interpretations of channel changes in the context of past climate changes, shorter-term human impacts, and potential future climate change [Dollar, 2004]. Moreover, the need for progress is presently heightened because of worldwide efforts to restore, renaturalize, and reengineer rivers to meet changing social and ecologic objectives and expectations [Wallick et al., 2007; Bernhardt et al., 2005].

Bankfull channel width (Wbf) is a key measure of stream size that is used by hydraulic engineers, hydrologists, fluvial geomorphologists, and stream ecologists and biologists [Faustini et al., 2009; McCandless and Everett, 2002]. It is an attractive measure of channel size for three reasons: it is relatively easy to measure, it is available in many data sets, and independent measurements of Wbf are more consistent than measurements of bankfull depth or cross-section area, presumably because Wbf is insensitive to minor differences in estimates of the bankfull stage [He and Wilkerson, 2011; Leopold, 1994; Wahl, 1977]. However, using Wbf as a descriptor of stream size can be problematic because alluvial streams with identical flows might have different average widths due to differences in bed slope, sinuosity, bed or bank material composition, available sediment load, streambank vegetation, climate, human impacts, geology, large woody debris (LWD; usually defined as wood larger than 10 cm in diameter and 1 m in length) loading, or other factors. Furthermore, it is not universally agreed that Wbf is the most important or consistent measure of channel size [e.g., Hey, 2006; Roper et al., 2010; Krstolic and Chaplin, 2007]. For example, in an assessment of “natural stable channel design” procedures, Hey [2006] notes that cross-sectional area underpins all scaling procedures for stream restoration design, and Krstolic and Chaplin [2007] conclude that bankfull cross-section area (computed as the product of bankfull width and mean depth) is more consistent than Wbf.

Two commonly used models for predicting Wbf are

$$W_{bf} = \alpha_0 A_{da}^{\beta_1} \tag{1}$$

$$W_{bf} = \alpha_0 Q_{bf}^{\beta_1} \tag{2}$$

where Qbf is the bankfull discharge, Ada is the drainage area, and α0 and β1 are parameters. Given values for α0 and β1 typically apply to a region with a homogeneous hydrologic response. Common practice is to refer to equation (1) as a regional curve relationship and equation (2) as a (downstream) hydraulic geometry relationship. A range of α0 and β1 values has been reported, and reported values have been compiled by Anderson et al. [2004], Knighton [1998], Dunne and Leopold [1978], and Soar and Thorne [2001]. The variability associated with (2) is less than the variability associated with (1) [Soar and Thorne, 2001]. Equation (1), however, has the advantage that values of Ada are easier to obtain than values of Qbf. More specifically, values of Ada can be obtained from digital elevation models whereas substantial field campaigns are required to obtain reliable Qbf estimates. Recently, He and Wilkerson [2011] proposed using

$$W_{bf} = \alpha_0 Q_2^{\beta_1} \tag{3}$$

where Q2 is the 2 year return-period discharge, as an alternative to or alongside of equation (1) on the grounds that (a) Q2 reflects the geologic characteristics of a basin, (b) values of Q2 can be obtained almost as easily as values of Ada in the United States, and (c) estimates of Wbf derived using equation (3) are in some cases significantly better than Wbf estimates obtained using equation (1). Related to regional curves are what Faustini et al. [2009] refer to as enhanced regional curves, which are models for predicting channel geometry that include drainage area and other landscape variables. Faustini et al., for example, identify the following factors as important for predicting bankfull width: bed material, ecoregion, mean annual precipitation, elevation, mean reach slope, and human disturbance.
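Regional curves of the power-law form described above are commonly fit by ordinary least squares after taking logarithms of both variables. The following is a minimal sketch of that step, not the procedure of any cited study; the function name `fit_power_law` and the sample values are hypothetical and serve only as illustration.

```python
import math

def fit_power_law(areas_km2, widths_m):
    """Fit W = a0 * A**b1 by least squares on ln(W) versus ln(A)."""
    xs = [math.log(a) for a in areas_km2]
    ys = [math.log(w) for w in widths_m]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b1 = sxy / sxx                         # exponent (slope in log space)
    a0 = math.exp(y_mean - b1 * x_mean)    # coefficient (exp of the log-space intercept)
    return a0, b1

# Hypothetical, noise-free values generated from W = 2.5 * A**0.4 (not data from this study).
areas = [1.0, 10.0, 100.0, 1000.0]
widths = [2.5 * a ** 0.4 for a in areas]
a0, b1 = fit_power_law(areas, widths)
print(f"a0 = {a0:.3f}, b1 = {b1:.3f}")
```

The same function applies unchanged when discharge (Qbf or Q2) replaces drainage area as the predictor.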

2. Study Objectives

Despite the large number of factors and interactions that underlie Wbf, this study is concerned with the relationship between Wbf and Ada for single-thread alluvial channels representing a broad range of geologic, terrestrial, climatic, and botanical environments. This study is a foundational study intended to facilitate (1) building a comprehensive mechanistic model for predicting Wbf and (2) learning about the dynamic watershed processes responsible for Wbf. The first objective of this study is to determine whether or not the ln(Wbf) versus ln(Ada) relationship is best represented by a simple linear model. The second objective is to determine if Ada alone is a sufficient independent variable for developing a general relationship for predicting Wbf that is applicable across a wide range of environments. For interpreting the results of this study, we briefly consider the effect of mean annual precipitation depth on the Wbf versus Ada relationship. The significance of this study is due to Wbf being a critical variable for many applications in hydraulic engineering, hydrology, fluvial geomorphology, stream ecology, and surface process modeling.

3. Data

The principal data set for this study was compiled from a collection of regional curve studies and other reports that represent 1018 sites near U.S. Geological Survey (USGS) gaging stations and is referred to hereafter as the USGS data set. To facilitate testing hypotheses, making comparisons, and validating the results of this study, we used two data sets compiled by the U.S. Environmental Protection Agency (EPA) and its state, tribal, and federal partners. That is, we used data from (1) the Wadeable Streams Assessment (WSA) and (2) the National Rivers and Streams Assessment (NRSA). The USGS and EPA data sets include sites located in a wide range of geologic and hydrologic settings. Data sets used for developing regional curves, in contrast, typically include only sites that drain geologically and hydrologically homogeneous basins. The following text describes relevant features of each data set.

3.1. USGS Data

The USGS data include NUSGS = 1018 sites at which measurements of Wbf and Ada and estimates of mean annual precipitation (P) were available. The data set is so named because every site included therein is located near an active or inactive USGS gaging station that is used for developing flood flow regression equations in one or more USGS flood-frequency reports (later we discuss why these features are important). Estimates of Wbf are from bankfull hydraulic geometry reports and regional curve studies published by the USGS and other sources. Estimates of Ada are from USGS flood-frequency reports and the USGS National Water Information System website (http://nwis.waterdata.usgs.gov/nwis). Estimates of P were derived from the PRISM (Parameter-elevation Regression on Independent Slopes Model) algorithm [Daly et al., 2008], are based on 30 year (1971–2000) averages, and were extracted from an 800 m grid (http://www.prism.oregonstate.edu/products/).

Rantz et al. [1982] discuss criteria for selecting a location for a stream-gaging station that will be used for measuring stage and discharge and developing a rating curve. Rantz et al. note that ideal features for a gage site include having a streambed that is not subject to scour and fill and having unchanging natural controls (e.g., a bedrock outcrop). Finally, Rantz et al. note that out of necessity, oftentimes a nonideal site must be accepted (e.g., a site with a moveable bed and erodible banks). Juracek and Fitzpatrick [2009] describe the geomorphic content of stream-gage information and limitations of stream-gage information for fluvial geomorphic investigations. On the one hand, Juracek and Fitzpatrick note that many USGS gaging sites have limited geomorphic value, as the channels do not readily adjust in response to changing conditions. On the other hand, they note that many USGS gages can provide information of geomorphic value since they are located at nonideal gage sites. Sites included in the USGS data set are nonideal gage sites that have alluvial and dynamically stable channels (i.e., channels in equilibrium with the variables that control channel dimensions) [Hey, 1997].

Assurance that sites in the USGS data set are suitable for this study is due to site selection criteria and site descriptions in the bankfull hydraulic geometry reports and regional curve studies from which Wbf estimates were taken. Specifically, data sources from which we extracted Wbf estimates contained site selection criteria declaring that the flow regime is stationary and not appreciably altered by human activities. Bankfull width data sources were also required to contain a statement expressing belief that the study sites are stable. For example, Mulvihill and Baldigo [2007] and Mulvihill et al. [2005] state that for assessing channel stability at actively gaged study sites, recent flow measurements and rating curves were inspected for evidence of scour, deposition, and frequent shifting of bed material. At inactive gage sites, they assessed channel stability by developing a stage-discharge curve and comparing it to the last known rating curve for the site in question. For additional assurance that only stable and alluvial channels are represented in the USGS data set, sites were also excluded if a Wbf data source revealed that (1) following Chaplin [2005], more than 30% of the drainage area was underlain by carbonate bedrock; (2) following Hammer [1972], more than 20% of the drainage area was urbanized; (3) the channel's bankfull stage was estimated using an assumed return period for bankfull discharge; (4) there was noted evidence of channelization prior to the bankfull estimate being made; or (5) the channel had flow on or near bedrock [e.g., Keaton et al., 2005].

Among data sources represented in the USGS data set, bankfull stage (and subsequently, Wbf) was identified by considering geomorphic indicators (i.e., floodplain breaks, inflection points, scour lines, depositional benches, point bars, vegetation, etc.) [McCandless and Everett, 2002; Tetra Tech EM Inc., 2004; Castro and Jackson, 2001]. In some studies, researchers verified the identified bankfull stage by reviewing the recurrence interval of the corresponding discharge [e.g., Castro and Jackson, 2001; Westergard et al., 2005]; sites were not excluded for this reason in spite of concern about this practice being tantamount to using an assumed return period for estimating bankfull discharge.

That each site in the USGS data set is proximate to an active or inactive USGS gaging station that is used for developing flood flow regression equations in one or more USGS flood-frequency reports is important because such gaging stations are selected in consideration of Interagency Advisory Committee on Water Data (IACWD) [1982] recommendations. The following IACWD recommendations are of particular interest: (1) flood flow records are not affected by climatic trends or cycles (i.e., are stationary) and (2) they are not appreciably altered by human activities (e.g., reservoir regulation, diversions, urbanization, channelization, land cover, or levee construction).

In the USGS data set, values of Wbf vary from 0.76 to 152 m, and the median value is 11.9 m. If we obtained more than one estimate of Wbf for a site, the arithmetic mean of the estimates was recorded in the data set. Values of Ada vary from 0.363 to 21,652 km2, and the median value is 140 km2. The median bed-material size (D50) was reported for 272 (27%) of the sites represented in the USGS data set. Reported values of D50 range from 0.06 to 222 mm, and the median value is 24.3 mm (Figure 1). Among sites for which D50 is known, 1% (2) are silt-bed channels (D50 ≤ 0.062 mm), 22% (60) are sand-bed channels (0.062 mm < D50 ≤ 2.0 mm), 60% (162) are gravel-bed channels (2.0 mm < D50 ≤ 64 mm), and 18% (48) are cobble-bed channels (64 mm < D50 ≤ 250 mm).
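The bed-material classes quoted above amount to a simple threshold rule on D50. A minimal sketch follows; the function name is hypothetical, and the label returned for D50 > 250 mm is an assumption, since the text defines no class above the cobble range.

```python
def bed_material_class(d50_mm):
    """Classify a channel by median bed-material size (mm) using the bounds quoted in the text."""
    if d50_mm <= 0.062:
        return "silt"
    if d50_mm <= 2.0:
        return "sand"
    if d50_mm <= 64.0:
        return "gravel"
    if d50_mm <= 250.0:
        return "cobble"
    return "coarser than cobble"  # assumption: no class is defined in the text above 250 mm

# The two median D50 values reported for the USGS and EPA-WSA data sets.
for d50 in (24.3, 4.96):
    print(d50, "mm:", bed_material_class(d50))
```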

Figure 1. Median bed-material size distribution for 272 (27%) of the sites in the USGS data set and 100% of the sites in the EPA-WSA data set. The median value of D50 for the USGS and EPA-WSA data sets is 24.3 and 4.96 mm, respectively.

Sites in the USGS data set span 29 of the 48 states in the continental United States (60%) (Figure 2) and a range of geologic, terrestrial, climatic, and botanical environments. Montana, with 444 sites (44% of NUSGS), contains more sites than any other state. Kansas, with 90 sites (8.8% of NUSGS), contains the second greatest number of sites. New Jersey and West Virginia each contain one site. The mean and median numbers of sites in each state are 35 and 14, respectively.

Figure 2. States represented in USGS data set. Parenthesized numbers indicate the number of sites located in respective states.

Ecological regions or ecoregions denote areas of general similarity in ecosystems and in the type, quality, and quantity of environmental resources. For delineating ecoregions, consideration is given to biotic and abiotic phenomena, e.g., geology, physiography, vegetation, climate, soils, land use, wildlife, and hydrology [Wiken, 1986; Omernik, 1987, 1995] (http://www.epa.gov/wed/pages/ecoregions.htm). In North America, there are 15 broad, Level I ecological regions. These highlight major ecological areas and provide a broad backdrop to the ecological mosaic of the continent [Commission for Environmental Cooperation Working Group, 1997]. Level II ecological regions are derived by subdividing Level I ecoregions and are useful for national and subcontinental overviews of ecological patterns. Of the 20 Level II ecoregions that lie partly or wholly within the continental United States, 17 (85%) are represented in the USGS data set.

The USGS data set includes sand, gravel, and cobble-bed streams and rivers for which Wbf varies across four orders of magnitude, and Ada varies across six orders of magnitude. In addition, sites included in the USGS data set span 60% of the states in the continental United States and 85% of the ecoregions in the continental United States. Therefore, we consider the USGS data set to be representative of single-thread alluvial channels in the continental United States.

3.2. The EPA's Wadeable Streams Assessment Data

In 2004 the EPA and its partners completed the WSA, a nationwide survey of streams and rivers that are shallow enough to sample without boats [U.S. Environmental Protection Agency (USEPA), 2006]. For conducting the WSA, standardized field and laboratory protocols for sampling were employed [Faustini et al., 2009; USEPA, 2006; Peck et al., 2006].

Considering specified criteria for stream and basin size, data completeness, and data quality, Faustini et al. [2009] screened the WSA data and then extracted a subset of the data that included 1588 sites. The screened WSA data set was used by Faustini et al. to develop and evaluate hydraulic geometry relationships for water resources regions [Seaber et al., 1987] and aggregated ecoregions [USEPA, 2006]. Among the 1588 sites selected by Faustini et al., 1152 had been selected for inclusion in the WSA through a spatially balanced random probability survey design [Stevens and Olsen, 1999, 2004; Herlihy et al., 2000]. A spatially balanced random probability survey design assures spatial balance as well as randomization and can produce a relatively even distribution of sample points that better reflect the nature of stream networks than a strictly random sample. The remaining 436 sites included in the data subset extracted by Faustini et al. had been handpicked for inclusion in the WSA.

In accordance with the data selection criteria established for this study, the screened WSA data set (provided by A. Herlihy, personal communication, 2009) was further screened. Consequently, of the 1588 sites included in the screened WSA data set, 20 sites were excluded for being more than 20% urbanized, and one site was excluded because the value for percent urbanization was missing. The resulting data set, referred to hereafter as the EPA-WSA or WSA data set, includes 1567 sites for which 0.97 m ≤ Wbf ≤ 72.7 m, 1.00 km2 ≤ Ada ≤ 9991 km2, and 14.0 cm ≤ P ≤ 345 cm. Reported values of D50 range from 0.0078 to 807 mm, and the median value of D50 is 4.96 mm (Figure 1).

3.3. The EPA's National Rivers and Streams Assessment Data

On 26 March 2013 the EPA released a new data set in conjunction with the 2008–2009 NRSA (http://water.epa.gov/type/watersheds/monitoring/aquaticsurvey_index.cfm). As such, the NRSA data were released near the end of this study, i.e., after most of the analyses performed for this study had been completed and after results for this study had been formulated. Nonetheless, the NRSA data set has great significance for this study since it represents streams and rivers across the United States and includes data for both wadeable and nonwadeable streams. Therefore, this manuscript was modified to incorporate NRSA data in selected analyses. The NRSA data are available from http://water.epa.gov/type/rsl/monitoring/riverssurvey/. In addition, NRSA project management, design, methods, and standards are documented in four companion documents: USEPA [2009a, 2009b, 2009c, 2010]. For the present study, a subset of the NRSA data set was provided by P. Kaufman (personal communication, 2013). Before filtering, the data set included 2020 sites.

The NRSA data were screened using criteria similar to those used for screening the USGS data (Table 1). There were unavoidable differences in how the data sets were screened, however, because the data sets contain different basin and channel information. For example, for a given site in the USGS data set, percent urbanization was assessed by estimating the percentage of the drainage basin that is urbanized. For sites in the NRSA data set, percent urbanization was assessed by identifying an area around the midpoint of the study site and then estimating the percentage of the enclosed area that is urbanized [USEPA, 2013]. In spite of percent urbanization being estimated in different ways for the USGS and NRSA data sets, NRSA sites were included only if percent urbanization was reported as being less than or equal to 20%. Different criteria were also applied to the USGS and NRSA data sets with respect to the variable “percentage of drainage area underlain by carbonate bedrock.” Since this variable was not recorded for sites represented in the NRSA data set, sites in the NRSA data set were not excluded for having more than 30% of their drainage area underlain by carbonate bedrock; this is in contrast with criteria used for excluding sites from the USGS data set. After filtering the NRSA data, 662 of the initial 2020 sites remained. The filtered NRSA data, referred to hereafter as the EPA-NRSA data, represent all 48 states in the continental United States and, consequently, represent a wide range of geologic and hydrologic settings and drainage areas. Respectively, the average, median, minimum, and maximum numbers of basins representing each state are 14, 15, 2, and 30.

Table 1. Screening Criteria for EPA-NRSA Data Set

Variable Name | Description (a) | Criteria for Including Channel
CONPATTERN | Channel pattern (SINGLE, ANASTOM, BRAIDED) | SINGLE
PCT_BDRK | Bed surface percent bedrock | Missing value or reported value ≤5%
PCT_URB | Percent urbanization within a 1 km2 area around the midpoint of the sampled stream segment | Missing value or reported value ≤20%
XBKF_W | Mean bankfull width (m) | Value not missing and reported value greater than zero
SQ_KM | Watershed area (km2) | Value not missing and reported value greater than zero
RMD_PHab | Site disturbance category (“R” = least disturbed (“reference”) sites; “M” = moderately disturbed sites; “D” = most disturbed sites) | Not equal to “D”
W1H_WALL | Distance weighted index of near-channel human disturbance (e.g., riparian and near-stream walls, dikes, and revetment) | Equal to zero
CONFEATURES | Constraining features | Not equal to “BEDROCK” or “HUMAN”

(a) Sources: USEPA [2009b, 2013] and metadata available at http://water.epa.gov/type/rsl/monitoring/riverssurvey/.
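The Table 1 criteria can be expressed as a single predicate applied to each site record. The sketch below is illustrative only: it assumes each site is a dictionary keyed by the NRSA variable names with `None` marking a missing value, and the function name and sample records are hypothetical. How missing values of RMD_PHab, W1H_WALL, and CONFEATURES were treated is not stated in Table 1; here a missing W1H_WALL fails the test and the other two pass, which is an assumption.

```python
def passes_screening(site):
    """Return True if a site record meets every Table 1 criterion."""
    def missing_or_at_most(value, limit):
        return value is None or value <= limit

    def present_and_positive(value):
        return value is not None and value > 0

    return (
        site.get("CONPATTERN") == "SINGLE"
        and missing_or_at_most(site.get("PCT_BDRK"), 5.0)
        and missing_or_at_most(site.get("PCT_URB"), 20.0)
        and present_and_positive(site.get("XBKF_W"))
        and present_and_positive(site.get("SQ_KM"))
        and site.get("RMD_PHab") != "D"
        and site.get("W1H_WALL") == 0  # assumption: a missing value fails this test
        and site.get("CONFEATURES") not in ("BEDROCK", "HUMAN")
    )

# Hypothetical records (not taken from the NRSA data).
kept = {"CONPATTERN": "SINGLE", "PCT_BDRK": None, "PCT_URB": 3.0, "XBKF_W": 12.4,
        "SQ_KM": 85.0, "RMD_PHab": "R", "W1H_WALL": 0, "CONFEATURES": "NONE"}
dropped = dict(kept, CONPATTERN="BRAIDED")
print(passes_screening(kept), passes_screening(dropped))
```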

3.4. Statistical Treatment of Data (Methods)

In the analyses that follow we consider linear and nonlinear regression models for relating ln(Wbf) and ln(Ada). Subsequently, we attempt to identify the best model among them. The following paragraphs present information about the models considered in this study. Local regression (LOESS) curves were used to identify the central pattern of the USGS, EPA-WSA, and EPA-NRSA data. Therefore, we discuss the principal features of LOESS curves. Goodness-of-fit measures (used for comparing candidate models), analysis of covariance (ANCOVA; used for comparing regression lines derived from different groups of data), and bootstrapping (used, in some cases, to estimate model parameter confidence intervals) are also discussed in the paragraphs that follow. Statistics presented herein were estimated using IBM SPSS Statistics (SPSS) [IBM SPSS Inc., 2012].

3.5. Models

This study is concerned with the relationship between the transformed variables Y = ln(Wbf) and X = ln(Ada). Following is a brief description of the six models considered herein for relating Y and X. One of the evaluated models is the simple linear regression model given by

$$\text{Model 1:}\quad Y = \beta_0 + \beta_1 X \tag{4}$$

where β0 and β1 are model parameters (note that Model 1 is a linearized form of equation (1)). A piecewise linear function is a function composed of a finite number of linear segments. The endpoint of each segment is called a knot. For relating Y and X the following two- and three-segment piecewise linear functions were considered:

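$$\text{Model 2:}\quad Y = \beta_0 + \beta_1 X + \beta_2 (X - X_{k,1})(X > X_{k,1}) \tag{5}$$

$$\text{Model 3:}\quad Y = \beta_0 + \beta_1 X + \beta_2 (X - X_{k,1})(X > X_{k,1}) + \beta_3 (X - X_{k,2})(X > X_{k,2}) \tag{6}$$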

where β0, β1, β2, β3, Xk,1, and Xk,2 are model parameters. More specifically, β0 and β1 are the intercept and slope coefficients for the first segment of the piecewise linear function; β2 and β3 are differential slope coefficients for the function's second and third segments, respectively; and Xk,1 and Xk,2 are knots. Note that Xk,1 < Xk,2. The logical expressions in Models 2 and 3 evaluate to 0 if false and 1 if true and thereby limit the influence of β2 and β3 to values of X that are greater than Xk,1 and Xk,2, respectively.
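The role of the logical expressions can be illustrated with a short sketch. It assumes that each differential slope multiplies the distance past its knot, i.e., terms of the form β2 (X − Xk,1)(X > Xk,1), which keeps the function continuous at the knots. The function names and all parameter values below are hypothetical and are not estimates from this study.

```python
def model_2(x, b0, b1, b2, xk1):
    """Two-segment piecewise linear function; the comparison evaluates to 0 or 1."""
    return b0 + b1 * x + b2 * (x - xk1) * (x > xk1)

def model_3(x, b0, b1, b2, b3, xk1, xk2):
    """Three-segment piecewise linear function with knots xk1 < xk2."""
    return model_2(x, b0, b1, b2, xk1) + b3 * (x - xk2) * (x > xk2)

# Hypothetical parameters: segment slopes are 0.2, then 0.2 + 0.3, then 0.2 + 0.3 - 0.1.
params = dict(b0=1.0, b1=0.2, b2=0.3, b3=-0.1, xk1=2.0, xk2=6.0)
for x in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(x, round(model_3(x, **params), 3))
```

Because β2 and β3 are differential slopes, the slope of the second segment is β1 + β2 and the slope of the third segment is β1 + β2 + β3.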

In addition to the linear functions described above, the following nonlinear functions were evaluated for relating Y and X:

display math(7)
display math(8)
display math(9)

where β1, β2, β3, and β4 are parameters. Equations (7), (8), and (9) represent sigmoidal (or S-shaped) curves: equation (7) is Gompertz's model, equation (8) is the logistic model, and equation (9) is Richards model. Whereas the logistic model is symmetric about its point of inflection (i.e., X = −β2/β3, Y = β1/2), Gompertz's model is asymmetric about its inflection point, which also occurs at X = −β2/β3. Richards model extends the logistic model by including β4. For developing a foundational model for predicting Wbf, equations (7)–(9) are appealing because Y → 0 as X → −∞. In addition, the models are widely applied in many fields [Draper and Smith, 1998; Ratkowsky, 1990; Graybill and Iyer, 1994]. Equations (7)–(9) each have a finite upper asymptote equal to β1. This observation raises concern about the suitability of these models for relating ln(Wbf) and ln(Ada) since there is no widely accepted physical basis for why there should be an upper limit to Wbf for single-thread alluvial channels.

3.6. LOESS Curves

LOESS, which stands for local regression [Cleveland, 1979; Cleveland et al., 1988; Cleveland and Grosse, 1991], is a nonparametric smoothing technique used in fitting a curve or surface to data [Freund et al., 2006; Cohen, 1999]. LOESS modeling provides greater flexibility than traditional modeling techniques because it does not require knowing or making an assumption about the mathematical function that relates the variables of concern [Cohen, 1999]. As Cleveland [1979] and Helsel and Hirsch [2002] note, a LOESS curve is a useful exploratory tool for discerning the form of a relationship between a dependent variable and one or more predictors. One of the trade-offs for these features is increased computation [Guthrie, 2012]. Moreover, to produce a good model, LOESS requires a large, densely sampled data set.

LOESS curves are derived from locally weighted polynomial regression [Guthrie, 2012]. That is, at each point in the data set a low-degree polynomial is fit to a localized subset of the data. The value of the regression function for the current point is obtained by evaluating the local polynomial using the independent-variable value(s) for the current point. In SPSS, a LOESS curve can be added to a scatterplot after specifying the proportion of data points to be used in calculating each local polynomial and a kernel (or weighting) function that specifies which data points in relation to the current point receive more weight. For calculating a local polynomial in SPSS, 50% is the default value for the smoothing parameter. According to Cleveland [1979], choosing a value in the range of 20%–80% should serve most purposes and a value of 50% is a reasonable starting value when there is no clear idea of what is needed. The default kernel function in SPSS is the Epanechnikov function. Herein, LOESS curves are used for exploratory analyses. The curves were generated in SPSS after (1) selecting 30% as the smoothing parameter and (2) choosing the Epanechnikov function as the kernel function.
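The local fitting procedure described above can be sketched in a few lines. This is not the SPSS implementation, whose internal details are not given here; it assumes a local polynomial of degree 1, a bandwidth equal to the distance to the k-th nearest point (k being the chosen fraction of the data), and Epanechnikov weights. The function names and the sample data are hypothetical.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| < 1, otherwise 0."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def loess_at(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit evaluated at x0 using the nearest `frac` of the points."""
    n = len(xs)
    k = min(n, max(2, math.ceil(frac * n)))
    h = sorted(abs(x - x0) for x in xs)[k - 1]      # bandwidth: distance to k-th nearest point
    ws = [epanechnikov((x - x0) / h) for x in xs] if h > 0.0 else [0.0] * n
    sw = sum(ws)
    if sw == 0.0:                                   # all neighbors lie on the window edge
        near = [y for x, y in zip(xs, ys) if abs(x - x0) <= h]
        return sum(near) / len(near)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    if sxx < 1e-12:                                 # weighted points share one x value
        return y_bar
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    return y_bar + (sxy / sxx) * (x0 - x_bar)

# Hypothetical data with a gentle bend; frac=0.3 mirrors the 30% setting used in this study.
xs = [0.5 * i for i in range(30)]
ys = [1.0 + 0.4 * x + 0.02 * (x - 7.0) ** 2 for x in xs]
curve = [loess_at(x, xs, ys, frac=0.3) for x in xs]
print([round(v, 2) for v in curve[:5]])
```

A smaller fraction follows local structure more closely at the cost of a rougher curve, which is the trade-off behind the 20%–80% guidance attributed to Cleveland [1979].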

3.7. Best Model Selection Criteria

Three goodness-of-fit measures were used for comparing models: the standard error of estimate (sY|X), the coefficient of determination (r2), and Akaike's information criterion (AIC) scores. The sY|X is a measure of variability in residuals from a regression model, and r2 is the proportion of variance in a dependent variable that is explained by an independent variable [Field, 2009]. Smaller values of sY|X indicate less variability between a model and the data. Larger values of r2 indicate that the independent variable explains a greater proportion of variance in the dependent variable. The AIC methodology yields an index for choosing which among competing models is the most parsimonious, that is, which model best explains the data with a minimum number of explanatory variables [Crawley, 2002]. AIC scores can be computed following Burnham and Anderson [2002]:

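$$\mathrm{AIC} = n \ln\left(\frac{\mathrm{SSR}}{n}\right) + 2k \tag{10}$$

where n is the number of data points, SSR is the sum of squared residuals, and k is the number of estimated parameters, including the intercept and the residual variance (which is estimated from SSR). Thus, for example, k = 3 for equation (4).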

When comparing models using the AIC methodology, the “best model” is that model which yields the smallest AIC score. Although AIC scores are useful for comparing regression models, the scores are based on data with inherent sampling error. A consequence of sampling error is uncertainty about whether or not the best model is substantially better than, for example, the second best model. To address this uncertainty we computed the difference between AIC scores for the best model and each of the remaining models (ΔAIC) and then used ΔAIC to judge the best model. Following Burnham and Anderson [2002], for ΔAIC > 10 the best model was judged as being clearly superior, for 2 < ΔAIC ≤ 10 the superiority of the best model was judged as being indefinite, and for ΔAIC ≤ 2 the best model was judged to be tied with the compared model.
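The ΔAIC decision rule can be sketched as follows. The sketch assumes the least squares form of AIC given by Burnham and Anderson [2002], n ln(SSR/n) + 2k, with k counting the residual variance as a parameter. The function names, SSR values, and parameter counts below are hypothetical and are not results from this study.

```python
import math

def aic_least_squares(ssr, n, k):
    """AIC for a least squares fit: n * ln(SSR / n) + 2k."""
    return n * math.log(ssr / n) + 2 * k

def compare_to_best(aic_scores):
    """Return the lowest-AIC model and a verdict for every other model based on delta AIC."""
    best = min(aic_scores, key=aic_scores.get)
    verdicts = {}
    for name, score in aic_scores.items():
        if name == best:
            continue
        delta = score - aic_scores[best]
        if delta > 10:
            verdicts[name] = "best model clearly superior"
        elif delta > 2:
            verdicts[name] = "superiority of best model indefinite"
        else:
            verdicts[name] = "tied with best model"
    return best, verdicts

# Hypothetical SSR values for n = 1018 sites; k = 3 for a straight line and
# k = 7 for a three-segment line (4 coefficients, 2 knots, residual variance).
n = 1018
scores = {
    "Model 1": aic_least_squares(300.0, n, 3),
    "Model 3": aic_least_squares(280.0, n, 7),
}
best, verdicts = compare_to_best(scores)
print(best, verdicts)
```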

3.8. ANCOVA Analysis

A one-way ANCOVA can be used to determine if the relationship between a dependent variable (Y) and an independent variable (X) varies between two or more groups or levels of a categorical independent variable (Z). An ANCOVA proceeds by testing for (1) equality of regression line slopes among the subsets of data and (2) equal intercepts among the subsets of data if the slopes are found to be equal [Conover and Iman, 1982]. For example, herein we consider whether the relationship between Y = ln(Wbf) and X = ln(Ada) is the same or different across two independent data sets: the USGS data set and the EPA-WSA data set. Given the two data sets, an ANCOVA is performed using

$$Y = \beta_0 + \beta_1 X + \beta_2 Z + \beta_3 X Z \tag{11}$$

$$Y = \beta_0 + \beta_1 X + \beta_2 Z \tag{12}$$

where Z is the dummy variable assigned a value of 0 for sites in the USGS data set and 1 for sites in the EPA-WSA data set, β0 is the intercept coefficient for USGS data, β1 is the slope coefficient for USGS data, β2 is the differential intercept coefficient for EPA-WSA data, and β3 is the differential slope coefficient for EPA-WSA data. If the value of β3 in equation (11) is not significantly different from zero then the USGS and EPA-WSA regression lines would be deemed to have equal slopes. Also, in this case, β2 in equation (12) would be evaluated. If β2 in equation (12) is not significantly different from zero then the USGS and EPA-WSA regression lines would be considered to have the same intercept value, and it would be concluded that the USGS and EPA-WSA regression lines are coincident. In this case, equation (12) reduces to

$$Y = \beta_0 + \beta_1 X \tag{13}$$

Two assumptions for a valid ANCOVA are that the regression residuals for each data subset are normal and have equal variance. The Shapiro-Wilk test was used to test for normality of regression residuals, and Levene's test was used to test for homogeneity of variance.
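The slope comparison at the heart of the ANCOVA can be sketched without a statistics package. A model with a dummy variable Z and an X·Z interaction gives each group its own intercept and slope, so its least squares coefficients equal those from separate fits to each group: β2 and β3 are the differences in intercept and slope. Under the equal-variance assumption, the t statistic for β3 uses the pooled residual variance. The function names and data below are hypothetical; converting t to a p value requires a t distribution with the returned degrees of freedom and is left out of this sketch.

```python
import math

def ols(xs, ys):
    """Simple least squares fit; returns intercept, slope, Sxx, and residual sum of squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    intercept = y_mean - slope * x_mean
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sxx, ssr

def compare_slopes(x_a, y_a, x_b, y_b):
    """Differential intercept and slope of group B (Z = 1) relative to group A (Z = 0)."""
    b0, b1, sxx_a, ssr_a = ols(x_a, y_a)
    c0, c1, sxx_b, ssr_b = ols(x_b, y_b)
    beta2, beta3 = c0 - b0, c1 - b1
    df = len(x_a) + len(x_b) - 4                       # four fitted coefficients
    pooled_var = (ssr_a + ssr_b) / df
    t_beta3 = beta3 / math.sqrt(pooled_var * (1.0 / sxx_a + 1.0 / sxx_b))
    return {"beta0": b0, "beta1": b1, "beta2": beta2, "beta3": beta3,
            "t_beta3": t_beta3, "df": df}

# Hypothetical groups with true slopes 0.4 and 0.3 and a small alternating error term.
x = [0.5 * i for i in range(20)]
noise = [0.05 if i % 2 == 0 else -0.05 for i in range(20)]
y_a = [1.0 + 0.4 * xi + e for xi, e in zip(x, noise)]
y_b = [1.5 + 0.3 * xi - e for xi, e in zip(x, noise)]
result = compare_slopes(x, y_a, x, y_b)
print({key: round(value, 3) for key, value in result.items()})
```

If the slope difference is not significant, the same pooled-variance idea applies to the intercept test in the reduced, equal-slope model.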

3.9. Bootstrapping

For making reliable inferential statements about least squares regression parameters, residuals should be normally distributed as N(0, σ2). If this parametric assumption holds, hypotheses about parameters can be tested, and confidence intervals for the parameters can be constructed using the z or t table [Mooney and Duval, 1993]. If we assume the residuals are normal when the residuals are in fact not normal, parametric tests of hypotheses and confidence intervals may not be useful. Bootstrapping provides a means for overcoming this problem [Efron, 1979, 1981; Freedman, 1981; Freedman and Peters, 1984]. In statistics, bootstrapping is a method for assigning measures of accuracy to sample estimates [Efron and Tibshirani, 1994]. Bootstrapping uses repeated samples drawn with replacement from the original data set. The samples are used to generate an empirical estimate of the entire sampling distribution of a statistic. With reference to least squares regression, bootstrapping provides a way to estimate confidence intervals and perform significance tests without requiring an assumption about how the residuals are distributed. Bootstrap confidence intervals, therefore, are a useful alternative to parametric estimates when the assumptions of underlying parametric statistics are in doubt. In section 4, where regression residuals were nonnormal, bootstrap confidence bounds were computed for model coefficients and, in turn, used for hypothesis testing.
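A bootstrap confidence interval for a regression slope can be sketched as follows. The variant used in SPSS for this study is not stated, so the sketch assumes case (pairs) resampling with a percentile interval; the function names, the number of resamples, and the data are hypothetical.

```python
import random

def ols_slope(xs, ys):
    """Least squares slope; returns None for a degenerate sample with no spread in x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0.0:
        return None
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx

def bootstrap_slope_ci(xs, ys, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval for the slope from resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]      # sample cases with replacement
        slope = ols_slope([xs[i] for i in idx], [ys[i] for i in idx])
        if slope is not None:
            slopes.append(slope)
    slopes.sort()
    lower = slopes[int(n_boot * (1.0 - level) / 2.0)]
    upper = slopes[int(n_boot * (1.0 + level) / 2.0) - 1]
    return lower, upper

# Hypothetical data with skewed (nonnormal) residuals around a true slope of 0.4.
data_rng = random.Random(1)
xs = [0.2 * i for i in range(60)]
ys = [1.0 + 0.4 * x + 0.2 * (data_rng.expovariate(1.0) - 1.0) for x in xs]
lower, upper = bootstrap_slope_ci(xs, ys)
print(f"95% bootstrap CI for the slope: [{lower:.3f}, {upper:.3f}]")
```

A parameter is judged significantly different from zero at the chosen level when its bootstrap interval excludes zero.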

4. Results

One result from this study indicates that rather than a simple linear model or a nonlinear model, a linear-piecewise model with three sections is the best model for relating Wbf and Ada when considering data representing a wide range of geologic and hydrologic settings and drainage areas. Another result is that Ada alone is not sufficient for developing a general relationship for predicting Wbf that is applicable across a wide range of environments. Presented below are the analyses that support these results and discussion about how these results should be interpreted.

4.1. Analysis of USGS Data

A linear regression line was fit to the USGS data (Figure 3a); the equation for the line is

$$\ln(W_{bf}) = 0.700 + 0.365\,\ln(A_{da}) \tag{14}$$

Figure 3.

(a) Model 1 and Model 3 plotted with USGS data; (b) Model 1 and USGS LOESS curve (red dashed line); (c) EPA-WSA data, LOESS curve fit to EPA-WSA data, and Model 3; (d) regression lines for small, medium, and large-size basins in USGS and EPA-WSA data sets; (e) EPA-NRSA data, LOESS curve fit to EPA-NRSA data set, and Model 3; and (f) regression lines for small, medium, and large-size basins in USGS and EPA-NRSA data sets.

The coefficient of determination (r2) is 0.59. A histogram of standardized residuals from Model 1 (Figure 4) reveals that the residuals are bimodal. Consistent with Figure 4, results from a Shapiro-Wilk's test for normality, W(1,018) = 0.989, p < 0.001, indicate that the residuals have a non-Gaussian distribution. Since the Model 1 residuals are nonnormal, bootstrap confidence intervals were computed for the parameters. The bootstrap 95% confidence intervals indicate that parameters for Model 1 are significantly different from zero.

Figure 4.

Normal curve and histogram of standardized residuals from regression of ln(Wbf) versus ln(Ada) for USGS data. The residuals appear to be bimodal.

A LOESS curve indicates the structure of data without requiring specification of a parametric function. Thus, to the extent that Model 1 is a good fit for the data, it will coincide with a LOESS curve fit to the data. A LOESS curve (USGS LOESS curve) is plotted along with Model 1 in Figure 3b. The LOESS curve intersects Model 1 at three points and thereby raises the question: is there a better model than Model 1 for relating ln(Wbf) and ln(Ada)? On the one hand, the differences between the USGS LOESS curve and Model 1 do not warrant any concern because (1) linear regression models have been used extensively for relating ln(Ada) and ln(Wbf) in regional hydraulic geometry studies, and (2) the amount of scatter in Figure 3a outweighs differences between the USGS LOESS curve and Model 1. On the other hand, the differences do merit attention because one aim for this study is to establish a foundation for building a mechanistic model for predicting channel width. With hopes of finding a regression model that closely follows the USGS LOESS curve, a comparative analysis was performed to determine whether the ln(Wbf) versus ln(Ada) relationship is best expressed using a simple linear model or another model.
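The comparison between a straight-line fit and a LOESS curve can be sketched as follows. This is a simplified local linear smoother with tricube weights and no robustness iterations; the span (`frac`), the function name `loess_fit`, and the synthetic data are assumptions, and the settings used for the LOESS curves in Figure 3 are not stated in the text.

```python
import numpy as np

def loess_fit(x, y, x_eval, frac=0.5):
    """Locally weighted linear regression with tricube weights."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    k = max(3, int(np.ceil(frac * len(x))))   # neighbours used per local fit
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        idx = np.argsort(dist)[:k]            # k nearest observations
        h = dist[idx].max()                   # local bandwidth
        # Tricube weights decay smoothly to zero at the neighbourhood edge.
        w = (1.0 - (dist[idx] / h) ** 3) ** 3 if h > 0 else np.ones(len(idx))
        sw = np.sqrt(w)
        design = np.column_stack([np.ones(len(idx)), x[idx] - x0])
        coef = np.linalg.lstsq(design * sw[:, None], y[idx] * sw, rcond=None)[0]
        fitted[j] = coef[0]                   # local intercept = fit at x0
    return fitted

# Synthetic ln(Ada), ln(Wbf) data with two bends (assumed values).
rng = np.random.default_rng(1)
ln_area = np.sort(rng.uniform(-0.7, 10.0, 400))
ln_width = (0.78 + 0.19 * ln_area
            + 0.27 * np.maximum(ln_area - 1.6, 0.0)
            - 0.28 * np.maximum(ln_area - 5.8, 0.0)
            + rng.normal(0.0, 0.4, 400))

slope, intercept = np.polyfit(ln_area, ln_width, 1)   # simple linear model
grid = np.linspace(ln_area.min(), ln_area.max(), 50)
gap = loess_fit(ln_area, ln_width, grid, frac=0.4) - (intercept + slope * grid)
crossings = int(np.sum(np.diff(np.sign(gap)) != 0))   # LOESS/line intersections
print("number of crossings:", crossings)
```

Systematic runs of positive and negative `gap` values, rather than random sign changes, are what suggest that a straight line misses structure in the data.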

For identifying the best model for relating ln(Wbf) to ln(Ada), five additional models were considered (Table 2). Among the considered models are two linear-piecewise models (one with two sections and one with three sections) and three sigmoidal models (the Gompertz model, the logistic function, and Richards curve). Each of the sigmoidal models has an asymptotic minimum and maximum value; the minimum value is zero, and the maximum value is estimated as part of the model fitting analysis. That the sigmoidal curves have a maximum value is noteworthy because there is no widely accepted physical basis for arguing that there should be a maximum value for ln(Wbf).

Table 2. Regression Models Evaluated and Model Parameters for USGS Data

| Model No. | Model name^a | β0^b | β1^b | β2^b | β3^b | Xk1^b | Xk2^b | Std. error of estimate^c | r2^c | AIC^c |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Linear | 0.700 | 0.365 | | | | | 0.443 | 0.594 | −828.2 |
| 2 | 2-piece linear | 0.497 | 0.429 | −0.233 | | 5.894 | | 0.428 | 0.609 | −860.7 |
| 3 | 3-piece linear | 0.781 | 0.191 | 0.271 | −0.279 | 1.600 | 5.820 | 0.425 | 0.612 | −865.9 |
| 4 | Gompertz | 4.264 | −0.741 | 0.294 | | | | 0.429 | 0.607 | −859.6 |
| 5 | Logistic | 3.808 | −1.638 | 0.494 | | | | 0.427 | 0.609 | −863.3 |
| 6 | Richards | 3.698 | −2.518 | 0.593 | 0.665 | | | 0.428 | 0.609 | −861.8 |

a. Y = ln(Wbf), X = ln(Ada), Xk,i is the ith knot, βi is the ith model parameter.
b. Model parameter estimates. Bootstrap confidence intervals were estimated for the model parameters because each model's residuals were nonnormal and bimodal in appearance; all model parameters are significantly different from zero at the 95% confidence level. The confidence intervals were used for testing hypotheses about the model coefficients.
c. Goodness-of-fit measures. sY|X is the standard error of estimate, r2 is the coefficient of determination, and AIC is Akaike's information criterion.

The two-segment linear-piecewise model (Model 2; Figure 5a) indicates that the Wbf versus Ada relationship has two significantly different phases (p ≤ 0.05): one for basins where ln(Ada) is less than 5.89 (Ada < 363 km2) and a second for basins where ln(Ada) is greater than 5.89. Likewise, the three-segment linear-piecewise model (Model 3; Figure 5b) indicates that the Wbf versus Ada relationship has three phases: one for basins where ln(Ada) ≤ 1.60 (i.e., Ada ≤ 4.95 km2), a second where 1.60 < ln(Ada) ≤ 5.82, and a third where ln(Ada) > 5.82 (i.e., Ada > 337 km2). The Gompertz model (Model 4; Figure 5c), the logistic function (Model 5; Figure 5d), and Richards curve (Model 6) indicate that the slope of the Wbf versus Ada relationship changes continuously. Referring to Model 6, the bootstrap 95% confidence interval for β3 is C(−0.107 ≤ β3 ≤ 1.438) = 0.95. This result indicates that β3 is not significantly different from 1.0, and consequently, Model 6 reduces to Model 5. Given this, Model 6 is not given further consideration in this comparative analysis.

Figure 5.

The red dashed line in Figures 5a–5d is a LOESS curve that was fit to the USGS data set. The solid lines are plots of the (a) two and (b) three-segment linear-piecewise models, (c) Gompertz model, and (d) logistic model when fit to the USGS data set.

Considering the sY|X, r2, and AIC values for Models 1–5 we conclude that Model 3 is the best model, Model 2 ranks second, and Model 1 ranks fifth. Following Burnham and Anderson [2002], Models 2–5 are clearly superior to Model 1 since the difference between AIC scores (ΔAIC) is greater than 10. Compared to Models 2, 4 and 5, the superiority of Model 3 is indefinite since 2 < ΔAIC ≤ 10. Model 3 is adopted for this study because it yields the smallest SSR and AIC and the largest r2; and because adopting Model 3, rather than one of the nonlinear models, leaves open the possibility of using a suite of statistics that are available for evaluating linear regression models. The following equation presents Model 3 in a more user-friendly form:
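The ΔAIC comparison can be reproduced from the AIC values listed in Table 2. In the short Python sketch below, the thresholds (ΔAIC > 10 and 2 < ΔAIC ≤ 10) are the ones quoted in the text; the label used for ΔAIC ≤ 2 and the function names are assumptions added only for illustration.

```python
# AIC values for Models 1-5 as listed in Table 2.
AIC = {1: -828.2, 2: -860.7, 3: -865.9, 4: -859.6, 5: -863.3}

def delta_aic(aic_by_model):
    """Difference between each model's AIC and the smallest AIC."""
    best = min(aic_by_model.values())
    return {m: a - best for m, a in aic_by_model.items()}

def evidence_label(delta):
    """Classify a model relative to the best one using the quoted thresholds."""
    if delta > 10:
        return "clearly inferior"
    if delta > 2:
        return "indefinite"
    return "not distinguished"   # assumed label; this case is not discussed

deltas = delta_aic(AIC)
labels = {m: evidence_label(d) for m, d in deltas.items()}
for m in sorted(AIC):
    print(f"Model {m}: dAIC = {deltas[m]:.1f} ({labels[m]})")
```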

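$$
\ln(W_{bf}) =
\begin{cases}
0.781 + 0.191\,\ln(A_{da}), & \ln(A_{da}) \le 1.60 \\
0.347 + 0.462\,\ln(A_{da}), & 1.60 < \ln(A_{da}) \le 5.82 \\
1.971 + 0.183\,\ln(A_{da}), & \ln(A_{da}) > 5.82
\end{cases}
\tag{15}
$$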
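For readers who want to evaluate Model 3 numerically, the Python sketch below uses the parameter estimates and knots from Table 2. It assumes a continuous hinge parameterization, Y = β0 + β1X + β2·max(X − Xk1, 0) + β3·max(X − Xk2, 0), in which β2 and β3 are changes in slope at the knots; the original model expression is not reproduced in this copy, so that form and the function names are assumptions. Under this assumption the segment slopes are 0.191, 0.462, and 0.183.

```python
import numpy as np

# Model 3 parameter estimates and knots from Table 2.
B0, B1, B2, B3 = 0.781, 0.191, 0.271, -0.279
XK1, XK2 = 1.600, 5.820

def model3_ln_width(ln_area):
    """ln(Wbf) from ln(Ada), assuming slope changes of B2 and B3 at the knots."""
    x = np.asarray(ln_area, dtype=float)
    return (B0 + B1 * x
            + B2 * np.maximum(x - XK1, 0.0)
            + B3 * np.maximum(x - XK2, 0.0))

def model3_width_m(area_km2):
    """Bankfull width (m) from drainage area (km2)."""
    return np.exp(model3_ln_width(np.log(area_km2)))

for area in (1.0, 4.95, 100.0, 337.0, 5000.0):
    print(f"Ada = {area:8.2f} km2 -> Wbf = {float(model3_width_m(area)):.2f} m")
```

At the first knot (Ada = 4.95 km2) this sketch gives a width of about 3 m.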

4.2. Model 3 Reliability: Comparison of Regression Lines for USGS and EPA-WSA Data

Data used by Faustini et al. [2009] (EPA-WSA data) were used to assess the reliability of Model 3. For broadly assessing the EPA-WSA data, the data are plotted in Figure 3c along with a fitted LOESS curve and Model 3. ANCOVA was used to formally compare the ln(Wbf) versus ln(Ada) regression lines derived from the USGS and EPA-WSA data, that is, to determine if the regression lines are coincident. Before conducting the ANCOVA, the data were split into three basin-size subgroups: small, medium, and large-size basins that, respectively, correspond to Ada < 4.95 km2, 4.95 km2 ≤ Ada < 337 km2, and Ada ≥ 337 km2. The knots for Model 3 formed the basis for splitting the data. A separate ANCOVA was conducted for each basin-size subgroup. Bootstrap confidence intervals were computed for the ANCOVA model parameters, and Levene's test, based on the median value of the residuals, was used to test for equality of variance in the USGS and EPA-WSA data.

To facilitate interpreting the ANCOVA results, Figure 3d shows regression lines for small, medium, and large-size basins in (1) the USGS data set and (2) the EPA-WSA data set. For small-size basins, ANCOVA revealed that regression lines for the USGS and EPA-WSA data are parallel [C(−0.098 ≤ β3 ≤ 0.390) = 0.95] but not coincident [C(0.053 ≤ β2 ≤ 0.331) = 0.95]. In this case, the intercept for the EPA-WSA regression line is significantly greater than the intercept for the USGS regression line. For medium-size basins, the USGS and EPA-WSA regression lines are not parallel [C(−0.224 ≤ β3 ≤ −0.115) = 0.95]. Finally, for large-size basins the regression lines for the EPA-WSA and USGS data are parallel [C(−0.135 ≤ β3 ≤ 0.056) = 0.95], and the EPA-WSA regression line has a significantly smaller intercept than the USGS regression line [C(−0.698 ≤ β2 ≤ −0.517) = 0.95]. For both small and large-size basins, Levene's test indicated that the homogeneity of variance assumption was violated [W(1,252) = 7.691, p = 0.006; and W(1,638) = 9.852, p = 0.002; respectively].

With regard to large-size basins, we attribute the smaller intercept for the EPA-WSA data to the EPA limiting its data collection efforts to wadeable streams; wadeability is not a site selection criterion for the USGS. Later, additional results are presented that also support this interpretation. Regarding small and medium-size basins, there is no clear reason why the USGS and EPA-WSA data should differ in either slope or intercept. A hypothesis that might explain the difference is that one or more extraneous factors (variables that are not of interest in a study but could influence the dependent variable) [Leech et al., 2011] have substantial influence on the Wbf versus Ada relationship for small and medium-size basins. Evidence supporting this hypothesis would indicate that Model 3 is not reliable.

For testing the hypothesis stated above, mean annual precipitation (P) is considered an extraneous variable. Precipitation is chosen over other candidate extraneous variables (e.g., bank vegetation type and density, channel bed slope, sediment load, and size) for the simple reason that estimates of P are available for every site represented in the USGS and EPA-WSA data sets, whereas measurements for characterizing other variables are not. Histograms of ln(P) are presented in Figure 6 for small and medium-size basins in the USGS and EPA-WSA data sets. It is apparent from Figure 6 that the number of medium-size basins is much greater than the number of small-size basins. Figures 6a and 6b show that the USGS data have a bimodal distribution, and Figures 6c and 6d indicate that the EPA-WSA data are right-skewed and platykurtic. In essence, the ln(P) distributions are quite different for the two data sets. It follows from the stated hypothesis that the USGS and EPA-WSA data sets should yield different ln(Wbf) versus ln(Ada) relationships for small and medium-size basins because the data sets have different ln(P) distributions.

Figure 6.

Histograms showing distribution of ln(P) values for medium-size basins (1.60 ≤ ln(Ada) < 5.82; Ada in units of km2) in USGS and EPA-WSA data.

For testing the hypothesis that ln(P) is a significant factor in the ln(Wbf) versus ln(Ada) relationship, each site in the USGS and EPA-WSA data sets was assigned to a precipitation class (PClass) based on ln(P). Then, considering sites assigned to a common PClass, ANCOVA was performed to determine whether or not the USGS and EPA-WSA data yield ln(Wbf) versus ln(Ada) regression lines that coincide. Presented in Table 3 are ln(P) intervals for each PClass, and the number of USGS and EPA-WSA sites represented in each PClass (NUSGS and NWSA, respectively). Only medium-size basins as defined by Model 3 (4.95 km2 ≤ Ada < 337 km2 or 1.60 ≤ ln(Ada) < 5.82) were considered in this analysis. Small-size basins were not considered because among candidate ln(P) intervals (Figures 6a and 6c) either NUSGS or NWSA was deemed too small to yield reliable ANCOVA results.
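The class assignment can be sketched in Python as follows, assuming the sixteen 0.1-wide ln(P) intervals listed in Table 3 (3.4 ≤ ln(P) < 5.0, with P in cm). The function name `assign_pclass` and the use of 0 to flag sites outside the tabulated range are assumptions made for illustration.

```python
import numpy as np

# Interval edges for PClass 1-16: [3.4, 3.5), [3.5, 3.6), ..., [4.9, 5.0).
EDGES = np.linspace(3.4, 5.0, 17)

def assign_pclass(p_cm):
    """Return the PClass number (1-16) for mean annual precipitation in cm.

    Sites whose ln(P) falls outside 3.4 <= ln(P) < 5.0 are returned as 0.
    """
    ln_p = np.log(np.asarray(p_cm, dtype=float))
    # np.digitize gives i such that EDGES[i-1] <= ln_p < EDGES[i].
    pclass = np.digitize(ln_p, EDGES)
    return np.where((pclass >= 1) & (pclass <= 16), pclass, 0)

# Hypothetical precipitation values (cm), not taken from the data sets.
print(assign_pclass([31.0, 70.0, 100.0, 20.0, 200.0]))
```

Sites sharing a class number from the two data sets would then be passed together to the ANCOVA.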

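Table 3. Assigned Precipitation Class (PClass) Numbers and Intervals and ANCOVA Results for Medium-Size Basins^a

Column groups: interaction-term coefficient, β3, in equation (11), tested with F(1, NUSGS + NWSA − 4); data-source coefficient, β2, in equation (12), tested with F(1, NUSGS + NWSA − 3); Shapiro-Wilk's test for normality of residuals (WUSGS, WWSA); Levene's test for equal residual variance, F(1, NUSGS + NWSA − 2).

| PClass Number | PClass Interval (P in Units of cm) | NUSGS | NWSA | β3: F | β3: Sig. | β2: F | β2: Sig. | WUSGS | Sig. | WWSA | Sig. | Levene F | Sig. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.4 ≤ ln(P) < 3.5 | 43 | 14 | 0.45 | 0.50 | 3.29 | 0.08 | 0.96 | 0.13 | 0.94 | 0.37 | 4.16 | 0.05 |
| 2 | 3.5 ≤ ln(P) < 3.6 | 63 | 28 | 0.09 | 0.76 | 0.00 | 0.96 | 0.96 | 0.04 | 0.98 | 0.78 | 1.46 | 0.23 |
| 3 | 3.6 ≤ ln(P) < 3.7 | 45 | 45 | 3.16 | 0.08 | 0.01 | 0.92 | 0.97 | 0.38 | 0.98 | 0.44 | 0.13 | 0.72 |
| 4 | 3.7 ≤ ln(P) < 3.8 | 34 | 62 | 1.69 | 0.20 | 1.76 | 0.19 | 0.87 | <0.01 | 0.98 | 0.48 | 1.50 | 0.22 |
| 5 | 3.8 ≤ ln(P) < 3.9 | 30 | 55 | 0.87 | 0.35 | 2.40 | 0.13 | 0.95 | 0.15 | 0.99 | 0.78 | 0.34 | 0.56 |
| 6 | 3.9 ≤ ln(P) < 4.0 | 24 | 53 | 0.35 | 0.55 | 1.05 | 0.31 | 0.95 | 0.26 | 0.98 | 0.52 | 4.59 | 0.04 |
| 7 | 4.0 ≤ ln(P) < 4.1 | 18 | 49 | 0.38 | 0.54 | 0.02 | 0.89 | 0.97 | 0.77 | 0.98 | 0.59 | 0.05 | 0.82 |
| 8 | 4.1 ≤ ln(P) < 4.2 | 15 | 71 | 0.36 | 0.55 | 0.44 | 0.51 | 0.94 | 0.43 | 0.95 | 0.01 | 0.49 | 0.48 |
| 9 | 4.2 ≤ ln(P) < 4.3 | 9 | 58 | 1.07 | 0.30 | 7.42 | <0.01 | 0.68 | <0.01 | 0.97 | 0.24 | 1.61 | 0.21 |
| 10 | 4.3 ≤ ln(P) < 4.4 | 12 | 82 | 0.81 | 0.37 | 6.93 | <0.01 | 0.98 | 0.97 | 0.98 | 0.18 | 3.18 | 0.08 |
| 11 | 4.4 ≤ ln(P) < 4.5 | 25 | 72 | 0.00 | 0.99 | 19.81 | <0.01 | 0.95 | 0.30 | 0.98 | 0.43 | 0.11 | 0.74 |
| 12 | 4.5 ≤ ln(P) < 4.6 | 41 | 82 | 0.71 | 0.40 | 23.01 | <0.01 | 0.93 | 0.01 | 0.98 | 0.18 | 2.44 | 0.12 |
| 13 | 4.6 ≤ ln(P) < 4.7 | 57 | 99 | 1.95 | 0.16 | 38.02 | <0.01 | 0.98 | 0.31 | 0.97 | 0.05 | 4.75 | 0.03 |
| 14 | 4.7 ≤ ln(P) < 4.8 | 82 | 109 | 0.79 | 0.38 | 24.24 | <0.01 | 0.97 | 0.08 | 0.99 | 0.69 | 1.91 | 0.17 |
| 15 | 4.8 ≤ ln(P) < 4.9 | 29 | 61 | 0.08 | 0.78 | 15.71 | <0.01 | 0.95 | 0.22 | 0.98 | 0.64 | 0.50 | 0.48 |
| 16 | 4.9 ≤ ln(P) < 5.0 | 18 | 58 | 7.34 | <0.01 | | | | | | | | |

a. Values in bold font indicate that the corresponding result is significant, i.e., p ≤ 0.05.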

To the extent that ln(P) is a significant factor in the ln(Wbf) versus ln(Ada) relationship, performing ANCOVA for USGS and EPA-WSA data in the same PClass should yield coincident ln(Wbf) versus ln(Ada) regression lines. Conversely, ANCOVA will not yield coinciding ln(Wbf) versus ln(Ada) regression lines for the USGS and EPA-WSA data if either ln(P) is not a significant extraneous variable or other extraneous variables substantially influence the ln(Wbf) versus ln(Ada) relationship. In performing ANCOVA (Table 3) we first tested for homogeneity of regression slopes (i.e., significance of interaction-term coefficient, β3 in equation (11)). If the slopes were equal, we tested whether or not the intercepts for the USGS and EPA-WSA data differ (i.e., if β2 in equation (12) is significant). Other tested assumptions include normality of residuals (Shapiro-Wilk's test) and homogeneity of variance (Levene's test).

The results presented in Table 3 indicate that the EPA-WSA and USGS regression lines are coincident for PClass = 1–8, parallel for PClass = 9–15, and not parallel for PClass = 16. Also, the normality of residuals assumption was violated for four PClasses, and homogeneity of variance was violated for two PClasses. That the USGS and EPA-WSA regression lines for PClass = 1–8 are coincident is unambiguous evidence that ln(P) is a significant factor in the ln(Wbf) versus ln(Ada) relationship in spite of occasional ANCOVA assumption violations. For PClass = 9–15, the values of β2 in equation (12) are, respectively, −0.46, −0.35, −0.40, −0.35, −0.37, −0.27, and −0.33. That the β2 values are negative and significantly different from zero (p < 0.01) for PClass = 9–15 indicates that the EPA-WSA regression lines have significantly smaller intercept values and, consequently, smaller ln(Wbf) values for a given value of ln(Ada) than the corresponding USGS regression lines. Reasoning that larger values of P lead to wider and deeper channels (i.e., less wadeable channels) for a given value of Ada, we attribute the smaller EPA-WSA intercepts for 66.7 cm ≤ P < 134.3 cm to the EPA limiting its data collection efforts to wadeable streams. That said, the results for PClass = 9–15 could also be due to the influence of another extraneous variable, e.g., the difference in the bed-material size distribution for the USGS and EPA-WSA data sets (Figure 1).

4.3. Comparison of Regression Lines for USGS and EPA-NRSA Data

The USGS and EPA-NRSA data were compared to further determine if there is evidence to support the hypothesis that a linear-piecewise model with three sections is the best model for relating Wbf and Ada. Two noteworthy advantages of using the EPA-NRSA data are that they include both wadeable and nonwadeable streams and that they represent streams and rivers across the United States.

For broadly assessing the EPA-NRSA data, the data are plotted in Figure 3e along with a fitted LOESS curve and Model 3. ANCOVA was used to compare the ln(Wbf) versus ln(Ada) regression lines derived from the USGS and EPA-NRSA data. For comparing the USGS and EPA-NRSA data, the data were split into small, medium, and large-size basins as was done for comparing the USGS and EPA-WSA data. To facilitate interpreting the ANCOVA results, Figure 3f shows regression lines for small, medium, and large-size basins in the USGS and EPA-NRSA data sets. For small-size basins, ANCOVA revealed that regression lines for the USGS and EPA-NRSA data are parallel [C(−0.416 ≤ β3 ≤ 0.087) = 0.95] but not coincident [C(0.138 ≤ β2 ≤ 0.502) = 0.95]. In this case, the intercept for the EPA-NRSA regression line is significantly greater than the intercept for the USGS regression line. Levene's test indicated that the homogeneity of variance assumption was violated for the small-size basins [W(1,824) = 6.871, p = 0.009]. For medium and large-size basins, the USGS and EPA-NRSA regression lines are not parallel [C(−0.252 ≤ β3 ≤ −0.093) = 0.95] and [C(0.026 ≤ β3 ≤ 0.207) = 0.95], respectively. For medium-size basins, the slope of the EPA-NRSA regression line is significantly smaller than that of the USGS regression line. For large-size basins, the slope of the EPA-NRSA regression line is significantly greater than that of the USGS regression line. In summary, none of the paired regression lines in Figure 3f is coincident, and only one pair is parallel. With respect to the goal of developing a reliable model for predicting Wbf that applies across a broad range of environments, results from comparing the USGS and EPA-NRSA data once again suggest that using Ada alone is insufficient.

5. Discussion

An extensive analysis of the USGS data indicates that Model 3 (a three-segment linear-piecewise model) is the best and most parsimonious model for relating ln(Wbf) and ln(Ada). There were four other candidate models, and surprisingly, Model 1 (a simple linear model) ranked last among the considered models. Based on the knots in Model 3, sites represented in the USGS, EPA-WSA, and EPA-NRSA data sets were split into three basin-size groups: small-size (Ada ≤ 5.0 km2), medium-size (5.0 km2 < Ada ≤ 340 km2), and large-size (Ada > 340 km2). For each basin-size group, we compared the ln(Wbf) versus ln(Ada) regression lines yielded by the USGS, EPA-WSA, and EPA-NRSA data (Figures 3d and 3f). For small-size basins, both the EPA-WSA and EPA-NRSA regression lines are parallel to the USGS regression line but have significantly larger intercepts. For medium-size basins, the EPA-WSA and EPA-NRSA data have significantly smaller slopes than the USGS data. For large-size basins, the EPA-WSA and USGS regression lines are parallel, but the EPA-WSA regression line has a smaller intercept. In contrast, the EPA-NRSA regression line is steeper than the USGS regression line. That none of the paired regression lines in either Figure 3d or 3f is coincident is evidence that extraneous factors significantly influence the ln(Wbf) versus ln(Ada) relationship.

From Table 3, the USGS and EPA-WSA data yield coincident ln(Wbf) versus ln(Ada) regression lines for 1 ≤ PClass ≤ 8 (i.e., 30.0 cm ≤ P < 66.7 cm). These results are evidence that ln(P) is an extraneous factor in the ln(Wbf) versus ln(Ada) relationship for medium-size basins, and we conclude, therefore, that Ada alone is not sufficient for developing a reliable relationship for predicting Wbf that is applicable across a wide range of environments. Moreover, in developing ln(Wbf) versus ln(Ada) relationships, in addition to P, other factors should also be considered. For example, in addition to Ada and P, factors identified by Faustini et al. [2009] as significant for predicting Wbf include ecoregion, bed-material type, mean reach slope, and elevation above mean sea level. The USGS and EPA-WSA data yield parallel regression lines for 9 ≤ PClass ≤ 15 (i.e., 66.7 cm ≤ P < 134.3 cm), and the EPA-WSA regression lines have smaller intercepts (Table 3). Reasoning that larger values of P lead to wider and deeper channels (i.e., less wadeable channels), we attribute the foregoing result to the EPA limiting its data collection effort to wadeable streams in the EPA-WSA survey.

We now focus on the form of the relationship between Wbf and Ada. For discussing this topic, Model 3 is plotted in Figure 7 along with LOESS curves for both the EPA-WSA and EPA-NRSA data. Also plotted in Figure 7 are data from Jackson and Sturm [2002], Ashworth and Lewin [2012], and Milliman and Syvitski [1992]. Jackson and Sturm [2002] investigated the morphology of “small streams” in the Pacific Northwest (i.e., first and second-order streams with active channel widths <4 m). They surveyed 42 streams for which 0.36 m ≤ Wbf ≤ 3.63 m and 0.011 km2 ≤ Ada ≤ 0.49 km2. The median values of Wbf and Ada were 1.42 m and 0.081 km2, respectively. It is noteworthy that bedrock was present at several of the Jackson and Sturm study sites. Ashworth and Lewin [2012] and Milliman and Syvitski [1992] compiled data, referred to hereafter as the ALMS data, for some of the world's largest rivers. The 17 rivers represented in the ALMS data set have mainstream channels that are either single-threaded and alluvial (4), braided (1), anabranching or anastomosing (9), or mostly bedrock (2). Moreover, 600 m ≤ Wbf ≤ 12,000 m, 1.0 × 10⁵ km2 ≤ Ada ≤ 3.3 × 10⁶ km2, and the median values of Wbf and Ada are 1800 m and 9.9 × 10⁵ km2, respectively. Admittedly, we did not scrutinize the Jackson and Sturm [2002] or the ALMS data to the same extent as the USGS, EPA-WSA, or EPA-NRSA data. Nonetheless, the data are useful for interpreting the form of the Wbf versus Ada relationship.

Figure 7.

Plot of Model 3 and extended lines, and LOESS curves fit to EPA-WSA and EPA-NRSA data.

For small-size basins, Model 3 and the two LOESS curves in Figure 7 follow a similar trend. Also, earlier it was determined that for small-size basins the EPA-WSA data and the EPA-NRSA data yield regression lines that are parallel to the USGS regression line. Nonetheless, additional evidence is sought to substantiate the slope of the first segment of Model 3 (i.e., the segment corresponding to Ada ≤ 4.95 km2). Using ANCOVA, we compared the ln(Wbf) versus ln(Ada) regression line yielded for small basins in the USGS data set to that yielded by the Jackson and Sturm [2002] data. The results reveal that the regression lines are coincident (F(1,124) = 0.030, p = 0.862). All assumptions were met except for one; the Jackson and Sturm data had significantly less variance than the USGS data (F(1,125) = 20.5, p < 0.001). Most likely, this is due to the relative homogeneity of the region in which Jackson and Sturm collected data. Given that the first segment of Model 3 is (1) coincident with the Jackson and Sturm data and (2) parallel to the EPA-WSA and EPA-NRSA data, we conclude that the first segment of Model 3 is representative of stable alluvial channels in basins as small as 0.010 km2.

Under conditions of uniform flow, the ability of a fluid to do work can be characterized by stream power per unit wetted area, ω = τ0V, where τ0 = γf RS is the mean boundary shear stress, V is the mean velocity, γf is the specific weight of fluid, R is the hydraulic radius, and S is the channel slope [Rhoads, 1987]. Albeit speculation, we attribute the change in slope between the first and second segments of Model 3 to a change in the relationship between stream power and nonfluvial processes that affect channel width (e.g., damming by large woody debris (LWD) and encroachment by vegetation). In this case, Knot 1 in Model 3 (Figure 5b) marks a threshold beyond which the effects of stream power are comparable to the geomorphic effect of nonfluvial processes. Similar threshold values have been reported or implied in several studies (Table 4). In particular, Bilby and Ward [1989] studied second- to fifth-order streams (0.4 km2 ≤ Ada ≤ 68 km2 and 3.6 m ≤ Wbf ≤ 19.7 m) in western Washington and demonstrated that the orientation of LWD pieces has a distribution that varies with stream width. As stream width increases (and presumably, basin area and stream power), there is an increase in the number of pieces of LWD oriented downstream and a decrease in the number of pieces of LWD oriented perpendicular to the flow.
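As a worked illustration of the stream power expression ω = τ0V with τ0 = γf RS, the Python sketch below evaluates both quantities for a single hypothetical channel. The input values are invented for illustration, and the specific weight of water (about 9810 N/m3) is an assumed constant.

```python
GAMMA_WATER = 9810.0  # specific weight of water in N/m^3 (assumed constant)

def boundary_shear_stress(hydraulic_radius_m, slope, gamma=GAMMA_WATER):
    """Mean boundary shear stress, tau0 = gamma * R * S, in N/m^2."""
    return gamma * hydraulic_radius_m * slope

def unit_stream_power(hydraulic_radius_m, slope, velocity_m_s, gamma=GAMMA_WATER):
    """Stream power per unit wetted area, omega = tau0 * V, in W/m^2."""
    return boundary_shear_stress(hydraulic_radius_m, slope, gamma) * velocity_m_s

# Hypothetical channel: R = 0.5 m, S = 0.01, V = 1.2 m/s.
tau0 = boundary_shear_stress(0.5, 0.01)
omega = unit_stream_power(0.5, 0.01, 1.2)
print(f"tau0 = {tau0:.2f} N/m^2, omega = {omega:.2f} W/m^2")
```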

Table 4. Reported Threshold Values for Wbf Versus Ada Relationship^a

| Cited Study | Threshold Ada (km2) | Threshold Wbf or [inline image] (m) |
|---|---|---|
| This study | 5.0 | 3.0 |
| Swanson and Lienkaemper [1978] | 0.2–4.0 | n.a. |
| Jackson and Sturm [2002] | n.a. | >4 |
| Nakamura and Swanson [1993] | 1.01–5.5 | 10.9–15.6 |
| Zimmerman et al. [1967] | 5.2–16 | 3.7–5.4 |

a. Threshold values correspond to a change in the relationship between stream power and nonfluvial processes that affect the bankfull width of small-size streams.

Model 3 and the EPA-WSA LOESS curve both have a readily discernible knot in the vicinity of Ada ∼300 km2 (Figure 7). In contrast, the EPA-NRSA LOESS curve has a barely discernible knot at Ada ∼1500 km2. The noted differences between Model 3 and the LOESS curves underscore the need for additional information for authenticating the form of the ln(Wbf) versus ln(Ada) relationship for medium- and large-size basins. We used the ALMS data to fulfill that need. With reference to Figure 7, if the middle segment of Model 3 is extended, the extended line passes through the ALMS data. Moreover, from Figure 3f and, to a lesser extent, Figure 7, it appears that if a trendline were extended from the EPA-NRSA data for large-size basins, it would also pass through or lie near the ALMS data. By contrast, if a trendline were extended from the last segment of Model 3, it would plot well below the ALMS data. Given these results, we conclude that the middle segment of Model 3 is broadly representative of channels in large-size basins and, therefore, that the most probable form of the ln(Wbf) versus ln(Ada) relationship is a two-piece linear model, although not Model 2. One corollary to this conclusion is that Model 3 (and, correspondingly, the USGS data) yields downward-biased estimates of Wbf for channels that drain large-size basins. The reason for the supposed bias is unknown, and we can only speculate that it is a consequence of having excluded data from gaging stations that have flows that are appreciably altered by human activities (e.g., reservoir regulation, diversions, urbanization, channelization, land cover, and levee construction). Alternatively, the supposed bias could reflect unmeasured effects of human activities in large basins. With reference to Figure 7, a second corollary to the conclusion presented above is that for large-size basins the EPA-NRSA data have the greatest potential to yield unbiased estimates of Wbf.

6. Conclusions

The broad objective of this study was to explore the Wbf versus Ada relationship for single-thread alluvial channels across a broad range of geologic, terrestrial, climatic, and botanical environments. Specific questions addressed are (1) is the ln(Wbf) versus ln(Ada) relationship best represented by a simple linear model and (2) is Ada alone a sufficient independent variable for developing a general relationship for predicting Wbf. After reviewing data for drainage basins ranging in size from 0.010 km2 to over 10⁶ km2, it is concluded that a linear-piecewise model with two sections is the best model for describing the ln(Wbf) versus ln(Ada) relationship.

To address the second question, we demonstrated that ln(P) is an extraneous factor in the ln(Wbf) versus ln(Ada) relationship. It is concluded, therefore, that a reliable model for predicting Wbf cannot be developed from Ada alone. Making progress in developing a comprehensive mechanistic model for predicting Wbf—the overarching objective of this study—will require methodical analysis of additional variables known to have significant influence on the Wbf versus Ada relationship, e.g., P.

Acknowledgments

Our thanks to S. Jerrod Smith for kindly extracting PRISM precipitation estimates for the USGS data. We gratefully acknowledge that the EPA-WSA and EPA-NRSA data sets are products of an EPA-led collaborative effort involving state, tribal, and federal agencies. We thank Alan Herlihy for graciously sharing with us data from the EPA-WSA data set. The 2008–2009 EPA-NRSA data and guidance for screening the EPA-NRSA data were generously provided by Phil Kaufmann. Our thanks to Phil Kaufmann and two anonymous reviewers for the many thoughtful comments they provided. This work was supported by the STC program of the National Science Foundation via the National Center for Earth-surface Dynamics under agreement EAR-0120914 and is intended as a contribution in the area of stream restoration.
