## 1. Introduction

[2] Sediment carried by a current of wind or water tends to disperse and spread out as it moves downstream. While there are a great many formulas designed to predict the bulk discharge of sediment at a particular point in a flow [e.g., *Yalin*, 1977], the rich problem of sediment speed and dispersion has received much less attention. Yet there are many applications for which it is important to know the speed and rate of spread of a body of sediment. Examples include the fate and transport of solid-phase contaminants in streams, the time lag between erosional exhumation of a grain and its delivery to a sedimentary basin, and the accumulation of cosmogenic radionuclides in sediment grains during transport. Applications like these require stochastic models of sediment transport, and stochastic transport models require data sets against which to test their predictions. Of particular importance are data sets that can reveal the extent to which the dispersion process obeys traditional Fickian behavior, as compared with the “fractional” or “anomalous” dispersion that has been documented in a diverse range of natural systems. In this paper, we use data from a classic field experiment on sediment dispersion to test the predictions of a family of stochastic transport models, and in particular to compare the predictions of models based on thin- versus heavy-tailed grain velocity distributions.

[3] *Einstein* [1937, 1950] was among the first to recognize the appeal of a stochastic model of fluvial bed load transport. He conceptualized particle motion as a series of random length steps separated by rests of random duration. This model and numerous other similar models that treat fluvial sediment transport as a random walk [e.g., *Sayre and Hubbell*, 1965; *Paintal*, 1971; *Yang and Sayre*, 1971; *Niño et al.*, 2003; *Papanicolaou et al.*, 2002; *Marion et al.*, 2008] share an underlying assumption that all of the complexity, interaction, and variability in the factors that affect the erosion, transport, and deposition of sediment can be “encapsulated” in a probability distribution that predicts the likelihood that a grain moves farther than a certain distance or is immobile longer than a certain time. Specifically, these models all assume that the distributions of step length and resting time have well-defined mean values surrounded by characteristic amounts of variability. What if this assumption is incorrect? What if geomorphic transport processes are so variable and complex that a distribution with a finite mean or variance is not a good representation of the underlying process?
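The step-and-rest picture described above can be sketched as a short simulation. This is a minimal illustration, not the formulation of any of the cited models: the exponential distributions for step length and rest time, the instantaneous steps, the function names (`simulate_step_rest_walk`, `mean_position`), and all parameter values are assumptions made here for the example.

```python
import random


def simulate_step_rest_walk(total_time, mean_step, mean_rest, rng):
    """Return the downstream position of one grain after total_time.

    The grain alternates between a rest of random duration and an
    instantaneous step of random length. Exponential distributions for
    both are an assumption of this sketch.
    """
    clock = 0.0
    position = 0.0
    while True:
        clock += rng.expovariate(1.0 / mean_rest)  # rest before the next step
        if clock > total_time:
            return position
        position += rng.expovariate(1.0 / mean_step)  # instantaneous hop


def mean_position(n_grains, total_time, mean_step, mean_rest, seed=0):
    """Average position of n_grains independent grains after total_time."""
    rng = random.Random(seed)
    total = sum(
        simulate_step_rest_walk(total_time, mean_step, mean_rest, rng)
        for _ in range(n_grains)
    )
    return total / n_grains


if __name__ == "__main__":
    # Hypothetical values: mean step 2 length units, mean rest 1 time unit.
    print(mean_position(2000, total_time=100.0, mean_step=2.0, mean_rest=1.0))
```

With thin-tailed choices like these, the average position should settle close to `total_time * mean_step / mean_rest`, which is the well-defined mean behavior that the models above take for granted.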

[4] In recent decades, a wide variety of transport systems have been identified that exhibit scale-dependent behavior, where the fitted parameters in classical governing equations appear to change depending on the spatial or temporal scale over which they are measured (for thorough reviews see *Metzler and Klafter* [2000, 2004]). One familiar example from the geosciences is the observation that the apparent dispersivity of a heterogeneous aquifer tends to increase with the scale of the tracer test [see *Neuman*, 1990, and references therein]. The dispersivity represents a characteristic length scale of random motion. In subsurface spreading of conservative solutes under ideal conditions, the Fickian dispersion coefficient was found to be approximately linearly related to velocity magnitude, leading to a decomposition of the dispersion coefficient into a dispersivity tensor that reflected the porous medium properties and the velocity vector (a review is given by *Bear* [1972]). In one dimension, the dispersion coefficient is the product of the dispersivity and fluid velocity. *Benson et al.* [2001] showed that the scale-dependent behavior of a solute tracer injected into an aquifer in Mississippi during the macrodispersion experiments (MADE) could be replicated with a random walk model in which the probability distribution of step length has a divergent second moment. This type of distribution has a heavy, power law tail that specifies a much higher probability of extreme values than thin-tailed distributions like the Gaussian or the exponential. Depending on the amount of probability mass in the tail of the distribution, one or both of the integrals that define the first and second moments of the distribution (the mean and variance), *m*_{q} = ∫*x*^{q}*p*(*x*)*dx*, *q* = 1, 2, may not converge to a finite value.
In practice, a divergent moment means that the sample statistic will tend to increase with the number of samples drawn from the distribution because the larger sample is more likely to include an extreme value [e.g., *Schumer et al.*, 2001]. It is this effect that gives rise to the apparent scale-dependent behavior. These distributions also violate the familiar form of the Central Limit Theorem (CLT), which connects random walk processes with finite moments to the advection-diffusion equation (ADE). In one dimension the ADE is

$$\frac{\partial C}{\partial t} = -v\,\frac{\partial C}{\partial x} + D\,\frac{\partial^{2} C}{\partial x^{2}},$$

where *C* is the spatial concentration of the independent random walkers, *t* is time, *x* is the spatial coordinate, *v* is the average drift velocity (*v* = 0 for pure diffusion), and *D* is the diffusion (or dispersion) coefficient with dimensions of *L*^{2}/*T*. Dispersive transport that follows the ADE is commonly described as Fickian, or normal, dispersion. The Fickian diffusion coefficient is proportional to the variance of the underlying step length distribution, and the variance of the dispersing plume scales as the product of the diffusion coefficient and the transport time. Motion governed by underlying probability distributions with divergent first or second moments obeys a more generalized form of the CLT, and the dispersion follows a more generalized form of the ADE, where the orders of the derivatives on the dispersive and time terms need not be integers [*Benson*, 1998]. The simplest form of this type of equation is one dimensional and fractional in space only:

$$\frac{\partial C}{\partial t} = -v\,\frac{\partial C}{\partial x} + D\,\frac{\partial^{\alpha} C}{\partial x^{\alpha}}.$$

In the fractional advection-dispersion equation (fADE), 0 < *α* ≤ 2 and the dispersion coefficient *D* has dimensions of *L*^{α}/*T*. The fADE describes transport in which the underlying probability density function of step length decays like a power law for large values: *p*(Δ*x*) ∝ Δ*x*^{−1–α}. A plume following this equation has two features: the apparent centered second moment (the variance) grows proportionally to *t*^{2/α}, and for *α* < 2 the leading edge concentration falls off according to a power law *C*(*x,t*) ∝ *x*^{−1–α}. For *α* = 2, the ADE is recovered and the variance scales linearly with time. One of the advantages of the fADE in describing non-Fickian transport is that the dispersion coefficient is not scale-dependent, as it would have to be if it were directly proportional to the divergent second moment of the underlying step length distribution. Instead, the super-Fickian growth rate of the spreading plume is captured by the fractional derivative. *Benson et al.* [2001] and *Schumer et al.* [2001] provide discussions of the derivation of the fADE, the meaning of fractional derivatives, and the connection to random walks governed by heavy-tailed probability distributions. Other examples of fractionally dispersive transport systems range from charge transfer in semiconductors [*Scher and Montroll*, 1975] to fluid turbulence [*Shlesinger et al.*, 1987] to bioturbation of soil and sediment [*Meysman et al.*, 2008]. Recent reviews were given by *Metzler and Klafter* [2000, 2004].
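The practical meaning of a divergent second moment can be illustrated numerically. For a density that decays like *x*^{−1–α}, the moment integral *m*_{q} converges only for *q* < *α*, so with *α* = 1.5 the mean exists but the variance does not. The sketch below compares sample variances of heavy-tailed and thin-tailed step lengths at increasing sample sizes. The Pareto form, the value *α* = 1.5, the exponential comparison with mean 3, and the function names are all illustrative assumptions rather than choices taken from the cited studies.

```python
import random
import statistics


def pareto_steps(n, alpha, rng, x_min=1.0):
    """Draw n step lengths whose density decays like x**(-1 - alpha)."""
    # Inverse-CDF sampling: P(X > x) = (x_min / x)**alpha for x >= x_min.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


def exponential_steps(n, mean, rng):
    """Draw n thin-tailed (exponential) step lengths with the given mean."""
    return [rng.expovariate(1.0 / mean) for _ in range(n)]


def variance_by_sample_size(draw, sizes):
    """Sample variance of a fresh sample at each requested sample size."""
    return {n: statistics.variance(draw(n)) for n in sizes}


if __name__ == "__main__":
    rng = random.Random(42)
    sizes = [100, 1_000, 10_000, 100_000]
    heavy = variance_by_sample_size(lambda n: pareto_steps(n, 1.5, rng), sizes)
    thin = variance_by_sample_size(lambda n: exponential_steps(n, 3.0, rng), sizes)
    for n in sizes:
        print(f"n={n:>7}  heavy-tailed var={heavy[n]:14.1f}  thin-tailed var={thin[n]:8.2f}")
```

The thin-tailed sample variance should settle near its theoretical value of 9, whereas the heavy-tailed one is expected to jump erratically and trend upward as larger samples capture rarer, longer steps. Individual runs will differ in detail.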

[5] The dynamics of fractional dispersion can depart significantly from normal, Fickian dispersion. In this paper, we focus primarily on fractionally dispersive plumes with a heavy downstream tail that result from a distribution of particle step lengths with a divergent second moment. This results in arrival of the tracers at downstream locations earlier than expected from a Fickian model and a mean tracer position that moves faster than the peak concentration.
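A rough numerical illustration of this asymmetry follows. It is a sketch under stated assumptions: grains take a fixed number of Pareto-distributed hops with *α* = 1.5, the median position is used as a crude stand-in for the location of the bulk of the plume (not the concentration peak itself), and the function names and the `far_factor` threshold are hypothetical.

```python
import random
import statistics


def plume_positions(n_grains, n_steps, alpha, rng, x_min=1.0):
    """Downstream position of each grain after n_steps heavy-tailed hops."""
    positions = []
    for _ in range(n_grains):
        hops = (
            x_min * (1.0 - rng.random()) ** (-1.0 / alpha)
            for _ in range(n_steps)
        )
        positions.append(sum(hops))
    return positions


def plume_summary(positions, far_factor=2.0):
    """Mean, median, and fraction of grains beyond far_factor * median."""
    mean = statistics.fmean(positions)
    median = statistics.median(positions)
    far = sum(x > far_factor * median for x in positions) / len(positions)
    return mean, median, far


if __name__ == "__main__":
    rng = random.Random(0)
    mean, median, far = plume_summary(plume_positions(2000, 50, 1.5, rng))
    print(f"mean={mean:.1f}  median={median:.1f}  fraction far downstream={far:.3f}")
```

In this setup the mean position should lie downstream of the median, because a small fraction of grains that happen to take very long hops pulls the mean forward while the bulk of the plume lags behind.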

[6] There are several reasons to suspect that geomorphic transport systems might exhibit fractional dispersion. A common observation in geomorphology is that estimates of geomorphic rates tend to change depending on the interval over which the measurement technique integrates. For example, *Gardner et al.* [1987] plotted estimates of surface elevation change due to various processes against the duration of the interval over which the change was measured and found a slope of less than one, meaning that the rate of uplift or erosion tended to decrease with the length of the measurement interval. They suggested that a possible explanation for the slowdown is that longer observation intervals are more likely to capture a long interval of no surface elevation change, resulting in a lower average rate. *Kirchner et al.* [2001] found a similar effect but in the opposite direction: average erosion rates in a catchment in Idaho increased with the length of the measurement time scale. They proposed that this was because longer intervals were more likely to include an extreme erosional event such as a large landslide. More recently, *Singh et al.* [2009] demonstrated that the sediment flux in a large experimental flume was dependent on the time over which the measurement was made. These observations are similar to the apparent scale dependence of dispersivity in an aquifer and suggest that the underlying processes might be heavy tailed.

[7] In particular, there are several pieces of evidence that suggest that bed load transport by rivers may be fractionally dispersive in some cases. First, *Nikora et al.* [2001, 2002] found that the sample variance of plumes of real and simulated tracer grains can grow nonlinearly with time under some circumstances. In Fickian transport, the variance of the tracer plumes scales linearly with time *t*, *σ*^{2} = 2*Dt*, so the nonlinear variance scaling *σ*^{2} ∼ *t*^{γ}, *γ* ≠ 1 observed by Nikora and colleagues indicates an apparent scale-dependent diffusion coefficient and fractional dispersion. Second, tracer studies in rivers have commonly observed strongly right-skewed distributions of travel distance [e.g., *Schmidt and Ergenzinger*, 1992; *Habersack*, 2001; *McNamara and Borden*, 2004]. These are usually fitted with distributions such as the exponential or gamma, but that does not exclude the possibility that the transport is actually heavy tailed, if relatively rare events on the tail of the distribution were not captured by the experiment. An exception is *Pyrce and Ashmore* [2003], who found that a Cauchy distribution (a symmetrical distribution with heavy tails) centered on the spacing between the pools and bars described the step length distribution of tracers in a flume. Third, *Stark et al.* [2009] and *Tucker and Bradley* [2010] presented arguments for nonlocal behavior in certain forms of sediment transport, implying a potential for heavy-tailed particle travel distances and fractional dispersion. Also, *Ganti et al.* [2010] presented an argument for how heavy-tailed particle travel distances might arise in gravel bed rivers. Finally, there are the data from the tracer experiment performed by W.W. Sayre and D.W. Hubbell of the U.S. Geological Survey in the 1960s [*Sayre and Hubbell*, 1965]. 
Motivated in part by a need to develop and test a probabilistic transport model like Einstein's, the Sayre and Hubbell tracer experiment provides an unusually vivid picture of the advection and dispersion of a pulse of radioactively tagged tracer sand in a natural river. The tracer concentration profiles exhibit heavy (power law) downstream leading edges, similar to those observed in solute tracer tests in heterogeneous aquifers, suggesting fractional dispersion.
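The variance scaling test mentioned above, *σ*^{2} ∼ *t*^{γ}, amounts to estimating a slope on log-log axes. The sketch below shows one simple way to do that with ordinary least squares. The function name `fit_scaling_exponent` and the synthetic inputs are illustrative assumptions, not the procedure used in the cited studies. Sample variances of genuinely heavy-tailed plumes can also be noisy, so a fitted *γ* should be read with caution.

```python
import math


def fit_scaling_exponent(times, variances):
    """Least-squares slope and intercept of log(variance) against log(time).

    If variance ~ c * t**gamma, the slope estimates gamma and
    exp(intercept) estimates c.
    """
    if len(times) != len(variances) or len(times) < 2:
        raise ValueError("need at least two (time, variance) pairs")
    xs = [math.log(t) for t in times]
    ys = [math.log(s2) for s2 in variances]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("times must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    intercept = y_bar - gamma * x_bar
    return gamma, intercept


if __name__ == "__main__":
    times = [1.0, 2.0, 4.0, 8.0, 16.0]
    # Synthetic Fickian plume with D = 0.5, so variance = 2 * D * t.
    print(fit_scaling_exponent(times, [2 * 0.5 * t for t in times]))
    # Synthetic super-Fickian plume with variance ~ t**(2 / alpha), alpha = 1.5.
    print(fit_scaling_exponent(times, [t ** (2 / 1.5) for t in times]))
```

For the synthetic Fickian series the fitted exponent is 1 up to rounding, and for the second series it is 2/*α* ≈ 1.33, matching the *t*^{2/α} growth quoted earlier for the fADE.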

[8] In this paper, we reanalyze the Sayre and Hubbell data. We begin by reviewing the tracer experiment and then discuss their original transport and dispersion model, highlighting three weaknesses. Next, we introduce a transport model that is conceptually similar to the Sayre and Hubbell model, but uses a distribution of particle step lengths that is heavy tailed with a divergent variance. This model is able to reproduce the non-Fickian behavior of the plume, but the values of model parameters are almost entirely empirically determined. Finally, we add a feature to the model that partitions mass into a detectable, mobile phase and an undetectable, immobile phase. This allows us to reproduce another feature observed in the Sayre and Hubbell data, the decrease in the amount of detected mass over the course of the experiment. It also provides additional constraints on some of the model parameters, allowing us to derive them directly from the data rather than choosing values by matching the model results to the observed concentration curves visually. The success of the models in describing the data provides strong evidence for fractional dispersion of sediment by a river.