The Impact of Cylinder Diameter Distribution on Longitudinal and Transverse Dispersion Within Random Cylinder Arrays

Numerous studies focus on flow and mixing within cylinder arrays because of their similarity to vegetated flows. Randomly distributed cylinders are considered to be a closer representation of the natural distribution of vegetation stems compared with regularly distributed arrays. This study builds on previous work based on a single, fixed, cylinder diameter to consider non‐uniform cylinder diameter distributions. The flow fields associated with arrays of randomly distributed cylinders are modeled in two dimensions using the ANSYS Fluent Computational Fluid Dynamics software with Reynolds Stress Model turbulence closure. A transient scalar transport model is used to characterize longitudinal and transverse mixing (Dx and Dy) within each geometry. The modeling approach is validated against independent laboratory data, and the dispersion coefficients are shown to be comparable with previous experimental studies. Eight different cylinder diameter configurations (six uniform and two non‐uniform) are considered, each at 20 different solid volume fractions and with seven different transverse positions for the injection location. The new dispersion data cover a broad range of solid volume fractions, for which simultaneous estimates of Dx and Dy have not been available previously. There are no systematic differences in non‐dimensional Dx and Dy between uniform and non‐uniform cylinder diameter distributions. When non‐dimensionalized by cylinder diameter, both dispersion coefficients are independent of solid volume fraction. When non‐dimensionalized by cylinder spacing, both longitudinal and transverse dispersion can be described as linear functions of the ratio of cylinder diameter to cylinder spacing.


Introduction
Understanding the flow and mixing processes in vegetated flows is essential for hydraulic engineers to be able to improve the performance of vegetated water bodies such as treatment wetlands and ponds. The stems of certain aquatic vegetation types are approximately cylindrical (Nepf, Sullivan, & Zavistoski, 1997), which makes the cylinder an ideal choice for a basic study of vegetated flow. From a fluid mechanics point of view, flow past a cylinder is a classic case study and numerous detailed investigations have explored all characteristics of the flow past one or more cylinders. Zdravkovich (2003) provides a relatively complete collection of experimental and practical studies on flow past cylinders. There are a considerable number of experimental and computational studies on flow and mixing within regularly spaced cylinder arrays (e.g., Iwaki et al., 2004;Stoesser et al., 2010). To make the models closer to natural distributions of vegetation, several studies have employed random distributions of cylinders (e.g., Ricardo et al., 2014;Tanino & Nepf, 2008;White & Nepf, 2003).
The main mixing processes in all flows are molecular diffusion, turbulent diffusion, and differential advection (Fischer et al., 1979). Vegetation stems generate downstream wakes, enhancing differential advection and creating wake zones in which trapping effects contribute to longitudinal dispersion (White & Nepf, 2003). In addition, flow path tortuosity enhances transverse dispersion. This process is often referred to as "mechanical dispersion" (Nepf, 1999). The overall effect of all these processes is typically quantified via a single dispersion coefficient in each direction. Tanino (2008) and Tanino and Nepf (2008) quantified transverse dispersion (D y ) in the laboratory for a range of solid volume fractions (ϕ) of randomly distributed uniform diameter 6.35 mm cylinders. The data revealed an "N"-shaped relationship between D y and the ratio of mean stem diameter (d) to the mean nearest stem edgeto-edge spacing (s), with a local peak at around d/s = 0.6 (ϕ = 0.031) and a local minima at around d/s = 2.5 (ϕ = 0.20). Nepf (1999) suggested, and it was experimentally shown by Tanino and Nepf (2008), that the net transverse dispersion can be expressed as the linear superposition of turbulent diffusion and mechanical dispersion. Whilst the model derived from this work is widely cited, it should be noted that it incorporates linear scaling Abstract Numerous studies focus on flow and mixing within cylinder arrays because of their similarity to vegetated flows. Randomly distributed cylinders are considered to be a closer representation of the natural distribution of vegetation stems compared with regularly distributed arrays. This study builds on previous work based on a single, fixed, cylinder diameter to consider non-uniform cylinder diameter distributions. The flow fields associated with arrays of randomly distributed cylinders are modeled in two dimensions using the ANSYS Fluent Computational Fluid Dynamics software with Reynolds Stress Model turbulence closure. A transient scalar transport model is used to characterize longitudinal and transverse mixing (D x and D y ) within each geometry. The modeling approach is validated against independent laboratory data, and the dispersion coefficients are shown to be comparable with previous experimental studies. Eight different cylinder diameter configurations (six uniform and two non-uniform) are considered, each at 20 different solid volume fractions and with seven different transverse positions for the injection location. The new dispersion data cover a broad range of solid volume fractions, for which simultaneous estimates of D x and D y have not been available previously. There are no systematic differences in non-dimensional D x and D y between uniform and non-uniform cylinder diameter distributions. When non-dimensionalized by cylinder diameter, both dispersion coefficients are independent of solid volume fraction. When non-dimensionalized by cylinder spacing, both longitudinal and transverse dispersion can be described as linear functions of the ratio of cylinder diameter to cylinder spacing.

STOVIN ET AL.
factors for the two components which were determined by fitting to laboratory data. The model implies that turbulent diffusion falls to zero for d/s greater than 3.0 (ϕ > 0.25). Zero turbulent diffusion seems questionable, given that turbulent kinetic energy has both been measured experimentally and predicted theoretically by Tanino and Nepf (2008) at these solid volume fractions.
In recent work on vegetated flows, many authors have used Computational Fluid Dynamics (CFD) as an alternative or complement to more traditional laboratory explorations. Okamoto and Nezu (2010) successfully utilized large eddy simulation (LES) modeling to investigate the complex 3D flow patterns and mass transport processes associated with submerged vegetation. However, the study represented vegetation using thin plates, rather than cylinders. Ricardo et al. (2018) used 3D LES modeling to highlight the anisotropic nature of turbulence within a representative random cylinder array. Whilst solute transport was not explicitly considered, this study highlights the potential need to account for anisotropic turbulence characteristics in any numerical modeling based work.
In contrast to the high-resolution and high computational expense associated with LES modeling, other researchers have demonstrated the potential to employ Reynolds Averaged Navier Stokes (RANS) CFD models in this context. Ghani et al. (2019) employed the commercial Fluent CFD software with the non-isotropic Reynolds stress model (RSM) turbulence closure to investigate the 3D turbulent flow characteristics within an open channel with a regular pattern of vegetation patches. The simulated flow fields confirmed the model's ability to reproduce the main flow characteristics and velocity shear effects, which are expected to dominate mixing in this type of flow. It should be noted that the model was validated on longitudinal velocities, but not on turbulence quantities or solute transport. Golzar et al. (2018) demonstrated the feasibility of applying 2D RANS CFD to explicitly model mixing due to random cylinder arrays. The flow fields associated with two arrays of regularly and randomly distributed cylinders were modeled using ANSYS Fluent 16.1 with the RSM turbulence model closure (ANSYS Inc., 2015). Notably, the model was sufficiently large (1.5 m long × 0.5 m wide, for a solid volume fraction, ϕ, of 0.005) to permit transient scalar transport modeling at a physical scale comparable to previously-reported laboratory dye tracing. For the same cylinder diameter and solid volume fraction, greater dispersion was evident in the random cylinder array compared with the regular array.  compared CFD-derived transverse dispersion coefficients, for a range of solid volume fractions, 0.01 ≤ ϕ ≤ 0.35, with those reported experimentally by Tanino and Nepf (2008). Whilst the CFD models in this case covered a smaller plan area (0.6 × 0.2 m) than in the Golzar et al. (2018) study, the resulting trend in transverse dispersion coefficient with solid volume fraction closely matched the experimental data. Further information regarding these preliminary studies can be found in Golzar (2018). Sonnenwald et al. (2017) simultaneously quantified both longitudinal and transverse dispersion for both artificial and real, emergent, vegetation. The new experimental data were plotted alongside data from a range of previous studies to reveal significant uncertainties in the estimation of longitudinal dispersion coefficient D x and transverse dispersion coefficient D y . The systematic trends in D y reported by Tanino and Nepf (2008) have not been consistently reproduced in other laboratory datasets. Sonnenwald et al. (2017) highlighted the heterogeneity in stem diameters in natural vegetation, which led the authors to question the validity/relevance of previous work based on uniform diameter cylinders. The authors utilized observed stem diameter distributions to highlight the sensitivity of existing models to this key lengthscale descriptor, leading to a recommendation that future models intended for application to real vegetation should be based on probabilistic descriptions of both stem diameters and stem spacings. The present study aims to address this specific research need. Non-uniform cylinder diameters are expected to introduce multiple scales of dead zones (in the cylinder wakes) and preferential pathways between the cylinders, the interactions of which may lead to enhanced (or possibly reduced) differential advection. The observed dispersion reflects the combined effects of differential advection (or mechanical dispersion) and turbulent diffusion, and it is not straightforward to estimate how these interacting processes will combine to impact on longitudinal and transverse dispersion; therefore the aim of this paper is to apply CFD modeling to systematically test the null hypothesis: In terms of both longitudinal and transverse dispersion, a configuration based on non-uniform cylinder diameters does not behave differently from a configuration based on a single uniform cylinder diameter.
In Section 2 we present a Computational Fluid Dynamics (CFD) based methodology for simulating flow through a random array of rigid cylinders, and justifications for the simulation set-up are presented. Comprehensive 10.1029/2021WR030396 3 of 23 validation based on both the flow field and derived dispersion coefficients is presented in Section 3. The results of the cylinder diameter distribution comparison are presented in Section 4, beginning with the presentation of representative cases to illustrate the model output. Finally, further discussion on the findings is presented in Section 5.

Introduction
The approach adopted here was first implemented by , who utilized a commercial CFD code (ANSYS Fluent 16.1) to directly simulate flows and scalar transport around random cylinder arrays. The models were simulated in 2D (plan) using the RSM turbulence closure. The variation in transverse dispersion coefficients as a function of solid volume fraction estimated from the simulation was comparable to the experimental data of Tanino and Nepf (2008) for 6.35 mm diameter cylinder arrays. Golzar et al. (2018) used the same approach to compare dispersion due to regular and random cylinder arrays, and demonstrated the feasibility of using the CFD modeling approach to determine both longitudinal and transverse dispersion coefficients simultaneously. A number of sensitivity analyses were undertaken to support that work, further details of which can be found in Golzar (2018). The same basic approach has been adopted for the present study, details of which follow.

Configurations Considered
We considered eight different cylinder (stem) diameter distributions ( Table 1). The rationale for these configurations was as follows. The most complex configuration, D8, is based on a simplification of the distribution of stem diameters for a representative real vegetation, winter Typha latifolia, as presented in Sonnenwald et al. (2017). Similar asymmetric diameter distributions are often observed in other types of vegetation (Rennolls et al., 1985). The simplified distribution uses five cylinder diameters: 4, 8, 12, 15 and 20 mm. The original and reproduced distributions are shown in Figures 1a and 1b respectively. D2 to D6 correspond to uniform distributions of the five components of D8. To provide an intermediate option, D7 was designed as a bimodal distribution ( Figure 1c). D3, D7 and D8 share the same median cylinder diameter, d 50 = 8 mm. Finally, for comparison with Tanino and Nepf (2008) and , D1 introduces a uniform cylinder diameter of 6.35 mm.
The simulations corresponded to a 3.0 m long, 1.0 m wide vegetated channel. For each cylinder diameter configuration, we simulated the flow field for a single 2D model corresponding to a 1.0 × 1.0 m plan section of a vegetated channel. The modeled flow fields were chained together to form a longer channel for the solute transport work. The model dimensions were based on a number of factors: firstly the scale is comparable to our previous laboratory mixing work using real vegetation ; secondly, the patch is sufficiently large to allow realistic clusters of cylinders to occur, regenerating the spatial heterogeneity observed in real vegetation; thirdly, the width is expected to be sufficient to prevent tracer from spreading to the walls even when three sections are placed end-to-end; and fourthly, the generous channel width permits multiple point source injections (spanning the central 300 mm of the channel) to be used, thereby permitting exploration of variations in D y and D x due to injection location and the specific spatial distribution of cylinders experienced by each trace.
Cylinder locations were randomly distributed to achieve 20 target solid volume fractions between 0.005 and 0.350. This covers the range of solid volume fractions used in previous studies, and to date is the largest range of densities used to investigate mixing in vegetated flows. A minimum spacing of 0.2 mm was specified, both between cylinders and between cylinders and the channel side walls. For the non-uniform cylinder diameter distributions, cylinder diameters were randomly selected for placement from the defined distributions. Two examples of the D8 distribution (ϕ = 0.05 and ϕ = 0.25) are presented in Figures 1d and 1e respectively. The seven transverse injection locations are indicated by red crosses on the upstream (left) boundary of the flow domain in each case. A cylinder-free spacing of 0.4 mm was retained at the upstream and downstream model boundaries. This was done firstly to support the use of a periodic boundary condition (see below) and secondly to provide an uninterrupted "window" for tracer concentration analysis. Mean edge-to-edge cylinder spacing (s) ranged from 0.001 to 0.098 m. Cylinder spacing distributions for the two example D8 distributions are presented in Figures 1f and 1g respectively.

Model Settings
The CFD modeling was undertaken using a steady state RANS (Reynolds Averaged Navier Stokes) model. Models were run with ANSYS Fluent 19 (ANSYS Inc., 2018). The inlets and outlets were defined as periodic boundaries, with a mass flow rate set to give a stem Reynolds Number (Re d50 ) of 500. Re d50 was calculated using the pore velocity, where U bulk is the mean channel velocity without cylinders present. Mass flow rate was calculated from Re d50 as (1ϕ)Re d50 μhw/d 50 , where μ is the dynamic viscosity, h is the unit depth in a 2D model, and w is channel width (1.0 m).
Cylinders were represented as solid regions, with "circular" boundaries defined using small linear segments in the 2D mesh. The cylinder and channel wall boundaries were represented as smooth walls. The Enhanced Wall Treatment (ANSYS Inc., 2018) was specified for the near-wall treatment; this treatment blends the laminar and logarithmic law of the wall functions to form a continuous function. We have confirmed the suitability of this wall treatment in the current context through a detailed comparison with LES simulations presented by Stoesser et al. (2010), which can be found in the Supporting Information.
The Reynolds Stress Model (RSM) turbulence closure was used with the coupled solver, which provides an implicit coupling of the momentum and continuity equations, and second order discretization was specified. These settings were previously validated by Golzar (2018), and are further justified in this paper's Supporting Information. Further information on the numerical models used can be found in general CFD text books (e.g., Versteeg & Malalasekera, 2007) and in the Fluent User Documentation (ANSYS Inc., 2018).
Model convergence was determined to have occurred when residuals stabilized into a repeating pattern for several thousand iterations and the continuity residual was 10 −3 or lower. We also confirmed that the area-weighted mean turbulent viscosity (μ t ) had stabilized, as solute transport (particularly transverse dispersion) is sensitive to the latter parameter. Stabilization of the turbulent viscosity term was defined as a standard deviation on 2000 iterations of less than 10 −5 . The RSM model estimates turbulent viscosity in a similar way to other k-ε models (Versteeg & Malalasekera, 2007). It should be noted that a single, isotropic, value for k (and hence μ t ) is determined from the modeled Reynolds stresses. Whilst this assumption of isotropy represents a potential limitation for our simulations, both Stoesser et al. (2010) and Ricardo et al. (2018) have presented work based on more sophisticated LES models which shows that the longitudinal and transverse Reynolds stresses tend to be similar in magnitude within comparable cylinder arrays.
Note that not all D1 simulations were run due to time constraints, and a few low solid volume fraction simulations for other configurations did not converge within a reasonable time frame (i.e., one month of computer time and 1 million iterations), and were abandoned. Overall, 137 simulations are included in the final dataset.

Time-Dependent Simulations of Solute Transport
For the simulation of solute transport, the simulated flow field was retained, but the periodic boundary conditions were replaced with inlet and outlet conditions. Time-dependent scalar transport simulations were used to model traces based on a frozen flow-field, using Equation 1.
where θ is a passive scalar, ρ its density, u i is cell velocity, S ct is the turbulent Schmidt number, and S is the source term. Subscript i denotes {x, y} directions.
Tracer was injected at x = 0 m, at seven transverse locations regularly spaced between y = 0.35 and y = 0.65 m (see Figure 1). The injection duration was 1 s. The second order implicit formulation was used with a 0.05 s time step. Time-step convergence was set to residuals of 1 × 10 −5 . The tracer was represented as a passive scalar with diffusivity equal to μ t /(S ct ρ). S ct is here taken as 1.0, assuming the ratio of mass to momentum transport is equal, in the absence of compelling evidence for a specific value within cylinder arrays. The tracer was assumed to have the same density and viscosity as water.
As illustrated by Shucksmith et al. (2007), it is important that the tracer experiences the representative flow field fully and has entered the equilibrium zone, before analyzing for the dispersion coefficients. The distance required is a function of the solid volume fraction and cylinder diameter, and scales with the expected levels of dispersion. The distance required for longitudinal dispersion estimates is typically one order of magnitude larger than transverse dispersion. In the present context, Golzar (2018) showed that at least 0.8 m was required between the injection point and the first transverse profile. This estimate is consistent with White and Nepf's (2003) formula which suggests that between 0.1 and 1.0 m would be sufficient for almost all of the configurations considered here. Distances greater than 1 m would be required for a small number (6) of configurations characterized by low solid volume fractions and large cylinder diameters. We therefore set our first monitoring location at x = 1.0 m. To simulate solute transport over a length of channel longer than the 1 m model, solute transport simulations were carried out in series, with the downstream (outlet) profile from the first model being used to define the upstream (inlet) conditions for the second model, etc. We used three models in series; therefore the cross-sectional monitoring locations were x = 1.0, 2.0 and 3.0 m, giving two 1 m reaches.
From the monitoring locations, 2D concentration profiles were assembled, and 1D advection-diffusion equation (ADE) routing (Fischer et al., 1979) was used to obtain the longitudinal and transverse dispersion coefficients, D x and D y , Equations 2 and 3 respectively.
where c is concentration, x 1 and x 2 are the upstream and downstream measurement locations, U is longitudinal velocity, ̄= ( 2 − 1) ∕ is travel time, V is transverse velocity, and γ and λ are variables of integration.
1D optimisation of solutions to the longitudinal (transverse sum data) and transverse (temporal sum data) ADEs was carried out for each of the seven transverse injections and two reaches, to obtain up to 14 estimates of longitudinal dispersion coefficient D x and transverse dispersion coefficient D y . In some reaches, solute came within 0.01 m of the channel wall; in these instances the dispersion coefficients have not been included. In total 1,886 estimates of D x and D y were made. While the upstream and downstream data showed flux-balance, the concentration data showed differences in mass-balance on the order of 5% due to the variations in velocity profile. The concentration data were mass-balanced before optimisation.
Time-step independence was checked on two configurations represented by 0.333 × 0.333 m models, full details of which can be found in the Supporting Information, at time steps of 0.01 and 0.05 s. R t 2 has been shown to be a robust means of evaluating the similarity of two temporal concentration profiles in optimization (Sonnenwald et al., 2013). Longitudinal concentration profiles generated at the two different time steps showed very good agreement, with R t 2 = 0.999 at all downstream locations. D x values were within 1% of each other.

Mesh Independence
Triangular computational meshes were generated with the ANSYS Meshing tool. Mesh independence checks were carried out on two configurations represented by 0.333 × 0.333 m models, full details of which can be found in the Supporting Information. D8 at ϕ = 0.18 was examined as it contains all of the cylinder diameters and a representative range of expected cylinder spacings; D2 at ϕ = 0.01 was chosen as it has the largest gaps between cylinders. Based on a comparison of 11 different meshing control settings, a mesh size of 1.0 mm with a minimum cell size of 0.2 mm adjacent to cylinders and walls (denoted 1.0, 0.2) was chosen. Changes from a significantly finer resolution (0.5, 0.05) were found to be less than 1% for streamwise velocity, less than 7% for turbulent kinetic energy and turbulent viscosity, and less than 5% for transverse dispersion coefficient. Interestingly, the 10.1029/2021WR030396 7 of 23 error on dispersion coefficient was found to be smaller than the variation associated with altering the location of the point source injection within the same simulation model, which was found to be 12%.
It is acknowledged that we have not demonstrated perfect mesh independence; rather we have shown that the errors in the quantities of interest are acceptably small. Further mesh refinement would lead to required resolutions that would preclude the study at the spatial extent (1.0 × 1.0 m) that we feel is required. For this reason we have identified two published studies that permit validation of the model's fitness for purpose, see Section 3.

2D Versus 3D
To ensure the applicability of 2D simulations, a 3D version of the 0.333 × 0.333 m portion of the D8 configuration at ϕ = 0.18 was compared with a 2D version; please see the Supporting Information for further details. The depth of the 3D model was 0.05 m with a fixed-lid symmetry boundary condition to represent the free surface. Except for the region immediately adjacent to the bed, the velocity field was observed to be uniform with depth, and turbulent viscosity was accurately reproduced. Importantly, checks were also undertaken on scalar transport, confirming that comparable downstream profiles of solute concentration were obtained in both cases. It should be noted that, using less rigorous meshing criteria than the 2D model, the 3D model of just one ninth of our 1 m 2 tile required more than 39 million cells (roughly 80 times more cells for the same plan area). This is at the upper limit of current practical limits to computation, and extension to a 1 m 2 in 3D would not be feasible. The trade-off associated with a reduced simulation plan area is not justified.

Non-Dimensionalization
At the experimental design stage, median cylinder diameter (d 50 ) as opposed to mean cylinder diameter (d) was used to characterize cylinder diameter distributions. The median is often considered to be a more appropriate indicator of central tendency than the mean for natural, non-Gaussian, distributions such as particle size or cylinder diameter. The experimental design includes comparisons between uniform and non-uniform cylinder diameter distributions based on equivalent d 50 of 8 mm, that is, D3, D7 and D8.
However, in this case, the stepped distributions used for D7 and D8 respectively mean that d 50 does not provide a robust estimate of the distributions' characteristic length scales. The mean cylinder diameters for D7 and D8 were closer to 11 and 10 mm respectively, meaning that the actual Reynolds numbers for these simulations were higher (Re d = 675 for D7 and Re d = 619 for D8) than, and therefore not directly comparable with, those associated with the uniform cylinder diameter distributions (Re d = 500). The dispersion coefficients were therefore non-dimensionalized by dividing by Ud, where U is the velocity estimated in the 1D optimisation of the longitudinal dispersion coefficient and d is mean cylinder diameter. In practice, differences between nominal and optimized values of U were negligible, around 1% on average. Tanino and Nepf (2008) showed that turbulent processes within vegetation are affected by vegetation density.
Where there is sufficient spacing between the cylinders, turbulent processes are governed by cylinder diameter as the relevant length scale. However, once a threshold density of vegetation is reached, cylinder spacing becomes the relevant length scale. Sonnenwald et al. (2019a) subsequently demonstrated that longitudinal dispersion coefficients determined for three alternative laboratory cylinder arrays converged when non-dimensionalized by the median edge-to-edge cylinder spacing, s 50 . We have therefore also non-dimensionalized the dispersion coefficients based on s.
Specific solid volume fractions for each reach of each trace were determined by calculating the area between the edges of the tracer cloud, determined by 1% of peak concentration. Similarly, both mean and median cylinder spacing and cylinder diameter were also calculated over the area enclosed by the edges of the tracer cloud.

Introduction
In this section we present two cases to validate the CFD based modeling methodology. The first is based on a qualitative comparison with the laboratory flow-field measurements presented by Ricardo et al. (2014). We then utilize the analysis presented by Sonnenwald et al. (2019a) to validate our model predictions of the longitudinal dispersion coefficient. Prior to discussing the comparisons between the measured and modeled data sets, it is important to note the following aspects. The bottom of the laboratory flume was covered with a horizontal layer of gravel and sand, while an acetate sheet was placed at the free surface to ensure optical stability and to eliminate laser sheet reflections. Neither of these boundary effects are represented in the 2D CFD model. The horizontal resolution of the PIV data was reported as 1.3 mm, which is larger than the CFD set-up we adopted, where the maximum cell size was 1.0 mm. The full-width PIV maps were compiled from multiple sections patched together.

Flow
Figure 2 compares streamwise velocity u at measurement sections P7 and P8. Qualitatively the plots confirm that the expected cylinder-induced velocity variations are reproduced in the CFD model. The streamwise velocity matches particularly well in the less dense section, P7. In the denser section, P8, there are several places where the CFD velocity predictions are faster, but as these occur between cylinders where there is little or no PIV data, direct comparisons are not feasible. Patches of low velocity, in the cylinder wakes, appear to be slightly longer in the simulation in both cases. An additional plot presented in the Supporting Information ( Figure S25) provides a direct comparison between the zero velocity u contours, again confirming that any differences in the shape and magnitude of the cylinder wake are negligible. The Supporting Information also provides comparisons between the transverse profiles of measured and modeled u velocity profiles at x = 7.440 m and x = 7.545 m ( Figure S26), where it is noted that minor differences in the magnitudes of the low velocities may reflect differences in the spatial resolution of the two data sets. Not presented for brevity, the transverse velocity component is similarly predicted well. Figure 3 compares the PIV measurement of turbulent kinetic energy k, given as 0.5 (u'u' + v'v'), with the CFD prediction. Note, 0.5(u'u' + v'v') is the 2D analogue of turbulent kinetic energy, omitting w'w', which cannot be measured with a 2D PIV system. The quality of the estimation of this parameter is important for predicting turbulence, as the CFD model uses k to estimate μ t , and hence turbulent diffusion.
Again, qualitative comparisons between the measured and simulated data suggest that the CFD model reproduces the main features of the observed turbulent kinetic energy distribution, both in pattern and in magnitude. P7 is dominated by individual cylinder wakes. For most cylinders, the prediction is reasonable, but for some wakes, for example, at x = 7.43, y = 0.28 m, turbulence appears to be under-estimated. Whilst it is difficult to physically explain the much larger PIV estimates of turbulence, the reported values may have been affected by spatial and temporal averaging as part of the PIV analysis. It is also worth noting that in other locations (not pictured), turbulence was over-estimated by the CFD model. For P8, both the turbulence location and levels are predicted well overall. Interestingly, where the predictions at P8 most obviously differ from the measured data is in the spread of k. Particularly, at x = 7.54, y = 0.14 m and x = 7.54, y = 0.37 m, the CFD model predicts a dip in turbulence that is not shown in the PIV results. The Supporting Information also provides comparisons between the transverse profiles of measured and modeled u velocity and turbulent kinetic energy profiles at x = 7.440 m and x = 7.545 m ( Figure S26 and S27 respectively).
Whilst some differences in the flow and turbulence quantities have been noted, the overall impression is that the observed velocity and turbulent kinetic energy distributions of this complex flow field are reproduced convincingly by the CFD model. Acknowledging the uncertainties in the laboratory data (non-uniform flow depth, rough bed, and the inconsistencies in wake definition in section P7), and bearing in mind the simplifications inherent in the CFD model (2D approximation with no bed or free surface effects), these comparisons demonstrate that the 10.1029/2021WR030396 9 of 23 CFD model set-up is fit-for-purpose in the context of assessing the effect of cylinder size distribution on mixing in vegetated flows.
Further validation against external data is provided in the Supporting Information as part of our wall boundary condition sensitivity analysis. We demonstrate that the CFD set-up adopted here was capable of accurately reproducing the streamwise velocities and turbulence intensities provided experimentally by D. Liu et al. (2008) and using LES simulation by Stoesser et al. (2010) at key locations in the vicinity of a cylinder. These comparisons also highlight the complementarity of laboratory experiments and the CFD model; the laboratory experiments provide first-hand observational data that provides confidence in the model results; whilst the CFD model provides a controlled experimental environment in which high-resolution data on flow field and turbulence characteristics is easily accessed.

Dispersion-300 mm Flume
As noted above,  showed that a CFD model configured as set out here was capable of reproducing both the magnitude of the laboratory-derived transverse dispersion coefficient, and the trends in variation with solid volume fraction, reported by Tanino and Nepf (2008). However, no previous validation for longitudinal dispersion has been undertaken.
With respect to the longitudinal dispersion coefficient, D x , Sonnenwald et al. (2019a) presented laboratory data relating to three different arrangements of artificial emergent vegetation. One arrangement comprised randomly-distributed 8 mm diameter cylinders, ϕ = 0.027, tested at Re d = 450. This configuration was simulated using a repeating 3 m long CFD simulation of the laboratory flume, with all other simulation set-up options as detailed above. The laboratory experiments used point measurements of dye tracer using four mid-channel, mid-depth, fluorometers to record temporal concentration profiles. Figure 4 compares the three monitored downstream temporal concentration profiles with routed profiles based on the value of D x obtained from an equivalent injection experiment and monitoring locations in the CFD model. All R t 2 values were greater than 0.97, confirming the CFD predicted D x to be acceptable.

Illustrative Examples of Simulation Data and Dispersion Analysis
In this section we present the simulation data corresponding to configurations D3 and D8 at ϕ = 0.05. These two configurations were chosen to provide a contrast between two geometries with the same solid volume fraction, one with a uniform stem diameter (D3, at d = 8 mm) and the other (D8) characterized by a non-uniform stem diameter distribution. Figure 5 presents contours of streamwise velocity u for the complete 1 × 1 m model simulation. Qualitatively, the flow field is as expected, with clear evidence of wake zones downstream of cylinders, high velocity preferential flow paths between the cylinders and some evidence of wake interaction effects between  adjacent cylinders. There is no immediately observable difference due to stem diameter distribution. Note that the upstream (left hand boundary) velocity profile matches the downstream (right hand boundary) velocity profile due to the periodic boundary condition utilized in the simulation set-up. Figure 6 shows the spread of the simulated tracer as it moves along the channel, while Figure 7 shows how the data were processed to generate temporal sum cross sectional concentration profiles for transverse dispersion analysis and transverse sum temporal profiles for longitudinal dispersion analysis. Figure 6 presents a zoomed in view of the central section of the channel. It is clear from Figure 5 that the central portion of configuration D3 is characterized by a band of high velocity flow. It is not surprising, therefore, to note that at 12 s, the dye in the D3 configuration (Figure 6a) has traveled further than the D8 configuration ( Figure 6b). The effects of individual cylinders mechanically splitting and directing the tracer cloud can be clearly seen. It should be noted that these individual snapshots represent specific realizations of geometries and flow paths. However, the dispersion associated with all injection locations and reaches needs to be considered to reach representative conclusions about systematic variations in dispersion characteristics as a function of the cylinder diameter distribution. These comparisons are made in the following section.

Analysis of the Full Data Set-The Impact of Cylinder Diameter Distribution on Dispersion
The mean R t 2 of predictions compared to downstream records for all optimized longitudinal concentration profiles was 0.997, with a standard deviation of 0.006. The mean R t 2 of the optimized transverse concentration profiles was 0.998 with a standard deviation of 0.001. This shows that all traces were represented well using an ADE model. Figure 8 shows the individual optimized values of the longitudinal and transverse dispersion coefficients. There is significant variation in D x at low solid volume fractions, due to the random occurrence of preferential flow paths increasing longitudinal spread. At higher solid volume fractions, cylinders are packed more closely, reducing preferential flow paths, and increasing the averaging effects due to higher numbers of cylinders encountered; hence D x is more consistent. D y shows fairly consistent scatter for all solid volume fractions. Configurations D7 and D8, with non-uniform cylinder diameter distributions, show higher values of D y compared with the uniform cylinder diameter configurations. However, as noted earlier, the data requires non-dimensionalization before direct comparisons may be made. For the majority of the uniform diameter distributions, systematic variation with solid volume fraction is not observed for either D x or D y ; both appear to be essentially independent of solid volume fraction. The D y data does reveal two minor trends with respect to solid volume fraction: in the case of D2, D y appears to increase with increasing solid volume fraction up to a value of around ϕ = 0.1, but then to gradually decrease with further increase in ϕ; for both the non-uniform cylinder diameter distributions, D7 and D8, there is some indication of an increase in D y with ϕ.
At the highest solid volume fractions there is some evidence of a systematic increase in both D x and D y with cylinder diameter as the configurations progress from D2 to D6. The x-axis scatter in Figure 8 indicates that, although a nominal ϕ was defined when creating the geometry, the actual ϕ experienced by the tracer varies significantly depending on the transverse position of the point injection. D x is approximately one order of magnitude greater than D y, which is typical in most flows and consistent with Sonnenwald et al. (2017). Figure 9 shows the mean non-dimensional optimized dispersion coefficients, D/Ud, where U is optimized velocity and d is the mean cylinder diameter. In Figure 9 the x-axis has been changed from ϕ to d/s, the ratio of cylinder diameter to cylinder spacing. This presentation permits direct comparison with Tanino and Nepf (2008) and also discriminates better between configurations with similar values of ϕ, but very different cylinder sizes and cylinder spacings.
The apparent enhanced dispersion due to diameter non-uniformity seen in Figure 8 disappears. Therefore, the original hypothesis is correct; the fundamental dispersion characteristics associated with a vegetated flow are unaffected by the uniformity (or otherwise) of the cylinder diameter distribution. Sonnenwald et al. (2019a) suggested non-dimensionalizing by cylinder spacing instead of cylinder diameter as the characteristic mixing length. Figure 10 shows that D x /Us and D y /Us both collapse reasonably well onto a single line, again removing any evidence of cylinder diameter non-uniformity effects. Furthermore, much of the variation observed at low solid volume fractions in Figure 9 is removed in Figure 10. Using least squares 13 of 23 regression and forcing the equation to pass through the origin, we fitted the following non-dimensional predictive relationships: D x /Us = 1.04 d/s and D y /Us = 0.111 d/s. These relationships can be simplified to D x /Ud = 1.04 and D y /Ud = 0.111, which correspond to the constant values for D/Ud seen in Figure 9. However, it should be noted that the reduced dispersion associated with the D2 configuration persists, particularly for D y . This will be revisited in the discussion.

Sensitivity of Dispersion Coefficients to Injection Location
The experimental design included the use of seven different transverse injection locations for each geometry. All the analysis was undertaken based on data collected from monitoring points located 1.0 m or more downstream from the injection point. This was to ensure that the tracer had interacted with multiple cylinders prior to the first monitoring section in all cases. Nonetheless, Figure 8 reveals significant variations between individual estimates of the dispersion coefficients, particularly for D x at low solid volume fractions, where the range approaches 6.0 × 10 −3 m 2 /s, well in excess of 100% of the mean value. Golzar (2018) used essentially the same methodology as presented here, but-rather than showing transverse dispersion to be independent of solid volume fraction-reported enhanced transverse dispersion around 0.05 ≤ ϕ ≤ 0.10. Further analysis of the Golzar (2018) CFD models suggests that this resulted from a combination of specific geometry realizations and injection location effects. Subsequent analysis with additional injection locations produced results more consistent with the new data presented here.
With fewer cylinders at lower solid volume fractions, averaging effects are reduced, and variations between individual realizations of the geometry are therefore greater. It is acknowledged that some of this uncertainty would have been removed through the introduction of a longer upstream equilibration section, particularly at the lower solid volume fractions. Nonetheless, our experimental set-up is comparable with many physical laboratory experiments, so similar uncertainties can also be expected to be associated with laboratory-derived data. The reach-averaged dispersion coefficients are not applicable for engineering problems characterized by low solid volume fractions and short travel distances. The relationships derived here are intended for application in contexts where dispersion processes apply over a long distance.

Turbulence Quantities and Theoretical Dispersion Relationships
Based on the data presented in Figure 10, we proposed two linear relationships to estimate D x and D y as functions of U, and d. It is expected that these functional relationships will be sufficient for most practical engineering purposes.
Whilst the empirical relationships highlighted in Section 4 provide practical estimation tools, they provide no insights into the underlying processes. As highlighted in Section 3, one of the benefits of CFD modeling is that the underlying flow field and turbulence characteristics can be interrogated to reveal the physical mechanisms responsible for the observed dispersion effects. Here we briefly consider the contribution of turbulent diffusion to overall reach scale dispersion. We then discuss data on C D (drag coefficient), turbulent kinetic energy and turbulent viscosity. These parameters have specifically been selected due to the regularity with which they have been used to explain, or estimate, dispersion coefficients. Figure 11 shows the relative contribution of turbulent diffusion (μ t /(S ct ρ)) to overall reach-scale longitudinal and transverse dispersion for two configurations, D3 and D8. As described earlier, the turbulent diffusion term is isotropic, so the absolute value is the same in all directions. It is clear that turbulent diffusion is a minor component of longitudinal dispersion throughout the range of solid volume fractions. In contrast, as noted by Tanino and Nepf (2008), its contribution to transverse dispersion is significant, particularly at low solid volume fractions. However, our data challenges the Tanino and Nepf (2008) model's expectation that turbulent diffusion reduces to zero at d/s values greater than approximately 3.0 (ϕ > 0.25); instead our data shows that its value tends toward a constant, non-zero, value with increase in solid volume fraction.
This analysis confirms why our solute transport model validation based on D x was convincing, even when discrepancies in the estimation of turbulence quantities were evident. Further validation, and possible refinement, of the model is contingent upon the future availability of high quality laboratory data characterizing both the flow fields and the resultant transverse dispersion. Figure 12 presents C D , total kinetic energy and turbulent diffusivity for the full simulated data set. C D was calculated from the CFD results by extracting the pressure gradient dp/dx used in the periodic boundary condition (units Pa/m) and substituting it into: (4)   where a is frontal facing area, calculated from the known geometry. Frontal facing area is defined as the sum of the diameters of all cylinders in the 1 × 1 m section, that is, ∑

=1
. Area-weighted mean turbulent kinetic energy over the entire CFD geometry is shown in Figure 12b non-dimensionalized by area-weighted mean velocity within the CFD simulation, U CFD . Turbulent diffusivity is presented in Figure 12c as the area-weighted mean non-dimensional turbulent diffusion coefficient, μ t /(S ct ρU CFD s), where μ t is the area-weighted mean turbulent viscosity. In Figure 12d, turbulent diffusivity is non-dimensionalized using d rather than s.
For C D , total kinetic energy, and turbulent diffusion, the data presented in Figures 12a-12c provides further confirmation that the processes contributing to dispersion are consistent for uniform and non-uniform cylinder diameter distributions. The data for the non-uniform cylinder diameter configurations (D7 & D8) collapses onto the same line as the equivalent uniform diameter configuration (D3). Indeed, except for the smallest diameter (D2, d = 4 mm) configuration, all configurations converge to a single non-dimensional relationship. D2 has already been noted as an outlier from all of the other data; this will be addressed in Section 5.3. Figure 12d will be considered below, when we briefly compare the new data with selected previously-proposed relationships.
Drag coefficient C D is used in several models for predicting dispersion coefficient, for example, White and Nepf (2003), Serra et al. (2004), and Tanino and Nepf (2008). It is also used in a variety of hydraulic resistance and other computational models which utilize a momentum sink approach, for example, Sonnenwald, Guymer, and Stovin (2019). The independence of C D from cylinder diameter is somewhat expected given Equation 4 involves the area mean parameters ϕ and a. It is also consistent with the results of previous experimental studies, which have shown that drag coefficient is primarily a function of Re d and ϕ (M. Y. Liu et al., 2020;Sonnenwald et al., 2019b;Tinoco & Cowen, 2013).  Figure 12b compares the area-weighted mean non-dimensional turbulent kinetic energy (symbols) from the CFD simulation with a turbulent kinetic energy prediction based on theory presented by Tanino and Nepf (2008) (lines). (We have substituted the modified coefficients proposed in Sonnenwald, Guymer, and Stovin, 2019 and the method of predicting drag coefficient C D given in Sonnenwald et al., 2019b). The symbols and lines agree well, providing evidence that supports the Tanino and Nepf (2008) relationship between geometry and turbulent kinetic energy. The CFD results also show the same trend as in Tanino and Nepf (2008) Figure 15 for their own and other field studies. As μ t is defined by the ratio of k 2 to the turbulent dissipation rate ε, the consistency between Figures 12b and 12c suggests that this ratio remains constant for different cylinder diameters.
In Figure 12d the CFD results for turbulent diffusion have been non-dimensionalized by cylinder diameter to permit a direct comparison with the Tanino and Nepf model. Interestingly, an initial increase in turbulent diffusion between ϕ = 0.005 and ϕ = 0.020 is evident; this is consistent with the initial rise in turbulent mixing predicted by Tanino and Nepf (2008). All configurations also show decreasing turbulent mixing with increasing d/s, although the observed decrease is less rapid. Whilst the Tanino and Nepf (2008) model implies that turbulent mixing makes no contribution to transverse dispersion at d/s > 3.5, Figure 12d confirms that turbulent mixing contributes to transverse dispersion at all solid volume fractions and for all cylinder diameter distributions. The new CFD-derived transverse dispersion coefficients do not show the characteristic "N" or "hump" shape of the Tanino and Nepf (2008) model. The current data suggest a D y /Ud value of 0.111 (Figure 9), compared with the value of 0.2 recommended by Nepf (2012).
As with Tanino and Nepf (2008), White and Nepf (2003) suggested that the longitudinal dispersion coefficient is also made up of two summative components, namely dispersion due to vortex trapping and cylinder-scale secondary wake dispersion. Their experimental data covered ϕ < 0.06, with the secondary wake component decreasing slightly and vortex trapping increasing linearly with increasing cylinder density. At the higher values of ϕ studied here, the relationship given by White and Nepf (2003) predicts much larger values of D x /Ud. In contrast to the scaling with respect to ϕ suggested by White and Nepf (2003), Lightbody and Nepf (2006) provided a relationship from studies in real vegetation, which estimates D x /Ud independently from the solid volume fraction. The new CFD derived data ( Figure 9) suggests a similar ϕ independent relationship.

The Outlier Behavior for Configuration D2
The configuration based on a 4 mm uniform cylinder diameter (D2) generated estimates of dispersion that consistently fell below those of all other configurations. In the previous section it was shown that the levels of non-dimensional turbulence and drag are also underestimated in this configuration. Whilst mesh independence checks were undertaken (Section 2.5) specifically with this smallest cylinder size, it is possible that the mesh was not sufficiently fine to fully define the smaller-scale cylinder wakes in this configuration. We therefore recommend further exploration of meshing strategies for future studies focusing specifically on the smallest cylinder diameter configurations. In addition, laboratory validation data for this specific configuration would be particularly welcome. Table 2 summarizes a range of previous studies considered to be relevant in this context. Figure 13 compares the new CFD generated longitudinal and transverse dispersion coefficients with those from other numerical and experimental studies of dispersion in cylinder arrays with Re d > 40. Note that the Golzar (2018) results are a direct comparison at Re d = 500, while the other studies' results are a mean taken from a range of Re d . Cylinder spacing is available only for Golzar (2018), Sonnenwald et al. (2017), Sonnenwald et al. (2019a), Lou et al. (2020), and Jamali et al. (2019). For the remaining studies, s has been estimated using Equation 5 (Sonnenwald et al., 2019a;Tanino & Nepf, 2008):

Comparison With Existing Data
The new D x and D y values are generally consistent with those from previous studies, being within one order of magnitude, and generally fitting the linear trends identified in Figure 10. Whilst some scatter in the D x values is noted, no obvious outliers are highlighted.
There is some variation between the new CFD-derived transverse dispersion coefficients and those from other studies. The estimates of D y /Us from Serra et al. (2004) are consistently higher than the new data presented here. The experimental results of Tanino and Nepf (2008) are also higher than the new data for some values of d/s. These data were originally collected by Tanino (2008). The observed differences are believed to reflect differences between the random cylinder placement rules applied in the two studies. Tanino (2008) excluded a 2 d × 2 d square around the cylinder center (such that the minimum edge-to-edge cylinder spacing ranged from the radius to the radius plus 2.6 mm), while in the present study we excluded a smaller region, the radius plus 0.2 mm around the cylinder center. Additionally, Tanino (2008) drilled flume plates at ϕ = 0.20 and ϕ = 0.35, and filled randomly selected holes to generate the lower solid volume fractions. These geometry rules will have reduced the potential for clumping of vegetation, which could explain the difference in the results. The results of Nepf, Sullivan, and Zavistoski (1997) significantly differ from the new CFD-derived results for transverse dispersion coefficient. Tanino (2008) noted a similar discrepancy with her results and suggested this was likely the combination of two effects, a regular staggered array and an injection location upstream of the vegetation.
Overall, the comparison with previously-published data has not highlighted any systematic inconsistencies in our new data, and tends to confirm that D x and D y can be estimated as linear functions of d, s and U.

Conclusions
A CFD-based methodology for quantifying the effects of random cylinder arrays on flow characteristics and solute mixing processes has been proposed and validated against a range of relevant laboratory studies.
The present study focused on the influence of cylinder diameter distribution on both longitudinal and transverse dispersion. Over 130 unique configurations were simulated, comprising combinations of eight different uniform and non-uniform cylinder diameter distributions and 20 different solid volume fractions. The range of solid volume fractions considered here is broader than in any previous study. The CFD-generated dispersion coefficients were shown to be consistent with the results of previous CFD simulations and laboratory experiments.
We found that the CFD-generated dispersion coefficients are largely independent of solid volume fraction.  agreement with those for vegetation with a uniform cylinder diameter. Cylinder diameter distribution does not affect the mixing processes associated with vegetated flows simulated using cylinder arrays.
Non-dimensionalizing by cylinder spacing reveals a linear relationship between dispersion and d/s that is consistent with previously reported laboratory studies. This confirms that cylinder spacing-rather than cylinder diameter-is the relevant length scale for turbulent mixing processes in vegetated flows.
In addition to the "black-box" dispersion relationships derived from the CFD-derived dispersion coefficients, the CFD models provide additional insights into the hydrodynamic processes controlling mixing in vegetated flows. Whilst we have not endeavored here to develop strong theoretical arguments to explain the observed relationships, the detailed flow and turbulence data provided by the CFD model is invaluable to support the future development of process-based solute transport models.

Data Availability Statement
The data presented in Figures 8-12, representing the original results from this work, can be found in Sonnenwald et al. (2021) https://doi.org/10.15131/shef.data.14595870.