Evaluation of a Stochastic Mixing Scheme in the Deep Convective Gray Zone Using a Tropical Oceanic Deep Convection Case Study

A stochastic horizontal subgrid‐scale mixing scheme is evaluated in ensemble simulations of a tropical oceanic deep convection case using a horizontal grid spacing (Δh) of 3 km. The stochastic scheme, which perturbs the horizontal mixing coefficient according to a prescribed spatiotemporal autocorrelation scale, is found to generally increase mesoscale organization and convective intensity relative to a non‐stochastic control simulation. Perturbations applied at relatively short autocorrelation scales induce differences relative to the control that are more systematic than those from perturbations applied at relatively long scales that yield more variable outcomes. A simulation with mixing enhanced by a constant factor of 4 significantly increases mesoscale organization and convective intensity, while turning off horizontal subgrid‐scale mixing decreases both. Total rainfall is modulated by a combination of mesoscale organization, areal coverage of convection, and convective intensity. The stochastic simulations tend to behave more similarly to the constant enhanced mixing simulation owing to greater impacts from enhanced mixing as compared to reduced mixing. The impacts of stochastic mixing are robust, ascertained by comparing the stochastic mixing ensembles with a non‐stochastic mixing ensemble that has grid‐scale noise added to the initial thermodynamic field. Compared to radar observations and a higher resolution Δh = 1 km simulation, stochastic mixing seemingly degrades the simulation performance. These results imply that stochastic mixing produces non‐negligible impacts on convective system properties and evolution but does not lead to an improved representation of convective cloud characteristics in the case studied here.

10.1029/2023MS003748 2 of 23 that the time scale of the large-scale forcing be sufficiently slower than the convective adjustment time (i.e., buoyancy dissipation) and that the grid column be large enough to hold a robust cloud ensemble obeying the "law of large numbers"-a requirement of any bulk deterministic parameterization scheme (e.g., Sakradzija et al., 2016).
It is generally accepted that  Δℎ  (100) m (used for large eddy simulations, or LESs) is required to properly resolve deep convective motions (e.g., Bryan et al., 2003;Lebo & Morrison, 2015).LES closures of the filtered Navier-Stokes equations are constructed to determine the SGS fluxes and to represent the transfer of kinetic energy from the resolved scale to the SGS realistically.However, such closures are generally only appropriate if at least part of the turbulence inertial subrange is resolved, that is, if the model's filter length (typically scaling with Δ h ) lies within the inertial subrange.This is generally not true for km-scale Δ h where the length scale of convective overturning scales with Δ h (Bryan et al., 2003;Lebo & Morrison, 2015).Even though this requirement is violated at the km scale, these traditional LES closures are often applied to km scale simulations in modern numerical models.This demonstrates the two sides of the gray zone problem-that km-scale Δ h is too small for convection parameterizations but too large for LES-based closures.
The Weather Research and Forecasting (WRF) model (Skamarock et al., 2008) is one example of a widely used community mesoscale model in which traditional LES closures are often employed at the km scale, at least for horizontal mixing.The LES closures in WRF generally (though not always) follow a downgradient approach whereby SGS mixing is diagnosed to be inversely proportional to the local gradient of the resolved-scale variable-the larger the gradient, the larger the degree of mixing (this follows from the standard mixing length model with a mixing length and eddy diffusion coefficients that are >0).While vertical mixing in non-idealized km-scale WRF simulations is often performed using a planetary boundary layer (PBL) scheme (typically with higher-order schemes that are not necessarily always downgradient), horizontal mixing is usually performed using a downgradient diffusion approach following the Smagorinsky scheme (Lilly, 1967;Smagorinsky, 1963) or is neglected entirely.For deep convective updrafts, this means that entrainment and lateral mixing of relatively dry environmental air with the updraft is dependent on the resolved-scale gradient across the width of the updraft.
Within the deep convective gray zone, the updraft size spectrum is truncated at a lower size controlled by Δ h , ultimately acting to shift the updraft spectrum to larger sizes compared to those produced by LES (Bryan & Morrison, 2012;Lebo & Morrison, 2015;Stanford et al., 2020).Larger updraft sizes then experience weaker horizontal gradients of vertical momentum and scalars which results in under-mixing compared to LES-produced updrafts.The net result is an established tendency for km-scale models to produce updrafts that are too undilute, leading to excessive vertical transport of heat, momentum, and aerosols to the mid-and upper-troposphere (Lebo & Morrison, 2015;Stanford et al., 2020;Varble et al., 2014aVarble et al., , 2020;;Verrelle et al., 2015Verrelle et al., , 2017)).Stanford et al. (2020, hereafter S20) proposed a method to address the under-representation of lateral mixing in km-scale models by introducing a stochastic mixing scheme in which the mixing coefficients are multiplied by a stochastic perturbation factor that allows updrafts of the same width to experience a wider range of mixing events than possible using the traditional Smagorinsky scheme.S20 found that stochastically perturbing the horizontal mixing coefficient (K h ) in Δ h = 0.5, 1, and 2 km idealized squall line simulations generally resulted in more dilute updrafts owing to more frequent sampling of large K h .However, the degree to which internal updraft core properties were modified was strongly modulated by the impact of stochastic mixing on the squall line's mesoscale thermodynamic and kinematic structure.Motivated by this result, further evaluation of the stochastic mixing scheme is desired in less idealized frameworks that are applicable to the simulation of DMC using km-scale NWP and regional climate models.Moreover, the approach stands to be tested in simulations that are less dependent on cold pool thermodynamic forcing (contrasting with the mid-latitude squall line simulated by S20).

10.1029/2023MS003748
3 of 23 This study extends the S20 framework to 36-hr, Δ h = 3 km simulations of a tropical oceanic deep convection event that occurred from 7 to 8 December 2011 during the Atmospheric Radiation Measurement (ARM) Madden-Julian Oscillation (MJO) Investigation Experiment (AMIE) and DYNAmics of the MJO (DYNAMO) field campaign (Long, 2016;Yoneyama et al., 2013).Situated over the central Indian Ocean, this event exhibited a broad field of small, isolated convective cells that evolved into an organized mesoscale convective system (MCS).Specific foci independent of S20 include establishing sensitivities of convection to the spatiotemporal scale of the stochastic perturbations and framing results within the context of observations from an S-band radar and a Δ h = 1 km simulation.Particular emphasis is placed on establishing the sensitivity of precipitation to stochastic mixing.
The remainder of the paper is organized as follows.Section 2 describes the model details including the stochastic horizontal mixing scheme.Section 3 details data sets, Section 4 outlines the experimental design, and Section 5 describes the case and results.Finally, Section 6 provides a discussion and conclusions.

Model Setup
All simulations are performed using the Advanced Research WRF (WRF-ARW) model version 3.9.1 with spatial domains shown in Figure 1.These include a 1,200 km × 1,200 km domain with Δ h = 3 km which is used for all stochastic mixing simulations and the control simulation.A 600 km × 600 km domain is also one-way nested to Δ h = 1 km (blue box in Figure 1) and used for a resolution sensitivity test.The Δ h = 3 km simulations use 61 vertical levels with vertical grid spacing (Δz) increasing from ∼100 to 200 m in the lowest 1 km to ∼300-400 m in the free troposphere.The Δ h = 1 km simulation uses 101 vertical levels with Δz increasing from ∼100 to 200 m in the lowest 1 km to ∼200 m in the free troposphere.
Physics parameterizations common among all simulations are the Predicted Particle Properties (P3) microphysics scheme (Morrison & Milbrandt, 2015), the Rapid Radiative Transfer Model for General circulation models (RRTMG) longwave and shortwave radiation schemes (Iacono et al., 2008), the Mellor-Yamada-Nakanishi-Niino (MYNN) Level 2.5 PBL scheme (Nakanishi & Niino, 2006) for vertical SGS mixing which predicts second-order subgrid turbulent kinetic energy (TKE), and the Unified Noah Land Surface Model (Chen & Dudhia, 2001), although the surface is mostly open ocean (Figure 1).Two-dimensional horizontal SGS mixing is performed using the Smagorinsky closure described in Skamarock et al. (2008) and modified for stochastic simulations following the description later in this section.Third-and fifth-order advection schemes are used in the vertical and horizontal directions, respectively, with a monotonic limiter applied to advect scalars and moisture.Dynamic time steps of 12 and 4 s are used for the Δ h = 3 and 1 km simulations, respectively.Initial and lateral boundary conditions are provided by the fifth generation ECMWF atmospheric reanalyses of the global climate (ERA5) data set (European Centre for Medium-Range Weather Forecasts, 2019), which is produced at a quarter of the operational model's spatial resolution on a grid with Δ h = 31 km and 37 vertical pressure levels.Boundary conditions are provided every 6 hr and linearly interpolated at intermediate times.The simulations are initialized at 0000 UTC on 7 December and run through 1200 UTC on 8 December (36 hr total).A spin-up time of 12 hr is allowed (0000-1200 UTC on 7 December) before analysis begins.The Δ h = 1 km simulation is initialized at 0600 UTC on 7 December.

Stochastic Mixing Scheme Description
Stochastic perturbations are applied to the horizontal eddy diffusivity coefficient (K h ), for which the diagnostic equation is given by: where c s is the Smagorinsky constant set to 0.25, l h is the horizontal mixing length set to Δ h , and   = (  ∕ +   ∕) is the resolved-scale horizontal deformation tensor.Overlines represent the Reynolds-average operator in Cartesian coordinates (horizontal components x i , x j ) and    and    are the ith and jth components of the velocity vector.Because S20 used an idealized framework, the Smagorinsky closure (similar to Equation 1) was used for both vertical and horizontal diffusion (making it a 3D closure, albeit anisotropic) that included a static stability dependence in calculating K h .Here, vertical diffusion is instead performed by the PBL scheme while horizontal mixing is controlled by the 2D Smagorinsky closure.Diffusion coefficients for scalar mixing are then obtained by dividing K h by the turbulent Prandtl number (P r , here set to 1/3).
Stochastic perturbations are applied to K h following where   ′ ℎ is the perturbed diffusivity coefficient, K h is diagnosed via Equation 1, and γ is the stochastic perturbation.Hereafter, 2 γ will be referred to as the stochastic multiplicative factor, F. The stochastic pattern generator that yields γ is described in detail in S20 (their Section 2b), Stanford et al. (2019), and the Appendix of Berner et al. (2015).The perturbation factor γ follows a zero-mean Gaussian distribution with a wavenumber-dependent perturbation amplitude determined by a user-prescribed grid point variance (σ 2 , where σ is the standard deviation), a temporal autocorrelation scale (τ), and a spatial autocorrelation scale (δ).Herein, σ 2 is chosen to be 4 (following the value used by S20) while the sensitivity to τ and δ is explored following an experimental design described in Section 4. A maximum (minimum) limit of 3σ (−3σ) is applied to γ.A default maximum limit to prevent numerical instability is also applied to K h requiring it to not exceed 30,000 m 2 s −1 for Δ h = 3 km and 10,000 m 2 s −1 for Δ h = 1 km, while the downgradient treatment requires K h > 0. Note that the log 2 sampling space of γ allows F to vary around a median (and geometric mean) value of unity.Stochasticity is implicitly applied to scalar mixing by dividing   ′ ℎ by P r .

Data Sets
The AMIE/DYNAMO field campaign took place from October 2011 through March 2012 and was centered over the tropical Indian Ocean.The broad objective of the field campaign was to observe and characterize the thermodynamic and kinematic structure and evolution of the MJO.We exploit the observational network from the AMIE/DYNAMO campaign to analyze a tropical convection event described further in Section 5.1.

SPOL
The National Center for Atmospheric Research (NCAR) S-PolKa radar (NCAR/EOL S-Pol Team, 1996) is used for evaluating simulated system evolution.S-PolKa is a dual-polarized, dual-wavelength Doppler radar that transmits at 10 cm (S-band) and 0.86 cm (Ka-band).Only the S-band data are used here (and thus the radar is hereafter called SPOL).SPOL was located on Gan Island, Maldives on the Addu Atoll (0.6°S, 73.1°E, see red "×" in Figure 1).Specific details about SPOL's scanning strategy and quality control procedures are described in Rowe and Houze (2014).
The SPOL DYNAMO Legacy data set (Rutledge, et al., 2018a(Rutledge, et al., , 2018b) has a 15-min time interval and is gridded onto a 300 km × 300 km (150-km radial distance) Cartesian grid.The Δ h is 1 km and Δ z is 0.5 km ranging from 0.5 to 20 km above ground level (AGL).For comparison with simulations, the SPOL data set is coarse-grained to Δ h = 3 km while conserving radar reflectivity factor.SPOL DYNAMO Legacy instantaneous rain rates were derived using the Colorado State University blended, tropical rainfall hybrid algorithm that employs both single and dual-polarization variables.This algorithm follows from Thompson et al. (2015), where relationships between radar reflectivity (Z) and rain rate (R) were derived for convective and stratiform regions separately by optimizing the Z-R relationship with observed drop size distributions retrieved from long-term (18-month) disdrometer data at a site near Gan Island.Relationships between R and differential reflectivity (Z dr ) and specific differential phase (K dp ) were also employed through guidance of disdrometer observations following Thompson et al. (2018).The algorithm logic is thus tuned specifically to the AMIE/DYNAMO region using the "best estimate" from these relationships.Rain rates were derived at a height of 2.5 km AGL, which is higher than the lowest radar elevation angle at a radial distance of 150 km and well below the freezing level in the Tropics (∼4 km AGL), thereby avoiding the inclusion of enhanced reflectivity associated with the bright band signature in robust stratiform regions.
Included with the "best estimate" are lower and upper uncertainty bounds that account for radar measurement and rain rate equation uncertainties.To define rain rate uncertainty for the domain, a bootstrapping-like technique is employed.For each time, the rain rate at each grid point is randomly sampled as the minimum, best estimate, or maximum value.All grid point values across the domain are then averaged or summed depending on the desired metric, and this process is repeated 1,000 times.The median value of the 1,000-member distribution is then used to describe the best estimate and the lower and upper bounds are defined as the centered 68.2% of the distribution (corresponding to a single normally distributed standard deviation).As shown below, spatial averaging yields very small uncertainties in observed rain rates.This data set has been used in several recently published studies (e.g., Cheng et al., 2018Cheng et al., , 2020;;Rowe et al., 2019;Shell et al., 2020).Simulated instantaneous rain rates are also computed at a height of 2.5 km AGL.

Convective/Stratiform Partitioning
SPOL reflectivity is partitioned into convective and stratiform regions following the algorithm of Steiner et al. (1995, hereafter S95).This algorithm is texture-based and employed at 2.5 km AGL.Convective regions are iteratively identified by grid points in which (a) the reflectivity exceeds 40 dBZ or (b) the reflectivity exceeds the background value within an 11-km radius by an assigned threshold that decreases as background reflectivity increases.Convective regions are then laterally expanded as a function of the background reflectivity, with a maximum radial expansion of 6 km if the background reflectivity exceeds 40 dBZ.Stratiform regions are then identified as all grid points with reflectivity >0 dBZ not identified as convective.Our results are not qualitatively different with modifications to the tuneable parameters within the S95 algorithm.
Convective/stratiform partitioning is also performed in the simulations at 2.5 km AGL using the S95 algorithm.Simulated reflectivity is computed in P3 by integrating the sixth moment of the hydrometeor melted equivalent diameter size distributions for each hydrometeor species and summing across species (ice, cloud water, and rain).Different dielectric factors between liquid and ice are accounted for following Smith (1984).We note that Rayleigh reflectivity in km-scale tropical DMC simulations is known to be high-biased (e.g., Lang et al., 2007;Li et al., 2008;Matsui et al., 2009;Stanford et al., 2017;Varble et al., 2011) which results in an expansion of simulated convective regions.Because this bias originates from both dynamical and microphysical sources in the model, it cannot be explicitly controlled.
The S95 algorithm is also used to identify contiguous convective reflectivity (CCR) cores, defined as a masked region of horizontally contiguous radar reflectivity identified as convective at 2.5 km AGL.Motivation for identifying CCR cores is provided by K. E. Hanley et al. (2015), Machado and Chaboureau (2015), and Stein et al. (2015), all of which showed sensitivity of convective cell size and cell-averaged properties to the turbulent mixing length (a component of K h , see Equation 1).In addition, S20 showed that modifying the mixing coefficient altered the size distribution of updraft cores, which suggests that reflectivity/precipitation cores will also be impacted.CCR cores are identified at each output time (every 15 min) and those that lie along the edge of the domain (i.e., a core containing grid points along the domain boundary) are excluded from statistical analysis.

Experimental Design
All simulations are summarized in Table 1.A control simulation (named BASELINE) is performed with Δ h = 3 km and uses standard Smagorinksy mixing.An additional simulation using standard Smagorinsky mixing is performed with Δ h = 1 km (BASELINE_1KM).For comparison purposes, BASELINE_1KM is coarse-grained via averaging to Δ h = 3 km and is named BASELINE_1KM_CG_3KM; the purpose of the Δ h = 1 km simulation is to assess Δ h effects on stochastic mixing performance.Two stochastic ensembles (five members each) are constructed with the ensembles differing only by their spatial and temporal autocorrelation scales.The STOCH_SHORT ensemble employs a temporal autocorrelation scale of 600 s (10 min) and a length scale of 10 km.The STOCH_LONG ensemble uses a temporal scale of 10,000 s (∼2.8 hr) and a length scale of 300 km.
Examples of the stochastic perturbations for STOCH_SHORT and STOCH_LONG are shown in Figure 2 for a single member of each ensemble.The STOCH_SHORT pattern in Figure 2a evolves relatively more quickly in space and time than the STOCH_LONG pattern in Figure 2b.The only difference among the five members of each ensemble is the random number seed used for the stochastic perturbations.In subsequent figures, the STOCH_SHORT ensemble members have red line colors, in which varying hues are used simply for visible distinction among individual ensemble members.The STOCH_LONG ensemble uses blue hues. 10.1029/2023MS003748 6 of 23 Two additional simulations are included with constant perturbations to the mixing scheme.The 4X simulation applies a constant F = 4 to the diagnostic K h value (Equation 1), with F = 4 being chosen for consistency with S20 and to maintain numerical stability.The NO_MIXING simulation turns off explicit horizontal mixing entirely by using F = 0. Note that the NO_MIXING simulation is analogous to reducing turbulence dimensionality from 3D to 1D as done in Machado and Chaboureau (2015), Verrelle et al. (2015), and Fiori et al. (2010).That is, all simulations other than NO_MIXING perform explicit 3D mixing via a combination of the PBL scheme and the horizontal Smagorinsky mixing scheme while NO_MIXING performs only vertical mixing via the PBL scheme.
Finally, an ensemble with five members is constructed using the control configuration for mixing but with random grid-scale perturbations applied to the initial boundary layer potential temperature (θ) field.Following S20, these perturbations are sampled from a zero-mean Gaussian distribution with a maximum amplitude of ∼0.1 K.The upscale growth of these perturbations provides a measure of spread produced from grid-scale state variable noise alone.This ensemble (named θ-pert) is included to determine whether or not the sensitivity to stochastic mixing is distinguishable from that of white noise.In subsequent figures, members of the θ-pert ensemble use purple line colors.

Case Overview and BASELINE Simulations
The case studied herein occurred from 7 to 8 December 2011 during which the AMIE/DYNAMO domain experienced suppressed MJO conditions (Guy & Jorgensen, 2014;Yoneyama et al., 2013).Environmental soundings  at 12-hr increments from 0000 UTC on 7 December through 1200 UTC on 8 December are shown in Figure 3.The observed soundings were released from Gan Island while the simulated soundings are spatially averaged over the SPOL domain.In the following discussion, we discuss the relative performance of the Δ h = 3 km control simulation (BASELINE) and the Δ h = 1 km simulation (BASELINE_1KM), both of which employ standard (non-stochastic) Smagorinsky mixing.
At 0000 UTC on 7 December, the atmospheric column over Gan Island was relatively dry with low-level northwesterly winds that veered with height to westerlies above 9 km AGL, all of which was captured well by the model (Figure 3a).From 0000 to 1200 UTC, small regions of precipitation developed within the SPOL domain.
A region of large-scale convergence and convective activity stretched eastward from the Somalia coast between 0° and 10°N toward the AMIE/DYNAMO domain (not shown).The model initial conditions thus include regions of enhanced precipitable water and surface mass convergence within the northwest quadrant of Figure 1.
The column moistened significantly between 1200 UTC on 7 December and 0000 UTC on 8 December (Figure 3c).At 2100 UTC on 7 December, SPOL displayed regions of small, isolated cells with one larger, more contiguous reflectivity region (Figure 4a).The BASELINE and BASELINE_1KM simulations (Figures 4e  and 4i) also produced small, isolated cells with the BASELINE simulation producing a relatively larger, spatially contiguous reflectivity region.The frequency of cells increased between 1200 and 2100 UTC as they traversed across the SPOL domain with a southeasterly storm motion.By 0000 UTC on 8 December (Figures 4b, 4f, and 4j), a region of stratiform precipitation was observed in the SPOL domain along with scattered, isolated convective cells.The BASELINE_1KM simulation did not produce this region of stratiform precipitation and instead maintained small, isolated cells (Figure 4f), whereas the BASELINE simulation produced a small region of stratiform-like precipitation that was smaller than observed (Figure 4j).
By 0600 UTC on 8 December (Figures 4c, 4g, and 4k), SPOL showed more widespread convective cells and continued stratiform precipitation, while simulations at both Δ h continued to show less organization than observed with very little stratiform precipitation.By 1200 UTC on 8 December, the simulations produced a more organized MCS with some quasi-linear convective features embedded in stratiform precipitation that moved eastward across the SPOL domain similar to observed (Figures 4d, 4h, and 4l).Analysis of the atmospheric column at this time (Figure 3d) shows that the simulations captured continued moistening like observed.This event thus exhibited a transition of small, isolated convective cells within a tropical oceanic environment into a quasi-linearly organized MCS.However, we note that convection also formed within the entire northern half of the 1,200 × 1,200 km Δ h = 3 km model domain beyond SPOL coverage associated with a large-scale moisture supply and surface convergence (not shown).

Precipitation Structure and Evolution
Precipitation comparisons between SPOL and simulations are limited to the SPOL domain.Domain-accumulated volumetric rainfall, defined as the sum of all grid point instantaneous rain rates in the SPOL domain multiplied by the grid cell area, is shown as a function of time in Figure 5.The BASELINE simulation captures the observed precipitation evolution well.The STOCH_SHORT ensemble members (Figure 5a) follow relatively closely to BASELINE, while two members of the STOCH_LONG ensemble deviate significantly between 0000 STANFORD ET AL. 10.1029/2023MS003748 9 of 23 and 0900 UTC on 8 December (Figure 5b).All θ-pert ensemble members also follow very closely to BASELINE (Figure 5c).The 4X, NO_MIXING, and BASELINE_1KM_CG_3KM simulations are also shown in Figure 5c.The latter two simulations are similar to BASELINE, while the 4X simulation produces a larger amount of rainfall between 2100 UTC on 7 December and 0600 UTC on 8 December, similar to the two members of the STOCH_LONG ensemble mentioned above in Figure 5b.
The differing precipitation evolution of ensemble members relative to BASELINE in Figure 5 partially results from differing MCS locations within the SPOL domain and partially from differing MCS precipitation structures.Radar reflectivity at 2.5 km AGL at 0300 UTC on 8 December is shown in Figure 6 for a 600 km × 600 km subdomain.Each simulation produces varying degrees of mesoscale organization.Most notably, the 4X simulation produces a relatively smooth and spatially contiguous reflectivity field.The sharp peak in volumetric rainfall around this time in Figure 5c results from the passage of this system through the SPOL domain.In contrast, the NO_MIXING and BASELINE_1KM domains are dominated by small, isolated cells, while BASELINE lies in between the extremes of 4X and NO_MIXING.All members of the STOCH_SHORT ensemble (Figures 6f-6j) produce slightly larger regions of contiguous reflectivity and lesser frequency of small, isolated cells compared to BASELINE.The STOCH_LONG ensemble (Figures 6k-6o) shows more extreme variability in reflectivity structure among its members consistent with varying degrees of mesoscale organization.
Rain rates are partitioned by the S95 convective/stratiform algorithm and shown as a function of time in Figure 7. Notably, the BASELINE_1KM_CG_3KM simulation produces similar convective and stratiform rain rates compared to observations (Figures 7e and 7f), lending credibility to its role as a quasi-benchmark simulation despite its grid spacing still being within the gray zone.All other simulations produce larger convective rain rates and smaller stratiform rain rates than observed, which is consistent with Varble et al. (2014aVarble et al. ( , 2014b) ) who showed an overestimation of convective rainfall and an underestimation of stratiform rainfall in simulations of tropical DMC.The STOCH_SHORT ensemble systematically produces larger convective rain rates compared to BASELINE while stratiform rain rates are similar.The STOCH_LONG ensemble shows greater variability than STOCH_SHORT but retains the systematic behavior of larger convective rain rates compared to BASELINE.The 4X simulation produces the largest convective rain rates while NO_MIXING and the θ-pert ensemble members

Contiguous Convective Reflectivity (CCR) Cores
Contiguous convective reflectivity (CCR) core properties are shown as a function of time in Figure 8.The number of CCR cores systematically decreases for the STOCH_SHORT ensemble (Figure 8a) while the mean CCR core area increases (Figure 8d), resulting in a total CCR core area (analogous to convective area) that is not significantly different from BASELINE until after 0300 UTC on 8 December for some ensemble members (Figure 8g).Note that the total CCR core area evolution in Figures 8g-8i is closely tied to the MCS evolution and thus generally follows the evolution of domain-accumulated volumetric rainfall from Figure 5.The STOCH_LONG ensemble produces the greatest variability in CCR core properties (Figures 8b, 8e, and 8h), but on average mimics the STOCH_SHORT ensemble tendency to produce a smaller number of larger CCR cores than BASELINE.The θ-pert ensemble produces the least variability (Figures 8c, 8f, and 8i).
The 4X simulation (Figures 8c, 8f, and 8i) produces the smallest number of cores that are largest in size while the NO_MIXING simulation produces cores similarly sized to BASELINE and SPOL, but with more cores than BASELINE in better agreement with SPOL.The shift in 4X to wider cores is consistent with K. E. Hanley et al. (2015) and Machado and Chaboureau (2015) who showed that increasing the mixing length (analogous to increasing K h in our 4X simulation) produces wider convective cells.However, we note that K. E.  scheme.They showed that increasing the c s parameter in Equation 1 (effectively increasing K h ) produced smaller convective core numbers, larger core areas, and greater total core area.These findings are further consistent with S20's finding of wider updraft cores in 4X relative to BASELINE for mid-latitude squall line simulations.Note that because CCR core-averaged rain rates increase with core size (not shown), the larger core sizes in 4X and the stochastic mixing ensembles are consistent with greater convective rain rates compared to BASELINE shown in Figure 7. CCR core properties are also strongly sensitive to the native Δ h , whereby BASELINE_1KM_ CG_3KM produces narrower cores relative to BASELINE and more cores by up to a factor of about 2 (Figures 8c  and 8f).This is associated with greater total CCR core area (Figure 8i).An increase in number and decrease in size of CCR cores with decreasing Δ h , even when the higher resolution simulation is coarse grained, is consistent with the sensitivity of updraft core size and number to Δ h found by Varble et al. (2020).However, although narrower cores in BASELINE_1KM_CG_3KM are closer to SPOL at some times, the much larger number of cores deviates significantly from observations.

Microphysical and Dynamical Response to Stochastic Mixing
In an effort to provide a physically based explanation for the response of precipitation and convection characteristics to stochastic mixing, we next expand the analysis to include the entire 1,200 km × 1,200 km domain such that bulk statistics are not limited by varying placement of the SPOL domain relative to MCS placement in the simulations, while omitting evaluation of the Δ h = 1 km simulation since it samples a different domain.
Domain-and time-accumulated volumetric rainfall for the full domain is summarized for each simulation in Table 2 along with percentage differences relative to BASELINE.Note that these values are accumulated at every time step, whereas those used for comparison with SPOL in Figure 5 used instantaneous rain rates every 15 min.All STOCH_SHORT ensemble members produce less rainfall compared to BASELINE with differences ranging from 3% to 6%.Four of the STOCH_LONG ensemble members produce more rainfall relative to BASELINE, by up to 9.5%, with one member producing less.For the θ-pert ensemble, one member produces 4% more rainfall than BASELINE, while all other members remain below 2% differences.The 4X simulation produces the greatest difference with 18% more rainfall compared to BASELINE, while NO_MIXING produces only 1% more rainfall.
To understand these differences, we present bulk condensed water budgets next.

Bulk Condensed Water Budget
Following Morrison et al. (2015) and Ferrier et al. (1996), the bulk condensed water budget is defined by: where PRE is the surface rain rate expressed as a mass flux (units of kg m −2 s −1 ), 〈COND〉 is the column-integrated condensation rate, 〈EVAP〉 is the column-integrated evaporation rate, and RES is a residual term that represents condensate remaining in the atmosphere and net transport across lateral boundaries.The angled brackets 〈 〉 represent vertical integration following: where P is the relevant process rate (in units of kg kg −1 s −1 ), ρ is air density, and the integration is performed from the model surface to the model top.The 〈COND〉 term includes contributions from the conversion of water vapor to rain, cloud water, and ice and the 〈EVAP〉 term includes the conversion of cloud water, rain, and ice to water vapor.The RES term is an order of magnitude smaller than the other terms and is omitted from the following analysis.Also note that the PRE, 〈COND〉, and 〈EVAP〉 terms are accumulated at every model time step in order to perform a complete budget analysis, making PRE analogous to the values listed in Table 2.
Time-averaged PRE, 〈COND〉, 〈EVAP〉, and precipitation efficiency (PE, defined as PRE/〈COND〉) are shown in Figure 9 as differences of simulations relative to BASELINE, with the absolute value of the BASELINE The RES term in convective regions is also much larger than in the domainmean averages and represents net transport of condensate out of convective regions (not shown).Residual condensate does not impact interpretation of convective 〈COND〉 and 〈EVAP〉, but note that PE in convective regions (Figure 9h) therefore does not account for condensate transport that later falls as stratiform precipitation.
The STOCH_SHORT ensemble produces smaller domain-mean 〈COND〉 and 〈EVAP〉, leading to small differences in PE relative to BASELINE (Figure 9d), with lower overall PRE consistent with less rainfall (Table 2).
The STOCH_LONG ensemble produces mostly larger PRE and 〈COND〉, but also larger 〈EVAP〉, leading to variable responses in PE relative to BASELINE (Figure 9d).Little variability around BASELINE is produced by the θ-pert ensemble compared to stochastic ensembles.Note that convective mean PRE, 〈COND〉, and 〈EVAP〉 increase in the STOCH_SHORT ensemble relative to BASELINE (decreasing PE), despite the domain-mean 〈COND〉 and 〈EVAP〉 decreasing in this ensemble.Most STOCH_LONG ensemble members also increase the convective 〈COND〉 and 〈EVAP〉, but with greater 〈EVAP〉, resulting in three ensemble members producing larger PE and two members producing smaller PE.Since condensation in convective systems is heavily controlled by saturated convective updrafts, an explanation for these results is explored next through analysis of vertical mass flux, convective area, and convective intensity.

Convective Mass Flux, Area, and Intensity
Time series of domain-mean vertically integrated mass flux, 〈MF〉, for all positive (upward) motions (hereafter 〈MF〉 TOT ) and for cloudy convective updrafts only (w > 1 m s −1 and total water content (TWC) > 0.1 g m −3 , hereafter 〈MF〉 CONV ) are shown in Figure 10.The 4X simulation produces smaller 〈MF〉 TOT relative to BASELINE but larger 〈MF〉 CONV .This shows that although enhancing mixing decreases total upward motion, convective mass flux becomes stronger, consistent with larger domain-mean and convective PRE, 〈COND〉, and PE (Figure 9).On the other hand, the NO_MIXING simulation produces greater 〈MF〉 TOT by up to a factor of 2 but smaller 〈MF〉 CONV .Greater 〈MF〉 TOT in NO_MIXING agrees with greater 〈COND〉 in Figure 9b, while smaller 〈MF〉 CONV agrees with smaller convective 〈COND〉 (Figure 9f).Greater domain-mean and convective 〈EVAP〉 in NO_MIXING reduce PE relative to BASELINE, thus producing only small changes in domain-mean PRE despite greater domain-mean 〈COND〉.
All members of the STOCH_SHORT ensemble produce greater mean 〈MF〉 TOT and greater mean 〈MF〉 CONV compared to BASELINE (Figures 10a and 10d).The STOCH_LONG ensemble members also tend to produce greater 〈MF〉 TOT and 〈MF〉 CONV , but with greater variability.With the STOCH_SHORT ensemble producing greater 〈MF〉 CONV and convective 〈COND〉 and 〈EVAP〉, what drives the decrease in total rainfall and domainmean 〈COND〉 and 〈EVAP〉?To answer this, we evaluate the areal coverage of convection.
Figure 11 shows differences relative to BASELINE for time-averaged total echo area, convective area, and stratiform area, and domain-mean convective and stratiform rain rates, using S95 to define convective and strati-    10.1029/2023MS003748 15 of 23 form grid points.The STOCH_SHORT ensemble consistently produces less areal coverage of convection and precipitation but with larger convective rain rates.This is also how the 4X simulation behaves, though with greater stratiform area and stratiform rain rates as well.The NO_MIXING simulation tends to produce larger convective and stratiform areas with weaker convective and stratiform rain rates.
The combined interpretation of vertical mass flux, convective area, and convective rain rates is that enhancing mixing (i.e., 4X) decreases the areal coverage of convection but increases its intensity in terms of convective mass flux and convective rain rates, while larger stratiform area and stratiform rain rates enhance total rainfall, consistent with an overall increase in 〈COND〉 and PE.On the other hand, despite the NO_MIXING simulation producing an increase in convective and stratiform areas, the convective intensity is reduced along with a reduction in PE owing to greater 〈EVAP〉 despite larger 〈COND〉 relative to BASELINE.Interestingly, the STOCH_SHORT ensemble qualitatively follows behavior of the 4X simulation with decreased convective area and increased convective rain rates, but differs with a decrease in stratiform area.Despite considerably more variability in the STOCH_LONG ensemble, it also reduces convective area and increases convective rain rates.

Mesoscale Organization and Updraft Dilution
To quantify the impact of stochastic mixing on mesoscale organization, we compute the organization index (I org ) introduced by Tompkins and Semie (2017), which follows from previous convective clustering approaches introduced by Weger et al. (1992) and Zhu et al. (1992).Convective updraft grid cells (w > 1 m s −1 and TWC > 0.1 g m −3 ) are identified at 2.5 km AGL.Updraft objects are computed via contiguous convective updraft grid points, and the distance to the nearest neighboring convective updraft object is computed using the locations of the center of mass for each updraft object, forming a nearest neighbor cumulative density function (NNCDF).A Weibull distribution is assumed to represent a Poisson point process that would be valid for entirely randomized convective organization: where r is the nearest neighbor distance and λ is the number of points per unit area.The NNCDF is then plotted against NNCDF random and the area under the curve is integrated to yield I org (see Appendix A and Figure 18 from Tompkins and Semie (2017) for more details).For completely random convective organization, I org = 0.5.I org < 0.5 indicates regularly distributed updrafts while I org > 0.5 indicates organized (clustered) updrafts.In general, larger I org indicates a greater degree of mesoscale organization.
Figure 12 shows time series of I org .All simulations produce I org > 0.5, which is expected for this case study of a quasi-organized MCS and is consistent with the levels of observed organization for AMIE/DYNAMO case studies evaluated by Cheng et al. (2018Cheng et al. ( , 2020)).All members of the STOCH_SHORT ensemble (Figure 12a) have larger I org compared to BASELINE.For most times, all members of the STOCH_LONG ensemble also produce larger I org compared to BASELINE (Figure 12b).The 4X simulation produces a much larger degree of organization than BASELINE, while NO_MIXING produces less (Figure 12c).This is consistent with Zhang et al. (2021) in which larger mixing coefficients led to greater levels of organization.In addition, the BASE-LINE_1KM_CG_3KM simulation produces smaller I org than BASELINE (closer to NO_MIXING), while all members of the θ-pert ensemble are similar to BASELINE.
The greater degree of organization in 4X (the opposite for NO_MIXING) is related to the relative dilution of updrafts, confirmed through a passive fluid tracer analysis.Passive fluid tracers (ϕ, units of kg −1 ) are initialized with a value of unity in the bottom 1 km of the domain, and zero above 1 km, 12 hr into the simulation and are allowed to be transported horizontally and vertically.Figure 13 shows the average ϕ within convective updrafts  (ϕ updraft ), time-averaged from 3 hr after tracer initialization through the end of the simulations (15 hr).The 4X simulation (Figure 13c) produces larger values of ϕ updraft compared to BASELINE while NO_MIXING produces less.This indicates that although more mixing occurs in 4X, the updrafts within 4X are actually less dilute.The opposite is true for NO_MIXING.Consistent with the metrics previously discussed, the stochastic simulations also produce more ϕ updraft than BASELINE, with a more consistent response in STOCH_SHORT and a more variable response in STOCH_LONG (Figures 13a and 13b).These results indicate that updrafts within more organized convection are less dilute, possibly owing to their greater width but also potentially due to greater protection from dry, ambient environmental air due to clustering with nearby cloudy updrafts.Similarly, Becker et al. (2018) found that although simulated bulk entrainment increased in aggregated versus unaggregated convection, aggregated convection experienced a lesser reduction in updraft buoyancy due to the presence of a relatively moist region around the aggregated updrafts.

Long Autocorrelation Stochastic Mixing Ensemble
While the STOCH_SHORT ensemble exhibits systematic differences compared to BASELINE for all metrics, the STOCH_LONG ensemble is more variable.We further analyze the STOCH_LONG ensemble by limiting the analysis again to the 300 km × 300 km SPOL subdomain.Within this subdomain, the mean factor for perturbed mixing  (γ ) is computed and the relative difference between a given subdomain-mean metric and that from BASE-LINE is calculated for all output times (yielding 97 samples between 1200 UTC on 7 December and 1200 UTC on 8 December).In this way, statistical correlations are produced between    and the relative difference of a given metric compared to BASELINE.Note that a 300 km × 300 km subdomain is chosen since the spatial autocorrelation scale of the STOCH_LONG ensemble is 300 km.
Figure 14 shows correlations between    and the relative difference (from BASELINE) in subdomain total echo area, convective area, and stratiform area for each of the STOCH_LONG ensemble members.A linear regression is performed (blue line in each panel of Figure 14) and the coefficient of determination (r 2 ) is computed and displayed above each panel.For the first three members, there is a general tendency for the STOCH_LONG simulations to produce smaller total echo, convective, and stratiform areas relative to BASELINE when mixing is enhanced and larger areas when mixing is reduced.Little to no correlation is found for the last two members, though the scatterplots reveal some similar behavior to the first three members.For convective area, these results are generally consistent with the 4X and NO_MIXING simulations, whereby enhanced mixing reduces convective area and reduced mixing increases convective area.However, the reduction (increase) in stratiform area for enhanced (reduced) mixing is not consistent with the 4X and NO_MIXING simulations, which both showed an increase in stratiform area.Lesser effects of stochastic mixing on stratiform regions may be related to their slower evolution relative to convective regions.
Correlations between    and domain-accumulated volumetric rainfall, convective rain rates, 〈MF〉 TOT , and 〈MF〉 CONV are shown in Figure 15.Considering again only the first three ensemble members, volumetric rainfall and 〈MF〉 TOT both decrease when mixing is enhanced and increase when mixing is reduced.However, there is no correlation with the convective intensity metrics (convective rain rates and 〈MF〉 CONV ).While the last two members of the ensemble show smaller correlations, there is still a tendency for them to follow the first three members for volumetric rainfall and 〈MF〉 TOT .
The STOCH_LONG simulations follow the behavior of NO_MIXING and 4X regarding 〈MF〉 TOT , whereby enhancing mixing (4X) decreases 〈MF〉 TOT and NO_MIXING does the opposite.However, the response of volumetric rainfall in the STOCH_LONG ensemble is opposite to 4X and behaves similarly to NO_MIXING (though the relative difference between BASELINE and NO_MIXING for volumetric rainfall is rather small, see Table 2).The lack of correlation between    and the convective intensity metrics (including 〈MF〉 CONV ) suggests perturbations to mixing over a long timescale and large spatial extent mainly drive variability in the areal coverage of convection rather than the intensity.In essence, the STOCH_LONG approach modulates system evolution (i.e., mesoscale organization) over time scales large enough to induce drift from the systematic response that was seen in the STOCH_SHORT simulations relative to BASELINE, but not long or large enough to produce the same systematic response as the constant modified mixing simulations.Furthermore, the lack of correlations between    and various convective and rainfall parameters in the last two members of the STOCH_LONG ensemble suggests that mixing perturbations in other parts of the domain and/or at prior times modulate the local system evolution such that a systematic response relative to BASELINE (i.e., the slope of the regression lines) is not produced across all ensemble members.

Discussion and Conclusions
The stochastic horizontal mixing scheme introduced by Stanford et al. (2020) was evaluated here in Δ h = 3 km ensemble simulations of a tropical oceanic deep convection event and presented together with constant modified mixing simulations, observations, a higher resolution (Δ h = 1 km) simulation, and an ensemble applying random grid-scale noise to the initial conditions.
Compared to a control simulation (BASELINE), increasing mixing coefficients by a constant factor of four (4X) enhanced mesoscale organization with a smaller areal coverage of convection but greater convective intensity and greater stratiform rainfall.Despite the enhanced mixing coefficients, updrafts in the 4X simulation were actually less dilute than BASELINE (demonstrated using a passive tracer), and thus presumably less subject to entrainment of dry environmental air.This behavior may have been due to larger updraft core sizes in 4X as shown in S20.Consistent with this picture, updrafts were stronger in 4X compared to BASELINE with greater condensation rates, precipitation, and precipitation efficiency.
Conversely, turning off horizontal mixing completely (i.e., only allowing vertical mixing by reducing the turbulence dimensionality from 3D to 1D, NO_MIXING) decreased mesoscale organization.Narrower convective cores in NO_MIXING were more susceptible to dry air entrainment such that the updrafts were more dilute compared to BASELINE.Both convective intensity and precipitation efficiency were reduced but this was balanced by greater precipitation areal coverage.Thus, whereas increased mixing may be expected to enhance updraft dilution and reduced mixing to do the opposite, it appears that altered updraft widths (following S20) and changes in mesoscale organization countered this expected response.
Stochastic mixing was applied at two different spatiotemporal autocorrelation scales: a relatively short scale of 10 min and 10 km (STOCH_SHORT, roughly scaling with the size and lifetime of individual deep convective updrafts) and a longer scale of ∼3 hr and 300 km (STOCH_LONG, meant to induce mesoscale perturbations).A systematic response was produced in the STOCH_SHORT ensemble whereby all five members experienced somewhat greater mesoscale organization relative to BASELINE, which decreased convective area but increased convective intensity.However, unlike the 4X simulation, the increased mesoscale organization in STOCH_ SHORT was not great enough to induce more rainfall, owing to a smaller relative increase in convective rain rates and a decrease in stratiform area.In general, stochastic mixing applied at these short scales behaved more similarly to the 4X simulation than the NO_MIXING simulation.This is consistent with the fact that the stochastic perturbations γ were applied to the mixing coefficients as relative scaling factors via Equation 2. In other words, the absolute change in mixing coefficient was diminished as the mixing coefficient was reduced; conversely, it was increased as the mixing coefficient was increased (up to an upper limit dictated by numerical stability).Therefore, it is not surprising that the simulations responded more significantly to regions of enhanced versus reduced mixing, an effect also seen in the idealized simulations of S20.
The STOCH_LONG ensemble experienced greater variability among individual ensemble members, but generally behaved similarly to the STOCH_SHORT ensemble with greater mesoscale organization, decreased updraft dilution, and increased convective intensity compared to BASELINE.However, the longer and spatially larger 10.1029/2023MS003748 20 of 23 mixing perturbations of the STOCH_LONG ensemble also modified the system evolution such that there were less systematic changes relative to BASELINE compared to the other simulations with modified mixing.
Simulations were compared to observations from an S-band radar and to a higher resolution Δ h = 1 km simulation.The Δ h = 1 km simulation produced similar accumulated volumetric rainfall and convective and stratiform rain rates as observed, despite producing a higher number of convective cells.Stochastic mixing simulations also produced a reasonable representation of the observed event in regards to accumulated volumetric rainfall, but often degraded results relative to the BASELINE Δ h = 3 km simulation when compared to both the Δ h = 1 km simulation and observations.For Δ h within the deep convective gray zone where updrafts are generally biased as too large, strong, and undilute (e.g., Lebo & Morrison, 2015;Stanford et al., 2020), the stochastic mixing scheme here generally acts to enhance these biases.As discussed in S20, this is likely due to the downgradient design of the Smagorinsky model with a lower limit of K h requiring it to be positive, but an upper limit controlled only by numerical stability.Overall, this makes instances of enhanced mixing more impactful.
There are several important distinctions between this study and S20.In particular, stochastic mixing in S20 was applied to an idealized squall line after the system had reached a quasi-steady dynamical and thermodynamic state.Here, stochastic mixing was applied for the whole simulation, including during deep convection initiation, which is a likely reason for significant variability of system evolution in STOCH_LONG.Other differences compared to S20 were the use of a PBL scheme for vertical mixing here and the use of Δ h = 3 km, whereas the largest Δ h considered in S20 was 2 km and no PBL scheme was used.Nonetheless, stochastic mixing as applied in S20 did have a significant effect on the squall line's thermodynamic and kinematic structure, broadly consistent with the results here.Several characteristics of the squall line updraft core properties (in particular, updraft core size) were improved by removing horizontal mixing in S20 as compared to LES "truth."Relative to the Δ h = 1 km simulation here, the NO_MIXING configuration also improved upon BASELINE based on the number of convective cores (Figure 8c) and the degree of organization (Figure 12c).However, we note that Δ h = 1 km cannot be considered as truth, given that convergence is not realized in deep convection simulations until much finer grid spacing is used.Indeed, Figure 8c showed that the Δ h = 1 km simulation diverged more significantly from observations compared to the Δ h = 3 km BASELINE simulation.Simon and Chow (2021) investigated kilometer-scale simulations of a convective boundary layer with a sensitivity experiment in which only numerical diffusion operated (i.e., K = 0 for both horizontal and vertical mixing).They found that numerical diffusion alone produced excessive TKE at resolved scales, relative to an LES, but did so on relatively appropriate wavenumbers, similar to what was shown in Stanford et al. (2020, see their Figure 18).Moreover, Simon and Chow (2021) showed that potential temperature profiles in the numerical-diffusion-only simulation agreed quite well with the filtered high-resolution simulation, suggesting that numerical mixing alone may be sufficient for horizontal potential temperature diffusion.However, they also suggested that employing an explicit turbulence closure in their study acted to dissipate the excess resolved-scale kinetic energy without shifting TKE to larger wavelengths.Further work is therefore needed to explore the efficacy of horizontal implicit numerical diffusion in kilometer-scale simulations.Moreover, since WRF's odd-ordered upwind advection scheme is implicitly diffusive, it is not clear how the role of explicit horizontal mixing may change in models employing centered advection schemes.Simon and Chow (2021) further suggested the potential benefits of enhancing vertical mixing in kilometer-scale simulations.Our study did not explore the sensitivity to vertical mixing, which might be important to consider in future gray-zone stochastic mixing studies, especially given the anisotropic treatment of mixing at these scales.For example, Kitamura (2015) evaluated the dependence of turbulent length scales on model resolution relative to an a priori high-resolution simulation.They found noteworthy anisotropy of the turbulence length scale even for grid aspect ratios near unity (i.e., Δ x,y ∼ Δ z ).For length scales obtained from the thermal eddy diffusivity, they showed the vertical component to be larger than the horizontal component.For eddy viscosity length scales, they found a dependence of the horizontal length scale on vertical resolution, while the vertical length scale was insensitive to the vertical resolution.Furthermore, Simon and Chow (2021) implemented a novel turbulence closure into WRF employing an anisotropic length scale in which the vertical length scale is a function of the horizontal resolution and vice versa.They found in Δ h = 1.2 km simulations a marked improvement using such a formulation compared to the standard anisotropic treatment in WRF.Introducing stochasticity in such implementations would benefit from idealized simulations (e.g., Stanford et al., 2020) where 3D Smagorinsky mixing is employed as opposed to the PBL-controlled vertical mixing used herein.
The downgradient eddy diffusivity approach employed here is inherently limiting.For example, Verrelle et al. (2017) evaluated kilometer-scale deep convection simulations against a benchmark LES and found that the eddy diffusivity scheme underestimated buoyant production of subgrid TKE and, by design, did not allow the production of countergradient structures that were identified in the LES.Similarly, Sun et al. (2021) coarse-grained an idealized supercell LES to kilometer-scale Δ h and found that vertical and horizontal SGS fluxes both exhibited nonlocal characteristics due to tilting of horizontal fluxes by wind shear.Both studies along with K. Hanley et al. (2019) improved kilometer-scale deep convection simulations (relative to LES) by implementing the nonlinear closure introduced by Moeng (2014) that allows for countergradient fluxes.Future work should thus also explore the application of stochastic mixing to countergradient schemes.
Application of stochastic mixing to gray zone deep convection simulations is ripe for further study.Even though simulations were degraded relative to observations and the higher resolution simulation here, there was a clear change induced by stochastic mixing in all analyzed metrics that was much greater than the signal produced by the non-stochastic ensemble employing only grid-scale noise in the initial θ field.This signifies that the stochastic scheme alters simulations in a meaningful way.These modifications included a systematic shift in behavior most evident in STOCH_SHORT.While not the main focus of this study, there was also a substantial increase in ensemble spread for the stochastic mixing simulations, relative to the θ-pert ensemble, especially for STOCH_LONG.These results suggest the potential for forecasting applications.The simulations displayed considerable sensitivity to the spatiotemporal autocorrelation scale of the stochastic mixing perturbations.Ideally, the scale and magnitude of the perturbations would be derived from observations and LES, with the controls on physical mixing events including their scales of variability carefully considered.Overall, this study has highlighted the importance of firmly understanding the impacts of lateral mixing in km-scale DMC simulations.It is clear that modifying traditional mixing approaches such as the downgradient Smagorinksy scheme applied here can impact more than just local updraft-scale mixing, most notably by altering mesoscale convective organization.

Figure 1 .
Figure 1.WRF domain used for the Δ h = 3 km simulations (black box) and the Δ h = 1 km simulation (blue box).The 150-km radial distance SPOL domain is shown by the black circle and the location of SPOL is shown by the red "×."

Figure 2 .
Figure 2. Representative spatial patterns of stochastic perturbations produced by the stochastic pattern generator for (a) a STOCH_SHORT ensemble member and (b) a STOCH_LONG ensemble member.The temporal and spatial autocorrelation scales are 10 min and 10 km for STOCH_SHORT and 2.8 hr and 300 km for STOCH_LONG.

Figure 3 .
Figure 3.Skew T-log P diagrams of environmental soundings for (a) 0000 UTC on 7 December, (b) 1200 UTC on 7 December, (c) 0000 UTC on 8 December, and (d) 1200 UTC on 8 December.Observed temperature (T), and dew point temperature (T d ; red and green lines, respectively), and horizontal winds (black barbs, in knots) were measured at Gan Island.Simulated T and T d (blue and orange lines, respectively) and horizontal winds (blue barbs) are spatially averaged over the SPOL domain, where the ERA5 forecast begins at 0000 UTC on 7 December.

Figure 4 .
Figure 4. 2.5 km AGL Rayleigh reflectivity horizontal cross sections for SPOL (top row), the Δ h = 1 km simulation (middle row) and the Δ h = 3 km BASELINE simulation (bottom row).Times and dates are shown at the top of each column.The SPOL 150 km radial distance range ring is shown by the black circle.

Figure 5 .
Figure 5.Time series of domain-accumulated volumetric rainfall, limited to the SPOL domain, for the (a) STOCH_SHORT, (b) STOCH_LONG, and (c) θ-pert ensembles.The 4X, NO_MIXING, and BASELINE_1KM_CG_3KM simulations are included in (c).SPOL volumetric rainfall is shown in gray with observational uncertainty in light gray shaded regions.A running mean with a width of 1 hr is applied to each time series.

Figure 6 .
Figure 6.Horizontal cross sections at 2.5 km AGL of (a) observed and (b)-(n) simulated radar reflectivity at 0300 UTC on 8 December.The SPOL 150-km radial distance range ring is shown by a black circle.

Figure 7 .
Figure 7. Time series of SPOL domain-mean convective (left column) and stratiform rain rates (right column) for the (a) and (b) STOCH_SHORT, (c) and (d) STOCH_LONG, and (e) and (f) θ-pert ensembles.A running mean with a width of 1 hr is applied to each time series.

Figure 8 .
Figure 8.Time series of (a)-(c) contiguous convective reflectivity (CCR) core number, (d)-(f) mean CCR core area, and (g)-(i) total CCR core area for the SPOL domain only.A running mean with a width of 1 hr is applied to each time series.

Figure 9 .
Figure 9. Percent differences relative to BASELINE of time-averaged PRE, 〈COND〉, 〈EVAP〉, and PE for each simulation.Domain means are shown in (a)-(d) and conditional means for only convective grid points (identified via S95) are shown in (e)-(h).The mean BASELINE value for each process is shown at the top of each panel.

Figure 10 .
Figure 10.Time series of (a)-(c) domain-mean vertically integrated upward mass flux (〈MF〉 TOT ) and (d)-(f) domain mean convective updraft mass flux (〈MF〉 CONV ).The STOCH_SHORT ensemble is shown in the left column, the STOCH_LONG ensemble in the middle column, and the θ-pert ensemble in the right column.Averaging in (d)-(f) is conditionally sampled from convective updrafts, which are grid points with w > 1 m s −1 and TWC > 0.1 g m −3 .A running mean with a width of 1 hr is applied to each time series.

Figure 11 .
Figure 11.As in Figure 9 but for (a) total echo area, (b) convective area, (c) stratiform area, (d) convective rain rates, and (e) stratiform rain rates.

Figure 13 .
Figure 13.Vertical profiles of domain-average passive fluid tracer within convective updrafts (ϕ updraft ) temporally averaged from 15 UTC on 7 December through 12 UTC on 8 December (beginning 3 hr after tracer initialization in the bottom 1 km of the domain).

Figure 12 .
Figure 12.Time series of the organization index (I org ) for (a) STOCH_SHORT simulations, (b) STOCH_LONG simulations, and (d) θ-pert, 4X, NO_MIXING, and BASELINE_1KM_CG_3KM simulations compared to the BASELINE simulation.A running mean with a width of 1 hr is applied to each time series.

Figure 14 .
Figure 14.Scatterplots of the subdomain-mean stochastic perturbation  (γ ) and the relative difference (from BASELINE) of total echo area (top row), convective area (middle row), and stratiform area (bottom row) from the STOCH_LONG ensemble.Each circle is a different point in time and the color-fill is fraction of the 24-hr period from 1200 UTC on 7 December through 8 UTC on 8 December.Positive relative differences indicate values greater than BASELINE.    0 indicates net enhanced mixing while     0 indicates net reduced mixing.A linear regression is shown as a thick blue line and the coefficient of determination (r 2 ) is given above each panel in blue.A different member of the STOCH_LONG ensemble is shown in each column.

Figure 15 .
Figure 15.As in the figure but for domain-accumulated volumetric rainfall (Vol.Rainfall, top row), convective rain rates (Conv.RR, second row), total upward vertically integrated mass flux (〈MF〉 TOT , third row), and convective updraft vertically integrated mass flux (〈MF〉 CONV , fourth row).Positive relative differences indicate values greater than BASELINE.
4X simulation produces larger domain-mean PRE and 〈COND〉 relative to BASELINE, but smaller 〈EVAP〉, acting to increase PE by almost 10%, consistent with more rainfall shown in Table 2. NO_MIXING also produces larger 〈COND〉 but much larger 〈EVAP〉 relative to BASELINE, with only slightly larger PRE, resulting in a decrease of PE relative to BASELINE by ∼8%.Similar responses relative to BASELINE are shown for convective 〈COND〉 and 〈EVAP〉 (Figures 9e-

Table 2
Domain-and Time-Accumulated Volumetric RainfallFrom 1200 UTC on 7 December Through 1200 UTC on 8 December for the 1,200 km × 1,200 km WRF Domain