Prediction of gradient‐based similarity functions from the Mellor–Yamada model

Gradient‐based implicit similarity functions are derived theoretically from a few variants of the Mellor–Yamada algebraic turbulence‐closure model under assumptions of local equilibrium conditions in a stably stratified shear flow. The solutions are compared with empirical functions presented by Sorbjan. Good agreement is found in the range of gradient Richardson number Ri extending up to around 0.2. The gradient‐based scaling framework offers better accuracy and reliability than the traditional Monin–Obukhov framework, as it circumvents the problems of small values of fluxes and cross‐correlations. It is also possible to separate the specification of the master length‐scale from the calculation of similarity functions describing second moments and dissipation rate. The related discussion of model performance at higher Ri is included. It is unclear whether the current countermeasures against spurious flow decoupling reflect the observed features of turbulence in the very stable regime, due to the apparent breakdown of the Richardson–Kolmogorov energy cascade, as found by Grachev et al.


INTRODUCTION
Static stability plays a pivotal role in atmospheric and oceanic dynamics, as it controls the turbulent exchange of mass, momentum, and energy. The ubiquity of stable equilibrium conditions over land and sea requires proper handling in climate models; further, accurate computation of surface fluxes is essential for predicting the surface cooling rate and the growth of surface-based thermal inversion. It is hard to overestimate the practical importance of effective modeling of stably stratified flows: for example, predicting electricity production by wind farms, winter smog episodes, frosts, road icing, fog. With the growing spatial resolution of numerical weather prediction (NWP) models and growing expectations for forecast accuracy, there is a constant need for further improvement in stable boundary-layer modeling. Until the late 20th century, a popular view of turbulence in a stable boundary layer (SBL) was largely shaped by Monin-Obukhov similarity theory (Monin and Obukhov, 1954, hereinafter MOST) and its extension, local similarity theory (Nieuwstadt, 1984, hereafter LST), which accounted for changes of turbulent fluxes with height. Here, extreme stability was conceived of as an asymptotic "z-less" (Wyngaard and Coté, 1972) regime where the range of sizes of eddies narrows due to more effective damping of larger eddies by buoyancy forces, so that the wall effect becomes unimportant. Another paradigm was the presence of a critical value Ri c of the gradient Richardson number Ri, above which the flow was supposed to become laminar, as shown by linear perturbation theory (Howard, 1961;Miles, 1961) and deduced from some of the field observations (e.g., Lyons et al., 1964;Webb, 1970;Businger et al., 1971). However, in the 1970s, the existence of Ri c was questioned by Yamamoto (1975); subsequently, Lettau (1979) presented the evidence of turbulence still present at Ri > 1, well above the values suggested before.
In the last quarter of a century, turbulence under stable equilibrium conditions became one of the main foci of interest in boundary-layer meteorology. Major field experiments, such as Surface Heat Budget of the Arctic Ocean (SHEBA: Grachev et al., 2005) and Cooperative Atmosphere-Surface Exchange Study (CASES-99: Blumen et al., 2001;Poulos et al., 2002), and more recently massive large-eddy simulations (LES: Sullivan et al., 2016), provided data to verify the existing views and broaden our theoretical understanding. An important emerging concept was the distinction between weakly stable and very stable boundary layers, as proposed by Mahrt (1998). Essentially, in a weakly stable boundary layer (WSBL), surface cooling is moderate while turbulence is maintained by the wind shear and occurs continuously; the turbulent transport is described sufficiently well by MOST and LST (e.g., Basu et al., 2006). In contrast, a very stable boundary layer (VSBL) is characterized by weak winds and strong surface inversions, while turbulence is suppressed by buoyancy forces most of the time. However, occasional bursts of turbulence still occur, caused by gravity waves, meandering motions, and local drainage flows due to terrain inclination (Mahrt, 1998;2007). As pointed out by Mahrt (1998), the combined effect of these phenomena may depend on the size of the area over which surface fluxes are computed. Therefore, quantifying the turbulent exchange in a numerical model with a particular grid spacing poses a challenge, due to both the complexity and spatial scale dependence.
It is worth noting that the WSBL and VSBL should be treated as prototypes. Other classifications do exist: for example, Sorbjan (2010) distinguished four regimes with different thresholds of the value of Ri, while van de Wiel et al. (2003) proposed a classification based on net radiation and effective pressure gradient and Grachev et al. (2013) focused on the presence of the Richardson-Kolmogorov energy cascade. As noted by Mahrt (1999), "Any attempt to divide the stable boundary layer into a few classes or states (...) is an oversimplification". The apparent lack of a fully comprehensive framework poses additional difficulty in choosing an appropriate turbulence parametrization for a numerical model.
Regardless of the particular turbulence parametrization applied in numerical weather prediction and climate models, the turbulent transport through the model lowest atmospheric layer is treated with bulk formulations based upon MOST (Edwards et al., 2020). Such methods typically rest on integral forms of empirical flux-gradient relationships, sometimes with further simplifications. Another approach is to use bulk relationships derived from the turbulence model (Łobocki, 1993, introduced to the U.S. NCEP Eta model in 1993, to avoid inconsistencies between the turbulence parametrization and the surface-layer treatment. Unfortunately, as is now recognized, the applicability of MOST is limited and the proposed thresholds differ, particularly when different criteria are used. For example, the analysis of data from Fluxes Over Snow-covered Surfaces II (FLOSSII: Mahrt, 2010) suggested the breakdown of MOST at a value of bulk Richardson number Ri B around 0.8 (corresponding to Ri around 2-4). In contrast, the analysis of SHEBA data (Grachev et al., 2013) pointed to the limitation of the validity of MOST to Ri ⪅ 0.1. Values of flux and gradient Richardson numbers around 0.20-0.25 were found as the limits of LST validity.
Another milestone in understanding the SBL behavior was recognition of the existence of a maximum sustainable heat flux (MSHF: van de Wiel et al., 2007). Since the downward turbulent heat flux is supposed to vanish both at neutral equilibrium and at extreme stability and is positive within this range, there must also be a maximum value located therein. However, this issue did not attract attention until the work of Malhi (1995), who derived the formula linking the heat flux to the Obukhov stability parameter = z∕L and wind speed, and presented supporting empirical evidence. Later on, van de Wiel et al. (2012) offered a viable explanation of the mechanism of cessation of continuous turbulence, leading to a different concept of critical Richardson number. Namely, in the weakly stable range, the heat flux is growing with increasing stability due to increased temperature gradient, providing negative feedback to the surface cooling. However, after MSHF is reached, further increase of stability causes the heat flux to decrease due to turbulence damping. Thus, positive feedback arises unless the growth of stability is quickly compensated by increasing wind speed. While both the Malhi (1995) and van de Wiel et al. (2012) results were derived using log-linear similarity profiles and concerned the surface layer, Łobocki (2013) derived the location of MSHF from a turbulence closure model, thereby generalizing the findings for the entire SBL.
Note that the mechanism of turbulence collapse described by the MSHF theory does not preclude the existence of turbulence beneath the maximum location and does not take into account restoration mechanisms such as generation by gravity waves or other submeso motions or conversion between turbulent kinetic energy (TKE) and turbulent potential energy (TPE). The last concept was also introduced in the last quarter of a century and earned due attention in the development of similarity theory (Zilitinkevich and Esau, 2007) and modeling Li et al., 2016;Łobocki, 2017). The absence of such mechanisms in the parametrizations used in numerical weather prediction and climate models is a likely reason for a frequent problem of models-a spurious decoupling of flow from the underlying surface associated with unrealistically rapid and intense surface cooling (e.g., Derbyshire, 1999;Cuxart et al., 2005), commonly termed "runaway cooling". In the past, runaway cooling was mostly treated by some ad hoc measures such as setting the smallest allowable value of eddy transfer coefficient. Unfortunately, these measures may limit the ability to reproduce extreme cases and cause side effects that involve other phenomena (Holtslag et al., 2013).
MOST and LST rely on using the turbulent momentum and buoyancy fluxes to set scales for other single-point turbulence statistics and gradients of mean flow variables. Unfortunately, as turbulent fluctuations decrease with growing stability, the smallness of fluxes, measurement errors, and self-correlation errors (e.g., Hicks, 1978;Klipp and Mahrt, 2004;Baas et al., 2006) limits the usefulness of scales based on covariances of turbulent fields. As a consequence, classic similarity performance becomes uncertain with growing stability. To overcome these limitations, Sorbjan, (2006;2008;2016) proposed scaling systems based on wind speed and temperature gradients, the so-called gradient-based similarity. Sorbjan and Grachev (2010) showed that the vertical turbulent fluxes of momentum and heat, scaled with vertical velocity variance and the vertical gradient of potential temperature, can be represented as universal functions of Ri, and that this result can be deduced from K-theory. They also proposed a formulation of flux-gradient relationships in the form of such universal functions, established from SHEBA and CASES-99 data, within the range of Ri extending to 0.7. Subsequently, the theory was extended to include universal functions describing variances of potential temperature, vertical velocity, and dissipation rate in a few variants of the scaling system (Sorbjan, 2010;2016).
The gradient-based approach seems quite natural from a modeling perspective, since turbulence characteristics, including fluxes, are derived in models from wind, temperature, and humidity profiles. The scaling systems developed within the gradient-based similarity framework offer the possibility of comparing observations taken at different conditions and numerical simulations together with theoretical predictions from turbulence-closure models. Gradient-based similarity functions can also be used directly in models, as in Sorbjan (2012a).
As outlined above, a substantial advance in understanding the SBL behavior has occurred in the last quarter of a century. These new ideas deserve implementation in operational models; perhaps a more comprehensive quantitative framework is still necessary. However, now progress can be stimulated by evaluating the existing parametrization schemes based on the newly gained knowledge. With this aim, we present a derivation of gradient-based similarity functions from several versions of the Mellor-Yamada turbulence-closure model (Mellor, 1973;Mellor and Yamada, 1974;1982;Nakanishi, 2001), within the frameworks of scaling systems proposed by Sorbjan, (2016;, and a comparison with the universal functions presented therein. While our choice of model is motivated by its wide application in numerical weather prediction (e.g., Janjić, 1990;Gerrity et al., 1994;Saito et al., 2007), mesoscale atmospheric research (e.g., Sušelj and Sood, 2010;Foreman and Emeis, 2012), ocean modeling (e.g., Blumberg and Mellor, 1987;Kantha and Clayson, 1994;Burchard and Bolding, 2001;Kim and Moon, 2020), and global atmospheric models (e.g., Sirutis and Miyakoda, 1990), the framework presented here can possibly be applied to a wider class of algebraic Reynolds-stress models. Thus, practical questions behind this comparison probe the extent to which classical Reynolds stress can reflect turbulence characteristics, possibly beyond the weakly stable stratification regime, and investigate the influence of possible modifications of models.
The objectives of this study are twofold. One is to compare the performance of several variants of a turbulence closure model that is frequently used in research and operational forecasting. The other is to present a framework for such comparisons based on a relatively new, gradient-based similarity.
Derivations of similarity functions from turbulence closure models were made in the past, hence the methodology is essentially not new. For example, MOST functions were derived by Lewellen and Teske (1973), Mellor (1973), and Prenosil (1979), analytical solutions in terms of Rossby-number similarity were presented by Yamada (1979), and local similarity functions were obtained by, for example, Nieuwstadt (1984) and . To the knowledge of the authors, the current presentation is the first derivation of gradient-based similarity functions from a turbulence-closure model. 1 The article is organized as follows. Section 2 presents the turbulence-closure model and the gradient-based scaling systems used in this study. Individual subsections present stability parameters (Section 2.1), model equations and generic relationships emerging directly from model equation definitions of stability parameters (Section 2.2), "implicit" gradient-based scaling systems (Section 2.3) and relationships between the scales and mean-field gradients (Section 2.4), and a list of empirical similarity functions used here for comparison (Section 2.5). Derivations presented in Section 3 depart from equations presented in Section 2.2. In this section, all the equations containing "implicit" gradient-based scales are newly derived. The individual subsections collect all the formulae regarding a particular quantity in all the scaling systems considered. We start with the turbulent kinetic energy, which is key to further derivations. The next subsection discusses the master length-scale. Finally, we present universal functions representing scaled variances, covariances, and dissipation rate. Graphs showing all the solutions obtained are presented and discussed in Section 4, and our main findings are summarized in Section 5.

Equilibrium parameters
Below, we use two basic equilibrium parameters: flux Richardson number and gradient Richardson number where = g∕T is the buoyancy parameter, g is the gravity acceleration,T is the reference temperature, U is the wind speed, Θ is the mean potential temperature, and ′ is its temperature fluctuation. N is the frequency of free vertical oscillations of a parcel in a stable atmosphere (the Brunt-Väisäla frequency), N = √ Γ; Γ = dΘ∕ dz, and S denotes the wind shear, S = dU∕ dz. Here, we use a Cartesian coordinate system with its X-axis oriented streamwise, assuming growth of the wind speed with height. 1 In fact, Sorbjan (2006) transformed Nieuwstadt's model into a form exhibiting a dependence on Ri, but did not derive explicit solutions.

2.2
Model equations Mellor (1973) proposed a turbulence model based on a set of predictive equations for Reynolds stress, heat flux, and temperature variance. This system was closed by adopting the Kolmogorov hypothesis for dissipation terms, Rotta's return-to-isotropy hypothesis for pressure-strain correlations, and a downgradient transport hypothesis for third-order moments. This system was then simplified for application to the atmospheric surface layer by neglecting all the terms except for those describing production and dissipation. For application to the boundary layer, Mellor and Yamada (1974) introduced a four-level simplification hierarchy based on anisotropy measures, where so-called Level 2 corresponded to the local production-dissipation equilibrium and was fully algebraic. Notably, the Level 3 model contained two prognostic equations, for predicting TKE and potential temperature variance, which can now be understood as equivalent to the TPE budget. Nakanishi (2001) and later Nakanishi and Niino (2004;2006) extended the Mellor-Yamada model by including a more complete parametrization of pressure-strain and pressure-temperature-gradient covariances, adjusting model constants to LES results and changing the formulation of the length-scale used in closure hypotheses, the so-called "master length-scale" (see Section 3.2 for an explanation). In this work, both the original Mellor-Yamada model and the Nakanishi-Niino extensions are investigated.
Gradient-based similarity, like other similarity theories in boundary-layer research, considers a steady, quasi-equilibrium state where turbulence statistics remain constant in time as a hypothetical final result of the adaptation process between vertical profiles of mean flow variables and turbulence. Horizontal homogeneity is also assumed. In terms of the second-moment budget equation set, these assumptions correspond to local production-dissipation equilibrium, with the possible exception of the vertical transport of turbulence energy, which is of minor importance in stable conditions. When these simplifications are applied to the Mellor-Yamada-Nakanishi-Niino model, we arrive at the so-called Level 2 approximation with the resulting equation set (Nakanishi, 2001): where and w ′ are fluctuations of velocity, is the so-called master length turbulence scale (see Section 3.2), z is the height, q 2 = u ′ 2 + v ′ 2 + w ′ 2 is twice the kinetic energy of turbulence per unit mass (hereinafter TKE), and overbars represent Reynolds' averaging. In the above set, the first four equations result from velocity variances and covariances budgets, the next two from potential temperature-velocity covariances, and the last two from simplification of the potential temperature variance and TKE budgets.
After an algebraic reduction (see Appendix), the Level 2 model yields the following formulation of the flux-gradient relationships: Three sets of model constants were used in this study. One is the set given by Mellor and Yamada (1982): A 1 = 0.92, A 2 = 0.74, B 1 = 16.6, B 2 = 10.1, C 1 = 0.08, C 2 = C 3 = C 4 = C 5 = 0, hereinafter MY82. These values were chosen based on laboratory measurements under neutral equilibrium and relationships among constants. The two other sets come from the studies of Nakanishi (2001) and Nakanishi and Niino (2009), who used LES data: We also consider a modification of the model described above introduced by Kitamura (2010), following the idea of Canuto et al. (2008). In the modified equations, constant A 2 was replaced by A 2 ∕(1 + Ri), allowing the gradient Richardson number to grow infinitely without any critical value while restricting the flux Richardson number range. Hereinafter, this version is referred to as the Canuto-Kitamura modification (identified as MYNN09-CK) and uses the same set of constants as MYNN09.
As follows from the definitions in Equations 1 and 2, the ratio of eddy diffusivities for momentum and heat, known as the turbulent Prandtl number Pr t , which by the definitions is also the ratio of Richardson numbers, equals as implied by Equations A13, A10, and the flux-gradient relationships in Equations 9 and 10 (the auxiliary constants 1 , F 1 , and F 2 are given by Equations A11, A12, and A14, correspondingly). It is also important to notice that the model implies a mutually unique relationship between Ri and R F (note this is a quadratic equation for R F when Ri is given or vice versa). At R F = R * F as given by Equation A11, the function S H (R F ) changes its sign. While R * F is often regarded as the "critical value" of flux Richardson number, we would rather see it as a physical realizability constraint of the model: as seen from Equations A7 and A1, negative values of the S H function would yield negative ′ 2 , which is impossible.

Scaling systems
In the forthcoming section (Section 3) we will demonstrate the derivation of gradient-based similarity functions representing scaled variances and covariances of turbulent fields and dissipation rates. The solutions will be compared with similarity functions of Ri obtained based on the SHEBA and CASES-99 data (Sorbjan and Grachev, 2010;Sorbjan, 2016;. Here, we focus on the three "implicit" gradient-based similarity scaling systems discussed by Sorbjan, (2016;: the w scaling based on the set ( w , Γ, ), the scaling based on ( , Γ, ), and the scaling based on ( , Γ, ), where 2 w = w ′ 2 , 2 = ′ 2 , and is the dissipation rate of TKE. The velocity, temperature, and length-scales expressed with these quantities are presented below.
In all the "implicit" systems, universal functions were obtained by rescaling from an "explicit" system, where the mixing length l was treated as the length-scale L N , and temperature and velocity scales were defined using L N and the Brunt-Väisäla frequency N. Initially (Sorbjan, 2010;Sorbjan and Grachev, 2010), the mixing length was assumed proportional to height, l = z (where is the von Kármán constant), and the study was based mainly on SHEBA data and limited to the surface layer. Then, using CASES-99 data, Sorbjan (2012b) showed the validity of the functions previously obtained above the surface layer, when a limit l 0 = 12 m was imposed on the mixing length by taking l = z∕(1 + z∕l 0 ). This was modified further in Sorbjan (2016) by including the limiting effect of stratification through another factor, B = s ∕Ri, with s = 1 m, so that the final form was l = z∕(1 + z∕l 0 + z∕ B ). Importantly, for all these forms of mixing length, the form of the similarity functions of Ri remained identical. 2 2.3.1 ( w , N) scaling The first of the scaling systems discussed here uses the vertical velocity variance, as proposed by Smeets et al. (2000).
A combination with the Brunt-Väisäla frequency (Sorbjan, 2001;2008; yields the velocity, temperature, and length-scales: The length-scale L w represents the vertical distance at which the kinetic energy of a moving parcel is converted into work against the buoyancy force. In other words, it expresses the range of vertical displacements in turbulent eddies and is often referred to as the vertical overturning scale; further, it can be associated with the mixing-length concept and thereby with the Eulerian integral scale of turbulence (Tennekes and Lumley, 1972, p. 45). The associated temperature scale T w represents the magnitude of temperature fluctuations resulting from such motions.

( , N) scaling
The second set considered by Sorbjan, (2008;2016; is based on on the temperature variance: The length-scale L is known as the Ellison scale (Ellison, 1957). It can be interpreted as a measure of vertical 2 A minor, but important correction based on LES results was introduced in Sorbjan (2017). displacements of air parcels before returning to the equilibrium position. Essentially, it is similar to the L w scale, except that turbulent fluctuations are scaled in terms of temperature rather than vertical velocity.

( , N) scaling
This scaling system is based upon the Dougherty-Ozmidov length-scale (Dougherty, 1961;Ozmidov, 1965), which characterizes the maximum size of eddies that may remain unaffected by the damping effects of buoyancy in stable conditions. 3 It reads (Sorbjan and Balsley, 2008;Grachev et al., 2015;Sorbjan, 2016; 2017) where, in view of the turbulence closure in the Mellor-Yamada model, we should use Grachev et al. (2015) determined the validity region of this scale as extending up to "critical" values of R F and Ri around 0.20-0.25, consistent with previous considerations (Grachev et al., 2013) of the validity of the LST.

General scaling relationships and dimensionless gradients
For all the scaling systems considered in this section ( = w, , ), the following relationships hold: as evident from Equations 14,15,17,18, and 20; and 3 Although this meaning is commonly accepted, direct numerical simulation of decaying turbulence (Mater et al., 2013) suggests that the size of overturns exceeds this scale in a strongly stratified regime when NT L > 1, where T L = q∕(2 ) is the turbulence time-scale characterizing the decay.
TA B L E 1 Implicit empirical similarity functions in gradient-based scaling, according to Sorbjan (2017)   as can be seen from Equations 14,16,17,19, and 20. These relationships imply in view of Equations 22 and 2, and as follows from Equation 23 and the definition of the Brunt-Väisäla frequency. The definitions of the scales in Equations 12-13 and 15-17 also imply (Sorbjan, 2016) and, considering Equations 12-13 and 18-20,

Empirical similarity functions
Below we list empirical functions proposed by Sorbjan (2017) for each of the three scaling systems. While these functions were presented within the range (0.004, 0.3) of Ri, earlier research (e.g., Sorbjan, 2010) presented both the data and proposed functions for a wider range, up to Ri ≈ 0.7. Note, however, that (a) the forms of functions in these systems are obtained by rescaling, employing relationships between scales, rather than by fitting to data independently in each of the systems, and (b) in the ( , N) scaling system that appeared in a recent development, only a narrower range of Ri (up to 0.3) is presented. For compactness of presentation, we use the following auxiliary functions: The empirical similarity functions are summarized in Table 1.

Kinetic energy of turbulence
The TKE budget (Equation 8) can be rewritten using the model solution (Equation 9) and the definition of the flux Richardson number (Equation 1): It is important to note that, as the turbulence-closure model uses a single master length-scale, the dimensionless TKE scaled with momentum flux depends on the flux Richardson number only (as in Monin-Obukhov surface-layer similarity). Hence, it does not depend on the master length-scale specification.
From the vertical velocity variance budget (Equation 3), substituting Equation 1 and dividing by u ′ w ′ , we obtain Thus, substituting Equation 28, As we see, the normalized vertical velocity variance is once again only a function of flux Richardson number. Hence, one can easily renormalize the TKE budget (Equation 28) as From Equation 31, it also clearly follows that there could be another R F limitation, as q w might become negative at a certain value of R F . However, this value is larger than R * F , so it is not an active limitation. By substituting w ′ ′ from the potential temperature variance budget (Equation 7) into the vertical velocity variance budget (Equation 3) and entering Equation 15, we obtain Combining this result with Equation 31 yields In fact, the scale U 2 is twice the turbulent potential energy per unit mass, and the constants B 1 and B 2 determine the TKE and TPE dissipation rates. Now, dividing the TKE budget equation (Equation 8) by w ′ ′ , then using the definitions of the flux Richardson number (Equation 1) and S H function (Equation 10), we obtain The definition of the U scale (Equation 18), with Equation 21, yields which lets us eliminate the master length-scale from Equation 34, leading to This way, we have obtained universal functions representing the TKE in all three implicit scaling systems. Again, we see that the scaled TKE is does not depend on the master length-scale in any of them.

Master length-scale
One of the basic assumptions adopted in the Mellor-Yamada family of models is that all length-scales characterizing mixing, dissipation, and pressure-strain correlations are proportional to a single scale, called the "master length-scale" . Constants A 1 ...C 5 in Equations 3-8 are the proportionality coefficients. In order to close the model equations, the master length-scale must be specified. In the course of model development, various possibilities have been tried, ranging from a simple proportionality to height (Mellor, 1973) in the surface layer to a prognostic equation for q 2 (Mellor and Yamada, 1982). As will be shown in this section, the specification of the master length-scale can be separated from the analysis by using the implicit gradient-based scaling. This may explain the observed independence (Sorbjan, 2017) of gradient-based similarity functions from the specification of the mixing length.
By substituting q = q U into Equation 34, we find where q and U are any of (q , q , q w ) and (U , U , U w ), respectively. Further, as U = L N in all the implicit scaling systems considered, 4 one may write This is an important step, since the relationship between the universal functions describing the TKE and the master length-scale lets us separate the system by eliminating any of these functions in subsequent manipulations. It follows directly from the TKE budget (Equation 8), the definition of the flux Richardson number (Equation 1), the flux-gradient relationship (Equation 10) with Equation (A10) emerging as a solution of the entire set of model equations (Equations 3-8), and the relationship between gradient-based velocity scales and the Brunt-Väisäla frequency. This is similar to "z-less" scaling, where the stability-dependent length-scale can be specified separately. In other words, the universal functions describing the master length-scale emerge from model equations when "implicit" gradient-based scaling is applied. This fact, a novel finding, lets us separate the master-length specification from other parts of the analysis. Now, substituting q = q∕U from Equation 36 into Equation 38 yields Next, with Equation 33 we obtain and, finally, using Equation 31, (41) Although the formulae obtained provide a way to calculate the master length-scale, it should be noted that any of the scales L w , L , L should be known, and because of the presence of the Brunt-Väisäla frequency these scales are not valid in neutral conditions. Therefore, the master length-scale specification should contain dependences on both the height and stratification, in a manner consistent with the above relationships.

Variances, covariances, and TKE dissipation rate
Having obtained the above solutions, one can find dimensionless functions describing variances, covariances, and the TKE dissipation rate using a common form for each of the scaling systems considered here, = w, , .
The potential temperature variance can be found from Equations 7,10, and 23: which, upon substituting Equation 38, yields where q = q∕U are the functions given by Equations 31,33, and 36, correspondingly.
The variance of vertical velocity can be found from Equation 3, by using Equations 10 and 22: Then, by using Equation 38, we obtain (note that for = w this is equivalent to Equation 31; likewise, similar special cases in the remaining part of this section reduce to formerly obtained identities). To find the scaled kinematic momentum flux, let us rewrite Equation 9 as Now, elimination of l using Equation 38 together with Equations 24 and 11 yields Similarly, using Equations 10 and 38 with L = T N −2 , we obtain the scaled vertical kinematic heat flux, Finally, as q = q∕U , the scaled TKE dissipation rate can be found from Equations 21 and 38: The ratio u ′ w ′ ∕U 2 w predicted by Equation 47 appears to be nearly constant at low values of Ri; in the case of the MY82 formulation, it happens until Ri ≈ 0.05-0.06. In this range, the value of this ratio, roughly 0.7, matches almost perfectly the empirical function of Sorbjan (2017), labeled S17 hereafter. All the variants of the MYNN model exhibit similar behavior, however with a lower value, nearly 0.5, in a wider interval, up to Ri = 0.1. Beyond this range, only qualitative similarities exist between MY82, MYNN01, and MYNN09, which begin to tend to their asymptotes, and between S17 and MYNN09-CK, which exhibit a gradual fall. A comparison with data points presented by figure 3a of Sorbjan (2010) suggests an overestimation of the ratio , S H is calculated from Equation A10, and R F is replaced with Ri using Equation 11. Plot arrangement follows figure 5 in Sorbjan (2017) to facilitate comparison with the experimental data presented therein. Line labels: S17, empirical functions of Sorbjan (2017), listed in Section 2.5; all others explained in Section 2.2 discussed by MYNN09-CK inside the extreme stability range at Ri ≈ 1.
Similar observations can be also made in the case of the w ′ ′ ∕(U w T w ) ratio, except that all functions grow in the range of low Ri values and this range is wider. A maximum is reached at Ri ≈ 0.1 in the case of S17 and MY82, and at a higher value, say 0.2-0.3, in the case of all the MYNN variants. Here, a comparison with figure 3b of Sorbjan (2010) shows good agreement of empirical data with the MYNN09-CK results.
All the model predictions of the ∕T w ratio are nearly identical in the range of low Ri, say below 0.06. For larger  (2017) to facilitate comparison with experimental data presented therein. Line labels as in Figure 1 Ri values, the principal difference is the range of realizability of individual model variants, and other differences are small. In contrast, the S17 function exhibits changes of inclination in a moderate range of Ri. However, this ratio was calculated using other experimental functions rather than being fitted directly to observations in S17, hence the reasons reflect a computational rather than physical nature. The scaled dissipation rate, L w ∕U 3 w , decreases as 0.7Ri −1∕2 in S17 and MY82 and 0.7Ri −1∕2 for all the MYNN variants, for Ri ⪅ 0.04. For larger values of Ri, all the predictions of different MYNN variants are consistent up to Ri ⪅ 0.1, although decreasing slightly faster. For Ri > 0.2, MYNN09-CK predicts a decrease proportional to Ri −3∕4 , while S17 shows a much faster decrease rate, starting at a much lower value of Ri.  The w ′ ′ ∕(U T ) ratio is proportional to Ri −1∕2 as Ri → 0 in all cases, but the decrease rate gradually changes with growing Ri. In the case of MYNN09-CK, it becomes proportional to Ri −0.8 at Ri ≈ 1.
For all model versions, w ∕T ∝ Ri −0.5 up to Ri ≈ 0.04, at least. For MYNN09-CK, this range extends further up to Ri ≈ 0.2. Again, an inflection in the S17 function is visible, although at both bounds of the range presented it is also proportional to Ri −0.5 .
The scaled dissipation rate L ∕U 3 shows the best agreement in all of the results. For low Ri values, say up to 0.06, it is proportional to Ri −3∕2 . Over Ri ≈ 0.1, results start to diverge.

( , N) scaling
In this scaling, we observe the best agreement between results of different variants of the model and the empirical functions ( Figure 3). The scaled kinematic momentum flux, u ′ w ′ ∕U 2 , agrees perfectly up to Ri ≈ 0.1 and is roughly proportional to Ri 4∕7 within this range. In this range we notice considerable agreement between all the results in the case of scaled heat flux (which is proportional to Ri) and scaled standard deviation of potential temperature (nearly proportional to Ri 3∕4 ). The only exception is the prediction of vertical velocity variance by MY82, which diverges significantly.
For larger values of Ri, all the model predictions of the scaled momentum flux remain close (within their validity ranges), while S17 departs toward faster growth. This is also true for heat flux, although the growth rate decreases with the increase of Ri, in contrast to S17.
A look at the two lower panels in Figure 3 reveals an asymptote of the S17 functions at Ri ≈ 0.6, a feature that was apparently overlooked in S17. Indeed, as x appears in the denominators of fractions in this scaling (Table 1, last column), and x = 0 when Ri 300 = 0.9 250 , there is a singularity at Ri ≈ 0.686 followed by a change of sign.

Turbulent kinetic energy
Figure 4 presents the model predictions of scaled TKE as a function of Ri. The first plot (the leftmost one) represents a reciprocal of the partitioning ratio of the vertical fluctuations in total TKE (within a factor of two). All versions of the model predict a decrease of this share with growing stability. The neutral equilibrium value ranges between MY82 and all the MYNN versions due to different model constants.
As follows from the definition in Equation 15, U 2 is the turbulent potential energy (TPE: Dalaudier and Sidi, 1987;Łobocki, 2017), hence the function presented in the second plot (center) is the TKE/TPE ratio. As the TPE is null at neutral equilibrium, this ratio tends to infinity when Ri → 0. All the models provide similar values of this function within their range of validity.
The third plot shows the TKE normalized with the Dougherty-Ozmidov-based scaling. Here, model solutions appear to agree up to Ri ≈ 0.1, then begin to diverge, closing to their asymptotes, except for MYNN09-CK, which grows at a slowing rate.

Master length-scale
Finally, we present the master length-scale calculated with Equations 39-41 Figure 5. The most striking feature is fact that the first two plots are nearly identical (except for the value at neutral), in spite of the differences in formulations. The ∕L w , ∕L , and ∕L functions all exhibit linear growth, proportional to Ri. At Ri → 0, ∕L w and ∕L tend to zero because of the presence of a potential temperature gradient in the denominators of the definition ratios in Equations 39 and 41. In the case of the L scale definition ratio, both the denominator and numerator assume zero values at neutral equilibrium, hence the ratio ∕L 2 may take a finite, nonzero value. An interesting question may regard the consequences of Equations 39-41 when scaled values become equal to one, that is, when the master length is chosen as one of the scales L w , L , L . It may appear that this situation arises under steady-state, local equilibrium conditions only at a certain value of R F (hence, also at Ri), which could lead to inconsistencies within the master length-scale formulations in models employing these scales. On the other hand, we see that this is not a problem in MYNN09-CK, because of the asymptotic limitation of R F in this version (slightly below 0.3).

CONCLUDING REMARKS
From a modeling perspective, correct parametrization of vertical exchange processes under stable equilibrium is a challenging problem, due to the variety of contributing processes on scales of a grid element and the integration time step. When empirical functions obtained in "pure" conditions (such as over the ice sheet or from LES) are used in a model, factors like local terrain inhomogeneity, surface inclination, obstacles that may excite flow irregularities, or waves are absent. If such phenomena are not resolved explicitly, the parametrization must rest on the mean fields and characteristics of the surface (taking possible account of nonstationary effects when a predictive turbulence closure is used). These limitations, together with the perennial problem of "runaway cooling" and rising criticism of the existence of a critical Richardson number, drove modelers to seek solutions in adjustments or refinements of the turbulence parametrization.
In the case of the Mellor-Yamada model, this refinement was mainly done (Nakanishi, 2001;Nakanishi and Niino, 2009) by improvements in the parametrization of pressure-gradient terms in the second-moment budgets and improving the master length-scale specification.
On the other hand, the gradient-based implicit scaling systems proposed by Sorbjan (2010) and Grachev et al. (2015) offer a viable perspective of comparing theoretical predictions from turbulence models with observations, LES results, or similarity functions derived from these data. This procedure was explained in detail in Section 3. The main advantage is a better quality of reference data, owing to the circumvention of problems with the smallness of fluxes and cross-correlation, which occur in the MOST framework. Another advantage is separating the mixing/master length issue, as the relationships obtained here do not require its explicit specification. This may explain the observed independence (Sorbjan, 2017) of gradient-based universal functions from specification of the mixing length used in the analysis of experimental data. Here, the model allows us to obtain the ratio of the master length-scale to a given scaling length as a function of Ri. With this aspect, the gradient-based system is similar to the "z-less" scaling, where the stability-dependent length-scale could be specified separately. On the other hand, complementing the analysis by using MOST still seems necessary in the near-neutral range, due to use of the Brunt-Väisäla frequency in definitions of scales. Also, specification of the mixing/master length is necessary when using MOST.
Within the range of Ri up to 0.2, we observe a reasonable agreement between empirical similarity functions and model predictions. According to findings of Grachev et al. (2013), this is the region where the Richardson-Kolmogorov cascade still occurs, and the Kolmogorov hypothesis applied to parametrize dissipation can still be valid. However, discrepancies appear at larger values of Ri. Unfortunately, the exclusion of data from this region in Grachev et al. (2015) leaves uncertainty about the forms of universal empirical functions presented in earlier studies. Likely, further progress should incorporate another approach to parametrization of the dissipation rate. We anticipate that the methodology presented here could be helpful in developing and validating turbulence-closure models within the gradient-based similarity framework. It may complement existing similarity approaches by focusing on stable conditions, and it can be regarded as a consideration of an asymptotic state. However, one should keep in mind that the above-presented analysis assumed local production-dissipation equilibrium. At the same time, the models typically use predictive formulation(s) of any of the second-moment budgets. Hence, modeling results may depart from the solutions presented.
From the definitions of Richardson numbers in Equations 1 and 2, it follows immediately that Now, by dividing Equations 3-8 by q 2 and entering the variables defined above, a non-dimensional form is obtained: Substitution of S H G H and then S M G M from the TKE budget (Equation A8) into the w ′ 2 budget (Equation A3) yields while, by combining this result with the w ′ ′ and ′ 2 budgets (Equations A6-A7) and using the TKE budget (Equation A8), we once again obtain To obtain S M , let us start with substitution of T u from Equation A5 and C w from Equation A3 into Equation A4: Then replace the term containing S M G M using Equation A8, collect the S H G H terms, and move the term containing S M G H to the left-hand side: Using the TKE budget (Equation A8) again, the constant term ( 1 − C 1 ) can be transformed into which leads to a more compact form: Finally, the R F ∕G H ratio can be eliminated using Equations A8 and A10, that is where F 2 = B 1 ( 1 + 2 ) − 3A 1 (1 − C 2 ).