Abstract
 Abstract
 1. Introduction
 2. Description of the Large-Eddy Simulation Model That We Use
 3. Description of the Three Mixed-Phase Arctic Clouds That We Simulate
 4. Correlation Matrices Computed by LES
 5. Parameterizing Correlations: Problem Definition
 6. Lower and Upper Bounds on Correlations
 7. Cholesky-Based Parameterization of Correlations
 8. For Comparison: Covariances Based on the Scalar Variance Equation
 9. Conclusions
 Appendix A: Lower and Upper Bounds on Correlations
 Acknowledgments
 References
 Supporting Information
[1] Mixed-phase Arctic clouds, like other clouds, contain small-scale variability in hydrometeor fields, such as cloud water or snow mixing ratio. This variability may be worth parameterizing in coarse-resolution numerical models. In particular, for modeling multi-species processes such as accretion and aggregation, it would be useful to parameterize subgrid correlations among hydrometeor species. However, one difficulty is that there exist many hydrometeor species and many microphysical processes, leading to complexity and computational expense. Existing lower and upper bounds on linear correlation coefficients are too loose to serve directly as a method to predict subgrid correlations. Therefore, this paper proposes an alternative method that begins with the spherical parameterization framework of Pinheiro and Bates (1996), which expresses the correlation matrix in terms of its Cholesky factorization. The values of the elements of the Cholesky matrix are populated here using a “cSigma” parameterization that we introduce based on the aforementioned bounds on correlations. The method has three advantages: (1) the computational expense is tolerable; (2) the correlations are, by construction, guaranteed to be consistent with each other; and (3) the methodology is fairly general and hence may be applicable to other problems. The method is tested noninteractively using simulations of three Arctic mixed-phase cloud cases from two field experiments: the Indirect and Semi-Direct Aerosol Campaign and the Mixed-Phase Arctic Cloud Experiment. Benchmark simulations are performed using a large-eddy simulation (LES) model that includes a bin microphysical scheme. The correlations estimated by the new method satisfactorily approximate the correlations produced by the LES.
1. Introduction
[2] Arctic climate is influenced in strong and complex ways by mixed-phase Arctic clouds. We cite two examples here. First, mixed-phase Arctic clouds influence radiative transfer and are often observed to persist for long times [Pinto, 1998; Prenni et al., 2007]. Several modeling studies suggest that this longevity is possible only if ice nuclei concentrations are limited in order to prevent ice concentrations from increasing and depleting liquid water [Harrington et al., 1999; Prenni et al., 2007]. Second, many Arctic cloud layers are thin enough to be partly transparent to longwave radiation. Because of this, some researchers have hypothesized that if Arctic clouds experience an increase in droplet number concentration, these clouds will emit more longwave radiation and hence cause a relative warming of the surface [Garrett et al., 2002; Garrett and Zhao, 2006].
[3] Given this complexity, it is perhaps unsurprising that climate and regional simulations differ markedly from each other in their estimates of Arctic clouds [Walsh et al., 2002; Kattsov and Källén, 2005; Rinke et al., 2006; Prenni et al., 2007]. For instance, Kattsov and Källén [2005] mention the “dramatic scatter between the total cloud amounts … [which] approaches 60% in winter” that is simulated by the climate simulations they examine. Furthermore, even if a model correctly predicts the presence of cloud, it may not necessarily predict the correct phase of water. The regional models studied by Prenni et al. [2007] severely underpredict liquid water in wintertime Arctic clouds, probably due to excessive ice nuclei concentrations in the simulations. These uncertainties in simulations of clouds lead to uncertainties in other components of the simulated Arctic climate.
[4] A key difficulty in improving regional and climate models is the difficulty of parameterizing small-scale spatial variability in hydrometeors. Regional and climate models have super-kilometer horizontal grid spacings, but such large grid volumes contain considerable variability. This variability ought to be taken into account when driving microphysics. Therefore, developing accurate formulas for aerosol and microphysics is necessary, but not sufficient. Also needed is an accurate parameterization of subgrid variability that is implemented in a regional or climate “host” model. Given the resolved fields predicted by the host model, the parameterization would need to predict the relevant aspects of subgrid variability and feed them into a microphysics scheme [Golaz et al., 2002; V. E. Larson and B. M. Griffin, Analytic upscaling of local microphysics parameterizations, part I: Theory, submitted to Quarterly Journal of the Royal Meteorological Society, 2011]. B. M. Griffin and V. E. Larson (Analytic upscaling of local microphysics parameterizations, part II: Simulations, submitted to Quarterly Journal of the Royal Meteorological Society, 2011) simulated a drizzling stratocumulus cloud and found that accounting for subgrid variability in a microphysics scheme led to enhanced autoconversion of cloud droplets to drizzle and accretion of cloud droplets onto drizzle drops. This, in turn, led to a 75% increase in drizzle mixing ratio near the ocean surface.
[5] A particularly difficult aspect is parameterizing correlations among hydrometeor species. This is useful, e.g., for estimating the rates of collection of droplets by snow particles. Although parameterizations of subgrid distributions of moisture-related variables have been developed [e.g., Tompkins, 2002; Morrison and Gettelman, 2008], typically these distributions are univariate. Therefore, they do not contain information on the covariability of the hydrometeors. The information on covariability is needed, for instance, to compute the collection rate, which depends on whether snow falls preferentially through parts of the cloud that contain greater or lesser cloud water mixing ratio.
[6] One possible approach to predicting correlations among hydrometeors is to develop a prognostic or diagnostic equation for each of these correlations based on fundamental physical equations. While this approach is perhaps the most satisfying from a theoretical point of view, it suffers from two drawbacks. First, it is computationally expensive. If the number of hydrometeors is n, then the number of correlations among those hydrometeors is n(n − 1)/2. That is, as n grows large, the number of correlations becomes proportional to n^{2}, which is very large. For instance, if a microphysics scheme predicts both the number concentration and mixing ratio of cloud water, rain, cloud ice, and snow, then n = 8 and n(n − 1)/2 = 28. Hence there is an incentive to limit the cost of computing each correlation. Second, if each correlation is individually predicted using a physically based estimate, then the correlations so estimated may be inconsistent with each other. To take an extreme example, if variate X_{1} and variate X_{2} are perfectly correlated with each other, then correlations of a third variate X_{3} with X_{1} and X_{2} must be identical. However, simple physical estimates, which inevitably contain errors, will not ensure such a result.
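As a minimal illustration of this quadratic growth (a hypothetical helper, not code from this paper), the pair count n(n − 1)/2 can be computed directly:

```python
# Minimal illustration (hypothetical helper, not from the paper):
# the number of distinct pairwise correlations among n variates.
def num_correlations(n: int) -> int:
    return n * (n - 1) // 2

# The example from the text: number concentration and mixing ratio for
# cloud water, rain, cloud ice, and snow gives n = 8 variates.
print(num_correlations(8))  # -> 28
```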
[7] This paper presents a method that mitigates these two problems. It starts with an established mathematical framework that guarantees the internal consistency of the correlations. Within the strictures of this framework, it diagnoses values of each correlation using an inexpensive formula based on guidance from rigorous bounds on the correlations. The formula contains an adjustable parameter that must be empirically fit to observations or, in the case of this study, modelgenerated data.
[8] The new method is tested noninteractively using output from a large-eddy simulation (LES) model. The LES model does not serve as a host model for the correlation parameterization; rather, in this study, the LES model produces turbulent fluxes and other moments that serve as input to the new method for the purpose of noninteractive tests. The LES model also produces correlations that serve as validation data. We perform LESs of three mixed-phase Arctic clouds. The first case is based upon the Indirect and Semi-Direct Aerosol Campaign (ISDAC) field experiment; the second is based upon the Mixed-Phase Arctic Cloud Experiment (MPACE) field experiment, period B; and the third is based on MPACE period A. These three cloud cases differ in their surface fluxes and microphysical characteristics. Correlations computed from 3D snapshots of the LES are compared to estimates provided by the new method. In this preliminary effort, we have not implemented the correlation estimates interactively in a largescale host model.
[9] Section 2 describes the LES model that we use. Section 3 describes the three Arctic cloud cases that we will simulate. Section 4 presents correlation matrices as computed by LES. Section 5 defines the parameterization problem that we address. Section 6 discusses rigorous lower and upper bounds on the correlations. These are used later to guide the parameterization of these correlations. Section 7 presents the spherical parameterization and the cSigma method of parameterizing its coefficients. Section 8 compares the spherical parameterization with a prognostic approach. Finally, Section 9 discusses the results and concludes.
2. Description of the Large-Eddy Simulation Model That We Use
[10] Using observations, it is difficult to obtain all the correlations we desire, and it is especially difficult to obtain them with adequate sampling statistics. Therefore, in this paper, we examine correlations simulated by LES.
[11] The LES model that we use is the System for Atmospheric Modeling (SAM) [Khairoutdinov and Randall, 2003]. SAM solves the anelastic equations of fluid flow on a Cartesian grid. To reduce spurious numerical oscillations, SAM transports thermodynamic scalars using a monotonic flux limiter. SAM advances the solutions in time using a third-order Adams-Bashforth time stepping scheme. Periodic boundaries are used in the horizontal and a rigid lid is used at the top of the domain. SAM applies sponge damping over the top 1/3 of the domain, but in all cases the lid has been chosen far enough above cloud top so that the sponge damping does not interfere with the solutions. SAM with bulk microphysics has successfully performed LES of a variety of boundary layer cases, including two mixed-phase Arctic cloud systems that we examine in this paper, the MPACE B single-layer case [Klein et al., 2009] and the MPACE A multilayer case [Morrison et al., 2009].
[12] In this study, we use a version of SAM that has been coupled to a bin (spectral) microphysics scheme, in which hydrometeors of different radii are separately prognosed. The bin microphysical scheme is based on the work by Khain et al. [2004], but contains modifications described by Fan et al. [2009a], such as the addition of a prognostic ice nuclei size distribution and new ice nucleation mechanisms. The bin microphysics scheme prognoses number size distributions for water drops, columnar ice crystals, plate-like ice crystals, dendritic ice crystals, snowflakes, graupel, hail/frozen drops, and aerosol particles. Each size distribution is represented by 33 bins, with each larger bin containing particles with twice the mass of the next smaller bin. The scheme explicitly computes relevant microphysical processes, including droplet nucleation, primary and secondary ice generation, condensation and evaporation of drops, deposition and sublimation of ice particles, freezing and melting, and collisions between the various hydrometeors. The coupled SAM and bin microphysics model has successfully simulated many mixed-phase and deep convective clouds [Fan et al., 2009b, 2010], including the ISDAC, MPACE A, and MPACE B cloud systems [Fan et al., 2009a, 2011].
[13] Most of the analyses and plots below pertain to 2D (x−y) horizontal slabs of LES output at a single grid level. The slabs are located at various altitudes indicated in Figures 1, 2, and 3. For the ISDAC and MPACE B cases, our correlation analysis will use six instantaneous snapshots of LES output; for the MPACE A case, it will use five. The altitudes of the slabs vary slightly with snapshot as the cloud layers evolve in order to keep the slabs entirely within the same part of cloud (lower, middle, or upper). For our correlation analysis below, the slabs are composite averaged across all snapshots.
4. Correlation Matrices Computed by LES
[23] In order to compute grid box averages of microphysical processes involving two or more hydrometeor species, we need to estimate the subgrid correlations between those species. To gain familiarity with the correlations, we first present correlations from the three cloud cases produced by LES. Using horizontal slabs of data from the LES, we construct a correlation matrix for vertical velocity (W); the contents of cloud water (QC), cloud ice (QI), and snow (QS); and the number concentrations of cloud water (NC), cloud ice (NI), and snow (NS). Each element of the matrix lists the linear correlation coefficient [Press et al., 1992] between two of these quantities, a coefficient that lies in the range [−1, 1]. The matrices are composite averaged. That is, for each case, we ensemble average a slab of data from multiple LES snapshots (5 from MPACE A, 6 from the other two cases), with the snapshots separated by several hours in time. All grid levels are from mid-altitude within cloud.
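The construction of such a matrix can be sketched as follows, using synthetic stand-in fields rather than the actual LES slabs; the field names and mixing coefficients below are illustrative assumptions only.

```python
import numpy as np

# Sketch only: build a correlation matrix from 2-D horizontal slabs by
# flattening each field and computing linear correlation coefficients.
rng = np.random.default_rng(0)

w = rng.normal(size=(64, 64))                     # stand-in for W
qs = 0.7 * w + 0.3 * rng.normal(size=(64, 64))    # stand-in for QS
ns = 0.9 * qs + 0.1 * rng.normal(size=(64, 64))   # stand-in for NS

data = np.vstack([w.ravel(), qs.ravel(), ns.ravel()])
corr = np.corrcoef(data)  # symmetric, with ones on the diagonal

print(np.round(corr, 2))
```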
[24] The correlations are computed and presented for the three cases: ISDAC (Table 1), MPACE B (Table 2), and MPACE A (Table 3). We see that usually correlations are positive, but not always (e.g., the correlation between QC and QI in ISDAC is −0.08). In all three cases, the correlations are large between QI and NS, between QI and NI, and between QS and NS. We conjecture that the correlation between QI and NS may be strong because large QI leads to formation of snow particles, and hence large NS, in the bin microphysics scheme.
Table 1. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud From Our LES of the April 26 ISDAC Cloud^{a}

Variate    W      QS     NS     QI     NI     QC     NC
W          1.00   0.65   0.73   0.44   0.55  −0.01   0.34
QS         0.65   1.00   0.95   0.29   0.43   0.06   0.14
NS         0.73   0.95   1.00   0.49   0.60   0.04   0.21
QI         0.44   0.29   0.49   1.00   0.77  −0.08   0.39
NI         0.55   0.43   0.60   0.77   1.00   0.28   0.29
QC        −0.01   0.06   0.04  −0.08   0.28   1.00   0.09
NC         0.34   0.14   0.21   0.39   0.29   0.09   1.00
Table 2. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud From Our LES of the MPACE B Cloud^{a}

Variate    W      QS     NS     QI     NI     QC     NC
W          1.00   0.01   0.02   0.01   0.25   0.20   0.24
QS         0.01   1.00   0.91   0.84   0.50   0.43   0.44
NS         0.02   0.91   1.00   0.87   0.66   0.55   0.58
QI         0.01   0.84   0.87   1.00   0.63   0.61   0.61
NI         0.25   0.50   0.66   0.63   1.00   0.90   0.89
QC         0.20   0.43   0.55   0.61   0.90   1.00   0.94
NC         0.24   0.44   0.58   0.61   0.89   0.94   1.00
Table 3. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud of the Middle Layer From Our LES of the MPACE A Cloud^{a}

Variate    W      QS     NS     QI     NI     QC     NC
W          1.00   0.02  −0.01   0.04  −0.05   0.04   0.18
QS         0.02   1.00   0.91   0.60   0.41  −0.33  −0.32
NS        −0.01   0.91   1.00   0.66   0.58  −0.39  −0.36
QI         0.04   0.60   0.66   1.00   0.80  −0.33  −0.30
NI        −0.05   0.41   0.58   0.80   1.00  −0.43  −0.32
QC         0.04  −0.33  −0.39  −0.33  −0.43   1.00   0.62
NC         0.18  −0.32  −0.36  −0.30  −0.32   0.62   1.00
[25] In all cases, the number concentration NS and water content QS of snow are highly correlated with each other (see Tables 1, 2, and 3). How can we explain this fact? One possible explanation is the following. A high correlation would be expected when the magnitude of the distribution changes from grid box to grid box, but the shape of the distribution changes little. Then the number concentration changes, but the average particle size does not. For instance, one would expect a perfect correlation between QS and NS across a horizontal slab under the following idealized circumstance. Suppose that the shape of the snow particle size distribution is the same in all grid boxes, and in particular is not shifted to smaller or larger sizes between grid boxes, but the amplitude of the size distribution is multiplied by a different (random) constant in each grid box in the slab. Such an array of distributions leads to perfect correlation between QS and NS across grid boxes. Mathematically, we can see this by writing the expressions for the total number concentration of snow particles, NS, in terms of the size distribution of snow particles, n_{S}(D),

NS = ∫_{0}^{∞} n_{S}(D) dD,  (1)

and the snow water content, QS,

QS = (1/ρ_{a}) ∫_{0}^{∞} m(D) n_{S}(D) dD,  (2)

where D is the diameter of the snow particle, ρ_{a} is the air density, and m(D) is the mass of an individual snow particle of diameter D. Suppose that n_{S}(D) is multiplied by the same factor in the expressions for NS and QS, without change in the shape of n_{S}(D) or in m(D). We allow the factor to differ from grid box to grid box, but within a grid box, the same factor multiplies n_{S}(D) in both NS and QS. Then NS ∝ QS, and the correlation between NS and QS equals 1. On the other hand, if n_{S} changes shape or is translated to smaller or larger sizes between grid boxes, then NS is not proportional to QS.
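The amplitude-only argument above can be checked numerically. In the sketch below, the size-distribution shape and mass-diameter relation are arbitrary assumptions, chosen only to make the point.

```python
import numpy as np

# Sketch: fix the shape of the snow size distribution n_S(D) and multiply
# it by a random amplitude A in each "grid box". Then NS and QS are both
# proportional to A, so their correlation across grid boxes is exactly 1.
rng = np.random.default_rng(1)

D = np.linspace(0.1e-3, 5e-3, 200)    # particle diameter [m]
dD = D[1] - D[0]
n_shape = np.exp(-D / 1.0e-3)         # assumed (fixed) distribution shape
m = 0.01 * D**2                       # assumed mass-diameter relation [kg]
rho_a = 1.0                           # air density [kg m^-3]

A = rng.uniform(0.5, 2.0, size=1000)          # amplitude per grid box
NS = A * np.sum(n_shape) * dD                 # amplified number concentration
QS = A * np.sum(m * n_shape) * dD / rho_a     # amplified water content

print(round(np.corrcoef(NS, QS)[0, 1], 6))  # -> 1.0
```

If instead the distribution were shifted to larger sizes in some grid boxes, NS and QS would cease to be proportional and the correlation would drop below 1.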
[26] In other ways, the correlation matrices from the three cases differ from each other. For instance, in ISDAC, QC is weakly correlated with most other variates; in MPACE B, it is strongly positively correlated; and in MPACE A, it is strongly negatively correlated. Additionally, W is strongly correlated to other variates in ISDAC, but weakly correlated in the middle layer of MPACE A. The middle layer is prevented from undergoing strong radiative cooling by the upper cloud layer and therefore exhibits little turbulence [Falk and Larson, 2007].
6. Lower and Upper Bounds on Correlations
[30] One can analytically derive a lower and upper bound on the possible values of the correlations between two variates [Leung and Lam, 1975; Vos, 2009]. These lower and upper bounds do not by themselves serve as an accurate parameterization of correlations, but they do provide a useful starting point for the Choleskybased parameterization discussed below in Section 7.
[31] Suppose that the linear correlation coefficient between W and an arbitrary hydrometeor quantity X_{1} is known and is denoted ρ_{wx_{1}}. Suppose the same for a second hydrometeor quantity X_{2}. Then the correlation between X_{1} and X_{2}, ρ_{x_{1}x_{2}}, is bounded by the expression

ρ_{x_{1}x_{2}} = ρ_{wx_{1}}ρ_{wx_{2}} ± sqrt[(1 − ρ_{wx_{1}}^{2})(1 − ρ_{wx_{2}}^{2})],  (3)

where the lower bound corresponds to the minus sign, and the upper bound corresponds to the plus sign. Leung and Lam [1975] and Vos [2009] derive the bounds using geometric arguments. The appendix notes that the same expression can be obtained by computing the condition needed for a zero eigenvalue of the correlation matrix of W, X_{1}, and X_{2}. One can see from formula (3) that if W is perfectly correlated with either X_{1} or X_{2}, i.e., if ρ_{wx_{1}} = 1 or ρ_{wx_{2}} = 1, then the lower bound equals the upper bound, and the only realizable correlation is

ρ_{x_{1}x_{2}} = ρ_{wx_{1}}ρ_{wx_{2}}.  (4)

In such cases, when W is highly correlated with either X_{1} or X_{2}, formula (3) yields tight bounds. However, if W is uncorrelated with both X_{1} and X_{2}, i.e., if ρ_{wx_{1}} = 0 and ρ_{wx_{2}} = 0, then any correlation in the range [−1, 1] is possible (or “realizable”). Therefore, when the correlations of W with X_{1} and X_{2} are low, formula (3) yields loose bounds.
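Formula (3) translates directly into a small helper (a sketch, not the authors' code), which reproduces the two limiting cases just described:

```python
import math

# Bounds of equation (3): given the correlations of W with X1 and X2,
# return the lower and upper bounds on the X1-X2 correlation.
def correlation_bounds(r_w1, r_w2):
    center = r_w1 * r_w2
    half_width = math.sqrt((1.0 - r_w1**2) * (1.0 - r_w2**2))
    return center - half_width, center + half_width

# W perfectly correlated with X1: the bounds collapse to a single value.
print(correlation_bounds(1.0, 0.5))   # -> (0.5, 0.5)
# W uncorrelated with both: any correlation in [-1, 1] is realizable.
print(correlation_bounds(0.0, 0.0))   # -> (-1.0, 1.0)
```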
[32] The lower and upper bounds of the correlations for the ISDAC, MPACE B, and MPACE A Arctic clouds are displayed in Figures 4 and 5, respectively. Unfortunately, the bounds turn out to be loose. The lower bound on a given scatter point is at least 0.5 less than the corresponding LES correlation, and often the lower bound is the minimum value, −1 (Figure 4). The upper bounds are also loose, with many values at or near the maximum value, 1 (Figure 5). The reason that the bounds are loose is that many of the turbulent fluxes are weak, and therefore ρ_{wx_{1}} or ρ_{wx_{2}} is small.
[33] To produce more accurate correlation estimates, one may choose an intermediate value between the lower and upper bounds. Perhaps the simplest method is to use equation (4) as the mid-bound value, regardless of the values of ρ_{wx_{1}} and ρ_{wx_{2}}. Indeed, Figure 6 shows that this method does improve the estimates. However, such estimated correlations tend to cluster toward small values and tend to underestimate the LES correlations on average. Therefore, we do not recommend using approximation (4) except for applications where high accuracy is not important.
7. Cholesky-Based Parameterization of Correlations
[37] In order to ensure consistency among correlation values, we consider the matrix of all correlations. Let Σ denote the correlation matrix that contains all the correlations, ρ_{wx_{i}}, ρ_{x_{i}x_{j}}, and so forth. The first row and column of Σ contain the correlations between W and the X_{i} variates. On the other hand, where i ≠ 1 and j ≠ 1, the element Σ_{ij} contains the correlation between X_{i} and X_{j}.
[38] Correlation matrices must satisfy certain properties. In order for Σ to represent correlations, the elements of Σ must be real and must satisfy Σ_{ij} = Σ_{ji}. That is, Σ must be a real, symmetric matrix. Furthermore, the diagonal elements of Σ must be 1, and the values of the other elements must lie in the range (−1, 1).
[39] One further crucial condition on Σ is that it be positive semidefinite. In other words, all eigenvalues of Σ must be positive or zero [Press et al., 1992, p. 89]. This means that when rotated to the principal axes, all variances are positive or zero.
[40] Positive semidefinite variances are required on physical grounds. Any positive semidefinite matrix can be represented by a Cholesky factorization:

Σ = LL^{T},  (8)
where L is a lower triangular matrix, and T denotes transpose. Cholesky factorizing a matrix can be seen as a highdimensional analogue of taking the square root of a scalar [e.g., Press et al., 1992], because the given quantity, Σ, equals L multiplied by the transpose of L itself. For instance, if Σ were a diagonal matrix, then L would also be a diagonal matrix whose diagonal entries would contain the square roots of Σ's diagonal entries.
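A small numerical check of the factorization (a NumPy sketch, not the paper's code), using the W, QS, NS block of Table 1:

```python
import numpy as np

# Cholesky factorization Sigma = L L^T of a small correlation matrix
# (here the W, QS, NS block of Table 1, which is positive definite).
sigma = np.array([[1.00, 0.65, 0.73],
                  [0.65, 1.00, 0.95],
                  [0.73, 0.95, 1.00]])

L = np.linalg.cholesky(sigma)       # lower triangular "square root"
print(np.allclose(L @ L.T, sigma))  # -> True
```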
[41] To construct a parameterized correlation matrix, Σ, that is assured to be positive semidefinite, we may first construct a parameterized Cholesky factor, L, and then multiply it by its transpose, as in equation (8) [Pinheiro and Bates, 1996; Rebonato and Jäckel, 1999]. That is, instead of directly parameterizing elements of Σ, as in equation (5), we may directly parameterize L.
[42] Parameterizing L directly saves computational expense when Monte Carlo samples are needed, regardless of how the elements of L are parameterized. For instance, given the Cholesky factor L_{cov} of a covariance matrix Σ_{cov}, a vector of mean values μ, and a vector x of uncorrelated sample points from a standard normal distribution, we can compute the corresponding correlated sample, y, by matrix multiplication [Johnson, 1987]:

y = μ + L_{cov}x.  (9)
Parameterizing L_{cov} directly is computationally more efficient than computing Σ_{cov} and then computing the Cholesky factorization at each time step. We can compute the covariance Cholesky matrix L_{cov} from the correlation Cholesky matrix L by multiplying each row of L by the standard deviation of the corresponding variate.
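The sampling step of equation (9), together with the row scaling just described, can be sketched as follows; the means, standard deviations, and target correlation are arbitrary example values, not LES output.

```python
import numpy as np

# Sketch: draw correlated samples y = mu + L_cov @ x, where L_cov is
# obtained from the correlation Cholesky factor L by scaling each row
# by the standard deviation of the corresponding variate.
rng = np.random.default_rng(2)

mu = np.array([0.1, 2.0])                  # example means
std = np.array([0.5, 1.5])                 # example standard deviations
corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])              # example target correlation

L = np.linalg.cholesky(corr)
L_cov = std[:, None] * L                   # row-wise scaling

x = rng.standard_normal((2, 100_000))      # uncorrelated standard normals
y = mu[:, None] + L_cov @ x                # correlated samples

print(round(float(np.corrcoef(y)[0, 1]), 2))  # close to 0.8
```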
[43] The desired correlation Cholesky matrix, L^{T}, can be written as [Pinheiro and Bates, 1996]

(L^{T})_{ij} = c_{ij} ∏_{k=1}^{i−1} s_{kj} for i < j, (L^{T})_{jj} = ∏_{k=1}^{j−1} s_{kj}, and (L^{T})_{ij} = 0 for i > j.  (10)

For instance, a 4 × 4 matrix would be written, in tableau form, as

L^{T} = [ 1   c_{12}   c_{13}          c_{14}
          0   s_{12}   s_{13}c_{23}    s_{14}c_{24}
          0   0        s_{13}s_{23}    s_{14}s_{24}c_{34}
          0   0        0               s_{14}s_{24}s_{34} ].  (11)
Pinheiro and Bates [1996] refer to this as the “spherical parameterization.” The elements that compose the first row of L^{T} turn out to be simply the correlations between W and the hydrometeors. That is, c_{1j} = ρ_{wx_{j}}, where the ρ_{wx_{j}} are assumed to be known. In the i ≠ 1 rows, s_{ij} = sin(θ_{ij}) and c_{ij} = cos(θ_{ij}), where the θ_{ij} are a set of angles that remains to be determined.
[44] The formula (10) of Pinheiro and Bates [1996] provides a convenient and general framework for constructing positive semidefinite correlation matrices. Equation (10) ensures that the matrix is positive semidefinite and has ones along the main diagonal, regardless of the values of the θ_{ij} angles. The spherical parameterization may be interpreted geometrically by noting that the ith column of L^{T} in (11) represents the components of a unit vector v_{i}. These unit vectors originate at the same point but are oriented in different directions, such that their dot products equal the correlations of Σ (i.e., v_{i} · v_{j} = Σ_{ij}) [Rapisarda et al., 2006].
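These properties can be verified numerically. The function below is a sketch of the spherical construction under our reading of equation (10) (not code from the paper): it builds L^{T} from arbitrary angles and checks the unit diagonal and positive semidefiniteness.

```python
import numpy as np

# Sketch: column j of L^T is a unit vector formed from products of sines
# and cosines of the angles theta_{ij} (i < j), so Sigma = L L^T
# automatically has ones on its diagonal and is positive semidefinite.
def spherical_LT(theta):
    n = theta.shape[0]
    LT = np.zeros((n, n))
    for j in range(n):
        sin_prod = 1.0
        for i in range(j):
            LT[i, j] = sin_prod * np.cos(theta[i, j])
            sin_prod *= np.sin(theta[i, j])
        LT[j, j] = sin_prod           # trailing product of sines
    return LT

rng = np.random.default_rng(3)
theta = rng.uniform(0.2, np.pi - 0.2, size=(4, 4))  # arbitrary angles
LT = spherical_LT(theta)
sigma = LT.T @ LT                                    # Sigma = L L^T

print(bool(np.allclose(np.diag(sigma), 1.0)))            # -> True
print(bool(np.min(np.linalg.eigvalsh(sigma)) > -1e-12))  # -> True
```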
[45] However, where i ≠ 1, the angles θ_{ij} are unknown a priori and hence need to be parameterized. The optimal values of θ_{ij} will vary among applications, and Pinheiro and Bates [1996] do not suggest how to parameterize them. We will suggest an indirect method below. Instead of parameterizing θ_{ij} directly, however, we choose to parameterize c_{ij}. Then it follows from trigonometry that s_{ij} = sqrt(1 − c_{ij}^{2}).
[47] For the top row, that is, for i = 1, c_{1j} = (LL^{T})_{1j}. However, the relationship for other rows is more complex. Equation (12) indicates that the correlations (LL^{T})_{ij} are related to the c_{ij} parameters by expressions such as

(LL^{T})_{23} = c_{12}c_{13} + s_{12}s_{13}c_{23}.  (13)
If we recall that s_{ij} = sqrt(1 − c_{ij}^{2}), then we see that the spherical parameterization of Σ_{23}, for instance, matches the form of the bound parameterization (5), if c_{23} is set equal to f_{23}. That is, for these second-row correlations Σ_{2j}, the only difference between (13) and (5) is the form of f, which determines whether the correlation is closer to the upper bound or the lower bound.
[48] Despite these subtle differences, it turns out that for our Arctic LES,

c_{ij} ≈ Σ_{ij}.  (14)
This is shown in Figure 7, in which we set c_{ij} = Σ_{ij} in L and then plot (LL^{T})_{ij} versus Σ_{ij}. We find good agreement, which indicates that c_{ij} ≈ Σ_{ij}.
[49] In order to use the framework provided by equation (11), we need to parameterize all the c_{ij} (i > 1) parameters in terms of known quantities, namely the means, standard deviations, and vertical turbulent fluxes. Inspired by equation (14), we seek a parameterization that approximates c_{ij} as the correlation between X_{i} and X_{j}, which, in turn, can be approximated by equation (5). In this way, the parameterization is directly related to the upper and lower bounds on correlations discussed previously. We parameterize c_{ij} for i > 1 in terms of the first-row elements (correlations) as

c_{ij} = c_{1i}c_{1j} + f_{ij} sqrt[(1 − c_{1i}^{2})(1 − c_{1j}^{2})].  (15)
In (15), we choose f_{ij} as

f_{ij} = α sgn(c_{1i}c_{1j}) S_{i}S_{j}.  (16)
As a subsequent step, we explicitly ensure that −0.99 < f_{ij} < 0.99. Here α is an adjustable coefficient.
[50] Formulas (15) and (16) are semi-empirical, rather than derived from first principles. Nevertheless, we now describe our rationale for the choice of (16). In formulating f_{ij}, we multiply α by sgn(c_{1i}c_{1j}) in order to ensure that important symmetry relationships are obeyed (see below). We include S_{i}S_{j}, where S_{i} ≡ S(X_{i}) is the ratio of the standard deviation of X_{i} to its mean, and analogously for S_{j}, because this factor improves the fit. This factor appears to help especially in cases in which a hydrometeor species is present in only part of a horizontal slab. In such cases, there is a cluster of data points at zero, and other data points at nonzero values. Such distributions have unusually high correlations. It would be convenient to parameterize this effect directly in terms of the fraction of zero points, but this information is often unavailable. Instead, as a surrogate, we include S_{i}S_{j} in (16), which increases f_{ij} and hence the correlation c_{ij}. In those cases in which S_{i} is unknown, one may set S_{i}S_{j} = 1 and obtain somewhat degraded but still satisfactory results (not shown).
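A sketch of equations (15) and (16) in code follows. It builds only the c_{ij} estimates (the subsequent assembly of L and LL^{T} is omitted); the first-row correlations are illustrative values taken loosely from Table 1, the α value is arbitrary, and S_{i} is set to 1, as permitted when the S_{i} are unknown.

```python
import numpy as np

# Sketch of the cSigma parameterization, equations (15)-(16):
#   c_ij = c_1i c_1j + f_ij sqrt((1 - c_1i^2)(1 - c_1j^2)),
#   f_ij = alpha * sgn(c_1i c_1j) * S_i S_j, clipped to (-0.99, 0.99).
def c_sigma(c1, S, alpha):
    n = len(c1)
    c = np.eye(n + 1)
    c[0, 1:] = c[1:, 0] = c1          # first row: known W correlations
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            f = alpha * np.sign(c[0, i] * c[0, j]) * S[i - 1] * S[j - 1]
            f = float(np.clip(f, -0.99, 0.99))
            c[i, j] = c[j, i] = (c[0, i] * c[0, j]
                                 + f * np.sqrt((1 - c[0, i]**2)
                                               * (1 - c[0, j]**2)))
    return c

c1 = np.array([0.65, 0.73, 0.44])     # illustrative W-QS, W-NS, W-QI
c = c_sigma(c1, S=np.ones(3), alpha=0.2)
print(np.round(c, 2))
```

By construction, swapping i and j leaves each estimate unchanged, and flipping the sign of a variate flips the sign of its row and column, which is the symmetry behavior discussed next.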
[51] The formulation of (15) and (16) is dictated in part by the need to obey symmetry relationships. The expression for c_{ij} satisfies exchange symmetry (7) because if i and j are switched with each other on the right-hand side of (15), the right-hand side remains unchanged. The expression also satisfies odd parity symmetry (6) because if either the ith or jth variate changes sign, so does the right-hand side.
[52] We call equations (15) and (16) the “cSigma” parameterization because it is inspired by the assumption that c_{ij} ≈ Σ_{ij}. The cSigma parameterization is useful in cases in which the ordering of the magnitudes of the correlations is not known. In other cases, it is reasonable to suppose that the correlations are ordered. For instance, the correlation between a quantity at two times should decrease as the time interval increases. In such cases, one may use the parameterization of Rapisarda et al. [2006] and Schoenmakers and Coffey [2003].
[53] Equation (16) would be of limited practical use if it were necessary to choose a different value of the adjustable parameter α for each different cloud. To assess whether a single, robust value of α can be chosen, we partition the data points into three groups, corresponding, naturally, to the three cloud cases: ISDAC, MPACE B, and MPACE A. Using these three groups, we perform a k-fold cross validation [e.g., Kohavi, 1995] in which all data points from one cloud case are omitted, the value of α is optimized using data from the remaining two cases, and then the resulting optimal value of α is used in (LL^{T})_{ij} to test how accurately it represents Σ_{ij} in the omitted cloud case.
[54] We optimize the value of the parameter α using the Levenberg-Marquardt method [Press et al., 1992]. When we arrange the columns in Σ in the order [W, QS, NS, QI, NI, QC, NC], we find that the best-fit value of α is 0.19 when ISDAC data are excluded, 0.11 when MPACE B is excluded, and 0.21 when MPACE A is excluded. The values of α are fairly close for these three cases, raising hopes that a single value of α can be used widely without great loss of accuracy. Using these parameter values yields the results plotted in Figures 8, 9, and 10. The correlation estimates tend to have a low bias and are somewhat scattered about the 1:1 line. Nonetheless, considering the simplicity, generality, and low cost of formulas (15) and (16), the fit is quite acceptable.
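The leave-one-case-out procedure can be sketched as follows. This is an illustrative reimplementation, not the paper's code: it fits α by a simple grid search rather than Levenberg-Marquardt, sets S_{i}S_{j} = 1, assumes the bound-based estimate c_{ij} = c_{1i}c_{1j} + α sgn(c_{1i}c_{1j})√((1−c_{1i}²)(1−c_{1j}²)), and the function names are hypothetical.

```python
import numpy as np

def estimate(c1i, c1j, alpha):
    """Assumed bound-based estimate of c_ij with S_i*S_j = 1."""
    return c1i * c1j + alpha * np.sign(c1i * c1j) * np.sqrt(
        (1.0 - c1i**2) * (1.0 - c1j**2))

def fit_alpha(c1i, c1j, target, alphas=np.linspace(0.0, 1.0, 101)):
    """Grid-search alpha that minimizes squared error against 'target'
    (e.g., LES-diagnosed correlations)."""
    return min(alphas,
               key=lambda a: np.sum((estimate(c1i, c1j, a) - target)**2))

def leave_one_case_out(cases):
    """cases: dict name -> (c1i, c1j, target) arrays.
    For each held-out case, fit alpha on the remaining cases."""
    out = {}
    for held in cases:
        c1i = np.concatenate([cases[k][0] for k in cases if k != held])
        c1j = np.concatenate([cases[k][1] for k in cases if k != held])
        tgt = np.concatenate([cases[k][2] for k in cases if k != held])
        out[held] = fit_alpha(c1i, c1j, tgt)
    return out
```

In the paper's application, the three dictionary entries would hold the ISDAC, MPACE B, and MPACE A data, and the three fitted values of α (0.19, 0.11, 0.21 in the text) would each be evaluated on the held-out case.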
8. For Comparison: Covariances Based on the Scalar Variance Equation
[55] We now compare the estimate (5), which is related to the spherical parameterization, with an alternative methodology that estimates the correlations based on the scalar variance equation. The covariance of X_{1} and X_{2}, \overline{X_1'X_2'}, is governed by the equation

\frac{\partial \overline{X_1'X_2'}}{\partial t} = -\overline{w}\,\frac{\partial \overline{X_1'X_2'}}{\partial z} - \frac{\partial \overline{w'X_1'X_2'}}{\partial z} - \overline{w'X_1'}\,\frac{\partial \overline{X_2}}{\partial z} - \overline{w'X_2'}\,\frac{\partial \overline{X_1}}{\partial z} - \frac{C_d}{\tau}\,\overline{X_1'X_2'} + \text{(source terms)}. \quad (17)
Here τ is a turbulent dissipation time scale, C_{d} is an empirical constant, t is time, and z is altitude. An overbar denotes a grid box mean, and a prime denotes a perturbation from the mean. We have neglected the horizontal advection and horizontal production terms. If we further neglect the time tendency, vertical mean advection, vertical turbulent advection, and source terms, then we find

\overline{X_1'X_2'} = -\frac{\tau}{C_d}\left( \overline{w'X_1'}\,\frac{\partial \overline{X_2}}{\partial z} + \overline{w'X_2'}\,\frac{\partial \overline{X_1}}{\partial z} \right). \quad (18)
Now we assume that the turbulent fluxes can be modeled by down-gradient diffusion. That is,

\overline{w'X_1'} = -K\,\frac{\partial \overline{X_1}}{\partial z}, \quad (19)
and similarly for X_{2}. Here, K is an eddy diffusivity, which may be chosen to have a complicated form. We substitute (19) into (18) in order to eliminate the vertical derivatives. Then we rearrange, which yields

\overline{X_1'X_2'} = \frac{2\tau}{C_d K}\,\overline{w'X_1'}\;\overline{w'X_2'}. \quad (20)
Converting from covariances to correlations, with c_{1w} ≡ \overline{w'X_1'}/(\sigma_w \sigma_{X_1}), c_{2w} defined analogously, and σ_w the standard deviation of w, we find

c_{12} = \frac{2\tau\,\sigma_w^2}{C_d K}\, c_{1w}\, c_{2w}. \quad (21)
The estimate (21), which is based on the scalar variance equation, may be compared with equation (5). The two equations coincide if 2τσ_w^{2}/(C_{d}K) = 1 in (21) and f = 0 in (5). With those conditions imposed, (21) reduces to (4), and the results of the parameterization may be seen in Figure 6.
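For concreteness, the budget-based estimate can be sketched as below. This assumes (21) takes the reconstructed form c_{12} = (2τσ_w²/(C_{d}K)) c_{1w}c_{2w}; the function name and the clipping to the physical range are illustrative additions.

```python
import numpy as np

def c12_from_variance_budget(c1w, c2w, sigma_w, tau, K, Cd):
    """Correlation estimate from the balanced scalar covariance budget
    with down-gradient turbulent fluxes (assumed form of (21)):
        c_12 = (2 * tau * sigma_w**2 / (Cd * K)) * c1w * c2w.
    Clipped to [-1, 1] since the budget balance does not enforce the bounds.
    """
    coef = 2.0 * tau * sigma_w**2 / (Cd * K)
    return np.clip(coef * c1w * c2w, -1.0, 1.0)
```

When the dimensionless prefactor equals one, the estimate reduces to the simple product c_{1w}c_{2w}, i.e., equation (4).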
9. Conclusions
[57] This paper addresses the problem of parameterizing subgrid-scale correlations among hydrometeors in large-scale models. These correlations are important because they influence the rates of microphysical processes such as accretion. However, the correlations are difficult to parameterize: they are numerous, with the number of correlations growing approximately as the square of the number of hydrometeor species, and their values depend on a broad range of physical processes.
[58] Rigorous lower and upper bounds on the correlations are available (see equation (3)), and the form of (3) does suggest a parameterization, (5), for the individual correlations. However, (5) alone does not guarantee consistency among all the correlations.
[59] Instead, we suggest using the spherical parameterization framework of Pinheiro and Bates [1996], embodied in equations (8) and (10). This provides a framework based on the upper-triangular Cholesky factor, L^{T}, associated with the correlation matrix. To estimate the elements of L^{T}, we use the cSigma parameterization (15) and (16). These formulas contain only a single adjustable parameter, α, whose optimal value for our data ranges from 0.11 to 0.21. If the standard deviations (S_{i} in equation (16)) are not available, then one may set S_{i} = S_{j} = 1. In our tests, this also yielded an acceptable, albeit somewhat degraded, fit to the data (not shown).
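The consistency guarantee of a Cholesky-based construction can be demonstrated generically: any lower-triangular matrix L whose rows are normalized to unit length yields C = LL^{T} with unit diagonal and no negative eigenvalues. The sketch below does not use the paper's equations (8), (10), (15), or (16); `corr_from_rows` is a hypothetical helper illustrating only the structural guarantee.

```python
import numpy as np

def corr_from_rows(rows):
    """Build a valid correlation matrix from arbitrary row entries.

    The rows are made lower triangular and normalized to unit length,
    so C = L @ L.T has ones on the diagonal and is positive semidefinite
    by construction -- i.e., the correlations are mutually consistent
    regardless of the values fed in.
    """
    L = np.tril(np.asarray(rows, dtype=float))
    L /= np.linalg.norm(L, axis=1, keepdims=True)  # unit-length rows
    return L @ L.T
```

This is why parameterizing the elements of L^{T}, rather than the correlations directly, sidesteps the consistency problem noted for estimate (5).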
[60] This methodology has three advantages.
[61] 1. Limited computational expense: the method is relatively inexpensive because it requires only diagnostic formulas rather than complex prognostic ones. Furthermore, it directly parameterizes the elements of the Cholesky matrix, thereby avoiding the need to Cholesky decompose a correlation matrix in order to generate Monte Carlo samples.
[62] 2. Consistency among correlations: by construction, the method guarantees that the correlation matrix is positive semidefinite.
[63] 3. Potential generality: the method is based on a blend of theory and empiricism, but it does not depend on the details of microphysical processes. Although the adjustable parameter may need to be refitted to other data sets, the method may be applicable to other physical problems. For instance, it might be used to parameterize correlations between hydrometeors in warm-phase or fully glaciated clouds. We speculate that the method might also be applicable to other problems with many variates, such as aerosol physics or atmospheric chemistry.
[64] We have applied the method to parameterizing the correlations found in LES of three Arctic mixed-phase cloud cases: ISDAC, MPACE B, and MPACE A. The fit is encouraging (see Figures 8–10), given the simplicity of the method and the fact that it does not attempt to exploit the detailed physics of these particular cases.
[65] In this paper, we have assumed that we are given the correlation of each variate (X_{1}, X_{2}, …) with another, single variate (W). That is, we know the correlations in the first row of the correlation matrix, and we desire to fill in the other entries. However, it is possible to envision a generalization of this problem. Namely, instead of knowing the correlations in the first row, in some problems one may know correlations in various locations throughout the correlation matrix. In this case, one may first use the known correlations, together with equations (15) and (16), to infer the first-row correlations c_{1j}, and then proceed as above. In this way, the cSigma method can be extended to the more general problem of filling in unknown entries in a correlation matrix when some correlations are known.