Parameterizing correlations between hydrometeor species in mixed-phase Arctic clouds



[1] Mixed-phase Arctic clouds, like other clouds, contain small-scale variability in hydrometeor fields, such as cloud water or snow mixing ratio. This variability may be worth parameterizing in coarse-resolution numerical models. In particular, for modeling multispecies processes such as accretion and aggregation, it would be useful to parameterize subgrid correlations among hydrometeor species. However, one difficulty is that there exist many hydrometeor species and many microphysical processes, leading to complexity and computational expense. Existing lower and upper bounds on linear correlation coefficients are too loose to serve directly as a method to predict subgrid correlations. Therefore, this paper proposes an alternative method that begins with the spherical parameterization framework of Pinheiro and Bates (1996), which expresses the correlation matrix in terms of its Cholesky factorization. The values of the elements of the Cholesky matrix are populated here using a “cSigma” parameterization that we introduce based on the aforementioned bounds on correlations. The method has three advantages: (1) the computational expense is tolerable; (2) the correlations are, by construction, guaranteed to be consistent with each other; and (3) the methodology is fairly general and hence may be applicable to other problems. The method is tested noninteractively using simulations of three Arctic mixed-phase cloud cases from two field experiments: the Indirect and Semi-Direct Aerosol Campaign and the Mixed-Phase Arctic Cloud Experiment. Benchmark simulations are performed using a large-eddy simulation (LES) model that includes a bin microphysical scheme. The correlations estimated by the new method satisfactorily approximate the correlations produced by the LES.

1. Introduction

[2] Arctic climate is influenced in strong and complex ways by mixed-phase Arctic clouds. We cite two examples here. First, mixed-phase Arctic clouds influence radiative transfer and are often observed to persist for long times [Pinto, 1998; Prenni et al., 2007]. Several modeling studies suggest that this longevity is possible only if ice nuclei concentrations are limited in order to prevent ice concentrations from increasing and depleting liquid water [Harrington et al., 1999; Prenni et al., 2007]. Second, many Arctic cloud layers are thin enough to be partly transparent to longwave radiation. Because of this, some researchers have hypothesized that if Arctic clouds experience an increase in droplet number concentration, these clouds will emit more longwave radiation and hence cause a relative warming of the surface [Garrett et al., 2002; Garrett and Zhao, 2006].

[3] Given this complexity, it is perhaps unsurprising that climate and regional simulations differ markedly from each other in their estimates of Arctic clouds [Walsh et al., 2002; Kattsov and Källén, 2005; Rinke et al., 2006; Prenni et al., 2007]. For instance, Kattsov and Källén [2005] mention the “dramatic scatter between the total cloud amounts … [which] approaches 60% in winter” that is simulated by the climate simulations they examine. Furthermore, even if a model correctly predicts the presence of cloud, it may not necessarily predict the correct phase of water. The regional models studied by Prenni et al. [2007] severely underpredict liquid water in wintertime Arctic clouds, probably due to excessive ice nuclei concentrations in the simulations. These uncertainties in simulations of clouds lead to uncertainties in other components of the simulated Arctic climate.

[4] A key difficulty in improving regional and climate models is the difficulty of parameterizing small-scale spatial variability in hydrometeors. Regional and climate models have superkilometer horizontal grid spacings, but such large grid volumes contain considerable variability. This variability ought to be taken into account when driving microphysics. Therefore, developing accurate formulas for aerosol and microphysics is necessary, but not sufficient. Also needed is an accurate parameterization of subgrid variability that is implemented in a regional or climate “host” model. Given the resolved fields predicted by the host model, the parameterization would need to predict the relevant aspects of subgrid variability and feed them into a microphysics scheme [Golaz et al., 2002; V. E. Larson and B. M. Griffin, Analytic upscaling of local microphysics parameterizations, part I: Theory, submitted to Quarterly Journal of the Royal Meteorological Society, 2011]. B. M. Griffin and V. E. Larson (Analytic upscaling of local microphysics parameterizations, part II: Simulations, submitted to Quarterly Journal of the Royal Meteorological Society, 2011) simulated a drizzling stratocumulus cloud and found that accounting for subgrid variability in a microphysics scheme led to enhanced autoconversion of cloud droplets to drizzle and accretion of cloud droplets onto drizzle drops. This, in turn, led to a 75% increase in drizzle mixing ratio near the ocean surface.

[5] A particularly difficult aspect is parameterizing correlations among hydrometeor species. This is useful, e.g., for estimating the rates of collection of droplets by snow particles. Although parameterizations of subgrid distributions of moisture-related variables have been developed [e.g., Tompkins, 2002; Morrison and Gettelman, 2008], typically these distributions are univariate. Therefore, they do not contain information on the covariability of the hydrometeors. The information on covariability is needed, for instance, to compute the collection rate, which depends on whether snow falls preferentially through parts of the cloud that contain greater or lesser cloud water mixing ratio.

[6] One possible approach to predicting correlations among hydrometeors is to develop a prognostic or diagnostic equation for each of these correlations based on fundamental physical equations. While this approach is perhaps the most satisfying from a theoretical point of view, it suffers two drawbacks. First, it is computationally expensive. If the number of hydrometeors is n, then the number of correlations among those hydrometeors is n(n − 1)/2. That is, as n grows large, the number of correlations becomes proportional to n², which is very large. For instance, if a microphysics scheme predicts both the number concentration and mixing ratio of cloud water, rain, cloud ice, and snow, then n = 8 and n(n − 1)/2 = 28. Hence there is an incentive to limit the cost of computing each correlation. Second, if each correlation is individually predicted using a physically based estimate, then the correlations so estimated may be inconsistent with each other. To take an extreme example, if variate X1 and variate X2 are perfectly correlated with each other, then correlations of a third variate X3 with X1 and X2 must be identical. However, simple physical estimates, which inevitably contain errors, will not ensure such a result.

[7] This paper presents a method that mitigates these two problems. It starts with an established mathematical framework that guarantees the internal consistency of the correlations. Within the strictures of this framework, it diagnoses values of each correlation using an inexpensive formula based on guidance from rigorous bounds on the correlations. The formula contains an adjustable parameter that must be empirically fit to observations or, in the case of this study, model-generated data.

[8] The new method is tested noninteractively using output from a large-eddy simulation (LES) model. The LES model does not serve as a host model for the correlation parameterization; rather, in this study, the LES model produces turbulent fluxes and other moments that serve as input to the new method for the purpose of noninteractive tests. The LES model also produces correlations that serve as validation data. We perform LESs of three mixed-phase Arctic clouds. The first case is based upon the Indirect and Semi-Direct Aerosol Campaign (ISDAC) field experiment; the second is based upon the Mixed-Phase Arctic Cloud Experiment (M-PACE) field experiment, period B; and the third is based on M-PACE period A. These three cloud cases differ in their surface fluxes and microphysical characteristics. Correlations computed from 3D snapshots of the LES are compared to estimates provided by the new method. In this preliminary effort, we have not implemented the correlation estimates interactively in a large-scale host model.

[9] Section 2 describes the LES model that we use. Section 3 describes the three Arctic cloud cases that we will simulate. Section 4 presents correlation matrices as computed by LES. Section 5 defines the parameterization problem that we address. Section 6 discusses rigorous lower and upper bounds on the correlations. These are used later to guide the parameterization of these correlations. Section 7 presents the spherical parameterization and the cSigma method of parameterizing its coefficients. Section 8 compares the spherical parameterization with a prognostic approach. Finally, Section 9 discusses the results and concludes.

2. Description of the Large-Eddy Simulation Model That We Use

[10] Using observations, it is difficult to obtain all the correlations we desire, and it is especially difficult to obtain them with adequate sampling statistics. Therefore, in this paper, we examine correlations simulated by LES.

[11] The LES model that we use is the System for Atmospheric Modeling (SAM) [Khairoutdinov and Randall, 2003]. SAM solves the anelastic equations of fluid flow on a Cartesian grid. To reduce spurious numerical oscillations, SAM transports thermodynamic scalars using a monotonic flux limiter. SAM advances the solutions in time using a third-order Adams-Bashforth time stepping scheme. Periodic boundaries are used in the horizontal and a rigid lid is used at the top of the domain. SAM applies sponge damping over the top 1/3 of the domain, but in all cases the lid has been chosen far enough above cloud top so that the sponge damping does not interfere with the solutions. SAM with bulk microphysics has successfully performed LES of a variety of boundary layer cases, including two mixed-phase Arctic cloud systems that we examine in this paper, the M-PACE B single-layer case [Klein et al., 2009] and the M-PACE A multilayer case [Morrison et al., 2009].

[12] In this study, we use a version of SAM that has been coupled to a bin (spectral) microphysics scheme, in which hydrometeors of different radii are separately prognosed. The bin microphysical scheme is based on the work by Khain et al. [2004], but contains modifications described by Fan et al. [2009a], such as the addition of a prognostic ice nuclei size distribution and new ice nucleation mechanisms. The bin microphysics scheme prognoses number size distributions for water drops, columnar ice crystals, plate-like ice crystals, dendritic ice crystals, snowflakes, graupel, hail/frozen drops, and aerosol particles. Each size distribution is represented by 33 bins, with each larger bin containing particles with twice the mass of the next smaller bin. The scheme explicitly computes relevant microphysical processes, including droplet nucleation, primary and secondary ice generation, condensation and evaporation of drops, deposition and sublimation of ice particles, freezing and melting, and collisions between the various hydrometeors. The coupled SAM and bin microphysics model has successfully simulated many mixed-phase and deep convective clouds [Fan et al., 2009b, 2010], including the ISDAC, M-PACE A, and M-PACE B cloud systems [Fan et al., 2009a, 2011].

[13] Most of the analyses and plots below pertain to 2D (xy) horizontal slabs of LES output at a single grid level. The slabs are located at various altitudes indicated in Figures 1, 2, and 3. For the ISDAC and M-PACE B cases, our correlation analysis will use six instantaneous snapshots of LES output; for the M-PACE A case, it will use five. The altitudes of the slabs vary slightly with snapshot as the cloud layers evolve in order to keep the slabs entirely within the same part of cloud (lower, middle, or upper). For our correlation analysis below, the slabs are composite averaged across all snapshots.

Figure 1.

Horizontally averaged profiles of cloud water content (QC) and cloud ice plus snow content (QIS) from the April 26 ISDAC “golden day” case. Analysis levels at cloud top, midcloud, and cloud base are indicated. The profiles are from a snapshot of LES output that occurred 10800 s after the simulation was initiated. The cloud layer shows a classic mixed-phase structure, with cloud water peaking near cloud top and snow peaking near or below cloud base.

Figure 2.

Horizontally averaged profile of cloud water (QC) and cloud ice plus snow (QIS) from our simulation of the M-PACE B cloud. The simulated time is 7200 s after the initiation of the simulation.

Figure 3.

Horizontally averaged profile of cloud water (QC) and cloud ice plus snow (QIS) from the M-PACE A cloud. The simulated time is 22500 s after the start of the simulation.

3. Description of the Three Mixed-Phase Arctic Clouds That We Simulate

[14] For this study, we simulate three mixed-phase Arctic clouds. One was observed during ISDAC. An overview of ISDAC is provided by McFarquhar et al. [2011]. The other two were observed during M-PACE. The M-PACE field experiment is summarized by Verlinde et al. [2007]. Also available are more detailed descriptions of aircraft observations [McFarquhar et al., 2007] and a numerical study [Fridlind et al., 2007].

[15] ISDAC occurred in the spring, when sea ice had not yet melted and hence latent and sensible heat fluxes from the surface were small, and when the air was relatively polluted. M-PACE occurred in the autumn (27 September 2004 to 22 October 2004), when there was little sea ice and much stronger surface fluxes, and when the air was cleaner (i.e., contained fewer aerosol particles).

[16] A descriptive overview of these three clouds follows.

3.1. The First Case: April 26 ISDAC Cloud

[17] The first case that we simulate is a “golden” case from ISDAC that was observed on 26 April 2008 [McFarquhar et al., 2011]. On that day a cloud was observed off the coast of Alaska near Barrow over mostly ice-covered ocean. The cloud consisted of a single, low, stratiform cloud layer. It resided in a cleaner environment than that observed on some previous ISDAC flights. The cloud was mixed phase. A glory was observed, indicating that liquid drops existed near cloud top. In situ and radar observations indicated the presence of precipitating ice particles.

[18] The model configuration that we use to simulate the ISDAC case is based on that of Ovchinnikov et al. [2009], Ovchinnikov et al. [2011], and Fan et al. [2011]. The ice nucleation scheme is that of Meyers et al. [1992], which nucleates ice heterogeneously based on the ambient supersaturation with respect to ice. However, the coefficients of the scheme have been adjusted to make it more suitable for Arctic clouds. The simulation uses fine horizontal and vertical grid spacing: dx = dy = 100 m, and dz = 20 m. The domain is 12.8 km × 12.8 km × 2.4 km. The time step is 2 s.

[19] The simulation exhibits a typical mixed-phase vertical structure, with liquid appearing to follow an approximately adiabatic structure with altitude and small amounts of snow precipitating out of cloud base (see Figure 1). This liquid-over-ice structure is also commonly observed in midlatitude, midlevel, mixed-phase clouds [e.g., Fleishauer et al., 2002; Carey et al., 2008; Smith et al., 2009].

3.2. The Second Case: M-PACE B

[20] The M-PACE B case is also a single-layer, mixed-phase, boundary layer cloud that formed over the North Slope of Alaska. The cloud system occurred on 9–10 October 2004. Our LES nucleates ice particles using a condensation freezing mechanism and also an inside-out contact freezing mechanism [Fan et al., 2009a]. The simulation uses a grid spacing of dx = dy = 100 m and dz = 20 m. The domain is 7.2 km × 7.2 km × 2.52 km. The time step is 2 s. More details on the model configuration and results can be found in the work by Klein et al. [2009] and Fan et al. [2009a].

[21] The LES yields a single, low, stratiform layer that contains cloud water (maximum of ∼0.14 g m−3) and snow (maximum of ∼0.03 g m−3) that precipitates to the ground (Figure 2). As for the ISDAC cloud, liquid water peaks near cloud top whereas snow peaks near cloud base. Compared to ISDAC, M-PACE B formed in a cleaner environment, which resulted in lower cloud droplet number concentrations, and was subjected to stronger surface forcing due to the presence of open water.

3.3. The Third Case: M-PACE A

[22] The M-PACE A cloud configuration is described by Morrison et al. [2009]. It occurred on 5–8 October 2004. Our simulation uses the ice nucleation scheme of Meyers et al. [1992]. The simulation uses a horizontal grid spacing of dx = dy = 200 m and a stretched vertical grid with 120 levels up to 9.54 km. The domain is 25.6 km × 25.6 km × 9.54 km. The time step is 4 s. The cloud system is multilayered, with distinct liquid layers and snow falling between them (Figure 3). Peak liquid contents are ∼0.01 to 0.4 g m−3, and snow contents are ∼0.01 g m−3. More information about the configuration and evaluation of the ISDAC and M-PACE A simulations is provided by Fan et al. [2011].

4. Correlation Matrices Computed by LES

[23] In order to compute grid box averages of microphysical processes involving two or more hydrometeor species, we need to estimate the subgrid correlations between those species. To gain familiarity with the correlations, we first present correlations from the three cloud cases produced by LES. Using horizontal slabs of data from the LES, we construct a correlation matrix for vertical velocity (W); the contents of cloud water (QC), cloud ice (QI), and snow (QS); and the number concentrations of cloud water (NC), cloud ice (NI), and snow (NS). Each element of the matrix lists the linear correlation coefficient [Press et al., 1992] between two hydrometeor quantities, a coefficient that lies in the range [−1, 1]. The matrices are composite averaged. That is, for each case, we ensemble average a slab of data from multiple LES snapshots (5 from M-PACE A, 6 from the other two cases), with the snapshots separated by several hours in time. All grid levels are from midaltitude within cloud.
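For concreteness, the per-slab correlation matrix can be assembled as sketched below. The fields here are synthetic stand-ins for the LES output, and the function name is ours, not from the paper:

```python
import numpy as np

def slab_correlation_matrix(fields):
    """Correlation matrix among variables sampled on one horizontal slab.

    `fields` maps a variable name (e.g., 'W', 'QC', 'NS') to a 2D array
    on an (x, y) slab at a single grid level.
    """
    names = list(fields)
    # Stack each flattened slab as one row; np.corrcoef correlates rows.
    data = np.vstack([fields[name].ravel() for name in names])
    return names, np.corrcoef(data)

# Toy illustration with synthetic, partially correlated fields.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
qc = 0.8 * w + 0.6 * rng.normal(size=(64, 64))  # correlated with w
ns = rng.normal(size=(64, 64))                  # uncorrelated with w
names, corr = slab_correlation_matrix({'W': w, 'QC': qc, 'NS': ns})
```

For the composite average described above, the matrices from the 5 or 6 snapshots can then be averaged elementwise.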

[24] The correlations are computed and presented for the three cases: ISDAC (Table 1), M-PACE B (Table 2), and M-PACE A (Table 3). We see that usually correlations are positive, but not always (e.g., the correlation between QC and QI in ISDAC is −0.08). In all three cases, the correlations are large between QI and NS, between QI and NI, and between QS and NS. We conjecture that the correlation between QI and NS may be strong because large QI leads to formation of snow particles, and hence large NS, in the bin microphysics scheme.

Table 1. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud From Our LES of the April 26 ISDAC Cloud^a

^a The correlation values are averaged over 6 snapshots in time. For ease of viewing, the diagonal elements are in italic font.

Table 2. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud From Our LES of the M-PACE B Cloud^a

^a The correlation values are averaged over 6 snapshots in time. For ease of viewing, the diagonal elements are in italic font.

Table 3. Correlation Matrix Derived From a One-Layer Horizontal Slab in Midcloud of the Middle Layer From Our LES of the M-PACE A Cloud^a

^a The correlation values are averaged over 5 snapshots in time. For ease of viewing, the diagonal elements are in italic font.


[25] In all cases, the number concentration NS and water content QS of snow are highly correlated to each other (see Tables 1, 2, and 3). How can we explain this fact? One possible explanation is the following. A high correlation would be expected when the magnitude of the distribution changes from grid box to grid box, but the shape of the distribution changes little. Then the number concentration changes, but the average particle size does not. For instance, one would expect a perfect correlation between QS and NS across a horizontal slab under the following idealized circumstance. Suppose that the shape of the snow particle size distribution is the same in all grid boxes, and in particular is not shifted to smaller or larger sizes between grid boxes, but the amplitude of the size distribution is multiplied by a different (random) constant in each grid box in the slab. Such an array of distributions leads to perfect correlation between QS and NS across grid boxes. Mathematically, we can see this by writing the expressions for the total number concentration of snow particles, NS, in terms of the size distribution of snow particles, nS(D),

$$N_S = \int_0^\infty n_S(D)\,dD \qquad (1)$$

and the snow water content, QS,

$$Q_S = \frac{1}{\rho_a}\int_0^\infty m(D)\,n_S(D)\,dD \qquad (2)$$

where D is the diameter of the snow particle, ρa is the air density, and m(D) is the mass of an individual snow particle of diameter D. Suppose that nS(D) is multiplied by the same factor in the expressions for NS and QS, without any change in the shape of nS(D) or in m(D). We allow the factor to differ from grid box to grid box, but within a grid box, the same factor multiplies nS(D) in both NS and QS. Then NS ∝ QS, and the correlation between NS and QS equals 1. On the other hand, if nS changes shape or is translated to smaller or larger sizes between grid boxes, then NS is not proportional to QS.
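This proportionality argument is easy to check numerically. The sketch below assumes an arbitrary exponential shape for nS(D) and a spherical-ice mass–diameter relation; both choices are purely illustrative, and the 1/ρa factor is omitted because a constant factor does not affect the correlation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed snow size-distribution shape over diameter bins D (illustrative).
D = np.linspace(0.1e-3, 5e-3, 33)        # particle diameters [m]
dD = D[1] - D[0]
shape = np.exp(-D / 1e-3)                # fixed shape of n_S(D) [arbitrary units]
mass = (np.pi / 6.0) * 917.0 * D**3      # m(D) for solid-ice spheres [kg]

# Each grid box multiplies the same shape by a different random amplitude.
amplitude = rng.lognormal(size=1000)
NS = amplitude * np.sum(shape) * dD               # number concentration, eq. (1)
QS = amplitude * np.sum(mass * shape) * dD        # water content, eq. (2) sans 1/rho_a

# Because QS is a constant multiple of NS, the correlation is exactly 1.
corr = np.corrcoef(NS, QS)[0, 1]
```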

[26] In other ways, the correlation matrices from the three cases differ from each other. For instance, in ISDAC, QC is weakly correlated with most other variates; in M-PACE B, it is strongly positively correlated; and in M-PACE A, it is strongly negatively correlated. Additionally, W is strongly correlated to other variates in ISDAC, but weakly correlated in the middle layer of M-PACE A. The middle layer is prevented from undergoing strong radiative cooling by the upper cloud layer and therefore exhibits little turbulence [Falk and Larson, 2007].

5. Parameterizing Correlations: Problem Definition

[27] Having gained familiarity with the magnitudes of the correlations, we now turn to the problem of parameterizing them.

[28] The parameterization problem that we address may be stated as follows. We suppose that we are given the grid box mean, subgrid standard deviation, and subgrid vertical turbulent flux of each of n variates. In our case, these variates are vertical velocity (W); the contents of cloud water (QC), cloud ice (QI), and snow (QS); and the number concentrations of cloud water (NC), cloud ice (NI), and snow (NS). Then the task is to predict (i.e. model) the matrix of correlations among variates. Because we assume that the fluxes and variances of the variates are given by a large-scale host model, the correlations between W and the other variates can be assumed to be given, but the other correlations are not. Given the correlation matrix, the covariance matrix can be computed in a straightforward manner. Because we desire to estimate some correlations given others, this problem differs from a perhaps more common problem in which one desires to find a positive semidefinite correlation matrix that is close in some sense to a complete but nonpositive-semidefinite “correlation” matrix [Rapisarda et al., 2006]. For instance, a set of correlations may be obtained from observations, but because of noise in the observations, the correlations may not be consistent with each other; that is, they may not lead to a positive semidefinite correlation matrix.

[29] The means are predicted by the host model. The vertical turbulent flux of each hydrometeor quantity (e.g., $\overline{W'Q_C'}$, $\overline{W'Q_S'}$, etc.) is often predicted in a host model by down-gradient diffusion. The standard deviations are often neglected, but they can be provided either by a prognostic equation [e.g., Golaz et al., 2002] or a simple diagnostic balance between turbulent production and dissipation. Predicting the standard deviations requires only n prognostic or diagnostic equations, fewer than the n(n − 1)/2 correlations for n ≥ 4.

6. Lower and Upper Bounds on Correlations

[30] One can analytically derive a lower and upper bound on the possible values of the correlations between two variates [Leung and Lam, 1975; Vos, 2009]. These lower and upper bounds do not by themselves serve as an accurate parameterization of correlations, but they do provide a useful starting point for the Cholesky-based parameterization discussed below in Section 7.

[31] Suppose that the linear correlation coefficient between W and an arbitrary hydrometeor quantity X1 is known and is denoted $\rho_{W,X_1}$. Suppose the same for a second hydrometeor quantity X2. Then the correlation between X1 and X2, $\rho_{X_1,X_2}$, is bounded by the expression

$$\rho_{X_1,X_2} = \rho_{W,X_1}\,\rho_{W,X_2} \pm \sqrt{\left(1-\rho_{W,X_1}^2\right)\left(1-\rho_{W,X_2}^2\right)} \qquad (3)$$

where the lower bound corresponds to the minus sign, and the upper bound corresponds to the plus sign. Leung and Lam [1975] and Vos [2009] derive the bounds using geometric arguments. The appendix notes that the same expression can be obtained by computing the condition needed for a zero eigenvalue of the correlation matrix of W, X1, and X2. One can see from formula (3) that if W is perfectly correlated with either X1 or X2, i.e., if $\rho_{W,X_1} = 1$ or $\rho_{W,X_2} = 1$, then the lower bound equals the upper bound, and the only realizable correlation is

$$\rho_{X_1,X_2} = \rho_{W,X_1}\,\rho_{W,X_2} \qquad (4)$$

In such cases, when W is highly correlated with either X1 or X2, formula (3) yields tight bounds. However, if W is uncorrelated with X1 and X2, i.e., if $\rho_{W,X_1} = 0$ and $\rho_{W,X_2} = 0$, then any correlation in the range [−1, 1] is possible (or “realizable”). Therefore, when correlations of W with X1 and X2 are low, formula (3) yields loose bounds.
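The bounds of formula (3) are simple to evaluate; the small helper below (the function name is ours) returns the lower and upper bounds, whose midpoint is simply the product of the two known correlations:

```python
import numpy as np

def correlation_bounds(rho_w_x1, rho_w_x2):
    """Bounds on corr(X1, X2) given corr(W, X1) and corr(W, X2), per formula (3)."""
    half_width = np.sqrt((1.0 - rho_w_x1**2) * (1.0 - rho_w_x2**2))
    mid = rho_w_x1 * rho_w_x2   # midbound, the product estimate of equation (4)
    return mid - half_width, mid + half_width

# Tight when W is highly correlated with one of the variates...
lo, hi = correlation_bounds(0.99, 0.8)
# ...and maximally loose ([-1, 1]) when W is uncorrelated with both.
lo0, hi0 = correlation_bounds(0.0, 0.0)
```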

[32] The lower and upper bounds of the correlations for the ISDAC, M-PACE B, and M-PACE A Arctic clouds are displayed in Figures 4 and 5, respectively. Unfortunately, the bounds turn out to be loose. The lower bound on a given scatter point is at least 0.5 less than the corresponding LES correlation, and often the lower bound is the minimum value, −1 (Figure 4). The upper bounds are also loose, with many values at or near the maximum value, 1 (Figure 5). The reason that the bounds are loose is that many of the turbulent fluxes are weak, and therefore $\rho_{W,X_1}$ or $\rho_{W,X_2}$ is small.

Figure 4.

Scatterplot of lower bound of correlation (minus sign in formula (3)) versus observed correlation in the LES. Each scatter point represents a correlation at one altitude level (e.g., lower, middle, or upper cloud) in one case (ISDAC, M-PACE B, or M-PACE A). Each correlation has nine scatter points, corresponding to the lower, middle, and upper portion of cloud in ISDAC and M-PACE B, and the middle portion of each of three cloud layers in M-PACE A. The correlations at a level are composite averaged over 5 or 6 LES snapshots in time. Each different marker symbol represents a correlation between two different variates. The Pearson correlation coefficient, r = 0.24, is shown in the plot. The lower bound strongly underestimates the observed LES correlations.

Figure 5.

As in Figure 4 except for the upper bound of correlations (plus sign in formula (3)). The upper bound strongly overestimates the observed LES correlation.

[33] To produce more accurate correlation estimates, one may choose an intermediate value between the lower and upper bounds. Perhaps the simplest method is to use equation (4) for the midbound value, regardless of the values of $\rho_{W,X_1}$ and $\rho_{W,X_2}$. Indeed, Figure 6 shows that this method does improve the estimates. However, such estimated correlations tend to cluster toward small values and tend to underestimate the LES correlations on average. Therefore, we do not recommend using approximation (4) except for applications where high accuracy is not important.

Figure 6.

As in Figure 4 except for the midpoint between the lower and upper bounds (4). This correlation estimate is more accurate than the lower or upper bounds but is artificially compressed to values near zero and thereby tends to underestimate the magnitude of the observed LES correlations.

[34] In order to develop more accurate estimates of the observed correlations, one could construct the following family of empirical formulas, inspired by equation (3):

$$\rho_{X_1,X_2} = \rho_{W,X_1}\,\rho_{W,X_2} + f\,\sqrt{\left(1-\rho_{W,X_1}^2\right)\left(1-\rho_{W,X_2}^2\right)} \qquad (5)$$

Here $S(X_1) \equiv \sigma_{X_1}/\overline{X_1}$ is the standard deviation of X1 normalized by the mean of X1. The quantity S(X2) is defined analogously. The function f is a freely chosen function of $\rho_{W,X_1}$, $\rho_{W,X_2}$, S(X1), and S(X2), with −1 < f < 1. The function f specifies a position between the lower and upper bounds of (3). The function f is sometimes called a “partial correlation coefficient” and could be denoted $\rho_{X_1,X_2\cdot W}$ [Leung and Lam, 1975]. Inspection of (5) shows that $\rho_{X_1,X_2}$ reduces to f in the special case that W has zero correlation with both X1 and X2.

[35] The correlation $\rho_{X_1,X_2}$ should obey the following two symmetry properties in order to be general and physically plausible. First, $\rho_{X_1,X_2}$ should have odd parity symmetry with respect to X1 and X2. That is, if X1 changes sign, so should $\rho_{X_1,X_2}$, and likewise for X2:

$$\rho_{-X_1,X_2} = -\rho_{X_1,X_2}, \qquad \rho_{X_1,-X_2} = -\rho_{X_1,X_2} \qquad (6)$$

Second, $\rho_{X_1,X_2}$ should have exchange symmetry with respect to X1 and X2. That is, if the values of X1 and X2 switch, then the correlation remains the same:

$$\rho_{X_1,X_2} = \rho_{X_2,X_1} \qquad (7)$$

[36] Many functional forms of f can be created that are consistent with these symmetry properties. If −1 < f < 1, then $\rho_{X_1,X_2}$ is guaranteed to lie between the lower and upper bounds and therefore is consistent with $\rho_{W,X_1}$ and $\rho_{W,X_2}$. However, this analysis does not guarantee that $\rho_{X_1,X_2}$ is consistent with correlations involving another variate X3. Therefore, we do not pursue this method further but instead move to a method that guarantees the consistency of all correlation estimates among an arbitrary number of variates. We shall see, however, that the method is related to the bounds (3).

7. Cholesky-Based Parameterization of Correlations

[37] In order to ensure consistency among correlation values, we consider the matrix of all correlations. Let Σ denote the correlation matrix that contains all correlations $\rho_{W,X_i}$, $\rho_{X_i,X_j}$, and so forth. The first row and column of Σ contain the correlations between W and the Xi variates. On the other hand, where i ≠ 1 and j ≠ 1, the element $\Sigma_{ij}$ contains the correlation between Xi and Xj.

[38] Correlation matrices must satisfy certain properties. In order for Σ to represent correlations, the elements of Σ must be real and must satisfy $\Sigma_{ij} = \Sigma_{ji}$. That is, Σ must be a real, symmetric matrix. Furthermore, the diagonal elements of Σ must be 1, and the values of the other elements must lie in the range [−1, 1].

[39] One further crucial condition on Σ is that it be positive semidefinite. In other words, all eigenvalues of Σ must be positive or zero [Press et al., 1992, p. 89]. This means that when rotated to the principal axes, all variances are positive or zero.

[40] Positive semidefinite variances are required on physical grounds. Any positive semidefinite matrix can be represented by a Cholesky factorization:

$$\Sigma = L\,L^{T} \qquad (8)$$

where L is a lower triangular matrix, and T denotes transpose. Cholesky factorizing a matrix can be seen as a high-dimensional analogue of taking the square root of a scalar [e.g., Press et al., 1992], because the given quantity, Σ, equals L multiplied by the transpose of L itself. For instance, if Σ were a diagonal matrix, then L would also be a diagonal matrix whose diagonal entries would contain the square roots of Σ's diagonal entries.
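As a concrete instance of the factorization Σ = L Lᵀ (the matrix entries below are arbitrary illustrative values, chosen to form a valid correlation matrix):

```python
import numpy as np

# A small, valid (positive definite) correlation matrix.
sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])

L = np.linalg.cholesky(sigma)   # lower triangular Cholesky factor
reconstructed = L @ L.T         # multiplying L by its transpose recovers sigma
```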

[41] To construct a parameterized correlation matrix, Σ, that is assured to be positive semidefinite, we may first construct a parameterized Cholesky factor, L, and then multiply it by its transpose, as in equation (8) [Pinheiro and Bates, 1996; Rebonato and Jäckel, 1999]. That is, instead of directly parameterizing elements of Σ, as in equation (5), we may directly parameterize L.

[42] Parameterizing L directly saves computational expense when Monte Carlo samples are needed, regardless of how the elements of L are parameterized. For instance, given the Cholesky factor Lcov of a covariance matrix Σcov, a vector of mean values μ, and a vector x of uncorrelated sample points drawn from a standard normal distribution, we can compute the corresponding correlated sample, y, by matrix multiplication [Johnson, 1987]:

\[
\mathbf{y} = \boldsymbol{\mu} + L_{\mathrm{cov}} \, \mathbf{x} .
\qquad (9)
\]

Parameterizing Lcov directly is computationally more efficient than computing Σcov and then computing the Cholesky factorization at each time step. We can compute the covariance Cholesky matrix Lcov from the correlation Cholesky matrix L by multiplying each row of L by the standard deviation of the corresponding variate.
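To make the sampling step concrete, the row scaling and equation (9) can be sketched in NumPy. This is our illustration, not code from the paper; the correlation matrix, standard deviations, and means below are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 3-variate correlation matrix and its Cholesky factor L.
Sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
L = np.linalg.cholesky(Sigma)          # lower triangular, Sigma = L @ L.T

# Scale each row of L by the corresponding standard deviation to obtain
# the covariance Cholesky factor L_cov, as described in the text.
sigma = np.array([2.0, 0.5, 1.5])      # hypothetical standard deviations
L_cov = sigma[:, None] * L

mu = np.array([10.0, 1.0, 4.0])        # hypothetical mean values

# Equation (9): a correlated sample from uncorrelated standard normals.
x = rng.standard_normal(3)
y = mu + L_cov @ x

# L_cov @ L_cov.T recovers the covariance matrix exactly.
print(np.allclose(L_cov @ L_cov.T, np.outer(sigma, sigma) * Sigma))  # True
```

Because L_cov is triangular and precomputed, each additional sample costs only one matrix-vector multiply; no factorization need be repeated at run time, which is the computational saving noted above.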

[43] The desired correlation Cholesky matrix, LT, can be written as [Pinheiro and Bates, 1996]:

\[
\left( L^T \right)_{ij} =
\begin{cases}
c_{1j} , & i = 1 , \\[4pt]
c_{ij} \displaystyle\prod_{k=1}^{i-1} s_{kj} , & 1 < i < j , \\[4pt]
\displaystyle\prod_{k=1}^{j-1} s_{kj} , & i = j > 1 , \\[4pt]
0 , & i > j ,
\end{cases}
\qquad (10)
\]

where, by convention, c11 = 1.

For instance, a 4 × 4 matrix would be written, in tableau form, as

\[
L^T =
\begin{pmatrix}
1 & c_{12} & c_{13} & c_{14} \\
0 & s_{12} & c_{23} s_{13} & c_{24} s_{14} \\
0 & 0 & s_{13} s_{23} & c_{34} s_{14} s_{24} \\
0 & 0 & 0 & s_{14} s_{24} s_{34}
\end{pmatrix} .
\qquad (11)
\]

Pinheiro and Bates [1996] refer to this as the “spherical parameterization.” The elements that compose the first row of LT turn out to be simply the correlations between W and the hydrometeors. That is, c1j = Σ1j, where the Σ1j are assumed to be known. In the i ≠ 1 rows, sij = sin(θij) and cij = cos(θij), where θij is a set of angles that remains to be determined.

[44] The formula (10) of Pinheiro and Bates [1996] provides a convenient and general framework for constructing positive semidefinite correlation matrices. Equation (10) ensures that the matrix is positive semidefinite and has ones along the main diagonal, regardless of the values of the θij angles. The spherical parameterization may be interpreted geometrically by noting that the ith column of LT in (11) represents the components of a unit vector vi. These unit vectors all originate at the same point but are oriented in different directions, such that their dot products equal the correlations of Σ (i.e., vi · vj = Σij) [Rapisarda et al., 2006].
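The construction can be checked numerically. The sketch below (our illustration, with arbitrarily chosen angle values) assembles the upper triangular factor Lᵀ column by column as unit vectors and confirms that Σ = LLᵀ is a valid correlation matrix regardless of the θij values:

```python
import numpy as np

def spherical_cholesky_T(theta):
    """Build the upper triangular factor L^T from angles theta[(i, j)]
    (1-based indices, i < j), following the spherical parameterization of
    Pinheiro and Bates [1996].  Column j of L^T is the unit vector
    (c_1j, s_1j c_2j, s_1j s_2j c_3j, ...)."""
    n = max(j for (_, j) in theta)
    LT = np.zeros((n, n))
    LT[0, 0] = 1.0
    for j in range(1, n):                   # 0-based column index
        prod_s = 1.0
        for i in range(j):                  # rows above the diagonal
            LT[i, j] = prod_s * np.cos(theta[(i + 1, j + 1)])
            prod_s *= np.sin(theta[(i + 1, j + 1)])
        LT[j, j] = prod_s                   # diagonal closes the unit vector
    return LT

# Hypothetical angles theta_ij in (0, pi), keyed by 1-based (i, j).
theta = {(1, 2): 1.0, (1, 3): 0.8, (1, 4): 1.2,
         (2, 3): 0.5, (2, 4): 2.0, (3, 4): 1.4}
LT = spherical_cholesky_T(theta)
Sigma = LT.T @ LT                           # equation (8)

print(np.allclose(np.diag(Sigma), 1.0))               # unit diagonal
print(np.all(np.linalg.eigvalsh(Sigma) >= -1e-12))    # positive semidefinite
```

Both checks hold for any choice of angles, which is the point of the framework: consistency of the correlations is guaranteed by construction.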

[45] However, where i ≠ 1, the angles θij are unknown a priori and hence need to be parameterized. The optimal values of θij will vary among applications, and Pinheiro and Bates [1996] do not suggest how to parameterize them. We suggest an indirect method below: instead of parameterizing θij directly, we choose to parameterize cij. It then follows from trigonometry that sij = √(1 − cij²).

[46] In order to develop understanding of the cij parameters, we explore their relationship to the elements of LLT. If the θij angles were known exactly, then we would have (LLT)ij = Σij (see equation (8)). If L is parameterized approximately, however, then the agreement is only approximate: (LLT)ij ≈ Σij. The cij are related to the correlation estimates in a complex way. For instance, for the 4 × 4 case, the upper triangular elements of the (symmetric) parameterized correlation matrix are

\[
\begin{aligned}
(L L^T)_{12} &= c_{12} , \qquad (L L^T)_{13} = c_{13} , \qquad (L L^T)_{14} = c_{14} , \\
(L L^T)_{23} &= c_{12} c_{13} + \tilde{\Sigma}_{23} \, s_{12} s_{13} , \\
(L L^T)_{24} &= c_{12} c_{14} + \tilde{\Sigma}_{24} \, s_{12} s_{14} , \\
(L L^T)_{34} &= c_{13} c_{14} + \tilde{\Sigma}_{34} \, s_{13} s_{14} ,
\end{aligned}
\qquad (12)
\]

where Σ̃23 = c23, Σ̃24 = c24, and Σ̃34 = c23c24 + c34s23s24.

[47] For the top row, that is, for i = 1, c1j = (LLT)1j. However, the relationship for other rows is more complex. Equation (12) indicates that the correlations (LLT)ij are related to the cij parameters by expressions such as

\[
(L L^T)_{23} = c_{12} c_{13} + c_{23} \, s_{12} s_{13} .
\qquad (13)
\]

If we recall that sij = √(1 − cij²), then we see that the spherical parameterization of Σ23, for instance, matches the form of the bound parameterization (5), if Σ̃23 = c23 is set equal to f23. That is, for these second-row correlations Σ2j, the only difference between (13) and (5) is the form of f, which determines whether the correlation lies closer to the upper bound or the lower bound.
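The identity (13) for the second-row correlations can be verified directly; the parameter values here are arbitrary illustrations:

```python
import numpy as np

# Hypothetical parameter values c_ij in (-1, 1), with s_ij = sqrt(1 - c_ij^2).
c12, c13, c23 = 0.6, -0.3, 0.4
s12 = np.sqrt(1 - c12**2)
s13 = np.sqrt(1 - c13**2)
s23 = np.sqrt(1 - c23**2)

# Columns 2 and 3 of L^T are the unit vectors v_2 and v_3 of equation (11).
v2 = np.array([c12, s12, 0.0])
v3 = np.array([c13, c23 * s13, s23 * s13])

# (L L^T)_{23} = v_2 . v_3, which equals c12*c13 + c23*s12*s13 as in (13).
print(np.isclose(v2 @ v3, c12 * c13 + c23 * s12 * s13))  # True
```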

[48] Despite these subtle differences, it turns out that for our Arctic LES,

\[
c_{ij} \approx \Sigma_{ij} .
\qquad (14)
\]

This is shown in Figure 7, in which we set cij = Σij in L and then plot (LLT)ij versus Σij. We find good agreement, which indicates that cij ≈ Σij.

Figure 7.

The spherical parameterization estimate (LLT)ij, computed with the parameters cij set to cij = Σij, versus the observed LES correlations Σij. Each scatter point corresponds to a composite average of 5 or 6 snapshots of LES output. Although the exact relationship between cij and Σij is more complicated than equality (see equation (12)), the good fit about the 1:1 line shows that cij ≈ Σij. This fact is useful to keep in mind when designing a parameterization of cij.

[49] In order to use the framework provided by equation (11), we need to parameterize all the cij (i > 1) parameters in terms of known quantities, namely the means, standard deviations, and vertical turbulent fluxes. Inspired by equation (14), we seek a parameterization that approximates cij as the correlation between Xi and Xj, which, in turn, can be approximated by equation (5). In this way, the parameterization is directly related to the upper and lower bounds on correlations discussed previously. We parameterize cij for i > 1 in terms of the first-row elements (correlations) as

\[
c_{ij} = c_{1i} c_{1j} + f_{ij} \sqrt{1 - c_{1i}^2} \, \sqrt{1 - c_{1j}^2} .
\qquad (15)
\]

In (15), we choose fij as

\[
f_{ij} = \alpha \, \mathrm{sgn}\!\left( c_{1i} c_{1j} \right) S_i S_j .
\qquad (16)
\]

As a subsequent step, we explicitly ensure that −0.99 < fij < 0.99. Here α is an adjustable coefficient.

[50] Formulas (15) and (16) are semiempirical, rather than derived from first principles. Nevertheless, we now describe our rationale for the choice of (16). In formulating fij, we multiply α by sgn(c1ic1j) in order to ensure that important symmetry relationships are obeyed (see below). We include SiSj, where $S_i \equiv S(X_i) \equiv \sigma(X_i)/\overline{X_i}$ is the standard deviation of Xi divided by its mean (and analogously for Sj), because this factor improves the fit. This factor appears to help especially in cases in which a hydrometeor species is present in only part of a horizontal slab. In such cases, there is a cluster of data points at zero and a separate cluster of nonzero data points. Such distributions have unusually high correlations. It would be convenient to parameterize this effect directly in terms of the fraction of zero points, but this information is often unavailable. Instead, as a surrogate, we include SiSj in (16), which increases fij and hence the correlation cij. In those cases in which Si is unknown, one may set SiSj = 1 and obtain somewhat degraded but still satisfactory results (not shown).
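A minimal end-to-end sketch of the cSigma recipe follows, assuming the first-row correlations c1j, the relative standard deviations Si, and the coefficient α are given. The function names and numerical values are ours, not from the paper:

```python
import numpy as np

def csigma_parameters(c1, S, alpha):
    """Fill in the Cholesky parameters c_ij (i > 1) from the first-row
    correlations c1[j] = Corr(W, X_j) and relative standard deviations S,
    using the cSigma parameterization, equations (15) and (16)."""
    n = len(c1) + 1                        # variates: W, X_1, ..., X_{n-1}
    c = np.eye(n)
    c[0, 1:] = c[1:, 0] = c1               # known first-row correlations
    for i in range(1, n):
        for j in range(i + 1, n):
            # Equation (16), with f_ij clipped to (-0.99, 0.99).
            f = alpha * np.sign(c[0, i] * c[0, j]) * S[i - 1] * S[j - 1]
            f = np.clip(f, -0.99, 0.99)
            # Equation (15).
            c[i, j] = c[j, i] = (c[0, i] * c[0, j]
                                 + f * np.sqrt(1 - c[0, i]**2)
                                     * np.sqrt(1 - c[0, j]**2))
    return c

def spherical_LT(c):
    """Assemble the upper triangular factor L^T of equation (11) from the
    parameters c_ij, with s_ij = sqrt(1 - c_ij^2)."""
    n = c.shape[0]
    LT = np.zeros((n, n))
    LT[0, 0] = 1.0
    for j in range(1, n):
        prod_s = 1.0
        for i in range(j):
            LT[i, j] = prod_s * c[i, j]
            prod_s *= np.sqrt(1 - c[i, j]**2)
        LT[j, j] = prod_s
    return LT

# Hypothetical inputs: correlations of W with three hydrometeor variates,
# and their relative standard deviations (sigma over mean).
c1 = np.array([0.5, -0.4, 0.7])
S = np.array([1.2, 2.0, 0.8])
c = csigma_parameters(c1, S, alpha=0.19)
LT = spherical_LT(c)
Sigma_hat = LT.T @ LT                      # equation (8)

print(np.allclose(np.diag(Sigma_hat), 1.0))    # unit diagonal
print(np.allclose(Sigma_hat[0, 1:], c1))       # first row reproduced exactly
```

By construction, Sigma_hat is positive semidefinite and reproduces the given first-row correlations exactly, while the remaining entries are the cSigma estimates.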

[51] The formulation of (15) and (16) is dictated in part by the need to obey symmetry relationships. The expression for cij satisfies exchange symmetry (7) because if i and j are switched with each other on the right-hand side of (15), the right-hand side remains unchanged. The expression also satisfies odd parity symmetry (6) because if either the ith or jth variate changes sign, so does the right-hand side.

[52] We call equations (15) and (16) the “cSigma” parameterization because it is inspired by the assumption that cij ≈ Σij. The cSigma parameterization is useful in cases in which the ordering of the magnitudes of the correlations is not known. In other cases, it is reasonable to suppose that the correlations are ordered. For instance, the correlation between a quantity at two times should decrease as the time interval increases. In such cases, one may use the parameterization of Rapisarda et al. [2006] and Schoenmakers and Coffey [2003].

[53] Equation (16) would be of limited practical use if it were necessary to choose a different value of the adjustable parameter α for each different cloud. To assess whether a single, robust value of α can be chosen, we partition the data points into three groups, corresponding, naturally, to the three cloud cases: ISDAC, M-PACE B, and M-PACE A. Using these three groups, we perform a k-fold cross validation [e.g., Kohavi, 1995] in which all data points from one cloud case are omitted, the value of α is optimized using data from the remaining two cases, and then the resulting optimal value of α is used in (LLT)ij to test how accurately it represents Σij in the omitted cloud case.
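The leave-one-case-out procedure can be sketched as follows. The data here are synthetic stand-ins for the LES diagnostics, the SiSj factor of (16) is set to 1, and because the model is linear in α for fixed data, an ordinary least squares fit stands in for the Levenberg-Marquardt optimization used in the paper:

```python
import numpy as np

# Synthetic stand-in data: for each cloud case, first-row correlation
# products c_1i*c_1j, the basis sgn(c_1i c_1j)*sqrt(1-c_1i^2)*sqrt(1-c_1j^2)
# from equations (15)-(16), and noisy "observed" correlations Sigma_ij.
rng = np.random.default_rng(1)
cases = {}
for name in ["ISDAC", "M-PACE B", "M-PACE A"]:
    c1i = rng.uniform(-0.8, 0.8, 20)
    c1j = rng.uniform(-0.8, 0.8, 20)
    basis = np.sign(c1i * c1j) * np.sqrt(1 - c1i**2) * np.sqrt(1 - c1j**2)
    obs = c1i * c1j + 0.17 * basis + 0.05 * rng.standard_normal(20)
    cases[name] = (c1i * c1j, basis, obs)

# Leave-one-case-out cross validation: fit alpha on two cases, then
# evaluate the RMS error of equation (15) on the held-out case.
for held_out in cases:
    prod = np.concatenate([cases[k][0] for k in cases if k != held_out])
    basis = np.concatenate([cases[k][1] for k in cases if k != held_out])
    obs = np.concatenate([cases[k][2] for k in cases if k != held_out])
    # Model (15): Sigma_ij ~ c_1i c_1j + alpha * basis.  The model is
    # linear in alpha, so a closed-form least squares fit suffices here.
    alpha = basis @ (obs - prod) / (basis @ basis)
    p, b, o = cases[held_out]
    rms = np.sqrt(np.mean((p + alpha * b - o) ** 2))
    print(f"omit {held_out}: alpha = {alpha:.2f}, held-out RMS = {rms:.3f}")
```

With real LES data the residuals would come from the full (LLT)ij versus Σij comparison, and a Levenberg-Marquardt solver would handle the general nonlinear case.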

[54] We optimize the value of the parameter α using the Levenberg-Marquardt method [Press et al., 1992]. When we arrange the columns in Σ in the order [W, QS, NS, QI, NI, QC, NC], then we find that the best fit value of α is 0.19 when ISDAC data are excluded, 0.11 when M-PACE B is excluded, and 0.21 when M-PACE A is excluded. The values of α are fairly close for these three cases, raising hopes that a single value of α can be used widely without great loss of accuracy. Using these parameter values yields the fits shown in Figures 8, 9, and 10. The correlation estimates tend to have a low bias and are somewhat scattered about the 1:1 line. Nonetheless, considering the simplicity, generality, and low computational cost of formulas (15) and (16), the fit is quite acceptable.

Figure 8.

A scatterplot of ISDAC correlations from LES and from the spherical parameterization with cij set according to the cSigma parameterization (equations (15) and (16), with α = 0.19). The value of α has been optimized using M-PACE B and M-PACE A data points. Altitude levels are chosen and time averaging is performed as in Figure 4. Considering the simplicity of the cSigma parameterization, the fit is satisfactory.

Figure 9.

A scatterplot of M-PACE B correlations from LES and from the spherical parameterization with cij set according to the cSigma parameterization (equations (15) and (16), with α = 0.11). The value of α has been optimized using ISDAC and M-PACE A data points. Altitude levels are chosen and time averaging is performed as in Figure 4. The fit exhibits more scatter than that in Figure 8 or Figure 10.

Figure 10.

A scatterplot of M-PACE A correlations from LES and from the spherical parameterization with cij set according to the cSigma parameterization (equations (15) and (16), with α = 0.21). The value of α has been optimized using ISDAC and M-PACE B data points. Altitude levels are chosen and time averaging is performed as in Figure 4. Considering the simplicity of the cSigma parameterization, the fit is satisfactory.

8. For Comparison: Covariances Based on the Scalar Variance Equation

[55] We now compare the estimate (5), which is related to the spherical parameterization, with an alternative methodology that estimates the correlations based on the scalar variance equation. The covariance of X1 and X2, $\overline{X_1' X_2'}$, is governed by the equation

\[
\frac{\partial \overline{X_1' X_2'}}{\partial t}
= - \overline{w} \, \frac{\partial \overline{X_1' X_2'}}{\partial z}
- \frac{\partial \overline{w' X_1' X_2'}}{\partial z}
- \overline{w' X_1'} \, \frac{\partial \overline{X_2}}{\partial z}
- \overline{w' X_2'} \, \frac{\partial \overline{X_1}}{\partial z}
- C_d \, \frac{\overline{X_1' X_2'}}{\tau}
+ \text{source terms} .
\qquad (17)
\]

Here τ is a turbulent dissipation scale, Cd is an empirical constant, t is time, and z is altitude. An overbar denotes a grid box mean, and a prime denotes a perturbation from the mean. We have neglected the horizontal advection and horizontal production terms. If we further neglect the time tendency, vertical mean advection, vertical turbulent advection, and source terms, then we find

\[
\overline{X_1' X_2'} = - \frac{\tau}{C_d} \left( \overline{w' X_1'} \, \frac{\partial \overline{X_2}}{\partial z} + \overline{w' X_2'} \, \frac{\partial \overline{X_1}}{\partial z} \right) .
\qquad (18)
\]

Now we assume that the turbulent fluxes can be modeled by down-gradient diffusion. That is,

\[
\overline{w' X_1'} = - K \, \frac{\partial \overline{X_1}}{\partial z} ,
\qquad (19)
\]

and similarly for X2. Here, K is an eddy diffusivity, which may be chosen to have a complicated form. We substitute (19) into (18) in order to eliminate the vertical derivatives. Then we rearrange, which yields

\[
\overline{X_1' X_2'} = \frac{2 \tau}{C_d K} \, \overline{w' X_1'} \; \overline{w' X_2'} .
\qquad (20)
\]

Converting from covariances to correlations, we find

\[
\mathrm{Corr}(X_1, X_2) = \frac{2 \tau \overline{w'^2}}{C_d K} \, \mathrm{Corr}(w, X_1) \, \mathrm{Corr}(w, X_2) .
\qquad (21)
\]

The estimate (21), which is based on the scalar variance equation, may be compared with equation (5). The two equations coincide if $2 \tau \overline{w'^2} / (C_d K) = 1$ in (21) and f = 0 in (5). With those conditions imposed, (21) reduces to (4), and the results of the parameterization may be seen in Figure 6.

[56] One advantage of the estimate (21) is that it is derived from fundamental physics, namely, the scalar variance equation. One disadvantage is that the neglect of many terms is required to derive this simplified form. Another disadvantage is that it does not ensure that Corr(X1, X2) is consistent with Corr(w, X1) and Corr(w, X2); that is, it does not ensure that Corr(X1, X2) lies within the lower and upper bounds (3). Nor does estimate (21) ensure that the correlation matrix is positive semidefinite.
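A numerical illustration of this last point, with hypothetical values for τ, Cd, K, and the vertical velocity variance:

```python
import numpy as np

# Scalar-variance-equation estimate (21), with hypothetical inputs.
tau, C_d, K, w2 = 300.0, 2.0, 30.0, 0.25   # s, dimensionless, m^2/s, m^2/s^2
corr_w1, corr_w2 = 0.9, 0.9                # Corr(w, X1), Corr(w, X2)

coef = 2.0 * tau * w2 / (C_d * K)
corr_12 = coef * corr_w1 * corr_w2         # equation (21)

# Bounds (3): a consistent Corr(X1, X2) must lie in [lo, hi].
root = np.sqrt((1 - corr_w1**2) * (1 - corr_w2**2))
lo = corr_w1 * corr_w2 - root
hi = corr_w1 * corr_w2 + root
print(lo <= corr_12 <= hi)   # False: (21) by itself can violate the bounds
```

Here the prefactor works out to 2.5, so the estimated correlation exceeds 1 and falls outside the bounds, illustrating why an additional consistency mechanism such as the Cholesky construction is needed.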

9. Conclusions

[57] This paper addresses the problem of parameterizing subgrid-scale correlations among hydrometeors in large-scale models. These correlations are important because they influence the rates of microphysical processes such as accretion. However, the correlations are difficult to parameterize, both because they are numerous (the number of correlations is approximately proportional to the square of the number of hydrometeor species) and because the values of the correlations depend on a broad range of physical processes.

[58] Rigorous lower and upper bounds on the correlations are available (see equation (3)), and the form of (3) does suggest a parameterization for the individual correlations (5). However, this equation does not guarantee consistency among all correlations.

[59] Instead, we suggest using the spherical parameterization framework of Pinheiro and Bates [1996], embodied in equations (8) and (10). This provides a framework for the upper triangular Cholesky factor, LT, associated with the correlation matrix. To estimate the elements of LT, we use the cSigma parameterization (15) and (16). These formulas contain only a single adjustable parameter, α, whose optimal value for our data ranges from 0.11 to 0.21. If the standard deviations (Si in equation (16)) are not available, then one may set Si = Sj = 1. In our tests, this also yielded an acceptable, albeit somewhat degraded, fit to data (not shown).

[60] This methodology has three advantages.

[61] 1. Limited computational expense: the method is relatively inexpensive because it requires only diagnostic formulas rather than complex prognostic ones. Furthermore, it directly parameterizes the elements of the Cholesky matrix, thereby avoiding the need to Cholesky decompose a correlation matrix in order to generate Monte Carlo samples.

[62] 2. Consistency among correlations: by construction, the method guarantees that the correlation matrix is positive semidefinite.

[63] 3. Potential generality: the method is based on a blend of theory and empiricism, but does not depend on the details of microphysical processes. Although the adjustable parameter may need to be refitted to other data sets, the method may be applicable to other physical problems. For instance, it might be used to parameterize correlations between hydrometeors in warm-phase clouds or fully glaciated clouds. We speculate that the method might also be applicable to other problems with many variates, such as aerosol physics or atmospheric chemistry.

[64] We have applied the method to parameterizing correlations found in LES of three Arctic mixed-phase cloud cases: ISDAC, M-PACE B, and M-PACE A. The fit is encouraging (see Figures 8–10), given the simplicity of the method and the fact that it does not attempt to exploit the details of the physics of these particular cases.

[65] In this paper, we have assumed that we are given the correlation of each variate (X1, X2, …) with another, single variate (W). That is, we know the correlations in the first row of the correlation matrix, and we desire to fill in the other entries in the correlation matrix. However, it is possible to envision a generalization of this problem. Namely, instead of knowing the correlations in the first row, in some problems one may know the correlations in various locations throughout the correlation matrix. In this case, one may first use the known correlations to find the correlations c1j using equations (15) and (16), and then proceed as above. In this way, the cSigma method can be extended to the more general problem of filling in unknown entries in a correlation matrix when some correlations are known.

Appendix A: Lower and Upper Bounds on Correlations

[66] Leung and Lam [1975] and Vos [2009] derive the lower and upper bounds on correlations (3) using a geometric argument. Here we use a different argument to show that when only three variables are considered, if a correlation equals the lower or upper bound (3), then the correlation is realizable.

[67] Consider three variates: X1, X2, and X3. Suppose that they are related by the following correlation matrix:

\[
C =
\begin{pmatrix}
1 & C_{12} & C_{13} \\
C_{12} & 1 & C_{23} \\
C_{13} & C_{23} & 1
\end{pmatrix} .
\qquad (A1)
\]

In order for this matrix to be realizable, each correlation must lie within the range [−1, 1]. That is, the correlations must satisfy −1 ≤ C12 ≤ 1, −1 ≤ C23 ≤ 1, and −1 ≤ C13 ≤ 1. These conditions are necessary but not sufficient. Additionally, the correlations must be consistent with each other. For instance, if X1 and X2 are perfectly correlated with each other, then it is impossible for a third variate X3 to be correlated with X1 but anticorrelated with X2. In general, the condition for realizability is that when the correlation matrix (A1) is rotated to its principal axes, all variances must be nonnegative. We can interpret this geometrically by imagining a three-dimensional cloud of scatter points drawn from a Gaussian distribution with the correlations given in (A1). Then the principal axes are the major and minor axes through the cloud of points; no axis can have negative width, which is equivalent to stating that no variance can be negative. Mathematically stated, all eigenvalues of the correlation matrix must be positive or zero [e.g., Press et al., 1992, p. 89].

[68] Assume that C12 and C13 are given and that we want to find the range of C23 values that is realizable. Restated mathematically, we want to show that if C23 is given by the following lower or upper bound,

\[
C_{23} = C_{12} C_{13} \pm \sqrt{\left( 1 - C_{12}^2 \right) \left( 1 - C_{13}^2 \right)} ,
\qquad (A2)
\]

then each of the three eigenvalues is either zero or positive. To ascertain this, we substitute the expression (A2) for C23 into (A1) and find the eigenvalues of the resulting matrix. The eigenvalues λ satisfy det(C − λI) = 0, where I is the identity matrix. The three eigenvalues are

\[
\lambda_1 = 0 ,
\qquad (A3)
\]
\[
\lambda_2 = \frac{1}{2} \left[ 3 + \sqrt{g(C_{12}, C_{13})} \right] ,
\qquad (A4)
\]
\[
\lambda_3 = \frac{1}{2} \left[ 3 - \sqrt{g(C_{12}, C_{13})} \right] ,
\qquad (A5)
\]

where

\[
g(C_{12}, C_{13}) \equiv 1 + 8 C_{12}^2 C_{13}^2 \pm 8 C_{12} C_{13} \sqrt{\left( 1 - C_{12}^2 \right) \left( 1 - C_{13}^2 \right)} ,
\]

and the ± in g, and hence in equations (A4) and (A5), corresponds to the ± in (A2), that is, to the upper and lower bounds, respectively. First, we prove that these eigenvalues are real. To do so, it is sufficient to show that

\[
g(C_{12}, C_{13}) \ge 0 .
\qquad (A6)
\]

We find the stationary points of g by simultaneously solving the set of equations

\[
\frac{\partial g}{\partial C_{12}} = 0
\quad \text{and} \quad
\frac{\partial g}{\partial C_{13}} = 0 .
\qquad (A7)
\]

For the upper bound, this yields three stationary points: (C12, C13) = (0, 0), (C12, C13) = (1/2, −1/2), and (C12, C13) = (−1/2, 1/2). For the lower bound, this also yields three stationary points: (C12, C13) = (0, 0), (C12, C13) = (1/2, 1/2), and (C12, C13) = (−1/2, −1/2). At (C12, C13) = (0, 0), g = 1. At the other stationary points, g attains its minimum value of 0. At the boundaries, g > 0, as shown below. Since g ≥ 0 everywhere, we have demonstrated that the eigenvalues are real.

[69] To prove that the eigenvalues (A4) and (A5) are nonnegative, it is sufficient to show that

\[
g(C_{12}, C_{13}) \le 9 .
\qquad (A8)
\]

Because g < 9 at the stationary points, we only need to check the boundaries. By inspection, it is clear that at the corners of the domain, (C12,C13) = (±1, ±1), g = 9. It is also clear by inspection that along the edges of the domain, i.e. where C12 = ±1 but C13 ≠ ±1, or where C13 = ±1 but C12 ≠ ±1, then 0 < g < 9. Therefore, the eigenvalues are positive or zero. The above results are confirmed by plots of g.
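As a numerical confirmation of this claim (our illustration, not from the paper), one can verify at random (C12, C13) points that the matrix (A1) with C23 at the upper bound (A2) has one zero eigenvalue, no negative eigenvalues, and a largest eigenvalue matching (A4):

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(1000):
    C12, C13 = rng.uniform(-1.0, 1.0, 2)
    C23 = C12 * C13 + np.sqrt((1 - C12**2) * (1 - C13**2))  # upper bound (A2)
    C = np.array([[1.0, C12, C13],
                  [C12, 1.0, C23],
                  [C13, C23, 1.0]])
    lam = np.linalg.eigvalsh(C)          # eigenvalues in ascending order
    assert abs(lam[0]) < 1e-10           # smallest eigenvalue is zero (A3)
    assert lam[1] > -1e-10               # remaining eigenvalues nonnegative
    # Largest eigenvalue should match (A4) with the upper-bound form of g.
    g = 1 + 8 * C12**2 * C13**2 + 8 * C12 * C13 * np.sqrt(
        (1 - C12**2) * (1 - C13**2))
    g = max(g, 0.0)                      # guard against tiny negative roundoff
    assert np.isclose(lam[2], (3 + np.sqrt(g)) / 2)
print("all eigenvalue checks passed")
```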

[70] Although we have only proven that the eigenvalues are zero or positive when C23 equals the upper or lower bound, it is clear on physical grounds that the correlation matrix is realizable when C23 lies between the upper and lower bounds.

Acknowledgments

[71] We gratefully acknowledge a conversation with Hans Volkmer that clarified the appendix. This research was supported by the Office of Science (BER), U.S. Department of Energy. The Pacific Northwest National Laboratory is operated for the DOE by Battelle Memorial Institute under contract DE-AC06-76RLO 1830.