Characterization of cloud liquid water content distributions from CloudSat



[1] The development of realistic cloud parameterizations requires accurate characterizations of subgrid distributions of thermodynamic variables. To this end, cloud liquid water content (CLWC) distributions are characterized with respect to cloud phase, cloud type, precipitation occurrence, and geolocation using CloudSat radar measurements. The probability density function (PDF) of CLWC is estimated using maximum likelihood estimation. The best-estimated PDF of CLWC is found to follow either a gamma or a lognormal distribution depending on temperature (cloud phase), cloud type, the occurrence of precipitation, and geolocation. The data sampling with respect to cloud phase and precipitation significantly affects the distributional characteristics of CLWC in some regions. In the lower to midtroposphere (altitudes of 1–6 km) in the tropics and subtropics, where nonprecipitating and pure liquid phase clouds are dominant, the PDFs of CLWC are best described by lognormal distributions. In contrast, at altitudes above 6 km and in regions poleward of the midlatitudes, the CLWC more closely resembles a gamma distribution that coincides with a high frequency of occurrence of supercooled liquid clouds containing low CLWC values. When the contributions of supercooled water and precipitation are removed, the CLWC PDFs transition from gamma to lognormal distributions in two areas: (1) the high altitude and middle-to-polar latitude regions where the contribution of supercooled cloud is significant and (2) in the lower troposphere where precipitation is frequently detected. Although the CloudSat radar does not sample all cloud hydrometeors, coherent regional and cloud type dependence of CLWC distributional characteristics are observed that may provide useful constraints for cloud parameterizations in climate models.

1. Introduction

[2] The crude treatment of subgrid-scale cloud processes in current climate models is widely recognized as a major limitation in predictions of global climate change. Typical climate models have a horizontal resolution on the order of 100 km and a variable vertical resolution between 100 m and 1 km. Since climate models cannot explicitly resolve what happens at subgrid scales, the physics must be parameterized as a function of the resolved motions. A fundamental problem of cloud parameterization is to characterize the distributions of cloud variables at subgrid scales and to relate the subgrid variations to the resolved flow. In particular, the subgrid distributions of liquid water content play a key role in modern cloud microphysics parameterizations [e.g., Morrison and Gettelman, 2008].

[3] Cloud parameterizations based on probability density functions (PDF) of moist conserved variables (e.g., total water content) have been advocated for some time [e.g., Sommeria and Deardorff, 1977], but only simplified versions have been implemented in weather and climate prediction models [e.g., Tompkins, 2002; Teixeira and Hogan, 2002; Chaboureau and Bechtold, 2002]. PDF approaches have also been used for the development of stochastic parameterizations associated with turbulence, convection, and cloud-radiation interaction [e.g., Barker, 2002; Golaz et al., 2002a; Teixeira and Reynolds, 2008].

[4] However, there is no clear consensus on the optimal type and number of basic shapes for the PDF estimation of cloud and thermodynamic properties for different cloud types. For example, aircraft observations [e.g., Larson et al., 2001a] and large eddy simulation models [e.g., Cuijpers and Bechtold, 1995] suggest that for most stratus cloud regimes a Gaussian PDF for moist conserved variables is a realistic approximation. For cumulus clouds, the skewness of these PDFs plays an important role in determining the cloud and thermodynamic properties, and other studies have therefore suggested different PDF types such as the beta distribution [Tompkins, 2002], the double Gaussian distribution [Golaz et al., 2002a, 2002b], and the generalized lognormal distribution [Bony and Emanuel, 2001].

[5] The characterization of distributions (PDFs) of cloud variables has been studied using aircraft data [e.g., Ek and Mahrt, 1991; Wood and Field, 2000; Larson et al., 2001b], tethered balloon data [e.g., Price, 2001], satellite data [e.g., Wielicki and Parker, 1994; Barker et al., 1996], and cloud-resolving or large-eddy simulation models [e.g., Bougeault, 1981; Lewellen and Yoh, 1993; Xu and Randall, 1996a, 1996b]. However, the observational data used in the previous studies have limitations. Aircraft and tethered balloon data provide only one-dimensional paths in a few selected locations and do not provide global coverage. Although satellite data can provide global coverage, previous studies based on infrared (IR) and visible imagery [e.g., Wielicki and Parker, 1994; Barker et al., 1996] do not provide vertically resolved cloud structure information. Recently Kahn and Teixeira [2009] analyzed the scaling behavior of vertically resolved variance of thermodynamic variables from AIRS. However, AIRS is an IR instrument and as such produces retrievals (using some microwave information) of vertical profiles of temperature and water vapor, but not the profiles of cloud properties.

[6] Cloud-resolving models (CRMs) can be extremely useful in providing information about the small-scale dynamics because of their high resolution (on the order of 1 km in the horizontal). However, CRMs have their own shortfalls mostly because a significant portion of the cloud dynamics (e.g., turbulence) and most of the cloud physics (e.g., microphysical processes such as precipitation) are still poorly represented [e.g., Marchand et al., 2009; Redelsperger et al., 2000; Bechtold et al., 2000, and references therein].

[7] Recently, NASA's CloudSat satellite has provided vast amounts of unprecedented high-resolution information on cloud hydrometeors at climate model subgrid scales, enabling characterization of the distributional properties of cloud variables. CloudSat orbits as part of the A-Train constellation [Stephens et al., 2002]. It carries a W-band 94 GHz cloud profiling radar (CPR) that provides vertically resolved information on cloud ice and liquid water content, precipitation, cloud classification, radiative fluxes, and heating rates [Stephens et al., 2008] at a vertical resolution of 480 m oversampled to 240 m, with a footprint of 1.4 km cross track by 2.5 km along track [Mace et al., 2007]. The CPR resolves details in vertical cloud structure that IR sounders or visible imagers are unable to quantify.

[8] In this paper, vertically resolved cloud liquid water content (CLWC) profiles retrieved from CloudSat [Stephens et al., 2008; Austin et al., 2009] are used to characterize the subgrid distribution of CLWC. The CLWC profiles are organized by temperature (an oversimplified, but useful surrogate for cloud phase), cloud type, precipitation occurrence, and geolocation, and the dependence of the CLWC distributions on these parameters is investigated. The characteristics of CLWC distributions are quantified by utilizing a statistical method for the parametric estimation of the probability density function (PDF) [Kay, 1993]. Results for the best estimation of the CLWC PDF with respect to cloud phase, cloud type, precipitation occurrence, latitude, longitude, and height are presented, and physical interpretations of the results are discussed.

2. Data

[9] Ideally, for the characterization of CLWC subgrid distributions one should generate the “snapshot” PDF, where only the variability in space is taken into account in the statistics without introducing the variability in time. The snapshot PDF is an instantaneous state of CLWC distribution in space and is appropriate for the development of cloud parameterizations in climate models. However, CloudSat measurements do not provide a sufficient number of data to generate the snapshot PDF in a typical climate model grid box. The accumulation of CloudSat data over time, which involves a wide range of atmospheric conditions, is unavoidable in order to draw any inferences of the PDFs. As a consequence, the CLWC distribution represented by the accumulated CloudSat data is expected to be broader than the “snapshot” distribution.

[10] The present study uses CloudSat data measured during 4 weeks. A representative set of CLWC data is obtained from the CloudSat 2B-CWC-RO version R04 data product by collecting data retrieved from measurements on 1–7 January, 1–7 April, 1–7 July, and 1–7 October, which provide week-long samples for each season. We find that the seasonal variations are insignificant compared to other dependent variables considered here (e.g., cloud type, cloud phase, precipitation, geolocation). Therefore, the week-long data sets are combined and are used together to investigate the CLWC distributions for the results that follow. Note that the CloudSat 2B-CWC-RO version R04 data product contains profiles with unsuccessful retrievals of the liquid water content. Only successful retrievals are used in the present analysis.

[11] For the horizontal and vertical distributions, the CLWC data are partitioned by latitude (10° bins), longitude (10° bins), and height (1 or 2 km bins). The CLWC data are also organized and/or filtered in terms of retrieved data quality (associated with cloud phase and precipitation) and cloud type.
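The partitioning described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function names, the tuple layout of a sample, and the 0° to 360° longitude convention are assumptions made here for the example.

```python
import math
from collections import defaultdict

def grid_box(lat, lon, height_km, dlat=10.0, dlon=10.0, dz=1.0):
    """Return integer (lat, lon, height) bin indices for one CLWC sample.

    Assumes latitude in [-90, 90] degrees and folds longitude into
    [0, 360) degrees; both conventions are choices made for this sketch.
    """
    i = int(math.floor((lat + 90.0) / dlat))
    j = int(math.floor((lon % 360.0) / dlon))
    k = int(math.floor(height_km / dz))
    # Fold the north pole (lat == 90) into the last latitude bin.
    i = min(i, int(round(180.0 / dlat)) - 1)
    return i, j, k

def partition(samples, **kwargs):
    """Group (lat, lon, height_km, clwc) tuples by grid box."""
    boxes = defaultdict(list)
    for lat, lon, z, clwc in samples:
        boxes[grid_box(lat, lon, z, **kwargs)].append(clwc)
    return boxes
```

Passing `dz=2.0` gives the 2 km height bins mentioned in the text.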

[12] One factor that affects the interpretation of CloudSat-retrieved CLWC is the presence of supercooled liquid and mixed phase clouds. CloudSat is unable to independently determine the cloud phase in any given vertical bin. As a result, CloudSat employs a simple scheme to partition the radar measurements into ice and liquid phases. In this scheme, the portion of the profile colder than −20°C is deemed pure ice, the portion of the profile warmer than 0°C is considered pure liquid, and the portion in between is partitioned linearly into ice and liquid phases with a smooth transition from all ice at −20°C to all liquid at 0°C [CloudSat Data Processing Document, 2007a]. The simple linear interpolation does not accurately capture mixed phase cloud structure, which is complex and uncertain [e.g., Nasiri and Kahn, 2008].
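The linear phase partition can be written as a liquid fraction that depends only on temperature. The sketch below illustrates that rule alone; the function name is hypothetical, and how the operational retrieval applies the fraction to the radar measurements is not reproduced here.

```python
def liquid_fraction(temp_c, t_ice=-20.0, t_liquid=0.0):
    """Fraction of condensate assigned to the liquid phase at temp_c (deg C).

    All ice at or below t_ice, all liquid at or above t_liquid, and a
    linear ramp in between, following the scheme described in the text.
    """
    if temp_c <= t_ice:
        return 0.0
    if temp_c >= t_liquid:
        return 1.0
    return (temp_c - t_ice) / (t_liquid - t_ice)
```

For example, a bin at −10°C is assigned half liquid and half ice, regardless of the actual cloud microphysics.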

[13] Another factor that potentially impacts the fidelity of CloudSat-retrieved CLWC is the presence of large droplets [CloudSat Data Processing Document, 2007a]. The radar reflectivity is sensitive to the sixth power of the diameter of the droplet, and the retrieved CLWC in the presence of precipitation often exceeds the applicability of the retrieval algorithm. The CLWC retrieval algorithm assumes a lognormal distribution of droplet sizes, and possible departures from this assumption degrade the accuracy of the retrieval. In the presence of precipitation or drizzle, the cloud droplet distribution departs from the lognormal distribution, violating the assumption of the retrieval algorithm. Therefore, the retrieved CLWC in precipitating clouds is likely to have larger errors than in nonprecipitating clouds.

[14] The CloudSat retrieval assumes a distribution with a single cloud particle mode. When both precipitation and cloud water are present in the same volume at the same time, this assumption is not valid. Furthermore, when the observed reflectivity is larger than the range allowed by the a priori data (a frequent occurrence when large raindrops are present) or results in water contents with very large attenuation not matched by observations, the retrieval will not converge. Therefore, radar profiles with moderate to heavy rainfall are usually nonconvergent. However, drizzle and light rain can increase the reflectivity without making the retrieved CLWC values so large that they exceed the range allowed by the a priori assumptions. As a result, the CloudSat retrievals of CLWC are likely biased high by light rain and drizzle, and such cases are not completely filtered out by the precipitation flag.

[15] Considering the impacts of cloud phase and precipitation on the accuracy of retrieved CLWC, the data are organized in four ways:

[16] 1. All cloud set (AC): includes all successfully retrieved CLWC data.

[17] 2. Nonprecipitating cloud set (NP): includes only nonprecipitating CLWC data.

[18] 3. Liquid-phase cloud set (LP): includes only pure liquid phase CLWC data (T > 0°C).

[19] 4. Nonprecipitating and pure liquid phase cloud set (NP + LP): includes only nonprecipitating and pure liquid phase CLWC data.

[20] In defining the NP data set, we use the “Precipitation Flag” given in the CloudSat 2B-CLDCLASS version R04 data product. The Precipitation Flag is determined by checking the maximum reflectivity and attenuation of surface signals due to precipitation [CloudSat Data Processing Document, 2007b]. The accuracy of this precipitation detection method is limited by the effect of surface return signal and vertical resolution. NP consists of data with the precipitation flag of 00.

[21] To produce the LP data set, we use the “Temperature” given in the CloudSat ECMWF-AUX version R04 data product [CloudSat Data Processing Document, 2007c]. The CloudSat retrieval process uses this temperature to decompose the liquid and ice water contents in the mixed phase clouds. LP consists of data with temperature >0°C.
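The four data sets (AC, NP, LP, and NP + LP) amount to two independent filters applied to the successfully retrieved samples. A minimal sketch follows, assuming each sample is a dictionary with the hypothetical keys `clwc`, `precip_flag`, and `temp_c`, and that the precipitation flag value 00 is stored as the integer 0; none of these names come from the CloudSat products themselves.

```python
def select(samples, nonprecip=False, liquid_only=False):
    """Return the CLWC values that pass the requested sampling conditions.

    nonprecip=True   keeps only samples with precipitation flag 0 (NP).
    liquid_only=True keeps only samples with temperature > 0 deg C (LP).
    Both False gives the all-cloud set (AC); both True gives NP + LP.
    """
    selected = []
    for s in samples:
        if nonprecip and s["precip_flag"] != 0:
            continue
        if liquid_only and not s["temp_c"] > 0.0:
            continue
        selected.append(s["clwc"])
    return selected
```

With this layout, `select(samples)` is AC, `select(samples, nonprecip=True)` is NP, `select(samples, liquid_only=True)` is LP, and setting both options gives NP + LP.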

[22] CLWC data in this study are also partitioned by cloud type using the cloud classification data reported in the CloudSat 2B-CLDCLASS version R04 data product. CloudSat classifies clouds into altostratus (As), altocumulus (Ac), nimbostratus (Ns), stratus (St), stratocumulus (Sc), cumulus (Cu), deep convective (Cb), or high cirrus and cirrostratus (Ci) cloud by using characteristics of hydrometeor vertical and horizontal scales, radar reflectivity, precipitation, and ancillary data including temperature profiles and surface topography height [Sassen and Wang, 2008]. The present study considers six cloud types (As, Ac, Ns, Sc, Cu, and Cb) but omits all Ci occurrences, as they are likely to be a result of cloud misclassification or a by-product of the linear cloud phase assignment. Stratus clouds are also excluded because of their very low occurrence frequency (<0.1% of all cloud-classified data), which leads to insufficient sampling.

[23] Sassen and Wang [2008] demonstrated that the CloudSat classification is generally consistent with previous global cloud type distributions but with some differences that are due to limitations of the CloudSat measurements. The main limitation is that the radar is insensitive to clouds containing relatively small particles and the lowest three or four radar bins above the surface (<1 km) are contaminated by surface returns [CloudSat Data Processing Document, 2007b]. Therefore, small fair weather cumulus, altocumulus, and cold cirrus clouds are likely to be under-represented in the CloudSat data [Sassen and Wang, 2008]. Furthermore, the rule-based classification is sensitive to the selection of the thresholds, which can lead to frequent misclassifications for cases near the thresholds.

[24] The present analysis uses only converging profiles of CloudSat data, removing a considerable fraction of the cloud profiles that do not converge to a solution. For the data period we considered, 72% of all cloud profiles are retained. In particular, fewer samples converge for heavily precipitating clouds such as Cu, Ns, and Cb than for nonprecipitating or lightly precipitating clouds such as As, Ac, and Sc. As a result, the relative occurrence frequencies of cloud types in the converged profiles are different from those in the cloud-classified profiles, as shown in Table 1. Therefore, the CloudSat-retrieved CLWC is likely to underrepresent (overrepresent) the contribution of the heavily precipitating (nonprecipitating or lightly precipitating) clouds to the “true” CLWC distribution. The precipitating/nonprecipitating portion of each cloud type also differs to some degree, as listed in Table 1.

Table 1. Relative Occurrence Frequency (ROF) of Cloud Type Among Seven Cloud Types Considered and Configurations of Precipitating (P) and Nonprecipitating (NP) Cases of Each Cloud Type for CloudSat Cloud-Classified Data and for CloudSat CLWC Converged Data (a)
Cloud Type | CloudSat Cloud-Classified Data: ROF (%), P (%), NP (%) | CloudSat CLWC Converged Data: ROF (%), P (%), NP (%) | Retrieval Success Rate (%)
(a) Retrieval success rate of each cloud type, which is the ratio of the number of converged data to the number of data classified as a given cloud type, is also listed.


3. Methodology

[25] The properties of CLWC PDFs retrieved from CloudSat are quantified using a statistical parametric approach called maximum likelihood estimation (MLE) [Kay, 1993]. For an assumed PDF functional form, MLE finds the parameter set that maximizes the probability (likelihood) of generating the given data set from that PDF. MLE makes maximum use of the information in a data set, is statistically robust against noise, and provides the lowest possible variance of parameter estimates as the sample size increases.

[26] In MLE, the likelihood L of sampling a data set {xi} for an assumed PDF f(x) is given by

$$L = \prod_{i=1}^{N} f(x_i) \qquad (1)$$

This likelihood L is maximized with respect to the parameters of the PDF f(x). For example, if f(x) is a Gaussian function, the parameters are the mean and the variance of the Gaussian function. Numerically, the PDF parameters that maximize the likelihood L are found using Newton's method iteratively [Press et al., 1988].
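In practice the likelihood is usually maximized through its logarithm, which turns the product over samples into a sum. The following is a generic sketch of that step and of the Newton iteration, with the parameter vector written as θ (a symbol introduced here for illustration, not taken from the original); it is not necessarily the exact formulation used in this study.

```latex
\ln L(\theta) = \sum_{i=1}^{N} \ln f(x_i;\theta), \qquad
\left.\frac{\partial \ln L}{\partial \theta_k}\right|_{\hat{\theta}} = 0
\quad \text{for each parameter } \theta_k ,
\\[6pt]
\theta^{(n+1)} = \theta^{(n)} - H^{-1}\!\left(\theta^{(n)}\right)\,
\nabla_{\theta} \ln L\!\left(\theta^{(n)}\right),
\qquad
H_{jk} = \frac{\partial^{2} \ln L}{\partial \theta_j \, \partial \theta_k} .
```

For a Gaussian f(x) the stationarity conditions can be solved in closed form (the sample mean and the 1/N sample variance), so the iteration is needed only for forms such as the gamma distribution whose shape parameter has no closed-form estimate.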

[27] Various functional forms including Gaussian, beta, Weibull, exponential, lognormal, and gamma distribution functions are investigated in order to determine an appropriate function for CLWC. The candidate functions are selected based on (1) the scope of the shapes that the function can generate and (2) the simplicity and compatibility of the functional form to the climate model parameterizations. Since the number of parameters in the functional form directly determines the complexity of the climate model parameterization, we limit our selections to one or two-parameter functions and exclude any mixture distributions. Using a simple and single distribution is a first step toward understanding the structure of CLWC distribution.

[28] Among the selected distribution functions, it is found that the gamma and lognormal distribution functions are the most appropriate for the variety of CLWC data sets. Figure 1 illustrates the qualitative assessments of the goodness of fit with the selected functional forms that we tested for two typical cases. Figure 1a demonstrates the case for which the gamma distribution is the best function for the CLWC distribution in latitude [40°N, 50°N], longitude [130°E, 140°E], and height of 1–3 km. Similarly, Figure 1b shows the case for which the lognormal distribution is the best distribution function for the CLWC distribution in latitude [10°N, 20°N], longitude [120°E, 130°E], and height of 1–3 km.

Figure 1.

Qualitative assessment of the goodness of fit with various distribution functions in comparison with CLWC data distributions (a) in latitude [40°N, 50°N], longitude [130°E, 140°E], and height of [1 km, 3 km] and (b) in latitude [10°N, 20°N], longitude [120°E, 130°E], and height of [1 km, 3 km]. A gamma distribution is the best fit distribution in Figure 1a, while a lognormal distribution is the best in Figure 1b.

[29] We also quantify the extent to which a given functional form represents a data distribution by comparing the maximum likelihood obtained with MLE for each functional form. The theoretical upper limit of the likelihood one can achieve for a given data set is

$$L_{\max} = \prod_{i=1}^{M} \left[ \frac{h(x_i)}{N} \right]^{h(x_i)}$$

where h(xi) is the frequency (i.e., number of occurrences) of CLWC value xi, N is the number of CLWC data samples, and M is the number of different CLWC values in the distribution. The maximum likelihood of a given functional form is given by equation (1) with the best fit functional parameters.
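The upper limit can be evaluated directly from the data. The sketch below assumes that the limit is the likelihood obtained when every distinct CLWC value is assigned its observed relative frequency h(xi)/N, which is our reading of the definitions of h(xi), N, and M given above; the function name is hypothetical. The logarithm is used to avoid numerical underflow for large N.

```python
import math
from collections import Counter

def log_likelihood_upper_limit(data):
    """Log of the largest likelihood any distribution can assign to data.

    Each distinct value x_i occurring h(x_i) times out of N samples is
    given probability h(x_i) / N (assumption stated in the lead-in).
    """
    counts = Counter(data)          # h(x_i) for each of the M distinct values
    n = len(data)                   # N, the total number of samples
    return sum(h * math.log(h / n) for h in counts.values())
```

Because this value treats the data as discrete, continuous CLWC values would typically be binned or rounded to the retrieval precision before it is computed.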

[30] By comparing the maximum likelihood of each functional form with the theoretical upper limit of the likelihood, we devise a quantitative measure of goodness of fit as follows,

equation image

where f is a given functional form to fit the distribution. The quantity Q captures the relative distance of the goodness of fit from that of the theoretical upper limit. The smaller Q is, the better the function is fit to the data distribution. Table 2 shows Q values for all the tested functions with the CLWC distributions shown in Figure 1. These numbers confirm the qualitative assessments taken from Figure 1. Note that the beta distribution is also a good fit for the CLWC distribution of Figure 1a, although the gamma distribution is slightly better than the beta distribution.

Table 2. Quantitative Assessment of the Goodness of Fit With Selected Distribution Functions for CLWC Distributions in Region A of Latitude [40°N, 50°N], Longitude [130°E, 140°E], and Height [1 km, 3 km] and in Region B of Latitude [10°N, 20°N], Longitude [120°E, 130°E], and Height [1 km, 3 km] (a)
Distribution Function | Region A | Region B
(a) The quantitative value Q of the goodness of fit is defined in equation (2). A smaller Q means a better fit, and Q = 0 is the best possible fit. The Q for the uniform distribution is also listed to provide a reference point for the other distributions.


[31] The gamma distribution function is characterized by two parameters, α and β,

$$f(x; \alpha, \beta) = \frac{x^{\alpha - 1} e^{-x/\beta}}{\beta^{\alpha}\, \Gamma(\alpha)}, \qquad x > 0$$

Parameter α determines the shape of the gamma distribution function, while parameter β determines the scale of the function. Similarly, the lognormal distribution function is characterized by two parameters, μ and σ,

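$$f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left[ -\frac{(\ln x - \mu)^2}{2\sigma^2} \right], \qquad x > 0$$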

Parameters μ and σ are the mean and standard deviation of the variable's natural logarithm.

[32] The main difference between the gamma and lognormal distributions appears at small values of the variable, where the gamma distribution exhibits a polynomial dependence but the lognormal distribution exhibits an exponential dependence. Both distribution functions are fit to the distribution of CLWC, and the maximum likelihood values of the two distributions are compared to determine which distribution quantitatively fits CLWC better.
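The two fits can be sketched with the Python standard library alone. For the lognormal distribution the maximum likelihood estimates are the mean and the 1/N standard deviation of ln x. For the gamma distribution, with α as shape and β as scale, the shape parameter solves ln α − ψ(α) = ln(mean x) − mean(ln x), which is solved here by Newton's method. The helper names, the series approximations of the digamma and trigamma functions, and the starting guess are choices made for this illustration and are not taken from the original analysis.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + math.log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def trigamma(x):
    """Derivative of the digamma function for x > 0."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + 1.0 / x + 0.5 * inv2
            + (1.0 / x) * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42)))

def fit_lognormal(data):
    """Closed-form MLE (mu, sigma) for strictly positive data."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def fit_gamma(data, tol=1e-10, max_iter=50):
    """MLE (alpha, beta) for strictly positive, non-constant data."""
    n = len(data)
    mean = sum(data) / n
    s = math.log(mean) - sum(math.log(x) for x in data) / n
    # Closed-form starting guess for the shape parameter.
    alpha = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    for _ in range(max_iter):
        g = math.log(alpha) - digamma(alpha) - s
        dg = 1.0 / alpha - trigamma(alpha)
        new_alpha = alpha - g / dg          # Newton update
        if new_alpha <= 0.0:                # keep the shape positive
            new_alpha = 0.5 * alpha
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    return alpha, mean / alpha              # the scale follows from the mean
```

Both functions require strictly positive CLWC values, since they take logarithms of the data.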

[33] The characterization of a CLWC distribution requires the number of samples in the data set of interest to be large enough for a statistically meaningful analysis. When the number of samples is smaller than 1000, the distribution shape is not distinct enough to determine the best fit distribution function. Therefore, when the number of samples in a data set of interest (e.g., data in a given latitude, longitude, and height grid box) is smaller than 1000, we do not perform the MLE of the distribution and mask out the grid box in the figures.

4. Results

[34] The dependence of the CLWC distributions on cloud type is investigated with the four data sampling conditions (AC, NP, LP, and NP + LP) as described before. Figure 2 shows the CLWC distribution sets for the six cloud types (As, Ac, Ns, Sc, Cu, and Cb), along with the best fit gamma or lognormal distributions calculated using MLE. All the CLWC distributions shown in this paper correspond to data histograms normalized by the total number of data.

Figure 2.

Observed and fitted distributions of CLWC by cloud type with four data sampling schemes: (a) all retrieved CloudSat data (AC), (b) nonprecipitating cloud (NP), (c) liquid phase cloud (LP), and (d) nonprecipitating and liquid phase cloud (NP + LP). The solid lines are the distributions derived from the data, while the dashed lines are the best fit PDFs using either a gamma or a lognormal distribution.

[35] For the AC and NP data sets, the CLWC distributions follow a gamma distribution. In contrast, the LP and NP + LP data sets follow a lognormal distribution. The transition of the PDF shape from the gamma to the lognormal distribution hints at issues with the mixed phase component of the retrieval algorithm. The parameters of the best fit gamma distribution for AC and NP and those of the best fit lognormal distribution for LP and NP + LP are listed in Table 3. To give a physical sense of the scale of the parameters, the corresponding means and standard deviations of the distributions are also listed in units of mg m−3. The physical scale mean (μphysical) and standard deviation (σphysical) are related to the gamma parameters (α, β) and lognormal parameters (μ, σ) as follows,

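$$\mu_{\mathrm{physical}} = \alpha\beta, \qquad \sigma_{\mathrm{physical}} = \sqrt{\alpha}\,\beta \qquad \text{(gamma)}$$
$$\mu_{\mathrm{physical}} = e^{\mu + \sigma^{2}/2}, \qquad \sigma_{\mathrm{physical}} = \sqrt{\left(e^{\sigma^{2}} - 1\right) e^{2\mu + \sigma^{2}}} \qquad \text{(lognormal)}$$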

The differences between the AC and NP data sets are much smaller than differences between the AC and LP data sets. The similarity between AC and NP distributions is partially explained by the CloudSat retrieval success rate, which is much higher for nonprecipitating clouds such as As, Ac, and Sc than for precipitating clouds such as Ns and Cb (see Table 1). Therefore, nonprecipitating clouds are overrepresented in the AC data set, thus making the AC and NP distributions look similar. The small differences between AC and NP appear mainly in the high value range of CLWC. In particular for boundary layer clouds (Sc and Cu) the NP condition reduces the frequency of occurrence of large retrieved values of CLWC as would be expected since we are filtering out the precipitating events.
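The conversion from fitted parameters to a physical mean and standard deviation, mentioned earlier in this paragraph, can be sketched with the standard moment formulas of the two distributions, taking α as the gamma shape and β as the gamma scale. The function names are hypothetical, and the results carry the same units as the CLWC data.

```python
import math

def gamma_mean_std(alpha, beta):
    """Mean and standard deviation of a gamma(shape=alpha, scale=beta)."""
    return alpha * beta, math.sqrt(alpha) * beta

def lognormal_mean_std(mu, sigma):
    """Mean and standard deviation of a lognormal with log-mean mu, log-std sigma."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std
```

For instance, a gamma fit with α = 4 and β = 25 mg m−3 corresponds to a mean of 100 mg m−3 and a standard deviation of 50 mg m−3.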

Table 3. Parameters of the Best Fit Gamma (α, β) and Lognormal (μ, σ) Distributions for AC, NP, LP, and NP + LP Data Sets of the Cloud Types (As, Ac, Sc, Cu, Ns, Cb) Considered in This Study (a)
(a) In addition to the distribution parameters, the corresponding mean and standard deviation (Std) in units of mg m−3 are also listed for comparison.


[36] On the other hand, limiting CLWC to T > 0°C (from the AC to the LP data set) causes a dramatic change from the gamma distribution to the lognormal distribution by removing most occurrences of CLWC < 10–30 mg m−3. Figure 2 also shows that the larger differences between cloud types for small values of CLWC, apparent in the AC and NP data sets, are reduced significantly after the LP condition is applied. Since the inclusion or exclusion of the cloud water retrieved between −20°C and 0°C dramatically alters the characteristics of the CLWC distribution, more precise assessments of mixed phase cloud are necessary in the future, which will require advancements in measurement technologies, retrieval algorithms, and/or multisensor research.

[37] One counterintuitive feature shown in Figure 2 is that Cu has larger CLWC than Cb regardless of data sampling condition. This is mostly explained by the CloudSat retrieval success statistics shown in Table 1. The CloudSat retrieval process changes the relative occurrence of precipitating and nonprecipitating portions of these clouds. The CloudSat retrieved CLWC distributions of Cb underrepresent the precipitating portion (due to more retrieval failures in the precipitating portion than in the nonprecipitating portion), and this diminishes the occurrence of high CLWC in the Cb distribution. In contrast, the retrieved CLWC distributions of Cu overrepresent the precipitating portion, increasing the frequency of high CLWC values.

[38] The dependence of the CLWC distribution on height and latitude is investigated by applying the aforementioned PDF estimation method with the gamma and lognormal distributions. Figure 3 shows the difference of the logarithm of the likelihood between the best fit gamma and lognormal distributions. The difference of the logarithm of the likelihood is given by

$$\ln L_{G} - \ln L_{LN} = \sum_{i=1}^{N} \ln G(x_i) - \sum_{i=1}^{N} \ln \mathrm{LN}(x_i)$$

where G(x) is the best fit gamma distribution function and LN(x) is the best fit lognormal distribution function. The set {xi} is the CloudSat CLWC data. Positive (negative) values mean that the gamma (lognormal) distribution provides a higher likelihood value than the lognormal (gamma) distribution for a given data set.
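A minimal sketch of this comparison for one grid box is given below. It assumes the difference is the plain sum over samples of ln G(xi) − ln LN(xi), without further normalization, and it takes already fitted parameters as inputs; the function names are hypothetical.

```python
import math

def gamma_logpdf(x, alpha, beta):
    """Log of the gamma density with shape alpha and scale beta (x > 0)."""
    return ((alpha - 1.0) * math.log(x) - x / beta
            - alpha * math.log(beta) - math.lgamma(alpha))

def lognormal_logpdf(x, mu, sigma):
    """Log of the lognormal density with log-mean mu and log-std sigma (x > 0)."""
    z = (math.log(x) - mu) / sigma
    return -math.log(x) - math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z

def loglik_difference(data, alpha, beta, mu, sigma):
    """Positive when the gamma fit is more likely, negative for the lognormal."""
    return sum(gamma_logpdf(x, alpha, beta) - lognormal_logpdf(x, mu, sigma)
               for x in data)
```

Working with log densities keeps the computation stable for grid boxes that contain many thousands of samples, where the raw likelihoods would underflow.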

Figure 3.

The likelihood difference between the gamma and lognormal distributions for CLWC in each height-latitude grid box for (a) AC, (b) NP, and (c) LP, and (d) NP + LP. Positive (negative) values mean that the gamma (lognormal) distribution provides a higher likelihood value than the lognormal (gamma) distribution. The grid box in white contains either no or not enough samples to statistically analyze its CLWC distribution.

[39] For the AC data set, the middle and lower troposphere in the tropical and subtropical regions are better described by a lognormal distribution except near the surface, while the middle and high latitudes are better described by a gamma distribution. The NP data set does not change the distribution likelihood structure very much, except in the boundary layer of the tropics and subtropics. Note, however, that the CloudSat retrievals are more uncertain or nonexistent in the first km above the surface because of surface clutter.

[40] The LP data set alters the distribution likelihood structure significantly. Most of the CLWC data at higher altitudes and near polar latitudes are removed because the observations corresponding to T < 0°C are filtered out. The remaining CLWC data exhibit a lognormal distribution, except in the boundary layer below 1 km. When NP + LP is considered, the distribution of CLWC is close to a lognormal distribution almost everywhere.

[41] The low-to-middle altitude regions in the tropical and subtropical latitudes, where the cloud is mainly in a pure liquid phase and is nonprecipitating, exhibit essentially no change of the lognormal distribution with the data sampling conditions regardless of its geolocation and cloud type. Table 4 summarizes the characterizations of the PDFs of CLWC with respect to the data-filtering scheme and geolocation.

Table 4. Characterizations of PDFs of CloudSat-Retrieved CLWC With Respect to Data Sampling Scheme, Described in Section 2, and Geolocation
Data Sampling Scheme | Boundary Layer | Middle to Lower Troposphere of Tropics and Subtropics | Middle to Upper Troposphere of Middle-to-Polar Latitude Regions
NP + LP | Lognormal | Lognormal | N/A

[42] Variations of the distribution function type due to data sampling conditions in an illustrative grid box are shown in Figure 4. The distributions of two data sets, AC and NP + LP, are compared for a grid box located at 3–4 km and 40°N–50°N (the box is highlighted in Figure 3). Figure 4 clearly illustrates how removing clouds with precipitation and supercooled liquid droplets affects the shape of the CLWC distribution. In particular, the frequency of CLWC < 10–30 mg m−3 is significantly reduced, making the observed distribution follow a lognormal distribution more closely than a gamma distribution. This occurs because of the temperature sampling (T < 0°C is filtered out) and confirms the results shown in Figure 3.

Figure 4.

The PDF estimation of CLWC with both a gamma and a lognormal distribution function from 3 to 4 km and 40°N–50°N grid box (as highlighted in Figure 3). The red lines represent the distribution of AC, while the blue lines represent the distribution of NP + LP.

[43] Another interesting effect is the relative increase of the frequency of occurrence of larger values of CLWC for the NP + LP sampling. This is due to the fact that PDFs (normalized histograms) are shown and as such this apparent increase of the larger values of CLWC is simply a consequence of the significant reduction in the frequency of occurrence of lower values.
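The normalization effect can be illustrated with a small numerical example. The counts below are hypothetical values chosen only to show the arithmetic and are not CloudSat data.

```python
def normalized_histogram(counts):
    """Convert bin counts to relative frequencies that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical counts in three CLWC bins: low, middle, high.
all_clouds = [600, 300, 100]   # before filtering
filtered = [100, 280, 95]      # after filtering, mostly low values removed

pdf_all = normalized_histogram(all_clouds)
pdf_filtered = normalized_histogram(filtered)
```

In this example the number of samples in the high bin drops from 100 to 95, yet its relative frequency rises from 0.1 to 0.2 because the total count shrinks from 1000 to 475.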

[44] The detailed height dependence of the CLWC PDFs in the subtropics (20°N–30°N) for two different data sampling conditions (AC and NP + LP) is shown in Figure 5. For AC, the distribution gradually transitions from a lognormal to a gamma distribution as the height increases, which is attributed to the greater prevalence of small values of CLWC at higher altitudes due to the colder temperatures. In addition, for the AC set the mean value of the CLWC PDFs clearly decreases with increasing height, consistent with the concomitant decrease of temperature with height (and hence a decrease of saturation specific humidity and cloud water content). For NP + LP conditions, all values of CLWC above 6 km are removed and the remaining data exhibit a lognormal distribution.

Figure 5.

Transition of PDF of CLWC in a subtropical region (20°N–30°N) with respect to height for (a) AC and (b) NP + LP.

[45] In Figure 5b, the decreased occurrence frequency of large values in the lowest layer in NP + LP is due to the significant removal of events associated with precipitation. It is clear from these results that many of the boundary layer clouds identified by CloudSat are precipitating. This does not imply that boundary layer clouds such as stratus or stratocumulus precipitate more frequently and/or more intensely than deeper clouds such as shallow or deep cumulus. On the one hand, precipitation in deeper clouds causes retrieval failure and as such will not appear in the statistics being discussed. On the other hand, for boundary layer clouds, the CloudSat retrieval algorithm frequently produces a converged value for CLWC even in the presence of light precipitation (see Table 1 for statistics).

[46] The latitude-height distribution of zonal mean CLWC PDFs is illustrated in Figure 6. The parameters of the best fit gamma distribution for the AC set show a noticeable correlation with height and latitude. The parameter α (the shape parameter) is correlated with both latitude and height such that it has a higher value in regions with higher temperature. In contrast, the parameter β (the scale parameter) seems to be generally correlated with height and regime type (note the slightly larger values in the NH and SH subtropics). While the correlations between the PDFs and geolocation parameters are apparent, the correlations with height may be an artifact of the constraint given by an a priori vertical profile used in the CloudSat retrieval algorithm. The underlying physical causes of the correlations are not obvious and require additional work. It is also important to note that the gamma distribution is not a good fit for the middle to lower tropical and subtropical troposphere for the AC data set (the lognormal distribution is a better fit); thus, the representation of CLWC distributions with these parameters is not as robust as in other regimes.

Figure 6.

Parameters of best fit PDF functions for CLWC: (a) parameter α and (b) β of the gamma distribution and (c) mean and (d) standard deviation for the AC data set; (e) parameter μ and (f) σ of the lognormal distribution and (g) mean and (h) standard deviation for the NP + LP data set. For a direct comparison between the AC and NP + LP data sets, the same color scales are used for mean and standard deviation. Grid boxes in white contain either no samples or too few samples to statistically analyze the CLWC distribution.

[47] Figures 6e and 6f also display the dependence of the PDF parameters for the best fit lognormal distributions of the NP + LP set. Unlike the gamma distribution parameters of the AC set, the lognormal parameters show a less coherent correlation with geolocation but instead indicate regional variations. Figure 6e confirms the finding of Li et al. [2008] where it is shown that, for nonprecipitating situations, CloudSat produces larger mean values of liquid water in the midlatitude lower troposphere. An interesting minimum in μ and σ between 1 and 3 km (close to the top or above the boundary layer in most regions) hints at the lack of nonprecipitating clouds with large values of CLWC.

[48] Figure 6f shows larger variability of CLWC in the midtropospheric tropics and subtropics (between 3 and 6 km) presumably associated with transient cloudiness from convection and/or frontal events. In contrast, CLWC has less variability between 1 and 2 km in the tropics and subtropics presumably associated with the steady state nature of the cloudy boundary layer in these regions and the general dryness of the regions just above the boundary layer. Obvious from this discussion is the need to disentangle the temporal from the spatial variability of CLWC, which is difficult (or impossible) to do correctly (as mentioned earlier) owing to the sampling nature of the CloudSat observations.

[49] In order to facilitate the physical interpretation and direct comparison of the CLWC distributions, the mean and standard deviations corresponding to each distribution are also plotted in Figures 6c and 6d for the AC data set and in Figures 6g and 6h for the NP + LP data set. One noticeable difference between the two data sets is a considerable reduction of the mean and standard deviation in the boundary layer (height between 0 and 2 km) from AC to NP + LP. This reduction is attributed to the no-precipitation (NP) sampling condition, which removes high-value CLWC samples in precipitating clouds.
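
The conversion from distribution parameters to a mean and standard deviation can be sketched with the standard moment formulas for the two distributions. This is an illustration only: the function names are hypothetical, β is taken to be a scale parameter as stated in the text, and the lognormal parameters are assumed to refer to the natural logarithm of CLWC expressed in mg m−3.

```python
import math

def gamma_moments(alpha, beta):
    """Mean and standard deviation of a gamma distribution (shape alpha, scale beta)."""
    return alpha * beta, math.sqrt(alpha) * beta

def lognormal_moments(mu, sigma):
    """Mean and standard deviation of a lognormal distribution with log-space mu, sigma."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    std = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mean, std

# Lognormal parameters quoted in the uncertainty test of section 5 (mu = 5.24, sigma = 0.57).
mean, std = lognormal_moments(5.24, 0.57)
print(round(mean, 1), round(std, 1))
```

Under these assumptions, the quoted parameters correspond to a mean CLWC of roughly 220 mg m−3. The mean exceeds the median exp(μ) because of the long upper tail of the lognormal distribution.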

[50] Figure 7 illustrates global distributions of CLWC PDF characteristics between 1 and 3 km. The likelihood difference between gamma and lognormal distributions (Figures 7a and 7f) shows that in the lower troposphere the lognormal distribution is a better fit in most regions except high latitudes. The best fit lognormal parameters μ and σ are plotted in Figures 7b and 7c (Figures 7g and 7h) for the AC (NP + LP) data set. Figure 7b shows larger values of μ in the stratocumulus and trade-cumulus regions off the subtropical west coast of continents, implying larger values of CLWC. The fact that this behavior is not observed in Figure 7g suggests many of these clouds are precipitating.

Figure 7.

Horizontal map of CLWC distribution characteristics at heights between 1 and 3 km: (a) likelihood difference between gamma distribution and lognormal distribution, (b) best fit μ, (c) best fit σ, (d) corresponding mean μphysical, and (e) standard deviation σphysical of the lognormal distribution for the AC data set, (f) likelihood difference, (g) best fit μ, (h) best fit σ, (i) corresponding mean μphysical, and (j) standard deviation σphysical of the lognormal distribution for the NP + LP data set. Grid boxes in white contain either no samples or too few samples to statistically analyze the CLWC distribution.

[51] Figure 7c shows no latitude or regional dependence of σ in the oceanic tropics and subtropics. However, Figure 7h shows a reduction of σ in the stratocumulus regions when precipitating clouds are eliminated. Overall, the variability of the lognormal distribution parameters is reduced when precipitation and ice cloud contributions are filtered out. In addition to the lognormal distribution parameters, the corresponding mean and standard deviation of the CLWC distribution in units of mg m−3 are plotted for physical interpretation in Figures 7d and 7e for the AC set and in Figures 7i and 7j for the NP + LP set.

5. Discussion

[52] All the data used in the present work have uncertainties and biases associated with the CloudSat retrieval process and CloudSat radar measurement sensitivity. It is important to quantify the impact of the uncertainties and biases on the CLWC distribution analysis.

[53] The CloudSat retrieval process uses an optimal-estimation approach [Rodgers, 2000] in which an a priori vertical profile serves as a constraint on the retrieval together with an a priori covariance matrix representing the uncertainty of the profile. The retrieved solution is obtained by minimizing a cost function that is a weighted sum of the difference between the measurement and forward model vectors and the difference between the retrieved state vector and the a priori vector. This approach either biases the retrieved solution toward the a priori vector or leads to unsuccessful retrievals (i.e., solutions that do not converge) if the state vector to be retrieved is dissimilar from the a priori vector. The a priori vector used for a given radar measurement vector is determined by the cloud type classification, which has its own uncertainties (i.e., misclassification). Therefore, the resulting distributions of the retrieved CLWC may be narrower and more biased toward the a priori vector than the “true” CLWC distribution.
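
The pull toward the a priori can be illustrated with a scalar sketch of such a cost function. This is not the CloudSat algorithm: a single state variable, a linear forward model y = k x, and all numerical values are assumptions made for illustration, and the function name is hypothetical.

```python
def scalar_optimal_estimate(y, k, x_a, var_a, var_y):
    """Minimizer of (y - k*x)**2 / var_y + (x - x_a)**2 / var_a for scalar x."""
    gain = var_a * k / (k * k * var_a + var_y)
    return x_a + gain * (y - k * x_a)

# Hypothetical numbers: a priori value of 100 and a measurement implying 300
# (arbitrary units), with a fixed measurement error variance.
x_a, y, k = 100.0, 300.0, 1.0
for var_a in (1.0e2, 1.0e4, 1.0e6):
    x_hat = scalar_optimal_estimate(y, k, x_a, var_a, var_y=1.0e4)
    print(var_a, round(x_hat, 1))
```

With a small a priori variance the estimate stays close to the a priori value, and with a large one it approaches the value implied by the measurement. In this simplified setting, a tight a priori constraint compresses the spread of retrieved values, which is the narrowing effect described above.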

[54] The retrieved CLWC in precipitating clouds has larger uncertainties than in nonprecipitating clouds because the radar reflectivity is highly sensitive to large droplets and the droplet size distribution in precipitating clouds deviates from the assumed lognormal distribution. We addressed this uncertainty by filtering out the data with precipitation (NP data set) and found that the CLWC distribution does not change its characteristics dramatically (Figures 2a and 2b). The favored distribution (either gamma or lognormal) remained the same before and after the NP filtering. However, this similarity is partly attributed to the high retrieval failure rate of precipitating clouds, which leads to the underrepresentation of precipitating clouds in the AC set. Note that if the droplets are sufficiently small, the CloudSat warm/liquid phase cloud retrievals will not contain low CLWC values whether or not they actually exist.

[55] Another major source of the CloudSat data uncertainty is the treatment of mixed phase cloud in the CloudSat retrieval process, where the air temperature is used to partition the cloud into the liquid and ice phases. The mixed phase cloud contributes to a large frequency of the low CLWC values as seen in the data distribution of the AC data set (Figure 2a) in comparison with the LP data set (Figure 2c). When the ice portion of the cloud is removed (LP data set), all CLWC distributions are lognormal as seen in Figure 3c in contrast to the gamma distributions present in Figure 3a (AC data set). The present study shows that misidentification of cloud phase could impact the characteristic structure of CLWC distributions.

[56] The low values of CLWC, a critical part of distinguishing the PDF shape between gamma and lognormal distributions, are highly uncertain because of the relative insensitivity of the radar reflectivity measurement to small droplets in addition to the ambiguity of mixed phase clouds. In order to assess the uncertainties of low CLWC values, the PDF is reestimated in three separate tests by excluding CLWC values lower than 10, 20, and 30 mg m−3. For the CLWC distribution shown in Figure 1a, the gamma distribution is initially the best fit function and remains the best fit function when excluding CLWC <10 mg m−3. However, when data with CLWC <20 or 30 mg m−3 are removed, it becomes ambiguous whether CLWC follows a gamma or lognormal distribution (not shown). This experiment illustrates the importance of low CLWC values in characterizing the shape of the CLWC PDF.
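
A test of this kind can be sketched as follows. The sketch is not the procedure used for the paper: it assumes synthetic gamma-distributed samples in place of CloudSat data, fits both distributions by maximum likelihood using only the Python standard library, and refits the retained samples without renormalizing the PDFs for the cutoff. All function names are hypothetical.

```python
import math
import random

def fit_lognormal(x):
    """Closed-form lognormal MLE; returns (mu, sigma, log-likelihood)."""
    logs = [math.log(v) for v in x]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
    loglik = sum(-l - math.log(sigma) - 0.5 * math.log(2 * math.pi)
                 - (l - mu) ** 2 / (2 * sigma ** 2) for l in logs)
    return mu, sigma, loglik

def digamma(a, h=1e-5):
    """Central-difference derivative of log-gamma (adequate for a sketch)."""
    return (math.lgamma(a + h) - math.lgamma(a - h)) / (2 * h)

def fit_gamma(x):
    """Gamma MLE (shape alpha, scale beta); returns (alpha, beta, log-likelihood)."""
    n = len(x)
    mean = sum(x) / n
    mean_log = sum(math.log(v) for v in x) / n
    s = math.log(mean) - mean_log        # nonnegative by Jensen's inequality
    lo, hi = 1e-3, 1e3
    for _ in range(200):                 # bisection on log(a) - digamma(a) = s
        mid = math.sqrt(lo * hi)
        if math.log(mid) - digamma(mid) > s:
            lo = mid                     # left side decreases with a, so the root is larger
        else:
            hi = mid
    alpha = math.sqrt(lo * hi)
    beta = mean / alpha
    loglik = (sum((alpha - 1) * math.log(v) - v / beta for v in x)
              - n * (math.lgamma(alpha) + alpha * math.log(beta)))
    return alpha, beta, loglik

def likelihood_difference(x):
    """Gamma minus lognormal log-likelihood; positive values favor the gamma fit."""
    return fit_gamma(x)[2] - fit_lognormal(x)[2]

# Synthetic stand-in for CLWC in mg m^-3 (not CloudSat data).
random.seed(0)
clwc = [random.gammavariate(1.5, 120.0) for _ in range(5000)]

for cutoff in (0, 10, 20, 30):
    kept = [v for v in clwc if v >= cutoff]
    print(cutoff, len(kept), round(likelihood_difference(kept), 1))
```

The printed likelihood difference indicates how the preference between the two functional forms responds as progressively more of the low-value tail is discarded. A stricter treatment would fit truncated distributions, which this sketch does not attempt.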

[57] Finally, we test to what extent the reported uncertainty of the CLWC values affects the PDF estimation. The CloudSat 2B-CWC-RO version R04 data product provides a measure of the uncertainty in the retrieved CLWC, which is the diagonal element of the error covariance matrix. The uncertainty is affected by both the uncertainty in the measured radar reflectivity values and the uncertainty in a priori data for the retrieval process. We use the uncertainty as the standard deviation of Gaussian noise in the data. A retrieved CLWC value xi with uncertainty of σi can be expressed as the sum of the unknown “true” value 〈xi〉 and the Gaussian noise N(0, σi²)

$$x_i = \langle x_i \rangle + N(0, \sigma_i^2)$$

Given this distribution, the probability p(xi) of getting the retrieved value xi is now the joint probability of getting the true value 〈xi〉 from the assumed PDF function f(x) and getting xi from the Gaussian noise around the true value 〈xi〉,

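$$p(x_i) = f(\langle x_i \rangle)\,\frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left[-\frac{(x_i - \langle x_i \rangle)^2}{2\sigma_i^2}\right]$$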

Since the “true” value 〈xi〉 is unknown, it is approximated with the retrieved value xi as a first-order estimation. Then the joint probability becomes

$$p(x_i) \approx \frac{f(x_i)}{\sqrt{2\pi}\,\sigma_i}$$

Consequently, the likelihood of getting the data series of {xi} is given by

equation image

This likelihood gives a higher (lower) weight for data with a lower (higher) uncertainty. Maximizing the likelihood with respect to the function parameters gives the best fit function for a given functional form, which takes into account the uncertainty associated with the data.

[58] Figure 8 shows this likelihood maximization and demonstrates that the best fit functional shape (either gamma or lognormal) essentially remains unchanged and the best fit parameters are within 15% of those obtained with the original likelihood estimation given by equation (1). In this particular case, the best fit lognormal parameters for the two PDFs are μ = 5.24 and σ = 0.57 without uncertainty and μ = 5.22 and σ = 0.63 with uncertainty. Most of the results presented in the paper are obtained without taking into account this uncertainty. This experiment suggests that the main conclusions drawn in this paper are largely valid even though the uncertainty is not considered explicitly.

Figure 8.

(left) Uncertainty of CloudSat-retrieved cloud liquid water content for heights from 1 to 2 km and latitudes from 0°N to 10°N. (right) PDF estimation of CLWC distributions taking into account the uncertainty of the retrieved CLWC values.

6. Summary

[59] We characterized the distributions of CloudSat-retrieved cloud liquid water content (CLWC) data sampled during 2007 using maximum likelihood estimation (MLE). The best estimate PDFs of CLWC are found to closely follow either a gamma or a lognormal distribution depending on cloud phase, cloud type, the occurrence of precipitation, and geolocation. In the tropical and subtropical latitudes between 1 and 6 km, where nonprecipitating and pure liquid phase clouds are dominant, the PDFs of CLWC are best described by a lognormal distribution. In contrast, at altitudes above 6 km and regions poleward of the midlatitudes that contain a high frequency of supercooled liquid cloud droplets, a gamma distribution best explains the CLWC distributions primarily due to an increased occurrence of low values.

[60] The data sampling with respect to cloud phase and precipitation significantly affects the distribution characteristics of CLWC in some regions. After removing the contributions of supercooled water and precipitation, the CLWC distribution transitions from a gamma to a lognormal distribution in (1) high altitudes and midlatitude-to-polar regions where the contribution of supercooled or mixed phase clouds is significant and (2) in the lower troposphere where the precipitating cloud is frequent and CloudSat has significant observational limitations.

[61] Some parameters of the CLWC distributions show a noticeable relationship with latitude and height. For the all cloud (AC) data set, the shape parameter (α) of the gamma distribution is correlated with latitude and height such that it is positively correlated with air temperature, while the scale parameter (β) is correlated primarily with height. For the no precipitation, liquid phase (NP + LP) data set, the lognormal distribution parameters show a less coherent relationship with latitude and height but do show some regional structure. The mean parameter (μ) of the lognormal distribution exhibits peak values in the stratocumulus and trade wind cumulus regimes. The variability parameter (σ) is reduced in the same regimes and in the midtroposphere.

[62] In the global distribution of the lognormal parameters in the boundary layer (1–3 km) with the AC set, the stratocumulus regions off the subtropical west coast of continents show the largest mean parameters (μ). The oceanic tropics and subtropics show no clear regional or latitude dependence in the variability parameter (σ). When precipitation and supercooled droplets are removed (NP + LP), the highest mean values (μ) of CLWC are found in the midlatitudes between 1 and 3 km and the variability of CLWC (σ) is strongly diminished in the stratocumulus and trade cumulus regions.

[63] Detailed information about PDFs of vertically resolved liquid water content from satellite observations such as those obtained from CloudSat can offer useful quantitative constraints for the development of parameterizations of cloud microphysics that take into account the effects of subgrid variability. However, the results of this work also show that the uncertainties, biases, and insufficient sampling of hydrometeors by the CloudSat radar lead to inherent uncertainties in the quantification of CLWC PDFs. Another pressing issue is to disentangle the temporal from the spatial variability in addition to quantifying observational and retrieval uncertainties. Studies that use combinations of sensors, e.g., radar, lidar, passive visible, infrared, and microwave, may offer more precise estimates of PDFs of cloud liquid water content profiles.


[64] The authors would like to thank Tristan L'Ecuyer, Graeme Stephens, and the CloudSat team for helpful discussions about CloudSat data and three anonymous reviewers for constructive comments. CloudSat data were obtained through the CloudSat Data Processing Center. J.T. acknowledges the support provided by the Office of Naval Research, Marine Meteorology Program under award N0001408IP20064 and the NASA MAP Program. This research was carried out at the Jet Propulsion Laboratory (JPL), California Institute of Technology, under a contract with the National Aeronautics and Space Administration. This work was funded by the JPL Internal Research and Technology Development Program.