Present-day shortcomings in the representation of upper tropospheric ice clouds in general circulation models (GCMs) lead to errors in weather and climate forecasts as well as account for a source of uncertainty in climate change projections. An ongoing challenge in rectifying these shortcomings has been the availability of adequate, high-quality, global observations targeting ice clouds and related precipitating hydrometeors. In addition, the inadequacy of the modeled physics and the often disjointed nature between model representation and the characteristics of the retrieved/observed values have hampered GCM development and validation efforts from making effective use of the measurements that have been available. Thus, even though parameterizations in GCMs accounting for cloud ice processes have, in some cases, become more sophisticated in recent years, this development has largely occurred independently of the global-scale measurements. With the relatively recent addition of satellite-derived products from Aura/Microwave Limb Sounder (MLS) and CloudSat, there are now considerably more resources with new and unique capabilities to evaluate GCMs. In this article, we illustrate the shortcomings evident in model representations of cloud ice through a comparison of the simulations assessed in the Intergovernmental Panel on Climate Change Fourth Assessment Report, briefly discuss the range of global observational resources that are available, and describe the essential components of the model parameterizations that characterize their “cloud” ice and related fields. 
Using this information as background, we (1) discuss some of the main considerations and cautions that must be taken into account in making model-data comparisons related to cloud ice, (2) illustrate present progress and uncertainties in applying satellite cloud ice (namely from MLS and CloudSat) to model diagnosis, (3) show some indications of model improvements, and finally (4) discuss a number of remaining questions and suggestions for pathways forward.
The importance of obtaining a more comprehensive understanding and improved capability for modeling upper tropospheric ice clouds cannot be overstated, as “cloud feedbacks remain the largest source of uncertainty” in determining Earth's equilibrium climate sensitivity, specifically to a doubling of carbon dioxide [Intergovernmental Panel on Climate Change, 2007]. Some evidence for this uncertainty is given in Figure 1, which shows model-to-model comparisons of four different physical climate quantities, including cloud ice water path (IWP). While it is understood that models exhibit significant systematic spatial-temporal biases with respect to quantities such as precipitation, water vapor and clouds, their depiction of the global-averaged values is quite good. This stems from the fact that these quantities have had relatively robust long-standing observational constraints [Arkin and Ardanuy, 1989; Rossow and Schiffer, 1991; Stephens et al., 1994; Xie and Arkin, 1997] as well as indirect measurement constraints via top of the atmosphere radiation measurements [Gruber and Krueger, 1984; Kyle et al., 1993; Smith et al., 1994]. In contrast, robust global (or globally representative in situ) retrievals of cloud ice, particularly vertically resolved values, have not been available, although Lin and Rossow estimated the globally averaged ocean-only value to be 0.07 kg m−2. Despite significant efforts to derive even IWP measurements from passive and nadir-viewing techniques, the large optical thicknesses, multilayer structure and mixed-phase nature of many clouds make the estimates from these techniques very uncertain [Lin and Rossow, 1996; Stephens et al., 2002; Wu et al., 2006]. The sparse sampling of in situ observations and poor probing capabilities of nadir-viewing passive satellite IWC/IWP measuring techniques are highlighted in the schematic of Figure 2 in the context of the complexities of a precipitating and/or multilayer cloud system. 
The ramifications of this poor constraint for cloud ice, even IWP, are evident in the much larger model-to-model disagreement for globally averaged cloud ice shown in Figure 1. There is a factor of 20 difference between the largest and smallest values, and even when the two largest outliers are removed, there is still a factor of about 6 between the largest and smallest values. As expected, these differences are exacerbated when considering the spatial patterns of the time-mean values shown in Figure 3; in some regions the differences reach nearly 2 orders of magnitude. For a quantity as fundamental and relatively unambiguous as cloud ice mass, one that also has significant import within the context of climate change and its associated model projection uncertainties, it is critical that this level of model uncertainty be reduced.
Fortunately, there are new observational resources that can be expected to lead to considerable reduction in the uncertainties associated with model representations of upper tropospheric cloud ice. Specifically, these include the Microwave Limb Sounder (MLS) on the Earth Observing System (EOS) Aura satellite, and the CloudSat and CALIPSO satellite missions, all of which fly in formation in what is referred to as the A-Train [Stephens et al., 2002]. On the basis of radar and limb-sounding techniques (see Figure 2), these new satellite measurements provide a considerable leap forward in terms of the information gathered regarding upper tropospheric cloud ice water content as well as other macrophysical and microphysical properties. In this article, we briefly describe the current state of GCM representations of cloud ice and their associated uncertainties, the nature of the new observational resources for constraining cloud ice values in GCMs, the challenges in making well-posed model-data comparisons, and prospects for near-term improvements in model representations. In section 2, we describe the satellite retrievals of IWP and IWC that are discussed in the article, with an indication of the relative strengths and weaknesses of the different retrieval methodologies and sensitivities. In section 3, we briefly describe the model resources that are examined and provide a rudimentary description of the various levels of complexity regarding model treatments of cloud ice. For both IWP and IWC, and for both the retrievals and the models, it is more or less understood that “ice” represents all frozen hydrometeors, which can include cloud ice, which is typically suspended or “floating,” and precipitating forms of ice mass such as snow and graupel. However, such distinctions are often not clearly made or are fuzzy, and a principal focus of this article is to help articulate where and how such distinctions are made and matter for model-data comparisons. 
Moreover, it should be stressed that with present satellite/retrieval technology, direct retrievals that truly distinguish “floating”/“suspended” forms of ice from “falling”/“precipitating” forms are not yet available, yet models often try to make this distinction. Retrievals of this sort will require colocated vertical velocity information, such as might come from Doppler radar capability. In section 4, we present the results of the model data comparisons, with discussions regarding sampling, sensitivity, model representation, etc. In section 5, we conclude with a summary and discuss needs regarding future space-based retrievals and directions for model diagnosis and improvement.
2. Satellite Retrievals
In this section, we describe the satellite retrievals that are illustrated and discussed in this paper. To highlight a critical difference in capabilities, the retrievals are categorized as either passive nadir-viewing or radar/limb sounding. This distinction conveys a sense of their capabilities to account for vertical structure, namely in terms of being able to deal less ambiguously with multiple cloud levels and/or mixed-phase clouds. This leads to a pragmatic distinction of whether the satellite retrieval provides an estimate of (column-integrated) ice water path (IWP; g m−2) and/or has the capability to provide an estimate of (vertically resolved) ice water content (IWC; mg m−3). In each case, all-sky values are discussed and presented. Given that this study mainly focuses on the new capabilities and the associated uncertainties of the CloudSat and MLS retrievals, more details are provided regarding their methods and products (see Table 1). The passive nadir-viewing products are only referenced briefly, and therefore the discussion below provides only highlights, with many details left to the referenced literature.
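The link between the two quantities can be made concrete: IWP is the vertical integral of IWC over the column. The following is a minimal sketch, assuming a profile supplied as layer-mean IWC values (mg m−3) with matching layer thicknesses (m); the function name and the sample profile are hypothetical and are not taken from any of the retrieval products discussed here.

```python
def iwp_from_iwc(iwc_mg_m3, layer_thickness_m):
    """Integrate an IWC profile (mg m^-3) over layer depths (m); return IWP in g m^-2."""
    if len(iwc_mg_m3) != len(layer_thickness_m):
        raise ValueError("profile and thickness lists must have the same length")
    # Mass per unit volume times layer depth gives mass per unit area (mg m^-2).
    iwp_mg_m2 = sum(iwc * dz for iwc, dz in zip(iwc_mg_m3, layer_thickness_m))
    return iwp_mg_m2 / 1000.0  # convert mg to g


# Hypothetical three-layer profile, each layer 500 m deep.
example_iwp = iwp_from_iwc([10.0, 20.0, 5.0], [500.0, 500.0, 500.0])
```

A retrieval that samples only part of the column, as a limb sounder does, supplies only some of the terms in this sum and therefore cannot yield a total IWP.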
Table 1. Sampling, Sensitivity, Error, and Reference Information for CloudSat and MLS Cloud Ice Water Content Retrievals
The International Satellite Cloud Climatology Project (ISCCP) provides an estimate of ice cloud water path (IWP) values based on measurements in the visible (VIS; 0.6 μm) and “window” infrared (IR; 11 μm). Because VIS measurements are used, results are obtained only in daytime (at 3-h intervals) but are global except for the unilluminated portions of the polar regions. The intrinsic resolution of the radiance measurements is determined by the pixel (field-of-view) size, about 5 km, and the sampling interval of about 30 km; however, statistical quantities, such as the monthly average, are completely equivalent to the full 5-km sampling results. After identifying cloud pixels, the cloud visible optical thickness (τ) and cloud top temperature (TC) are retrieved from the VIS and IR employing a radiative transfer model. The cloud top temperature is corrected for the transmission of IR radiation from below on the basis of the values of τ, the surface temperature (TS) and the atmospheric temperature and humidity profile. The retrieval of τ is based on one of two microphysical models, one for liquid and one for ice clouds. The phase of the cloud is determined by the value of TC; if TC < 260 K, the whole cloud is assumed to be an ice cloud. The ISCCP estimate of IWP therefore includes any underlying liquid cloud layers or the lower liquid parts of deep clouds. Hence, the ISCCP values represent an upper limit on IWP; however, because the retrieval is very insensitive to precipitation, the product and this limit are expected to apply to cloud IWP. The microphysical model for ice clouds assumes a fractal particle shape with an aspect ratio of unity, an effective radius (rE) of 30 μm and a size distribution variance of 0.1. Thus, the value of IWP can be determined from the product of τ, rE and a coefficient that relates geometric cross section to volume for the assumed particle shape: for ice clouds in the ISCCP data set, IWP (g m−2) = 10.05 τ. 
For additional details and discussions of uncertainties, see Rossow and Garder, Rossow and Schiffer, Lin and Rossow, Jin and Rossow, and Han et al. Annual mean ISCCP IWP values are shown in Figure 4.
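The phase test and the conversion from optical thickness described above can be sketched as follows. This is only an illustration of the two rules stated in the text (the 260 K threshold and IWP = 10.05 τ), not the ISCCP processing code; the function and constant names are hypothetical.

```python
ICE_THRESHOLD_K = 260.0  # cloud-top temperature below which the whole cloud is treated as ice
IWP_PER_TAU = 10.05      # g m^-2 per unit visible optical thickness, as quoted in the text


def isccp_style_iwp(tau, cloud_top_temp_k):
    """Return IWP (g m^-2) for an ice-classified pixel, or None if classified as liquid."""
    if tau < 0:
        raise ValueError("optical thickness must be non-negative")
    if cloud_top_temp_k < ICE_THRESHOLD_K:
        # The entire column is counted as ice, so liquid layers beneath an ice
        # top are included; this is why the result is an upper limit.
        return IWP_PER_TAU * tau
    return None


cold_pixel = isccp_style_iwp(2.0, 230.0)  # ice-classified pixel
warm_pixel = isccp_style_iwp(2.0, 270.0)  # liquid-classified pixel, no IWP assigned
```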
The NOAA/NESDIS IWP algorithm uses the measurements from the Advanced Microwave Sounding Unit–B (AMSU-B) and the Microwave Humidity Sounder (MHS) instruments to simultaneously retrieve IWP and ice particle effective diameter, De [Ferraro et al., 2005; Zhao and Weng, 2002], through characterizations of the scattering properties of ice cloud. The first step of the retrieval is to derive De from a regression relation with the ratio of the scattering parameters at 89 GHz and 150 GHz. The relation was established using simulated data from a radiative transfer model. Then IWP is computed from the retrieved De and the scattering parameter at either 89 GHz or 150 GHz, depending on the size of De. The retrieval can be done in all-weather conditions, during day or night, and has relatively high temporal coverage with up to 10 measurements per day owing to the five Polar Orbiting Environmental Satellites (POES) (NOAA-15, -16, -17, and -18 and MetOp-A). The NOAA IWP annual mean is shown in Figure 4. Its low bias relative to the other products shown is possibly due to two main factors: scene screening criteria, which may bias the result, and insensitivity to small ice particles (in this case, those less than 0.4 mm in size, which is fairly large for many suspended cloud ice particles). The main screening criteria are that the scene be free of snow cover and that the brightness temperature at 89 GHz be higher than that at 150 GHz (owing to the fact that the depression of brightness temperature increases with frequency from the scattering effect when atmospheric ice is present).
To determine IWP, the Clouds and the Earth's Radiant Energy System (CERES) algorithms [Wielicki et al., 1998] first explicitly classify each 1-km Moderate Resolution Imaging Spectroradiometer (MODIS) cloudy pixel as ice or water on the basis of the cloud temperature and the goodness of the match between the observed spectral radiances at three wavelengths and model calculations of the radiances using several different ice and water particle sizes [Minnis et al., 1995, also Cloud property retrieval techniques for CERES using TRMM VIRS and Terra and Aqua MODIS data, submitted to IEEE Transactions on Geoscience and Remote Sensing, 2009]. The models use a set of hexagonal ice column distributions to represent ice cloud particles [Minnis et al., 1998]. IWP is computed as a function of the product of the retrieved effective ice crystal size and optical depth for each pixel. Optical depth is limited to a maximum of 128 in the current CERES editions. The retrieval assumes that the entire cloud column is composed of ice. Although good agreement is found between ground-based cloud radar and the CERES retrievals of IWP for relatively thin cirrus clouds (optical depths < 4) with no underlying water clouds [Mace et al., 2005], validation of IWP for thick clouds has not yet been performed, primarily owing to a lack of reference data. Few ice clouds with optical depths less than 0.3 are detected by the CERES analysis [e.g., Chiriaco et al., 2007], and while such clouds account for a significant portion of the ice-cloud cover [Jin et al., 1996; Jin and Rossow, 1997; Stubenrauch et al., 1999], they contribute very little to the global IWP. More significant is the impact of multilayered clouds on the IWP retrievals. Huang et al. and Minnis et al. showed that the assumption of the entire cloud column as ice leads to overestimates in IWP of roughly 50% in multilayered cloud systems. 
Thus, if one assumes that half of all ice clouds overlap liquid clouds, then global estimates of IWP from passive visible, infrared, and near-infrared measurements are likely to be overestimated by around 25%. The data illustrated in Figure 4 use averages of IWP derived using Terra MODIS data taken for solar zenith angles less than 82°. The means are multiplied by the average ice cloud fraction for each region to obtain all-sky IWP. In nonpolar regions, the results correspond to 1030 LT.
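The two pieces of arithmetic in this paragraph, the 25% global overestimate and the conversion of in-cloud means to all-sky values, can be written out explicitly. The sketch below assumes that overlapped and non-overlapped ice clouds carry comparable IWP on average, so that the global bias is simply the overlap fraction times the multilayer overestimate; both function names and the sample inputs are hypothetical.

```python
def expected_global_overestimate(overlap_fraction, multilayer_overestimate):
    """Fractional global IWP overestimate if only overlapped clouds are biased.

    Assumes overlapped and non-overlapped ice clouds contribute equally
    per cloud to the true IWP (a simplification).
    """
    return overlap_fraction * multilayer_overestimate


def all_sky_iwp(in_cloud_mean_iwp, ice_cloud_fraction):
    """Convert an ice-cloud-only mean IWP to an all-sky mean for a region."""
    if not 0.0 <= ice_cloud_fraction <= 1.0:
        raise ValueError("cloud fraction must lie between 0 and 1")
    return in_cloud_mean_iwp * ice_cloud_fraction


# Half of all ice clouds overlap liquid clouds, each overestimated by ~50%.
global_bias = expected_global_overestimate(0.5, 0.5)

# Hypothetical region: in-cloud mean of 120 g m^-2 with 25% ice cloud cover.
regional_all_sky = all_sky_iwp(120.0, 0.25)
```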
2.1.4. MODIS Team
The MODIS cloud optical and microphysical retrievals [Platnick et al., 2003] are part of the archived MOD06 and MYD06 products (for Terra and Aqua MODIS, respectively) and use techniques similar to the CERES algorithm, though there are some differences, including the methods and data used for cloud masking [Ackerman et al., 2008; Frey et al., 2008] (see also documentation at modis-atmos.gsfc.nasa.gov/products_C005update.html). Determination of the thermodynamic phase of the cloud water uses a combination of infrared and shortwave infrared (SWIR) spectral tests [King et al., 2004]. The ice cloud models used for the retrievals are based on in situ observations from a variety of cloud measuring campaigns and include size distributions with varying habit combinations as a function of size [Baum et al., 2005]. The MODIS retrievals have been compared with the ground-based studies of Mace et al.; successful retrievals for ice clouds with optical thicknesses less than about 0.7 were less frequent than for CERES-MODIS, although retrieval uncertainties in that range can be quite large. The ground-based and aircraft studies of Chiriaco et al. reached similar conclusions. Figure 4 shows the mean global IWP from Aqua MODIS (generated from monthly Level-3 [King et al., 2003] files and weighted by the ice cloud fraction to provide all-sky means). In the Level-3 product, monthly aggregations are derived from daily aggregations that include pixels from all orbits that contribute to the 1° grid box, i.e., an average over the day's samples and not an instantaneous one, though differences only occur poleward of about 30° owing to the MODIS swath and become more significant in polar regions. A multilayer flag is generated in the processing but was not used to exclude pixels in the IWP values presented here.
2.2. Vertically Resolving: Radar and Limb Sounding
The MLS onboard the Aura satellite, operational since August 2004, has five radiometers measuring microwave emissions from the Earth's atmosphere in a limb-scanning configuration to retrieve chemical composition, water vapor, temperature and cloud ice. The retrieved parameters consist of vertical profiles on fixed pressure surfaces having near-global (82°N–82°S) coverage. In formation with the rest of the so-called A-Train constellation of satellites, Aura has equatorial crossing times of approximately 0130 and 1330 local time. The retrievals for IWC are provided at 68, 83, 100, 121, 147, 178, 215, and 261 hPa. The MLS IWCs are derived from cloud-induced radiances (CIR) using modeled CIR-IWC relations based on the MLS 240-GHz measurements. Single IWC measurements from MLS at 147 and 215 hPa have a vertical resolution of ∼4 km and horizontal along- and cross-track resolutions of ∼200 and ∼7 km, respectively. The data presented in this article use MLS version 2.2 IWCs [Livesey et al., 2007]. In this version, the estimated precision for the IWC measurements is approximately 0.07, 0.2 and 0.6–1.3 mg m−3 at 100, 147, and 215 hPa, respectively, which accounts for combined instrument plus algorithm uncertainties associated with a single retrieval. For these three pressure levels, the useful ranges of MLS IWC retrievals are about 0.02–120, 0.1–90, and 0.6–50 mg m−3, respectively. Detailed descriptions of the algorithm, performance and validation of the MLS IWC are given by Wu et al. [2006, 2009], with additional detailed comparisons between MLS and CloudSat given by Wu et al.
Unless otherwise noted, the mean MLS values shown are computed from the total IWC amounts divided by the total number of measurements (including cloud-free conditions) and binned onto a 4° × 8° latitude-longitude grid. While MLS retrievals are based on limb sounding, and thus provide some depiction of vertical structure, they cannot provide a robust estimate of total IWP since they do not sample the entire column. Figure 5 illustrates the MLS estimate of annual mean IWC at 215 hPa and the zonal average of the vertically resolved values between the 100 hPa and 261 hPa levels. Very important to this study is the interpretation of what components of the frozen hydrometeors (e.g., snow, cloud ice) are represented in MLS IWC retrievals. Because large hydrometeors produce strong attenuation in high-IWC cases, MLS cannot penetrate the entire cloud and its sensitivity to cloud ice begins to saturate. The saturated/degraded measurements significantly underestimate the IWC in these cases, which in turn makes MLS less sensitive to clouds with large amounts of hydrometeors. A qualitative interpretation is that MLS tends to saturate for cloud systems that have significant amounts of larger frozen hydrometeors and thus tends to reflect only distributions, in magnitude, that are more characteristic of cloud ice alone.
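The all-sky gridded mean described at the start of this paragraph can be sketched as follows. This is a minimal illustration, assuming each measurement is supplied as a (latitude, longitude, IWC) tuple with IWC set to zero for cloud-free scenes; the function name, the index convention for grid boxes, and the sample values are hypothetical.

```python
from collections import defaultdict


def grid_all_sky_mean(samples, dlat=4.0, dlon=8.0):
    """Bin (lat_deg, lon_deg, iwc) samples onto a dlat x dlon grid.

    Returns {(lat_index, lon_index): mean IWC}, where the mean divides the
    summed IWC by the count of all measurements in the box, cloud-free or not.
    """
    nlat = int(round(180.0 / dlat))
    nlon = int(round(360.0 / dlon))
    totals = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, iwc in samples:
        i = min(int((lat + 90.0) // dlat), nlat - 1)  # clamp lat = 90 into the top row
        j = int((lon % 360.0) // dlon) % nlon         # wrap longitude into 0..360
        totals[(i, j)] += iwc
        counts[(i, j)] += 1
    return {box: totals[box] / counts[box] for box in totals}


# One cloudy and two cloud-free measurements share a box; a fourth lies elsewhere.
gridded = grid_all_sky_mean([
    (0.5, 2.0, 3.0),
    (1.0, 5.0, 0.0),
    (1.5, 7.0, 0.0),
    (10.0, 100.0, 4.0),
])
```

Because cloud-free measurements enter the denominator, the first box averages to one third of its single cloudy value, which is what distinguishes an all-sky mean from an in-cloud mean.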
 The Cloud Profiling Radar (CPR) on the CloudSat satellite is a 94-GHz, nadir-viewing radar measuring backscattered power from the Earth's surface and particles in the atmospheric column as a function of distance [Stephens et al., 2002]. Measurements of radar backscatter are converted to calibrated geophysical data quantities (radar reflectivity factor), which are then used in retrievals of cloud and precipitation properties such as ice water content (IWC). During each 160-ms measurement interval, the CPR data are collected into a single vertical profile of backscattered power sampled over 125 range bins measuring 240 m each, creating a total data window of 30 km. The distance from the satellite to the data window changes as a function of orbital location in order to guarantee that the window includes the Earth's surface, because surface reflectivity is a useful measurement in its own right and also serves as a constraint in some retrieval algorithms. Because the CPR does not scan, measurements consist of vertical profiles along the satellite ground track (over 37,000 per orbit), providing a vertical cross section of clouds and precipitation in the atmosphere. The CPR footprint is oblong, owing to the along-track motion during the 160-ms measurement interval, with 6-dB dimensions of approximately 1.3 km across track and 1.7 km along track (with a slight dependence on latitude). The minimum detectable reflectivity is approximately −30 dBZ (varies slightly with location, season, and background). CloudSat orbits as part of the A-Train constellation of satellites, following approximately 1 min behind Aqua, 15 s ahead of CALIPSO, and about 14 min ahead of Aura/MLS, although the forward limb sounding retrieval of MLS reduces the separation of samples to about 7 min. CloudSat has been operational since June 2006.
The current CloudSat retrieval for ice water content (IWC) (version 5.1, contained in release 4 (R04) of the CloudSat 2B-CWC-RO data product) uses an optimal estimation approach to retrieve parameters of the ice cloud particle size distribution based on measurements of radar reflectivity [Austin et al., 2009]. A priori data constructed from a database of cloud microphysical measurements constrain the solution where the measurements cannot; the a priori data values are selected as a function of temperature, which is available to the retrieval based on ECMWF model data. The retrieval assumes a lognormal size distribution of cloud particles, retrieving all three distribution parameters for each radar resolution bin and calculating IWC and other quantities from the retrieved parameters. A similar retrieval is performed for liquid water content (LWC); the composite profile contained in the 2B-CWC-RO product is obtained by using the LWC retrieval for bins warmer than 0°C, the IWC retrieval for bins colder than −20°C, and a linear combination of the two in the intermediate temperature range. The minimum detectable IWC is estimated to be approximately 5 mg m−3, depending on the distribution parameters. The annual mean IWP estimate from CloudSat is shown in Figure 4. It is evidently high relative to the other products, particularly in the tropical regions. Figure 5 illustrates CloudSat's estimate of annual mean IWC at 215 hPa and the zonal average of the vertically resolved values, which in this case include retrievals through the whole column. While the IWC retrieval algorithm does not consider larger species such as snow and graupel explicitly, the radar will certainly see these larger particles owing to the strong dependence of radar reflectivity on particle size (D^6 for Rayleigh particles, but less as particles move to the Mie scattering regime). 
Efforts are underway to determine the accuracy of the retrieved IWC values in the presence of these larger particles. Separate retrievals designed specifically for snow are also in preparation as experimental products.
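The temperature-based compositing of the liquid and ice retrievals described above can be sketched as follows. The text specifies only that a linear combination is used between 0°C and −20°C; the sketch assumes the ice weight varies linearly with temperature across that range, which may differ from the weighting actually used in 2B-CWC-RO. The function name and sample values are hypothetical.

```python
def composite_water_content(temp_c, lwc, iwc, warm_limit_c=0.0, cold_limit_c=-20.0):
    """Return (liquid_part, ice_part) of the composite for one radar bin.

    Bins at or above warm_limit_c keep only the liquid retrieval, bins at or
    below cold_limit_c keep only the ice retrieval, and bins in between blend
    the two with a weight that is linear in temperature (an assumption).
    """
    if temp_c >= warm_limit_c:
        ice_weight = 0.0
    elif temp_c <= cold_limit_c:
        ice_weight = 1.0
    else:
        ice_weight = (warm_limit_c - temp_c) / (warm_limit_c - cold_limit_c)
    return (1.0 - ice_weight) * lwc, ice_weight * iwc


# Hypothetical bin with LWC = 40 and IWC = 100 (both in mg m^-3) at three temperatures.
warm_bin = composite_water_content(5.0, 40.0, 100.0)
mixed_bin = composite_water_content(-10.0, 40.0, 100.0)
cold_bin = composite_water_content(-30.0, 40.0, 100.0)
```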
Validation activities of the CloudSat IWC algorithm are proceeding along three tracks. First, both the radar-only algorithm (used in 2B-CWC-RO) and the radar + visible optical depth algorithm (used in the coming 2B-CWC-RVOD product) were tested on synthetic reflectivity profiles generated from a number of in situ cloud microphysical profiles that were collected in several field campaigns; these tests were part of an IWC retrieval intercomparison study documented by Heymsfield et al. While the paper used the R03 version of the algorithm and mostly concentrated on the RVOD results, the same technique has been applied to the R04 algorithm (used in 2B-CWC-RO). Most of the retrieved IWC values are within ±25% of the true values, though some exceed this range, so the systematic error has been estimated as ±40%.
 Second, results from 2B-CWC-RO have been compared to global statistics from other platforms, including the MLS [Wu et al., 2009] and Odin-SMR [Eriksson et al., 2008]. Wu et al. found that CloudSat normalized PDFs of IWC have differences with MLS of less than 50% over the range where the instrument sensitivities overlap, but CloudSat IWC exceeds MLS IWC in the 14- to 17-km zone. Third, CloudSat data are being compared to aircraft in situ data collected in various field campaigns. Candidate campaigns include the CloudSat/CALIPSO Validation Experiment (CC-VEx, Warner Robins, GA), the Tropical Composition Cloud and Climate Coupling Experiment (TC4, San José, Costa Rica), the Canadian CloudSat/CALIPSO Validation Project (C3VP, Ottawa, Ontario), and Cirrus and Anvils: European Satellite and Airborne Radiation measurements project (CAESAR, Cranfield, England). Results of these comparisons should be forthcoming soon.
The A-Train also includes the Cloud-Aerosol Lidar Infrared Pathfinder Satellite Observations (CALIPSO) instrument, which is also expected to provide estimates of IWC based on lidar backscatter [Vaughan et al., 2004; Winker et al., 2004]. In this case, the horizontal and vertical resolutions will be about 60 km and 1 km, respectively, and the sensitivity range is expected to be about 0.03 to 100 mg m−3. At the time of this writing the CALIPSO IWC product is yet to be released.
2.3. Satellite Summary
One important distinction between the radar/limb-sounding and the nadir-viewing passive products discussed above is that the sampling of the former is only based on a single suborbital track profile, rather than a swath or multisatellite product. Thus while the former gain in terms of vertically resolved information, and in some cases higher horizontal resolution, their combined spatial-temporal sampling is considerably less. Another noteworthy consideration is that passive methods infer the particle size (distribution) from measurements at the top of the cloud, which for upper level ice clouds is usually an underestimate of the actual cloud-layer-mean size. Thus, IWP would be expected to be typically underestimated in these data sets. On the other hand, for AMSU, MLS and CloudSat, the sensitivity to particle size shifts to the larger particles, and thus the (albeit small) mass contribution from the smaller/smallest particles would be expected to be underestimated. Another issue to keep in mind is that none of these satellite retrievals, in contrast to the model representations discussed below, is able, or even attempts, to distinguish floating/suspended forms of ice from falling/precipitating forms; such distinction will require colocated vertical velocity information, such as might come from Doppler radar. Thus when we use such terminology it is typically in association with the model and in attempting to find an “observed”/retrieved quantity that can be used for its validation.
The brief descriptions of the satellite data given above are meant only to highlight the different techniques and their associated gross strengths and weaknesses. More detailed discussion of the techniques and shortcomings, along with pertinent validation procedures and results, is given in the references cited above [see also Wu et al., 2008, 2009]. Overall, there are three messages to be conveyed from the above discussion. The first is that, until recently, the availability of global cloud ice estimates was limited to IWP based on passive infrared or microwave techniques (e.g., NOAA, CERES, MODIS, ISCCP). These products' known limitations and uncertainties, including their limited intercomparison and validation, have hampered their use in constraining modeled cloud ice values. However, it is noteworthy that the few observed satellite estimates of IWP that have been available for a number of years tend to exhibit agreement as good as, and probably better than, that among the GCMs utilized in the most recent IPCC assessment (Figure 3). The second message is that more recent measurement strategies (e.g., limb-sounding and radar) are better equipped to probe and characterize internal cloud properties, such as vertical profiles of IWC, in addition to obtaining IWP. However, at first glance there appears to be considerable disagreement between these two new estimates of cloud IWC as well as disagreement between CloudSat IWP and those based on passive techniques. This raises the third message: considerable caution has to be applied when comparing these estimates owing to the various sensor and algorithm sensitivities. It is this latter issue that is a focal point of this article, namely in terms of understanding the nature of the ice water being measured and judiciously using these estimates for model diagnosis and validation.
3. Modeled Values
 In a manner analogous to the previous section, the discussion in this section is meant to briefly describe the considerations typically in place within a GCM that account for the simulation of frozen hydrometeors in the atmosphere, both cloud ice and precipitating frozen particles. Relevant concepts and important distinctions include convective versus nonconvective/stratiform clouds, diagnostic versus prognostic parameterizations, single versus multiple hydrometeor species, and single versus multimoment characterizations. These issues are highlighted below and then the features of the ice-cloud parameterizations for the GCMs examined in this study are briefly described.
In GCMs, the atmospheric processes associated with convective clouds and nonconvective clouds are artificially separated into cumulus convection and stratiform cloud schemes. For processes such as cumulus convection and cloud microphysics that occur at scales smaller than the GCM grid spacing (typically 50–200 km), specific cloud variables are determined as a function of variables that are defined at the grid scale, leading to a so-called “parameterization.” Most GCMs (excluding CRM-like frameworks described below) parameterize deep convection on the basis of a convective mass flux approach. In this approach, temperature and humidity profiles are adjusted to account for heat sources and moisture sinks directly induced by the convective mass flux [Arakawa and Schubert, 1974; Gregory and Rowntree, 1990; Tiedtke, 1989; Zhang and McFarlane, 1995]. It is important to note that, owing to the observed small spatial scales of cumulus convection, its influence on cloudiness and thus radiation has often been neglected, with the main objective being only its direct impact on humidity and temperature via latent heating. Owing to the large spatial scales of stratiform clouds, GCMs have generally accounted for “cloudiness,” and its effect on radiation, via this part of the model's parameterization.
Studies have shown that nonconvective stratiform clouds (e.g., widespread precipitating anvil clouds and cirrus outflow) can be produced by the detrainment of condensed water from cumulus convection. Such connections within a modeling context have been taken into account by coupling stratiform cloud and cumulus convection processes in GCMs [Tiedtke, 1993]. More specifically, a link is made by including the effects of convection on cloud generation (i.e., convective detrainment as a source of large-scale cloud) and allowing dissipation of cloud particles directly during their formulation. This technique originates from attempts by Arakawa and Schubert to allow detrainment from convective cumulus towers to serve as a source for nonconvective stratiform clouds. In general, nonconvective stratiform clouds and their condensates are formed, maintained and dissipated by many processes such as small-scale turbulence, large-scale vertical motion, convection and cloud microphysical processes. Therefore, any coupling between convective and stratiform clouds requires reliable parameterizations of microphysical processes within the model's nonconvective regions of stratiform clouds.
 When modeling ice clouds, several processes must be considered in cloud schemes: the formation (e.g., ice nucleation, water vapor deposition) and possible sedimentation of cloud condensates, the growth and interactions (e.g., deposition and riming, aggregation) and falling out of precipitation, the evaporation/sublimation of both clouds and precipitation, and possibly advection of the cloud condensates. Owing to computational considerations as well as our incomplete knowledge of cloud-ice and related fields and their associated processes, most GCMs utilize fairly simple representations of ice processes. Figure 6 is a highly simplified schematic illustrating the most rudimentary features and considerations in these representations. It mainly distinguishes the highly simplified forms in typical GCMs (e.g., Figure 3) used for global weather forecasting as well as many forms of climate simulation (Figure 6, left) versus a somewhat common next level of sophistication (Figure 6, right). In the former, there is consideration of only a single species of condensate, “floating” cloud ice. In this study, we use the term “floating” to distinguish it from precipitating ice flux, but acknowledge that a number of climate-relevant GCMs do allow their cloud ice to undergo sedimentation. Processes within the parameterization, relying on the large-scale fields, lead to the development and dissipation of the clouds. In some cases, the processes are treated rather empirically and are implicit; in others they are more explicitly represented [Jakob, 2002]. Important in this class of parameterizations is that a fraction of condensate is typically assumed to have grown to a mass/particle size large enough to be considered precipitation, and is assumed to immediately fall out, although it can moisten lower layers through evaporation in this fallout process. 
In such cases, the GCM typically carries two primary cloud variables, horizontal cloud fraction and cloud condensate mass, where the latter is considered floating cloud ice. Such a formulation is also referred to as a single-moment cloud scheme, because the number concentration of the ice particles is prescribed and only the mass is predicted. More complex formulations, which are more common in regional and cloud-resolving models (CRMs), include double-moment parameterizations that also predict number concentrations, or even more computationally expensive spectral and bin microphysics that include multiple discrete ice particle sizes, number concentrations and explicit particle-particle interactions.
 Another level of complexity beyond the simplified single-species representation in Figure 6 is allowing more ice condensate species (e.g., snow and graupel). A simplified representation of this is illustrated in Figure 6b. In this case, cloud ice is distinguished from snow and graupel by a consideration of particle size and/or the amount of overall ice mass, and graupel is distinguished from snow via the ice growth process during its formation (i.e., deposition or riming). In these cases, there is typically a prescribed particle number density for each species, or in more complex representations this can be predicted as well. For example, Lohmann and Kärcher  consider the dependence of ice number density on temperature and updraft velocity to simulate the cirrus clouds formed by homogeneous freezing. In addition, schemes of this general level of complexity typically take particle fall velocities into more careful consideration, with even the cloud ice subject to sedimentation. Novel advances in ice microphysics modeling/parameterization have been made recently by Morrison and Grabowski  that provide a more seamless representation of the ice particle distribution without the artificial boundaries associated with the above separate species; albeit this scheme still needs to be tested in a GCM.
 A final notion to highlight is that cloud parameterization schemes can be diagnostic, prognostic or a combination of the two. In a diagnostic approach, cloud variables and the overall cloud state are determined as a function of other model variables (such as model-resolved wind fields, temperature, water vapor and relative humidity, etc.). For this type of approach, there is no cloud memory in successive model time steps and the relationship between the state of the cloud field and the model state is fully determined. The simplest example of this kind, a grid-scale condensation scheme, produces clouds only when the GCM grid box mean relative humidity reaches a specified threshold (e.g., 100% [Geleyn, 1980]). In prognostic cloud schemes [e.g., Sundqvist, 1978], the time evolution of cloud variables (e.g., cloud-ice mass and cloud cover) is predicted on the basis of contributions from grid-scale advection of the cloud variable (e.g., through horizontal and vertical wind fields), source terms (e.g., detrainment of cumulus convective cloud condensates) and sink terms (e.g., autoconversion between cloud condensates and precipitation).
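 The distinction between the two approaches can be made concrete with a minimal sketch. The function names, the units and the forward-Euler time stepping below are illustrative assumptions and do not correspond to the formulation of any particular GCM discussed here.

```python
# Illustrative sketch only: function names, arguments, and the forward-Euler
# update are assumptions, not the formulation of any specific GCM.

def diagnostic_cloud_present(grid_mean_rh, rh_threshold=1.0):
    """Diagnostic scheme: cloud state depends only on the current model state."""
    return grid_mean_rh >= rh_threshold


def prognostic_ice_step(q_ice, advection, source, sink, dt):
    """Prognostic scheme: advance cloud-ice mass (kg/kg) by one time step.

    advection, source, and sink are tendencies in kg/kg/s; the result is
    clipped at zero so the update never produces negative condensate.
    """
    tendency = advection + source - sink
    return max(0.0, q_ice + dt * tendency)


# The diagnostic scheme has no memory: the answer is the same whatever the
# previous cloud state was.
print(diagnostic_cloud_present(0.95))  # False
print(diagnostic_cloud_present(1.00))  # True

# The prognostic scheme carries q_ice forward from one step to the next.
q = 1.0e-5
for _ in range(3):
    q = prognostic_ice_step(q, advection=0.0, source=2.0e-9, sink=1.0e-9, dt=1800.0)
print(q)
```

 In the diagnostic case the cloud state is recomputed from scratch at every step, whereas in the prognostic case the value of q_ice at one step depends on its value at the previous step, which is the "cloud memory" referred to above.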
 For the GCMs utilized in this study, the following two subsections give brief descriptions of their parameterizations used to model ice clouds and related processes. The first subsection contains descriptions for what are simply referred to as GCMs, which typically rely on ice-cloud parameterizations of a rather simple form (e.g., Figure 6a; albeit in some cases the cloud ice itself can sediment slowly). The second subsection contains descriptions of two GCMs where there is an attempt, through novel numerical/conceptual frameworks, to better resolve cumulus processes. In these two cases, the ice processes are based on the three-species approach discussed above (e.g., Figure 6b).
3.2.1. GCMs and Single Species: Cloud Ice
 The ECMWF Integrated Forecast System (IFS; version 30R1) cloud scheme uses prognostic equations for cloud cover and total cloud condensate content (i.e., ice and liquid together). Condensed water species are considered pure ice at temperatures colder than −23°C and liquid at temperatures warmer than freezing. Between −23°C and 0°C, the total cloud condensate is divided into ice and liquid mass by linearly scaling the fraction of the total condensate by temperature. Two kinds of ice crystals are modeled, “pure ice” (particles < 100 μm) and “snow” (particles > 100 μm). Snow falls out instantly upon formation but is subject to sublimation and melting in lower levels. Note that ice particles falling into a cloudy layer are a source for ice in that layer, whereas ice falling into clear sky is converted into snow. The scheme considers sources/sinks from convective and nonconvective processes (e.g., turbulence near cloud edges and resolved-scale ascent/descent), with deep, shallow and midlevel convective processes represented. The condensates produced in convective updrafts can be detrained from the upper cloud layers into the environment. The formation of clouds by nonconvective processes, on the other hand, is determined by the balance between the specific humidity and its saturation value, resolved vertical ascent of moist air, and/or the diabatic cooling rate (e.g., longwave radiation). The scheme considers cloud destruction through evaporation associated with large-scale and cumulus-induced descent, diabatic heating (e.g., solar radiation) and turbulent mixing between cloudy air and environmental air near the cloud edge. Processes such as autoconversion, collection and accretion are active in clouds, with evaporation of precipitation being the active process outside clouds. The falling/sedimentation rates of cloud condensate, whether in ice, mixed-phase or liquid water clouds, depend on temperature and ice particle size. 
For this comparison, the IWC values include the analyzed values from both the R30 and R31 versions of the IFS, and include periods from August 2005 to July 2006.
 The GEOS5 ice cloud scheme is prognostic for cloud condensate and cloud fraction. Two types of clouds are distinguished by their condensate source. Anvil clouds originate from detraining convection and large-scale clouds originate using a probability density function (PDF) based on condensation calculations. This scheme directly links convection to anvil cloud variables by allowing detrained mass and condensate fluxes from the convective scheme to be added to the existing condensate and fraction for the anvil cloud type. For the large-scale clouds, cloud condensation is estimated using a simple PDF of total water [Rotstayn, 1999; Smith, 1990] and used to update cloud fraction and condensate. The destruction processes include: evaporation of cloud condensate and cloud fraction, sedimentation of frozen cloud condensate and accretion of cloud condensate by falling precipitation. The evaporation of cloud condensate and cloud fraction is meant to represent destruction of cloud along edges in contact with cloud-free air following Del Genio et al. . Sedimentation speeds are calculated as in work by Lawrence and Crutzen  except that their expression for midlatitude clouds is applied to all ice clouds of the large-scale type, and their expression for tropical clouds is applied to the anvil type. For this comparison, the IWCs are based on a simulation using specified sea surface temperatures (SSTs) for the period January 1999 to December 2002.
 The NCAR Community Atmosphere Model (CAM 3.0) cloud scheme uses prognostic equations for two predicted variables: liquid and ice phase condensate [Rasch and Kristjansson, 1998; Zhang and McFarlane, 1995]. During each time step, however, these are combined into a total condensate and partitioned according to temperature (described below), but elsewhere function as independent quantities. The scheme considers condensate sources/sinks both from grid-scale (e.g., horizontal advective and vertical motions) and subgrid-scale (e.g., convective and turbulent) processes. The parameterization has two components: a macro-scale component that describes the exchange of water substance between the condensate and the vapor phase and the associated temperature change arising from that phase change [Zhang et al., 2003], and a bulk microphysical component that controls the conversion from condensate to precipitate. In its bulk microphysics step, the total condensate is decomposed into liquid and ice phases and considered all ice if T < −40°C and all liquid if T > −10°C. For −40°C < T < −10°C, the ice fraction is determined by linear interpolation between these two limits.
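 The temperature-based partitioning described for both the ECMWF IFS (−23°C to 0°C) and the NCAR model (−40°C to −10°C) can be sketched as a single linear ramp with model-specific end points. The function names below are hypothetical, and the exact functional form and treatment of the end points within each model's code may differ from this simple illustration.

```python
# Illustrative sketch: linear temperature ramp for splitting total condensate
# into ice and liquid. Function names are hypothetical; only the threshold
# temperatures are taken from the model descriptions in the text.

def ice_fraction(temp_c, t_all_ice, t_all_liquid):
    """Fraction of total condensate treated as ice at temperature temp_c (deg C)."""
    if temp_c <= t_all_ice:
        return 1.0
    if temp_c >= t_all_liquid:
        return 0.0
    return (t_all_liquid - temp_c) / (t_all_liquid - t_all_ice)


def partition(total_condensate, temp_c, t_all_ice=-40.0, t_all_liquid=-10.0):
    """Return (ice, liquid) mass; the defaults are the NCAR CAM3 thresholds."""
    f_ice = ice_fraction(temp_c, t_all_ice, t_all_liquid)
    return f_ice * total_condensate, (1.0 - f_ice) * total_condensate


# CAM3-like thresholds: halfway between -40 and -10 deg C.
print(partition(2.0, -25.0))              # (1.0, 1.0)
# IFS-like thresholds: halfway between -23 and 0 deg C.
print(ice_fraction(-11.5, -23.0, 0.0))    # 0.5
```

 Because the two models use different end points, the same total condensate at, for example, −15°C is assigned a noticeably different ice mass in each, which is one reason model-to-model IWC differences need not reflect differences in total condensate alone.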
 Within the NCAR parameterization, four types of condensate may exist and are expressed as mixing ratios of liquid and ice phases for suspended condensate with minimal fall speed associated with sedimentation, and liquid and ice phases for falling condensate (i.e., precipitation). Only the suspended condensates are carried forward in time, with precipitation falling out instantaneously. Precipitation is formed by explicitly considering individual physical quantities like droplet or ice number concentration, shape of size distribution of precipitate, etc. The precipitate may be a mixture of rain and snow, and is treated in diagnostic form. In addition, the conversion from condensate to precipitate as well as the evaporation of condensate and precipitate are parameterized. There is a direct link to the convective scheme even though the scheme itself does not include the ice phase (i.e., all detrained condensate is in liquid form). Convective detrainment can still contribute to the IWC of the large-scale clouds in the model at cold temperatures, with a portion of the detrained liquid partitioned into ice according to the temperature considerations given above. After convection processes and sedimentation have occurred, the liquid and ice mixing ratios are recalculated from the total cloud condensate. For the comparisons presented later in the paper, the IWC values have been generated by the CAM3 using specified sea surface temperatures (SSTs) for the period from 1979 to 1999. While not in the version presented here, it is worth noting that X. Liu et al.  have implemented a prognostic equation in CAM3 for ice crystal number concentration together with an ice nucleation scheme. The effective radius of ice crystals is calculated from model-predicted mass and number of ice crystals rather than as a function of temperature. A water vapor deposition scheme is added to replace the condensation and evaporation (C-E) method in the standard CAM3. 
The new scheme also removes the temperature-dependent repartitioning of total water into liquid and ice in mixed-phase clouds. In addition, ice supersaturation is allowed. The resulting IWC in the modified CAM3 shows much better agreement with the MLS values than that in the standard CAM3 [see X. Liu et al., 2007, Figure 3].
3.2.2. CRM-like GCMs With Multiple Frozen Species: Clouds, Snow, and Graupel
 This section describes two models that utilize the multispecies ice framework. These two models are of a different class of GCM, in that they try to more explicitly account for the representation of subgrid-scale processes. For this reason, they are referred to collectively in this study as Cloud-Resolving Model (CRM)-like GCMs. While this aspect deserves mention, the most relevant point of the discussion is that their ice microphysical schemes include representations of cloud ice, snow and graupel, which allow for an additional consideration in terms of the model-data comparisons. However, it is important to recognize that the CRM-like nature of these models is not a requisite to incorporating this three-species framework into a GCM.
 In what is now commonly referred to as the multiscale modeling framework (MMF; also known as “super parameterization”), the conventional cloud parameterizations are replaced with a CRM in each host GCM grid column [Grabowski, 2001; Khairoutdinov and Randall, 2001; Randall et al., 2003]. The MMF is designed such that the GCM provides large-scale forcing to a CRM within each GCM grid column. The CRM then provides subgrid fluxes, cumulus convection and clouds, etc., to the parent GCM. This allows for explicit simulation of cloud processes and their interactions with radiation and surface processes within the GCM, and a two-way interaction between the cumulus and large scale. The NASA fvMMF was developed using a finite volume GCM (fvGCM) with 2° × 2.5° resolution and a version of the two-dimensional (2D) Goddard Cumulus Ensemble (GCE) model [Tao et al., 2003] embedded in each GCM grid box. The fvMMF employs a single-moment bulk microphysical scheme with two liquid (cloud and rain) and three frozen (cloud, snow and graupel) hydrometeor classes. This six-class (water vapor plus five hydrometeors) bulk scheme includes comprehensive microphysical processes among the water vapor and hydrometeors. The densities of the solid hydrometeors are assumed to be 0.917, 0.1 and 0.4 g cm−3 for cloud ice, snow and graupel, respectively (e.g., Figure 6). The sedimentation processes of precipitating condensates as well as cloud ice crystals are also considered. For this comparison, the IWC values are based on simulations with prescribed SSTs for the months July 1998 and January 1999.
Kuang et al.  proposed a different kind of approach for improving the cumulus scale called Diabatic Acceleration and REscaling or Reduced Acceleration in the VErtical (DARE or RAVE). RAVE is a computationally efficient method for simulating the interactions of large-scale atmospheric circulations with deep convection in a 3D cloud-resolving model by reducing the scale difference between the large-scale and convective circulations. Data used in this comparison are from a near-global (70°S–70°N) prescribed SST simulation for the period 1998 using the Weather Research and Forecasting (WRF) model with the RAVE approach implemented. The horizontal grid spacing is ∼80 km, and the RAVE factor is 20. The microphysics scheme is a single-moment, six-class microphysics scheme that includes the interaction between water vapor, cloud water, cloud ice, rain, snow, and graupel [Hong and Lim, 2006]. In this case, the densities of snow and graupel are assumed to be 0.1 and 0.5 g cm−3, respectively, while density is not a quantity defined or used for cloud ice in this implementation.
4. Model-Data Comparisons
 Before model-data comparisons can be made, a number of considerations have to be made in terms of sampling the model output in a manner that leads to the most meaningful comparison to the retrievals. In this section, we highlight some of the more notable issues, including sampling the model output to account for comparable populations and the influence of the diurnal cycle, and considerations of instrument/algorithm sensitivity associated with observed IWC thresholds and ranges. We then focus our discussion on the degree that all frozen hydrometeors are represented in the retrieved values as well as model representations. Aspects of the first two issues as they pertain to model-data comparisons between MLS and ECMWF analysis have been discussed by Li et al.  and will only be touched on briefly here.
 To illustrate the importance of proper sampling, Figure 7 shows the mean and the day minus night (i.e., 1330–0130 local equatorial crossing time) difference in MLS IWC at 215 hPa. Evident is the impact of the strong diurnal cycle of deep convection over the tropical continents, accounting for fluctuations in IWC on the order of ±50% of the mean. Over the tropical oceans, convection typically peaks in the early morning, accounting for the opposite sign relative to the land. Figure 7 also shows a similar result based on CloudSat IWC values, with similar implications [cf. Liu et al., 2008]. From this standpoint, a well-posed model-data comparison should take into account the diurnal sampling consistent with the satellite sensors, and thus it is necessary to sample the model output in accordance with the satellite orbit. It should be noted here that of all the products mentioned in section 2, a number of the cloud products based on ISCCP (albeit not IWC owing to its reliance on visible channels) have the virtue of 3-h sampling, giving a much more robust depiction of the diurnal cycle. Future work might examine the virtues of combining the strengths of better time-resolved ISCCP cloud products with the more penetrating retrievals of IWC from CloudSat and MLS to construct a more comprehensive characterization of the diurnal cycle of IWC. Li et al.  illustrated the impact of sampling the 4× daily ECMWF analysis according to the MLS orbital sampling pattern. In that case, the satellite-sampled mean IWC at 215 hPa differed from the mean of the 4× daily values in some tropical regions by up to 25%. Note that such a result will be strongly dependent on the model depictions of the diurnal cycle, which have been shown to have significant shortcomings [Dai and Trenberth, 2004; Yang and Slingo, 2001].
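 A highly simplified sketch of such orbit-consistent sampling is given below. It assumes 4× daily model output at a single grid point and selects, for each of the two nominal overpass times (0130 and 1330 local time), the output time closest in local solar time. The function names and the nearest-in-time selection are illustrative assumptions; an actual implementation would follow the full orbital track in space and time.

```python
# Illustrative sketch: compare a full 4x-daily mean with a mean built only
# from the output times nearest the 0130/1330 local overpass times.
# Function names and the nearest-time rule are assumptions for illustration.

def local_solar_hour(utc_hour, lon_deg):
    """Approximate local solar time (hours) from UTC hour and longitude (deg E)."""
    return (utc_hour + lon_deg / 15.0) % 24.0


def circular_diff(hour_a, hour_b):
    """Shortest separation between two clock times on a 24-h circle."""
    d = abs(hour_a - hour_b) % 24.0
    return min(d, 24.0 - d)


def orbit_sampled_mean(values_by_utc, lon_deg, overpass_hours=(1.5, 13.5)):
    """Mean of the model values nearest in local time to each overpass."""
    picks = []
    for target in overpass_hours:
        nearest_utc = min(
            values_by_utc,
            key=lambda h: circular_diff(local_solar_hour(h, lon_deg), target),
        )
        picks.append(values_by_utc[nearest_utc])
    return sum(picks) / len(picks)


# Hypothetical IWC values (mg/m3) at 00, 06, 12, 18 UTC for one grid point.
iwc = {0: 1.0, 6: 2.0, 12: 5.0, 18: 2.0}
full_mean = sum(iwc.values()) / len(iwc)
print(full_mean)                      # 2.5
print(orbit_sampled_mean(iwc, 0.0))   # 3.0
print(orbit_sampled_mean(iwc, 90.0))  # 2.0
```

 Even in this toy case the orbit-sampled mean differs from the full mean by 20% in either direction depending on longitude, illustrating how the phase of the diurnal cycle relative to the overpass times controls the sign of the sampling bias.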
 A second, and significantly more complex issue that needs to be considered is the sensor and algorithm sensitivities in conjunction with the model representations. Figure 8a shows histograms for MLS IWC values at 147 hPa (green solid line). Evident is the lower limit of MLS sensitivity at about 0.5 mg/m3 and its upper limit of about 50 mg/m3 that was mentioned in section 2. Note that histograms for MLS data for a complete year (green solid line) along with a single month (green dashed line) of data are shown to illustrate that a month of (A-Train) sampling provides a representative sample when a large enough region is considered (in this case global, with the nonzero values effectively just coming from the tropics). Also shown is an analogous histogram from CloudSat (black solid line) that shows the larger lower, and considerably larger upper, sensitivity limits relative to MLS. Note that both the MLS and CloudSat histograms are based on the inherent sensor footprint resolutions, with CloudSat's being a considerably smaller volume (see section 2).
 Values of IWC from two different recent versions of the ECMWF analysis are also shown (Figure 8, purple solid and dashed lines); the differences between the two versions will be discussed in more detail at the end of this section. The model(s) can have very small to near-zero values of IWC and in order to make a fair model-data comparison it may be necessary to set values that are less than the lower sensitivity limit to zero. Consideration may also be made in a similar way for the upper bound, meaning one might take model values larger than the upper sensitivity limit of the retrievals and reduce them to this (saturation) value. An example of accounting for such sensitivities in a model-data comparison is described by Li et al. . In that case, the ECMWF instantaneous (i.e., 4× daily analyses) IWC values less than the MLS lower limit of sensitivity were set to zero before computing the time-mean values for comparison to MLS. The impact of the sensitivity sampling was less than 10%, but again, the impact in any given model-data combination will depend on both the model representation of the field (e.g., PDF) and the sensitivity limits of the given retrieval.
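 The thresholding procedure described above can be sketched as follows. The function name is hypothetical, and the limits used in the example (0.5 and 50 mg/m3) are the approximate MLS values at 147 hPa quoted earlier; the appropriate limits depend on the sensor, the algorithm and the pressure level.

```python
# Illustrative sketch: apply a retrieval's lower and upper sensitivity limits
# to instantaneous model IWC values before time averaging.
# The function name is hypothetical; the limits in the example are the
# approximate MLS values at 147 hPa quoted in the text.

def apply_sensitivity_limits(iwc_values, lower, upper=None):
    """Zero values below `lower`; cap values above `upper` at `upper`."""
    limited = []
    for value in iwc_values:
        if value < lower:
            limited.append(0.0)       # below detection: treated as clear
        elif upper is not None and value > upper:
            limited.append(upper)     # above saturation: reduced to the limit
        else:
            limited.append(value)
    return limited


model_iwc = [0.1, 0.5, 10.0, 80.0]    # hypothetical values, mg/m3
limited = apply_sensitivity_limits(model_iwc, lower=0.5, upper=50.0)
print(limited)                         # [0.0, 0.5, 10.0, 50.0]
print(sum(model_iwc) / len(model_iwc), sum(limited) / len(limited))
```

 The limits are applied to the instantaneous values and the mean is computed afterward, consistent with the order of operations described above; applying them to an already time-averaged field would generally give a different result.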
 If consideration were made to perform the above procedure on the ECMWF data on the basis of the histogram of the raw CloudSat retrievals, the differences would be greater than for the case of the MLS-applied sensitivity limits. This is because CloudSat's lower sensitivity limit is larger and thus a greater number of higher IWC values within the ECMWF data would be set to zero, leading to a greater impact on the mean ECMWF values. However, it is important to point out that for the MLS and ECMWF case described above, the model and satellite values have approximately the same spatial resolution (∼100 km) and thus they average over similar subgrid-scale variability. On the other hand, CloudSat's spatial resolution is considerably finer; in fact one could say it is sampling the subgrid scale of MLS and ECMWF. To make a fair comparison, the CloudSat values need to be averaged to a comparable spatial resolution (Figure 8, black dashed line). Because this process averages “clear” (relative to CloudSat's lower sensitivity limit) and cloudy values, the lower sensitivity limit is no longer apparent. In fact, in the low IWC regime, the ECMWF and 1° × 1° CloudSat values show better agreement. (Averaging CloudSat to 2° × 2° makes only minor changes to the histogram relative to 1° × 1° averaging. Note that the CloudSat and ECMWF data have been averaged over the same vertical extent in this analysis.) While a direct comparison may be more appropriate between these two curves (Figure 8, black dashed and purple lines), the two populations are still biased because nothing has yet been done to account for the lower sensitivity limit of CloudSat in this case.
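 The effect of averaging fine-footprint retrievals to a coarser scale can be illustrated with a one-dimensional along-track sketch. The function name, the block size and the sample values are hypothetical, and an actual 1° × 1° average would also involve the vertical averaging noted above.

```python
# Illustrative sketch: average along-track, fine-footprint IWC samples into
# coarser blocks, including the "clear" (zero) samples. The function name,
# block size, and values are hypothetical.

def block_average(samples, block_size):
    """Mean of each consecutive, complete block of `block_size` samples."""
    means = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        means.append(sum(block) / block_size)
    return means


# Suppose the fine-scale lower sensitivity limit is 4.0 (arbitrary units):
# every individual sample is either 0 (clear) or >= 4.0.
fine = [0.0, 0.0, 0.0, 8.0,
        0.0, 0.0, 0.0, 0.0,
        4.0, 4.0, 4.0, 4.0]
coarse = block_average(fine, block_size=4)
print(coarse)   # [2.0, 0.0, 4.0]
```

 The first block yields a coarse-scale value of 2.0, which lies below the fine-scale limit of 4.0 even though no individual retrieval does. This is why the lower sensitivity limit is no longer apparent in the histogram of the averaged values, and also why the averaged population remains biased: fine-scale IWC below the limit was never detected and so never entered the average.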
 Although this alone will not account for the intrinsic sampling mismatch just discussed, the most ideal manner of comparison is to construct from the model fields the radar reflectivity that would be observed by CloudSat and then perform the IWC retrieval on the constructed reflectivity (see ISCCP simulator [Klein and Jakob, 1999; Webb et al., 2001]). This approach has been used to assess CloudSat IWC retrievals from CRM output with multiple ice species [Woods et al., 2008] using a 94-GHz radar simulation package called QuickBeam [Haynes et al., 2007]. To account for the possible spatial sampling mismatch, it would be best to average the observed reflectivity values over a grid box comparable to the model resolution, then perform the cloud retrieval on them. Then from the PDF of this population, apply any lower and upper sensitivity limits of the sensor/algorithm to the model-derived values computed from the above approach.
 While the above approach has its strengths, it is not conducive to assessing most climate and numerical weather prediction GCMs that only consider cloud ice and not other/larger frozen hydrometeors (e.g., snow and graupel). In such cases, the resultant reflectivity, and thus IWC retrieval, will be intrinsically unrealistic, or else can only be compared to observed cases where the larger hydrometeors are not expected or observed in the column. In a few cases, such as the NASA fvMMF and RAVE GCMs and regional cloud-resolving models (CRMs), the additional constituents are modeled. This raises questions about what components of the frozen hydrometeors are represented by the retrievals, and then in turn, how they can be judiciously used to compare to the models. For example, in section 2, it is mentioned that CloudSat is expected to be sensitive to these larger hydrometeors and thus represent more than just (floating) cloud ice water content. Meanwhile, the characteristics of MLS sampling (see section 2) make it appear to be more representative of just the cloud ice. It is imperative to consider these issues in order to utilize the data for model comparison and validation.
Figure 9 shows zonal and annual mean values of IWC from three GCMs that only carry/simulate cloud ice, NCAR-CAM3, GEOS5, and ECMWF, in addition to the cloud-ice-only components of the two CRM-like GCMs, RAVE and fvMMF. There are two main areas of disagreement among these models. First, there is a discrepancy in the overall magnitude of about a factor of two to three. Second, their spatial distributions with respect to height are considerably different. Apart from their spatial distribution, it should be pointed out from Figure 8 that the histograms of the three cloud ice fields from GEOS5, ECMWF, and fvMMF have considerably different structures, particularly on the high end, although this part of the distribution is particularly sensitive to the grid resolution, which is not identical in these models. These characteristics raise the question: do CloudSat and/or MLS data provide the means to discriminate which of these distributions is more realistic? Comparing to Figure 5, it is evident that the CloudSat zonal mean values are quite different, for example, much larger in magnitude, than any of these model distributions. However, as mentioned above, CloudSat is expected to be sensitive to larger frozen hydrometeors that are not part of the representation in many model distributions, for example those shown in Figure 9. On the other hand, the magnitudes of the IWC in the MLS zonal average profile shown in Figure 5, which are thought to be more representative of cloud ice for the reasons mentioned in section 2, are much closer to the modeled values.
 To shed additional light on these model-data comparisons, Figures 10 and 11 show multicomponent IWP distributions of the frozen hydrometeors simulated by two CRM-like GCMs, RAVE and fvMMF, respectively. For each model, the annual mean values of graupel, cloud ice, snow and total IWP are shown as horizontal maps. Also shown are the zonal mean values and the percent contribution of each constituent to the total. As there are yet no global observations that claim to readily distinguish these various components, the model distributions are being used here as somewhat of a guide, since their model microphysics were developed in consideration of field experiments/data, albeit temporally and spatially sparse. In general agreement between these two models are the following features: the total, graupel and snow IWP distributions have overall magnitudes that agree relatively well between the two models. There is a considerable difference in the magnitudes of the cloud IWP, as was also evident in Figure 9. Beyond this there are considerable differences in the regional-scale features of the distributions. Most important for this discussion are the relative contributions of the various frozen hydrometeor components to the total IWP. In the RAVE GCM, each component represents about 30% of the total frozen mass in the tropics, while in the fvMMF, the graupel, snow and cloud are about 50%, 30% and 10%, respectively. Thus in the case of these two models, the overall message is that each of the three frozen components contributes a sizable fraction to the overall total. If this is the case in nature, this must be considered with regard to applying the satellite retrievals, interpreting the model-data comparisons and in particular designing new observing systems. 
An interesting feature is that the magnitude and spatial distribution of the total IWP somewhat resembles the CloudSat IWP (and to some degree the MODIS and CERES) values that are shown in Figure 4, with the reminder that CloudSat is sensitive and may be accounting for most of the larger frozen hydrometeors and larger IWC values.
Figure 12 shows the zonal and annual mean vertical profiles of IWC from the two CRM-like GCMs. Except for the distributions of cloud ice, the distributions for the other frozen components from the two models agree relatively well, particularly given the complete lack of global observations that would adequately guide and constrain GCM development in this area. Notable is the relatively good agreement in the total IWC values in Figure 12, namely in terms of general morphology and magnitude, with those of CloudSat shown in Figure 5. Moreover, Figure 8 shows that there tends to be slightly better agreement between the CloudSat (1° × 1°) histogram (black dashed line) and the fvMMF total IWC (solid thick blue line) than with that of any other constituent. However, one item worth pointing out is the difference in vertical distribution, particularly with respect to the tropics. For CloudSat, the greatest concentration of IWC is between 250 and 400 hPa, while for the models the peak values are found around 500–600 hPa. Presuming such distributions relate in some way to the latent and radiative heating profiles, this disagreement is somewhat troubling and might indicate shortcomings in the underlying microphysical schemes in these models as they relate to convection and the large-scale circulation. Consideration should also be given to the possibility that the height of the peak CloudSat values might be artificially influenced by the algorithm's method of (linearly) combining the liquid and ice water retrieval solutions via temperature.
 The results described above indicate that CloudSat IWC values may be a useful estimate of total IWP, at least for the purposes of representing a very preliminary and somewhat qualitative form of validation for models that carry a more comprehensive system of frozen hydrometeors. However, this still does not provide a constraint on the cloud ice component that is typically the only component represented in many GCMs (i.e., Figure 6) and to the extent MLS might provide such a constraint, the latter is very limited in vertical extent. To help address this problem, it is possible to make judicious subsets of the CloudSat data based on additional flags and information in the retrieval products [Stephens et al., 2008]. The additional information used here includes the cloud classification (e.g., cirrus, deep convective, altostratus) which is given for each (cloudy) retrieval and the flag that indicates whether surface precipitation is detected which is given for each profile.
 The specific subsampling/filtering employed here is to recalculate the average (e.g., zonal and annual mean) IWC in three different ways: (1) excluding all the retrievals in any profile that is flagged as precipitating at the surface (NP), (2) excluding the retrievals that are flagged as “convective” (NC), where “convective” includes the “deep convection” and “cumulus” cloud classifications, and (3) excluding retrievals that meet either of these conditions (NP and NC). For example, the NP case includes all IWC = 0 (i.e., clear) and the IWC > 0 retrievals that did not have precipitation detected at the surface. Figure 13 shows the annual and zonal mean of the CloudSat IWC NP, NC and NP and NC values. Figure 14 shows the total number of CloudSat samples (Figure 14d) and the percentage of total samples removed in the three different subsampling procedures (Figures 14a–14c). In regions of appreciable IWC (see Figure 5), the samples removed account for about 5–30% of the total samples. Figure 8c shows the part of the CloudSat IWC distribution that is removed by the above filtering. The main impact is to remove the larger IWC values (those typically above the sensitivity of MLS), with the number of values being reduced by a factor of about 2, and up to a factor of about 5 for the largest values. Evident from Figure 8c is that the CloudSat NP and NC case and MLS have an overlapping range of sensitivity of roughly 2–50 mg m−3, which would include all but very thin cirrus and of course a reduced sampling of the very high IWC values. Keep in mind that the amount of small-value IWC mass (e.g., thin cirrus) that CloudSat misses, which is not affected by the filtering in any case, is estimated to be less than 10% (see Figure 5 and discussion by Wu et al. ).
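 The three subsampling cases can be sketched as simple masks on a list of retrievals. The record layout and field names below (`iwc`, `cloud_class`, `surface_precip`) are hypothetical and are not the variable names of the CloudSat products; the sketch also assumes that the profile-level surface precipitation flag has already been copied onto each retrieval in that profile.

```python
# Illustrative sketch of the NP, NC, and "NP and NC" subsampling.
# Field names are hypothetical, not actual CloudSat product variables; the
# profile-level precipitation flag is assumed copied onto each retrieval.

CONVECTIVE_CLASSES = {"deep convection", "cumulus"}


def filtered_mean(retrievals, exclude_precip=False, exclude_convective=False):
    """Mean IWC over the retrievals that survive the requested exclusions."""
    kept = [
        r["iwc"]
        for r in retrievals
        if not (exclude_precip and r["surface_precip"])
        and not (exclude_convective and r["cloud_class"] in CONVECTIVE_CLASSES)
    ]
    return sum(kept) / len(kept) if kept else float("nan")


# Hypothetical retrievals (IWC in mg/m3); clear samples have IWC = 0.
retrievals = [
    {"iwc": 0.0, "cloud_class": None, "surface_precip": False},
    {"iwc": 10.0, "cloud_class": "cirrus", "surface_precip": False},
    {"iwc": 200.0, "cloud_class": "deep convection", "surface_precip": True},
    {"iwc": 60.0, "cloud_class": "altostratus", "surface_precip": True},
]

print(filtered_mean(retrievals))                                  # 67.5
print(filtered_mean(retrievals, exclude_precip=True))             # 5.0 (NP)
print(filtered_mean(retrievals, exclude_convective=True))         # NC
print(filtered_mean(retrievals, exclude_precip=True,
                    exclude_convective=True))                     # 5.0 (NP and NC)
```

 Note that clear (IWC = 0) retrievals are retained in every case, so the filtered means remain averages over all non-excluded samples rather than in-cloud averages.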
 For interpretation of Figure 13, it is useful to start with the most stringently filtered case, i.e., NP and NC. In this case, the IWC is considerably lower than the total shown in Figure 5. In some regions, the reduction in IWC is well over 50% through the exclusion of the cases that exhibit precipitation at the surface and those denoted as convective. Both of these excluded cases would be expected to contain significant amounts of large frozen hydrometeors (e.g., snow, graupel). In support of this, Figure 8c shows that most of the removed IWC values are those with the largest values. For the NC case, there is a significant increase in IWC in the midlatitudes over the NP and NC case. Because the midlatitude synoptic regime more readily allows for precipitation without convection, the IWC retained is considerably greater, in fact very near the original total in Figure 5. However, including the precipitating cases does not have a significant impact in the Tropics because most precipitation is associated with convection. This is why all three cases tend to be the same for the tropics; that is, they retain only about 30% of total cloud ice observed by CloudSat (Figure 5). Interestingly, this fraction of retained ice, inferred here to be more representative of the (suspended) cloud ice, is within a factor of 2 or so of the same fraction of cloud to total ice as in the two CRM-like GCMs.
 The main point of the discussion above is that with the additional constraints applied to CloudSat IWC, for example, NP and NC, the values are more likely to reflect only ice content within clouds with significantly less contribution from graupel and snow [cf. Stephens et al., 2008]. Additionally, these constrained values have a strong resemblance to the tropical values estimated by MLS. This can be seen by comparing the MLS IWC values shown in the inset panel of Figure 5 to Figures 13d, 13e, and 13f, which are the same as Figures 13a, 13b, and 13c but plotted in a manner that can be more readily compared to the MLS values. The agreement between these two observational resources, along with an understanding of their sampling characteristics/constraints, indicates that these might serve as a preliminary guide for evaluating GCM-simulated cloud ice which is typically the floating/suspended constituent (albeit in some cases with small sedimentation velocities but yet typically not considered precipitation).
 One such comparison is derived by comparing the IWP estimate from CloudSat, with the NP and NC constraints applied, shown in Figure 15 to the GCM-simulated values of (cloud-only) IWP in Figure 3. This tentative comparison indicates that about a third of the models that contributed to the IPCC Fourth Assessment do a fair job at representing the pattern and magnitude of IWP (i.e., BCCR, CNRM, GFDL2.0, MIROCHR, MIROCMR, MPI), while a few significantly underestimate the IWP (i.e., INMCM, CCCMAT63, NCAR, IAP). There are a few that greatly overestimate the IWP, such as the two GISS versions, IPSL and to some extent CSIRO. The two UKMO versions appear to greatly underestimate (overestimate) IWP in the Tropics (extratropics). While these comparisons are illustrative of the possible shortcomings related to clouds and their feedbacks, it is useful to consider these in conjunction with similar comparisons to the cloud liquid water path [Li et al., 2008].
 A second comparison is given in Figure 16 that shows the IWC field at 215 hPa from MLS and from CloudSat with the NP and NC constraints applied, along with the GEOS5 and NCAR CAM3 GCMs, and the ECMWF R30 analysis. Within the context of this comparison, the two GCMs perform relatively well, considering the wide disparity displayed in the first MLS-GCM comparisons [Li et al., 2005]. It is worth noting that while the ECMWF analysis R30 values are considerably less than the two satellite-derived values, efforts were undertaken to increase the cloud ice as well as upper tropospheric water vapor through improved microphysics based on the arrival of MLS data and associated comparisons [Li et al., 2007]. The improvement, relative to the MLS estimates, is illustrated in Figure 17 and Figure 8. The changes in the cloud ice microphysics for this model, which is in the category represented in the left side of Figure 6, involved allowing ice-phase supersaturation and a new scheme for ice crystal sedimentation and snow autoconversion rate [Tompkins et al., 2007]. While the former change would typically reduce the amount of IWC, the slowing of the rates in the latter revision more than compensated, accounting for the overall increase in IWC and the better agreement with the satellite-derived values.
 As a means of reemphasizing the challenge at hand with respect to modeling and measuring cloud ice and making comparisons between the two, putting our preliminary results in this context, and indicating areas of needed work, Figure 18 shows a bar chart illustrating the global, extratropical and tropical mean IWP values for the GCMs shown in Figure 3, a number of the satellite retrievals including our subsampled version of CloudSat, and ECMWF. As with Figure 1, wide model disagreement is quite evident, and as with Figure 3, the disagreement is exacerbated when considering regional values, in this case just distinguishing between tropical and extratropical averages. In the case of the latter, this plot demonstrates that except for a few of the GCMs, most models have extratropical IWP values that are considerably larger than the tropical values, often by factors of 2 or more. When considering the “observed” values, most retrievals show the extratropical and tropical values to be similar, except for the ISCCP values which exhibit larger extratropical values as well. In general, the CloudSat total values are not too different from the MODIS values, but a factor of two, or more, larger than the ISCCP, NOAA/NESDIS and ECMWF values, the last being one of the few cases that have an extratropical average considerably less than the tropical value. As demonstrated above, the NP and NC subsampled version of CloudSat, suggested to be a slightly better representation of just the cloud ice component of the IWP, is about one third or so of the CloudSat total IWP. To the extent the subsampled version is a better rough IWP estimate, many models are within about a factor of 2 of this value, although they themselves differ by well over a factor of 4, and actually a factor of 20 when considering the outliers, and their ratio of extratropical to tropical averages ranges from about 1 to 5.
These issues point to the need for continued work in refining both the retrievals and the model [Morrison and Grabowski, 2008] representations, and in doing so, striving to make them more consistent and readily comparable. These issues are discussed in section 5.
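The regional averages and the extratropical-to-tropical ratio discussed above can be reproduced from a zonal-mean IWP field with a short sketch. It assumes a cosine-of-latitude area weighting and a tropical band of 30°S–30°N; the band limit, the function name, and the inputs are illustrative assumptions, since the exact regional definitions used for Figure 18 are not restated here.

```python
import numpy as np


def regional_mean_iwp(iwp_zonal, lat_deg, tropics_bound=30.0):
    """Area-weighted global, tropical, and extratropical mean IWP.

    iwp_zonal     : (n_lat,) zonal-mean IWP (any consistent unit)
    lat_deg       : (n_lat,) latitudes in degrees
    tropics_bound : assumed tropical band limit in degrees (hypothetical)
    """
    iwp = np.asarray(iwp_zonal, dtype=float)
    lat = np.asarray(lat_deg, dtype=float)
    weight = np.cos(np.deg2rad(lat))  # proportional to area of each band
    tropics = np.abs(lat) <= tropics_bound

    def weighted_mean(select):
        return np.sum(iwp[select] * weight[select]) / np.sum(weight[select])

    out = {
        "global": weighted_mean(np.ones(lat.shape, dtype=bool)),
        "tropical": weighted_mean(tropics),
        "extratropical": weighted_mean(~tropics),
    }
    out["extratropical_to_tropical"] = out["extratropical"] / out["tropical"]
    return out
```

Applying the same function to each model and each retrieval keeps the regional definitions identical across the comparison, so differences in the ratio reflect the fields themselves rather than the averaging.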
5. Summary and Discussion
 The accurate simulation of tropospheric ice clouds in GCMs continues to represent a significant challenge to the model development community. Shortcomings in the representation of these clouds impact both the latent and radiative heating processes, and in turn the circulation and the energy and water cycles, leading to errors in weather and climate forecasts and to uncertainties in quantifying cloud feedbacks associated with global change (e.g., Figures 1 and 3). Much of the challenge has been associated with a lack of high-quality, global observations of ice clouds and related quantities. While retrievals from passive nadir-viewing sensors have been available for some time (see section 2), their cloud ice retrievals provide little vertical structure information (albeit three levels from ISCCP), have thus been generally limited to estimates of IWP, are severely hampered in multilevel cloud systems, and have undergone little systematic comparison and validation (e.g., Figures 2 and 4). Despite these shortcomings, model development has progressed over the last decade in terms of more models including prognostic cloud schemes and introducing more sophisticated microphysics representations (e.g., section 3 and Figure 6). With the arrival of the EOS era of satellite measurements, considerable new resources have become available to help address this problem. These include the moderate spectral and high spatial resolution of MODIS (see section 2) and the stereoscopic capabilities of MISR [Diner et al., 1989]. Most relevant, however, to the challenges associated with cloud ice have been the products introduced by MLS and CloudSat, and soon CALIPSO (see section 2). These latter products include vertically resolved estimates of IWC (e.g., Figure 5) and have allowed, for the first time, global-scale comparisons of observed ice mass at a given level to GCM representations [Li et al., 2005].
 The arrival of these new cloud ice products heralds a new era for model diagnosis, development and validation with respect to cloud mass, structure and microphysical characterization. However, exploiting the measurements/retrievals comes with considerable challenges, since both the observing systems and the present model parameterizations and frameworks related to cloud ice have largely developed independently of each other. Thus, it is not a simple matter of comparing the model output with the retrieved quantities, as might be the case for more straightforward quantities like sea surface temperature, top of the atmosphere radiation, or total column water vapor. The algorithm teams are still in the process of characterizing and validating their observed estimates [Stephens et al., 2008; Wu et al., 2009] and GCMs exhibit considerable variation in their representations of ice clouds [e.g., Jakob, 2002, section 3], meaning even comparisons among the retrievals themselves [e.g., Wu et al., 2009, Figure 4] or the models themselves are fraught with challenges. In the case of the former, the retrievals/sensors often have very different detection and particle size sensitivities, as well as spatial/temporal sampling characteristics. The background, results and discussion in this paper are meant to bring the model and satellite communities closer in order to make more rapid progress on this problem. Basic yet fundamental information on both the satellite and modeling sides of the problem is presented so that greater common ground can be found for coordinating research and development in this area.
 The most fundamental question addressed in this paper is whether the MLS and/or CloudSat IWC can be used to evaluate IWC values from GCMs. With this comes a discussion of the considerations that must be made to make meaningful comparisons between the models and the retrievals. Inherent to this challenge is that the sensitivity and sampling characteristics of the instruments (e.g., Figures 7 and 8) make their products only applicable to certain components and/or ranges of IWC. For example, MLS tends to represent IWC in the low to medium range of values (e.g., 0.5 to 50 mg m−3), and because it samples only the uppermost levels of the troposphere where precipitation and mixed phase have less influence, might be most representative of the cloud ice field (see section 2 and Wu et al.). On the other hand, CloudSat is much less sensitive to the smaller IWC values and is sensitive to larger hydrometeors and IWC values (see Figure 8).
 Using our present understanding of the strengths and limitations of the IWC values from MLS and CloudSat, along with knowledge and findings regarding the model ice fields, the analysis works to constrain how the data can best be applied for model evaluation. A chief consideration is the degree to which the floating (or slowly sedimenting; i.e., cloud) and precipitating (e.g., snow, graupel) hydrometeor fields are represented. For example, GCMs typically represent and/or output the ice associated with clouds, with a few GCMs explicitly representing precipitating hydrometeors as well (i.e., Figures 9–12). On the basis of the information at hand at this time and a number of qualitative inferences, the findings in this study lead to the suggestion that CloudSat IWC might provide a rough estimate of the total IWC field (i.e., including cloud, snow, graupel) that can be compared to GCMs that carry/simulate a more complete budget of the total ice field (Figures 4, 5, and 10–12). In addition, MLS IWC, along with judiciously subsampled CloudSat IWC, might provide a preliminary estimate of IWC associated with ice clouds in a GCM (Figures 13–16, including contributing models to the IPCC assessments, e.g., Figure 3).
 The subsampled CloudSat values (NP and NC) exclude retrievals that have either surface precipitation detected or that are identified as convective (i.e., deep convection or cumulus). The motivation is to develop a preliminary way to limit the contribution from larger, and thus typically falling, frozen hydrometeors. The caveats to this approach are that surface precipitation is not an ideal indicator of precipitation in the column; thus there may in fact be precipitation at upper levels that does not manifest as precipitation at the surface owing to reevaporation and advection. Moreover, there is not a one-to-one relationship between precipitation and/or convective clouds and the larger, falling hydrometeors, which one might attempt to remove from the retrievals in order to better compare to the models. An additional caveat is that in the present analysis, the models were not subsampled in the same way. This is not trivial to carry out owing to the differences in spatial sampling between the retrievals and the model, as well as the sensor/algorithm sensitivity that must be replicated to some degree when applying the analogous procedure to the model (i.e., applying a precipitation threshold to the model but taking into account the differences in spatial sampling).
 Future work involves taking the above considerations into account, namely by applying the CloudSat radar simulator [Haynes et al., 2007] and retrieval to the model output. Such an approach accounts for most of the assumptions in the retrieval and most aspects of the sensor sensitivity, and can help to match temporal sampling issues if properly applied. However, it must be emphasized that the output from the CloudSat radar simulator will only exhibit signals from the hydrometeors represented in the model. Thus, if a model's ice fields do not contain graupel and snow, for example, simulator output is not representing the same information/fields as in the CloudSat observations. Apart from this, there can still be issues of spatial sampling mismatch that are not easily accounted for (e.g., typical GCMs are not representing cumulus on the kilometer scale like CloudSat). Nevertheless, at the present time, the amount of new/unique information is tremendous, relative to the case without the recent products from CloudSat, MLS, etc. Again, it has to be stressed that these new retrieval products are still undergoing characterization and our community is just beginning to learn how to apply these retrievals to model-data comparisons. In any case, the limited capabilities of previous estimates (e.g., only IWP), coupled with their relatively poor agreement (e.g., Figure 4), along with the critical need to provide some form of model constraint/validation for what has largely been an unconstrained yet important quantity (model-model discrepancies of factors of 10 or more), dictate that even preliminary model-data comparisons along the lines discussed here be performed and provided with these new resources with some expediency.
 There are a number of additional avenues that could and/or need to be explored in order to refine the types of preliminary comparisons presented in this study. On the observational side, additional validation studies are needed, particularly those that offer cross comparison and validation of MLS, CloudSat and CALIPSO, to help enhance confidence in and characterization of the retrievals. The fact that these sensors fly in formation makes this a relatively productive and efficient undertaking. In addition, if more specific information could be relayed regarding a given product's applicability and/or sensitivity to quantities explicitly represented in models (e.g., cloud ice alone, snow, graupel), it would be easier to extend and interpret these types of studies. Further, developing auxiliary products, such as the CloudSat cloud classification flags, and/or learning to use other complementary A-Train sensors, may help immensely to characterize the context of the measurement and further refine the data-model comparison. For example, CloudSat provides particle size distribution (PSD) parameters as part of its retrievals. We have explored the use of these to reconstruct the size distribution and separate ice mass contributions from small and large particles to facilitate model-data comparisons (C. P. Woods et al., Partitioning CloudSat ice water content for comparison with upper-tropospheric cloud ice in global atmospheric models, submitted to Geophysical Research Letters, 2009). Another very complementary data set/methodology to explore in the present context is the hydrometeor profile estimates from TRMM, which are thought to represent precipitating ice particles [Jiang and Zipser, 2006]. As we continue to learn about the strengths of the current data and where hard limitations exist, follow-on mission design should be particularly cognizant of the model quantities and specific validation needs.
Having additional microphysical information (e.g., particle size) or dynamic information (e.g., vertical velocity) would be exceptionally helpful for further guiding and validating model development.
 As with the observation side, a clear(er) articulation by the modelers of the mass and particle size ranges being represented in the cloud parameterizations is needed to make the model-data comparisons most meaningful. Similarly, for the models that tend to only represent and output their cloud ice values, it would be useful to output the ice mass that is presumed to have precipitated out on a level-by-level basis. This is a quantity that is not typically output yet may provide, through the use of CloudSat IWC, an additional constraint on the model's ice physics. Beyond just getting the mean fields of ice mass correct, it will also be important to explore and validate in greater detail the distributions of ice mass values (e.g., Figure 8), paying close attention to equitable sampling methodologies. Note that the recent ice microphysics scheme of Morrison and Grabowski takes a novel approach to representing a more seamless distribution of ice in the atmosphere without the artificial categories such as cloud, graupel and snow. Schemes such as this lend themselves better to comparisons to the types of retrievals presented here. Finally, for those models that carry a more comprehensive range of frozen hydrometeor mass (e.g., cloud, snow, graupel), it would be helpful if the modelers considered the incorporation or use of QuickBeam [Haynes et al., 2007] and the CloudSat retrieval algorithm(s) to allow for a close correspondence between model and observed quantities. Overall, the challenge is quite clear regarding our model simulations of cloud ice (e.g., Figures 1 and 3), but given the new A-Train resources in hand, in conjunction with those from a number of others that bring complementary information (e.g., see those in Figure 4), we should expect to see a significant reduction in the shortcomings associated with our cloud ice simulations and in the uncertainties associated with (high) cloud climate change feedback by the time of the next IPCC assessment report.
An encouraging sign is that ECMWF has already introduced changes that bring their IFS system into better alignment with the available observations, and the latter have also played a role in the recent development of the GEOS5 GCM which also exhibits quantitatively good model-data agreement (e.g., Figures 16 and 17).
 The research described in this paper was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (NASA). It was also supported by NASA through the CloudSat and CALIPSO programs. The authors would like to thank Editor Steve Ghan and three anonymous reviewers for their efforts and comments regarding this manuscript. The authors would like to acknowledge support from Brent Maddux and Steve Ackerman for assistance with MODIS MYD06 analysis. We acknowledge the modeling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modeling (WGCM), for their roles in making available the WCRP CMIP3 multimodel data set. Support of this data set is provided by the Office of Science, U.S. Department of Energy.