A comparison of TWP-ICE observational data with cloud-resolving model results



[1] Observations made during the TWP-ICE campaign are used to drive and evaluate thirteen cloud-resolving model simulations with periodic lateral boundary conditions. The simulations employ 2D and 3D dynamics, one- and two-moment microphysics, several variations on large-scale forcing, and the use of observationally derived aerosol properties to prognose droplet numbers. When domain means are averaged over a 6-day active monsoon period, all simulations reproduce observed surface precipitation rate but not its structural distribution. Simulated fractional areas covered by convective and stratiform rain are uncorrelated with one another, and are both variably overpredicted by up to a factor of ∼2. Stratiform area fractions are strongly anticorrelated with outgoing longwave radiation (OLR) but are negligibly correlated with ice water path (IWP), indicating that ice spatial distribution controls OLR more than mean IWP. Overpredictions of OLR tend to be accompanied by underpredictions of reflected shortwave radiation (RSR). When there are two simulations differing only in microphysics scheme or large-scale forcing, the one with smaller stratiform area tends to exhibit greater OLR and lesser RSR by similar amounts. After ∼10 days, simulations reach a suppressed monsoon period with a wide range of mean precipitable water vapor, attributable in part to varying overprediction of cloud-modulated radiative flux divergence compared with observationally derived values. Differences across the simulation ensemble arise from multiple sources, including dynamics, microphysics, and radiation treatments. Close agreement of spatial and temporal averages with observations may not be expected, but the wide spreads of predicted stratiform fraction and anticorrelated OLR indicate a need for more rigorous observation-based evaluation of the underlying micro- and macrophysical properties of convective and stratiform structures.

1. Introduction

[2] The Tropical Warm Pool–International Cloud Experiment (TWP-ICE) took place over and around Darwin, Australia, from 20 January through 13 February 2006. According to May et al. [2008], TWP-ICE is “the first field program in the tropics that attempted to describe the evolution of tropical convection, including the large-scale heat, moisture, and momentum budgets at 3-hourly time resolution, while at the same time obtaining detailed observations of cloud properties and the impact of the clouds on the environment.” The experiment specifically focused on the properties of convectively generated cirrus, aiming to document their relationship to environmental conditions. The experimental domain (Figure 1) was centered on a highly instrumented site operated by the US Department of Energy (DOE) Atmospheric Radiation Measurement (ARM) program and a C-band polarimetric (C-POL) weather radar operated by the Australian Bureau of Meteorology, surrounded by a 3-hourly sounding array (Table 1) and surface energy budget sites. TWP-ICE was also coordinated with the Aerosol and Chemical Transport in tropIcal conVEction (ACTIVE) program, funded by the UK Natural Environment Research Council, which gathered extensive in situ measurements of environmental aerosol properties [Vaughan et al., 2008]. The data gathered during TWP-ICE and ACTIVE are now archived by ARM and the British Atmospheric Data Centre, respectively.

Figure 1.

The TWP-ICE observational domain, with sounding array locations enclosing a pentagonal area of roughly 31,000 km². Latitude and longitude of each radiosonde site are listed in Table 1.

Table 1. Sounding Array Sites Defining the TWP-ICE Pentagonal Domain
Site Name       Latitude    Longitude
Mount Bundy     −13.2287    131.1355
Garden Point    −11.4089    130.4167
Cape Don        −11.3081    131.7651
Point Stuart    −12.5858    131.7609

[3] Improving the climate projection skill of general circulation models (GCMs), a principal motivation for TWP-ICE, has been hindered by inadequate representation of cloud properties and their relationship to environmental conditions [e.g., Randall et al., 2007]. Since cloud properties vary on short temporal and spatial scales that are not well resolved in GCMs, approaches to improve GCM cloud representation have commonly included direct or indirect use of cloud-resolving models (CRMs) [e.g., Randall et al., 2003a]. With the explicit goal of using CRMs to guide the improvement of climate models, under the auspices of the World Meteorological Organization, a primary activity of the Global Energy and Water-Cycle Experiment (GEWEX) Cloud Systems Study (GCSS) has been the organization of intercomparison studies in which modeling groups worldwide are invited to participate [Randall et al., 2003b]. The first GCSS modeling study of deep convective processes, based on the Tropical Oceans Global Atmosphere Coupled Ocean–Atmosphere Research Experiment (TOGA-COARE), included both CRMs and single-column models (SCMs) [Moncrieff et al., 1997; Wu et al., 1998, 1999; Wu and Moncrieff, 2001a; Redelsperger et al., 2000; Bechtold et al., 2000]. SCM and CRM simulations of midlatitude continental convection were next compared based on observations at the ARM Southern Great Plains (SGP) site [Ghan et al., 2000; Xu et al., 2002; Xie et al., 2002, 2005]. Later GCSS convection studies focused on the transitions from shallow to deep convection over tropical land [Grabowski et al., 2006] and from suppressed to deep convection over tropical ocean, and included analysis of global atmospheric models, in addition to CRMs and SCMs [Petch et al., 2007; Willett et al., 2008; Woolnough et al., 2010]. 
A number of related studies have used a similar approach to investigate the sensitivity of a single CRM or SCM over a much wider parameter space than can generally be accommodated in a multimodel study [e.g., Grabowski et al., 1996, 1998; Wu et al., 1998; Grabowski et al., 1999; Wu et al., 1999; Wu and Moncrieff, 2001a, 2001b].

[4] Here we present the results of a CRM study based on data gathered during the TWP-ICE and ACTIVE programs. The specification for CRM initialization and forcing [Fridlind et al., 2010] was developed jointly through the DOE ARM, GCSS, and Stratospheric Processes And their Role in Climate (SPARC) programs. SPARC participation was motivated by the goal of understanding the influence of tropical deep convection on water vapor concentrations and convective transport through the tropical tropopause, and led to the derivation and adoption of a large-scale forcing data set at 10-mb vertical resolution in order to improve representation of near-tropopause thermodynamic conditions. A unique aspect of this case is the availability of an idealized aerosol number size distribution profile, composed of three lognormal modes with fixed geometric mean radius and standard deviation and number concentrations that vary with altitude (see section 2), derived from measurements as described by Fridlind et al. [2010]. To our knowledge this is the first CRM comparison study to provide a vertically varying profile of aerosol size distribution properties; the inclusion of modal information extends upon the simpler specification of total number concentration profile provided by Barth et al. [2007]. Whereas the work presented here focuses only on 2D and 3D CRM simulations with fully periodic boundary conditions (an approach used most commonly in combination with SCM simulations), three complementary studies based on TWP-ICE data have been simultaneously conducted using SCMs (L. Davies, manuscript in preparation, 2012), limited-area models (LAMs) with open boundary conditions and nested grids (P. Zhu et al., A limited area model (LAM) intercomparison study of a TWP-ICE active monsoon mesoscale convective event, manuscript submitted to Journal of Geophysical Research, 2012), and GCMs operated in short-term forecast mode (Y. Lin et al., TWP-ICE global atmospheric model intercomparison: convection responsiveness and resolution impact, manuscript submitted to Journal of Geophysical Research, 2012). A summary and comparison of all four studies will focus on common and contrasting results as well as methodological issues (J. Petch, manuscript in preparation, 2012).

[5] TWP-ICE data have already been widely used in other modeling studies. Among those focused primarily on dynamics and precipitation, several analyzed CRM dynamical behaviors under TWP-ICE conditions to inform GCM parameterization development [Wu et al., 2009; Del Genio and Wu, 2010; Wang and Liu, 2009]. Others directly evaluated GCM parameterizations with respect to closure assumptions [Zhang, 2009], the effect of two-moment microphysics on simulated versus observed stratiform precipitation [Song and Zhang, 2011], and simulated versus observed relative humidity among other factors [Franklin et al., 2012]. Wapler et al. [2010] concluded that judiciously formulated LAM simulations could reasonably reproduce observed precipitation rate statistics. Among studies focused more on ice properties, Wang et al. [2009b] found substantial discrepancies between simulated and observed ice cloud properties in all the CRM simulations they considered. Wang et al. [2009a] found that SCM radiative fluxes are sensitive to the representation of ice properties that are not directly constrained by ground and satellite measurements. Other studies reported on the sensitivity of simulated aerosol, microphysical, dynamical, and radiative processes to changes in aerosol specification and ice nucleation assumptions [Fan et al., 2010a, 2010b; Morrison and Grabowski, 2011; Zeng et al., 2011]. In a companion study using 3D simulations from this study, Varble et al. [2011] have also examined the characteristics of precipitating cloud structures in greater detail.

[6] Here we first briefly describe the specification for CRM initialization and forcing (section 2), the CRMs used (section 3), and the observational data sets used to evaluate the simulations (section 4). Most aspects of the specification are based on methodologies developed for earlier GCSS cases, and like all prior GCSS studies cited above, this work compares observed and simulated thermodynamic variables, hydrometeor paths, and precipitation rates. Each prior study also addressed specific focus areas, such as the treatment of boundary conditions and large-scale forcing terms [e.g., Ghan et al., 2000] or the effect of adding basic model features such as the ice phase or a third spatial dimension [e.g., Redelsperger et al., 2000]. In this paper we focus on the following questions: (1) do simulations and observations agree within experimental uncertainties and (2) how robust is the methodology used here for producing realistic simulations? In the course of addressing these questions in section 5, simulations are also compared with one another; here we intentionally limit our focus to quantities that are observationally constrained, but note that companion studies include additional model variables [e.g., Varble et al., 2011] and compare them with other model results (e.g., Zhu et al., submitted manuscript, 2012).

2. Case Description

[7] Over the month-long TWP-ICE campaign, Darwin experienced active monsoon conditions only during the first week, culminating with the passage of a large mesoscale convective system (MCS) directly through the center of the observational domain on 23–24 January, followed by suppressed monsoon conditions through 3 February, and monsoon break conditions thereafter [May et al., 2008]. This study focuses only on the active and suppressed periods. Although the TWP-ICE experimental domain contains both land and ocean, the low-lying land areas become saturated during monsoon periods, behaving in a manner that is maritime in nature. To allow CRM representation of relatively slowly developing and advecting monsoon features (such as cold pools) over the TWP-ICE region in a framework that remains as simple as possible, the following idealized marine conditions are specified (additional details provided by Fridlind et al. [2010]): (1) model domain footprint representative of the TWP-ICE observation domain (circa 176 × 176 km), (2) sea surface temperature fixed at 29°C (diagnosed surface fluxes), (3) fully periodic horizontal boundary conditions, (4) surface albedo fixed at 0.07 in all shortwave bands, (5) diurnally varying insolation with domain centered on the Darwin ARM site (12.425°S, 130.891°E), (6) run time of 16 days (0Z 18 January to 0Z 3 February 2006), (7) nudging of mean horizontal winds above 500 m to observed profiles with a two-hour timescale, (8) large-scale advective forcing of potential temperature and water vapor (vertical and horizontal, derived from observations) and condensate (vertical only, calculated using large-scale wind derived from observations) [Xie et al., 2010], applied at full strength below 15 km, linearly decreasing above to zero strength at 16 km, and (9) nudging of mean water vapor and potential temperature to observed profiles with a six-hour timescale, adopted at full strength above 16 km and linearly decreasing below to zero strength at 15 km (baseline) or adopted at full strength above 1 km and linearly decreasing below to zero strength at 0.5 km (optional sensitivity test).
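The vertical weighting in items 8 and 9 can be illustrated with a minimal Python sketch. It is not taken from any participating model; the function names and the treatment of height as a scalar in kilometers are assumptions made here for illustration only.

```python
def forcing_weight(z_km, z_full=15.0, z_zero=16.0):
    """Weight on large-scale advective forcing (item 8):
    1 at and below z_full, decreasing linearly to 0 at z_zero."""
    if z_km <= z_full:
        return 1.0
    if z_km >= z_zero:
        return 0.0
    return (z_zero - z_km) / (z_zero - z_full)


def nudging_weight(z_km, z_zero=15.0, z_full=16.0):
    """Weight on thermodynamic nudging (item 9):
    0 at and below z_zero, increasing linearly to 1 at z_full."""
    if z_km <= z_zero:
        return 0.0
    if z_km >= z_full:
        return 1.0
    return (z_km - z_zero) / (z_full - z_zero)


def nudging_tendency(x_mean, x_obs, z_km, tau_s=6 * 3600.0,
                     z_zero=15.0, z_full=16.0):
    """Horizontally uniform tendency that relaxes the domain-mean value
    x_mean toward the observed value x_obs on timescale tau_s (seconds)."""
    return nudging_weight(z_km, z_zero, z_full) * (x_obs - x_mean) / tau_s
```

Under these assumptions, the baseline corresponds to the default arguments, and the optional sensitivity test corresponds to `nudging_weight(z_km, z_zero=0.5, z_full=1.0)`. Because the same tendency is added to every column, departures from the domain mean are left unchanged.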

[8] Nudging of water vapor and potential temperature in the upper troposphere was found to be necessary to keep simulated environmental conditions realistic aloft, consistent with an understanding that large-scale forcings are poorly constrained by measurements above about 15 km [cf. Petch et al., 2007; Morrison and Grabowski, 2011]. An optional sensitivity test with nudging extended down to the lower troposphere was included because drift of simulated conditions from observations at lower elevations was found to influence the strength and depth of convection and the area covered by stratiform precipitation. Horizontally uniform application of all nudging terms preserves variations from the mean.

[9] Input files are archived as described in Appendix A. These include an idealized profile of aerosol size distribution properties that was derived from observations during the active period (Figure 2) as described by Fridlind et al. [2010].

Figure 2.

(left) Mean profiles of aerosol number concentration in three size cuts and (right) derived trimodal size distributions as a function of elevation based on ACTIVE in situ measurements as described by Fridlind et al. [2010] (see section 2).

[10] Allowing 36 hours of model spin-up, analysis here and in companion studies is focused on several time periods after 12Z on 19 January (day of year range in parentheses): (1) 6 days of active monsoon conditions (19.5–25.5), (2) 6 days of suppressed monsoon conditions (27.5–33.5), and (3) three shorter periods of intense precipitation during the active monsoon (19.5–20.625, 22.125–23.125, and 23.125–24.5), referred to hereafter as events A, B, and C.
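For readers working with the archived time series, the analysis windows above can be expressed as a short Python sketch. The day-of-year convention is inferred from the text (day 19.5 corresponds to 12Z on 19 January, so day 1.0 is 0Z on 1 January); the names `doy_to_utc`, `PERIODS`, and `in_period`, and the choice of half-open intervals, are assumptions made for illustration.

```python
from datetime import datetime, timedelta

YEAR_START = datetime(2006, 1, 1)  # day of year 1.0 is taken as 0Z 1 January 2006

# Analysis windows in fractional day of year, as listed in the text.
PERIODS = {
    "active": (19.5, 25.5),
    "suppressed": (27.5, 33.5),
    "event A": (19.5, 20.625),
    "event B": (22.125, 23.125),
    "event C": (23.125, 24.5),
}


def doy_to_utc(doy):
    """Convert a fractional day of year to a UTC datetime."""
    return YEAR_START + timedelta(days=doy - 1.0)


def in_period(doy, name):
    """True if doy falls inside the named window (start inclusive, end
    exclusive; the half-open choice is an assumption of this sketch)."""
    start, end = PERIODS[name]
    return start <= doy < end
```

With this convention, `doy_to_utc(33.5)` gives 12Z on 2 February, the end of the suppressed period.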

[11] We emphasize that this case specification assumes an all-ocean surface over the coastal TWP-ICE observational domain, an assumption that is being examined in companion studies that include land surfaces. Comparison of CRM results with LAM simulations spanning event C has shown little impact of land surface on initiation and maintenance of convection during the active period (Zhu et al., submitted manuscript, 2012), but impacts could well be greater during the suppressed period.

3. Simulations

[12] Simulations include ten combinations of dynamics and microphysics. Six dynamics models were used: the Distributed Hydrodynamic-Aerosol-Radiation Model Application (DHARMA) [Stevens et al., 2002; Ackerman et al., 2000], the Eulerian semi-Lagrangian model (EULAG) [Smolarkiewicz and Margolin, 1997], the Iowa State University 2D Cloud Resolving Model (ISUCRM) [Wu et al., 2008], the Meso-NH Atmospheric Simulation System (MESONH) [Lafore et al., 1998], the System for Atmospheric Modeling (SAM) [Khairoutdinov and Randall, 2003], and the UK Met Office Large Eddy Model (UKMO) [Shutts and Gray, 1994; Petch and Gray, 2001]. All model dynamics are based on anelastic equations, but with varying treatments of subgrid-scale turbulence, surface fluxes, radiative transfer, advection, and time stepping. General model features and optional setup parameters are summarized in Table 2. Three groups submitted the optional sensitivity test (with nudging added throughout the free troposphere to offset accumulation of errors; see section 2); the sensitivity test simulations are identified with an “s” (DHARMA-1s, EULAG-2s, and SAM-2Ms).

Table 2. Model Parameters
Model       Dimension  Domain^a (km)  Δx (m)  Δz^b (m)  Microphysical Reference^c  Prognostic Microphysical Variables^d  Ice Nucleation Mechanisms^e  Sensitivity Test^f
DHARMA-1    3          176            900     100–250   G99                        qc qr qs qg                           I H                          yes
DHARMA-2M   3          176            900     100–250   M09                        qc qr qi qs qg Nc Nr Ni Ns Ng         D C I H M S
EULAG-2     2          200            1000    100–300   M08                        qc qr qi Nc Nr Ni RMFi                D C I H M S                  yes
ISUCRM-2    2          600            3000    100–1000  K76                        qc qr qs qg Ns Ng                     D
MESONH-1    3          192            1000    100–250   P98                        qc qr qi qs qg                        D H
MESONH-2    3          192            1000    100–250   P02                        qc qr qi qs qg Nc Nr Ni               D C H A M
SAM-2M      3          192            1000    100–400   M09                        qc qr qi qs qg Nc Nr Ni Ns Ng         D C I H M S                  yes
UKMO-2A     3          176            900     225–500   B06                        qc qr qi qs qg Ni                     D C H M
UKMO-2B     3          176            900     225–500   B06                        qc qr qi qs qg Ni Ns Ng               D C I H M S
UKMO-2M     3          176            900     225–500   M09                        qc qr qi qs qg Nr Ni Ns Ng            D C I H M S

^a Domain footprint is square for 3D models.
^b Range of model layer depths between surface and typical tropopause elevation of 17 km.
^c Microphysics references: G99, Grabowski [1999]; M09, Morrison et al. [2009]; M08, Morrison and Grabowski [2008b] (see also section 3); K76, Koenig and Murray [1976]; P98, Pinty and Jabouille [1998]; P02, Pinty [2002]; and B06, Brown and Heymsfield [2006].
^d Prognostic microphysical variables: respective mixing ratios and number concentrations of cloud water (qc, Nc), rain (qr, Nr), ice (qi, Ni), snow (qs, Ns), and graupel (qg, Ng; treated as hail in SAM-2M). In EULAG-2, cloud ice includes all ice types and is characterized by rimed mass fraction (RMFi).
^e Ice crystal formation mechanisms (see section 3): D, deposition and condensation nucleation (may be used for diagnostic Nc); C, contact nucleation; I, immersion nucleation; H, homogeneous freezing of cloud or raindrops; A, aerosol freezing (homogeneous); M, Hallett-Mossop ice multiplication; S, snow breakup.
^f Optional sensitivity test submitted (see section 2, item 9).

[13] Simulated convective cloud properties are expected to be sensitive to the microphysics scheme [e.g., Wang et al., 2009b; Fan et al., 2010b], which can be classified in terms of the number of prognostic variables used for condensed water. The one-moment schemes (DHARMA-1 and MESONH-1) prognose only 4–5 hydrometeor mixing ratios, whereas the two-moment schemes additionally prognose 1–5 number concentrations (see Table 2). EULAG-2 uses a single size distribution for all ice that is further characterized by a prognostic rimed mass fraction [Morrison and Grabowski, 2007, 2008a, 2008b]. Here each simulation moniker includes either a “1” to indicate one-moment (no number concentrations prognosed) or a “2” to indicate two-moment (at least one number concentration prognosed). Simulations identified with “2M” use versions of the two-moment Morrison et al. [2009] scheme. Other analyses of these simulations may use differing naming conventions [e.g., Varble et al., 2011].

[14] Of the four schemes that prognose cloud droplet number concentration (Nc), DHARMA-2M and SAM-2M used the vertically varying trimodal aerosol profile provided. In DHARMA-2M, the aerosol in each mode were advected, consumed by hydrometeor collision-coalescence, and nudged on a domain-mean basis to their initial profiles with a six-hour timescale, whereas in SAM-2M the number concentrations were fixed. In EULAG-2, the three modes were populated with vertically uniform number concentrations of 295, 95, and 0.4 cm−3, respectively. In MESONH-2, an activation spectrum was fitted using the diameter and standard deviation of the middle specified mode and the number concentration of 130 cm−3 specified at 1500 m in altitude. Nc was fixed at 240 cm−3 in UKMO-2A and UKMO-2B and at 100 cm−3 in the remaining schemes.

[15] While it is beyond the scope of this paper to provide an exhaustive comparison of the microphysical processes, the mechanisms of primary and secondary ice nucleation are listed in Table 2 owing to their expected importance in simulation results [e.g., Fan et al., 2010b]. Most simulations include a single diagnostic equation for the number concentration of heterogeneous ice nuclei that form ice crystals directly from the vapor phase in the deposition or condensation modes when the air is supersaturated with respect to ice or water, expressed as an exponential function of either supercooling only (ISUCRM-2 uses equation (13) from Koenig and Murray [1976] with A06 = 464 and A07 = 12 in SI units; SAM-2M and UKMO-2M follow the Thompson et al. [2004] implementation of Cooper et al. [1986] except with an upper limit of 500 L−1) or supersaturation only (DHARMA-2M, EULAG-2, MESONH-2, UKMO-2A, and UKMO-2B use equation (2.4) from Meyers et al. [1992] with a = −0.639 and b = 0.1296; EULAG-2 sets a limit of 100 L−1 on the resulting number concentration). Most schemes independently diagnose ice nuclei active in the contact mode as an exponential function of supercooling only following Meyers et al. [1992] (DHARMA-2M, EULAG-2, MESONH-2, SAM-2M and UKMO-2M use their equation (2.6) with a = −2.80 and b = 0.262) or Young [1974] (UKMO-2A and UKMO-2B use his equation (12) with Na0 = 2000 m−3). Most models that include immersion freezing assume a stochastic treatment that is an exponential function of supercooling only, following Bigg [1953]; the only exception is DHARMA-1, which diagnoses a number concentration of heterogeneous immersion nuclei as an exponential function of temperature only [Grabowski, 1999, equation (A.20)], following Fletcher [1962]. Most models include near-instantaneous homogeneous freezing of activated cloud droplets and/or raindrops at temperatures colder than roughly −40°C [e.g., Pinty and Jabouille, 1998]. Freezing of unactivated aerosol at colder temperatures and higher supersaturations is included only in MESONH-2, following Kärcher and Lohmann [2002]. Secondary nucleation processes are Hallett-Mossop rime splintering and snow breakup (one or both are included in most simulations; see Table 2). SAM-2M and UKMO-2M also set an upper limit of 10 cm−3 on cloud ice number concentration.
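The two Meyers et al. [1992] diagnostic relations cited above can be written compactly, as in the Python sketch below. It reflects our reading of their equations (2.4) and (2.6) with the coefficients quoted in the text, taking supersaturation with respect to ice in percent and number concentrations per liter. The function names, the optional cap argument, and the zero return at or above 0°C are assumptions of this sketch rather than features of any particular model.

```python
import math


def meyers_deposition_condensation(s_ice_percent, a=-0.639, b=0.1296,
                                   cap_per_litre=None):
    """Ice nuclei active in the deposition/condensation modes (per litre)
    as an exponential function of ice supersaturation in percent."""
    n = math.exp(a + b * s_ice_percent)
    if cap_per_litre is not None:
        # Optional upper limit, e.g. 100 per litre as described for EULAG-2.
        n = min(n, cap_per_litre)
    return n


def meyers_contact(temp_k, a=-2.80, b=0.262):
    """Ice nuclei active in the contact mode (per litre) as an exponential
    function of supercooling, 273.15 K minus temperature."""
    supercooling = 273.15 - temp_k
    if supercooling <= 0.0:
        return 0.0  # assumption: no contact nuclei at or above 0 deg C
    return math.exp(a + b * supercooling)
```

For example, `meyers_deposition_condensation(50.0, cap_per_litre=100.0)` returns the cap, because the uncapped value at 50% ice supersaturation exceeds 100 per liter.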

[16] Submitted model results are archived for public use as described in Appendix A. Archived results for 3D models include Rayleigh-scattering radar reflectivities that have been independently calculated using uniform assumptions [Varble et al., 2011], which are used here. Numerous non-standard diagnostics were requested for comparison with specific observational data streams [see Fridlind et al., 2010], which resulted in most participants running computationally intensive simulations more than once to increase compliance with the full specification. Since a principal objective of this study is to compare a variety of simulations with the observations (rather than with one another), three unique simulations that do not hew precisely to the full specification have been included here. In ISUCRM-2, which is valuable as one of only two 2D models, nudging of tropospheric water vapor and potential temperature aloft is neglected, which has little impact on overall convective fluxes and precipitation by specification design (as an aside, we note that the meridional wind direction is also reversed in this simulation, a change to which results are invariant in this modeling framework). In MESONH-1 and MESONH-2, which include unique graupel schemes [cf. Varble et al., 2011], meridional winds are not consistent with the specification. All diagnostics were made optional owing to the long list requested [see Fridlind et al., 2010]; if a diagnostic is not available for a given simulation, then that simulation is omitted from evaluation against measurements in the following without further comment.

4. Observations

[17] In this paper, emphasis is placed on domain-wide observational data sets rather than individual point and profile measurements. To bridge the spatial scale mismatch between CRMs and data derived from scanning radar, satellite imaging, or global analysis models, CRMs reported some diagnostics at a coarsened horizontal resolution, ranging from 2.5 km (scanning radar-scale) to 55 km (global analysis-scale). By contrast, it is not possible to directly manipulate CRM fields to bridge the mismatch between the model grids and very high spatial resolution point and column measurements. Taking precipitation rate as an example, averaging times of 5–15 min have been found to be optimal in point comparisons of time-integrated rain gauge measurements with instantaneous precipitation radar measurements at 2-km horizontal resolution [Habib and Krajewski, 2002]. Even if it were in principle possible to find an optimal averaging time for intercomparison of each point or column data source with each CRM horizontal grid spacing (0.9–3 km from Table 2), statistical results would still be challenging to robustly use for constraining models, as evidenced by the difficulties encountered when comparing precipitation radars (roughly comparable to CRM grid cell size) and rain gauges [e.g., Nikolopoulos et al., 2008]. Past work has furthermore indicated that average collocated radar-gauge precipitation measurements should not be expected to agree to better than about 10% until 20 or so convective events are sampled [Habib and Krajewski, 2002, and references therein], far more than sampled here. Additional point and column measurements will be considered in future work.

[18] Original data were downloaded from the ARM online archive unless otherwise indicated. Processed values have been archived for public use in a CF-compliant format (see Appendix A).

4.1. C-Band Polarimetric (C-POL) Radar

[19] Data obtained from the 5.5-cm-wavelength scanning C-POL radar at Darwin [Keenan et al., 1998] are gridded reflectivities at 0.5-km vertical resolution and retrieved precipitation rate at an elevation of 2.5 km and 1-km vertical resolution. All radar data are reported at 2.5-km horizontal resolution and 10-min frequency throughout the TWP-ICE domain (bounded by sites listed in Table 1). Recalibrated radar data were provided by Peter May. Uncertainty in retrieved precipitation rate is estimated to be 25% at rain rates above 10 mm h−1 and 100% at the lowest reported rain rates. Uncertainty in rain rate averaged domain-wide over the suppressed period is 25% but a bit higher during the active period (33%); we use 25% as a representative value for both periods since the difference during the active period does not impact conclusions. Uncertainties of the occurrence frequencies over a range of rates (e.g., 2–20 mm h−1) were found by recalculating the frequencies in each range with uncertainties added or subtracted; the uncertainty range is then obtained from the minimum and maximum frequencies found in each rain rate category.
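The perturbation approach used for the occurrence frequencies can be sketched in Python as follows. For simplicity the sketch applies a single relative uncertainty to every rain rate, whereas the text describes an uncertainty that varies with rain rate; the function names and the half-open bin convention are likewise assumptions made for illustration.

```python
def occurrence_frequency(rates, lo, hi):
    """Fraction of rain rates falling in the range [lo, hi)."""
    if not rates:
        return 0.0
    return sum(lo <= r < hi for r in rates) / len(rates)


def frequency_with_uncertainty(rates, lo, hi, rel_unc=0.25):
    """Return (nominal, minimum, maximum) occurrence frequency in [lo, hi),
    where the minimum and maximum are taken over the nominal rates and the
    rates with the relative uncertainty subtracted or added."""
    candidates = [
        occurrence_frequency([r * factor for r in rates], lo, hi)
        for factor in (1.0 - rel_unc, 1.0, 1.0 + rel_unc)
    ]
    return candidates[1], min(candidates), max(candidates)
```

For instance, with rates of 1.7, 3.0, 10.0, and 30.0 mm h−1 and a range of 2–20 mm h−1, the nominal frequency is 0.5 and the perturbed frequencies span 0.5 to 0.75, since adding 25% moves the lowest rate into the range.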

[20] We identify the fractional area covered by convective and stratiform rain over the TWP-ICE domain using C-POL reflectivity as described in Appendix B. Uncertainty in the fractional areas is estimated by applying the same identification algorithm to reflectivity fields with the grid cell uncertainty of approximately 1 dBZ added or subtracted. Resulting relative uncertainty in the convective and stratiform area fractions is within 20% and 5%, respectively, during both active and suppressed periods.

[21] Latent heating rate profiles retrieved over the TWP-ICE domain were provided by Courtney Schumacher [Schumacher et al., 2004] based on a separate processing of C-POL raw radar data, including gridding at 2-km rather than 2.5-km resolution. For comparison with simulations here, latent heating rate profiles are normalized by the ratio of surface precipitation rate in the large-scale forcing data set to vertically integrated latent heating rate, which is computed using time-dependent thermodynamic profiles also obtained from the large-scale forcing data set. Latent heating rate profiles are compared with model results on a qualitative basis.
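The normalization can be illustrated with a short Python sketch in which a retrieved heating profile is rescaled so that its column integral matches the latent heat released by the surface precipitation rate. The constant values of specific heat and latent heat, the layer-wise discretization, and all names are assumptions of this sketch; the actual processing used time-dependent thermodynamic profiles from the large-scale forcing data set.

```python
CP = 1004.0  # specific heat of dry air, J kg-1 K-1 (assumed constant)
LV = 2.5e6   # latent heat of vaporization, J kg-1 (assumed constant)


def normalize_latent_heating(heating_k_per_s, rho, dz, precip_kg_m2_s):
    """Rescale a latent heating profile (K s-1 per layer) so that its
    vertical integral (W m-2) equals LV times the surface precipitation
    rate (kg m-2 s-1). rho is air density (kg m-3) and dz is layer depth (m)."""
    column = sum(CP * r * q * d for r, q, d in zip(rho, heating_k_per_s, dz))
    if column == 0.0:
        raise ValueError("vertically integrated latent heating is zero")
    scale = LV * precip_kg_m2_s / column
    return [q * scale for q in heating_k_per_s]
```

The rescaling changes only the magnitude of the profile, not its vertical shape, which is consistent with comparing the profiles on a qualitative basis.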

4.2. Visible Infrared Solar-Infrared Split-Window Technique (VISST) Retrievals

[22] Broadband top-of-atmosphere (TOA) outgoing longwave radiation (OLR), shortwave albedo, and ice water path (IWP) were derived from radiances measured by the imager on the geostationary satellite MTSAT-1R. The OLR and shortwave albedo were derived from the 10.8-μm and 0.73-μm radiances, respectively, following the approach of Minnis and Smith [1998] with modifications similar to those described by Khaiyer et al. [2010]. The relevant MTSAT-1R channels were calibrated against the corresponding spectral channels on the Terra MODerate-resolution Imaging Spectroradiometer. The cloud properties were derived using the methods of Minnis et al. [2008, 2011a] to detect cloudy pixels and retrieve cloud properties such as phase, effective particle size, and optical depth. IWP was computed from the product of the last two parameters for ice-cloud pixels, and therefore may include contributions from liquid underlying an ice layer. Analysis of the TWP-ICE data set is summarized by Minnis et al. [2006]. All of these MTSAT-derived data streams are referred to hereafter as VISST for brevity. Values are reported at a 15–60-min frequency and 4-km resolution over 5–17°S and 125–136°E. In each swath, pixels in the TWP-ICE domain are identified and relevant statistics calculated. Domain-mean relative uncertainties are estimated as +9/−4% in OLR and +7/−15% in reflected shortwave radiation (RSR) and shortwave albedo, based on comparisons with Terra (CERES Edition 3) after Khaiyer et al. [2010].

[23] VISST IWP data are limited to daytime because the retrieval requires visible reflectance to estimate optical depth. Since the maximum retrievable optical depth is 128 for this data set, IWP will be underestimated when optical depth is higher, consistent with a comparison of annual mean VISST and CloudSat retrievals finding close agreement except in regions of tropical convection [Waliser et al., 2009]. There, the mean VISST IWP values were ∼25% less than their CloudSat counterparts, suggesting a negative bias of roughly one-quarter. For thin cirrus, the instantaneous VISST IWP retrievals are typically within 40% of surface-based radar-radiometer retrievals [Minnis et al., 2011b]. VISST IWP retrievals are considered here qualitatively, and the unknown contribution of liquid hydrometeors to retrieved IWP is neglected when comparing with simulations. Daytime is defined conservatively as any time when instantaneous TOA downwelling flux exceeds 200 W m−2. Since simulated TOA downwelling solar fluxes are not identical, the UKMO-2A 10-min TOA shortwave downwelling flux time series, with mean diurnal values that precisely match those in the 3-h large-scale forcing data set, is used as a benchmark to define daytime temporally for all comparisons.
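The daytime definition amounts to a simple threshold mask applied to a common benchmark flux series, as in the Python sketch below. The function names and the use of `None` for missing values are assumptions made for illustration.

```python
def daytime_mask(toa_sw_down, threshold=200.0):
    """True wherever the benchmark TOA downwelling shortwave flux (W m-2)
    exceeds the daytime threshold."""
    return [flux > threshold for flux in toa_sw_down]


def daytime_mean(values, toa_sw_down, threshold=200.0):
    """Mean of values over daytime samples only, skipping missing (None)
    entries; returns None if no daytime samples are available."""
    mask = daytime_mask(toa_sw_down, threshold)
    selected = [v for v, is_day in zip(values, mask)
                if is_day and v is not None]
    return sum(selected) / len(selected) if selected else None
```

Applying the same mask to both observed and simulated series keeps the sampling times identical across all comparisons.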

4.3. Total Sky Imager (TSI) Retrievals

[24] The TSI provides time series of hemispheric sky images during daylight hours and retrievals of fractional opaque and thin cloud cover at 30-s frequency when the solar elevation is greater than 10°. Uncertainty in opaque cloud retrievals depends upon cloud aspect ratio [Kassianov et al., 2005] and is not used here. Owing also to the difficulty of reconciling model-based definitions of clear and cloudy conditions with the differing VISST and TSI instrument-based definitions, we consider all cloud cover results qualitatively rather than quantitatively.

4.4. Surface Sensible and Latent Heat Fluxes

[25] An eddy covariance system mounted on a short tower over Darwin Harbor provided surface sensible and latent heat flux measurements at 30-min resolution. Gap-filled data are used, wherein gaps shorter than two hours are filled using interpolation and longer gaps are filled using a neural network algorithm [Beringer et al., 2007]. Here we consider the harbor measurements qualitatively since they were made at a single location. We note that the domain-wide fluxes in the large-scale forcing data set are an area-weighted average that includes eddy covariance measurements at land sites and profile-based calculations at sea (see section 4.7).

4.5. Microwave Radiometer (MWR) Retrievals

[26] Liquid water path is retrieved from MWR measurements at Darwin and on the ship [Turner et al., 2007]. Data are reported at 20 to 35-s resolution when the measurements are not contaminated by surface precipitation. Indeterminate and missing fields are removed, small negative values set to zero, and the mean of retrieved values at both stations taken over 10-min intervals. The degree to which an average of two stations is representative of the domain mean is unknown; the representativeness of point measurements likely depends upon the variable being measured and the meteorological conditions encountered [e.g., Barnett et al., 1998; Habib and Krajewski, 2002]. LWP is therefore considered qualitatively in this study.
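The processing steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions: missing or indeterminate values are represented as NaN, all negative values are clamped to zero (the text specifies only "small" negatives), and samples from both stations are pooled before averaging in fixed 10-min bins. The function name and sample values are hypothetical.

```python
import numpy as np


def ten_minute_mean_lwp(times_s, lwp, bin_s=600.0):
    """Pool LWP samples from all stations into fixed bins and average them.

    times_s: sample times in seconds; lwp: retrieved LWP, NaN where missing.
    Returns (bin_start_times, bin_means); bins without valid data give NaN.
    """
    times_s = np.asarray(times_s, dtype=float)
    lwp = np.asarray(lwp, dtype=float)
    valid = np.isfinite(lwp)                      # drop missing values
    t = times_s[valid]
    x = np.maximum(lwp[valid], 0.0)               # assumption: clamp negatives
    n_bins = int(np.floor(times_s.max() / bin_s)) + 1
    idx = np.floor(t / bin_s).astype(int)
    sums = np.bincount(idx, weights=x, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return np.arange(n_bins) * bin_s, means


# Hypothetical pooled samples from two stations (times in s, LWP in kg m-2).
times = [0.0, 30.0, 60.0, 610.0, 640.0, 1300.0]
values = [0.10, -0.01, np.nan, 0.20, 0.40, np.nan]
starts, means = ten_minute_mean_lwp(times, values)
print(starts, means)
```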

4.6. European Center for Medium-Range Weather Forecasts (ECMWF) Global Analyses

[27] ECMWF supplied results of the Operational Analysis and Forecasting System in three grid cells representative of the TWP-ICE domain. Surface precipitation rate considered here is an average over 1-h time periods (ending at reporting time) and roughly 55-km resolution. Statistics used here are mean and maximum of surface precipitation rate across the three grid cells provided. They are compared with the domain-wide mean and maximum of instantaneous values obtained at 10-min frequency from C-POL measurements and simulations.

4.7. Large-Scale Forcing Data Set

[28] The variational analysis used to derive the domain-mean large-scale forcing data set at 10-mb vertical resolution and 3-h temporal resolution (centered in time) is based on inputs that include surface heat and radiative fluxes and C-POL, VISST, MWR, and ECMWF products listed above. As described by Xie et al. [2010], environmental profiles in the large-scale forcing data set are an integration of available soundings with analysis products. Data set components used here are domain-mean surface precipitation rate, environmental profiles, surface latent and sensible heat fluxes, and TOA and surface radiative fluxes.

4.8. The 3D Ice Water Content (3D-IWC) Retrieval

[29] The 3D-IWC retrieval employs a Bayesian algorithm to retrieve IWC and IWP from MWR, cloud radar, and sounding measurements at Darwin, and high-frequency microwave data collected on NOAA satellites from the Advanced Microwave Sounding Unit - B (AMSU-B) [Seo and Liu, 2005, 2006]. The retrieval algorithm includes a range of ice types and the input observations are variably sensitive to all ice types (see discussion in section 5.2). Retrievals are reported at the temporal resolution of available satellite overpasses at ∼16–25 km spatial resolution within 10° latitude and longitude of Darwin. An uncertainty is provided for each reported value as described by Seo and Liu [2006]. For each available swath that spans the TWP-ICE domain, measurements and uncertainty are averaged over pixels identified within the domain. Averaging uncertainty of point measurements in this manner is equivalent to assuming that all errors are perfectly correlated. Statistics calculated are thus domain-mean IWC profiles and IWP. Mean uncertainty in IWP, which is strongly dependent upon amount of ice present [Seo and Liu, 2006], is just under 20% during the active period, roughly 600% during the more dormant suppressed period, and just under 40% when averaging over the full reported simulation period (19.5–34). Ice water content profiles are considered qualitatively.
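To make the consequence of the averaging choice concrete, the sketch below contrasts the uncertainty of a domain mean when pixel errors are perfectly correlated (a plain average of the pixel uncertainties, as described above) with the smaller value that would follow if errors were independent. The independent case is shown only for comparison and is not used in this study; the pixel uncertainties are hypothetical.

```python
import numpy as np


def mean_uncertainty_correlated(sigmas):
    """Uncertainty of the mean when all errors are perfectly correlated."""
    return float(np.mean(np.asarray(sigmas, dtype=float)))


def mean_uncertainty_independent(sigmas):
    """Uncertainty of the mean when errors are mutually independent."""
    sigmas = np.asarray(sigmas, dtype=float)
    return float(np.sqrt(np.sum(sigmas ** 2)) / sigmas.size)


# Hypothetical per-pixel IWP uncertainties (kg m-2) within the domain.
pixel_sigma = [0.03, 0.04, 0.05, 0.04]
print(mean_uncertainty_correlated(pixel_sigma))
print(mean_uncertainty_independent(pixel_sigma))
```

The perfectly correlated assumption gives the larger of the two estimates and is therefore the more conservative choice.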

[30] In order to account for the temporal sparseness of retrievals dependent upon polar orbiting satellites, simulations are sampled at the frequency of available 3D-IWC retrievals whenever quantitative comparisons are made between retrievals and simulations. This introduces a degree of inconsistency when plotting observed and simulated IWP versus, for example, observed and simulated OLR, in terms of temporal sampling. However, we find that the temporal sampling is important to the quantitative comparison of observations and simulations of IWP (and therefore it should be done to properly consider whether simulations are within the uncertainty of retrievals) but it does not qualitatively impact the arrangement of ensemble members in correlation plots considered here (thus retrievals are qualitatively representative despite relatively low sampling frequency).
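A minimal sketch of such subsampling is given below. It assumes nearest-neighbor matching in time between the regularly reported simulation output and the irregular overpass times; the matching rule actually used is not specified in the text, and the function name and values are hypothetical.

```python
import numpy as np


def subsample_at(sim_times, sim_values, obs_times):
    """Pick the simulation sample nearest in time to each retrieval."""
    sim_times = np.asarray(sim_times, dtype=float)
    sim_values = np.asarray(sim_values, dtype=float)
    obs_times = np.asarray(obs_times, dtype=float)
    # For each retrieval time, index of the closest simulation time.
    nearest = np.abs(sim_times[None, :] - obs_times[:, None]).argmin(axis=1)
    return sim_values[nearest]


# Hypothetical simulation series and two overpass times (arbitrary units).
sim_t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
sim_iwp = [0.0, 10.0, 20.0, 30.0, 80.0, 50.0]
overpasses = [1.2, 3.9]
sampled = subsample_at(sim_t, sim_iwp, overpasses)
print(np.mean(sim_iwp), sampled.mean())
```

In this toy case the full-series mean and the subsampled mean differ, which is the reason quantitative comparisons with sparse retrievals are made on the subsampled series.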

5. Results

5.1. Precipitation Features

[31] The surface precipitation rate over the TWP-ICE domain obtained from C-POL measurements (see section 4.1) is an input to the variational analysis (see section 4.7) and a principal indicator of the strong large-scale ascent that is dominant during the active period (e.g., Figure 3). It is therefore not surprising that models reproduce the temporal evolution of domain-mean surface precipitation under the strongly forced active conditions (Figure 4a), consistent with similar past studies [e.g., Xu et al., 2002; Xie et al., 2005; Woolnough et al., 2010]. Figure 5 shows that every simulation reproduces mean surface precipitation rate within the uncertainty of retrievals during the active period, and in a manner that is closely correlated with total liquid water path (LWP, defined throughout this study as cloud plus rainwater). Aside, we note that Figure 5 is the first of several figures in which one domain-mean quantity is plotted against another. These plots illustrate the degree to which two quantities are related across the ensemble of simulations. If observations are available, they also concisely illustrate whether simulated quantities fall within the range of observational uncertainty and whether deviations from observations are correlated. In all such figures, results for the active and suppressed period are plotted separately owing to commonly differing patterns. During the weakly forced suppressed period in Figure 5, for instance, all simulations overestimate surface precipitation except SAM-2Ms, unlike during the active period. However, even when the domain mean of surface precipitation is robustly reproduced under strongly forced conditions, we find the following evidence that the underlying structural features of simulated precipitation fields differ substantially across models, associated with large differences in radiative flux terms.

Figure 3.

Leading terms in the water vapor budget profiles (expressed in terms of latent heat) over (left) the active monsoon period and (right) the suppressed monsoon period from the DHARMA-2M baseline simulation example: net flux convergence from large-scale (LS) vertical advection, net condensation (including deposition and sublimation), local vertical mixing (including resolved and subgrid-scale), and LS horizontal advection.

Figure 4.

Precipitation rates: (a) simulated 3-h domain mean at the surface compared with large-scale forcing data derived from 2.5-km C-POL retrievals, (b) simulated 10-min domain mean at 2.5-km elevation compared with C-POL retrievals, (c) simulated 10-min domain maximum at 2.5-km elevation and 2.5-km horizontal resolution compared with C-POL retrievals, and (d) simulated 1-h domain maximum at the surface at 55-km horizontal resolution compared with ECMWF analyses and C-POL. Listed in parentheses are the mean and maximum of plotted values.

Figure 5.

Simulated surface precipitation rate versus liquid water path (LWP, defined as cloud plus rainwater). Domain means averaged over (left) active period (19.5–25.5) and (right) suppressed period (27.5–33.5), respectively (see Figure 4). Degree of correlation is given as the Spearman rank coefficient. Precipitation rate from C-POL retrievals (dotted lines) shown with estimated uncertainty range (shading). Domain-wide LWP not observed (see section 5.2).
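For readers unfamiliar with the Spearman rank coefficient quoted in this and later figures, it is the Pearson correlation of the ranks of two quantities across the ensemble. The sketch below is a minimal implementation that assumes no tied values; the five paired values are hypothetical and do not represent any simulation in this study.

```python
import numpy as np


def ranks(x):
    """Ranks 1..N of the values in x (assumes no tied values)."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x)
    r = np.empty(x.size, dtype=float)
    r[order] = np.arange(1, x.size + 1)
    return r


def spearman(x, y):
    """Spearman rank coefficient: Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])


# Hypothetical domain means for five ensemble members.
lwp = [0.30, 0.42, 0.36, 0.50, 0.45]        # kg m-2
precip = [0.71, 0.74, 0.75, 0.78, 0.73]     # mm h-1
print(spearman(lwp, precip))
```

Because only ranks enter the calculation, the coefficient measures whether two quantities are monotonically related across ensemble members without assuming a linear relationship.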

[32] Because models can report precipitation rate at the 2.5-km altitude and 2.5-km resolution of the C-POL retrievals, it is possible to closely compare precipitation rate statistics. Before doing so we note that domain-mean precipitation rates are 10–20% greater at 2.5-km than at the surface in all reporting models except EULAG (comparing Figures 4a and 4b). Although mean C-POL reflectivity is actually lower at 2.5 km than at 0.5 km during this period (not shown), consistent with past measurements in tropical maritime convection [Houze et al., 2004], it is unknown whether the actual precipitation rate is lower or higher at the surface in this case. Any such actual difference can be viewed as a source of uncertainty in the large-scale forcing data set, as discussed further below. With respect to the frequencies of precipitation rate at 2.5-km resolution and 2.5-km altitude (Figure 6), the most apparent feature is that all reporting models overestimate occurrences in the light 0.2–2 mm h−1 range during both active and suppressed periods. Compared with DHARMA-1, this tendency is notably reduced in the DHARMA-1s sensitivity test simulation that is nudged toward domain-mean thermodynamic profiles throughout the free troposphere, suggesting that baseline simulations produce rain that is too widespread owing in part to deviations of simulated mean temperature and water vapor profiles from those observed. EULAG-2s shows a similar but weaker trend during the active period. In a more detailed analysis of precipitation statistics in 3D simulations from this study, Varble et al. [2011] found that total precipitating area matches observations to within 1–2% in DHARMA-1s and SAM-2Ms sensitivity test simulations during the active monsoon period, whereas baseline simulations overestimate precipitating area by 35–65%.

Figure 6.

Precipitation rate statistics simulated and observed at 2.5-km resolution and 2.5-km elevation. Shaded section of first bar in each range indicates minimum and maximum frequency considering uncertainties in C-POL retrievals (see section 4.1). In parentheses is summed mean occurrence frequency of all rates > 0.2 mm h−1.
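Occurrence frequencies of this kind can be computed as the fraction of all pixels falling in each rate range. In the sketch below only the 0.2 and 2 mm h−1 edges come from the text; the remaining bin edges, the function name, and the sample rates are hypothetical.

```python
import numpy as np


def occurrence_frequency(rates, edges):
    """Fraction of all pixels with rate in each [edges[i], edges[i+1])."""
    rates = np.asarray(rates, dtype=float).ravel()
    return np.array([
        np.mean((rates >= lo) & (rates < hi))
        for lo, hi in zip(edges[:-1], edges[1:])
    ])


# Hypothetical pixel-level rates (mm h-1) at 2.5-km resolution.
rates = [0.0, 0.0, 0.1, 0.5, 1.0, 1.9, 3.0, 25.0, 0.0, 0.0]
edges = [0.2, 2.0, 20.0, np.inf]   # only 0.2 and 2.0 are from the text
freq = occurrence_frequency(rates, edges)
print(freq, freq.sum())
```

Normalizing by all pixels, including non-raining ones, means the summed frequency is the total precipitating area fraction.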

[33] Thus the baseline simulations systematically produce light rain that is far more widespread than observed, and this particular error is ameliorated to some degree with 6-h nudging of the domain-mean tropospheric thermodynamic profiles. It is possible that such overly widespread light precipitation could arise or persist from horizontally uniform domain-wide application of large-scale forcing terms, which are relatively strong in the TWP-ICE data set [cf. Hagos, 2010]. Surface rainfall frequencies were by contrast found to be in reasonable agreement with C-POL observations in a study of the TWP-ICE time period using a LAM with open boundary conditions [Wapler et al., 2010]. Comparison of these CRM results with the relative frequencies of light and heavy rain in the associated LAM intercomparison study should shed light on the effects of differing boundary conditions and large-scale forcing approach. Those used in this study are more similar to a cloud-resolving convection parameterization approach in GCMs [e.g., Grabowski, 2001], whereas those used in the LAM study are more similar to a global cloud-resolving model.

[34] Compared with the excessive frequencies of lighter rain rates, the frequencies of rain rates >2 mm h−1 are generally reproduced better by the simulations (see Figure 6). However, the mean of simulated maximum precipitation rates (based on the 10-min sampling of models at identical 2.5-km horizontal resolution and elevation as retrievals) varies over a factor of ten range from roughly 10 mm h−1 in the 2D EULAG simulations to roughly 50–100 mm h−1 in reporting 3D simulations, as compared with roughly 40 mm h−1 retrieved from C-POL (Figure 4c). As the high-intensity tail of the frequency distribution, domain-wide maximum rain rate is appealing mostly because it is easily sampled in models and observations. That typical simulated peak rain rates increase with dimensionality is consistent with past findings that updraft strength and vertical mass fluxes increase with dimensionality [e.g., Phillips and Donner, 2006; Petch et al., 2008; Zeng et al., 2008], although the smaller sample size in 2D simulations is expected to reduce the likelihood of generating typical domain-wide maxima. Sample sizes are nearly identical in 3D simulations and observations, but peak rain rates in 3D simulations are systematically higher than those retrieved, which could be associated with overly broad and intense updrafts at the ∼1-km horizontal resolution of most simulations [e.g., Bryan and Morrison, 2012]. We note that although pixel-level C-POL retrievals never exceed about 140 mm h−1, maximum 10-min-mean tipping bucket gauge and disdrometer measurements do exceed 140 mm h−1 during events A, B and C, and at other times (not shown). Since convective core regions are the most important sources of rainfall and are locations where precipitation efficiency influences the production of longer-lived convective outflow aloft, systematic discrepancies between retrievals and simulations warrant further study.

[35] The structure of precipitation fields in the 3D simulations can also be compared with C-POL measurements by applying a textural algorithm to objectively identify the domain fraction covered by convective and deep stratiform rain in simulated and observed radar reflectivity fields (see Appendix B and examples in Figure 7). Results indicate that convective area fraction is commonly overpredicted by up to a factor of about two (Figure 8a), indicating that regions with strong updrafts are systematically too large or too frequent or both. Two baseline simulations lie within the 15% relative uncertainty of observed convective area during the active period, and only one sensitivity test does during the suppressed period (Figure 9). The maximum convective area fraction observed, ∼30% during event C, is larger than typical maxima of ∼20% observed over larger domains at Darwin and elsewhere [e.g., Frederick and Schumacher, 2008; Holder et al., 2008]. Although differences in area identification algorithms and source data resolution do significantly impact area calculations [e.g., Steiner et al., 1995; Yuter et al., 2005], an exaggerated maximum in this case could be attributable at least partly to the MCS of event C covering an area substantially larger than the TWP-ICE domain. The maximum convective area fraction is never as strongly overestimated as the time average, possibly reflecting physical limits on convective fraction under given environmental conditions [cf. Holder et al., 2008].

Figure 7.

Convective (black) and stratiform (shaded) area fractions (see Appendix B) identified at day 20.125 from (a) C-POL measurements over the TWP-ICE domain and (b-k) simulations.

Figure 8.

Simulated and observed (a) convective and (b) stratiform area fractions at 2.5-km elevation and 2.5-km resolution. Listed in parentheses are the mean and maximum of plotted values.

Figure 9.

Stratiform area versus (top) convective area, (middle) ice water path (IWP), and (bottom) outgoing longwave radiation (OLR). Averaging times, symbols, and Spearman rank coefficients as in Figure 5. Domain means of convective and stratiform area fractions, OLR, and IWP from C-POL, VISST, and 3D-IWC retrievals (dotted lines) shown with estimated uncertainty ranges (shading).

[36] Whereas the mean convective area fractions are consistently overpredicted, the stratiform area fractions range from being underpredicted to overpredicted with relatively wider ranges (Figure 8b), indicating that stratiform outflows and evolution are more sensitive to model differences. The narrow range of uncertainty in stratiform area (Figure 9) is likely substantially smaller than the uncertainty in the large-scale forcings driving the simulations (e.g., ∼25% uncertainty in surface precipitation rate). Compared to their respective baseline simulations, stratiform area is substantially reduced in the SAM-2Ms sensitivity test but not in DHARMA-1s, suggesting a microphysics-dependent sensitivity to tropospheric moisture and temperature. Aside, we note that stratiform area fraction can be underpredicted and light precipitation rate frequencies simultaneously overpredicted when the light rain is originating from shallow clouds rather than deep stratiform clouds (see Appendix B). Within a given dynamics model (e.g., DHARMA, UKMO), two-moment microphysics schemes based on Morrison et al. [2009] produce similar or greater stratiform areas than one-moment schemes, consistent with past results [Morrison et al., 2009; Luo et al., 2010; Bryan and Morrison, 2012]. But baseline simulations with versions of the same two-moment scheme also differ substantially (e.g., stratiform fraction is notably larger in DHARMA-2M than in UKMO-2M), suggesting that dynamics also plays a role. Prognosing droplet number concentration evidently does not produce a strongly distinguishing effect in Figure 8 (Nc prognosed using differing approaches in DHARMA-2M, MESONH-2, and SAM-2M), but could be associated with larger stratiform fractions in DHARMA-2M and SAM-2M than in UKMO-2M.

[37] Across the ensemble during the active period, it is notable that stratiform area fractions ranging over ∼25–60% are poorly correlated with either convective area fractions or ice water path (IWP, defined throughout this study as the sum of all ice-phase hydrometeors, including cloud ice, snow, and graupel; see Table 2). However, the stratiform area fraction is strongly anticorrelated with a 60 W m−2 range of predicted outgoing longwave radiation (OLR, Figure 9), indicating that the spatial distribution of ice controls simulated OLR more than domain-mean IWP. It is also notable that convective area predictions tend to be better during the onset of convective events than during the decay (e.g., during event C over 23–24 January in Figure 8a), which may be at least in part attributable to the fact that mature cells can pass out of the observational domain whereas periodic boundary conditions require their decay to be completed within the modeling domain. During the suppressed period, stratiform area fractions are less than 15% in all simulations (variably underestimated and overestimated); they are more poorly correlated with OLR than during the active period owing in part to a wider range of high cloud fraction (see section 5.2).

[38] The observed mean ratio of 0.86 for stratiform to stratiform plus convective area found in this study is higher than the active, suppressed, and experiment wide values of 0.75–0.79 during TWP-ICE over the full C-POL domain reported by Frederick and Schumacher [2008, Table 2] and at the upper limit of the 0.66–0.86 range reported for various tropical regions [Holder et al., 2008]. The observed ratio could be higher owing to (1) poor Eulerian sampling of a small number of Lagrangian events passing through the geographically limited TWP-ICE observational domain and (2) differences in observational data characteristics such as horizontal grid resolution. Aside, we note that the additional requirement on stratiform area in our algorithm of a minimum reflectivity above the melting level would tend to reduce rather than increase the ratio, all else being equal (see Appendix B). The simulations exhibit stratiform to stratiform plus convective ratios of 0.73 (DHARMA-1s and MESONH-1) to 0.89 (DHARMA-2M), roughly spanning the observational range over tropical regions, and thus none appear to be strong outliers by this simple metric.

[39] Although it is beyond the scope of this paper to examine the details of convection organization, simulations with more linear convection features appear to produce larger stratiform fractions (compare Figures 7 and 8). More specifically, simulations with little organization exhibit the least stratiform area (DHARMA-1 and MESONH simulations), whereas those with the most linear squall lines tend to exhibit the greatest stratiform area (e.g., DHARMA-2M, SAM, and UKMO simulations; observed conditions appear more similar to this latter class). Such simulation tendencies are roughly consistent with observations of greater stratiform rainfall being associated with linear organization in tropical systems [Rickenbach and Rutledge, 1998], although in this case model physics is responsible rather than environmental conditions (as evidenced by the difference between DHARMA-1s and SAM-2Ms, despite both being nudged to observed conditions throughout the troposphere). Convective area does not appear to be closely associated with degree of linear organization (e.g., both MESONH and UKMO simulations span a wide range of convective area fractions in Figure 9). Thus it appears that the simulated convection organization mode could be more closely associated with stratiform rain generation than absolute convective area. The simulated degree of linear organization can in turn be substantially modified by microphysics (e.g., DHARMA-1 versus DHARMA-2M), consistent with past modeling results [e.g., Lynn et al., 2005].

[40] Finally, owing to the use of ECMWF analyses to drive the TWP-ICE LAM intercomparison, we briefly consider the relationship among ECMWF, observed, and simulated maximum precipitation rates at comparable horizontal scales. Taking 55-km resolution as roughly that of the ECMWF analyses, we find that the maximum of peak surface precipitation rates in the local ECMWF fields is about one-third of that retrieved from C-POL (Figure 4d). During each major event (A, B, and C), maximum intensity in ECMWF fields is lower than observed by an amount that far exceeds the C-POL observational uncertainty of 25%. Reduced maximum intensity in ECMWF fields is compensated by more frequent mid-range precipitation rates, as evidenced by less than 10% difference between the mean of peak intensities at 55-km resolution from ECMWF and C-POL fields. This pattern of overly frequent rainfall events with overall maximum intensity substantially lower than observed is consistent with extensive recent comparisons of ECMWF and other global models with CloudSat data [Stephens et al., 2010]. The reporting CRM simulations, on the other hand, predict maximum values of 55-km-resolution precipitation rates that are greater than or within experimental uncertainty of that derived from C-POL. The reporting 3D simulations also sustain such rates more commonly than observed, as evidenced by mean peak 55-km intensities that are roughly 50% too high. Alongside variably high 2.5-km peak intensities and excessive convective area (Figures 4c and 8a), this provides additional evidence that convective precipitation structures in 3D simulations tend to be too intense or extensive.
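The effect of horizontal scale on peak rates can be illustrated by block-averaging a fine-resolution field before taking its maximum. The sketch below is a minimal example assuming non-overlapping square blocks (for instance, 22 × 22 cells of 2.5 km to approximate 55 km); the averaging procedure actually applied to the retrievals and simulations is not detailed here, and the small field shown is hypothetical.

```python
import numpy as np


def block_mean(field, factor):
    """Average a 2D field over non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are discarded.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    ny_c, nx_c = ny // factor, nx // factor
    trimmed = field[: ny_c * factor, : nx_c * factor]
    return trimmed.reshape(ny_c, factor, nx_c, factor).mean(axis=(1, 3))


# Hypothetical 4 x 4 rain-rate field (mm h-1), coarsened by a factor of 2.
fine = np.array([
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 0.0, 4.0, 4.0],
    [8.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 40.0],
])
coarse = block_mean(fine, 2)
print(fine.max(), coarse.max())
```

An isolated intense pixel is strongly diluted by averaging, so maxima at coarse resolution reflect the areal extent of heavy rain as well as its intensity.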

5.2. Condensate, Latent Heating, and Cloud Cover

[41] We next consider observational constraints on the domain-wide column and profiles of condensate, latent heating, and cloud cover. First, we find that observations are unfortunately too sparse to provide a robust constraint on LWP. MWR retrievals are available only at Darwin and on the ship, and are available only when surface precipitation is less than ∼0.02 mm h−1 (D. Turner, personal communication, 2008). In simulations that reported non-precipitating LWP (domain mean of contributions from columns where surface precipitation <0.02 mm h−1, Figure 10a), it constitutes ∼5–25% of total LWP (see Figure 5). By comparison, cloud water alone accounts for ∼25–50% of total LWP (not shown; no observational analog). The domain-mean non-precipitating LWP is therefore a relatively small fraction of total LWP and is more variable across simulations than domain-mean cloud or rainwater path or their sum, which could be attributable to differences in the simulated frequency of light precipitation. Satellite microwave-based retrievals of LWP are expected to be strongly influenced by assumptions regarding cloud and rainwater partitioning in this region of high average rainfall [O'Dell et al., 2008], and are beyond the scope of this work to assess.

Figure 10.

(a) Simulated non-precipitating LWP compared with MWR retrievals (see text), (b) IWP compared with 3D-IWC retrievals, and (c) daytime IWP compared with VISST retrievals. Listed in parentheses are the mean and maximum of plotted values. Simulation IWP statistics listed in Figure 10b are calculated after subsampling at the observational frequency (see section 4.8).

[42] Although LWP remains thus unconstrained, simulated IWP can be robustly compared with 3D-IWC retrievals, which are based on a synthesis of polar-orbiting satellite and ground-based measurements. The CRMs reproduce the temporal evolution of domain-mean IWP from both 3D-IWC and VISST retrievals quite well (Figures 10b and 10c). All simulated values below 0.2 kg m−2 lie within the uncertainty of VISST retrievals during daytime, and over the active period only DHARMA-1 and DHARMA-1s overestimate VISST daytime mean IWP by more than one-third, which is the estimated amount that it may be biased low owing to maximum retrievable optical depth (see section 4.2). However, simulated IWP is systematically greater than 3D-IWC retrievals, and only in the 2D baseline simulations (EULAG-2, ISUCRM-2) is domain-mean IWP over days 19.5–34 just within the associated uncertainty of 40%. Across all baseline simulations, the ratio of IWP to LWP is also notably higher in 3D than in 2D (Figure 11) because LWP is higher and IWP lower in all 2D versus all 3D baseline simulations. Aside, we note that over the (high-IWP) active period, when the 3D-IWC retrieval uncertainty is relatively lower, no simulations actually lie within the associated uncertainty range, whereas roughly half of simulations lie within the far larger uncertainty under the (low-IWP) conditions of the suppressed period. With respect to the vertical distribution of IWP, the CRMs locate most ice mass in the ∼5–13-km altitude range during the active period, qualitatively consistent with 3D-IWC retrievals (Figure 12a). But the CRM results tend to exceed 3D-IWC retrievals by up to a factor of two at those elevations, commonly by more above 13 km. During the suppressed period, the CRMs predict up to an order of magnitude more IWC than retrieved in the 5–13-km altitude range and many show a secondary peak above 13 km that does not appear in the retrievals (Figure 12b).

Figure 11.

IWP versus LWP (including cloud water and rain). Averaging times, symbols, and Spearman rank coefficients as in Figure 5. Domain means of IWP derived from 3D-IWC retrievals (dotted lines) shown with estimated uncertainty ranges (shading). Simulation IWP is subsampled at the observational frequency (see section 4.8).

Figure 12.

Domain-mean ice water content profiles (including cloud ice, snow, and graupel) averaged over (a) active period (19.5–25.5) and (b) suppressed period (27.5–33.5) compared with 3D-IWC retrievals. Simulation IWC is subsampled at the observational frequency (see section 4.8). Symbols as in Figure 5.

[43] One conceivable explanation for the systematic difference between simulated IWP and 3D-IWC retrievals is a possible lack of sensitivity of those retrievals to dense ice contributions from convective core regions, which could arise since the input vertically pointing millimeter cloud radar data are interpreted using the properties of cloud ice and snow [Seo and Liu, 2006]. Both millimeter radar and satellite microwave radiometer also lack sensitivity to thin ice clouds, which could lead to underestimates in IWP particularly during the suppressed period. Robust methods of comparing models with measurements using retrievals of varying sensitivity to cloud, snow, and dense ice contributions are not yet generally in hand even when ice classes and properties are well-defined as in CRMs [e.g., Waliser et al., 2009]. A detailed analysis of both retrieval inputs and model outputs would be required to quantitatively assess whether this explanation can account for the systematic differences seen between retrievals and particular simulations in this case. Further work to establish the robust use of microwave-based remote-sensing measurements to constrain CRM and LAM simulations should have a high priority given the sensitivity of results to poorly constrained ice microphysical processes [e.g., Wu et al., 2009; Fan et al., 2010b; Morrison and Milbrandt, 2011] and the considerable potential of such measurements to constrain simulation results [e.g., Matsui et al., 2009; Waliser et al., 2009].

[44] Offset a bit lower than simulated and retrieved IWC peaks, simulated latent heating rates peak at ∼4–12 km, consistent with retrievals from C-POL measurements (see section 4.1 and Figure 13). Almost all simulations agree remarkably well with retrievals above ∼8 km during the active period. Since the vertical integral of latent heating rate is nearly equivalent to the reported surface precipitation rate in all plotted simulations, the larger heating rates above 10 km in SAM simulations appear to be reliably reported features, as discussed further below. In the SAM-2M baseline simulation, the divergence from latent heating rate retrievals above 12 km can be traced to event C alone, whereas in the SAM-2Ms sensitivity test, latent heating rate diverges from retrievals in all three convective events A–C, consistent with greater updraft speeds and vertical mass fluxes in sensitivity test simulations (not shown). At the melting level (∼5 km), some simulations exhibit a sharp localized reduction in latent heating rate (all DHARMA and MESONH simulations) whereas the others do not. Maximum latent heating rates also appear to fall into two groups during the active period: those that exceed 20 K d−1 (MESONH and SAM simulations) and those that do not (all others). Most peak rates in both groups fall within the minimum expected retrieval uncertainty of roughly 25%. In contrast to the relative consistency of simulated and observed latent heating rate profiles during the active period, simulations deviate variously from retrievals during the suppressed period, consistent in part with differences between observed and simulated surface precipitation rates (see Figure 5).

Figure 13.

Latent heating rate simulated during (a) active and (b) suppressed periods compared with retrievals from C-POL (normalized as described in section 4.1). Symbols as in Figure 5.
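The near equivalence between the vertical integral of latent heating rate and the surface precipitation rate noted above can be checked with a simple column budget, P ≈ (c_p / L_v) ∫ ρ Q dz. The sketch below is an approximation under stated assumptions: constant c_p and L_v values that are not taken from the study, a single latent heat for all phase changes (ice-phase contributions are ignored), and a hypothetical uniform profile.

```python
import numpy as np

CP = 1004.0   # J kg-1 K-1, specific heat of dry air (assumed value)
LV = 2.5e6    # J kg-1, latent heat of vaporization (assumed value)


def precip_from_heating(q_k_per_day, rho, z):
    """Surface precipitation rate (mm h-1) implied by a heating profile.

    q_k_per_day: latent heating rate (K d-1); rho: air density (kg m-3);
    z: heights (m). Uses P = (CP / LV) * integral(rho * Q dz).
    """
    q = np.asarray(q_k_per_day, dtype=float) / 86400.0   # K s-1
    rho = np.asarray(rho, dtype=float)
    z = np.asarray(z, dtype=float)
    integrand = rho * q
    # Trapezoidal integration over height.
    column = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(z))
    return float(CP / LV * column * 3600.0)   # kg m-2 s-1 -> mm h-1


# Hypothetical profile: 10 K d-1 between 0 and 10 km, density 0.7 kg m-3.
z = np.linspace(0.0, 10000.0, 11)
rate = precip_from_heating(np.full(11, 10.0), np.full(11, 0.7), z)
print(rate)
```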

[45] Finally, we note that although cloud cover is available from both ground-based and satellite-based measurements, it is difficult to use retrievals as a quantitative constraint on simulations for two reasons. First, conditions are often continuously overcast in both retrievals and simulations (Figure 14), as during the active period, thus providing little signal. And second, when both VISST and TSI retrievals indicate that clear-sky regions are present, as during the suppressed period, a robust quantitative comparison of, say, minimum cloud cover obtained by VISST (0.02), TSI (0.1), and DHARMA-1 (0.16) cannot be made owing to fundamental differences in the definition of cloudiness. Since the model definition of cloud cover (a grid cell mixing ratio of ice plus cloud water >10−6 kg kg−1) and the two measurement-based definitions of cloud cover are not easily reconciled quantitatively, this is a problem that probably requires forward-simulation approaches that are beyond the scope of this work [e.g., Henderson and Pincus, 2009].

Figure 14.

Simulated cloud cover (defined as domain fraction overlain by grid cells with combined liquid and ice condensate excluding rain in excess of 10−6 kg kg−1) compared with (a) the fraction of domain grid cells identified as cloudy in VISST retrievals or (b) the opaque cloud cover derived from TSI measurements. Listed in parentheses are the mean and minimum of plotted values.

[46] Nonetheless, it can be seen that minimum cloud cover varies over 2–96% across the model ensemble compared with 2–10% across the VISST and TSI retrievals. In the profile of cloud fraction, model differences can be traced primarily to extent of ice cloud fraction above the freezing level (Figure 15). Above 15 km during the suppressed period, overcast conditions persist in all baseline simulations that include ice nucleation directly from the vapor phase in the deposition mode, whether based on supercooling or supersaturation (see section 3), with the apparent exception of UKMO-2A and UKMO-2B (see Table 2). The coverage of the persistent cloud layer aloft peaks at ∼15 km, where large-scale forcings are linearly diminished owing to uncertainties in derived values (see section 2 and discussion by Fridlind et al. [2010]), but extending nudging to lower elevations in the sensitivity tests also substantially reduces cloud fraction aloft if it is present in the baseline simulation (e.g., EULAG-2s and SAM-2Ms), indicating a role for convective outflow. Water vapor mixing ratios in excess of the large-scale forcing data set conditions above 15 km tend to occur in SAM and UKMO-2M simulations but not in others, indicating that they are not a determining factor (not shown; results near the tropopause will be compared with radiosonde and in situ water vapor measurements in future work). During the active phase, EULAG is the only model that sustains a high cloud layer. However, the other models sustaining high cloud aloft during the suppressed period also produce the highest cloud tops during the active phase. Overall, high cloud occurrence clearly varies with microphysics scheme (e.g., across DHARMA and UKMO simulations), and the differences are likely attributable in part specifically to treatments of ice nucleation, consistent with past findings [e.g., Fan et al., 2010b].

Figure 15.

Cloud fraction profiles simulated during (a) active and (b) suppressed periods, where cloudy grid cells are defined as those containing combined liquid and ice condensate excluding rain in excess of 10−6 kg kg−1. Symbols as in Figure 5.
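
The cloud cover and cloud fraction definitions used in the captions of Figures 14 and 15 can be illustrated with a minimal sketch. The array layout (level, y, x), the function names, and the synthetic mixing ratios below are assumptions made for illustration; they are not taken from the archived model output.

```python
import numpy as np

THRESHOLD = 1.0e-6  # kg/kg, condensate threshold quoted in the captions


def cloud_mask(q_liquid, q_ice, threshold=THRESHOLD):
    """Cloudy cells: cloud liquid plus ice (rain excluded) above the threshold."""
    return (q_liquid + q_ice) > threshold


def cloud_fraction_profile(mask):
    """Fraction of cloudy cells at each level; mask has assumed shape (nz, ny, nx)."""
    return mask.mean(axis=(1, 2))


def cloud_cover(mask):
    """Fraction of columns overlain by at least one cloudy cell."""
    return mask.any(axis=0).mean()


# Synthetic example: 3 levels and 2 x 2 columns (values are made up).
q_liq = np.zeros((3, 2, 2))
q_ice = np.zeros((3, 2, 2))
q_liq[0, 0, 0] = 5.0e-6  # low cloud in one column
q_ice[2, 0, 0] = 2.0e-6  # ice above the same column
q_ice[2, 1, 1] = 3.0e-6  # ice in a second column

mask = cloud_mask(q_liq, q_ice)
profile = cloud_fraction_profile(mask)  # per-level fractions: 0.25, 0.0, 0.5
cover = cloud_cover(mask)               # 2 of 4 columns are cloudy: 0.5
```

Because cloud cover counts a column once regardless of how many cloudy levels it contains, it can differ substantially from the maximum of the cloud fraction profile when cloud layers do not overlap.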

5.3. Precipitable Water Vapor, Moist Static Energy, and Radiative Fluxes

[47] We have noted already that model overprediction of OLR during the active period is closely correlated with the properties of deep stratiform clouds in the 3D simulations (see Figure 9). Varble et al. [2011] have demonstrated that simulated TOA 10.8-μm brightness temperature features are influenced by variable high-level anvil ice outside of precipitating stratiform areas, so this correlation is not the result of singular and direct causation. However, all else being equal, it is expected that sustained overprediction of OLR could lead to excessive radiative cooling of the troposphere during the course of simulations. Given near-saturated conditions, such cooling could increase condensation and subsequent precipitation, thereby reducing precipitable water vapor (PWV). In fact, PWV does fall to levels often persistently more than 5 kg m−2 lower than observed in most baseline simulations (Figure 16a). The troposphere has also cooled more than observed in most baseline simulations, as evidenced by biases in mass-weighted dry static energy averaged over 0–17 km (Figure 16b). In contrast to baseline simulations, which are free-running below 15 km, sensitivity test simulations are nudged throughout the troposphere, guaranteeing little deviation from observed PWV and dry static energy.

Figure 16.

Simulated (a) precipitable water vapor (PWV) and mass-weighted (b) dry and (c) moist static energy averaged over 0–17 km. Static energies are normalized by the specific heat of dry air at constant pressure. Also shown are values derived from the large-scale forcing data set profiles, which differ from simulations at day 19.5 since the first 36 h of simulations were disregarded as spin-up (see section 2). Listed in parentheses are the means during active and suppressed periods.

[48] Because of variations in thermodynamic evolution that are apparent in Figure 16, the free-running baseline simulations enter the suppressed period with a broad range of mean conditions. We therefore focus the remainder of this section on the active period, when simulated PWV values are well correlated with their rates of decline (Figure 17). We also focus on quantities that impact mass-weighted mean tropospheric moist static energy (MSE, Figure 16c), which is negligibly different from a frozen moist static energy in this case [e.g., Blossey et al., 2007]. Since MSE is conserved during hydrostatic adiabatic condensation and evaporation processes, differences in the simulated distribution of total water between vapor and condensate (including precipitation) do not directly modify it [e.g., Bretherton et al., 2006]. In our modeling framework, tropospheric MSE changes between two points in time therefore can be attributed to the accumulated sum of surface turbulent heat fluxes, tropospheric radiative flux divergence, and large-scale forcing and nudging terms [cf. Blossey et al., 2007].
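
As a rough illustration of the quantities plotted in Figure 16, the sketch below computes dry and moist static energy normalized by the specific heat of dry air, then averages them with mass weighting up to 17 km. The constants, function names, and the three-level profile are assumptions chosen for illustration and do not reproduce any model's actual diagnostics.

```python
import numpy as np

CP = 1004.0  # J kg-1 K-1, specific heat of dry air (assumed value)
G = 9.81     # m s-2, gravitational acceleration (assumed value)
LV = 2.5e6   # J kg-1, latent heat of vaporization (assumed constant)


def static_energies_over_cp(temperature, height, qv):
    """Return dry and moist static energy divided by CP, in kelvin."""
    dry = temperature + G * height / CP
    moist = dry + LV * qv / CP
    return dry, moist


def mass_weighted_mean(values, density, dz, height, z_top=17.0e3):
    """Average over layers at or below z_top, weighted by layer mass."""
    in_layer = height <= z_top
    weights = (density * dz)[in_layer]
    return np.sum(values[in_layer] * weights) / np.sum(weights)


# Made-up three-level profile; the top level lies above 17 km and is excluded.
height = np.array([500.0, 1500.0, 18000.0])   # m
temperature = np.array([295.0, 289.0, 200.0])  # K
qv = np.array([0.016, 0.012, 0.0])             # kg/kg
density = np.array([1.1, 1.0, 0.12])           # kg m-3
dz = np.array([1000.0, 1000.0, 1000.0])        # m

dry, moist = static_energies_over_cp(temperature, height, qv)
mean_dry = mass_weighted_mean(dry, density, dz, height)
mean_moist = mass_weighted_mean(moist, density, dz, height)
```

Moving water between vapor and condensate changes the dry term and the latent term by offsetting amounts, which is why the moist quantity is the more convenient one for budget attribution.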

Figure 17.

Precipitable water vapor (PWV) versus rate of change in PWV. Averaging times, symbols, and Spearman rank coefficients as in Figure 5. Domain means in the large-scale forcing data set (dotted lines).

[49] Figure 18 shows total surface turbulent heat fluxes (latent plus sensible) and mean radiative flux convergence over the full atmospheric column (not precisely equal to a tropospheric mean, but shown here for the sake of comparison with observations). Taking DHARMA-1 as an example, surface heat fluxes are only ∼5 W m−2 lower than in the large-scale forcing data set during the active period, whereas radiative flux convergence is ∼45 W m−2 lower. The sum of these terms accounts for the MSE drift in DHARMA-1 over the active period (based on closure of the tropospheric MSE budget; budget terms not available for most models). Since DHARMA-1s is a sensitivity test simulation, in which tropospheric conditions are nudged toward observed conditions, MSE drift remains minimal despite lower surface fluxes and an even more negative radiative convergence that would otherwise amplify MSE drift. In DHARMA-2M as compared with DHARMA-1, mean surface heat fluxes are little changed. Substantially greater stratiform area and high-cloud fractions in DHARMA-2M have little impact on radiative flux convergence relative to DHARMA-1 because reduced longwave cooling (Figure 19b) is offset by reduced solar heating (Figure 20b). Thus MSE drift in DHARMA-1 and DHARMA-2M can be attributed primarily to cloud-modulated radiative flux divergence that is opposite in sign from and larger than that in the large-scale forcing data set.
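
To give a sense of scale, the flux differences quoted above for DHARMA-1 can be converted into an equivalent rate of change of column-mean MSE normalized by specific heat. This is illustrative arithmetic only: the 900-hPa pressure depth standing in for the 0–17 km layer and the constants are assumptions, and the result is not a value reported in the study.

```python
CP = 1004.0                # J kg-1 K-1, specific heat of dry air (assumed value)
G = 9.81                   # m s-2 (assumed value)
SECONDS_PER_DAY = 86400.0


def column_mass(pressure_depth_pa):
    """Mass per unit area (kg m-2) of a hydrostatic layer of given pressure depth."""
    return pressure_depth_pa / G


def drift_k_per_day(flux_anomaly_w_m2, pressure_depth_pa):
    """Rate of change of column-mean MSE/CP (K per day) implied by a flux anomaly."""
    heat_capacity = CP * column_mass(pressure_depth_pa)  # J m-2 K-1
    return flux_anomaly_w_m2 * SECONDS_PER_DAY / heat_capacity


# Differences from the forcing data set quoted in the text for DHARMA-1.
surface_flux_diff = -5.0            # W m-2
radiative_convergence_diff = -45.0  # W m-2
net_diff = surface_flux_diff + radiative_convergence_diff

# Assumed layer depth of 900 hPa, roughly the troposphere below 17 km.
drift = drift_k_per_day(net_diff, 900.0e2)  # roughly -0.47 K per day
```

Under these assumptions a 50 W m−2 deficit corresponds to roughly half a kelvin per day, which accumulates to a few kelvin over a 6-day active period.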

Figure 18.

(a) Surface turbulent heat fluxes (latent plus sensible) and (b) radiative flux convergence between the surface and TOA. Also shown are values in the large-scale forcing data set based on domain-wide observational data streams [Xie et al., 2010] and gap-filled point measurements at the Darwin Harbor surface flux site (see section 4.4). Listed in parentheses are the means during active and suppressed periods.

Figure 19.

Simulated top-of-atmosphere (TOA) (a) outgoing longwave radiation (OLR) and (b) column longwave (LW) cooling. Column LW cooling not available from the large-scale forcing data set owing to necessary omission of net-only ship-based measurements from available upwelling and downwelling LW fluxes. Listed in parentheses are the mean and minimum during the active period.

Figure 20.

Simulated (a) top-of-atmosphere (TOA) reflected shortwave radiation (RSR), (b) column shortwave heating, (c) TOA shortwave (SW) broadband albedo, and (d) column SW absorptance. Albedo and absorptance shown only for low solar zenith angles, as defined by instantaneous TOA downwelling SW flux > 1100 W m−2; dashed lines indicate observationally derived high-optical-depth limits of about 0.7 and 0.3, respectively [Dong et al., 2008]. Column SW heating not available from the large-scale forcing data set owing to necessary omission of net-only ship-based measurements from available upwelling and downwelling SW fluxes. Listed in parentheses are the mean and maximum during the active period.

[50] Considering radiative flux divergence across the ensemble, the most notable feature is that every simulation is biased high by at least 25 W m−2. It is also notable that in the UKMO-2M simulation in particular, TOA shortwave and longwave upwelling fluxes both closely match the forcing data set (Figures 19a and 20a). Thus, differences between simulated and forcing radiative fluxes occur only at the surface in some cases. Simulated surface shortwave and longwave fluxes are biased to an unknown degree by the idealized treatment of the portion of land surfaces within the domain as oceanic. Surface downwelling shortwave fluxes also exhibit high spatiotemporal variability and were necessarily obtained from a sparse network of observing stations. It seems possible that uncertainty in surface fluxes over the TWP-ICE domain could be large enough to reconcile the weak column cooling found in some simulations with the column warming in the forcing data set; biases of 25–30 W m−2 in radiative divergence constitute roughly 15% of shortwave downwelling flux during the active period. Such uncertainties in surface radiative fluxes hinder the accuracy of global radiative budgets [e.g., Trenberth et al., 2009]. In this modeling scenario, radiative divergence uncertainty is also associated with uncertainty in MSE changes over simulation durations of days to weeks.

[51] Although observational uncertainty in the domain mean of net surface radiative fluxes could be too large to strongly constrain simulations, observational analysis of solar albedo and column absorptance at low solar zenith angles has indicated maximum respective values of about 0.7 and 0.3 in the optically thick limit [Dong et al., 2008]. Although maximum domain-wide albedos reach 0.7 at low solar zenith angles in one baseline simulation (Figure 20c), none exceed that value, and no domain-wide absorptances exceed 0.3 (Figure 20d). SAM simulations produce the greatest solar heating in a layer at ∼13–16 km (Figure 21), perhaps associated with processes that led to the greatest latent heating at similar elevations (see Figure 13). Such excursions in the shortwave heating profile are generally associated with excursions in longwave cooling, as also seen in MESONH simulations near the melting level. Differences in shortwave and longwave heating rate profiles across the ensemble can be attributed in part to differences in the vertical distribution of hydrometeors (see Figure 15); the treatment of hydrometeor radiative properties also probably plays a role that deserves further scrutiny.

Figure 21.

(top) Shortwave heating and (bottom) longwave cooling profiles simulated during the active period. Symbols as in Figure 5.

[52] Considering surface turbulent heat fluxes during the active period, values are highest in EULAG-2 and EULAG-2s, resulting in little MSE drift in EULAG-2 despite relatively low radiative convergence. Surface heat fluxes are by contrast lowest in the 3D sensitivity tests (DHARMA-1s and SAM-2Ms), in which they nearly equal the Darwin Harbor data from the active period (see Figure 18a), consistent with idealized marine conditions. That the 2D sensitivity test EULAG-2s fluxes are dissimilar indicates a possible role of dimensionality. Latent heat flux dominates sensible heat flux in simulations and observations, and simulation differences are related to near-surface relative humidities (not shown). For instance, the 3D sensitivity tests with lowest heat fluxes also exhibit the highest mean near-surface relative humidities (90%). However, near-surface relative humidities are lower in MESONH than in EULAG, indicating that some combination of dimensionality, near-surface winds (see section 3), and flux parameterization may determine results. The role of microphysics appears comparatively weak (e.g., across DHARMA, MESONH, and UKMO simulations). Overall, whether or not surface heat fluxes play a determining role in the local initiation and maintenance of deep convective systems during the active period (see section 2), they can modulate tropospheric MSE and PWV evolution over CRM integration times as short as several days in the modeling framework used here.

[53] Returning to the general question of how large differences in deep stratiform cloud properties could influence tropospheric heating and cooling rates, and considering only TOA, where observational constraints are strongest, we lastly note that most models overpredict OLR during the active period. However, if lines of offsetting changes in TOA OLR and RSR are drawn through the limits of observational uncertainty in OLR and RSR (see Figure 22), most simulations fall between these lines during both active and suppressed periods. The tendency of cloud-associated shifts in TOA OLR and RSR to balance in the tropics has long been noted [e.g., Kiehl, 1994], and a similar pattern is evident across the ensemble of simulations. Given any two simulations that differ only in microphysics scheme or large-scale forcing during the active period (e.g., across DHARMA, SAM, or UKMO simulations), greater stratiform area fraction tends to be associated with a decrease in OLR and an increase in RSR that are similar in magnitude.

Figure 22.

(top) Top-of-atmosphere reflected shortwave radiation (TOA RSR) versus TOA outgoing longwave radiation (OLR; dashed lines are 1:1 drawn through the intersection of observational values plus and minus their uncertainties), (middle) daytime SW albedo versus stratiform area fraction, and (bottom) daytime LWP versus liquid optical depth (OD; dashed lines indicate domain-mean effective radius, defined as 1.5⋅LWP/(ρw ⋅OD), of 25 μm and 50 μm). Averaging times, symbols, and Spearman rank coefficients as in Figure 5. Domain means of RSR and OLR from large-scale forcing data set (derived from VISST retrievals), daytime SW albedo directly from VISST retrievals, and stratiform area fraction from analysis of C-POL data shown with dotted lines and uncertainty ranges (shading).
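
The domain-mean effective radius defined in the caption, 1.5⋅LWP/(ρw⋅OD), can be evaluated with a short sketch. The function name, the value assumed for liquid water density, and the sample inputs are illustrative assumptions.

```python
RHO_WATER = 1000.0  # kg m-3, density of liquid water (assumed value)


def effective_radius_um(lwp_kg_m2, optical_depth):
    """Domain-mean effective radius in micrometers from LWP (kg m-2) and liquid OD."""
    radius_m = 1.5 * lwp_kg_m2 / (RHO_WATER * optical_depth)
    return radius_m * 1.0e6


# Made-up inputs: an LWP of 0.1 kg m-2 and optical depths of 6 and 3.
r_25 = effective_radius_um(0.1, 6.0)  # about 25 micrometers
r_50 = effective_radius_um(0.1, 3.0)  # about 50 micrometers
```

For a fixed LWP, halving the optical depth doubles the implied effective radius, which is why the 25-μm and 50-μm reference lines in the bottom panel are straight lines through the origin with different slopes.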

[54] Identifying which factors determine the relatively stable baseline level of TOA RSR versus OLR in each model, from which departures appear to result in roughly equal offsets of longwave and shortwave fluxes, is beyond the scope of this study owing to a lack of sufficient diagnostics (additional diagnostics are suggested below). It is unknown whether such model differences could be attributable to dynamically or microphysically modulated relative distributions of primarily high or low clouds, for instance, or also their radiative treatment. In 3D models, compensating offsets in OLR and RSR can be associated only in part with deep stratiform area fraction, as evidenced by its weaker correlation with broadband albedo than with OLR (Figure 22 versus Figure 9). TOA albedo may also be influenced by variations in effective radius of liquid-phase condensate (Figure 22), which contributes most optical depth to simulations (not shown). 2D TOA radiative flux fields, which were not requested output here, could be used to more fully diagnose differences between simulations and observations, as demonstrated by Varble et al. [2011].

6. Summary and Conclusions

[55] Observations of consecutive active and suppressed monsoon periods around Darwin, Australia from 18 January to 3 February 2006 during the TWP-ICE campaign are used to drive and evaluate multiple cloud-resolving models (CRMs) with periodic boundary conditions, most using a horizontal grid resolution of ∼1 km. Baseline simulations represent an idealized marine case study. Sensitivity test simulations include nudging domain-mean water vapor and potential temperature to observations throughout the troposphere. Since baseline simulations enter the suppressed monsoon period with a wide range of mean conditions, results from the earlier active monsoon period provide a better indication of ensemble performance. Agreement of the 13 simulations with domain-wide observational data sets over the active period is summarized in Table 3, and associated conclusions can be summarized as follows.

Table 3. Simulated Domain-Mean Quantities Averaged Over the Active Monsoon Period (Days 19.5–25.5) That Are Within the Range of Observational Data Plus and Minus Uncertainties (a)

  • a: Within the range: yes; higher than that range: +; lower than that range: −; not diagnosed or reported: blank.
  • b: Mean surface precipitation rate versus C-POL with uncertainty of 25% (see section 4.1, Figure 5).
  • c: Total occurrence frequency of precipitation rates exceeding 0.2 mm h−1 at 2.5-km elevation and 2.5-km resolution versus C-POL range of 0.21–0.28 (see section 4.1, Figure 6a).
  • d: Fractional area of convective and stratiform rain in 3D models versus C-POL (using the algorithm described in Appendix B) with uncertainties of 20% and 5%, respectively (see section 4.1, Figure 9).
  • e: Ice water path versus 3D-IWC with uncertainty of 20% (see section 4.8, Figure 11).
  • f: TOA outgoing longwave radiation (OLR) versus VISST (same as large-scale forcing data set) with uncertainty of +9/−4% (see section 4.2, Figure 22).
  • g: TOA reflected shortwave radiation (RSR) versus the large-scale forcing data set (based on VISST) with uncertainty of +7/−15% (see section 4.2, Figure 22).

EULAG-2yes+  ++yes
EULAG-2syes+  ++yes
ISUCRM-2yes   +yes+
SAM-2Myes +++yes
SAM-2Msyes +++yes
UKMO-2Ayes yes++yesyes
UKMO-2Byes ++++
UKMO-2Myes +++yes
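
The entries in Table 3 follow the rule stated in footnote a, which can be sketched as a simple comparison against an observed value with possibly asymmetric fractional uncertainty. The function name and the numerical values below are hypothetical and are not data from the study; only the +9/−4% bounds are borrowed from footnote f.

```python
def classify(simulated, observed, frac_plus, frac_minus=None):
    """Return 'yes', '+', or '-' following the convention of footnote a.

    frac_plus and frac_minus are fractional uncertainties above and below
    the observed value; a symmetric range is used if frac_minus is omitted.
    """
    if frac_minus is None:
        frac_minus = frac_plus
    upper = observed * (1.0 + frac_plus)
    lower = observed * (1.0 - frac_minus)
    if simulated > upper:
        return "+"
    if simulated < lower:
        return "-"
    return "yes"


# Hypothetical observed OLR of 200 W m-2 with the +9/-4% bounds of footnote f,
# giving an acceptable range of 192 to 218 W m-2.
too_high = classify(225.0, 200.0, 0.09, 0.04)  # "+"
within = classify(210.0, 200.0, 0.09, 0.04)    # "yes"
too_low = classify(190.0, 200.0, 0.09, 0.04)   # "-"
```

Quantities with a quoted observational range rather than a percentage, such as the occurrence frequency in footnote c, would be compared directly against the range limits instead.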

[56] 1. All 13 simulations reliably reproduce domain-mean precipitation rates (Table 3, Precipitation column), consistent with similar past studies in which models are constrained by strong large-scale forcing terms [e.g., Xu et al., 2002; Xie et al., 2005; Woolnough et al., 2010]. However, simulations deviate systematically from observations with respect to the underlying precipitation rate distributions (Table 3, Area, Convective and Stratiform columns). The area covered by rain rates greater than 0.2 mm h−1 is overestimated (Table 3, Area column), a tendency that is reduced in sensitivity test simulations (Figure 6a). Thus excessive precipitating area appears at least partly attributable to drift of mean tropospheric conditions from those observed. Several lines of evidence also indicate that the strongest rain is locally too intense or widespread in the 3D simulations (see 2 below), consistent with results from a similar past study of tropical convection around Kwajalein Island [Blossey et al., 2007].

[57] 2. In the ten 3D simulations, areas covered by convective and deep stratiform rain are diagnosed from simulated radar reflectivity using a textural algorithm (Appendix B, Figure 7). Simulated convective area fractions are systematically larger than observed by C-POL, by up to a factor of ∼2 (Figure 9), and are within measurement uncertainty in only two cases (Table 3, Convective column). Simulated stratiform area fractions by contrast can be too large by up to a factor of two or too small (Figure 9), consistent with past findings of variable stratiform rain across CRMs [Xie et al., 2002], and only one lies within the narrow measurement uncertainty (Table 3, Stratiform column). The largest stratiform area fractions are simulated with Morrison et al. [2009] two-moment microphysics, consistent with the expected importance of microphysics scheme [Morrison et al., 2009; Li et al., 2009; Luo et al., 2010; Bryan and Morrison, 2012]. However, the slower-than-observed decay of convective area from simulated peaks seen here (Figure 8) is probably caused at least partly by using periodic boundary conditions and neglecting large-scale advective divergence of condensate. The usefulness of this modeling approach for directly constraining stratiform areal coverage may therefore be limited; a companion study using limited-area models demonstrates an approach that could be more suitable (see section 1).

[58] 3. Simulated ice water path (IWP) in all 13 simulations is systematically higher than domain-mean 3D-IWC retrievals (Table 3, IWP column), commonly by a factor of two, with only 2D simulations nearly within estimated uncertainty (Figure 11). However, CRMs reproduce the retrieved temporal evolution and vertical distribution of IWP and retrieved latent heating rate profiles qualitatively well (Figures 10b, 12a, and 13a). 3D-IWC retrievals may underestimate dense ice contributions to IWP (see section 5.2), but systematic overestimation of IWP by simulations cannot be ruled out. Identifying the sources of discrepancy between simulated and retrieved IWP should be a priority owing to the wide variability of ice distribution documented across models generally, its importance to radiative fluxes, and the potential of satellite microwave data for providing strong constraints [e.g., Waliser et al., 2009; Wu et al., 2009; Su et al., 2011].

[59] 4. At the top-of-atmosphere (TOA), where radiative fluxes are most strongly constrained by observations, simulated outgoing longwave radiation (OLR) is usually higher than observed (Figure 22), and is within experimental uncertainty in only four cases (Table 3, OLR column). TOA reflected shortwave radiation (RSR) is usually lower than observed (Figure 22), although within the estimated uncertainty range (Table 3, RSR column). Persistently high OLR (Figure 19a) and low RSR (Figure 20a) are strikingly similar to those reported by Blossey et al. [2007] (cf. their Figure 6). In 3D models the OLR is strongly anticorrelated with stratiform area fraction (Figure 9). That OLR is by contrast negligibly correlated with IWP indicates that the spatial distribution of ice is more important than its area-averaged path. Furthermore, in multiple simulations from a given dynamics model, increasing stratiform coverage is associated with roughly equal changes in OLR and RSR (Figure 22). However, the absolute ratio of RSR to OLR varies as a function of dynamics model (e.g., DHARMA > UKMO > SAM) for reasons that cannot be adequately assessed from reported output (see section 5.3).

[60] 5. The UKMO-2A simulation agrees best overall with domain-wide data streams (Table 3), but systematic differences between observations and all simulations including UKMO-2A (e.g., in timing of convective area in Figure 8) suggest that close agreement may not be expected using this modeling framework. Errors associated with neglecting advective divergence of condensate, such as the cirrus inflow following event C during TWP-ICE [Cohen, 2008; May et al., 2008], could be significant especially in the upper troposphere [Grabowski et al., 1996; Petch and Dudhia, 1998]. However, across the ensemble of simulations, the wide spread of stratiform fraction and its close correlation with OLR indicate a need for more rigorous assessment of the structural, microphysical, and radiative properties of convective, stratiform, and anvil clouds. Analysis of such structural features could have been usefully extended if requested results had included (1) 2D TOA broadband longwave, shortwave, and relevant narrow-band radiative fluxes, (2) associated 3D radiative fluxes for attribution to underlying hydrometeor distributions, (3) 3D precipitation rates for division of rainfall into convective and stratiform components, and (4) 2D surface radiative and turbulent heat fluxes.

[61] Although it is beyond the scope of this study to quantify uncertainties in column radiative flux divergence and surface turbulent heat fluxes associated with sparse surface station measurements over ocean and land, and treatment of the observational domain as entirely marine in these idealized simulations introduces bias, several additional conclusions can nonetheless be drawn. First, simulated values of domain-mean column solar absorption are below an observationally determined maximum for the optically thick limit (see section 5.3). Second, deviations of predicted column radiative flux divergence and surface turbulent heat fluxes from the large-scale forcing data set are commonly large enough to drive substantial drifts in tropospheric water vapor, temperature, and moist static energy over only a few days. Nudging the domain-mean profiles to large-scale conditions as done in sensitivity tests here is one means of preventing drift. An exhaustive closure approach [e.g., Fridlind and Jacobson, 2003] would be to use an ensemble of large-scale forcings that includes uncertainties of inputs to the variational analysis [e.g., Hume and Jakob, 2007]. A companion TWP-ICE SCM intercomparison study considers such uncertainty in the surface precipitation rate input (see section 1).

[62] The variability of simulation drifts from observed conditions limits the value of comparing simulations with either observations or with one another during the suppressed period; simulated conditions entering the suppressed period span too wide a range for subsequent differences to be attributed to model physics. It is nonetheless notable that predicted surface precipitation rates exceed observations by more than experimental uncertainty in all cases except one (Figure 5), suggesting that models may form or sustain convection too easily during the suppressed period. Convective area fractions correspondingly exceed observations, but stratiform area fractions are again variably overpredicted and underpredicted and uncorrelated with either convective area fractions or IWP (Figure 9). Highly variable latent heating rates in the lower troposphere (Figure 13) and ice water contents aloft (Figure 12) point to a general divergence of simulation results during the suppressed period.

[63] Differences across the simulation ensemble arise noticeably from multiple sources, including treatments of dynamics, microphysics, and radiation. Compared with 2D models, the 3D models tend to produce higher peak rain rates and IWP/LWP ratios (Figures 4c and 11), consistent with past studies finding stronger updrafts and larger vertical mass fluxes that impact ice formation processes and increase anvil ice mass in 3D [Redelsperger et al., 2000; Petch and Gray, 2001; Phillips and Donner, 2006; Petch et al., 2008; Zeng et al., 2008]. Ensemble spread can be attributed also to microphysics schemes (see 2–4 above), consistent with past CRM studies [e.g., Wu et al., 1999; Grabowski et al., 1999; Redelsperger et al., 2000; Xu et al., 2002; Xie et al., 2005; Morrison et al., 2009; Fan et al., 2010b; Luo et al., 2010; Bryan and Morrison, 2012]. Prognosing droplet number concentration, an important advance for incorporating aerosol effects on cloud properties, did not produce systematically distinguishing effects across the ensemble here, for instance in IWP, radiative fluxes, or domain-mean water-drop effective radius (e.g., DHARMA-2M, EULAG, MESONH-2, and SAM-2M in Figure 22), likely owing in part to differences in aerosol and activation treatments (see section 3).

[64] Returning to our first originating question, do simulations and observations agree within experimental uncertainties? The short answer is that even when simulations reproduce domain-mean surface precipitation, cloud structural and radiative properties are often not reproduced (see Table 3). Resulting deviations of column radiative flux divergence from observationally derived values, if not compensated by errors in surface turbulent heat fluxes, lead to substantial drift of predicted tropospheric conditions from those observed. In addition, systematic differences between simulated and observed timing of convective structures suggest that periodic boundary conditions could be an important source of discrepancy. Applying large-scale horizontal advective tendencies to condensate would not likely resolve timing errors that differ for convective and stratiform features. The modeling approach used here therefore does not appear sufficiently robust to reproduce the observed precipitating cloud structures and their radiative effects in this case. Regardless, the wide spread of predicted stratiform area fraction and closely correlated radiative impacts indicate a need for more rigorous observation-based evaluation of the ability of models to reproduce the fundamental micro- and macrophysical properties of convective cloud structures. The factors specifically controlling simulated convective and stratiform properties deserve further study owing to their effects on tropical dynamics, large-scale circulation and precipitation patterns, and climate sensitivity [e.g., Donner et al., 2001; Schumacher and Houze, 2003; Song and Zhang, 2011; Houze, 2004; Fu and Wang, 2009].

Appendix A: Data Archive

[65] Unless otherwise indicated, all measurements used here were downloaded from the ARM data archive (http://www.arm.gov). The modeling case study specification, input files, simulation results, and processed observational data sets produced for this study are also stored there (http://www.arm.gov/campaigns/twp2006twp-ice/). Archived simulation results include scalars and profiles at 10-minute frequency, and 3D model fields at 3-h frequency. All model fields and processed data are archived in compliance with the netCDF Climate and Forecast (CF) metadata convention (http://cf-pcmdi.llnl.gov) version 1.3, ensuring standardized metadata, variable names, and units, and enhancing ease of use. Archived material is intended to allow the case to be run independently and results compared with those shown here, or users can alternatively download simulation results and treat them as an ensemble without running the case.

Appendix B: Convective and Stratiform Area

[66] We identify regions of convective and stratiform rain in observed and simulated radar reflectivity fields based on the textural algorithm described by Steiner et al. [1995], with the added requirement that reflectivity be at least 0 dBZ at 6 km elevation in stratiform columns to avoid inclusion of isolated shallow convection as stratiform outflow. Adding such an echo requirement aloft (whether 0 or 5 dBZ at 6 or 8 km) exposed strong correlations of identified stratiform area with both OLR and RSR across the simulation ensemble that were otherwise absent. Thus echo aloft provides an efficient proxy for stratiform structural properties that are closely associated with radiative fluxes. To treat observations and models identically, we linearly interpolate model fields of equivalent reflectivity (Ze) to obtain a single slice at an elevation of 2.5 km and degrade model resolution to 2.5-km horizontal resolution in a manner that conserves Ze. Reflectivities weaker than 0 dBZ are set to missing values in both model and C-POL fields. We then apply the three-step algorithm to identify convective pixels as described by Steiner et al. [1995] (their section 2c), where background intensity considered in steps two and three is averaged over values of dBZ ≥ 0 (Ze ≥ 1). Since the optimal stepwise algorithm coefficients reported by Steiner et al. [1995] were developed using the same instrument, location, and meteorological conditions, we consider them adequate for our purpose of comparing observed and simulated structures despite the reduction in resolution from 2 km (used in their study) to 2.5 km (the resolution at which C-POL data are archived). A more detailed analysis of convective and stratiform rain structures in 3D simulations is made by Varble et al. [2011], where the algorithm for identifying convective and stratiform regions differs slightly, including exclusion of all columns with 2.5-km reflectivity <5 dBZ and no echo requirement aloft in stratiform columns.
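
The preprocessing step described above, degrading horizontal resolution in a manner that conserves Ze and then masking weak echoes, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses block averaging with an integer coarsening factor, whereas regridding from a ∼1-km model grid to 2.5 km would require area-weighted averaging, and it represents echo-free cells as minus infinity in dBZ. The function names and sample field are hypothetical, and the convective identification steps of Steiner et al. [1995] are not reproduced here.

```python
import numpy as np


def dbz_to_ze(dbz):
    """Convert reflectivity in dBZ to linear equivalent reflectivity Ze."""
    return 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)


def ze_to_dbz(ze):
    """Convert linear Ze to dBZ; zero Ze maps to minus infinity."""
    with np.errstate(divide="ignore"):
        return 10.0 * np.log10(ze)


def coarsen_conserving_ze(dbz, factor, min_dbz=0.0):
    """Block-average in linear Ze, then set echoes weaker than min_dbz to NaN."""
    ny, nx = dbz.shape
    ze = dbz_to_ze(dbz)
    blocks = ze.reshape(ny // factor, factor, nx // factor, factor)
    dbz_coarse = ze_to_dbz(blocks.mean(axis=(1, 3)))
    return np.where(dbz_coarse >= min_dbz, dbz_coarse, np.nan)


# Made-up 2 x 6 field coarsened by a factor of 2 into three blocks.
no_echo = -np.inf
field = np.array([
    [20.0, 20.0, 30.0, 30.0, -10.0, -10.0],
    [20.0, 20.0, no_echo, no_echo, -10.0, -10.0],
])
coarse = coarsen_conserving_ze(field, 2)
```

Averaging in linear units matters because dBZ is logarithmic: the middle block, half filled with 30-dBZ echo, averages to about 27 dBZ rather than 15 dBZ, while the uniformly weak third block falls below 0 dBZ and is treated as missing.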


[67] This research was supported by the NASA Radiation Sciences Program and by the DOE Office of Science, Office of Biological and Environmental Research, through Contracts DE-AI02-06ER64173, DE-AI02-08ER64547, and DE-FG03-02ER63337 (Fridlind and Ackerman), DE-FG02-08ER64574 (Grabowski and Morrison), DE-AI02-07ER64546 (Minnis), and DE-FG02-08ER64559 (Wu), and the DOE Atmospheric System Research Program (Fan). Computational support was provided by the DOE National Energy Research Scientific Computing Center and the NASA Advanced Supercomputing Division. We thank the TWP-ICE and ACTIVE field campaign teams led by Peter May and Geraint Vaughan. TWP-ICE data were obtained from the ARM Program archive, sponsored by the DOE Office of Science, Office of Biological and Environmental Research, Environmental Science Division. ECMWF analyses were provided to the ARM data archive under a site license agreement. Sally McFarlane is thanked for help in Rayleigh reflectivity calculations for SAM simulations. We thank Ed Zipser and Chris Bretherton for helpful discussions. We thank Steven Krueger and an anonymous reviewer for detailed corrections and comments.