Assumptions made by global climate models (GCMs) regarding vertical overlap of fractional amounts of clouds have significant impacts on simulated radiation budgets. A global survey of fractional cloud overlap properties was performed using 2 months of cloud mask data derived from CloudSat-CALIPSO satellite measurements. Cloud overlap was diagnosed as a combination of maximum and random overlap and characterized by a vertically constant decorrelation length cf*. Typically, clouds overlap between maximum and random, with the smallest cf* (medians → 0 km) associated with small total cloud amounts, while the largest cf* (medians ∼3 km) tend to occur at total cloud amounts near 0.7. Global median cf* is ∼2 km with a slight tendency for largest values in the tropics and polar regions during winter. By crudely excising near-surface precipitation from cloud mask data, cf* were reduced by typically <1 km. Median values of cf* when the Sun is down exceed those when the Sun is up by almost 1 km when cloud masks are based on radar and lidar data; use of radar only shows minimal diurnal variation but significantly larger cf*. This suggests that sunup inferences of cf* might be biased low by solar noise in lidar data. Cloud mask cross-section lengths L of 50, 100, 200, 500, and 1000 km were considered. Distributions of cf* are only mildly sensitive to L, suggesting the convenient possibility that a GCM parametrization of cf* might be resolution-independent over a wide range of resolutions. Simple parametrization of cf* might be possible if excessive random noise in total cloud amount, and hence radiative fluxes, can be tolerated. Using just cloud mask data and assuming a global mean shortwave cloud radiative effect of −45 W m−2, top of atmosphere shortwave radiative sensitivity to cf* was estimated at 2 to 3 W m−2 km−1.
 Owing to speed and size limitations of most computers, Global Climate Models (GCMs) discretize the Earth-atmosphere system into columns with horizontal cross-sectional areas generally exceeding 10^4 km^2. Hence, many processes and fluctuations are unresolved by GCMs and must be parametrized. Although GCM columns are partitioned into relatively thin slabs, the horizontal expanse of these slabs makes it difficult to describe the statistical nature of vertical correlations of processes and fluctuations. A longstanding line of work on this problem involves representation of horizontal and vertical fluctuations of clouds for the purpose of computing column-wide (i.e., domain-average) radiative flux profiles.
 Up until the late 1970s, radiative flux profiles were computed assuming that the horizontal position of a uniform slab cloud in a cloudy layer was completely uncorrelated with like clouds in all other layers [Manabe and Strickler, 1964]. This is the random overlap model. In 1979, Geleyn and Hollingsworth introduced the maximum-random overlap (MRO) model and an accompanying radiative flux algorithm. The MRO method assumes that clouds in adjacent layers overlap maximally while those separated by a cloudless stretch overlap randomly. Through the 1980s the MRO model received some attention [e.g., Morcrette and Fouquart, 1986; Tian and Curry, 1989] and increased much in popularity during the 1990s. Currently, approximately half of the operational weather and climate models employ the MRO model [Barker et al., 2003; Q. Fu, personal communication, 2004].
 The purpose of this study was to assess cloud fraction overlap on a global scale using data collected from the CloudSat and Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellites [Stephens et al., 2002; Winker et al., 2003]. Both satellites have been orbiting in the A-train since May 2006, and suitable data for this study became available in July 2007. The following section explains the motives, and outlines the objectives, of this study. It is followed by a description of the data and the method of analysis, with results and conclusions forming the final sections.
2. Motivation and Objective
 According to the independent column approximation (ICA), radiative flux averaged across a GCM grid cell can be expressed as
$$F_{ICA} = \int_0^\infty p(\tau)\,F(\tau)\,d\tau, \qquad (1)$$
where τ is cloud optical depth, F(τ) is a multilayer solution of the 1-D radiative transfer equation, which of course depends on many variables other than τ, and p(τ) is the normalized density function of τ over the domain [Stephens et al., 1991; Barker, 1996]. If
$$p(\tau) = (1 - \hat{C})\,\delta(\tau) + \hat{C}\,\hat{p}(\tau),$$
where Ĉ is total cloud fraction and p̂(τ) is a normalized density function of τ for the cloudy portion of the domain, (1) becomes
$$F_{ICA} = (1 - \hat{C})\,F(0) + \hat{C}\int_0^\infty \hat{p}(\tau)\,F(\tau)\,d\tau. \qquad (2)$$
If the atmosphere is divided into M layers, (1) can be rewritten as
$$F_{ICA} = (1 - \hat{C})\,F(0) + \sum_{m=1}^{M} \varepsilon_m \int_0^\infty p_m(\tau)\,F(\tau)\,d\tau, \qquad (3)$$
where ɛm is the fraction of the domain having cloudtops in layer m exposed to space, Ĉ = Σm ɛm, and pm(τ) is the frequency distribution of τ for clouds of any geometric thickness but whose tops are exposed to space in layer m. The attraction of this representation of the ICA is threefold: Ĉ, ɛm, and pm(τ) are seen clearly as the basic cloud structural quantities needed to estimate radiative fluxes; they all have the potential of being inferred, on a global scale, from satellite data; and they can be diagnosed easily in a GCM. Moreover, if a GCM simulates these quantities, and hence TOA fluxes, well, it follows that corresponding flux profiles are likely to be accurate too.
 While GCMs do not estimate ɛm and pm(τ) explicitly, they are present implicitly at every timestep and depend on fractional amounts and mass of layer clouds, as well as assumptions regarding vertical overlap of layered clouds and horizontal fluctuations of water content. In the McICA method [Barker et al., 2002], with its reliance on stochastic generation of unresolved clouds [Räisänen et al., 2004], exact computation of ɛm and pm(τ) is simple. Currently within GCMs, cloud overlap and horizontal variability are either assigned or highly parametrized. According to analyses by Barker et al. and Barker and Räisänen, it is crucial that both descriptors be addressed simultaneously, particularly when computing shortwave radiative fluxes. The lingering question is: how accurate and sophisticated must parametrizations of these descriptors be? To some extent, the answer is governed by the implicit sunset clause built into this line of research: ever-shrinking grid spacings in GCMs will give way to global cloud system-resolving models (CSRMs), whereupon properties such as cloud overlap will be reduced to diagnostic measures and effectively emancipated from parametrization. For the foreseeable future, however, parametrizations of these descriptors are required, but as yet there are no global analyses of them, so the required level of parametrization is still not clear. It appears, however, that the basic cloud masks derived from CloudSat-CALIPSO data should be able to furnish an initial perspective on the global distribution of characteristics pertaining to, at least, overlap of fractional cloud.
 To illustrate some global results of cloud overlap properties, data collected by CloudSat's 94-GHz cloud-profiling radar (CPR) and CALIPSO's dual wavelength lidar during January 2007 and August 2007 were examined. Both satellites were launched together on 29 April 2006, spent a few weeks manoeuvring into position with the A-train, and began transmitting data about a month after launch. The months selected here have among the largest amounts of data to work with and almost maximize seasonal disparity. Analyses were performed using CloudSat's CPR_Cloud_mask and Radar_Reflectivity fields from the 2B-GEOPROF database and the CloudFraction field from 2B-GEOPROF-LIDAR [Mace, 2006, 2007]. (These data are available at http://cloudsat.cira.colostate.edu/dataSpecs.php.) In essence, CPR_Cloud_mask and CloudFraction indicate the likelihood of cloud in a radar volume. Table 1 lists the values that CPR_Cloud_mask can take. CloudFraction is the fraction of lidar volumes in a radar volume identified as containing hydrometeors.
Table 1. Values Assigned to Each Radar Volume in CloudSat's CPR_Cloud_mask Field^a
no cloud detected
likely bad data
likely ground clutter
weak detection found using along track integration
cloud detected; the larger the value, the smaller the chance of a false detection
^a For this study, values greater than 20 were taken as cloud.
 There are typically 37,081 CPR profiles reported per orbit, implying that each column is effectively 1.1 km long (along-track). In actuality, the radar footprint is ∼1.4 km long and integrated for ∼1 km, so there is some oversampling. CPR columns are approximately 1.4 km wide (across-track) and each volume is 0.24 km deep. Because of ground-clutter contamination, the lowest 2 or 3 volumes are unworkable. According to Mace [2007, Table 1], resolutions of the processed lidar data are: in the along-track direction 1 km and 0.3 km for altitudes below and above 8.2 km, respectively; in the vertical direction 0.03 km and 0.075 km for altitudes below and above 8.2 km, respectively; and 0.3 km across-track. For this study, a CPR volume was classed as cloud if
were all satisfied, where Z is radar reflectivity; CloudSat's minimum detectable signal is purportedly −30 dBZ [Stephens et al., 2002; D. Vane, personal communication, 2007].
 Given that CloudSat's radar reflectivity responds to the sixth power of particle size [Doviak and Zrnić, 1993], CPR_Cloud_mask captures most of the precipitation but certainly not all “cloud.” The concern is easy to appreciate: almost by definition, precipitation displays stronger vertical correlation than cloud, and so CloudSat-derived estimates of vertical correlation for cloud might be biased high. This is potentially important because GCM cloud-radiation algorithms do not account for precipitation, which has shorter atmospheric residence times than clouds and relatively weak extinction at wavelengths important for Earth's radiation budget. Hence, as long as GCM cloud-radiation routines disregard precipitation, ideal parametrization of cloud overlap based on observations should have precipitation factored out.
 This points to the distinction between intrinsic and extrinsic cloud properties that appears whenever one attempts to reconcile models with observations [e.g., Stephens, 1988]. At the time of writing, retrievals pertaining to cloud water content and particle size were available from CloudSat's database, but the intermittency of the data made it difficult to use as a screen for precipitation (or at least volumes with large particles and small visible extinction). Hence, knowing that CloudSat's CPR_Cloud_mask certainly contains precipitation in addition to cloud, a very simple and crude precipitation screen was applied to CPR_Cloud_mask in an attempt to remove at least shafts of precipitation that approached the surface. This procedure is not being advocated for general use; it was used here simply to illustrate the impact on estimates of cloud overlap due to removal of what was likely precipitation approaching/reaching the surface.
 The precipitation screen consists of using (4) and then for each column, where CM(j) and CF(j) represent values of CPR_Cloud_mask and CloudFraction in the jth bin, applying this sequence of tests:
Values of CM are checked in the third and fourth bins above the surface cell so as to avoid ground-clutter contamination. Declouding up to bin Jmax is admittedly an arbitrary choice that likely catches several volumes that contained a mixture of precipitation and cloud. However, leaving volumes unaltered if their lidar CloudFraction is >0.99 should help avoid eliminating cells with significant visible extinction.
 Cross sections of length 50, 100, 200, 500, and 1000 km were used to span GCM-size domains. The issue of optimal length of cross section to represent the statistics of the (square) domain from which it was drawn has been addressed in the past with respect to total cloud fraction [e.g., Astin and Di Girolamo, 1999]. It will be addressed for cloud overlap in a separate study.
4. Cloud Fraction Overlap
 Throughout this analysis, overlap of clouds is treated as a linear combination of maximum and random overlap [Hogan and Illingworth, 2000; Bergman and Rasch, 2002]. To illustrate, if ck and cl are fractional amounts of cloud in layers k and l at altitudes zk and zl, then total, vertically projected cloud fraction ck,l for the two layers is defined as
$$c_{k,l} = \alpha_{k,l}\,\max(c_k, c_l) + (1 - \alpha_{k,l})\,(c_k + c_l - c_k c_l), \qquad (6)$$
where αk,l is the weighting parameter. The maximum component is unambiguous: positions of clouds in one layer are determined by, and overlay, those in the other. The random component is the expectation value for the distribution of total cloud fraction that occurs when the positions of clouds in both layers are completely uncorrelated. When clouds overlap less than expected for cloud positions in one layer that are completely independent of those in the other,
$$c_{k,l} > c_k + c_l - c_k c_l,$$
which leads to αk,l < 0 in (6). Hence, the smallest values that αk,l can attain are
$$\alpha_{k,l}^{\min} = -\frac{\max(c_k, c_l)}{1 - \max(c_k, c_l)},$$
which occur when clouds do not overlap at all and ck,l = ck + cl ≤ 1. Here, αk,l is defined as
$$\alpha_{k,l} = \exp\left[-\int_{z_k}^{z_l} \frac{dz}{\mathcal{L}_{cf}(z)}\right], \qquad (7)$$
where cf, written $\mathcal{L}_{cf}$ in (7), is decorrelation length for overlapping fractional cloud which, in general, varies with z. Note that this definition does not admit αk,l < 0 [cf. Mace and Benson-Troth, 2002; Oreopoulos and Khairoutdinov, 2003] and so ck,l is always within [max(ck, cl), ck + cl − ckcl]. Nor does it allow a total cloud fraction of 1 when max(c(z)) < 1, where c(z) is layer cloud fraction profile; though these cases can, and do, occur.
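The two-layer combination described above can be sketched numerically. This is a minimal illustration only, assuming the usual linear form (weight α on maximum overlap, 1 − α on random overlap) and, for a constant decorrelation length, an exponential decay of α with layer separation; the function names are hypothetical and not taken from the study.

```python
import math

def alpha_from_decorrelation(dz_km, l_cf_km):
    """Weighting parameter for two layers dz_km apart, assuming a constant
    decorrelation length l_cf_km and exponential decay (an assumption)."""
    if l_cf_km <= 0.0:
        return 0.0  # zero decorrelation length: purely random overlap
    return math.exp(-dz_km / l_cf_km)

def combined_fraction(ck, cl, alpha):
    """Projected cloud fraction: alpha * maximum + (1 - alpha) * random overlap."""
    c_max = max(ck, cl)
    c_rand = ck + cl - ck * cl
    return alpha * c_max + (1.0 - alpha) * c_rand

def alpha_min(ck, cl):
    """Most negative alpha, reached when the layers do not overlap at all
    (valid only for ck + cl <= 1 and max(ck, cl) < 1)."""
    return -max(ck, cl) / (1.0 - max(ck, cl))

ck, cl = 0.4, 0.5
a = alpha_from_decorrelation(dz_km=2.0, l_cf_km=2.0)  # alpha = exp(-1)
print(combined_fraction(ck, cl, a))                   # between maximum (0.5) and random (0.7)
print(combined_fraction(ck, cl, alpha_min(ck, cl)))   # equals ck + cl up to rounding
```

With α restricted to [0, 1], as the exponential form implies, the result can never leave the interval between maximum and random overlap; the negative-α case is shown only to confirm the zero-overlap limit.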
 Within most conventional GCMs, all the relevant information one has pertaining to cloud structure inside a column is c(z). There are, therefore, infinitely many ways to overlap the fractions c(z) and thus produce corresponding total cloud fractions C that can, in principle, take values within [max(c(z)), 1]. With (6) and (7), a convenient, and potentially useful, way to assess overlap for nonovercast scenes is to enforce a vertically invariant cf on c(z) [Barker and Räisänen, 2005]. This leads to a unique function C(cf) ∈ [max(c(z)), 1]. This function is, by definition, nonanalytic, but can be evaluated numerically using a stochastic cloud generator. The generator used here was developed originally by Räisänen et al. [2004] for use with McICA [Barker et al., 2002; Pincus et al., 2003]. Unless stated otherwise, all applications of the generator used 25,000 subcolumns.
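To make the idea of evaluating C(cf) with a stochastic generator concrete, the following is a minimal sketch in the spirit of such generators, not the code used in the study. It assumes equally spaced layers, a height-invariant decorrelation length, and an adjacent-layer weight α = exp(−Δz/cf); the function name, arguments, and example profile are all hypothetical.

```python
import math
import random

def generate_total_cloud_fraction(c, dz_km, l_cf_km, n_sub=25000, seed=0):
    """Estimate total cloud fraction C for layer fractions c (top to bottom).

    Each subcolumn carries a rank x in [0, 1). Moving down one layer, x is
    kept with probability alpha (maximum-overlap branch) or redrawn
    (random-overlap branch). A layer is cloudy in the subcolumn if
    x > 1 - c_k, which happens with probability c_k.
    """
    rng = random.Random(seed)
    alpha = math.exp(-dz_km / l_cf_km) if l_cf_km > 0.0 else 0.0
    n_cloudy = 0
    for _ in range(n_sub):
        x = rng.random()
        cloudy = x > 1.0 - c[0]
        for ck in c[1:]:
            if rng.random() >= alpha:   # decorrelate from the layer above
                x = rng.random()
            cloudy = cloudy or (x > 1.0 - ck)
        n_cloudy += cloudy
    return n_cloudy / n_sub

profile = [0.3, 0.5, 0.2]  # hypothetical layer cloud fractions
for l_cf in (0.0, 2.0, 1.0e9):
    print(l_cf, generate_total_cloud_fraction(profile, dz_km=0.24, l_cf_km=l_cf))
```

In the two limits the estimate should approach random overlap (1 − 0.7 × 0.5 × 0.8 = 0.72 for this profile) and maximum overlap (0.5), apart from sampling noise that shrinks as the number of subcolumns grows.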
 When dealing with a 3-D field of cloud or a 2-D cross section, obtained from either a CSRM or observations, one can compute the domain's actual total cloud fraction Ĉ (along with c(z), and even cf(z)), and thus define an effective decorrelation length cf*, written $\mathcal{L}^{*}_{cf}$ in (8), as the solution to
$$C(\mathcal{L}^{*}_{cf}) = \hat{C}. \qquad (8)$$
Again, however, this cannot be done with GCM data for all one has to work with is c(z).
 One way to solve for cf* is Brent's method [Brent, 1973], which is ideal when computed values of a function are all that are known, and roots are one-dimensional. Brent's method combines the safety of bisection methods with the convergence speed of higher-order methods, provided the root lies between extremes set by the user. The extremes used here were 0 km and 20 km. If cf* > 20 km the algorithm defaults to 20 km, which is often very close to maximum overlap. The convergence criterion was 0.05 km. It was set small to diminish the impact of stochastic fluctuations in returned values of C. The number of iterations required for convergence was typically 7 or 8 but ranged from 5 to 12.
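The bracketed root search can be illustrated with a short sketch. For brevity it uses plain bisection rather than Brent's method, and it replaces the stochastic generator with an analytic two-layer stand-in for C(cf); both substitutions, and all names and values, are assumptions made for illustration. The bracket (0 to 20 km), the default to 20 km, and the 0.05 km tolerance follow the settings described above.

```python
import math

def two_layer_total(l_cf_km, ck=0.4, cl=0.5, dz_km=2.0):
    """Analytic stand-in for C(cf): two layers with alpha = exp(-dz / cf)."""
    alpha = math.exp(-dz_km / l_cf_km) if l_cf_km > 0.0 else 0.0
    return alpha * max(ck, cl) + (1.0 - alpha) * (ck + cl - ck * cl)

def solve_effective_length(total_fn, c_target, lo=0.0, hi=20.0, tol=0.05):
    """Bisection for the decorrelation length at which total_fn equals c_target.

    total_fn must decrease with decorrelation length (more overlap, less
    total cloud). Targets outside the bracket default to its ends.
    """
    if c_target >= total_fn(lo):
        return lo   # at or beyond random overlap
    if c_target <= total_fn(hi):
        return hi   # at or beyond the upper limit (near-maximum overlap)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_fn(mid) > c_target:
            lo = mid    # too much total cloud: a longer length is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)

c_obs = two_layer_total(2.0)   # pretend this is the observed total cloud fraction
print(solve_effective_length(two_layer_total, c_obs))
```

Bisection needs only function values and a valid bracket, as Brent's method does, but converges more slowly; a library implementation of Brent's method would be the natural choice when the function is as costly as a 25,000-subcolumn generator call.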
Figure 1 shows a comparison of roots between those based on the settings just mentioned and a benchmark that used an upper limit of 100 km, a convergence criterion of 0.01 km, and 250,000 subcolumns. The latter required generally 10 to 15 iterations and took about 20 times more CPU time than the operational settings. For cf* < 20 km, the operational settings lead to errors in cf* that are typically less than 0.1 km, well below the resolution of the data and thus adequate for the purpose at hand. While errors for cf* > 20 km are exclusively bias, and of magnitude cf* − 20 km, only 50 of the 11,498 cross sections for L = 1000 km had cf* > 20 km. Moreover, based on experience [Barker and Räisänen, 2005], dFICA/dcf* are usually very small for cf* > 20 km and so errors are negligible. While 1000 km transects were used to produce Figure 1, samples for shorter cross sections yielded virtually identical results.
 Before moving to the global analysis, consider an example of the methodology proposed in the previous subsection and the impact of the precipitation screen.
Figure 2 contrasts (4) with its precipitation-screened counterpart for a typical 1000 km sample drawn from an orbit on 1 January 2007. Clearly, this screening goes too far in some cases and not far enough in others, but generally speaking, it appears to move matters in the desired direction. The plot of layer cloud fraction profile in the lower left corner of Figure 2 shows that screening eliminated volumes up to about 3 km above the surface. Since application of (5) never increases total cloud fraction, and rarely reduces it (it went from 0.876 to 0.875 in this case), sizable reductions in layer cloud fractions mean that there is less cloud in a profile with which to make up that total, and so the remaining layers must overlap less. In general this means that screening will reduce cf*; in this instance, solving (8) for the unscreened and screened profiles yielded cf* of 2.0 km and 1.5 km, respectively. When these values of cf* are used in the stochastic generator they produce fields that, by design, have the correct profile of layer cloud fraction and associated total cloud fraction, but because cf* was set constant with height, profiles of cumulative cloud fraction were slightly incorrect, as seen in the central plot in Figure 2. The rightmost plot in Figure 2 shows, however, that the generated fields have fractional amounts of cloud exposed to space, ɛm, that not only approximate the observed amounts well but that are impacted only slightly by alteration of cf* induced by precipitation screening.
 To further this example, Figure 3 shows the function C(cf) for both the unscreened and screened cases shown in Figure 2. It also shows the progression of steps in Brent's method using 25,000 subcolumns in the generator. The left plot in Figure 4 shows cumulative distribution for cf* for the unscreened case. It deviates slightly from a step function because of the generator's intrinsic stochastic noise. Median cf* is 2.02 km and interquartile range is just 0.03 km. Moreover, 90% of the solutions took 8 or fewer iterations to converge; 0.3% took 11. Use of this distribution of roots in the stochastic generator, this time using 100,000 subcolumns, yields the cumulative distribution for C shown in the right plot of Figure 4. The actual value is 0.876 while the median of the generated fields is 0.8756. The interquartile range is just 0.004. Interquartile ranges for the screened counterpart are almost identical. By almost all standards, uncertainties of these magnitudes are inconsequential.
5.1. Global Overlap Characteristics From CloudSat and CALIPSO Data
Figure 5 shows frequency distributions of cf* conditional upon total cloud fraction obtained from CloudSat-CALIPSO 2-D cross sections of lengths L = 200, 500, and 1000 km for all transects with total cloud fractions between 0.05 and 0.99 observed during January 2007. These cross sections did not have the precipitation screen applied. The limits on total cloud fraction were imposed, and used hereinafter, because of poor statistics expected at very small total cloud fraction and because cf* is undefined when total cloud fraction is 1; overlap of nonovercast layers in overcast cross sections is addressed later. Values were composited into six latitudinal bands as listed on the plots. For L = 200 km there were 33,250 suitable cross sections, while for L = 1000 km there were 11,139. As expected, distributions for both variables broaden as L decreases, especially those for total cloud fraction, which become increasingly J- and U-shaped [Rossow, 1989]. Without exception, all distributions exhibit peak median values of cf* at intermediate values of total cloud fraction. The largest values are in the northern polar region where the 75th percentile (or third quartile) of cf* reaches upward of 6 km for total cloud fractions near 0.6, which signifies strong vertical correlation, while at the other end the 25th percentile hovers near 1 km for most locations regardless of L.
Figure 6 shows zonal medians and interquartile ranges of cf* for several cross-section lengths L. There is very little suggestion that the median depends much on L. The interquartile range, however, clearly increases with decreasing L. This is due exclusively to the emergence of tails in the distribution toward large cf*, as a few collimated clouds within short distances give rise to some enormous cf* (i.e., near-maximum overlap). Sample sizes used to create Figure 6 are shown in Figure 7. Note that they do not scale linearly with L. This is because, as L decreases, an increasing number of clear-sky and overcast cross sections are encountered and disregarded.
 Given that at least medians of cf* depend weakly on L [cf. Mace and Benson-Troth, 2002], the majority of results shown hereinafter are for L = 500 km. This length probably provides adequate sampling for typical GCM grid cells [cf. Astin and Di Girolamo, 1999] and provides fairly large numbers of samples.
Figure 8 indicates that median values of cf* as functions of total cloud fraction can differ by 0.5 km to 1 km between land and ocean for January 2007, with perhaps a slight tendency for larger values over ocean. Why this should be the case is not clear at the moment. Figure 9 suggests that for extratropical areas, values of cf* during winter tend to be about 0.5 km to 1 km larger than during summer, with the seasonal disparity increasing poleward. Near 30°N the values presented here are somewhat smaller, particularly during summer, than values shown by Mace and Benson-Troth [2002] for CPR data gathered at ARM's Southern Great Plains site and do not display the strong seasonality seen in the ARM data. Note that Mace and Benson-Troth solved for cf* (which they called Δz0) via a method that differed from that used here and they used rather short cross sections. Their method requires much more computation than (8), and unlike cf*, there is no guarantee that use of their Δz0 in a stochastic generator will return an unbiased estimate of total cloud fraction. Regardless, Mace and Benson-Troth's mean values of Δz0 for the tropical ARM sites were ∼5 km for relatively short cross sections and vertical layering close to that of the satellite data used here. Tropical median values shown in Figure 9 for L = 50 or 100 km are only ∼2.5 km. Note, however, that the tail is thickening at these small L (see Figure 6) and corresponding tropical means are ∼6 km. Hence, the ARM and satellite results seem to be in fair agreement.
Figure 10 shows the same data as Figure 9 except they have been partitioned by total cloud fraction and grouped into broader zonal bands. It shows that the large seasonal differences seen in Figure 9 for the regions 90°S–60°S and 60°N–90°N come from cases with total cloud fraction ≈ 0.5. Compared with the magnitudes of cf* across the rest of Earth, this suggests that there is greater, and quite substantial, vertical coherence during the polar winter. Presumably, this is due to ice crystals forming aloft and sedimenting through air with little shear [cf. Hogan and Illingworth, 2003]. Likewise, Mace and Benson-Troth's [2002] estimate of ∼1 km for Δz0 from data gathered at the ARM North Slope of Alaska site agrees nicely with satellite mean values for 70°N–80°N, particularly after precipitation screening, which reduced median values shown in Figure 9 by ∼0.5 km.
 As mentioned, a total cloud fraction of 1 can occur in conjunction with max(ck) < 1. When this happens, the algorithm defaults to cf* = 0. However, when max(ck) = 1, there can be numerous other layers with ck < 1 that overlap according to cf* > 0. These can be investigated easily by simply removing clouds from overcast layers and proceeding to compute cf*. For L = 50 km these situations occur about 10% of the time; for the vast majority of cross sections of length L ≥ 100 km, removal of overcast layers results in total cloud fractions greater than 0.98 where differences in cf* have minor impacts. Upon removing overcast layers from cross sections of length L = 50 km, values of cf* relative to the entire population were typically 1 to 2 km larger; an increase of 50% to 100% (see Figure 6). This is partly due to the fact that even after removal of overcast layers, total cloud fractions generally exceeded 0.8, and as Figures 5, 8, and 10 indicate, if cross sections with small total cloud fraction are neglected, median cf* will increase. While such an increase in cf* is large, the fact remains that the cross section is overcast and realigning nonovercast layers via changes to cf* does not alter total cloud fraction; it alters only the variance of cloud optical depth [see Barker and Räisänen, 2005].
 To round out these comparisons, cross sections were partitioned according to whether the Sun was up or down (according to Solar_zenith in CloudSat's MODIS-AUX files). This was motivated partly by the expectation of finding some cloud structural differences depending on whether the Sun is up or not [e.g., Wang et al., 1999] but also because CALIPSO's lidar signal is noisier during sunup (K. Strawbridge, personal communication, 2007), which presumably impacts cloud detection and hence estimates of cloud fraction and cf*. The left plot in Figure 11 shows zonal median cf* for 500 km cross sections for January 2007. During sundown, estimates of cf* are typically ∼0.75 km larger than during sunup. These differences are large enough that they might warrant being captured in parametrizations of cf*. Smaller values during sunup are, however, consistent with the hypothesis that additional noise in lidar data fosters minor random fluctuations in cloud identification, thereby effecting a shift in cf* to smaller values.
 The right plot in Figure 11 shows the same analysis as in Figure 10 except just CloudSat radar data were used (i.e., CPR_Cloud_mask). It is immediately clear that when lidar data are neglected the day/night distinction in cf* disappears. This too supports the claim that cf* are biased low by random fluctuations in identification of cloud by the lidar during sunup. Moreover, note that radar-only estimates of cf* are typically 0.5 to 1 km larger than sundown estimates using radar and lidar. This is likely because the lidar's primary contribution to cloud detection is thin high cloud that is often missed by the radar which, as mentioned earlier, tends to detect lower precipitating cloud best. Likewise, the radar can sometimes miss low nonprecipitating cloud that could be detected by the lidar given a fairly clear line of sight. Since thin high clouds and low clouds are often essentially decoupled from each other, inclusion of the lidar will lead, rightly so, to smaller cf*.
5.2. Discussion: On the Prospect of Using Parametrized cf* in GCMs
 Given the importance of total cloud fraction for, at least, radiation calculations in GCMs, it is recognized generally that a means for describing how clouds overlap, and thus give rise to total cloud fraction, is required. As mentioned earlier, while cf will almost always vary with height through a column [cf. Hogan and Illingworth, 2003; Räisänen et al., 2004], there is a certain attraction to using something as simple as the effective overlap decorrelation length cf*. The questions that have to be answered are: can it be parametrized well, and will its simplicity result in undesirable effects that might be avoided with a, justifiably, more elaborate parametrization?
 Consider first some of the limitations associated with the use of cf*. Räisänen et al.'s [2004] subcolumn cloud generator will always produce, on average, the correct profile of layer cloud fractions. Likewise, if supplied with the correct cf*, it will also produce a distribution, due to random sampling, of total cloud fractions p(C). It will usually return, however, an erroneous distribution of cloudtop areas exposed to space. This could be important for computation of longwave radiative heating rates and outgoing longwave radiation to space.
Figure 12 shows zonal average layer cloud fraction profiles for six latitude bands for 500 km cross sections observed during January 2007. By definition, the expectation profiles, in the statistical sense, produced by the stochastic cloud generator are identical to those shown in Figure 12. Figure 13 shows corresponding zonal mean profiles of fractional amount of cloud exposed to space for the observed data and the generated fields using proper cf*. The right and left error bars on the plots are defined as
where ɛnobs and ɛnmod are observed and modelled fractions exposed to space, N is the total number of samples, and N< and N≥ are the number of samples in which ɛnmod < ɛnobs and ɛnmod ≥ ɛnobs, respectively. Hence, the bars represent mean negative and positive deviations from the observed means. For the most part, mean bias errors are very small but there are some obvious serial correlations: at altitudes above ∼3 km the generated fields have too much cloud exposed to space, while at lower altitudes they have too little. Just as with the example in Figure 2, this indicates that values of cf are smaller than cf* near the surface and larger aloft. This was true also for MMF data analyzed by Räisänen et al. [2004, Figure 3]. The small bias errors shown here suggest that if provided with good estimates of cf*, generated profiles should be satisfactory.
 With such small errors associated with mean cloud fraction exposed to space, one might expect that errors in cf* will matter little. However, consider the following test in which constant values of cf* are applied to the entire set of CloudSat profiles. On the basis of Figures 6, 9, and 11, if one had to select a global-constant cf* for use in a GCM, it would likely be between 1 and 3 km. Figure 14 shows differences between stochastically generated C based on cf* = 1, 2, and 3 km and their respective observed total cloud fractions, as a function of observed total cloud fraction, for 3000 randomly sampled 500 km long cross sections. Clearly, cf* = 1 km is too small as many estimates of total cloud fraction are too large. For cf* = 3 km the reverse is true. While cf* = 2 km yields a small bias error, a large fraction of errors exceed 10% of observed total cloud fraction. Stochastic noise was not an issue here as 100,000 subcolumns were used to generate C. It is now known, however, through experiments with McICA (H. W. Barker et al., The Monte Carlo Independent Column Approximation: An assessment using several global atmospheric models, submitted to Quarterly Journal of the Royal Meteorological Society, 2008), that climate simulations can digest very large amounts of unbiased noise, and so errors like those shown in Figure 14 may be more acceptable than they appear. Another study is needed to elucidate the radiative implication of these errors, and their impact on GCM simulations, before simple parametrizations of cf* can be dismissed.
5.3. Discussion: Importance of Cloud Overlap for Radiation Calculations in GCMs
Barker and Räisänen [2005] addressed the question of radiative sensitivities due to cloud overlap decorrelation length by assessing ∂FICA/∂cf, where FICA is net solar flux at the TOA, using the concept of cf* and performing extensive radiative transfer calculations on data from a global array of CSRMs [Khairoutdinov and Randall, 2001]. It is possible, however, to get a global estimate of ∂FICA/∂cf using only the data presented thus far; that is, without having to perform radiative transfer simulations and deal with gaps in CloudSat water content and particle size retrievals.
 Differentiating (2) with respect to cf, and assuming that ∂F(0)/∂cf = 0, gives
where is mean cloud optical depth, στ is standard deviation of (τ), and changes to higher-order moments of (τ) have been neglected [cf. Barker and Davis, 2005]. Using the common definition for TOA shortwave (SW) cloud radiative effect of
As noted in (12), the first term in brackets is negative since net TOA flux decreases as mean τ increases, and mean τ increases with cf because clouds become increasingly overlapped yet mass is conserved. On the basis of Barker and Räisänen's [2005] analyses, the magnitude of this term is likely close to, but less than, ∼1 W m−2 km−1. The second term in brackets is likely very small as ∂στ/∂cf can be positive or negative [Barker and Räisänen, 2005]. Integrating (12) over the annual cycle and Earth yields
where angle brackets indicate time and space averages. Using −45 W m−2 (based on CERES satellite data for 2001–2004) (J. Cole, personal communication, 2007) for global mean TOA SW CRE and −0.08 km−1 for the median of ∂ln C/∂cf for the unscreened, diurnally integrated, 1000 km cross sections for January 2007 gives a maximum sensitivity of ∼3.5 W m−2 km−1, which is close to Barker and Räisänen's [2005] lengthy model-based estimate of ∼2 W m−2 km−1. Corresponding sensitivity for longwave radiation is expected to be smaller than 1 W m−2 km−1 [see Barker and Räisänen, 2005].
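The chain of reasoning in this subsection can be sketched as a short derivation. This is a hedged reconstruction rather than the paper's own equations: it assumes that (2) has the usual independent column approximation form, writes the decorrelation length as \mathcal{L}_{cf} (cf in the text), and uses ⟨F⟩ for the flux averaged over the cloudy-sky optical depth distribution p(τ).

```latex
% Assumed form of (2): net TOA solar flux for total cloud fraction C
F_{\mathrm{ICA}} = (1 - C)\,F(0) + C\,\langle F \rangle ,
\qquad
\langle F \rangle = \int_0^{\infty} p(\tau)\,F(\tau)\,d\tau

% Differentiate, with \partial F(0)/\partial\mathcal{L}_{cf} = 0 and
% higher-order moments of p(\tau) neglected
\frac{\partial F_{\mathrm{ICA}}}{\partial \mathcal{L}_{cf}}
  = \frac{\partial C}{\partial \mathcal{L}_{cf}}
    \bigl[\langle F \rangle - F(0)\bigr]
  + C\left[
      \frac{\partial \langle F \rangle}{\partial \bar{\tau}}
      \frac{\partial \bar{\tau}}{\partial \mathcal{L}_{cf}}
    + \frac{\partial \langle F \rangle}{\partial \sigma_{\tau}}
      \frac{\partial \sigma_{\tau}}{\partial \mathcal{L}_{cf}}
    \right]

% SW cloud radiative effect under the same assumption
\mathrm{CRE} = F_{\mathrm{ICA}} - F(0) = C\,\bigl[\langle F \rangle - F(0)\bigr]
\;\Longrightarrow\;
\frac{\partial F_{\mathrm{ICA}}}{\partial \mathcal{L}_{cf}}
  = \mathrm{CRE}\,\frac{\partial \ln C}{\partial \mathcal{L}_{cf}}
  + C\,[\,\cdots]

% Leading term with the values quoted in the text
(-45~\mathrm{W\,m^{-2}}) \times (-0.08~\mathrm{km^{-1}})
  = 3.6~\mathrm{W\,m^{-2}\,km^{-1}}
```

Under these assumptions the leading term alone gives about 3.6 W m−2 km−1. Because the first bracketed term is negative with a magnitude of at most about 1 W m−2 km−1, the leading term acts as an upper bound, which is consistent with the quoted maximum of ∼3.5 W m−2 km−1 and the model-based value of ∼2 W m−2 km−1.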
6. Conclusions and Recommendations
 At this stage in the development of GCMs, the need to consider cloud overlap structure depends on the genre of the GCM. For conventional GCMs, a description of cloud overlap for unresolved cloud fields must be provided, via parametrization, to carry out, among other things, radiative transfer calculations. In the more avant-garde multiscale GCMs, with embedded CSRMs [Randall et al., 2003], cloud overlap ceases to be a parametrization issue and becomes a diagnostic variable that can help to assess CSRM clouds.
 This study examined the vertical overlapping properties of fractional amounts of cloud on a global scale using two months' worth of “cloud mask” products derived from CloudSat and CALIPSO data [Mace, 2007]. Cloud overlap was diagnosed assuming that total cloud cover C can be described as a linear combination of maximum and random overlap of layer clouds [Hogan and Illingworth, 2000]. The weighting factor was assumed to depend on layer separation and a decorrelation length cf. If cf is forced to be constant with height, then for a given profile of layer cloud amounts there exists a unique function C(cf). By computing the actual total cloud fraction for a cross section of satellite data, one can solve for the value cf* at which C(cf*) equals that observed total cloud fraction, where cf* is the effective decorrelation length. cf* was used here as a diagnostic variable for an otherwise cumbersome characteristic to assess.
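The diagnosis summarized above, building C(cf) from a profile of layer cloud amounts and then inverting it for cf*, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the algorithm used in this study: the weighting is assumed to be exp(−Δz/cf) applied to adjacent layers only, the function names `total_cover` and `solve_cf_star` are hypothetical, and the search bounds of 0.001 km and 100 km are arbitrary choices.

```python
import math


def total_cover(layer_fracs, layer_sep_km, cf_km):
    """Total cloud fraction C(cf) from adjacent-layer max/random blending.

    Assumes pairwise cover = alpha * C_max + (1 - alpha) * C_ran with
    alpha = exp(-dz / cf); non-adjacent layers interact only through
    the layers between them.
    """
    clear = 1.0 - layer_fracs[0]
    for k in range(1, len(layer_fracs)):
        c_up, c_lo = layer_fracs[k - 1], layer_fracs[k]
        if c_up >= 1.0:
            return 1.0  # an overcast layer gives total cover of 1
        alpha = math.exp(-layer_sep_km[k - 1] / cf_km) if cf_km > 0 else 0.0
        c_max = max(c_up, c_lo)
        c_ran = c_up + c_lo - c_up * c_lo
        pair = alpha * c_max + (1.0 - alpha) * c_ran
        clear *= (1.0 - pair) / (1.0 - c_up)
    return 1.0 - clear


def solve_cf_star(layer_fracs, layer_sep_km, c_obs,
                  lo=1e-3, hi=100.0, iters=60):
    """Bisection for the cf* at which total_cover equals c_obs.

    C(cf) decreases monotonically with cf in this sketch, so values of
    c_obs at or beyond the random and maximum limits map to 0 and inf.
    """
    if c_obs >= total_cover(layer_fracs, layer_sep_km, lo):
        return 0.0
    if c_obs <= total_cover(layer_fracs, layer_sep_km, hi):
        return math.inf
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_cover(layer_fracs, layer_sep_km, mid) > c_obs:
            lo = mid  # too random: a longer decorrelation length is needed
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative profile and "observed" cover; not real satellite data.
    fracs = [0.3, 0.5, 0.2]
    seps = [1.0, 1.0]
    c_obs = total_cover(fracs, seps, 2.0)
    print(c_obs, solve_cf_star(fracs, seps, c_obs))
```

Bisection is used here only because C(cf) is monotonic in this simplified form; any bracketing root finder would serve equally well.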
 In very general terms, median values of cf* tend to 0 km for very small total cloud fraction, increase approximately linearly with total cloud fraction to maxima between 2 and 3 km near 0.7, and decrease to ∼1.5 km as total cloud fraction → 1. Not only does this appear to be a fair description regardless of location and time, it also applies for cross-section lengths L ranging from 100 km to 1000 km. There are, however, some interesting, and potentially important, spatial and temporal variations in the statistics of cf*. For instance, the largest cf* appear to occur in polar regions during their respective winters and in the northern tropics during boreal summer. Although cf* were computed differently than by Mace and Benson-Troth and came from a very different data set, their general agreement is encouraging.
 A concern with CloudSat is that precipitation, whose visible extinction is generally much smaller than that of cloud and which is not acknowledged in GCMs, figures too strongly in its cloud masks. The issue is that precipitation is often well correlated in the vertical and so might bias cf* high, especially if one is assessing cf* for the express purpose of defining cloud structure for radiation calculations in GCMs. Hence, a very rough precipitation-screening algorithm was devised and applied to CloudSat-CALIPSO cloud masks. The impact was to reduce values of cf* from typically ∼2 km to ∼1.5 km.
 Synergy between radar and lidar was essential for the analyses presented here. As shown, when cloud masks were based on just radar data, typical values of cf* increased by ∼1 km. This is clearly due to the radar having missed high, tenuous clouds that are decoupled from lower clouds, and presumably also to too much weight being given to large (precipitating) particles, which are readily detected by CloudSat's radar and are better correlated in the vertical than cloud. At the same time, however, when radar and lidar data were used, it appeared that during sunup periods random noise in lidar data, and hence in cloud detection, yielded systematic underestimates of cf*. When just radar data were used, day/night differences in cf* vanished.
 It was shown that a global estimate of the sensitivity of TOA solar fluxes to changes in cf* can be obtained without performing radiative transfer calculations. A systematic error of ∼1 km in cf* can be expected to alter global-average net solar flux at the TOA by ∼3 W m−2. This estimate, which is comparable to that expected for a 10% to 20% systematic change in cloud particle size, agrees with Barker and Räisänen's [2005] estimate derived from extensive model-based calculations.
 Although cf* was used here essentially as a diagnostic tool, it may yet find worthwhile use in GCMs. It remains to be shown whether it can be parametrized based on resolved variables and whether such a parametrization, preferably as simple as possible, can capture a satisfactory fraction of radiative flux variance. The extremely simple parametrization of setting cf* to a global constant is attractive but has to be tested. It may be that having two distinct values, one for stable (stratiform) and another for unstable (cumuliform) conditions, would effectively velcro overlap to the cloud feedback conundrum, not as a feedback in itself but rather as a conditional setting that modulates the GCM's radiation budget in accordance with changing frequencies of occurrence of stratiform and cumuliform cloud.
 On this note, it was shown that use of a global-constant cf* leads to substantial random noise in C which would, of course, be passed on to radiation fields. A decade ago this might have seemed unacceptable, but experiments with the McICA method (H. W. Barker et al., submitted manuscript, 2008) have shown that GCMs can withstand considerable random noise, which McICA accepts in exchange for quashing bias errors. It is worth mentioning that a similar brand of random noise is implicitly present in more idealized schemes such as maximum-random overlap.
 This study represents just an initial step in the use of A-train data to study cloud structural properties. There are several obvious subsequent steps, including linking cf* to resolved variables in GCMs, assessment of cloud condensate overlap, further investigation of diurnal variations in cf*, assessment of overlap in cloud system-resolving models (including superparametrized GCMs), and ultimately impacts on radiation budgets and GCM simulations. Finally, conventional GCM sensitivity studies should be conducted to help elucidate how much work is warranted in the construction of a cloud overlap parametrization. In so doing, however, attention has to be paid to the verisimilitude of the GCM's cloudiness, for if it is too incorrect, it can easily skew the apparent role of overlap.
 This study was supported by grants from the U.S. Department of Energy (Atmospheric Radiation Measurement (ARM) grants DE-FG02-03ER63521 and DE-FG02-05ER63955), the Canadian Foundation for Climate and Atmospheric Sciences, and the Canadian Space Agency. Thanks are extended to T. L'Ecuyer (CSU) for helpful suggestions and P. Partain (CSU) for help with HDF files.