Cloud clearing of Atmospheric Infrared Sounder hyperspectral infrared radiances using stochastic methods


  • Choongyeun Cho,

    1. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
    2. Now at Semiconductor Research and Development Center, International Business Machines Corporation, Hopewell Junction, New York, USA.
  • David H. Staelin

    1. Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA


[1] A novel stochastic algorithm is presented for estimating Atmospheric Infrared Sounder (AIRS) radiances in the 3.7–15.4 micron spectral band that would be observed from space in the absence of clouds. This algorithm examines 3 × 3 sets of 15-km AIRS fields of view, selects the clearest fields, and then estimates a single cloud-cleared infrared spectrum for the 3 × 3 set using a series of simple linear and nonlinear operations on both the infrared and companion Advanced Microwave Sounding Unit (AMSU) microwave channels. These instruments were launched on the NASA Aqua satellite in May 2002. The algorithm was both trained and tested within 70° of the equator using global numerical weather analyses generated by the European Center for Medium-range Weather Forecasts (ECMWF); these analyses were converted to radiances using the SARTA v1.05 equation of radiative transfer. The RMS differences between the AIRS 4- and 15-micron CO2-band observations and the corresponding ECMWF/SARTA radiances over nighttime ocean are ∼0.2–0.3 K for ∼60 selected channels with weighting functions peaking at tropospheric altitudes down to the surface for the best 28% of all soundings selected using only AIRS data. For a larger ensemble of 314 channels the corresponding range was 0.27–0.40 K. Latitudes 30–70° yielded RMS differences of 0.26–0.78 K over land at night for the same 314 channels. Mean differences were largely eliminated by training the estimates using independent global observations made on the same 3 test days, which were spaced over 2 months.

1. Introduction

[2] One of the primary objectives of the AIRS/AMSU/HSB experiment [Aumann et al., 2003] is to demonstrate advanced satellite sounding techniques that can materially improve the performance of future operational and scientific numerical weather prediction models. This experiment involves three instruments: the Atmospheric Infrared Sounder (AIRS) covering the 3.7–15.4 micron spectral band with 2378 spectral channels [Pagano et al., 2003; Gaiser et al., 2003], the Advanced Microwave Sounding Unit (AMSU) covering the 23–191 GHz band with 15 channels, and the Humidity Sounder for Brazil (HSB) covering the 149–190 GHz band with 4 channels [Lambrigtsen and Calheiros, 2003]. Their fields of view (FOVs) at nadir are ∼14, 45, and 14 km, respectively. These instruments were launched on the NASA EOS Aqua satellite on 4 May 2002 into a sun-synchronous polar orbit at 705-km altitude. The limited spectral and vertical resolution of most prior infrared sounders increased the difficulty of compensating the observed radiances for the unwanted effects of clouds, i.e., “cloud-clearing.”

[3] The unique contribution of this paper to cloud-clearing methods involves development of a new data-trained stochastic method for correcting effects of clouds on hyperspectral infrared radiances. This contrasts with cloud-clearing approaches employing physical models for clouds and radiative transfer. Stochastic methods are computationally extremely efficient and can readily access information hidden in hundreds of infrared channels. For example, this statistical information reflects to some unknown degree the difficult-to-model-physically radiance properties of three-dimensional cloud assemblies with complex shapes and hydrometeor distributions. Stochastic Clearing (SC) methods can increasingly access this information as a result of technological advances that increase computer power and the size of training data sets.

[4] Section 2 of this paper introduces the stochastic method for clearing infrared spectral radiances, section 3 presents representative results, and section 4 discusses the conclusions.

2. Stochastic Cloud-Clearing of AIRS Radiances

2.1. Overview of Method and Rationale

[5] This stochastic cloud-clearing method utilizes the observed multivariate statistical relationships between AIRS observations and corresponding clear air radiances within each 45-km AMSU FOV that would have been observed in the absence of clouds, as predicted using ECMWF analysis fields plus a rapid radiative transfer program. The radiative transfer program, SARTA v1.05, is one of an evolving series tuned to AIRS channels, as described by Strow et al. [2006]. The operational cloud-clearing code described below consists of ∼100 lines of MATLAB script that can cloud-clear an entire day of AIRS data within minutes on an inexpensive computer. Future refinements of these simple methods should yield accuracy improvements and permit extension to more AIRS channels.

[6] Section 2.2 describes the first step of the algorithm, which generates a preliminary estimate of the cloudiness for the 3 × 3 FOV array of interest, and section 2.3 describes the subsequent steps, which first generate a four-element vector that characterizes the required radiance corrections for clouds on all channels, and then generate the cloud-cleared radiances. Section 2.4 then describes the rationale for the algorithm.

2.2. Initial Linear Estimate of Radiance Corrections for Cloud and Surface Effects

[7] At nadir AIRS observes nine ∼14-km FOVs within a single AMSU 45-km FOV, which is called a “golf ball.” The stochastic cloud-clearing algorithm produces one set of cleared AIRS radiances for each golf ball on the basis of inputs that include (1) the AIRS Level-1B radiances for N channels of interest, where N is generally more than 300 for each of nine FOVs; (2) the brightness temperatures for five AMSU channels sensitive to tropospheric temperatures; (3) the secant of the instrument scan angle θ, where θ is zero at nadir; and (4) the a priori fraction of land in the golf ball FOV.

[8] The SC algorithm tested here is diagrammed in Figure 1 and consists of five main steps: (1) The FOVs to be used for each golf ball are selected and their radiances are averaged for each of N channels; (2) an initial linear estimate of cloudiness is made (operator A); (3) the cloudiness estimate is multiplied by the secant of scan angle θ and then, along with the inputs to operator A, this product is fed to a second linear operator (operator B), which estimates two brightness temperatures sounding low altitudes that are used to classify each golf ball as either “less cloudy” or “more cloudy;” (4) a final estimate of four principal components of the radiance correction spectrum is made using operator C or D for the less or more cloudy golf balls, respectively; and (5) this correction spectrum is added to the average spectrum of the warmest FOVs for that golf ball to yield the final N cloud-cleared radiances. These steps are elaborated below.

Figure 1.

Stochastic cloud-clearing algorithm.

[9] The nine AIRS FOVs in each golf ball offer nine opportunities per sounding to avoid or minimize clouds. Although one FOV is generally the clearest, averaging more FOVs reduces instrument noise. Brief empirical tradeoffs led to a policy of using (1) the clearest FOV for channels having weighting functions that peak below 5 km, (2) an average of the four clearest FOVs for weighting functions peaking between 5 and 10 km, and (3) an average of all nine FOVs for higher altitudes. FOV cloudiness is inferred from the average radiance observed at eleven 4-micron channels having nadirial weighting function peak heights 1–3 km; the warmer FOVs are presumed to be less cloudy. To better characterize each golf ball the radiances for the most clouded FOV are also determined, although the resulting improvement is marginal. Future performance improvements should result from more elaborate FOV selection and averaging protocols.
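The FOV selection policy in this paragraph can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB code: the function name `select_and_average`, the argument layout, and the treatment of the 5- and 10-km boundaries as inclusive of the middle band are assumptions made here for the example.

```python
def select_and_average(fov_radiances, peak_heights_km, ranking_channels):
    """Average the clearest FOVs of one golf ball, channel by channel.

    fov_radiances    : list of 9 spectra, each a list of N brightness temperatures (K)
    peak_heights_km  : list of N weighting-function peak heights (km)
    ranking_channels : indices of the low-peaking 4-micron channels used to rank FOVs
    Returns (averaged spectrum, index of warmest FOV, index of coldest FOV).
    """
    # Rank FOVs from warmest to coldest; warmer FOVs are presumed less cloudy.
    warmth = [sum(spec[c] for c in ranking_channels) / len(ranking_channels)
              for spec in fov_radiances]
    order = sorted(range(len(fov_radiances)), key=lambda i: warmth[i], reverse=True)

    averaged = []
    for ch, height in enumerate(peak_heights_km):
        if height < 5.0:
            n_used = 1            # clearest FOV only
        elif height <= 10.0:
            n_used = 4            # four clearest FOVs
        else:
            n_used = len(order)   # all nine FOVs
        chosen = order[:n_used]
        averaged.append(sum(fov_radiances[i][ch] for i in chosen) / n_used)
    return averaged, order[0], order[-1]
```

The warmest and coldest indices are returned because the next step uses spectra from both extremes of the golf ball.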

[10] Next the N selected infrared radiances for the warmest FOV are converted to seven noise-adjusted principal component scores. Noise-adjusted principal components are principal components computed for variables that have been scaled so that the variances of their additive Gaussian noises are equal; this avoids dominance of the statistics by noisy variables [Lee et al., 1990]. These seven numbers are fed to operator A. Also fed to A are the first three noise-adjusted principal components for the coldest FOV together with the land fraction, the secant of the satellite scan angle, and the brightness temperatures for five AMSU channels from 53.6 to 57.5 GHz that sound tropospheric and lower stratospheric temperatures. AMSU channels 5, 6, 8, 9, and 10 were used; channel 7 was too noisy. The principal components were deduced from large ensembles of AIRS data. These 17 numbers are fed to a linear operator A that estimates the value of the first principal component for the infrared radiance correction spectrum. Operator A simply multiplies the 17-element input vector by the matrix A. The linear regression matrix A was trained on appropriate global AIRS/ECMWF data different from that used later for evaluation. The algorithm for this initial linear estimate of radiance corrections is diagrammed in Figure 2.
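As a rough sketch of how the 17-element input to operator A might be assembled and applied, consider the following. The ordering of the elements, the helper names, and the absence of an offset term are assumptions for illustration; the text states only which quantities are included and that operator A is a matrix multiplication.

```python
import math


def scan_secant(theta_deg):
    """Secant of the scan angle; equals 1 at nadir (theta = 0)."""
    return 1.0 / math.cos(math.radians(theta_deg))


def build_input_vector(warm_pc_scores, cold_pc_scores, land_fraction, sec_theta, amsu_tb):
    """Concatenate the 17 predictors (the ordering is an assumption).

    warm_pc_scores : 7 noise-adjusted PC scores for the warmest FOV
    cold_pc_scores : 3 noise-adjusted PC scores for the coldest FOV
    land_fraction  : a priori land fraction of the golf ball (0..1)
    sec_theta      : secant of the scan angle
    amsu_tb        : 5 AMSU brightness temperatures (channels 5, 6, 8, 9, 10)
    """
    if len(warm_pc_scores) != 7 or len(cold_pc_scores) != 3 or len(amsu_tb) != 5:
        raise ValueError("expected 7 warm scores, 3 cold scores, and 5 AMSU channels")
    return (list(warm_pc_scores) + list(cold_pc_scores)
            + [land_fraction, sec_theta] + list(amsu_tb))


def apply_operator(matrix, vector):
    """Plain matrix-vector product: one output per matrix row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]
```

With a trained 1 × 17 matrix A, `apply_operator(A, x)[0]` would be the scalar estimate of the first principal component of the correction spectrum.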

Figure 2.

Operator for selecting and averaging FOVs, and operators A, B, C, and D.

2.3. Final Estimate of Radiance Corrections for Cloud and Surface Effects

[11] So far only a preliminary radiance correction estimate exists, the scalar output of matrix A. Next a nonlinear operator computes a parameter that approximates the angular dependence of the radiance correction factor; it is the product of the output A_o of operator A and the secant of the instrument scan angle θ. Although separate estimators could be constructed for each view angle and other angle dependences could be used, this estimator functions well at all angles and has the advantage of simplicity. Operator B multiplies linear regression matrix B by the same 17-element input vector augmented by A_o sec θ. Operator B produces estimated brightness temperature corrections for 11- and 15-micron radiances having weighting functions peaking near 0.47 and 2.95 km for the standard atmosphere (927.86 and 715.94 cm−1, respectively). The distributions of these corrections are indicated by the horizontal axis in Figure 3. All golf balls with brightness temperature corrections for both the 0.47- and 2.95-km channels of less than 2 and 1 K, respectively, are classified as “less cloudy;” the rest are “more cloudy.” The less cloudy group generally includes most clear golf balls and some partly cloudy ones. Initial studies show that similar cloud-clearing performance is obtained for alternate pairs of similar channels at 4- or 15-micron wavelength, and that further stratification in cloudiness offers limited improvement. Note that if only one of the two wavelengths were used for cloud classification, many golf balls would be passed that would fail the other test; this is evident in the many black dots (rejected golf balls) in Figure 3 that lie to the left of the threshold lines.
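The augmentation and two-threshold classification described in this paragraph could look roughly like the sketch below. The function name, the placement of the product term as the last element, and the row order of matrix B (0.47-km channel first) are assumptions made for the example.

```python
def classify_golf_ball(input17, a_out, sec_theta, matrix_b, thresholds_k=(2.0, 1.0)):
    """Apply operator B and label the golf ball.

    input17      : the 17-element vector already fed to operator A
    a_out        : scalar output of operator A
    sec_theta    : secant of the scan angle
    matrix_b     : 2 x 18 regression matrix (row 0: 0.47-km channel, row 1: 2.95-km channel)
    thresholds_k : correction limits in K for the 0.47- and 2.95-km channels
    Returns (18-element augmented vector, (corr_047, corr_295), label).
    """
    # Nonlinear step: the product of operator A's output and sec(theta) becomes an 18th input.
    augmented = list(input17) + [a_out * sec_theta]
    corr_047, corr_295 = (sum(w * x for w, x in zip(row, augmented)) for row in matrix_b)
    # Both corrections must fall below their thresholds to count as "less cloudy".
    less_cloudy = corr_047 < thresholds_k[0] and corr_295 < thresholds_k[1]
    label = "less cloudy" if less_cloudy else "more cloudy"
    return augmented, (corr_047, corr_295), label
```

Requiring both tests to pass mirrors the remark that a single-wavelength test would admit golf balls that fail the other one.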

Figure 3.

Initial radiance correction and final Δ(°K) relative to ECMWF/SARTA observed at (top) 715.94 cm−1 and (bottom) 927.86 cm−1. Solid dots result from operator B while the shaded dots result from operator C.

[12] The SC estimation process then begins anew, multiplying the same 18-element input vector by either matrix C or D, depending on whether the golf ball classification is less or more cloudy, respectively. The outputs of operators C and D are the scores of the dominant four principal components of the radiance correction spectrum. This correction is the estimated difference spectrum between that observed by AIRS and that computed by applying the SARTA v1.05 equation of radiative transfer to ECMWF atmospheric fields that have been adjusted in time and space to AIRS FOV coordinates. Matrices C and D were trained on 519 and 1814 golf balls, respectively, distinct from those tested later. Training and test ensembles are discussed further in section 3.
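A sketch of this final step, under simplifying assumptions, is given below. It assumes the four correction principal components are supplied as plain basis spectra in brightness temperature units with no mean term or noise rescaling, which the text does not specify; the function and argument names are hypothetical.

```python
def clear_radiances(augmented18, label, matrix_c, matrix_d, correction_pcs, averaged_spectrum):
    """Estimate cloud-cleared radiances for one golf ball.

    augmented18       : 18-element input vector (17 predictors plus A_o * sec(theta))
    label             : "less cloudy" or "more cloudy", as produced by the operator B test
    matrix_c/matrix_d : 4 x 18 regression matrices for the two classes
    correction_pcs    : 4 basis spectra, each of length N (assumed un-normalized, in K)
    averaged_spectrum : N averaged radiances from the clearest FOVs (K)
    """
    operator = matrix_c if label == "less cloudy" else matrix_d
    # Four principal-component scores of the correction spectrum.
    scores = [sum(w * x for w, x in zip(row, augmented18)) for row in operator]
    cleared = []
    for ch, observed in enumerate(averaged_spectrum):
        # Expand the four scores into a per-channel correction and add it.
        correction = sum(s * pc[ch] for s, pc in zip(scores, correction_pcs))
        cleared.append(observed + correction)
    return cleared
```

Because only four scores are estimated, the correction for every one of the N channels is constrained to a four-dimensional subspace, which is the property the rationale in section 2.4 relies on.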

2.4. Rationale

[13] The rationale for SC algorithms relies upon the observed nearly monotonic nonlinear multivariate statistical relationship between cloudy and cloud-cleared radiances, provided that at least one of the nine FOVs being examined is at least partly clear. As discussed later, none of the nine needs to be completely clear. The nonlinearity introduced by physics and the non-Gaussian nature of the atmosphere is accommodated primarily by stratifying the AIRS data into several categories characterized by different statistics, and use of a few ad hoc nonlinear operators and iterations.

[14] Although the physical degrees of freedom include the complex three-dimensional distributions of cloud particle size, phase, and shape within each FOV, all of which must be modeled for physical retrieval methods, four degrees of freedom appear to be sufficient in stochastic models to characterize the radiance perturbation spectrum. Although physical methods sometimes characterize FOVs by the altitudes and fractional coverages of two cloud layers [Susskind et al., 2003], stochastic models apparently fold these four degrees of freedom together with others in an effective but obscure manner. Early SC experiments assumed these four degrees of freedom were linearly related to radiance corrections, but the approach presented below has achieved greater success by assuming these relations are mildly nonlinear.

[15] The SC algorithm presented here accommodates nonlinearities in three ways: (1) Two simple nonlinear operators, i.e., less cloudy FOV selection and multiplication by secθ, are inserted before multiplication by the matrices A and B, respectively; (2) the data are stratified into a few subcategories (ten here) that utilize different sets of linear operators A–D; and (3) nonlinear behavior is also produced by linearly combining radiances that are nonlinearly related to the desired radiance corrections in unique ways. The ten categories involve land versus sea, two latitude bands, and night versus day versus all times; there is no stratification by scan angle. The land/sea distinction is based on a fixed geographic database. The division of golf balls into less and more cloudy categories (operators C and D) reduced errors while further stratification, such as establishment of a “clear” category, helped but little.

[16] The agreement found here between AIRS and the corresponding time- and space-interpolated ECMWF radiances is insensitive to stable biases introduced by the instrument or radiative transfer computations. This is because the linear estimators were both trained and tested using global instrument data obtained from the same 3 days between 21 August and 12 October 2003; thus any bias in training is automatically compensated when testing. The comparison is not otherwise statistically “inbred” however, because the thousands of FOVs used for training and testing are different, interspersed, and not adjacent, and the radiance corrections have only four degrees of freedom across the full spectrum.

3. Results: Comparison With ECMWF and SARTA Radiances

3.1. Validation Data and Strategy

[17] The analysis here primarily explores the precision of AIRS cloud-cleared radiances since the SC algorithm is both trained and tested on the same type of data, resulting in cancellation of multiday global mean errors. The revealed precision does indicate, however, the utility of AIRS cloud-cleared radiances for operational numerical weather predictions since mean discrepancies between models and data are largely removed by existing assimilation procedures. The validation data used here are space/time-interpolated ECMWF analysis fields processed using SARTA v1.05 for radiative transfer. Three full days of global data are analyzed here: 21 August, 3 September, and 12 October 2003, the third day being relatively cloudy. Only Level 1B v3.1 AIRS data within ±70° latitude of the equator were used.

[18] For each evaluation approximately half the golf balls were used for training and half for testing, both sets being arranged in superimposed noncontacting regular lattices. Since no testing golf ball was ever adjacent to a training golf ball, and since both land and clouds have correlation distances generally less than ∼100 km, the two sets can be regarded as largely independent for purposes of evaluating instrument precision. Systematic variable errors in spectroscopy, atmospheric modeling, and clearing algorithms are evident only as unexplained increases in variance, and mean errors are not revealed.
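One arrangement that satisfies the non-adjacency property described here is a pair of offset regular lattices. The sketch below is purely illustrative: the lattice period of 4 and the half-period offset are assumptions, and the paper does not state the spacing actually used.

```python
def lattice_split(n_rows, n_cols, period=4):
    """Assign golf-ball grid positions to two interleaved, non-touching lattices.

    Training points sit at multiples of `period`; test points are offset by
    half a period in both directions, so no test point touches a training
    point, even diagonally. Positions on neither lattice are left unused.
    """
    half = period // 2
    train, test = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            if r % period == 0 and c % period == 0:
                train.append((r, c))
            elif r % period == half and c % period == half:
                test.append((r, c))
    return train, test
```

With this spacing the two sets are equal in size on a regular grid, so roughly half of the golf balls that are used go to training and half to testing.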

[19] The ECMWF data consist of temperatures and absolute humidity at the surface and at 60 pressure levels extending to 0.1 mbar. These analyses were on a 1° grid at 6-hour intervals and were spatially and temporally interpolated to the center of each AIRS golf ball. The ECMWF fields utilized by SARTA did not incorporate any clouds, aerosols, or precipitation. The emissivity of both land and sea was assumed to be characteristic of water, varying between 0.95 and 0.99, depending on wave number [Fishbein et al., 2003]. This assumption of ocean emissivity characteristics over land should not introduce major errors because (1) the average errors in the assumed emissivity partly cancel because they occur both in training and testing and because land was trained separately, (2) the AIRS observations alone can partly compensate for surface variations, and (3) dry land has only minor emissivity variations. Any systematic variation of surface effects with wavelength enables them to be estimated; for example, both surface temperature and emissivity can be estimated using radiance differences between long and short wavelengths if the surface type is constrained. Furthermore, as discussed later, variable solar heating and not emissivity probably dominates land-surface-induced radiance clearing errors.

3.2. Classification of AIRS Channel Behavior

[20] An early SC experiment involved cloud clearing 827 AIRS channels, including all 4- and 15-micron channels plus one fifth of the rest. Figure 4 shows the RMS difference between the SC AIRS radiances and those predicted by ECMWF/SARTA; the horizontal axis indicates the altitude at which the temperature weighting function peaks for a representative atmosphere. In order to minimize surface and other effects this analysis was restricted to oceans at night, and latitudes within 40° of the equator. Three full days of data at all scan angles were analyzed. Only the least cloudy “best” 22% of all golf balls were included in the statistics, where cloudiness was determined by the first principal component of cloudiness produced by operator B in Figure 1. In general the 4-micron band exhibits the highest precision in the troposphere, whereas the 15-micron band excels in the stratosphere.

Figure 4.

RMS differences (°K) between AIRS brightness temperatures and ECMWF/SARTA predictions for nighttime ocean within 40° of the equator.

[21] The discrepancies between AIRS and ECMWF/SARTA radiances for the window and water vapor channels vary considerably. Additional processing can partially reduce instrument noise for many of the poorer channels, but such improvements are not utilized here.

[22] Most 4-micron channels exhibit RMS discrepancies below 0.4 K at all altitudes between ∼300 m and 40 km. The larger 4-micron errors and channel absences evident in Figure 4 near the stratosphere are largely explained by the Planck function; the 4-micron weighting function widths broaden for the reversed stratospheric temperature lapse rates, and channel sensitivities seriously deteriorate at low tropopause brightness temperatures. The excellent cloud- and surface-clearing performance below 2-km altitude results largely from the strong temperature dependence of the 4-micron Planck function and the ability of multiple channels with different temperature and aerosol sensitivities to compensate for partially cloudy FOVs, even in the absence of large clear “holes” in the atmosphere. The 15-micron channels with weighting functions peaking below ∼3-km altitude exhibit more than twice the variance of those channels sounding higher altitudes, presumably because of cloud and surface effects.

[23] Although water vapor channels peaking below 3 km exhibit RMS discrepancies of ∼0.7 K, the group above ∼6 km exhibits RMS errors with an arc-like distribution that peaks distinctly near 8-km altitude and 2 K. This arc-like distribution is even more pronounced in daytime data, with RMS discrepancies peaking near 3.6 K. These large variable errors almost certainly reflect known imperfections in forecast upper tropospheric humidity fields. Therefore assimilation of these water vapor radiances into NWP models would presumably improve water vapor analyses and forecasts significantly.

3.3. Evaluation of Selected Channels

[24] One of the most important applications of AIRS data will involve variational assimilation of AIRS radiances by operational weather forecasting models. This section therefore focuses on a set of 314 channels well suited to this purpose: those exhibiting RMS radiance discrepancies below 0.5 K.

[25] The SC algorithm of Figure 1 was again employed over ocean at latitudes within 40° of the equator using only these 314 channels for all angles and both day and night. Figure 5 shows the results when 100, 88, 67, or 37% of all golf balls are considered, depending on the acceptance thresholds used in the cloudy test (see Figure 1). The degradation is slight if only the less cloudy half of all soundings are included in the statistics; similar results were obtained for other global data sets. Since clouds are so prevalent, the fact that SC works well for roughly half of all soundings implies that it must be clearing golf balls having few if any FOVs that are totally clear; this issue is discussed further later. Note that the agreement would be still better if the performance had not been averaged over all channels with weighting functions peaking within a given block, where these channels may include water vapor and window channels, and the 4- and 15-micron CO2 channels. Figures 4, 6, and 7 suggest that reductions in RMS discrepancies by a factor of ∼1.5 would result if only the best 50 channels were used instead. Figure 5 also shows that the effects of clouds become negligible above ∼8 km, where the difference between using 37 and 88% of all golf balls becomes a small fraction of the total RMS cloud-clearing discrepancies.

Figure 5.

Average RMS differences over ocean between 314 AIRS and ECMWF/SARTA radiances observed within 40° of the equator. The channels are grouped in 1-km altitude blocks and averaged.

Figure 6.

RMS differences over ocean between 314 AIRS radiances and ECMWF/SARTA (∣latitude∣ < 40, daytime).

Figure 7.

RMS differences over land between 314 AIRS radiances and ECMWF/SARTA (30 < ∣latitude∣ < 70, nighttime).

[26] Figure 6 shows that for the best 78% of all daytime oceanic low-latitude (<40°) golf balls the RMS discrepancies for the very best channels are generally ∼0.2 K for altitudes of ∼5–13 km, and degrade to ∼0.3 K for weighting functions peaking at the surface. These “best” golf balls were identified using threshold tests like those of Figure 3, based only on AIRS data. Figure 7 shows similar results for the best 28% of all nighttime observations over land at latitudes 30–70° in both hemispheres. “Land” means that the a priori land fraction within a golf ball exceeds 80%. These AIRS/ECMWF discrepancies over land degrade from ∼0.2 K in the upper troposphere to ∼0.55 K at the surface; during daytime the surface discrepancies roughly double to ∼1.1 K. Surface elevations above 0.5 km were discarded because of lack of adequate retrieval training data.

[27] More complete performance metrics are presented in Tables 1 and 2, where the average results for 314 channels are presented for the same 3 days cited earlier. Each of the ten categories was trained and tested separately, with over 1000 golf balls being used for training in each case. The 314 channels are roughly those exhibiting discrepancies less than 0.5 K in Figure 4. Significantly lower discrepancies would result if only the best ∼50 channels were used, as suggested by Figures 6 and 7. Data at all scan angles were averaged. These discrepancies are generally smaller for low latitudes and ocean, and for land at night versus daytime. One exception is channels peaking below 1–2 km over land during the day; they perform better at high latitudes, presumably because of reduced daytime surface heating. An increased error due to solar surface heating is consistent with the strong day-night difference observed over land at all latitudes. Note that these AIRS versus ECMWF/SARTA discrepancies are generally below ∼0.5K RMS for (1) 78% of all golf balls for channels peaking above 2–6 km and (2) 28% of all golf balls over low-latitude ocean for channels peaking above 0 km, and of all nighttime land golf balls for channels peaking above 1 km. The results for the best 28% would improve if outliers in this group could be identified more reliably and removed, as evidenced by the slight improvement in two cases when 78% are averaged.

Table 1. Cloud-Clearing RMS Difference (K) With Respect to ECMWF for the Best 28% Golf Balls
Weighting Function Peak Height, km | Ocean, ∣Lat∣ < 40 | Ocean, 30 < ∣Lat∣ < 70 | Land, ∣Lat∣ < 40 | Land, 30 < ∣Lat∣ < 70

Table 2. Cloud-Clearing RMS Difference (K) With Respect to ECMWF for the Best 78% Golf Balls
Weighting Function Peak Height, km | Ocean, ∣Lat∣ < 40 | Ocean, 30 < ∣Lat∣ < 70 | Land, ∣Lat∣ < 40 | Land, 30 < ∣Lat∣ < 70

[28] Table 3 shows the increases in the entries of Table 2 that would result if AMSU microwave data were not available. The degradation due to loss of microwave data is greatest at higher latitudes and over land, particularly for channels peaking below ∼2 km. If only the best 28% of all golf balls are used, the degradation below ∼2 km is significant only over land.

Table 3. RMS Cloud-Clearing Penalty (K) Without Using AMSU for the Best 78% Golf Balls
Weighting Function Peak Height, km | Ocean, ∣Lat∣ < 40 | Ocean, 30 < ∣Lat∣ < 70 | Land, ∣Lat∣ < 40 | Land, 30 < ∣Lat∣ < 70

[29] Figure 8 characterizes SC performance in another fashion. The SC algorithm was trained on the 314 best channels over daytime ocean and within ±40° latitude, and was then applied to a typical daytime AIRS granule obtained 14 July 2003, more than a month earlier than any of the training data. The granule is centered southwest of Hawaii near 175°W, 5°N. Figure 8 (top) shows the original AIRS 14-km FOV brightness temperatures at 2187.8 cm−1; at this wave number the weighting function peaks ∼230 m above the nominal surface and has some sensitivity to CO. Since CO is less significant near the equator and generally smoothly distributed locally, its contributions to Figure 8 are presumably negligible. Each vertical scan line contains 90 FOVs. The baseline has been increased toward the limb by averaging the SC results for all scans and restoring that average decrease with angle to both the top and middle images. Within the major clouds it can be seen that only a few golf balls have even one FOV with cloud perturbations less than 5 K.

Figure 8.

(top) AIRS 2187.8 cm−1 angle-corrected relative brightness temperatures (°K) near Hawaii on 14 July 2003, (middle) the corresponding angle-corrected SC cloud-cleared temperatures, and (bottom) NOAA/NCEP estimated sea surface temperatures.

[30] Figure 8 (middle) shows the angle-flattened SC cloud-cleared radiances, most of which fit within a 2-K dynamic range and, more locally, within a ∼0.6-K range. Each vertical scan line contains 30 golf balls that have been bilinearly interpolated. It is evident that most clouds have been cleared with reasonable accuracy even without any fully clear FOVs, and that only the more intense clouds remain evident. The original image is everywhere colder than the cleared image, the offset being ∼1 K for the clearest golf balls. The cleared image has a temperature difference left-to-right of 1.36 K, whereas the corresponding difference for the NOAA/NCEP-provided sea surface temperatures in Figure 8 (bottom) is ∼1.6 K. Both the SC-cleared and NCEP sea surface data exhibit the same sharp thermal front centered in both images.

[31] One of the more surprising results from these SC experiments is the near lack of performance degradation at extreme scan angles. Table 4 lists the RMS differences between the ECMWF/SARTA and SC-corrected AIRS radiances as a function of scan angle for a representative channel at 2390.1 cm−1 peaking near 1.9 km. All golf balls were tested for 3 days, including day and night, land and sea. The percentages of these golf balls that passed the threshold for each angular group are also listed. The acceptance thresholds were 1 and 2 K for channels peaking near 2.7 and 0.47 km, respectively. One reason the listed RMS cloud-clearing performance actually improves near the limb is that somewhat fewer golf balls pass the cloud test there. Together the slightly improved performance and reduced yield near the limb suggest that SC performance is not only nearly independent of viewing angle, but also largely independent of spatial resolution, for the FOV area increases more than a factor of three at the extreme viewing angle. This result is expected, however, if the SC algorithm can indeed successfully use FOVs that are each only partly cloud-free. Thus this result for the clouds of Figure 8 reinforces the earlier observation that stochastic cloud clearing appears successful even when golf balls have no totally clear FOVs.
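The statement that the FOV area grows by more than a factor of three can be checked with a rough flat-Earth estimate. In the sketch below, h is the satellite altitude and Δθ the angular beamwidth; the maximum scan angle of 49.5° is an assumption not given in the text, and Earth curvature, which is neglected here, would increase the ratio further.

```latex
% Flat-Earth approximation; \theta_{\max} = 49.5^\circ is an assumed value, not from the text.
\begin{aligned}
w_{\text{along-scan}} &\approx h\,\Delta\theta\,\sec^{2}\theta, \qquad
w_{\text{along-track}} \approx h\,\Delta\theta\,\sec\theta, \\
\frac{A(\theta)}{A(0)} &\approx \sec^{3}\theta, \qquad
\frac{A(49.5^\circ)}{A(0)} \approx \left(\frac{1}{0.6494}\right)^{3} \approx 3.65 .
\end{aligned}
```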

Table 4. RMS Radiance Discrepancies at 2390.1 cm−1 Between AIRS and ECMWF/SARTA as a Function of Scan Angle
Scan Angle, Degrees From Nadir | AIRS Versus ECMWF, RMS °K | Percentage in Good 38%

[32] Another way to evaluate the relative performance of alternative cloud-clearing methods is to measure the degree to which adjacent golf ball radiances are cleared to approximately the same values. This test is useful because local clear air atmospheric variations are roughly an order of magnitude smaller than are cloud perturbations, i.e., tenths of degrees versus degrees. Such tests require that the cloud-clearing process for each golf ball be independent of its neighbors, which is the case here.

[33] For the spatial variation test of Figure 9 the 2223 cm−1 4-μm window channel was examined for 14 globally distributed AIRS granules observed 28 August 2005 over ocean between the latitudes of 14°N and 57.7°S. Only the 14,129 golf balls that passed both the 78% SC test and the comparable AIRS team quality assurance test [Chahine et al., 2001] were designated as valid; this represents 74.8% of all golf balls. For each granule and channel a two-dimensional third-order polynomial was fit to the least corrected quarter of all valid golf balls so as to minimize the total variance between these least corrected (least cloudy) radiances and the polynomial. This polynomial baseline thus approximates clear air radiance values. Perfect representation of the clear air values is not necessary because the cloud-induced high-spatial-frequency deviations from the baseline dominate the comparisons made here.

Figure 9.

Rank-ordered RMS discrepancies between stochastically cloud-cleared golf ball radiances and a two-dimensional polynomial fit to the clearest golf balls for each of 14 granules (line with circles) and same results using physics-based cloud clearing (v. 4.0.9) (line with crosses).

[34] The deviations from baseline for each golf ball were rank ordered separately for each of the 14 granules, and then the RMS deviation (K) of the 14 samples for each rank was determined and plotted in Figure 9; in Figure 9, rank is expressed in terms of the percentage of golf balls that were better. We expect approximately a linear increase with rank from zero RMS values, where the rate of increase is roughly proportional to the overall RMS deviations due to cloud effects. The right end of the curve represents the RMS worst case discrepancy over the 14 granules if all valid golf balls are included. Also plotted on the same graph are the results for the same experiment and AIRS golf ball radiances, but cloud-cleared instead using a state-of-the-art physics-based algorithm, version 4.0.9 (Goddard Earth Sciences Distributed Active Archive Center, 2005). The physical basis for this algorithm was generally described by Susskind et al. [2003]. Although all valid golf balls in both the SC and physics-based experiments had to survive both the physics-based and SC-based initial rejection criteria, no other SC-derived information was used in obtaining the physics-based results, and vice versa. It is clear from these distributions that in this case the residual SC cloud effects for the best golf balls are roughly half those for the physics-based cloud-clearing algorithm. Equivalently, for any given residual error threshold below ∼0.5 K, roughly twice as many golf balls were cleared using SC versus physics-based methods. The worst case residual errors are also noticeably less for the SC method. Although such a limited experiment is not definitive by itself, it does suggest that further study of SC methods is warranted as still better cloud clearing algorithms continue to be developed.
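The rank-ordering statistic behind Figure 9 might be computed along the following lines. This is a sketch under stated assumptions: deviations are taken as absolute values, and ranks are mapped to percentiles by simple index truncation so that granules with different numbers of valid golf balls can be combined; neither detail is specified in the text.

```python
import math


def rank_ordered_rms(granule_deviations, percentiles):
    """RMS across granules of the deviation found at each percentile rank.

    granule_deviations : one list of baseline deviations (K) per granule
    percentiles        : ranks in percent (0..100) at which to evaluate the curve
    """
    # Sort each granule's absolute deviations once, smallest (best) first.
    ordered = [sorted(abs(d) for d in devs) for devs in granule_deviations]
    curve = []
    for p in percentiles:
        samples = []
        for devs in ordered:
            # Index of the golf ball at this percentile, clamped to the last one.
            idx = min(int(p / 100.0 * len(devs)), len(devs) - 1)
            samples.append(devs[idx])
        curve.append(math.sqrt(sum(s * s for s in samples) / len(samples)))
    return curve
```

Running this once on the SC deviations and once on the physics-based deviations for the same valid golf balls would give the two curves being compared.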

[35] The test date for Figure 9 was 28 August, which lies between 2 of the 3 days used for training the SC algorithm, separated by a week in each direction. Good cloud-clearing results are also exhibited in Figure 8, where the cleared image was observed a month before its training data set began. These results imply that infrequent training updates (weekly or monthly) should suffice. Since the training is global, a fixed annual cycle of training data may be adequate.

[36] It is important to note that AIRS radiances can also be cleared by using coincident infrared spectrometers with ∼1-km spatial resolution that can estimate cloud coverage within ∼14-km AIRS FOVs at representative wavelengths so as to improve the performance of N*-based cloud-clearing methods. Analyses by Li et al. [2005] suggest cloud-clearing performance comparable to that reported here can be obtained by such methods, even at ∼14-km resolution, although they provide no comparisons with NWP results. They also describe a performance analysis method based on principles similar to those employed here in deriving the results reported in Figure 9; that is, residual local radiance variations are most likely due to clouds.

4. Conclusions

[37] The results above lead to six operationally significant conclusions: (1) AIRS SC-cleared radiances sounding all altitudes are sufficiently consistent with ECMWF analysis fields that over 30% of AIRS golf balls could probably be profitably assimilated into operational global models in the near future, and roughly three quarters of all golf balls could be assimilated for channels with weighting functions peaking above ∼5 km; (2) since this consistency in cloud-cleared radiances could not be accidental, it implies that under most circumstances both the ECMWF analysis fields and AIRS radiances must be quite accurate, absent mean errors; (3) SC algorithms appear to function well a significant fraction of the time even if no FOV is fully clear, reducing incentives for employing alternate retrieval strategies that rely on rare totally clear FOVs; (4) since SC cloud-clearing performance is nearly independent of viewing angle and therefore of the diameter of the FOV, high spatial resolution may not be essential for good cloud-clearing performance (the area of the 14-km nadir FOV of AIRS increases by more than a factor of three at the highest scan angles); (5) initial comparisons of physics-based and stochastic cloud-clearing methods suggest that SC methods are quite promising and that their continued development and evaluation are warranted; and (6) effective SC cloud-clearing algorithms should require no more than a single lightly loaded PC for real-time execution.
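The growth in FOV area with scan angle noted in conclusion (4) can be checked with a simple geometric sketch. The orbit altitude (705 km), maximum scan angle (49.5°), spherical Earth of radius 6371 km, and small-beamwidth approximation are assumptions introduced here, not values taken from this paper, and the function name `footprint_area_ratio` is hypothetical.

```python
import math

def footprint_area_ratio(scan_angle_deg, altitude_km=705.0,
                         earth_radius_km=6371.0):
    """Approximate ratio of footprint area at a scan angle to that at nadir.

    Assumes a spherical Earth and a narrow beam, so that footprint size
    scales with slant range and the cross-track dimension is additionally
    stretched by 1/cos(local zenith angle). Default altitude and radius are
    illustrative assumptions.
    """
    theta = math.radians(scan_angle_deg)
    if theta == 0.0:
        return 1.0
    # Local zenith angle at the ground, from the law of sines in the
    # triangle formed by Earth's center, the satellite, and the ground point.
    sin_zenith = (earth_radius_km + altitude_km) / earth_radius_km * math.sin(theta)
    zenith = math.asin(sin_zenith)
    # Earth-central angle between the sub-satellite point and the ground point.
    central = zenith - theta
    # Slant range from the satellite to the ground point.
    slant_km = earth_radius_km * math.sin(central) / math.sin(theta)
    along_track = slant_km / altitude_km           # growth along track
    cross_track = along_track / math.cos(zenith)   # extra oblique stretching
    return along_track * cross_track

# Assumed maximum scan angle for illustration.
ratio_at_edge = footprint_area_ratio(49.5)
```

Under these assumptions the ratio at the scan edge comes out well above three, consistent with the statement in the text. A more careful treatment would integrate the beam pattern over the curved surface rather than using a narrow-beam approximation.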

[38] The 314 channels characterized in this paper were chosen because they agreed well with ECMWF/SARTA. There is hope that these SC methods could clear many other channels once the reasons for their divergence from ECMWF are better understood. Furthermore, most of the radiance discrepancies observed on certain water vapor channels are believed to be due to weaknesses in the ECMWF upper tropospheric water vapor analyses. These SC methods should be extensible beyond 70° latitude to the poles by linear or nonlinear addition of land surface elevation and latitude to the inputs of operators A, B, C, and D, and by further stratification, especially for surface type. The large training ensembles required to achieve adequate statistics for all surface elevations, types, and conditions were not available for the experiments reported here.

[39] The SC algorithms detailed here are only simple examples of what can be implemented under the SC strategy. Alternative routines could be developed for FOV selection and averaging, for handling all scan angles, for establishing protocols and thresholds for classifying golf balls into two or more categories, for incorporating other nonlinearities, for iterating results, and for training. Neural networks can effectively combine some of these functions, possibly making stratification unnecessary. The essence of SC algorithms is their substitution of stochastic models for physical ones, although physical reasoning can be incorporated in their design.


[40] The authors are grateful to the AIRS Team for providing the AIRS/AMSU/HSB data and cloud-cleared radiances, to ECMWF and William J. Blackwell for providing the ECMWF matching data, and to Philip W. Rosenkranz for useful discussions. This work was supported principally by NASA under contracts NAS5-31376 and NNG04HZ51C, and in part by NOAA under contract DG133E-02-CN-0011.