A set of data assimilation and forecast experiments is performed with the NASA global data assimilation and forecast system GEOS-5 to compare the impact of different approaches to the assimilation of Atmospheric Infrared Sounder (AIRS) data. The impact is first assessed globally on a sample of more than forty forecasts per experiment, through the standard 500 hPa anomaly correlation metric. Next, the focus is on precipitation analysis and precipitation forecast skill relative to one particular event: an extreme rainfall episode which occurred in late July 2010 in Pakistan, causing massive floods along the Indus River Valley. Results show that, in addition to improving the global forecast skill, the assimilation of quality-controlled AIRS temperature retrievals obtained under partly cloudy conditions produces better precipitation analyses, and substantially better 7-day forecasts, than assimilation of clear-sky radiances. The improvement of precipitation forecast skill up to 7 days is very significant in the tropics, and is caused by an improved representation, attributed to cloudy retrieval assimilation, of two contributing mechanisms: the low-level moisture advection, and the concentration of moisture over the area in the days preceding the precipitation peak.
The Aqua satellite, the advanced polar orbiting integrated infrared and microwave atmospheric sounding system launched by NASA in May 2002 [Pagano et al., 2003], carries, among other instruments, the Atmospheric Infrared Sounder (AIRS) and the Advanced Microwave Sounding Unit (AMSU-A). Because of the much higher spectral resolution of AIRS with respect to previous generation moderate spectral resolution instruments, AIRS represents a major leap forward in providing a very detailed description of the atmospheric thermodynamics, particularly in the tropics [e.g., Tian et al., 2006; Wu et al., 2006]. Despite the instruments' great potential and their proven benefit in operational weather forecasting [Le Marshall et al., 2006], only a very small fraction of AIRS data is being assimilated in operational procedures.
 In fact, current operational weather forecasting relies upon the assimilation of clear-sky radiances. According to the operational methodology, the only channels that are assimilated are those in which the field of view is thought to be completely unaffected by clouds. The rejection of data from cloudy regions is an important limitation that causes poor coverage of areas that are meteorologically most significant. Despite the coverage problem, there is a widespread assumption that the clear-sky radiance approach is superior to assimilation of retrievals, based upon problems noted for retrievals extracted from lower spectral resolution instruments using older retrieval algorithms.
In particular, it has been stated that retrievals require the contribution of a model forecast, and therefore are subject to additional model-dependent errors. However, these assumptions do not hold true for the latest generation of retrieval products from a hyper-spectral instrument such as AIRS. In fact, the procedure to produce AIRS Version 5 retrievals does not require or use any model background, as noted by Chahine et al., Le Marshall et al. and Atlas, and more extensively discussed by Susskind et al., who document the AIRS Science Team Version 5 retrieval algorithm. This is the same algorithm being used operationally to produce the retrieval products available at the Goddard Distributed Active Archive Center (DAAC). It is important to stress that, unlike the clear-sky radiances, AIRS Version 5 retrievals make use of information coming from partly cloudy areas. Another critical element found in the AIRS Version 5 retrieval algorithm is the ability to generate case-by-case, level-by-level error estimates for the retrieved temperature profiles. These error estimates are used as part of the quality control procedure to assimilate AIRS temperature profiles.
Reale et al. is the first study demonstrating the positive impact of quality-controlled AIRS Version 5 temperature retrievals under partly cloudy conditions on a global data assimilation and forecasting system. Reale et al. demonstrated that assimilation of AIRS Version 5 data could improve the analysis and forecast of a devastating and very difficult northern Indian Ocean tropical cyclone: Nargis, which hit Myanmar at 12Z 2 May 2008. The study was particularly noteworthy because most operational centers failed to produce a good analysis of Nargis: instead of a closed circulation, an open trough was noted in several operational analyses (as in Figure 3 of the Reale et al. article), even when the cyclone was already very strong, on 28 April 2008. In contrast, the assimilation of AIRS cloudy retrievals performed by Reale et al. helped to produce a well-defined storm center, from which the GEOS-5 model could produce substantially improved forecast tracks.
Zhou et al. analyzed for the first time the impact of different types of AIRS data, within a global data assimilation and forecast system, on precipitation analyses and forecasts associated with three Atlantic and Indian Ocean tropical cyclones, providing strong evidence that assimilation of cloudy retrievals produces substantially better results than assimilation of clear-sky radiances.
In this work the focus is on the extreme precipitation event that occurred along the Indus Valley, Pakistan, in late July 2010, which represents one of the most catastrophic humanitarian disasters in recent history. Detailed information about rainfall and synoptic forcings is provided by Houze et al., while the event predictability is discussed by Webster et al. Lau and Kim emphasize teleconnection aspects, demonstrating that Pakistan's heavy rain may be physically linked to the Russian heat wave and blocking in the extratropics. Particularly relevant to this work is the statement by Houze et al. on the difficulty faced by current operational models in the prediction of the precise ‘form’ of precipitating clouds (and consequently of precipitation spatial distribution). This work shows that an improved initialization, which makes use of AIRS information obtained in the presence of clouds, can help to characterize the deep moist airflow with greater accuracy, therefore improving precipitation location.
2. The Model and Data Assimilation System
The global data assimilation and forecasting system used in this work is the NASA GEOS-5, described in detail by Rienecker et al. The GEOS-5 combines the NASA finite-volume (fv) atmospheric global forecast model (whose dynamical core was originally developed by Lin) with the Grid point Statistical Interpolation (GSI) analysis algorithm developed by the National Centers for Environmental Prediction (NCEP) [e.g., Wu et al., 2002], and subsequently modified by the NASA Global Modeling and Assimilation Office (GMAO). The effectiveness of the fv dynamical core has been documented on a variety of problems, ranging from tropical cyclone forecasting [e.g., Atlas et al., 2005] to stratospheric transport [e.g., Pawson et al., 2007]. Since the earlier versions of the fv-based NASA GCM, the physics and parametrizations have undergone several modifications. The system is almost the same version as used by Reale et al. and Zhou et al., at the horizontal resolution of 0.5° × 0.67° and with 72 vertical levels, and is almost identical to the version used to produce the Modern-Era Retrospective Analysis for Research and Applications (MERRA), documented by Rienecker et al. For the purpose of this study, it is important to emphasize that the GEOS-5 uses an incremental analysis update (IAU) procedure, originally developed by Bloom et al., which has been noted to improve short-term precipitation forecasts due to the smoothness of the analysis corrections (the so-called ‘corrector sequence’). The IAU is extensively discussed, in addition to the already mentioned GEOS-5 documentation, by Rienecker et al. and in the online MERRA documentation. Its main purpose is to apply the analysis correction more gradually to the forecast model and reduce the shock at the forecast initialization [Bosilovich et al., 2011].
In particular, for each analysis time, the model is run to produce two additional background states, then is restarted from a previous state three hours earlier, being driven this time not only by its own physical tendencies but also by an additional time-invariant ‘analysis tendency’, which is obtained by dividing the analysis increments by their timescale. Each IAU cycle therefore comprises two segments, named corrector and predictor respectively. The difference between the two segments is that the former is aided by the analysis tendency, while the latter is not.
The benefit of the IAU procedure has been noted on many aspects of the GEOS-5 analysis, but is particularly noteworthy in the precipitation fields. As noted by Rienecker et al. and in the online MERRA documentation, the analyzed MERRA precipitation products (e.g., 3-hourly, daily, etc.) do not come from direct precipitation assimilation but from the corrector sequence. Since the corrector sequence is a very short-term forecast, it is strongly controlled by the observation flow. In addition, because it is aided by the analysis tendencies, its products are very smooth and are minimally affected by spin-down problems. MERRA precipitation has been compared to other analysis and observational data sets and found to be very realistic globally, regionally and on various timescales [e.g., Bosilovich et al., 2011; Wu et al., 2012].
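The corrector segment described above can be illustrated with a minimal sketch. It assumes a scalar model state, explicit Euler time stepping, a 6-hour corrector window and a 15-minute time step; none of these choices is specified in the text, and the function names are hypothetical rather than part of GEOS-5.

```python
def run_segment(state, model_tendency, n_steps, dt_s, analysis_tendency=0.0):
    """Step a scalar state forward with explicit Euler steps.

    analysis_tendency is constant in time: 0.0 gives a predictor-style
    (free-running) segment, a nonzero value gives a corrector segment.
    """
    for _ in range(n_steps):
        state += dt_s * (model_tendency(state) + analysis_tendency)
    return state


def iau_corrector(state_at_restart, increment, model_tendency,
                  window_s=6 * 3600, dt_s=900):
    """Spread an analysis increment evenly over the corrector window.

    window_s and dt_s are illustrative assumptions, not GEOS-5 settings.
    """
    # Analysis tendency: the increment divided by its timescale.
    analysis_tendency = increment / window_s
    n_steps = window_s // dt_s
    return run_segment(state_at_restart, model_tendency, n_steps, dt_s,
                       analysis_tendency)


# Toy case: with no model tendency, the whole increment is recovered by
# the end of the window, but it is applied gradually rather than at once.
final_state = iau_corrector(280.0, 1.2, lambda s: 0.0)
```

The point of the sketch is only that the increment enters as a constant forcing term, so the model state never experiences an instantaneous jump at the analysis time.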
3. AIRS Data Sets
The main purpose of the experiments described in this article is to compare the impact of the most widely adopted strategy for assimilation of AIRS data, which is clear-sky radiance assimilation, with the impact arising from a methodology that assimilates geophysical products derived from AIRS radiance information in the presence of clouds. It is important to emphasize that no data set is being produced as part of this work. On the contrary, we want to compare two already existing data sets which are available in real time, are freely distributed, and can be used by any operational center.
In particular, the experiments described in this article adopt essentially the same AIRS radiance data set that is being used by the National Centers for Environmental Prediction (NCEP) and by other centers in the world. In fact, from the comprehensive radiance data product documented in the Goddard Earth Sciences Data and Information Services Center (GES DISC) Data Release User Guide (available online at: http://disc.sci.gsfc.nasa.gov/AIRS/documentation/v5_docs/AIRS_V5_Release_User_Docs/V5_Data_Release_UG.pdf), a subset of 148 channels is selected and assimilated. With minimal differences, this is the same selection of channels adopted by NCEP and assimilated in the Grid point Statistical Interpolation (GSI) data assimilation system, to produce the analysis and the Global Forecasting System (GFS) operational prediction.
As for the AIRS retrievals, the basic approach used to retrieve geophysical parameters from AIRS observations was originally described by Susskind et al. [2003, 2006]. The AIRS Version-5 retrieval algorithm contains a number of improvements which are described in detail by Susskind et al. In AIRS Version-5, 9 sets of AIRS radiance observations within a 3 × 3 array of adjacent AIRS Fields of View (FOVs) contained within an AMSU-A Field of Regard (FOR) are used to generate coefficients which provide channel j clear column radiances for all AIRS channels. These clear column radiances are derived products which represent what AIRS channel j would have observed if the entire FOR were cloud free. One AIRS sounding is derived for each AIRS FOR, using the clear column radiances for selected sets of channels j, which are used to determine different geophysical parameters. With regard to the determination of temperature profiles, the Version-5 retrieval algorithm primarily uses AIRS channels found in the 4.3 μm CO2 band, and also uses channels found in the 15 μm CO2 band which are sensitive only to stratospheric temperatures and whose radiances are never affected by the presence of clouds in the AIRS FOR. Those 15 μm channels which are sensitive to tropospheric temperatures are not used in the retrieval of atmospheric temperature profiles because clear column radiances for these channels are more sensitive to errors in cloud clearing parameters than are those in the 4.3 μm band.
All retrieved temperature profiles have case-by-case, level-by-level error estimates associated with them. The AIRS temperature profile data assimilation experiments described in this paper assimilate the retrieved temperature profiles down to a case-by-case characteristic pressure, which is the highest pressure for which the temperature error estimate does not exceed a threshold value, using the Tight Quality Control thresholds defined by Susskind et al. AIRS temperature profiles are presented to the analysis as if they were radiosonde reports, with radiosonde errors set to the error estimate assigned to a given temperature profile report. No other AIRS data, aside from atmospheric temperature profiles, are assimilated in the experiments shown.
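The quality control step described above can be sketched as follows. This is only an illustration under stated assumptions: the profile is scanned from the top of the atmosphere downward and cut at the first level whose error estimate exceeds the threshold, a single threshold is used for all levels, and the numbers are invented for the example. The actual Tight Quality Control thresholds are those of Susskind et al. and are not reproduced here.

```python
def characteristic_pressure(pressures_hpa, error_estimates_k, threshold_k):
    """Return the highest pressure (hPa) reached, scanning from the top of
    the atmosphere downward, before the error estimate first exceeds the
    threshold. Returns None if even the topmost level fails.
    """
    # Sort by ascending pressure so the top of the atmosphere comes first.
    levels = sorted(zip(pressures_hpa, error_estimates_k))
    p_char = None
    for pressure, error in levels:
        if error > threshold_k:
            break
        p_char = pressure
    return p_char


def levels_to_assimilate(pressures_hpa, error_estimates_k, threshold_k):
    """Keep (pressure, error) pairs down to the characteristic pressure.

    The retained error estimates play the role of the observation errors
    attached to the pseudo-radiosonde report.
    """
    p_char = characteristic_pressure(pressures_hpa, error_estimates_k,
                                     threshold_k)
    if p_char is None:
        return []
    return [(p, e) for p, e in sorted(zip(pressures_hpa, error_estimates_k))
            if p <= p_char]


# Hypothetical profile: error estimates grow toward the surface under cloud.
pressures = [100.0, 300.0, 500.0, 700.0, 850.0, 1000.0]
errors = [0.6, 0.7, 0.8, 0.9, 1.4, 1.8]
accepted = levels_to_assimilate(pressures, errors, threshold_k=1.0)
```

In this invented case the profile is accepted down to 700 hPa and rejected below, which mirrors how a partly cloudy scene can still contribute temperature information above the cloud-affected layers.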
 Aside from the obvious difference between the procedure described above and the operational approach of assimilating AIRS observed radiances, there are two other significant differences related to the use of AIRS channel radiance observations between the two approaches. The operational approach assimilates, in a given scene, observed radiances selected from a subset of AIRS channels whose observations are thought to be unaffected by cloud cover in that scene. As a result of this, the potential spatial coverage of Quality Controlled temperatures at a given pressure, retrieved under partial cloud covered conditions, can be substantially greater than that of channel radiances sensitive to temperatures at that, and lower levels of the atmosphere, which are interpreted as cloudy. Another potentially very important difference between the use of channels in the two different approaches is that operational data assimilation procedures do not assimilate channel radiances in any of the AIRS 4.3 μm channels, the main set of AIRS channels used in the retrieval of atmospheric temperatures, because the physics used to compute expected radiances in the operational scheme does not have the capability of accounting for the effects of either non-local-thermodynamic equilibrium or solar radiation reflected by the surface. Both of these effects are substantial with regard to radiance observations in the 4.3 μm region but are not so in the 15 μm region.
 The period is chosen to cover the set of catastrophic rain episodes that occurred in late July 2010 along the Indus Valley, in central and northern Pakistan. An assimilation longer than the event itself is produced, in order to gather a significant number of forecasts that can also satisfy standard global validation metrics.
 Two 48-day assimilation experiments, starting at 00Z 15 July 2010, and ending on 31 August, 2010, are performed with the GEOS-5 DAS, in addition to an assimilation named CNTRL. In all three assimilations conventional and satellite observations used operationally at NCEP at that time are assimilated, with the exception that AIRS data are excluded in CNTRL, AIRS cloudy retrievals are assimilated in the experiment called RET, and AIRS clear-sky radiances are assimilated in the experiment called RAD. In the RET assimilation, AIRS version 5 temperature profiles are treated as conventional radiosonde reports, with the case-by-case, level-by-level error estimates being used as the uncertainty in each ‘radiosonde’ report. The AIRS temperature at one level is assimilated only if it passes its quality control threshold for that case [Susskind et al., 2011].
 From the three sets of analyses, the first five days are discarded for spin-up purposes; then, three corresponding sets of 43 seven-day forecasts (CNTRL, RET and RAD), all initialized each day at 00 UTC, are produced and verified against operational NCEP analyses. The RAD experiments are meant to reproduce the AIRS assimilation strategy that is adopted by most operational centers in the world. The impact produced by assimilation of clear sky radiances is compared with the impact of version 5 retrievals. After the global forecasting skill is verified, the focus will be on the precipitation forecasting skill for the catastrophic Pakistan extreme events that occurred in July 2010.
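The forecast count follows from the assimilation dates. As a small check, assuming one 00 UTC initialization per day and that the five discarded days are 15 to 19 July (variable names are illustrative only):

```python
from datetime import datetime, timedelta

assimilation_start = datetime(2010, 7, 15, 0)  # 00Z 15 July 2010
n_assimilation_days = 48                       # through 31 August 2010
n_spinup_days = 5                              # discarded for spin-up

analysis_days = [assimilation_start + timedelta(days=d)
                 for d in range(n_assimilation_days)]
# One seven-day forecast is initialized at 00 UTC on each retained day.
init_times = analysis_days[n_spinup_days:]
```

Under these assumptions the first forecast is initialized at 00Z 20 July and the last at 00Z 31 August, giving the 43 forecasts per experiment quoted in the text.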
Figure 1 shows the 500 hPa height anomaly correlation plot for the two sets of forecasts RET and RAD, compared to the CNTRL and computed globally up to day five, from 90°S to 90°N, and for the two hemispheres in the latitude ranges of 20°N to 90°N and 20°S to 90°S respectively. The improvement obtained by assimilating cloudy retrievals with respect to clear-sky radiances is consistent in both hemispheres. The calculation shows a larger difference in skill between RET and RAD in the tropics, in the latitude band 20°S to 20°N (not shown), although with overall much lower skill, spread between 0.67 and 0.68 at day 5, with RET being the highest and CNTRL the lowest. The significance of the difference in skill of RET with respect to RAD on the global 500 hPa anomaly correlation can be computed in a variety of ways. If the two sets of individual global anomaly correlation scores are treated as members of two populations comprising 43 forecasts for each case, the difference is significant only at the 70% level at day 5. However, the significance increases drastically when computed over regions, and varies according to the mean cloud cover during the period (not shown). In fact, there are regions where the difference in skill between RET and RAD is not significant (e.g., along the subtropical high pressure belt, where the cloudy retrieval and clear-sky radiance data distributions are comparable) and regions where the difference between RET and RAD is 99% significant (predominantly along the storm tracks and over the Asian monsoon region, where RET and RAD coverage is substantially different, as will be discussed later). Between the two hemispheres, the significance of the difference in skill between RET and RAD forecasts is higher in the northern hemisphere.
The impact arising from the use of AIRS cloudy retrievals is not confined to global or hemispheric skill only but is particularly significant on specific high-impact weather systems, as shown in previous work. In this work, we focus on the late July peak of a series of extreme precipitation events that occurred in 2010 along the Indus River Valley in Pakistan. The entire sequence of anomalous precipitation events spans a longer period, as discussed in detail by Lau and Kim. Figure 2 shows the observed precipitation from the data set 3B42 V6, known as ‘Tropical Rainfall Measuring Mission (TRMM) and Other Rainfall Estimates’ and obtained through NASA's Goddard Earth Sciences Data and Information Services Center (DISC). Most of the Indus Valley receives about 200–400 mm of rain annually [e.g., Asnani, 2005; Pant and Rupa Kumar, 1997], but the same amount fell in five days between 00Z 25 July 2010 and 00Z 30 July 2010, with the maximum precipitation occurring on the 28th. The time series of the observed precipitation, area-averaged across a large box of more than 370,000 square km, is also provided in Figure 2, to be compared with the precipitation ‘analyses’ provided by the RET and RAD assimilation experiments, and with the RET minus RAD departure. As mentioned previously in this article, and discussed by Zhou et al., the version of the GEOS-5 DAS used here produces a precipitation ‘analysis’, resulting not from direct assimilation of precipitation, but from the previously mentioned corrector sequence. Therefore, the fields named precipitation analysis in this article are a series of short-term (3-hour) precipitation forecasts strongly constrained by the observation flow and characterized by a remarkable smoothness which is a consequence of the IAU procedure. Figure 2 shows that both RAD and RET analyses provide a reasonable timing of the event, while under-estimating the three-hourly peaks on the 28th and 29th by about 10%–20%.
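For readers unfamiliar with the metric, a minimal sketch of an anomaly correlation calculation is given here. It assumes the uncentered form of the score, cosine-of-latitude area weighting, and a climatology supplied on the same grid points; the text does not state which variant or climatology is used, so these are illustrative choices.

```python
import math


def anomaly_correlation(forecast, analysis, climatology, lats_deg):
    """Area-weighted, uncentered anomaly correlation.

    All inputs are flat sequences over the same grid points; lats_deg holds
    the latitude of each point and is used for cos(latitude) weights.
    """
    num = var_f = var_a = 0.0
    for f, a, c, lat in zip(forecast, analysis, climatology, lats_deg):
        weight = math.cos(math.radians(lat))
        f_anom = f - c  # forecast anomaly with respect to climatology
        a_anom = a - c  # verifying analysis anomaly
        num += weight * f_anom * a_anom
        var_f += weight * f_anom * f_anom
        var_a += weight * a_anom * a_anom
    return num / math.sqrt(var_f * var_a)


# Invented 500 hPa heights (m) at four grid points.
clim = [5500.0, 5600.0, 5700.0, 5800.0]
lats = [60.0, 30.0, -30.0, -60.0]
verif = [5480.0, 5630.0, 5690.0, 5830.0]
perfect = anomaly_correlation(verif, verif, clim, lats)
mirrored = anomaly_correlation([2 * c - a for c, a in zip(clim, verif)],
                               verif, clim, lats)
```

A forecast identical to the verifying analysis scores 1, and a forecast whose anomalies have the opposite sign everywhere scores -1; restricting the grid points to a latitude band gives the hemispheric and tropical scores discussed in the text.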
However, the RET assimilation produces a precipitation analysis closer to the observations, by increasing the maxima on both days.
From all the forecasts performed, we select the RET and RAD seven-day forecasts initialized at 00Z 22 July, and compare them against the observations (Figure 3). This particular forecast is chosen because its initialization time is the most distant from the daily accumulated precipitation maximum, which occurs on 28 July (recall the time series in Figure 2). The quantity shown is 7-day accumulated precipitation, from 00Z 22 July 2010 to 00Z 29 July 2010. The observed precipitation peak of more than 200 mm, concentrated at about 34°N–35°N and 71°E–72°E, is the main focus of this study. It is important to reiterate that it occurs over a region where the annual precipitation is modest. Another observed peak exists to the southeast, in India, at about 30°N–32°N and 75°E–76°E; however, this is not as significant for this work because the climatological rainfall over that area is much higher. Between the two observed precipitation peaks, a relatively drier area can be noted. Figure 3 shows that the forecast initialized from the analysis in which clear-sky radiances are assimilated produces rainfall amounts that, while comparable to the observed maxima, are strongly misplaced: in fact, the predicted precipitation maximum is at about 33°N and 76°E, somewhat between the two observed precipitation peaks. In contrast, the forecast issued from the analysis in which AIRS cloudy retrievals are ingested produces the precipitation peak over Pakistan at 35°N and 72°E, in an almost perfect position. The precipitation difference RET minus RAD confirms that the net effect of the cloudy retrievals assimilation is to move the forecast in the direction of the observations: in fact, the maximum RET minus RAD forecast difference (35°N and 72°E) corresponds remarkably well to the location over Pakistan where the precipitation maxima are observed.
In addition, the RET minus RAD forecast difference shows one maximum over India, which corresponds well to the other observed precipitation peak, and a minimum in correspondence to the observed dry area between the peaks.
It is also worth noting that the overall skill of the nine 7-day accumulated precipitation forecasts (area-averaged on the same domain previously discussed) initialized between 00Z 21 and 00Z 29 July is superior in the RET case. In particular, considering as ‘threshold’ the occurrence of at least 30 mm (which is a large amount considering the size of the box, and is always verified in the observations for that period), 6 out of 9 RET forecasts pass the threshold, against 5 out of 9 in the RAD case, thus producing a threat score of 0.67 against 0.55. Moreover, the mean forecast to observation 7-day accumulated area-averaged precipitation ratio is about 5% higher in the RET case for the nine forecasts initialized between 00Z 21 and 00Z 29 July.
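The threat scores quoted above can be reproduced with a short sketch, assuming the usual definition hits / (hits + misses + false alarms). Because the observations exceed the threshold for every one of the nine forecasts, no false alarms are possible and the score reduces to the fraction of forecasts that pass the threshold.

```python
def threat_score(hits, misses, false_alarms):
    """Threat score (critical success index) from contingency counts."""
    return hits / (hits + misses + false_alarms)


# Nine forecasts per experiment; the event is always observed.
ret_score = threat_score(hits=6, misses=3, false_alarms=0)
rad_score = threat_score(hits=5, misses=4, false_alarms=0)
```

Here `ret_score` is 6/9, about 0.67, and `rad_score` is 5/9, about 0.556, which the text quotes as 0.55.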
6. Mechanism: Moisture Transport
In general, in order to produce a major precipitation event, a number of concurring processes must act synergistically on different spatial and timescales: (1) large-scale transport of moisture in the lower half of the troposphere, possibly in a non-precipitating environment, so that moisture is retained and not lost on the way, (2) a concentration of moisture over a certain area, smaller than the area from which the moisture originates, and (3) an upper-tropospheric forcing mechanism to induce divergence aloft, so as to trigger vertical motion and release the precipitation over a relatively small area and a relatively short timescale with respect to the scales involved with moisture transport [e.g., Reale et al., 2001; Turato et al., 2004].
In general, for monsoon-related floods to occur over the northern part of the Indian subcontinent, the low-level moisture flow originates predominantly from the Bay of Bengal, with smaller contributions from the Arabian Sea. The trigger mechanisms are quite often cyclonic disturbances, generally referred to as Low Pressure Systems (LPS) [e.g., Krishnamurthy and Ajayamohan, 2010] or monsoon depressions. Occasionally these disturbances do not appear as fully developed lows with an evident signature at the surface, but may manifest very clearly in the midtroposphere, as noted by Singh et al., who analyzed a record-producing rainfall event over Mumbai, India.
For the specific case of the 2010 Pakistan floods, it has been argued that, concurrent with the formation of mesoscale vortices, one major large-scale forcing was a blocking situation over western central Asia [e.g., Lau and Kim, 2011] and the associated anomalous structure of the subtropical jet stream. Figure 4 shows the sea level pressure and 1000 minus 500 hPa thickness at 00Z 29 July 2010: a strong anticyclone over western Siberia advects a cold northeasterly low-level flow to the north of the central Asian mountain systems toward Kazakhstan (evident from the isothickness values intersecting the isobars at about 70°E–90°E and 45°N). At the same time, one small-scale baroclinic cyclone, associated with the low-level cold advection from northwestern Asia and a deep mid-tropospheric trough (as will be shown later), is present over southern Afghanistan (Figure 4) and contributes to advecting a southerly flow from the Arabian Sea toward Pakistan.
In addition, one particularly strong mesoscale vortex, which formed over the Arabian Sea and had been observed lingering close to the coast of Pakistan by the 27th (not shown), is not evident at the surface but can be detected clearly at 00Z 29 July in the 700 hPa flow (Figure 5), with the center of the 700 hPa circulation at about 66°E and 22°N. Figure 5 confirms the existence of a strong moist southerly flow from the Arabian Sea toward central and northern Pakistan, particularly against the Hindu Kush mountains. From Figure 5, it can also be noted that part of the easterly moist flow from the Bay of Bengal, flowing across the Gangetic Plain between 21°–24°N and 75°–90°E, is deflected northward and becomes entangled in the southerly flow against the northern Pakistan mountains.
Figure 6, in which 500 hPa geopotential and relative vorticity are plotted, shows a mid-level trough approaching Afghanistan and producing an evident cutoff at about 36°N–39°N and 65°E–70°E with high vorticity values, optimally placed to support baroclinic development of the surface low over southern Afghanistan shown in Figure 4.
Finally, evidence of coupling between two upper tropospheric jet streaks, at the time in which the largest precipitation occurs, can be noted in Figure 7. Jet streak coupling is a common midlatitude feature that can strongly enhance upward vertical motion by creating a transverse circulation on the vertical plane intersecting the two jets [e.g., Uccellini and Johnson, 1979]. The stronger of the two jet streaks is optimally placed to enhance upper level divergence on its right entrance region: in fact, a curved strip of 200 hPa divergence at 00Z 29 July is present over eastern and northern Pakistan from approximately 26°N, 60°E to 36°N, 73°E. The precipitation event thus appears to be a case of strong tropical-extratropical interaction. See Lau and Kim for a detailed analysis of the atmosphere, land surface conditions in the extratropics, and evolving states in the south Asian monsoon system, leading to the extreme events.
This study, which is focused on the representation of moisture transport and concentration, will show how the analyzed and predicted precipitation is strongly controlled by these quantities, which, in turn, appear sensitive to the different AIRS assimilation strategy adopted. In fact, the low- and mid-level moist flow from the Indian Ocean, which is a necessary player in the development of extreme rainfall over the region, is associated with cloudiness. Therefore, the rejection of all points which are affected by clouds, as done in the RAD case, removes observations precisely where they matter most, and cannot optimally benefit the analysis and the overall quality of the forecast.
To produce a precipitation event of this magnitude, a synergy between different spatial and timescales is necessary. In particular, as stated before, the timescales of moisture transport, and accumulation over a relatively small area, are larger than that of the precipitation event itself. In order to provide evidence of the impact of the cloudy retrieval methodology on the moisture transport with respect to the clear-sky radiance assimilation, the mean meridional moisture transport is computed over the entire week that culminates with the peak of the event (from 00Z 22 July to 00Z 29 July). Figure 8 shows the difference in the meridional moisture transport (v q, with v the meridional wind and q the specific humidity, vertically integrated from the surface to 400 hPa) between the RET and RAD forecasts. While the moisture source for this event (and other similar events over the region) is predominantly the Bay of Bengal, with smaller contributions from the Arabian Sea, the 7-day forecast initialized from the RET analyses shows a substantial increase in northward meridional moisture flux from the Arabian Sea toward the Indus Valley and northern Pakistan. In addition, the northward transport from the central Gangetic Plain toward the Indus Valley also increases significantly as a consequence of retrievals assimilation. Both contributions suggest a larger transport of moisture toward Pakistan in the RET case, throughout the days preceding the maximum precipitation.
In Figure 9, the two forecasts are compared again, but this time on a shorter scale of only two days, to emphasize the concentration of moisture at the time in which the actual precipitation peak occurs. A most meaningful quantity on the 2-day timescale is the RET minus RAD time-averaged moisture flux divergence, vertically integrated up to 600 hPa. Specifically, the moisture flux convergence averaged between 00Z 27 and 00Z 29 July represents the accumulation of moisture during the two days in which the maximum precipitation occurs. In Figure 9 the mean RAD moisture advection vector Vq is also superimposed, with V the total wind and q the specific humidity, also vertically integrated between the surface and 600 hPa and computed between 00Z 27 July 2010 (the onset of the precipitation peak) and 00Z 29 July 2010, to emphasize total deep low-level moisture transport on a more detailed spatial scale than in the previous Figure 8. The mean moist circulation in the RAD (and RET) forecasts during the two days in which the precipitation peaks shows an evident monsoonal vortex approximately over southern Pakistan (consistent with the instantaneous snapshot seen also in Figure 5), which further increases the moisture transport from the Bay of Bengal across the Gangetic Plain into Pakistan. The vortex is associated with the northward propagation of the monsoonal trough from southern India, as a part of monsoonal intraseasonal oscillations typical of the Indian Monsoon [Wang et al., 2011; Lau and Kim, 2011], and plays a prominent role in controlling moist advection toward the continent. There are some minor differences between RET and RAD in this field (not shown), but the most prominent quantity, on the two-day timescale, is the differential impact in the vertically integrated moist divergence that is obtained by virtue of AIRS cloudy retrievals information.
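A minimal sketch of the vertically integrated meridional moisture transport at a single grid column is given here. It assumes pressure-level data, trapezoidal integration in pressure, and a 1/g normalization so that the result is in kg m⁻¹ s⁻¹; the text only specifies the product v q integrated from the surface to 400 hPa, so the normalization and the numerical method are illustrative assumptions.

```python
G = 9.80665  # standard gravity, m s^-2


def integrated_meridional_moisture_flux(p_hpa, v_ms, q_kgkg, p_top_hpa=400.0):
    """Integrate v*q from the lowest level up to p_top_hpa.

    p_hpa: pressure levels (hPa); v_ms: meridional wind (m/s);
    q_kgkg: specific humidity (kg/kg). Returns kg m^-1 s^-1.
    """
    # Order levels from the surface upward and drop those above the top.
    levels = sorted(zip(p_hpa, v_ms, q_kgkg), reverse=True)
    levels = [lev for lev in levels if lev[0] >= p_top_hpa]
    total = 0.0
    for (p0, v0, q0), (p1, v1, q1) in zip(levels, levels[1:]):
        dp_pa = (p0 - p1) * 100.0  # layer thickness in Pa
        total += 0.5 * (v0 * q0 + v1 * q1) * dp_pa
    return total / G


# Idealized column: uniform 5 m/s southerly wind and 10 g/kg humidity.
p_levels = [1000.0, 850.0, 700.0, 500.0, 400.0, 300.0]
flux = integrated_meridional_moisture_flux(p_levels,
                                           [5.0] * 6,
                                           [0.01] * 6)
```

Computing this quantity for the RET and RAD forecasts at each grid point, averaging over the seven days, and differencing gives a field of the kind described for Figure 8.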
The assimilation of AIRS retrievals strongly increases the 2-day mean of the moisture flux convergence over most of the Indus River Valley, and particularly over northern Pakistan, at about 32°N–35°N (Figure 9), exactly where the maximum observed precipitation occurs (recall Figure 3).
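For reference, the quantity discussed above can be written compactly. The expression assumes the conventional 1/g normalization and the sign convention in which convergence is the negative of divergence; the symbols MFC, p_s and the overbar are introduced here for illustration and do not appear in the original text.

```latex
\mathrm{MFC} = -\frac{1}{g}\int_{600\,\mathrm{hPa}}^{p_s}
  \nabla\cdot\left(\mathbf{V}\,q\right)\,dp ,
\qquad
\Delta\mathrm{MFC} = \overline{\mathrm{MFC}}_{\mathrm{RET}}
  - \overline{\mathrm{MFC}}_{\mathrm{RAD}}
```

Here V is the total horizontal wind, q the specific humidity, p_s the surface pressure, and the overbar denotes the time mean between 00Z 27 and 00Z 29 July 2010; positive values of the difference indicate stronger moisture accumulation in the RET forecast.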
 The large improvement in the precipitation forecast (seen in Figure 3) that follows from assimilating AIRS cloudy retrievals rather than clear-sky radiances is likely to be caused by an improved moisture transport and distribution throughout the forecast. Figures 8 and 9 demonstrate that the assimilation of AIRS retrievals affects the 7-day average meridional moisture transport and the 2-day average moisture flux convergence over the region, in agreement with the improved precipitation forecast.
7. AIRS Coverage in Different Assimilation Strategies
 The improved moisture transport and distribution throughout the forecast, caused by a better representation of the atmosphere's thermal structure, may be a consequence of the more extensive coverage provided by the cloudy retrievals. To support this point, the observation locations actually used in the clear-sky radiance approach are compared with the observations used in the quality-controlled cloudy retrieval approach at 18Z 21 July 2010 (Figure 10), which is six hours before the particular forecast discussed in Figures 3, 8 and 9 is initialized (00Z 22 July). The Aqua 18Z passes cover the eastern Indian Ocean, specifically the Bay of Bengal, which has already been noted to be a prominent source of moisture for this and many other monsoonal events. It is noteworthy that almost no clear-sky radiance observations are assimilated over the Bay of Bengal, which is affected by dense cloud cover at that time. Additionally, the number of assimilated clear-sky radiance observations over the northern Indian Ocean is relatively small compared to the number of AIRS version 5 retrievals.
 An equally remarkable situation can be noted at 00Z 22 July 2010, which is the initialization time of the forecast discussed in this work (Figure 11). While the 18Z passes cover the eastern Indian Ocean, and the Bay of Bengal specifically, the 00Z passes cover the western Indian Ocean and the Arabian Sea, which has also been shown to be an important moisture source for this event. Because the Arabian Sea is also affected by clouds at that time, only four observations over the Arabian Sea north of 10°N are found to be in completely clear-sky conditions and can therefore be included in the assimilation (Figure 11). Figure 11 also shows the corresponding locations of AIRS retrievals obtained under partial cloud cover: the improvement in coverage with respect to the clear-sky radiance methodology is remarkable.
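The kind of regional coverage comparison described above can be sketched as a simple count of observation locations inside a latitude-longitude box. This is only an illustration: the helper `count_in_box`, the box limits loosely standing in for the Arabian Sea north of 10°N, and the coordinate lists are all hypothetical placeholders, not the actual AIRS observation positions shown in Figure 11.

```python
def count_in_box(obs, lat_min, lat_max, lon_min, lon_max):
    """Count (lat, lon) pairs in degrees that fall inside a closed lat-lon box."""
    return sum(
        1
        for lat, lon in obs
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max
    )


# Hypothetical box loosely covering the Arabian Sea north of 10N (an assumption).
ARABIAN_SEA_BOX = dict(lat_min=10.0, lat_max=25.0, lon_min=50.0, lon_max=75.0)

# Made-up placeholder locations, NOT the real assimilated observation positions.
rad_obs = [(12.0, 60.0), (5.0, 65.0), (30.0, 60.0)]
ret_obs = [(12.0, 60.0), (15.0, 62.5), (18.0, 68.0), (5.0, 65.0)]

n_rad = count_in_box(rad_obs, **ARABIAN_SEA_BOX)  # clear-sky radiance locations
n_ret = count_in_box(ret_obs, **ARABIAN_SEA_BOX)  # cloudy retrieval locations
```

Applied to the real location lists for each six-hourly analysis, such a count would quantify the coverage difference that the figures show qualitatively.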
 A very similar situation also occurs for the corresponding 12Z Aqua passes, which cover the western Indian Ocean, and the 6Z orbits, which cover the eastern Indian Ocean and the Bay of Bengal in particular (not shown). Since data assimilation is a recursive process, the persistent difference in data coverage is also very important in the hours preceding the initialization of a specific forecast; the 6Z, 12Z and 18Z passes therefore influence the subsequent 00Z analysis used for our forecast initializations. As expected during an active monsoon period, a large part of the northern Indian Ocean is covered by clouds. The rejection of all cloud-contaminated channels that is required by the clear-sky radiance approach results in very few data over the most critical region, where the moist flow originates. In contrast, the cloudy-sky strategy of Susskind et al., which is the basis of the AIRS version 5 retrievals, allows the inclusion of information in the presence of cloud cover and can provide several retrieved temperatures, especially in the northern part of the Bay of Bengal and Arabian Sea (Figures 10 and 11).
 It cannot be rigorously proven that coverage alone is responsible for the better skill of the RET forecasts with respect to the RAD forecasts. An additional experiment in which retrievals are assimilated only in clear regions could perhaps shed some light on the relative importance of coverage. However, even such an additional retrieval assimilation experiment would contain differences (other than coverage) with respect to the RAD case, because of the intrinsic methodological differences and the different channel selection between the two. Despite these caveats, the difference between the coverage offered by the retrieval approach and by the clear-sky radiance approach is particularly striking over the regions where the moisture source for the event is located.
 It is important to clarify that, despite moisture not being assimilated in these experiments, the temperature information in partly cloudy regions, which is provided by quality-controlled AIRS version 5 retrievals over the northern Indian Ocean, helps the model's dynamics and physics to better define cloudy areas. As previously shown by Reale et al. in their study on tropical cyclone Nargis, the improved thermal structure provided by AIRS cloudy retrievals creates a slight temperature contrast between the top of the cloudy areas and the surrounding environment, with the cloudy regions being relatively warmer. The slight temperature contrast provided by the AIRS retrieval methodology can constrain the location of the most intense convection more effectively, producing a better placed and confined storm center in the analysis [Reale et al., 2009].
 In the present study, the analysis similarly benefits from the improved temperature information offered by the cloudy retrievals, which yields a substantially improved representation of the cloud distribution and of the moist flow from the ocean toward Pakistan. The additional information, along with the improved global forecast skill shown in Figure 1, affects the precipitation analysis, as seen in Figure 2, and propagates into the precipitation forecast over the region (Figure 3). Conversely, the rejection of all channels affected by clouds, as done in the RAD case, seems to penalize the model's ability to represent the moist oceanic flow which feeds the precipitation event.
 One caveat is necessary: the very high skill in this particular 7-day precipitation forecast, noted to a lesser extent even in the RAD case, probably derives from the fact that the precipitation event cannot be defined as mesoscale or 'purely' tropical. In fact, despite the moisture originating from the Indian Ocean, and the discussed development of a mesoscale monsoonal vortex, one important contribution comes from a large-scale forcing which originates mostly in the extratropics (i.e., the blocking discussed by Lau and Kim), and predictability is known to be higher for extratropical synoptic-scale dynamics, particularly large anticyclones.
 We acknowledge that a rigorous comparison between the two methodologies of radiance and retrieval assimilation, so as to ascertain the objective superiority of one versus the other, is not possible with the experiments described in this article, for a number of reasons. The data sets differ because of (1) intrinsic methodology: they originate from completely different physical procedures; (2) AIRS channels: the procedures involve a different number and selection of channels; and (3) coverage: the corresponding data sets have different coverage, as is evident from Figures 10 and 11. Therefore, the different model performance can be attributed to a combination of these factors, and not necessarily to the intrinsic value of the methodology.
 A more constrained experiment in which only retrievals in clear regions are assimilated could be quite meaningful; however, because of the different procedures and channel selections, even that experiment could not be a final assessment.
 A further step in this direction can be to assimilate cloudy retrievals and compare the corresponding forecasts to the ones obtained from analyses in which cloud-cleared radiances are assimilated. While we plan to make this experiment the subject of a future article, it should be noted that, at this time, the successful assimilation of cloudy radiances has been demonstrated only in a small number of studies. The works by Pangaud et al. and Singh et al. are, to our knowledge, among the very few documented attempts to assimilate cloudy radiances in a data assimilation and forecast framework.
 All cloud-cleared radiances have case-by-case, channel-by-channel error estimates associated with them. As with the assimilation of retrieved temperature profiles, thresholds on these error estimates will be used to choose which channel radiances are assimilated on a case-by-case basis. This approach should provide a spatial coverage of assimilated clear-column channel radiances which is similar to that found for the temperature profiles. On the other hand, we still will not be able to assimilate clear column radiances for the 4.3 μm channels (corresponding to the CO2 absorption band, between 649.612 and 843.912 cm−1), because of inherent limitations found in the operational radiance assimilation procedure itself. At this time, it is impossible to assess how this limitation will impact our experiments. The importance of the near CO2 absorption line, which is very sensitive to clouds, has been emphasized in the study by Gong and Wu, focused on fine scales of tropical convection, and also in the assimilation experiments described by Pangaud et al.
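The threshold-based selection outlined above can be sketched as follows. This is a minimal illustration under stated assumptions: the function `select_channels`, the dictionary layout, and the optional `default_threshold` argument are hypothetical, and the article specifies neither the threshold values nor the units of the error estimates.

```python
def select_channels(error_estimates, thresholds, default_threshold=None):
    """Return sorted channel ids whose error estimate is within its threshold.

    error_estimates: {channel_id: estimated error} for one sounding.
    thresholds: {channel_id: maximum accepted error} (hypothetical values).
    Channels without a threshold are rejected unless default_threshold is given.
    """
    accepted = []
    for channel, error in error_estimates.items():
        limit = thresholds.get(channel, default_threshold)
        if limit is not None and error <= limit:
            accepted.append(channel)
    return sorted(accepted)
```

Because the decision is made per sounding and per channel, a sounding can contribute some channels even when others are rejected, which is why coverage comparable to that of the temperature retrievals is expected.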
9. Concluding Remarks
 Results of a study to assess the impact of assimilating AIRS version 5 cloudy retrievals against the current, operationally used clear-sky radiance data set are presented. The global 500 hPa geopotential height anomaly correlation for 43 forecasts indicates that the assimilation of quality-controlled cloudy AIRS retrievals produces overall better forecasts than the assimilation of clear-sky radiances. However, the improvement displays regional variations: it is not significant over areas which are predominantly clear during the experiments (such as the subtropical high belts, where radiance and retrieval coverage is comparable) and reaches high significance over regions where the retrievals provide denser mean coverage during the period of study.
 Aside from the necessary verification of the global performance, the focus is on one extreme precipitation event, which occurred at the end of July 2010 along Pakistan's Indus Valley. It is shown that the assimilation of AIRS cloudy retrievals produces better precipitation analyses than the assimilation of AIRS radiances, when compared to observed TRMM-derived rainfall products. In addition, a 7-day forecast of the event encompassing the daily maximum, initialized on 22 July from analyses in which AIRS retrievals are ingested, produces a more accurate spatial distribution of the rainfall than the corresponding forecast initialized from analyses in which radiances are assimilated. The data coverage provided by the cloudy retrieval approach is shown to be substantially superior to the coverage provided by the clear-sky radiance approach. Since the moist flow originates in a predominantly cloudy part of the northern Indian Ocean, the rejection of all channels affected by clouds, which is necessary to produce clear-sky radiances, is particularly detrimental to the precipitation analysis and forecast skill. As demonstrated in previous studies, the additional temperature information in cloudy regions, which is provided by cloudy retrievals, helps to better define the broad cloud distribution in the analysis. In turn, this causes a major change in the low-level deep moisture flow that contributes to the extreme precipitation event.
 While a conclusive statement on which methodology is superior cannot be made at this time, this and previous work by the same authors provide consistent evidence that AIRS-derived information in the presence of clouds could be a critical contribution to increasing the predictive skill for extreme events, particularly in the tropics.
 The authors thank Ramesh Kakar from NASA HQ for support through grant on Precipitation Measuring Mission, and Tsengdar Lee for allocations on NASA High-End Computing systems. Thanks are also due to three anonymous reviewers for their valuable suggestions.