2.1. The GPROF rainfall algorithm
GPROF is a Bayesian retrieval scheme that is currently used operationally for radiometers such as TMI, SSM/I and AMSR-E. It aims to retrieve the instantaneous rainfall and its vertical structure from satellite microwave observations. The original algorithm is described in Kummerow et al. (1996) and was later extended to include latent heating estimation (Olson et al., 1999).
Rainfall retrieval from passive microwave radiances is an ill-conditioned inverse problem in the sense that the total information content of the observations is smaller than the number of independent variables within raining clouds that must be retrieved. No unique solution can therefore be obtained without introducing prior knowledge, and even then the derived solution may be non-optimal. Bayes' theorem provides a rigorous mathematical formulation for introducing this a priori knowledge. Following Bayes' formulation, the probability of observing a particular hydrometeor profile R, given the observed brightness temperature vector Tb, can be written as:
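In the notation of this section, Bayes' theorem gives the relation below. It is written out here from the symbol definitions that follow and from the standard GPROF formulation (Kummerow et al., 1996), so its exact typeset form should be read as a reconstruction:

```latex
\Pr(\mathbf{R} \mid \mathbf{T_b}) \propto \Pr(\mathbf{R}) \, \Pr(\mathbf{T_b} \mid \mathbf{R})
```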
where Pr(R) is the probability of observing a certain rain profile R and Pr(Tb|R) is the probability of observing Tb given a particular rain profile R.
Older versions of GPROF used cloud-resolving models (CRMs) to define Pr(R). In those versions, Pr(Tb|R) was calculated from the CRM output using a radiative transfer model (RTM). More details of the CRMs and the RTM applied in GPROF are given in Kummerow et al. (2001). In practice, the available set of CRM simulations constituted the assumed a priori probability of finding a particular profile R in nature. In the retrieval process, given an observed Tb, the profiles in the database whose simulated Tb is consistent with the observation are selected and weighted to give the expected value, which is taken as the ‘best’ estimate. With x representing the vector of all the physical quantities to be retrieved, the expected value of x is given by:
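Using the symbols defined in the text that follows, the expected value takes the form below. Combining the two error covariances as R + Q follows the formulation of Kummerow et al. (1996, 2001) and is an assumption of this reconstruction:

```latex
\hat{E}(\mathbf{x}) = \sum_i \mathbf{x}_i \,
  \frac{\exp\left\{ -\tfrac{1}{2}
    \left[ \mathbf{y}_0 - H(\mathbf{x}_i) \right]^{T}
    (\mathbf{R} + \mathbf{Q})^{-1}
    \left[ \mathbf{y}_0 - H(\mathbf{x}_i) \right] \right\}}{A}
```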
where xi represents all model simulated profiles in the database, y0 represents the observation vector, H(xi) is the simulated observation vector corresponding to profile xi with H representing the observation operator, R and Q are the observation and model error covariance matrices respectively, and A is the normalization factor that is a scalar constant. For further descriptive details relating to the retrieval process see Kummerow et al. (1996, 2001).
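The weighting of database profiles described above can be illustrated with a minimal sketch. It is not the operational GPROF code: the function name and arguments are hypothetical, and the combined error covariance is assumed to be diagonal so that only per-channel variances are needed.

```python
import math

def bayesian_expected_value(profiles, simulated_tbs, observed_tb, error_var):
    """Weighted mean of database profiles given one observed Tb vector.

    profiles      : list of state vectors x_i (lists of floats)
    simulated_tbs : list of simulated Tb vectors H(x_i), one per profile
    observed_tb   : observed Tb vector y0
    error_var     : per-channel variances, i.e. the diagonal of (R + Q);
                    a diagonal covariance is an assumption made for brevity
    """
    weights = []
    for sim in simulated_tbs:
        # Quadratic misfit [y0 - H(x_i)]^T (R + Q)^-1 [y0 - H(x_i)]
        misfit = sum((o - s) ** 2 / v
                     for o, s, v in zip(observed_tb, sim, error_var))
        weights.append(math.exp(-0.5 * misfit))
    norm = sum(weights)  # plays the role of the scalar normalization factor A
    if norm == 0.0:
        raise ValueError("no database profile is consistent with the observation")
    n_state = len(profiles[0])
    return [sum(w * p[k] for w, p in zip(weights, profiles)) / norm
            for k in range(n_state)]
```

Profiles whose simulated Tb lies far from the observation, relative to the assumed error variances, receive exponentially small weights, so the estimate is dominated by the database entries that are radiometrically consistent with the observation.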
This algorithm has undergone many improvements over the years. Examples include an improved freezing level over oceans, which reduces the artificially high rainfall at high latitudes; improved convective–stratiform discrimination, which significantly decreases the precipitation in stratiform areas, especially in areas far from convection; the inclusion of melting layers in the RTM (Bauer, 2001); and the use of an improved rainfall relationship over land (Kummerow et al., 2001). More recently, the original CRM-based database was replaced with an observationally generated database (Kummerow et al., 2011). The choice of database is very important because the retrieval assumes that the database accurately represents the true probability of observed situations.
2.2. Observationally generated GPROF a priori database
The traditional databases generated by CRM simulations suffered from several problems, including the correctness and completeness issues described in Kummerow et al. (2006). To avoid these shortcomings, an observationally generated database of precipitation profiles has been constructed from a combination of active and passive microwave sensors, namely the precipitation radar (PR) and the TMI on board the TRMM satellite (Kummerow et al., 1998).
One year of TRMM TMI and PR observations, from 1 June 1999 to 31 May 2000, was used to build the database and generated approximately 62 million data entries. The TRMM operational PR algorithm (TRMM 2A25, V6) was used as the starting point. Where the PR indicated no rain, an optimal estimation procedure was used to retrieve non-raining geophysical parameters, including surface wind, total precipitable water (TPW) and cloud liquid water path (LWP), from the TMI observations (Elsaesser and Kummerow, 2008). The sea-surface temperature (SST) was specified from the Reynolds weekly climatology (Reynolds et al., 2002). Where the PR indicated rain, the TRMM 2A25 rainfall profiles were used as the first guess, and the SST and wind speed were interpolated from the neighbouring non-raining fields. Cloud water, water vapour and profiles of rain and ice hydrometeors were obtained by matching the radar profiles to CRM profiles; once a match was found, the CRM hydrometeor profiles were used. This step is important in that the CRM provides a first guess for the cloud liquid and cloud ice water contents, which are not sensed directly by the PR.
The RTMs (Kummerow, 1993) were then used to compute simulated Tbs from these hydrometeors, and the resulting Tbs were compared with the coincident TMI observations. The comparisons were accumulated as a function of SST and TPW at 1 K and 1 mm intervals. Where the vertically polarized Tbs at 19 and 85 GHz disagreed, an adjustment procedure was applied, first by adding rainwater below the detection threshold of the PR. If the addition of light rain did not correct the mean biases, the adjustment then turned to the raindrop size distributions and ice density to match the modelled and observed Tbs. The adjusted profiles were then adopted for the database construction. Complete details of the procedure, which is only summarized here, can be found in Kummerow et al. (2011). The 1 yr pseudo-observed microphysical database will be used to evaluate the modelled microphysics.
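The accumulation of Tb comparisons in SST and TPW bins can be sketched as follows. The function name, the input layout, the use of the floor of each value as the bin index and the simulated-minus-observed sign convention are all assumptions made for illustration; they are not taken from Kummerow et al. (2011).

```python
import math
from collections import defaultdict

def accumulate_tb_bias(samples):
    """Mean simulated-minus-observed Tb per (SST, TPW) bin.

    samples: iterable of (sst_kelvin, tpw_mm, tb_simulated, tb_observed).
    Bins are 1 K wide in SST and 1 mm wide in TPW, indexed by the floor
    of each value (an assumed convention).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for sst, tpw, tb_sim, tb_obs in samples:
        key = (math.floor(sst), math.floor(tpw))
        sums[key] += tb_sim - tb_obs
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

A mean bias that departs from zero in a given bin would then flag the profiles in that SST and TPW regime as candidates for the adjustment procedure.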
It should be noted that, because the PR is sensitive primarily to precipitation whereas the TMI is sensitive primarily to TWP, there is good reason to assume that the rain and cloud water amounts may, to first order, be representative of observed clouds.
2.3. The ECMWF 1D+4D-Var algorithm
The ECMWF 1D+4D-Var algorithm has been operational since June 2005 (Bauer et al., 2006b, 2006c; Geer et al., 2008) for cloudy and rainy SSM/I observations and may be considered an intermediate step towards the direct 4D-Var assimilation of all-sky microwave radiances, which was made operational in March 2009 (Bauer et al., 2010; Geer et al., 2010). The algorithm has two parts: a 1D-Var, which uses an optimal estimation procedure to retrieve the microphysical properties and TPW from SSM/I radiance observations, and the 4D-Var analysis (Rabier et al., 2000), which assimilates the TPW as a pseudo-observation. The observation operator has three components: a convection scheme (Lopez and Moreau, 2005) that represents subgrid-scale processes and treats shallow, mid-level and deep convection in a unified way; a large-scale condensation scheme (Tompkins and Janisková, 2004) that uses the convective detrainment prescribed by the convection model, with a similar precipitation generation formulation; and a multiple-scattering radiative transfer model, RTTOV-SCATT (Bauer et al., 2006a), in which scattering is calculated using the delta-Eddington approach. The advantage of this 1D-Var over ordinary variational retrievals is that it uses the same background state, background errors and moist physics package as the 4D-Var (Bauer et al., 2010). Its a priori information (a short-range forecast) is therefore more accurate than a statistical climatology, as it contains information about physically important features such as fronts, inversions and the tropopause height. Using 1D-Var also allows an extra step of quality control before assimilating radiances into 4D-Var (Bauer et al., 2010). An important aspect of the 1D-Var retrieval is that the control vector consists of temperature and humidity profiles as well as surface wind speed.
Cloud and precipitation are calculated from the moist physics parametrizations before the radiative transfer scheme is run. The optimization is thus constrained by the models, the observations and the background fields for temperature, moisture and wind speed with their associated errors, and not by the model background cloud/precipitation fields and their errors.
The processing of rain-affected SSM/I Tbs for the 1D-Var retrieval involves several steps: removal of the scan-position-dependent biases known to affect SSM/I; a pre-screening that includes a land-surface and sea-ice check; a check for valid Tb observations; and the screening out of clear-sky observations, which are not treated in the retrieval. The presence of cloud liquid water and precipitation is checked using a cloud identification algorithm (Karstens et al., 1994) and the polarization signal at 37 GHz. A check for excessive falling snow in the 1D-Var first guess (FG) profile is also performed to avoid unreliable radiative transfer simulations in such conditions (Geer et al., 2008). Finally, a bias correction is applied to remove systematic differences between observed and simulated Tbs (Bauer et al., 2006b).
In general, it is not uncommon for simulations to have large biases compared with the observations, and correcting these biases is crucial to achieving good assimilation results. In the 1D+4D-Var system, the biases are predicted by a multiple linear regression of the FG departures (observation minus FG) on the FG TWP, surface wind speed and column rain amount. The bias correction is then applied to the observed Tbs to make them less biased with respect to the 1D-Var FG prior to the assimilation (Geer et al., 2008). It is applied to the 19 GHz vertical polarization channel (hereafter 19V), the 19 GHz horizontal polarization channel (hereafter 19H) and 22V.
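The regression-based bias correction can be sketched for a single channel as below. This is an illustration under stated assumptions rather than the ECMWF implementation: the function names are hypothetical, an intercept term is assumed, and ordinary least squares (via `numpy.linalg.lstsq`) stands in for whatever fitting procedure is used operationally.

```python
import numpy as np

def fit_bias_model(predictors, fg_departures):
    """Least-squares fit of FG departures (observation minus FG) on predictors.

    predictors    : array (n_obs, n_pred), e.g. columns for FG TWP,
                    surface wind speed and column rain amount
    fg_departures : array (n_obs,) for one channel
    Returns the coefficients with the (assumed) intercept first.
    """
    X = np.column_stack([np.ones(len(predictors)), predictors])
    coeffs, *_ = np.linalg.lstsq(X, fg_departures, rcond=None)
    return coeffs

def correct_observations(tb_obs, predictors, coeffs):
    """Subtract the predicted bias from the observed Tbs."""
    X = np.column_stack([np.ones(len(predictors)), predictors])
    return tb_obs - X @ coeffs
```

Because the predicted bias is subtracted from the observations, the corrected departures have zero mean over the training sample and no residual linear dependence on the chosen predictors.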
The bias correction scheme may not be appropriate for cloudy observations because it uses an asymmetric predictor (Geer and Bauer, 2011), namely the FG rain amount. Some biases are very large; they may be due to errors in the structure and intensity of the forecast cloud and rain, but may also be due to displacement errors. The largest error may come from an inadequate cloud overlap scheme (Geer et al., 2009), in which assumptions are made regarding the subgrid-scale cloud variability. Such errors are known as beam-filling biases in the satellite community.
The model forecast provides the FG fields including temperature profiles, water vapour profiles, surface fields, which include latent heat and sensible heat fluxes, and wind stress. These FG fields all serve as inputs to the convection scheme, which in turn produces detrained convective cloud water and rain and snow fluxes. Using the FG fields together with the detrained cloud water, the large-scale condensation scheme produces the cloud-cover fraction and models the clouds and precipitation formed by model-resolved processes. From the resulting thermodynamic and hydrometeor information, the multiple-scattering microwave RTM calculates the simulated radiances.
In a variational retrieval, the optimal estimation of a state vector x is acquired by minimization of a cost function using the a priori information from the FG. The cost function J is defined as:
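With the symbols defined in the text that follows, the standard variational cost function reads as below. It is reconstructed from those definitions and from the description of its two terms, so the exact typeset form (for example, the placement of the factors of one half) is an assumption:

```latex
J(\mathbf{x}) = \tfrac{1}{2} (\mathbf{x} - \mathbf{x}_b)^{T} \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_b)
  + \tfrac{1}{2} \left[ \mathbf{y}_0 - H(\mathbf{x}) \right]^{T} \mathbf{R}^{-1} \left[ \mathbf{y}_0 - H(\mathbf{x}) \right]
```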
where J is the cost function; x is the state vector, which in this case contains vertical profiles of temperature and specific humidity on 91 model levels; xb is the a priori state vector obtained from the model simulation; y0 is the observation vector; H is the observation operator that maps geophysical space to observation space; B is the background error covariance matrix; and R is the observation error covariance matrix, which includes both the observation error and the errors originating from the observation operator.
The first term is the fit of the solution to the background estimate of the atmospheric state weighted inversely by the background error covariance B. The second term is the fit of the solution to the measured radiances y0 weighted inversely by the measurement error covariance R. The solution obtained is optimal in that it fits the a priori (or background) information and measured radiances respecting the uncertainty in both.
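The balance between the background and observation terms can be made concrete with a small sketch. It assumes a linear observation operator represented by a matrix H, for which the minimizer of J has a closed form. The operational operator (moist physics plus radiative transfer) is nonlinear and requires iterative minimization, so the function names and the linearity are illustrative assumptions only.

```python
import numpy as np

def cost(x, xb, y0, H, B, R):
    """J(x) = 0.5 (x-xb)^T B^-1 (x-xb) + 0.5 (y0-Hx)^T R^-1 (y0-Hx)."""
    dx = x - xb
    dy = y0 - H @ x
    background_term = 0.5 * dx @ np.linalg.solve(B, dx)
    observation_term = 0.5 * dy @ np.linalg.solve(R, dy)
    return background_term + observation_term

def analysis_linear(xb, y0, H, B, R):
    """Minimizer of J when the observation operator is the matrix H (assumed linear)."""
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + gain @ (y0 - H @ xb)
```

In this linear case, state elements with larger background error variance are drawn more strongly towards the observations, which is the sense in which the solution respects the uncertainty in both sources of information.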
The 1D-Var produces outputs including vertical profiles of humidity, temperature, cloud and precipitation. The TPW derived from the retrieved humidity profile is assimilated in the main 4D-Var analysis (Rabier et al., 2000). It should be noted that the 1D+4D-Var algorithm is affected by a sampling bias, which comes from applying 1D+4D-Var when the observations are cloudy or rainy, but not when the FG is rainy or cloudy and the observations are clear (Geer et al., 2008).
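The derivation of TPW from a retrieved humidity profile can be illustrated with the standard column integral, TPW = (1/g) ∫ q dp. The sketch below assumes trapezoidal integration over level values and a hypothetical function name; the vertical discretization actually used in the ECMWF system is not described here and may differ.

```python
G = 9.80665  # standard gravity, m s^-2

def total_precipitable_water(pressure_pa, specific_humidity):
    """Column water vapour in kg m^-2 (numerically equal to mm of liquid water).

    pressure_pa       : level pressures in Pa, ordered from model top to surface
    specific_humidity : specific humidity (kg/kg) on the same levels
    """
    tpw = 0.0
    for k in range(len(pressure_pa) - 1):
        dp = pressure_pa[k + 1] - pressure_pa[k]
        # Trapezoidal rule: mean humidity of the layer times its pressure thickness
        q_mean = 0.5 * (specific_humidity[k] + specific_humidity[k + 1])
        tpw += q_mean * dp / G
    return tpw
```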