Tracing hydrologic model simulation error as a function of satellite rainfall estimation bias components and land use and land cover conditions


Corresponding author: F. Hossain, Department of Civil and Environmental Engineering, Tennessee Technological University, 1020 Stadium Dr., Box 5015, Cookeville, TN 38505, USA.


[1] The key question asked in this study is “how are the three independent bias components of satellite rainfall estimation, comprising hit bias, missed, and false precipitation, physically related to the estimation uncertainty of soil moisture and runoff for a physically based hydrologic model?” The study also investigated the performance of different satellite rainfall products as a function of land use and land cover (LULC) type. Using the entire Mississippi river basin as the study region and the variable infiltration capacity (VIC)-3L as the distributed hydrologic model, the study of the satellite products (CMORPH, 3B42RT, and PERSIANN-CCS) yielded two key findings. First, during the winter season, more than 40% of the rainfall total bias is dominated by missed precipitation in forest and woodland regions (southeast of Mississippi). During the summer season, 51% of the total bias is governed by the hit bias, and about 42% by the false precipitation in the grassland-savanna region (western part of Mississippi basin). Second, a strong dependence is observed between hit bias and runoff error, and between missed precipitation and soil moisture error. Hit bias shows a high correlation with runoff error (∼0.85), indicating the need for improving the satellite rainfall product's ability to detect rainfall more consistently for flood prediction. For soil moisture error, it is the total bias that correlates significantly (∼0.78), indicating that the total bias of a satellite product needs to be minimized for long-term monitoring of watershed conditions for drought through continuous simulation.

1. Introduction

[2] Precipitation (hereafter used synonymously with “rainfall”) is one of the most important atmospheric inputs for hydrologic model simulation. Precipitation dominates the spatial and temporal variability of other hydrological variables (such as soil moisture, runoff, and evapotranspiration) [Syed et al., 2004; Famiglietti et al., 1995]. About 70%–80% of space-time variability in the hydrologic cycle is reportedly dictated by precipitation variability. Because precipitation is the key element of the hydrologic cycle, its quantitative estimation is essential for hydrologic modeling in both scientific and applied research. The accuracy of hydrologic prediction depends, among many factors, on the accuracy of the model input, the primary one being rainfall.

[3] Rainfall measurement from the ground using conventional methods is more direct and reliable than satellite-based rainfall [Villarini et al., 2008], but it lacks the spatial and temporal sampling needed to achieve a high-resolution rendition of the terrestrial hydrologic fluxes in the continuum of space and time. The major concern for the hydrologist is the representativeness of point measurements for areally averaged rainfall, which is the usual input to distributed and physically based hydrologic models [Habib et al., 2004]. This issue becomes more important when we consider that ground observation networks are either sparse, nonexistent, or declining for most parts of the world [Stokstad, 1999; Shiklomanov et al., 2002]. More importantly, precipitation's spatial variability and intermittent nature make it difficult to observe using the conventional ground-based rain gauge method. These practical limitations of ground rain gauge networks have prompted increasingly wider use of spaceborne observation of rainfall as an indispensable bridge to quantifying precipitation fluxes over large and inaccessible areas [Anagnostou et al., 2010; Tian et al., 2009; Hong et al., 2007; Gottschalck et al., 2005].

[4] With a capability to provide rainfall estimates for data sparse regions not well covered by gauges or ground radars (e.g., water bodies, mountainous and remote desert areas), satellite rainfall estimates are a promising additional source of forcing data for large scale hydrologic modeling [Nijssen and Lettenmaier, 2004; Tian and Peters-Lidard, 2010]. Many efforts have been undertaken to fulfill the demand of the scientific community in providing accurate satellite rainfall estimates at hydrologically relevant spatiotemporal scales [Hsu et al., 2010; Huffman et al., 2007; Joyce et al., 2004; Sorooshian et al., 2000]. The studies have collectively contributed to the progress made from 1 deg spatial and monthly time scales [Huffman et al., 1997; Huffman et al., 2001; Adler et al., 2003] to 0.25 deg spatial and hourly temporal scale [Huffman et al., 2007; Joyce et al., 2004; Sorooshian et al., 2000; Joyce and Xie, 2011, Ushio et al., 2009, Behrangi et al., 2010; Hong et al., 2004] to make satellite rainfall data potentially more useful as a forcing for macroscale hydrologic modeling.

[5] In the evolution of space technology, the next promising and future global rainfall data source that is founded on the heritage of Tropical Rainfall Measuring Mission (TRMM) and preceding satellite missions, is the Global Precipitation Measurement (GPM) Mission. The planned GPM mission will provide rainfall estimates at spatial resolutions of 25–100 km2 and temporal scales of 3 to 6 h for about 90% of global coverage [Hou et al., 2008]. Rainfall estimates from GPM hold great promise for river flow modeling, water resource management, flood and drought disaster management, and environmental protection. In particular, GPM and its associated rain products will be the only available rainfall data source for many parts of the world.

[6] Although the overall progress and improvements in satellite rainfall measurement from space have been notable for hydrologic modeling and other applications, the level of uncertainty associated with rainfall estimation and sampling frequency is still significant [Hossain and Huffman, 2008; Nijssen and Lettenmaier, 2004; Chang and Chiu, 1999]. Nijssen and Lettenmaier [2004] evaluated the effect of precipitation sampling errors on simulated moisture fluxes and states by forcing a macroscale hydrologic model with error-corrupted precipitation fields for different temporal sampling and spatial scales. They found that simulated satellite precipitation (with sampling errors similar to that expected from the constellation of passive microwave sensors) exhibited significant errors in moisture fluxes and states. They also showed that the propagated error in simulated fluxes and states was significantly reduced for larger areas and longer sampling intervals. For instance, for 2500 km2 and a 3 h sampling interval, the areally averaged root mean square error (RMSE) was greater than 50%, which reduced to 10% for 500,000 km2. Tian and Peters-Lidard [2010] produced a satellite rainfall uncertainty map at global scale by computing the standard deviation from the ensemble mean of different satellite rainfall products at every grid box and time step without ground validation data. Their study reported less uncertainty over oceans and large uncertainty over surfaces at high elevations, where orographic rainfall processes present significant challenges for satellite-based remote sensing of precipitation.
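The ensemble-spread idea attributed above to Tian and Peters-Lidard [2010] can be illustrated with a minimal sketch. The function name, the toy rainfall values, and the use of the population standard deviation are assumptions made here for illustration; they are not taken from that study.

```python
from statistics import mean, pstdev

def ensemble_spread(products):
    """Per-cell ensemble mean and standard deviation across products.

    products: list of equally sized lists, one per satellite product,
    each holding rainfall (mm/d) for the same grid cells at one time step.
    Assumption: the population standard deviation is used.
    """
    cells = list(zip(*products))          # regroup values cell by cell
    means = [mean(c) for c in cells]
    spreads = [pstdev(c) for c in cells]  # spread about the ensemble mean
    return means, spreads

# Hypothetical values for three products over four grid cells (mm/d).
p1 = [0.0, 2.0, 10.0, 5.0]
p2 = [0.0, 4.0, 4.0, 5.0]
p3 = [0.0, 6.0, 1.0, 5.0]
means, spreads = ensemble_spread([p1, p2, p3])
```

Cells where the products agree (the first and last cells here) have zero spread, while cells where they disagree are flagged as uncertain without any reference to ground data.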

[7] Several other studies have recently emerged on the application of TRMM-based multisatellite rainfall products for hydrologic modeling (Nijssen and Lettenmaier [2004], Su et al. [2008], and Gebregiorgis and Hossain [2011], among many others). It is crucial for hydrologists now to understand how rainfall uncertainties affect hydrologic predictability. Many of the available satellite rainfall products are developed directly or indirectly by merging infrared (IR) and passive microwave (PMW) sensor estimates based on different algorithmic approaches. For instance, the 3B42RT algorithm [Huffman et al., 2010] uses MW data to calibrate IR estimates and obtains a merged product in which the calibrated IR is used when and where PMW estimates are unavailable. The CMORPH algorithm [Joyce et al., 2004] utilizes the IR estimates only to derive the cloud motion field that helps to propagate the rainfall estimates of PMW data. The PERSIANN (precipitation estimation from remotely sensed information using artificial neural networks) algorithm utilizes the relationship between IR and MW estimates as derived from artificial neural network techniques, and the rainfall estimates are then obtained from the MW data downscaled to the IR footprint. There are different versions of PERSIANN products. The first algorithm (PERSIANN) [Sorooshian et al., 2000] uses gridded IR brightness temperature obtained from geostationary satellites to compute the corresponding gridded rainfall rate by adjusting the model parameters routinely to PMW rainfall estimates. This product is available at a spatial resolution of 0.25 deg × 0.25 deg and a temporal scale of 30 min, which is later converted to a 6 h rainfall accumulation. The second PERSIANN version is based on a patch-based cloud classification system (PERSIANN-CCS) [Hong et al., 2004; Hong et al., 2005; Hsu et al., 2010]. The cloud images are classified into cloud patch regions based on cloud height, areal extent, and texture features extracted from satellite imagery. Finally, a relationship between rain rate and brightness temperature is established for pixels within each cloud patch region. GSMaP [Ushio et al., 2009] is another satellite rainfall product which uses a technique similar to CMORPH in propagating the PMW-derived precipitation field using the IR-derived motion vectors, but unlike the CMORPH algorithm, it also uses cloud top brightness temperature to propagate precipitation estimates. Among the discussed rainfall algorithms, CMORPH, GSMaP, and PERSIANN-CCS offer resolutions higher than 3 h and 0.25 deg.

[8] Recognizing the vast complexity and interdependencies of the multiple sensors used in quasi-statistical rainfall algorithms of today, Gebregiorgis and Hossain [2011] demonstrated a multiproduct merging method that leverages the a priori uncertainty of individual products. Therein, they reported that it is indeed feasible to create a superior merged product by making skillful and complementary use of the uncertainty of each individual product in hydrologic model simulation of the fluxes (such as soil moisture and runoff). Runoff and soil moisture based merged products improved the runoff and soil moisture simulation. On average, the RMSE of streamflow decreased by 41%, 82%, and 60% with the runoff based merged product and by 50%, 79%, and 53% with the soil moisture based merged product, relative to the 3B42RT, CMORPH, and PERSIANN-CCS products, respectively.

[9] The natural follow-up question now is, how can we implement such a multiproduct merging approach in regions where there is no ground truth data to derive a priori estimates of uncertainty? A recent study by Tang and Hossain [2011] on the similarity of satellite rainfall error as a function of Koppen climate class reported that certain measures of rainfall uncertainty can be clustered according to climate and terrain type. Their study showed promise in “transferring” error information from a gauged region to an ungauged region with similar climate characteristics. Similarly, other studies report that the performance of rainfall products is a strong function of region and topography. For example, most TRMM-based products that do not comprehensively utilize the precipitation radar (PR) data are known to be generally weak in detecting orographic precipitation [Dinku et al., 2010]. In particular, the poor performance of some of the commonly used multisensor products over the Himalayas, Andes, or the Ethiopian highlands is now well known [Dinku et al., 2007; Hirpa et al., 2010]. Thus, it appears that multiproduct merging can potentially be improved further by investigating the role of climate, land use and land cover (LULC), and terrain features in dictating the rainfall estimation uncertainty.

[10] The present study is driven by the need to raise more awareness and understanding about the complex interrelationship between uncertainty of rainfall and hydrologic simulation (of key fluxes such as soil moisture and runoff errors) as a function of LULC and terrain features. To make the study directly relevant to data product developers engaged in improving their algorithms for GPM, this study traces the source of error observed in hydrologic predictability to the input (rainfall) error predecomposed into easy to understand independent components. Such components, by virtue of the power of their simplicity and physical significance, stand to provide tangible feedback to developers on how exactly algorithms may need to be revised to advance their application for hydrology. The study is conducted on a continental scale (the Mississippi River basin) using multiyear data sets to arrive at statistically robust and comprehensive findings at regions with similar LULC.

[11] The paper is organized as follows. Description of the study area, hydrologic model, and data used are introduced in section 2. The methodology of satellite rainfall error decomposition and the linkage to hydrologic simulation error are elaborated in section 3. Section 4 presents the results of the study, focusing particularly on spatial and temporal characteristics of satellite rainfall uncertainty and the interrelationship with soil moisture, runoff errors, and LULC. Finally, conclusions and recommendations of the study are presented in section 5.

2. Study Area, Model and Data

2.1. Study Area

[12] The Mississippi River Basin (MRB), which is the largest basin in North America (Figure 1), was chosen as the study region. Because its diverse topography, climate, and LULC types, spread over an area of about 3 million km2, are also found in other parts of the world, the MRB was ideal for the study objectives. The topography of the basin varies from low-lying areas of 1 m to high elevation areas 4500 m above sea level (a.s.l.). For this particular study, three LULC types were considered at six different geographical locations. These LULC data were derived from the United States Geological Survey National Land Cover Database [NLCD2001] at a spatial resolution of 0.004 deg. Figure 1 shows the location of the study zones with LULC type in the MRB, which are (1) forest and woodland (zones A1 and B1); (2) cropland system (agriculture and irrigation practice) (zones C2 and D2); and (3) grassland and savanna systems (zones E3 and F3). The size of each LULC zone was determined based on the areal extent of the LULC type that was dominant in the region. Each zone needed to enclose a large number of pixels of the same LULC type to yield statistically significant results. The percentage coverage of the designated LULC type within a given zone varied from 82% for zone A1 to 98% for zone F3. A detailed description of the location, percentage coverage by the dominant LULC type, elevation, and LULC features of each zone is given in Table 1.

Figure 1.

(left) Location of Mississippi basin in United States of America and (right) land use/land cover (LULC) map with the selected study zones. Zone nomenclature: Zone xy where x indicates the location of specific region and y shows the LULC type defined by 1 forest and woodland systems; 2 human land use (cropland) system; and 3 savanna and grassland systems.

Table 1. Detailed Description of Study Zones (a)

Region/Zone | Location | LULC Type | Coverage (%) | Detailed Description
A1 | S Arkansas, N Louisiana, SE Oklahoma | Woodland and forest systems | 82 | Mainly dominated by mixed and deciduous broadleaf forest. Small and scattered woody savanna also exists in the central part of the region. Elevation ranges from 60 to 400 m.
B1 | E Central Tennessee, S Kentucky | Woodland and forest systems | 94 | Characterized by mixed and deciduous broadleaf forest and dispersed cropland. Elevation varies from 250 to 1000 m.
C2 | S Iowa, N Missouri, NE Kansas, E Nebraska | Cropland system | 97 | Cropland is the dominant land use system of this region. A few deciduous broadleaf forests also exist. Elevation is between 200 and 300 m.
D2 | W Mississippi, E Arkansas | Cropland system | 96 | This region extends along either side of the main lower Mississippi river and is dominated by irrigated cropland. Elevation ranges between 30 and 100 m.
E3 | C South Dakota, S North Dakota, NC Nebraska | Grassland and savanna systems | 97 | Dominated by grassland and savanna systems. Elevation extends from 700 to 1300 m.
F3 | E Colorado, NE New Mexico | Grassland and savanna systems | 98 | Grassland, open shrubland, and savanna are the dominant land use systems. Elevation ranges from 1300 to 2000 m.

(a) N is north, S is south, E is east, W is west, SE is southeast, NE is northeast, and NC is north central.

2.2. Model and Data

[13] A variable infiltration capacity (VIC) macroscale hydrologic model [Liang et al., 1994] was implemented to simulate land surface states and fluxes for MRB at the daily time step and a spatial resolution of 0.125 deg. The model setup and calibration were performed based on gridded ground observation data sets obtained from the University of Washington [Maurer et al., 2002]. Using the calibrated model and forcing data sets, land surface fluxes (soil moisture and runoff) were generated. These model-derived surface fluxes, derived from gridded ground observations, were used as “synthetic” truth data to evaluate the performance of satellite rainfall products in simulating soil moisture and runoff as a function of LULC and error type. The study period considered was 8 years (2003–2010). Analysis was broken down seasonally to winter (December, January, and February [DJF]) and the summer (June, July, and August [JJA]) and for some of the cases, the result was presented only for 2006 and 2010 to allow sufficient model spin up and focus on a period with the highest number of microwave sensors for the satellite algorithms.

[14] Generally, the realism of the synthetic data depends highly on the choice and quality of the ground truth data sets injected into the model, which likely affects the finding of this study. Therefore, to minimize such impact and ensure accuracy of simulated runoff and soil moisture, the ground rainfall data was first checked against NEXRAD-IV (next-generation radar of stage IV) data (Figure 2a, left). In addition, the VIC model parameters, such as variable infiltration curve parameter, maximum velocity of base flow, fraction of maximum soil moisture, fraction of velocity of base flow, and depth of soil layers, were calibrated at seven and validated at 12 internal gauging stations of MRB using simulated and observed streamflow (Figure 2b).

Figure 2.

(a) Qualitative comparison of gridded ground with NEXRAD-IV rainfall record for two randomly selected days (left four panels); correlation of gridded and NEXRAD-IV average rainfall over Mississippi basin (bottom left panel); model calibration (2003–2004) and validation (2005) of VIC model using observed streamflow at two gauging stations (right panels). (b) Selected hydrological gauging stations for the purpose of calibration and validation of VIC model over Mississippi River basin.

[15] The selection of gauging stations was driven by the need to minimize the impact of human regulation of flow. The selection of stations (as shown in Figure 2b) was guided by three rules. (1) Less regulated watershed regions were considered for validation and calibration, for example Minnesota River near Jordan. (2) To adequately represent the basin wide response, several small-sized watersheds were selected, for example Kentucky River at Lockport (area 6180 sq. mi), French Broad River near Newport (area 1858 sq. mi), Wabash River at Mt. Carmel (area 28,635 sq. mi), and Ouachita River at Camden (area 5360 sq. mi). (3) On regulated rivers, stations located upstream or very far downstream of the dam were considered, for example Canadian River at Calvin, Ouachita River at Camden, and Missouri River at Hermann. Through these three rules we have avoided gauging stations that are influenced heavily by human regulation of streamflow. As seen in Figure 2a (right), there is strong agreement between the simulated and observed streamflow according to measures of correlation coefficient and efficiency. Both performance measures provided the necessary confidence in hydrologic model simulation.
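The two performance measures named above can be sketched as follows. The text does not state which efficiency measure was used; the Nash-Sutcliffe efficiency is assumed here because it is the common choice for streamflow, and the streamflow values are hypothetical.

```python
from math import sqrt

def pearson_r(sim, obs):
    """Pearson correlation coefficient between simulated and observed flow."""
    n = len(obs)
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / sqrt(var_s * var_o)

def nash_sutcliffe(sim, obs):
    """Nash-Sutcliffe efficiency (assumed here to be the 'efficiency' measure).

    1.0 is a perfect fit; 0.0 means the simulation is no better than
    using the mean of the observations.
    """
    mean_o = sum(obs) / len(obs)
    err = sum((o - s) ** 2 for s, o in zip(sim, obs))
    var = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - err / var

# Hypothetical daily streamflow values (arbitrary units).
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 37.0]
r = pearson_r(sim, obs)
nse = nash_sutcliffe(sim, obs)
```

The correlation coefficient rewards matching the timing and shape of the hydrograph, whereas the efficiency also penalizes systematic over- or underestimation, which is why the two are usually reported together.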

[16] The forcing data set for the VIC model includes the major observed meteorological variables, such as precipitation, minimum and maximum temperature, wind speed, vapor pressure, incoming long-wave and short-wave radiation, and air pressure. For the contiguous United States, the meteorological forcing data sets were processed and made available for users by the University of Washington (see Acknowledgments). To prepare the gridded ground rainfall, the daily ground precipitation data were collected from the National Oceanic and Atmospheric Administration (NOAA). The average density of gauge stations used in the gridding process was 700 km2/station, or equivalently on average 7200 stations in the study region (MRB). Following Maurer et al. [2002], these precipitation data were gridded to a spatial resolution of 0.125 deg using the synergraphic mapping system (SYMAP) algorithm. Finally, the gridded data sets were statistically adjusted using the parameter-elevation regressions on independent slopes model (PRISM) to account for local variations due to terrain complexity. More importantly, before using these data sets for the study objectives, both qualitative and quantitative comparisons were performed with the NEXRAD-IV data set over the MRB for the purpose of validation (Figure 2, left). The mean daily rainfall of the gridded and NEXRAD-IV data sets agreed very well, with a correlation coefficient of 0.98.

[17] The error characteristics of three satellite rainfall products were investigated in runoff and soil moisture simulation. The surface runoff rate generated from each grid cell was considered as runoff. The routable portion of subsurface runoff was not included in the analysis as runoff. Computation related to runoff was generally performed at a spatial resolution of 0.125 deg. The VIC model, on the other hand, simulates the soil moisture in three different soil layers. The upper layer is the top 10 cm of soil, which represents the dynamic behavior of the soil that responds to weather-scale meteorological processes, whereas the lower two layers characterize the seasonal and long-term soil moisture behavior. Even though the upper soil layer is thinner than the lower layers, memory effects could contaminate the transient temporal behavior of the soil moisture error. To minimize such impacts, the soil moisture in the top layer was extracted for each pixel i at the beginning and at the end of each time step t. The difference between the two values (if it exists) is considered as the memory-less (fast) response of the soil moisture column to rainfall at that particular time step. This difference was also considered as the daily soil moisture production and used in the computation of the percentage of runoff and soil moisture production.

[18] The volume of soil moisture production due to the rainfall intensity at daily time step t for pixel i (W1_i[t]) is given by equation (1):

display math

The total spatial sums of runoff and soil moisture production for zone j during the summer season are computed per equations (2) and (3):

display math
display math

where n is the number of days in the summer season and m is the number of pixels in zone j.
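The typeset forms of equations (1) to (3) are not reproduced in this text. One reading that is consistent with the definitions given above is sketched below. All symbols other than i, t, j, n, and m are assumptions introduced here: W^start and W^end denote the top-layer soil moisture at the beginning and end of a time step, R_i[t] the surface runoff of pixel i, and the zone totals carry the subscript j.

```latex
\begin{align}
\Delta W_{1,i}[t] &= W_{1,i}^{\mathrm{end}}[t] - W_{1,i}^{\mathrm{start}}[t] \\
R_{j} &= \sum_{t=1}^{n} \sum_{i=1}^{m} R_{i}[t] \\
\Delta W_{j} &= \sum_{t=1}^{n} \sum_{i=1}^{m} \Delta W_{1,i}[t]
\end{align}
```

Under this reading, the first line is the memory-less soil moisture response of a single pixel, and the second and third lines accumulate runoff and soil moisture production over the n summer days and m pixels of zone j.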

[19] Finally, to compute the daily percentages of runoff and soil moisture production with respect to daily ground rainfall intensity, equations (4) and (5) are used:

display math
display math
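As a rough illustration of the bookkeeping described in paragraphs [17] to [19], the sketch below computes the daily soil moisture production, zone totals, and production percentages. Two choices are assumptions made here and are not confirmed by the text: negative soil moisture changes are counted as zero production (one reading of "if it exists"), and the percentages are taken relative to the daily ground rainfall depth. All function names and values are hypothetical.

```python
def soil_moisture_production(w_start, w_end):
    """Top-layer soil moisture gain over one daily step (mm).

    Assumption: a decrease in soil moisture counts as zero production.
    """
    return max(w_end - w_start, 0.0)

def zone_total(daily_grids):
    """Sum a quantity over all days and all pixels of a zone.

    daily_grids: list of days, each a list of per-pixel values (mm).
    """
    return sum(sum(day) for day in daily_grids)

def production_percentages(rain, runoff, w_start, w_end):
    """Return (runoff %, soil moisture %) relative to daily rainfall.

    Assumption: both percentages are normalized by the ground rainfall
    depth of the same day. Returns None on days without rainfall.
    """
    if rain <= 0.0:
        return None
    dw = soil_moisture_production(w_start, w_end)
    return 100.0 * runoff / rain, 100.0 * dw / rain

# Hypothetical pixel-day: 10 mm of rain, 2.5 mm of surface runoff,
# top-layer soil moisture rising from 20 mm to 24 mm.
pct = production_percentages(10.0, 2.5, 20.0, 24.0)
```

With these toy numbers a quarter of the rainfall becomes runoff and 40% becomes soil moisture production; a different normalization (for example, relative to the sum of runoff and soil moisture production) would change the values but not the structure of the calculation.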

[20] The multisensor satellite rainfall products considered were 3B42RT [Huffman et al., 2010; Huffman et al., 2007], CMORPH [Joyce et al., 2004], and PERSIANN-CCS [Hong et al., 2004]. All three satellite rainfall products are available to end users in near real time, which favors the development of various decision-making tools. 3B42RT is one of the products provided by the TRMM multisatellite precipitation analysis (TMPA) algorithm at a spatial resolution of 0.25 deg × 0.25 deg and a temporal sampling of 3 h [Huffman et al., 2010]. It is a combination of PMW and PMW-calibrated IR data, merged such that the MW precipitation estimate is used where it is available and the IR estimate fills the gaps (in space and time) elsewhere. CMORPH, the climate prediction center (CPC) MORPHing technique, is a high-resolution satellite rainfall product. This product is also available at a spatial resolution of 0.25 deg and a temporal resolution of 3 h. It uses rainfall estimates from MW exclusively, and the rainfall patterns are propagated in space and time via motion vectors obtained from IR data to bridge the MW sampling gaps [Joyce et al., 2004]. PERSIANN-CCS is based on extraction of cloud features from IR imagery of a geostationary satellite to derive rainfall estimates at finer scale (0.04 deg × 0.04 deg) and hourly temporal resolution, using MW data as a guide for the artificial neural network. These key data products essentially use the same suite of PMW and IR sensors, such as the advanced microwave sounding unit (AMSU), TRMM microwave imager (TMI), special sensor microwave/imager (SSM/I), advanced microwave scanning radiometer for Earth observing system (AMSR-E), and the IR sensor aboard the geostationary operational environmental satellite (GOES).
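The gap-filling rule described above for 3B42RT (use the MW estimate where available, otherwise the calibrated IR estimate) can be illustrated with a deliberately simplified sketch. This is not the TMPA algorithm; the function name, the use of None for a missing MW overpass, and the rainfall values are assumptions made for illustration.

```python
def merge_mw_ir(mw, ir_calibrated):
    """Prefer the microwave estimate; fall back to calibrated IR.

    mw: per-cell MW rainfall (mm/h), with None where no MW overpass exists.
    ir_calibrated: per-cell MW-calibrated IR rainfall (mm/h), always present.
    """
    return [ir if m is None else m for m, ir in zip(mw, ir_calibrated)]

# Hypothetical three-cell example: the middle cell has no MW coverage.
merged = merge_mw_ir([1.0, None, 0.0], [2.0, 3.0, 4.0])
```

Note that a valid MW estimate of zero rainfall is kept as is; only cells without any MW observation are filled from the IR field.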

3. Error Decomposition

[21] In a demonstration of error decomposition, Tian et al. [2009] have outlined a general scheme of breaking down total rainfall error (hereafter used interchangeably with “total bias”) into three independent components: hit error H, missed precipitation –M, and false precipitation F. Figure 3 illustrates the concept of false, hit, and missed precipitation of satellite rainfall observation relative to ground observation. According to Figure 3, H represents observed rainfall events which are detected by both satellite and ground validation data (hits), M shows missed rainfall events by the satellite but detected by the validation data, and F indicates false observation of rainfall events by the satellite which are not reported by the reference data. On the same figure, an example is provided to illustrate the total error decomposition into completely independent hit bias, missed, and false precipitation for individual grid cells.

Figure 3.

Diagram showing hits (H), misses (M), and false alarms (F) for dichotomous variables (satellite rainfall estimate and ground observation) and simple exemplary table that shows how error components are identified and separated at basin gridcell level (unit in mm d−1).

[22] In this study, the total error E (or bias) is defined as satellite estimate minus ground reference (error unit in mm d−1, as for the rainfall). Hit error H indicates the discrepancy between the satellite and ground rainfall data given that both report rainfall coincidently; as a result, hit error can be positive or negative. On the other hand, missed M and false F errors always have negative and positive signs, respectively. The relation between the total rainfall error E and the error components can be expressed as E = H − M + F. For a detailed explanation, readers are referred to Tian et al. [2009] and Wilks [1995]. It is obvious from the above error relationship that the magnitude of the total error cannot completely characterize the performance of satellite rainfall products. For example, M and F can cancel each other as they have opposite signs, resulting in a low total bias (E) but not necessarily a low hydrologic simulation error, which is dictated by the components [Tian et al., 2009]. Therefore, breaking down the total satellite rainfall error into its distinct components (H, −M, and F) helps us to gain a clearer picture of error amplitudes so that the performance of the algorithm for a satellite rainfall product can be evaluated in more detail. More importantly, breaking down the total error into such components helps to trace the source of error that propagates into soil moisture and runoff through a hydrologic model. It also helps to constrain the error behavior as a function of LULC and runoff generation physics. Eventually, this knowledge is expected to improve satellite rainfall algorithm development, application, and data assimilation schemes in the future.
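The decomposition E = H − M + F can be made concrete with a minimal sketch. The function name and the rainfall values are hypothetical, and a cell is treated as rainy whenever its value exceeds zero, which is an assumption made here rather than a threshold stated in the text.

```python
def decompose_error(sat, ref):
    """Split total bias into hit bias, missed, and false precipitation.

    sat, ref: per-cell satellite and ground reference rainfall (mm/d).
    Returns (E, H, M, F), summed over cells, with E = H - M + F.
    Assumption: a cell counts as rainy when its value is greater than zero.
    """
    hit = missed = false = 0.0
    for s, g in zip(sat, ref):
        if s > 0.0 and g > 0.0:
            hit += s - g      # both report rain: signed hit bias
        elif g > 0.0:
            missed += g       # rain on the ground, none from the satellite
        elif s > 0.0:
            false += s        # satellite reports rain that did not occur
    total = sum(sat) - sum(ref)
    return total, hit, missed, false

# Hypothetical five-cell example (mm/d).
sat = [5.0, 0.0, 3.0, 0.0, 2.0]
ref = [4.0, 6.0, 0.0, 0.0, 5.0]
E, H, M, F = decompose_error(sat, ref)
```

In this toy case H = −2, M = 6, and F = 3, so E = −2 − 6 + 3 = −5 mm d−1. The false precipitation offsets half of the missed precipitation, which illustrates why a modest total bias can hide large component errors.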

4. Results

4.1. Satellite Rainfall, Soil Moisture and Runoff Production

[23] To reduce visual cluttering, Figure 4 compares the variability of the 31 day moving average time series of satellite rainfall and ground (reference) data. Although the time series of satellite rainfall products capture the temporal trend of the reference rainfall data in all zones (except PERSIANN-CCS in zone E3), CMORPH and PERSIANN-CCS generally overestimate the rainfall magnitude during the summer season. In particular, the overestimation is significantly high for almost the entire period over LULC zones E3 and F3, but is largely absent in forest and woodland regions (zone A1). Zones E3 and F3 are mainly characterized by savanna-grassland systems in mountainous terrain. More importantly, PERSIANN-CCS does not capture the rainfall trend during the winter season over the mountainous regions, particularly after 2005. 3B42RT, on the other hand, provides relatively better rainfall estimation in all regions for the study period. However, it has a tendency to underestimate rainfall for cropland systems during wet seasons. The underestimation is more noticeable after July 2005, and this may be tied to the implementation of a new version of the 3B42RT algorithm as of 3 February 2005. The underestimation can be traced to the significant amount of missed precipitation of 3B42RT in the central and eastern part of the MRB (as shown in Figures 6 and 7).

Figure 4.

A 31 day moving average time series of rainfall estimates spatially averaged over zone A1 (forest and woodland), zone C2 (cropland), and zone E3 (savanna-grassland).
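The smoothing used for Figure 4 can be sketched as follows. The text specifies only the 31 day window; the centered alignment and the shortened window at the ends of the record are assumptions made here, and the function name is hypothetical.

```python
def moving_average(series, window=31):
    """Centered moving average of a daily series.

    Assumption: near the start and end of the record, the average is
    taken over whatever part of the window is available.
    """
    half = window // 2
    smoothed = []
    for k in range(len(series)):
        lo = max(0, k - half)
        hi = min(len(series), k + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed

# Small check with a 3 day window so the result is easy to verify by hand.
demo = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

For zone-averaged daily rainfall, the same function would be called with the default window of 31 after averaging the pixel values of each day.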

[24] Figure 5 illustrates the percentage of runoff and soil moisture production with respect to ground rainfall intensity (mm d−1) during the summer seasons of 2006 and 2010. The percentage of soil moisture production remains nearly constant across rainfall rates in all study zones. Because soil moisture has a longer memory, its variation is difficult to observe at smaller time scales. Moreover, the moisture content of the soil column is bounded by a finite holding capacity (equal to porosity) and by the initial moisture content [Raj and Hossain, 2010], which makes soil moisture insensitive to high rainfall rates. As a result, the percentage of soil moisture production on a daily basis displays very low variation. On the other hand, as the rainfall intensity increases, the percentage of runoff production grows exponentially, at a growth rate that differs among LULC systems. For forest and woodland systems (Figure 5, zones A1 and B1), the percentage of runoff production increases slowly, and the rainfall rate at which runoff production exceeds soil moisture production is higher than in the other zones. In forest and woodland systems, infiltration is favored over runoff, which probably delays the formation of runoff until the rainfall rate increases to nearly 10 mm d−1. For the cropland system (zones C2 and D2), the rainfall rate at which runoff production exceeds soil moisture production is smaller (about 5 mm d−1), potentially due to irrigation and other human activities that facilitate runoff production more quickly. In zones E3 and F3, runoff production exceeds soil moisture production at a much smaller rainfall rate (less than 3 mm d−1). In these zones, in addition to LULC, the topographic features dominate the runoff production. Because the VIC model simulates runoff without directly incorporating the effects of topographic gradient, this seems to indicate the predominance of the orographically enhanced rainfall-runoff process.
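The crossover rainfall rate discussed above, i.e., the intensity at which the percentage of runoff production first exceeds that of soil moisture production, can be located with a simple search over binned curves. The following is a minimal sketch only: the function name `crossover_rate`, the linear interpolation between bins, and the sample numbers are illustrative assumptions and are not taken from the study or from Figure 5.

```python
def crossover_rate(rates, runoff_pct, soil_pct):
    """Return the rainfall rate (mm/d) at which runoff_pct first exceeds soil_pct.

    Assumes `rates` is sorted in increasing order. The crossover is linearly
    interpolated between the two bracketing bins; None means no crossover.
    """
    if not (len(rates) == len(runoff_pct) == len(soil_pct)):
        raise ValueError("all inputs must have the same length")
    prev_diff = None
    for i, (rate, ro, sm) in enumerate(zip(rates, runoff_pct, soil_pct)):
        diff = ro - sm
        if diff > 0:
            if prev_diff is None:
                # Runoff already exceeds soil moisture in the first bin.
                return rate
            # prev_diff <= 0 < diff, so the denominator is strictly positive.
            frac = -prev_diff / (diff - prev_diff)
            return rates[i - 1] + frac * (rate - rates[i - 1])
        prev_diff = diff
    return None


# Hypothetical binned curves (NOT data from the paper), for illustration only.
rates = [1.0, 2.0, 4.0, 8.0, 12.0]          # rainfall intensity bins, mm/d
soil_pct = [40.0, 40.0, 40.0, 40.0, 40.0]   # nearly constant soil moisture share
runoff_pct = [5.0, 10.0, 20.0, 35.0, 55.0]  # runoff share growing with intensity
threshold = crossover_rate(rates, runoff_pct, soil_pct)
```

With curves of this kind computed per zone, the returned value would play the role of the approximate 10, 5, and 3 mm d−1 thresholds reported for the three LULC groups.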

Figure 5.

Percentage of runoff and soil moisture production for different rainfall intensities (ground observation) for selected zones of summer 2006 (top six panels) and 2010 (bottom six panels).

4.2. Spatial Nature of Errors

[25] Figures 6 and 7 present the spatial pattern of rainfall, soil moisture, and runoff errors. With respect to the spatial error distribution, the three satellite rainfall products share certain similarities. The southern and southeastern coastal regions of the Mississippi basin (Louisiana, Mississippi, and Tennessee) are dominated by missed precipitation during the winter season for all satellite rainfall products. In general, missed precipitation is also the major source of total bias for the eastern and central parts of the basin during the winter season for the 3B42RT and CMORPH products. This is tied to the occurrence of high snow cover in these regions during the winter season and the weakness of PMW sensors in detecting warm rain processes.

Figure 6.

Error component of three satellite rainfall products: total bias (E), hit bias (H), missed precipitation (−M), and false precipitation (F), soil moisture and runoff errors. (top) The winter of 2006 (D05–JF06). (bottom) Summer 2006 (JJA).

Figure 7.

Same as Figure 6 except for the (top) winter (DJF) and (bottom) summer (JJA) of 2010.

[26] The western mountainous parts of the basin (upstream of Missouri and Arkansas-Red basins) exhibit significant positive total bias during the winter season for the PERSIANN-CCS product, which is mainly caused by false precipitation and positive hit bias. In this region, the PERSIANN-CCS product displays considerable false precipitation in the winter seasons of both 2006 and 2010, signifying a weakness of the algorithm, namely its tendency to produce false precipitation in moderate altitude and highland regions. On the other hand, 3B42RT shows a positive hit bias in the eastern part of the MRB during the same season, but the positive hit bias and missed precipitation cancel each other, resulting in a much smaller total bias in the region. The soil moisture error during this season has a pattern similar to that of the total bias, but the magnitude of the error is higher than that of the precipitation error. Most of the error from the rainfall is propagated into soil moisture and its magnitude is amplified. Only a modest error signature is observed in the runoff because less runoff is produced during the winter season, except for the PERSIANN-CCS product, which displays a small positive runoff error in the western edges of the MRB due to significant false precipitation.

[27] For the summer season, the hit bias is the major contributor to the total error in all parts of the basin except for the northern part of Wisconsin and Minnesota, which is characterized by both missed precipitation and negative hit bias. In general, during the summer season, CMORPH and PERSIANN-CCS products overestimate the rainfall in the central and western regions of the basin. The soil moisture error during the summer is not amplified as it is during the winter season. A positive soil moisture error, comparable to the total rainfall bias, is observed in most parts of the region. The large soil moisture error during the winter season can be explained by the formation of snow over the land surface because of false precipitation and positive hit bias (Figures 6 (top) and 7 (top)). Less runoff error is observed during the summer season for the 3B42RT product, and large positive runoff errors are produced in the central and northern parts of the basin for CMORPH and PERSIANN-CCS due to the occurrence of false precipitation and positive hit bias in the region. In general, this confirms that rainfall error first propagates to soil moisture until the soil column reaches its maximum holding capacity, after which the remaining portion of the error transfers to the runoff process [Raj and Hossain, 2010].

4.3. Temporal Error Analysis

[28] Temporal error analysis was performed for the identified study zones based on LULC type. For each zone, the spatial average error was computed for the analysis period of 8 years (2003 to 2010). The time series plot (3B42RT panel) also included specific timelines where different sensors were added or decommissioned from the constellation used for precipitation estimation [Huffman et al., 2010] to help the reader understand the variation in performance as a function of the sensors' history. To distinguish the temporal pattern of the errors clearly and avoid visual cluttering, a 31 day moving average is applied again (similar to Figure 4) for the rainfall error components, runoff, and soil moisture errors.
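The 31 day smoothing applied to the zone-averaged error series can be sketched as follows. This is an illustrative sketch under stated assumptions: the text does not say whether the window is centered or trailing, nor how the ends of the record are treated, so the centered window and the shortened windows at the edges are assumptions, and the function name `moving_average` is hypothetical.

```python
def moving_average(series, window=31):
    """Centered moving average of a daily series.

    Assumption: near the start and end of the record, the average is taken
    over whatever part of the window is available (a shorter window).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        chunk = series[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Small check with a 3 day window so the result is easy to verify by hand.
demo = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
# demo == [1.5, 2.0, 3.0, 4.0, 4.5]
```

The same function would be applied separately to each rainfall error component and to the runoff and soil moisture error series of a zone before plotting.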

[29] Figure 8 shows the temporal error pattern for forest and woodland systems. In these two particular zones (zones A1 and B1), 3B42RT has positive hit bias most of the time and high missed precipitation during the entire period, resulting in smaller total bias. The hit bias drops to negative values during the summer seasons and rises during the winter (Figure 8). As a result, the total error is drastically reduced during the summer and becomes slightly positive during the winter. Generally, the total bias is dominated by missed precipitation. Apart from that, there is no consistently similar trend between the two zones for 3B42RT. More interestingly, the soil moisture error follows the trend of the total rainfall bias and the runoff error trails the hit bias trend. Similar to the total bias, the soil moisture error is reduced during the summer season due to high hit bias and is highly negative during the winter due to significant missed precipitation.

Figure 8.

Time series of error components for three satellite rainfall products and simulated soil moisture and runoff errors for forest and woodland systems for the period of 2003 to 2010 (MB: missed-rain bias; FB: false-rain bias; HB: hit bias; TB: total bias; ROE: runoff error; SME: soil moisture error). Timeline for satellite sensors that were added or decommissioned from the constellation used for precipitation estimation (hidden line with right arrow head, added timeline; hidden line with left arrow head, decommissioned year; yellow smooth line, transition from GPCC to CAMS).

[30] For the same LULC zones (A1 and B1), CMORPH has a completely different temporal pattern compared to 3B42RT. The total error is dominated by hit bias. CMORPH has strong positive total and hit bias during the summer season and negative bias during the winter for zone A1. CMORPH in zone B1 closely resembles zone A1, except that the magnitudes of the positive total and hit bias during summer diminish in the latter case. The absence of false precipitation, which contributes to positive hit and total bias, results in the formation of weak positive bias. Unlike 3B42RT, the total bias for CMORPH is controlled by the hit bias in both regions. The PERSIANN-CCS data are characterized by a smaller amount of false precipitation and positive hit bias in both zones. The total error is mostly caused by hit bias and the presence of a small amplitude of false precipitation. Generally, for forest and woodland systems, the nature of the errors is similar for CMORPH and PERSIANN-CCS because the hit bias is the leading error, while 3B42RT is distinguished by strong missed precipitation and mostly positive hit bias. Runoff and soil moisture errors are dictated by the hit and total bias for both CMORPH and PERSIANN-CCS.

[31] As seen in Figure 9, the drift of temporal errors for the human land use system (cropland) shares considerable common characteristics with the forest and woodland system. The total bias is largely controlled by missed precipitation for 3B42RT, whereas for CMORPH and PERSIANN-CCS, total errors are dominated by hit bias. In zone C2, missed and false precipitation components are considerably higher during the summer time for all satellite rainfall products, leading the hit bias to dominate the total error. By and large, zone D2 is different from zone C2, and instead shares significant error characteristics with zone A1. This shows that LULC classification alone does not guarantee consistent error characteristics and that other factors related to geographical features need to be considered. Such factors may include climatic factors (Köppen climate class), topography (e.g., elevation, slope, topographic index), and soil types (e.g., hydraulic properties and texture).

Figure 9.

Same as Figure 8, except for cropland system.

[32] Figure 10 presents the error characteristics of savanna and grassland systems (zones E3 and F3). Missed precipitation is small in CMORPH and PERSIANN-CCS for both zones, whereas false precipitation is large in both regions except that it is small for 3B42RT in zone F3. For the CMORPH product, hit bias is the dominant error component that dictates the total bias, whereas due to the significant amount of false precipitation in PERSIANN-CCS, the total bias is fully dominated by false-rain bias. As seen in Figure 10, the amplitude of the soil moisture error is higher than the component or total errors during the winter time for CMORPH and PERSIANN-CCS products. Despite the peak amplitudes of soil moisture error during the winter period, there is a systematic trend between the rainfall and soil moisture errors throughout the analysis period (2003–2010). These zones are mainly characterized by mountainous regions (up to 2000 m a.s.l.). As explained in section 4.1, CMORPH and PERSIANN-CCS rainfall products overestimate the rainfall in these zones during the wet season and winter season, respectively (Figure 4, bottom). Due to the mountainous nature of the region, the overestimated rainfall from satellite products is converted to snowfall by the hydrologic model, resulting in the formation of significant snow pack depth during the winter seasons, particularly for the PERSIANN-CCS product due to considerable false-rain bias (Figure 11, bottom left).

Figure 10.

Same as Figure 8, except for savanna-grassland system.

Figure 11.

Temporal pattern of snow pack depth and snow water equivalent for (left) zone E3 and (right) zone F3 in Mississippi basin (Note: SWEE is snow water equivalent error; SPDE is snow pack depth equivalent; ROE is runoff error; and SME is soil moisture error).

[33] From the hydrologic modeling perspective, there are potentially two main reasons for soil moisture error to be high in these two particular zones. First, because of the formation of significant snow pack depth, the soil column is continuously supplied with moisture from snow water equivalent through melting during the spring season regardless of additional rainfall during the season. Second, a previous study on evaluation of models for simulating snow cover extent has shown that VIC-3L has the tendency to overestimate the snow depth over mountainous regions [Sheffield et al., 2003], which ultimately has an impact in soil moisture simulation over highland regions.

[34] Correlation coefficients are used to determine the degree to which rainfall error patterns are associated with soil moisture and runoff errors. According to Figure 12, strong correlations (above 0.8) are observed between runoff error and both total error and hit bias for 3B42RT and CMORPH products in all zones (Figure 12 (left), black and green bars). The runoff error is weakly correlated with missed precipitation (less than 0.4) and moderately correlated with false precipitation in the highland region, where false-rain bias is a common error. For PERSIANN-CCS, the correlation of runoff error with the hit bias is weak for the highland region of the Mississippi basin (zones E3 and F3), but it is strong with total bias and false precipitation in this region. As mentioned above, false-rain bias is the leading error that dominates the total bias for PERSIANN-CCS in these particular regions (Figure 10).
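The association between a rainfall error component and a simulated flux error can be quantified as sketched below. This is a hedged illustration: the text does not state which correlation coefficient was used, so the Pearson form is an assumption, and the function name `pearson_r` and the sample series are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical zone-averaged series (NOT values from the study).
hit_bias = [0.5, -0.2, 1.1, 0.3, -0.6, 0.9]
runoff_error = [0.4, -0.1, 0.9, 0.2, -0.5, 0.8]
r_hit_runoff = pearson_r(hit_bias, runoff_error)
```

Repeating the calculation for each error component, product, and zone would yield a set of coefficients of the kind summarized in Figure 12.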

Figure 12.

Correlation coefficient of soil moisture and runoff errors with total bias and rainfall error components for the period of 2005–2010.

[35] In contrast, the soil moisture is strongly associated with missed precipitation, hit bias, and total bias, and sometimes with false precipitation (right three panels, blue and orange bars). Missed precipitation often occurs because of light rain during summer and rain over snow cover during winter. Light rain is generally responsible for the increase in simulated soil moisture content but does not facilitate runoff generation unless the soil moisture reaches saturation. Rainfall over snow cover is also not responsible for runoff generation, as the rain is converted in the model to snow when it reaches the ground. On the other hand, these types of events have significant effects on soil moisture production, leading the soil moisture to depend on all three error components. As a result, if the contribution of missed precipitation to the total error is significant, runoff error is dictated by the hit bias more than by the total error.

5. Conclusions and Recommendations

[36] In this study the total rainfall bias was decomposed into hit bias, missed, and false precipitation for the entire MRB. Spatial distribution of rainfall error components, soil moisture, and runoff error were analyzed. For three dominant land use scenarios, the temporal patterns of rainfall error components, soil moisture, and runoff errors were characterized both qualitatively and quantitatively. For forest and woodland and human land use system, the soil moisture was mainly dictated by the total bias for 3B42RT, CMORPH, and PERSIANN-CCS products. On the other hand, runoff error was largely dominated by hit bias rather than the total bias. This difference most likely occurred due to the presence of missed precipitation, which was a major contributor to the total bias both during the summer and winter seasons.
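The decomposition of total bias can be illustrated with a short sketch that follows the sign convention of the Figure 6 caption, E = H − M + F. The rain/no-rain threshold of zero, the function name `decompose_bias`, and the five-day sample series are illustrative assumptions and do not reproduce the processing chain used in the study.

```python
def decompose_bias(satellite, reference, threshold=0.0):
    """Split total bias into hit bias (H), missed (M), and false (F) precipitation.

    A time step counts as rainy when the value exceeds `threshold`.
    With threshold = 0 and nonnegative inputs, H - M + F equals
    sum(satellite) - sum(reference) exactly.
    """
    if len(satellite) != len(reference):
        raise ValueError("series must have the same length")
    hit = missed = false = 0.0
    for sat, ref in zip(satellite, reference):
        sat_rain = sat > threshold
        ref_rain = ref > threshold
        if sat_rain and ref_rain:
            hit += sat - ref   # both detect rain: amount error only
        elif ref_rain:
            missed += ref      # reference rain not detected by the satellite
        elif sat_rain:
            false += sat       # satellite rain with no reference rain
    return {"hit": hit, "missed": missed, "false": false,
            "total": hit - missed + false}


# Hypothetical daily rainfall in mm (NOT data from the study).
satellite = [5.0, 0.0, 2.0, 0.0, 4.0]
reference = [3.0, 3.0, 0.0, 0.0, 5.0]
components = decompose_bias(satellite, reference)
# components == {"hit": 1.0, "missed": 3.0, "false": 2.0, "total": 0.0}
```

In this example the total bias is zero even though all three components are nonzero, which illustrates how components of opposite sign can cancel and mask the underlying error structure.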

[37] In summary, the tracing of error in hydrologic simulation to rainfall error can be condensed into the following key rules for product developers and end users.

[38] 1. The magnitude of the rainfall at which the rate of production of runoff exceeds that of soil moisture depends on the LULC type. The percentage of runoff production exceeds that of soil moisture when the rainfall magnitudes are 10, 5, and 3 mm d−1 for forest and woodland, cropland, and savanna-grassland systems, respectively. Since the magnitude of the rainfall error propagating to the fluxes depends on the amount of production of the fluxes (such as soil moisture, runoff, and evapotranspiration), these threshold values are ultimately useful to understand the proportion of the error propagating to them, which could be applicable for hydrologically relevant merging of multisatellite rainfall products.

[39] 2. For most cases, the hit bias and missed precipitation are the major error components that dominate the total bias during summer and winter, respectively. Moreover, missed precipitation dictates the soil moisture error but not the runoff error, probably indicating that missed precipitation mostly occurs because of local convective rainfall that takes place for a relatively short period of time. Additionally, low level warm rain clouds are difficult to detect with the scattering channels of the passive microwave sensor, often resulting in missed precipitation. The runoff error is highly correlated with hit bias, which is a common problem for CMORPH and PERSIANN-CCS over mountainous regions during the heavy rain season. The CMORPH product is characterized by positive hit bias in most parts of the basin during the rainy season. We speculate that the overestimation of precipitation arises from the technique of merging IR and MW estimates in the “morphing” algorithm, as pointed out by Tian et al. [2009].

[40] 3. For hydrologists and other data users, it is important to realize the implication of satellite errors in soil moisture and runoff simulation. The total bias alone does not show the clear picture of rainfall or hydrologic error structures. As the error components have different signs, sometimes they cancel each other to produce a lower total bias [Tian et al., 2009]. As a result, the magnitude of soil moisture and runoff errors should be evaluated based on the amplitude of error components rather than the total bias. For hydrologic model simulation, the performance of the satellite products with respect to the geographic location needs to be assessed to make more accurate model prediction.

[41] As in any other modeling problem, the findings of this study are likely sensitive to the quality of the data assumed as “reference.” In this study, the gridded soil moisture and runoff from the VIC model are assumed to be the “synthetic” truth or reference. It is important to recognize the limitation of this assumption, because the model's structural or parametric error is introduced into the hydrologic fluxes during the simulation process. We believe that the input data quality control, model calibration, and validation carried out in the study prior to modeling are very important to minimize such impacts.

[42] Despite the aforementioned limitation, this particular study has vital applications for algorithm developers and data users to understand satellite rainfall, soil moisture, and runoff errors in the continuum of time, space, and land use/land cover. Such a wide-ranging investigation, which characterizes satellite rainfall error as a function of LULC type, traces back the source of errors in soil moisture and runoff simulation, and examines the role of LULC in runoff and soil moisture production and in error propagation, is expected to improve multisensor algorithms or multiproduct merging. A natural follow-up question now is to explore the nature of the errors as a function of additional criteria such as climate type, soil type, and terrain features (topography). These additional criteria are likely to have their own unique and identifiable contribution to the performance of satellite products and the formation of runoff and soil moisture, such as those observed herein for LULC. Thus, consideration of additional governing features has merit in extending the merging of multiproduct satellite data to ungauged regions where these features are always known a priori. Work is under way along this direction and will be reported in a future study.


[43] The study and the first author (Gebregiorgis) were supported by NASA New Investigator Program (NIP) grant NNX08AR32G of Faisal Hossain and the Center for Management, Utilization and Protection of Water Resources at TN Technological University. A major component of the research was also generously supported by the Goddard Earth Sciences and Technology (GEST) Center of University of MD Baltimore County through its Graduate Student Summer Program (GSSP) during summer 2011 awarded to the first author under the able supervision of Dr. Christa Peters-Lidard and Dr. Yudong Tian. The authors are also grateful for the guidance received from the associate editor and the three anonymous reviewers that helped to improve the quality of the study and manuscript considerably.