Hydrologic evaluation of Multisatellite Precipitation Analysis standard precipitation products in basins beyond its inclined latitude band: A case study in Laohahe basin, China



[1] Two standard Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) products, 3B42RT and 3B42V6, were quantitatively evaluated in the Laohahe basin, China, located within the TMPA product latitude band (50°NS) but beyond the inclined TRMM satellite latitude band (36°NS). In general, direct comparison of TMPA rainfall estimates to collocated rain gauges from 2000 to 2005 shows that the spatial and temporal rainfall characteristics over the region are well captured by the 3B42V6 estimates. Except for a few months with underestimation, the 3B42RT estimates show unrealistic overestimation nearly year round, which needs to be resolved in future upgrades to the real-time estimation algorithm. Both model-parameter error analysis and hydrologic application suggest that the three-layer Variable Infiltration Capacity (VIC-3L) model cannot tolerate the nonphysical overestimation behavior of 3B42RT through the hydrologic integration processes, and as such the 3B42RT data have almost no hydrologic utility, even at the monthly scale. In contrast, the 3B42V6 data can produce much better hydrologic predictions with reduced error propagation from input to streamflow at both the daily and monthly scales. This study also found that the error structures of both RT and V6 have a significant geo-topography-dependent distribution pattern, closely associated with latitude and elevation bands, suggesting current limitations with TRMM-era algorithms at high latitudes and high elevations in general. Looking into the future Global Precipitation Measurement (GPM) era, the Geostationary Infrared (GEO-IR) estimates still have a long-term role in filling the inevitable gaps in microwave coverage, as well as in enabling sub-hourly estimates at typical 4-km grid scales. Thus, this study affirms the call for a real-time systematic bias removal in future upgrades to the IR-based RT algorithm using a simple scaling factor. 
This correction is based on MW-based monthly rainfall climatologies applied to the combined monthly satellite-gauge research products.

1. Introduction

[2] Precipitation is a critical forcing variable to hydrologic models, and therefore accurate measurements of precipitation on a fine space and time scale are very important for simulating land-surface hydrologic processes, predicting drought and flood, and monitoring water resources, especially for semiarid regions [Sorooshian et al., 2005]. Precipitation, unfortunately, is also one of the most difficult atmospheric fields to measure because of the limited surface-based observational networks and the large inherent variations in rainfall fields themselves. A long history of development in the estimation of precipitation from space has culminated in sophisticated satellite instruments and techniques to combine information from multiple satellites to produce long-term products useful for climate monitoring [Arkin and Meisner, 1987; Adler et al., 2003] and for fine-scale hydrologic applications [Su et al., 2008]. To date, a number of finer-scale, space-based precipitation estimates are now in operational production, including the Precipitation Estimation from Remotely Sensed Information Using Artificial Neural Networks (PERSIANN) [Sorooshian et al., 2000], One-Degree Daily [Huffman et al., 2001], the Passive Microwave-Calibrated Infrared algorithm (PMIR) [Kidd et al., 2003], the Climate Prediction Center (CPC) morphing algorithm (CMORPH) [Joyce et al., 2004], PERSIANN-Cloud Classification System [Hong et al., 2004] (see also http://hydis8.eng.uci.edu/GCCS/), the Naval Research Laboratory Global Blended-Statistical Precipitation Analysis [Turk and Miller, 2005], and the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) products [Huffman et al., 2007]. 
It is anticipated that the legacy of the aforementioned TRMM-era algorithms (mostly confined within the 50°NS latitude band) will be succeeded by products from the future Global Precipitation Measurement (GPM) mission (http://gpm.gsfc.nasa.gov), which is planned to provide a TRMM-like “core” satellite to calibrate all of the microwave estimates on an ongoing basis over the latitude band 65°NS.

[3] As one of the standard TRMM-era mainstream products, TMPA provides precipitation estimates by combining information from multiple satellites at a 3-hourly, 0.25° × 0.25° latitude-longitude resolution covering the globe within the latitude band of 50°NS. The TMPA estimates are produced in four consecutive stages: a) the polar-orbiting microwave precipitation estimates are calibrated by the single best calibrator, TRMM microwave, and then combined together, b) geostationary infrared precipitation estimates are calibrated using the calibrated microwave precipitation to fill in gaps of the microwave coverage, c) the microwave and the window-channel (∼10.7 μm) infrared (IR) data are combined to form the near Real-Time (i.e., 3B42RT) product, and d) global rain gauge analysis data are incorporated to generate the research-quality product Version 6 (i.e., 3B42V6). The 3B42RT and 3B42V6 products have been generated and made available from Jan. 2002 and Jan. 1998 to the present, respectively.

[4] According to Huffman et al. [2007], the TMPA algorithm is designed with the sequential calibration scheme so that even the final products can be traceable back to the original single “best” calibrator, the precipitation estimates from the TRMM Combined Instrument (TCI) including TRMM Microwave Imager (TMI) and TRMM Precipitation Radar (PR). In other words, all the less frequent, polar-orbiting passive microwave (90°NS) and high frequency, geostationary infrared (global coverage) precipitation estimates are ultimately benchmarked by the inclined-orbital (36°NS) TRMM satellite instruments, TMI and PR. One of the suggested future works by Huffman et al. [2007] is to explore differences between the 3B42RT and 3B42V6 research products, especially at high latitudes and mountainous regions, with implications on the future GPM mission. Thus, the objective of this study is to evaluate the data quality and investigate the hydrologic utility of the two standard TMPA products (i.e., 3B42RT and 3B42V6) in a heavily instrumented basin, located within the TMPA product latitude band (50°NS) but beyond the inclined TRMM satellite latitude band (36°NS). Specifically, we will investigate: 1) What is the spatiotemporal error structure of the two standard products, and how much do they differ? 2) Can their errors be tolerated by a widely used hydrologic model at daily and monthly scales, and how do the errors propagate into hydrologic prediction? 3) What is their hydrologic utility in terms of daily decision-making support (e.g., reservoir operations, flood monitoring and warning) and water resources management? 4) What implications do results from this study have on future GPM algorithms?

[5] The following sections discuss the study area, data and hydrologic model used in this study (section 2), and the detailed evaluation of the TMPA products (section 3). Then in section 4 we further investigate the hydrologic utility of the two standard TMPA products by using the Variable Infiltration Capacity hydrologic model. Summary and concluding remarks are presented in section 5.

2. Study Area, Data, and Methodology

2.1. Laohahe Basin

[6] The Laohahe basin, with a drainage area of 18,112 km² above the Xinglongpo hydrological station, is located at the junction of Hebei and Liaoning Provinces and the Inner Mongolia Autonomous Region in the northeast of China (Figure 1). The basin lies upstream of the West Liaohe River at a latitude of 41°–42.75°N and longitude of 117.25°–120°E with a typical semiarid climate. The average annual temperature, precipitation, and runoff during the period of 1964–2005 were 14°C, 430.9 mm, and 46.1 mm, respectively. The basin elevation ranges from 400 m above sea level at the channel outlet to over 2000 m in the upstream mountainous area, and the topography descends significantly from west to east. We chose this basin for the following reasons: 1) it has had an excellent ground observation network over the last 15 years; 2) it has been experiencing increased population, drought, and a possible change in hydrologic regime according to decades of observation; 3) it is located well north of the inclined TRMM orbital latitude band (36°N); and more importantly, 4) the surface observational network (in particular the rain gauge data) is independent from what Huffman et al. [2007] used for 3B42V6 gauge-correction. Altogether, there are 53 rain gauges spread evenly throughout the basin that have been recording daily precipitation data from 1964 to present.

Figure 1.

Map of the Laohahe basin, rain gauges, meteorological stations, streamflow station, topography, and sampling strategies used in this study. Black squares represent the 16 selected 0.25° × 0.25° grids for precipitation comparison. Numbers are grid IDs (e.g., 0301 indicates the first grid containing 3 gauges, 0302 represents the second grid containing 3 gauges, and so on). Hatched shading over the inset map of China represents the areas with latitude lower than the northernmost orbit of the TRMM satellite (36°N).


[7] In the present implementation, the TMPA is computed twice as part of the routine processing for TRMM, first as a near-real time product (3B42RT) computed about 6–9 h after observation time, and then as a post-real time research product (3B42V6) computed about 15 days after the end of the month with monthly surface rain gauge data. As an experimental best-effort, real-time product, the 3B42RT is generated from two major sources, Microwave and Infrared with the ultimate calibrator, TMI. The polar-orbiting microwave information is collected by a variety of low earth orbit satellites, including Special Sensor Microwave Imager (SSM/I) on Defense Meteorological Satellite Program (DMSP) satellites, Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) on Aqua, and the Advanced Microwave Sounding Unit-B (AMSU-B) on the National Oceanic and Atmospheric Administration (NOAA)-15, 16, and 17 satellites. The second data source for 3B42RT is the gap-filling infrared (IR)-based estimates merged from five geosynchronous earth orbit (GEO) satellites into half-hourly 4 km × 4 km equivalent latitude-longitude grids [Janowiak et al., 2001]. 3B42V6 makes use of three additional data sources: the TCI estimate, which employs data from both TMI and the TRMM PR as a source of calibration (TRMM product 2B31 [Haddad et al., 1997a, 1997b]); the GPCP monthly rain gauge analysis developed by the Global Precipitation Climatology Center (GPCC) [Rudolf, 1993]; and the Climate Assessment and Monitoring System (CAMS) monthly rain gauge analysis developed by CPC [Xie and Arkin, 1996]. The two rain gauge analysis data sets are used to correct bias in the post-real time, TCI-calibrated, multisatellite merged Microwave-Infrared precipitation estimates for each calendar month. The bias-correction ratio is then used to scale each 3-hourly field in the month, producing the final 3B42V6 product.
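The monthly scaling step described above can be illustrated with a short sketch. This is not the TMPA production code; the function name `scale_by_monthly_ratio`, the array layout, and the handling of months in which the satellite field contains no rain are all assumptions made for illustration.

```python
import numpy as np

def scale_by_monthly_ratio(sat_3h, gauge_monthly_total):
    """Scale 3-hourly satellite fields so that their monthly sum matches a
    gauge-adjusted monthly total (hypothetical helper, illustration only).

    sat_3h: array (n_steps, ny, nx), 3-hourly accumulations (mm) for one month.
    gauge_monthly_total: array (ny, nx), gauge-adjusted monthly total (mm).
    """
    sat_3h = np.asarray(sat_3h, dtype=float)
    target = np.asarray(gauge_monthly_total, dtype=float)
    sat_total = sat_3h.sum(axis=0)
    # Assumption: where the satellite saw no rain all month the ratio is
    # undefined, so the field is left unscaled (ratio = 1).
    ratio = np.ones_like(sat_total)
    wet = sat_total > 0
    ratio[wet] = target[wet] / sat_total[wet]
    # The same ratio multiplies every 3-hourly field of the month.
    return sat_3h * ratio, ratio
```

Because the adjustment is multiplicative, it changes rain amounts but cannot add rain to a 3-hourly step in which the satellite estimate is zero.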

[8] In summary, the standard TMPA precipitation data are available in both near-real time (3B42RT, about 9 h after real time) and post-real time (3B42V6, about 10–15 days after the end of each month) with 3-hourly, 0.25° resolution over a global latitude band 50°NS (for brevity, these will also be referred to as RT and V6, respectively). Essentially, there are two important differences between RT and V6: (1) RT uses the TMI as the calibrator while V6 uses the TCI (including TMI and Precipitation Radar), which is better than TMI alone but not available in real time; (2) only the V6 post-real time product incorporates rain gauge analyses from GPCC and CAMS, while RT is purely satellite-derived.

[9] The RT data have been generated and made available on the website since Jan. 2002, while the V6 data are available from Jan. 1998 for a record that totals more than 10 years and continues to grow. However, in the current study basin, the RT data are only available after Feb. 2002, and the V6 data are not available until Mar. 2000. The surface observational data, including gauged rainfall, stream discharge, evaporation, wind speed, temperature, etc., have been collected from Jan. 1990 to Dec. 2005. Therefore, to make full use of all the available data, we used the in situ observations in the 1990s to calibrate the hydrologic model and evaluated the two satellite products from 2000 through the end of 2005.

2.3. Hydrological Model and Observed Data

[10] The three-layer Variable Infiltration Capacity (VIC-3L) model [Liang et al., 1994, 1996] was used to evaluate the application of RT and V6 as forcing to hydrologic simulations in this study. The VIC-3L model is a grid-based land-surface processes scheme that considers the dynamic changes of both water and energy balances. Its vertical soil column is composed of three layers, which include a top thin layer to represent quick evaporation and moisture response of the surface soil to small rainfall events, an upper layer to represent the dynamic response of the soil moisture to storm events, and a lower layer to characterize the seasonal soil moisture behavior [Liang et al., 1996]. One distinguishing characteristic of the VIC-3L model is that it uses a spatially varying infiltration capacity originated from the Xinanjiang model [Zhao et al., 1980] to represent subgrid-scale heterogeneities in soil, topography, and vegetation properties and hence in moisture storage, evaporation, and runoff generation. The VIC-3L model has been successfully applied in hydrologic simulation and prediction over many river basins (Nijssen et al. [2001], Maurer et al. [2002], Su et al. [2005], Wood and Lettenmaier [2006], and Su et al. [2008], among many others).

[11] In this study, the VIC-3L model was run at a 0.0625° × 0.0625° spatial resolution from Jan. 1990 through Dec. 2005. Surface and subsurface runoff are routed by an offline horizontal routing model [Lohmann et al., 1996, 1998] to produce model-simulated streamflow at the outlet of the Laohahe basin. The forcing data for the VIC-3L model include precipitation, daily maximum and minimum temperature, and daily averaged wind speed. Observed daily precipitation data for 1990–2005 were recorded by the 53 rainfall gauges distributed within the Laohahe basin. Daily streamflow was used to validate the simulation results at the Xinglongpo hydrologic station located at the basin outlet (Figure 1). Daily maximum and minimum temperature and 10-m, daily averaged wind speed from 1990 to 2005 were collected from 10 meteorological stations as shown in Figure 1. Several other land surface data sets were also retrieved from local authorities and public sources. For example, the 30 arc-second global DEM (GTOPO30) from the U.S. Geological Survey (USGS) is often used to compute the topographic information for large-scale hydrological models [Yong et al., 2009; Li et al., 2009]. In this study, GTOPO30 was resampled to 0.0625° × 0.0625° resolution in order to generate the elevation data, flow direction, basin mask, and contributing area needed to run the VIC-3L model.

2.4. Validation Statistical Indices

[12] To quantitatively compare TMPA precipitation products against rain gauge observations, we used three different types of statistical measures including degree of agreement (defined below), error and bias, and contingency table statistics (Table 1). Degree of agreement is represented by the Pearson correlation coefficient (CC), which reflects the degree of linear correlation between satellite precipitation and gauge observations. In terms of error and bias, we considered four different validation statistical indices. The mean error (ME) simply measures the average difference between the remotely sensed estimates and observations, while the mean absolute error (MAE) represents the average magnitude of the error. Although the root mean square error (RMSE) also measures the average error magnitude, it gives greater weight to the larger errors relative to MAE. The relative bias (BIAS) describes the systematic bias of satellite precipitation estimates. For the contingency table statistics, we computed the probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) to examine the correspondence between the estimated and observed occurrence of rain events (see Wilks [2006] and Ebert et al. [2007] for a detailed explanation). In addition, we adopted BIAS and the Nash-Sutcliffe Coefficient of Efficiency (NSCE), two of the commonly used statistical criteria, to evaluate the hydrologic model performance. NSCE is an indicator of model fit between the simulated and observed streamflows.

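Table 1. List of the Validation Statistical Indices Used to Compare the TRMM-Based Precipitation Products and the Gauged Observations^a

| Statistical Index | Unit | Equation | Perfect Value |
| --- | --- | --- | --- |
| Correlation Coefficient (CC) | NA | $\mathrm{CC} = \dfrac{\sum_{i=1}^{n}(G_i-\bar{G})(S_i-\bar{S})}{\sqrt{\sum_{i=1}^{n}(G_i-\bar{G})^2}\,\sqrt{\sum_{i=1}^{n}(S_i-\bar{S})^2}}$ | 1 |
| Root Mean Squared Error (RMSE) | mm | $\mathrm{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(S_i-G_i)^2}$ | 0 |
| Mean Error (ME) | mm | $\mathrm{ME} = \dfrac{1}{n}\sum_{i=1}^{n}(S_i-G_i)$ | 0 |
| Mean Absolute Error (MAE) | mm | $\mathrm{MAE} = \dfrac{1}{n}\sum_{i=1}^{n}\lvert S_i-G_i\rvert$ | 0 |
| Relative Bias (BIAS) | % | $\mathrm{BIAS} = \dfrac{\sum_{i=1}^{n}(S_i-G_i)}{\sum_{i=1}^{n}G_i} \times 100\%$ | 0 |
| Probability of Detection (POD) | NA | $\mathrm{POD} = \dfrac{H}{H+M}$ | 1 |
| False Alarm Ratio (FAR) | NA | $\mathrm{FAR} = \dfrac{F}{H+F}$ | 0 |
| Critical Success Index (CSI) | NA | $\mathrm{CSI} = \dfrac{H}{H+M+F}$ | 1 |
| Nash-Sutcliffe Coefficient of Efficiency (NSCE) | NA | $\mathrm{NSCE} = 1 - \dfrac{\sum_{i=1}^{n}(Q_{oi}-Q_{si})^2}{\sum_{i=1}^{n}(Q_{oi}-\bar{Q}_o)^2}$ | 1 |

^a Notation: n, number of samples; S_i, satellite precipitation (e.g., 3B42RT and 3B42V6); G_i, gauged observation; $\bar{S}$ and $\bar{G}$, sample means of S_i and G_i; Q_si, simulated streamflow; Q_oi, observed or reconstructed streamflow; $\bar{Q}_o$, observed mean annual streamflow; H, observed rain correctly detected; M, observed rain not detected; F, rain detected but not observed (POD, FAR, and CSI; refer to Ebert et al. [2007] for a detailed explanation).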
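For readers who wish to reproduce the gauge-comparison indices of Table 1, a minimal sketch follows. The function names are hypothetical, and treating a day as rainy when the amount is greater than or equal to the threshold (rather than strictly greater) is an assumption; the contingency scores are undefined when no rain events occur in the sample.

```python
import numpy as np

def continuous_stats(sat, gauge):
    """CC, RMSE, ME, MAE, and relative BIAS (%) of satellite vs. gauge rainfall."""
    s = np.asarray(sat, dtype=float)
    g = np.asarray(gauge, dtype=float)
    diff = s - g
    return {
        "CC": np.corrcoef(s, g)[0, 1],
        "RMSE": np.sqrt(np.mean(diff ** 2)),
        "ME": np.mean(diff),
        "MAE": np.mean(np.abs(diff)),
        "BIAS": 100.0 * diff.sum() / g.sum(),
    }

def contingency_stats(sat, gauge, threshold=1.0):
    """POD, FAR, and CSI for rain/no-rain events at a given threshold (mm/day)."""
    s = np.asarray(sat, dtype=float) >= threshold   # assumption: >= defines a rain event
    g = np.asarray(gauge, dtype=float) >= threshold
    hits = np.sum(s & g)            # H: observed rain correctly detected
    misses = np.sum(~s & g)         # M: observed rain not detected
    false_alarms = np.sum(s & ~g)   # F: rain detected but not observed
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + misses + false_alarms),
    }
```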

3. Comparison of TMPA Products Against Gauge Observations

[13] First, the surface observations were subjected to a quality control assessment through visual inspection and systematic evaluation to detect and correct a few extreme outliers in the data record. When the observational data were manually entered, we plotted the time series of rainfall accumulation for each gauge. The corrupted data detected by the visual inspection were then flagged as invalid and corrected using the original data provided by local authorities and other data from public sources such as the Chinese Hydrology Almanac. To further examine possible mistakes still hidden in the processed data, we developed a program to systematically screen out the outliers for all gauges. For example, we set a threshold (e.g., 10) and flagged the abnormal data that were larger than 10 times the average value of 3 neighboring gauges for the same day. Flagged observations were removed and then replaced using Inverse Distance Weighting (IDW) interpolation.
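The automated screening step can be sketched as follows. This is an illustration rather than the program used in the study: the function name, the choice of the 3 nearest gauges as "neighboring" gauges, the `min_rain` floor (which prevents trivial amounts from being flagged when the neighbors are dry), and the IDW power of 2 are all assumptions.

```python
import numpy as np

def screen_and_fill(rain, xy, ratio=10.0, k=3, min_rain=1.0, power=2.0):
    """Flag gauge values exceeding `ratio` times the mean of the k nearest
    gauges on the same day, then replace them by IDW from unflagged gauges.

    rain: array (n_days, n_gauges) of daily rainfall (mm).
    xy:   array (n_gauges, 2) of planar gauge coordinates (e.g., km).
    """
    rain = np.array(rain, dtype=float)
    xy = np.asarray(xy, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)                # a gauge is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]     # indices of the k nearest gauges
    neigh_mean = rain[:, nearest].mean(axis=2)    # shape (n_days, n_gauges)
    # Assumption: values at or below min_rain are never treated as outliers.
    flags = (rain > ratio * neigh_mean) & (rain > min_rain)
    weights = 1.0 / dist ** power                 # zero on the diagonal (1/inf)
    filled = rain.copy()
    for day, g in zip(*np.nonzero(flags)):
        w = weights[g] * ~flags[day]              # skip other flagged gauges
        filled[day, g] = np.sum(w * rain[day]) / np.sum(w)
    return filled, flags
```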

[14] Afterward, the comparison of RT and V6 against observations was performed in two ways (i.e., grid-based and basin-wide). The grid-based comparison calculated the precipitation errors between the TMPA grids and gridded gauge accumulations for several sub-regions within the basin (refer to Figure 1), while the basin-wide comparison investigated the spatial distribution of computed error metrics.

3.1. Grid-Based Comparison

[15] To assess the skill of RT and V6 in detecting the amount and timing of rainy events, we directly compared the two TMPA precipitation products with collocated rain gauges within the basin (Figure 1). However, when the grid-scale TMPA products are directly compared with the point-scale gauge data, scale differences between different rainfall data sets will likely contribute to the evaluation errors. To reduce the scale errors, we only selected grid boxes that contained at least 2 gauges and then used the mean value of all gauges inside each grid box as the ground truth, as practiced by many previous studies [Adler et al., 2003; Nicholson et al., 2003; Chiu et al., 2006; Chokngamwong and Chiu, 2008; Li et al., 2009]. To further quantify the scale-induced errors, we employed a simple yet robust and adaptive approach, the Variance Reduction Factor (VRF) originally proposed by Rodríguez-Iturbe and Mejía [1974], to compute the uncertainties associated with the approximation of the true, areal rainfall from the average of point measurements. The VRF index reflects the accuracy of grid-average rainfall, and it mainly depends on gauge density, network configuration, and the spatial covariance function of rainfall [Villarini and Krajewski, 2007]. For the 16 selected grid boxes and the associated data set, the spatial correlation structure of the point rainfall process can be represented by a two-parameter exponential function, with a correlation distance (d0) and an exponent (s0). This approach is described in more detail in some previous studies [e.g., Morrissey et al., 1995; Krajewski et al., 2000, 2003; Ciach and Krajewski, 2006]. When the shape parameter s0 equals 0.8, the VRF values vary between 0.69% and 2.98% corresponding to the range of scale parameter d0 (between 10 km and 100 km) at daily, 0.25° resolutions. If d0 is held constant at 90 km and s0 is varied (0.6∼1), the VRF is between 0.65% and 1.15%. 
It is clear that the actual spatial variability of daily rainfall in our study basin is within a rational range. According to our results, we can expect that the errors yielded by the point-scale gauge data are less than 5% in our study cases. Such relatively small, scale-induced errors do not make the direct grid-based comparison lose its statistical meaning.
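To make the VRF calculation concrete, the sketch below evaluates it numerically for a given gauge layout. It assumes a correlation function of the form ρ(d) = exp[−(d/d0)^s0] with no nugget term, a second-order stationary rainfall field, and a grid box approximated by a finite set of points; the function name and these choices are illustrative and are not claimed to reproduce the values reported above.

```python
import numpy as np

def variance_reduction_factor(gauge_xy, cell_xy, d0, s0):
    """Variance of (gauge mean - areal mean), as a fraction of point variance.

    gauge_xy: array (n, 2) of gauge coordinates (km).
    cell_xy:  array (m, 2) of points discretizing the grid box (km).
    d0, s0:   correlation distance (km) and exponent of the assumed
              correlation function rho(d) = exp(-(d / d0) ** s0).
    """
    g = np.asarray(gauge_xy, dtype=float)
    c = np.asarray(cell_xy, dtype=float)

    def corr(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
        return np.exp(-(d / d0) ** s0)

    # gauge-gauge term - 2 * gauge-area term + area-area term
    return corr(g, g).mean() - 2.0 * corr(g, c).mean() + corr(c, c).mean()
```

With this formulation the VRF vanishes when the gauges sample every discretization point and shrinks as the correlation distance grows relative to the box size.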

[16] Precipitation estimates from V6 and RT in each selected grid box were compared against gauge observations, and the statistical indices including CC, RMSE, ME, MAE, and BIAS are summarized in Table 2. Recall that the different grid boxes correspond to different elevations and latitude bands within the basin (refer to Figure 1). Comparisons suggest that V6 performed better than RT at every selected grid location at both daily and monthly scales, with a significantly improved correlation and reduced bias ratio. Both the daily and monthly validation indices of the two satellite products seemed to correlate with the geo-topographic location of the grids. For example, grids with worse performance are generally located to the northwest and grids with better performance to the southeast. The grid box with the worst statistical scores (0201) was located at the highest latitude and elevation, while the best grid (0211) was situated at the lowest elevation and furthest to the southeast (Tables 2 and 3 and Figure 1).

Table 2. Statistical Summary of the Comparison of Grid-Based Daily Precipitation Estimates Between 3B42V6 and 3B42RT in Laohahe Basin
Grid Number | RT Versus Gauge (Daily Rainfall) | V6 Versus Gauge (Daily Rainfall)

Table 3. Same as Table 2 but for Monthly Time Series of Precipitation Estimates
Grid Number | RT Versus Gauge (Monthly Rainfall) | V6 Versus Gauge (Monthly Rainfall)

[17] Plots of daily and monthly estimates of RT and V6 versus gauge observations for the 16 selected grids are shown in Figure 2. As many previous studies suggested [e.g., Dai, 2006, 2007; Tian et al., 2007], we also used the common threshold of 1.0 mm day−1 for computing the daily contingency table statistics (i.e., POD, FAR, and CSI). The daily scatterplots show that RT and V6 systematically overestimated gauge observations by about 81.27% and 19.19% with correlation coefficients of 0.41 and 0.58, respectively (Figures 2a and 2b). The daily RMSE, ME, and MAE of RT are significantly higher than those of V6. Similarly, the monthly scatterplots (Figures 2c and 2d) also show that V6 largely outperformed RT with higher correlation (0.93 versus 0.65) and lower error (e.g., 6.53 versus 30.20 mm for ME). The monthly gauge adjustments applied to the RT product improved the estimation of precipitation intensity of the final V6 research product at the monthly time scale more so than at the daily scale. However, the improvements in the contingency table statistics are not large. For example, at daily time scales, the FAR of V6 (0.45) is lower than that of RT (0.62) but they have similar POD values, which leads to a slightly higher CSI of V6 (0.41) relative to that of RT (0.30). This suggests that the bias-correction method effectively reduced the FAR but failed to improve the POD due to the limitation of the correction method that applies the monthly accumulated bias (either positive or negative) back to 3-h V6 data. A limiting result of this correction method is that improvements can be realized in either the POD or the FAR of the V6 product, but not both.

Figure 2.

Scatterplots of the grid-based precipitation comparison at the 16 selected grids between (a) daily 3B42RT and gauge for Feb. 1, 2002–Dec. 31, 2005, (b) daily 3B42V6 and gauge for Mar. 1, 2000–Dec. 31, 2005, (c) monthly 3B42RT and gauge for Feb. 2002–Dec. 2005, and (d) monthly 3B42V6 and gauge for Mar. 2000–Dec. 2005. In computing POD, FAR, and CSI, a threshold value of 1 mm day−1 was used.

[18] Figure 3 shows both RT and V6 underestimated at the lowest Precipitation Intensity (PI) range (less than 1 mm day−1) and overestimated at a higher PI range (1 to 5 mm day−1). Interestingly, RT is slightly in better agreement with the gauge data than V6 within the medium PI range of 5 to 30 mm day−1. As for intense rainfall with PI higher than 30 mm day−1, the precipitation occurrence frequency of RT is as high as 4.5%, which is approximately two times greater than the gauge observations and 25% higher than V6. This result along with the large positive bias with RT shown in Figure 2a indicates the overestimation primarily occurs with intense rainfall (PI > 30 mm day−1). Similar overestimation results have been found in other studies [e.g., Su et al., 2008; Li et al., 2009]. Although the error of V6 is much smaller than RT due to GPCC gauge-based bias correction at monthly scale, the overestimation of V6 suggests current GPCP and CPC merged products might also overestimate in this basin. In section 4, we will model and analyze the hydrologic response to such errors introduced in the RT and V6 rainfall forcing.
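The occurrence frequencies discussed above can be computed along the following lines. The class boundaries (1, 5, and 30 mm day−1) are taken from the intensity ranges mentioned in the text and are an assumption about the binning; Figure 3 may use finer classes. The function name is hypothetical.

```python
import numpy as np

def occurrence_frequency(daily_rain, thresholds=(1.0, 5.0, 30.0)):
    """Percentage of daily values in each intensity class.

    With the default thresholds the classes are
    [0, 1), [1, 5), [5, 30), and >= 30 mm/day.
    """
    rain = np.asarray(daily_rain, dtype=float)
    classes = np.digitize(rain, thresholds)   # class index 0..len(thresholds)
    counts = np.bincount(classes, minlength=len(thresholds) + 1)
    return 100.0 * counts / rain.size
```

Applying the same function to the gauge, RT, and V6 series at the selected grids gives directly comparable frequency distributions.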

Figure 3.

Precipitation occurrence frequency (%) of gauge observations, 3B42RT, and 3B42V6 at the 16 selected grids as a function of precipitation intensity (mm day−1), respectively.

[19] Figure 4a depicts the time series of monthly mean precipitation of the 16 selected 0.25° grids (combined) for the RT, V6, and Gauge rainfall products. Figure 4b shows that the largest values of absolute errors for V6 typically occur in the summer months (i.e., Jun., Jul., and Aug.). Such a significant warm-season-based error structure is primarily due to a majority of the rainfall occurring during the summer in the Laohahe basin. The error structure for RT reveals no clear seasonal dependence, and significant overestimation appears to be the main cause. Although rain gauges have errors themselves due to wind effects, unrepresentative sampling, instrument errors, etc., such unrealistic overestimation with RT is primarily attributed to the satellite retrieval algorithm itself. The studied basin is located at a high latitude belt beyond TRMM's observational domain and does not have the benefit of the TCI to calibrate the IR-based RT algorithm. Recall that the hydro-climatology of the Laohahe basin is a typical semiarid climate with an average winter-time temperature of −7.73°C and relative humidity of 45.51% (i.e., cold and dry). At these high latitudes, the serious overestimation of RT in winter suggests that the gap-filling IR/MW histogram match-based precipitation estimation has an inherent limitation in its strict probability-matching assumption that colder cloud top brightness temperatures correspond to higher precipitation rates.

Figure 4.

Temporal evaluation of average monthly precipitation of the 16 selected grids for gauge observation (Jan. 2000-Dec. 2005), 3B42RT (Feb. 2002-Dec. 2005), and 3B42V6 (Mar. 2000-Dec. 2005): (a) monthly precipitation time series, (b) mean absolute error, and (c) mean error.

3.2. Basin-wide Comparison

[20] In this analysis, we used two GIS interpolation techniques to generate continuous surfaces of precipitation over the entire basin. First, the 0.25° × 0.25° gridded TMPA products were interpolated to 0.0625°-resolution data sets for the Laohahe basin using a simple cropping approach [Hossain and Huffman, 2008]. Then the 53 gauge observations were interpolated onto the same grid using IDW. Statistics with the basin-averaged data are better than the results obtained with the grid-to-grid comparisons, as we would expect (Figure 5).
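A minimal sketch of the two regridding steps is given below. It assumes that the 0.25° to 0.0625° step simply replicates each coarse cell into 4 × 4 fine cells, and that IDW uses planar distances with a power of 2; both are assumptions for illustration, and the function names are hypothetical.

```python
import numpy as np

def refine_grid(coarse, factor=4):
    """Replicate each coarse cell into factor x factor fine cells
    (e.g., 0.25 deg -> 0.0625 deg when factor = 4)."""
    coarse = np.asarray(coarse, dtype=float)
    return np.repeat(np.repeat(coarse, factor, axis=-2), factor, axis=-1)

def idw_to_grid(gauge_xy, gauge_rain, grid_xy, power=2.0, eps=1e-9):
    """Inverse-distance-weighted rainfall at each grid point.

    gauge_xy: (n_gauges, 2); gauge_rain: (n_gauges,); grid_xy: (n_points, 2).
    """
    gauge_xy = np.asarray(gauge_xy, dtype=float)
    grid_xy = np.asarray(grid_xy, dtype=float)
    rain = np.asarray(gauge_rain, dtype=float)
    d = np.linalg.norm(grid_xy[:, None, :] - gauge_xy[None, :, :], axis=2)
    # eps avoids division by zero when a grid point coincides with a gauge.
    w = 1.0 / np.maximum(d, eps) ** power
    return (w * rain).sum(axis=1) / w.sum(axis=1)
```

Replication preserves the coarse-cell values exactly, so basin averages of the refined TMPA field equal those of the original 0.25° field over the same cells.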

Figure 5.

Same as Figure 2 but for basin-averaged precipitation.

[21] To investigate spatial error characteristics of the satellite precipitation estimates, we selected representative validation indices, CC, MAE, and POD from each statistical group. The mean absolute error (MAE) was chosen over BIAS since it is a more appropriate measure for average error. Willmott and Matsuura [2005] argued that the absolute error retains the difference in magnitude that would otherwise be reduced because positive and negative differences cancel each other to some degree. Figure 6 shows the spatial distributions of CC, MAE, and POD, which were calculated from the interpolated daily precipitation data sets of RT and V6 compared to gauge observations on the 0.0625° × 0.0625°-resolution grid. Similar to the statistical results shown in Figures 2 and 5, better CC values and lower MAEs were found for V6 than with RT (Figures 6a–6d). However, the spatial distributions of POD were found to be quite similar between RT and V6 (Figures 6e and 6f).

Figure 6.

Spatial distributions of statistical indices computed from the (left) 3B42RT and (right) 3B42V6 daily precipitation at 0.0625° × 0.0625° resolution over the Laohahe basin: (a and b) correlation coefficient, (c and d) mean absolute error, and (e and f) probability of detection.

[22] Interestingly, all six panels in Figure 6 show that both RT and V6 share similarities in their relative spatial performance; skill scores are generally the worst in the northwest (high latitude and elevation) and improve toward the southeast (lower latitude and elevation).

[23] To further reveal the dependence of performance on geo-topography, we plotted the three statistical indices (i.e., CC, MAE, and POD) with respect to latitude and elevation in Figure 7. It is apparent that the satellite precipitation estimates generally demonstrate a relatively poor performance at high latitudes and high elevations, while better results are obtained at low latitudes and elevations. This finding agrees with the limitations of state-of-the-art satellite-based precipitation estimation algorithms as discussed in continental-scale evaluations [Ebert et al., 2007; Tian et al., 2007] and in mountainous, high-elevation regions [Barros et al., 2006; Hong et al., 2007b]. In general, current satellite-based precipitation algorithms perform better in the tropics and increasingly worse over high latitudes and high elevations.

Figure 7.

Same as Figure 6 but with 3-D views of the performance indices as a function of latitude and elevation. Note that for MAE (Figures 7c and 7d), the axis directions of latitude and elevation are reversed for presentation purpose.

4. Evaluation of Hydrologic Predictions

[24] Quantitative evaluations of the two TMPA standard precipitation products against gauge observations suggest that 3B42V6 has great potential for hydrological modeling, even for the relatively high-latitude basin of 41°–42.75°N, at both daily and monthly scales. It is also of interest to assess whether the VIC-3L hydrologic model can tolerate the nonphysical behavior of 3B42RT through the hydrologic integration processes, and to determine the degree to which errors associated with rainfall forcing propagate into hydrologic predictions. In this section, we first calibrate and validate the VIC-3L hydrological model with observed precipitation (i.e., rain gauges) and discharges, and then simulate streamflow using RT and V6 as inputs in order to further investigate their hydrologic utility at high latitudes. The calibration period is 1990–1999 and the validation period is 2000–2005.

4.1. Model Calibration and Parameter Analysis

[25] Although most parameters related to soils, vegetation, and topography in the VIC-3L model can be directly estimated from the land surface database, several important parameters in the water balance components must be optimized in the model calibration process. These parameters are briefly described below: 1) the infiltration parameter (b) which controls the amount of water infiltrating into the soil; 2) the three soil layer thicknesses (d1, d2, d3) which affect the maximum storage available in the soil layers; 3) three base flow parameters which determine how quickly the water stored in the third layer is withdrawn, including the maximum velocity of base flow (Dm), the fraction of maximum base flow (Ds) and the fraction of maximum soil moisture (Ws) at which a nonlinear base flow begins [Su et al., 2005; Xie et al., 2007]. The two objective functions we optimized in the model calibration step were NSCE and BIAS in order to get the best match of model-simulated streamflow with observations. The VIC-3L model was first calibrated using daily discharge observations from 1990 to 1999 at the Xinlongpo hydrologic station at the basin outlet.

[26] Figure 8 shows simulated and observed streamflow forced by the IDW-interpolated rain gauge precipitation at (a) daily and (b) monthly time scales during the calibration period. For daily calibration, the simulated hydrograph has a good relative model efficiency of 0.73 with a positive bias of 18.31% (Figure 8a). Better results were obtained from calibration performed at the monthly scale, where NSCE increased to 0.85 while BIAS remained the same (Figure 8b). The model calibration demonstrates that VIC-3L was capable of capturing key features of the observed hydrograph (e.g., peak magnitude, recession, base flow) quite well at both daily and monthly time scales when forced by the rain gauge precipitation.
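As a minimal illustration of the two calibration objectives, the following Python sketch computes NSCE and BIAS for a daily series and for the same series aggregated to monthly totals. It assumes the conventional definitions (NSCE as one minus the ratio of squared errors to the variance of the observations, BIAS as the relative error of the total simulated volume); the function names and the short made-up series are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def nsce(obs, sim):
    """Nash-Sutcliffe coefficient of efficiency (1.0 is a perfect fit)."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    obs_variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance


def bias_percent(obs, sim):
    """Relative bias of the total simulated volume, in percent."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


def monthly_totals(months, values):
    """Sum daily values into monthly totals keyed by a (year, month) label."""
    totals = defaultdict(float)
    for label, value in zip(months, values):
        totals[label] += value
    return [totals[label] for label in sorted(totals)]


# Made-up daily runoff depths (mm) covering two months; illustrative only.
months = [(1990, 6)] * 4 + [(1990, 7)] * 4
obs = [1.0, 3.0, 2.0, 1.0, 4.0, 8.0, 5.0, 2.0]
sim = [1.5, 2.0, 3.0, 1.5, 3.0, 9.5, 6.0, 3.0]

obs_monthly = monthly_totals(months, obs)
sim_monthly = monthly_totals(months, sim)

daily_bias = bias_percent(obs, sim)
monthly_bias = bias_percent(obs_monthly, sim_monthly)
daily_nsce = nsce(obs, sim)
monthly_nsce = nsce(obs_monthly, sim_monthly)
```

Because BIAS depends only on total volumes, aggregating from days to months leaves it unchanged, whereas NSCE can change because day-to-day timing errors partly cancel within each month.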

Figure 8.

Observed and VIC-3L model simulated streamflow with the gauged precipitation for calibration period (1990–1999): (a) daily time series and (b) monthly time series.

[27] To further investigate the relative contribution of model parametric uncertainty to the overall runoff predictive uncertainty, we analyzed the sensitivity of the VIC-3L model parameters and identified the infiltration parameter (b), the depth of the second soil layer (d2), and two base flow parameters (Ds and Dm) as the most sensitive among the seven parameters used in VIC-3L. Figure 9 shows the sensitivity tests of these four parameters in terms of NSCE and BIAS at daily and monthly scales. The most sensitive parameters in the water balance components are b and d2. It is well known that b defines the shape of the variable infiltration capacity curve and thus determines the quantity of direct runoff generation. In the VIC-3L model, an increase of b means lower infiltration and higher surface runoff. d2 largely controls the maximum soil moisture storage capacity. Generally speaking, less runoff is generated as the depth of the second soil layer increases. For our study basin, a value of b between 0.008 and 0.011 and a value of d2 between 1.0 m and 2.0 m produced higher model efficiencies and lower relative errors (Figures 9a and 9b). Optimum values of NSCE were achieved with values of 0.01 and 1.2 m for b and d2, respectively. In Figure 9a, the parabolic shapes of the NSCE curves and the straight line for BIAS suggest that the errors in runoff exhibit a normal distribution with respect to the parameter b. The curves of NSCE and BIAS in Figure 9b show the most sensitive range for d2 is from 0.1 m to 2.0 m, which is commonly regarded as the typical parameter range of the second soil depth during calibration [Xie et al., 2007]. The two base flow parameters Ds and Dm show much less sensitivity within their typical ranges and, as such, require minor adjustment during the calibration process (Figures 9c and 9d).
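The role of b can be sketched with a single-cell version of the variable infiltration capacity curve. The Python function below assumes the commonly cited form of the curve, i = i_m[1 − (1 − A)^(1/b)] with i_m = (1 + b)·W_max, and ignores everything else in VIC-3L (multiple layers, vegetation, base flow, routing). The function name, argument names, and example values are hypothetical and are not the authors' implementation.

```python
def direct_runoff(precip, w0, w_max, b):
    """Direct runoff (mm) for one time step from an infiltration capacity curve.

    precip: rainfall depth in the time step (mm)
    w0:     soil moisture storage at the start of the step (mm), 0 <= w0 <= w_max
    w_max:  maximum soil moisture storage (mm)
    b:      shape parameter of the infiltration capacity curve
    """
    i_max = (1.0 + b) * w_max  # maximum point infiltration capacity
    # Point infiltration capacity that corresponds to the current storage w0.
    i0 = i_max * (1.0 - (1.0 - w0 / w_max) ** (1.0 / (1.0 + b)))
    if i0 + precip >= i_max:
        # The whole cell saturates: rainfall beyond the free storage runs off.
        return precip - (w_max - w0)
    unsaturated = (1.0 - (i0 + precip) / i_max) ** (1.0 + b)
    return precip - (w_max - w0) + w_max * unsaturated


# Same storm and antecedent storage, different shape parameters (illustrative).
runoff_small_b = direct_runoff(precip=10.0, w0=50.0, w_max=100.0, b=0.01)
runoff_large_b = direct_runoff(precip=10.0, w0=50.0, w_max=100.0, b=0.3)
```

With identical rainfall and antecedent storage, the larger b yields more direct runoff, which is the behavior described in the text. A deeper second layer corresponds to a larger w_max in this sketch and therefore to less runoff.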

Figure 9.

Sensitivity testing of four important parameters of the VIC model in terms of NSCE and BIAS at daily and monthly time steps: (a) b, (b) d2, (c) Ds, and (d) Dm. For relative error, the BIAS of the daily tests equals that of the monthly tests.

4.2. Effects of Human Activities on Streamflow During the Validation Period

[28] Next, we used the calibrated VIC-3L to validate streamflow for 2000–2005 without any further adjustment of the parameters. Unfortunately, the goodness of model fit that was obtained during the calibration period is not observed in the validation period, as shown in Figure 10. This result suggests either unrepresentative parameter settings or perhaps a change of hydrologic regime after 2000; i.e., the equivalent quantity of precipitation produced much less streamflow in the Laohahe basin in the validation period than in the calibration period. A number of studies have shown that annual streamflow can vary as a result of climate change, human activities, or both [Calder, 1993; Chiew and McMahon, 2002; Brown et al., 2005; Mu et al., 2007; Ma et al., 2008; Wang et al., 2010]. In Figure 10, there is a substantial decreasing tendency in the observed streamflow time series, while there is no such obvious decreasing trend in the observed precipitation over the same period. So, it is natural to postulate that human activities, which were not explicitly accounted for in the VIC-3L hydrologic simulations, had an impact on the observed decrease in streamflow from 2000 to 2005.

Figure 10.

Observed and VIC-3L model simulated streamflow with the gauged precipitation for validation period (2000–2005).

[29] To address this question further, we compared observations of several important meteorological variables between the calibration and validation periods. There is no doubt that the variation in river runoff depends upon various climatic factors, such as precipitation, evapotranspiration, temperature, wind speed, etc. Table 4 shows that annual average precipitation decreased from 472.5 mm in the calibration period to 404.4 mm in the validation period, a 14% decrease. Average potential evaporation showed a slight increase of 9.45% between the two periods. Smaller changes were noted in other meteorological variables such as annual average temperature (−1.40%), wind speed (0.48%), daily maximum temperature (7.24%), and daily minimum temperature (−3.96%). However, annual average streamflow decreased significantly from 43.8 mm to 12.6 mm, a 71.18% drop. Given a stationary hydrologic regime, a slight change in meteorological input variables (i.e., precipitation, temperature, wind speed, etc.) is unlikely to lead to such a dramatic drop in discharge. Therefore, human activities (e.g., land use change and water consumption) appear to be the most likely culprits contributing to the significant reduction in streamflow during 2000–2005 in the Laohahe basin.

Table 4. Annual Average Observations of the Main Hydro-meteorological Elements for the Laohahe Basin During Calibration and Validation Periods^a

| Phase | P (mm) | Ep (mm) | Ws (m/s) | T (°C) | Tmax (°C) | Tmin (°C) | Qo (mm) |
|---|---|---|---|---|---|---|---|
| Calibration period (1990–1999) | 472.47 | 874.22 | 2.10 | 7.87 | 14.67 | 2.02 | 43.76 |
| Validation period (2000–2005) | 404.41 | 956.81 | 2.11 | 7.75 | 15.58 | 1.94 | 12.61 |
| Relative change | −14.41% | 9.45% | 0.48% | −1.40% | 7.24% | −3.96% | −71.18% |

^a Notation. P: annual average precipitation; Ep: annual average potential evaporation, which was estimated from nine pan evaporation stations distributed within the Laohahe basin; Ws: annual average 10-m wind speed; T: annual average temperature; Tmax: annual average maximum temperature; Tmin: annual average minimum temperature; Qo: annual average streamflow.
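The arithmetic behind this comparison can be reproduced directly from the Table 4 values. The short Python sketch below recomputes the relative changes for precipitation, potential evaporation, and streamflow, and additionally derives the runoff coefficient (Qo/P) for each period. The runoff coefficient is not reported in the table and is added here only as an illustrative diagnostic; the variable and function names are hypothetical.

```python
# Annual averages taken from Table 4 (mm).
calibration = {"P": 472.47, "Ep": 874.22, "Qo": 43.76}
validation = {"P": 404.41, "Ep": 956.81, "Qo": 12.61}


def relative_change(before, after):
    """Relative change from the calibration to the validation period (%)."""
    return 100.0 * (after - before) / before


changes = {
    name: relative_change(calibration[name], validation[name])
    for name in calibration
}

# Fraction of annual precipitation that leaves the basin as streamflow.
runoff_coeff_cal = calibration["Qo"] / calibration["P"]
runoff_coeff_val = validation["Qo"] / validation["P"]

for name, value in changes.items():
    print(f"{name}: {value:+.2f}%")
print(f"runoff coefficient: {runoff_coeff_cal:.3f} -> {runoff_coeff_val:.3f}")
```

The runoff coefficient falls from roughly 0.09 to roughly 0.03, a far larger proportional change than the roughly 14% reduction in precipitation, which is consistent with the argument that the meteorological forcing alone is unlikely to explain the streamflow decline.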

[30] We conducted a field survey of local governmental agencies to infer the human impacts on streamflow in the Laohahe basin. Figure 11 illustrates some potentially major impacts of human activities in the Laohahe basin after 1999, such as newly built reservoirs and dams, increased water diversions for irrigation, and rapid development of water-consuming industries in towns and villages. Table 5 lists the construction and maintenance projects of the three largest reservoirs in the basin. Among them, the building of the San Zuodian reservoir, with a storage capacity of 3.05 × 10^8 m^3, directly resulted in the sharp drop in observed streamflow after 2003. Additionally, the rapid development of local economies increased the demand for surface water and groundwater. Vast amounts of water were drawn from river channels for cropland irrigation, industrial production, and municipal purposes within the studied basin. Figure 12 shows that the irrigated area, gross domestic product (GDP), population, and livestock have increased tremendously in the Laohahe basin over the last decade. For instance, a steep rise in irrigated area occurred in 1999 (Figure 12a), and the GDP grew rapidly after 2000 (Figure 12b). These human activities have collectively altered the natural hydrologic system and led to the surface runoff dry-out after 2004, even in the summer rainy season (Figures 10 and 11d).

Figure 11.

Impacts of human activities on river streamflow of the Laohahe basin since 2000: (a) reservoir and dam for water storage and power generation, (b) increased trend for agricultural irrigation, (c) development of new industries, and (d) dried up main channel of Laohahe River in the rainy season of summer.

Figure 12.

Changes of (a) irrigated area, (b) GDP, (c) population, and (d) livestock in the Laohahe basin during the period of 1990–2005.

Table 5. Building and Reinforcement Information of Large Reservoirs Within the Laohahe Basin After 1999

| Name of Reservoir | Latitude/Longitude of the Reservoir's Dam | Project Objective | Years | Storage Capacity (10^8 m^3) | Class |
|---|---|---|---|---|---|
| San Zuodian | (42.24°, 118.90°) | Newly Building | 2003–2005 | 3.05 | II |
| Erdao Hezi | (42.30°, 119.00°) | Newly Building | 2000–2001 | 0.8 | III |
| Da Hushi | (41.42°, 118.68°) | Maintenance and Reinforcement | 1999–2000 | 1.2 | II |

4.3. Hydrologic Evaluation of TMPA Products

[31] Because it is shown that natural hydrologic processes were tremendously altered by human impacts, the streamflow observations after 2000 cannot be used as a standard reference for evaluating TMPA's hydrologic utility. However, since the VIC-3L model has been benchmarked in the 1990s with much lower human impacts, we can confidently use the observed precipitation to reconstruct the natural streamflow that would have occurred in the validation period; i.e., the runoff that is influenced by climate factors alone and can be accounted for in the VIC-3L model [Nijssen et al., 2001; Wang et al., 2010]. The reconstructed streamflow is the simulated runoff during the validation period based on model parameter settings found in the calibration period with rainfall forcing from rain gauges. Next, we compare streamflow simulations forced by 3B42RT and 3B42V6 to the reconstructed streamflow, all of which use the same parameters optimized during the calibration period with respect to the rain gauges. Compared to reconstructed streamflow, RT largely overestimates discharge (327.14%) with a poor NSCE of −18.39 at a daily scale (Figure 13a), as anticipated from the section 3 analysis. A significant improvement is found in the V6-driven simulation, with only 12.78% overestimation and a relatively good NSCE score of 0.55. Similar results are obtained at the monthly scale but with low NSCE values (−14.63) for RT and a much improved NSCE (0.85) for V6 (Figure 13b).

Figure 13.

VIC-3L reconstructed streamflow with the gauge precipitation for reconstruction period (Jan. 2000-Dec. 2005) and VIC-3L validated streamflow with 3B42RT (Feb. 2002-Dec. 2005) and 3B42V6 (Mar. 2000-Dec. 2005): (a) daily time series and (b) monthly time series.

[32] Based on the above analysis of simulations during the validation period, it can be concluded that V6 performed much better than RT for hydrologic prediction. It is quite plausible, however, that different inputs (especially those with bias) might affect the model uncertainty itself and require a different set of parameters. Like other hydrologic models, VIC-3L is sensitive to the meteorological forcing data, particularly precipitation. If the forcing data of VIC-3L change, the sensitive soil parameters, such as the infiltration parameter b and the depth of the second soil layer d2, will change accordingly [Su et al., 2005]. It is not entirely apparent whether the differences between the simulations and the reconstructed runoff in Figure 13 are due to input errors, parametric uncertainty, or both. To address this ambiguity, we recalibrated the sensitive parameters of VIC-3L using rainfall forcing from both RT and V6 during the validation period. We used the same calibration procedure as in the previous sections to estimate the sensitive parameters, and then evaluated the simulations against the reconstructed streamflow. Figure 14 shows the recalibrated monthly streamflow with RT and V6 benchmarked by the reconstructed streamflow in the validation period, and Table 6 lists the validated and recalibrated values of the seven sensitive parameters in the VIC-3L model. As shown in Figure 14a, the recalibrated results of RT were still worse than those of V6, with an NSCE value of only 0.04, although the recalibration efforts resulted in reduced BIAS (3.85%). V6 successfully captured both the peaks and recession flows, with much higher model efficiency (0.91) than RT. Meanwhile, the adjusted model parameters for the V6-driven recalibration are all within their physically meaningful range (Table 6).

Figure 14.

Recalibrated monthly streamflow with 3B42RT (Feb. 2002-Dec. 2005) and 3B42V6 (Mar. 2000-Dec. 2005) benchmarked by the reconstructed streamflow with the gauge precipitation (a) recalibrated with RT and (b) recalibrated with V6.

Table 6. Comparison Results of Validated and Recalibrated Model Parameter Values for RT and V6
| Parameter | Unit | Typical Range | Validated Values for RT and V6 | Recalibrated Values for RT | Recalibrated Values for V6 |

[33] Even though better statistical scores were achieved by recalibrating VIC with RT forcing during the validation period, overfitting compromised the model's parameterized representation of physical processes. For example, the recalibrated infiltration parameter (b) is 0.005 (Table 6), which is lower than its typical calibration range (0.008–0.011) (Figure 9a). The other sensitive parameter, d2, had an optimized value of 6.0 m (Table 6), which greatly exceeded the upper limit of its normal physical range as recommended by Xie et al. [2007]. Taking the recalibration and validation results together, we believe that the errors in simulating streamflow forced by RT are mostly due to the unrealistically high precipitation estimates, as presented in the previous rainfall comparison section.

[34] In order to reveal how the satellite rainfall estimation error propagates through the VIC-3L rainfall-runoff processes, we compared the rainfall estimation error against gauge observations with the model-simulated streamflow error against station measurements at the daily scale in terms of five statistical indices (i.e., NSCE, BIAS, CC, RMSE, and MAE). Table 7 shows NSCE values of 0.42 for V6 rainfall data and 0.55 for the simulated streamflow. BIAS decreased slightly through the model, from 16.52% in the inputs to 12.78% in the outputs. So we can conclude that the VIC-3L model can tolerate the relatively small errors in the V6 rainfall inputs and generate streamflow with reduced bias through the integration of hydrologic processes. In contrast, when we used rainfall forcing from RT, the NSCE worsened from −0.99 to −18.39 from model inputs to outputs. The BIAS changed more significantly, increasing from 76.94% to 327.14%. Similar error inflation was also found using the other statistical indices (i.e., CC, RMSE, and MAE). This comparison between input and output errors of VIC-3L suggests that there is a nonlinear error propagation pattern, where the magnitude of bias is magnified from rainfall to runoff for the RT product by about 4–5 times. The hydrologic simulation system behaved in an unrealistic manner and generated high runoff errors using RT rainfall forcing, presumably because the RT error is beyond the tolerance level of the VIC-3L model. This nonlinear error propagation can be attributed to the nonlinear behavior of the infiltration-excess runoff generation process used in the VIC-3L model coupled with the unrealistically high rain rate values produced by RT (as shown in Figures 2a and 3). On the other hand, the error magnitude with V6 inputs is nonlinearly dampened in runoff errors, which can be attributed to VIC-3L not only tolerating but also reducing the error propagation to runoff through basin integration, given the relatively small and consistent bias in V6.
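The amplification and damping described here can be checked with the BIAS values quoted in the text. The following Python sketch simply forms the ratio of output (streamflow) BIAS to input (rainfall) BIAS for each product; the term "amplification factor" and the variable names are introduced only for this illustration.

```python
# BIAS (%) of the rainfall input and of the simulated daily streamflow,
# as quoted in the text for the two products.
bias = {
    "3B42RT": {"rainfall": 76.94, "streamflow": 327.14},
    "3B42V6": {"rainfall": 16.52, "streamflow": 12.78},
}


def amplification(entry):
    """Ratio of streamflow BIAS to rainfall BIAS (>1 amplified, <1 damped)."""
    return entry["streamflow"] / entry["rainfall"]


factors = {name: amplification(entry) for name, entry in bias.items()}

for name, factor in factors.items():
    print(f"{name}: output/input BIAS ratio = {factor:.2f}")
```

A ratio above 1 for RT and below 1 for V6 corresponds to the amplified and dampened error propagation discussed above.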

Table 7. Error Propagation Through VIC-3L Rainfall-Runoff Processes Over the Laohahe Basin at Daily Scale^a

| Item | RT 3B42 | V6 3B42 |

^a For comparing the error propagation, the same calculation unit (i.e., mm) as that of rainfall was used for the daily streamflow. Its physical meaning is the simulated runoff depth over the entire basin. RMSEr and MAEr represent the relative RMSE and MAE, expressed as error percentages of the basin-averaged values of rainfall and streamflow.


[35] In summary, both the validation and recalibration results suggest that the RT data have low hydrologic utility for our study basin, even at the monthly scale. V6 captured most of the flood peaks in the daily simulation and performed best in the monthly simulation, with only slight differences between the results obtained with the parameter set calibrated using rain gauge forcing and those obtained with the parameter set recalibrated using V6 forcing. The V6 data can therefore be used in decision-making for long-term water planning, daily reservoir operations, and flood risk management in this area.

5. Summary and Discussion

[36] As the standard TRMM-era satellite rainfall product, the TMPA provides not only near-real-time 3B42RT data but also a post-real-time, research-quality 3B42V6 rainfall data set, which has proven to be highly valuable for the investigation of quasi-global atmospheric processes, weather, and hazardous events such as floods and landslides [Huffman et al., 2007; Hong et al., 2007a, 2007b]. This study first provided a quantitative assessment of RT and V6 against a relatively dense surface rain gauge observation network in the Laohahe basin, beyond the latitude band of the TMPA calibrator, the TCI (including PR and TMI). Direct intercomparison of Laohahe rain gauge observations and TMPA precipitation estimates from 2000 to 2005 showed that spatial and temporal rainfall characteristics over the region are generally well captured by the V6 estimates. However, systematic bias structures were found in RT, which need to be addressed in future upgrades to the RT estimation algorithm. Afterward, we evaluated the utility of both products in hydrologic runoff simulations using the VIC-3L model. The principal findings from this study are summarized as follows.

[37] 1. The spatial distributions of error structures over the Laohahe basin suggest that both RT and V6 satellite precipitation estimates have a geotopography-dependent pattern, closely associated with latitude and elevation. In general, better performance was observed for both RT and V6 in areas at lower latitudes and lower elevations. The worst performance was in the northwest (highest latitude and elevation), and results improved progressively leeward toward the southeast (lower latitude and lower elevation). The significant geotopography-dependent error patterns in both 3B42RT and 3B42V6 suggest the limitations of current satellite-based algorithms at high latitudes and high elevations in general. This finding is consistent with continental-scale evaluations and with studies in mountainous, high-elevation regions; current satellite-based precipitation algorithms perform better within tropical bands and become increasingly inadequate over regions at higher latitudes and/or higher elevations [Ebert et al., 2007].

[38] 2. V6 has better agreement with gauge observations than RT at both daily and monthly scales. Particularly, monthly gauge adjustments for V6 seemed to have greatly improved the performance of this product at daily time scale even though the monthly adjustment factor is simply scaled from monthly back to 3-h resolution. RT has a large positive bias relative to gauge observations, which is predominant at heavy rainfall rates (PI > 30 mm/day).

[39] 3. The RT and V6 production systems are designed to be as similar as possible to ensure consistency between the resulting standard data sets. However, currently there are no feasible ways to overcome the differences both in calibration source (TMI versus TCI) and in use of gauge data because the PR calibration source and gauge data do not have impacts for real-time RT production. Therefore, the unrealistic overestimation of RT against gauge observations almost year-round affirms the call for a climatologic adjustment to minimize such biases between RT and V6 [Huffman et al., 2009].

[40] 4. Comparisons of the error propagation from TRMM standard rainfall products to streamflow predictions revealed that the VIC-3L model behaves nonlinearly. Furthermore, the model can tolerate relatively small errors through the basin-wide integration process; however, once the input error increases to a certain degree beyond the VIC-3L's tolerance level, the model behaves in an unrealistic manner and generates amplified errors in output. In summary, the VIC-3L hydrologic model cannot tolerate the unphysical bias in RT through the hydrologic integration process, and the large errors associated with the rainfall inputs propagate into hydrologic predictions with amplified errors. In contrast, the V6-derived hydrologic simulations performed rather well since its error was tolerated and dampened through the VIC model's integration process.

[41] 5. Quantitative evaluations of the two TMPA standard precipitation products against gauge observations suggest that V6 has a great potential for hydrologic modeling even for the basins situated at the high latitudes of 41°–42.75°N. Our analysis further demonstrated that the V6 data can produce much better hydrologic modeling results at both daily and monthly scales than the purely satellite-derived RT precipitation product in the Laohahe basin. Therefore, we recommend V6 for both hydrologic modeling and water resources management in data-sparse regions, while caution should be exercised when the RT product is applied for daily streamflow prediction, especially in higher-latitude basins beyond the TRMM inclined orbits (36°NS).

[42] Given the coverage limitation of rain gauge measurements, the current TRMM-era satellite precipitation estimates, once confidently validated, could significantly contribute to improved understanding of hydrologic processes at quasi-global scale (i.e., within 50°NS latitude bands). Although microwave data have a strong physical relationship to the hydrometeors that result in surface precipitation, their limited space-time sampling still calls for the GEO-IR-based estimates to have a role in the GPM era in filling the inevitable gaps in microwave coverage, as well as in enabling sub-hourly (i.e., 15–30 min) and 4-km resolution precipitation estimates. It is highly advantageous to include surface observations (e.g., rain gauges) in generating research-quality precipitation data sets, although this introduces a delay, as shown in current TMPA products. Furthermore, this study affirms the call for a real-time bias correction using seasonally derived scaling factors on the basis of a simple ratio of long-term MW-based monthly climatology to the combined monthly satellite-gauge research products. Since Feb 17, 2009, the real-time TMPA-RT has begun to be upgraded with a climatological calibration (ftp://trmmopen.gsfc.nasa.gov/pub/merged/3B4XRT_doc.pdf). Huffman et al. [2009] suggest that more work is needed to validate the effectiveness of these upgrades and to further evaluate their hydrologic utility.
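The idea of a simple climatological scaling can be sketched as follows. This is not the operational TMPA-RT calibration: it is a minimal illustration that assumes one multiplicative factor per calendar month, defined here as the ratio of the long-term mean of a gauge-adjusted reference product to the long-term mean of the real-time product, so that multiplying a real-time estimate by the factor removes the mean monthly bias. All function names and the sample totals are hypothetical.

```python
def monthly_scaling_factors(rt_monthly, ref_monthly):
    """Climatological scaling factor for each calendar month.

    rt_monthly, ref_monthly: dicts mapping a calendar month (1-12) to a list
    of monthly precipitation totals (mm), one entry per year of record.
    """
    factors = {}
    for month, rt_totals in rt_monthly.items():
        rt_clim = sum(rt_totals) / len(rt_totals)
        ref_totals = ref_monthly[month]
        ref_clim = sum(ref_totals) / len(ref_totals)
        # Leave the estimate unchanged when the real-time climatology is zero.
        factors[month] = ref_clim / rt_clim if rt_clim > 0 else 1.0
    return factors


def correct(rt_value, month, factors):
    """Apply the climatological factor of the given month to one estimate."""
    return rt_value * factors[month]


# Made-up monthly totals (mm) for two years; illustrative only.
rt_monthly = {1: [4.0, 6.0], 7: [200.0, 240.0]}
ref_monthly = {1: [5.0, 5.0], 7: [120.0, 100.0]}

factors = monthly_scaling_factors(rt_monthly, ref_monthly)
corrected_july_day = correct(30.0, 7, factors)
```

A fixed monthly factor can remove systematic overestimation of the kind reported for RT, but it cannot correct event-scale errors or missed rainfall, which is one reason further validation of such upgrades is needed.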


[43] The funding for this research and the gauged rainfall and streamflow observations used in this paper were supplied by the National Key Basic Research Program of China (grant 2006CB400502) and the National Science Foundation for Young Scientists of China (grant 40901017). This work was financially supported by the 111 Project (grant B08048), the Key Project of Chinese Ministry of Education (grant 308012), the Program for Changjiang Scholars and Innovative Research Team in University (grant IRT0717), and the Independent Innovation Project of State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (grant 2009586512). The authors also acknowledge the funding support granted by NASA Headquarters (grant NNX08AM57G) and the computational facility provided by the Center for Natural Hazard and Disaster Research, National Weather Center, at the University of Oklahoma Research Campus. Additionally, the authors would like to thank three anonymous reviewers for their comments on an earlier version of this paper. Last but not least, we wish to extend our appreciation to Fengge Su and Li Li for their helpful suggestions for this work.