Water Resources Research

How much improvement can precipitation data fusion achieve with a Multiscale Kalman Smoother-based framework?


Abstract

[1] With advances in measurement techniques and modeling approaches, a growing number of precipitation data products, with different spatial resolutions and accuracies, have become available. There is therefore an increasing need to produce a fused precipitation product that takes advantage of the strengths of each individual product. This study systematically and quantitatively evaluates the improvements in fused precipitation data obtained with the Multiscale Kalman Smoother-based (MKS-based) framework. The impacts of two types of errors associated with individual precipitation products, white noise and bias, are investigated through hypothetical experiments. Two measures, correlation and root-mean-square error, are used to evaluate the improvements of the fused precipitation data. Our study shows that the MKS-based framework can significantly recover the loss of precipitation's spatial patterns and magnitudes caused by white noise and bias when erroneous data at different spatial scales are fused. Although erroneous data at a finer resolution are generally more effective in improving the spatial patterns and magnitudes of erroneous data at a coarser resolution, data at a coarser resolution can also provide valuable information for improving the quality of the data at a finer resolution when they are fused. This study provides insights into the value of the MKS-based framework and a guideline for determining a potentially optimal spatial scale over which improvements in both the spatial patterns and the magnitudes can be maximized, given data with different spatial resolutions.

1. Introduction

[2] Precipitation plays an important role in land surface models (LSMs). It affects hydrological processes and the water and energy fluxes. Improvements in the quality of precipitation data can significantly improve land surface model simulations of soil moisture, evapotranspiration, runoff, and other water and energy fluxes.

[3] There are three typical ways to measure precipitation: rain gauges, radar, and satellites. Each has its strengths and weaknesses in terms of accuracy, resolution, and coverage. Rain gauges are the most accurate at a point or local scale but are poor at capturing spatial patterns over a large area. The ground-based Next Generation Radar (NEXRAD) network in the U.S. provides precipitation measurements with good spatial coverage at a much higher spatial resolution. However, the magnitudes of radar precipitation are criticized for systematic bias and random errors [Krajewski et al., 2010; Seo et al., 1999; Smith et al., 1996]. In recent decades, satellite-borne infrared and microwave imagers have made it possible to measure precipitation at a global scale, at coarser spatial resolutions than that of the NEXRAD network. Satellite-derived precipitation data products, with larger spatial coverage but lower spatial resolutions than the NEXRAD network, also suffer from bias and noise [Anagnostou et al., 2001; Grimes et al., 1999].

[4] Data fusion is an effective approach to deriving higher-quality precipitation data products by combining multiple sources of precipitation measurements. For example, the Multisensor Precipitation Estimator (MPE) NEXRAD precipitation data are based on the NEXRAD Stage II and rain gauge precipitation data. The NLDAS precipitation data are combinations of daily reanalysis precipitation data with the NEXRAD Stage II precipitation data or the Eta model predicted precipitation data [Cosgrove et al., 2003]. The PERSIANN system combines multiple precipitation measurements, such as the Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) and Geosynchronous Satellite Longwave Infrared Imagery (GOES-IR) precipitation, using artificial neural networks [Sorooshian et al., 2000]. Among the data fusion algorithms employed, the Kalman filter and its derived algorithms are widely used, such as the direct application of the Kalman filter [Seo, 1998a, 1998b] and the scale recursive regression. The latter is essentially a Multiscale Kalman Smoother (MKS) [Willsky, 2002].

[5] The MKS algorithm was originally proposed to process digital signals and images at multiple spatial resolutions [Chou et al., 1994]. It has been extensively used in a variety of applications together with the Expectation-Maximization (EM) algorithm for optimal parameter estimation [Kannan et al., 2000], such as signal and image processing [Farina et al., 2001; Nounou, 2006; Simone et al., 2000], precipitation data fusion [Bocchiola, 2007; Bocchiola and Rosso, 2006; de Vyver and Roulin, 2009; Gorenburg et al., 2001; Tustison et al., 2002], data assimilation for soil moisture [Kumar, 1999; Parada and Liang, 2004, 2008], atmospheric variables [Zhou et al., 2008], and altimetry data fusion [Slatton et al., 2001, 2002]. The MKS algorithm can be used flexibly in either the time or the space domain. For example, the scale denotes the temporal resolution when precipitation measurements associated with different temporal resolutions are fused [Bocchiola and Rosso, 2006]. Similarly, the scale denotes the spatial resolution when precipitation measurements associated with different spatial resolutions are fused [Bocchiola, 2007; de Vyver and Roulin, 2009; Gorenburg et al., 2001; Gupta et al., 2006; Tustison et al., 2002]. Through the MKS algorithm, precipitation measured at different temporal and spatial resolutions with different accuracies can be combined to produce higher-quality precipitation data products. Until now, the effectiveness of the MKS algorithm has only been evaluated with individual precipitation images [Bocchiola, 2007; de Vyver and Roulin, 2009; Gorenburg et al., 2001; Gupta et al., 2006; Tustison et al., 2002]; systematic evaluations of this type of data fusion approach with large precipitation data sets have not yet been conducted.

[6] In this study, we use an extended MKS-based approach to first fuse two precipitation data sources at a coarser and a finer spatial resolution and then statistically investigate the improvements achieved through precipitation fusion. Correlation and root-mean-square error (RMSE) are applied as metrics of improvements. Particularly, we investigate two types of errors, white noise and bias. Evaluations are conducted based on hypothetical experiments using real precipitation data.

[7] We briefly describe the MKS-based framework in section 2. Then, we present the data and experiment area in section 3. Experiment designs and results are described and discussed in section 4. Finally, we summarize the findings of this study in section 5.

2. Multiscale Kalman Smoother–Based Framework

[8] The MKS-based framework combines an extended MKS algorithm and a parameter estimation scheme [Parada and Liang, 2004]. Figure 1 depicts a multiscale tree with three spatial scales representing multiscale hidden states (i.e., fused precipitation). The MKS algorithm includes an upward sweep and a downward sweep. The former is a fine-to-coarse Kalman filtering step from the leaf nodes to the root node and the latter is a coarse-to-fine Kalman smoothing step from the root node to the leaf nodes. Both sweeps are along the multiscale tree. The dynamic equations for fusing multiscale precipitation are expressed as

$$X(t) = A(t)\,X(t\bar{\gamma}) + w(t) \tag{1}$$
$$P(t) = A^{2}(t)\,P(t\bar{\gamma}) + Q(t) \tag{2}$$

where t represents a node in the multiscale tree, $t\bar{\gamma}$ represents the coarse scale node containing node t (i.e., its parent node), X(t) and $X(t\bar{\gamma})$ represent the hidden states (e.g., fused precipitation in this case) at a child node t and its parent node $t\bar{\gamma}$, respectively, w(t) is the added detail at the child node following N[0, Q(t)], Q(t) is the error variance of w(t), P(t) and $P(t\bar{\gamma})$ are the error variances of X(t) and $X(t\bar{\gamma})$, and A(t) is a transition operator mapping precipitation from a parent node to a child node. Given the prior estimate of precipitation at the root node, $\bar{X}(0)$, and its error variance, P(0), the prior estimates of precipitation and associated error variances at the remaining nodes of the multiscale tree can be inferred based on equations (1) and (2). We refer to this step as the initialization step. After this step, the upward sweep can be carried out using the inverted forms of equations (1) and (2) together with a measurement equation expressed as follows:

equation image

where Y(t) represents the measurement (e.g., precipitation) of node t, C(t) is a transition operator mapping precipitation to the measurement, v(t) is the measurement noise following N[0,R(t)], R(t) is the error variance of v(t), and D(t) is a bias compensator which is calculated as:

equation image

where S is the scale of node t, equation image is the mean of the measurements at scale S, and equation image is the mean of the root node. D(t) is introduced in the observation equation by [Parada and Liang, 2004] to minimize impacts of the inconsistency (e.g., bias) between measurements at different scales on the fused precipitation. Adding this term enables the MKS-based framework to achieve the same estimated mean of the fused precipitation at all scales. For example, if the precipitation measurements at different scales have different means, then the average of these means (i.e., the mean of the means) can be chosen, a case employed in this study, as the estimate of equation image.
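As a reading aid, one form of the measurement equation and the bias compensator that is consistent with the description above is sketched below. It is an assumption rather than a transcription of the published equations (3) and (4): the additive placement of D(t) and the symbols $\bar{Y}_{S}$ (mean of the measurements at scale S) and $\bar{X}(0)$ (mean of the root node) are placeholders chosen for illustration.

```latex
% Assumed form of the measurement equation with an additive bias compensator
Y(t) = C(t)\,X(t) + D(t) + v(t), \qquad v(t) \sim N[0, R(t)]

% Assumed form of the compensator: the offset between the mean of the
% measurements at scale S and the (mapped) mean of the root node
D(t) = \bar{Y}_{S} - C(t)\,\bar{X}(0)
```

Under this assumed form, taking expectations of the first line with A(t) = 1 gives a fused mean equal to $\bar{X}(0)$ at every scale, which matches the statement that the compensator yields the same estimated mean at all scales.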

Figure 1.

A schematic of a 2-D multiscale tree with three different spatial scales, 0, 1, and 2. For node t at scale 1, $t\bar{\gamma}$ represents its parent node and $t\alpha_n$ (n = 1, 2, 3, 4) represents its child nodes. Without a parent, the node at scale 0 (i.e., the coarsest resolution) is called a root node; without any children, the nodes at scale 2 (i.e., the finest resolution) are called leaf nodes.

[9] The upward sweep includes three operations: (1) fine-to-coarse prediction, (2) prediction merging, and (3) observation update. In operation 1, the fine-to-coarse predictions of X(t) are derived from the updated states of its child nodes. In operation 2, the multiple fine-to-coarse predictions of X(t) are fused into a merged prediction of X(t). In operation 3, the merged prediction of X(t) is updated by Y(t). These operations are conducted at all nodes of the multiscale tree, from the leaves to the root. If no measurement exists at node t, the updated prediction of X(t) simply takes the value of the merged prediction. Via the upward sweep, the finer resolution data add their influences to the estimates of the hidden states at coarser resolutions.

[10] The downward sweep follows the upward sweep starting at the root node and moving toward the leaves of the multiscale tree. It refines the estimates of the hidden states further through a scale-recursive Kalman smoothing step. As a result, coarser resolution data add their influences to the estimates of the hidden states at finer resolutions. By means of the upward and the downward sweeps, information in multiscale measurements will propagate to all the nodes of the multiscale tree collectively. For more details about our extended MKS-based framework, please refer to [Parada and Liang, 2004].
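The two sweeps can be illustrated on the smallest possible tree. The sketch below is not the authors' implementation: it assumes a single root cell with a few leaf cells, scalar states, A(t) = C(t) = 1, no bias compensator D(t), and known variances. The function name `mks_two_scale` and all argument names are hypothetical.

```python
def mks_two_scale(y_root, y_leaves, m0, p0, q, r_root, r_leaf):
    """Fuse one coarse cell (root) with its fine cells (leaves), A(t) = C(t) = 1.

    m0, p0: prior mean and variance at the root; q: variance of w(t);
    r_root, r_leaf: measurement error variances at the two scales.
    Returns ((x_root, p_root), [(x_leaf, p_leaf), ...]) after smoothing.
    """
    p_prior_leaf = p0 + q            # initialization: P(t) = P(parent) + Q
    f = p0 / p_prior_leaf            # fine-to-coarse gain of the upward model
    q_up = p0 * q / p_prior_leaf     # noise variance of the upward model

    # Upward sweep at the leaves: observation update, then fine-to-coarse prediction.
    updated, predicted = [], []
    for y in y_leaves:
        k = p_prior_leaf / (p_prior_leaf + r_leaf)
        x_u = m0 + k * (y - m0)
        p_u = (1.0 - k) * p_prior_leaf
        x_p = m0 + f * (x_u - m0)            # prediction of the root from this leaf
        p_p = f * f * p_u + q_up
        updated.append((x_u, p_u))
        predicted.append((x_p, p_p))

    # Prediction merging: combine the leaf-based predictions in information form,
    # removing the prior information that would otherwise be counted n times.
    n = len(y_leaves)
    p_m = 1.0 / ((1 - n) / p0 + sum(1.0 / p for _, p in predicted))
    x_m = m0 + p_m * sum((x - m0) / p for x, p in predicted)

    # Observation update at the root with the coarse measurement.
    k0 = p_m / (p_m + r_root)
    x_root = x_m + k0 * (y_root - x_m)
    p_root = (1.0 - k0) * p_m

    # Downward sweep: coarse-to-fine smoothing of every leaf.
    smoothed = []
    for (x_u, p_u), (x_p, p_p) in zip(updated, predicted):
        j = p_u * f / p_p
        smoothed.append((x_u + j * (x_root - x_p),
                         p_u + j * j * (p_root - p_p)))
    return (x_root, p_root), smoothed
```

Under these assumptions the smoothed root estimate reduces to a precision-weighted average of the prior, the coarse measurement, and the fine measurements, which offers a simple check of the recursion.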

[11] The MKS-based framework has a set of parameters that need to be estimated: $\bar{X}(0)$, P(0), A(t), C(t), Q(t), and R(t). In this study, we set A(t) = 1 to conserve mass (i.e., to have the same total precipitation amount) at all scales. We also set C(t) = 1 because both the measurements and the hidden states are precipitation. In the MKS-based framework, the areal mean precipitation over the study area, $\bar{X}(0)$, determines the total amount of the fused precipitation at all scales. Without preference given to any measurement source, we determine $\bar{X}(0)$ as

$$\bar{X}(0) = \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_{S_i} \tag{5}$$

where $S_1, S_2, \ldots, S_N$ are the scales with measurements available, N is the total number of scales with measurements, and $\bar{Y}_{S_i}$ is the mean of the measurements at scale $S_i$. P(0), Q(t), and R(t) are estimated using the EM algorithm, in which the log likelihood function is formulated as

equation image

where equation image is the set of all nodes at the multiscale tree, equation image is a subset of equation image with measurements, equation image is a subset of equation image except the root node, and F is a constant. To maximize the log likelihood, two recursive steps, expectation step (E-step) and maximization step (M-step), are iterated. The E-step is to compute the expectations of precipitation estimates (equation image) conditioned on all available measurements. The two sweeps of the extended MKS algorithm together play the role of the E-step. The M-step is to maximize equation (6) given the estimated precipitation obtained at the E-step. In order to reduce the number of parameters, we further assume that equation image and R(t) are homogeneous at each scale. Finally, equation (6) is maximized using the Newton gradient method. More details about the EM algorithm can be found in [Kannan et al., 2000]. In equation (6), the first half (i.e., the Q related terms) is a measure of the consistency among the fused precipitation data along the multiscale tree and the second half (i.e., the R related terms) is a measure of the consistency between the fused precipitation and measurements. To enhance the contribution of measurements, we constrain equation image in the M-step.

3. Study Area and Data

[12] Our study area (Figure 2) is bounded by longitudes (−88, −84) and latitudes (37.75, 41.75), which contains 32 × 32 grids at 1/8° resolution and 128 × 128 grids at 1/32° resolution. It includes almost the entire state of Indiana and parts of Illinois, Kentucky, Ohio, and Michigan. Covering an area of 152,175 km2, it is large enough for evaluating the MKS-based framework for precipitation data fusion. The average annual precipitation of this region is about 1000 mm. Precipitation is relatively evenly distributed throughout the year. Typically, precipitation is steady and of long duration during winter and early spring and short but of high intensity during late spring and summer.

Figure 2.

Location of our study area bounded by longitudes (−88, −84) and latitudes (37.75, 41.75), which contains 32 × 32 grids at 1/8° resolution. The black dot represents the streamflow gauge station of the White River at Newberry and the shaded area is the contributing area of the gauge station.

[13] Hourly NEXRAD MPE precipitation data [DelGreco et al., 2005] from the Ohio River Forecast Center (OHRFC), National Weather Service (NWS) for the year 2003 are used in this study. The original data are at a spatial resolution of approximately 4 km in XMRG format, which is a binary file format used within the NWS to store gridded data. We resampled and projected the XMRG formatted precipitation data into the geographic coordinate system at two resolutions, 1/32° and 1/8°, respectively.

4. Experiments and Results

4.1. Experiment Design

[14] Two hypothetical experiments are designed to investigate the effectiveness of the MKS-based framework in removing errors from precipitation data of different sources. The two most common types of errors, Type-I errors and Type-II errors, are investigated. The Type-I errors result mainly from random noise, while the Type-II errors are mainly due to systematic bias, such as instrument bias and algorithm bias. Experiment 1 investigates the effectiveness in filtering out the Type-I errors, while experiment 2 examines the effectiveness in filtering out the Type-II errors. We use a synthetic experiment approach because it allows us to control the magnitudes of the errors included in the generated precipitation data, which makes it more effective for evaluating the improvements of fused precipitation. The approach of using synthetic data has been widely used in data assimilation studies for the convenience of performance evaluation [Walker and Houser, 2004].

[15] In both experiments, synthetic precipitation data are generated at 1/8° and 1/32° resolutions based on the hourly NEXRAD MPE precipitation data. In the study area, precipitation was recorded over a total of 3636 h in 2003. Among these hourly precipitation data (called precipitation images), 2246 were found to be realistic in terms of spatial patterns and amounts through the OHRFC's manual inspection. Therefore, we use these 2246 hourly precipitation images as the truth at 1/32° resolution. In addition, we aggregate these data from 1/32° resolution to 1/8° resolution and treat the aggregated data as the truth at 1/8° resolution. Since the 32 × 32 grid at 1/8° resolution and the 128 × 128 grid at 1/32° resolution correspond to scales 5 and 7 of the multiscale tree built for the study area (2^5 = 32 and 2^7 = 128 cells per side), we also refer to the precipitation data at these two resolutions as the data at scales 5 and 7, respectively. The mean and the standard deviation of the true precipitation images at scale 7 are shown in Figures 3a and 3b. Due to aggregation, the mean and the standard deviation of the true precipitation images at scale 5 are smaller than the corresponding ones at scale 7. This is because the total amount of each hourly precipitation image is the same but the precipitation-covered area is larger at scale 5 than at scale 7. The relative differences between scales 5 and 7 are shown in Figures 3c and 3d, respectively. All of these statistics are calculated over precipitation-covered areas, which are defined as the set of grids whose precipitation amounts are larger than 0. In the remaining parts of this paper, we use equation image to denote the precipitation-covered area.

Figure 3.

(a) Time series of the means of the true precipitation data at 1/32° resolution (i.e., scale 7); (b) time series of the standard deviations of the true precipitation data at 1/32° resolution; (c) time series of the relative differences between the means of the true precipitation data at 1/32° resolution and those at 1/8° resolution; and (d) time series of the relative differences between the standard deviations of the true precipitation data at 1/32° resolution and those at 1/8° resolution. All of these statistics are calculated over the precipitation-covered areas based on the 2246 hourly precipitation data in 2003.

[16] In experiment 1, the Type-I errors are generated from Gaussian distributions with zero mean and standard deviations prescribed according to the real data. That is, at each hour k, the prescribed standard deviation is proportional to the standard deviation of the true precipitation image at hour k. For example, if the standard deviation of the true precipitation image at hour k is s_k, the Type-I error at hour k is generated for each grid within equation image from a Gaussian distribution N(0, x_i s_k), where x_i (i = 1, 2, …, 21) denotes a prescribed noise level. At a given hour, the generated errors generally differ from grid to grid, but they follow the same distribution. As shown in Table 1, a total of 21 noise levels ranging from 0.1 to 5.0 are used to generate Type-I errors that mimic the white noise in real precipitation data. In addition, we categorize these 21 noise levels into three groups to represent scenarios of fair, moderate, and large amounts of noise. Synthetic precipitation data are finally generated by adding the Type-I errors to the true precipitation values. Since we have generated 21 levels of synthetic precipitation data at both scales 5 and 7, a total of 21 × 21 combinations are used in experiment 1. For all combinations, data fusion is carried out on each of the 2246 hourly precipitation images.

Table 1. A List of 21 Noise Levels Used in Generating Type-I Errors
Category    Noise Level (x_i)    Value
Fair        i = 1                0.10
Fair        i = 2                0.25
Fair        i = 3                0.50
Fair        i = 4                0.75
Fair        i = 5                1.0
Fair        i = 6                1.25
Fair        i = 7                1.50
Moderate    i = 8                1.75
Moderate    i = 9                2.0
Moderate    i = 10               2.25
Moderate    i = 11               2.50
Moderate    i = 12               2.75
Moderate    i = 13               3.0
Large       i = 14               3.25
Large       i = 15               3.50
Large       i = 16               3.75
Large       i = 17               4.0
Large       i = 18               4.25
Large       i = 19               4.50
Large       i = 20               4.75
Large       i = 21               5.0

[17] In experiment 2, the Type-II errors are generated from Gaussian distributions with nonzero means and standard deviations prescribed according to the true data. At each hour, the mean and the standard deviation are again proportional to those of the true hourly precipitation images. For example, if the mean and the standard deviation of the precipitation image at hour k are m_k and s_k, the Type-II error at hour k is generated for each grid within equation image from a Gaussian distribution N(y_j m_k, x_i s_k), where y_j (j = 1, 2, …, 15) denotes a prescribed bias level and x_i (i = 1, 2, …, 21) denotes a noise level. As Table 2 shows, the range of the bias levels is wide enough to mimic the bias in real precipitation data. Since the focus of experiment 2 is to evaluate the MKS-based framework on removing biased errors in the precipitation data, we set the noise level to x_9 = 2.0 when generating the Type-II errors at all of the bias levels at scales 5 and 7. Synthetic precipitation data are then generated by adding the Type-II errors to the corresponding true precipitation values. It is worth mentioning that the mean and the standard deviation of the Gaussian distribution N(y_j m_k, x_i s_k) do not vary over individual grids in space (i.e., the errors for different grids are drawn from the same distribution), but the individual error values may vary from grid to grid over the precipitation-covered region. As in experiment 1, all combinations of levels at the two scales are used, giving 15 × 15 combinations of the synthetic precipitation data series in experiment 2. Data fusion is once again carried out over each of the 2246 hourly precipitation images for all combinations. Because the synthetic precipitation data are erroneous, we use equation image and equation image to denote these synthetic precipitation images at scale 5 and scale 7, respectively.

Table 2. A List of 15 Bias Levels Used in Generating Type-II Errors
Bias Level (y_j)    Value
j = 1               −0.50
j = 2               −0.30
j = 3               −0.20
j = 4               −0.10
j = 5               0.00
j = 6               0.10
j = 7               0.20
j = 8               0.30
j = 9               0.50
j = 10              0.7
j = 11              1.0
j = 12              1.5
j = 13              2.0
j = 14              2.5
j = 15              3.0

[18] Since precipitation data cannot have negative values, a strict non-negative value rule has to be applied when generating the synthetic precipitation data for both experiments. Whenever the value of the synthetic precipitation in a grid is negative, we regenerate the error until the value is non-negative. Figure 4 shows the average percentage of grids whose synthetic precipitation values are regenerated for experiments 1 and 2, respectively, over the 2246 precipitation hours in 2003. Due to the regeneration process, the means of the errors in the precipitation-covered area (equation image) are increased. A larger average percentage in Figure 4 indicates that a higher bias is added to the synthetic precipitation data and that the added errors deviate more from a normal distribution. For experiments 1 and 2, we find that the average percentage varies from almost zero to about 40% over the ranges of the noise levels or the bias levels. This implies that non-normality exists in the errors of the synthetic precipitation data in both experiments. However, the majority of the Type-I errors and the Type-II errors are still normally distributed. The range of the prescribed bias levels in experiment 2 is asymmetric to reflect the reality that the absolute magnitude of the negative bias is generally not too large.
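The error generation and the non-negative rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the image is passed as a flat list of grid values, the precipitation-covered area is taken as the grids where the true value exceeds 0, the population standard deviation is used for s_k, and the function name `add_errors` and its arguments are hypothetical.

```python
import random


def add_errors(truth, noise_level, bias_level=0.0, rng=None):
    """Add Gaussian errors with mean bias_level * m_k and standard deviation
    noise_level * s_k to every wet grid, redrawing any error that would make
    the synthetic value negative.

    Returns the synthetic image and the fraction of wet grids that were redrawn.
    """
    rng = rng or random.Random(0)
    wet = [v for v in truth if v > 0]           # precipitation-covered grids
    if not wet:
        return list(truth), 0.0
    m_k = sum(wet) / len(wet)                   # mean of the true image
    s_k = (sum((v - m_k) ** 2 for v in wet) / len(wet)) ** 0.5

    synthetic, redrawn = [], 0
    for v in truth:
        if v <= 0:                              # dry grids are left unchanged
            synthetic.append(v)
            continue
        e = rng.gauss(bias_level * m_k, noise_level * s_k)
        if v + e < 0:
            redrawn += 1
            while v + e < 0:                    # regenerate until non-negative
                e = rng.gauss(bias_level * m_k, noise_level * s_k)
        synthetic.append(v + e)
    return synthetic, redrawn / len(wet)
```

Setting `bias_level` to 0 corresponds to the Type-I case, and a nonzero `bias_level` to the Type-II case. The returned fraction plays the role of the percentage of regenerated grids for a single image.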

Figure 4.

Average percentage of grid cells whose synthetic precipitation values are regenerated for (a) experiment 1 and (b) experiment 2 over the 2246 precipitation hours in 2003.

[19] In addition, benchmark experiments have been designed as companions to experiments 1 and 2, respectively. In the benchmark experiments, a conventional data fusion method including two steps is applied to the data of experiments 1 and 2, respectively. In step 1, we either aggregate the synthetic precipitation data at hour k of scale 7 (denoted as equation image) to scale 5 (denoted as equation image) or disaggregate the synthetic precipitation data from scale 5 (denoted as equation image) to scale 7 (denoted as equation image). In the aggregation process, we use the average precipitation values of the 4 × 4 1/32° resolution (scale 7) grids inside a 1/8° resolution (scale 5) grid as the precipitation value of the grid at scale 5. In the disaggregation process, all of the 4 × 4 1/32° grids inside a 1/8° resolution grid take the same precipitation value of the grid at scale 5. In step 2, we fuse the precipitation of scales 5 and 7 at each hour k as equation image and equation image. We use equation image and equation image as the benchmarks for scales 5 and 7 and compare them with the corresponding fused precipitation data equation image (j = 5, 7) based on the MKS-based framework.
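Step 1 of the conventional method can be sketched as below. The sketch assumes square images stored as nested lists and covers only the aggregation and disaggregation described in the text; the fusion formulas of step 2 are not reproduced here. The function names are hypothetical.

```python
def aggregate(fine, factor=4):
    """Average each factor x factor block of a fine grid into one coarse grid cell."""
    n_rows = len(fine) // factor
    n_cols = len(fine[0]) // factor
    return [
        [
            sum(
                fine[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


def disaggregate(coarse, factor=4):
    """Give every fine grid cell the value of the coarse cell that contains it."""
    return [
        [coarse[i // factor][j // factor] for j in range(len(coarse[0]) * factor)]
        for i in range(len(coarse) * factor)
    ]
```

With `factor=4`, a 128 × 128 image at scale 7 maps to a 32 × 32 image at scale 5, and aggregating a disaggregated image returns the original coarse values.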

[20] Two metrics are used in evaluation, namely correlation (Corr) and root-mean-square error (RMSE). Correlation is a measure of the consistency of the two images' spatial patterns. At hour k and scale j, the correlation between the true (equation image) and synthetic precipitation images (equation image) is represented as

equation image

where var(·) represents variance of the precipitation values within the precipitation-covered area equation image, equation image is the number of measurements (i.e., grids), and j = 5 or 7 representing the scale. RMSE is a measure of the overall difference (magnitudes) between the two precipitation images. Similarly, RMSE is formulated as

equation image

Correlation and RMSE between the true and fused precipitation images are calculated by applying equations (7) and (8) where equation image is replaced by either equation image or equation image.
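The two metrics can be computed for one hour and one scale as sketched below. The sketch assumes flat lists of grid values, takes the precipitation-covered area from the true image (grids with values above 0), and uses population variances; the function name `corr_rmse` is hypothetical.

```python
import math


def corr_rmse(truth, other):
    """Correlation and RMSE between two images over the grids where truth > 0.

    Raises ZeroDivisionError if either image is constant over those grids,
    because the correlation is undefined in that case.
    """
    pairs = [(t, z) for t, z in zip(truth, other) if t > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_z = sum(z for _, z in pairs) / n
    cov = sum((t - mean_t) * (z - mean_z) for t, z in pairs) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in pairs) / n
    var_z = sum((z - mean_z) ** 2 for _, z in pairs) / n
    corr = cov / math.sqrt(var_t * var_z)
    rmse = math.sqrt(sum((t - z) ** 2 for t, z in pairs) / n)
    return corr, rmse
```

A constant offset between two images leaves the correlation at 1 while the RMSE equals the offset, which illustrates why the two metrics are read separately as measures of pattern and magnitude.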

[21] To investigate the effectiveness of the MKS-based framework in a statistical sense, we evaluate the overall performance of the data fusion over all of the 2246 precipitation hours in 2003 rather than over selected individual hours. Thus, we use the means of the correlations and RMSEs in our analyses, expressed as follows:

equation image
equation image

where N_T is the total number of precipitation hours (i.e., 2246), j denotes the spatial scale, and Z_{j,k} represents either equation image, equation image, or equation image.

[22] In the following analyses, we compare equation image and equation image computed before and after the data fusion using both the MKS-based framework and the conventional method. For notational convenience, we use superscripts “−” and “+” to represent “before” and “after” data fusion. Therefore, equation image denotes the mean of the correlations between the true precipitation data and the synthetic precipitation data at scale j. equation image denotes the mean of the correlations between the true precipitation data and the fused precipitation data at scale j. Similar meanings are applied for equation image and equation image. Moreover, to facilitate the evaluation analyses, we define equation image and equation image for j = 5 or 7. Positive values of equation image and equation image indicate valuable effects of data fusion.
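A plausible reading of the two improvement measures, given that positive values are said to indicate a beneficial effect, is sketched below. The symbol names and the sign convention are assumptions made for illustration, not a transcription of the published definitions.

```latex
% Assumed definitions: gain in mean correlation and reduction in mean RMSE
\Delta\overline{\mathrm{Corr}}_{j} = \overline{\mathrm{Corr}}_{j}^{+} - \overline{\mathrm{Corr}}_{j}^{-},
\qquad
\Delta\overline{\mathrm{RMSE}}_{j} = \overline{\mathrm{RMSE}}_{j}^{-} - \overline{\mathrm{RMSE}}_{j}^{+},
\qquad j = 5 \text{ or } 7
```

Under this convention, an increase in mean correlation after fusion and a decrease in mean RMSE after fusion both appear as positive values.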

4.2. Results and Discussion

4.2.1. Experiment 1

[23] By showing the color-filled contour plots of equation image, equation image and equation image (j = 5 or 7), Figure 5 provides an overall picture about the effectiveness of the MKS-based framework in terms of restoring the spatial patterns of precipitation. Since no data fusion has been conducted yet, information associated with equation image has no influence on equation image. Therefore, contours of equation image only change along the noise levels of scale 5. Similarly, contours of equation image only change along the noise levels of scale 7. It can be seen that equation image (j = 5 and 7) decrease rapidly with an increase of the noise levels at scale j. In particular, the correlations are reduced from almost 1.0 to about 0.26 when the corresponding noise levels increase from x1 = 0.1 to x13 = 3.0 (i.e., noise levels between fair and moderate). In the plots of equation image and equation image, it can be found that equation image for j = 5 and 7 as long as the Type-I errors are confined within the “fair category.” This indicates that equation image (j = 5 and 7) generated with errors in the “fair category” (see Table 1) still contain most of the patterns of the true precipitation data. On the other hand, equation image (j = 5 and 7) generated with errors confined in the “large category” lose most of the spatial patterns of the true precipitation data, since equation image for j = 5 and 7. The loss of the spatial patterns for equation image (j = 5 and 7) generated with errors in the “moderate category” falls between that of the fair and large categories.

Figure 5.

Color-filled contour plots of equation image, equation image and equation image for (top) scale 5 (i.e., j = 5) and (bottom) scale 7 (i.e., j = 7), respectively, for experiment 1. In each plot, the horizontal axis and the vertical axis represent, respectively, the noise levels at scales 7 and 5. In addition, the two horizontal and the two vertical gray lines indicate the boundaries, respectively, between the fair and moderate, and between the moderate and large categories of the noise levels specified in Table 1.

[24] For each fused result at scale j, if equation image, it implies that the MKS-based framework can improve precipitation spatial pattern through removing some of the Type-I errors at scale j. Significant improvements can be seen at scale 5 in Figure 5. That is, the Type-I errors in equation image are substantially removed by fusing equation image with equation image. equation image is greater than 0.6 for all of the combinations, even for those that equation image are associated with the Type-I errors in the large category. For scale 5, the data fusion only degrades the spatial patterns of the precipitation when equation image are associated with very small noise levels (e.g., the noise level of xi ≤ 0.25) while equation image are associated with much higher noise levels. Compared to the plot of equation image, improvements shown in the plot of equation image are much smaller, but still considerable. Two paired t tests (one for scale 5 and the other for scale 7) between equation image and equation image over all of the 21 × 21 combinations reveal that the differences between equation image and equation image are statistically significant at the 95% confidence level. In other words, the synthetic precipitation data at a coarser resolution (i.e., scale 5) are also helpful in removing the Type-I errors and in improving the spatial patterns of the synthetic precipitation at a finer resolution (i.e., scale 7) through the MKS-based framework, even though the effect is not as much as it is for the opposite situation.

[25] Moreover, plots of equation image and equation image provide direct measures of the improvements of the precipitation patterns at scales 5 and 7, respectively. Distinct differences in the pattern and magnitude are shown between the plots of equation image and equation image. The contours of equation image are jointly controlled by the noise levels at scales 5 and 7. Denoting the synthetic data at scale 5 that are associated with the Type-I errors in the fair category as equation image, we can see that when equation image are combined with the synthetic data at scale 7 (i.e., equation image), equation image is more sensitive to the noise level at scale 5 rather than to the noise level at scale 7, since the color-filled contours are horizontal-like strips in the region. A negative zone of equation image can be seen in the region in the plot of equation image, when the noise level at scale 7 is much greater than that at scale 5. However, the slight decrease of the correlation over the small region would not cause any concern since the absolute magnitudes of those negative equation image are very small. Over the negative region of equation image, magnitudes of the corresponding equation image are mostly greater than 0.85, indicating that the fused precipitation data still represent most of the spatial features of the true precipitation images at scale 5. When the noise level of the Type-I errors associated with equation image is getting larger, e.g., from the fair group to the moderate or large group, the effectiveness of the precipitation data fusion becomes increasingly significant. Meanwhile, the effectiveness depends on the quality of equation image and equation image. As expected, the improvement is more significant if the noise level at scale 7 is smaller.

[26] Improvements indicated by equation image are much less than those by equation image. The contours of equation image are mostly controlled by the noise levels at scale 7. Also, the magnitudes are much smaller than those of equation image. This indicates that equation image have relatively less influence than equation image on the fused precipitation data at scale 7, no matter what the noise level is at scale 5. In other words, we cannot significantly improve the spatial pattern of equation image (at finer resolution) by fusing equation image (at coarser resolution) with equation image using the MKS-based framework. Both the magnitude and the pattern of equation image demonstrate that equation image play a more significant role than equation image in improving the spatial patterns at scale 7 in data fusion. This is because when equation image are fused into the data at scale 7, much less new information on the spatial patterns can be added since the coarser resolution includes less spatial variability information. Nevertheless, equation image do provide some new information, which is detected by the EM algorithm, to improve the spatial patterns of equation image.

[27] One reason for such a significant difference between equation image and equation image is that the EM algorithm in the MKS-based framework places more weight on the data for which there are a larger number of measurement points (see equation (6)). In this study, the number of measurement points at scale 7 is 16 times that at scale 5. Generally, if the noise levels at different resolutions are comparable to each other, more information is provided by the finer resolution data than by the coarser resolution data. Thus, it makes sense that the finer data have more influence on the fused data than the coarser data. Such a general rule of the multiscale data fusion may not work if the finer resolution data are too noisy. This is why we see a region with negative values in equation image: there, the noise levels at the finer resolution are much higher than those at the coarser resolution. If, however, the two (or more) data sources used for fusion have the same spatial resolution or comparable noise levels, the EM algorithm in our MKS-based framework would be able to adjust its parameters to effectively place more weight on the data source with fewer errors and thus to improve the spatial patterns and magnitudes as shown here and in the work of Parada and Liang [2008] as well. Results here demonstrate an important value of the high-resolution data when combined with coarse resolution data to improve the spatial patterns of the coarse resolution data, even if the high-resolution data have larger (but not significantly larger) Type-I errors than those at the coarser resolution.

[28] For the case of equation image, improvements are seen even for the region over which the highly noisy data at scale 5 are fused into the less noisy or equally noisy data at scale 7. The largest improvement occurs for the combinations of equation image in the large category and equation image in the moderate category, where equation image ranges from 0.12 to 0.14. This indicates that the noisy data at scale 5 have the largest effects on improving the spatial patterns of equation image in the moderate category. For the combinations with equation image in the fair region (i.e., left of the first vertical line), small values in equation image are due to the high values of equation image prior to conducting the data fusion. For the combinations with equation image in the large noise category (i.e., right of the second vertical line), smaller values in equation image indicate less influence of the noisy data at scale 5. But overall, there are still improvements in this highly noisy data region at scale 7. The relative improvements in this region are actually not small compared to the original spatial patterns of the noisy data at scale 7 shown in the plot of equation image. The almost equal values of equation image (i.e., the vertical-like color-filled contours) reveal an interesting feature. That is, there appears to be an almost equal contribution of the noisy data at scale 5 to the improvements of the spatial patterns at scale 7 due to the combined effects of the upward and downward sweeps and the EM algorithm involved in the MKS-based framework. When the correlation measure is employed, impacts on the fused data at scale 7 by the different noise levels at scale 5 are minimized, whereas they are not when the RMSE measure is used.
Results here confirm that one can fuse coarse resolution data with fine resolution data (as long as the coarse spatial resolution is not too coarse compared to the fine resolution) to improve the spatial patterns of the fine resolution data even if the data at the coarser resolution include a large amount of noise. The impacts of the coarser resolution data on the fused data at a finer resolution are much less than those the other way around. This is expected from the upward and downward sweeps and the EM algorithm, as elaborated later.

[29] In terms of restoring the spatial pattern of precipitation data, results of experiment 1 and the benchmark experiment are compared in Figure 6. We use equation image to represent improvements in the spatial patterns obtained with the conventional data fusion and equation image (i.e., the same as equation image shown in Figure 5 for experiment 1) to represent improvements obtained with the MKS-based framework. Figure 6 shows the difference between equation image and equation image at scales 5 and 7 (i.e., j = 5 and j = 7). It can be seen that equation image is significantly greater than equation image except for a few combinations when the noise level at scale 5 is in the fair group while the noise level at scale 7 is much greater than the fair group. In addition, equation image is larger than equation image for more than half of the combinations, including all of the combinations when the noise levels at scale 5 are greater than those at scale 7. In general, given the noise levels at scale 5, the superiority of the MKS-based framework over the conventional approach decreases when the noise levels at scale 7 increase. On the contrary, given the noise levels at scale 7, the superiority of the MKS-based framework increases when the noise levels at scale 5 increase. This is because the EM algorithm used in the MKS-based framework places more weight on the finer resolution data than on the coarser resolution data, while the conventional data fusion approach places equal weight on the data at both resolutions.

Figure 6.

Color-filled contour plots of equation image and equation image, where the superscript E1 denotes experiment 1 and the superscript B1 denotes the benchmark experiment. Both experiments use the same erroneous precipitation data with the Type-I errors. The meanings of the horizontal and vertical axes are the same as in Figure 5.

[30] In experiment 1, we also evaluate the effectiveness of the MKS-based framework in improving the magnitudes of the synthetic precipitation data using RMSE. Figure 7 shows the color-filled contour plots of equation image, equation image and equation image (j = 5 or 7). Since no data fusion is conducted yet, the contours of equation image only change along the noise levels at scale j. equation image increases rapidly when the noise levels at scale j increase for j = 5 and j = 7. Particularly, equation image ranges from almost 0 to about 8 and equation image ranges from almost 0 to more than 9 when the noise levels increase from x1 = 0.1 to x21 = 5.0. For the same noise level, equation image is greater than equation image, indicating more variability in equation image than equation image.

Figure 7.

Color-filled contour plots of equation image, equation image and equation image for (top) scale 5 (i.e., j = 5) and (bottom) scale 7 (i.e., j = 7), respectively, for experiment 1. The meanings of the horizontal and vertical axes and the meanings of the two horizontal and vertical gray lines are the same as in Figure 5.

[31] In Figure 7, equation image (j = 5 and 7) reflects the joint influence of the information associated with equation image and equation image on the magnitudes and spatial patterns of the fused precipitation data. The contours of equation image have a different pattern from the contours of equation image. The former is affected by the noise levels at both scales while the latter is mostly controlled by the noise levels at scale 7. This indicates that the influence of the finer resolution data on the coarser resolution data is much stronger than the other way around. Compared with the plot of equation image in Figure 5, it can be seen that the contours of equation image and equation image have similar patterns. In general, the spatial pattern and the absolute magnitude of the precipitation data at the coarser resolution (i.e., scale 5) can be improved simultaneously through the MKS-based framework. On the contrary, the pattern of the contours of equation image is mostly affected by the noise levels at scale 7, similar to that of equation image in Figure 5. In addition, the average of equation image is smaller than that of equation image. This is mainly because the variability associated with equation image is smaller than that associated with equation image (see plots of equation image and equation image in Figure 7).

[32] Moreover, Figure 7 depicts the color-filled contours of equation image and equation image. Both plots can be divided by a zero-value contour. From the plot of equation image, we see that the magnitudes of equation image are significantly improved if the noise levels at scale 7 are not substantially higher than that at scale 5. From the plot of equation image, it can be seen that the magnitudes of equation image are also improved for most of the combinations, except when equation image in the fair group are combined with equation image in the large group. The overall improvements (i.e., positive equation image) at scale 5 are greater than those at scale 7 (i.e., positive equation image). However, the absolute magnitudes of the negative equation image are also greater than those of the negative equation image. This indicates that precipitation data at the finer resolution (i.e., scale 7) have stronger influences on the magnitudes of the fused precipitation data at the coarser resolution (i.e., scale 5). Once again, this is partially due to the EM algorithm, which places more weight on the finer resolution data. If the finer precipitation data are not substantially noisier than the coarser precipitation data, the magnitudes of precipitation data at the coarser resolution will be significantly improved after the data fusion with the MKS-based framework. Otherwise, the magnitudes of the precipitation data at the coarser resolution may be degraded if the finer data are too noisy. The contours of equation image are mainly controlled by the noise levels at scale 7. That is, equation image contribute limited information to the improvement of the magnitudes of equation image.

[33] A comparison between the MKS-based framework and the conventional data fusion method using equation image (j = 5 and 7) is also conducted. We use equation image and equation image to represent the overall improvements (magnitudes and spatial patterns) obtained with the conventional data fusion method and the MKS-based framework, respectively. Figure 8 shows the color-filled contour plots of equation image and equation image. From Figure 8, we see that at scale 5, the MKS-based framework is significantly better than the conventional data fusion method for most of the combinations. This includes many of those whose noise levels at scale 7 are greater than the noise levels at scale 5. The magnitudes of equation image are quite minor for the combinations when the MKS-based framework is not as good as the conventional data fusion method. This occurs when the noise levels at scale 7 are much greater than those at scale 5. In addition, we see that the MKS-based framework is superior to the conventional data fusion method at scale 7 for the combinations when the noise levels at scale 5 are greater than those at scale 7. Results of Figures 6 and 8 consistently demonstrate that the MKS-based framework is superior to the conventional data fusion method in removing the Type-I errors for most parts.

Figure 8.

Color-filled contour plots of equation image and equation image, where the superscript E1 denotes experiment 1 and superscript B1 denotes the benchmark experiment. Both experiments use the same erroneous precipitation data with the Type-I errors. The meanings of the horizontal and vertical axes are the same as in Figure 5.

4.2.2. Discussion

[34] In section 4.2.1, we have presented and discussed the effectiveness of the MKS-based framework in removing the Type-I errors and also compared the results with those of the conventional data fusion method. In this section, we further discuss why the MKS-based framework is more effective in some situations and less effective in others.

[35] In the MKS-based framework, data at a finer resolution affect the fusion through the upward sweep, which is a combination of three steps: the fine-to-coarse prediction, the prediction merging, and the observation update. The fine-to-coarse prediction is from child nodes to their parent node. In our multiscale tree structure (Figure 1), each parent has four children. Therefore, there are four predicted states (one from each child) for each parent node. The four predicted states (i.e., precipitation) are then merged based on a weighted summation with their corresponding error variances, which is more effective in reducing the noise than a simple averaging method. Thus, the prediction merging step can decrease the amount of noise propagated to the parent node from the child nodes. For experiment 1, the finer data are at scale 7 and the coarser data are at scale 5. Thus, the prediction merging step is conducted twice: once at scale 6 and once at scale 5. After this step, the noise at scale 5 becomes much less than that at scale 7. Therefore, the “true” information included in the finer resolution data at scale 7 can contribute significantly to the fused data at scale 5. This is one reason that the finer data help more in improving the data quality at the coarser resolution (i.e., at scale 5). On the other hand, the coarser data affect the fused data at a finer resolution via the downward sweep. If at the parent nodes, the differences between the predicted and smoothed states (i.e., precipitation) are small, the updates to the child nodes through the downward sweep would be small. Since data at a coarser resolution are generally more homogeneous than those at a finer resolution, the differences between the predicted and smoothed states are small. Thus, the coarser resolution data have less influence on the fused data at a finer resolution.
In experiment 1, the coarser data affect the fused precipitation at the finer resolution (scale 7) by two Kalman smoothing steps via the downward sweep, from scale 5 to scale 6, and then from scale 6 to scale 7. In doing so, the influence becomes even weaker. This is partially why the coarser data only provide limited help to the fused precipitation at the finer resolution.
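The effect of the prediction merging step can be illustrated with a minimal sketch. It assumes that the "weighted summation with the corresponding error variances" behaves like an inverse-variance weighting of four independent child predictions. The prior term and the process noise of the full multiscale Kalman smoother are omitted, and the function names and numbers are hypothetical.

```python
def merge_predictions(states, variances):
    """Inverse-variance weighted merge of child predictions for one parent."""
    weights = [1.0 / v for v in variances]
    merged_var = 1.0 / sum(weights)
    merged_state = merged_var * sum(w * s for w, s in zip(weights, states))
    return merged_state, merged_var


def simple_average(states, variances):
    """Plain average of the children; error variance assumes independence."""
    n = len(states)
    return sum(states) / n, sum(variances) / n ** 2


# Four children with unequal (hypothetical) error variances.
child_states = [1.0, 2.0, 3.0, 4.0]
child_vars = [0.5, 1.0, 2.0, 4.0]
merged_state, merged_var = merge_predictions(child_states, child_vars)
avg_state, avg_var = simple_average(child_states, child_vars)
print(f"weighted merge: state={merged_state:.3f}, variance={merged_var:.3f}")
print(f"simple average: state={avg_state:.3f}, variance={avg_var:.3f}")

# Two successive merges (scale 7 -> 6 -> 5) with equal child variances.
var_scale7 = 1.0
_, var_scale6 = merge_predictions([0.0] * 4, [var_scale7] * 4)
_, var_scale5 = merge_predictions([0.0] * 4, [var_scale6] * 4)
print(f"variance at scale 6: {var_scale6}, at scale 5: {var_scale5}")
```

Under these assumptions the weighted merge never has a larger error variance than the simple average, and two successive merges reduce the error variance by a factor of 16, which is consistent with the noise at scale 5 being much smaller than that at scale 7 after the upward sweep.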

[36] The other reason is related to the EM algorithm. The EM algorithm is used to estimate the parameters equation image (equation image), and R(t) (equation image), through maximizing the log likelihood function of equation (6). Since the number of measurements at a child scale is four times the number of measurements at its parent scale, the number of measurement points at scale 7 is 16 times that at scale 5. However, the contribution of each measurement point is equally weighted in equation (6). This is a reasonable assumption when the error level of each measurement point is comparable at different spatial scales. To maximize the log likelihood, the optimal parameters and the fused data should fit the measurements at finer resolutions better than those at coarser resolutions. Accordingly, finer measurements would have more influence on the fused precipitation at the coarser resolutions than the other way around. In experiment 1, the finer resolution has 16 times as many measurement points as the coarser resolution. Therefore, it is unavoidable that the MKS-based framework puts much more weight on the synthetic precipitation data at scale 7. Thus, the data quality at scale 7 becomes more influential than that at scale 5. Consequently, the importance of the finer resolution data is sometimes overemphasized, as indicated in our results. This is a limitation of the EM algorithm when the noise levels at finer resolutions are much larger than those at coarser scales.
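The counting argument above can be made concrete with a deliberately simplified sketch. It is not the EM algorithm of the MKS-based framework: it only assumes that every measurement point enters a Gaussian log likelihood with the same error variance, so that the maximum likelihood estimate of a single common level is the plain mean of all points. All names and values are hypothetical.

```python
def max_likelihood_level(coarse_obs, fine_obs):
    """ML estimate of one common level when all points share one error variance."""
    all_obs = list(coarse_obs) + list(fine_obs)
    return sum(all_obs) / len(all_obs)


children_per_parent = 4
# Scale 7 is two levels below scale 5 in the tree, so 4 ** 2 = 16 fine points
# sit under each coarse point.
n_fine_per_coarse = children_per_parent ** (7 - 5)

coarse = [0.0]                      # one accurate coarse value (truth is 0.0)
fine = [1.0] * n_fine_per_coarse    # sixteen fine values that are all off by 1.0

estimate = max_likelihood_level(coarse, fine)
fine_share = n_fine_per_coarse / (n_fine_per_coarse + len(coarse))
print(f"estimate = {estimate:.3f}, share of weight on fine data = {fine_share:.3f}")
```

With equal per-point weights, about 94% of the total weight falls on the finer data, so an accurate coarse value cannot offset poor fine data. This mirrors the limitation described above for cases in which the finer resolution is much noisier.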

[37] For these reasons, measurements at finer resolutions are more influential than those at coarser resolutions when the MKS-based framework is used to remove the Type-I errors. This is why we see the different color-filled contour patterns between scales 5 and 7 for the fused precipitation and why we see small negative regions in Figures 5 and 7. In addition, these features of the MKS-based framework also explain the negative regions shown in Figures 6 and 8. This is because in the conventional data fusion method, the weights of the measurements at scale 7 are the same as those at scale 5, unlike in the EM algorithm. Thus, the MKS-based framework is not superior to the conventional data fusion method for the combinations where the noise levels at scale 7 are much greater than those at scale 5.

4.2.3. Experiment 2

[38] Experiment 2 is designed to evaluate the effectiveness of the MKS-based framework in removing the Type-II errors associated with the precipitation data. As described in section 4.1 about generating the Type-II errors, the noise portion in the Type-II errors is set to a constant level of x9 = 2.0. To investigate the influence of the bias portion of the Type-II errors on the effectiveness of the MKS-based framework, we compare the metrics of experiment 2 with those of experiment 1 whose noise levels correspond to x9 = 2.0 at both scale 5 and scale 7. The same notations, i.e., equation image, equation image, equation image, equation image, equation image, and equation image (j = 5 and 7) are also used for analyzing the results of experiment 2.

[39] Figure 9 presents the color-filled contour plots of equation image, equation image and equation image (j = 5 or 7) for experiment 2. These plots provide an overall picture of the correlations before and after the data fusion, as well as the corresponding improvements at scale 5 and scale 7. Plots of equation image (j = 5 and 7) show a small variation range, from 0.38 to 0.46, in equation image (j = 5 and 7) across the 15 bias levels. These are close to the correlation values of their counterparts of x9 = 2.0 shown in Figure 5 for experiment 1. This indicates that the magnitudes of equation image (j = 5 and 7) are mainly controlled by the white noise portion of the Type-II errors included in the precipitation data. The small variations of equation image (j = 5 and 7) are due to the process of generating the synthetic precipitation data, which are affected by interactions between the bias and the noise.

Figure 9.

Color-filled contour plots of equation image, equation image and equation image for (top) scale 5 (i.e., j = 5) and (bottom) scale 7 (i.e., j = 7) for experiment 2. In each plot, the horizontal axis and the vertical axis represent, respectively, the bias levels at scales 7 and 5.

[40] In Figure 9, plots of equation image and equation image reflect the joint influences of the information associated with equation image and equation image on the fused precipitation. Compared to the plot of equation image in Figure 9, the range of correlations at scale 5 increases from a range of (0.38, 0.46) to a range of (0.80, 0.86) after data fusion. Similarly, at scale 7, the spatial correlations increase from (0.38, 0.46) to (0.51, 0.60). Compared to the corresponding plots of equation image and equation image in Figure 5, the magnitudes of equation image (j = 5 and 7) in Figure 9 are again close to the correlations of the counterparts of x9 = 2.0 in experiment 1. This indicates that improvements on the correlations are also mainly attributed to the improvements in removing the noise portion of the Type-II errors. In addition, the bias portion of the Type-II errors in the synthetic data does not essentially affect the ability of the MKS-based framework in removing the noise portion of the Type-II errors. equation image (j = 5 and 7) just varies slightly with the different bias levels.

[41] Plots of equation image and equation image show that improvements at both scales 5 and 7 in experiment 2 are again close to those of their counterparts with x9 = 2.0 in experiment 1 (see Figure 5). In experiment 1, without introducing any bias into the synthetic data, the magnitude of equation image is 0.413 and the magnitude of equation image is 0.135 when the noise levels at both scales are x9 = 2.0. In experiment 2, with 15 bias levels involved, the magnitudes of equation image are close to 0.413 and the magnitudes of equation image are also close to 0.135, albeit with small variations. Results in Figure 9 clearly indicate that the MKS-based framework can effectively recover the spatial patterns of precipitation due to the noise at both scales even when the noise is mixed with bias. Moreover, such a recovery in experiment 2 is as effective as it is in experiment 1, even though in experiment 2 the noise is blended into the bias as opposed to experiment 1 where the errors only include the noise.

[42] Results of using the MKS-based framework are also compared to those using the conventional data fusion method. Figure 10 shows the color-filled contour plots of equation image and equation image, where superscripts E2 and B2 denote the results of experiment 2 and those of the benchmark experiment, respectively. At scale 5, the MKS-based framework shows significant superiority to the conventional data fusion method (Figure 10). At scale 7, differences between the two data fusion schemes are reduced, but the MKS-based framework is still slightly more effective than the conventional one. Moreover, magnitudes of equation image are close to 0.127, which is the magnitude of equation image when the noise levels at both scales 5 and 7 are x9 = 2.0 in Figure 6. Similarly, magnitudes of equation image are also close to equation image for x9 = 2.0. These results confirm again that the bias in the Type-II errors associated with the precipitation data has little influence on the effectiveness of the MKS-based framework in terms of recovering the spatial patterns of the precipitation data.

Figure 10.

Color-filled contour plots of equation image and equation image, where the superscript E2 denotes experiment 2 and the superscript B2 denotes the benchmark experiment. Both experiments use the same erroneous precipitation data with the Type-II errors. The meanings of the horizontal and vertical axes are the same as in Figure 9.

[43] Results so far have clearly suggested that given the synthetic precipitation data mixed with both noise and bias (i.e., Type-II errors), the MKS-based framework can restore the spatial patterns of the precipitation data as much as it does in the counterparts of experiment 1 where the synthetic precipitation data only include the noise. This implies that the bias portion of the Type-II errors negligibly affects the performance of the MKS-based framework. This is mainly due to a unique feature of our MKS-based framework [Parada and Liang, 2004]. As shown in equation (3), a D term has been introduced to the observation equation. This D term minimizes impacts of the inconsistency (i.e., bias) among different measurement sources on the fused precipitation at different scales. With the D term, the MKS-based framework fuses only the fluctuations (above and below their means) of the measurements. That is why the MKS-based framework is almost unaffected by the different bias levels when restoring the spatial patterns of the precipitation data. It is worth noting that this D term cannot remove the absolute bias involved in the final values of the fused precipitation data if the mean of the means selected (see equation (5)) has a bias from the true mean. This is often the case because no one knows the true mean in practice.
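The role of the D term can be illustrated with a small sketch under strong simplifying assumptions: two sources at the same resolution, a constant additive bias per source, and a fixed fusion weight in place of the Kalman and EM machinery. The function names, the weight, and the data values are hypothetical. Only the idea of fusing fluctuations about the source means and then adding back the mean of the means is taken from the text.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fuse_fluctuations(obs_a, obs_b, weight_a=0.8):
    """Fuse fluctuations about each source mean, then add the mean of the means."""
    ma, mb = mean(obs_a), mean(obs_b)
    mean_of_means = 0.5 * (ma + mb)
    return [
        mean_of_means + weight_a * (x - ma) + (1.0 - weight_a) * (y - mb)
        for x, y in zip(obs_a, obs_b)
    ]


truth = [0.0, 1.0, 4.0, 2.0, 0.5, 3.0]
noise_a = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0]
noise_b = [-0.5, 0.6, -0.3, 0.2, 0.4, -0.4]


def make_obs(noise, bias):
    """Synthetic observation: truth plus noise plus a constant (assumed) bias."""
    return [t + n + bias for t, n in zip(truth, noise)]


fused_no_bias = fuse_fluctuations(make_obs(noise_a, 0.0), make_obs(noise_b, 0.0))
fused_biased = fuse_fluctuations(make_obs(noise_a, 3.0), make_obs(noise_b, -1.0))

corr_no_bias = pearson(truth, fused_no_bias)
corr_biased = pearson(truth, fused_biased)
mean_shift = mean(fused_biased) - mean(fused_no_bias)
print(f"correlation without bias: {corr_no_bias:.4f}, with bias: {corr_biased:.4f}")
print(f"shift of the fused mean caused by the biases: {mean_shift:.2f}")
```

In this sketch the correlation with the truth is unchanged by the biases, while the fused mean shifts by the average of the two biases. This matches the statement that the D term protects the spatial patterns but cannot remove an absolute bias in the mean of the means.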

[44] Similar to the analysis for experiment 1, we also conduct an analysis on the RMSE for experiment 2. Figure 11 shows the color-filled contour plots of equation image, equation image and equation image for j = 5 and j = 7. From the plots of equation image and equation image, we see that the magnitudes of equation image (j = 5 and 7) increase with the increase of the bias levels at scale j. This is consistent with our experiment design. The ranges of equation image and equation image are from about 3.0 to 5.0 and from 3.5 to 6.0 for scales 5 and 7, respectively. The former is slightly smaller than the latter since the means of the true precipitation data at scale 5 are slightly smaller than the corresponding ones at scale 7. With both noise levels being x9 = 2.0 at scales 5 and 7, the magnitude of equation image is 3.0 and that of equation image is 3.59 in experiment 1. In experiment 2, the magnitudes of most of equation image and equation image are greater than these values due to the added bias in the synthetic data of experiment 2.

Figure 11.

Color-filled contour plots of equation image, equation image and equation image for (top) scale 5 (i.e., j = 5) and (bottom) scale 7 (i.e., j = 7) for experiment 2. The meanings of the horizontal and vertical axes are the same as in Figure 9.

[45] Plots of equation image and equation image in Figure 11 illustrate the averages of the RMSE between the true and fused precipitation data at scales 5 and 7, respectively. Compared to the plots of equation image and equation image, the RMSE has been reduced at both scales 5 and 7 for most of the 15 × 15 combinations. This indicates that the MKS-based framework is also effective in restoring the magnitudes of the synthetic precipitation data associated with the Type-II errors. In experiment 2, the magnitudes of equation image range from about 2.0 to 4.0 while the counterpart in experiment 1, with the noise level of x9 = 2.0, is 1.91. In addition, the magnitudes of equation image in experiment 2 range from about 3.0 to 5.5 while the counterpart in experiment 1 is 2.80. Generally, the magnitudes of equation image (j = 5 or 7) in experiment 2 are greater than the magnitudes of their counterparts in experiment 1. This indicates that the MKS-based framework can remove some but not all of the added bias in the synthetic data at both scales.

[46] Plots of equation image and equation image of Figure 11 illustrate the improvements at scales 5 and 7, respectively, for all of the 15 × 15 combinations. For both j = 5 and 7, equation image increases with an increase of the bias level at its own scale but decreases with an increase of the bias level at the other scale. The magnitudes of equation image range from −0.73 to 2.86 while their counterpart, with x9 = 2.0 in experiment 1, is 1.1. The magnitudes of equation image range from 0.49 to 1.23 while their counterpart in experiment 1 is 0.78. If the bias level at scale j is higher than the bias level at the other scale, then equation image is greater than its counterpart with x9 = 2.0 in experiment 1. For the opposite situation, equation image in experiment 2 is then smaller than its counterpart in experiment 1. When the bias levels at both scales are close to each other, equation image in experiment 2 are close to their counterpart with x9 = 2.0 in experiment 1 for j = 5 and j = 7 as well. These results clearly indicate that the effectiveness of the MKS-based framework in restoring the magnitudes of the synthetic precipitation data associated with the Type-II errors depends on the bias levels at both scales. Basically, the MKS-based framework can effectively remove the bias included in the Type-II errors. However, it cannot completely remove all of it. In fact, reduction of the RMSE using the MKS-based framework is mainly determined by the way in which the areal mean of the precipitation, namely equation image, is calculated. As shown in equation (5), equation image is calculated by averaging the means of the measurements at all scales. Once equation image is determined, the mean of the fused precipitation data at each scale of the multiscale tree is determined and is equal to equation image. 
If, for example, equation image is closer to the areal mean of the true precipitation data than the original mean of the precipitation data at scale 5, then equation image could be smaller than equation image in Figure 11. Otherwise, equation image could be even larger than equation image in Figure 11. This is why equation image and equation image have the patterns shown in Figure 11.
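The dependence of the RMSE change on the two bias levels can be sketched in isolation. The sketch assumes that the fused mean equals the simple average of the two scale means (one reading of equation (5) for two scales), that each bias is a constant offset from the true areal mean, and that the zero-mean part of the error is held fixed so that only the bias contribution varies. The function names and the residual standard deviation are hypothetical.

```python
import math


def fused_bias(bias_5, bias_7):
    """Bias of the fused mean if it is the average of the two scale means."""
    return 0.5 * (bias_5 + bias_7)


def rmse_from(bias, residual_sd):
    """RMSE of a field with a constant bias and a zero-mean residual error."""
    return math.sqrt(bias ** 2 + residual_sd ** 2)


def bias_driven_rmse_change(bias_own, bias_other, residual_sd=1.0):
    """RMSE before fusion minus RMSE after, with the residual error held fixed."""
    before = rmse_from(bias_own, residual_sd)
    after = rmse_from(fused_bias(bias_own, bias_other), residual_sd)
    return before - after


for own, other in [(2.0, 0.0), (1.0, 1.0), (0.0, 2.0)]:
    change = bias_driven_rmse_change(own, other)
    print(f"own bias {own}, other bias {other}: RMSE change {change:+.3f}")
```

Under these assumptions the change is positive when the bias at a scale exceeds the bias at the other scale, zero when the two are equal, and negative otherwise. It grows with the bias at the scale itself and shrinks with the bias at the other scale, in line with the patterns described for Figure 11.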

[47] In order to further evaluate the effectiveness of the MKS-based framework in recovering the magnitudes of the precipitation data associated with the Type-II errors, equation image of experiment 2 and equation image of the benchmark experiment are also compared. Figure 12 shows that the MKS-based framework is superior to the conventional data fusion method at scale 5 for almost all of the combinations except when the bias levels at scale 7 are much higher than those at scale 5. At scale 7, the MKS-based framework is superior to the conventional method only for the combinations when the bias levels at scale 5 are much higher than those at scale 7. This indicates a need to enhance the performance of the MKS-based framework for the finer resolution, where there is a much larger bias in the data, when recovering the magnitudes of the synthetic precipitation data.

Figure 12.

Color-filled contour plots of equation image and equation image, where the superscript E2 denotes experiment 2 and the superscript B2 denotes the benchmark experiment. Both experiments use the same erroneous precipitation data with the Type-II errors. The meanings of the horizontal and vertical axes are the same as in Figure 9.

[48] Analyses so far have focused on evaluating the MKS-based framework in terms of statistical metrics over 2246 hourly precipitation images. Examples of the data fusion results at individual hours are also provided in this study. Figures 13 and 14 illustrate the true, the synthetic, and the fused precipitation images at 0900 UT on 22 September 2003 for both scales 5 and 7. Figure 13 is for the situation in which the synthetic precipitation data include the Type-I errors (i.e., experiment 1) and Figure 14 is for experiment 2 in which the synthetic precipitation data include the Type-II errors. The noise levels at both scales 5 and 7 are 2.0 in Figure 13, while in Figure 14 the bias levels at both scales are 1.0 and the noise levels at both spatial scales are 2.0. Table 3 lists the correlation and RMSE of the two examples before and after the data fusion using the MKS-based framework. As expected, performance of these individual scenarios measured by the two metrics is consistent with the findings discussed in section 4.2. On inspection, Figures 13 and 14 clearly show the significant improvements of the fused precipitation data in both the spatial patterns and the magnitudes at scales 5 and 7.

Figure 13.

Comparison of the true, erroneous, and fused precipitation images for an individual storm that occurred at 0900 UT on 22 September 2003. equation image, equation image and equation image (j = 5 and 7) denote the true, synthetic, and fused precipitation images at scales (top) 5 and (bottom) 7, respectively. The synthetic precipitation data are generated with the Type-I errors in which the noise levels are x9 = 2.0 for both scales 5 and 7. The horizontal and vertical axes in each plot represent, respectively, the longitudes and latitudes of our study area.

Figure 14.

Comparison of the true, erroneous, and fused precipitation images for an individual storm that occurred at 0900 UT on 22 September 2003. equation image, equation image and equation image (j = 5 and 7) denote the true, synthetic, and fused precipitation images at scales (top) 5 and (bottom) 7, respectively. The synthetic precipitation data are generated with the Type-II errors, in which the noise levels are x9 = 2.0 and the bias levels are y10 = 1.0, respectively, for both scales 5 and 7. The horizontal and vertical axes of each plot have the same meanings as in Figure 13.

Table 3. A List of Various Corr and RMSE Values for Experiment 1 and Experiment 2 for an Individual Storm Occurring at 0900 UT of 22 September 2003
              equation image  equation image  equation image  equation image  equation image  equation image  equation image  equation image
Experiment 1  0.33            0.91            0.40            0.58            3.75            2.18            3.75            2.81
Experiment 2  0.50            0.92            0.45            0.64            4.74            3.48            4.97            4.04
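The two metrics reported for these examples can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that Corr is the Pearson correlation taken over all grid cells of one image and that RMSE is the root-mean-square difference over the same cells. The function names `corr` and `rmse` and the toy gamma-distributed field are hypothetical.

```python
import numpy as np

def corr(truth, estimate):
    """Pearson correlation between two images, taken over all grid cells."""
    t = np.asarray(truth, dtype=float).ravel()
    e = np.asarray(estimate, dtype=float).ravel()
    return float(np.corrcoef(t, e)[0, 1])

def rmse(truth, estimate):
    """Root-mean-square error between two images of the same shape."""
    t = np.asarray(truth, dtype=float)
    e = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((e - t) ** 2)))

# Toy stand-in for one hourly image (assumption: not the study's data).
rng = np.random.default_rng(0)
truth = rng.gamma(shape=0.5, scale=4.0, size=(32, 32))
noisy = truth + rng.normal(0.0, 2.0, size=truth.shape)

print("Corr:", corr(truth, noisy))
print("RMSE:", rmse(truth, noisy))
```

Averaging these per-image values over all hourly images would give mean statistics of the kind used in the evaluation.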

5. Conclusions

[49] In this study, we systematically investigated the effectiveness of the MKS-based framework in removing the Type-I errors (white noise) and the Type-II errors (bias and noise together) associated with precipitation data. Hypothetical experiments were conducted using synthetic precipitation data generated at scale 5 (1/8° resolution) and scale 7 (1/32° resolution). The mean correlation and the mean root-mean-square error were used for the evaluation. In addition, the results of the MKS-based framework were compared to those of a conventional data fusion method. Our main findings are summarized as follows:

[50] 1. For the Type-I errors, the MKS-based framework can significantly improve the spatial patterns and the magnitudes of the synthetic precipitation data at scale 5 (the coarser resolution) when the scale 7 (the finer resolution) data are fused with the scale 5 data. An exception occurs when the data at scale 5 are already of good quality and the data at scale 7 are very noisy. Results of experiment 1 also suggest that the MKS-based framework is good at improving the spatial patterns of the data at the coarser resolution, even if the finer resolution data have larger Type-I errors. In other words, these results demonstrate the important value of the high-resolution data in multiscale data fusion using the MKS-based framework, even if the high-resolution data are noisier.

[51] 2. When the precipitation data at scale 5 are fused with the data at scale 7, improvements at scale 7 can still be achieved in both the spatial patterns and the magnitudes. However, the improvements at the finer resolution are smaller than those at the coarser resolution because the coarser resolution data usually contain less information than the finer resolution data. The largest improvement occurs when less noisy data at scale 5 are fused with noisier data at scale 7. Slight deterioration occurs when very noisy data at scale 5 are fused with much less noisy data at scale 7.

[52] 3. For the Type-II errors, results show that the influence of both the bias and the white noise portions of the Type-II errors can be simultaneously and effectively removed through the MKS-based framework. The improvements in the spatial patterns at both scales are close to those of their counterparts with the same noise level in experiment 1. This demonstrates the value of the D term (see equation (3)) in our MKS-based framework. The improvements in the magnitudes of the precipitation at both scales depend on the bias levels at scales 5 and 7. The magnitudes at one scale may deteriorate if the bias at this scale is small but the bias at the other scale is much larger.

[53] 4. A comparison of the results of experiments 1 and 2 with those of the benchmark experiments shows that the MKS-based framework is significantly superior to the conventional data fusion method in improving the spatial patterns and the magnitudes of the synthetic precipitation data at scale 5. This is especially true for the combinations in which the precipitation data at scale 5 are much noisier than the precipitation data at scale 7. For improvements of the spatial patterns at scale 7, the MKS-based framework is mostly superior to the conventional data fusion method, while for improvements of the magnitudes at scale 7, the MKS-based framework is superior only if the precipitation data at scale 5 are noisier than those at scale 7.

[54] 5. This study identifies a limitation of the EM algorithm included in the MKS-based framework. Because the number of measurements at the finer resolution (i.e., scale 7) is much larger (e.g., 16 times in this study) than that at the coarser resolution (i.e., scale 5), the EM algorithm overemphasizes the importance of the finer resolution data. Therefore, the MKS-based framework may not perform well when the finer resolution data are much noisier than the coarser resolution data.
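The factor of 16 follows directly from the two grid spacings used in this study. As a short worked check, assuming regular grids over the same domain so that the number of cells scales with the inverse square of the grid spacing (the symbols N_j and Δ_j, for the cell count and grid spacing at scale j, are introduced here only for illustration):

```latex
\frac{N_7}{N_5}
  = \left(\frac{\Delta_5}{\Delta_7}\right)^{2}
  = \left(\frac{1/8^\circ}{1/32^\circ}\right)^{2}
  = 4^{2}
  = 16
```

Under this assumption each scale-5 cell overlies a 4 × 4 block of scale-7 cells, so there are 16 fine-resolution measurements for every coarse-resolution one.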

[55] In summary, the MKS-based framework is effective in recovering both the spatial patterns and the magnitudes of the synthetic precipitation data by removing the Type-I and the Type-II errors associated with the precipitation data at multiple scales. This study provides not only new insights into the performance of the MKS-based framework, but also a guideline for the optimal fusion of precipitation data at different resolutions. However, there are two main limitations. The first is that an additive Gaussian error model is used in generating the synthetic precipitation data. Thus, the conclusions of this study may not apply to situations in which a multiplicative error model is assumed for the precipitation data. The second limitation is that the errors in the synthetic precipitation data are assumed to be spatially independent. Thus, the data fusion algorithms, including both the MKS-based framework and the conventional method, may be less effective when applied to real precipitation data with spatially correlated errors. In addition, the random errors added to each grid cell of a synthetic precipitation image follow the same distribution. In the future, we will evaluate a more complicated error model and compare its results to those obtained in this study. We will also conduct further investigations to improve the estimation of the variance parameters using the EM algorithm.
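The additive error model described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it treats the noise level as the standard deviation of zero-mean Gaussian noise and the bias level as a constant offset, both in the units of the precipitation field, and it draws the noise independently and from the same distribution in every grid cell. The function name `add_errors` and the toy field are hypothetical, and the study's handling of any negative values produced by the noise is not specified, so none is applied here.

```python
import numpy as np

def add_errors(truth, noise_std, bias=0.0, rng=None):
    """Return truth + bias + N(0, noise_std**2), drawn i.i.d. per grid cell.

    Assumption: noise_std and bias are in the same units as `truth`.
    bias=0.0 gives Type-I errors (white noise only); a nonzero bias
    gives Type-II errors (bias and noise together).
    """
    rng = np.random.default_rng() if rng is None else rng
    truth = np.asarray(truth, dtype=float)
    noise = rng.normal(loc=0.0, scale=noise_std, size=truth.shape)
    return truth + bias + noise

# Toy stand-in for a true precipitation image (assumption: not study data).
rng = np.random.default_rng(42)
truth = rng.gamma(shape=0.5, scale=4.0, size=(128, 128))

type1 = add_errors(truth, noise_std=2.0, rng=rng)            # noise only
type2 = add_errors(truth, noise_std=2.0, bias=1.0, rng=rng)  # bias + noise

print("Type-I  mean error:", np.mean(type1 - truth))
print("Type-II mean error:", np.mean(type2 - truth))
```

A multiplicative or spatially correlated error model, as mentioned among the limitations, would replace the `truth + bias + noise` line with a different construction.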

Acknowledgments

[56] The authors are thankful to the three anonymous reviewers for their valuable suggestions, which helped us improve the presentation of the materials. We thank S. Levent Yilmaz for his computing assistance with the TeraGrid resources and Tyler W. Davis for his helpful comments, which improved the English of this manuscript. This work was partially supported by NASA grant NNA07CN83A. The third author was also partially supported by Chinese Ministry of Science and Technology grant 2008AA12Z205.
