Modeling errors in daily precipitation measurements: Additive or multiplicative?



[1] The definition and quantification of uncertainty depend on the error model used. For uncertainties in precipitation measurements, two types of error models have been widely adopted: the additive error model and the multiplicative error model. This leads to incompatible specifications of uncertainties and impedes intercomparison and application. In this letter, we assess the suitability of both models for satellite-based daily precipitation measurements in an effort to clarify the uncertainty representation. Three criteria were employed to evaluate the applicability of either model: (1) better separation of the systematic and random errors; (2) applicability to the large range of variability in daily precipitation; and (3) better predictive skills. It is found that the multiplicative error model is a much better choice under all three criteria. It extracted the systematic errors more cleanly, was more consistent with the large variability of precipitation measurements, and produced superior predictions of the error characteristics. The additive error model had several weaknesses, such as nonconstant variance resulting from systematic errors leaking into random errors, and the lack of prediction capability. Therefore, the multiplicative error model is a better choice.

1 Introduction

[2] Quantifying uncertainties in Earth science data records is becoming increasingly important, especially as the volume of available data is growing rapidly and as many science questions need to be answered with higher degrees of confidence. This is particularly true for precipitation measurements, whose uncertainties affect many fields, such as climate change, hydrologic cycle, weather and climate prediction, data assimilation, as well as the calibration and validation of Earth-observing instruments.

[3] Uncertainty definition and quantification rely on the underlying error model, either implicitly or explicitly. An error model is a mathematical description of a measurement's deviation from the truth, and many choices of such descriptions are available, as they are not necessarily related to the error mechanisms or sources [Lawson and Hanson, 2005]. An error model's behaviors and parameters can be determined through validation studies, and the model can then be used to predict measurements and their associated uncertainties when only ground references are available, or vice versa, in which case it is called “inverse calibration.”

[4] Two types of error models are commonly used for the study of precipitation measurements: the additive error model and the multiplicative error model. For example, many studies of satellite-based precipitation data products have used the additive error model [Ebert et al., 2007; Habib et al., 2009; Roca et al., 2010; AghaKouchak et al., 2012], while other studies such as Hossain and Anagnostou [2006], Ciach et al. [2007], and Villarini et al. [2009] have used the multiplicative model to quantify or simulate errors in radar- or satellite-based measurements. The use of different error models leads to different definitions and calculations of uncertainties, which impede direct comparisons between them and confuse end users. This raises the question of which model is more suitable. This letter addresses this question for daily precipitation measurements, in order to unify and simplify uncertainty quantification and representation.

2 Two Error Models and Test Data

[5] The additive error model is defined as

$$ Y_i = a + b X_i + \varepsilon_i \tag{1} $$

where i is the index of a datum; Xi is the reference data, assumed error free; Yi is a measurement; a is the offset; b is a scale parameter to represent the differences in the dynamic ranges between the reference data and the measurements; and εi is an instance of the random error, which has zero mean and variance σ². Thus, this model is defined by three parameters, namely, a, b, and σ. Both a and b specify the systematic error, which is deterministic. Therefore, once a and b are determined, the uncertainty in the measurements Yi is quantified by σ.

[6] On the other hand, the multiplicative model is defined as

$$ Y_i = a X_i^{b} e^{\varepsilon_i} \tag{2} $$

[7] In this model, the random error $e^{\varepsilon_i}$ is a multiplicative factor, with εi having zero mean and variance σ². The systematic error, defined by a and b, is a nonlinear function of the reference data. Though less frequently used in precipitation error models, (2) has been widely adopted in many other fields, such as biostatistics [e.g., Baskerville, 1972]. Evidently, the values of σ in the additive model (1) and in the multiplicative model (2) will differ from one another, illustrating that the uncertainty definition depends on the error model formulation.
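As a short worked consequence of (2), under the additional assumption (made here only for illustration, and not required by the model itself) that εi is normally distributed, the standard lognormal moments give the conditional mean and standard deviation of a measurement in the original units:

$$ \mathrm{E}[Y_i \mid X_i] = a X_i^{b}\, e^{\sigma^2/2}, \qquad \mathrm{SD}[Y_i \mid X_i] = a X_i^{b}\, e^{\sigma^2/2} \sqrt{e^{\sigma^2} - 1} $$

Under this assumption, the spread of Yi grows in proportion to the systematic part, whereas in (1) it equals the constant σ at every rain rate. Note also that σ in (1) carries the units of the measurements (mm/d), while σ in (2) is dimensionless, so the two values cannot be compared directly.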

[8] Both models (1 and 2) have three parameters (a, b, and σ), which can be estimated with the generic maximum-likelihood method or the Bayesian method [e.g., Carroll et al., 2006]. However, since the additive model (1) is a simple linear regression, the parameters can also be estimated easily with ordinary least squares (OLS), assuming the random errors (or “residuals” in the case of OLS) are uncorrelated with a constant variance σ² [e.g., Wilks, 2011]. Usually, the random errors are also assumed normally distributed, but this is not necessary, as indicated by the Gauss-Markov theorem [e.g., Graybill, 1976]. However, a normal distribution for the random errors is highly desirable for a well-behaved error model, because this is the maximum entropy distribution [Jaynes, 1957] for a given mean and variance σ². All other distributions will have lower entropy and indicate extra information in the random errors, inconsistent with the definition of uncertainty.
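A minimal sketch of the OLS estimation for the additive model (1), written with NumPy. The function name `fit_additive` and the synthetic "true" parameters are illustrative assumptions, not values or code from this study.

```python
import numpy as np

def fit_additive(x, y):
    """OLS fit of Y = a + b*X + eps; returns (a, b, sigma)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.polyfit returns coefficients from the highest degree down: slope, then intercept.
    b, a = np.polyfit(x, y, deg=1)
    resid = y - (a + b * x)
    # Two fitted parameters, hence n - 2 degrees of freedom for the residual variance.
    sigma = np.sqrt(np.sum(resid**2) / (x.size - 2))
    return a, b, sigma

# Synthetic check with hypothetical parameters (a=2.0, b=0.8, sigma=3.0).
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 100.0, size=5000)
y = 2.0 + 0.8 * x + rng.normal(0.0, 3.0, size=x.size)
a_hat, b_hat, sigma_hat = fit_additive(x, y)
```

With data that really follow (1), the three estimates should land close to the values used to generate the sample; the estimate of σ is then the uncertainty in the units of the measurements.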

[9] Meanwhile, if we perform a natural logarithm transformation of the variables in (2), the multiplicative model becomes

$$ \ln(Y_i) = \ln(a) + b \ln(X_i) + \varepsilon_i \tag{3} $$

which is also a simple linear regression in the transformed domain, and the parameters can be estimated with the same OLS procedure.
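The log-domain regression can be sketched in the same way. This fragment assumes the multiplicative model has the power-law form Y = a·X^b·exp(ε), and it requires strictly positive inputs (as is the case for "hit" events). The name `fit_multiplicative` and the synthetic parameters are hypothetical.

```python
import numpy as np

def fit_multiplicative(x, y):
    """Fit Y = a * X**b * exp(eps) by OLS on ln(Y) = ln(a) + b*ln(X) + eps."""
    lx = np.log(np.asarray(x, dtype=float))  # inputs must be > 0
    ly = np.log(np.asarray(y, dtype=float))
    b, ln_a = np.polyfit(lx, ly, deg=1)
    resid = ly - (ln_a + b * lx)
    sigma = np.sqrt(np.sum(resid**2) / (lx.size - 2))
    # Return a in the original domain; b and sigma are dimensionless.
    return np.exp(ln_a), b, sigma

# Synthetic check with hypothetical parameters (a=1.5, b=0.7, sigma=0.6).
rng = np.random.default_rng(1)
x = np.exp(rng.uniform(np.log(0.5), np.log(128.0), size=5000))
y = 1.5 * x**0.7 * np.exp(rng.normal(0.0, 0.6, size=x.size))
a_hat, b_hat, sigma_hat = fit_multiplicative(x, y)
```

Because the fit is done on ln(Y), the estimated σ describes the spread of the log ratio rather than a spread in mm/d.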

[10] Essentially, the additive error model defines the error as the difference between the measurement and the truth, while the multiplicative error model defines the error as the ratio between the two. Neither is wrong theoretically, but each needs to be evaluated. In order to evaluate an error model for a given measurement data set, we used three criteria:

  1. Can the model adequately partition the systematic and random errors?
  2. Can the model represent the large dynamical range typical in precipitation data?
  3. Can the model predict the errors outside the calibration period?

[11] We used two daily precipitation data sets for the study. For the ground reference data (Xi), we used the daily gauge analysis for the contiguous United States (CONUS), produced by the Climate Prediction Center (CPC), referred to as the CPC Unified Daily Gauge Dataset [Chen et al., 2008]. For the measurements (Yi), we used the Tropical Rainfall Measuring Mission Multisatellite Precipitation Analysis (TMPA) Version 6 real-time product, 3B42RT [Huffman et al., 2007], aggregated to daily accumulation (12Z to 12Z) from its native three-hourly amount. Both data sets have a 0.25° spatial resolution.

[12] We studied a period of 3 years, from September 2005 through August 2008. We used the first 2 years as the calibration period to fit the model and the last year for the validation of the model's predictive skills. In estimating the parameters, we only used the “hit” events, i.e., those reference-measurement data pairs both reporting a precipitation rate of 0.5 mm/d or more, as we deem the lighter events to be statistically unreliable for either the gauge-based reference data [Barnston and Thomas, 1983] or satellite estimates [Tian and Peters-Lidard, 2007].
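The "hit" selection described above can be sketched as a simple mask on paired daily values. The function name `select_hits` and the sample numbers are illustrative assumptions; only the 0.5 mm/d threshold comes from the text.

```python
import numpy as np

def select_hits(gauge, satellite, threshold=0.5):
    """Keep pairs where both gauge and satellite report >= threshold (mm/d)."""
    gauge = np.asarray(gauge, dtype=float)
    satellite = np.asarray(satellite, dtype=float)
    hit = (gauge >= threshold) & (satellite >= threshold)
    return gauge[hit], satellite[hit]

# Made-up daily pairs in mm/d: only the third and fifth pairs are hits.
gauge = np.array([0.0, 0.4, 0.5, 3.2, 12.0, 0.0])
satellite = np.array([0.0, 2.0, 0.7, 0.1, 15.5, 1.1])
x_hit, y_hit = select_hits(gauge, satellite)
```

Pairs in which only one of the two sources exceeds the threshold (misses and false alarms) are excluded by the same mask and would need separate treatment.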

3 Results

3.1 How Well Does the Model Separate the Systematic and Random Errors?

[13] This is equivalent to asking “can the model separate the signal from the noise well?” Since the systematic error is the part that can be deterministically described and predicted, this component should capture as much of the total deviation as possible, leaving a minimum amount of unexplainable deviation to blame on the random error, or uncertainty. In other words, a better error model should be able to extract more signal (systematic error) from the noise.

[14] Under this criterion, the additive error model (1) and the multiplicative error model (3) behave quite differently. To illustrate this, we fitted both models over a 1.5°-by-1.5° area in Oklahoma (centered at 94.25°W, 35.0°N) for the first 2 years of data (Figures 1a and 1b) and produced their respective plots for the residuals normalized by their respective standard deviations (standardized residuals; Figures 1c and 1d) as functions of the gauge data. The additive error model's fitting now appears as a curve in the log-log scale (Figure 1a), is strongly influenced by the higher rain rates, and does not fit well in the low and medium ranges. The multiplicative error model fits the whole range of the data much better (Figure 1b). However, at the high end (~64–128 mm/d), there is some clustering in the satellite data which the model does not capture well. This clustering is probably caused by the saturation of the satellite data's dynamic range, and it is reasonable to expect that the linear model will miss this nonlinear behavior.

Figure 1.

Comparison of model fitting for (a) the additive error model and (b) the multiplicative error model. Residuals of (c) the additive error model and (d) the multiplicative error model, normalized by their standard deviation, as a function of the predictor X (gauge value). The colored lines in Figures 1c and 1d represent the standard deviations of the residuals binned by X. Since log-log scales are used in Figures 1a and 1b, the additive model appears as a curve, while the multiplicative model appears as a straight line. The data are for the 1.5°-by-1.5° region in Oklahoma, from September 2005 through August 2007.

[15] The residuals, or random errors, for the additive model (Figure 1c) exhibit a systematic increase in scatter with higher rain rates. The residuals for the multiplicative model (Figure 1d), on the other hand, show a fairly constant range of variation. The standard deviations of the residuals within each binned subset along the X axis confirm this: the one for the additive model (Figure 1c, thick red curve) has a very systematic upward slope, while its counterpart for the multiplicative model (Figure 1d, thick blue curve) remains fairly constant. The slight drop at the very high end likely results from the clustering of the satellite data mentioned above.

[16] Clearly, the random errors produced from the additive model do not have a constant variance (heteroscedasticity). This implies at least two issues with the model. First, the systematic increase in the variance indicates that some systematic errors were not removed by the model and have “leaked” into the random errors, thus inflating the uncertainty and indicating that the model underfits. Second, the nonconstant variance violates the assumption of constant variance for OLS parameter estimation, which makes the estimates of the two parameters (a and b) inefficient and their nominal standard errors unreliable. The multiplicative model produces random errors with a nearly constant variance and is thus a better fit.
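The binned-spread diagnostic behind this argument can be sketched as follows. The data here are synthetic and are drawn from a multiplicative model with hypothetical parameters, so the example only shows how the check works; it is not a reproduction of Figure 1. The helper name `binned_std` is an assumption.

```python
import numpy as np

def binned_std(x, resid, edges):
    """Standard deviation of residuals within bins of x defined by edges."""
    idx = np.digitize(x, edges) - 1  # bin k holds edges[k] <= x < edges[k+1]
    return np.array([resid[idx == k].std(ddof=1) for k in range(len(edges) - 1)])

# Synthetic pairs following a multiplicative model (hypothetical a, b, sigma).
rng = np.random.default_rng(2)
x = np.exp(rng.uniform(np.log(0.5), np.log(128.0), size=20000))
y = 1.2 * x**0.8 * np.exp(rng.normal(0.0, 0.5, size=x.size))

# Residuals of a straight-line fit in the original domain (additive model).
b_add, a_add = np.polyfit(x, y, 1)
resid_add = y - (a_add + b_add * x)

# Residuals of a straight-line fit in the log domain (multiplicative model).
b_mul, ln_a = np.polyfit(np.log(x), np.log(y), 1)
resid_mul = np.log(y) - (ln_a + b_mul * np.log(x))

edges = 0.5 * 2.0 ** np.arange(9)  # 0.5, 1, 2, ..., 128 mm/d
std_add = binned_std(x, resid_add, edges)
std_mul = binned_std(x, resid_mul, edges)
```

For data of this kind, `std_add` grows strongly from the lightest to the heaviest bin while `std_mul` stays nearly flat. Dividing each set of residuals by its overall standard deviation, as done for the standardized residuals, rescales the curves but does not change their shape.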

[17] The “leak” of the systematic errors into the random ones can also be seen in Figure 2, which compares the spatial distribution of σ, the standard deviation of the random errors, from both the additive (Figure 2a) and the multiplicative (Figure 2b) model with the time-averaged daily precipitation from 3B42RT (Figure 2c). Apparently, the random errors in the additive model (Figure 2a) exhibit a strong correlation with the time-averaged precipitation (Figure 2c). This systematic dependence should be captured by the systematic errors in the first place. The same plot for the multiplicative model (Figure 2b) shows much more uniform standard deviation, with very slight correspondence to the averaged precipitation pattern, if at all.

Figure 2.

Comparison of the standard deviation (stdev) of the random errors between (a) the additive model and (b) the multiplicative model over CONUS for 2005–2007. Each stdev value is normalized by the CONUS spatial average to facilitate direct comparisons between Figures 2a and 2b. (c) The time-averaged daily rainfall for the same period is also shown as a reference.

[18] Such a “leak” originates from the assumption that the systematic errors are a linear function of the reference data (1), while many existing studies have indicated otherwise [e.g., Gebremichael et al., 2011]. Barnston and Thomas [1983] explained this effect in their comparison of gauge and radar measurements.

3.2 Can the Model Represent the Large Dynamical Range in Precipitation Data?

[19] This is a simple argument in favor of the multiplicative model [e.g., Kerkhoff and Enquist, 2009]. At the current (daily, 0.25°) or finer spatial and temporal scales, precipitation variation can span 2 or 3 orders of magnitude, and so can the errors in the measurements. As pointed out by Galton [1879], the additive error model (1) essentially assumes that positive random errors and negative ones are equally probable, to make their arithmetic mean zero. While a positive random error of 10 mm/d in a measurement of 100 mm/d is acceptable, a negative error of the same amplitude in a measurement of 5 mm/d is simply inconceivable, and the model will be forced to produce predictions of negative precipitation amounts. On the other hand, the multiplicative model (2 or 3) describes the error in proportion to the measurements, which is more sensible and is key to producing the constant variance seen in Figures 1d and 2b.
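The 5 mm/d example can be put in numbers with a small simulation. All parameter values below are hypothetical and chosen only to mirror the example in the text (a symmetric error with a 10 mm/d spread applied to a 5 mm/d event); normal errors are assumed in both cases.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = 5.0  # light-rain reference value, mm/d

# Additive model: symmetric errors with a spread set by the heavier events.
a, b, sigma_add = 0.0, 1.0, 10.0  # hypothetical values
y_add = a + b * x + rng.normal(0.0, sigma_add, size=n)

# Multiplicative model: the error scales the systematic part, so Y stays positive.
a_m, b_m, sigma_mul = 1.0, 1.0, 0.5  # hypothetical values
y_mul = a_m * x**b_m * np.exp(rng.normal(0.0, sigma_mul, size=n))

frac_negative_add = np.mean(y_add < 0.0)
frac_negative_mul = np.mean(y_mul < 0.0)
```

With these numbers, the additive model places the mean only half a standard deviation above zero, so close to 31% of its simulated measurements are negative (the standard normal probability below -0.5), while the multiplicative model cannot produce a negative value by construction.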

3.3 Can the Model Predict the Errors Beyond the Calibration Period?

[20] The ultimate test of a model is its predictive capability: outside the calibration period, can the model reproduce the same error characteristics in the measurements? To test this, we used the data from the third (last) year of our study period over the Oklahoma region. These data were not used in the calibration and parameter estimation of the models. The scatterplots for the actual gauge and satellite data, with the additive and multiplicative models fitted with the historical data, are shown in Figures 3a and 3d, respectively.

Figure 3.

Evaluating the models’ prediction. The first row shows (a) the scatterplots from the actual data, (b) the model predicted data, and (c) the comparison of PDFs between the actual data and the predicted data, by the additive model. The second row shows the respective plots (d–f) for the multiplicative model. The model fitting from the historical data is shown as the thick red and blue lines, respectively, in the first two columns. The additive model's prediction also produces some negative values, but they were ignored in the plots. The data are for the 1.5°-by-1.5° region in Oklahoma, from September 2007 through August 2008.

[21] In the prediction test, the satellite data were withheld, and we used the models and the gauge data (Xi) to generate predictions of the measurements (Yi). The scatterplots thus obtained are shown in Figures 3b and 3e for the two models, respectively. In addition, the probability density functions (PDFs) of the predicted measurements and the actual 3B42RT data are compared for both models (Figures 3c and 3f). Evidently, the multiplicative model has much better predictive capability than the additive model, judging from the similarity of its scatterplots and PDFs between predicted and actual data. The additive model suffered from several issues, including the unrealistically low uncertainty at higher rain rates (Figure 3b) and the shifted and distorted PDFs (Figure 3c).
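The mechanics of such a prediction test can be sketched with synthetic data: calibrate both models on one sample, generate predicted measurements from the reference values of a second sample, and compare binned distributions. The synthetic "truth" is drawn from a multiplicative model with hypothetical parameters, so the outcome is built in and serves only to show the procedure. The helper names and the simple absolute-difference distance are assumptions, not the comparison used for Figure 3.

```python
import numpy as np

def simulate(n, rng):
    """Synthetic gauge/satellite pairs; parameters are hypothetical, not from the study."""
    x = np.exp(rng.uniform(np.log(0.5), np.log(128.0), size=n))
    y = 1.2 * x**0.8 * np.exp(rng.normal(0.0, 0.5, size=n))
    return x, y

def binned_fraction(y, edges):
    """Fraction of all samples in each bin; negative values fall outside every bin."""
    counts, _ = np.histogram(y[y > 0], bins=edges)
    return counts / y.size

rng = np.random.default_rng(4)
x_cal, y_cal = simulate(20000, rng)  # stand-in for the calibration period
x_val, y_val = simulate(10000, rng)  # stand-in for the withheld period

# Calibrate the additive model (original domain) and the multiplicative model (log domain).
b_add, a_add = np.polyfit(x_cal, y_cal, 1)
s_add = np.std(y_cal - (a_add + b_add * x_cal), ddof=2)
b_mul, ln_a = np.polyfit(np.log(x_cal), np.log(y_cal), 1)
s_mul = np.std(np.log(y_cal) - (ln_a + b_mul * np.log(x_cal)), ddof=2)

# Predict measurements from the reference data alone, assuming normal random errors.
y_add = a_add + b_add * x_val + rng.normal(0.0, s_add, size=x_val.size)
y_mul = np.exp(ln_a) * x_val**b_mul * np.exp(rng.normal(0.0, s_mul, size=x_val.size))

# Compare binned distributions of predicted and withheld values on log-spaced bins.
edges = np.logspace(-2, 3, 26)
dist_add = np.abs(binned_fraction(y_add, edges) - binned_fraction(y_val, edges)).sum()
dist_mul = np.abs(binned_fraction(y_mul, edges) - binned_fraction(y_val, edges)).sum()
```

In this setup `dist_mul` reflects little more than sampling noise, whereas `dist_add` is much larger, partly because a sizable share of the additive predictions is negative and falls outside every bin.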

4 Summary and Discussions

[22] Uncertainty definition, representation, and quantification are determined by the error model used. Two types of error models have been widely adopted to quantify the errors in precipitation measurements: the additive error model (1) and the multiplicative error model (2 or 3). They will produce incompatible uncertainties from the same measurement data set. In this letter, we evaluated both models with measurements from satellite-based TMPA 3B42RT and with CPC's daily gauge analysis as the reference data. Three criteria were used to assess the applicability of each model: (1) systematic and random errors are well separated; (2) the model is applicable to the large magnitude of variability in daily precipitation; and (3) the model has predictive skills.

[23] We found that the multiplicative error model is a much better choice under all three chosen criteria. The additive error model exhibits several weaknesses, such as heteroscedasticity, failure to account for all systematic errors, inconsistencies in the wide range of precipitation variability, and lack of predictive capability. The multiplicative model is clearly a more suitable choice. Therefore, we recommend this model for uncertainty quantification in daily precipitation measurements. This will unify the definition of uncertainties, facilitate intercomparisons among different data sets, and, eventually, benefit the end users.

[24] The fundamental cause of the additive model's issues is underfitting. Many existing studies have shown that the systematic error (sometimes referred to as “conditional bias”) is a nonlinear function of the reference rain rate [e.g., Gebremichael et al., 2011; Kirstetter et al., 2012] and can be well fitted with the form $aX_i^b$ [Ciach et al., 2007]. However, the additive model tries to fit with the linear function a + bXi. Thus, it does not capture all the systematic error, and then, it treats the “leaked” systematic error as a random error. Figure 4 conceptually illustrates this situation. The true systematic error is assumed to be nonlinear (solid curve). The additive model's linear fit (dashed line) does not capture all the true systematic error, and the shaded area is the part of the systematic error treated by the additive model as part of the random error [Barnston and Thomas, 1983]. This is why one sees strong “systematic” features in the random errors (Figures 1c and 2). Therefore, the “random error” in the additive model is not 100% random and is thus not a truthful representation of the uncertainty.

Figure 4.

Conceptual illustration of the additive model's underfitting. The solid curve is assumed to be the true systematic error of the data (circles). The additive model tries to fit a straight line (dashed line) through the data, and the difference between the two (shaded area) will be treated by the additive model as part of the random error. This is the cause of the “leaking” of the systematic error into the random error with an underfitting model.
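The situation in Figure 4 can be reproduced numerically with a noise-free curve. The sketch assumes a power-law systematic error with hypothetical parameters and adds no random error at all, so whatever a straight line leaves behind is purely systematic.

```python
import numpy as np

# Hypothetical nonlinear systematic error of the form a * X**b (no random error added).
x = np.linspace(0.5, 128.0, 500)
truth = 1.2 * x**0.8

# Best straight line a + b*X through the noise-free curve.
slope, intercept = np.polyfit(x, truth, 1)
leak = truth - (intercept + slope * x)  # what the additive model would treat as "random"

# The same curve is exactly linear in the log domain, so nothing is left over there.
slope_log, intercept_log = np.polyfit(np.log(x), np.log(truth), 1)
leak_log = np.log(truth) - (intercept_log + slope_log * np.log(x))
```

The leftover `leak` is negative at both ends of the range and positive in between, a deterministic pattern that a linear fit would pass on to its residuals, while `leak_log` vanishes to within rounding error.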

[25] However, the selection of an error model is dictated by the data. We speculate that at coarser spatial and temporal resolutions (seasonal or longer), the magnitude of precipitation variability is much suppressed and both precipitation and the measurement errors are closer to the normal distribution [e.g., Sardeshmukh et al., 2000], and the additive model may become more viable. On the other hand, at finer spatial and temporal resolutions, the probability distributions of precipitation and the errors are highly skewed and closer to the Gamma or lognormal distribution, and the multiplicative model may prevail. How the error modeling transitions with the spatial and temporal scales requires further study. Nevertheless, the three criteria proposed in this paper are general and rational enough to be applicable to other data sets and models as well.

[26] In this study, we only used 3B42RT data for the test. We have also examined many other data sets, satellite-based or not, and found that the multiplicative model works equally well at the same 0.25°/daily scale. These results will be published elsewhere. We also used a radar-based data set (Stage IV) as the reference, and the results are qualitatively the same, because on the daily scale, the difference between the gauge and radar data is about an order of magnitude smaller than that between either one and the satellite data [Tian et al., 2009].

[27] We treated the gauge data as error free, which is not absolutely true. However, in practice, the errors in the gauge data are believed to be much smaller than those in the satellite-based measurements [Tian et al., 2009], so this assumption should not change the nature of the conclusions. Moreover, once the errors in the reference data are available, it is straightforward to take them into account [Krajewski et al., 2000]. There are also theoretical treatments to the modeling of errors in both the measurements and the reference data (errors-in-variables theories) [e.g., Carroll et al., 2006], which are beyond the scope of the current study.

[28] Both models also assume that the errors are only functions of the reference rain rate and are not designed to handle errors related to other features such as spatial patterns. Thus, they are more suitable for gridbox-by-gridbox or small region-by-region studies. This is reasonable for satellite-based measurements that are mostly retrievals on a footprint-by-footprint basis. The dependence of the errors on other geophysical parameters, such as topography, will be reflected in the spatial variations of the parameters (a, b, and σ) from gridbox-by-gridbox model fitting.

[29] Despite the demonstrated advantages, the multiplicative error model is certainly not perfect, simply because not all the underlying assumptions can be met in reality. These assumptions include, for example, the stationarity of the measurement process and the systematic error's dependence on the precipitation alone. Also, the possible saturation of the dynamic range at high rain rates in the satellite data introduces some nonlinearity, which cannot be represented by the linear model. More elaborate error models can certainly be developed, with more complex formulations and more parameters, but there is always the risk of overfitting, and the models may quickly lose predictive skills. A complex error model also implies ill-designed measurement instruments or product algorithms. Judging from its predictive skills and its conceptual simplicity, we suggest that the multiplicative model (3) should suffice for most studies in practice.

[30] It is also worth noting that the model parameters are estimated with only “hit” events, which only include precipitation rates of 0.5 mm/d or larger in both measurement and gauge data, and which more often dominate the errors. We believe that data points with lower rates in either the gauge data or the satellite data are statistically unreliable, being more susceptible to noise and artifacts [e.g., Barnston and Thomas, 1983; Tian and Peters-Lidard, 2007]. However, since both models (1 and 3) are linear, the model parameters estimated with the “hit” events can certainly be used to extrapolate to lower precipitation rates, although there is no guarantee of performance. In addition, we did not attempt to model the “missed” or “false alarm” events [Tian et al., 2009]; they should be modeled as separate error components [e.g., Hossain and Anagnostou, 2006; Villarini et al., 2008], and their contribution to total rainfall could be significant [Behrangi et al., 2012]. Again, since these events usually involve very light rain rates in either the reference data or the measurements, and they are more prone to nonrandom effects such as snow cover on the ground [Ferraro et al., 1998] or inland water bodies [Tian and Peters-Lidard, 2007], how well one can characterize them with a stochastic model is an open question.


[31] This research was supported by the NASA Earth System Data Records Uncertainty Analysis Program (Martha E. Maiden) under solicitation NNH10ZDA001N-ESDRERR. Computing resources were provided by the NASA Center for Climate Simulation.