Statistics of precipitation reflectivity images and cascade of Gaussian-scale mixtures in the wavelet domain: A formalism for reproducing extremes and coherent multiscale structures


  • Mohammad Ebtehaj

    1. Department of Civil Engineering, Saint Anthony Falls Laboratory and National Center for Earth-Surface Dynamics, University of Minnesota, Twin Cities, Minneapolis, Minnesota, USA
    2. School of Mathematics, University of Minnesota, Twin Cities, Minneapolis, Minnesota, USA
  • Efi Foufoula-Georgiou

    1. Department of Civil Engineering, Saint Anthony Falls Laboratory and National Center for Earth-Surface Dynamics, University of Minnesota, Twin Cities, Minneapolis, Minnesota, USA


[1] To estimate precipitation intensity in a Bayesian framework, given multiple sources of noisy measurements, a priori information about the multiscale statistics of precipitation is essential. In this paper, statistics of remotely sensed precipitation reflectivity images are studied using two different data sets of randomly selected storms for which coincident ground-based and spaceborne precipitation radar data were available. Two hundred reflectivity images of independent storm events were collected over two ground validation sites of the Tropical Rainfall Measuring Mission (TRMM) in the United States. By comparing ground-based and spaceborne images, the second-order statistics of the measurement error are characterized. The average spectral signature and second-order scaling properties of those images are documented at different orientations in the Fourier domain. Decomposition of images using band-pass multiscale oriented filters reveals remarkable non-Gaussian marginal statistics and scale-to-scale dependence. Our results show that despite different physical storm structures, there are some inherent statistical properties which can be robustly parametrized and exploited as a priori information for parsimonious multiscale estimation of precipitation fields. A particular mixture of Gaussian random variables in the wavelet domain was found to be a suitable probability model that can reproduce the non-Gaussian marginal distribution as well as the scale-to-scale joint statistics of precipitation reflectivity data, important for properly capturing extremes and the coherent multiscale features of rainfall fields.

1. Introduction

[2] In the past decades, a considerable research effort has been devoted to developing parsimonious stochastic models of space-time rainfall [e.g., Lovejoy and Mandelbrot, 1985; Gupta and Waymire, 1990, 1993; Veneziano et al., 1996; Deidda, 2000; Deidda et al., 2006; Lovejoy and Schertzer, 2006; Venugopal et al., 2006; Mandapaka et al., 2010]. The related theories of multiscale process representation, e.g., in Fourier or wavelet domains, have proven to be useful for quantifying the rainfall variability at multiple scales. A large body of these developments has exploited the way that the second-order statistics of the rainfall process vary across different scales (i.e., 1/f spectra). Beyond this, observing non-Gaussian characteristics of precipitation fields and scaling in higher-order statistical moments, the theory of Multifractals and Multiplicative Random Cascades has been used extensively to capture these distinct properties of the rainfall fields [e.g., Lovejoy and Schertzer, 1990; Gupta and Waymire, 1990, 1993]. Simultaneously, it has been shown that oriented subband encoding of precipitation fields using wavelets can lead to an efficient and rich multiscale representation of spatial rainfall [e.g., Kumar and Foufoula-Georgiou, 1993a, 1993b]. Subsequently, an appreciable amount of work has been devoted to extracting the dependence of the parameters of those stochastic models on the underlying physics of the storm [e.g., Over and Gupta, 1994; Perica and Foufoula-Georgiou, 1996; Harris et al., 1996; Badas et al., 2006; Nykanen, 2008; Parodi et al., 2011].

[3] The purpose of this paper is to: (1) demonstrate that precipitation reflectivity images exhibit some remarkably regular multiscale statistical characteristics, mainly related to non-Gaussian (heavy tail) marginals and scale-to-scale dependency, and (2) introduce a new modeling framework based on Gaussian Scale Mixtures (GSM) on wavelet trees which can be explored towards non-Gaussian, multiscale/multisensor data fusion of precipitation fields. In section 2, we present basic statistics from a diverse array of precipitation reflectivity images collected coincidentally from ground-based NEXRAD and the spaceborne Precipitation Radar (PR) aboard the TRMM satellite for two TRMM Ground Validation (GV) sites in Texas and Florida. In section 3, an extensive analysis and comparison of these images in the Fourier domain is undertaken. In section 4, the marginal and joint statistics of these precipitation reflectivity images in the wavelet domain (using an advantageous Undecimated Orthogonal Discrete Wavelet transform) are presented. A novel model based on the GSM on wavelet trees is introduced in section 5, and its potential for reproducing the observed heavy tail and covariance of the rainfall wavelet coefficients at multiple scales is demonstrated. The potential application of this model is also briefly discussed. Finally, section 6 presents conclusions and directions for future research.

2. Precipitation Data and Elementary Statistics

[4] A major portion of the available remotely sensed precipitation data is acquired via imaging in the microwave band of the electromagnetic spectrum. For active microwave sensors, such as ground or spaceborne radars, the precipitation fields are retrieved via physical or statistical relationship from the reflectivity images obtained as a result of the detected back-scattered energy of microwave signals emitted from the precipitation radar. On the other hand, for passive microwave sensors such as the TRMM Microwave Imager (TMI), the precipitation fields are retrieved indirectly via conditional inversion of the observed “brightness temperature” [e.g., Kummerow et al., 1996]. In this study, we use coincidental reflectivity data of the spaceborne TRMM precipitation radar (PR) and the land-based NEXRAD radar to demonstrate that despite different physical structures of the studied storms, the near-surface images of precipitation reflectivity exhibit remarkably regular and stable statistical properties, which can be explicitly characterized within a novel formalism based on GSM in the wavelet domain.

[5] Specifically, the data set used in this study is populated by near-surface reflectivity images from 200 independent storms coincidentally observed by TRMM and NEXRAD precipitation radars. The TRMM-2A25 and NEXRAD (level III) long-range reflectivity products over two TRMM-GV sites, Houston, Texas (HSTN), and Melbourne, Florida (MELB), were collected on the basis of the TRMM overpass information provided by the GV Office at the Goddard Space Flight Center, Maryland. Using orthodromic distance, the NEXRAD product provides reflectivity at a horizontal resolution of about 1 km and up to the range of 460 km with minimum reflectivity detection of 5 dBZ. The TRMM 2A25 product provides an orbital track that spans a swath of 250 km at nadir with a resolution of about 4–4.5 km and minimum detection sensitivity of 17 dBZ. A lucid explanation of the TRMM-GV sites and the available data at each site is provided by Wolff et al. [2005]. Note that as the quantitative comparison of the two sensors is of interest in this research, the NEXRAD near-surface long-range reflectivity product was selected to maximize the coincidental coverage between the two sensors. Obviously, rainfall rate estimation from this single level reflectivity product via a Z-R relationship needs to be limited to lower ranges (e.g., <230 km) to minimize the range effect estimation errors.

[6] The data set used in our study comprises reflectivity images of 95 and 105 storm events from both sensors over the HSTN and MELB sites from 1998 to 2010, respectively (see Figure 1). Concerning the sufficiency of the data for robust statistical inference, the images were carefully selected from storm events with adequate areal coverage during the TRMM overpasses. The data set spans a wide range of storms with different physical structures and geometrical shapes ranging from highly localized convective storms to frontal and synoptically induced hurricane systems (see Figures 2 and 3). It is emphasized that no attempt was made to convert these reflectivity images to precipitation intensity values, a task that would be a research topic by itself, given the diversity of storms and the ground-based radar range-dependent estimation issues. In the rest of the paper we refer to these reflectivity fields as “precipitation reflectivity images” or “precipitation images.”

Figure 1.

For statistical analysis, a total of 200 independent storms were selected from the TRMM-PR and the NEXRAD reflectivity data set at the TRMM ground validation (GV) sites of Houston, Texas (HSTN), and Melbourne, Florida (MELB). The distribution of these events by year is shown.

Figure 2.

The collected data sets summarized in Figure 1 span a wide range of storms with different spatial structures and geometrical shapes. The NEXRAD reflectivity images for four selected storms are shown above; they are labeled according to the GV site, date (yyyymmdd), and time in UTC: (a) MELB_19980217_131700, (b) HSTN_19981113_000200, (c) HSTN_20020620_172600, and (d) MELB_20040926_045000.

Figure 3.

(a) The geographic locations of the study sites (MELB and HSTN) and the orbital track 61698 of the TRMM satellite which captured a hurricane storm over Texas on 13 September 2008. (b) Reflectivity images of the storm captured by ground-based NEXRAD at 11:16:00 UTC and (c) the coincidental TRMM-2A25 overpass.

[7] Focusing on characterization of the error variance, these sensors were compared over the intensity range detectable by both. Accordingly, the mean reflectivity (in dBZ) of the TRMM images was compared with the mean of the corresponding NEXRAD images, conditioned on reflectivity values exceeding 17 dBZ; see Figure 4. For this case, the standard bias was found to be −2% and −1.8% for the HSTN and MELB sites, respectively. This indicates that the TRMM-PR overestimates the reflectivity intensity in the range that both of the sensors can detect reflected echoes. This bias is not unexpected and is mainly due to the inherent differences in the way that the two sensors interrogate the vertical profile of the atmosphere. The variance of error is estimated and reported in Table 1 based on two different definitions of signal-to-noise ratio metric. This characterization has an important implication in the context of linear multisensor fusion of precipitation products [e.g., see Chou et al., 1994; Gorenburg et al., 2001; Tustison et al., 2002; Willsky, 2002]. To this end, the bias was adjusted to zero via enforcing the regression line to pass through the origin, and also the data pairs with normalized residual values (by the standard deviation) beyond the interval [−2, 2] were excluded from the estimation process (see Figure 4). The latter treatment makes the estimation more robust to probable outliers.
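The bias adjustment and outlier screening described in this paragraph can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names, the synthetic reflectivity pairs, and the choice of the sample standard deviation of the residuals as the normalizing constant are assumptions made here.

```python
import math

def fit_through_origin(x, y):
    """Least squares slope b for the model y = b * x (line forced through the origin)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def screen_outliers(x, y, limit=2.0):
    """Fit y = b * x, then drop pairs whose residual, normalized by the
    residual standard deviation, falls outside [-limit, limit]."""
    b = fit_through_origin(x, y)
    res = [yi - b * xi for xi, yi in zip(x, y)]
    mean = sum(res) / len(res)
    sd = math.sqrt(sum((r - mean) ** 2 for r in res) / (len(res) - 1))
    kept = [(xi, yi) for xi, yi, r in zip(x, y, res) if abs(r / sd) <= limit]
    return b, kept

# Hypothetical mean reflectivities (dBZ): NEXRAD on x, TRMM on y, one gross outlier.
nexrad = [20.0 + i for i in range(20)]
trmm = [xi + (0.1 if i % 2 else -0.1) for i, xi in enumerate(nexrad)]
trmm[7] += 5.0

slope, pairs = screen_outliers(nexrad, trmm)
print(f"slope = {slope:.3f}, pairs kept = {len(pairs)} of {len(nexrad)}")
```

The error variance would then be estimated from the retained pairs only, which is what makes the estimate less sensitive to isolated mismatches between the two sensors.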

Figure 4.

TRMM versus NEXRAD reflectivity values in (a) HSTN and (c) MELB sites. The data pairs are the spatially averaged reflectivity values of coincidental pairs of images computed over the range of intensity values which is detectable by both sensors (≥17 dBZ). The solid line is the best least squares fitting and the broken line is the 1:1 line. The (b) HSTN and (d) MELB normalized regression residuals, with the [−2, 2] lines, marked to indicate the values that fall outside the ±2 times standard deviation of residuals.

Table 1. Standardized Error Variance in Terms of Two Signal-to-Noise Ratio (SNR) Metrics and Kullback-Leibler (KL) Divergence of the Marginal Histograms of the TRMM and NEXRAD Coincidental Reflectivity Observations^a

SNR1 | 11.9 (10.4–12.9) | 13.0 (11.6–13.6) | 12.4 (11.2–13.6) | 13.6 (13.0–14.4)
SNR2 | 8.4 (5.8–9.75) | 13.0 (11.6–13.6) | 9.0 (7.45–10.0) | 7.9 (6.5–9.6)
KL^b | 1.0488 (0.7959–1.5681) | 1.0488 (0.7959–1.5681) | 0.6749 (0.6108–0.7494) | 0.6749 (0.6108–0.7494)

  • ^a Values in parentheses indicate the 95% quantile range of estimation. The two different metrics of SNR are SNR1 = 10 log10(μ_s/σ_n) and SNR2 = 10 log10(σ_s/σ_n), where μ_s and σ_s are the mean and standard deviation of the signal and σ_n is the noise standard deviation.
  • ^b KL is a mutual property between NEXRAD and TRMM. Therefore the entries repeated here are actually shared between NEXRAD and TRMM.
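As a small worked illustration of the two SNR metrics defined in the table footnote, the sketch below evaluates them for a signal sample and a noise sample. The function name and the toy numbers are hypothetical, and how the noise sample is obtained (for example, as the difference between coincident observations) is an assumption not spelled out in the text.

```python
import math
import statistics

def snr_metrics(signal, noise):
    """Return (SNR1, SNR2) = (10 log10(mu_s / sigma_n), 10 log10(sigma_s / sigma_n))."""
    mu_s = statistics.mean(signal)
    sigma_s = statistics.stdev(signal)
    sigma_n = statistics.stdev(noise)
    return 10.0 * math.log10(mu_s / sigma_n), 10.0 * math.log10(sigma_s / sigma_n)

# Toy values in dBZ, for illustration only.
snr1, snr2 = snr_metrics([20.0, 30.0, 40.0], [-1.0, 0.0, 1.0])
print(f"SNR1 = {snr1:.2f}, SNR2 = {snr2:.2f}")
```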

[8] The Kullback-Leibler (KL) divergence, also known as the relative entropy, was also studied to characterize the degree of proximity of the marginal densities of the observations, provided by the two sensors. The KL divergence is defined as

$$ D(p_1 \,\|\, p_0) = \int p_1(x) \log \frac{p_1(x)}{p_0(x)} \, dx, \tag{1} $$

where $p_j = p(x \mid H_j)$ is the conditional marginal density of the precipitation reflectivity values under different measurement hypotheses with j = 0, 1 corresponding to TRMM and NEXRAD observations, respectively. The KL divergence is a nonnegative quantity which is equal to zero if and only if the compared densities are equal almost everywhere in their domain. The KL is not a conventional distance since it is not symmetric and does not satisfy the triangle inequality for three arbitrary densities. Yet, it has been shown to be a useful measure of density mismatch in statistical modeling [Levy, 2008]. As can be seen from Table 1, this metric demonstrates a statistically significant deviation from zero for both GV sites, implying a deviation of the marginal densities. This particular observation, along with the least squares analysis of the data set (see summary in Table 1), indicates that on average the overall quality of the selected TRMM-PR overpass observations in the MELB site is superior to that of the HSTN site.
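For histograms, the KL divergence reduces to a finite sum over bins. The sketch below is a minimal discrete version under stated assumptions: both histograms share the same bins and are already normalized, the natural logarithm is used, and a small floor `eps` (a choice made here, not taken from the paper) guards against empty bins in the reference histogram.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) = sum_k p_k * log(p_k / q_k).

    p and q are normalized histograms over identical bins; bins with p_k = 0
    contribute nothing, and q_k is floored at eps to avoid division by zero.
    """
    return sum(pk * math.log(pk / max(qk, eps)) for pk, qk in zip(p, q) if pk > 0.0)

# Two hypothetical two-bin histograms.
p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, p))                       # identical densities
print(kl_divergence(p, q), kl_divergence(q, p))  # the measure is not symmetric
```

Evaluating both orderings makes the asymmetry noted in the text concrete: swapping the arguments changes the value.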

3. Spectral Signature

[9] Several studies [e.g., Lovejoy and Schertzer, 1990; Harris et al., 1996, 2001; Menabde et al., 1997; Morales and Poveda, 2009; Lovejoy et al., 2010; Ebtehaj and Foufoula-Georgiou, 2010] have reported the presence of scale invariance in the form of an $f^{-\beta}$ average Fourier spectrum (i.e., $\mathbb{E}[|\mathcal{F}(f)|^2]$) in precipitation fields. The Fourier transformation, as an approximation to the Karhunen-Loève expansion, allows us to decouple the correlation structure of the rainfall fields into a set of almost uncorrelated Fourier coefficients with a nearly diagonal covariance matrix. Therefore, knowing that the inner product in $L^2(\mathbb{R})$ is conserved under the Fourier transformation (i.e., Parseval's theorem), the one-dimensional representation of the average power spectrum $\mathbb{E}[|\mathcal{F}(f)|^2] = A f^{-\beta}$ is indeed a diagonalization of the covariance in the frequency domain. Besides the information content of the spectral decay rate as depicting the second-order scaling law and degree of differentiability (smoothness) of a field, this diagonal representation of the covariance yields a computationally more efficient least squares optimal filtering in the Fourier domain, which might also be useful for filtering of high-dimensional strongly correlated rainfall fields [see, e.g., Simoncelli and Adelson, 1996; Gonzalez and Woods, 2008].

[10] By construction, the Fourier spectrum of an image is insensitive to spatial translation, but it is not rotation invariant and can explain the anisotropy of a field. Accordingly, in addition to the energy distribution of the intensity values in the frequency domain, the 2D spectrum depicts the orientation of the edges and regions of sharp gradients in a 2D field. For instance, it has been reported that as horizontal and vertical edges are dominant in man-made scenes (e.g., cities), the spatial distribution of the spectrum of these images is more elongated along the vertical and horizontal orientations [Torralba and Oliva, 2003]. In light of this, studying the spatial distribution, orientation and total energy of the precipitation reflectivity images in the spectral domain might be useful not only for exploring scale invariance and optimal estimation but also for studying the regional organization of storm systems for retrieval applications.

[11] To this end, we compute here a more general representation of the mean spectral signature of the precipitation images at different orientations θ,

$$ \mathbb{E}\left[ |\mathcal{F}(f, \theta)|^2 \right] = A(\theta)\, f^{-\beta(\theta)}, \tag{2} $$

where $\mathcal{F}(\cdot)$ denotes the Fourier transformation in polar coordinates, A(θ) is a prefactor and β(θ) is the dropoff rate of the spectrum at angle θ. Using the discrete Fourier transform, the squared absolute values of the Fourier coefficients were calculated to obtain the 2D power spectrum for each individual image. This provides a set of Fourier power spectra which can be averaged over the entire data set for each site (i.e., 95 images over HSTN and 105 images over MELB) to obtain the so-called ensemble power spectrum; see Figures 5c and 5d. Using least squares regression in a log-log scale, the power spectral model in the form of equation (2) can be fitted at different orientations to each individual or ensemble 2D spectrum. The regressions were performed in the radial frequency interval of [0.03, 0.50] cycle/pixel [c/p], corresponding to the pseudo spatial scale (i.e., Euclidean distance) of 2–32 km. Table 2 reports the results of directional estimation of power spectral slopes for the NEXRAD data sets. It is observed that the estimated spectral slopes vary between 2.35 and 2.75 for the HSTN site and between 2.45 and 2.85 for the MELB site. By averaging a 2D power spectrum over all angles, a one-dimensional representation can also be obtained in which the parameters in equation (2) are independent of orientation. Figures 5a and 5b show the radially averaged ensemble spectra for the NEXRAD data sets. The estimated dropoff rate of the radially averaged ensemble spectra in the two sites is about 2.70–2.75, which implies that the precipitation images are globally much smoother than many other natural images (e.g., β ≈ 2) [Ruderman, 1994]. Interestingly, despite the different geographic locations of the two sites and different physical structures of the storms, the statistics of the spectral parameters vary within a very narrow range; see Table 2. This observation implies that the spectral signature (i.e., spatial correlation structure) of the near-surface reflectivity images may not be a discriminatory measure of the physical structure of the storms. On the other hand, this universal behavior gives us a priori knowledge about the correlation structure of this type of rainfall image, which can be useful for noise removal and optimal estimation of precipitation data in the Fourier domain.
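The log-log regression step can be illustrated with a short sketch that fits the power law A f^(−β) to a one-dimensional (for example, radially averaged or single-orientation) spectrum over the frequency band quoted in the text. The function name, the base-10 logarithm, and the synthetic spectrum are assumptions made for illustration; computing the 2D spectrum itself and averaging it over angles are not shown.

```python
import math

def fit_power_law(freqs, power, f_min=0.03, f_max=0.50):
    """Fit log10 P = log10 A - beta * log10 f by ordinary least squares.

    Only frequencies inside [f_min, f_max] (cycles/pixel) enter the regression.
    Returns (log10_A, beta).
    """
    pts = [(math.log10(f), math.log10(p))
           for f, p in zip(freqs, power)
           if f_min <= f <= f_max and p > 0.0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return mean_y - slope * mean_x, -slope

# Synthetic spectrum with a known prefactor and dropoff rate (hypothetical values).
freqs = [0.01 * k for k in range(1, 51)]
power = [10.0 ** 5 * f ** -2.7 for f in freqs]
log_a, beta = fit_power_law(freqs, power)
print(f"log10 A = {log_a:.2f}, beta = {beta:.2f}")
```

Applying the same fit to spectra sampled along different orientations would give direction-dependent estimates of the prefactor and dropoff rate.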

Figure 5.

Radially averaged spectra for the ensemble of NEXRAD reflectivity images at the (a) Houston and (b) Melbourne GV sites. The 2D ensemble spectra for (c) Houston and (d) Melbourne GV sites depicting directional anisotropy at small scales (large frequencies).

Table 2. Estimated Parameters of the Spectral Model in Equation (2) for NEXRAD Data at Multiple Directions^a

Orientation | HSTN log[A(θ)] | HSTN β(θ) | MELB log[A(θ)] | MELB β(θ)
Radial average | 5.18 (0.18) | 2.68 (0.12) | 5.11 (0.19) | 2.75 (0.10)
θ = 0° | 4.86 (0.22) | 2.66 (0.22) | 4.80 (0.27) | 2.72 (0.26)
θ = 90° | 5.02 (0.25) | 2.71 (0.27) | 4.92 (0.25) | 2.80 (0.21)
θ = 135° | 4.50 (0.19) | 2.75 (0.22) | 4.46 (0.22) | 2.85 (0.21)
θ = 45° | 4.85 (0.20) | 2.35 (0.20) | 4.75 (0.19) | 2.45 (0.19)

  • ^a See Figure 5. The values in parentheses are the standard deviations. The "Radial average" row reports the parameters obtained from the radially averaged ensemble spectra.

[12] As mentioned before, the shape of the spectrum can also reflect the regional organization of the rain cells. Pronounced abrupt changes in the spatial domain intensity values (i.e., horizontal edges) cause spectral skewness (elongation) in the perpendicular direction (i.e., vertical direction) in the frequency domain [Gonzalez and Woods, 2008]. In the collected storm images, for the low-frequency components of less than 0.1 [c/p], the spectral signature shows a denser and more isotropic behavior, meaning that on average the large-scale features of the storms do not have any particular spatial orientation. However, for high-frequency components (i.e., small-scale features) the ensemble power spectra are tilted and more elongated towards the northeast (NE) and southwest (SW) directions (see Figures 5c and 5d). This similar asymmetric signature in both sites may mainly arise from a regionally governing synoptic meteorological condition that gives rise to a directionally dominant formation of the rain patches with a length scale smaller than 10 km.

[13] Due to the limited swath width and flight orientation, the TRMM-PR orbital observations often cannot capture the entire spatial extent of the storm events. TRMM-PR products often provide a cropped version of the whole storm with abrupt changes of intensity values on the swath boundaries. These artificial edges contaminate the spectral signature and give rise to some spectral leakages (see Figure 6), which do not allow us to properly study the directional organization of the rain cells (i.e., edges) from this product. Although this boundary effect may be handled, for instance by padding the TRMM-PR images with mirror reflection of themselves across the boundaries, this obviously does not add any new information that can be exploited to study the spatial organization of the rain cells. However, the decay rate of the radially averaged ensemble spectrum at similar frequency bands confirms that the reflectivity images of the TRMM sensor exhibit slightly weaker correlation structure (more irregular) compared to the NEXRAD products.

Figure 6.

Radially averaged spectra for the ensemble of TRMM-PR reflectivity images in the (a) Houston and (b) Melbourne GV sites. The smaller values of the spectral slopes compared to those of the corresponding NEXRAD reflectivity images of Figure 5 imply that the reflectivity images produced by the spaceborne TRMM-PR exhibit a weaker spatial correlation than those produced by the ground-based NEXRAD. The 2D ensemble spectra of the TRMM-PR data at (c) HSTN and (d) MELB sites indicate similarity to the corresponding NEXRAD spectra but also show significant spectral leakage due to the artificial edge effects introduced by the swath boundaries (see text for more explanation).

4. Statistics of Subband Components in the Wavelet Domain

[14] Natural processes exhibit variability over a broad range of scales, often manifesting itself in isolated singularities in the form of edges or nested areas of intense activity. The decay of the Fourier spectrum captures the global distribution of variance without providing information about the local distribution of the process variability at different scales. Using a set of multiscale band-pass filters at different orientations has been found to be extremely useful for extracting the information content of the local jump discontinuities and abrupt fluctuations of these fields [e.g., Kumar and Foufoula-Georgiou, 1993a, 1993b; Perica and Foufoula-Georgiou, 1996; Huang and Mumford, 1999; Lee et al., 2001; Mallat, 2009]. Spatial precipitation fields are highly clustered and exhibit strong correlation along with sparseness (zeroes) in the real domain, mainly recognizable as the presence of oriented edges between rain and no-rain areas. Consequently, precipitation images often exhibit a stronger sparseness condition in the wavelet domain as the coherent cells and broad homogeneous areas would map into near-zero wavelet high-pass coefficients. This often manifests itself in the marginal histogram of the wavelet subbands having a sharp peak at the center (i.e., around zero) and extended heavy tails which cannot be modeled in a Gaussian framework. As a simple treatment to overcome this leptokurtic behavior, Perica and Foufoula-Georgiou [1996] proposed a Gaussian density model for the so-called “standardized rainfall fluctuations”, defined as the high-pass orthogonal wavelet coefficients divided by their corresponding low-pass coefficients. Although that treatment can partially model the observed thick tail behavior, it cannot flexibly account for the cusp singularity or large mass of the wavelet coefficients around the center of the distribution.

[15] In this study, we demonstrate that the Generalized Gaussian (GG) density which has been widely used for statistical modeling of the high-pass wavelet subbands of natural images [e.g., Huang and Mumford, 1999; Lee et al., 2001] can be employed to fully characterize the marginal statistics and heavy tail properties of the precipitation reflectivity images. As the rainfall imageries generally suffer from a considerable number of zero intensity values at the background (nonrainy areas within the field of view), a method is also presented to allow characterization of only the “relevant zeroes”, i.e., the zeroes that correspond to small isolated dry areas within the storm domain and to those corresponding to the storm edges [see also Kumar and Foufoula-Georgiou, 1994]. In addition, it is shown that despite the decorrelation capacity of the wavelet transformation [e.g., see Wornell, 1990], the wavelet coefficients of the rainfall images exhibit a weak correlation structure and a considerably regular higher-order scale-to-scale dependence, which needs to be addressed for proper multiresolution modeling of precipitation imageries.

4.1. Wavelet Decomposition and Marginal Statistics

[16] The Orthonormal Discrete Wavelet Transformation (OWT) [Mallat, 1989] decomposes a 2D signal f(x, y) of size K × L into a pair of almost uncorrelated expansion coefficients $d^{i}_{m,k,l}$ (called wavelet coefficients) and orthonormal basis functions $\psi^{i}_{m,k,l}(x, y)$ in the form of

$$ f(x, y) = \sum_{i} \sum_{m} \sum_{k} \sum_{l} d^{i}_{m,k,l}\, \psi^{i}_{m,k,l}(x, y), \qquad d^{i}_{m,k,l} = \left\langle f, \psi^{i}_{m,k,l} \right\rangle, \tag{3} $$

where $\psi^{i}_{m,k,l}(x, y)$ is the wavelet basis function at subband i ∈ {H, V, D} (i.e., Horizontal, Vertical and Diagonal directions in this study), m denotes the scale level and (k, l) are translation indices (see also the detailed exposition by Kumar and Foufoula-Georgiou [1993a]). This representation uses the orthogonal wavelet basis functions $\psi^{i}_{m,k,l}(x, y)$ at a “critical sampling rate” (meaning that the size (N) of the input signal is equal to the total size of the output subbands). Owing to the orthogonality of the bases and the critical sampling rate, the inverse transformation allows a perfect reconstruction with a computational complexity of the order of O(N). However, this critical sampling rate makes the wavelet representation shift variant and imposes significant aliasing in each individual subband. Although the aliasing artifacts will cancel out in the reconstruction phase, this would be troublesome for processing and parametrization of each individual subband [e.g., Nason and Silverman, 1995; Simoncelli and Freeman, 1995].

[17] In this study, a shift-invariant Undecimated Orthogonal Discrete Wavelet Transform (UOWT) [Nason and Silverman, 1995] is used for decomposition of the precipitation reflectivity images and statistical characterization of their wavelet coefficients. This decomposition produces nearly alias-free and overcomplete subband information. The latter property is another great advantage over the conventional OWT in which the size of the signal is downsampled by a factor of 2 at each level of decomposition, giving rise to inferential problems in subband parametrization of rainfall images with small wetted area. Obviously, the advantages of this overcomplete frame expansion come at the expense of a higher computational complexity of the order of O(N log N).
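The contrast between the critically sampled and undecimated transforms can be seen in a one-dimensional toy example. The sketch below is illustrative only and makes several assumptions not taken from the paper: it uses the Haar filter pair, a single decomposition level, a 1D signal, and periodic boundaries.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_decimated(x):
    """One level of the critically sampled Haar transform (len(x) must be even).
    Each subband has len(x) / 2 coefficients."""
    half = len(x) // 2
    low = [(x[2 * k] + x[2 * k + 1]) / SQRT2 for k in range(half)]
    high = [(x[2 * k] - x[2 * k + 1]) / SQRT2 for k in range(half)]
    return low, high

def haar_undecimated(x):
    """One level of the undecimated Haar transform with periodic boundaries.
    Each subband keeps len(x) coefficients (overcomplete)."""
    n = len(x)
    low = [(x[k] + x[(k + 1) % n]) / SQRT2 for k in range(n)]
    high = [(x[k] - x[(k + 1) % n]) / SQRT2 for k in range(n)]
    return low, high

def shift(x, s=1):
    """Circularly shift a list s samples to the right."""
    s %= len(x)
    return x[-s:] + x[:-s]

def energy(c):
    return sum(v * v for v in c)

signal = [0.0, 0.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]

_, hd = haar_decimated(signal)
_, hd_shifted = haar_decimated(shift(signal))
_, hu = haar_undecimated(signal)
_, hu_shifted = haar_undecimated(shift(signal))

# Decimated high-pass energy depends on where the edges fall relative to the grid.
print(energy(hd), energy(hd_shifted))
# Undecimated high-pass coefficients simply move with the signal.
print(all(abs(a - b) < 1e-12 for a, b in zip(hu_shifted, shift(hu))))
```

In the decimated case a one-sample shift changes the subband statistics, whereas the undecimated coefficients are only translated, which is the property that makes per-subband parametrization more stable.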

[18] It is noted that in the wavelet domain, background zeroes will remain zeroes at consecutive scales in both high and low-pass subbands, while a range of zero intensity values within the storm domain (i.e., those zero intensity pixels which define the boundaries of the wetted areas of the storm from the background zeros) will become nonzeroes from fine-to-coarse scales. This observation provides an efficient means of eliminating the background zeroes while keeping the zeroes of interest (and their locations). In this study, to resolve this issue, the conditional marginal densities of high-pass subbands are estimated given that the low-pass coefficients at the same location and scale are positive (see Figure 7).
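The conditioning rule described in this paragraph can be sketched on a one-dimensional transect. The example assumes an unnormalized Haar averaging/differencing pair, a single level, and periodic boundaries, all of which are illustrative choices made here; the function names and the toy transect are hypothetical.

```python
def undecimated_haar_level(x):
    """One undecimated level with unnormalized Haar filters and periodic boundaries."""
    n = len(x)
    low = [(x[k] + x[(k + 1) % n]) / 2.0 for k in range(n)]
    high = [(x[k] - x[(k + 1) % n]) / 2.0 for k in range(n)]
    return low, high

def conditional_highpass(x):
    """Keep high-pass coefficients only where the co-located low-pass value is positive."""
    low, high = undecimated_haar_level(x)
    return [h for l, h in zip(low, high) if l > 0.0]

# Hypothetical reflectivity transect: dry background around a small wet area.
transect = [0.0, 0.0, 0.0, 5.0, 7.0, 6.0, 0.0, 0.0, 0.0, 0.0]
print(conditional_highpass(transect))
```

The dry pixel adjacent to the wet area survives the conditioning because smoothing makes its low-pass value positive, while the background zeroes far from the storm are discarded.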

Figure 7.

(a) NEXRAD reflectivity image of a storm over the HSTN site on 15 January 2007 at original resolution of 1 km. (b) First level low-pass subband image of the storm using an Undecimated Orthogonal Wavelet Transform (UOWT). (c) Pixel-wise difference of Figures 7b and 7a. At each level of the wavelet decomposition, as a result of the convolution of the field with the wavelet scaling function, a set of zero values next to the edges of the wetted areas becomes nonzero and subsequently the areas of nonzero pixels progressively grow from fine-to-coarse scales.

[19] Figure 8b shows the marginal distribution of the wavelet coefficients (in log-probability scale) computed, as discussed above, from the NEXRAD reflectivity data of a June 2007 storm over the Melbourne site (see Figure 8a). A highly leptokurtic behavior is observed which is in contrast, for example, with the Gaussian marginal distributions of the wavelet coefficients of a fractional Brownian surface, as shown in Figures 8c and 8d (note that a Gaussian density is an inverted parabola in a log-probability scale). It is important to note that both the 2D fractional Brownian surface (the slope of its power spectrum is 2H + 2, where 0 < H < 1 is the self-similarity index) and the precipitation image exhibit similar power law spectrum (i.e., 1/f law) but their marginal statistics are drastically different.

Figure 8.

(a) NEXRAD reflectivity image of a storm over the MELB site on 5 July 2007 at 20:00:00 UTC and (b) the associated histogram of the wavelet coefficients normalized by the standard deviation. Sharper peak and heavier tail than the Gaussian case is a typical statistical feature of the rainfall images in the wavelet domain. (c) Positive part of a 2D fractional Brownian surface with self-similarity index of 0.5, and (d) the associated Gaussian marginal histogram of the normalized horizontal wavelet high-pass coefficients.

[20] The Generalized Gaussian (GG) family, also known as the Generalized Laplace, has often been used to model the marginals of the wavelet coefficients in the context of natural images [e.g., Huang and Mumford, 1999; Lee et al., 2001]. The early form of this class of density functions was first presented by Subbotin [1923]; however, it can be considered as a subclass of a more flexible family, the so-called Generalized Gamma density functions [Stacy, 1962; Choy and Tong, 2010]. The zero-mean parameterization of this family can be described by a shape α ∈ (0, ∞) and a width parameter s ∈ (0, ∞) as [e.g., Nadarajah, 2005],

$$ p_X(x \mid \theta) = \frac{\alpha}{2 s\, \Gamma(1/\alpha)} \exp\left( - \left| \frac{x}{s} \right|^{\alpha} \right), \tag{4} $$

where θ ∈ {s, α} and Γ(·) denotes the standard gamma function $\Gamma(a) = \int_0^{\infty} e^{-t} t^{a-1}\, dt$, a > 0. This parameterization allows a concise characterization of a symmetric probability continuum spanning a wide range of distributions, from the Dirac delta function (α → 0) to the uniform density (α → ∞) in the limiting cases. The tail probability of this family is summable and admits the classical central limit theorem. Therefore, it is only a suitable probability model for signals with finite energy in the $L^2(\mathbb{R})$ inner product space. It is worth noting that as this family is a subclass of the generalized gamma density function, Walker and Gutiérrez-Peña [1999] and Martín and Pérez [2009] proposed a gamma mixture representation which allows a pseudo random number generation scheme for this family. Letting $G$ be a gamma random variate $G \sim \Gamma(\text{shape} = 1 + 1/\alpha, \text{scale} = 1)$, the GG random variables in equation (4) can be generated by $X = s\, G^{1/\alpha}\, U$, where $U$ is a uniform random variable on [−1, 1].
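The gamma mixture generator described at the end of this paragraph translates directly into code. The sketch below is a minimal version using Python's standard library; the function names and the parameter values in the example are illustrative assumptions, and the variance check relies on the standard GG moment formula s²Γ(3/α)/Γ(1/α).

```python
import math
import random

def sample_gg(alpha, s, n, rng=None):
    """Draw n Generalized Gaussian variates with shape alpha and width s using
    X = s * G**(1/alpha) * U, G ~ Gamma(shape=1 + 1/alpha, scale=1), U ~ Uniform[-1, 1]."""
    rng = rng or random.Random()
    out = []
    for _ in range(n):
        g = rng.gammavariate(1.0 + 1.0 / alpha, 1.0)
        u = rng.uniform(-1.0, 1.0)
        out.append(s * g ** (1.0 / alpha) * u)
    return out

def gg_variance(alpha, s):
    """Theoretical variance of the zero-mean GG density."""
    return s * s * math.gamma(3.0 / alpha) / math.gamma(1.0 / alpha)

# With alpha = 2 and s = sqrt(2) the GG density is a standard Gaussian.
draws = sample_gg(2.0, math.sqrt(2.0), 20000, random.Random(0))
sample_var = sum(x * x for x in draws) / len(draws)
print(f"sample variance = {sample_var:.3f}, theory = {gg_variance(2.0, math.sqrt(2.0)):.3f}")
```

Smaller shape values (for example, below 1) produce the sharply peaked, heavy-tailed samples that are of interest for wavelet coefficients.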

[21] Given a set of sample wavelet coefficients $d_j \in (d^i_{m,1}, d^i_{m,2}, \ldots, d^i_{m,n})^T$ of precipitation reflectivity images at subband i and scale m, where j = 1, 2, …, n correspond to the spatial locations of these coefficients, the parameters of the fitted GG probability density can be estimated using the Method of Moments (MOM) or the Maximum Likelihood (ML) estimation method. The density of the GG distribution in the form of equation (4) can be fully characterized given the second- and fourth-order central moments of the sampled data,

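$$\mathbb{E}[x^2] = \frac{s^2\,\Gamma(3/\alpha)}{\Gamma(1/\alpha)}, \qquad \mathbb{E}[x^4] = \frac{s^4\,\Gamma(5/\alpha)}{\Gamma(1/\alpha)} \qquad (5)$$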

Accordingly, the shape parameter α can be estimated from the sample kurtosis of the wavelet coefficients as defined above, by numerically solving the following nonlinear equation,

$$\kappa = \frac{\Gamma(1/\alpha)\,\Gamma(5/\alpha)}{\Gamma^2(3/\alpha)} \qquad (6)$$

and knowing the shape parameter, the width parameter can be estimated using equation (5). A closed form set of equations is also derivable to estimate the parameters in a ML sense. Specifically, maximizing the log-likelihood function

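$$\mathcal{L}(\theta \mid d_1, \ldots, d_n) = n \log\frac{\alpha}{2 s\,\Gamma(1/\alpha)} - \sum_{j=1}^{n}\left|\frac{d_j}{s}\right|^{\alpha} \qquad (7)$$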

yields the following set of nonlinear equations that can be solved numerically [Nadarajah, 2005],

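$$\frac{\partial \mathcal{L}}{\partial s} = -\frac{n}{s} + \frac{\alpha}{s}\sum_{j=1}^{n}\left|\frac{d_j}{s}\right|^{\alpha} = 0, \qquad \frac{\partial \mathcal{L}}{\partial \alpha} = \frac{n}{\alpha} + \frac{n\,\Psi(1/\alpha)}{\alpha^{2}} - \sum_{j=1}^{n}\left|\frac{d_j}{s}\right|^{\alpha}\log\left|\frac{d_j}{s}\right| = 0 \qquad (8)$$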

where Ψ(·) is the digamma function $\Psi(a) = \frac{d}{da}\log\Gamma(a)$.
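The moment-based route can be illustrated as follows. The sketch assumes the GG kurtosis takes the form Γ(1/α)Γ(5/α)/Γ(3/α)², which follows from the moments of the density in equation (4), and solves for α by bisection; the function names and the search bounds are hypothetical choices, not part of the original study.

```python
import math


def gg_kurtosis(alpha):
    """Kurtosis Gamma(1/a) Gamma(5/a) / Gamma(3/a)^2 of a GG density."""
    return math.exp(math.lgamma(1 / alpha) + math.lgamma(5 / alpha)
                    - 2 * math.lgamma(3 / alpha))


def fit_gg_moments(coeffs, lo=0.05, hi=20.0, tol=1e-10):
    """Method-of-moments estimate (alpha, s) from zero-mean samples."""
    n = len(coeffs)
    m2 = sum(c ** 2 for c in coeffs) / n
    m4 = sum(c ** 4 for c in coeffs) / n
    kappa = m4 / (m2 * m2)
    # gg_kurtosis decreases monotonically in alpha, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gg_kurtosis(mid) > kappa:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    # Invert the second-moment relation for the width parameter.
    s = math.sqrt(m2 * math.exp(math.lgamma(1 / alpha) - math.lgamma(3 / alpha)))
    return alpha, s


print(gg_kurtosis(2.0), gg_kurtosis(1.0))  # Gaussian and Laplace special cases
```

If the sample kurtosis falls outside the range spanned by the search bounds, the bisection simply returns the nearest bound, so the bounds should be widened for unusually heavy-tailed samples.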

[22] Employing the ML estimation method, Figure 9 depicts the fitted Generalized Gaussian distribution to the average histogram of the first horizontal subband coefficients for all precipitation images in the Texas and Melbourne sites. It can be observed that the GG density can explain impressively well the heavy tail non-Gaussian features of the rainfall fields in the wavelet domain. As the GG density can be fully characterized knowing the second- and fourth-order statistical moments in equation (5), the evolution of the density at multiple scales can also be studied via characterization of the scaling properties of these moments.
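A maximum likelihood fit of this kind can be approximated without a digamma routine by maximizing the log-likelihood directly. The sketch below is one possible approach under stated assumptions, not the procedure used for Figure 9: for each trial α the width s is set to its closed-form ML value, and the resulting profile likelihood is maximized over α with a golden-section search, which assumes the profile is unimodal on the hypothetical search interval [0.1, 5].

```python
import math
import random


def gg_profile_loglik(alpha, abs_d):
    """GG log-likelihood with s replaced by its ML value for this alpha."""
    n = len(abs_d)
    s = (alpha / n * sum(a ** alpha for a in abs_d)) ** (1.0 / alpha)
    # At the optimal s, sum(|d/s|^alpha) equals n / alpha.
    loglik = n * (math.log(alpha) - math.log(2.0 * s)
                  - math.lgamma(1.0 / alpha) - 1.0 / alpha)
    return loglik, s


def fit_gg_ml(coeffs, lo=0.1, hi=5.0, iters=40):
    """Golden-section search for the alpha that maximizes the profile likelihood."""
    abs_d = [abs(c) for c in coeffs]
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc = gg_profile_loglik(c, abs_d)[0]
    fd = gg_profile_loglik(d, abs_d)[0]
    for _ in range(iters):
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = gg_profile_loglik(c, abs_d)[0]
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = gg_profile_loglik(d, abs_d)[0]
    alpha = 0.5 * (a + b)
    return alpha, gg_profile_loglik(alpha, abs_d)[1]


# Laplace samples correspond to alpha = 1 and s = 1 in this parameterization.
rng = random.Random(0)
laplace = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(5_000)]
print(fit_gg_ml(laplace))
```

Profiling out s reduces the two-parameter problem to a one-dimensional search, which keeps the sketch short at the cost of relying on the unimodality assumption.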

Figure 9.

The empirical log histogram of the horizontal subband coefficients normalized by the standard deviation, at one level of decomposition in (a) HSTN and (b) MELB. The dots show the empirical histogram averaged over all storms in each site (see Figure 1), and the solid line is the Maximum Likelihood fitted GG distributions. The shape parameter for the average histogram is calculated consistently around 0.7 for both sites. The dashed lines present the 95% quantiles associated with the estimated parameters for each individual data set.

[23] As an orthogonal overcomplete wavelet representation is used in this study, Parseval's theorem guarantees that the 2-norm is conserved in the transformed domain. Hence, as expected from spectral analysis, the scaling of the second-order statistics in the wavelet domain shall remain a power law. Observations (see Figure 10) also demonstrate that the fourth-order moment of the wavelet coefficients obeys a power law scaling, which allows us to derive parametric expressions to describe the evolution of the GG density at multiple scales of interest. This can be further formalized in the framework of a stochastic multifractal representation [e.g., Mandelbrot et al., 1997; Abry et al., 2004], in which the qth-order moment of the wavelet coefficients in a particular subband can be explained as

$$\mathbb{E}\left[\,|d^i_m|^q\,\right] = c_q\, 2^{m\tau_q} \qquad (9)$$

where cq is a prefactor and τq characterizes the scaling law of the process in a finite range of scales. In the case that the scaling exponent τq can be uniquely expressed as τq = qH, with the self-similarity index H independent of q, the process is called monofractal. In this case, the tail thickness of the marginal distribution of the process remains scale-invariant.

Figure 10.

The average moment scaling law of the high-pass horizontal subband coefficients (wavelet coefficients) in the (a) HSTN and (b) MELB sites. Here m = 1,…, 4 denotes the spatial scales of 2 to 16 km. This information can be exploited to characterize the evolution of the Generalized Gaussian (GG) density across a range of scales of interest. It appears that the scaling laws can be estimated consistently for both sites. The shaded areas denote the 95% quantile range of estimation. See Table 3 for estimated values of the scaling exponents.

[24] Decomposing all of the precipitation reflectivity images at four levels of decomposition (m = 1 to 4, spatial scales of 2 to 16 km), τq is estimated in a least squares sense for q ∈ {2, 4}; see Figure 10. The estimated values of τ2 and τ4 are summarized in Table 3 for all of the subband coefficients. It can be observed that the value of τ4 − 2τ2 is not equal to zero as one would obtain for a monofractal process. Instead, it is found that τ4 − 2τ2 < 0, implying that the kurtosis of the coefficients shrinks from fine-to-coarse scales. This can be seen from the kurtosis evolution at different scales given as

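$$\kappa_m = \frac{\mathbb{E}\left[|d_m|^4\right]}{\left(\mathbb{E}\left[|d_m|^2\right]\right)^2} = \frac{c_4}{c_2^2}\, 2^{m(\tau_4 - 2\tau_2)} \qquad (10)$$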

This multiscale behavior of the kurtosis implies a multifractal scaling law of the wavelet coefficients. However, this nonlinear scaling (i.e., shrinkage of the tail) can be consistently studied via a linear (in log-log scale) characterization of the second- and fourth-order moments, individually. Accordingly, given any prior information about the scaling exponents and the wavelet coefficients at any particular scale (from which the GG parameters can be directly estimated), the evolution of the marginal density in terms of the parameters of the GG distribution can be fully explained at any scale of interest.
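The least squares estimation of the scaling exponents and the implied kurtosis evolution can be sketched as below. The sketch assumes the moment scaling convention E|d_m|^q = c_q 2^(m τ_q), with m increasing from fine to coarse scales; the function names and the prefactors 0.5 and 2.5 are hypothetical, while the exponents 1.01 and 1.89 are the first-column values of Table 3.

```python
import math


def fit_scaling_exponent(levels, moments):
    """Least-squares slope of log2(moment) against level m, plus the prefactor c_q."""
    y = [math.log2(v) for v in moments]
    n = len(levels)
    mx = sum(levels) / n
    my = sum(y) / n
    sxx = sum((m - mx) ** 2 for m in levels)
    sxy = sum((m - mx) * (v - my) for m, v in zip(levels, y))
    tau = sxy / sxx
    c = 2.0 ** (my - tau * mx)
    return tau, c


def kurtosis_at_level(m, tau2, c2, tau4, c4):
    """Kurtosis implied by separate power laws for the 2nd and 4th moments."""
    return (c4 / c2 ** 2) * 2.0 ** (m * (tau4 - 2.0 * tau2))


levels = [1, 2, 3, 4]                                # scales of 2 to 16 km
second = [0.5 * 2.0 ** (1.01 * m) for m in levels]   # synthetic 2nd-order moments
fourth = [2.5 * 2.0 ** (1.89 * m) for m in levels]   # synthetic 4th-order moments
tau2, c2 = fit_scaling_exponent(levels, second)
tau4, c4 = fit_scaling_exponent(levels, fourth)
print(tau4 - 2.0 * tau2)                             # negative: tail thins toward coarse scales
print([kurtosis_at_level(m, tau2, c2, tau4, c4) for m in levels])
```

With real data the moments would be sample averages of |d|² and |d|⁴ over each subband and level rather than exact power laws.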

Table 3. Estimated Scaling Exponents of the Second- and Fourth-Order Statistical Moments for the NEXRAD Data Set at Different Orientations in the Range of Scales of Interest 2 to 16 kmᵃ

      HSTN                                                     MELB
      H                  V                   D                 H                  V                  D
τ2    1.01 (0.88–1.10)   1.00 (0.87–1.10)    0.91 (0.80–0.99)  1.03 (0.91–1.11)   1.02 (0.90–1.11)   0.93 (0.85–0.99)
τ4    1.89 (1.68–2.10)   1.86 (1.629–2.03)   1.69 (1.47–1.90)  1.91 (1.74–2.08)   1.88 (1.70–2.06)   1.75 (1.57–1.88)

ᵃ Orientations: H, horizontal; V, vertical; D, diagonal.

4.2. Joint Statistics of the Wavelet Coefficients

4.2.1. Scale-to-Scale Dependence

[25] Similar to the Fourier expansion, it is theoretically proven [Wornell, 1990] that the discrete orthogonal wavelet transform for 1/f processes is an approximate Karhunen-Loève-like expansion that can decompose a correlated process into a set of uncorrelated expansion coefficients and orthogonal basis functions. For a 1D Gaussian scaling process, such as fractional Brownian motion with self-similarity index H, it has been theoretically shown [Tewfik and Kim, 1992] that the covariance of wavelet coefficients decays in the order of

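$$\mathbb{E}\left[d_{m,k}\, d_{r,l}\right] \sim \mathcal{O}\left(\left|2^{m}k - 2^{r}l\right|^{2(H-R)}\right) \qquad (11)$$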

where m and r denote different scales, (k, l) are translation indices and R is the number of vanishing moments of the chosen wavelet (i.e., $\int x^p \psi_{m,k}(x)\,dx = 0$, p = 0, 1, …, R − 1). Obviously, according to equation (11), the decorrelation is not perfect for nearby coefficients and the decay rate also depends on the order of the vanishing moments of the selected wavelet; the larger the number of vanishing moments, the larger the decorrelation rate. Moreover, due to the presence of multioriented edges and strong local dependencies of the intensity values in precipitation images, this whitening effect is more complicated, especially in an overcomplete representation.

[26] Although the Haar wavelet has the least number of vanishing moments, i.e., R = 1, it has some appealing features, especially for analyzing the 1/f scaling and interpreting the rainfall wavelet coefficients as simple first-order increments of the field [e.g., Perica and Foufoula-Georgiou, 1996; Riedi et al., 1999]. For practical implementation on finite domain images, this wavelet has the shortest support among all of the wavelets and thus it does not need a periodic extension of the analyzed signal [Zhang et al., 2004]. Due to the presence of the pronounced edges in the rainfall images separating the rainy areas from the background zeros, this naturally implies that the distribution of the high-pass coefficients of the Haar wavelet with finite support has the least amount of cusp singularities at the center, which makes it more tractable for statistical parametrization. In addition, selection of the Haar wavelet is consistent with the assumption that the retrieved precipitation product is the arithmetic average representation of the highly irregular precipitation process at a particular scale (the low-pass filter corresponding to the Haar wavelet is like a box averaging filter; see Kumar and Foufoula-Georgiou [1993a] for an introductory exposition). Obviously, using the Haar wavelet, there exist more pronounced intrascale and scale-to-scale dependencies, which need to be characterized for proper stochastic modeling of the rainfall images.
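The interpretation of the Haar coefficients as box averages and first-order increments can be made concrete with a one-level transform. The sketch below implements the decimated orthonormal 2D Haar transform on nonoverlapping 2 × 2 blocks with NumPy; it is an illustrative assumption, since the study itself works with an overcomplete representation, and the subband naming convention used here is also an assumption.

```python
import numpy as np


def haar_level(image):
    """One level of the orthonormal 2D Haar transform on nonoverlapping 2x2 blocks."""
    a = image[0::2, 0::2]   # top-left pixel of each block
    b = image[0::2, 1::2]   # top-right
    c = image[1::2, 0::2]   # bottom-left
    d = image[1::2, 1::2]   # bottom-right
    low = (a + b + c + d) / 2.0     # twice the 2x2 box average
    horiz = (a + b - c - d) / 2.0   # difference between rows
    vert = (a - b + c - d) / 2.0    # difference between columns
    diag = (a - b - c + d) / 2.0    # mixed difference
    return low, horiz, vert, diag


rng = np.random.default_rng(0)
img = rng.random((8, 8))            # stand-in for a reflectivity image with even dimensions
low, horiz, vert, diag = haar_level(img)
energy_in = np.sum(img ** 2)
energy_out = sum(np.sum(band ** 2) for band in (low, horiz, vert, diag))
print(energy_in, energy_out)        # equal, as expected from Parseval's theorem
```

Applying `haar_level` again to `low` yields the next coarser scale, so four recursive calls span the 2 to 16 km range for a 1 km image.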

[27] The 2D joint and conditional histograms of the Haar wavelet coefficients of the precipitation reflectivity images have been estimated to study the scale-to-scale dependence of the rainfall wavelet subbands at two adjacent scales. The relationships of the coarse and next finer scale coefficients are studied under the name of parent and child dependency. Figure 11 shows the average 2D joint and conditional histograms of the wavelet coefficients for the MELB data set. The conditional histogram is just a remapped version of the 2D joint histogram in which, given the parent value, each vertical column of bins is independently normalized into a probability scale such that ∑ₙ p(childₙ | parent) = 1.
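The remapping from a joint to a conditional histogram can be sketched with NumPy as follows. The function name, the bin count, and the synthetic parent and child samples are hypothetical; in practice each parent coefficient would be paired with its children at the next finer scale before calling the function.

```python
import numpy as np


def conditional_histogram(parent, child, bins=32):
    """Joint histogram renormalized so each parent bin sums to one over the child bins."""
    joint, p_edges, c_edges = np.histogram2d(parent, child, bins=bins)
    totals = joint.sum(axis=1, keepdims=True)       # total count in each parent bin
    cond = np.divide(joint, totals, out=np.zeros_like(joint), where=totals > 0)
    return cond, p_edges, c_edges


# Synthetic pair whose child variance grows with the parent magnitude.
rng = np.random.default_rng(0)
parent = rng.standard_normal(20_000)
child = np.abs(parent) * rng.standard_normal(20_000)
cond, p_edges, c_edges = conditional_histogram(parent, child)
print(cond.sum(axis=1))   # 1 for every populated parent bin, 0 for empty ones
```

Plotting `cond` with the parent on the horizontal axis gives the bow tie shape for this synthetic pair, because the spread of the child grows with the parent magnitude.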

Figure 11.

The average joint histograms of the (a) diagonal and (b) vertical high-pass wavelet coefficients at the MELB site are shown along with (c, d) the corresponding conditional histograms. The bow tie shape of the conditional histograms manifests the scale-to-scale dependence, and the tilted shape indicates the presence of a nondiagonal covariance structure.

[28] The shape of the joint histograms (Figures 11a and 11b) clearly denotes that the conditional probability of the children given the parent is not uniform all over the domain and there exists higher-order dependency that cannot be completely eliminated under the wavelet transformation. Indeed, the shape of the computed conditional histograms (Figures 11c and 11d) denotes that the variance of the children depends on the parent magnitudes and larger parents give rise to children with larger variance. The tilted bow tie shape of the vertical subband (see Figures 11b and 11d) also signifies the presence of off-diagonal nonzero elements in the covariance matrix of the parent and child coefficients. All of these confirm that the wavelet transformation cannot completely eliminate the scale-to-scale correlation and higher-order dependence in the rainfall fluctuations. We remind the reader that the analysis in the reflectivity domain is related to analysis in the log-rainfall domain, and thus “rainfall fluctuations” here and in the sequel literally refer to fluctuations of the log-transformed rainfall fields. These kinds of statistical dependencies are also observed for the HSTN-NEXRAD data set (not shown here) for all nearby wavelet coefficients at all orientations.

4.2.2. Intrascale Dependence

[29] In addition to the fact that the wavelet transform does not completely decorrelate (globally) the precipitation images across scales, in this part we show that the wavelet coefficients of the precipitation images also exhibit a local intrascale dependence structure. Figure 12b shows the absolute value of the vertical high-pass subband of the storm image in Figure 12a. Although it seems that the coefficients are not strongly correlated in a global sense, they are locally structured, especially near the major edges. The estimated local covariance matrices of a 5 × 5 neighborhood of the wavelet coefficients for all orientations are shown in Figures 12c, 12d and 12e. These covariance matrices are estimated using a bootstrap resampling scheme. Blocks of the neighborhood coefficients $d^i_{m,k,l}$ are sampled with replacement; then, for each jth sampled block, the outer product $\mathbf{d}_j \mathbf{d}_j^T$ is computed, where $\mathbf{d}_j$ is the column-wise vectorized version of $d^i_{m,k,l}$ in a fixed order. For n bootstrap samples, the covariance matrix Σd is estimated as follows:

$$\hat{\Sigma}_d = \frac{1}{n}\sum_{j=1}^{n} \mathbf{d}_j \mathbf{d}_j^T \qquad (12)$$

For sufficiently large n, this estimate guarantees convergence in probability to the true population value [e.g., Lunneborg, 2000].
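The block bootstrap described above can be sketched with NumPy. The sketch assumes zero-mean coefficients and a 1/n normalization of the averaged outer products; the function name, the number of bootstrap samples, and the white-noise test field are hypothetical.

```python
import numpy as np


def bootstrap_local_covariance(subband, size=5, n_boot=2000, seed=0):
    """Average outer product of column-wise vectorized size x size blocks."""
    rng = np.random.default_rng(seed)
    rows, cols = subband.shape
    # Top-left corners of the resampled blocks, drawn with replacement.
    r = rng.integers(0, rows - size + 1, n_boot)
    c = rng.integers(0, cols - size + 1, n_boot)
    acc = np.zeros((size * size, size * size))
    for i, j in zip(r, c):
        d = subband[i:i + size, j:j + size].flatten(order="F")  # column-wise vectorization
        acc += np.outer(d, d)
    return acc / n_boot


# White noise should give a nearly diagonal 25 x 25 matrix.
noise = np.random.default_rng(1).standard_normal((64, 64))
cov = bootstrap_local_covariance(noise)
print(cov.shape, np.mean(np.diag(cov)))
```

For a real subband, pronounced off-diagonal entries in `cov` would indicate the local intrascale dependence discussed in the text.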

Figure 12.

(a) NEXRAD image of a storm over MELB on 26 September 2004 at 04:50:00 (UTC) and (b) the associated vertical subband image. Images of the covariance matrices of a 5 × 5 neighborhood for the (c) horizontal, (d) vertical, and (e) diagonal subband images signify the presence of an intrascale dependence structure in the wavelet coefficients.

[30] The off-diagonal nonzero elements of the estimated covariance matrices signify the imperfect local intrascale whitening effect of the wavelet transformation (see Figures 12c–12e), which is more significant for the vertical and horizontal subbands.

[31] All of these findings corresponding to the existence of a dependent structure among the wavelet coefficients challenge most of the available stochastic spatial rainfall models [e.g., see Perica and Foufoula-Georgiou, 1996] in which an uncorrelated reconstruction scheme is proposed to explain the high-frequency features of the field. Note that the uncorrelated reconstruction schemes do not offer a means for controlling the tail evolution of the high-frequency features and, naturally, cannot accurately reproduce the small-scale extreme fluctuations in the precipitation images. For applications such as stochastic downscaling or multiresolution fusion in the wavelet domain, the results reported herein imply that one has to consider a probability model for the wavelet coefficients, or say small-scale features of precipitation images, which can reproduce a heavy tailed spatially correlated process with higher-order scale-to-scale statistical dependence. In section 5, we propose a formalism within which the intrascale correlation, scale-to-scale higher-order dependence and heavy tail marginals can be simultaneously and parsimoniously reproduced in the wavelet domain.

5. GSM for the Wavelet Coefficients of Rainfall Images

[32] In this section a stochastic model is introduced which allows us to capture the laid out features of the rainfall images in the wavelet domain. Basically, this model is capable of reproducing a class of heavy tail multiscale processes with a desired covariance structure together with a specific signature of higher-order scale-to-scale dependence on the conditional histogram. To this end, the basic idea is to exploit the construction proposed by Wainwright et al. [2001] in which the high-pass wavelet coefficients are modeled via a mixture of Gaussian random variables on a tree-like structure. Specifically, it will be shown that the wavelet coefficients can be decoupled into a mixture of two different Gaussian processes in which one controls the covariance and second-order scaling, while the other one takes into account the tail and higher-order scale-to-scale dependence.

5.1. Cascade of Gaussian Scale Mixtures on Wavelet Trees

[33] Andrews and Mallows [1974], West [1987] and Wainwright et al. [2001] showed that a set of heavy tailed symmetric density functions including the Laplace, t-distribution, logistic, standard power exponential and even stable distribution, can be generated as a mixture of Gaussian random variables, called Gaussian Scale Mixtures (GSM),

$$\mathbf{d} \overset{d}{=} \sqrt{z}\,\mathbf{u} \qquad (13)$$

where $\overset{d}{=}$ stands for equality in distribution, z is a positive independent scalar random variable, the so-called mixing random variable or the multiplier, u is a zero-mean Gaussian vector with a given covariance matrix Σu and d is the family of Gaussian Scale Mixtures (GSM). In particular, knowing that z and u are independent, the n-dimensional GSM has the following density function, which can be specified with different choices of the random variable z:

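$$p_{\mathbf{d}}(\mathbf{d}) = \int_0^\infty \frac{1}{(2\pi)^{n/2}\,|z\Sigma_u|^{1/2}} \exp\left(-\frac{\mathbf{d}^T (z\Sigma_u)^{-1}\,\mathbf{d}}{2}\right) p_z(z)\,dz \qquad (14)$$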

The discrete version of equation (14) resembles the statistical concept of estimating a symmetric distribution by summing zero mean Gaussian kernel densities whose covariances have been randomized by a positive random variable z. For instance, choosing z from the family of exponential density functions yields a representation of the family of Laplace distributions. Several classes of heavy tailed distributions can be produced in the context of the GSM; however, for the GG family with 0 < α < 1, a closed form expression for the density of the mixing random variable z does not exist [e.g., Wainwright et al., 2001].

[34] By construction, one of the key properties of the GSM is

$$\mathbb{E}\left[\mathbf{d}\mathbf{d}^T\right] = \mathbb{E}[z]\,\Sigma_u \qquad (15)$$

which implies that the covariance structure of the GSM can be fully explained by the covariance matrix Σu and the mean of the multiplier process. Therefore, without loss of generality setting 𝔼[z] = 1, the whole covariance structure of the GSM can be explained by the covariance of u. Accordingly, to generate a GSM with a desirable covariance, similar to that of the rainfall wavelet coefficients, we need to adopt a mechanism which allows us to efficiently generate a (weakly) correlated Gaussian random field in a multiscale framework. For this purpose, the general class of multiresolution linear Gauss-Markov processes defined on a regular tree-like structure (Figure 13) is of particular interest [e.g., Chou et al., 1994; Willsky, 2002],

$$x(s) = A(s)\,x(s\bar{\gamma}) + B(s)\,w(s) \qquad (16)$$

where x(s) is the state of the process at node s and sγ̄ is the parent node to which x(s) is connected at the next coarser scale, A(s) is the transition matrix, specified at each node of the tree and determining the coarse-to-fine scale dynamics of the process, w(s) ∼ 𝒩(0, I) and B(s)w(s) is a Gaussian white noise with covariance Q(s) = B(s)B(s)ᵀ. The random vector x(s) at each node of the tree has a Gaussian distribution 𝒩(0, Σx(s)) with the following coarse-to-fine scale dynamics for the evolution of the covariance:

$$\Sigma_x(s) = A(s)\,\Sigma_x(s\bar{\gamma})\,A(s)^T + Q(s) \qquad (17)$$

known as the discrete Lyapunov equation. According to this recursion, the strength of the scale-to-scale dependence is determined by the value of the transition matrix A(s). When this prefactor tends to zero, the Markovian structure of the tree weakens and the process would be roughly uncorrelated from scale to scale with a nearly diagonal covariance Σx(s) ≈ Q(s). This construction provides a very flexible multiscale covariance structure, within which the special case of scale-to-scale stationarity can also be achieved by setting A(s) = A, B(s) = B, Σx(s) = Σx in equation (17) and adjusting A and B accordingly. However, due to the observed second-order scaling of the wavelet coefficients of the rainfall images, the nonstationary scale-to-scale construction of u(s) is of interest in this study. To this end, a stationary process x(s) ∼ 𝒩(0, Σx(0)) can be generated according to the dynamics in equation (16), where Σx(0) is the covariance at the root node, and then the scale-to-scale nonstationarity can be imposed by setting

$$u(s) = 2^{-j(s)\tau_2/2}\,x(s) \qquad (18)$$

where j(s) represents the scale level on the wavelet tree from coarse-to-fine scales and τ2 represents the geometric decay rate of the variance of the wavelet coefficients across dyadic scales (see equation (9) and Figure 10).
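A scalar version of this construction can be sketched with the standard library. The sketch makes several simplifying assumptions that are not in the original: a binary tree instead of a quad tree, scalar states with A(s) = η and B(s) = √(1 − η²) so that the stationary variance stays at one, and a rescaling by 2^(−j τ2 / 2) at level j to impose the geometric decay of the variance. All function names are hypothetical.

```python
import math
import random


def lyapunov_variance(var_parent, a, b):
    """Scalar form of the covariance recursion: var(s) = a^2 var(parent) + b^2."""
    return a * a * var_parent + b * b


def simulate_cascade(levels, eta, tau2, seed=0):
    """Binary-tree cascade x(s) = eta x(parent) + sqrt(1 - eta^2) w(s), then rescaled."""
    rng = random.Random(seed)
    b = math.sqrt(1.0 - eta * eta)        # keeps var[x(s)] = 1 at every level
    x = [rng.gauss(0.0, 1.0)]             # root node with unit variance
    u_by_level = []
    for j in range(1, levels + 1):
        # Each parent spawns two children (a quad tree would spawn four).
        x = [eta * parent + b * rng.gauss(0.0, 1.0) for parent in x for _ in range(2)]
        scale = 2.0 ** (-0.5 * j * tau2)  # imposes var[u] = 2^(-j tau2) at level j
        u_by_level.append([scale * v for v in x])
    return u_by_level


u_levels = simulate_cascade(levels=10, eta=0.5, tau2=1.0)
for j, u in enumerate(u_levels, start=1):
    print(j, len(u), sum(v * v for v in u) / len(u))
```

Setting η close to zero gives a nearly white cascade, while values near one make siblings and parent-child pairs strongly correlated.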

Figure 13.

Schematic of a multiresolution regular quad tree, where each node on the cascade x(s) is connected to a unique parent node x(sγ̄). On a quad tree, node (s) indeed represents a 3-tuple including the scale level and the pair of spatial positions.

[35] Recalling that Σd(s) = Σu(s), this geometric decay rate indeed guarantees the scaling law of the variance of the wavelet coefficients,

$$\Sigma_d(s) = 2^{-j(s)\tau_2}\,\Sigma_x(0) \qquad (19)$$

which leads to the presence of dyadic self-similarity and 1/f spectrum in the reconstructed field [e.g., Daniel and Willsky, 1999].

[36] In addition to the scale-to-scale dependence, due to the tree-like Markovian construction, the nearby nodes at the same scale also exhibit a dependent structure as long as they share the same parent. Consequently, according to the proposed construction, this framework also allows us to capture not only the observed global scale-to-scale statistical structure, but also the local intrascale correlation among the wavelet coefficients. However, in the simplest case, one may still decide to just assume completely uncorrelated wavelet coefficients (i.e., A(s) = 0) and pursue a white reconstruction phase accordingly.

[37] Furthermore, simulating a GSM random variable with a desirable marginal density naturally requires a priori information and optimal estimation of the multiplier density function from the available data. Focusing on the marginal density of the GSM model, equation (13) can be written as [Portilla et al., 2001],

$$\log|d(s)| = \tfrac{1}{2}\log z(s) + \log|u(s)| \qquad (20)$$

Knowing that the convolution of two functions in the real space is equivalent to the product of their Fourier transforms in the frequency domain, the density of log[z(s)] can be computed nonparametrically, given a set of observations of d(s). The density of log[d(s)] is indeed the convolution of the densities of the terms on the right-hand side of equation (20), and hence a rescaled version of the log[z] distribution can be estimated by deconvolving the density of log[u(s)] from the empirical histogram of log[d(s)]. Note that in our case, d(s) will be the wavelet coefficients obtained from the wavelet transformation of the precipitation reflectivity images as discussed earlier. Figure 14a displays the results of the deconvolution problem for the MELB data set horizontal subbands at one level of decomposition. The log histogram of log[z(s)] is an inverted parabola and remarkably Gaussian, which implies that the multiplier can be well explained by a scalar log-normal random variate z(s) ∼ ln𝒩(μz(s), σz²(s)), where μz(s) and σz²(s) are the mean and variance of log[z(s)]. As anticipated, numerical simulation of the GSM random variables in equation (13) using a log-normal multiplier shows that this mixture can reproduce reasonably well the GG density (see Figure 14b) in close resemblance to the histogram of the rainfall wavelet coefficients (e.g., compare to Figure 9).
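A numerical simulation of this kind can be sketched with the standard library. The sketch draws a scalar GSM variate as d = √z · u with a log-normal multiplier, and it assumes the normalization E[z] = 1, which for a log-normal z corresponds to a log mean of −σz²/2. The function names and the chosen σz are hypothetical.

```python
import math
import random


def simulate_gsm(sigma_z, n, seed=0):
    """Scalar GSM samples d = sqrt(z) * u with log-normal z and standard normal u."""
    rng = random.Random(seed)
    mu_z = -0.5 * sigma_z ** 2            # gives E[z] = 1 for a log-normal z
    out = []
    for _ in range(n):
        z = rng.lognormvariate(mu_z, sigma_z)
        u = rng.gauss(0.0, 1.0)
        out.append(math.sqrt(z) * u)
    return out


def sample_kurtosis(x):
    """Ratio of the fourth moment to the squared second moment (zero-mean data)."""
    n = len(x)
    m2 = sum(v * v for v in x) / n
    m4 = sum(v ** 4 for v in x) / n
    return m4 / (m2 * m2)


d = simulate_gsm(sigma_z=0.5, n=50_000)
print(sum(v * v for v in d) / len(d))   # close to 1, since E[z] = 1 and var[u] = 1
print(sample_kurtosis(d))               # above the Gaussian value of 3
```

Increasing `sigma_z` thickens the tail, although the sample kurtosis then converges slowly and needs many more samples to stabilize.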

Figure 14.

(a) Estimated densities (dashed lines) of the logarithm of the multiplier for the horizontal subband coefficients of the MELB data are well approximated by a Gaussian distribution (inverted parabola in log probability). Note that the variance of the coefficients is normalized to one before performing the deconvolution. The solid circles show the 𝒩(0,1), and the solid line is the average histogram of the data. (b) Simulated marginal of the GSM random variables using a lognormal multiplier is shown versus the fitted Generalized Gaussian (GG) density with α = 0.7 as found from precipitation reflectivity data (see Figure 9). The Gaussian density is also shown for comparison.

[38] At a particular scale, knowing that 𝔼[d(s)⁴] = 𝔼[z(s)²]𝔼[u(s)⁴] and 𝔼[u(s)⁴] = 3{𝔼[u(s)²]}², a closed form expression for the kurtosis of the wavelet coefficients is derivable, κ[d(s)] = 3𝔼[z(s)²]. Assuming 𝔼[z(s)] = 1 leads to μz(s) + σz²(s)/2 = 0, which yields

$$\kappa[d(s)] = 3\exp\left[\sigma_z^2(s)\right] \qquad (21)$$

Equating equations (21) and (6), Figure 15 provides the relationship between the tail of the GG distribution and the variance of the log multiplier. Assuming the GG density as a parametric model for the marginal histogram of the wavelet coefficients, this tells us that the tail of the GSM can be fully controlled by the variance of the log[z(s)]. Accordingly, obtaining the sample kurtosis of the wavelet coefficients, the distribution of the multiplier can be fully characterized. It is worth noting that, according to equation (21) and the positivity of σz(s), the proposed GSM construction does not allow a thinner tail than the Gaussian case and therefore this model is only suitable for generating GG marginals with 0 < α ≤ 2. On the other hand, the scale-to-scale evolution of σz(s) can be further expressed in terms of the multifractal properties of the rainfall fields manifested on the kurtosis statistic in equation (10).

$$\sigma_z^2(m) = \log\left(\frac{c_4}{3c_2^2}\right) + m\,(\tau_4 - 2\tau_2)\log 2 \qquad (22)$$

This shows that the GSM log multiplier allows us to collapse the scaling information content of the second- and fourth-order moments of the process into a single parameter of the log multiplier, which eventually controls the thick tail properties of the marginal and higher-order scale-to-scale dependence. In other words, in this construction u(s) captures second-order statistics and the associated scaling of the wavelet coefficients, independent of z(s) which addresses the heavy tail statistics and higher-order dependency.
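The relationship between the GG shape parameter and the log multiplier can be evaluated numerically. The sketch assumes that the kurtosis of the GSM with a unit-mean log-normal multiplier is 3 exp(σz²) and that the GG kurtosis is Γ(1/α)Γ(5/α)/Γ(3/α)²; equating the two gives σz as a function of α. The function names are hypothetical.

```python
import math


def gg_kurtosis(alpha):
    """Kurtosis Gamma(1/a) Gamma(5/a) / Gamma(3/a)^2 of a GG density."""
    return math.exp(math.lgamma(1 / alpha) + math.lgamma(5 / alpha)
                    - 2 * math.lgamma(3 / alpha))


def sigma_z_from_alpha(alpha):
    """Log-multiplier standard deviation solving 3 exp(sigma_z^2) = GG kurtosis."""
    ratio = gg_kurtosis(alpha) / 3.0
    if ratio < 1.0 - 1e-9:
        raise ValueError("tails thinner than Gaussian (alpha > 2) are not reachable")
    return math.sqrt(math.log(max(ratio, 1.0)))


for alpha in (0.3, 0.7, 1.0, 1.5, 2.0):
    print(alpha, sigma_z_from_alpha(alpha))
```

The guard reflects the restriction to 0 < α ≤ 2 noted in the text: a positive σz² can only raise the kurtosis above the Gaussian value of 3.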

Figure 15.

The relationship between the shape parameter α of the Generalized Gaussian (GG) density and the log multiplier standard deviation σz.

[39] Figure 16 shows the results of a numerical experiment that demonstrates how the proposed construction can generate similar statistical signatures to those found for the wavelet coefficients of precipitation images. To this end, assuming the identity matrix as the normalized covariance of the wavelet coefficients at the root node, a cascade of stationary multiscale processes x(s) is generated by setting A(s) = ηI and B(s) = √(1 − η²) I. Subsequently, the second-order scaling law of the process is imposed on the cascade according to equation (18). The strength of the scale-to-scale correlation can be adjusted by η. For example, a nearly uncorrelated scale-to-scale construction can be achieved by sending η to zero; on the contrary, as η tends to unity the Markovian property becomes much stronger and the cascade produces a highly correlated field. Indeed, a larger value of η increases the off-diagonal entries of the parent and child covariance matrix and gives rise to a tilted joint histogram.

Figure 16.

Marginal and joint statistics of simulated GSM cascades for various choices of parameters (η,σz). A larger variance of the multiplier σz increases the thickness of the tail (smaller α) and the significance of the scale-to-scale dependence and η controls the directionality of the bow tie shape of the joint histogram. (a, b) η = 0.01, σz = 2.0 (α = 0.3); (c, d) η = 0.01, σz = 1.2 (α = 0.7); (e, f) η = 0.01, σz = 0.5 (α = 1.5); and (g, h) η = 0.5, σz = 1.2 (α = 0.7).

[40] Note that the heavy tail property of the marginal density and the higher-order scale-to-scale dependence in terms of the observed bow tie shape of the joint histogram, only depend on the variance of the log multiplier process which characterizes the shape of the tail. Empirically, it seems that the type of dependence in the conditional histogram (shape of the bow tie) is tightly related to the thickness of the tail, meaning that, for heavier tail (i.e., larger σz (s)) the high-order parent-to-child dependency is more pronounced; see Figure 16.

[41] This stochastic formalism requires estimation of a set of parameters for each subband including: the geometric decay rate of the variance (τ2), the evolution of the sample kurtosis (σz), and the transition coefficients (η) which can be estimated by computing the sample covariance of the high-pass wavelet coefficients across different scales. Due to the particular structure of the log multiplier in the presented GSM model, these parameters can be estimated easily from a set of available rainfall images and be exploited as a priori estimates for stochastic modeling. As an example application, Figure 17 provides preliminary results of the downscaled version of a precipitation reflectivity image over the HSTN site using the explained formalism. To this end, using a nonoverlapping convolution with a box average filter, the original image at resolution 1 km was upscaled to an 8 km resolution and then, by learning from the whole data set of the HSTN site, a high-resolution version at 1 km was synthetically generated. Obviously the parameters used here are a priori information reflecting an average representation of the HSTN data and do not fully represent this particular storm environment. However, visual comparison of Figures 17b and 17c shows that due to the explicit consideration of the heavy tail nature and dependent structure of the wavelet coefficients of rainfall reflectivity data, the presented model can reproduce the specific spatial correlation of the rainfall images while accounting for the heavy tail features and edges. Note that the presented algorithm uses only the mathematical/statistical structure of the precipitation images and not other larger-scale physical storm structure. As such, its capacity to properly recover the high-resolution geometric and statistical features of the precipitation images from coarser resolution data might be limited. This issue, along with storm-specific parameterization, needs to be examined carefully for different storm regimes and is a topic for further research.
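The upscaling step of this experiment, a nonoverlapping box average, can be sketched with NumPy. The function name and the synthetic field are hypothetical, and the sketch simply crops any rows or columns that do not fill a complete block.

```python
import numpy as np


def box_upscale(image, factor=8):
    """Average nonoverlapping factor x factor blocks (e.g., 1 km to 8 km pixels)."""
    rows, cols = image.shape
    rows -= rows % factor               # crop so the blocks tile the image exactly
    cols -= cols % factor
    blocks = image[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


field = np.random.default_rng(0).random((64, 64))   # stand-in for a 1 km reflectivity image
coarse = box_upscale(field, factor=8)
print(coarse.shape, field.mean(), coarse.mean())    # the domain mean is preserved
```

This is the same averaging performed by three successive Haar low-pass steps, up to the normalization constant of the orthonormal filter.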

Figure 17.

(a) The upscaled representation at scale 8 km of (b) a rainfall reflectivity image in original resolution of 1 km observed on 28 June 1998 at 18:13:00 UTC over the HSTN site. Upscaling was done using a nonoverlapping convolution of the original field with a box averaging filter of size 8 km. (c) A stochastic coarse-to-fine GSM downscaling of the image in Figure 17a. All of the parameters of the cascade are learned from the fine-to-coarse scale analysis of the HSTN data set.

5.2. GSM on Wavelet Trees Versus Multiplicative Random Cascades

[42] Multiplicative random cascades, in their canonical form, have been of central importance to stochastic simulation of geophysical processes and especially precipitation data [Gupta and Waymire, 1993]. This class of stochastic models, with the coarse-to-fine scale recursion described below, allows us to generate multifractal measures with similar statistical properties as those typically observed in rainfall across a finite range of scales,

$$x(s) = \zeta(s)\,x(s\bar{\gamma}) \qquad (23)$$

where ζ(s) represents an independent identically distributed (iid) random multiplier at each node s of the tree with 𝔼[ζ(s)] = 1, also known as the cascade generator. The multiplicative structure of this model is a key factor which imposes the desired multifractal properties [e.g., Mandelbrot et al., 1997] and the parent-to-child scale dynamics, in the sense that larger parents are more likely to generate larger children, a property that has been amply documented in precipitation fields. However, this construction is nonlinear by its nature and hence the state estimation of equation (23) given a set of noisy observations, even in the form of an affine observation equation,

$$y(s) = C(s)\,x(s) + w(s) \qquad (24)$$

where C(s) denotes the observation matrix and w(s) ∼ 𝒩(0, R(s)), is not a trivial task. At first glance, it seems that by working in the log space to linearize the model equation,

$$\log[x(s)] = \log[x(s\bar{\gamma})] + \log[\zeta(s)] \qquad (25)$$

where 𝔼{log[ζ(s)]} = 0, the linear estimation theory of additive Markovian multiscale models [see, e.g., Gorenburg et al., 2001; Tustison et al., 2002] can be invoked, while preserving the multiscale properties of the multiplicative random cascade. We need to note that although the mean is conserved in this log transformation, the higher-order parent-to-child dependency is not preserved, given that ζ(s) is a sequence of iid random variates. For instance, it is easy to check that the conditional standard deviation of x(s) given x(sγ̄) in equation (23) depends linearly on the magnitude of x(sγ̄), while in equation (25) the conditional variance is characterized only by the noise term log[ζ(s)]. In other words, a proper additive construction requires the derivation of an appropriate noise term that can take into account this high-order dependency (e.g., a correlated noise), which definitely cannot be explained by an iid ζ(s). A very important implication of this deduction is that using multiplicative random cascades, implementation of the linear state estimation theory for precipitation data in log-transformed space cannot fully capture the distinct statistical signature of the rainfall process.
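The loss of the parent-to-child variance dependence under the log transformation can be checked numerically. The sketch below assumes a hypothetical log-normal cascade generator with unit mean and compares the conditional variance of the children for two parent values, first in the original multiplicative form and then in log space.

```python
import math
import random
import statistics


def conditional_variances(parent, multipliers):
    """Variance of the children x = zeta * parent, and of log(x), for a fixed parent."""
    children = [zeta * parent for zeta in multipliers]
    return (statistics.pvariance(children),
            statistics.pvariance([math.log(c) for c in children]))


rng = random.Random(0)
sigma = 0.4                                          # hypothetical generator spread
# Log-normal multipliers with E[zeta] = 1.
zetas = [rng.lognormvariate(-0.5 * sigma ** 2, sigma) for _ in range(10_000)]

var_small, logvar_small = conditional_variances(1.0, zetas)
var_large, logvar_large = conditional_variances(2.0, zetas)
print(var_large / var_small)          # 4: the spread of the children scales with the parent
print(logvar_large - logvar_small)    # about 0: in log space the parent no longer matters
```

Doubling the parent doubles the conditional standard deviation in the multiplicative form, whereas the log-space variance is set by the multiplier alone, which is the dependence an iid additive noise term cannot reproduce.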

[43] In contrast, the GSM cascade on the wavelet tree has an additive construction, which allows an explicit characterization of the wavelet detail coefficients that properly accounts for the statistical structure of these fields. Indeed, given an estimate of z(s), the density of d(s) is Gaussian (see equations (13) and (14)), and hence the conditional estimation of the wavelet coefficients becomes a linear problem. This is a great advantage of the GSM construction in the wavelet domain, as it permits the use of well-established linear estimation techniques.
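The conditional linearity can be illustrated with a scalar toy example. This is a sketch under stated assumptions rather than the paper's estimator: it takes the common GSM form d = sqrt(z) u with a log-normal multiplier z and a Gaussian u, adds white Gaussian observation noise, and treats z as known (an oracle stand-in for "an estimate of z(s)"). The variable names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
sigma_u = 1.0    # hypothetical std of the Gaussian component u
sigma_w = 0.5    # hypothetical std of the observation noise

z = rng.lognormal(0.0, 1.0, n)          # log-normal multiplier
u = rng.normal(0.0, sigma_u, n)         # Gaussian component
d = np.sqrt(z) * u                      # GSM coefficient (assumed form)
y = d + rng.normal(0.0, sigma_w, n)     # noisy observation of d


def excess_kurtosis(x):
    # Zero for a Gaussian sample, positive for heavy-tailed samples.
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0


# Given z, d is Gaussian with variance z * sigma_u^2, so the optimal
# estimator is linear in y with a Wiener-type gain that depends on z.
gain = z * sigma_u**2 / (z * sigma_u**2 + sigma_w**2)
d_hat = gain * y

# Baseline: a single linear gain that ignores the multiplier.
gain_global = d.var() / (d.var() + sigma_w**2)
d_hat_global = gain_global * y

mse_conditional = np.mean((d_hat - d) ** 2)
mse_global = np.mean((d_hat_global - d) ** 2)

print("excess kurtosis of d:", excess_kurtosis(d))
print("excess kurtosis of d / sqrt(z):", excess_kurtosis(d / np.sqrt(z)))
print("MSE conditional vs global:", mse_conditional, mse_global)
```

In this toy setting the marginal of d is heavy tailed, the normalized coefficient d / sqrt(z) is close to Gaussian, and the multiplier-aware linear estimate has a lower mean squared error than a single global gain. In practice z would itself have to be estimated, which this sketch does not attempt.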

6. Conclusions

[44] Statistical properties of near-surface precipitation reflectivity images were extensively studied for a set of 200 coincidently observed independent storm events over the two ground validation sites (in Texas and Florida) of TRMM. Although the analyzed images were near-surface reflectivities of storms with diverse physical origins and spatial organizations, our results indicate that all of the precipitation images share some common mathematical signatures that can be robustly characterized and exploited for parsimonious rainfall modeling over a broad range of scales of interest. Power law scaling of the Fourier coefficients, in the form of a 1/f spectrum, showed a regular and stable behavior. Beyond the second-order statistics, the non-Gaussian structure of the rainfall fields at multiple scales was explored in the wavelet domain. The heavy-tailed distributions of the wavelet coefficients of these precipitation reflectivity images were found to be well explained by the class of Generalized Gaussian (GG) distributions. We demonstrated that the wavelet high-pass coefficients exhibit multifractal behavior and hence a nonlinear scaling law. A new class of multiresolution stochastic processes, namely the Gaussian Scale Mixtures (GSM), was introduced to capture important characteristic features of the precipitation images in the wavelet domain. The proposed GSM model with a log-normal multiplier allows one to effectively decouple the precipitation subband images into a set of two Gaussian processes: one controlling the dependence structure (intrascale and scale-to-scale spatial covariance) and the other the heavy-tail features.
Embedding a multiscale linear Gauss-Markov process in the GSM construction results in a multiresolution model that can simultaneously and efficiently capture the correlation, the nonlinear scaling law (including the 1/f spectrum), the heavy-tailed marginals, and the higher-order scale-to-scale dependence. The GSM cascade is conditionally linear, meaning that given the multiplier process, the density of the GSM is Gaussian. This property is extremely desirable because it allows one to exploit linear filtering techniques (e.g., Kalman filter, Wiener filter) for multiscale optimal estimation and merging of multisensor precipitation products, while preserving the extreme rainfall intensities by properly addressing the tail statistics.


[45] This work was supported by NASA-GPM awards NNX07AD33G and NNX10A012G and NSF award EAR-05366219 to the National Center for Earth-surface Dynamics, an NSF Science and Technology Center. In addition, computations of this research were partly supported by the University of Minnesota Supercomputing Institute for Advanced Computational Research. The comments of three anonymous referees also substantially improved our presentation.