
Estimation of convective precipitation mass from lightning data using a temporal sliding-window for a series of thunderstorms in Southeastern Brazil



Some studies have proposed estimating convective rainfall from lightning observations by computing the rainfall–lightning ratio (RLR). However, as this ratio may depend on season, convective regime and other factors, known approaches have failed to provide RLR values with low variability. An accurate RLR would allow rainfall to be estimated from lightning data in areas that lack weather radar coverage. This work proposes a straightforward approach for the computation of the RLR, based on a temporal sliding-window and a fitting function. It was tested for thunderstorms observed in Southeastern Brazil, with good results.

1. Introduction

Rainfall estimation is typically performed from weather radar data. However, assuming that convective rainfall can be correlated to lightning, some approaches propose rainfall estimation from lightning data for areas without weather radar coverage, supporting nowcasting. The most common approach is the computation of the rainfall–lightning ratio (RLR), given by the convective rainfall mass per cloud-to-ground (CG) lightning flash. Nevertheless, this ratio may depend heavily on seasonal and geographical factors, local climatology, convective regime, storm type, lightning patterns or intensity, dominant polarity of CG lightning, intracloud-to-CG ratio and thunderstorm life cycle (Buechler and Goodman, 1990; Soula and Chauzy, 2001; Lang and Rutledge, 2002). Therefore, known approaches may fail to provide values of RLR with low variability (Sist et al., 2010).

A number of studies have estimated the rainfall mass directly from CG lightning observations. Petersen and Rutledge (1998) used the total rainfall mass and the density of CG lightning to examine their relationship on a number of spatial and temporal scales for different parts of the world. The lightning flash incidence is more intense in clouds associated with high-level precipitation, as the electrification increases with altitude, as in the case of tall cumulonimbus (Siingh et al., 2010). Tapia et al. (1998) computed the RLR by dividing the total convective rainfall mass by the number of CG flashes in a thunderstorm, and proposed a model to reconstruct the spatial and temporal distribution of the rainfall. The summation of the rainfall distribution of the flashes yields the overall rainfall distribution, which was checked against weather radar data. Kempf and Krider (2003) presented a compilation of RLR values, including some obtained from other works, and found values ranging from 38 to 72 × 10⁶ kg per flash for isolated thunderstorms in Florida, Spain and France, and values as high as 5000 × 10⁶ kg per flash for mesoscale thunderstorms in Australia and the Central United States. Molinie et al. (1999) found values as low as 3 × 10⁶ kg per flash for the Pyrenees, while Williams et al. (1992) found values up to 500 × 10⁶ kg per flash for Australia.

The current work proposes a simpler and more accurate approach, the windowed RLR (WRLR) function, which employs a temporal sliding-window. This approach is based on the assumption that convective activity is correlated to electrically active cells, which correspond to areas with a high density of CG strokes. This density is calculated by the EDDA software, which implements standard kernel estimation (Strauss et al., 2010). This software is being evaluated for operational use in detecting convective precipitation at the recently established Center for Natural Disasters Monitoring and Alert (CEMADEN) in Brazil.

A set of thunderstorms that occurred in 2009 in Southeastern Brazil was selected from weather radar data to obtain a WRLR function, while another set from January 2010 was employed to test this WRLR function as a rainfall estimator. This function is expected to be included as a new module of the EDDA software. This may provide rainfall estimation in parts of Brazil, a country of over 8.5 million km², of which less than 15% is covered by weather radar. Rainfall estimations can be obtained from meteorological satellites like those of the Tropical Rainfall Measuring Mission (TRMM), the National Oceanic and Atmospheric Administration (NOAA) or the Geostationary Operational Environmental Satellite (GOES) series, but can be imprecise (Ramirez-Beltran et al., 2008; Liao and Meneghini, 2009). On the other hand, Brazil has a ground-based lightning detection network called RINDAT (Brazilian Integrated Lightning Detection Network), one of the largest in the world (Pinto et al., 2006).

2. Data and methodology

2.1. Meteorological data

In this work, meteorological data consists of weather radar and lightning data for the entire year of 2009, plus the first month of 2010. The area of study is around two Brazilian S-band weather radars located in the State of Sao Paulo, in the cities of Bauru and Presidente Prudente. This area corresponds to 32 squares with sides of 50 km that are within the 150 km range of these radars, located at 22°21′30″S 49°1′42″W and 22°10′30″S 51°22′30″W. Radar data is given by constant altitude plan position indicator (CAPPI) images at 3 km altitude and with 1 km spatial resolution. The energy backscattered by hydrometeors, given by the reflectivity factor Z (in dBZ), is related to the rainfall rate R (in mm h⁻¹) by the Z–R relationship shown in Equation (1). This work adopted A = 32 and b = 1.65, according to Calheiros and Gomes (2010).

$$Z = A\,R^{b} \tag{1}$$
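As a minimal sketch of how Equation (1) can be applied, the snippet below inverts the conventional power-law form Z = A·R^b for the rainfall rate, using the coefficients A = 32 and b = 1.65 quoted above. It assumes that the reflectivity in dBZ is first converted to linear units (mm⁶ m⁻³), which is the usual convention but is not stated explicitly in the text; the function name `rain_rate_from_dbz` is hypothetical.

```python
import math


def rain_rate_from_dbz(dbz, a=32.0, b=1.65):
    """Invert Z = a * R**b for R (mm/h), given reflectivity in dBZ.

    Assumption: dBZ = 10 * log10(Z), with Z in linear units (mm^6 m^-3).
    """
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity factor
    return (z_linear / a) ** (1.0 / b)


if __name__ == "__main__":
    for dbz in (20, 30, 40, 50):
        print(f"{dbz} dBZ -> {rain_rate_from_dbz(dbz):.2f} mm/h")
```

By construction, a reflectivity of 10·log10(32) dBZ maps to exactly 1 mm h⁻¹, which is a convenient sanity check for the coefficients.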

Lightning data was provided by the Brazilian lightning detection network RINDAT, which acquires radio frequency signals emitted by lightning. This network has a detection efficiency of 90% and an average precision of 500 m in stroke location for the State of Sao Paulo (Naccarato and Pinto, 2009). Lightning stroke data is output in the Universal ASCII Lightning Format (UALF). A lightning flash is composed of one or more strokes. The annual distribution of the number of flashes and strokes, and the monthly accumulated rainfall for the year of 2009 for the area of study, are shown in Figure 1.

Figure 1.

Annual distribution of rainfall and number of lightning flashes and strokes for the area of study in 2009.

2.2. Original RLR computation

The original RLR computation was proposed by Tapia et al. (1998). In order to estimate an RLR value, partial RLRs were calculated for each of 22 selected summer storms that occurred in Florida in 1992/1993. The resulting RLR value is then given by the median of these values. The partial RLRs presented a high variability (from 24 to 365 × 10⁶ kg per flash) that was attributed to different convective regimes. The resulting RLR was 43 × 10⁶ kg per flash.
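The median-of-partial-ratios computation described above can be sketched as follows. The three storms in the example are hypothetical values chosen only to echo the range quoted in the text; they are not data from Tapia et al. (1998), and the function names are illustrative.

```python
from statistics import median


def partial_rlr(rain_mass_kg, n_flashes):
    """Partial RLR of one storm: convective rain mass per CG flash."""
    return rain_mass_kg / n_flashes


def median_rlr(storms):
    """Median of the partial RLRs; storms is a list of (mass_kg, n_flashes)."""
    return median(partial_rlr(m, n) for m, n in storms if n > 0)


# Hypothetical storms (mass in kg, number of CG flashes), for illustration only.
storms = [(2.4e9, 100), (8.6e9, 200), (7.3e10, 200)]

if __name__ == "__main__":
    print(f"median RLR = {median_rlr(storms):.3g} kg per flash")
```

Using the median rather than the mean keeps a single storm with a very large partial RLR from dominating the final value.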

The model defined by Tapia et al. (1998) allows the spatial and temporal distribution of rainfall to be estimated using a previously estimated RLR value. A uniform rainfall distribution is assumed in a circle of 5 km radius centered at each CG lightning flash and in a 5-min interval centered at its time of occurrence. The rainfall distribution is then expressed by Equation (2) (x and Xi express geographical coordinates).

$$R(t, x) = C \sum_{i=1}^{N_t} \mathrm{RLR}\, f(t, T_i)\, g(x, X_i) \tag{2}$$

where R(t, x) = rainfall rate in mm h⁻¹ at time t and position x; Nt = number of flashes; Ti = time of occurrence of the i-th flash; Xi = location of occurrence of the i-th flash; RLR = constant rainfall–lightning ratio (kg per CG flash); C = unit conversion factor; f(t, Ti) = delta (indicator) function for the i-th flash, which occurred at Ti (checks if |t − Ti| < 5 min); g(x, Xi) = delta (indicator) function for the i-th flash, which occurred at Xi (checks if |x − Xi| < 5 km).
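A minimal sketch of Equation (2) is given below. The temporal and spatial checks follow the definitions above (|t − Ti| < 5 min and |x − Xi| < 5 km). The conversion factor C is not given in the text, so the value used here is an assumption: it spreads the RLR mass uniformly over the circle and over the time span for which the temporal check is true, using the fact that 1 kg of water over 1 m² corresponds to 1 mm of rain. The function and parameter names are hypothetical.

```python
import math


def tapia_rain_rate(t_min, x_km, y_km, flashes, rlr_kg=43e6,
                    radius_km=5.0, t_halfwidth_min=5.0):
    """Rainfall rate (mm/h) at time t and position (x, y), after Equation (2).

    flashes: iterable of (T_i in minutes, X_i in km, Y_i in km).
    """
    area_m2 = math.pi * (radius_km * 1000.0) ** 2
    hours = 2.0 * t_halfwidth_min / 60.0  # span over which f(t, T_i) = 1
    # Assumed conversion factor C (kg -> mm/h): 1 kg per m^2 equals 1 mm.
    c = 1.0 / (area_m2 * hours)
    total = 0.0
    for t_i, x_i, y_i in flashes:
        f = 1.0 if abs(t_min - t_i) < t_halfwidth_min else 0.0
        g = 1.0 if math.hypot(x_km - x_i, y_km - y_i) < radius_km else 0.0
        total += rlr_kg * f * g
    return c * total


if __name__ == "__main__":
    flashes = [(0.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
    print(f"{tapia_rain_rate(1.0, 1.0, 0.0, flashes):.2f} mm/h")
```

With this choice of C, each flash deposits exactly RLR kg of rain in total, so summing the field over space and time recovers the number of flashes times the RLR.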

2.3. Temporal sliding-window based RLR computation

As already stated, the choice of a suitable value for the RLR may be difficult due to its high variability. This seems to be a limitation of Tapia's model, which employs a constant RLR to estimate convective rainfall from CG lightning data. This work proposes a new approach, based on a precipitation function WRLR of the number of CG lightning strokes (N). Strokes and precipitation refer to a defined window that covers an interval of time (Δt) and a square area Qj, hence the name WRLR.

This approach is composed of (1) the training phase, in which the WRLR function is derived from known rainfall and stroke data, and (2) the estimation phase, which employs the WRLR function to estimate the precipitated mass from stroke data. In the first phase, an area and duration of study are chosen within radar range, typically comprising many squares Qj and thousands of intervals Δt. For each square Qj, the window advances in time using a sliding-window scheme. Windows without any convective precipitation are discarded. The resulting set of windows provides the data points: data point Pij is given by the pair (mi, Ni), where mi is the precipitated mass in the square Qj during the i-th interval of time Δt, and Ni is the corresponding number of CG strokes. Outliers corresponding to data points with very high precipitated mass were removed using the Tukey–Kramer method (Tukey, 1977). Finally, a suitable WRLR function is chosen to fit these data points. Since the function must fit many data points that may have the same number of strokes but different values of precipitated mass, the average precipitated mass for each number of strokes was considered. In the second phase, the WRLR function is employed to estimate the precipitated mass in a particular window Qj and duration Δt that may be placed outside radar range, assuming that climatic characteristics are similar to those of the area of study.
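The training-phase steps described above can be sketched for a single square as follows. This is an illustration under stated assumptions, not the authors' implementation: the input layout (lists of timestamped convective mass and stroke times), the function names, and the outlier rule (an upper Tukey fence at Q3 + 1.5·IQR) are all hypothetical choices; the text only states that a Tukey-based method was used to remove very high masses.

```python
from collections import defaultdict
from statistics import mean, quantiles


def sliding_windows(t_start, t_end, width, step):
    """Yield (lo, hi) time windows of a given width, advanced by step."""
    t = t_start
    while t + width <= t_end:
        yield t, t + width
        t += step


def data_points(rain_events, stroke_times, t_start, t_end, width=30.0, step=7.5):
    """Build (mass, n_strokes) pairs for one square.

    rain_events: list of (time_min, convective_mass_kg); stroke_times: minutes.
    """
    points = []
    for lo, hi in sliding_windows(t_start, t_end, width, step):
        mass = sum(m for t, m in rain_events if lo <= t < hi)
        if mass <= 0.0:
            continue  # discard windows without convective precipitation
        n = sum(1 for t in stroke_times if lo <= t < hi)
        points.append((mass, n))
    return points


def remove_high_outliers(points, k=1.5):
    """Drop points whose mass exceeds the assumed upper fence Q3 + k * IQR."""
    q1, _, q3 = quantiles([m for m, _ in points], n=4)
    upper = q3 + k * (q3 - q1)
    return [(m, n) for m, n in points if m <= upper]


def mean_mass_by_strokes(points):
    """Average the precipitated mass over points sharing the same stroke count."""
    groups = defaultdict(list)
    for m, n in points:
        groups[n].append(m)
    return {n: mean(ms) for n, ms in sorted(groups.items())}
```

The dictionary returned by `mean_mass_by_strokes` gives one averaged mass per stroke count, which is the kind of input a fitting routine would need to obtain the WRLR function.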

According to Sist et al. (2010), the correlation between lightning and convective precipitation is more significant than that between lightning and stratiform precipitation. Therefore, this work adopted the criterion proposed in Steiner et al. (1995) to filter out nonconvective precipitation. Considering the weather radar grid, this criterion associates with convective precipitation a grid point with reflectivity of at least 40 dBZ, or one that presents a significant gradient of reflectivity, above a threshold ΔZ for a circle around it, and also all grid points inside the circle. Values of the threshold and the radius of the circle depend on the background intensity. In this work, lightning strokes were adopted for the computation of the WRLR function, as the results were better than those obtained with lightning flashes. Several tests were performed with various values for Q and Δt, resulting in different WRLR functions. Larger and longer windows provide data points with higher precipitated mass and a higher number of strokes than smaller and shorter windows. However, the resulting WRLR functions provide precipitation rates (in mm h⁻¹) that are equivalent and have similar relative errors.

Another point in the proposed methodology is the spatial and temporal rainfall distribution. Tapia et al. (1998) assumed the rainfall distribution described in Equation (2), based on individual CG lightning flashes. This work assumes the rainfall distribution given by the EDDA software (Strauss et al., 2010), which generates a field of density of occurrence of CG strokes employing kernel density estimation for the considered area and interval of time. The normalized density of CG strokes is mapped to a density of precipitated mass using its accumulated value for that interval.
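The mapping from normalized stroke density to precipitated mass can be illustrated with the sketch below. It does not reproduce the EDDA software: the Gaussian kernel, the bandwidth, the grid layout and the function names are assumptions made for illustration only. The idea shown is that the stroke density is normalized to sum to one over the window and then scaled by the accumulated mass estimated for that window.

```python
import math


def stroke_density(strokes, nx, ny, cell_km=1.0, bandwidth_km=5.0):
    """Normalized Gaussian-kernel density of strokes on an nx-by-ny grid.

    strokes: list of (x_km, y_km). Kernel and bandwidth are assumptions.
    """
    grid = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            cx, cy = (i + 0.5) * cell_km, (j + 0.5) * cell_km  # cell center
            grid[j][i] = sum(
                math.exp(-((cx - sx) ** 2 + (cy - sy) ** 2)
                         / (2.0 * bandwidth_km ** 2))
                for sx, sy in strokes
            )
    total = sum(map(sum, grid))
    if total == 0.0:
        return grid  # no strokes: density is zero everywhere
    return [[v / total for v in row] for row in grid]


def mass_field(density, window_mass_kg):
    """Distribute the window's accumulated mass according to the density."""
    return [[v * window_mass_kg for v in row] for row in density]


if __name__ == "__main__":
    dens = stroke_density([(2.5, 2.5), (3.5, 2.5)], nx=6, ny=5)
    field = mass_field(dens, 1.0e9)
    print(f"total mass on grid: {sum(map(sum, field)):.3e} kg")
```

Because the density is normalized before scaling, the mass on the grid always adds up to the accumulated value supplied for the window, regardless of the kernel chosen.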

3. Analysis of the results

The results refer to the estimation of the precipitated mass from CG lightning stroke data using squares Q with 50 km edges and a Δt of 30 min, which were adopted for convenience. The advance of the temporal sliding-window was chosen as 7.5 min, providing a significant overlap between consecutive windows, and thus more data points. The area of study was defined in Section 2.1 and is composed of 32 squares that were employed for the training and estimation phases. Training data corresponds to the entire year of 2009, while estimation data corresponds to the month of January 2010. The resulting WRLR function is shown in Equation (3), while Figure 2 presents the scatterplot of the data points and the function itself.

display math(3)

Figure 2.

Scatterplot of the data points employed in the training phase and the corresponding WRLR fitting function.

The temporal sliding-window with overlap is expected to smooth the fitting curve (WRLR function), as it acts like a moving-average operator. The scatterplot in Figure 2 shows that data points with low values of N present low variability, whereas those with N > 150 present high variability. However, the latter correspond to less than 1% of the data points. The same figure shows that the function fits the data points well up to N = 100. The training phase employed 187 735 data points, with a total of 235 762 CG flashes or 428 129 CG strokes, corresponding to 491 thunderstorms.

The WRLR function obtained in the training phase using data from 2009 was then employed to estimate the precipitated rainfall in January 2010 for the same area. The precipitation inferred from weather radar data was used as reference. The WRLR function was applied to each of the 32 squares and to each 30-min interval, yielding the accumulated precipitation. For Tapia's approach, the RLR was calculated using the flashes that occurred in the same area throughout 2009 for the 491 thunderstorms. Partial RLRs were calculated for each thunderstorm, ranging from 0.4 to 1094 × 10⁶ kg per flash, and the final RLR was given by the median of these ratios, 219 × 10⁶ kg per flash. The 30-min accumulated values of the precipitated mass are plotted in Figure 3 for the reference (weather radar) and for the estimations using Tapia's model RLR (Equation (2)) and the WRLR function (Equation (3)). The curves were smoothed with a one-dimensional Gaussian filter with 2-h width for better visualization. The correlation between Tapia's model RLR estimation and radar precipitation was 0.78, while the correlation between the WRLR function estimation and radar was 0.90. These correlations were calculated with unsmoothed data.
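The two operations mentioned above, correlation on the unsmoothed series and Gaussian smoothing for display, can be sketched as follows. The correlation is assumed here to be the Pearson coefficient, which the text does not state explicitly. The smoothing width is expressed as a standard deviation in samples; how the "2-h width" maps to that parameter is not specified in the text, so `sigma_samples` is a free, hypothetical parameter.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def gaussian_smooth(values, sigma_samples):
    """One-dimensional Gaussian filter with renormalization at the edges."""
    radius = int(3 * sigma_samples)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma_samples) ** 2) for k in offsets]
    out = []
    for i in range(len(values)):
        num = den = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(values):  # ignore samples outside the series
                num += w * values[j]
                den += w
        out.append(num / den)
    return out


if __name__ == "__main__":
    radar = [0.0, 2.0, 9.0, 4.0, 1.0, 0.5]      # hypothetical 30-min masses
    estimate = [0.1, 2.5, 7.0, 5.0, 0.8, 0.2]   # hypothetical estimates
    print(f"r = {pearson(radar, estimate):.2f}")
    print([round(v, 2) for v in gaussian_smooth(radar, sigma_samples=1.0)])
```

Computing the correlation before smoothing, as the text describes, avoids inflating the coefficient through the filter's averaging.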

Figure 3.

Temporal evolution of the 30-min accumulated rainfall along the month of January 2010 for the considered area given by weather radar, Tapia's model RLR and WRLR function (curves were smoothed by a Gaussian filter).

The proposed approach does not require the selection of thunderstorms, but in order to compare these estimations in a different way, the total precipitated mass was computed for each of the 47 storms that occurred in January 2010 in the same area. Averages, medians and RMSE (root mean square error) values are presented in Table 1. The accumulated 30-min values for the same thunderstorms in the same area were also computed, and a global average for all thunderstorms is presented in the same table, as well as the median and RMSE.

Table 1. Some measures of the estimations for January 2010: average precipitated mass per storm and 30-min accumulated precipitated mass (values in 10⁹ kg)

Quantity                   Measure   Weather radar   WRLR function   Tapia's model
Total rainfall per storm   Average   211             197             251
Accumulated 30-min         Average   10.1            9.5             12.1
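To make the comparison in Table 1 more concrete, the short sketch below computes the relative deviation of each estimate from the weather radar reference, using the average values listed in the table. The dictionary layout and the function name are illustrative choices, and the relative deviation is not a measure reported in the original table.

```python
# Average values from Table 1, in 10^9 kg.
table1 = {
    "Total rainfall per storm": {"radar": 211.0, "wrlr": 197.0, "tapia": 251.0},
    "Accumulated 30-min": {"radar": 10.1, "wrlr": 9.5, "tapia": 12.1},
}


def relative_deviation(estimate, reference):
    """Signed relative deviation of an estimate from the reference."""
    return (estimate - reference) / reference


if __name__ == "__main__":
    for quantity, row in table1.items():
        for method in ("wrlr", "tapia"):
            dev = 100.0 * relative_deviation(row[method], row["radar"])
            print(f"{quantity:26s} {method:6s} {dev:+.1f}%")
```

With these averages, the WRLR function underestimates the radar reference by about 6 to 7%, while Tapia's model overestimates it by about 19 to 20%.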

In order to compare the rainfall distributions generated by Tapia's model and by the proposed approach (WRLR function coupled to the EDDA software), a particular thunderstorm was selected, for the 30-min interval starting at 01:08 UTC on 20 January 2010. A square area with an edge of 300 km was chosen, centered at the weather radar of Bauru. This area has 36 squares with edges of 50 km, unlike the preceding results shown in Figure 2, which included only the 16 inner squares of the weather radars of Bauru and Presidente Prudente, in order to comply with the 150 km limit for radar coverage. It is worth noting that the training phase defines a WRLR function using data within this range, but the estimation phase allows rainfall to be estimated beyond that range from lightning data, as this was the main goal of this work.

Figure 4 shows the area of that thunderstorm to allow a comparison between the spatial distributions of rainfall given by the weather radar of Bauru, the EDDA/WRLR approach and Tapia's model. In the case of the distribution generated by EDDA/WRLR, the normalized CG stroke density was mapped to rainfall rate. The accumulated values for the 30-min interval, expressed in 10⁹ kg, were 38.27 (radar), 44.37 (WRLR) and 50.11 (Tapia). Similar comparisons were performed for many other time intervals, showing that the spatial distributions given by EDDA/WRLR were compatible with the precipitation observed in radar images.

Figure 4.

Rainfall distributions generated from radar data (left), the proposed EDDA/WRLR approach (center) and Tapia's model (right) around the weather radar of Bauru for a 30-min interval on 20 January 2010 (color bar shows rainfall rate in mm h⁻¹).

4. Summary and conclusions

A new approach is proposed to estimate the precipitated mass and rainfall distribution from lightning data. The most common approach, proposed by Tapia et al. (1998), is to calculate a constant RLR value given by the median of RLR values obtained for a set of thunderstorms and to estimate rainfall mass assuming a circular distribution around each lightning flash. However, partial RLRs have high variability and may lead to estimation errors. The new approach computes data points composed of the convective precipitated mass and the number of CG strokes for windows with a defined square area and duration. A sliding-window scheme advances each window in time over the same area. Stratiform precipitation was filtered out, as it is usually not related to lightning. The set of data points is then fitted by a suitable WRLR function that can be employed to estimate rainfall beyond weather radar range. The EDDA software, which generates a field of density of occurrence of CG strokes, is also used to estimate rainfall distribution. Assuming weather radar data as reference, the proposed approach yielded a better estimate for the total precipitated mass and for the spatial distribution in a region of Southeast Brazil for the month of January 2010. Further work intends to extend the WRLR function for other regions by defining specific coefficients. In addition, the WRLR will be implemented as a new module of the EDDA software, in order to be evaluated for operational weather monitoring.


Authors João Victor Cal Garcia, Stephan Stephany and Augusto B. d'Oliveira thank CNPq (National Council for Scientific and Technological Development of Brazil) for grants 140983/2010-4, PQ 305639/2012-9 and 473053/2010-1, respectively. Authors also thank the Center for Weather Forecasts and Climate Studies (CPTEC/INPE) and the Meteorological Research Institute (IPMet/UNESP) for the meteorological data.