The Relationship Between TGF Production in Thunderstorms and Lightning Flash Rates and Amplitudes

We provide evidence that Terrestrial Gamma‐Ray Flashes (TGFs), in well isolated thunderstorms, tend to occur during periods of low and declining flash rates, and when the flash amplitudes are larger than average. This conclusion comes from examining the results of 371 manually tracked TGF‐producing thunderstorms. Fermi‐GBM identified TGFs are used for this analysis and lightning data come from both World Wide Lightning Location Network and Earth Networks Total Lightning Network. The data from these storms suggest that TGFs are likely to occur in almost every phase of storms that last longer than an hour, but tend to occur later on in shorter storms. We also note that, in short storms, TGFs are more likely to accompany a flash when the flash rates of the storm are lower than average, and they are less likely per flash during the peak flash rate periods of the storms. We find that the tendency for TGFs to occur while the flash rate is falling and when the amplitudes of flashes (the sum of the absolute values of peak currents of all constituent sferics in the flash) are larger than average, does not depend strongly on the duration of the storms. This implies that not just any lightning flash can or even will produce a TGF, but that the electrical conditions of the storm play a crucial role in TGF production.

(RF) model, proposed by Dwyer (2003) allows for secondary electron avalanches to be created by Compton back-scattering of bremsstrahlung x-rays and by Bhabha scattering with secondary positrons. This allows for an exponential increase in the number of runaway electrons produced by a single seed particle. Another explanation is that electrons near the high local electric fields at the tips of lightning leaders can gain enough energy to become seed particles for the RREA process (Carlson et al., 2010;Celestin & Pasko, 2011;Moss et al., 2006). This potentially large number of seed particles could explain the observed fluence of TGFs.
Regardless of the model of production, however, there is still no explanation for why some lightning flashes produce TGFs but others do not. Previous studies have looked at the meteorological data of the thunderstorms that produce TGFs (e.g., Barnes et al., 2015;Chronis et al., 2016;Fabró et al., 2019;Smith et al., 2010;Splitt et al., 2010), but in general, it is found that TGFs can be produced by a wide variety of different types of thunderstorms. Smith et al. (2010) found evidence that TGFs occur in the declining stages of flash rate in thunderstorms using 51 storms and RHESSI-identified TGFs. In this study, we examine the relationship between lightning flash rate and TGF production using World Wide Lightning Location Network (WWLLN) and Earth Networks Total Lightning Network (ENTLN) data and Fermi-GBM identified TGFs. The storms used in this study were manually tracked over the course of their lifetime which provides more detail into how these storms evolve, and it also allows for a much larger sample size than previous studies mentioned.

Data and Methods
For this work, we have used lightning data (localized radio atmospheric signals or "sferics") from both the WWLLN (Rodger et al., 2006(Rodger et al., , 2009 as well as the ENTLN (Marchand et al., 2019;Rudlosky, 2015;Sonnenfeld et al., 2020;Zhu et al., 2017). The motivation for using these two datasets stems from their unprecedented reliability, accuracy, and global coverage. The TGFs used for this study have been observed by Fermi-GBM over a time period ranging from 2008 to 2016. We used only the TGF events for which a positive WWLLN association was found (Roberts et al., 2018) and for which the ENTLN data are available -a sample size of 1,169 events. We have used the locations of the known WWLLN associations to identify the corresponding storms in the ENTLN data (i.e., the storms which would contain the WWLLN associated sferic). The identified storms, which consist entirely of ENTLN data, have then been tracked, by hand, over the course of their evolution. This was done by identifying isolated clusters of lightning sferics and following the clusters over the course of their lifetimes. Of the 1,169 total possible events, 534 were able to be tracked by hand, 231 had too sparse lightning data to pick out an accurate storm, 237 were too irregular to accurately identify a storm, 71 had no clear storm associated with the WWLLN identified sferic, 80 were difficult to track for a variety of other reasons, and 16 had no ENTLN data in the region of the known WWLLN association. Of the 534 events that were tracked by hand, only 418 of them had ENTLN sferics within 5 ms of the known TGF time (the same time window used to identify WWLLN associations by Connaughton et al., 2010Connaughton et al., , 2013. We chose to use 5 ms as a conservative window for finding associations in the ENTLN data, however, all of the events used in this study had sferics within the stricter, 3.5 ms window commonly used for WWLLN associations (Roberts et al., 2018). For each of the different analyses in this work, we used only those storms with at least 15 flashes (N = 371), an arbitrary threshold implemented to ensure the statistical quality of measures like the mean flash rate of the storms for example.
It is likely that our sample contains a variety of storm types, however, we are unable to comment definitively without additional data. It should be noted that the following results of this study can, at this point, only be considered accurate for TGFs produced in well organized, isolated thunderstorms, which is roughly 30% of the total TGFs available for this study. There is necessarily a selection bias introduced by the requirement of the ability to accurately track and identify storms over their lifetimes. There could also be unknown biases due to the changes in lightning network detection efficiency over time. Without additional data, we are unable to say whether these results apply to all TGFs or just TGFs produced in thunderstorms that are "trackable." One could potentially increase the number of "good" storms by incorporating more weather data, expanding the TGF catalog, or developing more sophisticated methods of tracking storms. These endeavors should be considered in future work. More details on the storms used in this study can be found in the Appendix.

TGF Temporal Locations in Storms
For each of the 418 tracked storms, the lightning sferics were then grouped into flashes, using a Kernel Density Estimation (KDE) method and a ten-minute running sum of the distinct flashes was calculated. The KDE method uses a Gaussian kernel with a bandwidth of 0.25s and allows for sferics to be grouped together into distinct lightning flashes with total durations usually less than 1 s. More details can be found in (Larkey et al., 2019). Figure 1 shows an example of one of these tracked storms. We used the Z-score, the number of standard deviations away from the mean a data point lies, to perform analysis of when the TGF occurs in the storms. The Z-score, in this case, is therefore analogous to a time measure of when in the storm the TGF occurs. We have used two different versions of this statistic, namely, a Z-score with respect to the mean time of all the flashes in the storm, and a Z-score with respect to the time of the maximum flash rate of the storm. More specifically, we have used the following equations.
Here, t TGF represents the time of the TGF, μ is the mean time of all flashes in the storm, σ is the standard deviation of lightning flash times about that mean, and argmax(R) is the time of the highest flash rate in the storm. Using these statistics, a positive Z-score implies that the TGF occurred after the mean or maximum flash rate, while a negative Z-score implies the opposite. Figure 2 shows an example of these statistics for the same storm as in Figure 1. We have performed the analysis on two separate subsets of the storms: those with durations less than 1 h (N = 129) and storms with durations longer than 1 h (N = 242). This one hour cut is justifiable as it is generally considered the upper bound on the length of a single-cell storm (Kolzov et al., 2013). It is also the mode of the distribution of storm durations in our sample (see Figure A2) and thus emerges as the natural duration to split the datasets into distinct populations.

Flash Rate Change at Time of TGF
We also examined how the flash rate was changing at the time of the TGF. In some cases, the TGF-producing storm might be a multi-cellular storm and, over the course of the evolution of the storm, the lightning flash rate might rise and fall periodically as one cell dies out and another begins to dominate (See Figures 2  and 3, for example). Because of this it is not simply enough to look at when the TGF occurs in the storm as a whole to determine if it is occurring during a declining phase of the lightning flash rate. It could simply be that while the TGF occurs late in the storm, it occurs early in a cell's increasing stage. Because of this, we wanted to determine if the TGFs are occurring during rising, falling, or plateau stages (i.e., near local peaks in flash rates) of the nearest individual peak of its storm. Note that none of the results of this analysis rely on the definition of a cell or even if these storms are in fact multi-cellular. The reference to multi-cellular storms is meant only as a possible explanation for the flash rate variations seen in Figures 2 and 3 for example. To examine this, we used a univariate spline to interpolate the flash rate data, resulting in a smoother version of the flash rate curve. The derivative of the smoother spline curve gives an idea of whether the flash rate is falling or rising at the time of the TGF. We used a peak finding algorithm to identify peaks and determine their heights relative to zero. We defined a plateau in the flash rate, as those times when the flash rate is greater than or equal to 80% of the height of the closest peak. For each TGF in our sample with at least 15 flashes (N = 371), we use this method to categorize how the lightning flash rate is changing at the time of the TGF. We define four distinct classifications, summarized as follows: 1. Falling: the flash rate at the time of the TGF is less than or equal to 80% the height of the nearest peak and the derivative at the time is negative.
LARKEY ET AL.

10.1029/2020JD034401
4 of 17 2. Rising: the flash rate at the time of the TGF is less than or equal to 80% the height of the nearest peak and the derivative at the time is positive. 3. Plateau (falling): the flash rate at the time of the TGF is greater than 80% the height of the nearest peak and the derivative at the time is negative. 4. Plateau (rising): the flash rate at the time of the TGF is greater than 80% the height of the nearest peak and the derivative at the time is positive.
To determine the significance of our results, we also use a bootstrap method. We pick a random flash from each of the 371 storms and calculate how often the random flashes fall into the above categories. This process is repeated 10,000 times, and the results are compared to the results of using the TGF alone.

Flash Rate at Time of TGF
In addition to looking at where in the storm phases the TGFs occur, we examined how the flash rate at the time of the TGF compares to the mean and maximum flash rates of the entire storm. We calculated the Z-scores of the flash rate at the time of the TGF relative to the mean and maximum flash rates in the storms. This was also done for randomly selected flashes in the storms to compare random flashes to TGF flashes. The equations used for this analysis (similar to Equations 1 and 2) are as follows: Here, FR TGF is the flash rate at the time of the TGF, μ FR is the mean flash rate, max(FR) is the maximum flash rate, and σ FR is the standard deviation of the flash rate.

Flash Amplitude at Time of TGF
The final analysis we present here is an examination of how the amplitudes of lightning flashes near the time of the TGF compared to the amplitudes of other lightning flashes in the TGF storms. To start, we defined CG flashes as those which contained one or more return strokes (as specified in the ENTLN data) and IC flashes as those which contained none (Thompson et al., 2014). We assigned each flash an amplitude, designated as the sum of the absolute value of all constituent sferic amplitudes (in kA). For example, if a flash were made up of a 100 kA stroke and a −30 kA stroke, we assigned the flash an amplitude of 130 kA. Figure 4 shows an example of how the 10-minute averages of flash rate and flash amplitude vary for a given storm.
We separated the lightning flashes by flash type (IC or CG) to examine the amplitudes of these types of flashes near the TGF compared to the amplitudes in the rest of the storm. We defined "near the TGF" to be within a ±300 s window of the time of the TGF and took the near-TGF flash amplitude to be the average of the CG (or IC) flashes in that time window -excluding the TGF flash itself. It has been shown that, for some populations of TGFs, the associated radio sferics are stronger than other, typical lightning flashes (Connaughton et al., 2013;Stanley et al., 2006). There is also evidence that some observed sferics are directly associated with the TGF events themselves (Lyu et al., 2016;Mailyan et al., 2018Mailyan et al., , 2020Pu et al., 2019). Finally, there is evidence that lightning networks tend to underestimate the peak current associated with TGFs as they assume that peak currents are proportional to the peak radio frequency radiation field, which may not be true for TGFs similar to compact intracloud discharges (CIDs) (Mailyan et al., 2020). Nonetheless, all of the above statements might only be true for a subset of TGFs as there are a large number TGFs without associated sferics, which may be for a variety of reasons, including the possibility that some of the sferics were too weak to be detected by lightning location networks. Regardless, to include the TGF-associated sferic amplitude in this analysis would not be appropriate. For this reason, we used a ten minute average flash amplitude to get a measure of the near TGF flash amplitudes. The results do not change substantially if a different window is used or if one uses just the 5 min prior to the TGF or the 5 min after the TGF to measure LARKEY ET AL.
10.1029/2020JD034401 6 of 17 the "near TGF amplitude." This was potentially a concern due to the results of Larkey et al. (2019), which showed that the interval immediately before the TGF was usually longer than normal-implying a stronger electric field or potential immediately before the TGF.
We compared the amplitudes of flashes near the TGF to all other flashes in the thunderstorms by calculating the Z-scores of their respective amplitudes using the following equations: Here, A CG(TGF) and A IC(TGF) represent the amplitude of the CG and IC flashes near the time of the TGF, respectively.  A CG and  A IC represent the mean CG and IC amplitudes, and  ACG and  A IC are the standard deviations of the respective amplitudes.
This process was done using just IC or CG lightning flashes. In some cases, there were no IC flashes within the 10-minute window of the TGF, and in those cases, we did not use those storms to contribute to this analysis. Furthermore, there were a number of cases where ENTLN did not provide amplitude data for the identified sferics, and therefore those cases could not be used here. Because there are many storms for which this amplitude data is unavailable, it is a possibility that the results of this analysis are not universal to all TGF-producing thunderstorms, and this should considered when applying the results of this section.

TGF Temporal Locations in Storms
For each of the 371 tracked events with at least 15 flashes, we calculated the two Z-score statistics mentioned in Section 2.1. The distributions of these two statistics are shown in Figure 5. To test if these results are significantly different from "typical" lightning flashes, we have also calculated the statistics for randomly selected flashes in each storm (also shown in Figure 5). Using a two sample Kolmogorov-Smirnov (KS) test, we see that for storms with durations shorter than an hour, the two distributions are very significantly different (p-values of <0.01 [a confidence level of 99%] for each), however the p-values for the longer storms are much higher and not statistically significant.
We find that, in the case of the shorter storms, for both measurements (relative to the mean and relative to the max), there is an enhancement for Z-scores greater than zero, and fewer cases of Z-scores less than zero-implying that TGFs occur more often later in those storms than early in their development. For the Z-scores taken relative to the time of the peak flash rates of the storms, we also see a significant departure between the two distributions. The TGF distribution clearly has fewer events around a Z-score of 0 (i.e., near the peak flash rate), and a slight enhancement for Z-scores greater than 0. This implies that TGFs are less likely to occur on a per-flash basis during periods of peak lightning activity in thunderstorms and that they are more likely to occur after the peak flash rate times in shorter storms. This is consistent with Smith et al. (2010) which found that TGFs tend to occur when the lightning flash rate of a storm is declining.
The lack of significance in the bottom panels of Figure 5 is likely due to the fact that the longer storms are more likely to be multi-cell storms and thus also tend to have more oscillations in the flash rate curve see Figures 3 and 4 for example). In these storms, the conditions conducive to producing TGFs could potentially be met several times over the lifetime of the entire system as the flash rate rises and falls repeatedly. On the other hand, for shorter, perhaps single-cell, storms they might only be achieved near the end of the storms lifetime.

Flash Rate Change at Time of TGF
With regards to how the flash rate is changing at the time of the TGF, we found that 139 (37%) of the TGFs in our sample occurred when the flash rate of the storm was falling, 60 (16%) occurred when the flash rate was increasing, 103 (28%) occurred near a plateau and while the flash rate was falling, and 69 (19%) occurred near a plateau and while the flash rate was increasing. To be sure that this was not just a selection bias (i.e., a result of unequal rising and falling rates in the storms) we tested the significance of these results as described in Section 2.2. The results of 10,000 random trials are summarized in Table 1 and Figure 6 (the probabilities given in Table 1 assume a normal distribution). These results show that the large number of TGFs which occur during a falling stage is extremely difficult to reproduce using random flashes and is likely not the result of selection bias. The results remain significant even if a different threshold for the plateau height is chosen and if only shorter (duration less than an hour) storms or longer (durations greater than an hour) storms are used in the analysis.

Flash Rate at Time of TGF
In addition to looking at when in the storm phases the TGFs occur, we examined how the flash rate at the time of the TGF compares to the mean and maximum flash rates of the entire storm. To do so, we calculated the Z-scores of the flash rate at the time of the TGF relative to the mean flash rate and the maximum flash rate in the storms. This was also done for randomly selected flashes in the storms. These distributions are shown in Figure 7.
In this case, we see that there is no significant dependence on the flash rate (neither relative to the mean flash rate nor the maximum flash rate) for the shorter storms. However, there is a strong tendency for the TGF to LARKEY ET AL.

10.1029/2020JD034401
8 of 17  Note. Given are the values found using just TGF flashes, as well as the mean and standard deviation for each classification found during the random trials. The last column shows the probability of getting the observed results by chance.

Table 1
Results of the Bootstrap Method Described Earlier occur at lower flash rates in the longer storms. It is evident from Figure A4 that shorter storms tend to have lower overall flash rates when compared to the longer storms. This, along with the dependence to occur at the lower flash rates during the long duration storms, indicates that TGFs have a preference to occur during low flash rate periods in general and that the periods of relatively higher flash rates in longer storms makes the production of TGFs less probable.

Flash Amplitude at Time of TGF
Of the 371 storms used for this analysis, 305 of them had at least one CG flash within 10 min of the TGF and 102 had at least one IC flash in that time. Figure 8 shows the results of the amplitude analysis described in the previous section. We see that for both flash types and storm duration thresholds, the distributions representing the flashes near the TGF are shifted to higher Z-scores which translates to higher amplitudes. The LARKEY ET AL.
10.1029/2020JD034401 9 of 17 Figure 6. Radar plot showing the distribution of Terrestrial Gamma-Ray Flashes (TGFs) classified as falling ("F"), rising ("R"), plateau (falling) ("PF"), and plateau (rising) ("PR"). The red curve shows the distribution for the actual TGFs, while each blue curve shows one random trial (10,000 trials in total). Also labeled are the means and standard deviations for the random trials. The black dots mark the mean for each different classification.

Figure 7.
Histograms of flash rate Z-scores at the time of the Terrestrial Gamma-Ray Flash (red) and at randomly selected times (blue) relative to the mean flash rate (left column) and the maximum flash rate (right column) of the storms. The histograms are normalized such that the integral over the range is equal to one. The top row shows the distributions for storms that last less than one hour while the bottom row shows the distributions for longer storms. Also labeled are the p-values from the KS test to test the significance of the differences between the two distributions. Figure 8. Z-score distributions for CG (left column) and IC (right column) flash amplitudes both near the Terrestrial Gamma-Ray Flash (TGF) (red) and away from the TGF (blue). Each histogram is normalized such that the integral over the range is equal to one. Labeled are the number of flashes used to make up each of the curves. The top row shows the distributions for storms that last less than one hour while the bottom row shows the distributions for longer storms. Also labeled are the p-values from the KS test to test the significance of the differences between the two distributions.
skewed shape of the blue curves in Figure 8 is caused by a larger number of relatively low amplitude flashes occurring during high flash rate periods of the storms which acts to shift the distribution of all flashes to the left as seen. The results here imply that TGFs occur during periods when the amplitude of the flashes tend to be higher than the mean flash amplitude in the storms. This is consistent with the results of the flash rate analysis above, as well as the inverse relationship seen between flash rate and flash amplitude, which is often (but not always) the case (see Figure 4). The extremely low p-values produced by performing a KS test on the two distributions, show that the differences between the distributions are statistically significant. This indicates that perhaps the flash amplitude is more important for the production of a TGF than the flash rate of the storm at the time.

Discussion
In this study, we have examined the relationship between storm phase and TGF production in 371 tracked thunderstorms.This sample represents a subset of the 1,169 total TGF storms available, and consists of well organized, isolated thunderstorms. The results of this study may not apply to other types of storms. We used 10-minute running sums of lightning flashes to define the lightning flash rates in the storms. We found that while TGFs tend to occur in all phases of storms with durations longer than an hour, they are more likely to occur during later phases and are less efficiently produced during the peak lightning activity of storms with shorter (<1 h) durations. We further demonstrated that TGFs occur more often during falling stages of local peaks in lightning activity than any other stage. In addition, we demonstrated that TGFs tend to occur at lower flash rates than randomly selected flashes in storms. Finally, we showed that TGFs are more likely to occur when the lightning flash amplitudes are larger than normal. All of these results were tested using a 2-sample Kolmogorov-Smirnov test to quantify the significance.
The distinction between the analysis done in this work and other studies (e.g., Ursi et al., 2019 which found that most of the TGFs in their sample occurred within ±5 min of the peak lightning activity) is that we are calculating the probability of the TGF occurring per lightning flash as opposed to the probability per second. During peak lightning flash rate periods, there are obviously many lightning flashes occurring, but the number of TGFs produced is not proportional to the total lightning flashes, meaning that TGFs are produced less efficiently during the peak flash rates than during other phases of the storms.
Several factors may make TGFs less likely during times of peak flash rate. For example, it is well established that lightning activity is strongly tied to updraft speed and content (Baker et al., 1999;MacGorman et al., 2011;Williams et al., 2005;Yoshida et al., 2017). Higher hydrometeor concentration, caused by higher updraft velocities, could make streamer initiation easier and result in lightning initiating repeatedly when large-scale electric fields are too low to support TGFs (Motley, 2006). Stronger convection at these times could also compress the distance between the main negative and upper positive charge centers, resulting in a lower overall potential, and therefore less chance of a TGF even regardless of the lightning initiation field; this scenario was proposed by Fabró et al. (2019) as an explanation for why TGFs were surprisingly under-produced in storms in central Africa. In addition, charge structure in storms is also known to become more disorganized as the storms intensify and mature (Coleman et al., 2003;Marshall et al., 1995Marshall et al., , 2009Stolzenburg et al., 1998). However, there is evidence that the charge structure becomes more horizontally extensive in decay phases (Bruning et al., 2007). This may imply that the environment of the decay phases of storms is more suitable for TGF production.
Our findings that TGFs occur later in shorter storms and the results showing that TGFs occur more often as the flash rate is falling are consistent with the results of Smith et al. (2010) which found that TGFs tend to occur when the flash rate is declining. The observation of an excess of TGFs occurring after the peak flash rate in shorter storms is possibly explained in two different manners. First, it is likely that a larger electric potential difference is required for TGF production as reflected in the RF model (Dwyer, 2003(Dwyer, , 2008. It might be possible that this electric potential or field is more likely to be reached after the peak flash rate, as the storm has had longer to charge up and also because the flash rates are lower in later stages. This explanation is contrasted by observations of Stolzenburg and Marshall (2008) in which balloon measurements show that electric potentials tend to be smaller later in thunderstorms. It has also been observed that the energy released per lightning flash tends to increase with time ). Furthermore, TGFs are observed to occur in winter thunderstorms in Japan (Bowers et al., 2017;Enoto et al., 2017;Wada et al., 2019Wada et al., , 2020 where the thunderstorms are shorter and therefore cannot reach the same electric potentials as taller thunderstorms (Rakov, 2003). Yet, these storms are still known for producing very powerful lightning flashes (Brook et al., 1982;Holzworth et al., 2019;Turman, 1977). It is therefore possible that energy transferred by a lightning flash is the driving factor in determining whether or not a TGF is produced. Namely, the more energy released by a lightning flash, the more likely it is to result in a TGF. This theory also fits well with the other results of this study, namely the fact that TGFs tend to occur during low flash rate periods which are consequently usually (but not always) periods of high amplitude flashes.
Our other findings, namely that TGFs occur during periods of lower flash rates and that the flashes near the time of the TGF tend to be stronger than other flashes in the storm, is also consistent with the above explanations. TGFs have been shown to be associated with higher energy flashes in prior work (see Connaughton et al., 2013;Cummer et al., 2014;Inan et al., 1996Inan et al., , 2006Stanley et al., 2006). The results presented here, however, are distinct in the fact that the TGF flash itself is excluded from the calculations. These results show that the non-TGF flashes near the time of the TGF, not just the TGF flash, are stronger than normal. This implies that there is a special relationship between storm phase and TGF production. Properties of the storm such as charge structure, electric field, flash rate, etc. likely play a role in the production of a TGF, and therefore it is ultimately the surroundings or conditions of the storm at the time of a given lightning flash that determine the production of a TGF. It is not just that any lightning flash can or will produce a TGF, but rather that the environment plays a large part in whether or not a TGF will actually be produced.
In summary, the results described in this study show that TGFs tend to occur later in storms with shorter durations, TGFs tend to occur while the flash rate is lower than average and declining, and that TGFs tend to occur when the flash amplitudes are higher than average. These results apply for a subset of TGFs that occur in isolated, well organized storms (roughly 30% of the total number of observations available).

Data Availability Statement
Fermi TGF and WWLLN Association data can be found at the location: https://fermi.gsfc.nasa.gov/ssc/ data/access/gbm/tgf/. Earth Networks data is freely available upon request (www.earthnetworks.com).