Validation of the Geostationary Lightning Mapper With a Lightning Mapping Array in Argentina: Implications for Current and Future Spaceborne Lightning Observations

A validation study of the Geostationary Lightning Mapper (GLM) on board the Geostationary Operational Environmental Satellite 16 (GOES‐16) was done using a ground‐based lightning mapping array (LMA) deployed as part of the Remote sensing of Electrification, Lightning, And Mesoscale/microscale Processes with Adaptive Ground Observations (RELAMPAGO) field campaign in Argentina. GLM detected lightning with 74.6% efficiency over 61 thunderstorm days in December 2018 through April 2019. However, GLM detection efficiency (DE) was negatively correlated (r = −0.49) with LMA flash rate. GLM DE also was negatively correlated with LMA flash altitude (r = −0.24), reflecting the influence of multiple competing trends. GLM DE was positively correlated (r = 0.27) with number of LMA sources in a flash, indicating improved DE for larger flashes. During periods with anomalously electrified storms, GLM DE was reduced to 50.9%. Statistics were found to be sensitive to analysis criteria, but most of the above trends remained consistent regardless of specific criteria. Because the methodology allowed a GLM flash to match more than one LMA flash, actual GLM flash rate was a factor of 2.9 lower than the LMA flash rate, and this ratio grew larger as LMA flash rate increased. A sensitivity study examined the impact of improved DE for smaller flashes; that is, an improved sensor (or algorithm) that was better able to detect and distinguish between separate small lightning flashes. The results showed improved correlation with LMA flash rates, as well as improved ability to identify lightning jumps associated with intensifying convection.


Plain Language Summary
Based on a comparison with a ground-based, three-dimensional lightning detection system in Argentina, the Geostationary Lightning Mapper (GLM) on board the Geostationary Operational Environmental Satellite 16 (GOES-16) detects lightning with nearly 75% efficiency, which meets its requirements. However, that detection efficiency decreases a lot when thunderstorms produce a lot of lightning at once, or small lightning flashes, or when lightning occurs deeper in the cloud where it is more difficult for the optical pulse to make its way to cloud top. This makes GLM somewhat less useful during the most intense part of a storm's life. However, if GLM or a similar sensor could be made more sensitive, either with improved hardware design or better data processing, then it would become more useful in intense storms.

Published 2023. This article is a U.S. Government work and is in the public domain in the USA. Earth and Space Science published by Wiley Periodicals LLC on behalf of American Geophysical Union. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

Introduction
Ever since the Geostationary Lightning Mapper (GLM) on board the Geostationary Operational Environmental Satellite 16 (GOES-16) first began operating in 2017, it has been recognized as a highly successful instrument that makes critical and continuous observations of lightning across its quasi-hemispheric field of view (Rudlosky et al., 2019). Since then, two additional GLMs, on GOES-17 and -18, have launched (e.g., Bateman et al., 2021; Rudlosky & Virts, 2021). Due to its continuous monitoring capability, GLM regularly observes far more lightning than its predecessors. For example, the Lightning Imaging Sensor (LIS), which has been hosted on the Tropical Rainfall Measuring Mission (TRMM; Kummerow et al., 1998) and the International Space Station (ISS; Blakeslee et al., 2020), provided much of the design heritage for GLM (in particular, the focus on the 777.4-nm oxygen triplet, which enables optical detection of lightning during daytime; Goodman et al., 2013), but due to its low-Earth orbit (LEO) it cannot observe as much lightning, in raw numbers, as GLM. LIS, along with its predecessor the Optical Transient Detector (OTD; Christian et al., 2003), has been aimed at documenting global lightning, which GLM cannot do.
However, because it views continuously and observes so much lightning, issues have been noted with GLM that were not as well-documented with previous spaceborne lightning mappers, despite the common design heritage.
One major area of concern has been false alarms; that is, event detections that do not correspond to actual lightning (Bateman et al., 2021; Bateman & Mach, 2020; Peterson, 2020). These often manifest as solar glint (either off clouds or reflective surfaces like water), or artifacts produced by GLM electronics (e.g., "Bahama bars"; Bateman & Mach, 2020). However, though they are challenging to address in processing algorithms, it is relatively straightforward to document these issues, as reference ground- and space-based data sets exist for cross-check (though challenges still remain; Virts & Koshak, 2023). Moreover, false alarms often are dependent on relatively predictable patterns (e.g., solar reflections as the sun moves across the GLM field of view). Also, false alarms do not seem to be as large a concern in LEO missions like LIS (Blakeslee et al., 2020; Lang & Bang, 2022).
Perhaps more concerning, then, is the lightning that GLM (and by extension, other missions that use the 777.4 nm detection capability) may miss. GLM was designed to provide 70% minimum flash detection efficiency (DE) averaged across the field of view (Goodman et al., 2013). This is likely as good as or better than what TRMM LIS was able to provide, and is better than the ∼60% DE provided by ISS LIS (Blakeslee et al., 2020). But not all lightning is created equal, and the DE of GLM (and related missions) may be a strong function of lightning type and thunderstorm evolutionary state (Murphy & Said, 2020; Peterson, 2021a; Rutledge et al., 2020; Zhang & Cummins, 2020).
One of the critical services that GLM provides is continuous monitoring of severe storms. A notable feature of severe storms is their propensity to produce a lot of lightning flashes, particularly while intensifying prior to the production of strong winds, hail, or tornadoes. This so-called "lightning jump" (Chronis et al., 2015; Gatlin & Goodman, 2010; Schultz et al., 2009; Williams et al., 1999) was originally identified using three-dimensional total lightning mappers that detect close to 100% of the lightning within their range (∼100 km). The lightning jump is clearly linked to significant kinematic and microphysical changes in thunderstorms as they evolve (Chronis et al., 2015; Schultz et al., 2015, 2017). One change that is very common is increased updraft strength, which leads to increased frequency of small-scale turbulent eddies that separate charge over smaller distances, subsequently encouraging smaller flashes near updrafts compared to further away (Bruning & MacGorman, 2013; Schultz et al., 2015).
To summarize, then, intense or severe thunderstorms are highly likely to produce a lot of small lightning flashes near their updraft cores. However, these flashes are also the kind of lightning that GLM is most likely to miss. This is because the flashes may produce a reduced amount of optical energy, and they may fall below the horizontal spatial resolution of GLM (∼8-10 km). And due to the design heritage, there is no reason to think that similar instruments like LIS would not be similarly challenged (Zhang & Cummins, 2020).
An additional issue is anomalous storms; that is, storms that tend to have a preponderance of positive electrical charge at mid-levels (roughly −10 to −20°C) compared to typical storms (Bruning et al., 2014; Rust et al., 2005; Wiens et al., 2005). A major consequence of this is that most of the lightning in these storms occurs lower in altitude within the cloud. This naturally limits the amount of optical scattering reaching cloud top, upon which imagers like GLM, LIS, etc. depend (Marchand et al., 2019; Peterson et al., 2021; Rutledge et al., 2020).
Thus, the focus of this study will be on quantitatively documenting how GLM-16 DE evolves as lightning and thunderstorms evolve. This will enable us to understand how to properly interpret GLM (and similar) observations when DE is expected to be challenged (e.g., small flashes in severe storms, anomalous storms, etc.). The domain of interest is a three-dimensional (3D) lightning mapping network in north-central Argentina, which was deployed in support of the Remote sensing of Electrification, Lightning, And Mesoscale/microscale Processes with Adaptive Ground Observations (RELAMPAGO) field campaign (Nesbitt et al., 2021). GLM performance in this domain was studied by Lang et al. (2020); however, that study only examined two cases. The present study will build on this to examine a multi-month period, enabling a high level of statistical confidence in the results. Argentina is an excellent domain to study because severe weather is relatively common there (Nesbitt et al., 2021), and anomalous storms also can occur (Medina et al., 2022). In addition, baseline GLM performance is very good in this region as it is relatively close to sensor boresight (Bateman & Mach, 2020; Bateman et al., 2021; Peterson et al., 2022; Virts & Koshak, 2023), so confounding resolution and parallax issues that occur near the edge of GLM's field of view do not need to be considered when analyzing DE as a function of thunderstorm evolution.
While this study will focus on GLM-16, its results will be relevant to similar missions. This includes current geostationary instruments like GLM-17/18, Lightning Mapping Imager (LMI; Cao et al., 2021), and Meteosat Third Generation Lightning Imager (MTG-LI; Holmlund et al., 2021). It also includes LEO missions like LIS (both TRMM and ISS), OTD, Fast On-orbit Recording of Transient Events (FORTE; Suszcynsky et al., 2000), and the Atmosphere-Space Interactions Monitor (ASIM; Neubert et al., 2019). Ultimately, any instrument focused on measuring lightning via the 777.4-nm optical emission band is going to be affected by day/night asymmetries in DE, as well as challenges in detecting small, optically dim flashes, or lightning occurring within optically thick clouds.
It should be noted that ASIM also monitors lightning at 337 nm (Chanrion et al., 2019). This ultraviolet (UV) band is typically more sensitive to colder streamer activity, compared to 777.4 nm which is more sensitive to hotter leader activity. Indeed, there appears to be a population of lightning flashes that are predominantly more detectable at 337 versus 777 nm (Soler et al., 2021). Similarly, there are flashes like narrow bipolar events (NBEs) that are more readily detectable in radio frequency (RF) compared to 777 nm (e.g., Jacobson & Light, 2012; Liu et al., 2021). These flash detectability differences have important implications for single-channel sensors like GLM, LIS, etc. Specifically, a multi-frequency approach may yield important detectability improvements that allow a more representative overall population of lightning flashes to be detected. This could have benefits for algorithms based on phenomena like lightning jumps. Thus, it would be helpful to understand what benefits may accrue from improved DE of additional lightning flashes.

Geostationary Lightning Mapper (GLM)
For this study, the GLM on GOES-16 was used. This sensor has already undergone a validation process (e.g., Bateman & Mach, 2020; Quick et al., 2020; Virts & Koshak, 2020) by the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA); however, it remains useful to continue probing potential limitations of the instrument, so that its full scientific potential can be understood. This study used the Level 2 data sets provided by the Lightning Cluster Flash Algorithm (LCFA; GOES-R 2018), which identifies Flashes as a collection of Groups, and Groups as a collection of Events (Mach, 2020). There is significant evidence that the LCFA erroneously breaks up very large, long-duration stratiform lightning flashes ("megaflashes"; Peterson, 2019, 2021b), mainly due to computational optimizations that improve realtime latency. This issue leads to the identification of multiple flashes when in reality there was only a single megaflash. However, because such flashes are rare (<1% of the GLM data set), and tend to occur at times of low local flash rate, they should not influence the results of this study.

Lightning Mapping Array (LMA)
The RELAMPAGO lightning mapping array (LMA) was deployed during November 2018 through April 2019; that is, the austral warm season. The centroid of the LMA was near the city of Córdoba, Argentina. LMAs map lightning in 3D using Global Positioning System (GPS) time-of-arrival techniques (Rison et al., 1999), and are generally considered to detect nearly all lightning within relatively close range of network center (e.g., Thomas et al., 2004). The network consisted of up to 11 stations during this time; however, the number of stations operating on a given day was variable. Generally at least 7 stations were operational, and more typically 9+. The published RELAMPAGO LMA Level-2 data set was used in this study (Lang, 2020). This data set consists of individual VHF source locations as well as identified flashes using the processing approach by Lang et al. (2020). The main constraints on the flash identification were a maximum of 150 ms and 3 km between successive sources in a flash, and a flash could have a 3-s maximum duration. Lightning mapping array flashes were defined using three different minimum thresholds (3, 10, and 100 points) in order to explore the impact of smaller versus larger flashes on GLM DE. Because the LMA was still being installed and improved during significant portions of November, this study focused on the December-April period.

Analysis Methodology
This study focused only on lightning within 100 km of the RELAMPAGO LMA centroid, as Lang et al. (2020) found that LMA performance was optimized within this region. Such lightning occurred on 61 thunderstorm case days during the December-April deployment (i.e., one or more thunderstorms occurred roughly every other day). There were additional days with only 1-2 LMA flashes (1 January, 2 and 25 February) that were excluded from this total. Though 26 December had LMA lightning, GLM observations were missing for the relevant time period, so this day also was excluded from analysis.
For the statistical validation, the main time unit for determining flash rates was 10 min. Besides being conveniently equal to the 10-min data files provided by the LMA processing algorithm (Lang, 2020), this also reduced minute-to-minute variability and focused on broad-based flash and instrument performance trends within the 100-km radius domain. A GLM flash was considered to be matched with an LMA flash if its centroid occurred within 500 ms and 25 km of at least one LMA flash. Due to the fundamentally different nature of the GLM (optical) and LMA (VHF) measurements, as well as the coarser ∼10-km resolution of GLM, a GLM flash was allowed to match multiple LMA flashes if both the time and distance criteria above were met. This benefitted GLM DE even if GLM flash rates in the domain were well below LMA flash rates. Sensitivity studies were performed with these spatiotemporal matching criteria halved (250 ms and 12.5 km) and doubled (1,000 ms and 50 km). To speed up processing, GLM flashes were initially restricted to within 150 km of the LMA centroid before performing any spatiotemporal matching. This allowed GLM flashes just outside the 100-km radius of analysis to match with LMA flashes inside the radius.
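The matching rule described above can be sketched as follows. This is a minimal illustration rather than the study's processing code: the `Flash` record, the `haversine_km` helper, the use of a single representative time and centroid per flash, and the spherical-Earth distance are all assumptions made for the example. DE is taken here to be the fraction of LMA flashes that have at least one GLM flash inside both the time and distance windows.

```python
import math
from dataclasses import dataclass


@dataclass
class Flash:
    """Hypothetical flash record: time in seconds, centroid in degrees."""
    time_s: float
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def detection_efficiency(lma_flashes, glm_flashes, max_dt_s=0.5, max_dist_km=25.0):
    """Fraction of LMA flashes matched by at least one GLM flash.

    One GLM flash may match several LMA flashes, as allowed in the text.
    Returns None when there are no LMA flashes in the period.
    """
    if not lma_flashes:
        return None
    matched = 0
    for lma in lma_flashes:
        for glm in glm_flashes:
            close_in_time = abs(glm.time_s - lma.time_s) <= max_dt_s
            close_in_space = haversine_km(lma.lat, lma.lon, glm.lat, glm.lon) <= max_dist_km
            if close_in_time and close_in_space:
                matched += 1
                break  # this LMA flash is detected; move on to the next one
    return matched / len(lma_flashes)
```

With the defaults the function applies the standard 500-ms and 25-km criteria; passing `max_dt_s=0.25, max_dist_km=12.5` or `max_dt_s=1.0, max_dist_km=50.0` reproduces the halved and doubled sensitivity settings.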
Day (1100-2200 UTC) and Night (0000-0830 UTC) were analyzed separately, with transition periods (2200-0000 and 0830-1100 UTC) only included in Overall statistics. The transition periods were made wide enough to exclude any seasonal changes in sunrise/sunset during December-April. For example, around austral summer solstice daytime is longer than 1100-2200 UTC, but this is not true after the austral autumnal equinox.
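As a small illustration of the diurnal binning above, the following sketch labels a UTC timestamp as Day, Night, or Transition. The function name and the half-open treatment of the boundaries (for example, 2200 UTC itself counted as Transition) are assumptions; the text does not state how boundary times were handled.

```python
from datetime import datetime, timezone


def diurnal_class(t):
    """Classify a UTC datetime using the Day/Night windows given in the text."""
    minutes = t.hour * 60 + t.minute
    if 11 * 60 <= minutes < 22 * 60:   # 1100-2200 UTC
        return "Day"
    if minutes < 8 * 60 + 30:          # 0000-0830 UTC
        return "Night"
    return "Transition"                # 0830-1100 and 2200-0000 UTC


# Example: 0530 UTC on the 23 February 2019 case falls in the Night window.
example = diurnal_class(datetime(2019, 2, 23, 5, 30, tzinfo=timezone.utc))
```

Periods labeled Transition would contribute only to the Overall statistics.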
Sensitivity studies were done that examined the possible benefits of improved GLM (or GLM-like) DE of missed flashes. This simulated the impacts of 0%-100% improvement in the identification of missed LMA flashes by a spaceborne sensor; that is, an improved sensor or algorithm that was able to resolve (and thus match) additional individual small LMA flashes. This could include flashes that were fully missed by GLM, as well as more than one of the multiple LMA flashes that were allowed to match a single GLM flash. In the latter case, this simulated an improved ability by a spaceborne instrument to resolve these individual small, weak flashes that are captured by the LMA without amalgamating them into a single, lower-resolution optical flash seen by GLM. For the above analysis, 1-min flash rates within 100 km of the LMA were calculated for both GLM and the LMA on all case days analyzed for the statistical DE study (which was done at 10-min resolution, see above).
Lightning jumps (LJs) were defined following the methodology of Chronis et al. (2015). Similar to that study, the average 2-min flash rate of the most recent 14 min of lightning was computed, and then the standard deviation of 2-min flash rates was computed for the first 12 min of that period. The most recent 2-min flash rate was then compared to the average, and if it was greater than 3 times the standard deviation above the average, and the flash rate was greater than 25 min⁻¹, then an LJ was identified. Chronis et al. (2015) explored a variety of α (multiplicand of the standard deviation, σ) and absolute flash rate thresholds, and found that LJs correspond to significant kinematic and microphysical changes in a thunderstorm. Analysis by Schultz et al. (2015, 2017) also supported this inference. The present study used α of 3 and the 25 min⁻¹ threshold to limit the focus to the strongest LJs, those that were most likely to truly correspond to significant thunderstorm evolution, as opposed to potential noise. After an LJ was identified, 10 min were required to pass before another LJ could be identified.
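The LJ test described above can be sketched in a few lines. This is an illustrative reading of the description, not the code used in the study or in Chronis et al. (2015). Several details are assumptions: the input is a list of 1-min flash rates, 2-min rates are formed from non-overlapping pairs of minutes, the test is re-evaluated every minute, and the population standard deviation is used. The function name `find_lightning_jumps` is hypothetical.

```python
import statistics


def find_lightning_jumps(rates_per_min, alpha=3.0, min_rate=25.0, refractory_min=10):
    """Return the minute indices at which a lightning jump (LJ) is flagged.

    rates_per_min: sequence of 1-min flash rates (flashes per minute).
    """
    jumps = []
    last_jump = None
    for end in range(14, len(rates_per_min) + 1):
        window = rates_per_min[end - 14:end]
        # Seven non-overlapping 2-min mean flash rates covering the last 14 min.
        two_min = [(window[i] + window[i + 1]) / 2 for i in range(0, 14, 2)]
        mean_all = statistics.fmean(two_min)     # average over the full 14 min
        sigma = statistics.pstdev(two_min[:6])   # spread over the first 12 min
        latest = two_min[-1]                     # most recent 2-min flash rate
        t = end - 1                              # index of the latest minute
        if last_jump is not None and t - last_jump < refractory_min:
            continue  # still inside the 10-min wait after the previous LJ
        if latest > mean_all + alpha * sigma and latest > min_rate:
            jumps.append(t)
            last_jump = t
    return jumps
```

The defaults correspond to the α of 3 and the 25 min⁻¹ threshold used in the present study; lowering either would admit weaker jumps.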
LJs for both the LMA and GLM were computed, with GLM LJs recomputed according to 0%-100% improvements in DE as described above. A limitation of this analysis, compared to Chronis et al. (2015) and Schultz et al. (2015, 2017), was that individual thunderstorms were not isolated and tracked. All storms within 100 km were allowed to contribute to the LJ analysis. This is another reason this study only focused on stronger LJs, to limit the effect of a mixture of evolutionary states within multicellular storms on the statistics. In addition, this study did not consider a storm's propensity to produce severe weather relative to an LJ; the focus was solely on significant thunderstorm evolutionary changes within the analysis domain, as indicated by the presence of an LJ. False alarm rates were not explored in this study. As discussed earlier, false alarms are a known issue with GLM, but this study was primarily focused on DE as a function of flash rate and flash/thunderstorm behavior. In addition, significant progress has been made on reducing false alarms and artifacts in GLM-16 since 2018-2019 (e.g., Bateman et al., 2021), while reduced DE during thunderstorm intensification or anomalous periods remains an issue where limited progress has been made (e.g., Rutledge et al., 2020).

Overview
Figure 1 shows an example set of time series exemplifying GLM performance during RELAMPAGO, which helps to motivate this study. Similar figures for other cases can be found in Lang et al. (2020). The case in question here is 23 February 2019, when an intense thunderstorm transited through the southern portion of the LMA domain. Initially, GLM DE was nearly 100% as LMA flash rate was low and flashes tended to have a large (∼100 or more) number of points. However, as the storm intensified, flashes grew smaller (i.e., fewer points per flash), and LMA flash rates reached a peak of ∼80 min⁻¹ (averaged over a 10-min period) at two separate times (during the 0400 and 0500 UTC hours, respectively). Flash rates then weakened (to ∼40 min⁻¹) before the storm exited the analysis domain. While GLM was broadly correlated with the LMA, in that it also showed an enhancement and then a reduction in lightning, GLM peak flash rates were more than a factor of 4 lower. This was primarily due to a very large reduction in GLM DE when the thunderstorm intensified, down to less than 20% during part of 0500-0600 UTC. This hour-long time period also occurred right after the storm appeared to be anomalously electrified, according to Medina et al. (2021); mean source altitude also decreased immediately prior to this time period (Figure 1c). Indeed, it is possible that Medina et al. (2021) did not fully characterize the anomalous behavior of this storm, which based on the altitude time series appeared to have extended past 0500 UTC. Moreover, due to some complexities in the GLM DE curves, [...]
A deeper investigation into the relationships between all four major variables in Figure 1 yielded Figure 2, which shows all permutations of scatterplots between GLM DE, LMA flash rate, LMA flash altitude, and LMA points per flash. A strong negative correlation between LMA flash rate and GLM DE was confirmed (Figure 2a). However, for this case GLM DE was positively correlated with both LMA flash altitude (Figure 2b) and points per flash (Figure 2c). Meanwhile, LMA flash rate was negatively correlated with both points per flash and flash altitude. The latter may be a result of the anomalous electrification observed by Medina et al. (2021). Very little correlation was observed between LMA points per flash and flash altitude.
These single-case results illustrate the major themes of this study. For example, GLM can have high DE (particularly for larger flashes), but when storms are intensifying DE appears to be anticorrelated with actual flash rate. GLM DE also appears to be negatively impacted by transient anomalous periods in thunderstorms. Ultimately, this makes GLM's response in intense and/or anomalous storms much more muted than it ideally should be, limiting the utility of GLM flash rate to identify intense storms (e.g., Murphy & Said, 2020).
The remainder of this Results section will build on this case study, as well as the work of Lang et al. (2020), to explore these themes in much more quantitative and statistical detail. In addition, the effects of potential improvements in spaceborne optical lightning DE will be explored.

Statistical Analysis of GLM Detection Efficiency
Across all 61 case days, there were 8,762 possible ten-minute periods to enable DE and correlation analyses. Depending on which flash rate was computed (e.g., GLM vs. LMA, 3-point vs. 10-point vs. 100-point), the actual number of ten-minute periods with lightning ranged from 2,105 to 2,502. Correlations were done assuming these periods with lightning were linked as a continuous time series. This is justified as flash rates nearly always tapered toward 0 near the start and end of individual cases with lightning (e.g., Figure 1). Unlike Figure 2, which showed linear fits mainly for illustration, Spearman correlations were computed as these do not assume a linear relationship between the variables, only a monotonic one.
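To make the distinction between a linear and a monotonic relationship concrete, the sketch below computes a Spearman correlation as the Pearson correlation of rank-transformed data, with tied values assigned their average rank. It is a plain-Python illustration with hypothetical function names, not the code used in the study; in practice a library routine such as `scipy.stats.spearmanr` does the same job.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Because only ranks enter the calculation, any strictly increasing relationship (for example, `y = x ** 3`) gives a coefficient of 1 even though it is far from linear.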
Statistics from the analysis are presented in Table 1. Results are broken out by the standard 25-km and 500-ms matching criteria as well as whether those criteria were halved or doubled. Day, Night, and Overall statistics are also provided. Discussion of this Table will focus on the broader trends. Overall, for a 10-point minimum threshold to define an LMA flash, GLM detected ∼75% of LMA lightning, which met GLM's performance requirements (Goodman et al., 2013). Decreasing the threshold to 3 points reduced DE by only 1%-5%. Increasing the threshold to 100 points improved DE by ∼10%-20%. Halving the matching criteria reduced DE by ∼10%-20% (which reduced GLM well below 70% DE except for 100+ point flashes), while doubling the criteria improved DE by ∼5%-10%. As expected, GLM performance during daytime was significantly worse than during nighttime, again by ∼10%-20%. GLM DE at night could exceed 90%, depending on matching criteria and LMA points threshold. Overall, the statistics indicated very good DE performance by GLM, but this performance was sensitive to specific thresholds used in the analysis.
Medina et al. (2021) identified 38 hourlong blocks (their unit of analysis) during December 2018 through April 2019 when the LMA indicated anomalous lightning behavior. This behavior was determined by the use of an automated charge identification algorithm, called Chargepol, which estimated the polarity, altitude, and vertical depth of charge layers based on bulk flash behavior. Within these 38 hourlong blocks, there were 209 total ten-minute periods with LMA lightning (this was roughly 8%-10% of all lightning-producing periods during December-April). Based on Figure 2, it is possible that Medina et al. (2021) underestimated the prevalence of anomalous lightning in the RELAMPAGO data set, though it is difficult to assess by how much without additional analysis outside the scope of the current work. Regardless, overall DE statistics for these anomalous periods are presented in Table 1. The DE statistics were not further broken down into Day versus Night due to the reduced sample size. Under the standard matching criteria, for 10+ point LMA flashes DE was reduced to ∼51%. This DE was again sensitive to matching criteria and LMA points threshold, but overall ∼20%-25% decreases in DE were common during anomalous scenarios.

Table 1. GLM Detection Efficiencies and Spearman Correlation Coefficients Versus the RELAMPAGO Lightning Mapping Array Under Various Scenarios
Table 1 footnote: Correlation not significant at 99% confidence level.

Moving to the Spearman correlations, under the standard criteria (25-km, 500-ms, 10+ points) GLM and LMA flash rates were highly correlated (r = 0.95), and this was largely insensitive to Day versus Night, as well as halving or doubling of the spatiotemporal matching criteria (0.95-0.97 range). Somewhat counterintuitively (since it improved GLM DE), increasing the LMA points threshold to 100 actually reduced the flash rate correlation, to a range of 0.85-0.89. However, increasing the point threshold that high actually reduced the overall population of LMA flashes by ∼70%. In that scenario, the assumptions underpinning the correlation analysis (that flash rates tapered toward zero only near the beginning and end of a given lightning case) became less well-supported. As many studies have noted, flashes tend to become smaller as a thunderstorm intensifies. Thus, the relative fraction of 100+ point flashes should decrease when that occurs, which should act to decorrelate the LMA and GLM time series. This decorrelation trend when using the 100-point threshold can be seen elsewhere in Table 1 as well.
Despite the robustly high correlation between GLM and LMA flash rates, GLM DE was actually significantly anticorrelated with LMA flash rate. The baseline correlation under the aforementioned standard criteria was −0.49, and this was largely insensitive to halving or doubling the spatiotemporal matching criteria, as well as the LMA points threshold (the range was −0.41 to −0.50). What appeared to impact the degree of anticorrelation was Day versus Night, with nighttime anticorrelations larger than daytime. The already-reduced GLM DE during daytime appeared to be playing a role in this difference.
GLM DE versus mean LMA flash altitude was anticorrelated (standard result was −0.24), which is somewhat counterintuitive since previously a reduction of DE during anomalous periods was noted, and flash altitude was positively correlated with GLM DE on 23 February (Figure 2). This was likely due to competing trends, which ultimately summed to a weakly negative overall correlation. Certainly, anomalous periods tend to be associated with lower flash altitudes. However, as Lang et al. (2020) and others have noted, flash altitude can increase during thunderstorm intensification, and as Table 1 shows GLM DE was anticorrelated with LMA flash rate. Finally, larger flashes tended to be more readily detected by GLM, and these flashes are often associated with more stratiform lightning, which often is lower in altitude than lightning contained within deep convection (or slopes downward away from convection). The influence of these competing trends may also be seen in the increased sensitivity to various matching criteria and points thresholds, where the anticorrelation can range from statistical insignificance to as high as −0.35. The complex and situationally variable relationships between LMA flash rate, altitude, and points per flash will be investigated more thoroughly in Section 3.3.
The last row in Table 1 concerns GLM DE versus mean points per LMA flash. Here, data for 3+ point flashes (i.e., the largest LMA flash data set used in this study) were examined. The standard correlation was 0.27, and this was highly sensitive (range of 0.05-0.46) to the choice of criteria. In general, criteria that tended to reduce baseline GLM DE (e.g., focusing on Day, halving spatiotemporal criteria, etc.) usually improved this correlation, while criteria that tended to increase DE (e.g., Night, doubling) reduced the correlation, sometimes to the point of statistical insignificance. Thus, the occurrence of larger flashes (i.e., those with more LMA points) tended to improve DE at the margins, when GLM DE was already being throttled by other restrictions. For example, since nighttime GLM DE during RELAMPAGO was already high to begin with, making flashes a little bit bigger did not have as strong an effect as during the daytime. This trend is also visible in Table 1's GLM DE entries, where, for example, DE improvements from 10- to 100-point thresholds were greater during Day than Night.

Principal Component Analysis (PCA)
While the above analysis provides useful insights into GLM DE based on LMA flash characteristics, it does not consider how the LMA-measured flash characteristics behave relative to each other (e.g., flash rate vs. flash altitude, etc.). To accomplish this, principal component analysis (PCA) was performed for various diurnal scenarios, focusing on how LMA-measured flash rate, mean flash altitude, and mean points per flash co-evolved. PCA was done using the scikit-learn Python package (Pedregosa et al., 2011) with standard scaling of the three LMA parameters.
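The kind of calculation described above (standard scaling followed by PCA on three variables) can be sketched as follows. To keep the example self-contained it uses NumPy directly rather than scikit-learn, so it mirrors the approach rather than reproducing the study's code. The function name `pca_loadings` and the synthetic input are assumptions for illustration only; no RELAMPAGO data are used.

```python
import numpy as np


def pca_loadings(data):
    """PCA on standardized columns.

    data: array of shape (n_samples, n_features).
    Returns (components, explained_variance_ratio); row i of `components`
    holds the loadings of principal component i + 1.
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)   # zero mean, unit variance
    cov = np.cov(z, rowvar=False)              # covariance of scaled inputs
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder: largest variance first
    eigvals = eigvals[order]
    components = eigvecs[:, order].T
    return components, eigvals / eigvals.sum()


# Synthetic stand-in for 10-min values of flash rate, mean flash altitude,
# and mean points per flash (purely illustrative numbers).
rng = np.random.default_rng(0)
rate = rng.gamma(shape=2.0, scale=10.0, size=500)
altitude = 8.0 + 0.02 * rate + rng.normal(0.0, 0.5, size=500)
points = 150.0 - 1.5 * rate + rng.normal(0.0, 20.0, size=500)
components, ratio = pca_loadings(np.column_stack([rate, altitude, points]))
```

The sign of each row of loadings is arbitrary (an eigenvector and its negative are equivalent), so only the relative signs of the loadings within a component are meaningful when interpreting which inputs vary in phase or out of phase.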
Table 2 shows the results of the PCA, specifically the PC loadings, which indicate how much each of the original inputs contributed to each PC, and in which direction. PC1, which explained roughly half the variance in the data set (whether Day, Night, or Overall), showed an inverse relationship between points per flash and both flash rate and flash altitude. PC1 is broadly consistent with the observed dominant flash behavior during RELAMPAGO (Lang et al., 2020). That is, as flash rate increased, flash altitude also tended to increase, but flashes became smaller (or vice versa). This is broadly similar to the results of Bruning and MacGorman (2013), as well as many other studies, and is best thought of as the default behavior within normal convection. One distinction during nighttime was that the variance explained by PC1 decreased slightly, and the relative influence of points per flash also decreased.
This leads into discussion of PC2, which explained close to 30% of the variance, but PC2 gained at the expense of PC1 during nighttime. PC2 showed all three inputs (rate, points, and altitude) acting in phase with one another, but points per flash was the most important (and this importance increased during nighttime). PC2 is broadly consistent with the behavior of a mixture of stratiform and anvil-based lightning. The most consistent characteristic of this type of lightning is that it tends to be larger than convective-only flashes (Bruning & MacGorman, 2013). The weaker but still in-phase relationships with flash rate and flash altitude are also consistent with this interpretation, since convection is a common initiation region for these larger flashes (Carey et al., 2005); however, there are complexities such that stratiform/anvil lightning often becomes more common as convective flash rates decrease (Mecikalski et al., 2015) and stratiform/anvil lightning can slope downward away from convection (Ely et al., 2008). These complexities would be expected to weaken the importance of rate and altitude on the class of lightning most consistent with PC2. The increased relevance of PC2 during nighttime is also consistent with the stratiform/anvil interpretation, since this lightning is most common in mesoscale convective systems, which tend to occur more often at night in Argentina than during the day (Rasmussen et al., 2014).
Finally, PC3 explained about 20% of the variance, and did not vary much diurnally. In PC3, points per flash has almost no impact, while flash rate and flash altitude vary inversely to each other. PC3 is broadly consistent with anomalous lightning behavior; that is, as flash rate increased, flash altitude decreased. PC3 is also consistent with the low flash rates but higher flash altitudes commonly seen in weak normal-polarity storms. This also is likely why the percentage variance explained (∼20%) by PC3 is larger than the estimated frequency of anomalous storms during RELAMPAGO (∼10%; Medina et al., 2021).
When comparing to GLM DE, the PC model did not improve on a standard multilinear regression using LMA rate, points, and altitude. Both featured squared correlations (r²) just below 0.2 for the standard flash-matching criteria (Table 2). Interestingly, however, the correlations improved during daytime, and when halving the matching criteria. The correlations decreased when doubling the criteria, and during nighttime. That is, the linear multivariate models had the most explanatory power in scenarios where GLM DE was the most taxed: when using strict matching and during daytime. Since GLM DE is largely driven by signal-to-noise ratio, which strongly depends on the actual sensor engineering design and performance, it is reasonable that flash characteristics would have the greatest impact on DE when the signal-to-noise ratio is weakest (e.g., daytime). Though not shown here for brevity, the standard multilinear regression coefficients were most consistent with PC1; that is, LMA rate and altitude had an inverse effect on DE compared to points per flash.
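A general property of PCA may help in reading this comparison, offered here as an observation rather than something stated in the study: when all three PCs are retained, the PC scores are an invertible linear transform of the standardized inputs, so an ordinary least-squares fit on the PCs yields the same r² as a fit on the original parameters. The sketch below demonstrates this with synthetic data; the variable names and the synthetic DE model are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for LMA rate, points per flash, and altitude
# (hypothetical values, for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
de = (0.7 - 0.05 * X[:, 0] + 0.03 * X[:, 1] - 0.02 * X[:, 2]
      + rng.normal(0.0, 0.1, size=400))

X_scaled = StandardScaler().fit_transform(X)
pcs = PCA(n_components=3).fit_transform(X_scaled)  # keep all three PCs

# Multilinear regression of DE on the original (scaled) parameters
r2_original = LinearRegression().fit(X_scaled, de).score(X_scaled, de)
# Multilinear regression of DE on the three PC scores
r2_pcs = LinearRegression().fit(pcs, de).score(pcs, de)

print(f"r2 (original parameters) = {r2_original:.4f}")
print(f"r2 (three PCs)           = {r2_pcs:.4f}")
```

The two r² values differ only if one or more PCs are dropped before the regression.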

Investigation of the Impact of Improved GLM Detection Efficiency
As has been demonstrated, GLM DE was significantly anticorrelated with LMA flash rate during RELAMPAGO (Table 1). In addition, the ratio of LMA flashes to GLM flashes seemed to grow larger as flash rate increased (e.g., Figure 1). Recall that because the flash matching process allowed a single GLM flash to correspond to more than one LMA flash, the absolute ratio of LMA flashes to GLM flashes could be large even if GLM DE was high. For the entire December-April time period, the ratio of LMA flashes to GLM flashes was 2.9:1 (10+ points/flash). The behavior of this ratio as a function of 10-min-average LMA flash rates is shown in Figure 3. As can be seen, the ratio increased approximately monotonically with LMA flash rate (Figure 3a), up to a 10-min rate of 250 min⁻¹, though the relationship grew noisier as sample size declined at higher flash rates (Figure 3b). This reflected the mutually reinforcing effects of reduced GLM DE at high flash rates, as well as an increased probability that a single GLM flash would match with more than one LMA flash when LMA rates were high. For example, normally flash rate and flash size are anticorrelated (e.g., Figure 2, PC1), so as flash rate increases, at some point flashes not only could become smaller than a GLM pixel, but the spatial separation between coincident flashes could become smaller than a GLM pixel as well. This phenomenon was likely influencing the observations discussed above.
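A ratio-versus-flash-rate curve of the kind described above can be sketched as follows. This is a minimal illustration that assumes the ratio in each LMA flash-rate bin is computed as summed LMA flashes divided by summed GLM flashes; the study's exact binning and averaging choices are not specified here, and the function name `ratio_by_rate_bin` and the example numbers are hypothetical.

```python
import numpy as np


def ratio_by_rate_bin(lma_rate, glm_rate, bin_edges):
    """Return the LMA:GLM flash ratio and sample count per LMA-rate bin.

    Assumption: the ratio in a bin is sum(LMA) / sum(GLM) over all samples
    whose LMA rate falls in that bin; samples with zero GLM flashes are skipped.
    """
    lma_rate = np.asarray(lma_rate, dtype=float)
    glm_rate = np.asarray(glm_rate, dtype=float)
    idx = np.digitize(lma_rate, bin_edges) - 1  # bin index for each sample
    n_bins = len(bin_edges) - 1
    ratios = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (idx == b) & (glm_rate > 0)
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            ratios[b] = lma_rate[in_bin].sum() / glm_rate[in_bin].sum()
    return ratios, counts


# Hypothetical 10-min-average flash rates (flashes per minute)
lma = [10, 20, 60, 80, 120]
glm = [5, 10, 20, 20, 30]
ratios, counts = ratio_by_rate_bin(lma, glm, bin_edges=[0, 50, 100, 150])
print(ratios)  # ratio per bin
print(counts)  # number of samples per bin
```

Reporting the per-bin sample count alongside the ratio makes it easier to judge where the curve is noisy because few samples contribute.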
The net effect is that GLM response becomes increasingly dampened as LMA flash rates increase to very high values.GLM rate of increase is throttled as storms intensify, as are the absolute flash rates (Figure 1; Lang et al., 2020).This would be expected to decorrelate GLM and LMA during peak storm intensities, and also could affect the identification of LJs within GLM data.
As discussed in the Introduction, this issue should also affect other spaceborne optical lightning mappers that share common design characteristics with GLM (e.g., OTD, TRMM/ISS LIS, FORTE, LMI, and MTG-LI).The exact details of how this manifests in their respective data sets will vary based on specific instrument and algorithm characteristics, but all of these sensors are/were likely vulnerable to undercounting flashes relative to an LMA during the intense (or anomalous) portions of thunderstorm lifecycles (e.g., Zhang & Cummins, 2020).Thus, it is useful to explore the expected impacts of a future spaceborne sensor (or future algorithm improvements to an existing sensor), which could be better designed to address this shortcoming.
The improvements could be accomplished in a number of ways (e.g., more sensitive optics, additional complementary sensors/channels, improved spatiotemporal resolution, more optimized processing algorithms, etc.; ideally all of the above), but the present study is simply focused on aggregate improvements in flash detection capability, which can mean both increased DE and improved ability to distinguish between separate small flashes. For this analysis the focus will be on 0%-100% improvements (in 10% bins) in flash identification relative to the LMA baseline. That is, in a given minute, LMA flash rate will be x, GLM flash rate will be y, where y < x, and an "improved GLM" flash rate will be indicated as y′ = y + a(x − y), where y′ is the improved flash rate and a is the fractional improvement in flash detection capability (range 0.0-1.0, in 0.1 steps, as mentioned above).
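The improved-rate definition can be illustrated with a short sketch. It assumes the improved rate interpolates linearly between the GLM rate y and the LMA rate x, that is, y′ = y + a(x − y), which returns y at a = 0 and x at a = 1. The function name `improved_glm_rate` and the example rates are hypothetical.

```python
def improved_glm_rate(lma_rate, glm_rate, a):
    """Improved GLM flash rate for a fractional improvement a in [0, 1].

    Assumption: linear interpolation between the GLM rate (a = 0)
    and the LMA rate (a = 1).
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0.0 and 1.0")
    return glm_rate + a * (lma_rate - glm_rate)


# Hypothetical one-minute rates with the overall 2.9:1 LMA-to-GLM ratio
x, y = 29.0, 10.0
for step in range(0, 11):
    a = step / 10.0
    print(f"a = {a:.1f}: improved rate = {improved_glm_rate(x, y, a):.1f} per min")
```

With these example numbers, a 30% improvement closes 30% of the gap between the GLM and LMA rates.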
Figure 4 shows an example of how this manifested during the 23 February 2019 case also studied in Figure 1. The LMA time series (Figure 4a) shows multiple intensifying and decay stages, and the LJ methodology identified a total of 5 jumps. The unimproved GLM time series (Figure 4b) is significantly attenuated relative to the LMA, and flash rates and rate increases are too small to identify any LJs. However, a 30% improvement in flash detection capability (Figure 4c) would bring the GLM closer in line with the LMA, and would reveal at least 3 LJs relatively close in time to the respective LMA-identified LJs. Though fundamental issues still persist even with these increased flash detections, there are demonstrable quantitative improvements in the ability of the spaceborne instrument to characterize thunderstorms.
Figure 5 shows the results for the entire December-April RELAMPAGO period, for 0%-100% improvements in flash detection capability. The flash rate Spearman correlations (Figure 5a) increase rapidly, from ∼0.84 to ∼0.94, with only a modest 20% improvement in flash detection capability. Beyond 20%, the rate of improvement starts to decline as correlations asymptotically approach 1.0. Note that initial correlations with 0% improvement are below the values seen in Table 1, as the Figure 5 analysis is based on 1-min flash rates, not 10-min rates.
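The shape of such a correlation curve can be explored with a small sketch. It assumes the improved rate is y + a(x − y) for LMA rate x and GLM rate y, and uses `scipy.stats.spearmanr` on synthetic one-minute rates in which the GLM response saturates at high LMA rates. The synthetic data model and the function name `correlation_vs_improvement` are hypothetical and are not intended to reproduce the values in Figure 5.

```python
import numpy as np
from scipy.stats import spearmanr


def correlation_vs_improvement(lma_rate, glm_rate, fractions):
    """Spearman correlation between LMA and improved GLM rates for each a.

    Assumption: improved rate = glm + a * (lma - glm).
    """
    lma_rate = np.asarray(lma_rate, dtype=float)
    glm_rate = np.asarray(glm_rate, dtype=float)
    rhos = []
    for a in fractions:
        improved = glm_rate + a * (lma_rate - glm_rate)
        rho, _ = spearmanr(lma_rate, improved)
        rhos.append(rho)
    return np.array(rhos)


# Synthetic 1-min rates (hypothetical): GLM saturates as LMA rate grows
rng = np.random.default_rng(2)
lma = rng.gamma(2.0, 30.0, size=1000)
glm = lma / (1.0 + lma / 50.0) + rng.normal(0.0, 5.0, size=1000)
glm = np.clip(glm, 0.0, None)  # flash rates cannot be negative

fractions = np.linspace(0.0, 1.0, 11)
rhos = correlation_vs_improvement(lma, glm, fractions)
for a, rho in zip(fractions, rhos):
    print(f"a = {a:.1f}: Spearman rho = {rho:.3f}")
```

Because Spearman correlation depends only on rank order, simply rescaling the GLM rate by a constant factor would leave it unchanged; the correlation changes here because the improvement term adds information that tracks the LMA rate.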
Unlike the asymptotic behavior in Figure 5a, the ratio of "improved GLM" LJs to LMA-identified LJs responds approximately linearly to increased flash detection capability (Figure 5b). Overall, modest improvements in detectability lead to only modest gains. For example, a 20% improvement in flash detectability only nets a ∼10% improvement in LJ detection (from the baseline ∼50%). Meanwhile, to correctly identify 80% of LMA LJs, an improved GLM (or GLM-like sensor) would have to detect a whopping 50% more flashes.

Discussion and Conclusions
This study focused on validation of the GLM-16 sensor using data from an LMA deployed in north-central Argentina during 2018-2019, with special attention paid to DE as a function of thunderstorm and lightning behavior. The analysis was confined to within 100 km of the LMA centroid, where performance of the ground-based network was maximized (Lang et al., 2020). While GLM DE was high overall (∼75%), and GLM and LMA flash rates were highly correlated (r ∼ 0.95), DE could vary significantly as thunderstorms evolved. In particular, GLM DE was negatively correlated (r ∼ −0.5) with LMA flash rate; that is, as LMA flash rates increased within a thunderstorm, it became increasingly difficult for GLM to detect the additional lightning. GLM DE was significantly better (∼10%-20%) during the nighttime versus daytime, and for larger flashes versus smaller (also ∼10%-20%). GLM DE was weakly correlated (r ∼ 0.25) with the average number of points in LMA flashes, and this correlation was strongest in situations where GLM DE otherwise had negative influences (e.g., daytime). Periods of anomalous lightning (i.e., lightning associated with positively charged thunderstorm mid-levels, loosely defined) were associated with DE reductions of ∼20%-25%. GLM DE was weakly negatively correlated with flash altitude (r ∼ −0.25), but this was very sensitive to spatiotemporal matching criteria and LMA points thresholds for flashes, which suggested the influence of multiple competing trends like anomalous lightning (worse expected DE), lower-altitude but larger stratiform lightning (better expected DE), and flash altitude increasing with thunderstorm intensification in normal-polarity storms (worse expected DE).
PCA provided further insight into the complex and situationally dependent relationships between flash rate, flash altitude, and flash size. The dominant PC was consistent with typical convection where flash rate increases while flashes get smaller and occur higher in altitude. A second PC was consistent with stratiform/anvil lightning patterns, where flashes are generally larger but still often depend on initiation in convection, so flash rate and altitude vary in phase with size, albeit more weakly since these flashes can have complex behavior, such as descent outside of convection. A third PC was more consistent with anomalous electrification, where flashes become lower in altitude as flash rate increases (or vice versa in weak normal-polarity storms). These fundamentally different modes of lightning occurred throughout the RELAMPAGO data set, and were responsible for the complicated correlations discussed above.
The overall high GLM DE occurred because this study allowed multiple LMA flashes to match to a single GLM flash. While this was a reasonable accommodation given the fundamentally different measurement technologies (one VHF- and ground-based, and capable of geolocation within tens of meters, while the other is spaceborne and optical with geolocation accuracy of ∼10 km), in reality LMA flashes outnumbered GLM flashes by about a factor of 3, and this ratio grew larger as LMA flash rate increased.
A sensitivity study was performed to examine the potential benefits of improved spaceborne detectability of these small flashes that are so ubiquitous in intense convection as measured by VHF sensors (e.g., Bruning & MacGorman, 2013; Lang et al., 2000; Williams et al., 1999) as well as human-observed thunderstorms. In this context, detectability meant either improved flash DE or improved ability to distinguish between individual small flashes. Correlations improved significantly within the first 10%-20% improvement in detectability, with more asymptotic behavior afterward. Meanwhile, the ability to identify LJs within the convection responded linearly to improved detectability. Overall, this suggests that sensor and/or algorithmic improvements that achieve modest improvements (∼10%-20%) in flash detectability could have a significant benefit for characterizing intense convection, but thereafter marginal improvements in detectability would have to be increasingly weighed against the costs associated with achieving that extra performance.
With spaceborne sensors that are always highly constrained by size, weight, and power requirements, a 10%-20% increase in performance in the future as technology improves could be a reasonable goal. However, pushing beyond that likely would start to incur significant costs, which this study finds might only provide additional modest benefits, suggesting that the community involved in scientific analysis of intense/severe convection ought to consider other options as well. That is, simply focusing on improving flash rate measurements to better match those provided by LMAs will only provide diminishing returns relative to the expected costs associated with achieving that goal from spaceborne platforms. Improving the spaceborne technology is important, but improved algorithmic developments to better identify and characterize intense/severe convection using lightning observations are also needed (i.e., simply focusing on flash rates and identifying LJs is insufficient, from the spaceborne perspective).
Thankfully, there are a number of viable pathways to consider. Studies have found that minimum and mean flash areas, as measured by spaceborne sensors like GLM, could provide useful information about intensifying thunderstorms (e.g., Bruning et al., 2019). In addition, group rate could provide information, since groups are a processing step removed from flashes in GLM and related algorithms (Mach, 2020; Mach et al., 2007), so analysis at that lower data level reduces complexity. However, lightning group analysis needs to consider the presence (or lack thereof) of stratiform lightning, which typically is associated with low flash rates but increased group rates per flash (Peterson, 2019). But more specifically, this study (in the context of many others with similar results; e.g., Marchand et al., 2019) is a challenge to the lightning community to develop algorithms that rely on more than just flash rates to characterize significant milestones or processes in thunderstorm evolution, particularly when working with spaceborne lightning observations.
This study has also demonstrated an innate 20%-25% DE reduction with GLM when anomalous lightning is occurring. This is consistent with related studies (e.g., Marchand et al., 2019; Murphy & Said, 2020; Rutledge et al., 2020), and is on top of any additional DE reductions associated with high flash rates. That is, an intense/severe anomalous storm with very high flash rates could easily result in 20%-30% DE for a spaceborne sensor like GLM (e.g., Figure 1). This result strongly argues for a spaceborne capability to measure lightning flash altitude, particularly to identify the presence of anomalous storms. Indeed, the overall global frequency of anomalous storms is poorly understood, even though it is highly likely that certain regions (e.g., Colorado and similar climatological regimes) are disproportionately prone to their occurrence (e.g., Fuchs et al., 2015). Lightning altitude may also play a role in terrestrial gamma-ray flash (TGF) production (or at least detection of TGFs from space; e.g., Lopez et al., 2019) as well as the production of transient luminous events such as sprites (e.g., Hu et al., 2002). One potential approach is demonstrated by studies such as Jacobson et al. (1999, 2013), Light and Jacobson (2002), Peterson et al. (2021), and Peterson (2022), all of which used combined optical and VHF measurements to measure lightning flash altitude from space. The addition of RF-based measurements from space has other strong synergies, since certain types of lightning (e.g., NBEs) are more readily detectable at RF than 777 nm (Jacobson & Light, 2012), are clearly related to thunderstorm evolution (Jacobson & Heavner, 2005; Jacobson et al., 2007), and also are complementary with 337-nm measurements (Liu et al., 2021). Indeed, the addition of other spectral measurements to a space-based 777-nm lightning sensor may be a good means of increasing overall DE.
The ability to resolve the vertical distribution of lightning from space (including retrievals of lightning channel length; Koshak et al., 2014) also would greatly benefit studies of lightning-produced nitrogen oxides (LNOx). LNOx, along with lightning-produced hydroxyl radicals (OH), has important implications for the Earth's climate due to the species' strong influence on the global lifecycles of tropospheric ozone and methane, which are powerful greenhouse gases (Murray, 2016; Wu et al., 2023).

Figure 1. GLM and lightning mapping array (LMA) observations within 100 km of the LMA on 23 February 2019. (a) Time series of GLM detection efficiency for LMA-identified flashes with 10+ and 100+ points. (b) LMA and GLM flash rates, and mean LMA sources per flash. (c) Mean LMA flash altitude. The gray region indicates the time period when the storm was identified as anomalous by Medina et al. (2021).

Figure 2. Scatterplots between various parameters during 0300-0700 UTC on 23 February 2019 (black stars). Also shown are best-fit lines and correlation coefficients in red. (a) Lightning mapping array (LMA) flash rate versus GLM detection efficiency (DE). (b) LMA flash altitude versus GLM DE. (c) LMA points per flash versus GLM DE. (d) LMA flash rate versus points per flash. (e) LMA flash rate versus flash altitude. (f) LMA points per flash versus flash altitude.
Also shown are multilinear correlations (r²) with GLM DE using the three PCs or using the three original LMA parameters. All values are significant at 99%+ confidence levels.

Figure 3. (a) Ratio of lightning mapping array (LMA) to GLM flash rates, as a function of LMA flash rate. (b) Number of samples informing the time series in (a) as a function of LMA flash rate.

Figure 4. One-minute lightning mapping array (LMA), GLM, and improved GLM (assuming 30% increase in flash detectability relative to the LMA) for 23 February 2019. (a) Focused on LMA, with associated detected LJs. (b) Focused on GLM, with no detected LJs. (c) Focused on improved GLM, with associated detected LJs.

Table 2. Principal Component (PC) Loadings for the Lightning Mapping Array (LMA) Flash Rate, LMA Points per Flash, and LMA Mean Flash Altitude Parameters, Assuming a Minimum of Three Points per Flash, and Under Various Scenarios