The Urban Lightning Effect Revealed With Geostationary Lightning Mapper Observations

Within the Charlotte, North Carolina, to Atlanta, Georgia, megaregion (Charlanta), the Atlanta metropolitan area has been shown to augment proximal cloud‐to‐ground (CG) lightning occurrence. Although numerous studies have documented this “urban lightning effect” (ULE) with regard to CG lightning, relatively few have investigated urban effects on distributions of total lightning (TL). Moreover, there has yet to be a study of the ULE using TL observations from the Geostationary Lightning Mapper (GLM). In an effort to fill this gap, we investigated spatial distributions of TL around the cities of Atlanta, GA, Greenville, SC, and Charlotte, NC, using GLM data collected during the warm seasons of 2018–2021. Analyses reveal augmentation of TL intensity and frequency over the major cities of Atlanta and Charlotte, with a diminished urban signal over the smaller city of Greenville. This work also demonstrated the potential efficacy of the emerging satellite‐based TL climatology in ULE studies.

et al. (2008) observed a clear relationship between anomalies of precipitation, lightning, and the prevailing wind direction. In an integrated study of radar reflectivity and lightning data focused on Atlanta, Ashley et al. (2012) found statistically significant increases in aggregate (1997-2006) warm season CG flash counts and flash days between defined urban-rural boundaries of 34%-42% and 14%-20%, respectively. A key finding of their study was the linkage between patterns of lightning, precipitation, and the geometry of the urban footprint. Increasing contiguity of impervious surfaces and rapid expansion of urban sprawl are predicted to dramatically alter the spatial footprints of individual cities and the structure of the entire Charlanta UCA in the coming years (Terando et al., 2014), emphasizing the need for continued investigation and monitoring of the ULE.
Though global networks detecting total lightning (TL) flashes have been widely operated for many years, regional and continental CG detection networks have been the preferred sources of flash data due to a number of undesirable characteristics associated with lightning detection at a global scale, namely, low detection efficiencies and spatio-temporal variations in accuracy (Marchand et al., 2019; Poelman & Schulz, 2020; Virts & Koshak, 2022). The now defunct spaceborne Lightning Imaging Sensor and its antecedent prototype aboard MicroLab-1, known as the Optical Transient Detector, suffered from similar data accuracy and consistency issues due to the low earth orbit (LEO) of the Tropical Rainfall Measuring Mission satellite, which resulted in discontinuous spatio-temporal coverage and middling detection efficiencies, though still offering substantial improvement over ground-based global networks (Boccippio et al., 2000, 2002; Hayward et al., 2020). In an effort to remedy the known pitfalls of lightning detection via satellites in LEO, NASA and NOAA jointly launched the GLM aboard the Geostationary Operational Environmental Satellite R-series in 2016, marking the advent of spatially and temporally continuous lightning detection from space (Goodman et al., 2013; Medici et al., 2017). This novel availability of high-resolution TL data from the GLM presents contemporary researchers of the ULE with a convenient source for reliable TL observations. Nevertheless, there has yet to be an urban lightning study that utilizes data from the GLM in its analysis. As possibly the first study to utilize observations from the GLM to investigate the ULE, the overarching purpose of this study was to provide insight into the utility of GLM data for future analyses of the ULE.
Consequently, warm season (June, July, August [JJA]) flash data from the first four years of GLM observation (2018-2021) were used to develop and analyze a set of TL climatologies for the Charlanta UCA. A stated objective of the GLM is to provide spatio-temporally continuous lightning observations for use in long-term climatological analyses (Rudlosky et al., 2020). As it is well established that TL serves as a robust proxy for convective intensity due to the overwhelming predominance of IC flashes relative to CG flashes (MacGorman et al., 2011), we hypothesize that the ULE will be resolvable in TL observations from the GLM.

GLM Observations
The GLM 1,372 × 1,300 pixel charge-coupled device detects near-infrared emissions within a narrow 1 nm band centered at 777.4 nm, providing an at-nadir spatial resolution of approximately 8 km (Goodman et al., 2013). Downstream processing by the Lightning Cluster Filter Algorithm (LCFA) clusters the detected lightning "events" into higher order data classes of "groups" and "flashes" (Goodman et al., 2013; Thiel et al., 2020), each containing spatial information in the form of latitude-longitude coordinates. The purpose of a GLM group is to serve as a proxy for the individual return strokes (current discharges) that make up a ground (cloud) flash, while a GLM flash is intended to correspond to a conventional lightning flash (Goodman et al., 2013). Final GLM products have a daytime (nighttime) flash detection efficiency greater than 70% (90%) and location accuracy of approximately 4 km (Koshak et al., 2018; Rudlosky et al., 2019).
Recently, Oda et al. (2022) performed an initial climatological analysis of total flash rate density in Brazil using GLM data. In collaboration with the Center for Weather Forecasting and Climate Studies (CPTEC) within Brazil's National Institute for Space Research (INPE), processing by Oda et al. (2022) accumulated more than 6 million provisionally mature 20-second Level 2 (L2) GLM packets into 5-min bins and spatially aggregated the included events, groups, and flashes on a 0.08° × 0.08° latitude-longitude grid (corresponding to the 8 km at-nadir spatial resolution of the GLM). Quality control measures were taken by filtering GLM observations to include only those with a Data Quality Flag of "good," as recommended by . Each gridded file contains the centroid density of the variables produced by the LCFA, representing the total number of flash, group, and event centroids detected within each grid cell over the 5-min time interval. These products are made publicly available through a managed archive (CPTEC/INPE, 2022).
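The accumulation step described above can be illustrated with a minimal Python sketch. It is not the CPTEC/INPE processing chain: the grid origin at 0° latitude and longitude, the function names, and the sample flash centroids are all hypothetical, and only the 5-min window and 0.08° spacing come from the text.

```python
import math
from collections import Counter

CELL_DEG = 0.08      # grid spacing stated in the text
BIN_SECONDS = 300    # 5-min accumulation window stated in the text

def bin_key(t_seconds, lat, lon):
    """Return (time_bin, row, col) for one flash centroid.

    Assumption: the grid is anchored at 0 deg latitude/longitude; the
    actual archive may use a different origin.
    """
    return (
        int(t_seconds // BIN_SECONDS),
        math.floor(lat / CELL_DEG),
        math.floor(lon / CELL_DEG),
    )

def centroid_density(flashes):
    """Count flash centroids per (5-min bin, grid cell)."""
    return Counter(bin_key(t, lat, lon) for t, lat, lon in flashes)

# Hypothetical flash centroids: (seconds since start, latitude, longitude)
flashes = [
    (12.0, 33.75, -84.39),
    (250.0, 33.74, -84.38),
    (310.0, 33.75, -84.39),
    (20.0, 35.23, -80.84),
]
density = centroid_density(flashes)
```

The first two flashes fall in the same cell and the same 5-min bin, so they share one counter entry, while the third lands in the same cell but in the next bin.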
Facilitating the objectives of our study, warm season (JJA) GLM observations were acquired from CPTEC's public archive in their native netCDF format for a 4-year period (2018-2021). The Climate Data Operators (CDO; Schulzweida, 2022) suite of command line tools was utilized to aggregate the 5-min files, filter to the desired geographic region, and to derive a set of metrics aimed at quantifying TL intensity and frequency over the 4-year period of record: (a) the total flash rate density (FRD; i.e., flashes km⁻² yr⁻¹), (b) the total number of active flash days (i.e., days where total flash count ≥1), and (c) the average flashes per flash day (i.e., total flashes ÷ flash days).
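As a rough illustration of the three metrics, the following Python sketch computes them for a single grid cell from a list of daily flash counts. The study derived these quantities with CDO; the function name, the 64 km² cell area (assuming a nominal 8 km × 8 km cell), and the sample counts are assumptions made only for this example.

```python
def tl_metrics(daily_counts, cell_area_km2, n_years):
    """Return (FRD, flash days, average flashes per flash day) for one cell.

    daily_counts: flash count for each day in the period of record.
    cell_area_km2, n_years: normalization for FRD (flashes km^-2 yr^-1).
    """
    total = sum(daily_counts)
    # A flash day is any day with at least one flash in the cell.
    flash_days = sum(1 for c in daily_counts if c >= 1)
    frd = total / (cell_area_km2 * n_years)
    per_flash_day = total / flash_days if flash_days else 0.0
    return frd, flash_days, per_flash_day

# Hypothetical daily counts for one cell over a 4-year record
daily_counts = [0, 12, 0, 3, 49, 0]
frd, flash_days, per_flash_day = tl_metrics(daily_counts, 64.0, 4)
```

Here 64 flashes over 64 km² and 4 years give an FRD of 0.25 flashes km⁻² yr⁻¹, with 3 flash days and roughly 21.3 flashes per flash day.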

Physiographic Data
Land cover data was obtained from the United States Geological Survey (USGS) National Land Cover Database (NLCD 2011), which provides access to spatially explicit land cover and per-pixel impervious surface products derived from 30 m Landsat imagery (Dewitz, 2021; Yang et al., 2018). Additionally, 7.5-arc-second Global Multi-resolution Terrain Elevation Data (GMTED2010) was obtained from the USGS Earth Resources Observation and Science (EROS) Center archive (Danielson & Gesch, 2011; EROS, 2017).

Methods
Maps of the derived GLM TL flash metrics were created in ArcGIS Pro 2.9 by converting the gridded latitude-longitude point data set to raster format and projecting to the Albers Equal-Area Conic projection (Snyder, 1987), resulting in a spatial resolution of approximately 8 km (ESRI, 2021). The following sections describe the steps taken to construct and analyze 4-year warm season TL flash climatologies across the Charlanta megaregion.

Defining the Charlanta Megaregion and Constituent Domains
Three domains centered on Atlanta, Greenville, and Charlotte were constructed using geometric buffers with radii of 100, 75, and 87.5 km, respectively. These radii were primarily determined through consideration of relevant methodologies (i.e., Ashley et al., 2012) and the characteristics of each city. Recommendations from Stallins and Rose (2008), which emphasized the importance of spatial extent and noted that "most studies have found flash augmentation within 100 km of the city center," were also considered. The buffers were based around each city's administratively defined center to define the outer boundaries of their domains. A rectangular polygon feature representing the entire Charlanta megaregion was defined using ArcGIS Pro's Minimum Bounding Geometry tool with the three geometric buffers of Charlanta's major cities as the input features (displayed in Figure 1 above). These boundary features were used to clip the GLM data set to the Charlanta megaregion and its constituent city-scale domains before mapping each TL metric for visual analysis.
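The domain assignment can be approximated outside of a GIS with a great-circle distance test, sketched below in Python. This is not the ArcGIS Pro buffer workflow used in the study: the buffer radii come from the text, but the city-center coordinates are approximate placeholders and the spherical-Earth haversine distance stands in for a projected geometric buffer.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude-longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# (center latitude, center longitude, buffer radius in km)
# Radii are from the text; center coordinates are approximate placeholders.
DOMAINS = {
    "Atlanta": (33.75, -84.39, 100.0),
    "Greenville": (34.85, -82.40, 75.0),
    "Charlotte": (35.23, -80.84, 87.5),
}

def domains_containing(lat, lon):
    """Return the names of all city domains whose buffer contains the point."""
    return [
        name
        for name, (clat, clon, radius) in DOMAINS.items()
        if haversine_km(lat, lon, clat, clon) <= radius
    ]

inside = domains_containing(34.0, -84.0)    # roughly 45 km from the Atlanta center
outside = domains_containing(40.0, -100.0)  # far from all three cities
```

A grid cell centroid would be kept for a city's analysis when that city's name appears in the returned list.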

Statistical Analysis
Providing a more objective assessment of the ULE's presence in GLM TL observations, two statistical analyses were conducted using the R statistical programming language (R Core Team, 2022). To construct the sampling schema for these analyses, a binary classification method was implemented in ArcGIS Pro using 2011 NLCD land cover areal coverage data to delineate a contiguous urban core sample for each city included in the study. The four developed land cover classes (Developed, Open Space (21); Developed, Low Intensity (22); Developed, Medium Intensity (23); and Developed, High Intensity (24)) included in the NLCD raster data set were spatially aggregated and summarized within the bounds of the vectorized GLM grid cells to form a broad urban/rural percentage. For each GLM grid cell polygon, if the areal coverage of the developed classes was greater than or equal to 50% of the total cell area, the cell was classified as "urban," while cells with less than 50% coverage were classified as "rural." Cells that were classified as urban but disconnected from the urban core were removed, resulting in a conterminous delineation of the urban footprint nested within the outer domain boundary of each city. This procedure was adapted from Ashley et al. (2012), "who noted that it provides a conservative estimate of the urban core…and is hypothesized to capture the areas where warm-season urban effects are strongest." These delineations served as the urban versus rural sampling schema for subsequent statistical analyses.
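The binary classification and the removal of disconnected urban cells can be sketched as a threshold followed by a connected-component search. The Python example below is only an illustration of that idea, not the ArcGIS Pro procedure: the grid of developed-land fractions is hypothetical, the seed cell stands in for the cell containing the city center, and 4-neighbor (edge-sharing) connectivity is an assumption because the text does not specify the adjacency rule.

```python
from collections import deque

def urban_core(dev_fraction, seed, threshold=0.5):
    """Return the set of urban cells connected to the seed cell.

    dev_fraction: 2-D list of developed-land fractions (0 to 1) per cell.
    seed: (row, col) of the cell assumed to contain the city center.
    Cells with fraction >= threshold are "urban"; urban cells not
    edge-connected to the seed are dropped (4-neighbor assumption).
    """
    rows, cols = len(dev_fraction), len(dev_fraction[0])
    urban = {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if dev_fraction[r][c] >= threshold
    }
    if seed not in urban:
        return set()
    core, queue = {seed}, deque([seed])
    while queue:  # breadth-first search over edge-sharing urban neighbors
        r, c = queue.popleft()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in urban and nb not in core:
                core.add(nb)
                queue.append(nb)
    return core

# Hypothetical developed-land fractions on a 4 x 4 block of GLM cells
grid = [
    [0.10, 0.20, 0.05, 0.60],
    [0.30, 0.55, 0.70, 0.10],
    [0.05, 0.80, 0.50, 0.20],
    [0.00, 0.10, 0.15, 0.05],
]
core = urban_core(grid, seed=(1, 1))
```

In this example the cell at (0, 3) meets the 50% threshold but only touches the core diagonally, so it is excluded from the contiguous urban sample.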
Similar to the methodology of Ashley et al. (2012), the urban and rural distributions of each TL metric were compared to assess the magnitude of urban enhancement. For each city, the percentage change from rural to urban was determined, with a positive value indicating the average of the urban sample was larger than that of the rural sample. Additionally, inferential statistical analyses in the form of independent two-sample t tests were conducted for each city's urban and rural samples (Student, 1908). An alpha level of 0.05 was used for these tests. Formally, the hypotheses tested were:
H₀: There is no significant difference between the urban and rural samples.
H₁: There is a significant difference between the urban and rural samples.

Results
The maps of the Atlanta domain (shown in Figures 2a-2c and 3a-3c) highlight many of the conspicuous patterns observed in past work. Similar to the CG flash distributions examined by Rose et al. (2008) and Ashley et al. (2012), a broad area of TL enhancement is spatially correlated with Atlanta's sprawling urban footprint and main interstate arteries. Pronounced hotspots in total flash rate density and flash days are visible inside the NLCD-based urban core delineation, though the latter are more strongly clustered within 40 km of the city-center. Average flashes per flash day are also elevated within the urban core delineation, though the most prominent hotspots are located in an arc beginning to the west and ending to the northeast of the city at distances between 40 and 80 km from the city-center.
A few distinct features present in all three TL metrics are the hotspots at the intersection of I-85 and I-985 near the northeastern extent of the urban delineation and along I-20 between 40 and 80 km from the city-center. The former was described as the "Gwinnett hotspot" (referring to Gwinnett County) by  in their analysis of warm season CG flash distributions. Diem and Mote (2005) and Diem (2008) also documented nearly coincident enhancement of rainfall in Norcross, GA, within Gwinnett County. Another notable feature is the southwest-to-northeast oriented band of elevated flash rate densities and average flashes per flash day extending approximately 70 km from the city-center, within the corridor of I-85 and I-20. This corridor contains the Chattahoochee River Valley (depicted in Figure 1b), which has been hypothesized to enhance proximal convective activity (McLeod et al., 2017). There is also a hotspot in flash days near the northernmost extent of the domain, likely associated with the rising terrain of the Blue Ridge mountains.
Within the Atlanta domain, assessment of each TL metric's average between the urban and rural regions (displayed in Figure 3e) revealed increases of 14.3%, 8.3%, and 5.5% for total flash rate density, flash days, and average flashes per flash day, respectively. The independent samples t tests (results displayed in Figure 3f) found statistically significant (α = 0.05) increases in the averages of total flash rate density and flash days within Atlanta's urban core, but not for average flashes per flash day (t = 1.88, p = 0.060). In conjunction with the visual assessment, these results suggest that during the warm season, the urban effects associated with Atlanta's core area of development are strong enough to stimulate both storm-scale flash production and the initiation of thunderstorms that would not have otherwise occurred.
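The two quantities reported for each city, the rural-to-urban percentage change and the t statistic, can be reproduced in outline with the Python sketch below. The study performed these tests in R; this example assumes the classical pooled-variance (Student) form of the independent two-sample test, uses hypothetical sample values, and stops at the t statistic and degrees of freedom, since converting them to a p value requires the t distribution, which the Python standard library does not provide.

```python
import math
from statistics import mean, variance

def percent_change(urban, rural):
    """Percentage change from the rural mean to the urban mean."""
    return 100.0 * (mean(urban) - mean(rural)) / mean(rural)

def pooled_t(urban, rural):
    """Student's two-sample t statistic (equal variances assumed) and its df."""
    n1, n2 = len(urban), len(rural)
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    sp2 = ((n1 - 1) * variance(urban) + (n2 - 1) * variance(rural)) / df
    t = (mean(urban) - mean(rural)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical per-cell metric values for one city
urban = [12.0, 14.0, 16.0]
rural = [9.0, 10.0, 11.0, 10.0]

pct = percent_change(urban, rural)
t_stat, df = pooled_t(urban, rural)
```

A positive percentage change indicates a larger urban mean, matching the sign convention used in the text; the null hypothesis would be rejected at α = 0.05 when the two-sided p value for `t_stat` with `df` degrees of freedom falls below 0.05.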
In Figures 2d-2f and 3a-3c, the more moderately sized city of Greenville exhibits a less conspicuous urban influence than that observed in Atlanta. The most prominent areas of elevated total flash rate density and flash days are distributed broadly across the northern half of the domain. These distinctly non-urban features are likely driven by the local topography as this area of the Greenville domain contains portions of the Blue Ridge Escarpment. Most notable, though, are the hotspots in total flash rate density and average flashes per flash day to the immediate west of Greenville's city-center and along I-85 near the northeastern extent of the domain. Although the latter are well outside the demarcated urban core of Greenville, this section of the I-85 corridor is often downwind of considerable urban sprawl between the twin-cities of Greenville and Spartanburg. Statistical assessment of the Greenville domain found minute differences between the delineated urban and rural regions (shown in Figure 3e), with 1.1% and 2.2% higher average total flash rate density and flash days, respectively, and 1.1% lower average flashes per flash day within the urban core. The t tests found none of these differences to be statistically significant (displayed in Figure 3f). The smaller size and spatial density of the Greenville metro area underlie an expectation that its influences will be weaker relative to the dominant geophysical controls on TL production, though Greenville's urban effects are likely to still be a supplementary factor. Nevertheless, it is apparent that our "one-size-fits-all" method of constructing the analysis domains around individual city-centers fails to capture the true urban footprint of the twin-cities arrangement, resulting in contamination of the "rural" sample. 
This situation also highlights the inadequacy of the GLM's nominal spatial resolution (8 km at-nadir) for analyses of the ULE around small-to-moderately sized cities such as Greenville and Spartanburg, as it lacks the granularity required for finer-scale analysis. Furthermore, the coarse resolution of the utilized GLM data set results in a limited number of cells being included in the NLCD-based urban delineation.
The Charlotte domain (shown in Figures 2g-2i and 3a-3c) contains hotspots in total flash rate density and average flashes per flash day within the NLCD-based delineation of the urban core and along the west-southwestern periphery of the domain. The latter is an extension of those noted near the northeastern extent of the Greenville domain. Distinct hotspots in total flash rate density and average flashes per flash day are also visible to the northeast of the urban delineation, roughly following the I-85 corridor. While it is quite possible that these anomalies are associated with the typical downwind augmentation process, the heterogeneity of the underlying surface characteristics must be considered as another contributor. Likewise, TL flash days are heavily influenced by the terrain features of the Blue Ridge Mountains across the northwestern extent of the domain. Increases of 9.8% and 15.9% were calculated for the averages of total flash rate density and flashes per flash day, respectively, within the urban core, while the average number of flash days was 4.1% lower (displayed in Figure 3e). Hypothesis testing only found a statistically significant increase in TL flashes per flash day (t = 2.82, p = 0.005), while no significant difference was found for flash rate density and flash days (results displayed in Figure 3f).
These results indicate that Charlotte's ULE is manifested most prominently at the storm-scale, with the average TL flash day near the urban core tending to be more electrically active than in the surrounding areas. Similar to the Greenville domain, the urban forcing provided by Charlotte is largely supplementary to natural geophysical factors (e.g., orographic preferences) which act as the dominant controls on the distributions of TL intensity and frequency. While Atlanta's spatially dense and vast urban sprawl displays an ability to initiate thunderstorms that would not have otherwise occurred, Charlotte's smaller urban core likely precludes it from serving as a dominant forcing in thunderstorm initiation and resultant lightning production. As mentioned by McLeod et al. (2017) and Miller et al. (2015), the complex interaction between terrain-related boundary layer processes (e.g., downslope circulations due to differential heating, orographic forcing for ascent) and urban-induced mesocirculations, among other factors, warrants further investigation with improved methods and resources.

Conclusions
This is possibly the first study of the ULE to utilize data collected by the GLM, and therefore, the first to utilize spatio-temporally continuous TL observations from a satellite platform in geostationary orbit. Consequently, our primary objective was to determine the utility of GLM data for urban lightning research by analyzing spatial distributions of TL during the warm seasons of 2018-2021 around the constituent cities of the Charlanta UCA: Atlanta, GA, Charlotte, NC, and Greenville, SC. Visual and statistical analyses of the aggregated GLM data set found that several of the spatial patterns noted in previous research were resolvable. Most prominently, significant augmentation of TL intensity and frequency was apparent over the major cities of Atlanta and Charlotte, with a diminished urban signal over the smaller city of Greenville. These observations echo the findings of Ashley et al. (2012) and a number of earlier studies (Huff & Changnon, 1973; Oke, 1982) that urban morphological characteristics, namely, extent, density, and orientation, are key factors in determining the degree to which lightning occurrence is modified. The products of this study underscore the promise of utilizing TL observations from the GLM in future urban lightning research, while also highlighting certain limitations, the need for improvements to the implemented methodology, and the development of more sophisticated approaches to investigating the ULE.

Brief Discussion of Methodological Improvements
Though GLM data gridded at its nominal spatial resolution is shown to be less than ideal for robust analyses of the ULE around small-to-moderately sized cities, our work supports its use at the scales of major cities and urban megaregions. This outcome was not entirely unexpected based on the work of Stallins and Rose (2008), which detailed optimal resolutions for such analyses. Though there is precedent for the use of coarser resolution data (e.g., Pinto et al. (2004) used 9 × 9 km grid cells), it remains insufficient for analyses focused on cities of smaller spatial extent in which the attendant urban effects are predominantly supplementary factors relative to geophysical drivers of TL production. In this regard, there are two apparent strategies that could be implemented to improve the effectiveness of future analyses: (a) utilization of the "glmtools" open-source software package, which allows for re-sampling of GLM data to the 2 × 2 km fixed grid of the Advanced Baseline Imager (Bruning, 2019;Bruning et al., 2019), and (b) the use of more innovative methodologies such as that of Forney et al. (2022), which employed a random forest approach to control for relevant geophysical factors (e.g., elevation, distance from coastline). Despite the acknowledged deficiencies, our analysis documented clear patterns of the ULE observed in past studies, providing the initial support for continued and improved use of GLM data in urban lightning research.

Data Availability Statement
GLM data from the CPTEC/INPE archive was used in the creation of this manuscript (CPTEC/INPE, 2022). A complete description of the available GLM data is provided by Oda et al. (2022). The CDO suite of command line tools was utilized to process the acquired GLM data (Schulzweida, 2022). The locally processed GLM data