Electrical Properties of Vertical Dominant Charge Structures Observed in Corsican Thunderstorms With a LMA

Lightning characteristics of Corsican storms with different charge structures are investigated in this study. Observations of an LMA network are used to document the total lightning activity. Complementary lightning observations of the lightning detection network Météorage are also used. A clustering algorithm is used to build a database of electrical cells from June to October 2018. A method is also applied to infer the vertical charge structure, as dominant dipoles, per 10‐min period for each electrical cell. As an example, one cell recorded in July 2018 is discussed. The cell database is then presented as well as the main electrical properties according to the dominant charge structures. For instance, the higher in altitude the dominant dipole, the higher the flash rate. Overall, dominant negative dipoles are observed for 25% of the 10‐min periods and can be separated into two categories: (a) a low altitude negative dipole class dominated by negative cloud‐to‐ground (CG) flashes with a main positive layer located between 2 and 4 km height and (b) a high altitude negative dipole class, dominated by negative intracloud (IC) flashes with a main positive layer at 5 km height. Dominant positive dipoles can also be separated into two categories with (a) a dominant positive dipole located between 4.5 and 10 km height, −CG dominance and a weak flash rate and (b) a higher altitude dominant positive dipole, +IC dominance and a larger +CG fraction. The synergistic use of LMA and Météorage observations independently gives a rational type and polarity classification with regard to the vertical charge structure.

result suggests that lightning-based tracking algorithm alone or in addition to the radar-based tracking can be used to monitor thunderstorms.
Rebounding collisions between different ice hydrometeors cause a transfer of electrical charges according to the non-inductive charging mechanism. Laboratory studies have found that, depending on the temperature and on the liquid water content (LWC), graupel particles acquire charges of different polarity when colliding with ice crystals in the presence of supercooled liquid water (e.g., Jayaratne et al., 1983; Saunders et al., 1991; Takahashi, 1978). Graupel tends to charge positively (negatively) in a high (low) temperature and large (small) LWC environment while ice crystals charge negatively (positively) (e.g., Pereyra et al., 2000; Saunders & Peck, 1998). This charge transfer occurs mainly in the mixed-phase region, generally located in the cloud region between −10°C and −40°C (MacGorman & Rust, 1998). Gravitational sedimentation and differential advection of charged hydrometeors lead to net charged layers in the thundercloud (e.g., Bruning & MacGorman, 2013; E. R. Williams, 1985).
The main charge layers observed near the periphery of the updraft, where most charge separation occurs, are used to classify thunderstorm charge structures (e.g., Bruning et al., 2010). Most thunderstorms possess a normal tripole charge structure, characterized by a layer of net negative polarity at midlevels (approximately −10°C to −30°C) situated between two regions of net positive polarity. The upper positive charge layer is considered to be the most active in terms of lightning propagating through it (Lang & Rutledge, 2011; E. R. Williams, 1989), whereas the lower positive charge layer is not always present (López et al., 2019; Pawar & Kamra, 2004; E. R. Williams, 1989). Normal charge structures tend to produce a large fraction of negative cloud-to-ground (CG) flashes (E. R. Williams, 1989).
In comparison to normal charge structures, anomalous charge structures are characterized by a dominant layer of positive charges at the bottom of the clouds or at the same altitude range as the midlevel negative layer of the normal charge structure (e.g., Bruning et al., 2014; MacGorman et al., 2005; Rust et al., 2005; E. Williams et al., 2005). In fact, the distribution of charges of a tripole can vary vertically, leading to a top-heavy (normal charge structure) or bottom-heavy (anomalous charge structure) tripole depending on the relative activity of the positive layers (e.g., Mansell et al., 2010). Since normal polarity storms have a dominant upper positive charge region around −40°C, Fuchs et al. (2015) discussed anomalous storms as storms with a dominant positive layer at temperatures warmer than −30°C. Medina et al. (2021) define periods of storms as anomalous when the dominant positive layer altitude is below the altitude of the dominant negative layer. These bulk distributions of charge are more likely to be observed in the updrafts, but more charge layers and more complicated vertical charge distributions can be observed in supercells or mesoscale convective systems (MacGorman et al., 2005; Rust et al., 2005; Stolzenburg et al., 1998).
Additionally, anomalous charge structures have been found to produce significant lightning flash rates (Fuchs et al., 2015) and severe weather (Lang & Rutledge, 2011), with predominantly positive CG flashes (Carey & Buffalo, 2007; Lang et al., 2004; Lang & Rutledge, 2011; Tessendorf et al., 2007; Wiens et al., 2005). Indeed, instead of being located in the upper part of the cloud, the main positive charge layer is located at the middle or lower level, which facilitates the propagation of positive leaders to the ground, even more so if a layer of negative charges is present between the main positive charge layer and the ground. Nonetheless, some of these anomalous storms exhibit low flash rates and no positive CG flash predominance. Anomalous charge structures have been observed in the US (Chmielewski et al., 2018; Fuchs et al., 2015, 2016, 2018; Lang et al., 2004; MacGorman et al., 2008; Rust et al., 2005; Stough & Carey, 2020; Stough et al., 2021; Tessendorf et al., 2007; Wiens et al., 2005), in Spain (Pineda et al., 2016; Salvador et al., 2021) and in Argentina (Lang et al., 2020; Medina et al., 2021).
In the absence of widespread availability of total lightning (CG and intracloud (IC) flashes) observations, the majority of the initial studies on the charge structure of thunderstorms were undertaken using CG lightning data. Anomalous charge structures were commonly linked to a high +CG production. The use of +CG production as a proxy has made it possible to identify regions with different PPCG (Predominantly Positive Cloud-to-Ground) storms in the USA without being able to quantify the frequency of occurrence of anomalous storms. More recently, new techniques have been developed to infer charge layers and detect anomalous charge structures (e.g., Fuchs et al., 2015; Medina et al., 2021; Stough & Carey, 2020). Medina et al. (2021) found that 13.3% of thunderstorms in a region of Argentina were defined by an anomalous charge structure while 82.6% of Colorado thunderstorms were anomalous, consistent with previous high PPCG storms documented in the same region. Stough et al. (2021) analyzed the flash production of four supercells, two with a normal charge structure and two with an anomalous charge structure. Higher peak rates, +CG fraction and IC-CG ratio were associated with the anomalous supercells. This is consistent with the findings of Qie et al. (2005) and Tessendorf et al. (2007), who reported that the presence of excessive lower positive charge prevented the occurrence of −CG flashes and favored the production of IC flashes between the two low-altitude layers.
Negative CG flashes lower negative charges to the ground while +CGs lower positive charges to the ground (Bruning et al., 2014). Intracloud flash polarity follows the same convention: the sign of the charges lowered to the charge layer below is assigned to the flash. Consequently, an IC flash occurring between an upper positive (negative) layer and a lower negative (positive) layer, with a negative leader moving upward (downward), transports positive (negative) charges toward the lower layer and is classified as +IC (−IC) (Bruning et al., 2014). For a thundercloud with a dominant positive (negative) dipole, the IC activity of normal (anomalous) charge structures is expected to be dominated by +ICs (−ICs) (Medina et al., 2021).
Recently, Coquillat et al. (2022) reported storms with anomalous charge structures and low flash rates in a south-western flow in the Corsica region. Corsica is a large island with the highest mountains in the western Mediterranean basin and is subject to intense meteorological events such as heavy precipitation, lightning and wind storms (Lambert et al., 2011). Documenting the electrical charge structure of storms in a maritime and mountainous region such as Corsica is original and contributes to a better understanding of the spatial and temporal variability of the storm charge structure. The goal of the present study is therefore to analyze the different charge structures and the properties of the flashes observed in this region. Concurrent LMA (Lightning Mapping Array) VHF and Météorage LF observations are used to document the total lightning activity. The lightning observations are ingested in a cell tracking algorithm designed to identify and track storms at the cell scale. Charge structures are then inferred for each cell using an automatic charge layer retrieval algorithm. Finally, the characteristics of the lightning flashes and lightning activity are derived at the cell scale.
The paper is organized as follows: Section 2 describes the instrumentation, the area of study, the data processing, the cell tracking algorithm and the charge layer retrieval algorithm. Section 3 discusses the results successively through a case study and a statistical analysis, while Section 4 summarizes the main results of the study.

SAETTA Lightning Mapping Array (LMA) and Study Domain
The SAETTA network (Coquillat et al., 2019) is composed of 12 Lightning Mapping Array (LMA) stations (Rison et al., 1999) deployed on the island of Corsica under the PCOA (Plateforme CORSiCA d'Observations Atmosphériques) framework. Each station detects the very high frequency (VHF) radiation (60-66 MHz) emitted by both IC and CG flashes. SAETTA is able to detect the lightning activity up to approximately 350 km from the center of the network. The domain of the present study ranges from 7.5°E to 10.6°E in longitude, and from 41°N to 43.5°N in latitude. The domain of interest is centered on Corsica with a maximum north-south (or east-west) distance of 150 km from the center of the LMA network, which is a reasonable range for a good retrieval accuracy of the VHF source altitude and consequently a good estimate of the altitude of the charge structures (Chmielewski & Bruning, 2016; Dotzek et al., 2004; Thomas et al., 2004).

French Total Lightning Detection Network (Météorage)
The French operational Lightning Locating System Météorage (MET) consists of 21 LF (Low Frequency) Vaisala LS7002 sensors distributed over France and was, in 2018, a contributor to the European Cooperation for Lightning Detection (EUCLID; Schulz et al., 2016). Météorage locates CG strokes as well as IC pulses. The polarity of CG strokes is determined by the sign of the charge transported to ground. IC pulses associated with a downward transport of negative charge are labeled −ICs and those transporting positive charge downward are labeled +ICs (Cummins & Murphy, 2009; Leal et al., 2019).
Météorage is the operator of a Low Frequency (LF) Lightning Locating System (LLS) which covers Western Europe and is based on the most recent Vaisala technology, namely LS7002 sensors coupled with a Total Lightning Processor (TLP). In 2018, Météorage was a member of EUCLID, the European Cooperation for Lightning Detection (Schulz et al., 2016). The system is made of about 90 sensors owned by Météorage, complemented with sensors belonging to partnering neighboring national LLS operators, which detect the electromagnetic signals associated with the large vertical charge transfers occurring between opposite charge centers in the cloud and between the cloud and the ground, producing return strokes and intra-cloud pulses. The lightning data set analyzed in this study was mainly observed by the 12 closest sensors located within a maximum range of 350 km from Corsica. Pédeboy et al. (2018) reported a detection efficiency of 97% and 56% for CG flashes and IC flashes, respectively. Cloud-to-ground stroke location accuracy is about 150 m (Pédeboy, 2015) while the median IC location accuracy is about 1.64 km (Pédeboy et al., 2018).
As the MET network works similarly to the U.S. NLDN (National Lightning Detection Network) (e.g., Cummins et al., 1998; Orville, 2008), we rely on the tests performed with the NLDN to reclassify some MET observations. Indeed, it has been shown that NLDN strokes with currents below +10 kA are probably misclassified IC pulses and that IC pulses with currents above +20 kA are in fact CG strokes (Biagi et al., 2007; Cummins & Murphy, 2009). The population of records with currents between +10 and +20 kA is a mixture of CG strokes and IC pulses. The NLDN has been upgraded and the IC-to-CG and CG-to-IC swaps are no longer required (Murphy et al., 2021), but as the Météorage network was not yet upgraded in 2018, we use the same thresholds as in Fuchs et al. (2015) and Pineda et al. (2016): all positive (negative) IC pulses with a current > +25 kA (< −25 kA) are reclassified as CG strokes with the same polarity and current. In addition, all positive (negative) CG strokes with currents < +10 kA (> −10 kA) are reclassified as IC pulses. Over the 5 months of the study, this reclassification of IC pulses and CG strokes increases the number of IC pulses by 6% and decreases the number of CG strokes by almost the same percentage.
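The threshold rule above can be written compactly. The following is a minimal sketch only, assuming each LF record carries an event type and a signed peak current in kA; the function name `reclassify` and its argument names are hypothetical and do not come from the Météorage processing chain.

```python
def reclassify(event_type, peak_current_ka, ic_to_cg_ka=25.0, cg_to_ic_ka=10.0):
    """Return the (possibly swapped) type of one LF record.

    event_type      -- "IC" or "CG" as reported by the network
    peak_current_ka -- signed peak current in kA (sign = polarity)
    The polarity and the current are left unchanged by the swap.
    """
    magnitude = abs(peak_current_ka)
    if event_type == "IC" and magnitude > ic_to_cg_ka:
        return "CG"  # strong IC pulse: treated as a CG stroke
    if event_type == "CG" and magnitude < cg_to_ic_ka:
        return "IC"  # weak CG stroke: treated as an IC pulse
    return event_type


# Hypothetical records: (type, signed peak current in kA)
records = [("IC", 31.0), ("CG", -6.5), ("CG", -18.0), ("IC", -4.2)]
swapped = [(reclassify(kind, current), current) for kind, current in records]
```

Records with a current magnitude between 10 and 25 kA keep their original type, which mirrors the mixed population mentioned in the text.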

L2b Data, LMA Flash Classification and Period of Study
SAETTA raw data (L0) are combined to derive locations and times of VHF sources, that is, L1 SAETTA data (Rison et al., 1999; Thomas et al., 2004). The L2 SAETTA data consist of VHF sources merged together to form flashes. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm of the Python scikit-learn package (Pedregosa et al., 2011) is used to group VHF sources based on a combination of temporal and spatial criteria, similarly to Fuchs et al. (2016) or Ma et al. (2021).
The L2b data consist of MET CG strokes and IC pulses merged with SAETTA L2 flash data. Indeed, to take advantage of the synergy of both VHF and LF observations, MET stroke/pulse records are combined with SAETTA L2 flashes based on temporal and spatial criteria. Both spatial and temporal pairings are performed incrementally using successively finer settings. This incremental procedure translates into a quality parameter that qualifies the merging between the LF and VHF records and that can be used as a data filter. First, an LF stroke/pulse belongs to a given VHF flash when (a) the absolute time difference between that specific LF event and any VHF source that composes that flash is below 200 ms, and (b) its location is within a 0.4°-latitude by 0.4°-longitude box centered at the location of any VHF source. If this first check is validated, the temporal and spatial pairing is then refined by successively applying a finer time window (20 ms, and then 2 ms) and a 2D distance criterion (5 km for a CG stroke; 10 km for an IC pulse). This multiple-step process aims at limiting the computation time during the pairing and at including potentially mislocated lightning records. The LF event is eventually paired to the flash that contains the closest VHF source in time and space. Each MET stroke/pulse finally possesses, in addition to time, latitude, longitude, peak current, and event type (IC or CG), three new attributes: the VHF flash number to which it belongs, the VHF source number to which it has been paired, and a pairing quality factor (not detailed here).
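The incremental pairing can be illustrated with a short sketch. The pairing quality factor is not detailed in the text, so the integer level returned below (0 to 3) is a hypothetical stand-in, and the way the distance criterion is combined with each time window is an assumption. The names `distance_km`, `pairing_level` and `pair_event`, and the dictionary keys, are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairing_level(lf, src):
    """Return 0 (no match) to 3 (finest match) for one LF event and one VHF source.

    lf  -- dict with keys t (s), lat, lon, kind ("CG" or "IC")
    src -- dict with keys t (s), lat, lon
    """
    dt = abs(lf["t"] - src["t"])
    # Coarse check: 200 ms and a 0.4 deg x 0.4 deg box centered on the source.
    in_box = abs(lf["lat"] - src["lat"]) <= 0.2 and abs(lf["lon"] - src["lon"]) <= 0.2
    if dt >= 0.200 or not in_box:
        return 0
    # Refinement: 2D distance (5 km for CG, 10 km for IC), then finer time windows.
    max_dist = 5.0 if lf["kind"] == "CG" else 10.0
    if distance_km(lf["lat"], lf["lon"], src["lat"], src["lon"]) > max_dist:
        return 1
    if dt < 0.002:
        return 3
    if dt < 0.020:
        return 2
    return 1


def pair_event(lf, flashes):
    """Pair one LF event with the best VHF source among all flashes.

    flashes -- dict {flash_id: [source, ...]}
    Returns (flash_id, source_index, level), or None if nothing matches.
    """
    best = None
    for flash_id, sources in flashes.items():
        for index, src in enumerate(sources):
            level = pairing_level(lf, src)
            if level == 0:
                continue
            # Prefer the finest level, then the smallest time difference.
            key = (-level, abs(lf["t"] - src["t"]))
            if best is None or key < best[0]:
                best = (key, flash_id, index, level)
    return None if best is None else best[1:]
```

Running the coarse check first keeps the more expensive distance computation for the few sources that are already close in time, which is in the spirit of the computation-time argument given above.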
The merging of MET and SAETTA records makes it possible to classify SAETTA flashes as CG (+CG, −CG), IC (+IC, −IC and Dual IC), "LMA only" and ambiguous (flashes with dual polarity CG strokes). Dual IC flashes correspond to 15% of the total number of flashes in the study database (27% of the flashes in the IC category) while ambiguous flashes represent only 1% of the total flash population. Dual IC and ambiguous flashes are discarded from this analysis since no unique polarity could be assigned to them. A flash is qualified as a +CG (−CG) flash if it contains at least one positive (negative) MET CG stroke and only positive (negative) MET CG strokes, no matter the number and the polarity of the MET IC pulses paired to that same flash. Since a CG flash is the extension toward the ground of an IC flash (e.g., Bruning et al., 2014), it is likely to find some MET IC pulses in the first microseconds of a CG flash and later as well. Consequently, a vast majority of CG flashes have both MET IC pulses and MET strokes associated. A flash is qualified as IC if it has at least one MET IC pulse and only MET IC pulses. It can then be classified as +IC or −IC if all the MET IC pulses associated with that flash exhibit the same polarity.
Otherwise the flash is classified as a dual IC flash. The reclassification of IC pulses and CG strokes mentioned in Section 2.2 mainly leads to a decrease in the number of −CG flashes in favor of −IC flashes. Any flash is then described by a list of VHF sources and potentially by a series of CG strokes and/or IC pulses. Different flash properties are then determined, such as the flash duration or the flash initiation height. Two methods have been tested to determine the flash initiation height: the first method simply uses the altitude of the first VHF source of each flash as initiation height, while the second method computes the mean altitude of the VHF sources in the first 500 µs of each flash. A study of 3,500 flashes revealed that the standard deviation of the altitude difference between these two methods was around 70 m. The second method was finally used since it reduces the impact of outliers on the flash initiation height calculation.
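The flash typing rules described above reduce to a small decision tree. The sketch below assumes each flash comes with the list of MET records paired to it, each given as a (kind, polarity) tuple; the function name `classify_flash` and this data layout are hypothetical.

```python
def classify_flash(events):
    """Classify one VHF flash from the MET records paired to it.

    events -- list of (kind, polarity) tuples, with kind in {"CG", "IC"}
              and polarity equal to +1 or -1.
    """
    cg_polarities = {pol for kind, pol in events if kind == "CG"}
    ic_polarities = {pol for kind, pol in events if kind == "IC"}
    if cg_polarities:
        # Any CG stroke makes the flash a CG flash, whatever the IC pulses.
        if len(cg_polarities) == 2:
            return "ambiguous"
        return "+CG" if 1 in cg_polarities else "-CG"
    if ic_polarities:
        if len(ic_polarities) == 2:
            return "dual IC"
        return "+IC" if 1 in ic_polarities else "-IC"
    return "LMA only"  # no LF record paired to this flash
```

For example, a flash with two negative CG strokes and one positive IC pulse is typed as a −CG flash, since IC pulses are ignored as soon as a CG stroke is present.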
In order to reduce the number of noisy sources, only VHF sources detected by a minimum of 7 stations and with a reduced chi-square lower than 0.5 are kept, similarly to Coquillat et al. (2019). In addition, flashes with fewer than 10 VHF sources are also filtered out (e.g., Mecikalski et al., 2017; Salvador et al., 2021; Schultz et al., 2015; Wiens et al., 2005). About 84.2% of the flashes recorded during the study period are filtered out, with 93.6% (93.5%) of them having a flash duration lower than 1 ms (100 µs). The study period ranges from the beginning of June to the end of October 2018 (JJASO 2018). It encompasses the Enhanced Observation Period (EOP) of the EXAEDRE (EXploiting new Atmospheric Electricity Data for Research and the Environment; https://www.hymex.org/exaedre/?page=home) project, during which specific cloud and rain measurements were conducted at the EXAEDRE supersite in Corsica.

Electrical Cell Tracking Algorithm (ECTA)
The goal of this study is to investigate the electrical activity at the storm scale during its lifetime. For this reason, one needs to identify and track the storm cells during their entire life cycle. As the study only uses electrical data, the thunderstorm cells are called electrical cells. We define an electrical cell as a cluster of adjacent L2b flashes grouped in time and space, a cluster that moves with time. The clustering uses a 2D flash density, also called Flash Extent Density (FED), computed from all individual L2b flashes time-stamped during 5 min on a regular grid of 1 km² pixels. Flash Extent Density clusters are computed within each given 5-min period by two successive applications of the DBSCAN method (Pedregosa et al., 2011) to the pixels. The ECTA algorithm is presented in further detail in Appendix A. The algorithm makes it possible to build a database of electrical cells. Each electrical cell is then described by all corresponding L2b data. Several electrical properties, such as the flash rate, the percentage of each type of flash and the flash initiation height along the life of the cell, are then deduced. All electrical cells lasting less than 20 min or composed of fewer than 20 L2b flashes are filtered out, as the study focuses on typical cells in the Mediterranean region (Galanaki et al., 2018) with a sufficient lifetime and electrical activity. The excluded short-lived cells correspond to weak lightning activity or are related to artifacts due to electrical cells misidentified by the cell tracking algorithm, isolated flashes or discontinuous extended flashes separated into multiple clusters, or electrical cells entering or leaving the geographical domain.
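As an illustration of the FED input to the clustering, the sketch below counts, for each 1 km pixel, the number of flashes with at least one VHF source in that pixel, which is the usual definition of a flash extent density. It assumes source positions are already projected to kilometers on a local plane; the function name `flash_extent_density` and the data layout are hypothetical, and the two-stage DBSCAN clustering of Appendix A is not reproduced here.

```python
import math
from collections import Counter


def flash_extent_density(flashes, pixel_km=1.0):
    """Count how many flashes touch each grid pixel.

    flashes  -- iterable of flashes, each a list of (x_km, y_km) source positions
                within one 5-min period (assumed already projected)
    pixel_km -- pixel size in km
    Returns a Counter {(ix, iy): number_of_flashes}.
    """
    fed = Counter()
    for sources in flashes:
        # A flash contributes at most once to a pixel, however many
        # of its sources fall inside that pixel.
        pixels = {
            (math.floor(x / pixel_km), math.floor(y / pixel_km)) for x, y in sources
        }
        fed.update(pixels)
    return fed


# Two hypothetical flashes: the first spans two pixels, the second one pixel.
example = [[(0.2, 0.3), (0.8, 0.1), (1.5, 0.2)], [(0.5, 0.5)]]
fed = flash_extent_density(example)
```

Pixels with a non-zero count would then be the points handed to the clustering step.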

Charge Layer Identification and Samples Definition
As mentioned in Section 1, the purpose of this study is to characterize the electrical properties at the scale of the thunderstorm cell. It has been shown that the charge structure in thunderstorms is directly related to the electrical activity (Bruning et al., 2014; Fuchs et al., 2018; Tessendorf et al., 2007; Wiens et al., 2005). The charge structure of the cells is studied here by adapting the Chargepol algorithm (Medina et al., 2021) applied to the LMA VHF sources. This algorithm aims at deducing coarse polarity layers from a flash-by-flash analysis. It is based on the bilevel intracloud discharge model (Kasemir, 1960; Mazur & Ruhnke, 1993; Van Der Velde & Montanyà, 2013) where a flash initiates between two charge layers of opposite polarity.
In principle, the algorithm can only detect, for each flash, the presence of the two charge layers forming the dipole in which the flash propagated. A more complex structure (e.g., a tripole) cannot be deduced from a single flash even if the flash propagates in all layers. However, an agglomeration of several flashes, accumulated over time and clearly qualified by the Chargepol algorithm, still provides some insight into the charge structure, with potentially several representations of charge layers retrieved at different altitudes. Moreover, this reduces the impact of occasional Chargepol algorithm errors on the determination of the charge layers.
The Chargepol algorithm was applied to the flashes with the same conditions (e.g., minimum 20 VHF sources per flash) as in Medina et al. (2021). One additional condition has been added on the total duration of the flash, which had to be greater than the preliminary breakdown duration (i.e., 10 ms). Indeed, we found some rare flashes with a duration of less than 10 ms that were qualified by the Chargepol algorithm but which then exhibited no VHF sources after the 10 ms period. From the entire 5-month database of electrical cells, 98,948 (31%) of the 292,513 flashes with at least 10 VHF sources were qualified by the Chargepol algorithm. Figure 1a shows an example of the application of the Chargepol algorithm to the flashes of a cell extracted from the database (cell #2 from 26 July 2018, see Appendix A). Out of a total of 287 flashes, 139 (48.4%) were qualified by the Chargepol algorithm, of which 114 were identified as +IC flashes and 25 as −IC flashes, the polarity being defined by the propagation direction of the initial negative leader. Note that the flash classification as +IC or −IC flash provided by the Chargepol algorithm is not used at all in the present study. Figure 1b reveals a coarse tripolar structure at the scale of the electrical cell as deduced from the agglomeration of Chargepol-flashes (flashes qualified by the Chargepol algorithm). Overall, the electrical cell exhibits a lower positive layer between 2 and 5 km, a main negative midlevel layer between 5 and 8 km and an upper positive layer between 8 and 13 km (Figure 1b). Looking at the modes of the vertical distribution of the Chargepol-flashes (Figure 1b), the dominant positive layer is located in the upper part of the thundercloud with a maximum of 95 identified 0.5-km-bin flashes propagating in an identified positive charge layer between 10 and 10.5 km height. A secondary positive charge layer mode with a maximum of 25 identified 0.5-km-bin flashes propagating in an identified positive charge layer is found between 3 and 4 km. This cell thus exhibits a normal tripolar charge structure (E. R. Williams, 1989) with a more pronounced upper positive layer, inducing a dominant activity in the positive dipole located in the upper part of the cloud during a large part of the cell life cycle.
A charge structure can evolve during the life of an electrical cell and can show a wide range of possible charge layer stacks depending on the storm life cycle (dissipation phase, mature phase or storm oscillation process; Pawar & Kamra, 2007). For this reason, it is difficult to assign the same charge structure to an entire cell that can potentially exhibit a continuum of charge structures (Bruning et al., 2010). On the other hand, one can separate a cell into successive short periods and identify the predominant charge structures. Medina et al. (2021) use 1-hr periods, called samples, to analyze their LMA observations. In the present study, 10-min periods, also called samples, are used, as the median life duration of the Corsican electrical cells is 58 min and visual inspection shows that the vertical distribution of the lightning activity, and consequently the vertical structure of the charge regions, can evolve significantly within an hour. The goal is to qualify the dominant dipole for each 10-min sample by automatically identifying the altitude of the dominant positive layer (DPL) and the altitude of the dominant negative layer (DNL). A sample is in fact composed of a succession of flashes (all the flashes of the 10-min period) and is labeled according to its dominant charge structure. Each flash, when qualified by the Chargepol algorithm, provides an image of both the positive and negative charge layers through which its branches propagate. For each given sample, the distribution of the altitude of the Chargepol-flashes is computed per 0.5 km altitude bin by counting the number of Chargepol-flashes with at least one VHF source associated with a positive or negative charge layer detection.
As in Medina et al. (2021), the DPL and DNL altitudes for each sample are identified from the altitude modes of the Chargepol-flashes propagating in positive and negative inferred charge layers. Figure 1c (1d) shows an example of the Chargepol-flashes distribution over altitude for a dominant positive dipole (dominant negative dipole) sample. In case of equality between mode values of a given polarity, the average of the mode altitudes is computed. However, each sample must have a standard deviation of the altitude of the Chargepol-flashes vertical distribution modes lower than 2 km. This criterion is designed to filter out samples with dominant layers exhibiting equal altitude mode amplitudes. Since the charge layers observed in Corsica are generally less than 4 km thick, it eliminates the cases where perfectly balanced tripole structures do not allow the methodology to designate a dominant dipole.
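The mode-based selection of a dominant layer can be sketched as follows for one polarity. This is an interpretation rather than the authors' code: it assumes the 2 km criterion applies to the population standard deviation of the tied mode altitudes, and the function name `dominant_layer_altitude` and the histogram layout (bin center in km mapped to a flash count) are hypothetical.

```python
from statistics import mean, pstdev


def dominant_layer_altitude(counts, max_std_km=2.0):
    """Return the dominant layer altitude (km) for one polarity, or None.

    counts -- dict {bin_center_km: number_of_Chargepol_flashes} built with
              0.5 km altitude bins for one polarity of one 10-min sample.
    """
    if not counts or max(counts.values()) == 0:
        return None  # no charge layer of this polarity was retrieved
    peak = max(counts.values())
    modes = [altitude for altitude, n in counts.items() if n == peak]
    # Assumption: tied modes that are too far apart (e.g., a balanced
    # tripole) do not allow a dominant layer to be designated.
    if pstdev(modes) >= max_std_km:
        return None
    return mean(modes)  # a single mode, or the average of close tied modes


# Hypothetical positive-layer histogram with a clear upper mode.
positive = {3.25: 2, 3.75: 1, 9.75: 4, 10.25: 7}
dpl = dominant_layer_altitude(positive)
```

The same function would be applied to the negative-layer histogram to obtain the DNL altitude.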
In addition, one needs to ensure that, for any sample, both DPL and DNL correspond well to the predominant charge structure of the entire sample and not only to that of the few flashes analyzed by Chargepol. This verification is performed by filtering the samples based on the confidence of the charge layer retrieval by Chargepol. Medina et al. (2021) filtered out 1-hr samples with a maximum value of the Chargepol-flashes vertical distribution lower than 30 flashes of both polarities to mitigate the influence of low flash rate storms on the charge layer estimation. In the present study, a similar filtering with a minimum of 5 Chargepol-flashes per 10-min period would be consistent with Medina et al. (2021), but such a filter cannot be applied here since almost 50% of the samples exhibit fewer than 5 flashes qualified by the Chargepol algorithm.
Consequently, a multi-parameter filter has been designed. The samples kept for the study must have a minimum of one Chargepol-flash and a ratio between the number of Chargepol-flashes and the total number of flashes of the sample (hereafter the confidence ratio) greater than or equal to 0.2. We notice that samples with few flashes are, by nature, generally associated with a relatively high confidence ratio. Finally, DPL and DNL heights must be at different altitudes. Samples with DPLs above DNLs are classified as dominant positive dipole samples while samples with DNLs above DPLs are classified as dominant negative dipole samples.
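The sample filter and the dipole labeling described above can be combined in a few lines. This is a minimal sketch under stated assumptions: the function name `classify_sample` and the returned labels are hypothetical, and a missing DPL or DNL is assumed to be passed as `None`.

```python
def classify_sample(n_chargepol, n_total, dpl_km, dnl_km, min_ratio=0.2):
    """Label one 10-min sample from its Chargepol statistics.

    n_chargepol -- number of flashes qualified by Chargepol in the sample
    n_total     -- total number of flashes in the sample
    dpl_km      -- dominant positive layer altitude in km, or None
    dnl_km      -- dominant negative layer altitude in km, or None
    """
    if n_total == 0 or n_chargepol < 1 or dpl_km is None or dnl_km is None:
        return "unknown"  # no charge structure could be identified
    if n_chargepol / n_total < min_ratio:
        return "filtered"  # confidence ratio too low
    if dpl_km == dnl_km:
        return "filtered"  # DPL and DNL must be at different altitudes
    return "positive dipole" if dpl_km > dnl_km else "negative dipole"
```

With 1 Chargepol-flash out of 5 flashes, the confidence ratio equals the 0.2 threshold and the sample is kept, which corresponds to the borderline 1242 to 1252 UTC sample discussed for cell #2.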
Figure 1e reveals that all samples in cell #2 except the one finishing at 1402 UTC (Figure 1d) are associated with a dominant positive dipole (positive over negative). The dominant positive (negative) layer reached a maximum altitude of 10.5 km (7 km) during the mature phase of the storm, probably in association with the intensification of updrafts. One could criticize the classification of the sample between 1242 and 1252 UTC, as only 1 of the 5 flashes recorded during that period is analyzed by the Chargepol algorithm (Figure 1a), leading to a confidence ratio equal to the 20% threshold. A dominant positive dipole is deduced for the entire sample while there are in fact 3 flashes not analyzed by the Chargepol algorithm in the lower part of the cloud (probably −ICs) versus 2 flashes in the upper part of the cloud (potential +ICs), with only one of them used for charge layer retrieval. If all the flashes had been analyzed, the sample would have been classified as a dominant negative dipole sample with a dominant positive layer at the bottom. This rare type of sample (4% of samples have a confidence ratio smaller than or equal to 25% and are composed of fewer than 11 flashes) is a source of error for the study, but the statistical analysis of the samples and the different filters applied to the samples reduce its impact on the charge structure classification.

5-Month Distribution of Electrical Cells Recorded in Corsica
During the 5-month period (June to October 2018) within the SAETTA domain, 711 electrical cells with at least 20 flashes and lasting more than 20 min were identified by ECTA during 79 different days. About 73% of the flashes detected by the LMA over the 5-month period are included in the electrical cell data set. Table 1 provides several statistics per month and for the entire period, while Figure 2 presents the maps of the cell trajectories per month. The most prolific month was August with 229 cells (Table 1; Figure 2c), followed by October with 189 cells (Table 1; Figure 2e). The month with the lowest number of cells was July with only 77 cells (Table 1; Figure 2b). The cells are typically found over land during the summer months (Figures 2a-2c) with a maximum number of cells in August. For the months of September and October (autumn), the cells are mostly located over the sea (Figures 2d and 2e). The summer cells of the study seem to be induced by orographic forcing on the Corsican relief, and are rather stationary with short chaotic trajectories. In autumn, the cells are rather located over the sea with longer and straighter tracks. This agrees with Galanaki et al. (2018), who show that summer thunderstorms in the Mediterranean region take place rather over land, in the afternoon and especially over the reliefs, triggered by orographic lifting. They also argue that autumn Mediterranean thunderstorms occur essentially over the sea around the coasts, regardless of the time of day, with convection favored by the instability created by flows of colder continental air masses over a still relatively warm sea.
The two major weather events with the highest number of cells occurred between August 14th and August 15th with 73 individual cells and between October 28th and October 29th with a total of 67 cells. The latter was due to the ADRIAN storm (also called Vaia storm; Figure 2e, tracks in red) (Giovannini et al., 2021), which triggered some long-track supercells in a south-western flow that produced intense electrical activity, strong winds and two tornadoes in Corsica. The two major events produced almost a third of the cells in their respective months. Thus it must be taken into account that the statistics for these 2 months are influenced by particular weather events.
The median cell duration of the 711 cells is 59 min, with a variation of less than 5 min when considering each month independently. About 25% of the cells lasted more than 1 hr 30 min, 13% more than 2 hr and 5% more than 2 hr 40 min. The median number of samples (10-min periods) per cell is 6, which is consistent with the median cell duration of about 1 hr.
Table 1 details the impact of the filtering of the samples (see Section 2.5) on the sampled database. We recall that the purpose of the filtering is to reduce the uncertainty in the determination of the dominant dipole structure of each sample. Over the 5 months, 22% of the 10-min samples were excluded. The most filtered month is October, with 30% of its samples excluded, and the least filtered are July and August, with only 17% of their samples excluded. Regarding the dominant dipole structure, the filtering mainly removes the samples for which the Chargepol algorithm identified no charge structure (Unknown, Table 1). Over the 5 months, 10% of the samples had no charge structure, 66% had a dominant positive dipole charge structure (DPL above DNL) and 24% had a dominant negative dipole charge structure (DPL below DNL). If one only takes into account the samples with an identified charge structure, 73% (27%) of the samples were associated with a dominant positive (negative) dipole.
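The renormalization in the last sentence can be checked with a few lines of arithmetic. The sketch below uses only the rounded percentages quoted above; the function name `renormalize` is illustrative and not part of the study's processing chain.

```python
def renormalize(shares):
    """Rescale percentage shares so that they sum to 100."""
    total = sum(shares.values())
    return {name: 100.0 * value / total for name, value in shares.items()}


# Rounded shares of all 10-min samples quoted in the text; the remaining
# 10% of samples have no identified charge structure and are left out.
identified = {"dominant_positive": 66.0, "dominant_negative": 24.0}
shares = renormalize(identified)
print({name: round(value) for name, value in shares.items()})
```

With 90% of the samples carrying an identified structure, 66/90 and 24/90 round to the 73% and 27% given in the text.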
After filtering out samples without charge structure together with samples with low confidence (see Section 2.5), 4,153 samples remain, with a total of 213,268 flashes, corresponding to approximately 720 hr of lightning activity. Among these flashes, 88,219 (41%) were used by the Chargepol algorithm to identify a dominant dipole structure per 10-min sample period. The results over the 5 months show that samples with a dominant positive dipole charge structure dominate (73%); the filter did not exclude any particular charge structure and preserved the observed proportions of positive and negative dominant dipoles. This dominance varies from month to month, with a percentage between 78% and 87% for July, August and September and lower values for June (65%) and October (52%). This suggests that the predominance of summer orographic thunderstorms favors the presence of dominant positive dipole charge structures. In contrast, in October, thunderstorms occur mostly over the sea, in conditions more favorable to dominant negative dipole charge structures. Moreover, October is the month with the lowest percentage (35%) of flashes analyzed by Chargepol. It is assumed that, at this time of year, the charge layers are thinner and less vertically stratified because of the weaker vertical development of clouds, and that cells are located further away from the network, which prevents the detection of well-defined vertical channels during the early stage of the flashes and hence an unambiguous classification by Chargepol.
It is worth remembering that dominant negative dipole samples are observed throughout the entire study period and that they can occur occasionally during the cell lifetime. However, dominant negative dipole samples can be the only category of samples observed in certain meteorological conditions, such as south-west flow events carrying aerosols (Coquillat et al., 2022). The location of dominant negative dipole samples in Corsica was investigated, but no geographic hotspot was found in this 5-month sample database (not shown).
When compared with the percentage of samples found by Medina et al. (2021) in Argentina and the USA, the fraction of dominant negative dipoles in Corsica seems larger (the Colorado samples being an exception). This may be due to the Corsican environment, which is less conducive to deep convection and to dominant positive dipole charge structures, but also to the sample duration (10 min here vs. 60 min). Indeed, a 10-min period is more likely to capture a dominant negative dipole structure, which tends to occur during the formation or dissipation phase of a storm or within weak thunderstorms. Conversely, when a dominant positive dipole charge structure is present, there is generally a strong increase in the flash rate at high altitude. Over a short time interval, the number of flashes in the upper dipole can therefore be much higher than the number of flashes in the lower dipole over a longer period. Thus, at the storm scale and with 1-hr samples as in Medina et al. (2021), it is very likely that the dominant dipole will be retrieved from the flash population of the upper positive dipole, which produces a period of high flash rate. With the finer time window of 10 min, there is a better chance of characterizing the dynamics within the storm and of detecting periods with a lower flash rate and potentially dominant negative dipole charge structures.

Samples Charge Structure Distribution
In the following, the properties of the lightning activity are discussed relative to the DPL and DNL altitudes (see Section 2.5). Figure 3 shows the 2D distribution of DPL-DNL altitude pairs, where each 500 m × 500 m bin represents a unique dominant charge structure. As mentioned in Section 2.5, all samples with the same DPL and DNL heights are filtered out. As expected, Figure 3 reveals a wide altitude distribution of charge structures, with DPL (DNL) heights ranging from 1 to 12.5 km (1.5-11 km). In addition, Figure 3 shows that the DPL-DNL charge structures are not uniformly distributed, some configurations appearing more often than others. There are two main types of charge structure: class #1, with DPLs found overall between 6 and 12 km in altitude and associated DNLs located between 4 and 9 km height, and class #2, composed of DNLs located between 4 and 8 km with associated DPLs between 2 and 6 km altitude. The first (second) configuration corresponds to dominant positive (negative) dipole charge structures, with a dominant positive (negative) layer above the dominant negative (positive) layer. In addition, class #1 exhibits a larger vertical range than class #2, suggesting that the class #2 population is more vertically compact and that class #1 contains dominant charge layers that can be quite distant vertically. In terms of confidence in these dominant dipole distributions, the confidence ratio is generally more than 30% for all DPL-DNL pairs.
For statistical confidence, elements of the 2D distribution with fewer than 10 samples (469 of the 4,153 samples, 11%) are discarded from the analysis but still plotted in gray in Figure 3. Additionally, 84% of the 213,268 flashes were in bins with at least 10 samples. Among the 3,684 validated samples, 75% (25%) belong to class #1 (class #2). Filtering out samples in bins with fewer than 10 samples changes the total number of flashes, the total number of samples and therefore the percentages of dominant positive and negative samples. More dominant negative dipole samples than dominant positive ones are filtered out, but the proportion between the two classes is largely preserved (73%/27% (Table 1) vs. 75%/25%). Bins with fewer than 10 samples are overall associated with a low cumulative number of flashes, as shown by the contours in Figure 3, except for the few dominant negative samples with a DPL located between 6 and 7 km and a DNL found between 7.5 and 9 km, where more than 1,000 flashes were summed up. Considering at least 10 samples per bin, the most common dominant dipole charge structure over Corsica for the 5-month period is a positive one composed of a DPL between 8.5 and 9 km height with an associated DNL between 6 and 6.5 km height, representing around 4% of all samples.
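The bin-level filter described above can be sketched as follows. This is a minimal illustration, assuming each sample is summarized by a (DPL, DNL) altitude pair in kilometers; the names `bin_index` and `validated_samples` and the synthetic data are hypothetical and do not come from the study's code.

```python
from collections import Counter

BIN_KM = 0.5       # bin width quoted in the text (500 m)
MIN_SAMPLES = 10   # minimum number of samples per bin quoted in the text


def bin_index(altitude_km):
    """Index of the 0.5-km altitude bin that contains altitude_km."""
    return int(altitude_km // BIN_KM)


def validated_samples(samples):
    """Keep samples whose (DPL, DNL) bin holds at least MIN_SAMPLES samples.

    samples: list of (dpl_km, dnl_km) tuples, one per 10-min sample.
    """
    counts = Counter((bin_index(dpl), bin_index(dnl)) for dpl, dnl in samples)
    return [
        (dpl, dnl)
        for dpl, dnl in samples
        if counts[(bin_index(dpl), bin_index(dnl))] >= MIN_SAMPLES
    ]


# Synthetic data: 12 samples fall in one bin, 3 samples in another.
samples = [(8.7, 6.2)] * 12 + [(3.1, 5.4)] * 3
kept = validated_samples(samples)
print(len(kept), "of", len(samples), "samples kept")
```

Counting per bin before filtering keeps the two steps separate, so the discarded bins can still be plotted (in gray in Figure 3) from the same counts.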
Météo-France conducts balloon soundings of the vertical temperature profile every day at 00 UTC and 12 UTC from Ajaccio (8.73°E, 41.91°N). An analysis of 688 samples, located within a 1-degree square centered on the balloon launch facility during the whole study period and recorded within a 6-hr time window centered on the launch time, revealed that the −40°C isotherm varied between 9 and 10.5 km in altitude, the −10°C isotherm between 5.5 and 6.5 km, and the 0°C isotherm between 2 and 4 km. The altitude of the isotherms therefore varied by 1.5-2 km during the study period. For dominant positive dipole samples, the DPL (DNL) distribution peaked around the −40°C (−15°C) isotherm. For dominant negative dipole samples, the DPL (DNL) altitude distribution peaked around the 0°C (−10°C) isotherm. These temperature ranges fit well with those reported by Fuchs et al. (2015), with a dominant positive (negative) charge layer located at around −40°C (−20°C) for normal charge structures, and with those associated with anomalous charge structures, with a dominant positive layer situated at the 0°C isotherm.
According to Figure 3, for both dominant positive and negative dipole charge structures the maximum accumulated number of flashes does not necessarily correspond to the bins with the most samples which suggests that the number of flashes and therefore the flash rate per (10-min) sample varies in relation to the charge structure.

Total Flash Rate According to the Charge Structure
As a reminder, the aim of this study is to characterize the electrical activity associated with different charge structures, here labeled by the dominant dipole. Figure 4 synthesizes the variability of the flash rate per DPL-DNL altitude bin. Each DPL-DNL altitude bin contains at least 10 samples and is composed of a series of 10-min periods of lightning activity with different properties. In the following, the flash rate of a sample is the number of flashes recorded during the 10-min period divided by its duration, expressed in flashes per minute, and is known for each DPL-DNL altitude pair.
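As an illustration of this definition, the sketch below computes the flash rate of each 10-min sample and the median flash rate per DPL-DNL bin. The input layout (tuples of bin altitudes and flash counts) and the function names are assumptions made for the example, and the synthetic bins hold fewer samples than the 10 required in the study.

```python
from collections import defaultdict
from statistics import median

SAMPLE_MINUTES = 10  # sample duration used in the study


def flash_rate(n_flashes):
    """Flash rate of one sample, in flashes per minute."""
    return n_flashes / SAMPLE_MINUTES


def median_rate_per_bin(samples):
    """Median flash rate for each (DPL, DNL) altitude pair.

    samples: list of (dpl_km, dnl_km, n_flashes) tuples, where the two
    altitudes identify the bin of the sample (hypothetical layout).
    """
    rates = defaultdict(list)
    for dpl_km, dnl_km, n_flashes in samples:
        rates[(dpl_km, dnl_km)].append(flash_rate(n_flashes))
    return {pair: median(values) for pair, values in rates.items()}


# Synthetic samples: three in one bin, one in another.
samples = [(9.0, 6.0, 20), (9.0, 6.0, 50), (9.0, 6.0, 400), (3.0, 5.0, 8)]
medians = median_rate_per_bin(samples)
print(medians)
```

Using the median rather than the mean limits the weight of a single very active sample, such as the 400-flash sample above.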
Figure 4b shows the median of the flash rate. For both dominant positive and negative dipole distributions, a higher flash rate statistically occurs when the upper layer of the dominant dipole reaches a higher altitude. A maximum median flash rate of 42 f.min⁻¹ is obtained for dominant negative dipole samples with a DNL located between 8 and 8.5 km height and a DPL found between 5 and 5.5 km, while there is less than 2 to 3 f.min⁻¹ for DNL under 5 km and DPL under 8 km. For dominant positive dipole samples, the maximum median flash rate is 20 f.min⁻¹ for DPL between 11.5 and 12 km and DNL between 8 and 9 km.
For dominant negative dipole samples (Figure 4a), the median average flash rate increases with the DNL altitude, from 1 f.min⁻¹ or less for DNL between 3.5 and 5.5 km height to more than 40 f.min⁻¹ for DNL around 8.5 km height. For dominant positive dipole samples (Figure 4c), the flash rate also increases with the DNL altitude but less dramatically (up to 20 f.min⁻¹). For both charge layer populations, the flash rate often ranges over two orders of magnitude (Figures 4a and 4c). The relationship between the DPL altitude and the median average flash rate (Figure 4d) is more complicated, with a strong intensification of the flash rate for DPLs associated with dominant negative dipole samples (in blue) between 4 and 5.5 km in altitude, corresponding to "intense" negative dipole samples. For DPLs associated with dominant positive dipole samples (in red, Figure 4d), the flash rate increases continuously with DPL height between 5.5 and 12 km.
The increasing trend in the flash rate for dominant negative dipole samples with DPL around 5 km height is comparable to the observations from anomalous Colorado thunderstorms (Fuchs et al., 2015), with a peak flash rate associated with a low dominant positive charge region (temperatures near −20°C), and also to the observations from anomalous Oklahoma thunderstorms (Fuchs et al., 2015; Lang & Rutledge, 2011). These dominant negative dipole samples with high DPL altitudes have flash rates comparable to those of the anomalous storms documented by Fuchs et al. (2015) in Colorado (around 15 f.min⁻¹) but remain well below those of the severe anomalous thunderstorms observed in the same region by, e.g., Rutledge et al. (2020) (up to 300 f.min⁻¹). In the present study, the most severe samples (with the highest flash rate) correspond to dominant negative dipole charge structures with a DNL greater than or equal to 6.5 km height and a DPL greater than or equal to 4.5 km height (red box, Figure 4b). These severe samples are relatively rare, as they represent 15% of the dominant negative sample population and 4% of the total sample population. The vast majority of these samples belong to cells recorded during a single special weather event, the Adrian storm (Vaia storm; Giovannini et al., 2021), as mentioned in Section 3.1. These samples (Figure 4b, red box) form the high altitude negative dipole class. Three other classes are also defined: the low altitude negative dipole class for all the remaining dominant negative dipole samples, the high altitude positive dipole class for dominant positive dipole samples with DPL altitudes greater than or equal to 10 km and DNL altitudes greater than or equal to 6.5 km (gray box, Figure 4b), and the low altitude positive dipole class for all the remaining dominant positive dipole samples.
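The four classes defined above can be written as a small decision rule. The sketch below is only an illustration: the thresholds are those quoted in the text, while the function name `classify` and the choice of applying the thresholds directly to the DPL and DNL altitude values (rather than to bin edges or bin centers) are assumptions.

```python
def classify(dpl_km, dnl_km):
    """Return the dipole class of a sample from its DPL and DNL altitudes."""
    if dpl_km == dnl_km:
        # Samples with identical DPL and DNL heights are filtered out upstream.
        raise ValueError("DPL and DNL at the same height: no dominant dipole")
    if dpl_km > dnl_km:
        # Dominant positive dipole: positive layer above the negative layer.
        if dpl_km >= 10.0 and dnl_km >= 6.5:
            return "high altitude positive dipole"
        return "low altitude positive dipole"
    # Dominant negative dipole: negative layer above the positive layer.
    if dnl_km >= 6.5 and dpl_km >= 4.5:
        return "high altitude negative dipole"
    return "low altitude negative dipole"


for dpl, dnl in [(11.5, 8.0), (8.5, 6.0), (5.0, 8.0), (3.0, 5.0)]:
    print(dpl, dnl, "->", classify(dpl, dnl))
```

The polarity test comes first because the altitude thresholds differ between the positive and negative dipole populations.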

Flash Production Relative to the Sample Charge Structure
Since there is a link between flash rate and charge structure, as classified by the dominant dipole, the type of flash produced by each charge structure is investigated here through the IC-CG ratio, the +CG fraction and the +IC fraction of the dominant charge structures. For the IC-CG ratio, LMA only flashes are also included in the IC population since the vast majority of these flashes are small ICs (compact flashes; not shown) that are not detected by Météorage and are typically not qualified by the Chargepol algorithm. These ratios and fractions were calculated per sample; the results discussed here correspond to the median of those parameters for a given DPL-DNL pair. As a reminder, the types and polarities of LMA flashes are obtained by merging LMA data with observations from the LF MET network (see Section 2).
$$+\mathrm{IC\ fraction} = \frac{+\mathrm{IC}}{+\mathrm{IC} + -\mathrm{IC}} \qquad (2)$$

Overall, the IC-CG ratio increases with the altitude of the dominant dipole, as does the flash rate (Figure 4b). This is consistent with the results of MacGorman et al. (1989). The ratio is maximum for high altitude negative dipole samples, with 30-50 times more IC flashes than CG flashes. For high altitude positive dipoles, there are 5 to 10 times more ICs than CGs. For both dominant positive and negative dipoles, the intense electrical activity of high altitude dipoles is mainly due to intracloud activity, with an equivalent contribution from ICs and LMA only flashes for dominant positive dipole samples and an enhanced production of LMA only flashes for dominant negative dipoles (not shown). The increased occurrence of these small flashes is consistent with the high flash rates observed: Bruning and Macgorman (2013) have shown an anti-correlation between flash size and flash rate. The maximum CG flash rate is observed for these high altitude dipoles (around 3 CG f.min⁻¹) and increases with the height of the dominant positive layer, similar to the findings of Salvador et al. (2021) (not shown). Looking at the +CG fraction (Figure 5b), −CGs generally dominate the production of CGs (>90%; <10% for +CGs) regardless of the dominant charge structure, except for high altitude positive dipole charge structures, where 15% to 40% of the CGs produced are positive. This fraction can nevertheless evolve with the seasons: an increase of the +CG fraction was observed in winter, when the clouds have a low-altitude base and a weak vertical extension (not shown).
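The per-sample quantities discussed here can be sketched in a few lines. The example assumes that each fraction is the number of positive flashes of a given type divided by the total number of flashes of that type (both polarities), and that LMA only flashes enter the IC count of the IC-CG ratio but not the +IC fraction, since no polarity is assigned to them. The function names and the flash counts are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is 0."""
    return numerator / denominator if denominator else None


def flash_type_metrics(n_pos_cg, n_neg_cg, n_pos_ic, n_neg_ic, n_lma_only):
    """IC-CG ratio, +CG fraction and +IC fraction of one 10-min sample."""
    n_cg = n_pos_cg + n_neg_cg
    # LMA only flashes are counted as ICs for the IC-CG ratio.
    n_ic = n_pos_ic + n_neg_ic + n_lma_only
    return {
        "ic_cg_ratio": safe_ratio(n_ic, n_cg),
        "pos_cg_fraction": safe_ratio(n_pos_cg, n_cg),
        # Assumption: only ICs with a known polarity enter the +IC fraction.
        "pos_ic_fraction": safe_ratio(n_pos_ic, n_pos_ic + n_neg_ic),
    }


# Synthetic sample: 10 CGs, 50 ICs with a polarity and 50 LMA only flashes.
metrics = flash_type_metrics(
    n_pos_cg=2, n_neg_cg=8, n_pos_ic=45, n_neg_ic=5, n_lma_only=50
)
print(metrics)
```

Returning `None` for an empty denominator keeps samples without CGs or without polarity-qualified ICs out of the per-bin medians instead of producing a division error.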
For the +IC fraction, Figure 5c shows a clear dominance of +ICs (between 80% and 100%) for all samples with a dominant positive dipole charge structure, a result consistent with the flash polarity convention (e.g., Bruning et al., 2014; Medina et al., 2021). On the other hand, the high altitude negative dipole samples show the opposite, with a strong dominance of −ICs, also consistent with the flash polarity convention. For low altitude negative dipole samples, there is no clear −IC dominance. This can be explained by the fact that the few potential −IC flashes initiating in this dipole tend to propagate toward the ground and are thus qualified as −CGs. In summary, referring to the 4 classes defined before, the high altitude negative dipole class is associated with dominant intracloud activity, with mainly −ICs and small flashes (LMA only), and with enhanced −CG and +CG production. The low altitude negative dipole class is associated mainly with −CG production and some +IC activity. The high altitude positive dipole class is associated with a dominance of intracloud activity, with strong production of +ICs and small flashes (LMA only), enhanced −CG production and the highest +CG production of all the charge structures in the studied data set. Finally, the low altitude positive dipole class produces mainly +ICs and −CGs in the same relative amount.
Overall, charge structures classified as dominant positive or negative dipoles seem to produce ICs and CGs of both polarities at the same time, although the intracloud activity remains dominated by flashes whose polarity corresponds to that of the dominant dipole. This logically suggests the presence of one or more additional charge layers in the samples, in addition to the dominant dipoles. Investigating the initiation altitude of the flashes can confirm the presence of other charge layers, as detailed in the next section.

Flashes Initiation Height Relative to the Samples Charge Structure
In the following, the relationship between the flash initiation altitude and the dominant charge structure is studied.
A median initiation height is determined for each sample and for each type of lightning (−CG, +CG, +IC, −IC, LMA only). For each dominant dipole, we consider that the theoretical flash initiation altitude is located midway between the two dominant charge layers (DPL-DNL), in accordance with the model of Kasemir (1960). Figure 6 shows, for each DPL-DNL dipole, the median of the difference between the median initiation altitude and the theoretical initiation altitude, for −CGs and +ICs separately.
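The quantity mapped in Figure 6 can be illustrated per sample with the sketch below, which takes the theoretical initiation altitude as the midpoint between the DPL and DNL and subtracts it from the median initiation height of one flash type. Function names and values are illustrative assumptions; the per-bin median over samples shown in the figure is not reproduced here.

```python
from statistics import median


def theoretical_initiation_km(dpl_km, dnl_km):
    """Altitude midway between the two dominant charge layers."""
    return 0.5 * (dpl_km + dnl_km)


def initiation_offset_km(dpl_km, dnl_km, initiation_heights_km):
    """Median initiation height of one flash type minus the midpoint.

    initiation_heights_km: initiation heights of the flashes of one type
    (for example +ICs) recorded in one 10-min sample.
    """
    midpoint = theoretical_initiation_km(dpl_km, dnl_km)
    return median(initiation_heights_km) - midpoint


# Positive dipole sample: flashes start at the midpoint between the layers.
on_target = initiation_offset_km(9.0, 6.0, [7.0, 7.5, 8.5])
# Negative dipole sample: +ICs start well above the dominant dipole.
above = initiation_offset_km(3.0, 5.0, [6.0, 6.5, 7.0])
print(on_target, above)
```

A positive offset, as in the second case, points to initiation above the dominant dipole and therefore to an additional charge layer aloft.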
For dominant negative dipole samples, the −CGs initiate logically at the theoretical altitude ±0.5 km (Figure 6a).2020) classification with a negative leader propagating into the positive layer above (as a +IC) and then continues toward the ground. In fact, all types of −CGs are observed at all altitudes for all charge structures, but type I flashes are much more common and statistically force the median −CG initiation altitude to lie within a negative dipole (not shown).
For dominant positive dipole samples, +IC flashes initiate logically at the theoretical altitude of the associated bins (Figure 6b). For low altitude negative dipole samples, +IC flashes tend to initiate 0.5-3.5 km above the theoretical initiation height. This indicates the presence of a weak upper positive layer above the dominant negative dipole, with +ICs initiating between the main midlevel negative charge layer and the weak upper positive layer. For high altitude negative dipole samples (Figure 6b, red box), the few +IC flashes initiate at the theoretical altitude or 0.5 km above, meaning that the weak upper positive layer tends to be absent in these samples.
LMA only flashes always initiated at the theoretical altitude of the dominant dipoles, and +CG flashes initiated at the same altitude as +ICs (not shown). −ICs generally initiated at the same altitude as −CGs, except for samples in the high altitude positive dipole class, where the few detected −ICs initiated mostly at the theoretical altitude or up to 0.5 km above. This could be a sign of the presence of a negative upper screening charge layer (e.g., Krehbiel et al., 2008; López et al., 2019; MacGorman et al., 2008), with some −IC flashes forming between the screening layer and the main dominant positive layer below. This remains a hypothesis due to the low number of −ICs.

A statistical study has been performed on a data set composed of 711 electrical cells divided into 10-min samples in order to identify the altitudes of the dominant negative and positive layers. We recall that the population of electrical structures considered in this study does not include electrical cells with a lifetime of less than 20 min or with a total lightning activity of less than 20 flashes. After filtering out samples with low confidence, a total of 3,684 samples were classified according to their dominant dipole. For this 5-month study period, about 25% (75%) of the samples had a dominant negative (positive) dipole charge structure. Samples associated with a dominant positive dipole can be classified as normal samples since the dominant charge structures recall the normal dipole charge structure (Dye, 1986; E. R. Williams, 1985). These normal samples also recall the normal tripole charge structure (E. R. Williams, 1989), since the upper positive charge layer of the normal tripole structure is more electrically active than the lower positive layer (Lang & Rutledge, 2011), meaning that the upper positive dipole is the dominant one. This does not prevent the presence in these samples of some flashes propagating in a potential negative dipole, indicating the presence of a low-level non-dominant positive layer. Conversely, samples with a dominant negative dipole can be classified as anomalous samples since the dominant charge structure looks like a negative dipole charge structure (e.g., Bruning et al., 2007; Qie et al., 2005; Salvador et al., 2021). Here too, this does not prevent the presence of some flashes in a potential positive dipole, indicating the presence of an upper non-dominant positive layer. It also recalls the bottom-heavy charge structure (e.g., Bruning et al., 2007; Mansell et al., 2010), which is a normal tripole with dominant activity in the lower part of the storm (negative dipole) and few flashes occurring in the positive dipole. Although this study shows a wide variability in the height of the observed dominant positive and negative layers, the most frequent dominant positive layer altitude associated with a normal (anomalous) charge structure is around 10 km (3 km), which is consistent, for example, with the observations of Salvador et al. (2021). On a smaller population of samples with radiosounding temperature profiles close in time and space, the present study gives dominant positive layers located at temperatures around −40°C (0°C) for normal (anomalous) charge structures, in agreement with the results of Fuchs et al. (2015).
These statistical results only represent a 5-month period composed of 79 different days with lightning activity. It should be stressed that the results obtained for August and October 2018 are mainly driven by two major weather situations that each produced almost one third of the total number of electrical cells of the month. The relatively high proportion of dominant negative (positive) dipoles for October 2018 (August 2018) could then be explained by the high number of cells and the weather conditions associated with these extreme events. The study also relies on the notion of samples, which ultimately provide a snapshot of the charge layer structure during a given period, set to 10 min in the present study. These charge structures, as retrieved in this work, are mainly representative of bi-level lightning flashes with a well-defined initial vertical propagation phase that meets the analysis criteria of the Chargepol algorithm. The classification of the samples by counting flashes in each dipole can therefore vary with the temporal position of the 10-min time window relative to the life cycle of a given electrical cell, especially in the case of cells with low flash rates. It is believed that a statistical study on a large number of samples, in addition to the different filters applied to both the samples and the LMA data as described in this study, can reduce the impact of this arbitrary time window on the classification of dominant dipole charge structures.
Several macroscopic electrical parameters, such as flash rate, IC-CG ratio, +CG fraction, +IC fraction and flash initiation height, have been analyzed according to the charge structure. This study confirms that the higher in altitude the dominant dipole is, the higher the flash rate is, for both anomalous (dominant negative dipole) and normal (dominant positive dipole) charge structures. In this study, a certain population of anomalous charge structures (high altitude negative dipole) produced the highest flash rates and was linked to severe weather (strong winds, supercells). However, these samples suggest that Corsican thunderstorms associated with dominant negative dipoles do not predominantly produce positive CGs, so this criterion cannot be used to identify anomalous charge structures in Corsica.
The charge structures have been classified into 4 classes according to the polarity and the height of the dominant dipole. These 4 classes are summarized in a conceptual scheme (Figure 7). The wide range of charge structures observed should depend on the meteorological conditions influencing the charge distribution but also on the stage in the life cycle of a storm. In the context of this study, the goal was to document the lightning flashes produced by the different observed charge structures and not to link these charge structures to typical environmental conditions. Aerosol content (Coquillat et al., 2022) and cloud base height (Fuchs et al., 2015) are examples of environmental conditions hypothesized to promote dominant negative dipoles and anomalous charge structures throughout the life of the storm. The other hypothesis is that thunderstorm life cycles influence the observed charge structures. The analysis of the occurrence of charge structures in the life cycle of their respective cells did not give any strong signal validating this hypothesis; only the high altitude positive dipole class tends to occur mostly at the mid-life of the cells (not shown).
The synergistic use of VHF LMA and LF LLS observations gives a rational type and polarity classification to LMA flashes with regard to the vertical charge structures and the flash initiation heights. The automatic identification of the vertical charge structure gives important information about storm dynamics and electrical activity that can be used for storm monitoring. Information on the meteorological parameters associated with each charge structure could be useful in order to identify the conditions conducive to anomalous thunderstorms in the North-Western Mediterranean Sea in comparison with anomalous thunderstorms in the US Great Plains, for example. The influence of the land/sea transition of thunderstorm cells on their charge structure should also be investigated.
2) FED clusters are then identified within a given 5-min period by applying two successive DBSCAN (Pedregosa et al., 2011) passes to the pixels. The first DBSCAN pass aims at separating clusters according to a given Euclidean distance and a specific threshold on FED. Here, all pixels with a minimum FED value of 1 are taken into account. Two clusters belong to the same electrical cell if the distance between the closest pixels of the two clusters is less than 10 km. This distance was selected based on a sensitivity study (not shown) and is similar to the one used by Fuchs et al. (2015) for the identification of adjacent cells as isolated cells, and to the distance threshold used by Galanaki et al. (2018) for cell clustering using the ZEUS lightning sferics network.
Depending on the structure of the parent cloud, the clusters may spread horizontally because of the propagation of lightning flashes in the stratiform part of the storms (Carey et al., 2005; Coquillat et al., 2019; Weiss et al., 2012). Electrical cells can have distant convective centers but close (<10 km) or even connected stratiform regions. In this case, the first DBSCAN pass will gather two cells under the same cluster. To solve this problem, a second DBSCAN pass is applied to the existing clusters with a threshold on FED to identify and select the most electrically active regions. It uses an adaptive threshold proportional to the maximum FED of each cluster identified by the first DBSCAN pass. We consider that the maximum flash density pinpoints the convective core of the cell (e.g., Bruning & Macgorman, 2013; Calhoun et al., 2013), while the stratiform region of the cluster corresponds to pixels having a FED lower than 25% of the maximum FED of the cluster (value chosen after tests, not shown). This adaptive threshold, constant in percentage (25%) but variable in actual FED value, is cluster dependent and avoids the use of a single fixed threshold, since different clusters present at the same time can exhibit maximum FED values that differ significantly.
The second DBSCAN pass works like the first one, with the same Euclidean distance of 10 km, but it is applied to each cluster individually, using the pixels selected by the adaptive threshold. In most cases, it simply restricts the cluster borders to the convective part and reduces the horizontal size of the clusters. However, in the case of a large cluster with two (or more) convective cores connected through low-FED pixels, the second DBSCAN pass separates the cluster into 2 (or more) new clusters with contours closer to the convective (electrical) cores. Figure A1a shows an example of the separation of the FEDs into clusters at an instant T, with borders delimiting their respective most active regions as obtained from the second DBSCAN pass.
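The two-pass logic of step 2 can be sketched without any external library. The example below is not the ECTA implementation: it replaces scikit-learn's DBSCAN with a plain distance-based grouping (pixels closer than 10 km are linked, which is comparable to a density clustering in which every pixel may act as a core point), and the pixel layout `(x_km, y_km, fed)`, the function names and the data are assumptions.

```python
from math import hypot


def cluster(pixels, eps_km):
    """Group pixels that are linked by steps shorter than eps_km."""
    unvisited = set(range(len(pixels)))
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            xi, yi, _ = pixels[i]
            near = [
                j for j in unvisited
                if hypot(pixels[j][0] - xi, pixels[j][1] - yi) < eps_km
            ]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
        clusters.append([pixels[i] for i in sorted(members)])
    return clusters


def two_pass_clusters(pixels, eps_km=10.0, min_fed=1, core_fraction=0.25):
    """First pass on all active pixels, second pass on each cluster core."""
    active = [p for p in pixels if p[2] >= min_fed]
    cells = []
    for first_pass in cluster(active, eps_km):
        # Adaptive threshold: 25% of the maximum FED of this cluster.
        threshold = core_fraction * max(fed for _, _, fed in first_pass)
        core = [p for p in first_pass if p[2] >= threshold]
        cells.extend(cluster(core, eps_km))
    return cells


# Two convective cores 24 km apart, connected by low-FED stratiform pixels.
pixels = [(0.0, 0.0, 40), (8.0, 0.0, 2), (16.0, 0.0, 2), (24.0, 0.0, 40)]
first = cluster(pixels, 10.0)
cells = two_pass_clusters(pixels)
print(len(first), "cluster after pass 1,", len(cells), "cells after pass 2")
```

In this synthetic case the first pass returns a single cluster, and the adaptive threshold (10 here) removes the connecting pixels so that the second pass yields two cells.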
3) At the end of the second step, for each minute, all clusters possess an identification (ID) number and their borders are defined by a polygon. Once the identification of the cells is completed, the cells have to be tracked over time. The polygons formed by the clusters are compared two by two to detect overlapping clusters. If there is any geographical overlap between a polygon at time T+1 (son) and a polygon at time T (father), the son cluster is considered as the future of the father cluster and takes the father's ID. In addition, integrating the lightning data over 5 min while using a sliding time window avoids creating non-overlapping cells, especially for fast-moving storms. In its present configuration, with the 5-min integrated density, two successive FED images always have 4 min of electrical activity in common. As an example, Figures A1a and A1b show clusters identified at two successive time steps: clusters #1, #2 and #3 at 1307 UTC (Figure A1a) and clusters #1, #2, #3 and #4 at 1308 UTC (Figure A1b). Clusters #1, #2 and #3 keep the same identification number because they overlap geographically between the two time steps. Cluster #4 is considered a new cluster as it does not overlap any other cluster.
Complicated cases can appear, in particular the cases where cells split or merge.If a mother cell has several daughter cells (split), the daughter with the largest overlap area with the mother cell takes the ID number of the mother.The other daughter cells are considered as new cells.In case several mother cells merge into one daughter cell, the mother cell with the largest area of overlap with the daughter cell provides its ID number, and the other mother cells are no longer considered active.The contour of any cell that is no longer active is still kept in memory for 20 min after its last electrical activity in order to revive it if a new electrical activity reappears and overlaps the old position of that cell.At the end of step 3, the electrical activity of a day over the domain of interest is composed of cells, each cell labeled by an ID number and geo-located by its position (centroid) and its border at each time step of its life.For example, Figure A2 shows the cells along with the trajectories obtained by ECTA for the day of 26 July 2018.
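The ID inheritance rules of step 3, including splits and merges, can be illustrated with the sketch below. It is a simplified stand-in rather than the ECTA code: clusters are represented by sets of grid pixels instead of polygons, the overlap area is approximated by the number of shared pixels, and the 20-min memory of inactive cells is not implemented. All names are hypothetical.

```python
from itertools import count


def track(previous, current, new_ids):
    """Assign IDs to the clusters at time T+1 from the clusters at time T.

    previous: dict {cell_id: set of pixels} at time T
    current:  list of sets of pixels at time T+1
    new_ids:  iterator that yields unused ID numbers
    """
    claims = {}  # index of the son -> (overlap, ID of the winning parent)
    for parent_id, parent in previous.items():
        overlaps = [(len(parent & son), k) for k, son in enumerate(current)]
        best_overlap, best_son = max(overlaps, default=(0, None))
        if best_overlap == 0:
            continue  # this parent has no successor at time T+1
        # Split: only the son with the largest overlap is claimed.
        # Merge: the parent with the largest overlap keeps its ID.
        if best_son not in claims or best_overlap > claims[best_son][0]:
            claims[best_son] = (best_overlap, parent_id)
    tracked = {}
    for k, son in enumerate(current):
        cell_id = claims[k][1] if k in claims else next(new_ids)
        tracked[cell_id] = son
    return tracked


previous = {1: {(0, 0), (0, 1), (1, 0), (1, 1)}, 2: {(5, 5), (5, 6)}}
current = [
    {(0, 1), (1, 0), (1, 1), (2, 1)},  # main successor of cell 1
    {(0, 0), (-1, 0)},                 # smaller piece split from cell 1
    {(5, 6), (5, 7)},                  # successor of cell 2
    {(9, 9)},                          # no overlap: new cell
]
tracked = track(previous, current, count(3))
print(sorted(tracked))
```

In this example cell 1 splits: the larger piece keeps ID 1 and the smaller piece receives a new ID, as does the isolated cluster.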
4) The last step extracts all the L2b flash data that belong to a given electrical cell. Any flash whose initiation is located within the contour of a given cell is extracted. We consider that flashes are initiated preferentially in the convective part of the storm. Indeed, Ribaud et al. (2016) showed that 97% of the first VHF sources of the flashes were initiated in the convective regions. Extracting by flash object rather than by source makes it possible to recover data outside the area defined by the contour of the convective cell. Figure A3 shows an example of an extraction of L2b data associated with a single cell (cell #2, 26 July 2018).
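Steps 3 and 4 can be illustrated with a minimal Python sketch. It is not the ECTA implementation and rests on several assumptions: cluster footprints are represented as sets of grid pixels rather than polygons, so the overlap area is a pixel count; cases that combine a split and a merge are resolved greedily by decreasing overlap, which the text above does not specify; and the 20 min memory of inactive cells is omitted. The names `assign_ids`, `point_in_polygon`, `extract_flashes`, `init_lon` and `init_lat` are hypothetical.

```python
from itertools import count


def assign_ids(previous, current, new_ids):
    """Propagate cell IDs from time T to T+1 with the largest-overlap rule.

    previous: dict mapping cell ID -> set of (row, col) pixels at time T
    current:  list of pixel sets, one per cluster found at time T+1
    new_ids:  iterator yielding unused ID numbers
    Returns the list of IDs given to the clusters of `current`.
    """
    # Overlap area (pixel count) of every father/son pair that overlaps.
    overlaps = [
        (len(father & son), father_id, son_index)
        for father_id, father in previous.items()
        for son_index, son in enumerate(current)
        if father & son
    ]
    # Largest overlaps are served first (greedy choice, an assumption).
    overlaps.sort(key=lambda item: (-item[0], item[1], item[2]))
    ids, used_fathers = {}, set()
    for _, father_id, son_index in overlaps:
        if son_index in ids or father_id in used_fathers:
            continue  # split: smaller daughter; merge: smaller mother
        ids[son_index] = father_id
        used_fathers.add(father_id)
    # Clusters that inherited nothing are new cells.
    for son_index in range(len(current)):
        if son_index not in ids:
            ids[son_index] = next(new_ids)
    return [ids[i] for i in range(len(current))]


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def extract_flashes(flashes, contour):
    """Keep the flashes whose initiation point lies inside the contour."""
    return [
        flash for flash in flashes
        if point_in_polygon(flash["init_lon"], flash["init_lat"], contour)
    ]


# Cell 1 splits into two daughters, cell 2 persists, one cluster is new.
previous = {1: {(0, 0), (0, 1), (1, 0), (1, 1)}, 2: {(5, 5), (5, 6)}}
current = [{(1, 1), (1, 2)}, {(0, 0), (0, 1), (1, 0)},
           {(5, 6), (5, 7)}, {(9, 9)}]
print(assign_ids(previous, current, count(3)))  # [3, 1, 2, 4]

contour = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
flashes = [{"id": "a", "init_lon": 2.0, "init_lat": 2.0},
           {"id": "b", "init_lon": 5.0, "init_lat": 1.0}]
print([flash["id"] for flash in extract_flashes(flashes, contour)])  # ['a']
```

In the first example, the daughter sharing three pixels with cell 1 keeps ID 1, while the smaller daughter and the isolated cluster receive new IDs. Because the extraction tests only the initiation point, all sources of a retained flash are kept, including those that propagate outside the contour.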

Figure 1. Charge layers inferred from the Chargepol algorithm at cell scale. (a) Altitude of VHF sources versus time for cell #2 on 26 July 2018. VHF sources in an inferred positive (negative) charge layer are colored in red (blue). VHF sources with no inferred polarity are colored in gray. Each vertical black dashed line represents the end of a sample and the start of a new one. (b) Vertical histogram (0.5 km bins) of Chargepol-flashes propagating in an identified positive or negative charge layer for cell #2. (c) and (d) Histograms (0.5 km bins) of the vertical distribution of Chargepol-flashes propagating in inferred positive and negative charge layers for the 1322 UTC and 1402 UTC samples (black arrows in (a) and (e); the times correspond to the end of the samples). (e) Altitude of the sample DPL (red) and DNL (blue) versus time for cell #2. Each vertical black dashed line represents the end of a sample and the start of a new one. "P" denotes dominant positive dipole samples and "N" dominant negative dipole samples.

Figure 2. Map of the cell trajectories obtained by ECTA on the SAETTA domain for each month. (a) June 2018. (b) July 2018, with the trajectory of cell #2, taken as an example in Section 2.5 and in Appendix A, highlighted in red. (c) August 2018. (d) September 2018. (e) October 2018, with the cell trajectories associated with the ADRIAN situation highlighted in red. Crosses represent the end of the trajectories.

Figure 3. 2D distribution (0.5 km bins) of samples by altitude of their DNL and DPL. Contours show the accumulated number of flashes in bins. The diagonal black dotted line separates dominant positive and negative dipole domains.

Figure 4. (a) Average flash rate per dominant negative dipole sample in 0.5 km DNL altitude bins. The box extends from the first quartile to the third quartile of the data, with an orange line at the median. (b) 2D distribution (0.5 km bins) of the median of the average flash rate per sample. Contours show the accumulated number of flashes in bins. The diagonal black dotted line separates dominant positive and negative dipole domains. Red and gray boxes separate sample classes. (c) Same as (a) but for dominant positive dipole samples. (d) Average flash rate per sample in 0.5 km DPL altitude bins.

Figure 5. (a) 2D distribution (0.5 km bins) of the median IC-CG ratio of binned samples. Contours show the accumulated number of flashes in bins. The diagonal black dotted line separates dominant positive and negative dipole domains. (b) and (c) Same as (a) but for the +CG fraction and the +IC fraction. Red and gray boxes separate sample classes.

For dominant positive dipole structures, −CGs typically initiate 0.5-3.5 km below the theoretical initiation altitude of the dominant positive dipole (Figure 6a). This indicates the presence of another layer of positive charge beneath the DNL, forming a lower (non-dominant) negative dipole in which classical type I −CGs of the Li et al. (2020) classification can initiate. The difference in altitude between the initiation of −CGs in the bottom negative dipole and the theoretical initiation altitude of flashes in the dominant positive dipole (at the top of the tripole) is assumed to correspond roughly to the thickness of the main negative charge layer. For positive dipoles at low altitudes (DNL around 4 km and DPL around 6 km) with few samples and few −CGs (between 10 and 100 −CGs per bin), it is likely that there is no lower positive layer helping the negative leaders propagate to the ground, as there is for type I flashes. In such a configuration, the −CGs would rather be of type II or III of the Li et al. (

Figure 6. (a) 2D distribution (0.5 km bins) of the median difference between the sample median initiation height of −CG flashes and the theoretical one for each bin. Contours show the accumulated number of flashes in bins. The diagonal black dotted line separates dominant positive and negative dipole domains. (b) Same as (a) but for +IC flashes. Red and gray boxes separate sample classes.

In general, the most frequent class of charge structures (in terms of 10 min periods) is the low altitude positive dipole class, which represents about 58% of the samples. Storm periods with such charge structures exhibit a low flash rate (0.5-3 flash min⁻¹), with mainly +IC flashes occurring in the dominant upper positive dipole and −CG flashes in the lower negative dipole. The least frequent category in Corsica (4%), the high altitude negative dipole class, produces the highest flash rate (20-50 flash min⁻¹). The dominant negative dipole of this class is located above 4 km height and produces essentially −IC flashes, short duration flashes and few −CGs. The third category, the high altitude positive dipole class, is associated with a large production of +IC flashes and an enhanced production of CG flashes of both polarities, since these charge structures are responsible for the highest CG flash rate observed (3 CG flash min⁻¹). Finally, the fourth and last category, the low altitude negative dipole class, represents about 21% of the samples and produces mainly −CGs and −ICs with some +IC activity.

Figure 7. Conceptual scheme of the four charge structure classes (dominant dipole in samples) observed in Corsica over the 5 months and the main associated flash production. The sample frequency of each class is given (%). The number of icons denotes the relative number of flashes occurring between layers. CG icons with a thicker line symbolize enhanced CG activity. Charge layers drawn with dashed lines are not always present.

Figure A2. Cell identification numbers and trajectories identified by the ECTA algorithm on 26 July 2018. Trajectories are colored with time (circles) and crosses indicate the end of the trajectories. Only cells with a minimum of 20 flashes and a minimum duration of 20 min are shown.

Figure A3. LMA and MET data extracted by ECTA for cell #2 of 26 July 2018. VHF sources are colored with time. (a) VHF source altitude versus time (UTC). (b) VHF source altitude versus longitude. (c) VHF source altitude histogram. (d) VHF source altitude 2D projection. (e) VHF source altitude versus latitude. MET observations for the cell are also added on (a), with IC pulses plotted at 14.5 km height and strokes at 0.5 km height (red for positive currents and blue for negative currents).

Table 1. Number of Samples and Flashes Statistics Before and After Filtering on Samples for Each Month Independently and for the 5 Months Together
Columns: June 2018 | July 2018 | August 2018 | September 2018 | October 2018 | Total (JJASO 2018)