Developing empirical lightning cessation forecast guidance for the Cape Canaveral Air Force Station and Kennedy Space Center



[1] This research addresses the 45th Weather Squadron's (45WS) need for improved guidance regarding lightning cessation at Cape Canaveral Air Force Station and Kennedy Space Center (KSC). KSC's Lightning Detection and Ranging (LDAR) network was the primary observational tool to investigate both cloud-to-ground and intracloud lightning. Five statistical and empirical schemes were created from LDAR, sounding, and radar parameters derived from 116 storms. Four of the five schemes were unsuitable for operational use since lightning advisories would be canceled prematurely, leading to safety risks to personnel. These include a correlation and regression tree analysis, three variants of multiple linear regression, event time trending, and the time delay between the greatest height of the maximum dBZ value and the last flash. These schemes failed to adequately forecast the maximum interval, i.e., the greatest time between any two successive flashes in the storm. The majority of storms had a maximum interval less than 10 min, which biased the schemes toward small values. Success was achieved with the percentile method (PM) by separating the maximum interval into percentiles for the 100 dependent storms. PM provides additional confidence to the 45WS forecasters, and a modified version was incorporated into their forecast procedures starting in the summer of 2008. This inclusion has resulted in ∼5–10 min time savings. Last, an experimental regression variant scheme using non-real-time predictors produced precise results but prematurely ended advisories. This precision suggests that obtaining these parameters in real time may provide useful added information to the PM scheme.

1. Introduction

[2] The threat of lightning, both to life and property, is well-documented [Holle et al., 1992; Curran et al., 2000]. This threat varies throughout the life cycle of the thunderstorm. During the mature stage of a storm [Byers and Braham, 1949], lightning activity usually is at a maximum, the threat is obvious, and most individuals seek cover. It is during the periods of thunderstorm initiation and dissipation, when lightning activity is not obvious, that the majority of lightning casualties occur [Holle et al., 1992]. This risk is especially present in Florida, which receives more cloud-to-ground (CG) lightning than any other state [Orville, 1994; Hodanish et al., 1997; Orville and Huffines, 2001; Orville et al., 2002]. The lightning threat is particularly acute during the warm season months of May through September, the climatological peak of Florida's lightning.

[3] Considerable research has been done on forecasting the onset of lightning at Cape Canaveral Air Force Station/Kennedy Space Center (CCAFS/KSC) [Roeder and Pinder, 1998; Roeder et al., 2002]. This research focuses on lightning cessation, which has received little previous attention. Specifically, we attempt to create an empirical lightning cessation guidance product for CCAFS/KSC. Both installations are located in the eastern portion of Florida's “lightning alley” that crosses the central portion of the peninsula. The most recent NLDN data (Figure 1) indicate that CCAFS/KSC typically experience 5–15 CG strikes per square km per year (S. Rudlosky, unpublished data, 2009). An earlier climatology based on data from 1992 to 2005 indicated 4–10 CG strikes per square km per year. These values are smaller due to more data being used before two upgrades to the National Lightning Detection Network [Cummins et al., 1998, 1999, 2006]. The key point is that lightning poses a significant threat to this portion of the state, which is one of the most active regions in the country. With nearly 25,000 individuals and over $20 billion of facilities at CCAFS/KSC [Boyd et al., 1995], lightning safety demands accurate forecasts for both lightning initiation and cessation.

Figure 1.

Annual CG flash densities over east central Florida (flashes km−2 yr−1) from 1992 to 2004. The KSC region receives between 4 and 10 cloud-to-ground strikes km−2 yr−1. The image covers 27.75°N–29°N and 81.55°W–80.4°W.

[4] The United States Air Force's 45th Weather Squadron (45WS) is tasked with forecasting lightning at CCAFS/KSC, among many other duties [Harms et al., 1999]. They issue lightning advisories that alert personnel to the onset of lightning and signal when the threat has passed [Weems et al., 2001; Bott and Eisenhower, 2005]. The 45WS is reasonably satisfied with the accuracy of their lightning initiation advisories, although room for improvement exists. However, knowing when to discontinue an advisory continues to be a major concern since after-the-fact evaluations indicate that most are maintained too long. There is currently no objective guidance for canceling an advisory beyond reversing the initiation criteria or forecaster rules of thumb [Roeder and Pinder, 1998]. As a result, the 45WS keeps the advisories active long enough to ensure that lightning has ceased and it is truly safe to resume outdoor activities. The 45WS desires forecast guidance that will assess with a high degree of confidence whether a particular flash is the last flash of a given thunderstorm. This guidance would decrease the advisory period while maintaining 45WS' excellent safety record. Effective guidance would reduce the amount of lost manpower and produce cost savings estimated to be millions of dollars per year [Roeder and Glover, 2005].

[5] Several works have inferred lightning cessation based on studies of lightning initiation [e.g., Wolf, 2006] or a storm's electric field [Marshall et al., 2009]. Wolf [2006] indicated that reversing the initiation criteria of the 40 dBZ radar reflectivity above the −10°C isotherm held promise for forecasting cessation. Marshall et al. [2009] showed that the surface electric field beneath a thunderstorm exhibited an end of storm polarity oscillation during the storm's decay phase [Byers and Braham, 1949]. Although these works describe interesting aspects of storm electrification, to our knowledge only four previous studies specifically have examined lightning cessation, and all focused only on the last CG strike. Hinson [1997] studied three storms in the KSC area using radar data as the primary data source. He found a lag time of ∼30 min between the last occurrence of 45 dBZ reflectivity at the −10°C isotherm level and the last CG strike. Holmes [2000] expanded the data set to 40 cases, concluding that single and multicell storms had different cessation behaviors. His greatest forecast skill was for single cell storms. The third study, by Holle et al. [2003], focused on evaluating the distance and times between successive CG strikes. They found that the probability of another CG strike within 3.2 km of a point within a 9.7 km outer warning ring, and 5 min after the previous strike, was only 3.7%. While promising, there was still a small likelihood of another strike occurring 30 min after the previous strike. Finally, Roeder and Glover [2005] conducted a proof of concept study based on 58 thunderstorms. Interstrike times were fit to a log linear curve that explained 75% of the variance. They concluded that a statistical approach to forecasting lightning cessation was a promising avenue of future research.

[6] This research seeks to develop statistical/empirical guidelines for forecasting lightning cessation in the CCAFS/KSC area. We expand on the previous cessation studies by incorporating data from KSC's Lightning Detection and Ranging (LDAR) network [e.g., Lennon, 1975]. Also, instead of only studying CG strikes, we consider total lightning, both CG strikes and intracloud (IC) flashes. The number of storms in our data set also is increased to 116 during the warm seasons (May–September) of 2000–2005. The overarching goal is to develop cessation guidance that can confidently and safely end a lightning advisory. This paper is the culmination of several earlier reports [Stano et al., 2006, 2008a, 2008b].

[7] We focus on the warm season months since they are the climatological peak of lightning activity in central Florida. Warm season storms also are less likely to be synoptically driven and are more likely due to sea breeze activity and general destabilization by surface heating. These conditions produce scattered individual storms, not organized lines of convection that are more typical during the cold season. Cold season storms are much easier to monitor as they move into and out of the 45WS' area of interest. Since warm season convection typically does not transition through the 45WS area of concern in this manner, forecasting its lightning cessation is more complex.

[8] Section 2 describes the LDAR network and other supporting data sets. Section 3 discusses our methodologies and the characteristics of the 116 storms. Section 4 presents our results, while conclusions are given in section 5.

2. Observation Networks

2.1. Lightning Detection and Ranging

[9] The LDAR network at KSC (Figure 2, circles) [Lennon, 1975; Poehler and Lennon, 1979; Maier et al., 1995; Britt et al., 1998; Boccippio et al., 2001] is a short-baseline system utilizing a time of arrival detection scheme. Originally designed by KSC, the network consists of seven sensors arranged in a hexagonal pattern. Each sensor is located 6–10 km away from the controlling central receiving site. LDAR is a passive observing system that operates at 66 MHz with a bandwidth of 6 MHz [Maier et al., 1995]. It detects the very high frequency electromagnetic pulses generated by individual stepped leaders and other phenomena associated with lightning aloft. A single flash may consist of hundreds or thousands of LDAR detections. In April 2008, LDAR was significantly upgraded and renamed the Four-Dimensional Lightning Surveillance System, and ownership was transferred to the 45th Space Wing [Murphy et al., 2008]. However, the data analyzed in this study were all from the original LDAR system, and so hereafter we will refer to the data set as from LDAR, owned and operated by KSC.

Figure 2.

The research domain at CCAFS/KSC, where the outer ring is 100 km from the center of the LDAR network and the inner ring is at 60 km. Priority was given to events occurring within 60 km, and no event was further than 100 km. Locations of the main observation networks are shown for LDAR (circles), CGLSS (triangles), and the WSR-88D (square).

[10] LDAR detects most IC flashes and the upper portions of CG strikes, with a detection efficiency greater than 90% within 100 km of the network's center [Boccippio et al., 2001]. The efficiency improves to 99% when events occur within 25 km of the network's center [Maier et al., 1995; Murphy et al., 2000]. It is important to distinguish between individual LDAR source and flash detections. LDAR may only detect 70% of individual sources within the network, but the flash detection rate is close to 100%. LDAR data are the key difference between this work and the previous studies of lightning cessation [Hinson, 1997; Holmes, 2000; Holle et al., 2003; Roeder and Glover, 2005].

2.2. Cloud-to-Ground Lightning Surveillance System

[11] KSC's Cloud-to-Ground Lightning Surveillance System (CGLSS) [Roeder et al., 2005; Boyd et al., 2005] is a high-performance, local CG lightning detection network consisting of six Improved Accuracy via Combined Technology sensors [Cummins et al., 1998] (Figure 2, triangles). They are similar to sensors employed by the National Lightning Detection Network (NLDN) [Cummins et al., 1998, 1999]. CGLSS has greater detection efficiency and location accuracy than NLDN due to the sensors being separated by only a few tens of kilometers [Boyd et al., 2005]. CGLSS has 98% detection efficiency and 250 m location accuracy, assuming all sensors are used in the solution of the lightning location [Roeder et al., 2000]. The primary purpose of CGLSS is to assess the likelihood of induced current damage in the electronics of payloads, space launch vehicles, and key facilities. Since LDAR loses detection efficiency in the lowest 1 km of the atmosphere, the CGLSS data are overlaid on the LDAR display to confirm that a descending lightning flash actually became a CG strike. For 45WS purposes, lightning advisories are issued for any type of lightning.

2.3. Conventional Data

[12] WSR-88D radar data from the National Weather Service Forecast Office in Melbourne, Florida, were an important secondary data set. The radar data were used to accurately determine the locations of thunderstorms, associate each lightning flash with its parent storm, and capture the full lifecycle of each storm. This assured that the final flash was properly assigned to the correct storm. The Melbourne radar is located 1.13 km west and 47.32 km south of the central LDAR receiver [Hinson, 1997; Holmes, 2000] (Figure 2, square). Level II archived data were acquired from the National Climatic Data Center.

[13] Morning radiosonde soundings from CCAFS (KXMR) also were used. We calculated various wind, moisture, and stability parameters from the soundings and then determined if they were statistically correlated with lightning cessation. Only morning KXMR soundings between 1000 and 1500 UTC were utilized in order to represent the atmosphere prior to the typical afternoon thunderstorm initiation. This decision is consistent with several previous studies [Neumann and Nicholson, 1972; Lopez et al., 1984; Livingston et al., 1996; Brenner, 2004; Shafer and Fuelberg, 2006].

3. Methodology

3.1. Storm Selection

[14] The storms comprising our data set were manually selected by viewing displays of LDAR and radar data. This manual approach, while time consuming, was the most effective way to ensure that each flash was properly assigned to its parent storm. The effective range of the LDAR network determined our domain. It was confined to within 100 km of the center of the LDAR network (outer ring, Figure 2), with preference given to storms within 60 km (inner ring, Figure 2). This inner ring assured that storms would lie within the high detection efficiency regions of both CGLSS and LDAR. Additionally, once a storm is greater than 60 km from the center of the LDAR network, the vertical error of the signal owing to the Earth's curvature becomes too large to effectively use three-dimensional LDAR predictors [Boccippio et al., 2001].
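The two-ring domain test described above can be illustrated with a minimal sketch. The function name, the use of local east/north offsets in kilometers, and the category labels are assumptions made for illustration; the study selected storms manually and did not publish such a routine.

```python
import math

INNER_RING_KM = 60.0    # preferred range (inner ring, Figure 2)
OUTER_RING_KM = 100.0   # maximum range (outer ring, Figure 2)

def range_class(east_km, north_km):
    """Classify a location by horizontal distance from the LDAR center.

    east_km and north_km are hypothetical local offsets (km) from the
    central LDAR receiving site; negative values are west or south.
    """
    distance = math.hypot(east_km, north_km)
    if distance <= INNER_RING_KM:
        return "preferred"
    if distance <= OUTER_RING_KM:
        return "allowed"
    return "outside"

# The Melbourne WSR-88D lies 1.13 km west and 47.32 km south of the
# central LDAR receiver, so it falls inside the inner ring.
print(range_class(-1.13, -47.32))
```

A flat-plane distance is adequate for a sketch at these ranges; an operational version would work from latitude and longitude.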

[15] Before individual storms could be selected, the LDAR and CGLSS data were processed using two algorithms made available by the 45WS. The first combined the individual LDAR sources into flashes using temporal and spatial criteria. Several flash-creating algorithms were available for use [Williams et al., 1999; Nelson, 2002; Wiens et al., 2002; Thomas et al., 2003; Koshak et al., 2004; Lojou and Cummins, 2005], and each had various pros and cons as quantified by McCormick [2003] and Murphy [2006]. The Nelson [2002] algorithm that we selected is an extension of original code developed by Murphy et al. [2000]. The main concern with any algorithm is uncertainty during high flash rates. However, analyses have shown that the Nelson [2002] approach is no better or worse than any other in this regard. It has been used in several previous studies [McNamara, 2002; Nelson, 2002; Vollmer, 2002; McCormick, 2003]. Approximately 340 million LDAR-observed sources were processed during the study period of May–September 2000 to 2005.
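The general idea of combining sources into flashes with temporal and spatial criteria can be sketched as follows. This is not the Nelson [2002] algorithm, whose details are not given here. The greedy grouping rule, the function name, and both thresholds (0.3 s and 5 km) are placeholders assumed purely for illustration.

```python
import math

def group_sources(sources, max_gap_s=0.3, max_dist_km=5.0):
    """Greedily group (time_s, x_km, y_km) sources into flashes.

    A source joins the first existing flash whose most recent source is
    within max_gap_s in time and max_dist_km in horizontal distance;
    otherwise it starts a new flash. Both thresholds are hypothetical.
    """
    flashes = []
    for src in sorted(sources):          # process sources in time order
        t, x, y = src
        for flash in flashes:
            last_t, last_x, last_y = flash[-1]
            close_in_time = (t - last_t) <= max_gap_s
            close_in_space = math.hypot(x - last_x, y - last_y) <= max_dist_km
            if close_in_time and close_in_space:
                flash.append(src)
                break
        else:
            flashes.append([src])        # no match: start a new flash
    return flashes

# Illustrative sources only, not LDAR data.
demo_sources = [
    (0.00, 0.0, 0.0),
    (0.05, 1.0, 0.0),
    (0.10, 2.0, 1.0),
    (0.12, 30.0, 30.0),   # simultaneous but far away: separate flash
    (5.00, 0.5, 0.5),     # nearby but much later: separate flash
]
demo_flashes = group_sources(demo_sources)
print([len(flash) for flash in demo_flashes])
```

The sketch also hints at why high flash rates are a concern: when many flashes overlap in time and space, any fixed thresholds will merge or split some of them.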

[16] The second algorithm combined the LDAR flashes with CG strike locations [McNamara, 2002]. This prevented the IC component of a CG strike and the CG strike itself from being counted as two separate flashes. With these steps completed, the initiation point of each flash was displayed with the radar data as described below. Only the flash initiation points were used to select candidate storms so that the radar/source display was not cluttered with thousands of other sources.

[17] The Warning Decision Support System–Integrated Information (WDSS-II) software [Lakshmanan et al., 2007] was used to combine and visualize the radar and LDAR flash initiation data. Our goal was to select only storms whose lightning flashes clearly were associated with that storm. This selection, although subjective, was the most critical component. If flashes could not definitively be associated with a particular storm, there would be no certainty that the final flash of that storm had been captured and that cessation had occurred. Automated approaches were attempted, but no currently available method accurately and consistently matched storms with lightning.

[18] Our subjective analysis limited the number of storms available for study. As noted earlier, the initiation points had to coincide with the radar-observed location of a storm. Most lightning in active storms was located near the storm's core (i.e., its greatest dBZ values). However, weaker or weakening storms contained lightning that was more dispersed throughout the cell or in the anvil region. Each selected storm also had to be isolated from other storms to ensure that flashes were assigned to the correct storm. Although this requirement provided certainty that the final flash of each storm was observed, fewer storms could be selected. Cells often would grow in close proximity or merge, preventing the determination of which storm generated which flash. The isolated storms generally were weaker, shorter-lived, and exhibited less electrical activity than the general population of Florida storms.

[19] It is important to note that storms were not rejected based on severity. A storm was rejected only if it was not isolated such that we were unable to track all of the flashes during the storm's lifecycle. There were originally 142 storms in our data set, consisting of severe and nonsevere storms as well as multicellular and single cell storms. From this group, 26 were rejected, leaving a final data set of 116 storms. Six of the 26 rejections were due to missing radar data. The remaining 20 rejections, consisting of both severe and nonsevere storms, were deleted because it was unclear which storm produced which flash, thus preventing a definitive determination of when cessation truly occurred.

3.2. Storm Characteristics

[20] Nearly 17 thousand flashes occurred in our 116 thunderstorms over 32 separate days. It is useful to describe several characteristics of these primarily nonsevere storms. Their intracloud flash rate ranged from 0.1 to 18 flashes min−1, with a median of 1 flash min−1. The overall distribution (not shown) was skewed, with the majority of storms exhibiting fewer than 4 flashes min−1. These values are much smaller than those of Montanyà et al. [2007], who studied a severe hailstorm in northeastern Spain with a flash rate of 92 flashes min−1. Similarly, Wiens et al. [2005] observed flash rates of nearly 300 flashes min−1 in a supercell over the Great Plains. IC flashes comprised 95–100% of the total lightning activity reported by Wiens et al. [2005]. This large IC percentage may be due partly to the flash creation algorithm breaking a single flash into several flashes [Murphy, 2006]. Our 116 storms did not exhibit such large flash rates; however, eight were severe thunderstorms, as defined by the National Weather Service (i.e., hail greater than three quarters of an inch in diameter, winds exceeding 93 km h−1, or a tornado), like those observed by Wiens et al. [2005] and Montanyà et al. [2007]. It was difficult to include severe thunderstorms in our data set because many merged or were in close proximity to other cells, making it impossible to accurately determine which cell produced which flash. Thus, few severe thunderstorms were included, resulting in lower lightning flash rates.

[21] The LDAR network provides a three-dimensional analysis of lightning. We determined both the average initiation altitude (7.4 km) and average altitude of all sources (7.8 km), with both parameters ranging from ∼7–9 km. These values are less than the 8–11 km levels of maximum source densities found by Carey and Rutledge [1998] who observed the electrical and multiparameter radar-derived characteristics of a severe hailstorm near Ft. Collins, Colorado, in 1995. However, current results are in close agreement to the large peak of sources at 9 km with a smaller peak at 6 km found by Vollmer [2002] who investigated the horizontal extent of over 1 million lightning flashes over a multiyear period based on altitude and atmospheric temperature.

[22] The source altitude information was compared with several environmental parameters, particularly the freezing level height that defines the base of the storm's mixed phase region that is important to the charging process [Takahashi, 1978; Jayaratne et al., 1983; MacGorman et al., 1989; Saunders et al., 1991]. The median freezing level height for this study was 4.5 km. 64 storms (55%) had an average initiation altitude that was 2–4 km above the freezing level. Initiation altitudes of the remaining storms were split, with 32 storms (28%) having an average initiation altitude less than 2 km above, and 20 storms (17%) greater than 4 km above the freezing level. No storm had an average initiation altitude below the freezing level.

[23] Algorithms developed by McNamara [2002] and Nelson [2002] allowed comparisons between intracloud flashes and cloud-to-ground strikes. 96 of our storms (83%) produced some CG activity, leaving 20 with only IC flashes. Of these 20 storms, the total number of IC flashes ranged from 3 to 205. Two storms produced only CG strikes, each with six. The median percentage of CG strikes to IC flashes was 14%, which is three times the value found by Wiens et al. [2005]. Six storms exhibited a ratio greater than 50%.

[24] We also determined which type of flash was most likely to initiate and end a storm. Of our 116 storms, 104 (90%) initiated with an IC flash, while 97 (84%) ended with an IC flash. These values for our relatively weak storms are consistent with the findings of MacGorman et al. [1989] who studied two tornadic storms in central Oklahoma and Williams et al. [1989], who studied air mass thunderstorms producing microbursts near Huntsville, Alabama. Of the 96 storms with CG activity, 77 ended with an IC flash that averaged 8.1 min after the last CG strike. The greatest delay was 43 min. The remaining 19 storms with CG activity ended with a CG strike, with an average delay between the last IC and last CG of 9.4 min, and a maximum delay of 50 min.

3.3. Predictor Selection

[25] Our goal was to develop a statistical/empirical guidance product for lightning cessation. We first calculated 100 possible predictors based on the lightning characteristics just described as well as additional parameters from CGLSS, the WSR-88D, and KXMR soundings. These 100 predictors were reduced to a smaller number by screening for collinearity. When a possible predictor was highly correlated with others, the predictor with the highest correlation to our predictand (described below) was selected. These tests for collinearity reduced the number of possible predictors to 33 (Table 1). Discussions with the 45WS indicated that only parameters available to them in real time would be useful, and the flash creation algorithm could not be run in real time. Therefore, we initially eliminated several LDAR flash parameters including the average interflash time, flash rate, and intracloud flash rate (those retained in Table 1 are marked there as unavailable in real time). As substitutes, we included raw LDAR source data such as the number of sources above 10 km. After deriving cessation guidance based on the available real time predictors, several of the non-real-time predictors were reintroduced to the predictor pool to develop an experimental regression technique to test their effectiveness had they been available. All of these results are given in section 4.
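The collinearity screening described above can be sketched in a few lines. This is one plausible reading of the procedure, not the authors' code: candidates are ranked by their absolute correlation with the predictand, and a candidate is kept only if it is not highly correlated with any candidate already kept. The correlation threshold of 0.8, the function names, and the sample values are all assumptions for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def screen_collinear(predictors, predictand, threshold=0.8):
    """Keep, from each group of collinear predictors, the one that is
    best correlated with the predictand. threshold is hypothetical."""
    ranked = sorted(
        predictors,
        key=lambda name: abs(pearson(predictors[name], predictand)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        collinear = any(
            abs(pearson(predictors[name], predictors[other])) >= threshold
            for other in kept
        )
        if not collinear:
            kept.append(name)
    return kept

# Illustrative values only: "a" and "b" are nearly collinear, "c" is not.
demo_predictand = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
demo_predictors = {
    "a": [1.1, 2.0, 2.9, 4.2, 5.1, 5.8],
    "b": [2.0, 4.1, 6.3, 7.9, 10.2, 11.8],
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
print(screen_collinear(demo_predictors, demo_predictand))
```

In the demonstration, only one of the two collinear candidates survives, while the independent candidate is retained regardless of its weak correlation with the predictand.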

Table 1. Fifteen Lightning, Three Radar, and Fifteen Sounding Candidate Predictors [a]

Lightning
    average interflash time interval [b] (s)
    time between the last two flashes [b] (s)
    instantaneous storm duration from first flash to present [c] (min)
    average flash horizontal extent [b] (km)
    total sources for thunderstorm [c]
    average flash starting height [b] (km)
    storm over land or water [c]
    number of sources above 10 km [c]
    average source height [c] (km)
    average time between the last five flashes [b] (min)
    cloud-to-ground strike rate [d] (per min)
    total cloud-to-ground of the storm [d]
    first flash intracloud or cloud-to-ground [d]
    delay between first intracloud and first cloud-to-ground [e] (min)
    percentage of intracloud to cloud-to-ground [b,e]

Radar
    time from maximum VIL to final flash (min)
    maximum VIL of the storm (kg m−2)
    maximum height of the maximum dBZ for the storm (m)

Sounding
    convective temperature (K)
    convective condensation level (hPa)
    mean relative humidity through 1 km (%)
    theta-E lapse rate between 950 and 700 hPa
    mean wind direction (1000–700 hPa) (deg)
    shear through 6 km (s−1)
    shear through 500–200 hPa (s−1)
    wet-bulb zero level (hPa)
    best lifted index (°C)
    precipitable water (in)
    altitude of the −40°C isotherm (hPa)
    Showalter index (°C)
    freezing level (m)
    convective inhibition (J kg−1)
    most unstable CAPE (J kg−1)

[a] The 15 lightning candidate predictors are from LDAR and CGLSS, the 3 radar candidate predictors are from WSR-88D, Melbourne, Florida, and the 15 sounding candidate predictors are from Cape Canaveral, Florida, for a total of 33 candidate predictors.
[b] Candidate predictors unavailable in real time.
[c] Candidate predictors available in real time.
[d] Candidate predictors using CGLSS data.
[e] Candidate predictors using both CGLSS and LDAR.

3.4. Predictand Selection

[26] The selection of the cessation predictand requires explanation. We needed a predictand that would give a specific time interval to wait after a flash occurred to know with certainty that it was the last flash. That is, after seeing a flash the forecaster would know how long to wait, without another flash occurring, before safely ending the advisory. Several predictands were investigated, but the maximum interval (the greatest time between any two flashes in a storm) was selected for the reasons stated below.

[27] Figure 3 displays three general distributions of interflash times for the 116 storms comprising our data set. The solid curve is the trend that is intuitively expected (63 storms). That is, as a storm develops, its interflash times are large (i.e., the left side of the curve). The storm then exhibits peak lightning activity during its mature stage [Byers and Braham, 1949] that is indicated by small interflash times. Finally, as the storm reaches its dissipation stage, lightning activity diminishes, and the interflash times increase. If lightning in every storm followed this cycle, the time trend between the last few flashes would be relatively easy to forecast, and our approach likely would have mirrored the log linear curve described by Roeder and Glover [2005].

Figure 3.

Three common distributions of a storm's interflash times. The majority of our storms had the u-shaped distribution (solid line). However, other storms either had a rapid initiation and slow decay (dotted line) or slow initiation and rapid decay (dashed line) that made it difficult to use the time between the last two flashes as the predictand.

[28] Figure 3 shows that two additional lightning trends comprised an important portion of our data set. Some storms had no true building phase when the lightning activity slowly increased. Instead, their interflash times are small from the storm's beginning and then exhibit the expected decay (dotted line, 42 storms). This in itself would not prevent use of the ending interflash times. However, the third observed scenario (dashed line, 11 storms) shows a typical spin-up, but the storm suddenly stops producing lightning, yielding a short interflash time even between the last flashes. This sequence renders the time between the last two flashes useless as a predictand, since it is small and would lead to an underprediction of cessation wait times.

[29] Figure 4 illustrates another problem encountered when selecting a cessation predictand. It is a stylized example of several storms in our data set. These storms initially produce a rapid series of lightning flashes (e.g., first five flashes) followed by a long delay. Then, two additional flashes occur in close succession at the end of the storm. This example exhibits two distinct time intervals. The interval between the last two flashes, i.e., between flashes 6 and 7 is 2 min; however, the maximum interflash time of 10 min occurs between flashes 5 and 6. We next describe the ramifications of using each of these interflash intervals. We first assume that the time between the last two flashes (2 min) is our last flash forecast. That is, 2 min is the time to wait after every flash to decide if it was indeed the last one. This 2 min interval works well for the first four flashes since another flash always occurs within 2 min, thus resetting the 2 min wait time. However, a problem occurs after the fifth flash. If the 2 min wait period were the only input, we would end the lightning advisory at 7 min, which would be incorrect since flashes six and seven occur at 15 and 17 min, respectively. This choice would place personnel in danger since the advisory would be canceled even though the lightning threat has not ended. This uneven periodicity proved to be a major problem in developing a cessation guidance procedure.

Figure 4.

Illustration supporting the use of the maximum time interval between flashes instead of the time between the last two flashes. In 68% of the events, the maximum interval was greater than the interval between the last two flashes, making the former the superior predictand for forecasting lightning cessation, with the trade-off of a smaller time savings.

[30] The alternate choice is to forecast the maximum interval between flashes, which in our hypothetical example is 10 min. Using this interval, we correctly maintain the lightning advisory between the fifth and sixth flashes. Our forecast also waits long enough after the seventh (and last) flash to safely end the advisory, giving confidence to the forecasters using the scheme.
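The stylized example of Figure 4 can be worked through numerically. In the sketch below, flashes 5, 6, and 7 occur at 5, 15, and 17 min as described in the text; the times of the first four flashes (1 to 4 min) are assumed, chosen only so that each follows the previous one within 2 min. The function names are hypothetical.

```python
def max_interval(times):
    """Greatest time between any two successive flashes (min)."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))

def advisory_end(times, wait):
    """Time (min) at which an advisory would be canceled if the
    forecaster waits `wait` flash-free minutes after each flash.
    A flash arriving exactly at the end of the wait still resets it."""
    for current, following in zip(times, times[1:]):
        if following - current > wait:
            return current + wait      # canceled before the next flash
    return times[-1] + wait            # canceled after the true last flash

# Flash times in minutes; the first four are assumed for illustration.
flash_times = [1, 2, 3, 4, 5, 15, 17]
last_two = flash_times[-1] - flash_times[-2]

print("last-two interval:", last_two)
print("maximum interval:", max_interval(flash_times))
print("cancel time with 2 min wait:", advisory_end(flash_times, last_two))
print("cancel time with 10 min wait:",
      advisory_end(flash_times, max_interval(flash_times)))
```

With the 2 min wait, the advisory is canceled at 7 min, before flashes six and seven. With the 10 min wait, it remains in force through the final flash and is canceled 10 min afterward, which is the safety versus time-savings trade-off discussed in the text.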

[31] The choice of whether to use the time between the last two (or more) flashes or the maximum interval as the predictand comes down to what information is known in real time. Unlike initiation, the forecaster's job for cessation is not finished with the first flash. If all of the storms in our data set had a gradual decay in lightning activity (dotted and solid lines, Figure 3), the time between the last two (or few) flashes would be an adequate predictand. The forecaster would calculate a time to wait between flashes to know with certainty that when a flash was not followed by additional flashes during that interval, cessation will have occurred. However, Figure 3 shows that some storms in our data set (dashed line) start with a gradual build-up in lightning activity, and then stop with no gradual decrease in activity. This creates a problem if we try to use the time delay between the last two flashes, since this interval is small. Using the last two flashes time interval as the predictand for these suddenly ending storms may underestimate the actual time interval to safely wait for cessation. In other words, the time to wait may be underestimated, causing our cessation guidance to cancel an advisory too early.

[32] To avoid this uncertainty, we used the maximum interval between flashes as our predictand. It does not require the forecaster to know where in the sequence of flashes the just observed lightning occurs. If the forecast maximum interval has passed without another flash, the forecaster can confidently end the lightning advisory. This certainty comes with a trade-off. By using the maximum interval, the forecaster accepts an overforecast of the wait time to cessation. Although this decreases the time savings, safety considerations require that we accept greater forecast certainty over greater time savings. If a scheme provides no certainty that its prediction has relevance to lightning cessation, it is of no use to the 45WS. Figure 5 shows the distribution of maximum interval times for the 100-storm dependent data set (described next), along with four outlier maximum intervals of 16.4, 18.2, 18.7, and 27.8 min.

Figure 5.

Distribution of the maximum interval time (minutes) for the 100 storms comprising the dependent data set.

[33] We randomly selected 100 of the 116 storms to serve as the dependent data set from which to develop cessation guidance. The remaining 16 storms served as an independent data set on which the cessation schemes were tested. How the two data sets were distributed was a concern. If all of the outliers had been in the dependent data set, each scheme might provide a false sense of success. Conversely, if all the outliers were in the independent data set, it likely would cause our schemes to fail. The 16 independent storms were analyzed to determine if they were a representative sample of the 100 storms used to generate the equations. Figure 6 shows that three of the 16 storms had outlier maximum intervals of 11.1, 22, and 23.2 min, i.e., proportionally more outliers than the dependent data. However, the random selection of the dependent and independent data sets slightly favors our schemes since four outliers are in the dependent data as is the greatest outlier.

Figure 6.

Distribution of the maximum interval time (minutes) for the 16 storms that are part of the independent data set.

[34] To address the concern of basing our schemes on only one random selection of dependent and independent data, a bootstrap analysis [Efron and Tibshirani, 1993] was conducted by randomly dividing the 116 storms into 10 separate dependent and independent groups. The cessation schemes were redeveloped from each dependent set and verified against the corresponding independent storms. The bootstrap-derived forecasts also were compared to those from the current 45WS forecast procedures.

[35] With the candidate predictors calculated and the predictand chosen, we developed five potential schemes for providing cessation guidance using the Statistical Package for the Social Sciences (SPSS), version 11.5 for Windows, distributed by SPSS, Inc. One predicts the natural logarithm of the maximum interval. The natural logarithm was chosen since distributions of the raw maximum interval were skewed to the right (not shown), whereas multiple linear regressions require a Gaussian distribution. Three additional schemes, the correlation and regression tree analysis (CART), event time trend (ETT), and percentile method (PM), use the raw maximum interval. Finally, the fifth scheme predicts the lag between the time of the greatest height of the maximum dBZ of the storm (MZM) to the time of the last lightning activity. It is the only method that does not utilize the maximum interval in any way.

4. Results

4.1. Forecast Schemes

[36] Two terms must be defined before continuing. We evaluated our schemes based on the accuracy and precision of their forecasts. An accurate forecast predicts or overpredicts the maximum interval. That is, a scheme is rewarded for not ending a lightning advisory prematurely. A precise scheme only slightly overpredicts or underpredicts the maximum interval. Thus, a scheme may be very precise, but if it forecasts the end of lightning prior to actual cessation it is inaccurate. Table 2 will be utilized throughout this section. It shows the basic results of each scheme when tested on the 16 independent storms.

Table 2. Summary Table of Basic Results From the Five Empirical Schemes^a

Schemes (columns): Correlation and Regression Tree (CART); Sounding Only Regression (SOR); Sounding and Storm Features Regression (SSR); Experimental Regression (ER); Event Time Trend (ETT); Lag Between Storm's Maximum Height of Maximum dBZ to Last Flash (MZM); Percentile Method at 99.5% (PM at 99.5%); Percentile Method at 95% (PM at 95%)

                                  CART    SOR     SSR     ER      ETT     MZM     PM at 99.5%   PM at 95%
POD (%)                           56      75      75      44      81      88      100           88
FAR (%)                           44      25      25      56      19      12      0             12
Average error (min)−1.5−−0.113.518.18.1
Median error (min)−
Greatest underprediction (min)    19.6    15.9    11.5    13.4    12.9    6.5                   8.2
Average underprediction (min)
Median underprediction (min)
Greatest overprediction (min)
Average overprediction (min)
Median overprediction (min)

^a Three schemes were based on regression models (sounding only regression (SOR), sounding and storm regression (SSR), and experimental regression (ER)). Results from the event time trend (ETT) and two percentile method (PM) schemes also are shown. Note that the maximum height of the greatest dBZ (MZM) lag predicts the actual time to the last flash rather than the maximum interval between flashes.
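The verification measures in Table 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a forecast counts as a success when it equals or exceeds the observed maximum interval (the definition of accuracy given above), FAR is taken as the complement of POD because the POD and FAR values in Table 2 sum to 100 for every scheme, and error is defined as forecast minus observed so that negative values are underpredictions. The function name and the sample numbers are hypothetical.

```python
from statistics import median


def verify(forecast_min, observed_min):
    """Summarize forecast wait times against observed maximum intervals."""
    # Assumed sign convention: positive error = overprediction (safe),
    # negative error = underprediction (advisory ended too early).
    errors = [f - o for f, o in zip(forecast_min, observed_min)]
    over = [e for e in errors if e >= 0]
    under = [-e for e in errors if e < 0]
    pod = 100.0 * len(over) / len(errors)
    return {
        "pod_pct": pod,
        "far_pct": 100.0 - pod,  # assumption: complement of POD
        "median_error": median(errors),
        "greatest_underprediction": max(under) if under else None,
        "greatest_overprediction": max(over) if over else None,
    }


# Hypothetical example: a constant 15 min wait applied to four invented storms.
scores = verify([15.0, 15.0, 15.0, 15.0], [3.0, 5.0, 7.5, 22.0])
```

In this invented case three of the four storms are handled safely, and the single miss is the storm with the longest maximum interval, which mirrors the outlier problem discussed throughout this section.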

4.1.1. Correlation and Regression Tree Analysis

[37] An early ominous finding was low correlations between each of the 33 predictors (Table 1) and the predictand (maximum interval). This was manifested in the difficulty each scheme had in predicting the outlier values of maximum interval (e.g., Figure 4). The correlation and regression tree analysis (CART) [Brieman et al., 1984; Venables and Ripley, 1997; Burrows et al., 2004] will be described first. The first step was to create the tree using a recursive splitting of nodes (i.e., decision points). The nodes are created based on predictors in the dependent data set, and at each node, additional subnodes, or child nodes, then are created. As each node is created, SPSS determines whether the node terminates to provide a final value for the predictand (maximum interval). At this point, the decision tree likely has overfit the data set. This leads to a “pruning” process in which simpler trees are created by removing nodes of lesser importance. From the set of “pruned” trees, an optimal tree is selected that best describes the dependent data set, while not overfitting the data.

[38] The simplicity of this scheme is a positive trait for use in an operational setting since it does not require interpretation of a multivariate regression model. Unfortunately, although CART would be easy to implement operationally, it only provided 56% accuracy and very little precision (Table 2). The actual decision tree (Figure 7) shows the cause of the poor accuracy. Of the seven termination nodes giving the forecast maximum interval, only one is longer than 10 min. This automatically causes CART to miss the three longest maximum intervals in the 16 independent storms. With this bias toward the shorter and more numerous maximum intervals, the CART scheme does not provide a safe cessation forecast.

Figure 7.

CART analysis for predicting the maximum interval. Each termination node gives the forecast maximum interval (minutes) to wait before ending a lightning advisory.

[39] The CART analysis is heavily weighted toward the time delay between the last CG strike and the last (and final) IC flash of a storm since three of the first four decision nodes use this predictor (Figure 7). This choice specifically addresses characteristics of individual storms. Two of the final three predictors, shear through 6 km and the best lifted index, are environmental parameters, suggesting that the shear that can tilt an updraft and the available instability contribute to lightning activity. The final node, storm duration, is another storm-specific parameter that appears in several of the schemes discussed later. It is unfortunate that the optimal decision tree (Figure 7) includes a predictor (time delay between last CG and last IC) that is poorly correlated to the maximum interval (R is −0.1). However, none of the other predictors yielded better results. Of the remaining predictors, instantaneous storm duration had the best correlation, R = 0.21. Operationally, use of this predictor would require CART to be rerun as the storm persists. Although the CART analysis hints that individual storm predictors are the most effective to use, our scheme was hampered by the predictors available in real time being poorly correlated with the predictand (maximum interval).

[40] The decision tree shown in Figure 7 was the best of several that were developed. The tree creation process was repeated numerous times to help determine what parameters would produce the best decision tree. They included using different numbers of termination nodes as well as how easily a node could split into additional children nodes. The low correlation of the predictors to the predictand limited CART's versatility. While many variations were attempted, none were accurate or effective.

4.1.2. Multiple Linear Regression Schemes (Sounding Only Regression, Sounding and Storm Regression, and Experimental Regression)

[41] Three variants of multiple linear regression [Chambers and Hastie, 1992; Gardner et al., 1995; Wilks, 2006] were used to select the best combination of predictors for our cessation schemes. These were the experimental regression (ER), sounding only regression (SOR), and sounding and storm regression (SSR). SOR and SSR were developed with the data available to the 45WS forecasters in real time. ER was developed as a “what if” scheme to observe the effect of including parameters not available in real time, such as intracloud flash rate and initiation altitude. Including them is an attempt to capture some information about storm dynamics.

[42] The SPSS software uses a “forward conditional” stepwise selection process with a test for backward elimination in developing the regression model. The first predictor variable selected produces the greatest reduction in the residual sum of squares (or residual deviance (RD)), i.e., the predictor that explains the most variation in the maximum interval. The algorithm next selects the predictor that, together with the first, further reduces the RD by the greatest amount. At each step, the algorithm performs a backward check to determine if the additional predictor causes any previously selected predictor to become insignificant. If this occurs, that predictor is removed. This process continues until the RD can no longer be reduced by a significant amount, or until no predictors remain.
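The forward part of this selection can be illustrated with a simplified, pure-Python sketch. It is not the SPSS procedure: the p value tests are replaced by an assumed stopping rule (stop when the best remaining predictor removes less than a fixed fraction of the current residual sum of squares), and the backward elimination check is omitted. The function names, the threshold, and the synthetic predictors are all hypothetical.

```python
import math


def center(values):
    """Subtract the mean so that the intercept is absorbed."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_select(columns, y, min_fraction=0.05):
    """Greedy forward selection by greatest reduction in residual sum of squares."""
    residual = center(y)
    candidates = {j: center(col) for j, col in enumerate(columns)}
    selected = []
    while candidates:
        rss = dot(residual, residual)
        if rss <= 1e-12:
            break
        # RSS reduction from adding each candidate; candidates are kept
        # orthogonal to the predictors already selected.
        gains = {}
        for j, q in candidates.items():
            qq = dot(q, q)
            if qq > 1e-12:
                gains[j] = dot(residual, q) ** 2 / qq
        if not gains:
            break
        best = max(gains, key=gains.get)
        if gains[best] < min_fraction * rss:   # assumed stopping rule
            break
        q = candidates.pop(best)
        qq = dot(q, q)
        coef = dot(residual, q) / qq
        residual = [r - coef * x for r, x in zip(residual, q)]
        # Remove the selected direction from the remaining candidates.
        for j in list(candidates):
            proj = dot(candidates[j], q) / qq
            candidates[j] = [c - proj * x for c, x in zip(candidates[j], q)]
        selected.append(best)
    return selected


# Synthetic data: the predictand depends on columns 0 and 2 only.
n = 200
x0 = [math.sin(i) for i in range(n)]
x1 = [math.cos(0.7 * i) for i in range(n)]
x2 = [float(i % 7) for i in range(n)]
y = [3.0 * a + 0.5 * c + 0.01 * math.sin(13.7 * i)
     for i, (a, c) in enumerate(zip(x0, x2))]
selected = forward_select([x0, x1, x2], y)
```

Here `selected` lists the chosen column indices in the order they entered the model. With predictors as weakly correlated with the predictand as those in Table 1, a stopping rule of this kind would admit few predictors unless the threshold were relaxed, which parallels the relaxed p value thresholds described in the next paragraph.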

[43] Multiple regression schemes were created for each of the three variants. These schemes were based on adjusting the p value threshold for determining which predictors were chosen for the equations as well as the p value threshold for determining when a predictor should be removed from the regression model. The p value for allowing a predictor to be chosen varied from 0.1 to 0.4 in increments of 0.05. Additionally, the p value for discarding a predictor varied from 0.15 to 0.45. Optimally, the p values should be relatively small, indicating strong choices for the regression model. However, with our data set, the regression variants performed best under less stringent conditions. SOR had the least strict values of 0.40 and 0.45, although this was partly expected due to the sounding only parameters likely having little relevance to lightning cessation later in the day. The SSR and ER variants were better constrained with 0.25 and 0.30 used for the p value thresholds. The discussions below of the three multiple linear regression variants describe the best versions of each scheme. ER is the least accurate of the regression variants that were tested. Its equation (1) is given below with the following definitions. The average interflash time is the median of all times between each flash, storm duration is the length of time between the first and last flash in the storm, and the midlevel LDAR sources are those at heights of 7–9 km.

equation image

Table 2 shows a poor 44% accuracy and an R² value of 0.54. However, ER does produce excellent precision. The median error is only 0.1 min less than the maximum interval. Although ER ends the advisories early, its high precision in forecasting the maximum interval is a positive trait. This suggests that if the non-real-time data could be acquired, efforts should be made to refine this version of our multiple linear regression schemes. Also, ER, the percentile method, and event time trend (both described later), are the only schemes to correctly forecast at least one of the three outlier maximum intervals, while the percentile method is the only scheme to forecast all of the outliers.

[44] The remaining two multiple linear regression schemes are SOR and SSR, given by equations (2) and (3), respectively, where CCL is the convective condensation level.

equation image
equation image

SOR only uses prestorm environmental parameters derived from the morning KXMR sounding. Our objective was to determine if the prestorm environment alone could provide information about how long lightning persists in a thunderstorm. SSR is similar to SOR, except that any of the parameters available to forecasters in real time (Table 1) could be selected. Aside from the possibility to select different predictors, the development of SSR was identical to SOR.

[45] It is useful to discuss the predictors selected for the above three equations. The first term on the right side of the ER equation, the average interflash time, explains the most variance. Although Figure 3 showed a class of storms with a sudden end to lightning activity (dashed line), only 11 of our 116 storms exhibit this trend. These 11 storms require the use of maximum interval to ensure safety. Given ER's high precision, this suggests that average interflash time, should it be available in real time, may be able to account for the three trends seen in Figure 3. It is interesting that while ER could select any predictor, including those not available in real time, the average interflash time was the only non-real-time predictor chosen. Although ER's accuracy is poor, this single predictor markedly improves ER's precision compared to the other schemes (Table 2). This suggests that the average interflash time might serve as a crude indicator of a storm's dynamical and microphysical processes.

[46] Storm duration explains the second most variance in the ER and SSR approaches. It even becomes the linchpin of a separate scheme discussed later. Storm duration gives insight into whether a storm is a short-lived “pulse” storm or part of a multicellular structure or associated with a charged anvil cloud. Storm duration is an instantaneous variable, changing as the storm produces more lightning.

[47] Two parameters shared by the ER and SSR schemes are closely related, the number of midlevel sources (ER) and the average LDAR source height (ER and SSR). The altitude of lightning sources is related to the strength of the storm's updraft. More sources at higher altitudes suggest a vigorous updraft and therefore a storm that is still intensifying or in the mature stage. Thus, cessation is unlikely when altitude values are high. Alternatively, high-altitude sources can come from anvil lightning, but this can be discerned from lightning in a convective cell by using radar observations.

[48] It is interesting to note that both ER and SSR include a stability parameter, MUCAPE and best lifted index, whereas SOR does not. SOR uses less direct measures related to stability, the convective condensation level (CCL) and the height of the −40°C isotherm. ER and SSR, which have more dynamic predictors, can “afford” to include a less useful stability parameter. Strong instability leads to stronger updrafts and lightning activity, but is less useful for cessation.

[49] SOR and SSR both share the shear predictor through 6 km, while no similar parameter exists in ER, possibly due to similar information being embedded within the average interflash time predictor. Although 6 km shear ranks last in both SOR and SSR, it is a reasonable choice. With the appropriate amount of shear, a storm can develop a tilted updraft that will not “rain out” as quickly. This provides a more intense updraft and a better opportunity for the storm to rise above the freezing level with more hydrometeors available for charging. The fact that SSR contains the maximum vertically integrated liquid predictor supports this hypothesis. Additionally, SOR curiously selects the wind direction predictor. This may occur because storm development in central Florida is governed by the sea breeze during the warm season months. An easterly (onshore) wind at CCAFS/KSC generally leads to weaker storms, while a westerly wind enhances the east coast sea breeze front and the probability of stronger thunderstorm development in the area [Arritt, 1993].

[50] Finally, ER and SSR share one last similar predictor, CG rate and total CG, respectively. These parameters were not expected to be selected since it was assumed that the LDAR observations would provide more information.

[51] To our surprise, SOR and SSR tie in accuracy (Table 2), both yielding 75%. However, it is no surprise that SOR yields poor results since there was little expectation that the prestorm environment would provide much information about cessation within a specific, future storm. We had expected that allowing SSR to select any of the candidate predictors would provide improved forecasts. When comparing the underforecast and overforecast errors of both regression variants, neither is promising. SSR only improves SOR's R² value (0.295 versus 0.08). These values suggest that neither scheme can produce safe forecasts of lightning cessation.

[52] The results indicate that all of the predictors in Table 1, while being the best available, are poorly correlated to the maximum interval. Thus, no combination of predictors, whether in the prestorm environment or during the storm, has a significant chance of safely predicting lightning cessation. Too much important information, such as microphysical activity, is not available. This may explain why ER yields greater precision since its selection of average interflash time may parameterize this information in some way. In summary, the parameters available for the regression schemes are not sufficient for forecasting cessation.

4.1.3. Event Time Trend

[53] Given the poor performance of the CART analysis and all three regression schemes, we devised several other methods as described in section 3. The event time trend (ETT) was developed because storm duration (time from the first to last lightning activity) was selected in the CART, ER, and SSR procedures. Operationally, a forecaster would have to update the forecast as the storm persisted over longer times. Several trend lines relating storm duration to maximum interval were developed (not shown). Equation (4) describes the most successful version, where durations are given in minutes.

equation image

In spite of its simplicity, ETT produces 81% accuracy. Also, ETT, along with ER and PM, are the only procedures that correctly predict one of the three outlier maximum interval events in the independent data set. The storm duration predictor in (4) provides some insight into the nature of the storm, including broad assumptions about its microphysical structure. Small durations are associated with short-lived pulse storms with a brief charging period, while long durations indicate multicellular storms with greater charging or storms with a long-lived, charged anvil. ETT initially appears to be the most balanced between accuracy and precision. That is, it gives forecasters a modest level of confidence that cessation has occurred, while simultaneously providing some precision by not greatly overforecasting the maximum interval. This is somewhat misleading since ETT produces a few large underforecast errors that counteract larger overforecast errors. The underforecast errors are partly explained by a scatterplot between storm duration and maximum interval (not shown) that shows storm duration to be poorly correlated with maximum interval. Although ETT successfully predicts one outlier, most of its success is due to the majority of maximum intervals being small.

4.1.4. Lag Time From the Storm's Maximum Height of the Maximum dBZ to the Last Flash (MZM)

[54] Since the three maximum interval schemes described above fail to predict all three outliers, we reconsidered our decision to use the maximum interval as the predictand. This reconsideration led to the maximum height of the greatest dBZ (MZM) scheme, which developed a cubic relation between the maximum height of the storm's maximum dBZ value and the time to the last flash (Figure 8, equation (5)). MZM is the only scheme that explicitly attempts to forecast a storm's last flash. It utilizes the time delay from when the greatest reflectivity core reaches its highest altitude to the time of the last flash. Equation (5) describes the period of time to wait for additional lightning to occur after the most recent flash. A drawback to MZM is that it must be recalculated when the greatest dBZ value or its height changes, much like ETT with the instantaneous storm duration. Similar approaches were attempted using the number of CG strikes, IC flash rate, and the percentage of CG to IC flashes (not shown). However, none of the individual parameters had the forecast utility of MZM.

equation image
Figure 8.

Scatterplot of the greatest height obtained by the storm's maximum dBZ (kilometers) versus the time to the last flash (minutes). A cubic trend line is superimposed.

[55] A perceived advantage of the MZM scheme is that it attempts to include storm dynamics by utilizing radar data. Relatively intense storms have stronger updrafts [e.g., Byers and Braham, 1949], and considerable previous research has studied updraft characteristics, including their size [Auer and Marwitz, 1968], vertical velocity [Battan and Theiss, 1970; Marwitz, 1973; LeMone and Zipser, 1980; Xu and Randall, 2001], and temperature [Davies-Jones and Henderson, 1973]. Furthermore, the electrification process is linked to the storm's updraft [Gunn, 1956; Paluch and Sartor, 1973; Stolzenburg et al., 1998]. These studies indicate that a storm's updraft is an integral factor in producing lightning and that much of the lightning is contained within this core region [Carey and Rutledge, 1998]. We hoped that using the lag time approach would address the problem of forecasting the greatest maximum interval storms (the outliers). Since lag time is unrelated to maximum interval, MZM might be able to discern differences between the independent storms (Table 2).

[56] The MZM scheme is partially successful (Table 2). Its accuracy of 88% makes it the second most accurate scheme. Its underforecast error also is small, with the median error only 3.6 min, making it one of the most precise schemes. The largest underforecast error is only 6.5 min, which is half that of the next closest scheme, aside from the Percentile Method described next. The trade-off for good accuracy (i.e., correctly ending an advisory at or after cessation) and small underforecast error is a large median overforecast time of 12 min and the greatest overall overforecast of 44 min (Table 2). Thus, the MZM scheme provides high confidence that an advisory will be canceled safely after lightning cessation. However, the time savings over current schemes is minimal.

4.1.5. Percentile Method (PM)

[57] The final scheme tested, the Percentile Method (PM), also is the simplest. A scatterplot of maximum intervals for the 100 dependent storms was prepared and then divided into percentiles (Figure 9). The 16 independent storms then were verified against these percentile values. Figure 9 clearly shows why the previous schemes poorly forecast the outlier events. Since most of the dependent storms had a maximum interval less than 10 min, the schemes were skewed to underpredict the outliers.

Figure 9.

Maximum intervals (minutes) for the 100 dependent thunderstorms. The corresponding 50th (4.17 min, solid line), 75th (7.5 min, long-dashed line), 95th (15 min, short-dashed line), and 99.5th (25 min, dotted line) percentiles are superimposed.

[58] When applied to the 16 independent storms, the 99.5th percentile is the most successful, always ending lightning advisories after cessation, and not before. The 95th percentile version performs fairly well, with an accuracy of 88%. The major limitation of PM is that it produces very large time errors. The 99.5th percentile version has a median forecast error of 21.2 min, i.e., waiting 21.2 min too long to end an advisory. However, PM is the only method to correctly wait for cessation to occur in all 16 independent storms, including the outliers.
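The percentile method can be sketched in a few lines. This is a minimal illustration with invented maximum intervals, not the 100 dependent storms; the linear interpolation rule used for the percentile is an assumption, since the text does not state how the percentile values in Figure 9 were computed or rounded, and the function names are hypothetical.

```python
def percentile(values, pct):
    """Percentile by linear interpolation between ordered values (assumed rule)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def pm_accuracy(wait_min, independent_max_intervals):
    """Percent of independent storms whose maximum interval does not exceed the wait."""
    safe = sum(1 for m in independent_max_intervals if m <= wait_min)
    return 100.0 * safe / len(independent_max_intervals)


# Hypothetical maximum intervals (minutes), skewed toward small values.
dependent = [1, 2, 3, 4, 5, 6, 7, 8, 10, 16, 28]
independent = [3, 5, 11, 22]

wait_50 = percentile(dependent, 50)
wait_90 = percentile(dependent, 90)
accuracy_50 = pm_accuracy(wait_50, independent)
accuracy_90 = pm_accuracy(wait_90, independent)
```

Raising the percentile lengthens the single wait time applied to every storm and raises the fraction of independent storms handled safely, at the cost of larger overforecast errors for the many storms with short maximum intervals.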

[59] PM's large overforecast errors are not desirable, but the scheme does have several desirable characteristics. First, it is simple to implement. It does not require monitoring individual storms to obtain input for an equation, and it does not have to be recalculated as the storm evolves in time (e.g., ETT and MZM). The forecaster simply selects which percentile to use and applies it to all storms.

[60] This leads directly to another excellent trait, flexibility. PM can be adjusted for risk. If the safety risk is not stringent (i.e., when people outdoors are not involved), a lower percentile can be used to reduce the length of lightning advisories. However, when personnel safety is involved, the 99.5th percentile version offers excellent confidence that lightning has ended. Additionally, since the scheme uses one value every time without calculations, it can be used during morning planning for afternoon activities. Lastly, more storms can easily be added in the future to create a more robust data set.

4.2. Bootstrap Analysis

[61] We next performed a bootstrap analysis [Efron and Tibshirani, 1993] to determine if the verification scores for the original 16 independent storms would change appreciably if the composition of the dependent and independent data sets changed. Ten different groups of dependent and independent storms were created. Four groups had 105 dependent storms with 11 independent storms, while the remaining six groups had 104 dependent storms with 12 independent storms. Using these ten new groups, each cessation scheme was recalculated and evaluated.
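One way to produce ten groups with these sizes is sketched below. The independent group sizes quoted above (four groups of 11 and six groups of 12) sum to 116, which is consistent with each storm serving once as an independent case, but the text does not state that the independent groups were disjoint. The sketch therefore assumes a shuffled partition; the function name, the seed, and the integer storm identifiers are hypothetical.

```python
import random


def make_groups(storm_ids, n_groups=10, seed=0):
    """Split storms into (dependent, independent) pairs, one pair per group.

    Assumption: the independent sets are disjoint and together cover all storms.
    """
    ids = list(storm_ids)
    random.Random(seed).shuffle(ids)
    base, extra = divmod(len(ids), n_groups)
    groups = []
    start = 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        independent = ids[start:start + size]
        dependent = ids[:start] + ids[start + size:]
        groups.append((dependent, independent))
        start += size
    return groups


groups = make_groups(range(116))
independent_sizes = sorted(len(ind) for _, ind in groups)
```

Each cessation scheme would then be refit on the dependent storms of a group and verified on that group's independent storms, so the placement of the few outlier storms changes from group to group.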

[62] Results of the bootstrap-derived equations (not shown) confirmed our concerns about the utility of the originally derived cessation schemes. Our original 16 independent storms had three outlier storms. This was a favorable configuration; yet aside from PM, the forecast results were poor. When we performed the bootstrap analysis, the results of each trial showed little improvement and more often, a decrease in utility compared to the original versions. Much of the decrease in utility was due to the placement of the six outlier storms in the 116 storm data set, and whether they were in the dependent or independent data sets. In all but the PM scheme, the bootstrap results were worse. The bootstrap variations of PM were still affected by outlier placement, but as a whole, maintained the accuracy of the original PM scheme. When the majority of the outliers were in the dependent data sample, the verification exhibited modest improvements. The opposite was true if most of the outliers were in the independent group.

[63] The 99.5th percentile version of the reconstituted PM remains steady across all 10 bootstrap groups. The only exception is when the most extreme outlier of ∼27 min is included in the independent group. This produces smaller wait times for each percentile and causes the PM to miss both the 27 and 23.2 min outliers. In the nine other variations, the 27 min outlier is located in the dependent data set, which increases PM's forecast guidance time and improves its accuracy, albeit with low precision due to large overforecast errors.

[64] Lightning cessation is difficult to forecast, and how to deal with outliers is a major issue. Although the outliers could be dismissed as irregularities in the data set, they present the greatest threat to safety. We looked for any identifying feature in the outlier storms. However, the outliers had no unifying characteristic based on the data available since they consisted of both severe and nonsevere storms as well as short- and long-lived storms. Since safety overrides all other concerns, maintaining that safety means that we must overdesign our schemes so that the reasonably successful ones, especially PM, achieve their success by providing only a small time savings over existing 45WS schemes.

[65] As noted earlier, an important point is whether our 116 storms are a representative sample of Florida storms. Figures 5 and 6 show the distribution of the maximum interval in our 100 dependent and 16 independent storms, respectively. The issue with outliers is that most of our storms have a maximum interval of ∼7 min. This biases any scheme toward forecasting a small maximum interval. Our selection process had to omit several stronger and longer-lived storms because it was unclear which storm produced what lightning. Thus, we could not determine storm cessation. We reemphasize that storms were excluded because they were not isolated, not because they were multicellular or severe. This ensured that lightning flashes could be assigned to the correct storm. A larger storm data set that contains more intense storms is needed to fully evaluate statistical approaches to cessation.

[66] Discussions with the 45WS about their real-time operations indicate that little emphasis is placed on storm type, but rather the likelihood of a storm to redevelop. With respect to a lightning advisory, the 45WS forecasters make no distinction between a flash created by a severe thunderstorm or a nonsevere storm. Thus, even if a storm produced 50,000 CG strikes in one hour, it would require the same lightning advisory as a storm with one strike. The primary reason a forecaster would be concerned about a storm being multicellular is that the threat of redevelopment exists and thus lightning could continue to occur. The threat of redevelopment can be assessed from radar observations and forecaster experience. Should a forecaster deem a storm to be in a redevelopment cycle, the cessation characteristics developed here would be invalid. Conversely, once a storm no longer shows signs of redevelopment, our cessation scheme can be applied.

[67] In summary, the bootstrap analysis showed that all of the lightning cessation schemes, except PM, exhibited too much uncertainty to be safely implemented. The dependence on whether outlier events were included in the dependent or independent groups was evident by the widely varying accuracies. The results further reinforce the finding that the schemes are too biased toward the large number of small maximum interval storms.

4.3. The 45th Weather Squadron Verification

[68] The 45WS has used the end of lightning onset radar thresholds [Roeder and Pinder, 1998] to end lightning warnings, similar to that proposed by Wolf [2006]. However, the amount of time to wait until cessation remained problematic. Likewise, the identification of end of storm oscillations in electric field mill data [Marshall et al., 2009] is difficult in an operational environment, especially since a timeline of the field mill data is not readily available. In addition, the best way to use the end of storm oscillation to assess lightning cessation is not known. Should one wait a certain time after onset, wait for a certain feature to occur such as second zero crossing, or wait for other features? With these questions, a statistical approach has operational promise.

[69] Our final analysis compares results from the original schemes described in section 4.1 against those based on actual 45WS procedures. The 45WS has implemented a three-step method to forecast lightning cessation operationally adapted from the PM scheme. This procedure is summarized in Table 3. If there has been no lightning from a cellular thunderstorm for 15 min, forecasters consider canceling the lightning advisory unless redevelopment of the cell is expected, for example, if interaction with a low-level boundary will occur. The wait time before considering canceling a lightning advisory is increased to 30 min or more if thick anvil or debris clouds from the thunderstorm are in or near the lightning advisory area. This time depends on the thickness of the anvil cloud and the previous lightning flash rate. Lastly, if the thick anvil or debris cloud is over the 45WS electric field network, and all field mills under and next to the cloud indicate less than 1000 V m−1, forecasters consider canceling the lightning advisory 20 min after the last lightning flash. This method is based on 45WS experience where the few cases with times between lightning flashes longer than 15 min usually were due to thick anvil or debris clouds. Thus, stratifying by isolated cellular thunderstorms and thick anvil or debris cloud provides the best combination of ending lightning advisories as soon as possible while maintaining safety for the cases where longer wait times are needed.

Table 3. A Summary of the Three-Step Method Used by the 45WS to Determine How Long to Wait From the Previous Flash to Consider Ending a Lightning Advisory Based on the Results of the PM Scheme^a

Wait Time Since Last Flash (CG or IC) (min) | Condition to Consider Canceling Advisory
15 | isolated thunderstorms with no redevelopment expected; extend if redevelopment expected
20 | thick anvil or debris cloud from parent storm present over field mill network, and all field mills under and adjacent to clouds are <1000 V m−1
30+ | thick anvil or debris cloud from parent storm present; extend based on cloud thickness and amount of previous lightning activity

^a Flashes are cloud to ground or intracloud.
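The three-step rule in Table 3 can be read as a small decision procedure. The following is a minimal sketch only, not the 45WS implementation: the `StormState` fields, the function names, and the treatment of the 30 min value as a fixed minimum are all assumptions made for illustration. In practice the 30 min wait is extended by forecaster judgment, and expected redevelopment is treated here as blocking cancellation in every branch, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold taken from Table 3; everything else below is hypothetical naming.
FIELD_MILL_LIMIT_V_PER_M = 1000.0


@dataclass
class StormState:
    """Hypothetical snapshot of what a forecaster knows about one storm."""
    minutes_since_last_flash: float          # last flash, CG or IC
    thick_anvil_or_debris: bool              # thick anvil/debris cloud in or near the advisory area
    cloud_over_field_mills: bool             # that cloud lies over the field mill network
    max_field_mill_v_per_m: Optional[float] = None  # largest reading under/adjacent to the cloud
    redevelopment_expected: bool = False


def required_wait_minutes(state: StormState) -> float:
    """Minimum wait (min) since the last flash, following the rows of Table 3."""
    if state.thick_anvil_or_debris:
        mills_quiet = (
            state.cloud_over_field_mills
            and state.max_field_mill_v_per_m is not None
            and abs(state.max_field_mill_v_per_m) < FIELD_MILL_LIMIT_V_PER_M
        )
        # 20 min when all field mills are quiet; otherwise 30 min or more.
        return 20.0 if mills_quiet else 30.0
    return 15.0  # isolated cellular thunderstorm


def may_consider_canceling(state: StormState) -> bool:
    """True when the minimum wait has elapsed and no redevelopment is expected."""
    if state.redevelopment_expected:
        return False
    return state.minutes_since_last_flash >= required_wait_minutes(state)
```

The function returns only whether cancellation may be considered; the final decision remains with the forecaster, as in the procedure described above.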

[70] At first glance, Figure 10 suggests that PM would be the least favorable scheme for the 45WS to use since it provides little or no time savings. However, each of the other schemes has a serious flaw: its accuracy is unacceptably low, which leaves little confidence that cessation has truly occurred. The greatly improved wait times of these schemes are due to the bias toward our large number of storms with small maximum intervals. This bias gives excellent time savings on forecasts, but little confidence as to whether cessation has actually occurred. Thus, implementing these schemes would provide less safety than the current 45WS procedures, and safety is the paramount concern for 45WS lightning advisories. The PM provides accurate, objective guidance for wait times since forecasters know that it will safely estimate that lightning will have ended in a storm by waiting 15 min after the last observed flash in most cases. The PM also requires no calculations. This is particularly useful when a large number of electrically active storms must be monitored, as often happens during the summer. PM provides a single wait time that can be applied to all storms of the day, yielding a 5 min time savings over the 30 min wait time that sometimes is used by the 45WS. None of our 116 storm samples exhibited a maximum interval greater than 30 min. However, one should note that negative savings occur if forecasters are confident with wait times less than 30 min, although this is rarely the case. Indeed, the original motivation for this research was that 45WS forecasters often exceeded the original recommended wait times because the performance of those wait times was undocumented, which resulted in a lack of confidence.

Figure 10.

Comparison of the time (minutes) saved by each scheme compared to standard 45WS wait times (gray, 15 min; white, 20 min; and black, 30 min). Each scheme's accuracy is indicated at the top of the bars. Positive values indicate that a scheme's wait time is shorter than that of the 45WS, while negative values indicate that the scheme waits longer than the 45WS procedure.

[71] Figure 11 shows an additional benefit of this operational version of the PM scheme. It contains the best fit curve to wait times versus the probability that lightning will occur after the previous flash. The R² value is 0.9788. In an operational setting, this is extremely useful, since the forecaster has the option to choose the desired level of safety. This allows shortened lightning advisories for less critical operations. Equation (6) describes the time to wait in min (X) since the previous flash based on the desired probability of assurance that no more lightning will occur (P). The actual output in Figure 11 gives the probability that another flash of lightning will occur based on the given wait time.

[Equation (6): equation image not reproduced]

Although these initial results are encouraging, there certainly is room for improvement. First, a larger data set with a wider sample of storms as well as more real-time parameters from LDAR can be developed. This includes a storm selection process that will yield more multicellular storms. Second, the ability to merge raw lightning source observations into a flash in real time was unavailable to the authors and the 45WS. However, faster algorithms now are available which could allow new parameters, such as interflash time, to be included in the analyses. In addition, the relation between lightning cessation and additional radar parameters such as reflectivity values at particular isotherm levels should be explored [e.g., Wolf, 2006]. In particular, cloud microphysical processes during a storm must be examined to determine hydrometeor configurations near the end of a storm's lifetime. This will require polarimetric radar data, which are now available at CCAFS/KSC. Finally, improved displays of the 45WS electric field mill data, especially timelines and automated algorithms, might allow use of end of storm oscillations to help forecast lightning cessation.

Figure 11.

The probability that additional lightning will occur after waiting the specified time (minutes) after the previous lightning flash. This trend allows 45WS forecasters to determine the level of safety and is based on the operational version of the PM scheme.
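The idea behind Figure 11 can be illustrated with a purely empirical lookup: given the maximum interflash interval of each storm, the probability of another flash after waiting a given time is estimated as the fraction of storms whose maximum interval exceeds that time. The sketch below is an illustration under that assumption and is not the fitted curve described by Equation (6). The function names and the sample intervals are hypothetical and are not values from the 116 storm data set.

```python
import math


def prob_flash_after_wait(max_intervals_min, wait_min):
    """Fraction of storms whose maximum interflash interval exceeds wait_min."""
    if not max_intervals_min:
        raise ValueError("need at least one storm")
    exceed = sum(1 for m in max_intervals_min if m > wait_min)
    return exceed / len(max_intervals_min)


def wait_for_assurance(max_intervals_min, assurance):
    """Smallest observed interval covering at least `assurance` (0-1) of the storms."""
    if not 0.0 < assurance <= 1.0:
        raise ValueError("assurance must be in (0, 1]")
    if not max_intervals_min:
        raise ValueError("need at least one storm")
    ordered = sorted(max_intervals_min)
    k = math.ceil(assurance * len(ordered))  # number of storms that must be covered
    return ordered[k - 1]


# Hypothetical maximum intervals (min) for ten storms, for illustration only.
sample = [2.0, 3.5, 4.0, 5.0, 6.5, 7.0, 9.0, 11.0, 15.0, 22.0]
print(prob_flash_after_wait(sample, 10.0))  # 3 of 10 storms exceed 10 min
print(wait_for_assurance(sample, 0.95))     # wait covering 95% of these storms
```

A forecaster supporting a less critical operation could call `wait_for_assurance` with a lower assurance level and obtain a shorter wait, which mirrors the trade-off the figure is meant to expose.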

[72] A final point should be made about the ER multiple regression scheme that uses predictors currently not available to the 45WS forecasters in real time. ER had the worst accuracy of any scheme (44%, Figure 10), but it was very precise with a median forecast error of only −0.1 min. That is, while ER did not wait the appropriate time for cessation, it usually forecast the proper maximum interval time. We demonstrate this by adding 2 min and 5 min adjustments to ER's final results. With 2 min added, ER's accuracy increases to 81%, and with a 5 min adjustment, accuracy increases to 94%. Obviously, using an adjustment value is not a safe way to develop certainty in knowing that lightning has ended. However, it points to the high precision of the scheme. This suggests that when predictors such as those used by ER become available in real time, some form of the ER scheme could be a way to dynamically forecast the wait for lightning cessation. This could be a powerful tool when combined with the PM scheme. PM could provide a standard “climatological” value to use for lightning cessation, while a real-time ER scheme could be used to make specific, dynamically based cessation forecasts for individual storms.
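The distinction between accuracy and precision drawn above can be made concrete with a short sketch. It assumes that a forecast counts as accurate when the forecast wait is at least as long as the observed maximum interval, and that forecast error is forecast minus observed (so negative values are underforecasts). These definitions, the function names, and the sample values are assumptions for illustration and are not ER output.

```python
from statistics import median


def accuracy(forecast_min, observed_min, adjustment_min=0.0):
    """Fraction of storms where forecast + adjustment covers the observed maximum interval."""
    pairs = list(zip(forecast_min, observed_min))
    if not pairs:
        raise ValueError("need at least one storm")
    hits = sum(1 for f, o in pairs if f + adjustment_min >= o)
    return hits / len(pairs)


def median_error(forecast_min, observed_min):
    """Median of (forecast - observed); negative means the scheme underforecasts."""
    return median(f - o for f, o in zip(forecast_min, observed_min))


# Hypothetical forecasts and observed maximum intervals (min), illustration only.
forecast = [4.0, 6.0, 8.0, 3.0, 10.0]
observed = [4.5, 5.0, 9.5, 3.5, 16.0]

print(median_error(forecast, observed))   # small negative bias
print(accuracy(forecast, observed))       # low accuracy without adjustment
print(accuracy(forecast, observed, 2.0))  # a small adjustment recovers most storms
```

In this toy sample a scheme with a small negative median error misses most storms by a narrow margin, so a modest additive adjustment raises accuracy sharply while the single outlier storm remains uncovered. This is the same pattern reported for ER.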

5. Conclusions

[73] The goal of this research has been to develop statistically and empirically derived lightning cessation guidance that could be implemented by the 45WS at CCAFS/KSC in Florida. The predictors that we considered were based on LDAR observations, radiosonde data, and some radar-derived values. Our data set consisted of 116 thunderstorms in which lightning flashes could be clearly related to a particular storm from May–September 2000–2005. These 116 storms occurred on 32 separate days and produced approximately 17 thousand flashes. We first investigated several aspects of our 116 storms that have been seldom described in previous studies. The time delay between the first IC and first CG strike for our Florida storms (∼5–10 min) was similar to that of Williams et al. [1989] and MacGorman et al. [1989] in other areas of the country. We also determined whether the first and last lightning activity was IC or CG. For our 116 storms, 12 initiated with a CG strike (10%) and 19 ended with a CG strike (16%). Finally, we documented the total number of LDAR sources, source heights, and flash initiation heights for our sample of storms.

[74] Five schemes for producing cessation guidance used data available in real time, while a variant multiple linear regression scheme included non-real-time data. Only one scheme, the percentile method (PM), was found to have appreciable utility. Aside from PM, the other schemes, including two multiple regression variants (sounding only regression (SOR) and sounding and storm regression (SSR)), event time trend (ETT), maximum height of the greatest dBZ (MZM), and the non-real-time multiple regression experimental regression (ER), could not account for outlier maximum interval events. These schemes produced a time savings compared to the 45WS wait times, but at the cost of poor accuracies. A bootstrap analysis was performed, and only PM maintained its ability to accurately forecast cessation. The other schemes varied greatly in accuracy based on the placement of outlier storms in either the independent or dependent data sets. As a result, no scheme besides PM would provide the 45WS forecasters any confidence that lightning activity had truly ceased within a given thunderstorm.
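The sensitivity to outlier placement described above can be explored with repeated random re-partitioning of the storms into dependent and independent sets. The sketch below is a resampling check in the spirit of the bootstrap analysis; the exact procedure used in this study is not specified here, and the function names, the default percentile, and the trial count are hypothetical choices.

```python
import math
import random


def percentile_wait(max_intervals_min, percentile):
    """Smallest observed interval covering `percentile` (0-100) of the storms."""
    ordered = sorted(max_intervals_min)
    k = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    return ordered[k - 1]


def resampled_accuracies(max_intervals_min, n_dependent,
                         percentile=95.0, n_trials=1000, seed=0):
    """Accuracy on the independent storms for each random dependent/independent split."""
    if not 0 < n_dependent < len(max_intervals_min):
        raise ValueError("n_dependent must leave at least one independent storm")
    rng = random.Random(seed)          # seeded so the sketch is reproducible
    storms = list(max_intervals_min)
    results = []
    for _ in range(n_trials):
        rng.shuffle(storms)
        dependent = storms[:n_dependent]
        independent = storms[n_dependent:]
        wait = percentile_wait(dependent, percentile)   # wait derived from dependent set
        hits = sum(1 for m in independent if m <= wait)  # storms safely covered
        results.append(hits / len(independent))
    return results
```

A wide spread in the returned accuracies indicates that a scheme depends heavily on whether the few long-interval storms fall in the dependent or the independent set.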

[75] The results showed a relation between accuracy and overforecast errors of cessation wait times. As the accuracy of a scheme increased, the median overforecast time errors also increased (Table 2). This can be attributed to the distribution of the maximum interval between flashes in our 116 storm data set. Seventy-five percent of the storms had a maximum interval less than 7.5 min (Figures 5 and 6). The most common forecast errors occurred with the three longest maximum interval storms (11.1, 22, and 23.2 min, the outliers) in our independent data set. Thus, the schemes exhibiting high accuracies usually forecast longer intervals in order to capture these outliers. This inevitably increased the overforecast errors.

[76] It is important to note that the percentile method scheme was successful at accurately waiting until cessation had occurred before ending a lightning advisory. However, the reader should be aware of several points. First, the schemes are valid only for warm season Florida storms occurring from May through September. Additionally, the results were derived from isolated thunderstorms, which limited the data set to 116 storms. While this data set is greater than those of all previous cessation studies combined, a larger data set is needed to provide a greater cross section of storms. Lastly, while PM is accurate (i.e., waiting for cessation to occur), it is not the most precise scheme. PM often achieves its accuracy by overforecasting the actual wait time until lightning cessation.

[77] While the ER multiple regression scheme was highly inaccurate, it produced very precise forecasts of the maximum interval with a median forecast error of only −0.1 min. When a 2 min or 5 min adjustment factor was added to ER's results, its accuracy increased from 44% to 81% and 94%, respectively. Thus, ER was the only scheme that came close to actually predicting the exact maximum interval and not simply predicting a large maximum interval, like PM. This would provide some benefit to the 45WS forecasters if the needed predictors could be calculated in real time. Thus, it would be possible to use the highly accurate PM scheme to provide a general maximum interval wait time for all storms, while an operational ER scheme could provide maximum interval times for a specific storm. Combined, these two schemes could reduce the time that a lightning advisory must be maintained, and provide confidence that the last lightning flash has indeed occurred.

[78] At the time of this writing, the 45WS had adapted the PM scheme into its daily operations, starting during the summer of 2008 in an evaluation mode. The 45WS previously lacked objective guidance to confidently end advisories. As a result, the advisories were maintained longer than necessary. The PM scheme provides the forecasters objective information about safely ending an advisory. The 116 storms in this research show that the maximum interval between flashes typically is less than 10 min, with no storm having a maximum interval greater than 30 min. The PM scheme indicates that the upper limit for a lightning advisory is ∼25 min, barring other observations that indicate storm redevelopment or possible charged anvil or debris clouds. With this information the 45WS has modified the PM based on experience and anecdotal observations. This anecdotal review has resulted in the provisional rule in Table 3. This provisional rule, combined with the 45WS' own experience, has shortened advisories by an average of ∼5–10 min. This is a 22% improvement over the original advisories. This time reduction can provide a large cost savings when summed over all CCAFS/KSC outdoor workers yearly. Improved understanding of lightning cessation will have great economic and societal benefits at many locations besides CCAFS/KSC, and is a topic that deserves additional research.


[79] Special thanks go to Todd McNamara of the 45th Weather Squadron for his help with several technical aspects of this research, along with his operational knowledge. We also thank Lee Nelson for the use of his flash creation algorithm. Additional thanks go to the numerous individuals who provided critiques and suggestions for improving this work at the 1st International Lightning Meteorology Conference (ILMC) in Tucson, Arizona, in 2006. This research was sponsored by NASA's Innovative Partner's Program under grant NNK06EB17G.