### Abstract

[1] Very low frequency (VLF) (3–30 kHz) and extremely low frequency (ELF) (3–3000 Hz) electromagnetic transient signals and noise are generated by various natural and anthropogenic processes. On a global basis, by far the most significant source is ELF/VLF radiation from lightning propagating in the Earth-ionosphere waveguide. This atmospheric “noise,” originating essentially from lightning discharges, is the main source of interference for VLF/LF telecommunications. One of the statistical measures used to define the properties of low-frequency radio noise is the voltage deviation *V*_{d}, a measure of the impulsiveness of the noise that is widely used to characterize radio noise, particularly in the International Radio Consultative Committee reports. In this paper we present atmospheric noise statistics based on VLF measurements at different temporal resolutions (from minutes to seasonal variations). For the first time we present an analysis of the statistical parameters of *V*_{d} from continuous broadband VLF measurements for a period extending more than 1 year. Our analysis shows that the long-term observed *V*_{d} characteristics can be reasonably estimated as the sum of two Gaussian distribution functions, while the hourly and seasonal distributions of *V*_{d} values can each be fitted using a single Gaussian distribution with different mean and variance values.

### 1. Introduction

[2] Very low frequency (VLF) (3–30 kHz) and extremely low frequency (ELF) (3–3000 Hz) electromagnetic waves are generated by various natural and anthropogenic processes in the lower atmosphere. On a global basis, by far the most significant source is ELF/VLF radiation from lightning propagating in the Earth-ionosphere waveguide [*Rodger and McCormick*, 2006]. Worldwide lightning activity is produced by about 2000 active thunderstorms around the globe [*Ogawa et al.*, 1966], generating dozens of lightning discharges per second. *Christian et al.* [2003] estimated the global flash rate on the basis of satellite observations to be ∼50 flashes per second.

[3] The study of the performance of longwave strategic communication systems under average conditions is a well-understood task [*Herman et al.*, 1983; *Adlard et al.*, 1999], and a number of computational tools exist for such purposes [*Hall*, 1966; *Warber and Shearer*, 1994; *Fieve et al.*, 2007]. Since such communication systems are atmospheric noise limited, knowledge of the ambient noise level as a function of time of day and day of year is very important [*Warber*, 1998]. A number of statistical measures are used to define the properties of low-frequency radio noise. The most common of these quantities are the average amplitude [*Bowen et al.*, 1992]; the voltage deviation *V*_{d} [*Fieve et al.*, 2007], which is widely used to characterize radio noise, particularly in the CCIR reports [*CCIR*, 1964, 1978, 1988] (the descriptive term has lost significance, but *V*_{d}, being a measure of the impulsiveness of the noise, is still a very useful quantity); the antenna noise factor *F*_{a} [*Herman et al.*, 1983]; and the amplitude probability distributions, or APDs, which give important information about the amplitudes that can be expected [*Fraser-Smith*, 1995]. Although they are less often quoted, statistical distributions of the atmospheric noise envelope have considerable practical application. This is because weak radio transmissions can often be easily detected between sferic occurrences, while they may be completely lost while the sferics are occurring. A communication system designed with redundancy based on the statistics of the time between pulses may provide adequate information even during times that would be considered very noisy [*Fraser-Smith*, 1995]. In this study, we present high temporal resolution measurements of the *V*_{d} parameter extending for more than 1 year, allowing us to analyze hourly and seasonal local noise levels.
Furthermore, first-order noise statistics, such as the *V*_{d} statistics presented here, can be used as inputs for higher-order statistics such as the pulse duration distributions and pulse spacing distributions, which are related to receiving system performance effects and can be extracted from various models (such as the truncated Hall model, or other nonlinear techniques such as hard limiting, hole punching, soft clipping, and logarithmic correlation) [*Herman et al.*, 1983; *Giordano and Nichols*, 1977] that are widely regarded as reasonable for fitting the first-order noise statistics.

### 2. Observation System and Data

[4] The ELF/VLF instruments used in this study are located at the Desert Research Institute of Ben-Gurion University, at Sde-Boker in the Israeli Negev Desert (30.5°N, 34.4°E), at a remote site with low ambient noise levels, away from industrial or human activity. The station has a highly sensitive VLF receiving system which is intended to be used along with I.G.Y. loop antennas (International Geophysical Year antennas), and is very similar to the antenna at Palmer Station, Antarctica, operated by the VLF group at Stanford University [*Burgess and Inan*, 1993]. The system utilizes two orthogonal triangular loops, each with a baseline of 18 m and a height of 9 m, giving an area of 81 m^{2} for each loop. The loop antenna has an inductance of 65 μH and a resistance of 0.061 Ω. One loop is aligned to receive signals propagating in the north-south direction, and the other to receive signals propagating in the east-west direction. The sensitivity of the system in the broadband range (0.1–50 kHz) is 6 μV/m. The dynamic range of the antenna/preamplifier set is approximately 100 dB, allowing us to detect lightning discharges at distances of several thousand km.

[5] The data are collected and recorded by data acquisition (DAQ) software (provided to us by Stanford University). The primary purpose of the VLF DAQ software is to enable data collection in two formats. The first is broadband, in which the entire data stream is saved as a single long waveform, sampled at 16 bits and 100 kHz for each antenna, which by the Nyquist theorem resolves frequencies up to 50 kHz. These data accumulate quickly, at a rate of about 1.5 GB h^{−1}. Because of this, broadband data are often collected only in “synoptic” format, for example, 1 min out of every 15 min. The second format is narrowband (extracted from the broadband data), in which the amplitude and phase of a single frequency, corresponding to a VLF transmitter, are monitored [*Cohen*, 2006]. The data analyzed in this study are the synoptic format broadband data for the period from February 2007 until February 2008. Each incoming time series is transformed to the frequency domain and then calibrated with the instrument frequency response curve up to 50 kHz.
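The recording figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that both loop antennas are recorded and that file headers and compression are negligible, neither of which is stated in the text. Under those assumptions the raw stream comes to roughly 1.44 × 10^9 bytes per hour, which matches the quoted rate of about 1.5 if that figure is read as gigabytes.

```python
# Sanity check of the recording parameters quoted in the text.
SAMPLE_RATE_HZ = 100_000   # 100 kHz sampling rate (from the text)
BITS_PER_SAMPLE = 16       # 16-bit samples (from the text)
N_CHANNELS = 2             # assumption: both loop antennas are recorded


def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented without aliasing."""
    return sample_rate_hz / 2


def bytes_per_hour(sample_rate_hz, bits_per_sample, n_channels):
    """Raw data volume per hour, ignoring file headers and compression."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * n_channels
    return bytes_per_second * 3600


continuous = bytes_per_hour(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, N_CHANNELS)
synoptic = continuous * 1 / 15   # one minute recorded out of every 15

print(f"Nyquist limit: {nyquist_hz(SAMPLE_RATE_HZ) / 1e3:.0f} kHz")
print(f"Continuous:    {continuous / 1e9:.2f} GB per hour")
print(f"Synoptic:      {synoptic / 1e9:.3f} GB per hour")
```

The synoptic schedule cuts the stored volume by a factor of 15, which is what makes a year-long broadband archive practical.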

### 3. Voltage Deviation *V*_{d}

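[6] The voltage deviation (*V*_{d}, equation (3)) is defined as the ratio (in decibels) of the root-mean-square amplitude (*H*_{rms}, equation (1)) to the average amplitude (*H*_{avg}, equation (2)) of the noise envelope [e.g., *CCIR*, 1964]. For a noise envelope sampled at *N* points *H*_{i},

$$H_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} H_{i}^{2}} \qquad (1)$$

$$H_{avg} = \frac{1}{N}\sum_{i=1}^{N} H_{i} \qquad (2)$$

$$V_{d} = 20\log_{10}\left(\frac{H_{rms}}{H_{avg}}\right) \qquad (3)$$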

It is a useful quantitative measure of the impulsiveness, or “spikiness,” of the noise, due to the well-known fact that the RMS value of a distribution skewed to high values is greater than the average of the distribution [*Fraser-Smith*, 1995]. Values of *V*_{d} typically range from 2 to 3 dB for moderately impulsive noise, increasing to greater than 10 dB for highly impulsive noise [*Huntoon and Giordano*, 1981]. Using this metric it may be seen that a few large sferics will increase the RMS voltage more than they will increase the average voltage, thus raising the *V*_{d} value. A useful point of reference is the value of *V*_{d} for an instantaneous noise voltage whose magnitude follows a Gaussian distribution and whose phase is uniformly distributed. In that case the in-phase and quadrature measurements of the instantaneous voltage are also Gaussian, and the envelope can be calculated as the root-sum-square of the in-phase and quadrature channels; it is therefore always positive and follows a Rayleigh distribution. For Gaussian noise with zero mean (*μ* = 0) and unit standard deviation (*σ* = 1), the mean of the resulting Rayleigh distribution is √(π/2) ≈ 1.253, while the RMS value is √2 ≈ 1.414; thus, *V*_{d} = 1.05 dB [*Leon-Garcia*, 1994; *Boyce et al.*, 2003].
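The Gaussian reference value can be reproduced numerically. The following minimal sketch takes *V*_{d} as 20 log10 of the RMS-to-mean ratio of the envelope, builds a Rayleigh envelope from two independent unit-variance Gaussian channels, and compares the simulated value with the closed form 20 log10(√2 / √(π/2)). The function name, sample count, and random seed are arbitrary choices for illustration.

```python
import math
import random


def voltage_deviation_db(envelope):
    """V_d in dB: 20*log10(RMS / mean) of a non-negative envelope."""
    n = len(envelope)
    rms = math.sqrt(sum(h * h for h in envelope) / n)
    avg = sum(envelope) / n
    return 20.0 * math.log10(rms / avg)


# Closed form for a Rayleigh envelope with sigma = 1:
# RMS = sqrt(2), mean = sqrt(pi / 2).
vd_theory = 20.0 * math.log10(math.sqrt(2.0) / math.sqrt(math.pi / 2.0))

# Simulation: envelope = root-sum-square of in-phase and quadrature channels.
rng = random.Random(0)
envelope = [math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
            for _ in range(200_000)]
vd_sim = voltage_deviation_db(envelope)

print(f"theory:    {vd_theory:.3f} dB")
print(f"simulated: {vd_sim:.3f} dB")
```

Both numbers should sit close to 1.05 dB. Measured values well above that level indicate noise that is more impulsive than Gaussian.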

[7] In order to calculate the *V*_{d} parameter for any given frequency in the broadband range of 0.1–50 kHz, we were obliged to work only with the broadband data set. For each of the recorded broadband files (each day contains 96 files, one every 15 min, over a total of 386 days), we produced a dynamic spectrum using the Matlab specgram function (see http://www.mathworks.com/help/toolbox/signal/spectrogram.html), with the spectrum calculated every 13.3 ms. This 13.3 ms interval, together with the sampling rate of our receiving system (10^{5} samples per second), gives a frequency resolution of 75 Hz, which in turn acts as the effective bandwidth for each of the extracted frequency-dependent *V*_{d} parameters. After the entire frequency range was obtained, we extracted our discrete frequencies (4 kHz, 7.5 kHz, 10 kHz and 26.0 kHz) as a function of time over the 1 min recording. We then calculated the total magnitude of the two channels, i.e., *H*_{total} = √(*H*_{NS}^{2} + *H*_{EW}^{2}), where *H*_{NS} and *H*_{EW} are the amplitudes measured on the north-south and east-west loops, in order to obtain only the noise envelope. Finally, we calculated the *V*_{d} parameter in accordance with equations (1)–(3), which yielded four matrices (one per frequency), each with dimensions of 386 by 96.
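The processing chain described above can be sketched in a few functions. This is not the authors' Matlab code. It replaces the full spectrogram with a rectangular-window, single-bin DFT evaluated in consecutive non-overlapping windows of 1333 samples (about 13.3 ms, hence roughly 75 Hz resolution), and the names `band_amplitude`, `total_envelope`, `h_ns`, and `h_ew` are hypothetical. The windowing and overlap settings used in the original analysis are not given in the text, so those choices are assumptions. The input is a synthetic steady tone rather than recorded data.

```python
import cmath
import math

FS = 100_000     # samples per second (from the text)
WINDOW = 1333    # samples per spectrum: about 13.3 ms, i.e. ~75 Hz resolution


def band_amplitude(samples, freq_hz, fs=FS, window=WINDOW):
    """Amplitude near freq_hz in consecutive, non-overlapping windows.

    Rectangular-window, single-bin DFT (an assumption of this sketch).
    """
    kernel = [cmath.exp(-2j * math.pi * freq_hz * n / fs) for n in range(window)]
    amplitudes = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        acc = sum(x * k for x, k in zip(chunk, kernel))
        amplitudes.append(2.0 * abs(acc) / window)
    return amplitudes


def total_envelope(h_ns, h_ew):
    """Root-sum-square of the two orthogonal loop amplitudes."""
    return [math.hypot(a, b) for a, b in zip(h_ns, h_ew)]


def voltage_deviation_db(envelope):
    """V_d in dB: 20*log10(RMS / mean) of the envelope."""
    n = len(envelope)
    rms = math.sqrt(sum(h * h for h in envelope) / n)
    avg = sum(envelope) / n
    return 20.0 * math.log10(rms / avg)


# Synthetic input: a steady 7.5 kHz tone split between the two loops.
t = [n / FS for n in range(FS // 10)]                       # 0.1 s of data
ns = [0.6 * math.sin(2 * math.pi * 7500 * ti) for ti in t]
ew = [0.8 * math.sin(2 * math.pi * 7500 * ti) for ti in t]

h_total = total_envelope(band_amplitude(ns, 7500), band_amplitude(ew, 7500))
vd = voltage_deviation_db(h_total)
print(f"{len(h_total)} windows, mean H_total = {sum(h_total) / len(h_total):.3f}")
print(f"V_d of a steady tone = {vd:.3f} dB")
```

A constant-amplitude tone gives an envelope with almost no spread, so its *V*_{d} is essentially 0 dB. Applying the same functions to each 1 min file would give one entry of the 386 by 96 matrix per frequency.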

[8] An example of the extracted *H*_{total} raw data vector, for the frequency of 7.5 kHz, is shown in Figure 1 for two different calculated *V*_{d} values, along with a synthetic Gaussian random phase test signal used to confirm our procedure. As mentioned earlier, a few large sferics will increase the RMS voltage more than they will increase the average voltage, thus raising the *V*_{d} value. This can be seen explicitly by comparing the raw data in Figure 1 (top), which contain relatively few spikes above a low mean background but nevertheless yield a higher *V*_{d} value, with those in Figure 1 (bottom), which might look more “spiky” but have a higher mean background level. The values of *V*_{d} for the different files can be shown as a function of time of day and day of year. An example of the *V*_{d} value matrix is given in Figure 2 for the frequency of 7.5 kHz for a period of 77 days. The *V*_{d} values are represented by the color code, where red shows the largest values, along the different days of the year (*y* axis) and hours of the day (*x* axis). The stack plot near the *y* axis shows the daily variability, while the stack plot below the *x* axis shows the hourly variability. The hourly changes in *V*_{d} are much smoother than the daily changes in *V*_{d}. The daily mean *V*_{d} values range from 4.8 to 8.2 dB.
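The comparison between the two panels can be illustrated with made-up numbers. In the sketch below, both envelopes are invented for illustration (arbitrary units, not data from Figure 1). The first has a quiet background with ten large spikes, and the second has a raised background with many moderate spikes.

```python
import math


def voltage_deviation_db(envelope):
    """V_d in dB: 20*log10(RMS / mean) of the envelope."""
    n = len(envelope)
    rms = math.sqrt(sum(h * h for h in envelope) / n)
    avg = sum(envelope) / n
    return 20.0 * math.log10(rms / avg)


# Invented envelopes, 1000 samples each, arbitrary units.
sparse_spikes = [1.0] * 990 + [50.0] * 10    # quiet background, ten large sferics
dense_spikes = [5.0] * 800 + [15.0] * 200    # raised background, many moderate sferics

vd_sparse = voltage_deviation_db(sparse_spikes)
vd_dense = voltage_deviation_db(dense_spikes)
print(f"few large spikes, low background:      V_d = {vd_sparse:.2f} dB")
print(f"many moderate spikes, high background: V_d = {vd_dense:.2f} dB")
```

The sparse case gives the larger *V*_{d} because the squared spike amplitudes dominate the RMS while barely moving the mean. In the dense case the raised background lifts the mean almost as much as the RMS.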

### 4. Statistical Analysis

[9] Figure 3 shows the distribution of *V*_{d} values for the *H*_{total} component as a function of time of day. The blue histograms show the total annual measured counts for a specific frequency-dependent *V*_{d}, along with the normal distribution functions fitted for different periods of the day. The curves plotted in cyan, yellow, red and green are for 00:00–06:00 UT, 06:00–12:00 UT, 12:00–18:00 UT and 18:00–24:00 UT, respectively. The Gaussian fits to the data in these time intervals yield R^{2} values ranging from 0.8396 to 0.906 for 4 kHz, 0.8688 to 0.9357 for 7.5 kHz, 0.9158 to 0.9433 for 10 kHz, and 0.890 to 0.9171 for 26 kHz.
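Splitting the *V*_{d} matrix into the four six-hour UT blocks is a matter of grouping columns. The sketch below assumes that column 0 of each day corresponds to the file starting at 00:00 UT and that missing files are stored as `None`; neither detail is stated in the text. The sample mean and standard deviation it returns are the maximum likelihood parameters of a single Gaussian for each block. The matrix is a tiny synthetic one, not measured data.

```python
import statistics

FILES_PER_DAY = 96   # one synoptic file every 15 min (from the text)
UT_BLOCKS = {
    "00:00-06:00 UT": range(0, 24),
    "06:00-12:00 UT": range(24, 48),
    "12:00-18:00 UT": range(48, 72),
    "18:00-24:00 UT": range(72, 96),
}


def block_statistics(vd_matrix):
    """Mean and standard deviation of V_d per six-hour UT block.

    vd_matrix is a list of days, each a list of FILES_PER_DAY values in dB;
    None marks a missing file (an assumption about how gaps are stored).
    """
    result = {}
    for label, columns in UT_BLOCKS.items():
        values = [day[c] for day in vd_matrix for c in columns
                  if day[c] is not None]
        result[label] = (statistics.fmean(values), statistics.pstdev(values))
    return result


# Tiny synthetic matrix (two days), not measured data.
day_a = [5.0] * 24 + [7.0] * 24 + [4.0] * 24 + [6.0] * 24
day_b = [6.0] * 24 + [9.0] * 24 + [4.0] * 24 + [None] * 24
stats = block_statistics([day_a, day_b])
for label, (mean, std) in stats.items():
    print(f"{label}: mean = {mean:.2f} dB, std = {std:.2f} dB")
```

The same grouping applied to rows instead of columns (days belonging to each season) would give the seasonal parameters.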

[10] Figure 4 shows the same *V*_{d} histograms as a function of season, for the four different frequencies, along with the fitted normal distribution functions plotted in cyan, yellow, red and green for winter, spring, summer and fall, respectively. R^{2} values for the Gaussians fitted to the data in these seasons range from 0.7256 to 0.9696 for 4 kHz, 0.9367 to 0.9616 for 7.5 kHz, 0.9203 to 0.9542 for 10 kHz, and 0.9072 to 0.9809 for 26 kHz.

[11] When considering all the data for the entire year, the statistical analysis of the different frequency bands reveals that the “spikiness” of the noise density can be well estimated by the sum of two Gaussian distribution functions. The observed arithmetic mean and standard deviation values for each of the *V*_{d} distributions, along with the fitted equation, equation parameters, and R^{2} values, are summarized in Table 1. The R^{2} statistic, which measures how successful the fit is in explaining the variation of the observed data, is higher for the sum of two Gaussian distribution functions than for the fitted Poisson, Rayleigh, and lognormal distribution functions. A cumulative distribution function of the observed data, together with the different theoretical cumulative distribution functions (cdf) and the R^{2} values for each distribution, is presented in Figure 5. The plots were produced using the Matlab fitting routines poissfit, raylfit, and lognfit (part of the Matlab statistics toolbox) to obtain maximum likelihood fits for the Poisson, Rayleigh, and lognormal distributions, respectively. We then plotted the resulting cdfs from the fitted parameters using the poisscdf, raylcdf, and logncdf functions (see http://www.mathworks.com/access/helpdesk/help/toolbox/stats/). While all distributions show good agreement with the observations, the best correlation, especially at the edges of the cumulative data curve, is achieved by the two-Gaussian fit, which was obtained using the gmdistribution.fit procedure, designed explicitly to fit Gaussian mixtures (see http://www.mathworks.com/help/toolbox/stats/gmdistribution.fit.html).
Furthermore, we also checked how successful the fits are in explaining the variation of the observed data using the Akaike Information Criterion (AIC) [*Edwards et al.*, 2007], which is designed explicitly for this kind of model comparison. According to Akaike's theory, the most accurate model has the smallest AIC.
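The AIC comparison can be sketched on synthetic data. The example below is a hand-written expectation-maximization fit standing in for Matlab's gmdistribution.fit, compared against a single Gaussian. It uses AIC = 2k − 2 ln L, with k = 2 parameters for one Gaussian and k = 5 for the two-component mixture. The mixture parameters, sample size, iteration count, and seed are arbitrary illustration values and are not taken from Table 1.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_one_gaussian(data):
    """Maximum likelihood fit of one Gaussian; returns params, log-likelihood."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    loglik = sum(math.log(normal_pdf(x, mu, sigma)) for x in data)
    return (mu, sigma), loglik


def fit_two_gaussians(data, iterations=100):
    """EM fit of w*N(mu1, s1) + (1-w)*N(mu2, s2); returns params, log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    (_, s_all), _ = fit_one_gaussian(data)
    # Crude start: quartiles as means, overall spread for both components.
    w, mu1, mu2, s1, s2 = 0.5, ordered[n // 4], ordered[3 * n // 4], s_all, s_all
    for _ in range(iterations):
        # E-step: responsibility of component 1 for each sample.
        resp = []
        for x in data:
            p1 = w * normal_pdf(x, mu1, s1)
            p2 = (1.0 - w) * normal_pdf(x, mu2, s2)
            resp.append(p1 / (p1 + p2))
        # M-step: weighted means, standard deviations, and mixing weight.
        n1 = sum(resp)
        n2 = n - n1
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        s1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        s2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2
                           for r, x in zip(resp, data)) / n2)
        w = n1 / n
    loglik = sum(math.log(w * normal_pdf(x, mu1, s1)
                          + (1.0 - w) * normal_pdf(x, mu2, s2)) for x in data)
    return (w, mu1, s1, mu2, s2), loglik


def aic(loglik, n_params):
    """Akaike Information Criterion: smaller is better."""
    return 2 * n_params - 2.0 * loglik


# Synthetic "V_d-like" sample drawn from a two-component mixture.
rng = random.Random(1)
data = [rng.gauss(5.0, 1.0) if rng.random() < 0.6 else rng.gauss(9.0, 2.5)
        for _ in range(1500)]
_, ll_one = fit_one_gaussian(data)
_, ll_two = fit_two_gaussians(data)
print(f"AIC, one Gaussian:  {aic(ll_one, 2):.1f}")
print(f"AIC, two Gaussians: {aic(ll_two, 5):.1f}")
```

Because the penalty term 2k grows with the number of parameters, the mixture only wins when its likelihood gain outweighs the three extra parameters, which is the sense in which AIC guards against overfitting.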

Table 1. Mean and Standard Deviation Values for *V*_{d} Data Histograms for All Four Frequencies, Along With the Fitted Equation, Mean, Variance, and R^{2} Values

| Frequency | Histogram Arithmetic Mean and Std Values | Fit Equation Parameters | Goodness of Fit R^{2} |
|---|---|---|---|
| 4.0 kHz | Mean = 7.156, Std = 3.078 | *μ*_{1} = 5.787, *σ*_{1} = 1.6719; *μ*_{2} = 7.8174, *σ*_{2} = 5.6325; ∣a_{1}/a_{2}∣ = 1.864 | 0.9849 |
| 7.5 kHz | Mean = 7.776, Std = 2.552 | *μ*_{1} = 5.917, *σ*_{1} = 4.831; *μ*_{2} = 2.983, *σ*_{2} = 2.074; ∣a_{1}/a_{2}∣ = 1.477 | 0.9748 |
| 10.0 kHz | Mean = 6.682, Std = 2.100 | *μ*_{1} = 4.812, *σ*_{1} = 1.039; *μ*_{2} = 6.939, *σ*_{2} = 2.761; ∣a_{1}/a_{2}∣ = 0.699 | 0.9859 |
| 26.0 kHz | Mean = 5.721, Std = 2.429 | *μ*_{1} = 4.5, *σ*_{1} = 1.362; *μ*_{2} = 6.458, *σ*_{2} = 3.756; ∣a_{1}/a_{2}∣ = 2.333 | 0.9906 |

[12] The *V*_{d} distributions along the different hours of the day are presented in Figure 6. The color code represents the percentage of the annual data files (total of 386 days). The stack plot near the *y* axis shows the total *V*_{d} variability. For 4 kHz, the *V*_{d} values are distributed with maximal counts around 5.5 dB at 09:00 UT and 21:00 UT, with a distinct preference toward nighttime hours. This can also be deduced from the magnitudes of the fitted normal curves in Figure 3a, where the 18:00–24:00 UT curve (plotted in green) reaches the highest value, followed by the 06:00–12:00 UT curve (plotted in yellow). In addition, in Figure 6a the highest arithmetic mean and standard deviation values occur during 00:00–06:00 UT, with the nonzero percentage rate reaching the highest and lowest *V*_{d} values, while the smallest arithmetic mean and standard deviation occur during 18:00–24:00 UT. This can also be deduced from Figure 3a, where the fitted cyan Gaussian (for 00:00–06:00 UT) is spread over the whole *V*_{d} range.

[13] For 7.5 kHz, Figure 6b, the pattern looks somewhat different, with a maximal percentage around 15:00 UT for *V*_{d} values equal to 5.2 dB, and a smaller peak around 21:00 UT. Again, this can also be deduced from the magnitudes of the curves in Figure 3b, where the 12:00–18:00 UT curve (plotted in red) reaches the highest value, followed by the 18:00–24:00 UT curve (plotted in green). In addition, Figure 6b shows that the highest arithmetic mean and standard deviation values occur during 06:00–12:00 UT, with the nonzero percentage rate reaching the highest and lowest *V*_{d} values, while the smallest arithmetic mean and standard deviation values occur during 12:00–18:00 UT. This can also be deduced from Figure 3b, where the fitted yellow Gaussian (for 06:00–12:00 UT) is spread over the whole *V*_{d} range and the fitted red Gaussian (for 12:00–18:00 UT) is the narrowest.

[14] At 10 kHz, Figure 6c, as at 7.5 kHz, there are two peaks in the percentage rate, the first around 15:00 UT for values equal to 4.5 dB, and the second around 21:00 UT for values equal to 5.5 dB. The highest arithmetic mean and standard deviation values again occur during 06:00–12:00 UT, and the smallest during 12:00–18:00 UT. Finally, at 26.0 kHz, Figure 6d, the maximal count peaks around 15:00 UT for *V*_{d} values equal to 4.0 dB, with a smaller peak shifted toward 21:00 UT. The highest arithmetic mean and standard deviation values occur around 06:00–12:00 UT, and the smallest around 12:00–18:00 UT.

[15] The statistical distributions along the different days of the year are presented in Figure 7. Here the color code represents the percentage of the daily data files (total of 96 files per day). The stack plot near the *y* axis shows the variability of the *V*_{d} values, and is the same as the curves in Figure 6. At 4.0 kHz, Figure 7a, the daily variability is smallest during summer, from June to September, with *V*_{d} values around 6.0 dB, and the greatest variability is found in fall, during October and November. This can also be deduced from the magnitudes of the fitted normal curves in Figure 4a, where the red Gaussian reaches the highest values. In addition, the standard deviation values are smallest in summer and reach their peak in winter. Arithmetic mean *V*_{d} values are at a minimum during spring, from March to May.

[16] For 7.5 kHz, Figure 7b, the smallest variability is also found during summer time, from June to September, with *V*_{d} values around 6.5 dB, and the greatest variability is found during fall time, for October and November, for *V*_{d} values around 9.0 dB. At 10 kHz, Figure 7c, and at 26.0 kHz, Figure 7d, the picture looks the same with smallest variability in summer for *V*_{d} values around 5.7 dB and 5.5 dB, respectively, and largest variability in winter for *V*_{d} values around 7.7 dB for both frequencies.

### 5. Conclusions

[17] This paper presents VLF atmospheric noise statistics from broadband measurements of transient signals and background noise carried out in the Israeli Negev Desert for a period of 386 days, from the beginning of February 2007 to the end of February 2008. The statistical measure that we use to define the properties of low-frequency radio noise is the voltage deviation, *V*_{d}, which is a useful quantitative measure of the impulsiveness, or “spikiness,” of the noise. Our calculations were made for four different frequencies: 4 kHz, 7.5 kHz, 10 kHz and 26.0 kHz, with a bandwidth of 75 Hz. From our analysis we can deduce that the *V*_{d} mean values reach their highest level in the frequency range close to the peak of the spectral density, which is dominated by lightning discharges, i.e., at 7.5 kHz, with smaller values on both sides of the spectral peak, i.e., at 4 and 10 kHz. At higher frequencies, i.e., 26.0 kHz, the global lightning influence decays as the attenuation of VLF waves increases, and the noise is less impulsive and spiky, with the lowest *V*_{d} mean values. Furthermore, the observed *V*_{d} histograms for the total period (1 year) can be fitted by a sum of two Gaussian distribution functions, which yields higher R^{2} values (0.985) than the fitted Poisson (0.8843), Rayleigh (0.8009), or lognormal (0.8955) distribution functions. In the same manner, the fitted two-Gaussian distribution function attains the smallest AIC value. This might imply that there are two main dominant sources for both the daily and the seasonal variation, which are frequency dependent. When the data series are divided into seasons, spring and summer can be well represented by one Gaussian, while fall and winter, with higher mean and variance values, correspond to a second Gaussian.
That alone might indicate that, for prediction purposes, summer is the most accurately predictable period, since its fitted Gaussian curve has a very small variance, whereas winter has the largest variance. It should be noted that these results are region specific, since in Israel there is no local thunderstorm activity during the summer, while in the winter months there is significant thunderstorm activity in the Mediterranean region. Furthermore, since the occurrence of seasonal lightning activity is quite robust from year to year, our measurements are likely adequate for deriving monthly statistics. Nevertheless, additional measurements to improve the statistics are desirable.

[18] The statistical distribution of *V*_{d} values for different hours of the day shows that for all the frequencies above 4.0 kHz, the maximal occurrence rate is around 15:00 UT, the time at which the most dominant global lightning activity center in the vicinity of our area, i.e., Africa, reaches its peak. Furthermore, the smallest variance is obtained between 12:00 and 18:00 UT, making it the most reliable period of the day for predictions. In addition, the frequencies above 7.5 kHz also exhibit a peak percentage rate at 15:00 UT every day. At 4 kHz, the maximal *V*_{d} values occur in accordance with the other two main global lightning activity centers, in Asia and America, which reach their peaks at 08:00 and 20:00 UT, respectively.

[19] The results of this study may allow for improved modeling of VLF communication performance, given the improved knowledge of the characteristics of the natural background noise. The individual *V*_{d} values for different hours of the day and months of the year can be used as input parameters when simulating a low-frequency communication channel.