Three years of gravity wave observations from the High Resolution Dynamics Limb Sounder instrument on NASA's Aura satellite are examined. We produce estimates of the global distribution of gravity wave momentum flux as a function of individual observed wave packets. The observed distribution at the 25 km altitude level is dominated by the small proportion of wave packets with momentum fluxes greater than ∼0.5 mPa. Depending on latitude and season, these wave packets comprise only ∼7–25% of observations, but are shown to be almost entirely responsible for the morphology of the observed global momentum flux distribution. Large-amplitude wave packets are found to be more important over orographic regions than over flat ocean regions, and to be especially important in regions poleward of 40°S during austral winter. The momentum flux carried by the largest packets relative to the distribution mean is observed to decrease with height over orographic wave generation regions, but to increase with height at tropical latitudes; the resulting mesospheric intermittency is broadly equivalent in both cases. Consistent with previous studies, waves in the top 10% of the extratropical distribution are observed to carry momentum fluxes more than twice the mean and waves in the top 1% more than 10× the mean, and the Gini coefficient is found to characterize the observed distributions well. These results have significant implications for gravity wave modeling.
 Internal gravity waves (GWs) are a vitally important component in our dynamical understanding of the middle and upper atmosphere, playing key roles in atmospheric processes at all scales, transporting energy and momentum, and helping to constrain the broad-scale structure of the middle atmosphere [e.g., Fritts, 1984; Holton et al., 1995; Nappo, 2002; Fritts and Alexander, 2003, and references therein].
 Standard parameterizations of gravity waves tend to assume near-constant sources of gravity wave packets in the lower atmosphere which propagate upward, interacting with the background winds as they do so [e.g., Alexander et al., 2010; Geller et al., 2013]. However, a range of studies [e.g., Alexander and Pfister, 1995; Eckermann and Preusse, 1999; Plougonven et al., 2008] have suggested that these waves are best described as individual packets rather than a spatial continuum, and newer gravity wave drag parameterization schemes have been developed to calculate momentum flux deposition in light of this [Alexander and Dunkerton, 1999]. Such a wave packet-based concept of the distribution of gravity waves inherently assumes some degree of intermittency rather than homogeneity, and accordingly, a major outstanding question is the degree to which such intermittency is important [e.g., Alexander and Dunkerton, 1999; Bühler, 2003; Fritts and Alexander, 2003; Piani et al., 2004; Hertzog et al., 2008; Alexander et al., 2010; Hertzog et al., 2012; Plougonven et al., 2013] and the dynamical effects resulting from this. This intermittency is a product both of the underlying synoptic conditions and sources of the waves, combined with their propagation, and has vitally important consequences for the accuracy of these gravity wave parameterizations.
 Recent evidence suggests that the observed gravity wave momentum flux (“MF”) distribution may be dominated by a relatively small number of individual wave packets in certain locations, with measurements poleward of 50°S from the High Resolution Dynamics Limb Sounder (HIRDLS) instrument and VORCORE balloon campaign suggesting some individual packets may transport vertical fluxes of horizontal pseudomomentum as large as 100 mPa. At other locations, the distribution may be closer to a continuum state [Alexander et al., 2010; Hertzog et al., 2012; Plougonven et al., 2013].
The first part of this article (sections 4 and 5) expands upon the work of Hertzog et al. and Plougonven et al., presenting an analysis of the relative occurrence statistics of individual MF measurements in the full HIRDLS data set. We consider MF as a function of latitude, height, and season, and also divide the data set up into various types of source region, demonstrating clear differences between the intermittency characteristics of these different sources. The second part of this article (section 6) considers what part of the observed distribution of MF values contributes most to constraining the globally observed distribution of absolute MF.
Section 2 describes the instrument, data set, and analysis method used and section 3 the division into different source regions. We discuss the overall distribution of observed wave packets (section 4.1) and maps of observed intermittency (section 4.2), before examining the observed distributions as a function of latitude (section 4.3) and source region (section 4.4). Section 5 examines the variation of the MF distribution with height, and section 6 attempts to determine which parts of the MF distribution predominantly define the observed gravity wave MF morphology. Finally, we discuss some limitations of our analysis (section 7), before drawing conclusions (section 8).
2 Data and Analysis
The High Resolution Dynamics Limb Sounder (HIRDLS) [Gille et al., 2003] is a limb-scanning radiometer on NASA's Aura satellite, designed to measure atmospheric radiances at high vertical resolution throughout the middle atmosphere. Immediately after launch, an optical obscuration was discovered, blocking around 80% of the viewing aperture and most likely composed of dislodged Kapton insulating material [Barnett et al., 2005]. As a result, major corrective work has been required to produce useful atmospheric data [Gille et al., 2008].
The current version of the HIRDLS data set, V007, provides pressure-based vertical temperature profiles from the tropopause up to ∼80 km [Gille et al., 2013], with a vertical resolution of ∼1 km throughout most of this range [Wright et al., 2011], falling to ∼2 km above ∼60 km. Precision is ∼0.5 K throughout the stratosphere. Around 6000 profiles are obtained per day, spaced approximately 80–100 km apart. Due to the optical blockage, these measurements are at a significant angle, ∼47°, to the track of the satellite. Consequently, observations cannot be made poleward of 63.5°S and are not spatially colocated with other instruments on the Aura satellite.
 Data are available from late January 2005 until early 2008, when a failure of the optical chopper terminated data collection. We use all available data throughout the mission, to allow us to assess the probability of the most intermittent wave events. This may potentially introduce minor inconsistencies due to scanning mode and retrieval changes, but these should be minimized by the standardized retrieval process.
2.2 HIRDLS Momentum Flux Estimation
To estimate gravity wave momentum fluxes, we use the method of Alexander et al., as modified by Wright and Gille. We first subtract the mean background temperature and planetary waves up to mode seven to produce temperature perturbation profiles [Fetzer and Gille, 1994], then add 20 levels of zero padding at each end of the profile to prevent wraparound effects. We interpolate these onto a regular 1 km vertical scale, representative of the actual instrument resolution, and apply the Stockwell Transform (ST) [Stockwell et al., 1996], following which we discard data outside the 15–80 km altitude range. We impose a maximum vertical wavelength in our analysis of 20 km.
Next, we cross-multiply along-track-adjacent profile pairs to compute complex cospectra, from which we calculate the covarying temperature amplitude $\hat{T}$ and, using the method of Ern et al., the horizontal wave number

$$k_h = \frac{\Delta\phi_{i,i+1}}{\Delta X_{i,i+1}} \tag{1}$$

and absolute along-track momentum flux

$$M_{i,i+1} = \frac{\rho}{2}\,\frac{k_h}{k_z}\left(\frac{g}{N}\right)^{2}\left(\frac{\hat{T}}{\bar{T}}\right)^{2} \tag{2}$$

for each $k_z$ at which a distinct local maximum is observed in the ST spectrum, where $\Delta\phi_{i,i+1}$ is the phase difference and $\Delta X_{i,i+1}$ the geographic distance between the profiles, $\rho$ the atmospheric density, $k_z$ the vertical wave number, $g$ the acceleration due to gravity, $N$ the Brunt-Väisälä frequency, and $\bar{T}$ the background temperature. To identify wave packets against the background, we apply the statistical significance test described by McDonald, modified as described by Wright and Gille, and require signals to be significant at the 99% level. We thus obtain, for each height level in each profile, an estimate for each statistically significant wavelike signal of $M_{i,i+1}$ (hereafter “MF”). We perform our analysis using these individual wavelike signals rather than the total MF in each profile at each height. Since our analysis is on a per wave rather than per profile basis, we expect some bias toward lower momentum fluxes relative to the HIRDLS-derived results of Hertzog et al., due to the inclusion of smaller-amplitude secondary wave signals which are not included in their analysis.
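As a rough illustration of the two quantities just described, the sketch below evaluates a horizontal wave number from a phase difference and a momentum flux from a temperature amplitude. It assumes the widely used form M = (ρ/2)(k_h/k_z)(g/N)²(T′/T̄)²; the function names, the default buoyancy frequency, and all input values are hypothetical and chosen only to give stratosphere-like magnitudes.

```python
import math

def horizontal_wavenumber(delta_phi, delta_x):
    """Along-track horizontal wave number (rad/m) from the phase difference
    (rad) between two adjacent profiles separated by delta_x (m).
    The absolute value is an assumption, since only absolute MF is estimated."""
    return abs(delta_phi) / delta_x

def momentum_flux(rho, k_h, k_z, t_amp, t_bg, n_bv=0.02, g=9.81):
    """Absolute momentum flux in Pa, assuming
    M = (rho / 2) * (k_h / k_z) * (g / N)**2 * (T_amp / T_bg)**2."""
    return 0.5 * rho * (k_h / k_z) * (g / n_bv) ** 2 * (t_amp / t_bg) ** 2

# Illustrative (made-up) inputs for a level near 25 km altitude.
k_h = horizontal_wavenumber(delta_phi=math.pi / 2, delta_x=100e3)  # 400 km wavelength
k_z = 2 * math.pi / 10e3                                           # 10 km vertical wavelength
mf_pa = momentum_flux(rho=0.04, k_h=k_h, k_z=k_z, t_amp=1.0, t_bg=220.0)
mf_mpa = mf_pa * 1e3  # convert Pa to mPa
```

With these inputs, a 1 K wave on a 220 K background carries a flux of a few mPa, which is well above the median of the observed distribution but far below the largest reported packets.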
2.3 Occurrence Statistics
Following Hertzog et al., we describe a geographic region as exhibiting significant intermittency if a large proportion of the measured MF is carried by a comparatively small number of observations. Accordingly, we wish to compute occurrence statistics for the MF carried by observed wavelike signals. To do this, we take the entire HIRDLS gravity wave data set computed as described above, and for each grid box, on a regular 5° longitude × 5° latitude grid, bin the values of $M_{i,i+1}$ to produce a histogram of the measured MF for each grid box in the range 0–300 mPa.
The data analyzed include an extremely wide range of measurements, spanning ∼6 orders of magnitude in both measured MF and relative occurrence frequency. Accordingly, we wish to bin the data in such a way as to conserve the integrated area of the distribution but also reduce noise at the high-MF end of the distribution and reduce computational overhead due to unnecessary bins. To achieve this, variable bin widths are used; in all histograms, results are normalized for bin width to the width of the narrowest bin. This can lead to adjusted occurrence rates being lower than would otherwise be reported. For example, a relative incidence rate of 1×10⁻⁹ may be reported when only 1×10⁶ individual measurements were included in a histogram where the bin width is 1 mPa in the low-incidence portion of the histogram and 1×10⁻³ mPa in the high-incidence portion.
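The bin-width normalization can be sketched as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, and the two-bin histogram is constructed purely to reproduce the worked example in the text.

```python
def normalised_occurrence(counts, edges):
    """Relative occurrence per bin, rescaled to the width of the narrowest bin."""
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    narrowest = min(widths)
    total = sum(counts)
    return [(c / total) * (narrowest / w) for c, w in zip(counts, widths)]

# Worked example from the text: 1e6 measurements in total, a narrowest bin
# 1e-3 mPa wide, and a single measurement falling in a bin 1 mPa wide.
edges = [0.0, 0.001, 1.001]   # bin widths: 1e-3 mPa and 1 mPa
counts = [999_999, 1]
rates = normalised_occurrence(counts, edges)
```

The single measurement in the wide bin has a raw relative frequency of 1×10⁻⁶, which the width ratio of 10⁻³ reduces to an adjusted rate of about 1×10⁻⁹.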
 Except in section 5, we consider results at the 25 km altitude level. Due to the close proximity to the lower bound of our wavelet analysis window at 15 km, some very long vertical waves may in principle be edge truncated, reducing their measured MF [Wright, 2010]; however, analysis of various height levels in section 5 suggests that these effects are not important at this altitude level.
3.1 Topographic Roughness
 Orographically generated waves are expected to be more strongly intermittent [Hertzog et al., 2008, 2012; Plougonven et al., 2013]. Accordingly, we wish to produce a separate estimate for those regions where we expect the measured distribution to be strongly influenced by waves generated at the surface in topographically rough regions.
We define these regions using the method of Bacmeister, adapted for the longer horizontal length scales at which HIRDLS will detect waves. Briefly, a 5'×5' global topographic data set, produced by downsampling from the NOAA 1'×1' ETOPO topographic data set [Amante and Eakins, 2009], is used to produce “blocks” of local topography, sampled every 2.5° in latitude and longitude. Each of these blocks is composed of 30×30 grid points, centered on the sampling location. Bacmeister then smoothed these blocks with 11- and 5-grid-point boxcar filters, and differenced the two filtered distributions to produce a topographic deviation estimate for each block for features ∼50–100 km in size. Here we use 25- and 10-grid-point boxcar filters, focusing on features with horizontal scale greater than 100 km.
Each block is then systematically convolved with a set of ridge functions R(x,y,θ), both to compute the local orientation and to provide an estimate of the degree of corrugation ar of the topographic features within each block. We select the block with the largest ar within each of our 5°×5° analysis boxes, and retain this value of ar and the orientation angle θr for the selected box. The bottom 75% of this reduced ar distribution, representing the least corrugated regions, is then discarded. Following this, we compute surface wind speed and direction using European Centre for Medium-Range Weather Forecasts (ECMWF) operational analysis data for 2007, and hence the scalar product of the surface wind and the unit vector aligned along the ridge. From this, we define the 50% highest values as our orographic source regions. This represents 12.5% of the original land grid boxes.
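The two-stage selection can be illustrated with a short sketch. The tuple layout, function name, and synthetic values below are assumptions made for illustration; the corrugation and wind-ridge products are taken as already computed.

```python
def select_orographic_boxes(boxes):
    """boxes: list of (box_id, corrugation, wind_ridge_product) tuples.
    Returns the ids of boxes retained as orographic source regions."""
    # Step 1: rank by corrugation and discard the least corrugated 75%.
    by_corrugation = sorted(boxes, key=lambda b: b[1], reverse=True)
    rough = by_corrugation[: len(by_corrugation) // 4]
    # Step 2: of the remainder, keep the half with the largest wind-ridge product.
    by_wind = sorted(rough, key=lambda b: b[2], reverse=True)
    return [b[0] for b in by_wind[: len(by_wind) // 2]]

# Synthetic example: 80 land boxes with arbitrary, made-up values.
boxes = [(i, float(i), float((i * 37) % 80)) for i in range(80)]
selected = select_orographic_boxes(boxes)
fraction = len(selected) / len(boxes)  # 10 of 80 boxes
```

Keeping the top quarter and then the top half of that quarter retains 0.25 × 0.5 = 12.5% of the boxes, matching the proportion quoted in the text.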
By the time the waves have propagated vertically into the stratosphere, this classification will no longer represent the expected distribution of orographic waves, due both to filtering by the background winds and to waves propagating horizontally. Orographic waves tend to have zero or near-zero phase speed; accordingly, to adjust our results for wind-based filtering, we compute, on the same 5° grid, the distribution of zonal wind speeds as a function of height between the surface and the 25 km level for each month, again using ECMWF operational analysis data for 2007. If the central 90% (i.e., the range between the 5th and 95th percentiles) of the distribution of zonal wind speeds for all individual wind profiles on the 1.125° ECMWF operational analysis grid and for all 6-hourly analyses that month within our 5° analysis grid box crosses the zero-wind line at any height level, we assume that any orographic waves in this grid box have been filtered out, and omit it from our analysis. Finally, grid boxes directly adjacent to those expected to contain an orographic signal are included, to allow for horizontal propagation of the signal.
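A minimal sketch of the zero-wind test follows. It assumes the zonal winds for one grid box and month have already been gathered into one list per height level; the helper names and the linear-interpolation percentile are illustrative choices, not taken from the source.

```python
def percentile(values, q):
    """q-th percentile (0-100), using linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def orographic_waves_filtered(zonal_wind_by_level):
    """True if, at any height level, the central 90% of zonal winds
    (5th to 95th percentile) spans the zero-wind line."""
    for winds in zonal_wind_by_level:
        if percentile(winds, 5) <= 0.0 <= percentile(winds, 95):
            return True
    return False

# Made-up winds (m/s): steady westerlies at every level are not filtered,
# whereas a level with mixed easterlies and westerlies is.
steady = [[10.0, 12.0, 15.0, 20.0], [5.0, 8.0, 9.0, 11.0]]
reversal = [[10.0, 12.0, 15.0, 20.0], [-3.0, 2.0, 4.0, 6.0]]
```

Using the central 90% rather than the full range means a single anomalous easterly profile among many westerly ones does not remove a grid box from the orographic set.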
 We include two subsets of the orographic data set in our analysis, due to a very different distribution of momentum flux measurements in the region poleward of 40°S between June and September. These are defined as “Orographic (1)” for orographic grid boxes at all locations between October and May and north of 40°S between June and September and “Orographic (2)” for orographic grid boxes south of 40°S between June and September. Figure S1, available in the online supporting information for this article, shows the resulting monthly geographic masks for these and all other regions considered.
It should be noted that 4 of the 5 months of temporally coincident measurements between HIRDLS and VORCORE balloons discussed by Hertzog et al. fall within the “Orographic (1)” subset, with their first month of analysis falling into “Orographic (2)”; the same is true of their equivalent sea observations with respect to the oceanic subsets discussed below.
3.2 Convective Activity
 We divide our tropical data into regions of higher and lower convective activity. To do this, we use NOAA's mean daily interpolated outgoing longwave radiation (OLR) data set. This provides data globally on a 2.5° by 2.5° scale, running from June 1974 to the present day. Data are primarily sourced from NOAA's satellite time series, with gaps filled via temporal and spatial interpolation [Liebmann and Smith, 1996].
OLR is inversely related to the depth of atmospheric convection in the tropics: low OLR corresponds to low cloudtop temperatures, such as those at the top of the tall cloud systems associated with deep tropical convection. Accordingly, low values of OLR should be correlated with gravity wave generation, assuming a convective source for the gravity waves we measure. Correlations between this data set and HIRDLS MF estimates in the tropics during the monsoon were previously observed by Wright and Gille.
 To assess the effect of tropical convection, we select all data at tropical latitudes (defined here as 20°S to 20°N) and, for each of our 5°×5° grid boxes, find the average OLR for each month. We then select the top 25% and bottom 25% of the distribution of all grid boxes over all months and define them respectively as low-convection (“Tropics (LC)”) and high-convection (“Tropics (HC)”) regions. The 25th and 75th percentiles fall at 232 W m−2 and 268 W m−2, respectively. We do not apply any wind filtering to these analyses as used above for orographic sources: convective sources as defined by low OLR will almost certainly be in the upper troposphere, and accordingly will be generated comparatively close to our 25 km analysis level.
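The quartile-based split can be sketched as below, assuming the monthly mean OLR values for all tropical grid boxes and months have been pooled into one list. The function name is hypothetical, and the inclusive quantile method is an arbitrary choice where the text does not specify one.

```python
import statistics

def classify_convection(monthly_mean_olr):
    """Label each pooled (grid box, month) OLR value by quartile.
    Lowest 25% of OLR -> high convection; highest 25% -> low convection."""
    q1, _, q3 = statistics.quantiles(monthly_mean_olr, n=4, method="inclusive")
    labels = []
    for olr in monthly_mean_olr:
        if olr <= q1:
            labels.append("Tropics (HC)")
        elif olr >= q3:
            labels.append("Tropics (LC)")
        else:
            labels.append(None)  # middle half of the distribution is unused
    return labels, (q1, q3)

# Made-up OLR values in W m^-2, for illustration only.
olr_values = [200.0 + 10.0 * i for i in range(11)]  # 200, 210, ..., 300
labels, (q1, q3) = classify_convection(olr_values)
```

Because low OLR indicates deep convection, the labels run opposite to the OLR ordering: the bottom quartile is the high-convection subset.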
A final geographical subset used to analyze our data is open ocean. Due to the previously discussed issues with horizontal propagation of waves, we wish to only consider regions of open ocean away from land and major islands. Accordingly, we define our “ocean” area mask as the set of 5° longitude × 5° latitude grid boxes in which 95% or more of elements in the 1' NOAA topography data set have an altitude below sea level and which are not within two grid boxes of a grid box for which this is not true.
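One way to build such a mask is sketched below. It assumes the fraction of below-sea-level topography elements has already been computed for each 5° box, interprets "within two grid boxes" as a square neighbourhood, and wraps in longitude; all of these choices, and the function name, are illustrative assumptions.

```python
def ocean_mask(frac_below_sea_level, threshold=0.95, buffer=2):
    """frac_below_sea_level[i][j]: fraction of topography elements below sea
    level in each box (rows = latitude, columns = longitude).
    A box is ocean if it and every box within `buffer` boxes meet the threshold."""
    n_lat = len(frac_below_sea_level)
    n_lon = len(frac_below_sea_level[0])
    candidate = [[f >= threshold for f in row] for row in frac_below_sea_level]
    mask = [[False] * n_lon for _ in range(n_lat)]
    for i in range(n_lat):
        for j in range(n_lon):
            ok = candidate[i][j]
            for di in range(-buffer, buffer + 1):
                ii = i + di
                if not 0 <= ii < n_lat:
                    continue  # no boxes beyond the poleward edges
                for dj in range(-buffer, buffer + 1):
                    jj = (j + dj) % n_lon  # longitude is periodic
                    ok = ok and candidate[ii][jj]
            mask[i][j] = ok
    return mask

# Toy 7 x 8 grid: all ocean except one mostly-land box at row 3, column 0.
grid = [[1.0] * 8 for _ in range(7)]
grid[3][0] = 0.5
mask = ocean_mask(grid)
```

In the toy grid, the single land box removes a 5 × 5 block of neighbours from the mask, including boxes on the far side of the longitude wrap.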
 As with the orographic analysis described above, a very different distribution is observed poleward of 40°S between June and September. Accordingly, we again subdivide this analysis between “Ocean (1)” for ocean grid boxes at all locations between October and May and north of 40°S between June and September and “Ocean (2)” for ocean grid boxes south of 40°S between June and September.
 It should be noted that, due to the relatively coarse grid boxes used, many small islands in the south Pacific are included in the “Ocean (1)” and “Ocean (2)” regional analyses, despite being known sources of orographic signals [e.g., Alexander et al., 2009; Hoffmann et al., 2013]. This may potentially introduce some bias into these results. However, maps of observed intermittency, considered in section 4.2 below, do not show spikes from surrounding grid boxes due to these islands.
4 Spatial and Regional Variations
4.1 Overall Distribution
 Figure 1a shows a histogram of all observations globally, Figure 1b the cumulative distribution from the highest-MF event downward, and Figure 1c the same data presented as the ratio of an individual wave packet to the distribution mean. The inset panel of Figure 1b shows the same results as the main panel, on linear axes. On each figure, the all-observations histogram is shown in black. The second, third, and fourth columns of Table 1 show numerical values for some points of interest from the distribution.
Table 1. The $M_{i,i+1}$ (in mPa) and Number of Profiles Above This Value for Selected Percentiles of the Observed Distribution at 25 km Altitude, for the Full Distribution and Three Subsets^a
^a MF indicates the momentum flux of this percentile, N the number of observations with this or greater MF, and % the proportion of the total MF carried by measurements above this percentile. All values except profile counts, which are known exactly, are quoted to 2 significant figures.
The distribution is sharply skewed toward low absolute momentum fluxes, with a median of 0.095 mPa and a mean of 0.40 mPa (Table 1). The upper end of the distribution (Figure 1a), however, is extremely long-tailed: while the 90th percentile of the distribution lies at 0.96 mPa, the 99th lies at 4.2 mPa and each succeeding decade of the remaining data is, very approximately, half an order of magnitude larger. The 99.999th percentile of the distribution, above which 540 observations lie, is 99 mPa, nearly 250× the distribution mean and more than 1000× the median; the very largest individual measurements are ∼400× the mean, ∼1700× the median.
 Considering the inset panel of Figure 1b, we see that the distribution is dominated by the low-probability events, rising very sharply at the left-hand side of the panel and almost flat at the right-hand side, where the great majority of low-MF wave packets make very little difference to the total. Indeed, only 18% of total MF is carried by wave packets with below-mean MF, and only 4% by below-median packets, implying that the entire bottom half of the distribution may be almost irrelevant in global terms, albeit potentially significant locally. Globally, the top 10% of events carries 60% of measured MF, but with significant geographic variation in this value.
At very large MF values, we see a broadly linear dependence in log-log space of the ratio of measured MF to the mean against position within the distribution, with a steady decline of approximately 1 order of magnitude in this ratio per 2 orders of magnitude of percentile. Consistent with the observations of Hertzog et al. poleward of 50°S, we see that the 90th percentile of the distribution lies at approximately 2× the mean (more precisely, around 2.5× in our data set) and the 99th percentile at 10× the mean, despite the very different temporal and spatial coverage of our analysis.
Following Plougonven et al., we also compute the Gini coefficient [Gini, 1912] of our analyzed data as an estimate of the unevenness of the observed distribution, using the method of Glasser. Possible values range between 0 and 1, with 0 corresponding to a perfectly even distribution (i.e., all wave packets carry equal MF) and 1 to a perfectly uneven one (i.e., a single wave packet carries all the MF observed). We find the Gini coefficient for the whole data set to be 0.360. This lies toward the lower end of the range of values found by Plougonven et al. over Antarctica and the Southern Ocean, which ranged between 0.3 and 0.8; this is partially due to the inclusion in our analysis of data from the whole globe, but also due to the previously discussed systematic differences between our data sets whereby we include a much larger number of small-amplitude signals. Instrument resolution issues may also tend to reduce apparent intermittency.
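For readers unfamiliar with the metric, a minimal sketch of a Gini coefficient calculation is given below. It uses the standard sorted-rank formula for a finite sample, which is not necessarily the exact estimator used in the analysis; the function name is hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula
    G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i), with i = 1..n."""
    x = sorted(values)
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero value")
    weighted = sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1))
    return weighted / (n * total)

# Limiting cases described in the text, with made-up packet fluxes (mPa).
even = gini([1.0, 1.0, 1.0, 1.0])    # all packets carry equal MF
uneven = gini([0.0, 0.0, 0.0, 1.0])  # one packet carries all the MF
```

With this estimator the single-packet case gives (n − 1)/n, which approaches 1 only as the number of packets n becomes large.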
4.2 Geographic Distribution
 We next consider the geographic variation of observed wave intermittency. Using our histogram data for each location, we compute the proportion of the total MF carried by signals in the top 10% and 1% of the distribution by number as ordered by MF. Figure 2 shows results for the top 10%. Results for the top 1% have been computed, and are included as supporting information Figure S2; these show broadly the same spatial distribution. We show maps for each calendar month at the 25 km level, with data from all 3 years of the HIRDLS mission contributing to each monthly estimate.
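The proportion of total MF carried by the largest packets can be sketched as follows. For simplicity this illustration works from a list of individual MF values rather than from binned histogram data, which is an assumption; the function name and sample values are hypothetical.

```python
def share_carried_by_top(mf_values, top_fraction):
    """Fraction of the summed MF carried by the largest `top_fraction`
    (by number) of wave packets."""
    ordered = sorted(mf_values, reverse=True)
    n_top = max(1, round(top_fraction * len(ordered)))
    return sum(ordered[:n_top]) / sum(ordered)

# Made-up sample: 90 small packets of 1 mPa and 10 large packets of 10 mPa.
sample = [1.0] * 90 + [10.0] * 10
top10 = share_carried_by_top(sample, 0.10)
top1 = share_carried_by_top(sample, 0.01)
```

In this sample the top 10% of packets carry just over half of the total flux, so a grid box with these statistics would sit slightly below the global value of 60% quoted above.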
Throughout the year, we see a trough at the equator and higher values at higher latitudes both north and south, with a particularly strong peak over the southern Andes and Drake Passage between May and October. This corresponds spatially to a well-known GW MF hot spot [e.g., Hoffmann et al., 2013], and was observed to have high intermittency by Hertzog et al. and Plougonven et al. In this region, our results suggest that the top 10% of the distribution carries, in the most extreme grid boxes, as much as 80% of the total observed $M_{i,i+1}$ during winter, with nearly 40% carried by the top 1% of observations alone. However, it is also important to note that this peak is particularly skewed; no other location is quite so dominated by the long tail of the observed MF distribution.
 Intermittency poleward of ∼40°S is extremely high throughout winter. This may possibly be due to downstream propagation from the major Andes source peak [see, e.g., Sato et al., 2012, Figures 2 and 8], but may also be related to processes including baroclinic eddies, adjustment toward geostrophic balance, and frontogenesis [e.g., Fritts and Luo, 1992; Vincent et al., 2007; Ern and Preusse, 2012; Sato et al., 2012]. Since the Andes peak also falls in this latitude band, if the latter sources are the primary mechanism, then it is likely that at least some of the high intermittency over the Andes may be due to these effects overlapping with orographic generation.
High intermittency is also observed across the northern hemisphere during winter, peaking in January, with a broadly wave-1-like distribution. Extensive wave filtering associated with Rossby wave activity was observed at high northern latitudes in DJF 2005–2006 [Siskind et al., 2007; Wright et al., 2010; France et al., 2012], and this period could in principle skew the measured results. To test this, these 3 months were assessed separately both with and without this period (not shown); no significant differences were observed. Accordingly, this is assumed to be a general feature of this region during winter. Throughout DJF, around half of the observed grid boxes poleward of 40°N show distributions skewed sufficiently that 60% or more of the total observed momentum flux is carried by the top 10% of observed wave packets, and more than a quarter of the total by the top 1% of observations. Localized grid boxes show even higher peaks.
 A secondary peak of high intermittency is also seen over south Central Asia between April and September.
We also compute the Gini coefficient for each grid box, as discussed above. Maps of the distribution of observed values for each month are available in the online supporting information, Figure S3. Values range between 0.269 and 0.464; when plotted on appropriate color scales, it is immediately apparent that the geographic distribution of this quantity corresponds strongly to the geographic distribution of the proportion of total MF carried by the top 10% of the data (Figure 2). This is consistent with the similarity of the top 10% and top 1% distributions to each other, suggesting consistent scaling relationships throughout each regional distribution. As mentioned above, these values fall at the lower end of the Antarctic Gini coefficients estimated by Plougonven et al.
Comparison of the location of all these peaks to previous studies of absolute momentum flux using HIRDLS data suggests a strong correlation between these peaks of high intermittency and peaks of high mean GW activity (see Figures 7a and 7g as well as, e.g., Yan et al. and Ern et al.). They also correspond well to known gravity wave hot spots observed by AIRS [Hoffmann et al., 2013]. This suggests that the high mean observed by these previous studies may be a result of intermittent high-MF wave packets rather than a generally high level of MF; this will be considered in section 6.
4.3 Variation With Latitude and Season
 Figure 3 shows the same metrics as Figure 1, discussed above, for a range of latitude bands in winter and summer (defined as DJF and JJA respectively for the northern hemisphere and vice versa for the southern). Table 2 shows corresponding Gini coefficients. Considerably more wave packets are observed in winter, which manifests itself in Figure 3a as an extension of the histograms to lower relative frequencies.
Table 2. Gini Coefficients for Each Latitude Band
 We see very large differences between summer and winter—Figure 3a shows a much broader range of MF values in all extratropical latitude bands in winter, the cumulative distributions in Figure 3b skew further away from the 1:1 line, and large numerical differences are seen between their Gini coefficients. Tropical values show no significant seasonal variation.
The largest absolute wave packets are seen in the 70°S–50°S and 50°S–30°S latitude bands during winter; these are the only two bands whose curves include individual wave packets with momentum fluxes greater than 150 mPa (Figure 3a). The 50°S–30°S band in winter is the most affected by the small number of truly extreme outlying events, with the highest proportion of total MF carried by the top 0.001% of measured wave packets (Figure 3b, main panel), but the 70°S–50°S band is the most skewed distribution overall (Figure 3b, inset), with by far the highest Gini coefficient (0.408, versus a next-highest value of 0.381). However, this is not replicated in summer; during this season, the distribution poleward of 50°S is broadly equivalent to that between 30°N and 50°N in summer, albeit more skewed than the summer distribution poleward of 50°N.
 The northernmost latitude band, 70°N–90°N, has the lowest absolute values in summer and the lowest extratropical values in winter, the lowest summer and lowest extratropical winter Gini coefficient, and also by far the fewest resolved wave packets. This may partly be due to limited measurements in this region preventing the detection of low-probability high-amplitude events, since the northern turnaround of the satellite scan track is at 80°N, but a similar lack of full coverage exists in the 70°S–50°S band, where coverage only extends to 64°S. This suggests that poor spatial coverage is not the reason for these curves being so low, but that we are genuinely seeing fewer wave packets and fewer extrema in the high Arctic.
4.4 Distribution by Source Region
 We next examine our distribution as divided among the regional subsets described above. Figure 1, in addition to the global distribution, also shows distributions for the subregions defined in section 3, and Table 3 shows corresponding Gini coefficients; Table 1 shows selected MF values for three of these subdistributions. We see large differences between these subregions. Maps of all regions for all months are included in Figure S1 of the supporting information for this article.
Table 3. Gini Coefficients for Each Regional Distribution
 All subsets other than the tropical high-convection (green) and tropical low-convection (purple) distributions exhibit considerable similarities with each other and the overall distribution. For each of these subsets, more than 55% of the total MF is carried by the top 10% of wave packets; as we saw in Figure 2, the only extratropical region where this is not true is in the northern hemisphere between April and September, and even here this only falls to around 45%.
4.4.1 Orography Versus Ocean
 The largest orographic peak is over the Andes, which is conspicuously intermittent compared to all other regions (Figure 2). This peak persists strongly from May to October but appears to exert some effect on the distribution throughout the whole year. We also see orographic southern hemispheric features off the northern coast of Antarctica (edge-truncated by the limits to our satellite data at 64°S), New Zealand and southeastern Australia. In the northern hemisphere, our orographic regions include parts of the U.S. west coast, southern Europe, parts of central Asia, and the Kamchatka peninsula (Figure S1).
 Returning to Figure 1, we observe that the two orographic distributions, “Orographic (1)” and “Orographic (2),” are strongly influenced by high-valued MF packets by comparison with their corresponding ocean distributions, with the orographic distributions more dominated by their long tails (Figure 1b) and the largest wave packets much further above the mean (Figure 1c). Gini coefficients are also larger for the orographic distributions (0.366 versus 0.340 for “Orographic (1)” versus “Ocean (1)”; 0.413 versus 0.397 for “Orographic (2)” and “Ocean (2)”), corresponding to higher inequality. This difference is smaller than might be expected from an initial examination of, e.g., Figures 2 and S3, and may be at least partially due to our definition of an orographic signal: specifically, periods when we exclude a grid box from orographic analysis may also be periods when the source mechanism is quiescent, artificially reducing the apparent observed source intermittency.
4.4.2 High Convection Versus Low Convection
 The two tropical distributions, Tropics (HC) and Tropics (LC), are narrower than the other subsets considered, with maximum packet MF < 20 mPa. This is partly due to biases in the measurement of gravity waves by polar-orbiting satellites, which travel meridionally close to the equator and so significantly underestimate the MF carried by zonally propagating waves in this region, due to large overestimates of their horizontal wavelength. This will be discussed further in section 7.
 The total observed MF (4.8% of the global total over the mission for the high convection region versus 5.8% for low convection regions) and number of observed wave packets (6.0% versus 6.3%) in the two subsets are very similar. Hence, differences lie primarily in the shape of the distribution. There is a slight unexpected positive bias in measured MF toward low-convection regions; while these regions tend to lie further from the equator than the convective regions, and hence this effect may be due to the satellite scan angle, tests with the latitudinal range restricted to the region 10°S–10°N (not shown) show a similar difference.
 The distribution of measured MF for the regions of high convection (green line, Figure 1a) is observed to have a narrower range of values than the distribution for regions of low convection (purple line), corresponding to a flatter cumulative distribution (Figure 1b). This suggests that the observed MF in the high convection regions is more evenly distributed across the spectrum of observed wave packets, and hence less affected by wave intermittency. The Gini coefficient estimates are in agreement, with estimates of 0.299 for high convection regions and 0.307 for low convection regions, consistent with slightly greater inequality in the low convection regions. This difference is small, however, and may not be meaningful.
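 As an illustration of the inequality measure quoted above, the following minimal sketch estimates a Gini coefficient from a list of packet MF values. The function name `gini` is a hypothetical choice, and the estimator shown is the standard sorted-sample form; it is an assumption that this matches in detail the estimator used for the values reported here.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even)."""
    x = sorted(values)
    n = len(x)
    total = sum(x)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Sorted-sample estimator: G = 2*sum(i*x_i) / (n*sum(x)) - (n + 1)/n,
    # with ranks i running from 1 (smallest) to n (largest).
    weighted = sum((i + 1) * v for i, v in enumerate(x))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n
```

 A distribution in which every packet carries the same MF gives a value of 0, while concentrating all MF in a single packet pushes the value toward 1.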
5 Variation With Height
 We next examine the variation with height of wave intermittency. Figure 4 shows histograms for the global distribution of packet MF every 5 km in altitude between 25 km and 65 km, scaled to the mean value of the distribution at that level. Histograms were also computed for two additional levels at both the top and bottom of these analyses, but these differed significantly at the high-MF end of the distribution. This is consistent with the expected effect of edge-truncation, which will tend to affect only long waves which carry (equation (2)) larger MF; accordingly, these results have been omitted. We see that the most extreme events relative to the mean occur at the lowest altitudes, with the whole distribution shifting toward lower mean-relative MF values with increasing height.
 Figures 5a–5g extend this analysis to each of our subregions, and correspond individually to Figures 1c and 3. Figure 5h, meanwhile, shows the estimated Gini coefficient for each distribution as a function of height.
 Figure 5a, the global distribution, reproduces the results of Figure 4. This figure highlights the reduced influence of extrema with increasing height, with the largest detected packets at the 25 km level ∼300× the mean but at the 65 km level only ∼150×.
 We see considerable differences between regions. The most striking is the difference between the two orographic regions, Figures 5b and 5c, and the two convective regions, Figures 5f and 5g. In the orographic regions, we see the global pattern reproduced, with the most extreme MF packets at high altitude closer to the mean than those at the lowest altitude level. In the tropical regions, however, this pattern is reversed: higher altitude levels exhibit more intermittency than the lowest. Ocean (1), Figure 5d, is more similar to the tropics than the orographic regions, whilst Ocean (2), Figure 5e, is more ambiguous.
 This difference most probably arises due to very different source mechanisms and, importantly, different interactions with background winds as the waves propagate upward. Figure 6 illustrates possible mechanisms underlying this difference.
 Figure 6a shows vertical propagation in regions of broad-spectral wave generation, such as the tropics. Here waves are generated across the whole phase speed range, with the generation mechanism exhibiting minimal intermittency. Waves are filtered out when their phase speed matches the background wind speed. Consequently, as waves propagate upward in height, an increasing proportion of the spectrum is swept out, leaving a more quasi-monochromatic, and therefore intermittent spectrum, at high altitudes.
 In the orographic case, we have a similar broad spectrum of nonorographic background waves undergoing the same processes as in Figure 6a. Overlaid on this spectrum, however, are a comparatively small number of very high-amplitude orographic waves, with zero phase speed, which exhibit significant source intermittency and contribute much more MF per event to the observed distribution. If the wind profile crosses into the low-phase-speed region (dashed lines), these orographic waves will be filtered out. Figure 6b shows 150 individual wind speed profiles representing different times and locations near 60°S, 70°W during June 2007; when averaged over many observations at different times, waves will be removed with height as shown in the side panel, where dark gray shows remaining waves and light gray removed waves at each height level. Since these large-scale waves dominate the distribution, their removal will reduce the relative intermittency in these regions, tending toward only the intermittency seen in the nonorographic case. The end result of this is that intermittency in both types of region at the highest altitudes is broadly equivalent, with peak packets ∼100× the mean, but at lower levels the behavior is very different.
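 The filtering mechanism described above can be sketched as follows. This is a toy illustration only, not the analysis used to produce Figure 6: the function name `surviving_waves` is hypothetical, waves are represented purely by their phase speed, and a wave is assumed to be removed as soon as the range of wind speeds encountered so far includes its phase speed.

```python
def surviving_waves(phase_speeds, wind_profile):
    """For each height level, list the phase speeds not yet filtered out.

    wind_profile holds background wind speeds ordered from the lowest level
    upward. A wave is removed once the wind has matched its phase speed at
    or below the current level (assuming the wind varies continuously
    between levels, so every intermediate speed is encountered).
    """
    survivors = list(phase_speeds)
    lo = hi = None
    levels = []
    for u in wind_profile:
        lo = u if lo is None else min(lo, u)
        hi = u if hi is None else max(hi, u)
        survivors = [c for c in survivors if not (lo <= c <= hi)]
        levels.append(list(survivors))
    return levels
```

 In this picture a zero-phase-speed orographic wave survives for as long as the wind profile below it never passes through zero, and is removed at the first level where it does.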
 We see that the spread of distribution Gini coefficients, Figure 5h, decreases with altitude, with a range between the highest and lowest values of ∼0.130 at the lowest altitude considered but only ∼0.065 at the highest altitude. As with the results of Figures 5a–5g, and consistent with the mechanisms described above, we see an increasing (decreasing) Gini coefficient with height for the distributions with the lowest (highest) values at the 25 km level.
6 The Background Spectrum
 An interesting question is the effect of intermittent wave packets on the morphology of the global MF distribution. To assess this, we systematically remove the lowest-MF wave packets from the global distribution, and then analyze the resulting maps to determine what proportion of the global morphology is defined only by the high-MF packets.
 Figure 7 shows the results of this analysis for January and July (other months were analyzed, and showed similar results, but are omitted for brevity). Each individual panel shows a map of the grid box-mean observed MF per profile for waves above a threshold MF,
\[ \bar{M}_j = \frac{1}{N_p} \sum_{i=1}^{I} M_i , \]
where j is the grid box under consideration, Np is the total number of profiles observed in the grid box, I the total number of wave packets in the grid box with MF greater than the threshold value, and Mi the MF carried by each of these individual packets. The per profile MF is used due to the very different numbers of measurements in each box.
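 As a concrete illustration of the per profile quantity defined above, the following minimal sketch sums the MF of all packets in one grid box that exceed a threshold and divides by the number of profiles in that box. The function name `mf_per_profile` and its arguments are hypothetical.

```python
def mf_per_profile(packet_mf, n_profiles, threshold=0.0):
    """MF per profile (mPa) in one grid box, from packets above `threshold`.

    packet_mf  : MF of each wave packet observed in the grid box (mPa)
    n_profiles : total number of profiles observed in the grid box
    threshold  : only packets with MF strictly greater than this are summed
    """
    if n_profiles <= 0:
        raise ValueError("n_profiles must be positive")
    return sum(m for m in packet_mf if m > threshold) / n_profiles
```

 Dividing by the number of profiles rather than the number of packets keeps heavily and lightly sampled boxes comparable, and means that raising the threshold can only lower the value for a given box.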
 The first column shows the MF distribution resulting from the inclusion of all measured waves, plotted in mPa. Figures 7b–7e and 7g–7j show the difference from this as we increase the threshold MF level, as a percentage of Figures 7a and 7f. We see that Figures 7b and 7g, representing the distribution of only those waves with MF ≥ 0.1 mPa, show essentially no difference from the original data (≲5%), and that Figures 7c and 7h, MF ≥ 0.5 mPa, still exhibit differences of only ≲10%, with the greatest differences appearing in the tropics and over the Pacific Ocean. These threshold values correspond, very approximately, to the 50th and 80th percentiles of the global packet MF distribution respectively (Table 1). Figures 7d and 7i, MF ≥ 1 mPa, exhibit much larger differences, particularly over the Pacific Ocean where the difference peaks at around 50%. The difference in high-MF regions such as the Arctic in January, the Andes for the rest of the year, and southeast Asia in April and July, however, is much smaller, ≲5%. Since these regions are dominant in the total MF distribution observed globally, the large relative differences observed elsewhere are arguably less important than the stability of these peaks with increasing threshold MF. The final column, however, is significantly different from all those preceding it at all locations.
6.1 Temporal and Spatial Dependence of Threshold MF
 Figure 8 generalizes these results further. For a range of threshold values of MF, we compute a new global map according to the same method as above, and then compute the root mean square deviation
\[ \mathrm{RMSD} = \sqrt{ \frac{1}{N_g} \sum_{j=1}^{N_g} \left( X_j^{\mathrm{all}} - X_j \right)^2 } \]
between this map and the all-data map for the same period, where Ng is the total number of grid boxes, X_j^{\mathrm{all}} the value of X for grid box j when all data are included, and Xj the value of X for grid box j after applying the threshold. This quantity will be dominated by the high-MF regions, emphasizing absolute differences over relative differences. We vary the threshold MF across the full range of values encountered in the data set, and plot the RMSD as a function of the threshold MF for a range of latitude bands and for each season. This summarizes, in a single number, the difference between the all-measurements MF map and the above-threshold MF map as a function of threshold MF. We also indicate the proportion of the original data which remains after applying the threshold, shown by dotted lines on the same axes.
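 A minimal sketch of this root mean square deviation is given below, with each map flattened to a list containing one value per grid box. The function name `rmsd` and the flattened-list representation are assumptions made for illustration.

```python
import math

def rmsd(all_data_map, thresholded_map):
    """Root mean square deviation between two maps of equal length.

    Each argument holds one value per grid box, in the same box order.
    """
    if not all_data_map or len(all_data_map) != len(thresholded_map):
        raise ValueError("maps must be non-empty and of equal length")
    n_g = len(all_data_map)
    squared = sum((a - b) ** 2 for a, b in zip(all_data_map, thresholded_map))
    return math.sqrt(squared / n_g)
```

 Because the differences are squared before averaging, boxes with large absolute MF contribute most, which is why this measure emphasizes absolute over relative differences.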
 For every season and for all latitude bands, we observe the same key feature: the RMSD remains negligibly small until we reach a threshold of around 0.5 mPa, beyond which it grows rapidly. It then flattens off when the number of high-MF wave packets itself becomes negligibly small, at thresholds above 10 mPa, although the exact values depend on season and location.
 We estimate the required threshold MF, and the proportion of observed waves needed for good characterization of the morphology, by finding the first value at which the RMSD ≥0.01 mPa. This value is arbitrary, but as we see on Figure 8, corresponds closely to the beginning of the sharp increase in gradient for each curve. Figure 9 shows the results of this analysis for three variables: threshold MF, the proportion of the total number of observed waves with MF above this threshold, and the proportion of total observed MF carried by these waves. We see that the required threshold MF lies between 0.45 mPa and 0.8 mPa (Figures 9a, 9b), that this represents between 5% and 27% of the total number of observed waves (Figures 9a, 9c), and that these waves carry between 35% and 90% of the total observed MF (Figures 9b, 9c). There is substantial variability in all three parameters both within regions and seasons, but some broad conclusions can be drawn from the clusters of points observed: (i) the tropics require the largest proportion of the total number of observed waves to be well characterized, and consequently have the lowest threshold MF; (ii) the highest required threshold MF is in the 70°S–50°S latitude band; and (iii) in the extratropics, a smaller proportion of the number of observed waves is required in summer, and a larger proportion in winter.
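 The threshold selection described above amounts to scanning the RMSD curve for its first crossing of a fixed limit. A minimal sketch follows, assuming the curve is available as paired lists of threshold MF and RMSD values; the function name and argument names are hypothetical.

```python
def first_threshold_exceeding(thresholds, rmsd_values, limit=0.01):
    """Lowest threshold MF (mPa) at which the RMSD reaches `limit` (mPa).

    Returns None if the RMSD never reaches the limit.
    """
    # Sort the pairs so the scan runs from the lowest threshold upward.
    for t, r in sorted(zip(thresholds, rmsd_values)):
        if r >= limit:
            return t
    return None
```

 The proportion of packets above the returned threshold, and the share of total MF they carry, can then be read off the packet distribution for the same region and season.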
 Figure 10 shows the same results for our defined regions, which include data from the whole year. We see that the required threshold MF is strongly dependent on source region type, varying between 0.45 mPa and 0.75 mPa, but the proportion of measured wave packets required is much more narrowly distributed, lying between 15% and 25%.
6.2 Homogeneity of Background Spectrum
 An important remaining question relates to the temporal variability of the measured MF above and below the threshold. If the geographic distribution of observed MF is determined almost entirely by packets with MF>0.5 mPa, then it is likely that the distribution of MF<0.5 mPa is a continuous background and will vary little throughout the year.
 To assess this, we split the MF distribution into two parts: (i) packet MF<0.5 mPa and (ii) packet MF > 0.5 mPa, and compute the mean MF in each grid box for each month of available data. We then find the 3 year-mean MF for each grid box for each distribution, and scale each monthly value accordingly. From this, we compute the spread of grid box MF values across the globe for each month. This provides separate estimates of the variability of the two distributions: the spread at each time indicates the geographic variability at that time (see, e.g., Figures 7a, 7f), and the temporal variation of the spread indicates temporal variability. Randomly sampled individual grid boxes are observed to have similar time-dependent behavior (omitted for brevity).
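 The scaling and spread calculation described above can be sketched as follows for one of the two parts of the distribution. This is an illustrative sketch only: the function name `monthly_spread` and the nested-list layout are assumptions, every grid box is assumed to have a nonzero mean with no missing months, and the quartile convention is that of Python's `statistics.quantiles`, which may differ from the one used for Figure 11.

```python
import statistics

def monthly_spread(monthly_maps):
    """Spread of mean-scaled grid box MF for each month.

    monthly_maps[m][j] is the mean MF in grid box j for month m.
    Returns one (minimum, lower quartile, upper quartile, maximum) tuple
    per month, after scaling each box by its mean over all months.
    Requires at least two grid boxes.
    """
    n_boxes = len(monthly_maps[0])
    box_means = [statistics.fmean(month[j] for month in monthly_maps)
                 for j in range(n_boxes)]
    spread = []
    for month in monthly_maps:
        scaled = [month[j] / box_means[j] for j in range(n_boxes)]
        q1, _, q3 = statistics.quantiles(scaled, n=4)
        spread.append((min(scaled), q1, q3, max(scaled)))
    return spread
```

 The width of each tuple reflects geographic variability in that month, and the change in the tuples from month to month reflects temporal variability.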
 Figure 11 shows the results of this analysis. Shown are the full and interquartile ranges of observed mean-scaled grid box MF for packets with (Figure 11a) MF > 0.5 mPa (red) and (Figure 11b) MF < 0.5 mPa (blue). We see considerably less variability, both geographically and temporally, in the low-MF distribution than in the high-MF distribution: the entire range of low-MF values lies between 0.5× and 1.6× the mean, while the high-MF distribution varies significantly both geographically and with time. The low packet MF part of the distribution is therefore relatively constant in both time and space, while the larger packets are considerably more variable. This suggests that simulating the GW distribution as a constant background overlaid with large variability generated by a comparatively small number of packets may be a viable approach.
 There are several important issues which potentially impact our results. The most important of these is the extreme sensitivity of our analysis to outliers, which may be geophysical but which could also potentially arise due to instrumental or analytical errors. As we have seen, the number of measured wavelike signals contributing to our estimates at high MF values is very small relative to their effect on the cumulative distribution. Individual examination of a random sampling from the top 0.001% of the distribution suggests that the extreme values are plausible by relation to other local measurements, but due to the volume of data considered it is difficult to assess many cases individually. This is an important caveat in our characterization of the most extreme events.
 This caveat is compounded by very different spatial coverage in different geographic regions. In particular, we have a large number of very closely spaced profiles at the northern and southern turnarounds by comparison to the relatively coarse longitudinal coverage near the equator. This manifests itself implicitly in Figures 1a and 3a as the relative incidence histogram terminating at higher values, but is harder to decouple from the analysis in other figures and sections of the paper. As a result of this, some regions, particularly at the highest latitudes, are much more heavily sampled than others, especially those nearest the equator; given the low probability of observation of the most extreme events, this may skew the distributions in more lightly sampled regions away from the largest extrema. This will cause problems directly comparing different regions.
 The limited accuracy of our measurements of the horizontal scale of observed waves also leads to significant limitations in our analysis. The method used tends to include many horizontally aliased waves, particularly in the tropics [Wright and Gille, 2013], and also waves propagating at a significant angle relative to the instrument scan track. λh will be overestimated for these waves and, consequently, Mi,i+1 underestimated, biasing the distribution of Mi,i+1 low. This will, in principle, tend to shift the whole distribution toward lower measured MF, which ought not to affect our intermittency statistics. However, we cannot observe waves of all wavelengths due to the instrument observational filter [Alexander, 1998; Preusse et al., 2000, 2008], and the measured temperature perturbation amplitude, and hence MF, is strongly related to the horizontal wavelength of the signal [Preusse et al., 2000, 2002]. Consequently, some waves which would be in our observation window and observed with a large amplitude at high latitudes will not be observed at all, or will be significantly underestimated, at tropical latitudes, which will skew our results lower.
 Measurement uncertainty is also compounded by the edge effects of the observational filter: as propagating waves approach critical levels, they will shift in wavelength, and if this shift takes them into our analysis wavelength range (2–20 km for vertical wavelength, and a more complicated function of profile separation distance, propagation angle, and vertical wavelength for horizontal wavelength), wave packets will appear to be spontaneously generated in our analysis, potentially increasing the apparent intermittency of our distribution.
 Finally, our measurements of MF are magnitude only and instantaneous, lacking any information on the propagation direction or speed of the waves considered. This potentially affects our conclusions in section 6: while only ∼7–25% of the observed waves may be required to produce the morphology of the observed absolute difference, more information may be required for models, in which directional information is vital.
 Our results show the vital importance of a relatively small number of high-amplitude wave events in global HIRDLS GW data. At the 25 km altitude level, individual waves transporting tens and sometimes hundreds of times the mean momentum flux are not atypical in extratropical regions. Even in the comparatively less skewed distributions observed in the tropics, the largest observations carry momentum fluxes 20×–30× the mean.
 Seasonal factors are the dominant effect on the observed pattern, with extratropical winter distributions skewed toward much higher and more intermittent MF. Orographic regions generate much more intermittent observed distributions than others, but weather-related processes at very high southern latitudes may be a major source of wave intermittency approximately equal in importance to these.
 Intermittency tends to decrease with height in the extratropics. With the caveats described in section 7, the tropical MF distribution may be much less intermittent than that elsewhere; in contrast to other regions, intermittency here is found to increase with height. Differences in the level of convective activity seem to make only a small difference to the observed distribution.
 The empirical scaling relationship observed by Hertzog et al., that the 90th and 99th percentiles correspond, approximately, to waves carrying twice and 10× the mean MF, is observed to hold approximately at all locations and heights. This relationship is weakest in the tropics, where our results are the most subject to methodological limitations.
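 The scaling relationship above can be checked against any sample of packet MF values by comparing a chosen percentile with the sample mean. The sketch below is a minimal illustration: the function name is hypothetical, and the nearest-rank percentile definition is an assumption rather than the definition used in this analysis.

```python
import math

def percentile_over_mean(values, pct):
    """Ratio of the pct-th percentile (nearest-rank) to the mean of values."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("need at least one value")
    # Nearest-rank definition: the smallest value with at least pct% of
    # the sample at or below it.
    rank = max(1, math.ceil(pct * n / 100))
    return x[rank - 1] / (sum(x) / n)
```

 For a distribution following the relationship, `percentile_over_mean(mf, 90)` would be close to 2 and `percentile_over_mean(mf, 99)` close to 10, where `mf` stands for a list of packet MF values.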
 The Gini coefficient, as suggested by Plougonven et al., appears to describe the observed distributions well. In particular, the spatial distribution of Gini coefficients (Figure S3) closely corresponds to the spatial distribution of the proportion of total MF represented by the top 10% of the distribution (Figure 2), and the ordering of observed Gini coefficients for both source-region-derived and latitude-based distributions (Tables 2 and 3) agrees closely with the observed distributions. However, presumably due to differences in both data set and analysis method, the numerical values are significantly lower than those of Plougonven et al., and smaller differences between regions are seen than might be expected based on their results.
 In both the overall distribution and all extratropical subsets we have considered, less than 40% of total MF is transported by the bottom 90% of the observed distribution, less than 20% by the bottom 75%, and importantly less than 5% by the entire bottom 50%. As a result of this skew, the morphology of the global MF distribution is seen to be almost entirely dependent on wave packets with MF >0.5 mPa, with the distribution of packets larger than this showing strong temporal and spatial variability and the distribution of packets smaller than this showing much less. This has important implications for the parameterization of gravity waves in climate and weather models.
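 The cumulative shares quoted above can be computed for any sample of packet MF values as in the following minimal sketch. The function name is hypothetical, and rounding the group size to the nearest whole packet is an assumption made for simplicity.

```python
def share_carried_by_bottom(values, fraction):
    """Fraction of the total carried by the smallest `fraction` of values."""
    x = sorted(values)
    total = sum(x)
    if not x or total <= 0:
        raise ValueError("need at least one positive value")
    k = round(fraction * len(x))  # number of packets in the bottom group
    return sum(x[:k]) / total
```

 For a strongly skewed sample such as nine packets of 1 mPa and one of 91 mPa, the bottom 90% of packets carry only 9% of the total.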
 C.J.W. and J.C.G. are currently funded by NASA's Aura satellite program under contract NAS5–97046; S.M.O. is funded by the UK National Centre for Atmospheric Science. The article was written up during a visit by C.J.W. to the University of Oxford, generously arranged by L.J. Gray. C.J.W. would also like to thank J.K. Barstow and M.P. Rombach for useful discussions relating to the analysis of the data.
 This work benefited from discussions at the International Space Science Institute (ISSI) in Bern, Switzerland. ECMWF data used were obtained from the British Atmospheric Data Centre, http://badc.nerc.ac.uk. The OLR data used is the interpolated OLR data set provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA. The National Center for Atmospheric Research is sponsored by the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in the publication are those of the authors and do not necessarily reflect the views of the National Science Foundation.