A rigorous statistical approach is presented to address the shortcomings inherent in the analysis of counting experiments such as orbital neutron spectroscopy. Unlike the quasi-Gaussian statistics commonly applied to such investigations, the methodology described here incorporates fundamental elements of Poisson statistics, including its nature as a bounded, discrete, and intrinsically asymmetric probability distribution. In addition, we use the statistical formulae required to describe the data following background subtraction. Applying the Likelihood Ratio Method for hypothesis testing, we find that analyses of collimated instruments such as the Lunar Exploration Neutron Detector (LEND) aboard NASA's Lunar Reconnaissance Orbiter overestimate detection significances by a factor of ∼√2, under the assumption that the collimator effectiveness asserted by the LEND team is correct. This in turn implies that exposure times must be ∼2× longer to reach prelaunch sensitivity estimates. We provide updated estimates for hydrogen abundance sensitivity limits as a function of exposure time and compare the sensitivities of collimated (i.e., background subtracted) and uncollimated approaches.
 Neutron spectroscopy has become a well-established technique for determining the surface composition of airless planetary bodies from orbit [Prettyman, 2007]. It was successfully employed by spectrometers aboard NASA's Lunar Prospector mission to make elemental abundance maps of the Moon's surface [Elphic et al., 1998; Maurice et al., 2004]. Of particular note was evidence for the suppression of epithermal neutrons at the lunar poles consistent with hydrogen enrichment [Feldman et al., 1998].
 The spatial distribution of the lunar hydrogen is of critical importance for fundamental planetary and solar system science, as well as a possible lunar resource, particularly if it is in the form of water. A detailed understanding of the lunar water cycle is an area of intense investigation and modeling. In the simplest scenario, water may become trapped in permanently shadowed regions (PSRs) via migration or deposition, with these cold traps acting as high-stability hydrogen/water reservoirs. While the evidence for polar hydrogen deposits is strong and the association with PSRs is likely, a definitive relationship remains elusive in part because of the limited spatial resolution (∼30–100 km) of the Lunar Prospector observations.
 In an effort to provide high-spatial-resolution mapping of hydrogen in the Moon's surface layers and to identify possible water-ice deposits associated with the PSRs, the Lunar Exploration Neutron Detector (LEND) experiment was included as part of NASA's Lunar Reconnaissance Orbiter (LRO) [Mitrofanov et al., 2010a]. Unlike previous neutron spectrometers, LEND incorporates a collimator to reduce the instrumental field of view. In principle, this design should provide improved spatial resolution (∼10 km) while simultaneously reducing instrumental counting rates relative to a similar uncollimated detector. Imperfect collimation (i.e., leakage), however, introduces backgrounds from outside the instrument's field of view, impacting instrumental sensitivity and achievable spatial resolution. Therefore, subtraction of the noncollimated event fraction is required.
 Recent LEND polar neutron maps show features similar to Lunar Prospector results, as well as a class of neutron suppressed regions (NSRs) unassociated with PSRs [Mitrofanov et al., 2010b, 2011]. At face value these results may represent a new paradigm for lunar hydrogen deposits. Some authors have suggested that instrument design deficiencies influence the results [Lawrence et al., 2010; Eke et al., 2012]. Those issues are not of concern here; for this work LEND is assumed to operate as claimed.
 One concern, however, is the statistical approach used to analyze LEND observations [Mitrofanov et al., 2010a]. Specifically, it incorporates a number of systematic effects that lead to an overestimate of detection significance and ultimately influence the interpretation of results. Therefore, in an effort to further investigate the nature of the new lunar hydrogen features identified by LEND and address existing statistical analysis shortcomings, a classical statistical approach appropriate to the LEND application is derived and presented here. This new analysis methodology incorporates the statistical distributions appropriate for experiments utilizing counting statistics and background subtraction in low signal-to-background scenarios.
2. “Generic” Statistical Approach
 We begin by reviewing the statistical approach as outlined by the LEND team [Mitrofanov et al., 2010a]. It should be noted that the statistical analyses presented by Lawrence et al. suffer from the same shortcomings. However, since that statistical analysis approach was similar to the one used by LEND, the conclusions regarding relative sensitivity remain unchanged. Let N_det^o and N_det^H be the number of neutrons detected in a hydrogen-free and a hydrogen-rich region, respectively. As the LEND team has indicated [Mitrofanov et al., 2010a], imperfect collimation leads to two classes of detected neutrons: those arriving from within the field of view of the collimator (N_sig) and those entering from outside the field of view on a trajectory passing through the collimator (N_bck). The latter are considered “background” for this work since they are not the desired (collimated) “signal,” and may be secondaries originating within the spacecraft or coming from the Moon itself. Their relative contribution is characterized by the ratio η = N_bck/N_sig. Perfect collimation corresponds to η = 0, and the LEND team estimates η ∼ 1 [Mitrofanov et al., 2010a]. It is important to note that the LEND instrument only counts neutrons and is not designed to determine whether a detection is from signal or background.
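 The difference in neutron counts between the two surface regions may be written as

$$\Delta N = N_{\rm det}^{o} - N_{\rm det}^{H} = \left(N_{\rm sig}^{o} + N_{\rm bck}^{o}\right) - \left(N_{\rm sig}^{H} + N_{\rm bck}^{H}\right) = N_{\rm sig}^{o} - N_{\rm sig}^{H} \equiv \Delta N_{\rm sig}, \tag{1}$$
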
where the background contributions from hydrogen-free and hydrogen-rich regions (N_bck^o and N_bck^H) are assumed to completely cancel. The impact of systematics resulting from this assumption has not been quantified by Mitrofanov et al. [2010b], and therefore for the sake of consistency this simplifying assumption is used here. It should be noted that this assumption may not be valid [e.g., Lawrence et al., 2011a, 2011b], but if it eventually needs to be relaxed, the statistical treatment presented in this paper can be easily expanded to account for a nonuniform background. It is convenient to rewrite this as a relative magnitude, i.e., the fractional change in neutron flux due to the presence of hydrogen relative to a hydrogen-free region: δ = ΔN_sig/N_sig^o.
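Figure 1 shows the relationship between hydrogen abundance and suppression factor according to the LEND team [Mitrofanov et al., 2010a]. An observed deficit (or enhancement) is deemed statistically significant if its amplitude exceeds the uncertainty in the total number of counts from a reference hydrogen-free region by a predetermined multiple (Nσ), i.e.,

$$|\delta| = \frac{|\Delta N_{\rm sig}|}{N_{\rm sig}^{o}} \ge N_{\sigma}\,\frac{\sqrt{N_{\rm det}^{o}}}{N_{\rm sig}^{o}} = N_{\sigma}\sqrt{\frac{1+\eta}{N_{\rm sig}^{o}}}, \tag{2}$$
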
where Poisson statistics has been used to estimate the uncertainty in the background.
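The generic criterion described above can be sketched numerically. This is a minimal illustration, assuming the threshold takes the form |ΔN_sig| ≥ Nσ √(N_det^o) with N_det^o = N_sig^o (1 + η); the function name `generic_min_delta` and the input numbers are hypothetical and are not taken from the LEND analysis.

```python
import math

def generic_min_delta(n_sig_ref, eta, n_sigma):
    """Smallest |delta| that passes the quasi-Gaussian threshold.

    Assumes N_det^o = N_sig^o * (1 + eta) and a Poisson uncertainty
    sqrt(N_det^o) on the total reference counts.
    """
    n_det_ref = n_sig_ref * (1.0 + eta)
    return n_sigma * math.sqrt(n_det_ref) / n_sig_ref

# Illustrative numbers only: 10^4 collimated reference counts.
for eta in (0.0, 1.0):
    for n_sigma in (3.0, 5.0):
        print(eta, n_sigma, generic_min_delta(1.0e4, eta, n_sigma))
```

With η = 1 the required |δ| is √2 times larger than for perfect collimation (η = 0), because the leakage counts add to the uncertainty without adding signal.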
 As presented, the LEND formalism requires three fundamental assumptions:
 1. Uniform backgrounds. Equation (1) implicitly assumes that N_bck^o and N_bck^H are identical and ultimately cancel, a useful simplifying assumption. It is, however, well known [e.g., Prettyman, 2007; Feldman et al., 1998] that both the flux and spectrum of emitted neutrons are modified by different hydrogen abundances; the presence of elements with large neutron cross sections and variations in cosmic ray flux also influence neutron emission. Thus the noncollimated neutron fraction can differ from region to region, as well as vary with time. The impact of this variation may be mitigated if elemental abundance variations are small and/or vary over large spatial extents. No effort is made here to quantify the magnitude or impact of this systematic effect; instead, the simplifying assumption is used.
 2. A symmetric statistical description. Treating deficits and enhancements identically can impose systematics. Observational investigations involving discrete counting experiments such as neutron detection are governed by Poisson probability, a statistical distribution that is decidedly nonsymmetric. This is particularly true when the expected number of events μ for a given observational interval is small, and only approaches a symmetric, Gaussian-like, distribution in the limit of large μ [Bevington and Robinson, 1992]. Since the underlying parent distribution is asymmetric, discrete, and bounded, assuming the data are symmetric, continuous, and unbounded can impact the statistical interpretation.
 3. Quasi-Gaussian significance determination. Quantifying the significance of an observation using the number of standard deviations is an implicitly Gaussian (continuous, symmetric), not a Poisson (discrete, asymmetric), approach.
 Therefore, in an effort to aid LEND data interpretation an alternative statistical approach, one appropriate to this application and addressing the shortcomings outlined above, is required.
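The asymmetry noted in assumption 2 can be made concrete with a short calculation. The sketch below compares the probability of a downward and an upward excursion of roughly two standard deviations for a Poisson variable; the helper names and the chosen means are illustrative assumptions, not values from the LEND data.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n events when mu are expected (log-space form)."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def poisson_tails(mu, offset):
    """Return (P[N <= mu - offset], P[N >= mu + offset]) for integer mu."""
    m = int(mu)
    lower = sum(poisson_pmf(n, mu) for n in range(0, m - offset + 1))
    upper = 1.0 - sum(poisson_pmf(n, mu) for n in range(0, m + offset))
    return lower, upper

for mu in (4, 100, 2500):
    offset = int(round(2 * math.sqrt(mu)))  # roughly a "2 sigma" excursion
    lower, upper = poisson_tails(mu, offset)
    print(mu, lower, upper, upper / lower)
```

For a symmetric distribution the two tail probabilities would be equal. For small μ the upper tail is several times more probable than the lower one, and the ratio drifts toward unity only as μ grows, consistent with the Poisson skewness of 1/√μ.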
3. Alternative Statistical Approach
 Proper determination of statistical significance and confidence intervals are often overlooked experimental challenges, and approximate methods are often used for simplicity or to reduce computation requirements. Both Bayesian [Nakamura et al., 2010] and advanced classical approaches [Feldman and Cousins, 1998] have proven useful, particularly for the challenging cases of Poisson processes with background and Gaussian errors within a bounded physical region, and are now commonly used in fields limited by low-counting statistics such as neutrino physics [Feldman and Cousins, 1998]. In this work we present another classical approach that does not require the assumptions inherent in the current LEND approach.
3.1. Likelihood Ratio Method
 Let Ho represent a hypothesis based on some a priori knowledge of a physical model containing p free parameters. Furthermore, let H represent another, more complex, hypothesis that contains q + p parameters. Our goal is to compute the level of confidence that can be placed on this hypothesis compared to the null hypothesis Ho.
 The likelihood ratio method is useful when comparing two hypotheses, and can also be used to determine model parameters [e.g., deBoer et al., 1992]. Given a count difference k, the likelihood that it is supported by each of the two hypotheses is simply L(k|H) and L(k|Ho), respectively. To determine the degree to which the data support hypothesis H over Ho, a likelihood ratio is computed,

$$R = \frac{L(k\,|\,H)}{L(k\,|\,H_o)}, \tag{3}$$

and since in general H incorporates hypothesis Ho this ratio satisfies R ≥ 1. In a classical interpretation the statistic λ = 2 ln R follows a χ² distribution with q degrees of freedom if the null hypothesis is true. In addition, the null hypothesis Ho can be rejected if λ exceeds some predetermined critical value λc, and the confidence level is P(χ²_q < λc). This is the essence of the likelihood ratio method (LRM).
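As a quick numerical illustration of the last step, the confidence level for a given critical value can be evaluated in closed form when q = 1, since P(χ²₁ < λ) = erf(√(λ/2)). The function names below are hypothetical; only the Python standard library is used.

```python
import math

def confidence_q1(lam):
    """P(chi^2_1 < lam); for one degree of freedom this is erf(sqrt(lam/2))."""
    return math.erf(math.sqrt(lam / 2.0))

def chance_probability_q1(lam):
    """1 - P(chi^2_1 < lam), evaluated with erfc to retain precision."""
    return math.erfc(math.sqrt(lam / 2.0))

for lam_c in (9.0, 25.0):
    print(lam_c, confidence_q1(lam_c), chance_probability_q1(lam_c))
```

For more than one additional parameter (q > 1) the error-function shortcut no longer applies and a general χ² cumulative distribution function would be needed.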
 Using the LRM requires two (related) hypotheses, i.e., competing statistical descriptions of the data. The appropriate form of these hypotheses depends on the nature of the scenario being studied. For completeness, two relevant scenarios will be considered here: one appropriate to collimated detector applications such as LEND, in which differences in counts are the fundamental data product, and the other for uncollimated applications.
3.2. Skellam and Poisson Distributions
 Suppose the probability of finding n_i counts in bin i of some data space satisfies Poisson statistics. In addition, assume the probability for bin j is independent of bin i, and the difference in counts is k = n_i − n_j. Given these two statistically independent observations, each governed by Poisson statistics, the appropriate statistical description of the difference in counts is given by the Skellam distribution [Skellam, 1946],

$$f_S(k;\mu_i,\mu_j) = e^{-(\mu_i+\mu_j)}\left(\frac{\mu_i}{\mu_j}\right)^{k/2} I_{|k|}\!\left(2\sqrt{\mu_i\mu_j}\right), \tag{4}$$

where μi and μj are the mean values of the two parent distributions, and I_|k| is the modified Bessel function of the first kind of order |k|. If the two samples have identical underlying parent distributions, i.e., in the particular case when μ = μi = μj, equation (4) reduces to

$$f_S(k;\mu,\mu) = e^{-2\mu}\, I_{|k|}(2\mu), \tag{5}$$

and is representative of a null hypothesis for the case in which count differences are of interest.
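The Skellam probability can be evaluated with the standard library alone. The sketch below assumes the usual closed form, f_S(k; μi, μj) = exp[−(μi + μj)] (μi/μj)^(k/2) I_|k|(2√(μi μj)), and obtains the Bessel function from its ascending power series summed in log space; the function names and the test means are illustrative choices.

```python
import math

def log_bessel_i(order, x):
    """ln I_order(x) for order >= 0 and x > 0, from the ascending series
    I_v(x) = sum_m (x/2)**(2m+v) / (m! * Gamma(m+v+1)), summed in log space."""
    log_half_x = math.log(x / 2.0)
    n_terms = int(x) + int(order) + 60  # terms peak near m ~ x/2, then decay fast
    logs = [(2 * m + order) * log_half_x
            - math.lgamma(m + 1) - math.lgamma(m + order + 1)
            for m in range(n_terms)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))

def skellam_log_pmf(k, mu_i, mu_j):
    """ln f_S(k; mu_i, mu_j) for the difference k = n_i - n_j of two Poisson counts."""
    return (-(mu_i + mu_j) + 0.5 * k * math.log(mu_i / mu_j)
            + log_bessel_i(abs(k), 2.0 * math.sqrt(mu_i * mu_j)))

def skellam_pmf(k, mu_i, mu_j):
    return math.exp(skellam_log_pmf(k, mu_i, mu_j))

# Sanity checks with illustrative means: normalization, mean, and variance.
mu_i, mu_j = 12.0, 9.0
ks = range(-80, 81)
probs = [skellam_pmf(k, mu_i, mu_j) for k in ks]
total = sum(probs)
mean = sum(k * p for k, p in zip(ks, probs))
var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
print(total, mean, var)  # expect about 1, mu_i - mu_j, and mu_i + mu_j
```

The variance μi + μj is the key property for what follows: when the two means are equal, the spread of the difference is √2 times that of either Poisson count alone.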
 As described above, the Skellam-based approach is required due to imperfect collimation and the need to remove leakage (background) neutrons in order to reach the spatial resolution design goals originally motivating the collimated approach. It may be possible, however, to use uncollimated data sets to obtain the desired correlations with lunar surface features, albeit with reduced spatial resolution. For example, advanced deconvolution techniques have been used to associate the uncollimated Lunar Prospector epithermal neutrons with PSRs identified by Kaguya topography [Eke et al., 2009; Teodoro et al., 2010]. In short, an uncollimated orbital neutron data set can be used to infer the presence of hydrogen deposits by identifying neutron count deficits on a region-to-region basis; the size of these surface regions is ultimately dictated by orbital parameters [e.g., Feldman et al., 1998; Lawrence et al., 2010].
 In this uncollimated scenario the appropriate statistical approach is a modification of the one presented above. Here the goal is simply to compare two hypotheses, each represented by a Poisson probability distribution, yet having different expected means (see section 4).
4. Comparison With the “Generic” Statistical Approach
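 To evaluate differences between the generic and LRM approaches we begin by computing the suppression and enhancement factors required for a statistically significant observation given the null hypothesis. For the scenario requiring the removal of a noncollimated or leakage component, this is obtained by finding the roots of the following equation, in which f_S denotes the Skellam distribution of equation (4):

$$2\ln\left[\frac{f_S(k;\mu_s,\mu_b)}{f_S(k;\mu_b,\mu_b)}\right] - \lambda_c = 0, \tag{6}$$
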
where μs and μb are the counts corresponding to the suppression (or enhancement) and the null hypothesis, respectively. In contrast, for a scenario in which no such leakage subtraction is required (e.g., perfect collimation or an uncollimated scenario) an equation utilizing Poisson statistics is appropriate:

$$2\ln\left[\frac{f_P(n;\mu_s)}{f_P(n;\mu_b)}\right] - \lambda_c = 0, \tag{7}$$

where f_P(n; μx) = μx^n e^(−μx)/n! is the Poisson probability for detecting n events in a given sample interval when μx are expected. While equations (6) and (7) are the relevant theoretical expressions, they can be applied to data using the substitutions μs → N_det, μb → N_det^o, and noting that k is the difference in counts between a sample interval of interest and one that is assumed to be hydrogen-free.
 With these definitions the suppression factor can be generalized as δ = (μs − μb)/μb, in analogy with the same parameter used in equation (2). Here δ is allowed to take on both negative and positive values indicative of both suppression and enhancement. In both cases there is one free parameter, the suppression factor δ, and hence q = 1. The relevant null hypothesis estimate may be obtained via modeling or in situ [see, e.g., Mitrofanov et al., 2010b; Lawrence et al., 2006; Mitrofanov et al., 2008], but in either case it is assumed to be known with perfect knowledge.
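The root finding described in this section can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used to produce the figures: the statistic is taken to be λ = 2 ln[f(·; μs)/f(·; μb)] evaluated at the expected outcome (k = μs − μb for the Skellam case, n = μs for the Poisson case), k is treated as a continuous variable, and the roots are located by bisection. All function names and the value μb = 1000 are hypothetical.

```python
import math

def log_bessel_i(order, x):
    """ln I_order(x), order >= 0, x > 0, via the ascending series in log space."""
    log_half_x = math.log(x / 2.0)
    logs = [(2 * m + order) * log_half_x
            - math.lgamma(m + 1) - math.lgamma(m + order + 1)
            for m in range(int(x) + int(order) + 60)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(v - peak) for v in logs))

def skellam_log_pmf(k, mu_i, mu_j):
    return (-(mu_i + mu_j) + 0.5 * k * math.log(mu_i / mu_j)
            + log_bessel_i(abs(k), 2.0 * math.sqrt(mu_i * mu_j)))

def lrm_skellam(mu_s, mu_b):
    """Assumed Skellam statistic: 2 ln[f_S(k; mu_s, mu_b) / f_S(k; mu_b, mu_b)]."""
    k = mu_s - mu_b
    return 2.0 * (skellam_log_pmf(k, mu_s, mu_b) - skellam_log_pmf(k, mu_b, mu_b))

def lrm_poisson(mu_s, mu_b):
    """Assumed Poisson statistic: 2 ln[f_P(n; mu_s) / f_P(n; mu_b)] at n = mu_s."""
    return 2.0 * (mu_s * math.log(mu_s / mu_b) - (mu_s - mu_b))

def solve_mu_s(statistic, mu_b, lam_c, enhancement):
    """Bisection for statistic(mu_s, mu_b) = lam_c on one side of mu_b."""
    def g(mu_s):
        return statistic(mu_s, mu_b) - lam_c
    lo = mu_b                      # statistic is zero here, so g < 0
    if enhancement:
        hi = 2.0 * mu_b
        while g(hi) < 0.0:         # expand until the threshold is bracketed
            hi *= 2.0
    else:
        hi = 1.0e-6 * mu_b
        if g(hi) < 0.0:
            return None            # even a near-total deficit cannot reach lam_c
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu_b, lam_c = 1000.0, 25.0
for name, stat in (("Skellam", lrm_skellam), ("Poisson", lrm_poisson)):
    for enhancement in (False, True):
        mu_s = solve_mu_s(stat, mu_b, lam_c, enhancement)
        print(name, "enhancement" if enhancement else "suppression",
              (mu_s - mu_b) / mu_b)
```

Two features of the text show up directly in this sketch. The suppression and enhancement roots are not mirror images of each other, and because a deficit is bounded by zero counts, the suppression root may not exist at all when μb is small (the solver then returns `None`).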
 For illustrative purposes two values of λc are applied corresponding to the Gaussian equivalents of 3σ (λc = 9) and 5σ (λc = 25) detections, i.e., chance probabilities of 2.7 × 10⁻³ and 5.7 × 10⁻⁷, respectively. These represent marginal and minimally significant detections, respectively. Results for the collimated scenario are shown in Figures 2a and 2b, as is the required percentage deviation relative to the null hypothesis count estimate for both the LRM and standard LEND approaches (Figures 2c and 2d). As shown here, the LRM methodology sets a more stringent significance requirement. In other words, the magnitude of any neutron suppression or enhancement must be larger than currently required by the LEND analyses in order to meet the same significance threshold.
 Differences in the two methodologies are more clearly seen in Figure 3, which shows ρ = δ_LRM/δ_LEND, the ratio of the deviations (e.g., Figures 2c and 2d) required to achieve the requisite statistical significance for the two analysis methods. It should be noted that ρ is different for suppressions and enhancements (asymmetric), depends on the null hypothesis count level, and asymptotically approaches √2. Therefore, in the limit of large counts expected in the null hypothesis (μb ≫ 5 × 10³) the LRM method requires the magnitude of any deviations to be a factor of ∼√2 larger than that required by the standard LEND analyses.
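The asymptotic value of ρ can be motivated with a short large-count argument. The following is a sketch that assumes Gaussian limits for both distributions and takes the generic threshold to be Nσ √μb with Nσ = √λc; it is not a substitute for the exact roots.

```latex
% Large-count (Gaussian) limit, assuming \mu_b \gg 1 and |k| \ll \mu_b.
% The difference of two Poisson counts with equal means \mu_b has variance 2\mu_b,
% so under the null hypothesis the Skellam statistic behaves as
\lambda_{\rm Skellam} \simeq \frac{k^{2}}{2\mu_b} = \lambda_c
\quad\Longrightarrow\quad
|k|_{\rm LRM} \simeq \sqrt{\lambda_c}\,\sqrt{2\mu_b}.
% The generic criterion compares the same difference with \sqrt{\mu_b}:
|k|_{\rm LEND} = N_\sigma \sqrt{\mu_b}, \qquad N_\sigma = \sqrt{\lambda_c}.
% Dividing both by \mu_b to form the suppression factors gives
\rho = \frac{\delta_{\rm LRM}}{\delta_{\rm LEND}}
\;\longrightarrow\; \sqrt{2} \approx 1.41 .
```

The factor arises solely from the doubled variance of a background-subtracted count difference, which is why it does not appear when no subtraction is needed.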
 Lower limits on the achievable hydrogen abundance determinations are shown in Figure 4 for the collimated case. Here the suppression factor |δ| has been extracted from Figure 1 and equation (6) subsequently solved to obtain the required on-source time T_src for a given total count rate of 5 counts/s, indicative of a low-hydrogen scenario. This null hypothesis rate consists of three contributions as estimated by the LEND team: the “reference” rate (1.9 counts/s) of epithermal neutrons from within the instrument field of view for regolith with a minimal hydrogen content (10 ppm hydrogen), neutrons passing through the walls of the collimator (∼0.3 counts/s), and neutrons from local backgrounds associated with the spacecraft (2.8 counts/s). This null hypothesis rate is approximately equal to the average equatorial count rate from the LEND collimated data set (CSETN ALD) available from the Planetary Data System archive [Boynton, 2009].
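A rough feel for the exposure times involved can be obtained without solving equation (6) exactly. The sketch below uses the large-count approximation λ ≈ k²/(2μb) for the background-subtracted case (and k²/μb otherwise), with μb = rate × T_src and k = δ μb; the function name is hypothetical, and the results are approximate stand-ins for the exact solutions plotted in Figure 4.

```python
def required_exposure(delta, rate, lam_c, background_subtracted=True):
    """Approximate on-source time (s) for a suppression |delta| to reach lam_c.

    Large-count approximation: lam ~ (delta * mu_b)**2 / (f * mu_b) with
    mu_b = rate * T and f = 2 for a Skellam (subtracted) difference, else 1.
    """
    variance_factor = 2.0 if background_subtracted else 1.0
    return variance_factor * lam_c / (delta ** 2 * rate)

# Null hypothesis rate quoted in the text: reference + collimator walls + spacecraft.
null_rate = 1.9 + 0.3 + 2.8  # counts/s

for delta in (0.045, 0.10):
    for lam_c in (9.0, 25.0):
        t_coll = required_exposure(delta, null_rate, lam_c, True)
        t_unc = required_exposure(delta, null_rate, lam_c, False)
        print(delta, lam_c, round(t_coll), round(t_unc))
```

In this approximation the background-subtracted case always needs twice the exposure of the unsubtracted case for the same |δ| and threshold.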
 For uncollimated scenarios the LRM and standard LEND approaches give similar results, i.e., ρ ∼ 1 for all μb ≫ 1. This is simply a restatement of the fact that the Poisson distribution approaches a Gaussian for large expected means, and provides a validation of the analysis presented above. Therefore, uncollimated data sets will generate significances ∼√2 times larger than the collimated data sets requiring background subtraction and a Skellam-based statistical approach. Of course, the larger statistical significance comes at the cost of degraded spatial resolution.
 Since the Skellam-based LRM method requires deeper suppression or more pronounced enhancement of neutrons than the canonical LEND analysis to meet similar significance thresholds, the ultimate impact is a requirement for longer exposures and/or a larger detector area to meet discovery requirements. Naively, the number of detected neutron counts is directly proportional to the instrument's exposure, the product of area and on-source time (∝ A_det T_src), while signal to background improves as the square root of exposure (i.e., ∝ √(A_det T_src)). Therefore, given this naive (and likely conservative) benchmark, the LRM technique imposes a requirement for instrumental exposure ∼2× larger than LEND prelaunch estimates to reach the same statistical significances. The instrument area is fixed since LEND is currently operating in lunar orbit. Therefore, to meet the minimum statistical significance requirements, on-source times must be increased by a factor of 2.
 In lieu of this additional exposure, the LRM method can be used to estimate the achievable hydrogen sensitivities for planned operations, as well as to reevaluate the statistical significances of lunar neutron data published by the LEND team. Table 1 is a reproduction of expected LEND exposure times for various North and South Pole PSRs [Mitrofanov et al., 2010a]. Given a null hypothesis (hydrogen-free) neutron count rate and the tabulated exposure times, the total number of expected null hypothesis counts μb for each region is computed. Then, following the procedure outlined above, equation (6) is solved to obtain μs (e.g., Figure 2), and finally the corresponding suppression factor δ is computed. These results are listed in Table 1 for a null hypothesis rate corresponding to the LEND average collimated count rate of 5 counts/s.
Table 1. Expected LEND Exposure Times for North and South Pole PSRs and the Achievable Suppression Factors (δ = (μs − μb)/μb) Obtained Using the LRM Technique^a
Exposure Time (s)
^a The suppression factor δ is obtained using a null hypothesis (minimal hydrogen) count rate of 5 counts/s and is tabulated for both 3σ and 5σ significance thresholds. Expected exposures are from Mitrofanov et al. [2010a].
 From Table 1 it is apparent that only 3 South Pole PSRs (S1, S2, and S3) will reach the sensitivity necessary to probe hydrogen abundances between ∼10 and 100 ppm (e.g., |δ| ≤ 0.045) at the nominal discovery threshold corresponding to 5σ, and all North Pole PSRs are restricted to >100 ppm sensitivity. If the significance threshold is reduced to 3σ, then 2 additional South Pole and 4 North Pole PSRs (S4, S5, N1, N2, N3, and N4) can be probed to this hydrogen abundance level.
 Finally, we can determine the significance of published LEND features of interest within the context of our statistical approach. As discussed previously, LEND has quoted detection significances for a number of South Pole PSRs and NSRs [Mitrofanov et al., 2010b], some of which are listed in Table 2. Using the LEND reference rate as the null hypothesis (minimal hydrogen) count rate estimate and their quoted statistical significances, it is possible to obtain the number of “signal” counts μs detected from these regions, and ultimately determine the LRM λ statistic. Based on the first year of data released to the Planetary Data System, LEND exposures vary between 10² and 10⁴ s within 0.5° × 0.5° spatial elements near the poles (i.e., latitudes |λ| ≥ 80°). Using an exposure estimate of 10⁵ s, Table 2 suggests that only the PSR associated with the Shoemaker crater meets the criteria for a significant detection (5σ) when the LRM technique is employed, although the Cabeus NSR exceeds the marginal detection threshold (3σ). It should be noted that significances do not improve substantially even with very long exposure times (e.g., >10⁵ s), and are therefore in conflict with the significances reported by the LEND team.
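The direction and rough size of this reevaluation can be previewed with the large-count approximation, in which a deviation quoted at Nσ by the generic method corresponds to λ ≈ Nσ²/2 for a background-subtracted difference. The sketch below is only that approximation, not the exact Skellam calculation behind Table 2; the function name and the quoted significances are hypothetical inputs.

```python
import math

def lrm_from_quoted_sigma(n_sigma_quoted):
    """Return (lambda, equivalent sigma, chance probability) for q = 1.

    Large-count approximation: the quoted deviation k = N_sigma * sqrt(mu_b)
    gives lambda ~ k**2 / (2 * mu_b) = N_sigma**2 / 2 for a Skellam difference.
    """
    lam = n_sigma_quoted ** 2 / 2.0
    sigma_equiv = math.sqrt(lam)
    p_chance = math.erfc(math.sqrt(lam / 2.0))
    return lam, sigma_equiv, p_chance

# Hypothetical quoted significances, for illustration only.
for quoted in (3.0, 5.0, 7.0):
    lam, sigma_equiv, p_chance = lrm_from_quoted_sigma(quoted)
    print(quoted, lam, sigma_equiv, p_chance)
```

Under this approximation a feature quoted at 5σ falls to about 3.5σ, and a quoted significance above roughly 7σ is needed to remain at the 5σ level.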
Table 2. Suppression of Epithermal Neutron Count Rate From Within the Nine PSRs and Within the LEND-Identified NSR Associated With the Cabeus Crater^a
Significance (This Work)
^a Shown are the statistical significances given by the LEND team [Mitrofanov et al., 2010b], as well as the likelihood ratio (λ) and random probability using the Skellam-based statistical approach. The latter was computed using an exposure time of 10⁵ s, a value exceeding the typical annual LEND exposure near the lunar poles.
8.2 × 10⁻³ (deficit)
8.9 × 10⁻²
1.4 × 10⁻¹³ (deficit)
2.4 × 10⁻⁷
1.1 × 10⁻⁴ (deficit)
8.9 × 10⁻³
Cabeus B (PSR)
4.2 × 10⁻¹ (deficit)
8.9 × 10⁻¹
2.4 × 10⁻¹ (deficit)
6.2 × 10⁻¹
1.3 × 10⁻³ (deficit)
3.4 × 10⁻²
Malapert F1 (PSR)
3.1 × 10⁻¹ (deficit)
7.2 × 10⁻¹
5.4 × 10⁻² (enhance)
2.6 × 10⁻¹
Cabeus A1 (PSR)
3.8 × 10⁻¹ (deficit)
8.3 × 10⁻¹
5.8 × 10⁻⁸ (deficit)
1.8 × 10⁻⁴
Cabeus Central (NSR)
6.2 × 10⁻³ (deficit)
7.7 × 10⁻²
 An updated statistical approach has been presented that incorporates the statistical restrictions inherent in counting experiments governed by Poisson statistics. In contrast to generic approaches using quasi-Gaussian statistics, our methodology addresses the discrete, bounded, and asymmetric nature of these data. The Skellam distribution has been used to properly describe the difference between two observations and is relevant to collimated orbital neutron spectroscopy applications such as the LEND experiment aboard LRO. Because its collimation is imperfect, a background (noncollimated) component must be subtracted, which in turn requires this updated statistical approach. We find that detection significances have previously been overestimated by a factor of ∼√2, and therefore that exposure times ∼2× longer than originally envisioned are required to meet minimum detection significance criteria. In addition, we find that even for unrealistically long exposure times the achievable detection significances are lower than currently reported by the LEND team. Relaxing the spatial resolution requirement and treating the data as uncollimated has been shown to result in correspondingly higher detection significances than the collimated data.
 This work was supported by the NASA Lunar Science Institute grant NNA09DB31A.