# Confidence Intervals for Linear Combinations of Poisson Means

## Summary

Parametric confidence intervals are given for linear combinations of the means of independent Poisson variables and for their continuous versions. The performance of the intervals is assessed using simulation. A real data set is used to compare the proposed intervals with known ones. The proposed intervals are shown to be superior to known ones and comparable to exact intervals.

## 1 Introduction

Problems of finding confidence intervals for functions of Poisson means arise naturally in a variety of contexts. Here are five such problems:

Problem 1. Multiple comparisons procedures for Poisson data with application to comparing defects at an electronics shop over different days (Scheaffer 1980), and to investigating the impact on cancer development of several treatments for Hodgkin's disease (Suissa & Salmi 1989).

Problem 2. In Azerbaijan, about 300 structures that could be oil fields are known onshore, and 66 structures are already recognized in the offshore region. Bagirov & Lerche (1998) wanted to know the fraction of these structures that can be expected to yield horizons with commercial value. To find an answer to this problem, they conducted a statistical analysis of data covering the last 100 years of oil production in Azerbaijan. The number of producing horizons per field was described by a linear combination of Poisson random variables.

Problem 3. A demerit rating system is used to simultaneously monitor counts of several different types of defects in a complex product. The demerit statistic is a linear combination of the counts of these different types of defects. The traditional recommendation is to plot the demerit statistic on a control chart with symmetric 3-sigma control limits. Jones, Woodall & Conerly (1999) proposed an alternative method for determining control limits for the demerit control chart, based on the exact distribution of linear combinations of independent Poisson random variables.

Problem 4. A standard method of estimation with applications to chemistry, the study of geothermal bores and other areas involves adding a radioactive isotope and measuring the number of counts before and after addition. We observe independent Poisson counts in seconds with means , where is the th decay rate, is background noise and is background plus signal. A confidence interval is required for , the signal decay rate.

Problem 5. An increase is observed in the per capita rate of first admissions to psychiatric care. Is the increase significant? Let be the (known) population at time . Let be the number first admitted between times and . Assume for small

where is the (unknown) rate at time . Assume that numbers first admitted in different periods are independent. Then is a Poisson process with mean . The problem of significance can be answered if we have a confidence interval for , where , and are the periods being compared and is an appropriate weight function satisfying . Set , so that . If, in fact, is only available at times (for example, annually) for some then is only estimable if can be expressed as the union of one or more intervals ) and is chosen to be constant over each such interval.

The first four examples and the constrained form of the fifth example are special cases of finding a confidence interval for

(1)

where is the number of Poisson variables or Poisson means involved (assumed known), are the weights which are also assumed known and are the unknown Poisson means. We assume that we have observations on independent Poisson variables with means , . The parameters of (1) are: .
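As a point of reference for (1), the plug-in estimate and its estimated variance follow directly from the Poisson assumption. The symbols below ($p$ for the number of variables, $a_i$ for the weights, $\lambda_i$ for the means and $X_i$ for the observed counts) are notation assumed here for illustration and may differ from the original.

$$
\hat\theta = \sum_{i=1}^{p} a_i X_i, \qquad
E\,\hat\theta = \sum_{i=1}^{p} a_i \lambda_i = \theta, \qquad
\operatorname{var}\hat\theta = \sum_{i=1}^{p} a_i^2 \lambda_i, \qquad
\widehat{\operatorname{var}}\,\hat\theta = \sum_{i=1}^{p} a_i^2 X_i .
$$

The variance formula uses independence of the $X_i$ together with $\operatorname{var} X_i = \lambda_i$; replacing each $\lambda_i$ by $X_i$ gives an unbiased estimate of that variance.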

These examples are special cases of the following problem: given a known weight function and an observed Poisson process with unknown mean find a confidence interval for

(2)

The problem of finding a confidence interval for

(3)

where are observed independent Poisson processes with unknown mean functions and known weight functions , is reducible to (2) since we may combine into a single process.

To the best of our knowledge, there are only two papers, Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010), giving parametric confidence intervals for (1) and there are none giving parametric confidence intervals for (2) and (3). The confidence intervals in Stamey & Hamilton (2006) are variations of approximations based on the Central Limit Theorem (CLT). The confidence intervals in Krishnamoorthy & Lee (2010) are based on normal and chi-square approximations. These intervals may not be as accurate as those proposed here because they are based on the CLT whereas we use tools based on higher order approximations. We have empirical evidence that our intervals provide improved accuracy.

The aim of this paper is to provide accurate parametric confidence intervals for (1), (2) and (3). Section 2 contains the derivation of these confidence intervals. Section 3 assesses the performance of the derived intervals in terms of their widths and coverage probabilities. A part of this assessment is based on simulation. Section 4 demonstrates the importance of the derived intervals using a real data set. Some conclusions are noted in Section 5. Some technical details required for the results in Section 2 are provided in Appendix I.

## 2 Confidence intervals

In this section, we apply the results of Withers (1983a, 1983b, 1989) to obtain accurate confidence intervals for

corresponding to situations (1), (2) and (3) in terms of

respectively, for any non-negative real number.

The accurate confidence intervals are constructed by a method of successive approximation starting from a CLT for the studentized statistic, in this case

The method was developed by Welch (1947) for the Behrens–Fisher problem and by Winterbottom (1979) and Withers (1982a, 1989) independently for the general parametric situation and by Withers (1983b, 1988) for one-sample and multi-sample non-parametric problems. The method rests on the expansions of Cornish & Fisher (1937). For the parametric situation relevant to (1) we use the variation in Appendix I with

where and is a parameter determined below. Then Theorem I.1 in Appendix I holds if and is bounded away from zero. Assuming , these conditions are satisfied for

This is substantially better than , which is what one would normally expect.
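The role of the Cornish & Fisher (1937) expansions can be illustrated with the classical quantile expansion through the fourth cumulant. The sketch below is that textbook form, not the variation of Appendix I; the function name `cornish_fisher_quantile` and the choice of a single Poisson variable with mean 10 are assumptions made for illustration.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, skew, excess_kurt):
    """Approximate p-quantile of a standardized variable (mean 0, variance 1).

    Classical Cornish-Fisher expansion using skewness and excess kurtosis.
    """
    z = NormalDist().inv_cdf(p)
    return (
        z
        + (z**2 - 1) * skew / 6
        + (z**3 - 3 * z) * excess_kurt / 24
        - (2 * z**3 - 5 * z) * skew**2 / 36
    )


# A standardized Poisson variable with mean lam has skewness lam**-0.5
# and excess kurtosis 1/lam (all Poisson cumulants equal lam).
lam = 10.0
z_cf = cornish_fisher_quantile(0.975, lam**-0.5, 1 / lam)
print(round(z_cf, 3))  # about 2.102, versus 1.960 for the normal quantile
```

With zero skewness and zero excess kurtosis the correction terms vanish and the function returns the normal quantile, which is the first-order (CLT) case.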

Theorem 2.1 provides two-sided and one-sided confidence intervals for of (1). A symmetric version of these intervals is given in Theorem 2.2.

Theorem 2.1. A two-sided confidence interval for of (1) is:

(4)

for , where

and and are such that

the nominal level of the confidence interval, and where in turn , , , and is the probability density function of a standard normal variable. One-sided confidence intervals for of (1) are:

for .

Proof. Let . The studentized statistic is

where

Thus, from Withers (1982b, 1983a, 1989) we obtain the result.

Note that estimates the standardized th cumulant of . Note also that and is a function of (, ).

Theorem 2.2. A symmetric confidence interval for of (1) is:

(5)

for , where , , and .

Proof. The result follows from Withers (1982b).

Note that is an odd function of determined by . Note also that only s1 and s2 are given. Subsequent calculations do not need sk for k > 2.

We shall refer to (4) as the th order confidence interval with respect to . The confidence interval given by (5) is the 2th order confidence interval with respect to the same .

The lengths of the intervals given by (4) for and are the same for . The length of (4) for is greater than that for if and only if for . The length of (5) for is greater than that for if and only if for .

There is no guarantee that and for all . These inequalities may hold for some and may not hold for other especially because of the discrete nature of the Poisson random variables. Thus, the lengths of the confidence intervals given by (4) and (5) may oscillate with increasing . Eventually, and will diverge, and the lengths will become infinite.

The th order confidence interval of nominal level has actual level as . Thus, the coverage probabilities of (4) and (5) will generally take values closer to the nominal level with increasing and with increasing .

In practice, should be chosen to maximize the coverage probability of the confidence intervals and to keep their lengths as short as possible. In other words, should be chosen so large that the coverage probability is as large as possible, but not so large that the length diverges. The choice is a trade off between being too large and being too small. One possible choice is to take as the largest integer for which both and decrease for all .

The CLT confidence intervals correspond to setting in (4) and (5). The confidence intervals of (4) and (5) improve on the CLT versions at least in terms of the coverage probability. The lengths of (4) and (5) may or may not be shorter than those of the CLT versions. This depends on how compares with the rest of the s and on how compares with the rest of the s. Comparisons based on lengths have been considered by several authors. For example, Winterbottom (1979) has shown that for a binomial problem the th order confidence interval can give a significant improvement over the CLT version.

So far we have obtained th order type confidence intervals for (1). Taking limits and using we see that (4) and (5) give th and 2th order type confidence intervals for with respect to in terms of and for with respect to in terms of . In each case, the error is the corresponding limit of the error for (1). Of course, the are assumed to exist.

## 3 Numerical comparisons

In this section we compare the performance of the two-sided confidence intervals, (4) and (5), for , and , where corresponds to CLT confidence intervals. The comparison is performed partly through exact calculations (see Figures 1 and 2) and partly through simulations (see Figures 3 and 4).

We use two criteria to assess performance. The first is the expected width of the confidence interval and its standard deviation as computed from (4) and (5). For example, if then the expected width of the confidence interval for (4) and its standard deviation are and , respectively, where , , , and . The expected width of the confidence interval for (5) for and its standard deviation can be computed using and . Also if , an exact confidence interval for (1) is

(6)

say, with expected width and standard deviation . These can be used to assess the accuracy of expected widths and standard deviations from (4) and (5).
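For a single Poisson count, an exact interval of the usual equal-tailed kind can be computed by inverting the Poisson distribution function. The sketch below assumes that form (whether it coincides with (6) term by term is not confirmed here); the names `poisson_cdf`, `bisect_root` and `exact_poisson_interval` are hypothetical, and only the standard library is used.

```python
import math


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Poisson(lam), summed term by term."""
    if x < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for k in range(1, x + 1):
        term *= lam / k
        total += term
    return total


def bisect_root(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exact_poisson_interval(x, level=0.95):
    """Equal-tailed exact interval for the mean, given one observed count x.

    Intended for moderate counts: exp(-lam) underflows for very large lam.
    """
    alpha = 1 - level
    search_max = x + 10 * math.sqrt(x + 1) + 20  # generous upper search limit
    if x == 0:
        lower = 0.0
    else:
        # Lower limit solves P(X >= x; lam) = alpha / 2.
        lower = bisect_root(
            lambda lam: 1 - poisson_cdf(x - 1, lam) - alpha / 2, 0.0, search_max
        )
    # Upper limit solves P(X <= x; lam) = alpha / 2.
    upper = bisect_root(lambda lam: poisson_cdf(x, lam) - alpha / 2, 0.0, search_max)
    return lower, upper


print(exact_poisson_interval(5))  # roughly (1.62, 11.67)
```

Bisection is used instead of chi-square quantiles only to avoid a dependency; both routes give the same limits.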

The second of the two criteria is the coverage probability of (4) and (5) obtained by simulating samples of size 10,000 from

where the are independent Poisson random variables with means , . We considered several choices for the weights:

However all of these choices yielded similar results. For simplicity we shall report the results only for the last choice of . We shall also assume throughout that for .
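The simulation design described above can be sketched for the first-order (CLT) interval, estimate ± z × standard error. This is an illustration under stated assumptions, not the code behind Figures 3 and 4: the weights `(0.5, 0.3, 0.2)`, the means, the seed and the function names are all hypothetical, and the higher order intervals (4) and (5) are not implemented here.

```python
import math
import random
from statistics import NormalDist


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method.

    Adequate for small to moderate lam.
    """
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def wald_coverage(weights, means, level=0.95, reps=10_000, seed=0):
    """Fraction of simulated first-order intervals that cover the true value."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    theta = sum(a * m for a, m in zip(weights, means))
    hits = 0
    for _ in range(reps):
        counts = [sample_poisson(m, rng) for m in means]
        est = sum(a * x for a, x in zip(weights, counts))
        half = z * math.sqrt(sum(a * a * x for a, x in zip(weights, counts)))
        hits += est - half <= theta <= est + half
    return hits / reps


weights = (0.5, 0.3, 0.2)  # hypothetical weights
cov_small = wald_coverage(weights, (1.0, 1.0, 1.0))
cov_large = wald_coverage(weights, (20.0, 20.0, 20.0))
print(cov_small, cov_large)
```

Comparing the two printed values shows how far the first-order interval is from the nominal level when the means are small, which is the gap the higher order intervals aim to close.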

Figures 1 and 2 show how the expected widths for (4) and (5) vary with respect to for , and . The length of the vertical bars shown in these figures is the standard deviation multiplied by . The vertical bars are offset from each other for the purpose of visibility.

Figures 3 and 4 show how the coverage probabilities of (4) and (5) computed by simulation vary with respect to for , and . The expected width, ±1.96 standard deviation bars and the coverage probability of the exact interval in (6) are included for the case .

The expected widths for (4) and their standard deviations are the same for and for . Figure 1 shows the results for (4) only for .

The following conclusions can be drawn from Figures 1 and 2:

• the expected widths for (4) increase from to for every and ,
• the expected widths for (5) increase from to for every and ,
• the expected widths for both (4) and (5) generally increase with increasing for every and ,
• the expected widths for both (4) and (5) generally decrease with increasing ,
• the width for (4) is sometimes shorter than that of the exact interval for ,
• the standard deviations of the widths for both (4) and (5) do not appear to show any recognizable pattern.

The following conclusions can be drawn from Figures 3 and 4:

• the coverage probabilities of (4) increase monotonically from to for every and ,
• the coverage probabilities of (5) increase monotonically from to for every and ,
• the coverage probabilities for both (4) and (5) show a general pattern of increase with respect to increasing especially for small .

It is clear that the intervals given by (4) for are the closest to the exact interval in terms of both the expected width and the coverage probability. It is also clear that the intervals given by (5) for are the closest to the exact interval in terms of both the expected width and the coverage probability. The CLT confidence intervals corresponding to in (4) and (5) perform poorly especially for small .

The discussion so far has not focussed on the confidence intervals for (2) and (3). However, the confidence intervals (4) and (5) are good approximations for the continuous version for large . Thus the confidence intervals discussed in Figures 1-4 for and in Section 4 for can be considered to correspond to the continuous versions.

## 4 Data example

In this section we demonstrate the practical value of the confidence interval (4) using a real data set from Stamey & Hamilton (2006). The data set considered by these authors (see Table 1) contains the number of fatal motor vehicle accidents (FMVA) involving driving while intoxicated (DWI) during six major holidays for the year 2000. The statistics are taken from the Crash Records Bureau of the Texas Department of Public Safety.

Table 1. Number of driving-while-intoxicated-involved fatal motor vehicle accidents during six major holidays in 2000

| Holiday | Number of accidents |
|---|---|
| Memorial Day | 0 |
| July 4 | 5 |
| Labor Day | 2 |
| Thanksgiving | 11 |
| Christmas | 8 |
| New Year's Eve | 9 |

Stamey & Hamilton (2006) were interested in estimating the average number of DWI-involved fatal accidents per holiday, and whether more such accidents occur during the winter holidays (Thanksgiving, Christmas and New Year's Eve) than during the summer holidays (Memorial Day, July 4 and Labor Day). For the first quantity, for all , and we want to estimate , say. For the second quantity, and , and we want to estimate , say. Stamey & Hamilton (2006) obtained the following 95 percent confidence intervals: (3.90, 7.77), (3.87, 7.80), (4.31, 8.35), (4.29, 8.38) and (3.13, 10.87), (3.07, 10.93), (2.97, 11.03), (2.91, 11.09) for and , respectively, based on four different methods.
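The two target quantities can be made concrete with a plain first-order interval, estimate ± z × standard error, applied to the counts in Table 1. The sketch below is an illustration only: the weights (1/6 for every holiday for the average; +1/3 for winter and -1/3 for summer holidays for the difference of seasonal averages) are assumptions inferred from the verbal description, and the function name `wald_interval` is hypothetical. It is not claimed to be any of the four methods of Stamey & Hamilton (2006), although under these assumptions the output agrees to two decimals with the first interval quoted for each quantity.

```python
import math
from statistics import NormalDist

# Counts from Table 1.
counts = {
    "Memorial Day": 0,
    "July 4": 5,
    "Labor Day": 2,
    "Thanksgiving": 11,
    "Christmas": 8,
    "New Year's Eve": 9,
}
summer = ("Memorial Day", "July 4", "Labor Day")


def wald_interval(weights, data, level=0.95):
    """First-order interval for sum(a_i * lambda_i) from independent counts."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    est = sum(weights[h] * x for h, x in data.items())
    se = math.sqrt(sum(weights[h] ** 2 * x for h, x in data.items()))
    return est - z * se, est + z * se


# Assumed weights: average per holiday, and winter-minus-summer average.
w_average = {h: 1 / 6 for h in counts}
w_season = {h: (-1 / 3 if h in summer else 1 / 3) for h in counts}

ci_average = wald_interval(w_average, counts)
ci_season = wald_interval(w_season, counts)
print([round(v, 2) for v in ci_average])  # [3.9, 7.77]
print([round(v, 2) for v in ci_season])   # [3.13, 10.87]
```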

Using the normal approximation given by equation (11) in Krishnamoorthy & Lee (2010), we obtained the confidence intervals (4.10, 7.96) and (3.10, 10.8) for and , respectively. Using the chi-square approximation given by equation (12) in Krishnamoorthy & Lee (2010), we obtained the confidence intervals (3.95, 7.8) and (3.01, 10.66) for and , respectively.

Using the two sided confidence interval, (4), with and , we obtained the confidence intervals (4.01, 7.82) and (3.34, 10.95) for and , respectively.

For this data set, our intervals are the shortest: each of the intervals due to Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010) is wider than ours. The intervals due to Krishnamoorthy & Lee (2010) appear shorter than those due to Stamey & Hamilton (2006). Of the two intervals given in Krishnamoorthy & Lee (2010), the one based on chi-square approximation appears shorter. This is in agreement with the findings in Krishnamoorthy & Lee (2010).
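The width comparison can be checked by simple arithmetic on the intervals quoted above. In the sketch below the interval endpoints are copied from the text; the labels "method 1" to "method 4" only record the order in which the Stamey & Hamilton (2006) intervals are listed, and the keys `theta_1` and `theta_2` are hypothetical names for the average and the seasonal difference.

```python
quoted = {
    "theta_1": {
        "Stamey & Hamilton, method 1": (3.90, 7.77),
        "Stamey & Hamilton, method 2": (3.87, 7.80),
        "Stamey & Hamilton, method 3": (4.31, 8.35),
        "Stamey & Hamilton, method 4": (4.29, 8.38),
        "Krishnamoorthy & Lee, normal": (4.10, 7.96),
        "Krishnamoorthy & Lee, chi-square": (3.95, 7.80),
        "proposed": (4.01, 7.82),
    },
    "theta_2": {
        "Stamey & Hamilton, method 1": (3.13, 10.87),
        "Stamey & Hamilton, method 2": (3.07, 10.93),
        "Stamey & Hamilton, method 3": (2.97, 11.03),
        "Stamey & Hamilton, method 4": (2.91, 11.09),
        "Krishnamoorthy & Lee, normal": (3.10, 10.80),
        "Krishnamoorthy & Lee, chi-square": (3.01, 10.66),
        "proposed": (3.34, 10.95),
    },
}


def widths(intervals):
    """Width of each interval, rounded to the two decimals reported."""
    return {name: round(hi - lo, 2) for name, (lo, hi) in intervals.items()}


shortest = {}
for target, intervals in quoted.items():
    w = widths(intervals)
    shortest[target] = min(w, key=w.get)
    print(target, w)

print(shortest)
```

The differences between the shortest widths are a few hundredths, so the ranking is sensitive to the rounding of the reported endpoints.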

The above observations are based on a single data set. Thus, we cannot be sure that the methods of Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010) overestimate, or that the intervals of Stamey & Hamilton (2006) perform worst, or that the normal approximation method of Krishnamoorthy & Lee (2010) is conservative. A much more comprehensive study would be required to substantiate these findings (if indeed the findings are correct in the first place).

## 5 Conclusions

We have proposed confidence intervals for linear combinations of Poisson means and for continuous versions of such combinations. This is the first time such intervals have been proposed. There have only been two papers published, Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010), giving confidence intervals for linear combinations of Poisson means.

The intervals of Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010) are based on first order normal approximations and first order chi-square approximations. Our proposed intervals for linear combinations of Poisson means are based on higher order approximations than the CLT. Thus, our intervals can be expected to perform better than the CLT versions and those due to Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010).

We have performed a simulation study to compare the proposed intervals and the CLT versions in terms of expected widths and coverage probabilities. In this study, we observed that the proposed intervals outperform the CLT versions in terms of coverage probabilities. The proposed intervals appear comparable to exact intervals in terms of both expected widths and coverage probabilities. Sometimes the expected widths of the proposed intervals are shorter than those of exact intervals.

We have also illustrated an application using a data set on numbers of FMVA. In this application we observed that the proposed intervals have shorter lengths than those due to Stamey & Hamilton (2006) and Krishnamoorthy & Lee (2010).

## Appendix I

Theorem I.1. Suppose that is a function of , bounded as . Let have derivatives which are bounded with respect to n. Let be an estimate of such that for the th order cumulants of can be expanded as linear combinations of with coefficients functions of bounded with respect to n and such that the leading coefficient of is . Then can be expanded as a linear combination of with coefficients functions bounded with respect to n with the leading coefficients being given in the appendix to Withers (1982a); also the leading coefficient of is .

Proof. Allow in Withers (1982a) to depend on .

Suppose now that and , the leading coefficient in the expansion for var , is bounded away from zero with respect to n. Then, under the conditions of Withers (1988), there exist bounded functions such that

as , where is fixed, , , and is the cumulative distribution function of a unit normal variable.

## Acknowledgements

The authors would like to thank the Editor, the Associate Editor and the three referees for careful reading and for their comments which greatly improved the paper.