Abstract

  1. Top of page
  2. Abstract
  3. 1. Introduction
  4. 2. Type I error and control of familywise error rate
  5. 3. Power
  6. 4. Conclusion
  7. Acknowledgement
  8. References
  9. Appendices

We propose a simple modification of Hochberg's step-up Bonferroni procedure for multiple tests of significance. The proposed procedure is always more powerful than Hochberg's procedure for more than two tests, and is more powerful than Hommel's procedure for three and four tests. A numerical analysis of the new procedure indicates that its Type I error is controlled under independence of the test statistics, at a level equal to or just below the nominal Type I error. Examination of various non-null configurations of hypotheses shows that the modified procedure has a power advantage over Hochberg's procedure which increases with the number of false hypotheses.


1. Introduction

We consider the classical problem of testing n hypotheses, H1, …, Hn, with corresponding p-values, P1, …, Pn. Our goal is to devise a procedure that strongly controls the familywise error rate (FWE) defined, for example, in Hochberg & Tamhane (1987). We assume that the hypotheses satisfy Holm's (1979) ‘free association’ condition, and that the corresponding p-values are independent.

Simes (1986) introduced an improved Bonferroni procedure for testing the global null hypothesis H0 = ∩Hi. Consider the ordered p-values, P(1) ≤ … ≤ P(n), and the corresponding hypotheses, H(1), …, H(n). Simes's global test rejects H0 if P(j) ≤ jα/n for any j = 1, …, n. When the global null hypothesis is rejected, there is still a question of which individual hypotheses can be rejected.

Using the closure principle of Marcus, Peritz, and Gabriel (1976), Hommel (1988) and Hochberg (1988) offered extensions to Simes’ test to make inferences on individual hypotheses. Subsequently, Hommel (1989) showed that his procedure is more powerful than Hochberg's procedure. Since both of these procedures are conservative, we consider the following improvement of Hochberg's procedure. Let n≥ 2; for any i= 1, …, n, if P(n+1−i)≤α (i+ 1)/2i, reject any Hj with Pj≤α/i.
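The rule just stated can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `step_up_rejections`, its arguments and the example p-values are hypothetical. At the last step the sketch uses Cn = 1/n, which gives the same rejections as α(n + 1)/2n because only hypotheses with Pj ≤ α/n are rejected there.

```python
def step_up_rejections(pvalues, alpha=0.05, modified=True):
    """Return the indices of the rejected hypotheses.

    modified=True uses C_i = (i + 1) / (2i); modified=False uses
    Hochberg's C_i = 1 / i. Both reject every H_j with P_j <= alpha / i
    at the first step i whose condition P(n+1-i) <= C_i * alpha holds.
    """
    n = len(pvalues)
    ordered = sorted(pvalues)  # ordered[0] is P(1), ordered[n - 1] is P(n)
    for i in range(1, n + 1):
        if modified and i < n:
            c_i = (i + 1) / (2 * i)
        else:
            c_i = 1 / i  # Hochberg's constant, also used at the last step
        if ordered[n - i] <= c_i * alpha:
            return [j for j, p in enumerate(pvalues) if p <= alpha / i]
    return []


# Hypothetical p-values, chosen only to show a case where the two rules differ.
example = [0.012, 0.030, 0.20, 0.024]
print(step_up_rejections(example, modified=True))   # modified procedure
print(step_up_rejections(example, modified=False))  # Hochberg's procedure
```

With these p-values the modified rule stops at i = 2 (P(3) = .030 ≤ .0375) and rejects both hypotheses with p ≤ .025, whereas Hochberg's rule reaches the last step and rejects only the hypothesis with p = .012.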

Note that the trivial case of n= 2 reduces to Hommel's and Hochberg's procedures. It is straightforward to show that this procedure is always more powerful than Hochberg's procedure, and is also more powerful than Hommel's procedure for three and four tests. Neither the proposed procedure nor Hommel's procedure dominates the other for more than four tests; however, the simplicity of the new procedure makes it appealing.

Table 1 displays the modified and Hochberg's critical values, as well as the rejection rule, for up to 10 tests. In Section 2 we show that the modified procedure strongly controls the FWE. Appendix A gives two detailed examples of the implementation of the modified procedure. The examples illustrate that the modified procedure can lead to more rejections when several of the hypotheses are false. This property is discussed in more detail in Section 3.

Table 1. Critical values and decision rules, n= 10, α= .05

| i | Ci, Hochberg: 1/i | Ci, modified: (i+ 1)/2i | If P(n+1−i) ≤ Ciα, Hochberg | If P(n+1−i) ≤ Ciα, modified | Reject |
|---|---|---|---|---|---|
| 1 | 1 | 1 | P(n) ≤ .05 | P(n) ≤ .05 | all Hi |
| 2 | 1/2 | 3/4 | P(n−1) ≤ .025 | P(n−1) ≤ .0375 | any Hi with P(i) ≤ .025 |
| 3 | 1/3 | 2/3 | P(n−2) ≤ .0167 | P(n−2) ≤ .0333 | any Hi with P(i) ≤ .0167 |
| 4 | 1/4 | 5/8 | P(n−3) ≤ .0125 | P(n−3) ≤ .0313 | any Hi with P(i) ≤ .0125 |
| 5 | 1/5 | 3/5 | P(n−4) ≤ .01 | P(n−4) ≤ .03 | any Hi with P(i) ≤ .01 |
| 6 | 1/6 | 7/12 | P(n−5) ≤ .0083 | P(n−5) ≤ .0292 | any Hi with P(i) ≤ .0083 |
| 7 | 1/7 | 4/7 | P(n−6) ≤ .0071 | P(n−6) ≤ .0286 | any Hi with P(i) ≤ .0071 |
| 8 | 1/8 | 9/16 | P(n−7) ≤ .0062 | P(n−7) ≤ .0281 | any Hi with P(i) ≤ .0062 |
| 9 | 1/9 | 5/9 | P(n−8) ≤ .0056 | P(n−8) ≤ .0278 | any Hi with P(i) ≤ .0056 |
| 10 (see Note) | 1/10 | 11/20 | P(n−9) ≤ .005 | P(n−9) ≤ .0275 | any Hi with P(i) ≤ .005 |

Note. For i = n, P(n+1−i) ≤ α/n implies P(n+1−i) ≤ Ciα, so the rule reduces to: if P(1) ≤ α/n, then reject H(1).
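The critical values in Table 1 follow directly from the two formulas 1/i and (i + 1)/2i. As a hedged sketch (the variable names are illustrative, and exact fractions are used only to avoid rounding surprises), they can be regenerated as follows:

```python
from fractions import Fraction

alpha = Fraction(1, 20)  # alpha = .05, held as an exact fraction

rows = []
for i in range(1, 11):
    hochberg = Fraction(1, i)          # Hochberg's C_i = 1/i
    modified = Fraction(i + 1, 2 * i)  # modified C_i = (i + 1)/(2i)
    rows.append((i, hochberg, modified,
                 float(hochberg * alpha), float(modified * alpha)))

# Columns: i, Hochberg C_i, modified C_i, Hochberg threshold, modified threshold
for i, h, m, h_thr, m_thr in rows:
    print(f"{i:2d}  {str(h):>5}  {str(m):>5}  {h_thr:.4f}  {m_thr:.4f}")
```

The thresholds for i = 6 and i = 9 come out as 7/240 and 1/36, which round to .0292 and .0278.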

2. Type I error and control of familywise error rate

To facilitate the comparison between Hochberg's and the modified procedure, we describe both procedures as follows. If P(n) ≤ α, then reject all hypotheses; otherwise, retain H(n), and compare P(n−1) with C2α. If equal or smaller, reject any hypothesis with a p-value less than or equal to α/2; otherwise, retain H(n−1), and compare P(n−2) with C3α. If equal or smaller, reject any hypothesis with a p-value less than or equal to α/3; otherwise, retain H(n−2), and compare P(n−3) with C4α, etc., until the last step, where P(1) is compared with α/n. The set of critical values is either Hochberg's, Ci = 1/i, or the modified set, Ci = (i + 1)/2i (i = 2, …, n). It follows that for both Hochberg's and the modified procedures, the probability of rejecting H0 equals the probability of the following union of events:

$$\bigcup_{i=1}^{n} \left\{ P_{(n+1-i)} \le C_i\,\alpha,\ \ P_{(1)} \le \alpha/i \right\}, \qquad C_1 = 1,\ C_n = 1/n \tag{1}$$

Appendix B elaborates on the steps needed to calculate the probability of (1). In Table 2, we show the numerical results for the modified as well as Hochberg's procedures. As seen in Table 2, both procedures control the Type I error; however, while the modified procedure provides almost perfect control, Hochberg's procedure is somewhat conservative, especially as the number of tests increases.
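As a rough cross-check on the exact calculation, the probability of at least one rejection under the global null can also be estimated by simulation. The sketch below is not the Appendix B algorithm. It assumes independent uniform p-values and counts a replicate as a rejection when some step i satisfies both P(n+1−i) ≤ Ciα and P(1) ≤ α/i. The function names, seed and replicate count are illustrative choices.

```python
import random


def rejects_any(sorted_p, alpha, modified):
    """True if some step i has P(n+1-i) <= C_i * alpha and P(1) <= alpha / i."""
    n = len(sorted_p)
    for i in range(1, n + 1):
        c_i = (i + 1) / (2 * i) if modified else 1 / i
        if sorted_p[n - i] <= c_i * alpha and sorted_p[0] <= alpha / i:
            return True
    return False


def type1_error(n, alpha=0.05, modified=True, reps=200_000, seed=1):
    """Monte Carlo estimate of the overall Type I error under independence."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        p = sorted(rng.random() for _ in range(n))  # uniform p-values under H0
        hits += rejects_any(p, alpha, modified)
    return hits / reps


for n in (2, 3):
    print(n,
          type1_error(n, modified=False, reps=50_000),
          type1_error(n, modified=True, reps=50_000))
```

The estimates carry Monte Carlo error, so they should only be expected to agree with the exact values in Table 2 to about two or three decimal places at these replicate counts.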

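Table 2. Exact overall Type I error rate: independent test statistics, α= .05

| n | Hochberg | Modified |
|---|---|---|
| 2 | .05 | .05 |
| 3 | .049406 | .05 |
| 4 | .049179 | .049993 |
| 5 | .049073 | .049991 |
| 6 | .049011 | .049990 |
| 7 | .048970 | .049990 |
| 8 | .048942 | .049990 |
| 9 | .048920 | .049990 |
| 10 | .048903 | .049989 |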

The results shown in Table 2 are the Type I error rates for testing the overall (intersection) hypothesis, H0 = ∩Hi, given that H0 is true. As seen, this rate does not exceed α for either procedure. Since no Hj can be rejected unless (1) is satisfied, it follows that if H0 is true, the probability of rejecting any Hj is at most α, which ensures weak control of the FWE. According to the closure principle of Marcus et al. (1976), to provide strong control of the FWE, Hj must not be rejected unless every subset intersection hypothesis that includes Hj as a component is also rejected by a direct or implied α-level test. Appendix C provides the proof of strong control of the FWE using the closure principle.

Modified Bonferroni procedures have an advantage over the simple or step-down Bonferroni procedure primarily in the case of correlated statistics, for which the Type I error is closer to the nominal level. It is therefore of interest to investigate null and non-null configurations under non-zero correlations as well as under independence.

Table 3 shows simulation results of null configurations, with 100,000 replicates of each instance. The number of null hypotheses ranges from 3 to 10, and correlations range from 0 to .9. The p-values were generated from standard normal variates where zero correlation represents independence of the test statistics. Equicorrelations were used in all other configurations.
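A simulation of this kind can be sketched as follows. The construction is an assumption, since the text does not give the generating code: equicorrelated standard normal variates are built from one shared component plus independent noise, and each is converted to a one-sided upper-tail p-value. The function names, seed and replicate count are illustrative.

```python
import math
import random
from statistics import NormalDist


def rejects_any(sorted_p, alpha, modified):
    """True if some step i has P(n+1-i) <= C_i * alpha and P(1) <= alpha / i."""
    n = len(sorted_p)
    for i in range(1, n + 1):
        c_i = (i + 1) / (2 * i) if modified else 1 / i
        if sorted_p[n - i] <= c_i * alpha and sorted_p[0] <= alpha / i:
            return True
    return False


def simulated_type1(n, rho, alpha=0.05, reps=20_000, seed=1):
    """Estimate the overall Type I error of both procedures at correlation rho."""
    rng = random.Random(seed)
    cdf = NormalDist().cdf
    a, b = math.sqrt(rho), math.sqrt(1 - rho)
    hoch = mod = 0
    for _ in range(reps):
        shared = rng.gauss(0, 1)  # common component inducing correlation rho
        # Assumed convention: one-sided upper-tail p-values, p = 1 - Phi(z)
        p = sorted(1 - cdf(a * shared + b * rng.gauss(0, 1)) for _ in range(n))
        hoch += rejects_any(p, alpha, modified=False)
        mod += rejects_any(p, alpha, modified=True)
    return hoch / reps, mod / reps


for rho in (0.0, 0.5, 0.9):
    print(rho, simulated_type1(3, rho))
```

Because both procedures are applied to the same simulated p-values, the modified estimate can never fall below Hochberg's within a run. The absolute levels depend on the replicate count and on the p-value convention assumed here.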

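Table 3. Simulated overall Type I error rate: correlated test statistics, α= .05, normal variates, 100,000 replicates

| Number of null hypotheses | Correlation 0, Hoch | Correlation 0, Mod | Correlation .5, Hoch | Correlation .5, Mod | Correlation .9, Hoch | Correlation .9, Mod |
|---|---|---|---|---|---|---|
| 3 | .04844 | .04903 | .04329 | .04497 | .03696 | .03696 |
| 7 | .04863 | .04961 | .03777 | .04163 | .02182 | .02798 |
| 10 | .04944 | .05048 | .03589 | .03992 | .01794 | .02405 |

Note. Hoch = Hochberg, Mod = modified procedure.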

As seen in Table 3, both Hochberg's and the modified procedure control the Type I error at or below the nominal (.05) level, to within simulation error, and the modified procedure is never more conservative than Hochberg's. For the independence case, the results closely match the exact calculations shown in Table 2.

3. Power

Comparing the sets of critical values of the two procedures, it can be conjectured that the advantage of the modified procedure is primarily due to the more liberal critical values associated with the larger ordered p-values. This means that the modified procedure will tend to reject more often than Hochberg's procedure when several of the hypotheses are false. This situation is typical in studies that compare groups on related parameters.

Table 4 shows simulation results with 10,000 replicates of each instance. The variates were generated as in the previous section, except that a location shift of 2.0 in the mean was made to produce non-null cases. With a mean of 2.0 and a standard deviation of 1, the power to detect a single false hypothesis without protecting (a one-sided) Type I error (= .05) is .52. This was selected to produce power ≥ .5 in most instances to allow meaningful comparisons. As seen in Table 4, the advantage of the modified procedure is highest when several of the hypotheses are false. The gain in power is moderate, 3–5% in most cases, but can reach substantial levels of over 7% in some instances. To illustrate this point, consider the case of 10 hypotheses, 9 of them false, with correlation .9. Hochberg's procedure gives a power of .4681, while the modified procedure gives a power of .54. Using the simple sample size formula for comparing two groups, 2(z1−α/2 + z1−β)² × (σ/δ)², where δ/σ is the effect size, one would have to increase the sample size by about 20% to increase the power of Hochberg's procedure from .4681 to the modified procedure's power of .54.
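The sample size comparison can be reproduced with a short calculation. The sketch assumes the first normal quantile in the formula is the two-sided one, z1−α/2 with α = .05. That choice is an assumption, made because it yields a ratio close to the 20% quoted in the text. The effect size cancels in the ratio, so only the two power values matter. The function name is illustrative.

```python
from statistics import NormalDist


def relative_sample_size(power_from, power_to, alpha=0.05):
    """Ratio of sample sizes implied by n = 2 (z_{1-alpha/2} + z_{1-beta})^2 (sigma/delta)^2.

    Power equals 1 - beta, so z_{1-beta} is the normal quantile of the power.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # assumed two-sided critical value
    return ((z_alpha + z(power_to)) / (z_alpha + z(power_from))) ** 2


ratio = relative_sample_size(0.4681, 0.54)
print(f"required sample size ratio: {ratio:.3f}")
```

Under this assumption the ratio is roughly 1.2, that is, about a 20% larger sample.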

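Table 4. Simulated overall power: correlated test statistics, α= .05, normal variates, μ= 2.0, 10,000 replicates

| Number of hypotheses | Number of false hypotheses | Correlation 0, Hoch | Correlation 0, Mod | Correlation .5, Hoch | Correlation .5, Mod | Correlation .9, Hoch | Correlation .9, Mod |
|---|---|---|---|---|---|---|---|
| 3 | 1 | .4612 | .4645 | .4530 | .4552 | .4522 | .4522 |
| 3 | 2 | .7019 | .7104 | .6168 | .6254 | .5313 | .5495 |
| 3 | 3 | .8413 | .8500 | .7124 | .7223 | .5939 | .6062 |
| 7 | 1 | .3561 | .3607 | .3351 | .3378 | .3282 | .3282 |
| 7 | 2 | .5667 | .5793 | .4771 | .4882 | .3909 | .4064 |
| 7 | 3 | .7079 | .7279 | .5661 | .5827 | .4264 | .4573 |
| 7 | 4 | .8028 | .8279 | .6259 | .6477 | .4519 | .4964 |
| 7 | 5 | .8656 | .8900 | .6692 | .6951 | .4701 | .5235 |
| 7 | 6 | .9093 | .9312 | .7024 | .7307 | .4886 | .5480 |
| 7 | 7 | .9393 | .9577 | .7274 | .7602 | .5263 | .5684 |
| 10 | 1 | .3116 | .3176 | .2911 | .2949 | .2860 | .2861 |
| 10 | 2 | .5034 | .5159 | .4236 | .4325 | .3465 | .3580 |
| 10 | 3 | .6461 | .6633 | .5110 | .5272 | .3816 | .4048 |
| 10 | 4 | .7453 | .7658 | .5707 | .5911 | .4049 | .4373 |
| 10 | 5 | .8172 | .8399 | .6139 | .6410 | .4218 | .4650 |
| 10 | 6 | .8696 | .8934 | .6466 | .6765 | .4357 | .4900 |
| 10 | 7 | .9059 | .9283 | .6748 | .7060 | .4467 | .5074 |
| 10 | 8 | .9351 | .9540 | .6971 | .7296 | .4571 | .5262 |
| 10 | 9 | .9529 | .9697 | .7162 | .7506 | .4681 | .5400 |
| 10 | 10 | .9664 | .9799 | .7318 | .7686 | .5023 | .5544 |

Note. Hoch = Hochberg, Mod = modified procedure.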

4. Conclusion

Modified Bonferroni procedures are popular among researchers due to their simplicity, and their general applicability without some of the assumptions needed for many of the more complex parametric methods. Step-up Bonferroni methods offer a power advantage over the single-step and step-down methods. Of the step-up procedures, Hochberg's (1988) is highly popular due to its simplicity. However, Hochberg's procedure is conservative and can be improved.

Rom (1990) proposed a modification of Hochberg's (1988) procedure that requires the calculation of critical points through a recursive algorithm. Several studies, for example Dunnett and Tamhane (1993), have shown that both Rom (1990) and Hommel (1988) offer a small power advantage over Hochberg's procedure; nevertheless, the simplicity of the latter is advantageous for use in practice.

The procedure proposed in this paper maintains the simplicity of Hochberg's procedure, while offering some power advantage. The gain in power is moderate, 3–5% in many situations, but can reach levels as high as 7% in some instances, when many of the tested hypotheses are false. This is a typical situation in many studies because researchers tend to focus on related study questions, which usually also give rise to correlated test statistics. Further research can shed some more light on the performance of the proposed procedure. One area of interest can be in testing logically related hypotheses, for example, pairwise comparisons of means in analysis of variance settings.

Acknowledgement

The author thanks Juliet Shaffer, the associate editor, and the two reviewers for their insightful comments on an earlier version of this paper.

References

Appendices

Appendix A: Examples of the implementation of the modified procedure

The following two examples illustrate the advantage of the modified procedure as compared to Hochberg's procedure. We use α= .05.

Example 1. P(1)= .02, P(2)= .03, P(3)= .2. Clearly, Hochberg's procedure rejects no hypothesis. The modified procedure starts by comparing P(3) with .05. Since it is larger, H(3) is retained, and we proceed to compare P(2) with 3α/4 = .0375. Since it is smaller, we can reject any hypothesis with a p-value less than or equal to α/2 = .025. Hence, H(1) can be rejected.

Example 2. P(1)= .009, P(2)= .015, P(3)= .025, P(4)= .04, P(5)= .2. Going through the steps of Hochberg's procedure, the first rejection occurs at the last step, where P(1) < α/5 = .01. Hence, Hochberg's procedure rejects H(1) only. For the modified procedure, going through the steps, P(3) < 2α/3 = .0333. We can then reject any hypothesis with a p-value less than or equal to α/3 = .0167. Hence, H(1) and H(2) can be rejected.
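The two examples can be traced mechanically. The sketch below is an illustration only: the helper `trace` is a hypothetical name, and it returns the step i at which the procedure stops together with the ranks of the rejected ordered hypotheses. At the last step it uses Cn = 1/n, following the note to Table 1.

```python
def trace(pvalues, alpha=0.05, modified=True):
    """Return (stopping step i, ranks k of the rejected H(k)), or (None, [])."""
    ordered = sorted(pvalues)
    n = len(ordered)
    for i in range(1, n + 1):
        # Modified constant (i + 1)/(2i) before the last step, 1/i otherwise
        c_i = (i + 1) / (2 * i) if (modified and i < n) else 1 / i
        if ordered[n - i] <= c_i * alpha:  # P(n+1-i) <= C_i * alpha
            return i, [k + 1 for k, p in enumerate(ordered) if p <= alpha / i]
    return None, []


example_1 = [0.02, 0.03, 0.2]
example_2 = [0.009, 0.015, 0.025, 0.04, 0.2]

for name, p in (("Example 1", example_1), ("Example 2", example_2)):
    print(name, "Hochberg:", trace(p, modified=False))
    print(name, "modified:", trace(p, modified=True))
```

For Example 1 the modified rule stops at step 2 and rejects H(1), while Hochberg's rule never stops. For Example 2 the modified rule stops at step 3 and rejects H(1) and H(2), while Hochberg's rule stops at step 5 and rejects H(1) only.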

Appendix B: Calculation of the probability of (1)

Under the global null hypothesis, P(i), i= 1, …, n, are order statistics from n independent uniform U(0,1) random variables. For any 1 > C3C2C1C0 > 0, we can show the following:

  • 1
    Using the Binomial distribution, inline image.
  • 2
    inline image can be partitioned into probabilities of disjoint events:
    • image
    where j1={nk− 2, …, nk} and j2={max (0, 2 −j1), …, nkj1}.

Using the above partition repeatedly, we calculate (1) as follows:

  • image

For each Ak(n) in the summations above, i={1, …, nk}, j1={1, …, k}, j2={max (0, 2 −j1), …, kj1}, …, jk−1={max (0, k− 1 −j1−⋯−jk−2), …, kj1−⋯−jk−2}. Note that n≥ 2 and Cn= 1/n. Hence, we can write:

  • image((B1))

The set of critical values, Ci, i = 2, …, n, is either Hochberg's or that of the modified procedure, as shown in Table 1. We use (B1) to calculate the Type I error rates shown in Table 2.

Appendix C: Proof of strong control of the FWE using the closure principle

Let H(1), …, H(n) be n hypotheses with corresponding ordered p-values, P(1), …, P(n). Assume that for at least one i, 1 ≤ i ≤ n, P(n+1−i) ≤ α(i + 1)/2i. Let k = 1, …, r index the hypotheses for which P(k) ≤ α/i. According to the proposed procedure, all r hypotheses H(k), k = 1, …, r, can be rejected. To prove that this procedure has strong control of the FWE, we must show that every subset intersection hypothesis that includes H(k) as a component is also rejected by an α-level test. Consider the following two cases and the resulting partitions of H(1), …, H(n).

Case 1: i = 1, H1: H(1), …, H(n). In this case, where we reject at the first step of the modified Hochberg procedure, all p-values are less than or equal to α; hence all subset intersection hypotheses formed from H1 are rejected at the first step by an α-level modified Hochberg procedure.

Case 2: i > 1, H1: H(1), …, H(r), H2: H(r+1), …, H(n+1−i), H3: H(n+2−i), …, H(n). In this case, H1 is the set of rejected hypotheses; H2 is a set of (retained) hypotheses with corresponding p-values satisfying α/i < P(l) ≤ α(i + 1)/2i (l = r + 1, …, n + 1 − i); and H3 is the set of i − 1 retained hypotheses with corresponding p-values P(m) > α(i + 1)/2i (m = n + 2 − i, …, n).

Now consider the following configurations of subset intersection hypotheses of case 2. Subset intersection hypotheses formed from H1 and H2, including at least one hypothesis from H1, have all of their corresponding p-values less than or equal to α; hence all of these subset intersection hypotheses will be rejected at the first step of an α-level modified Hochberg procedure. Turning to subset intersection hypotheses formed from H1, H2, and H3, including at least one hypothesis from H1, any such subset intersection hypothesis may include i′ − 1 (i′ ≤ i) hypotheses from H3. Note that α(i′ + 1)/2i′ ≥ α(i + 1)/2i and α/i′ ≥ α/i. It follows that all p-values corresponding to the hypotheses from H2 are necessarily less than or equal to α(i′ + 1)/2i′, and all p-values corresponding to hypotheses from H1 are necessarily less than or equal to α/i′. Hence a modified Hochberg procedure will reject these intersection hypotheses at step i′.

The above application of the closure principle indicates that the modified Hochberg procedure is a closed multiple test procedure which can be implemented without the need to directly test every subset intersection hypothesis. This is referred to as a ‘shortcut’ version of the closed test procedure, which is possible due to monotonicity of the critical values.
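The argument above can be spot-checked by brute force for small n. The sketch below is a sanity check under stated assumptions, not a proof and not code from the paper. It takes the local α-level test of an intersection hypothesis to be "the modified procedure, applied to that subset alone, rejects at least one hypothesis", with Cm = 1/m at the last step of a family of size m. It then verifies on random p-values that everything the shortcut rejects is also rejected by the full closed test. All function names are illustrative.

```python
import random
from itertools import combinations


def critical_constant(i, m):
    """C_i for a family of m hypotheses: (i + 1)/(2i), with C_m = 1/m."""
    return 1 / m if i == m else (i + 1) / (2 * i)


def shortcut(pvalues, alpha=0.05):
    """Indices rejected by the step-up (shortcut) modified procedure."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    for i in range(1, n + 1):
        if ordered[n - i] <= critical_constant(i, n) * alpha:
            return {j for j, p in enumerate(pvalues) if p <= alpha / i}
    return set()


def local_test(pvalues, alpha=0.05):
    """Assumed local test: reject the intersection if the shortcut rejects anything."""
    return bool(shortcut(pvalues, alpha))


def closed_test(pvalues, alpha=0.05):
    """Reject H_j only if every intersection hypothesis containing H_j is rejected."""
    n = len(pvalues)
    rejected = set()
    for j in range(n):
        others = [k for k in range(n) if k != j]
        if all(local_test([pvalues[j]] + [pvalues[k] for k in extra], alpha)
               for size in range(n)
               for extra in combinations(others, size)):
            rejected.add(j)
    return rejected


rng = random.Random(0)
for _ in range(500):
    # Cubing uniform draws skews p-values toward 0 so that rejections occur often
    p = [rng.random() ** 3 for _ in range(5)]
    assert shortcut(p) <= closed_test(p)
print("shortcut rejections were contained in the closed-test rejections")
```

The check enumerates all 2^(n−1) subsets containing each hypothesis, so it is practical only for small n, which is exactly the cost the shortcut avoids.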