### Abstract

- Top of page
- Abstract
- 1 Introduction and summary
- 2 Background and notation
- 3 Why do intermediates take it all?
- 4 Why the asymptotics of the HC statistic is so poor
- 5 New HC tests with improved finite properties
- 6 Concluding remarks
- Acknowledgment
- Conflict of interest
- References

The higher criticism (HC) statistic, which can be seen as a normalized version of the famous Kolmogorov–Smirnov statistic, has a long history, dating back to the mid-1970s. Originally, HC statistics were used in connection with goodness of fit (GOF) tests, but they recently gained attention in the context of testing the global null hypothesis in high-dimensional data. The continuing interest in HC seems to be inspired by a series of attractive asymptotic properties of this statistic. For example, unlike Kolmogorov–Smirnov tests, GOF tests based on the HC statistic are known to be asymptotically sensitive in the moderate tails; hence they are well suited to detecting the presence of signals in sparse mixture models. However, some questions about the asymptotic behavior of the HC statistic are still open. We focus on two of them, namely, why a specific intermediate range is crucial for GOF tests based on the HC statistic and why the convergence of the HC distribution to the limiting one is extremely slow. Moreover, the discrepancy between the asymptotic and finite-sample behavior of the HC statistic prompts us to provide a new HC test that has better finite-sample properties than the original HC test while showing the same asymptotics. This test is motivated by the asymptotic behavior of the so-called local levels related to the original HC test. By means of numerical calculations and simulations we show that the new HC test is typically more powerful than the original HC test in normal mixture models.

### 1 Introduction and summary

Many important theoretical results related to the so-called *higher criticism* (HC) test statistic have been obtained during the past three decades. The statistic has typically been applied in the context of goodness of fit (GOF) and detection problems. It can be seen as a normalized or standardized version of the well-known Kolmogorov–Smirnov test statistic. The asymptotics of the HC statistic were extensively investigated by Jaeschke (1979) and Eicker (1979) in the late 1970s. For earlier references see Anderson and Darling (1952). However, neither Jaeschke nor Eicker used the term *higher criticism* statistic. The notion of the *higher criticism* was first introduced by J. W. Tukey in the mid-1970s. Later, Tukey (1989) wrote:

“*If we look at many comparisons, say n, and assess the significance of each at 5% individually [...]. We know that, even if the underlying value of each comparison is blah [...] we will get an average of n/20 (i.e. 5% of n) apparent significance.*

*Various ways of relating the observed number, k, of individual-5% significances to n/20 are mnemonically referred to as “the higher criticism” [...].*”

It was not until a decade ago that Donoho and Jin (2004) termed the normalized Kolmogorov–Smirnov statistic *Tukey's higher criticism* and rediscovered the HC in the context of detecting signals that are both sparse and weak.

Among other results, Donoho and Jin (2004) showed the optimality of the HC statistic in the sense that a test based on the HC statistic asymptotically mimics the performance of an oracle likelihood ratio test under several conditions. A substantial contribution to the results in Donoho and Jin (2004) was made by Ingster (1997, 1999). Jager and Wellner (2007) provided a family of GOF test statistics based on ϕ-divergences that have the same optimal detection boundary in sparse normal mixtures as the HC statistic. Hall and Jin (2008) focused on HC in the case of dependent data. Later, Hall and Jin (2010) modified the standard HC statistic used by Donoho and Jin (2004) to account for correlated noise.

Instead of approaching HC tests in terms of their test statistics, Gontscharuk et al. (2013) introduced a viewpoint based on so-called local levels. There has been a lot of further interest and development in connection with this statistic; see, for example, Cai et al. (2007), Donoho and Jin (2008), Hall et al. (2008), Donoho and Jin (2009), and Cai et al. (2011).

Practical applications of the HC statistic can be found in several areas. Often, these applications result from questions arising in the context of large-scale multiple testing. In particular, scientific areas such as genomics, astronomy, or image processing have seen a growing need for statistical tools to analyze high-dimensional data. In these areas the aim is often to identify whether there are signals present in the data. For example, Parkhomenko et al. (2009) employed the HC test to detect the presence of small effects in a genome-wide association study (GWAS) on rheumatoid arthritis. Sabatti et al. (2009) analyzed GWAS data to determine genetic influences on certain metabolic traits and tested the global null hypothesis of no genetic effect using the HC statistic. Besides these applications, the HC statistic is applicable in any area where GOF testing is of interest. Here, a number of questions and issues in the context of HC goodness of fit tests remain, three of which will be addressed in this paper.

- It is well known that a GOF test based on the HC statistic is asymptotically sensitive for some special kinds of alternatives that differ from the null distribution in the moderate tails. However, it is not clear how to explain this behavior, since the proofs of related results are typically of a purely technical nature. Based on the theory of stochastic processes we will show why a specific intermediate range plays a crucial role.
- It is known that the convergence of the distribution of the HC statistic to the limiting distribution is extremely slow, so that the application of asymptotic results may lead to doubtful outcomes for a finite sample size. Results based on simulations, cf. Donoho and Jin (2004) or Hall and Jin (2010), show that this irregular behavior is frequently caused by the smallest order statistics of the underlying sample. However, there seems to be no theoretical result justifying this observation. In this paper, we will provide a simple condition for checking whether the asymptotic HC distribution approximates the finite one well for a given sample size.
- Due to the large discrepancy between the finite and asymptotic behavior of the HC statistic, it is desirable to construct a new level α test that has the same asymptotic properties as the original HC test but shows improved finite-sample behavior. We introduced such a test for the first time at the MCP2011 Conference. Later, at MCP2013, we presented asymptotic as well as finite-sample properties and power considerations of this new HC test, which will be addressed as the final topic of this paper.

This paper is organized as follows. In Section 'Background and notation', we provide the basic notation and necessary background and illustrate the key issues in detail. Section 'Why do intermediates take it all?' is devoted to the first problem. We introduce continuous stochastic processes that are related to the HC statistic. More precisely, we consider the normalized Brownian bridge and its approximation property in the region where the HC statistic is particularly sensitive. Further, we introduce the Ornstein–Uhlenbeck process, which results from a suitable time transformation of the normalized Brownian bridge and helps us investigate the phenomena that appear in the asymptotics of HC statistics. Section 'Why the asymptotics of the HC statistic is so poor' addresses the second key issue of this paper, that is, the quality of the HC asymptotics when applied to a finite sample size. There, we study the performance of various truncated versions of the HC statistic. An essential observation is that the left tail, in the form of the smallest order statistics involved, is the key contributor to the distribution of the HC statistic. In Section 'New HC tests with improved finite properties', we turn to the third key aspect. We construct a new HC test by considering so-called *local levels* and by setting these quantities to be all equal. We compare the new and original HC tests under the null hypothesis as well as under alternatives. Concluding remarks are given in Section 'Concluding remarks'.

### 2 Background and notation

Depending on the research question posed, either definition will be considered. Under the assumption that the given order statistics , , stem from the uniform distribution on the interval [0, 1], the limit distribution of HC_{0, 1} is given by the Gumbel distribution in the sense that

- (3)

where ,

- (4)

and , cf. Jaeschke (1979) and Eicker (1979). Then we can define a GOF test based on the HC test statistic, which we will call the *HC test*, for testing the null hypothesis *H*_{0} that the underlying sample is, in fact, a realization of iid standard uniform distributed random variables. We say that the HC test rejects *H*_{0} if the test statistic HC_{0, 1} is larger than the (asymptotic) critical value . Setting , the corresponding HC test is an asymptotic level α test, that is

It is known that such HC tests are typically more powerful than the classical (asymptotic level α) Kolmogorov–Smirnov tests if an alternative distribution deviates from the null distribution in moderate tails. This is due to the fact that under *H*_{0} the supremum and/or maximum in (1) and/or (2), respectively, is asymptotically taken only over a specific intermediate range. More precisely, let us consider a truncated version of the HC statistic defined by

- (5)

For example, it follows that the maximum taken over the central range or the maximum over extreme tails is asymptotically stochastically smaller than the asymptotic critical value for any . Furthermore, applying results in Jaeschke (1979), we even get that the maximum taken over a specific intermediate range is asymptotically (stochastically) equal to the maximum taken over the entire interval (0, 1), that is

- (6)

for and with and . This is why we say that intermediates take it all. In view of (6) we denote

- (7)

with and as a sensitivity range related to the HC statistic. Roughly speaking, the sensitivity range is given by

Surprisingly, the sensitivity range is very small compared to the entire interval (0, 1) but crucial for the HC asymptotics. Unfortunately, proofs related to HC results are typically of a technical nature, so that it is not clear why the supremum taken over the sensitivity range is asymptotically stochastically larger than the supremum taken over the remaining area. In Section 'Why do intermediates take it all?', we will give an explanation for this phenomenon by considering the HC statistic as the maximum of a stochastic process.

As compared to the critical values of the limiting distribution, it is apparent that larger quantiles of the finite HC_{0, 1} statistic, which are typically relevant for testing purposes, are too large even for large sample sizes. As an example, we consider the 0.95-quantile of the limiting distribution, that is with and . For the asymptotic critical value should approximate the 0.95-quantile of the exact HC_{0, 1}-distribution; however, this quantity is equal to the 0.876-quantile. The 0.95-quantile of the exact distribution turns out to be approximately 4.74, showing the discrepancy to the asymptotic value, which is also visible in Fig. 1. Therefore, using the critical values of the asymptotic distribution in the context of testing may lead to a considerable exceedance of the prechosen level α.
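The gap between finite-sample and asymptotic quantiles described above can be explored with a small simulation. The following is a minimal sketch, not the authors' procedure: it assumes the one-sided standardized form of the HC statistic that is common in the literature (maximum over *i* of √*n* (*i*/*n* − *u*_(*i*)) / √(*u*_(*i*)(1 − *u*_(*i*)))), because definitions (1) and (2) are not reproduced in this text. The function names `hc_statistic` and `simulated_null_quantile` are hypothetical.

```python
import math
import random


def hc_statistic(sample):
    """One-sided HC statistic (assumed form, see lead-in):
    max_i sqrt(n) * (i/n - u_(i)) / sqrt(u_(i) * (1 - u_(i)))."""
    u = sorted(sample)
    n = len(u)
    best = -math.inf
    for i, ui in enumerate(u, start=1):
        # Skip values exactly at 0 or 1, where the standardization degenerates.
        if 0.0 < ui < 1.0:
            value = math.sqrt(n) * (i / n - ui) / math.sqrt(ui * (1.0 - ui))
            best = max(best, value)
    return best


def simulated_null_quantile(n, level, reps, seed=0):
    """Empirical `level`-quantile of the HC statistic under iid U(0, 1) data."""
    rng = random.Random(seed)
    draws = sorted(
        hc_statistic([rng.random() for _ in range(n)]) for _ in range(reps)
    )
    # Plain order-statistic estimate of the quantile; crude but enough for a sketch.
    index = min(reps - 1, max(0, math.ceil(level * reps) - 1))
    return draws[index]
```

Calling, for instance, `simulated_null_quantile(n=1000, level=0.95, reps=2000)` gives a Monte Carlo estimate that can be set against the asymptotic critical value; the estimate carries simulation error and should not be read as an exact quantile.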

Moreover, it is known that for a finite sample size unusually large values of HC_{0, 1} are most frequently caused by the smallest order statistic. This is why truncated HC versions were considered in the literature. For example, Donoho and Jin (2004) proposed to restrict the range of the maximum by applying

and Hall and Jin (2010) considered

Of course, these truncated statistics are not larger than the original HC statistic, so that their distributions are closer to the limiting distribution. Consequently, the asymptotic critical value approximates the corresponding quantiles of the truncated versions better than the same quantile of HC_{0, 1}. On the other hand, applying a truncated HC statistic in the context of GOF tests means excluding the smallest values of the underlying sample, which typically can be seen as an indicator that the null hypothesis is false; this seems too wasteful from a statistical point of view. This is why, in the finite case, we restrict our attention to the original HC statistic HC_{0, 1}. Accordingly, the second focus of this paper is the question of why we observe a rather poor agreement between the asymptotic and finite distributions of the HC_{0, 1} statistic even for large sample sizes.
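The effect of truncation can be illustrated on a small data set. The sketch below is illustrative only: it assumes the one-sided standardized HC form common in the literature, and the truncation rule (keep only order statistics that are at least `min_u`) is a hypothetical stand-in, since the exact truncated definitions of Donoho and Jin (2004) and Hall and Jin (2010) are not reproduced in this text.

```python
import math


def hc_terms(sample):
    """Pairs (u_(i), standardized deviation at index i) for the sorted sample."""
    u = sorted(sample)
    n = len(u)
    return [
        (ui, math.sqrt(n) * (i / n - ui) / math.sqrt(ui * (1.0 - ui)))
        for i, ui in enumerate(u, start=1)
        if 0.0 < ui < 1.0
    ]


def hc_full(sample):
    """Maximum over all indices (assumed one-sided HC form)."""
    return max((term for _, term in hc_terms(sample)), default=-math.inf)


def hc_truncated(sample, min_u):
    """Maximum over indices whose order statistic is at least `min_u`
    (hypothetical truncation rule, used only for illustration)."""
    return max(
        (term for ui, term in hc_terms(sample) if ui >= min_u),
        default=-math.inf,
    )


# One very small observation dominates the full statistic,
# while the truncated version ignores it.
data = [0.001, 0.3, 0.6, 0.9]
full_value = hc_full(data)
truncated_value = hc_truncated(data, min_u=1.0 / len(data))
```

In this toy example the full statistic is driven entirely by the smallest observation, which mirrors the remark that truncation discards exactly the values that most strongly indicate a departure from the null hypothesis.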

The final aspect we focus on is a modification of the HC test that seems to be essential in view of the previous topic of this paper. In Section 'New HC tests with improved finite properties', we derive the new (better) HC test motivated by results in Gontscharuk et al. (2013) and show that this test is typically more powerful than the original HC test in a normal mixture model with rather sparse signals as studied by, for example, Donoho and Jin (2004).

### 5 New HC tests with improved finite properties

In view of the slow convergence of the HC statistic to the asymptotic distribution as discussed in the previous section, it is desirable to modify the HC test in order to improve its applicability in finite sample size settings. To this end, Gontscharuk et al. (2013) introduced the concept of so-called *local levels*.

For the HC test with critical value *y* a local level is defined as the probability that the *i*th order statistic falls below its respective critical value defined in (21). Formally, local levels can be calculated via

These quantities can be seen as an indicator as to where one would expect high/low local sensitivity of the test.
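To make the notion of a local level concrete, the following sketch computes local levels of an HC-type test numerically. It rests on two assumptions that are not taken from this text: the one-sided standardized HC form, and critical values obtained by solving *n*(*i*/*n* − *c*)² = *y*²*c*(1 − *c*) for the smaller root, which may differ in detail from the critical values in (21). It uses only the certain fact that the *i*th order statistic of *n* iid uniform variables falls below *c* with probability P(Bin(*n*, *c*) ≥ *i*). All function names are hypothetical.

```python
import math


def binomial_upper_tail(n, p, i):
    """P(Bin(n, p) >= i) for i >= 1, i.e. P(U_(i:n) <= p) for n iid U(0, 1) variables."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(i, n + 1):
        # Binomial probability computed in log space to avoid overflow for larger n.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        total += math.exp(log_term)
    return min(1.0, total)


def hc_critical_value(i, n, y):
    """Smaller root c of n * (i/n - c)^2 = y^2 * c * (1 - c), for y > 0.
    Under the assumed HC form, the i-th term exceeds y exactly when u_(i) < c."""
    a = n + y * y
    b = 2.0 * i + y * y
    discriminant = y * y * (y * y + 4.0 * i * (1.0 - i / n))
    return (b - math.sqrt(discriminant)) / (2.0 * a)


def hc_local_levels(n, y):
    """Local levels P(U_(i:n) <= c_i), i = 1, ..., n, for HC critical value y."""
    return [
        binomial_upper_tail(n, hc_critical_value(i, n, y), i)
        for i in range(1, n + 1)
    ]
```

Plotting `hc_local_levels(n, y)` against *i* for a moderate *n* is one way to see where such a test concentrates its sensitivity; the cost is quadratic in *n*, so this sketch is meant for small to moderate sample sizes.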

The new HC level α test can alternatively be defined in terms of a test statistic. Under *H*_{0}, the cdf of the *i*th order statistic is the cdf of the *Beta* distribution with parameters *i* and *n* − *i* + 1. Using (23), condition (24) leads to

- (26)

where is the inverse function of the standard normal cdf and . Once is determined, the new HC test rejects if for some *i*. Note that , , are uniformly distributed on [0, 1]. Alternatively, setting , , the new HC statistic can be represented as the maximum of standard normally distributed random variables, that is

Consequently, the new HC test with equal local levels rejects *H*_{0} if . The *p*-value of this test can be calculated by

where is a realization of . Thereby, the probability in the aforementioned expression can be calculated via one of the recursions provided in Shorack and Wellner (2009), p. 362–370.
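The construction above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the statistic is taken to be the maximum over *i* of the standard normal quantile of one minus the Beta(*i*, *n* − *i* + 1) cdf evaluated at the *i*th order statistic (the sign convention is an assumption), and the *p*-value is approximated by plain Monte Carlo, which replaces the exact recursions of Shorack and Wellner (2009) mentioned in the text. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def order_statistic_cdf(i, n, u):
    """Beta(i, n - i + 1) cdf at u, computed as the binomial tail P(Bin(n, u) >= i)."""
    if u <= 0.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    log_u, log_1mu = math.log(u), math.log1p(-u)
    total = 0.0
    for k in range(i, n + 1):
        total += math.exp(
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_u + (n - k) * log_1mu
        )
    return min(1.0, total)


def min_order_statistic_level(sample):
    """Smallest value of F_i(u_(i)) over i; small values speak against H0."""
    u = sorted(sample)
    n = len(u)
    return min(order_statistic_cdf(i, n, ui) for i, ui in enumerate(u, start=1))


def new_hc_statistic(sample):
    """Maximum of the normal scores Phi^{-1}(1 - F_i(u_(i))) (assumed convention).
    Uses Phi^{-1}(1 - m) = -Phi^{-1}(m) to stay accurate for very small m."""
    m = min(max(min_order_statistic_level(sample), 1e-300), 1.0 - 1e-16)
    return -NormalDist().inv_cdf(m)


def monte_carlo_p_value(sample, reps=500, seed=0):
    """Monte Carlo approximation of P_H0(statistic >= observed value)."""
    rng = random.Random(seed)
    n = len(sample)
    observed = min_order_statistic_level(sample)
    hits = sum(
        min_order_statistic_level([rng.random() for _ in range(n)]) <= observed
        for _ in range(reps)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (reps + 1)
```

Because the statistic is a decreasing function of the smallest value of the transformed order statistics, the Monte Carlo comparison is done on that minimum directly. For larger *n* an exact recursion would be preferable to simulation.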

Finally, we briefly study and compare the power of the original and new HC test. We restrict attention to the following normal mixture model with sparse signals, a prominent example in the HC literature. Let , , , and let be iid random variables with the cdf

We test whether any signals are present, that is, we test *H*_{0} that against the alternatives that for . This can be reworded as GOF testing for uniformity, where , , are iid uniformly distributed on [0, 1] under *H*_{0}. Donoho and Jin (2004) provided a detection boundary that separates the parameter plane into two regions, where it is possible to reliably detect signals and where it is impossible to do so. They showed that depending on the considered parameters the power of the asymptotic level α HC test, which is defined as under , tends to one or α. More precisely, for the function

the power of the asymptotic level α HC test converges to one if (detectable), and tends to α if (undetectable), see also Ingster (1997, 1999). Figure 8 illustrates these two regions separated by the boundary function .
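For readers who want to experiment with this model, the sketch below encodes the detection boundary in the form usually attributed to Donoho and Jin (2004) and draws *p*-values from a sparse normal mixture. The parametrization is an assumption, because the paper's own formulas are not reproduced in this text: a fraction *n*^(−β) of the observations, with β in (1/2, 1), is shifted by √(2*r* log *n*), and the boundary is β − 1/2 for β ≤ 3/4 and (1 − √(1 − β))² otherwise. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def detection_boundary(beta):
    """Detection boundary in the (beta, r) plane (assumed standard form)."""
    if not 0.5 < beta < 1.0:
        raise ValueError("beta must lie in (1/2, 1)")
    if beta <= 0.75:
        return beta - 0.5
    return (1.0 - math.sqrt(1.0 - beta)) ** 2


def sample_mixture_p_values(n, beta, r, seed=0):
    """One-sided p-values from the assumed sparse normal mixture.
    Under H0 (no shifted observations) these p-values are iid U(0, 1)."""
    rng = random.Random(seed)
    signal_fraction = n ** (-beta)
    shift = math.sqrt(2.0 * r * math.log(n))
    standard_normal = NormalDist()
    p_values = []
    for _ in range(n):
        mean = shift if rng.random() < signal_fraction else 0.0
        x = rng.gauss(mean, 1.0)
        p_values.append(1.0 - standard_normal.cdf(x))
    return p_values


def is_asymptotically_detectable(beta, r):
    """True if (beta, r) lies strictly above the boundary."""
    return r > detection_boundary(beta)
```

The resulting *p*-values can be passed to any GOF test for uniformity. The boundary is an asymptotic statement, so the power observed at a finite *n* near the boundary can differ noticeably from the limiting values one and α.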

Altogether, the new HC test appears to deliver in finite samples what the original HC test promises asymptotically. Moreover, beyond the context of testing the global null hypothesis in high-dimensional data, the proposed new test with equal local levels is an attractive, easy-to-compute competitor to classical GOF tests and also allows for easy-to-compute simultaneous confidence bands for the related test statistic.

### 6 Concluding remarks

In this paper, we first focused on two topics that explain some important properties of the HC test statistic. The first one concerns the range of sensitivity of GOF tests based on the HC statistic, the second addresses the convergence of the finite HC distribution to its limiting counterpart. Eventually, we proposed a modification of the HC test showing better finite properties.

The fact that tests based on the HC statistic effectively detect signals that are very weak and very sparse has aroused a lot of interest among statisticians. Such tests are even asymptotically successful throughout the same region in the amplitude/sparsity plane where the oracle likelihood ratio test would succeed. The nature of this phenomenon becomes clearer by looking at the sensitivity region of the HC statistic. The explanation given in Section 'Why do intermediates take it all?' of why a certain intermediate range of order statistics plays a crucial role for the HC asymptotics contributes to the knowledge about the behavior of the HC statistic. Unfortunately, because the distribution of the HC statistic converges to the limiting one very slowly, the type I error of the corresponding HC tests is not controlled even for a very large sample size, cf. Section 'Why the asymptotics of the HC statistic is so poor'.

In view of this point, it would be favorable to have a “better” HC statistic at hand. It is worth noting that the application of truncated HC versions leads to an exclusion of several order statistics of the underlying random sample and hence to a loss of information. This is why, in our opinion, truncated HC statistics are not suited to be “better” HC statistics. New approaches seem to be necessary in order to construct more favorable HC tests, that is, tests with the same asymptotic properties as the original HC test but with more appropriate finite-sample behavior. The concept of the so-called local levels introduced in Section 'New HC tests with improved finite properties' seems to be such a promising approach. Local levels can be seen as an indicator as to where one would expect high/low local sensitivity; for more details see Gontscharuk et al. (2013). Motivated by a result in that work that almost all HC local levels are asymptotically equal, a new GOF test with equal local levels in the finite sample size case seems a good candidate that might show the properties mentioned. In Section 'New HC tests with improved finite properties', we provide new results concerning the asymptotics of the new HC test and show by means of simulations that the new test is typically more powerful than the original procedure in a normal mixture model. A more detailed study of the new HC test with equal local levels will be reported in our forthcoming work.