Tests for mean vectors in high dimension

Authors


Abstract

Traditional multivariate tests, such as Hotelling's $T^2$ or Wilks' $\Lambda$, are designed to test the mean vector under the condition that the number of observations is larger than the number of variables. For high-dimensional data, where the number of features is nearly as large as or larger than the number of observations, the existing tests do not provide a satisfactory solution because of the singularity of the estimated covariance matrix. In this article, we consider a test for the mean vector of independent and identically distributed multivariate normal random vectors where the dimension is larger than or equal to the number of observations. To solve this problem, we propose a modified Hotelling statistic. Simulation results show that the proposed test is superior to other tests available in the literature. However, because we do not know the theoretical distribution of this modified statistic, Monte Carlo methods were used to reach this conclusion. Instead of using conventional Monte Carlo methods, which perform a fixed number of simulations, we suggest using the sequential Monte Carlo test in order to decrease the number of simulations needed to reach a decision. Simulation results show that the sequential Monte Carlo test is preferable to a fixed-sample test, especially when using computationally intensive statistical methods. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
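To illustrate the idea of stopping a Monte Carlo test early, the following is a minimal sketch of a sequential Monte Carlo p-value in the style of Besag and Clifford. It is not the procedure or the statistic used in the article: the function name `sequential_mc_pvalue`, the parameters `h` and `max_sims`, and the toy null simulator `chi2_null` (a sum of `p` squared standard normals, standing in for a statistic whose null distribution must be simulated) are all assumptions made for this example.

```python
import random


def sequential_mc_pvalue(observed, simulate_stat, h=10, max_sims=999, rng=None):
    """Sequential Monte Carlo p-value (Besag-Clifford style sketch).

    Draw statistics under the null one at a time. Stop as soon as `h`
    of them are at least as large as `observed`, or after `max_sims`
    draws, whichever comes first. Returns (p_value, draws_used).
    """
    rng = rng or random.Random()
    exceed = 0
    for draws in range(1, max_sims + 1):
        if simulate_stat(rng) >= observed:
            exceed += 1
            if exceed == h:
                # Early stop: the observed value is clearly not extreme.
                return exceed / draws, draws
    # Simulation budget exhausted: usual fixed-sample Monte Carlo p-value.
    return (exceed + 1) / (max_sims + 1), max_sims


def chi2_null(p):
    """Hypothetical null simulator: sum of p squared standard normals."""
    def simulate(rng):
        return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(p))
    return simulate


if __name__ == "__main__":
    rng = random.Random(0)
    p_value, used = sequential_mc_pvalue(55.0, chi2_null(50), rng=rng)
    print(f"p-value ~ {p_value:.3f} after {used} simulations")
```

When the observed statistic is unremarkable, the loop typically ends after a few dozen draws rather than the full budget, which is where the savings come from when each simulated statistic is expensive to compute. The full `max_sims` draws are spent only when the observed value is extreme enough that fewer than `h` simulated values reach it.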
