KLASMOE and School of Mathematics and Statistics, Northeast Normal University, 5268 People's Road, 130024 Changchun, China.
ON ESTIMATION OF THE POPULATION SPECTRAL DISTRIBUTION FROM A HIGH-DIMENSIONAL SAMPLE COVARIANCE MATRIX
Article first published online: 28 NOV 2010
© 2010 Australian Statistical Publishing Association Inc.
Australian & New Zealand Journal of Statistics
Volume 52, Issue 4, pages 423–437, December 2010
How to Cite
Bai, Z., Chen, J. and Yao, J. (2010), ON ESTIMATION OF THE POPULATION SPECTRAL DISTRIBUTION FROM A HIGH-DIMENSIONAL SAMPLE COVARIANCE MATRIX. Australian & New Zealand Journal of Statistics, 52: 423–437. doi: 10.1111/j.1467-842X.2010.00590.x
Acknowledgments. The authors wish to thank the Chinese National Science Foundation, Northeast Normal University (China) and Région Bretagne (France) for their support of this research.
- Issue published online: 27 DEC 2010
Keywords:
- eigenvalues of covariance matrices;
- high-dimensional statistics;
- Marčenko–Pastur distribution;
- sample covariance matrices
Sample covariance matrices play a central role in numerous popular statistical methodologies, for example principal components analysis, Kalman filtering and independent component analysis. However, modern random matrix theory indicates that, when the dimension of a random vector is not negligible with respect to the sample size, the sample covariance matrix deviates significantly from the underlying population covariance matrix. For such high-dimensional data, there is an urgent need for new estimation tools that recover the characteristics of the population covariance matrix from the observed sample covariance matrix. We propose a novel solution to this problem based on the method of moments. When the parametric dimension of the population spectrum is finite and known, we prove that the proposed estimator is strongly consistent and asymptotically Gaussian. Otherwise, we combine the first estimation method with a cross-validation procedure to select the unknown model dimension. Simulation experiments demonstrate the consistency of the proposed procedure. We also indicate possible extensions of the proposed estimator to the case where the population spectrum has a density.
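
To illustrate the deviation described in the abstract, and the general idea of a moment-based correction, the following is a minimal sketch and not the authors' estimator. It assumes Gaussian data with a diagonal population covariance and an uncentred sample covariance matrix S = XX'/n, and it uses only the standard first two moment relations of the Marčenko–Pastur-type limit: with y = p/n, sample spectral moments a1, a2 and population spectral moments b1, b2, one has a1 ≈ b1 and a2 ≈ b2 + y·b1². All function names and the simulation settings are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_sample_eigenvalues(pop_eigs, n, seed=0):
    """Eigenvalues of S = X X^T / n for Gaussian data.

    Assumption for illustration: the population covariance is diagonal
    with entries `pop_eigs`, and the data have mean zero (no centring).
    """
    rng = np.random.default_rng(seed)
    p = len(pop_eigs)
    # Each row i of X has variance pop_eigs[i]; columns are the n observations.
    X = rng.standard_normal((p, n)) * np.sqrt(pop_eigs)[:, None]
    S = X @ X.T / n
    return np.linalg.eigvalsh(S)


def corrected_population_moments(sample_eigs, y):
    """Moment-based estimates of the first two population spectral moments.

    Inverts the asymptotic relations a1 = b1 and a2 = b2 + y * b1**2,
    where y = p / n is the dimension-to-sample-size ratio.
    """
    a1 = float(np.mean(sample_eigs))
    a2 = float(np.mean(sample_eigs ** 2))
    b1 = a1
    b2 = a2 - y * a1 ** 2
    return b1, b2


if __name__ == "__main__":
    # Population spectrum: half the eigenvalues equal 1, half equal 3,
    # so the true moments are b1 = 2 and b2 = 5.
    p, n = 200, 400
    pop_eigs = np.concatenate([np.ones(p // 2), 3.0 * np.ones(p // 2)])
    eigs = simulate_sample_eigenvalues(pop_eigs, n)

    naive_b2 = float(np.mean(eigs ** 2))  # biased upward by about y * b1**2
    b1, b2 = corrected_population_moments(eigs, y=p / n)
    print("sample eigenvalue range:", eigs.min(), eigs.max())
    print("naive second moment:", naive_b2)
    print("corrected moments:", b1, b2)
```

In this setting the sample eigenvalues spread well beyond the two population values 1 and 3, and the naive second moment overshoots the population value by roughly y·b1². Subtracting that term is the simplest instance of recovering population spectral characteristics from sample moments. Estimating a full parametric spectrum, as the abstract describes, would require more moments and a model for the population eigenvalues and their weights.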