Efficient k-NN graph construction for graphs on variables



Graphs are building blocks for many statistical techniques. Recent work has addressed graph construction to improve the end result of the associated statistical analysis. In this article, we propose a new method to efficiently construct a k-nearest-neighbor (k-NN) graph in which the vertices are variables. Our primary application is covariance or concentration matrix estimation, where graphs are constructed to examine relationships between variables and to uncover data set structure. We demonstrate the improvement in the constructed graphs via simulations, and show that the method can choose k flexibly for each vertex. We also find informative features in the MADELON data set from the 2003 NIPS Feature Selection Challenge, and illustrate the method on a gene expression data set in a covariance estimation setting. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
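
To make the object of study concrete, the sketch below builds a k-NN graph whose vertices are variables (columns) rather than observations (rows). It is not the method proposed in the article, which the abstract does not describe. It is only a brute-force baseline under stated assumptions: similarity between two variables is taken to be the absolute Pearson correlation, k is fixed and identical for every vertex, and the names `pearson` and `knn_graph_on_variables` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def knn_graph_on_variables(data, k):
    """Brute-force directed k-NN graph over variables.

    data: list of rows, one row per observation, one column per variable.
    Returns a dict mapping each variable index to the indices of the k
    variables with the largest absolute correlation to it.
    Assumption: |Pearson correlation| is the similarity; ties are broken
    by the smaller variable index.
    """
    p = len(data[0])
    columns = [[row[j] for row in data] for j in range(p)]
    graph = {}
    for i in range(p):
        sims = [
            (abs(pearson(columns[i], columns[j])), j)
            for j in range(p)
            if j != i
        ]
        # Sort by decreasing similarity, then by index for determinism.
        sims.sort(key=lambda t: (-t[0], t[1]))
        graph[i] = [j for _, j in sims[:k]]
    return graph


# Toy data: variable 1 is roughly twice variable 0; variables 2 and 3 are unrelated.
data = [
    [1.0, 2.0, 5.0, 0.3],
    [2.0, 4.1, 3.0, 0.9],
    [3.0, 6.2, 4.0, 0.1],
    [4.0, 7.9, 1.0, 0.7],
    [5.0, 10.0, 2.0, 0.4],
]
graph = knn_graph_on_variables(data, k=1)
```

This baseline compares every pair of variables, so its cost grows quadratically with the number of variables. That cost, together with the fixed k, is the kind of limitation an efficient construction with a per-vertex k would aim to avoid.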