Finding relevant clustering directions in high-dimensional data using Particle Swarm Optimization



A method based on Particle Swarm Optimization (PSO) is proposed for finding subspaces that carry meaningful information about the presence of groups in high-dimensional data sets. The advantage of PSO is that it identifies not only the variables responsible for the main data structure but also other subspaces corresponding to local optima. The characteristics of the method are demonstrated on two simulated data sets and on a real data matrix from the analysis of genomic microarrays. In all cases, PSO made it possible to explore different subspaces and to discover meaningful structures in the analyzed data. Copyright © 2010 John Wiley & Sons, Ltd.
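
To make the idea concrete, the following is a minimal illustrative sketch of a swarm-based search over variable subsets. It is not the authors' implementation: the abstract does not specify the PSO variant or the clustering criterion, so everything here is an assumption. The sketch uses a standard binary PSO (sigmoid-transformed velocities), a crude clusterability score (between-group over total sum of squares after a deterministic 2-means split), and hypothetical names (make_data, two_means_ratio, binary_pso) with arbitrary parameter values. The personal bests returned alongside the global best stand in for the "other subspaces" associated with local optima.

```python
import math
import random


def make_data(n=40, d=6, seed=1):
    """Toy data (assumption): two groups separated on variables 0 and 1, noise elsewhere."""
    rng = random.Random(seed)
    X = []
    for i in range(n):
        shift = 3.0 if i < n // 2 else -3.0
        X.append([rng.gauss(shift, 1.0) if j < 2 else rng.gauss(0.0, 1.0)
                  for j in range(d)])
    return X


def two_means_ratio(X, mask):
    """Between-group / total sum of squares after a 2-means split on the selected columns."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty subspace carries no information
    P = [[row[j] for j in cols] for row in X]
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    mean = lambda pts: [sum(c) / len(pts) for c in zip(*pts)]
    g = mean(P)
    c0 = max(P, key=lambda p: dist(p, g))   # deterministic start: farthest from centroid
    c1 = max(P, key=lambda p: dist(p, c0))  # second centre: farthest from the first
    for _ in range(10):
        A = [p for p in P if dist(p, c0) <= dist(p, c1)]
        B = [p for p in P if dist(p, c0) > dist(p, c1)]
        if not A or not B:
            return 0.0
        c0, c1 = mean(A), mean(B)
    tss = sum(dist(p, g) for p in P)
    bss = len(A) * dist(c0, g) + len(B) * dist(c1, g)
    return bss / tss if tss > 0 else 0.0


def binary_pso(X, n_particles=12, n_iter=30, seed=0):
    """Binary PSO over 0/1 variable masks; returns (best mask, best score, personal bests)."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    pos[0] = [1] * d  # seed one particle with the full variable set as a baseline
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pscore = [two_means_ratio(X, p) for p in pos]
    b = max(range(n_particles), key=lambda i: pscore[i])
    gbest, gscore = pbest[b][:], pscore[b]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to keep the sigmoid unsaturated
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            s = two_means_ratio(X, pos[i])
            if s > pscore[i]:
                pbest[i], pscore[i] = pos[i][:], s
                if s > gscore:
                    gbest, gscore = pos[i][:], s
    return gbest, gscore, list(zip(pbest, pscore))


X = make_data()
best_mask, best_score, personal_bests = binary_pso(X)
print("best subspace:", best_mask, "score:", round(best_score, 3))
```

Inspecting personal_bests after a run shows the distinct masks on which individual particles settled. In this sketch that list is only a rough analogue of the alternative subspaces mentioned above, since the criterion and the data are deliberately simplistic.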