Nearest-neighbors medians clustering

Abstract

We propose a nonparametric clustering algorithm based on local medians. Each observation is replaced by its local median, which moves it toward the peaks and away from the valleys of the distribution. The process is repeated until each observation converges to a fixed point. We obtain a partition of the sample from the convergence points. Given the proportion α of neighbors, our algorithm determines both the number of clusters and the partition of the observations. We also propose a fast version of the algorithm in which only a subset of the observations in the sample is processed. For the univariate case, we prove that each point converges to its closest fixed point, and that a fixed point exists and is unique in a neighborhood of each mode. © 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2012
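The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the number of neighbors is taken as k = ceil(α·n), neighbors are always drawn from the original sample using Euclidean distance, the local median is computed coordinate-wise, the iteration stops after a hypothetical `max_iter` cap, and convergence points are grouped by rounding. The function names `local_median_step` and `nn_medians_clustering` are illustrative only.

```python
import math
import numpy as np


def local_median_step(points, sample, k):
    """Replace each point by the coordinate-wise median of its k nearest
    observations in the original sample (assumed definition of local median)."""
    # Euclidean distances from every current point to every sample observation.
    dists = np.linalg.norm(points[:, None, :] - sample[None, :, :], axis=2)
    # Indices of the k closest sample observations for each point (unordered).
    nn = np.argpartition(dists, k - 1, axis=1)[:, :k]
    # sample[nn] has shape (m, k, d); take the median over the k neighbors.
    return np.median(sample[nn], axis=1)


def nn_medians_clustering(x, alpha, max_iter=100, tol=1e-9):
    """Iterate local medians until no point moves, then group the points
    by the location they converged to."""
    sample = np.asarray(x, dtype=float)
    if sample.ndim == 1:
        sample = sample[:, None]  # treat a 1-D input as n univariate observations
    n = sample.shape[0]
    # Assumption: alpha is the fraction of the sample used as neighbors.
    k = min(n, max(1, math.ceil(alpha * n)))

    points = sample.copy()
    for _ in range(max_iter):
        new_points = local_median_step(points, sample, k)
        moved = np.abs(new_points - points).max()
        points = new_points
        if moved <= tol:  # every point is (numerically) at a fixed point
            break

    # Assumption: points whose rounded convergence locations agree share a cluster.
    centers, labels = np.unique(
        np.round(points, 6), axis=0, return_inverse=True
    )
    return labels.reshape(-1), centers


if __name__ == "__main__":
    data = [0, 1, 2, 3, 4, 100, 101, 102, 103, 104]
    labels, centers = nn_medians_clustering(data, alpha=0.5)
    print(labels)           # one cluster label per observation
    print(centers.ravel())  # the convergence points
```

In this toy univariate sample, α = 0.5 gives k = 5, so every observation sees only its own group of five and moves to that group's median. The number of clusters is not supplied by the user; it falls out of how many distinct convergence points remain. The sketch builds a full distance matrix, so it is only suitable for small samples.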
