In this paper, we describe the implementation and evaluation of a cluster-based enrichment strategy for calling hits from a high-throughput screen, using a typical cell-based assay of 160,000 chemical compounds. We focus on the statistical properties of the prospective design choices throughout the analysis, including how to choose the number of clusters for optimal power, the choice of test statistic, the significance thresholds for clusters and the activity threshold for candidate hits, how to rank selected hits for carry-forward to the confirmation screen, and how to identify confirmed hits in a data-driven manner. Whereas the literature has previously focused on the choice of test statistic or chemical descriptors, our studies suggest that cluster size is the more important design choice. We recommend ranking clusters by enrichment odds ratio rather than by p-value. Our conceptually simple test statistic identifies the same set of hits as the more complex scoring methods proposed in the literature. We prospectively confirm that such a cluster-based approach can outperform the naive top X approach, and we estimate that it improved confirmation rates by about 31.5%, from 813 using the top X approach to 1187 using our cluster-based method. Copyright © 2012 John Wiley & Sons, Ltd.
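
The idea of ranking clusters by enrichment odds ratio rather than by p-value can be illustrated with a minimal sketch. The abstract does not specify the test statistic, so everything here is an assumption made for illustration: a 2x2 table of active versus inactive compounds inside versus outside each cluster, a 0.5 continuity correction on the odds ratio, and a one-sided hypergeometric (Fisher-type) tail probability. The function name `cluster_enrichment` and its inputs are hypothetical.

```python
from collections import defaultdict
from math import comb


def cluster_enrichment(cluster_ids, is_active):
    """Per-cluster enrichment statistics, sorted by odds ratio (descending).

    cluster_ids: cluster label for each compound (hypothetical input).
    is_active:   1/0 (or True/False) flag per compound, e.g. from an
                 activity threshold chosen elsewhere.
    """
    total = len(cluster_ids)                    # all screened compounds
    total_active = sum(int(a) for a in is_active)

    # cluster label -> [cluster size, number of actives in cluster]
    counts = defaultdict(lambda: [0, 0])
    for label, active in zip(cluster_ids, is_active):
        counts[label][0] += 1
        counts[label][1] += int(active)

    results = []
    for label, (size, actives) in counts.items():
        # 2x2 table cells, each with a 0.5 continuity correction
        # (an assumption, used here to avoid division by zero).
        in_act = actives + 0.5
        in_inact = size - actives + 0.5
        out_act = total_active - actives + 0.5
        out_inact = total - size - (total_active - actives) + 0.5
        odds_ratio = (in_act * out_inact) / (in_inact * out_act)

        # One-sided hypergeometric tail: P(X >= actives) when `size`
        # compounds are drawn at random from the whole screen.
        upper = min(size, total_active)
        tail = sum(
            comb(total_active, i) * comb(total - total_active, size - i)
            for i in range(actives, upper + 1)
        )
        p_value = tail / comb(total, size)

        results.append(
            {"cluster": label, "size": size, "actives": actives,
             "odds_ratio": odds_ratio, "p_value": p_value}
        )

    # Rank by effect size (odds ratio), not by p-value.
    results.sort(key=lambda r: r["odds_ratio"], reverse=True)
    return results


if __name__ == "__main__":
    labels = ["A"] * 4 + ["B"] * 6
    flags = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    for row in cluster_enrichment(labels, flags):
        print(row)
```

In this sketch the p-value is reported alongside the odds ratio so that a significance threshold could still be used to filter clusters before ranking; the threshold itself is left to the analyst.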