Fast subset scan for multivariate event detection


Correspondence to: Daniel B. Neill, H.J. Heinz III College, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, U.S.A.



We present new subset scan methods for multivariate event detection in massive space–time datasets. We extend the recently proposed ‘fast subset scan’ framework from univariate to multivariate data, enabling computationally efficient detection of irregular space–time clusters even when the numbers of spatial locations and data streams are large. For two variants of the multivariate subset scan, we demonstrate that the scan statistic can be efficiently optimized over proximity-constrained subsets of locations and over all subsets of the monitored data streams, enabling timely detection of emerging events and accurate characterization of the affected locations and streams. Using our new fast search algorithms, we perform an empirical comparison of the Subset Aggregation and Kulldorff multivariate subset scans on synthetic data and real-world disease surveillance tasks, demonstrating tradeoffs between the detection and characterization performance of the two methods. Copyright © 2012 John Wiley & Sons, Ltd.