Anomaly-based intrusion detection systems (IDSs) build a profile of benign network behavior, and any deviation from this profile is treated as an attack. Many of the algorithms proposed in the literature for anomaly IDSs fall into the cluster analysis category. As networks become faster, the amount of data that needs to be analyzed becomes huge. Many clustering techniques require more than one pass over the dataset; thus, when used as anomaly IDSs, these algorithms become computationally expensive and cannot keep up with such high-speed networks. To handle voluminous data, anomaly IDS schemes have been proposed that use data summarization techniques. Data summarization techniques found in the literature, however, suffer from false alarms caused by improper clustering when used as anomaly IDSs. In this paper, an anomaly IDS is proposed that is capable of handling large datasets while minimizing false alarms. Copyright © 2012 John Wiley & Sons, Ltd.