Special Issue Paper
Combining expert knowledge with automatic feature extraction for reliable web attack detection
Article first published online: 29 AUG 2012
Copyright © 2012 John Wiley & Sons, Ltd.
Security and Communication Networks
How to Cite
Torrano-Gimenez, C., Nguyen, H. T., Alvarez, G. and Franke, K. (2012), Combining expert knowledge with automatic feature extraction for reliable web attack detection. Security Comm. Networks. doi: 10.1002/sec.603
- Web attack detection;
- web application firewall;
- intrusion detection systems;
- machine learning algorithms;
- feature selection;
To detect web attacks, Web Application Firewalls (WAFs) must be both effective and efficient. In this paper, we propose a new methodology for web attack detection that enhances these two aspects of WAFs. It involves both feature construction and feature selection. For the feature construction phase, many professionals rely on their expert knowledge to define a set of important features, which normally leads to high and reliable attack detection rates. Nevertheless, this is a manual process that does not adapt quickly to changing network environments. Alternatively, automatic feature construction methods (such as n-grams) overcome this drawback, but they provide unreliable results. Therefore, in this paper, we propose to combine expert knowledge with the n-gram feature construction method for reliable and efficient web attack detection. However, the number of n-grams grows exponentially with n, which usually leads to high dimensionality problems. Hence, we propose to apply feature selection to remove redundant and irrelevant features. In particular, we study the recently proposed Generic Feature Selection (GeFS) measure, which has been successfully tested in intrusion detection systems. Additionally, we use several decision tree algorithms as classifiers of WAFs. The experiments are conducted on the publicly available ECML/PKDD 2007 dataset. The results show that the combination of expert knowledge and n-grams outperforms each separate technique and that the GeFS measure can greatly reduce the number of features, thus enhancing both the effectiveness and efficiency of WAFs.
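As a rough illustration of the n-gram feature construction mentioned in the abstract, and of why the feature count grows exponentially with n, the following minimal sketch counts overlapping character n-grams in a request string. It is not the authors' implementation: the function names `char_ngrams` and `max_feature_count`, the sample request, and the assumption of a 256-symbol byte alphabet are all hypothetical choices made for this example.

```python
from collections import Counter


def char_ngrams(payload: str, n: int) -> Counter:
    """Count overlapping character n-grams in a request payload.

    Hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # A payload shorter than n yields an empty range, hence no n-grams.
    return Counter(payload[i:i + n] for i in range(len(payload) - n + 1))


def max_feature_count(alphabet_size: int, n: int) -> int:
    """Upper bound on distinct n-grams: alphabet_size ** n."""
    return alphabet_size ** n


if __name__ == "__main__":
    # Hypothetical request fragment, not taken from the ECML/PKDD 2007 dataset.
    request = "id=1' OR '1'='1"
    bigrams = char_ngrams(request, 2)
    print(bigrams.most_common(3))

    # Assuming a 256-symbol byte alphabet, the bound grows exponentially in n.
    for n in (1, 2, 3):
        print(n, max_feature_count(256, n))
```

Under the assumed alphabet, the bound rises from 256 possible features at n = 1 to 65,536 at n = 2, which is the dimensionality problem that motivates applying feature selection afterwards.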