AutoMal: automatic clustering and signature generation for malwares based on the network flow

Authors

  • Sun Hao,

    Corresponding author
    1. School of Computer, National University of Defense Technology, Changsha, China
    2. National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
    • Correspondence Sun Hao, School of Computer, National University of Defense Technology, Changsha, China; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China.

      E-mail: sunhao4257@gmail.com

  • Wen Wang,

    1. School of Computer, National University of Defense Technology, Changsha, China
  • Huabiao Lu,

    1. School of Computer, National University of Defense Technology, Changsha, China
  • Peige Ren

    1. School of Computer, National University of Defense Technology, Changsha, China

Abstract

The volume of malware is growing exponentially, which makes manual analysis extremely hard. Most existing signature extraction methods are based on string signatures, and string matching is both inaccurate and time-consuming. This paper therefore presents AutoMal, a system for automatically extracting signatures from large-scale malware samples. First, the system represents network flows using feature hashing, which dramatically reduces the high-dimensional feature spaces that are common in malware analysis. Then, a clustering and median filtering method classifies the malware vectors into different types. Finally, a signature generation algorithm based on the Bayesian method is applied. The system can extract both the byte signature and the hash signature of malware from its network flow with a low false positive rate and zero false negatives. Our evaluation shows that AutoMal can generate strongly noise-resistant signatures that exactly depict the characteristics of malware. Copyright © 2014 John Wiley & Sons, Ltd.
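As an illustration of the feature hashing step mentioned in the abstract, the sketch below maps a flow payload of arbitrary length to a fixed-length vector. It is not AutoMal's implementation: the function name `hash_flow`, the use of byte 3-grams, the MD5-based index, the signed update, and the default dimension of 1024 are all assumptions chosen to show the general hashing trick.

```python
import hashlib
from typing import List


def hash_flow(payload: bytes, dim: int = 1024, n: int = 3) -> List[int]:
    """Project the byte n-grams of a payload into a vector of length `dim`.

    Hypothetical helper: n-gram size, hash function, and sign rule are
    illustrative assumptions, not details taken from the paper.
    """
    vec = [0] * dim
    # Slide a window of n bytes over the payload.
    for i in range(len(payload) - n + 1):
        gram = payload[i:i + n]
        digest = hashlib.md5(gram).digest()
        # The first four digest bytes select the bucket.
        index = int.from_bytes(digest[:4], "big") % dim
        # One further digest bit selects the sign, which reduces
        # the bias introduced by bucket collisions.
        sign = 1 if digest[4] & 1 else -1
        vec[index] += sign
    return vec


if __name__ == "__main__":
    flow = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
    vector = hash_flow(flow, dim=64)
    print(len(vector))
```

Whatever the payload length, the output has `dim` entries, so flows of different sizes become directly comparable vectors that a clustering step can consume.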
