User identification and anonymization in 802.11 wireless LANs



Privacy has been a serious concern for users of 802.11 wireless LANs. Prior research on privacy usually focuses on pseudonym techniques, in which unique, consistent explicit identifiers such as users' MAC addresses are changed frequently, making it difficult to track users through them. However, recent research by Pang et al. (Pang et al. Proceedings of the 13th Annual ACM International Conference on Mobile Computing and Networking, 2007; 99–110) has demonstrated that pseudonyms are not adequate to protect user privacy. The key idea of Pang et al.'s method is to locate implicit identifiers (e.g., the IP addresses and port numbers a user frequently connects to), build user behavior patterns from these implicit identifiers, and then apply classification techniques to identify users. In this paper, we first propose a new 802.11 user identification approach that uses enhanced feature selection and generation to build more accurate user behavior patterns. Our simulation results on the 9.27 GB SIGCOMM 2004 wireless data sets demonstrate that our method achieves better classification rates than Pang et al.'s method. We then study how to provide user anonymity even when implicit-identifier-based identification is applied, by introducing a set of 802.11 user anonymization approaches based on bogus traffic injection. We propose eight methods that artificially generate bogus data and inject it into the original traffic, thereby disturbing users' behavior patterns. Our simulation results on the same SIGCOMM 2004 data sets demonstrate that our anonymization methods can effectively decrease user identification rates and hence improve user anonymity. Copyright © 2011 John Wiley & Sons, Ltd.
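
As a rough illustration of the two ideas summarized above, the following minimal sketch builds a behavior profile from implicit identifiers (here, sets of destination IP address and port pairs), matches a pseudonymous trace against known profiles, and then shows how injected bogus destinations lower the match score. Everything in it is an assumption made for illustration: the user names, addresses, the Jaccard similarity measure, and the helper names `profile`, `identify`, and `inject_bogus` are hypothetical, and the code reproduces neither Pang et al.'s classifier nor any of the eight injection methods proposed in this paper.

```python
import random


def profile(flows):
    """Behavior profile: the set of (ip, port) destinations seen in a trace."""
    return set(flows)


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def identify(trace, known_profiles):
    """Return the known user whose profile is most similar to the trace."""
    sample = profile(trace)
    return max(known_profiles, key=lambda user: jaccard(sample, known_profiles[user]))


def inject_bogus(trace, pool, count, rng):
    """Return the trace with `count` bogus destinations from `pool` appended."""
    return list(trace) + rng.sample(pool, count)


# Hypothetical training profiles (assumed data, not from the SIGCOMM 2004 traces).
known = {
    "alice": profile([("10.0.0.1", 22), ("10.0.0.2", 443), ("10.0.0.3", 993)]),
    "bob": profile([("10.0.1.1", 80), ("10.0.1.2", 5222), ("10.0.1.3", 443)]),
}

# A trace captured under a fresh pseudonym; no MAC address is needed to match it.
trace = [("10.0.0.1", 22), ("10.0.0.2", 443)]

# Hypothetical pool of bogus destinations that no known user visits.
bogus_pool = [("192.0.2.1", 80), ("192.0.2.2", 443),
              ("192.0.2.3", 25), ("192.0.2.4", 8080)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
disguised = inject_bogus(trace, bogus_pool, 3, rng)

plain_score = jaccard(profile(trace), known["alice"])
disguised_score = jaccard(profile(disguised), known["alice"])

print(identify(trace, known), plain_score, disguised_score)
```

In this toy setting the bogus destinations only dilute the match score; whether injection actually changes the classifier's decision depends on how the bogus traffic is generated and on the classifier used.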