Shin participated in this work while he was a PhD student at Indiana University. He is currently with Korea Internet and Security Agency, Seoul, South Korea.
A link graph-based approach to identify forum spam
Article first published online: 10 MAR 2014
Copyright © 2014 John Wiley & Sons, Ltd.
Security and Communication Networks
Volume 8, Issue 2, pages 176–188, 25 January 2015
How to Cite
(2015) A link graph-based approach to identify forum spam, Security Comm. Networks, 8, 176–188, doi: 10.1002/sec.970.
- Issue published online: 19 DEC 2014
- Manuscript Accepted: 30 DEC 2013
- Manuscript Revised: 19 OCT 2013
- Manuscript Received: 18 JUN 2013
- forum spam
- web spam
- link graph
Web spammers have taken note of the popularity of public forums such as blogs, wikis, webboards, and guestbooks, and now exploit them to drive traffic to malicious or fraudulent websites, such as those used for phishing, distributing malware, or selling counterfeit pharmaceuticals. A popular technique is to spam these forums with URLs that point to the spammers' websites. We consider the problem of classifying URLs posted to forums as spam or legitimate by examining the link structure of the graph rooted at the posted URL. We investigate various graph metrics and associated metadata for analyzing these link structures. To reduce the noise in the structural characteristics of the link graphs, we also examine two techniques: varying the depth of the link graphs and aggregating their sub-graphs. Our results show that a support vector machine classifier based on combinations of graph metrics and metadata of link graphs can achieve performance high enough for practical forum spam detection.
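The idea of a link graph rooted at a posted URL, explored to a chosen depth, can be illustrated with a small sketch. The abstract does not name the specific graph metrics used, so the features below (node count, edge count, root out-degree, density) are illustrative assumptions, as are the function names `link_graph` and `graph_features` and the toy `toy_links` data. In practice, feature vectors of this kind would be passed to a classifier such as the support vector machine mentioned above, which is not shown here.

```python
from collections import deque


def link_graph(root, out_links, max_depth):
    """Breadth-first walk over `out_links` starting at `root`.

    Pages at depth `max_depth` are included as nodes but not expanded.
    Returns (depth_of, edges): the hop distance of every reached page and
    the set of directed (source, target) links that were followed.
    """
    depth_of = {root: 0}
    edges = set()
    queue = deque([root])
    while queue:
        page = queue.popleft()
        if depth_of[page] == max_depth:
            continue  # depth limit reached: keep the node, skip its links
        for target in out_links.get(page, ()):
            edges.add((page, target))
            if target not in depth_of:
                depth_of[target] = depth_of[page] + 1
                queue.append(target)
    return depth_of, edges


def graph_features(root, out_links, max_depth):
    """Hypothetical feature set; the paper's actual metrics may differ."""
    depth_of, edges = link_graph(root, out_links, max_depth)
    n, m = len(depth_of), len(edges)
    return {
        "nodes": n,
        "edges": m,
        "root_out_degree": sum(1 for src, _ in edges if src == root),
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "max_depth_reached": max(depth_of.values()),
    }


# Toy adjacency data standing in for crawled pages (assumed, not real URLs).
toy_links = {
    "http://posted.example/": ["http://a.example/", "http://b.example/"],
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://posted.example/"],
}

shallow = graph_features("http://posted.example/", toy_links, max_depth=1)
deeper = graph_features("http://posted.example/", toy_links, max_depth=2)
```

Comparing `shallow` and `deeper` shows how the same posted URL yields different structural features at different depths, which is the kind of variation the depth technique in the abstract is concerned with.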