Web services are seriously threatened by remote exploit code attacks, in which maliciously crafted HTTP requests inject binary code to compromise web servers and web applications. In practice, besides detecting such attacks, attack attribution analysis (i.e., automatically categorizing exploits or determining whether an exploit is a variant of a past attack) is also very important. In this paper, we present SA3, a novel exploit code attribution analysis that combines semantics-based analysis and statistical modeling to automatically categorize given exploit code. SA3 extracts semantic features from exploit code through data anomaly analysis and then attributes the exploit to an appropriate class using a statistical model derived from a Markov model. We evaluate SA3 on a comprehensive set of shellcode collected from Metasploit and other polymorphic engines. Experimental results show that SA3 is effective and efficient: attribution accuracy can exceed 90% under different parameter settings, with a false positive rate of no more than 4.5%. The novelty of SA3 is that it combines semantic analysis with statistical modeling for exploit code attribution analysis. Copyright © 2012 John Wiley & Sons, Ltd.