Network security has been a serious concern for many years. For example, firewalls often record thousands of exploit attempts every day. Network administrators could benefit from information on potentially aggressive attack sources, as such information can help them defend their networks proactively. For this purpose, several large-scale information sharing systems have been established, in which information on cyberattacks targeting each participant network is shared so that a network can be forewarned of attacks observed by others.
However, the number of attackers reported in these systems is very large. Thus, a challenging problem is to identify the attackers that are most relevant to each individual network (i.e., most likely to come to that network in the near future). We present a framework to estimate the relevance of each attacker with respect to each network. In particular, we model each attacker's relevance as a function over the networks. Different attackers have different functions, and the distribution of these functions is modeled using a Gaussian process (GP). The relevance function of each attacker is then inferred from the GP, which is itself learned from the collection of attack information. We test our framework on the attack reports in the DShield information sharing system. Experiments show that attackers found relevant to a network by our framework are indeed more likely to come to that network in the future. Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 56-68, 2010
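To make the idea of inferring a relevance function from a learned GP more concrete, the following is a minimal sketch and not the method of the paper. It assumes that past reports form a binary attacker-by-network matrix, that the GP over the finite set of networks is summarized by an empirical mean and covariance, and that relevance is the posterior mean given the networks where an attacker has already been seen. The names `fit_prior`, `solve`, `posterior_mean`, `jitter`, and `noise` are hypothetical, and the toy data are invented for illustration. Plain Python is used only to keep the sketch free of dependencies.

```python
def fit_prior(reports, jitter=1e-3):
    """Estimate a mean and covariance over networks from past attackers.

    reports[a][j] is 1.0 if attacker a was reported by network j, else 0.0.
    (Assumed data layout, not taken from the paper.)
    """
    n, m = len(reports), len(reports[0])
    mean = [sum(r[j] for r in reports) / n for j in range(m)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in reports) / n
            + (jitter if i == j else 0.0)  # jitter keeps the matrix well conditioned
            for j in range(m)] for i in range(m)]
    return mean, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def posterior_mean(mean, cov, observed, noise=0.1):
    """GP posterior mean on every network given noisy observations on some.

    observed maps a network index to the value seen there for one attacker.
    """
    idx = sorted(observed)
    k_oo = [[cov[i][j] + (noise if i == j else 0.0) for j in idx] for i in idx]
    resid = [observed[i] - mean[i] for i in idx]
    alpha = solve(k_oo, resid)
    return [mean[t] + sum(cov[t][i] * a for i, a in zip(idx, alpha))
            for t in range(len(mean))]


# Toy history: networks 0 and 1 tend to see the same attackers, network 2 does not.
history = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
mean, cov = fit_prior(history)
# A new attacker has so far been reported only by network 0.
scores = posterior_mean(mean, cov, {0: 1.0})
```

In this toy setting the score for network 1 rises above its prior mean while the score for network 2 falls below it, which is the kind of ranking signal the abstract describes. A real system would need a likelihood suited to binary reports and a covariance that scales to many networks, neither of which is attempted here.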