BotFinder: finding bots in network traffic without deep packet inspection
Bots are the root cause of many security problems on the Internet, as they send spam, steal information from infected machines, and perform distributed denial-of-service attacks. Many approaches to bot detection have been proposed, but they either rely on end-host installations, or, if they operate on network traffic, require deep packet inspection for signature matching. In this paper, we present BotFinder, a novel system that detects infected hosts in a network using only high-level properties of the bot's network traffic. BotFinder does not rely on content analysis. Instead, it uses machine learning to identify the key features of command-and-control communication, based on observing traffic that bots produce in a controlled environment. Using these features, BotFinder creates models that can be deployed at network egress points to identify infected hosts. We trained our system on a number of representative bot families, and we evaluated BotFinder on real-world traffic datasets -- most notably, the NetFlow information of a large ISP that contains more than 25 billion flows. Our results show that BotFinder is able to detect bots in network traffic without the need of deep packet inspection, while still achieving high detection rates with very few false positives.