ProHunter: A Comprehensive APT Hunting System Based on Whole-System Provenance
About
Advanced Persistent Threats (APTs) remain difficult to detect due to their stealthy nature and long-term persistence. To tackle this challenge, provenance-based threat hunting has gained traction as a proactive defense mechanism. This technique models audit logs as a whole-system provenance graph and searches for subgraphs that match APT patterns recorded in Cyber Threat Intelligence (CTI) reports. However, several limitations persist: 1) significant memory and time overhead due to the extremely large provenance graphs; 2) imprecise segmentation of APT activities from provenance graphs due to their intricate entanglement with benign operations; and 3) poor alignment of attack representations between CTI-derived query graphs and provenance graphs due to their substantial semantic gaps. To address these limitations, this paper presents ProHunter, an efficient and accurate provenance-based APT hunting system with a platform-independent design. To minimize system overhead, ProHunter creates a compact data structure that efficiently stores long-term provenance graphs using semantic abstraction and bit-level hierarchical encoding strategies. To segment APT behaviors, a heuristic-driven threat graph sampling algorithm is designed, which can extract precise attack patterns from provenance graphs. Furthermore, to bridge the semantic gaps between CTI-derived graphs and provenance graphs, ProHunter proposes adaptive graph representation and feature enhancement methods, enabling the extraction of consistent attack semantics at both localized and globalized levels.Extensive evaluations on real-world APT campaigns from DARPA TC E3, E5 and OpTC datasets demonstrate that ProHunter outperforms state-of-the-art threat hunting systems in terms of efficiency and accuracy. Our code is available at https://github.com/xueboQiu/ProHunter.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Threat Hunting | OPTC | Recall100 | 6 | |
| Threat Hunting | E5-ClearScope | Recall100 | 6 | |
| Threat Hunting | E3-CADETS | Recall100 | 6 | |
| Threat Hunting | E3-THEIA | Recall100 | 6 | |
| Threat Hunting | E5-THEIA | Recall100 | 6 | |
| Threat Hunting | E3-Trace | Recall100 | 6 |