LLMs for Domain Generation Algorithm Detection
About
This work analyzes the use of large language models (LLMs) for detecting domain generation algorithms (DGAs). We perform a detailed evaluation of two techniques, In-Context Learning (ICL) and Supervised Fine-Tuning (SFT), and show how each can improve detection. SFT improves performance by training on domain-specific data, whereas ICL lets the detection model adapt quickly to new threats without extensive retraining. We use Meta's Llama3 8B model on a custom dataset containing 68 malware families and normal domains, covering several hard-to-detect schemes, including recent word-based DGAs. The results show that LLM-based methods can achieve competitive performance in DGA detection. In particular, the SFT-based LLM DGA detector outperforms state-of-the-art models that use attention layers, achieving 94% accuracy with a 4% false positive rate (FPR) and excelling at detecting word-based DGA domains.
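To make the ICL idea concrete, the following is a minimal sketch of how a few-shot prompt for DGA detection could be assembled. It is not the prompt used in this work: the function name `build_icl_prompt`, the label strings `dga` and `legit`, the instruction wording, and the example domains are all hypothetical and chosen only for illustration. The sketch builds the prompt text and does not call any model.

```python
from typing import List, Tuple


def build_icl_prompt(examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a few-shot prompt: an instruction, labeled examples, then the query.

    `examples` holds (domain, label) pairs; the labels are hypothetical
    ('dga' or 'legit') and are not taken from the original work.
    """
    lines = ["Classify each domain as 'dga' or 'legit'.", ""]
    for domain, label in examples:
        lines.append(f"Domain: {domain}")
        lines.append(f"Label: {label}")
        lines.append("")  # blank line between examples
    # The query domain comes last, with the label left open for the model.
    lines.append(f"Domain: {query}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up domains, used only to show the prompt layout.
    shots = [
        ("example.com", "legit"),
        ("xkqjzvhwplmd.net", "dga"),
        ("tablewindowgarden.org", "dga"),  # stands in for a word-based DGA
    ]
    print(build_icl_prompt(shots, "qpwoeirutyal.info"))
```

Adapting to a new malware family under this scheme would mean swapping the labeled examples in the prompt, which is why ICL needs little or no retraining compared with SFT.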
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| DGA Detection | 2020–2025 (test) | Accuracy | 80.5239 | 12 |
| DGA Detection | DGA 2020 (test) | FPR | 19.9478 | 12 |
| DGA Detection | DGA 2021 (test) | FPR | 0.2006 | 12 |
| DGA Detection | DGA 2022 (test) | FPR | 19.681 | 12 |
| DGA Detection | DGA 2023 (test) | FPR | 19.9898 | 12 |
| DGA Detection | DGA 2024 (test) | FPR | 20.0038 | 12 |
| DGA Detection | DGA 2025 (test) | FPR | 20.0178 | 12 |
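
For reference, accuracy and false positive rate follow the standard binary-classification definitions: accuracy is the share of correct predictions, and FPR is the share of normal domains wrongly flagged as DGA. The sketch below computes both from label lists. The function name `accuracy_and_fpr` and the encoding (1 for DGA, 0 for normal) are assumptions for illustration, and the code does not reproduce the benchmark values above.

```python
from typing import Sequence, Tuple


def accuracy_and_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, false positive rate) for binary labels.

    Assumed encoding: 1 = DGA domain (positive), 0 = normal domain (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1  # normal domain flagged as DGA
        else:
            fn += 1  # DGA domain missed

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # FPR is undefined without negatives; report 0.0 in that case.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    acc, fpr = accuracy_and_fpr([1, 1, 0, 0], [1, 0, 1, 0])
    print(f"accuracy={acc:.2f}, fpr={fpr:.2f}")
```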