
LLMs for Domain Generation Algorithm Detection

About

This work analyzes the use of large language models (LLMs) for detecting domain generation algorithms (DGAs). We evaluate two techniques in detail, In-Context Learning (ICL) and Supervised Fine-Tuning (SFT), and show how each can improve detection. SFT increases performance by training on domain-specific data, whereas ICL helps the detection model adapt quickly to new threats without requiring much retraining. We use Meta's Llama3 8B model on a custom dataset of 68 malware families plus normal domains, covering several hard-to-detect schemes, including recent word-based DGAs. The results show that LLM-based methods can achieve competitive results in DGA detection. In particular, the SFT-based LLM DGA detector outperforms state-of-the-art models using attention layers, achieving 94% accuracy with a 4% false positive rate (FPR) and excelling at detecting word-based DGA domains.
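To make the ICL idea concrete, here is a minimal sketch of how a few-shot prompt for DGA detection might be assembled. The function name `build_icl_prompt`, the label strings, the prompt wording, and the example domains are all illustrative assumptions; the abstract does not specify the prompt format the authors used.

```python
def build_icl_prompt(examples, query):
    """Assemble a few-shot classification prompt (hypothetical format).

    examples: list of (domain, label) pairs shown to the model in context.
    query:    the domain the model is asked to label.
    """
    # Instruction line; the wording is an assumption, not the paper's prompt.
    lines = ["Classify each domain as 'dga' or 'benign'.", ""]
    # Each labeled example becomes a Domain/Label pair in the context.
    for domain, label in examples:
        lines.append(f"Domain: {domain}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The query is left unlabeled so the model completes the answer.
    lines.append(f"Domain: {query}")
    lines.append("Label:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example domains for illustration only.
    shots = [("google.com", "benign"), ("xjkqzvbw.net", "dga")]
    print(build_icl_prompt(shots, "example.org"))
```

Adapting to a new DGA family under this scheme would mean swapping in labeled examples of that family, with no change to model weights, which is the contrast with SFT that the abstract draws.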

Reynier Leyva La O, Carlos A. Catania, Tatiana Parlanti • 2024

Related benchmarks

Task            Dataset            Metric     Result    Rank
DGA Detection   2020–2025 (test)   Accuracy   80.5239   12
DGA Detection   DGA 2020 (test)    FPR        19.9478   12
DGA Detection   DGA 2021 (test)    FPR        0.2006    12
DGA Detection   DGA 2022 (test)    FPR        19.681    12
DGA Detection   DGA 2023 (test)    FPR        19.9898   12
DGA Detection   DGA 2024 (test)    FPR        20.0038   12
DGA Detection   DGA 2025 (test)    FPR        20.0178   12
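
For reference, the two metrics reported here can be computed from binary predictions as in the sketch below. It uses the standard definitions (accuracy = correct / total, FPR = FP / (FP + TN)) and assumes label 1 means DGA and 0 means benign; the function name and label encoding are illustrative assumptions, not taken from the benchmark.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels.

    Assumed encoding: 1 = DGA (positive), 0 = benign (negative).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1  # benign domain flagged as DGA
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # DGA domain missed
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # FPR is the share of benign domains wrongly flagged as DGA.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    acc, fpr = accuracy_and_fpr([1, 1, 0, 0, 0], [1, 0, 1, 0, 0])
    print(f"accuracy={acc:.2f} fpr={fpr:.2f}")
```

Both values come out as fractions between 0 and 1; multiply by 100 to express them as percentages.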
