DRIFT: Drift-Resilient Invariant-Feature Transformer for DGA Detection

About

Domain Generation Algorithms (DGAs) evolve continuously to evade botnet detection, posing a persistent challenge for dependable network defense. While deep learning-based detectors achieve strong performance under static conditions, they suffer severe degradation when facing temporal drift. Through a 9-year longitudinal study (2017-2025), we empirically show that state-of-the-art character- and word-based DGA classifiers rapidly lose effectiveness as new DGA variants emerge. To address this problem, we propose a drift-resilient Transformer-based framework that learns invariant representations through a hybrid tokenization strategy and multi-task self-supervised pre-training. The model integrates (i) character-level encoding to capture stochastic morphological patterns and (ii) subword-level encoding for word-based DGAs. Three pre-training tasks enable the model to learn robust structural and contextual features prior to supervised fine-tuning. Comprehensive evaluations demonstrate that our method significantly mitigates temporal degradation and consistently outperforms state-of-the-art baselines in forward-chaining experiments. The proposed approach offers a dependable foundation for long-term DGA defense in evolving threat landscapes. Our code is available at: https://github.com/snsec-net/2026-DSN-DRIFT.
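The hybrid tokenization idea in the abstract (a character-level view alongside a subword-level view of the same domain) can be illustrated with a minimal sketch. This is not the authors' tokenizer: the vocabulary `TOY_SUBWORDS`, the function names, and the greedy longest-match segmentation are all hypothetical stand-ins chosen for illustration, and a real system would more plausibly use a learned subword vocabulary.

```python
from typing import Dict, List, Set

# Hypothetical toy vocabulary (an assumption for illustration only);
# a real subword vocabulary would be learned from data.
TOY_SUBWORDS: Set[str] = {"secure", "login", "pay", "cloud", "mail", "net"}


def char_tokens(domain: str) -> List[str]:
    """Character-level view: one token per character."""
    return list(domain.lower())


def subword_tokens(domain: str, vocab: Set[str] = TOY_SUBWORDS) -> List[str]:
    """Greedy longest-match segmentation against `vocab`.

    Characters that start no vocabulary entry fall back to
    single-character tokens.
    """
    text = domain.lower()
    tokens: List[str] = []
    i = 0
    while i < len(text):
        match = text[i]  # fallback: single character
        # Try the longest candidate first, shrinking toward one character.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                match = text[i:j]
                break
        tokens.append(match)
        i += len(match)
    return tokens


def hybrid_tokenize(domain: str) -> Dict[str, List[str]]:
    """Return both views so a model could consume them side by side."""
    return {"char": char_tokens(domain), "subword": subword_tokens(domain)}
```

In this sketch a word-based name such as `securelogin` splits into two subword tokens, while a random-looking name such as `xq7pay` mostly falls back to single characters, which is the kind of contrast the two encodings are meant to expose.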

Chaeyoung Lee, Chaeri Jung, Seonghoon Jeong • 2026

Related benchmarks

Task	Dataset	Result
DGA Detection	2020–2025 (test)	Accuracy 95.2763	12
DGA Detection	DGA 2024 (test)	FPR 8.1483	12
DGA Detection	DGA 2025 (test)	FPR 8.966	12
DGA Detection	DGA 2020 (test)	FPR 3.1638	12
DGA Detection	DGA 2021 (test)	FPR 0.0301	12
DGA Detection	DGA 2022 (test)	FPR 3.2776	12
DGA Detection	DGA 2023 (test)	FPR 3.8314	12
DGA Detection	DGA Unseen-Families 2020–2025 (test)	FNR 14.3913	4
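For readers unfamiliar with the metrics in the table, the following minimal sketch computes the false positive rate (FPR) and false negative rate (FNR) from binary labels using their standard definitions. The label convention (1 = DGA, 0 = benign) and the function name `fpr_fnr` are assumptions for illustration; the listing does not state whether its values are fractions or percentages, so the sketch returns plain fractions.

```python
from typing import Sequence, Tuple


def fpr_fnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (FPR, FNR) as fractions.

    Assumed label convention: 1 = DGA (positive), 0 = benign (negative).
    FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # Guard against empty classes to avoid division by zero.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Small hand-made example: 4 benign and 4 DGA domains.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
fpr, fnr = fpr_fnr(y_true, y_pred)
```

A low FPR means few benign domains are flagged as DGA; a low FNR means few DGA domains slip through as benign.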

