CITADEL: A Semi-Supervised Active Learning Framework for Malware Detection Under Continuous Distribution Drift
About
Android malware detection systems suffer severe performance degradation over time due to concept drift caused by evolving malicious and benign app behaviors. Although recent methods leverage active learning and hierarchical contrastive loss to address drift, they remain fully supervised, computationally expensive, and ineffective on long-term real-world benchmarks. Moreover, expert labeling does not scale to the monthly emergence of nearly 300K new Android malware samples, leaving most data unlabeled and underutilized. To address these challenges, we propose CITADEL, a semi-supervised active learning framework for Android malware detection. Existing semi-supervised methods assume continuous and semantically meaningful input transformations and therefore fail to generalize well to high-dimensional binary malware features. We bridge this gap with two malware-specific augmentations, Bernoulli bit flips and feature masking, that stochastically perturb features to regularize learning under evolving malware distributions. CITADEL further incorporates a supervised contrastive loss to improve boundary sample discrimination and combines it with a multi-criteria active learning strategy based on prediction confidence, $L_p$-norm distance, and boundary uncertainty, enabling effective adaptation under constrained labeling budgets. Extensive evaluation on four large-scale Android malware benchmarks (APIGraph, Chen-AZ, MaMaDroid, and LAMDA) demonstrates that CITADEL outperforms prior work, improving F1 score by over 1%, 3%, 7%, and 14%, respectively, while using only 40% labeled samples. Furthermore, CITADEL is significantly more efficient than prior work, with $24\times$ faster training and $13\times$ fewer operations.
Availability: The code is available at https://github.com/IQSeC-Lab/CITADEL.git.
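The two augmentations named in the abstract can be illustrated with a minimal sketch. It is not taken from the CITADEL repository: the function names, the flip and mask probabilities, and the choice to mask by zeroing a feature are assumptions made for illustration only.

```python
import random
from typing import List


def bernoulli_bit_flip(x: List[int], p: float, rng: random.Random) -> List[int]:
    """Flip each binary feature independently with probability p (assumed rate)."""
    return [1 - v if rng.random() < p else v for v in x]


def feature_mask(x: List[int], p: float, rng: random.Random) -> List[int]:
    """Set each feature to 0 independently with probability p (assumed masking rule)."""
    return [0 if rng.random() < p else v for v in x]


# Hypothetical binary feature vector (e.g. presence/absence of API calls).
rng = random.Random(0)
x = [1, 0, 1, 1, 0, 0, 1, 0]

flipped = bernoulli_bit_flip(x, p=0.1, rng=rng)  # a few bits toggled
masked = feature_mask(x, p=0.2, rng=rng)         # a few features dropped to 0
```

Both functions return a new vector of the same length that stays in {0, 1}, so the perturbed sample remains a valid binary feature vector, unlike the continuous transformations (for example, additive noise) that the abstract says do not carry over to this feature space.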
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Malware Detection | APIGraph | F1 Score: 93.5 | 28 |
| Malware Detection | Chen-AZ | F1 Score: 82.7 | 28 |
| Malware Detection | LAMDA | F1 Score: 77.7 | 28 |
| Malware Detection | MaMaDroid | F1 Score: 44.9 | 28 |