White-Basilisk: A Hybrid Model for Code Vulnerability Detection

About

The proliferation of software vulnerabilities presents a significant challenge to cybersecurity, necessitating more effective detection methodologies. We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance while challenging prevailing assumptions in AI model scaling. Utilizing an innovative architecture that integrates Mamba layers, linear self-attention, and a Mixture of Experts framework, White-Basilisk achieves state-of-the-art results in vulnerability detection tasks with a parameter count of only 200M. The model's capacity to process sequences of unprecedented length enables comprehensive analysis of extensive codebases in a single pass, surpassing the context limitations of current Large Language Models (LLMs). White-Basilisk exhibits robust performance on imbalanced, real-world datasets, while maintaining computational efficiency that facilitates deployment across diverse organizational scales. This research not only establishes new benchmarks in code security but also provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks, potentially redefining optimization strategies in AI development for domain-specific applications.

Ioannis Lamprou, Alexander Shevtsov, Ioannis Arapakis, Sotiris Ioannidis • 2025

Related benchmarks

Task	Dataset	Result
Vulnerability Detection	Reveal	Accuracy 89.9	12
Vulnerability Detection	VulDeePecker	F1 Score 93.9	12
Vulnerability Detection	BigVul (val)	Accuracy 99.4	7
Vulnerability Detection	Draper	F1 Score 60.7	7
Paired Vulnerability Detection	PRIMEVUL paired (val)	P-C 12.92	7
Vulnerability Detection	PRIMEVUL (val)	Accuracy 96.3	7
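
The table reports accuracy for some datasets and F1 score for others, and the abstract stresses imbalanced, real-world data. The sketch below illustrates why the two metrics can diverge on an imbalanced set. It is a minimal illustration only: the label lists and the helper names (confusion_counts, accuracy, f1_score) are hypothetical and are not taken from the paper or its evaluation code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = vulnerable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of all samples classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall on the vulnerable class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy set: 100 functions, only 5 of them vulnerable.
y_true = [1] * 5 + [0] * 95

# Baseline that labels everything as safe.
all_safe = [0] * 100

# Hypothetical detector: finds 4 of the 5 vulnerable functions, raises 2 false alarms.
detector = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

print(accuracy(y_true, all_safe), f1_score(y_true, all_safe))
print(accuracy(y_true, detector), f1_score(y_true, detector))
```

On this toy set the all-safe baseline reaches 95% accuracy with an F1 of 0, which is why a high accuracy figure on an imbalanced dataset is best read alongside F1 or a similar class-sensitive metric.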
