A Decompilation-Driven Framework for Malware Detection with Large Language Models

About

The parallel evolution of Large Language Models (LLMs) with advanced code-understanding capabilities and of increasingly sophisticated malware presents a new frontier for cybersecurity research. This paper evaluates the efficacy of state-of-the-art LLMs in classifying executable code as either benign or malicious. We introduce an automated pipeline that first decompiles Windows executables into C code using the Ghidra disassembler and then leverages LLMs to perform the classification. Our evaluation reveals that while standard LLMs show promise, they are not yet robust enough to replace traditional anti-virus software. We demonstrate that a fine-tuned model, trained on curated malware and benign datasets, significantly outperforms its vanilla counterpart. However, the performance of even this specialized model degrades notably on newer malware. This finding demonstrates the critical need for continuous fine-tuning on emerging threats to keep the model effective against the changing coding patterns and behaviors of malicious software.
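
The two-stage pipeline described above (decompile with Ghidra, then classify with an LLM) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the project name `malware_triage`, the post-script name `ExportDecompiledC.java`, the prompt wording, the character limit, and the `llm` callable are all hypothetical. Only Ghidra's `analyzeHeadless` launcher and its `-import`, `-postScript`, and `-deleteProject` options are taken from Ghidra itself.

```python
from pathlib import Path
from typing import Callable, List


def ghidra_headless_command(ghidra_home: str, binary: str,
                            project_dir: str, export_script: str) -> List[str]:
    """Build the command line for Ghidra's headless analyzer.

    `export_script` is a hypothetical user-supplied Ghidra script that
    writes the decompiled C code to disk. On Windows the launcher is
    `analyzeHeadless.bat` rather than `analyzeHeadless`.
    """
    return [
        str(Path(ghidra_home) / "support" / "analyzeHeadless"),
        project_dir,
        "malware_triage",           # hypothetical project name
        "-import", binary,
        "-postScript", export_script,
        "-deleteProject",           # discard the temporary project afterwards
    ]


def build_prompt(c_source: str, max_chars: int = 20_000) -> str:
    """Wrap decompiled C code in a classification prompt (wording assumed)."""
    snippet = c_source[:max_chars]  # crude truncation to fit a context window
    return (
        "You are a malware analyst. Answer with exactly one word, "
        "'benign' or 'malicious'.\n\n"
        "Decompiled C code:\n" + snippet
    )


def parse_label(reply: str) -> str:
    """Map a free-text model reply to 'benign', 'malicious', or 'unknown'."""
    text = reply.strip().lower()
    if text.startswith("malicious"):
        return "malicious"
    if text.startswith("benign"):
        return "benign"
    return "unknown"


def classify(c_source: str, llm: Callable[[str], str]) -> str:
    """Classify decompiled code with any prompt-in, text-out model callable."""
    return parse_label(llm(build_prompt(c_source)))
```

In use, the command list would be passed to `subprocess.run`, the exported C file read back, and `classify` called with whichever model client is available, whether a vanilla or a fine-tuned model. Treating unparseable replies as `unknown` rather than forcing a label is a design choice made for this sketch.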

Aniesh Chawla, Udbhav Prasad • 2026

Related benchmarks

Task	Dataset	Metric	Result	Rank
Malware Detection	Contemporary 2025 (test)	Accuracy	83.2	3
Malware Classification	Contemporary Data 2025 (test)	True Positives (TP)	101	3
