Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report

About

We present Foundation-Sec-8B-Reasoning, the first open-source native reasoning model for cybersecurity. Built upon our previously released Foundation-Sec-8B base model (derived from Llama-3.1-8B-Base), the model is trained through a two-stage process combining supervised fine-tuning (SFT) and reinforcement learning from verifiable rewards (RLVR). Our training leverages proprietary reasoning data spanning cybersecurity analysis, instruction-following, and mathematical reasoning. Evaluation across 10 cybersecurity benchmarks and 10 general-purpose benchmarks demonstrates performance competitive with significantly larger models on cybersecurity tasks while maintaining strong general capabilities. The model shows effective generalization on multi-hop reasoning tasks and strong safety performance when deployed with appropriate system prompts and guardrails. This work demonstrates that domain-specialized reasoning models can achieve strong performance on specialized tasks while maintaining broad general capabilities. We release the model publicly at https://huggingface.co/fdtn-ai/Foundation-Sec-8B-Reasoning.
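The abstract names reinforcement learning from verifiable rewards (RLVR) but does not describe the reward functions used. As a rough illustration of the general idea only, the sketch below scores a completion with a programmatic check against a known reference answer. The `Answer:` marker, the function names, and the exact-match rule are all hypothetical and are not taken from the report.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' marker is an assumed output convention, not the report's.
    """
    matches = re.findall(r"Answer:\s*(.+)", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 on a case-insensitive exact match, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        # No parseable answer means no reward.
        return 0.0
    return 1.0 if answer.lower() == reference.strip().lower() else 0.0
```

A reward of this kind can be computed without a learned judge, which is what makes it "verifiable"; real setups typically use task-specific checkers (for example, numeric comparison for math or constraint checks for instruction-following) rather than plain string matching.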

Zhuoran Yang, Ed Li, Jianliang He, Aman Priyanshu, Baturay Saglam, Paul Kassianik, Sajana Weerawardhena, Anu Vellore, Blaine Nelson, Neusha Javidnia, Arthur Goldblatt, Fraser Burch, Avi Zohary, Assaf Eisenman, Mahdi Sabbaghi, Supriti Vijay, Rahim Dharssi, Dhruv Kedia, Kojin Oshiba, Yaron Singer, Amin Karbasi • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Reasoning | BBH | Accuracy: 69.9 | 507 |
| Mathematical Reasoning | GSM8K | Accuracy (GSM8K): 82.3 | 358 |
| Instruction Following | IFEval | -- | 292 |
| Instruction Following | AlpacaEval 2.0 | LC Win Rate: 62.6 | 281 |
| Multi-hop Question Answering | 2WikiMultihopQA | -- | 278 |
| Knowledge | MMLU | Accuracy: 68.3 | 71 |
| Mathematical Reasoning | MATH | Score: 0.433 | 50 |
| Knowledge | GPQA | Accuracy: 31.7 | 34 |
| Coding | HumanEval | HumanEval Mean Score: 0.799 | 28 |
| Long-context Question Answering | HotpotQA | Mean Score: 54.8 | 21 |
Showing 10 of 20 rows
