Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
About
As transformer-based large language models (LLMs) increasingly permeate society, they have revolutionized domains such as software engineering, creative writing, and digital arts. However, their adoption in cybersecurity remains limited due to challenges such as the scarcity of specialized training data and the complexity of representing cybersecurity-specific knowledge. To address these gaps, we present Foundation-Sec-8B, a cybersecurity-focused LLM built on the Llama 3.1 architecture and enhanced through continued pretraining on a carefully curated cybersecurity corpus. We evaluate Foundation-Sec-8B across both established and new cybersecurity benchmarks, showing that it matches Llama 3.1-70B and GPT-4o-mini on certain cybersecurity-specific tasks. By releasing our model to the public, we aim to accelerate the progress and adoption of AI-driven tools in both public and private cybersecurity contexts.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Hallucination Detection | InFi-Check-FG (test) | BAcc (Normalized) | 27.69 | 30 |
| Veracity Assessment | FactCheck-Bench | Macro-F1 | 69.8 | 26 |
| Fact Checking | InFi-Check-FG 1.0 (test) | PredE | 18.82 | 18 |
| Hallucination Detection | FRANK | Balanced Acc | 71.49 | 18 |
| Cybersecurity Knowledge and Malware Extraction Analysis | SECURE | KCV | 84.38 | 17 |
| Cybersecurity Knowledge Question Answering | MMLU CSec | CSec Score | 80 | 17 |
| Overall Cybersecurity Performance | Cybersecurity Multi-Benchmark Suite | Overall Mean Score | 76.9 | 17 |
| Cybersecurity Knowledge Evaluation | CyMtc (500) | CyMtc (500) Score | 86.6 | 17 |
| Cybersecurity Multiple Choice Question Answering | RedSage-MCQ 0-shot (test) | Macro Accuracy | 78.51 | 17 |
| Cybersecurity Threat Intelligence Analysis | CTI-Bench | MCQ Score | 62.4 | 17 |
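Several rows above report balanced accuracy (the unweighted mean of per-class recall, which prevents a majority class from dominating the score). As a minimal pure-Python sketch of that metric, using hypothetical labels rather than any benchmark's actual data:

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: average the recall of each class, weighting
    every class equally regardless of how many examples it has."""
    totals = defaultdict(int)  # examples per true class
    hits = defaultdict(int)    # correct predictions per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Hypothetical two-class hallucination-detection labels for illustration.
y_true = ["hallucination", "hallucination", "faithful", "faithful", "faithful"]
y_pred = ["hallucination", "faithful", "faithful", "faithful", "hallucination"]
print(round(balanced_accuracy(y_true, y_pred), 4))  # → 0.5833
```

Here plain accuracy would be 3/5 = 0.6, while balanced accuracy averages the per-class recalls (0.5 and 0.667) to 0.5833, reflecting the weaker minority-class performance.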