Cybersecurity Knowledge Question Answering on MMLU CSec
[Chart: CSec Score. Best result: RedSage-8B-Seed, 88 (Jan 29, 2026).]
Updated 4d ago
Evaluation Results

| Method | Evaluation context | Links | CSec Score |
|---|---|---|---|
| RedSage-8B-Seed | Bas... | 2026.01 | 88 |
| RedSage-8B-Base | Bas... | 2026.01 | 87 |
| RedSage-8B-CFW | Bas... | 2026.01 | 86 |
| GPT-5 | Lar... | 2026.01 | 86 |
| Qwen3-32B | Lar... | 2026.01 | 84 |
| Llama-3.1-8B | Bas... | 2026.01 | 83 |
| Qwen3-8B-Base | Bas... | 2026.01 | 83 |
| Foundation-Sec-8B | Bas... | 2026.01 | 80 |
| Llama-Primus-Base | Ins... | 2026.01 | 79 |
| RedSage-8B-DPO | Ins... | 2026.01 | 79 |
| RedSage-8B-Ins | Ins... | 2026.01 | 78 |
| Llama-Primus-Merged | Ins... | 2026.01 | 76 |
| Foundation-Sec-8B-Instruct | Ins... | 2026.01 | 76 |
| Qwen3-8B | Ins... | 2026.01 | 76 |
| DeepHat-V1-7B | Ins... | 2026.01 | 74 |
| Llama-3.1-8B-Instruct | Ins... | 2026.01 | 72 |
| Lily-Cybersecurity-7B-v0.2 | Ins... | 2026.01 | 68 |