Prompt Classification on Aegis 2.0
[Chart: F1 Score by date. Best result: NemotronReasoning, F1 Score 87.3, Jan 22, 2026]
Evaluation Results
| Method | Details | Date | F1 Score |
|---|---|---|---|
| NemotronReasoning | Model Size=4B | 2026.01 | 87.3 |
| YuFeng-XGuard | Model Size=0.6B | 2026.01 | 87.1 |
| PolyGuard | Model Size=7B | 2026.01 | 86.6 |
| Qwen3Guard-Gen | Model Size=8B, Evaluat... | 2026.01 | 86.6 |
| YuFeng-XGuard | Model Size=8B | 2026.01 | 86.4 |
| Qwen3Guard-Gen | Model Size=4B, Evaluat... | 2026.01 | 86.3 |
| NemotronGuardV2 | Model Size=8B | 2026.01 | 86.3 |
| Qwen3Guard-Gen | Model Size=0.6B, Evalu... | 2026.01 | 85.2 |
| Qwen3Guard-Gen | Model Size=0.6B, Evalu... | 2026.01 | 83.2 |
| Qwen3Guard-Gen | Model Size=8B, Evaluat... | 2026.01 | 82.6 |
| Qwen3Guard-Gen | Model Size=4B, Evaluat... | 2026.01 | 82.5 |
| GPT-OSS-SafeGuard | Model Size=20B | 2026.01 | 82.2 |
| WildGuard | Model Size=7B | 2026.01 | 81.5 |
| ShieldGemma | Model Size=9B | 2026.01 | 79.9 |
| Llama3Guard | Model Size=8B | 2026.01 | 77.2 |
| Llama4Guard | Model Size=12B | 2026.01 | 71.5 |