Prompt classification on SimpST
[Chart: F1 Score by date. Top result: PolyGuard, 100 (Jan 22, 2026).]
Evaluation Results
| Method | Configuration | Date | F1 Score |
| --- | --- | --- | --- |
| PolyGuard | Model Size=7B | 2026.01 | 100 |
| YuFeng-XGuard | Model Size=8B | 2026.01 | 100 |
| NemotronReasoning | Model Size=4B | 2026.01 | 99.5 |
| Qwen3Guard-Gen | Model Size=4B, Evaluat... | 2026.01 | 99.5 |
| YuFeng-XGuard | Model Size=0.6B | 2026.01 | 99.5 |
| Llama3Guard | Model Size=8B | 2026.01 | 99.5 |
| WildGuard | Model Size=7B | 2026.01 | 99.5 |
| GPT-OSS-SafeGuard | Model Size=20B | 2026.01 | 99.5 |
| Qwen3Guard-Gen | Model Size=8B, Evaluat... | 2026.01 | 99.5 |
| Qwen3Guard-Gen | Model Size=0.6B, Evalu... | 2026.01 | 98.5 |
| Llama4Guard | Model Size=12B | 2026.01 | 98.5 |
| NemotronGuardV2 | Model Size=8B | 2026.01 | 98.5 |
| Qwen3Guard-Gen | Model Size=4B, Evaluat... | 2026.01 | 97.4 |
| Qwen3Guard-Gen | Model Size=8B, Evaluat... | 2026.01 | 97.4 |
| Qwen3Guard-Gen | Model Size=0.6B, Evalu... | 2026.01 | 95.8 |
| ShieldGemma | Model Size=9B | 2026.01 | 95.7 |