Prompt classification on WildG
Best F1 Score: 88.5, Qwen3Guard-Gen (Jan 22, 2026)
Evaluation Results

Method             Configuration              Links    F1 Score
Qwen3Guard-Gen     Model Size=8B, Evaluat...  2026.01  88.5
Qwen3Guard-Gen     Model Size=4B, Evaluat...  2026.01  87.8
PolyGuard          Model Size=7B              2026.01  87.6
WildGuard          Model Size=7B              2026.01  87.4
YuFeng-XGuard      Model Size=0.6B            2026.01  87.2
Qwen3Guard-Gen     Model Size=0.6B, Evalu...  2026.01  86.9
YuFeng-XGuard      Model Size=8B              2026.01  86.6
GPT-OSS-SafeGuard  Model Size=20B             2026.01  85.9
Qwen3Guard-Gen     Model Size=8B, Evaluat...  2026.01  85.1
Qwen3Guard-Gen     Model Size=4B, Evaluat...  2026.01  84.5
Qwen3Guard-Gen     Model Size=0.6B, Evalu...  2026.01  84.4
NemotronReasoning  Model Size=4B              2026.01  82.6
NemotronGuardV2    Model Size=8B              2026.01  80.7
Llama3Guard        Model Size=8B              2026.01  73.8
Llama4Guard        Model Size=12B             2026.01  71.5
ShieldGemma        Model Size=9B              2026.01  52.9