Prompt Classification on ToxicChat Text Prompt
[Chart: best F1 Score over time. Top result: GPT4o-mini, F1 Score 96.27, at Dec 29, 2025.]
Evaluation Results
Method                    Date      F1 Score
GPT4o-mini                2025.12   96.27
ProGuard-7B               2025.12   96.07
GuardReasoner-8B          2025.12   95.8
Gemini2.5-Flash           2025.12   95.43
GuardReasonerVL-7B        2025.12   95.32
ProGuard-3B               2025.12   95.03
LlamaGuard-7B             2025.12   94.53
ShieldGemma-9B            2025.12   93.97
GPT-OSS-SafeGuard-20B     2025.12   93.72
LlamaGuard2-8B            2025.12   92.33
LlamaGuard3-11B-Vision    2025.12   92.05
LlamaGuard3-8B            2025.12   91.56
LlamaGuard4-12B           2025.12   91.15
WildGuard-7B              2025.12   88.99