Unsafe content categorization on BeaverTails V
Best accuracy: 63.44 (Gemini2.5-Flash, Dec 29, 2025)
Evaluation Results
Method                   Date      Accuracy
Gemini2.5-Flash          2025.12   63.44
ProGuard-7B              2025.12   58.47
ProGuard-3B              2025.12   46.86
GPT4o-mini               2025.12   43.39
LlamaGuard4-12B          2025.12   10.08
LlamaGuard3-11B-Vision   2025.12   2.97