Out-of-Taxonomy Risk Detection on BeaverTails V
Best F1 Score: 0.6314 (ProGuard-7B, Dec 29, 2025)
Evaluation Results
Method            Date      F1 Score
ProGuard-7B       2025.12   0.6314
GPT4o-mini        2025.12   0.5107
ProGuard-3B       2025.12   0.4586
Gemini2.5-Flash   2025.12   0.4316