OOD Safety Category Inference (Stage 2) on Aegis 2.0
[Chart: Reward Mean by date. Top result: ProGuard-3B, 23.08 (Dec 29, 2025)]
Evaluation Results
Method            Date      Reward Mean
ProGuard-3B       2025.12   23.08
ProGuard-7B       2025.12   20.28
Gemini2.5-Flash   2025.12   15.84
GPT4o-mini        2025.12   0.61