Safety Evaluation on MultiJail Italian
[Chart: Safe Response Rate by date; series: Direct Instruction; data point dated Sep 11, 2025]
Updated 4d ago
Evaluation Results
Method              Model          Date     Safe Response Rate
Direct Instruction  GPT-OSS-120b   2025.09  100
AIM                 GPT-OSS-120b   2025.09  100
SteerMoE            GPT-OSS-120b   2025.09  90.2
SteerMoE + AIM      GPT-OSS-120b   2025.09  11.4