Over-refusal Evaluation on XSTest (test)
[Chart: Over-refusal Rate. Highlighted entry: Claude Sonnet 4.5, 0.035 (Feb 28, 2025).]
Evaluation Results
Method             Safety Steering  Date     Over-refusal Rate
Claude Sonnet 4.5  Disabled         2025.02  0.035
GPT-5              Disabled         2025.02  0.052
Claude Sonnet 4.5  Enabled          2025.02  0.074
GPT-5              Enabled          2025.02  0.122