Safety and Utility Evaluation on OR-Bench
[Chart: HarmR by method. Highlighted point: SafeSearch w/o hf., HarmR 5.3 (Oct 19, 2025). Selectable metrics: HarmR, Help@S, Refusal Rate.]
Updated 2mo ago
Evaluation Results
Method                                  Date     HarmR  Help@S  Refusal Rate
SafeSearch w/o hf. (Ablation=w/o hf.)   2025.10  5.3    2.96    27.3
SafeSearch                              2025.10  6.6    3.49    12.1
Utility-Only Ft.                        2025.10  9.7    3.62    1
Base                                    2025.10  10.7   3.49    5.4