
OrBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Overrefusal Evaluation | OrBench-H | RR: 99.85 | 21 |
| Safety Boundary Over-Refusal | ORBench | ORBench Over-Refusal Rate: 7.2 | 20 |
| Safety and Informativeness Evaluation | OrBench Hard | Deception Rate (Safe): 75 | 4 |
| Safety Evaluation | OrBench Toxic | Safety Score: 81.7 | 4 |