
Over-Refusal Evaluation on Benign Prompt Dataset

Over-Refusal Rate: 17

Base

[Chart: Over-Refusal Rate; axis ticks 13.68, 36.09, 58.5, 80.91; Apr 17, 2026]
Updated 1mo ago

Evaluation Results

Method    Date       Over-Refusal Rate    Links
          2026.04    17
          2026.04    26.3
          2026.04    66.3
          2026.04    100