Benign tool-calling reliability on AgentHarm Benign

Refusal Rate: 0 (GPT-4o, Mar 3, 2026)
Updated 1mo ago

Evaluation Results

Method    Links
Date       Refusal Rate   (unlabeled)
2026.03    0              0.8
2026.03    0              0.68
2026.03    13             0.51
2026.03    13             0.66
2026.03    15             0.61
2026.03    19             0.75
2026.03    19             0.75
2026.03    23             0.7
2026.03    24             0.73
2026.03    43             0.77