
Agent Perturbation Reliability Testing on AgentHarm

Accuracy: 90.6

GPT-4o

[Chart: y-axis ticks 80.928, 83.439, 85.95, 88.461; date shown: Mar 5, 2026]
Updated 1mo ago

Evaluation Results

Method    Links

2026.03
90.6    9.4     6.3     12.5
90.6    9.4     6.3     12.5

2026.03
90.6    9.4     6.3     12.5
81.3    18.8    6.3     31.3
81.3    18.8    25      12.5