Agent Safety Evaluation on AgentHarm Benign Requests

Safety Score: 79

GPT-4o

6.225.14462.9
Jun 1, 2026
Updated 1d ago

Evaluation Results

Method    Links
2026.06
79540
2026.06
79543
2026.06
76481
2026.06
76531
2026.06
71390
2026.06
71393
2026.06
69361
2026.06
68407
2026.06
604031
2026.06
583531
2026.06
55110
2026.06
533538
2026.06
53133
2026.06
5290
2026.06
503036
2026.06
4961
2026.06
41825
2026.06
41827
2026.06
362756
2026.06
362357
2026.06
292168
2026.06
271968
2026.06
19463
2026.06
17367
2026.06
15799
2026.06
14599
2026.06
9185