
Malicious behavior measurement on AgentHarm Harmful

Harm Rate: 6 (GPT-5)

[Chart: Harm Rate; y-axis ticks 5, 11.75, 18.5, 25.25; Mar 3, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
2026.03    69157
2026.03    69468
2026.03    79267
2026.03    88962
2026.03    98752
2026.03    98659
2026.03    98863
2026.03    11011
2026.03    187458
2026.03    31031