
Agent Action Safety Verification on an internal 300-scenario benchmark

Verdict Accuracy: 95

AgentTrust v0.5

May 6, 2026
Updated 27d ago

Evaluation Results

Method    Links
2026.05
952.35.41.72
2026.05
8890.81,461
2026.05
82.37.52.31,345
2026.05
49.3088.40.05
2026.05
44.796.203,558