Agent Planning Security and Autonomy on WASP Reddit (test)

Attack Success Rate: 0

PRUDENTIA

Updated 5mo ago

Evaluation Results

Method	Links
PRUDENTIA 2026.02		0	0	55.6	8.45
PRUDENTIA 2026.02		0	0	50	8.39
PRUDENTIA 2026.02		0	0	58.3	8.52
PRUDENTIA 2026.02		0	0	63.9	8.13
Basic Agent 2026.02		36.1	1.67	47.2	8.47
Basic Agent 2026.02		47.2	1.56	36.1	8.62
Basic Agent 2026.02		52.8	1	36.1	8.38
Basic Agent 2026.02		61.1	1.08	25	8.44