Agent Planning Security and Autonomy on WASP Reddit (test)
[Chart: Attack Success Rate for PRUDENTIA, data point dated Feb 11, 2026. Selectable metrics: Attack Success Rate, HITL Load (Avg), TCR@∞, Turns (Avg).]
Updated 4d ago
Evaluation Results
| Method | Model | Date | Attack Success Rate | HITL Load (Avg) | TCR@∞ | Turns (Avg) |
|---|---|---|---|---|---|---|
| PRUDENTIA | GPT-4o | 2026.02 | 0 | 0 | 55.6 | 8.45 |
| PRUDENTIA | o1 | 2026.02 | 0 | 0 | 50 | 8.39 |
| PRUDENTIA | o3-mini | 2026.02 | 0 | 0 | 58.3 | 8.52 |
| PRUDENTIA | o4-mini | 2026.02 | 0 | 0 | 63.9 | 8.13 |
| Basic Agent | o1 | 2026.02 | 36.1 | 1.67 | 47.2 | 8.47 |
| Basic Agent | GPT-4o | 2026.02 | 47.2 | 1.56 | 36.1 | 8.62 |
| Basic Agent | o4-mini | 2026.02 | 52.8 | 1 | 36.1 | 8.38 |
| Basic Agent | o3-mini | 2026.02 | 61.1 | 1.08 | 25 | 8.44 |
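
The results above can be compared per method by averaging each metric over the four models. The following is a minimal sketch, assuming the values are transcribed exactly as listed; the names `ROWS`, `FIELDS`, and `mean_by_method` are hypothetical helpers introduced here for illustration and are not part of the benchmark or of PRUDENTIA.

```python
from collections import defaultdict
from statistics import mean

# Rows transcribed from the results listed above (assumed layout):
# (method, model, attack_success_rate, hitl_load_avg, tcr_inf, turns_avg)
ROWS = [
    ("PRUDENTIA", "GPT-4o", 0.0, 0.0, 55.6, 8.45),
    ("PRUDENTIA", "o1", 0.0, 0.0, 50.0, 8.39),
    ("PRUDENTIA", "o3-mini", 0.0, 0.0, 58.3, 8.52),
    ("PRUDENTIA", "o4-mini", 0.0, 0.0, 63.9, 8.13),
    ("Basic Agent", "o1", 36.1, 1.67, 47.2, 8.47),
    ("Basic Agent", "GPT-4o", 47.2, 1.56, 36.1, 8.62),
    ("Basic Agent", "o4-mini", 52.8, 1.0, 36.1, 8.38),
    ("Basic Agent", "o3-mini", 61.1, 1.08, 25.0, 8.44),
]

# Hypothetical field names mapped to their tuple positions.
FIELDS = {
    "attack_success_rate": 2,
    "hitl_load_avg": 3,
    "tcr_inf": 4,
    "turns_avg": 5,
}


def mean_by_method(rows, field):
    """Return {method: mean of `field` over all models} for the given rows."""
    idx = FIELDS[field]
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[idx])
    return {method: mean(values) for method, values in grouped.items()}


if __name__ == "__main__":
    # Print the per-method average of every metric.
    for field in FIELDS:
        print(field, mean_by_method(ROWS, field))
```

An unweighted mean treats each model equally, which seems reasonable here because every method is reported on the same four models.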