GUI Agent Task on WebArena Reddit
[Chart: Success Rate. Top result: ActionEngine, 0.95 (Feb 24, 2026). Available metrics: Success Rate, Avg Latency (s), Avg Cost per Task ($), Avg #InTokens, Avg #OutTokens, Avg #LLM Calls, Improvement (×)]
Updated 4d ago
Evaluation Results
| Method | Success Rate | Avg Latency (s) | Avg Cost per Task ($) | Avg #InTokens | Avg #OutTokens | Avg #LLM Calls | Improvement (×) |
|---|---|---|---|---|---|---|---|
| ActionEngine (LLM Backbone=Claude 4.5; 2026.02) | 0.95 | 118 | 0.06 | 8,100 | 2,300 | 1.8 | - |
| AgentOccam (LLM Backbone=GPT-4-Turbo; 2026.02) | 0.66 | 237 | 0.71 | 62,300 | 3,000 | 10.2 | - |
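The two result rows can be compared metric by metric with simple ratios. The sketch below is illustrative only: the dictionary keys and the `ratios` helper are hypothetical names, and the page does not define how its "Improvement (×)" column is computed, so these ratios should not be read as that column. The two methods also use different LLM backbones, so the ratios describe the reported rows rather than a controlled comparison.

```python
# Values copied from the Evaluation Results rows above.
# Key names are hypothetical labels chosen for this sketch.
results = {
    "ActionEngine": {
        "success_rate": 0.95, "latency_s": 118, "cost_usd": 0.06,
        "in_tokens": 8100, "out_tokens": 2300, "llm_calls": 1.8,
    },
    "AgentOccam": {
        "success_rate": 0.66, "latency_s": 237, "cost_usd": 0.71,
        "in_tokens": 62300, "out_tokens": 3000, "llm_calls": 10.2,
    },
}


def ratios(baseline: dict, candidate: dict) -> dict:
    """Return baseline / candidate for every metric, rounded to 2 places.

    Hypothetical helper: a value above 1 means the baseline's number is
    larger than the candidate's for that metric.
    """
    return {key: round(baseline[key] / candidate[key], 2) for key in baseline}


r = ratios(results["AgentOccam"], results["ActionEngine"])
for metric, value in r.items():
    print(f"{metric}: {value}x")
```

For the cost-type metrics (latency, cost, tokens, LLM calls) a ratio above 1 means AgentOccam's reported figure is higher than ActionEngine's. For success rate the reading is reversed: a ratio below 1 means AgentOccam's reported rate is lower.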