Web Task Automation on WebArena
[Chart: Accuracy over time. Best result: 22.2 (AgentLab), Oct 28, 2025.]
Updated 1mo ago
Evaluation Results
Method    Details                    Date     Accuracy
AgentLab  Model=Qwen-2.5-14B-Cod...  2025.10  22.2
AgentLab  Model=Qwen-2.5-14B-Cod...  2025.10  5.5