| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Web-based Question Answering | WebWalkerQA | Success Rate: 81.18 | 18 |
| Web Browsing and Navigation | WebWalkerQA | Average Accuracy: 71.7 | 18 |
| Deep Research | WebWalkerQA original (test) | Pass@1: 72.2 | 14 |
| Web-based Agent QA | WebWalkerQA | Pass@1: 73.53 | 13 |
| Web-based Agent Reasoning | WebWalkerQA Hard | Pass@3: 0.6333 | 8 |
| Web-based Agent Reasoning | WebWalkerQA Medium | Pass@3: 72.86 | 8 |
| Web-based Agent Reasoning | WebWalkerQA Easy | Pass@3: 72.5 | 8 |