| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Web navigation | MiniWob++ | Accuracy | 53.26 | 15 |
| Agent | MiniWob++ (held-in) | Performance (%) | 87.12 | 14 |
| Web automation | MiniWob 45 tasks subset (test) | Mean Success Rate | 86.1 | 6 |
| Web-based task completion | MiniWoB++ With feedback 9 tasks | Success Rate | 91.11 | 5 |
| Web automation | MiniWob 35 tasks subset (test) | Mean Success Rate | 67 | 4 |
| enter-text navigation | MiniWoB (test) | Success Rate | 100 | 3 |