| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| AndroidWorld core20 safe general tasks | GPT-5 | Success Count (out of 20) | 11 | 4 | 6d ago |
| Total Aggregate Environments | Koopman | Success Rate | 96 | 4 | 1mo ago |
| Maze 2 | Koopman | Success Rate (SR) | 100 | 4 | 1mo ago |
| Maze 1 | Koopman | Success Rate (SR) | 100 | 4 | 1mo ago |
| Corridor 2 | Linear | SR (%) | 100 | 4 | 1mo ago |
| Corridor 1 | Linear | Success Rate (SR) | 100 | 4 | 1mo ago |