| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Windows UI Navigation | WindowsAgentArena (WAA) | Success Rate: 74.5 | 14 |
| GUI Automation | WindowsAgentArena | Success Rate (Office): 54.76 | 11 |
| GUI Agent Execution | WindowsAgentArena full-task | Full Task Success Rate: 0.3076 | 9 |
| Operating System Agent Control | WindowsAgentArena | Success Rate: 0.635 | 8 |
| Windows Computer Use Automation | WindowsAgentArena V2 | LibreOffice Success Rate: 7.1 | 7 |
| Agent Performance | WindowsAgentArena (test) | Office Score: 30.4 | 6 |
| Native Windows operating system task execution | WindowsAgentArena (WAA) | AUV: 9.5 | 4 |
| Windows Agent Task Completion | WindowsAgentArena (original) | Office Success Rate: 2.3 | 2 |