
WindowsAgentArena

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Windows UI Navigation | WindowsAgentArena (WAA) | Success Rate: 74.5 | 33
Desktop automation | WindowsAgentArena (WAA) v1 (test) | Overall Score: 61 | 13
GUI Automation | WindowsAgentArena | Success Rate (Office): 54.76 | 11
Operating System Agent Control | WindowsAgentArena | Success Rate: 21.7 | 11
GUI Agent Execution | WindowsAgentArena full-task | Full Task Success Rate: 0.3076 | 9
Windows Computer Use Automation | WindowsAgentArena V2 | LibreOffice Success Rate: 7.1 | 7
Agent Performance | WindowsAgentArena (test) | Office Score: 30.4 | 6
Computer Use | WindowsAgentArena | Accuracy: 33.8 | 4
Native Windows operating system task execution | WindowsAgentArena (WAA) | AUV: 9.5 | 4
Windows Agent Task Completion | WindowsAgentArena (original) | Office Success Rate: 2.3 | 2