| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| ScreenSpot Pro | MAI-UI-32B | Average Score | 7,350 | 307 | 3d ago |
| ScreenSpot v2 | MAI-UI-32B | Avg Accuracy | 96.5 | 283 | 3d ago |
| ScreenSpot Pro | | Accuracy | 72.7 | 163 | 3d ago |
| ScreenSpot | GAIR | Avg Acc | 91 | 133 | 11d ago |
| OSWorld-G | Qwen3VL-8B-Instruct | Average Score | 72.7 | 107 | 18d ago |
| MMBench-GUI L2 (test) | GPT-4o | Average Error | 2.9 | 67 | 20d ago |
| UI-Vision | GroundNext-3B (RL) | Average Score | 62.1 | 59 | 20d ago |
| ScreenSpot Web V2 | InfiGUI-G1-7B | Text Accuracy | 97.9 | 55 | 3d ago |
| ScreenSpot Desktop V2 | POINTS-GUI-G-8B | Text Accuracy | 100 | 55 | 3d ago |
| ScreenSpot Mobile V2 | UI-AGILE-7B | Text Accuracy | 100 | 55 | 3d ago |
| OSWorld-G (test) | GTA1-32B | Element Accuracy | 78.4 | 52 | 1mo ago |
| UI-Vision (test) | MAI-UI-32B | Basic Score | 59.1 | 43 | 1mo ago |
| ScreenSpot-Pro (test) | Holo1.5-7B + SafeGround | Element Accuracy | 58.66 | 43 | 1mo ago |
| ScreenSpot (test) | Trifuse+GUI-AIMA | Element Accuracy | 90.6 | 42 | 19d ago |
| Screenspot v1 | Qwen3-VL-32B | Average Accuracy | 91.67 | 39 | 20d ago |
| ScreenSpot v1 (test) | GUI-G1 | Mobile Text Acc | 98.6 | 25 | 1mo ago |
| MMBench-GUI-L2 | Step-GUI-8B | Accuracy | 85.6 | 22 | 19d ago |
| OSWorld G-Refine v1.0 (test) | MAI-UI-32B | Overall Success Rate | 75 | 17 | 1mo ago |
| ShowUI | IAG | ASR | 34.7 | 15 | 1mo ago |
| MMBench-GUI L2 (in-domain) | GRPO w. Max. | Accuracy | 58 | 13 | 19d ago |
| ScreenSpot v2 (test) | Trifuse+GUI-Actor | Element Accuracy | 93.2 | 9 | 1mo ago |
| Screenspot Web | ShowUI-G | Text Accuracy | 83 | 8 | 1mo ago |
| Screenspot Desktop | OmniParser | Text Acc | 91.3 | 8 | 1mo ago |
| Screenspot Mobile | OmniParser | Text Accuracy | 93.9 | 8 | 1mo ago |
| TPanel-UI | CoG | Touch Interaction Accuracy | 90.9 | 7 | 1mo ago |