| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| ScreenSpot Pro | MAI-UI-32B | Average Score | 7,350 | 458 | 2d ago |
| ScreenSpot v2 | MAI-UI-32B | Avg Accuracy | 96.5 | 371 | 2d ago |
| ScreenSpot Pro | | Accuracy | 72.7 | 195 | 14d ago |
| ScreenSpot | GUI-G2 | Avg Acc | 92 | 160 | 2d ago |
| OSWorld-G | Qwen3VL-8B-Instruct | Average Score | 72.7 | 144 | 7d ago |
| UI-Vision | GroundNext-3B (RL) | Average Score | 62.1 | 68 | 7d ago |
| MMBench-GUI L2 (test) | GPT-4o | Average Error | 2.9 | 67 | 2mo ago |
| ScreenSpot Web V2 | InfiGUI-G1-7B | Text Accuracy | 97.9 | 60 | 1mo ago |
| ScreenSpot Desktop V2 | POINTS-GUI-G-8B | Text Accuracy | 100 | 60 | 1mo ago |
| ScreenSpot Mobile V2 | UI-AGILE-7B | Text Accuracy | 100 | 60 | 1mo ago |
| UI-Vision (test) | MAI-UI-32B | Basic Score | 59.1 | 59 | 6d ago |
| OSWorld-G (test) | GTA1-32B | Element Accuracy | 78.4 | 52 | 3mo ago |
| MMBench-GUI-L2 | GUI-SD | Accuracy | 86.7 | 43 | 1mo ago |
| ScreenSpot-Pro (test) | Holo1.5-7B + SafeGround | Element Accuracy | 58.66 | 43 | 3mo ago |
| ScreenSpot (test) | Trifuse+GUI-AIMA | Element Accuracy | 90.6 | 42 | 2mo ago |
| Screenspot v1 | Qwen3-VL-32B | Average Accuracy | 91.67 | 39 | 2mo ago |
| ScreenSpot v1 (test) | GUI-G1 | Mobile Text Acc | 98.6 | 25 | 3mo ago |
| ScreenSpot V1 (Overall) | UGround-V1-7B | Average Accuracy | 89.9 | 22 | 15d ago |
| CUActSpot | | GUI Accuracy | 73.7 | 21 | 21d ago |
| Screenspot Web | SE-GA | Text Accuracy | 91 | 18 | 15d ago |
| Screenspot Desktop | SE-GA | Text Acc | 95.9 | 18 | 15d ago |
| Screenspot Mobile | SE-GA | Text Accuracy | 96.3 | 18 | 15d ago |
| OSWorld G-Refine v1.0 (test) | MAI-UI-32B | Overall Success Rate | 75 | 17 | 3mo ago |
| ShowUI | IAG | ASR | 34.7 | 15 | 2mo ago |
| Aggregated GUI Benchmarks | AQuaUI | Compression | 29.72 | 14 | 14d ago |