| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| V* benchmark | RAP | Overall Success Rate | 91.1 | 54 | 6d ago |
| V* Bench | Deepeyes-7B | Accuracy | 90.4 | 41 | 16d ago |
| HR-Bench 8K | InterSketch | Accuracy | 77.8 | 29 | 7d ago |
| HR-Bench 4K | Qwen2.5-VL-72B | Accuracy | 79.4 | 29 | 7d ago |
| V* | DeepEyes | Accuracy | 90.1 | 28 | 7d ago |
| VisualProbe (test) | SeProD | Success Rate (Easy) | 71.3 | 22 | 6d ago |
| VStarBench | Vero Q3I-8B | Score | 89.5 | 11 | 1mo ago |
| V* bench (test) | IVM-Enhanced GPT4-V | Attribute Rate | 87 | 10 | 3mo ago |
| 1k image (test) | Taxonomy-decoupled | Rel P@k | 94.4 | 9 | 3mo ago |
| COCO-Search18 cross-task | | Accuracy (%) | 27.5 | 7 | 2mo ago |
| Visual Shopping (Offline) | ViT-B/16 384x | P@1 | 54.7 | 6 | 3mo ago |
| V-star | Penguin-VL | Accuracy | 83.8 | 5 | 2mo ago |
| HLE-VL | | Pass@1 | 36 | 4 | 2mo ago |
| MM-BrowseComp | Seed1.8 | Pass@1 | 46.3 | 4 | 2mo ago |
| V*Bench | SEAL | Success Rate | 75.3 | 2 | 3mo ago |