| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Question Answering | HRBench-4K | Accuracy | 0.7925 | 61 |
| Visual Grounding | HRBench8K | Accuracy | 75.63 | 51 |
| Visual Question Answering | HRBench 8K | Accuracy | 76.25 | 51 |
| Fine-grained Visual Perception | HRBench-8K | Accuracy | 71.75 | 30 |
| Fine-grained Visual Question Answering | HRBench-8K | Overall Accuracy | 69.63 | 28 |
| Fine-grained Visual Question Answering | HRBench 4K | Overall Accuracy | 71.13 | 28 |
| Fine-grained Perception | HRBench 4K | Pass@1 | 86.8 | 26 |
| High-Resolution Visual Reasoning | HRBench | Accuracy | 0.7512 | 16 |
| High-Resolution Visual Understanding | HRBench | Score | 76 | 15 |
| Visual Question Answering | HRBench 8K | FSP | 88.5 | 15 |
| Visual Question Answering | HRBench-4K | FSP Score | 92.8 | 15 |
| Visual Reasoning | HRBench-4K | Accuracy | 91.38 | 14 |
| General VQA | HRBench | Accuracy | 78.5 | 14 |
| High-Resolution Visual Perception | HRBench 4K | Score | 83.5 | 13 |
| Visual Tool-Use | HRBench 8K | Accuracy | 73.7 | 13 |
| High-Resolution Multimodal Understanding | HRBench 8K | Accuracy | 71.5 | 13 |
| Visual perception and grounding | HRBench | Accuracy | 74.2 | 12 |
| Fine-grained Perception | HRBench 8K | Pass@1 | 81.1 | 10 |
| Perception | HRbench 8K | FSP | 88.5 | 10 |
| Real-World Understanding | HRBench 4K | Score | 86.9 | 10 |
| Perceptual Robustness | HRBench 8K | Overall Score | 71.5 | 9 |
| Perceptual Robustness | HRBench-4K | Overall Score | 72.38 | 9 |
| High-resolution Image Comprehension | HRBench | HRBench 4K Score | 0.734 | 9 |
| Visual Tool-Use | HRBench 4K | Accuracy | 80.1 | 9 |
| Real-world perception-centric reasoning | HRBench 4K (test) | Accuracy | 80.25 | 8 |