| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| High-resolution Visual Understanding | HR-Bench 8K | FSP | 95 | 29 |
| Visual Reasoning | HR-Bench 4K FSP | ACC | 96.5 | 29 |
| Fine-grained visual understanding | HR-Bench 4K | Score | 79 | 24 |
| Visual Search | HR-Bench 8K | Accuracy | 76.3 | 23 |
| Visual Search | HR-Bench 4K | Accuracy | 79.4 | 23 |
| High-resolution perception | HR Bench 4K | Overall Score | 87.87 | 19 |
| High-Resolution Multimodal Reasoning | HR-Bench 8K FCP | Accuracy | 77 | 19 |
| High-Resolution Multimodal Reasoning | HR-Bench 8K FSP | ACC | 94.8 | 19 |
| High-Resolution Multimodal Reasoning | HR-Bench 4K FCP | ACC | 78.3 | 19 |
| Visual Understanding | HR-Bench 8K | Avg@8 Exact Match | 86.6 | 17 |
| Visual Understanding | HR-Bench 4K | Avg@8 Exact Match | 90.2 | 17 |
| Fine-grained visual understanding | HR-Bench 8K | Score | 74.9 | 17 |
| Visual Reasoning | HR-Bench (test) | Accuracy | 69.94 | 15 |
| High-resolution Visual Understanding | HR-Bench 4K | FSP | 96.5 | 12 |
| Visual Grounded Reasoning | HR-Bench-8K | Overall Score | 76.3 | 12 |
| Visual Grounded Reasoning | HR-Bench-4K | Overall Score | 79.4 | 12 |
| Multimodal Perception | HR-Bench 4K | Accuracy | 84.4 | 11 |
| Visual Reasoning | HR-Bench 8K FCP | Accuracy | 59 | 10 |
| Visual Reasoning | HR-Bench FSP 8K | ACC | 86.5 | 10 |
| Visual Reasoning | HR-Bench 4K FCP | Accuracy | 63.3 | 10 |
| High-Resolution Visual Reasoning | HR-Bench | Score (4K) | 75 | 8 |
| Visual Reasoning | HR-Bench 8K | FSP | 87 | 7 |
| Visual Reasoning | HR-Bench 4K | FSP | 0.933 | 7 |