| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Spatial Reasoning | CV-Bench | Accuracy | 92 | 79 |
| Spatial Reasoning | CV-Bench-3D | Accuracy | 96.3 | 37 |
| Spatial Understanding | CV-Bench | Accuracy | 88.82 | 29 |
| Computer Vision Reasoning | CV-Bench | Accuracy | 85.7 | 26 |
| Visual Reasoning | CV-Bench | Accuracy | 81.25 | 24 |
| Spatial Reasoning | CV-Bench | Average Spatial Score | 79 | 22 |
| Spatial Reasoning | CV-Bench 2D | Accuracy | 94 | 22 |
| Computer Vision Perception | CV-Bench | Score | 89 | 22 |
| Computer Vision Evaluation | CV-Bench | Average Score | 85.8 | 22 |
| Vision-centric Evaluation | CV-Bench | Accuracy | 0.864 | 21 |
| Spatial Reasoning | CV-Bench | Overall Score | 86.5 | 18 |
| Vision-Language Evaluation | CV-Bench | Accuracy | 90.1 | 17 |
| Vision-Centric / 3D | CV-Bench 2D | Accuracy | 82.5 | 16 |
| Vision-Centric Evaluation | CV-Bench 2D | Score | 63.8 | 15 |
| Spatial Understanding | CV-Bench 2D Overall | Accuracy | 75.4 | 15 |
| Single-image spatial reasoning | CV-Bench | 2D Accuracy | 80.7 | 15 |
| Visual Understanding | CV-Bench | Accuracy | 86.96 | 15 |
| Spatial Perception | CV-Bench 3D | Accuracy | 92.2 | 14 |
| Spatial VQA | CV-Bench-2D Relation (Level 2) | Accuracy | 96.9 | 14 |
| Spatial Reasoning | CV-Bench (test) | 2D Score | 83.6 | 14 |
| Spatial Reasoning | CV-Bench 1k random samples | Count Score | 69.2 | 13 |
| Multimodal Perception | CV-Bench | Accuracy | 89.57 | 13 |
| Spatial Perception | CV-Bench Average | Accuracy | 85.5 | 12 |
| Spatial Perception | CV-Bench 2D | Accuracy (%) | 79.7 | 12 |
| Vision-centric Reasoning | CV Bench | Accuracy | 83.8 | 12 |