| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Perception | BLINK | Accuracy 75.9 | 241 |
| Visual Reasoning | BLINK | Accuracy 85.2 | 107 |
| Visual Perception and Reasoning | BLINK | Accuracy 70.02 | 64 |
| Visual Reasoning | BLINK | Jigsaw Accuracy 99 | 49 |
| Spatial Reasoning | BLINK | Spa. Score 93 | 47 |
| Visual Perception | BLINK (val) | Validation Score 95.67 | 44 |
| Multi-image visual reasoning | BLINK | Accuracy 69.1 | 37 |
| Adversarial Attack | BLINK | Attack Success Rate (ASR) 87.65 | 37 |
| Visual Question Answering | BLINK (val) | Accuracy 73.7 | 29 |
| Spatial Reasoning | BLINK | Score 69.1 | 29 |
| Visual Grounding | BLINK | Accuracy 64.49 | 27 |
| Visual Question Answering | BLINK | Accuracy 66.1 | 27 |
| Utility Prediction Routing | Blink | OpenAI Score 98.53 | 26 |
| Multi-image visual perception | BLINK | Accuracy 62.8 | 26 |
| Multi-image Understanding | BLINK (val) | Score 68 | 23 |
| Interleaved Image Multimodal Understanding | BLINK | Score 66.3 | 22 |
| Multimodal Perception | BLINK | Accuracy 64.95 | 21 |
| Visual Understanding | BLINK | Accuracy 69.86 | 21 |
| Multi-image reasoning | BLINK (val) | Accuracy 52.6 | 21 |
| Multimodal Reasoning | BLINK | Accuracy 63.3 | 20 |
| Low-level Visual Reasoning | BLINK | Accuracy 72.3 | 19 |
| Visual Perception | Blink 41 (val) | Score 87.4 | 19 |
| Relative Depth Estimation | BLINK RelativeDepth (test) | Accuracy 87.9 | 18 |
| 3D Spatial Reasoning | BLINK | Accuracy 60 | 16 |
| Multimodal Multi-choice | BLINK | Accuracy 60 | 15 |