| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Adversarial Attack | Q-Bench | Attack Success Rate: 87.23 | 37 |
| Vision Question Answering | Q-Bench LLVisionQA 1.0 (dev) | Yes-or-No Score: 80.01 | 20 |
| Video Quality Understanding | Q-Bench-Video (dev) | Yes-or-No Acc: 76.78 | 14 |
| Image Quality Understanding | Q-Bench subset (dev) | Yes/No Accuracy: 85.82 | 14 |
| Low-level Vision Evaluation | Q-Bench (test) | Overall Score: 63.6 | 11 |
| Visual Difference Discernment | Q-Bench2 | Overall Score: 74.2 | 9 |
| Multi-modal Understanding | Q-Bench (test) | Overall Score: 62.9 | 8 |
| Multi-image Multi-modal Understanding | Q-Bench | Accuracy: 74.4 | 2 |