| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Understanding | Image benchmarks Aggregate | Overall Score: 64.82 | 21 |
| Zero-shot Image Understanding | Dynamic-resolution Image Benchmarks (GQA, POPE, ScienceQA, MME, MMBench) (test) | GQA Score: 60.5 | 13 |
| Multimodal Understanding and Reasoning | Image Benchmarks HallBench, MME, TextVQA, ChartQA, AI2D, RealWorldQA, CCBench, OCRVQA, SQA-IMG, POPE | HallBench Score: 46.5 | 13 |