| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Grounding | MC-LLaVA | VG Score | 0.748 | 11 |
| Recognition | MC-LLaVA | Single Score | 93.2 | 11 |
| Visual Question Answering | MC-LLaVA | VQA BLEU (Single) | 72.8 | 11 |
| Visual Multiple Choice Question Answering | MC-LLaVA | Choice-V Accuracy (Single) | 90.9 | 11 |
| Text Multiple Choice Question Answering | MC-LLaVA | Choice-T Accuracy (Single) | 72.9 | 10 |
| Visual Question Answering | MC-LLaVA (test) | METEOR | 0.482 | 3 |