| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | SimpleVQA | Accuracy 0.737 | 99 |
| Multimodal Search | SimpleVQA | Accuracy 64.1 | 15 |
| Visual Question Answering | SimpleVQA-EN | Accuracy 50.6 | 14 |
| General visual question answering | SimpleVQA | Pass@1 63.4 | 7 |
| General VQA | SimpleVQA | Accuracy 74.06 | 5 |