| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | InfoVQA | Accuracy 89.3 | 195 |
| Information Visual Question Answering | InfoVQA (test) | ANLS 89.3 | 130 |
| Infographic Question Answering | InfoVQA | ANLS 89.2 | 117 |
| Information Visual Question Answering | InfoVQA | Accuracy 88.3 | 110 |
| Visual Question Answering | InfoVQA (val) | Accuracy 87.9 | 91 |
| Document Visual Question Answering | InfoVQA | Accuracy 0.902 | 85 |
| Infographic Visual Question Answering | InfoVQA | Accuracy 84.1 | 53 |
| Infographic Visual Question Answering | InfoVQA (test) | Accuracy 83.1 | 46 |
| Visual Question Answering | InfoVQA | ANLS Score 78.98 | 31 |
| Document Understanding | InfoVQA (test) | Accuracy 84.5 | 18 |
| Document and OCR | InfoVQA | Accuracy Score 94.22 | 17 |
| Infographic Question Answering | InfoVQA (val) | Accuracy 70.5 | 17 |
| Document Understanding, OCR & Charts | InfoVQA (test) | Score 86.8 | 16 |
| OCR VQA | InfoVQA (val) | Accuracy 81.4 | 16 |
| Document Question Answering | InfoVQA (test) | Accuracy 72.64 | 14 |
| OCR-VQA | INFOVQA | FR 75.2 | 13 |
| Visual perception and grounding | InfoVQA | Accuracy 88.3 | 13 |
| Visual Document Retrieval | InfoVQA | NDCG@5 78.6 | 13 |
| Visual Question Answering | InfoVQA (test) | Accuracy 88.9 | 13 |
| Document Visual Question Answering | InfoVQA (val) | Accuracy 86.53 | 12 |
| Document Question Answering | InfoVQA | Score 90.1 | 11 |
| Fine-grained perception | InfoVQA | Score 82.7 | 10 |
| Document Understanding | InfoVQA | Score 86.4 | 10 |
| Visual Question Answering | InfoVQA+ (val) | ANLS 45.2 | 10 |
| Multimodal Reasoning | InfoVQA | Token Count 5,511.1 | 9 |