| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Visual Question Answering | InfoVQA | Accuracy 89.3 | 135 | |
| Information Visual Question Answering | InfoVQA (test) | ANLS 89.3 | 130 | |
| Visual Question Answering | InfoVQA (val) | Accuracy 87.9 | 91 | |
| Infographic Question Answering | InfoVQA | ANLS 89.2 | 90 | |
| Information Visual Question Answering | InfoVQA | Accuracy 84 | 52 | |
| Infographic Visual Question Answering | InfoVQA | Accuracy 84.1 | 40 | |
| Document Visual Question Answering | InfoVQA | ANLS 88 | 32 | |
| Infographic Visual Question Answering | InfoVQA (test) | Accuracy 81 | 31 | |
| Visual Question Answering | InfoVQA | ANLS Score 78.98 | 31 | |
| Document Understanding | InfoVQA (test) | Accuracy 84.5 | 18 | |
| Infographic Question Answering | InfoVQA (val) | Accuracy 70.5 | 17 | |
| Visual Document Retrieval | InfoVQA | NDCG@5 78.6 | 13 | |
| Visual Question Answering | InfoVQA (test) | Accuracy 88.9 | 13 | |
| OCR VQA | InfoVQA (val) | Accuracy 81.4 | 12 | |
| Document Question Answering | InfoVQA | Score 90.1 | 11 | |
| Document Understanding | InfoVQA | Score 86.4 | 10 | |
| Visual Question Answering | InfoVQA+ (val) | ANLS 45.2 | 10 | |
| Multimodal Image Understanding | InfoVQA | Score 32.46 | 7 | |
| Image Captioning | InfoVQA | Prism 59.4 | 7 | |
| End-to-end Question Answering | InfoVQA (test) | EM 60.91 | 7 | |
| Document Question Answering | InfoVQA (val) | Score 76.12 | 7 | |
| OCR Visual Question Answering | InfoVQA 2022 (val) | Score 67.2 | 7 | |
| OCR-related Understanding Tasks | InfoVQA (test) | Accuracy 87.3 | 7 | |
| Text-oriented Visual Question Answering | InfoVQA | ANLS 43.5 | 7 | |
| Document Understanding | InfoVQA (val) | Accuracy 23.6 | 6 | |