| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | OCR-VQA (test) | Accuracy 77.8 | 77 |
| OCR-based Visual Question Answering | OCR-VQA | Accuracy 65.6 | 61 |
| Image Question Answering | OCR-VQA | Accuracy 75.8 | 27 |
| Image Question Answering | OCR-VQA | ROUGE-L 70.5 | 20 |
| Visual Question Answering | OCR-VQA (val) | Accuracy 71.1 | 17 |
| Visual Question Answering | OCR-VQA | Exact Match (EM) 77.8 | 9 |
| Visual Question Answering | OCR-VQA Non-IID | Accuracy 76.39 | 5 |
| Visual Question Answering | OCR-VQA IID | Accuracy (ACC) 75.86 | 5 |
| QA over Illustrations | OCR-VQA (test) | F1 Score 74 | 5 |