| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Document Visual Question Answering | DocVQA | ANLS 97.2 | 263 |
| Document Visual Question Answering | DocVQA (test) | ANLS 96.5 | 213 |
| Visual Question Answering | DocVQA | Accuracy 94.9 | 162 |
| Document Visual Question Answering | DocVQA (val) | Accuracy 97.85 | 157 |
| Document Visual Question Answering | DocVQA | Accuracy 97.1 | 132 |
| Document Question Answering | DocVQA (test) | Accuracy 96.4 | 78 |
| Document-Oriented Visual Question Answering | DocVQA | Accuracy 94.9 | 72 |
| Document Question Answering | DocVQA | ANLS 97.87 | 52 |
| Document Visual Question Answering | DocVQA v1.0 (test) | ANLS 96.5 | 49 |
| Visual Question Answering | DocVQA (val) | ANLS 89.2 | 47 |
| Document Understanding | DocVQA (test) | Accuracy 96.5 | 39 |
| Visual Question Answering | DocVQA | ANLS 93.78 | 38 |
| Document Visual Question Answering | DocVQA 104 (test) | Score 96.1 | 23 |
| Document Understanding | DocVQA | ANLS 96.7 | 21 |
| OCR-based Visual Question Answering | DocVQA 2021 (val) | Accuracy 93.7 | 13 |
| Document Question Answering | DocVQA | EM (Exact Match) 91.02 | 12 |
| Visual Document Retrieval | DocVQA | NDCG@5 86.5 | 12 |
| Document Understanding | DocVQA (val) | Accuracy 95.45 | 11 |
| OCR-related understanding | DocVQA | Score 95.1 | 10 |
| Similarity Assessment | DocVQA | BERTScore 51.37 | 8 |
| Context Understanding | DocVQA | Accuracy 0.994 | 8 |
| Multimodal Image Understanding | DocVQA | Score 46.89 | 7 |
| OCR and Chart Visual Question Answering | DocVQA (val) | Score 95.5 | 7 |
| Document and chart understanding | DocVQA | Pass@1 96.9 | 7 |
| Question Answering | PFL-DocVQA (test) | ROUGE-1 0.702 | 7 |