| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Yes/No Question Answering | Path-VQA | Accuracy 65.85 | 18 |
| Visual Question Answering | Path-VQA (test) | Commonsense Score 2.69 | 16 |
| Medical Visual Question Answering | Path-VQA (test) | BLEU-1 63.61 | 12 |
| Visual Question Answering | Path-VQA YorN | Accuracy 80.34 | 9 |
| Medical Visual Question Answering | PATH_VQA | Pass@1 59.15 | 4 |
| Biomedical Visual Question Answering | Path-VQA (test) | Closed Accuracy 91.7 | 4 |
| Multimodal Understanding | Path-VQA | Token F1 64.7 | 3 |
| Open/Close-ended VQA | Path-VQA | Token F1 64.7 | 2 |