| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Medical Visual Question Answering | VQA-RAD | Accuracy 80.4 | 228 |
| Medical Visual Question Answering | VQA-RAD (Closed) | ECE 1.3 | 96 |
| Visual Question Answering | VQA-RAD (Open) | AUROC 0.819 | 96 |
| Visual Question Answering | VQA-RAD Closed | AUROC 70.2 | 96 |
| Visual Question Answering | VQA-RAD | Closed Accuracy 86.8 | 64 |
| Hallucination Detection | VQA-RAD (All) | AUC 78.23 | 57 |
| Hallucination Detection | VQA-RAD Open-Ended | AUC 83.13 | 57 |
| Medical Visual Question Answering | VQA-RAD (test) | Closed Accuracy 87.9 | 50 |
| Visual Question Answering | VQA-RAD (test) | Overall Accuracy 90.4 | 48 |
| Medical Visual Question Answering | VQA-RAD closed-end | Accuracy 84.86 | 45 |
| Multimodal Medical Reasoning | VQA-RAD | Accuracy (%) 80.45 | 36 |
| Medical Visual Question Answering | VQA-RAD Open | Accuracy 61.5 | 26 |
| Visual Question Answering | VQA-RAD open-ended | Exact Match (EM) 29 | 25 |
| Visual Question Answering | VQA-RAD Open | Token Recall 73.7 | 16 |
| Visual Question Answering (Closed-ended) | VQA-RAD closed-ended | Accuracy 82.5 | 12 |
| Multi-modal Question Answering | VQA-RAD | Accuracy 87.1 | 12 |
| Visual Question Answering | VQA-RAD Closed | Exact Match 88 | 11 |
| Medical Visual Question Answering | VQA-RAD cross-domain | Accuracy 0.789 | 10 |
| Medical Visual Question Answering | VQA-RAD (in-domain) | Accuracy 83.3 | 10 |
| Question Selection | VQA-RAD (test) | Risk 60.6 | 7 |
| Visual Question Answering | VQA-RAD Closed | Accuracy 88.2 | 7 |
| Medical Visual Question Answering | VQA-RAD | BLEU-1 0.695 | 7 |
| Medical Visual Question Answering | VQA-Rad 2018 | Accuracy 87.05 | 7 |
| Medical Visual Question Answering | VQA-RAD | L-VASE 94.4 | 6 |
| Reasoning | VQA-RAD | Correctness 47.34 | 6 |