| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| VQA 1.0 (test-dev) | Ensemble of 7 Att. models | Overall Accuracy | 66.7 | 100 | 1mo ago |
| VQA 1.0 (test-standard) | MUTAN | Overall Accuracy | 67.36 | 50 | 1mo ago |
| VQA (test-standard) | Human | Accuracy (Overall) | 83.3 | 32 | 1mo ago |
| LLS48-VQA | MIS-DINOv2 | BLEU-1 | 0.5245 | 26 | 1mo ago |
| PMC-VQA (test) | MedVInT-TE | Accuracy | 36.8 | 23 | 1mo ago |
| PMC-VQA (test-initial) | MedVInT-TE | BLEU-1 | 35.4 | 19 | 1mo ago |
| Delta-SECOND | Delta-LLaVA | H-CQS METEOR | 32.74 | 13 | 2d ago |
| EarthVLSet 1.0 (test) | EarthVLNet | BLEU-1 | 0.5726 | 12 | 1mo ago |
| CXR | CheXagent | BERTScore | 0.86 | 8 | 1mo ago |
| LLaVA Bench v1 (test) | DRESS | Relevance | 37.18 | 7 | 1mo ago |
| LLaVA Eval v1 (test) | DRESS | Conversation Score | 77.67 | 7 | 1mo ago |
| ODIR | EyExIn | F1 Score | 56.7 | 6 | 1mo ago |
| Retina | EyExIn | F1 Score | 67.8 | 6 | 1mo ago |
| JSIEC | EyExIn | F1 Score | 63.1 | 6 | 1mo ago |
| TM4K | EyExIn | F1 Score | 72.91 | 6 | 1mo ago |
| VizWiz (val) | Llama-2 Chat 7B | Accuracy | 56.39 | 6 | 1mo ago |
| WSI-Bench | HistoSelect | WSI-P (Morphology) | 53.8 | 5 | 1mo ago |