| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| DAQUAR REDUCED (test) | Human | Accuracy 60.3 | 33 | 1mo ago | |
| OCR-VQA | Lyrics | Accuracy 75.8 | 27 | 1mo ago | |
| Molmo QA Benchmarks Image 19 | | Image Average Accuracy 86.2 | 20 | 18d ago | |
| OCR-VQA | | ROUGE-L 70.5 | 20 | 23d ago | |
| MME Perception | Omni-Diffusion | MME-P Score 1,216.7 | 4 | 1mo ago | |
| ST-VQA public server (test) | GIT2 | Accuracy 75.8 | 3 | 1mo ago | |
| VizWiz public server | GIT2 | Accuracy 70.1 | 3 | 1mo ago | |
| Visual7W | HyperTokens | Accuracy 45.59 | 2 | 1mo ago | |
| ST-VQA public server | - | Accuracy - | 0 | 1mo ago | |