| Personalization Benchmark | ICPT | Single Score 83.5 | | 23 | 2d ago |
| Flickr-30K | ReCoVERR | AURC (CIDEr-N) 0.253 | | 15 | 3mo ago |
| MS-COCO | SeeTRUE | AURC (CIDEr-N) 0.158 | | 15 | 3mo ago |
| PaveInstruct | LLaVA-1.5-7B | BLEU-4 10.08 | | 13 | 1mo ago |
| MRI-VQA image-level (test) | Qwen3-VL-8B-SGMRIVQA | AR-Score 25.05 | | 11 | 1mo ago |
| GovLA-10K | GovLA-Reasoner | BLEU-1 53.32 | | 10 | 3mo ago |
| SGMRI-VQA Volume-level (test) | Qwen3-VL-8B-SGMRIVQA | AR-Score 26.54 | | 9 | 1mo ago |
| SlideBench Caption | MLLM-HWSI | BLEU-1 46.2 | | 9 | 2mo ago |
| W3D Traffic Accident original (test) | LLada* | BLEU 38 | | 9 | 2mo ago |
| W3D Safety-Critical Situation original (test) | LLada* | BLEU 44 | | 9 | 2mo ago |
| W3D Normal Driving original (test) | LLada* | BLEU 44 | | 9 | 2mo ago |
| VALOR 32K | MiCo | CIDEr 62.8 | | 9 | 3mo ago |
| Private dataset video + IMU | DoRA-adapted QwenVL-2.5 | ROUGE-L 0.44 | | 8 | 12d ago |
| RGBE-Chat | RE-VLM | CI 4.03 | | 8 | 14d ago |
| PEOD-Chat | RE-VLM | CI Score 3.68 | | 8 | 14d ago |
| Flickr-30K | SeeTRUE | Cider-N 0.251 | | 8 | 3mo ago |
| MS-COCO | SeeTRUE | Cider-N 0.158 | | 8 | 3mo ago |
| SMolInstruct | Intern-S1-mini | METEOR 44.6 | | 7 | 12d ago |
| RoboFine-Bench Hard setting | RoboFine-VLM | Overall Score 83.6 | | 6 | 7d ago |
| RoboFine-Bench Easy setting | RoboFine-VLM | Overall Score 85.2 | | 6 | 7d ago |
| VRSBench (val) | LLaVA-1.5 | BLEU-4 14.7 | | 5 | 3mo ago |
| RSIEval | RSGPT | B-4 22.07 | | 5 | 3mo ago |
| Open Flamingo | CroPA+D-UAP | Targeted ASR 90.6 | | 4 | 1mo ago |
| MyVLM Single Concept (test) | Ego | Recall 91.3 | | 4 | 2mo ago |
| MSR-VTT 3 modal | Ours | BLEU@1 26.8 | | 4 | 3mo ago |