| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Flickr-30K | ReCoVERR | AURC (CIDEr-N) | 0.253 | 15 | 1mo ago |
| MS-COCO | SeeTRUE | AURC (CIDEr-N) | 0.158 | 15 | 1mo ago |
| PaveInstruct | LLaVA-1.5-7B | BLEU-4 | 10.08 | 13 | 9d ago |
| GovLA-10K | GovLA-Reasoner | BLEU-1 | 53.32 | 10 | 1mo ago |
| SlideBench Caption | MLLM-HWSI | BLEU-1 | 46.2 | 9 | 24d ago |
| W3D Traffic Accident original (test) | LLada* | BLEU | 38 | 9 | 1mo ago |
| W3D Safety-Critical Situation original (test) | LLada* | BLEU | 44 | 9 | 1mo ago |
| W3D Normal Driving original (test) | LLada* | BLEU | 44 | 9 | 1mo ago |
| VALOR 32K | MiCo | CIDEr | 62.8 | 9 | 1mo ago |
| Flickr-30K | SeeTRUE | CIDEr-N | 0.251 | 8 | 1mo ago |
| MS-COCO | SeeTRUE | CIDEr-N | 0.158 | 8 | 1mo ago |
| VRSBench (val) | LLaVA-1.5 | BLEU-4 | 14.7 | 5 | 1mo ago |
| RSIEval | RSGPT | B-4 | 22.07 | 5 | 1mo ago |
| MyVLM Single Concept (test) | Ego | Recall | 91.3 | 4 | 1mo ago |
| MSR-VTT 3 modal | Ours | BLEU@1 | 26.8 | 4 | 1mo ago |
| MSCOCO 2 modal | Ours | BLEU-1 | 46.1 | 4 | 1mo ago |
| This-is-my Multi Concept (test) | Ego | Recall | 70.9 | 3 | 1mo ago |
| Person-in-WiFi 3D Single-person | WiFi2Cap | BLEU-4 | 47.07 | 2 | 24d ago |
| WiFi2Cap | WiFi2Cap | BLEU-4 | 51.78 | 2 | 24d ago |