| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| TempCompass | Qwen2-VL-72B | MC Accuracy | 76 | 12 | 4d ago |
| Vinoground | Tarsier2-7B | Text Count/Score | 65.8 | 12 | 4d ago |
| TOMATO (test) | Tarsier2-7B | Accuracy | 42 | 12 | 4d ago |
| TVBench (test) | Tarsier2-7B | Accuracy | 54.7 | 12 | 4d ago |
| Perception Test (val) | LLaVA-Video-72B | Accuracy | 74.3 | 9 | 4d ago |