| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| LLaVA-Bench Wild | AutoV | Score | 102.3 | 60 | 1mo ago |
| FronTalk Multi-Turn 1.0 (test) | | PR Score | 75 | 32 | 1mo ago |
| LLaVA-W | | Score | 102 | 28 | 1mo ago |
| FronTalk Single-Turn 1.0 (test) | | PR | 77.5 | 20 | 1mo ago |
| LLaVA-Bench | DPO | Overall Score | 79.1 | 15 | 19d ago |
| Visual Instruction Total (test) | PromptEnhancer | Avg. Response Length (Words) | 153.04 | 6 | 1mo ago |
| LLaVA-Bench 100 images (test) | Nullu | Accuracy | 6.53 | 6 | 1mo ago |
| Visual Instruction Out-Of-Distribution - Hard (test) | BeautifulPrompt | Win Ratio (Human) | 88 | 5 | 1mo ago |
| Visual Instruction Out-Of-Distribution - Simple (test) | BeautifulPrompt | Human Win Ratio | 93 | 5 | 1mo ago |
| Visual Instruction Out-Of-Distribution (test) | BeautifulPrompt | Human Win Ratio | 90 | 5 | 1mo ago |
| Visual Instruction In-Distribution - Hard (test) | BeautifulPrompt | Human Win Ratio | 89 | 5 | 1mo ago |
| Visual Instruction In-Distribution - Simple (test) | BeautifulPrompt | Human Win Ratio | 89 | 5 | 1mo ago |
| Visual Instruction In-Distribution (test) | BeautifulPrompt | Human Win Ratio | 89 | 5 | 1mo ago |