| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Storytelling | VIST (test) | METEOR 35.7 | 38 |
| Visual Storytelling | VIST Human Evaluation (test) | Preference Rate 71.5 | 16 |
| Narrative Reasoning | VIST (test) | BLEURT 0.456 | 14 |
| Contextual Image Retrieval | VIST | R@13,120 | 10 |
| Visual Storytelling | VIST Human Evaluation (test) | Preference 72.3 | 8 |
| Visual Storytelling | VIST (test) | Win Rate 42.7 | 8 |
| Contextual Image Generation | VIST 28 (test) | CLIP Similarity 0.641 | 7 |
| Visual Storytelling | VIST 150 stories | Relevance 0.56 | 6 |
| Visual Storytelling | VIST album level 1.0 (test) | METEOR 35.5 | 6 |
| Visual Storytelling | VIST (val) | Perplexity 18.13 | 6 |
| Story Generation | VIST 400 albums (test) | Preference Rate 70.5 | 6 |
| Visual Storytelling | VIST (test) | Top-1 Rank Percentage 32.6 | 4 |
| Visual Storytelling | VIST (test) | BLEU-4 14.4 | 4 |
| Album Summarization | VIST (test) | Precision 45.51 | 4 |
| Visual Storytelling | VIST photo stream level 1.0 (test) | METEOR-v2 35.2 | 3 |
| Visual Storytelling | VIST | METEOR 34.4 | 3 |
| Story continuation | VIST DII | FID 17.03 | 2 |
| Story continuation | VIST SIS | FID 16.95 | 2 |
| Visual Storytelling | VIST 1.0 (test) | Relevance 53.2 | 2 |
| Story continuation | VIST-SIS (test) | Visual Quality Win 86.6 | 1 |