| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Regular-Length Story Visualization | StoryGen Regular-Length Story Visualization (Human Evaluation) | Alignment 4.11 | 8 |
| Long Story Visualization | StoryGen Human Evaluation Set Long Story Visualization | Alignment 4.35 | 7 |
| Subject-Consistent Image Generation | StoryGen Human Evaluation Set Subject-Consistent Image Generation | Alignment 4.2 | 6 |
| Audio Storytelling | StoryGen-Eval (test) | KAD 10.82 | 4 |
| Speaker Diarization | StoryGen Eval | tcpWER 5.5 | 3 |
| Audio Captioning | StoryGen Eval | multiFLAM 88.1 | 2 |