| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-image generation | Localized Narratives COCO (val) | FID 8.39 | 6 |
| Controlled Caption Generation | Localized Narratives of Open Images | BLEU-1 0.584 | 3 |
| Controlled Caption Generation | Localized Narratives ADE20k | BLEU-1 0.58 | 3 |