| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Text-to-Image Generation | MARIO-Eval | CLIPScore | 34.7 | 25 |
| Text Image Reconstruction | MARIO-Eval | Accuracy | 80.2 | 11 |
| Text Rendering | MARIO-Eval 500-sample (external) | NED | 0.0896 | 7 |
| Text-to-Image Generation | MARIO-Eval (test) | FID | 34.902 | 4 |
| Inpainting | MARIO-Eval | Inpainting Ability (Human) | 74.51 | 2 |