| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Style aligned image generation | 100 text prompts (test) | Text Alignment (CLIP Score) | 28.9 | 11 |
| Text-to-3D Avatar Generation | 50 Text Prompts | BLIP-VQA | 0.64 | 6 |
| Text-to-3D Generation | 120 text prompts (test) | CLIP Score | 0.737 | 6 |
| Prompt-image Alignment | 300 text prompts (test) | CLIP Score | 31.9 | 4 |
| Text-to-4D Synthesis | 300 text prompts | R-Precision | 83.7 | 4 |
| Text-to-Video | 220 text prompts | OF Score | 10.26 | 3 |
| Text-to-3D Generation | 160 text prompts descriptions of people (test) | R-Precision | 83.8 | 2 |