| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-Image Generation | GenAI-Bench | Basic Score 0.939 | 47 |
| Prompt-to-prompt semantic composition | GenAI-Bench | CLIP-T (S*T) 46.8 | 30 |
| Instruction-guided image editing preference prediction | GenAI-Bench | Accuracy 67.5 | 24 |
| Compositional Reasoning | GenAI-Bench (test) | Spatial Score 83.79 | 18 |
| Text-to-Image | GenAI-Bench 19 (test) | VQAScore 78.2 | 17 |
| element-level text-to-image alignment evaluation | GenAI-Bench | SRCC 0.749 | 17 |
| Human Consistency Evaluation | GenAI-Bench | Kendall's Tau-c 38.4 | 16 |
| Video Generation | GenAI-Bench | Accuracy 82.5 | 14 |
| Image Generation | GenAI-Bench | Accuracy 75.9 | 14 |
| Text-to-Image Generation | GenAI-Bench advanced prompts | Counting Score 82 | 12 |
| Generative AI evaluation consistency | GenAI-Bench | Pearson Correlation Score (r) 70.3 | 11 |
| Text-to-Image Generation Evaluation | GenAI-Bench | Pearson-r 70.3 | 11 |
| Visual Generation | GenAI-Bench | Overall Score 75 | 11 |
| Video Preference Alignment | GenAI-Bench | Alignment Accuracy (w/ties) 64.26 | 11 |
| Pairwise Preference | GenAI Bench (test) | Accuracy 72.38 | 11 |
| Image Editing Quality Assessment | GenAI-Bench | Accuracy 83.9 | 10 |
| Video Generation Assessment | GenAI-Bench Video (test) | Accuracy 82.5 | 8 |
| Image Generation Assessment | GenAI-Bench Image (test) | Accuracy 73.4 | 8 |
| Video Preference Modeling | GenAI-Bench (evaluation) | Tau 68.7 | 7 |
| Text-to-Image Generation Evaluation | GenAi-Bench | Kendall Tau B (Basic) 0.446 | 5 |
| Visual Reasoning | GenAI-Bench | SRCC 74.4 | 5 |
| Text-to-Image Generation | GenAI-Bench (test) | Text Alignment 83 | 5 |
| Video generation assessment | GenAI-Bench | Pairwise Accuracy 70.16 | 3 |
| Image-Text Alignment | GenAI-Bench Advanced | Alignment Score 0.276 | 3 |
| Image-Text Alignment | GenAI-Bench Basic | Alignment Score 29.6 | 3 |