| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Text-to-Image Generation | LAION-5B 1,000 prompts | FID (Real): 47.079 | 20 | |
| Image-to-Image Translation (Appearance Divergence) | LAION Mini | Structure Similarity: 96.8 | 20 | |
| Image-to-Image Translation (Appearance Consistency) | LAION Mini | Structure Similarity: 0.965 | 20 | |
| Text-to-Image Generation | LAION 5B 1K | HPSv2.1: 28.685 | 18 | |
| Machine Unlearning | LAION 400M | Forget Accuracy: 58.91 | 15 | |
| Paired Perceptual Deviation | LAION | LPIPS: 0 | 10 | |
| Text-image alignment | LAION | CLIP Cosine Similarity: 0.31 | 10 | |
| Text-conditional image generation | LAION-10k | CLIP Score: 33.2 | 10 | |
| Text-to-Image Generation | LAION | P Score: 0.91 | 8 | |
| Filtered Nearest Neighbor Search | LAION 25M 1.0 (train) | Indexing Time (s): 569 | 8 | |
| Filtered Nearest Neighbor Search | LAION 5M subset 1.0 (train) | Indexing Time (s): 89 | 8 | |
| Filtered Nearest Neighbor Search | LAION 1M 1.0 (train) | Indexing Time (s): 13 | 8 | |
| Text-to-Image Generation | LAION 10K 5B (test) | FID: 10.68 | 8 | |
| Pose-guided Text-to-Image Generation | LAION-Human | AP: 57.11 | 7 | |
| Text-driven Image-to-Image Translation | LAION Mini (subset (20 samples)) | Inversion Time (s): 3.5 | 7 | |
| Text-to-Image Generation | LAION-10k Scenario 1 (test) | Similarity (95pc): 0.6504 | 7 | |
| High-Resolution Image Generation | LAION-5B 4x4 scaling factor (test) | FID: 58.91 | 7 | |
| High-Resolution Image Generation | LAION-5B 3x3 scaling factor (test) | FID: 68.82 | 7 | |
| High-Resolution Image Generation | LAION 5B 2x2 scaling factor (test) | FID: 58.91 | 7 | |
| Text-to-Image Generation | LAION-5B FI (test) | FID: 16.23 | 7 | |
| Image-to-Text Retrieval | LAION-5B | R@5: 75.2 | 6 | |
| Text-to-Image Retrieval | LAION 5B | Recall@5: 79.6 | 6 | |
| Prompt Recovery | LAION | CLIP Score: 82.6 | 6 | |
| Text-driven real image editing | LAION-5B real images | CLIP-S Score: 0.322 | 6 | |
| Membership Inference | LAION processed (pretraining) | ASR: 64.53 | 6 | |