| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Semantic Segmentation | COCO Stuff | mIoU | 3,100 | 195 |
| Semantic Segmentation | Coco-Stuff (test) | mIoU | 52 | 184 |
| Semantic Segmentation | COCO Stuff (val) | mIoU | 62.55 | 126 |
| Semantic Segmentation | COCO-Stuff-10K (test) | mIoU | 53.46 | 47 |
| Semantic Segmentation | COCO-Stuff 164K (test) | mIoU (Mean Scale) | 51.7 | 43 |
| Semantic Segmentation | COCO-Stuff 27 | mIoU | 52 | 40 |
| Open Vocabulary Semantic Segmentation | COCO Stuff without background | mIoU | 31.5 | 27 |
| Unsupervised image segmentation | COCO-Stuff (test) | Accuracy | 89.7 | 26 |
| Layout-to-Image Synthesis | COCO-Stuff (test) | FID | 14.4 | 25 |
| Layout-to-Image Generation | COCO-Stuff | FID | 0 | 23 |
| Image Colorization | Extended COCO-Stuff (test) | PSNR | 25.97 | 20 |
| Semantic Segmentation | COCO-Stuff without background class | mIoU | 25.8 | 20 |
| Unsupervised image segmentation | COCO-Stuff 3-class (test) | Accuracy | 84.7 | 19 |
| Full-image colorization | COCO-Stuff (val) | LPIPS | 0.095 | 18 |
| Semantic Segmentation | COCO-Stuff 10K | mIoU | 54.56 | 16 |
| Layout-to-image synthesis | COCO-Stuff 22 (test) | Inception Score | 34.5 | 15 |
| Semantic Segmentation | COCO-Stuff (unseen) | mIoU | 61.5 | 14 |
| Scene Generation | COCO-Stuff unseen (eval) | FID | 48.9 | 14 |
| Scene Generation | COCO-Stuff seen (val) | FID | 74.3 | 14 |
| Scene Generation | COCO-Stuff (val) | FID | 35.6 | 14 |
| Unsupervised Segmentation | COCO-Stuff (val) | mIoU | 21.9 | 13 |
| Scene Generation | COCO-Stuff (train) | FID | 7.1 | 12 |
| Image Generation | COCO-Stuff (test) | Inception Score | 30.7 | 12 |
| Unsupervised Open-vocabulary Semantic Segmentation | COCO Stuff (val) | mIoU | 28.8 | 10 |
| Semantic Image Synthesis | COCO-Stuff to Cityscapes (target: 100 images) | FID | 47 | 10 |