| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Image Captioning Robustness | Image Captioning Dataset | CLIP Score (RN-50) | 82.9 | 30 |
| Image Captioning | Image Captioning Hard Criterion Claude-3.5 | ASR | 7 | 8 |
| Image Captioning | Image Captioning Hard Criterion GPT-o1 | ASR | 53 | 8 |
| Image Captioning | Image Captioning Hard Criterion GPT-4o | ASR | 74 | 8 |
| Image Captioning | Image Captioning Target: Claude-4.5, Soft Criterion | ASR | 8 | 8 |
| Image Captioning | Image Captioning Target Gemini-2.5 Soft Criterion | Accuracy Score Rate (ASR) | 79 | 8 |
| Image Captioning | Image Captioning Target: GPT-5 Soft Criterion | ASR | 56 | 8 |
| Image Captioning | Image Captioning Target: GPT-4o, Soft Criterion | ASR | 81 | 8 |