| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Fine-grained Image Captioning | DetailCaps (test) | CAPTURE 72.2 | 29 |
| Image Captioning | DetailCaps | Score 64.2 | 13 |
| Long-form Generation | DetailCaps 100 sampled instances | Score 64.4 | 8 |
| Detail Captioning | DetailCaps | CAPTURE - | 0 |