| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text Retrieval | RSICD (test) | R@1 22.14 | 51 |
| Image-Text Retrieval | RSICD (test) | mR 10.43 | 43 |
| Text-to-Image Retrieval | RSICD (test) | R@1 15.59 | 34 |
| Cross-modal Retrieval | RSICD (test) | Image-to-Text R@1 21.13 | 32 |
| Image Retrieval | RSICD | R@1 9.76 | 30 |
| Sentence Retrieval | RSICD | R@1 12.56 | 30 |
| Image Retrieval | RSICD (test) | R@1 11.63 | 30 |
| Image-text retrieval | RSICD | Mean Recall 40.72 | 26 |
| Image Captioning | RSICD | CIDEr 97.45 | 26 |
| Image-to-Text Retrieval | RSICD (val) | R@5 18.4 | 16 |
| Remote Sensing Image-Text Retrieval | RSICD (test) | Text Retrieval R@1 18.3 | 14 |
| Text-to-Image Generation | RSICD | FID 22.11 | 13 |
| Image-to-Text Retrieval | RSICD (test) | R@1 8.66 | 13 |
| Image Classification | RSICD CLS | Accuracy 0.874 | 11 |
| Base-to-new generalization | RSICD | Base Score 96.4 | 8 |
| Single-source domain generalization | RSICD v2 | Accuracy 85.91 | 8 |
| Cross-dataset Generalization | RSICD (source) | Accuracy 61.66 | 8 |
| Text-to-image retrieval | RSICD | R@1 4.2 | 8 |
| Image-to-text retrieval | RSICD | Recall@1 4.2 | 8 |
| Classification | RSICD-CLS (test) | Top-1 Acc 66.9 | 8 |
| 30-way classification | RSICD (test) | Mean Rank 3.76 | 6 |