| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-Image Retrieval | Sydney (val) | R@5: 65.4 | 23 |
| Image-to-Text Retrieval | Sydney (val) | R@5: 63.6 | 23 |
| Text-to-image retrieval | Sydney | R@1: 24.7 | 22 |
| Image-to-text retrieval | Sydney | R@1: 21.3 | 22 |
| Remote Sensing Image Captioning | Sydney (test) | ReconScore: 85.67 | 13 |
| 3D Object Classification | Sydney Few-shot learning setup sparse (100 points) | 5-way 10-shot Accuracy: 86.2 | 11 |
| Remote Sensing Image Captioning | SYDNEY | BLEU-1: 84.02 | 8 |
| Image Captioning | SYDNEY (test) | Related Rate: 91.38 | 2 |