| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Text-to-Image Retrieval | ShareGPT4V | R@1 | 97.9 | 35 |
| Image-to-Text Retrieval | ShareGPT4V | Recall@1 | 97.8 | 17 |
| Text-to-Image Retrieval | ShareGPT4V 1k | Recall@1 | 99 | 11 |
| Image-to-Text Retrieval | ShareGPT4V 1k | R@1 | 99.5 | 11 |
| Text-to-Image Retrieval | ShareGPT4V 10k | Recall@1 | 94.8 | 9 |
| Image-to-Text Retrieval | ShareGPT4V 10k | R@1 | 95.5 | 9 |
| Image-to-Text Retrieval | ShareGPT4V 5,000 samples (test) | R@1 | 88.62 | 6 |
| Text-to-Image Retrieval | ShareGPT4V 5,000 samples (test) | R@1 | 85.48 | 6 |
| Long-caption Image-to-Text Retrieval | ShareGPT4V | Recall@1 | 90.8 | 4 |
| Long-caption Text-to-Image Retrieval | ShareGPT4V | Recall@1 | 0.939 | 4 |
| Text Generation | ShareGPT4V | BLEU-1 | 47.9 | 3 |
| Object Hallucination Assessment | ShareGPT4V | CHAIR-S Score | 46.8 | 3 |