| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text-to-Image Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 53.3 | 5 |
| Image-to-Text Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 36.9 | 5 |
| Text-to-Image Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 90.7 | 5 |
| Image-to-Text Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 74.7 | 5 |
| Average Cross-modal Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 36.4 | 4 |
| Image-to-Audio Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 38.8 | 4 |
| Audio-to-Image Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 32.8 | 4 |
| Text-to-Audio Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 37.1 | 4 |
| Audio-to-Text Retrieval | Taxonomic Retrieval Family level | Top-1 Accuracy: 19.2 | 4 |
| Average Cross-modal Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 66 | 4 |
| Image-to-Audio Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 48.6 | 4 |
| Audio-to-Image Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 43.2 | 4 |
| Text-to-Audio Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 84.9 | 4 |
| Audio-to-Text Retrieval | Taxonomic Retrieval Genus level | Top-1 Accuracy: 53.6 | 4 |