Text-to-Image Retrieval on COCO 2017 (val)
[Chart: Recall@1; top result CLIP-Refine, 37.64 (Apr 17, 2025)]
Evaluation Results
Method              Setting                    Date     Recall@1  Recall@5  Recall@10
CLIP-Refine         Backbone=ViT-B/32, Zer...  2025.04  37.64     63.54     74.42
m²-mix              Backbone=ViT-B/32, Zer...  2025.04  36.28     62.18     73.08
HyCD                Backbone=ViT-B/32, Zer...  2025.04  36.04     62.28     73.14
Contrastive         Backbone=ViT-B/32, Zer...  2025.04  34.88     61.5      72.1
HyCD + Lalign       Backbone=ViT-B/32, Zer...  2025.04  33.92     61.18     72.06
Self-KD             Backbone=ViT-B/32, Zer...  2025.04  31.04     55.58     65.9
Pre-trained (CLIP)  Backbone=ViT-B/32, Zer...  2025.04  30.56     54.92     65.26
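
For readers unfamiliar with the metric: text-to-image Recall@K is commonly defined as the percentage of text queries (captions) whose ground-truth image appears among the top K retrieved images. The following is a minimal sketch of that definition in Python, assuming a precomputed caption-by-image similarity matrix. The function name `recall_at_k`, its parameters, and the toy scores are illustrative assumptions; this is not the evaluation code of any method listed above, and it does not reproduce the reported numbers.

```python
from typing import Dict, Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    gt_image: Sequence[int],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[int, float]:
    """Text-to-image Recall@K, in percent (hypothetical helper).

    similarity[i][j] is the score between caption i and image j
    (higher means more similar); gt_image[i] is the index of the
    image that caption i describes.
    """
    hits = {k: 0 for k in ks}
    for scores, target in zip(similarity, gt_image):
        # 0-based rank of the ground-truth image: how many images score
        # strictly higher. Ties are resolved in favor of the ground truth,
        # which is an assumption of this sketch.
        rank = sum(1 for s in scores if s > scores[target])
        for k in ks:
            if rank < k:
                hits[k] += 1
    n = len(gt_image)
    return {k: 100.0 * hits[k] / n for k in ks}


# Toy example with made-up scores: 3 captions, 4 candidate images.
toy_similarity = [
    [0.9, 0.1, 0.3, 0.2],  # caption 0: ground-truth image 0 is ranked 1st
    [0.4, 0.2, 0.5, 0.1],  # caption 1: ground-truth image 0 is ranked 2nd
    [0.1, 0.6, 0.2, 0.7],  # caption 2: ground-truth image 2 is ranked 3rd
]
toy_gt = [0, 0, 2]
toy_result = recall_at_k(toy_similarity, toy_gt, ks=(1, 2, 3))
print(toy_result)
```

In the toy data, one of three captions retrieves its image at rank 1, two within the top 2, and all three within the top 3, so the recall values rise with K, as they do across the Recall@1, Recall@5, and Recall@10 columns in the table.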