Cross-modal Vision-Language Retrieval on MSLS (val)
Top result: La-CLIP, R@1 = 38 (Feb 3, 2026)
[Chart: retrieval scores by date; selectable metrics R@1, R@5, R@10, R@20]
Updated 4d ago
Evaluation Results
Method        Augmentation   Date      R@1    R@5    R@10   R@20
La-CLIP       Language-...   2026.02   38     58.4   68.2   77.2
La-BLIP       Language-...   2026.02   38     60.5   68.4   75.4
La-SigLIP-V2  Language-...   2026.02   35.7   60.5   69.6   76.2
La-EVA-V2     Language-...   2026.02   32.8   52.8   62.4   70.9
EVA-CLIP-V2   None           2026.02   3.8    9.2    13     19.1
SigLIP-V2     None           2026.02   2.6    6.5    9.7    13.1
CLIP          None           2026.02   2.3    6.2    10.4   14.5
BLIP          None           2026.02   1.6    5.9    8.2    11.6