Zero-shot Image-Text Retrieval on DOCCI
[Chart: Accuracy by date. Top entry: CLIPS (ViT-L/14), 66.2 Accuracy, Mar 26, 2026]
Updated 23d ago
Evaluation Results
| Method | Configuration | Date | Accuracy |
|---|---|---|---|
| CLIPS (ViT-L/14) | Backbone=ViT-L/14 | 2026.03 | 66.2 |
| SigLIP ViT-L/16-res256 | Backbone=ViT-L/16, Inp... | 2026.03 | 62.5 |
| SigLIP2-ViT-B/16 | Backbone=ViT-B/16 | 2026.03 | 62.2 |
| C^2LIP | Backbone=ViT-B/16 | 2026.03 | 60 |
| CLIP-A (ViT-L/14) | Backbone=ViT-L/14 | 2026.03 | 59.4 |
| BLIP-B | Backbone=ViT-B | 2026.03 | 51.6 |
| FLAVA | Backbone=ViT-B | 2026.03 | 48.3 |