Audio-to-Text Retrieval (via Vision pivot) on MSRVTT to AVE (test)
[Chart: mAP by date. LB+BB, 29.2 mAP, Feb 6, 2026]
Evaluation Results
Method  Configuration              Date     mAP
LB+BB   Backbone=LanguageBind,...  2026.02  29.2
IB+BB   Backbone=ImageBind, Va...  2026.02  27.8
LB+FT   Backbone=LanguageBind,...  2026.02  13.6
LB      Backbone=LanguageBind,...  2026.02  11.2
IB+FT   Backbone=ImageBind, Va...  2026.02  9.3
IB      Backbone=ImageBind, Va...  2026.02  8.1