Audio-to-Text Retrieval (via Vision pivot) on AVE to MSRVTT (test)
Best result: 35.3 mAP (IB+BB)
[Chart: mAP by date; y-axis mAP, x-axis date Feb 6, 2026]
Updated 4d ago
Evaluation Results
Method   Details                      Date      mAP
IB+BB    Backbone=ImageBind, Va...    2026.02   35.3
LB+BB    Backbone=LanguageBind,...    2026.02   32.5
LB+FT    Backbone=LanguageBind,...    2026.02   15.3
LB       Backbone=LanguageBind,...    2026.02   13.4
IB+FT    Backbone=ImageBind, Va...    2026.02   12.4
IB       Backbone=ImageBind, Va...    2026.02   11.4