Text-to-audio retrieval on AudioCaps
[Chart: Recall@1 over time, Mar 22, 2024 to Dec 22, 2025; highest point 55.2 (InternVideo2-6B)]
Updated 2d ago
Evaluation Results
Method           Details                     Date     Recall@1
InternVideo2-6B  Evaluation protocol=Fi...   2024.03  55.2
VAST             Evaluation protocol=Fi...   2024.03  52
PEAV L           A-Enc Params.=1.1B, Da...   2025.12  45.8
PEAV L           A-Enc Params.=1.1B, Da...   2025.12  44.7
PEAV L-OOD       A-Enc Params.=1.1B, Da...   2025.12  43.4
PEAV B           A-Enc Params.=0.2B, Dat...  2025.12  43.1
PEAV B           A-Enc Params.=0.2B, Da...   2025.12  42.7
PEAV S           A-Enc Params.=0.09B, Da...  2025.12  41.8
PEAV S           A-Enc Params.=0.09B, Da...  2025.12  41.2
InternVideo2-6B  Evaluation protocol=Ze...   2024.03  37.1
CLAP-Fusion      A-Enc Params.=0.03B, Da...  2025.12  35.4
PEAV L (PT)      A-Enc Params.=1.1B, Da...   2025.12  33.7
CLAP             A-Enc Params.=0.03B, Da...  2025.12  31.6
AFlamingo2       A-Enc Params.=0.3B, Da...   2025.12  29.8
VIP-ANT          Evaluation protocol=Ze...   2024.03  27.7
M2D-CLAP         A-Enc Params.=0.09B, Da...  2025.12  27.4
MS-CLAP23’       A-Enc Params.=0.08B, Da...  2025.12  23.4
LangBind         A-Enc Params.=0.3B, Da...   2025.12  19.7
ImageBind        A-Enc Params.=0.09B, Da...  2025.12  6.6