Audio Question Answering on AudioCaps-QA (test)
Best result: M3KG-RAG, Model-as-Judge Score 60.77
[Chart: Model-as-Judge Score over time; Dec 23, 2025]
Evaluation Results
| Method | MLLM | Date | Model-as-Judge Score |
|---|---|---|---|
| M3KG-RAG | Qwen2.5-Omni | 2025.12 | 60.77 |
| M3KG-RAG | VideoLLaMA2 | 2025.12 | 53.23 |
| VAT-KG | Qwen2.5-Omni | 2025.12 | 51.3 |
| Wikidata | Qwen2.5-Omni | 2025.12 | 49.78 |
| M2ConceptBase | Qwen2.5-Omni | 2025.12 | 49.78 |
| None | Qwen2.5-Omni | 2025.12 | 49 |
| VTKG | Qwen2.5-Omni | 2025.12 | 48.95 |
| VAT-KG | VideoLLaMA2 | 2025.12 | 44.6 |
| Wikidata | VideoLLaMA2 | 2025.12 | 43.58 |
| None | VideoLLaMA2 | 2025.12 | 43.13 |
| VTKG | VideoLLaMA2 | 2025.12 | 43.02 |
| M2ConceptBase | VideoLLaMA2 | 2025.12 | 42.19 |
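
The scores listed above mix two MLLM backbones, so a per-backbone comparison can be easier to read than the flat ranking. The sketch below is only an illustration: it copies the scores from the listing and subtracts the "None" row for each MLLM from every other method. Treating "None" as the baseline without an external knowledge source is an assumption, and the names `RESULTS` and `gains_over_baseline` are hypothetical, not part of the benchmark.

```python
# Scores copied from the results listed above: (method, MLLM, score).
RESULTS = [
    ("M3KG-RAG", "Qwen2.5-Omni", 60.77),
    ("M3KG-RAG", "VideoLLaMA2", 53.23),
    ("VAT-KG", "Qwen2.5-Omni", 51.3),
    ("Wikidata", "Qwen2.5-Omni", 49.78),
    ("M2ConceptBase", "Qwen2.5-Omni", 49.78),
    ("None", "Qwen2.5-Omni", 49.0),
    ("VTKG", "Qwen2.5-Omni", 48.95),
    ("VAT-KG", "VideoLLaMA2", 44.6),
    ("Wikidata", "VideoLLaMA2", 43.58),
    ("None", "VideoLLaMA2", 43.13),
    ("VTKG", "VideoLLaMA2", 43.02),
    ("M2ConceptBase", "VideoLLaMA2", 42.19),
]


def gains_over_baseline(results, baseline="None"):
    """Return {mllm: {method: score - baseline score}}, rounded to 2 decimals.

    Assumption: the row whose method equals `baseline` is the reference
    run for that MLLM (here, the "None" row).
    """
    base = {mllm: score for method, mllm, score in results if method == baseline}
    gains = {}
    for method, mllm, score in results:
        if method == baseline:
            continue
        gains.setdefault(mllm, {})[method] = round(score - base[mllm], 2)
    return gains


if __name__ == "__main__":
    for mllm, per_method in gains_over_baseline(RESULTS).items():
        # Largest gain first within each backbone.
        for method, gain in sorted(per_method.items(), key=lambda kv: -kv[1]):
            print(f"{mllm:13s} {method:14s} {gain:+.2f}")
```

Grouping by MLLM keeps each difference relative to the same backbone, so the two model families are not compared against each other's baseline.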