Speech-to-Text Question Answering on OBQA
Top result: 65.9 accuracy (Phi-4-Multimodal), Jan 30, 2026
Evaluation Results
Method            Details                    Date     Accuracy
Phi-4-Multimodal  Type=AR, S→S capabilit...  2026.01  65.9
DIFFUSPEECH       Type=Diff., S→S capabi...  2026.01  51.3
Qwen2-Audio       Type=AR, S→S capabilit...  2026.01  49.5
MinMo             Type=AR, S→S capabilit...  2026.01  44.5
DiFFA             Type=Diff., S→S capabi...  2026.01  35.6
Llama-Omni2       Type=AR, S→S capabilit...  2026.01  28.1
Moshi             Type=AR, S→S capabilit...  2026.01  26
SpiritLM          Type=AR, S→S capabilit...  2026.01  21.7
SpeechGPT         Type=AR, S→S capabilit...  2026.01  12.4