Spoken Instruction Following on Human Recordings (test)
Top result: VIRBA, 71.2 LC Win Rate (Aug 25, 2025)
[Chart: LC Win Rate over time]
Updated 13d ago
Evaluation Results
Method                           Date      LC Win Rate
VIRBA (Backbone=Qwen2.5-Omni)    2025.08   71.2
VIRBA (Backbone=Qwen2-Audio)     2025.08   68
Step-Audio-R1                    2025.08   65.1
Qwen2.5-Omni                     2025.08   64.4
Single-view RL                   2025.08   59
TTS-SFT                          2025.08   55.4
Qwen2-Audio base                 2025.08   43.1