| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Audio Understanding | MMSU | Perception Score | 55.7 | 32 |
| General Audio Understanding | MMSU 1.0 (test) | Perception Semantics | 72.13 | 16 |
| Audio Understanding | MMSU (test) | Overall Score | 66.64 | 15 |
| Audio Question-Answering | MMSU | Score | 72.09 | 12 |
| Multi-task Knowledge | MMSU | Accuracy | 67.1 | 11 |
| Knowledge | MMSU (test) | Performance | 77 | 11 |
| Speech Reasoning | MMSU S→T only | Accuracy | 43.2 | 9 |
| Audio-conditioned reasoning | MMSU | Acc | 57.63 | 8 |
| Multimodal Understanding | MMSU | MMSU Score | 61.4 | 7 |
| Audio Reasoning | MMSU | Accuracy (Audio Reasoning) | 70.7 | 7 |
| Multi-task Language Understanding | MMSU | Accuracy | 71.6 | 6 |
| Audio Understanding & Reasoning | MMSU | Score | 0.724 | 3 |