| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio tasks | Audio Tasks (TUT, VocalSound, Clotho) zero-shot | Score 25.3 | 9 |
| Question Answering | Audio Tasks (test) | Time F1 49.9 | 7 |
| Summarization | Audio Tasks (test) | Time F1 83.12 | 6 |
| Audio Understanding | 3 Audio Tasks (TUT, VocalSound, Clotho) | Score 25.3 | 5 |