| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Vocal Sound Classification | VocalSound | Accuracy: 94.85 | 21 |
| Audio Classification | VocalSound | Base Accuracy: 82.47 | 13 |
| Audio Classification | VocalSound (test) | Accuracy: 85.52 | 7 |
| Multiple-choice Audio Question Answering | VocalSound (test) | Accuracy: 93.92 | 4 |