| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio Classification | VocalSound | Accuracy 72.2 | 21 |
| Vocal Sound Classification | VocalSound | Accuracy 94.85 | 21 |
| Acoustic Event Classification | VocalSound | Normalized Score 93 | 20 |
| Audio Understanding | VocalSound | Accuracy 94.85 | 7 |
| Audio Classification | VocalSound (test) | Accuracy 85.52 | 7 |
| Multiple-choice Audio Question Answering | VocalSound (test) | Accuracy 93.92 | 4 |
| Vocal Event Detection | VocalSound (test) | ACC (%) 93.3 | 3 |