| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio Classification | ESC50 (test) | R@1 Accuracy: 0.9803 | 28 |
| Zero-shot Audio Classification Explanation | ESC50 contamination | AI: 39.4 | 24 |
| Audio Classification | ESC50 Actions | Accuracy (Base): 89.17 | 13 |
| Sound Event Classification | ESC50-A Source | Accuracy: 78.17 | 8 |
| Audio Classification | ESC50 Actions (test) | Accuracy: 99 | 7 |
| Open-ended Audio Classification | ESC50 Open | Accuracy: 36.9 | 3 |
| Action Recognition | ESC50 Actions | mAP: 52.77 | 2 |