| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio Classification | ESC-50 | Accuracy 99.25 | 441 |
| Audio Classification | ESC50 | Top-1 Acc 96.5 | 64 |
| Target Sound Extraction | ESC-50 (test) | SISDRi 12.37 | 46 |
| Environmental Sound Classification | ESC-50 (5-fold cross-validation) | Accuracy 98.1 | 38 |
| Environmental sound classification | ESC | Top-1 Acc 91.8 | 28 |
| Audio Super-Resolution | ESC-50 Out-of-domain | LSD 1.18 | 16 |
| Classification | ESC-50 (test) | Accuracy 96.35 | 16 |
| Environmental Sound Classification | ESC-50 (10-fold cross-validation) | Accuracy 96.1 | 13 |
| Audio Classification | ESC50 (In-Domain) | AI 43.35 | 12 |
| Zero-shot Audio Classification Explanation | ESC50 White Noise contamination | AI Score 32.97 | 12 |
| Utterance Generation | ESC (test) | BLEU-1 18.75 | 10 |
| Strategy Prediction | ESC (test) | Error Rate Metric (EMR) 0.67 | 10 |
| Audio Classification | ESC | Top-1 Accuracy 95.5 | 10 |
| Audio Classification | ESC-Actions | Accuracy 91.5 | 10 |
| Event Causality Identification | ESC cross-topic partition | Precision 0.505 | 10 |
| Audio Classification | ESC-50 (val) | Top-1 Acc 99 | 10 |
| Audio Tagging | ESC-FreeGen50 (test) | Accuracy 64.32 | 7 |
| Audio Retrieval | ESC | Recall@1 94 | 6 |
| Audio Classification | ESC-50 | Overall Accuracy 76.3 | 6 |
| Audio Super-Resolution | ESC-50 (test) | MOS 4.87 | 6 |
| Environmental Sound Classification | ESC-50 (incremental split (5 tasks)) | Accuracy 50 | 6 |
| Sound Event Classification | ESC-50 Simulated Distributed Layouts (five-fold cross-validation) | Accuracy (Circular) 36.8 | 6 |
| Audio Classification | ESC-50 500 labels | Top-1 Error Rate 0.2592 | 6 |
| Audio Classification | ESC-50 250 labels | Top-1 Error Rate 29.71 | 6 |
| Clustering Analysis | ESC-10 | Silhouette Score 0.091 | 5 |