| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Relation-Conditioned Retrieval | Music | Hit@5 51.53 | 30 | |
| Single Sound Source Localization | MUSIC Solo (test) | IoU@0.5 98.5 | 26 | |
| Analysis-synthesis | Music Academic | FAD 0 | 24 | |
| Audio-Visual Sound Separation | MUSIC-21 (test) | SDR 10.36 | 24 | |
| Regression | music | Mean 0.598 | 24 | |
| Generalization Performance | music | Avg Generalization Error 0.21 | 24 | |
| Sound Separation | MUSIC-clean+ | CLAPt 6.94 | 18 | |
| Audio Generation | Music clean (test) | Generation Success Rate 100 | 18 | |
| Audio Generation | Music Noise SNR=-10 (test) | Generation Success Rate 85 | 18 | |
| Hypernym discovery | music Gold standard domain-specific (test) | MRR 80.6 | 18 | |
| Target Sound Extraction | MUSIC21 (test) | SDRi 9.47 | 17 | |
| Audio source separation | MUSIC (test) | SDR 11.61 | 16 | |
| Sequential Recommendation | Music | Recall 14.02 | 14 | |
| Sequential Recommendation | Music | HR 5.027 | 12 | |
| Rating Prediction | Music unbiased (test) | AUC 68.8 | 12 | |
| Analysis-synthesis | Music Industrial | FAD 0 | 12 | |
| ECHO-related classification | MUSIC (test) | LVEF < 40% Classification 68 | 12 | |
| Open-Ended Abstract Retrieval (OAR) | Music | Diversity 49.2 | 11 | |
| Source Separation | MUSIC (test) | IS 3.43 | 10 | |
| Audio Restoration | Music Datasets (test) | Mel Distance 1.2115 | 9 | |
| Multi-source sound localization | MUSIC-Duet | CIoU@0.3 32.5 | 9 | |
| Music Genre Classification | Music (test) | Accuracy 84.06 | 9 | |
| Entity Matching | Music 20K | Precision 93.4 | 8 | |
| SCD outcome prediction | MUSIC | AUROC 0.6223 | 8 | |
| Video-to-audio generation | MUSIC (test) | Overall Score 4.3 | 8 | |