| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Context Length Estimation | Song Describer | Context Length (s) | 993 | 10 |
| Audio Reconstruction | Song Describer | L/R Mel | 0.9586 | 10 |
| Audio Captioning | Song Describer (SD) | SBERT Similarity | 0.469 | 4 |
| Music Generation | Song Describer Dataset no-singing 2m | Stereo Correctness | 96 | 3 |