
Simultaneous Music Separation and Generation Using Multi-Track Latent Diffusion Models

About

Diffusion models have recently shown strong potential in both music generation and music source separation. Although still in its early stages, a trend is emerging toward integrating these tasks into a single framework, since both involve generating musically aligned parts and can be seen as facets of the same generative process. In this work, we introduce a latent diffusion-based multi-track generation model capable of both source separation and multi-track music synthesis by learning the joint probability distribution of tracks that share a musical context. Our model also enables arrangement generation by creating any subset of tracks given the others. We trained our model on the Slakh2100 dataset, compared it with an existing simultaneous generation and separation model, and observed significant improvements across objective metrics for source separation, music generation, and arrangement generation. Sound examples are available at https://msg-ld.github.io/.
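The arrangement-generation capability described above, creating any subset of tracks given the others, can be viewed as diffusion inpainting along the track axis: during reverse diffusion, the latents of the provided tracks are re-imposed at every denoising step so the sampled tracks stay consistent with them. The sketch below illustrates this clamping pattern only; it is not the authors' code, and `denoise_step` is a hypothetical stand-in for the learned latent-diffusion denoiser.

```python
import numpy as np

def denoise_step(x, t):
    """Hypothetical placeholder for the learned denoiser; here it
    simply shrinks the sample toward zero each reverse step."""
    return x * 0.9

def generate_subset(latents, known_mask, steps=50, seed=0):
    """Sample the unknown tracks while clamping the known ones.

    latents:    (tracks, latent_dim) array of per-track latents
    known_mask: (tracks, 1) boolean; True = track is given, kept fixed
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(latents.shape)  # start from pure noise
    for t in range(steps, 0, -1):
        x = denoise_step(x, t)
        # re-impose the given tracks (e.g. keep bass/drums, generate the rest)
        x = np.where(known_mask, latents, x)
    return x

# Toy example: 4 tracks with 8-dim latents; tracks 0-1 given, 2-3 generated
tracks = np.arange(32, dtype=float).reshape(4, 8)
mask = np.zeros((4, 1), dtype=bool)
mask[:2] = True
out = generate_subset(tracks, mask)
assert np.allclose(out[:2], tracks[:2])  # given tracks are preserved exactly
```

With a real denoiser, the generated rows would be drawn from the model's joint distribution conditioned on the clamped tracks; source separation uses the same joint model with the mixture as the conditioning signal.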

Tornike Karchkhadze, Mohammad Rasool Izadi, Shlomo Dubnov • 2024

Related benchmarks

Task                                                         | Dataset          | Metric                     | Result | Rank
Multi-track Music Generation (Mixture Subjective Evaluation) | Slakh2100 (test) | Subjective Score (Group 1) | 1.5    | 6
Multi-track music generation                                 | Slakh2100 (test) | FAD                        | 1.31   | 5
Multi-track music generation                                 | MUSDB18          | CBS                        | 0.386  | 5
Multi-track music generation                                 | Slakh2100        | CBS                        | 0.386  | 5
Inner Track Rhythmic Stability                               | Slakh2100        | IRS (Bass)                 | 0.041  | 4
Multi-track Rhythmic Synchronization                         | Slakh2100        | CBS                        | 0.3861 | 4
Multi-track Music Generation (Drum Subjective Evaluation)    | Slakh2100 (test) | Subjective Score (Group 1) | 1.2    | 3
