
SongCreator: Lyrics-based Universal Song Generation

About

Music is an integral part of human culture, embodying human intelligence and creativity, and songs are an essential part of it. Previous work has explored individual aspects of song generation, such as singing voice, vocal composition and instrumental arrangement, but generating songs with both vocals and accompaniment from lyrics remains a significant challenge, hindering the application of music generation models in the real world. In this light, we propose SongCreator, a song-generation system designed to tackle this challenge. The model features two novel designs: a meticulously designed dual-sequence language model (DSLM) that captures the information of vocals and accompaniment for song generation, and a series of attention mask strategies for DSLM that allow the model to understand, generate and edit songs, making it suitable for various song-related generation tasks by selecting specific attention masks. Extensive experiments demonstrate the effectiveness of SongCreator, which achieves state-of-the-art or competitive performance on all eight tasks. Notably, it surpasses previous works by a large margin in lyrics-to-song and lyrics-to-vocals. Additionally, it can independently control the acoustic conditions of the vocals and the accompaniment in the generated song through different audio prompts, demonstrating its potential applicability. Our samples are available at https://thuhcsi.github.io/SongCreator/.
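The abstract does not spell out the attention mask strategies, so the following is only a generic sketch of the underlying idea: the same attention layers can behave differently depending on which boolean mask is supplied. The function names, the two-stream layout and the three mask kinds here are illustrative assumptions, not the DSLM design from the paper.

```python
# Generic illustration of attention masks; NOT the paper's implementation.
# mask[i][j] == True means query position i may attend to key position j.

def causal_mask(n):
    """Left-to-right mask: position i sees positions 0..i only."""
    return [[j <= i for j in range(n)] for i in range(n)]

def full_mask(n_query, n_key):
    """Unrestricted mask: every query position sees every key position."""
    return [[True] * n_key for _ in range(n_query)]

def blocked_mask(n_query, n_key):
    """Disabled mask: no query position sees any key position."""
    return [[False] * n_key for _ in range(n_query)]

def build_masks(n_a, n_b, cross_enabled):
    """Hypothetical helper for two token streams A and B.

    Each stream attends causally to itself; attention from A to B and
    from B to A is either fully enabled or fully disabled.
    """
    make_cross = full_mask if cross_enabled else blocked_mask
    return {
        "a_self": causal_mask(n_a),
        "b_self": causal_mask(n_b),
        "a_to_b": make_cross(n_a, n_b),
        "b_to_a": make_cross(n_b, n_a),
    }

masks = build_masks(3, 2, cross_enabled=False)
```

With `cross_enabled=False` the two streams are processed independently, while `cross_enabled=True` lets each stream condition on the other. Selecting a mask per task in this way is one plausible reading of how a single model could serve several generation and editing tasks.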

Shun Lei, Yixuan Zhou, Boshi Tang, Max W. Y. Lam, Feng Liu, Hangyu Liu, Jingcheng Wu, Shiyin Kang, Zhiyong Wu, Helen Meng • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Lyrics-to-vocals | Evaluation set without audio prompt (test) | Musicality | 3.98 | 7
Lyrics-to-song | Jukebox lyrics dataset | FAD | 2.14 | 6
Accompaniment-to-song | Accompaniment-to-song (test) | Musicality | 3.67 | 6
Vocals-to-song | held-out set (test) | Musicality | 3.77 | 6
Audio Synthesis | 20 generated audio samples (test) | RTF | 2.793 | 5
Music Continuation | Music Continuation Evaluation Set | Musicality | 3.97 | 4
Vocal Editing | Manually constructed song editing dataset (test) | Musicality | 3.68 | 4
Song editing | Manually constructed dataset of 30 song editing examples (test) | Musicality | 4.01 | 4
Prompt-based lyrics-to-vocals | held-out set (test) | SECS | 0.68 | 3
Lyrics-to-song | held-out set | Musicality Score | 4.01 | 3
Showing 10 of 12 rows
