Video Background Music Generation with Controllable Music Transformer

About

In this work, we address the task of video background music generation. Some previous works achieve effective music generation but are unable to generate melodious music tailored to a particular video, and none of them considers the video-music rhythmic consistency. To generate the background music that matches the given video, we first establish the rhythmic relations between video and background music. In particular, we connect timing, motion speed, and motion saliency from video with beat, simu-note density, and simu-note strength from music, respectively. We then propose CMT, a Controllable Music Transformer that enables local control of the aforementioned rhythmic features and global control of the music genre and instruments. Objective and subjective evaluations show that the generated background music has achieved satisfactory compatibility with the input videos, and at the same time, impressive music quality. Code and models are available at https://github.com/wzk1015/video-bgm-generation.
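The three pairings above (timing with beat, motion speed with simu-note density, motion saliency with simu-note strength) can be made concrete with a small sketch. This is not taken from the CMT codebase: the function names, the fixed-tempo assumption, and the min-max quantization into discrete levels are illustrative assumptions only.

```python
def seconds_to_beat(t_seconds: float, tempo_bpm: float) -> float:
    """Convert a video timestamp to a fractional beat position.

    Assumes a constant tempo in beats per minute (an assumption of this sketch).
    """
    if tempo_bpm <= 0:
        raise ValueError("tempo_bpm must be positive")
    return t_seconds * tempo_bpm / 60.0


def quantize(values, n_levels):
    """Min-max scale a per-segment video feature into integer levels 0..n_levels-1.

    Hypothetical helper: a per-segment motion-speed value could be turned into a
    density level this way, and a motion-saliency value into a strength level.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no contrast; map everything to level 0.
        return [0 for _ in values]
    return [min(n_levels - 1, int((v - lo) / (hi - lo) * n_levels)) for v in values]


# Made-up example values, for illustration only.
beat = seconds_to_beat(2.0, 120.0)             # 2 s into the video at 120 BPM
density_levels = quantize([0.0, 0.5, 1.0], 4)  # motion speed per segment
```

Under these assumptions, a frame two seconds into the video at 120 BPM lands on beat 4, and faster segments receive higher density levels. The actual feature extraction and control tokens are defined in the linked repository.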

Shangzhe Di, Zeren Jiang, Si Liu, Zhaokai Wang, Leyan Zhu, Zexin He, Hongming Liu, Shuicheng Yan • 2021

Related benchmarks

Task                                  | Dataset           | Metric                       | Result | Rank
--------------------------------------|-------------------|------------------------------|--------|-----
Dance-to-Music                        | AIST++            | BCS                          | 97.1   | 22
Video-to-Music Generation             | V2M-bench (test)  | Fréchet Audio Distance (FAD) | 8.637  | 12
Dance-to-Music                        | AIST++ (test)     | BCS                          | 95.92  | 11
Video-to-Music Generation             | AIST++            | BCS                          | 0.3368 | 10
Video-to-Music Generation             | ReelBench         | IB                           | 0.1119 | 7
Video-to-Music Generation             | LORIS             | IB Score                     | 0.0831 | 7
Long-form Video Soundtrack Generation | LVS Benchmark     | ImageBind Avg                | 0.143  | 7
Video-to-Music Generation             | V2MBench          | IB                           | 0.159  | 7
Video Background Music Generation     | BGM909 (test)     | PCHE                         | 2.398  | 7
Video-to-Music Generation             | SymMV (test)      | PCHE                         | 2.444  | 5
Showing 10 of 12 rows
