NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
About
We introduce NotaGen, a symbolic music generation model aiming to explore the potential of producing high-quality classical sheet music. Inspired by the success of Large Language Models (LLMs), NotaGen adopts pre-training, fine-tuning, and reinforcement learning paradigms (henceforth referred to as the LLM training paradigms). It is pre-trained on 1.6M pieces of music in ABC notation, and then fine-tuned on approximately 9K high-quality classical compositions conditioned on "period-composer-instrumentation" prompts. For reinforcement learning, we propose the CLaMP-DPO method, which further enhances generation quality and controllability without requiring human annotations or predefined rewards. Our experiments demonstrate the efficacy of CLaMP-DPO in symbolic music generation models with different architectures and encoding schemes. Furthermore, in subjective A/B tests against human compositions, NotaGen outperforms baseline models, greatly advancing musical aesthetics in symbolic music generation.
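The abstract names a DPO-based reinforcement learning stage but does not spell out its objective. As a rough illustration only, the sketch below computes the generic Direct Preference Optimization loss for a single preference pair. It assumes CLaMP-DPO optimizes this standard objective on automatically selected (chosen, rejected) pairs, which the abstract does not confirm. The function name `dpo_loss`, its parameters, and the value `beta=0.1` are hypothetical and are not taken from the NotaGen code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Generic DPO loss for one (chosen, rejected) pair.

    All four inputs are sequence log-probabilities (floats).
    This is the standard DPO objective, not necessarily the exact
    CLaMP-DPO variant; `beta` is an assumed hyperparameter.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # measured relative to the frozen reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) == softplus(-z), computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy favouring the chosen sample gives a lower loss than the reverse.
    print(dpo_loss(-5.0, -20.0, -10.0, -10.0))
    print(dpo_loss(-20.0, -5.0, -10.0, -10.0))
```

In this formulation no explicit reward model is trained; the preference signal enters only through which sample is labeled chosen and which rejected.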
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Orchestral Music Generation | SymphonyNet (val) | CLaMP | 0.387 | 7 |
| Music Continuation | LMD + MuseScore Piano (test) | JS Divergence (GC) | 0.677 | 6 |
| Music Continuation | Lakh MIDI Dataset (LMD) Multi-track (test) | JS Divergence (GC) | 0.594 | 5 |
| Orchestral Composition | Orchestral Composition Educated Listeners | Quality | 3.23 | 4 |
| Orchestral Composition | Orchestral Composition General Listeners | Quality Score | 3.23 | 4 |