
A Distribution Matching Approach to Neural Piano Transcription with Optimal Transport

About

This paper describes a novel paradigm that formalizes automatic piano transcription (APT) as an optimal transport (OT) problem rather than as a frame-level multi-label binary classification problem. Our method learns to minimize the cost of transporting a predicted distribution of note events to the ground-truth distribution over time and frequency. The OT loss can thus accommodate temporal misalignment, leading to perceptually relevant optimization. We also propose a convolutional recurrent neural network (CRNN) with a harmonics-aware attention mechanism to capture the spectro-temporal dependencies inherent in music. Experiments on the MAESTRO dataset showed that our method attained state-of-the-art performance in onset detection. We also confirmed the versatility of the OT loss by applying it to existing models.
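To illustrate why a transport cost tolerates small timing errors while a frame-level classification loss does not, here is a minimal sketch in NumPy. It is not the paper's loss: it assumes a single pitch, a one-dimensional time axis, unit-mass normalization, and an absolute-difference ground cost, for which the OT cost reduces to the summed absolute difference of the two cumulative distributions. The names `wasserstein_1d` and `frame_bce` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def wasserstein_1d(pred, target, frame_dur=1.0):
    """1-D OT cost (Wasserstein-1) between two activations on the same frame grid.

    Assumption for this sketch: both inputs are non-negative and are
    normalized to unit mass before comparison.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.sum() <= 0 or target.sum() <= 0:
        raise ValueError("both activations need positive total mass")
    p = pred / pred.sum()
    q = target / target.sum()
    # With ground cost |i - j| in one dimension, the optimal transport cost
    # equals the summed absolute difference of the cumulative distributions.
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum() * frame_dur)


def frame_bce(pred, target, eps=1e-7):
    """Mean frame-wise binary cross-entropy, the baseline being contrasted."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(losses.mean())


# Toy onset activations over 10 frames for one pitch (illustrative data only).
target = np.zeros(10)
target[4] = 1.0  # ground-truth onset at frame 4
near = np.zeros(10)
near[5] = 1.0    # prediction one frame late
far = np.zeros(10)
far[9] = 1.0     # prediction five frames late

print("OT cost, near:", wasserstein_1d(near, target))
print("OT cost, far: ", wasserstein_1d(far, target))
print("BCE, near:    ", frame_bce(near, target))
print("BCE, far:     ", frame_bce(far, target))
```

In this toy case the transport cost grows with the size of the timing error (1 frame versus 5 frames), whereas the frame-wise cross-entropy assigns both predictions essentially the same penalty. A loss over both time and frequency, as the abstract describes, would need a two-dimensional ground cost and typically an approximate solver, which this sketch does not attempt.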

Weixing Wei, Raynaldi Lalang, Dichucheng Li, Kazuyoshi Yoshii • 2026

Related benchmarks

Task | Dataset | Result | Rank
Automatic Piano Transcription (Onset & Offset) | MAESTRO | Precision 91.56 | 5
Music Transcription | MAESTRO | Precision 99.16 | 5
