EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

About

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy Normalization (PCEN), has shown promising results, but is computationally expensive. With inhomogeneous convolution kernel sizes and strides, and by replacing PCEN with better parallelizable operations, we can reach similar results more efficiently. In experiments on six audio classification tasks, our frontend matches the accuracy of LEAF at 3% of the cost, but both fail to consistently outperform a fixed mel filterbank. The quest for learnable audio frontends is not solved.
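For readers unfamiliar with PCEN, the following is a minimal sketch of the standard formulation applied to a single filterbank channel. It is not the authors' code: the function name `pcen_channel`, the default parameter values, and the choice to initialise the smoother with the first frame are all assumptions made for illustration. The smoother is a recursion over time, so each output depends on the previous step, which may be part of why the abstract describes the replacement operations as better parallelizable.

```python
def pcen_channel(energy, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-12):
    """Apply PCEN to one channel, given a sequence of non-negative energies.

    Parameter defaults are illustrative assumptions, not values from this page.
    """
    out = []
    smoothed = None
    for e in energy:
        # First-order IIR smoother M[t] = (1 - s) * M[t-1] + s * E[t].
        # Assumption: the smoother starts at the first frame's energy.
        # Each step needs the previous one, so this loop is sequential in time.
        smoothed = e if smoothed is None else (1.0 - s) * smoothed + s * e
        # Adaptive gain control (division by the smoothed energy),
        # followed by root compression with offset delta and exponent r.
        out.append((e / (eps + smoothed) ** alpha + delta) ** r - delta ** r)
    return out


if __name__ == "__main__":
    # A constant signal is normalised to a constant output.
    print(pcen_channel([1.0, 1.0, 1.0]))
```

In practice this would be vectorised across channels and batches, but the loop over time frames remains unless the smoother is replaced by something else.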

Jan Schlüter, Gerald Gutenbrunner • 2022

Related benchmarks

Task                               Dataset                       Result          Rank
Musical Instrument Classification  NSynth                        Accuracy 71.7   75
Audio Classification               CREMA-D                       Accuracy 60.2   15
Audio Classification               NSynth Pitch                  Accuracy 92.7   6
Audio Classification               SpeechCommands v1 v2 (test)   Accuracy 95.3   5
Audio Classification               BirdCLEF 2021                 Accuracy 42.9   5
Audio Classification               VoxForge                      Accuracy 91.4   5
