Variable-Length Audio Fingerprinting

About

Audio fingerprinting converts audio into much lower-dimensional representations, allowing distorted recordings to still be recognized as their originals through similar fingerprints. Existing deep learning approaches rigidly fingerprint fixed-length audio segments, thereby neglecting temporal dynamics during segmentation. To address the limitations of this rigidity, we propose Variable-Length Audio FingerPrinting (VLAFP), a novel method that supports variable-length fingerprinting. To the best of our knowledge, VLAFP is the first deep audio fingerprinting model capable of processing audio of variable length for both training and testing. Our experiments show that VLAFP outperforms existing state-of-the-art methods in live audio identification and audio retrieval across three real-world datasets.
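
To make the fixed-length rigidity mentioned in the abstract concrete, the following is a minimal sketch of how a fixed-length pipeline might cut audio into windows before fingerprinting. It is not VLAFP or any specific baseline: the function name `fixed_length_segments` and the parameters `segment_len` and `hop` are hypothetical and chosen only for illustration.

```python
def fixed_length_segments(samples, segment_len, hop):
    """Split a sample sequence into equal-length windows.

    Window boundaries depend only on segment_len and hop, never on the
    audio content. A trailing remainder shorter than segment_len is dropped.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    last_start = len(samples) - segment_len
    return [samples[start:start + segment_len]
            for start in range(0, last_start + 1, hop)]


# Stand-in for 10 audio samples; real input would be a waveform or spectrogram.
audio = list(range(10))
segments = fixed_length_segments(audio, segment_len=4, hop=2)
```

In this sketch every segment has exactly `segment_len` samples, and any input shorter than one window yields no segments at all, which is the kind of constraint a variable-length method is meant to lift.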

Hongjie Chen, Hanyu Meng, Huimin Zeng, Ryan A. Rossi, Lie Lu, Josh Kimball • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Dummy-Target Retrieval | FMA | Top-1 Hit Rate: 99.2 | 36 |
| Audio Fingerprinting | BAF | Precision: 40.87 | 6 |
| Commercial-Broadcast Retrieval | FMA | Precision: 81 | 6 |
| Commercial-Broadcast Retrieval | LibriSpeech | Precision: 50.19 | 6 |
| Audio Fingerprinting | FMA | Params (M): 12.2 | 6 |
| Commercial-Broadcast Retrieval | AudioSet | Precision: 49.58 | 6 |
| Audio Fingerprinting | FMA CBR Commercial | Segment Count: 1.20e+4 | 6 |
| Audio Fingerprinting | FMA CBR Broadcast | Segment Count: 2.20e+5 | 6 |
| Audio Fingerprinting | FMA DTR (Dummy) | Segment Count: 5.81e+5 | 6 |
| Audio Fingerprinting | FMA DTR (Target) | Segment Count: 3.00e+4 | 6 |
Showing 10 of 11 rows
