AudioSet

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio Classification | AudioSet 20K | mAP 47.8 | 128 |
| Audio Classification | AudioSet 2M | mAP 50.5 | 79 |
| 1D audio reconstruction | AudioSet | NMSE 0.006 | 63 |
| Classification | AudioSet (test) | mAP 49.6 | 57 |
| Sound Classification | AudioSet (evaluation) | mAP 47.1 | 39 |
| Audio Reconstruction | AudioSet (eval) | Mel Distance 0.382 | 35 |
| Acoustic event detection | AudioSet (test) | mAP 0.462 | 34 |
| Audio Event Tagging | AudioSet AS-2M (full) | mAP 50.2 | 33 |
| Audio Classification | AudioSet-2M (full) | mAP 48.6 | 32 |
| Audio Classification | AudioSet | mAP 48.5 | 25 |
| Audio Event Tagging | AudioSet (AS-20K) | mAP 46.7 | 24 |
| Audio Classification | AudioSet Full (test) | mAP 45.9 | 23 |
| Classification | AudioSet AS-2M | mAP (%) 50.2 | 21 |
| Generalized Zero-Shot Retrieval (Text-to-Audio) | AudioSet ZSL (test) | mAP (S) 72.25 | 19 |
| Sound Event Detection | AudioSet Strongly-labeled (test) | PSDS1 (w/o var-pen) 0.374 | 18 |
| Audio-visual event classification | AudioSet 2M | mAP (Audio-only) 49.1 | 16 |
| Generalized Zero-Shot Classification | AudioSet ZSL (test) | mAcc (Seen) 50.96 | 16 |
| Audio Reconstruction | AudioSet (test) | Mel Distance (44kHz) 0.417 | 15 |
| Audio Tagging | AudioSet (test) | mAP 50 | 14 |
| Audio Classification | AudioSet-20K (test) | mAP 37.4 | 13 |
| Audio Classification | AudioSet (balanced) | mAP 37.8 | 13 |
| Sound Event Detection | AudioSet Strong (407 classes) | PSDS1A 0.496 | 12 |
| Audio-visual classification | AudioSet | Top-1 Accuracy 55.85 | 12 |
| Audio Classification | AudioSet 20K v1 | mAP 41.9 | 11 |
| Audio-visual event classification | AudioSet 20K | mAP (Audio-only) 42.4 | 11 |
Showing 25 of 71 rows