
Audio-Visual Segmentation

About

We propose to explore a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark (AVSBench), providing pixel-wise annotations for the sounding objects in audible videos. Two settings are studied with this benchmark: 1) semi-supervised audio-visual segmentation with a single sound source and 2) fully-supervised audio-visual segmentation with multiple sound sources. To deal with the AVS problem, we propose a novel method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage the audio-visual mapping during training. Quantitative and qualitative experiments on the AVSBench compare our approach to several existing methods from related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench.

Jinxing Zhou, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong • 2022
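The abstract describes injecting audio semantics into the visual segmentation process through pixel-wise audio-visual interaction. The following is a minimal NumPy sketch of that general idea, not the authors' module: each pixel feature attends over per-frame audio embeddings, and the attended audio is added back as a residual. The function name `audio_guided_features`, the tensor shapes, the softmax normalization, and the residual addition are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def audio_guided_features(visual, audio):
    """Hypothetical sketch of pixel-wise audio-visual interaction.

    visual: array of shape (T, H, W, C), per-frame visual feature maps.
    audio:  array of shape (T, C), one audio embedding per frame.
    Returns an array of shape (T, H, W, C).
    """
    visual = np.asarray(visual, dtype=float)
    audio = np.asarray(audio, dtype=float)
    T, H, W, C = visual.shape

    # Flatten every pixel of every frame into one row.
    pixels = visual.reshape(T * H * W, C)

    # Scaled dot-product similarity between each pixel and each audio frame.
    scores = pixels @ audio.T / np.sqrt(C)          # (T*H*W, T)

    # Softmax over audio frames (assumed normalization, numerically stabilized).
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Weighted sum of audio embeddings, added to the pixels as a residual.
    attended = weights @ audio                      # (T*H*W, C)
    return (pixels + attended).reshape(T, H, W, C)
```

In this sketch the output keeps the shape of the visual features, so it could be passed to any downstream segmentation decoder unchanged. Attending across all `T` audio frames is one way to make the interaction temporal.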

Related benchmarks

Task | Dataset | Metric | Result | Rank
Audio-Visual Segmentation | AVSBench S4 v1 (test) | MJ | 78.7 | 55
Audio-Visual Segmentation | AVSBench MS3 v1 (test) | Mean Jaccard | 54 | 37
Audio-Visual Segmentation | AVSBench MS3 (test) | Jaccard Index (IoU) | 54 | 30
Audio-Visual Semantic Segmentation | AVSBench AVSS v1 (test) | MJ | 29.8 | 29
Audio-Visual Segmentation | AVSBench AVS-Objects-S4 | J&F Score | 83.3 | 21
Audio-Visual Segmentation | AVSBench AVS-Objects-MS3 | J & F Score | 59.3 | 21
Audio-Visual Segmentation | AVS-Object S4 | J&Fm | 83.3 | 19
Audio-Visual Segmentation | AVS-Object MS3 | J&Fm Combined Score | 59.3 | 19
Audio-Visual Segmentation | AVSBench S4 (test) | MJ | 78.7 | 16
Audio-Visual Segmentation | VPO-SS 1.0 (test) | J & FB Score | 44.63 | 16
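Several rows above report Jaccard-style metrics (Mean Jaccard, Jaccard Index (IoU), and MJ, which presumably abbreviates mean Jaccard). As a rough illustration of what such a score measures, the sketch below computes the Jaccard index (intersection over union) between a predicted and a ground-truth binary mask and averages it over a set of frames. The function names are hypothetical, and treating two empty masks as a perfect match is an assumed convention. The benchmarks' official evaluation scripts may differ in thresholding and averaging.

```python
import numpy as np

def jaccard_index(pred, target):
    """Intersection over union of two binary masks with the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: both masks empty counts as a perfect match.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)

def mean_jaccard(preds, targets):
    """Average the per-frame Jaccard index over paired masks."""
    scores = [jaccard_index(p, t) for p, t in zip(preds, targets)]
    return float(np.mean(scores))
```

For example, a prediction covering two pixels of which one overlaps a one-pixel ground-truth mask has an intersection of 1 and a union of 2, giving a Jaccard index of 0.5. The table reports such scores on a 0 to 100 scale.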
