| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio-Visual Segmentation | AVS-Bench S4 (test) | mIoU 82.1 | 9 |
| Audio-Visual Segmentation | AVS-Bench AVSS (test) | mIoU 29.8 | 5 |
| Audio-Visual Segmentation | AVS-Bench | MS3 58.4 | 4 |
| Spatial Localization | AVS-Bench S4 (ARIG) (test) | cIoU 41.78 | 4 |
| Multi-source Sound Source Segmentation | AVS-Bench MS3 | mIoU 58.21 | 2 |
| Audio-Visual Semantic Segmentation | AVS-Bench AVSS | mIoU 26.52 | 1 |