| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Referring Audio-Visual Segmentation | Ref-AVS | Seen Score | 4,054 | 30 |
| Referring Audio-Visual Segmentation | Ref-AVS (mix) | Jaccard Index (J) | 68.9 | 28 |
| Referring Audio-Visual Segmentation | Ref-AVS (unseen) | Jaccard Index (J) | 73.2 | 28 |
| Referring Audio-Visual Segmentation | Ref-AVS (seen) | J & F Score | 0.688 | 28 |
| Referring Audio-Visual Segmentation | Ref-AVS 1.0 (Mix (S+U)) | Jaccard Index (J) | 55 | 12 |
| Referring Audio-Visual Segmentation | Ref-AVS 1.0 (unseen) | Jaccard Index (J) | 66.5 | 12 |
| Referring Audio-Visual Segmentation | Ref-AVS 1.0 (seen) | Jaccard Index (J) | 43.5 | 12 |
| Referring Audio-Visual Segmentation | Ref-AVS 1.0 | S-score | 0.23 | 7 |
| Referring Audio-Visual Segmentation | Ref-AVS (test) | S-score | 0.01 | 5 |
| Referring Audio-Visual Segmentation | Ref-AVS Unseen (test) | mIoU | 49.54 | 5 |
| Referring Audio-Visual Segmentation | Ref-AVS Seen (test) | mIoU | 4,054 | 5 |