| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Audio-Visual Hallucination | EgoAVU-Bench | Accuracy 61.69 | 9 |
| Temporal Reasoning | EgoAVU-Bench | Accuracy 67.84 | 9 |
| Audio-Visual Segment Narration | EgoAVU-Bench | S Score 2.63 | 9 |
| Audio-Visual Dense Narration | EgoAVU-Bench | S Score 2.66 | 9 |
| Source-Sound Association | EgoAVU-Bench | S Score 3.2 | 9 |