
Overall Audio-Visual Question Answering on MUSIC-AVQA (test)

71.52 Overall Accuracy

Spatio-Temporal Grounded Audio-Visual Network

[Chart: Overall Accuracy over time, Mar 26, 2022 to Feb 8, 2024]
Updated 4d ago

Evaluation Results

Overall Accuracy    Date
71.52               2022.03
68.93               2022.03
67.44               2022.03
67.07               2022.03
66.54               2022.03
66.45               2022.03
65.49               2022.03
63.65               2022.03
62.3                2022.03
60.34               2022.03
59.92               2022.03
55.73               2024.02
52.6                2024.02
51                  2024.02
48.4                2024.02
44.5                2024.02
43.5                2024.02
42.3                2024.02
34.8                2024.02
31                  2024.02
22.7