Audio Visual Question Answering on AVQA (Robustness Evaluation)

95.6 AVQA Clean Accuracy

Negative Language Modeling Loss

Updated 5mo ago

Evaluation Results

Method	Date	Links	AVQA Clean Accuracy	(column header missing)	(column header missing)
Negative Language Modeling Loss	2026.01		95.6	85	10.6
Encoder-Based Cosine Similarity Loss	2026.01		95.6	11	84.6
Vision Attention Suppression Loss	2026.01		95.6	92	3.6
Audio Attention Amplification Loss	2026.01		95.6	41	54.6
Attention Randomization Loss	2026.01		95.6	91	4.6
Hidden-State Similarity Loss	2026.01		95.6	79.5	16.1
Combined Loss (SOUNDBREAK)	2026.01		95.6	3.9	91.8
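
The page does not preserve labels for the last two numeric values in each row. A quick arithmetic check suggests the third value is the first (95.6, which matches the AVQA Clean Accuracy figure above) minus the second. The sketch below is only a consistency check on the numbers as listed. The variable names and the "gap" interpretation are assumptions, not labels taken from the source.

```python
# Rows copied from the table above: (method, first value, second value, third value).
# The meaning of the second and third values is NOT stated in the source;
# "gap" below is an assumed interpretation (first value minus second value).
rows = [
    ("Negative Language Modeling Loss", 95.6, 85.0, 10.6),
    ("Encoder-Based Cosine Similarity Loss", 95.6, 11.0, 84.6),
    ("Vision Attention Suppression Loss", 95.6, 92.0, 3.6),
    ("Audio Attention Amplification Loss", 95.6, 41.0, 54.6),
    ("Attention Randomization Loss", 95.6, 91.0, 4.6),
    ("Hidden-State Similarity Loss", 95.6, 79.5, 16.1),
    ("Combined Loss (SOUNDBREAK)", 95.6, 3.9, 91.8),
]


def gap(first: float, second: float) -> float:
    """Difference between the first and second values, rounded to one decimal."""
    return round(first - second, 1)


# Collect rows where the listed third value differs from the computed gap.
mismatches = [
    (name, gap(first, second), third)
    for name, first, second, third in rows
    if abs(gap(first, second) - third) > 0.05
]

for name, computed, listed in mismatches:
    print(f"{name}: computed {computed}, listed {listed}")
```

Six of the seven rows match exactly. The Combined Loss (SOUNDBREAK) row differs by 0.1 (91.7 computed against 91.8 listed), which would be consistent with the listed figures having been rounded from unrounded values, though the page does not confirm this.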