Audio Visual Question Answering on AVQA (Robustness Evaluation)
[Chart: AVQA Clean Accuracy by date. Plotted entry: Negative Language Modeling Loss, 95.6, Jan 20, 2026. Selectable metrics: AVQA Clean Accuracy, AVQA Attack Accuracy, AVQA Accuracy Drop.]
Updated 4d ago
Evaluation Results
Method                                Objective       Date     AVQA Clean Accuracy  AVQA Attack Accuracy  AVQA Accuracy Drop
Negative Language Modeling Loss       L_negLM         2026.01  95.6                 85                    10.6
Encoder-Based Cosine Similarity Loss  L^(cos)         2026.01  95.6                 11                    84.6
Vision Attention Suppression Loss     L^(visionatt)   2026.01  95.6                 92                    3.6
Audio Attention Amplification Loss    L^(audioatt)    2026.01  95.6                 41                    54.6
Attention Randomization Loss          L^(randatt)     2026.01  95.6                 91                    4.6
Hidden-State Similarity Loss          L^(hidden-cos)  2026.01  95.6                 79.5                  16.1
Combined Loss (SOUNDBREAK)            L^(combined)    2026.01  95.6                 3.9                   91.8
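
The AVQA Accuracy Drop values listed above appear to be the clean accuracy minus the attack accuracy. The page does not state this definition, so treat it as an assumption. The short Python sketch below recomputes the drop from the listed clean and attack numbers and compares it with the reported value. The names `ROWS`, `accuracy_drop`, and `check_rows` are illustrative and not part of the benchmark.

```python
# Rows copied from the results listing:
# (method, clean accuracy, attack accuracy, reported accuracy drop)
ROWS = [
    ("Negative Language Modeling Loss", 95.6, 85.0, 10.6),
    ("Encoder-Based Cosine Similarity Loss", 95.6, 11.0, 84.6),
    ("Vision Attention Suppression Loss", 95.6, 92.0, 3.6),
    ("Audio Attention Amplification Loss", 95.6, 41.0, 54.6),
    ("Attention Randomization Loss", 95.6, 91.0, 4.6),
    ("Hidden-State Similarity Loss", 95.6, 79.5, 16.1),
    ("Combined Loss (SOUNDBREAK)", 95.6, 3.9, 91.8),
]


def accuracy_drop(clean, attack):
    """Assumed definition: drop = clean accuracy - attack accuracy (one decimal)."""
    return round(clean - attack, 1)


def check_rows(rows, tol=0.1):
    """Return (method, computed, reported) for rows differing by more than tol."""
    mismatches = []
    for method, clean, attack, reported in rows:
        computed = accuracy_drop(clean, attack)
        # Small epsilon guards against floating-point noise at the tolerance edge.
        if abs(computed - reported) > tol + 1e-9:
            mismatches.append((method, computed, reported))
    return mismatches


if __name__ == "__main__":
    for method, clean, attack, reported in ROWS:
        computed = accuracy_drop(clean, attack)
        print(f"{method}: computed {computed}, reported {reported}")
    print("Mismatches beyond tolerance:", check_rows(ROWS))
```

Under this assumed definition, six of the seven rows reproduce the reported drop exactly. The Combined Loss (SOUNDBREAK) row computes to 91.7 against a reported 91.8, a 0.1 difference that may come from rounding of unrounded source values. The default tolerance of 0.1 is chosen to accept that gap.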