
AOKVQA

Benchmarks

Task Name                                    Dataset Name                                    SOTA Result         Trend
Object Hallucination Evaluation              AOKVQA POPE (random, popular, and adversarial)  Accuracy 82.23      24
Outside Knowledge Visual Question Answering  AOKVQA                                          AUC 86.17           21
Visual Question Answering                    AOKVQA                                          Accuracy 73.1       8
Multiple Choice Visual Question Answering    AOKVQA                                          Accuracy (MC) 68.9  6