A-OKVQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | A-OKVQA | Acc 92.68 | 175 |
| Visual Question Answering | A-OKVQA (test) | Accuracy 85.7 | 79 |
| Visual Question Answering | A-OKVQA (val) | Accuracy 0.81 | 56 |
| Multi-choice Visual Question Answering | A-OKVQA | Accuracy 82.71 | 49 |
| VLM Editing | A-OKVQA 2022 (test) | Accuracy 100 | 48 |
| Object Hallucination Evaluation | A-OKVQA POPE (Popular) | Accuracy 87.71 | 36 |
| Object Hallucination Evaluation | A-OKVQA POPE (Random) | Accuracy 89.5 | 36 |
| Object Hallucination Probing | A-OKVQA (Adversarial split) | Accuracy 79.1 | 27 |
| Visual Question Answering (Multi-choice) | A-OKVQA (test) | Accuracy 80.2 | 19 |
| Object Hallucination Assessment | A-OKVQA POPE (Adversarial) | Accuracy 0.8126 | 18 |
| Direct-answer Visual Question Answering | A-OKVQA | Accuracy 68.7 | 18 |
| Visual Question Answering | A-OKVQA Open-Ended | Accuracy 72.14 | 15 |
| Visual Question Answering | A-OKVQA v1.0 (test) | Accuracy 53.36 | 14 |
| Object Hallucination Probing | A-OKVQA (Random split) | Accuracy 90.83 | 12 |
| Direct-Answer | A-OKVQA 1.0 (test) | Accuracy 68 | 12 |
| Polling-based Object Probing Evaluation (POPE) | A-OKVQA POPE (Adversarial) | Accuracy 81.94 | 12 |
| Polling-based Object Probing Evaluation (POPE) | A-OKVQA POPE Popular | Accuracy 0.8813 | 12 |
| Polling-based Object Probing Evaluation (POPE) | A-OKVQA POPE Random | Accuracy 89.6 | 12 |
| Speech-Visual Question Answering | A-OKVQA Speech-converted | Accuracy 0.2001 | 12 |
| Direct-Answer | A-OKVQA 1.0 (val) | Accuracy 0.683 | 11 |
| Multiple-Choice | A-OKVQA 1.0 (test) | Accuracy 86.7 | 9 |
| Multiple-Choice | A-OKVQA 1.0 (val) | Accuracy 87.7 | 9 |
| Object Hallucination | A-OKVQA POPE (test) | Accuracy (Random) 90.13 | 8 |
| Knowledge-based Visual Question Answering | A-OKVQA (val) | Accuracy 68.9 | 8 |
| Direct Answer Visual Question Answering | A-OKVQA (test) | Accuracy 59.6 | 7 |
Showing 25 of 33 rows