A-OKVQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Visual Question Answering | A-OKVQA | Acc 92.68 | 228 |
| Visual Question Answering | A-OKVQA (test) | Accuracy 90.56 | 103 |
| Visual Question Answering | A-OKVQA (val) | Accuracy 79.5 | 92 |
| Object Hallucination Evaluation | A-OKVQA POPE (Popular) | Accuracy 90.3 | 76 |
| Object Hallucination Evaluation | A-OKVQA POPE (Random) | Accuracy 92.1 | 60 |
| Multi-choice Visual Question Answering | A-OKVQA | Accuracy 82.71 | 49 |
| VLM Editing | A-OKVQA 2022 (test) | Accuracy 100 | 48 |
| Object Hallucination Assessment | A-OKVQA POPE (Adversarial) | Accuracy 0.8363 | 42 |
| Visual Reasoning | A-OKVQA | ECE 5.4 | 32 |
| Object Hallucination | A-OKVQA POPE (test) | Accuracy (Random) 90.13 | 29 |
| Visual Question Answering (Multi-choice) | A-OKVQA (test) | Accuracy 87.2 | 28 |
| Object Hallucination Probing | A-OKVQA (Adversarial split) | Accuracy 79.1 | 27 |
| Direct Answer Visual Question Answering | A-OKVQA (test) | Accuracy 69 | 22 |
| Object Hallucination Evaluation | A-OKVQA POPE | Random Accuracy 92.37 | 21 |
| Direct-answer Visual Question Answering | A-OKVQA | Accuracy 68.7 | 18 |
| Visual Question Answering | A-OKVQA POPE Evaluation (Adversarial) | Accuracy 82 | 16 |
| Visual Question Answering | A-OKVQA POPE (Popular) | Accuracy 89.77 | 16 |
| Visual Question Answering | A-OKVQA POPE Evaluation (Random) | Accuracy 90.03 | 16 |
| Hallucination Evaluation | A-OKVQA | Accuracy (Random) 93.76 | 15 |
| Visual Question Answering | A-OKVQA Open-Ended | Accuracy 72.14 | 15 |
| Visual Question Answering | A-OKVQA v1.0 (test) | Accuracy 53.36 | 14 |
| Object Hallucination Probing | A-OKVQA (Random split) | Accuracy 90.83 | 12 |
| Direct-Answer | A-OKVQA 1.0 (test) | Accuracy 68 | 12 |
| Polling-based Object Probing Evaluation (POPE) | A-OKVQA POPE (Adversarial) | Accuracy 81.94 | 12 |
| Polling-based Object Probing Evaluation (POPE) | A-OKVQA POPE Popular | Accuracy 0.8813 | 12 |
Showing 25 of 43 rows