Direct-answer Visual Question Answering on A-OKVQA
Top result: MC-CoT_Base, 68.7 Accuracy
[Chart: Accuracy over time, Nov 23, 2023 to Feb 4, 2026]
Evaluation Results
Method       Method details              Date     Accuracy
MC-CoT_Base  Vision Model=DETR [11]...   2023.11  68.7
LASER        Model=Qwen-VL, Decodin...   2026.02  62.82
VCD          Model=Qwen-VL, Decodin...   2026.02  61.34
ViCrop       Model=Qwen-VL, Decodin...   2026.02  60.12
Sample       Model=Qwen-VL, Decodin...   2026.02  59.64
BLIP-2       Vision Model=CLIP-VIT-...   2023.11  53.2
GPV-2        Vision Model=VinVL [59...   2023.11  48.6
IPVR         Vision Model=Faster-RC...   2023.11  46.4
PICa         Vision Model=VinVL [59...   2023.11  42.4
PaLM-COT     Text Model=PaLM [13],...    2023.11  41.5
KRISP        Vision Model=Faster R-...   2023.11  33.7
LXMERT       Vision Model=Transform...   2023.11  30.7
VILBERT      Vision Model=Faster R-...   2023.11  30.6
LASER        Model=LLaVA-1.5, Decod...   2026.02  28.18
VCD          Model=LLaVA-1.5, Decod...   2026.02  25.54
Pythia       Vision Model=ResNet [1...   2023.11  25.2
ViCrop       Model=LLaVA-1.5, Decod...   2026.02  23.95
Sample       Model=LLaVA-1.5, Decod...   2026.02  23.63