Vision-centric Reasoning on RealWorldQA
Top result: SAP, 73.3 Accuracy
[Chart: Accuracy over time, Nov 7, 2025 to Feb 18, 2026]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| SAP | Strategy=SAP | 2026.02 | 73.3 |
| Qwen3-VL-8B-Thinking | Strategy=LongCoT | 2026.02 | 72.8 |
| MiMo-VL-7B-RL | Open Model=Yes, Open D... | 2025.11 | 72.68 |
| MiMo-VL-7B-SFT | Open Model=Yes, Open D... | 2025.11 | 71.9 |
| Qwen3-VL-8B-Instruct | Strategy=Instruct | 2026.02 | 70.3 |
| Long Grounded Thoughts (SFT + GRPO) | Open Model=Yes, Open D... | 2025.11 | 69.02 |
| Long Grounded Thoughts (Multistage SFT + DPO) | Open Model=Yes, Open D... | 2025.11 | 68.76 |
| Grok-1.5V | | 2026.02 | 68.7 |
| Qwen2.5-VL-7B-Instruct | Open Model=Yes, Open D... | 2025.11 | 67.84 |
| Gemini-1.5 | | 2026.02 | 67.5 |
| Qwen2.5-VL-7B-Instruct + LongPerceptualThoughts | Open Model=Yes, Open D... | 2025.11 | 67.45 |
| Qwen2.5-VL-7B-Instruct + VLAA-Thinker | Open Model=Yes, Open D... | 2025.11 | 66.93 |
| Long Grounded Thoughts (SFT + DPO) | Open Model=Yes, Open D... | 2025.11 | 66.14 |
| Long Grounded Thoughts (SFT) | Open Model=Yes, Open D... | 2025.11 | 65.49 |
| Qwen2.5-VL-7B-Instruct + Revisual-R1-final | Open Model=Yes, Open D... | 2025.11 | 62.48 |
| GPT-4V | | 2026.02 | 61.4 |
| Claude-3-Sonnet | Table Header=Claude-S | 2026.02 | 51.9 |
| Claude-3-Opus | Table Header=Claude-O | 2026.02 | 49.8 |