
Vision-centric Reasoning on RealWorldQA

75.4 Accuracy

GPT-4o

[Chart: accuracy over time, Sep 29, 2025 to Feb 27, 2026]
Updated 1mo ago

Evaluation Results

Date       Accuracy
2025.09    75.4
2026.02    73.3
2025.09    73.2
2026.02    72.8
2025.11    72.68
2025.11    71.9
2026.02    70.3
2025.11    69.02
2025.09    68.9
2025.11    68.76
2026.02    68.7
2025.09    68.5
2025.09    68.4
2025.09    68.1
2025.11    67.84
2026.02    67.5
2025.11    67.45
2025.11    66.93
2025.09    66.5
2025.11    66.14
2025.09    66.1
2025.11    65.49
2025.09    65.4
2025.11    62.48
2026.02    61.4
2026.02    61.2
2025.09    60.2
2026.02    59.1
2026.02    58
2025.09    58
2026.02    57.3
2026.02    56.9
2026.02    55.4
2026.02    51.9
2026.02    49.8
2026.02    47.2
2026.02    44.9
2026.02    44.6