Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Vision-centric Reasoning on RealWorldQA

75.4Accuracy

GPT-4o

43.36851.6846068.316Mar 4, 2025May 12, 2025Jul 21, 2025Sep 29, 2025Dec 8, 2025Feb 16, 2026Apr 27, 2026
Updated 16d ago

Evaluation Results

MethodLinks
2025.09
75.4
2026.02
73.3
2025.09
73.2
2026.02
72.8
2025.11
72.68
2025.11
71.9
2026.02
70.3
2025.11
69.02
2025.09
68.9
2025.11
68.76
2026.02
68.7
2025.09
68.5
2025.09
68.4
2025.09
68.1
2025.11
67.84
2026.02
67.5
2025.11
67.45
2026.04
67.4
2026.04
67
2025.11
66.93
2025.09
66.5
2026.04
66.5
2026.04
66.4
2025.11
66.14
2025.09
66.1
2025.11
65.49
2025.09
65.4
2026.04
64.9
2026.04
64.1
2026.04
64
2026.04
64
2026.04
63.4
2025.03
62.9
2025.11
62.48
2026.04
62.3
2026.04
62.1
2026.04
62
2025.03
61.6
2026.04
61.5
2026.02
61.4
2026.02
61.2
2025.03
60.7
2026.04
60.6
2025.09
60.2
2026.04
59.5
2026.02
59.1
2025.03
58.8
2026.04
58.7
2026.02
58
2025.09
58
2026.04
57.9
2026.02
57.3
2026.02
56.9
2025.03
56.9
2026.04
55.8
2025.03
55.8
2026.02
55.4
2025.03
54.8
2025.03
53.2
2026.02
51.9
2026.02
49.8
2025.03
49.7
2026.04
49.2
2026.02
47.2
2026.02
44.9
2026.02
44.6