
Visual Reasoning on VLMs are Blind (Accuracy)

77.8 Accuracy

Kimi-K2.5

[Chart: accuracy over time. Y-axis ticks: 35.16, 46.23, 57.3, 68.37. X-axis: Nov 20, 2025 to May 21, 2026.]
Updated 12d ago

Evaluation Results

Date      Accuracy
2026.05   77.8
2025.11   74.91
2026.05   74.6
2026.05   73.6
2025.11   72.32
2025.11   72.1
2026.05   72.1
2026.05   71.1
2026.05   69.6
2026.05   69.1
2026.05   68.4
2026.05   67.8
2026.05   66.7
2026.05   55.1
2026.05   50.9
2025.11   49.8
2026.05   48.9
2026.05   48.9
2026.05   48.4
2026.05   48.4
2026.05   48.1
2026.05   48.1
2026.05   44.6
2026.05   42.6
2026.05   42.1
2026.05   39.4
2025.11   37.4
2025.11   36.8