
Visual Reasoning on HR-Bench 8K

72.6 Overall Score

DeepEyes

[Chart: overall score over time, Sep 26, 2025 to May 12, 2026]
Updated 21d ago

Evaluation Results

Date      Overall   FSP     FCP
2025.12   72.6      86.8    58.5
2025.12   72.6      87      58.3
2025.12   71        85      57
2025.12   70.4      84.5    56.3
2025.09   69.9      -       -
2026.05   69.7      85.4    54
2025.12   69.3      85.5    50
2026.05   68.8      78.8    58.8
2026.05   68.52     79.05   57.98
2026.05   68.4      79.3    57.5
2025.09   67.1      -       -
2026.05   67.01     78.59   55.43
2026.05   66.9      80      54.3
2026.05   66.9      77.7    56.2
2025.09   66.1      -       -
2026.05   66        80.3    51.8
2025.09   65.9      -       -
2025.09   65.8      -       -
2025.12   65.3      78.8    51.8
2026.05   65.1      77      53.3
2026.05   64.7      74.69   54.7
2026.05   64.33     74.42   54.24
2026.05   63.6      73.3    54
2026.05   63.5      75      52
2026.05   63.5      76.5    50.5
2026.05   63.05     72.67   53.42
2026.05   62.9      77.5    48.3
2025.09   61.3      -       -
2025.09   61.1      -       -
2025.09   60.4      -       -
2025.09   60.1      -       -
2025.09   60        -       -
2025.09   59.9      -       -
2025.09   58.3      -       -
2025.09   57.3      -       -
2025.09   57.3      -       -
2025.09   56.5      -       -
2025.12   55.5      62      49
2026.05   55.5      62      49
2025.09   54.4      -       -
2026.05   51.12     57.08   45.12
2025.09   49.9      -       -