
Visual Reasoning on HR-Bench 4K

Overall Score: 0.77

SubagentVL

[Chart: score over time, Sep 26, 2025 to May 12, 2026; y-axis ticks 0.53808, 0.59829, 0.6585, 0.71871]
Updated 21d ago

Evaluation Results

Date      Overall   Score 2   Score 3
2025.12   0.77      0.933     0.608
2025.12   0.751     0.92      0.583
2025.12   0.751     0.913     0.59
2025.12   0.739     0.898     0.58
2026.05   0.733     0.86      0.605
2026.05   0.7323    0.8633    0.6012
2025.09   0.73      -         -
2026.05   0.729     0.86      0.603
2026.05   0.719     0.855     0.583
2026.05   0.719     0.893     0.545
2026.05   0.719     0.842     0.595
2026.05   0.7179    0.8283    0.6075
2026.05   0.713     0.838     0.588
2025.09   0.711     -         -
2026.05   0.7088    0.8325    0.575
2026.05   0.7058    0.833     0.5786
2026.05   0.703     0.847     0.557
2025.09   0.7       -         -
2026.05   0.6997    0.8401    0.5593
2026.05   0.699     0.847     0.557
2025.09   0.698     -         -
2025.12   0.696     0.843     0.55
2026.05   0.691     0.813     0.57
2026.05   0.69      0.858     0.522
2025.12   0.688     0.852     0.522
2026.05   0.683     0.806     0.5603
2025.09   0.671     -         -
2025.09   0.669     -         -
2025.09   0.669     -         -
2025.09   0.665     -         -
2025.09   0.664     -         -
2025.09   0.661     -         -
2025.09   0.66      -         -
2025.09   0.656     -         -
2025.09   0.644     -         -
2025.09   0.628     -         -
2025.09   0.628     -         -
2025.09   0.624     -         -
2025.12   0.59      0.7       0.48
2026.05   0.59      0.7       0.48
2025.09   0.57      -         -
2026.05   0.547     0.6493    0.4451
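
In the entries that report three values, the first value appears to sit close to the mean of the other two. The page does not state this as a scoring rule, so the sketch below is only a sanity check on a handful of entries copied from the results. The names `ROWS`, `deviation`, and `max_deviation` are hypothetical helpers introduced for illustration.

```python
# Entries copied from the results: (first value, second value, third value).
# Assumption: the first value is an overall score and the other two are
# sub-scores; the page itself does not label the last two values.
ROWS = [
    (0.77, 0.933, 0.608),
    (0.739, 0.898, 0.58),
    (0.7179, 0.8283, 0.6075),
    (0.7088, 0.8325, 0.575),
    (0.59, 0.7, 0.48),
]


def deviation(overall, a, b):
    """Absolute gap between the first value and the mean of the other two."""
    return abs(overall - (a + b) / 2)


def max_deviation(rows):
    """Largest gap across all given entries."""
    return max(deviation(*row) for row in rows)


if __name__ == "__main__":
    for overall, a, b in ROWS:
        print(f"{overall:.4f}  mean={(a + b) / 2:.4f}  gap={deviation(overall, a, b):.4f}")
```

Most entries agree with the mean to within rounding, but a few (for example 0.7088 with 0.8325 and 0.575) differ by several thousandths, so the relationship should be treated as approximate.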