Visual Grounded Reasoning on HR-Bench-8K

Overall Score: 76.3

Qwen2.5-VL-72B

[Chart: overall score over time, Jul 10, 2025 to Nov 27, 2025]
Updated 1mo ago

Evaluation Results

Date     Overall  Score 2  Score 3
2025.11  76.3     84.3     68.3
2025.07  76.3     84.3     68.3
2025.11  75       89.5     60.4
2025.11  73.1     86.5     59.8
2025.07  73.1     86.5     59.8
2025.11  72.6     86.8     58.5
2025.07  72.6     86.8     58.5
2025.11  71.6     86.5     56.8
2025.11  68.8     83.5     54
2025.07  68.8     83.5     54
2025.11  67.3     71.8     62.8
2025.07  67.3     71.8     62.8
2025.11  67       71.3     62.8
2025.11  66.9     80       54.3
2025.07  66.9     80       54.3
2025.11  62       64.3     59.8
2025.07  62       64.3     59.8
2025.11  60.9     68.8     53
2025.07  60.9     68.8     53
2025.11  59.8     65.3     54.3
2025.07  59.8     65.3     54.3