
Spatial Reasoning on HR-Bench 8K

73.3 HR-8K Average Accuracy

Mini-o3

Apr 21, 2026
Updated 1mo ago

Evaluation Results

Date      HR-8K Avg.
2026.04   73.3        -      -
2026.04   73.2        88.1   58.3
2026.04   72.7        86.8   58.5
2026.04   69.3        88.5   50
2026.04   66.9        -      -
2026.04   65.3        78.8   51.8
2026.04   -           36.2   -
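
The listing does not label the two values that follow the headline score in each entry. The sketch below is a quick sanity check, not part of the leaderboard: it assumes the first value is the average of the other two and tests that assumption on the four entries that report all three numbers. The function name `is_mean_of_parts` and the tolerance are illustrative choices. The tolerance only absorbs one-decimal rounding.

```python
# Entries copied from the results listing that report all three values:
# (first value, second value, third value).
rows = [
    (73.2, 88.1, 58.3),
    (72.7, 86.8, 58.5),
    (69.3, 88.5, 50.0),
    (65.3, 78.8, 51.8),
]


def is_mean_of_parts(avg, a, b, tol=0.051):
    """Return True if avg matches the mean of a and b up to one-decimal rounding.

    Assumption: the first value is a plain (unweighted) mean of the other two.
    """
    return abs((a + b) / 2 - avg) <= tol


# One boolean per entry; all True means the assumption holds for these rows.
checks = [is_mean_of_parts(*row) for row in rows]
print(all(checks))
```

If the check passes, the first value in each full entry is consistent with being the mean of the two values after it. The check cannot say which sub-metric each of those two values represents.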