
Spatial Relationship Reasoning on SPAR-Bench

59.9 Accuracy (Avg)

GAMSI_S1+S2

[Chart: accuracy over time, May 19 to May 25, 2026]
Updated 8d ago

Evaluation Results

Date      Accuracy (Avg)   Additional scores
2026.05   59.9             -      -      -
2026.05   49.6             -      -      -
2026.05   48.9             -      -      -
2026.05   47.2             -      -      -
2026.05   41.1             -      -      -
-         39.9             -      -      -
-         39.1             -      -      -
2026.05   38.1             -      -      -
2026.05   38               -      -      -
2026.05   37.3             -      -      -
2026.05   37.3             -      -      -
2026.05   36.4             29.3   24.9   45.1
-         35.8             -      -      -
2026.05   35.3             -      -      -
-         33.8             -      -      -
-         33.3             -      -      -
2026.05   32.9             -      -      -
2026.05   32.9             -      -      -
2026.05   31.2             21.8   26.1   40.1
2026.05   30.6             25.7   29.8   35.2
2026.05   30.2             17.5   29.5   41.8
-         30               -      -      -
2026.05   29.4             26.4   18.8   36.2
2026.05   27.6             13     31.3   42.6
-         27.2             -      -      -
2026.05   24.6             15.3   26.4   32.2