
Spatial Reasoning on MMSI-Bench

97.2 Accuracy

Human

[Chart: accuracy over time, Aug 27, 2025 to May 28, 2026]
Updated 5d ago

Evaluation Results

Method  Date     Accuracy  Links
Human   2026.02  97.2      -
-       -        45.2      -
-       2026.02  41.8      -
-       2026.04  41.8      -
-       2026.02  38.3      -
-       2026.04  38.3      -
-       -        38        -
-       -        38        -
-       2026.03  32.6      -
-       2026.02  32        -
-       2026.03  32        -
-       2026.02  31.7      -
-       2026.03  31.1      -
-       2026.05  31.1      -
-       -        31        -
-       2026.04  31        -
-       2026.02  30.9      -
-       2026.05  30.3      -
-       2026.02  30.2      -
-       2026.03  30.2      -
-       2026.04  30.2      -
-       2026.02  30        -
-       2026.04  30        -
-       2026.04  29.6      -
-       2026.05  29.2      -
-       -        28.8      -
-       2026.04  28.8      -
-       2026.05  28.7      -
-       -        28.6      -
-       2026.02  28.4      -
-       2026.04  28.4      -
-       2025.08  28.2      -
-       2026.05  28.1      -
-       -        28        -
-       2026.02  28        -
-       2026.03  28        -
-       2026.04  28        -
-       2026.04  28        -
-       2025.08  28        -
-       2026.02  27.9      -
-       2026.04  27.9      -
-       2025.08  27.7      -
-       2025.08  27.7      -
-       2026.03  27.6      -
-       -        27.4      -
-       2026.03  27.4      -
-       2026.04  27.4      -
-       -        27        -
-       -        26.8      -
-       -        26.5      -
-       2026.04  26.5      -
-       2026.03  26.1      -
-       2026.04  26.1      -
-       2026.05  25.9      -
-       -        25.8      -
-       2026.03  25.8      -
-       2026.05  25.8      -
-       -        25.2      -
-       2026.03  25.2      -
-       2026.04  25.2      -
-       -        25        -
-       -        25        -
-       2025.08  24        -
-       2025.08  21.9      -
-       -        11.9      -