Two-piece spatial reasoning on Tangram
[Chart: Pos IoU over time, with selectable metrics Pos IoU, Angle IoU, and All IoU. Best Pos IoU: 34 (Gemini-2.5-pro, Feb 5, 2026). Updated 3mo ago.]
Evaluation Results
Method           Date     Pos IoU  Angle IoU  All IoU
Gemini-2.5-pro   2026.02  34       39.7       34
Claude-Sonnet-4  2026.02  31.8     39.4       23.5
GPT-4o mini      2026.02  27.6     39.4       27.8
Qwen-72B         2026.02  25.3     49.5       24.8
LLaMA Maverick   2026.02  22       42.7       37.1
Qwen-3B          2026.02  19.2     31.7       21.4