OOD Generalization on Real-world Cross-View Spatial Reasoning Benchmarks (Aggregate)
[Chart: Avg OOD Performance. Highlighted point: Panoramic + VDrop, 40, May 26, 2026.]
Updated 7d ago
Evaluation Results
Method                   VDrop  Date     Avg OOD Performance
Panoramic + VDrop        ✓      2026.05  40
Top-down + VDrop         ✓      2026.05  38
Panoramic                ✗      2026.05  37.6
Top-down                 ✗      2026.05  37.3
ThinkMorph (24K)         ✗      2026.05  37.2
Qwen3-VL-8B              ✗      2026.05  37
Point Matching + VDrop   ✓      2026.05  36.1
Point Matching           ✗      2026.05  35.2
No-Think                 ✗      2026.05  35.1
Qwen3-VL-4B              ✗      2026.05  33.8
BAGEL                    ✗      2026.05  33.3
Text CoT                 ✗      2026.05  32.7
BAGEL-Zebra-CoT (182K)   ✗      2026.05  26.8