
Spatial Reasoning on VSI-Bench Vanilla regime

50.5 Avg Score

GeoThinker Qwen2.5VL-7B

Feb 5, 2026
Updated 4d ago

Evaluation Results

Date     Avg    Per-task scores
2026.02  50.5   69.5  38.5  57.9  62.2  45.2  46.2  31.4  52.6
2026.02  49.7   68.1  38.7  59    61.1  45.5  44.9  26.8  53.4
2026.02  48.9   68.5  36.1  57.3  62.5  43.7  47.9  34.5  40.9
2026.02  48.4   65.3  34.8  63.1  45.1  41.3  46.2  33.5  46.3
2026.02  46.7   67.6  37.6  55.2  52.5  48    44.7  31.9  35.5
2026.02  45.4   56.2  30.9  64.1  43.6  51.3  46.3  36    34.6
2026.02  44.8   -     -     -     -     -     -     -     -
2026.02  44.1   -     -     -     -     -     -     -     -
2026.02  42.1   49.8  30.8  53.5  54.4  37.7  41    31.5  37.8
2026.02  40.9   48.9  22.8  57.4  35.3  42.4  36.7  35    48.6
2026.02  40.2   43.5  23.9  57.6  37.5  42.5  39.9  32.5  44.6
2026.02  36     34.9  26.9  46.5  31.8  42.1  32.2  34    39.6
2026.02  35.6   48.5  14    47.8  24.2  43.5  42.4  34    30.6
2026.02  34.6   23.1  28.7  48.2  39.8  36.7  30.7  29.9  39.6
2026.02  34     62.1  32    29.9  33.1  25.1  47.9  28.4  25.2
2026.02  34     46.2  5.3   43.8  38.2  37    41.3  31.5  28.5
2026.02  32.4   47.7  20.2  47.4  12.3  42.5  35.2  29.4  24.4
2026.02  29.3   25.2  10.5  36.4  29.6  38.4  38    29.8  26.8
2026.02  28.6   32.7  19.5  17.3  25.1  37.3  44.9  30.4  21.8
2026.02  -      -     -     -     -     25    36.1  28.3  25