Multimodal Reasoning on GeoQA
Best result: 49.2 Mean@1, Socratic-Solver-Geo (Stage3), Feb 3, 2026
Evaluation Results
Method                        Settings                   Date     Mean@1
Socratic-Solver-Geo (Stage3)  Data Scale=2.5k, Curri...  2026.02  49.2
Geo170k (G-LLaVA)             Data Scale=10k             2026.02  47.16
GeoReasoning                  Data Scale=10k             2026.02  46.76
KD (Our Synthesis)            Data Scale=2.5k            2026.02  46.58
R-CoT                         Data Scale=7.2k            2026.02  46.49
TrustGeoGen                   Data Scale=10k             2026.02  46.35
KD (Geo3K)                    Data Scale=3k              2026.02  46.21
PGPS9K                        Data Scale=10k             2026.02  46.08
Socratic-Solver-Geo (Stage1)  Data Scale=0.4k, Curri...  2026.02  45.14
Socratic-Solver-Geo (Stage2)  Data Scale=1k, Curricu...  2026.02  44.86
Qwen2.5-VL-7B-Instruct        Mode=Zero-shot             2026.02  43.92