Multimodal Mathematical Reasoning on Aggregate Math Benchmarks
[Chart: Overall Macro Score over time. Top entry: Qwen3.5-397B-A17B*, 87.47 (May 10, 2026).]
Updated 15d ago
Evaluation Results
Method                              Details                    Date     Overall Macro Score
Qwen3.5-397B-A17B*                  Model Type=Open-source...  2026.05  87.47
Gemini-3-Pro*                       Model Type=Closed-sour...  2026.05  79.51
GPT-5*                              Model Type=Closed-sour...  2026.05  76.55
Doubao-Seed-1.8*                    Model Type=Closed-sour...  2026.05  73.42
Qwen3VL-8B-instruct + GeoSym Hard   Base Model=Qwen3-VL, S...  2026.05  63.18
Qwen3VL-8B-instruct + GeoSym Entry  Base Model=Qwen3-VL, S...  2026.05  62.49