Multimodal mathematical reasoning on MathVision

61.5 Pass@1 Accuracy

GPT-5

Updated 2mo ago

Evaluation Results

Method	Links	Pass@1 Accuracy
GPT-5 2026.05		61.5
GPT-5-Nano-High 2026.02		58.75
Qwen3-VL-8B-Thinking 2026.02		57.89
Qwen3-VL-8B-DeepVision 2026.02		55.49
MiMo-VL-7B-DeepVision 2026.02		55.24
MiMo-VL-7B-RL-2508 2026.02		53.91
Kimi-K2.5 2026.05		53.3
MiMo-VL-7B-OpenMMReasoner 2026.02		52.97
Gemini-2.5-Flash-Lite 2026.02		52.47
Qwen3-VL-8B-Instruct 2026.02		51.44
MiMo-VL-7B-MathBook 2026.02		51.31
MiMo-VL-7B-SFT-2508 2026.02		50.69
MiMo-VL-7B-MM-Eureka 2026.02		50
Qwen3-VL-32B 2026.05		45.4
MAESTRO 2026.05		43.4
Gemini-2.5-Flash 2026.05		39.8
Gemini-2.5-Pro 2026.05		39.8
GLM-4.6V 2026.05		39.1
GPT-4o 2026.05		30.4
VTOOL-R1 2026.05		29.3
Untrained Model 2026.05		29
DeepEyes-v2 2026.05		28.9
Thyme 2026.05		27.6
VTS-V 2026.05		27
DeepEyes 2026.05		26.6
MathCoder-VL 2026.05		26
Direct Answering 2026.05		24.9
PixelReasoner 2026.05		23.4
VisionReasoner 2026.05		21.7
Visual-ARFT 2026.05		21.4
Chain-of-Focus 2026.05		21.1