Multimodal Mathematical Reasoning on WeMath mini (test)

79.5 Accuracy

Qwen3-VL-4B-Instruct-Math-RL Teacher

Updated 2mo ago

Evaluation Results

Method	Date	Accuracy
Qwen3-VL-4B-Instruct-Math-RL Teacher	2026.05	79.5
Claude-3.7-Sonnet	2025.06	72.6
Perception-R1-7B	2025.06	72
Qwen2.5-VL-72B-IT	2025.06	71.9
GPT-4o	2025.06	68.8
OpenVLThinker-7B	2025.06	66.3
VLAA-Thinker-7B	2025.06	66.3
MM-Eureka-7B	2025.06	65.6
Uni-OPD	2026.05	65
SophiaVL-R1-7B	2025.06	64.8
OPD	2026.05	64.8
R1-OneVision-7B	2025.06	61.9
Qwen2.5-VL-7B-IT	2025.06	61.4
Uni-OPD	2026.05	58.7
OPD	2026.05	57.6
InternVL2.5-8B	2025.06	53.5
Qwen3-VL-2B-Instruct Student	2026.05	48.6
Qwen2-VL-7B-IT	2025.06	42.3