Multimodal Reasoning on MMStar (Pass@1 accuracy)

67.1 Pass@1 Accuracy

Qwen2.5-VL-7B-Instruct + RFT

Updated 3mo ago

Evaluation Results

Method	Links	Pass@1 Accuracy
Qwen2.5-VL-7B-Instruct + RFT 2026.04		67.1
InternVL2.5-26B 2026.04		66.5
InternVL3-9B 2026.04		66.3
Qwen2.5-VL-7B-Instruct + cold start 2026.04		66.2
LLaVA-OneVision-72B 2026.04		65.8
Claude-3.5-Sonnet 2026.04		65.1
GPT-4o-20240513 2026.04		64.7
Qwen2.5-VL-7B-Instruct 2026.04		63.9
InternVL2.5-8B 2026.04		62.8
Qwen2.5-VL-7B-Instruct 2026.04		61.8
Qwen2-VL-7B 2026.04		60.7
Gemini-1.5-Pro 2026.04		59.1
MiniCPM-V2.6 2026.04		57.5
GPT-4V 2026.04		56.0
Cambrian-34B 2026.04		54.2