
Expert-level multidisciplinary QA on MMMU (dev val)

52.8 Pass@1

Vision-SR1

Updated 3mo ago

Evaluation Results

Method	Date	Links	Pass@1
Vision-SR1	2025.12		52.8
R1-ShareVL-7B	2025.12		52.28
PeRL-VL	2025.12		52.22
RL (Conditional Hard Gate)	2025.12		52.11
RL (Conditional Soft Gate)	2025.12		52.09
VL-Cogito	2025.12		51.72
PeBR-R1-7B	2025.12		51.59
RL (Verifiable Rewards)	2025.12		51.42
ThinkLite-VL	2025.12		50.81
RL (Aggregated Rewards)	2025.12		50.8
SFT (OpenThought)	2025.12		48.51
Qwen2.5-VL-7B	2025.12		48.15
SFT (GPT-4o)	2025.12		48.06