General-purpose multiple-choice evaluation on MMBench EN V11 (dev)

84.3 Pass@1

Vision-SR1

Updated 3mo ago

Evaluation Results

Method	Links	Pass@1
Vision-SR1 2025.12		84.3
RL (Conditional Hard Gate) 2025.12		84.29
RL (Conditional Soft Gate) 2025.12		84.15
VL-Cogito 2025.12		83.92
PeBR-R1-7B 2025.12		83.78
R1-ShareVL-7B 2025.12		83.67
PeRL-VL 2025.12		83.5
RL (Verifiable Rewards) 2025.12		83.2
RL (Aggregated Rewards) 2025.12		83.2
Qwen2.5-VL-7B 2025.12		82.9
ThinkLite-VL 2025.12		82.65
SFT (GPT-4o) 2025.12		82.57
SFT (OpenThought) 2025.12		81.1