Visual Reasoning on MME-RW Chinese
[Chart: Accuracy over time. Best result: S1-VL-32B-RL, 77.7 accuracy, Apr 23, 2026.]
Evaluation Results
| Method | Category | Date | Accuracy |
| --- | --- | --- | --- |
| S1-VL-32B-RL | Thinking-with... | 2026.04 | 77.7 |
| S1-VL-32B-SFT | Thinking-with... | 2026.04 | 72 |
| Skywork-R1V4-30B | Thinking-with... | 2026.04 | 70.8 |
| Gemini 2.5 Pro | Proprietary M... | 2026.04 | 69.3 |
| Qwen3-VL-235B-A22B-Thinking | Open-Source M... | 2026.04 | 68.8 |
| Thyme-VL (7B) | Thinking-with... | 2026.04 | 64.6 |
| GPT-5 | Proprietary M... | 2026.04 | 63.97 |
| Intern-S1 (235B+6B) | Open-Source M... | 2026.04 | 62.98 |
| Qwen3-VL-32B-Thinking | Open-Source M... | 2026.04 | 61.21 |
| Gemini 2.5 Flash | Proprietary M... | 2026.04 | 61.2 |
| Qwen2.5-VL-7B | Open-Source M... | 2026.04 | 60.8 |
| Qwen2.5-VL-32B | Open-Source M... | 2026.04 | 60.5 |
| InternVL3-8B | Open-Source M... | 2026.04 | 60.5 |
| Intern-S1-mini (8B) | Open-Source M... | 2026.04 | 54.67 |