Multimodal Reasoning on MMKI2
[Chart: Avg@8 Accuracy; top entry PAPO_D at 80.61, Jul 8, 2025]
Evaluation Results
Method   Details                      Date      Avg@8 Accuracy
PAPO_D   Backbone=Qwen2.5-VL, M...    2025.07   80.61
DAPO     Backbone=Qwen2.5-VL, M...    2025.07   75.93
PAPO_G   Backbone=Qwen2.5-VL, M...    2025.07   72.52
GRPO     Backbone=Qwen2.5-VL, M...    2025.07   72.26
DAPO     Backbone=Qwen2.5-VL, M...    2025.07   66.83
PAPO_D   Backbone=Qwen2.5-VL, M...    2025.07   64.09
PAPO_G   Backbone=Qwen2.5-VL, M...    2025.07   57.39
GRPO     Backbone=Qwen2.5-VL, M...    2025.07   57.24
PAPO_G   Backbone=Qwen3-VL (thi...    2025.07   48.57
GRPO     Backbone=Qwen3-VL (thi...    2025.07   47.71