Visual Puzzle Reasoning on PuzzleVQA
Best reported accuracy: 46.8 (PEPOD), Mar 24, 2026
Evaluation Results
Method            Configuration               Date     Accuracy
PEPOD             Backbone=Qwen2.5-VL-3B...   2026.03  46.8
PEPOG             Backbone=InternVL3-2B-...   2026.03  45.2
PEPOD             Backbone=InternVL3-2B-...   2026.03  45.2
PEPOG             Backbone=Qwen2.5-VL-3B...   2026.03  45
DAPO              Backbone=InternVL3-2B-...   2026.03  44.8
DAPO              Backbone=Qwen2.5-VL-3B...   2026.03  44.6
GRPO              Backbone=Qwen2.5-VL-3B...   2026.03  43.2
GRPO              Backbone=InternVL3-2B-...   2026.03  43.2
High-Entropy RL   Backbone=Qwen2.5-VL-3B...   2026.03  35
High-Entropy RL   Backbone=InternVL3-2B-...   2026.03  32.6
Base (zero-shot)  Backbone=Qwen2.5-VL-3B...   2026.03  31.8
Base (zero-shot)  Backbone=InternVL3-2B-...   2026.03  31.8