Perception on V* (Pass@1)
Best result: 90.2 Pass@1 (SFT + GRPO)
[Chart: Pass@1 over time; data point dated May 27, 2026]
Updated 6d ago
Evaluation Results
Method       Configuration               Date      Pass@1
SFT + GRPO   Backbone=Qwen3-VL-Thin...   2026.05   90.2
Base         Backbone=Qwen3-VL-Thin...   2026.05   89.1
SFT + AXPO   Backbone=Qwen3-VL-Thin...   2026.05   88.9
SFT + AXPO   Backbone=Qwen3-VL-Thin...   2026.05   87.8
SFT + GRPO   Backbone=Qwen3-VL-Thin...   2026.05   87.7
SFT          Backbone=Qwen3-VL-Thin...   2026.05   87
GRPO         Backbone=Qwen3-VL-Thin...   2026.05   85.7
SFT          Backbone=Qwen3-VL-Thin...   2026.05   84.8
GRPO         Backbone=Qwen3-VL-Thin...   2026.05   82.7
SFT + GRPO   Backbone=Qwen3-VL-Thin...   2026.05   81.7
SFT + AXPO   Backbone=Qwen3-VL-Thin...   2026.05   81.3
Base         Backbone=Qwen3-VL-Thin...   2026.05   80.6
Base         Backbone=Qwen3-VL-Thin...   2026.05   77.7
SFT          Backbone=Qwen3-VL-Thin...   2026.05   75.9
GRPO         Backbone=Qwen3-VL-Thin...   2026.05   67.1
Base         Backbone=Qwen3-VL-Thin...   2026.05   63