Visual Reasoning on Geometry3K (test)
[Chart: Pass@1 Accuracy over time. Top result: SaEI, 56.18 (Dec 11, 2025). Metrics available: Pass@1 Accuracy, Pass@8 Accuracy.]
Updated 24d ago
Evaluation Results
Method                    Attributes                   Links     Pass@1 Accuracy   Pass@8 Accuracy
SaEI                      Finetuning status=Fine...    2025.12   56.18             -
KL-Cov                    Finetuning status=Fine...    2025.12   55.91             -
NoisyRollout              Finetuning status=Fine...    2025.12   54.74             -
Vanilla GRPO              Finetuning status=Fine...    2025.12   54.02             -
Qwen2.5-VL-7B-Instruct    Finetuning status=Not...     2025.12   39.27             -
GRPO                      Backbone=Qwen2.5-VL-3B...    2026.03   -                 28.72
PAPO_G                    Backbone=Qwen2.5-VL-3B...    2026.03   -                 30.95
PEPO_G                    Backbone=Qwen2.5-VL-3B...    2026.03   -                 29.85