Reward modeling on EVAL_INSTRUCT 2 steps
[Chart: Step Completion Rate over time. Plotted point: R2VLM, 1.65, Mar 18, 2026. Available metrics: Step Completion Rate, Task Completion Rate.]
Updated 1mo ago
Evaluation Results
Method                                Date     Step Completion Rate  Task Completion Rate
R2VLM                                 2026.03  1.65                  70
Qwen2.5-VL-Instruct (Model Size=7B)   2026.03  1.55                  65
Pretrained SPRINT                     2026.03  1.35                  55
Step-Completion Based Reward          2026.03  1.3                   55