
Reward Modeling on ARMBench-VL ours (test)

67.6 FG Score

ARM-Thinker-7B

Dec 4, 2025
Updated 4d ago

Evaluation Results

Method  Links
2025.12  67.6  73.8  52.4  64.6
2025.12  61.8  69.5  58.7  63.3
2025.12  58.9  59.1  47  55
2025.12  56.7  57.7  52  55.5
2025.12  52  47.2  42.8  47.4
2025.12  51.8  45.4  41.1  46.1
2025.12  47.6  56.6  47.6  50.6