
Reward Model Evaluation on Arena-Hard RU

Best@8 Score: 92.69

Qwen3-32B-RM

[Chart: score axis ticks at 85.6596, 87.4848, 89.31, 91.1352; data point dated Dec 11, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.12
92.69    70.48    22.21
90.49    77.31    13.18
89.05    74.35    14.7
87.37    78.47    8.9
85.93    84.91    1.02