
OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

About

Reward modeling lies at the core of reinforcement learning from human feedback (RLHF), yet most existing reward models rely on scalar or pairwise judgments that fail to capture the multifaceted nature of human preferences. Recent studies have explored rubrics-as-rewards (RaR), which use structured criteria to capture multiple dimensions of response quality. However, producing rubrics that are both reliable and scalable remains a key challenge. In this work, we introduce OpenRubrics, a diverse, large-scale collection of (prompt, rubric) pairs for training rubric-generation and rubric-based reward models. To elicit discriminative and comprehensive evaluation signals, we introduce Contrastive Rubric Generation (CRG), which derives both hard rules (explicit constraints) and principles (implicit qualities) by contrasting preferred and rejected responses. We further remove noisy rubrics by enforcing preference-label consistency. Across multiple reward-modeling benchmarks, our rubric-based reward model, Rubric-RM, surpasses strong size-matched baselines by 8.4%. These gains transfer to policy models on instruction-following and biomedical benchmarks.

Tianci Liu, Ran Xu, Tony Yu, Ilgee Hong, Carl Yang, Tuo Zhao, Haoyu Wang • 2025
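Below is a minimal Python sketch of the pipeline the abstract describes: Contrastive Rubric Generation derives hard rules and principles by contrasting a preferred/rejected response pair, and a consistency filter keeps only rubrics under which the preferred response outscores the rejected one. The prompt template, the parsing, the per-criterion scoring aggregation, and the injected `llm` callable are illustrative assumptions, not the authors' released implementation.

```python
# Sketch of Contrastive Rubric Generation (CRG) and the preference-label
# consistency filter, per the abstract. All names, the prompt template,
# and the scoring scheme are assumptions for illustration only.
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # any text-in/text-out model client

@dataclass
class Rubric:
    hard_rules: list[str]   # explicit constraints (violations are disqualifying)
    principles: list[str]   # implicit qualities distinguishing better responses

CRG_PROMPT = """Prompt: {prompt}
Preferred response: {chosen}
Rejected response: {rejected}

Contrast the two responses and list, one item per line:
HARD RULE: <explicit constraint the rejected response violates>
PRINCIPLE: <implicit quality that makes the preferred response better>"""

def contrastive_rubric(llm: LLM, prompt: str, chosen: str, rejected: str) -> Rubric:
    """Derive hard rules and principles by contrasting a preference pair."""
    lines = llm(CRG_PROMPT.format(prompt=prompt, chosen=chosen,
                                  rejected=rejected)).splitlines()
    return Rubric(
        hard_rules=[l.split(":", 1)[1].strip() for l in lines
                    if l.startswith("HARD RULE:")],
        principles=[l.split(":", 1)[1].strip() for l in lines
                    if l.startswith("PRINCIPLE:")],
    )

def rubric_score(llm: LLM, rubric: Rubric, prompt: str, response: str) -> float:
    """Judge a response criterion-by-criterion; return the fraction of
    criteria satisfied (a simple aggregation assumed for illustration)."""
    criteria = rubric.hard_rules + rubric.principles
    verdicts = [
        llm(f"Prompt: {prompt}\nResponse: {response}\n"
            f"Criterion: {c}\nSatisfied? Answer yes or no.")
        for c in criteria
    ]
    return sum(v.strip().lower().startswith("yes") for v in verdicts) / max(len(criteria), 1)

def keep_rubric(llm: LLM, rubric: Rubric, prompt: str,
                chosen: str, rejected: str) -> bool:
    """Preference-label consistency filter: keep the rubric only if scoring
    under it reproduces the human label (chosen outscores rejected)."""
    return (rubric_score(llm, rubric, prompt, chosen)
            > rubric_score(llm, rubric, prompt, rejected))
```

Injecting the model client as a plain callable keeps the sketch independent of any particular API; swap in your own judge model and a finer-grained aggregation (e.g., weighting hard rules above principles) as needed.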

Related benchmarks

Task             Dataset                    Accuracy (%)   Rank
Reward Modeling  RM-Bench                   74.0           125
Reward Modeling  RMB                        78.5           120
Reward Modeling  RewardBench Focus 2        86.5            82
Reward Modeling  RewardBench v2             71.9            72
Reward Modeling  RewardBench Precise IF 2   40.0            70
Reward Modeling  HelpSteer 3                67.5            39
Reward Modeling  RM-Bench Chat Hard         75.4            34
Reward Modeling  RewardBench v1             86.7            28
Reward Modeling  PPE-IFEval                 70.8            18
Reward Modeling  RewardBench Chat           89.9            18

(Showing 10 of 21 rows.)
