
HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM

About

Existing open-source helpfulness preference datasets do not specify what makes some responses more helpful and others less so. Models trained on these datasets can incidentally learn to model dataset artifacts (e.g. preferring longer but unhelpful responses only due to their length). To alleviate this problem, we collect HelpSteer, a multi-attribute helpfulness dataset annotated for the various aspects that make responses helpful. Specifically, our 37k-sample dataset has annotations for correctness, coherence, complexity, and verbosity in addition to the overall helpfulness of responses. Training Llama 2 70B on the HelpSteer dataset with the SteerLM technique produces a model that scores 7.54 on MT Bench, which is currently the highest score for open models that do not require training data from more powerful models (e.g. GPT-4). We release this dataset under the CC-BY-4.0 license at https://huggingface.co/datasets/nvidia/HelpSteer
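The per-attribute annotations described above lend themselves to attribute-conditioned reward shaping: instead of a single preference label, each response carries several ratings that can be combined with tunable weights. A minimal sketch of that idea is below. The field names (`helpfulness`, `correctness`, `coherence`, `complexity`, `verbosity`) and the specific weight values are assumptions for illustration, not the exact SteerLM training recipe; in practice the dataset itself would be loaded from the Hugging Face hub rather than built by hand.

```python
# Sketch: combining HelpSteer-style attribute ratings into one scalar score.
# Assumption: each sample carries integer ratings for five attributes, as
# described in the abstract. The weights here are hypothetical, chosen only
# to illustrate down-weighting verbosity relative to helpfulness/correctness.
# In practice the data would come from something like
# datasets.load_dataset("nvidia/HelpSteer"), not a hand-built dict.

ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]

def combined_score(sample, weights=None):
    """Return a weighted sum of the five annotated attribute ratings."""
    if weights is None:
        # Hypothetical weighting: emphasize correctness and helpfulness,
        # keep verbosity's influence small so length alone is not rewarded.
        weights = {"helpfulness": 0.65, "correctness": 0.8, "coherence": 0.45,
                   "complexity": 0.55, "verbosity": 0.4}
    return sum(weights[a] * sample[a] for a in ATTRIBUTES)

# Illustrative sample in the assumed schema.
example = {
    "prompt": "Explain photosynthesis briefly.",
    "response": "Plants convert light into chemical energy...",
    "helpfulness": 3, "correctness": 4, "coherence": 4,
    "complexity": 1, "verbosity": 1,
}
print(combined_score(example))  # weighted scalar score for this response
```

Separating the attributes this way is what lets a trained model be steered at inference time: raising or lowering the weight on, say, verbosity changes which responses score highest without re-annotating the data.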

Zhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Reward Modeling | RewardBench | Accuracy | 88.8 | 166 |
| Reward Modeling | RewardBench | Chat Score | 91.3 | 146 |
| Reward Modeling | RM-Bench | Accuracy | 52.5 | 125 |
| Reward Modeling | RMB | Accuracy | 58.2 | 120 |
| Reward Modeling Evaluation | RM-Bench | Chat Score | 56.4 | 55 |
| Reward Modeling | RMB | Help Accuracy | 57.4 | 13 |
