
Anthropic HH-RLHF

Benchmarks

Task Name              Dataset Name                            SOTA Result           Trend
Helpfulness alignment  Anthropic hh-rlhf                       Gold Reward: 3.36     14
LLM Alignment          Anthropic HH-RLHF 2022 (test)           Win Rate: 62          4
Preference Learning    Anthropic HH-RLHF+VI Preference (test)  Overall Accuracy: 64  3