SOTA RLHF Alignment Evaluation benchmarks and papers with code | Wizwand

RLHF Alignment Evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Trend
HHA	DPO	Harmlessness Win Rate (Base, A)	76.1		6	4mo ago
