
Human Preference Alignment on HH-RLHF and PKU-SafeRLHF (test)

Quality Score: 3.93

DPO-HPS

[Chart: Quality Score] Feb 20, 2025
Updated 27d ago

Evaluation Results

Method  Date     Quality Score  Links
        2025.02  3.93
        2025.02  3.82
        2025.02  3.69
        2025.02  3.63