Arena-Expert, HelpSteer, HH-RLHF, and UltraFeedback

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Preference Prediction | Arena-Expert-5K, HelpSteer3, HH-RLHF, and UltraFeedback (held-out) | Accuracy: 70.5 | 42