Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Helpful Assistant

Benchmarks

Task NameDataset NameSOTA ResultTrend
Helpful Assistant AlignmentHelpful Assistant normalized rewards (test)
Helpfulness Reward (r1)53
60
Multi-Objective OptimizationHelpful Assistant Harmless-helpful
MPD1.015
4
Multi-Objective OptimizationHelpful Assistant Humor-helpful
MPD1.439
4
Showing 3 of 3 rows