
Helpfulness

Benchmarks

Task Name               Dataset Name                  SOTA Result                Trend
Text Classification     Helpfulness                   F1 Score: 72.27            13
LLM Alignment           Helpfulness                   Truthfulness Index: 0.891  7
Helpfulness Evaluation  Helpfulness (evaluation set)  Win Rate: 84.05            5