Qualitative Assessment Dataset

Benchmarks

Task Name              | Dataset Name                   | SOTA Result                                | Trend
Helpfulness Assessment | Qualitative Assessment Dataset | Not Overrefuse Rate (Content-safety): 100  | 4
Safety Assessment      | Qualitative Assessment Dataset | Not Unsafe Rate (Content Safety): 97.7     | 4