
Quality, Factuality, and Safety Evaluation Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Language Model Evaluation | Quality, Factuality, and Safety Evaluation Suite (test) | Generation Quality Score: 86.3 | 7