
Free Run

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Open-domain Dialogue Evaluation | Free run Mechanical Turk 1 (initial data collection run) | Overall Score: 0.534 | 10 |
| Open-domain Dialogue Evaluation | Free Run 2 1.0 (secondary data collection run) | Overall Quality Score: 0.455 | 10 |