
Story Premise Human Evaluation Set

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Story Premise Diversity Evaluation | Story Premise Human Evaluation Set 600 premises 1.0 (test) | Average Score: 3.875 | 6