
human judgment dataset

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Human Correlation Analysis | Refined human judgment dataset human vs model-generated | SO-S 0.995 | 3 |
| Human Correlation Analysis | Original human judgment dataset | Generation Perplexity 0.643 | 3 |