
Human Judgments

Benchmarks

Task Name               Dataset Name      SOTA Result                      Trend
Attribution Coverage    Human Judgments   Pearson Correlation (r): 0.97    8
MURGAT-SCORE            Human Judgments   Pearson Correlation (r): 0.86    4
Attribution Precision   Human Judgments   Pearson Correlation (r): 0.65    4