
Pearson Correlation with Human Evaluation on HOWTOBENCH English

Completion Correlation: 0.85

[Chart: ROUGE-L, Apr 21, 2026]
Updated 1mo ago

Evaluation Results

Method   Links
2026.04  0.85  0.34  0.48  0.52
2026.04  0.82  0.82  0.86  0.85
2026.04  0.76  0.85  0.81  0.81
2026.04  0.74  0.65  0.69  0.71
2026.04  0.65  0.71  0.54  0.69
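
The page title describes these scores as Pearson correlations with human evaluation. As a rough illustration of how such a coefficient could be computed, here is a minimal sketch that assumes one automatic-metric score and one human rating per example. The function name `pearson` and the sample numbers are hypothetical and are not HOWTOBENCH data or the benchmark's own evaluation code.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same constant factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, invented for illustration only.
metric_scores = [0.2, 0.4, 0.6, 0.8]   # automatic metric, one score per example
human_ratings = [1.0, 2.0, 3.0, 4.0]   # human judgment for the same examples

r = pearson(metric_scores, human_ratings)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means the metric ranks and scales examples much as the human raters do, a value near 0 means little linear relationship, and a negative value means the metric tends to disagree with the raters.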