Docstring Evaluation on DevEval: 183 human-written docstrings
[Chart: Score by method. Human: 4.938 (Apr 12, 2026). Metrics listed: Score, Rank Correlation (rp), Rank Correlation (rs), Kendall's Tau (τ), Average Metric Value.]
Updated 4d ago
Evaluation Results
| Method | Score Range | Date | Score | Rank Correlation (rp) | Rank Correlation (rs) | Kendall's Tau (τ) | Average Metric Value |
|---|---|---|---|---|---|---|---|
| Human | 1–5 | 2026.04 | 4.938 | - | - | - | - |
| G-Eval | 1–5 | 2026.04 | 3.043 | 0.013 | 0.003 | 0.003 | 0.006 |
| ReFEree | 0–1 | 2026.04 | 0.938 | 0.326 | 0.298 | 0.287 | 0.304 |
| ROUGE-L | 0–1 | 2026.04 | 0.208 | 0.006 | 0.047 | 0.039 | 0.03 |
| BLEU | 0–1 | 2026.04 | 0.03 | 0.062 | 0.061 | 0.054 | 0.059 |
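The page does not define "Average Metric Value", but the reported numbers appear consistent with the arithmetic mean of the three correlation columns (rp, rs, and Kendall's τ). The sketch below checks that reading against the values in the results table. The function name `mean_correlation` and the rounding tolerance are assumptions made for illustration, not something the page specifies.

```python
# Correlation values copied from the results table:
# (rp, rs, Kendall's tau, reported Average Metric Value)
rows = {
    "G-Eval":  (0.013, 0.003, 0.003, 0.006),
    "ReFEree": (0.326, 0.298, 0.287, 0.304),
    "ROUGE-L": (0.006, 0.047, 0.039, 0.030),
    "BLEU":    (0.062, 0.061, 0.054, 0.059),
}


def mean_correlation(rp, rs, tau):
    """Arithmetic mean of the three correlation coefficients (assumed definition)."""
    return (rp + rs + tau) / 3


for method, (rp, rs, tau, reported) in rows.items():
    computed = mean_correlation(rp, rs, tau)
    # The table shows three decimals, so the inputs and the reported average
    # each carry rounding error; 0.001 is a tolerance chosen to cover both.
    consistent = abs(computed - reported) < 0.001
    print(f"{method:8s} computed={computed:.4f} reported={reported:.3f} consistent={consistent}")
```

Under this assumed definition, all four rows agree with the reported column to within rounding of the displayed values. The Human row has no correlations, so it has no average.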