LLM Evaluation on Shared (evaluation)
[Chart: Tie-aware Accuracy by method; leading entry: GrowLoop, 78]
May 26, 2026
Updated 5d ago
Evaluation Results
| Method | Category | Date | Tie-aware Accuracy | Pair Accuracy | Spearman Correlation |
|---|---|---|---|---|---|
| GrowLoop | Ours | 2026.05 | 78 | 87 | 0.78 |
| ICAI | Training-free... | 2026.05 | 58 | 85 | 0.75 |
| OpenJudge | Training-free... | 2026.05 | 58 | 70 | 0.62 |
| ICL (k=3) | No rubric | 2026.05 | 37 | 57 | 0.09 |
| Arena-Hard Prompt | Manual rubric | 2026.05 | 27 | 43 | -0.17 |
| Zero-shot | No rubric | 2026.05 | 25 | 42 | -0.21 |
| MT-Bench Prompt | Manual rubric | 2026.05 | 23 | 38 | -0.31 |
| Skywork-Reward-V2 | Reward Model | 2026.05 | 22 | 39 | -0.2 |
| RM-R1 | Reward Model | 2026.05 | 15 | 25 | -0.5 |
| OpenRubric-Judge | Training-base... | 2026.05 | 14 | 24 | -0.49 |
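
The page reports three agreement metrics but does not define them. As a rough illustration only, the sketch below shows one common way such scores could be computed from human scores and judge scores for the same items. The function names (`rank`, `spearman`, `pair_accuracy`) and the `count_ties` flag are hypothetical, and the reading of "tie-aware" used here (a human tie counts as correct only when the judge also ties) is an assumption, not the benchmark's stated definition.

```python
from itertools import combinations


def rank(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of average ranks."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def sign(d):
    """Return -1, 0, or 1 according to the sign of d."""
    return (d > 0) - (d < 0)


def pair_accuracy(human, judge, count_ties=False):
    """Fraction of item pairs whose judge ordering matches the human ordering.

    count_ties=False: pairs tied in the human scores are skipped.
    count_ties=True: every pair counts, and a human tie is matched only by
    a judge tie (an ASSUMED reading of "tie-aware").
    """
    correct = total = 0
    for i, j in combinations(range(len(human)), 2):
        h = sign(human[i] - human[j])
        if h == 0 and not count_ties:
            continue
        total += 1
        correct += h == sign(judge[i] - judge[j])
    return correct / total


# Made-up scores for four items, purely for illustration.
human = [1, 2, 2, 4]
judge = [1, 3, 2, 4]
print(pair_accuracy(human, judge))                   # ties skipped
print(pair_accuracy(human, judge, count_ties=True))  # ties counted
print(spearman(human, judge))
```

Under this reading, counting ties can only lower or preserve the score relative to skipping them, which would be consistent with the Tie-aware Accuracy column sitting at or below Pair Accuracy for every method in the table. The sketch does not guard against degenerate inputs (constant scores, or all pairs tied with `count_ties=False`), which would divide by zero.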