Human Evaluation on NeurIPS ICML ICLR Proposals 2025 (test)
[Chart: Wins by method. Highlighted point: Stepwise CoT, 31 wins, Mar 28, 2026. Selectable metrics: Wins, Tie Count, Loss Count, Win Rate, Unanimity Rate.]
Updated 18d ago
Evaluation Results
| Method | Method (setting) | Links | Wins | Tie Count | Loss Count | Win Rate (%) | Unanimity Rate (%) |
|---|---|---|---|---|---|---|---|
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 31 | 3 | 26 | 54.2 | 26.7 |
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 30 | 10 | 20 | 58.3 | 26.7 |
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 29 | 7 | 24 | 54.2 | 31.7 |
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 25 | 10 | 25 | 50 | 11.7 |
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 21 | 11 | 28 | 44.2 | 11.7 |
| Stepwise CoT | Comparison=Stepwise Co... | 2026.03 | 20 | 18 | 22 | 48.3 | 6.7 |
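The page does not say how Win Rate is computed. The listed values are consistent with counting each tie as half a win and dividing by the total number of comparisons (wins + ties + losses). The sketch below checks that reading against the six rows. The formula is an inference from the numbers, not a documented definition, and the names `win_rate` and `ROWS` are hypothetical.

```python
# Assumption: Win Rate = (wins + 0.5 * ties) / (wins + ties + losses) * 100.
# This is inferred from the table values, not stated by the benchmark page.

def win_rate(wins: int, ties: int, losses: int) -> float:
    """Return the win rate in percent, counting each tie as half a win."""
    total = wins + ties + losses
    if total == 0:
        raise ValueError("no comparisons recorded")
    return round((wins + 0.5 * ties) / total * 100, 1)


# (wins, ties, losses, reported win rate) copied from the results table.
ROWS = [
    (31, 3, 26, 54.2),
    (30, 10, 20, 58.3),
    (29, 7, 24, 54.2),
    (25, 10, 25, 50.0),
    (21, 11, 28, 44.2),
    (20, 18, 22, 48.3),
]

if __name__ == "__main__":
    for wins, ties, losses, reported in ROWS:
        computed = win_rate(wins, ties, losses)
        print(f"{wins:>2} W / {ties:>2} T / {losses:>2} L -> "
              f"computed {computed}, reported {reported}")
```

Every row sums to 60 comparisons, so under this reading a tie moves the win rate by about 0.83 percentage points and a win by about 1.67. This is why the second row (30 wins, 10 ties) has a higher win rate than the first (31 wins, 3 ties).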