Multi-turn Dialogue Alignment on ALOE (test)
[Chart: Rank-1 Score over time; plotted point: SFT, 2.4, Oct 13, 2025]
Updated 1mo ago
Evaluation Results
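| Method | Date | Rank-1 Score | Rank-2 Score | Rank-3 Score | Rank-4 Score | Rank-5 Score | Rank-6 Score | Rank-7 Score | Rank-8 Score | Rank-9 Score | Rank-10 Score | Average Rank Score |
|--------|---------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|------|
| SFT    | 2025.10 | 2.4 | 3   | 3.8 | 3.8 | 3.8 | 4   | 4   | 4.2 | 3.6 | 4.2 | 3.68 |
| TPO    | 2025.10 | 2.4 | 3.4 | 4.2 | 3.8 | 4.2 | 4.2 | 4.2 | 4   | 4.2 | 4   | 3.86 |
| CoT    | 2025.10 | 2   | 3.2 | 3.8 | 4   | 4.2 | 4.4 | 4.2 | 3.8 | 3.8 | 3.8 | 3.72 |
| GRPO   | 2025.10 | 2   | 3.4 | 3   | 3.4 | 3.6 | 3.2 | 3.4 | 3.4 | 3   | 3.4 | 3.18 |
| CDRA   | 2025.10 | 2   | 3.4 | 4   | 4.2 | 4.4 | 4.4 | 4.6 | 4.2 | 4   | 4   | 3.92 |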
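
In every row above, the Average Rank Score matches the arithmetic mean of the ten per-rank scores. The page does not state how the column is computed, so the sketch below treats that as an assumption and only checks it against the listed values. The names `scores` and `average_rank_score` are illustrative and do not come from the benchmark.

```python
from statistics import mean

# Per-rank scores (Rank-1 .. Rank-10) copied from the table above.
scores = {
    "SFT":  [2.4, 3.0, 3.8, 3.8, 3.8, 4.0, 4.0, 4.2, 3.6, 4.2],
    "TPO":  [2.4, 3.4, 4.2, 3.8, 4.2, 4.2, 4.2, 4.0, 4.2, 4.0],
    "CoT":  [2.0, 3.2, 3.8, 4.0, 4.2, 4.4, 4.2, 3.8, 3.8, 3.8],
    "GRPO": [2.0, 3.4, 3.0, 3.4, 3.6, 3.2, 3.4, 3.4, 3.0, 3.4],
    "CDRA": [2.0, 3.4, 4.0, 4.2, 4.4, 4.4, 4.6, 4.2, 4.0, 4.0],
}


def average_rank_score(per_rank):
    """Assumed definition: plain mean of the ten rank scores, to 2 decimals."""
    return round(mean(per_rank), 2)


averages = {method: average_rank_score(s) for method, s in scores.items()}

# Method with the highest average under this assumed definition.
best = max(averages, key=averages.get)

for method, avg in averages.items():
    print(f"{method}: {avg}")
print("highest average:", best)  # highest average: CDRA
```

Under this assumption the computed values reproduce the published column exactly (3.68, 3.86, 3.72, 3.18, 3.92), with CDRA highest and GRPO lowest.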