Multi-turn Conversational Quality on MT-Bench (test)
[Chart: MT-Bench Score (1-turn) and MT-Bench Score (2-turn) by method. Highlighted point: TCR, 8.06 (1-turn), Oct 17, 2025.]
Updated 8d ago
Evaluation Results
| Method | Details | Date | MT-Bench Score (1-turn) | MT-Bench Score (2-turn) |
|---|---|---|---|---|
| TCR | Optimizer=GRPO, Founda... | 2025.10 | 8.06 | 7.54 |
| PW | Optimizer=GRPO, Founda... | 2025.10 | 7.93 | 7.39 |
| ELO | Optimizer=GRPO, Founda... | 2025.10 | 7.87 | 6.79 |
| PREF | Optimizer=GRPO, Founda... | 2025.10 | 7.58 | 7.1 |
| TCR | Optimizer=GSPO, Founda... | 2025.10 | 7.46 | 6.79 |
| LW | Optimizer=GSPO, Founda... | 2025.10 | 7.38 | 6.61 |
| LW | Optimizer=GRPO, Founda... | 2025.10 | 7.33 | 7.44 |
| PW | Optimizer=GSPO, Founda... | 2025.10 | 7.32 | 6.43 |
| Base Model | Optimizer=GRPO, Founda... | 2025.10 | 7.31 | 6.74 |
| ELO | Optimizer=GSPO, Founda... | 2025.10 | 7.31 | 6.63 |
| PREF | Optimizer=GSPO, Founda... | 2025.10 | 6.89 | 6.66 |
| Base Model | Optimizer=GSPO, Founda... | 2025.10 | 6.58 | 5.64 |