Multi-turn Dialogue Evaluation on MT-Bench (MT-Bench Score, ŝu, ŝw)
[Chart: MT-Bench Score, ŝu, ŝw. Highlighted point: GPT-4, MT-Bench Score 8.99, Jan 29, 2026]
Updated 4d ago
Evaluation Results
Method                          Date      MT-Bench Score   ŝu      ŝw
GPT-4                           2026.01   8.99             0.83    0.73
GPT-3.5                         2026.01   7.94             0.51    0.43
Vicuna-13B (all), #Token=370M   2026.01   6.39            -0.29   -0.25
Alpaca-13B, #Token=4.4M         2026.01   4.53            -0.62   -0.54
LLaMA-13B, #Token=1T            2026.01   2.61            -1.27   -1.12