
Multi-turn Dialogue on MT-Bench (Quality and Speed)

Best Speedup: 4.1 (DOUBLE)

[Chart: results over time, Jan 9, 2026 to Feb 5, 2026]
Updated 4d ago

Evaluation Results

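| Method | Links | Date | Speedup | | |
|---|---|---|---|---|---|
| | | 2026.01 | 4.1 | - | 9.85 |
| | | 2026.01 | 4.02 | - | 5.23 |
| | | 2026.01 | 3.88 | - | 9.35 |
| | | 2026.01 | 3.74 | 4.86 | - |
| | | 2026.01 | 3.42 | 8.25 | - |
| | | 2026.01 | 3.26 | - | 8.38 |
| | | 2026.01 | 3.23 | - | 7.92 |
| | | 2026.01 | 3.22 | - | 8.35 |
| | | 2026.01 | 3.07 | 7.89 | - |
| | | 2026.01 | 2.87 | - | 6.98 |
| | | 2026.02 | 2.85 | 4.35 | - |
| | | 2026.02 | 2.75 | 4.24 | - |
| | | 2026.02 | 2.67 | 4.07 | - |
| | | 2026.02 | 2.47 | 3.8 | - |
| | | 2026.01 | 2.43 | - | 4.32 |
| | | 2026.01 | 2.36 | - | 5.12 |
| | | 2026.01 | 2.23 | - | 3.03 |
| | | 2026.01 | 2.07 | 2.81 | - |
| | | 2026.02 | 2.04 | 3.49 | - |
| | | 2026.01 | 2.01 | - | 3.92 |
| | | 2026.01 | 2.01 | - | 4.3 |
| | | 2026.01 | 1.94 | - | 5.83 |
| | | 2026.02 | 1.91 | 3.36 | - |
| | | 2026.02 | 1.9 | 3.26 | - |
| | | 2026.01 | 1.89 | - | 2.83 |
| | | 2026.01 | 1.78 | - | 3.68 |
| | | 2026.01 | 1.76 | - | 4.82 |
| | | 2026.02 | 1.74 | 3.02 | - |
| | | 2026.01 | 1.72 | 5.15 | - |
| | | 2026.01 | 1.72 | - | 2.36 |
| | | 2026.02 | 1.7 | 2.95 | - |
| | | 2026.02 | 1.7 | 3.05 | - |
| | | 2026.01 | 1.67 | 3.45 | - |
| | | 2026.01 | 1.67 | - | 1.75 |
| | | 2026.02 | 1.63 | 2.83 | - |
| | | 2026.02 | 1.58 | 2.7 | - |
| | | 2026.01 | 1.47 | - | 1.69 |
| | | 2026.01 | 1.46 | - | 1.59 |
| | | 2026.01 | 1.45 | - | 1.72 |
| | | 2026.01 | 1.45 | - | 1.49 |
| | | 2026.01 | 1.36 | - | 1.65 |
| | | 2026.01 | 1.35 | - | 1.47 |
| | | 2026.01 | 1.34 | - | 3.47 |
| | | 2026.01 | 1.32 | - | 5.13 |
| | | 2026.01 | 1.29 | - | 1.43 |
| | | 2026.01 | 1.21 | - | 3.17 |
| | | 2026.01 | 1.21 | - | 3.77 |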