Dialogue on MT-Bench

9.3 MT-Bench Score

GPT-4o

[Chart: MT-Bench score over time, Feb 24, 2025 to Apr 9, 2026]
Updated 9d ago

Evaluation Results

Date      MT-Bench Score
2026.03   9.3
2026.03   9.1
2026.03   9
2026.03   8.9
2026.03   8.6
2026.03   8.5
2026.03   8.2
2026.04   8.138
2026.04   7.975
2026.04   7.862
2026.04   7.837
2026.04   7.831
2026.04   7.812
2026.04   7.793
2026.04   7.762
2026.04   7.756
2026.04   7.725
2026.04   7.681
2026.04   7.681
2026.04   7.506
2026.04   7.162
2025.02   6.01
2025.02   5.97
2025.02   5.84
2025.02   5.82
2025.02   5.61
2025.02   5.56
2025.02   5.3
2025.02   5.23