
Multi-turn Dialogue Evaluation on MT-Bench (test)

MT-Bench Score: 6.26

DiaBlo

[Chart: MT-Bench Score, Jun 3, 2025]
Updated 1mo ago

Evaluation Results

Date       MT-Bench Score
2025.06    6.26
2025.06    6.13
2025.06    5.97
2025.06    5.95
2025.06    5.61
2025.06    5.3