MT-Bench Evaluation on UltraFeedback

8.1 MT-Bench Score

SPA

Updated 4d ago

Evaluation Results

Date       MT-Bench Score
2025.10    8.1
2025.10    8
2025.10    8
2025.10    8
2025.10    8
2025.10    8
2025.10    8
2025.10    7.9
2025.10    7.9
2025.10    7.9
2025.10    7.9
2025.10    7.9
2025.10    7.8
2025.10    7.7
2025.10    7.7
2025.10    7.7
2025.10    7.6
2025.10    7.6
2025.10    7.6
2025.10    7.6
2025.10    7.6
2025.10    7.5
2025.10    7.5
2025.10    7.5
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.3
2025.10    6.2
2025.10    6.2
2025.10    6.2
2025.10    5.5
2025.10    5.4