Multi-turn Conversation Evaluation on EduFeedback alternate

MT-Bench Score: 8.3

SFT

May 22, 2026
Updated 9d ago

Evaluation Results

Date       MT-Bench Score
2026.05    8.3
2026.05    8.3
2026.05    8.1
2026.05    8.1
2026.05    8
2026.05    7.9
2026.05    7.9
2026.05    7.9
2026.05    7.8
2026.05    6.9
2026.05    6.8
2026.05    6.1
2026.05    1.7
2026.05    1.7
2026.05    1.6
2026.05    1.6
2026.05    1.6
2026.05    1.3
2026.05    1
2026.05    1