
Helpfulness Evaluation on MTBench

Helpfulness: 9.35

GPT-4o

[Chart: Helpfulness scores; y-axis ticks 7.5196, 7.9948, 8.47, 8.9452; x-axis: Feb 28, 2025]
Updated 4d ago

Evaluation Results

Date      Helpfulness
2025.02   9.35
2025.02   9.22
2025.02   9.14
2025.02   8.83
2025.02   8.77
2025.02   8.61
2025.02   8
2025.02   7.59