VicunaEval

Benchmarks

Task Name              Dataset Name       SOTA Result              Trend
Instruction Following  VicunaEval         VicunaEval Score: 40.75  80
Instruction Following  VicunaEval         Rouge-L: 35              72
General Performance    VicunaEval         Winrate: 96.3            21
Generation             VicunaEval (test)  LLM Judge Score: 56.07   2
Instruction Following  VicunaEval (test)  Score: -                 0