
Alpaca Eval

Benchmarks

Task Name             | Dataset Name       | SOTA Result                     | Trend
Helpfulness           | Alpaca Eval        | Alpaca Eval (%): 17.77          | 22
Instruction Following | Alpaca-Eval (test) | Length-Controlled Winrate: 66.85 | 6