
Tulu

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Instruction Following | Tulu3 Evaluation Suite pool (test) | ARC 92.54 | 25
Tulu generation | Tulu | Grammar Accuracy 85 | 12
Membership Inference Attack | Tulu3 Mix Aya | AUROC 68 | 8
Helpful assistant task | Tulu-2 13B | HV Score 1.2562 | 3