
Tulu

Benchmarks

Task Name                    Dataset Name                        SOTA Result              Trend
Instruction Following        Tulu3 Evaluation Suite pool (test)  ARC 92.54                25
Tulu generation              Tulu                                Grammar Accuracy 85      12
Membership Inference Attack  Tulu3 Mix Aya                       AUROC 68                 8
Model Fingerprinting         Tulu 2 DPO 7B                       Similarity Score 0.9999  7
Helpful assistant task       Tulu-2 13B                          HV Score 1.2562          3