| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Instruction Following | Tulu3 Evaluation Suite pool (test) | ARC | 92.54 | 25 |
| Tulu generation | Tulu | Grammar Accuracy | 85 | 12 |
| Membership Inference Attack | Tulu3 Mix Aya | AUROC | 68 | 8 |
| Helpful assistant task | Tulu-2 13B | HV Score | 1.2562 | 3 |