TinyLlama: An Open-Source Small Language Model
About
We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention and Lit-GPT), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.
Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | GSM8K | Accuracy | 1.7 | 983 |
| Multi-task Language Understanding | MMLU | Accuracy | 32.6 | 842 |
| Commonsense Reasoning | WinoGrande | Accuracy | 50.36 | 776 |
| Mathematical Reasoning | GSM8K (test) | Accuracy | 14.19 | 751 |
| Question Answering | ARC Challenge | Accuracy | 30.1 | 749 |
| Commonsense Reasoning | PIQA | Accuracy | 73.3 | 647 |
| Question Answering | OpenBookQA | Accuracy | 23 | 465 |
| Code Generation | HumanEval (test) | -- | -- | 444 |
| Question Answering | ARC Easy | Normalized Acc. | 55.3 | 385 |
| Boolean Question Answering | BoolQ | Accuracy | 57.8 | 307 |
(Showing 10 of 64 benchmark results.)
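For readers who want to work with these numbers programmatically, here is a minimal sketch that collects the accuracy-style scores from the table above into a dictionary and ranks the datasets by score. The values are copied from the table; the names `results` and `by_score` are illustrative, not part of the TinyLlama codebase.

```python
# TinyLlama benchmark scores copied from the table above
# (accuracy %, except ARC Easy, which is normalized accuracy;
# HumanEval is omitted because no score is listed).
results = {
    "GSM8K": 1.7,
    "MMLU": 32.6,
    "WinoGrande": 50.36,
    "GSM8K (test)": 14.19,
    "ARC Challenge": 30.1,
    "PIQA": 73.3,
    "OpenBookQA": 23.0,
    "ARC Easy": 55.3,
    "BoolQ": 57.8,
}

# Rank the datasets by score, highest first.
by_score = sorted(results.items(), key=lambda kv: kv[1], reverse=True)

for dataset, score in by_score:
    print(f"{dataset}: {score}")
```

With these values, PIQA (73.3) comes out on top and GSM8K few-shot (1.7) at the bottom, reflecting the usual pattern for models of this size: strong commonsense scores, weak mathematical reasoning.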