
TinyLlama: An Open-Source Small Language Model

About

We present TinyLlama, a compact 1.1B language model pretrained on around 1 trillion tokens for approximately 3 epochs. Building on the architecture and tokenizer of Llama 2, TinyLlama leverages various advances contributed by the open-source community (e.g., FlashAttention and Lit-GPT), achieving better computational efficiency. Despite its relatively small size, TinyLlama demonstrates remarkable performance in a series of downstream tasks. It significantly outperforms existing open-source language models with comparable sizes. Our model checkpoints and code are publicly available on GitHub at https://github.com/jzhang38/TinyLlama.

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu • 2024
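The training budget stated in the abstract can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: the three constants are the rounded figures from the abstract ("1.1B", "around 1 trillion tokens", "approximately 3 epochs"), and the helper names are hypothetical, not taken from the TinyLlama codebase.

```python
# Rounded figures quoted in the abstract; all three are approximate.
PARAMS = 1.1e9          # model size: 1.1B parameters
UNIQUE_TOKENS = 1.0e12  # "around 1 trillion tokens"
EPOCHS = 3              # "approximately 3 epochs"


def total_tokens_seen(unique_tokens: float, epochs: float) -> float:
    """Tokens processed during pretraining, counting repeated passes."""
    return unique_tokens * epochs


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Ratio of training tokens processed to model parameters."""
    return tokens / params


total = total_tokens_seen(UNIQUE_TOKENS, EPOCHS)
ratio = tokens_per_parameter(total, PARAMS)

print(f"total tokens seen:    {total:.1e}")
print(f"tokens per parameter: {ratio:.0f}")
```

Under these rounded inputs the model processes roughly 3 trillion tokens in total, or about 2,700 tokens per parameter. Both numbers inherit the approximation in the abstract's own wording.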

Related benchmarks

Task                               Dataset            Result                    Rank
Commonsense Reasoning              WinoGrande         Accuracy 59.4             1442
Mathematical Reasoning             GSM8K              Accuracy 1.7              1398
Question Answering                 ARC Challenge      Accuracy 30.1             906
Multi-task Language Understanding  MMLU               Accuracy 32.6             881
Instruction Following              IFEval             --                        836
Mathematical Reasoning             GSM8K (test)       Accuracy 14.19            816
Commonsense Reasoning              PIQA               Accuracy 73.3             757
Commonsense Reasoning              HellaSwag          HellaSwag Accuracy 60.8   711
Physical Commonsense Reasoning     PIQA               Accuracy 74.8             696
Code Generation                    HumanEval (test)   --                        612
Showing 10 of 112 rows
