LLaMA: Open and Efficient Foundation Language Models
About
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample • 2023
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Language Modeling | WikiText2 | Perplexity | 5.69 | 3785 |
| Language Modeling | WikiText-2 (test) | PPL | 5.68 | 2333 |
| Language Modeling | WikiText-2 | Perplexity (PPL) | 5.68 | 2320 |
| Commonsense Reasoning | HellaSwag | Accuracy | 84.2 | 1896 |
| Language Modeling | C4 | Perplexity | 7.08 | 1688 |
| Commonsense Reasoning | WinoGrande | Accuracy | 73 | 1442 |
| Mathematical Reasoning | GSM8K | Accuracy | 93 | 1398 |
| Language Modeling | PTB | Perplexity | 8.93 | 1234 |
| Code Generation | HumanEval | Pass@1 | 45.7 | 1043 |
| Question Answering | ARC Challenge | Accuracy | 52.7 | 906 |
Showing 10 of 606 rows
...