CodeGemma: Open Code Models Based on Gemma
About
This paper introduces CodeGemma, a collection of specialized open code models built on top of Gemma, capable of a variety of code and natural language generation tasks. We release three model variants. CodeGemma 7B pretrained (PT) and instruction-tuned (IT) variants have remarkably resilient natural language understanding, excel in mathematical reasoning, and match code capabilities of other open models. CodeGemma 2B is a state-of-the-art code completion model designed for fast code infilling and open-ended generation in latency-sensitive settings.
CodeGemma Team: Heri Zhao, Jeffrey Hui, Joshua Howland, Nam Nguyen, Siqi Zuo, Andrea Hu, Christopher A. Choquette-Choo, Jingyue Shen, Joe Kelley, Kshitij Bansal, Luke Vilnis, Mateo Wirth, Paul Michel, Peter Choy, Pratik Joshi, Ravin Kumar, Sarmad Hashmi, Shubham Agrawal, Zhitao Gong, Jane Fine, Tris Warkentin, Ale Jakse Hartman, Bin Ni, Kathy Korevec, Kelly Schaefer, Scott Huffman • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 54.88 | 1036 |
| Code Generation | HumanEval+ | Pass@1 | 41.46 | 383 |
| Code Generation | MBPP+ | Pass@1 | 54.76 | 216 |
| Code Generation | MBPP | Pass@1 Accuracy | 53.2 | 59 |
| Code Generation | LiveCodeBench | Pass@1 | 8.12 | 51 |
| Code Completion | APC Hard Completion 1.0 (test) | HCR | 100 | 33 |
| Code Completion | APC Placeholder Completion 1.0 (test) | PCR | 0.00 | 33 |
| Code Generation | DS-1000 1.0 (test) | Matplotlib | 54.7 | 19 |
| Code Generation | BigCodeBench | Pass@1 | 25.44 | 18 |
| Assembly Code Reverse Engineering | REx86 (test) | CE | 5.4166 | 16 |
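Most of the code generation rows above report Pass@1. The listing does not say how each score was computed, so the following is only a minimal sketch that assumes the widely used unbiased pass@k estimator: for a problem with n sampled completions of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k). The function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical and chosen for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of completions sampled for the problem
    c: number of those completions that pass all tests
    k: sampling budget being evaluated (k <= n)

    Assumption: the standard unbiased estimator, not necessarily
    the exact procedure behind the scores listed above.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, returned as a percentage."""
    if not results:
        raise ValueError("results must not be empty")
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)
```

With k = 1 the estimator reduces to c / n per problem, so a benchmark-level Pass@1 under this assumption is the mean fraction of passing samples expressed as a percentage.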