
Competition-Level Code Generation with AlphaCode

About

Programming is a powerful and ubiquitous problem-solving tool. Developing systems that can assist programmers or even generate programs independently could make programming more productive and accessible, yet so far incorporating innovations in AI has proven challenging. Recent large-scale language models have demonstrated an impressive ability to generate code, and are now able to complete simple programming tasks. However, these models still perform poorly when evaluated on more complex, unseen problems that require problem-solving skills beyond simply translating instructions into code. For example, competitive programming problems which require an understanding of algorithms and complex natural language remain extremely challenging. To address this gap, we introduce AlphaCode, a system for code generation that can create novel solutions to these problems that require deeper reasoning. In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved on average a ranking of top 54.3% in competitions with more than 5,000 participants. We found that three key components were critical to achieve good and reliable performance: (1) an extensive and clean competitive programming dataset for training and evaluation, (2) large and efficient-to-sample transformer-based architectures, and (3) large-scale model sampling to explore the search space, followed by filtering based on program behavior to a small set of submissions.

Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu, Oriol Vinyals • 2022
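
The third component named in the abstract, sampling many candidates and then filtering them by program behavior down to a small set of submissions, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: candidates are modeled as plain Python callables from input text to output text, the names passes_examples and select_submissions are hypothetical, and grouping survivors by their outputs on extra probe inputs is just one simple way to act on "program behavior".

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in: a candidate program maps input text to output text.
Program = Callable[[str], str]


def passes_examples(program: Program, examples: Sequence[Tuple[str, str]]) -> bool:
    """Return True if the candidate reproduces every example output."""
    for given_input, expected in examples:
        try:
            if program(given_input).strip() != expected.strip():
                return False
        except Exception:
            # A crashing candidate counts as a failure.
            return False
    return True


def select_submissions(
    candidates: Sequence[Program],
    examples: Sequence[Tuple[str, str]],
    probe_inputs: Sequence[str],
    max_submissions: int = 10,
) -> List[Program]:
    """Filter candidates on the examples, group survivors by behavior,
    and keep one representative from each of the largest groups."""
    survivors = [p for p in candidates if passes_examples(p, examples)]

    groups: Dict[Tuple[str, ...], List[Program]] = defaultdict(list)
    for program in survivors:
        signature = []
        for probe in probe_inputs:
            try:
                signature.append(program(probe).strip())
            except Exception:
                signature.append("<error>")
        groups[tuple(signature)].append(program)

    # Larger groups first: many samples agreeing is treated as weak evidence.
    ranked = sorted(groups.values(), key=len, reverse=True)
    return [group[0] for group in ranked[:max_submissions]]


# Toy problem (assumed): print twice the given integer.
candidates: List[Program] = [
    lambda s: str(int(s) * 2),     # correct
    lambda s: str(int(s) + int(s)),  # correct, different code, same behavior
    lambda s: str(int(s) ** 2),    # wrong, but passes the single example
    lambda s: "4",                 # wrong, but passes the single example
    lambda s: str(int(s) + 1),     # wrong, fails the example
]
picked = select_submissions(
    candidates,
    examples=[("2", "4")],
    probe_inputs=["3", "5"],
    max_submissions=2,
)
```

In this toy run the example test removes only one wrong candidate; the behavioral grouping is what pushes the two agreeing correct programs to the front of the submission list.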

Related benchmarks

Task                Dataset                  Result                    Rank
Code Generation     HumanEval                Pass@1 17.1               850
Code Generation     HumanEval (test)         Pass@1 55.1               444
Code Generation     MBPP (test)              Pass@1 62                 276
Code Generation     HumanEval 1.0 (test)     Pass@1 17.1               145
Code Generation     CodeContests (test)      --                        42
Code Generation     APPS (test)              Introductory Score 20.4   36
Competitive Coding  USACO                    Best@8 81.8               16
Competitive Coding  OJBench                  Best@8 Score 43.8         16
Competitive Coding  ICPC-Eval                Best@8 28.7               16
Competitive Coding  CodeForces               Best@8 46.7               16
Showing 10 of 17 rows
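
Several rows above report Pass@1. This page does not say how those numbers were computed, so the following is only a sketch of the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the tests. The function name pass_at_k is an assumption for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes,
    given that c out of n generated samples passed."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any k samples include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# With k = 1 the estimate reduces to the plain pass rate c / n.
rate = pass_at_k(n=10, c=3, k=1)
```

A benchmark score is then the mean of this value over all problems, usually reported as a percentage.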
