Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation

About

Despite recent progress in code generation, large language models still struggle with programs that must meet complex requirements. Recent work uses plan-and-solve decomposition to reduce complexity and leverages self-tests to refine the generated program. Yet planning in advance for requirements that lie deep within a task can be challenging, and the tests need to be accurate for self-improvement to work. To this end, we propose FunCoder, a code generation framework that combines a divide-and-conquer strategy with functional consensus. Specifically, FunCoder recursively branches off sub-functions as smaller goals during code generation, represented by a tree hierarchy. These sub-functions are then composed to attain more complex objectives. Additionally, we select functions via a consensus formed by identifying similarities in program behavior, which mitigates error propagation. FunCoder outperforms state-of-the-art methods by +9.8% on average on HumanEval, MBPP, xCodeEval and MATH with GPT-3.5 and GPT-4. Moreover, our method demonstrates superiority on smaller models: with FunCoder, StableCode-3b surpasses GPT-3.5 by +18.6% and achieves 97.7% of GPT-4's performance on HumanEval. Further analysis reveals that our proposed dynamic function decomposition is capable of handling complex requirements, and that functional consensus prevails over self-testing in correctness evaluation.
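The idea of functional consensus can be illustrated with a minimal sketch. It assumes that several candidate implementations of the same function have already been sampled, runs them all on a shared set of inputs, and keeps the candidate whose outputs agree most often with those of the others. The names `run_safely` and `select_by_consensus`, the agreement-count score, and the choice to treat a raised exception as agreeing with nothing are illustrative assumptions, not details taken from the paper.

```python
from typing import Any, Callable, List, Sequence, Tuple

Outcome = Tuple[bool, Any]  # (succeeded, returned value)


def run_safely(func: Callable[..., Any], args: Tuple[Any, ...]) -> Outcome:
    """Run one candidate on one input; a raised exception becomes a failure."""
    try:
        return (True, func(*args))
    except Exception:
        return (False, None)


def select_by_consensus(
    candidates: Sequence[Callable[..., Any]],
    inputs: Sequence[Tuple[Any, ...]],
) -> Tuple[int, List[int]]:
    """Return the index of the candidate that agrees most with its peers,
    together with every candidate's agreement score.

    Assumption: two candidates agree on an input only if both ran without
    error and returned equal values.
    """
    # Behaviour table: outputs[i][k] is candidate i's outcome on input k.
    outputs = [[run_safely(f, args) for args in inputs] for f in candidates]

    def agreement(i: int, j: int) -> int:
        return sum(
            a[0] and b[0] and a[1] == b[1]
            for a, b in zip(outputs[i], outputs[j])
        )

    n = len(candidates)
    scores = [
        sum(agreement(i, j) for j in range(n) if j != i) for i in range(n)
    ]
    # Ties are broken in favour of the earliest candidate.
    best = max(range(n), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Three hypothetical candidates for "absolute value"; the last is buggy.
    candidates = [
        lambda x: x if x >= 0 else -x,
        abs,
        lambda x: x,
    ]
    best, scores = select_by_consensus(candidates, [(-2,), (0,), (3,)])
    print(best, scores)
```

No reference outputs or unit tests are needed here: the selection relies only on how the candidates behave relative to one another, which is why a single outlier such as the buggy candidate above ends up with the lowest score.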

Jingchang Chen, Hongxuan Tang, Zheng Chu, Qianglong Chen, Zekun Wang, Ming Liu, Bing Qin • 2024

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval (test)	Pass@1 94.5	701
Mathematical Reasoning	MATH (test)	Overall Accuracy 78.2	433
Code Generation	HumanEval+ v1	Pass@1 84.76	30
Code Generation	HumanEval base v1	Pass@1 93.29	30
Code Generation	MBPP base v1	Pass@1 86.86	30
Code Generation	MBPP v1 (plus)	Pass@1 85.71	30
Code Generation	MBPP sample 200	Pass@1 79.5	18
Code Generation	xCodeEval (sample 500)	Accuracy (Easy) 0.831	17
