Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency

About

Large language models (LLMs) have exhibited remarkable ability in code generation. However, generating the correct solution in a single attempt remains a challenge. Prior work uses verification properties from software engineering to verify and re-rank solutions by majority voting. But the underlying assumption, that generated verification properties are of higher quality than the generated solutions, may not always hold. In this paper, we treat them equally as different perspectives of LLMs' reasoning processes. We propose the Multi-Perspective Self-Consistency (MPSC) framework, which incorporates both inter- and intra-consistency across outputs from multiple perspectives. Specifically, we prompt LLMs to generate diverse outputs from three perspectives, Solution, Specification and Test case, and construct a 3-partite graph from them. With two consistency measure functions, we embed both inter- and intra-consistency information into the graph. The optimal choice of solutions is then determined by analysis of the graph. MPSC significantly boosts the performance of foundation models (ChatGPT in this paper) on various benchmarks, including HumanEval (+15.91%), MBPP (+6.43%) and CodeContests (+9.37%), even surpassing GPT-4.
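The graph-based selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the edge weights (pass/fail checks between perspectives), the uniform prior standing in for intra-consistency, the propagation rule, and all names (`inter_consistency`, `rank_solutions`, `alpha`, `iters`) are assumptions chosen for illustration, and the toy "square an integer" task is hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Solution = Callable[[int], int]
Spec = Callable[[int, int], bool]  # post-condition over (input, output)
TestCase = Tuple[int, int]         # (input, expected output)
Vertex = Tuple[str, int]           # (perspective name, index)


def safe_call(fn: Solution, x: int) -> Optional[int]:
    """Run a candidate solution, treating any exception as 'no output'."""
    try:
        return fn(x)
    except Exception:
        return None


def inter_consistency(
    solutions: List[Solution], specs: List[Spec], tests: List[TestCase]
) -> Dict[Tuple[Vertex, Vertex], float]:
    """Weight every cross-perspective edge of the 3-partite graph (assumed measure)."""
    edges: Dict[Tuple[Vertex, Vertex], float] = {}
    inputs = [x for x, _ in tests]
    for i, sol in enumerate(solutions):
        outputs = [safe_call(sol, x) for x in inputs]
        # Solution vs. test case: does the solution return the expected output?
        for k, (_, expected) in enumerate(tests):
            edges[(("sol", i), ("test", k))] = 1.0 if outputs[k] == expected else 0.0
        # Solution vs. specification: fraction of inputs where the post-condition holds.
        for j, spec in enumerate(specs):
            held = sum(1 for x, y in zip(inputs, outputs) if y is not None and spec(x, y))
            edges[(("sol", i), ("spec", j))] = held / max(len(inputs), 1)
    # Specification vs. test case: does the expected output satisfy the post-condition?
    for j, spec in enumerate(specs):
        for k, (x, expected) in enumerate(tests):
            edges[(("spec", j), ("test", k))] = 1.0 if spec(x, expected) else 0.0
    return edges


def rank_solutions(
    solutions: List[Solution], specs: List[Spec], tests: List[TestCase],
    alpha: float = 0.8, iters: int = 50,
) -> List[int]:
    """Return solution indices, best first, after propagating scores over the graph."""
    vertices: List[Vertex] = (
        [("sol", i) for i in range(len(solutions))]
        + [("spec", j) for j in range(len(specs))]
        + [("test", k) for k in range(len(tests))]
    )
    neighbors: Dict[Vertex, Dict[Vertex, float]] = {v: {} for v in vertices}
    for (u, v), w in inter_consistency(solutions, specs, tests).items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    # Uniform prior: a placeholder for an intra-consistency score (assumption).
    prior = {v: 1.0 / len(vertices) for v in vertices}
    score = dict(prior)
    for _ in range(iters):
        new = {
            v: alpha * sum(w * score[u] for u, w in neighbors[v].items())
            + (1 - alpha) * prior[v]
            for v in vertices
        }
        total = sum(new.values())
        score = {v: s / total for v, s in new.items()}  # keep scores bounded
    return sorted(range(len(solutions)), key=lambda i: score[("sol", i)], reverse=True)


# Toy task: square an integer. The second candidate is deliberately wrong.
solutions = [lambda x: x * x, lambda x: x * 2, lambda x: x ** 2]
specs = [lambda x, y: y >= 0, lambda x, y: y == x * x]
tests = [(2, 4), (3, 9), (-1, 1)]
ranking = rank_solutions(solutions, specs, tests)
print(ranking)
```

In this toy setup the doubling candidate agrees with fewer specifications and test cases than the two squaring candidates, so it receives the lowest score. A real system would replace the uniform prior with an actual intra-consistency measure and execute generated code in a sandbox.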

Baizhou Huang, Shuai Lu, Weizhu Chen, Xiaojun Wan, Nan Duan • 2023

Related benchmarks

Task	Dataset	Result
Code Generation	HumanEval	Pass@1 85.37	1043
Code Generation	HumanEval (test)	Pass@1 84.29	612
Code Generation	MBPP (test)	Pass@1 73.23	405
Code Generation	HumanEval+	Pass@1 72.18	393
Code Generation	HumanEval+ (test)	Pass@1 74.39	132
Code Generation	CodeContests (test)	Pass@1 11.94	68
Code Generation	CodeContests	Pass@1 14.39	68
Code Generation	MBPP	Pass@1 72.78	3
Code Generation	CodeContest	Pass@1 10.77	3


