RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion

About

Code Large Language Models (CodeLLMs) have demonstrated impressive proficiency in code completion tasks. However, they often fall short of fully understanding the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies, which can result in less precise completions. To overcome these limitations, we present RepoHyper, a multifaceted framework designed to address the complex challenges associated with repository-level code completion. Central to RepoHyper is the Repo-level Semantic Graph (RSG), a novel semantic graph structure that encapsulates the vast context of code repositories. Furthermore, RepoHyper leverages an Expand and Refine retrieval method, comprising a graph expansion step and a link prediction algorithm applied to the RSG, which enables the effective retrieval and prioritization of relevant code snippets. Our evaluations show that RepoHyper markedly outperforms existing techniques in repository-level code completion, showing higher accuracy across various datasets than several strong baselines. Our implementation of RepoHyper can be found at https://github.com/FSoft-AI4Code/RepoHyper.
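
To make the search, expand, and refine stages named in the abstract concrete, here is a minimal Python sketch on a toy graph. It is not the authors' implementation. The node names, edges, and query are hypothetical. Token-overlap (Jaccard) similarity stands in for both the initial search and the learned link predictor, and a breadth-first walk stands in for the graph expansion step.

```python
from collections import deque


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; an assumed stand-in for a learned scorer."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def search(nodes: dict, query: str, n_seeds: int = 1) -> list:
    """Search: pick the nodes most similar to the query as seeds."""
    ranked = sorted(nodes, key=lambda n: (-jaccard(nodes[n], query), n))
    return ranked[:n_seeds]


def expand(edges: dict, seeds: list, hops: int = 1) -> set:
    """Expand: collect every node within `hops` edges of a seed (BFS)."""
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # do not walk past the hop limit
        for neighbor in edges.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen


def refine(nodes: dict, candidates: set, query: str, top_k: int = 2) -> list:
    """Refine: re-score candidates against the query and keep the top_k.

    A real system would use a trained link predictor here; Jaccard is
    only a placeholder so the sketch runs without dependencies.
    """
    ranked = sorted(candidates, key=lambda n: (-jaccard(nodes[n], query), n))
    return ranked[:top_k]


# Hypothetical toy graph: node id -> code text, plus call/import edges.
nodes = {
    "utils.load_config": "def load_config path read yaml config",
    "app.main": "def main call load_config start server",
    "server.start": "def start server listen port",
    "tests.test_math": "def test_add assert add",
}
edges = {
    "app.main": ["utils.load_config", "server.start"],
    "utils.load_config": ["app.main"],
    "server.start": ["app.main"],
}
query = "start server with config"

seeds = search(nodes, query, n_seeds=1)
candidates = expand(edges, seeds, hops=2)
context = refine(nodes, candidates, query, top_k=2)
print(seeds, sorted(candidates), context)
```

The split into three functions mirrors the stage names only. Expansion widens the candidate set along graph edges so that structurally related code is considered even when it is textually dissimilar to the query, and refinement then narrows that set back down to what fits in the prompt.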

Huy N. Phan, Hoang N. Phan, Tien N. Nguyen, Nghi D. Q. Bui • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Code Generation | RepoBench-P Python XF-First | Exact Match (EM): 51.2 | 6 |
| Code Generation | RepoBench-P Python, XF-Random | Execution Match (EM): 63.8 | 6 |