
CoRoVA: Compressed Representations for Vector-Augmented Code Completion

About

Retrieval-augmented generation has emerged as one of the most effective approaches for improving code completion, especially when repository-level context is important. However, adding the retrieved context significantly increases sequence length, raises prefill cost, and degrades time-to-first-token (TTFT), which slows down inference, a critical limitation for interactive settings such as IDEs. In this work, we introduce CoRoVA, a framework that compresses context into compact, semantically rich representations that remain interpretable to code LLMs. This improves generation quality while reducing prompt augmentation to only a few compressed single-token vectors. Our approach requires training only a small projector module and introduces negligible additional latency, yet it significantly improves the prediction quality of code LLMs. Our experiments show that CoRoVA enables a 20-38% reduction in TTFT on completion tasks compared to uncompressed RAG.
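The abstract does not describe the projector's architecture, so the following is only a minimal sketch of the general idea: each retrieved chunk is assumed to be already encoded as one vector, a small MLP maps that vector into the code LLM's embedding space, and the result is prepended to the prompt's token embeddings so that each chunk occupies a single token position. The two-layer MLP, the dimensions `D_ENC`, `D_HIDDEN`, and `D_MODEL`, and the function names are all hypothetical and are not taken from the paper.

```python
import numpy as np

# All sizes below are hypothetical, chosen only for illustration.
D_ENC = 64      # size of the vector an encoder produces per retrieved chunk
D_HIDDEN = 96   # width of the projector's hidden layer
D_MODEL = 128   # embedding size of the code LLM


def init_projector(rng):
    """Create the weights of a small two-layer MLP projector (assumed design)."""
    return {
        "w1": rng.normal(0.0, 0.02, size=(D_ENC, D_HIDDEN)),
        "b1": np.zeros(D_HIDDEN),
        "w2": rng.normal(0.0, 0.02, size=(D_HIDDEN, D_MODEL)),
        "b2": np.zeros(D_MODEL),
    }


def project(params, chunk_vectors):
    """Map (k, D_ENC) chunk vectors to (k, D_MODEL) single-token vectors."""
    hidden = np.maximum(chunk_vectors @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]


def build_inputs(params, chunk_vectors, prompt_embeddings):
    """Prepend the projected vectors to the prompt's token embeddings."""
    soft_tokens = project(params, chunk_vectors)
    return np.concatenate([soft_tokens, prompt_embeddings], axis=0)


rng = np.random.default_rng(0)
params = init_projector(rng)
chunk_vectors = rng.normal(size=(4, D_ENC))          # 4 retrieved chunks, one vector each
prompt_embeddings = rng.normal(size=(50, D_MODEL))   # 50 prompt tokens
inputs = build_inputs(params, chunk_vectors, prompt_embeddings)
print(inputs.shape)  # (54, 128)
```

In this sketch the augmented sequence is only 4 positions longer than the bare prompt, whereas pasting the same four chunks as text would add every one of their tokens to the prefill. Only `params` would be trained; the encoder and the LLM are assumed to stay frozen.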

Daria Cherniuk, Nikita Sukhorukov, Danil Gusak, Nikita Sushko, Danil Sivtsov, Elena Tutubalina, Evgeny Frolov • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| Function Completion | RepoEval amazon-science patchcore-inspection | Pass@5: 66.94 | 3 |
| Function Completion | RepoEval leopard (ai-betty) | Pass@5: 64.99 | 3 |
| Function Completion | RepoEval deepmind tracr | Pass@5: 52.76 | 3 |
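The listing does not say how the Pass@5 scores were computed. For readers unfamiliar with the metric, the sketch below shows the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n completions are sampled per problem and c of them pass the tests. The function names and the sample counts in the example are illustrative assumptions, not values from the benchmark.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: completions sampled, c: completions that pass the tests, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    if n - c < k:
        # Every size-k subset contains at least one passing completion.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over problems, as a percentage.

    results: list of (n, c) pairs, one per problem (hypothetical input format).
    """
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


# Illustrative counts only: three problems, 10 samples each.
example = [(10, 3), (10, 0), (10, 10)]
print(round(mean_pass_at_k(example, k=5), 2))
```

A problem with no passing completions contributes 0 and a problem where every completion passes contributes 1, so the reported score is the share of problems expected to be solved within k attempts.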
