Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

MCGI: Manifold-Consistent Graph Indexing for Billion-Scale Disk-Resident Vector Search

About

Graph-based Approximate Nearest Neighbor (ANN) search often suffers from performance degradation in high-dimensional spaces due to the Euclidean-Geodesic mismatch, where greedy routing diverges from the underlying data manifold. To address this challenge, this paper presents Manifold-Consistent Graph Indexing (MCGI), a geometry-aware and disk-resident indexing method that leverages Local Intrinsic Dimensionality (LID) to dynamically adapt search strategies to the intrinsic geometry of data. Unlike conventional algorithms that treat dimensions uniformly, MCGI modulates its beam search budget based on in-situ geometric analysis, which reduces sensitivity to data-specific hyperparameters by replacing a single scalar with a geometry-informed range that remains stable across datasets of varying dimensionality. Theoretical analysis demonstrates that MCGI provides robust approximation by preserving manifold-consistent topological connectivity. Extensive evaluations against five industry-standard baselines across five datasets up to billion scales confirm the advantages of the proposed approach.

Dongfang Zhao• 2026

Related benchmarks

TaskDatasetResultRank
Approximate Nearest Neighbor SearchGIST1M
Peak QPS (R@10 >= 95%)375.1
3
Showing 1 of 1 rows

Other info

Follow for update