Geodesic Semantic Search: Learning Local Riemannian Metrics for Citation Graph Retrieval

About

We present Geodesic Semantic Search (GSS), a retrieval system that learns node-specific Riemannian metrics on citation graphs to enable geometry-aware semantic search. Unlike standard embedding-based retrieval that relies on fixed Euclidean distances, GSS learns a low-rank metric tensor $L_i \in \mathbb{R}^{d \times r}$ at each node, inducing a local positive semi-definite metric $G_i = L_i L_i^\top + \epsilon I$. This parameterization guarantees valid metrics while keeping the model tractable. Retrieval proceeds via multi-source Dijkstra on the learned geodesic distances, followed by Maximal Marginal Relevance reranking and path coherence filtering. On citation prediction benchmarks with 169K papers, GSS achieves 23% relative improvement in Recall@20 over SPECTER+FAISS baselines while providing interpretable citation paths. Our hierarchical coarse-to-fine search with k-means pooling reduces computational cost by 4× compared to flat geodesic search while maintaining 97% retrieval quality. We provide theoretical analysis of when geodesic distances outperform direct similarity, characterize the approximation quality of low-rank metrics, and validate predictions empirically. Code and trained models are available at https://github.com/YCRG-Labs/geodesic-search.
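
The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy graph, and in particular the rule for measuring an edge (averaging the squared lengths under the metrics at both endpoints) are assumptions made for illustration. It shows how $G_i = L_i L_i^\top + \epsilon I$ can be applied without forming the $d \times d$ matrix, and how multi-source Dijkstra accumulates the resulting edge lengths.

```python
import heapq
import math


def local_sq_length(L, v, eps):
    """Return v^T (L L^T + eps I) v, computed as ||L^T v||^2 + eps ||v||^2.

    L is a d x r nested list (the low-rank factor at one node), v has length d.
    """
    r = len(L[0])
    proj = [sum(L[k][c] * v[k] for k in range(len(v))) for c in range(r)]
    return sum(p * p for p in proj) + eps * sum(t * t for t in v)


def edge_length(x, L, i, j, eps):
    """Length of edge (i, j). ASSUMPTION: average the squared lengths
    measured under the local metrics at both endpoints, so the result is symmetric."""
    v = [b - a for a, b in zip(x[i], x[j])]
    sq = 0.5 * (local_sq_length(L[i], v, eps) + local_sq_length(L[j], v, eps))
    return math.sqrt(sq)


def multi_source_dijkstra(adj, x, L, sources, eps=1e-3):
    """Shortest path distance from the nearest source to every reachable node."""
    dist = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for w in adj[u]:
            nd = d + edge_length(x, L, u, w, eps)
            if nd < dist.get(w, math.inf):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist


# Toy example (hypothetical data): a path graph 0 - 1 - 2 with 2-D embeddings.
adj = {0: [1], 1: [0, 2], 2: [1]}
x = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
# Rank-1 factors that stretch the first coordinate at every node.
L_stretch = [[[1.0], [0.0]] for _ in range(3)]
dist = multi_source_dijkstra(adj, x, L_stretch, sources=[0], eps=1.0)
```

With zero factors and `eps=1.0` the edge lengths reduce to Euclidean distances, which makes a convenient sanity check. In a full pipeline the sources would be the top seed papers for a query, and the resulting distances would feed the reranking stage.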

Brandon Yee, Lucas Wang, Kundana Kommini, Krishna Sharma • 2026

Related benchmarks

Task | Dataset | Result | Rank
Citation prediction | arXiv citation network, 169K papers (2022+) | R@10: 39.8 | 6
Concept Bridging | arXiv | Bridge@10: 45.6 | 5
Semantic Search | arXiv | nDCG@10: 61.2 | 5