HyEm: Query-Adaptive Hyperbolic Retrieval for Biomedical Ontologies via Euclidean Vector Indexing

About

Retrieval-augmented generation (RAG) for biomedical knowledge faces a hierarchy-aware ontology grounding challenge: resources like HPO, DO, and MeSH use deep "is-a" taxonomies, yet production stacks rely on Euclidean embeddings and ANN indexes. While hyperbolic embeddings suit hierarchical representation, they face two barriers: (i) lack of native vector database support, and (ii) risk of underperforming on entity-centric queries where hierarchy is irrelevant. We present HyEm, a lightweight retrieval layer integrating hyperbolic ontology embeddings into existing Euclidean ANN infrastructure. HyEm learns radius-controlled hyperbolic embeddings, stores origin log-mapped vectors in standard Euclidean databases for candidate retrieval, then applies exact hyperbolic reranking. A query-adaptive gate outputs continuous mixing weights, combining Euclidean semantic similarity with hyperbolic hierarchy distance at reranking time. Our bi-Lipschitz analysis under radius constraints provides practical guidance for ANN oversampling and dimensionality. Experiments on biomedical ontology subsets demonstrate that HyEm preserves 94-98% of Euclidean baseline performance on entity-centric queries while substantially improving hierarchy-navigation and mixed-intent queries, maintaining indexability at moderate oversampling.
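
The two-stage pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the unit Poincaré ball (curvature -1), uses a brute-force Euclidean search as a stand-in for an ANN index, and takes the gate output as a given scalar `alpha` in [0, 1]. The function names, the `oversample` parameter, and the scoring formula `(1 - alpha) * cosine - alpha * hyperbolic_distance` are all hypothetical choices made for illustration.

```python
import numpy as np

def log_map_origin(x, eps=1e-9):
    """Origin log map on the unit Poincare ball: artanh(|x|) * x / |x|."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    safe = np.maximum(norm, eps)              # avoid division by zero at the origin
    clipped = np.minimum(norm, 1.0 - eps)     # keep artanh finite near the boundary
    return np.arctanh(clipped) * x / safe

def poincare_distance(q, X):
    """Exact geodesic distance from point q (d,) to each row of X (n, d)."""
    sq = np.sum((X - q) ** 2, axis=-1)
    denom = (1.0 - np.sum(q ** 2)) * (1.0 - np.sum(X ** 2, axis=-1))
    return np.arccosh(1.0 + 2.0 * sq / denom)

def retrieve(q_hyp, q_sem, items_hyp, items_sem, alpha, k=5, oversample=4):
    """Candidate search in log-mapped space, then gated hyperbolic reranking.

    alpha is assumed to come from a query-adaptive gate (not modeled here):
    alpha near 0 favors semantic similarity, alpha near 1 favors hierarchy.
    """
    # Stage 1: Euclidean search over log-mapped vectors (brute force here;
    # a real deployment would query a vector database instead).
    index = log_map_origin(items_hyp)
    q_tan = log_map_origin(q_hyp)
    n_cand = min(len(index), k * oversample)
    cand = np.argsort(np.linalg.norm(index - q_tan, axis=1))[:n_cand]

    # Stage 2: exact hyperbolic distance and cosine similarity on candidates.
    d_hyp = poincare_distance(q_hyp, items_hyp[cand])
    sem = items_sem[cand] @ q_sem / (
        np.linalg.norm(items_sem[cand], axis=1) * np.linalg.norm(q_sem)
    )
    score = (1.0 - alpha) * sem - alpha * d_hyp   # hypothetical mixing rule
    order = np.argsort(-score)[:k]
    return cand[order], score[order]
```

Oversampling in stage 1 matters because Euclidean distance between log-mapped vectors only approximates the hyperbolic distance; retrieving `k * oversample` candidates gives the exact reranker room to recover neighbors that the approximation ranked slightly too low.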

Ou Deng, Shoji Nishimura, Atsushi Ogihara, Qun Jin • 2026

Related benchmarks

Task                           Dataset         Metric          Result   Rank
Taxonomy-navigation retrieval  HPO-5k          Parent Hits@5   16.4     6
Taxonomy-navigation retrieval  DO-5k           Parent Hits@5   12.8     6
Entity-centric retrieval       HPO-5k seed=0   Hits@1          73.6     5
Entity-centric retrieval       DO 5k seed=0    Hits@1          59.8     5
Ontology Retrieval             HPO-20k         Q-E Retention   95.9     1
Ontology Retrieval             DO 20k          Q-E Retention   96.4     1
