Vector Linking via Cross-Model Local Isometric Consistency

About

We study Vector Linking: given two embedding clouds produced by different black-box encoders over partially overlapping datasets, recover cross-model object correspondences using only the vectors. Empirically and theoretically, we show that independently trained contrastive encoders exhibit local geometric consistency: short-range distances are approximately preserved up to a scale factor, whereas long-range distances are not, because of model-specific distortion. Building on this, we propose an iterative, reference-based geometric embedding hashing method that recovers vector links from a tiny seed set of paired anchors. The method represents each vector by its distances to sampled paired anchors, proposes candidate links via hash-space matching, and aggregates evidence across views in a Beta-Bernoulli posterior so that high-confidence links can be bootstrapped as new anchors. Experiments across multiple benchmarks and embedding model pairs demonstrate accurate and robust linking under varying overlap, seed budgets, and out-of-domain anchors, with applications to vector database integration and cross-model clustering. Code is available at https://github.com/DBgroup-Edinburgh/VecLinking.
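The pipeline described above (anchor-distance signatures, candidate matching, Beta-Bernoulli aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sum-normalisation used to cancel the scale factor, and the uniform Beta(1, 1) prior are all assumptions made for illustration, and plain nearest-neighbour search over signatures stands in for the paper's hash-space matching. The iterative step of promoting confident links to new anchors is omitted.

```python
import math
import random


def signature(vec, anchors):
    """Distances from vec to each anchor, normalised to sum to 1.

    Normalising cancels a global scale factor between the two models
    (an illustrative choice, not necessarily the paper's).
    """
    dists = [math.dist(vec, a) for a in anchors]
    total = sum(dists) or 1.0
    return [d / total for d in dists]


def propose_links(cloud_a, cloud_b, anchors_a, anchors_b):
    """For each index in cloud_a, propose the cloud_b index with the closest signature."""
    sigs_b = [signature(v, anchors_b) for v in cloud_b]
    links = {}
    for i, v in enumerate(cloud_a):
        s = signature(v, anchors_a)
        links[i] = min(range(len(cloud_b)), key=lambda j: math.dist(s, sigs_b[j]))
    return links


def link_posteriors(cloud_a, cloud_b, seed_pairs, views=10, k=3,
                    alpha=1.0, beta=1.0, rng=None):
    """Posterior mean of each proposed link under a Beta(alpha, beta) prior.

    Each view samples k paired anchors; a link scores one success for
    every view that proposes it and one failure for every view that does not.
    """
    rng = rng or random.Random(0)
    votes = {}
    for _ in range(views):
        sample = rng.sample(seed_pairs, k)
        anchors_a = [cloud_a[i] for i, _ in sample]
        anchors_b = [cloud_b[j] for _, j in sample]
        for pair in propose_links(cloud_a, cloud_b, anchors_a, anchors_b).items():
            votes[pair] = votes.get(pair, 0) + 1
    return {pair: (alpha + s) / (alpha + beta + views) for pair, s in votes.items()}


# Toy data: cloud_b is a rotated, rescaled copy of cloud_a, an idealised
# stand-in for a second encoder whose distances agree up to scale.
data_rng = random.Random(0)
cloud_a = [(data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)) for _ in range(12)]
theta, scale = 0.7, 2.5
c, s = math.cos(theta), math.sin(theta)
cloud_b = [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in cloud_a]

seed_pairs = [(i, i) for i in range(4)]  # four known paired anchors
posteriors = link_posteriors(cloud_a, cloud_b, seed_pairs)
```

In this idealised setting every view proposes the true link, so each correct pair accumulates ten successes out of ten views. Real encoders preserve only short-range distances, which is why the abstract stresses sampling anchors over multiple views and aggregating the evidence rather than trusting a single signature.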

Ziying Chen, Yang Cao, He Sun, Beining Yang, Tianjian Yang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Cross-model clustering | REDDIT | V-measure | 67.1 | 14 |
| Cross-model clustering | StackEx | V-measure | 68.3 | 14 |
| Retrieval | FEVER | Precision | 93.8 | 10 |
| Vector Linking | NFCorpus | Precision | 82.1 | 8 |
| Vector Linking | SciFact | Precision | 83.2 | 8 |
| Vector Linking | ArguAna | Precision | 77.1 | 8 |
| Vector Linking | SCIDOCS | Precision | 82.8 | 8 |
| Vector Linking | FiQA | Precision | 79.8 | 8 |