
When Does Embedding Magnitude Matter? A Cross-Task Functional-Symmetry Framework

About

Cosine similarity normalizes both sides; dot product normalizes neither. We propose a 2x2 framework that independently controls query-side and document-side normalization, exposing two intermediate variants (QNorm, DNorm) that have not been previously studied. On retrieval with four encoders, evaluated in-domain on MS MARCO and out-of-domain on BEIR, BRIGHT, and multi-hop QA, the unilateral variants outperform both cosine and dot product, with relative gains of up to +72% out-of-domain and +24% on downstream RAG. Cross-evaluation reveals the mechanism: document magnitude scales inference scores while query magnitude modulates training gradients, and the Fisher Information Matrix condition number predicts which side to normalize. We then classify tasks by functional symmetry, defined as whether the aggregate scoring procedure treats Q and C as interchangeable, and test whether the mechanism extends beyond retrieval. On five additional task families (semantic textual similarity, CLIP, knowledge graph completion, few-shot classification, recommender systems), the coarse prediction (cosine for symmetric, magnitude-preserving for asymmetric) holds in every case examined; the unilateral variants beat Cosine on recommendation, and on few-shot classification DNorm beats both Cosine and the standard Euclidean default of Prototypical Networks.

Xincan Feng, Taro Watanabe • 2026
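The 2x2 framework in the abstract can be illustrated with a small sketch. It assumes, based only on the variant names, that QNorm L2-normalizes the query side alone and DNorm L2-normalizes the document side alone. The helper names (`l2_normalize`, `score`, `VARIANTS`) and the toy vectors are hypothetical and are not taken from the paper's code.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def score(queries, docs, normalize_query, normalize_doc):
    """Inner-product scores with independently controlled normalization.

    queries: (num_queries, dim), docs: (num_docs, dim)
    returns: (num_queries, num_docs) score matrix
    """
    q = l2_normalize(queries) if normalize_query else queries
    d = l2_normalize(docs) if normalize_doc else docs
    return q @ d.T

# The four cells of the 2x2 grid: (normalize_query, normalize_doc).
# Assumption: QNorm = query side only, DNorm = document side only.
VARIANTS = {
    "Dot": (False, False),
    "QNorm": (True, False),
    "DNorm": (False, True),
    "Cosine": (True, True),
}

# Toy data: one query and two documents with very different magnitudes.
queries = np.array([[3.0, 4.0]])
docs = np.array([[1.0, 0.0], [0.0, 10.0]])

scores = {name: score(queries, docs, *flags) for name, flags in VARIANTS.items()}
for name, s in scores.items():
    print(name, s)
```

For a fixed query, normalizing the query multiplies all of that query's scores by the same positive constant, so in this sketch QNorm ranks documents exactly as Dot does, and DNorm ranks them exactly as Cosine does. Under this reading, any difference within those pairs would have to come from training rather than inference-time ranking, which is consistent with the abstract's statement that query magnitude modulates training gradients.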

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Information Retrieval | BEIR | -- | 120 |
| Information Retrieval | TREC DL 19 | nDCG@10: 60.43 | 61 |
| Information Retrieval | TREC DL20 | nDCG@10: 59.69 | 50 |
| Retrieval | BRIGHT 12 datasets aggregate (test) | nDCG@10: 12.74 | 20 |
| Question Answering | HotpotQA (test) | EM: 32.7 | 18 |
| Information Retrieval | MS MARCO (dev) | -- | 15 |
| Information Retrieval | Multi-hop | nDCG@10: 58.16 | 12 |
| Open-domain Question Answering | NQ 3.5K (test) | EM: 0.261 | 5 |
| Open-domain Question Answering | TriviaQA 11.3K (test) | EM: 40.2 | 5 |