When Does Embedding Magnitude Matter? A Cross-Task Functional-Symmetry Framework

About

Cosine similarity normalizes both sides; dot product normalizes neither. We propose a 2x2 framework that independently controls query-side and document-side normalization, exposing two intermediate variants (QNorm, DNorm) that have not been previously studied. On retrieval with four encoders, evaluated in-domain on MS MARCO and out-of-domain on BEIR, BRIGHT, and multi-hop QA, the unilateral variants outperform both cosine and dot product, with relative gains of up to +72% out-of-domain and +24% on downstream RAG. Cross-evaluation reveals the mechanism: document magnitude scales inference scores while query magnitude modulates training gradients, and the Fisher Information Matrix condition number predicts which side to normalize. We then classify tasks by functional symmetry, defined as whether the aggregate scoring procedure treats Q and C as interchangeable, and test whether the mechanism extends beyond retrieval. On five additional task families (semantic textual similarity, CLIP, knowledge graph completion, few-shot classification, recommender systems), the coarse prediction (cosine for symmetric, magnitude-preserving for asymmetric) holds in every case examined; the unilateral variants beat Cosine on recommendation, and on few-shot classification DNorm beats both Cosine and the standard Euclidean default of Prototypical Networks.

Xincan Feng, Taro Watanabe• 2026

Related benchmarks

Task	Dataset	Result
Information Retrieval	BEIR	--	120
Information Retrieval	TREC DL 19	nDCG@1060.43	61
Information Retrieval	TREC DL20	NDCG@1059.69	50
Retrieval	BRIGHT 12 datasets aggregate (test)	NDCG@1012.74	20
Question Answering	HotpotQA (test)	EM32.7	18
Information Retrieval	MS MARCO (dev)	--	15
Information Retrieval	Multi-hop	NDCG@1058.16	12
Open-domain Question Answering	NQ 3.5K (test)	EM0.261	5
Open-domain Question Answering	TriviaQA 11.3K (test)	EM40.2	5

Showing 9 of 9 rows

Other info

Follow for update

@wizwand_team Discord