When Does Embedding Magnitude Matter? A Cross-Task Functional-Symmetry Framework
About
Cosine similarity normalizes both sides; dot product normalizes neither. We propose a 2x2 framework that independently controls query-side and document-side normalization, exposing two intermediate variants (QNorm, DNorm) that have not been previously studied. On retrieval with four encoders, evaluated in-domain on MS MARCO and out-of-domain on BEIR, BRIGHT, and multi-hop QA, the unilateral variants outperform both cosine and dot product, with relative gains of up to +72% out-of-domain and +24% on downstream RAG. Cross-evaluation reveals the mechanism: document magnitude scales inference scores while query magnitude modulates training gradients, and the Fisher Information Matrix condition number predicts which side to normalize. We then classify tasks by functional symmetry, defined as whether the aggregate scoring procedure treats Q and C as interchangeable, and test whether the mechanism extends beyond retrieval. On five additional task families (semantic textual similarity, CLIP, knowledge graph completion, few-shot classification, recommender systems), the coarse prediction (cosine for symmetric, magnitude-preserving for asymmetric) holds in every case examined; the unilateral variants beat Cosine on recommendation, and on few-shot classification DNorm beats both Cosine and the standard Euclidean default of Prototypical Networks.
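The 2x2 framework can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation. It assumes that QNorm means L2-normalizing the query side only and DNorm means L2-normalizing the document side only, which the abstract implies but does not spell out. The names `l2_normalize`, `score`, and `VARIANTS` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def score(queries, docs, normalize_query=False, normalize_doc=False):
    """Return the (num_queries, num_docs) score matrix for one cell of the 2x2 grid."""
    q = l2_normalize(queries) if normalize_query else queries
    d = l2_normalize(docs) if normalize_doc else docs
    return q @ d.T

# Assumed mapping of variant names to (normalize_query, normalize_doc).
VARIANTS = {
    "Dot": (False, False),
    "QNorm": (True, False),   # assumption: query side normalized only
    "DNorm": (False, True),   # assumption: document side normalized only
    "Cosine": (True, True),
}

queries = np.array([[3.0, 4.0]])             # one query with norm 5
docs = np.array([[1.0, 0.0], [0.0, 2.0]])    # two documents with norms 1 and 2

scores = {name: score(queries, docs, nq, nd) for name, (nq, nd) in VARIANTS.items()}
for name, s in scores.items():
    print(name, s)
```

For a fixed query, dividing by the query norm rescales all of that query's scores by the same positive constant, so under these assumptions QNorm ranks documents exactly as Dot does at inference time, and Cosine ranks them exactly as DNorm does. Any difference between those pairs must therefore come from training, which is consistent with the abstract's claim that query magnitude modulates training gradients while document magnitude scales inference scores.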
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Information Retrieval | BEIR | -- | 120 |
| Information Retrieval | TREC DL 19 | nDCG@10 60.43 | 61 |
| Information Retrieval | TREC DL20 | nDCG@10 59.69 | 50 |
| Retrieval | BRIGHT 12 datasets aggregate (test) | nDCG@10 12.74 | 20 |
| Question Answering | HotpotQA (test) | EM 32.7 | 18 |
| Information Retrieval | MS MARCO (dev) | -- | 15 |
| Information Retrieval | Multi-hop | nDCG@10 58.16 | 12 |
| Open-domain Question Answering | NQ 3.5K (test) | EM 0.261 | 5 |
| Open-domain Question Answering | TriviaQA 11.3K (test) | EM 40.2 | 5 |