
MrRoPE: Mixed-radix Rotary Position Embedding

About

Rotary Position Embedding (RoPE) extension refers to modifying or generalizing the RoPE scheme to handle sequences longer than those encountered during pre-training. Current extension strategies, however, are highly diverse and lack a unified theoretical foundation. In this paper, we propose MrRoPE (Mixed-radix RoPE), a generalized encoding formulation based on a radix-system-conversion perspective, which elegantly unifies various RoPE-extension approaches as distinct radix-conversion strategies. Building on this theory, we introduce two training-free extensions, MrRoPE-Uni and MrRoPE-Pro, which leverage uniform and progressive radix-conversion strategies, respectively, to achieve "train short, test long" generalization. Without fine-tuning, MrRoPE-Pro sustains over 85% recall in the 128K-context Needle-in-a-Haystack test and achieves more than double YaRN's accuracy on the InfiniteBench retrieval and dialogue subsets. Theoretical analysis confirms that MrRoPE-Pro effectively raises the upper bound of RoPE's attainable encoding length, further validating the reliability and utility of our theory and methodology.

Qingyuan Tian, Wenhong Zhu, Xiaoran Liu, Xiaofeng Wang, Rui Wang • 2026
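
The abstract frames RoPE extensions as different radix-conversion strategies but does not spell out the MrRoPE-Uni or MrRoPE-Pro formulas, so the sketch below is not the paper's method. It is a minimal NumPy illustration of the design space being unified: standard RoPE plus two well-known training-free extensions, Position Interpolation (positions are compressed into the trained range) and NTK-aware base rescaling (rotary frequencies are stretched). The helper names `rope_rotate`, `positions_pi`, and `base_ntk` are hypothetical, not from the paper.

```python
import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Standard RoPE: rotate each feature pair (x_{2i}, x_{2i+1}) of x by
    the angle theta_i = pos * base^(-2i/d). x has shape (seq, d), d even."""
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)      # (d/2,) per-pair frequencies
    ang = positions[:, None] * inv_freq[None, :]      # (seq, d/2) rotation angles
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Two widely used training-free extensions, shown only to make the idea of
# "different extension = different rescaling of positions/frequencies" concrete.

def positions_pi(seq_len, train_len):
    """Position Interpolation: compress positions back into the trained range."""
    scale = max(seq_len / train_len, 1.0)
    return np.arange(seq_len) / scale

def base_ntk(base, seq_len, train_len, d):
    """NTK-aware scaling: stretch the base so low-frequency dims absorb the
    extra length while high-frequency dims stay nearly unchanged."""
    scale = max(seq_len / train_len, 1.0)
    return base * scale ** (d / (d - 2))

# Example: a model trained at 4K context, evaluated at 16K.
rng = np.random.default_rng(0)
x = rng.standard_normal((16384, 64))
q_pi  = rope_rotate(x, positions_pi(16384, 4096))
q_ntk = rope_rotate(x, np.arange(16384.0),
                    base=base_ntk(10000.0, 16384, 4096, 64))
```

Under the paper's framing, each of these corresponds to a particular way of converting the positional "digits" between radix systems; MrRoPE-Uni and MrRoPE-Pro would supply uniform and progressive conversion rules instead.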

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Long-context understanding | LongBench v2 | -- | -- | 37 |
| Language modeling | arXiv Proof-pile | -- | -- | 32 |
| Long-context retrieval | RULER | Retrieval Accuracy (8K) | 96.2 | 17 |
| Language modeling | Proof-pile (test) | Performance (8K context) | 3.66 | 12 |
| Retrieval | RULER (128K context) | -- | -- | 12 |
| Retrieval | RULER (64K context) | -- | -- | 4 |
| Retrieval | RULER (8K context) | Retrieval Score | 82.3 | 2 |
| Retrieval | RULER (16K context) | Retrieval Score | 82.9 | 2 |
| Retrieval | RULER (32K context) | Retrieval Score | 78.5 | 2 |
