# MrRoPE: Mixed-radix Rotary Position Embedding

## About
RoPE extension refers to modifying or generalizing the Rotary Position Embedding (RoPE) scheme to handle sequences longer than those encountered during pre-training. Current extension strategies, however, are highly diverse and lack a unified theoretical foundation. In this paper, we propose MrRoPE (Mixed-radix RoPE), a generalized encoding formulation based on a radix-system-conversion perspective, which elegantly unifies various RoPE-extension approaches as distinct radix conversion strategies. Building on this theory, we introduce two training-free extensions, MrRoPE-Uni and MrRoPE-Pro, which leverage uniform and progressive radix conversion strategies, respectively, to achieve "train short, test long" generalization. Without fine-tuning, MrRoPE-Pro sustains over 85% recall in the 128K-context Needle-in-a-Haystack test and achieves more than double YaRN's accuracy on the Infinite-Bench retrieval and dialogue subsets. Theoretical analysis confirms that MrRoPE-Pro effectively raises the upper bound of RoPE's attainable encoding length, further validating the reliability and utility of our theory and methodology.
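To make the radix-conversion perspective concrete, the sketch below shows a plain mixed-radix decomposition of a position index. This is an illustrative analogy, not the paper's exact formulation: under this view, each RoPE frequency band can be thought of as reading one "digit" of the position, and a RoPE-extension strategy corresponds to choosing different per-band radices. The radix values here are hypothetical.

```python
def to_mixed_radix(n, radices):
    """Decompose n into digits under the given radices (least significant first)."""
    digits = []
    for r in radices:
        digits.append(n % r)
        n //= r
    return digits

def from_mixed_radix(digits, radices):
    """Inverse conversion: reassemble the integer from its mixed-radix digits."""
    n, place = 0, 1
    for d, r in zip(digits, radices):
        n += d * place
        place *= r
    return n

# Hypothetical per-band radices; their product (16 * 16 * 8 * 8 = 16384)
# bounds the number of distinct positions this digit system can represent.
radices = [16, 16, 8, 8]
pos = 5000
digits = to_mixed_radix(pos, radices)
print(digits)                                    # [8, 8, 3, 2]
assert from_mixed_radix(digits, radices) == pos  # round-trip check
```

In this analogy, "uniform" and "progressive" conversion strategies would correspond to different rules for enlarging the radices so that positions beyond the training range still fall within the representable product.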
## Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Long-context understanding | LongBench v2 | -- | 37 |
| Language modeling | Arxiv Proof-pile | -- | 32 |
| Long-context retrieval | RULER | Retrieval Accuracy (8K): 96.2 | 17 |
| Language modeling | Proofpile (test) | Performance (8K context): 3.66 | 12 |
| Retrieval | RULER (128K context) | -- | 12 |
| Retrieval | RULER (64K context) | -- | 4 |
| Retrieval | RULER (8K context) | Retrieval Score: 82.3 | 2 |
| Retrieval | RULER (16K context) | Retrieval Score: 82.9 | 2 |
| Retrieval | RULER (32K context) | Retrieval Score: 78.5 | 2 |