
Expand More, Shrink Less: Shaping Effective-Rank Dynamics for Dense Scaling in Recommendation

About

Scaling recommendation models is a central challenge in recommender systems. Recently, RankMixer has emerged as an effective solution, operating on a unified token representation and alternating between token mixing and per-token feedforward networks (P-FFNs) to achieve scalable performance. However, RankMixer suffers from embedding collapse, where learned representations have low effective rank, limiting expressivity and underutilizing the expanded representation space. Through empirical analysis and theoretical insights, we identify rigid token mixing and P-FFN modules as the primary causes of this phenomenon, jointly inducing a damped oscillatory trajectory in effective-rank evolution across layers. To address it, we propose RankElastor, a novel architecture that produces spectrum-robust representations with provable collapse mitigation. RankElastor introduces two components: (i) parameterized full mixing, which enables expressive token mixing with improved spectral robustness; and (ii) GLU-improved P-FFNs, which stabilize representation spectra through GLU-style FFN modules. Extensive experiments on large-scale industrial datasets demonstrate that RankElastor consistently improves recommendation performance, mitigates embedding collapse, and exhibits robust scaling behavior. Code is available at this GitHub repository: https://github.com/vasile-paskardlgm/RankElastor
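The abstract does not state how effective rank is computed. As a rough illustration of the quantity behind "embedding collapse", the sketch below assumes the common entropy-based definition (the exponential of the Shannon entropy of the normalized singular values). The function name `effective_rank` and the toy matrices are hypothetical and are not taken from the RankElastor code.

```python
import numpy as np


def effective_rank(x: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of a 2-D matrix.

    Assumption: this is the exp-of-spectral-entropy definition, which may
    differ from the exact measure used in the paper.
    """
    # Singular values of the representation matrix (rows = samples or tokens).
    s = np.linalg.svd(x, compute_uv=False)
    # Normalize the spectrum into a probability distribution.
    p = s / (s.sum() + eps)
    # Drop numerically zero entries so that log(p) stays finite.
    p = p[p > eps]
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


rng = np.random.default_rng(0)

# A well-spread representation: 256 samples in a 32-dimensional space.
full = rng.standard_normal((256, 32))

# A "collapsed" representation: same shape, but confined to a 2-D subspace.
collapsed = rng.standard_normal((256, 2)) @ rng.standard_normal((2, 32))

print("full:     ", effective_rank(full))
print("collapsed:", effective_rank(collapsed))
```

Under this definition, the collapsed matrix scores at most about 2 even though it occupies 32 nominal dimensions, which is the kind of underused representation space the abstract describes.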

Guoming Li, Shangyu Zhang, Junwei Pan, Wentao Ning, Jin Chen, Gengsheng Xue, Chao Zhou, Shudong Huang, Haijie Gu, Menglin Yang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Click-Through Rate Prediction | Avazu (test) | AUC | 0.7932 | 207 |
| CTR Prediction | Criteo (test) | AUC | 0.8148 | 147 |
| Sequential Recommendation | KuaiVideo | AUC | 75.14 | 25 |
| Behavior Sequence Modeling | TaobaoAd | gAUC | 0.5778 | 5 |
