Gated Rotary-Enhanced Linear Attention with Rank Modulation for Long-term Sequential Recommendation

About

In Sequential Recommendation Systems (SRSs), Transformer models have demonstrated remarkable performance but face computational and memory challenges, especially when modeling long-term user behavior sequences. Due to its quadratic complexity, the dot-product attention mechanism in Transformers becomes expensive for long sequences. By approximating dot-product attention with elaborate mapping functions, linear attention offers a more efficient alternative with linear complexity. However, existing linear attention methods face three limitations: 1) they often use learnable position encodings, which incur extra computational costs in long-term sequence scenarios; 2) constrained by low-rank deficiency, they may not sufficiently capture a user's fine-grained local preferences (short-lived bursts of interest); and 3) when they do capture transient activities, they often conflate them with stable long-term interests, resulting in unclear or less effective recommendations. To remedy these drawbacks, we propose a long-term sequential Recommendation model with Gated Rotary-Enhanced Linear Attention (RecGRELA). Specifically, we first propose a Rotary-Enhanced Linear Attention (RELA) module that efficiently models long-range dependencies within a user's historical interactions using rotary position encodings. Then, to address the low-rank deficiency of linear attention, we introduce an Adaptive Rank Modulator: it incorporates a rank-augmentation branch to explicitly inject local token mixing and a Gated Rank Selector to dynamically balance stable long-term preferences against transient short-term interests. Experimental results on four public benchmark datasets show that RecGRELA achieves state-of-the-art performance compared with existing SRSs based on Recurrent Neural Networks, Transformers, and Mamba, while maintaining low memory overhead.
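To make the core mechanism concrete, below is a minimal PyTorch sketch of the ideas the abstract describes: causal linear attention with rotary position encodings (the RELA idea), plus a gated local branch standing in for the Adaptive Rank Modulator. The kernel feature map (elu(x) + 1), the depthwise causal convolution used as the rank-augmentation branch, and the sigmoid gate used as the Gated Rank Selector are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def rotary_embed(x, base=10000.0):
    # Apply rotary position encoding to x of shape (B, T, D), D even.
    B, T, D = x.shape
    half = D // 2
    freqs = base ** (-torch.arange(half, device=x.device) / half)
    angles = torch.arange(T, device=x.device)[:, None] * freqs[None, :]  # (T, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class GatedRotaryLinearAttention(nn.Module):
    """Sketch: rotary-enhanced linear attention with a gated local (rank-augmentation) branch."""

    def __init__(self, dim, conv_kernel=3):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Assumed rank-augmentation branch: depthwise causal conv for local token mixing.
        self.local_conv = nn.Conv1d(dim, dim, conv_kernel,
                                    padding=conv_kernel - 1, groups=dim)
        # Assumed Gated Rank Selector: per-channel sigmoid gate.
        self.gate = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, T, D)
        q = rotary_embed(self.q_proj(x))
        k = rotary_embed(self.k_proj(x))
        v = self.v_proj(x)
        # Kernel feature map keeps attention weights non-negative (assumed: elu + 1).
        q, k = F.elu(q) + 1, F.elu(k) + 1
        # Causal linear attention via cumulative sums: O(T*D^2) instead of O(T^2*D).
        kv = torch.cumsum(torch.einsum('btd,bte->btde', k, v), dim=1)  # (B, T, D, D)
        z = torch.cumsum(k, dim=1)                                     # (B, T, D)
        global_out = torch.einsum('btd,btde->bte', q, kv) / (
            torch.einsum('btd,btd->bt', q, z).unsqueeze(-1) + 1e-6)
        # Local branch: fine-grained short-term preferences (truncate conv to T for causality).
        local_out = self.local_conv(v.transpose(1, 2))[..., :x.size(1)].transpose(1, 2)
        # Gate dynamically balances long-term (global) and short-term (local) signals.
        g = torch.sigmoid(self.gate(x))
        return self.out_proj(g * global_out + (1 - g) * local_out)


if __name__ == "__main__":
    attn = GatedRotaryLinearAttention(dim=64)
    seq = torch.randn(2, 200, 64)  # batch of 2 users, 200 interactions each
    print(attn(seq).shape)         # torch.Size([2, 200, 64])
```

Because the key-value statistics are accumulated with cumulative sums rather than a full attention matrix, cost grows linearly in sequence length, which is what makes this family of models attractive for long user histories.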

Juntao Hu, Wei Zhou, Haini Cai, Xiao Du, Huayi Shen, Junhao Wen • 2025

Related benchmarks

Task                        Dataset   Metric    Result   Rank
Sequential Recommendation   ML-1M     NDCG@10   0.1959   130
Sequential Recommendation   ML-32M    HR@5      18.02    10
Sequential Recommendation   TMALL     HR@5      10.22    10
Sequential Recommendation   LFM-1B    HR@5      8.6      10
Sequential Recommendation   Netflix   HR@5      9.22     10
