Scalable Sequential Recommendation under Latency and Memory Constraints

About

Sequential recommender systems must model long-range user behavior while operating under strict memory and latency constraints. Transformer-based approaches achieve strong accuracy but suffer from quadratic attention complexity, forcing aggressive truncation of user histories and limiting their practicality for long-horizon modeling. This paper presents HoloMambaRec, a lightweight sequential recommendation architecture that combines holographic reduced representations for attribute-aware embedding with a selective state space encoder for linear-time sequence processing. Item and attribute information are bound using circular convolution, preserving embedding dimensionality while encoding structured metadata. A shallow selective state space backbone, inspired by recent Mamba-style models, enables efficient training and constant-time recurrent inference. Experiments on Amazon Beauty and MovieLens-1M under a 10-epoch budget show that HoloMambaRec surpasses SASRec on both datasets, attains state-of-the-art ranking on MovieLens-1M, and trails only GRU4Rec on Amazon Beauty, all while maintaining substantially lower memory complexity. The design further incorporates forward-compatible mechanisms for temporal bundling and inference-time compression, positioning HoloMambaRec as a practical and extensible alternative for scalable, metadata-aware sequential recommendation.

Adithya Parthasarathy, Aswathnarayan Muthukrishnan Kirubakaran, Vinoth Punniyamoorthy, Nachiappan Chockalingam, Lokesh Butra, Kabilan Kannan, Abhirup Mazumder, Sumit Saha• 2026

Related benchmarks

Task	Dataset	Result	Rank
Sequential Recommendation	ML 1M	NDCG@100.0933		130
Sequential Recommendation	Amazon Beauty	NDCG@102.67		84

Showing 2 of 2 rows

Other info

Follow for update

@wizwand_team Discord