MLPs are Efficient Distilled Generative Recommenders

About

Generative recommendation (GR) models employing Semantic IDs (SIDs) exhibit strong potential, yet their practical deployment is bottlenecked by the high inference latency of beam-expanded autoregressive decoding. In this work, we identify that standard attention-heavy Transformer decoders are structural overkill for this task: the hierarchical nature of SIDs causes prediction difficulty to drop sharply after the first token, rendering repeated attention computations highly redundant. Driven by this insight, we propose SID-MLP, a lightweight MLP-centric distillation framework that fundamentally simplifies the decoding paradigm for GR. Instead of executing complex, step-by-step attention mechanisms, our approach captures the global user context in a single operation, decoupled from sequential token prediction. We then distill the heavy autoregressive teacher into position-specific MLP heads, eliminating the dense attention overhead while preserving prefix and context dependencies. Extensive experiments demonstrate that SID-MLP matches the accuracy of teacher models while accelerating inference by 8.74x. Crucially, this distillation strategy can serve as a plug-and-play accelerator for different backbones and tokenizer settings. Furthermore, we introduce SID-MLP++, extending our distillation framework to replace the Transformer encoder, unlocking further latency reductions. Ultimately, our work reveals that decoder-side MLP distillation is an effective acceleration path for structured SID recommendation, while full encoder replacement offers an additional speed-accuracy trade-off.
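
The decoding idea in the abstract (encode the user context once, then predict each SID token with its own MLP head conditioned on that context and the already-decoded prefix) can be illustrated with a minimal sketch. The abstract does not specify the head architecture, how the prefix is fed to each head, or the distillation objective, so everything below is an assumption: the two-layer ReLU heads, the concatenation of prefix-token embeddings, the KL-divergence loss against teacher logits, and all names (`MLPHead`, `build_heads`, `decode_sid`, `kl_distill_loss`) are hypothetical. Greedy decoding stands in for beam search to keep the example short.

```python
import math
import random


def matvec(weights, x):
    """Multiply a row-major weight matrix by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class MLPHead:
    """Hypothetical two-layer ReLU MLP that scores one SID position."""

    def __init__(self, in_dim, hidden_dim, codebook_size, rng):
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
                   for _ in range(codebook_size)]

    def logits(self, x):
        hidden = [max(0.0, v) for v in matvec(self.w1, x)]
        return matvec(self.w2, hidden)


def build_heads(ctx_dim, emb_dim, hidden_dim, codebook_size, num_positions, rng):
    """One head per SID position; head t reads the context plus t prefix embeddings."""
    return [MLPHead(ctx_dim + t * emb_dim, hidden_dim, codebook_size, rng)
            for t in range(num_positions)]


def decode_sid(context, heads, token_emb):
    """Greedy decode. The context vector is computed once and reused by every
    head; no attention is recomputed between steps (assumed design)."""
    prefix = []
    for head in heads:
        x = list(context)
        for pos, tok in enumerate(prefix):
            x.extend(token_emb[pos][tok])  # condition on earlier tokens
        scores = head.logits(x)
        prefix.append(max(range(len(scores)), key=scores.__getitem__))
    return prefix


def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def kl_distill_loss(teacher_logits, student_logits):
    """KL(teacher || student) for one position; an assumed distillation
    objective, not necessarily the one used in the paper."""
    log_p = log_softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    rng = random.Random(0)
    ctx_dim, emb_dim, codebook_size, num_positions = 8, 4, 10, 3
    heads = build_heads(ctx_dim, emb_dim, 16, codebook_size, num_positions, rng)
    # One embedding table per position: token_emb[pos][tok] -> vector.
    token_emb = [[[rng.gauss(0.0, 1.0) for _ in range(emb_dim)]
                  for _ in range(codebook_size)]
                 for _ in range(num_positions)]
    context = [rng.gauss(0.0, 1.0) for _ in range(ctx_dim)]
    print(decode_sid(context, heads, token_emb))
```

In this sketch the per-step cost is two small matrix-vector products per head, which is the kind of saving the abstract attributes to dropping decoder attention; training would sum `kl_distill_loss` over positions using logits from the autoregressive teacher.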

Zitian Guo, Yupeng Hou, Clark Mingxuan Ju, Neil Shah, Julian McAuley • 2026

Related benchmarks

Task	Dataset	Result
Generative Recommendation	Instruments	R@10 6.2	23
Generative Recommendation	Games	Recall@5 0.061	15
Generative Retrieval	Musical Instruments	NDCG@10 0.0332	10
Generative Recommendation	SCIENTIFIC	Recall@5 2.97	10

