MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation
About
The goal of sequential recommendation (SR) is to predict the items a user is likely to be interested in, based on their historical interaction sequences. Most existing sequential recommenders are built on ID features, which, despite their widespread use, often underperform with sparse IDs and struggle with the cold-start problem. In addition, inconsistent ID mappings hinder the model's transferability, isolating similar recommendation domains that could have been co-optimized. This paper aims to address these issues by exploring the potential of multi-modal information in learning robust and generalizable sequence representations. We propose MISSRec, a multi-modal pre-training and transfer learning framework for SR. On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests while a novel interest-aware decoder is developed to grasp item-modality-interest relations for better sequence representation. On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representations, providing more precise matching between users and items. We pre-train the model with contrastive learning objectives and fine-tune it in an efficient manner. Extensive experiments demonstrate the effectiveness and flexibility of MISSRec, promising a practical solution for real-world recommendation scenarios. Data and code are available at https://github.com/gimpong/MM23-MISSRec.
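
The abstract states only that the model is pre-trained "with contrastive learning objectives" and does not give the loss. As a rough illustration of what such an objective can look like, the sketch below computes an in-batch InfoNCE loss between sequence embeddings and the embeddings of their positive items. InfoNCE is an assumed, commonly used choice here, not necessarily the loss used by MISSRec; the function name `info_nce`, the `temperature` value, and the toy vectors are all hypothetical.

```python
import math


def _dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def _normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(_dot(v, v))
    return [x / norm for x in v]


def info_nce(seq_embs, item_embs, temperature=0.1):
    """Mean in-batch InfoNCE loss (assumed objective, not taken from the paper).

    Row i of item_embs is treated as the positive for row i of seq_embs;
    every other row in the batch acts as a negative.
    """
    seqs = [_normalize(v) for v in seq_embs]
    items = [_normalize(v) for v in item_embs]
    total = 0.0
    for i, s in enumerate(seqs):
        # Cosine similarities scaled by the temperature.
        logits = [_dot(s, t) / temperature for t in items]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(seqs)


# Toy batch: two sequences and their positive items.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when each sequence embedding points at its own positive item and much larger when the positives are swapped, which is the behavior a contrastive pre-training objective relies on.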
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Sequential Recommendation | Amazon Product Reviews Video Games leave-one-out (test) | HR@1 0.0201 | 12 | |
| Sequential Recommendation | Amazon Product Reviews Arts, Crafts and Sewing leave-one-out (test) | HR@1 4.79 | 12 | |
| Sequential Recommendation | Amazon Product Reviews Musical Instruments leave-one-out (test) | HR@1 7.23 | 12 | |
| Multimodal Generative Recommendation | Beauty | HR@10 5.81 | 10 | |
| Multimodal Generative Recommendation | Sports | HR@10 3.11 | 10 | |
| Multimodal Generative Recommendation | Yelp | HR@10 3.95 | 10 | |
| Cold-start recommendation | Beauty (test) | HR@10 0.0254 | 4 | |
| Cold-start recommendation | Sports (test) | HR@10 1.41 | 4 | |
| Cold-start recommendation | Yelp (test) | HR@10 1.85 | 4 | |
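
The results above are reported as hit rate at a cutoff (HR@K), several of them under a leave-one-out split. As a minimal sketch of how HR@K is commonly computed in that setting, assuming one held-out item per user and a ranked candidate list per user, the helper below counts the fraction of users whose held-out item appears in the top K. The function name `hit_rate_at_k` and the toy data are hypothetical, and the exact evaluation protocol behind each row of the table (candidate set, scaling to percent) is not stated here.

```python
def hit_rate_at_k(ranked_lists, held_out_items, k):
    """Fraction of users whose held-out item is in their top-k ranked list.

    ranked_lists[i] is user i's candidate items, best first;
    held_out_items[i] is the single item held out for user i.
    """
    if len(ranked_lists) != len(held_out_items):
        raise ValueError("one held-out item is required per ranked list")
    if not ranked_lists:
        return 0.0
    hits = sum(
        1
        for ranked, target in zip(ranked_lists, held_out_items)
        if target in ranked[:k]
    )
    return hits / len(ranked_lists)


# Toy example with three users (illustrative data only).
ranked = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
targets = ["b", "f", "z"]

hr_at_1 = hit_rate_at_k(ranked, targets, 1)
hr_at_2 = hit_rate_at_k(ranked, targets, 2)
hr_at_3 = hit_rate_at_k(ranked, targets, 3)
```

With this toy data no held-out item is ranked first, one of three is in the top 2, and two of three are in the top 3, so the metric can only stay the same or grow as K increases.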