Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback
About
Multi-modal recommender systems (MRSs) are pivotal in diverse online web platforms and have garnered considerable attention in recent years. However, previous studies overlook the challenges of (1) noisy multi-modal content, (2) noisy user feedback, and (3) aligning multi-modal content with user feedback. In order to tackle these challenges, we propose Denoising and Aligning Multi-modal Recommender System (DA-MRS). To mitigate multi-modal noise, DA-MRS first constructs item-item graphs determined by consistent content similarity across modalities. To denoise user feedback, DA-MRS associates the probability of observed feedback with multi-modal content and devises a denoised BPR loss. Furthermore, DA-MRS implements Alignment guided by User preference to enhance task-specific item representation and Alignment guided by graded Item relations to provide finer-grained alignment. Extensive experiments verify that DA-MRS is a plug-and-play framework and achieves significant and consistent improvements across various datasets, backbone models, and noisy scenarios.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Multimodal Recommendation | Sports Amazon (test) | Recall@107.51 | 39 | |
| Multimodal Recommendation | Amazon Baby (test) | Recall@106.5 | 39 | |
| Multimodal Recommendation | Amazon Clothing (test) | Recall@106.47 | 25 |