
On the Mechanisms of Collaborative Learning in VAE Recommenders

About

Variational Autoencoders (VAEs) are a powerful alternative to matrix factorization for recommendation. A common technique in VAE-based collaborative filtering (CF) is to apply binary input masking to user interaction vectors, which improves performance but remains underexplored theoretically. In this work, we analyze how collaboration arises in VAE-based CF and show that it is governed by latent proximity: we derive a latent sharing radius that indicates when an SGD update on one user strictly reduces the loss on another user, with influence decaying as the latent Wasserstein distance increases. We further study the induced geometry: with clean inputs, VAE-based CF primarily exploits local collaboration between input-similar users and under-utilizes global collaboration between far-but-related users. We compare two mechanisms that encourage global mixing and characterize their trade-offs: (1) β-KL regularization directly tightens the information bottleneck, promoting posterior overlap but risking representational collapse if β is too large; (2) input masking induces stochastic geometric contractions and expansions, which can bring distant users into the same latent neighborhood but also introduce neighborhood drift. To preserve user identity while enabling global consistency, we propose an anchor regularizer that aligns user posteriors with item embeddings, stabilizing users under masking and facilitating signal sharing across related items. Our analyses are validated on the Netflix, MovieLens-20M, and Million Song datasets. We also deployed the proposed algorithm on an Amazon streaming platform following a successful online experiment.
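
To make the two mixing mechanisms named in the abstract concrete, here is a minimal NumPy sketch of a VAE-CF training loss with binary input masking and a β-weighted KL term. It is not the authors' implementation. The linear encoder and decoder, the multinomial likelihood, the choice to reconstruct the clean (unmasked) vector, and the names `masked_vae_loss`, `W_enc`, `W_dec`, `mask_prob`, and `beta` are all assumptions made for illustration. The anchor regularizer is left out because the abstract does not specify its form.

```python
import numpy as np

def masked_vae_loss(x, W_enc, W_dec, beta=0.2, mask_prob=0.5, rng=None):
    """Illustrative one-sample loss for a linear VAE recommender.

    x      : (n_users, n_items) binary interaction matrix
    W_enc  : (n_items, 2 * n_latent) hypothetical encoder weights -> [mu, logvar]
    W_dec  : (n_latent, n_items) hypothetical decoder weights
    Returns (total loss, reconstruction term, KL term), averaged over users.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_latent = W_dec.shape[0]

    # Binary input masking: each entry is zeroed with probability mask_prob.
    keep = rng.random(x.shape) >= mask_prob
    x_masked = x * keep

    # Encoder produces the mean and log-variance of a diagonal Gaussian posterior.
    h = x_masked @ W_enc
    mu, logvar = h[:, :n_latent], h[:, n_latent:]

    # Reparameterization trick: z = mu + sigma * eps.
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Decoder: numerically stable log-softmax over items.
    logits = z @ W_dec
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Multinomial negative log-likelihood of the clean (unmasked) interactions.
    recon = -(x * log_softmax).sum(axis=1)

    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar).sum(axis=1)

    return (recon + beta * kl).mean(), recon.mean(), kl.mean()
```

In this sketch, raising `beta` pulls every user posterior toward the shared prior, while raising `mask_prob` perturbs the encoder input, which moves the posterior means themselves.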

Tung-Long Vuong, Julien Monteil, Hien Dang, Volodymyr Vaskovych, Trung Le, Vu Nguyen • 2025

Related benchmarks

Task | Dataset | Result | Rank
Top-N Recommendation | MovieLens 20M | NDCG@100: 0.446 | 22
Top-N Recommendation | Netflix Prize Dataset | NDCG@100: 0.396 | 22
Recommendation | Million Song | Recall@20: 0.278 | 14
Recommendation | Amazon streaming platform (Offline) | Recall@20: 0.609 | 2
Recommendation | Amazon streaming platform Online Home Card | Playtime (s): 74.6 | 2
Recommendation | Amazon streaming platform Online Movie Card | Average Playtime (s): 102.6 | 2
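
For readers unfamiliar with the ranking metrics in the table, the sketch below computes Recall@k and NDCG@k for a single user with binary relevance. The exact evaluation protocol behind the numbers above is not stated on this page. The normalization of recall by `min(k, number of held-out items)` and the function names `recall_at_k` and `ndcg_at_k` are assumptions, and each user is assumed to have at least one held-out item.

```python
import numpy as np

def recall_at_k(scores, heldout, k):
    """Fraction of held-out items recovered in the top-k (assumed normalization)."""
    top_k = np.argsort(-scores)[:k]          # indices of the k highest-scored items
    hits = heldout[top_k].sum()
    return hits / min(k, heldout.sum())

def ndcg_at_k(scores, heldout, k):
    """Normalized discounted cumulative gain with binary relevance."""
    top_k = np.argsort(-scores)[:k]
    discounts = 1.0 / np.log2(np.arange(2, k + 2))   # 1 / log2(rank + 1)
    dcg = (heldout[top_k] * discounts[:len(top_k)]).sum()
    n_rel = int(min(k, heldout.sum()))
    idcg = discounts[:n_rel].sum()           # best achievable DCG for this user
    return dcg / idcg

# Toy example: 4 items, items 0 and 3 are held out, top-2 ranking is [0, 2].
scores = np.array([0.9, 0.1, 0.8, 0.3])
heldout = np.array([1, 0, 0, 1])
print(recall_at_k(scores, heldout, 2))  # 0.5
print(ndcg_at_k(scores, heldout, 2))
```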
