Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations
About
Randomly-hashed item IDs are used ubiquitously in recommendation models. However, the representations learned from random hashing prevent generalization across similar items, which makes it hard to learn unseen and long-tail items, especially when the item corpus is large, power-law distributed, and evolving dynamically. In this paper, we propose using content-derived features as a replacement for random IDs. We show that simply replacing ID features with content-based embeddings can cause a drop in quality due to reduced memorization capability. To strike a good balance of memorization and generalization, we propose to use Semantic IDs -- a compact discrete item representation learned from frozen content embeddings using RQ-VAE that captures the hierarchy of concepts in items -- as a replacement for random item IDs. As with content embeddings, the compactness of Semantic IDs makes them hard to adapt directly in recommendation models. We propose novel methods for adapting Semantic IDs in industry-scale ranking models, through hashing sub-pieces of the Semantic ID sequences. In particular, we find that the SentencePiece model that is commonly used in LLM tokenization outperforms manually crafted pieces such as N-grams. To this end, we evaluate our approaches in a real-world ranking model for YouTube recommendations. Our experiments demonstrate that Semantic IDs can replace the direct use of video IDs by improving the generalization ability on new and long-tail item slices without sacrificing overall model quality.
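The two ideas in the abstract, a hierarchical discrete code per item and hashing sub-pieces of that code into embedding-table rows, can be illustrated with a small sketch. This is not the paper's implementation: the codebooks here are fixed toy values rather than learned by an RQ-VAE, and the helper names (`residual_quantize`, `ngram_pieces`, `hash_piece`), the position-tagged overlapping n-grams, and the SHA-256 bucket hash are all assumptions made for illustration.

```python
import hashlib
from typing import List, Sequence, Tuple

Piece = Tuple[int, Tuple[int, ...]]  # (start level, codes in the piece)


def residual_quantize(vec: Sequence[float],
                      codebooks: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Greedy residual quantization with fixed (toy) codebooks.

    At each level, pick the nearest codeword to the current residual,
    record its index, and pass the remaining residual to the next level.
    """
    residual = list(vec)
    codes: List[int] = []
    for codebook in codebooks:
        # Nearest codeword by squared Euclidean distance.
        best = min(
            range(len(codebook)),
            key=lambda i: sum((r - c) ** 2 for r, c in zip(residual, codebook[i])),
        )
        codes.append(best)
        residual = [r - c for r, c in zip(residual, codebook[best])]
    return codes


def ngram_pieces(semantic_id: Sequence[int], n: int) -> List[Piece]:
    """Contiguous n-grams of a Semantic ID, each tagged with its start level.

    Assumption: overlapping, position-tagged n-grams; the abstract does not
    say how pieces are formed.
    """
    return [(s, tuple(semantic_id[s:s + n])) for s in range(len(semantic_id) - n + 1)]


def hash_piece(piece: Piece, num_buckets: int) -> int:
    """Map a piece to an embedding-table row with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(repr(piece).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


# Toy two-level codebooks over 2-dimensional content embeddings.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0]],   # level 0: coarse concept
    [[0.0, 0.0], [0.1, -0.1]],  # level 1: refinement of the residual
]

item_a = residual_quantize([1.1, 0.9], CODEBOOKS)  # [1, 1]
item_b = residual_quantize([1.0, 1.0], CODEBOOKS)  # [1, 0]

# Rows looked up for each item when using unigram pieces.
rows_a = [hash_piece(p, 1000) for p in ngram_pieces(item_a, 1)]
rows_b = [hash_piece(p, 1000) for p in ngram_pieces(item_b, 1)]
```

In this toy setup the two nearby content embeddings share their level-0 code, so `rows_a[0] == rows_b[0]`: both items read and update the same embedding row for their coarse concept, which is the kind of sharing that lets a new or long-tail item benefit from similar items. A learned sub-piece vocabulary such as SentencePiece would replace `ngram_pieces` in this picture.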
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Sequential Recommendation | Yelp (Overall) | Hit Rate@10 | 0.4727 | 36 |
| Sequential Recommendation | Beauty | HR@10 | 37.15 | 30 |
| Sequential Recommendation | Instrument | Recall@10 | 43.12 | 20 |
| Sequential Recommendation | Beauty Tail Item | Hit Rate@10 | 22.08 | 14 |
| Sequential Recommendation | Yelp (Tail) | Hit Rate@10 | 24.41 | 12 |
| Sequential Recommendation | Instrument (Tail) | H@10 | 0.2058 | 12 |
| Sequential Recommendation | Instrument Head | H@10 | 55.69 | 12 |
| Sequential Recommendation | Yelp Head | Hit Rate@10 | 52.18 | 12 |
| Sequential Recommendation | Beauty (Head) | H@10 | 43.88 | 12 |
| Generative Recommendation | Amazon Sports Public benchmarks (test) | R@5 | 0.0251 | 10 |