MERGE: Next-Generation Item Indexing Paradigm for Large-Scale Streaming Recommendation
About
Item indexing, which maps a large corpus of items into compact discrete representations, is critical for both discriminative and generative recommender systems. However, existing Vector Quantization (VQ)-based approaches struggle with the highly skewed and non-stationary item distributions common in industrial streaming recommenders, leading to poor assignment accuracy, imbalanced cluster occupancy, and insufficient cluster separation. To address these challenges, we propose MERGE, a next-generation item indexing paradigm that adaptively constructs clusters from scratch, dynamically monitors cluster occupancy, and forms hierarchical index structures via fine-to-coarse merging. Extensive experiments demonstrate that MERGE significantly improves assignment accuracy, cluster uniformity, and cluster separation compared with existing indexing methods, while online A/B tests show substantial gains in key business metrics, highlighting its potential as a foundational indexing approach for large-scale recommendation.
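To make the three ideas in the abstract concrete (building clusters from scratch, tracking occupancy, and fine-to-coarse merging), here is a minimal illustrative sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, thresholds, or merge rule, so the cosine similarity, the `sim_threshold` and `merge_threshold` parameters, the running-mean centroid update, and the names `StreamingIndex` and `merge_fine_to_coarse` are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class StreamingIndex:
    """Hypothetical fine-level index: clusters are created on demand."""

    def __init__(self, sim_threshold=0.9):
        self.sim_threshold = sim_threshold  # assumed hyperparameter
        self.centroids = []  # running mean embedding per cluster
        self.counts = []     # occupancy per cluster

    def assign(self, vec):
        # Find the most similar existing centroid, if any.
        best, best_sim = -1, -1.0
        for i, c in enumerate(self.centroids):
            s = cosine(vec, c)
            if s > best_sim:
                best, best_sim = i, s
        # No sufficiently close cluster: open a new one from scratch.
        if best == -1 or best_sim < self.sim_threshold:
            self.centroids.append(list(vec))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Otherwise update the running mean and the occupancy count.
        n = self.counts[best] + 1
        self.centroids[best] = [
            c + (x - c) / n for c, x in zip(self.centroids[best], vec)
        ]
        self.counts[best] = n
        return best


def merge_fine_to_coarse(centroids, merge_threshold):
    """Greedy one-pass merge: map each fine cluster to a coarse parent id."""
    coarse_reps, parent = [], []
    for c in centroids:
        for j, rep in enumerate(coarse_reps):
            if cosine(c, rep) >= merge_threshold:
                parent.append(j)
                break
        else:
            # No coarse representative is close enough: start a new group.
            coarse_reps.append(c)
            parent.append(len(coarse_reps) - 1)
    return parent


index = StreamingIndex(sim_threshold=0.95)
stream = [[1, 0], [0.99, 0.05], [0, 1], [0.05, 0.99], [0.7, 0.7]]
fine_ids = [index.assign(v) for v in stream]
coarse_ids = merge_fine_to_coarse(index.centroids, merge_threshold=0.7)
print(fine_ids, index.counts, coarse_ids)
```

In this toy run, each item's fine cluster id and its cluster's coarse parent id together form a two-level index path. The `counts` list is the hook where an occupancy monitor could decide to split or retire clusters, a step this sketch leaves out.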
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Large-Scale Streaming Recommendation | Online A/B (test) | Pass-Through Rate | 45.04 | 1 |
| Streaming Recommendation | Online A/B Test Trinity Pipeline one-week (treatment group) | AAD | 0.0081 | 1 |