Sample Is Feature: Beyond Item-Level, Toward Sample-Level Tokens for Unified Large Recommender Models
About
Scaling industrial recommender models has followed two parallel paradigms: \textbf{sample information scaling} -- enriching the information content of each training sample through deeper and longer behavior sequences -- and \textbf{model capacity scaling} -- unifying sequence modeling and feature interaction within a single Transformer backbone. However, these two paradigms still face two structural limitations. Firstly, sample information scaling methods encode only a subset of each historical interaction into the sequence token, leaving the majority of the original sample context unexploited and precluding the modeling of sample-level, time-varying features. Secondly, model capacity scaling methods are inherently constrained by the structural heterogeneity between sequential and non-sequential features, preventing the model from fully realizing its representational capacity. To address these issues, we propose \textbf{SIF} (\emph{Sample Is Feature}), which encodes each historical Raw Sample directly into the sequence token -- maximally preserving sample information while simultaneously resolving the heterogeneity between sequential and non-sequential features. SIF consists of two key components. The \textbf{Sample Tokenizer} quantizes each historical Raw Sample into a Token Sample via hierarchical group-adaptive quantization (HGAQ), enabling full sample-level context to be incorporated into the sequence efficiently. The \textbf{SIF-Mixer} then performs deep feature interaction over the homogeneous sample representations via token-level and sample-level mixing, fully unleashing the model's representational capacity. Extensive experiments on a large-scale industrial dataset validate SIF's effectiveness, and we have successfully deployed SIF on an industrial food delivery platform.
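As a rough illustration of the two components named in the abstract, the sketch below quantizes each feature group of a sample with its own stack of codebooks, then mixes the resulting sample representations along the token axis and along the sample axis. The abstract does not specify how HGAQ or the SIF-Mixer work internally, so everything here is an assumption: plain residual quantization stands in for HGAQ, residual linear maps stand in for the mixing layers, and the names `residual_quantize`, `tokenize_sample`, and `mix` are hypothetical.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x with a coarse-to-fine stack of codebooks.

    Assumption: ordinary residual quantization, used as a stand-in for HGAQ.
    Each codebook has shape (num_codes, dim). Returns the code index chosen
    at every level and the reconstruction (sum of the chosen code vectors).
    """
    residual = np.asarray(x, dtype=float).copy()
    recon = np.zeros_like(residual)
    codes = []
    for cb in codebooks:
        # Pick the code vector nearest to the current residual.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        recon = recon + cb[idx]
        residual = residual - cb[idx]
    return codes, recon

def tokenize_sample(sample_groups, group_codebooks):
    """Turn one raw sample into discrete tokens, one code list per group.

    sample_groups:   dict mapping group name -> embedding vector
    group_codebooks: dict mapping group name -> list of codebooks; the number
                     of levels may differ per group (the "group-adaptive"
                     part, as assumed in this sketch).
    """
    return {
        name: residual_quantize(vec, group_codebooks[name])[0]
        for name, vec in sample_groups.items()
    }

def mix(h, w_token, w_sample):
    """Token-level then sample-level mixing with residual connections.

    h:        (num_samples, tokens_per_sample, dim)
    w_token:  (tokens_per_sample, tokens_per_sample), mixes tokens in a sample
    w_sample: (num_samples, num_samples), mixes the same slot across samples
    """
    h = h + np.einsum("ut,std->sud", w_token, h)
    h = h + np.einsum("rs,std->rtd", w_sample, h)
    return h
```

Because every historical sample is reduced to the same token layout, the sequence becomes a regular `(samples, tokens, dim)` array, which is what lets the two mixing steps in this sketch be plain axis-wise operations.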
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| CTR Prediction | Industrial Dataset | CTR AUC 2.03 | 18 |
| CVR Prediction | Industrial Dataset | AUC 1.74 | 8 |
| Local-service Recommendation | Meituan Online Traffic Overall (5% traffic holdout) | Uplift CTR 2.03 | 1 |
| Local-service Recommendation | Meituan Online Traffic (L < 10, cold users) | ΔCTR 0.53 | 1 |
| Local-service Recommendation | Meituan Online Traffic (10 ≤ L < 100) | ΔCTR 1.18 | 1 |
| Local-service Recommendation | Meituan Online Traffic (100 ≤ L < 500) | Uplift CTR 2.07 | 1 |
| Local-service Recommendation | Meituan Online Traffic (L ≥ 500, heavy users) | ΔCTR 3.12 | 1 |