Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems

About

Learning feature interactions is the critical backbone of building recommender systems. In web-scale applications, learning feature interactions is extremely challenging because the input feature space is large and sparse; meanwhile, manually crafting effective feature interactions is infeasible because of the exponential solution space. We propose to leverage a Transformer-based architecture with attention layers to automatically capture feature interactions. Transformer architectures have seen great success in many domains, such as natural language processing and computer vision. However, the Transformer architecture has not been widely adopted for feature interaction modeling in industry. We aim to close this gap. We identify two key challenges for applying the vanilla Transformer architecture to web-scale recommender systems: (1) the Transformer architecture fails to capture heterogeneous feature interactions in the self-attention layer; (2) the serving latency of the Transformer architecture might be too high for deployment in web-scale recommender systems. We first propose a heterogeneous self-attention layer, a simple yet effective modification to the self-attention layer in the Transformer, to take into account the heterogeneity of feature interactions. We then introduce Hiformer (Heterogeneous Interaction Transformer) to further improve model expressiveness. With low-rank approximation and model pruning, Hiformer enjoys fast inference for online deployment. Extensive offline experiment results corroborate the effectiveness and efficiency of the Hiformer model. We have successfully deployed the Hiformer model to a real-world, large-scale App ranking model at Google Play, with significant improvement in key engagement metrics (up to +2.66%).

Huan Gui, Ruoxi Wang, Ke Yin, Long Jin, Maciej Kula, Taibai Xu, Lichan Hong, Ed H. Chi • 2023
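
The abstract names a heterogeneous self-attention layer but does not give its equations. As a rough illustration only, the sketch below assumes one plausible reading: each feature gets its own query, key, and value projection instead of sharing a single set across all features. The function names, shapes, and random inputs are hypothetical and are not taken from the paper or from any released code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def vanilla_attention(E, Wq, Wk, Wv):
    """Standard single-head self-attention.

    E:  (L, d) array, one embedding per input feature.
    Wq, Wk, Wv: (d, d_k) projections shared by all features.
    """
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    scores = Q @ K.T / np.sqrt(Wq.shape[-1])
    return softmax(scores) @ V

def heterogeneous_attention(E, Wq, Wk, Wv):
    """Assumed heterogeneous variant: one projection per feature.

    E:  (L, d) array, one embedding per input feature.
    Wq, Wk, Wv: (L, d, d_k) arrays; slice l is used only for feature l.
    """
    # For each feature l, project E[l] with that feature's own matrices.
    Q = np.einsum("ld,ldk->lk", E, Wq)
    K = np.einsum("ld,ldk->lk", E, Wk)
    V = np.einsum("ld,ldk->lk", E, Wv)
    scores = Q @ K.T / np.sqrt(Wq.shape[-1])
    return softmax(scores) @ V

# Hypothetical toy setup: 4 features, embedding size 8.
rng = np.random.default_rng(0)
L, d, d_k = 4, 8, 8
E = rng.normal(size=(L, d))
Wq, Wk, Wv = (rng.normal(size=(L, d, d_k)) for _ in range(3))
out = heterogeneous_attention(E, Wq, Wk, Wv)  # shape (L, d_k)
```

Under this assumption, passing the same matrix for every feature reduces the layer to ordinary self-attention, which makes a convenient sanity check. The per-feature matrices multiply the parameter count by the number of features, which is consistent with the abstract's interest in low-rank approximation and pruning for serving.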

Related benchmarks

Task | Dataset | Result | Rank
CTR Prediction | TaobaoAds | AUC 0.6273 | 41
CTR Prediction | KuaiVideo | GAUC 0.644 | 27
CTR Prediction | AMAZON | AUC 0.8615 | 26
CTR Prediction | Taobao latest 600 clicked items (test) | AUC 63.52 | 18
CTR Prediction | Industrial max sequence length 1,600 (test) | AUC 0.7148 | 18
Click-Through Rate Prediction | AMAZON | Parameters 0.72 | 17
Click-Through Rate Prediction | KuaiVideo | Params 0.96 | 17
Click-Through Rate Prediction | TaobaoAds | Parameters (M) 2.38 | 17
Click-Through Rate Prediction | Inhouse | Params 4.74 | 17
CTR Prediction | Inhouse | AUC 0.6925 | 17
Showing 10 of 15 rows
