Neural Factorization Machines for Sparse Predictive Analytics

About

Many predictive tasks in web applications need to model categorical variables, such as user IDs and demographics like gender and occupation. To apply standard machine learning techniques, these categorical predictors are always converted to a set of binary features via one-hot encoding, making the resultant feature vector highly sparse. To learn from such sparse data effectively, it is crucial to account for the interactions between features. Factorization Machines (FMs) are a popular solution for efficiently using second-order feature interactions. However, FM models feature interactions in a linear way, which can be insufficient for capturing the non-linear and complex inherent structure of real-world data. While deep neural networks have recently been applied to learn non-linear feature interactions in industry, such as Wide&Deep by Google and DeepCross by Microsoft, their deep structure also makes them difficult to train. In this paper, we propose a novel model, Neural Factorization Machine (NFM), for prediction under sparse settings. NFM seamlessly combines the linearity of FM in modelling second-order feature interactions and the non-linearity of neural networks in modelling higher-order feature interactions. Conceptually, NFM is more expressive than FM, since FM can be seen as a special case of NFM without hidden layers. Empirical results on two regression tasks show that with only one hidden layer, NFM significantly outperforms FM with a 7.3% relative improvement. Compared to the recent deep learning methods Wide&Deep and DeepCross, our NFM uses a shallower structure but offers better performance, being much easier to train and tune in practice.
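The abstract's claim that FM is a special case of NFM without hidden layers can be checked numerically. The following is a minimal pure-Python sketch, not the authors' implementation. It assumes the Bi-Interaction pooling formulation described in the full paper (the abstract itself does not spell out the architecture), and all function names, field sizes, and parameter values here are hypothetical and chosen only for illustration.

```python
import random

def one_hot(indices, field_sizes):
    """Concatenate one one-hot block per categorical field (sparse input)."""
    x = []
    for idx, size in zip(indices, field_sizes):
        block = [0.0] * size
        block[idx] = 1.0
        x.extend(block)
    return x

def bi_interaction(x, V):
    """Pool pairwise interactions: sum over i<j of (x_i v_i) * (x_j v_j),
    element-wise, computed as 0.5 * ((sum x_i v_i)^2 - sum (x_i v_i)^2)."""
    k = len(V[0])
    sum_vec = [0.0] * k
    sum_sq = [0.0] * k
    for xi, vi in zip(x, V):
        if xi == 0.0:  # skip inactive features
            continue
        for f in range(k):
            t = xi * vi[f]
            sum_vec[f] += t
            sum_sq[f] += t * t
    return [0.5 * (s * s - q) for s, q in zip(sum_vec, sum_sq)]

def dense_relu(z, W, b):
    """One fully connected layer with ReLU; W has shape (out, in)."""
    return [max(0.0, sum(w * a for w, a in zip(row, z)) + bias)
            for row, bias in zip(W, b)]

def nfm_predict(x, w0, w, V, layers, h):
    """Linear part plus an MLP over the pooled interaction vector.
    `layers` is a list of (W, b) pairs; `h` projects to a scalar."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    z = bi_interaction(x, V)
    for W, b in layers:
        z = dense_relu(z, W, b)
    return linear + sum(hf * zf for hf, zf in zip(h, z))

def fm_predict(x, w0, w, V):
    """Plain second-order FM using the explicit pairwise sum."""
    y = w0 + sum(wi * xi for wi, xi in zip(w, x))
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            y += dot * x[i] * x[j]
    return y

# Hypothetical toy setup: 4 users, 2 genders, 3 occupations, embedding size 3.
rng = random.Random(0)
field_sizes = [4, 2, 3]
n, k = sum(field_sizes), 3
x = one_hot([2, 1, 0], field_sizes)
w0 = 0.1
w = [rng.uniform(-0.5, 0.5) for _ in range(n)]
V = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n)]

# No hidden layers and an all-ones projection: NFM should match FM.
y_nfm = nfm_predict(x, w0, w, V, layers=[], h=[1.0] * k)
y_fm = fm_predict(x, w0, w, V)
print(abs(y_nfm - y_fm) < 1e-9)
```

Passing a non-empty `layers` list adds the non-linear hidden layers that the abstract credits for modelling higher-order interactions; the pooling step itself stays linear in the number of active features, which matters when the one-hot input is highly sparse.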

Xiangnan He, Tat-Seng Chua • 2017

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| CTR Prediction | Criteo | AUC 0.7957 | 282 |
| Click-Through Rate Prediction | Avazu (test) | AUC 0.79 | 191 |
| CTR Prediction | Avazu | AUC 77.08 | 144 |
| CTR Prediction | Criteo (test) | AUC 0.8054 | 141 |
| Recommendation | Amazon-Book (test) | Recall@20 0.1366 | 101 |
| Recommendation | Yelp 2018 (test) | Recall@20 6.6 | 90 |
| Recommendation | MovieLens-100K (test) | RMSE 0.91 | 55 |
| CTR Prediction | MovieLens | AUC 88.47 | 55 |
| CTR Prediction | Frappe (test) | AUC 0.9808 | 38 |
| CTR Prediction | KDD 12 | AUC 0.7515 | 28 |
Showing 10 of 28 rows
