| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Sequential Recommendation | Yelp | Recall@10 0.0781 | 120 |
| Recommendation | Yelp 2018 (test) | Recall@20 7.83 | 101 |
| Recommendation | Yelp (test) | NDCG@20 9.82 | 82 |
| Synthetic Text Evaluation | Yelp non-IID | MAUVE Score 0.3751 | 64 |
| Sequential Recommendation | Yelp (Overall) | Hit Rate@10 0.6692 | 63 |
| Text Classification | Yelp (test) | Accuracy 94.8 | 55 |
| Recommendation | Yelp 2018 | Recall@20 19.69 | 53 |
| Adversarial Attack | Yelp | ASR 39.8 | 49 |
| Sentiment Classification | Yelp (test) | Accuracy 96.4 | 46 |
| Adversarial Attack on Neural Contextual Bandits | Yelp | Regret 36 | 42 |
| Collaborative Filtering | Yelp 2018 | NDCG@20 5.75 | 42 |
| Review Sentiment Classification | Yelp 2014 (test) | Accuracy 68.6 | 41 |
| Sequential Recommendation | Yelp (Tail) | Hit Rate@10 26.93 | 39 |
| Sentiment Classification | Yelp Polarity (test) | Error Rate 1.81 | 37 |
| Text classification | Yelp (5-fold cross-validation) | Accuracy 71.7 | 36 |
| Recommendation | Yelp | NDCG@10 7.79 | 35 |
| Collaborative Filtering | Yelp 2018 (test) | Recall@20 7.43 | 35 |
| Language Modeling | Yelp (test) | PPL 4.708 | 35 |
| OOD Detection | Yelp (test) | AUROC 97.59 | 34 |
| Sentiment Classification | Yelp5 (test) | Accuracy 98.5 | 34 |
| Text Classification | Yelp.P (test) | Accuracy 98.63 | 34 |
| Multi-class text classification | Yelp | Micro-F1 61.9 | 33 |
| Sentiment Analysis | Yelp '13 (test) | Accuracy 68.3 | 33 |
| Recommendation | Yelp Set-up (S) | Recall@10 8.33 | 32 |
| Recommendation | Yelp | NDCG@10 0.119 | 32 |