| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Indirect Prompt Injection | Amazon Reviews | ASR 99.8 | 47 |
| Sequential Recommendation | Amazon Reviews 8 Domains | NDCG@10 (Avg) 89.4 | 36 |
| Sentiment Analysis | Amazon Reviews (test) | Average Accuracy 91.74 | 24 |
| Review Ranking | Amazon Reviews 2023 (test) | N@1 (All_Beauty) 0.713 | 19 |
| Sentiment Analysis | Amazon Reviews | F1 Score 58.9 | 16 |
| Membership Inference Attack | Amazon Reviews | AUC 0.901 | 14 |
| Selective Classification | Amazon Reviews Covariate Shift | AURC 22.2 | 13 |
| Selective Classification | Amazon Reviews (In-Distribution) | AURC 20.6 | 13 |
| Sequential Recommendation | Amazon Reviews Sports (test) | HR@1 0.0162 | 11 |
| Sequential Recommendation | Amazon Reviews Toys (test) | HR@1 0.0334 | 11 |
| Sequential Recommendation | Amazon Reviews Beauty (test) | HR@1 3.29 | 11 |
| Sentiment Classification | Amazon Reviews | Accuracy 85.7 | 10 |
| Sentiment Controlled Text Generation | Amazon reviews | PPL (Pos.) 11.99 | 10 |
| Sentiment Analysis | Amazon Reviews (Out-of-domain) | Accuracy 84.7 | 10 |
| Conversational Recommendation | Amazon Reviews Game 2023 (test) | SR 43 | 10 |
| Conversational Recommendation | Amazon Book Reviews 2023 (test) | SR 63 | 10 |
| Sentiment Analysis | Amazon reviews (test) | Accuracy 98 | 8 |
| Sentiment Classification | Amazon reviews Last Tasks (Final task of sequence) | Accuracy 87.99 | 8 |
| Sentiment Classification | Amazon reviews All Tasks Average over 24 | Accuracy 85.24 | 8 |
| Review Rating Classification | Amazon Reviews en ja zh | Acc (de) 0.4998 | 6 |
| Review Rating Classification | Amazon Reviews en, es, fr | Accuracy (de) 50.99 | 6 |
| Recommendation | Amazon Reviews Electronics averaged across Env-1, Env-2, Env-3 (test) | NDCG@10 0.297 | 5 |
| Recommendation | Amazon Reviews 2023 | HV 0.16 | 4 |
| Suitability Score Prediction | Amazon Reviews | MAE 1.078 | 4 |
| Persona-based Summarization | Amazon Reviews | RefBS-R 0.722 | 4 |