| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Intent Classification | Banking77 (test) | Accuracy | 93.83 | 151 |
| Text Classification | BANKING77 Dir(0.01) (test) | Accuracy | 85.98 | 45 |
| Intent Classification | Banking77 | Accuracy | 94.77 | 24 |
| Intent Classification | BANKING77 5-shot (test) | Accuracy | 79.09 | 20 |
| Intent Classification | BANKING77 10-shot | Accuracy | 87.95 | 20 |
| Text Classification | BANKING77 Dir(0.1) (test) | Accuracy | 92.14 | 17 |
| Text Classification | BANKING77 Dir(0.5) (test) | Accuracy | 93.35 | 17 |
| Intent Classification | BANKING77 10-shot (test) | Accuracy | 82.66 | 12 |
| Out-of-scope Intent Detection | Banking77 (B77) r=2.5% contamination (test) | AUC2 (10% Subset) | 0.818 | 12 |
| Intent Classification | BANKING77 5-shot | Accuracy | 78.9 | 11 |
| Out-of-Domain Detection | BANKING77 75% known ratio | Accuracy | 84.4 | 8 |
| Out-of-Domain Detection | BANKING77 50% known ratio | Accuracy | 83.78 | 8 |
| Out-of-Domain Detection | BANKING77 25% known ratio | Accuracy | 85.71 | 8 |
| Topic Modeling | Banking77 | Purity | 0.705 | 7 |
| Clustering | Banking77 ClusteringS2S | Accuracy | 0.3 | 6 |
| Out-of-Scope Query Detection | Banking77-OOS (test) | AUC^2 (10%) | 83.3 | 6 |
| Intent Classification | Banking77 1.0 (train) | Accuracy | 93.8 | 6 |
| Intent Classification | Banking77 30 samples 1.0 (train) | Accuracy | 0.9057 | 6 |
| Intent Classification | Banking77 10 samples 1.0 (train) | Accuracy | 85.19 | 6 |
| Uncertainty Calibration | Banking77 (test) | ECE | 0.039 | 4 |
| Calibration | Banking77 | ECE | 0.277 | 4 |