| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Dialogue State Tracking | SGD | JGA (Overall) 55.01 | 24 |
| Dialog State Tracking | SGD 15 tasks CL | Avg JGA 76.3 | 23 |
| Natural Language Generation | SGD (test) | BLEU 28.6 | 18 |
| User Satisfaction Estimation | SGD | Accuracy 64.8 | 14 |
| Task-oriented Dialogue | FewShotSGD unseen schemata (test) | BLEU 28.76 | 13 |
| Task-oriented Dialogue | FewShotSGD seen schemata (test) | BLEU 29.28 | 13 |
| Dialogue State Tracking | SGD (test) | JGA 86.5 | 11 |
| Dialogue State Tracking | SGD (train) | JGA 55.99 | 9 |
| Dialogue State Tracking | SGD Messaging | JGA 67.79 | 9 |
| Dialog Structure Induction | SGD (test) | Purity 46.8 | 9 |
| User Satisfaction Estimation | SGD 5% training size (test) | Precision 75.3 | 8 |
| Dialogue State Tracking | SGD Media | JGA 76.2 | 7 |
| Task-Oriented Dialogue | SGD 1.0 (test) | Inform Rate 81.29 | 6 |
| Structure Induction | SGD Real (test) | AMI 0.559 | 6 |
| Hidden Representation Learning | SGD Real (test) | Class-Balanced Acc (Full) 66.3 | 6 |
| Dialogue State Tracking | SGD-X v1-v5 variants (test) | Joint Goal Acc (Original) 86.4 | 6 |
| Dialogue State Tracking | SGD Music | JGA 35.46 | 5 |
| Dialogue State Tracking | SGD Flights | Joint Goal Accuracy (JGA) 30.57 | 5 |
| Goal completion | SGD (test) | Inform Rate 50.4 | 5 |
| Dialogue State Tracking | SGD to MultiWoz (test) | Average JGA 51.2 | 5 |
| Dialogue State Tracking | SGD All Domains (test) | Joint GA 32.1 | 4 |
| Dialogue State Tracking | SGD Unseen Domains (test) | Joint GA 24.4 | 4 |
| Natural Language Generation | SGD (Overall) | Naturalness 2.46 | 4 |
| Natural Language Generation | SGD Seen domains | Naturalness 2.48 | 4 |
| Natural Language Generation | SGD Unseen domains | Naturalness 2.46 | 4 |